|
Course in theory |
Laboratory practice |
|
1. What is covered in this course, and why? Brief introduction and definitions of basic, frequently used terms such as "DNA", "protein", and "sequencing". Introduction to algorithmic complexity: big O notation, the P and NP classes, approximation algorithms; why these are relevant to us (examples). Basic databases (only a brief overview; they are discussed in detail at the beginning of the relevant "Part"). (GV)
|
1. Linux basics: commands, piping, archiving, updating. (RT, IG) |
|
Part 1: Genome sequencing
2. Genome sequencing and related problems in bioinformatics: physical mapping.
Physical mapping with a backtracking algorithm.
|
2. Shell scripts, compilation, SSH tunneling. (RT, IG) |
|
3. Shotgun sequencing.
Motivation; calculation of sequence overlaps (later: with suffix
trees). Greedy algorithm. (IG) |
3. The Python scripting language (ÖR) |
|
Part 2: Sequence analysis
4. Strings and basic problems related to string search: pattern matching, alignments. Brief introduction to data structures. Two approaches to pattern matching: preprocessing the text or preprocessing the searched pattern. Distance functions on strings: Hamming distance, Levenshtein distance, Levenshtein distance with different costs. Dynamic programming: main
ideas and applicability to sequence comparison. "Scoring functions"
and their motivation. Brief introduction to scoring matrices (PAM,
BLOSUM). Databases of amino acid sequences: UniProt = (SwissProt ∪
TrEMBL); the difference between SwissProt and TrEMBL; RefSeq (also nucleotide
sequences); corresponding websites; download hints. (IG) |
4. Python, continued (ÖR) |
|
5. Pattern matching using
pattern preprocessing: the Boyer-Moore algorithm. Pattern matching
using text preprocessing: suffix trees and suffix arrays. Tasks that
suffix trees solve quickly. |
5. Sequencing, sequence alignment (IG) |
|
6. Sequence alignment algorithms. The concepts of local,
global, and "glocal" alignment. Gap penalty types. Alignment with
different overlap requirements and different gap penalty types.
Motivation behind scoring functions: statistical considerations.
Sequence alignment: statistical parameters and their interpretation.
Live demonstration of the different alignment algorithms. (IG) |
6. Sequencing, sequence alignment (IG) |
|
7. Multiple alignments, heuristic sequence alignment algorithms. Basic idea: BLAST (in
detail); possible improvements: PSI-BLAST (sketch). Phylogeny,
evolutionary trees. The NCBI taxonomy tree. Different methods for
constructing phylogenetic trees. (IG) |
7. Sequencing, sequence alignment (IG) |
|
8. From genes to proteins:
finding protein-coding genes. Transcription and translation. CDS,
ORF, gene finding. (GV)
|
8. System administration for beginners (R |
|
9. AI and bioinformatics.
Markov models. Hidden Markov models. The forward algorithm. The Viterbi
algorithm. The Viterbi learning algorithm. (GV) |
9. Cancelled |
|
Part 3: Structure prediction
10. Molecular structure
primer; molecular structure prediction (GV)
|
10, |
|
11. Drug-protein and
protein-protein docking (GV) |
11. Docking on a webserver (??) |
|
Part 4: Interaction networks
12. Molecular networks: metabolic and physical interaction networks. Sources of information:
online databases (DIP, MINT, IntAct, HPRD). (GV) |
12. PPI download and update (BD) |
|
13. Molecular networks: gene regulatory and cell signaling networks (GV) |
13. PPI generation (BD)
|
|
14. Protein function prediction and similarity (GV)
|
14. PPI analysis (BD) |