Curriculum 2018/20

Semester 1

- Data Analysis in R
- Statistics
- Discrete Mathematics
- Programming in Python
- Applied Bioinformatics / Journal Club
- Soft Skills Course
- Project Management
- Foreign Language in Professional Activity
- Research Work

Semester 2

- Systems Biology
- Population and Medical Genetics
- Scientific Python
- Applied Statistics
- High Performance Computing
- Applied Artificial Intelligence
*(online)* - Foreign Language in Professional Activity
- Course Project

Semester 3

- Structural Bioinformatics
- Metagenomics
- Molecular Phylogenetics
- Biotechnology
- Applied Bioinformatics / Journal Club
- Research Work

Semester 4 is for Research Internship and Master's Thesis

Statistics

- Probability theory: basic concepts and definitions (events, independent events, random variables).
- Statistics: hypotheses, tests, significance, false positive and false negative errors.
- P-values. Null hypothesis, alternative hypothesis, significance and statistical power.
- Chi-square test of independence.
- Chi-square goodness of fit test.
- Tests for sample comparisons. Student's t-test and Mann-Whitney U-test.
- Multiple comparisons problem. Genome-wide association study. Bonferroni correction and Benjamini-Hochberg procedure.
- Principal component analysis. Covariance matrix, eigenvectors and eigenvalues.
- Linear regression. Simple linear regression and multiple linear regression. Analysis of missing values.
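
As an illustration of the multiple comparisons topic listed above, here is a minimal sketch of the Bonferroni correction and the Benjamini-Hochberg procedure in plain Python. The function names and the p-values are made up for this example and are not course material; in practice a statistics library would normally be used.

```python
def bonferroni(p_values):
    """Bonferroni-adjusted p-values: multiply each by the number of tests, cap at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical p-values from four independent tests.
p = [0.01, 0.04, 0.03, 0.005]
print(bonferroni(p))
print(benjamini_hochberg(p))
```

Bonferroni controls the family-wise error rate and is the more conservative of the two; Benjamini-Hochberg controls the false discovery rate, which is why its adjusted values are never larger than the Bonferroni ones.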

Programming in Python

2. More data types

3. Modules

4. Functions and functional programming

5. Recursion

6. Numpy module

7. Classes as data types

8. OOP basics

Data Analysis in R

- Data types and structures
- Program flow, functions, etc.
- Scoping rules, Functional approach
- Shiny Basics

- Data import and cleaning
- Data manipulation (tidyr, dplyr)
- Data visualization (ggplot2)

- Exploratory and descriptive analysis
- Linear Regression

Discrete Mathematics

- Introduction to set theory and boolean functions
- Boolean logic and proof methods
- Combinatorics
- Combinatorial generation
- Asymptotic analysis of algorithms
- Sorting algorithms
- Dynamic programming
- Graph theory: basic concepts
- Graph theory: graph algorithms
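
To make the dynamic programming topic above concrete, the following sketch computes the edit (Levenshtein) distance between two strings. The choice of problem and the function name are assumptions made for illustration; the course may use different examples.

```python
def edit_distance(a, b):
    """Levenshtein distance between strings a and b via dynamic programming."""
    # prev[j] is the distance between the processed prefix of a and b[:j].
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance between a[:i] and the empty prefix of b
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(
                prev[j] + 1,         # delete ca
                curr[j - 1] + 1,     # insert cb
                prev[j - 1] + cost,  # match or substitute
            ))
        prev = curr
    return prev[-1]


print(edit_distance("kitten", "sitting"))
```

Only two rows of the table are kept at a time, so memory use is proportional to the length of `b` while the running time is proportional to `len(a) * len(b)`.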

Systems Biology

- Introduction to systems biology and immunology.
- Introduction to gene expression analysis. Interactive analysis in Phantasus: working with public datasets.
- Introduction to single cell RNA-sequencing.

- Introduction to epigenetics.

- Introduction to metabolism.

- RNA-sequencing quantification pipelines

- Analysis of gene expression data in R

Proteomics

- Introduction to proteomics. Experimental protocols.

- Proteomics quantification.

- Analysis of proteomics data in MaxQuant.

- Proteomics extras: post-translational modifications, protein-protein interactions, spatial proteomics.

Population and Medical Genetics

- Where it all started. Solving the mystery of inheritance. Mendel and the post-Mendel era.
- Pre-genome era. Mapping of the first human disease gene. Huntington's disease.
- RFLP, microsatellites and genetic linkage. Pedigree and linkage analysis. DNA forensics.
- Human Genome Project. SNP map of the human genome. DNA variation.
- Linkage disequilibrium and genome-wide association studies. HapMap project.
- GWAS concepts and approaches.
- Hands on GWAS tutorial.
- GWAS discussion and resources. Polygenic risk scores. UK Biobank.
- Next generation sequencing. Exome sequencing.
- GATK pipeline.
- Variant annotations, selection pressure metrics. Large-scale sequencing resources (ExAC and gnomAD).
- Rare variant association studies
- Hands-on analysis of exome sequencing data.
- Cancer genetics. TCGA project.
- Case-control matching challenge.

Scientific Python

- Regular expressions. Metacharacters, special sequences and sets. "Greedy" and "lazy" quantifiers. Lookarounds. The PROSITE protein database.
- Biopython. Bio.Data, Bio.Alphabet. Bio.Seq, Bio.SeqRecord, Bio.SeqIO. Bio.Align, Bio.Blast, Bio.Phylo. Bio.PDB
- Numpy, pandas, seaborn. Numpy arrays, indexing/reshaping operations, time checking. Pandas data structures, tidy data concept, data wrangling. Visualization.
- Error handling. Error type hierarchy. Different types of clauses. Best practices for function organization
- Requests. Variety of databases. Concept of API. Popular data formats. Uniprot API queries
- Functional programming. Iterators, generators, comprehensions. Lambdas. Partial application of functions
- Pipelines and OS-level virtualization.
- Single cell in scanpy. Common data structures and conventional pipeline. Basic data representation.
- Introduction to machine learning.
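
The regular expression topics listed above (greedy and lazy quantifiers, lookarounds, PROSITE-style motifs) can be sketched with Python's standard `re` module. The strings and the sequence below are made up for illustration, and the motif is a hand translation of the PROSITE-style pattern `N-{P}-[ST]-{P}`, so treat it as an assumption rather than course material.

```python
import re

text = "<b>alpha</b> and <b>beta</b>"

# Greedy: ".*" grabs as much as possible, so the match spans both tags.
greedy = re.findall(r"<b>.*</b>", text)
# Lazy: ".*?" stops at the first closing tag, giving one match per tag.
lazy = re.findall(r"<b>.*?</b>", text)

# PROSITE-style pattern N-{P}-[ST]-{P} written as a regular expression.
# Wrapping it in a lookahead lets overlapping occurrences be reported.
motif = re.compile(r"(?=(N[^P][ST][^P]))")
sequence = "MNGTANSSLNPTA"  # hypothetical protein fragment
hits = [(m.start(), m.group(1)) for m in motif.finditer(sequence)]

print(greedy)
print(lazy)
print(hits)
```

The lookahead matches an empty string at each position where the motif starts, which is why the matched text is read from the capture group rather than from the match itself.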

Structural Bioinformatics

- Defining bioinformatics and structural bioinformatics.
- Fundamentals of macromolecular organization and structure. Hierarchical levels of protein organization. Protein 3D structure. Protein Domains. Protein Folds.
- Analysis of macromolecules. Sequence and structural alignment. Amino acid substitutions, amino acid replacement matrices. Quality of protein structures, torsion angles and the Ramachandran plot. Function from structure: structure-function relationship and analysis.
- Prediction and modeling of macromolecules. Homology and similarity of proteins, quality assessments of homology models. Molecular dynamics and docking, Monte Carlo simulations. Protein folding and energetics.
- Experimental approaches in structural biology: Determination of macromolecular structures. X-ray crystallography. Nuclear Magnetic Resonance. Cryo-electron microscopy. Hydrogen-deuterium exchange.
- Structure based drug design: Hit identification.
- Protein databases.

Biotechnology

1. Basic genetic engineering.

2. Gene delivery, genome editing.

3. Gene synthesis, high-throughput cloning.

4. Genome synthesis, synthetic signaling circuits and metabolic pathways.

1. Protein design, peptide design.

2. Antibody design, directed evolution.

3. Ligand design, small molecule library creation and optimization.

4. CAR-T cells.

1. Stem cells, regenerative medicine.

2. Organoids, organ-on-a-chip, 3D bioprinting.

3. Brain-machine interfaces, neuroprosthetics.

1. Microbial biotechnology, industrial biotechnology.

2. Plant biotechnology, agricultural biotechnology, biofuel.

3. Biotechnology of animals, transgenic animals.

Molecular Phylogenetics

- Introduction. Necessary terminology. Acquaintance with trees.
- Alignment of nucleotide and protein sequences.
- Methods for constructing phylogenetic trees: MP, ME, NJ, ML.
- Testing the tree topology: bootstrap, supertrees.
- Bayesian methods in phylogenetics. Dating
- Molecular Markers. Gene evolution and genome evolution.

- Graphical display of phylogenetic trees using R and Python.
- Work with NCBI databases
- Multiple sequence alignment tools.
- Cleaning the alignment. Evolution model testing.
- Comparison of tree reconstruction algorithms.
- Tree topology verification. Bootstrap analysis.
- Bayesian methods in phylogenetics.
- Discussion and debriefing.

Metagenomics

- A primer on linear algebra I: Algebraic structures, Vector spaces
- A primer on linear algebra II: Coordinate systems, Linear maps, Covariance
- A primer on compositional data analysis: Compositions as equivalence classes, Aitchison simplex, Principles of compositional data analysis, Generating systems on the simplex, Isomorphisms and isometries
- Introduction to metagenomics: Shotgun metagenomes, Amplicon libraries
- Taxonomic annotation: Reference databases, Alignment-based annotation, Machine learning methods, Phylogenetic placement
- Statistical analyses I: Diversity indices, Ordination, Location tests, Handling zeros
- Statistical analyses II: ILR balances, Feature selection, Mixed effects linear models
- Statistical analyses III: Dirichlet-multinomial regression, Logistic-normal-multinomial models
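
As a small illustration of the compositional data analysis topics above, the sketch below applies closure (rescaling counts to proportions) and the centered log-ratio (CLR) transform. The function names and counts are made up for this example, and all parts are assumed to be strictly positive; zeros need separate handling, which the list above covers under "Handling zeros".

```python
import math


def closure(counts):
    """Rescale positive values so that they sum to 1."""
    total = sum(counts)
    return [c / total for c in counts]


def clr(composition):
    """Centered log-ratio transform: log of each part minus the mean log."""
    logs = [math.log(x) for x in composition]
    mean_log = sum(logs) / len(logs)
    return [v - mean_log for v in logs]


# Hypothetical read counts for three taxa in one sample.
counts = [10, 20, 70]
x = closure(counts)
z = clr(x)
print(x)
print(z)
```

The CLR coordinates sum to zero and do not change when the counts are multiplied by a constant, which reflects the view of a composition as an equivalence class of proportional vectors.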

- Human microbiome. Analysis of 16S-data
- Shotgun metagenomics
- Clinical applications of metagenomics
- Statistical analysis in metagenomics
- Machine learning in metagenomics