BIOINF 547

by ikeday last modified 2005-11-29 23:03

Bioinformatics 547
Probabilistic Modeling in Bioinformatics

The course covers probabilistic models of proteins and nucleic acids and their uses in molecular biology. The core of the course will be sequence analysis and its biological applications, such as searching large databases for optimal comparisons or homologies of sequences (DNA nucleotide sequences or amino acid sequences of proteins), locating genes on a string of DNA, estimating phylogenetic trees, and recognizing structural motifs in proteins and RNA. Guest lecturers will address the class on topics from the pharmaceutical industry, DNA physics, and chromatin structure.

More specifically, the topics will include a review of basic concepts of probability and very rudimentary molecular biology; probability and the design of similarity scoring functions; optimal local and global alignments of sequences: dynamic programming, the Smith-Waterman algorithm, other algorithms available on the Web (BLAST, FASTA, etc.), probabilistic (heuristic) versus rigorous algorithms; significance of scores and simulation; dependence of scoring functions and optimal alignments on parameters, comparison of standard tables; hidden Markov models and neural network models; entropy and information content of a sequence; multiple sequence alignment methods and algorithms. The applications of this part will be to gene finding; families of proteins; phylogenetic tree determinations; structure of proteins and recognizable patterns in amino acid sequences (motif recognition); protein expression and regulation; and protein phylogeny.
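As a rough illustration of the dynamic programming behind optimal local alignment, the following is a minimal sketch of Smith-Waterman scoring. It is not taken from the course materials: the function name and the scoring values (match, mismatch, and a linear gap penalty) are illustrative assumptions, and the sketch returns only the best local score, not the alignment itself.

```python
def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-2):
    """Best local alignment score of sequences a and b.

    The scoring parameters are illustrative assumptions, with a simple
    linear gap penalty rather than an affine one.
    """
    prev = [0] * (len(b) + 1)  # previous row of the DP table
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            # Score for aligning a[i-1] with b[j-1]
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # Local alignment: a cell never drops below zero
            curr[j] = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best


if __name__ == "__main__":
    print(smith_waterman_score("TTACGTT", "GGACGGG"))
```

Keeping only two rows of the table is enough when just the score is needed. Recovering the alignment itself would require storing the full table and tracing back from the best-scoring cell.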

In the later parts of the course, we will discuss the mechanics and dynamics of these macromolecules, especially DNA. Of special interest will be the topology of DNA (supercoiling), duplex destabilization or "melting", scaffold attachment regions and larger scale organization of DNA, and chromatin structure and dynamics. Possible final topic, time permitting: mass spectrometry data and de novo protein sequence determination algorithms.

If you are interested, a more detailed syllabus from the Winter '04 version of the course is still available at http://www.math.lsa.umich.edu/~dburns/547/547syll.html. Some topics will change!

There will be no exam for the course. Students will be expected to complete several problem sets, most of which will hopefully be group projects. If the class demographics work out favorably, each group will mix students with biological backgrounds and students with mathematical or statistical backgrounds. The prerequisites are left very flexible for the moment, but students will be expected to fill in any gaps in their background as the term progresses. Every effort will be made to accommodate students from biological or other relevant backgrounds.

A variety of books are available for background. Biological Sequence Analysis, by R. Durbin, S. Eddy, et al., is a good general one. A more practical one is D. Mount's Bioinformatics. Math and stats students might appreciate Statistical Methods in Bioinformatics, by Ewens and Grant.

Credits: 3

Course Homepage:

Winter 2004: http://www.math.lsa.umich.edu/~dburns/547/547syll.html
