STA 294 B

BIOLOGICAL SEQUENCE ANALYSIS

-BGT.04 Genomic Data, Informatics and Sequence Analysis-
 


 

Thanks for your interest in my class ... Rainer
 
 

Some Links:
 
 

  • Time: Mon & Wed: 10:30am-11:45am, Weeks 8-14 of semester
  • Place: 025 Old Chemistry Building
  • Instructor: Rainer Spang, Duke Statistics
  • email: rainer@stat.duke.edu
  • tel: (919) 684 - 4447
  • Office location: 217 Old Chemistry Building
  • Office hours: Mon & Wed 2:00pm-3:00pm, or by appointment.


    Description

    This course gives an introduction to the theory of biological sequence analysis. The central theme is sequence alignment. We will discuss algorithms for calculating alignments, issues of parameter choice, the use of alignments in database searches and the significance of alignment scores.
     
     

    Schedule

    Week  Day  Topic                                                         Homework
    8     Wed  Molecular Sequences in the Light of Evolution                 none
    9     Mon  Sequence Data on the Internet                                 none
    9     Wed  Sequence Comparison, Dot Plots, Edit Distance                 none
    10    Mon  Global Alignment, Distance, Score, Gaps                       Assignment 1
    10    Wed  Alignment with Gaps, Local Alignment                          none
    11    Mon  Suboptimal Alignments, Introduction to Multiple Alignment     none
    11    Wed  Tree Alignment, Progressive Alignment, Guide Trees            none
    12    Mon  Multiple Alignment (end); Markov Chains (beginning)           Assignment 2
    12    Wed  More Markov Chains                                            none
    13    Mon  Models for DNA and Protein Evolution                          none
    13    Wed  Fitting Models from Alignment Data, Estimation of Divergence  Assignment 3
    14    Mon  Database Searches, Random Sequence Similarity                 none
    14    Wed  Alignment Statistics                                          none

     

    Grading

    Grading is based on homework that will be assigned as we go along. There will be no quizzes or exams. You may work on problems together, but you must write up the solutions on your own. Teamwork between students with a background in the life sciences and students with a background in statistics, computer science or mathematics is especially encouraged. However, every student must be able to present his/her solutions in class.
     
     

    Prerequisites

    Some molecular biology (BGT.01 or equivalent) and basic knowledge of probability (including Markov chains), linear algebra and calculus.
     
     

    Books

    Recommended:
  • Durbin R. et al. (1999) Biological Sequence Analysis (Cambridge Univ Pr)
  • Gusfield, D. (1997) Algorithms on Strings, Trees, and Sequences (Cambridge Univ Pr)

  • Textbooks on Bioinformatics:

  • Waterman, M. S. (1995) Introduction to Computational Biology
  • Sankoff, D. & Kruskal, J. (1983!!!) Time Warps, String Edits, and Macromolecules
  • Setubal J. & Meidanis J. (1997) Introduction to Computational Molecular Biology (Brooks/Cole)
  • Baldi P. & Brunak S. (1998) Bioinformatics (MIT Press)
  • Rashidi, H. H. & Buehler L. K. (2000) Bioinformatics Basics (CRC Press)
  • Misener, S. & Krawetz S. A. (2000) Bioinformatics - Methods and Protocols (Humana Press)

  • Some Biology for Statisticians:

  • Graur, D. & Li, W. H. (2000) Fundamentals of Molecular Evolution (second edition, Sinauer)
  • Alberts B. et al. (1999) Molecular Biology of the Cell (third edition, Garland)
  • Eigen, M. (1992) Steps towards Life (Oxford University Press)

  • Some Statistics and Probability for Biologists:
  • Ross, S. (1997) A First Course in Probability (Prentice Hall)
  • Chung, K. L. (1974) Elementary Probability Theory with Stochastic Processes (Springer)
  • Berry, D. (1996) Statistics, A Bayesian Perspective (Duxbury Press)
  • STA 294A: ELEMENTS OF COMPUTATION & MODELING IN GENETICS