ABS04 - 2004 Applied Bayesian Statistics School
STATISTICS & GENE EXPRESSION GENOMICS:
METHODS AND COMPUTATIONS
Centro Congressi Panorama, Trento, Italy
15th-19th June 2004
Lecture Slides | Data, Examples, Code | Statistics Notes | Papers |
Tools and Software | More Software Sites | Microarray Info | Gene Info Servers |
Data and examples with Matlab |
PNAS 2001 Breast Cancer: data and code, local version of the paper, and the PNAS journal paper & low-level array data (CEL files) |
PNAS 2004 Breast Cancer: data and code, local version of the paper, and the PNAS journal paper |
Nat Gen 2003 Myc/Ras/E2F: data and code, local version of the paper, and the Nat Gen journal paper & low-level array data (CEL files) |
Science 1999 MIT/Whitehead Leukemia: data and paper |
Some Matlab utilities (functions and scripts) for data handling and exploration, and for some of the statistical summaries and modelling of expression data used here. Some additional functions and scripts for binary tree analysis and examples are also available. |
Statistics notes |
1. Basic Statistics: dvi and pdf |
2. Least Squares Regression: dvi and pdf |
3. Multiple Regression: dvi and pdf |
4. Clustering: dvi and pdf |
5. Empirical Factors - PCA and SVD: dvi and pdf |
6. Factor Regression: dvi and pdf |
7. Bayesian Regression & Shrinkage Estimation: dvi and pdf |
8. Binary Regression: dvi and pdf |
9. Gibbs Sampling in Linear Regression with Shrinkage Priors: dvi and pdf |
10. Gibbs Sampling in Binary Regression: dvi and pdf |
11. Multinormal Theory: dvi and pdf |
A few other relevant Duke papers
A list including those above as well as others on statistical modelling and a range of applications in expression genomics |
Some Duke software and tools sites, and key gene/genomics database sites |
The Duke CAGP GraphExplore software for displaying, exploring and manipulating general graphs (directed, undirected), and of particular use for graphs generated in analysis of gene expression associations and other genomic data sets |
The Duke Integrated Genomics (DIG) database, for exploration of gene annotation, links to information servers, automated literature searches to generate biological information, etc. |
MetaGeneCreator software (Adrian Dobra at Duke) for reclustering and improved definition of metagene clusters. This takes as input either raw data, in which case it utilises k-means clustering followed by an iterative refinement of the clustering, or data from covariance selection/graphical models such as those generated by ... |
GGM software - C++ code implementing methods of stochastic computation (Metropolis-Hastings MCMC and shotgun stochastic search) for model exploration and selection in Gaussian graphical models (Duke graphical models group). |
Bayesian covariance selection in high dimensions - initial code available as HdBCS (Adrian Dobra at Duke) |
Matlab, R, Bioconductor and other useful software and tools sites |
Bioconductor web site: Bioconductor is an open source and open development software project for the analysis and comprehension of genomic data, built on the R statistical programming environment |
Bioconductor lab materials - really useful introductory tutorial material for Bioconductor (and R). Look particularly at lab slides by Robert Gentleman and colleagues |
CRAN web site for R - go there to download and install R (free). R is a widely used open source language and environment for statistical computing and graphics. It is available for Linux, Unix, Windows, and Macintosh computers. More information on R is available in the "R Basics" section of the R FAQ |
xcluster by Gavin Sherlock |
Cluster software, including Cluster 3.0 (Mac, Windows, Linux) and manual |
Java Treeview software and download site |
Eisen lab site for Cluster & Treeview software |
Some local microarray info |
Magic of Microarrays, a recent Scientific American overview article |
Duke template for Affymetrix files, which gives a brief description of Affymetrix data files |
Some useful web sites on arrays, resources |
Microarray slides (powerpoint) |
Affymetrix data processing - basic details |
Affymetrix manual - more Affymetrix details |
Some gene/genomics database sites |
The Entrez Gene site at NCBI. Gene provides a unified query environment for genes defined by sequence and/or in NCBI's Map Viewer. You can query on names, symbols, accessions, publications, GO terms, chromosome numbers, E.C. numbers, and many other attributes associated with genes and the products they encode. Gene is one of the Entrez systems and is likely to replace LocusLink soon. |
The LocusLink site at NCBI, providing a single query interface to curated sequence and descriptive information about genetic loci. It presents information on official nomenclature, aliases, sequence accessions, phenotypes, EC numbers, MIM numbers, UniGene clusters, homology, map locations, and related web sites. |
The KEGG site: Kyoto Encyclopedia of Genes and Genomes, for annotation and visualisation of functional metabolic pathways |