STA 278/BGT 208: Gene Expression Analysis
- Computing -
Computation and Statistical Software
- OIT's web page
to buy your own copy of Matlab under the Duke site license ($120)
- - This is probably the easiest way to
set up Matlab on your own computer (Mac/Windows/Linux); the OIT Duke-wide licenses
are cheaper in the long run, as they also provide access to the Matlab Toolboxes at
reduced rates, and the annual renewal costs are negligible.
- - Or, if you prefer, you can buy a student edition from the
Duke Computer Store in the Bryan Center, or directly from MathWorks (next link)
- All students in the class will also have Duke Statistics accounts, so they can use Matlab, R, and many other
systems and languages there
- Matlab resources:
- The Matlab web site, where you
can find lots of info, references etc
- Stixbox,
a most useful collection of statistics functions developed by Anders Holtsberg
- More support in class, including access to the instructor's libraries and utility
functions, and other relevant resources
- Cluster software, including Cluster 3.0 (Mac, Windows, Linux) and manual
- Java Treeview software and download site
- Bioconductor web site:
Bioconductor is an open source and open development software project for
the analysis and comprehension of genomic data, built
on the R statistical programming environment.
- Bioconductor lab materials - useful
introductory tutorial material for Bioconductor (and R)
- CRAN web site for R -- go there to
download and install R (free). R is a widely used open source language and
environment for statistical computing and graphics. It is available for Linux,
Unix, Windows, and Macintosh computers. More information on R is available in
the "R Basics" section of the R FAQ.
Some of our own Matlab functions of interest for the course:
- ... Simple read-in of a Matlab file with expression array information:
genesin
- Utility functions directory, to download to your own computer as needed
- Simple correlations of expression measures of genes, and linear least squares regression fit:
correg. This includes some Bayesian linear modelling,
and here are additional examples with more extensive data.
- Simple factor/principal component analysis of expression
data, and a start on factor regressions
- Clustering using Matlab, with examples
- Read in data and run a binary regression analysis with SVD predictors to
produce approximate Bayesian estimates of regression parameters, cross-validation predictions, etc.; also a full Bayesian analysis of binary regression with SVD predictors, using simulation analysis
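
For readers who want to see what the simplest item in the list above computes (the correlation between the expression measures of two genes, and a linear least squares fit of one on the other), here is a minimal sketch. It is written in plain Python rather than Matlab, and it is not the course's correg function. The function names and the expression values are hypothetical, chosen only for illustration, and the Bayesian modelling mentioned above is not covered.

```python
import math


def pearson_correlation(x, y):
    """Sample Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def least_squares_line(x, y):
    """Intercept and slope minimising the sum of squared residuals of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Hypothetical expression measures for two genes across six samples
# (made-up numbers, not course data).
gene_a = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
gene_b = [2.1, 3.9, 6.2, 7.8, 10.1, 12.0]

r = pearson_correlation(gene_a, gene_b)
intercept, slope = least_squares_line(gene_a, gene_b)
print("correlation:", r)
print("fitted line: gene_b =", intercept, "+", slope, "* gene_a")
```

The same two quantities can be obtained in Matlab or R with their built-in correlation and regression routines; the sketch only spells out the arithmetic.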