|Class:||Tu Th 10:05-11:20am||Old Chem 116|
|Prof:||Robert L. Wolpert||(email@example.com)|
|OH:||By appointment||Old Chem 211c|
Half a century ago the phrase Multivariate Statistics was generally understood to describe sampling-theory-based statistical methods for studying multi-dimensional normally-distributed data. The fundamental tools for this aspect of the subject are a deep understanding of linear algebra and of the probability distributions associated with the normal, such as the Wishart and its kin. The best-known methods arising in this area are PCA (Principal Components Analysis), FA (Factor Analysis), Hotelling's T² test, and perhaps relatives like Principal Components Regression and multivariate ANOVA.
Traditional MVA methods are tailored for problems in which the number of observations (traditionally denoted n) exceeds (maybe by a lot) the number of uncertain parameters (traditionally p). Recently there has been a great deal of interest in "big p, small n" problems, where p ≫ n; these arise naturally in genomic applications, intrusion detection, and other emerging areas of interest.
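One way to see why the p > n regime strains traditional methods is that the sample covariance matrix, on which PCA and Hotelling's test rely, has rank at most n - 1 and so cannot be inverted when p ≥ n. The following is only an illustrative sketch in Python with NumPy; the function name, the simulated standard-normal data, and the particular values of n and p are assumptions chosen for the example, not part of the course materials.

```python
import numpy as np

def sample_covariance_rank(n, p, seed=0):
    """Numerical rank of the p x p sample covariance of n simulated observations.

    Hypothetical helper for illustration: the data are i.i.d. standard
    normal draws, not a real data set.
    """
    rng = np.random.default_rng(seed)
    X = rng.standard_normal((n, p))   # n observations (rows), p variables (columns)
    S = np.cov(X, rowvar=False)       # p x p sample covariance matrix
    return np.linalg.matrix_rank(S)

# Classical regime: many more observations than variables.
rank_classical = sample_covariance_rank(n=200, p=5)

# "Big p, small n" regime: the rank is capped at n - 1, far below p,
# so S is singular and has no ordinary inverse.
rank_big_p = sample_covariance_rank(n=10, p=50)

print("n=200, p=5  -> rank", rank_classical, "of", 5)
print("n=10,  p=50 -> rank", rank_big_p, "of", 50)
```

In the second case the covariance matrix is rank-deficient, which is one reason methods built on its inverse need modification (regularization, projection, or other structure) when p exceeds n.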
My plan is to try to sketch the highlights of traditional (multivariate Gaussian) MVA in the first half of the semester, then segue to a discussion-format course in which students select papers or book chapters covering more recent material and present them orally to the class. I'll help identify a number of possible papers and topics for student presentation, but you're welcome to choose something outside those offerings. Here's the start of a list of Multivariate Papers and one of Consistency and Asymptotic Papers (pdf copies of each available on request).
Possible topics will include random-projection methods, the statistical modeling of computer output, random forests, linear discriminant analysis, kernel PCA, and others.
Students are expected to be (or become) comfortable with probability theory at the level of STA230 or STA711 or STA831, statistical inference at the level of STA250 or STA732, and linear models at the level of STA721. Some computing experience in MATLAB, Python, or R would be helpful.