Prof: | Sayan Mukherjee | sayan@stat.duke.edu | OH: Mon 10-12 | 112 Old Chem

TAs:

Abhishek Dubey | abhisdub@cs.duke.edu | OH: Wednesday 10-11am LSRC D309

Yuhao Liang | yuhao.liang@duke.edu | OH: Monday 7:00-9:00pm Old Chem 211a

Xinyi Li | xinyi.li@duke.edu | OH:

Class: | M/W 8:30-9:45am | Social Sciences 136 |

All students: we will have one poster session, Dec 4. The poster session will be in Gross Hall 3rd floor East Meeting Space. For an example poster, see the tex example or the keynote example. If you are auditing the course, we'd love to have you at the poster session (bring your research groups too!).

Statistics at the level of STA611 (Introduction to Statistical Methods) is encouraged, along with knowledge of linear algebra and multivariate calculus.

Course grade is based on an in-class midterm (15%), an in-class final (35%), a final project (40%), and the poster session for the final project (10%). We will have homeworks, but they will not be graded; we will post solutions.

There is a Piazza course discussion page. Please direct questions about homeworks and other matters to that page. Otherwise, you can email the instructors (TAs and professor) at sta561-ta@duke.edu. Note that we are more likely to respond to the Piazza questions than to the email, and your classmates may respond too, so that is a good place to start.

The final projects should be in LaTeX. If you have never used LaTeX before, there are online tutorials, Mac GUIs, and even online compilers that might help you.

The course project will include a project proposal due mid-semester, a four-page writeup of the project at the end of the semester, and an all-campus poster session where you will present your work. This is the most important part of the course; we strongly encourage you to come and discuss project ideas with us early and often throughout the semester. We expect some of these projects to become publications. You are absolutely permitted to use your current rotation or research project as your course project. Examples of last year's projects.

A second set of references for R may be useful. First, you can download R from the CRAN website. There are many resources, such as RStudio, that can help with the programming interface, and tutorials on R are all over the place. If you are getting bored with the standard graphics package, I really like using ggplot2 for beautiful graphics and figures. Finally, you can integrate R code and output with plain text using knitr, but that might be going a bit too far if you are a beginner.

The course will follow my lecture notes (Lecture Notes), which will be updated as the course proceeds. Some other texts and notes that may be useful include:

- Kevin Murphy, Machine Learning: A Probabilistic Perspective
- Michael Lavine, Introduction to Statistical Thought (an introductory statistical textbook with plenty of R examples, and it's online too)
- Chris Bishop, Pattern Recognition and Machine Learning
- Daphne Koller & Nir Friedman, Probabilistic Graphical Models
- Hastie, Tibshirani, Friedman, Elements of Statistical Learning (ESL) (PDF available online)
- David J.C. MacKay, Information Theory, Inference, and Learning Algorithms (PDF available online)

The final project TeX template and final project style file should be used in preparation of your final project report. Please follow the instructions and let me know if you have questions. We will have a poster session where you present your research project in lieu of a final exam.

This syllabus is *tentative*, and will almost surely be modified. Reload your browser for the current version.

- Predicting sales of Rossmann's stores
- Gentrification Index Using Yelp Data
- Risk estimates of tree mortality across species using Bayesian hierarchical models
- Classification of TV Channels
- Prediction of Coupon Purchasing Behavior
- Classification of Cardiac Tissue Regions Based on Motion Profile in Ultrasound Images
- Spectral Clustering of Chinese Herbal Medicine Network
- Use of Machine Learning in Predicting Bankruptcy
- Distinguishing malignant from benign breast tumors
- Detection of Solar Panels from Satellite Imagery
- Yelp Customer Review Bias Analysis through Linear Mixed Effect Models with Natural Language Sentiment Polarity Scores
- Testing the CAPM Theory for German CDS Based on a Model with GARCH-type Volatilities and SSAEPD Errors
- Bayesian Non-Parametrics and Dirichlet Process Clustering Techniques
- Text Analysis of News Articles (Building a Protest Dataset through Machine Learning)
- Information Popularity and Diffusion Size Prediction in Online Social Networks
- Cascading Classifier for Face Detection
- What's Cooking? Predicting Cuisines from Recipe Ingredients
- Analysing Senator Community Structure from Roll Call Data
- Handwritten Digits Recognition
- A Neural Algorithm for Artistic Style
- Machine Learning with Python
- Predictive Modeling of Bank Marketing for Term Deposit
- Air Pollution Distribution Analysis for Beijing Haze
- Beyond SVD
- Legislation approval ratings prediction via vote correlation
- Categorical Prediction of Song Popularity Using Topological Data Analysis
- Movie Recommender System
- The Effect of Racial Diversity on High School Graduation Rates
- Comparison of feature selection methods in modeling resting metabolic rate
- Randomization as regularization
- Designing an optimum traffic signal system using reinforcement learning
- Topic modeling for community analysis and range estimation
- Classifying Soccer Matches in the English Premier League
- Spectral algorithms and tensor methods for learning in POMDPs
- World Cup Recap
- Dimension Reduction Methods on Handwritten Digits Recognition
- ML methods for Drosophila Dorsal closure
- The Animal Model for Censored Traits
- Spectral Clustering and Community Detection in Labeled Graph
- Cluster Analysis of Endogenous Taxi Driver Schedule Patterns
- (August 24th) Introduction and review: Lecture 1 in notes
- Optional: (video) Christopher Bishop Embracing Uncertainty: The New Machine Intelligence
- Optional: (video) Sam Roweis Machine Learning, Probability and Graphical Models, Part 1
- Optional: (video) Mikaela Keller Basics of probability and statistics for statistical learning
- Optional: Alan Turing Computing Machinery and Intelligence
- Homework: Due Sep. 7. Assignment 1, Solution 1
- Poisson problem HW1.txt
- Gene expression problem test.txt train.txt samples.txt

- (August 26th) No class
- Optional: (video) Michael Jordan Bayesian or Frequentist: Which Are You?

- (August 31st) Linear regression, the proceduralist approach: Lecture 2 in notes
- Optional: Norman R. Draper and R. Craig van Nostrand Ridge regression
- Optional: Elements of Statistical Learning Pages 61-67
- Optional: Proof that leave-k-out is unbiased Lecture notes based on: A. Luntz and V. Brailovsky. Technicheskaya Kibernetica, 3, 1969.
- (September 2nd) Bayesian motivation for proceduralist approach: Lecture 3 in notes
- Optional: (video) Alex Smola Exponential Families
- Strongly suggested: Useful properties of the multivariate normal in notes
- Optional*: Persi Diaconis and Donald Ylvisaker Conjugate priors for exponential families
- (September 7th) Bayesian linear regression: Lecture 4 in notes
- Optional: (video) LISA Short Course: Regression Using Bayesian Statistics in R
- Strongly suggested: Review of Functional analysis in notes
- Homework: Due Sep. 23. Assignment 2, Solution 2
- Gene expression problem test.txt train.txt samples.txt

- (September 9th) Reproducing kernel Hilbert spaces: Lecture 5 in notes
- Optional: (video) Partha Niyogi Introduction to Kernel Methods
- Optional*: Nachman Aronszajn Theory of Reproducing Kernels
- (September 14th) Nonlinear regression: Lecture 6 in notes
- Optional: (video) Partha Niyogi Introduction to Kernel Methods
- Optional: (video) John Shawe-Taylor Kernel Methods and Support Vector Machines
- Strongly suggested: Review of convex optimization in notes
- Strongly suggested if you don't know Lagrange Multipliers: Lagrange multipliers and KKT conditions
- (September 16th, 21st) Support Vector Machines: Lecture 7 in notes
- Optional: (video) Lieven Vandenberghe Convex optimization
- Optional: (video) Stephen Boyd Domain Specific Languages for Convex Optimization

- (September 23rd) Regularized logistic regression: Lecture 8 in notes
- Optional: (video) Nate Otten Introduction to conjugate gradient
- Optional*: Andrew Stuart and Jochen Voss Matrix analysis and algorithms pp. 75-83
- (September 28th) Gaussian process regression: Lecture 9 in notes
- Optional: (video) Carl Rasmussen Gaussian processes
- Optional: (video) David MacKay Gaussian Process Basics
- Optional*: J.L. Doob The elementary Gaussian process
- (September 30th) Sparse regression: Lecture 10 in notes
- Optional: (video) Daniela Witten and Robert Tibshirani The Lasso
- Optional: (video) Trevor Hastie glmnet package
- Homework: Due Oct. 7. Assignment 3
- (October 5th) The boosting hypothesis and Adaboost: Lecture 11 in notes
- Optional: (video) Rob Schapire Theory and Applications of Boosting
- Optional: Leslie Valiant A Theory of the Learnable
- Optional: Rob Schapire The Strength of Weak Learnability
- (October 7th) In class midterm
- (October 14th, 19th) Statistical learning theory: Lecture 12 in notes
- Optional: (video) Leon Bottou and Vladimir Vapnik Foundations of Statistical Learning
- Optional: Vladimir Vapnik and Ya. Chervonenkis On the Uniform Convergence of Relative Frequencies of Events to their Probabilities
- Optional*: Michel Talagrand The Glivenko-Cantelli Problem

- (October 19th, 21st) Mixture models and latent space models: Lecture 13 in notes
- Optional: (video) Victor Lavrenko Expectation maximization
- Optional: (slides) David Sontag Expectation maximization
- Optional*: Dempster, Laird, Rubin Maximum Likelihood from Incomplete Data via the EM Algorithm
- (October 26th, 28th) Latent Dirichlet Allocation: Lecture 14 in notes
- Optional: (video) Dave Blei Topic models
- Optional: (video) John Novembre Methods for the analysis of population structure and admixture
- Optional: (slides) Dave Blei Probabilistic Topic Models
- Optional: Pritchard, Stephens, Donnelly Inference of Population Structure Using Multilocus Genotype Data
- Optional: Blei, Ng, Jordan Latent Dirichlet Allocation
- (November 2nd, 4th) Markov chain Monte Carlo: Lecture 15 in notes
- Optional: (video) Iain Murray MCMC
- Optional: (slides) Iain Murray MCMC
- Optional: Casella, George Explaining the Gibbs Sampler
- Optional*: Levin, Peres, Wilmer Markov Chains and Mixing times
- Optional*: Metropolis, Rosenbluth, Rosenbluth, Teller, Teller Equation of State Calculations by Fast Computing Machines
- (November 9th, 11th) Hidden Markov models: Lecture 16 in notes
- Optional: (video) Nando de Freitas HMMs
- Optional: (slides) Eric Xing HMMs
- Optional: Rabiner A Tutorial on Hidden Markov Models and Selected Applications in Speech Recognition
- (November 23rd) In class final
- (December 4th) Poster session (2pm)
- (December 7th) Final projects due