STA 521: Predictive Modeling

This is a rough schedule for the course and will be updated regularly. Please check it frequently for adjustments. Announcements will be posted here and made in class. It is up to you to keep up to date on all class and web announcements for the course. Readings, lab assignments, homework, and exam postings will also appear here.

You might try using RStudio's template or vignette for homework submissions. The only drawback I have found so far is that it doesn't display figures very nicely. Perhaps we can find a nice template of our own for the class (extra credit for the best template at the end of the semester)!

Something new that we will try: if you have learned something from the course and would like to share it at the beginning of class, you have 3 to 5 minutes to explain it (and even demo it live). Extra credit will be given for sharing it with the class and, even better, for being willing to share it on the course webpage. Beware: if you go over the 5 minutes, we will stop you!

Slides and notes for this class are based upon ISL, ESL, Bayesian Essentials with R, and slides kindly provided by Professor Ryan Tibshirani (not to be confused with the author of the text, Rob Tibshirani!), Department of Statistics, Carnegie Mellon University. The slides will be posted after class, and homework assignments will be posted on this page as well.

A guide to git can be found at Use Git

The course syllabus can be found here: Syllabus: things you need to know about the course!

Your lab schedule and homework can be found here: Lab and Homework Schedule


Lecture notes

  • Intro to data science and predictive modeling (Lecture 1, August 25, 2015)
  • Intro to Markdown, RStudio, and git (Lecture 2, August 27, 2015)
  • Information retrieval (Lecture 3, September 1, 2015)
  • Information retrieval continued (Lecture 4, September 3, 2015)
  • An introduction to multivariate methods (Lecture 5, September 8, 2015)
  • An introduction to multivariate methods, continued (Lecture 6, September 10, 2015)

  • Multivariate methods, a review (Lecture 7, in place of lab, September 14, 2015)
  • Multivariate methods and contour plots, a review (Lecture 8, September 15, 2015)
  • PageRank (Lecture 9, September 18, 2015)
  • Clustering 1: K-means and K-medoids (Lecture 10, September 22, 2015)
  • Clustering 2: Hierarchical clustering (Lecture 11, September 24, 2015)
  • Clustering 3: Hierarchical clustering (continued); choosing the number of clusters (Lecture 12, September 29)
  • Midterm Exam: Goes out Wednesday September 30. Due Wednesday October 7 at 11:59 PM.
  • Regression: Introduction (Lectures 13, 14, October 2 and 6)
  • Modern Regression (Lectures 15, 16, October 8, October 15, October 20)
  • Bayesian methods 1: An introduction (Lectures 17, 18, October 22, October 26)
  • Bayesian methods 2: More advanced Bayesian methods (Lecture 19, October 29) analysis
  • Bayesian methods 3: The Bayesian lasso (Lecture 20, November 3) analysis
  • Bayesian methods 4: Bayesian model and variable selection (Lecture 21, November 5, 9) analysis
  • Intro to classification methods and introduction to trees (Lecture 22, November 10)
  • Bootstrapping and Bagging (Lecture 23)
  • Random Forests and Bayesian Adaptive Random Forests (Lecture 24)