STA 325: Data Mining and Machine Learning

The syllabus can be found here: Syllabus

All announcements will be posted on Sakai.

Please check the course notes regularly, as they are being written and updated frequently. All updates appear directly on GitHub, so check there for the latest lecture notes. For your convenience, I have listed the GitHub slides and materials below to make the course easier to navigate.


Lecture notes


  • A quick review of R, which I expect you to complete without trouble.
  • Module 0: An introduction to machine learning and a review of R programming
  • Module 1: An introduction to statistical computing, Part I: Functions
  • Module 1: An introduction to statistical computing, Part II: Strings/Textual Data
  • Review for Exam I
  • Module IR: Information Retrieval

  • Module IR: Locality Sensitive Hashing

  • Module 2: Introduction to Statistical Machine Learning

  • Module US: Unsupervised Learning (Ch 10 ISLR)

  • Module 3: Linear Regression (We will not cover this module in class. Be sure to review Chapter 3 of ISLR on your own. I will assume that everyone is familiar with linear regression.)
  • Module 4: Classification: Logistic Regression and Linear Discriminant Analysis (We will not cover this module in class. Please go over logistic regression on your own and see me in person if you would like to discuss LDA or QDA. I've posted the lecture materials in case you have questions about them.)
  • Module 5: Resampling Techniques
  • Module 8: Decision Trees