SAMSI COURSE ON DATA MINING AND MACHINE LEARNING
Instructors:
Professor David Banks : banks@stat.duke.edu
Professor Feng Liang : feng@stat.duke.edu
Class Time:
Wednesdays, 4:30 - 7:00pm
Class begins August 27, 2003

Class Location:
NISS Building, Room 104
Maps and Directions
Distances:
-   Duke - SAMSI: ~ 8.5 miles (14 km)
-   NCSU - SAMSI: ~ 16.5 miles (26 km)
-   UNC - SAMSI: ~ 13.5 miles (22 km)

Course Description
Data mining represents an expanding partnership between statisticians and computer scientists. This SAMSI course aims to bring graduate students up to the research frontier in this area, drawing together the foundations (the curse of dimensionality, smoothing, flexible modeling, recursive partitioning, and parsimony) with more recent innovations (support vector machines, boosting and bagging, model stiffness, data streaming, and the false discovery rate). The class will involve some applications and some illustrative use of software, but the focus will be on theory. Grading will be based on a research project: students will be expected to invent a new idea in this area, implement it, and then test it (this is easier than it may sound).

Prerequisites:

Text:
The main text for the course is Hastie, Tibshirani, and Friedman's "The Elements of Statistical Learning," but it will be supplemented by current articles.
Lectures:
Instructor: David Banks
Instructor: Feng Liang
Links: