Statistics 49S: Fundamentals of Modern Statistical Modeling and Data Analysis

  Spring 2009

Course Home Page


Course Description

This course introduces modern statistical science, which combines mathematical theory and computing to answer applied data analysis questions.  Students learn the basics of advanced statistical modeling, with special emphasis on Monte Carlo (computer-intensive) approaches for maximum likelihood and Bayesian inference.  Students study probability and statistical theory, learn to code statistical routines using a statistical programming language, and develop skills at analyzing data.  Applications are drawn from the social sciences, the natural sciences, and professional and intercollegiate sports.  Students who enjoy this course are likely to enjoy advanced courses in statistical science at Duke (see the online description of the major in statistical science).

Course Objectives

Logistics

Prerequisites

Students must have passed Math 31 and Math 32, or the equivalent.  Students should be either (i) comfortable with basic computer programming or (ii) willing to learn basic computer programming during the semester.  A previous statistics course is not required.

Readings

Lavine, M. (2008) Introduction to Statistical Thought. This book is free and can be downloaded at http://www.stat.duke.edu/~michael/book.html.  A copy is on the STA 49S Blackboard site in the Course Documents folder. 

We will also read journal articles provided by Professor Reiter. The primary journals include the Journal of Quantitative Analysis in Sports, Journal of the American Statistical Association, The American Statistician, and Chance.


Computing

We will use the statistical software package R, which can be downloaded for free at http://www.r-project.org/.

Calculator

Students don't need a calculator for this course.

Schedule of Topics

We will cover the topics in the table below.  The time spent on each topic may differ from what is shown, depending on the interests of the class participants.

Topic                                                                   Chapter   Lectures
Basics of calculus-based probability                                    1         6
Modes of inference (MLE, Bayesian methods)                              2         7
Regression (normal and logistic models)                                 3         4
More probability (densities, linear combinations)                       4         2
Special distributions (binomial, Poisson, normal)                       5         2
Bayesian inference (MCMC methods: Gibbs sampler, Metropolis-Hastings)   6         7


Graded work

Graded work for the course will consist of methods assignments, data analysis assignments, and one final project.  There are no exams.  Students' final grades will be determined as follows:

Methods Assignments          30%
Data Analysis Assignments    30%
Final Project                40%

There are no make-ups for assignments except for medical or familial emergencies or for reasons approved by the instructor before the due date.  See the instructor in advance of relevant due dates to discuss possible alternatives.

Descriptions of graded work

Methods Assignments:
Methods assignments are posted on the Statistics 49S course web site on Blackboard.  Students turn in these assignments at the beginning of class on the due date.  Students are permitted to work with others on the assignments, but each person must write up and turn in their own answers.  The methods assignments include questions on the computational and the mathematical aspects of the methods that underpin the statistical models we learn during the semester.

Data Analysis Assignments:
Data analysis assignments are posted on the Statistics 49S course web site on Blackboard.  Students turn in these assignments at the beginning of class on the due date.  Students are permitted to work with others on the assignments, but each person must write up and turn in their own answers.  The data analysis assignments apply the skills and models discussed in seminar and the readings to analyze data.

Final Project:
For the final project, students work individually to analyze a data-based research question of their choosing, subject to the instructor's approval.  Students can ask the instructor for assistance with identifying appropriate data.  Students write an 8-12 page paper describing their data analysis.  The paper is due at the end of the semester.  Students present their research to the class at the end of the semester.  See the Course Assignments folder on Blackboard or the online instructions for the final project for details (these are the same files).


Academic honesty

Students are expected to abide by Duke's Community Standard for all work for this course.  Violations of the Standard will result in a failing final grade for this course and will be reported to the Dean of Students for adjudication.  Ignorance of what constitutes academic dishonesty is not a justifiable excuse for violations.

For the methods and data analysis assignments, students may work in study groups, but each student must submit his or her own answers.  For the final project, students are required to perform the data analysis, write statistical programs, and write the paper individually (although students can consult with the instructor and discuss ideas with other students in the class).