Fall 2019
This course introduces students to Gaussian linear models and extensions for model building, including exploratory data analysis techniques and model checking, variable transformations and selection, parameter estimation and interpretation, prediction, hierarchical models, model selection and Bayesian model averaging. The concepts of linear models will be covered from Bayesian and classical viewpoints. Topics in Markov chain Monte Carlo simulation will be introduced as required. Co-requisite STA 601.
Name | Email | Location | Office hours |
---|---|---|---|
Dr. Merlise Clyde | clyde@duke.edu | 223E Old Chem | Wed 3-4 & Fri 2:15-3:15 or by appointment |
Vittorio Orlandi (TA) | vittorio.orlandi@duke.edu | TBA | TBA |
Pritam Dey (TA) | pritam.dey@duke.edu | TBA | TBA |
Textbook | Ordering Information |
---|---|
Plane Answers to Complex Questions, Ronald Christensen (2011), 4th Edition, Springer-Verlag, NY | The textbook is freely available as an eBook through the Duke Library. You’re welcome to read it on screen or print it out. If you prefer a paperback version, you can buy it at the cost of printing from Springer, or purchase a hardback version at the Bookstore. |
Data Analysis Using Regression and Multilevel/Hierarchical Models, Gelman & Hill (2009) | ISBN-13: 978-0521686891 |
Bayesian and Frequentist Regression Methods, Wakefield (2013), First Edition, Springer-Verlag | ISBN 978-1-4419-0924-4. The textbook is freely available as an eBook through the Duke Library. You’re welcome to read it on screen or print out chapters. |
A First Course in Bayesian Statistical Methods, Hoff, P. D. (2009), Springer | ISBN 978-0-387-92299-7. Review chapters via eBook through the Duke Library. Used in co-requisite course STA 601/STA 360. |
We will use R as the programming language for computation and data analysis, relying on existing R packages to support the course. All students will have access to RStudio/R on a server within the department and support during the labs. You are also free to run RStudio/R on your personal laptop or desktop. See the Resources page for books and other resources for learning R. You should bring a calculator to exams. There are no restrictions on the type of calculator, except that it may not be on a mobile device.
We will be using Piazza for class discussion. The system is designed to get you help quickly and efficiently from classmates, the TAs, and me. Rather than emailing questions to the teaching staff, I encourage you to post your questions on Piazza (peer answers earn participation points!). If you have any problems or feedback for the developers, email team@piazza.com.
Find our class page on Piazza or through the link on Sakai.
This course has achieved Duke’s Green Classroom Certification. The certification indicates that the faculty member teaching this course has taken significant steps to green the delivery of this course. Your faculty member has completed a checklist indicating their common practices in areas of this course that have an environmental impact, such as paper and energy consumption. Some common practices implemented by faculty to reduce the environmental impact of their course include allowing electronic submission of assignments, providing online readings and turning off lights and electronics in the classroom when they are not in use. The eco-friendly aspects of course delivery may vary by faculty, by course and throughout the semester. Learn more at http://sustainability.duke.edu/action/certifications/classroom/index.php.
Tentative outline; please refresh for the latest version. Each Lecture/HW has additional details, including reading assignments, code and data.
Week | Date | Topic | HW |
---|---|---|---|
1 | 08-26-2019 | Introduction | |
1 | 08-28-2019 | MLE | |
1 | 08-30-2019 | Lab 1: Intro to Weaving LaTeX and R | |
2 | 09-02-2019 | Projections & Expectations | HW1 |
2 | 09-04-2019 | Normal Theory | |
2 | 09-06-2019 | Lab 2: Introduction to GitHub and RStudio | See invitation sent from Sakai |
3 | 09-09-2019 | Sampling Distributions | HW2 |
3 | 09-11-2019 | Prediction | |
3 | 09-13-2019 | Lab 3: | |
4 | 09-16-2019 | Gauss-Markov and Prediction | HW3 |
4 | 09-18-2019 | Bayes Estimation in Linear Models | |
4 | 09-20-2019 | Lab 4: Writing functions, coding style, and Q&A | |
5 | 09-23-2019 | Conjugate Priors in Linear Models | HW4 |
5 | 09-25-2019 | Non-informative Priors | |
5 | 09-27-2019 | Lab 5: Q&A | |
6 | 09-30-2019 | G-Priors and Prior Choices | |
6 | 10-02-2019 | Review | |
6 | 10-04-2019 | Midterm | |
7 | 10-07-2019 | Fall Break | |
7 | 10-09-2019 | Cauchy Priors: Mixtures & MCMC | |
7 | 10-11-2019 | Lab 6: JAGS | HW5 |
8 | 10-14-2019 | Bayes Estimation | |
8 | 10-14-2019 | Ridge Regression | |
8 | 10-16-2019 | Bayesian Ridge Regression | |
9 | 10-21-2019 | Lasso and Bayesian Lasso Regression | HW6 |
9 | 10-23-2019 | Shrinkage Priors and Selection | |
10 | 10-28-2019 | Testing and Model Comparison | HW7 (Nott & Kohn code) |
10 | 10-30-2019 | Testing and Model Comparison continued | |
11 | 11-04-2019 | Model Choice | |
11 | 11-06-2019 | BMA | HW8 |
12 | 11-11-2019 | Criteria for Priors for use in BMA/BVS | |
12 | 11-13-2019 | MCMC in BMA/BVS and inference | |
13 | 11-18-2019 | Factors and Hierarchical Models | |
13 | 11-20-2019 | Residuals and Checking | |
13 | | Transformations & Normality | |
14 | 11-25-2019 | Robustness | TakeHome Data Analysis |
14 | 11-27-2019 | Thanksgiving Break | |
15 | 12-02-2019 | Graduate Reading Period | |
15 | 12-04-2019 | Graduate Reading Period | |
16 | 12-09-2019 | Graduate Reading Period | |
16 | 12-12-2019 | Final Exam 2-5 | Link Classroom 5 |
Computing & Other Resources
JAGS is Just Another Gibbs Sampler. It is a program for analysis of Bayesian hierarchical models using Markov Chain Monte Carlo (MCMC) simulation not wholly unlike BUGS. The name is a misnomer as JAGS implements more than just Gibbs Samplers. JAGS was written with three aims in mind:
Resources for JAGS:
Gilbert Strang’s Online Course at MIT
Matrix Algebra from a Statistician’s Perspective. Harville, David A. eBook in Duke Library
Plane Answers to Complex Questions, Christensen, Ronald. eBook in Duke Library
Using emacs as an editor for R, C/C++, LaTeX provides a great environment for editing, compiling and debugging - you can even use it as a shell!
Course expectations, outline, grading policy, and more
This course introduces students to linear models and their extensions for model building, including exploratory data analysis techniques, variable transformations and selection, parameter estimation and interpretation, prediction, hierarchical models, model selection and Bayesian model averaging. The concepts of linear models will be covered from Bayesian and classical viewpoints. Topics in Markov chain Monte Carlo simulation will be introduced as required; however, students are expected to have either taken STA 601 or be co-registered in it.
All students should be extremely comfortable with linear/matrix algebra and mathematical statistics at the level of STA 611 or equivalent. Statistical Inference by Casella and Berger is an excellent resource in case you need to review any mathematical statistics. If you need to review linear algebra, please explore the material under Resources and links; Gilbert Strang’s online course is highly recommended.
The course goals are as follows:
Course topics will be drawn (but subject to change) from
Please check the website for updates, slides and current readings.
Homework | 20% |
Midterm | 25% |
TakeHome | 25% |
Final | 25% |
Participation | 5% |
Grades may be curved at the end of the semester. Cumulative numerical averages of 90 - 100 are guaranteed at least an A-, 80 - 89 at least a B-, and 70 - 79 at least a C-; however, the exact ranges for letter grades will be determined after the final exam. The more evidence there is that the class has mastered the material, the more generous the curve will be.
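As a worked illustration of how the grading weights combine, suppose a student has the following component scores (hypothetical numbers chosen only for this example): Homework 92, Midterm 85, TakeHome 88, Final 80, Participation 100. Assuming each component is scored on a 0 to 100 scale, the cumulative average would be

$$
\begin{aligned}
\text{Average} &= 0.20(92) + 0.25(85) + 0.25(88) + 0.25(80) + 0.05(100) \\
&= 18.4 + 21.25 + 22 + 20 + 5 \\
&= 86.65
\end{aligned}
$$

An average of 86.65 falls in the 80 - 89 range, so this student would be guaranteed at least a B-; the final letter grade could be higher depending on the curve.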
These will be assigned weekly on the course webpage.
The objective of the problem sets is to help you develop a more in-depth understanding of the material and help you prepare for exams and projects. Grading will be based on completeness as well as accuracy. In order to receive credit you must show all your work.
No late assignments will be accepted; however, the lowest score will be dropped.
You are welcome, and encouraged, to work with each other on the problems, but you must turn in your own work. If you copy someone else’s work, both parties will receive a 0 for the problem set grade and will be reported to the Office of Student Conduct. Work submitted on Sakai will be checked for instances of plagiarism prior to being graded.
Submission instructions: You will submit your HW on Sakai by uploading a PDF. If the TAs cannot view your work, or read your handwriting, you will lose points accordingly. We will use R/knitr with $\LaTeX$ to prepare assignments, and GitHub Classroom for data analysis.
You are expected to be present at class meetings and to participate actively in the discussion. Your attendance and participation during class, as well as your activity on the discussion forum on Sakai, will make up 5% of your grade in this class. While I might sometimes call on you during the class discussion, it is your responsibility to be an active participant without being called on.
The objective of the TakeHome is to give you independent applied research experience using real data and statistical methods. You will use all (relevant) techniques learned in this class to analyze a dataset provided by me.
Further details on the TakeHome will be provided as due dates approach.
Note that you must score at least 30% of the points on the TakeHome Exam in order to pass this class.
There will be one midterm and one final in this class. See course info for dates and times of the exams. You are allowed to use one sheet of notes (“cheat sheet”) on the midterm and two for the final. Each sheet must be no larger than 8 1/2 x 11 inches and must be prepared by you. You may use both sides of each sheet and can write as small as you wish.
No late Homework
The lowest HW score will be dropped automatically at the end of the semester
Late work policy for TakeHome Data Analysis: 10% off for each day late.
The final exam must be taken at the stated time. Please book flights accordingly!
There will be no makeup exams; if you miss the midterm for any reason, your predicted grade given the other information from the class will be used to fill in the missing grade.
Regrade requests must be made within 3 days of when the assignment is returned, and must be submitted in writing. These will be honored if points were tallied incorrectly, or if you feel your answer is correct but it was marked wrong. No regrade will be made to alter the number of points deducted for a mistake. There will be no grade changes after the final exam.
Use of disallowed materials (textbook, class notes, web references, any form of communication with classmates or other persons, etc.) during in-class exams will not be tolerated. For the TakeHome data analysis, students are limited to materials covered in class or course resources; no external queries or use of outside resources are permitted. Violations will result in a 0 on the exam for all students involved and possible failure of the course, and will be reported to the Office of Student Conduct. If you have any questions about whether something is or is not allowed, please ask me beforehand.
I will regularly send announcements by email through Sakai; please make sure to check your email daily.
Any non-personal questions related to the material covered in class, problem sets, labs, projects, etc. should be posted on the Piazza forum. Before posting a new question, please check whether your question has already been answered. The TAs and I will be answering questions on the forum daily, and all students are expected to answer questions as well. Please use informative titles for your posts.
Note that it is more efficient to answer most statistical questions “in person”, so make use of Office Hours.
Students with disabilities who believe they may need accommodations in this class are encouraged to contact the Student Disability Access Office at (919) 668-1267 as soon as possible to better ensure that such accommodations can be made.
Duke University is a community dedicated to scholarship, leadership, and service and to the principles of honesty, fairness, respect, and accountability. Citizens of this community commit to reflect upon and uphold these principles in all academic and non-academic endeavors, and to protect and promote a culture of integrity. Cheating on exams and quizzes, plagiarism on homework assignments and projects, lying about an illness or absence and other forms of academic dishonesty are a breach of trust with classmates and faculty, violate the Duke Community Standard, and will not be tolerated. Such incidents will result in a 0 grade for all parties involved and will be reported to the Office of Student Conduct. Additionally, there may be penalties to your final class grade. Please review Duke’s Academic Dishonesty policies.
Most Announcements will be made through Sakai