Statistical Science 790.01
Methods for Missing Data

  Spring 2024



Course Description

In this mini-course (4 weeks long, 1 credit), we review theory and methods for handling missing data. We discuss types of missing data and how they affect inference, the pros and cons of commonly used approaches to handling missing data, and the Bayesian and frequentist theory underpinning multiple imputation, a method that applied scientists often use to handle missing data. We will also review some open research problems that offer opportunities for thesis work. The course is taught at a level appropriate for PhD students and advanced MSS students.

Course Objectives

Logistics

Prerequisites

Students must have passed STA 602/702 and must have taken courses in regression analysis.

Readings

There are no required texts for this course. Instead, we will read articles and other materials posted on the course website on Canvas. A useful text for reference is:

Little, R. J. A. and Rubin, D. B. (2002), Statistical Analysis with Missing Data, 2nd edition. John Wiley & Sons.

Schedule of Topics

We will cover the topics in the table below. The time spent on each topic may differ from what is shown, depending on the interests of the participants in the course.

Topic                                                                 Lectures
Introduction: Missing data mechanisms                                 1
Multiple imputation                                                   3
Research: Nonignorable missing data                                   1
Research: Data fusion and combining information                       1
Research: Privacy in statistical databases as a missing data problem  2


Graded work

The course is graded as credit/no credit. Students earn credit through class participation and by writing a document describing either (i) a research question that they find potentially interesting or (ii) something they learned during the mini-course. The document should be typed and one to two pages long. Upload the document to the "One Pager" assignment on the Assignments page of the course website.