ISBA 2000 Tutorials
SUNDAY 28 MAY: 1:00pm-2:30pm
TUTORIAL 1: HIERARCHICAL MODELLING
David Draper, University of Bath, d.draper@maths.bath.ac.uk
This tutorial will provide a very brief introduction to the formulation,
fitting, and checking of hierarchical -- or multi-level -- models from the
Bayesian point of view. Hierarchical models (HMs) arise frequently in five
main kinds of applications:
- HMs are common in fields such as health and education, in which data --
both outcomes and predictors -- are often gathered in a nested or
hierarchical fashion: for example, patients within hospitals, or students
within classrooms within schools. HMs are thus also ideally suited to the
wide range of applications in government and business in which single- or
multi-stage cluster samples are routinely drawn, and offer a unified
approach to the analysis of random-effects (variance-components) and mixed
models (a minimal two-level formulation is sketched after this list).
- A different kind of nested data arises in meta-analysis in, e.g.,
medicine and the social sciences. In this setting the goal is to combine
information from a number of studies of essentially the same phenomenon, to
produce more accurate inferences and predictions than those available from
any single study. Here the data structure is subjects within studies, and
as in the clustered case above, there will generally be predictors
available at both the subject and study levels.
- When individuals -- in medicine, for instance -- are sampled
cross-sectionally but then studied longitudinally, with outcomes observed
at multiple time points for each person, a hierarchical data structure of
the type studied in repeated-measures or growth curve analyses arises, with
the readings at different time points nested within each person.
- For simplicity people often try to model data as (conditionally) IID at a
fairly high level of aggregation -- for instance, by pretending that all
the subjects in a sampling experiment are drawn homogeneously from a single
population. In fact, heterogeneity is often the rule rather than the
exception, and frequently the available predictor variables do not
"explain" this heterogeneity sufficiently. With recent computational
advances it is becoming increasingly straightforward to at least describe
such heterogeneity with mixture models that employ latent variables
(unobserved predictors) in a hierarchical structure. Examples include
density estimation with an unknown number of sub-populations mixed together
(a finite-mixture sketch appears after this list), and Bayesian
nonparametric modeling, in which people work with distributions whose
sample spaces are themselves sets of distributions instead of (say) real
numbers.
- Hierarchical modeling also provides a natural way to treat issues of
model selection and model uncertainty with all types of data, not just
cluster samples or repeated-measures outcomes. For example, in regression,
if the data appear to exhibit residual variation that changes with the
predictors, you can expand the model that assumes constant variation by
embedding it hierarchically in a family of models that span a variety of
assumptions about residual variation (see the sketch after this list). In
this way, instead of having to choose one of these models and risk making
the wrong choice, you can work with several models at once, weighting them
in proportion to their plausibility given the data.
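As a concrete anchor for the first three settings above, here is a minimal
two-level Gaussian formulation (a sketch only; the distributional choices
are illustrative, not the tutorial's own specification) for outcomes y_ij
on units i nested within clusters j, where a cluster may be a hospital, a
study, or a person:

    \begin{align*}
      (y_{ij} \mid \theta_j, \sigma^2) &\sim N(\theta_j, \sigma^2),
        \qquad i = 1, \dots, n_j, \\
      (\theta_j \mid \mu, \tau^2) &\sim N(\mu, \tau^2),
        \qquad j = 1, \dots, J, \\
      (\mu, \sigma^2, \tau^2) &\sim p(\mu, \sigma^2, \tau^2).
    \end{align*}

Reading j as a study index gives the usual random-effects meta-analysis
model; reading it as a person index gives the skeleton of a
repeated-measures model.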
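For the latent-structure setting in the fourth item, a finite-mixture
sketch (again purely illustrative; the number of components k may itself
be unknown and given a prior) is

    \begin{align*}
      p(y \mid k, \pi, \phi) = \sum_{m=1}^{k} \pi_m \, f(y \mid \phi_m),
      \qquad \pi_m \ge 0, \quad \sum_{m=1}^{k} \pi_m = 1,
    \end{align*}

with priors on k, the weights \pi, and the component parameters \phi_m.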
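And for the model-expansion idea in the final item, one illustrative
hierarchical embedding of constant-variance regression (the variance
function g is an assumption made here for concreteness) is

    \begin{align*}
      (y_i \mid \beta, \sigma_i^2) &\sim N(x_i^\top \beta, \sigma_i^2), \\
      \log \sigma_i^2 &= \gamma_0 + \gamma_1 \, g(x_i),
    \end{align*}

where \gamma_1 = 0 recovers the constant-variance model, and a prior on
(\gamma_0, \gamma_1) lets the data weight the competing assumptions.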
In studying HMs, two kinds of technical issues also arise:
- fully Bayesian computation in HMs requires the use of simulation methods
such as those based on Markov chain Monte Carlo (MCMC) ideas, and
- as usual with any class of statistical models there are questions of
model (as opposed to MCMC) diagnostics.
MCMC methods will be covered in the second tutorial by Brad Carlin. In the
first session I will give a brief overview of some of the topics above,
concentrating on two real examples: a meta-analysis of teacher expectancy
studies in education, and a random-effects Poisson regression problem
arising from a controlled trial of in-home geriatric assessment for elderly
people in Scandinavia.
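For orientation, the model class in the second example, random-effects
Poisson regression, has the generic form (a sketch; the specification
actually used in the trial analysis may differ)

    \begin{align*}
      (y_{ij} \mid \lambda_{ij}) &\sim \text{Poisson}(\lambda_{ij}), \\
      \log \lambda_{ij} &= x_{ij}^\top \beta + u_j,
      \qquad u_j \sim N(0, \sigma_u^2),
    \end{align*}

with counts y_ij for person i in cluster j and cluster-level random
effects u_j.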
SUNDAY 28 MAY: 2:45pm-4:15pm
TUTORIAL 2: BAYESIAN COMPUTATION
Bradley Carlin, University of Minnesota, brad@muskie.biostat.umn.edu
Bayesian methods have increased in popularity over the past decade, in
large part due to advances in statistical computing that allow the
evaluation of complex posterior distributions using Markov chain Monte
Carlo (MCMC) integration methods such as the Gibbs sampler and the
Metropolis-Hastings algorithm. In this tutorial we begin by reviewing
these algorithms, their various hybrid forms, and methods for their
monitoring and acceleration, as well as estimation of associated
standard errors. We also discuss several recent developments in MCMC,
including reversible jump MCMC, slice sampling, structured MCMC, and
overrelaxation. Methodological areas where MCMC methods have made a
particularly large impact (such as model choice) will also be
discussed.
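To fix ideas, here is a minimal random-walk Metropolis sketch in Python
(everything here -- the standard normal target, step size, and run length
-- is an illustrative assumption, not code from the tutorial):

    import math
    import random

    def log_target(theta):
        # Unnormalized log-density of the target; a standard normal,
        # chosen purely for illustration.
        return -0.5 * theta * theta

    def metropolis(n_iter=10000, step=1.0, theta0=0.0):
        # Random-walk Metropolis: propose symmetrically around the
        # current state, accept with probability min(1, target ratio).
        theta, samples = theta0, []
        for _ in range(n_iter):
            proposal = theta + random.gauss(0.0, step)
            log_ratio = log_target(proposal) - log_target(theta)
            if random.random() < math.exp(min(0.0, log_ratio)):
                theta = proposal  # accept; otherwise keep current state
            samples.append(theta)
        return samples

    draws = metropolis()
    kept = draws[5000:]  # discard the first half as burn-in
    print(sum(kept) / len(kept))  # posterior mean; near 0 for this target

The Gibbs sampler arises as the special case in which each coordinate is
drawn from its full conditional distribution, so every "proposal" is
accepted.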
In the second part of the tutorial, we describe and demonstrate what
is currently the most general software for this purpose, the WinBUGS
package produced by the MRC Biostatistics Unit at the University of
Cambridge. WinBUGS 1.3, the Windows-based version of the original BUGS
program, offers several enhancements, including expanded Metropolis
algorithm capability, numerical and graphical methods for convergence
diagnosis and output analysis (reminiscent of the existing S-based
CODA functions), and a "front end" (GUI) that can create the
relevant sampling code directly from a user-specified graphical model.
A new add-on for spatial statistical analysis, GeoBUGS, will also be
discussed. Data examples will be presented throughout, as appropriate,
arising from statistical application areas such as disease risk mapping,
interim analysis for clinical trials, cross-study meta-analysis, and
linear and nonlinear longitudinal modeling.
Much of the presentation's theory and examples will be drawn from the
textbook Bayes and Empirical Bayes Methods for Data Analysis, 2nd ed.
(June 2000, Chapman and Hall/CRC Press; ISBN 0412056119).