ISBA 2000 Tutorials
SUNDAY 28 MAY: 1:00pm-2:30pm
TUTORIAL 1: HIERARCHICAL MODELLING
David Draper, University of Bath, d.draper@maths.bath.ac.uk
This tutorial will provide a very brief introduction to the formulation,
fitting, and checking of hierarchical -- or multi-level -- models from the
Bayesian point of view. Hierarchical models (HMs) arise frequently in five
main kinds of applications:
- HMs are common in fields such as health and education, in which data --
both outcomes and predictors -- are often gathered in a nested or
hierarchical fashion: for example, patients within hospitals, or students
within classrooms within schools. HMs are thus also ideally suited to the
wide range of applications in government and business in which single- or
multi-stage cluster samples are routinely drawn, and offer a unified
approach to the analysis of random-effects (variance-components) and mixed
models (a minimal two-level formulation is sketched after this list).
- A different kind of nested data arises in meta-analysis in, e.g.,
medicine and the social sciences. In this setting the goal is to combine
information from a number of studies of essentially the same phenomenon, to
produce more accurate inferences and predictions than those available from
any single study. Here the data structure is subjects within studies, and
as in the clustered case above, there will generally be predictors
available at both the subject and study levels.
- When individuals -- in medicine, for instance -- are sampled
cross-sectionally but then studied longitudinally, with outcomes observed
at multiple time points for each person, a hierarchical data structure of
the type studied in repeated-measures or growth curve analyses arises, with
the readings at different time points nested within each person.
- For simplicity people often try to model data as (conditionally) IID at a
fairly high level of aggregation -- for instance, by pretending that all
the subjects in a sampling experiment are drawn homogeneously from a single
population. In fact, heterogeneity is often the rule rather than the
exception, and frequently the available predictor variables do not
"explain" this heterogeneity sufficiently. With recent computational
advances it is becoming increasingly straightforward to at least describe
such heterogeneity with mixture models that employ latent variables
(unobserved predictors) in a hierarchical structure. Examples include
density estimation with an unknown number of sub-populations mixed together
(a finite-mixture sketch appears after this list), and Bayesian
nonparametric modeling, in which people work with distributions whose
sample spaces are themselves sets of distributions instead of (say) real
numbers.
- Hierarchical modeling also provides a natural way to treat issues of
model selection and model uncertainty with all types of data, not just
cluster samples or repeated-measures outcomes. For example, in regression,
if the data appear to exhibit residual variation that changes with the
predictors, you can expand the model that assumes constant variation by
embedding it hierarchically in a family of models that span a variety of
assumptions about residual variation (see the sketch after this list). In
this way, instead of having to choose one of these models and risk making
the wrong choice, you can work with several models at once, weighting them
in proportion to their plausibility given the data.
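As a concrete anchor for the first three settings above, here is a minimal
two-level Gaussian formulation (a sketch only; the distributional choices
are illustrative, not the tutorial's own specification) for outcomes y_ij
on units i nested within clusters j, where a cluster may be a hospital, a
study, or a person:

    \begin{align*}
      (y_{ij} \mid \theta_j, \sigma^2) &\sim N(\theta_j, \sigma^2),
        \qquad i = 1, \dots, n_j, \\
      (\theta_j \mid \mu, \tau^2) &\sim N(\mu, \tau^2),
        \qquad j = 1, \dots, J, \\
      (\mu, \sigma^2, \tau^2) &\sim p(\mu, \sigma^2, \tau^2).
    \end{align*}

Reading j as a study index gives the usual random-effects meta-analysis
model; reading it as a person index gives the skeleton of a
repeated-measures model.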
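For the latent-structure setting in the fourth item, a finite-mixture
sketch (again purely illustrative; the number of components k may itself
be unknown and given a prior) is

    \begin{align*}
      p(y \mid k, \pi, \phi) = \sum_{m=1}^{k} \pi_m \, f(y \mid \phi_m),
      \qquad \pi_m \ge 0, \quad \sum_{m=1}^{k} \pi_m = 1,
    \end{align*}

with priors on k, the weights \pi, and the component parameters \phi_m.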
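And for the model-expansion idea in the final item, one illustrative
hierarchical embedding of constant-variance regression (the variance
function g is an assumption made here for concreteness) is

    \begin{align*}
      (y_i \mid \beta, \sigma_i^2) &\sim N(x_i^\top \beta, \sigma_i^2), \\
      \log \sigma_i^2 &= \gamma_0 + \gamma_1 \, g(x_i),
    \end{align*}

where \gamma_1 = 0 recovers the constant-variance model, and a prior on
(\gamma_0, \gamma_1) lets the data weight the competing assumptions.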
In studying HMs, two kinds of technical issues also arise:
- fully Bayesian computation in HMs requires the use of simulation methods
such as those based on Markov chain Monte Carlo (MCMC) ideas, and
- as usual with any class of statistical models there are questions of
model (as opposed to MCMC) diagnostics.
MCMC methods will be covered in the second tutorial by Brad Carlin. In the
first session I will give a brief overview of some of the topics above,
concentrating on two real examples: a meta-analysis of teacher expectancy
studies in education, and a random-effects Poisson regression problem
arising from a controlled trial of in-home geriatric assessment for elderly
people in Scandinavia.
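For orientation, the model class in the second example, random-effects
Poisson regression, has the generic form (a sketch; the specification
actually used in the trial analysis may differ)

    \begin{align*}
      (y_{ij} \mid \lambda_{ij}) &\sim \text{Poisson}(\lambda_{ij}), \\
      \log \lambda_{ij} &= x_{ij}^\top \beta + u_j,
      \qquad u_j \sim N(0, \sigma_u^2),
    \end{align*}

with counts y_ij for person i in cluster j and cluster-level random
effects u_j.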
SUNDAY 28 MAY: 2:45pm-4:15pm
TUTORIAL 2: BAYESIAN COMPUTATION
Bradley Carlin, University of Minnesota, brad@muskie.biostat.umn.edu
Bayesian methods have increased in popularity over the past decade, in
large part due to advances in statistical computing that allow the
evaluation of complex posterior distributions using Markov chain Monte
Carlo (MCMC) integration methods such as the Gibbs sampler and the
Metropolis-Hastings algorithm. In this tutorial we begin by reviewing
these algorithms, their various hybrid forms, and methods for their
monitoring and acceleration, as well as estimation of associated
standard errors. We also discuss several recent developments in MCMC,
including reversible jump MCMC, slice sampling, structured MCMC, and
overrelaxation. Methodological areas where MCMC methods have made a
particularly large impact (such as model choice) will also be
discussed.
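To fix ideas, here is a minimal random-walk Metropolis sketch in Python
(everything here -- the standard normal target, step size, and run length
-- is an illustrative assumption, not code from the tutorial):

    import math
    import random

    def log_target(theta):
        # Unnormalized log-density of the target; a standard normal,
        # chosen purely for illustration.
        return -0.5 * theta * theta

    def metropolis(n_iter=10000, step=1.0, theta0=0.0):
        # Random-walk Metropolis: propose symmetrically around the
        # current state, accept with probability min(1, target ratio).
        theta, samples = theta0, []
        for _ in range(n_iter):
            proposal = theta + random.gauss(0.0, step)
            log_ratio = log_target(proposal) - log_target(theta)
            if random.random() < math.exp(min(0.0, log_ratio)):
                theta = proposal  # accept; otherwise keep current state
            samples.append(theta)
        return samples

    draws = metropolis()
    kept = draws[5000:]  # discard the first half as burn-in
    print(sum(kept) / len(kept))  # posterior mean; near 0 for this target

The Gibbs sampler arises as the special case in which each coordinate is
drawn from its full conditional distribution, so every "proposal" is
accepted.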
In the second part of the tutorial, we describe and demonstrate what
is currently the most general software for this purpose, the WinBUGS
package produced by the MRC Biostatistics Unit at the University of
Cambridge. WinBUGS 1.3, the Windows-based version of the original BUGS
program, offers several enhancements, including expanded Metropolis
algorithm capability, numerical and graphical methods for convergence
diagnosis and output analysis (reminiscent of the existing S-based
CODA functions), and a "front end" (GUI) that can create the
relevant sampling code directly from a user-specified graphical model.
A new add-on for spatial statistical analysis, GeoBUGS, will also be
discussed. Data examples will be presented throughout, as appropriate,
arising from statistical application areas such as disease risk mapping,
interim analysis for clinical trials, cross-study meta-analysis, and
linear and nonlinear longitudinal modeling.
Much of the presentation's theory and examples will be drawn from the
textbook Bayes and Empirical Bayes Methods for Data Analysis, 2nd ed.
(June 2000, Chapman and Hall/CRC Press; ISBN 0412056119).