abstracts

Speaker: Bruno Sansó

Title: Spatial statistics: introduction. I

In the first two sessions an introduction to the topic of spatial statistics will be given, covering the terminology of the field and briefly reviewing the most common types of models: geostatistical models, lattice models and point processes. Particular attention will be given to the properties of Gaussian random fields and their correlation structures.

January 22

Speaker: Bruno Sansó

Title: Spatial statistics: introduction. II

January 22

Speaker: David Higdon

Title: Markov Random Fields in magnetoresistance imaging

Scanning magnetoresistance microscopy is an emerging technology for detecting very small ferromagnetic signals. The behavior of the magnetic field, along with properties of the imaging technology, leads to a blurred version of the underlying signal. This talk will give a simple overview of magnetoresistance imaging and will look at the particular problem of estimating the head sensitivity function. For this problem we take a Bayesian approach using a couple of different Markov random fields to induce regularity in this effort.

January 22

Speaker: Katja Ickstadt

Title: Disparate Data Resolutions in Environmental and Ecological Modeling

Disease counts in small geographic areas are typically modeled as Poisson random variables with intensities whose logarithms are modeled by a Gaussian random field. Area-specific covariates are incorporated as additive terms in the model for the log intensities (this is known as ecological regression). However, in the conventional approach covariates are restricted to enter the model at the same spatial scale as the disease counts. Environmental covariates (e.g. pollution concentrations) are typically measured and reported at a finer resolution than are the disease counts; replacing these measurements with their averages over the larger areas may obscure local variation (and high local peaks) which could be critical for explaining the disease. We use a flexible class of Bayesian hierarchical Poisson/gamma models for spatially correlated event data, which overcomes the problem of disparate spatial scales by relating all observable quantities to an underlying continuous random field model. Markov chain Monte Carlo methods using data augmentation are employed to estimate posterior distributions. The models are applied to an ecological regression analysis of severe childhood wheeze and nitrogen dioxide (NO_2) pollution in Huddersfield, an English industrial town. The data are part of the SAVIAH study, an ongoing investigation into Small Area Variations In Air quality and Health funded by the European Community. Disease counts are recorded for 427 census enumeration districts; NO_2 levels and their measurement-error variances are available at 4,500 evenly-spaced grid-points covering the region.

February 3

Speaker: Brani Vidakovic

Title: Spatial models in wavelet shrinkage - wavelets in theory of shapes.

The talk will consist of two parts. In the first part I will discuss the shrinkage action in the wavelet domain by models usually attributed to spatial statistics. A critical review and important references will be given. Examples are done in collaboration with Dave Higdon. In the second part I'll describe some recent experiments performed on wavelet description of shapes. Traditional Fourier descriptions are replaced by wavelet decompositions. Difficulties and merits of the proposal are discussed. A good reference is the book by Stoyan and Stoyan [Fractals, Random Shapes, and Point Fields, Wiley, 1994].

February 5

Speaker: Richard Smith

Title: Estimating Nonstationary Spatial Correlations

Many classical methods of geostatistics assume spatial correlations derived from a stationary, isotropic process. However, many environmental processes are clearly not stationary and isotropic. This poses the problem of how to characterize spatial correlations in this case. An ingenious solution, due to Guttorp and Sampson, is based on assuming that some nonlinear transformation of the observation space will result in a process which is stationary and isotropic in the transformed space. Their methods of estimating this transformation were somewhat ad hoc, however. In this presentation we give an alternative likelihood-based approach, with applications to ozone modeling and to spatial correlations among temperatures in the continental USA.

February 10

Speaker: Colin McCulloch

Title: Atlas-Based Feature Identification in Two- and Three-dimensional Images

With the advent of full three-dimensional medical imaging techniques (e.g. MRI, PET, SPECT) the technician's task of image analysis is becoming increasingly difficult. To identify features in three dimensions, a technician must typically engage in the tedious chore of examining numerous lower dimensional representations of parts of the data set, for instance slices through the volume or volume-rendered views. The pursuit of automatic image analysis, previously sought after in two-dimensional images for objective measurements and to reduce operator burden, therefore has become proportionally more valuable in these larger image sets. Unfortunately, current methods for analyzing three-dimensional images typically involve a staggering computational burden. The goal of this research is to realize automated image analysis by framing the problem in terms of matching the image to a pre-analyzed atlas image, thereby applying that analysis to the new image. We have formulated a model that, given the positions of a large number of points (~250,000) in an atlas image, assigns a probability distribution to their locations in a new image from the image class characterized by the atlas. The model combines a joint hierarchical normal model on the expected locations of all these points with a matching measure based on local cross-correlations between the atlas and the new image. Application of the model in two and three dimensions is feasible with reasonable computational load due to the extensive conditional independence afforded by the model's hierarchical structure.

February 12

Speaker: Víctor De Oliveira

Title: Bayesian Prediction of Transformed Gaussian Random Fields

A model for prediction in some types of non-Gaussian random fields is presented. It extends the work of Handcock and Stein (1993) to prediction in transformed Gaussian random fields, where the transformation is known to belong to a parametric family of monotone transformations. The Bayesian Transformed Gaussian model (BTG) provides an alternative to trans-Gaussian kriging taking into account the major sources of uncertainty, including uncertainty about the `normalizing transformation' itself, in the computation of the predictive density function. Unlike trans-Gaussian kriging, this approach mitigates the consequences of a mis-specified transformation, giving in this sense a more robust predictive inference. Because the mean of the predictive distribution does not exist for some commonly used families of transformations, the median is used as the optimal predictor. The BTG model is applied in the spatial prediction of weekly rainfall amounts. Cross-validation shows the predictive performance of the BTG model compares favorably with several kriging variants.

February 17

Speaker: Jacob Laading

Title: A medical imaging application of a hierarchical deformation model: the digital chest radiograph

Recently, we have been developing new ways to think about and understand medical images in the context of understanding their structure and deformation across patients. Here, we describe a method under development for defining a model structure, as well as its application to one class of images: the digital chest radiograph. The approach is hierarchical in nature, and the model structure generation is based on the use of scale-space extrema. The talk will also cover some other aspects of the radiograph, in particular how statistical and other computer methods have been applied to try to achieve some automation of diagnostic tasks in the field.

February 17

Speaker: Giovanni Parmigiani

Title: Latent variables 101

This class will be a lightweight introduction to latent variables ---a sort of appetizer to the more serious applications that will come later in the course. All I plan to do is to give you a series of elementary examples in which latent variables are or should be used. I'll try to make the best of the fact that I am not an expert on latent variables, and give you a statistician-in-the-street perspective on the topic.

February 17

Speaker: Ed Iversen

Title: Latent Variables as Proxies for Genotype: a Breast Cancer Survival Analysis

Inherited mutations of the BRCA1 and BRCA2 genes are known to confer an elevated risk of both breast and ovarian cancers. The effect of carrying such a mutation on survival after developing breast or ovarian cancer is less well understood. We investigate the relationship between BRCA1 and BRCA2 carrier status and survival after breast cancer. In the absence of genetic testing, presence or absence of mutation at a breast cancer susceptibility gene is captured by a pair of latent categorical variables whose probability depends on the patient's family history of breast and ovarian cancer. We estimate the effect of genotype on survival using a Cox proportional hazards model, treating genetic status as a latent variable and controlling for a set of standard prognostic variables. Inference is accomplished using a Markov chain Monte Carlo algorithm to draw a sample from the posterior distribution of model parameters accounting for sampling error, uncertainty in the genotype of study participants and uncertainty in estimates of genetic parameters. A one-latent-variable "combined-gene" analysis and a two-latent-variable "separate-gene" analysis are compared.

March 3

Speaker: Alex Reutter

Title: The Use of Latent Variables in the Analysis of Ranking Data

A model is presented that is appropriate for the analysis of ordinal data in the absence of observable regressors. It will be motivated by the example of undergraduate grades, and we'll discuss some of the difficulties in implementation of the model and interpretation of results.

March 5

Speaker: David Higdon

Title: Latent variable methods for lattice models

Suppose one wishes to sample from the density $\pi(x)$ using Markov chain Monte Carlo (MCMC). An auxiliary variable $u$ and its conditional distribution $\pi(u|x)$ can be defined, giving the joint distribution $\pi(x,u) = \pi(x) \pi(u|x)$. An MCMC scheme which samples over this joint distribution can lead to substantial gains in efficiency compared to standard approaches. The revolutionary algorithm of Swendsen and Wang (1987) is one such example. In addition to reviewing the Swendsen-Wang algorithm and its generalizations, this paper introduces a new auxiliary variable method called partial decoupling. Two applications in Bayesian image analysis are considered. The first is a binary classification problem in which partial decoupling outperforms SW and single site Metropolis. The second is a PET reconstruction which uses the gray level prior of Geman and McClure (1987). A generalized Swendsen-Wang algorithm is developed for this problem, which reduces the computing time to the point that MCMC is a viable method of posterior exploration.
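
As an illustration of the auxiliary variable idea, here is a minimal sketch of one Swendsen-Wang update. It assumes a plain Ising-type prior on a binary image, $\pi(x) \propto \exp(\beta \cdot \#\{\mbox{equal neighbour pairs}\})$, with no likelihood term; the abstract does not specify the model, and the function name `swendsen_wang_sweep` is hypothetical. The bond variables play the role of $u$: the sweep draws $u|x$ and then $x|u$.

```python
import math
import random


def swendsen_wang_sweep(x, beta, rng):
    """One Swendsen-Wang update of a binary image x (list of lists of 0/1).

    Assumed target: pi(x) proportional to
    exp(beta * number of 4-neighbour pairs with equal labels).
    """
    n_rows, n_cols = len(x), len(x[0])
    parent = list(range(n_rows * n_cols))  # union-find forest over pixels

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    p_bond = 1.0 - math.exp(-beta)

    # Step 1, u | x: bond equal-label neighbours with probability p_bond.
    for r in range(n_rows):
        for c in range(n_cols):
            for dr, dc in ((0, 1), (1, 0)):
                rr, cc = r + dr, c + dc
                if rr < n_rows and cc < n_cols and x[r][c] == x[rr][cc]:
                    if rng.random() < p_bond:
                        parent[find(r * n_cols + c)] = find(rr * n_cols + cc)

    # Step 2, x | u: each bonded cluster gets a fresh label, uniformly on {0, 1}.
    new_label = {}
    out = [[0] * n_cols for _ in range(n_rows)]
    for r in range(n_rows):
        for c in range(n_cols):
            root = find(r * n_cols + c)
            if root not in new_label:
                new_label[root] = rng.randrange(2)
            out[r][c] = new_label[root]
    return out
```

Whole clusters change label in a single step, which is the source of the efficiency gain over single site updates when neighbouring pixels are strongly coupled. With data attached to each pixel, the cluster labels in step 2 would no longer be drawn uniformly.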

March 10

Speaker: Bruno Sansó

Title: A Multisite Dynamic Model for Rainfall

A model for rainfall based on the truncated multivariate normal distribution is considered, that is, it is assumed that the observations, which are indexed on both time and space, correspond to a normal variate that has been truncated and (possibly) transformed. According to this model, the dry periods correspond to the (unobserved) negative values of a latent variable and the wet periods correspond to some power of the positive ones. This model has been proposed in Bardossy and Plate (1992). The serial structure that is present in rainfall can be modelled by imposing a serial structure to the latent variables. To do so a dynamic linear model is used, with a Fourier representation of the seasonality of the data, which is assumed to be the same for all sites, plus a linear combination of functions of the location of each site, obtaining a multivariate generalisation of the model proposed in Sansó and Guenni (1997). This approach captures the sometimes remarkable year to year variability and provides a tool for interpolation, estimation of the probability of a dry period, estimation of the areal rainfall and short term forecasting. The method is illustrated with rainfall accumulated over periods of ten days collected in the Venezuelan state of Guárico.
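
The link between the latent variable and observed rainfall can be sketched for a single site and a single period. This is only an illustration under simplifying assumptions: a univariate normal latent variable with given mean and standard deviation, and none of the dynamic or spatial structure described above. The names `rainfall_from_latent`, `prob_dry` and `simulate_rainfall` are hypothetical.

```python
import math
import random


def rainfall_from_latent(w, power):
    """Map a latent normal value to rainfall: dry if w <= 0, else w ** power."""
    return w ** power if w > 0 else 0.0


def prob_dry(mean, sd):
    """P(w <= 0) for w ~ N(mean, sd^2), i.e. the probability of a dry period."""
    return 0.5 * math.erfc(mean / (sd * math.sqrt(2.0)))


def simulate_rainfall(mean, sd, power, n, seed=0):
    """Draw n rainfall amounts from the truncated, power-transformed normal."""
    rng = random.Random(seed)
    return [rainfall_from_latent(rng.gauss(mean, sd), power) for _ in range(n)]
```

For example, `prob_dry(0.0, 1.0)` is 0.5, and raising the latent mean lowers the dry probability while shifting the wet amounts upward.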

March 12

Speaker: Peter Muller

Title: Pseudo-Priors and Reversible Jump -- MCMC in Varying Dimension Parameter Spaces

Many interesting statistical models involve a parameter vector of variable dimension. Examples are regression models involving a choice of covariates, mixtures with an unknown number of terms, harmonic models with a random number of frequencies, factor models, neural network models, etc. Implementing MCMC simulation in such models requires proposals which include an augmented/reduced parameter vector. The pseudo-prior approach by Carlin and Chib, and Green's reversible jump have been proposed as such varying dimension MCMC methods. Using harmonic models and neural network models as examples we will discuss both and make an argument that both approaches really are different formalizations of the same underlying idea.

March 24

Speaker: Gabriel Huerta

Title: Priors on Latent Structure and Reversible Jump for Autoregressive Models

The analysis and decomposition of time series under autoregressive models is considered, with a new class of prior distributions for parameters defining latent components. The approach induces a new class of smoothness priors on autoregressive coefficients, provides for formal inference on model order, including very high order models, and for the incorporation of uncertainty about model order into summary posterior inferences. The class of prior models also allows for unitary roots, and hence leads to inference on sustained though stochastically time-varying periodicities in a series. Model implementation and analysis are described in this presentation. As the prior modeling induces complicated forms for prior distributions on the usual ``linear'' autoregressive parameters, exploration of posterior distributions involves a Markov chain Monte Carlo method that uses several Metropolis-Hastings steps. In particular, conditionals related to the quasi-periodic components of the process are updated via a Reversible Jump step following Green (1995). Implementation of the method will be discussed in terms of simulated and real data. An extension to unequally-spaced time series in terms of Dynamic Linear Models will be introduced.

March 26

Speaker: Hedibert Lopes

Title: Model Uncertainty in Factor Analysis

The number of common factors, $k$, in a factor analytic model will be considered as part of the model itself. We will use Peter Green's Reversible Jump Markov chain Monte Carlo methodology (Green (1995)) to sample from $p(k|\mbox{Data})$ directly. Finding efficient proposals is a very hard task due to the high dimensionality involved. In a particular application we try to solve this problem by proposing from multivariate distributions centered around the posterior means, obtained by Gibbs sampling conditional on the proposed $k$. We have applied this new technique to a six-variate problem involving monthly observed international exchange rates in a period of 144 months. We compare our methodology with others currently used in the factor analysis literature, such as the log likelihood ratio test statistic, the Akaike and the Schwarz criteria. We also compute posterior odds indirectly through the calculation of the predictive densities by using the candidate's formula (Besag (1989), Chib (1995) and Polasek (1997)).

March 31

Speaker: Merlise Clyde

Title: Bayesian Model Averaging and Reversible Jump MCMC -- Should We Take the Leap?

In this talk I will review Bayesian Model Averaging (BMA). In regression models with even a moderate number of covariates, it is usually infeasible to enumerate all models, and deterministic or stochastic search strategies are used to identify a subset of models for BMA. In linear models, the stochastic search variable selection (SSVS) algorithm of George and McCulloch and the MC^3 (Markov Chain Monte Carlo Model Composition) algorithm of Madigan and York can be viewed as special cases of reversible jump MCMC algorithms. I will discuss other proposal distributions for jumping through model space using RJMCMC and other sampling approaches for implementing BMA in generalized linear models.

April 2 and 7

Speaker: Don Berry

Title: Designs for Clinical Trials I and II

The typical approach in designing phase III clinical trials is to pick two very specific treatment regimens and assign hundreds of patients to each. This is slow, ponderous, and inefficient, both for developing current and future therapies and for effectively treating patients. I will describe alternative approaches, including factorial designs and adaptive designs. As regards the latter, I will consider assigning different therapies adaptively (bandit problems) and assigning patients to different doses within the same therapy. Other topics will include interim analyses, choosing sample sizes, using predictive probabilities in designs and analyses, and using historical information in the design and eventual analysis of clinical trials. One overriding set of issues addressed is the ethics of clinical trials. A consideration relevant for many of the topics is the differing attitudes of Bayesians and frequentists in clinical trial designs.

April 9 and 14

Speaker: Dalene Stangl

Title: Bayesian Methods in Survival Analysis

The first lecture will provide an introduction to time-to-event models. Topics to be covered include distinguishing characteristics, commonly used distributions, incorporation of censoring, product-limit estimates, partial likelihood, and the Cox proportional hazards model. The second lecture will review Bayesian approaches to modeling time-to-event data. Frailty models (Clayton, 1991, Gray, 1993, Gustafsen, 1995, 1997), mixture and changepoint models (Stangl, 1995, 1998), and dynamic models (Gammerman, 1991, West, 1991) will be discussed.

April 16

Speaker: Francesca Dominici

Title: STATISTICAL ISSUES IN THE NATIONAL MORTALITY, MORBIDITY AND AIR POLLUTION STUDY (NMMAPS)

Time-series studies have shown associations between air pollution concentrations and morbidity and mortality. These studies have largely been conducted within single cities with varying methods. Key questions remain unaddressed concerning the findings, including 1) the extent of the heterogeneity of air pollution effects across locations and its sources; 2) the effect of measurement error on the estimated effect of air pollution; and 3) the public health significance of the short-term associations. NMMAPS comprises the development of methods to address these questions and their application to national data sets of mortality and hospitalization among persons 65 years of age and older, as an index of morbidity. In this talk I will review statistical issues arising in the NMMAPS study. For analyzing data from multiple locations and summarizing the findings, a Bayesian hierarchical regression method has been applied to data from 8 eastern U.S. cities and from the 20 largest U.S. cities. Frequency domain regression methods have been developed to assess mortality displacement, with the initial finding that air pollution effects can be assessed in time-response bands that are relatively resistant to short-term mortality displacement. The consequences of measurement error are being comprehensively explored using available data sets. Databases have been assembled on mortality in the 100 largest U.S. cities. The methods of NMMAPS should prove useful for future surveillance for the health effects of air pollution.

April 21

Speaker: Brani Vidakovic

Title: Some Non-Standard Wavelet Tasks

The lecture discusses 5 nonstandard statistical applications of wavelets. The first one concerns dimension reduction in the case when regression predictors are curves. We exemplify the method on Brown-Fearn's NIR protein data set. In the second we discuss the benefits of waveletization of some standard statistical procedures that utilize classical orthogonal systems. The applications fall in the field of order statistics. The third application concerns wavelet-based random variables and densities. A possibility of wavelet mixture modeling will be discussed. An unsolved problem will be presented. In the fourth application, wavelet shrinkage is proven to possess a desirable property: the associated shrinkage estimator is strongly consistent when the threshold goes to 0. This intuitive result fails in the case of thresholding the Fourier expansions. Finally, the fifth application will discuss some applications in turbulence.

April 23

Speaker: Alex Stark

Title: Old and new methods for applying hidden Markov models to signals, with application to ion channel records

In this talk I will outline some of the older methods of analysing signals with hidden Markov models, concentrating on the EM method which employs the Baum-Welch forward-backward procedure, and looking at the Viterbi algorithm. The application of these methods to ion channel (patch clamp) records will be discussed. Such methods have fundamental limitations which present problems in analysing real data. In contrast, computational statistical modelling not only overcomes such limitations, but is also more flexible and so should be able to account for the features of data recorded in practical experiments.

B. S.
Fri Apr 17