Abstracts

Jan 23

Speaker: Jim Berger

Title: Musings on a Simple Statistical Problem from Physics - I.

Physicists have a problem - finding confidence limits in today's large-scale physics experiments. The problem is almost trivial from a subjective Bayesian perspective, but the community of physicists does not currently want to operate in that domain on these problems. If one excludes subjective Bayesian analysis, the problem still looks trivial, but it is not. In particular, it is difficult to obtain agreement between objective Bayesian and frequentist analyses.

We begin with a presentation of the physics problem, its subjective Bayes solution, and discussion of the resistance to that solution. At that point, the discussion will wander away, into topics such as objective Bayesian analysis and the need for conditioning in frequentist statistics, with a variety of examples - such as one in medical diagnosis. Eventually we end up back at the physics problem, with the background to understand the dilemma. This is a serious problem awaiting a convincing (to physicists) solution.


Jan 25

Speaker: Jim Berger

Title: Musings on a Simple Statistical Problem from Physics - II.

Physicists have a problem - finding confidence limits in today's large-scale physics experiments. The problem is almost trivial from a subjective Bayesian perspective, but the community of physicists does not currently want to operate in that domain on these problems. If one excludes subjective Bayesian analysis, the problem still looks trivial, but it is not. In particular, it is difficult to obtain agreement between objective Bayesian and frequentist analyses.

We begin with a presentation of the physics problem, its subjective Bayes solution, and discussion of the resistance to that solution. At that point, the discussion will wander away, into topics such as objective Bayesian analysis and the need for conditioning in frequentist statistics, with a variety of examples - such as one in medical diagnosis. Eventually we end up back at the physics problem, with the background to understand the dilemma. This is a serious problem awaiting a convincing (to physicists) solution.


Jan 30

Speakers: Peter Müller and Maria De Iorio

Title: Anova on Random Functions - I.

We consider inference for related random functions indexed by categorical covariates X=(X1, ..., Xp). The functions could, for example, be mean functions in a regression, or random effects distributions for patients administered treatment combination X. An appropriate probability model for the random distributions should allow dependent, but not identical, probability models. We focus on the case of random measures, i.e., the random functions are probability densities, and discuss two alternative probability models for such related random measures.

One approach uses a decomposition of the random measures into a part which is common across all levels of X, and offsets which are specific to the respective treatments. The emerging structure is akin to ANOVA, where a mean effect is decomposed into an overall mean, main effects for different levels of the categorical covariate, etc. We consider computational issues in the special case of DP mixture models. Implementation is greatly simplified by the fact that posterior simulation in the described model is almost identical to posterior simulation in a traditional DP mixture model, with the only modification being a constraint when resampling the configuration indicators commonly used in DP mixture posterior simulation. Inference for the entire set of random measures proceeds simultaneously, and requires no more computational effort than the estimation of one DP mixture model.

We compare this model with an alternative approach based on modeling dependence at the level of the point masses defining the random measures. We use the dependent Dirichlet process framework of MacEachern (2000). Dependence across different levels of the categorical covariate is introduced by defining an ANOVA-like dependence of the base measures which generate these point masses. As in the general DDP setup, implementation is no more difficult than for a traditional DP mixture model, with the only additional complication being dependence when resampling multivariate point masses.
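
As a rough schematic of the second construction (our notation, written for a single categorical factor; the speakers' specification may differ in its details), each random measure shares stick-breaking weights across levels of the covariate, while its point masses carry an ANOVA-like decomposition:

$$ G_x = \sum_{h=1}^{\infty} w_h \, \delta_{\theta_{x,h}}, \qquad \theta_{x,h} = m_h + \alpha_{x,h}, $$

with the $m_h$ drawn from a common base measure and the offsets $\alpha_{x,h}$ from level-specific base measures, so that the $G_x$ are dependent (through the shared $w_h$ and $m_h$) but not identical.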

We discuss differences and relative merits of the two approaches and illustrate both with examples.


Feb 1

Speakers: Peter Müller and Maria De Iorio

Title: Anova on Random Functions - II.

We consider inference for related random functions indexed by categorical covariates X=(X1, ..., Xp). The functions could, for example, be mean functions in a regression, or random effects distributions for patients administered treatment combination X. An appropriate probability model for the random distributions should allow dependent, but not identical, probability models. We focus on the case of random measures, i.e., the random functions are probability densities, and discuss two alternative probability models for such related random measures.

One approach uses a decomposition of the random measures into a part which is common across all levels of X, and offsets which are specific to the respective treatments. The emerging structure is akin to ANOVA, where a mean effect is decomposed into an overall mean, main effects for different levels of the categorical covariate, etc. We consider computational issues in the special case of DP mixture models. Implementation is greatly simplified by the fact that posterior simulation in the described model is almost identical to posterior simulation in a traditional DP mixture model, with the only modification being a constraint when resampling the configuration indicators commonly used in DP mixture posterior simulation. Inference for the entire set of random measures proceeds simultaneously, and requires no more computational effort than the estimation of one DP mixture model.

We compare this model with an alternative approach based on modeling dependence at the level of the point masses defining the random measures. We use the dependent Dirichlet process framework of MacEachern (2000). Dependence across different levels of the categorical covariate is introduced by defining an ANOVA-like dependence of the base measures which generate these point masses. As in the general DDP setup, implementation is no more difficult than for a traditional DP mixture model, with the only additional complication being dependence when resampling multivariate point masses.

We discuss differences and relative merits of the two approaches and illustrate both with examples.


Feb 6

Speaker: Dongchu Sun, University of Missouri

Title: Two--fold Conditional Autoregressive Model and Joint Disease Mapping

A bivariate Bayes method is proposed for estimating the mortality rates of a single disease for a given population, using additional information from a second disease. The information on the two diseases is assumed to be from the same population groups or areas.

The joint frequencies of deaths for the two diseases for a given population are assumed to have a bivariate Poisson distribution with joint means proportional to the population sizes. The relationship between the mortality rates of the two different diseases is formulated through the 2--fold conditional autoregressive (CAR) model, where spatial effects as well as spatial correlations are introduced to capture the structured clustering among regions. This procedure is compared with a univariate hierarchical Bayes procedure proposed by Sun {\it et al.} (2000), which uses information from one disease only. Comparisons of the two procedures are made through theoretical study, Monte Carlo simulations, real data, and Bayes factors. All the methods introduced demonstrate a substantial improvement of the bivariate over the univariate procedure. For analyzing male and female lung cancer data from the state of Missouri, Markov chain Monte Carlo methods are used to estimate mortality rates.
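
For orientation, a generic univariate CAR disease-mapping model (a simplification, not the two-fold specification of the talk) takes the form

$$ y_i \mid \lambda_i \sim \mbox{Poisson}(n_i \lambda_i), \qquad \log \lambda_i = \mu + \phi_i, \qquad \phi_i \mid \phi_{-i} \sim \mbox{N}\Big(\rho \sum_{j \sim i} \phi_j / m_i, \; \sigma^2 / m_i \Big), $$

where $n_i$ is the population at risk in region $i$, $j \sim i$ denotes neighbouring regions and $m_i$ is the number of neighbours; the two-fold CAR model extends this type of spatial prior to a pair of diseases jointly, with cross-disease as well as within-disease spatial correlation.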


Feb 8

Speaker: Jose Bernardo, University of Valencia, Spain

Title: Intrinsic Estimation

From a Bayesian viewpoint, estimation problems are typically posed as decision problems where the action space is the parameter space. For instance, it is well known that the use of conventional loss functions, such as quadratic or zero-one, respectively leads to the posterior mean or mode. However, conventional loss functions are typically not invariant, and focus on the "distance" between parameter values rather than on the "distance" between the models they label. Information-theoretical ideas may be used to propose an alternative "intrinsic" loss function $\delta(\theta_i,\theta_j)$ which measures the minimum amount of information required to discriminate between the models labeled by $\theta_i$ and $\theta_j$. Minimization of the corresponding reference posterior expectation leads to interesting, invariant Bayes estimators which may be argued to enjoy better properties than their conventional counterparts. The concept of admissibility will be reviewed from this standpoint.
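
One concrete loss of this kind (stated here as a sketch; the talk may use a variant) is the intrinsic discrepancy, the smaller of the two directed Kullback-Leibler divergences between the models labeled by $\theta_i$ and $\theta_j$:

$$ \delta(\theta_i, \theta_j) = \min\left\{ \int p(x \mid \theta_i) \log \frac{p(x \mid \theta_i)}{p(x \mid \theta_j)} \, dx, \;\; \int p(x \mid \theta_j) \log \frac{p(x \mid \theta_j)}{p(x \mid \theta_i)} \, dx \right\}, $$

which is invariant under one-to-one reparametrization; the intrinsic estimator is the value of $\theta$ minimizing the reference posterior expectation of this loss.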


Feb 13

Speaker: Thanasis Kottas

Title: Median Regression Using Dirichlet Process Mixture Models

Dirichlet process mixture models form a very rich class of nonparametric mixtures which provides modeling for the unknown population distribution by employing a mixture of parametric distributions with a random mixing distribution assumed to be a realization from a Dirichlet process. Simulation-based model fitting of Dirichlet process mixture models is by now well established in the literature, the common characteristic of the Markov chain Monte Carlo methods devised being marginalization over the mixing distribution. However, this feature results in rather limited inference regarding functionals associated with the random mixture distribution. In particular, only posterior moments of linear functionals can be handled.

We first describe a computational approach to obtain the entire posterior distribution for more general functionals. The approach uses the Sethuraman representation of the Dirichlet process, after fitting the model, to obtain posterior samples of the random mixing distribution. Then, a Monte Carlo integration is used to convert each such sample to a random draw from the posterior distribution of the functional of interest. Hence, arbitrarily accurate inference is available for the functional and for comparing it across populations.
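
As a rough illustration of the Sethuraman stick-breaking construction that underlies this step (a prior draw with a fixed truncation level and an illustrative normal kernel and base measure, not the full posterior algorithm of the talk), the following Python sketch draws a truncated realization of G from a Dirichlet process and uses Monte Carlo to evaluate a nonlinear functional, here the median of the induced mixture:

import numpy as np

rng = np.random.default_rng(1)

def stick_breaking_draw(alpha, base_sampler, n_atoms=500):
    """Truncated draw G = sum_h w_h * delta_{theta_h} from DP(alpha, G0)."""
    v = rng.beta(1.0, alpha, size=n_atoms)
    w = v * np.concatenate(([1.0], np.cumprod(1.0 - v)[:-1]))
    w /= w.sum()                      # renormalize to absorb the truncation error
    theta = base_sampler(n_atoms)     # atom locations drawn from the base measure G0
    return w, theta

def mixture_median(w, theta, sd=1.0, n_mc=20000):
    """Monte Carlo estimate of the median of f(y) = sum_h w_h N(y; theta_h, sd^2)."""
    h = rng.choice(len(w), size=n_mc, p=w)
    y = rng.normal(theta[h], sd)
    return np.median(y)

w, theta = stick_breaking_draw(alpha=2.0, base_sampler=lambda n: rng.normal(0.0, 3.0, n))
print(mixture_median(w, theta))

In the approach described above, the same two steps are applied to each posterior sample of the mixing distribution, yielding draws from the posterior distribution of the functional itself.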

We illustrate by considering two Dirichlet process mixtures for the errors in median regression models. The associated families of error distributions allow for increased variability, skewness and flexible tail behavior. The first family is semiparametric, with extra variability captured nonparametrically through mixing and skewness handled parametrically. The second family, a fully nonparametric one, includes all unimodal densities on the real line with median equal to zero. In conjunction with a parametric regression specification, two semiparametric median regression models arise. After fitting such models through the use of Gibbs sampling, full posterior inference for general population functionals is possible. The approach can also be applied when censored observations are present, leading to semiparametric censored median regression modeling.


Feb 15

Speaker: Luis Pericchi

Title: Minimal Training Samples and Instrinsic Bayes Factors for Censored and Discrete Observations

The concept of Training Samples is important for Cross Validation methods and for Bayesian Statistics. Regarding the latter, two recent developments in Bayesian model selection are the intrinsic Bayes factor of Berger and Pericchi (1996) and the expected posterior prior of Perez (1998). Central to both is consideration of training samples that allow utilization of improper objective priors for model selection. The most common prescription for choosing training samples is to choose them to be as small as possible, subject to yielding proper posteriors; these are called minimal training samples.

When data can vary widely in terms of either information content or impact on the noninformative priors, use of minimal training samples can be inadequate. Important examples include certain cases of discrete data and situations with censored observations. Such situations require a modification of the choice of training samples. In this article we propose use of a random minimal training sample, which is a training sample of smallest size such that the posterior is proper, but is drawn randomly without replacement from the set of data. This new definition reduces to the old definition in standard cases, but successfully overcomes the problems alluded to in non-standard situations.
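
A toy illustration of the definition (our own example, not from the talk): for censored exponential data with the improper prior $p(\lambda) \propto 1/\lambda$, the posterior is proper exactly when the sample contains at least one uncensored observation, so a random minimal training sample can be drawn by adding randomly chosen observations, without replacement, until propriety holds.

import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: (observed value, censored?). Under p(lambda) ~ 1/lambda the
# posterior is proper iff at least one observation is uncensored.
data = [(2.1, False), (5.0, True), (5.0, True), (0.7, False), (5.0, True)]

def posterior_is_proper(sample):
    return any(not censored for _, censored in sample)

def random_minimal_training_sample(data):
    """Draw observations at random, without replacement, until the posterior is proper."""
    sample = []
    for i in rng.permutation(len(data)):
        sample.append(data[i])
        if posterior_is_proper(sample):
            return sample
    raise ValueError("no training sample yields a proper posterior")

print(random_minimal_training_sample(data))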


Feb 20

Speaker: Herbie Lee

Title: Did Lennox Lewis Beat Evander Holyfield? An Application of Small-sample Inter-rater Agreement Methods

On March 13, 1999, a highly-anticipated prizefight between heavyweight champions Evander Holyfield and Lennox Lewis was ruled a draw by the three official judges. Many observers of the fight felt that Lewis had clearly outperformed Holyfield; dissatisfaction with the result---particularly the pro-Holyfield scorecard of judge Eugenia Williams---fueled speculation that the fight had been fixed and prompted official investigations. Here we examine whether the official judges scored the fight significantly differently from other professional observers of the fight. We do so by analyzing the round-by-round scoring within the context of inter-rater agreement. The literature on inter-rater agreement typically considers a large number of samples rated by a small number of judges, and relies on asymptotic results for tests. In our case, the sample size is too small to rely on asymptotics. Instead, we investigate a number of techniques that can be applied to small-sample inter-rater agreement problems, including logistic regression, an exact test, and some Bayesian approaches. We demonstrate these methods on both the March 1999 Holyfield-Lewis fight and the September 1999 bout between welterweights Oscar de la Hoya and Felix Trinidad.


Feb 22

Speaker: Sandra McBride

Title: Modeling Indoor Air Pollutant Concentration Time Series

One cause of discrepancies between personal and stationary indoor air quality monitors is the proximity effect, in which pollutant sources near the respondent cause elevated and highly variable exposures. In a set of experiments in a home, a CO tracer gas was used as a continuously emitting point source. CO concentrations were simultaneously monitored at up to 38 locations at different angles, heights and distances from the source in each experiment.

The CO concentration time series at the monitoring sites reflect the sum of a slowly varying baseline time series and the superposition of transient, elevated concentrations, or ``microplumes.'' Microplume arrivals to a monitoring site appear as pulses in CO concentration time series, with pulse shapes varying by location relative to the CO source. At each monitoring site, a nonparametric method is used to estimate time-varying parameters of the baseline time series. The time series of superposed microplumes is then modeled using a point process approach. Estimates of parameters describing the arrival rates, durations and peak concentrations of microplumes are found using the method of moments. A simulation study is used to assess the bias and sampling error of parameter estimates. This modeling approach provides a parametric description of the source proximity effect in a home.
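
A minimal simulation of the kind of signal described (with illustrative parameter values, not those estimated from the experiments): a slowly varying baseline plus microplumes arriving as a Poisson process, each with a random peak concentration and an exponentially decaying pulse shape.

import numpy as np

rng = np.random.default_rng(42)

t = np.arange(0.0, 3600.0, 1.0)                            # one hour of 1-second readings
baseline = 2.0 + 0.5 * np.sin(2 * np.pi * t / 3600.0)      # slowly varying baseline (ppm)

rate = 1.0 / 120.0                                         # one microplume per 2 minutes on average
n_plumes = rng.poisson(rate * t[-1])
arrivals = rng.uniform(0.0, t[-1], size=n_plumes)          # Poisson-process arrival times
peaks = rng.lognormal(mean=1.0, sigma=0.5, size=n_plumes)  # peak concentrations
taus = rng.exponential(30.0, size=n_plumes)                # decay time constants (seconds)

conc = baseline.copy()
for a, p, tau in zip(arrivals, peaks, taus):
    after = t >= a
    conc[after] += p * np.exp(-(t[after] - a) / tau)       # superpose a decaying pulse

print(round(conc.mean(), 2), round(conc.max(), 2))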


Feb 27

Speaker: Yuguo Chen, Stanford University

Title: Conditional Inference on Zero-One Tables: A Sequential Importance Sampling Approach

The Monte Carlo method of sequential importance sampling (SIS) has been shown to be a versatile and powerful tool for solving complex problems in dynamic systems. We describe a sequential importance sampling approach to making conditional inferences about zero-one tables, a problem which is not inherently dynamic. We apply our method to examples from ecology and education. Our approach to this problem provides insights for developing an efficient SIS methodology. We briefly describe other general principles behind efficient SIS algorithms we have developed for inference on genealogical trees, permutation tests on truncated data and filtering and smoothing in hidden Markov models.
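
A bare-bones version of the idea (a simple proposal with forced cells and uniform choices, not the optimized proposal of the talk): fill a zero-one table column by column given its row and column sums, keep track of the proposal probability, and average the importance weights; with a uniform target, the average weight is an unbiased estimate of the number of tables with those margins, the basic ingredient of the conditional inferences.

import numpy as np
from math import comb

rng = np.random.default_rng(3)

def sis_table(row_sums, col_sums):
    """One SIS draw of a 0-1 table with the given margins; returns (table, importance weight)."""
    r = np.array(row_sums, dtype=int)                  # remaining row sums
    m, n = len(r), len(col_sums)
    table = np.zeros((m, n), dtype=int)
    log_weight = 0.0
    for j in range(n):
        cols_left = n - j
        forced = np.where(r == cols_left)[0]           # rows that must take a 1 in this column
        optional = np.where((r > 0) & (r < cols_left))[0]
        k = col_sums[j] - len(forced)
        if k < 0 or k > len(optional):
            return None, 0.0                           # dead end: no completion exists
        chosen = rng.choice(optional, size=k, replace=False) if k > 0 else np.empty(0, dtype=int)
        ones = np.concatenate((forced, chosen)).astype(int)
        table[ones, j] = 1
        r[ones] -= 1
        log_weight += np.log(comb(len(optional), k))   # weight = 1 / proposal probability
    if r.any():
        return None, 0.0
    return table, np.exp(log_weight)

# Mean weight estimates the number of 0-1 tables with these margins (here the true count is 5).
weights = [sis_table([1, 2, 1], [1, 1, 2])[1] for _ in range(5000)]
print(np.mean(weights))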


Mar 1

Speaker: Jun Xie, University of California at Los Angeles

Title: A Bayesian Multiple Protein Alignment Algorithm for Motif Searching

Bayesian models have been developed for finding ungapped motifs in multiple protein sequences (Liu, Neuwald and Lawrence 1995). In this article, we extend the model to allow for deletions and insertions in the motifs. Direct generalization of the ungapped algorithm, based on Gibbs sampling, proves unsuccessful because the configuration space becomes much larger. To alleviate this difficulty, a two-stage procedure is introduced. At the first stage, we use a method called entropy filtering, which gives a crude estimate of amino acid frequencies in the motif without concern for deletion/insertion patterns. At the second stage, we switch to a Metropolis-Hastings algorithm for optimizing the alignment scores. This time the deletion/insertion pattern is incorporated in the algorithm. When applied to a data set consisting of 19 protein sequences from the globin-like superfamily taken from SCOP (http://scop.mrc-lmb.cam.ac.uk/scop/), our procedure identifies the motif regions of the B and C helices.


Mar 8

Speaker: Hongquan Xu

Title: Optimal Factor Assignment for Asymmetrical Fractional Factorial Designs: Theory and Applications

Fractional factorial designs have been successfully used in various scientific investigations for many decades. Their practical success is due to their efficient use of experimental runs to study many factors simultaneously. A fundamental and practically important question for factorial designs is the issue of optimal factor assignment to columns of the design matrix. Aimed at solving this problem, this thesis introduces two new criteria: the generalized minimum aberration and the minimum moment aberration, which are extensions of the minimum aberration and minimum $G_2$-aberration. These new criteria work for symmetrical and asymmetrical designs, regular and nonregular designs, orthogonal and nonorthogonal designs, nonsaturated and supersaturated designs. They are equivalent for symmetrical designs and equivalent in a weak sense for asymmetrical designs. The theory developed for these new criteria covers many existing theoretical results as special cases. In particular, a general complementary design theory is developed for asymmetrical designs, and some general optimality results are obtained for mixed-level supersaturated designs. As an application, a two-step approach is proposed for finding optimal designs, and some 16-, 27- and 36-run optimal designs are tabulated. As another application, an algorithm is developed for constructing mixed-level orthogonal and nearly orthogonal arrays, which can efficiently construct a variety of small-run arrays with good statistical properties.
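
For intuition, the quantities behind the minimum moment aberration criterion can be computed directly from the design matrix: the power moments of the number of coincidences (agreeing columns) between pairs of runs, which the criterion minimizes sequentially for t = 1, 2, .... The sketch below (our reading of the criterion, applied to a toy 4-run, 3-factor design) computes these moments.

import numpy as np
from itertools import combinations

def power_moments(design, t_max=4):
    """Average t-th powers of the number of coincidences between pairs of runs."""
    design = np.asarray(design)
    pairs = list(combinations(range(design.shape[0]), 2))
    coincidences = [int(np.sum(design[i] == design[j])) for i, j in pairs]
    return [sum(c ** t for c in coincidences) / len(pairs) for t in range(1, t_max + 1)]

# Toy two-level design: the regular 2^{3-1} fraction; every pair of runs coincides in one column.
design = [[0, 0, 0],
          [0, 1, 1],
          [1, 0, 1],
          [1, 1, 0]]
print(power_moments(design))   # smaller values, compared sequentially, are better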


Mar 22

Speaker: Jim Berger

Title: On propriety of posterior distributions

The talk gives a (somewhat random) overview of issues involving the propriety of posterior distributions. Certain common situations of impropriety are discussed, and the various methods of avoiding impropriety are reviewed and compared.


Mar 27

Speaker: Thanasis Kottas

Title: Bayesian Survival Analysis Using Dirichlet Process Mixture Models

Dirichlet process mixture models form a very rich class of nonparametric mixtures which provides modeling for the unknown population distribution by employing a mixture of parametric distributions with a random mixing distribution assumed to be a realization from a Dirichlet process. Simulation-based model fitting of Dirichlet process mixture models is by now well established in the literature, the common characteristic of the Markov chain Monte Carlo methods devised being marginalization over the mixing distribution. However, this feature results in rather limited inference regarding functionals associated with the random mixture distribution. In particular, only posterior moments of linear functionals can be handled.

We first describe a computational approach to obtain the entire posterior distribution for more general functionals. The approach uses the Sethuraman representation of the Dirichlet process, after fitting the model, to obtain posterior samples of the random mixing distribution. Then, a Monte Carlo integration is used to convert each such sample to a random draw from the posterior distribution of the functional of interest. Hence, arbitrarily accurate inference is available for the functional and for comparing it across populations.

The range of inferences the approach covers is illustrated by considering Dirichlet process mixture models for distributions on the positive real line. Full inference is obtained for various functionals of interest in this setting, including the median survival time and the population density, survival, cumulative hazard and hazard functions. In the process a new method for fully nonparametric Bayesian survival analysis emerges.


Mar 29

Speaker: Rasmus Waagepetersen, Aalborg University

Title: Parametric and Non-parametric Inference for Inhomogeneous Spatial Point Processes

Much research in statistics for spatial point processes has concentrated on the development of non-parametric summary statistics for stationary point processes, like the well-known $K$-function. Focus is, however, now more directed toward parametric inference - in particular for inhomogeneous point processes.

In this talk we shall begin by considering an analogue, $K_{inhom}$ (Baddeley, M{\o}ller, and Waagepetersen, 2000), of the $K$-function which can be applied for study of interactions between points in inhomogeneous point patterns. A prerequisite for the application of $K_{inhom}$ is that the correlation structure of the point process is translation invariant, i.e., that the point process is `second-order intensity reweighted stationary'. Examples of point processes which possess this property are inhomogeneous Poisson point processes, (inhomogeneous) thinnings of stationary point processes, and log Gaussian Cox processes (M{\o}ller, Syversveen, and Waagepetersen, 1998).
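
For reference, following (as we understand it) Baddeley, M{\o}ller, and Waagepetersen (2000), for a second-order intensity reweighted stationary process $X$ with intensity function $\lambda(\cdot)$,

$$ K_{inhom}(r) = \frac{1}{|B|} \, E \sum_{x \in X \cap B} \; \sum_{y \in X \setminus \{x\}} \frac{1\{\|x - y\| \le r\}}{\lambda(x)\,\lambda(y)}, $$

for any region $B$ with $|B| > 0$; for an inhomogeneous Poisson process in the plane $K_{inhom}(r) = \pi r^2$, so departures from this benchmark indicate aggregation or repulsion after the inhomogeneity has been accounted for.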

If an analysis using $K_{inhom}$ suggests that an inhomogeneous point pattern cannot be modelled by an inhomogeneous Poisson process, due to (e.g.) repulsion or aggregation in the point pattern, possible alternative models include thinned Markov point processes and log Gaussian Cox processes. We conclude by discussing semi-parametric inference for thinned Markov point processes and space-time modelling using log Gaussian Cox processes.

The methods and models presented will be illustrated with examples from forestry and weed science.


Apr 3

Speaker: David Denison, Imperial College, UK

Title: Bayesian Partition Models

In this talk we propose a new Bayesian approach to data modelling motivated by the difficulties encountered with some tree-based methods. The Bayesian partition model constructs arbitrarily complex response surfaces over the design space by splitting it into an unknown number of disjoint regions. Within each region the data is assumed to be exchangeable and to come from some simple distribution. Using conjugate priors the marginal likelihoods of the models can be obtained analytically for any proposed partitioning of the space, hugely simplifying the sampling algorithm required to simulate from the posterior of interest. By example we shall show how the partition model can be used for a wide variety of problems including regression, classification and disease mapping.
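
A small illustration of why conjugacy matters here (a one-dimensional toy with a normal/inverse-gamma model and hypothetical split points, not the speakers' implementation): the log marginal likelihood of any proposed partition is the sum of closed-form per-region marginals, so candidate partitions can be scored directly inside a sampler.

import numpy as np
from math import lgamma, log, pi

def log_marginal_normal(y, m0=0.0, k0=1.0, a0=1.0, b0=1.0):
    """Closed-form log marginal likelihood of y under a conjugate normal/inverse-gamma model."""
    y = np.asarray(y, dtype=float)
    n = len(y)
    if n == 0:
        return 0.0
    ybar = y.mean()
    ss = np.sum((y - ybar) ** 2)
    kn, an = k0 + n, a0 + n / 2.0
    bn = b0 + 0.5 * ss + 0.5 * k0 * n * (ybar - m0) ** 2 / kn
    return (-0.5 * n * log(2 * pi) + 0.5 * log(k0 / kn)
            + lgamma(an) - lgamma(a0) + a0 * log(b0) - an * log(bn))

def partition_log_evidence(x, y, splits):
    """Score a partition of a 1-D design space defined by split points."""
    edges = [-np.inf] + sorted(splits) + [np.inf]
    return sum(log_marginal_normal(y[(x > lo) & (x <= hi)])
               for lo, hi in zip(edges[:-1], edges[1:]))

rng = np.random.default_rng(7)
x = rng.uniform(0.0, 1.0, 200)
y = np.where(x < 0.5, 0.0, 3.0) + rng.normal(0.0, 1.0, 200)   # a step function plus noise
print(partition_log_evidence(x, y, [0.5]), partition_log_evidence(x, y, [0.2]))

On such data the partition that splits near 0.5 should receive markedly higher log evidence than the misplaced alternative.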


Apr 5

Speaker: Mark Borsuk, School of Environment

Title: A Bayesian hierarchical model to predict benthic oxygen demand from organic matter loading in estuaries and coastal zones

Ecological models that have a theoretical basis and yet are mathematically simple enough to be parameterized using available data are likely to be the most useful for environmental management and decision-making. However, even models that are mechanistically simple can be overparameterized when system-specific data are limited. To overcome this problem, models are often fit to data sets composed of observations from multiple systems. The resulting parameter estimates are then used to predict changes within a single system, given changes in management variables. However, the assumption of common parameter values across all systems may not always be valid. This assumption can be relaxed by adopting a hierarchical approach. Under the hierarchical structure, each system has its own set of parameter values, but some commonality in values is assumed across systems. An underlying population distribution is employed to structure this commonality among parameters, thereby avoiding the problems of overfitting. The hierarchical approach is, therefore, a practical compromise between entirely site-specific and globally common parameter estimates. We applied the hierarchical method to annual data on organic matter loading and benthic oxygen demand from 34 estuarine and coastal systems. Both global and system-specific parameters were estimated using Bayes' Theorem. The generality of the hierarchical approach makes it suitable for a number of ecological modeling applications in which cross-system data are required for empirical parameter estimation, yet only partial commonality can be assumed across sampling units.
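
Schematically (our generic notation, not the authors' exact oxygen-demand model), the hierarchical structure replaces a single common parameter vector with system-specific parameters tied together by a population distribution:

$$ y_{ij} \sim f(y \mid x_{ij}, \theta_i), \qquad \theta_i \sim \mbox{N}(\mu, \Sigma), \qquad (\mu, \Sigma) \sim \pi(\mu, \Sigma), $$

where $i$ indexes the 34 systems and $j$ the observations within system $i$. Shrinking $\Sigma$ toward zero recovers globally common parameters, while letting it grow gives essentially site-specific fits, which is the compromise described above.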


Apr 10

Speaker: Marco Ferreira

Title: A Class of Multi-Scale Time Series Models

We introduce a class of multi-scale models for time series. The novel framework couples 'simple' standard Markov models for the time series stochastic process at different levels of aggregation, and links them via 'error' models to induce a new and rich class of structured linear models reconciling modelling and information at different levels of resolution. Jeffrey's rule of conditioning is used to revise the implied distributions and ensure that the probability distributions at different levels are strictly compatible. Our construction has several interesting characteristics: a variety of autocorrelation functions resulting from just a few parameters; the ability to combine information from different scales; and the capacity to emulate long memory processes. There are at least three uses for our multi-scale framework: to integrate the information from data observed at different scales; to induce a particular process when the data is observed only at the finest scale; as a prior for an underlying multi-scale process. Bayesian estimation based on MCMC analysis is developed, and issues of forecasting are discussed. Two interesting applications are presented: in the first application, we illustrate some basic concepts of our multiscale class of models through the analysis of the flow of a river; in the second application, we use our multiscale framework to model daily and monthly log-volatilities of exchange rates.
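
For readers unfamiliar with it, Jeffrey's rule of conditioning in its simplest form revises a distribution when the probabilities of the events $\{B_k\}$ in a partition are changed from $p(B_k)$ to $q(B_k)$:

$$ q(A) = \sum_k p(A \mid B_k) \, q(B_k); $$

in the multi-scale construction it is, roughly speaking, this kind of revision that keeps the distributions implied at the different levels of aggregation mutually compatible.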


Apr 17

Speaker: Luis Pericchi

Title: Bayesian Versus Non-Bayesian Estimation of Tail Probabilities with Examples from Ecology and Hydrology


Jim Berger
January, 2001