Abstracts

Jan 7

Speaker: James Berger

Title: Inference for the Bivariate Normal Distribution

Determining optimal objective inference for the parameters of the bivariate normal distribution is surprisingly difficult. From the objective Bayesian viewpoint, there are a host of competing objective priors. This problem also provided one of the most significant successes of Fisher’s fiducial approach, with each parameter of the bivariate normal distribution having a fiducial distribution with an exact frequentist interpretation. All this will be reviewed, and relationships between the various approaches discussed.


Jan 12

Speaker: Susie Bayarri

Title: Multiple Testing: the Problem and Some Solutions

In situations where p hypothesis tests are performed simultaneously, the usual frequentist procedure of controlling the type I error of each test at a fixed level alpha results in an expected number of rejections, when all the null hypotheses are true, of about alpha*p. For large or very large p (as in gene expression problems) this is perceived to be too many. Classical proposals 'adjust' each individual alpha to control an overall 'family-wise error rate', but this solution is not satisfactory either, because it usually results in overly conservative procedures. More recent proposals instead control the 'false discovery rate' or related quantities. In this talk we review these proposals and their role in the Bayesian approach. We also show how Bayesian procedures automatically 'adjust' for multiplicity (without the need for explicit adjustment); the answers, however, typically depend on the prior.
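
To make the arithmetic concrete (a standard illustration, not taken from the talk): if each of $p$ true null hypotheses is tested at level $\alpha$, the expected number of false rejections is $E[V] = \alpha p$; with $\alpha = 0.05$ and $p = 10{,}000$ genes this is about 500. The Bonferroni adjustment controls the family-wise error rate $P(V \ge 1)$ by testing each hypothesis at level $\alpha/p$, while the false discovery rate of Benjamini and Hochberg controls
$$\mathrm{FDR} = E\!\left[\frac{V}{\max(R,1)}\right],$$
where $V$ is the number of false rejections and $R$ is the total number of rejections.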


Jan 19

Speaker:

Title:


Jan 26

Speaker: David B. Dunson

Title: Bayesian Inferences on Multiple Correlated Hypotheses

In many applications, interest focuses on inferences for a potentially large number of closely related hypotheses. For example, hypotheses may relate to treatment group comparisons for related outcome variables (e.g., occurrence of tumors of different types, or expression levels of different genes) or to increasing trends at different dose levels. In such cases, multiple comparison adjustments are often recommended to limit inflation in the type I error rate. From a Bayesian perspective, we view the multiple comparison problem as an issue of choosing an appropriate prior for the different hypotheses. In particular, we may wish to assign particular prior probabilities to global and local null hypotheses, and to incorporate prior beliefs about the dependency structure. A class of hierarchical mixture priors is proposed that has a number of appealing conceptual and practical properties. Focusing on the problem of inferences on local and global dose response trends, a number of alternative priors are considered, stochastic search-type MCMC algorithms are developed for posterior computation, and the methods are illustrated using simulated and real data examples.
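
A minimal sketch of one such point-mass mixture prior (the hierarchical forms developed in the talk may differ): for hypothesis-specific parameters $\beta_1, \dots, \beta_p$,
$$\beta_j \mid \pi_0 \;\sim\; \pi_0\, \delta_0 + (1 - \pi_0)\, G,$$
where $\delta_0$ is a point mass at zero encoding the local null, $G$ is a continuous distribution for the alternative, and a hyperprior on $\pi_0$ allows the data to determine the degree of multiplicity adjustment; the global null corresponds to $\beta_1 = \cdots = \beta_p = 0$.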


Feb 2

Speaker: Jerry Reiter

Title: Simultaneous Use of Multiple Imputation for Missing Data and Disclosure Limitation

Several statistical agencies use, or are considering the use of, multiple imputation to limit the risk of disclosing respondents' identities or sensitive attributes in public use data files. For example, agencies can release partially synthetic datasets, comprising the units originally surveyed with some collected values, such as sensitive values at high risk of disclosure or values of key identifiers, replaced with multiple imputations. In this talk, I summarize my ongoing research on using multiple imputation for disclosure limitation, and I present a new approach for generating multiply-imputed, partially synthetic datasets that simultaneously handles disclosure limitation and missing data. The basic idea is to fill in the missing data first to generate $m$ completed datasets, then replace sensitive values in each completed dataset with $r$ imputed values. I also present methods that allow users to obtain valid inferences from such multiply-imputed datasets, based on new rules for combining the multiple point and variance estimates. New rules are needed because the double duty of multiple imputation introduces two sources of variability into point estimates, which existing methods for obtaining inferences from multiply-imputed datasets do not measure accurately.
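
For orientation, the standard combining rules of Rubin for $m$ multiply-imputed datasets in the missing-data setting (the talk's new rules modify these for the two-stage setting): given point estimates $q^{(1)}, \dots, q^{(m)}$ and variance estimates $u^{(1)}, \dots, u^{(m)}$, one reports
$$\bar{q} = \frac{1}{m}\sum_{i=1}^m q^{(i)}, \qquad T = \bar{u} + \left(1 + \frac{1}{m}\right) b,$$
where $\bar{u}$ is the average within-imputation variance and $b$ is the between-imputation variance of the $q^{(i)}$. With nested imputation ($m$ completed datasets, each spawning $r$ synthetic versions), between-nest and within-nest variability must be separated, which is why new combining rules are required.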


Feb 9

Speaker: Paul Marriott and Zhenglei Gao

Title: Analysing Neuron Firing Data

Recent developments in in vivo measurement have made available large amounts of high quality neuron firing data from live animals behaving in a relatively unconstrained way. In particular, parallel measurements of large numbers of neurons over long periods of time are now available. This talk describes some ongoing work which analyses such data in the context of an experiment investigating memory reinforcement in rats. We will describe the research questions and some aspects of the data. Comparison with current work in the literature shows that new methodology needs to be created, and two approaches are sketched out.


Feb 16

Speaker: Ian Dinwoodie

Title: The M/M/C Queue, Loss Networks, and Blocking Probabilities

Some early research on telephone traffic falls within the scope of the M/M/C queue. Erlang derived a formula for the probability that traffic exceeds the server capacity, so that an arriving call is blocked. We describe some high-dimensional versions of this problem, add dynamic routing, and discuss the situation where two sets of blocking probabilities solve a fixed-point system of polynomial equations.
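
For reference, Erlang's loss formula for the M/M/C queue gives the stationary probability that all $C$ servers are busy when the offered load is $a = \lambda/\mu$ erlangs: $B(C, a) = (a^C/C!) \big/ \sum_{k=0}^{C} a^k/k!$. Below is a minimal sketch of its computation via the standard stable recurrence (illustrative code, not from the talk; the function name erlang_b is ours):

    # Erlang B via the recurrence B(0, a) = 1,
    # B(k, a) = a*B(k-1, a) / (k + a*B(k-1, a)).
    def erlang_b(c: int, a: float) -> float:
        """Blocking probability for an M/M/C loss system with offered load a."""
        b = 1.0
        for k in range(1, c + 1):
            b = a * b / (k + a * b)
        return b

    # Example: 10 servers with an offered load of 7 erlangs blocks ~7.9% of calls.
    print(erlang_b(10, 7.0))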


Feb 23

Speaker: David Banks

Title: Things you should know

This talk reviews things that new researchers ought to know, but which are rarely mentioned. I plan to talk about how to referee papers, ethical issues in research and consulting, time management, professional activity, and related topics. Much of this is an updated (and blunter) version of material that I put into the IMS "New Researchers' Survival Guide."


Mar 1

Speaker: Jason Duan

Title: Expected Utility with Non-additive Probability

The expected utility model proposed by von Neumann and Morgenstern is commonly used in the decision-theoretic framework. Savage extended it by incorporating subjective probability. However, various paradoxes, such as Ellsberg's, Allais's, and Machina's, question its central independence assumption, which is restrictive and unrealistic in modeling human behavior. One of the more flexible approaches is to replace the additive subjective probabilities by non-additive probabilities, also called Choquet capacities. The axiomatization of expected utility with non-additive probabilities relaxes the independence assumption. In this presentation the aforementioned paradoxes will be elaborated, the Choquet integral that gives rise to expected utility with non-additive probabilities will be introduced, and the implications of this method will be discussed.
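
For reference, the Choquet integral mentioned above takes the following standard form: for a capacity $\nu$ (a monotone set function with $\nu(\emptyset) = 0$ and $\nu(S) = 1$) and outcomes ranked so that $u(s_1) \ge \cdots \ge u(s_n)$, the Choquet expected utility is
$$\int u \, d\nu = \sum_{i=1}^n u(s_i)\,\left[\nu(A_i) - \nu(A_{i-1})\right], \qquad A_i = \{s_1, \dots, s_i\}, \; A_0 = \emptyset,$$
which reduces to ordinary expected utility when $\nu$ is additive.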


Mar 15

Speaker: Leanna House

Title: Analyzing Proteomic Data Generated from MALDI-TOF MS

Proteomic profiles generated from MALDI-TOF mass spectrometers present three critical data issues: peak identification, dependence, and shifting. Several attempts have been made to address these issues in order to compare profiles from diseased and non-diseased subjects. Unlike past research, we develop a Bayesian nonparametric approach for assessing mass spectrometry data. We assert that an underlying stochastic process, dependent upon mass/charge (m/z), influences protein abundances in each spectrogram. Furthermore, we suggest that m/z is random, following a uniform distribution over a finite interval. Since implementing our proposal is still work in progress, this talk will contain limited results but simulations that motivate future developments. Ultimately, we plan to incorporate Bayesian model averaging and capitalize on the reversible jump algorithm to traverse the model space within MCMC.


Mar 22

Speaker: Jingqin (Rosy) Luo

Title: Space-time Point Process Analysis of Earthquake Sequences

Earthquake sequences have been modeled as marked spatio-temporal point processes, which helps to predict the expected occurrence rates of earthquakes in a normal earthquake sequence and to detect spatial/temporal abnormalities. Self-clustering and self-inhibitory models are the two main classes of point process models. Based on the self-clustering model, the temporal ETAS model was introduced by Ogata in 1988 and extended to multiple dimensions in 1998; ETAS has been widely used in seismology ever since.

The parametric conditional intensity function is generally represented in terms of well-known seismology laws. It is composed of a uniform background rate and an inhomogeneous isotropic clustering rate. The background rate can also be estimated nonparametrically as a nonhomogeneous process. The model is applied to simulated data first; then two actual earthquake catalogues, a Japan earthquake sequence and the Mammoth earthquake data, are analyzed. Parameter estimates are obtained by minimizing the negative log-likelihood function, using a hybrid of simulated annealing and the standard numerical DFP method. To assess the absolute goodness of fit of the model, the result from the Mammoth data is then examined by point process residual analysis.
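
For orientation, the temporal ETAS conditional intensity of Ogata (1988) has the standard form
$$\lambda(t \mid \mathcal{H}_t) = \mu + \sum_{i:\, t_i < t} \frac{K\, e^{\alpha (M_i - M_0)}}{(t - t_i + c)^p},$$
where $\mu$ is the background rate, the sum runs over past events with occurrence times $t_i$ and magnitudes $M_i$, $M_0$ is the magnitude threshold of the catalogue, and $(K, c, p, \alpha)$ are parameters; the power-law decay is the modified Omori law, and the spatio-temporal extension multiplies each term by a spatial kernel around the epicenter.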


Mar 29

Speaker: Gangqiang Xia

Title: Analysis for Large Spatial Data Set

In many spatial data analysis problems, we have very large data sets. In those situations, likelihood-based inference becomes unstable and, eventually, infeasible, since it involves computing various forms of a large covariance matrix. If we are to fit a Bayesian model and implement an MCMC algorithm, the large matrix makes the repeated calculations prohibitively expensive. In this talk we review a number of ways to deal with the large data set problem. We propose a discrete approximation model and compare it with a sub-sampling strategy. Examples will be given to illustrate the methods. The representation of random fields and the spatial design of sub-sampling locations will also be discussed.
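
The bottleneck can be made explicit (a standard observation, not specific to this talk): for a Gaussian process model with $n$ observations $y$, mean $\mu$, and covariance matrix $\Sigma(\theta)$, each likelihood evaluation
$$\ell(\theta) = -\frac{n}{2}\log 2\pi - \frac{1}{2}\log|\Sigma(\theta)| - \frac{1}{2}(y - \mu)^{\top} \Sigma(\theta)^{-1} (y - \mu)$$
requires a Cholesky factorization costing $O(n^3)$ operations and $O(n^2)$ storage, and this must be repeated at every MCMC iteration as $\theta$ changes.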


Apr 5

Speaker: Laura Gunn

Title: Bayesian Methods for Assessing Ordering in Hazard Functions

In toxicology studies that collect event time data, it is often appropriate to assume non-decreasing hazards across dose groups, though dose effects may vary with time. Motivated by this application, we propose a Bayesian approach for order restricted inference using an additive hazard model with time-varying coefficients. In order to make inferences on equalities versus increases in the hazard functions, a prior is chosen for the time-varying coefficients that assigns positive probability to no dose effect while restricting the coefficients to be non-negative. By using a high dimensional piecewise constant model and smoothing the functions via Markov gamma and beta processes, we obtain a flexible and computationally tractable approach for identifying sets of dose and age values at which hazards are increased. This approach can also be used to estimate dose response and survival functions. The methods are illustrated through application to data from a toxicology study of gallium arsenide.
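
A minimal sketch of the kind of model described above (the exact formulation in the talk may differ): writing the hazard for dose group $j$ as
$$h(t \mid \text{dose } j) = \lambda_0(t) + \sum_{k=1}^{j} \beta_k(t), \qquad \beta_k(t) \ge 0,$$
the non-negativity constraint automatically makes the hazards non-decreasing across dose groups at every age $t$, and a prior placing positive probability on $\beta_k \equiv 0$ permits inference on equality versus an increase between adjacent groups.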


Apr 12

Speaker: Chong Tu

Title: Some Applications of Levy Processes in Bayesian Nonparametric Modeling

We propose a new class of Bayesian nonparametric methods based on Levy process priors. Some theory and basic properties of Levy processes will be discussed briefly. Then we will construct a Bayesian nonparametric model based on kernel convolution of Levy processes. We will present an application involving modeling of air pollutants to illustrate how our approach can be used to effectively model spatio-temporal processes. Using marked Poisson processes, we extend the method to multivariate process modeling. The method allows us to avoid large matrix inversions and the assumptions of stationarity and normality. Other applications of the approach, such as nonparametric function estimation, will also be briefly discussed.
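
A minimal sketch of the kernel convolution construction, assuming the standard form (details in the talk may differ): the unknown process is modeled as
$$Z(s) = \int k(s, u)\, L(du),$$
where $k$ is a kernel and $L$ is a Levy random measure. Because $L$ can be represented through the points of an underlying Poisson process, $Z(s)$ becomes a sum over those points, so posterior computation does not require inverting an $n \times n$ covariance matrix, and letting $k(s, u)$ vary with location removes the stationarity assumption.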


Apr 19

Speaker: Scott Schmidler

Title: Bayesian Shape Analysis with Applications to Bioinformatics

Statistical shape theory concerns the analysis of geometric objects under natural invariances. Shape theory draws on elements of stochastic and differential geometry, algebraic topology, and multivariate statistics, and has a broad array of applications including image processing, archaeology, comparative anatomy, astronomy, epidemiology, and (now) bioinformatics.
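
As a concrete example of such an invariance (a standard construction, not necessarily the one used in the talk): after centering two landmark configurations $X, Y \in \mathbb{R}^{k \times m}$ and scaling them to unit size, a shape distance is obtained by minimizing over rotations,
$$d(X, Y) = \min_{\Gamma \in SO(m)} \| Y - X\Gamma \|,$$
so that the comparison is unaffected by the location, scale, and orientation of either object.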

After introducing basic concepts of shape theory, I describe novel methods we have developed for Bayesian shape analysis. Using these methods, we provide a statistical framework for analysis, prediction, and discovery in biomolecular structure data. This approach yields natural solutions to a variety of problems in the field, including the study of conservation and variability, examination of uncertainty in database searches, algorithms for flexible matching, detection of motions and disorder, function prediction and active site recognition, and clustering and classification techniques. Several of these will be discussed.


Jim Berger
January, 2004