Speaker: Brani Vidakovic
Title: Introduction to Multiscale Statistical Methods
In this non-technical tutorial lecture, students will be introduced to the basics of wavelets and wavelet-like function families. A variety of applications in Geophysics, Signal and Image Processing, Mathematics and Statistics will be discussed.
Speaker: Brunero Liseo, Universita' di Roma - La Sapienza
Title: The Skew-Normal class densities
In this talk I will introduce the class and stress its properties. I will also discuss the pros and cons of its use both as a sampling density to check for skewness in the data and as a prior distribution in Bayesian analysis.
Speaker: Hedibert Lopes, PhD Candidate at Duke Statistics
Title: Predictive Computation in Factor Models
Bayesian inference in factor analytic models has received renewed attention in recent years due partly to computational advances, but also partly to applied focuses generating factor structures, as exemplified by recent work in financial time series modeling. The focus of our current work is on exploring questions of uncertainty about the number of latent factors in a multivariate factor model, combined with methodological and computational issues of model specification and model fitting. We explore reversible jump MCMC methods that build on sets of parallel Gibbs sampling-based analyses to generate suitable "empirical" proposal distributions and that address the challenging problem of finding efficient proposals in high-dimensional models. Various additional computational issues are discussed, and we explore applications in an econometric context.
Speaker: Jacob Laading, Duke Statistics
Title: Hierarchical deformation modeling for medical images
We present a flexible model for the deformation of images based on a hierarchically defined probability density. The model is based on the idea of "facets": landmarks which are not tied to specific image phenomena. A large number of these facets are used in a hierarchy to capture the deformation on several scales. An atlas structure is deformed onto a new observed realization from an image class via a conditionally defined hierarchical normal distribution intended to capture shape. In addition we define a distribution on deviations in local intensity profile or other image-derived quantities for a given facet. Several alternative methods for defining the model structure will be presented, as well as two example applications. This work was carried out with Drs. Colin McCulloch and Valen Johnson.
Speaker: Jaeyong Lee: Purdue University and Duke Statistics
Title: Acceleration of Metropolis-Hastings Algorithms
It is common wisdom among those who use Markov chain Monte Carlo that large rejection rates result in slow mixing of the chain. Indeed, two theoretical results (Peskun, 1973; Tierney, 1997) confirm this. We will take another look at the Metropolis-Hastings algorithm and suggest possible improvements, including the "splitting rejection" algorithm, as termed by Mira and Tierney.
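The link between rejection rate and mixing can be explored with a minimal sketch. The code below is a plain random-walk Metropolis sampler, not the splitting rejection algorithm mentioned in the abstract; the function name, the standard normal target, and the step sizes are illustrative assumptions.

```python
import math
import random


def random_walk_metropolis(log_target, x0, step, n_iter, seed=0):
    """Random-walk Metropolis sampler (illustrative sketch).

    Returns the list of sampled states and the observed acceptance rate.
    """
    rng = random.Random(seed)
    x = x0
    log_p = log_target(x)
    samples = []
    accepted = 0
    for _ in range(n_iter):
        proposal = x + rng.gauss(0.0, step)
        log_p_new = log_target(proposal)
        # The proposal is symmetric, so the Hastings ratio reduces to the
        # ratio of target densities. 1 - random() lies in (0, 1], so the
        # logarithm is always defined.
        if math.log(1.0 - rng.random()) < log_p_new - log_p:
            x, log_p = proposal, log_p_new
            accepted += 1
        samples.append(x)
    return samples, accepted / n_iter


def log_standard_normal(x):
    """Log density of N(0, 1) up to an additive constant (assumed target)."""
    return -0.5 * x * x
```

Running the sampler with a very large `step` produces a high rejection rate, and the chain then repeats the same state for long stretches, which is the slow mixing the abstract refers to.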
Speaker: Merlise Clyde, Duke Statistics
Title: Model Uncertainty and Bayesian Model Averaging
In this talk I will provide an overview of Bayesian model averaging and model selection. For example, in regression models there is often substantial prior uncertainty about which covariates one should include. Variable selection typically results in the selection of a single model for estimating quantities of interest, and as a result, ignores model uncertainty in statistical inferences. In Bayesian model averaging one estimates quantities of interest by a weighted average of model-specific quantities, where weights are determined by how much support each model receives from the data. I will discuss the use of improper prior distributions and connections to model selection criteria, such as AIC, BIC, and RIC. In high-dimensional problems, one must approximate Bayesian model averaging based on a sample of models. I will present relationships among some common algorithms for sampling models in linear regression, such as reversible jump Markov chain Monte Carlo, Stochastic Search Variable Selection, Markov chain Monte Carlo Model Composition, and random sampling, and discuss various methods for estimation based on the sampled output.
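The weighted average described above can be sketched in a few lines. This example assumes equal prior model probabilities and uses the common BIC approximation to posterior model probabilities (weight proportional to exp(-BIC/2)); the abstract does not say which weights the talk uses, and the function names are hypothetical.

```python
import math


def bma_weights(bics):
    """Approximate posterior model probabilities from BIC values.

    Assumes equal prior probabilities for all models. Subtracting the
    smallest BIC before exponentiating avoids numerical underflow.
    """
    best = min(bics)
    raw = [math.exp(-0.5 * (b - best)) for b in bics]
    total = sum(raw)
    return [r / total for r in raw]


def bma_estimate(estimates, weights):
    """Model-averaged estimate: weighted average of per-model estimates."""
    return sum(w * e for w, e in zip(weights, estimates))
```

For instance, `bma_estimate([1.0, 2.0, 3.0], bma_weights([100.0, 102.0, 110.0]))` averages three model-specific estimates, giving most weight to the model with the smallest BIC.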
Speaker: Dongchu Sun, University of Missouri-Columbia, NISS and Duke University
Title: Random Effects in Generalized Linear Mixed Models
We examine the use of special forms of correlated random effects in the generalized linear mixed model (GLMM) setting. A special feature of our GLMM is the inclusion of random residual effects to account for lack of fit due to extra variation, outliers, and other unexplained sources of variation. For random effects, we consider, in particular, the correlation structure and improper priors associated with the autoregressive (AR) model of Ord (1975) and the conditional auto-regressive (CAR) model of Besag (1974). We give conditions for the propriety of the posterior distribution of the GLMM when the fixed effects have a constant improper prior and the random effects have a possibly improper conditional autoregressive prior. Several examples of exponential families as well as computational details for Markov chain Monte Carlo simulation are also presented.
Speaker: Susie Bayarri, University of Valencia and Duke Statistics
Title: Conditional Measures of Surprise
Measures of surprise refer to studying the compatibility of data with an assumed hypothesis without a careful formulation of alternative hypotheses. This seems to contradict Bayesian reasoning, but we argue that these measures do have a useful role to play even in the Bayesian world. As a matter of fact, many (Bayesian) authors have tried to develop such measures. We first give a brief summary of these measures. Then we elaborate on the appropriate distribution in which 'surprise' should be measured. We show that we have to narrow down the prior predictive distribution by appropriately conditioning. Several possibilities are discussed and an optimal conditioning is proposed.
Speaker: Mike West, Duke Statistics
Title: Multivariate Non-Gaussian Time Series: Bayesian Analysis of Longitudinal Data in a Case Study in the VA Hospital System, I
Discussions of the developments on the VA project as reported in Duke Statistics Discussion Papers 97-22a,b. A slow and easy discussion of the basics: VA interests, policy questions, background. Data structure, exploration, characteristics. Modelling ideas and first models. Model fitting and model assessment. Some summary findings.
Speaker: Mike West, Duke Statistics
Title: Multivariate Non-Gaussian Time Series: Bayesian Analysis of Longitudinal Data in a Case Study in the VA Hospital System, II
Discussions of the developments on the VA project as reported in Duke Statistics Discussion Papers 97-22a,b. Continuation: Don't miss Talk 1 if you want to understand Talk 2. More advanced modelling and inferential questions in the VA study: multivariate/longitudinal/time series/hierarchical random effects models. Issues of institutional comparisons, and other matters as time permits.
Speaker: Jim Berger
Title: Default Bayesian Hypothesis Testing and Model Selection
This will be a review of the motivation for, and difficulties with, default Bayesian hypothesis testing and model selection. Included will be a discussion of some of the general automatic model selection and testing procedures, including BIC, the "intrinsic Bayes factor" and the "fractional Bayes factor".
Speaker: Jim Berger
Title: Bayesian Model Selection via the Expected Posterior Prior
Recently developed automatic Bayesian methods of model selection, such as the "intrinsic Bayes factor" and the "fractional Bayes factor," have proven to be highly effective but are often difficult to work with. In particular, intrinsic Bayes factors can be challenging to compute, while fractional Bayes factors require considerable care in definition and use. A highly promising new approach to the problem is based on developing explicit default priors for the models under consideration, called "expected posterior priors." These are strongly related to "intrinsic priors" arising from the intrinsic Bayes factor approach, but have the advantages of being explicitly given and being relatively easy to use in MCMC computational schemes. A variety of examples of use of expected posterior priors will be given, including an application to analysis of a mixture model arising in an astrophysical problem.
Speaker: Gabriel Katul
Title: Modeling Turbulent Transport within Forested Canopies: Why statistics?
In this lecture, the equations of motion that describe air flow inside vegetation and the "closure" problem in turbulence are briefly introduced. Motivation for using a statistical description of turbulence is then presented. Preliminary "closure" model results and comparisons with field measurements performed at Duke Forest are also shown. Model limitations are then discussed and are used to motivate detailed analysis of specific types of eddy motion commonly observed in detailed high-frequency turbulence measurements. The lecture concludes with the identification of such eddy motion using newly developed wavelet thresholding methods, with the hope of refining closure models and better understanding turbulent transport.
Speaker: Giovanni Parmigiani
Title: Statistical issues in understanding disease genes, I
The investigation of how we inherit susceptibility to a disease from our parents is one of the current frontiers of medicine. Earlier work focussed on inheritance of features (phenotypes) that are very closely determined by a single gene. Our recently increased ability to "measure" the human genome is giving us the option to investigate more complex and also more common situations, involving many genes at once, and concerning genetic effects that are weaker and subtler. In these situations, linkage analysis (the search for the gene(s) that are responsible for a disease) needs to be complemented by subsequent analyses that investigate the nature and magnitude of the genetic effect. These are the analyses that carry most of the clinical and public health implications and that can lead the way, many years down the line, to preventive treatments. A key issue is penetrance, which, in the simplest case, is the probability of developing disease when one carries a "defective" gene. On Tuesday, I will review the fundamentals of disease inheritance and describe some of the standard study designs. On Thursday, I will discuss statistical methodologies, presenting examples from problems I have worked on and highlighting promising research and modeling approaches.
Speaker: Giovanni Parmigiani
Title: Statistical issues in understanding disease genes, II
See the abstract for: Statistical issues in understanding disease genes, I.
Speaker: Francesca Dominici
Title: National Mortality, Morbidity and Air Pollution Study: Statistical Challenges
Time series studies have shown associations between air pollution concentrations and morbidity and mortality. These studies have largely been conducted within single cities, and with varying methods. Key questions remain unaddressed concerning the findings, including 1) the extent and sources of heterogeneity of air pollution effects across locations; 2) the public health significance of the short-term associations ("harvesting"); and 3) the effect of error in the measurement of the exposure variable on the estimated effect of air pollution. The NMMAPS study comprises the development of statistical methods to address these questions and the application of these methods to national data sets on mortality and hospitalization among persons 65 years of age and older. The latter serves as an index of morbidity. In this talk I will review some of the statistical challenges that arise in addressing the questions of the NMMAPS study. For analyzing data from multiple locations, we develop semiparametric Poisson regression analyses of daily time-series data from the 20 largest U.S. cities, and we introduce hierarchical models for combining estimates of the pollution-mortality relationship. For addressing "harvesting" in air pollution studies, we propose a novel statistical strategy based on frequency domain log-linear regression methods. Finally, for evaluating the effects of measurement error, we introduce a semiparametric Poisson-normal model to estimate the bias in the relative rate of mortality due to using ambient concentrations instead of personal exposures to PM10. The model is applied to the combined analysis of five studies with personal and outdoor sampling of particulate matter. Databases have been assembled on mortality in the 100 largest U.S. cities. The next phase of the NMMAPS study will complete the morbidity and mortality analyses and carry out a combined analysis of both morbidity and mortality.
The methods of NMMAPS should prove useful for future surveillance of the health effects of air pollution. Joint work with Jonathan Samet and Scott L. Zeger.
Speaker: Peter Mueller and Don Berry, Duke University
Title: Simulation Based Sequential Design: Optimal Stopping in a Clinical Trial
We discuss simulation based methods for exploration and maximization of expected utility in sequential decision problems. We consider problems which require backward induction with analytically intractable expected utility integrals at each stage. We propose to use forward simulation to approximate the integral expressions, and a reduction of the allowable action space to avoid problems related to an exponentially exploding number of possible trajectories in the backward induction. The artificially reduced action space allows strategies to depend on the full history of earlier observations and decisions only indirectly through a low dimensional summary statistic. We illustrate the proposed approach with an application to an optimal stopping problem in a clinical trial.
Key words: Backward induction, Forward simulation, Monte Carlo simulation, Optimal design, Sequential decision.
Speaker: Lurdes Inoue, Peter Mueller, Gary Rosner, and Mark Dewhirst, Duke University and Duke University Medical Center
Title: A Bayesian Model for Detecting Changes in Nonlinear Profiles
We propose a model for longitudinal data with random effects which includes a flexible nonparametric regression for the profile of responses over time for individual subjects. This research is motivated by experiments evaluating the hemodynamic effects of various agents in tumor-bearing rats. In one set of experiments, the mice breathed room air, followed by carbogen (a mixture of pure oxygen and carbon dioxide), with different groups of animals receiving different concentrations of the two gases. Interest focuses on changes in hemodynamic profiles, e.g., longitudinal measurements of oxygen pressure, heart rate, tumor blood flow, tumor arteriolar diameter, etc. For example: Do individual profiles change once the breathing mixture changes? How does changing the concentration of carbon dioxide alter the effect of carbogen on hemodynamics? The nature of the recorded responses does not allow any meaningful parametric form for a regression of these profiles on time. Additionally, response patterns differ widely across individuals. Therefore, we propose a non-parametric regression model of the profile data on time, with a hierarchical structure to account for subject-to-subject variability. We explore several alternative implementations of the non-parametric regression, including a dynamic state space model.
Speaker: Richard De Veaux, Department of Mathematics, Williams College
Title: Hybrid Neural Networks for Environmental Process Control
In many environmental processes there is a great deal of scientific and
engineering knowledge about the system. This domain knowledge may range
from a simple energy balance taking the form of a constraint to complex
first-principles models containing unmeasurable reaction rates. The
challenge is to incorporate this knowledge into the data analysis and
eventual control of the process. A feed-forward neural network provides a
particularly convenient form for folding such prior knowledge into the
estimation, prediction and control of the system. The resulting model is
known as a hybrid neural network model. We will show how the neural network
is trained and compare its performance to more traditional techniques.
November 10
Speaker: David Higdon
Title: Markov Random Fields and Applications in Image Analysis
We apply Bayesian image analysis techniques to a problem
in a newly developed scanned probe technology which uses
commercial magnetoresistive (MR) record/playback heads as
probes to sense magnetic fields. This technology can be
used both for magnetic imaging, and for evaluating playback
and record processes in magnetic recording.
In MR microscopy, an MR head is raster-scanned while in
physical contact with a magnetic sample (e.g., hard disk
media, tape, or fine magnetic particles). By plotting the
MR resistance as a function of position, a very high resolution
(on the order of $0.1 \times 1.0$ $\mu$m) magnetic image of
the sample is constructed. This case study focuses on
characterizing the head sensitivity function (HSF), which
depends on the physical dimensions and the magnetic properties
of the MR head. These sensitivity functions are of great
practical interest since they ultimately relate to the head's
performance in a high density data storage environment. We use a
Bayesian approach to model and estimate the HSF, while
accounting for noise and other nuisance effects such as
thermal drift. Besides yielding a point estimate, which
is a fairly difficult task here, this approach also quantifies
uncertainty so we can assess whether certain features of
the estimated head sensitivity function appear to be genuine.
Speaker: David Conesa
Title: Bayesian Analysis of Bulk Arrival Queues
Statistical analysis of bulk arrival queues from a Bayesian
point of view is presented. We review briefly some basics of Queueing
Theory, and bulk arrival models. We deal with the most basic type of bulk
arrival queue, namely $M^{X}/M/1$. The focus is on prediction of the usual
measures of performance of the system in equilibrium. We present a way to
compute the posterior predictive distribution of the number of customers
in the system, through the inversion of its probability generating
function. The posterior distribution of the waiting time, in the queue and in
the system, of the first customer of an arriving group is also computed,
but now in terms of their Laplace and Laplace-Stieltjes transforms.
Finally, a numerical example is presented.
Speaker: Jamie Robins, Harvard School of Public Health
Title: Marginal Structural Models and Causal Inference in Epidemiology
Standard approaches for adjustment of confounding are biased when there exist
time-dependent confounders that are also intermediate variables. This paper introduces
marginal structural models (MSMs), a new class of causal models that allow for appropriate
adjustment of confounding in those situations. The parameters of an MSM can be consistently
estimated using a new class of estimators: the inverse-probability-of-treatment weighted
estimators.
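The weighting idea can be illustrated in the simplest setting. The sketch below treats a single time point with one discrete confounder, not the time-dependent case the abstract addresses. It estimates the treatment probability by stratum frequencies and uses unstabilized, normalized weights. The function name and data layout are assumptions made for illustration.

```python
from collections import defaultdict


def iptw_effect(records):
    """Inverse-probability-of-treatment weighted mean difference.

    records: iterable of (l, a, y) tuples with a discrete confounder l,
    a binary treatment a (0 or 1), and a numeric outcome y. Both
    treatment arms are assumed to be non-empty.
    """
    records = list(records)
    # Per-stratum counts: l -> [number of units, number treated].
    counts = defaultdict(lambda: [0, 0])
    for l, a, _ in records:
        counts[l][0] += 1
        counts[l][1] += a
    weighted_sum = {0: 0.0, 1: 0.0}
    weight_total = {0: 0.0, 1: 0.0}
    for l, a, y in records:
        n, n_treated = counts[l]
        p_treated = n_treated / n
        # Probability of the treatment actually received, given l.
        p_received = p_treated if a == 1 else 1.0 - p_treated
        w = 1.0 / p_received
        weighted_sum[a] += w * y
        weight_total[a] += w
    mean_treated = weighted_sum[1] / weight_total[1]
    mean_untreated = weighted_sum[0] / weight_total[0]
    return mean_treated - mean_untreated
```

Each unit is weighted by the inverse of the probability of the treatment it actually received given its confounder value, so the weighted sample behaves as if treatment had been assigned independently of the confounder.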
Speaker: Peter Müller and Brani Vidakovic
Title: Bayesian Inference with Wavelets: Density Estimation
We propose a prior probability model in the wavelet coefficient
space. The proposed model implements wavelet coefficient thresholding
by full posterior inference in a coherent probability model. We
introduce a prior probability model with mixture priors for the
wavelet coefficients. The prior includes a positive prior probability
mass at zero which leads to a posteriori thresholding and
generally to a posteriori shrinkage on the coefficients. We
discuss an efficient posterior simulation scheme to implement
inference in the proposed model. The discussion is focused on the
density estimation problem. However, the introduced prior probability
model on the wavelet coefficient space and the Markov chain Monte
Carlo scheme are general.
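The effect of a prior point mass at zero can be seen in a closed-form special case. The sketch below is not the authors' model or their simulation scheme: it assumes a single observed coefficient d with known Gaussian noise variance, and a prior that puts mass 1 - pi at zero and a N(0, tau2) "slab" elsewhere. All names and parameter values are illustrative.

```python
import math


def posterior_nonzero_prob(d, pi, sigma2, tau2):
    """Posterior probability that the true coefficient is nonzero.

    Assumed model: d | theta ~ N(theta, sigma2); theta = 0 with prior
    probability 1 - pi, and theta ~ N(0, tau2) with prior probability pi.
    Computed via the posterior log odds for numerical stability.
    """
    total_var = sigma2 + tau2
    log_odds = (
        math.log(pi / (1.0 - pi))
        + 0.5 * math.log(sigma2 / total_var)
        + 0.5 * d * d * (1.0 / sigma2 - 1.0 / total_var)
    )
    return 1.0 / (1.0 + math.exp(-log_odds))


def posterior_mean(d, pi, sigma2, tau2):
    """Posterior mean of the coefficient: a shrunken version of d."""
    shrink = tau2 / (sigma2 + tau2)
    return posterior_nonzero_prob(d, pi, sigma2, tau2) * shrink * d
```

Small observed coefficients receive a low posterior probability of being nonzero and are pulled almost to zero, while large ones are nearly preserved, which mirrors the thresholding and shrinkage behaviour described in the abstract.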
Speaker: Jane Liu, Duke Statistics
Title: Particle Filtering in Dynamic Models
We discuss the issues of simulation-based sequential analysis -- or particle
filtering -- in dynamic models. Our focus is sequential Bayesian learning
about time-varying state vectors and fixed model parameters simultaneously.
We discuss a general approach that combines old ideas of smoothing using kernel
methods with newer ideas of auxiliary particle filtering of Shephard and Pitt (1997).
We show that specific smoothing approaches can interpret and suggest modifications
to techniques that add "artificial evolution noise" to fixed model parameters
at each time point, an idea introduced by Gordon, Salmond and Smith (1993) to address
the problems of sample attrition and prior:data conflict arising in simulation-based
sequential analysis using SIR and other standard methods for parameter learning.
Unlike the Gordon et al. method, our new approach permits smoothing and regeneration
of sample points of model parameters without the "loss of historical information" inherent
in the Gordon et al. approach. This is achieved using shrinkage modifications of kernel
smoothing, as introduced by West (1992). An example where analytical forms of posterior
distributions are available provides assessment of the method, and further illustration
and comparisons with auxiliary particle filtering are given in a stochastic volatility model.
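The shrinkage idea can be sketched for a scalar parameter. The code below assumes equally weighted particles and a Gaussian kernel with bandwidth factor h, and it is an illustration of shrinkage kernel smoothing rather than the full filter described in the abstract. Kernel locations are pulled toward the sample mean by a = sqrt(1 - h^2), so that the kernel mixture keeps the mean and variance of the current sample instead of inflating the variance. The function name is hypothetical.

```python
import math
import random


def shrinkage_kernel_resample(particles, h, seed=0):
    """Regenerate parameter particles with a shrinkage kernel (sketch).

    particles: equally weighted scalar parameter values (assumed).
    h: kernel bandwidth factor with 0 < h < 1.
    Returns (locations, draws): the shrunken kernel centres and one new
    draw from the Gaussian kernel at each centre.
    """
    rng = random.Random(seed)
    n = len(particles)
    mean = sum(particles) / n
    var = sum((p - mean) ** 2 for p in particles) / n
    a = math.sqrt(1.0 - h * h)
    # Shrinking the centres reduces their variance to a^2 * var; the
    # kernel adds h^2 * var, so the mixture variance stays equal to var.
    locations = [a * p + (1.0 - a) * mean for p in particles]
    kernel_sd = h * math.sqrt(var)
    draws = [rng.gauss(m, kernel_sd) for m in locations]
    return locations, draws
```

Without the shrinkage step (a = 1), adding kernel noise at every time point would increase the variance of the parameter sample each time, which is one way to read the "loss of historical information" mentioned above.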
Speaker: Ed Iversen, Duke Statistics, Duke University
Title: A Model for Risk of Breast Cancer
A statistical model for predicting risk of breast cancer conditional
on a fixed set of covariates is described. The model assumes known
the joint distribution of genotype, race, age, and age at diagnosis of
breast cancer in the general (U.S.) population; the conditional
distribution of a fixed set of covariates given these variables is,
however, unknown. Data are available from both retrospective
(covariates are collected for a sample of individuals conditional on
their disease status) and prospective (disease status is observed for
a sample of individuals conditional on their risk factors) studies,
and an approach is outlined for combining data from the two types of
studies (Mueller et al., 1996). The choice of retrospective versus
prospective modeling is discussed in the situation where one of the
margins (essentially the response here) is assumed known.
Speaker: James D. Lynch, University of South Carolina and NISS
Title: On the Ergodicity of General State Space Markov Chains
The variation norm ergodic theorem for general state space Markov chains is shown to be
equivalent to the a.s. convergence to one of the likelihood ratio of the transition density and the
equilibrium density for samples from the chain. The results are derived using martingale
arguments and are for completely general state spaces. For the ergodic case, Doob's inequality
can be used to show how the variation norm regulates how far, in some sense, a Markov
simulation is from the desired equilibrium distribution. This talk is based on work with J.
Sethuraman.
November 12
November 17
November 24
November 24
December 01
December 03