Jan 13

Speaker: Michael Lavine

Title: Information, Experimental Design, Decisions and Seedlings

A botanist has two methods of gathering data. One is more informative but requires more labor. The other is easier but less informative. Which should he use? This is a question of Experimental Design and Decision Theory. (Which experiment should I perform?) The answer depends on how much information is gained by gathering the more labor-intensive data.

This talk will begin with a simplified version of the problem, in which we can discuss how information should be measured. We will draw a connection to the problem of estimating n in a binomial. Then, time permitting, we will introduce two complications. One is the fact that the data are not perfectly reliable; the second is that conditions vary from place to place and year to year.


Jan 20

Speaker: Jim Berger

Title: The Development of Objective Priors: the History of (Bayesian) Statistics

During its first 200 years, Bayesian statistics (and arguably much of classical statistics) was primarily based on the use of objective priors. We review this history, focusing on the role of objective priors in the developments.


Jan 25

Speaker: Jim Berger

Title: Modern Theories of Objective Priors

The modern objective prior theories that have received the most attention are the reference prior approach and the probability matching approach. These are discussed and illustrated in a number of situations.


Jan 27

Speaker: Ken Land (Sociology, Duke)

Title: Some Recent Developments In Finite-Mixture Poisson Regression Models, With Applications To The Study Of Delinquent/Criminal Careers

This talk will review the development of the semiparametric-mixed Poisson regression models in recent years and applications of these models to the study of delinquent/criminal careers. This modeling apparatus leads naturally to the identification of Poisson latent classes. The methodology for constructing such classes and statistical issues of identification will be reviewed.


Feb 1

Speaker: Brani Vidakovic

Title: Algorithmic (Kolmogorov) complexity: Is it relevant?

The theory of algorithmic complexity is almost 30 years old. It can be used to formally define the randomness of finite objects (binary sequences) by the measure of their algorithmic entropy. The theory, also known as Kolmogorov complexity, gave rise to several related theories and paradigms, such as the theory of Martin-Löf's tests, the Algorithmic Theory of Information, and the Minimum Description Length Principle (MDLP). In the talk we first give an overview of some basic notions, definitions, and properties of algorithmic complexity. The most interesting fact for a statistician is the existence of the so-called "universal prior", a universal measure on the space of all infinite binary words which is "least informative." In the rest of the exposition we draw some parallels between Rissanen's MDLP, which has a theoretical basis in algorithmic complexity theory, and the Bayesian formalism. Several examples in the area of model selection (Barron-Cover criteria, Wallace-Freeman criteria) are provided.
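
As a rough illustration of measuring the randomness of a finite sequence by its description length, the sketch below uses the length of a zlib-compressed string as a stand-in. This is an assumption made for illustration only: Kolmogorov complexity itself is not computable, and a general-purpose compressor gives at best a crude upper-bound proxy. The function name `compressed_bits` and the two test sequences are hypothetical and not taken from the talk.

```python
import random
import zlib


def compressed_bits(data: bytes) -> int:
    """Length in bits of the zlib-compressed form of `data`.

    This is only a computable proxy: true algorithmic complexity
    cannot be computed, and a compressor can only bound it from above
    (up to an additive constant).
    """
    return 8 * len(zlib.compress(data, 9))


# A highly regular sequence: the two bytes "01" repeated.
regular = b"01" * 2048

# A pseudo-random sequence of the same length (fixed seed for repeatability).
rng = random.Random(0)
irregular = bytes(rng.randrange(256) for _ in range(len(regular)))

print("regular  :", compressed_bits(regular), "bits")
print("irregular:", compressed_bits(irregular), "bits")
```

The pseudo-random sequence is in fact produced by a short program (a seeded generator), so its true algorithmic complexity is small even though the compressor cannot shorten it. This is one way to see why compression length is only an upper-bound proxy.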


Feb 3

Speaker: Tom DiPrete (Sociology, Duke)

Title: Family Change, Employment Transitions, and the Welfare State: A Comparison of Household Income Dynamics in the U.S. and Germany

see a copy of the paper

Household income is affected by changes in labor market activity and household composition, but a comparison of the U.S. and Germany demonstrates that these effects are conditioned by the institutional environment. Both the rates of partner loss and employment exit and the short-term financial impact of these events are lower in Germany than in the U.S. In both countries, the effects of income-enhancing events tend to persist over time, while the effects of negative events gradually decline. Women in both countries are more dependent upon partner income than are men; they gain more from union formation and lose more from union dissolution than men do. However, the loss from union dissolution for women is mitigated substantially by tax policies and by private and public transfers in both countries. Furthermore, the negative effects of union dissolution for women tend to diminish over time, especially in Germany where women respond by increasing their work activity as well as by repartnering. Our results suggest that rates of change in family structure and rates of change in labor market activity are interrelated, and depend in large part on the structure of incentives for match-forming and match-breaking that are generated by a country's institutional environment.


Feb 8

Speaker: Mark Peot

Title: Learning from What You Don't Observe

see a copy of the paper

I argue that the current practice for modeling missing observations in interactive Bayesian expert systems is incorrect. If there is no reported value for a chance node in a belief network, it is usually assumed that the chance node is unobserved and that this unobserved node contributes no likelihood information to the rest of the belief network. If this chance node has no graphical successors, then it is barren: the joint distribution of the other variables in the belief network is not a function of the distribution of the unobserved node. In interactive diagnostic expert systems, however, there is a systematic bias introduced by people's preferences for reporting that can lead to systematic errors in diagnosis. In medicine, common sources of reporting biases might include:

  • A bias to report symptoms that are present instead of symptoms that are absent, or
  • A bias to report symptoms that are more significant or urgent instead of symptoms that are less obviously urgent.

Failure to model these reporting biases can lead an expert system to produce erroneous recommendations early in the diagnostic process. In this talk, I will introduce the listener to Bayesian expert systems and show how reporting biases can arise through the use of implicit or explicit "open probe" questions. I also describe a family of techniques for capturing some of these reporting biases. The techniques are simple, require very little assessment overhead, and can significantly improve diagnostic inference.
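
To make the point about unreported symptoms concrete, here is a minimal sketch with one disease and one symptom. It is not the speaker's technique. The function name and all probabilities are hypothetical, and it assumes the chance of a symptom being mentioned depends only on whether the symptom is present. Under that assumption, silence about a symptom carries likelihood information whenever present symptoms are more likely to be reported than absent ones.

```python
def posterior_given_silence(prior, p_sym_d, p_sym_not_d, r_present, r_absent):
    """Posterior P(disease | nothing reported about the symptom).

    prior        : P(disease)
    p_sym_d      : P(symptom present | disease)
    p_sym_not_d  : P(symptom present | no disease)
    r_present    : P(patient mentions the symptom | symptom present)
    r_absent     : P(patient mentions the symptom | symptom absent)
    All values are illustrative assumptions.
    """
    # Probability of silence under each hypothesis, summing over the
    # unobserved symptom state.
    silent_d = p_sym_d * (1 - r_present) + (1 - p_sym_d) * (1 - r_absent)
    silent_not_d = (p_sym_not_d * (1 - r_present)
                    + (1 - p_sym_not_d) * (1 - r_absent))
    # Bayes' rule.
    numerator = prior * silent_d
    return numerator / (numerator + (1 - prior) * silent_not_d)


# Biased reporting: present symptoms are mentioned far more often.
biased = posterior_given_silence(0.1, 0.9, 0.2, r_present=0.8, r_absent=0.1)

# Unbiased reporting: silence is uninformative, so the prior is recovered.
unbiased = posterior_given_silence(0.1, 0.9, 0.2, r_present=0.5, r_absent=0.5)

print(f"posterior with reporting bias:    {biased:.4f}")
print(f"posterior without reporting bias: {unbiased:.4f}")
```

Treating the node as simply unobserved corresponds to the unbiased case, in which the posterior equals the prior. With a bias toward reporting present symptoms, the same silence lowers the probability of the disease.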


Feb 15

Speaker: Jim Clark (Botany, Duke)

Title: Seedling Dispersal

Dispersal affects community dynamics and vegetation response to global change. Understanding these effects requires descriptions of dispersal at local and regional scales and statistical models that permit estimation. Classical models of dispersal describe local or long-distance dispersal, but not both. Lack of statistical methods means that models have rarely been fitted to seed dispersal in closed forests. We present a mixture model of dispersal that assumes a range of dispersal patterns, both local and long-distance. The bivariate Student's t or "2Dt" follows from an assumption that the distance parameter in a Gaussian model varies randomly, thus having a density of its own. We use an inverse approach to "compete" our mixture model against classical alternatives using seed rain data sets from temperate broadleaf, temperate mixed conifer, and tropical flood plain forests. The 2Dt model fits dispersal data better than do classical models for most species. The superior fit results from the potential for a convex shape near the source tree and a "fat tail". Our parameter estimates have implications for community dynamics at local scales, for vegetation responses to global change at regional scales, and for differences in seed dispersal among biomes. The 2Dt model predicts that less seed travels beyond the immediate crown influence (< 5 m) than does a Gaussian model, but that more seed travels longer distances (> 30 m). Whereas Gaussian and exponential models predict slow population spread in the face of environmental change, our dispersal estimates suggest rapid spread. The preponderance of animal-dispersed and rare seed types in tropical forests results in noisier patterns of dispersal than occur in temperate hardwood and conifer stands.
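
The way a randomly varying Gaussian distance parameter produces a fat-tailed kernel can be sketched numerically. The parameterization here is an assumption, not something stated in the abstract: a two-dimensional Gaussian kernel with density exp(-r^2/alpha) / (pi * alpha), and a precision 1/alpha drawn from a gamma distribution with shape `p` and rate `u`. Under that assumption the mixture has density p / (pi * u * (1 + r^2/u)^(p+1)), and the fraction of seed travelling beyond distance r is (1 + r^2/u)^(-p). All function names and parameter values are hypothetical.

```python
import math
import random


def tail_gaussian(r, alpha):
    """Fraction of seed beyond distance r under the assumed 2-D Gaussian kernel."""
    return math.exp(-r * r / alpha)


def tail_2dt(r, u, p):
    """Fraction of seed beyond distance r under the assumed 2Dt kernel."""
    return (1.0 + r * r / u) ** (-p)


def simulate_2dt_distances(n, u, p, seed=0):
    """Draw n dispersal distances by mixing Gaussian kernels.

    Each seed gets its own precision lam = 1/alpha ~ Gamma(shape=p, rate=u).
    Given lam, the squared distance under the 2-D Gaussian kernel is
    exponential with rate lam.
    """
    rng = random.Random(seed)
    distances = []
    for _ in range(n):
        lam = rng.gammavariate(p, 1.0 / u)  # second argument is the scale, 1/rate
        r_squared = rng.expovariate(lam)    # argument is the rate
        distances.append(math.sqrt(r_squared))
    return distances


if __name__ == "__main__":
    u, p = 100.0, 1.0  # illustrative values only, not estimates from the talk
    sample = simulate_2dt_distances(20000, u, p, seed=1)
    for r in (5.0, 10.0, 30.0):
        simulated = sum(d > r for d in sample) / len(sample)
        print(f"r = {r:4.0f} m  simulated {simulated:.3f}  "
              f"2Dt {tail_2dt(r, u, p):.3f}  "
              f"Gaussian(alpha=u) {tail_gaussian(r, u):.5f}")
```

How the two kernels compare at short distances depends on how the Gaussian is matched to the data, so this sketch only illustrates the heavier tail at long distances.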


Mar 24

Speaker: Tim Berry

Title: Analytical and Statistical Opportunities in Direct Marketing

The direct marketing industry has experienced tremendous growth during the 1990s and is expected to continue this high growth into the 21st century. A major reason for this growth has been the increase in technology, tools and knowledge that are now available to companies in the direct marketing arena. This talk will focus on the direct mail industry, on how statistical techniques and tools are playing a major role in defining this industry, and on how widespread the opportunities for statisticians and technical people are. The topics for discussion include:

  • Direct Marketing Overview
  • Database Marketing Overview
  • Analytical Tools and Techniques
  • Three Case Studies
  • Career Opportunities in the Direct Mail Industry

The talk is targeted at students who want to learn more about how statistical techniques are being applied in the corporate world and what opportunities exist for statisticians and technical folk outside the traditional channels of academia and pharmaceuticals.


Apr 14

Speaker: Dubois Bowman

Title: A Strategy for Obtaining Inferences about Projected Completors in Longitudinal Studies with Nonignorable Dropout

Attrition in longitudinal studies frequently occurs in many areas of statistical application, including clinical and epidemiological research. Attrition may severely bias estimates, depending on the mechanism that causes subjects to drop out. In many studies, interest primarily lies in making inferences about subjects who are likely to complete the study. When the dropout process is nonignorable, a complete-case analysis is not appropriate because a joint model for the outcome and the dropout mechanism should be considered. In addition, a complete-case analysis leads to limitations in making inferences since it conditions on a stochastic vector that is not realized until the study has been completed. The last-observation-carried-forward method is also commonly used, but is usually inappropriate because of the restrictive assumption upon which it is based. A new strategy for coping with attrition in longitudinal studies that does not require explicit specification of the dropout mechanism is proposed within the context of a pattern-mixture framework. The strategy employs the linear mixed model conditional on a surrogate for dropout. Estimators for model parameters are developed using maximum likelihood and restricted maximum likelihood methods. Properties of the fixed effects estimators are presented under certain regularity conditions for the dropout surrogate. Procedures for evaluating the performance of the method are presented. The new strategy is applied to a data example.


215++

Speaker: various

Title: Theory of Statistics

215++ will be a series of lectures based on chapters 2, 4, 5, and 7 of Mark Schervish's book Theory of Statistics. The series is organized by Peter Mueller, Giovanni Parmigiani and Michael Lavine. We will give introductory talks on each chapter. Other talks will be contributed by students. The dates are flexible. Please sign up soon for a date and topic.


M. Lavine
January, 1999