Semiparametric Bayesian Analysis of Selection Models

Jaeyong Lee and James O. Berger

July 1999

Selection models are appropriate when a datum x enters the sample only with probability or weight w(x). It is typically assumed that the weight function w is monotone, but the precise functional form of the weight function is often unknown. In this paper, the Dirichlet process prior, centered on a parametric form, is used as a prior distribution on the weight function. This allows for incorporation of knowledge about the weight function, without restricting it to be of some particular functional form. By introducing latent variables related to the selection mechanism, computation via Gibbs sampling can be implemented in the case where the total number of selected and unselected observations, N, is known. When N is unknown, a reversible jump Markov chain sampler is needed to carry out the computations. An important difficulty that can be thought of as "practical nonidentifiability" is revealed, even for selection models in which the weight functions are theoretically identifiable. The proposed solution to this problem depends on the existence of prior knowledge concerning the effective range of the weight function.
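
To make the selection mechanism concrete, the following is a minimal simulation sketch and not the method of the paper. It assumes a standard normal population and a logistic weight function; both choices, along with the names weight and simulate_selection and the parameters a and b, are hypothetical and are used only to illustrate a monotone w(x) taking values in (0, 1).

```python
import math
import random


def weight(x, a=0.0, b=2.0):
    """Hypothetical monotone weight function: logistic in x, values in (0, 1).

    The location a and slope b are illustrative assumptions.
    """
    return 1.0 / (1.0 + math.exp(-b * (x - a)))


def simulate_selection(N, seed=0):
    """Draw N data from an assumed N(0, 1) population and keep each datum x
    independently with probability weight(x)."""
    rng = random.Random(seed)
    population = [rng.gauss(0.0, 1.0) for _ in range(N)]
    selected = [x for x in population if rng.random() < weight(x)]
    return population, selected


population, selected = simulate_selection(N=10_000)
pop_mean = sum(population) / len(population)
sel_mean = sum(selected) / len(selected)

# An analyst observes only `selected`. In the known-N setting, len(population)
# is also available; in the unknown-N setting it is not.
print(len(selected), round(pop_mean, 3), round(sel_mean, 3))
```

Because this weight increases in x, larger values are over-represented among the selected data, so the selected sample mean sits above the population mean. This is the kind of distortion that inference in a selection model has to account for.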

The manuscript is available in PostScript format.