Clustered Dirichlet Process Mixture Modelling



Description: This software implements a highly optimized Markov chain Monte Carlo (MCMC) algorithm for fitting a clustered Dirichlet process mixture of normal distributions. This class of models is designed for density estimation and hierarchical classification in multivariate, non-Gaussian data. A clustered Dirichlet process mixture model can be thought of as a nonparametric mixture of nonparametric mixtures: each cluster in the top-level mixture is itself a mixture of normal components.
A key feature of this class of models is that both the number of clusters (the top layer of the mixture) and the number of mixture components per cluster (the bottom layer of the mixture) are estimated from the data. The software can also be used to fit standard Dirichlet process mixtures of normals.
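To make the two-layer structure concrete, the following is a minimal Python sketch of the generative side of such a model. It is not the CDP software's algorithm or input format: it performs no MCMC fitting, it uses univariate normals rather than multivariate ones, and it truncates each Dirichlet process to a fixed number of atoms via stick-breaking. All function names, parameter names, and numeric settings (`stick_breaking`, `alpha_top`, `alpha_bottom`, the prior scales) are assumptions chosen for illustration only.

```python
import random


def stick_breaking(alpha, n_atoms, rng):
    """Truncated stick-breaking weights for a Dirichlet process
    with concentration parameter alpha (illustrative truncation)."""
    weights, remaining = [], 1.0
    for _ in range(n_atoms - 1):
        frac = rng.betavariate(1.0, alpha)
        weights.append(remaining * frac)
        remaining *= 1.0 - frac
    weights.append(remaining)  # last atom absorbs the leftover mass
    return weights


def draw_clustered_mixture(rng, alpha_top=1.0, alpha_bottom=1.0,
                           n_clusters=5, n_components=5):
    """Draw one random two-layer mixture.

    Top layer: weights over clusters.
    Bottom layer: for each cluster, weights, means and standard
    deviations of its normal components (hypothetical priors).
    """
    top = stick_breaking(alpha_top, n_clusters, rng)
    clusters = []
    for _ in range(n_clusters):
        centre = rng.gauss(0.0, 10.0)  # assumed cluster location prior
        clusters.append({
            "weights": stick_breaking(alpha_bottom, n_components, rng),
            "means": [rng.gauss(centre, 1.0) for _ in range(n_components)],
            "sds": [rng.uniform(0.2, 0.6) for _ in range(n_components)],
        })
    return top, clusters


def sample(rng, top, clusters, n):
    """Sample n points: pick a cluster, then a component within that
    cluster, then draw from the component's normal distribution."""
    data = []
    for _ in range(n):
        j = rng.choices(range(len(top)), weights=top)[0]
        c = clusters[j]
        k = rng.choices(range(len(c["weights"])), weights=c["weights"])[0]
        data.append((j, k, rng.gauss(c["means"][k], c["sds"][k])))
    return data


rng = random.Random(0)
top, clusters = draw_clustered_mixture(rng)
points = sample(rng, top, clusters, 1000)
```

Each sampled point carries a cluster label `j` and a component label `k`, which mirrors the hierarchical classification described above: because every cluster is itself a mixture of normals, a single cluster can have a non-Gaussian shape. In the fitted model the number of occupied clusters and components is inferred from the data rather than fixed in advance, as it is in this truncated sketch.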

In the Downloads section, you can find serial and multithreaded executables for the Windows, Linux, and Macintosh OS X platforms.

In the Inputs section, you can find a description of the various options and model settings that can be specified in the input file.

In the Examples section, you can find sample data sets, input files, and R scripts for producing useful graphical summaries of the fitted models, including the figures appearing above.


Acknowledgements: The research and development underlying the code provided here were supported, in part, by the National Science Foundation (grant DMS-0342172) and the National Institutes of Health (grants U54-CA-112952-01 and P50-GM081883, and contract HHSN268200500019C). Any opinions, findings, and conclusions or recommendations expressed in this work are those of the authors and do not necessarily reflect the views of the NSF or NIH.

Disclaimer: This software is made freely available to any interested user. The authors can provide neither support nor assistance with implementations beyond the details and examples here, nor extensions of the code for other purposes. The download has been tested to confirm that all details are operational as described here. It is understood by the user that neither the authors nor Duke University bear any responsibility or assume any liability for any end-use of this software. It is expected that appropriate credit/acknowledgement be given should the software be included as an element in other software development or in publications.

CDP code developed by: Dan Merl & Quanli Wang

More software from the West group