Dirichlet Process Gaussian Mixture Models: Choice of the Base Distribution

  • Regular Paper
  • Published:
Journal of Computer Science and Technology

Abstract

In the Bayesian mixture modeling framework, it is possible to infer the number of components needed to model the data, so there is no need to restrict this number explicitly. Nonparametric mixture models sidestep the problem of finding the “correct” number of mixture components by assuming infinitely many components. In this paper, Dirichlet process mixture (DPM) models are cast as infinite mixture models, and inference using Markov chain Monte Carlo is described. The specification of the priors on the model parameters is often guided by mathematical and practical convenience. The primary goal of this paper is to compare the choice of conjugate and non-conjugate base distributions for a particular class of DPM models that is widely used in applications, the Dirichlet process Gaussian mixture model (DPGMM). We compare the computational efficiency and modeling performance of DPGMMs defined using a conjugate and a conditionally conjugate base distribution, and show that better density models can result from using a wider class of priors with no or only a modest increase in computational effort.
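To make the model class concrete for readers of the abstract, the following is a minimal sketch of a standard DPGMM generative specification; the notation (alpha, G_0, theta_i) and the particular parameterization are illustrative assumptions and are not taken from the paper itself.

    % Standard DPGMM specification (illustrative notation, not the paper's)
    \begin{align*}
      G \mid \alpha, G_0 &\sim \mathrm{DP}(\alpha, G_0)            && \text{random mixing distribution} \\
      \theta_i \mid G &\sim G, \quad \theta_i = (\mu_i, \Sigma_i)  && i = 1, \dots, n \\
      x_i \mid \theta_i &\sim \mathcal{N}(\mu_i, \Sigma_i)         && \text{Gaussian component likelihood}
    \end{align*}

Under this kind of formulation, a conjugate base distribution G_0 is typically a Normal-inverse-Wishart prior on (mu, Sigma), which allows the component parameters to be integrated out analytically during Gibbs sampling; a conditionally conjugate choice places separate priors on the mean and covariance, trading some analytical convenience for a wider class of priors, which is the comparison the abstract describes.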

Author information

Corresponding author

Correspondence to Dilan Görür.

Additional information

This work is supported by the Gatsby Charitable Foundation and PASCAL2.

About this article

Cite this article

Görür, D., Rasmussen, C.E. Dirichlet Process Gaussian Mixture Models: Choice of the Base Distribution. J. Comput. Sci. Technol. 25, 653–664 (2010). https://doi.org/10.1007/s11390-010-9355-8
