Journal of Computer Science and Technology, Volume 25, Issue 4, pp 653–664

Dirichlet Process Gaussian Mixture Models: Choice of the Base Distribution

  • Dilan Görür
  • Carl Edward Rasmussen
Regular Paper


In the Bayesian mixture modeling framework it is possible to infer the necessary number of components to model the data and therefore it is unnecessary to explicitly restrict the number of components. Nonparametric mixture models sidestep the problem of finding the “correct” number of mixture components by assuming infinitely many components. In this paper Dirichlet process mixture (DPM) models are cast as infinite mixture models and inference using Markov chain Monte Carlo is described. The specification of the priors on the model parameters is often guided by mathematical and practical convenience. The primary goal of this paper is to compare the choice of conjugate and non-conjugate base distributions on a particular class of DPM models which is widely used in applications, the Dirichlet process Gaussian mixture model (DPGMM). We compare computational efficiency and modeling performance of DPGMM defined using a conjugate and a conditionally conjugate base distribution. We show that better density models can result from using a wider class of priors with no or only a modest increase in computational effort.
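The idea that the number of components need not be fixed in advance can be illustrated with a minimal sketch of the Chinese restaurant process, a standard representation of the Dirichlet process prior over component assignments. This is an illustration under stated assumptions, not the samplers compared in the paper: the function name `sample_crp_assignments`, the concentration parameter name `alpha`, and the fixed seed are hypothetical choices made for this example, and the sketch draws only from the prior, with no Gaussian likelihood or base distribution involved.

```python
import random


def sample_crp_assignments(n_points, alpha, seed=0):
    """Draw component assignments from a Chinese restaurant process prior.

    Hypothetical helper for illustration; alpha > 0 is the concentration
    parameter of the Dirichlet process.
    """
    rng = random.Random(seed)
    assignments = []
    counts = []  # counts[k] = number of points currently in component k
    for _ in range(n_points):
        # An existing component k is chosen with weight counts[k];
        # a brand-new component is chosen with weight alpha.
        weights = counts + [alpha]
        k = rng.choices(range(len(weights)), weights=weights)[0]
        if k == len(counts):
            counts.append(1)  # open a new component
        else:
            counts[k] += 1
        assignments.append(k)
    return assignments, counts


assignments, counts = sample_crp_assignments(n_points=100, alpha=1.0)
print("components used:", len(counts))
```

The number of occupied components is an outcome of the draw rather than an input, and larger values of `alpha` tend to produce more components.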


Bayesian nonparametrics, Dirichlet processes, Gaussian mixtures





Copyright information

© Springer 2010

Authors and Affiliations

  1. Gatsby Computational Neuroscience Unit, University College London, London, U.K.
  2. Department of Engineering, University of Cambridge, Cambridge, U.K.
  3. Max Planck Institute for Biological Cybernetics, Tübingen, Germany
