Statistics and Computing, Volume 22, Issue 2, pp 337–347

Model based labeling for mixture models

  • Weixin Yao

Abstract

Label switching is one of the fundamental problems in Bayesian mixture model analysis. Because the mixture posterior is invariant to permutations of the component labels, the posterior of an m-component mixture model can be viewed as a mixture distribution with m! symmetric components, and the objective of labeling is to recover one of those components. To do the labeling, we propose to first fit a symmetric m!-component mixture model to the Markov chain Monte Carlo (MCMC) samples and then choose the label for each sample by maximizing the corresponding classification probabilities, which are the probabilities of all possible labels for that sample. Both parametric and semi-parametric approaches are proposed for fitting the symmetric mixture model to the posterior. Compared with existing labeling methods, the proposed method aims to approximate the posterior directly and provides labeling probabilities for all possible labels; it therefore has a model-based interpretation and theoretical support. In addition, we introduce a situation in which “ideally” labeled samples are available and can thus be used to compare different labeling methods. We demonstrate the success of the new method in dealing with the label switching problem using two examples.
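
The labeling idea described above can be illustrated with a minimal sketch. It is not the paper's implementation: it assumes one scalar parameter per component (for example, the component means), a Gaussian form for each of the m! symmetric components with a shared center and covariance, and a plain EM fit. The function name `relabel_symmetric_gaussian` and its arguments `n_iter` and `ridge` are hypothetical names chosen for this illustration.

```python
import itertools
import numpy as np


def relabel_symmetric_gaussian(samples, n_iter=50, ridge=1e-6):
    """Relabel MCMC draws by fitting a symmetric m!-component Gaussian mixture.

    samples: (T, m) array, one scalar parameter per mixture component per draw.
    Returns (relabeled draws, labeling probabilities, permutations).
    Assumption: every symmetric component is Gaussian with the same center and
    covariance, applied to a permuted copy of the draw.
    """
    samples = np.asarray(samples, dtype=float)
    T, m = samples.shape
    perms = np.array(list(itertools.permutations(range(m))))  # (m!, m)
    # permuted[t, k] is draw t with its coordinates reordered by permutation k.
    permuted = samples[:, perms]  # (T, m!, m)

    def e_step(center, cov):
        # Classification probability of each permutation for each draw.
        # The normalizing constant is shared by all permutations and cancels.
        diff = permuted - center
        flat = diff.reshape(-1, m)
        sol = np.linalg.solve(cov, flat.T).T.reshape(T, -1, m)
        log_dens = -0.5 * np.einsum("tkm,tkm->tk", diff, sol)
        log_dens -= log_dens.max(axis=1, keepdims=True)  # numerical stability
        prob = np.exp(log_dens)
        return prob / prob.sum(axis=1, keepdims=True)

    # Initialize from the sorted draws (an order constraint), an assumption
    # made here only to get a reasonable starting point.
    init = np.sort(samples, axis=1)
    center = init.mean(axis=0)
    cov = np.cov(init, rowvar=False) + ridge * np.eye(m)

    for _ in range(n_iter):
        prob = e_step(center, cov)
        # M-step: weighted mean and covariance over all permuted copies.
        center = np.einsum("tk,tkm->m", prob, permuted) / T
        diff = permuted - center
        cov = np.einsum("tk,tki,tkj->ij", prob, diff, diff) / T
        cov += ridge * np.eye(m)

    prob = e_step(center, cov)
    best = prob.argmax(axis=1)  # label maximizing the classification probability
    relabeled = permuted[np.arange(T), best]
    return relabeled, prob, perms
```

Each row of the returned `prob` array holds the probabilities of all m! possible labels for one draw, and the relabeled draw is the permutation with the largest probability. Because the number of permutations grows as m!, a sketch like this is only practical for small m.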

Keywords

Bayesian mixtures · Labeling probabilities · Label switching · Markov chain Monte Carlo · Mixture model

Copyright information

© Springer Science+Business Media, LLC 2011

Authors and Affiliations

  1. Department of Statistics, Kansas State University, Manhattan, USA