Model based labeling for mixture models

Statistics and Computing

Abstract

Label switching is one of the fundamental problems in Bayesian mixture model analysis. Because the mixture posterior is invariant to permutations of the component labels, the posterior of an m-component mixture model can be viewed as a mixture distribution with m! symmetric components, and the objective of labeling is to recover one of these components. To do the labeling, we propose to first fit a symmetric m!-component mixture model to the Markov chain Monte Carlo (MCMC) samples and then choose the label for each sample by maximizing the corresponding classification probabilities, which are the probabilities of all possible labels for that sample. Both parametric and semi-parametric approaches are proposed for fitting the symmetric mixture model to the posterior. Compared with existing labeling methods, the proposed method aims to approximate the posterior directly and provides labeling probabilities for all possible labels, and thus has a model-based interpretation and theoretical support. In addition, we introduce a situation in which “ideally” labeled samples are available and can therefore be used to compare different labeling methods. We demonstrate the success of the new method in dealing with the label switching problem using two examples.
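
The two-step idea in the abstract (fit a symmetric m!-component mixture to the MCMC samples, then pick the label with the largest classification probability) can be illustrated with a minimal sketch. It is not the paper's implementation: it assumes one scalar parameter per component, assumes a Gaussian family for the symmetric components (the abstract does not name one), and fits it with a plain EM loop. The function name `label_by_symmetric_mixture` and its arguments are hypothetical.

```python
import itertools

import numpy as np


def label_by_symmetric_mixture(samples, n_iter=50, ridge=1e-6):
    """Relabel MCMC draws with a symmetric m!-component Gaussian mixture.

    samples: (T, m) array, one scalar parameter per mixture component.
    Returns the relabeled draws, the chosen permutation for each draw,
    and the (T, m!) classification probabilities.
    Assumption: a Gaussian reference component with mean `mu` and
    covariance `sigma`; every other component is a permutation of it.
    """
    samples = np.asarray(samples, dtype=float)
    T, m = samples.shape
    perms = np.array(list(itertools.permutations(range(m))))  # (m!, m)
    permuted = samples[:, perms]  # (T, m!, m): every relabeling of every draw

    # Start from the order-constrained labeling (sort each draw).
    ordered = np.sort(samples, axis=1)
    mu = ordered.mean(axis=0)
    sigma = np.cov(ordered, rowvar=False).reshape(m, m) + ridge * np.eye(m)

    for _ in range(n_iter):
        # E-step: the m! components share equal weights and the same
        # determinant, so only the Mahalanobis distance matters.
        diff = permuted - mu
        maha = np.einsum("tpi,ij,tpj->tp", diff, np.linalg.inv(sigma), diff)
        logp = -0.5 * maha
        logp -= logp.max(axis=1, keepdims=True)  # numerical stability
        probs = np.exp(logp)
        probs /= probs.sum(axis=1, keepdims=True)

        # M-step: weighted mean and covariance over all relabelings.
        mu = np.einsum("tp,tpi->i", probs, permuted) / T
        diff = permuted - mu
        sigma = np.einsum("tp,tpi,tpj->ij", probs, diff, diff) / T
        sigma += ridge * np.eye(m)

    # Labeling: keep the relabeling with the largest classification probability.
    best = probs.argmax(axis=1)
    relabeled = permuted[np.arange(T), best]
    return relabeled, perms[best], probs
```

Calling `label_by_symmetric_mixture(draws)` on a `(T, m)` array of draws returns the relabeled draws together with the full matrix of labeling probabilities, so uncertain labels (rows with no dominant probability) can be inspected. Because all m! permutations are enumerated, a sketch like this is only practical for small m.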

Author information

Corresponding author

Correspondence to Weixin Yao.

About this article

Cite this article

Yao, W. Model based labeling for mixture models. Stat Comput 22, 337–347 (2012). https://doi.org/10.1007/s11222-010-9226-8

  • DOI: https://doi.org/10.1007/s11222-010-9226-8
