Abstract
Label switching is a fundamental problem in Bayesian mixture model analysis. Because the mixture posterior is invariant to permutations of the component labels, the posterior of an m-component mixture model can be viewed as a mixture distribution with m! symmetric components, and the goal of labeling is to recover one of these components. To do so, we propose first fitting a symmetric m!-component mixture model to the Markov chain Monte Carlo (MCMC) samples and then choosing the label for each sample by maximizing the corresponding classification probabilities, which are the probabilities of all possible labels for that sample. We propose both parametric and semi-parametric approaches to fitting the symmetric mixture model to the posterior. Compared with existing labeling methods, the proposed method aims to approximate the posterior directly and provides labeling probabilities for all possible labels; it therefore has a model-based interpretation and theoretical support. In addition, we introduce a situation in which “ideally” labeled samples are available and can therefore be used to compare different labeling methods. We demonstrate the success of the new method in dealing with the label switching problem using two examples.
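The two-step idea in the abstract (fit a symmetric m!-component mixture to the MCMC samples, then pick the most probable label for each sample) can be illustrated with a minimal sketch. It rests on several assumptions that are not taken from the paper: each MCMC sample carries one scalar parameter per component (for example the component means), the symmetric components are normal with a diagonal covariance, the fit uses a plain EM iteration, and the starting values come from sorting each sample. The function name `relabel_by_symmetric_mixture` is hypothetical, and the code should not be read as the author's exact algorithm.

```python
import itertools
import numpy as np


def relabel_by_symmetric_mixture(samples, n_iter=50):
    """Relabel MCMC draws with a symmetric m!-component normal mixture (sketch).

    samples: array of shape (T, m), one scalar parameter per component.
    Returns the relabeled samples (T, m), the classification probabilities
    (T, m!) and the permutations (m!, m) they refer to.
    """
    T, m = samples.shape
    perms = np.array(list(itertools.permutations(range(m))))  # (m!, m)
    # permuted[t, k] is sample t with its entries reordered by perms[k].
    permuted = samples[:, perms]  # (T, m!, m)

    # Assumed starting point: moments of the row-wise sorted samples.
    ordered = np.sort(samples, axis=1)
    mu = ordered.mean(axis=0)
    var = ordered.var(axis=0) + 1e-6

    def classification_probs(mu, var):
        # Log-density of every permuted sample under N(mu, diag(var)).
        # The equal weights 1/m! cancel when normalising over permutations.
        log_dens = -0.5 * (
            (permuted - mu) ** 2 / var + np.log(2 * np.pi * var)
        ).sum(axis=2)
        log_dens -= log_dens.max(axis=1, keepdims=True)  # numerical stability
        probs = np.exp(log_dens)
        return probs / probs.sum(axis=1, keepdims=True)

    for _ in range(n_iter):
        probs = classification_probs(mu, var)  # E-step
        # M-step: weighted moments over all samples and all permutations.
        mu = np.einsum("tp,tpm->m", probs, permuted) / T
        var = np.einsum("tp,tpm->m", probs, (permuted - mu) ** 2) / T + 1e-6

    probs = classification_probs(mu, var)
    best = probs.argmax(axis=1)  # most probable label for each sample
    relabeled = permuted[np.arange(T), best]
    return relabeled, probs, perms
```

Because all m! permutations are enumerated, a sketch like this is only practical for a small number of components. The returned `probs` array plays the role of the labeling probabilities mentioned in the abstract: row t gives the probability of each candidate label for sample t under the fitted symmetric mixture.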
Cite this article
Yao, W. Model based labeling for mixture models. Stat Comput 22, 337–347 (2012). https://doi.org/10.1007/s11222-010-9226-8