Component Elimination Strategies to Fit Mixtures of Multiple Scale Distributions

  • Conference paper

Part of the book series: Communications in Computer and Information Science (CCIS, volume 1150)

Abstract

We address the problem of automatically selecting the number of components in mixture models with non-Gaussian components. As a more efficient alternative to the traditional comparison of model scores over a range of candidate sizes, we consider procedures based on a single run of the inference scheme. Starting from an overfitted mixture in a Bayesian setting, we investigate two strategies for eliminating superfluous components. We implement these strategies for mixtures of multiple scale distributions, which exhibit a variety of shapes, not necessarily elliptical, while remaining analytically tractable in multiple dimensions. A Bayesian formulation and a tractable inference procedure based on variational approximation are proposed. Preliminary results on simulated and real data show promising performance in terms of both model selection and computational time.
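The overfitting-mixture idea described above can be illustrated with a minimal sketch. Note the assumptions: this is not the authors' method (which targets mixtures of multiple scale distributions with their own variational scheme); it uses scikit-learn's variational Bayesian *Gaussian* mixture as a stand-in, and the weight threshold `1e-2` is an illustrative choice. One starts with deliberately too many components and lets a sparsity-inducing Dirichlet-type prior drive the weights of superfluous components toward zero during a single inference run, after which near-empty components are pruned.

```python
import numpy as np
from sklearn.mixture import BayesianGaussianMixture

rng = np.random.default_rng(0)
# Simulated data: two well-separated clusters in 2-D.
X = np.vstack([
    rng.normal(loc=(-3.0, 0.0), scale=0.5, size=(200, 2)),
    rng.normal(loc=(3.0, 0.0), scale=0.5, size=(200, 2)),
])

# Deliberately overfit with K = 10 components; a small concentration
# prior encourages variational inference to empty out the extra ones.
model = BayesianGaussianMixture(
    n_components=10,
    weight_concentration_prior=1e-3,
    max_iter=500,
    random_state=0,
).fit(X)

# Eliminate components whose posterior mixture weight is negligible.
effective = int(np.sum(model.weights_ > 1e-2))
print(f"effective components: {effective}")
```

On such well-separated data the surviving number of components is typically close to the true value of 2, so the model size is selected in one fit rather than by scoring every K in a range.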



Author information

Correspondence to Florence Forbes.


Copyright information

© 2019 Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Forbes, F., Arnaud, A., Lemasson, B., Barbier, E. (2019). Component Elimination Strategies to Fit Mixtures of Multiple Scale Distributions. In: Nguyen, H. (eds) Statistics and Data Science. RSSDS 2019. Communications in Computer and Information Science, vol 1150. Springer, Singapore. https://doi.org/10.1007/978-981-15-1960-4_6

  • DOI: https://doi.org/10.1007/978-981-15-1960-4_6

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-15-1959-8

  • Online ISBN: 978-981-15-1960-4
