Abstract
We address the problem of automatically selecting the number of components in mixture models with non-Gaussian components. As a more efficient alternative to the traditional comparison of several model scores over a range of candidate sizes, we consider procedures based on a single run of the inference scheme. Starting from an overfitted mixture in a Bayesian setting, we investigate two strategies for eliminating superfluous components. We implement these strategies for mixtures of multiple scale distributions, which exhibit a variety of shapes, not necessarily elliptical, while remaining analytically tractable in multiple dimensions. A Bayesian formulation and a tractable inference procedure based on a variational approximation are proposed. Preliminary results on simulated and real data show promising performance in terms of both model selection and computational time.
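The overfitted-mixture principle described above can be illustrated with a minimal sketch. The example below uses scikit-learn's BayesianGaussianMixture with a small Dirichlet concentration prior, so that a single variational run empties superfluous components, which are then pruned by thresholding their estimated weights. Note that this sketch uses Gaussian components rather than the multiple scale distributions studied in the paper, and the toy data, weight threshold, and hyperparameter values are illustrative assumptions only, not the authors' procedure.

```python
# Minimal sketch of the overfitted-mixture idea (Gaussian components only,
# NOT the multiple scale distributions of the paper). A small Dirichlet
# concentration prior lets variational inference empty superfluous
# components in a single run; these are then pruned by weight threshold.
import numpy as np
from sklearn.mixture import BayesianGaussianMixture

rng = np.random.default_rng(0)
# Toy data: three well-separated 2-D clusters (assumed example data).
X = np.vstack([
    rng.normal(loc=[0, 0], scale=0.5, size=(200, 2)),
    rng.normal(loc=[5, 5], scale=0.5, size=(200, 2)),
    rng.normal(loc=[0, 5], scale=0.5, size=(200, 2)),
])

# Deliberately overfitted: many more components than expected clusters.
vb = BayesianGaussianMixture(
    n_components=10,
    weight_concentration_prior_type="dirichlet_distribution",
    weight_concentration_prior=1e-3,  # small value favours emptying components
    covariance_type="full",
    max_iter=500,
    random_state=0,
)
vb.fit(X)

# "Component elimination": keep only components with non-negligible weight.
kept = np.flatnonzero(vb.weights_ > 1e-2)
print("estimated number of components:", kept.size)  # expected: 3
```

The single fit replaces the usual loop over candidate numbers of components followed by a score comparison, which is the computational saving the abstract refers to.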
Copyright information
© 2019 Springer Nature Singapore Pte Ltd.
Cite this paper
Forbes, F., Arnaud, A., Lemasson, B., Barbier, E. (2019). Component Elimination Strategies to Fit Mixtures of Multiple Scale Distributions. In: Nguyen, H. (eds) Statistics and Data Science. RSSDS 2019. Communications in Computer and Information Science, vol 1150. Springer, Singapore. https://doi.org/10.1007/978-981-15-1960-4_6
DOI: https://doi.org/10.1007/978-981-15-1960-4_6
Publisher Name: Springer, Singapore
Print ISBN: 978-981-15-1959-8
Online ISBN: 978-981-15-1960-4