Combining Mixture Models and Spectral Clustering for Data Partitioning

Muzeau, Julien; Oliver-Parera, Maria; Ladret, Patricia; Bertolino, Pascal

doi:10.1007/978-3-030-50516-5_6

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 12132)

Included in the following conference series:

International Conference on Image Analysis and Recognition

Abstract

Gaussian Mixture Models are widely used, thanks to the simplicity and efficiency of the Expectation-Maximization algorithm. However, the optimal number of components is difficult to determine and, in the context of data partitioning, may differ from the actual number of clusters. We propose a post-processing step based on Spectral Clustering: it merges similar Gaussians, with similarity measured by the Bhattacharyya distance, so that clusters of any shape are discovered automatically. The proposed method shows a significant improvement over the classical Gaussian Mixture clustering approach and promising results against well-known partitioning algorithms with respect to the number of parameters.
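The merging idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the similarity exp(-D_B), the unnormalized graph Laplacian, the two-way split on the sign of the Fiedler vector, and the names bhattacharyya_distance and merge_components are all assumptions made for illustration. The sketch takes the means and covariances of already-fitted Gaussian components and groups the components that overlap strongly.

```python
import numpy as np


def bhattacharyya_distance(mu1, cov1, mu2, cov2):
    """Bhattacharyya distance between two multivariate Gaussians."""
    mu1, mu2 = np.asarray(mu1, float), np.asarray(mu2, float)
    cov1, cov2 = np.asarray(cov1, float), np.asarray(cov2, float)
    cov = 0.5 * (cov1 + cov2)  # averaged covariance
    diff = mu1 - mu2
    # Mean term: (1/8) * diff^T cov^{-1} diff
    mean_term = 0.125 * diff @ np.linalg.solve(cov, diff)
    # Covariance term: (1/2) * ln(det(cov) / sqrt(det(cov1) * det(cov2)))
    _, logdet = np.linalg.slogdet(cov)
    _, logdet1 = np.linalg.slogdet(cov1)
    _, logdet2 = np.linalg.slogdet(cov2)
    cov_term = 0.5 * (logdet - 0.5 * (logdet1 + logdet2))
    return mean_term + cov_term


def merge_components(means, covs):
    """Split Gaussian components into two groups (spectral bipartition).

    Assumption: similarity between components i and j is exp(-D_B(i, j)),
    and the groups are read off the sign of the Fiedler vector of the
    unnormalized Laplacian. This is a simplified stand-in for a full
    spectral clustering algorithm.
    """
    k = len(means)
    affinity = np.zeros((k, k))
    for i in range(k):
        for j in range(k):
            d = bhattacharyya_distance(means[i], covs[i], means[j], covs[j])
            affinity[i, j] = np.exp(-d)
    laplacian = np.diag(affinity.sum(axis=1)) - affinity
    # eigh returns eigenvalues in ascending order; column 1 is the Fiedler vector.
    _, eigvecs = np.linalg.eigh(laplacian)
    fiedler = eigvecs[:, 1]
    return (fiedler > 0).astype(int)


# Hypothetical example: four unit-covariance components forming two pairs.
means = [[0.0, 0.0], [1.0, 0.0], [10.0, 0.0], [11.0, 0.0]]
covs = [np.eye(2)] * 4
labels = merge_components(means, covs)
print(labels)  # components 0 and 1 share a label, as do 2 and 3
```

Each data point would then inherit the group of the component it is assigned to, so one cluster can be covered by several Gaussians. Which of the two groups receives label 0 is arbitrary, because the sign of an eigenvector is not determined.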

Supported by Auvergne-Rhône-Alpes region.

Notes

1. https://github.com/deric/clustering-benchmark

Author information

Authors and Affiliations

Université Grenoble Alpes, CNRS, Grenoble INP, GIPSA-lab, 38000, Grenoble, France
Julien Muzeau, Maria Oliver-Parera, Patricia Ladret & Pascal Bertolino

Corresponding author

Correspondence to Julien Muzeau.

Editor information

Editors and Affiliations

University of Porto, Porto, Portugal
Aurélio Campilho
University of Waterloo, Waterloo, ON, Canada
Fakhri Karray
University of Waterloo, Waterloo, ON, Canada
Zhou Wang

About this paper

Cite this paper

Muzeau, J., Oliver-Parera, M., Ladret, P., Bertolino, P. (2020). Combining Mixture Models and Spectral Clustering for Data Partitioning. In: Campilho, A., Karray, F., Wang, Z. (eds) Image Analysis and Recognition. ICIAR 2020. Lecture Notes in Computer Science, vol 12132. Springer, Cham. https://doi.org/10.1007/978-3-030-50516-5_6

DOI: https://doi.org/10.1007/978-3-030-50516-5_6
Published: 17 June 2020
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-50515-8
Online ISBN: 978-3-030-50516-5
eBook Packages: Computer Science, Computer Science (R0)
