A fresh look at mean-shift based modal clustering

  • Regular Article
  • Published in: Advances in Data Analysis and Classification

Abstract

Modal clustering is an unsupervised learning technique in which cluster centers are identified as the local maxima of nonparametric probability density estimates. A natural algorithmic engine for computing these maxima is the mean shift procedure, which is essentially an iteratively computed chain of local means. We revisit this technique, focusing on its link to kernel density gradient estimation, and in doing so propose a novel approach to bandwidth selection based on the concept of a critical bandwidth. Furthermore, in the one-dimensional case, an inverse version of the mean shift is developed to provide a novel approach for the estimation of antimodes, which is then used to identify cluster boundaries. A simulation study assesses, in the univariate case, the classification accuracy of the mean-shift based clustering approach. Three (univariate and multivariate) examples from the fields of philately, engineering, and imaging illustrate how modal clusterings identified through mean-shift based methods relate directly and naturally to physical properties of the data-generating system. Solutions are proposed for handling large data sets in a computationally efficient manner.
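
To make the "chain of local means" concrete, the following is a minimal sketch rather than the authors' implementation. It assumes a Gaussian kernel with a single scalar bandwidth `h`; the function names `mean_shift_path` and `modal_clustering`, the stopping tolerance, and the merge tolerance are all illustrative choices that do not come from the article.

```python
import numpy as np


def mean_shift_path(x0, data, h, tol=1e-8, max_iter=1000):
    """Chain of Gaussian-weighted local means started at x0 (assumed kernel)."""
    data = np.asarray(data, dtype=float)
    if data.ndim == 1:
        data = data[:, None]              # treat a vector as n points in 1-D
    x = np.atleast_1d(np.asarray(x0, dtype=float))
    path = [x]
    for _ in range(max_iter):
        sq_dist = np.sum((data - x) ** 2, axis=1)
        # Subtracting the minimum only rescales the weights; the ratio below
        # is unchanged, but underflow far from the data is avoided.
        w = np.exp(-0.5 * (sq_dist - sq_dist.min()) / h**2)
        new_x = w @ data / w.sum()        # local mean around the current point
        path.append(new_x)
        if np.linalg.norm(new_x - x) < tol:
            break
        x = new_x
    return np.array(path)


def modal_clustering(data, h, merge_tol=1e-3):
    """Label each point by the limit of its own mean shift chain."""
    data = np.asarray(data, dtype=float)
    if data.ndim == 1:
        data = data[:, None]
    modes = []
    labels = np.empty(len(data), dtype=int)
    for i, point in enumerate(data):
        end = mean_shift_path(point, data, h)[-1]
        for k, mode in enumerate(modes):
            if np.linalg.norm(end - mode) < merge_tol:
                labels[i] = k             # chain ended at an already known mode
                break
        else:
            modes.append(end)             # chain ended at a new mode
            labels[i] = len(modes) - 1
    return np.array(modes), labels


# Illustrative data: two well-separated univariate groups (made up here).
rng = np.random.default_rng(0)
sample = np.concatenate([rng.normal(-3.0, 0.5, 100), rng.normal(3.0, 0.5, 100)])
modes, labels = modal_clustering(sample, h=0.8)
print(modes.ravel(), np.bincount(labels))
```

Running one chain per observation costs on the order of n² kernel evaluations per sweep, which is one reason the efficient treatment of large data sets is a concern; this sketch makes no attempt to address that.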

Notes

  1. Conceptually, one could point out here that the maxima of \({\hat{f}}\) constitute the estimates of the (local) modes, while the iterative solutions entailed by equation (5) act as approximations to these estimates. For all practical purposes, \(x^{\blacktriangle }\) will still serve as ‘the estimate’ of a local mode of X; we therefore make no further effort to distinguish this conceptual nuance in this exposition.

  2. We have so far consistently spoken of the modes as a property of a random variable (or vector) X, corresponding to the (local) maxima of the associated density f of X. In this subsection, for ease of presentation and consistency with the source literature, we allow ourselves to be a bit more lenient and speak of modes and antimodes of a density f (rather than of X), with the obvious meaning.
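
As a hedged illustration of the distinction drawn in note 1: equation (5) is not reproduced in this excerpt, so the standard univariate form is assumed here, with a Gaussian kernel \(K\), a bandwidth \(h\), a sample \(X_1, \ldots, X_n\), and \(m\) as illustrative notation for the local mean.

\[ {\hat{f}}(x) = \frac{1}{nh}\sum_{i=1}^{n} K\!\left(\frac{x - X_i}{h}\right), \qquad m(x) = \frac{\sum_{i=1}^{n} K\!\left(\frac{x - X_i}{h}\right) X_i}{\sum_{i=1}^{n} K\!\left(\frac{x - X_i}{h}\right)} \]

\[ {\hat{f}}'(x) = \frac{1}{nh^{3}}\sum_{i=1}^{n} (X_i - x)\, K\!\left(\frac{x - X_i}{h}\right) = \frac{{\hat{f}}(x)}{h^{2}}\,\bigl(m(x) - x\bigr) \]

Under these assumptions, \({\hat{f}}'(x) = 0\) exactly when \(m(x) = x\), so the stationary points of \({\hat{f}}\) (modes and antimodes alike) are the fixed points of the local mean. The chain \(x_{\ell + 1} = m(x_{\ell })\), stopped after finitely many steps, returns \(x^{\blacktriangle }\) as a numerical approximation to such a fixed point, which is the nuance the note refers to.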

Acknowledgements

This work is partially supported by the H2020 Marie Curie ITN, UTOPIAE, Grant Agreement No. 722734; Grant PID2020-116587GB-I00 funded by MCIN/AEI/10.13039/501100011033; and the Competitive Reference Groups 2021-2024 (ED431C 2021/24) from the Xunta de Galicia.

Author information

Corresponding author

Correspondence to Jose Ameijeiras-Alonso.

Ethics declarations

Conflicts of interest

All authors declare that they have no conflicts of interest.

Supplementary Information

Below is the link to the electronic supplementary material.

Supplementary file 1 (zip 10398 KB)

About this article

Cite this article

Ameijeiras-Alonso, J., Einbeck, J. A fresh look at mean-shift based modal clustering. Adv Data Anal Classif (2023). https://doi.org/10.1007/s11634-023-00575-1

  • DOI: https://doi.org/10.1007/s11634-023-00575-1
