Abstract
Modal clustering is an unsupervised learning technique where cluster centers are identified as the local maxima of nonparametric probability density estimates. A natural algorithmic engine for the computation of these maxima is the mean shift procedure, which is essentially an iteratively computed chain of local means. We revisit this technique, focusing on its link to kernel density gradient estimation, and in doing so propose a novel approach to bandwidth selection based on the concept of a critical bandwidth. Furthermore, in the one-dimensional case, an inverse version of the mean shift is developed to provide a novel approach for the estimation of antimodes, which is then used to identify cluster boundaries. A simulation study assesses, in the univariate case, the classification accuracy of the mean-shift based clustering approach. Three examples (univariate and multivariate) from the fields of philately, engineering, and imaging illustrate how modal clusterings identified through mean-shift based methods relate directly and naturally to physical properties of the data-generating system. Computationally efficient solutions for dealing with large data sets are also proposed.
Notes
Conceptually, one could point out here that the maxima of \({\hat{f}}\) constitute the estimates of the (local) modes, while the iterative solutions entailed by equation (5) act as approximations to these estimates. For all practical purposes, \(x^{\blacktriangle }\) will still serve as ‘the estimate’ of a local mode of X; therefore we do not pursue this conceptual distinction further in this exposition.
We have so far consistently spoken of the modes as a property of a random variable (or vector) X, corresponding to the (local) maxima of the associated density f of X. In this subsection, for ease of presentation and consistency with the source literature, we allow ourselves to be a bit more lenient and speak of modes and antimodes of a density f (rather than X), with the obvious meaning.
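The distinction drawn in the first note, between the maxima of the density estimate and the iterates that approximate them, can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a univariate Gaussian kernel with a fixed, hand-picked bandwidth `h`, simulated two-component data, and hypothetical helper names `kde` and `mean_shift`. Each iterate is the kernel-weighted local mean of the data around the current point, and the chain is stopped once successive iterates differ by less than a tolerance.

```python
import numpy as np


def kde(x, data, h):
    """Gaussian kernel density estimate at the scalar point x (bandwidth h)."""
    u = (x - data) / h
    return np.exp(-0.5 * u**2).sum() / (len(data) * h * np.sqrt(2.0 * np.pi))


def mean_shift(x0, data, h, tol=1e-8, max_iter=1000):
    """Chain of Gaussian-weighted local means started at x0.

    Returns the last iterate, which approximates a local maximum of kde().
    """
    x = float(x0)
    for _ in range(max_iter):
        w = np.exp(-0.5 * ((x - data) / h) ** 2)    # kernel weights around x
        x_new = float(np.sum(w * data) / np.sum(w))  # local mean = next iterate
        if abs(x_new - x) < tol:
            return x_new
        x = x_new
    return x


# Simulated data (an assumption for illustration): two well-separated groups.
rng = np.random.default_rng(0)
data = np.concatenate([rng.normal(-2.0, 0.5, 200), rng.normal(2.0, 0.5, 200)])
h = 0.4  # illustrative bandwidth, not chosen by any data-driven selector

mode_left = mean_shift(-1.0, data, h)
mode_right = mean_shift(1.0, data, h)
print(mode_left, mode_right)
```

Starting values on either side of the origin are drawn to different limits, and evaluating `kde` slightly to the left and right of each limit shows that the limit sits at a local maximum of the density estimate up to the chosen tolerance.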
Acknowledgements
This work is partially supported by the H2020 Marie Curie ITN, UTOPIAE, Grant Agreement No. 722734; Grant PID2020-116587GB-I00 funded by MCIN/AEI/10.13039/501100011033; and the Competitive Reference Groups 2021-2024 (ED431C 2021/24) from the Xunta de Galicia.
Ethics declarations
Conflicts of interest
All authors declare that they have no conflicts of interest.
About this article
Cite this article
Ameijeiras-Alonso, J., Einbeck, J. A fresh look at mean-shift based modal clustering. Adv Data Anal Classif (2023). https://doi.org/10.1007/s11634-023-00575-1
DOI: https://doi.org/10.1007/s11634-023-00575-1