Abstract
The mean-shift algorithm is an iterative method of mode seeking and data clustering based on the kernel density estimator. The blurring mean-shift is an accelerated version that uses the original data only in the first step and then re-smooths the previous estimates. It converges to local centroids but may suffer from asymptotic bias, which depends fundamentally on the design of its smoothing components. This paper develops nearest-neighbor implementations and data-driven techniques of bandwidth selection that enhance the clustering performance of the blurring method. These solutions can be applied to the whole class of mean-shift algorithms, including the iterative local mean method. Extensive simulation experiments and applications to well-known data sets show that the blurring estimator performs well relative to other algorithms.
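The blurring iteration described in the abstract can be illustrated with a minimal sketch. It assumes a Gaussian kernel with a single fixed bandwidth; it does not reproduce the nearest-neighbor implementations or the data-driven bandwidth selectors developed in the paper. The function name `blurring_mean_shift` and the parameters `bandwidth`, `n_iter` and `tol` are hypothetical and chosen only for illustration.

```python
import numpy as np

def blurring_mean_shift(X, bandwidth, n_iter=20, tol=1e-6):
    """Illustrative Gaussian blurring mean shift (fixed bandwidth, assumed here).

    The original data X are used only to initialise Y; every later step
    re-smooths the previous estimates Y with kernel weights computed on Y.
    """
    Y = np.asarray(X, dtype=float).copy()
    for _ in range(n_iter):
        # Pairwise squared distances between the current estimates.
        d2 = ((Y[:, None, :] - Y[None, :, :]) ** 2).sum(axis=-1)
        # Gaussian kernel weights (an assumption; other kernels are possible).
        W = np.exp(-0.5 * d2 / bandwidth**2)
        # Each point becomes the kernel-weighted mean of all current points.
        Y_new = (W @ Y) / W.sum(axis=1, keepdims=True)
        shift = np.abs(Y_new - Y).max()
        Y = Y_new
        if shift < tol:
            break
    return Y

# Two well-separated groups on the real line.
X = np.array([[0.0], [0.2], [0.4], [10.0], [10.2], [10.4]])
Y = blurring_mean_shift(X, bandwidth=1.0)
print(Y.round(3).ravel())
```

In this toy example the points of each group collapse onto a common local centroid, so clusters can be read off by grouping the rows of `Y` that coincide. The result depends strongly on `bandwidth`, which is why its selection is a central design issue.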
Cite this article
Grillenzoni, C. Design of Blurring Mean-Shift Algorithms for Data Classification. J Classif 33, 262–281 (2016). https://doi.org/10.1007/s00357-016-9205-7
DOI: https://doi.org/10.1007/s00357-016-9205-7