Abstract
This paper proposes a new data clustering algorithm based on data depth. In the proposed algorithm, the centroids of the K clusters are calculated using the Mahalanobis data depth method. The algorithm, called the K-Data Depth Based Clustering Algorithm (K-DBCA), is evaluated in R using datasets from the mlbench package and from the UCI Machine Learning Repository; it yields good clustering results and is robust to outliers. In addition, it is invariant to affine transformations. It is also tested on face recognition, where it yields better accuracy.
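The abstract does not spell out the depth formula, so the following is only a minimal sketch of the general idea, not the authors' K-DBCA implementation. It assumes the standard definition of Mahalanobis depth, 1 / (1 + d^2), where d^2 is the squared Mahalanobis distance from a point to the sample mean, and it assumes that a cluster's centre is taken to be the deepest sample point. The function names `mahalanobis_depth_2d` and `deepest_point` are hypothetical, the code is restricted to two dimensions to stay short, and it is written in Python rather than the R used in the paper.

```python
from statistics import fmean


def mahalanobis_depth_2d(point, sample):
    """Mahalanobis depth of `point` with respect to a 2-D sample.

    Assumed definition: depth = 1 / (1 + d^2), with d^2 the squared
    Mahalanobis distance from `point` to the sample mean.
    """
    n = len(sample)
    mx = fmean(p[0] for p in sample)
    my = fmean(p[1] for p in sample)

    # Entries of the sample covariance matrix (denominator n - 1).
    sxx = sum((p[0] - mx) ** 2 for p in sample) / (n - 1)
    syy = sum((p[1] - my) ** 2 for p in sample) / (n - 1)
    sxy = sum((p[0] - mx) * (p[1] - my) for p in sample) / (n - 1)

    det = sxx * syy - sxy ** 2
    if det <= 0:
        raise ValueError("covariance matrix is singular")

    dx, dy = point[0] - mx, point[1] - my
    # Squared Mahalanobis distance via the explicit 2x2 matrix inverse.
    d2 = (syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det
    return 1.0 / (1.0 + d2)


def deepest_point(sample):
    """Return the sample point with the largest depth (assumed cluster centre)."""
    return max(sample, key=lambda p: mahalanobis_depth_2d(p, sample))


# Hypothetical toy cluster: five nearby points plus one distant point.
cluster = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0), (0.5, 0.5), (8.0, 9.0)]
center = deepest_point(cluster)
depths = [mahalanobis_depth_2d(p, cluster) for p in cluster]
```

Because the centre chosen this way is always an actual data point, a distant observation such as `(8.0, 9.0)` cannot become the centre, although it still influences the mean and covariance used inside the depth.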
Copyright information
© 2019 Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Baidari, I., Patil, C. (2019). K-Data Depth Based Clustering Algorithm. In: Verma, N., Ghosh, A. (eds) Computational Intelligence: Theories, Applications and Future Directions - Volume I. Advances in Intelligent Systems and Computing, vol 798. Springer, Singapore. https://doi.org/10.1007/978-981-13-1132-1_2
DOI: https://doi.org/10.1007/978-981-13-1132-1_2
Published:
Publisher Name: Springer, Singapore
Print ISBN: 978-981-13-1131-4
Online ISBN: 978-981-13-1132-1
eBook Packages: Intelligent Technologies and Robotics (R0)