Abstract
How closely a set of data items is connected can be measured by the similarity or dissimilarity between the items. Inter-cluster and intra-cluster distances are two metrics that indicate how data items group to form clusters. Usually, highly similar data items fall in the same cluster, and the distances between them are small. A good clustering algorithm maintains high similarity within each cluster while maximizing the distance between dissimilar data items in distinct clusters. These measures play a crucial role in identifying patterns among the data objects. Euclidean distance serves as the distance measure for determining both metrics. This paper proposes a multi-objective approach to clustering to establish the relationship between inter-cluster and intra-cluster distances. It performs a comparative analysis of the sums of inter-cluster and intra-cluster distances. The analysis reveals that, for a given dataset, when the sum of intra-cluster distances is minimized, the sum of inter-cluster distances is maximized.
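The two quantities compared in the abstract can be illustrated with a small sketch. The abstract does not give the exact formulas, so the definitions here are assumptions: the intra-cluster sum is taken as the total Euclidean distance from each point to its cluster centroid, and the inter-cluster sum as the total Euclidean distance over all pairs of centroids. The function names and the four-point toy dataset are hypothetical and are not taken from the paper.

```python
import math
from itertools import combinations

def centroid(points):
    """Component-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))

def cluster_centroids(points, labels, k):
    """Centroid of each cluster 0..k-1 (assumes no cluster is empty)."""
    return [centroid([p for p, lab in zip(points, labels) if lab == j])
            for j in range(k)]

def intra_cluster_sum(points, labels, centroids):
    """Assumed definition: sum of point-to-own-centroid Euclidean distances."""
    return sum(math.dist(p, centroids[lab]) for p, lab in zip(points, labels))

def inter_cluster_sum(centroids):
    """Assumed definition: sum of Euclidean distances over all centroid pairs."""
    return sum(math.dist(a, b) for a, b in combinations(centroids, 2))

# Hypothetical toy data: two tight groups that lie far apart.
points = [(0.0, 0.0), (0.0, 2.0), (10.0, 0.0), (10.0, 2.0)]

# Partition that keeps nearby points together.
good_labels = [0, 0, 1, 1]
good_centroids = cluster_centroids(points, good_labels, 2)
good_intra = intra_cluster_sum(points, good_labels, good_centroids)
good_inter = inter_cluster_sum(good_centroids)

# Partition that mixes the two groups.
bad_labels = [0, 1, 0, 1]
bad_centroids = cluster_centroids(points, bad_labels, 2)
bad_intra = intra_cluster_sum(points, bad_labels, bad_centroids)
bad_inter = inter_cluster_sum(bad_centroids)

print(good_intra, good_inter)  # 4.0 10.0
print(bad_intra, bad_inter)    # 20.0 2.0
```

On this toy dataset, the partition with the smaller intra-cluster sum also has the larger inter-cluster sum, which matches the direction of the relationship the abstract reports. It is a single hand-picked example under the assumed definitions, not a reproduction of the paper's experiments.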
Copyright information
© 2022 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Binu Jose, A., Das, P. (2022). A Multi-objective Approach for Inter-cluster and Intra-cluster Distance Analysis for Numeric Data. In: Kumar, R., Ahn, C.W., Sharma, T.K., Verma, O.P., Agarwal, A. (eds) Soft Computing: Theories and Applications. Lecture Notes in Networks and Systems, vol 425. Springer, Singapore. https://doi.org/10.1007/978-981-19-0707-4_30
DOI: https://doi.org/10.1007/978-981-19-0707-4_30
Published:
Publisher Name: Springer, Singapore
Print ISBN: 978-981-19-0706-7
Online ISBN: 978-981-19-0707-4
eBook Packages: Intelligent Technologies and Robotics (R0)