A Multi-objective Approach for Inter-cluster and Intra-cluster Distance Analysis for Numeric Data

  • Conference paper
Soft Computing: Theories and Applications

Part of the book series: Lecture Notes in Networks and Systems (LNNS, volume 425)

Abstract

How closely a set of data items is connected is measured by the similarity or dissimilarity between the items. Inter-cluster and intra-cluster distances are two metrics that indicate how data items group to form clusters. Data items with high similarity usually fall in the same cluster, and the distances between them are small. A good clustering algorithm maintains high similarity within each cluster while maximizing the distance between dissimilar data items in distinct clusters. These measures play a crucial role in identifying patterns among the data objects. Euclidean distance serves as the distance measure for computing both metrics. This paper proposes a multi-objective approach to clustering that establishes the relationship between inter-cluster and intra-cluster distances. It performs a comparative analysis of the sum of inter-cluster distances and the sum of intra-cluster distances. The analysis reveals that, for a given dataset, the sum of inter-cluster distances is maximized when the sum of intra-cluster distances is minimized.
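The two quantities compared in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact formulas, so the definitions here are assumptions: the intra-cluster sum is taken as the total Euclidean distance from each point to its own cluster centroid, and the inter-cluster sum as the total Euclidean distance between every pair of cluster centroids. The function names and the toy data are hypothetical and are not taken from the paper.

```python
import math
from itertools import combinations


def centroid(points):
    """Component-wise mean of a non-empty list of equal-length numeric tuples."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))


def intra_cluster_sum(clusters):
    """Assumed definition: sum of Euclidean distances from each point to its cluster centroid."""
    total = 0.0
    for points in clusters:
        c = centroid(points)
        total += sum(math.dist(p, c) for p in points)
    return total


def inter_cluster_sum(clusters):
    """Assumed definition: sum of Euclidean distances between every pair of centroids."""
    centroids = [centroid(points) for points in clusters]
    return sum(math.dist(a, b) for a, b in combinations(centroids, 2))


# Toy 2-D data invented for illustration: the same four points, partitioned two ways.
compact = [[(0.0, 0.0), (0.0, 2.0)], [(10.0, 0.0), (10.0, 2.0)]]
stretched = [[(0.0, 0.0), (10.0, 0.0)], [(0.0, 2.0), (10.0, 2.0)]]

for name, clusters in (("compact", compact), ("stretched", stretched)):
    print(name, intra_cluster_sum(clusters), inter_cluster_sum(clusters))
```

On this toy data the compact partition has the smaller intra-cluster sum and the larger inter-cluster sum, which matches the direction of the relationship the abstract reports. A single hand-built example does not establish that relationship in general.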

Author information

Corresponding author

Correspondence to A. Binu Jose.

Copyright information

© 2022 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Binu Jose, A., Das, P. (2022). A Multi-objective Approach for Inter-cluster and Intra-cluster Distance Analysis for Numeric Data. In: Kumar, R., Ahn, C.W., Sharma, T.K., Verma, O.P., Agarwal, A. (eds) Soft Computing: Theories and Applications. Lecture Notes in Networks and Systems, vol 425. Springer, Singapore. https://doi.org/10.1007/978-981-19-0707-4_30
