Overview of Overlapping Partitional Clustering Methods

Chapter in: Partitional Clustering Algorithms

Abstract

Identifying non-disjoint clusters is an important problem in clustering, referred to as Overlapping Clustering. Traditional clustering methods ignore the possibility that an observation can be assigned to several groups, and produce k exhaustive and exclusive clusters to represent the data. Overlapping Clustering methods offer a richer model that fits the existing structures in several applications requiring a non-disjoint partitioning. The problem has been studied for the last four decades, leading to several methods in the literature that adopt the usual approaches, such as hierarchical, generative, graphical and k-means based approaches. In this paper we review the fundamental concepts of overlapping clustering, and survey the widely known overlapping partitional clustering algorithms and the existing techniques for evaluating the quality of a non-disjoint partitioning. Furthermore, we give a comparative theoretical and experimental study of the techniques used to model overlaps, carried out on different multi-labeled benchmarks.
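
The contrast drawn in the abstract between exclusive and non-disjoint assignment can be illustrated with a minimal sketch. It is not one of the surveyed algorithms: the function names and the `tolerance` threshold rule are hypothetical and chosen only to show how one observation may end up in one cluster or in several.

```python
import math


def exclusive_assign(point, centroids):
    """Return the index of the single nearest centroid (disjoint partition)."""
    distances = [math.dist(point, c) for c in centroids]
    return distances.index(min(distances))


def overlapping_assign(point, centroids, tolerance=1.5):
    """Return the set of centroids whose distance is at most
    `tolerance` times the distance to the nearest centroid.

    Hypothetical threshold rule, used for illustration only.
    """
    distances = [math.dist(point, c) for c in centroids]
    nearest = min(distances)
    return {j for j, d in enumerate(distances) if d <= tolerance * nearest}


# Two cluster centers on a line and three observations.
centroids = [(0.0, 0.0), (4.0, 0.0)]
points = [(0.5, 0.0), (2.1, 0.0), (3.8, 0.0)]

for p in points:
    # The middle point gets one label under exclusive assignment,
    # but both labels under the overlapping rule.
    print(p, exclusive_assign(p, centroids), sorted(overlapping_assign(p, centroids)))
```

With these values, the observation lying roughly halfway between the two centers is the only one placed in both clusters. The two observations close to a center keep a single label under either rule.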

Notes

  1. The original objective function of PCM takes into account the identification of outliers, whereas this paper gives a simplified form of the objective function to facilitate the comparison of PCM with the other described methods.

  2. The original objective function of ECM takes into account the identification of outliers by considering \(\pi _{i\varnothing }\), the mass of belief allocated to the empty set (that is, to belonging to no cluster), whereas this paper considers that all combinations of clusters are tolerated except the empty set (\(A_{j}\neq \varnothing \)), in order to facilitate the comparison of ECM with the other described methods.

  3. cf. http://www.grouplens.org/node/76

  4. cf. http://mlkd.csd.auth.gr/multilabel.html

  5. cf. http://mlkd.csd.auth.gr/multilabel.html

  6. cf. http://mlkd.csd.auth.gr/multilabel.html

Corresponding author

Correspondence to Guillaume Cleuziou.

Copyright information

© 2015 Springer International Publishing Switzerland

About this chapter

Cite this chapter

N’Cir, CE.B., Cleuziou, G., Essoussi, N. (2015). Overview of Overlapping Partitional Clustering Methods. In: Celebi, M. (eds) Partitional Clustering Algorithms. Springer, Cham. https://doi.org/10.1007/978-3-319-09259-1_8

  • DOI: https://doi.org/10.1007/978-3-319-09259-1_8

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-09258-4

  • Online ISBN: 978-3-319-09259-1

  • eBook Packages: Engineering, Engineering (R0)
