Abstract
Identifying non-disjoint clusters is an important problem in clustering, referred to as overlapping clustering. Traditional clustering methods ignore the possibility that an observation can be assigned to several groups and produce k exhaustive and exclusive clusters representing the data, whereas overlapping clustering methods offer a richer model for fitting existing structures in applications that require a non-disjoint partitioning. Overlapping clustering has been studied for the last four decades, leading to several methods in the literature that adopt common approaches such as hierarchical, generative, graphical, and k-means-based approaches. In this paper we review the fundamental concepts of overlapping clustering and survey the widely known overlapping partitional clustering algorithms and the existing techniques for evaluating the quality of non-disjoint partitionings. Furthermore, we give a comparative theoretical and experimental study of the techniques used to model overlaps on different multi-labeled benchmarks.
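The distinction between an exclusive partition and a non-disjoint one can be illustrated with a small sketch. It assumes each observation is represented by the set of cluster indices it belongs to; the helper names (`is_exhaustive`, `is_exclusive`, `overlapping_observations`) and the toy data are hypothetical and are not taken from the chapter.

```python
from typing import List, Set


def is_exhaustive(assignments: List[Set[int]]) -> bool:
    """True when every observation belongs to at least one cluster."""
    return all(len(clusters) >= 1 for clusters in assignments)


def is_exclusive(assignments: List[Set[int]]) -> bool:
    """True when no observation belongs to more than one cluster."""
    return all(len(clusters) <= 1 for clusters in assignments)


def overlapping_observations(assignments: List[Set[int]]) -> List[int]:
    """Indices of observations assigned to several clusters."""
    return [i for i, clusters in enumerate(assignments) if len(clusters) > 1]


# Toy data (an assumption for illustration): four observations, three clusters.
hard_partition = [{0}, {0}, {1}, {2}]          # exhaustive and exclusive
overlapping_cover = [{0}, {0, 1}, {1}, {1, 2}]  # exhaustive but not exclusive
```

Under this representation, a traditional method always returns assignments for which both checks hold, while an overlapping method may return assignments for which `is_exclusive` fails.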
Notes
- 1.
The original objective function of PCM accounts for the identification of outliers; in this paper we give a simplified form of the objective function to facilitate the comparison of PCM with the other methods described.
- 2.
The original objective function of ECM accounts for the identification of outliers by considering \(\pi _{i\varnothing }\), a mass of belief that the observation does not belong to any cluster; in this paper we consider that all combinations of clusters are allowed except the empty set (\(A_{j}\neq \varnothing \)), in order to facilitate the comparison of ECM with the other methods described.
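As a rough illustration of the restriction stated in note 2, the sketch below enumerates every combination of clusters except the empty set. For k clusters there are 2^k - 1 such combinations. The function name `nonempty_cluster_subsets` is hypothetical, and the code only enumerates the admissible sets \(A_{j}\); it does not implement ECM.

```python
from itertools import combinations
from typing import FrozenSet, List


def nonempty_cluster_subsets(k: int) -> List[FrozenSet[int]]:
    """All non-empty subsets of the cluster indices 0..k-1, smallest first."""
    clusters = range(k)
    return [
        frozenset(subset)
        for size in range(1, k + 1)          # size 0 (the empty set) is skipped
        for subset in combinations(clusters, size)
    ]


# Example with three clusters: singletons, pairs, then the full set.
subsets_for_three = nonempty_cluster_subsets(3)
```

The count grows exponentially with k, which is why the number of admissible combinations matters when comparing methods of this kind.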
Copyright information
© 2015 Springer International Publishing Switzerland
About this chapter
Cite this chapter
N’Cir, CE.B., Cleuziou, G., Essoussi, N. (2015). Overview of Overlapping Partitional Clustering Methods. In: Celebi, M. (eds) Partitional Clustering Algorithms. Springer, Cham. https://doi.org/10.1007/978-3-319-09259-1_8
DOI: https://doi.org/10.1007/978-3-319-09259-1_8
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-09258-4
Online ISBN: 978-3-319-09259-1
eBook Packages: Engineering, Engineering (R0)