Relative Validity Criteria for Community Mining Algorithms

  • Reference work entry
Encyclopedia of Social Network Analysis and Mining

Synonyms

Clustering evaluation; Clustering objective function; Community mining; Evaluation approaches; Graph clustering; Graph partitioning; Quality measures

Glossary

ARI (adjusted Rand index):

Measures similarity of two clusterings based on pair counts

Community Structure:

Clustering structure underlying a network which models regions of densely connected nodes

External Evaluation:

Compares a clustering against the ground-truth clustering

Internal Evaluation:

Matches a clustering with the structure of data

Network:

A graph of interconnected nodes which models relationships in data

NMI (normalized mutual information):

Measures similarity of two clusterings

Relative Evaluation:

Ranks different clusterings of the same dataset
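
The two agreement measures in the glossary can be illustrated with a minimal sketch. It assumes each clustering is a flat list of labels over the same items; the function names are illustrative and do not come from the entry. The ARI follows the usual pair-counting form, and the NMI is normalized by the arithmetic mean of the two entropies, which is only one of several normalizations in use and is an assumption here.

```python
from collections import Counter
from math import comb, log


def contingency(a, b):
    """Count items per (cluster in a, cluster in b) pair, plus the marginals."""
    if len(a) != len(b):
        raise ValueError("both clusterings must label the same items")
    return Counter(zip(a, b)), Counter(a), Counter(b)


def adjusted_rand_index(a, b):
    """Pair-counting agreement between two clusterings, corrected for chance."""
    joint, rows, cols = contingency(a, b)
    total_pairs = comb(len(a), 2)
    if total_pairs == 0:
        return 1.0  # fewer than two items: nothing to disagree on
    # Pairs placed together in both clusterings, in a only, and in b only.
    together_both = sum(comb(c, 2) for c in joint.values())
    together_a = sum(comb(c, 2) for c in rows.values())
    together_b = sum(comb(c, 2) for c in cols.values())
    expected = together_a * together_b / total_pairs
    maximum = (together_a + together_b) / 2
    if maximum == expected:
        return 1.0  # degenerate case: both clusterings are trivial
    return (together_both - expected) / (maximum - expected)


def normalized_mutual_information(a, b):
    """Mutual information divided by the mean entropy (assumed normalization)."""
    joint, rows, cols = contingency(a, b)
    n = len(a)
    mi = sum(
        c / n * log(c * n / (rows[i] * cols[j]))
        for (i, j), c in joint.items()
    )
    h_a = -sum(c / n * log(c / n) for c in rows.values())
    h_b = -sum(c / n * log(c / n) for c in cols.values())
    if h_a + h_b == 0:
        return 1.0  # both clusterings put every item in one cluster
    return 2 * mi / (h_a + h_b)


if __name__ == "__main__":
    found = [0, 0, 0, 1, 1, 1]
    truth = [1, 1, 1, 0, 0, 0]  # same grouping, different label names
    print(adjusted_rand_index(found, truth))
    print(normalized_mutual_information(found, truth))
```

Both functions depend only on which items are grouped together, not on the label names, so they suit external evaluation against a ground-truth clustering whose labels are arbitrary.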

Definition

Grouping data points is one of the fundamental tasks in data mining, which is commonly known as clustering if data points are described by attributes. When dealing with interrelated data, data represented in the form of nodes and their relationships and...


Acknowledgments

The authors are grateful for the support from Alberta Innovates Centre for Machine Learning and NSERC. Ricardo Campello also acknowledges the financial support of Fapesp and CNPq.

Author information

Corresponding author

Correspondence to Reihaneh Rabbany.

Copyright information

© 2018 Springer Science+Business Media LLC, part of Springer Nature

About this entry

Cite this entry

Rabbany, R., Takaffoli, M., Fagnan, J., Zaïane, O.R., Campello, R. (2018). Relative Validity Criteria for Community Mining Algorithms. In: Alhajj, R., Rokne, J. (eds) Encyclopedia of Social Network Analysis and Mining. Springer, New York, NY. https://doi.org/10.1007/978-1-4939-7131-2_356
