Evaluation of Community Mining Algorithms in the Presence of Attributes

Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9441)


Grouping data points, commonly known as clustering, is one of the fundamental tasks in data mining. For interrelated data, represented as nodes and the relationships between them, such a group is referred to as a community. A community is often defined based on the connectivity of nodes rather than on their attributes or features. The variety of definitions and methods, together with the subjective nature of the task, makes the evaluation of community mining methods non-trivial. In this paper we point out critical issues in common evaluation practices and discuss alternatives. In particular, we focus on the common practice of using attributes as the ground-truth communities in large real networks. We suggest treating these attributes as another source of information, and using them to refine the communities and tune parameters.


Network clusters; Community mining; Networks with attributes; Community evaluation; Community validation



Copyright information

© Springer International Publishing Switzerland 2015

Authors and Affiliations

  1. Department of Computing Science, University of Alberta, Edmonton, Canada
