Data Mining and Knowledge Discovery

, Volume 27, Issue 3, pp 294–320 | Cite as

ABACUS: frequent pAttern mining-BAsed Community discovery in mUltidimensional networkS

  • Michele Berlingerio
  • Fabio Pinelli
  • Francesco Calabrese
Article

Abstract

Community discovery in complex networks is the problem of detecting, for each node of the network, its membership to one of more groups of nodes, the communities, that are densely connected, or highly interactive, or, more in general, similar, according to a similarity function. So far, the problem has been widely studied in monodimensional networks, i.e. networks where only one connection between two entities may exist. However, real networks are often multidimensional, i.e., multiple connections between any two nodes may exist, either reflecting different kinds of relationships, or representing different values of the same type of tie. In this context, the problem of community discovery has to be redefined, taking into account multidimensional structure of the graph. We define a new concept of community that groups together nodes sharing memberships to the same monodimensional communities in the different single dimensions. As we show, such communities are meaningful and able to group nodes even if they might not be connected in any of the monodimensional networks. We devise frequent pAttern mining-BAsed Community discoverer in mUltidimensional networkS (ABACUS), an algorithm that is able to extract multidimensional communities based on the extraction of frequent closed itemsets from monodimensional community memberships. Experiments on two different real multidimensional networks confirm the meaningfulness of the introduced concepts, and open the way for a new class of algorithms for community discovery that do not rely on the dense connections among nodes.

Keywords

Community discovery Multidimensional networks  Social network analysis 

References

  1. Ahn YY, Bagrow JP, Lehmann S (2010) Link communities reveal multi-scale complexity in networks. Nature 466:761–764Google Scholar
  2. Balcan D, Colizza V, Goncalves B, Hu H, Ramasco JJ, Vespignani A (2009) Multiscale mobility networks and the spatial spreading of infectious diseases. Proc Natl Acad Sci USA 106(51):21484–21489Google Scholar
  3. Bastide Y, Pasquier N, Taouil R, Stumme G, Lakhal L (2000) Mining minimal non-redundant association rules using frequent closed itemsets. In: Computational logic, pp 972–986Google Scholar
  4. Benevenuto F, Rodrigues T, Cha M, Almeida VAF (2009) Characterizing user behavior in online social networks. In: Internet measurement conference, pp 49–62Google Scholar
  5. Berlingerio M, Coscia M, Giannotti F (2011) Finding redundant and complementary communities in multidimensional networks. In: ACM conference on information and knowledge management, pp 2181–2184Google Scholar
  6. Berlingerio M, Coscia M, Giannotti F, Monreale A, Pedreschi D (2011a) Foundations of multidimensional network analysis. In: International conference on advances in social networks analysis and mining, pp 485–489Google Scholar
  7. Berlingerio M, Coscia M, Giannotti F, Monreale A, Pedreschi D (2011b) The pursuit of hubbiness: analysis of hubs in large multidimensional networks. J Comput Sci 2(3):223–237Google Scholar
  8. Berlingerio M, Coscia M, Giannotti F, Monreale A, Pedreschi D (2012) Multidimensional networks: foundations of structural analysis. In: World Wide Web, pp 1–27Google Scholar
  9. Blondel VD, Guillaume JL, Lambiotte R, Lefebvre E (2008) Fast unfolding of communities in large networks. theory and experiment. J Stat Mech 10:P10008Google Scholar
  10. Bonchi F, Lucchese C (2005) Pushing tougher constraints in frequent pattern mining. Adv Knowl Discov Data Mining 3518:173–202CrossRefGoogle Scholar
  11. Bonchi F, Giannotti F, Mazzanti A, Pedreschi D (2005) Efficient breadth-first mining of frequent pattern with monotone constraints. J Knowl Inf Syst 8(2):131–153Google Scholar
  12. Borgelt C (2003) Efficient implementations of apriori and eclat. In: IEEE ICDM workshop on frequent item set mining implementations, p 90Google Scholar
  13. Bringmann B, Zimmermann A (2007) The chosen few: on identifying valuable patterns. In: IEEE international conference on data mining, pp 63–72Google Scholar
  14. Bringmann B, Berlingerio M, Bonchi F, Gionis A (2010) Learning and predicting the evolution of social networks. IEEE Intel Syst 25(4):26–35Google Scholar
  15. Cai D, Shao Z, He X, Yan X, Han J (2005) Community mining from multi-relational networks. In: The European conference on machine learning and principles and practice of knowledge discovery in databases, pp 445–452Google Scholar
  16. Calabrese F, Dahlem D, Gerber A, Paul D, Chen X, Rowland J, Rath C, Ratti C (2011) The connected states of America: quantifying social radii of influence. In: IEEE international conference on social computing (SocialCom)Google Scholar
  17. Cerf L, Besson J, Robardet C, Boulicaut JF (2009a) Closed patterns meet n-ary relations. ACM Trans Knowl Discov Data 3(1):1–36Google Scholar
  18. Cerf L, Nguyen TBN, Boulicaut JF (2009b) Discovering relevant cross-graph cliques in dynamic networks. In: International symposium on methodologies for intelligent systems, pp 513–522Google Scholar
  19. Clauset A, Newman MEJ, Moore C (2004) Finding community structure in very large networks. Phys Rev E 70:066111Google Scholar
  20. Cook DJ, Crandall AS, Singla G, Thomas B (2010) Detection of social interaction in smart spaces. Cybern Syst 41(2):90–104MATHCrossRefGoogle Scholar
  21. Fortunato S (2010) Community detection in graphs. Phys Rep 486(3–5):75–174MathSciNetCrossRefGoogle Scholar
  22. Francisco AP, Baeza-Yates RA, Oliveira AL (2008) Clique analysis of query log graphs. In: String processing and information retrieval, pp 188–199Google Scholar
  23. Goyal A, Bonchi F, Lakshmanan LV (2008) Discovering leaders from community actions. In: ACM conference on information and knowledge management, pp 499–508Google Scholar
  24. Gregory S (2009) Finding overlapping communities using disjoint community detection algorithms. Complex Netw 2009:47–61CrossRefGoogle Scholar
  25. Günnemann S, Färber I, Boden B, Seidl T (2010) Subspace clustering meets dense subgraph mining: a synthesis of two paradigms. In: IEEE international conference on data mining, pp 845–850Google Scholar
  26. Huang Y, Sun L, Nie JY (2010) Query model refinement using word graphs. In: ACM conference on information and, knowledge management, pp 1453–1456Google Scholar
  27. Leskovec J, Huttenlocher D, Kleinberg J (2010a) Predicting positive and negative links in online social networks. In: ACM international conference on World Wide Web, pp 641–650Google Scholar
  28. Leskovec J, Lang KJ, Mahoney MW (2010b) Empirical comparison of algorithms for network community detection. In: ACM international conference on World Wide Web, pp 631–640Google Scholar
  29. Mongiovì M, Singh AK, Yan X, Zong B, Psounis K (2012) Efficient multicasting for delay tolerant networks using graph indexing. In: IEEE international conference on computer communications, pp 1386–1394Google Scholar
  30. Mougel PN, Rigotti C, Gandrillon O (2012) Finding collections of k-clique percolated components in attributed graphs. In: Advances in knowledge discovery and data mining, pp 181–192Google Scholar
  31. Mucha PJ, Richardson T, Macon K, Porter MA, Onnela J (2010) Community structure in time-dependent, multiscale, and multiplex networks. Science 328:876–878MathSciNetMATHCrossRefGoogle Scholar
  32. Newman MEJ (2003) The structure and function of complex networks. SIAM Rev 2:167–256CrossRefGoogle Scholar
  33. Nijssen S, Jiménez A, Guns T (2011) Constraint-based pattern mining in multi-relational databases. In: Workshops of the IEEE international conference on data mining, pp 1120–1127Google Scholar
  34. Palla G, Derenyi I, Farkas I, Vicsek T (2005) Uncovering the overlapping community structure of complex networks in nature and society. Nature 435(7043):814–818CrossRefGoogle Scholar
  35. Papadimitriou S, Sun J, Faloutsos C, Yu PS (2008) Hierarchical, parameter-free community discovery. In: The European conference on machine learning and principles and practice of knowledge discovery in databases, pp 170–187Google Scholar
  36. Pass G, Chowdhury A, Torgeson C (2006) A picture of search. In: ACM international conference on scalable, information systems, p 1Google Scholar
  37. Pei J, Han J (2000) Can we push more constraints into frequent pattern mining? In: ACM international conference on knowledge discovery and data mining, pp 350–354Google Scholar
  38. Pei J, Han J, Lakshmanan LVS (2001) Mining frequent item sets with convertible constraints. In: IEEE international conference on data engineering, pp 433–442Google Scholar
  39. Raghavan UN, Albert R, Kumara S (2007) Near linear time algorithm to detect community structures in large-scale networks. Phys Rev E 76:036106. doi:10.1103/PhysRevE.76.036106
  40. Rossetti G, Berlingerio M, Giannotti F (2011) Scalable link prediction on multidimensional networks. In: Workshops of the IEEE international conference on data mining, pp 979–986Google Scholar
  41. Rosvall M, Bergstrom CT (2008) Maps of random walks on complex networks reveal community structure. Proc Natl Acad Sci USA 105:1118–1123CrossRefGoogle Scholar
  42. Sun Y, Yu Y, Han J (2009) Ranking-based clustering of heterogeneous information networks with star network schema. In: ACM international conference on knowledge discovery and data mining, pp 797–806Google Scholar
  43. Szell M, Lambiotte R, Thurner S (2010) Multirelational organization of large-scale social networks in an online world. Proc Natl Acad Sci USA 107(31):13636–13641Google Scholar
  44. Tang L, Liu H (2009) Scalable learning of collective behavior based on sparse social dimensions. In: ACM conference on information and knowledge management, pp 1107–1116Google Scholar
  45. Trasarti R, Pinelli F, Nanni M, Giannotti F (2011) Mining mobility user profiles for car pooling. In: ACM international conference on knowledge discovery and data mining, pp 1190–1198Google Scholar
  46. Wang J, Zeng Z, Zhou L (2006) Clan: an algorithm for mining closed cliques from large dense graph databases. In: IEEE international conference on data engineering, p 73Google Scholar
  47. Yan X, Han J (2002) gspan: graph-based substructure pattern mining. In: IEEE international conference on data mining, pp 721–724Google Scholar
  48. Zeng Z, Wang J, Zhou L, Karypis G (2006) Coherent closed quasi-clique discovery from large dense graph databases. In: ACM international conference on knowledge discovery and data mining, pp 797–802Google Scholar

Copyright information

© The Author(s) 2013

Authors and Affiliations

  • Michele Berlingerio
    • 1
  • Fabio Pinelli
    • 1
  • Francesco Calabrese
    • 1
  1. 1.IBM ResearchDublinIreland

Personalised recommendations