Skip to main content
Log in

ABACUS: frequent pAttern mining-BAsed Community discovery in mUltidimensional networkS

  • Published:
Data Mining and Knowledge Discovery Aims and scope Submit manuscript

Abstract

Community discovery in complex networks is the problem of detecting, for each node of the network, its membership to one of more groups of nodes, the communities, that are densely connected, or highly interactive, or, more in general, similar, according to a similarity function. So far, the problem has been widely studied in monodimensional networks, i.e. networks where only one connection between two entities may exist. However, real networks are often multidimensional, i.e., multiple connections between any two nodes may exist, either reflecting different kinds of relationships, or representing different values of the same type of tie. In this context, the problem of community discovery has to be redefined, taking into account multidimensional structure of the graph. We define a new concept of community that groups together nodes sharing memberships to the same monodimensional communities in the different single dimensions. As we show, such communities are meaningful and able to group nodes even if they might not be connected in any of the monodimensional networks. We devise frequent pAttern mining-BAsed Community discoverer in mUltidimensional networkS (ABACUS), an algorithm that is able to extract multidimensional communities based on the extraction of frequent closed itemsets from monodimensional community memberships. Experiments on two different real multidimensional networks confirm the meaningfulness of the introduced concepts, and open the way for a new class of algorithms for community discovery that do not rely on the dense connections among nodes.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Fig. 1
Fig. 2
Fig. 3
Fig. 4
Fig. 5
Fig. 6
Fig. 7
Fig. 8
Fig. 9
Fig. 10
Fig. 11
Fig. 12
Fig. 13
Fig. 14

Similar content being viewed by others

Notes

  1. http://dblp.uni-trier.de/xml

  2. http://www.gregsadetsky.com/aol-data

  3. http://igraph.sourceforge.net

  4. https://code.launchpad.net/louvain

References

  • Ahn YY, Bagrow JP, Lehmann S (2010) Link communities reveal multi-scale complexity in networks. Nature 466:761–764

    Google Scholar 

  • Balcan D, Colizza V, Goncalves B, Hu H, Ramasco JJ, Vespignani A (2009) Multiscale mobility networks and the spatial spreading of infectious diseases. Proc Natl Acad Sci USA 106(51):21484–21489

    Google Scholar 

  • Bastide Y, Pasquier N, Taouil R, Stumme G, Lakhal L (2000) Mining minimal non-redundant association rules using frequent closed itemsets. In: Computational logic, pp 972–986

  • Benevenuto F, Rodrigues T, Cha M, Almeida VAF (2009) Characterizing user behavior in online social networks. In: Internet measurement conference, pp 49–62

  • Berlingerio M, Coscia M, Giannotti F (2011) Finding redundant and complementary communities in multidimensional networks. In: ACM conference on information and knowledge management, pp 2181–2184

  • Berlingerio M, Coscia M, Giannotti F, Monreale A, Pedreschi D (2011a) Foundations of multidimensional network analysis. In: International conference on advances in social networks analysis and mining, pp 485–489

  • Berlingerio M, Coscia M, Giannotti F, Monreale A, Pedreschi D (2011b) The pursuit of hubbiness: analysis of hubs in large multidimensional networks. J Comput Sci 2(3):223–237

    Google Scholar 

  • Berlingerio M, Coscia M, Giannotti F, Monreale A, Pedreschi D (2012) Multidimensional networks: foundations of structural analysis. In: World Wide Web, pp 1–27

  • Blondel VD, Guillaume JL, Lambiotte R, Lefebvre E (2008) Fast unfolding of communities in large networks. theory and experiment. J Stat Mech 10:P10008

    Google Scholar 

  • Bonchi F, Lucchese C (2005) Pushing tougher constraints in frequent pattern mining. Adv Knowl Discov Data Mining 3518:173–202

    Article  Google Scholar 

  • Bonchi F, Giannotti F, Mazzanti A, Pedreschi D (2005) Efficient breadth-first mining of frequent pattern with monotone constraints. J Knowl Inf Syst 8(2):131–153

    Google Scholar 

  • Borgelt C (2003) Efficient implementations of apriori and eclat. In: IEEE ICDM workshop on frequent item set mining implementations, p 90

  • Bringmann B, Zimmermann A (2007) The chosen few: on identifying valuable patterns. In: IEEE international conference on data mining, pp 63–72

  • Bringmann B, Berlingerio M, Bonchi F, Gionis A (2010) Learning and predicting the evolution of social networks. IEEE Intel Syst 25(4):26–35

    Google Scholar 

  • Cai D, Shao Z, He X, Yan X, Han J (2005) Community mining from multi-relational networks. In: The European conference on machine learning and principles and practice of knowledge discovery in databases, pp 445–452

  • Calabrese F, Dahlem D, Gerber A, Paul D, Chen X, Rowland J, Rath C, Ratti C (2011) The connected states of America: quantifying social radii of influence. In: IEEE international conference on social computing (SocialCom)

  • Cerf L, Besson J, Robardet C, Boulicaut JF (2009a) Closed patterns meet n-ary relations. ACM Trans Knowl Discov Data 3(1):1–36

    Google Scholar 

  • Cerf L, Nguyen TBN, Boulicaut JF (2009b) Discovering relevant cross-graph cliques in dynamic networks. In: International symposium on methodologies for intelligent systems, pp 513–522

  • Clauset A, Newman MEJ, Moore C (2004) Finding community structure in very large networks. Phys Rev E 70:066111

    Google Scholar 

  • Cook DJ, Crandall AS, Singla G, Thomas B (2010) Detection of social interaction in smart spaces. Cybern Syst 41(2):90–104

    Article  MATH  Google Scholar 

  • Fortunato S (2010) Community detection in graphs. Phys Rep 486(3–5):75–174

    Article  MathSciNet  Google Scholar 

  • Francisco AP, Baeza-Yates RA, Oliveira AL (2008) Clique analysis of query log graphs. In: String processing and information retrieval, pp 188–199

  • Goyal A, Bonchi F, Lakshmanan LV (2008) Discovering leaders from community actions. In: ACM conference on information and knowledge management, pp 499–508

  • Gregory S (2009) Finding overlapping communities using disjoint community detection algorithms. Complex Netw 2009:47–61

    Article  Google Scholar 

  • Günnemann S, Färber I, Boden B, Seidl T (2010) Subspace clustering meets dense subgraph mining: a synthesis of two paradigms. In: IEEE international conference on data mining, pp 845–850

  • Huang Y, Sun L, Nie JY (2010) Query model refinement using word graphs. In: ACM conference on information and, knowledge management, pp 1453–1456

  • Leskovec J, Huttenlocher D, Kleinberg J (2010a) Predicting positive and negative links in online social networks. In: ACM international conference on World Wide Web, pp 641–650

  • Leskovec J, Lang KJ, Mahoney MW (2010b) Empirical comparison of algorithms for network community detection. In: ACM international conference on World Wide Web, pp 631–640

  • Mongiovì M, Singh AK, Yan X, Zong B, Psounis K (2012) Efficient multicasting for delay tolerant networks using graph indexing. In: IEEE international conference on computer communications, pp 1386–1394

  • Mougel PN, Rigotti C, Gandrillon O (2012) Finding collections of k-clique percolated components in attributed graphs. In: Advances in knowledge discovery and data mining, pp 181–192

  • Mucha PJ, Richardson T, Macon K, Porter MA, Onnela J (2010) Community structure in time-dependent, multiscale, and multiplex networks. Science 328:876–878

    Article  MathSciNet  MATH  Google Scholar 

  • Newman MEJ (2003) The structure and function of complex networks. SIAM Rev 2:167–256

    Article  Google Scholar 

  • Nijssen S, Jiménez A, Guns T (2011) Constraint-based pattern mining in multi-relational databases. In: Workshops of the IEEE international conference on data mining, pp 1120–1127

  • Palla G, Derenyi I, Farkas I, Vicsek T (2005) Uncovering the overlapping community structure of complex networks in nature and society. Nature 435(7043):814–818

    Article  Google Scholar 

  • Papadimitriou S, Sun J, Faloutsos C, Yu PS (2008) Hierarchical, parameter-free community discovery. In: The European conference on machine learning and principles and practice of knowledge discovery in databases, pp 170–187

  • Pass G, Chowdhury A, Torgeson C (2006) A picture of search. In: ACM international conference on scalable, information systems, p 1

  • Pei J, Han J (2000) Can we push more constraints into frequent pattern mining? In: ACM international conference on knowledge discovery and data mining, pp 350–354

  • Pei J, Han J, Lakshmanan LVS (2001) Mining frequent item sets with convertible constraints. In: IEEE international conference on data engineering, pp 433–442

  • Raghavan UN, Albert R, Kumara S (2007) Near linear time algorithm to detect community structures in large-scale networks. Phys Rev E 76:036106. doi:10.1103/PhysRevE.76.036106

  • Rossetti G, Berlingerio M, Giannotti F (2011) Scalable link prediction on multidimensional networks. In: Workshops of the IEEE international conference on data mining, pp 979–986

  • Rosvall M, Bergstrom CT (2008) Maps of random walks on complex networks reveal community structure. Proc Natl Acad Sci USA 105:1118–1123

    Article  Google Scholar 

  • Sun Y, Yu Y, Han J (2009) Ranking-based clustering of heterogeneous information networks with star network schema. In: ACM international conference on knowledge discovery and data mining, pp 797–806

  • Szell M, Lambiotte R, Thurner S (2010) Multirelational organization of large-scale social networks in an online world. Proc Natl Acad Sci USA 107(31):13636–13641

    Google Scholar 

  • Tang L, Liu H (2009) Scalable learning of collective behavior based on sparse social dimensions. In: ACM conference on information and knowledge management, pp 1107–1116

  • Trasarti R, Pinelli F, Nanni M, Giannotti F (2011) Mining mobility user profiles for car pooling. In: ACM international conference on knowledge discovery and data mining, pp 1190–1198

  • Wang J, Zeng Z, Zhou L (2006) Clan: an algorithm for mining closed cliques from large dense graph databases. In: IEEE international conference on data engineering, p 73

  • Yan X, Han J (2002) gspan: graph-based substructure pattern mining. In: IEEE international conference on data mining, pp 721–724

  • Zeng Z, Wang J, Zhou L, Karypis G (2006) Coherent closed quasi-clique discovery from large dense graph databases. In: ACM international conference on knowledge discovery and data mining, pp 797–802

Download references

Acknowledgments

We would like to thank Mason Porter and Peter Mucha for helpful discussion around (Mucha et al. 2010), and Vincent Traag for providing us with a c++ implementation of GL.

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to Michele Berlingerio.

Additional information

Responsible editor: Hendrik Blockeel, Kristian Kersting, Siegfried Nijssen, Filip Zelezny.

Rights and permissions

Reprints and permissions

About this article

Cite this article

Berlingerio, M., Pinelli, F. & Calabrese, F. ABACUS: frequent pAttern mining-BAsed Community discovery in mUltidimensional networkS. Data Min Knowl Disc 27, 294–320 (2013). https://doi.org/10.1007/s10618-013-0331-0

Download citation

  • Received:

  • Accepted:

  • Published:

  • Issue Date:

  • DOI: https://doi.org/10.1007/s10618-013-0331-0

Keywords

Navigation