Discovery of “comet” communities in temporal and labeled graphs Com\(^2\)

  • Regular Paper
Knowledge and Information Systems

Abstract

While the analysis of unlabeled networks has been studied extensively in the past, finding patterns in different kinds of labeled graphs is still an open challenge. Given a large edge-labeled network, e.g., a time-evolving network, how can we find interesting patterns? We propose Com\(^2\), a novel, fast and incremental tensor analysis approach that can discover communities appearing over subsets of the labels. The method is (a) scalable, being linear in the input size, (b) general, (c) parameter-free, needing no user-defined parameters, and (d) effective, returning results that agree with intuition. We apply our method to real datasets, including a phone call network, a computer-traffic network and a flight information network. The phone call network consists of 4 million mobile users, with 51 million edges (phone calls), over 14 days, while the flights dataset consists of 7733 airports and 5995 airline companies flying 67,663 different routes. We show that Com\(^2\) spots intuitive patterns regarding edge labels that carry temporal or other discrete information. Our findings include large “star”-like patterns, near-bipartite cores, as well as tiny groups (five users) calling each other hundreds of times within a few days. We also show that we are able to automatically identify competing airline companies.


Notes

  1. We tested different seed-selection methods and found no significant differences in the results, since the subsequent growing and shrinking steps select the most relevant edges and remove the irrelevant ones. Selecting the edge \((i, j, k)\) with the highest \(\min(a_i, b_j, c_k)\) provides a good initial seed.
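
The seed rule in the note above can be illustrated with a short sketch. It is not the authors' implementation: it assumes that \(a\), \(b\) and \(c\) are score vectors indexed by source node, destination node and edge label (for instance, the factors of a rank-1 tensor approximation), which this page does not spell out. The function name `select_seed` and the toy data are hypothetical.

```python
def select_seed(edges, a, b, c):
    """Return the edge (i, j, k) that maximises min(a[i], b[j], c[k]).

    Assumption: a, b and c hold one score per source node, destination
    node and edge label respectively; their origin is not specified here.
    """
    if not edges:
        raise ValueError("edges must be non-empty")
    # Score each edge by its weakest of the three scores; keep the best edge.
    return max(edges, key=lambda e: min(a[e[0]], b[e[1]], c[e[2]]))


# Toy data (hypothetical): three labeled edges and made-up score vectors.
edges = [(0, 1, 0), (1, 2, 1), (2, 0, 1)]
a = [0.9, 0.5, 0.1]   # source-node scores
b = [0.2, 0.8, 0.6]   # destination-node scores
c = [0.7, 0.4]        # label scores

seed = select_seed(edges, a, b, c)
print(seed)  # (0, 1, 0): its weakest score, 0.7, beats 0.4 and 0.1
```

Taking the minimum favours an edge whose source, destination and label all score well, so a single high score cannot promote an otherwise weak edge.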


Acknowledgments

This material is based upon work supported by the National Science Foundation under Grant Nos. IIS-1247489 and IIS-1217559. Research was sponsored by the Defense Threat Reduction Agency and was accomplished under contract No. HDTRA1-10-1-0120 and also sponsored by the Army Research Laboratory and was accomplished under Cooperative Agreement No. W911NF-09-2-0053. Additional funding was provided by the US Army Research Office (ARO) and Defense Advanced Research Projects Agency (DARPA) under Contract No. W911NF-11-C-0088. This work is also partially supported by a Google Focused Research Award, by the Fundação para a Ciência e a Tecnologia (Portuguese Foundation for Science and Technology) through the Carnegie Mellon Portugal Program under Grant SFRH/BD/52362/2013, by ERDF and FCT through the COMPETE Programme within project FCOMP-01-0124-FEDER-037281 and by a fellowship within the postdoc program of the German Academic Exchange Service (DAAD). Any opinions, findings and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation, DARPA or other funding parties. The US Government is authorized to reproduce and distribute reprints for Government purposes notwithstanding any copyright notation hereon.

Author information

Corresponding author

Correspondence to Miguel Araujo.

Cite this article

Araujo, M., Günnemann, S., Papadimitriou, S. et al. Discovery of “comet” communities in temporal and labeled graphs Com\(^2\). Knowl Inf Syst 46, 657–677 (2016). https://doi.org/10.1007/s10115-015-0847-2

  • DOI: https://doi.org/10.1007/s10115-015-0847-2
