Abstract
While the analysis of unlabeled networks has been studied extensively, finding patterns in different kinds of labeled graphs is still an open challenge. Given a large edge-labeled network, e.g., a time-evolving network, how can we find interesting patterns? We propose Com\(^2\), a novel, fast and incremental tensor analysis approach which can discover communities appearing over subsets of the labels. The method is (a) scalable, being linear in the input size, (b) general, (c) parameter-free, needing no user-defined parameters, and (d) effective, returning results that agree with intuition. We apply our method to real datasets, including a phone call network, a computer-traffic network and a flight information network. The phone call network consists of 4 million mobile users, with 51 million edges (phone calls), over 14 days, while the flights dataset consists of 7733 airports and 5995 airline companies flying 67,663 different routes. We show that Com\(^2\) spots intuitive patterns regarding edge labels that carry temporal or other discrete information. Our findings include large “star”-like patterns, near-bipartite cores, as well as tiny groups (five users) calling each other hundreds of times within a few days. We also show that we are able to automatically identify competing airline companies.
Notes
We tested different seed-selection methods and found no significant differences in the results, since the subsequent growing and shrinking steps select the most relevant edges and remove the irrelevant ones. Selecting the edge \((i, j, k)\) with the highest \(\min(a_i, b_j, c_k)\) provides a good initial seed.
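The seed rule in this note can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation. It assumes that `a`, `b` and `c` are non-negative per-mode score vectors (for example, from a rank-1 approximation of the edge tensor) and that edges are given as integer index triples. The function name `select_seed_edge` and the toy values are hypothetical.

```python
import numpy as np

def select_seed_edge(edges, a, b, c):
    """Return the edge (i, j, k) with the highest min(a[i], b[j], c[k]) and its score.

    Assumption: a, b, c are per-mode score vectors indexed by the first,
    second and third component of each edge, respectively.
    """
    edges = np.asarray(edges, dtype=int)
    if edges.ndim != 2 or edges.shape[0] == 0 or edges.shape[1] != 3:
        raise ValueError("edges must be a non-empty list of (i, j, k) triples")
    a, b, c = (np.asarray(v, dtype=float) for v in (a, b, c))
    # Score every edge by its weakest of the three mode scores.
    scores = np.minimum(np.minimum(a[edges[:, 0]], b[edges[:, 1]]), c[edges[:, 2]])
    best = int(np.argmax(scores))  # first maximum wins on ties
    return tuple(int(x) for x in edges[best]), float(scores[best])

# Toy example with made-up scores.
toy_edges = [(0, 0, 0), (0, 1, 0), (2, 2, 1), (1, 1, 0)]
seed, score = select_seed_edge(toy_edges, [0.9, 0.1, 0.5], [0.2, 0.8, 0.6], [0.7, 0.3])
print(seed, score)
```

Taking the minimum over the three modes favors an edge whose source, destination and label all score well, so a single strong component cannot dominate the choice.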
Acknowledgments
This material is based upon work supported by the National Science Foundation under Grant Nos. IIS-1247489 and IIS-1217559. Research was sponsored by the Defense Threat Reduction Agency and was accomplished under contract No. HDTRA1-10-1-0120 and also sponsored by the Army Research Laboratory and was accomplished under Cooperative Agreement No. W911NF-09-2-0053. Additional funding was provided by the US Army Research Office (ARO) and Defense Advanced Research Projects Agency (DARPA) under Contract No. W911NF-11-C-0088. This work is also partially supported by a Google Focused Research Award, by the Fundação para a Ciência e a Tecnologia (Portuguese Foundation for Science and Technology) through the Carnegie Mellon Portugal Program under Grant SFRH/BD/52362/2013, by ERDF and FCT through the COMPETE Programme within project FCOMP-01-0124-FEDER-037281 and by a fellowship within the postdoc program of the German Academic Exchange Service (DAAD). Any opinions, findings and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation, DARPA or other funding parties. The US Government is authorized to reproduce and distribute reprints for Government purposes notwithstanding any copyright notation here on.
Cite this article
Araujo, M., Günnemann, S., Papadimitriou, S. et al. Discovery of “comet” communities in temporal and labeled graphs Com\(^2\). Knowl Inf Syst 46, 657–677 (2016). https://doi.org/10.1007/s10115-015-0847-2
DOI: https://doi.org/10.1007/s10115-015-0847-2