Abstract
Internet-based applications now collect and distribute large datasets, most of which are modeled as massive graphs. One way to process such graphs is summarization. Graphs come in two kinds: stationary graphs and graph streams. Several algorithms exist for summarizing stationary graphs; however, no comprehensive method has been devised for summarizing graph streams. The difficulty lies in two properties of a graph stream: its high data volume and the continuous change of its data over time. To tackle these challenges, we propose a novel method based on the sliding window model that performs summarization using both the structure and the vertex attributes of the input graph stream. We devise a new structure for a summary graph that accounts for both structural and semantic attributes and can therefore better describe every heterogeneous summary graph. Moreover, our framework comprises innovative components for comparing hybrid summary graphs. To the best of our knowledge, this is the first method that summarizes a graph stream using both the structure and the vertex attributes with varying contributions. Our approach also takes user directions and ontology into account. To study the efficiency and effectiveness of the proposed method, we conduct extensive experiments on two real-life datasets: American political web-logs and Amazon co-purchasing products. The experimental results confirm that the proposed method generates graph summaries of better quality than existing approaches. The expected time complexity of the proposed method, \(O(n^3)\), is a significant improvement over the best existing complexity, \(O(n^5)\).
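The two ideas named in the abstract, a sliding window over the stream and a similarity that mixes structure and vertex attributes with varying contributions, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the class `SlidingWindowGraph`, the function `hybrid_similarity`, the weight `alpha`, the use of Jaccard overlap for the structural part, and the fraction of matching attribute values for the semantic part.

```python
from collections import deque, defaultdict


class SlidingWindowGraph:
    """Hypothetical helper: keeps only edges from the last `width` time units."""

    def __init__(self, width):
        self.width = width
        self.edges = deque()  # (timestamp, u, v), oldest first

    def add_edge(self, t, u, v):
        """Append a new edge and expire edges that slid out of (t - width, t]."""
        self.edges.append((t, u, v))
        while self.edges and self.edges[0][0] <= t - self.width:
            self.edges.popleft()

    def neighbors(self):
        """Build an undirected adjacency map from the edges in the window."""
        adj = defaultdict(set)
        for _, u, v in self.edges:
            adj[u].add(v)
            adj[v].add(u)
        return adj


def jaccard(a, b):
    """Jaccard overlap of two sets; 0.0 when both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def hybrid_similarity(u, v, adj, attrs, alpha):
    """Assumed blend: alpha * structural + (1 - alpha) * attribute similarity."""
    # Structural part: overlap of the two neighborhoods, ignoring u and v.
    structural = jaccard(adj.get(u, set()) - {v}, adj.get(v, set()) - {u})
    # Attribute part: share of attribute keys on which both vertices agree.
    au, av = attrs.get(u, {}), attrs.get(v, {})
    keys = set(au) | set(av)
    matching = sum(1 for k in keys if k in au and k in av and au[k] == av[k])
    semantic = matching / len(keys) if keys else 0.0
    return alpha * structural + (1 - alpha) * semantic
```

Under these assumptions, setting `alpha` to 1 scores vertex pairs on structure alone and setting it to 0 scores them on attributes alone; vertices with a high score would be candidates for merging into the same super-node of the window's summary.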
Cite this article
Ashrafi-Payaman, N., Kangavari, M.R., Hosseini, S. et al. GS4: Graph stream summarization based on both the structure and semantics. J Supercomput 77, 2713–2733 (2021). https://doi.org/10.1007/s11227-020-03290-2
DOI: https://doi.org/10.1007/s11227-020-03290-2
Keywords
- Graph stream summarization
- Attributed graph
- Summary graph
- Super-node
- Super-edge