Compressing Streaming Graph Data Based on Triangulation
Graph compression has a wide range of applications in web data management, scientific data processing, and social data analysis. In real-life applications such as social media data processing, the elements of a graph, typically vertices and edges, arrive continuously. Compressing the graph before storing it in a database is important for real-time processing and analysis, and it is a challenging yet interesting problem. This paper introduces a streaming lossless compression method named STT (streaming timeliness triangulation). It is a time-efficient method for compressing a streaming graph that differs from static graph compression methods in three ways: (1) it can compress a streaming graph without occupying extra storage; (2) it achieves both a low compression ratio and high throughput over the streaming graph; (3) it supports efficient graph query processing directly over compressed graphs. It can therefore support a wide range of streaming graph processing tasks. An empirical study over a paper co-author graph and a real-life large-scale social network graph shows the superiority of the proposed method over existing static graph compression methods.
Keywords: Graph compression, Streaming data, Social graph, Graph query
This work is partially supported by the National High-tech R&D Program (863 Program) under grant number 2015AA015307, and the National Science Foundation of China under grant number 61432006.