Fast Construction of Compressed Web Graphs
Several compressed graph representations have been proposed over the last 15 years. Today, these representations are highly relevant in practice, since they make it possible to keep large-scale web and social graphs in the main memory of a single machine and thereby provide fast random access to nodes and edges.
While much effort has been spent on finding space-efficient and fast representations, one issue has been only partially addressed: developing resource-efficient construction algorithms. In this paper, we engineer the construction of regular and hybrid \(k^2\)-trees. We show that algorithms based on Z-order sorting reduce the memory footprint significantly and are at the same time faster than previous approaches. We also engineer a parallel version, which fully utilizes all CPUs and caches. We show the practicality of the parallel version by constructing partitioned hybrid \(k^2\)-trees for Web graphs on the scale of a billion nodes and up to 100 billion edges.
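To illustrate the general idea behind Z-order based construction, the following is a minimal sketch, not the implementation engineered in the paper. It assumes a regular tree with \(k = 2\), a square adjacency matrix of side \(2^{height}\), and edges given as (row, column) pairs. The function names `morton_key`, `z_order_sort`, and `build_k2_levels` are hypothetical. Each edge is mapped to a key by interleaving the bits of its row and column, and sorting by that key places edges of the same submatrix next to each other, so every tree level can be read off from key prefixes.

```python
def morton_key(row: int, col: int, height: int) -> int:
    """Interleave the low `height` bits of row and col (row bit is the more significant of each pair)."""
    key = 0
    for i in range(height):
        key |= ((row >> i) & 1) << (2 * i + 1)
        key |= ((col >> i) & 1) << (2 * i)
    return key


def z_order_sort(edges, height):
    """Return the edges sorted by their Z-order (Morton) key."""
    return sorted(edges, key=lambda e: morton_key(e[0], e[1], height))


def build_k2_levels(edges, height):
    """Build the per-level bit vectors of a k^2-tree with k = 2.

    Assumption for this sketch: every level, including the last, is stored
    as a plain list of 0/1 values, four bits per non-empty parent node.
    """
    keys = sorted({morton_key(r, c, height) for r, c in edges})
    levels = []
    parents = [0]  # the root covers the whole matrix
    for level in range(1, height + 1):
        shift = 2 * (height - level)
        # Distinct key prefixes of length 2 * level are the non-empty nodes.
        children = sorted({k >> shift for k in keys})
        present = set(children)
        bits = []
        for p in parents:
            for q in range(4):  # four quadrants per node, in Z-order
                bits.append(1 if ((p << 2) | q) in present else 0)
        levels.append(bits)
        parents = children
    return levels


edges = [(3, 3), (0, 1), (1, 0), (0, 0)]
sorted_edges = z_order_sort(edges, height=2)
levels = build_k2_levels(edges, height=2)
```

For this 4×4 example, the first level marks the top-left and bottom-right quadrants as non-empty, and the second level stores four bits for each of those two quadrants. A hybrid tree would use different values of \(k\) on different levels, which this sketch does not cover.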
Keywords: Web graphs, Compact data structures, Graph compression