(Web/Social) Graph Compression
In this section, we discuss the problem of representing large graphs in core memory using suitable compressed data structures. After defining the problem, we survey the most important techniques developed in the last decade to solve it, highlighting the trade-offs involved.
A graph is a pair (N, A), where N is the set of nodes and A ⊆ N × N is the set of arcs, that is, a directed adjacency relation. We use n for the number of nodes, that is, n = |N|, and write x→y when (x, y) ∈ A.
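The definition above can be made concrete with a small sketch. It stores a graph (N, A) as plain, uncompressed successor lists, which is the kind of baseline a compressed representation would have to improve on. The names `build_successors` and `has_arc`, and the four-node example graph, are illustrative assumptions and do not come from any of the surveyed systems.

```python
def build_successors(nodes, arcs):
    """Map each node x to the sorted list of nodes y such that x -> y.

    `nodes` plays the role of N and `arcs` the role of A (a subset of N x N).
    Hypothetical helper, written only to illustrate the definition.
    """
    node_set = set(nodes)
    succ = {x: set() for x in node_set}
    for x, y in arcs:
        # A must be a subset of N x N, so both endpoints have to be nodes.
        if x not in node_set or y not in node_set:
            raise ValueError(f"arc ({x}, {y}) has an endpoint outside N")
        succ[x].add(y)
    # Sorted lists give a canonical, deterministic layout per node.
    return {x: sorted(ys) for x, ys in succ.items()}


def has_arc(succ, x, y):
    """Return True when x -> y, that is, when (x, y) is in A."""
    return y in succ[x]


# Toy graph (assumed for illustration): four nodes, four arcs.
N = {0, 1, 2, 3}
A = {(0, 1), (0, 2), (1, 2), (3, 0)}

succ = build_successors(N, A)
n = len(N)  # n = |N|
```

Because A is a directed relation, `has_arc(succ, 3, 0)` holds while `has_arc(succ, 0, 3)` does not, and a node with no outgoing arcs (node 2 here) simply has an empty successor list.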
Many datasets come with a natural relational structure, that is, a graph, that contains a wealth of information about the data itself, and many data mining tasks can be accomplished from this information alone (e.g., detecting outlier elements, identifying interest groups, estimating measures of importance, and so on). Often, such tasks can be solved through suitable (sometimes,...