Encyclopedia of Big Data Technologies

Living Edition
Editors: Sherif Sakr, Albert Zomaya

(Web/Social) Graph Compression

  • Paolo Boldi
  • Sebastiano Vigna
Living reference work entry
DOI: https://doi.org/10.1007/978-3-319-63962-8_54-1

Synonyms

Definitions

In this section, we discuss the problem of representing large graphs in core memory using suitable compressed data structures; after defining the problem, we survey the most important techniques developed in the last decade to solve it, highlighting the trade-offs involved.

A graph is a pair (N, A), where N is the set of nodes and A ⊆ N × N is the set of arcs, that is, a directed adjacency relation. We use n for the number of nodes, that is, n = |N|, and write x → y when (x, y) ∈ A.
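As a rough illustration of the definition above, the following Python sketch stores a graph as sorted successor lists and then rewrites each list as gaps between consecutive successors. It rests on assumptions that are not stated in the entry: nodes are identified with the integers 0, ..., n − 1, and the helper names (`successor_lists`, `to_gaps`, `from_gaps`) are hypothetical. Gap encoding is shown only as one simple idea often used as a starting point for compressing adjacency lists, not as the specific method this entry surveys.

```python
from itertools import accumulate
from typing import Iterable, List, Tuple


def successor_lists(n: int, arcs: Iterable[Tuple[int, int]]) -> List[List[int]]:
    """For each node x in 0..n-1, return the sorted list of y with (x, y) in A.

    Assumption: the node set N is {0, ..., n-1}.
    """
    succ = [set() for _ in range(n)]
    for x, y in arcs:
        if not (0 <= x < n and 0 <= y < n):
            raise ValueError(f"arc ({x}, {y}) refers to a node outside 0..{n - 1}")
        succ[x].add(y)  # A is a relation, so duplicate arcs collapse
    return [sorted(s) for s in succ]


def to_gaps(successors: List[int]) -> List[int]:
    """Keep the first successor as is; replace the rest by differences."""
    gaps = []
    prev = None
    for y in successors:
        gaps.append(y if prev is None else y - prev)
        prev = y
    return gaps


def from_gaps(gaps: List[int]) -> List[int]:
    """Invert to_gaps: a running sum restores the original successors."""
    return list(accumulate(gaps))


# A small graph with n = 4 nodes and six arcs.
arcs = [(0, 1), (0, 3), (1, 2), (3, 0), (3, 1), (3, 2)]
lists = successor_lists(4, arcs)
gapped = [to_gaps(s) for s in lists]
restored = [from_gaps(g) for g in gapped]
```

When successors of a node are close to each other, the gaps are small integers, which is what makes them attractive inputs for a variable-length integer code; which code to use is a separate design choice not covered by this sketch.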

Overview

Many datasets come with a natural relational structure, that is, a graph, that contains a wealth of information about the data itself, and many data mining tasks can be accomplished from this information alone (e.g., detecting outlier elements, identifying interest groups, estimating measures of importance, and so on). Often, such tasks can be solved through suitable (sometimes,...



Copyright information

© Springer International Publishing AG 2018

Authors and Affiliations

  1. Università degli Studi di Milano, Milano, Italy

Section editors and affiliations

  • Paolo Ferragina (1)
  1. Department of Computer Science, University of Pisa, Pisa, Italy