Knowledge and Information Systems, Volume 49, Issue 2, pp 553–595

Compressed \(k^d\)-tree for temporal graphs

  • Diego Caro
  • M. Andrea Rodríguez
  • Nieves R. Brisaboa
  • Antonio Fariña
Regular Paper

Abstract

Temporal graphs represent vertices and binary relations that change over time. This paper proposes to represent temporal graphs as cells in a 4D binary matrix: two dimensions represent the endpoint vertices of an edge, and two dimensions represent the temporal interval during which the edge exists. This strategy generalizes the idea of the adjacency matrix for storing static graphs. The proposed structure, called the Compressed \(k^d\)-tree (\(ck^d\)-tree), is capable of dealing with unclustered data while making good use of space. The \(ck^d\)-tree uses asymptotically the same space as the (worst-case) lower bound for storing cells in a 4D binary matrix, without assuming any regularity. Techniques that group leaves into buckets and compress nodes with few children are shown to improve performance in both time and space. An experimental evaluation compares the \(ck^d\)-tree with the \(k^d\)-tree (the d-dimensional extension of the \(k^2\)-tree) and with other recent compressed data structures based on inverted indexes and Wavelet Trees, showing the potential use of the \(ck^d\)-tree for different types of temporal graphs.
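The 4D view described above can be illustrated with a minimal Python sketch. It is not the authors' succinct structure: it uses a plain pointer-based tree with k = 2 (each node splits its hypercube into 2^4 children) instead of compressed bitmaps, and it applies none of the leaf-bucketing or node-compression techniques. The half-open interval convention [t_start, t_end) and the names `build`, `range_query`, and `active_neighbors` are assumptions made for illustration only.

```python
from itertools import product

# Assumption: an edge (u, v) active during [t_start, t_end) is stored as the
# 4D point (u, v, t_start, t_end); all coordinates lie in [0, side).

def build(points, origin, side):
    """Recursively split a 4D hypercube of edge length `side` (a power of 2).

    Returns None for an empty region, True for a single occupied cell,
    otherwise a dict mapping a child offset (4 bits) to its subtree.
    """
    if not points:
        return None
    if side == 1:
        return True
    half = side // 2
    children = {}
    for offs in product((0, 1), repeat=4):
        lo = tuple(o + b * half for o, b in zip(origin, offs))
        inside = [p for p in points
                  if all(l <= c < l + half for c, l in zip(p, lo))]
        child = build(inside, lo, half)
        if child is not None:          # empty children are simply not stored
            children[offs] = child
    return children

def range_query(node, origin, side, lo, hi):
    """Yield every stored point p with lo[i] <= p[i] <= hi[i] for all i."""
    if node is None:
        return
    # Prune subtrees whose hypercube does not intersect the query box.
    if any(o + side - 1 < l or o > h for o, l, h in zip(origin, lo, hi)):
        return
    if side == 1:
        yield origin
        return
    half = side // 2
    for offs, child in node.items():
        child_origin = tuple(o + b * half for o, b in zip(origin, offs))
        yield from range_query(child, child_origin, half, lo, hi)

def active_neighbors(root, side, u, t):
    """Neighbors of u at time t: u fixed, v free, t_start <= t < t_end."""
    lo = (u, 0, 0, t + 1)
    hi = (u, side - 1, t, side - 1)
    found = range_query(root, (0, 0, 0, 0), side, lo, hi)
    return sorted(p[1] for p in found)

# Hypothetical toy data: (u, v, t_start, t_end)
SIDE = 8
contacts = [(0, 1, 0, 3), (0, 2, 2, 5), (1, 2, 4, 6), (0, 3, 5, 7)]
tree = build(contacts, (0, 0, 0, 0), SIDE)
print(active_neighbors(tree, SIDE, 0, 2))
print(active_neighbors(tree, SIDE, 0, 5))
```

In this sketch a time-point query becomes an axis-aligned box in the 4D space, so the same pruning traversal serves vertex, edge, and interval queries by changing only the box bounds.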

Keywords

  • Multidimensional compact data structure
  • Compact data structures for temporal graphs
  • Time-varying graphs
  • Evolving graphs

Notes

Acknowledgments

Diego Caro and M. Andrea Rodríguez were partially funded by Fondef D09I1185. Diego Caro was also funded by a CONICYT PhD scholarship, and M. Andrea Rodríguez by Fondecyt 1140428 and MINECO (PGE and FEDER) Grant TIN2013-46801-C4-3-R. Nieves Brisaboa and Antonio Fariña are funded by MINECO (PGE and FEDER) Grants TIN2013-46238-C4-3-R and TIN2013-47090-C3-3-P; CDTI, AGI and MINECO Grant CDTI-00064563/ITC-20133062; ICT COST Action IC1302; and by Xunta de Galicia (co-funded with FEDER) Grant GRC2013/053. We also thank Diego Seco for his help in the preliminary discussions of the structures, Gonzalo Navarro and Simon Gog for their suggestions on improving the data structures and the experimental evaluation, and Claudio Sanhueza from Yahoo! Labs, who helped us with the P-Yahoo-Session dataset.

Copyright information

© Springer-Verlag London 2015

Authors and Affiliations

  • Diego Caro (1)
  • M. Andrea Rodríguez (1)
  • Nieves R. Brisaboa (2)
  • Antonio Fariña (2)
  1. Computer Science Department, University of Concepción, Concepción, Chile
  2. Database Lab, Facultade de Informática, University of A Coruña, A Coruña, Spain
