On Designing Archiving Policies for Evolving RDF Datasets on the Web

  • Kostas Stefanidis
  • Ioannis Chrysakis
  • Giorgos Flouris
Part of the Lecture Notes in Computer Science book series (LNCS, volume 8824)

Abstract

When dealing with dynamically evolving datasets, users are often interested in the state of affairs in previous versions of the dataset, and would like to execute queries on such previous versions, as well as queries that compare the state of affairs across different versions. This is especially true for datasets stored on the Web, where the interlinking aspect, combined with the lack of central control, does not allow synchronized evolution of interlinked datasets. To address this requirement, the obvious solution is to store all previous versions, but this could quickly increase the space requirements; an alternative solution is to store adequate deltas between versions, which are generally smaller, but this would create the overhead of generating versions at query time. This paper studies the trade-offs involved in these approaches, in the context of archiving dynamic RDF datasets over the Web. Our main message is that a hybrid policy would work better than either of the above approaches, and we describe our proposed methodology for establishing a cost model that would allow determining when each of the two standard methods (version-based or delta-based storage) should be used in the context of a hybrid policy.
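To make the trade-off concrete, the following is a minimal sketch of a hybrid archive, not the cost model proposed in the paper. It assumes that an RDF version can be modelled as a set of (subject, predicate, object) triples and that a delta is a pair of added and deleted triple sets. The names `HybridArchive`, `Delta`, and the `max_replay` threshold are hypothetical, introduced only for illustration: a full version is stored whenever the number of changed triples that would have to be replayed at query time exceeds `max_replay`, and a delta is stored otherwise.

```python
from dataclasses import dataclass, field

# Assumption: a triple is a plain (subject, predicate, object) tuple of strings.
Triple = tuple[str, str, str]


@dataclass
class Delta:
    """Changes between two consecutive versions."""
    added: frozenset
    deleted: frozenset

    def size(self) -> int:
        return len(self.added) + len(self.deleted)


@dataclass
class HybridArchive:
    # Hypothetical knob standing in for a real cost model: the maximum
    # number of changed triples we accept to replay when rebuilding a version.
    max_replay: int
    # Each entry is ("full", frozenset_of_triples) or ("delta", Delta).
    entries: list = field(default_factory=list)
    _latest: frozenset = frozenset()
    _replay_cost: int = 0

    def commit(self, version) -> None:
        """Archive a new version as a full copy or as a delta."""
        new = frozenset(version)
        delta = Delta(added=new - self._latest, deleted=self._latest - new)
        if not self.entries or self._replay_cost + delta.size() > self.max_replay:
            # Replaying deltas would be too costly: pay space for a full copy.
            self.entries.append(("full", new))
            self._replay_cost = 0
        else:
            # Cheap to replay: pay query time instead of space.
            self.entries.append(("delta", delta))
            self._replay_cost += delta.size()
        self._latest = new

    def checkout(self, index: int) -> frozenset:
        """Rebuild version `index` from the nearest earlier full copy."""
        start = index
        while self.entries[start][0] != "full":
            start -= 1
        state = set(self.entries[start][1])
        for _, delta in self.entries[start + 1 : index + 1]:
            state -= delta.deleted
            state |= delta.added
        return frozenset(state)


# Usage: four versions of a toy dataset.
v0 = {("a", "p", "b")}
v1 = v0 | {("a", "p", "c")}
v2 = v1 | {("b", "p", "c"), ("c", "p", "d")}
v3 = v2 - {("a", "p", "b")}

archive = HybridArchive(max_replay=2)
for v in (v0, v1, v2, v3):
    archive.commit(v)

kinds = [kind for kind, _ in archive.entries]
```

Raising `max_replay` moves the archive toward pure delta-based storage (less space, slower reconstruction); setting it to 0 moves it toward version-based storage, since any version that actually changes the dataset is then stored in full. A realistic policy would replace this single threshold with estimates of storage and query-time costs.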

Keywords

Query Processing, Space Requirement, Query Time, Query Execution, Time Overhead
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© Springer International Publishing Switzerland 2014

Authors and Affiliations

  • Kostas Stefanidis (1)
  • Ioannis Chrysakis (1)
  • Giorgos Flouris (1)
  1. Institute of Computer Science, FORTH, Heraklion, Greece
