Abstract
When dealing with dynamically evolving datasets, users are often interested in the state of affairs in previous versions of the dataset. They would like to execute queries on such previous versions, as well as queries that compare the state of affairs across different versions. This is especially true for datasets published on the Web, where the interlinking aspect, combined with the lack of central control, does not allow synchronized evolution of interlinked datasets. To address this requirement, the obvious solution is to store all previous versions, but this can quickly increase the space requirements. An alternative is to store appropriate deltas between versions, which are generally smaller, but this introduces the overhead of reconstructing versions at query time. This paper studies the trade-offs involved in these approaches in the context of archiving dynamic RDF datasets on the Web. Our main message is that a hybrid policy would work better than either of the above approaches, and we describe our proposed methodology for establishing a cost model that determines when each of the two standard methods (version-based or delta-based storage) should be used in the context of a hybrid policy.
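The trade-off described above can be illustrated with a minimal sketch. It is not the cost model proposed in the paper: the class name HybridArchive, the ratio threshold, and the rule "store a full version once the accumulated delta size exceeds a fraction of the current version size" are all hypothetical choices made only for illustration. An RDF dataset is modelled as a plain set of (subject, predicate, object) triples, and a delta as the sets of added and removed triples.

```python
from dataclasses import dataclass, field
from typing import FrozenSet, List, Set, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)


@dataclass(frozen=True)
class Delta:
    added: FrozenSet[Triple]
    removed: FrozenSet[Triple]

    def size(self) -> int:
        return len(self.added) + len(self.removed)


def compute_delta(old: Set[Triple], new: Set[Triple]) -> Delta:
    """Low-level delta: triples to add and triples to remove."""
    return Delta(frozenset(new - old), frozenset(old - new))


def apply_delta(version: Set[Triple], delta: Delta) -> Set[Triple]:
    return (version - delta.removed) | delta.added


@dataclass
class HybridArchive:
    # Hypothetical threshold, not taken from the paper: store a full
    # version when the deltas accumulated since the last full version
    # exceed `ratio` times the size of the version being committed.
    ratio: float = 0.5
    entries: List[tuple] = field(default_factory=list)  # ("full", set) or ("delta", Delta)
    _latest: Set[Triple] = field(default_factory=set)
    _pending: int = 0

    def commit(self, version: Set[Triple]) -> None:
        version = set(version)
        if not self.entries:
            # The first version is always stored in full.
            self.entries.append(("full", frozenset(version)))
        else:
            delta = compute_delta(self._latest, version)
            self._pending += delta.size()
            if self._pending > self.ratio * len(version):
                # Reconstruction would get expensive: pay the space cost.
                self.entries.append(("full", frozenset(version)))
                self._pending = 0
            else:
                # Cheap in space, costs work at query time.
                self.entries.append(("delta", delta))
        self._latest = version

    def checkout(self, index: int) -> Set[Triple]:
        """Rebuild version `index` from the nearest earlier full version."""
        start = index
        while self.entries[start][0] != "full":
            start -= 1
        current = set(self.entries[start][1])
        for _kind, delta in self.entries[start + 1 : index + 1]:
            current = apply_delta(current, delta)
        return current
```

With a small ratio the archive behaves like pure version-based storage (fast queries, more space); with a large ratio it approaches pure delta-based storage (less space, more reconstruction work per query). A real policy would replace the fixed threshold with estimates of storage and query costs.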
Copyright information
© 2014 Springer International Publishing Switzerland
About this paper
Cite this paper
Stefanidis, K., Chrysakis, I., Flouris, G. (2014). On Designing Archiving Policies for Evolving RDF Datasets on the Web. In: Yu, E., Dobbie, G., Jarke, M., Purao, S. (eds) Conceptual Modeling. ER 2014. Lecture Notes in Computer Science, vol 8824. Springer, Cham. https://doi.org/10.1007/978-3-319-12206-9_4
DOI: https://doi.org/10.1007/978-3-319-12206-9_4
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-12205-2
Online ISBN: 978-3-319-12206-9
eBook Packages: Computer Science, Computer Science (R0)