DynamiTE: Parallel Materialization of Dynamic RDF Data

  • Jacopo Urbani
  • Alessandro Margara
  • Ceriel Jacobs
  • Frank van Harmelen
  • Henri Bal
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 8218)

Abstract

One of the main advantages of using semantically annotated data is that machines can reason on it, deriving implicit knowledge from explicit information. In this context, materializing every possible implicit derivation from a given input can be computationally expensive, especially when considering large data volumes.

Most solutions to this problem assume that the information is static, i.e., that it does not change or changes only very infrequently. However, the Web is extremely dynamic: online newspapers, blogs, social networks, etc., change frequently, with outdated information removed and replaced by fresh data. This calls for a materialization that is not only scalable, but also reactive to changes.

In this paper, we consider the problem of incremental materialization, that is, how to update the materialized derivations when data is added or removed. To this end, we consider the ρdf RDFS fragment [12] and present a parallel system that implements a number of algorithms to quickly recalculate the derivation. When new data is added, our system uses a parallel version of the well-known semi-naive evaluation of Datalog. For removals, we have implemented two algorithms: one based on previous theoretical work, and another that is more efficient because it does not require a complete scan of the input.
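
To illustrate the idea behind semi-naive evaluation for additions, the following is a minimal sequential sketch, not DynamiTE's implementation. It assumes only two ρdf-style rules (transitivity of `sc` and propagation of `type` along `sc`), stores triples in a plain Python set, and uses the hypothetical names `derive` and `add_triples`. Parallelism and indexing are deliberately left out.

```python
from typing import Iterable, Set, Tuple

Triple = Tuple[str, str, str]

# Assumed predicate labels for this sketch (rhodf uses sc and type).
SC = "sc"
TYPE = "type"


def derive(delta: Set[Triple], known: Set[Triple]) -> Set[Triple]:
    """Apply both rules, requiring at least one premise to come from delta."""
    out: Set[Triple] = set()
    for (s, p, o) in delta:
        for (s2, p2, o2) in known:
            # Rule 1: (a sc b), (b sc c) => (a sc c); delta triple on either side.
            if p == SC and p2 == SC:
                if o == s2:
                    out.add((s, SC, o2))
                if o2 == s:
                    out.add((s2, SC, o))
            # Rule 2: (x type a), (a sc b) => (x type b); delta triple on either side.
            if p == TYPE and p2 == SC and o == s2:
                out.add((s, TYPE, o2))
            if p == SC and p2 == TYPE and o2 == s:
                out.add((s2, TYPE, o))
    return out


def add_triples(materialized: Set[Triple], new: Iterable[Triple]) -> Set[Triple]:
    """Return the updated closure after adding new triples to a closed set."""
    closure = set(materialized)
    delta = set(new) - closure
    while delta:
        closure |= delta
        # Only derivations that involve a freshly added triple are recomputed.
        delta = derive(delta, closure) - closure
    return closure


schema = {("Student", SC, "Person"), ("Person", SC, "Agent")}
closure = add_triples(set(), schema)
closure = add_triples(closure, [("alice", TYPE, "Student")])
```

The point of the sketch is that each round joins only the newly derived triples against the existing closure, so an update never re-derives conclusions that depend solely on old data.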

We have evaluated the performance using a prototype system called DynamiTE, which organizes the knowledge bases with a number of indices to facilitate the query process and exploits parallelism to improve the performance. The results show that our methods are indeed capable of recalculating the derivation in a short time, opening the door to reasoning on much more dynamic data than is currently possible.

Keywords

Main Memory; Full Materialization; Performance Bottleneck; Count Attribute; Rederive Phase
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

References

  1. Abiteboul, S., Hull, R., Vianu, V.: Foundations of Databases. Addison-Wesley (1995), http://webdam.inria.fr/Alice/
  2. Barbieri, D.F., Braga, D., Ceri, S., Della Valle, E., Grossniklaus, M.: Incremental Reasoning on Streams and Rich Background Knowledge. In: Aroyo, L., Antoniou, G., Hyvönen, E., ten Teije, A., Stuckenschmidt, H., Cabral, L., Tudorache, T. (eds.) ESWC 2010, Part I. LNCS, vol. 6088, pp. 1–15. Springer, Heidelberg (2010)
  3. Broekstra, J., Kampman, A.: Inferencing and Truth Maintenance in RDF Schema: Exploring a Naive Practical Approach. In: Workshop on Practical and Scalable Semantic Systems (PSSS), Sanibel Island, Florida (2003)
  4. Della Valle, E., Ceri, S., van Harmelen, F., Fensel, D.: It’s a streaming world! Reasoning upon rapidly changing information. IEEE Intelligent Systems 24(6), 83–89 (2009)
  5. Guo, Y., Pan, Z., Heflin, J.: LUBM: A benchmark for OWL knowledge base systems. Web Semantics: Science, Services and Agents on the World Wide Web 3, 158–182 (2005)
  6. Gupta, A., Mumick, I.S.: Maintenance of Materialized Views: Problems, Techniques, and Applications. Data Engineering Bulletin 18(2), 3–18 (1995)
  7. Gupta, A., Mumick, I.S., Subrahmanian, V.S.: Maintaining Views Incrementally. In: Proceedings of SIGMOD, vol. 22, pp. 157–166. ACM (1993)
  8. Harrison, J.V., Dietrich, S.: Maintenance of Materialized Views in a Deductive Database: An Update Propagation Approach. In: Workshop on Deductive Databases, JICSLP, pp. 56–65 (1992)
  9. Hayes, P. (ed.): RDF Semantics. W3C Recommendation (2004)
  10. Kolovski, V., Wu, Z., Eadon, G.: Optimizing Enterprise-scale OWL 2 RL Reasoning in a Relational Database System. In: Patel-Schneider, P.F., Pan, Y., Hitzler, P., Mika, P., Zhang, L., Pan, J.Z., Horrocks, I., Glimm, B. (eds.) ISWC 2010, Part I. LNCS, vol. 6496, pp. 436–452. Springer, Heidelberg (2010)
  11. Kotowski, J., Bry, F., Brodt, S.: Reasoning as Axioms Change. In: Rudolph, S., Gutierrez, C. (eds.) RR 2011. LNCS, vol. 6902, pp. 139–154. Springer, Heidelberg (2011)
  12. Munoz-Venegas, S., Pérez, J., Gutierrez, C.: Simple and Efficient Minimal RDFS. Web Semantics: Science, Services and Agents on the World Wide Web 7(3) (2009)
  13. Olson, M.A., Bostic, K., Seltzer, M.: Berkeley DB. In: Proceedings of the FREENIX Track: 1999 USENIX Annual Technical Conference, pp. 183–192 (1999)
  14. Prud’hommeaux, E., Seaborne, A.: SPARQL Query Language for RDF. W3C Recommendation (2008)
  15. Staudt, M., Jarke, M.: Incremental Maintenance of Externally Materialized Views. In: Vijayaraman, T.M., Buchmann, A.P., Mohan, C., Sarda, N.L. (eds.) Proceedings of VLDB, pp. 75–86 (1996)
  16. Urbani, J., Kotoulas, S., Maassen, J., Harmelen, F.V., Bal, H.: WebPIE: A Web-scale Parallel Inference Engine using MapReduce. Web Semantics: Science, Services and Agents on the World Wide Web 10, 59–75 (2012)
  17. Urbani, J., Maassen, J., Drost, N., Seinstra, F., Bal, H.: Scalable RDF data compression with MapReduce. Concurrency and Computation: Practice and Experience 25(1), 24–39 (2013)
  18. Volz, R., Staab, S., Motik, B.: Incremental Maintenance of Materialized Ontologies. In: Meersman, R., Schmidt, D.C. (eds.) CoopIS 2003, DOA 2003, and ODBASE 2003. LNCS, vol. 2888, pp. 707–724. Springer, Heidelberg (2003)
  19. Volz, R., Staab, S., Motik, B.: Incrementally maintaining materializations of ontologies stored in logic databases. In: Spaccapietra, S., Bertino, E., Jajodia, S., King, R., McLeod, D., Orlowska, M.E., Strous, L. (eds.) Journal on Data Semantics II. LNCS, vol. 3360, pp. 1–34. Springer, Heidelberg (2005)
  20. Weaver, J., Hendler, J.A.: Parallel Materialization of the Finite RDFS Closure for Hundreds of Millions of Triples. In: Bernstein, A., Karger, D.R., Heath, T., Feigenbaum, L., Maynard, D., Motta, E., Thirunarayan, K. (eds.) ISWC 2009. LNCS, vol. 5823, pp. 682–697. Springer, Heidelberg (2009)
  21. Weiss, C., Karras, P., Bernstein, A.: Hexastore: Sextuple indexing for semantic web data management. In: Proceedings of VLDB, vol. 1, pp. 1008–1019 (2008)

Copyright information

© Springer-Verlag Berlin Heidelberg 2013

Authors and Affiliations

  • Jacopo Urbani (1)
  • Alessandro Margara (1)
  • Ceriel Jacobs (1)
  • Frank van Harmelen (1)
  • Henri Bal (1)

  1. Vrije Universiteit Amsterdam, The Netherlands
