OWL Reasoning with WebPIE: Calculating the Closure of 100 Billion Triples

  • Jacopo Urbani
  • Spyros Kotoulas
  • Jason Maassen
  • Frank van Harmelen
  • Henri Bal
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 6088)

Abstract

In previous work we have shown that the MapReduce framework for distributed computation can be deployed for highly scalable inference over RDF graphs under the RDF Schema semantics. Unfortunately, several key optimizations that enabled scalable RDFS inference do not generalize to the richer OWL semantics. In this paper we analyze these problems and propose solutions to overcome them. Our solutions allow distributed computation of the closure of an RDF graph under the OWL Horst semantics.

We demonstrate the WebPIE inference engine, built on top of the Hadoop platform and deployed on a compute cluster of 64 machines. We have evaluated our approach on real-world datasets (UniProt and LDSR, about 0.9 to 1.5 billion triples) and on a synthetic benchmark (LUBM, up to 100 billion triples). Results show that our implementation is scalable and vastly outperforms current systems in terms of supported language expressivity, maximum data size, and inference speed.
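
To make the idea of computing a closure with map and reduce steps concrete, the following is a minimal in-memory sketch in Python. It is not WebPIE's implementation and does not use Hadoop: it applies only one standard RDFS rule (if `s rdf:type C` and `C rdfs:subClassOf D`, then `s rdf:type D`) and repeats the map/reduce pass until no new triples appear. The function names `map_phase`, `reduce_phase`, and `closure`, and the sample data, are assumptions chosen for illustration.

```python
from collections import defaultdict

RDF_TYPE = "rdf:type"
SUBCLASS_OF = "rdfs:subClassOf"


def map_phase(triples):
    """Emit (class, tagged value) pairs so that joinable triples share a key."""
    for s, p, o in triples:
        if p == RDF_TYPE:
            # Key on the class the subject is an instance of.
            yield o, ("instance", s)
        elif p == SUBCLASS_OF:
            # Key on the subclass, carry the superclass as the value.
            yield s, ("superclass", o)


def reduce_phase(values):
    """Join instances and superclasses that were grouped under one class."""
    instances = [v for tag, v in values if tag == "instance"]
    superclasses = [v for tag, v in values if tag == "superclass"]
    for inst in instances:
        for sup in superclasses:
            yield (inst, RDF_TYPE, sup)


def closure(triples):
    """Repeat the map/reduce pass until a fixpoint is reached (this rule only)."""
    closed = set(triples)
    while True:
        groups = defaultdict(list)
        for key, value in map_phase(closed):
            groups[key].append(value)
        derived = {t for values in groups.values() for t in reduce_phase(values)}
        if derived <= closed:
            return closed
        closed |= derived


if __name__ == "__main__":
    data = {
        ("alice", RDF_TYPE, "Student"),
        ("Student", SUBCLASS_OF, "Person"),
        ("Person", SUBCLASS_OF, "Agent"),
    }
    for triple in sorted(closure(data)):
        print(triple)
```

Because only the type-propagation rule is applied, the sketch derives that `alice` is a `Person` and an `Agent`, but it does not derive `Student rdfs:subClassOf Agent`; a full RDFS or OWL Horst closure needs further rules.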

Keywords

Canonical Representation, MapReduce Framework, Algorithm Rule, Synthetic Benchmark, Hadoop Platform
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© Springer-Verlag Berlin Heidelberg 2010

Authors and Affiliations

  • Jacopo Urbani (1)
  • Spyros Kotoulas (1)
  • Jason Maassen (1)
  • Frank van Harmelen (1)
  • Henri Bal (1)

  1. Department of Computer Science, Vrije Universiteit, Amsterdam