SPARQL Query Optimization on Top of DHTs

  • Zoi Kaoudi
  • Kostis Kyzirakos
  • Manolis Koubarakis
Part of the Lecture Notes in Computer Science book series (LNCS, volume 6496)


We study the problem of SPARQL query optimization on top of distributed hash tables. Existing approaches to SPARQL query processing in such environments have either never been implemented in a real system or do not use any optimization techniques, and thus exhibit poor performance. Our goal in this paper is to propose efficient and scalable algorithms for optimizing SPARQL basic graph pattern queries. We augment a known distributed query processing algorithm with query optimization strategies that improve performance in terms of query response time and bandwidth usage. We implement our techniques in the system Atlas and study their performance experimentally in a local cluster.


Keywords: Query Processing, Distributed Hash Table, Query Evaluation, Query Optimization, SPARQL Query
These keywords were added by machine and not by the authors.



Copyright information

© Springer-Verlag Berlin Heidelberg 2010

Authors and Affiliations

  • Zoi Kaoudi (1)
  • Kostis Kyzirakos (1)
  • Manolis Koubarakis (1)
  1. Dept. of Informatics and Telecommunications, National and Kapodistrian University of Athens, Greece
