A Pattern-Based Approach for Efficient Query Processing over RDF Data

  • Chapter
Transactions on Large-Scale Data- and Knowledge-Centered Systems V

Part of the book series: Lecture Notes in Computer Science ((TLDKS,volume 7100))

Abstract

The growing prevalence of Linked Data has drawn research interest to the efficiency of query execution over the web of data. Search and query engines crawl triples, index them in a centralized repository, and execute queries locally. The literature has repeatedly shown that the performance bottleneck of large-scale query execution lies in joins and unions. Based on the observation that many join operations produce a much smaller binding set that can be precomputed and stored, we propose augmenting RDF indexes to store the bindings of complex patterns and exploiting these patterns to improve performance. In addition to the index, we introduce two strategies for selecting these patterns: one relies on heuristic rules we developed, and the other uses query history to optimize the time-space ratio. Our empirical study demonstrates that the proposed pattern index outperforms a traditional triple index by up to three orders of magnitude while keeping the overhead low.
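The core idea of the abstract, storing the bindings of a multi-triple pattern so that a later query can skip the join, can be illustrated with a minimal Python sketch. This is not the chapter's implementation: the toy triples, the `PatternIndex` class, and the exact-match lookup (a stored pattern is reused only when a query uses the same triple patterns with the same variable names) are all assumptions made for illustration.

```python
# Toy data set; every name here is hypothetical, not taken from the chapter.
TRIPLES = [
    ("alice", "worksAt", "acme"),
    ("bob", "worksAt", "acme"),
    ("carol", "worksAt", "initech"),
    ("acme", "locatedIn", "berlin"),
    ("initech", "locatedIn", "austin"),
]


def is_var(term):
    """Variables are written SPARQL-style, with a leading '?'."""
    return term.startswith("?")


def match_triple(pattern, triples):
    """Return the variable bindings of one triple pattern (full scan)."""
    results = []
    for triple in triples:
        binding = {}
        for p, t in zip(pattern, triple):
            if is_var(p):
                if binding.get(p, t) != t:  # same variable, different value
                    break
                binding[p] = t
            elif p != t:  # constant does not match
                break
        else:
            results.append(binding)
    return results


def join(left, right):
    """Nested-loop join of two binding lists on their shared variables."""
    return [
        {**a, **b}
        for a in left
        for b in right
        if all(a[k] == b[k] for k in a.keys() & b.keys())
    ]


def evaluate(patterns, triples):
    """Baseline: evaluate each triple pattern, joining as we go."""
    result = [{}]
    for pattern in patterns:
        result = join(result, match_triple(pattern, triples))
    return result


class PatternIndex:
    """Assumed structure: precomputed bindings keyed by a set of triple patterns."""

    def __init__(self, triples):
        self.triples = triples
        self.bindings = {}

    def materialize(self, patterns):
        """Precompute and store the bindings of a complex pattern."""
        self.bindings[frozenset(patterns)] = evaluate(patterns, self.triples)

    def query(self, patterns):
        """Answer from stored bindings when available, else fall back to joins."""
        stored = self.bindings.get(frozenset(patterns))
        return stored if stored is not None else evaluate(patterns, self.triples)


# A two-triple pattern: where is each person's employer located?
PATTERN = (("?p", "worksAt", "?c"), ("?c", "locatedIn", "?city"))

index = PatternIndex(TRIPLES)
index.materialize(PATTERN)
for row in index.query(PATTERN):
    print(row["?p"], row["?c"], row["?city"])
```

In this sketch the trade-off named in the abstract is visible directly: `materialize` spends space on stored bindings so that `query` avoids the join. Choosing which patterns to materialize is the selection problem that the chapter addresses with heuristic rules and query history.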

Editor information

Abdelkader Hameurlain, Josef Küng, Roland Wagner

Copyright information

© 2012 Springer-Verlag Berlin Heidelberg

About this chapter

Cite this chapter

Tian, Y., Wang, H., Jin, W., Ni, Y., Yu, Y. (2012). A Pattern-Based Approach for Efficient Query Processing over RDF Data. In: Hameurlain, A., Küng, J., Wagner, R. (eds) Transactions on Large-Scale Data- and Knowledge-Centered Systems V. Lecture Notes in Computer Science, vol 7100. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-28148-8_4

  • DOI: https://doi.org/10.1007/978-3-642-28148-8_4

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-28147-1

  • Online ISBN: 978-3-642-28148-8

  • eBook Packages: Computer Science, Computer Science (R0)
