A Structural Approach to Indexing Triples

  • François Picalausa
  • Yongming Luo
  • George H. L. Fletcher
  • Jan Hidders
  • Stijn Vansummeren
Part of the Lecture Notes in Computer Science book series (LNCS, volume 7295)

Abstract

As an essential part of the W3C’s semantic web stack and linked data initiative, RDF data management systems (also known as triplestores) have drawn a lot of research attention. The majority of these systems use value-based indexes (e.g., B+-trees) for physical storage, and ignore many of the structural aspects present in RDF graphs. Structural indexes, on the other hand, have been successfully applied in XML and semi-structured data management to exploit structural graph information in query processing. In those settings, a structural index groups nodes in a graph based on some equivalence criterion, for example, indistinguishability with respect to some query workload (usually XPath). Motivated by this body of work, we have started the SAINT-DB project to study and develop a native RDF management system based on structural indexes. In this paper we present a principled framework for designing and using RDF structural indexes for practical fragments of SPARQL, based on recent formal structural characterizations of these fragments. We then explain how structural indexes can be incorporated in a typical query processing workflow; and discuss the design, implementation, and initial empirical evaluation of our approach.
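To make the idea of grouping nodes by an equivalence criterion concrete, the following is a minimal sketch and not the index construction used in SAINT-DB. It assumes a simple bounded, forward-only bisimulation-style refinement over outgoing edges; the function name `structural_partition`, the `rounds` parameter, and the toy triples are all hypothetical choices made for illustration.

```python
from collections import defaultdict


def structural_partition(triples, rounds=2):
    """Group nodes whose outgoing structure looks the same up to `rounds` hops.

    Illustrative assumption: only outgoing edges are inspected, and the
    refinement is stopped after a fixed number of rounds.
    """
    nodes = {s for s, _, _ in triples} | {o for _, _, o in triples}
    out_edges = defaultdict(list)
    for s, p, o in triples:
        out_edges[s].append((p, o))

    # Round 0: every node starts in the same block.
    block = {n: 0 for n in nodes}
    for _ in range(rounds):
        # A node's signature is its current block plus the set of
        # (predicate, block of target) pairs on its outgoing edges.
        signature = {
            n: (block[n], frozenset((p, block[o]) for p, o in out_edges[n]))
            for n in nodes
        }
        # Renumber distinct signatures to small integer block ids.
        ids = {}
        for n in sorted(nodes):
            ids.setdefault(signature[n], len(ids))
        block = {n: ids[signature[n]] for n in nodes}

    groups = defaultdict(set)
    for n, b in block.items():
        groups[b].add(n)
    return sorted(sorted(g) for g in groups.values())


# Hypothetical toy RDF graph as (subject, predicate, object) triples.
triples = [
    ("alice", "worksFor", "uni1"),
    ("bob", "worksFor", "uni2"),
    ("carol", "knows", "alice"),
    ("uni1", "locatedIn", "nl"),
    ("uni2", "locatedIn", "be"),
]

partition = structural_partition(triples)
print(partition)
```

In this toy graph, nodes with the same outgoing structure (for example, the two employees or the two universities) end up in the same block, so a query could in principle be evaluated against the blocks first and against the underlying triples only afterwards.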

Keywords

  • Query Processing
  • SPARQL Query
  • Triple Pattern
  • Read Request
  • Index Block
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© Springer-Verlag Berlin Heidelberg 2012

Authors and Affiliations

  • François Picalausa (1)
  • Yongming Luo (2)
  • George H. L. Fletcher (2)
  • Jan Hidders (3)
  • Stijn Vansummeren (1)

  1. Université Libre de Bruxelles, Belgium
  2. Eindhoven University of Technology, The Netherlands
  3. Delft University of Technology, The Netherlands