Exchange and Consumption of Huge RDF Data

  • Miguel A. Martínez-Prieto
  • Mario Arias Gallego
  • Javier D. Fernández
Part of the Lecture Notes in Computer Science book series (LNCS, volume 7295)

Abstract

Huge RDF datasets are currently exchanged in textual RDF formats, so consumers must post-process them with RDF stores before local consumption tasks such as indexing and SPARQL querying. This is a painful task that demands considerable time and computational resources. A first approach to lightweight data exchange is HDT, a compact (binary) RDF serialization format. In this paper, we show how to enhance the exchanged HDT with additional structures that support some basic forms of SPARQL query resolution without the need to "unpack" the data. Experiments show that i) exchange efficiency outperforms universal compression, ii) post-processing becomes a fast process, and iii) the result provides competitive query performance at consumption.

Keywords

Data Provider, SPARQL Query, Triple Pattern, Adjacency List, Wavelet Tree

Copyright information

© Springer-Verlag Berlin Heidelberg 2012

Authors and Affiliations

  • Miguel A. Martínez-Prieto (1, 2)
  • Mario Arias Gallego (1, 3)
  • Javier D. Fernández (1, 2)

  1. Department of Computer Science, Universidad de Valladolid, Spain
  2. Department of Computer Science, Universidad de Chile, Chile
  3. Digital Enterprise Research Institute, National University of Ireland, Galway, Ireland
