Lightweighting the Web of Data through Compact RDF/HDT

  • Conference paper
Advances in Artificial Intelligence (CAEPIA 2011)

Abstract

The Web of Data is producing large RDF datasets from diverse fields. The increasing size of the data being published threatens to make these datasets hard to exchange, index and consume. This scalability problem greatly diminishes the potential of interconnected RDF graphs. The HDT format addresses these problems through a compact RDF representation that partitions the data into three efficiently represented components: Header (metadata), Dictionary (strings occurring in the dataset), and Triples (graph structure). This paper revisits the format and exploits the latest findings in triples indexing for querying, exchanging and visualizing RDF information at large scale.
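
The three-way partition described in the abstract can be illustrated with a minimal sketch. It is not the HDT binary format itself: the function names `encode` and `decode`, the header fields, the single shared ID space, and the example prefixes are all assumptions made for illustration. It only shows the general idea of storing each string once in a dictionary and expressing the graph structure with integer IDs.

```python
def encode(triples):
    """Split RDF triples into a header, a term dictionary, and ID triples.

    Illustrative only: real HDT uses a more elaborate dictionary layout
    and a compressed triples encoding.
    """
    # Dictionary: every distinct string is stored once, in sorted order.
    terms = sorted({term for triple in triples for term in triple})
    term_to_id = {term: i for i, term in enumerate(terms, start=1)}

    # Triples: the graph structure, expressed only with integer IDs.
    id_triples = sorted(
        tuple(term_to_id[term] for term in triple) for triple in triples
    )

    # Header: descriptive metadata about the dataset (hypothetical fields).
    header = {"num_triples": len(id_triples), "num_terms": len(terms)}
    return header, terms, id_triples


def decode(terms, id_triples):
    """Recover the string triples from the dictionary and the ID triples."""
    return [tuple(terms[i - 1] for i in triple) for triple in id_triples]


example = [
    ("ex:alice", "foaf:knows", "ex:bob"),
    ("ex:alice", "foaf:name", '"Alice"'),
    ("ex:bob", "foaf:knows", "ex:alice"),
]
header, terms, id_triples = encode(example)
print(header)
print(terms)
print(id_triples)
```

Repeated strings such as `ex:alice` appear once in the dictionary and are referenced by ID wherever they occur, which is where the space saving in this simplified view comes from.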

Copyright information

© 2011 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Fernández, J.D., Martínez-Prieto, M.A., Arias, M., Gutierrez, C., Álvarez-García, S., Brisaboa, N.R. (2011). Lightweighting the Web of Data through Compact RDF/HDT. In: Lozano, J.A., Gámez, J.A., Moreno, J.A. (eds) Advances in Artificial Intelligence. CAEPIA 2011. Lecture Notes in Computer Science, vol 7023. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-25274-7_49

  • DOI: https://doi.org/10.1007/978-3-642-25274-7_49

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-25273-0

  • Online ISBN: 978-3-642-25274-7

  • eBook Packages: Computer Science (R0)
