International Semantic Web Conference

ISWC 2010: The Semantic Web – ISWC 2010 pp 193-208

Compact Representation of Large RDF Data Sets for Publishing and Exchange

  • Javier D. Fernández
  • Miguel A. Martínez-Prieto
  • Claudio Gutierrez
Conference paper

DOI: 10.1007/978-3-642-17746-0_13

Volume 6496 of the book series Lecture Notes in Computer Science (LNCS)
Cite this paper as:
Fernández J.D., Martínez-Prieto M.A., Gutierrez C. (2010) Compact Representation of Large RDF Data Sets for Publishing and Exchange. In: Patel-Schneider P.F. et al. (eds) The Semantic Web – ISWC 2010. ISWC 2010. Lecture Notes in Computer Science, vol 6496. Springer, Berlin, Heidelberg

Abstract

Increasingly large RDF data sets are being published on the Web. Currently, they use different RDF syntaxes, contain high levels of redundancy, and have a plain, indivisible structure. All this leads to fuzzy publications, inefficient management, complex processing, and a lack of scalability. This paper presents a novel RDF representation (HDT) that takes advantage of the structural properties of RDF graphs to split RDF data into three efficiently represented components: Header, Dictionary, and Triples structure. On-demand management operations can be implemented on top of the HDT representation. Experiments show that data sets can be compacted in HDT by more than fifteen times with respect to the current naive representation, improving parsing and processing while keeping a consistent publication scheme. For exchange, specific compression techniques over HDT improve on current compression solutions.
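The three-way split named in the abstract can be illustrated with a minimal sketch. It is not the paper's actual encoding: the single shared ID space, the sorted-term numbering, the header fields, and all function names (`build_dictionary`, `encode`, `decode`, `split_dataset`) are assumptions made for illustration only. The idea shown is that each distinct term is stored once in a dictionary, and the triples are then stored as small integer IDs instead of repeated long strings.

```python
from typing import Dict, List, Tuple

Triple = Tuple[str, str, str]
IdTriple = Tuple[int, int, int]


def build_dictionary(triples: List[Triple]) -> Dict[str, int]:
    """Map each distinct term to a 1-based integer ID (hypothetical scheme)."""
    terms = sorted({term for triple in triples for term in triple})
    return {term: i for i, term in enumerate(terms, start=1)}


def encode(triples: List[Triple], dictionary: Dict[str, int]) -> List[IdTriple]:
    """Replace every term by its ID and sort the resulting ID triples."""
    return sorted((dictionary[s], dictionary[p], dictionary[o]) for s, p, o in triples)


def decode(encoded: List[IdTriple], dictionary: Dict[str, int]) -> List[Triple]:
    """Recover the original string triples from ID triples."""
    inverse = {i: term for term, i in dictionary.items()}
    return [(inverse[s], inverse[p], inverse[o]) for s, p, o in encoded]


def split_dataset(triples: List[Triple]):
    """Split a data set into header, dictionary, and ID triples.

    The header fields here are illustrative assumptions, not the
    metadata defined by the paper.
    """
    dictionary = build_dictionary(triples)
    header = {"triples": len(triples), "terms": len(dictionary)}
    return header, dictionary, encode(triples, dictionary)


if __name__ == "__main__":
    data = [
        ("ex:a", "ex:knows", "ex:b"),
        ("ex:a", "ex:knows", "ex:c"),
        ("ex:b", "ex:knows", "ex:c"),
    ]
    header, dictionary, ids = split_dataset(data)
    print(header)
    print(dictionary)
    print(ids)
```

In this toy version the term `ex:knows` is stored once in the dictionary rather than three times in the data, which hints at where the redundancy savings described in the abstract could come from.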


Copyright information

© Springer-Verlag Berlin Heidelberg 2010

Authors and Affiliations

  • Javier D. Fernández (1)
  • Miguel A. Martínez-Prieto (1, 2)
  • Claudio Gutierrez (2)

  1. Department of Computer Science, Universidad de Valladolid, Spain
  2. Department of Computer Science, Universidad de Chile, Chile