International Symposium on String Processing and Information Retrieval

SPIRE 2015: String Processing and Information Retrieval pp 103-115

A Compact RDF Store Using Suffix Arrays

  • Nieves R. Brisaboa
  • Ana Cerdeira-Pena
  • Antonio Fariña
  • Gonzalo Navarro
Conference paper

DOI: 10.1007/978-3-319-23826-5_11

Part of the Lecture Notes in Computer Science book series (LNCS, volume 9309)
Cite this paper as:
Brisaboa N.R., Cerdeira-Pena A., Fariña A., Navarro G. (2015) A Compact RDF Store Using Suffix Arrays. In: Iliopoulos C., Puglisi S., Yilmaz E. (eds) String Processing and Information Retrieval. SPIRE 2015. Lecture Notes in Computer Science, vol 9309. Springer, Cham

Abstract

RDF has become a standard format to describe resources in the Semantic Web and other scenarios. RDF data is composed of triples (subject, predicate, object), referring respectively to a resource, a property of that resource, and the value of such property. Compact storage schemes allow fitting larger datasets in main memory for faster processing. On the other hand, supporting efficient SPARQL queries on RDF datasets requires index data structures to accompany the data, which hampers compactness. As done for text collections, we introduce a self-index for RDF data, which combines the data and its index in a single representation that takes less space than the raw triples and efficiently supports basic SPARQL queries. Our storage format, RDFCSA, builds on compressed suffix arrays. Although there exist more compact representations of RDF data, RDFCSA uses about half of the space of the raw data (and replaces it) and displays much more robust and predictable query times around 1–2 microseconds per retrieved triple. RDFCSA is 3 orders of magnitude faster than representations like MonetDB or RDF-3X, while using the same space as the former and 6 times less space than the latter. It is also faster than the more compact representations on most queries, in some cases by 2 orders of magnitude.
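
To make the idea of answering triple patterns from a suffix array more concrete, the following is a minimal sketch only. It is not RDFCSA: it uses a plain, uncompressed suffix array, and it keeps the ID sequence alongside the index instead of replacing it. The names `build_index`, `objects_of`, and `_bound`, the ID assignment, and the sample data are all assumptions made for illustration. Terms are mapped to integer IDs in disjoint ranges per role, the triples are concatenated into one sequence, and a pattern with a fixed subject and predicate is resolved by binary search over the sorted suffixes.

```python
def build_index(triples):
    """Dictionary-encode triples and build a plain suffix array (illustrative only)."""
    subjects = sorted({s for s, _, _ in triples})
    predicates = sorted({p for _, p, _ in triples})
    objects = sorted({o for _, _, o in triples})
    # Assumption: disjoint ID ranges per role, so a subject ID can never
    # be confused with a predicate or object ID inside the sequence.
    terms = subjects + predicates + objects
    s_id = {t: i for i, t in enumerate(subjects)}
    p_id = {t: len(subjects) + i for i, t in enumerate(predicates)}
    o_id = {t: len(subjects) + len(predicates) + i for i, t in enumerate(objects)}
    seq = []
    for s, p, o in triples:
        seq += [s_id[s], p_id[p], o_id[o]]
    # Plain suffix array: start positions sorted by the suffix they begin.
    sa = sorted(range(len(seq)), key=lambda i: seq[i:])
    return {"terms": terms, "s_id": s_id, "p_id": p_id, "seq": seq, "sa": sa}


def _bound(seq, sa, pattern, upper):
    """Binary search for the first suffix >= pattern (or > pattern if upper)."""
    lo, hi = 0, len(sa)
    k = len(pattern)
    while lo < hi:
        mid = (lo + hi) // 2
        prefix = seq[sa[mid]:sa[mid] + k]
        if prefix < pattern or (upper and prefix == pattern):
            lo = mid + 1
        else:
            hi = mid
    return lo


def objects_of(index, subject, predicate):
    """Resolve the pattern (subject, predicate, ?o) and return the matching objects."""
    if subject not in index["s_id"] or predicate not in index["p_id"]:
        return []
    pattern = [index["s_id"][subject], index["p_id"][predicate]]
    seq, sa = index["seq"], index["sa"]
    lo = _bound(seq, sa, pattern, upper=False)
    hi = _bound(seq, sa, pattern, upper=True)
    # Every match starts at a subject position, so the object is two cells ahead.
    return [index["terms"][seq[sa[i] + 2]] for i in range(lo, hi)]


# Hypothetical sample data.
triples = [
    ("alice", "knows", "bob"),
    ("alice", "knows", "carol"),
    ("bob", "knows", "carol"),
    ("alice", "age", "30"),
]
index = build_index(triples)
print(objects_of(index, "alice", "knows"))
```

All triples matching the pattern occupy one contiguous range of the suffix array, so the cost after the two binary searches is proportional to the number of results. A compressed suffix array, as named in the abstract, would aim to provide this kind of access in less space.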

Copyright information

© Springer International Publishing Switzerland 2015

Authors and Affiliations

  • Nieves R. Brisaboa (1)
  • Ana Cerdeira-Pena (1)
  • Antonio Fariña (1)
  • Gonzalo Navarro (2)
  1. Database Lab., University of A Coruña, Coruña, Spain
  2. Department of Computer Science, University of Chile, Santiago, Chile
