LOD-a-lot

A Queryable Dump of the LOD Cloud

Part of the Lecture Notes in Computer Science book series (LNISA, volume 10588)

Abstract

LOD-a-lot democratizes access to the Linked Open Data (LOD) Cloud by serving more than 28 billion unique triples from 650K datasets over a single self-indexed file. This corpus can be queried online through a sustainable Linked Data Fragments interface, or downloaded and consumed locally: LOD-a-lot is easy to deploy and demands affordable resources (524 GB of disk space and 15.7 GB of RAM), enabling Web-scale repeatable experimentation and research even on standard laptops.

Notes

  1. See https://datahub.io.

  2. See http://lodlaundromat.org.

  3. See https://www.w3.org/TR/rdf11-concepts/#section-skolemization.

  4. HDT creation took 64h & 170 GB RAM. HDT-FoQ took 8h & 250 GB RAM.

  5. See https://datahub.io/dataset/lod-a-lot.

  6. See https://opendatacommons.org/licenses/pddl/1-0/.

  7. See https://github.com/rdfhdt.

  8. 8 cores (2.6 GHz), RAM 32 GB and a SATA HDD on Ubuntu 14.04.5 LTS.

  9. See http://lod.openlinksw.com/.

  10. See http://km.aifb.kit.edu/projects/btc-2014/.

  11. See http://webdatacommons.org/structureddata/2016-10/stats/stats.html.

Acknowledgments

Partly funded by Austrian Science Fund: M1720-G11, European Union’s Horizon 2020 research and innovation programme under grant 731601, WU Post-doc Research Contracts, and MINECO, Spain: TIN2013-46238-C4-3-R, and TIN2016-78011-C4-1-R. We also thank the KEYSTONE COST Action IC1302.

Author information

Corresponding author

Correspondence to Javier D. Fernández.

Copyright information

© 2017 Springer International Publishing AG

About this paper

Cite this paper

Fernández, J.D., Beek, W., Martínez-Prieto, M.A., Arias, M. (2017). LOD-a-lot. In: The Semantic Web – ISWC 2017. ISWC 2017. Lecture Notes in Computer Science, vol 10588. Springer, Cham. https://doi.org/10.1007/978-3-319-68204-4_7

  • DOI: https://doi.org/10.1007/978-3-319-68204-4_7

  • Published:

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-68203-7

  • Online ISBN: 978-3-319-68204-4

  • eBook Packages: Computer Science, Computer Science (R0)