A Distributed Inverted Indexing Scheme for Large-Scale RDF Data

  • Conference paper

Part of the book series: Lecture Notes in Computer Science ((LNISA,volume 7419))

Abstract

With the growth of the Linked Data project, an enormous amount of RDF data has been published on the Web, and a scalable system is required to retrieve it efficiently. This paper presents a distributed inverted indexing scheme for large-scale RDF data. The scalable inverted index is built on the underlying data structure of Cassandra, a distributed key-value storage system. We optimize the indexing scheme using the characteristics of the RDF data model to support fast keyword search. The loading, encoding, and indexing procedures for RDF data are carried out simultaneously using the MapReduce framework. Experimental results show that our indexing scheme effectively supports keyword retrieval over large-scale RDF data.
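To make the idea in the abstract concrete, the following is a minimal single-process sketch of a keyword inverted index over RDF triples, organized as a map step and a reduce step. It is an illustration under stated assumptions, not the scheme from the paper: the abstract does not give the actual Cassandra schema or MapReduce jobs, so the nested dictionary (term, then subject, then predicates) only stands in for a row-key and column layout, and the names `tokenize`, `map_triple`, `reduce_postings`, `build_index`, and `search` are hypothetical.

```python
import re
from collections import defaultdict

# Assumption: literals are written N-Triples style, i.e. they start with a
# double quote; anything else (URIs, prefixed names) is not indexed.
LITERAL = re.compile(r'^"(.*)"')


def tokenize(text):
    """Lowercase the text and split it into alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", text.lower())


def map_triple(triple):
    """Map step: emit (term, (subject, predicate)) for each literal term."""
    subject, predicate, obj = triple
    match = LITERAL.match(obj)
    if match:
        for term in set(tokenize(match.group(1))):
            yield term, (subject, predicate)


def reduce_postings(pairs):
    """Reduce step: group emitted pairs into term -> subject -> predicates.

    The nested dict is a stand-in for a key-value layout in which the term
    is the row key and each subject is a column (hypothetical layout).
    """
    index = defaultdict(lambda: defaultdict(set))
    for term, (subject, predicate) in pairs:
        index[term][subject].add(predicate)
    return index


def build_index(triples):
    """Run the map step over all triples, then the reduce step."""
    pairs = (pair for triple in triples for pair in map_triple(triple))
    return reduce_postings(pairs)


def search(index, query):
    """Return the subjects whose literals contain every term of the query."""
    terms = tokenize(query)
    if not terms:
        return set()
    result = None
    for term in terms:
        subjects = set(index.get(term, {}))
        result = subjects if result is None else result & subjects
    return result
```

In a distributed setting, the map step would run in parallel over partitions of the triples and the grouped postings would be written to the key-value store rather than held in memory. This sketch keeps everything in one process so the data flow is easy to follow.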



Copyright information

© 2012 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Li, X., Wang, X., Shi, H., Sheng, Z., Feng, Z. (2012). A Distributed Inverted Indexing Scheme for Large-Scale RDF Data. In: Bao, Z., et al. Web-Age Information Management. WAIM 2012. Lecture Notes in Computer Science, vol 7419. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-33050-6_16

  • DOI: https://doi.org/10.1007/978-3-642-33050-6_16

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-33049-0

  • Online ISBN: 978-3-642-33050-6

  • eBook Packages: Computer Science, Computer Science (R0)
