LOTUS: Adaptive Text Search for Big Linked Data

  • Filip Ilievski
  • Wouter Beek
  • Marieke van Erp
  • Laurens Rietveld
  • Stefan Schlobach
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9678)

Abstract

Finding relevant resources on the Semantic Web today is a dirty job: no centralized query service exists and the support for natural language access is limited. We present LOTUS: Linked Open Text UnleaShed, a text-based entry point to a massive subset of today’s Linked Open Data Cloud. Recognizing the use case dependency of resource retrieval, LOTUS provides an adaptive framework in which a set of matching and ranking algorithms are made available. Researchers and developers are able to tune their own LOTUS index by choosing and combining the matching and ranking algorithms that suit their use case best. In this paper, we explain the LOTUS approach, its implementation and the functionality it provides. We demonstrate the ease with which LOTUS enables text-based resource retrieval at an unprecedented scale in concrete and domain-specific scenarios. Finally, we provide evidence for the scalability of LOTUS with respect to the LOD Laundromat, the largest collection of easily accessible Linked Open Data currently available.

Keywords

  • Findability
  • Text indexing
  • Semantic search
  • Scalable data management

Notes

Acknowledgments

The research for this paper was supported by the European Union’s 7th Framework Programme via the NewsReader Project (ICT-316404) and the Netherlands Organisation for Scientific Research (NWO) via the Spinoza fund.

Copyright information

© Springer International Publishing Switzerland 2016

Authors and Affiliations

  • Filip Ilievski (1)
  • Wouter Beek (1)
  • Marieke van Erp (1)
  • Laurens Rietveld (1)
  • Stefan Schlobach (1)

  1. The Network Institute, VU University Amsterdam, Amsterdam, The Netherlands
