Mapping Keywords to Linked Data Resources for Automatic Query Expansion

  • Isabelle Augenstein
  • Anna Lisa Gentile
  • Barry Norton
  • Ziqi Zhang
  • Fabio Ciravegna
Part of the Lecture Notes in Computer Science book series (LNCS, volume 7955)


Linked Data is a gigantic, constantly growing and extremely valuable resource, but its usage is still heavily dependent on (i) the familiarity of end users with RDF’s graph data model and its query language, SPARQL, and (ii) knowledge about available datasets and their contents. Intelligent keyword search over Linked Data is currently being investigated as a means to overcome these barriers to entry in a number of different approaches, including semantic search engines and the automatic conversion of natural language questions into structured queries. Our work addresses the specific challenge of mapping keywords to Linked Data resources, and proposes a novel method for this task. By exploiting the graph structure within Linked Data we determine which properties between resources are useful to discover, or directly express, semantic similarity. We also propose a novel scoring function to rank results. Experiments on a publicly available dataset show a 17% improvement in Mean Reciprocal Rank over the state of the art.
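Mean Reciprocal Rank, the measure behind the reported improvement, averages the reciprocal of the rank at which the first correct result appears for each query. The following is a minimal sketch of that standard definition, not the authors' evaluation code; the function names and the `ex:` resource identifiers are hypothetical and chosen only for illustration.

```python
from typing import Hashable, Iterable, Sequence, Set, Tuple


def reciprocal_rank(ranked: Sequence[Hashable], relevant: Set[Hashable]) -> float:
    """Return 1/rank of the first relevant candidate, or 0.0 if none is found."""
    for position, candidate in enumerate(ranked, start=1):
        if candidate in relevant:
            return 1.0 / position
    return 0.0


def mean_reciprocal_rank(
    runs: Iterable[Tuple[Sequence[Hashable], Set[Hashable]]]
) -> float:
    """Average the reciprocal rank over all (ranking, relevant set) pairs."""
    scores = [reciprocal_rank(ranked, relevant) for ranked, relevant in runs]
    return sum(scores) / len(scores) if scores else 0.0


# Hypothetical keyword-to-resource rankings (identifiers are made up).
example_runs = [
    (["ex:a", "ex:b"], {"ex:a"}),  # correct resource ranked first
    (["ex:a", "ex:b"], {"ex:b"}),  # correct resource ranked second
    (["ex:a"], {"ex:c"}),          # correct resource not returned
]
mrr = mean_reciprocal_rank(example_runs)
```

Under this definition a query whose correct resource is never returned contributes zero, so the measure rewards methods that place a correct mapping near the top of the ranking.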


Keywords (added by machine, not by the authors): Semantic Similarity, Query Expansion, Mean Reciprocal Rank, Labelling Property, Natural Language Interface



Copyright information

© Springer-Verlag Berlin Heidelberg 2013

Authors and Affiliations

  • Isabelle Augenstein (1)
  • Anna Lisa Gentile (1)
  • Barry Norton (2)
  • Ziqi Zhang (1)
  • Fabio Ciravegna (1)

  1. Department of Computer Science, University of Sheffield, UK
  2. Ontotext, UK
