KI - Künstliche Intelligenz, Volume 32, Issue 1, pp 19–26

A Quality Evaluation of Combined Search on a Knowledge Base and Text

  • Hannah Bast
  • Björn Buchhold
  • Elmar Haussmann
Technical Contribution


We provide a quality evaluation of KB+Text search, a deep integration of knowledge base search and standard full-text search. A knowledge base (KB) is a set of subject–predicate–object triples with a common naming scheme. The standard query language is SPARQL, where queries are essentially lists of triples with variables. KB+Text search extends this by a special occurs-with predicate, which can be used to express the co-occurrence of words in the text with mentions of entities from the knowledge base. Both pure KB search and standard full-text search are included as special cases. We evaluate the result quality of KB+Text search on three different query sets. The corpus is the full version of the English Wikipedia (2.4 billion word occurrences) combined with the YAGO knowledge base (26 million triples). We provide a web application to reproduce our evaluation, which is accessible via
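To make the occurs-with idea concrete, the following is a minimal sketch on toy data, not the system's actual implementation or query syntax. The triples, the text contexts, and the function names (`kb_match`, `occurs_with`, `kb_text_query`) are all hypothetical and chosen only for illustration. It evaluates a query that could be written informally as `?x is-a Astronaut . ?x occurs-with "moon"`, and it treats a context as a plain set of words together with the entities mentioned in it.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    subject: str
    predicate: str
    obj: str


class Context(NamedTuple):
    words: frozenset      # words occurring in one piece of text
    entities: frozenset   # KB entities mentioned in that same piece of text


# Hypothetical toy knowledge base (subject, predicate, object).
KB = [
    Triple("Neil_Armstrong", "is-a", "Astronaut"),
    Triple("Buzz_Aldrin", "is-a", "Astronaut"),
    Triple("Marie_Curie", "is-a", "Scientist"),
]

# Hypothetical toy text, already annotated with entity mentions.
TEXT = [
    Context(frozenset({"first", "walked", "on", "the", "moon"}),
            frozenset({"Neil_Armstrong"})),
    Context(frozenset({"second", "person", "on", "the", "moon"}),
            frozenset({"Buzz_Aldrin"})),
    Context(frozenset({"crater", "on", "the", "moon", "named", "after"}),
            frozenset({"Marie_Curie"})),
    Context(frozenset({"discovered", "radium"}),
            frozenset({"Marie_Curie"})),
]


def kb_match(predicate, obj):
    """Pure KB search: all ?x such that (?x, predicate, obj) is in the KB."""
    return {t.subject for t in KB if t.predicate == predicate and t.obj == obj}


def occurs_with(words):
    """Pure text search: entities mentioned in a context containing all words."""
    required = set(words)
    return {e for c in TEXT if required <= c.words for e in c.entities}


def kb_text_query(predicate, obj, words):
    """Combined search: entities satisfying the triple AND the co-occurrence."""
    return sorted(kb_match(predicate, obj) & occurs_with(words))


print(kb_text_query("is-a", "Astronaut", ["moon"]))
```

In this sketch, `kb_match` alone corresponds to pure KB search and `occurs_with` alone corresponds to full-text search restricted to entity mentions, which mirrors the statement that both are special cases of the combined query. The word "moon" also co-occurs with `Marie_Curie` in the toy text, so the KB triple is what restricts the combined result to astronauts.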


Keywords: Knowledge bases, Semantic search, KB+Text search, Quality evaluation



Copyright information

© Springer-Verlag GmbH Deutschland 2017

Authors and Affiliations

  1. Department of Computer Science, University of Freiburg, Freiburg, Germany
