DBpedia and YAGO Based System for Answering Questions in Natural Language

  • Conference paper
Computational Collective Intelligence (ICCCI 2018)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 11055)

Abstract

In this paper we propose a method for answering class 1 and class 2 questions (out of the 5 classes defined by Moldovan for the TREC conference) based on DBpedia and YAGO. The method generates a dependency tree for the query and looks in it for paths leading from the root to the named entity of interest. These paths (referred to below as fibers) are candidate representations of the user's actual intention. The analysis of the question consists of three stages: query analysis, query breakdown and information retrieval. During these stages the entities of interest, their attributes and the question domain are detected, and the question is converted into a SPARQL query against the DBpedia and YAGO databases. We present improvements to the methods and discuss the quality of the modified solution. We also present a system for evaluating the implemented methods, showing that they are viable for use in real applications. Finally, we discuss the results and indicate future directions of the work.
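
The notion of a fiber can be illustrated with a minimal sketch. It is not the authors' implementation: the dependency tree is written by hand rather than produced by a parser, the helper names `fiber` and `to_sparql` are hypothetical, and the mapping from the root word to a DBpedia ontology property is an assumption made only for illustration.

```python
from typing import Dict, List, Optional

# Hand-written (hypothetical) dependency tree for "Who is the author of Hamlet?".
# Each token maps to its head; the root token maps to None.
HEADS: Dict[str, Optional[str]] = {
    "author": None,
    "Who": "author",
    "is": "author",
    "the": "author",
    "Hamlet": "author",
    "of": "Hamlet",
}


def fiber(heads: Dict[str, Optional[str]], entity: str) -> List[str]:
    """Return the path from the root of the tree down to the named entity."""
    path = [entity]
    # Climb from the entity to the root by following head links.
    while heads[path[-1]] is not None:
        path.append(heads[path[-1]])
    # Reverse so that the path reads root -> ... -> entity.
    return list(reversed(path))


def to_sparql(path: List[str]) -> str:
    """Build a SPARQL query string from a fiber.

    Assumption: the root word names a DBpedia ontology property and the
    last word names a DBpedia resource. A real system would need entity
    linking and property matching instead of this direct substitution.
    """
    attribute, entity = path[0], path[-1]
    return (
        "SELECT ?answer WHERE { "
        f"<http://dbpedia.org/resource/{entity}> "
        f"<http://dbpedia.org/ontology/{attribute}> ?answer }}"
    )


if __name__ == "__main__":
    hamlet_fiber = fiber(HEADS, "Hamlet")
    print(hamlet_fiber)
    print(to_sparql(hamlet_fiber))
```

The sketch only builds the query string; sending it to a SPARQL endpoint is left out so that the example stays self-contained.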

Corresponding author

Correspondence to Tomasz Boiński.

Copyright information

© 2018 Springer Nature Switzerland AG

About this paper

Cite this paper

Boiński, T., Szymański, J., Dudek, B., Zalewski, P., Dompke, S., Czarnecka, M. (2018). DBpedia and YAGO Based System for Answering Questions in Natural Language. In: Nguyen, N., Pimenidis, E., Khan, Z., Trawiński, B. (eds) Computational Collective Intelligence. ICCCI 2018. Lecture Notes in Computer Science, vol 11055. Springer, Cham. https://doi.org/10.1007/978-3-319-98443-8_35

  • DOI: https://doi.org/10.1007/978-3-319-98443-8_35

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-98442-1

  • Online ISBN: 978-3-319-98443-8

  • eBook Packages: Computer Science, Computer Science (R0)
