Open-Domain Question Answering Framework Using Wikipedia

Ameen, Saleem; Chung, Hyunsuk; Han, Soyeon Caren; Kang, Byeong Ho

doi:10.1007/978-3-319-50127-7_55

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 9992)

Included in the following conference series:

Australasian Joint Conference on Artificial Intelligence

Abstract

This paper explores the feasibility of implementing a model for an open domain, automated question and answering framework that leverages Wikipedia’s knowledgebase. While Wikipedia implicitly comprises answers to common questions, the disambiguation of natural language and the difficulty of developing an information retrieval process that produces answers with specificity present pertinent challenges. However, observational analysis suggests that it is possible to discount the syntactical and lexical structure of a sentence in contexts where questions contain a specific target entity (words that identify a person, location or organisation) and that correspondingly query a property related to it. To investigate this, we implemented an algorithmic process that extracted the target entity from the question using CRF based named entity recognition (NER) and utilised all remaining words as potential properties. Using DBPedia, an ontological database of Wikipedia’s knowledge, we searched for the closest matching property that would produce an answer by applying standardised string matching algorithms including the Levenshtein distance, similar text and Dice’s coefficient. Our experimental results illustrate that using Wikipedia as a knowledgebase produces high precision for questions that contain a singular unambiguous entity as the subject, but lowered accuracy for questions where the entity exists as part of the object.
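The property-matching step described in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it assumes the target entity has already been removed from the question, the candidate property names are hypothetical stand-ins for what a DBPedia lookup might return, and averaging the two scores is an arbitrary choice made for the example. The "similar text" measure is omitted.

```python
from collections import Counter


def levenshtein(a: str, b: str) -> int:
    """Edit distance (insertions, deletions, substitutions) between a and b."""
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def levenshtein_similarity(a: str, b: str) -> float:
    """Normalise the edit distance to a similarity in [0, 1]."""
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1.0 - levenshtein(a, b) / longest


def bigrams(s: str) -> Counter:
    """Multiset of lower-cased character bigrams."""
    s = s.lower()
    return Counter(s[i:i + 2] for i in range(len(s) - 1))


def dice_coefficient(a: str, b: str) -> float:
    """Dice's coefficient over character bigrams."""
    ba, bb = bigrams(a), bigrams(b)
    total = sum(ba.values()) + sum(bb.values())
    if total == 0:
        return 1.0 if a.lower() == b.lower() else 0.0
    return 2.0 * sum((ba & bb).values()) / total


def best_property(question_words, properties):
    """Return (property, score) for the closest word/property pair.

    Assumption: the combined score is the plain average of the two measures.
    """
    best = (None, 0.0)
    for word in question_words:
        for prop in properties:
            score = (levenshtein_similarity(word.lower(), prop.lower())
                     + dice_coefficient(word, prop)) / 2
            if score > best[1]:
                best = (prop, score)
    return best


# Hypothetical input: words left over after the entity has been extracted,
# matched against made-up property names.
words = ["who", "is", "the", "spouse"]
candidates = ["birthPlace", "spouse", "almaMater"]
print(best_property(words, candidates))
```

Treating every remaining word as a potential property keeps the sketch independent of sentence structure, which mirrors the idea in the abstract of discounting syntax when a question names a single target entity.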

Acknowledgement

This work was supported by the Industrial Strategic Technology Development Program, 10052955, Experiential Knowledge Platform Development Research for the Acquisition and Utilization of Field Expert Knowledge, funded by the Ministry of Trade, Industry & Energy (MI, Korea). This work was supported as part of the Office of Naval Research grant N62909-16-1-2219.

Author information

Authors and Affiliations

School of Engineering and ICT, Tasmania, 7005, Australia
Saleem Ameen, Hyunsuk Chung, Soyeon Caren Han & Byeong Ho Kang

Corresponding author

Correspondence to Byeong Ho Kang.

Editor information

Editors and Affiliations

University of Tasmania, Hobart, Australia
Byeong Ho Kang
Auckland University of Technology, Auckland, New Zealand
Quan Bai

Cite this paper

Ameen, S., Chung, H., Han, S.C., Kang, B.H. (2016). Open-Domain Question Answering Framework Using Wikipedia. In: Kang, B.H., Bai, Q. (eds) AI 2016: Advances in Artificial Intelligence. AI 2016. Lecture Notes in Computer Science, vol 9992. Springer, Cham. https://doi.org/10.1007/978-3-319-50127-7_55

DOI: https://doi.org/10.1007/978-3-319-50127-7_55
Published: 29 November 2016
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-50126-0
Online ISBN: 978-3-319-50127-7
eBook Packages: Computer Science, Computer Science (R0)