Abstract
This paper explores the feasibility of implementing a model for an open-domain, automated question answering framework that leverages Wikipedia’s knowledge base. While Wikipedia implicitly contains answers to common questions, the ambiguity of natural language and the difficulty of developing an information retrieval process that produces specific answers present significant challenges. However, observational analysis suggests that it is possible to disregard the syntactic and lexical structure of a sentence when a question contains a specific target entity (words that identify a person, location or organisation) and queries a property related to it. To investigate this, we implemented an algorithmic process that extracts the target entity from the question using CRF-based named entity recognition (NER) and treats all remaining words as potential properties. Using DBpedia, an ontological database of Wikipedia’s knowledge, we searched for the closest matching property that would produce an answer by applying standard string matching algorithms, including the Levenshtein distance, similar text and Dice’s coefficient. Our experimental results illustrate that using Wikipedia as a knowledge base produces high precision for questions that contain a single unambiguous entity as the subject, but lower accuracy for questions where the entity exists as part of the object.
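The property-matching step described above can be illustrated with a minimal sketch. It assumes the target entity has already been removed from the question and that a list of candidate property names for that entity is available; the function names, the example property names, and the ranking rule (Dice's coefficient first, Levenshtein distance as a tie-breaker) are illustrative assumptions, not the authors' implementation.

```python
from collections import Counter
from typing import Iterable, List, Optional


def levenshtein(a: str, b: str) -> int:
    """Edit distance with unit cost for insertion, deletion and substitution."""
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # delete ca
                               current[j - 1] + 1,       # insert cb
                               previous[j - 1] + cost))  # substitute
        previous = current
    return previous[-1]


def bigrams(s: str) -> List[str]:
    """Character bigrams of s, in order."""
    return [s[i:i + 2] for i in range(len(s) - 1)]


def dice(a: str, b: str) -> float:
    """Dice's coefficient over character bigrams (counted as multisets)."""
    x, y = bigrams(a), bigrams(b)
    if not x and not y:
        return 1.0 if a == b else 0.0
    overlap = sum((Counter(x) & Counter(y)).values())
    return 2.0 * overlap / (len(x) + len(y))


def best_property(words: Iterable[str],
                  properties: Iterable[str]) -> Optional[str]:
    """Return the property closest to any remaining question word.

    Assumed ranking: highest Dice score wins; ties go to the pair
    with the smaller Levenshtein distance.
    """
    props = list(properties)
    best_key, best_prop = None, None
    for word in words:
        for prop in props:
            w, p = word.lower(), prop.lower()
            key = (dice(w, p), -levenshtein(w, p))
            if best_key is None or key > best_key:
                best_key, best_prop = key, prop
    return best_prop


# Hypothetical example: "What is the population of Sydney?" with the
# entity "Sydney" removed; the property names are made up for illustration.
remaining = ["what", "is", "the", "population", "of"]
candidates = ["populationTotal", "areaTotal", "country"]
print(best_property(remaining, candidates))
```

Comparing every remaining word against every candidate property mirrors the idea of ignoring sentence structure: no parsing is needed, only string similarity between question words and property names.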
Acknowledgement
This work was supported by the Industrial Strategic Technology Development Program, 10052955, Experiential Knowledge Platform Development Research for the Acquisition and Utilization of Field Expert Knowledge, funded by the Ministry of Trade, Industry & Energy (MI, Korea). This work was also supported as part of the Office of Naval Research grant N62909-16-1-2219.
Copyright information
© 2016 Springer International Publishing AG
About this paper
Cite this paper
Ameen, S., Chung, H., Han, S.C., Kang, B.H. (2016). Open-Domain Question Answering Framework Using Wikipedia. In: Kang, B.H., Bai, Q. (eds) AI 2016: Advances in Artificial Intelligence. AI 2016. Lecture Notes in Computer Science, vol 9992. Springer, Cham. https://doi.org/10.1007/978-3-319-50127-7_55
DOI: https://doi.org/10.1007/978-3-319-50127-7_55
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-50126-0
Online ISBN: 978-3-319-50127-7
eBook Packages: Computer Science, Computer Science (R0)