Open-Domain Question Answering Framework Using Wikipedia

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 9992)

Abstract

This paper explores the feasibility of an open-domain, automated question answering framework that leverages Wikipedia’s knowledge base. While Wikipedia implicitly contains answers to common questions, the ambiguity of natural language and the difficulty of developing an information retrieval process that produces specific answers present significant challenges. However, observational analysis suggests that the syntactic and lexical structure of a sentence can be disregarded when a question contains a specific target entity (words that identify a person, location or organisation) and queries a property related to that entity. To investigate this, we implemented an algorithmic process that extracted the target entity from the question using CRF-based named entity recognition (NER) and treated all remaining words as potential properties. Using DBpedia, an ontological database of Wikipedia’s knowledge, we searched for the closest matching property that would produce an answer by applying standard string matching algorithms, including the Levenshtein distance, similar text and Dice’s coefficient. Our experimental results illustrate that using Wikipedia as a knowledge base produces high precision for questions that contain a single unambiguous entity as the subject, but lower accuracy for questions where the entity appears as part of the object.
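
To make the property-matching step concrete, the following is a minimal sketch of two of the string measures named in the abstract, the Levenshtein distance and Dice’s coefficient, used to pick the closest candidate property for a question word. It is not the authors’ implementation: the abstract does not say how the scores are combined, so ranking by Dice’s coefficient with edit distance as a tie-breaker is an assumption, as are the function name `closest_property` and the example property names. Entity extraction and the DBpedia lookup are omitted.

```python
from collections import Counter


def levenshtein(a: str, b: str) -> int:
    """Edit distance between a and b (insert, delete, substitute cost 1)."""
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            current.append(min(
                previous[j] + 1,               # deletion
                current[j - 1] + 1,            # insertion
                previous[j - 1] + (ca != cb),  # substitution
            ))
        previous = current
    return previous[-1]


def dice(a: str, b: str) -> float:
    """Dice's coefficient over character bigrams (multiset overlap)."""
    bigrams_a = Counter(a[i:i + 2] for i in range(len(a) - 1))
    bigrams_b = Counter(b[i:i + 2] for i in range(len(b) - 1))
    total = sum(bigrams_a.values()) + sum(bigrams_b.values())
    if total == 0:
        # Neither string has a bigram; fall back to exact equality.
        return 1.0 if a == b else 0.0
    overlap = sum((bigrams_a & bigrams_b).values())
    return 2.0 * overlap / total


def closest_property(word: str, properties: list[str]) -> str:
    """Return the candidate property most similar to word.

    Assumption (not from the paper): rank by Dice's coefficient and
    break ties in favour of the smaller Levenshtein distance.
    """
    word = word.lower()
    return max(
        properties,
        key=lambda p: (dice(word, p.lower()), -levenshtein(word, p.lower())),
    )
```

For a question such as "What is the population of Hobart?", once the entity has been removed, calling `closest_property("population", ["spouse", "birthPlace", "populationTotal"])` selects `populationTotal` under this scoring. The candidate list here is illustrative; in practice it would be the set of properties attached to the extracted entity.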

Acknowledgement

This work was supported by the Industrial Strategic Technology Development Program, 10052955, Experiential Knowledge Platform Development Research for the Acquisition and Utilization of Field Expert Knowledge, funded by the Ministry of Trade, Industry & Energy (MI, Korea). This work was also supported by the Office of Naval Research under grant N62909-16-1-2219.

Author information

Corresponding author

Correspondence to Byeong Ho Kang.

Copyright information

© 2016 Springer International Publishing AG

About this paper

Cite this paper

Ameen, S., Chung, H., Han, S.C., Kang, B.H. (2016). Open-Domain Question Answering Framework Using Wikipedia. In: Kang, B.H., Bai, Q. (eds) AI 2016: Advances in Artificial Intelligence. AI 2016. Lecture Notes in Computer Science, vol 9992. Springer, Cham. https://doi.org/10.1007/978-3-319-50127-7_55

  • DOI: https://doi.org/10.1007/978-3-319-50127-7_55

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-50126-0

  • Online ISBN: 978-3-319-50127-7

  • eBook Packages: Computer Science, Computer Science (R0)
