Utilizing typed dependency subtree patterns for answer sentence generation in question answering systems

Abstract

Question Answering over Linked Data (QALD) refers to the use of Linked Data by question answering systems. It has recently become increasingly popular because it opens up the massive Linked Data cloud, a rich source of encoded knowledge. However, a major shortfall of current QALD systems is that they focus on presenting a single fact, or factoid answer, derived using SPARQL (SPARQL Protocol and RDF Query Language) queries. There is now increased interest in developing human-like systems that can answer questions and even hold conversations by constructing sentences as humans do. In this paper, we introduce a new answer construction and presentation system, which utilizes the linguistic structure of the source question and the factoid answer to construct an answer sentence that closely emulates a human-generated answer. We employ both Semantic Web technology and linguistic structure to construct the answer sentences. The core of the research rests on extracting dependency subtree patterns from the questions and using them in conjunction with the factoid answer to generate an answer sentence with the natural feel of a human response to the same question. We evaluated the system for both linguistic accuracy and naturalness using human evaluation. These evaluations showed that the proposed approach is able to generate answer sentences with linguistic accuracy and natural readability quotients of more than 70%. In addition, we carried out a feasibility analysis on using automatic metrics for answer sentence evaluation. The results from this phase showed that there is not a strong correlation between the results of automatic metric evaluation and the human ratings of the machine-generated answers.
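The idea of combining a dependency subtree pattern with a factoid answer can be illustrated with a minimal sketch. It is not the system described in the paper: the example question, the hand-written parse, the single pattern, the hard-coded preposition "in", and the names `Dep`, `subtree`, `phrase` and `answer_sentence` are all assumptions made for illustration. A real pipeline would obtain the typed dependencies from a parser and the factoid from a SPARQL query.

```python
from typing import List, NamedTuple


class Dep(NamedTuple):
    rel: str   # typed dependency label, e.g. "nsubjpass"
    head: int  # 1-based index of the head token (0 means ROOT)
    dep: int   # 1-based index of the dependent token


# Hand-written parse of a hypothetical question (assumption, not parser output).
TOKENS = ["Where", "was", "Barack", "Obama", "born"]
DEPS = [
    Dep("advmod", 5, 1),
    Dep("auxpass", 5, 2),
    Dep("compound", 4, 3),
    Dep("nsubjpass", 5, 4),
    Dep("root", 0, 5),
]


def subtree(head: int, deps: List[Dep]) -> List[int]:
    """Return the sorted token indices of the subtree rooted at `head`."""
    indices = {head}
    for d in deps:
        if d.head == head:
            indices |= set(subtree(d.dep, deps))
    return sorted(indices)


def phrase(head: int, tokens: List[str], deps: List[Dep]) -> str:
    """Render the subtree rooted at `head` in its original word order."""
    return " ".join(tokens[i - 1] for i in subtree(head, deps))


def answer_sentence(tokens: List[str], deps: List[Dep], factoid: str) -> str:
    """Build an answer sentence from one illustrative subtree pattern."""
    root = next(d.dep for d in deps if d.rel == "root")
    children = {d.rel: d.dep for d in deps if d.head == root}
    # Illustrative pattern: wh-adverb + passive auxiliary + passive subject.
    if {"advmod", "auxpass", "nsubjpass"} <= set(children):
        subject = phrase(children["nsubjpass"], tokens, deps)
        aux = tokens[children["auxpass"] - 1]
        verb = tokens[root - 1]
        # The preposition "in" is a hard-coded assumption for "Where" questions.
        return f"{subject} {aux} {verb} in {factoid}."
    # No pattern matched: fall back to the bare factoid answer.
    return factoid + "."


print(answer_sentence(TOKENS, DEPS, "Honolulu"))
```

The sketch reuses the subject subtree and the verb group from the question and slots the factoid into the position vacated by the wh-word, which is the general intuition behind pattern-based answer sentence construction.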

Notes

  1. https://jena.apache.org

Acknowledgements

The research reported in this paper is part of research funded by the Auckland University of Technology.

Author information

Corresponding author

Correspondence to Rivindu Perera.

Cite this article

Perera, R., Nand, P. & Naeem, A. Utilizing typed dependency subtree patterns for answer sentence generation in question answering systems. Prog Artif Intell 6, 105–119 (2017). https://doi.org/10.1007/s13748-017-0113-9

Keywords

  • Answer presentation
  • Question answering
  • Dependency parsing
  • Linked Data
  • Semantic Web