KI - Künstliche Intelligenz, Volume 26, Issue 4, pp 381–390

Connecting Question Answering and Conversational Agents

Contextualizing German Questions for Interactive Question Answering Systems
  • Ulli Waltinger
  • Alexa Breuing
  • Ipke Wachsmuth


Research results in the field of Question Answering (QA) have shown that the classification of natural language questions significantly contributes to the accuracy of the generated answers. In this paper we present an approach that extends prevalent question classification techniques by considering further contextual information provided by the questions. We focus on improving the conversational abilities of existing interactive interfaces by enhancing their underlying QA systems in terms of response time and correctness. To this end, we introduce a method based on a tripartite contextualization. First, we present a comprehensive question classification experiment based on machine learning, using two different datasets and various feature sets for the German language. Second, we propose a method for detecting the focus chunk of a given question, that is, for identifying which part of the question is fundamentally relevant to the answer and which part merely specifies it further. Third, we investigate how to identify and label the topic of a given question by means of a human-judgment experiment. We show that the resulting contextualization method improves existing question answering systems and enhances their application within interactive scenarios.


Keywords: Interactive question answering, Question classification, Topic spotting, Machine learning



We gratefully acknowledge financial support of the German Research Foundation (DFG) through EXC 277 Cognitive Interaction Technology (CITEC) at Bielefeld University.



Copyright information

© Springer-Verlag 2012

Authors and Affiliations

  1. Corporate Technology, Siemens AG, Munich, Germany
  2. Artificial Intelligence Group, Bielefeld University, Bielefeld, Germany
