Question Answering

  • Yassine Benajiba
  • Paolo Rosso
  • Lahsen Abouenour
  • Omar Trigui
  • Karim Bouzoubaa
  • Lamia Belguith
Chapter

Abstract

Question Answering (QA) is a task that aims at finding a precise answer to a specific user question. The task is challenging because both the question and the answer are formulated in natural language. To build an efficient QA system, one therefore has to rely on different NLP parsers to extract the information needed to compute the most relevant answer(s). The challenges are even greater when the target language has a rich, complex morphology: not only does the lack of resources and tools hinder the task, but the nature of the language also requires some preprocessing before statistical models can operate efficiently. In this chapter, we describe in detail why QA is more complex for morphologically rich languages. We also summarize the literature published on this task and describe in more detail some recent research conducted to build Arabic QA systems.

Copyright information

© Springer-Verlag Berlin Heidelberg 2014

Authors and Affiliations

  • Yassine Benajiba (1)
  • Paolo Rosso (2)
  • Lahsen Abouenour (3)
  • Omar Trigui (4)
  • Karim Bouzoubaa (3)
  • Lamia Belguith (4)

  1. Thomson Reuters, New York, USA
  2. Pattern Recognition and Human Language Technology (PRHLT) Research Center, Universitat Politècnica de València, Valencia, Spain
  3. Mohamed V-Agdal University, Rabat, Morocco
  4. ANLP Research Group-MIRACL Laboratory, University of Sfax, Sfax, Tunisia