International Journal of Speech Technology

Volume 20, Issue 2, pp 339–353

A logical representation of Arabic questions toward automatic passage extraction from the Web

  • Wided Bakari
  • Patrice Bellot
  • Mahmoud Neji


With the rapid growth of Arabic electronic data on the web, information extraction, one of the major challenges of question answering, is used chiefly to build document corpora. Corpus construction is currently a recurring topic at natural language processing (NLP) conferences, alongside major themes such as information retrieval (IR), question answering (QA), and automatic summarization (AS). A question-answering system typically returns several passages in response to a user's question. For these passages to be truly informative, the system needs access to an underlying knowledge base, which requires building a corpus. The aim of our research is to build an Arabic question-answering system. Analyzing the question is the first step; the next is to retrieve from the web a passage that can serve as an appropriate answer. In this paper, we propose a method for analyzing the question and retrieving the answer passage in Arabic. For question analysis, five types of factual questions are processed. We also experiment with generating a logic representation from the declarative form of each question. Several studies have addressed logic-based approaches to question answering, but for languages other than Arabic. This representation is promising because it later helps in selecting a justifiable answer. Questions were correctly analyzed and translated into logic form with an accuracy of 64%. The automatically generated text passages achieved 87% accuracy and a c@1 score of 98%.
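The abstract reports both accuracy and c@1 without defining them. As a minimal sketch, assuming the usual definition of c@1 in question-answering evaluation (unanswered questions are credited at the system's overall accuracy rate instead of being counted as errors), the two measures can be computed as follows. The function names and the example counts are illustrative assumptions, not values from the paper.

```python
def accuracy(n_correct: int, n_total: int) -> float:
    """Fraction of all questions that were answered correctly."""
    if n_total <= 0:
        raise ValueError("n_total must be positive")
    return n_correct / n_total


def c_at_1(n_correct: int, n_unanswered: int, n_total: int) -> float:
    """c@1 = (n_R + n_U * n_R / n) / n.

    n_R = correctly answered, n_U = left unanswered, n = all questions.
    This is the commonly used definition, assumed here; the abstract
    itself does not spell the formula out.
    """
    if n_total <= 0:
        raise ValueError("n_total must be positive")
    if n_correct < 0 or n_unanswered < 0 or n_correct + n_unanswered > n_total:
        raise ValueError("counts are inconsistent with n_total")
    return (n_correct + n_unanswered * n_correct / n_total) / n_total


# Hypothetical counts: 100 questions, 60 correct, 20 unanswered, 20 wrong.
example_accuracy = accuracy(60, 100)
example_c_at_1 = c_at_1(60, 20, 100)
```

When no question is left unanswered, c@1 reduces to plain accuracy; it exceeds accuracy only when the system abstains on some questions while answering others correctly.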


Arabic, Question analysis, Answer passage retrieval, Logic representation, World Wide Web



Copyright information

© Springer Science+Business Media New York 2017

Authors and Affiliations

  1. MIR@CL, Sfax, Tunisia
  2. Faculty of Economics and Management of Sfax, University of Sfax, Sfax, Tunisia
  3. Aix-Marseille University, Marseille, France
