Abstract
In many linguistic contexts, repeated mentions of objects and entities are reduced to pronouns. Correct interpretation of pronouns plays an important role in the construction of meaning, so pronominal anaphora resolution remains an important task for most natural language processing applications. This paper presents a novel approach to resolving pronominal anaphora in Arabic texts. First, we identify non-referential pronouns using an iterative self-training SVM method. Then, we resolve the antecedents by combining a Q-learning method with a Word2Vec-based method. For each anaphoric pronoun, the Q-learning method seeks to optimize the sequence of criteria chosen to evaluate the candidate antecedents and select the best one. It uses syntactic criteria as preference factors that favor some candidate antecedents over others. The Word2Vec method uses the AraVec 3.0 word embedding model to measure the semantic similarity between antecedent word vectors and pronoun context vectors. To combine the Q-learning and Word2Vec results, we use a ranking aggregation method. The resolution system is evaluated on literary, journalistic, and technical manual texts. Its precision reaches up to 80.82%.
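As an illustration of the final combination step, the sketch below ranks candidate antecedents by cosine similarity to a pronoun context vector and then merges that ranking with a second one. The abstract does not say which ranking aggregation method is used, so the Borda count here is an assumption, as are the function names, the toy two-dimensional vectors, and the hard-coded syntactic ranking standing in for the Q-learning output.

```python
from math import sqrt


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = sqrt(sum(a * a for a in u))
    norm_v = sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def rank_by_similarity(context_vec, candidate_vecs):
    """Return candidate names sorted by decreasing similarity to the context vector."""
    return sorted(
        candidate_vecs,
        key=lambda name: cosine(context_vec, candidate_vecs[name]),
        reverse=True,
    )


def borda_aggregate(rankings):
    """Merge rankings with a Borda count (assumed method, not confirmed by the paper).

    In a ranking of n candidates, position 0 earns n - 1 points and the last
    position earns 0. Ties in total score are broken alphabetically.
    """
    scores = {}
    for ranking in rankings:
        n = len(ranking)
        for position, candidate in enumerate(ranking):
            scores[candidate] = scores.get(candidate, 0) + (n - 1 - position)
    return sorted(scores, key=lambda c: (-scores[c], c))


# Toy data: hypothetical 2-d vectors in place of real AraVec embeddings.
context = [1.0, 0.0]
candidates = {"A": [1.0, 0.1], "B": [0.0, 1.0], "C": [0.7, 0.7]}

semantic_ranking = rank_by_similarity(context, candidates)
# Hypothetical ranking, as if produced by the syntactic (Q-learning) component.
syntactic_ranking = ["C", "B", "A"]

final_ranking = borda_aggregate([semantic_ranking, syntactic_ranking])
print(semantic_ranking, final_ranking)
```

With these toy inputs the two components disagree on the best candidate, and the aggregated ranking picks the one that scores well in both lists.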
Notes
- 1.
- 2. Cataphora is the case where the anaphor precedes its antecedent.
- 3. Clitics are grammatical elements attached to the root of a word.
- 4. Short vowels in Arabic are represented by symbols called diacritics.
- 5. The filter resamples a dataset by applying the Synthetic Minority Oversampling TEchnique (SMOTE). The amount of SMOTE and the number of nearest neighbors may be specified as needed so that the two classes have a balanced number of instances.
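The oversampling idea in note 5 can be sketched as follows. This is a simplified variant written for illustration, not the filter the authors used: it picks base samples at random instead of iterating over every minority instance, and the function name `smote` and its parameters `n_synthetic`, `k`, and `seed` are assumptions.

```python
import random


def smote(minority, n_synthetic, k=5, seed=0):
    """Create n_synthetic points by interpolating between minority samples.

    Each synthetic point lies on the segment between a randomly chosen
    minority sample and one of its k nearest minority neighbours.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)

    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    synthetic = []
    for _ in range(n_synthetic):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Indices of the k nearest minority neighbours of base, excluding itself.
        neighbours = sorted(
            (j for j in range(len(minority)) if j != i),
            key=lambda j: sq_dist(base, minority[j]),
        )[:k]
        other = minority[rng.choice(neighbours)]
        gap = rng.random()  # interpolation factor in [0, 1)
        synthetic.append([b + gap * (o - b) for b, o in zip(base, other)])
    return synthetic


# Toy minority class: four 2-d points at the corners of the unit square.
minority_points = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
new_points = smote(minority_points, n_synthetic=6, k=2)
print(len(new_points))
```

Raising `n_synthetic` plays the role of the "amount of SMOTE" mentioned in the note, and `k` is the number of nearest neighbors considered.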
Copyright information
© 2019 Springer Nature Switzerland AG
About this paper
Cite this paper
Mathlouthi Bouzid, S., Ben Othmane Zribi, C. (2019). Aggregation of Word Embedding and Q-learning for Arabic Anaphora Resolution. In: Smaïli, K. (eds) Arabic Language Processing: From Theory to Practice. ICALP 2019. Communications in Computer and Information Science, vol 1108. Springer, Cham. https://doi.org/10.1007/978-3-030-32959-4_7
DOI: https://doi.org/10.1007/978-3-030-32959-4_7
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-32958-7
Online ISBN: 978-3-030-32959-4
eBook Packages: Computer Science, Computer Science (R0)