A Possibilistic Approach for Arabic Domain Terminology Extraction and Translation

  • Wiem Lahbib
  • Ibrahim Bounhas
  • Yahya Slimani
Conference paper
Part of the Communications in Computer and Information Science book series (CCIS, volume 935)

Abstract

This paper proposes a hybrid possibilistic approach for bilingual terminology extraction using possibility and necessity measures. First, we extract domain-relevant terms from the source language. Second, we build a co-occurrence-based translation graph, which is mined to translate these terms into the target language. We compare our approach with several state-of-the-art approaches. Experimental results show that the possibilistic approach achieves better results in terms of Recall, Precision and Mean Average Precision (MAP). The p-values computed for the differences between the compared approaches indicate that the improvement is statistically significant.

Keywords

Arabic bilingual terminology; Possibility theory; Graph-mining

Copyright information

© Springer Nature Switzerland AG 2018

Authors and Affiliations

  • Wiem Lahbib (1, 2)
  • Ibrahim Bounhas (1, 2, 3)
  • Yahya Slimani (1, 2, 4)

  1. LISI Laboratory of Computer Science for Industrial Systems, Carthage University, Tunis, Tunisia
  2. JARIR: Joint Group for Artificial Reasoning and Information Retrieval, Manouba, Tunisia
  3. Higher Institute of Documentation, La Manouba University, Manouba, Tunisia
  4. Higher Institute of Multimedia Arts, La Manouba University, Manouba, Tunisia
