Abstract
Given the adverse health effects of chemical drugs and antibiotics, herbal medicine has seen a resurgence of interest in recent years. The use of medicinal plants is therefore widely regarded as an effective and lucrative treatment, especially in Asia and Africa. The objective of this work is to build a system that identifies the names of medicinal plants in French-Arabic parallel corpora. The corpora consist of texts drawn from the multilingual encyclopedia Wikipedia. Named entities are identified using several types of patterns, each represented by a set of transducers. The prototype is implemented on the NooJ linguistic platform using a set of morphological and syntactic grammars, and is evaluated on a French-Arabic parallel corpus collected from Wikipedia. The results obtained are promising according to the evaluation measures.
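To make the idea of pattern-based identification more concrete, the following is a minimal sketch in Python, not the authors' NooJ grammars. It assumes a toy lexicon and a few French trigger words (all hypothetical, chosen only for illustration) and uses a regular expression in place of a NooJ transducer: the input is matched against a pattern and the recognised name is rewritten with an annotation.

```python
import re

# Hypothetical toy lexicon; the entries are illustrative, not from the chapter.
PLANT_LEXICON = {"menthe", "thym", "romarin"}

# Hypothetical trigger words that may introduce a plant name in French text.
TRIGGERS = ("plante", "herbe", "feuilles de", "huile de")

# Pattern: a trigger word, whitespace, then one candidate word.
PATTERN = re.compile(
    r"\b(?:" + "|".join(re.escape(t) for t in TRIGGERS) + r")\s+(\w+)",
    re.IGNORECASE,
)


def annotate(text):
    """Wrap lexicon words that follow a trigger in <PLANT> tags."""
    def tag(match):
        word = match.group(1)
        if word.lower() in PLANT_LEXICON:
            # Keep the trigger as-is and annotate only the candidate word.
            offset = match.start(1) - match.start(0)
            return match.group(0)[:offset] + "<PLANT>" + word + "</PLANT>"
        # Candidate is not in the lexicon: leave the text unchanged.
        return match.group(0)
    return PATTERN.sub(tag, text)


print(annotate("L'huile de menthe est utilisée."))
```

A real system would combine morphological analysis, larger dictionaries, and separate grammars for Arabic; the sketch only shows the recognise-and-annotate behaviour that a transducer provides.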
© 2016 Springer International Publishing Switzerland
About this paper
Cite this paper
Seideh, M.A.F., Fehri, H., Haddar, K. (2016). Named Entity Recognition from Arabic-French Herbalism Parallel Corpora. In: Okrut, T., Hetsevich, Y., Silberztein, M., Stanislavenka, H. (eds) Automatic Processing of Natural-Language Electronic Texts with NooJ. NooJ 2015. Communications in Computer and Information Science, vol 607. Springer, Cham. https://doi.org/10.1007/978-3-319-42471-2_17
DOI: https://doi.org/10.1007/978-3-319-42471-2_17
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-42470-5
Online ISBN: 978-3-319-42471-2
eBook Packages: Computer Science (R0)