Named Entity Recognition from Arabic-French Herbalism Parallel Corpora

  • Conference paper
Automatic Processing of Natural-Language Electronic Texts with NooJ (NooJ 2015)

Abstract

Owing to the adverse health effects of chemical drugs and antibiotics, herbal medicine has seen a resurgence of interest in recent years. The use of medicinal plants is thus widely regarded as an effective and lucrative treatment, especially in Asia and Africa. The objective of this work is to build a system that identifies medicinal plant names in French-Arabic parallel corpora. The corpora consist of texts drawn from the multilingual encyclopedia Wikipedia. Named Entities are identified through several types of patterns, which are represented by a set of transducers. The prototype is implemented on the NooJ linguistic platform using a set of morphological and syntactic grammars, and it was tested on a French-Arabic parallel corpus collected from Wikipedia. The results are promising according to the evaluation measures obtained.
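
To make the idea of a recognition pattern concrete, the following is a minimal sketch in Python rather than in NooJ's own grammar formalism. It assumes a hypothetical pattern type (a trigger word followed by a Latin binomial name) and a hypothetical trigger list; neither is taken from the paper, whose actual grammars are not described in the abstract. The `annotate` function mimics a transducer in the loose sense that it reads the input and emits it with added annotation tags.

```python
import re

# Hypothetical French trigger words; the paper's real trigger lists are not given here.
TRIGGERS = ("plante", "herbe")

# Assumed name shape: capitalized genus followed by a lowercase species epithet.
NAME = r"[A-Z][a-z]+ [a-z]+"

# Pattern: a trigger word, one space, then a binomial-like name.
PATTERN = re.compile(r"\b(" + "|".join(TRIGGERS) + r") (" + NAME + r")")


def find_plant_names(text):
    """Return the candidate plant names that follow a trigger word."""
    return [m.group(2) for m in PATTERN.finditer(text)]


def annotate(text):
    """Rewrite the text with each recognized name wrapped in <PLANT> tags."""
    return PATTERN.sub(r"\1 <PLANT>\2</PLANT>", text)


if __name__ == "__main__":
    sentence = "La plante Mentha spicata est utilisée en infusion."
    print(find_plant_names(sentence))
    print(annotate(sentence))
```

A regular expression is used only as a stand-in; a NooJ grammar would additionally draw on dictionaries and morphological analysis, which matters especially for the Arabic side of the corpus.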

Author information

Corresponding author

Correspondence to Mohamed Aly Fall Seideh.

Copyright information

© 2016 Springer International Publishing Switzerland

About this paper

Cite this paper

Seideh, M.A.F., Fehri, H., Haddar, K. (2016). Named Entity Recognition from Arabic-French Herbalism Parallel Corpora. In: Okrut, T., Hetsevich, Y., Silberztein, M., Stanislavenka, H. (eds) Automatic Processing of Natural-Language Electronic Texts with NooJ. NooJ 2015. Communications in Computer and Information Science, vol 607. Springer, Cham. https://doi.org/10.1007/978-3-319-42471-2_17

  • DOI: https://doi.org/10.1007/978-3-319-42471-2_17

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-42470-5

  • Online ISBN: 978-3-319-42471-2

  • eBook Packages: Computer Science, Computer Science (R0)
