A Terminology Indexing Based on Heuristics Using Linguistic Resources for Medical Textual Corpus

  • Ali Benafia
  • Ramdane Maamri
  • Zaidi Sahnoun
Part of the Studies in Computational Intelligence book series (SCI, volume 473)


Term extraction is an important step in building an indexing resource, and robust tools are available for many languages. This complex process, which identifies candidate terms that may become indexes for annotations, often suffers from a lack of relevance in the extracted terms. As a result, term extractors must be robust enough to handle errors and suggest better results without encumbering the user with too many proposed indexes. In this respect, we propose a new indexing approach based on a hybrid terminology-extraction model that filters by eliminating terms and operates on a corpus of medical texts.


Term extraction, medical terminology, MeSH, ADM, syntactic patterns, n-grams





Copyright information

© Springer International Publishing Switzerland 2013

Authors and Affiliations

  • Ali Benafia (1, 3)
  • Ramdane Maamri (2, 3)
  • Zaidi Sahnoun (2, 3)
  1. Université El Hadj Lakhdar de Batna, Batna, Algeria
  2. Département d’Informatique Campus, Constantine, Algeria
  3. Laboratoire LIRE, Université de Constantine, Constantine, Algeria
