Morphological Analysis and Generation of Monolingual and Bilingual Medical Lexicons

  • Annibale Elia
  • Alessandro Maisto
  • Serena Pelosi
Conference paper
Part of the Communications in Computer and Information Science book series (CCIS, volume 537)


Efficiently extracting and managing very large quantities of meaningful data in a delicate sector like healthcare requires sophisticated linguistic strategies and computational solutions. In the research described here, we address the semantic dimension of the formative elements of medical words in monolingual and bilingual environments. The purpose is to automatically build Italian–English medical lexical resources by grounding their analysis and generation on the manipulation of their constituent morphemes. This approach has a significant impact on the automatic analysis of neologisms, which are typical of the medical domain. We created two electronic dictionaries of morphemes and a morphological finite-state transducer which, together, find all possible combinations of prefixes, confixes, and suffixes, and can annotate and translate the terms contained in a medical corpus according to the meaning of the morphemes that compose them. To enable the machine to “understand” medical multiword expressions as well, we designed a syntactic grammar net that includes several paths based on different combinations of nouns, adjectives, and prepositions.
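
The morpheme-based analysis and translation summarised in the abstract can be illustrated with a minimal Python sketch. It is not the authors' transducer or their dictionaries: the lexicon entries, the glosses, and the names `Morph`, `segment`, and `translate` are all hypothetical, and the prefix/confix/suffix ordering is a simplifying assumption. The sketch only shows the general idea of splitting an Italian medical term into known morphemes and recomposing an English form and a gloss from them.

```python
from typing import Dict, Iterator, List, NamedTuple, Tuple


class Morph(NamedTuple):
    kind: str   # "prefix", "confix" or "suffix"
    it: str     # Italian surface form
    en: str     # English counterpart
    gloss: str  # rough meaning of the morpheme


Lexicon = Dict[str, Tuple[str, str]]

# Toy, hand-written entries (assumptions for illustration only).
PREFIXES: Lexicon = {"iper": ("hyper", "excess"), "ipo": ("hypo", "deficiency")}
CONFIXES: Lexicon = {
    "gastro": ("gastro", "stomach"),
    "gastr": ("gastr", "stomach"),
    "enter": ("enter", "intestine"),
    "cardio": ("cardio", "heart"),
    "glic": ("glyc", "sugar"),
}
SUFFIXES: Lexicon = {
    "ite": ("itis", "inflammation"),
    "patia": ("pathy", "disease"),
    "emia": ("emia", "blood condition"),
}

# Morpheme classes in the order they may appear inside a word.
STAGES: List[Tuple[str, Lexicon]] = [
    ("prefix", PREFIXES),
    ("confix", CONFIXES),
    ("suffix", SUFFIXES),
]


def segment(word: str, stage: int = 0) -> Iterator[List[Morph]]:
    """Yield every split of `word` into prefixes, then confixes, then one final suffix."""
    for idx in range(stage, len(STAGES)):
        kind, lexicon = STAGES[idx]
        for form, (en, gloss) in lexicon.items():
            if not word.startswith(form):
                continue
            head, rest = Morph(kind, form, en, gloss), word[len(form):]
            if kind == "suffix":
                if not rest:  # a suffix must consume the end of the word
                    yield [head]
            else:
                # Stay in the same class or move forward, never backwards.
                yield from ([head] + tail for tail in segment(rest, idx))


def translate(word: str) -> List[Tuple[str, str]]:
    """Return (English form, gloss) for each analysis found; empty if none."""
    return [
        ("".join(m.en for m in analysis), " + ".join(m.gloss for m in analysis))
        for analysis in segment(word.lower())
    ]
```

With these toy entries, `translate("gastroenterite")` returns `[("gastroenteritis", "stomach + intestine + inflammation")]`, while a word with no matching morphemes, such as `"tavolo"`, returns an empty list. Returning every analysis rather than the first one mirrors the abstract's statement that all possible morpheme combinations are found.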


Morphosemantics · Machine translation · Dictionary population · Neoclassical formative elements



Copyright information

© Springer International Publishing Switzerland 2015

Authors and Affiliations

  • Annibale Elia (1)
  • Alessandro Maisto (1)
  • Serena Pelosi (1)
  1. Department of Political, Social, and Communication Sciences, University of Salerno, Fisciano, Italy
