Abstract
In this paper we use NooJ to recognize and translate medical terms with a morphosemantic approach. The medical domain is characterized by a huge number of different terms that appear in corpora with very low frequencies. For this reason, machine learning and statistical approaches do not achieve good results in this domain. In our work we apply a morphosemantic approach that takes advantage of a number of Italian and English word-formation strategies for the automatic analysis of Italian words and for the generation of Italian/English bilingual lexicons in the medical sub-code. Using NooJ we built a series of Italian and bilingual dictionaries of morphemes, a set of morphological grammars that specify how morphemes combine with each other, a syntactic grammar for the recognition of compound terms, and a Finite State Transducer (FST) for the morpheme-based translation of medical terms. This approach produces three outputs: a categorized Italian electronic dictionary of simple medical words, with labels specifying the meaning of each term; a thesaurus of simple and compound medical terms, organized in 22 medical subcategories; and an Italian/English translation of medical terms.
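The idea of analysing and translating a term through its morphemes can be illustrated with a minimal Python sketch. It is not the NooJ implementation described in the paper: the five-entry lexicon, the glosses, and the function names `segment` and `translate` are assumptions made for illustration, and a plain recursive longest-match lookup stands in for the morphological grammars and the FST.

```python
from typing import List, Optional, Tuple

# Toy bilingual morpheme lexicon: Italian morpheme -> (English morpheme, gloss).
# These entries are illustrative assumptions, not the authors' dictionaries.
MORPHEMES = {
    "gastr": ("gastr", "stomach"),
    "epat": ("hepat", "liver"),
    "cardio": ("cardio", "heart"),
    "patia": ("pathy", "disease"),
    "ite": ("itis", "inflammation"),
}


def segment(word: str) -> Optional[List[str]]:
    """Split a word into known morphemes; return None if it cannot be fully covered."""
    if not word:
        return []
    # Try the longest candidate prefix first, backtracking when the rest fails.
    for end in range(len(word), 0, -1):
        head = word[:end]
        if head in MORPHEMES:
            rest = segment(word[end:])
            if rest is not None:
                return [head] + rest
    return None


def translate(word: str) -> Optional[Tuple[str, List[str]]]:
    """Return (English form, list of glosses) for an Italian term, or None if unanalysable."""
    parts = segment(word)
    if parts is None:
        return None
    english = "".join(MORPHEMES[p][0] for p in parts)
    glosses = [MORPHEMES[p][1] for p in parts]
    return english, glosses


if __name__ == "__main__":
    for term in ("gastrite", "epatite", "cardiopatia", "tavolo"):
        print(term, "->", translate(term))
```

In this sketch the glosses play the role of the semantic labels attached to each recognized term, and a word that cannot be fully covered by known morphemes (such as `tavolo`) is simply left unanalysed.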
Copyright information
© 2016 Springer International Publishing Switzerland
About this paper
Cite this paper
Maisto, A., Guarasci, R. (2016). Morpheme-Based Recognition and Translation of Medical Terms. In: Okrut, T., Hetsevich, Y., Silberztein, M., Stanislavenka, H. (eds) Automatic Processing of Natural-Language Electronic Texts with NooJ. NooJ 2015. Communications in Computer and Information Science, vol 607. Springer, Cham. https://doi.org/10.1007/978-3-319-42471-2_15
DOI: https://doi.org/10.1007/978-3-319-42471-2_15
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-42470-5
Online ISBN: 978-3-319-42471-2
eBook Packages: Computer Science, Computer Science (R0)