Abstract
The acquisition and enrichment of lexical resources have long been acknowledged as an important research topic in computational linguistics. Nevertheless, such resources are often missing, particularly in specialised domains. Specialised domains such as biomedicine do, however, offer several structured terminologies. In this paper, we propose a high-quality method for exploiting a structured terminology to infer a specialised lexicon of elementary synonyms. The method is based on the analysis of the syntactic structure of complex terms. We evaluate the approach in the biomedical domain using the terminological resource Gene Ontology, where it yields results with over 93% precision. A comparison with an existing synonym resource (the general-language resource WordNet) shows that there is very little overlap between the induced synonym lexicon and the WordNet synsets.
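The abstract only states that the method analyses the syntactic structure of complex terms, so the following is a minimal sketch of one plausible reading, not the authors' implementation. It assumes that each complex term has already been parsed into a head and an expansion, and that when two synonymous complex terms share one component, the differing components can be proposed as elementary synonyms. The `ParsedTerm` type, the function name, and the toy term pairs are all hypothetical.

```python
from typing import Iterable, NamedTuple, Set, Tuple, FrozenSet


class ParsedTerm(NamedTuple):
    """A complex term reduced to two components (assumed, simplified parse)."""
    head: str
    expansion: str


def infer_elementary_synonyms(
    synonym_pairs: Iterable[Tuple[ParsedTerm, ParsedTerm]],
) -> Set[FrozenSet[str]]:
    """Propose elementary synonym pairs from synonymous complex terms.

    Assumed rule: if two synonymous terms share the same head, their
    expansions are proposed as synonyms; if they share the same expansion,
    their heads are proposed as synonyms. Pairs that differ in both
    components, or in neither, yield nothing.
    """
    inferred: Set[FrozenSet[str]] = set()
    for a, b in synonym_pairs:
        if a.head == b.head and a.expansion != b.expansion:
            inferred.add(frozenset((a.expansion, b.expansion)))
        elif a.expansion == b.expansion and a.head != b.head:
            inferred.add(frozenset((a.head, b.head)))
    return inferred


# Toy input, invented for illustration (not taken from Gene Ontology).
pairs = [
    (ParsedTerm("biosynthesis", "lipid"), ParsedTerm("formation", "lipid")),
    (ParsedTerm("transport", "sugar"), ParsedTerm("transport", "carbohydrate")),
    (ParsedTerm("binding", "ion"), ParsedTerm("transport", "metal")),
]

lexicon = infer_elementary_synonyms(pairs)
for entry in sorted(sorted(e) for e in lexicon):
    print(entry)
```

Storing each proposed pair as a `frozenset` keeps the relation symmetric and removes duplicates when the same pair is derived from several complex terms. The third toy pair differs in both components and therefore contributes nothing.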
Copyright information
© 2008 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Hamon, T., Grabar, N. (2008). Acquisition of Elementary Synonym Relations from Biological Structured Terminology. In: Gelbukh, A. (eds) Computational Linguistics and Intelligent Text Processing. CICLing 2008. Lecture Notes in Computer Science, vol 4919. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-78135-6_4
DOI: https://doi.org/10.1007/978-3-540-78135-6_4
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-78134-9
Online ISBN: 978-3-540-78135-6
eBook Packages: Computer Science, Computer Science (R0)