
Maintaining the balance between knowledge and the lexicon in terminology: a methodology based on frame semantics

Abstract

This paper argues for an approach to terms, based on Frame Semantics (Fillmore in Ann N Y Acad Sci Conf Origin Dev Lang Speech 280:20–32, 1976; Fillmore and Baker in A Frames Approach to Semantic Analysis, 313–339, 2010), that takes into account their linguistic properties and shows how terms and their properties are connected formally to the expression of knowledge in specialized fields. I briefly present the theoretical assumptions underlying this proposal. The main part of the article describes the methodology devised to implement the proposal in two terminological resources that are under development at the Observatoire de linguistique Sens-Texte (OLST). The methodology, which comprises seven main steps, is based on that of FrameNet (https://framenet.icsi.berkeley.edu/fndrupal/, 2017. Accessed 20 January 2017) (Ruppenhofer et al. in FrameNet II: extended theory and practice. https://framenet.icsi.berkeley.edu/fndrupal/index.php?q=the_book, 2016. Accessed 27 January 2017), the lexical implementation of Frame Semantics. I illustrate the methodology by applying it to terms that belong to the field of endangered species, a subfield of the environment.


Notes

  1. I chose this field rather than one directly connected to medicine, since my team has been working on terms related to the environment and hence I am much more comfortable with these. However, the problem raised in the introduction (i.e., two different perspectives taken in terminology) remains the same regardless of the specialized domain considered.

  2. Vulnerable as such might not appear in terminological resources. However, the noun phrase vulnerable species is likely to be listed. This confirms the preference of specialized resources for nouns and noun phrases.

  3. This can easily be verified by searching for different terms via EcoRessources (2017), an aggregator developed by the Observatoire de linguistique Sens-Texte and the research group Recherche appliquée en linguistique informatique that gives access to 16 online terminological resources containing environmental terms. If we set aside our own resource (the DiCoEnviro), the 15 others provide little linguistic information, and some provide none at all.

  4. Obligatory participants are labeled core frame elements in FrameNet and correspond partly to what is normally referred to as arguments (although arguments are usually defined for linguistic units, whereas frame elements are defined for frames that account for a conceptual representation of a situation). Optional participants are labeled non-core frame elements in FrameNet and correspond partly to what are called adjuncts.

  5. It should be noted that the terminological content of frames may be enriched as more data is taken into account. In addition, some languages are better covered than others.

  6. The DiCoEnviro is the terminological resource that contains the descriptions upon which we base our discovery of frames. It states the argument structure of terms, gives access to up to 20 contexts (when these are annotated) and lists various types of relations that terms hold with other terms in the field (related meanings, opposites, morphologically related terms, collocations, etc.). In addition, when equivalents in other languages also appear in the resource (the resource covers English, French and Spanish, and has some entries in Portuguese), hyperlinks are provided to allow users to access these entries. At the end of July 2017, the DiCoEnviro contained 884 English entries (with over 4000 lexical relations and 8000 annotated contexts) and 1264 French entries (with over 6500 lexical relations and 20,000 annotated contexts). The resource also includes a few Spanish and Portuguese terms. The DiCoEnviro is designed primarily as a tool for researchers in terminology, but some of the information it contains (the annotated contexts, lexical relations) makes it attractive to other kinds of users, such as translators and lexicographers.

  7. A bottom-up methodology was also used by other researchers interested in specialized lexica (Pimentel 2013; Schmidt 2009).

  8. There are exceptions, though. In the field of the environment, for instance, a corpus called PANACEA (2015) can be used for research purposes. However, the corpus was compiled automatically and might not be suitable for our terminological projects, since automatically compiled corpora do not distinguish between textual genres dealing with the same topic (scientific articles, reports, newspaper articles). Some even contain glossaries that do not show how terms are used in running text. Since we want to know exactly where the contexts we collect come from and to record many details about the texts we place in our corpora, compiling corpora manually remains the best option.

  9. It should be said at this point that labels used in our terminological resources differ from those used in FrameNet. Frame elements in FrameNet are relevant within a specific frame. In our resources, labels should be applied to large sets of terms.

  10. A member of our team (Bernier-Colborne 2016) explored how a method based on distributional semantics can be used to identify the terminological content of frames automatically. The method is promising but has not been completely integrated into our methodology.

  11. Users can also view the similarities and the differences between frames as they are represented in FrameNet and those that appear in the Framed DiCoEnviro by selecting the “Click here to see associated FrameNet infos” link. This is explained further in L’Homme et al. (2016).
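
The distinction drawn in note 4 between core (obligatory) and non-core (optional) frame elements, together with the idea that a frame groups the terms that evoke it, can be pictured as a small data structure. The following is only a minimal sketch in Python: the class names, the frame name and the frame element labels are hypothetical and are not taken from FrameNet or from the Framed DiCoEnviro; only the term vulnerable comes from note 2.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class FrameElement:
    """A participant in a frame (hypothetical representation)."""
    name: str
    core: bool  # True = obligatory (core), False = optional (non-core)


@dataclass
class Frame:
    """A frame with its participants and the terms that evoke it."""
    name: str
    elements: List[FrameElement] = field(default_factory=list)
    terms: List[str] = field(default_factory=list)

    def core_elements(self) -> List[str]:
        # Obligatory participants, roughly comparable to arguments
        return [fe.name for fe in self.elements if fe.core]

    def non_core_elements(self) -> List[str]:
        # Optional participants, roughly comparable to adjuncts
        return [fe.name for fe in self.elements if not fe.core]


# Illustrative frame only: the name and labels are invented for this sketch.
frame = Frame(
    name="Example_risk_frame",
    elements=[
        FrameElement("Species", core=True),
        FrameElement("Cause", core=False),
        FrameElement("Place", core=False),
    ],
    terms=["vulnerable"],
)

print(frame.core_elements())
print(frame.non_core_elements())
```

Keeping the core flag on the frame element rather than on the term reflects the point made in note 4: frame elements are defined for the frame as a conceptual representation of a situation, not for an individual linguistic unit.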

References

  • Azoulay, D. 2017. Frame-based knowledge representation using large specialized corpora. In: Proceedings of the AAAI spring symposium on computational construction grammar and natural language understanding, Stanford University, CA.

  • Bernier-Colborne, G. 2016. Aide à l’identification de relations lexicales au moyen de la sémantique distributionnelle et son application à un corpus bilingue du domaine de l’environnement. Ph.D. Thesis, presented at the Université de Montréal, Montréal.

  • DiCoEnviro. 2017. Dictionnaire fondamental de l’environnement. http://olst.ling.umontreal.ca/cgi-bin/dicoenviro/search_enviro.cgi. Accessed 31 July 2017.

  • Drouin, P. 2003. Term extraction using non-technical corpora as a point of leverage. Terminology 9 (1): 99–117.

  • EcoRessources. 2017. Terminological resources for the environment. http://termeco.info/EcoRessources/index-e.html. Accessed 31 July 2017.

  • Faber, P., P. León-Araúz, and A. Reimerink. 2016. EcoLexicon: New features and challenges. In GLOBALEX 2016: lexicographic resources for human language technology and 10th edition of the language resources and evaluation conference, ed. by Kernerman, I., I. Kosem Trojina, S. Krek, and L. Trap-Jensen, 73–80. Portorož.

  • Fillmore, C.J. 1976. Frame semantics and the nature of language. In Annals New York Academy of Sciences: Conference on the Origin and Development of Language and Speech 280: 20–32.

  • Fillmore, C.J. 1985. Frames and the semantics of understanding. Quaderni di Semantica 6: 222–254.

  • Fillmore, C.J., and B.T. Atkins. 1992. Toward a frame-based Lexicon: the semantics of RISK and its neighbors. In Frames, Fields and Contrasts, ed. by A. Lehrer, and E. Feder Kittay, 75–102. Hillsdale, New Jersey: Lawrence Erlbaum Assoc.

  • Fillmore, C.J., and C. Baker. 2010. A frames approach to semantic analysis. In Handbook of Linguistic Analysis, ed. B. Heine, and H. Narrog, 313–339. Oxford: Oxford University Press.

  • Fillmore, C., M.R.L. Petruck, J. Ruppenhofer, and A. Wright. 2003. FrameNet in action: the case of attaching. International Journal of Lexicography 16 (2): 297–332.

  • Forest, D., H. Brousseau, P. Drouin, and G. Bernier-Colborne. 2015. L’environnement vu par ses documents: utilisation de techniques de fouille de textes dans un contexte de description linguistique. In 13e Journées internationales d’analyse statistique des données textuelles, Nice, France.

  • Framed DiCoEnviro. 2017. A Framed Version of DiCoEnviro. http://olst.ling.umontreal.ca/dicoenviro/framed/index.php. Accessed 31 July 2017.

  • FrameNet. 2017. https://framenet.icsi.berkeley.edu/fndrupal/home. Accessed 20 January 2017.

  • Ghazzawi, N. 2016. Du terme prédicatif au cadre sémantique: méthodologie de compilation d’une ressource terminologique pour les termes arabes de l’informatique. Ph.D. Thesis, presented at the Université de Montréal, Montreal.

  • Hadouche, F., S. Desgroseillers, J. Pimentel, M.C. L’Homme, and G. Lapalme. 2011. Identification des participants de lexies prédicatives: évaluation en performance et en temps d’un système d’annotation automatique. In Terminologie et intelligence artificielle (TIA 2011), Institut national des langues orientales INALCO, Paris.

  • L’Homme, M.C. 2015. Découverte de cadres sémantiques dans le domaine de l’environnement: le cas de l’influence objective. Terminàlia 12: 29–40.

  • L’Homme, M.C. 2016. Terminologie de l’environnement et sémantique des cadres. In Congrès mondial de linguistique française (CMLF 2016), Tours, France.

  • L’Homme, M.C., C. Subirats, and B. Robichaud. 2016. A Proposal for combining “general” and specialized frames. In Proceedings of the workshop on cognitive aspects of the Lexicon. 156–165, Osaka, Japan.

  • PANACEA. 2015. http://panacea-lr.eu/en/info-for-researchers/data-sets/monolingual-corpora. Accessed 23 January 2017.

  • Pimentel, J. 2013. Methodological bases for assigning terminological equivalents. A Contribution. Terminology 19 (2): 237–257.

  • Ruppenhofer, J., M. Ellsworth, M. Petruck, C. Johnson, C. Baker, and J. Scheffczyk. 2016. FrameNet II: extended theory and practice. https://framenet.icsi.berkeley.edu/fndrupal/index.php?q=the_book. Accessed 27 January 2017.

  • Schmid, H. 1994. Probabilistic part-of-speech tagging using decision trees. In Proceedings of international conference on new methods in language processing, Manchester, UK.

  • Schmidt, T. 2009. The Kicktionary – a multilingual lexical resource of football language. In Multilingual FrameNets in Computational Lexicography. Methods and Applications, ed. Boas, H.C., 101–134. Berlin/New York: Mouton de Gruyter.

  • Wildlife Ontology. 2017. http://www.bbc.co.uk/ontologies/wo. Accessed 20 January 2017.

Acknowledgements

This research is supported by the Social Sciences and Humanities Research Council (SSHRC) of Canada and by the Fonds de recherche du Québec—Société et culture (FRQ-SC). I would like to thank the members of my research team who contributed in one way or another to the Framed DiCoEnviro project. I also extend my thanks to two anonymous reviewers whose comments helped clarify many parts of the paper.

Corresponding author

Correspondence to Marie-Claude L’Homme.

Cite this article

L’Homme, MC. Maintaining the balance between knowledge and the lexicon in terminology: a methodology based on frame semantics. Lexicography ASIALEX 4, 3–21 (2018). https://doi.org/10.1007/s40607-018-0034-1

  • DOI: https://doi.org/10.1007/s40607-018-0034-1

Keywords

  • Terms
  • Predicative units
  • Frames
  • Frame semantics
  • Terminological resource
  • Environment