Improving the consistency of usage labelling in dictionaries with TEI Lex-0

Abstract

This paper analyzes the application of usage labels in three representative lexicographic works, namely the Portuguese, Spanish, and French Academy Dictionaries, as a starting point for creating a consistent classification of usage labels and their encoding in accordance with TEI Lex-0. The use of labels is not always consistent within individual dictionaries, and even less so across different lexicographic projects, which makes the task of accurately classifying and encoding them difficult. This difficulty is compounded by the differences and partial incompatibilities found in the lexicographic literature on the treatment of diasystemic information. We review the existing literature and the initial classification of TEI Lex-0, and argue for some changes to TEI Lex-0, most notably with respect to diatextual labels. Finally, we argue that existing classifications, which are based on examples rather than on clear and explicit definitions of the classification categories, will always lack precision and lead to mutually incompatible encodings of different dictionaries. We propose a set of definitions for usage label categories that can be adopted by TEI Lex-0 and used in similar attempts to create interoperable lexical resources. Agreement on usage label categories is a necessary first step toward harmonizing and standardizing the actual values of usage labels across dictionaries and languages.

Notes

  1. The three selected dictionaries are representative of the academic tradition in European lexicography (see Considine 2014). Each dictionary under consideration is a large, scholarly, monolingual dictionary of a major Romance language undertaken by a national academy. Further work could extend our study to other traditions and other languages, but this is beyond the scope of what can be achieved in a single journal article.

  2. https://www.w3.org/2016/05/ontolex/.

  3. https://www.iso.org/standard/37327.html.

  4. TEI is a de facto standard for digital edition and text annotation projects, and it is frequently used in Digital Humanities as the basis for a large number of current lexicographic projects, such as http://nenufar.huma-num.fr/presentation/, https://artfl-project.uchicago.edu/, and https://vicav.acdh.oeaw.ac.at/.

  5. The electronic version of the DACL is not publicly available, but the first author of this paper is the coordinator of the new edition. The Natural Language Processing group of the Computer Science Department of the University of Minho has been developing the technological support for the new digital edition of the DACL, with the participation of Alberto Simões from IPCA (Instituto Politécnico do Cávado e do Ave), responsible for the technological support, and of José João Almeida, and with the consultancy of Álvaro Iriarte Sanromán, both from the University of Minho. The participation of NOVA CLUNL (Linguistic Research Center of NOVA University of Lisbon) relates to the dictionary's transition to the TEI Lex-0 format.

  6. https://dle.rae.es.

  7. We would like to thank ILex (Institute of Lexicography of the RAE) for allowing us to use some statistics obtained during a three-week stay under a scholarship granted to the first author by ELEXIS (https://elex.is). At the time of that stay (November 2018), the DLE database had 95,410 entries and a total of 198,176 senses.

  8. círculo, in Dicionário Infopédia da Língua Portuguesa [online]. Porto: Porto Editora, 2003–2019. [Accessed 2019-08-15]: https://www.infopedia.pt/dicionarios/lingua-portuguesa/círculo.

  9. https://www.tei-c.org/release/doc/tei-p5-doc/en/html/DI.html.

  10. https://dariah-eric.github.io/lexicalresources/pages/TEILex0/TEILex0.html#index.xml-body.1_div.2_div.2.

  11. https://www.tei-c.org/release/doc/tei-p5-doc/en/html/DI.html.

  12. For further details, see chapter 6, “Usage Information”: https://dariah-eric.github.io/lexicalresources/pages/TEILex0/TEILex0.html#index.xml-body.1_div.2_div.2.

  13. A typed element is an element that can have a type, for which a set of values is specified.
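
The idea of a typed element in note 13 can be illustrated with a short sketch. It assumes a usage label encoded as a TEI `usg` element carrying a `type` attribute, and it checks each label against a small set of values. The value set, the sample entry, and the function name `unknown_usg_types` are all invented for illustration; they are not the TEI Lex-0 list and are not taken from any of the three dictionaries.

```python
import xml.etree.ElementTree as ET

TEI_NS = "http://www.tei-c.org/ns/1.0"

# Assumption: a placeholder value set for usg/@type, for illustration only.
# Consult the TEI Lex-0 guidelines for the values it actually defines.
ILLUSTRATIVE_TYPES = {"temporal", "geographic", "domain", "attitude"}

# Invented entry, not taken from any of the dictionaries discussed.
SAMPLE_ENTRY = """
<entry xmlns="http://www.tei-c.org/ns/1.0" xml:id="sample">
  <form type="lemma"><orth>sample</orth></form>
  <sense xml:id="sample.1">
    <usg type="domain">Geometry</usg>
    <usg type="register">informal</usg>
    <def>Invented definition, for illustration only.</def>
  </sense>
</entry>
"""


def unknown_usg_types(xml_text, allowed):
    """Return (type, label) pairs for <usg> elements whose type is not allowed."""
    root = ET.fromstring(xml_text.strip())
    problems = []
    # Elements are in the TEI namespace; the type attribute itself is unqualified.
    for usg in root.iter(f"{{{TEI_NS}}}usg"):
        usg_type = usg.get("type")
        if usg_type not in allowed:
            problems.append((usg_type, (usg.text or "").strip()))
    return problems


if __name__ == "__main__":
    for usg_type, label in unknown_usg_types(SAMPLE_ENTRY, ILLUSTRATIVE_TYPES):
        print(f"unexpected usg type {usg_type!r} on label {label!r}")
```

A check of this kind only shows that a label carries a value from the agreed set. It cannot show that the category was applied consistently, which is why explicit category definitions are still needed.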

References

  1. Yong, H., and J. Peng. 2007. Bilingual Lexicography from a Communicative Perspective. Amsterdam/Philadelphia: John Benjamins Publishing Company.

Dictionaries

  1. Dicionário da Língua Portuguesa Contemporânea. 2001. João Malaca Casteleiro (coord.), 2 vols. Lisboa: Academia das Ciências de Lisboa & Editorial Verbo. New digital edition under revision.

  2. Diccionario de la Lengua Española (24.ª ed.). Real Academia Española, 2001–2018, www.rae.es/rae.

  3. Dictionnaire de l'Académie Française (9.ª ed.). Académie Française, 2019, http://www.dictionnaire-academie.fr/.

Other literature

  1. Ahumada, I. (ed.). 2002. Diccionarios y lenguas de especialidad. Jaén: Universidad de Jaén.

  2. Atkins, B. T. S., and M. Rundell. 2008. The Oxford Guide to Practical Lexicography. New York: Oxford University Press.

  3. Bergenholtz, H., and S. Tarp. 1995. Manual of Specialised Lexicography. The Preparation of Specialised Dictionaries. Amsterdam: John Benjamins Publishing Company.

  4. Considine, J. 2014. Academy dictionaries 1600–1800. Cambridge: Cambridge University Press.

  5. Fedorova, I. V. 2004. Style and Usage Labels in Learner’s Dictionaries: Ways of Optimization. In Proceedings of the 11th Euralex International Congress, ed. Geoffrey Williams and Sandra Vessier, pp. 265–272. Lorient: Université de Bretagne-Sud, Faculté des lettres et des sciences humaines.

  6. Hausmann, F. J. 1989. Die Markierung im allgemeinen einsprachigen Wörterbuch: eine Übersicht. In Wörterbücher. Ein internationales Handbuch zur Lexikographie, ed. F. J. Hausmann, O. Reichmann, H. E. Wiegand and L. Zgusta, pp. 649–657. Berlin: Walter de Gruyter.

  7. Jackson, H. 2002. Lexicography: An Introduction. London/New York: Routledge.

  8. Landau, S. 1989. Dictionaries. The Art and Craft of Lexicography. Cambridge: Cambridge University Press.

  9. Milroy, J., and L. Milroy. 1990. Authority in Language: Investigating Standard English. Routledge.

  10. Monson, S. C. 1973. Discussion Paper: Restrictive Labels – Descriptive or Prescriptive? In Lexicography in English. Annals of the New York Academy of Sciences 211, ed. R. I. McDavid Jr. and A. R. Duckert, pp. 208–212.

  11. Rey, A. 2008. De l'artisanat des dictionnaires à une science du mot. Images et modèles. Paris: Armand Colin.

  12. Romary, L., and T. Tasovac. 2018. TEI Lex-0: A Target Format for TEI-Encoded Dictionaries and Lexical Resources. In Proceedings of the 8th Conference of Japanese Association for Digital Humanities, pp. 274–275. https://tei2018.dhii.asia/AbstractsBook_TEI_0907.pdf.

  13. Sakwa, L. N. 2011. Problems of Usage Labelling in English Lexicography. Lexikos 21, pp. 305–315. https://doi.org/10.5788/21-1-47.

  14. Salgado, A., R. Costa, T. Tasovac, and A. Simões. 2019. TEI Lex-0 In Action: Improving the Encoding of the Dictionary of the Academia das Ciências de Lisboa. In Proceedings of the eLex 2019 conference, 1–3 October 2019, Sintra, Portugal, pp. 417–433. Brno: Lexical Computing CZ, s.r.o.

  15. Svensén, B. 2009. A Handbook of Lexicography: The Theory and Practice of Dictionary Making. Cambridge: Cambridge University Press.

  16. TEI Consortium, eds. TEI P5: Guidelines for Electronic Text Encoding and Interchange. [Version 3.5.0]. [Last updated on 29th January 2019, revision 3c0c64ec4]. TEI Consortium. http://www.tei-c.org/Guidelines/P5/ ([13.07.2019]).

Acknowledgements

Research financed by Portuguese National Funding through the FCT—Fundação para a Ciência e Tecnologia as part of the project Centro de Linguística da Universidade NOVA de Lisboa—UID/LIN/03213/2019, and by the European Union’s Horizon 2020 research and innovation program under Grant Agreement No. 731015 (ELEXIS).

Author information

Corresponding author

Correspondence to Ana Salgado.

Ethics declarations

Conflict of interest

On behalf of all authors, the corresponding author states that there is no conflict of interest.

About this article

Cite this article

Salgado, A., Costa, R. & Tasovac, T. Improving the consistency of usage labelling in dictionaries with TEI Lex-0. Lexicography ASIALEX 6, 133–156 (2019). https://doi.org/10.1007/s40607-019-00061-x

Keywords

  • Lexicography
  • Usage labels
  • Diasystemic information
  • TEI