Community Efforts Around the ISOcat Data Category Registry

  • Sue Ellen Wright
  • Menzo Windhouwer
  • Ineke Schuurman
  • Marc Kemps-Snijders
Chapter
Part of the Theory and Applications of Natural Language Processing book series (NLP)

Abstract

The ISOcat Data Category Registry provides a community computing environment for creating, storing, retrieving, harmonizing and standardizing data category specifications (DCs), which register the linguistic terms used in various fields. This chapter recounts the history of DC documentation in ISO TC 37, beginning with paper-based lists created for lexicographers and terminologists and progressing to a web-based resource for a much broader range of users. While describing the considerable strides made in assembling a large, comprehensive collection of DCs, it also outlines the difficulties that have arisen in developing a fully operative web-based computing environment for achieving consensus on data category names, definitions, and selections. It then describes efforts to overcome some of the present shortcomings and to establish positive working procedures designed to engage a wide range of people involved in the creation of language resources.

Keywords

Data Category; User Group; Machine Translation; Thematic Domain; Relation Registry
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© Springer-Verlag Berlin Heidelberg 2013

Authors and Affiliations

  • Sue Ellen Wright (1)
  • Menzo Windhouwer (2)
  • Ineke Schuurman (3, 4)
  • Marc Kemps-Snijders (5)

  1. Kent State University, Kent, USA
  2. The Language Archive, DANS, The Hague, The Netherlands
  3. KU Leuven and Utrecht University, Leuven, Belgium
  4. Utrecht, The Netherlands
  5. Meertens Instituut Amsterdam, Amsterdam, The Netherlands
