The ISOcat Registry Reloaded

  • Claus Zinn
  • Christina Hoppermann
  • Thorsten Trippel
Part of the Lecture Notes in Computer Science book series (LNCS, volume 7295)


The linguistics community is building a metadata-based infrastructure for the description of its research data and tools. At its core is the ISOcat registry, a collaborative platform to hold a (to be standardized) set of data categories (i.e., field descriptors). Descriptors have definitions in natural language and few explicit interrelations. With the registry growing to many hundreds of entries, authored by many contributors, it is becoming increasingly apparent that the rather informal definitions and their glossary-like design make it hard for users to grasp, exploit and manage the registry’s content. In this paper, we take a large subset of the ISOcat term set and reconstruct from it a tree structure following the footsteps of […]. Our ontological re-engineering yields a representation that gives users a hierarchical view of linguistic, metadata-related terminology. The new representation adds to the precision of all definitions by making explicit information that is only implicitly given in the ISOcat registry. It also helps uncover and address potential inconsistencies in term definitions as well as gaps and redundancies in the overall ISOcat term set. The new representation can serve as a complement to the existing ISOcat model, providing additional support for authors and users in browsing, (re-)using, maintaining, and further extending the community’s terminological metadata repertoire.


Keywords: Data Category · Concept Scheme · Concept System · Conceptual Domain · Relation Registry
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.



Copyright information

© Springer-Verlag Berlin Heidelberg 2012

Authors and Affiliations

  • Claus Zinn (1)
  • Christina Hoppermann (1)
  • Thorsten Trippel (1)
  1. Department of Linguistics, University of Tübingen, Germany
