Creating & Testing CLARIN Metadata Components
The CLARIN Metadata Infrastructure (CMDI), which is being developed within the Common Language Resources and Technology Infrastructure (CLARIN), is a computer-supported framework that combines a flexible component approach with the explicit declaration of semantics. The goal of the Dutch CLARIN project “Creating & Testing CLARIN Metadata Components” was to create metadata components and profiles, according to the CMDI specifications, for a wide variety of existing resources housed at two data centres. In doing so, the principles of the framework were tested. The results of the project benefit other CLARIN projects that are expected to adhere to the CMDI framework and its accompanying tools.
Keywords: Metadata, Infrastructure, CLARIN
The authors would like to thank Jan Pieter Kunst (Meertens Institute) and Anna Aalstein (INL) for their valuable input during the project. The project reported on in this paper was funded by CLARIN-NL (www.clarin.nl).