Transforming a Flat Metadata Schema to a Semantic Web Ontology: The Polish Digital Libraries Federation and CIDOC CRM Case Study

  • Cezary Mazurek
  • Krzysztof Sielski
  • Maciej Stroiński
  • Justyna Walkowska
  • Marcin Werla
  • Jan Węglarz
Part of the Studies in Computational Intelligence book series (SCI, volume 390)


This paper describes the transformation of the metadata schema used by the Polish Digital Libraries Federation (DLF) to the CIDOC CRM model implemented in OWL as Erlangen CRM. The need to transform the data from a flat schema to a full-fledged ontology arose during preliminary work in the Polish research project SYNAT. The Digital Libraries Federation site offers aggregated metadata of more than 600,000 publications, which constitute the first portion of data introduced into the Integrated Knowledge System (IKS), one of the services developed in the SYNAT project. To perform the desired functions, IKS needs heavily linked data that can be searched semantically. The task is not merely one of mapping one schema element to another, because the conceptualization of CIDOC CRM differs significantly from that of the DLF schema. In the paper we identify a number of problems that are common to all such transformations and propose solutions. Finally, we present statistics concerning the mapping process and the resulting knowledge base.
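To illustrate why such a transformation goes beyond element-to-element mapping, the following minimal sketch expands a flat record into event-centric triples in the style of CIDOC CRM, where authorship and date attach to a creation event rather than directly to the publication. It is not the mapping used in the paper. The class and property names follow CIDOC CRM naming as used in Erlangen CRM (for example `E65_Creation`, `P14_carried_out_by`). The flat field names (`title`, `creator`, `date`), the URI scheme, the function `flat_record_to_crm`, and the sample record are all assumptions made for illustration.

```python
from typing import Dict, List, Tuple

Triple = Tuple[str, str, str]


def flat_record_to_crm(record_id: str, record: Dict[str, str]) -> List[Triple]:
    """Expand a flat title/creator/date record into event-centric triples.

    Hypothetical helper: the identifiers "work/...", "creation/..." etc.
    are illustrative, not the URIs used by the DLF or IKS.
    """
    work = f"work/{record_id}"
    creation = f"creation/{record_id}"

    # Every record yields a work and the event that created it.
    triples: List[Triple] = [
        (work, "rdf:type", "E73_Information_Object"),
        (creation, "rdf:type", "E65_Creation"),
        (creation, "P94_has_created", work),
    ]

    if "title" in record:
        title = f"title/{record_id}"
        triples += [
            (title, "rdf:type", "E35_Title"),
            (work, "P102_has_title", title),
            (title, "rdfs:label", record["title"]),
        ]

    if "creator" in record:
        # The creator is linked to the creation event, not to the work itself.
        actor = "actor/" + record["creator"].replace(" ", "_")
        triples += [
            (actor, "rdf:type", "E39_Actor"),
            (creation, "P14_carried_out_by", actor),
            (actor, "rdfs:label", record["creator"]),
        ]

    if "date" in record:
        # The date likewise becomes a time-span of the creation event.
        span = f"timespan/{record_id}"
        triples += [
            (span, "rdf:type", "E52_Time-Span"),
            (creation, "P4_has_time-span", span),
            (span, "rdfs:label", record["date"]),
        ]

    return triples


if __name__ == "__main__":
    sample = {"title": "Pan Tadeusz", "creator": "Adam Mickiewicz", "date": "1834"}
    for triple in flat_record_to_crm("1", sample):
        print(triple)
```

In this sketch a three-field flat record becomes twelve triples spread over five nodes, which gives a sense of how much additional structure the ontology requires compared with the flat schema.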


CIDOC CRM digital libraries metadata ontologies RDF repositories semantic web thesauri 





Copyright information

© Springer-Verlag GmbH Berlin Heidelberg 2012

Authors and Affiliations

  • Cezary Mazurek
    • 1
  • Krzysztof Sielski
    • 1
  • Maciej Stroiński
    • 1
  • Justyna Walkowska
    • 1
  • Marcin Werla
    • 1
  • Jan Węglarz
    • 1
  1. Poznań Supercomputing and Networking Center, Poznań, Poland
