Lessons learned: on the challenges of migrating a research data repository from a research institution to a university library

Abstract

Transferring research data management from one institution to another infrastructural partner is anything but trivial, but it can be required, for instance, when an institution faces reorganization or closure. In a case study, we describe the migration of all research data, identify the challenges we encountered, and discuss how we addressed them. The case study shows that moving research data management to another institution is feasible, but potentially costly. Demonstrating the feasibility of research data migration supports the stance of data archives that users can expect high levels of trust and reliability when it comes to data safety and sustainability.

Notes

  1. As a starting point, depositing agreements can draw upon templates that are prepared by infrastructure providers. The legal evaluation of an instantiated template, however, often depends on how it is instantiated, that is, on specific criteria that involve the character of the data, the legal status of depositor and depositee, third parties, etc. To avoid any form of liability, templates are rarely shared across institutions. When templates are shared, they come with an explicit disclaimer (“do not use it as is”) and the strong suggestion to seek independent professional legal advice.

  2. The new archive may, at a much later point in time, move the resources to yet another location, so the capability to manipulate PID-URL mappings should be transferred from the giving archive to the receiving archive.

  3. In the Fedora repository, the deletion of a digital object yields a “tombstone”. The PID associated with this object then points to the tombstone, which notifies users that the resource has been deleted. Tombstones therefore still require migration: the PIDs must continue to resolve so that users are informed that the associated digital objects have been removed.

  4. Usually, researchers working in the same organization that also hosts the archive do not need depositing agreements.

  5. See https://wiki.duraspace.org/display/FF/Training+-+Migrating+from+Fedora+3+to+Fedora+4.

  6. The ontology has, for instance, the concept ‘MediaObject’, which can be described with properties such as ‘encodingFormat’, ‘bitrate’, and ‘duration’, among many others.
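
To illustrate note 6, the following is a minimal sketch of how such a description might look when serialized as JSON-LD. It assumes that the ontology in question is the Schema.org vocabulary, where ‘MediaObject’, ‘encodingFormat’, ‘bitrate’, and ‘duration’ are defined. The helper function `media_object_jsonld` and all sample values (file name, bitrate, duration) are hypothetical and chosen purely for illustration; they are not taken from the migrated repository.

```python
import json


def media_object_jsonld(name, encoding_format, bitrate, duration):
    """Build a Schema.org MediaObject description as a JSON-LD dictionary.

    Hypothetical helper for illustration only; not part of the repository
    software described in the article.
    """
    return {
        "@context": "https://schema.org",
        "@type": "MediaObject",
        "name": name,
        "encodingFormat": encoding_format,  # typically a MIME type
        "bitrate": bitrate,                 # free-text value
        "duration": duration,               # ISO 8601 duration string
    }


# Sample values are invented for this sketch.
record = media_object_jsonld(
    name="interview-01.wav",
    encoding_format="audio/x-wav",
    bitrate="1411 kbps",
    duration="PT12M30S",
)

print(json.dumps(record, indent=2))
```

A record of this kind could be embedded in a landing page for a resource, which is one plausible way to expose repository metadata to web search engines.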

Acknowledgements

This work has been supported by the German Research Foundation (DFG reference no. 88614379), and the SFB 833 data management project INF (DFG reference no. 75650358). The data centre cooperates closely with the CLARIN-D centre in Tübingen which is funded by the German Federal Ministry of Education and Research (BMBF).

Author information

Correspondence to Claus Zinn.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Trippel, T., Zinn, C. Lessons learned: on the challenges of migrating a research data repository from a research institution to a university library. Lang Resources & Evaluation (2019). https://doi.org/10.1007/s10579-019-09474-4

Keywords

  • Research data management
  • Data repositories
  • Data migration