Abstract
Developments in information and communication technologies offer new opportunities to use and integrate scientific data that have been collected by researchers and scholars from diverse fields of inquiry. Data archives and digital repository systems are being developed to preserve current and legacy scientific data and technical information for use by others. However, data producers in various disciplines need capabilities to submit their data easily and efficiently to archival systems for preservation. An analysis of digital preservation requirements identifies the services needed to support the submission and review of scientific data for preservation. Data submission and review processes are segmented into services, which are defined to support efficient preparation of scientific data for ingest into an archive or digital repository system. A model is proposed to inform the design of submission and workflow services for preserving interdisciplinary scientific data. Recommendations are offered for improving the design and evaluation of systems and services that prepare and preserve scientific data for new uses by interdisciplinary communities of users. Improving the infrastructure that enables members of the scientific community to submit their data for archiving contributes to the scientific data stewardship and data curation capabilities needed to preserve scientific data for future generations of users.
Acknowledgements
The authors appreciate the referees' constructive comments on an earlier draft of this article and gratefully acknowledge support for this work from the National Aeronautics and Space Administration (NASA) under contracts NAS5-03117 and NNG08HZ11C. The opinions expressed here are those of the authors and not necessarily those of the Earth Institute, Columbia University, or NASA.
Additional information
Communicated by Thomas Narock
This article is based on the presentation, “Designing Submission Services for a Trustworthy Digital Repository of Interdisciplinary Scientific Data” by the authors to the Earth and Space Science Informatics Workshop, Developing the Next Generation of Earth and Space Science Informatics: Technologies and the People That Will Implement Them, on August 3, 2009 at the University of Maryland, Baltimore County, in Baltimore, Maryland.
Cite this article
Downs, R.R., Chen, R.S. Designing submission and workflow services for preserving interdisciplinary scientific data. Earth Sci Inform 3, 101–110 (2010). https://doi.org/10.1007/s12145-010-0051-6