Earth Science Informatics, Volume 3, Issue 1–2, pp 101–110

Designing submission and workflow services for preserving interdisciplinary scientific data

  • Robert R. Downs
  • Robert S. Chen
Research Article


Developments in information and communication technologies offer new opportunities to use and integrate scientific data that have been collected by researchers and scholars from diverse fields of inquiry. Data archives and digital repository systems are being developed to preserve current and legacy scientific data and technical information for use by others. However, capabilities are needed for data producers of various disciplines to easily and efficiently submit their data into archival systems for preservation. Analysis of digital preservation requirements has identified the services needed to support the submission and review of scientific data for preservation. Data submission and review processes are segmented into services, which are defined to support efficient preparation of scientific data for ingest into an archive or digital repository system. A model is proposed to inform the design of submission and workflow services for preserving interdisciplinary scientific data. Recommendations are offered for improving the design and evaluation of systems and services to prepare and preserve scientific data for new uses by interdisciplinary communities of users in the future. Improving the infrastructure that enables members of the scientific community to submit their data for archiving contributes to the scientific data stewardship and data curation capabilities needed to preserve scientific data for future generations of users.


Scientific data; Data stewardship; Digital preservation; Data curation; Data submission; Digital repositories



The authors very much appreciate the constructive comments for improving an earlier draft of the article that were offered by the referees and gratefully acknowledge support for this work that has been received from the National Aeronautics and Space Administration (NASA), under contracts NAS5-03117 and NNG08HZ11C. The opinions expressed here are those of the authors and not necessarily those of the Earth Institute, Columbia University, or NASA.



Copyright information

© Springer-Verlag 2010

Authors and Affiliations

  1. Center for International Earth Science Information Network, Columbia University, Palisades, USA
