The application of archival concepts to a data-intensive environment: working with scientists to understand data management and preservation needs

Abstract

The collection, organization, and long-term preservation of resources are the raison d’être of archives and archivists. The archival community, however, has largely neglected science data, assuming that they fall outside the bounds of its professional concerns. Scientists, on the other hand, increasingly recognize that they lack the skills and expertise needed to meet the demands being placed on them with regard to data curation and are seeking the help of “data archivists” and “data curators.” This represents a significant opportunity for archivists and archival scholars, but one that can only be realized if they better understand the scientific context.

Notes

  1. The data deluge is a topic of many articles in the press and popular scientific publications. Recent examples include Nature's special issue, "Data Sharing" (Sept. 2009, Vol. 461(145)); The Economist's special report, "Data, Data Everywhere" (Feb. 2010); and Science Magazine's special issue, "Dealing with Data" (Feb. 2011, Vol. 331(6018)).

  2. “III-V” (pronounced “three five”) refers to groups III and V of the periodic table of elements. III-V semiconductors, composed of a group III element and a group V element, are commonly used in optoelectronic devices such as lasers. Examples of III-V semiconductors include gallium arsenide, gallium antimonide, and indium arsenide.

Acknowledgments

We gratefully acknowledge the materials scientists who shared their experiences with us. We also thank Elizabeth Yakel for her comments on several versions of the manuscript, the members of the University of Michigan Archival Research Group for their suggestions, and the anonymous reviewers for their valuable feedback, which helped to improve the manuscript. This material is based upon work supported by the National Science Foundation under Grant No. 0724300. Any opinions, findings, and conclusions, or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of the National Science Foundation.

Author information

Corresponding author

Correspondence to Dharma Akmon.

About this article

Cite this article

Akmon, D., Zimmerman, A., Daniels, M. et al. The application of archival concepts to a data-intensive environment: working with scientists to understand data management and preservation needs. Arch Sci 11, 329–348 (2011). https://doi.org/10.1007/s10502-011-9151-4

Keywords

  • Science data
  • Data curation
  • Data reuse
  • Data management
  • Data documentation