Preserve: Protecting Data for Long-Term Use

  • Robert B. Cook
  • Yaxing Wei
  • Leslie A. Hook
  • Suresh K. S. Vannan
  • John J. McNelis
Chapter

Abstract

This chapter provides guidance on fundamental data management practices that investigators should perform during the course of data collection to improve both the preservation and usability of their data sets over the long term. Topics covered include fundamental best practices on how to choose the best format for your data, how to better structure data within files, how to define parameters and units, and how to develop data documentation so that others can find, understand, and use your data easily. We also showcase advanced best practices on how to properly specify spatial and temporal characteristics of your data in standard ways so your data are ready and easy to visualize in both 2-D and 3-D viewers. By following this guidance, data will be less prone to error, more efficiently structured for analysis, and more readily understandable for any future questions that the data products might help address.

Copyright information

© Springer International Publishing AG 2018

Authors and Affiliations

  • Robert B. Cook (1)
  • Yaxing Wei (1)
  • Leslie A. Hook (1)
  • Suresh K. S. Vannan (1), corresponding author
  • John J. McNelis (1)
  1. Oak Ridge National Laboratory, Oak Ridge, USA
