Abstract
In any given year, destructive events may cause data loss at members of an archival federation. This paper provides a ‘back-of-the-envelope’ model for the fraction of the federated data collection that survives after a given number of years. It also discusses simple parameterizations of the factors that contribute to the trade-offs between cost and the survival of information.
Acknowledgements
Support for Dr. Mattmann’s effort was provided by the Jet Propulsion Laboratory, California Institute of Technology under contract to the National Aeronautics and Space Administration.
Additional information
Communicated by: Hassan A. Babaie
About this article
Cite this article
Barkstrom, B.R., Mattmann, C.A. A simple model illustrating the virtue of replication for long-term information preservation. Earth Sci Inform 5, 105–109 (2012). https://doi.org/10.1007/s12145-012-0100-4
DOI: https://doi.org/10.1007/s12145-012-0100-4