Techniques to audit and certify the long-term integrity of digital archives
Abstract
A fundamental requirement for a digital archive is to set up mechanisms that ensure the authenticity of its holdings over the long term. In this article, we develop a new methodology that addresses the long-term integrity of digital archives using rigorous cryptographic techniques. Our approach involves the generation of a small integrity token for each object, some cryptographic summary information, and a framework that enables cost-effective periodic auditing of the archive’s holdings according to the policy set by the archive. Our scheme is very general, is architecture and platform independent, and can detect with high probability any alteration to an object, including malicious alterations introduced by the archive or by an external intruder. The scheme can be shown to be mathematically correct as long as a small amount of cryptographic information, on the order of 100 KB/year, is kept intact. Using this approach, a prototype system called ACE (Auditing Control Environment) has been built and tested in an operational large-scale archiving environment.
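The abstract does not spell out how the integrity tokens and the summary information are constructed. As a minimal sketch, assuming a Merkle hash tree over SHA-256 object digests, the tree root can play the role of the small summary value that must be kept intact, and the sibling path of each object can play the role of its integrity token. The names `build_tokens` and `audit`, the choice of SHA-256, and the prefix bytes are illustrative assumptions, not the ACE design.

```python
import hashlib

# Assumption: SHA-256 with distinct prefixes for leaves and internal nodes.
def leaf_hash(obj: bytes) -> bytes:
    return hashlib.sha256(b"\x00" + obj).digest()

def node_hash(left: bytes, right: bytes) -> bytes:
    return hashlib.sha256(b"\x01" + left + right).digest()

def build_tokens(objects):
    """Return (summary, tokens) for a non-empty list of byte strings.

    tokens[i] lists (sibling_hash, sibling_is_left) pairs from the leaf
    of object i up to the root; summary is the root hash.
    """
    if not objects:
        raise ValueError("at least one object is required")
    level = [leaf_hash(o) for o in objects]
    positions = list(range(len(objects)))  # index of each object's node in `level`
    tokens = [[] for _ in objects]
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])  # pad an odd level by repeating the last node
        for i, pos in enumerate(positions):
            sibling = pos ^ 1  # neighbour within the same pair
            tokens[i].append((level[sibling], sibling < pos))
            positions[i] = pos // 2
        level = [node_hash(level[j], level[j + 1]) for j in range(0, len(level), 2)]
    return level[0], tokens

def audit(obj: bytes, token, summary: bytes) -> bool:
    """Recompute the root from the object and its token; compare to the summary."""
    digest = leaf_hash(obj)
    for sibling, sibling_is_left in token:
        digest = node_hash(sibling, digest) if sibling_is_left else node_hash(digest, sibling)
    return digest == summary

if __name__ == "__main__":
    holdings = [b"object-a", b"object-b", b"object-c"]
    summary, tokens = build_tokens(holdings)
    print(audit(holdings[1], tokens[1], summary))   # unmodified object
    print(audit(b"object-B", tokens[1], summary))   # altered object
```

In this sketch each token grows only logarithmically with the number of objects, and an audit needs just the object, its token, and the trusted summary, so neither the archive nor an intruder can alter an object undetected unless the summary is also changed.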
Keywords
Integrity auditing, Digital archives, Data integrity, Authenticity of digital archives