International Journal on Digital Libraries, Volume 10, Issue 2–3, pp 123–131

Techniques to audit and certify the long-term integrity of digital archives

  • Sangchul Song
  • Joseph JaJa


A fundamental requirement for a digital archive is to set up mechanisms that ensure the authenticity of its holdings in the long term. In this article, we develop a new methodology to address the long-term integrity of digital archives using rigorous cryptographic techniques. Our approach involves the generation of a small integrity token for each object, some cryptographic summary information, and a framework that enables cost-effective periodic auditing of the archive’s holdings, on a schedule determined by the policy set by the archive. Our scheme is very general, architecture- and platform-independent, and can detect with high probability any alteration to an object, including malicious alterations introduced by the archive or by an external intruder. The scheme can be shown to be mathematically correct as long as a small amount of cryptographic information, on the order of 100 KB/year, can be kept intact. Using this approach, a prototype system called ACE (Auditing Control Environment) has been built and tested in an operational large-scale archiving environment.
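
The abstract does not say how the integrity tokens or the cryptographic summary information are constructed. As an illustration only, the sketch below assumes a Merkle hash tree over the objects registered in one time window: the root plays the role of the small summary value that must be kept intact, and each object's token is the list of sibling hashes linking it to that root. The function names, the use of SHA-256, and the padding rule are assumptions made for this example and are not taken from the article or from ACE.

```python
import hashlib


def leaf_hash(obj: bytes) -> bytes:
    # Hash of an archived object; the 0x00 prefix separates leaves from inner nodes.
    return hashlib.sha256(b"\x00" + obj).digest()


def node_hash(left: bytes, right: bytes) -> bytes:
    # Hash of an inner tree node built from its two children.
    return hashlib.sha256(b"\x01" + left + right).digest()


def build_levels(leaves):
    # Returns every level of the tree, from the leaf hashes up to the root.
    levels = [list(leaves)]
    while len(levels[-1]) > 1:
        cur = levels[-1]
        if len(cur) % 2 == 1:
            cur = cur + [cur[-1]]  # simplification: pad odd levels by repeating the last hash
        levels.append([node_hash(cur[i], cur[i + 1]) for i in range(0, len(cur), 2)])
    return levels


def make_token(levels, index):
    # Hypothetical "integrity token": sibling hashes on the path from leaf to root.
    token = []
    for level in levels[:-1]:
        if len(level) % 2 == 1:
            level = level + [level[-1]]
        sibling = index ^ 1
        token.append((level[sibling], sibling < index))
        index //= 2
    return token


def audit(obj: bytes, token, summary: bytes) -> bool:
    # Recompute the root from the object and its token, then compare to the summary.
    digest = leaf_hash(obj)
    for sibling, sibling_is_left in token:
        digest = node_hash(sibling, digest) if sibling_is_left else node_hash(digest, sibling)
    return digest == summary


objects = [b"object-0", b"object-1", b"object-2"]
levels = build_levels([leaf_hash(o) for o in objects])
summary = levels[-1][0]          # the only value that must be kept intact
token = make_token(levels, 2)    # stored alongside objects[2]

intact_ok = audit(objects[2], token, summary)
tampered_ok = audit(b"object-2 (altered)", token, summary)
```

Under these assumptions, auditing an object needs only the object, its token, and the summary value, so an archive could rerun `audit` on whatever schedule its policy sets. The token grows logarithmically with the number of objects in the window, which is consistent with the abstract's emphasis on small tokens, although the actual sizes in the article's scheme may differ.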


Integrity auditing, Digital archives, Data integrity, Authenticity of digital archives





Copyright information

© Springer-Verlag 2009

Authors and Affiliations

  1. Institute for Advanced Computer Studies and Department of Electrical and Computer Engineering, University of Maryland, College Park, USA
