Just One Bit in a Million: On the Effects of Data Corruption in Files
So far, little attention has been paid to file format robustness, i.e., a file format's capability to keep its information as safe as possible in spite of data corruption. This paper reports on the first comprehensive research on this topic. The work is based on a study of the status quo of file format robustness for various file formats from the image domain. A controlled test corpus was built, comprising files with different format characteristics. These files form the basis for the data corruption experiments, which are reported on and discussed.
Keywords: digital preservation, file format, file format robustness, data integrity, data corruption, bit error, error resilience
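To make the idea of a data corruption experiment concrete, the following is a minimal sketch of injecting bit errors into a file's bytes at a given rate, such as the one bit in a million of the title. The abstract does not describe the actual corruption procedure, so the uniform random error model, the function names `flip_random_bits` and `count_bit_differences`, and the fixed seed are all assumptions made for illustration.

```python
import random


def flip_random_bits(data: bytes, error_rate: float, seed: int = 0) -> bytes:
    """Return a copy of `data` with about error_rate * total_bits bits flipped.

    Assumption: bit positions are drawn uniformly at random without
    replacement; the study's real error model may differ.
    """
    rng = random.Random(seed)  # seeded so that an experiment is repeatable
    corrupted = bytearray(data)
    total_bits = len(corrupted) * 8
    if total_bits == 0:
        return bytes(corrupted)
    # Flip at least one bit, and never more bits than the data contains.
    n_flips = min(total_bits, max(1, round(total_bits * error_rate)))
    for pos in rng.sample(range(total_bits), n_flips):
        corrupted[pos // 8] ^= 1 << (pos % 8)
    return bytes(corrupted)


def count_bit_differences(a: bytes, b: bytes) -> int:
    """Count the bits that differ between two equal-length byte strings."""
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))


if __name__ == "__main__":
    original = bytes(125_000)  # 1,000,000 zero bits as stand-in file content
    damaged = flip_random_bits(original, error_rate=1e-6)
    print(count_bit_differences(original, damaged))
```

In an experiment of this kind, the damaged bytes would be written back to disk and opened with a decoder for the format under test, and the decoded result compared with the one from the intact file to judge how much information survived.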