An Implementation of Deflate in Coq

Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9995)

Abstract

The widely-used compression format “Deflate” is defined in RFC 1951 and is based on prefix-free codings and backreferences. There are unclear points about the way these codings are specified, and several sources for confusion in the standard. We tried to fix this problem by giving a rigorous mathematical specification, which we formalized in Coq. We produced a verified implementation in Coq which achieves competitive performance on inputs of several megabytes. In this paper we present the several parts of our implementation: a fully verified implementation of canonical prefix-free codings, which can be used in other compression formats as well, and an elegant formalism for specifying sophisticated formats, which we used to implement both a compression and decompression algorithm in Coq which we formally prove inverse to each other – the first time this has been achieved to our knowledge. The compatibility to other Deflate implementations can be shown empirically. We furthermore discuss some of the difficulties, specifically regarding memory and runtime requirements, and our approaches to overcome them.

Keywords

Formal verification Program extraction Compression Coq 

References

  1. 1.
  2. 2.
    High assurance cyber military systems proposers’ day presentation, February 2012. http://www.darpa.mil/WorkArea/DownloadAsset.aspx?id=2147484882
  3. 3.
    Affeldt, R., Garrigue, J.: Formalization of error-correcting codes: from hamming to modern coding theory. In: Urban, C., Zhang, X. (eds.) ITP 2015. LNCS, vol. 9236, pp. 17–33. Springer, Heidelberg (2015). doi:10.1007/978-3-319-22102-1_2 Google Scholar
  4. 4.
    Affeldt, R., Hagiwara, M., Sénizergues, J.: Formalization of Shannon’s theorems. J. Autom. Reasoning 53(1), 63–103 (2014). doi:10.1007/s10817-013-9298-1 MathSciNetCrossRefMATHGoogle Scholar
  5. 5.
    Appel, A.W.: Program Logics for Certified Compilers. Cambridge University Press, New York (2014)CrossRefMATHGoogle Scholar
  6. 6.
    Berger, U., Jones, A., Seisenberger, M.: Program extraction applied to monadic parsing. J. Log. Comput. exv078 (2015). doi:10.1093/logcom/exv078
  7. 7.
    Bhargavan, K., Fournet, C., Kohlweiss, M., Pironti, A., Strub, P.Y.: Implementing TLS with verified cryptographic security (2013). http://www.mitls.org/downloads/miTLS-report.pdf
  8. 8.
    Blanchette, J.C.: Proof pearl: mechanizing the textbook proof of Huffman’s algorithm. J. Autom. Reasoning 43(1), 1–18 (2009). doi:10.1007/s10817-009-9116-y MathSciNetCrossRefMATHGoogle Scholar
  9. 9.
    Danielsson, N.A.: Total parser combinators. ACM SIGPLAN Not. 45, 285–296 (2010)CrossRefMATHGoogle Scholar
  10. 10.
    Deutsch, P.: DEFLATE Compressed Data Format Specification version 1.3. RFC 1951 (Informational), May 1996. http://www.ietf.org/rfc/rfc1951.txt
  11. 11.
    Deutsch, P.: GZIP file format specification version 4.3. RFC 1952 (Informational), May 1996. http://www.ietf.org/rfc/rfc1952.txt
  12. 12.
    Fielding, R., Gettys, J., Mogul, J., Frystyk, H., Masinter, L., Leach, P., Berners-Lee, T.: Hypertext Transfer Protocol - HTTP/1.1. RFC 2616 (Draft Standard), June 1999. http://www.ietf.org/rfc/rfc2616.txt, obsoleted by RFCs 7230, 7231, 7232, 7233, 7234, 7235, updated by RFCs 2817, 5785, 6266, 6585
  13. 13.
    Google Inc.: Zopfli compression algorithm. https://github.com/google/zopfli
  14. 14.
    Hollenbeck, S.: Transport Layer Security Protocol Compression Methods. RFC 3749 (Proposed Standard), May 2004. http://www.ietf.org/rfc/rfc3749.txt
  15. 15.
    Huffman, D.: A method for the construction of minimum-redundancy codes. Proc. IRE 40(9), 1098–1101 (1952)CrossRefMATHGoogle Scholar
  16. 16.
    Jang, D., Tatlock, Z., Lerner, S.: Establishing browser security guarantees through formal shim verification. In: Proceedings of the 21st USENIX Conference on Security Symposium, p. 8. USENIX Association (2012)Google Scholar
  17. 17.
    Kelsey, J.: Compression and information leakage of plaintext. In: Daemen, J., Rijmen, V. (eds.) FSE 2002. LNCS, vol. 2365, pp. 263–276. Springer, Heidelberg (2002). doi:10.1007/3-540-45661-9_21 CrossRefGoogle Scholar
  18. 18.
    Klein, G., Andronick, J., Elphinstone, K., Murray, T., Sewell, T., Kolanski, R., Heiser, G.: Comprehensive formal verification of an OS microkernel. ACM Trans. Comput. Syst. 32(1), 2:1–2:70 (2014)CrossRefGoogle Scholar
  19. 19.
    Leroy, X.: Formal verification of a realistic compiler. Commun. ACM 52(7), 107–115 (2009). http://gallium.inria.fr/~xleroy/publi/compcert-CACM.pdf
  20. 20.
    McMillan, B.: Two inequalities implied by unique decipherability. IRE Trans. Inf. Theory 2(4), 115–116 (1956)CrossRefGoogle Scholar
  21. 21.
    Nogin, A.: Writing constructive proofs yielding efficient extracted programsGoogle Scholar
  22. 22.
    PKWARE Inc.: ZIP File Format Specification, September 2012. https://www.pkware.com/documents/APPNOTE/APPNOTE-6.3.3.TXT
  23. 23.
    Thery, L.: Formalising human’s algorithm. Technical report, Technical report TRCS 034, Dept. of Informatics, Univ. of L’Aquila (2004)Google Scholar
  24. 24.
    Vafeiadis, V.: Adjustable references. In: Blazy, S., Paulin-Mohring, C., Pichardie, D. (eds.) ITP 2013. LNCS, vol. 7998, pp. 328–337. Springer, Heidelberg (2013). doi:10.1007/978-3-642-39634-2_24 CrossRefGoogle Scholar

Copyright information

© Springer International Publishing AG 2016

Authors and Affiliations

  1. 1.Ludwig-Maximilians-UniversitätMunichGermany

Personalised recommendations