Higher-Order and Symbolic Computation, Volume 25, Issue 1, pp 39–84

Functional programs as compressed data

  • Naoki Kobayashi
  • Kazutaka Matsuda
  • Ayumi Shinohara
  • Kazuya Yaguchi


We propose an application of programming language techniques to lossless data compression, where tree data are compressed as functional programs that generate them. This “functional programs as compressed data” approach has several advantages. First, it follows from the standard argument of Kolmogorov complexity that the size of compressed data can be optimal up to an additive constant. Second, the compression algorithm is clean: it is just a sequence of β-expansions (i.e., the inverse of β-reductions) for λ-terms. Third, one can use program verification and transformation techniques (higher-order model checking, in particular) to apply certain operations to data without decompression. In this article, we present algorithms for data compression and manipulation based on the approach, and prove their correctness. We also report preliminary experiments on prototype data compression/transformation systems.


Keywords: Semantics-based program manipulation; Program transformation; Data compression; Functional programs; Higher-order model checking



We would like to thank Oleg Kiselyov, Tatsunari Nakajima, Kunihiko Sadakane, John Tromp, and anonymous referees for discussions and useful comments. This work was partially supported by Kakenhi 23220001.



Copyright information

© Springer Science+Business Media New York 2013

Authors and Affiliations

  • Naoki Kobayashi (1)
  • Kazutaka Matsuda (1)
  • Ayumi Shinohara (2)
  • Kazuya Yaguchi (2)

  1. Graduate School of Information Science and Technology, University of Tokyo, Tokyo, Japan
  2. Graduate School of Information Sciences, Tohoku University, Sendai, Japan
