Analytic variations on the common subexpression problem

  • Philippe Flajolet
  • Paolo Sipala
  • Jean-Marc Steyaert
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 443)

Abstract

Any tree can be represented in a maximally compact form as a directed acyclic graph where common subtrees are factored and shared, being represented only once. Such a compaction can be effected in linear time. It is used to save storage in implementations of functional programming languages, as well as in symbolic manipulation and computer algebra systems. In compiling, the compaction problem is known as the “common subexpression problem” and it plays a central rôle in register allocation, code generation and optimisation. We establish here that, under a variety of probabilistic models, a tree of size n has a compacted form of expected size asymptotically
$$C\frac{n}{{\sqrt {\log n} }},$$
where the constant C is explicitly related to the type of trees to be compacted and to the statistical model reflecting tree usage. In particular, the savings in storage approach 100% on average for large structures, which outperforms the commonly used form of sharing that is restricted to leaves (atoms).
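
The compaction described in the abstract can be illustrated with a minimal sketch. It assumes trees are encoded as nested Python tuples `(label, child_1, ..., child_k)` and uses a hash table to assign one identifier per distinct subtree. This is a common hashing-based technique and is not claimed to be the authors' algorithm; the names `compact` and `size` are hypothetical. It runs in expected linear time under the usual assumption of constant-time hashing.

```python
from typing import Dict, Tuple

# Assumed encoding: a tree is a nested tuple (label, child_1, ..., child_k);
# a leaf is the one-element tuple (label,).
Tree = tuple


def compact(tree: Tree) -> Tuple[int, Dict[tuple, int]]:
    """Return (root_id, table), where table maps (label, child_ids...) to a unique id.

    Each key of the table is one node of the compacted DAG, so len(table)
    is the size of the compacted form.
    """
    table: Dict[tuple, int] = {}

    def visit(node: Tree) -> int:
        label, *children = node
        # Children are compacted first, so identical subtrees yield identical keys.
        key = (label, *(visit(child) for child in children))
        if key not in table:
            table[key] = len(table)
        return table[key]

    return visit(tree), table


def size(tree: Tree) -> int:
    """Number of nodes of the original (uncompacted) tree."""
    return 1 + sum(size(child) for child in tree[1:])


# The expression (a + b) * (a + b): the subtree a + b occurs twice.
expr = ("*", ("+", ("a",), ("b",)), ("+", ("a",), ("b",)))
root, dag = compact(expr)
print(size(expr), len(dag))  # 7 4
```

In this example the tree has 7 nodes and the compacted form has 4: the repeated subexpression `a + b` and its two leaves are each stored once. The recursion is kept for brevity; a very deep tree would need an explicit stack.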

Copyright information

© Springer-Verlag 1990

Authors and Affiliations

  • Philippe Flajolet (1)
  • Paolo Sipala (2)
  • Jean-Marc Steyaert (3)

  1. INRIA, Rocquencourt, Le Chesnay, France
  2. Università degli Studi di Trieste, Trieste, Italy
  3. LIX, Ecole Polytechnique, Palaiseau, France
