Fringe Analysis of Plane Trees Related to Cutting and Pruning

Rooted plane trees are reduced by four different operations on the fringe. The number of surviving nodes after reducing the tree repeatedly for a fixed number of times is asymptotically analyzed. The four different operations include cutting all or only the leftmost leaves or maximal paths. This generalizes the concept of pruning a tree. The results include exact expressions and asymptotic expansions for the expected value and the variance as well as central limit theorems.


INTRODUCTION
Plane trees are among the most interesting elementary combinatorial objects; they appear in the literature under many different names such as ordered trees, planar trees, planted plane trees, etc. They have been analyzed under various aspects, especially due to their relevance in Computer Science. Two particularly well-known quantities are the height, since it is equivalent to the stack size needed to explore binary (expression) trees, and the pruning number (pruning index), since it is equivalent to the register function (Horton-Strahler number) of binary trees. Several results for the height of plane trees can be found in [3,9,23], for the register function, we refer to [4,11,19], and for results on the connection between the register function and the pruning number to [4,27].
Reducing (cutting-down) trees has also been a popular research theme during the last decades [17,21,22]: according to a certain probabilistic model, a given tree is reduced until a certain condition is satisfied (usually, the root is isolated).
In the present paper, the point of view is slightly different, as we reduce a tree in a completely deterministic fashion at the leaves until the tree has no more edges. All these reductions take place on the fringe, meaning that only (a subset of) leaves (and some adjacent structures) are removed. We consider four different models:
- In one round, all leaves together with the corresponding edges are removed (see Section 2).
- In one round, all maximal paths (linear graphs), with the leaves on one end, are removed (see Section 3). This process is called pruning.
- A leaf is called an old leaf if it is the leftmost child of its parent. This concept was introduced in [2]. In one round, only old leaves are removed (see Section 4).
- The last model deals with pruning old paths. There might be several interesting models related to this; the one we have chosen here is that in one round maximal paths are removed, under the condition that each of their nodes is the leftmost child of its parent node (see Section 5).
The four tree reductions are illustrated in Figure 1. We describe these reductions more formally in the corresponding sections.

FIGURE 1. Removal of (old) leaves / paths (panels: Leaves, Paths, Old leaves, Old paths).

The first model is clearly related to the height of the plane tree, and the second one to the Horton-Strahler number via the pruning index [27,24]. While there are no surprises about the number of rounds that the process takes here, we are interested in how the fringe develops. The number of leaves and nodes altogether in the remaining tree after a fixed number of reduction rounds is the main parameter analyzed in this paper.
For the sake of simplicity, we will use the same notation for each of the following reduction analyses. In case we need to compare objects from two different sections, we will distinguish them by adding appropriate superscripts.
The random variable $X_{n,r}$ models the tree size after reducing a plane tree of size $n$ (that is chosen uniformly at random among all trees with $n$ nodes) $r$ times iteratively according to one of our four reductions. If a tree does not "survive" $r$ rounds of reductions, we consider the size of the resulting tree to be 0. In particular, for $r = 0$, the given plane tree is not changed and $X_{n,0} = n$.
As we will see, a key aspect of the analysis of $X_{n,r}$ is the translation of the algorithmic description of the reduction into an operator $\Phi$ that acts on the corresponding generating functions.
In Section 2, the reduction cutting away all leaves from the tree is discussed. Section 2.1 contains all necessary auxiliary concepts required in order to study the $r$-fold application of this reduction. In Section 2.2, we determine the operator $\Phi$ acting on the corresponding generating function explicitly and prove some direct consequences. Then, in Section 2.3 we carry out the analysis of the behavior of $X_{n,r}$ by computing explicit expressions and asymptotic expansions for the factorial moments of $X_{n,r}$ as well as a central limit theorem. Section 3 is devoted to the study of the reduction that cuts away all paths. As we will see in Section 3.1, we can actually obtain all results regarding the behavior of $X_{n,r}$ as consequences of the corresponding results in Section 2. In Section 3.2, we analyze the asymptotic behavior of the expected number of paths required to construct a plane tree of size $n$, i.e. the number of paths we can cut away until the tree cannot be reduced any further.
Sections 4 and 5 are devoted to the analysis of reductions removing only leftmost leaves and leftmost paths from the tree, respectively. In particular, in Section 5.3, we study the total number of old paths that can be removed from a tree until it cannot be reduced any further.
On a general note, the computationally heavy parts of this paper have been carried out with the open-source computer mathematics system SageMath [5], and the corresponding worksheets are available for download. In particular, there are the following files:
• treereductions.ipynb for most of the asymptotic computations in Sections 2, 3, and 4,
• old_paths.ipynb for most of the asymptotic computations in Section 5,
• factorial_moments_leaves.ipynb for computation of the factorial moments in Theorem 1,
• factorial_moments_old_paths.ipynb for computation of the factorial moments in Theorem 6.
Additionally, in order to run these computations yourself, you also need to download the following two utility files:
• identities_common.py,
• conditional_substitution.py.
All these files, including some instructions on how to use them, can be found at https://benjamin-hackl.at/publications/treereductions/.

CUTTING LEAVES
2.1. Preliminaries. In this part of the paper we investigate the effect of the tree reduction that cuts away all leaves from a given tree. However, before we can do so, we require some auxiliary concepts, which we discuss in this section. Most importantly, we need a generating function counting plane trees with respect to their number of inner nodes and leaves, which is intimately linked to Narayana numbers. The generating function presented in the following proposition is actually well-known (see, e.g. [12,Example III.13]).

Proposition 2.1. The generating function $T(z, t)$ which enumerates plane trees with respect to their internal nodes (marked by the variable $z$) and leaves (marked by $t$) is given explicitly by
\[ T(z, t) = \frac{1 - (z - t) - \sqrt{1 - 2(z + t) + (z - t)^2}}{2}. \tag{1} \]
Proof. This can be obtained directly from the symbolic equation describing the combinatorial class $\mathcal{T}$ of plane trees, which is illustrated in Figure 2. In particular, the two node symbols in the figure represent leaves and internal nodes, respectively.

FIGURE 2. Symbolic equation for plane trees.

The symbolic equation translates into the functional equation
\[ T(z, t) = t + z\,\frac{T(z, t)}{1 - T(z, t)}, \]
which yields (1) after solving for $T(z, t)$ and choosing the appropriate branch.
In the context of plane trees, the so-called Narayana numbers count the number of trees with a given size and a given number of leaves (cf. [8]). As these numbers will appear throughout the entire paper, we introduce them formally and investigate some properties within the following statements.
Definition. The Narayana numbers are defined as
\[ N_{n,k} = \frac{1}{n} \binom{n}{k-1} \binom{n}{k} \]
for $1 \le n$ and $1 \le k \le n$, and $N_{0,0} = 1$. All other indices give $N_{n,k} = 0$. Combinatorially, for $n \ge 1$ the Narayana number $N_{n,k}$ corresponds to the number of plane trees with $n$ edges (i.e. $n + 1$ nodes) and $k$ leaves. The Narayana polynomials are defined as for $n \ge 1$ and $N_0(x) = 1$, and the associated Narayana polynomials are defined as for $n \ge 0$. Note that $N_n(1) = \tilde{N}_n(1) = C_n = \frac{1}{n+1} \binom{2n}{n}$ is the $n$th Catalan number.
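The combinatorial meaning of the Narayana numbers can be checked by brute force for small sizes. The following is a minimal sketch, not part of the authors' SageMath worksheets: the nested-tuple tree encoding and the helper names `forests`, `plane_trees`, `leaves`, and `narayana` are our own illustrative choices.

```python
from collections import Counter
from functools import lru_cache
from math import comb

@lru_cache(maxsize=None)
def forests(m):
    """All ordered forests with m nodes; a tree is encoded as the tuple of its subtrees."""
    if m == 0:
        return ((),)
    return tuple(
        (tree,) + rest
        for first in range(1, m + 1)      # size of the first tree in the forest
        for tree in forests(first - 1)    # a tree = root + forest of its children
        for rest in forests(m - first)
    )

def plane_trees(size):
    """All plane trees with `size` nodes (a root on top of a forest with size - 1 nodes)."""
    return forests(size - 1)

def leaves(tree):
    """Number of leaves; the empty tuple encodes a single leaf."""
    return 1 if not tree else sum(leaves(child) for child in tree)

def narayana(n, k):
    """N_{n,k} = (1/n) * C(n, k-1) * C(n, k) for 1 <= k <= n."""
    return comb(n, k - 1) * comb(n, k) // n

for n in range(1, 8):
    counts = Counter(leaves(t) for t in plane_trees(n + 1))
    # trees with n edges and k leaves are counted by N_{n,k}
    assert all(counts[k] == narayana(n, k) for k in range(1, n + 1))
    # summing over k gives the Catalan number C_n
    assert sum(counts.values()) == comb(2 * n, n) // (n + 1)
```

The enumeration grows like the Catalan numbers, so this check is only meant for small $n$.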
Remark. The generating function $\frac{1}{z} T(z, z) = \frac{1 - \sqrt{1 - 4z}}{2z}$ enumerates Catalan numbers, see [7,Theorem 3.2], and the generating function $T(z, tz)$ enumerates Narayana numbers We will frequently use this relation in the form Furthermore, it is easily checked that $T(z, tz)$ satisfies the ordinary differential equation Extracting the coefficient of $z^{n+2}$ then yields the recurrence relation for $n \ge 0$.
The following proposition gives another useful property of associated Narayana polynomials. Proposition 2.2. Let n ≥ 0, then we have the relation t n+1Ñ Proof. This relation follows from extracting the coefficient of z n+1 from the identity T (tz, z) = T (z, tz) + (1 − t)z with the help of (3). While it is straightforward to prove that the identity is valid by means of algebraic manipulation, we also give a combinatorial proof.
From a combinatorial point of view, both generating functions $T(tz, z)$ and $T(z, tz)$ enumerate plane trees where $z$ marks the tree size; the only difference is that the variable $t$ enumerates inner nodes in $T(tz, z)$ and leaves in $T(z, tz)$. We want to show that for trees of size $n \ge 2$, these two classes are equal, resulting in $T(tz, z) = T(z, tz) + (1 - t)z$.
To construct an appropriate bijection between the class of trees of size $n$ with $k$ leaves and the class of trees of size $n$ with $k$ inner nodes we need to have a closer look at the well-known rotation correspondence [12, I.5.3], which is a bijection between plane trees of size $n$ and binary trees with $n - 1$ inner nodes. In fact, the leaves in the binary tree are strongly related to the leaves and inner nodes of the original tree:
- Left leaves in the binary tree are only attached to those nodes whose companions in the plane tree have no children, i.e., to those who correspond to leaves in the plane tree.
- Right leaves, on the other hand, are attached to nodes whose companion nodes in the plane tree have no sibling to the right of them. This means that for every node with children, i.e., for every inner node, there is precisely one rightmost child and thus precisely one right leaf in the binary tree.
The bijection between the two tree classes can now be described as follows: given some tree of size n and k leaves, apply the rotation correspondence in order to obtain a binary tree. Then mirror the binary tree by swapping all left and right children. Transform this mirrored tree back by means of the inverse rotation correspondence, and the result is a plane tree of size n and k inner nodes as mirroring the binary tree swapped the number of left and right leaves in the tree. This proves the proposition.
Derivatives of the associated Narayana polynomials defined above will occur within the analysis of a reduction model later, which is why we compute some special values in the following proposition.

Proposition 2.3.
Evaluating the $r$th derivative of the associated Narayana polynomials at 1, i.e. $\tilde{N}_n^{(r)}(1)$, gives the number of trees with $n + 1$ nodes where precisely $r$ leaves are selected and labeled from 1 to $r$. In particular, for $n \ge 1$ we have Proof. The combinatorial interpretation follows immediately by rewriting where we used the notation $k^{\underline{r}} = k(k - 1) \cdots (k - r + 1)$ for the falling factorial. Explicit values can be obtained by differentiating (2) $r$ times with respect to $t$, then setting $t = 1$ and extracting the coefficient of $z^{n+1}$.
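The combinatorial interpretation in Proposition 2.3 can be turned into a small computation: choosing and labeling $r$ leaves in a tree with $k$ leaves can be done in $k^{\underline{r}}$ ways, so the count is a sum of falling factorials weighted by Narayana numbers. The sketch below assumes exactly this reading; the function names are illustrative. For $r = 0$ the sum is the Catalan number $C_n$, and for $r = 1$ it is the well-known total number of leaves $\frac{1}{2}\binom{2n}{n}$.

```python
from math import comb, perm

def narayana(n, k):
    """N_{n,k}: plane trees with n edges and k leaves."""
    return comb(n, k - 1) * comb(n, k) // n

def labeled_leaf_selections(n, r):
    """Trees with n + 1 nodes in which r leaves are selected and labeled 1..r.

    perm(k, r) is the falling factorial k (k-1) ... (k-r+1).
    """
    return sum(perm(k, r) * narayana(n, k) for k in range(1, n + 1))

for n in range(1, 12):
    # r = 0: plain count of trees, the Catalan number C_n
    assert labeled_leaf_selections(n, 0) == comb(2 * n, n) // (n + 1)
    # r = 1: total number of leaves over all trees with n + 1 nodes
    assert 2 * labeled_leaf_selections(n, 1) == comb(2 * n, n)
```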
Remark. By the combinatorial interpretation of Proposition 2.3 we find that $\tilde{N}_n'(1) = \frac{1}{2} \binom{2n}{n}$ enumerates the number of leaves, summed over all trees with $n + 1$ nodes. At the same time, as there are $C_n = \frac{1}{n+1} \binom{2n}{n}$ such trees, the total number of nodes in these trees is $\binom{2n}{n}$. This implies that exactly half of all nodes in all trees of given size are leaves! In fact, this interpretation also motivates a second, purely combinatorial proof of the explicit value of $\tilde{N}_n'(1)$: the rotation correspondence maps trees of size $n + 1$ to binary trees with $n$ inner nodes. In the proof of Proposition 2.2 we already observed that the number of left leaves in the binary tree obtained from the rotation correspondence is equal to the number of leaves in the plane tree.
As binary trees with $n$ inner nodes have $n + 1$ leaves, and as there are $C_n$ binary trees with $n$ inner nodes, the total number of leaves in all binary trees with $n$ inner nodes is $(n + 1) C_n = \binom{2n}{n}$.

In addition to the polynomials related to the Narayana numbers, there is another well-known sequence of polynomials that will occur throughout this paper.
For many identities involving Fibonacci numbers, there is an analogous statement for Fibonacci polynomials. The identity presented in the following proposition will be used repeatedly throughout this paper.
Proof. The left-hand side of (6) can be expressed as the determinant of . At the same time, for r, s ≥ 1 we can write Combining these two observations yields which proves the statement.
Observe that setting $s = r + 1$ in (6) yields the identity which we will make heavy use of later on. An important tool in the context of plane trees is the substitution $z = u/(1 + u)^2$, which allows us to write some expressions in a manageable form. It is easy to check that with this substitution, we can write Fibonacci polynomials as The fact that this substitution also works for Fibonacci polynomials is not that surprising, as $z F_r(-z)/F_{r+1}(-z)$ is the generating function of plane trees with height $\le r$ (see [3]).

The tree reduction $\rho\colon \mathcal{T} \setminus \{\bullet\} \to \mathcal{T}$ we want to investigate now can be explained very easily; here $\mathcal{T}$ denotes the class of plane trees and $\bullet$ the tree consisting of a single node. For any tree $\tau \in \mathcal{T} \setminus \{\bullet\}$ we obtain the reduced tree $\rho(\tau)$ simply by removing all leaves from $\tau$. Repeated application of $\rho$ to a tree is illustrated in Figure 3.
It is easy to see that this operator is certainly not injective: there are many trees that reduce to the same tree. However, it is also easy to see that ρ is surjective, as we can always construct an expanded tree that reduces to any given tree τ by attaching leaves to all leaves of τ.
In fact, the operator $\rho^{-1}$ mapping trees $\tau \in \mathcal{T}$ to the set of preimages is easier to handle from a combinatorial point of view. This is because we can model the expansion of trees in the language of generating functions.

Proposition 2.5. Let $\mathcal{F} \subseteq \mathcal{T}$ be a family of plane trees with bivariate generating function $f(z, t)$, where $z$ marks inner nodes and $t$ marks leaves. Then the generating function of $\rho^{-1}(\mathcal{F})$, the family of trees whose reduction is in $\mathcal{F}$, is given by
\[ \Phi(f(z, t)) = (1 - t)\, f\Big( \frac{z}{(1 - t)^2}, \frac{zt}{(1 - t)^2} \Big). \tag{9} \]
Proof. It is obvious from a combinatorial point of view that the operator $\Phi$ has to be linear. Thus we only have to determine how a tree represented by an arbitrary monomial $z^n t^k$, i.e. a tree $\tau$ with $n$ inner nodes and $k$ leaves, is expanded. In order to obtain all possible tree expansions from $\tau$, we perform the following operations: first, all leaves of $\tau$ are expanded by appending a nonempty sequence of leaves to each of them. Then, every inner node of $\tau$ is expanded by appending (possibly empty) sequences of leaves between two of its children as well as before the first and after the last one.
In terms of generating functions, expanding the leaves of $\tau$ corresponds to replacing $t$ by $zt/(1 - t)$. Expanding the inner vertices is a bit more involved: by considering that every inner node has precisely one more available position to attach new leaves than it has children, we find that there are $2n + k - 1$ available positions overall within $\tau$. Therefore we find
\[ \Phi(z^n t^k) = z^n \Big( \frac{zt}{1 - t} \Big)^k \frac{1}{(1 - t)^{2n + k - 1}} = (1 - t) \Big( \frac{z}{(1 - t)^2} \Big)^n \Big( \frac{zt}{(1 - t)^2} \Big)^k, \]
which, as $\Phi$ is linear, immediately proves (9).

Corollary 2.6. The generating function for plane trees $T(z, t)$ satisfies the functional equation
\[ T(z, t) = (1 - t)\, T\Big( \frac{z}{(1 - t)^2}, \frac{zt}{(1 - t)^2} \Big) + t. \tag{10} \]
Proof. This follows directly from the fact that $\rho\colon \mathcal{T} \setminus \{\bullet\} \to \mathcal{T}$ is surjective, so that $\rho^{-1}(\mathcal{T})$ consists of all plane trees except the single-node tree, whose generating function is $t$.

Corollary 2.7. The Narayana numbers satisfy the identity Proof. The result follows from extracting the coefficient of $z^n t^k$ from both sides of (10).
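The expansion described in the proof of Proposition 2.5 can be sanity-checked by brute force. Reading the construction as $\Phi(z^n t^k) = z^n (zt/(1-t))^k (1-t)^{-(2n+k-1)}$ (our reading of the description above), a tree with $n$ inner nodes and $k$ leaves should have exactly $\binom{j + 2n + k - 2}{j - k}$ expansions with $j$ leaves. The sketch below enumerates small plane trees and compares; the tuple encoding and all helper names are illustrative assumptions, not the authors' code.

```python
from collections import Counter
from functools import lru_cache
from math import comb

@lru_cache(maxsize=None)
def forests(m):
    """All ordered forests with m nodes; a tree is the tuple of its subtrees."""
    if m == 0:
        return ((),)
    return tuple(
        (tree,) + rest
        for first in range(1, m + 1)
        for tree in forests(first - 1)
        for rest in forests(m - first)
    )

def cut_leaves(tree):
    """The reduction rho: remove all leaves (None for the single-node tree)."""
    if not tree:
        return None
    return tuple(cut_leaves(child) for child in tree if child)

def inner_and_leaves(tree):
    """Pair (number of inner nodes, number of leaves) of a tree."""
    if not tree:
        return (0, 1)
    pairs = [inner_and_leaves(child) for child in tree]
    return (1 + sum(p[0] for p in pairs), sum(p[1] for p in pairs))

MAX_SIZE = 9
# (reduced tree, number of leaves j of the expanded tree) -> number of expansions
preimages = Counter()
for size in range(2, MAX_SIZE + 1):
    for tree in forests(size - 1):
        preimages[(cut_leaves(tree), inner_and_leaves(tree)[1])] += 1

for (reduced, j), count in preimages.items():
    n, k = inner_and_leaves(reduced)
    # all expansions with j leaves have size n + k + j, so each key is complete
    assert count == comb(j + 2 * n + k - 2, j - k)
```

Every expansion of a fixed tree with a fixed number of new leaves has the same size, which is why a size-bounded enumeration still sees complete preimage sets.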
Remark. Note that in [1] there is a very short proof based on Dyck paths for this identity, and actually the argumentation there is strongly related to our tree reduction here: by the well-known glove bijection, it is easy to see that cutting away all leaves of a plane tree translates into removing all peaks within the corresponding Dyck path.
We are now interested in determining a multivariate generating function enumerating plane trees with respect to the tree size as well as the size of the tree after applying the tree reduction ρ a fixed number of times.
Proposition 2.8. The trivariate generating function $G_r(z, v_I, v_L)$ enumerating plane trees whose leaves can be cut at least $r$ times, where $z$ marks the tree size, and $v_I$ and $v_L$ mark the number of inner nodes and leaves of the $r$-fold cut tree, respectively, is given by
Proof. First, observe that formally, we can obtain the generating function enumerating plane trees that can be reduced at least $r$ times with respect to their size by considering $\Phi^r(T(z, t))|_{t=z}$. If we additionally track some size parameter like the number of inner nodes or the number of leaves before the expansion by marking their size with $v_I$ and $v_L$, then we obtain a generating function for plane trees that can be reduced at least $r$ times where $v_I$ and $v_L$ mark inner nodes and leaves in the original tree and $z$ marks the size of the expanded tree. From a different point of view, $z$ marks the size of the original tree and $v_I$ and $v_L$ mark the number of inner nodes and leaves of the $r$-fold reduced tree, meaning that we have $G_r(z, v_I, v_L) = \Phi^r(T(z v_I, t v_L))|_{t=z}$, which proves the first equation in (11).
As Φ is linear, we are mainly interested in finding a representation for Φ r (z n t k )| t=z . To do so, we consider the strongly related operator It is easy to prove by induction that iterative application of Φ can be expressed in terms of Ψ via which means that we can concentrate on the investigation of the linear operator Ψ. Note that Ψ is also multiplicative, meaning that Ψ r (z n t k ) = Ψ r (z) n Ψ r (t) k .
Again by induction, it is easy to show that the recurrences We prove by induction that these quantities can be represented by means of Fibonacci polynomials as for r ≥ 0, where the recurrence relations from above, the identity (7) as well as the relation for r ≥ 0 play integral parts in the proof. With these explicit representations, we find Then, using (8) and rewriting the right-hand side of (12) By linearity, we are allowed to apply Φ r to every summand in the power series expansion of f (z, t) separately-which proves the statement.
The generating function $G_r(z, v, v)$ tells us how many nodes (marked by $v$) are still in the tree after $r$ reductions. For the sake of brevity we set $G_r(z, v) := G_r(z, v, v)$. It is completely described in terms of the function $T(z, t)$, although in a non-trivial way. Results about moments and the limiting distribution can be extracted from this explicit form.
With the help of the mathematics software system SageMath [5], the generating function G r (z, v) can be expanded. For small values of r, the first few summands are As announced in the introduction, we investigate the behavior of the random variable X n,r = X L n,r that models the number of nodes which are left after reducing a random tree τ with n nodes r-times. In case the r-fold application of ρ to τ is not defined, we consider the resulting tree size to be 0, i.e., the random variable X n,r = 0 for these trees. Note that the tree τ is chosen uniformly at random among all trees of size n. With the help of the generating function G r (z, v) we are able to express the probability generating function of X n,r as v X n,r = a n,r where a n,r is the number of trees of size n which are empty after reducing r-times. We have a n, In addition to X n,r , we also consider the random variables I n,r and L n,r that model the number of inner nodes and leaves, respectively, that remain after reducing a random tree with n nodes r times. The generating functions corresponding to I n,r and L n,r are G r (z, v, 1) and G r (z, 1, v), respectively.
The relations $X_{n,r} \overset{d}{=} I_{n,r} + L_{n,r}$ and $I_{n,r} \overset{d}{=} X_{n,r+1}$ hold by the combinatorial interpretation of the operator $\Phi$.
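Both relations can be observed directly on small trees. The following sketch enumerates all plane trees up to a given size and checks them tree by tree, which in particular gives equality in distribution; the convention that a tree not surviving $r$ rounds contributes size 0 follows the text, while the encoding and helper names are our own.

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def forests(m):
    """All ordered forests with m nodes; a tree is the tuple of its subtrees."""
    if m == 0:
        return ((),)
    return tuple(
        (tree,) + rest
        for first in range(1, m + 1)
        for tree in forests(first - 1)
        for rest in forests(m - first)
    )

def cut_leaves(tree):
    """Remove all leaves of a tree that has at least two nodes."""
    return tuple(cut_leaves(child) for child in tree if child)

def cut_r_times(tree, r):
    """Apply the leaf reduction r times; None if the tree does not survive."""
    for _ in range(r):
        if not tree:              # single node: the reduction is undefined
            return None
        tree = cut_leaves(tree)
    return tree

def size(tree):
    return 0 if tree is None else 1 + sum(size(child) for child in tree)

def inner(tree):
    return 0 if not tree else 1 + sum(inner(child) for child in tree)

def leaf_count(tree):
    if tree is None:
        return 0
    return 1 if not tree else sum(leaf_count(child) for child in tree)

def check_relations(max_size, max_r):
    """Check X = I + L and I_{n,r} = X_{n,r+1} on every tree up to max_size."""
    for n in range(1, max_size + 1):
        for tree in forests(n - 1):
            for r in range(max_r + 1):
                reduced = cut_r_times(tree, r)
                assert size(reduced) == inner(reduced) + leaf_count(reduced)
                assert inner(reduced) == size(cut_r_times(tree, r + 1))
    return True

check_relations(8, 3)
```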

Asymptotic Analysis.
We find explicit generating functions for the factorial moments of the random variables X n,r , I n,r , and L n,r . Proposition 2.9. The dth factorial moments of X n,r , I n,r and L n,r are given by (15), see (5). Proof. We use the abbreviations We consider the exponential generating function of to be a Taylor series and obtain By Proposition 2.8, extracting the coefficient of q d yields We have By using the fact that 1 − u r+2 and by choosing α and β such that .
where (3) has been used. Noting that completes the proof of (14).
For the proof of (15), we proceed in the same way and use the identity .
From the proof of Proposition 2.9, we extract the following identities for the modified Narayana polynomials.
n−1 denotes the dth derivative ofÑ n−1 . Proof. In the proof of Proposition 2.9, we showed that Expanding the left side using (3) and evaluating the derivative yields (16) (where u r has been replaced by the independent variable x). The identity (17) is proved in the same way.
Corollary 2.10. The expected value of X n+1,r is explicitly given by Proof. Using Proposition 2.9 and Cauchy's integral formula, we have where γ is a circle around 0 with a sufficiently small radius such that γ , the image of γ under the transformation, is a small contour circling 0 exactly once as well.
Expanding (1−u r+1 ) −1 into a geometric series and exchanging integration and summation, we obtain which implies the result.
Having determined a closed form for this generating function allows us to analyze the asymptotic behavior of X n,r in a relatively straightforward way. Theorem 1. Let r ∈ 0 be fixed and consider n → ∞. Then the expected size and the corresponding variance of an r-fold cut plane tree are given by and The factorial moments are asymptotically given by Note that all O-constants above depend implicitly on r.
Proof. In a nutshell, we want to extract the growth of the derivatives of the generating func- , as dividing these quantities by C n−1 yields the factorial moments. We want to extract the growth by means of singularity analysis (cf. [10]).
In order to do so, we first need to establish the location of the dominant singularity of these generating functions, which are explicitly given in (14).
The singularities of (14) are roots of unity in terms of $u$. Substituting back $u = (1 - \sqrt{1 - 4z})/(2z) - 1$ maps these roots of unity to real numbers greater than or equal to 1/4, and only $u = 1$ is mapped to $z = 1/4$. Thus $z = 1/4$ is the dominant singularity of (14). A more detailed treatment of these analytic properties of the substitution $z = u/(1 + u)^2$ can be found in [13,Proposition 2.3].
As N 0 (x) = 1, we obtain the expansion for the function on the right-hand side of (14) with d = 1. Then, the expansion for fixed κ ∈ yields By singularity analysis, the nth coefficient, normalized by C n−1 , is asymptotically The higher order factorial moments follow similarly by expanding the function on the right-hand side of (14) for general d > 1 around u = 1 with the help of SageMath, where in particular the explicit values of the derivatives of the Narayana polynomials from Proposition 2.3 are required.
Singularity analysis of the resulting expansion yields the expression given in the statement of the theorem. Finally, note that the variance can be computed by using $\mathbb{V} X_{n,r} = \mathbb{E} X_{n,r}^{\underline{2}} + \mathbb{E} X_{n,r} - (\mathbb{E} X_{n,r})^2$.

Theorem 2.
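The size $X_{n,r}$ of the tree obtained from a random plane tree with $n$ nodes by cutting it $r$ times is, after standardization, asymptotically normally distributed for $n \to \infty$ and fixed $r$, i.e.,
\[ \frac{X_{n,r} - \frac{n}{r + 1}}{\sqrt{\frac{r(r + 2)}{6(r + 1)^2}\, n}} \xrightarrow{d} \mathcal{N}(0, 1). \]
To be more precise, for $x \in \mathbb{R}$ we have
\[ \mathbb{P}\Big( \frac{X_{n,r} - n\mu}{\sqrt{\sigma^2 n}} \le x \Big) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{x} e^{-t^2/2}\, dt + O(n^{-1/2}) \]
with $\mu = \frac{1}{r+1}$ and $\sigma^2 = \frac{r(r+2)}{6(r+1)^2}$ and where the O-constant depends implicitly on $r$. As $I_{n,r-1} \overset{d}{=} X_{n,r}$, the same also holds for this random variable.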
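The centering constant $\mu = 1/(r+1)$ can be observed empirically. The sketch below is a simulation under our own assumptions, not the authors' method: it samples uniformly random plane trees through uniformly random Dyck paths (cycle lemma), and it uses the observation that a node is still present after $r$ leaf-cutting rounds exactly when the height of its subtree is at least $r$. All function names, the sample size, and the seed are illustrative.

```python
import random

def random_dyck_path(m, rng):
    """Uniform Dyck path with m up-steps (+1) and m down-steps (-1), via the cycle lemma."""
    steps = [1] * m + [-1] * (m + 1)
    rng.shuffle(steps)
    total, lowest, cut = 0, 0, -1
    for i, step in enumerate(steps):
        total += step
        if total < lowest:            # first position where a new minimum is reached
            lowest, cut = total, i
    rotated = steps[cut + 1:] + steps[:cut + 1]
    return rotated[:-1]               # drop the final extra down-step

def size_after_cuts(path, r):
    """Nodes left after r leaf-cutting rounds = nodes whose subtree height is >= r."""
    stack = [-1]                      # per open node: max height among finished children
    survivors = 0
    for step in path + [-1]:          # the extra down-step closes the root
        if step == 1:
            stack.append(-1)          # open a new child
        else:
            height = stack.pop() + 1  # height of the subtree just completed
            survivors += height >= r
            if stack:
                stack[-1] = max(stack[-1], height)
    return survivors

def mean_fraction(n, r, samples, seed=0):
    """Monte Carlo estimate of E[X_{n,r}] / n for uniformly random trees with n nodes."""
    rng = random.Random(seed)
    total = sum(size_after_cuts(random_dyck_path(n - 1, rng), r) for _ in range(samples))
    return total / (samples * n)

for r in (1, 2, 3):
    print(r, mean_fraction(1000, r, 200), 1 / (r + 1))
```

The estimates should lie close to $1/(r+1)$; the remaining gap comes from sampling noise and lower-order terms of the expectation.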
The rest of this section is devoted to the proof of this central limit theorem. In order to derive the fact that the number of remaining nodes after r reductions is asymptotically normally distributed, we first show that the number of nodes that are deleted after r reductions is asymptotically normally distributed. Then, as the sum of the number of remaining nodes and the number of deleted nodes is equal to the original tree size, we obtain immediately that the number of remaining nodes has to be asymptotically normally distributed as well.
We begin by considering the function F r : → 0 which maps a plane tree τ to the number of nodes that are deleted when reducing the tree r times, i.e. the difference between the size of τ and the size of ρ r (τ). Let τ n now denote a plane tree with n nodes.
For the sake of convenience, we consider F r (τ n ) to be n if r is larger than the maximal number of reductions that can be applied to τ n before the tree cannot be reduced further. In particular, this means that F r ( ) = 1 for r ≥ 1.
It is easy to see that the parameter F r (τ n ) is a so-called additive tree parameter, meaning that F r (τ n ) = F r (τ i 1 ) + · · · + F r (τ i ) + f r (τ n ) holds, where τ i 1 , . . . , τ i are the subtrees rooted at the children of the root of τ n , and f r : → {0, 1} is a toll function recursively defined by for r ≥ 1 and f 0 (τ n ) = 0. In order to prove asymptotic normality for additive tree parameters, we can use [25, Theorem 2], which requires us to show that the expected value of the toll function is exponentially decreasing in n. This is done in the following lemma. Remark. Of course, n − F r (τ n ) is also an additive parameter. However, the expected value of the corresponding growth function is not exponentially decreasing.
Observe that F r (τ n ) = n holds if and only if τ n has height less than r, as removing all leaves from a tree reduces its height by precisely one. Therefore, the generating function Q r (z) is the generating function enumerating trees of height less than r.
It is well-known (cf. [3]) that the generating function for plane trees of height less than $r$ can be expressed in terms of Fibonacci polynomials as The roots of $F_r(-z)$ are also well-known and can be written as $\alpha_{j,r} = (4 \cos^2(j\pi/r))^{-1}$ for $j = 1, \ldots, \lfloor (r - 1)/2 \rfloor$. Thus $Q_r(z)$ is a rational function and its coefficients have the form $C_{n-1} q_{n,r} = \sum_j c_{j,r}\, \alpha_{j,r}^{-n}$ for constants $c_{j,r}$. We have $|\alpha_{j,r}| > 1/4$. Since the Catalan numbers $C_{n-1}$ grow like $4^n$ up to a polynomial factor, there exists a constant $c \in (0, 1)$ such that $q_{n,r} = O(c^n)$.
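The stated roots can be verified numerically. The definition of the Fibonacci polynomials is not reproduced in this part of the text, so the sketch below assumes the common convention $F_0(z) = 0$, $F_1(z) = 1$, $F_r(z) = F_{r-1}(z) + z F_{r-2}(z)$; under this assumption $F_r(-z)$ has degree $\lfloor (r-1)/2 \rfloor$, which matches the number of roots listed above.

```python
from math import cos, pi

def fibonacci_poly(r, z):
    """F_r(z) with F_0 = 0, F_1 = 1, F_r = F_{r-1} + z * F_{r-2} (assumed convention)."""
    a, b = 0.0, 1.0
    for _ in range(r):
        a, b = b, b + z * a
    return a

def roots(r):
    """alpha_{j,r} = 1 / (4 cos^2(j pi / r)) for j = 1, ..., floor((r - 1) / 2)."""
    return [1 / (4 * cos(j * pi / r) ** 2) for j in range(1, (r - 1) // 2 + 1)]

for r in range(3, 12):
    for alpha in roots(r):
        assert abs(fibonacci_poly(r, -alpha)) < 1e-6   # alpha is a root of F_r(-z)
        assert alpha > 0.25                            # all roots lie beyond 1/4
```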
Thus, by the strategy discussed above, we find that not only $F_r(\tau_n)$ but also $X_{n,r} = n - F_r(\tau_n)$ is asymptotically normally distributed.
Remark. Note that the fact that F 1 (τ n ) is asymptotically normally distributed means that the Narayana numbers are asymptotically normally distributed, see for example [7, Theorem 3.13].
As sketched above, Lemma 2.11 allows us to apply [25,Theorem 2] in order to prove that F r (τ n ), and therefore also X n,r = n − F r (τ n ) is asymptotically normally distributed. All that remains to prove is that the speed of convergence is O(n −1/2 ).
We do so by noting that the proof for asymptotic normality in Wagner's theorem is based on [7, Theorem 2.23], where a version of Hwang's Quasi-Power Theorem [16] without quantification of the speed of convergence is used. Replacing this argument with the multidimensional quantified version given in [15] then gives us the desired speed of convergence of O(n −1/2 ).

CUTTING PATHS
3.1. The Expansion Operator and Results. Let $\mathcal{P}$ denote the combinatorial class of paths, i.e. trees in which every node is either a leaf or has precisely one child. The tree reduction $\rho\colon \mathcal{T} \setminus \mathcal{P} \to \mathcal{T}$ which we will focus on in this section reduces a tree by cutting away all paths of the tree. This operation is illustrated in Figure 4. Analogously to our approach in Section 2.2, we first determine the corresponding expansion operator $\Phi$. In order to do so, we need the generating function for the family of paths $\mathcal{P}$, which is given by $P = P(z, t) = \frac{t}{1 - z}$. For the sake of readability, we omit the arguments of $P$.

Proposition 3.1. Let $\mathcal{F} \subseteq \mathcal{T}$ be a family of plane trees with bivariate generating function $f(z, t)$, where $z$ marks inner nodes and $t$ marks leaves. Then the generating function for $\rho^{-1}(\mathcal{F})$, the family of trees whose reduction is in $\mathcal{F}$, is given by
\[ \Phi(f(z, t)) = (1 - P)\, f\Big( \frac{z}{(1 - P)^2}, \frac{z P^2}{(1 - P)^2} \Big). \tag{21} \]
Proof. The fact that $\Phi$ is a linear operator is obvious from a combinatorial point of view, meaning that we may concentrate on some tree $\tau$ with $n$ inner nodes and $k$ leaves, represented by $z^n t^k$.
We follow the proof of Proposition 2.5 and observe that all possible tree expansions of $\tau$ can be obtained by the following operations: the leaves of $\tau$ are expanded by appending a sequence of at least two paths to each of them. Note that appending a single path to a leaf is not allowed, because this would just extend the path ending in that leaf, which causes ambiguity. Then, the inner nodes are expanded as well by appending (possibly empty) sequences of paths to the $2n + k - 1$ available positions between, before, and after their children.
Translating this expansion to the language of generating functions yields
\[ \Phi(z^n t^k) = z^n \Big( \frac{z P^2}{1 - P} \Big)^k \frac{1}{(1 - P)^{2n + k - 1}}, \]
which proves (21).

Corollary 3.2. The generating function for plane trees $T(z, t)$ satisfies the functional equation
\[ T(z, t) = \Phi(T(z, t)) + P(z, t). \]
Proof. Surjectivity of $\rho$ implies $\rho^{-1}(\mathcal{T}) = \mathcal{T} \setminus \mathcal{P}$, which proves the statement after translating this into the language of generating functions with the help of $\Phi$.
In the following proposition, we determine the generating function $G_r(z, v_I, v_L)$ measuring the effect of applying the path reduction $r$ times on the size of the tree. Most interestingly, we will see that the path reduction is in fact strongly related to the leaf reduction from the previous section.

Proposition 3.3. The trivariate generating function $G_r(z, v_I, v_L)$ enumerating plane trees whose paths can be cut at least $r$ times, where $z$ marks the tree size and $v_I$ and $v_L$ mark the number of inner nodes and leaves of the $r$-fold cut tree, respectively, is given by
Proof. By the same reasoning as in the proof of Proposition 2.8, the generating function we are interested in is $G_r(z, v_I, v_L) = \Phi^r(T(z v_I, t v_L))|_{t=z}$, meaning that we want to study the iterated application of $\Phi$. To do so, we consider the strongly related operator The relation can be proved easily by induction and enables us to determine the behavior of $\Phi$ via $\Psi$.
First of all, for $r \ge 0$ and $r \ge 1$, the relations and
\[ \Psi^r(P) = \frac{z\, (\Psi^{r-1}(P))^2}{\prod_{j=0}^{r-1} (1 - \Psi^j(P))^2 - z} \]
can be proved easily by induction, respectively. Also observe that we can write $\Psi^r(t) = \Psi^r(z)\, \Psi^{r-1}(P)^2$. Now let $f_r = \Psi^r(z)|_{t=z}$, $g_r = \Psi^r(t)|_{t=z}$, and $h_r = \Psi^r(P)|_{t=z}$. With the help of the identity $\prod_{j=0}^{r} (1 + u^{2^j}) = \frac{1 - u^{2^{r+1}}}{1 - u}$ we are able to prove the explicit formula where $z = u/(1 + u)^2$ and the second equation is a consequence of (8). Using (23), we immediately find Putting everything together yields which directly implies the statement.
The following result shows that there is an intimate connection between the "cutting leaves"-reduction from Section 2 and the "cutting paths"-reduction, as can be seen after comparing the statement of Proposition 2.8 with the statement of Proposition 3.3. This connection is now especially important for the analysis of the random variable $X_{n,r} = X^P_{n,r}$ modeling the number of nodes that are left after reducing a random tree $\tau$ with $n$ nodes $r$ times by removing all paths. In fact, it follows that $X^P_{n,r} \overset{d}{=} X^L_{n,2^{r+1}-2}$, meaning that the asymptotic analysis of the factorial moments of $X^P_{n,r}$ as well as the limiting distribution follow directly from the corresponding results in Section 2.3.
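The distributional identity between the two reductions can be checked by brute force for small trees. The sketch below implements both reductions on a nested-tuple encoding (our own choice) and compares the size distributions after $r$ path-cuts and after $2^{r+1} - 2$ leaf-cuts. Note that the identity holds in distribution only: on an individual tree the two sizes generally differ.

```python
from collections import Counter
from functools import lru_cache

@lru_cache(maxsize=None)
def forests(m):
    """All ordered forests with m nodes; a tree is the tuple of its subtrees."""
    if m == 0:
        return ((),)
    return tuple(
        (tree,) + rest
        for first in range(1, m + 1)
        for tree in forests(first - 1)
        for rest in forests(m - first)
    )

def is_path(tree):
    """True if every node has at most one child."""
    while tree:
        if len(tree) > 1:
            return False
        tree = tree[0]
    return True

def cut_leaves(tree):
    """Remove all leaves; None for the single-node tree."""
    if not tree:
        return None
    return tuple(cut_leaves(child) for child in tree if child)

def cut_paths(tree):
    """Remove all maximal paths ending in leaves; None if the tree itself is a path."""
    if is_path(tree):
        return None
    return tuple(cut_paths(child) for child in tree if not is_path(child))

def size(tree):
    return 1 + sum(size(child) for child in tree)

def size_after(tree, rounds, reduction):
    """Size after applying `reduction` `rounds` times; 0 if the tree does not survive."""
    for _ in range(rounds):
        tree = reduction(tree)
        if tree is None:
            return 0
    return size(tree)

def size_distribution(n, rounds, reduction):
    return Counter(size_after(t, rounds, reduction) for t in forests(n - 1))

for n in range(1, 10):
    for r in (1, 2):
        assert size_distribution(n, r, cut_paths) == \
            size_distribution(n, 2 ** (r + 1) - 2, cut_leaves)
```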

Theorem 3.
Let r ∈ 0 be fixed and consider n → ∞. Then expectation and variance of the random variable X n,r = X P n,r can be expressed as and The factorial moments are asymptotically given by Furthermore, X n,r = X P n,r is asymptotically normally distributed, i.e., for x ∈ we have X n,r − µn All O-constants in this theorem depend implicitly on r. 3.2. Total number of paths. In the context of this reduction it is interesting to investigate the total number of paths needed to construct a given tree. To determine this parameter we can reduce the tree repeatedly and count the number of leaves. The sum of the number of leaves over all reduction steps is equal to the number of paths, which follows from the observation that leaves mark the endpoints of all paths.
Formally, given the random variables $P_{n,r}$ counting the number of leaves in the $r$th reduction of a tree of size $n$, we want to analyze the random variable $P_n := \sum_{r \ge 0} P_{n,r}$.

Proposition 3.5. The expected number of paths needed to construct a uniformly random tree of size $n$ satisfies where $z = u/(1+u)^2$.

Proof. As a consequence of Proposition 3.3, the bivariate generating function enumerating plane trees where $z$ marks tree size and $v$ marks the number of leaves after $r$ path reductions can be written as By differentiating this generating function once with respect to $v$ and setting $v = 1$ afterwards, we obtain an expression where $C_{n-1}\,\mathbb{E}P_{n,r}$ can be extracted as the coefficient of $z^n$. By (15) with $d = 1$ and $r$ replaced by $2^{r+1} - 2$, we have .
Summation over r ≥ 0 and shifting the index of summation by one completes the proof.
Our strategy for determining an asymptotic expansion for $\mathbb{E}P_n$ as given in (26) is based on the Mellin transform.

Theorem 4. For n → ∞, the expected number of paths required to construct a uniformly random tree of size n is given by the asymptotic expansion
with $\chi_k = \frac{2k\pi i}{\log 2}$ is a fluctuation with mean 0 and $\alpha := \sum_{k \ge 1} \frac{1}{2^k - 1} \approx 1.606695$, $\gamma$ is the Euler-Mascheroni constant and $\zeta$ is the Riemann zeta function.
Remark. The constant α appears in the asymptotic analysis of digital search trees (see e.g. [20]).
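The numerical value of $\alpha$ quoted in Theorem 4 is easy to reproduce, because the series $\sum_{k \ge 1} 1/(2^k - 1)$ converges geometrically. The sketch below simply sums the first terms in floating point; the function name `alpha` and the cutoff of 200 terms are illustrative choices.

```python
import math


def alpha(terms=200):
    """Partial sum of sum_{k>=1} 1/(2^k - 1); the tail is below 2^(1-terms)."""
    return math.fsum(1.0 / (2.0 ** k - 1.0) for k in range(1, terms + 1))


print(round(alpha(), 6))
```

With 200 terms the truncation error is far below double precision, so the printed value is limited only by floating-point rounding.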
Proof. In order to obtain an asymptotic expansion from (26), we rewrite The main task to obtain an asymptotic expansion of $P(z)$ is to provide a precise analysis of this sum, which we carry out via the Mellin transform. We consider the function with fundamental strip $\langle 1, \infty \rangle$. In order for the inversion formula to be valid, we need to show that $f^*(s)$ decays sufficiently fast along vertical lines in the complex plane. While $\Gamma(s)$ and $\zeta(s)$ are well-known to decay exponentially and grow polynomially along vertical lines, respectively, the Dirichlet series $A(s)$ has to be investigated in more detail. We want to estimate the summands in To do so, we consider $g(x) = (1 - x)^{-s}$ as a function of a real variable. By means of the integral form of the Taylor approximation error we find where the last inequality is valid under the assumption that $\operatorname{Re} s > -2$. Using this estimate, we find where the sum converges for $\operatorname{Re} s > -2$. Therefore, $A(s)$ has polynomial growth in $\operatorname{Im} s$ for $\operatorname{Re} s > -2$ and $\operatorname{Im} s = \frac{2\pi}{\log 2}\bigl(k + \frac{1}{2}\bigr)$, where $k \in \mathbb{Z}$ and $|k| \to \infty$, as well as on vertical lines with $\operatorname{Re} s > -2$ and $\operatorname{Re} s \neq -1$. This implies that $f^*(s)$ decays sufficiently fast, and thus the inversion formula states which is valid for real, positive $t \to 0$ (and thus $u \to 1^-$ and $z \to (1/4)^-$, as we have $z = u/(1+u)^2$ and $u = e^{-t}$).

In order to extract the coefficient growth (in terms of $z$) with the help of singularity analysis, we require analyticity in a larger region (cf. [10]), e.g. in a complex punctured neighborhood of $1/4$ with $|\arg(z - 1/4)| > 2\pi/5$. (Footnote 1: Note that the bound $2\pi/5$ is somewhat arbitrary: the argument just needs to be less than $\pi/2$.) Substituting back $t$ for $z$, we find such that we have the bound $|\arg t| < 2\pi/5$ for $t \to 0$, given that the restriction on the argument in terms of $z$ is satisfied.
With the help of our estimates on $f^*(s)$ that we discussed above, we find that for $-3/2 \le \operatorname{Re} s \le 2$ and $\operatorname{Im} s = \frac{2\pi}{\log 2}\bigl(k + \frac{1}{2}\bigr)$, where $k \in \mathbb{Z}$ and $|k| \to \infty$. This is a consequence of combining the quantified growth of $\Gamma(s)$ (see [6, 5.11.3]) and the growth of $\zeta(s)$ (see [26, 13.51]) with the facts that $A(s)$ is of order $O(\operatorname{Im}(s)^2)$ and $\frac{s}{2^{s+1} - 1}$ is of order $O(\operatorname{Im}(s))$ for $s$ taking values in the specified region.
We can evaluate (29) by shifting the line of integration from $\operatorname{Re}(s) = 2$ to $\operatorname{Re}(s) = -3/2$ and collecting the residues of the poles we cross. This yields For the error term we use the estimate above and find $\frac{1}{2\pi i}$ Evaluating the residues yields Note that with $\alpha$ :

When substituting back in order to obtain an expansion in terms of $z \to 1/4$, we have to carefully check that the error terms within the sum of the residues at $\chi_k$ for $k \in \mathbb{Z} \setminus \{0\}$ can still be controlled. Considering that for some exponent $\kappa$, we have the expansion and thus Setting $\kappa = -1 + \chi_k$ shows that the errors that we sum are of order $O\bigl(|k|(1 - 4z)\exp(|k|\,O(1 - 4z))\bigr)$. Choosing $z$ sufficiently close to $1/4$ ensures that the exponential growth is negligible compared to the exponential decay proved in (30).

Finally, it is easy to see that the factor $\frac{u}{1+u}$ can be rewritten as $\frac{1 - \sqrt{1 - 4z}}{2}$. Multiplying our expansion of $f(t)$ with this factor and substituting back yields the expansion Applying singularity analysis, normalizing the result by $C_{n-1}$, and rewriting the coefficients of the contributions from the poles at $-1 + \chi_k$ via the duplication formula for the Gamma function (cf. [6, 5.5.5]) then proves the asymptotic expansion for $\mathbb{E}P_n$.
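The change of variables used when substituting back can be checked numerically. Under $z = u/(1+u)^2$ and $u = e^{-t}$ one has $1 - 4z = \bigl(\frac{1-u}{1+u}\bigr)^2$, so $\frac{u}{1+u} = \frac{1 - \sqrt{1-4z}}{2}$ and $\sqrt{1 - 4z} = \tanh(t/2)$, which behaves like $t/2$ as $t \to 0$. The sketch below evaluates both sides in floating point; the helper name `substitution_check` is illustrative.

```python
import math


def substitution_check(t):
    """For real t > 0 return z, u/(1+u), (1 - sqrt(1-4z))/2 and t / (2 sqrt(1-4z))."""
    u = math.exp(-t)
    z = u / (1 + u) ** 2
    root = math.sqrt(1 - 4 * z)      # equals tanh(t/2) for t > 0
    lhs = u / (1 + u)
    rhs = (1 - root) / 2
    return z, lhs, rhs, t / (2 * root)


z, lhs, rhs, ratio = substitution_check(0.05)
print(z, lhs, rhs, ratio)
```

The last returned value tends to 1 as $t \to 0$, reflecting that $t$ and $2\sqrt{1-4z}$ agree to first order near the singularity $z = 1/4$.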

CUTTING OLD LEAVES
4.1. Preliminaries. In this section we consider a slightly more complex reduction: instead of removing all leaves, we just remove all leftmost leaves. Following [2], we call a leaf that is a leftmost child an old leaf.
In order to describe the corresponding expansion in the language of generating functions, we need to change our underlying combinatorial model of trees in a way that specifically marks old leaves.
Let be the combinatorial class of plane trees where marks old leaves and marks all nodes that are neither old leaves nor parents thereof. Now, as a first step we determine the bivariate generating function L(z, w) of .

Proposition 4.1. The generating function L(z, w) enumerating plane trees with respect to old leaves (marked by the variable w) and all nodes that are neither old leaves nor parents thereof (marked by z) is given by
For $n \ge 2$ there are $C_{k-1}\binom{n-2}{n-2k}2^{n-2k}$ plane trees of size $n$ (meaning $n$ nodes overall) with $k$ old leaves.
For example in Figure 5, the original tree corresponds to $z^3 w^3$ because it has three old leaves (dashed nodes) and three nodes which are neither old leaves nor parents of old leaves.
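The counting formula of Proposition 4.1 can be compared against a brute-force enumeration for small sizes. The following sketch represents a plane tree as the tuple of its child subtrees (so the single node is the empty tuple) and counts leaves that are leftmost children; the representation and all function names are illustrative choices, not taken from the paper.

```python
from collections import Counter
from functools import lru_cache
from math import comb


@lru_cache(maxsize=None)
def forests(n):
    """All ordered forests with n nodes in total, as tuples of trees."""
    if n == 0:
        return ((),)
    result = []
    for first in range(1, n + 1):            # size of the first tree
        for t in trees(first):
            for rest in forests(n - first):
                result.append((t,) + rest)
    return tuple(result)


def trees(n):
    """All plane trees with n >= 1 nodes; a tree is the tuple of its subtrees."""
    return forests(n - 1)


def old_leaves(t):
    """Number of leaves that are the leftmost child of their parent."""
    here = 1 if t and t[0] == () else 0
    return here + sum(old_leaves(c) for c in t)


def catalan(k):
    return comb(2 * k, k) // (k + 1)


def formula(n, k):
    """C_{k-1} * binom(n-2, n-2k) * 2^(n-2k) for n >= 2 and k >= 1."""
    if k < 1 or n - 2 * k < 0:
        return 0
    return catalan(k - 1) * comb(n - 2, n - 2 * k) * 2 ** (n - 2 * k)


def brute_force_counts(n):
    """Map k -> number of plane trees of size n with exactly k old leaves."""
    return Counter(old_leaves(t) for t in trees(n))


all_match = all(
    brute_force_counts(n)[k] == formula(n, k)
    for n in range(2, 9)
    for k in range(0, n + 1)
)
print(all_match)
```

Summing `formula(n, k)` over $k$ recovers the Catalan number $C_{n-1}$, the total number of plane trees with $n$ nodes, which gives a second consistency check.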
FIGURE 5. Illustration of the "cutting old leaves"-operator ρ

FIGURE 6. Symbolic equation for plane trees w.r.t. old leaves

Proof. We consider the symbolic equation describing the combinatorial class of plane trees with respect to old leaves, which is illustrated in Figure 6. The functional equation that can be derived from the symbolic equation by marking old leaves with $w$ and all nodes that are neither old leaves nor parents thereof with $z$ is Solving this equation and choosing the correct branch of the root yields (31).
To extract coefficients of $L(z, w)$, we rewrite it as

As we will see in the next section, the polynomials defined below will play a similar role for the "old leaves"-reduction as the Fibonacci polynomials played for the "leaves"- and "paths"-reduction.

Definition. The polynomials $B_r(z)$ are the generating functions of binary trees w.r.t. the number of internal nodes of height $\le r$ satisfying $B_r(z) = 1 + zB_{r-1}(z)^2$ (35) for $r \ge 1$ and $B_0(z) = 1$.

The Expansion Operator and Asymptotic Results.
As described in the previous section, we now concentrate on the reduction ρ, which removes all old leaves from a tree. Note that ρ maps the tree consisting of a single node to itself, as the root itself is not an old leaf. We begin our analysis of this reduction by determining the expansion operator Φ.

Proposition 4.2. Consider a family of plane trees with bivariate generating function $f(z, w)$, where $z$ marks nodes that are neither old leaves nor parents thereof and $w$ marks old leaves. Then the generating function for the family of all trees whose reduction under ρ lies in the given family is $\Phi(f(z, w)) = f(z + w,\, 2zw + w^2)$.

Proof. Linearity of Φ is obvious from the combinatorial interpretation, meaning that we can focus on the expansion of any tree represented by $z^n w^k$, i.e. a tree with $n$ nodes that are neither old leaves nor parents thereof and $k$ old leaves. Figure 7 illustrates all three possibilities to expand an old leaf:
- appending an old leaf to the parent of the old leaf, which turns the original old leaf into a node marked by $z$,
- appending an old leaf to the old leaf itself, which turns the parent into a node marked by $z$,
- appending old leaves both to the old leaf and its parent.
In terms of generating functions, this means that $w$ is substituted by $2zw + w^2$. Furthermore, the nodes marked by $z$ can optionally be expanded by attaching an old leaf to them, otherwise they stay as they are. This option corresponds to the substitution $z \to z + w$.
There are no more operations to expand the tree, so putting everything together yields $\Phi(z^n w^k) = (z + w)^n (2zw + w^2)^k$, which proves the statement.
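Setting $z = w = 1$ in $\Phi(z^n w^k) = (z + w)^n (2zw + w^2)^k$ suggests that a tree with $n$ unmarked nodes and $k$ old leaves has exactly $2^n 3^k$ preimages under ρ. The sketch below checks this by brute force. It assumes that ρ removes all old leaves simultaneously, represents a tree as the tuple of its child subtrees, and uses the observation that preimages of a tree of size $s$ have size at most $2s$ (the degree of $\Phi(z^n w^k)$ after $w = z^2$); all names are illustrative.

```python
from collections import Counter
from functools import lru_cache


@lru_cache(maxsize=None)
def forests(n):
    """All ordered forests with n nodes in total, as tuples of trees."""
    if n == 0:
        return ((),)
    return tuple(
        (t,) + rest
        for first in range(1, n + 1)
        for t in forests(first - 1)          # a tree is the forest of its children
        for rest in forests(n - first)
    )


def trees(n):
    """All plane trees with n >= 1 nodes."""
    return forests(n - 1)


def size(t):
    return 1 + sum(size(c) for c in t)


def old_leaves(t):
    """Number of leaves that are the leftmost child of their parent."""
    here = 1 if t and t[0] == () else 0
    return here + sum(old_leaves(c) for c in t)


def rho(t):
    """Remove all old leaves of t at once (decided on the original tree)."""
    kept = t[1:] if t and t[0] == () else t
    return tuple(rho(c) for c in kept)


def preimage_counts(max_size):
    """Count, for every image tree, its preimages among trees up to max_size."""
    counts = Counter()
    for n in range(1, max_size + 1):
        for t in trees(n):
            counts[rho(t)] += 1
    return counts


def verify(max_image_size):
    """Check the 2^n * 3^k preimage count for all images up to the given size."""
    counts = preimage_counts(2 * max_image_size)
    for s in range(1, max_image_size + 1):
        for t in trees(s):
            k = old_leaves(t)
            n = size(t) - 2 * k              # neither old leaves nor their parents
            if counts[t] != 2 ** n * 3 ** k:
                return False
    return True


print(verify(4))
```

For instance, the single-node tree has the two preimages "single node" and "root with one child", matching $2^1 3^0 = 2$.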
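An immediate consequence of the fact that the reduction ρ is surjective is the following corollary.

Corollary 4.3. The generating function $L(z, w)$ satisfies the functional equation $\Phi(L(z, w)) = L(z, w)$.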
We now focus on determining the generating function measuring the change in the tree size after repeatedly applying the reduction ρ.

Proposition 4.4. Let $r \in \mathbb{N}_0$. The bivariate generating function $G_r(z, v) = G^{OL}_r(z, v)$ enumerating plane trees, where $z$ marks the tree size and $v$ marks the size of the $r$-fold cut tree, is given by $G_r(z, v) = L\bigl(zvB_r(z),\, zv^2(B_{r+1}(z) - B_r(z))\bigr)$, where the $B_r(z)$ are the polynomials enumerating binary trees of height $\le r$ w.r.t. the number of internal nodes.
Proof. First, note that the size of a tree with $k$ old leaves and $n$ nodes that are neither old leaves nor parents thereof is actually $n + 2k$, as parents of old leaves are not explicitly marked. This explains why we have to substitute $w = z^2$ in order to arrive at the tree size.
In contrast to the previous sections, the operator Φ is already linear and multiplicative, meaning that we have $\Phi^r(z^n w^k) = \Phi^r(z)^n\, \Phi^r(w)^k$.
Investigating the repeated application of Φ to $z$ and $w$ leads to the recurrences for $r \ge 2$ and $r \ge 0$, respectively. With the recurrence for the polynomials $B_r$ from (35) it is easy to prove by induction that $\Phi^r(z)|_{w=z^2} = zB_r(z)$ for $r \ge 0$. Thus, we also find $\Phi^r(w)|_{w=z^2} = z(B_{r+1}(z) - B_r(z))$. Overall, we obtain which, by linearity of Φ, proves the proposition.
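The two closed forms for $\Phi^r(z)$ and $\Phi^r(w)$ at $w = z^2$ can be checked mechanically for small $r$. Since Φ is linear and multiplicative with $\Phi(z) = z + w$ and $\Phi(w) = 2zw + w^2$, the sketch assumes the iteration $a_r = a_{r-1} + b_{r-1}$, $b_r = 2a_{r-1}b_{r-1} + b_{r-1}^2$ for $a_r = \Phi^r(z)$ and $b_r = \Phi^r(w)$; this is one way to write the recurrences and not necessarily the form used in the paper. The polynomials $B_r$ are built from $B_0 = 1$ and $B_r(z) = 1 + zB_{r-1}(z)^2$, the recursion quoted in the proof of Theorem 5. Polynomials are plain coefficient lists, and all helper names are illustrative.

```python
def add(p, q):
    """Sum of two polynomials given as coefficient lists (lowest degree first)."""
    n = max(len(p), len(q))
    return [(p[i] if i < len(p) else 0) + (q[i] if i < len(q) else 0) for i in range(n)]


def mul(p, q):
    """Product of two polynomials given as coefficient lists."""
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out


def trim(p):
    """Drop trailing zero coefficients so that equal polynomials compare equal."""
    p = list(p)
    while len(p) > 1 and p[-1] == 0:
        p.pop()
    return p


def B(r):
    """Coefficients of B_r(z), with B_0 = 1 and B_r = 1 + z * B_{r-1}^2."""
    b = [1]
    for _ in range(r):
        b = add([1], [0] + mul(b, b))
    return b


def iterate_phi(r):
    """Return (Phi^r(z), Phi^r(w)) after the substitution w = z^2."""
    a, b = [0, 1], [0, 0, 1]                 # z and w = z^2
    for _ in range(r):
        a, b = add(a, b), add(mul([2], mul(a, b)), mul(b, b))
    return trim(a), trim(b)


def closed_forms(r):
    """Return (z * B_r(z), z * (B_{r+1}(z) - B_r(z))) as coefficient lists."""
    diff = add(B(r + 1), [-c for c in B(r)])
    return trim([0] + B(r)), trim([0] + diff)


agree = all(iterate_phi(r) == closed_forms(r) for r in range(6))
print(agree)
```

For $r = 1$ both sides give $z + z^2$ and $2z^3 + z^4$, which matches $\Phi(z)$ and $\Phi(w)$ evaluated at $w = z^2$.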
For the next step in our analysis, we turn to the random variable $X_{n,r} = X^{OL}_{n,r}$ which models the size of the tree that results from reducing a random tree $\tau$ with $n$ nodes $r$ times.
As ρ maps the tree consisting of a single node to itself (and thus no trees vanish completely), the probability generating function for this random variable is simply While the height polynomials $B_r(z)$ make it very difficult to obtain general results for the factorial moments of $X_{n,r}$, special moments like expectation and variance are no problem, and even a central limit theorem is possible.

Theorem 5. Let $r \in \mathbb{N}_0$ be fixed and consider $n \to \infty$. Then the expected tree size after deleting old leaves of a tree with $n$ nodes $r$ times and the corresponding variance are given by and All $O$-constants in this theorem depend implicitly on $r$. Additionally, the random variable $X_{n,r}$ is asymptotically normally distributed for fixed $r \ge 1$, i.e. $X_{n,r} - \mu n$

Proof. First of all, we observe that Proposition 4.1 and Proposition 4.4 combined with the recursion $B_r(z) = 1 + zB_{r-1}(z)^2$ allow us to write the bivariate generating function as The asymptotic expansion for the expected value $\mathbb{E}X_{n,r}$ can now be obtained by determining By means of singularity analysis we find which proves (37). For the second factorial moment we obtain which yields The variance can now be obtained via $\mathbb{V}X_{n,r} = \mathbb{E}X_{n,r}^{\underline{2}} + \mathbb{E}X_{n,r} - (\mathbb{E}X_{n,r})^2$, which proves (38).
In order to show asymptotic normality of $X_{n,r}$ we investigate the random variable $n - X_{n,r}$, which counts the number of nodes that are deleted after reducing some tree $r$ times. Observe that this quantity can be seen as an additive tree parameter $F_r$ defined recursively by where $\tau_n$ is some tree of size $n$, $\tau_{i_1}$ up to $\tau_{i_\ell}$ are the subtrees rooted at the children of the root of $\tau_n$, and $f_r$ is a toll function with values in $\{0, 1, \ldots, r - 1\}$ defined by 1 if $\rho^j(\tau_n)$ has an old leaf attached to its root, 0 otherwise, for $r \ge 1$. Now, as $f_r(\tau_n)$ enumerates the number of old leaves deleted from the root of $\tau_n$ after $r$ reductions, $F_r(\tau_n)$ equals the total number of deleted nodes after $r$ reductions. The fact that $r$ is fixed implies that $f_r$ is not only bounded, but also a so-called local functional, meaning that the value of $f_r(\tau_n)$ can already be determined from the first $r$ levels of $\tau_n$. This is because one application of ρ can reduce the distance between the root of the tree and the closest old leaf by at most one. Thus all old leaves that are deleted from the root during $r$ reductions have to be found within the first $r$ levels of $\tau_n$.
As we have now established that f r is both bounded and a local functional, we are able to apply [18,Theorem 1.13], which proves that n − X n,r is asymptotically normally distributed. Thus X n,r is asymptotically normally distributed as well, which proves the statement.
Remark. In [9], the asymptotic behavior of a sequence strongly related to $B_r(1/4)$ was studied: in Section 4, the authors define a sequence $f_n$ such that $f_{r+1} = \frac{1}{2} - \frac{B_r(1/4)}{4}$, in our notation. They prove the asymptotic expansion $f_n = \frac{1}{n + \log n + O(1)}$. This allows us to conclude that the asymptotic behavior of $B_r(1/4)$ can be described as $B_r(1/4) = 2 - \frac{4}{r + \log r + O(1)}$.
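The relation between $f_n$ and $B_r(1/4)$ in this remark can be explored numerically. Inserting $B_r(1/4) = 2 - 4f_{r+1}$ into $B_{r+1}(z) = 1 + zB_r(z)^2$ at $z = 1/4$ gives the recursion $f_{n+1} = f_n - f_n^2$ with $f_1 = 1/4$; this reformulation is a short calculation added here for illustration and is not quoted from [9]. The sketch checks it exactly for small indices and then looks at $1/f_n - n - \log n$, which should stay bounded if $f_n = 1/(n + \log n + O(1))$. All function names are illustrative.

```python
import math
from fractions import Fraction


def B_quarter(r):
    """Exact value of B_r(1/4) from B_0 = 1 and B_r = 1 + B_{r-1}^2 / 4."""
    b = Fraction(1)
    for _ in range(r):
        b = 1 + b * b / 4
    return b


def f_exact(n):
    """f_n = 1/2 - B_{n-1}(1/4) / 4 for n >= 1, as an exact fraction."""
    return Fraction(1, 2) - B_quarter(n - 1) / 4


def f_float(n):
    """Iterate f_{j+1} = f_j - f_j^2 from f_1 = 1/4 in floating point."""
    f = 0.25
    for _ in range(n - 1):
        f -= f * f
    return f


def bounded_term(n):
    """The quantity 1/f_n - n - log n, expected to remain bounded."""
    return 1.0 / f_float(n) - n - math.log(n)


recursion_holds = all(
    f_exact(n + 1) == f_exact(n) - f_exact(n) ** 2 for n in range(1, 9)
)
print(recursion_holds, bounded_term(10 ** 4), bounded_term(10 ** 5))
```

The values of $B_r(1/4)$ increase towards 2, the fixed point of $B = 1 + B^2/4$, which is consistent with $f_n \to 0$.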

CUTTING OLD PATHS
5.1. The Expansion Operator. As in previous sections, we adapt the "old leaves" reduction to remove all "old paths". That is, the tree reduction ρ in this section reduces a tree by removing all paths that end in an old leaf. This operation is illustrated in Figure 8, which distinguishes old leaves from nodes that are neither old leaves nor parents thereof. Obviously, we also need the combinatorial class of paths for our analysis. Its bivariate generating function is given by $P = P(z, w) = \frac{w}{1 - z}$, where $w$ marks old leaves and $z$ marks nodes that are neither old leaves nor parents thereof. Also, we omit the arguments of $P$ for the sake of readability. Now, we determine the shape of the expansion operator Φ.

Proposition 5.1. Consider a family of plane trees with bivariate generating function $f(z, w)$, where $z$ marks nodes that are neither old leaves nor parents thereof and $w$ marks old leaves. Then the generating function for the family of all trees whose reduction under ρ lies in the given family is $\Phi(f(z, w)) = f(z + P,\, zP + P^2)$.

Proof. With linearity of the operator Φ being obvious from a combinatorial point of view, we only have to investigate the expansion of any tree represented by $z^n w^k$, i.e. a tree with $n$ nodes that are neither old leaves nor parents thereof and $k$ old leaves. There are two options to expand an old leaf:
- either an old path is appended to the parent of the old leaf, which turns the old leaf into a node marked by $z$,
- or an old path is appended to both the parent of the old leaf and to the old leaf itself.
Note that just appending an old path to the old leaf is not a valid expansion as this introduces ambiguity. This is the same argument that we also used in the proof of Proposition 3.1. Overall, this means that Φ has to map $w$ to $zP + P^2$.
On the other hand, the nodes marked by $z$ can optionally be expanded by attaching an old path. Otherwise they stay as they are. Overall, this implies $\Phi(z) = z + P$.
Putting everything together, we immediately arrive at the statement of the Proposition.
Analogously to the previous reductions, surjectivity of ρ : → implies the following corollary.

Corollary 5.2. The generating function for plane trees L(z, w) satisfies the functional equation
In order to carry out a detailed analysis of this reduction, we need information about the iterated application of Φ to $L(zv_I, wv_L^2)$, which leads to the generating function $G_r(z, v_I, v_L^2)$ measuring the change in the tree size after $r$ applications of the reduction. The following proposition deals with determining this generating function.

Proposition 5.3. Let $r \in \mathbb{N}_0$. The trivariate generating function $G_r(z, v_I, v_L^2) = G^{OP}_r(z, v_I, v_L^2)$ enumerating plane trees, where $z$ marks the tree size, $v_L$ marks all old leaves, and $v_I$ marks all nodes that are neither old leaves nor parents thereof, is given by

Proof. Observe that the operator Φ is already linear and multiplicative, which is why we can concentrate on finding suitable expressions for $\Phi^r(z)$ and $\Phi^r(w)$. First of all, for $r \ge 1$ the recurrences follow immediately from (39). Furthermore, the relation can easily be proved by induction. Then, by setting $f_r := \Phi^r(z)|_{w=z^2}$ the recurrences above translate to As a next step, we show by induction that $f_r$ can be expressed in terms of Fibonacci polynomials as where in particular (7) was used. As a consequence, we find .
This allows us to express $g_r := \Phi^r(w)|_{w=z^2}$ as Finally, as we have $\Phi^r(z^n w^k)|_{w=z^2} = f_r^n g_r^k$, substituting $z = u/(1+u)^2$ and using (8) completes the proof.

Analysis of Tree Size and Related Parameters.
We investigate the behavior of the random variable $X_{n,r} = X^{OP}_{n,r}$ which models the number of nodes remaining after reducing a random tree $\tau$ with $n$ nodes $r$ times. The tree $\tau$ is chosen uniformly among all trees of size $n$. Analogously to the "old leaf"-reduction from the previous section, the "old path"-reduction also maps the tree consisting of a single node to itself, meaning that no trees vanish completely. For the sake of convenience we set $G_r(z, v) := G_r(z, v, v^2)$, allowing us to write the probability generating function of $X_{n,r}$ as $\mathbb{E}v^{X_{n,r}} = \frac{1}{C_{n-1}}[z^n]G_r(z, v)$. With the help of Proposition 5.3, it is easy to obtain expressions for the factorial moments $\mathbb{E}X_{n,r}^{\underline{d}}$ for fixed $d$ by differentiating $G_r(z, v)$ $d$ times with respect to $v$ and setting $v = 1$ afterwards. General expressions for $d \ge 2$ (coinciding with the value given for $d = 2$) are available but less pleasant.
Lemma 5.4. The factorial moments of X n,r are Proof. The expressions for d ∈ {1, 2} can be obtained by differentiation. We consider the general case here. We use the abbreviations By the same argument as in the proof of Proposition 2.9, we have By using (31), we rewrite L(a(1 + q), b(1 + q) 2 ) as We have . We choose α and β such that This results in Using (3) to extract the coefficient of q d for d ≥ 1 yields Inserting everything concludes the proof of the proposition. (1 + u r+1 )u (1 + u)(1 − u r+2 ) and proceeding as in Corollary 2.10 we obtain the given result.
By expanding the expressions in Lemma 5.4 and using singularity analysis, we obtain the asymptotic growth of the expected value and the variance.

All O-constants in this theorem depend implicitly on r.
Besides the analysis of the tree size, we are also interested in how the numbers of nodes represented by and by develop when the tree is reduced repeatedly. Formally, this means that we consider the random variables X n,r and X n,r counting the number of old leaves and the number of all nodes that are neither old leaves nor parents thereof, respectively. By construction, the relation X n,r = 2 · X n,r + X n,r holds.
The bivariate generating functions corresponding to these random variables can be obtained directly from Proposition 5.3. We have G r (z, v) = G r (z, 1, v), G r (z, v) = G r (z, v, 1).
In contrast to X n,r , the dth factorial moments for X n,r and X n,r have simpler expressions.
Proposition 5.6. Let d ∈ . Then the dth factorial moments of X n,r and X n,r are given by and X n,r = 1 as well as for d > 1.
Proof of Proposition 5.6. As in the proof of Lemma 5.4, we use the abbreviations Then, using (33), we get Setting v = 1 and using the fact that proves (41). For deriving ∂ d /(∂ v) d G r (z, v), we proceed as in the proof of Proposition 2.9. The crucial identity is This implies (42) and (43).
As in Section 2.3, the above proof exhibits some identities: Remark. For d ∈ ≥1 , the power series identities n≥0 k≥1 and n≥0 k≥1 hold.
Proof. We replace $u^{r+1}$ by $x$ in the proof of Proposition 5.6 and expand $L$ by (34).
Additionally, for fixed d ≥ 2 the behavior of the factorial moments of X n,r and X n,r is given by and X n,r d = 2 d (r + 1) d (r + 2) 2d n d + O(n d−1 ), respectively. All O-constants in this theorem depend implicitly on r.

Total number of old paths.
Similarly to our approach for counting the total number of paths required to construct a given tree from Section 3.2, we can also analyze the number of "old path"-segments within a random tree of size n. Formally, this corresponds to an analysis of the random variable S n := r≥0 X n,r .

FUTURE WORK
It seems likely that similar results also hold for reductions where one can cut a different structure as long as it is allowed to cut a single leaf. An example is cutting either single leaves or cherries (a root with two children). At least a formulation as an operator as in (9) seems possible in general. How much information about the moments and the central limit theorem can be extracted from that may vary (as it varies in this article already). Also the case of cutting old structures might be more difficult to handle in general.