Polynomial-time inverse computation for accumulative functions with multiple data traversals
Abstract
The problem of inverse computation has many potential applications such as serialization/deserialization, providing support for undo, and test-case generation for software testing. In this paper, we propose an inverse computation method that always terminates for a class of functions known as parameter-linear macro tree transducers, which involve multiple data traversals and the use of accumulations. The key to our method is the observation that a function in the class can be regarded as a non-accumulative context-generating transformation without multiple data traversals. Accordingly, we demonstrate that it is easy to achieve terminating inverse computation for the class by context-wise memoization of the inverse computation results. We also show that when we use a tree automaton to express the inverse computation results, the inverse computation runs in time polynomial in the size of the original output and the textual program size.
Keywords
Program inversion, Inverse computation, Program transformation, Functional programming, Tree automata, Tree transducers
1 Introduction
The problem of inverse computation [1, 2, 20, 21, 23, 30, 37, 40, 49], i.e., finding an input s for a program f and a given output t such that f(s)=t, has many potential applications, including test-case generation in software testing, supporting undo/redo, and obtaining a deserialization program from a serialization program.
The inverse computation of eval, which enumerates the inputs \(\{ e \mid \mathit {eval} (e) = t \}\) for a given t, is sometimes useful for testing computations on E. For example, suppose that we write an optimizer f that converts all the expressions e satisfying \(\mathit {eval} (e) = \mathsf {S} ^{2^{n}}( \mathsf {Z} )\) into Dbl ^{ n }(One), and we want to test whether the optimizer works correctly, i.e., whether \(\mathit {eval} (e) = \mathsf {S} ^{2^{n}}( \mathsf {Z} )\) implies f(e)=Dbl ^{ n }(One).^{1} A naive solution would be to randomly generate or enumerate expressions e, filter out those that do not satisfy \(\mathit {eval} (e) = \mathsf {S} ^{2^{n}}( \mathsf {Z} )\), and check f(e)=Dbl ^{ n }(One). However, this is unsatisfactory because it is inefficient; the majority of the expressions do not evaluate to \(\mathsf {S} ^{2^{n}}( \mathsf {Z} )\). Inverse computation enables us to generate only the test cases that are relevant to the test. A test with inverse computation can be performed efficiently by (1) picking a number m of the form \(\mathsf {S} ^{2^{n}}( \mathsf {Z} )\), (2) picking an expression e from the set obtained from the inverse computation for m, and (3) checking whether the optimizer f converts e into Dbl ^{ n }(One). Here, all the picked (randomly generated or enumerated) data are relevant to the final check in Step (3). Lazy SmallCheck [43] and EasyCheck [9] use inverse computation for efficient test-case generation, which of course has to be supported by efficient inverse computation.
However, there are as yet no systematic efficient inverse computation methods that can handle eval. One reason is that evalAcc contains accumulations and multiple data traversals. It is so far unclear how to perform tractable terminating inverse computation for functions with accumulations and multiple data traversals (Sect. 2). Some of the existing methods [1, 2, 21, 37] do not terminate for functions with accumulations. Some approaches [20, 39, 40] can handle certain accumulative computations efficiently, but they do not work for non-injective functions such as eval. Although some inverse computation methods terminate for accumulative functions [17, 32], the complexity upper bound is unclear when there are also multiple data traversals.
In this paper, we propose an inverse computation method that can handle a class of accumulative functions like eval that have multiple data traversals, namely deterministic macro tree transducers [15] with the restriction of parameter-linearity (Sect. 3). In this class of functions, variables for accumulation (such as y in evalAcc) cannot be copied, but inputs (such as x, x _{1} and x _{2} in evalAcc) can be traversed many times (as x). Our method computes the set \(\{ s \mid f(s) = y \}\) as a tree automaton [10] for a given function f and an output y in time polynomial in the size of y (Sect. 4). The key to our inverse computation is the observation that a program in the parameter-linear macro tree transducers is in fact a non-accumulative transformation that generates contexts (i.e., trees with holes) without multiple data traversals. From this viewpoint, we can do the inverse computation through a variant of the existing inverse computation methods [1, 2, 4]. Note that viewing a program as a context-generating transformation is not new. What is new in our paper is the use of this view to achieve polynomial-time inverse computation for the class of accumulative functions with multiple data traversals.

We demonstrate that simply viewing a function as a context-generating transformation helps us to achieve a systematic inverse computation method for accumulative functions. After converting a program into a context-generating one, it is easy to perform inverse computation for the program.

We show that, for parameter-linear macro tree transducers, our inverse computation method runs in time polynomial in the size of the output and the textual program size, and in time exponential in the number of functions in the program.
The rest of the paper is organized as follows. Section 2 shows an overview of our proposal. Section 3 defines the target language, parameter-linear macro tree transducers. Section 4 formally presents our inverse computation method. Section 5 reports and discusses the experimental results with our prototype implementation. Section 6 shows four extensions of our proposal, and Sect. 7 shows the relationship between ours and the other research. Section 8 concludes the paper and outlines future work.
A preliminary version of this article appeared in [36]. The main difference from that version is that we have implemented the proposed algorithm and performed some experiments (Sect. 5). We have also added discussions of two further extensions of our proposed method (Sects. 6.3 and 6.4) and some explanations in several places.
2 Overview
In this section, we give a brief overview of our proposal.
2.1 Review: when inverse computation terminates
Actually, with memoization, the simple inverse computation for parity always terminates. For the above narrowing sequence, by memoizing all the checks in the sequence, we can tell that the same check \(( \mathit {parity} (x) \stackrel {?}{=} \mathsf {S} ^{2}( \mathsf {Z} ))\) occurs twice, and hence the narrowing sequence cannot produce any result. In general, the number of equality checks occurring in the inverse computation is finite because each check always has the form \(f(x) \stackrel {?}{=}t\) (up to α-renaming), where t is a subterm of the original output given to the inverse computation. Thus, the simple inverse computation always terminates with memoization for parity.
This observation also gives an upper bound on the worst-case complexity of the inverse computation of parity; it runs in constant time regardless of the size of the original output because the checks in the narrowing have the form of either \(\mathit {parity} (x) \stackrel {?}{=}t\) or \(\mathit {aux} (x) \stackrel {?}{=}t\), where t can only be the original output.
2.2 Problem: non-termination due to accumulations and multiple data traversals
 Accumulations, a sort of call-time computation commonly used in tail recursion, increase the size of the terms in the narrowing process. For example, evA contains the accumulation$$\mathit {evA} ( \mathsf {Dbl} (x),y) = \mathit {evA} (x,\underline{ \mathit {evA} (x,y)}) $$which increases the term size in the following narrowing step.$$( \mathit {evA} (x,\underline{ \mathsf {Z} }) \stackrel {?}{=} \mathsf {Z} ) \leadsto _{x \mapsto \mathsf {Dbl} (x)} ( \mathit {evA} (x,\underline{ \mathit {evA} (x, \mathsf {Z} )}) \stackrel {?}{=} \mathsf {Z} ) $$We can see that the second argument of evA (underlined above) gets bigger in narrowing.
 Multiple data traversals make things much worse: they prevent us from considering function calls separately. For example, we have to track the two calls \(\underline{ \mathit {evA} (x,\underline{ \mathit {evA} (x,y)})}\) simultaneously. We can see that the number of function calls we have to track simultaneously increases in narrowing. To clarify the problem caused by multiple data traversals, we will look at the issue of accumulations in the absence of multiple data traversals. Suppose that ev does not have the case for Dbl and thus does not contain multiple data traversals. Although there is still an infinite narrowing sequence$$\begin{aligned} ( \mathit {ev} (x) \stackrel {?}{=} \mathsf {Z} )& \leadsto ( \mathit {evA} (x, \mathsf {Z} ) \stackrel {?}{=} \mathsf {Z} ) \\ &\leadsto _{x \mapsto \mathsf {Add} (x_1,x_2)} ( \mathit {evA} (x_1, \mathit {evA} (x_2, \mathsf {Z} )) \stackrel {?}{=} \mathsf {Z} ) \leadsto \dots\mbox{,} \end{aligned}$$one can make the simple inverse computation terminate by decomposing the check \(( \mathit {evA} (x_{1}, \mathit {evA} (x_{2}, \mathsf {Z} )) \stackrel {?}{=} \mathsf {Z} )\) into \(\mathit {evA} (x_{1},z) \stackrel {?}{=} \mathsf {Z} \wedge \mathit {evA} (x_{2}, \mathsf {Z} ) \stackrel {?}{=}z\) and by observing that, for \(\mathit {evA} (x_{1},z) \stackrel {?}{=}z'\), we only need to consider the substitutions that map z and z′ to subterms of the output fed to the inverse computation, i.e., Z. Thus, we can substitute a concrete subterm for z and check \(\mathit {evA} (x_{1}, \mathsf {Z} ) \stackrel {?}{=}t\) and \(\mathit {evA} (x_{2},t) \stackrel {?}{=} \mathsf {Z} \) separately for a concrete t (a more refined idea can be found in [17, 32]), and we can bound the complexity of inverse computation in a similar way as we did for parity. However, this idea does not scale to functions with multiple data traversals, in which many function calls are tracked simultaneously in narrowing.
Although the existing approaches [17, 32] achieve terminating inverse computation of certain accumulative functions with multiple data traversals, it is unclear whether there are polynomial-time inverse computations for functions with multiple data traversals.
2.3 Our idea
All of the above results are obtained from a simple observation: a program like ev is a non-accumulative context-generating transformation without multiple data traversals.
3 Target language
In this section, we formally describe the programs we target, which are written in an (untyped) first-order functional programming language with certain restrictions.
3.1 Values: trees
The values of the language are trees consisting of constructors (i.e., a ranked alphabet).
Definition 1
(Trees)
A set of trees \(\mathcal {T}_{\varSigma }\) over constructors Σ is defined inductively as follows: for every σ∈Σ ^{(0)}, \(\sigma \in \mathcal {T}_{\varSigma }\), and for every σ∈Σ ^{(n)} and \(t_{1}, \ldots, t_{n} \in \mathcal {T}_{\varSigma }\) (n>0), \(\sigma(t_{1},\dots,t_{n}) \in \mathcal {T}_{\varSigma }\), where Σ ^{(n)} is the set of the constructors with arity n.
For constructors Z,Zero,One,Nil∈Σ ^{(0)}, S∈Σ ^{(1)} and Cons,Add∈Σ ^{(2)}, examples of trees are S(Z), Cons(Z,Nil), and Add(Add(Zero,One),Zero). We shall fix the set Σ of the constructors throughout the paper for simplicity of presentation. The size of a tree t is the number of the constructor occurrences in t. For example, the size of S(Z) is 2.
In what follows, we shall use vector notation: \(\overline {t}\) represents a sequence t _{1},…,t _{ n } and \(|\overline {t}|\) denotes its length n.
3.2 Programs: macro tree transducers
Example 1
(reverse)
Example 2
(eval)
The eval program in Sect. 1 is an example of an MTT program. So is its simplified version ev.
Example 3
(mirror)
The size of a program is defined by the total number of function, constructor, and variable occurrences in the program. The intuition behind this definition is to approximate the size of program code in text. Note that the number of function or constructor occurrences is different from the number of functions or constructors. For example, the number of functions in reverse is 3, whereas the number of function occurrences is 9.
In addition, we assume that programs are non-deleting, i.e., every input variable must occur in the corresponding right-hand-side expression. This restriction does not change the expressiveness; we can convert any program to one satisfying this restriction by introducing the function ignore satisfying [[ignore]](s,t)=t for any s and t and defined by ignore(σ(x _{1},…,x _{ n }),y)=ignore(x _{1},…ignore(x _{ n },y)…) for every σ∈Σ. The restriction simplifies the discussions in Sects. 4.3, 6.1 and 6.2. All the previous examples are deterministic and non-deleting.
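The behaviour of ignore can be checked with a small sketch. It assumes, purely for illustration, that trees are encoded as nested Python tuples `(constructor, child_1, ..., child_n)`; the function below follows the defining rule quoted above and is not taken from the authors' implementation.

```python
def ignore(s, y):
    """ignore(sigma(x_1,...,x_n), y) = ignore(x_1, ... ignore(x_n, y) ...).

    Every input variable x_i is traversed, so the rule is non-deleting,
    yet the result is always the accumulated argument y.
    """
    acc = y
    for child in reversed(s[1:]):   # the innermost call is the one on x_n
        acc = ignore(child, acc)
    return acc

s = ("Cons", ("Z",), ("Cons", ("Z",), ("Nil",)))
t = ("S", ("Z",))
print(ignore(s, t) == t)
```

For a nullary constructor the loop body never runs, which corresponds to the rule ignore(σ,y)=y.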
A program is called parameter-linear if every output variable y occurring on the left-hand side occurs exactly once on the corresponding right-hand side of each rule.^{5} All the previous examples are parameter-linear. Our polynomial-time inverse computation depends on parameter-linearity.
4 Polynomial-time inverse computation
In this section, we formally describe our inverse computation. As briefly explained in Sect. 2, first, we convert an MTT program into a non-accumulative context-generating program (a program that generates contexts instead of trees) without multiple data traversals, such as ev _{c} in Sect. 2.3. Then, we perform inverse computation with memoization. More precisely, we construct a tree automaton [10] that represents the inverse computation result, whose run implicitly corresponds to (a context-aware version of) the existing inverse computation process with memoization [1, 4].
4.1 Conversion to context-generating program
The first and most important step is to convert an MTT program into a non-accumulative context-generating program. This transformation is also useful for removing certain multiple data traversals, as shown in the example of ev in Sect. 2. Moreover, this makes it easy to apply tupling [8, 25] to programs. Note that viewing MTT programs as non-accumulative context-generating transformations is not a new idea (see Sect. 3.1 of [12] for example). The semantics of the context-generating programs shown later is nothing but the use of Lemma 3.4 of [12] to evaluate MTT programs.
First, we will give a formal definition of contexts.
Definition 2
An (m-hole) context K is a tree in \(\mathcal {T}_{\varSigma \cup \{{\bullet }_{1},\ldots,{\bullet }_{m}\}}\) where •_{1},…,•_{ m } are nullary symbols such that \({\bullet }_{1},\ldots,{\bullet }_{m} \not\in \varSigma \).
An m-hole context K is linear if each •_{ i } (1≤i≤m) occurs exactly once in K. We write K[t _{1},…,t _{ m }] for the tree obtained by replacing •_{ i } with t _{ i } for each 1≤i≤m. For example, K=Cons(•_{1},•_{2}) is a 2-hole context and K[Z,Nil] is the tree Cons(Z,Nil). For 1-hole contexts, •_{1} is sometimes written as •.
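The notions of hole filling and linearity can be made concrete with a short sketch. It assumes the same illustrative tuple encoding of trees, with a hole •_{ i } written as the pair `("•", i)`; the names `hole`, `fill`, `hole_occurrences`, and `is_linear` are hypothetical helpers.

```python
def hole(i):
    """The hole symbol •_i, kept distinct from every constructor."""
    return ("•", i)

def is_hole(t):
    return t[0] == "•"

def fill(K, ts):
    """K[t_1,...,t_m]: replace every hole •_i in K by ts[i-1]."""
    if is_hole(K):
        return ts[K[1] - 1]
    return (K[0],) + tuple(fill(c, ts) for c in K[1:])

def hole_occurrences(K):
    """Indices of the holes occurring in K, listed with repetitions."""
    if is_hole(K):
        return [K[1]]
    return [i for c in K[1:] for i in hole_occurrences(c)]

def is_linear(K, m):
    """An m-hole context is linear if each of •_1..•_m occurs exactly once."""
    return sorted(hole_occurrences(K)) == list(range(1, m + 1))

K = ("Cons", hole(1), hole(2))
print(fill(K, [("Z",), ("Nil",)]), is_linear(K, 2))
```

Here `fill(K, [Z, Nil])` reproduces the example K[Z,Nil]=Cons(Z,Nil) from the text, and a context such as Cons(•_{1},•_{1}) is rejected by `is_linear` because •_{1} occurs twice.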
We showed that ev is indeed a non-accumulative context-generating transformation in Sect. 2. In general, any MTT program can be regarded as a non-accumulative context-generating transformation in the sense that, since output variables cannot be pattern-matched, the values bound to the output variables appear as-is in the computation result. Formally, we can state the following fact (Engelfriet and Vogler [15]; Lemma 3.19).
Fact 1
\({\mathopen {[\![}f\mathclose {]\!]}}(s,\overline {t}) = t\) if and only if there is K such that \({\mathopen {[\![}f\mathclose {]\!]}}(s,\overline {{\bullet }}) = K\) and \(t = K[\overline {t}]\).
As a result of the above, in a converted program, the arguments of every function are variables, and the return value of a function cannot be traversed again. This rules out any accumulative computation.
The algorithm above is very similar to that used for deaccumulation [19, 31]. Unlike deaccumulation, we treat contexts as first-class objects, which enables us to adopt special treatment for contexts in our inverse computation method.
Example 4
(reverse)
Example 5
(eval)
Example 6
(mirror)
Now, we can show that the conversion is sound; it does not change the semantics of the functions.
Lemma 1
For any tree s, \({\mathopen {[\![}f\mathclose {]\!]}}(s,\overline {{\bullet }}) = {\mathopen {[\![}f_{\mathrm {c}}\mathclose {]\!]}}(s)\).
Together with Fact 1, we have \({\mathopen {[\![}f\mathclose {]\!]}}(s,\overline {t}) = K[\overline {t}]\) with K=[[f _{c}]](s) for every tree s and \(\overline {t}\).
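The relationship between an accumulative function and its context-generating counterpart can be checked on a small instance. The sketch below uses the two evA rules quoted in Sect. 2.2 (for Dbl and Add); the base rule evA(One,y)=S(y), the tuple encoding of trees, and the shape of `ev_c` (returning k[k[•]] for Dbl and k _{1}[k _{2}[•]] for Add) are assumptions made for illustration, not a transcription of the paper's definitions.

```python
HOLE = ("•",)

def plug(K, t):
    """K[t] for a 1-hole context K."""
    if K == HOLE:
        return t
    return (K[0],) + tuple(plug(c, t) for c in K[1:])

def evA(s, y):
    """Accumulative version: the second argument grows during evaluation."""
    if s[0] == "One":                 # assumed base rule: evA(One, y) = S(y)
        return ("S", y)
    if s[0] == "Dbl":                 # evA(Dbl(x), y) = evA(x, evA(x, y))
        return evA(s[1], evA(s[1], y))
    if s[0] == "Add":                 # evA(Add(x1,x2), y) = evA(x1, evA(x2, y))
        return evA(s[1], evA(s[2], y))
    raise ValueError("unknown constructor: " + s[0])

def ev_c(s):
    """Context-generating version: no accumulating argument, one call per input."""
    if s[0] == "One":
        return ("S", HOLE)
    if s[0] == "Dbl":
        k = ev_c(s[1])                # x is traversed once
        return plug(k, k)             # k[k[•]]
    if s[0] == "Add":
        k1, k2 = ev_c(s[1]), ev_c(s[2])
        return plug(k1, k2)           # k1[k2[•]]
    raise ValueError("unknown constructor: " + s[0])

s = ("Dbl", ("Add", ("One",), ("Dbl", ("One",))))
z = ("Z",)
print(plug(ev_c(s), z) == evA(s, z))
```

In this sketch, plugging Z into the context produced by `ev_c` gives the same tree as running `evA` with Z as the accumulator, which is the instance of \({\mathopen {[\![}f\mathclose {]\!]}}(s,\overline {t}) = K[\overline {t}]\) for this input.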
4.2 Tupling
Tupling is a well-known semantics-preserving program transformation that can remove some of the multiple data traversals [8, 25].
Note that we tuple only the functions that need to be tupled, i.e., the functions that traverse the same input, for the sake of simplicity of our inverse computation method that we will discuss later. For example, app _{c} and rev _{c} are tupled because they traverse the same input, whereas nat _{c} and app _{c} are not tupled. Thus, the tupling step does not change the reverse _{c} and eval _{c} programs. In the tupled program obtained in this way, for any call of a tupled function (k _{1},…,k _{ n })=〈f _{1},…,f _{ n }〉(x), each variable k _{ i } (1≤i≤n) occurs at least once in the corresponding expression.
Tupling may cause a size blow-up of a program: a tupled program is at worst 2^{ F } times as big as the original program, where F is the number of functions in the original program. Recall that we tuple only the functions that traverse the same input, not all the functions in a program. Note that only one of 〈rev _{c},app _{c}〉 and 〈app _{c},rev _{c}〉 can appear in a tupled program. Thus, the tupled functions 〈f _{1},…,f _{ n }〉 are as numerous as the sets of the original functions \(\{f_{1},\ldots,f_{n}\}\).
4.3 Tree automata construction as memoized inverse computation

A tree automaton is more suitable for a theoretical treatment than a side-effectful memoization table.

The set \(\{ s \mid {\mathopen {[\![}f\mathclose {]\!]}}(s) = t \}\) may be infinite (e.g., eval).

We can extract a tree (in DAG representation) from an automaton in time linear to the size of the automaton [10].

In some applications such as test-case generation, it is more useful to enumerate the set of the corresponding inputs instead of returning one of the corresponding inputs.
First of all, we review the definition of tree automata. A tree automaton [10] \(\mathcal{A}\) is a triple (Σ,Q,R), where Σ is a ranked alphabet, Q is a finite set of states, and R is a finite set of transition rules each having the form of either q←q′ or q←σ(q _{1},…,q _{ n }) where σ∈Σ ^{(n)}. We write \({\mathopen {[\![}q\mathclose {]\!]}}_{\mathcal{A}}\) for the set of trees accepted by state q in \(\mathcal{A}\), i.e., \({\mathopen {[\![}q\mathclose {]\!]}}_{\mathcal{A}} = \{ t \in \mathcal {T}_{\varSigma } \mid q \leftarrow^{*} t \}\), where we take ← as rewriting.
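To illustrate the definition, the sketch below computes, for a given tree, the set of states that accept it. The encoding of rules as `(q, rhs)` pairs, the tuple encoding of trees, and the small even/odd automaton are hypothetical examples chosen for illustration; they are not automata constructed by the method of this paper.

```python
def accepting_states(rules, t):
    """Return the set of states q such that t is accepted by q.

    rules is a list of (q, rhs) pairs, where rhs is either a state name
    (a rule q <- q') or a tuple (sigma, q_1, ..., q_n) (a rule q <- sigma(q_1..q_n)).
    """
    child_sets = [accepting_states(rules, c) for c in t[1:]]
    states = set()
    for q, rhs in rules:
        if isinstance(rhs, tuple) and rhs[0] == t[0] and len(rhs) == len(t):
            if all(qi in qs for qi, qs in zip(rhs[1:], child_sets)):
                states.add(q)
    changed = True
    while changed:                      # close under rules of the form q <- q'
        changed = False
        for q, rhs in rules:
            if isinstance(rhs, str) and rhs in states and q not in states:
                states.add(q)
                changed = True
    return states

# Hypothetical automaton over Z and S: "even"/"odd" track parity, "nat" accepts both.
RULES = [
    ("even", ("Z",)),
    ("odd", ("S", "even")),
    ("even", ("S", "odd")),
    ("nat", "even"),
    ("nat", "odd"),
]
two = ("S", ("S", ("Z",)))
print(sorted(accepting_states(RULES, two)))
```

The bottom-up pass mirrors reading ← as rewriting: a tree is accepted by q exactly when q can be rewritten to it using the rules.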
The problem of finding Θ satisfying \(\overline {e}\varTheta = \overline {K}\) for given \(\overline {e}\) and \(\overline {K}\) is called second-order (pattern) matching, and several algorithms have been proposed for it [11, 26, 27]. In the actual construction of the automaton, we do not generate any state that cannot reach \(q_{f^{-1}(t)}\), where f is the function to be inverted and t is the original output. The examples discussed below use this optimization. Note that a tree is a 0-hole context. The non-deleting property is used in the above algorithm for simplicity. If a program is not non-deleting, some input variable x may not have the corresponding function call g(x) in a rule of the tupled program. Then, we have to adopt special treatment for such an x in the construction of R in the algorithm.
Example 7
(reverse _{c})
Example 8
(eval _{c})
Example 9
(mirror _{c})
Our inverse computation is correct in the following sense.
Theorem 1
(Soundness and completeness)
For an input-linear tupled program, \(s \in {\mathopen {[\![}q_{ {\langle \overline {f} \rangle }^{-1}(\overline {K})}\mathclose {]\!]}}_{\mathcal{A}_{ \mathrm {I} }}\) if and only if \({\mathopen {[\![} {\langle \overline {f} \rangle } \mathclose {]\!]}}(s) = (\overline {K})\).
Proof
Straightforward by induction. □
4.4 Complexity analysis of our inverse computation
We show that the inverse computation runs in time polynomial to the size of the original output and the size of the program, but in time exponential to the number of functions and the maximum arity of the functions and constructors. We state as such in the following theorem.
Theorem 2
Given a parameter-linear MTT program that defines a function f and a tree t, we can construct an automaton representing the set \(\{ s \mid {\mathopen {[\![}f\mathclose {]\!]}}(s) = t \}\) in time O(2^{ F } m(2^{ F } n ^{ MF })^{ N+1} n ^{ NMF }) where F is the number of the functions in the program, n is the size of t, N is the maximum arity of constructors in Σ, m is the size of the program, and M is the maximum arity of functions.
Proof
First, let us examine the cost of our preprocessing. The conversion into a context-generating transformation does not increase the program size and can be done in time linear in the program size. In contrast, the tupling may increase the program size to 2^{ F } m. Thus, the total worst-case time complexity for preprocessing is O(2^{ F } m).
Next, let us examine the cost of the inverse computation. The constructed automaton has at most 2^{ F } n ^{ MF } states because every state is of the form 〈g _{1},…,g _{ l }〉^{−1}(K _{1},…,K _{ l }), the number of 〈g _{1},…,g _{ l }〉 is smaller than 2^{ F }, the number of K _{ i } is smaller than n ^{ M }, and l is no more than F. Note that the number of k-hole subcontexts in t is at most n ^{ k+1} and the contexts occurring in our inverse computation have at most (M−1) kinds of holes. Since the number of the states in an automaton is bounded by P=2^{ F } n ^{ MF } and the transition rules are obtained from the rules of the tupled programs that are smaller than 2^{ F } m, the number of the transition rules is bounded by 2^{ F } mP ^{ N+1}. Each rule construction takes O(n ^{ NMF }) time because, for the second-order matching to find Θ such that \(\overline {e} \varTheta = \overline {K}\), the size of the solution space is bounded by O(n ^{ NMF }); note that \(\overline {e}\) contains at most NF context variables that have at most (M−1) kinds of holes. Thus, an upper bound of the worst-case cost of the inverse computation is O(2^{ F } m(2^{ F } n ^{ MF })^{ N+1} n ^{ NMF }).
Therefore, the total worst-case time complexity of our method is bounded by O(2^{ F } m(2^{ F } n ^{ MF })^{ N+1} n ^{ NMF }). □
Note that, if we start from input-linear tupled context-generating programs, the cost is O(m(Fn ^{ Md })^{ N+1} n ^{ Mc }), where c is the maximum number of context variables in the rules, and d is the maximum number of components of the tuples in the program. Also note that the above approximation is quite rough. For example, our method ideally runs in time linear to the size of the original output for reverse and mirror for eval, assuming some sophisticated second-order pattern matching algorithm under some sophisticated context representation depending on programs, which will be discussed in Sect. 5.5.
None of the steps of our inverse computation shown in Sects. 4.1, 4.2 and 4.3 uses the parameter-linearity of an MTT. We only use parameter-linearity to guarantee that our inverse computation is performed in polynomial time. For parameter-linear MTTs, we only have to consider linear contexts; the number of linear subcontexts of a tree t of size n is a polynomial in n, which leads to our polynomial-time results. Our inverse computation still terminates for MTTs without restrictions, in exponential time, because the number of possibly-non-linear m-hole subcontexts of a tree t is at most \(|t|(m+1)^{|t|}\).
5 Experiments and discussions
In this section, we report our prototype implementation of the proposed algorithm and experimental results with the prototype system. The actual complexity of our inverse computation is unclear because of two factors: second-order matching and the automaton states actually generated by the automaton construction. By investigating several programs, we estimate the complexity of our method and clarify how these two factors affect the computation cost.
After the experiments, we discuss how we can improve the complexity of our method for the investigated programs. For example, although ideally we can achieve linear-time inverse computation for reverse, this is hard to achieve with the naive implementation, as shown by the experimental results described later. We discuss what causes the gap and how we can remove it.
5.1 Implementation and environment
Our prototype system is written in Haskell, and is implemented as an inverse interpreter [1], i.e., a program that takes a program and its output and returns the corresponding inputs, rather than an inverse compiler (program inverter) [20]. Usually, inverse computation done by an inverse compiler runs faster than that done by an inverse interpreter. However, the effect is expected to be rather small in our case, which uses rather heavy computations, i.e., the second-order pattern matching and the automaton construction. For the second-order matching, we used the algorithm in [44] without heuristics, which is a variant of Huet’s algorithm [26] specialized to linear λ-terms.
The experiments below were carried out on Ubuntu Linux 12.04 (for i686) on a machine with an Intel(R) Core(TM) i5-660 (3.33 GHz) and 8 GB memory. The prototype implementation is compiled by Glasgow Haskell Compiler 7.4.1^{6} with the flags -O2 -rtsopts and executed with the flags +RTS -H.
5.2 Experiments

reverse in Example 1 and a list of Zs.

eval in Sect. 1 and a natural number.

mirror in Example 3 and a list of Zs.
 The function toc, which constructs the table of contents of a document and which will be discussed in Sect. 6.3, expressed as an MTT as below, and a horizontally-repeated sequence representing (X)HTML fragments like: <ul><li>A</li><li>A</li>…</ul> Here, the fragments <li> x </li> y and <ul> x </ul> y are represented by LI(x,y) and UL(x,y) respectively, the text A is represented by A, and the empty sequence is represented by E.

The program toc above and vertically-repeated (nested) (X)HTML fragments like <ul><li>…<ul><li>A</li></ul>…</li></ul>
In the following, we discuss the experimental results one by one. Throughout the discussions, we use n for the size of the original output tree fed to the inverse computation that we focus on.
5.2.1 reverse
The running time of the inverse computation for reverse is estimated as O(n ^{2}) from Fig. 4. One might think that this result is strange because we know that the inverse of reverse is reverse and thus can be executed in linear time. This gap comes from two factors: the construction of the tree automaton and the second-order pattern matching.
Regarding the construction of the automaton described in Sect. 4.3, we simply used a pair \((\overline {f}, \overline {K})\) to represent a state \(q_{ {\langle \overline {f} \rangle }^{-1}(\overline {K})}\) in the automaton. Since the construction checks whether the transitions that go to a state have already been generated, and we used a balanced search tree^{7} for the check, each check takes O(|q| log |q|) time for \(q = (\overline {f}, \overline {K})\). For reverse, since the constructed automaton contains O(n) states and each state has size O(n), the construction itself takes O(n ^{2} log n) time.
One might notice that the experiment indicates that the time cost of the inverse computation is O(n ^{2}) while the above discussion indicates that it is O(n ^{2} log n). It is hard to observe the log n factor experimentally because it is too small for the problem sizes used. Thus, this is not a contradiction.
5.2.2 mirror
The running time of the inverse computation of mirror is estimated as O(n ^{3}) from Fig. 4. Note that the generated automaton contains O(n ^{2}) states because we do not know which part of the list is generated by app or rev. The constructed automaton contains states of the form \(q_{ {\langle \mathit {app} , \mathit {rev} \rangle }^{-1}(K^{l-k},K^{m-k})}\) where l+m is equal to the length of an output list (thus, l+m=(n−1)/2), k≤l and k≤m, and K ^{ i } denotes the context of the form Cons(Z,Cons(Z,…,Cons(Z,•)…)) containing i occurrences of Z.
The effects of the automaton construction and the second-order matching are similar to those of reverse; the automaton construction and the second-order matching take O(n log n) and O(n) time for each state, respectively.
5.3 eval
The inverse computation of eval is estimated to run in time O(n ^{3}) from Fig. 4. The constructed automaton contains O(n) states and O(n ^{2}) transition rules.
In contrast to reverse and mirror, the implemented second-order pattern matching takes O(n ^{2}) time for each state. The matching \(k[k[{\bullet }]] \stackrel {?}{=}K\) takes O(|K|^{2}) time; the implemented algorithm guesses K _{1} such that K _{1}[K _{2}]=K, for which there are |K| candidates, and then checks K _{1}=K _{2}, which takes O(|K|) time. The matching \(k_{1}[k_{2}[{\bullet }]] \stackrel {?}{=}K\) also takes O(|K|^{2}) time; the algorithm guesses K _{1} such that K _{1}[K _{2}]=K (similarly, there are |K| candidates for K _{1}) and takes O(|K|) time to check that K _{2} has the form k _{2}[•].
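The guess-and-check procedure for \(k[k[{\bullet }]] \stackrel {?}{=}K\) described above can be sketched as follows. This is a naive illustration over an assumed tuple encoding of 1-hole contexts (with `("•",)` as the hole); it is not the algorithm of [44] used in the prototype, and the helper names are hypothetical.

```python
HOLE = ("•",)

def hole_path(K):
    """Child indices leading from the root of a 1-hole context K to its hole."""
    if K == HOLE:
        return []
    for i, c in enumerate(K[1:], start=1):
        p = hole_path(c)
        if p is not None:
            return [i] + p
    return None                       # no hole below this node

def split(K, path, d):
    """Return (K1, K2) with K1[K2] = K, cutting K at depth d on the hole path."""
    if d == 0:
        return HOLE, K
    i = path[0]
    K1, K2 = split(K[i], path[1:], d - 1)
    return K[:i] + (K1,) + K[i + 1:], K2

def match_kk(K):
    """All k with k[k[•]] = K, found by trying every cut point."""
    path = hole_path(K)
    solutions = []
    for d in range(len(path) + 1):    # one candidate K1 per cut point
        K1, K2 = split(K, path, d)
        if K1 == K2:                  # structural comparison, linear in |K|
            solutions.append(K1)
    return solutions

s4 = ("S", ("S", ("S", ("S", HOLE))))
print(match_kk(s4))
```

Each of the candidate cuts costs a linear-time comparison in this representation, which is where the quadratic bound for a single matching problem comes from.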
5.4 toc
The second-order matching took O(n ^{2}) time for each state because we solve \(k_{1}[k_{2}[{\bullet }]] \stackrel {?}{=}K\), which takes O(|K|^{2}) time, similar to that in eval.
5.5 Discussions
In this section, we discuss how we can improve the asymptotic complexity of the implemented algorithm. Again, we use n for the size of an output tree that we focus on.
5.5.1 Pointer-representation of contexts
The prototype implementation uses a very naive representation of contexts, i.e., a tree with holes, which takes O(n) space and for which the equivalence check also takes O(n) time. Due to this cost, the running time of the inverse computation is usually no better than O(nQ), where Q is the number of states in the constructed automaton. For example, even for nat in Example 1, the inverse computation takes O(n ^{2}) time, although the second-order matching \(S(k) \stackrel {?}{=}t\) (which is nothing but a first-order matching) can be solved in O(1).
A possible solution to the problem is to represent an m-hole context by (m+1) pointers (1 for its root and m for its holes). Since a pointer to the output tree t can be expressed in O(log n) space rather than n, this representation is expected to reduce the cost of introducing a state into the automaton under construction. Actually, this representation, combined with the “jumping” technique that will be discussed in Sect. 5.5.2, reduces the cost of the inverse computation for nat and reverse, while it may increase the cost in some cases as described later.
With this pointer representation, the inverse computation for nat runs in time O(n). To reduce the cost from O(n) to O(1), we have to know that nat is the identity function on natural numbers and is surjective, which is an orthogonal story to the discussions in this paper.
Note that there is a trade-off between this representation and the naive representation: a context can have n pointer-representations at worst. Thus, although this representation works effectively for reverse, mirror and toc, the constructed automaton for eval now has O(n ^{2}) states and O(n ^{3}) transitions in the pointer representation. Recall that it has O(n) states and O(n ^{2}) transitions in the naive context representation. In the pointer representation, the same contexts S(•) occurring in different positions are distinguished.
5.5.2 Jumping to arbitrary subtrees
As described above, the pointer representation is sometimes useful for reducing the cost of the automaton construction. However, to reduce the cost of the inverse computation, we also have to reduce the cost of the second-order matching.
The pointer representation also sheds light on this problem: it enables us to traverse a tree or context from a leaf or from an arbitrary position, whereas in the naive representation we have to traverse a tree or context from the top. For example, for reverse, in which we solve the second-order matching problem \(k_{1}[ \mathsf {Cons} (k_{2},{\bullet })] \stackrel {?}{=}t\), we have to search for Cons(k_{2},•) in t from the top in the naive representation. In contrast, in the pointer representation, the corresponding problem Γ⊢k_{1}[Cons(k_{2},•)]::o′→o can be solved in constant time because we can “jump” to the hole position by looking up the transition rule o″←Cons(o‴,o′).
Finding a good strategy for typing would lead to efficient second-order matching. Assuming a strategy such as performing “jumping” whenever possible, the inverse computation for reverse runs in time O(n) and that for mirror runs in time O(n^{2}). On the other hand, this technique does not reduce the inverse computation cost for eval.
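The “jump” for reverse can be sketched as a dictionary lookup. This is only an illustration under assumptions: the node table, the child-to-parent index and the names `parent_index` and `jump_cons` are hypothetical, and the sketch relies on every node of the output tree having a unique pointer, so that the rule o″←Cons(o‴,o′) is determined by o′.

```python
def parent_index(nodes):
    """Index the rules parent <- label(children) by each child pointer."""
    index = {}
    for parent, (label, children) in nodes.items():
        for pos, child in enumerate(children):
            index[child] = (parent, label, pos)
    return index

def jump_cons(nodes, index, root, hole):
    """Solve k1[Cons(k2, •)] :: hole -> root with one lookup.

    Returns (type of k1, pointer of k2), or None if hole is not the
    second argument of a Cons node.
    """
    if hole not in index:
        return None
    parent, label, pos = index[hole]
    if label != "Cons" or pos != 1:
        return None
    k2 = nodes[parent][1][0]      # first argument of the Cons node
    return (root, parent), k2     # k1 is the context from root down to parent

# Output tree Cons(A, Cons(B, Nil)) with hand-assigned pointers.
nodes = {
    0: ("Cons", [1, 2]),
    1: ("A", []),
    2: ("Cons", [3, 4]),
    3: ("B", []),
    4: ("Nil", []),
}
index = parent_index(nodes)
print(jump_cons(nodes, index, 0, 4))
```

Building the index takes one pass over the output tree; afterwards each matching step of this form costs O(1), instead of a top-down search for the Cons node.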
5.5.3 Special treatment for monadic trees
Further optimization is applicable when the outputs are monadic trees, i.e., trees built only from unary and nullary constructors, such as S(S(Z)) and A(B(A(E))). For monadic trees, we can use integers as pointers.
Sometimes this integer representation is useful for solving the second-order matching more efficiently. Consider the second-order matching \(k[k[{\bullet }]] \stackrel {?}{=} \mathsf {S} ^{n}({\bullet })\), which, in the integer representation, translates to the problem of enumerating Γ such that Γ⊢k[k[•]]::n→0, where n represents a pointer to the subtree occurring at depth n, i.e., the subtree reachable from the root by a path of length n. Since the pattern is k[k[•]], we know that k splits the context n→0 in the middle. That is, n must be even, and k must have both the type n→n/2 and the type n/2→0. Thanks to the integer representation, we can find this unique candidate for k without inspecting the context S^{n}(•) at all; unlike in the pointer representation, we can divide or multiply a “pointer” by a constant in the integer representation. Note that we still have to check whether the types n/2→0 and n→n/2 represent the same context. The cost of the second-order matching is thus reduced from O(n^{2}) to O(n).
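The following minimal sketch illustrates this matching for monadic contexts. It assumes that a monadic context is encoded as the string of its constructor names read from the root (so S(S(•)) is "SS"); the function name `solve_kk` is hypothetical.

```python
def solve_kk(labels):
    """Solve k[k[•]] = C for a monadic context C given as a label string.

    With integer pointers, the only candidate types for k are
    n -> n/2 and n/2 -> 0, found by arithmetic alone; the remaining
    O(n) work is checking that both types denote the same context.
    """
    n = len(labels)
    if n % 2 != 0:
        return None            # n must be even
    half = n // 2
    upper = labels[:half]      # context of type n/2 -> 0
    lower = labels[half:]      # context of type n -> n/2
    return upper if upper == lower else None

print(solve_kk("SSSS"))  # the context S(S(•))
print(solve_kk("SSS"))   # no solution: odd depth
print(solve_kk("ABBA"))  # no solution: the two halves differ
```

For outputs built from S alone the final comparison always succeeds, but for monadic trees with several unary constructors it is the step that can reject the candidate.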
In general, for a second-order matching problem \(e \stackrel {?}{=}K\), the integer representation of contexts lets us express constraints on the “shapes” of the free variables in e by linear equations and inequalities. For example, from a typing problem over the integer pointers x_{0},…,x_{5}, we obtain the shape constraint x_{1}=x_{5}∧x_{2}=x_{4}∧x_{3}=x_{0}∧(x_{3}−x_{2})=(x_{1}−x_{0})∧x_{0}≤x_{1}∧x_{2}≤x_{3}∧x_{4}≤x_{5}. Solving the constraint for x_{0},x_{1},x_{2},x_{3}, we get x_{1}=x_{5}, x_{2}=x_{4}, 2x_{0}=2x_{3}=x_{4}+x_{5}. By using this technique, the cost of the second-order matching in the inverse computation of eval becomes O(n), and thus the cost of the inverse computation of eval becomes O(n^{3}) again. This kind of technique is also applicable to list-generating programs such as reverse and mirror, and to functions whose outputs are partly monadic.
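The solved form of the example constraint can be checked mechanically. The sketch below is an illustration only: the function names are hypothetical, and the side condition x_{4}≤x_{5} in `solved` is our own addition, needed to keep the inequalities of the original constraint.

```python
from itertools import product

def constraint(x0, x1, x2, x3, x4, x5):
    """The shape constraint quoted in the text."""
    return (x1 == x5 and x2 == x4 and x3 == x0
            and (x3 - x2) == (x1 - x0)
            and x0 <= x1 and x2 <= x3 and x4 <= x5)

def solved(x0, x1, x2, x3, x4, x5):
    """The solved form, plus the remaining inequality x4 <= x5."""
    return (x1 == x5 and x2 == x4
            and 2 * x0 == x4 + x5 and 2 * x3 == x4 + x5
            and x4 <= x5)

def solve_shape(x4, x5):
    """Compute (x0, x1, x2, x3) from x4 and x5 in constant time."""
    if x4 > x5 or (x4 + x5) % 2 != 0:
        return None
    mid = (x4 + x5) // 2
    return (mid, x5, x4, mid)

# Both forms agree on every assignment over a small range of integers.
agree = all(constraint(*xs) == solved(*xs)
            for xs in product(range(4), repeat=6))
print(agree, solve_shape(0, 4))
```

The point of the solved form is visible in `solve_shape`: once x_{4} and x_{5} are known, the remaining pointers follow by arithmetic, with no search over positions in the output.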
A similar optimization technique has also been discussed in the context of parsing range concatenation grammars [7], in which users can represent an arbitrary number of repetitions of a string. By using the technique, their parsing algorithm accepts 2^{n} repetitions of a, i.e., \(\mathtt{a}^{2^{n}}\), in O(2^{n}) time.
An even more specialized representation of contexts is applicable to natural numbers represented by S and Z, i.e., trees built from one unary constructor and one nullary constructor. In this situation, we can represent a context by a single natural number; for example, we can represent the context S(S(•)) by 2. For eval, since this representation has no redundant states and the above optimization techniques also apply to it, the inverse computation of eval runs in O(n^{2}) time.
5.5.4 Estimation of shape
In the second-order matching, we have not used the fact that a variable k represents a return value of a function. Sometimes, we can perform more efficient inverse computation by using this information.
A good future direction would be to discuss how we can estimate the shape and how we can use the estimated shape.
6 Extensions
We shall discuss four extensions of the inverse computation.
6.1 Pattern guards
The specialization of a program increases the program size [33, 35]. In the worst case, a specialized program is Q^{N} times as big as the original one, and the specialization takes time proportional to the size of the specialized program, assuming that the look-ahead is defined by a deterministic tree automaton [10] with the states Q, where N is the maximum arity of the constructors. Since this only increases the program size, our method still runs in time polynomial in the size of the original output.
6.2 Bounded use of parameters
We can easily extend the method in Lemma 6.3 of [12] to generate specialized functions. A converted program can be (b+1)^{MF(N+1)} times as big as the original one, where b is the bound of the parameter copies, N is the maximum arity of the constructors, F is the number of functions, and M is the maximum arity of the functions.
6.3 Parameterlinear macro forest transducers
A macro forest transducer [41], which is an important extension of a macro tree transducer, generates forests (roughly speaking, sequences of trees) instead of trees, which enables us to express XML transformations and serialization programs more directly. Our polynomial-time inverse computation results can be lifted to parameter-linear macro forest transducers, since the following three points carry over:
 1. We can see a program as a linear non-accumulative context-generating program.
 2. The number of contexts in a given output is bounded by a polynomial in the size of the output.
 3. The substitutions Θ satisfying \(\overline {e}\varTheta = \overline {K}\) can be enumerated in polynomial time. Since the solution space is bounded polynomially by the second item, the existence of a polynomial-time equivalence check for two contexts is sufficient.
Note that, for linear macro forest transducers, where the uses of both input and output variables are linear, polynomial-time inverse computation can be performed simply by preprocessing. For a linear macro forest transducer, the size of an output forest is bounded linearly by the size of the input forest, i.e., the transformation is of linear size increase [14]. Thus, a linear macro forest transducer is MSO-definable because it is expressed as a composition of MTTs and is of linear size increase [14]. Since an MSO-definable tree transformation can be represented by an MTT that is both finite-copying-in-the-inputs and finite-copying-in-parameter [12], our method becomes applicable with some extra preprocessing, as noted in Sect. 6.2.
6.4 Composing with inverseimage computation
Recall that our inverse computation method returns the set of corresponding inputs as a tree automaton. To enlarge the class of functions for which polynomial-time inverse computation can be performed, it is natural to try composing our inverse computation with inverse-image computation, i.e., computation of the set \(\{ s \mid f(s) \in T \}\) for a program f and a given set of outputs T, which has been well studied in the context of tree transducers (for example, [15, 17, 32]).

Usually, inverse-image computation is harder than P. For example, the inverse-image computation is known to be EXPTIME-complete even for MTTs without output variables, which are thus non-accumulative, when T and the result set are given as tree automata [34].

A few results are known on polynomial-time inverse-image computation. However, one such method [17] requires the set of output trees to be given as a deterministic tree automaton; in general, converting a tree automaton to a deterministic one causes an exponential size blowup.

For some programs, polynomial-time inverse-image computation is possible even if we use a nondeterministic tree automaton to represent the set of outputs. However, composing these methods sometimes does not increase the expressive power. For example, although it is not difficult to see that polynomial-time inverse-image computation can be performed for MSO-definable transducers, the composition of an MSO-definable transducer followed by a parameter-linear MTT can also be expressed as a parameter-linear MTT [12].

The inverse-image computation method proposed by Frisch and Hosoya [17] runs in polynomial time for MTTs with the restriction of finite-copying-in-the-inputs [12].

The method requires deterministic tree automata, but the automata obtained by our inverse computation can be converted to deterministic ones in polynomial time; there is no exponential size blowup.

The composition of a finite-copying-in-the-inputs MTT followed by a parameter-linear MTT can express a transformation that cannot be expressed by a single MTT, although the resulting class is artificial.
In the following, we show that the automata obtained by our inverse computation can be converted to deterministic ones in polynomial time. A tree automaton is called ϵ-free if it contains no rules of the form q←q′. A tree automaton is called deterministic [10] if it is ϵ-free and its transition rules contain no two different rules q←σ(q_{1},…,q_{n}) and q′←σ(q_{1},…,q_{n}) for any σ and q_{1},…,q_{n}. Note that we can convert an automaton to an ϵ-free one in polynomial time [10].
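The two definitions can be read as a simple check over the rule set. The sketch below assumes a hypothetical encoding in which a rule q←σ(q_{1},…,q_{n}) is the triple `(q, sigma, (q1, ..., qn))` and an ϵ-rule q←q′ is `(q, None, (q_prime,))`; neither the encoding nor the function names come from the paper.

```python
def is_epsilon_free(rules):
    """True if no rule has the form q <- q' (encoded with symbol None)."""
    return all(symbol is not None for _, symbol, _ in rules)

def is_deterministic(rules):
    """True if the automaton is epsilon-free and no two different rules
    share the same right-hand side sigma(q1, ..., qn)."""
    if not is_epsilon_free(rules):
        return False
    target_of = {}
    for target, symbol, children in rules:
        # setdefault records the first target seen for this right-hand side
        if target_of.setdefault((symbol, children), target) != target:
            return False
    return True

nat = [("q", "Z", ()), ("q", "S", ("q",))]
print(is_deterministic(nat))                          # True
print(is_deterministic(nat + [("p", "Z", ())]))       # False: q <- Z and p <- Z
print(is_deterministic(nat + [("p", None, ("q",))]))  # False: epsilon rule
```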
A key property here is that, in an automaton obtained by our inverse computation, each state has the form \(q_{ {\langle \overline {f} \rangle }^{-1}(\overline {K})}\) and satisfies \(s \in {\mathopen {[\![}q_{ {\langle \overline {f} \rangle }^{-1}(\overline {K})}\mathclose {]\!]}}_{\mathcal{A}_{ \mathrm {I} }}\) if and only if \({\mathopen {[\![} {\langle \overline {f} \rangle } \mathclose {]\!]}}(s) = \overline {K}\) (Theorem 1). Using this fact, we obtain the following lemma:
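Lemma 2
\({\mathopen {[\![}q_{ {\langle \overline {f} \rangle }^{-1}(\overline {K})}\mathclose {]\!]}}_{\mathcal{A}_{ \mathrm {I} }} \cap {\mathopen {[\![}q_{ {\langle \overline {f'} \rangle }^{-1}(\overline {K'})}\mathclose {]\!]}}_{\mathcal{A}_{ \mathrm {I} }} = \emptyset\) holds for any \(\overline {f}\), \(\overline {f}'\), \(\overline {K}\) and \(\overline {K'}\) such that \(f_{i} = f'_{i}\) and \(K_{i} \ne K_{i}'\) for some i.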
Proof
We prove the lemma by contradiction. Suppose that we have \({\mathopen {[\![}q_{ {\langle \overline {f} \rangle }^{-1}(\overline {K})}\mathclose {]\!]}}_{\mathcal{A}_{ \mathrm {I} }} \cap {\mathopen {[\![}q_{ {\langle \overline {f'} \rangle }^{-1}(\overline {K'})}\mathclose {]\!]}}_{\mathcal{A}_{ \mathrm {I} }} \ne \emptyset\) for some \(\overline {f}\), \(\overline {f}'\), \(\overline {K}\) and \(\overline {K'}\) such that \(f_{i} = f'_{i}\) and \(K_{i} \ne K_{i}'\) for some i. According to Theorem 1, there exists some s such that \({\mathopen {[\![} {\langle \overline {f} \rangle } \mathclose {]\!]}}(s) = \overline {K}\) and \({\mathopen {[\![} {\langle \overline {f}' \rangle } \mathclose {]\!]}}(s) = \overline {K'}\). That is, \({\mathopen {[\![}f_{i}\mathclose {]\!]}}(s) = K_{i}\) and \({\mathopen {[\![}f'_{i}\mathclose {]\!]}}(s) = K'_{i}\). Since \(f_{i} = f_{i}'\) and \({\mathopen {[\![}f_{i}\mathclose {]\!]}}\) is a function (because we consider deterministic MTTs), we have \(K_{i} = K'_{i}\), which contradicts the assumption \(K_{i} \ne K'_{i}\). □
The lemma guarantees that the naive subset construction [10], which converts a tree automaton to a deterministic one, runs in polynomial time. In the subset construction, we construct an automaton whose states are sets of states of the input automaton. A key property is that, in the constructed automaton, every state accepts at least one tree. Thus, if we apply the subset construction to the automaton resulting from our inverse computation, no state P of the generated automaton contains two states \(q_{ {\langle \overline {f} \rangle }^{-1}(\overline {K})}\) and \(q_{ {\langle \overline {f'} \rangle }^{-1}(\overline {K'})}\) with \(K_{i} \ne K'_{i}\) and \(f_{i} = f'_{i}\). Therefore, the number of states in the constructed automaton is bounded by the number of mappings from the functions f in the original program to contexts K, which is O(n^{FM}), where n is the size of the original output fed to the inverse computation, F is the number of functions in the original program, and M is the maximum arity of the functions. Note that the number of states in the automaton resulting from our inverse computation is also O(n^{FM}) (see the proof of Theorem 2).
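For reference, the subset construction restricted to reachable states can be sketched as follows. This is a generic textbook-style sketch, not the construction used in the prototype; the rule encoding `(q, sigma, (q1, ..., qn))`, the `arities` table and the name `determinize` are assumptions for illustration. Only non-empty subsets that are actually produced are created, so every constructed state accepts at least one tree.

```python
from itertools import product

def determinize(rules, arities):
    """Bottom-up subset construction for an epsilon-free tree automaton.

    rules:   iterable of (target, symbol, children) triples
    arities: {symbol: arity}
    Returns the reachable subset states and the deterministic rules,
    keyed by (symbol, tuple of subset states).
    """
    states = set()
    det_rules = {}
    changed = True
    while changed:
        changed = False
        for symbol, arity in arities.items():
            current = sorted(states, key=sorted)   # snapshot of known states
            for combo in product(current, repeat=arity):
                if (symbol, combo) in det_rules:
                    continue
                target = frozenset(
                    q for q, s, kids in rules
                    if s == symbol and len(kids) == arity
                    and all(k in subset for k, subset in zip(kids, combo)))
                if not target:
                    continue                       # never create the empty state
                det_rules[(symbol, combo)] = target
                if target not in states:
                    states.add(target)
                    changed = True
    return states, det_rules

# Two states with disjoint languages (even and odd naturals): every
# reachable subset is a singleton, so there is no blowup.
even_odd = [("even", "Z", ()), ("odd", "S", ("even",)), ("even", "S", ("odd",))]
states, _ = determinize(even_odd, {"Z": 0, "S": 1})
print(sorted(sorted(s) for s in states))
```

In general the number of reachable subsets can be exponential; the lemma rules this out for automata produced by the inverse computation, because states that disagree on some K_{i} for the same f_{i} can never end up in the same subset.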
Since we can convert an automaton obtained by our inverse computation to a deterministic one in polynomial time, we can perform polynomial-time inverse computation for a transformation that is defined by a finite-copying-in-the-inputs MTT followed by a finite-copying-in-parameter MTT.
7 Related work
7.1 Inverse computation
There have been many studies on the inverse computation problem [1, 20, 21, 23, 30, 37, 40, 49]. They can be categorized into those on left-inverse computation and those on right-inverse computation. Left-inverse computation [20, 21, 23, 30, 40] focuses on injective functions and tries to obtain an efficient inverse computation based on injectivity analysis, but it can only handle provably injective functions. Right-inverse computation [1, 37, 49], including ours, can handle more functions than left-inverse computation does (it works even for non-injective functions), but the resulting inverse-computation process is usually much slower than that of left-inverse computation. Another important difference is that left-inverse computation is compositional: if we have effective left-inverse computation methods for f and g, we have an effective left-inverse computation method for f∘g. On the other hand, right-inverse computation may not be compositional: even if we have right-inverse computation methods for f and g, right-inverse computation may be undecidable for f∘g. Left-inverse computation is suitable for applications in which efficiency is the biggest concern, such as serialization/deserialization. On the other hand, right-inverse computation is suitable for applications in which one wants to invert a non-injective function to enumerate all the corresponding inputs, such as test-case generation [9, 43]. It is worth noting that checking the injectivity of a function is generally undecidable. For parameter-linear MTTs in particular, the injectivity check is undecidable even if the MTT has no output variables [18] or has no multiple data traversals (we can reduce the ambiguity check of a context-free grammar, which is known to be undecidable [24], to the problem). Thus, for any left-inverse computation method, there is essentially a function written as a parameter-linear MTT that it cannot invert.
To the best of our knowledge, there has been little discussion of multiple data traversals, except for Eppstein’s work [16]. He demonstrated the usefulness of tupling [8, 25], which can make an injective function from non-injective functions.
Regarding accumulations, studies on left-inverse computation have treated them heuristically [20, 39, 40] because the injectivity check is usually undecidable in their presence. Glück and Kawabe [20] use the LR-parsing technique. In their system, if the grammar obtained from a program is LR-parsable, an inverse program based on LR parsing is derived. Note that their use of a grammar is different from our use of a tree automaton: their grammar represents the set of possible instruction sequences (traces) of a program, while our tree automaton represents the set of inverse computation results. Nishida and Vidal [40] and Mogensen [39] focus on a special tail-recursive (thus usually accumulative) pattern and discuss the inverse computation of the pattern. Regarding right-inverse computation, although there are few studies focusing on accumulative functions, the approaches [17, 32] to inverse-image computation have a strong connection to this work and are discussed later in this section.
7.2 Results on tree transducers
We assumed that the programs are deterministic and showed that tractable inverse computation is possible for parameter-linear MTTs. However, this result does not extend to nondeterministic programs. Even for MTTs without output variables, the problem of checking whether an inverse-computation result is empty is known to be NP-complete [42]. This means that the inverse computation problem for nondeterministic MTTs, even without output variables, is NP-hard. For compositions of (deterministic/nondeterministic) macro tree transducers, checking whether an inverse-computation result is empty is known to be in NP [28]; thus, the problem is NP-complete for compositions of nondeterministic macro tree transducers. It is still open whether the problem is NP-hard for compositions of deterministic macro tree transducers.
The problem of inverse computation takes a function f and an output tree t and returns the trees s such that f(s)=t. A similar problem, the inverse-image computation problem, i.e., computation of the set \(\{ s \mid f(s) \in T \}\) for a given f and T, has been studied for tree transducers (for example, [15, 17, 32]). The difference from the inverse computation problem is that inverse computation takes one tree whereas inverse-image computation takes a set of trees, and this difference is a key to our polynomial-time result. The complexity of the inverse-image computation is EXPTIME-complete even for parameter-linear MTTs without output variables, which are thus non-accumulative, when T and the result set are given as tree automata [34]. Roughly speaking, this EXPTIME-hardness is caused by intersections; for an expression like …f(x)…f(x)…, we essentially have to compute an intersection \(f^{-1}(T_{1}) \cap f^{-1}(T_{2})\) in the inverse-image computation [34]. In our method, on the other hand, we do not need to compute the intersection because, for trees t_{1} and t_{2}, \(f^{-1}(t_{1}) \cap f^{-1}(t_{2})\) equals \(f^{-1}(t_{1})\) if t_{1}=t_{2}, and otherwise it is empty. This is implicitly expressed by the transformation in Sect. 4.1, in which we replace …f(x)…f(x)… by …k…k… where k=f(x); a multiple data traversal is replaced by an output copying.
The observation that an MTT program is a non-accumulative context-generating transformation plays an important role in our method. A similar but different idea is exploited in inverse-image computation [17, 32]. Unlike in our approach, the idea there is to view an MTT program as a non-accumulative mapping-generating transformation, where a mapping is represented by input-output pairs. A context is different from a mapping; it contains more information than a mapping, e.g., the information about the positions of holes. This difference results in the difference in inverse computation between our context-generation view and the mapping-generation view. The mapping-generation view considers mappings from a tuple of subtrees of t to a subtree of t for the original output t, which are indeed partially applied functions such as \(\lambda \overline {y}.{\mathopen {[\![}f\mathclose {]\!]}}(s,\overline {y})\) used to generate t. However, the number of m-ary mappings on the subtrees of t is exponential in the size of t [17, 32]. Although the inverse computation based on the mapping-generation view can be performed in polynomial time if there are no multiple data traversals [17], it is unclear whether polynomial-time inverse computation for functions with multiple data traversals can be achieved. In contrast, we exploit the linearity of the holes (a context contains this information but a mapping does not) to achieve polynomial-time inverse computation for parameter-linear MTTs, in which a function can have multiple data traversals. Note that, like the number of m-ary functions, the number of nonlinear m-hole subcontexts in a tree is bounded exponentially by the size of the tree, whereas the number of linear ones is bounded polynomially by the size.
Regarding inverse computation of general MTTs, there is another polynomial-time inverse computation method besides ours that works for a subset of MTTs. The method of [17], as mentioned in the previous paragraph, runs in polynomial time for MTTs without multiple data traversals, i.e., MTTs with the restriction of finite-copying-in-the-inputs [12]. In this restricted class of MTTs, we can copy an output unboundedly many times, but we can traverse an input only a bounded number of times. For example, reverse and mirror are finite-copying-in-the-inputs, but eval is not. In contrast, our method runs in polynomial time for (deterministic) MTTs with the restriction of finite-copying-in-the-output (Sect. 6.2), in which we can traverse an input unboundedly many times but can copy an output only a bounded number of times. Whether we can perform polynomial-time inverse computation for general deterministic MTTs is still an open problem. It is worth noting that many useful functions can be written as an MTT in which both the input traversals and the output copies are bounded [12, 13, 14, 32], and thus inverse computation for such functions can be performed in polynomial time both by their method and by ours. Thus, the difference in expressiveness between our method and the others is rather small, though not negligible. However, we claim that our method stands out by being systematic and simple.
7.3 (Formal) Grammars
In previous work [37], we suggested an idea for inverse computation: we first find a grammar representing the possible outputs of a program, and then perform inverse computation through parsing with the grammar. For left-inverse computation, we use an unambiguous grammar that over-approximates the possible outputs of a program; for right-inverse computation, we use a grammar that exactly represents the possible outputs of a program, but the grammar need not be unambiguous. In [37], we mainly discussed inverse computation using regular tree grammars [10], and showed that linear-time (right-)inverse computation is possible for affine and treeless [48] programs. For some subclasses of MTTs, we can achieve polynomial-time inverse computation via parsing; for example, we can use context-free tree grammars for linear macro tree transducers and IO macro grammars [5] for input-linear MTTs. However, to the best of our knowledge, there is no (tree) grammar that is powerful enough to express the possible outputs (range) of parameter-linear MTTs and is parsable in polynomial time.
One may notice that the idea of parsing with range-wise memoization is already used in Cocke-Younger-Kasami (CYK) parsing [24] for context-free grammars. Indeed, both the parsing algorithm of RCG and our inverse computation method are extensions of CYK parsing. We can see a context-free grammar as a transformation from a concrete syntax tree to the corresponding string, and we can write the transformation as a parameter-linear MTT. Then, our inverse computation for the transformation behaves as CYK parsing for the grammar, although the time complexity of the inverse computation depends on which second-order matching algorithm we use.
It is known that RCG is P-complete, i.e., RCG can express any set of strings whose membership test can be performed in polynomial time [7]. However, we have not used the full expressive power of RCG. Filling this gap is a future direction.
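To make the analogy concrete, here is a minimal sketch of CYK recognition written with explicit range-wise memoization. It is a standard textbook formulation, not code from this work; the grammar encoding (Chomsky normal form, given as two dictionaries) and the name `cyk` are assumptions for illustration. The memo table is keyed by a range (i, j) of the input string, in the same way that our method memoizes results per context of the output tree.

```python
from functools import lru_cache

def cyk(word, unary, binary, start):
    """CYK recognition for a grammar in Chomsky normal form.

    unary:  {terminal: set of nonterminals A with A -> terminal}
    binary: {(B, C): set of nonterminals A with A -> B C}
    """
    @lru_cache(maxsize=None)
    def derive(i, j):
        """Nonterminals deriving word[i:j]; computed once per range."""
        if j - i == 1:
            return frozenset(unary.get(word[i], ()))
        found = set()
        for k in range(i + 1, j):          # every split of the range
            for b in derive(i, k):
                for c in derive(k, j):
                    found |= binary.get((b, c), set())
        return frozenset(found)

    return len(word) > 0 and start in derive(0, len(word))

# Grammar for a^n b^n (n >= 1): S -> A B | A T, T -> S B, A -> a, B -> b.
unary = {"a": {"A"}, "b": {"B"}}
binary = {("A", "B"): {"S"}, ("A", "T"): {"S"}, ("S", "B"): {"T"}}
print(cyk("aabb", unary, binary, "S"))  # True
print(cyk("aab", unary, binary, "S"))   # False
```

There are O(n^{2}) ranges and each is split in O(n) ways, which gives the familiar cubic bound for a fixed grammar.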
8 Conclusion
We have shown that viewing a function as a context-generating transformation simplifies inverse computation of accumulative functions with multiple data traversals. Accordingly, we can achieve systematic polynomial-time inverse computation with small modifications to the existing techniques.
A future direction is to develop a systematic program inversion method for accumulative functions based on this viewpoint. Since an accumulative function can now be viewed as a non-accumulative context-generating function, we hope that we can extend the usual range-analysis-based program-inversion methods [21, 30, 37] to such functions, and that a program-inversion method developed in this way would be a good alternative to the existing approaches [20, 39, 40]. Another future direction is to develop an inverse computation method that can handle more kinds of copying. One interesting kind of copying in practice is that introduced by the “join” operation in database queries. Although this study is the first to tackle the problem of “copies” in inverse computation, there is still a large gap between our results and the general “join” functions used in practice. Since tree transducers are hardly able to express “join”-like transformations [38], the next step in our research would be to identify which “join”s we should treat by designing an appropriate language.
Acknowledgements
We wish to thank Akimasa Morihata, Meng Wang, and Soichiro Hidaka, who gave us many valuable comments on an earlier version of this work. The discussions in Sect. 5.5.1 were inspired by suggestions from Naoki Kobayashi and Takeshi Tsukada. We also thank Janis Voigtländer, who pointed out the relationship between Sect. 4.1 and deaccumulation. This work was partially supported by the Japan Society for the Promotion of Science, Grant-in-Aid for Research Activity Start-up 22800003, when the first author was at Tohoku University, and is partially supported by the Japan Society for the Promotion of Science, Grant-in-Aid for Young Scientists (B) 24700020.
References
 1.Abramov, S.M., Glück, R.: Principles of inverse computation and the universal resolving algorithm. In: Mogensen, T.Æ., Schmidt, D.A., Sudborough, I.H. (eds.) The Essence of Computation. Lecture Notes in Computer Science, vol. 2566, pp. 269–295. Springer, Berlin (2002) CrossRefGoogle Scholar
 2.Abramov, S.M., Glück, R., Klimov, Y.A.: An universal resolving algorithm for inverse computation of lazy languages. In: Virbitskaite and Voronkov [46], pp. 27–40 Google Scholar
 3.Albert, E., Vidal, G.: The narrowingdriven approach to functional logic program specialization. New Gener. Comput. 20(1), 3–26 (2001) CrossRefGoogle Scholar
 4.Antoy, S., Echahed, R., Hanus, M.: A needed narrowing strategy. J. ACM 47(4), 776–822 (2000) MathSciNetGoogle Scholar
 5.Asveld, P.R.: Time and space complexity of insideout macro languages. Int. J. Comput. Math. 10(1), 3–14 (1981) CrossRefMATHMathSciNetGoogle Scholar
 6.Bird, R.: Introduction to Functional Programming Using Haskell, 2nd edn. Prentice Hall, New York (1998) Google Scholar
 7.Boullier, P.: Range concatenation grammars. In: Bunt, H., Carroll, J., Satta, G. (eds.) New Developments in Parsing Technology. Text, Speech and Language Technology, vol. 23. Kluwer Academic, Dordrecht (2004). Chap. 13 CrossRefGoogle Scholar
 8.Chin, W.N., Khoo, S.C., Jones, N.: Redundant call elimination via tupling. Fundam. Inform. 69(1–2), 1–37 (2006) MATHMathSciNetGoogle Scholar
 9.Christiansen, J., Fischer, S.: EasyCheck—test data for free. In: Garrigue, J., Hermenegildo, M.V. (eds.) FLOPS. Lecture Notes in Computer Science, vol. 4989, pp. 322–336. Springer, Berlin (2008) Google Scholar
 10.Comon, H., Dauchet, M., Gilleron, R., Löding, C., Jacquemard, F., Lugiez, D., Tison, S., Tommasi, M.: Tree automata techniques and applications (2007). http://www.grappa.univlille3.fr/tata
 11.Dowek, G.: A secondorder pattern matching algorithm for the cube of typed lambdacalculi. In: MFCS, pp. 151–160 (1991) Google Scholar
 12.Engelfriet, J., Maneth, S.: Macro tree transducers, attribute grammars, and MSO definable tree translations. Inf. Comput. 154(1), 34–91 (1999) CrossRefMATHMathSciNetGoogle Scholar
 13.Engelfriet, J., Maneth, S.: A comparison of pebble tree transducers with macro tree transducers. Acta Inform. 39(9), 613–698 (2003) CrossRefMATHMathSciNetGoogle Scholar
 14.Engelfriet, J., Maneth, S.: Macro tree translations of linear size increase are MSO definable. SIAM J. Comput. 32(4), 950–1006 (2003) CrossRefMATHMathSciNetGoogle Scholar
 15.Engelfriet, J., Vogler, H.: Macro tree transducers. J. Comput. Syst. Sci. 31(1), 71–146 (1985) CrossRefMATHMathSciNetGoogle Scholar
 16.Eppstein, D.: A heuristic approach to program inversion. In: IJCAI, pp. 219–221 (1985) Google Scholar
 17.Frisch, A., Hosoya, H.: Towards practical typechecking for macro tree transducers. In: Arenas, M., Schwartzbach, M.I. (eds.) DBPL. Lecture Notes in Computer Science, vol. 4797, pp. 246–260. Springer, Berlin (2007). Full version is available as Research Report, RR6107, INRIA, 2007 Google Scholar
 18.Fülöp, Z.: Undecidable properties of deterministic topdown tree transducers. Theor. Comput. Sci. 134(2), 311–328 (1994) CrossRefMATHGoogle Scholar
 19.Giesl, J., Kühnemann, A., Voigtländer, J.: Deaccumulation techniques for improving provability. J. Log. Algebr. Program. 71(2), 79–113 (2007) CrossRefMATHMathSciNetGoogle Scholar
 20.Glück, R., Kawabe, M.: Derivation of deterministic inverse programs based on LR parsing. In: Kameyama, Y., Stuckey, P.J. (eds.) FLOPS. Lecture Notes in Computer Science, vol. 2998, pp. 291–306. Springer, Berlin (2004) Google Scholar
 21.Glück, R., Kawabe, M.: Revisiting an automatic program inverter for lisp. SIGPLAN Not. 40(5), 8–17 (2005) CrossRefGoogle Scholar
 22.Glück, R., Sørensen, M.H.: Partial deduction and driving are equivalent. In: Hermenegildo, M.V., Penjam, J. (eds.) PLILP. Lecture Notes in Computer Science, vol. 844, pp. 165–181. Springer, Berlin (1994) Google Scholar
 23.Gries, D.: The Science of Programming. Springer, Heidelberg (1981). Chap. 21 “Inverting programs” CrossRefMATHGoogle Scholar
 24.Hopcroft, J.E., Motwani, R., Ullman, J.D.: Introduction to Automata Theory, Languages, and Computation, 3rd edn. Prentice Hall, New York (2006). Chap. 7 Google Scholar
 25.Hu, Z., Iwasaki, H., Takeichi, M., Takano, A.: Tupling calculation eliminates multiple data traversals. In: ICFP, pp. 164–175 (1997) CrossRefGoogle Scholar
 26.Huet, G.P.: Résolution d’équations dans les langages d’ordre 1,2,…,ω. PhD thesis, Université de Paris VII (1976) Google Scholar
 27.Huet, G.P., Lang, B.: Proving and applying program transformations expressed with secondorder patterns. Acta Inform. 11, 31–55 (1978) MATHMathSciNetGoogle Scholar
 28.Inaba, K., Maneth, S.: The complexity of tree transducer output languages. In: Hariharan, R., Mukund, M., Vinay, V. (eds.) FSTTCS. LIPIcs, vol. 2, pp. 244–255. Schloss Dagstuhl—LeibnizZentrum fuer Informatik (2008) Google Scholar
 29.Kobayashi, N.: Types and higherorder recursion schemes for verification of higherorder programs. In: Shao, Z., Pierce, B.C. (eds.) POPL, pp. 416–428. ACM, New York (2009) Google Scholar
 30.Korf, R.E.: Inversion of applicative programs. In: Hayes, P.J. (ed.) IJCAI, pp. 1007–1009. Kaufmann, Los Altos (1981) Google Scholar
 31.Kühnemann, A., Glück, R., Kakehi, K.: Relating accumulative and nonaccumulative functional programs. In: Middeldorp, A. (ed.) RTA. Lecture Notes in Computer Science, vol. 2051, pp. 154–168. Springer, Berlin (2001) Google Scholar
 32.Maneth, S., Nakano, K.: XML type checking for macro tree transducers with holes. In: PLANX (2008) Google Scholar
 33.Maneth, S., Perst, T., Seidl, H.: Exact XML type checking in polynomial time. In: Schwentick, T., Suciu, D. (eds.) ICDT. Lecture Notes in Computer Science, vol. 4353, pp. 254–268. Springer, Berlin (2007) Google Scholar
 34.Martens, W., Neven, F.: On the complexity of typechecking topdown XML transformations. Theor. Comput. Sci. 336(1), 153–180 (2005) CrossRefMATHMathSciNetGoogle Scholar
 35.Matsuda, K., Hu, Z., Takeichi, M.: Typebased specialization of XML transformations. In: Puebla, G., Vidal, G. (eds.) PEPM, pp. 61–72. ACM, New York (2009) Google Scholar
 36.Matsuda, K., Inaba, K., Nakano, K.: Polynomialtime inverse computation for accumulative functions with multiple data traversals. In: Kiselyov, O., Thompson, S. (eds.) PEPM, pp. 5–14. ACM, New York (2012) Google Scholar
 37.Matsuda, K., Mu, S.C., Hu, Z., Takeichi, M.: A grammarbased approach to invertible programs. In: Gordon, A.D. (ed.) ESOP. Lecture Notes in Computer Science, vol. 6012, pp. 448–467. Springer, Berlin (2010) Google Scholar
 38.Milo, T., Suciu, D., Vianu, V.: Typechecking for XML transformers. J. Comput. Syst. Sci. 66(1), 66–97 (2003) CrossRefMATHMathSciNetGoogle Scholar
 39.Mogensen, T.Æ.: Report on an implementation of a semiinverter. In: Virbitskaite and Voronkov [46], pp. 322–334 Google Scholar
 40.Nishida, N., Vidal, G.: Program inversion for tail recursive functions. In: Schmidt-Schauß, M. (ed.) RTA. LIPIcs, vol. 10, pp. 283–298. Schloss Dagstuhl – Leibniz-Zentrum fuer Informatik (2011)
 41.Perst, T., Seidl, H.: Macro forest transducers. Inf. Process. Lett. 89(3), 141–149 (2004)
 42.Rounds, W.C.: Complexity of recognition in intermediate-level languages. In: FOCS, pp. 145–158. IEEE Press, New York (1973)
 43.Runciman, C., Naylor, M., Lindblad, F.: SmallCheck and lazy SmallCheck: automatic exhaustive testing for small values. In: Gill, A. (ed.) Haskell, pp. 37–48. ACM, New York (2008)
 44.Salvati, S., de Groote, P.: On the complexity of higher-order matching in the linear lambda-calculus. In: Nieuwenhuis, R. (ed.) RTA. Lecture Notes in Computer Science, vol. 2706, pp. 234–245. Springer, Berlin (2003)
 45.Turchin, V.F.: The concept of a supercompiler. ACM Trans. Program. Lang. Syst. 8(3), 292–325 (1986)
 46.Virbitskaite, I., Voronkov, A. (eds.): Perspectives of Systems Informatics, 6th International Andrei Ershov Memorial Conference, PSI, Revised Papers, Novosibirsk, Russia, June 27–30, 2006. Lecture Notes in Computer Science, vol. 4378. Springer, Berlin (2007)
 47.Voigtländer, J., Kühnemann, A.: Composition of functions with accumulating parameters. J. Funct. Program. 14(3), 317–363 (2004)
 48.Wadler, P.: Deforestation: transforming programs to eliminate trees. Theor. Comput. Sci. 73(2), 231–248 (1990)
 49.Yellin, D.M.: Attribute Grammar Inversion and Source-to-Source Translation. Lecture Notes in Computer Science, vol. 302. Springer, Berlin (1988)
Copyright information
Open Access This article is distributed under the terms of the Creative Commons Attribution License which permits any use, distribution, and reproduction in any medium, provided the original author(s) and the source are credited.