Decidable Synthesis of Programs with Uninterpreted Functions

We identify a decidable synthesis problem for a class of programs of unbounded size with conditionals and iteration that work over infinite data domains. The programs in our class use uninterpreted functions and relations, and abide by a restriction called coherence that was recently identified to yield decidable verification. We formulate a powerful grammar-restricted (syntax-guided) synthesis problem for coherent uninterpreted programs, and we show the problem to be decidable, identify its precise complexity, and also study several variants of the problem.

A classical stream of program synthesis research is one that emerged from a problem proposed by Church [Church 1960] in 1960 for Boolean circuits. Seminal results by Büchi and Landweber [Buchi and Landweber 1969] and Rabin [Rabin 1972] led to a mature understanding of the problem, including connections to infinite games played on finite graphs and automata over infinite trees (see [Grädel et al. 2002; Kupferman et al. 2010] for surveys of this theory). Tractable synthesis for temporal logics like LTL, CTL, and their fragments was investigated and several applications for synthesizing hardware circuits emerged [Bloem et al. 2007, 2012]. A fundamental result for this work is that, though the class of allowed systems is infinite and programs are allowed unbounded memory/state, whenever a program is feasible for a specification it is always possible to realize it using a finite-state transition system.
In recent years, the field has taken a different track altogether, tackling synthesis of programs that work over infinite domains such as strings [Gulwani 2011; Gulwani et al. 2012], integers [Alur et al. 2015; Solar-Lezama et al. 2006], and heaps [Qiu and Solar-Lezama 2017]. Typical solutions derived in this line of research involve (a) bounding the class of programs to a finite set (perhaps iteratively increasing the class) and (b) searching the space of programs using techniques like symmetry-reduced enumeration, SAT engines, or even random walks [Alur et al. 2015, 2018], typically guided by counterexamples (CEGIS) [Jha and Seshia 2017; Löding et al. 2016; Solar-Lezama et al. 2006]. Note that iteratively searching larger classes of programs allows synthesis engines to find a program if one exists, but it does not allow one to conclude that there is no program that satisfies the specification. Consequently, in this stream of research, decidability results are not common (see Section 10 for some exceptions in certain heavily restricted cases).
In this paper we present, to the best of our knowledge, the first decidability results for synthesizing programs with iterations/recursion of arbitrary size that work on infinite data domains; in particular, decidable synthesis of a subclass of programs that use uninterpreted functions and relations.
Our primary contribution is a decidability result for realizability and synthesis of a restricted class of imperative uninterpreted programs. Uninterpreted programs work over infinite data models that give arbitrary meanings to their functions and relations. Such programs satisfy their assertions if they hold along all executions on every model that interprets the functions and relations. The theory of uninterpreted functions and relations is well studied: it was classically studied in 1929 by Gödel, who showed completeness results [Davis 1990], and, more recently, its decidable quantifier-free fragment has been exploited in SMT solvers in combination with other theories [Bradley and Manna 2007]. In recent work [Mathur et al. 2019], the authors establish that for a subclass of uninterpreted programs, called coherent programs, the verification problem is decidable. Note that in this verification problem there are no user-given loop invariants; the verification engine finds inductive invariants and proves them automatically in order to prove program correctness.
In this paper, we consider the synthesis problem for coherent uninterpreted programs. The user gives a grammar G that can generate well-formed programs in our programming language. This grammar is allowed to force programs to have assert statements at various points, which collectively act as the specification. The program synthesis problem is then to determine whether there is a coherent program conforming to the grammar G that satisfies all assertions in all executions when running on any data model that gives meaning to function and relation symbols.
Our primary technical result is that the realizability problem (checking the existence of a program conforming to the grammar and satisfying its assertions) is decidable for the class of coherent uninterpreted programs. Furthermore, we prove that the problem is 2EXPTIME-complete. Whenever a correct coherent program that conforms to the grammar exists, we can synthesize one. We also show that the realizability/synthesis problem is undecidable if the coherence restriction is dropped. In fact, we show the stronger result that the problem is undecidable even for synthesis of straight-line programs (without conditionals and iteration).
Coherence of programs is a technical restriction introduced in [Mathur et al. 2019]. It consists of two properties, both of which were individually proven to be essential for ensuring that program verification is decidable. The first one, called memoizing, says, intuitively, that functions are evaluated on any tuple of terms only once (strictly speaking, multiple evaluations are allowed if the computed value is already present in one of the program variables). The second restriction, called early-assumes, requires executions to always make equality assumptions on variables (using conditional branches) early, before computing functions on them. When automatically synthesizing programs over infinite domains, we need to be able to at least automatically verify a conjectured program on the domain without being given loop invariants. The class of coherent uninterpreted programs is the only natural class of programs we are aware of that has recursion and works over infinite domains with decidable verification. Consequently, this class is a natural target for proving a decidable synthesis result.
The problem of synthesizing a program from a grammar with assertions is a powerful definition of program synthesis. In particular, the grammar can be used to restrict the space of programs in various ways. For example, we can restrict the space of programs syntactically by disallowing while loops. Or, for a fixed n, by using a set of Boolean variables linear in n and demanding a loop body to strictly increment a counter encoded using these variables, we can demand that while loops terminate in a linear/polynomial/exponential number of iterations. We can also implement while loops that do not always terminate, but terminate only when the data model satisfies particular properties. For example, we can insist that a while loop iterates through the nodes of a linked list segment from x to y by demanding the occurrence of 'x := next(x)' in the body and 'x ≠ y' as the loop guard.
Grammar-restricted program synthesis can also express the synthesis of programs with holes, used in systems like Sketch [Solar-Lezama 2013]. Here, one is given a sketch, which is a program with holes (marked in code using '??'). The problem is to fill these holes using programs/expressions conforming to a particular grammar so that the assertions in the program hold. It is easy to see that the sketch and the grammar for holes can be expressed using a grammar in our setting (where the skeleton in the sketch is "hard-coded" in the grammar). Synthesizing programs/expressions using restricted grammars is also the cornerstone of the intensively studied SyGuS (syntax-guided synthesis) format [Alur et al. 2015; SyGuS].
The proof of our decidability result relies on tree automata, a call-back to classical theoretical approaches to synthesis. The key idea is to represent programs as trees (specifically program trees), and build tree automata that accept trees corresponding to correct programs. The central construction is to build a mother 2-way alternating tree automaton that accepts all program trees of coherent programs that satisfy their assertions. Given a grammar G of programs (which has to satisfy certain natural conditions), we show that there is a regular set of program trees for the language of allowed programs L(G). Intersecting the automata for these two regular tree languages and checking for emptiness establishes the upper bound. We crucially use the word automaton construction from the decision procedure presented in [Mathur et al. 2019] (which itself is built using a finite-memory streaming congruence closure algorithm) in order to build the mother tree automaton, adapting ideas from [Madhusudan 2011] for building two-way automata over program trees. The verification word automaton incurs an exponential blow-up in the number of program variables, and the conversion of 2-way alternating tree automata to 1-way nondeterministic tree automata causes another exponential blow-up. Our final decision procedure is doubly-exponential in the number of program variables and linear in the size of the grammar.
We prove a matching lower bound. Using a reduction from the acceptance problem for alternating exponential space Turing machines (which is 2EXPTIME-complete), we show that program synthesis of coherent uninterpreted programs is 2EXPTIME-hard in the number of variables. This reduction forces the synthesized program to come up with a witnessing run for the Turing machine, where the skeleton of the grammar performs small checks nondeterministically to ensure that the witness encoded by the program is correct. The reduction is non-trivial in that programs (which correspond to runs in the Turing machine) must simulate sequences of configurations, each of which is of exponential size, by using only polynomially many variables.

Synthesis of Recursive Programs, Transition Systems, and Boolean programs
We study several other related synthesis problems. First, we show that we can extend our results to synthesis of call-by-value recursive uninterpreted programs (with a fixed number of functions and fixed number of local/global variables). This problem is also 2EXPTIME-complete but is more complex, as even single executions simulated on the program tree must be split into separate copies, with one executing the summary of a function call and the other proceeding under the assumption that the call has returned in a summarized state.
We then study a synthesis problem for transition systems. Transition systems are similar to programs in that they execute similar kinds of atomic statements. We allow the user to restrict the set of allowable executions (using regular sets). Despite the fact that this problem seems very similar to program synthesis, we show that it is an easier problem, and coherent transition system realizability and synthesis can be solved in time exponential in the number of program variables and polynomial in the size of the automata that restrict executions. We prove a corresponding lower bound and establish EXPTIME-completeness of this problem.
Finally, we note that our results also show, as a corollary, that grammar-restricted Boolean program realizability/synthesis (and execution-restricted Boolean transition system synthesis) is decidable, and is 2EXPTIME-complete (respectively, EXPTIME-complete). These results are themselves new. The lower bound results for these hence show that coherent program synthesis/transition synthesis is not particularly harder than Boolean program synthesis. Grammar-restricted Boolean program synthesis is an important problem as it is implemented by several practical synthesis systems such as Sketch [Solar-Lezama 2013].
Due to space restrictions, we present only proof gists in the paper; more extensive proofs can be found in the Appendix.

ILLUSTRATIVE EXAMPLES
Let us illustrate some aspects of the different results in this paper with examples.

Example 1. Consider the program in Figure 1 (left). This program has a hole '⟨⟨ ?? | Cannot . . .⟩⟩' that we intend to fill with a sub-program so that the entire program (together with the contents of the hole) satisfies the assertion at the end. The sub-program corresponding to the hole is allowed to use the variable cipher as well as some additional program variables y_1, . . ., y_n (for some fixed n), but is not allowed to refer to key and secret in any manner. Intuitively, the above setting models the encryption of a secret message secret with a key key. The assumption in the second line of the program models the fact that the secret message can be decrypted from cipher and key. Here, the functions enc and dec are uninterpreted functions and thus, the program we are looking for is an uninterpreted program. For such a program, the assertion at the end "assert(z = secret)" holds if it holds for all models, i.e., for all interpretations of enc and dec, and for all initial values of the different variables. With this setup, we are essentially asking whether a program that does not have access to key can recover secret. It is easy to see that there is indeed no program that satisfies the above requirement. The above modeling of keys, encryption, nonces, etc. is common in algebraic approaches to model cryptographic protocols [Dolev and Yao 1983; Durgin et al. 2004]. In Section 3.2, we will show how to model this problem in our framework.

Example 2. The program in Figure 1 (right) is another simple example of an unrealizable specification. This program also illustrates the fact that the synthesized code has incomplete information about the execution that led up to it. The program variables here are x, b, y. The hole in this partial program is restricted so that it cannot refer to x or b, and it is easy to phrase the question for synthesis of the complete program in terms of a grammar. Now, since the hole cannot access the variable x, it cannot directly check if x = T or not. Further, it cannot check whether it is T or F as it cannot refer to b. Consequently, it is easy to see that there is no program for the hole that can ensure y is equal to b. Note that the code at the hole, apart from not being allowed to examine some variables, is also implicitly prohibited from looking at the control path taken to reach the hole. If we could synthesize two different programs, depending on the two control flows taken to reach the hole, then we could set y := T when the then-branch of the condition is taken, and set y := F if the else-branch is taken. However, program synthesis requires the hole to be filled by a program independent of the control flow taken to reach the hole. The fact that the hole has incomplete information about executions can be used to encode specifications using complex ghost code, as we show in later examples. In Section 7, we explore a slightly different synthesis problem, called transition system synthesis, where holes can be differently instantiated based on the history of an execution.

Example 3. In this example, we are trying, roughly, to model the synthesis of a program that checks whether a linked list pointed to by some node x has a key k. We model the next pointer as a unary function next, and locations using elements in the data model.
In order to state the specification, we use ghost code which is interleaved into a program. The program skeleton has a while loop that essentially advances the pointer variable x along the list until NIL (a special constant modeled as an immutable program variable) is reached (see template on the right). The first hole '⟨⟨ ??_1 ⟩⟩' before the while-loop and the second hole '⟨⟨ ??_2 ⟩⟩' within the while-loop need to be filled so that the assertion at the end is satisfied. We use three ghost variables in the skeleton: g_ans, g_witness, and g_found. The ghost variable g_ans evaluates to whether we expect to find k in the list or not, and hence at the end the skeleton asserts that the Boolean variable b computed by the holes is precisely g_ans. Note that here we are assuming that the holes cannot look at the ghost variables.

    assume(T ≠ F);
    g_found := F;
    ⟨⟨ ??_1 ⟩⟩;
    while (x ≠ NIL) {
      if (g_ans ≠ T) then assume(key(x) ≠ k);
      else if (g_witness = x) then {
        assume(key(x) = k);
        g_found := T;
      };
      ⟨⟨ ??_2 ⟩⟩;

However, the sketch needs to check that the answer g_ans is indeed correct. If g_ans is not T, then we add the assumption that key(x) ≠ k in each iteration of the loop, hence ensuring the key is not present. For ensuring correctness in the case g_ans = T, we need two more ghost variables g_witness and g_found. The variable g_witness witnesses the precise element in the list that holds the key k, and g_found checks whether the location g_witness belongs to the list pointed to by x.
The above specification is realizable. For example, filling the first hole '⟨⟨ ??_1 ⟩⟩' with "b := F" and '⟨⟨ ??_2 ⟩⟩' with "if key(x) = k then b := T" satisfies the assertion. Furthermore, this program is indeed coherent [Mathur et al. 2019] and hence our decision procedure will answer in the affirmative and synthesize code for the holes. In fact, our synthesis procedure will synthesize a representation for all possible ways to fill these holes (hence including the solution above) and it is thus possible to enumerate and pick specific solutions. As before, it is easy to see how to formulate a grammar that matches this setup. As noted, we must stipulate that the holes do not use the ghost variables.
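To make the role of the ghost variables concrete, the following Python sketch runs the filled-in skeleton of Example 3 on small concrete lists. It is only an illustration under stated assumptions: the pointer advance `x := next(x)` and the closing `assume` on g_found are our reading of the template (its tail is not shown above), NIL is modelled as `None`, and the names `run_filled_template` and `check_all_small_lists` are hypothetical. Testing finitely many models does not establish correctness over all data models; it only exercises the intended behaviour.

```python
from itertools import product


class Infeasible(Exception):
    """An assume failed: the execution halts silently and is discarded."""


def assume(condition):
    if not condition:
        raise Infeasible


def run_filled_template(nxt, key, head, k, g_ans, g_witness):
    """Run the Example 3 skeleton with both holes filled as in the text.

    Returns None for an infeasible run, otherwise whether b equals g_ans.
    """
    try:
        x = head
        g_found = False
        b = False                      # hole 1: b := F
        while x is not None:           # NIL is modelled as None
            if not g_ans:
                assume(key[x] != k)
            elif g_witness == x:
                assume(key[x] == k)
                g_found = True
            if key[x] == k:            # hole 2: if key(x) = k then b := T
                b = True
            x = nxt[x]                 # ASSUMED: pointer advance x := next(x)
        if g_ans:
            assume(g_found)            # ASSUMED: the witness must lie on the list
        return b == g_ans              # final assertion: b = g_ans
    except Infeasible:
        return None


def check_all_small_lists(max_len=3, k=1):
    """Try every acyclic list of length <= max_len with keys in {0, 1}."""
    outcomes = []
    for n in range(max_len + 1):
        nodes = list(range(n))
        nxt = {i: (i + 1 if i + 1 < n else None) for i in nodes}
        head = 0 if n else None
        for keys in product([0, 1], repeat=n):
            key = dict(zip(nodes, keys))
            for g_ans, g_witness in product([False, True], nodes + [None]):
                outcomes.append(
                    run_filled_template(nxt, key, head, k, g_ans, g_witness))
    return outcomes
```

Every feasible run in this finite experiment satisfies the final assertion, which matches the claim that the filled template is a solution; infeasible runs (returned as `None`) are exactly those in which the ghost guess was wrong.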
Example 4. Consider the same partial program as in Example 3, but let us add an assertion at the end: "assert (b = T ⇒ z = g_witness)", where z is another program variable. We are then demanding that the synthesized code also find a node z that is equal to the ghost node g_witness, guessed nondeterministically at the beginning of the program, whose key is k. This specification is unrealizable: for a list with two nodes having key k, no matter what the program picks, we can always set g_witness to the other node with key k in the list to violate the assertion. Our decision procedure will report in the negative for this problem.
We can encode input/output examples in our setting by adding at the beginning of the program grammar a sequence of assignments and assumptions that define certain 'models'.For example, the sequence of statements on the right defines a linked list of two elements with different keys.We can also define, similarly, using special variables, the answer that we expect in the case of each model.
Then the grammar will require that one of these models be nondeterministically chosen (using a fresh variable that can hold any initial value, and therefore acts as a nondeterministic choice) such that when the program (after the hole is filled) is executed on the chosen model, the variable(s) returned will contain the expected answer. This has the effect of requiring a solution to the hole that generalizes across models. For want of space we provide a detailed example in Appendix A.1.
Observe that our method in this paper builds an automaton that accepts all possible programs that are correct solutions, and thus this method of encoding input-output examples could be used to extract the smallest program, which is usually a good proxy for programs that generalize well.

PRELIMINARIES
The goal of this section is to formally define the grammar-restricted program synthesis problem for coherent uninterpreted programs. We first define a simple target imperative language of uninterpreted programs, giving its semantics and defining a notion of program correctness. Next, we consider the subtleties involved in defining an interesting and nontrivial synthesis problem. This leads us to our definition of grammar-restricted uninterpreted program synthesis or simply uninterpreted program synthesis, in which the user gives a grammar from which a correct program needs to be synthesized.

Uninterpreted Programs: Syntax and Semantics
We consider simple imperative programs that operate over arbitrary data models that provide interpretations for the constants, functions, and relations the program uses; henceforth such programs are called uninterpreted programs.
3.1.1 Syntax. Our programs include assignment statements, while loops, if-then-else conditionals, assume statements, and assertions. Let us fix a first order signature Σ = (C, F, R) where C, F, and R are, respectively, sets of constant, function, and relation symbols. Let V be a finite set of program variables. The set of programs over V is inductively defined using the following grammar. Below, c, d ∈ C, f ∈ F, R ∈ R (with f and R of the appropriate arities) and x, y, z_1, . . ., z_r ∈ V.
Arbitrary Boolean combinations in ⟨cond⟩_V can be modeled using the if-then-else construct (with nesting). Further, constants can be modeled as variables which are never reassigned, and relations can be modeled using functions along with special constants ⊤ and ⊥. Therefore we will assume that all conditionals are atomic equality/disequality predicates, and that there are no constants or relation symbols in programs. Similarly, all assertions can be rewritten using if-then-else and assertions of the form assert(false). Therefore, we assume all assertions are assert(false). When the set of variables V is clear from context, we will also omit the subscript V from ⟨stmt⟩_V and ⟨cond⟩_V.

Program Executions. An execution over V is a word over the alphabet Π_V of atomic statements over V. The set of complete executions for a program p over V, denoted Exec(p), is a regular language defined inductively as follows. Here c is of the form "x = y" or "x ≠ y". We assume that ¬(x = y) is synonymous with x ≠ y and ¬(x ≠ y) is synonymous with x = y.
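One standard way to read such an inductive definition is sketched below in Python. The clauses are an assumption on our part rather than the paper's exact definition: a conditional contributes "assume(c)" followed by the then-branch or "assume(¬c)" followed by the else-branch, and a loop contributes any number of guarded body iterations followed by "assume(¬c)". Since Exec(p) is infinite in the presence of loops, the hypothetical function `executions` unrolls each loop at most `unroll` times. Programs are encoded as nested tuples, which is purely a choice of this sketch.

```python
def negate(cond):
    """Swap = and != in an atomic condition (x, op, y)."""
    x, op, y = cond
    return (x, "!=" if op == "=" else "=", y)


def fmt(cond):
    return f"assume({cond[0]} {cond[1]} {cond[2]})"


def executions(p, unroll=2):
    """Complete executions of program p, with loops unrolled <= unroll times."""
    tag = p[0]
    if tag == "skip":
        return {()}
    if tag == "assign":                       # ("assign", x, y)
        return {(f"{p[1]} := {p[2]}",)}
    if tag == "call":                         # ("call", x, f, (z1, ..., zr))
        return {(f"{p[1]} := {p[2]}({', '.join(p[3])})",)}
    if tag == "assume":                       # ("assume", (x, op, y))
        return {(fmt(p[1]),)}
    if tag == "assert_false":
        return {("assert(false)",)}
    if tag == "seq":                          # ("seq", p1, p2)
        return {a + b
                for a in executions(p[1], unroll)
                for b in executions(p[2], unroll)}
    if tag == "if":                           # ("if", cond, p_then, p_else)
        cond = p[1]
        then_runs = {(fmt(cond),) + e for e in executions(p[2], unroll)}
        else_runs = {(fmt(negate(cond)),) + e for e in executions(p[3], unroll)}
        return then_runs | else_runs
    if tag == "while":                        # ("while", cond, body)
        cond, body = p[1], executions(p[2], unroll)
        leave = (fmt(negate(cond)),)
        result, prefixes = set(), {()}
        for _ in range(unroll + 1):
            result |= {pre + leave for pre in prefixes}
            prefixes = {pre + (fmt(cond),) + e for pre in prefixes for e in body}
        return result
    raise ValueError(f"unknown construct: {tag}")
```

For instance, calling `executions` on the list-traversal loop `while (x != NIL) x := next(x)` with `unroll=1` yields the run that exits immediately and the run that performs one iteration before exiting.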

Semantics.
The semantics of executions is given in terms of data models. A data model M = (U, I) is a first order structure over Σ that is comprised of a universe U and the interpretation function I that maps every k-ary symbol f ∈ F to a k-ary function over U and also gives initial values to the variables in V. The semantics of an execution π over a data model M is given by a configuration σ(π, M) : V → U that maps every variable to a value in the universe U of M at the end of π. This notion is straightforward and a formal definition is skipped (see [Mathur et al. 2019] for details). Intuitively, the program's variables evolve according to the function interpretations given by the data model, equality being interpreted in the natural way to evolve conditionals, and if-then-else and while constructs evaluated in the natural way. Assume statements halt the program silently if the condition is not true, while assert statements cause an error/exception when the evaluated condition is not true. Note that a program on a particular data model has at most one execution.
An execution π is feasible in a data model M if every assumption is true at its point of occurrence. More formally, an execution π is feasible in a data model M if for every prefix ρ = ρ′ · assume(x ∼ y) of π (where ∼ ∈ {=, ≠}), we have σ(ρ′, M)(x) ∼ σ(ρ′, M)(y). Execution π is said to be correct in a data model M if for every prefix of π of the form ρ = ρ′ · assert(false), we have that ρ′ is not feasible in M. Finally, a program p is said to be correct if for all data models M and partial executions π ∈ PExec(p) (that is, prefixes of complete executions of p), π is correct in M.
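The definitions of feasibility and correctness on a single data model can be illustrated by a short Python sketch. The tuple encoding of atomic statements and the function name `run` are assumptions of this sketch; the data model is given as a dictionary `funcs` interpreting function symbols and a dictionary `init` of initial variable values.

```python
def run(execution, funcs, init):
    """Classify an execution on one data model.

    Returns "infeasible" if some assume fails, "error" if assert(false) is
    reached along a feasible prefix (the execution is incorrect in this
    model), and "ok" otherwise.
    """
    sigma = dict(init)                      # configuration: variable -> value
    for stmt in execution:
        kind = stmt[0]
        if kind == "assign":                # ("assign", x, y)
            _, x, y = stmt
            sigma[x] = sigma[y]
        elif kind == "call":                # ("call", x, f, (z1, ..., zr))
            _, x, f, zs = stmt
            sigma[x] = funcs[f](*(sigma[z] for z in zs))
        elif kind == "assume":              # ("assume", x, "=" or "!=", y)
            _, x, op, y = stmt
            holds = (sigma[x] == sigma[y]) == (op == "=")
            if not holds:
                return "infeasible"         # halts silently
        elif kind == "assert_false":
            return "error"                  # prefix before it was feasible
        else:
            raise ValueError(f"unknown statement: {kind}")
    return "ok"
```

An execution is correct in the given model exactly when `run` does not return "error". Correctness of a program quantifies over all data models, which is why this kind of testing cannot replace the decision procedure.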

The Program Synthesis Problem
We are now ready to define the program synthesis problem. Our approach will be to allow users to specify a grammar and ask for a program to be synthesized that conforms to the grammar. However, it remains to define the specifications that programs must satisfy.
We allow the user to express specifications using assertions in the program to be synthesized. More precisely, we allow the user to express in the grammar where assertions (of equality and disequality of variables) occur; synthesized programs must have such assertions present and ensure they hold along every run and on every data model. Such assertions can be used at the end of programs to ensure postconditions, and can also be used intermittently along executions to ensure properties. As we will show, the user can also insert ghost code into the grammars, which, augmented with assertions, can express a wide variety of properties.
Some care is necessary to avoid defining trivial synthesis problems. Consider a problem in which the task is to fill a hole in a template program such that an assertion after the hole is satisfied. A correct but uninteresting solution would be to fill the hole with a non-terminating while loop. Our program grammar restriction allows users to constrain the grammar syntactically in order to rule out trivial solutions like the one above. The grammars also enable us to use assertions to ensure such trivial solutions are incorrect.
3.2.1 Grammar Schema and Input Grammar. In our program synthesis problem formulation, we allow users to define a grammar (called the input grammar) to which synthesized programs must conform. We now describe the schema by which we define the set of allowable grammars.
The input grammar can be a context-free grammar, but our schema will disallow arbitrary context-free grammars. The input grammars allow the usual context-free power required to describe proper nesting/bracketing of program expressions, but disallow other uses of the context-free power, such as counting statements. For example, we would disallow the grammar on the right. This grammar has two non-terminals S (also the start symbol) and T, and generates programs with a conditional that has the same number of assignments in the if and else branches. Intuitively, the grammar schema restricts the input grammar so that any nonterminal produces well-formed programs. The grammar schema is a set of production rules, and the user can form the input grammar using any finite subset of the grammar schema. The schema is quite natural: it allows atomic statements, and allows combining sets of well-formed programs (corresponding to nonterminals) using sequential composition, conditional composition, and iteration.
We assume a countably infinite set PN of nonterminals and a countably infinite set PV of program variables. The grammar schema S over PN and PV is the infinite collection of production rules: "P → x := y", "P → x := f(z)", "P → assume(x ∼ y)", "P → assert(false)", "P → skip", "P → while (x ∼ y) P_1", "P → if (x ∼ y) then P_1 else P_2", "P → P_1 ; P_2 ; ··· ; P_k", where P, P_1, P_2, . . . range over PN, the variables x, y and those in z range over PV, and ∼ ∈ {=, ≠}. An input program grammar G is any finite subset of the schema S. Such a subset implicitly identifies a finite set of program variables (those variables occurring in G), a finite set of nonterminals (those nonterminals occurring in G), and hence defines a context-free grammar. Hence an input program grammar defines a class of programs, denoted L(G).
Note that grammar rules where nonterminals on the right hand side are replaced by a sequence of nonterminals separated by ";", such as P → if (x ∼ y) then P_1 ; P_2 ; P_3 ; P_4 else P_5, are effectively also allowed, as they can be transformed to rules in S using extra nonterminals and rules. This is similar to conversion to Chomsky Normal Form [Hopcroft et al. 2006]. For example, we can replace the above rule by the rules "P → if (x ∼ y) then Q else P_5", "Q → P_1 ; Q_1", "Q_1 → P_2 ; Q_2" and "Q_2 → P_3 ; P_4".
Definition 1 (Uninterpreted Program Realizability and Synthesis). Given an input grammar G defined above, the program realizability problem is to determine whether there is an uninterpreted program p ∈ L(G) such that p is correct. The program synthesis problem is to determine the above, and further, if realizable, synthesize a correct program p ∈ L(G).
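The transformation of a sequence of nonterminals into binary sequencing rules can be sketched in a few lines of Python. The function name `binarize` and the scheme for naming fresh nonterminals (the head followed by an index) are assumptions of this sketch; in particular, it assumes the generated names do not clash with nonterminals already used in the grammar.

```python
def binarize(head, parts):
    """Split 'head -> parts[0] ; ... ; parts[-1]' into binary sequencing rules.

    Each returned rule is a pair (lhs, (left, right)) standing for
    'lhs -> left ; right'. Fresh nonterminals are named head1, head2, ...
    and are ASSUMED not to occur elsewhere in the grammar.
    """
    if len(parts) < 2:
        raise ValueError("need at least two nonterminals to sequence")
    rules = []
    current = head
    for index, part in enumerate(parts[:-2], start=1):
        fresh = f"{head}{index}"
        rules.append((current, (part, fresh)))
        current = fresh
    rules.append((current, (parts[-2], parts[-1])))
    return rules
```

Applied to the then-branch of the rule discussed above, `binarize("Q", ["P1", "P2", "P3", "P4"])` produces the three rules Q → P1 ; Q1, Q1 → P2 ; Q2 and Q2 → P3 ; P4, so the number of added rules is linear in the length of the sequence.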

UNDECIDABILITY OF UNINTERPRETED PROGRAM SYNTHESIS
First, we observe that synthesis of uninterpreted programs is undecidable in general (Theorem 1). This is not surprising given that the verification problem of uninterpreted programs is already undecidable [Mathur et al. 2019; Müller-Olm et al. 2005].
Theorem 1. The uninterpreted program synthesis problem is undecidable.
Proof. (Sketch.) This can be proved using a straightforward reduction from the verification problem of uninterpreted programs, which is known to be undecidable [Mathur et al. 2019; Müller-Olm et al. 2005]. Given an uninterpreted program p as an input to the verification problem, we construct a grammar G_p such that L(G_p) = {p}. It is easy to see that the program verifies iff the reduced problem is realizable. □

Given the above result and that verification of loop-free programs is decidable (equivalent to checking satisfiability of quantifier free theory of equality and uninterpreted functions, or EUF), it is natural to ask whether the program synthesis of loop-free uninterpreted programs is decidable. The first signs that this problem may be undecidable were indicated in a result by [Caulfield et al. 2015], where it was shown that the problem of synthesizing quantifier-free formulae over the theory of equality with uninterpreted functions (EUF) together with an additional ITE (if-then-else) construct is undecidable. Observe that a quantifier-free EUF formula corresponds to an uninterpreted program without loops. However, since there might not be any bound on the size of the candidate formulae, the programs that correspond to these formulae may require an unbounded number of variables.
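The grammar G_p used in the reduction for Theorem 1 can be built mechanically: give every sub-program of p its own nonterminal with exactly one production. The Python sketch below does this for programs encoded as nested tuples. The encoding, the function name `singleton_grammar`, and the nonterminal names P0, P1, ... are assumptions of this sketch, not notation from the paper.

```python
from itertools import count


def singleton_grammar(program):
    """Build rules of a grammar whose language is exactly {program}.

    Programs are nested tuples such as ("seq", p1, p2),
    ("if", cond, p1, p2), ("while", cond, body), or an atomic statement.
    Returns (start_nonterminal, rules); each nonterminal gets one rule.
    """
    rules = []
    fresh = count()

    def visit(p):
        head = f"P{next(fresh)}"            # one nonterminal per sub-program
        tag = p[0]
        if tag == "seq":
            body = ("seq", visit(p[1]), visit(p[2]))
        elif tag == "if":
            body = ("if", p[1], visit(p[2]), visit(p[3]))
        elif tag == "while":
            body = ("while", p[1], visit(p[2]))
        else:
            body = p                        # atomic statement: copied verbatim
        rules.append((head, body))
        return head

    start = visit(program)
    return start, rules
```

Because every nonterminal has a single production and the rules mirror the syntax tree, the only program derivable from the start symbol is p itself, so realizability for this grammar coincides with correctness of p.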
We first show that program synthesis (over a fixed set of variables) remains undecidable even when the target programs do not contain loops. In the following, S_loop-free is the grammar schema for grammars that generate loop-free programs: S_loop-free = S \ {"P → while (x ∼ y) P_1" | P, P_1 ∈ PN, x, y ∈ PV, ∼ ∈ {=, ≠}}.
Theorem 2. The uninterpreted program synthesis problem is undecidable for the schema S_loop-free.
In fact, the above result is a corollary of the following result that we prove. We show that the problem remains undecidable even when the input grammar is constrained to generate loop-free and conditional-free programs, or straight-line programs. Formally, let S_SLP be the following schema over PN and PV.
The proof of the above result constructs a reduction from Post's Correspondence Problem [Post 1946] and is presented in Appendix B.
In summary, program synthesis of even straight-line uninterpreted programs, which have neither conditionals nor iteration, is already undecidable. The notion of coherence of uninterpreted programs was shown to lead to decidable verification in [Mathur et al. 2019]. We show in the next section that restricting to coherent programs yields decidable program synthesis, even for programs with conditionals and iteration.

SYNTHESIS OF COHERENT UNINTERPRETED PROGRAMS
In this section, we discuss the main result of the paper: program synthesis for uninterpreted coherent programs [Mathur et al. 2019] is decidable. Coherence is a restriction that allows an automata-theoretic decision procedure for verifying uninterpreted programs. The restrictions are technical and intuitively allow maintaining congruence closure in a streaming fashion when reading a coherent execution. We will recall the definition of coherent executions and programs in Section 5.1 and also briefly recall the algorithm for verification of such programs. We next move on to the program synthesis problem. Our synthesis procedure works by constructing a two-way alternating tree automaton. We briefly discuss this class of tree automata in Section 5.2 and recall some standard known results. In Sections 5.3-5.5 we describe the details of the synthesis algorithm, argue its correctness, and discuss its complexity (2EXPTIME upper bound).

Coherent Executions and Programs
The notion of coherence for an execution π is defined with respect to the terms it computes. Intuitively, at the beginning of an execution, each variable x ∈ V stores some constant term x̂ ∈ C. As the execution proceeds, new terms are computed and stored in variables. Let Terms_Σ be the set of all ground terms defined using the constants and functions in Σ. Formally, the term corresponding to a variable x ∈ V at the end of an execution π ∈ Π_V^*, denoted T(π, x) ∈ Terms_Σ, is inductively defined as follows. We assume that the set of constants C includes a designated set of initial constants {x̂ | x ∈ V}.

A related notion is the set of term equality assumptions that an execution accumulates, which we formalize as α : Π_V^* → P(Terms_Σ × Terms_Σ), and define inductively as α(ε) = ∅, α(π · "assume(x = y)") = α(π) ∪ {(T(π, x), T(π, y))}, and α(π · a) = α(π) otherwise.
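The terms computed by an execution and the equality assumptions it accumulates can be tracked with a single pass over the execution, as the following Python sketch shows. The clauses for T are the usual symbolic reading and are an assumption of this sketch: an assignment copies the term of the right-hand side, and a function application builds a new term from the terms of its arguments. The initial constant of a variable x is written `x_init`, a naming choice made only for this illustration; the clause for α follows the inductive definition given above.

```python
def terms_and_assumptions(execution, variables):
    """Return (T, alpha) after reading the whole execution.

    T maps each variable to the ground term it holds; constants are strings
    and applications are pairs (f, (arg_terms...)). alpha is the set of
    pairs of terms equated by "assume(x = y)" statements.
    """
    term = {x: f"{x}_init" for x in variables}   # hypothetical constant names
    alpha = set()
    for stmt in execution:
        kind = stmt[0]
        if kind == "assign":                     # ("assign", x, y)
            _, x, y = stmt
            term[x] = term[y]
        elif kind == "call":                     # ("call", x, f, (z1, ..., zr))
            _, x, f, zs = stmt
            term[x] = (f, tuple(term[z] for z in zs))
        elif kind == "assume" and stmt[2] == "=":  # ("assume", x, "=", y)
            _, x, _, y = stmt
            alpha.add((term[x], term[y]))
        # disequality assumes and assert(false) leave T and alpha unchanged
    return term, alpha
```

Note that the terms stored in `term` can grow without bound along an execution, which is why a finite-state procedure cannot simply store them and must instead rely on restrictions such as coherence.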
For a set of term equalities A ⊆ Terms Σ × Terms Σ , and two ground terms t 1 , t 2 ∈ Terms Σ , we say t 1 and t 2 are equivalent modulo A, denoted t 1 ≅ A t 2 , if A |= t 1 = t 2 . For a set of terms S ⊆ Terms Σ , and a term t ∈ Terms Σ , we write t ∈ A S if there is a term t ′ ∈ S such that t ≅ A t ′ . For terms t, s ∈ Terms Σ , we say s is a superterm modulo A of t, denoted t ≼ A s, if there are terms t ′ , s ′ ∈ Terms Σ such that t ≅ A t ′ , s ≅ A s ′ , and s ′ is a superterm of t ′ .
With the above notation, let us now recall the notion of coherence.
Definition 2 (Coherent Executions and Programs [Mathur et al. 2019]). An execution π ∈ Π * V is said to be coherent if it satisfies the following two conditions.
Memoizing.
If there is a term s ∈ T(ρ ′ ) such that either
The following theorems due to [Mathur et al. 2019] establish the decidability of verifying coherent programs and also of checking if a program is coherent.
Theorem 4 ([Mathur et al. 2019]). The verification problem for coherent programs, i.e., checking if a given uninterpreted coherent program p is correct, is decidable.
Mathur et al., in fact, show that the decision procedures for both the above problems are automata-theoretic. They construct deterministic word automata A correct and A coherent that accept languages L correct and L coherent , respectively. The language L correct contains all coherent executions that are correct and contains no coherent execution that is incorrect. The language L coherent is the set of all coherent executions. The size of both these automata is O(2^poly(|V|)). We remark that one can therefore construct a deterministic word automaton A cc-exec whose language is the set of all executions that are both coherent and correct (i.e., L(A cc-exec ) = L correct ∩ L coherent ) and whose size |A cc-exec | is again O(2^poly(|V|)). A cc-exec will have a unique rejecting state q reject which is absorbing, and its language is prefix closed.

Non-deterministic Top-Down and Two-Way Alternating Tree Automata
Our synthesis procedure is tree-automata theoretic. We consider tree representations of programs, or program trees. The synthesis problem is thus to check if there is a program tree whose corresponding program is coherent, correct, and belongs to the input grammar G. A formal correspondence between trees and programs is discussed in Section 5.3. The synthesis procedure then works as follows. We first construct a top-down tree automaton A G that accepts the set of trees corresponding to the programs generated by the input grammar G. We then construct another tree automaton A cc , which accepts all trees that correspond to programs that are coherent as well as correct. A cc is a two-way alternating tree automaton that examines all executions of an input program tree, checking if each of them is both correct and coherent by simulating them on the word automaton A cc-exec (described in Section 5.1). Recall that A cc-exec accepts an execution if and only if it is correct and coherent. In order to simulate longer and longer executions arising from constructs like while-loops, the automaton traverses the input tree and performs multiple passes over subtrees, visiting the internal nodes of the tree many times. This crucially relies on the "two-way"ness of the automaton. We then translate the two-way alternating tree automaton to an equivalent (one-way) non-deterministic top-down tree automaton, and finally check whether its language intersects that of A G . In the following, we recall the formal descriptions of these automata for readers who might be unfamiliar with them. Our presentation closely follows the presentations in [Madhusudan 2011; Vardi 1998].

5.2.1 Trees. We will consider binary trees here. Let us fix a tree alphabet Γ = Γ 0 ∪ Γ 1 ∪ Γ 2 , which is a finite set of symbols annotated with arities: symbols in Γ i have arity i, and Γ i ∩ Γ j = ∅ when i ≠ j. Formally, a finite tree T over Γ is a pair (S, γ ), where S ⊆ {L, R} * is a finite set of nodes in the tree; S is prefix closed, ϵ ∈ S, and for every string ρ•R ∈ S, we also have ρ•L ∈ S. The labeling function γ : S → Γ maps each leaf node n ∈ S (i.e., a node such that no other node n ′ ∈ S has n as a proper prefix) to Γ 0 , each node with exactly one child (i.e., n•L ∈ S but n•R ∉ S) to Γ 1 , and the remaining nodes to Γ 2 . The node corresponding to ϵ is called the root node. The left and right children of a node n ∈ S are the nodes n 1 = n•L and n 2 = n•R if they exist, in which case n is the parent of n 1 and n 2 . For a node n different from the root ϵ, we will use n•U to denote the parent node of n. Note that the notion of binary trees can easily be extended to trees of arbitrary arity.
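The conditions on the node set S translate directly into code. The following Python sketch represents S as a set of strings over the letters L and R (the empty string being the root); the function names is_tree, arity and parent are hypothetical and chosen for illustration only.

```python
def is_tree(nodes):
    """Check that `nodes` (a set of strings over "L"/"R") is a valid node set S."""
    if "" not in nodes:                       # the root must be present
        return False
    for n in nodes:
        if any(c not in "LR" for c in n):
            return False
        # Prefix closure: checking the immediate parent of every node suffices.
        if n and n[:-1] not in nodes:
            return False
        # A right child may only exist together with its left sibling.
        if n.endswith("R") and n[:-1] + "L" not in nodes:
            return False
    return True


def arity(nodes, n):
    """Number of children of node n (0, 1 or 2)."""
    return int(n + "L" in nodes) + int(n + "R" in nodes)


def parent(n):
    """The node n.U, defined for every node except the root."""
    if n == "":
        raise ValueError("the root has no parent")
    return n[:-1]


S = {"", "L", "R", "LL"}
```

Here S is a valid node set in which the root has two children, node L has one child and nodes R and LL are leaves, whereas the set containing only the root and R is rejected because the left sibling is missing.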

Non-Deterministic Top Down Tree Automata.
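A non-deterministic finite top-down tree automaton over a tree alphabet Γ is a tuple A = (Q, I , δ 0 , δ 1 , δ 2 ), where Q is a finite set of states, I ⊆ Q is the set of initial states, and δ 0 ⊆ Q × Γ 0 , δ 1 ⊆ Q × Γ 1 × Q, and δ 2 ⊆ Q × Γ 2 × Q × Q are the transitions for leaf nodes, nodes with one child, and nodes with two children, respectively. A run of A on a tree T = (S, γ ) labels every node in S with a state so that the root is labeled with an initial state and the state at each node, its label, and the states at its children are consistent with the transitions (at a leaf n labeled with state q, this means (q, γ (n)) ∈ δ 0 ). We call such a run accepting, and T is accepted by A if there is an accepting run of A on T . The language L(A) of the top-down tree automaton is the set of all trees it accepts.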
We note that checking whether the language of a non-deterministic top-down tree automaton is empty can be done in time linear in the size of the automaton. Further, given two tree automata A 1 and A 2 over the same tree alphabet, we can construct another tree automaton A such that L(A) = L(A 1 ) ∩ L(A 2 ). We refer the reader to [Comon et al. 2007] for details of these standard results.
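As an illustration of the emptiness check, the following Python sketch computes the set of productive states, i.e., states from which at least one finite tree is accepted. It assumes a particular encoding that is ours and not taken from the paper: transitions are given as sets of tuples (q, a) for leaves, (q, a, q1) for unary nodes and (q, a, q1, q2) for binary nodes. The simple fixpoint loop below is quadratic rather than linear; a worklist implementation would achieve the linear bound.

```python
def productive_states(delta0, delta1, delta2):
    """States from which the automaton accepts at least one finite tree."""
    productive = {q for (q, _a) in delta0}      # a leaf can be read directly
    changed = True
    while changed:
        changed = False
        for (q, _a, q1) in delta1:
            if q not in productive and q1 in productive:
                productive.add(q)
                changed = True
        for (q, _a, q1, q2) in delta2:
            if q not in productive and q1 in productive and q2 in productive:
                productive.add(q)
                changed = True
    return productive


def is_empty(initial, delta0, delta1, delta2):
    """The language is empty iff no initial state is productive."""
    return not (set(initial) & productive_states(delta0, delta1, delta2))


# A toy automaton accepting the single tree seq(skip, skip).
d0 = {("q1", "skip")}
d1 = set()
d2 = {("q0", "seq", "q1", "q1")}
```

For the toy automaton above, is_empty({"q0"}, d0, d1, d2) returns False, and it returns True once the leaf transition is removed.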

Two Way Alternating Tree Automata. We will denote by B + (U ) the set of all positive Boolean formulae over a set U . That is, B + (U ) is the smallest set such that {true, false} ∪ U ⊆ B + (U ) and for every φ 1 , φ 2 ∈ B + (U ), both φ 1 ∧ φ 2 and φ 1 ∨ φ 2 are in B + (U ). For a (possibly empty) set U ′ ⊆ U and a formula φ ∈ B + (U ), we say U ′ |= φ if φ evaluates to true by setting each of the elements in U ′ to true and the remaining elements of U to false.
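The satisfaction relation U ′ |= φ can be made concrete with a short Python sketch. The class names Atom, And and Or and the function satisfies are illustrative choices of ours; atoms may be arbitrary hashable values, such as the (state, direction) pairs used by the tree automata in this section.

```python
from dataclasses import dataclass
from typing import Hashable, Set, Union


@dataclass(frozen=True)
class Atom:
    name: Hashable          # an element of U


@dataclass(frozen=True)
class And:
    left: "Formula"
    right: "Formula"


@dataclass(frozen=True)
class Or:
    left: "Formula"
    right: "Formula"


# Python booleans play the role of the constants true and false.
Formula = Union[bool, Atom, And, Or]


def satisfies(true_atoms: Set[Hashable], phi: Formula) -> bool:
    """Evaluate phi with exactly the atoms in `true_atoms` set to true."""
    if isinstance(phi, bool):
        return phi
    if isinstance(phi, Atom):
        return phi.name in true_atoms
    if isinstance(phi, And):
        return satisfies(true_atoms, phi.left) and satisfies(true_atoms, phi.right)
    if isinstance(phi, Or):
        return satisfies(true_atoms, phi.left) or satisfies(true_atoms, phi.right)
    raise TypeError(f"not a positive Boolean formula: {phi!r}")


phi = And(Atom(("q1", "L")), Or(Atom(("q2", "U")), False))
```

Since formulae are positive (negation-free), satisfaction is monotone: any superset of a satisfying set also satisfies the formula.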
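A two-way alternating tree automaton is a tuple A = (Q, I , δ 0 , δ 1 , δ 2 ) where Q is a finite set of states and I ⊆ Q is the set of initial states. δ 0 , δ 1 and δ 2 are respectively the transition functions for the leaf nodes, nodes with one child and nodes with two children. That is, each δ i maps a state q ∈ Q, a last move m ∈ {D, U L , U R } (recording whether the automaton arrived at the current node by moving down from the parent, up from the left child, or up from the right child) and a symbol a ∈ Γ i to a positive Boolean formula δ i (q, m, a) ∈ B + (Q × {L, R, U }) over pairs consisting of a next state and a direction in which to move.
A run of a two-way alternating tree automaton A on a finite tree T = (S, γ ) is a (possibly infinite) directed rooted tree T run = (S run , γ run ) (nodes in the run tree are allowed to have more than 2 children), such that γ run : S run → S × Q × {D, U L , U R } and the following conditions hold. (a) The root r of the tree is such that γ run (r ) = (ϵ, q, D), where q ∈ I . (b) For every node v with γ run (v) = (n, q, m), where n has i children in T , the set of pairs contributed by the child nodes of v satisfies δ i (q, m, γ (n)); here a child node v ′ with γ run (v ′ ) = (n ′ , q ′ , m ′ ) contributes the pair (q ′ , L), (q ′ , R) or (q ′ , U ) when n ′ is respectively the left child, right child or the parent of n in the input tree T , and m ′ records the corresponding move. A two-way alternating tree automaton accepts a tree T if there is a run of the automaton on T .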
A two-way alternating tree automaton can be converted to an equivalent top-down tree automaton with at most exponential blowup:
Lemma 6 ([Kupferman and Vardi 2000; Vardi 1998]). Given a two-way alternating tree automaton A, one can construct a non-deterministic top-down tree automaton A ′ such that L(A ′ ) = L(A) and |A ′ | = O(2^poly(|A|)). Here, |A| denotes the size of the description of the automaton A. Appendix C.2 presents a construction for the above result adapted to the simpler setting of our paper.

Program Trees and Associated Programs. Every program can be interpreted as a tree where the leaves of the tree are basic statements like "x := y" and the internal nodes are labeled with constructs like while or seq (the sequencing construct, an alias for ;) and have sub-programs/subexpressions as children. Let us formalize this as follows. We fix a tree alphabet Γ V = Γ V ,0 ∪ Γ V ,1 ∪ Γ V ,2 , where
• Γ V ,0 = {"skip", "x := y", "x := f (z)", "assume(x = y)", "assume(x ≠ y)", "assert(false)" | x, y, z ∈ V },
• Γ V ,1 = {"root", "while(x = y)", "while(x ≠ y)" | x, y ∈ V }, and
• Γ V ,2 = {"seq", "ite(x = y)", "ite(x ≠ y)" | x, y ∈ V }.
When the set of variables V is clear from the context, we will omit the subscript and simply use Γ, Γ 0 , Γ 1 and Γ 2 .
A program tree is a tree over the alphabet Γ V where (a) the root is labeled root, and (b) all other nodes are labeled from symbols in Γ 0 ∪ Γ 1 ∪ Γ 2 \ {"root"} depending upon the number of children they have. Internal nodes in a program tree labeled with ite(•) correspond to if−then−else constructs, nodes labeled with while(•) correspond to while constructs, and nodes labeled with seq correspond to the sequencing construct ';'. An example program tree is shown on the right. It is easy to see that the set of all program trees is a regular language and that we can construct a non-deterministic tree automaton of size polynomial in |V | which accepts this set. To every node n in the program tree T = (S, γ ), one can associate a program, denoted Prog(n) ∈ ⟨stmt⟩ V , defined inductively on the structure of the tree as follows, where ∼ ∈ {=, ≠}:
• Prog(n) = γ (n) if n is a leaf,
• Prog(n) = Prog(n•L) if γ (n) = "root",
• Prog(n) = while (x ∼ y) Prog(n•L) if γ (n) = "while(x ∼ y)",
• Prog(n) = if (x ∼ y) then Prog(n•L) else Prog(n•R) if γ (n) = "ite(x ∼ y)", and
• Prog(n) = Prog(n•L) ; Prog(n•R) if γ (n) = "seq".
The program Prog(T ) associated with a program tree T is the program Prog(n), where n is the root node of T . It is easy to see that for every program there exists at least one program tree that represents it (possibly more than one, since there can be different parses).
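The correspondence between program trees and programs can be sketched in Python as follows. The encoding is an assumption made for illustration: a tree is a dictionary from node addresses (strings over L and R) to labels, labels are plain strings such as "while(x!=y)" or "ite(x=y)" with != standing for disequality, and the concrete output syntax (braces, then/else keywords) is our own choice.

```python
def prog(labels, n=""):
    """Return the program text associated with node n of a program tree."""
    a = labels[n]
    if a == "root":
        return prog(labels, n + "L")
    if a == "seq":
        return prog(labels, n + "L") + "; " + prog(labels, n + "R")
    if a.startswith("while("):
        guard = a[len("while"):]            # keeps the parentheses, e.g. "(x!=y)"
        body = prog(labels, n + "L")
        return "while " + guard + " { " + body + " }"
    if a.startswith("ite("):
        guard = a[len("ite"):]
        then_branch = prog(labels, n + "L")
        else_branch = prog(labels, n + "R")
        return ("if " + guard + " then { " + then_branch
                + " } else { " + else_branch + " }")
    return a                                 # leaf: a basic statement


tree = {
    "": "root",
    "L": "seq",
    "LL": "x:=f(y)",
    "LR": "while(x!=y)",
    "LRL": "x:=f(x)",
}
```

Calling prog(tree) on the example yields the text "x:=f(y); while (x!=y) { x:=f(x) }".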

5.3.2 Grammar to Tree Automaton. The next task is to represent the set of programs generated by an input grammar G as a regular set of program trees, accepted by a non-deterministic top-down tree automaton A G . The construction of A G mimics the standard construction for tree automata that accept parse trees of context-free grammars. The details of this construction are presented in Appendix C.3.
The following lemma states that the language of A G accurately represents programs from G.
Lemma 7. Let G be a grammar conforming to the schema S and let A G be the tree automaton constructed above. Then {Prog(T ) | T ∈ L(A G )} is exactly the set of programs generated by G, and A G has size O(|G|). □

Two Way Alternating Tree Automaton for Simulating Executions
We now describe the construction of the two-way alternating tree automaton that captures the crux of our synthesis procedure and is perhaps the most important technical result of the paper. We construct a two-way alternating automaton A cc that accepts precisely the set of all annotated trees that correspond to correct and coherent programs. This will be achieved by ensuring that a program tree is accepted by A cc if and only if all executions of the program it represents are accepted by the word automaton A cc-exec (Section 5.1). The basic idea behind the tree automaton A cc is as follows. Given a program tree T as input, the automaton A cc traverses this tree and explores all the executions of the associated program p T = Prog(T ). For each execution σ of p T , A cc keeps track of the state that the word automaton A cc-exec would reach after reading σ . An accepting run of the tree automaton never visits the rejecting state of the word automaton A cc-exec .
Let us now present the formal description of the two-way alternating tree automaton that works over the alphabet Γ V described in Section 5.3.
States.The set of states and the initial states of the two way tree automaton A cc coincide with that of the word automaton A cc-exec .That is, Q cc = Q cc-exec and I cc = {q cc-exec 0 }, where q cc-exec 0 is the unique starting state of A cc-exec .
Transitions. We first give some intuition on how the transitions are designed. Let us consider the case when the automaton's control is in state q reading an internal input tree node n with one child and which is labeled with a = "while(x = y)". Let m be the last move of the automaton at this point. Then, in the next step, the automaton simultaneously performs two transitions corresponding to the two possibilities: entering the loop after assuming the guard 'x = y' to be true, or exiting the loop with the guard being false. In the first transition, the automaton moves to the (left) child n•L, and the state of the automaton changes to q ′ 1 where q ′ 1 = δ cc-exec (q, "assume(x = y)"). In the second simultaneous transition, the automaton moves to the parent node n•U (searching for the next statement to execute that is after the end of the while-loop) and changes its state to q ′ 2 , where q ′ 2 = δ cc-exec (q, "assume(x ≠ y)"). We encode these two simultaneous possibilities as a conjunctive transition of the two-way alternating automaton. That is, δ cc 1 (q, m, a) = (q ′ 1 , L) ∧ (q ′ 2 , U ). For every i, m, a, we have δ i (q reject , m, a) = false, where q reject is the absorbing rejecting state of A cc-exec . Below we give the formal description of the transitions from all other states q ≠ q reject . All transitions δ i (q, m, a) not described below are false.
Transitions from the root. At the root node (labeled "root"), the automaton transitions in the following way: δ cc 1 (q, m, "root") = (q, L) if m = D, and δ cc 1 (q, m, "root") = true otherwise. Recall that the run of a two-way automaton starts in the configuration where m is set to D. This means, in the very first step, the automaton moves to the child node (direction L). If the automaton visits the root node in subsequent steps, then all transitions are enabled.
Transitions from leaf nodes. For a leaf with label a ∈ Γ 0 , and state q, the transition of the automaton is δ cc 0 (q, D, a) = (q ′ , U ). That is, when the automaton visits a leaf node from the parent, it moves to some state q ′ and visits the parent node in the next step. The state component q ′ depends on q and a and is given by cases as follows.
• Case a = "skip". In this case, we have q ′ = q, since a skip statement does not change the state of the word automaton A cc-exec .
• Case a ∈ Γ 0 \ {"skip"}. Here, q ′ is given by the transition function of the word automaton A cc-exec , i.e., q ′ = δ cc-exec (q, a).
Transitions on a "while" node. As described previously, when reading a node labeled with "while(x ∼ y)", where ∼ ∈ {=, ≠}, the automaton simulates both the possibility of entering the loop body as well as the possibility of not entering the loop body (and assuming the loop guard to be false). This corresponds to a conjunctive transition: δ cc 1 (q, m, "while(x ∼ y)") = (q ′ , L) ∧ (q ′′ , U ), where q ′ = δ cc-exec (q, "assume(x ∼ y)") and q ′′ = δ cc-exec (q, "assume(x ≁ y)"). In the above, ≁ is = when ∼ is ≠, and ≁ is ≠ when ∼ is =. The first conjunct corresponds to the program execution where the program enters the loop body (thereby assuming the loop guard to be true), and where the control moves to the left child of the current node, which corresponds to the sub-program of the body of the loop. The second conjunct corresponds to the program execution where the loop guard evaluates to false and the automaton moves to the parent of the current tree node. Notice that, in both the conjuncts above, the direction in which the tree automaton moves does not depend upon the last move component m of the state. That is, no matter how the program arrives at a while block, the automaton simulates both the possibilities of entering or exiting the body of the loop.
Transitions on an "ite" node. At a conditional node labeled "ite(x ∼ y)", when coming down the tree from the parent, the tree automaton A cc simulates both branches of the conditional: δ cc 2 (q, D, "ite(x ∼ y)") = (q ′ , L) ∧ (q ′′ , R), where q ′ = δ cc-exec (q, "assume(x ∼ y)") and q ′′ = δ cc-exec (q, "assume(x ≁ y)"). Here, q ′ is the state corresponding to the "then" branch, obtained when the conditional (x ∼ y) holds true, and the tree automaton moves control to the left child (body of the then branch). The second conjunct above corresponds to entering the else branch, simulating the word automaton on the negation of the condition and passing control to the right child.
Let us now consider the case when the automaton moves up to an ite node from a child node. In this case, the automaton moves to the parent node (marking the completion of executing the then or the else block) and the state q remains unchanged: δ cc 2 (q, U L , "ite(x ∼ y)") = δ cc 2 (q, U R , "ite(x ∼ y)") = (q, U ).
Transitions on a "seq" node. In this case, the automaton moves either to the left child, the right child, or to the parent, depending on the "last move". It does not change the state component. More formally, δ cc 2 (q, D, "seq") = (q, L), δ cc 2 (q, U L , "seq") = (q, R), and δ cc 2 (q, U R , "seq") = (q, U ). The above transitions match the semantics of the sequencing of two statements s 1 ; s 2 . If the automaton visits from the parent node, in the next step it should move to the left child to simulate an execution corresponding to the first statement s 1 . When the program completes execution of s 1 , it comes up from the left child and should start executing s 2 , which corresponds to transitioning to the right child. Finally, when execution of s 2 is complete, execution of the sequenced block s 1 ; s 2 is also complete, and thus the tree automaton A cc transitions to the parent node, exiting the subtree.
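The transitions described above can be summarized by a short Python sketch. Everything about the encoding is an assumption made for illustration: labels are strings such as "while(x!=y)", last moves are "D", "UL" and "UR", a conjunctive transition is a list of (state, direction) pairs with the empty list standing for true, and the word automaton is passed in as a hypothetical function step(q, letter). Because all transitions are conjunctive, a bounded exploration can simply follow every conjunct; it can refute acceptance within the bound but cannot by itself establish acceptance for programs with loops.

```python
def negate(cond):
    """Swap "x=y" and "x!=y"."""
    return cond.replace("!=", "=") if "!=" in cond else cond.replace("=", "!=")


def delta_cc(q, m, a, step):
    """Conjuncts of the transition from state q, last move m, label a."""
    if a == "root":
        return [(q, "L")] if m == "D" else []          # [] encodes `true`
    if a == "seq":
        return [(q, {"D": "L", "UL": "R", "UR": "U"}[m])]
    if a.startswith("while("):
        cond = a[len("while("):-1]
        return [(step(q, "assume(" + cond + ")"), "L"),
                (step(q, "assume(" + negate(cond) + ")"), "U")]
    if a.startswith("ite("):
        cond = a[len("ite("):-1]
        if m == "D":
            return [(step(q, "assume(" + cond + ")"), "L"),
                    (step(q, "assume(" + negate(cond) + ")"), "R")]
        return [(q, "U")]                               # coming up from a branch
    return [(q if a == "skip" else step(q, a), "U")]    # leaf statement


def explore(labels, step, q0, reject, max_steps):
    """Return False if some branch reaches `reject` within max_steps moves."""
    frontier = [("", q0, "D")]
    for _ in range(max_steps):
        nxt = []
        for (n, q, m) in frontier:
            if q == reject:
                return False
            for (q2, d) in delta_cc(q, m, labels[n], step):
                if d == "U":
                    nxt.append((n[:-1], q2, "U" + n[-1]))
                else:
                    nxt.append((n + d, q2, "D"))
        frontier = nxt
    return all(q != reject for (_n, q, _m) in frontier)


def toy_step(q, letter):
    """A toy stand-in for the word automaton: reject only on assert(false)."""
    return "rej" if q == "rej" or letter == "assert(false)" else "ok"


loop_tree = {"": "root", "L": "seq", "LL": "x:=f(y)",
             "LR": "while(x!=y)", "LRL": "x:=f(x)"}
bad_tree = {"": "root", "L": "ite(x=y)", "LL": "skip", "LR": "assert(false)"}
```

With the toy word automaton, explore(loop_tree, toy_step, "ok", "rej", 20) returns True while explore(bad_tree, toy_step, "ok", "rej", 20) returns False; the actual construction uses A cc-exec in place of toy_step and decides acceptance without a step bound.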
The following lemma asserts the correctness of the automaton construction above, and states its runtime complexity.
Lemma 8. A cc accepts the set of all program trees corresponding to correct coherent programs. That is, {Prog(T ) | T ∈ L(A cc )} = {p ∈ ⟨stmt⟩ V | p is a correct and coherent program}. Further, the automaton can be constructed in O(2^poly(|V|)) time and its size is O(2^poly(|V|)). □

The Overall Decision Algorithm and Synthesis Algorithm
So far we have described how to transform an input grammar G conforming to the grammar schema S to an equivalent top-down non-deterministic tree automaton A G (Section 5.3).We then described in Section 5.4 the construction of a two-way alternating tree automaton A cc that accepts all trees corresponding to correct and coherent programs.
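The rest of the synthesis algorithm proceeds as follows. We first construct a non-deterministic top-down tree automaton A cc-top-down such that L(A cc-top-down ) = L(A cc ). Lemma 6 ensures that A cc-top-down can be constructed in time O(2^(2^poly(|V|))), and that its size is bounded likewise. We then construct a top-down tree automaton accepting L(A G ) ∩ L(A cc-top-down ) and check it for emptiness. Its language is non-empty exactly when some program generated by G is correct and coherent, and if it is non-empty, a program tree can be constructed.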
This gives us the central upper bound result of the paper.
Theorem 9. The program synthesis problem for uninterpreted coherent programs is decidable in 2EXPTIME, and in particular, in time doubly exponential in the number of variables and linear in the size of the input grammar. Furthermore, a tree automaton representing the set of all correct coherent programs that conform to the grammar can be constructed in the same time. □
Note that our result not only shows decidability of the realizability and synthesis problems, but also shows that the class of all correct coherent programs is tree-regular. The tree automaton representing correct coherent programs can be further utilized for a variety of tasks, for instance, to sample correct coherent programs to examine if they meet other requirements (or to verify them using more concrete interpretations for functions/relations). See [Wang et al. 2017] for work in this vein in which they sample programs that satisfy some abstractions in order to eventually synthesize correct programs (also see Section 10 on related work).

MATCHING LOWER BOUND FOR SYNTHESIZING COHERENT PROGRAMS
We now turn to proving that the synthesis algorithm from Section 5 is in fact optimal for the problem of synthesizing coherent uninterpreted programs. We prove a 2EXPTIME lower bound for uninterpreted program synthesis by reduction from the 2EXPTIME-hard word acceptance problem of an alternating Turing machine (ATM) with exponential space bound [Chandra et al. 1981]. In what follows, we give a high-level description of a polynomial-time reduction f that maps pairs of ATMs M and inputs w to grammars G (conforming to schema S) such that M accepts w if and only if there exists a correct (coherent) program p ∈ G. For a given pair (M, w) we will denote the reduction grammar f (M, w) by G M,w . Full details of the reduction and a proof of its correctness are deferred to the Appendix.

Alternating Turing Machines
Let us first recall the formal definition of an alternating Turing machine. An ATM is a tuple M = (Q, ∆, δ, q 0 , g), where Q is a finite set of states, q 0 ∈ Q is the initial state, and ∆ is a finite set of tape symbols. The machine transition relation has the form δ : Q × ∆ × {0, 1} → Q × ∆ × {L, R}. Without loss of generality, we will assume that either there exist exactly two transitions (referred to as 0 and 1) for any particular configuration or none at all. The function g : Q → {acc, rej, ∧, ∨} maps states to their type (accepting, rejecting, universal, and existential). Whether or not an ATM accepts its input starting from a particular configuration can be defined inductively. An ATM accepts from any configuration whose state is of type acc, and it accepts from a configuration whose state is of type ∨ (respectively ∧) if some (respectively each) transition leads to a configuration from which it accepts. An ATM accepts a word if and only if it accepts it from the initial configuration, whose state is q 0 . In what follows, we assume without loss of generality that existential configurations are immediately followed by universal configurations under any transition (and vice versa) and that the initial configuration is existential.
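The inductive definition of acceptance can be sketched in Python over an abstract configuration graph. The function name accepts, the callbacks kind and succ, and the fuel parameter (a recursion bound after which a configuration is conservatively treated as non-accepting) are assumptions introduced for illustration; the sketch abstracts away tapes and head positions entirely.

```python
def accepts(config, kind, succ, fuel=64):
    """Decide acceptance from `config` by unfolding the inductive definition.

    kind(config) is one of "acc", "rej", "and", "or";
    succ(config) returns the pair of successor configurations (moves 0 and 1).
    """
    if fuel == 0:
        return False                     # unexplored: treated as non-accepting
    k = kind(config)
    if k == "acc":
        return True
    if k == "rej":
        return False
    results = [accepts(c, kind, succ, fuel - 1) for c in succ(config)]
    # Universal configurations need every move to accept, existential ones some move.
    return all(results) if k == "and" else any(results)


# A toy game: the existential configuration e0 can move to a1 or a2,
# which are universal.
kinds = {"e0": "or", "a1": "and", "a2": "and", "ok": "acc", "no": "rej"}
succs = {"e0": ("a1", "a2"), "a1": ("ok", "no"), "a2": ("ok", "ok")}
```

In the toy game, accepts("e0", kinds.get, succs.get) returns True because the existential player can move to a2, from which both universal moves lead to acceptance.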

Overview of the Reduction
One can view the semantics of alternating Turing machines using two-player games.An ATM can be imagined to transition according to the choices of an existential player (Eve) and a universal player (Adam).Eve begins the game by picking a transition from the initial configuration, according to the transition function of the machine.In turn, Adam responds by picking the next machine transition.Intuitively, Eve is trying to drive the machine into an accepting state, no matter what choice Adam makes on his turns.In this sense, the ATM accepts a word if Eve can choose a valid machine transition, such that for all of Adam's possible next transitions there exists another that she can pick, and so on, such that the final configuration is accepting.
The idea of the reduction is to produce, given an ATM and its input, a grammar whose correct programs encode winning strategies for Eve.A winning strategy for Eve can be viewed as a tree where nodes are labeled by configurations.Nodes labeled by existential configurations have one child (0 or 1, Eve's choice) and universal configurations have two children (both 0 and 1, all possible choices of Adam).For the tree to constitute a winning strategy for Eve, configurations at each node in the tree should follow legally from their parents according to the ATM transition function, and configurations at the leaves should be accepting.Our goal is to design a grammar whose programs resemble such strategy trees, and whose correct programs resemble winning strategy trees.To achieve this, we will insert assertions in the grammar to force programs to simulate only valid machine transitions.The grammar will model Adam's moves by reading an uninterpreted function.Consequently, since correctness depends on satisfying assertions in all data models, a program will be correct only if it encodes a strategy that is winning for all possible moves from Adam.
Several mechanisms are needed to model the game between Adam and Eve using a program grammar.We need (a) a mechanism for simulating transitions (game moves) for M (b) a mechanism for ensuring the moves follow the transition function (c) a mechanism to represent alternation (rounds of the game) and (d) a mechanism to ensure that all possible sequences of moves under Eve's strategy eventually lead to accepting configurations.There are a number of challenges to overcome here.Mechanism (a) requires some way to represent a configuration in an uninterpreted program.The most obvious choice would be to use program variables, one for each of the exponentially many tape cells that our space-bounded machine may use.We can rule this out immediately, since this will result in a grammar of exponential size.Mechanism (b) is problematic for the same reason as mechanism (a).Since we cannot represent a full machine tape in program variables, we cannot check that every tape cell is updated according to a legal transition.Mechanism (c) can be accomplished by maintaining a program variable that indicates the current turn.Our solution to the problem of mechanisms (a) and (b) will also be useful for mechanism (d).
We realize a mechanism for (a) by forcing the synthesizer (Eve) to produce the symbols of each tape cell one at a time, inside a loop.Mechanism (b), which ensures that the synthesizer outputs only correct tape symbols in each turn, is accomplished by nondeterministically designating a particular tape cell to track.Intuitively, the grammar will track the evolution of only a single tape cell, ensuring that the synthesizer always makes an appropriate choice for the next symbol of that particular cell.It can refer to the distinguished cell using a polynomial number of program variables that store the binary encoding of the cell index.The evolution of this cell can be verified by keeping track of the symbols for a finite number of the cells surrounding it (one cell on either side).The crux of the matter is that for different data models, the cell under inspection is different.Thus, for the synthesizer to succeed, it must in fact choose correctly for all tape cells.Here, it is important to note the role of incomplete information (as explained in Section 2).The grammar ensures (syntactically) that the synthesizer cannot learn which cell is being tracked.Otherwise, it could cheat by producing a bogus sequence of symbols that may be correct in isolation, but could never have occurred for any sequence of valid moves.Finally, mechanism (d) ensures winningness of the strategy in a manner similar to how transitions are verified: the grammar uses assertions to require that any final move results in the production of tape contents that constitute a configuration in an acc state.Further note that the grammar ensures all sequences of game play are finite by excluding the use of arbitrary while loops, which may not terminate.
Another challenge involves the question of memory.In general, a winning strategy for the synthesizer will need access to the entire history of a play.How can a grammar of polynomial size represent a strategy of much larger size?The solution is to notice that the grammar can provide for unbounded memory by judiciously branching at the important decision point (choosing how to move).Intuitively, after a move is selected, but before it has been simulated and checked, the program must branch on which move was chosen.This has the effect of allowing the synthesizer to use the program counter for memory, since subsequent choices will be synthesized in distinct sub-programs.A complete synthesized program will have a nested if−then−else structure whose tree representation resembles the accepting computation tree for the ATM.The details of the grammar and the proof of correctness are complex and are left to Appendix D.
Theorem 10.The grammar-restricted program synthesis problem is 2EXPTIME-hard.

SYNTHESIZING TRANSITION SYSTEMS
In this section, we investigate a variant of uninterpreted program synthesis in terms of transition systems.Rather than synthesizing programs from grammars, we consider instead the synthesis of transition systems whose executions must belong to a regular set.Our main result is that the synthesis problem in this case is EXPTIME-complete, in contrast to grammar-restricted program synthesis which is 2EXPTIME-complete.

Transition System Definition and Semantics
Let us fix a set of program variables V as before. We consider the following finite alphabet: Σ V = {"x := y", "x := f (z)", "assert(false)", "check(x = y)" | x, y, z ∈ V }. Let us define Γ V ⊆ Σ V to be the set of all elements of the form "check(x = y)", where x, y ∈ V . We refer to the elements of Γ V as check letters.
A (deterministic) transition system T S over V is a tuple (Q, q 0 , H , λ, δ ), where Q is a finite set of states, q 0 ∈ Q is the initial state, H ⊆ Q is the set of halting states, λ : Q → Σ V is a labeling function such that for any q ∈ Q, if λ(q) = "assert(false)" then q ∈ H , and δ : (Q \ H ) → Q ∪ (Q × Q) is a transition function such that δ (q) ∈ Q × Q if λ(q) ∈ Γ V and δ (q) ∈ Q otherwise. Intuitively, a transition system is a finite-state system in which there is precisely one transition from each non-halting state labeled by an assignment and precisely one ordered pair of states for every non-halting state labeled by a check letter. States labeled by check letters represent points at which the system checks whether a condition holds, moving to the first state of the transition pair if yes and to the second state of the pair if no. A transition system can be viewed as an uninterpreted program where we translate check letters as conditionals and model the transitions using goto statements on a set of program labels (program labels being states). It is easy to see that finite transition systems and programs of this kind correspond to each other.
We define the semantics of a transition system using the set of executions that it generates. An execution π of a transition system T S = (Q, q 0 , H, λ, δ ) over variables V is a finite word a over the induced execution alphabet Π V from Section 3, with the following properties. If a = a 0 a 1 . . . a n with n ≥ 0, then there exists a sequence of states q 0 , q 1 , . . . , q n where the following hold:
• Let 0 ≤ i ≤ n, and suppose λ(q i ) is not a check letter. Then a i = λ(q i ) and, if i < n, q i+1 = δ (q i ).
• Let 0 ≤ i ≤ n, and suppose λ(q i ) = "check(x = y)". Then either a i = "assume(x = y)" and (provided i < n) q i+1 = δ (q i ) ⇂ 1 , or a i = "assume(x ≠ y)" and (provided i < n) q i+1 = δ (q i ) ⇂ 2 .
In the above, we denote pair projection with ⇂, i.e., (t 1 , t 2 ) ⇂ i = t i , where i ∈ {1, 2}.A complete execution is an execution whose corresponding (unique) final state (q n above) is in H .For any transition system T S, we denote the set of its executions by Exec(T S) and the set of its complete executions by CompExec(T S).The notions of correctness and coherence for transition systems are identical to their counterparts for programs, given in Section 3 and Section 5.1 respectively.The crucial distinction between transition system and grammar-restricted program synthesis is that a transition system may use additional memory (states) to record the history of an execution and use it to determine which statements to execute later.This is a consequence of our specification language for transition systems, which we discuss next.
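To make the execution semantics concrete, the following Python sketch enumerates the executions of a transition system up to a length bound. The dictionary-based encoding (lam for the labeling, delta for the transition function with pairs as 2-tuples), the string syntax of letters with != for disequality, and the name executions are all illustrative assumptions.

```python
def executions(q0, halting, lam, delta, max_len):
    """Yield (word, is_complete) for every execution of length <= max_len."""
    stack = [(q0, [])]
    while stack:
        q, word = stack.pop()
        if len(word) == max_len:
            continue
        label = lam[q]
        if label.startswith("check("):
            x, y = label[len("check("):-1].split("=")
            # A check letter yields one of two assume letters; the index
            # selects the component of the successor pair.
            branches = [("assume(" + x + "=" + y + ")", 0),
                        ("assume(" + x + "!=" + y + ")", 1)]
        else:
            branches = [(label, None)]
        for letter, idx in branches:
            extended = word + [letter]
            # The execution is complete iff its final state is halting.
            yield extended, q in halting
            if q not in halting:
                nxt = delta[q] if idx is None else delta[q][idx]
                stack.append((nxt, extended))


# A toy system: loop applying f to x while x = y, then set y := x and halt.
lam = {"q0": "check(x=y)", "q1": "x:=f(x)", "q2": "y:=x"}
delta = {"q0": ("q1", "q2"), "q1": "q0"}
complete = [w for w, done in executions("q0", {"q2"}, lam, delta, 3) if done]
```

Within the bound of three letters the toy system has six executions, exactly one of which is complete, namely the word consisting of the disequality assumption followed by y:=x.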

The Transition System Synthesis Problem
In order to define a non-trivial synthesis problem for transition systems (we wish to avoid trivial systems, e.g. one that never halts), we will allow the user to specify a transition system T S by placing restrictions on its executions (both partial and complete) using two regular languages S and R. We require that all executions of T S belong to the first language S (which is prefix-closed) and that all complete executions belong to the second language R. A specification will be given as two deterministic automata A S and A R over executions, where L(A S ) = S and L(A R ) = R.For a transition system T S and specification automata A S and A R , whenever Exec(T S) ⊆ L(A S ) and CompExec(T S) ⊆ L(A R ) we will say that T S satisfies its (syntactic) specification.Note that this need not entail correctness of T S.
Definition 3 (Transition System Realizability and Synthesis Problems).Given a finite set of program variables V and deterministic specification automata A S (prefix-closed) and A R over the execution alphabet Π V , decide if there is a correct coherent transition system T S over V that satisfies the specification.Furthermore, produce one if it exists.
Since programs are readily translated to transition systems (of similar size), the transition system synthesis problem seems, at first glance, to be a problem that ought to have similar complexity.However, as we show, it is crucially different in that it allows the synthesized system to have complete information of past actions executed at any point.In fact, we will show in this section that the transition system synthesis problem is EXPTIME-complete.
To see the difference between program and transition system synthesis, consider program skeleton P from Example 2 in Section 2. The problem is to fill the hole in P with either y := T or y := F. Observe that when P executes, there are two different executions that lead to the hole. In grammar-restricted program synthesis, the hole must be filled by a sub-program that is executed no matter how the hole is reached, and hence no such program exists. However, when we model this problem in the transition system synthesis setting, the synthesizer will be able to produce transitions that depend on how the hole is reached. Hence, it does not solve the problem of filling the hole in P with uniform code. In this sense, in grammar-restricted program synthesis, programs have incomplete information of the past. This property was crucially exploited in the 2EXPTIME program synthesis lower bound proof (the grammar, with the help of the data model nondeterminism, hid the identities of the TM cells being checked). No such incomplete information can be enforced by regular execution specifications in transition system synthesis, and indeed the problem turns out to be easier, as we show: transition system realizability and synthesis are EXPTIME-complete.

Upper Bound: Let the given finite set of program variables be V and let the specification be given by the execution automata A_S and A_R over the alphabet Π_V. We build a non-deterministic top-down tree automaton that accepts trees whose labels collectively encode the (potentially) infinitely-many tree unfoldings of correct transition systems which satisfy the specification. The nodes (states of the transition system) of input trees are labeled by Σ_V, and any such (non-leaf) node labeled by a check letter has two children, while nodes with other labels have a single child. Let A_cc-exec be the deterministic word automaton from Section 5.1 accepting all coherent and correct executions over Π_V. The states of the tree automaton have three components that track properties of executions across each branch of the input tree: the first component tracks the state of A_S, the second tracks the state of A_R, and the third tracks the state of A_cc-exec. When reading any label other than a check letter, the tree automaton simulates all three component automata on that label. When reading a letter of the form "check(x = y)", it simulates the component automata on "assume(x = y)" and "assume(x ≠ y)", propagating the resulting state triples to the left and right children respectively. At any point, if A_S reaches a rejecting state, then the input tree is immediately rejected (by ensuring that such state triples transition to absorbing reject states). An input tree is accepted if all leaves in a run are labeled by accepting states of each component automaton. The size of this tree automaton is exponential in the number of program variables (from the size of A_cc-exec) and linear in the sizes of A_S and A_R. We can now check emptiness of the tree automaton in time polynomial in its size. If nonempty, we can construct a finite transition system (of size at most that of the tree automaton) whose tree unfoldings are precisely those accepted by the automaton described here. This shows that the realizability problem is solvable in EXPTIME. Furthermore, when realizable, a transition system of size exponential in the number of variables and linear in the sizes of A_S and A_R can be constructed.

Lower Bound: We show that the realizability problem is EXPTIME-hard using a reduction from the membership problem for alternating PSPACE Turing machines. The reduction has a similar structure to that of the lower bound for grammar-restricted program synthesis, but is notably simpler because we can encode Turing machine configurations using polynomially-many program variables. The goal of the reduction is to design a specification A_R (and its prefix-closed counterpart A_S) such that a correct transition system that satisfies it will witness an accepting computation tree for the PSPACE Turing machine. Once again, we can think about this witness as encoding a strategy for Eve, with Adam playing his moves by reading an uninterpreted function. Machine configurations can be updated by inserting rules in the transitions for A_S and A_R that ensure each cell is updated correctly, and that any final configurations are accepting. We can then show that there is a correct transition system with executions in L(A_S) and complete executions in L(A_R) if and only if the alternating PSPACE TM accepts the input. This yields (proof gist in Section E):

Theorem 11. The transition system realizability problem is decidable in time exponential in the number of program variables and polynomial in the size of the deterministic automata A_S and A_R. Furthermore, the problem is EXPTIME-complete. When realizable, within the same time bounds we can construct a correct coherent transition system whose executions are in L(A_S) and whose complete executions are in L(A_R).
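The product step in the upper bound can be sketched in Python as follows. This is only an illustration under stated assumptions: the dictionary encoding of the three component automata, the letter syntax (with `!=` standing for ≠ and no spaces inside letters), and the names `step` and `REJECT` are hypothetical and not part of the construction above.

```python
REJECT = "reject"  # hypothetical absorbing reject state of the tree automaton


def step(triple, letter, d_S, d_R, d_cc, rejecting_S):
    """Return the tuple of child states reached from `triple` on `letter`.

    d_S, d_R and d_cc map (state, letter) to the next state of A_S, A_R and
    A_cc-exec; rejecting_S is the set of rejecting states of A_S.
    A check letter yields two children, every other letter yields one.
    """
    is_check = letter.startswith("check(") and letter.endswith(")")
    if triple == REJECT:
        # Reject states are absorbing on every branch.
        return (REJECT, REJECT) if is_check else (REJECT,)

    def advance(symbol):
        s, r, c = triple
        nxt = (d_S[(s, symbol)], d_R[(r, symbol)], d_cc[(c, symbol)])
        # Reject as soon as the safety automaton A_S reaches a rejecting state.
        return REJECT if nxt[0] in rejecting_S else nxt

    if is_check:
        x, y = letter[len("check("):-1].split("=")
        # Left child follows assume(x=y), right child follows assume(x!=y).
        return (advance(f"assume({x}={y})"), advance(f"assume({x}!={y})"))
    return (advance(letter),)
```

Each component automaton is advanced independently, so the number of reachable triples is bounded by the product of the three state spaces, which is where the size bound stated above comes from.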

SYNTHESIZING BOOLEAN PROGRAMS
In this section, we briefly observe corollaries of our results when applied to the more restricted problem of synthesizing Boolean programs.
In Boolean program synthesis, we interpret variables in programs over the Boolean domain {T, F}, disallowing computations of uninterpreted functions and checking of uninterpreted relations. Standard Boolean functions such as ∧, ∨, ¬, ⇒, etc. are instead allowed, but note that these can be modeled using if-then-else statements. We allow nondeterminism using a special assignment b := * that assigns b nondeterministically to T or F. As usual, a program is correct iff it satisfies all its assertions.
Synthesis of Boolean programs can be easily modeled as uninterpreted program synthesis. We have two special constants T and F. Each nondeterministic assignment is modeled by computing a next function on successive nodes of a linked list, accessing a nondeterministic value by computing key on the current node, and assuming equality of the result with either T or F. Since programs must be correct for all models, this indeed captures nondeterministic assignment. The 2EXPTIME upper bound for Boolean program synthesis now follows from Theorem 9. Interestingly, the 2EXPTIME lower bound from Section 6 can be adapted to prove that Boolean program synthesis is 2EXPTIME-hard. Note that the reduction uses a single uninterpreted function to model the binary universal choice and the rest of the grammar manipulates variables that only ever contain two values, which can hence be modeled with Booleans.
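To make the linked-list encoding of b := * concrete, here is a small Python sketch. It fixes a data model by giving finite tables for the functions key and next, replays a sequence of nondeterministic assignments in that model, and then ranges over all key labellings. The function names, the table encoding, and the restriction to models whose keys are already Boolean (other models are the ones the assume statements filter out) are assumptions of this sketch.

```python
from itertools import product


def run_havocs(key, nxt, start, count):
    """Replay `count` occurrences of b := * in the model given by key and nxt."""
    node, values = start, []
    for _ in range(count):
        values.append(key[node])  # b := key(cur), then assume b = T or b = F
        node = nxt[node]          # cur := next(cur)
    return values


def all_outcomes(count):
    """Collect the value sequences obtained over all Boolean key labellings."""
    nxt = {i: i + 1 for i in range(count)}  # a list with distinct successive nodes
    outcomes = set()
    for bits in product([True, False], repeat=count):
        key = dict(enumerate(bits))
        outcomes.add(tuple(run_havocs(key, nxt, 0, count)))
    return outcomes
```

Because correctness quantifies over all data models, a program using this encoding must be correct for every sequence returned by `all_outcomes`, which is exactly the obligation imposed by nondeterministic assignment.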
Theorem 12. The realizability problem for grammar-restricted Boolean program synthesis is 2EXPTIME-complete, and can be solved in time doubly-exponential in the number of variables and linear in the size of the input grammar. □

The above shows that uninterpreted program synthesis is no more complex than Boolean program synthesis, establishing decidability and complexity of a problem which has found wide use in practice; for instance, the synthesis tool Sketch solves precisely this problem, as it models integers using a small number of bits (usually 5) and allows grammars to restrict programs with holes.
We can also show that the transition system synthesis problem studied in the previous section can be adjusted to work over Boolean variables. Both the upper and lower bound proofs can be adapted to show the problem is EXPTIME-complete. The definitions and results are the natural analogs and we omit further details.

SYNTHESIZING RECURSIVE PROGRAMS
We now extend the positive result of Section 5 to synthesize coherent, recursive programs that meet user specifications. The setup for the problem is very similar: given a grammar that identifies a class of (now) recursive programs, the goal is to determine if there is a program in the class that is coherent and correct. In order to do this, we first introduce the class of recursive programs and their semantics, along with important notions that will help us outline an algorithmic solution to the synthesis problem.

Recursive Programs and their Semantics
To keep the presentation simple, we will impose some restrictions on the program syntax, none of which limit the generality of our results. Let us fix the set of program variables to be V = {v_1, v_2, ..., v_r}, along with an ordering ⟨V⟩ = v_1, v_2, ..., v_r. The programs we consider will have recursively defined methods, and we fix the names of such methods to belong to a finite set M. We will assume that m_0 ∈ M denotes the "main" method that is invoked when the program is executed. Without loss of generality, we will assume that the set of local variables for any method m ∈ M is V; methods can easily ignore some variables if they use fewer variables. We will also assume the set of formal parameters for every method is also V, called in the order ⟨V⟩. None of these are serious restrictions. Our methods will return multiple values back, which are assigned by the caller to local variables. Therefore, for every method m, we fix o_m to be the (ordered) output variables of m; the variables in o_m will be among the variables in V. We require the output variables in o_m to be distinct to avoid implicit aliasing. Recursive programs are now a sequence of method definitions, wherein one can call other methods, assign values, use conditional branching and loops, along with sequencing.
⟨pgm⟩_{M,V} ::
Here x, y, z, w belong to V, and the length of the vector w must match the output o_m. A program is nothing but a sequence of method definitions, and the main method m_0 is invoked first when the program is run. The new statement in the grammar is w := m(⟨V⟩), where a method m is called and, when the call returns, the output values are assigned to the vector w of variables.
Different aspects associated with the semantics of such programs, like executions, terms, coherence, etc., are sketched informally below. Precise definitions for many of these concepts were first presented in [Mathur et al. 2019], and for completeness are also given in Appendix F. Executions of recursive programs are sequences over the alphabet Π_V plus two other collections of symbols: {"call m" | m ∈ M}, which are events corresponding to method invocation, and {"z:=return" | z in V}, which are events corresponding to a return from a method invocation and the assignment of outputs to local variables in the caller. The set of executions of a program can be naturally defined and it forms a context-free language.
Given a data model that provides an interpretation to the constant and function symbols, every partial execution naturally maps each program variable to a value in the universe of the data model; in the interests of space we skip this definition. The notions of an execution being feasible in a data model (i.e., all assume statements must hold when encountered) and a program being correct (i.e., all executions of the form ρ • assert(false) are infeasible in all data models) can be extended naturally to recursive programs.
Finally, the definition of coherent executions and programs can be extended to the recursive case. To do this, we need to identify the syntactic term stored in a variable after a partial execution, the set of syntactic terms computed during a completed execution, and the collection of equality assumptions made during an execution. These can be naturally extended from the non-recursive case using a call-by-value semantics. Based on these, coherence is defined in exactly the same manner as in the non-recursive case: executions are coherent if they are memoizing and have early assumes, and programs are coherent if all their executions are coherent. Again, we skip the formal definition to avoid repetition.
As in the non-recursive case, we will represent recursive programs as finite trees. Recall that a recursive program is nothing but a sequence of method definitions. Therefore, our tree representation of a program will be a binary tree with root labeled "root", where the right-most path in the tree will have labels of the form "m ⇒ o_m", and the left child of such a node will be the tree representing the body of the method definition of m; since a method body is nothing but a program statement, we could use a tree representation very similar to the one used for non-recursive programs. The formal definition of such trees can be easily worked out, but is given in Appendix F for completeness.
We conclude this section by recalling the main observations from [Mathur et al. 2019] about recursive programs: the problems of determining if a recursive program is coherent, and of determining if a coherent program is correct, are decidable.
Theorem 13 ([Mathur et al. 2019]). Given a recursive program P, checking if P is coherent is decidable in EXPTIME. Further, checking correctness of a coherent recursive program is decidable in EXPTIME.
The proof of Theorem 13 relies on the observation that there are visibly pushdown automata [Alur and Madhusudan 2004] (with respect to the partition of Π_{M,V} into call alphabet {"call m" | m ∈ M}, return alphabet {"z:=return" | z in V}, and internal alphabet Π_V) A_rcoh and A_rcor that accept the set of all coherent executions, and the set of all coherent executions that are correct, respectively. Both A_rcoh and A_rcor are of size O(2^{poly(|V|)}). Since the set of all program executions is also a visibly context-free language, decidability follows from taking appropriate automata intersections and checking for emptiness. By taking automata cross-products, we can conclude there is a visibly pushdown automaton A_rcc of size O(2^{poly(|V|)}) accepting the set of all recursive executions that are both coherent and correct; as in the non-recursive case, we crucially exploit A_rcc for synthesis.
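A minimal Python sketch of this tree layout follows, with method definitions hanging off the right-most path. The `Node` class, the exact label strings, the choice to attach the first method as the right child of the root, and the helper `find_body` are assumptions made for illustration; the formal definition is the one in Appendix F.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Node:
    label: str
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def program_tree(methods: List[Tuple[str, List[str], Node]]) -> Node:
    """Build the tree for a list of (method name, output variables, body tree)."""
    spine = None
    # Build the right-most path bottom-up so that methods appear in order.
    for name, outputs, body in reversed(methods):
        spine = Node(f"{name} ⇒ {','.join(outputs)}", left=body, right=spine)
    return Node("root", right=spine)


def find_body(root: Node, name: str) -> Optional[Node]:
    """Walk down the right-most path until the definition of `name` is found."""
    node = root.right
    while node is not None:
        if node.label.split(" ⇒ ")[0] == name:
            return node.left
        node = node.right
    return None
```

Locating a method body only requires walking along the right-most path from the root, so a tree-walking automaton can do it with a bounded amount of state.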

Synthesizing Correct, Coherent Programs
The approach to synthesizing recursive programs is similar to the non-recursive case, though more complicated. Once again, given a grammar G, the set of trees corresponding to programs generated by G is regular; let A_G be the tree automaton accepting this set of trees. The crux of the proof is to show that there is a two-way alternating tree automaton 𝒜_rcc that accepts exactly the collection of all trees that correspond to recursive programs that are coherent and correct. The synthesis algorithm then involves checking if there is a common tree accepted by both A_G and the tree automaton 𝒜_rcc, and if so constructing such a tree. The latter problem is easily reduced to tree automata emptiness. Therefore, in the rest of the section, we describe how to construct the tree automaton 𝒜_rcc.
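The final emptiness check can be illustrated with a standard fixpoint computation. The Python sketch below assumes that the two automata have already been converted to nondeterministic top-down form over finite trees and intersected; the dictionary encoding of transitions and the function names are hypothetical, and the sketch is a generic emptiness procedure rather than the specific construction of this paper.

```python
def productive_states(delta):
    """Least fixpoint of the states from which some finite tree is accepted.

    delta maps (state, label) to a list of tuples of child states; the empty
    tuple () is a leaf transition. Returns a dict giving, for each productive
    state, one (label, children) choice that witnesses productivity.
    """
    witness = {}
    changed = True
    while changed:
        changed = False
        for (q, label), options in delta.items():
            if q in witness:
                continue
            for children in options:
                if all(c in witness for c in children):
                    witness[q] = (label, children)
                    changed = True
                    break
    return witness


def build_tree(q, witness):
    """Unfold the recorded choices into a concrete accepted tree."""
    label, children = witness[q]
    return (label, [build_tree(c, witness) for c in children])


def synthesize(initial, delta):
    """Return some accepted tree as nested (label, children) pairs, or None."""
    witness = productive_states(delta)
    return build_tree(initial, witness) if initial in witness else None
```

A state is recorded only after all children of its chosen transition have been recorded, so `build_tree` always terminates; the loop runs in time polynomial in the size of the transition table.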
The construction of the tree automaton 𝒜_rcc is similar to the construction of the tree automaton A_cc in the non-recursive case. On an input tree t, the tree automaton 𝒜_rcc will generate all executions of the program corresponding to t by walking up and down t, and check that each one of them is coherent and correct by simulating the visibly pushdown automaton A_rcc. The challenge is to account for recursive function calls and the fact that A_rcc is a (visibly) pushdown automaton rather than a simple finite automaton. Giving a precise formal description of 𝒜_rcc will be notationally cumbersome, and will obfuscate the ideas behind the construction. Therefore, we only outline the informal ideas, and leave working out the precise details to the reader.
Like in the non-recursive case, the tree automaton 𝒜_rcc will simulate the visibly pushdown automaton A_rcc as each execution is generated. Since A_rcc does not change its stack, except on "call m" and "z:=return", we can simulate A_rcc on most symbols by simply keeping track of the control state of A_rcc. The interesting case to consider is that of method invocation. Suppose 𝒜_rcc is at a leaf labeled "z:=m(⟨V⟩)". Let q be the control state of A_rcc after the execution thus far. Executing the statement z:=m(⟨V⟩) gives a partial trace of the form "call m" • ρ • "z:=return", where ρ is an execution of method m. Suppose A_rcc on symbol "call m" from state q goes to state q_1 and pushes γ on the stack. Notice that no matter what ρ (the execution of method m) is, since A_rcc is visibly pushdown, the stack at the end of ρ will be the same as that at the beginning. Therefore, 𝒜_rcc will (nondeterministically) guess the control state q_2 of A_rcc at the end of method m. 𝒜_rcc will send two copies. One copy will simulate the rest of the program (after z:=m(⟨V⟩)) from the state q′, which is the state of A_rcc after reading "z:=return" from q_2 and popping γ. The second copy will simulate the method body of m to confirm that there is an execution of m from state q_1 to q_2. To simulate the method body of m, 𝒜_rcc will walk all the way up to the root, and then walk down, until it finds the place where the definition of m is in the tree. 𝒜_rcc will also need to account for the possibility that the call to m does not terminate; in this case, it will send one copy to simulate the body of m, and if that body ever terminates, 𝒜_rcc will reject. Given this informal description, one can say that a state of 𝒜_rcc will be of the form (p, q_1, q_2), where q_1 and q_2 are a pair of states of A_rcc with the intuition that q_1 is the current state of A_rcc, q_2 is the target state to reach at the end of the method, and p is some finite amount of book-keeping information needed to perform tasks like finding an appropriate method body to simulate, whether the method will return, etc. Thus, the size of the tree automaton 𝒜_rcc will be O(2^{poly(|V|)}).
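The role of the state pairs (q_1, q_2) can be illustrated by computing them explicitly. The Python sketch below replaces the nondeterministic guess of q_2 by a fixpoint computation of all method summaries, which is a different (deterministic) technique from the alternating tree automaton described above. It further assumes that each method body is given as a finite list of alternative paths, that the pushdown automaton is deterministic, and that a single return symbol suffices; all names and encodings are hypothetical.

```python
def summaries(states, methods, d_int, d_call, d_ret):
    """Compute, for each method m, the pairs (q_in, q_out) such that some
    terminating execution of m takes the automaton from q_in to q_out.

    methods: name -> list of paths; a path is a list of ("int", a) or
             ("call", m) steps.
    d_int:   (state, a) -> state                 (internal letters)
    d_call:  (state, m) -> (state, gamma)        (push gamma on "call m")
    d_ret:   (state, gamma) -> state             (pop gamma on return)
    """
    summ = {m: set() for m in methods}
    changed = True
    while changed:
        changed = False
        for m, paths in methods.items():
            for q_in in states:
                for path in paths:
                    current = {q_in}
                    for kind, sym in path:
                        if kind == "int":
                            current = {d_int[(q, sym)] for q in current}
                        else:
                            nxt = set()
                            for q in current:
                                q1, gamma = d_call[(q, sym)]
                                # The callee leaves the stack unchanged, so
                                # only its entry and exit states matter.
                                for p1, q2 in summ[sym]:
                                    if p1 == q1:
                                        nxt.add(d_ret[(q2, gamma)])
                            current = nxt
                    for q_out in current:
                        if (q_in, q_out) not in summ[m]:
                            summ[m].add((q_in, q_out))
                            changed = True
    return summ
```

The key point mirrored here is the one used in the construction: because the stack after a completed call equals the stack before it, a call can be summarized by a pair of control states, so only finitely many summaries exist.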
Theorem 14. The program synthesis problem for uninterpreted, coherent, recursive programs is decidable in 2EXPTIME; in particular the algorithm is doubly exponential in the number of program variables and linear in the size of the input grammar. Furthermore, a tree automaton representing the set of all correct, coherent, recursive programs conforming to the grammar can be constructed in the same time. Finally, the program synthesis problem in this case is 2EXPTIME-hard.
The 2EXPTIME lower bound follows from the non-recursive case (Section 6).

RELATED WORK
The automata and game-theoretic approaches to synthesis date back to a problem proposed by Church [Church 1960], after which a rich theory emerged [Buchi and Landweber 1969; Grädel et al. 2002; Kupferman et al. 2010; Rabin 1972]. The problems considered in this line of work have typically been about a system reacting to an environment input interactively using a finite set of signals over an infinite number of rounds. Tree automata over infinite trees, representing strategies, with various infinitary acceptance conditions (Büchi, Rabin, Muller, parity) emerged as a uniform technique to solve such synthesis problems against temporal logic specifications with optimal complexity bounds [Kupferman et al. 2000; Madhusudan and Thiagarajan 2001; Pnueli and Rosner 1989, 1990]. In this paper, we use an alternative approach from [Madhusudan 2011] that works on finite program trees, using two-way traversals of the tree to simulate iteration in the program. The work in [Madhusudan 2011], however, uses such representations to solve synthesis problems for programs over a fixed finite set of Boolean variables and against LTL specifications. In this work we use it to synthesize coherent programs that have finitely many variables working over infinite domains endowed with functions and relations.
While decidability results for program synthesis that go beyond finite data domains are rare, we do know of some results of this kind. First, there are some decidability results known regarding the synthesis of transducers that have registers [Khalimov et al. 2018]. Transducers interactively read a stream of inputs and emit a stream of outputs. Finite-state transducers can be endowed with a set of registers for storing inputs and doing only equality/disequality comparisons on future inputs read. Synthesis of such transducers for temporal logic specifications is known to be decidable. Note here that though the data domain is infinite, there are no functions or relations on data (other than equality), making it a much more restricted class (also, grammar-based approaches for syntactically restricting transducers are not studied). Indeed, with uninterpreted functions and relations, the synthesis problem is undecidable (see Theorem 1), with decidability only for coherent programs.
Second, closely related to the aims of this work is an unpublished paper [Caulfield et al. 2015], in which the authors investigate decidable SyGuS problems (Syntax Guided Synthesis, a problem format for grammar-restricted synthesis). The authors study a synthesis problem where programs are simple terms over functions (terms have if-then-else constructs but no recursion) and specifications are given using a first-order formula over the uninterpreted function theory with function symbol f. The problem then is to find an expression/term in the grammar such that the specification holds when f is substituted by this term. The authors show that even this problem (without allowing iteration/recursion in programs) is undecidable. They also identify a very restricted fragment (where the synthesized term cannot even have if-then-else constructs) for which the problem is decidable. In contrast to these, our synthesis results are for programs with conditionals and iteration (but restricted to coherent programs) and for specifications using assertions in code.
A third setting with a decidable synthesis result over unbounded domains is work on strategy synthesis for linear arithmetic satisfiability games [Farzan and Kincaid 2018]. In this work, it is shown that for a satisfiability game, in which two players (SAT and UNSAT) play to prove a formula is satisfiable (where the formula is interpreted over the theory of linear rational arithmetic), if the SAT player has a winning strategy then a strategy can be synthesized. Though the data domain (rationals) is infinite, the game here consists of a finite set of interactions and hence has no need for recursion. The authors also consider reachability games where the number of rounds can be unbounded, but present only sound and incomplete results, as checking who wins in such reachability games is undecidable.
Tree-automata techniques for accepting finite parse trees of programs were explored in [Madhusudan and Parlato 2011] for synthesizing reactive programs with variables over finite domains. In more recent work, automata on finite trees have been explored for practical synthesis: for synthesizing data completion scripts from input-output examples [Wang et al. 2016], for accepting programs that are verifiable using abstract interpretations [Wang et al. 2017], and for relational program synthesis, which synthesizes multiple programs that are related [Wang et al. 2018].
The work in [Madhusudan et al. 2018] explores a logic with ∃*∀* prefixes that can be used to encode synthesis problems, with background theories such as arithmetic as well, and that is decidable. However, encoding program synthesis in this logic only expresses programs of finite size. Another recent paper [Hu et al. 2019] explores sound (but incomplete) techniques for showing unrealizability of syntax-guided synthesis problems.

CONCLUSIONS
We have presented foundational results on synthesizing programs; in particular, coherent programs with uninterpreted functions and relations. To the best of our knowledge, this is the first natural decidable program synthesis problem for programs that have arbitrary size, iteration/recursion, and work over infinite domains. We have established that program synthesis is 2EXPTIME-complete, where the decision procedure is doubly exponential in the number of variables and linear in the size of the input grammar, and that this in fact matches the complexity of even Boolean program synthesis. We have proved decidability of other related synthesis problems, including transition systems with uninterpreted functions (EXPTIME-complete) and recursive program synthesis (2EXPTIME-complete).
A practical realization of our technique that lazily builds automata while looking for accepted trees (programs) would be interesting. The use of tree automata as version space algebras in practical synthesis algorithms in recent work [Wang et al. 2016, 2017, 2018] gives hope for realizing this in practice.
Finally, it is also exciting that this paper bridges the worlds of program synthesis and the rich classical synthesis frameworks of systems over finite domains using tree automata [Buchi and Landweber 1969; Grädel et al. 2002; Kupferman et al. 2010; Rabin 1972]. We believe this link could revitalize both domains with new techniques and applications.

B PROOFS FROM SECTION 4

B.1 Undecidability of Synthesising Straight Line Programs
In this section we will present the proof of Theorem 3. We prove undecidability by reducing from Post's Correspondence Problem (PCP), which we define here.
It is a well-known result that PCP is undecidable [Post 1946]. We shall now detail the reduction. Given an instance of PCP P = (Γ, α, β) over alphabet Γ with lists of strings α and β, consider the first-order signature Σ_P = (∅, {f_σ}_{σ∈Γ}, ∅) and the grammar G_P = (∆_P, St_P, NT_P, R_P) such that:
• ∆_P = {"x_1 := x_2", "x_1 := x_3", ";"} ∪ {t_{1,σ}}_{σ∈Γ} ∪ {t_{2,σ}}_{σ∈Γ} ∪ {"assume(x_1 ≠ x_2)", "assert(false)"}, where t_{1,σ} = "x_1 := f_σ(x_1)", and where n is the length of the lists α and β as given by P.
• R_P is the following collection of rules
Then, the production rule for A_i is given by
Then, the production rule for B_i is given by
For an intuitive understanding of the production rules for A_i and B_i, recall that in our first-order signature the functions are indexed by letters from Γ. If by abuse of notation we associate the function f_{γ_1} ∘ f_{γ_2} (for γ_1, γ_2 ∈ Γ) with the symbol f_{γ_1·γ_2} (and similarly for longer compositions), then the production rule for A_i produces a program block that updates (assigns to) the variable x_1 by f_{α_i}(x_1). Similarly, B_i produces a program block that updates the variable x_2 by f_{β_i}(x_2). Note that in our new notation, if a symbol appears earlier than another in the subscript word, the function corresponding to it is applied later.
Observe that, although the grammar as presented does not quite conform to the grammar schema S_SLP of straight line programs over the given first-order signature, it can easily be rewritten into an equivalent grammar that does conform by introducing some extra non-terminals and folding the (didactic) productions C_1, ..., C_n into productions for Q_P. We claim that the uninterpreted synthesis problem over this grammar is equivalent to the given PCP instance.
To prove this claim, observe that every program generated by this grammar is of the form
for some N and some i_j, 1 ≤ i_j ≤ n for every 1 ≤ j ≤ N. Let π_2 be the prefix that excludes the last two statements, and π_1 be the prefix that excludes the last statement.
We shall look at the correctness of this program. To do this, first see that, using our shorthand notation, the value of the variable x_1 at the end of the program block π_2 is f_{w_α}(x_3) (this can be seen using a simple inductive argument). More precisely, in any first-order model M over our signature, the value of x_1 at the end of π_2 is the value (given by M) corresponding to the term f_{w_α}(x̂_3), where by x̂_3 we mean the initial value of the variable x_3 (the value does not change through the program). Similarly, at the end of π_2 the value of the variable x_2 corresponds to the term f_{w_β}(x̂_3).
For the program to be correct, by our definition of correctness the prefix π_1 has to be infeasible (since the next statement is an assert(false)), i.e., infeasible in every data model. In fact, since this is a straight line program, it has no other executions and therefore the program is correct iff π_1 is infeasible. Therefore let us look at the feasibility of π_1.
To be infeasible in every data model, in particular it must be infeasible in the free model of terms. Recall that in the free model equality is syntactic equality, and therefore to be infeasible in the free model the statement "assume(x_1 ≠ x_2)" must not be true at the end of π_2. That is, the values of the variables x_1 and x_2, namely the terms f_{w_α}(x̂_3) and f_{w_β}(x̂_3), must be syntactically equal, which happens iff w_α = w_β. Conversely, if the two terms are syntactically equal then x_1 and x_2 hold the same value in every data model, so π_1 is infeasible in every data model.
Observe that we have now concluded that an arbitrary program generated by the given grammar is correct iff w_α = w_β. However, the right hand side when expanded yields a solution to the given PCP instance P, namely the number N and the indices i_j for 1 ≤ j ≤ N such that w_α = w_β.
From the above discussion we can conclude that there exists a correct program that can be synthesised from the given grammar if and only if there exists a solution to the given PCP instance. Since the original instance P was arbitrary, this yields that the given problem of uninterpreted synthesis over the schema S_SLP must be undecidable. This concludes the proof of Theorem 3.
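The correctness criterion derived in this proof can be checked mechanically for a candidate index sequence. The Python sketch below represents terms of the free model as nested tuples and compares the terms held by x_1 and x_2 before the final assume. The function names are hypothetical, and for simplicity the letters of each word are applied left to right, which differs from the subscript convention used above but does not affect whether the two terms coincide for a matching pair of concatenations.

```python
def term_after_blocks(words, indices, start="x3"):
    """Term held by a variable after the blocks for words[i_1], ..., words[i_N]."""
    term = start
    for i in indices:
        for sigma in words[i]:
            term = (sigma, term)  # models the assignment x := f_sigma(x)
    return term


def program_is_correct(alpha, beta, indices):
    """The straight-line program is correct iff x1 and x2 hold the same term,
    i.e. iff assume(x1 != x2) is infeasible in the free model of terms."""
    x1 = term_after_blocks(alpha, indices)
    x2 = term_after_blocks(beta, indices)
    return x1 == x2
```

For example, with the lists `["a", "ab", "bba"]` and `["baa", "aa", "bb"]`, the zero-based index sequence `[2, 1, 2, 0]` spells the same word on both sides, so the corresponding program is correct, while the sequence `[0]` is not.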

C.2 Proof of Lemma 6
In this section, we shall detail the process of constructing an equivalent non-deterministic top-down tree automaton given a two-way alternating tree automaton. The definitions of these automata can be found in Section 5.2. These ideas are inspired from and closely follow [Vardi 1998].
The key components of the construction and the correctness arguments of this construction are the following.
(1) Strategy annotations for a given input tree (finite binary tree in our setting) and the notion of acceptance of a strategy annotation with respect to a tree (and the given two-way tree automaton).
(2) Equivalence between the set of trees accepted by a two-way alternating tree automaton and trees that have an accepting strategy annotation.
(3) Construction of a top-down non-deterministic tree automaton whose language is the language of a given two-way alternating tree automaton.The top-down automaton we construct, amongst other things, intuitively, decorates input trees with strategy annotations and accepts those trees which have an accepting strategy annotation.
The overall picture is the following. Recall that the run of a two-way alternating tree automaton at a given node in a given input tree selects a set of neighbouring nodes and states such that they satisfy the formula given by the transition function. Since there can be many satisfying assignments, we can look at the acceptance of a tree by the automaton as a turn-based two-player game between a Believer and a Sceptic, played over the nodes of the tree. Intuitively, the Believer tries to prove that the given tree is accepted by the automaton and the other player is sceptical of that claim. Starting at the root at the initial state of the automaton, at each point in the run of the automaton the Believer tries to play a satisfying assignment of neighbouring nodes and states and the Sceptic chooses one of them to force the Believer into a bad state. To avoid this, the Believer must be able to play an assignment at every point in the run such that no matter what the Sceptic picks a bad state will never be encountered.
As might already be obvious, we will prove below that there is a correspondence between such winning strategies for the Believer and accepting runs of the automaton. Moreover, since the above kind of game is a special game for which the Believer's strategies need only depend on the node of the tree, the state, etc., and not on the round of the game or how far into the game the Believer is, we can annotate the given input tree with such a special strategy (which we call a strategy annotation) and check that it is indeed a winning strategy. Combined with the above equivalence, we construct a non-deterministic top-down tree automaton to nondeterministically decorate the tree with a strategy annotation and then check that the strategy is a winning strategy for the Believer, thereby being able to accept precisely those trees accepted by the original automaton.
We shall formally detail each of these steps below.

C.2.1 Strategy Annotations
Let A be a two-way alternating tree automaton (we choose a form with a single initial state for a simpler presentation). A strategy σ in A maps states and last moves to different states and directions, i.e., σ : Q × {U_L, U_R, D} → P(Q × {L, R, U}). We denote by S_A the set of strategies in A; notice that the size of this set is O(2^{|Q|^2}). Given a tree T = (S, γ), a strategy annotation a maps each node of T to some strategy, i.e., a : S → S_A, such that it satisfies the transitions of A: for every n ∈ S, q ∈ Q, m ∈ {U_L, U_R, D}, we have that a(n)(q, m) |= δ_i(q, m, γ(n)) (where i is 0, 1 or 2 depending upon the label γ(n), and m is also appropriately chosen depending upon the arity of n). Note that a strategy annotation cannot exist where the transition on any pair of a state and a previous direction is false.
Intuitively, such an annotation directs the automaton A about which set of next (q, d) pairs to transition to when its control is on a given node of the tree. This picture might appear slightly inaccurate, since a given run of a two-way automaton may transition to different sets of (q, d) pairs when visiting the same node at different times. However, we will later show that when a tree is accepted by a two-way tree automaton, there is an accepting run of the automaton on the tree that chooses the same set of states each time it visits a given node.
A strategy annotation a of a tree T is accepting if the run of the two-way automaton A that obeys a is an accepting run. By this we mean that the run graph G = (V, E) defined in the following way is an accepting run of A on T (all states visited are safe). (0, q_0, D, ϵ) ∈ V and for every vertex v = (i, q, m, n) ∈ V, the set of edges outgoing from v is the set ⋃_{j=1}^{k} {(i + 1, q_j, m_j, n_j)}, where a(n)(q, m) = {(q_1, d_1), ..., (q_k, d_k)}, and for each j there exists the node n_j in the tree T (i.e., n_j ∈ S) such that:
Now, we present the crucial part of the intuition in defining strategies and strategy annotation.
C.2.2 Equivalence between accepted trees and accepting strategy annotations.
Lemma 15.Let A be a two-way alternating tree automaton and let T be a binary tree.T is accepted by A iff there is a strategy annotation a of T which is accepting.
Proof Sketch.The 'if' part follows from the definition of accepting strategy annotations.As for the 'only if' part, let us consider a turn-based two-player game played on the nodes of the given tree T = (S, γ ) between a Believer and a Sceptic.Intuitively, the Believer is trying to show that T is accepted by the automaton and the Sceptic does not believe that and throws challenges along the way.
A configuration of the game is an element of S × Q × {U_L, U_R, D}. The game starts at the root of T at starting configuration (ϵ, q_0, D). During each round of the game, say at configuration (n, q, m), the Believer goes first and chooses an element a_nqm of P(Q × {L, R, U}) such that a_nqm |= δ_i(q, m, γ(n)) (where i is 0, 1 or 2 depending upon the label γ(n), and m is also appropriately chosen depending upon the arity of n). If δ_i(q, m, γ(n)) = false then the Sceptic wins immediately. If not, the Sceptic chooses an element (q′, d′) of a_nqm and the game then transitions to the configuration (n′, q′, m′), where n′ is the neighbour of n in direction d′ and m′ is the corresponding move. If n′ does not exist in S, then too the Sceptic wins immediately. We define the winning condition to be that the Believer wins if the game only encounters states (the Q component of the configuration) in F (as given by A), i.e., safe states.
A strategy for the Believer is a map from configurations to the possible choices for each round of the game and is an object of the form N → (S × Q × {U_L, U_R, D} → P(Q × {L, R, U})). Since the game can take many paths crossing many configurations, given a strategy σ_Bel for the Believer and depending on the possible strategies of the Sceptic, we can represent the possibilities of the game as a graph of configurations starting at the root (0, ϵ, q_0, D) (owing to our default initial configuration) and where a node (i, n, q, m) has a child (i + 1, n′, q′, m′) only if (n, q, m) and (n′, q′, m′) are possible successive configurations in rounds i and i + 1 respectively (induced by the choices of the Sceptic). By comparing definitions it is clear that this graph is the same as the directed graph that defines a run of the automaton A. Let us call this the run defined by the Believer's strategy σ_Bel. Lastly, it is clear that σ_Bel is a winning strategy iff the above graph does not encounter any bad states, which happens iff A accepts T on the run defined by σ_Bel.
Observe that the above is also a generalization of the notion of a run that obeys a strategy annotation, where we can consider the run that obeys a : S → S A (where S A consists of elements of the form Q × {U L , U R , D} → P(Q × {L, R, U })) as the run defined by the strategy σ a given by σ a (n, q, m) = a(n)(q, m).We can also see that the strategy annotation is accepting iff the strategy defined by it in the above manner is a winning strategy for the Believer.
Finally, a well-known result [Grädel et al. 2002] gives us that the above kind of game is a safety game (and more generally a parity game) and that these games are determined and have positional or memoryless strategies. The determinacy implies that starting at the root of the input tree T the Believer either has a winning strategy or does not, i.e., the tree is either accepted by the automaton or it is not. This is clear already from the correspondence between runs and winning strategies explained above. However, the fact that positional strategies exist means that the Believer has a winning strategy that plays the same set a_nqm whenever it visits the same configuration (n, q, m).
So far, we have that accepting strategy annotations correspond to winning strategies (that are also positional), and that winning strategies correspond to accepted trees. Expanding the above observations, we have that if A accepts T then the Believer has a positional winning strategy, i.e., a strategy that maps each configuration from S × Q × {U_L, U_R, D} to an element of P(Q × {L, R, U}). But this can be interpreted as a strategy annotation, i.e., an element of the form S → (Q × {U_L, U_R, D} → P(Q × {L, R, U})) that provides a set to be played for every possible state and previous move at a given node in the tree. Moreover, it is also clear that since A accepts T this strategy must be a winning strategy and therefore the strategy annotation must be accepting. This completes the other direction of the proof. □
C.2.3 Construction of an Equivalent Top Down Automaton. Let us now embark on the description of the non-deterministic top-down tree automaton A′ that accepts the same language as a given two-way alternating tree automaton A. The first step towards seeing intuitively that this construction is possible is Lemma 15, which ensures that it is possible to guess the moves of the two-way automaton in one shot. The challenge, however, is to verify in a top-down manner that the guessed annotation is an accepting annotation. To tackle this, we observe that the set of states visited in any given run when on a given node n is also a finite set (some subset of Q), and one can also guess these sets (or bags) of states. The check for acceptance then translates to checking that (a) from every state in the bag of a node n, the automaton transitions with a pair (q, d) such that q is in the bag of n•d, and (b) each bag is a safe set of states. Let us formalize the construction below.
We fix A = (Q, {q 0 }, F , δ 0 , δ 1 , δ 2 ) to be the given two-way automaton.The top-down automaton is a tuple States of A ′ that are in Q ′ are valid pairs of the form (σ , B), where σ ∈ S A and the bag is a set of pairs of safe states and some move, i.e., B ⊆ F × {D, U L , U R }, and further, for every symbol a ∈ Γ, and for every (q, m) ∈ B, we have that σ (q, m) |= δ i (q, m, a) (i is appropriately chosen depending upon the arity of a).The set of initial states is Let us now describe the transitions.The transition δ ′ 1 is such that for every (σ , B) ∈ Q ′ and for every a ∈ Γ 1 , every state (σ 1 , B 1 ) ∈ δ ′ 1 ((σ , B), a) satisfies the following two conditions.(a) For every (q, m) ∈ B and for every (q ′ , L) ∈ σ (q, m), we must have (q ′ , U L ) ∈ B 1 .(b) For every (q, m) ∈ B 1 and for every (q ′ , U ) ∈ σ 1 (q, m), we must have (q ′ , U L ) ∈ B.
Finally, δ′_0 = Q′ × Γ_0 is the complete relation on Q′ and Γ_0. We also complete the automaton by sending all other transitions to the Sink state such that the automaton transitions until the leaves, all in the Sink state. Due to our δ′_0, this means that if any leaf in a tree goes to Sink then that tree is rejected.
Lemma 6. Given a two-way alternating tree automaton A, one can construct a non-deterministic top-down tree automaton A′ of size O(2^{poly(|A|)}) in time O(2^{poly(|A|)}) such that L(A) = L(A′).
Proof Sketch. We first define the notion of a strategy-consistent annotation over an input tree T = (S, γ) (w.r.t. the two-way alternating tree automaton A) extending a strategy annotation a. Such an annotation, denoted by ā, is a map of the form ā : S → S_A × P(Q × {U_L, U_R, D}) such that for every n ∈ S, ā(n) = (σ_n, Bag_n) with σ_n = a(n) and some Bag_n, where (q_0, D) ∈ Bag_ϵ and, for each node n of T visited by the run (defined by a) of A at state q and previous direction m, (q, m) ∈ Bag_n. Intuitively, at each node of the input tree, apart from deciding what strategy to play, the annotation also tracks the set of states and previous directions at which the automaton visits that node in the tree (including bad states). Observe that for any strategy annotation over an input tree there always exists a minimal strategy-consistent annotation that only includes in each bag the exact states and previous directions at which the node is visited (the proof of this fact is trivial and is skipped). An accepting strategy-consistent annotation is simply one that extends an accepting strategy annotation. We will see later that accepting strategy-consistent annotations are nondeterministically guessed, giving us our construction.
We detailed above a run of a two-way alternating tree automaton obeying/induced by a given strategy annotation over an input tree in the proof of Lemma 15.Given the above definition of a strategy-consistent annotation we can restate the definition of an accepting strategy annotation using the following lemma.
Lemma 16. Given any strategy-consistent annotation ā extending a strategy annotation a, for every non-leaf node n ∈ S the following holds:
(1) For every (q, m) ∈ Bag_n that is visited by the run of A (induced by a) and for every (q′, L) ∈ σ_n(q, m) it must be the case that n′ = n.L ∈ S and (q′, D) ∈ Bag_n′.
(2) For every (q, m) ∈ Bag_n that is visited by the run of A and for every (q′, R) ∈ σ_n(q, m) it must be the case that n′ = n.R ∈ S and (q′, D) ∈ Bag_n′.
(3) For every (q, m) ∈ Bag_n that is visited by the run of A and for every (q′, U) ∈ σ_n(q, m) it must be the case that n′ = n.U ∈ S and (q′, m′) ∈ Bag_n′ where m′ = U_L if n′ • L = n and m′ = U_R otherwise.
Moreover the converse is also true, i.e., if there exist a strategy-consistent annotation ā and a strategy annotation a such that (i) the above properties are satisfied and (ii) (q_0, D) ∈ Bag_ϵ (of ā), then the strategy-consistent annotation extends the strategy annotation.
The first part of Lemma 16 is clear by restating the definition of a run that obeys a given strategy annotation in terms of a strategy-consistent annotation extending it. The converse can be realized using a simple inductive argument that inducts on the first component of the vertices of the run graph (which is a natural number) obeying the strategy annotation. The idea is that if a vertex is visited at a certain state and previous direction, the first requirement ensures that the bags of the neighbouring vertices that are visited at the next stage of the induction contain the appropriate states and previous directions according to the transition function of A. The base case is ensured by our second requirement.
Given both directions of Lemma 16 it is also clear that, to check whether a strategy annotation is accepting, it is enough to check whether there exists a strategy-consistent annotation extending it that (i) contains only safe states and (ii) satisfies the conditions of Lemma 16 for all states in every bag (as appropriately applies), not just the states visited by A. This is because if such an extension exists we are done, and there always exists a minimal extension (whose bags only contain the visited states) to which the conditions apply on every element and which will contain only good states in every bag if the strategy annotation is accepting.
The proof of Lemma 6 concludes by observing that the constructed automaton transitions, by definition, on a non-leaf node if and only if it meets the obligations of Lemma 16 on that non-leaf node (in this special way of the entire bag satisfying the properties and the bags containing only safe states).Therefore the automaton transitions to the final state on every path if and only if every non-leaf node meets the obligations of Lemma 16, which can be equivalently stated as the fact that the input tree has an accepting strategy-consistent annotation, i.e., is in the language of the original two-way alternating tree automaton.
The size of the automaton A′ and the time to build it are also clear from the size of strategy-consistent annotations. □
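The minimal strategy-consistent annotation used in the proof can be illustrated with a small sketch. It is not the paper's construction: it assumes a finite binary tree whose nodes are strings over {L, R} (the root is the empty string), an annotation given as nested dictionaries, and the hypothetical names `minimal_bags` and `is_accepting`. It does not check that the annotation satisfies the transition formulas δ_i; it only follows the annotation and collects, per node, the (state, previous move) pairs at which the node is visited.

```python
from collections import deque

def minimal_bags(nodes, annotation, q0):
    """Collect, for each node, the (state, last move) pairs visited by the run.

    nodes: set of strings over {"L", "R"}; "" is the root.
    annotation: dict node -> dict (state, move) -> set of (state, direction);
    a missing entry is treated as the empty choice (an assumption of this sketch).
    Returns None if the run tries to move to a node that does not exist.
    """
    bags = {n: set() for n in nodes}
    bags[""].add((q0, "D"))
    queue = deque([("", q0, "D")])
    while queue:
        n, q, m = queue.popleft()
        for q2, d in annotation[n].get((q, m), set()):
            if d == "U":
                if n == "":
                    return None  # no parent above the root
                n2 = n[:-1]
                m2 = "UL" if n[-1] == "L" else "UR"
            else:
                n2, m2 = n + d, "D"
            if n2 not in bags:
                return None  # the required child is missing
            if (q2, m2) not in bags[n2]:
                bags[n2].add((q2, m2))
                queue.append((n2, q2, m2))
    return bags

def is_accepting(nodes, annotation, q0, safe):
    """The annotation is accepting iff every bag contains only safe states."""
    bags = minimal_bags(nodes, annotation, q0)
    return bags is not None and all(
        q in safe for bag in bags.values() for q, _ in bag
    )
```

Because each (state, move) pair is added to a bag at most once, the loop terminates on finite trees even though the two-way run itself may revisit nodes forever.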

C.3 Top Down Tree Automata as Acceptors of Program Trees
C.3.1 Grammar to Tree Automaton.The next task is to represent the set of programs generated by an input grammar G as a regular set of program trees.We will use the non-deterministic top-down tree automata for this purpose.More precisely, we will construct a tree automaton A G which accepts precisely the set of trees that correspond to the programs generated by G = (∆, St, NT , R).
We require that G conforms to the schema S discussed in Section 3.2.1.
Let us now define the components of A_G. The automaton has as states the non-terminals of G, and additionally there is a special start state q_0. That is, Q_G = {q_0} ⊎ NT, and I_G = {q_0}. The transitions are defined as follows.
Lemma 17. Let G be a grammar conforming to the schema S and let A_G be the tree automaton constructed above. Then, we have L

Let M = (Q, ∆, δ, q_0, g) be a single-tape alternating Turing machine (ATM) with exponential space bound, where Q and ∆ are finite sets of states and tape symbols, respectively. The transition function has the form δ : (Q × ∆) → P(Q × ∆ × {L, R}). Without loss of generality, we will assume that either there exist exactly two transitions (referred to as 0 and 1) for any particular configuration or none at all. The initial state is q_0 ∈ Q and g : Q → {acc, rej, ∧, ∨} maps states to their type. It will be convenient to represent machine configurations as sequences of tape symbols. For a given machine M, this can be achieved by working with a modified machine M′ whose alphabet Γ = ∆ ∪ (Q × ∆) contains the original symbols from ∆ as well as composite symbols from (Q × ∆) to encode both the machine head position and machine state. For example, for state q ∈ Q and regular symbol t ∈ ∆, the composite symbol (q, t) ∈ Γ encodes the tape head reading regular symbol t with the machine in state q. The transition function δ can easily be modified to account for this representational change, and we omit the details. A universal (resp. existential) configuration is a sequence of tape symbols containing one composite symbol (q, t), with g(q) = ∧ (resp. g(q) = ∨). A configuration is accepting (resp. rejecting) if its composite symbol (q, t) has g(q) = acc (resp. g(q) = rej).
Accepting (rejecting) states can be assumed to have no transitions and are thus halting. The set of accepting configurations is the smallest set S that contains (a) all configurations whose composite symbol (q, t) has g(q) = acc, (b) all universal configurations such that every configuration reachable within one transition belongs to S and (c) all existential configurations such that there is some configuration in S reachable within one transition. An ATM M accepts an input w if the initial configuration is accepting. In what follows, we will assume without loss of generality that existential configurations are immediately followed by universal configurations under any transition, and vice versa. Further, we assume the initial configuration is existential.
Our representation of configurations as sequences of tape symbols will allow us to work with a modified transition relation δ_W that is lifted to configuration windows. A configuration window is a triple of tape symbols. After retrofitting a given ATM with composite symbols, as mentioned above, the information in δ can be easily represented by δ_W ⊆ (Γ^3 × Γ), which relates triples of tape symbols (a window that views three adjacent tape cells) to symbols that the middle cell can legally transition to according to δ. For convenience in the reduction grammar we will overload δ_W, writing δ_W(0, t_i, t_j, t_k) to denote the tape symbol for the cell containing t_j (with t_i and t_k to the left and right) after the machine takes the 0 transition. Similarly, δ_W(1, t_i, t_j, t_k) will denote the tape symbol for the 1 transition. If the window (t_i, t_j, t_k) is ill-formed (for example, it may contain more than one composite machine state symbol) we say δ_W(0, t_i, t_j, t_k) and δ_W(1, t_i, t_j, t_k) are undefined. We will not discuss the details for handling corner cases in which the machine reads at either edge of the tape, noting that this can be easily dealt with by assuming special edge-of-tape symbols. In the forthcoming grammar, we 
assume |Γ| = k and use t_1 . . . t_k as constants to model the extended alphabet of the (appropriately modified) machine M. Indices for such constants are sometimes used to indicate the particular symbol they contain, e.g., t_blank refers to the unique blank symbol.
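The window relation δ_W can be made concrete with a short sketch. The encoding is an assumption made for illustration only: plain tape symbols are strings, composite symbols are `(state, symbol)` tuples, and `delta` maps a composite symbol to a list holding its two transitions `(new_state, written_symbol, direction)`. The function name `delta_w` is hypothetical, and edge-of-tape handling is left out, as in the text. Returning `None` for a halting head is a choice of this sketch; the text only calls ill-formed windows undefined.

```python
def is_composite(sym):
    """Composite symbols (head position plus state) are modelled as tuples."""
    return isinstance(sym, tuple)

def delta_w(choice, left, mid, right, delta):
    """Symbol of the middle cell after taking transition `choice` (0 or 1).

    Returns None when the window is ill-formed or the head has no transition.
    """
    heads = [s for s in (left, mid, right) if is_composite(s)]
    if len(heads) > 1:
        return None  # more than one head in the window: ill-formed
    if not heads:
        return mid  # the head is elsewhere, the cell is unchanged
    q, t = heads[0]
    moves = delta.get((q, t))
    if not moves:
        return None  # halting state (assumption: treated as undefined)
    new_state, written, direction = moves[choice]
    if is_composite(mid):
        return written  # head leaves the middle cell after writing
    if is_composite(left) and direction == "R":
        return (new_state, mid)  # head arrives from the left
    if is_composite(right) and direction == "L":
        return (new_state, mid)  # head arrives from the right
    return mid  # head moves away from the middle cell
```

With `delta = {("q", "a"): [("p", "b", "R"), ("p", "c", "L")]}`, the call `delta_w(0, ("q", "a"), "x", "y", delta)` yields the composite symbol `("p", "x")`, since the head moves right onto the middle cell.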

D.1 Reduction
Before presenting the reduction grammar G M,w , we discuss its primary components and summarize the purposes of its program variables.Recall that the goal of the reduction is to produce a grammar G M,w ∈ S that contains a correct program exactly when an AEXPSPACE turing machine M accepts an input w.Essentially, this amounts to building a grammar whose correct programs encode winning strategies for Eve in the two-player game semantics.That is, we want a correct program to exist in G M,w exactly when there is a configuration tree starting from the initial configuration of M on w, branching appropriately according to δ , and terminating in accepting leaf configurations.The grammar encodes the alternation by branching on a turn variable (Adam or Eve) and choosing the next transition accordingly.For Eve's turn, the grammar enforces a transition choice from the synthesizer, whereas Adam's turns are read from the uninterpreted function choice.Relying on our intuition that correct programs must satisfy assertions in every data model, we observe that this modeling decision captures the semantics of ATMs.Deeper in the grammar, we will find that after a move has been made, there is a mechanism for requiring that the next configuration (produced by the synthesizer) indeed follows from the selected move according to δ .As noted earlier, the full configuration cannot be represented in program variables all at once because M may use exponential space (and hence a grammar using exponentially many variables could not be produced in polynomial time).To circumvent this issue, the grammar uses a while loop to iterate through the full configuration, enforcing that the synthesizer produces the configuration contents one cell at a time.See Figure 2 for an illustration of this idea.Further, since the full configuration contents cannot be stored at once, the correctness check must distribute the work across all data models.For an input w with m = |w |, the grammar utilizes n = poly(m) 
index variables s_1 . . . s_n to point into the configuration. For any given data model, the index points at a single tape cell. For this single cell, the grammar enforces that all transitions are correct. Since uninterpreted programs must be correct in all data models, it follows that a correct program from the target grammar will witness the correctness of transitions for all tape cells. Finally, the grammar enforces that any leaf configurations are accepting. Table 1 describes the purposes of the symbols that appear in the grammar, several of which have not yet been mentioned. We denote vectors of variables with boldface, e.g., s_1 . . . s_n by s.
Figures 3, 5, and 4 present the full grammar G_{M,w}. The reader may note that G_{M,w} does not appear to conform to our grammar schema S. It is not hard to see that in fact G_{M,w} can be factored appropriately by introducing a polynomial number of new non-terminal symbols to produce the components involving fixed instruction sequences. Before arguing that our reduction is correct, we pause to mention a few presentation-related details and to explain the important rules for G_{M,w}.
To simplify the presentation of the grammar and promote its interesting rules, several simplifications and omissions were made. First, we omit else branches whenever they include only a skip statement. Second, several of the conditions in if and assume statements consist of boolean combinations of equality and disequality. These can be translated into semantically equivalent statements using sequences of nested if statements. For each condition in G_{M,w}, the translated code is of polynomial size. This crucially relies on the fact that each condition is already expressed in conjunctive normal form. We omit the precise translation, which is straightforward. Additionally, in a few places in the grammar we use boldface to denote the bitvector representation of a number n, i.e., n written in boldface. Equalities over bitvectors (e.g., if (b = n)) are ultimately handled using a conjunction of equalities on each bit. Overflow and underflow in the binary operation rules are not addressed, but could easily be fixed by adding conditional statements and keeping a few variables as flags to signal such events. Finally, when two bitvectors of unequal length are compared (see b and s in <Check>), the bitwise comparison of the least significant bits is intended.
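The bitvector conventions above can be sketched as follows. This is only an illustration under assumed conventions: bits are stored least significant first, and the helper names `to_bits` and `bitvector_eq` are hypothetical rather than part of the grammar.

```python
def to_bits(n, width):
    """Bitvector representation of n, least significant bit first."""
    return [(n >> i) & 1 for i in range(width)]

def bitvector_eq(b, s):
    """Conjunction of per-bit equalities.

    When the lengths differ, zip stops at the shorter vector, so only the
    least significant bits are compared, matching the convention in the text.
    """
    return all(x == y for x, y in zip(b, s))
```

For instance, `bitvector_eq(to_bits(5, 3), to_bits(13, 4))` holds because 13 and 5 agree on their three least significant bits.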
At this point, we encourage the reader to digest this paragraph by walking through Figure 3 as we trace the important parts of the grammar structure. The <Move> rule serves to extend the strategy tree by one move, or alternatively, finish it in <Base> by asserting that configurations are accepting. For extending the tree, the grammar checks which player's turn is next with a conditional statement. On one branch the synthesizer is allowed to choose a transition to make, and on the other the transition is determined by reading from choice. After determination of the next move in each branch we find the <Generate> rule, described next.
Fig. 3. Rules to impose the desired strategy tree structure and to check correctness of each move. Note that <Check> elides the full branching on possible windows, and uses a shorthand condition assert(cond_x) to denote assert(c_2 = δ_W(x, t_i, t_j, t_k)) whenever δ_W(x, t_i, t_j, t_k) is defined, and to denote assert(false) otherwise. The shorthand notation comp(t) holds for any composite symbol t = (x, q) ∈ Γ, and acc(t) holds for any composite symbol t = (x, q) with g(q) = acc. See Figures 5 and 4 for <Prelude> and <Increment>.
The <Generate> rule coordinates the simulation and checking of the most recent move. The synthesizer iteratively produces the contents of each tape cell inside a while loop. It is allowed to branch on the index variables that determine the current iteration (and not the secret index s) in order to decide which symbol to produce (see <ProduceCell>). The grammar checks correctness of a transition only for the particular cell it happens to be tracking (see <Check>). After this, the grammar repeats the process with another <Move>. Recall that in general, a strategy for Eve requires more memory than can be explicitly allocated in program variables. The grammar provides for this memory by placing <Generate> rules under each branch of the <Move> rule. This has the effect of using the program counter as memory, as mentioned earlier.
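The idea of checking only the tracked cell, and relying on quantification over all data models to cover every cell, can be illustrated with a toy sketch. Everything here is hypothetical: `shift_right` is a stand-in local rule rather than the machine's δ_W, and each value of `s` plays the role of one data model's secret index.

```python
def shift_right(left, mid, right):
    """Toy local rule (an assumption): each cell copies its left neighbour."""
    return left

def tracked_cell_ok(prev, nxt, s, local_rule, blank="_"):
    """Check the single cell s of nxt against the window around cell s of prev."""
    padded = [blank] + list(prev) + [blank]  # edge-of-tape padding
    return nxt[s] == local_rule(padded[s], padded[s + 1], padded[s + 2])

def correct_in_all_models(prev, nxt, local_rule):
    """A step is correct iff the tracked-cell check passes for every index s."""
    return all(
        tracked_cell_ok(prev, nxt, s, local_rule) for s in range(len(prev))
    )
```

If the produced configuration is wrong in exactly one cell k, the check fails only for `s = k`, which mirrors the argument that some data model exposes any incorrect cell.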
Our grammar can be produced in time polynomial in m = |w |.There are polynomially many grammar rules, and each is clearly of size polynomial in m.The key components that make this possible are the use of a bounded loop to produce tape contents, as well as the technique of distributing the problem of checking transition correctness across data models.Finally, memory is <Base> rule.We now proceed to describe only those program branches that are relevant to ensuring that all assertions hold.The <Prelude> rule corresponds to the root of T and produces the initial configuration, from which the initial window contents are set in c.
During traversal, if we reach in T a configuration c_{j−1} (j > 0) that is existential, choose dir := 0 (left) in the if branch of <Move> if it is the case that c_j is the left child of c_{j−1}. Otherwise choose dir := 1 (right). In <ProduceCell> under <Generate>, expand such that every possible cell index (valuation of b) is branched upon. Choose cell := t_s in the leaf of the branch for (b = i), where (c_j)_i = t_s. It is not hard to see that for every interpretation of the index s (where s ranges over possible cells to track) the transition assertion in <Check> will indeed hold: we have made sure to select the correct choice for dir, and the transitions in the computation tree are necessarily correct. If c_{j−1} is a universal configuration with left and right children c^l_j and c^r_j in T, proceed in a similar manner to that of the existential case for the left <Generate> subtree in the else branch of <Move>, and upon returning to this branch in the traversal, do the same for the <Generate> subtree on the right. Upon encountering a leaf of T (which is an accepting configuration), take the terminating <Base> rule under <Move>. This asserts that any cell containing a machine state symbol in fact contains the accept state symbol. Since our procedure has only ever produced cell contents corresponding to valid machine configurations that proceed according to the transition relation, it is the case that in every interpretation in which the tracked cell contains a state symbol we have c′_2 = q_acc.
It is not hard to see that any program from G_{M,w} will be coherent, as noted in our discussion about boolean programs. The grammar ensures that no memoizing failures are possible, since every variable is effectively boolean, with the exception of the hardcoded machinery for reading universal moves from the data model. In that case, terms are computed in a linear fashion and there is no chance for recomputation. Finally, all assumes are early by virtue of the fact that no variables appearing 
in equality conditions are ever used in a computation with an uninterpreted function.
(⇐) Suppose we have the derivation tree of a coherent program p ∈ G_{M,w} such that p satisfies its assertions. We are to show there is an accepting computation tree for M on w. Consider two cases:
Case: no moves. In this case p does not make any moves, which corresponds in its derivation tree to the <Move> production immediately rewriting to <Base>. Since p is correct, it satisfies its assertions (in all data models) and, in particular, it satisfies them in a model where the tracked cell contains the initial state symbol q_init. That is, the success of assert(c_2 = q_acc) implies that q_init = q_acc. Hence M has the initial configuration as a trivial accepting computation tree on w.
Case: some moves. Let us think of the depth of the derivation tree for p only in terms of the rules <S>, <Move>, <Generate>, and <Prelude>. This allows us to speak of the number of moves in p in terms of the depth of its derivation tree. Now, suppose p has a derivation tree in G_{M,w} of depth 2m + 2, with m > 0 (m is the number of moves). We build an accepting computation tree as follows.
Base: The root of the budding computation tree is c_0 = a_1 . . . a_{2^n}, where a_i is the tape symbol corresponding to the choice for the ith cell in the <Prelude> loop. We can simulate the loop (which is bounded) to obtain the cell contents. It is clear that this is the proper initial configuration for M on w.
Inductive: The inductive case proceeds similarly to the pre-order traversal from the other direction of our proof. Universal turns involve building two branches of the computation tree, whereas existential turns only build one. Once again, we ignore the infeasible branches of the derivation tree, and we can determine which branches these are by keeping track of the turn variable during the traversal. For existential turns in the derivation tree, we build the next configuration by inspecting the cell choices (via simulation of the bounded loop) in the <Generate> subtree (under the if branch of <Move>). Suppose the configuration c_next, so generated, were not correct. That is, c_next does not follow according to the transition relation δ_W from the computation tree parent c_prev. Then there must be some index k for which cell k of c_next is wrong according to δ_W. But there is then a model where (s = k) holds, and hence the corresponding assertion inside <Check> fails, contradicting the correctness of p. Universal turns in the derivation tree are processed similarly to existential turns by first traversing the left <Generate> subtree and later the right.
Finally, since the derivation tree is finite, the last derivation on every branch gives the assert for q acc .There is a model where the tracked cell contains the final machine state symbol.In that model, the satisfaction of the assertion ensures that the configuration at every leaf in the computation tree is indeed an accepting configuration.□

E TRANSITION SYSTEM EXPTIME HARDNESS
We show that the realizability and synthesis problems are EXPTIME-hard using a reduction from the membership problem for alternating PSPACE Turing machines.The goal of the reduction is to design a specification (A R and A S ) such that a correct transition system that satisfies it will witness an accepting computation for the PSPACE Turing machine.
The key to modeling the desired TM semantics in A_R is to observe that there is a relationship between the transitions of a specification automaton A_R and the nodes of a transition system TS that satisfies it. Notice that the only way for our transition systems to produce executions containing assume(x = y) is to branch at a check(x = y) node. Thus, any execution ending with an equality assumption is always accompanied by a corresponding execution ending with a disequality assumption instead. As in the program synthesis reduction grammar, we want to restrict our attention to only certain data models. For example, we want to make sure that the variables we use to model the TM tape cells initially contain the input symbols. In the program case, we used statements of the form assume(x = y) to achieve this. Here however, we introduce rules in the transition relation for A_R that allow reading either assume(x = y) or assume(x ≠ y). The state reached by reading the negated condition (x ≠ y in this example) will be an accepting state for A_R. This reflects the fact that we are uninterested in requiring anything of executions where the TM does not begin with the appropriate input symbols on its tape. See Figure 6 for a picture that illustrates this kind of modeling.
Assertions can be modeled in a similar way. Recall that, besides assignment statements, our transition systems are restricted to checking equality and disequality conditions and asserting false. Thus, to model assert(x = y) a transition system would first branch on check(x = y), proceeding with computation in the affirmative branch and reaching assert(false) in the negative branch. Such assertions can be enforced in A_R by introducing transitions for assume(x = y) and assume(x ≠ y), with the latter transitioning to an accepting state after reading assert(false). See Figure 7 for a picture that illustrates this kind of modeling.
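The pairing of executions at a check node, and the encoding of assert(x = y) as a check followed by assert(false), can be illustrated with a toy sketch. The tuple encoding of nodes, the restriction to finite tree-shaped systems, and the names `executions` and `assert_eq` are all assumptions made for illustration; `!=` stands for disequality in the printed statements.

```python
def executions(node):
    """Enumerate the complete executions of a finite, tree-shaped system."""
    kind = node[0]
    if kind == "halt":
        return [[]]
    if kind == "assert_false":
        return [["assert(false)"]]
    if kind == "assign":
        _, stmt, nxt = node
        return [[stmt] + e for e in executions(nxt)]
    if kind == "check":
        # A check node always contributes both an equality and a
        # disequality assumption, one per outgoing branch.
        _, x, y, yes, no = node
        return (
            [[f"assume({x} = {y})"] + e for e in executions(yes)]
            + [[f"assume({x} != {y})"] + e for e in executions(no)]
        )
    raise ValueError(f"unknown node kind: {kind}")

def assert_eq(x, y, cont):
    """Model assert(x = y): continue if equal, reach assert(false) otherwise."""
    return ("check", x, y, cont, ("assert_false",))
```

For the system `("assign", "x := f(y)", assert_eq("x", "y", ("halt",)))`, the sketch produces one execution ending in `assume(x = y)` and a companion execution ending in `assume(x != y)` followed by `assert(false)`.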
Having made the observation that much of the componentry from the program synthesis reduction grammar can be modeled in the transitions of the specification automaton A_R, we emphasize once more the crucial difference between program and transition system synthesis. If we attempted to recreate the 2EXPTIME-hardness proof in this setting, we would be unable to hide information from the synthesizing algorithm. Imagine that we try using variables to store the secret index of the tape cell being checked. In order for these variables to serve the purpose of the lower bound proof, they will eventually be involved in a check node. This has the effect of permanently leaking their values to the synthesis algorithm, which can make synthesis decisions on the basis of that information. Indeed, program specification in terms of grammars allows one to enforce the uniformity of synthesized code, whereas specification in terms of acceptable executions does not. This leads to an easier problem. We now give an overview of the reduction from alternating PSPACE TMs. The structure is quite similar to that of the reduction for grammar-restricted program synthesis, and we hence omit many details.

E.1 Gist of the Reduction
Given an alternating PSPACE TM M and input w with |w| = m, we must construct a specification consisting of deterministic execution automata A_R and A_S such that there is a correct transition system satisfying the specification exactly when M accepts w. We will assume that M uses a counter to ensure its termination in 2^{poly(m)} time. Let us now discuss the key aspects of A_R. We omit a full description, preferring to compare the main components to the corresponding ones in the reduction grammar for program synthesis. Note that A_S will accept the prefix closure of L(A_R). The prefix automaton A_S can be constructed by making every state accepting, with the exception of the absorbing reject state.
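The construction of the prefix automaton can be sketched for an ordinary deterministic finite automaton. The tuple representation and the names `prefix_automaton` and `accepts` are assumptions of this sketch. It also assumes, as the construction implicitly does, that every state other than the absorbing reject state can still reach an accepting state; otherwise the result would accept more than the prefix closure.

```python
def prefix_automaton(dfa, reject):
    """Make every state accepting except the absorbing reject state."""
    states, delta, start, _accepting = dfa
    return (states, delta, start, states - {reject})

def accepts(dfa, word):
    """Run a complete DFA, given as (states, delta, start, accepting), on word."""
    _states, delta, start, accepting = dfa
    q = start
    for a in word:
        q = delta[(q, a)]
    return q in accepting

# Toy automaton (hypothetical) for the single word "ab".
STATES = {0, 1, 2, "rej"}
DELTA = {
    (0, "a"): 1, (0, "b"): "rej",
    (1, "a"): "rej", (1, "b"): 2,
    (2, "a"): "rej", (2, "b"): "rej",
    ("rej", "a"): "rej", ("rej", "b"): "rej",
}
A_R = (STATES, DELTA, 0, {2})
A_S = prefix_automaton(A_R, "rej")
```

Here `A_S` accepts the empty word, "a" and "ab", which are exactly the prefixes of the one word accepted by `A_R`.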

E.2 Correctness
Now let us consider the correctness of the reduction. First suppose there is an accepting computation tree T for PSPACE ATM M on input w, where |w| = m. We must show there is a correct transition system T S whose executions are contained in the language of the automaton A R described above. The transition system somewhat resembles T. It models each alternation in T with a check(turn = 1) node, branching to simulate a move from either Adam or Eve according to the transitions in T. To simulate Eve, the system can choose an assignment node labeled by dir := 0 or dir := 1, depending on the corresponding transition in T. To simulate Adam, the system must go to a sequence of assignment nodes to read a move decision from the uninterpreted function choice, as in the grammar reduction. After this, the system is forced (by construction of A R) to generate correct updates for each tape cell variable, depending on which transition decision was made. Each choice can be determined by referring to configurations in the corresponding branches of T. Producing each tape symbol is accomplished with a sequence of check and assignment nodes. Finally, since T contains correct transitions, all of the complete executions for T S resulting from transition correctness checks will be correct. Similarly, complete executions arising from checking that an ending state is accepting will also be correct, since the leaves of T are accepting configurations.
In the other direction, given a correct transition system T S whose complete executions are in L(A R) and whose partial executions are in L(A S), an accepting computation tree T for M on w can be built by simulating T S (as in the grammar lower bound). Each transition in T will proceed according to the transition relation for M because T S (which is correct) has executions that assert this. Since T S has executions that assert the final tape contents constitute accepting configurations, every leaf of T will indeed be an accepting configuration. Note that we have assumed that machine M keeps a counter to ensure termination in exponential time. Thus no correct T S that satisfies the specification can go on simulating (without halting) beyond this time bound, since it correctly simulates all machine transitions. Finally, all executions allowed by A R are easily seen to be coherent.

Fig. 1. Motivating programs with holes to be filled by sub-programs.

Example 1. Consider the program in Figure 1 (left). This program has a hole '⟨⟨ ?? | Cannot . . .⟩⟩' that we intend to fill with a sub-program so that the entire program (together with the contents of the hole) satisfies the assertion at the end. The sub-program corresponding to the hole is allowed to use the variable cipher as well as some additional program variables y 1 , . . ., y n (for some fixed n), but is not allowed to refer to key and secret in any manner. Intuitively, the above setting models the encryption of a secret message secret with a key key. The assumption in the second line of the program models the fact that the secret message can be decrypted from cipher and key. Here, the functions enc and dec are uninterpreted functions and thus the program we are looking for is an uninterpreted program. For such a program, the assertion at the end, "assert(z = secret)", holds if it holds for all models, i.e., for all interpretations of enc and dec, and for all initial values of the different variables. With this setup, we are essentially asking whether a program that does not have access to key can recover secret. It is easy to see that there is indeed no program that satisfies the above requirement. The above modeling of keys, encryption, nonces, etc. is common

Further, A G can be constructed in time O(|G|) and has size O(|G|). □

D DETAILED PROOF OF 2EXPTIME HARDNESS

Figures 3, 4, and 5 present the full grammar G M,w. The reader may note that G M,w does not appear to conform to our grammar schema S. It is not hard to see that in fact G M,w can be factored appropriately by introducing a polynomial number of new non-terminal symbols to produce the components involving fixed instruction sequences. Before arguing that our reduction is correct, we pause to mention a few presentation-related details and to explain the important rules for G M,w.
To simplify the presentation of the grammar and promote its interesting rules, several simplifications and omissions were made. First, we omit else branches whenever they include only a skip statement. Second, several of the conditions in if and assume statements consist of boolean combinations of equality and disequality. These can be translated into semantically equivalent statements using sequences of nested if statements. For each condition in G M,w, the translated code is of polynomial size. This crucially relies on the fact that each condition is already expressed in conjunctive normal form. We omit the precise translation, which is straightforward. Additionally, in a few places in the grammar we use boldface to denote the bitvector representation of a number n. Equalities over bitvectors (e.g. if (b = n)) are ultimately handled using a conjunction of equalities on each bit. Overflow and underflow in the binary operation rules are not addressed, but could easily be fixed by adding conditional statements and keeping a few variables as flags to signal such events. Finally, when two bitvectors of unequal length are compared (see b and s in <Check>), the bitwise comparison of the least significant bits is intended.
At this point, we encourage the reader to digest this paragraph by walking through Figure 3 as we trace the important parts of the grammar structure. The <Move> rule serves to extend the strategy tree by one move, or alternatively, finish it in <Base> by asserting that configurations are accepting. For extending the tree, the grammar checks which player's turn is next with a conditional statement. On one branch the synthesizer is allowed to choose a transition to make, and on the other the transition is determined by reading from choice. After determination of the next move

Table 1. Summary of purposes for grammar variables.
w 1 . . . w m: store input tape contents
t 1 . . . t k: constants to represent tape symbols
s 1 . . . s n: index holding location of the cell being checked
s ′ 1 . . . s ′ n: index holding location of binary predecessor to s 1 . . . s n
s ′′ 1 . . . s ′′ n: index holding location of binary successor to s 1 . . . s n
b 1 . . . b n+1: index for pointing at current tape cell during iteration
0, 1: constants used for indices as well as move choices
cell: holds the putative current tape symbol
turn: holds 0 or 1, represents which player has the next move
dir: holds 0 or 1, the most recent move choice
c 1 , c 2 , c 3: the previous contents of the configuration window for the cell being checked
c ′ 1 , c ′ 2 , c ′ 3: the next contents of the configuration window for the cell being checked
choice: uninterpreted function that models Adam's moves, restricted to 0 and 1 with an assume statement
next: uninterpreted function that captures the time-dependence of Adam's decisions
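The treatment of bitvector equalities such as if (b = n), described in the presentation notes for G M,w, can be sketched as follows. This is only an illustration under stated assumptions: the helper names, the bit naming b1, b2, . . ., the most-significant-bit-first ordering, and the concrete output syntax are all hypothetical and are not taken from the grammar itself.

```python
def bit_equalities(name, n, width):
    """Expand `name = n` into one equality per bit.

    Assumes bits are named name1..name<width>, most significant first,
    and compared against the constants 0 and 1.
    """
    if n < 0 or n >= 2 ** width:
        raise ValueError("n does not fit in the given width")
    bits = format(n, "0{}b".format(width))
    return [("{}{}".format(name, i), bit) for i, bit in enumerate(bits, start=1)]


def nested_if(equalities, body):
    """Render a conjunction of equalities as nested if statements.

    Else branches are omitted, following the convention that they
    contain only a skip statement.
    """
    code = body
    for var, bit in reversed(equalities):
        code = "if ({} = {}) {{ {} }}".format(var, bit, code)
    return code


# Example: the condition b = 5 over a 3-bit vector.
eqs = bit_equalities("b", 5, 3)
program = nested_if(eqs, "skip")
```

Each bit contributes one conditional, so the rendered code grows linearly with the width of the bitvector, in line with the polynomial-size translation claimed above.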