Proving Unrealizability for Syntax-Guided Synthesis

We consider the problem of automatically establishing that a given syntax-guided-synthesis (SyGuS) problem is unrealizable (i.e., has no solution). Existing techniques have quite limited ability to establish unrealizability for general SyGuS instances in which the grammar describing the search space contains infinitely many programs. By encoding the synthesis problem's grammar G as a nondeterministic program P_G, we reduce the unrealizability problem to a reachability problem such that, if a standard program-analysis tool can establish that a certain assertion in P_G always holds, then the synthesis problem is unrealizable. Our method can be used to augment any existing SyGuS tool so that it can establish that a successfully synthesized program q is optimal with respect to some syntactic cost -- e.g., q has the fewest possible if-then-else operators. Using known techniques, grammar G can be automatically transformed to generate exactly all programs with lower cost than q -- e.g., fewer conditional expressions. Our algorithm can then be applied to show that the resulting synthesis problem is unrealizable. We implemented the proposed technique in a tool called NOPE. NOPE can prove unrealizability for 59/132 variants of existing linear-integer-arithmetic SyGuS benchmarks, whereas all existing SyGuS solvers lack the ability to prove that these benchmarks are unrealizable, and time out on them.


Introduction
The goal of program synthesis is to find a program in some search space that meets a specification (e.g., satisfies a set of examples or a logical formula). Recently, a large family of synthesis problems has been unified into a framework called syntax-guided synthesis (SyGuS). A SyGuS problem is specified by a regular-tree grammar that describes the search space of programs, and a logical formula that constitutes the behavioral specification. Many synthesizers now support a specific format for SyGuS problems [1], and compete in annual synthesis competitions [2]. Thanks to these competitions, these solvers are now quite mature and are finding a wealth of applications [9].
Consider the SyGuS problem to synthesize a function f that computes the maximum of two variables x and y, denoted by (ψ_max2(f, x, y), G_1). The goal is to create e_f, an expression-tree for f, where e_f is in the language of a regular-tree grammar G_1 whose productions include Plus, IfThenElse, and GreaterThan, together with the leaves x, y, 0, and 1, and ∀x, y. ψ_max2(⟦e_f⟧, x, y) is valid, where ⟦e_f⟧ denotes the meaning of e_f, and ψ_max2(f, x, y) := f(x, y) ≥ x ∧ f(x, y) ≥ y ∧ (f(x, y) = x ∨ f(x, y) = y).
Although many solvers can now efficiently find solutions to many SyGuS problems, there has been effectively no work on the much harder task of proving that a given SyGuS problem is unrealizable, i.e., that it does not admit a solution. For example, consider the SyGuS problem (ψ_max2(f, x, y), G_2), where G_2 is the more restricted grammar with if-then-else operators and conditions stripped out:

Start ::= Plus(Start, Start) | x | y | 1 | 0

This SyGuS problem does not have a solution, because no expression generated by G_2 meets the specification (see footnote 1). However, to the best of our knowledge, current SyGuS solvers cannot prove that such a SyGuS problem is unrealizable (see footnote 2).

Footnote 1. Grammar G_2 only generates terms that are equivalent to some linear function of x and y; however, the maximum function cannot be described by a linear function.

Footnote 2. The synthesis problem presented above is one that is generated by a recent tool called QSyGuS, which extends SyGuS with quantitative syntactic objectives [10]. The advantage of using quantitative objectives in synthesis is that they can be used to produce higher-quality solutions (e.g., smaller, more readable, more efficient). The synthesis problem (ψ_max2(f, x, y), G_2) arises from a QSyGuS problem in which the goal is to produce an expression that (i) satisfies the specification ψ_max2(f, x, y), and (ii) uses the smallest possible number of if-then-else operators. Existing SyGuS solvers can easily produce a solution that uses one if-then-else operator, but cannot prove that no better solution exists, i.e., that (ψ_max2(f, x, y), G_2) is unrealizable.

A key property of the previous example is that the grammar is infinite. When such a SyGuS problem is realizable, any search technique that systematically explores the infinite search space of possible programs will eventually identify a solution to the synthesis problem. In contrast, proving that a problem is unrealizable requires showing that every program in the infinite search space fails to satisfy the specification. This problem is in general undecidable [6]. Although we cannot hope to have an algorithm that establishes unrealizability in every case, the challenge is to find a technique that succeeds for the kinds of problems encountered in practice. Existing synthesizers can detect the absence of a solution in certain cases (e.g., because the grammar is finite, or is infinite but only generates a finite number of functionally distinct programs). However, as our experiments show, this ability is limited in practice: no existing solver was able to show unrealizability for any of the examples considered in this paper.

In this paper, we present a technique for proving that a possibly infinite SyGuS problem is unrealizable. Our technique builds on two ideas.

1. We observe that unrealizability can often be proven using finitely many input examples. In §2, we show how the example discussed above can be proven to be unrealizable using four input examples: (0, 0), (0, 1), (1, 0), and (1, 1).
2. We devise a way to encode a SyGuS problem (ψ(f, x), G) over a finite set of examples E as a reachability problem in a recursive program P[G, E]. In particular, the program that we construct has an assertion that holds if and only if the given SyGuS problem is unrealizable over E. Consequently, unrealizability can be proven by establishing that the assertion always holds. This property can often be established by a conventional program-analysis tool.
The encoding mentioned in item 2 is non-trivial for three reasons. The following list explains each issue and sketches how it is addressed.

1) Infinitely many terms. We need to model the infinitely many terms generated by the grammar of a given synthesis problem (ψ(f, x), G). To address this issue, we use non-determinism and recursion, and give an encoding P[G, E] such that (i) each non-deterministic path p in the program P[G, E] corresponds to a possible expression e_p that G can generate, and (ii) for each expression e that G can generate, there is a path p_e in P[G, E]. (There is an isomorphism between paths and the expression-trees of G.)

2) Nondeterminism. We need the computation performed along each path p in P[G, E] to mimic the execution of expression e_p. Because the program uses non-determinism, we need to make sure that, for a given path p in the program P[G, E], computational steps are carried out that mimic the evaluation of e_p for each of the finitely many example inputs in E. We address this issue by threading the expression-evaluation computations associated with each example in E through the same non-deterministic choices.

3) Complex specifications. We need to handle specifications that allow for nested calls of the programs being synthesized. For instance, consider the specification f(f(x)) = x. To handle this specification, we introduce a new variable y and rewrite the specification as f(x) = y ∧ f(y) = x. Because y is now also used as an input to f, we thread both the computations of x and y through the non-deterministic recursive program.
Our work makes the following contributions:

- We reduce the SyGuS unrealizability problem to a reachability problem to which standard program-analysis tools can be applied (§2 and §4).
- We observe that, for many SyGuS problems, unrealizability can be proven using finitely many input examples, and use this idea to apply the Counter-Example-Guided Inductive Synthesis (CEGIS) algorithm to the problem of proving unrealizability (§3).
- We give an encoding of a SyGuS problem (ψ(f, x), G) over a finite set of examples E as a reachability problem in a nondeterministic recursive program P[G, E], which has the following property: if a certain assertion in P[G, E] always holds, then the synthesis problem is unrealizable (§4).
- We implement our technique in a tool nope using the ESolver synthesizer [2] as the SyGuS solver and the SeaHorn tool [8] for checking reachability. nope is able to establish unrealizability for 59 out of 132 variants of benchmarks taken from the SyGuS competition. In particular, nope solves all benchmarks with no more than 15 productions in the grammar and requiring no more than 9 input examples for proving unrealizability. Existing SyGuS solvers lack the ability to prove that these benchmarks are unrealizable, and time out on them.

§6 discusses related work. Some additional technical material, proofs, and full experimental results are given in Apps. A, B, and C, respectively.

Illustrative Example
In this section, we illustrate the main components of our framework for establishing the unrealizability of a SyGuS problem.
Recall from §1 that SyGuS solvers can easily find a solution to (ψ_max2(f, x, y), G_1), such as e := IfThenElse(GreaterThan(x, y), x, y), whereas no expression generated by the more restricted grammar G_2 (with if-then-else operators and conditions stripped out) meets the specification. As our running example, we use the problem (ψ_max2(f, x, y), G_2) discussed in §1, and show how unrealizability can be proven using four input examples: (0, 0), (0, 1), (1, 0), and (1, 1).
Our method can be seen as a variant of Counter-Example-Guided Inductive Synthesis (CEGIS), in which the goal is to create a program P in which a certain assertion always holds. Until such a program is created, each round of the algorithm returns a counter-example, from which we extract an additional input example for the original SyGuS problem. On the i-th round, the current set of input examples E_i is used, together with the grammar (in this case G_2) and the specification of the desired behavior (ψ_max2(f, x, y)), to create a candidate program P[G_2, E_i]. The program P[G_2, E_i] contains an assertion, and a standard program analyzer is used to check whether the assertion always holds.
Suppose that for the SyGuS problem (ψ_max2(f, x, y), G_2) we start with just the one example input (0, 1), i.e., E_1 = {(0, 1)}. Fig. 1 shows the initial program P[G_2, E_1] that our method creates. The function spec implements the predicate ψ_max2(f, x, y). (All of the programs {P[G_2, E_i]} use the same function spec.) The initialization statements "int x_0 = 0; int y_0 = 1;" at line (21) in procedure main correspond to the input example (0, 1). The recursive procedure Start encodes the productions of grammar G_2. Start is non-deterministic; it contains four calls to an external function nd(), which returns a non-deterministically chosen Boolean value. The calls to nd() can be understood as controlling whether or not a production is selected from G_2 during a top-down, left-to-right generation of an expression-tree: lines (3)-(8) correspond to "Start ::= Plus(Start, Start)," and lines (10), (11), (12), and (13) correspond to "Start ::= x," "Start ::= y," "Start ::= 1," and "Start ::= 0," respectively.

The code in the five cases in the body of Start encodes the semantics of the respective production of G_2; in particular, the statements that are executed along any execution path of P[G_2, E_1] implement the bottom-up evaluation of some expression-tree that can be generated by G_2. For instance, consider path (1), a sequence of statements (with some statement numbers elided for brevity) in which the markers (_Start and )_Start indicate entry to, and return from, procedure Start, respectively. Path (1) corresponds to the top-down, left-to-right generation of the expression-tree Plus(x,1), interleaved with the tree's bottom-up evaluation.
Note that with path (1), when control returns to main, variable I_0 has the value 1, and thus the assertion at line (23) fails.
A sound program analyzer will discover that some such path exists in the program, and will return the sequence of non-deterministic choices required to follow one such path. Suppose that the analyzer chooses to report path (1); the sequence of choices would be t, f, t, f, f, f, t, which can be decoded to create the expression-tree Plus(x,1). At this point, we have a candidate definition for f: f = x + 1. This formula can be checked using an SMT solver to see whether it satisfies the behavioral specification ψ_max2(f, x, y). In this case, the SMT solver returns "false." One counter-example that it could return is (0, 0).
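The round just described can be mimicked in a few lines of Python. The sketch below is not the program of Fig. 1; it is a minimal stand-in, assuming that Start tests the productions in the order Plus, x, y, 1, 0 and that each call to nd() consumes the next entry of a list of Booleans. The names `spec`, `start`, and `trace` are illustrative.

```python
from typing import Iterator, List


def spec(x: int, y: int, f: int) -> bool:
    """psi_max2: f bounds x and y from above and equals one of them."""
    return f >= x and f >= y and (f == x or f == y)


def start(choices: Iterator[bool], x: int, y: int, trace: List[str]) -> int:
    """Stand-in for procedure Start; next(choices) plays the role of nd()."""
    if next(choices):              # Start ::= Plus(Start, Start)
        trace.append("Plus")
        left = start(choices, x, y, trace)
        right = start(choices, x, y, trace)
        return left + right
    if next(choices):              # Start ::= x
        trace.append("x")
        return x
    if next(choices):              # Start ::= y
        trace.append("y")
        return y
    if next(choices):              # Start ::= 1
        trace.append("1")
        return 1
    trace.append("0")              # Start ::= 0 (default case)
    return 0


# The choice sequence t, f, t, f, f, f, t reported for path (1).
choices = [True, False, True, False, False, False, True]
trace: List[str] = []
value = start(iter(choices), 0, 1, trace)   # evaluate on the example (0, 1)

# The assertion in main corresponds to "not spec(...)"; it fails on this path.
assertion_holds = not spec(0, 1, value)
# The candidate f = x + 1 is refuted on the new input (0, 0).
refuted_on_00 = not spec(0, 0, 0 + 1)
```

Here `trace` records the productions in generation order (Plus, x, 1), `value` is 1, and `assertion_holds` is false, which matches the failing assertion described above; `refuted_on_00` confirms that (0, 0) is a valid counter-example for f = x + 1.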
As can be seen from the comments in the two programs, P[G_2, E_4] has the same overall structure as P[G_2, E_1]:

- main begins with initialization statements for the four example inputs.
- Start has five cases that correspond to the five productions of G_2.

The main difference is that, because the encoding of G_2 in Start uses non-determinism, we need to make sure that along each path p in P[G_2, E_4], each of the example inputs is used to evaluate the same expression-tree. We address this issue by threading the expression-evaluation computations associated with each of the example inputs through the same non-deterministic choices. That is, each of the five "production cases" in Start has four encodings of the production's semantics, one for each of the four expression evaluations. By this means, the statements that are executed along path p perform four simultaneous bottom-up evaluations of the expression-tree from G_2 that corresponds to p. Programs P[G_2, E_2] and P[G_2, E_3] are similar, but their paths carry out two and three simultaneous bottom-up evaluations, respectively.

The actions taken during rounds 2 and 3 to generate a new counter-example, and hence a new example input, are similar to what was described for round 1. On round 4, however, the program analyzer will determine that the assertion on lines (34)-(35) always holds, which means that there is no path through P[G_2, E_4] for which the behavioral specification holds for all of the input examples. This property means that there is no expression-tree that satisfies the specification, i.e., the SyGuS problem (ψ_max2(f, x, y), G_2) is unrealizable.
Our implementation uses the program-analysis tool SeaHorn [8] as the assertion checker. In the case of P[G_2, E_4], SeaHorn takes only 0.5 seconds to establish that the assertion in P[G_2, E_4] always holds.
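For this particular example, the claim that four inputs suffice can also be sanity-checked by hand. The Python sketch below is not what SeaHorn does; it is an ad hoc check that relies on two assumptions specific to G_2 and these examples: every leaf evaluates to a non-negative value, and Plus is monotone, so any subterm whose value exceeds max(x, y) on some example can be discarded. Under those assumptions, the set of "threaded" output vectors (one output per example) has a finite pruned fixpoint. The names `witnesses` and `E4` are illustrative.

```python
from itertools import product
from typing import List, Tuple

Vector = Tuple[int, ...]


def spec(x: int, y: int, f: int) -> bool:
    """psi_max2: f bounds x and y from above and equals one of them."""
    return f >= x and f >= y and (f == x or f == y)


def witnesses(examples: List[Tuple[int, int]]) -> List[Vector]:
    """Output vectors of G_2 terms that satisfy spec on every example."""
    bound = tuple(max(x, y) for x, y in examples)

    def within(v: Vector) -> bool:
        # Values only grow under Plus, so anything above the bound is useless.
        return all(a <= b for a, b in zip(v, bound))

    leaves = [
        tuple(x for x, _ in examples),   # Start ::= x
        tuple(y for _, y in examples),   # Start ::= y
        tuple(1 for _ in examples),      # Start ::= 1
        tuple(0 for _ in examples),      # Start ::= 0
    ]
    seen = {v for v in leaves if within(v)}
    changed = True
    while changed:                       # least fixpoint under threaded Plus
        changed = False
        for u, v in product(list(seen), repeat=2):
            w = tuple(a + b for a, b in zip(u, v))
            if within(w) and w not in seen:
                seen.add(w)
                changed = True
    return sorted(
        v for v in seen
        if all(spec(x, y, f) for (x, y), f in zip(examples, v))
    )


E4 = [(0, 0), (0, 1), (1, 0), (1, 1)]
no_term_for_four = witnesses(E4)        # empty: no G_2 term fits all four
term_for_three = witnesses(E4[:3])      # non-empty: e.g., Plus(x, y)
```

With all four examples the witness list is empty, whereas with only the first three the vector (0, 1, 1) produced by Plus(x, y) survives, which illustrates why additional CEGIS rounds are needed before unrealizability can be concluded.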

Background
Trees and Tree Grammars. A ranked alphabet is a tuple (Σ, rk_Σ) where Σ is a finite set of symbols and rk_Σ : Σ → ℕ associates a rank to each symbol. For every m ≥ 0, the set of all symbols in Σ with rank m is denoted by Σ^(m). In our examples, a ranked alphabet is specified by showing the set Σ and attaching the respective rank to every symbol as a superscript, e.g., Σ = {+^(2), c^(0)}. (For brevity, the superscript is sometimes omitted.) We use T_Σ to denote the set of all (ranked) trees over Σ, i.e., T_Σ is the smallest set such that (i) Σ^(0) ⊆ T_Σ, and (ii) if σ^(k) ∈ Σ^(k) and t_1, ..., t_k ∈ T_Σ, then σ^(k)(t_1, ..., t_k) ∈ T_Σ. In what follows, we assume a fixed ranked alphabet (Σ, rk_Σ).
In this paper, we focus on typed regular tree grammars, in which each nonterminal and each symbol is associated with a type. There is a finite set of types {τ_1, ..., τ_k}. Associated with each symbol σ^(i) ∈ Σ^(i), there is a type assignment a_σ^(i) = (τ_0, τ_1, ..., τ_i), where τ_0 is called the left-hand-side type and τ_1, ..., τ_i are called the right-hand-side types. Tree grammars are similar to word grammars, but generate trees over a ranked alphabet instead of words.

Definition 1 (Regular Tree Grammar). A typed regular tree grammar (RTG) is a tuple G = (N, Σ, S, a, δ), where N is a finite set of non-terminal symbols of arity 0; Σ is a ranked alphabet; S ∈ N is an initial non-terminal; a is a type assignment that gives types for members of Σ ∪ N; and δ is a finite set of productions of the form A_0 → σ^(i)(A_1, ..., A_i), where each A_j ∈ N is a non-terminal whose type agrees with the type assignment of σ^(i), i.e., if a_σ^(i) = (τ_0, τ_1, ..., τ_i), then a_A_j = τ_j for 0 ≤ j ≤ i.

In a SyGuS problem, each variable, such as x and y in the example RTGs in §1, is treated as an arity-0 symbol, i.e., x^(0) and y^(0).

Given a tree t ∈ T_Σ∪N, applying a production r = A → β to t produces the tree t′ resulting from replacing the left-most occurrence of A in t with the right-hand side β. A tree t ∈ T_Σ is generated by the grammar G, denoted by t ∈ L(G), iff it can be obtained by applying a sequence of productions r_1 ⋯ r_n to the tree whose root is the initial non-terminal S.
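The derivation step just defined can be made concrete with a small Python sketch. The representation is an assumption made for illustration only: a tree is a tuple whose first component is a symbol and whose remaining components are subtrees, a bare non-terminal is a one-element tuple, and types are ignored. The names `G2`, `apply_leftmost`, and `derive` are hypothetical.

```python
from typing import Dict, List, Tuple

Tree = tuple  # (symbol, child_1, ..., child_k); a non-terminal is ("Start",)

# Grammar G_2: the productions of non-terminal Start, in a fixed order.
G2: Dict[str, List[Tree]] = {
    "Start": [
        ("Plus", ("Start",), ("Start",)),
        ("x",),
        ("y",),
        ("1",),
        ("0",),
    ]
}


def apply_leftmost(tree: Tree, head: str, rhs: Tree) -> Tuple[Tree, bool]:
    """Replace the left-most occurrence of non-terminal `head` with `rhs`."""
    symbol, *children = tree
    if symbol == head and not children:
        return rhs, True
    new_children = []
    replaced = False
    for child in children:
        if not replaced:
            child, replaced = apply_leftmost(child, head, rhs)
        new_children.append(child)
    return (symbol, *new_children), replaced


def derive(grammar: Dict[str, List[Tree]], start: str,
           steps: List[Tuple[str, int]]) -> Tree:
    """Apply productions (head, index) in sequence, starting from `start`."""
    tree: Tree = (start,)
    for head, index in steps:
        tree, replaced = apply_leftmost(tree, head, grammar[head][index])
        if not replaced:
            raise ValueError(f"no occurrence of {head} left to expand")
    return tree


# Plus(Start, Start), then Start ::= x, then Start ::= 1.
plus_x_1 = derive(G2, "Start", [("Start", 0), ("Start", 1), ("Start", 3)])
```

The resulting `plus_x_1` is the tree Plus(x, 1); because it contains no remaining non-terminals, it belongs to L(G_2) in the sense of the definition above.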
Syntax-Guided Synthesis. A SyGuS problem is specified with respect to a background theory T (e.g., linear arithmetic), and the goal is to synthesize a function f that satisfies two constraints provided by the user. The first constraint, ψ(f, x), describes a semantic property that f should satisfy. The second constraint limits the search space S of f, and is given as a set of expressions specified by an RTG G that defines a subset of all terms in T.

Definition 2 (SyGuS).
A SyGuS problem over a background theory T is a pair sy = (ψ(f, x), G) where G is a regular tree grammar that only contains terms in T, i.e., L(G) ⊆ T, and ψ(f, x) is a Boolean formula constraining the semantic behavior of the synthesized program f.
A SyGuS problem is realizable if there exists an expression e ∈ L(G) such that ∀x. ψ(⟦e⟧, x) is true. Otherwise we say that the problem is unrealizable.

Theorem 1 (Undecidability [6]). Given a SyGuS problem sy, it is undecidable to check whether sy is realizable.
Counterexample-Guided Inductive Synthesis. The Counterexample-Guided Inductive Synthesis (CEGIS) algorithm is a popular approach to solving synthesis problems. Instead of directly looking for an expression that satisfies the specification ψ on all possible inputs, the CEGIS algorithm uses a synthesizer S that can find expressions that are correct on a finite set of examples E. If S finds a solution that is correct on all elements of E, CEGIS uses a verifier V to check whether the discovered solution is also correct for all possible inputs to the problem. If not, a counterexample obtained from V is added to the set of examples, and the process repeats. More formally, CEGIS starts with an empty set of examples E and repeats the following steps:

1. Call the synthesizer S to find an expression e such that ψ^E(⟦e⟧, x) := ∀x ∈ E. ψ(⟦e⟧, x) holds and go to step 2; return unrealizable if no expression exists.
2. Call the verifier V to find a model c for the formula ¬ψ(⟦e⟧, x), and add c to the counterexample set E; return e as a valid solution if no model is found.

Because SyGuS problems are only defined over first-order decidable theories, any SMT solver can be used as the verifier V to check whether the formula ¬ψ(⟦e⟧, x) is satisfiable. On the other hand, providing a synthesizer S to find solutions such that ∀x ∈ E. ψ(⟦e⟧, x) holds is a much harder problem because e is a second-order term drawn from an infinite search space. In fact, checking whether such an e exists is an undecidable problem [6].
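The two CEGIS steps can be sketched in Python as follows. This is a toy illustration under strong simplifying assumptions, not the setting of the paper: the synthesizer S searches a short, finite list of hand-written candidates (instead of an infinite grammar), and the verifier V checks a small grid of integers (instead of calling an SMT solver). All names (`CANDIDATES`, `synthesize`, `verify`, `cegis`) are hypothetical.

```python
from itertools import product
from typing import Callable, List, Optional, Tuple

Example = Tuple[int, int]
Candidate = Tuple[str, Callable[[int, int], int]]


def spec(x: int, y: int, f: int) -> bool:
    """psi_max2: f bounds x and y from above and equals one of them."""
    return f >= x and f >= y and (f == x or f == y)


CANDIDATES: List[Candidate] = [
    ("x", lambda x, y: x),
    ("y", lambda x, y: y),
    ("Plus(x,y)", lambda x, y: x + y),
    ("IfThenElse(GreaterThan(x,y),x,y)", lambda x, y: x if x > y else y),
]


def synthesize(candidates: List[Candidate],
               examples: List[Example]) -> Optional[Candidate]:
    """Toy S: first candidate that satisfies spec on every example."""
    for name, fn in candidates:
        if all(spec(x, y, fn(x, y)) for x, y in examples):
            return name, fn
    return None


def verify(fn: Callable[[int, int], int],
           domain: range = range(-3, 4)) -> Optional[Example]:
    """Toy V: return an input in a bounded grid that violates spec, if any."""
    for x, y in product(domain, repeat=2):
        if not spec(x, y, fn(x, y)):
            return (x, y)
    return None


def cegis(candidates: List[Candidate]):
    """Return (verdict, solution name or None, examples collected)."""
    examples: List[Example] = []
    while True:
        found = synthesize(candidates, examples)      # step 1
        if found is None:
            return "unrealizable", None, examples
        name, fn = found
        counterexample = verify(fn)                   # step 2
        if counterexample is None:
            return "realizable", name, examples
        examples.append(counterexample)


with_ite = cegis(CANDIDATES)          # full candidate list
without_ite = cegis(CANDIDATES[:3])   # if-then-else candidate removed
```

With the full list the loop ends with the if-then-else candidate; with that candidate removed, the synthesizer eventually finds nothing consistent with the collected examples and the loop reports unrealizable. The loop terminates here only because the candidate list is finite, which is exactly what fails for the infinite grammars considered in this paper.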
The main contribution of our paper is a reduction of the unrealizability problem (i.e., the problem of proving that there is no expression e ∈ L(G) such that ∀x ∈ E. ψ(⟦e⟧, x) holds) to an unreachability problem (§4). This reduction allows us to use existing (un)reachability verifiers to check whether a SyGuS instance is unrealizable.

CEGIS and Unrealizability
The CEGIS algorithm is sound but incomplete for proving unrealizability. Given a SyGuS problem sy = (ψ(f, x), G) and a finite set of inputs E, we denote with sy^E := (ψ^E(f, x), G) the corresponding SyGuS problem that only requires the function f to be correct on the examples in E.
Even when given a perfect synthesizer S (i.e., one that can solve a problem sy^E for every possible set E), there are SyGuS problems for which the CEGIS algorithm is not powerful enough to prove unrealizability.

Lemma 2 (Incompleteness).
There exists an unrealizable SyGuS problem sy such that for every finite set of examples E the problem sy^E is realizable.
Despite this negative result, we will show that a CEGIS algorithm can prove unrealizability for many SyGuS instances ( §5).
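One way to see why a statement such as Lemma 2 can hold is the following instance, which is our own illustration and not necessarily the one used in the paper's proof. Assume a grammar with the productions Start ::= Plus(1, Start) | 0, which generates exactly the non-negative integer constants, and the specification ψ(f, x) := f(x) > x over the integers. Every finite example set admits a large enough constant, yet every constant c fails on the input x = c. The function names below are hypothetical.

```python
from typing import List


def witness_constant(examples: List[int]) -> int:
    """A non-negative constant c with c > x for every x in `examples`."""
    return max([0] + [x + 1 for x in examples])


def term_for(c: int) -> str:
    """Expression-tree for the constant c in Start ::= Plus(1, Start) | 0."""
    tree = "0"
    for _ in range(c):
        tree = f"Plus(1,{tree})"
    return tree


def refuting_input(c: int) -> int:
    """An input on which the constant function c violates f(x) > x."""
    return c


# For every finite E there is a solution on E, but it fails on some input.
checks = []
for E in ([], [3], [-5, -1], [0, 7, 2]):
    c = witness_constant(E)
    solves_E = all(c > x for x in E)
    fails_somewhere = not (c > refuting_input(c))
    checks.append((E, term_for(c) if c <= 4 else f"<constant {c}>",
                   solves_E, fails_somewhere))
```

Each entry of `checks` records that the chosen constant solves the restricted problem on E while being refuted on another input, so a CEGIS loop on this instance would keep receiving new counter-examples without ever reaching an unrealizable sub-problem.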

From Unrealizability to Unreachability
In this section, we show how a SyGuS problem for finitely many examples can be reduced to a reachability problem in a non-deterministic, recursive program in an imperative programming language.

Reachability Problems
A program P takes an initial state I as input and outputs a final state O, i.e., ⟦P⟧(I) = O, where ⟦·⟧ denotes the semantic function of the programming language. As illustrated in §2, we allow a program to contain calls to an external function nd(), which returns a non-deterministically chosen Boolean value. When program P contains calls to nd(), we use P[n] to denote the program that is the same as P except that P[n] takes an additional integer input n, and each call nd() is replaced by a call to a local function nextbit() defined as follows: bool nextbit(){bool b = n%2; n = n >> 1; return b;}. In other words, the integer parameter n of P[n] formalizes all of the non-deterministic choices made by P in calls to nd().
For the programs P [G, E] used in our unrealizability algorithm, the only calls to nd() are ones that control whether or not a production is selected from grammar G during a top-down, left-to-right generation of an expression-tree.Given n, we can decode it to identify which expression-tree n represents.
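The decoding step can be sketched in Python for the grammar G_2 of §2. The sketch assumes, as in the description of Start, that the productions are tested in the order Plus, x, y, 1, 0, with the last production as the default case; the class `Choices` mirrors nextbit() by returning the low-order bit of n and shifting n right. The names `Choices`, `PRODUCTIONS`, and `decode` are illustrative.

```python
class Choices:
    """Bit-stream view of a non-negative integer n, mirroring nextbit()."""

    def __init__(self, n: int) -> None:
        self.n = n
        self.consumed = 0   # how many bits have been read so far

    def nextbit(self) -> bool:
        bit = self.n % 2 == 1
        self.n >>= 1
        self.consumed += 1
        return bit


# Branch order assumed for Start; the last entry is the default (else) case.
PRODUCTIONS = ["Plus", "x", "y", "1", "0"]


def decode(choices: Choices) -> str:
    """Rebuild the expression-tree of G_2 selected by the bits of n."""
    symbol = PRODUCTIONS[-1]
    for candidate in PRODUCTIONS[:-1]:
        if choices.nextbit():        # a true bit selects this production
            symbol = candidate
            break
    if symbol == "Plus":
        left = decode(choices)       # top-down, left-to-right generation
        right = decode(choices)
        return f"Plus({left},{right})"
    return symbol


stream = Choices(0b1000101)          # bits read low-order first: t,f,t,f,f,f,t
tree = decode(stream)
```

Reading the bits of 1000101 (base 2) from the low-order end yields the choice sequence t, f, t, f, f, f, t from §2, so `tree` is Plus(x,1) and `stream.consumed` is 7.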

Reduction to Reachability
The main component of our framework is an encoding enc that, given a SyGuS problem sy^E = (ψ^E(f, x), G) over a finite set of examples E, produces the reachability problem re_enc(sy,E) over the program P[G, E].

Remark: In this section, we assume that in the specification ψ(f, x) every occurrence of f has x as its input parameter. We show how to overcome this restriction in §A.1. In the following, we assume that the input x has type τ_I, where τ_I could be a complex type (e.g., a tuple type).
Program construction.Recall that the grammar G is a tuple (N, Σ, S, a, δ).
We now turn to the correctness of the construction. First, we formalize the relationship between expression-trees in L(G), the semantics of P[G, E], and the number n. Given an expression-tree e, we assume that each node q in e is annotated with the production that has produced that node. Recall that δ(A) = {r_1, ..., r_m} is the set of productions with head A (where the subscripts are indexes in some arbitrary, but fixed order). Concretely, for every node q, we assume there is a function pr(q) = (A, i), which associates q with a pair that indicates that non-terminal A produced q using the production r_i (i.e., r_i is the i-th production whose left-hand-side non-terminal is A).

We now define how we can extract a number #(e) for which the program P[G, E][#(e)] will exhibit the same semantics as that of the expression-tree e. First, for every node q in e such that pr(q) = (A, i), where δ(A) contains m productions, we define the following number, written as a bit-string:

#_nd(q) := 1 0^(i-1) if i < m, and #_nd(q) := 0^(m-1) if i = m,

where 0^(j) denotes a run of j zero bits. The number #_nd(q) indicates what suffix of the value of n will cause funcA to trigger the code corresponding to production r_i. Let q_1 ⋯ q_m be the sequence of nodes visited during a pre-order traversal of expression-tree e. The number corresponding to e, denoted by #(e), is defined as the bit-vector #_nd(q_m) ⋯ #_nd(q_1).
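The mapping from an expression-tree to its number can be checked on the running example with a short Python sketch. It assumes the branch layout described for Start in §2: production r_i is selected by i-1 false bits followed by one true bit, and the last production is the default case reached after m-1 false bits. The tuple-based tree representation and the names `nd_bits`, `preorder`, and `encode` are illustrative.

```python
from typing import Iterator, List

# Productions of Start in G_2, in the fixed order r_1, ..., r_5.
RULES: List[str] = ["Plus", "x", "y", "1", "0"]


def nd_bits(i: int, m: int) -> str:
    """Bit-string that selects the i-th of m productions (1-based i)."""
    if i < m:
        return "1" + "0" * (i - 1)   # i-1 false bits, then a true bit
    return "0" * (m - 1)             # default case: m-1 false bits


def preorder(tree: tuple) -> Iterator[str]:
    """Yield the symbols of a (symbol, child_1, ..., child_k) tree in pre-order."""
    symbol, *children = tree
    yield symbol
    for child in children:
        yield from preorder(child)


def encode(tree: tuple) -> str:
    """Bit-string of #(e): the chunk of the last visited node comes first."""
    chunks = [nd_bits(RULES.index(s) + 1, len(RULES)) for s in preorder(tree)]
    return "".join(reversed(chunks))


bits = encode(("Plus", ("x",), ("1",)))
number = int(bits, 2)
```

For Plus(x,1) the chunks are 1 (Plus), 10 (x), and 1000 (the constant 1), so `bits` is 1000101, the same number that §2 associates with this tree; because the program reads n from the low-order end, the chunk of the first visited node sits in the right-most position.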
Finally, we add the entry-point of the program, which calls the function funcS corresponding to the initial non-terminal S, and contains the assertion that encodes our reachability problem on all the input examples E = {c_1, ..., c_k}. Each procedure funcA[n](i_1, ..., i_k) that we construct has an explicit dependence on variable n, where n controls the non-deterministic choices made by funcA and the procedures called by funcA. As a consequence, when relating numbers and expression-trees, there are two additional issues to contend with:

Non-termination. Some numbers can cause funcA[n] to fail to terminate. For instance, if the case for "Start ::= Plus(Start, Start)" in program P[G_2, E_1] from Fig. 1 were moved from the first branch (lines (3)-(8)) to the final else case (line (13)), the number n = 0 = ...0000000 (base 2) would cause Start to never terminate, due to repeated selections of Plus nodes. However, note that the only assert statement in the program is placed at the end of the main procedure. Now, consider a value of n such that re_enc(sy,E) is satisfiable. Defn. 3 implies that the flow of control will reach and falsify the assertion, which implies that funcA[n] terminates.

Shared suffixes of sufficient length. In Ex. 1, we showed how for program P[G_2, E_1] (Fig. 1) the number n = 1000101 (base 2) corresponds to the top-down, left-to-right generation of Plus(x,1). That derivation consumed exactly seven bits; thus, any number that, written in base 2, shares the suffix 1000101 (e.g., 11010101000101) will also generate Plus(x,1).
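The shared-suffix observation can be reproduced with a small Python sketch that counts how many bits a derivation consumes and then masks n down to that many low-order bits. As before, the branch order Plus, x, y, 1, 0 for G_2 is an assumption taken from the description of Start, the input is assumed to be a non-negative integer, and the names `decode` and `minimal_suffix` are hypothetical.

```python
from typing import Tuple


def decode(n: int) -> Tuple[str, int]:
    """Return (expression-tree, bits consumed) for a non-negative n."""
    consumed = 0

    def nextbit() -> bool:
        nonlocal n, consumed
        bit = n & 1
        n >>= 1
        consumed += 1
        return bool(bit)

    def start() -> str:
        if nextbit():                    # Start ::= Plus(Start, Start)
            left = start()
            right = start()
            return f"Plus({left},{right})"
        for leaf in ("x", "y", "1"):     # Start ::= x | y | 1
            if nextbit():
                return leaf
        return "0"                       # Start ::= 0 (default case)

    tree = start()
    return tree, consumed


def minimal_suffix(n: int) -> int:
    """Keep only the low-order bits that the derivation actually reads."""
    _, consumed = decode(n)
    return n & ((1 << consumed) - 1)


short_tree, short_bits = decode(0b1000101)
long_tree, long_bits = decode(0b11010101000101)
suffix = minimal_suffix(0b11010101000101)
```

Both numbers decode to Plus(x,1) after reading seven bits, and `suffix` equals 1000101 (base 2), i.e., the higher-order bits of the longer number never influence the execution.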
The issue of shared suffixes is addressed in the following lemma:

Lemma 4. For every non-terminal A and number n such that ⟦funcA[n]⟧(i_1, ..., i_k) ≠ ⊥ (i.e., funcA terminates when the non-deterministic choices are controlled by n), there exists a minimal n′ that is a (base 2) suffix of n for which (i) there is an e ∈ L(G) such that #(e) = n′, and (ii) for every input {i_1, ..., i_k}, we have ⟦funcA[n]⟧(i_1, ..., i_k) = ⟦funcA[n′]⟧(i_1, ..., i_k).

We are now ready to state the correctness property of our construction.
Theorem 2. Given a SyGuS problem sy^E = (ψ^E(f, x), G) over a finite set of examples E, the problem sy^E is realizable iff re_enc(sy,E) is satisfiable.

Implementation and Evaluation
nope is a tool that can return two-sided answers to unrealizability problems of the form sy = (ψ, G). When it returns unrealizable, no expression-tree in L(G) satisfies ψ; when it returns realizable, some e ∈ L(G) satisfies ψ; nope can also time out. nope incorporates several existing pieces of software.

1. The (un)reachability verifier SeaHorn is applied to the reachability problems of the form re_enc(sy,E) created during successive CEGIS rounds.
2. The SMT solver Z3 is used to check whether a generated expression-tree e satisfies ψ. If it does, nope returns realizable (along with e); if it does not, nope creates a new input example to add to E.
It is important to observe that SeaHorn, like most reachability verifiers, is only sound for unsatisfiability-i.e., if SeaHorn returns unsatisfiable, the reachability problem is indeed unsatisfiable.Fortunately, SeaHorn's one-sided answers are in the correct direction for our application: to prove unrealizability, nope only requires the reachability verifier to be sound for unsatisfiability.
There is one aspect of nope that differs from the technique that has been presented earlier in the paper. While SeaHorn is sound for unreachability, it is not sound for reachability, i.e., it cannot soundly prove whether a synthesis problem is realizable. To address this problem, to check whether a given SyGuS problem sy^E is realizable on the finite set of examples E, nope also calls the SyGuS solver ESolver [2] to synthesize an expression-tree e that satisfies sy^E. In practice, for every intermediate problem sy^E generated by the CEGIS algorithm, nope runs ESolver on sy^E and SeaHorn on re_enc(sy,E) in parallel. If ESolver returns a solution e, SeaHorn is interrupted, and Z3 is used to check whether e satisfies ψ. Depending on the outcome, nope either returns realizable or obtains an additional input example to add to E. If SeaHorn returns unsatisfiable, nope returns unrealizable.
Modulo bugs in its constituent components, nope is sound for both realizability and unrealizability, but because of Lemma 2 and the incompleteness of SeaHorn, nope is not complete for unrealizability.

Benchmarks. We perform our evaluation on 132 variants of the 60 LIA benchmarks from the LIA SyGuS competition track [2]. We do not consider the other SyGuS benchmark track, Bit-Vectors, because the SeaHorn verifier is unsound for most bit-vector operations (e.g., bit-shifting). We used three suites of benchmarks. LimitedIf (resp. LimitedPlus) contains 57 (resp. 30) benchmarks in which the grammar bounds the number of times an IfThenElse (resp. Plus) operator can appear in an expression-tree to be 1 less than the number required to solve the original synthesis problem. We used the tool Quasi to automatically generate the restricted grammars. LimitedConst contains 45 benchmarks in which the grammar allows the program to contain only constants that are coprime to any constants that may appear in a valid solution (e.g., the solution requires using odd numbers, but the grammar only contains the constant 2). The numbers of benchmarks in the three suites differ because for certain benchmarks it did not make sense to create a limited variant; e.g., if the smallest program consistent with the specification contains no IfThenElse operators, no variant is created for the LimitedIf benchmark. In all our benchmarks, the grammars describing the search space contain infinitely many terms.
Our experiments were performed on an Intel Core i7 4.00GHz CPU, with 32GB of RAM, running Lubuntu 18.10 via VirtualBox.We used version 4.8 of Z3, commit 97f2334 of SeaHorn, and commit d37c50e of ESolver.The timeout for each individual SeaHorn/ESolver call is set at 10 minutes.
Experimental Questions.Our experiments were designed to answer the questions posed below.
EQ 1. Can nope prove unrealizability for variants of real SyGuS benchmarks, and how long does it take to do so?

Finding: nope can prove unrealizability for 59/132 benchmarks. For the 59 benchmarks solved by nope, the average time taken is 15.59s. The time taken to perform the last iteration of the algorithm (i.e., the time taken by SeaHorn to return unsatisfiable) accounts for 87% of the total running time.
nope can solve all of the LimitedIf benchmarks for which the grammar allows at most one IfThenElse operator.Allowing more IfThenElse operators in the grammar leads to larger programs and larger sets of examples, and consequently the resulting reachability problems are harder to solve for SeaHorn.
For a similar reason, nope can solve only one of the LimitedPlus benchmarks.All other LimitedPlus benchmarks allow 5 or more Plus statements, resulting in grammars that have at least 130 productions.
nope can solve all LimitedConst benchmarks because these require few examples and result in small encoded programs.

EQ 2. How many examples does nope use to prove unrealizability and how does the number of examples affect the performance of nope?
Note that Z3 can produce different models for the same query, and thus different runs of nope can produce different sequences of examples. Hence, there is no guarantee that nope finds a good sequence of examples that proves unrealizability. One measure of success is whether nope is generally able to find a small number of examples when it succeeds in proving unrealizability.
Finding: nope used 1 to 9 examples to prove unrealizability for the benchmarks on which it terminated. Problems requiring large numbers of examples could not be solved because either ESolver or SeaHorn times out; e.g., on the problem max4, nope gets to the point where the CEGIS loop has generated 17 examples, at which point ESolver exceeds the timeout threshold.

Related Work
The SyGuS formalism was introduced as a unifying framework to express several synthesis problems [1]. Caulfield et al. [6] proved that it is undecidable to determine whether a given SyGuS problem is realizable. Despite this negative result, there are several SyGuS solvers that compete in yearly SyGuS competitions [2] and can efficiently produce solutions to SyGuS problems when a solution exists. Existing SyGuS synthesizers fall into three categories: (i) Enumeration solvers enumerate programs with respect to a given total order [7]. If the given problem is unrealizable, these solvers typically only terminate if the language of the grammar is finite or contains finitely many functionally distinct programs. While in principle certain enumeration solvers can prune infinite portions of the search space, none of these solvers could prove unrealizability for any of the benchmarks considered in this paper. (ii) Symbolic solvers reduce the synthesis problem to a constraint-solving problem [3]. These solvers cannot reason about grammars that restrict allowed terms, and resort to enumeration whenever the candidate solution produced by the constraint solver is not in the restricted search space. Hence, they also cannot prove unrealizability. (iii) Probabilistic synthesizers randomly search the search space, and are typically unpredictable [14], providing no guarantees in terms of unrealizability.

Synthesis as Reachability. CETI [12] introduces a technique for encoding template-based synthesis problems as reachability problems. The CETI encoding only applies to the specific setting in which (i) the search space is described by an imperative program with a finite number of holes (i.e., the values that the synthesizer has to discover) and (ii) the specification is given as a finite number of input-output test cases with which the target program should agree. Because the number of holes is finite, and all holes correspond to values (and not terms), the reduction to a reachability problem only involves making the holes global variables in the program (and no more elaborate transformations).
In contrast, our reduction technique handles search spaces that are described by a grammar, which in general consist of an infinite set of terms (not just values). Due to this added complexity, our encoding has to account for (i) the semantics of the productions in the grammar, and (ii) the use of non-determinism to encode the choice of grammar productions. Our encoding creates one expression-evaluation computation for each of the example inputs, and threads these computations through the program so that each expression-evaluation computation makes use of the same set of non-deterministic choices.
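To make the threading concrete, the following minimal C sketch evaluates one non-deterministically chosen expression on two examples in lock-step. It is an illustration under assumptions, not the program that the encoding generates: the toy grammar Start ::= Plus(Start, Start) | x | 1 | 0, the bit-stream implementation of nd(), and the names K, g_Start, and eval_lockstep are all hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

#define K 2  /* number of examples evaluated in lock-step (assumed) */

/* Hypothetical stand-in for the verifier's non-determinism: nd() reads the
   bits of one integer, low-order bit first, and yields false once exhausted. */
static unsigned long nd_bits;

static bool nd(void) {
    bool b = (nd_bits & 1UL) != 0;
    nd_bits >>= 1;
    return b;
}

/* One result slot per example, in the spirit of per-example globals. */
static int g_Start[K];

/* Assumed toy grammar: Start ::= Plus(Start, Start) | x | 1 | 0.
   Each nd() call is made once and its outcome applies to all K examples,
   so every example is evaluated on the same expression-tree. */
static void funcStart(const int xs[K], int depth) {
    /* Each Plus consumes a 1 bit, so the depth cannot exceed the bit width. */
    assert(depth <= (int)(8 * sizeof(unsigned long)));
    if (nd()) {                       /* Start ::= Plus(Start, Start) */
        int left[K];
        funcStart(xs, depth + 1);     /* left child */
        for (int i = 0; i < K; i++) left[i] = g_Start[i];
        funcStart(xs, depth + 1);     /* right child */
        for (int i = 0; i < K; i++) g_Start[i] = left[i] + g_Start[i];
    } else if (nd()) {                /* Start ::= x */
        for (int i = 0; i < K; i++) g_Start[i] = xs[i];
    } else if (nd()) {                /* Start ::= 1 */
        for (int i = 0; i < K; i++) g_Start[i] = 1;
    } else {                          /* Start ::= 0 */
        for (int i = 0; i < K; i++) g_Start[i] = 0;
    }
}

/* Evaluates the expression selected by n on all K inputs at once. */
void eval_lockstep(unsigned long n, const int xs[K], int out[K]) {
    nd_bits = n;
    funcStart(xs, 0);
    for (int i = 0; i < K; i++) out[i] = g_Start[i];
}
```

With n = 37 (100101 in base 2), the drawn choices select Plus(x, 1) in this toy grammar, so the inputs 3 and 10 yield 4 and 11 from a single set of choices.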
Using the input-threading, our technique can handle specifications that contain nested calls of the synthesized program (e.g., f(f(x)) = x); see §A.1. The input-threading technique builds a product program that performs multiple executions of the same function, as done in relational program verification [4]. Alternatively, a different encoding could use multiple function invocations on individual inputs and require the verifier to thread the same bit-stream for all input evaluations. In general, verifiers perform much better on product programs [4], which motivates our choice of encoding.
Unrealizability in Program Synthesis. For certain synthesis problems (e.g., reactive synthesis [5]), the realizability problem is decidable. The framework tackled in this paper, SyGuS, is orthogonal to such problems, and it is undecidable to check whether a given SyGuS problem is realizable [6].
Mechtaev et al. [11] propose to use a variant of SyGuS to efficiently prune irrelevant paths in a symbolic-execution engine. In their approach, for each path $\pi$ in the program, a synthesis problem $p_\pi$ is generated so that if $p_\pi$ is unrealizable, the path $\pi$ is infeasible. The synthesis problems generated by Mechtaev et al. (which are not directly expressible in SyGuS) are decidable because the search space is defined by a finite set of templates, and the synthesis problem can be encoded by an SMT formula. To the best of our knowledge, our technique is the first one that can check unrealizability of general SyGuS problems in which the search space is an infinite set of functionally distinct terms.
Consider the following semantic specification that involves multiple invocations of the function f on different arguments, as well as nested function calls:
By introducing new input variables and performing the proper refactoring, we can rewrite $\psi_1$ as the following specification, where f is always called on a single input variable:
It is now easy to adapt our encoding to operate over this new specification. For instance, assume that the input grammar has a production $A \to \pi_1$ that generates an access on the first parameter of the function to be synthesized, and assume that we currently only have one input example. The corresponding code for the production would be
    funcA(int v_x, int v_y1, int v_y2, int v_y3, int v_y4) {
      if (nd()) {
        x_1_A = v_x;    // Computing f(x)
        y1_1_A = v_y1;  // Computing f(y_1)
        y3_1_A = v_y3;  // Computing f(y_3)
      }
      ...
In summary, thanks to the ability to execute a finite number of inputs in lock-step, our encoding can handle specifications that contain nested function invocations.

A.2 Overcoming a Quirk of SeaHorn
Because SeaHorn is unsound for satisfiability, it can report that some expression-tree satisfies behavioral specification $\psi$, when in fact no such expression-tree exists. In effect, SeaHorn overapproximates the set of reachable states, and erroneously concludes that the assertion in $re_{enc(sy,E)}$ can be falsified (i.e., all example inputs satisfy $\psi$). We encountered this situation in our experiments; for some unknown reason, when the following two productions were included in the grammar, SeaHorn would report that $re_{enc(sy,E)}$ was satisfiable in cases when it should have reported unsatisfiable:
For the examples on which this happened, we found that we could delete these two productions, which resulted in a grammar of equivalent expressiveness.
Proof. The proof is by structural induction on $e$. Let $q$ denote the root of $e$, and $A \to \sigma^{(j)}(A_1, \ldots, A_j)$ denote the production instance at $q$. Note that $\#(e) = \#(e_j) \cdots \#(e_1)\,\#_{nd}(q)$.
Inductive step: Let $e = \sigma^{(j)}(e_1, \ldots, e_j)$, where the property to be shown is assumed to hold for each of the $e_l$. For each $e_l$, let $q_l$ be the root of $e_l$.
The procedure $\texttt{funcA}[\#(e)]$ uses $\#_{nd}(q)$ to select the branch $B$ in funcA that captures the semantics of the production $A \to \sigma^{(j)}(A_1, \ldots, A_j)$. For every input set $\{i_1, \ldots, i_k\}$, the induction hypothesis ensures that the following property holds: for $1 \le l \le j$, $(\llbracket e_l \rrbracket(i_1), \ldots, \llbracket e_l \rrbracket(i_k)) = \llbracket \texttt{funcA}_l[\#(e_l)] \rrbracket(i_1, \ldots, i_k)$. Therefore, each call to a procedure $\texttt{funcA}_l$ in $B$ computes the $k$ intermediate answers that correspond to the evaluation of $e_l$ on the $k$ values $\{i_1, \ldots, i_k\}$. The code in $B$ that follows the final call to $\texttt{funcA}_j$ uses the collections of intermediate results to finish $k$ computations of the semantics of $A \to \sigma^{(j)}(A_1, \ldots, A_j)$. Therefore, $(\llbracket e \rrbracket(i_1), \ldots, \llbracket e \rrbracket(i_k)) = \llbracket \texttt{funcA}[\#(e)] \rrbracket(i_1, \ldots, i_k)$ holds.
⊓⊔
Lemma 4. For every non-terminal $A$ and number $n$ such that $\llbracket \texttt{funcA}[n] \rrbracket(i_1, \ldots, i_k) \neq \bot$ (i.e., funcA terminates when the non-deterministic choices are controlled by $n$), there exists a minimal $n'$ that is a (base 2) suffix of $n$ for which (i) there is an $e \in L(G)$ such that $\#(e) = n'$, and (ii) for every input $\{i_1, \ldots, i_k\}$,

C Supplementary Evaluation Results
The complete results of our evaluation are shown in Tables 1 and 2. For brevity, in Table 1 we omit consecutive benchmarks on which NOPE times out; e.g., the "..." between benchmarks max4 and max15 represents 10 benchmarks from max5 to max14 for which NOPE times out. The tables present the number of nonterminals and the number of productions in the grammar of each benchmark, the number of examples used to prove unrealizability, the total time taken by NOPE, and the time taken by SeaHorn for the last (un)reachability problem. For benchmarks on which NOPE times out, the value given for "number of examples" is the number of examples generated by the CEGIS loop when NOPE times out.

Example 1. Consider again the SyGuS problem $(\psi_{max2}(f, x, y), G_2)$ discussed in §2. In the discussion of the initial program $P[G_2, E_1]$ (Fig. 1), we hypothesized that the program analyzer chose to report path (1) in $P$, for which the sequence of non-deterministic choices is t, f, t, f, f, f, t. That sequence means that for $P[n]$, the value of $n$ is 1000101 (base 2) (or 69 (base 10)). The 1s, from low-order to high-order position, represent choices of production instances in a top-down, left-to-right generation of an expression-tree. (The 0s represent rejected possible choices.) The rightmost 1 in $n$ corresponds to the choice in line (3) of "Start ::= Plus(Start, Start)"; the 1 in the third-from-rightmost position corresponds to the choice in line (10) of "Start ::= x" as the left child of the Plus node; and the 1 in the leftmost position corresponds to the choice in line (12) of "Start ::= 1" as the right child. By this means, we learn that the behavioral specification $\psi_{max2}(f, x, y)$ holds for the example set $E_1 = \{(0, 1)\}$ for f → Plus(x,1). ⊓⊔
Definition 3 (Reachability Problem). Given a program $P[n]$, containing assertion statements and a non-deterministic integer input $n$, we use $re_P$ to denote the corresponding reachability problem. The reachability problem $re_P$ is satisfiable if there exists a value $n$ that, when bound to $n$, falsifies any of the assertions in $P[n]$. The problem is unsatisfiable otherwise.
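The correspondence in Example 1 between the integer n and the generated expression-tree can be replayed with a small C sketch. The grammar used here, Start ::= Plus(Start, Start) | x | y | 1 | 0, is an assumed simplification whose production order is chosen to be consistent with the choice sequence t, f, t, f, f, f, t; it is not the program of Fig. 1, and nd_bits, eval_with_choices, and draws_used are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical harness: the non-deterministic input n is consumed one bit
   at a time, low-order bit first. Once n is exhausted, nd() returns false. */
static unsigned long nd_bits;  /* remaining bits of n */
static int nd_draws;           /* number of bits drawn so far */

static bool nd(void) {
    bool b = (nd_bits & 1UL) != 0;
    nd_bits >>= 1;
    nd_draws++;
    return b;
}

/* Assumed toy grammar (not the paper's G_2):
   Start ::= Plus(Start, Start) | x | y | 1 | 0
   A true draw selects the production; a false draw rejects it and moves on.
   The last production is taken when all earlier ones were rejected. */
static int funcStart(int x, int y, int depth) {
    /* Each Plus consumes a 1 bit, so the depth cannot exceed the bit width. */
    assert(depth <= (int)(8 * sizeof(unsigned long)));
    if (nd()) {                 /* Start ::= Plus(Start, Start) */
        int left = funcStart(x, y, depth + 1);   /* left child first */
        int right = funcStart(x, y, depth + 1);  /* then right child */
        return left + right;
    } else if (nd()) {          /* Start ::= x */
        return x;
    } else if (nd()) {          /* Start ::= y */
        return y;
    } else if (nd()) {          /* Start ::= 1 */
        return 1;
    } else {                    /* Start ::= 0 */
        return 0;
    }
}

/* Evaluates the expression-tree selected by n on the input (x, y). */
int eval_with_choices(unsigned long n, int x, int y) {
    nd_bits = n;
    nd_draws = 0;
    return funcStart(x, y, 0);
}

/* Number of nd() draws made by the most recent evaluation. */
int draws_used(void) {
    return nd_draws;
}
```

Under these assumptions, eval_with_choices(69, 0, 1) draws the seven bits t, f, t, f, f, f, t, builds Plus(x, 1), and returns 1, which equals max(0, 1) on the example (0, 1).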
First, for each non-terminal $A \in N$, the program $P[G, E]$ contains $k$ global variables {g_1_A, ..., g_k_A} of type a(A) that are used to express the values resulting from evaluating expressions generated from non-terminal $A$ on the $k$ examples. Second, for each non-terminal $A \in N$, the program $P[G, E]$ contains a function void funcA($\tau_I$ v1, ..., $\tau_I$ vk) { bodyA }. We denote by $\delta(A) = \{r_1, \ldots, r_m\}$ the set of production rules of the form $A \to \beta$ in $\delta$. The body bodyA of funcA has the following structure:

Fig. 3: Time vs examples.
Finding: The number of examples required to prove unrealizability depends mainly on the arity of the synthesized function and the complexity of the grammar. The number of examples seems to grow quadratically with the number of bounded operators allowed in the grammar. In particular, problems in which the grammar allows zero IfThenElse operators require 2-4 examples, while problems in which the grammar allows one IfThenElse operator require 7-9 examples.
Figure 3 plots the running time of NOPE against the number of examples generated by the CEGIS algorithm.
Finding: The solving time appears to grow exponentially with the number of examples required to prove unrealizability.


Proof. Assume that the computation $\llbracket \texttt{funcA}[n] \rrbracket(i_1, \ldots, i_k)$ terminates. Let $b_1, \ldots, b_j$ be the finite sequence of bits drawn by nd() throughout the computation.
Proof of (i): Let $e$ be the expression-tree generated top-down, left-to-right using the sequence $b_1, \ldots, b_j$. Let $n'$ be the binary number $b_j \cdots b_1$. Because $\#(e)$ is the concatenation, in right-to-left order, of the sequence of $\#(\cdot)$ values for the nodes of $e$ visited during a pre-order traversal, $\#(e) = n'$.
Proof of (ii): Property (ii) holds because $n$ and $n'$ agree on the (base 2) suffix $b_j \cdots b_1$, and exactly $j$ bits are used during the executions of both funcA
over a set of examples $E = \{c_1, \ldots, c_k\}$, outputs a program $P[G, E]$ such that $sy^E$ is realizable if and only if $re_{enc(sy,E)}$ is satisfiable. In this section, we define all the components of $P[G, E]$, and state the correctness properties of our reduction.
First, the program $P[G, E]$ will now operate over input examples of the form $c = \{w_1, \ldots, w_k\}$, where each example $c$ is a tuple corresponding to the values of variables $\{x, y_1, y_2, y_3, y_4\}$. Second, the program will need to compute the values of all possible calls of f on the various input parameters. Hence, for every expression $f(z)$ in $\psi_2$, non-terminal $A$, and example $w_i$, the program $P[G, E]$ will have a global variable z_i_A computing the value of the expression generated by $A$ parametrized by $z$, with respect to the values in input example $w_i$.
over a finite set of examples $E$, $sy^E$ is realizable $\iff$ $re_{enc(sy,E)}$ is satisfiable.
Proof. ⇒ direction: Assume that $sy^E$ is realizable. Then there exists an expression $e \in L(G) = L(G, S)$ such that $\forall x \in E.\ \psi(\llbracket e \rrbracket, x)$. By Lemma 3, for every $\{i_1, \ldots, i_k\}$, $(\llbracket e \rrbracket(i_1), \ldots, \llbracket e \rrbracket(i_k)) = \llbracket \texttt{funcA}[\#(e)] \rrbracket(i_1, \ldots, i_k)$. Hence, the assertion in program $enc(sy, E)$ is false and the reachability problem $re_{enc(sy,E)}$ is satisfiable.
⇐ direction: Assume that $re_{enc(sy,E)}$ is satisfiable. Then there exists a value of $n$ that makes the assertion in program $enc(sy, E)$ false (i.e., the specification holds for all inputs $c_i \in E$). By Lemma 4, there exists a minimal $n'$ for which the program has equivalent semantics (in particular, the assertion in $enc(sy, E)$ is still false), and there exists an expression $e \in L(G)$ such that $\#(e) = n'$. Hence, $e$ is a solution to SyGuS problem $sy^E$; i.e., $sy^E$ is realizable. ⊓⊔

Table 1: Performance of NOPE on LimitedIf and LimitedPlus benchmarks. ✗ denotes a timeout.

Table 2: Performance of NOPE on LimitedConst benchmarks. ✗ denotes a timeout.