Syntax-Guided Synthesis with Quantitative Syntactic Objectives

Abstract. Automatic program synthesis promises to increase the productivity of programmers and end-users of computing devices by automating tedious and error-prone tasks. Despite the practical successes of program synthesis, we still do not have systematic frameworks to synthesize programs that are "good" according to certain metrics (e.g., programs of reasonable size or with good runtime) and to understand when synthesis can result in such good programs. In this paper, we propose QSyGuS, a unifying framework for describing syntax-guided synthesis problems with quantitative objectives over the syntax of the synthesized programs. QSyGuS builds on weighted (tree) grammars, a clean and foundational formalism that provides flexible support for different quantitative objectives, useful closure properties, and practical decision procedures. We then present an algorithm for solving QSyGuS. Our algorithm leverages closure properties of weighted grammars to generate intermediate problems that can be solved using non-quantitative SyGuS solvers. Finally, we implement our algorithm in a tool, QuaSi, and evaluate it on 26 quantitative extensions of existing SyGuS benchmarks. QuaSi can synthesize optimal solutions in 15/26 benchmarks with times comparable to those needed to find an arbitrary solution.


Introduction
The goal of program synthesis is to find a program in some search space that meets a specification, e.g., a set of examples or a logical formula. Recently, a large family of synthesis problems has been unified into a framework called syntax-guided synthesis (SyGuS). A SyGuS problem is specified by a context-free grammar describing the search space of programs, and a logical formula describing the specification. Many synthesizers now support this format [2] and annually compete in synthesis competitions [4]. Thanks to these competitions, these solvers are now quite mature and are finding wide application [14].
While the logical specification mechanism provided by SyGuS is powerful, it can only capture the functional requirements of the synthesis problem, e.g., that the program should perform correctly on a given set of input/output examples. When multiple programs satisfy the specification, SyGuS does not provide a way to prefer one over another, e.g., one cannot ask a solver to return the program with the fewest if-statements. As a consequence, existing synthesis tools do not provide guarantees about which solution is returned if multiple ones exist. While a few synthesizers have attempted to include some form of specification to express this kind of quantitative intent [7,19,16,15], these approaches are domain-specific, do not apply to SyGuS problems, and do not provide a simple and flexible specification mechanism. The lack of a formal treatment of quantitative requirements stands in the way of designing synthesizers that can take advantage of quantitative objectives to perform more efficient forms of synthesis.
In this paper, we propose QSyGuS, a unifying framework for describing syntax-guided synthesis problems with quantitative objectives over the syntax of the synthesized programs (e.g., find the most likely program with respect to a given probability distribution), and present an algorithm for solving synthesis problems expressed in this framework. We focus on syntactic objectives because they are the most common ones in practical applications of program synthesis. For example, in programming by examples it is desirable to produce small programs with few constants because these programs are more likely to generalize to examples outside of the specification [13]. QSyGuS extends SyGuS in two ways. First, in QSyGuS the search space is represented using weighted grammars, which augment context-free grammars with the ability to assign weights to programs. Second, QSyGuS allows the user to specify constraints over the weight of the program, including optimization objectives, e.g., find the program with the fewest if-statements and with the lowest depth.
QSyGuS is a natural, general, and flexible formalism and is grounded in the well-studied theory of weighted grammars. We leverage this theory and design an algorithm for solving QSyGuS problems using closure properties of weighted grammars. Given a QSyGuS problem, our algorithm generates a SyGuS problem that can be delegated to existing SyGuS solvers. The algorithm then iteratively refines the solution returned by the SyGuS solver to find an optimal one, generating new SyGuS instances using weighted grammar operations. We implement our algorithm in a tool, QuaSi, and evaluate it on 26 quantitative extensions of existing SyGuS benchmarks. QuaSi can synthesize optimal solutions in 15/26 benchmarks with times comparable to those needed to find a solution that does not need to satisfy any quantitative objective.
Contributions In summary, our contributions are:
- QSyGuS, a formal framework grounded in the theory of weighted grammars that can describe syntax-guided synthesis problems with quantitative objectives over the syntax of the synthesized programs. (§3)
- An algorithm for solving QSyGuS problems that leverages closure properties of weighted grammars and existing SyGuS solvers. (§4)
- QuaSi, a tool for specifying and solving QSyGuS problems that interfaces with existing SyGuS solvers, and a comprehensive evaluation of QuaSi, which shows that QuaSi can efficiently solve QSyGuS problems over different types of weights, including additive weights, probabilities, and combinations of multiple weights. (§5)

Fig. 1: Weighted grammar that assigns weight (w1, w2) ∈ Nat × Nat to a program, where w1 is the number of if-statements and w2 is the number of plus-statements.

Illustrative Example
In this section, we illustrate the main components of our framework using an example. We start with a Syntax-Guided Synthesis (SyGuS) problem in which no quantitative objective is provided. We recall that the goal of a SyGuS problem is to synthesize a function f of a given type that is accepted by a context-free grammar G, and such that ∀x.φ(f, x) holds (for a given Boolean constraint φ).
The following SyGuS problem asks to synthesize a function that is accepted by the following grammar and that computes the max of two numbers.
The semantic constraint, stating that f computes the max of its two arguments, is given by the following formula: ψ(f) ≜ ∀x, y. f(x, y) ≥ x ∧ f(x, y) ≥ y ∧ (f(x, y) = x ∨ f(x, y) = y).
The following three programs are semantically equivalent, but syntactically different solutions:

max1(x, y) = if(x ≥ y) then x else y
max2(x, y) = if(x + 0 ≥ y) then x + 0 else y
max3(x, y) = if(x ≥ y) then x else (if(y ≥ x) then y else x)

All solutions are correct, but the user might, for example, prefer the smallest one. However, SyGuS does not provide ways to specify this quantitative intent.

Adding weights In our formalism, QSyGuS, we augment context-free grammars to assign weights to programs in the search space. Concretely, we adopt weighted grammars [10], a well-studied formalism with many desirable properties. In a weighted grammar, each production is assigned a weight. For example, the weighted grammar shown in Figure 1 extends the one from the previous SyGuS example to assign to each program p a pair of weights (w1, w2), where w1 is the number of if-statements and w2 is the number of plus operators in p. In this case, the weights are pairs of integers, and the weight of a grammar derivation is the pairwise sum of the weights of all the productions involved in the derivation, e.g., the sum of (w1, w2) and (w1', w2') is (w1 + w1', w2 + w2'). In the figure, we write /(w1, w2) to assign weight (w1, w2) to a production and omit the weight for productions with cost (0, 0). The functions max1, max2, and max3 have weights (1, 0), (1, 2), and (2, 0), respectively.

Adding and solving quantitative objectives Once we have a way to assign weights to programs, QSyGuS allows the user to specify quantitative objectives over the weights of the productions, e.g., only allow solutions with fewer than 4 if-statements. In our example, we could require the solution to be minimal with respect to the number of if-statements, i.e., to minimize the first component of the paired weight. With this constraint, both max1 and max2 would be considered optimal solutions because there exists no solution with 0 if-statements. If we additionally require the solution to be minimal with respect to the second component of the paired weight, max1 will be a possible optimal solution.
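The pairwise-sum weight computation can be sketched in a few lines of Python. This is our own illustration, not the paper's implementation; the term encoding (nested tuples with operator names) is a hypothetical one:

```python
# A sketch (ours, not the paper's): terms are nested tuples like
# ("if", cond, then_branch, else_branch) or ("+", l, r); leaves are strings.

def weight(term):
    """Weight (#if-statements, #plus-operators), combined by pairwise sum."""
    if isinstance(term, str):              # leaf: x, y, 0, 1, ...
        return (0, 0)
    op = term[0]
    w = (1, 0) if op == "if" else ((0, 1) if op == "+" else (0, 0))
    for child in term[1:]:
        cw = weight(child)
        w = (w[0] + cw[0], w[1] + cw[1])   # pairwise sum of pair weights
    return w

# max1 and max3 from the example: one if-statement vs. two nested ones.
max1 = ("if", (">", "x", "y"), "x", "y")
max3 = ("if", (">", "x", "y"), "x", ("if", (">", "y", "x"), "y", "x"))
```

Running `weight` on `max1` and `max3` reproduces the weights (1, 0) and (2, 0) discussed above.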
Our tool QuaSi can automatically discover solutions in both these cases. Let's consider the last minimization objective. In this case, QuaSi first uses existing SyGuS solvers to synthesize an initial solution using the non-weighted version of the grammar. Say that the returned solution is, for example, max3 of weight (2, 0). QuaSi uses this solution to build a new SyGuS instance that only accepts programs with at most one if-statement. Solving this SyGuS problem can, for example, result in the program max2 of weight (1, 2), which will require our solver to build yet another SyGuS instance. This process is repeated and, if it terminates, an optimal program is found.

SyGuS with Quantitative Objectives
In this section, we introduce our framework for defining syntax-guided synthesis problems with quantitative objectives over the syntax of the synthesized programs. We first provide preliminary definitions for notions such as semirings (Section 3.1) and weighted tree grammars (Section 3.2), and then use these notions to augment SyGuS problems with quantitative objectives (Section 3.3).

Weights over Semirings
We now define the universe of weights we will assign to programs. In general, weights are defined using monoids, i.e., sets equipped with an addition operator, but when a grammar is nondeterministic, i.e., it can produce the same term using multiple derivations, the same term might be assigned multiple weights. Hence, we choose to use semirings, whose two operations let us combine weights within a single derivation and across alternative derivations. Since we also care about optimization objectives, we assume all our semirings are equipped with a partial order.

Definition 1 (Ordered Semiring). A semiring is a tuple S = (S, ⊕, ⊗, 0, 1) where (S, ⊕, 0) is a commutative monoid, (S, ⊗, 1) is a monoid, ⊗ distributes over ⊕, and 0 is an annihilator for ⊗, i.e., 0 ⊗ s = s ⊗ 0 = 0. An ordered semiring (S, ⪯) is a semiring S together with a partial order ⪯ over S.

We often use the word semiring to refer to just the algebra S.
Example 1. In this paper, we focus on semirings with the following algebras.

Boolean Bool = (B, ∨, ∧, 0, 1). This semiring only contains the values true and false and is used to represent non-quantitative problems.

Tropical Trop = (Z ∪ {∞}, min, +, ∞, 0), with the usual numeric order ≤. Weights in this semiring can, for example, capture the size of a program, which one may then want to minimize.

Probabilistic Prob = ([0, 1], +, ·, 0, 1), with the order s ⪯ s' iff s ≥ s', so that more probable programs are smaller in the order. Weights in this semiring can, for example, capture the probability of a program with respect to a probabilistic grammar.
In our framework, we allow synthesis problems to have multiple objectives. Hence, we define a product operation to compose semirings. Intuitively, this operation pairs up the algebras of the two semirings and applies the operation of each algebra to the corresponding projection of the pair. Similarly, two orders can be composed into an order over pairs of elements. We propose two such compositions: one which gives the two orders equal importance (Pareto) and one which prefers one order over the other (Sorted).
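The product construction and the two order compositions can be sketched as follows. This is our own modeling (records and names are hypothetical), not the paper's formal definition:

```python
from dataclasses import dataclass
from typing import Any, Callable

@dataclass
class Semiring:
    plus: Callable     # combines alternative derivations (the ⊕ operator)
    times: Callable    # combines weights within a derivation (the ⊗ operator)
    zero: Any
    one: Any
    leq: Callable      # the partial order

# Tropical-style semiring (min, +) over numbers, ordered by <=.
trop = Semiring(min, lambda a, b: a + b, float("inf"), 0,
                lambda a, b: a <= b)

def product(s1, s2, order="pareto"):
    """Product semiring over pairs; the order is Pareto (componentwise)
    or Sorted (first component dominates, lexicographic)."""
    if order == "pareto":
        leq = lambda a, b: s1.leq(a[0], b[0]) and s2.leq(a[1], b[1])
    else:
        leq = lambda a, b: ((a[0] != b[0] and s1.leq(a[0], b[0]))
                            or (a[0] == b[0] and s2.leq(a[1], b[1])))
    return Semiring(
        lambda a, b: (s1.plus(a[0], b[0]), s2.plus(a[1], b[1])),
        lambda a, b: (s1.times(a[0], b[0]), s2.times(a[1], b[1])),
        (s1.zero, s2.zero), (s1.one, s2.one), leq)
```

Note how the weights (1, 0) and (1, 2) from Section 2 are comparable under Pareto, while (1, 2) and (2, 0) are Pareto-incomparable but ordered under Sorted.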

Weighted Tree Grammars
Since SyGuS defines search spaces using context-free grammars, we propose to extend this formalism with weights to assign costs to terms in the grammar. We focus our attention on a restricted class of context-free grammars called regular tree grammars, i.e., grammars generating regular tree languages, because, to our knowledge, the benchmarks appearing in the SyGuS competition [3] and in practical applications of SyGuS operate over tree grammars. Moreover, it was recently shown that SyGuS problems that are undecidable for context-free grammars become decidable with weighted tree grammars [8].

Trees A ranked alphabet is a tuple (Σ, rkΣ) where Σ is a finite set of symbols and rkΣ : Σ → N associates a rank with each symbol. For every m ≥ 0, the set of all symbols in Σ with rank m is denoted by Σ(m). In our examples, a ranked alphabet is specified by showing the set Σ and attaching the respective rank to every symbol as a superscript, e.g., Σ = {+(2), c(0)}. We use TΣ to denote the set of all (ranked) trees over Σ, i.e., TΣ is the smallest set such that (i) Σ(0) ⊆ TΣ, and (ii) if σ ∈ Σ(m) with m ≥ 1 and t1, . . . , tm ∈ TΣ, then σ(t1, . . . , tm) ∈ TΣ. In the following, we assume a fixed ranked alphabet (Σ, rkΣ).
Weighted Tree Grammars Tree grammars are similar to word grammars, but they generate ranked trees instead of words. Weighted tree grammars augment tree grammars by assigning weights from a semiring to trees; they do so by associating weights with the productions of the grammar. Weighted grammars can, for example, compute the height of a tree, the number of occurrences of some node in the tree, or the probability of a tree with respect to some distribution. In the following, we assume a fixed ordered semiring (S, ⪯) where S = (S, ⊕, ⊗, 0, 1).

Definition 3 (Weighted Tree Grammar).
A weighted tree grammar (WTG) is a tuple G = (N, Z, P, µ), where N is a set of non-terminal symbols with arity 0, Z is an axiom with Z ∈ N , P is a set of production rules of the form A → β where A ∈ N is a non-terminal and β is a tree of T (Σ ∪ N ), and µ : P → S is a function assigning to each production a weight from the semiring.
We can now define the semantics of a WTG as a function wG : TΣ → S, which assigns weights to trees. Intuitively, the weight of a tree is the ⊕-sum of the weights of all possible derivations of that tree in the grammar, and the weight of a derivation is the ⊗-product of the weights of the productions appearing in it.

We use MS(β) = {X1, . . . , Xk} to denote the multiset of all nonterminals appearing in β, and β[t1/X1, . . . , tk/Xk] to denote the result of simultaneously substituting each Xi with ti in β. Given a production p = A → β such that MS(β) = {X1, . . . , Xk}, we regard p as a symbol of arity k. A derivation starting at a nonterminal X is a tree of productions d ∈ T_P representing one possible way to derive a tree starting from X. The derivation has to be such that: (i) the root of d is a production of the form X → β; (ii) for every node p = A → β in d, if MS(β) = {X1, . . . , Xk}, then, for every 1 ≤ i ≤ k, the i-th child of p is a production Xi → βi. Given a derivation d with root p = X → β, such that MS(β) = {X1, . . . , Xk} and p has children subtrees d1, . . . , dk, the tree generated by d is recursively defined as tree(d) = β[tree(d1)/X1, . . . , tree(dk)/Xk]. We use der(X, t) to denote the set of all derivations d starting at X such that tree(d) = t. The weight dw(d) of a derivation d is the ⊗-product of the weights of the productions appearing in d. Finally, the weight of a tree t is the ⊕-sum of the weights of all its derivations from the axiom, i.e., wG(t) = ⊕_{d ∈ der(Z,t)} dw(d).

A weighted tree grammar is unambiguous iff, for every t ∈ TΣ, there exists at most one derivation, i.e., |der(Z, t)| ≤ 1.
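As an illustration (ours, under simplifying assumptions), the semantics wG can be evaluated recursively when productions are restricted to the automaton form A → σ(T1, . . . , Tn) discussed next; distributivity of ⊗ over ⊕ makes the recursion equal to the ⊕-sum over all derivations:

```python
def tree_weight(t, X, prods, plus, times, zero):
    """Weight of tree t from nonterminal X, for WTA-style productions
    encoded (ours) as tuples (lhs, symbol, child_nonterminals, weight).
    Trees are (symbol, child, ...) tuples.  By distributivity, the
    recursive evaluation equals the sum over derivations of the
    product of production weights."""
    sym, kids = t[0], t[1:]
    total = zero
    for lhs, s, nts, w in prods:
        if lhs != X or s != sym or len(nts) != len(kids):
            continue
        d = w                                  # product along this rule
        for kid, nt in zip(kids, nts):
            d = times(d, tree_weight(kid, nt, prods, plus, times, zero))
        total = plus(total, d)                 # sum over alternative rules
    return total

# Tropical example: count '+' nodes; the leaf production has weight 0.
prods = [("S", "x", (), 0),
         ("S", "+", ("S", "S"), 1)]
t = ("+", ("x",), ("+", ("x",), ("x",)))
```

Here `tree_weight(t, "S", prods, min, lambda a, b: a + b, float("inf"))` returns 2, the number of `+` nodes in `t`.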
Weighted tree grammars generalize weighted tree automata. In particular, a weighted tree automaton (WTA) is a WTG in which every production is of the form A → σ(T 1 , . . . , T n ), where A ∈ N , each T i ∈ N , and σ ∈ Σ (n) . Finally, a tree automaton (TA) is a WTA over the Boolean semiring-i.e., the TA accepts all trees with some derivations yielding true. Similarly, a tree grammar (TG) is a WTG over the Boolean semiring. Given a TA (resp. TG) G, we use L(G) to denote the set of trees accepted by G-i.e., L(G) = {t | w G (t) = true}.
Example 3. The weighted grammar in Fig. 1 operates over the semiring Trop × Trop, N = {Start, BExpr}, Z = Start, P contains 9 productions, and µ assigns non-zero weights to two of them.
Aside from being a natural formalism for assigning weights to trees, TGs and WTGs enjoy properties that make them a good choice for our model. First, WTGs (resp. TGs) are equi-expressive to WTAs (resp. TAs) and have logical characterizations [9,11,10]. For this reason, tree grammars are closed under Boolean operations and enjoy decidable equivalence [9]. Second, WTGs enjoy many closure and decidability properties, e.g., given two WTGs G1 and G2, we can compute a WTG G1 × G2 that assigns to every tree the pair of the weights assigned by G1 and G2. This operation is convenient for building grammars over product semirings.

QSyGuS
In this section, we formally define QSyGuS, which extends SyGuS with quantitative objectives. In SyGuS, a problem is specified with respect to a background theory T, e.g., linear arithmetic, and the goal is to synthesize a function f that satisfies two constraints provided by the user. The first constraint describes a functional semantic property that f should satisfy and is given as a predicate ψ(f) ≜ ∀x.φ(f, x). The second constraint limits the search space S of f and is given as a set of expressions specified by a context-free grammar G defining a subset of all the terms in T. A solution to the SyGuS problem is an expression e in S such that the formula ψ(e) is valid.
We augment such a framework in two ways. First, we replace context-free grammars with WTGs, which we use to assign weights (from a given semiring) to terms. Second, we augment the problem formulation with constraints over the weight of the synthesized program (e.g., only consider programs of weight greater than 2) and with optimization objectives over the same weight (e.g., find the solution of minimal weight). Weight constraints range over the grammar

WC := w ⪯ s | s ⪯ w | w ≺ s | s ≺ w | WC ∧ WC | WC ∨ WC,

where w is a special variable denoting the weight of the program and s is an element of the semiring under consideration. Given a constraint ω ∈ WC, we write ω(t) to denote the formula obtained by replacing w with t in ω.
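Evaluating ω(t) for a concrete weight t only needs the semiring's partial order. The sketch below uses our own encoding of constraints (nested tuples); it is not the paper's concrete syntax:

```python
# Our own encoding (not the paper's syntax) of weight constraints:
# atomic comparisons between the weight variable w and a semiring
# constant s, closed under conjunction and disjunction.

def holds(omega, t, leq):
    """Evaluate omega(t): substitute weight t for w; leq is the
    semiring's partial order."""
    kind = omega[0]
    if kind == "w<=s":                      # w below-or-equal s
        return leq(t, omega[1])
    if kind == "s<=w":                      # s below-or-equal w
        return leq(omega[1], t)
    if kind == "w<s":                       # w strictly below s
        return leq(t, omega[1]) and not leq(omega[1], t)
    if kind == "s<w":                       # s strictly below w
        return leq(omega[1], t) and not leq(t, omega[1])
    if kind == "and":
        return all(holds(c, t, leq) for c in omega[1:])
    if kind == "or":
        return any(holds(c, t, leq) for c in omega[1:])
    raise ValueError(f"unknown constraint {kind}")
```

With the numeric order, `holds(("and", ("w<=s", 3), ("s<=w", 1)), 2, lambda a, b: a <= b)` evaluates the conjunction 1 ⪯ w ⪯ 3 at w = 2.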

Definition 4 (QSyGuS). A QSyGuS problem is a tuple (T, (S, ⪯), ψ(f), G, ω, opt) where:
- T is a background theory.
- (S, ⪯) is an ordered semiring defining the set of weights and their operations.
- G is a weighted tree grammar with weights over the semiring S and that only contains terms in T, i.e., L(G) ⊆ T.
- ψ(f) is a Boolean formula constraining the semantic behavior of the synthesized program f.
- ω ∈ WC is a set of constraints over the weight w of the synthesized program.
- opt is a Boolean denoting whether the solution has to have minimal weight with respect to ⪯.

Algorithm 1 QSyGuS synthesis algorithm
1: Input: QSyGuS problem (T, (S, ⪯), ψ(f), G, ω, opt)
2: G' ← ReduceGrammar(G, ω)
3: f* ← SyGuSSolve(T, ψ(f), G')
4: if opt = false then return f*
5: while true do
6:   G' ← ReduceGrammar(G', w ≺ wG(f*))    ▷ Try to find a better solution
7:   f ← SyGuSSolve(T, ψ(f), G')
8:   if f = ⊥ then return f*               ▷ Return the optimal solution
9:   f* ← f
A solution to the QSyGuS problem is a term e such that e ∈ L(G), ψ(e) is true, and ω(wG(e)) is true. If opt is true, we also require that there is no term g that satisfies the previous conditions and such that wG(g) ≺ wG(e).
A SyGuS problem is a QSyGuS problem without weight constraints, i.e., ω ≡ true and opt = false. We denote such problems simply as triples (T, ψ(f), G).
Example 4. Consider the QSyGuS problem described in Section 2. We already described all of its components except ω and opt. In this example, ω = true and opt = true because we want to synthesize the solution with minimal weight.

Solving QSyGuS Problems via Grammar Reduction
In this section, we present an algorithm for solving QSyGuS problems (Algorithm 1), which works as follows. First, given a QSyGuS problem, we construct (under certain assumptions) a SyGuS problem for which the solution is guaranteed to satisfy the weight constraints ω (line 2) and use existing SyGuS solvers to find a solution to such a problem (line 3). If the QSyGuS problem requires minimization, our algorithm produces a new SyGuS instance to search for a solution that is better than the previously found one and tries to solve it (lines 6-7). This procedure is repeated until an optimal solution is found (line 8).

From QSyGuS to SyGuS
The first step of our algorithm is to construct a SyGuS problem characterizing exactly all the solutions of the QSyGuS problem that satisfy the weight constraints. Given a QSyGuS problem P = (T, (S, ⪯), ψ(f), G, ω, opt), we construct a SyGuS problem P' = (T, ψ(f), G') such that a function g is a solution to the SyGuS problem P' iff g is a solution of the QSyGuS problem (T, (S, ⪯), ψ(f), G, ω, false), i.e., P with the optimization constraint dropped. We denote the grammar reduction operation as G' ← ReduceGrammar(G, ω).
Base case First, we show how to solve the problem when ω is an atomic formula, i.e., of the form w ⪯ s, s ⪯ w, w ≺ s, or s ≺ w. We show the construction for w ⪯ s, as it is identical for the other constraints. Concretely, we are given a WTG G = (N, Z, P, µ) and we want to construct a TG G⪯s = (N', Z', P') such that t ∈ L(G⪯s) iff wG(t) ⪯ s. In general, it is not possible to perform this construction for arbitrary semirings and grammars. We first present our algorithm and then describe sufficient conditions under which we can ensure termination and correctness.
The idea behind our construction is to introduce new nonterminals in the grammar G⪯s that keep track of the weight of the trees that can be produced from those nonterminals. For example, a nonterminal pair (X, s') will derive all trees derivable from X using a single derivation of weight s'. Therefore, the set of nonterminals N' is a subset of N × S (plus an initial nonterminal Z'), where S is the universe of the WTG's semiring. We construct the set of nonterminals N' starting from the leaf productions of G and then recursively exploring the remaining productions, generating the set of productions P' at the same time. Formally, N' and P' are the smallest sets such that the following conditions hold:
1. Z' ∈ N';
2. if p = A → β ∈ P contains no nonterminals in β and µ(p) ⪯ s, then (A, µ(p)) ∈ N' and (A, µ(p)) → β ∈ P';
3. if p = A → β ∈ P with MS(β) = {X1, . . . , Xk}, the nonterminals (X1, s1), . . . , (Xk, sk) are in N', and s' = µ(p) ⊗ s1 ⊗ · · · ⊗ sk ⪯ s, then (A, s') ∈ N' and (A, s') → β[(X1, s1)/X1, . . . , (Xk, sk)/Xk] ∈ P';
4. if (Z, s') ∈ N', then Z' → (Z, s') ∈ P'.
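For a concrete (simplified) instance, the construction can be sketched for the tropical semiring with non-negative integer weights, where tracking only annotated nonterminals (X, s') with s' ≤ s guarantees termination. The production encoding and function names are ours, not the paper's:

```python
from itertools import product as cartesian

def reduce_grammar(prods, axiom, bound):
    """Sketch of the base-case construction for Trop with weights >= 0:
    annotate nonterminals with accumulated weight and keep only
    derivations of total weight <= bound.  Productions are WTA-style
    tuples (lhs, symbol, child_nonterminals, weight)."""
    nts, out = set(), set()
    changed = True
    while changed:                      # iterate to a fixpoint
        changed = False
        for lhs, sym, kids, w in prods:
            # all ways to pick already-discovered annotated children
            options = [[n for n in nts if n[0] == k] for k in kids]
            for combo in cartesian(*options):
                s = w + sum(n[1] for n in combo)   # the product is + in Trop
                rule = ((lhs, s), sym, tuple(combo), w)
                if s <= bound and rule not in out:
                    out.add(rule)
                    nts.add((lhs, s))
                    changed = True
    # fresh axiom Z' with a unit production to every annotated axiom
    start = [("Z'", (x, s)) for (x, s) in nts if x == axiom]
    return out, start
```

On the '+'-counting grammar from before, `reduce_grammar(prods, "S", 1)` keeps the annotated nonterminals (S, 0) and (S, 1) and discards every derivation with two or more '+' nodes; as the text notes, this tracks only per-derivation weight.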
Example 5. We illustrate our construction using the grammar in Figure 1. Assume the weight constraint is w ⪯ (1, 0) and the partial order is built using a Pareto product, i.e., we accept terms with at most 1 if-statement and no plus-statements. Our construction yields the following grammar.

The construction of G⪯s only terminates for certain semirings and grammars, and it only guarantees that individual derivations yield the correct cost, i.e., it does not account for the ⊕-sum over multiple derivations.

Example 6. The following WTG over Prob is ambiguous; if we apply the grammar reduction algorithm for ω := w ⪯ 0.6, the resulting grammar will be empty. However, the tree 1 + 1 has weight 0.9 ⪯ 0.6 (0.9 ≥ 0.6): each of its individual derivations fails the constraint, but their ⊕-sum does not.
We now identify sufficient conditions under which the construction of G⪯s terminates and is sound. In particular, we start by restricting our attention to unambiguous WTGs, which are the common ones in practice. We use weights(G) = {s | p ∈ P ∧ µ(p) = s} to denote the set of weights used by G and M_{S,G} = (S', ⊗, 1) to denote the submonoid of S generated by weights(G), i.e., the set of all weights we can generate using ⊗ and weights(G).

Theorem 1. Given an unambiguous WTG G over a semiring S such that M_{S,G} = (S', ⊗, 1), and a weight s ∈ S, the construction of G⪯s terminates if the set {s' ∈ S' | s' ⪯ s} is finite. Moreover, if the set of weights weights(G) is monotonically increasing with respect to ⪯, i.e., for every s' ∈ S' and s'' ∈ weights(G), s' ⪯ s' ⊗ s'', then L(G⪯s) contains exactly every tree t such that wG(t) ⪯ s.
The theorem above also holds for the other atomic constraints w ≺ s, s ⪯ w, and s ≺ w (for the last two, the direction of the monotonicity requirement is reversed). Moreover, in certain cases, even if the construction does not terminate for, say, s ⪯ w, it might terminate for the negated constraint w ≺ s. In such a case, we can use the closure properties of regular tree grammars/automata to construct the reduced grammar for s ⪯ w as G_{s⪯w} = intersect(G, complement(G_{w≺s})). The same idea can be applied to all atomic constraints.
In practice, the restriction of Theorem 1 holds for grammars that operate over the Boolean and probabilistic semirings, and over the tropical semiring with only positive weights. Theorem 1 never holds when S is the tropical semiring and the grammar contains negative weights; in general, one cannot construct the constrained grammar in this case. However, it is easy to modify our algorithm to work with grammars that contain no loops with negative weight, i.e., no derivations with negative cumulative weight from a nonterminal to a tree containing the same nonterminal.

Intuitively, when the grammar contains no negative loops, we can find a constant SH such that any intermediate derivation with weight greater than s + |SH| can never result in a tree with weight smaller than s. We use this idea to modify the construction of G^{Trop}_{≤s}, i.e., G⪯s for Trop, as follows. First, this constant is bounded by c·k^{n+1}, where c is the absolute value of the smallest negative weight in the grammar, k is the largest number of nonterminals appearing in one grammar production, and n = |N| is the number of nonterminals. Second, in steps 2 and 3 of the construction, a new nonterminal and the corresponding productions are produced if µ(p) ≤ s + |SH| (previously µ(p) ≤ s); however, if A = Z in steps 2 and 3, we add a new production Z' → (A, s') only if s' ⪯ s. We now show when this construction terminates and returns correct values. Since the tropical semiring combines multiple derivations using the min operator, we can drop the requirement that the grammar be unambiguous.
Theorem 2. Given a WTG G over Trop and a weight s ∈ Z, the construction of G^{Trop}_{≤s} terminates if G contains no loop with negative cumulative weight. Moreover, L(G^{Trop}_{≤s}) contains exactly every tree t such that wG(t) ≤ s.
Composing semirings We next discuss how Theorem 1 relates to product semirings. Given a grammar G = (N, Z, P, µ) over a product semiring S1 × S2, we use G^{Si} to denote the grammar (N, Z, P, µi) in which the weight function outputs the corresponding projected weight, i.e., if µ(p) = (s1, s2), then µi(p) = si. Let's first consider the case where the product semiring uses a Pareto partial order. In this case, if Theorem 1 holds for each grammar G^{Si} and constraint wi ⪯i si, then it also holds for G and (w1, w2) ⪯p (s1, s2); the other direction, however, is not true. Theorem 3 proves this intuition and states that, in some sense, solving Pareto partial orders is easier than solving the individual partial orders.
Theorem 3. Given an unambiguous WTG G over the product semiring S = S1 × S2 with Pareto partial order ⪯p = par(⪯1, ⪯2) and a weight s = (s1, s2) ∈ S, if the constructions of G^{S1}_{⪯1 s1} and G^{S2}_{⪯2 s2} terminate, then the construction of G_{⪯p s} terminates.
When we move to the Sorted partial order, we cannot get an analogous theorem: if Theorem 1 holds for each grammar G^{Si} and constraint wi ⪯i si, it does not necessarily hold for G and (w1, w2) ⪯s (s1, s2). In particular, if the semiring S2 is infinite and there exists an s1' ≺1 s1, there will be infinitely many elements (s1', _) ≺s (s1, s2). Using this observation, we devise a modified algorithm for reducing grammars with Sorted objectives. First, we compute the grammars G^{S1}_{≺1 s1}, G^{S1}_{=s1}, and G^{S2}_{≺2 s2}. Second, we use WTG closure properties to compute G_{≺s (s1,s2)} as the union of G^{S1}_{≺1 s1} and intersect(G^{S1}_{=s1}, G^{S2}_{≺2 s2}).

General formulas We can now inductively construct the grammar accepting only terms satisfying all the constraints in ω. We again use the fact that tree grammars are closed under Boolean operations to compute intersections and unions and correctly characterize all conjunctions and disjunctions appearing in the formula.

Finding an Optimal Solution
If our QSyGuS problem does not require minimization, i.e., opt = false, the technique presented in Section 4.1 can be used to generate an equivalent SyGuS problem P' = (T, ψ(f), G'), which can be solved using off-the-shelf SyGuS solvers. In this section, we show how to extend this technique to handle minimization objectives. Our idea is to use SyGuS solvers to find a (possibly non-optimal) solution to P' and then iteratively refine the grammar G' to search for a better solution. This loop is illustrated in Algorithm 1 (lines 5-9). Given the initial solution f* to P' such that wG(f*) = s, we construct a new grammar G'_{≺s} and look for a solution with lower weight. If the SyGuS solver we use is sound (it can find a solution whenever one exists) and complete (it can detect when no solution exists), Algorithm 1 terminates with an optimal solution.
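The refinement loop can be sketched abstractly as follows. Here `sygus_solve` is a hypothetical stand-in for an off-the-shelf SyGuS solver (it simply enumerates a finite candidate pool), so this is an illustration of the loop's structure, not of a real solver integration:

```python
def minimize(candidates, satisfies_spec, weight, lt):
    """Sketch of Algorithm 1's refinement loop over a finite pool of
    candidates: find any solution, then repeatedly ask for a strictly
    better one until none exists."""
    def sygus_solve(pred):
        # stand-in for a SyGuS solver: returns any candidate meeting
        # the spec and the current restriction, or None (the role of ⊥)
        return next((c for c in candidates
                     if satisfies_spec(c) and pred(c)), None)

    best = sygus_solve(lambda c: True)       # initial, possibly non-optimal
    if best is None:
        return None                          # specification is unrealizable
    while True:
        # restrict the search to strictly smaller weights
        better = sygus_solve(lambda c: lt(weight(c), weight(best)))
        if better is None:
            return best                      # no better solution: optimal
        best = better
```

For instance, with candidates `[5, 3, 8]`, a spec rejecting `8`, identity weight, and the numeric order, the loop first finds 5 and then refines it to the optimum 3.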
In general, the above conditions are too strict, and in practice the algorithm will often not terminate. However, if the SyGuS solver is merely sound, Algorithm 1 will still eventually find the optimal solution; it will simply not be able to prove that no smaller one exists. In our experiments, we show that this approach can yield better solutions than those produced by vanilla SyGuS solvers even when Algorithm 1 does not terminate.

Implementation and Evaluation
First, we extended the SyGuS format with new syntax for expressing QSyGuS problems. Our format supports all semirings presented in Section 3.1 as well as additional ones, and it allows creating tuples of semirings using the product operation described in Section 3.1. We augment the original SyGuS syntax to support weights on grammar productions, and weight constraints are added using an SMT-like syntax.
Second, we implemented Algorithm 1 in a tool called QuaSi. QuaSi already interfaces with three SyGuS solvers: CVC4 [6], ESolver [4], and EUSolver [5]. QuaSi supports all the semirings allowed in our format and implements a library for tree automata/grammars and weighted tree automata/grammars operations, as well as several optimizations we did not discuss in the paper. In particular, QuaSi often uses simple grammar reduction techniques to simplify the generated grammars, remove unnecessary productions, and consolidate equivalent ones.
We evaluate QuaSi through the following questions (experiments performed on an Intel Core i7 4.00GHz CPU with 32GB of RAM):

Q1 Can QuaSi solve quantitative variants of real SyGuS benchmarks?
Q2 How does the solving time of each iteration of Algorithm 1 compare to the initial non-quantitative synthesis step?
Q3 How much do the weights of the synthesized solutions improve across iterations?
Q4 Can QuaSi solve problems with multiple objectives?

Benchmarks Our benchmarks are 26 quantitative extensions of existing SyGuS benchmarks: 18 use a minimization objective over a single semiring (Table 1), while 8 use a minimization objective (Pareto or Sorted) over a product semiring (Table 2). We selected SyGuS benchmarks using the following criteria: (i) the benchmark can be solved by either CVC4 [6] or ESolver [4], and (ii) the solution is not optimal according to some reasonable metric, e.g., size or number of if-statements.

Effectiveness of QSyGuS Solver
We evaluate the effectiveness of QuaSi on the 18 single-minimization-objective benchmarks. For each benchmark, we run QuaSi using either CVC4 or ESolver as the backend SyGuS solver (we also evaluated QuaSi using EUSolver [5], but, due to its poor performance, we do not report the results). The results are shown in Table 1. The timeout for each iteration of Alg. 1 is 10 minutes. With CVC4, QuaSi terminates with an optimal solution in 9/18 benchmarks, taking less than 5 seconds (avg: 0.7s) to solve each sub-problem. In 3 of these cases, the initial solution is already optimal and the second iteration is used to prove optimality. With ESolver, QuaSi terminates with an optimal solution in 8/18 benchmarks, taking less than 7 seconds (avg: 0.9s) to solve each subproblem. In 2 cases, it can find a better solution than the original one, but it cannot prove that the solution is optimal. Overall, by combining solvers, QuaSi can find a better solution than the original SyGuS solution given by one of the two solvers in 9/18 benchmarks. QuaSi cannot improve the initial solution of the linear integer arithmetic benchmarks (array_search and LinExpr_eq1ex).
Both solvers time out on large grammars. The grammars in Table 1 are 1 to 2 orders of magnitude larger than those in existing SyGuS benchmarks (avg: 224 vs 13 rules), and existing solvers have not yet been optimized for this parameter. In some cases, the solver times out on intermediate grammars that contain no solution but generate infinitely many terms; in general, existing SyGuS solvers cannot prove unsatisfiability for these types of problems. To answer Q1, QuaSi can solve quantitative variants of 10/18 real SyGuS benchmarks.

Solving Time for Different Iterations
In this section, we evaluate the time required by each iteration of Alg. 1. Figure 2 shows the ratio of the time taken by each iteration to the initial non-quantitative SyGuS solving time. Some of the iterations shown in Table 1 do not appear in Figure 2 because they resulted in no solution, i.e., the initial solution was minimal. CVC4 is typically slower in subsequent iterations and can take up to 10 times the original solving time, while ESolver has runtimes comparable to the initial run and is often faster. These numbers are largely due to how the two solvers work: CVC4 is optimized to solve problems where the grammar imposes no restrictions on the structure of the solution, while ESolver performs enumerative search and takes advantage of more restrictive grammars. One interesting case is the parity_not benchmark. ESolver takes 26.9s to find an initial solution, but, with the weight constraint w < 11, a solution can be found in 2.2s. CVC4 can find the initial solution of weight 11 in 0.1s but cannot solve the next iteration. We tried using different solvers in different iterations of our algorithm and found that, if we use CVC4 to find an initial solution and then ESolver in subsequent iterations with restricted grammars, we can fully solve this benchmark in a total of 2.3s, which is much better than the time taken by either solver alone. To answer Q2, with appropriate choices of solvers, the overhead of synthesizing optimal solutions is minimal.

Solution Improvement across Iterations

In this section, we present how the weights of the synthesized solutions change across the iterations of Alg. 1. Figure 3 shows the weight of the solution synthesized at each iteration as a percentage of the weight of the initial SyGuS solution. The results show that we can improve the solutions of CVC4 by 15-25% in one iteration, and the solutions of ESolver by 20-50% when taking one iteration and by 50-60% when taking two.
The Prob benchmarks, which require two iterations, improve more with ESolver because ESolver tends to synthesize small terms, whose probability may also be small. To answer Q3, QuaSi can improve the weights of SyGuS solutions by 20-60%.
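The iterative structure that these measurements exercise, solving once and then re-solving under a tightened weight bound until the reduced grammar becomes empty, can be sketched as follows. This is a minimal sketch, not the paper's implementation: `solve` and `weight` are hypothetical stand-ins for a call to a SyGuS solver on a weight-reduced grammar and for the weighted-grammar weight function.

```python
def minimize(solve, weight, grammar):
    """Iteratively tighten the weight bound until the reduced grammar
    admits no solution; the last solution found is then optimal.

    solve(grammar, bound): return a program accepted by the grammar with
        weight strictly below bound (bound=None means unconstrained),
        or None if no such program exists.
    weight(program): the program's weight in the tropical semiring.
    """
    best = solve(grammar, None)                # initial plain SyGuS call
    if best is None:
        return None                            # specification is unrealizable
    while True:
        better = solve(grammar, weight(best))  # demand strictly smaller weight
        if better is None:                     # reduced grammar admits nothing,
            return best                        # so best is optimal
        best = better
```

Using different solvers for the initial call and for the bounded calls, as in the parity_not experiment above, only changes which solver `solve` dispatches to in each round.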

Multi-Objective Optimization
In this section, we evaluate the effectiveness of QuaSi on the 8 benchmarks involving two minimization objectives. The benchmarks consist of two families: 4 for sorted optimization and 4 for Pareto optimization. The sorted-optimization benchmarks ask to minimize first the number of occurrences of a specified operator (bvand in hackers and ite in array_search) and then the size of the solution. The Pareto-optimization benchmarks have the same objectives, but here we synthesize a Pareto-optimal solution instead of a sorted-optimal one. The results are shown in Table 2. We do not present results for CVC4 because it cannot solve any of these benchmarks. The array_search benchmarks time out since they are already hard with a single objective. For the hackers_5 benchmarks, the initial solution is already optimal for the first objective, so the problem degenerates to a single-objective optimization problem. For hackers_7 and hackers_17, we present the weights of the intermediate solutions; we can see that Pareto and sorted optimization yield different solutions. To answer Q4, QuaSi can solve problems with multiple objectives whenever the same problems are feasible with a single objective.
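For intuition, the two orders over weight pairs can be contrasted as follows. This is an illustrative sketch; the example weights are made up and not taken from Table 2.

```python
def sorted_le(a, b):
    """Sorted (lexicographic) order on weight pairs: the first
    objective dominates, the second only breaks ties."""
    return a[0] < b[0] or (a[0] == b[0] and a[1] <= b[1])

def pareto_le(a, b):
    """Pareto order: componentwise comparison; some pairs are incomparable."""
    return a[0] <= b[0] and a[1] <= b[1]

# A solution with weights (1, 20) (one bvand, size 20) is lexicographically
# better than one with weights (2, 5), yet neither Pareto-dominates the other,
# so both are acceptable answers to the Pareto-optimization problem.
```

This is why the two optimization modes can return different solutions on the same benchmark: a sorted-optimal solution is always Pareto-optimal, but not vice versa.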

Related Work
Qualitative Synthesis Existing program synthesizers fall into three categories: (i) enumeration solvers, which typically output the smallest program [1], (ii) symbolic solvers, which reduce the synthesis problem to a constraint-solving problem and output whatever program the constraint solver produces [21], and (iii) probabilistic synthesizers, which randomly search the space for a solution and are typically unpredictable [18]. Since the introduction of the SyGuS format [2], these techniques have been used to build several SyGuS solvers that have competed in SyGuS competitions [4]. The most effective ones, which are the solvers used in this paper, are ESolver and EUSolver [1] (enumeration) and CVC4 [6] (symbolic).
Quantitative synthesis Domain-specific synthesizers typically employ hard-coded ranking functions that guide the search towards a "preferable" program [17], but these functions are usually hard to write and are decoupled from the functional specification. Unlike QSyGuS, these synthesizers allow arbitrary ranking functions expressed in general-purpose languages, but they typically support only limited grammars for synthesis. Moreover, in many practical applications the ranking functions are very simple. For example, the popular spreadsheet-formula synthesizer FlashFill [12] uses a ranking function that prefers small programs with few constants. This type of objective is expressible in our framework.
The Sketch synthesizer supports optimization objectives over variables in sketched programs [20]. This work differs from ours in that sketches are a different specification mechanism from SyGuS. In Sketch the search space is encoded as a program with holes to facilitate synthesis by constraint solving. Translating SyGuS problems into sketches is non-trivial and results in poor performance.
The work closest to ours is Synapse [7], which combines sketching with an approach similar to ours. For the same reasons as Sketch, Synapse differs from our work in that it uses a different search-space mechanism. However, there are a few analogies between our work and Synapse that are worth explaining in detail. Synapse supports syntactic cost functions that are defined using a decidable theory, separately from the sketch search space. Synthesis is done using an iterative search in which sketches-i.e., sets of partial programs with holes-of increasing sizes are given to the synthesizer. At a high level, the intermediate sketches are related to our notion of reduced grammars-i.e., they accept solutions of weight less than a given constant. However, while our algorithm generates reduced grammars automatically for a well-defined family of semirings, Synapse requires the user to provide a function for generating the intermediate sketches. Moreover, since Synapse requires cost functions defined using a decidable theory, it would not support some families of costs that QSyGuS supports-e.g., the probabilistic semiring.
Koukoutos et al. [15] have proposed the use of probabilistic tree grammars to guide the search of enumerative synthesizers on applications outside of SyGuS. Their algorithm enumerates all terms accepted by the grammar in order of decreasing probability using a variant of the search algorithm A*, and requires the grammar not to contain transitions of weight 1 to avoid getting stuck. Probabilistic tree grammars are a special case of QSyGuS, and our algorithm does not impose limitations on what weights can appear in the grammar. Moreover, our algorithm does not require implementing a new solver when the cost semiring changes.
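An enumeration of this flavor can be approximated by a best-first search over partial derivations. The following is a simplified sketch under our own term representation (nonterminal leaves are strings, complete nodes are tuples); `find_leaf` and `replace_leaf` are helpers introduced here for illustration, not part of any cited system.

```python
import heapq

def find_leaf(term, grammar):
    """Return the leftmost nonterminal leaf of a partial term, or None."""
    if isinstance(term, str):
        return term if term in grammar else None
    for child in term[1:]:
        nt = find_leaf(child, grammar)
        if nt is not None:
            return nt
    return None

def replace_leaf(term, nt, repl):
    """Replace the leftmost occurrence of the nonterminal leaf nt by repl."""
    if term == nt:
        return repl
    if isinstance(term, str):
        return term
    children, done = [], False
    for child in term[1:]:
        if not done:
            new_child = replace_leaf(child, nt, repl)
            done = new_child != child
            children.append(new_child)
        else:
            children.append(child)
    return (term[0],) + tuple(children)

def enumerate_terms(grammar, start, n):
    """Return up to n complete terms in decreasing probability.

    grammar: nonterminal -> list of (prob, label, child nonterminals).
    Because every rule probability is < 1, expanding a partial term can
    only lower its probability, so best-first search is exhaustive."""
    out, tie = [], 0
    heap = [(-1.0, tie, start)]          # (negated probability, tiebreaker, term)
    while heap and len(out) < n:
        negp, _, term = heapq.heappop(heap)
        nt = find_leaf(term, grammar)
        if nt is None:                   # no nonterminal leaves left: complete
            out.append((-negp, term))
            continue
        for prob, label, kids in grammar[nt]:
            tie += 1
            new_term = replace_leaf(term, nt, (label,) + tuple(kids))
            heapq.heappush(heap, (negp * prob, tie, new_term))
    return out
```

The weight-1 restriction mentioned above corresponds here to the requirement that every pushed priority strictly decreases, so the search cannot loop forever on equally probable expansions.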

Conclusion
We presented QSyGuS, a general framework for defining and solving SyGuS problems in the presence of quantitative objectives over the syntax of the programs. QSyGuS is (i) natural: it requires minimal modifications to the SyGuS format; (ii) general: it supports complex but practical types of weights; (iii) formal: it is grounded in the theory of weighted tree grammars; and (iv) effective: our tool QuaSi can solve quantitative variations of existing SyGuS benchmarks with little overhead. In the future, we plan to extend our framework to handle probabilistic objectives and quantitative objectives over the semantics of the program-e.g., synthesize programs that satisfy most of the specification.

B Benchmark Descriptions
The multi-objective benchmarks also minimize the total size of the solution; they operate over the semiring Trop ×_S Trop, but only impose one minimization objective. parity_not minimizes the number of not operators in the corresponding SyGuS benchmark. max3_ite minimizes the number of ite operators in the corresponding SyGuS benchmark. The rest of the Trop benchmarks minimize the size of the solution in the corresponding SyGuS benchmark. The hackers_a_prob benchmarks are probabilistic extensions of the corresponding SyGuS benchmarks (Prob semiring). The probability scheme we use assigns probability 1/8 to shift operators, probability 1/4 to arithmetic operators, and probability 1/2 to logical operators; the goal is to find the most probable solution.
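The probability scheme above can be made concrete as follows. This is a sketch: the bit-vector operator names in each class are assumptions for illustration; only the 1/8, 1/4, 1/2 grouping comes from the text.

```python
# Assumed operator classes; only the 1/8, 1/4, 1/2 grouping is from the text.
OP_PROB = {}
OP_PROB.update({op: 1 / 8 for op in ("bvshl", "bvlshr", "bvashr")})  # shifts
OP_PROB.update({op: 1 / 4 for op in ("bvadd", "bvsub", "bvmul")})    # arithmetic
OP_PROB.update({op: 1 / 2 for op in ("bvand", "bvor", "bvnot")})     # logical

def term_prob(term):
    """Weight of a term in the Prob semiring: the product (the semiring's
    multiplication) of the probabilities of the operators it contains."""
    if isinstance(term, str):
        return 1.0                       # variables and constants weigh 1
    op, *children = term
    p = OP_PROB.get(op, 1.0)
    for child in children:
        p *= term_prob(child)
    return p

# term_prob(("bvand", ("bvadd", "x", "y"), ("bvshl", "x", "y")))
# = 1/2 * 1/4 * 1/8 = 1/64
```

Maximizing this product over all terms accepted by the grammar is exactly the objective of the hackers_a_prob benchmarks.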

C Proofs of Theorems
Theorem 1 Given an unambiguous WTG G over a semiring S such that M_{S,G} = (S, ⊗, 1), and a weight s ∈ S, the construction of G_{≼s} terminates if the set {s′ | s′ ≼ s ∧ s′ ∈ S} is finite. Moreover, if the set of weights weights(G) is monotonically increasing with respect to ≼-i.e., for every s′ ∈ S and s″ ∈ weights(G), s′ ≼ s′ ⊗ s″-then L(G_{≼s}) contains exactly every tree t such that w_G(t) ≼ s.
Proof. We first show that each step of the algorithm terminates. Steps 1 and 2 terminate since the grammar G is finite.
Step 3 only produces nonterminals that belong to N × {s′ | s′ ≼ s ∧ s′ ∈ S}, which is also finite. We now prove soundness. It is straightforward to prove the following claim by induction: for every nonterminal (A, s′) of G_{≼s} and tree t ∈ T_Σ, we have that d′ ∈ der((A, s′), t) iff there exists a derivation d ∈ der(A, t) such that dw(d) = s′ and s′ ≼ s. Because G is unambiguous, every tree has at most one derivation; therefore dw(d) = w_G(t) and w_G(t) ≼ s.

Theorem 2 Given a WTG G over Trop and a weight s ∈ Z, the construction of G^{Trop}_{≤s} terminates if G contains no loop with negative cumulative weight. Moreover, G^{Trop}_{≤s} accepts exactly every tree t such that w_G(t) ≤ s.
Proof. First, we show that any tree with weight ≤ s must be accepted by G^{Trop}_{≤s}. We do so by showing that if a tree t is not accepted by G^{Trop}_{≤s}-i.e., t has some subtree β with weight greater than s + SH-then the weight of t must be greater than s. Note that the modified algorithm can track weights ≤ s + |SH| in the intermediate nonterminals but still accepts only trees with weight ≤ s. By definition, the weight of t is the sum of the weights of all rules used to derive t, that is, w_G(t) = w_G(β) + w_G(t[B/β]), where t[B/β] ∈ T_{Σ∪{B}} is the result of substituting the node corresponding to β with B. If t[B/β] contains loops, we can eliminate all loops from it to obtain a tree t′[B/β] such that w_G(t[B/β]) ≥ w_G(t′[B/β]), because loops have non-negative weights. If t[B/β] contains no loop, its weight satisfies w_G(t[B/β]) ≥ −ck^{n+1} = −SH, since the size of t[B/β] is at most k^{n+1} (its height is at most n because it contains no loop) and each production used to derive t[B/β] has weight greater than −c. Therefore, using the fact that w_G(t[B/β]) ≥ −SH, we have that the weight of t is w_G(β) + w_G(t[B/β]) > s + SH + w_G(t[B/β]) ≥ s.

Now, we show that the algorithm terminates. Observe that there are only finitely many trees without loops, so the set of their weights is also finite; let w* be the minimum weight of any tree without loops. On the other hand, for any tree t containing loops, the weight w of t must be greater than or equal to the weight of some tree without loops-i.e., w ≥ w*-because we can eliminate the loops, whose weights are non-negative, from t, and the resulting tree has weight less than or equal to that of t. So the weights of the nonterminals produced by our construction fall in the range [w*, s + SH], which is finite. Hence the construction only needs to produce a finite number of nonterminals and always terminates.
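For the tropical semiring with non-negative rule weights, the construction underlying Theorems 1 and 2 can be sketched as a fixpoint computation over weight-annotated nonterminals. This is an illustrative sketch under our own rule format; the slack SH needed to handle negative weights is omitted.

```python
from itertools import product

def reduce_grammar(rules, bound):
    """Saturate the set of weight-annotated nonterminals (A, w), w <= bound.

    rules: list of (head, rule weight, child nonterminals); in Trop the
    weight of a derivation is the sum of the weights of its rules, so
    (A, w) is produced iff A derives some tree of weight w <= bound."""
    produced = set()
    changed = True
    while changed:                       # saturate; a worklist would also do
        changed = False
        for head, w, children in rules:
            # every currently known weight annotation for each child
            options = [[wc for (c2, wc) in produced if c2 == c]
                       for c in children]
            for ws in product(*options):
                total = w + sum(ws)
                if total <= bound and (head, total) not in produced:
                    produced.add((head, total))
                    changed = True
    return produced
```

The rules of the reduced grammar (not built here) connect these annotated nonterminals; a tree is accepted exactly when its weight annotation at the start nonterminal is ≤ the bound, matching the statement of Theorem 2.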
Theorem 3 Given an unambiguous WTG G over the semiring S = S1 ×_S S2 with Pareto partial order ≼_p = par(≼_1, ≼_2) and a weight s = (s_1, s_2) ∈ S, if the constructions of G^{S1}_{≼_1 s_1} and G^{S2}_{≼_2 s_2} terminate, then the construction of G^{S}_{≼_p s} terminates.
Proof. We first show by induction that if a nonterminal (X, w_1, w_2) is produced in the construction of G^{S}_{≼_p s}, then the nonterminals (X, w_1) and (X, w_2) must be produced in the constructions of G^{S1}_{≼_1 s_1} and G^{S2}_{≼_2 s_2}, respectively. For the base case, consider a nonterminal (X, w_1, w_2) produced in step 2 with production p. The condition µ(p) ≼_p s in step 2 implies that µ_1(p) ≼_1 s_1 and µ_2(p) ≼_2 s_2, which means that (X, w_1) and (X, w_2) are also produced in the constructions of the corresponding grammars. Then, for every nonterminal (X, w_1, w_2) produced in step 3 with rule p and {(X_i, w_i^(1), w_i^(2))}_i ⊆ N, where (w_1, w_2) := µ(p) ⊗ ⊗_i (w_i^(1), w_i^(2)) ≼_p (s_1, s_2), the induction hypothesis gives that the nonterminals in {(X_i, w_i^(1))}_i and {(X_i, w_i^(2))}_i are already produced in the grammars G^{S1}_{≼_1 s_1} and G^{S2}_{≼_2 s_2}. Therefore, we can apply step 3 with p and these nonterminals to produce the new nonterminals (X, w_1) and (X, w_2). Note that w_1 = µ_1(p) ⊗ ⊗_i w_i^(1) ≼_1 s_1 and w_2 = µ_2(p) ⊗ ⊗_i w_i^(2) ≼_2 s_2. Since both constructions of G^{S1}_{≼_1 s_1} and G^{S2}_{≼_2 s_2} terminate, the numbers of nonterminals they produce, say n_1 and n_2, are finite. We have shown that the number of nonterminals produced in G^{S}_{≼_p s} is at most n_1 × n_2, which is also finite. Hence the construction of G^{S}_{≼_p s} terminates.