Understanding and Extending Incremental Determinization for 2QBF
Abstract
Incremental determinization is a recently proposed algorithm for solving quantified Boolean formulas with one quantifier alternation. In this paper, we formalize incremental determinization as a set of inference rules to help understand the design space of similar algorithms. We then present additional inference rules that extend incremental determinization in two ways. The first extension integrates the popular CEGAR principle and the second extension allows us to analyze different cases in isolation. The experimental evaluation demonstrates that the extensions significantly improve the performance.
1 Introduction
Solving quantified Boolean formulas (QBFs) is one of the core challenges in automated reasoning and is particularly important for applications in verification and synthesis. For example, program synthesis with syntax guidance [1, 2] and the synthesis of reactive controllers from LTL specifications have been encoded in QBF [3, 4]. Many of these problems require only formulas with one quantifier alternation (2QBF), which are the focus of this paper.
Algorithms for QBF and program synthesis largely rely on the counterexample-guided inductive synthesis principle (CEGIS) [5], originating in abstraction refinement (CEGAR) [6, 7]. For example, for program synthesis, CEGIS-style algorithms alternate between generating candidate programs and checking them for counterexamples, which allows us to lift arbitrary verification approaches to synthesis algorithms. Unfortunately, this approach often degenerates into a plain guess-and-check loop when counterexamples cannot be generalized effectively. This carries over to the simpler setting of 2QBF. For example, even for a simple formula such as \(\forall x.\exists y.~x=y\), where x and y are 32-bit numbers, most QBF algorithms simply enumerate all \(2^{32}\) pairs of assignments. In fact, even modern QBF solvers diverge on this formula when preprocessing is deactivated.
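The degeneration on \(\forall x.\exists y.~x=y\) can be made concrete with a small brute-force simulation (our own toy model, not taken from the paper): each candidate value for y matches exactly one x, so every CEGIS iteration excludes a single universal assignment and the loop enumerates all \(2^n\) values.

```python
# Toy model of the CEGIS loop on forall x. exists y. x = y for n-bit
# variables: each candidate y covers exactly one x, so the loop degenerates
# into plain enumeration of all 2**n universal assignments.

def cegis_iterations(n):
    """Count the iterations a guess-and-check loop needs for n-bit x = y."""
    remaining = set(range(2 ** n))   # universal assignments not yet covered
    iterations = 0
    while remaining:
        x = next(iter(remaining))    # candidate counterexample
        y = x                        # the only existential value matching x
        # exclude all universal assignments covered by this y: exactly {y}
        remaining -= {x2 for x2 in remaining if x2 == y}
        iterations += 1
    return iterations

print(cegis_iterations(8))  # 256 iterations already for 8-bit variables
```

For 32-bit variables the same loop would need \(2^{32}\) iterations, matching the behavior described above.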
Recently, Incremental Determinization (ID) has been suggested to overcome this problem [8]. ID represents a departure from the CEGIS approach in that it is structured around identifying which variables have unique Skolem functions. (To prove the truth of a 2QBF \(\forall x.\exists y.~\varphi \) we have to find Skolem functions f mapping x to y such that \(\varphi [f/y]\) is valid.) After assigning Skolem functions to a few of the existential variables, the propagation procedure determines Skolem functions for other variables that are uniquely implied by that assignment. When the assignment of Skolem functions turns out to be incorrect, ID analyzes the conflict, derives a conflict clause, and backtracks some of the assignments. In other words, ID lifts CDCL to the space of Skolem functions.
ID can solve the simple example given above and shows good performance on various application benchmarks. Yet, the QBF competitions have shown that the relative performance of ID and CEGIS still varies a lot between benchmarks [9]. A third family of QBF solvers, based on the expansion of universal variables [10, 11, 12], shows yet again different performance characteristics and outperforms both ID and CEGIS on some (few) benchmarks. This variety of performance characteristics of different approaches indicates that current QBF solvers could be significantly improved by integrating the different reasoning principles.
In this paper, we first formalize and generalize ID [8] (Sect. 3). This helps us to disentangle the working principles of the algorithm from implementationlevel design choices. Thereby our analysis of ID enables a systematic and principled search for better algorithms for quantified reasoning. To demonstrate the value and flexibility of the formalization, we present two extensions of ID that integrate CEGISstyle inductive reasoning (Sect. 4) and expansion (Sect. 5). In the experimental evaluation we demonstrate that both extensions significantly improve the performance compared to plain ID (Sect. 6).
Related Work. This work is written in the tradition of works such as the Model Evolution Calculus [13], Abstract DPLL [14], MCSAT [15], and recent calculi for QBF [16], which present search algorithms as inference rules to enable the study and extension of these algorithms. ID and the inference rules presented in this paper can be seen as an instantiation of more general frameworks, such as MCSAT [15] or Abstract Conflict Driven Learning [17].
Like ID, quantified conflict-driven clause learning (QCDCL) lifts CDCL to QBF [18, 19]. The approaches differ in that QCDCL does not reason about functions, but only about values of variables. Fazekas et al. have formalized QCDCL as inference rules [16].
2QBF solvers based on CEGAR/CEGIS search for universal assignments and matching existential assignments using two SAT solvers [5, 20, 21]. There are several generalizations of this approach to QBF with more than one quantifier alternation [22, 23, 24, 25, 26].
2 Preliminaries
An assignment \({\varvec{x}}\) to a set of variables X is a function \({\varvec{x}} : X \rightarrow \mathbb B\) that maps each variable \(x \in X\) to either \(\mathbf {1}\) or \(\mathbf {0}\). Given a propositional formula \(\varphi \) over variables X and an assignment \({\varvec{x}}'\) to \(X'\subseteq X\), we define \(\varphi ({\varvec{x}}')\) to be the formula obtained by replacing the variables \(X'\) by their truth value in \({\varvec{x}}'\). By \(\varphi ({\varvec{x}}',{\varvec{x}}'')\) we denote the replacement by multiple assignments for disjoint sets \(X',X''\subseteq X\).
A quantifier \(Q\, x \mathpunct {.}\varphi \) for \(Q \in \{\exists ,\forall \}\) binds the variable x in its subformula \(\varphi \) and we assume w.l.o.g. that every variable is bound at most once in any formula. A closed QBF is a formula in which all variables are bound. We define the dependency set of an existentially quantified variable y in a formula \(\varphi \) as the set \( dep (y)\) of universally quantified variables x such that \(\varphi \)’s subformula \(\exists y \mathpunct {.}\psi \) is a subformula of \(\varphi \)’s subformula \(\forall x.\psi '\). A Skolem function \(f_y\) maps assignments to \( dep (y)\) to a truth value. We define the truth of a QBF \(\varphi \) as the existence of Skolem functions \(f_Y=\{f_{y_1},\dots ,f_{y_n}\}\) for the existentially quantified variables \(Y=\{y_1,\dots ,y_n\}\), such that \(\varphi ({\varvec{x}}, f_Y({\varvec{x}}))\) holds for every \({\varvec{x}}\), where \(f_Y({\varvec{x}})\) is the assignment to Y that the Skolem functions \(f_Y\) provide for \({\varvec{x}}\).
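For the 2QBF fragment considered in this paper, every existential variable depends on all universals, so Skolem functions exist if and only if a matching Y-assignment can be chosen pointwise for each X-assignment. A brute-force check of this semantics (our own encoding: variables as positive integers, literals as signed integers, clauses as sets) can be sketched as:

```python
from itertools import product

def satisfies(clauses, assign):
    """Evaluate a CNF clause set (sets of signed int literals) under a full
    assignment mapping variable -> bool."""
    return all(any(assign[abs(l)] == (l > 0) for l in c) for c in clauses)

def is_true_2qbf(xs, ys, clauses):
    """Truth of forall X. exists Y. clauses for 2QBF: Skolem functions exist
    iff a matching Y-assignment exists for every X-assignment (pointwise)."""
    for xv in product([False, True], repeat=len(xs)):
        xa = dict(zip(xs, xv))
        if not any(satisfies(clauses, {**xa, **dict(zip(ys, yv))})
                   for yv in product([False, True], repeat=len(ys))):
            return False
    return True

# forall x1. exists y2. x1 = y2, in CNF: (~x1 v y2) and (x1 v ~y2)
print(is_true_2qbf([1], [2], [{-1, 2}, {1, -2}]))  # True
```

This exponential check is only meant to pin down the semantics; the calculus below avoids the enumeration.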
A formula is in prenex normal form if it is closed and starts with a sequence of quantifiers followed by a propositional subformula. A formula \(\varphi \) is in the kQBF fragment for \(k\in \mathbb {N}^+\) if it is closed, in prenex normal form, and has exactly \(k-1\) alternations between \(\exists \) and \(\forall \) quantifiers.
A literal l is either a variable \(x \in X\), or its negation \(\lnot x\). Given a set of literals \(\{l_1,\dots ,l_n\}\), their disjunction \((l_1 \vee \ldots \vee l_n)\) is called a clause and their conjunction \((l_1 \wedge \ldots \wedge l_n)\) is called a cube. We use \(\overline{l}\) to denote the literal that is the logical negation of l. We denote the variable of a literal by \( var (l)\) and lift the notion to clauses \( var (l_1\vee \dots \vee l_n) = \{ var (l_1),\dots , var (l_n)\}\).
A propositional formula is in conjunctive normal form (CNF), if it is a conjunction of clauses. A prenex QBF is in prenex conjunctive normal form (PCNF) if its propositional subformula is in CNF. Every QBF \(\varphi \) can be transformed into an equivalent PCNF of size \(O(|\varphi |)\) [27].
Resolution is a wellknown proof rule that allows us to merge two clauses as follows. Given two clauses \(C_1\vee v\) and \(C_2\vee \lnot v\), we call \(C_1\otimes _v C_2 = C_1\vee C_2\) their resolvent with pivot v. The resolution rule states that \(C_1\vee v\) and \(C_2 \vee \lnot v\) imply their resolvent. Resolution is refutationally complete for propositional Boolean formulas, i.e. for every propositional Boolean formula that is equivalent to false we can derive the empty clause.
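In the signed-integer clause encoding used throughout our sketches (an assumption of ours, not the paper's notation), the resolution rule is a one-liner:

```python
def resolve(c1, c2, v):
    """Resolvent C1 (x)_v C2 of clauses c1 (containing v) and c2 (containing
    -v) with pivot v; clauses are frozensets of signed integer literals."""
    assert v in c1 and -v in c2
    return (c1 - {v}) | (c2 - {-v})

# (a v c) and (b v ~c) resolve on pivot c to (a v b)
print(sorted(resolve(frozenset({1, 3}), frozenset({2, -3}), 3)))  # [1, 2]
```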
For quantified Boolean formulas, however, we need additional proof rules. The two most prominent ways to make resolution complete for QBF are to add either universal reduction or universal expansion, leading to the proof systems Q-resolution [28] and \(\forall \)Exp+Res [10, 29], respectively.
Universal expansion eliminates a single universal variable by creating two copies of the subformula of its quantifier. Let \(Q_1.\forall x.Q_2.~\varphi \) be a QBF in PCNF, where \(Q_1\) and \(Q_2\) each are a sequence of quantifiers, and let \(Q_2\) quantify over variables X. Universal expansion yields the equivalent formula \(Q_1.Q_2.Q_2'.~\varphi [\mathbf {1}/x,X'/X]\wedge \varphi [\mathbf {0}/x]\), where \(Q_2'\) is a copy of \(Q_2\) but quantifying over a fresh set of variables \(X'\) instead of X. The term \(\varphi [\mathbf {1}/x,\ X'/X]\) denotes the formula \(\varphi \) in which x is replaced by \(\mathbf {1}\) and the variables X are replaced by their counterparts in \(X'\).
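On the clause level, expansion substitutes both values for x and renames the inner variables in one copy. A minimal sketch in our signed-integer clause encoding (the helper name and variable numbering are our own):

```python
def expand_universal(clauses, x, rename):
    """Expand universal variable x: return the clauses of
    phi[1/x, X'/X] conjoined with phi[0/x], where `rename` maps each
    variable quantified inside x to its fresh copy in X'."""
    def substitute(clause, x_value, ren):
        out = set()
        for l in clause:
            if abs(l) == x:
                if (l > 0) == x_value:
                    return None          # literal satisfied: drop the clause
                continue                 # literal falsified: drop the literal
            out.add((1 if l > 0 else -1) * ren.get(abs(l), abs(l)))
        return frozenset(out)
    copies = [substitute(c, True, rename) for c in clauses]
    originals = [substitute(c, False, {}) for c in clauses]
    return [c for c in copies + originals if c is not None]

# expand x (variable 1) in forall 1. exists 2. (~1 v 2) and (1 v ~2),
# with variable 3 as the fresh copy of variable 2
print(expand_universal([{-1, 2}, {1, -2}], 1, {2: 3}))
```

For the example, the result is the unit clauses \((y')\) and \((\lnot y)\): the copy for \(x=\mathbf {1}\) must be true and the original for \(x=\mathbf {0}\) must be false.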
Universal reduction allows us to drop universal variables from clauses when none of the existential variables in that clause may depend on them. Let C be a clause of a QBF and let l be a literal of a universally quantified variable in C. Let us further assume that \(\overline{l}\) does not occur in C. If for all existential variables v in C we have \( var (l)\notin dep (v)\), universal reduction allows us to remove l from C. The resulting formula is equivalent to the original formula.
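A small sketch of this rule under our clause encoding, assuming the input clause is non-tautological (the dependency sets are passed in explicitly as a dictionary):

```python
def universal_reduce(clause, universals, dep):
    """Drop each universal literal l from a non-tautological clause when no
    existential variable in the clause depends on var(l); `dep` maps
    existential variables to their dependency sets."""
    existentials = {abs(l) for l in clause if abs(l) not in universals}
    return frozenset(l for l in clause
                     if abs(l) not in universals
                     or any(abs(l) in dep[e] for e in existentials))

# clause (x1 v y2) with dep(y2) = {} reduces to the unit clause (y2)
print(sorted(universal_reduce(frozenset({1, 2}), {1}, {2: set()})))  # [2]
```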
Stack. For convenience, we use a stack data structure to describe the algorithm. Formally, a stack is a finite sequence. Given a stack S, we use S(i) to denote the ith element of the stack, starting with index 0, and we use \(S.S'\) to denote concatenation. We use S[0, i] to denote the prefix up to element i of S. All stacks we consider are stacks of sets. In a slight abuse of notation, we also use stacks as the union of their elements when it is clear from the context. We also introduce an operation specific to stacks of sets S: We define \( add (S,i,x)\) to be the stack that results from extending the set on level i by element x.
2.1 Unique Skolem Functions
Incremental determinization builds on the notion of unique Skolem functions. Let \(\forall X. \exists Y.\ \varphi \) be a 2QBF in PCNF and let \(\chi \) be a formula over X characterizing the domain of the Skolem functions we are currently interested in. We say that a variable \(v\in Y\) has a unique Skolem function for domain \(\chi \), if for each assignment \({\varvec{x}}\) with \(\chi ({\varvec{x}})\) there is a unique assignment \({\varvec{v}}\) to v such that \(\varphi ({\varvec{x}},{\varvec{v}})\) is satisfiable. In particular, a unique Skolem function is a Skolem function:
Lemma 1
If all existential variables have a unique Skolem function for the full domain \(\chi =\mathbf {1}\), the formula is true.
The semantic characterization of unique Skolem functions above does not help us with the computation of Skolem functions directly. We now introduce a local approximation of unique Skolem functions and show how it can be used as a propagation procedure.
We consider a set of variables \(D\subseteq X\cup Y\) with \(D\supseteq X\) and focus on the subset \(\varphi _D\) of clauses that only contain variables in D. We further assume that the existential variables in D already have unique Skolem functions for \(\chi \) in the formula \(\varphi _D\). We now define how to extend D by an existential variable \(v\notin D\). To define a Skolem function for v we only consider the clauses with unique consequence v, denoted \(\mathcal {U}_v\), that contain a literal of v and otherwise only literals of variables in D. (Note that \(\varphi _D\cup \mathcal {U}_v=\varphi _{D\cup \{v\}}\)). We define that variable v has a unique Skolem function relative to D for \(\chi \), if for all assignments to D satisfying \(\chi \) and \(\varphi \) there is a unique assignment to v satisfying \(\mathcal {U}_v\).
In order to determine unique Skolem functions relative to a set D in practice, we split the definition into the two statements \(\mathsf {deterministic}\) and \(\mathsf {unconflicted}\). Each statement can be checked by a SAT solver and together they imply that variable v has a unique Skolem function relative to D.
Lemma 2
Let the existential variables in D have unique Skolem functions for domain \(\chi \) and let \(v\in Y\) have a unique Skolem function relative to D for domain \(\chi \). Then v has a unique Skolem function for domain \(\chi \).
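The two statements can be sketched by brute force (enumeration standing in for the SAT calls; for simplicity this sketch ignores the domain restriction \(\chi \wedge \alpha \) and the requirement that the D-assignment satisfy \(C_D\)):

```python
from itertools import product

def forced_values(u_v, v, d_assign):
    """Values of v forced by the unique-consequence clauses U_v under an
    assignment to D: a clause whose non-v literals are all falsified
    propagates its v-literal."""
    forced = set()
    for clause in u_v:
        rest = [l for l in clause if abs(l) != v]
        if all(d_assign[abs(l)] != (l > 0) for l in rest):
            v_lit = next(l for l in clause if abs(l) == v)
            forced.add(v_lit > 0)
    return forced

def deterministic(u_v, v, d_vars):
    """Every assignment to D forces a value for v (brute force)."""
    return all(forced_values(u_v, v, dict(zip(d_vars, vals)))
               for vals in product([False, True], repeat=len(d_vars)))

def unconflicted(u_v, v, d_vars):
    """No assignment to D forces both values of v (brute force)."""
    return all(forced_values(u_v, v, dict(zip(d_vars, vals))) != {False, True}
               for vals in product([False, True], repeat=len(d_vars)))

# U_v for v = y2 over D = {x1}: clauses (~x1 v y2) and (x1 v ~y2)
print(deterministic([{-1, 2}, {1, -2}], 2, [1]),
      unconflicted([{-1, 2}, {1, -2}], 2, [1]))  # True True
```

A real implementation discharges both checks with a SAT solver instead of enumerating D-assignments.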
3 Inference Rules for Incremental Determinization

A state of the algorithm is a tuple \((S,C,D,\chi ,\alpha )\) with the following components:

- The solver status \(S\in \{\mathsf {Ready},\mathsf {Conflict}(L,{\varvec{x}}),\mathsf {SAT},\mathsf {UNSAT}\}\). The conflict status has two parameters: a clause L that is used to compute the learnt clause and the assignment \({\varvec{x}}\) to the universals witnessing the conflict.
- A stack C of sets of clauses. C(0) contains the original and the learnt clauses. C(i) for \(i>0\) contains temporary clauses introduced by decisions.
- A stack D of sets of variables. The union of all levels in the stack represents the set of variables that currently have unique Skolem functions, and the clauses in \(C_D\) represent these Skolem functions. D(0) contains the universals and the existentials whose Skolem functions do not depend on decisions.
- A formula \(\chi \) over D(0) characterizing the set of assignments to the universals for which we still need to find a Skolem function.
- A formula \(\alpha \) over variables D(0) representing a temporary restriction on the domain of the Skolem functions.
We assume that we are given a 2QBF in PCNF \(\forall X. \exists Y.~\varphi \) and that all clauses in \(\varphi \) contain an existential variable. (If \(\varphi \) contains a non-tautological clause without existential variables, the formula is trivially false by universal reduction.) We define \((\mathsf {Ready},\varphi ,X, \mathbf {1}, \mathbf {1})\) to be the initial state of the algorithm. That is, the clause stack C initially has height 1 and contains the clauses of the formula \(\varphi \). We initialize D as the stack of height 1 containing the universals.
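The state tuple and the initial state can be sketched as a small data structure (the field names and the use of Python booleans for the formulas \(\chi \) and \(\alpha \) are our own simplification):

```python
from dataclasses import dataclass

@dataclass
class State:
    status: str        # "Ready", "Conflict", "SAT" or "UNSAT"
    clauses: list      # stack C of sets of clauses; C[0]: original + learnt
    determined: list   # stack D of sets of variables; D[0]: universals and
                       # decision-independent existentials
    chi: object = True    # domain still in need of a Skolem function
    alpha: object = True  # temporary restriction of the domain

def initial_state(universals, phi):
    """The initial state (Ready, phi, X, 1, 1): both stacks have height 1."""
    return State("Ready", [set(phi)], [set(universals)])

s = initial_state({1}, [frozenset({-1, 2}), frozenset({1, -2})])
print(s.status, len(s.clauses), len(s.determined))  # Ready 1 1
```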
Before we dive into the inference rules, we want to point out that some of the rules in this calculus are not computable in polynomial time. The judgements \(\mathsf {deterministic}\) and \(\mathsf {unconflicted}\) require us to solve a SAT problem and are, in general, NP-complete. This is still easier than the 2QBF problem itself (unless NP includes \(\varPi _2^P\)), and in practice they can be discharged quickly by SAT solvers.
3.1 True QBF
Invariant 1
All existential variables in D have a unique Skolem function for the domain \(\chi \wedge \alpha \) in the formula \(\forall X. \exists Y.~ C_D\), where \(C_D\) are the clauses in C that contain only variables in D.
If Propagate identifies all variables to have unique Skolem functions relative to the growing set D, we know that they also have unique Skolem functions (Lemma 2). We can then apply Sat to reach the \(\mathsf {SAT}\) state, representing that the formula has been proven true (Lemma 1).
Lemma 3
ID cannot reach the SAT state for false QBF.
Proof
Let us assume we reached the \(\mathsf {SAT}\) state for a false 2QBF and prove the statement by way of contradiction. The \(\mathsf {SAT}\) state can only be reached by the rule Sat and requires \(D = X \cup Y\). By Invariant 1 all variables have a Skolem function in \(\forall X.\exists Y.~ C\). Since C includes \(\varphi \), this Skolem function does not violate any clause in \(\varphi \), which means it is indeed a proof. \(\square \)
Assuming we consider a true 2QBF, we can pick a Skolem function \(f_{y}\) for each existential variable y and encode it using Decide. We can simply consider the truth table of \(f_y\) in terms of the universal variables and define \(\delta \) to be the set of clauses \(\{\lnot {\varvec{x}} \vee y\mid f_y({\varvec{x}})\}\cup \{\lnot {\varvec{x}} \vee \lnot y\mid \lnot f_y({\varvec{x}})\}\). (Here we interpret the assignment \({\varvec{x}}\) as a conjunction of literals.) These clauses have unique consequence y and they guarantee that y is \(\mathsf {deterministic}\). Further, they guarantee that y is \(\mathsf {unconflicted}\), as otherwise \(f_y\) would not be a Skolem function, so we can apply Propagate to add y to D. Repeating this process for every variable lets us reach the point where \(Y\subseteq D\) and we can apply Sat to reach the \(\mathsf {SAT}\) state.
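The truth-table encoding of a Skolem function as decision clauses can be sketched as follows (signed-integer literals; the function is passed as a Python callable over assignments, which is our own modeling choice):

```python
from itertools import product

def truth_table_clauses(f_y, xs, y):
    """Encode a Skolem function f_y for variable y as decision clauses:
    for each universal assignment x, one clause (negated assignment cube
    v y-literal), i.e. {~x v y | f_y(x)} and {~x v ~y | ~f_y(x)}."""
    delta = []
    for vals in product([False, True], repeat=len(xs)):
        assign = dict(zip(xs, vals))
        negated_cube = {-x if assign[x] else x for x in xs}
        delta.append(frozenset(negated_cube | {y if f_y(assign) else -y}))
    return delta

# the Skolem function f_y(x) = x for forall x1. exists y2. x1 = y2
for clause in truth_table_clauses(lambda a: a[1], [1], 2):
    print(sorted(clause))  # [-2, 1] then [-1, 2]
```

Each clause has unique consequence y, as required for Propagate to fire.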
Lemma 4
ID can reach the SAT state for true QBF.
Note that proving the truth of a QBF in this way requires guessing correct Skolem functions for all existentials. In Subsect. 3.4 we discuss how termination is guaranteed with a simpler type of decisions.
3.2 False QBF
To disprove false 2QBFs, i.e. formulas that do not have a Skolem function, we need the rules in Fig. 2 in addition to Propagate and Decide from Fig. 1. The \(\mathsf {Conflict}\) state can only be reached via the rule Conflict, which requires that a variable v is conflicted, i.e. \(\mathsf {unconflicted}\) fails. The Conflict rule stores the assignment \({\varvec{x}}\) to D that proves the conflict and it creates the nucleus of the learnt clause \(\{v,\lnot v\}\). Via Analyze we can then resolve that nucleus with clauses in C(0), which consists of the original clauses and the clauses learnt so far. We are allowed to add the learnt clause back to C(0) by applying Learn.
Invariant 2
C(0) is equivalent to \(\varphi \).
Note that C(0) and \(\varphi \) are propositional formulas over \(X\cup Y\). Their equivalence means that they have the same set of satisfying assignments. We prove Invariant 2 together with the next invariant.
Invariant 3
Clause L in conflict state \(\mathsf {Conflict}(L,{\varvec{x}})\) is implied by \(\varphi \).
Proof
C(0) contains \(\varphi \) initially and is only ever changed by adding clauses through the Learn rule, so \(C(0)\Rightarrow \varphi \) holds throughout the computation.
We prove the other direction of Invariants 2 and 3 by mutual induction. Initially, C(0) consists exactly of the clauses \(\varphi \), satisfying Invariant 2. The nucleus of the learnt clause \(v\vee \lnot v\) is trivially true, so it is implied by any formula, which gives us the base case of Invariant 3. Analyze is the only rule modifying L, and hence soundness of resolution together with Invariant 2 already gives us the induction step for Invariant 3 [30]. Since Learn is the only rule changing C(0), Invariant 3 implies the induction step of Invariant 2. \(\square \)
When adding the learnt clause to C(0) we have to make sure that Invariant 1 is preserved. Learn hence requires that we have backtracked far enough with Backtrack, such that at least one of the variables in L is not in D anymore. In this way, L may become part of future Skolem function definitions, but will first have to be checked for causing conflicts by Propagate.
If all variables in L are in D(0) and the assignment \({\varvec{x}}\) from the conflict violates L, we can conclude the formula to be false using Unsat. The soundness of this step follows from the fact that \({\varvec{x}}\) includes an assignment satisfying \(C(0)_{D(0)}\) (i.e. the clauses defining the Skolem functions for D(0)), Invariants 1 and 3.
Lemma 5
ID cannot reach the UNSAT state for true QBF.
We will now show that we can disprove any false QBF. The main difficulty in this proof is to show that from any \(\mathsf {Ready}\) state we can learn a new clause, i.e. a clause that is semantically different from any clause in C(0), and then return to the \(\mathsf {Ready}\) state. Since there are only finitely many semantically different clauses over variables \(X\cup Y\), and we cannot terminate in any other way (Lemma 3), we eventually have to find a clause L with \( var (L)\subseteq D(0)\), which enables us to go to the \(\mathsf {UNSAT}\) state.
From the \(\mathsf {Ready}\) state, we can always add more variables to D with Decide and Propagate, until we reach a conflict. (Otherwise we would reach a state with \(D=X\cup Y\) and would be able to prove \(\mathsf {SAT}\), contradicting Lemma 3.) We only enter a \(\mathsf {Conflict}\) state for a variable v, if there are two clauses \((c_1\vee v)\) and \((c_2\vee \lnot v)\) with unique consequence v such that \({\varvec{x}}\models \lnot c_1 \wedge \lnot c_2\) (see the definition of \(\mathsf {unconflicted}\)).
In order to apply Analyze, we need to make sure that \((c_1\vee v)\) and \((c_2\vee \lnot v)\) are in C(0). We can guarantee this by restricting Decide as follows: We say a decision for a variable v is consistent with the unique consequences in state \((\mathsf {Ready},C,D,\chi ,\alpha )\), if \(\mathsf {unconflicted}(v,C.\delta ,\chi \wedge \alpha ,D)\). We can construct such a decision easily by applying Decide only on variables that are not conflicted already (i.e. \(\mathsf {unconflicted}(v,C,\chi \wedge \alpha ,D)\)) and by defining \(\delta \) to be the CNF representation of \(\lnot \mathcal A_{v}\Rightarrow \lnot v\) (i.e. require v to be false, unless a unique consequence containing literal v applies). It is clear that for this \(\delta \) no new conflict for v is introduced and hence \(\mathsf {unconflicted}(v,C.\delta ,\chi \wedge \alpha ,D)\).
Assuming that all decisions are taken consistent with the unique consequences, we know that when we encounter a conflict for variable v, we did not apply Decide for v, and hence the clauses \((c_1\vee v)\) and \((c_2\vee \lnot v)\) causing the conflict must be in C(0). We can hence apply Analyze twice with clauses \((c_1\vee v)\) and \((c_2\vee \lnot v)\) and obtain the learnt clause \(L=c_1\vee c_2\). Since \({\varvec{x}}\models \lnot c_1 \wedge \lnot c_2\), the learnt clause is violated by \({\varvec{x}}\). As \({\varvec{x}}\) refutes \(\mathsf {unconflicted}(v,C,\chi \wedge \alpha ,D)\) by construction, it must satisfy the clauses \(C_D\), and the learnt clause L hence cannot be in \(C_D\). Further, L only contains variables that are in D, as \((c_1\vee v)\) and \((c_2\vee \lnot v)\) were clauses with unique consequence v. So, L would have been in \(C_D\), if it existed in C already, and hence L is new. We can either add the new clause to C(0) after backtracking, or we can conclude \(\mathsf {UNSAT}\).
Lemma 6
ID can reach the \(\mathsf {UNSAT}\) state for false QBF.
The clause learning process considered here only applies one actual resolution step per conflict (\(L_1\otimes _v L_2\)). In practice, we probably want to apply multiple resolution steps before applying Learn. It is possible to use the conflicting assignment \({\varvec{x}}\) to (implicitly) construct an implication graph and mimic the clause learning of SAT solvers [8, 31].
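The two Analyze applications per conflict can be traced concretely. In the sketch below (signed-integer literals; the numbering \(y_1 \mapsto 1\), \(y_3 \mapsto 3\), \(y_4 \mapsto 4\) and the clause shapes are hypothetical, chosen to reproduce the learnt clause \(\lnot y_1\vee \lnot y_3\) from the example in Subsect. 3.3), resolving the nucleus \(\{v,\lnot v\}\) with \((c_1\vee v)\) and then \((c_2\vee \lnot v)\) yields \(L=c_1\vee c_2\):

```python
def resolve(c1, c2, v):
    """Resolvent with pivot v; clauses are frozensets of signed literals."""
    assert v in c1 and -v in c2
    return (c1 - {v}) | (c2 - {-v})

# Conflict for y4 (variable 4) with hypothetical unique-consequence clauses
# (c1 v y4) = (~y1 v y4) and (c2 v ~y4) = (~y3 v ~y4).
nucleus = frozenset({4, -4})                     # trivially true start clause
step1 = resolve(frozenset({-1, 4}), nucleus, 4)  # first Analyze application
learnt = resolve(step1, frozenset({-3, -4}), 4)  # second: L = (~y1 v ~y3)
print(sorted(learnt))  # [-3, -1]
```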
3.3 Example
While Propagate was not applicable to \(y_2\) before, it now is, as the increased set D made \(y_2\) \(\mathsf {deterministic}\) (see clauses in line (2)). We can thus derive the state \((\mathsf {Ready}, \varphi , X\cup \{y_1,y_2\}, \mathbf {1}, \mathbf {1})\).
Now, we ran out of variables to propagate and the only applicable rule is Decide. We arbitrarily choose \(y_3\) as our decision variable and arbitrarily introduce a single clause \(\delta = \{(\lnot y_1\vee \lnot y_2\vee y_3)\}\), arriving in the state \((\mathsf {Ready}, \varphi .\delta , X\cup \{y_1,y_2\}, \mathbf {1}, \mathbf {1})\). In this step we can immediately apply Propagate (consider \(\delta \) and the clauses in line (3)) to add the decision variable to the set D and arrive at \((\mathsf {Ready}, \varphi .\delta , X\cup \{y_1,y_2,y_3\}, \mathbf {1}, \mathbf {1})\).
We could now apply Backtrack to undo the last decision, but this would not be productive. Instead, we identify \(y_4\) to be conflicted and enter a conflict state with Conflict: \((\mathsf {Conflict}(\{y_4,\lnot y_4\}, x_1\wedge x_2), \varphi .\delta , X\cup \{y_1,y_2,y_3\}, \mathbf {1}, \mathbf {1})\). To resolve the conflict we apply Analyze twice, once with each of the clauses in line (4), bringing us into state \((\mathsf {Conflict}(\{\lnot y_1,\lnot y_3\}, x_1\wedge x_2), \varphi .\delta , X\cup \{y_1,y_2,y_3\}, \mathbf {1}, \mathbf {1})\). We can backtrack one level such that \(D=X\cup \{y_1,y_2\}\) and then apply Learn to enter state \((\mathsf {Ready}, \varphi \cup \{(\lnot y_1\vee \lnot y_3)\}, X\cup \{y_1,y_2\}, \mathbf {1}, \mathbf {1})\).
The rest is simple: we apply Propagate on \(y_3\) and take a decision for \(y_4\). As no other variable can depend on \(y_4\) we can take an arbitrary decision for \(y_4\) that makes \(y_4\) deterministic, as long as this does not make \(y_4\) conflicted. Finally, we can propagate \(y_4\) and then apply \(\mathsf {SAT}\) to conclude that we have found Skolem functions for all existential variables.
3.4 Termination
So far, we have described sound and nondeterministic algorithms that allow us to prove or disprove any 2QBF. We can easily turn the algorithm in the proof of Lemma 6 into a deterministic algorithm that terminates for both true and false QBF by introducing an arbitrary ordering of variables and assignments: Whenever there is nondeterminism in the application of one of the rules as described in Lemma 6, pick the smallest variable for which one of the rules is applicable. When multiple rules are applicable for that variable, pick them in the order they appear in the figures. When the inference rule allows multiple assignments, pick the smallest. In particular, this guarantees that the existential variables are added to D in the arbitrarily picked order, as for any existential not in D we can either apply Propagate, Decide, or Conflict.
Restricting Decide to decisions that are consistent with the unique consequences may be unintuitive for true QBF, where we try to find a Skolem function. However, whenever we make the 2QBF false by introducing clauses with Decide, we will eventually go to a conflict state and learn a new clause. Deriving the learnt clause for conflicted variable v from two clauses with unique consequence v (as described for Lemma 6) means that we push the constraints towards smaller variables in the variable ordering. The learnt clause will thus improve the Skolem function for a smaller variable or cause another conflict for a smaller variable. In the extreme case, we will eventually learn clauses that look like function table entries, as used in Lemma 4, i.e. clauses containing exactly one existential variable. At some point, even with our restriction for Decide, we cannot make a “wrong” decision: The cases for which a variable does not have a clause with unique consequence are either irrelevant for the satisfaction of the 2QBF or our restricted decisions happen to make the right assignment.
In cases where no static ordering of variables is used, as will be the case in any practical approach, the termination for true QBF is less obvious but follows the same argument: Given enough learnt clauses, the relationships between the variables are dense enough such that even naive decisions suffice.
3.5 Pure Literals
The original paper on ID introduces the notion of pure literals for QBF that allows us to propagate a variable v even if it is not \(\mathsf {deterministic}\), if for a literal l of v, all clauses c in which l occurs are either satisfied or l is the unique consequence of c. The formalization presented in this section allows us to conclude that pure literals are a special case of Decide: We can introduce clauses defining v to be of polarity \(\overline{l}\) whenever all clauses containing l are satisfied by another literal.
3.6 Relation of ID and CDCL
4 Inductive Reasoning
The CEGIS approach to solving a 2QBF \(\forall X \mathpunct {.}\exists Y \mathpunct {.}\varphi \) is to iterate over X assignments \({\varvec{x}}\) and check if there is an assignment \({\varvec{y}}\) such that \(\varphi ({\varvec{x}}, {\varvec{y}})\) holds. Upon every successful iteration we exclude all assignments to X for which \({\varvec{y}}\) is a matching assignment. If the space of X assignments is exhausted, we conclude that the formula is true, and if we find an assignment to X for which there is no matching Y assignment, the formula is false [21].
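This loop can be sketched by brute force (enumeration stands in for the two SAT solvers of a real implementation; the clause encoding is our own):

```python
from itertools import product

def cegis_2qbf(xs, ys, clauses):
    """A brute-force sketch of the CEGIS loop for forall X. exists Y. phi."""
    def holds(assign):
        return all(any(assign[abs(l)] == (l > 0) for l in c) for c in clauses)

    uncovered = set(product([False, True], repeat=len(xs)))
    while uncovered:
        xa = dict(zip(xs, next(iter(uncovered))))     # pick an X-assignment
        match = next((dict(zip(ys, yv))
                      for yv in product([False, True], repeat=len(ys))
                      if holds({**xa, **dict(zip(ys, yv))})), None)
        if match is None:
            return False      # X-assignment with no matching Y-assignment
        # exclude every X-assignment for which this Y-assignment also works
        uncovered = {xv for xv in uncovered
                     if not holds({**dict(zip(xs, xv)), **match})}
    return True               # space of X-assignments exhausted

print(cegis_2qbf([1], [2], [{-1, 2}, {1, -2}]))  # True
```

The generalization step is the set comprehension at the end: one \({\varvec{y}}\) may cover many X-assignments at once, which is exactly what fails on \(x=y\), where each \({\varvec{y}}\) covers a single \({\varvec{x}}\).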
While this approach shows poor performance on some problems, as discussed in the introduction, it is widely popular and has been successfully applied in many cases. In this section we present an elegant way to integrate it into ID. The simplicity of the CEGIS approach carries over to our extension of ID: we only need the two additional inference rules in Fig. 4.
We exploit the fact that ID already generates assignments \({\varvec{x}}\) to X in its conflict check. Whenever ID is in a conflict state, the rules in Fig. 4 allow us to check if there is an assignment \({\varvec{y}}\) to Y that together with \({\varvec{x}}\mid _{X}\), the part of \({\varvec{x}}\) assigning the variables in X, satisfies \(\varphi \). If there is such an assignment \({\varvec{y}}\), we can let the Skolem functions output \({\varvec{y}}\) for the input \({\varvec{x}}\). But the output \({\varvec{y}}\) may work for other assignments to X, too. The set of all assignments to X for which \({\varvec{y}}\) works as an output is easily characterized by \(\varphi ({\varvec{y}})\).^{1} InductiveRefinement allows us to exclude the assignments \(\varphi ({\varvec{y}})\) from \(\chi \), which represents the domain (i.e. assignments to X) for which we still need to find a Skolem function.
This gives rise to a new invariant, stating that \(\lnot \chi \) only includes assignments to X for which we know that there is an assignment to Y satisfying \(\varphi \). With this invariant it is clear that Lemma 3 also holds for arbitrary \(\chi \).
Invariant 4
\(\forall X.\exists Y.~\lnot \chi \Rightarrow \varphi \)
It is easy to check that Propagate preserves Invariant 1 also if \(\chi \) and \(\alpha \) are not \(\mathbf {1}\). Invariants 2 and 3 are unaffected by the rules in this section. To make sure that Lemma 5 is preserved as well, we thus only have to inspect Failed, which is trivially sound.
A Portfolio Approach? In principle, we could generate assignments \({\varvec{x}}\) independently from the conflict check of ID. The result would be a portfolio approach that simply executes ID and CEGIS in parallel and takes the result from whichever method terminates first. The idea behind our extension is that conflict assignments are more selective and may thus increase the probability that we hit a refuting assignment to X. Also, ID may profit from excluding groups of assignments that frequently cause conflicts. We revisit this question in Sect. 6.
Example. We extend the example from Subsect. 3.3 from the point where we entered the conflict state \((\mathsf {Conflict}(\{y_4,\lnot y_4\}, x_1\wedge x_2), \varphi .\delta , X\cup \{y_1,y_2,y_3\}, \mathbf {1}, \mathbf {1})\). We can apply InductiveRefinement, checking that there is indeed a solution to \(\varphi \) for the assignment \(x_1, x_2\) to the universals (e.g. \(y_1,y_2,\lnot y_3, y_4\)). Instead of doing the standard conflict analysis as in our previous example, we can apply Learn to add the (useless) clause \(y_4\vee \lnot y_4\) to C(0) without any backtracking. That is, we effectively ignore the conflict and go to state \((\mathsf {Ready}, \varphi \cup \{(y_4\vee \lnot y_4)\}.\delta , X\cup \{y_1,y_2,y_3\}, \lnot x_1 \vee \lnot x_2, \mathbf {1})\).
There is no assignment to X that provokes a conflict for \(y_4\), other than the one we excluded through InductiveRefinement. We can thus take an arbitrary decision for \(y_4\) that is consistent with the unique consequences (see Subsect. 3.2), Propagate \(y_4\), and then conclude the formula to be true.
5 Expansion
One way to look at the expansion of a universal variable x is that it introduces a case distinction over the possible values of x in the Skolem functions. However, instead of creating a copy of the formula explicitly, which often causes a blow-up in required memory, we can reason about the two cases sequentially. The rules in Fig. 5 extend ID by universal expansion in this spirit.
Using Assume we can, at any point, assume that a variable v in D(0), i.e. a variable that has a unique Skolem function without any decisions, has a particular value. This is represented by extending \(\alpha \) by the corresponding literal of v, which restricts the domain of the Skolem function that we try to construct for subsequent \(\mathsf {deterministic}\) and \(\mathsf {unconflicted}\) checks. Invariant 1 and Lemma 5 already accommodate the case that \(\alpha \) is not \(\mathbf {1}\).
When we reach a point where D contains all variables, we cannot apply Sat, as that requires \(\alpha \) to be true. In this case, Invariant 1 only guarantees us that the function we constructed is correct on the domain \(\chi \wedge \alpha \). We can hence restrict the domain for which we still need to find a Skolem function and strengthen \(\chi \) by \(\lnot \alpha \). In particular, Close maintains Invariant 4. When \(\chi \) ends up being equivalent to \(\mathbf {0}\), Invariant 4 guarantees that the original formula is true. (In this case we can reach the \(\mathsf {SAT}\) state easily, as we know that from now on every application of Propagate must succeed.^{2})
Note that Assume does not restrict us to assumptions on single variables. Together with Decide and Propagate it is possible to introduce variables with arbitrary definitions, add them to D(0), and then assume an outcome with the rule Assume.
Example. Again, we consider the formula from Subsect. 3.3. Instead of the reasoning steps described in Subsect. 3.3, we start using Assume with literal \(x_2\). Whenever checking \(\mathsf {deterministic}\) or \(\mathsf {unconflicted}\) in the following, we will thus restrict ourselves to universal assignments that set \(x_2\) to true. It is easy to check that this allows us to propagate not only \(y_1\) and \(y_2\), but also \(y_3\). A decision (e.g. \(\delta '=\{(y_4)\}\)) for \(y_4\) allows us to also propagate \(y_4\) (this time without potential for conflicts), arriving in state \((\mathsf {Ready}, \varphi .\delta ', X\cup \{y_1,y_2,y_3,y_4\}, \mathbf {1}, x_2)\).
We can Close this case, concluding that under the assumption \(x_2\) we have found a Skolem function. We enter the state \((\mathsf {Ready}, \varphi , X, \lnot x_2, \mathbf {1})\), which indicates that in the future we only have to consider universal assignments with \(\lnot x_2\). Also for the case \(\lnot x_2\), we cannot encounter conflicts for this formula. Expansion hence allows us to prove this formula without any conflicts.
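To make the sequential treatment of the two cases concrete, the following toy sketch (our illustration, not CADET's code) case-splits on a single universal variable and solves the two restricted problems one after the other, mirroring Assume and Close, instead of building the doubled formula \(\varphi [x{=}0]\wedge \varphi [x{=}1]\):

```python
# Toy illustration of sequential expansion: Assume x1=1, solve that case,
# Close it, then handle x1=0 -- rather than conjoining two copies of the
# formula over a doubled variable set. Brute force stands in for ID here.
from itertools import product

def solve_2qbf(phi, n_x, n_y):
    """True iff for all x there exists y with phi(x, y), by enumeration."""
    return all(any(phi(x, y) for y in product([False, True], repeat=n_y))
               for x in product([False, True], repeat=n_x))

def solve_by_expansion(phi, n_x, n_y):
    """Case-split on the first universal variable; solve the cases in turn."""
    for bit in (True, False):  # Assume x1=bit ... then Close the case
        restricted = lambda x, y, b=bit: phi((b,) + x, y)
        if not solve_2qbf(restricted, n_x - 1, n_y):
            return False       # this case admits no Skolem function
    return True                # chi has become 0: the formula is true
```

The memory advantage is that the second case reuses the solver state of the first instead of duplicating the formula; in the rule system this corresponds to strengthening \(\chi \) by \(\lnot \alpha \) when a case is closed.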
6 Experimental Evaluation
How Does Inductive Reasoning Affect the Performance? In Fig. 6 we see that CADET-IR clearly dominates plain CADET. It also dominates all solvers that relied on clause-level CEGAR and Bloqqer (CAQE, Qesto, RAReQS).
Only GhostQ beats CADET-IR, solving 5 more formulas (of 384). A closer look revealed that there are many formulas for which CADET-IR and GhostQ show widely different runtimes, hinting at potential for future improvement.
Is the Inductive Reasoning Extension Just a Portfolio Approach? To settle this question, we created a version of CADET-IR, called IR-only, that exclusively applies inductive reasoning by generating assignments to the universals and applying InductiveRefinement. This version of CADET does not learn any clauses, but otherwise uses the same code as CADET-IR. On the QBFEval 2017 benchmark, IR-only and CADET together solved 235 problems within the time limit, while CADET-IR solved 243 problems. That is, even though the combined runtime of CADET and IR-only was twice the runtime of CADET-IR, they solved fewer problems. CADET-IR also uniquely solved 22 problems. This indicates that CADET-IR improves over the portfolio approach.
How Does Universal Expansion Affect the Performance? CADET-E clearly dominates plain CADET on QBFEval 2017, but compared to CADET-IR and some of the other QBF solvers, CADET-E shows mediocre performance overall. However, for some subsets of formulas, such as the Hardware Fixpoint formulas shown in Fig. 7, CADET-E dominated CADET, CADET-IR, and all other solvers. We also combined the two extensions of CADET to obtain CADET-IRE. While this helped to improve the performance on the Hardware Fixpoint formulas even further, it did not change the overall picture on QBFEval 2017.
7 Conclusion
Reasoning in quantified logics is one of the major challenges in computer-aided verification. Incremental Determinization (ID) introduced a new algorithmic principle for reasoning in 2QBF and delivered promising first results [8]. In this work, we formalized and generalized ID to improve the understanding of the algorithm and to enable future research on the topic. The presentation of the algorithm as a set of inference rules has allowed us to disentangle the design choices from the principles of the algorithm (Sect. 3). Additionally, we have explored two extensions of ID that both significantly improve the performance: the first integrates the popular CEGAR-style algorithms with Incremental Determinization (Sect. 4); the second integrates a different type of reasoning, termed universal expansion (Sect. 5).
Footnotes
 1.
We can actually exploit the Skolem functions that do not depend on decisions and exclude \(C(0)({\varvec{y}}_{\overline{D(0)}})\) from \(\chi \) instead, i.e. the set of assignments to D(0) to which the part of \({\varvec{y}}\) that is not in D(0) is a solution.
 2.
Technically, we could replace Sat by a rule that allows us to enter the \(\mathsf {SAT}\) state whenever \(\chi \) is \(\mathbf {0}\), which arguably would be more elegant. But that would require us to introduce the Close rule already for the basic ID inference system.
Acknowledgements
We want to thank Martina Seidl, who brought up the idea to formalize ID as inference rules, and Vijay D’Silva, who helped with disentangling the different perspectives on the algorithm. This work was supported in part by NSF grants 1139138, 1528108, 1739816, SRC contract 2638.001, the Intel ADEPT center, and the European Research Council (ERC) Grant OSARES (No. 683300).
References
 1.Solar-Lezama, A., Rabbah, R.M., Bodík, R., Ebcioglu, K.: Programming by sketching for bit-streaming programs. In: Proceedings of PLDI, pp. 281–294 (2005)
 2.Alur, R., Bodik, R., Juniwal, G., Martin, M.M., Raghothaman, M., Seshia, S.A., Singh, R., Solar-Lezama, A., Torlak, E., Udupa, A.: Syntax-guided synthesis. Depend. Softw. Syst. Eng. 40, 1–25 (2015)
 3.Faymonville, P., Finkbeiner, B., Rabe, M.N., Tentrup, L.: Encodings of bounded synthesis. In: Legay, A., Margaria, T. (eds.) TACAS 2017. LNCS, vol. 10205, pp. 354–370. Springer, Heidelberg (2017). https://doi.org/10.1007/978-3-662-54577-5_20
 4.Bloem, R., Könighofer, R., Seidl, M.: SAT-based synthesis methods for safety specs. In: McMillan, K.L., Rival, X. (eds.) VMCAI 2014. LNCS, vol. 8318, pp. 1–20. Springer, Heidelberg (2014). https://doi.org/10.1007/978-3-642-54013-4_1
 5.Solar-Lezama, A., Tancau, L., Bodík, R., Seshia, S.A., Saraswat, V.A.: Combinatorial sketching for finite programs. In: Proceedings of the 12th International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS), pp. 404–415. ACM Press, October 2006
 6.Clarke, E., Grumberg, O., Jha, S., Lu, Y., Veith, H.: Counterexample-guided abstraction refinement. In: Emerson, E.A., Sistla, A.P. (eds.) CAV 2000. LNCS, vol. 1855, pp. 154–169. Springer, Heidelberg (2000). https://doi.org/10.1007/10722167_15
 7.Jha, S., Seshia, S.A.: A theory of formal synthesis via inductive learning. Acta Inf. 54(7), 693–726 (2017)
 8.Rabe, M.N., Seshia, S.A.: Incremental determinization. In: Creignou, N., Le Berre, D. (eds.) SAT 2016. LNCS, vol. 9710, pp. 375–392. Springer, Cham (2016). https://doi.org/10.1007/978-3-319-40970-2_23
 9.Pulina, L.: The ninth QBF solvers evaluation – preliminary report. In: Proceedings of QBF@SAT. CEUR Workshop Proceedings, vol. 1719, pp. 1–13. CEUR-WS.org (2016)
 10.Biere, A.: Resolve and expand. In: Hoos, H.H., Mitchell, D.G. (eds.) SAT 2004. LNCS, vol. 3542, pp. 59–70. Springer, Heidelberg (2005). https://doi.org/10.1007/11527695_5
 11.Pigorsch, F., Scholl, C.: An AIG-based QBF-solver using SAT for preprocessing. In: Proceedings of DAC, pp. 170–175. IEEE (2010)
 12.Charwat, G., Woltran, S.: Dynamic programming-based QBF solving. In: Lonsing, F., Seidl, M. (eds.) Proceedings of Quantified Boolean Formulas. CEUR Workshop Proceedings, vol. 1719, pp. 27–40 (2016)
 13.Baumgartner, P., Tinelli, C.: The model evolution calculus. In: Baader, F. (ed.) CADE 2003. LNCS (LNAI), vol. 2741, pp. 350–364. Springer, Heidelberg (2003). https://doi.org/10.1007/978-3-540-45085-6_32
 14.Nieuwenhuis, R., Oliveras, A., Tinelli, C.: Abstract DPLL and abstract DPLL modulo theories. In: Baader, F., Voronkov, A. (eds.) LPAR 2005. LNCS (LNAI), vol. 3452, pp. 36–50. Springer, Heidelberg (2005). https://doi.org/10.1007/978-3-540-32275-7_3
 15.de Moura, L., Jovanović, D.: A model-constructing satisfiability calculus. In: Giacobazzi, R., Berdine, J., Mastroeni, I. (eds.) VMCAI 2013. LNCS, vol. 7737, pp. 1–12. Springer, Heidelberg (2013). https://doi.org/10.1007/978-3-642-35873-9_1
 16.Fazekas, K., Seidl, M., Biere, A.: A duality-aware calculus for quantified Boolean formulas. In: Proceedings of SYNASC, pp. 181–186. IEEE Computer Society (2016)
 17.D'Silva, V., Haller, L., Kroening, D.: Abstract conflict driven learning. In: Proceedings of POPL, pp. 143–154. ACM (2013)
 18.Giunchiglia, E., Narizzano, M., Tacchella, A.: QuBE: a system for deciding quantified Boolean formulas satisfiability. In: Goré, R., Leitsch, A., Nipkow, T. (eds.) IJCAR 2001. LNCS, vol. 2083, pp. 364–369. Springer, Heidelberg (2001). https://doi.org/10.1007/3-540-45744-5_27
 19.Lonsing, F., Biere, A.: DepQBF: a dependency-aware QBF solver. JSAT 7(2–3), 71–76 (2010)
 20.Ranjan, D., Tang, D., Malik, S.: A comparative study of 2QBF algorithms. In: Proceedings of SAT. ACM (2004)
 21.Janota, M., Marques-Silva, J.: Abstraction-based algorithm for 2QBF. In: Sakallah, K.A., Simon, L. (eds.) SAT 2011. LNCS, vol. 6695, pp. 230–244. Springer, Heidelberg (2011). https://doi.org/10.1007/978-3-642-21581-0_19
 22.Janota, M., Klieber, W., Marques-Silva, J., Clarke, E.: Solving QBF with counterexample guided refinement. In: Cimatti, A., Sebastiani, R. (eds.) SAT 2012. LNCS, vol. 7317, pp. 114–128. Springer, Heidelberg (2012). https://doi.org/10.1007/978-3-642-31612-8_10
 23.Janota, M., Marques-Silva, J.: Solving QBF by clause selection. In: Proceedings of IJCAI, pp. 325–331. AAAI Press (2015)
 24.Rabe, M.N., Tentrup, L.: CAQE: a certifying QBF solver. In: Proceedings of FMCAD, pp. 136–143 (2015)
 25.Bloem, R., Braud-Santoni, N., Hadzic, V.: QBF solving by counterexample-guided expansion. CoRR, vol. abs/1611.01553 (2016). http://arxiv.org/abs/1611.01553
 26.Tentrup, L.: On expansion and resolution in CEGAR based QBF solving. In: Majumdar, R., Kunčak, V. (eds.) CAV 2017. LNCS, vol. 10427, pp. 475–494. Springer, Cham (2017). https://doi.org/10.1007/978-3-319-63390-9_25
 27.Tseitin, G.S.: On the complexity of derivation in propositional calculus. In: Studies in Constructive Mathematics and Mathematical Logic, vol. 2, pp. 115–125 (1968). Reprinted in [36]
 28.Büning, H., Karpinski, M., Flögel, A.: Resolution for quantified Boolean formulas. Inf. Comput. 117(1), 12–18 (1995)
 29.Janota, M., Marques-Silva, J.: Expansion-based QBF solving versus Q-resolution. Theoret. Comput. Sci. 577, 25–42 (2015)
 30.Robinson, J.A.: A machine-oriented logic based on the resolution principle. J. ACM 12(1), 23–41 (1965)
 31.Marques-Silva, J.P., Sakallah, K.A.: GRASP – a new search algorithm for satisfiability. In: Proceedings of CAD, pp. 220–227. IEEE (1997)
 32.Janota, M., Klieber, W., Marques-Silva, J., Clarke, E.M.: Solving QBF with counterexample guided refinement. Artif. Intell. 234, 1–25 (2016)
 33.Klieber, W., Sapra, S., Gao, S., Clarke, E.: A non-prenex, non-clausal QBF solver with game-state learning. In: Strichman, O., Szeider, S. (eds.) SAT 2010. LNCS, vol. 6175, pp. 128–142. Springer, Heidelberg (2010). https://doi.org/10.1007/978-3-642-14186-7_12
 34.Biere, A., Lonsing, F., Seidl, M.: Blocked clause elimination for QBF. In: Bjørner, N., Sofronie-Stokkermans, V. (eds.) CADE 2011. LNCS (LNAI), vol. 6803, pp. 101–115. Springer, Heidelberg (2011). https://doi.org/10.1007/978-3-642-22438-6_10
 35.Tang, D., Yu, Y., Ranjan, D., Malik, S.: Analysis of search based algorithms for satisfiability of propositional and quantified Boolean formulas arising from circuit state space diameter problems. In: Hoos, H.H., Mitchell, D.G. (eds.) SAT 2004. LNCS, vol. 3542, pp. 292–305. Springer, Heidelberg (2005). https://doi.org/10.1007/11527695_23
 36.Siekmann, J., Wrightson, G.: Automation of Reasoning 2: Classical Papers on Computational Logic 1967–1970. Springer, Heidelberg (1983)
Copyright information
Open Access This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.
The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.