
1 Introduction

Efficient solving techniques for Boolean theories are an integral part of modern verification and synthesis methods. Especially in synthesis, the amount of choice in the solution space leads to propositional problems of enormous size. Quantified Boolean formulas (QBFs) have repeatedly been considered as a candidate theory for synthesis approaches [6, 7, 10, 11, 12, 24] and recent advances in QBF solvers give rise to hope that QBF may help to increase the scalability of those approaches.

Solving quantified Boolean formulas (QBF) using partial expansions in a counterexample guided abstraction and refinement (CEGAR) loop [16] has proven to be very successful. Since its introduction, the corresponding solver \(\text {RAReQS}\) has won several QBF competitions. In recent work, a different kind of CEGAR algorithm has been proposed [18, 25] and implemented in the solvers \(\text {Qesto}\) and \(\text {CAQE}\). All those CEGAR approaches share algorithmic similarities, such as working recursively over the structure of the quantifier prefix and using SAT solvers to enumerate candidate solutions. However, instead of using partial expansions of the QBF as \(\text {RAReQS}\) does, the newer approaches base their refinements on whether a set of clauses is satisfied or not. Despite those algorithmic similarities, the performance characteristics of the resulting solvers in experimental evaluations are very different and in many cases orthogonal: while \(\text {RAReQS}\) tends to perform best on instances with a low number of quantifier alternations, \(\text {Qesto}\) and \(\text {CAQE}\) have an advantage on instances with many alternations [25].

Proof theory has been repeatedly used to improve the understanding of different solving techniques. For example, the proof calculus \(\forall \text {Exp+Res}\) [17] has been developed to characterize aspects of expansion-based solving. In this paper, we introduce a new calculus \(\forall \text {Red+Res}\) that corresponds to the clause-based CEGAR approaches [18, 25]. The levelized nature of those algorithms is reflected by the rules of this calculus, universal reduction and propositional resolution, which are applied to blocks of quantifiers. We show that this calculus is inherently different from \(\forall \text {Exp+Res}\), which explains the empirical performance results. In detail, we show that \(\forall \text {Red+Res}\) polynomially simulates level-ordered \(Q\text {-resolution}\). We also discuss an extension to \(\forall \text {Red+Res}\) that was already proposed as a solving optimization [25] and show that this extension makes the resulting calculus exponentially more concise.

Further, we integrate the \(\forall \text {Exp+Res}\) calculus as a rule that can be used within the \(\forall \text {Red+Res}\) calculus, leading to a unified proof calculus for all current CEGAR approaches. We show that the unified calculus is exponentially stronger than both \(\forall \text {Exp+Res}\) and \(\forall \text {Red+Res}\), and also than just applying the two simultaneously. This unified calculus serves as a base for implementing an expansion refinement in the QBF solver \(\text {CAQE}\). On standard benchmark sets, the combined approach leads to a significant empirical improvement over the previous implementation.

2 Preliminaries

2.1 Quantified Boolean Formulas

We consider quantified Boolean formulas in prenex conjunctive normal form (PCNF), that is, formulas consisting of a linear and consecutive quantifier prefix followed by a propositional matrix. A matrix is a set of clauses, and a clause is a disjunction of literals l, where a literal is either a variable or its negation.

Given a clause \(C = (l_1 \vee l_2 \vee \ldots \vee l_n)\), we use set notation interchangeably, that is C is also represented by the set \({\{l_1,l_2,\dots ,l_n\}}\). Furthermore, we use standard set operations, such as union and intersection, to work with clauses.

For readability, we lift the quantification over variables to the quantification over sets of variables and denote a maximal consecutive block of quantifiers of the same type \(\forall x_1\mathpunct {.}\forall x_2\mathpunct {.}\cdots \forall x_n\mathpunct {.}\varphi \) by \(\forall X\mathpunct {.}\varphi \) and \(\exists x_1\mathpunct {.}\exists x_2\mathpunct {.}\cdots \exists x_n\mathpunct {.}\varphi \) by \(\exists X\mathpunct {.}\varphi \), accordingly, where \(X={\{x_1,\dots ,x_n\}}\).

Given a set of variables X, an assignment of X is a function \(\alpha : X \rightarrow \mathbb {B}\) that maps each variable \(x \in X\) to either true (\(\top \)) or false (\(\bot \)). When the domain of \(\alpha \) is not clear from context, we write \(\alpha _X\). We use the instantiation of a QBF \(\varPhi \) by assignment \(\alpha \), written \(\varPhi [\alpha ]\), which removes quantification over variables in \(\mathrm {dom}(\alpha )\) and replaces occurrences of \(x \in \mathrm {dom}(\alpha )\) by \(\alpha (x)\). We write \(\alpha \vDash \varphi \) if the assignment \(\alpha \) satisfies a propositional formula \(\varphi \), i.e., \(\varphi [\alpha ] \equiv \top \).
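The notions of assignment, instantiation, and satisfaction can be made concrete with a small sketch. The encoding is an assumption made only for illustration: variables are positive integers, a negated variable is the corresponding negative integer (DIMACS style), and the helper names `satisfies` and `instantiate` are hypothetical.

```python
from typing import Dict, FrozenSet, Iterable, Set

Clause = FrozenSet[int]  # assumed encoding: variable v is v, its negation is -v


def satisfies(alpha: Dict[int, bool], matrix: Iterable[Clause]) -> bool:
    """Return True if the total assignment alpha satisfies every clause."""
    return all(any(alpha[abs(l)] == (l > 0) for l in clause) for clause in matrix)


def instantiate(matrix: Iterable[Clause], alpha: Dict[int, bool]) -> Set[Clause]:
    """Apply a partial assignment to a matrix.

    Clauses containing a literal made true by alpha are dropped;
    literals made false by alpha are removed from the remaining clauses.
    """
    result = set()
    for clause in matrix:
        if any(abs(l) in alpha and alpha[abs(l)] == (l > 0) for l in clause):
            continue  # clause is already satisfied
        result.add(frozenset(l for l in clause if abs(l) not in alpha))
    return result


matrix = {frozenset({1, -2}), frozenset({2, 3})}
reduced = instantiate(matrix, {2: True})  # leaves only the clause (x1)
```

Under this encoding, an empty clause in the result of `instantiate` signals that the partial assignment already falsifies the matrix.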

2.2 Resolution

Propositional resolution is a well-known method for refuting propositional formulas in conjunctive normal form (CNF). The resolution rule allows merging two clauses that contain the same variable with opposite signs.
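A single resolution step can be sketched as follows, again assuming the integer encoding of literals (negative integers for negated variables). The function name `resolve` is chosen here for illustration only.

```python
from typing import FrozenSet

Clause = FrozenSet[int]  # assumed encoding: variable v is v, its negation is -v


def resolve(c1: Clause, c2: Clause, pivot: int) -> Clause:
    """Resolve c1 (containing pivot) with c2 (containing -pivot).

    The resolvent contains all literals of both clauses except the pivot pair.
    """
    if pivot not in c1 or -pivot not in c2:
        raise ValueError("pivot must occur positively in c1 and negatively in c2")
    return (c1 - {pivot}) | (c2 - {-pivot})


resolvent = resolve(frozenset({1, 2}), frozenset({-1, 3}), 1)  # (x2 or x3)
empty = resolve(frozenset({1}), frozenset({-1}), 1)            # empty clause
```

Deriving the empty clause, as in the second call, is what a resolution refutation amounts to.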

A resolution proof \(\pi \) is a series of applications of the resolution rule. A propositional formula is unsatisfiable if there is a resolution proof that derives the empty clause. We visualize resolution proofs by a graph where the nodes with indegree 0 are called the leaves and the unique node with outdegree 0 is called the root. We depict the graph representation of a resolution proof in Fig. 1(b). The size of a resolution proof is the number of nodes in the graph.

Fig. 1. Visualization of the resolution rule as a graph.

2.3 Proof Systems

We consider proof systems that are able to refute quantified Boolean formulas. To enable comparison between proof systems, one uses the concept of polynomial simulation. A proof system P polynomially simulates (p-simulates) \(P'\) if there is a polynomial p such that for every number n and every formula \(\varPhi \) it holds that if there is a proof of \(\varPhi \) in \(P'\) of size n, then there is a proof of \(\varPhi \) in P whose size is less than p(n). We call P and \(P'\) polynomially equivalent if \(P'\) additionally p-simulates P.

A refutation based calculus (such as resolution) is regarded as a proof system because it can refute the negation of a formula.

Figure 2 gives an overview of the proof systems introduced in this paper and their relations. An edge \(P \rightarrow P'\) means that P p-simulates \(P'\) (transitive edges are omitted). A dashed line indicates incomparability results.

Fig. 2. Overview of the proof systems and their relations. Solid arrows indicate p-simulation relation. Dashed lines indicate incomparability results. The gray boxes are the ones introduced in this paper.

3 Proof Calculi

Given a PCNF formula \(Q\,X_1 \dots Q\,X_n \mathpunct {.}\bigwedge _{1 \le i \le m} C_i\), we define a function \( lit (i, k)\) that returns the literals of clause \(C_i\) that are bound at quantifier level k (\(1 \le k \le n\)). Further, we generalize this definition to \( lit (i, > k)\) and \( lit (i, < k)\) that return the literals bound after (before) level k. We define \( lit (i, 0) = lit (i, n+1) = \emptyset \) for every \(1 \le i \le m\). We use \(\mathcal {C}\) to denote a set of clauses and \(Q_k \in {\{\exists ,\forall \}}\) to denote the quantification type of level k.
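The level-wise projection of a clause can be sketched in a few lines. The sketch assumes that literals are nonzero integers and that a hypothetical dictionary `level_of` maps every variable to the quantifier level at which it is bound; neither is notation from the calculus itself.

```python
from typing import Dict, FrozenSet

Clause = FrozenSet[int]  # assumed encoding: variable v is v, its negation is -v


def lit(clause: Clause, level_of: Dict[int, int], k: int) -> Clause:
    """Literals of the clause bound exactly at level k, i.e. lit(i, k)."""
    return frozenset(l for l in clause if level_of[abs(l)] == k)


def lit_after(clause: Clause, level_of: Dict[int, int], k: int) -> Clause:
    """Literals bound at levels greater than k, i.e. lit(i, > k)."""
    return frozenset(l for l in clause if level_of[abs(l)] > k)


def lit_before(clause: Clause, level_of: Dict[int, int], k: int) -> Clause:
    """Literals bound at levels smaller than k, i.e. lit(i, < k)."""
    return frozenset(l for l in clause if level_of[abs(l)] < k)


# Hypothetical prefix: x1 at level 1, x2 at level 2, x3 at level 3.
level_of = {1: 1, 2: 2, 3: 3}
clause = frozenset({1, -2, 3})
```

Because no variable is bound at level 0 or level n+1, the convention \( lit (i, 0) = lit (i, n+1) = \emptyset \) holds automatically in this representation.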

3.1 A Proof System for Clausal Abstractions

We start by defining the objects on which our proof system \(\forall \text {Red+Res}\) is based. A proof object \(\mathcal {P}^k\) consists of a set of indices \(\mathcal {P}\) where an index \(i \in \mathcal {P}\) represents the i-th clause in the original matrix and k denotes the k-th level of the quantifier hierarchy. We define an operation \( lit (\mathcal {P}^k) = \bigcup _{i \in \mathcal {P}} lit (i, k)\), that gives access to the literals of clauses contained in \(\mathcal {P}^k\). The leaves in our proof system are singleton sets \({\{i\}}^z\) where z is the maximum quantification level of all literals in clause \(C_i\). The root of a refutation proof is the proof object \(\mathcal {P}^0\) that represents the empty set, i.e., \( lit (\mathcal {P}^0) = \emptyset \).

The rules of the proof system are given in Fig. 3. It consists of three rules: an axiom rule (\(\mathrm {init}\)) that generates leaves, a resolution rule (\(\mathrm {res}\)), and a universal reduction rule (\(\forall \mathrm {red}\)). The latter two rules transform a premise that is related to quantifier level k into a conclusion that is related to quantifier level \(k-1\). The universal reduction rule and the resolution rule are used for universal and existential quantifier blocks, respectively.

Fig. 3. The rules of the \(\forall \text {Red+Res}\) calculus.

Resolution Rule. There is a close connection between (\(\mathrm {res}\)) and the propositional resolution as (\(\mathrm {res}\)) merges a number of proof objects \(\mathcal {P}_i^k\) of level k into a single proof object of level \(k-1\). It does so by using a resolution proof for a propositional formula that is constructed from the premises \(\mathcal {P}_i^k\). This propositional formula \(\bigwedge _{1 \le i \le j} lit (\mathcal {P}_i^k)\) contains only literals of level k. Intuitively, this rule can be interpreted as follows: a resolution proof over those clauses rules out any possible existential assignment at quantifier level k, thus, one of those clauses has to be satisfied at an earlier level.
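The side condition of (\(\mathrm {res}\)) requires a resolution refutation of the level-k literals of the premises. Since propositional resolution is complete, such a refutation exists exactly when these clauses are jointly unsatisfiable. The sketch below checks this by brute force instead of constructing \(\pi \); it is a stand-in for illustration, not how a solver would proceed, and the name `level_clauses_unsat` is hypothetical.

```python
from itertools import product
from typing import FrozenSet, Iterable

Clause = FrozenSet[int]  # assumed encoding: variable v is v, its negation is -v


def level_clauses_unsat(clauses: Iterable[Clause]) -> bool:
    """Return True if no assignment satisfies all given level-k clauses.

    Each clause stands for lit(P_i^k) of one premise. Exponential in the
    number of variables, so only suitable for tiny examples.
    """
    clauses = list(clauses)
    variables = sorted({abs(l) for c in clauses for l in c})
    for values in product([False, True], repeat=len(variables)):
        alpha = dict(zip(variables, values))
        if all(any(alpha[abs(l)] == (l > 0) for l in c) for c in clauses):
            return False  # found a satisfying existential assignment
    return True


premises = [frozenset({1, 2}), frozenset({-1, 2}), frozenset({-2})]
applicable = level_clauses_unsat(premises)
```

When the check succeeds, every existential assignment at level k falsifies the level-k part of at least one premise, which matches the intuition given above.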

Universal Reduction Rule. In contrast to (\(\mathrm {res}\)), (\(\forall \mathrm {red}\)) works on single proof objects. It can be applied if level k is universal and the premise does not encode a universal tautology, i.e., for every literal \(l \in lit (\mathcal {P}^k)\), the negated literal \(\overline{l}\) is not contained in \( lit (\mathcal {P}^k)\).
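The applicability condition of (\(\forall \mathrm {red}\)) is a simple syntactic test. A minimal sketch, assuming integer-encoded literals and a hypothetical function name:

```python
from typing import FrozenSet


def can_universally_reduce(level_literals: FrozenSet[int]) -> bool:
    """Return True if lit(P^k) contains no complementary pair l, -l.

    level_literals is assumed to hold the universal literals of the
    premise at level k, with -v denoting the negation of variable v.
    """
    return all(-l not in level_literals for l in level_literals)


ok = can_universally_reduce(frozenset({1, -2}))       # no tautology
blocked = can_universally_reduce(frozenset({1, -1}))  # universal tautology
```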

Graph Representation. A proof in the \(\forall \text {Red+Res}\) calculus can be represented as a directed acyclic graph (DAG). The nodes in the DAG are proof objects \(\mathcal {P}^k\) and the edges represent applications of (\(\mathrm {res}\)) and (\(\forall \mathrm {red}\)). The rule (\(\mathrm {res}\)) is represented by a hyper-edge that is labeled with the propositional resolution proof \(\pi \). Edges representing the universal reduction can thus remain unlabeled without introducing ambiguity. The size of a \(\forall \text {Red+Res}\) proof is the number of nodes in the graph together with the number of inner (non-leaf, non-root) nodes of the contained propositional resolution proofs.

A refutation in the \(\forall \text {Red+Res}\) calculus is a proof that derives a proof object \(\mathcal {P}^0\) at level 0. A proof for some \(\mathcal {P}^k\) is a \(\forall \text {Red+Res}\) proof with root \(\mathcal {P}^k\). Thus, a proof for \(\mathcal {P}^k\) can be also viewed as a refutation for the formula \(Q\,X_{k+1} \dots Q\,X_n \mathpunct {.}\bigwedge _{i \in \mathcal {P}} lit (i,>k)\) starting with quantifier level \(k+1\) and containing clauses represented by \(\mathcal {P}\).

Example 1

Consider the following QBF

(1)

The refutation in the \(\forall \text {Red+Res}\) calculus is given in Fig. 4. In the nodes, we represent the proof objects \(\mathcal {P}^k\) in the first component and the represented clause in the second component. The proof follows the structure of the quantifier prefix, i.e., it needs four levels to derive a refutation. The resolution proof \(\pi _1\) for propositional formula

is depicted in Fig. 1(b).

Fig. 4. A \(\forall \text {Red+Res}\) refutation for formula (1).

In the following, we give a formal correctness argument and compare our calculus to established proof systems. A QBF proof system is sound if deriving a proof implies that the QBF is false and it is refutationally complete if every false QBF has a proof.

Theorem 1

\(\forall \text {Red+Res}\) is sound and refutationally complete for QBF.

Proof

The completeness proof is carried out by induction over the quantifier prefix.

Induction Base. Let \(\exists X \mathpunct {.}\varphi \) be a false QBF and \(\varphi \) be propositional. Then \((\mathrm {res})\) derives some \(\mathcal {P}^0\) because resolution is complete for propositional formulas. Let \(\forall X \mathpunct {.}\varphi \) be a false QBF and \(\varphi \) be propositional. Picking an arbitrary (non-tautological) clause \(C_i\) and applying \((\forall \mathrm {red})\) leads to \({\{i\}}^0\).

Induction Step. Let \(\exists X \mathpunct {.}\varPhi \) be a false QBF, i.e., for all assignments \(\alpha _X\) the QBF \(\varPhi [\alpha _X]\) is false. Hence, by induction hypothesis, there exists a \(\forall \text {Red+Res}\) proof for every \(\varPhi [\alpha _X]\). We transform those proofs in a way that they can be used to build a proof for \(\varPhi \). Let P be a proof of \(\varPhi [\alpha _X]\). P has a distinct root node (representing the empty set), that was derived using \((\forall \mathrm {red})\) as \(\varPhi [\alpha _X]\) starts with a universal quantifier. To embed P in \(\varPhi \), we increment every level in P by one, as \(\varPhi \) has one additional (existential) quantifier level. Then, instead of deriving the empty set, the former root node derives a proof object of the form \(\mathcal {P}^1\). Let N be the set of those former root nodes. By construction, there exists a resolution proof \(\pi \) such that the empty set can be derived by \((\mathrm {res})\) using N (or a subset thereof). Assuming otherwise leads to the contradiction that some \(\varPhi [\alpha _X]\) is true.

Let \(\forall X \mathpunct {.}\varPhi \) be a false QBF, i.e., there is an assignment \(\alpha _X\) such that the QBF \(\varPhi [\alpha _X]\) is false. Hence, by induction hypothesis, there exists a \(\forall \text {Red+Res}\) proof for \(\varPhi [\alpha _X]\). Applying \((\forall \mathrm {red})\) using \(\alpha _X\) is a \(\forall \text {Red+Res}\) proof for \(\varPhi \).

For soundness it is enough to show that one cannot derive a clause using this calculus that changes the satisfiability. Let \(\varPhi = Q\,X_1 \dots Q\,X_n \mathpunct {.}\bigwedge _{1 \le i \le m} C_i\) be an arbitrary QBF. For every level k and every \(\mathcal {P}^k\) generated by the application of the \(\forall \text {Red+Res}\) calculus, it holds that \(\varPhi \) and \(Q\,X_1 \dots Q\,X_n \mathpunct {.}\bigwedge _{1 \le i \le m} C_i \wedge (\bigvee _{i \in \mathcal {P}} \bigvee _{l \in lit (i, \le k)} l)\) are equisatisfiable. Assume otherwise, then either \((\forall \mathrm {red})\) or \((\mathrm {res})\) has derived a \(\mathcal {P}^k\) that would make \(\varPhi \) false. Again, by induction, one can show that if \((\forall \mathrm {red})\) derived a \(\mathcal {P}^k\) that makes \(\varPhi \) false, the original premise \(\mathcal {P}^{k+1}\) would have made \(\varPhi \) false; likewise, if \((\mathrm {res})\) derived a \(\mathcal {P}^k\) that makes \(\varPhi \) false, the conjunction of the premises would have made \(\varPhi \) false.    \(\square \)

Comparison to \(Q\text {-resolution}\) Calculus. \(Q\text {-resolution}\) [19] is an extension of the (propositional) resolution rule to handle universal quantification. The universal reduction rule allows the removal of a universal literal u from a clause C if no existential literal \(l \in C\) depends on u. There are also additional restrictions on when the resolution rule can be applied, i.e., it is not allowed to produce tautological clauses using the resolution rule. The definitions of Q-resolution proof and refutation are analogous to the propositional case.

There are two restricted classes of \(Q\text {-resolution}\) that are commonly considered, namely level-ordered and tree-like \(Q\text {-resolution}\). A \(Q\text {-resolution}\) proof is level-ordered if resolution on an existential literal l at level k happens before resolution on any existential literal with level \({<}k\). A \(Q\text {-resolution}\) proof is tree-like if the graph representing the proof has a tree shape.

As a first result, we show that \(\forall \text {Red+Res}\) is polynomially equivalent to level-ordered \(Q\text {-resolution}\), i.e., a proof in our calculus can be polynomially simulated in level-ordered \(Q\text {-resolution}\) and vice versa. While this is straightforward from the definitions of both calculi, this is much less obvious if one looks at the underlying algorithms of the CEGAR approaches [18, 25] and QCDCL [27].

Theorem 2

\(\forall \text {Red+Res}\) and level-ordered \(Q\text {-resolution}\) are polynomially equivalent.

Proof

A \(\forall \text {Red+Res}\) proof can be transformed into a Q-resolution proof by replacing every node \(\mathcal {P}^k\) by the clause \((\bigvee _{i \in \mathcal {P}} \bigvee _{l \in lit (i, \le k)} l)\) and by replacing the hyper-edge labeled with \(\pi \) by a graph representing the applications of the resolution rule. Similarly, a level-ordered Q-resolution proof can be transformed into a \(\forall \text {Red+Res}\) proof by a step-wise transformation from leaves to the root. This way, one can track the clauses needed for constructing the proof objects \(\mathcal {P}^k\) at every level k.    \(\square \)
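The translation of a node \(\mathcal {P}^k\) into the clause \((\bigvee _{i \in \mathcal {P}} \bigvee _{l \in lit (i, \le k)} l)\) used in this proof can be sketched directly. The representation is an assumption for illustration: the matrix is a dictionary from clause indices to sets of integer literals, and a hypothetical `level_of` dictionary gives the quantifier level of each variable.

```python
from typing import Dict, FrozenSet, Iterable

Clause = FrozenSet[int]  # assumed encoding: variable v is v, its negation is -v


def clause_of(proof_object: Iterable[int], k: int,
              matrix: Dict[int, Clause], level_of: Dict[int, int]) -> Clause:
    """Clause represented by the proof object P^k.

    Collects, for every clause index in P, the literals bound at level <= k.
    """
    return frozenset(
        l for i in proof_object for l in matrix[i] if level_of[abs(l)] <= k
    )


# Hypothetical matrix: x1 and x2 at level 1, x3 at level 2.
matrix = {1: frozenset({1, -3}), 2: frozenset({2, 3})}
level_of = {1: 1, 2: 1, 3: 2}
node_clause = clause_of({1, 2}, 1, matrix, level_of)  # (x1 or x2)
```

For k = 0 the function returns the empty clause, which corresponds to the root \(\mathcal {P}^0\) of a refutation.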

Despite being equally powerful, the differences are important and enable the expansion based extension that we will introduce in the next section. One difference is that our calculus only reasons about literals of one quantifier level, which allows us to use plain resolution without any changes (as are needed in \(Q\text {-resolution}\)). Further, the proof rules capture the fact that only proof obligations are communicated between the quantifier levels of the QBF. An immediate consequence is that every refutation in the proof system is DAG-like and has exactly depth \(k+1\).

Since the level-ordering constraint imposes an order on the resolution, the size of the refutation proof may be exponentially larger for some formulas [14]. Hence, \(\forall \text {Red+Res}\) is also in general exponentially weaker than unrestricted \(Q\text {-resolution}\). In practice, as already noted by Janota and Marques-Silva [17], solvers that are based on \(Q\text {-resolution}\) produce level-ordered \(Q\text {-resolution}\) proofs.

In the initial version of \(\text {CAQE}\) [25] an optimization that can generate new resolvents at level k without recursion into deeper levels was described. We model this optimization as a new rule extending the \(\forall \text {Red+Res}\) calculus and show that this rule leads to an exponential separation.

Strong UNSAT Rule. In the implementation of \(\text {CAQE}\), we used an optimization which we called strong UNSAT refinement [25], that allowed the solver to strengthen a certain type of refinements. The basic idea behind this optimization is that if the solver determines that, at an existential level k, a certain set of clauses \(\mathcal {C}\) cannot be satisfied at the same time, then every alternative set of clauses \(\mathcal {C}'\), that is equivalent with respect to the literals in levels \({>}k\), cannot be satisfied either. We introduce the following proof rule that formalizes this intuition. We extend proof objects \(\mathcal {P}^k\) such that they can additionally contain fresh literals, i.e., literals that were not part of the original QBF. Those literals are treated as if they were bound at level k, i.e., they are contained in \( lit (\mathcal {P}^k)\) and can thus be used in the premise of the rule \((\mathrm {res})\), but are not contained in the conclusion \(\mathcal {P}^{k-1}\).

figure b

Theorem 3

The strengthening rule is sound.

Proof

In a resolution proof at level k, one can derive the proof objects \((\mathcal {P}\cup {\{j\}})^k\) for \(j \in {\{j_1,\dots ,j_n\}}\) using the conclusion of the strengthening rule. Assume we have a proof for \((\mathcal {P}\cup {\{i\}})^k\) (premise), then the quantified formula \(\forall X_{k+1} \dots Q\,X_n \mathpunct {.}\bigwedge _{i^* \in \mathcal {P}} lit (i^*,>k) \wedge lit (i,>k)\) is false. Thus, the QBF with the same quantifier prefix and matrix, extended by some clause \( lit (j,>k)\) for \(j \in {\{j_1,\dots ,j_n\}}\), is still false. Since every \(C_j\) subsumes \(C_i\) with respect to quantifier level greater than k (\( lit (j,>k) \subseteq lit (i,>k)\)), the clause \( lit (i,>k)\) can be eliminated without changing satisfiability. Thus, the resulting quantified formula \(\forall X_{k+1} \dots Q\,X_n \mathpunct {.}\bigwedge _{i^* \in \mathcal {P}} lit (i^*,>k) \wedge lit (j,>k)\) is false and there exists a \(\forall \text {Red+Res}\) proof for \((\mathcal {P}\cup {\{j\}})^k\).    \(\square \)
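The subsumption condition \( lit (j,>k) \subseteq lit (i,>k)\) used in this argument is easy to test mechanically. The following sketch only enumerates the clause indices j that pass this subset test for a given clause i; it does not model the introduction of fresh literals or any other part of the strengthening rule. The matrix encoding, the `level_of` dictionary, and the function names are assumptions made for illustration.

```python
from typing import Dict, FrozenSet, List

Clause = FrozenSet[int]  # assumed encoding: variable v is v, its negation is -v


def lit_after(clause: Clause, level_of: Dict[int, int], k: int) -> Clause:
    """Literals of the clause bound at levels greater than k."""
    return frozenset(l for l in clause if level_of[abs(l)] > k)


def strengthening_candidates(i: int, k: int, matrix: Dict[int, Clause],
                             level_of: Dict[int, int]) -> List[int]:
    """Indices j != i with lit(j, > k) being a subset of lit(i, > k)."""
    target = lit_after(matrix[i], level_of, k)
    return sorted(
        j for j, c in matrix.items()
        if j != i and lit_after(c, level_of, k) <= target
    )


# Hypothetical clauses: variables 1, 2 at level 1, variable 3 (universal)
# at level 2, variables 4, 5 at level 3.
level_of = {1: 1, 2: 1, 3: 2, 4: 3, 5: 3}
matrix = {
    1: frozenset({-1, -3, 4}),
    2: frozenset({-2, -3, 4}),
    3: frozenset({-1, -3, 5}),
}
candidates = strengthening_candidates(1, 1, matrix, level_of)
```

In this toy matrix, clauses 1 and 2 agree on all literals bound after level 1, so clause 2 is a candidate for clause 1, while clause 3 is not.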

Theorem 4

The proof system without strengthening rule does not p-simulate the proof system with strengthening rule.

Proof

We use the family of formulas \(\mathrm {CR}_n\) that was used to show that level-ordered \(Q\text {-resolution}\) cannot p-simulate \(\forall \text {Exp+Res}\) [17]. We show that \(\mathrm {CR}_n\) has a polynomial refutation in the \(\forall \text {Red+Res}\) calculus with strengthening rule, but has only exponential refutations without it. The latter follows from Theorem 2 and the results by Janota and Marques-Silva [17].

The formula \(\mathrm {CR}_n\) has the quantifier prefix \(\exists x_{11} \dots x_{nn} \forall z \exists a_1 \dots a_n b_1 \dots b_n\) and the matrix is given by

(2)

One can interpret the constraints as selecting rows and columns in a matrix where i selects the row and j selects the column, e.g., for \(n=3\) it can be visualized as follows:

figure c

Assume \(z \rightarrow 0\), then we derive the proof object \(\mathcal {P}^1 = {\{i1 \mid i \in 1..n\}}^1\) (\( lit (\mathcal {P}^1) = \bigvee _{i \in 1..n} x_{i1}\)) by applying the resolution and reduction rule. Likewise, for \(z \rightarrow 1\), we derive the proof object \(\mathcal {P}_0^1 = {\{ \overline{1j} \mid j \in 1..n \}}^1\) (\( lit (\mathcal {P}_0^1) = \bigvee _{j \in 1..n} \overline{x}_{1j}\)). Applying the strengthening rule on \(\mathcal {P}_0^1\) results in \(\mathcal {P}_1^1 = ({\{c_1\}} \cup {\{\overline{1j} \mid j \in 2..n\}})^1\) and \({\{\overline{c}_1, \overline{11}\}}^1, {\{\overline{c}_1, \overline{21}\}}^1, \dots , {\{\overline{c}_1, \overline{n1}\}}^1\) where \(c_1\) is a fresh variable. Further \(n-1\) applications of the strengthening rule starting on \(\mathcal {P}_1^1\) lead to \(\mathcal {P}_n^1 = {\{ c_j \mid j \in 1..n \}}^1\) and the proof objects \({\{\overline{c}_j, \overline{ij} \mid i,j \in 1..n\}}^1\), where \(c_j\) are fresh variables, as all clauses in a column are equivalent with respect to the inner quantifiers (contain \(\overline{z} \vee b_j\)).

Using \(\mathcal {P}^1\) and \({\{\overline{c}_1, \overline{11}\}}^1, {\{\overline{c}_1, \overline{21}\}}^1, \dots , {\{\overline{c}_1, \overline{n1}\}}^1\) from the first strengthening application, we derive the singleton set \({\{ \overline{c}_1 \}}\) using n resolution steps (\( lit (\mathcal {P}^1) = \bigvee _{i \in 1..n} x_{i1}\) and \( lit ({\{\overline{c}_1, \overline{i1}\}}^1) = {\{\overline{c}_1, \overline{x}_{i1}\}}\)). Analogously, one derives the singletons \({\{\overline{c}_2\}} \dots {\{\overline{c}_n\}}\) and together with \(\mathcal {P}_n^1 = {\{ c_j \mid j \in 1..n \}}\) the empty set is derived. Thus, there exists a polynomial resolution proof leading to a proof object \(\mathcal {P}^0\) and the size of the overall proof is polynomial, too.    \(\square \)

We note that despite being stronger than plain \(\forall \text {Red+Res}\), the extended calculus is still incomparable to \(\forall \text {Exp+Res}\).

Corollary 1

\(\forall \text {Red+Res}\) with strengthening rule does not p-simulate \(\forall \text {Exp+Res}\).

Proof

We use a modification of formula \(\mathrm {CR}_n\) (2), which we call \(\mathrm {CR}_n'\) in the following. The single universal variable z is replaced by variables \(z_{ij}\), one for every pair \(i,j \in 1..n\). It follows that the strengthening rule is never applicable and hence the proof system is only as strong as level-ordered \(Q\text {-resolution}\), which has only exponential refutations of \(\mathrm {CR}_n'\), while \(\forall \text {Exp+Res}\) has a polynomial refutation since the expansion tree still has only two branches [17].    \(\square \)

When compared to \(Q\text {-resolution}\), the strengthening rule can be interpreted as a step towards breaking the level-ordered constraint inherent to \(\forall \text {Red+Res}\). The calculus, however, is not as strong as \(Q\text {-resolution}\).

Corollary 2

\(\forall \text {Red+Res}\) with strengthening rule does not p-simulate \(Q\text {-resolution}\).

Proof

The formula \(\mathrm {CR}_n'\) from the previous proof has a polynomial (tree-like) \(Q\text {-resolution}\) proof. The proof for \(\mathrm {CR}_n\) given by Mahajan and Shukla [23] can be modified for \(\mathrm {CR}_n'\).    \(\square \)

Both results follow from the fact that the strengthening rule as presented is not applicable to the formula \(\mathrm {CR}_n'\). Where in \(\mathrm {CR}_n\), the clauses \(C_{\overline{ij}}\) are equal with respect to the inner quantifier when j is fixed (\(\overline{z} \vee b_j\)), in \(\mathrm {CR}_n'\) they are all different (\(\overline{z}_{ij} \vee b_j\)). This difference is only due to the universal variables \(z_{ij}\). Thus, we propose a stronger version of the strengthening rule that does the subset check only on the existential variables. For the universal literals, one additionally has to make sure that no resolvent produces a tautology (as it is the case in \(\mathrm {CR}_n'\)). We leave the formalization to future work.

3.2 Expansion

The levelized nature of the proof system allows us to introduce additional rules that can reason about quantified subformulas. In the following, we introduce such a rule that allows us to use the \(\forall \text {Exp+Res}\) calculus [17] within a \(\forall \text {Red+Res}\) proof.

We start by giving necessary notations used to define \(\forall \text {Exp+Res}\). We refer the reader to [17] for further information.

Definition 1

(adapted from [17])

  • A \(\forall \)-expansion tree for QBF \(\varPhi \) with u universal quantifier blocks is a rooted tree \(\mathcal {T}\) such that every path \(p_0 \xrightarrow {\alpha _1} p_1 \cdots \xrightarrow {\alpha _u} p_u\) in \(\mathcal {T}\) from the root \(p_0\) to some leaf \(p_u\) has exactly u edges and each edge \(p_{i-1} \xrightarrow {\alpha _i} p_i\) is labeled with a total assignment \(\alpha _i\) to the universal variables at universal level i. Each path in \(\mathcal {T}\) is uniquely defined by its labeling.

  • Let \(\mathcal {T}\) be a \(\forall \)-expansion tree and \(P = p_0 \xrightarrow {\alpha _1} p_1 \cdots \xrightarrow {\alpha _u} p_u\) be a path from the root \(p_0\) to some leaf \(p_u\).

    1. For an existential variable x we define \( expand\text {-}var (P,x) = x^\alpha \) where \(x^\alpha \) is a fresh variable and \(\alpha \) is the universal assignment of the dependencies of x.

    2. For a propositional formula \(\varphi \) define \( expand (P,\varphi )\) as instantiating \(\varphi \) with \(\alpha _1,\dots ,\alpha _u\) and replacing every existential variable x by \( expand\text {-}var (P,x)\).

    3. Define \( expand (\mathcal {T},\varPhi )\) as the conjunction of all \( expand (P,\varphi )\) for each root-to-leaf path P in \(\mathcal {T}\).

In contrast to previous work, we allow the expansion rule to be applied to quantified subformulas of \(\varPhi \) in addition to \(\varPhi \) itself. By \(\mathcal {C}^{\ge k}\) we denote a set of clauses that only contain literals bound at level \(\ge k\).

figure d

The rule states that if there is a universal expansion of the quantified Boolean formula \(\exists X_k \mathpunct {.}\forall X_{k+1} \dots \exists X_m \mathpunct {.}\mathcal {C}^{\ge k}\) and a resolution refutation \(\pi \) for this expansion, then there is no existential assignment that satisfies clauses \(\mathcal {C}\) from level k. The size of the expansion rule is the sum of the size of the expansion tree and resolution proof [17].

Example 2

We demonstrate the interplay between \((\forall \mathrm {exp\text {-}res})\) and the \(\forall \text {Red+Res}\) calculus on the following formula

To apply \((\forall \mathrm {exp\text {-}res})\), we use the clauses 5–9 from quantifier level 5, i.e., \(\mathcal {C}^{\ge 5} = {\{(\overline{b}), (z \vee t \vee b), (\overline{z} \vee \overline{t}), (x \vee \overline{t}), (\overline{x} \vee t)\}}\). The corresponding quantifier prefix is \(\exists b \exists x \forall z \exists t\). Using the complete expansion of z (\({\{z \rightarrow 0, z \rightarrow 1\}}\)) as the expansion tree \(\mathcal {T}\), we get the following expansion formula

which has a simple resolution proof \(\pi \). The conclusion of \((\forall \mathrm {exp\text {-}res})\) leads to the proof object \({\{5,6,7,8,9 \}}^4\), but only clause 5 contains literals bound before quantification level 5. After a universal reduction, the proof continues as described in Example 1.
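The expansion step of this example can be recomputed with a short sketch. It takes the five clauses of \(\mathcal {C}^{\ge 5}\) and the prefix \(\exists b \exists x \forall z \exists t\) as stated above, expands along given paths, and confirms by brute force that the result is unsatisfiable, so a resolution refutation \(\pi \) exists. The string-based encoding, the naming scheme for fresh variables (for instance `t[z=0]`), and the function names are assumptions made here and need not match the notation of the expansion formula in the paper.

```python
from itertools import product


def expand(prefix, clauses, paths):
    """Expand clauses along the given paths of a universal expansion tree.

    prefix:  list of ('e' | 'a', [variable names]) in prefix order
    clauses: list of sets of (variable, polarity) pairs
    paths:   list of dicts assigning a truth value to every universal variable
    """
    universal = {v for q, vs in prefix if q == 'a' for v in vs}
    deps, seen = {}, []
    for q, vs in prefix:
        if q == 'a':
            seen.extend(vs)
        else:
            for v in vs:
                deps[v] = tuple(seen)  # universals preceding v in the prefix
    expanded = set()
    for alpha in paths:
        for clause in clauses:
            if any(v in universal and alpha[v] == pol for v, pol in clause):
                continue  # clause satisfied by the universal assignment
            expanded.add(frozenset(
                (v + ''.join(f"[{u}={int(alpha[u])}]" for u in deps[v]), pol)
                for v, pol in clause if v not in universal
            ))
    return expanded


def unsatisfiable(cnf):
    """Brute-force unsatisfiability check (tiny inputs only)."""
    names = sorted({v for c in cnf for v, _ in c})
    for values in product([False, True], repeat=len(names)):
        alpha = dict(zip(names, values))
        if all(any(alpha[v] == pol for v, pol in c) for c in cnf):
            return False
    return True


prefix = [('e', ['b']), ('e', ['x']), ('a', ['z']), ('e', ['t'])]
clauses = [
    {('b', False)},
    {('z', True), ('t', True), ('b', True)},
    {('z', False), ('t', False)},
    {('x', True), ('t', False)},
    {('x', False), ('t', True)},
]
expansion = expand(prefix, clauses, [{'z': False}, {'z': True}])
```

In this sketch, expanding only the branch \(z \rightarrow 0\) yields a satisfiable formula, so both branches of the expansion tree are needed for the refutation.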

Theorem 5

The \(\forall \)exp-res rule is sound.

Proof

Assume otherwise, then one would be able to derive a proof object \(\mathcal {P}^{k-1}\) that is part of a \(\forall \text {Red+Res}\) refutation proof for a true QBF \(\varPhi \). Thus, the clause corresponding to \(\mathcal {P}^{k-1}\) (cf. proof of Theorem 1) \((\bigvee _{i \in \mathcal {P}} \bigvee _{l \in lit (i, < k)} l)\) made \(\varPhi \) false. However, the same clause can be derived directly by applying the expansion \(\mathcal {T}\) to the original QBF, i.e., expanding universal variables beginning with quantification level \(k+1\), and propositional resolution on the resulting expansion formula. Thus, this clause can be conjunctively added to the matrix without changing satisfiability, leading to a contradiction.    \(\square \)

The resulting proof system can be viewed as a unification of the currently known CEGAR approaches for solving quantified Boolean formulas [16, 18, 25].

Theorem 6

\(\forall \text {Exp+Res}\) does not p-simulate \(\forall \text {Red+}\forall \text {Exp+Res}\).

Proof

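\(\forall \text {Red+}\forall \text {Exp+Res}\) contains all rules of \(\forall \text {Red+Res}\) and hence, by Theorem 2, p-simulates level-ordered Q-resolution. However, \(\forall \text {Exp+Res}\) does not p-simulate level-ordered Q-resolution [23].    \(\square \)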

The combination of both rules makes the proof system stronger than merely choosing between expansion and resolution proof upfront.

Theorem 7

There is a QBF that has a polynomial refutation in \(\forall \text {Red+}\forall \text {Exp+Res}\), but has only exponential refutations in \(\forall \text {Red+Res}\) and \(\forall \text {Exp+Res}\).

Proof

For this proof, we take two formulas that are hard for \(Q\text {-resolution}\) and \(\forall \text {Exp+Res}\), respectively. We build a new family of formulas that has a polynomial refutation in \(\forall \text {Red+}\forall \text {Exp+Res}\), but only exponential refutations in \(\forall \text {Red+Res}\) and \(\forall \text {Exp+Res}\).

The first formula we consider is formula (2) from [17], which we call \(\mathrm {DAG}_n\) in the following:

$$\begin{aligned}&\exists e_1 \forall u_1 \exists c_1 c_2 \dots \exists e_n \forall u_n \exists c_{2n-1} c_{2n} \mathpunct {.}\\&(\bigvee _{i \in 1 \dots 2n} \overline{c}_i) \wedge \bigwedge _{i \in 1 \dots n} (\overline{e}_i \vee c_{2i-1}) \wedge (\overline{u}_i \vee c_{2i-1}) \wedge (e_i \vee c_{2i}) \wedge (u_i \vee c_{2i}) \end{aligned}$$

It is known that \(\mathrm {DAG}_n\) has a polynomial level-ordered \(Q\text {-resolution}\) proof and only exponential \(\forall \text {Exp+Res}\) proofs [17]. As a second formula, we use the \(\mathrm {QParity}_n\) formula [2]

$$\begin{aligned} \exists x_1 \dots x_n \forall z \exists t_2 \dots t_n \mathpunct {.}\mathrm {xor}(x_1,x_2,t_2) \wedge \bigwedge _{i \in 3 \dots n} \mathrm {xor}(t_{i-1}, x_i, t_i) \wedge (z \vee t_n) \wedge (\overline{z} \vee \overline{t}_n) \end{aligned}$$

where \(\mathrm {xor}(o_1, o_2, o) = (\overline{o}_1 \vee \overline{o}_2 \vee \overline{o}) \wedge (o_1 \vee o_2 \vee \overline{o}) \wedge (\overline{o}_1 \vee o_2 \vee o) \wedge (o_1 \vee \overline{o}_2 \vee o)\) defines o to be equal to \(o_1 \oplus o_2\). \(\mathrm {QParity}_n\) has a polynomial \(\forall \text {Exp+Res}\) refutation but only exponential \(Q\text {-resolution}\) refutations [2]. We construct the following formula

We argue in the following that this formula has a polynomial refutation in \(\forall \text {Red+}\forall \text {Exp+Res}\). First, using \((\forall \mathrm {exp\text {-}res})\) we can derive the proof object containing the clause \((\overline{a} \vee \bigvee _{i \in {\{1 \dots 2n\}}} \overline{c}_i)\) using the expansion tree \(\mathcal {T}= {\{z \rightarrow 0, z \rightarrow 1\}}\) and the clauses from the last row (analogous to Example 2). After applying universal reduction, the proof object representing the clause \((\bigvee _{i \in {\{1 \dots 2n\}}} \overline{c}_i)\) can be derived. For the remaining formula, there is a polynomial and level-ordered resolution proof [17]; thus, the formula has a polynomial \(\forall \text {Red+}\forall \text {Exp+Res}\) proof.

There is no polynomial \(Q\text {-resolution}\) proof, because deriving \((\bigvee _{i \in {\{1 \dots 2n\}}} \overline{c}_i)\) is exponential in \(Q\text {-resolution}\). Likewise, there is no polynomial \(\forall \text {Exp+Res}\) proof as the formula after deriving this clause has only exponential \(\forall \text {Exp+Res}\) refutations.    \(\square \)
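The expansion step used in the proof above can be illustrated on \(\mathrm {QParity}_n\) in isolation. The following Python sketch builds the \(\mathrm {QParity}_n\) matrix, expands the universal variable z under both assignments, and checks by brute force that the resulting propositional formula is unsatisfiable. The variable numbering, the renaming scheme for the copies of \(t_2, \dots , t_n\), and all function names are assumptions made for illustration; the calculus itself annotates variables rather than renumbering them.

```python
from itertools import product


def xor_clauses(o1, o2, o):
    """Clauses defining o to be equal to o1 xor o2."""
    return [[-o1, -o2, -o], [o1, o2, -o], [-o1, o2, o], [o1, -o2, o]]


def qparity(n):
    """QParity_n for n >= 2 with x_i -> i, z -> n+1, t_i -> n+i (hypothetical numbering)."""
    z = n + 1
    t = {i: n + i for i in range(2, n + 1)}
    clauses = xor_clauses(1, 2, t[2])
    for i in range(3, n + 1):
        clauses += xor_clauses(t[i - 1], i, t[i])
    clauses += [[z, t[n]], [-z, -t[n]]]
    return list(range(1, n + 1)), z, list(t.values()), clauses


def expand(clauses, z, inner, values):
    """Instantiate universal z with each value; inner variables get a fresh copy per value."""
    top = max(abs(l) for cl in clauses for l in cl)
    expanded = []
    for k, value in enumerate(values):
        rename = {v: top + k * len(inner) + j + 1 for j, v in enumerate(inner)}
        for cl in clauses:
            if any(abs(l) == z and (l > 0) == value for l in cl):
                continue  # clause is satisfied by this value of z
            lits = [l for l in cl if abs(l) != z]  # drop the false z literal
            expanded.append([(1 if l > 0 else -1) * rename.get(abs(l), abs(l))
                             for l in lits])
    return expanded


def satisfiable(clauses):
    """Brute-force propositional satisfiability check (exponential)."""
    variables = sorted({abs(l) for cl in clauses for l in cl})
    for bits in product((False, True), repeat=len(variables)):
        model = dict(zip(variables, bits))
        if all(any(model[abs(l)] == (l > 0) for l in cl) for cl in clauses):
            return True
    return False


if __name__ == "__main__":
    xs, z, ts, clauses = qparity(3)
    print(satisfiable(expand(clauses, z, ts, [False, True])))
```

Expanding with only one of the two assignments leaves a satisfiable formula; the contradiction arises because both copies of \(t_n\) are defined as the parity of the same \(x_1, \dots , x_n\) but are forced to different values.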

One question that remains open is how the new proof system compares to unrestricted \(Q\text {-resolution}\). We already know that the new proof system polynomially simulates both tree-like \(Q\text {-resolution}\) and level-ordered \(Q\text {-resolution}\).

Theorem 8

\(\forall \text {Red+}\forall \text {Exp+Res}\) does not p-simulate \(Q\text {-resolution}\).

Proof

(Sketch). We construct a formula that is hard for expansion and level-ordered \(Q\text {-resolution}\), but easy for (unrestricted) \(Q\text {-resolution}\). We have already seen in the proof of Theorem 7 that \(\mathrm {DAG}_n\) is hard for \(\forall \text {Exp+Res}\) but easy for \(Q\text {-resolution}\). However, the \(Q\text {-resolution}\) proof of \(\mathrm {DAG}_n\) is level-ordered. Hence, we need an additional formula that is hard to refute for level-ordered \(Q\text {-resolution}\). We use the modified pigeon hole formula from [14], for which unrestricted resolution has polynomial proofs while resolution proofs that are restricted to a certain variable ordering are exponential. Using universal quantification, one can impose an arbitrary order on a level-ordered \(Q\text {-resolution}\) proof; thus, there is a quantified Boolean formula that has only exponential level-ordered \(Q\text {-resolution}\) proofs but a polynomial \(Q\text {-resolution}\) proof. The disjunction of those two formulas gives the required witness. This formula is easy to refute for \(Q\text {-resolution}\), but the first one is hard for \(\forall \text {Exp+Res}\) and the second is hard for level-ordered \(Q\text {-resolution}\).    \(\square \)

3.3 Comparison Between Extensions

We conclude this section by comparing the two extensions of the \(\forall \text {Red+Res}\) calculus introduced in this paper.

Theorem 9

\(\forall \text {Red+}\forall \text {Exp+Res}\) and \(\forall \text {Red+Res}\) with strengthening rule are incomparable.

Proof

(Sketch). The family of formulas \(\mathrm {CR}_n'\) from the proof of Corollary 1 separates \(\forall \text {Red+}\forall \text {Exp+Res}\) and \(\forall \text {Red+Res}\) with strengthening rule. Since the strengthening rule is not applicable, all \(\forall \text {Red+Res}\) proofs are exponential while there is a polynomial proof in \(\forall \text {Red+}\forall \text {Exp+Res}\).

The other direction is shown by a construction similar to the one used in the proof of Theorem 7. We use a combination of \(\mathrm {CR}_n\) and \(\mathrm {DAG}_n\) to construct a formula that has only exponential refutations in \(\forall \text {Red+}\forall \text {Exp+Res}\), but a polynomial refutation using the strengthening rule. The formula \(\mathrm {DAG}_n\) is used to generate the premise for the application of the strengthening rule to solve \(\mathrm {CR}_n\). To generate this premise using the rule \((\forall \mathrm {exp\text {-}res})\) one needs an exponential proof. There is a polynomial proof for \(\mathrm {DAG}_n\) in \(\forall \text {Red+Res}\), but there is none for \(\mathrm {CR}_n\); thus, \(\forall \text {Red+}\forall \text {Exp+Res}\) has only exponential refutations.    \(\square \)

Theorem 10

\(\forall \text {Red+}\forall \text {Exp+Res}\) with strengthening rule and \(Q\text {-resolution}\) are incomparable.

Proof

Follows from the proof of Theorem 8 as the witnessing formula can be constructed such that the strengthening rule is not applicable. The other direction follows from the separation of \(Q\text {-resolution}\) and \(\forall \text {Exp+Res}\) by Beyersdorff et al. [2].    \(\square \)

4 Experimental Evaluation

4.1 Implementation

We extended the implementation of \(\text {CAQE}\) with the possibility to use the rule \((\forall \mathrm {exp\text {-}res})\) as introduced in Sect. 3.2. While the rule is applicable at every level in the QBF in principle, its effectiveness decreases when applying it to deeply nested formulas, where \(\text {CAQE}\) tends to perform better [25] than \(\text {RAReQS}\). We aim to strike a balance between expansion and clausal abstraction, i.e., to keep the best performance characteristics of both solving methods. Thus, in our implementation, we apply the expansion refinement (in addition to the clausal-abstraction refinement) to the innermost universal quantifier.

figure e

An overview of the CEGAR algorithm is given in Algorithm 1. There is a close connection between the rules of the \(\forall \text {Red+Res}\) calculus and the presented algorithm. In particular, we use a SAT solver to prove the refutation needed in the rule \((\mathrm {res})\). We refer to [25] for algorithmic details. Changes to the original algorithm are written in bold text.

Abstraction. The abstraction for quantifier \(\exists X_k\), written \(\varphi _k\), is the projection of the clauses of the matrix to variables in \(X_k\), i.e., \(\bigwedge _{1 \le i \le m} lit (i,k)\). We assume that there is an operation to “disable” clauses in \(\varphi _k\), which corresponds to the situation where a clause \(C_i\) is satisfied by some variable bound before k. Likewise, for every clause we allow the assumption that this clause will be satisfied by some variable bound after k. This is used to generate candidate proof objects \(\mathcal {P}_*^{k+1}\) for inner levels. In the refinement step, this assumption can be invalidated, i.e., there is a way to force satisfaction of a clause at level k. Those operations can be implemented by an incremental SAT solver and two additional literals controlling the satisfaction of clauses [25].
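The projection and the notion of disabled clauses can be sketched in a few lines of Python. This is an illustration under simplifying assumptions, not the implementation of \(\text {CAQE}\): clauses are lists of signed integers, assignments are dictionaries, and the function names `project`, `remaining`, and `abstraction` are hypothetical. The actual implementation encodes these operations with an incremental SAT solver and additional control literals [25].

```python
def is_true(lit, assignment):
    """A literal is true if its variable is assigned the matching polarity."""
    return abs(lit) in assignment and assignment[abs(lit)] == (lit > 0)


def project(clauses, block):
    """lit(i, k): for every clause i, keep only the literals over the block X_k."""
    block = set(block)
    return [[l for l in cl if abs(l) in block] for cl in clauses]


def remaining(clauses, assignment):
    """Indices of clauses not yet satisfied by a (partial) assignment."""
    return [i for i, cl in enumerate(clauses)
            if not any(is_true(l, assignment) for l in cl)]


def abstraction(clauses, block, outer):
    """Projected clauses for X_k, skipping clauses disabled by outer variables."""
    projected = project(clauses, block)
    return {i: projected[i] for i in remaining(clauses, outer)}


if __name__ == "__main__":
    # Hypothetical matrix over the prefix  exists 1  forall 2  exists 3.
    matrix = [[1, 2], [-1, 3], [-2, -3]]
    print(project(matrix, [1]))
    print(abstraction(matrix, [3], {1: True, 2: False}))
```

An empty projection for an enabled clause means that the level cannot satisfy this clause itself; it has to rely on another level to do so.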

Algorithm. The algorithm recurses on the structure of the quantifier prefix and communicates proof objects \(\mathcal {P}\), which indicate the clauses of the matrix that are satisfied. At an existential quantifier, the abstraction generates a candidate solution (line 5) and checks recursively whether the candidate is correct (line 10). If not, the counterexample originally consists of a set of clauses (which could not be satisfied by the inner existential quantifiers). We extend this counterexample to also include an expansion tree \(\mathcal {T}\) from the levels below. In addition to the original refinement, we also build the expansion of the QBF with respect to the expansion tree \(\mathcal {T}\), resulting in a QBF with the same quantifier prefix as the current level (with additional existential variables due to expansion). This QBF is then translated into a propositional formula in the same way as the original QBF. Lastly, the abstraction \(\varphi _k\) is conjunctively combined with this propositional formula. Note that if the function returns UNSAT (line 7), the corresponding resolution proof from the SAT solver can be used to apply the rule \((\mathrm {res})\) from the \(\forall \text {Red+Res}\) calculus.
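The effect of conjoining an expansion to the abstraction can be seen in the simplest setting, a two-level formula \(\exists X \forall Y \mathpunct {.}\varphi \) without inner existential variables. The Python sketch below is the standard expansion-based CEGAR loop for this special case, not Algorithm 1: it omits the clausal-abstraction refinement and the recursion over the prefix, and it replaces the SAT solver by brute-force enumeration. All function names are hypothetical.

```python
from itertools import product


def models(variables, clauses):
    """Enumerate total assignments over `variables` satisfying all clauses."""
    for bits in product((False, True), repeat=len(variables)):
        m = dict(zip(variables, bits))
        if all(any(m[abs(l)] == (l > 0) for l in cl) for cl in clauses):
            yield m


def instantiate(clauses, assignment):
    """Drop clauses satisfied by the assignment and remove its false literals."""
    out = []
    for cl in clauses:
        if any(abs(l) in assignment and assignment[abs(l)] == (l > 0) for l in cl):
            continue
        out.append([l for l in cl if abs(l) not in assignment])
    return out


def counterexample(ys, clauses):
    """An assignment of ys that falsifies at least one clause, or None."""
    for bits in product((False, True), repeat=len(ys)):
        m = dict(zip(ys, bits))
        if any(not any(m[abs(l)] == (l > 0) for l in cl) for cl in clauses):
            return m
    return None


def solve_exists_forall(xs, ys, clauses):
    """Decide  exists xs forall ys. clauses; return a winning xs-assignment or None."""
    abstraction = []  # propositional clauses over xs only
    while True:
        candidate = next(models(xs, abstraction), None)
        if candidate is None:
            return None  # abstraction is unsatisfiable, so the QBF is false
        cex = counterexample(ys, instantiate(clauses, candidate))
        if cex is None:
            return candidate  # no universal response refutes the candidate
        # Expansion refinement: conjoin the matrix instantiated with cex.
        abstraction += instantiate(clauses, cex)


if __name__ == "__main__":
    print(solve_exists_forall([1], [2], [[1, 2], [1, -2]]))
    print(solve_exists_forall([1], [2], [[1, 2], [-1, -2]]))
```

Every refinement excludes the current candidate, since the candidate falsifies a clause of the matrix instantiated with the counterexample, so the loop terminates after at most as many iterations as there are assignments of X.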

As the underlying SAT solver in the implementation, we use \(\text {PicoSAT}\) [3], \(\text {MiniSat}\) [8], \(\text {cryptominisat}\) [26], or Lingeling [4].

4.2 Evaluation

In our evaluation, we show that the theoretical separations established in the last section translate to a significant empirical improvement. The evaluation is structured by the following three hypotheses: First, the strengthen and expansion refinements give a significant improvement over the plain version of \(\text {CAQE}\), and combining both refinements is overall better than applying only one of them. Second, the improvement provided by those refinements is independent of the underlying SAT solver. Third, when comparing on a per-instance basis, the combined refinement affects the runtime mostly positively. We show that the improvement is up to three orders of magnitude.

Table 1. Number of solved instances of the QBFGallery 2014 and QBFEval 2016 benchmark sets.

We compare our implementation against \(\text {RAReQS}\) [16], \(\text {Qesto}\) [18], \(\text {DepQBF}\) in version 5.0 [21], and \(\text {GhostQ}\) [20]. For every solver except \(\text {GhostQ}\), we use \(\text {Bloqqer}\) [5] in version 031 as preprocessor. For our experiments, we used a machine with a \(3.6\,\text {GHz}\) quad-core Intel Xeon processor and \(32\,\text {GB}\) of memory. The timeout and memout were set to 10 min and \(8\,\text {GB}\), respectively. Table 1 shows the number of solved instances on the QBFGallery 2014 benchmark set, broken down by benchmark family, as well as on the more recent QBFEval 2016 benchmark set. For \(\text {CAQE}\), we only report on the best performing version, that is, the one using \(\text {cryptominisat}\) as a backend solver.

The table shows that the strengthen and expansion refinements each improve over the plain version of \(\text {CAQE}\) in the number of solved instances. Further, the combination of both refinements is the overall best solver, followed by \(\text {RAReQS}\).

In the following, we refer to the combination of strengthen and expansion refinement as extended refinements. We want to detail the improvements due to the extended refinements and show their independence of the backend solver. The plot in Fig. 5 depicts the effect of the extended refinements with respect to the solved instances. The improvements in the number of solved instances are independent of the choice of the underlying SAT solver and range from 100 to 150 more instances solved compared to the plain version of \(\text {CAQE}\).

Fig. 5. Effect of the expansion refinement on the different configurations of \(\text {CAQE}\) on the QBFGallery 2014 benchmark sets.

The scatter plot depicted in Fig. 6 compares the running times of plain \(\text {CAQE}\) to those of the variant using extended refinements (both using \(\text {cryptominisat}\)) on a per-instance basis. Marks below the diagonal mean that the variant using extended refinements is faster. It is remarkable that the extended refinements have a mostly positive effect on the solving times. Only a few instances saw a significant increase in solving time, and even fewer timed out with extended refinements while being solved before. On the other hand, we see improvements in solving time that exceed three orders of magnitude. This is an empirical confirmation of our goal stated before, namely that our implementation of the expansion refinement adds the performance characteristics of expansion-based solvers while keeping the characteristics of the clausal-abstraction algorithm.

Fig. 6. Scatter plot comparing the solving time (in sec.) of \(\text {CAQE}\) with and without extended refinement.

5 Related Work

\(Q\text {-resolution}\) [19] is a variant of propositional resolution that is sound and refutation complete for QBF. Extensions to \(Q\text {-resolution}\) have been proposed, like long-distance resolution [27] and universal resolution [13], some of which are implemented in the QCDCL solver \(\text {DepQBF}\) [21]. Recently, there have also been proposals that extend \(Q\text {-resolution}\) by more generalized axioms [22]. In some sense, the \((\forall \mathrm {exp\text {-}res})\) rule presented in this paper can be viewed as a new axiom rule for the \(\forall \text {Red+Res}\) calculus.

The \(\forall \text {Exp+Res}\) calculus [17] was introduced to allow reasoning over expansion-based QBF solving, exemplified by the QBF solver \(\text {RAReQS}\) [16]. The work on \(\forall \text {Red+Res}\) was motivated by the same desire, namely understanding the performance of the recently introduced QBF solvers \(\text {CAQE}\) [25] and \(\text {Qesto}\) [18]. The incomparability of \(\forall \text {Exp+Res}\) and \(Q\text {-resolution}\) [2, 17] led to the creation of stronger proof systems that unify those calculi, like \(\text {IR-Calc}\) [1]. Further separation results, between variants of \(\text {IR-Calc}\) and variants of \(Q\text {-resolution}\), were given in [2]. Those extensions, however, do not have accompanying implementations. This also applies to recent work that is based on first-order resolution [9].

There are two well-known restrictions of \(Q\text {-resolution}\), namely level-ordered and tree-like \(Q\text {-resolution}\). Those restricted calculi were shown to be incomparable [23]. QCDCL-based solvers exhibit level-ordered proofs [15], and it was shown that \(\forall \text {Exp+Res}\) p-simulates tree-like \(Q\text {-resolution}\) [17]. We showed that \(\forall \text {Red+Res}\) is polynomial-simulation equivalent to level-ordered \(Q\text {-resolution}\), which explains the similar performance characteristics of the underlying solvers. Further, the strengthening rule presented in this paper can be viewed as a first step towards breaking the level-ordered restriction. The \(\forall \text {Red+}\forall \text {Exp+Res}\) calculus p-simulates level-ordered and tree-like \(Q\text {-resolution}\).

6 Conclusion

In this paper, we have introduced a new QBF proof calculus \(\forall \text {Red+Res}\) and showed that it is suitable for describing CEGAR-based solving algorithms. We defined two extensions of the \(\forall \text {Red+Res}\) calculus and showed that they have a theoretical advantage over the basic calculus. Based on this foundation, we implemented an expansion refinement in the solver \(\text {CAQE}\) and evaluated it on standard QBF benchmark sets. Our experiments show that our new implementation significantly outperforms the previous one, with little to no negative impact, making it one of the most competitive QBF solvers available. We have also shown that our theoretical considerations and the consequent algorithmic change explain those practical gains.

In future work, we want to improve the implementation by exploring heuristics for the application of the different refinements and we want to explore alternative versions of the strengthening rule presented in this paper.