1 Introduction

The intuit theorem prover by Claessen and Rosén  [2] implements an efficient decision procedure for Intuitionistic Propositional Logic (\(\mathrm {IPL} \)) based on a Satisfiability Modulo Theories (SMT) approach. Given an input formula \(\alpha \), the clausification module of intuit computes a sequent \(\sigma =R,X \Rightarrow g\) equivalent to \(\alpha \) with respect to \(\mathrm {IPL} \)-validity, where R, X and g have a special form: R is a set of clauses, X is a set of implications \((a\rightarrow b)\rightarrow c\), with a, b, c atoms, and g is an atom. The decision procedure at the core of intuit searches for a Kripke model \(\mathcal {K}\) such that at its root all the formulas in R and X are forced and g is not forced; we call \(\mathcal {K}\) a countermodel for \(\sigma \), since it witnesses the non-validity of \(\sigma \) in \(\mathrm {IPL} \). The search is performed via a suitable variant of the DPLL(\(\mathcal T\)) procedure [12], whose top-level loop exploits an incremental SAT-solver. This yields a highly efficient decision strategy: indeed, on a standard benchmark suite, intuit outperforms two state-of-the-art provers for \(\mathrm {IPL} \), namely fCube  [5] and intHistGC  [11]. At first sight, the intuit decision procedure seems far removed from the traditional techniques for deciding \(\mathrm {IPL} \)-validity; on the other hand, the in-depth investigation presented in [10] unveils a close and surprising connection between the SMT-based intuit approach and the known proof-theoretic methods. The crucial point is that the main loop of the decision procedure mimics a standard root-first proof search strategy for the sequent calculus \(\mathrm {LJT_{\mathtt {SAT}}}\) [10] (see Fig. 7), a variant of Dyckhoff’s calculus \(\mathrm {LJT}\) [3].
In [10] the intuit decision procedure is re-formulated so that, given a sequent \(\sigma \), it outputs either a derivation of \(\sigma \) in \(\mathrm {LJT_{\mathtt {SAT}}}\) or a countermodel for \(\sigma \).

Here we continue this investigation to better exploit the interplay between the SMT perspective and proof-theoretic methods. First, we enhanced the Haskell intuit code by implementing the derivation/countermodel extraction procedures discussed in [10]. We observed some unexpected phenomena: derivations are often convoluted and contain applications of the cut rule which cannot be trivially eliminated; countermodels in general contain many redundancies. To overcome these issues, we have redesigned the decision procedure. Unlike intuit, in the main loop we keep all the worlds of the countermodel under construction. Whenever the generation of a new world fails, the current model is emptied and the computation restarts with a new iteration of the main loop. We call the resulting prover intuitR (intuit with Restart). We gain some remarkable advantages. Firstly, the proof search procedure has a plain and intuitive presentation, consisting of two nested loops (see the flowchart in Fig. 3). Secondly, derivations have a linear structure, formalized by the calculus \(C^\rightarrow \) in Fig. 1; basically, a derivation in \(C^\rightarrow \) is a cut-free derivation in \(\mathrm {LJT_{\mathtt {SAT}}}\) having only one branch. Thirdly, the countermodels obtained by intuitR are in general smaller than the ones obtained by intuit, since restarts cross out redundant worlds. We have replicated the experiments in [2] (1200 benchmarks): as reported in the table in Fig. 9 and in the scatter plot in Fig. 11, intuitR performs better than intuit. The intuitR implementation and additional material (e.g., the omitted proofs, a detailed report on the experiments) can be downloaded at https://github.com/cfiorentini/intuitR.

2 Preliminary Notions

Formulas, denoted by lowercase Greek letters, are built from an infinite set of propositional variables \(V\), the constant \(\bot \) and the connectives \(\wedge \), \(\vee \), \(\rightarrow \); the formula \(\alpha \leftrightarrow \beta \) stands for \((\alpha \rightarrow \beta )\wedge (\beta \rightarrow \alpha )\). Elements of the set \(V\cup \{\bot \}\) are called atoms and are denoted by lowercase Roman letters; uppercase Greek letters denote sets of formulas. A (classical) interpretation M is a subset of \(V\), identifying the propositional variables assigned to true. By \(M\models \alpha \) we mean that \(\alpha \) is true in M; moreover, \(M\models \varGamma \) iff \(M\models \alpha \) for every \(\alpha \in \varGamma \). We write \({\varGamma }\,\vdash _{\mathrm {c}}\, \alpha \) iff, for every interpretation M, \(M\models \varGamma \) implies \(M\models \alpha \). A formula \(\alpha \) is \(\mathrm {CPL} \)-valid (valid in Classical Propositional Logic) iff \({\emptyset }\,\vdash _{\mathrm {c}}\, \alpha \).
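Since all the entailments used below range over a finite set of propositional variables, classical entailment can be decided by brute force. The following Haskell sketch is ours, not part of the intuit code; the names (`Formula`, `entailsC`) are hypothetical, and interpretations are represented as sets of variables:

```haskell
import Data.List (subsequences)
import qualified Data.Set as Set
import Data.Set (Set)

data Formula = Var String | Bot
             | And Formula Formula | Or Formula Formula | Imp Formula Formula

-- classical truth of a formula in the interpretation m
eval :: Set String -> Formula -> Bool
eval m (Var p)   = p `Set.member` m
eval _ Bot       = False
eval m (And a b) = eval m a && eval m b
eval m (Or a b)  = eval m a || eval m b
eval m (Imp a b) = not (eval m a) || eval m b

-- Gamma |-c alpha, enumerating every interpretation built from `univ`
entailsC :: [String] -> [Formula] -> Formula -> Bool
entailsC univ gamma alpha =
  and [ eval m alpha
      | vs <- subsequences univ
      , let m = Set.fromList vs
      , all (eval m) gamma ]
```

For instance, `entailsC ["p"] [] (Or (Var "p") (Imp (Var "p") Bot))` holds, reflecting that \(p\vee \lnot p\) is \(\mathrm {CPL} \)-valid.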

A (rooted) Kripke model for \(\mathrm {IPL} \) (Intuitionistic Propositional Logic) is a quadruple \(\mathcal {K}=\langle W, \le , r, \vartheta \rangle \) where W is a finite and non-empty set (the set of worlds), \(\le \) is a reflexive and transitive binary relation over W, the world r (the root of \(\mathcal {K}\)) is the minimum of W w.r.t. \(\le \), and \(\vartheta : W \rightarrow 2^{V}\) (the valuation function) is a map obeying the persistence condition: for every pair of worlds \(w_1\) and \(w_2\) of \(\mathcal {K}\), \(w_1 \le w_2\) implies \(\vartheta (w_1)\subseteq \vartheta (w_2)\). The valuation \(\vartheta \) is extended to a forcing relation between worlds and formulas as follows:

$$\begin{aligned} \begin{array}{ll} w \Vdash p ~\mathrm{iff}~ p \in \vartheta (w),~ \forall p\in V &{} w \Vdash \alpha \wedge \beta ~\mathrm{iff}~ w \Vdash \alpha ~\mathrm{and}~ w \Vdash \beta \\ w\nVdash \bot &{} w \Vdash \alpha \vee \beta ~\mathrm{iff}~ w \Vdash \alpha ~\mathrm{or}~ w \Vdash \beta \\ \multicolumn{2}{l}{w \Vdash \alpha \rightarrow \beta ~\mathrm{iff}~ \forall w' \ge w,~ w' \Vdash \alpha ~\mathrm{implies}~ w' \Vdash \beta .} \end{array} \end{aligned}$$

By \(w\Vdash \varGamma \) we mean that \( w\Vdash \alpha \) for every \(\alpha \in \varGamma \). A formula \(\alpha \) is \(\mathrm {IPL} \)-valid iff, for every Kripke model \(\mathcal {K}\) we have \(r \Vdash \alpha \) (here and below r designates the root of \(\mathcal {K}\)). Thus, if there exists a model \(\mathcal {K}\) such that \(r\nVdash \alpha \), then \(\alpha \) is not \(\mathrm {IPL} \)-valid; we call \(\mathcal {K}\) a countermodel for \(\alpha \), written \(\mathcal {K}\not \models \alpha \), and we say that \(\alpha \) is counter-satisfiable. We write \({\varGamma }\,\vdash _{\mathrm {i}}\, \delta \) iff, for every model \(\mathcal {K}\), \(r\Vdash \varGamma \) implies \(r\Vdash \delta \); thus, \(\alpha \) is \(\mathrm {IPL} \)-valid iff \({\emptyset }\,\vdash _{\mathrm {i}}\, \alpha \). Let \(\sigma \) be a sequent of the form \(\varGamma \Rightarrow \delta \); \(\sigma \) is \(\mathrm {IPL} \)-valid iff \({\varGamma }\,\vdash _{\mathrm {i}}\, \delta \). By \(\mathcal {K}\not \models \sigma \) we mean that \(r\Vdash \varGamma \) and \(r\nVdash \delta \). Note that such a model \(\mathcal {K}\) witnesses that \(\sigma \) is not \(\mathrm {IPL} \)-valid; we say that \(\mathcal {K}\) is a countermodel for \(\sigma \) and that \(\sigma \) is counter-satisfiable.
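The forcing clauses become directly executable when worlds are identified with their interpretations and \(\le \) is set inclusion, in which case persistence holds automatically (this is exactly the shape of the models used later in the paper). A small sketch, with names of our own choosing:

```haskell
import qualified Data.Set as Set
import Data.Set (Set)

data Formula = Var String | Bot
             | And Formula Formula | Or Formula Formula | Imp Formula Formula

-- a world, identified with the set of variables it makes true
type World = Set String

-- forcing at world w in the model whose worlds are ws, ordered by inclusion
forces :: [World] -> World -> Formula -> Bool
forces _  w (Var p)   = p `Set.member` w
forces _  _ Bot       = False
forces ws w (And a b) = forces ws w a && forces ws w b
forces ws w (Or a b)  = forces ws w a || forces ws w b
forces ws w (Imp a b) =
  and [ not (forces ws w' a) || forces ws w' b
      | w' <- ws, w `Set.isSubsetOf` w' ]
```

In the two-world model \(W=\{\emptyset ,\{p\}\}\), the root \(\emptyset \) does not force \(p\vee \lnot p\); this is the standard countermodel showing that the excluded middle is not \(\mathrm {IPL} \)-valid.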

Clausification We review the main concepts about the clausification procedure described in [2]. Flat clauses \(\varphi \) and implication clauses \(\lambda \) are defined as

$$\begin{aligned} \begin{array}{llll} \varphi &{} \;:=\; &{} \bigwedge A_1\rightarrow \bigvee A_2~|~\bigvee A_2 &{} \qquad \qquad \emptyset \subset A_k \,\subseteq \, V\cup \{\bot \}, ~\mathrm{for}~ k\in \{1,2\} \\ \lambda &{}\; := &{} (a \rightarrow b) \rightarrow c &{} \qquad \qquad a\in V,\;\{b,c\} \,\subseteq \, V\cup \{\bot \} \end{array} \end{aligned}$$

where \(\bigwedge A_1\) and \(\bigvee A_2\) denote the conjunction and the disjunction of the atoms in \(A_1\) and \(A_2\) respectively (\(\bigwedge \{a\}=\bigvee \{a\}=a\)). Henceforth, \(\bigwedge \emptyset \rightarrow \bigvee A_2\) must be read as \(\bigvee A_2\); moreover, R, \(R_1\), … denote sets of flat clauses; X, \(X_1\), … sets of implication clauses; A, \(A_1\), … sets of atoms. The intuit procedure relies on the following property (see Lemma 2 in [10]):

Lemma 1

For every set of flat clauses R and every atom g, \({R}\,\vdash _{\mathrm {i}}\, g\) iff \({R}\,\vdash _{\mathrm {c}}\, g\).

In the decision procedure, flat clauses are actively used only in classical reasoning. A pair (RX) is \(\rightarrow \)-closed iff, for every \((a\rightarrow b)\rightarrow c\in X\), \(b\rightarrow c\in R\). An r-sequent (reduced sequent) is a sequent \(\varGamma \Rightarrow g\) where g is an atom, \(\varGamma =R\cup X\) and (RX) is \(\rightarrow \)-closed. Given a formula \(\alpha \), the clausification procedure yields a triple (RXg) such that \(R,X \Rightarrow g\) is an r-sequent and:

  • (1) \({}\,\vdash _{\mathrm {i}}\, \alpha \) iff \({R,X}\,\vdash _{\mathrm {i}}\, g\); (2) \(\mathcal {K}\not \models R,X \Rightarrow g\) implies \(\mathcal {K}\not \models \alpha \), for every \(\mathcal {K}\).

Thus, \(\mathrm {IPL} \)-validity of formulas can be reduced to \(\mathrm {IPL} \)-validity of r-sequents.
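Under a concrete representation of clauses (the one below is ours; the intuit datatypes may differ), flat clauses, implication clauses and the \(\rightarrow \)-closed condition can be rendered directly:

```haskell
import qualified Data.Set as Set

type Atom = String

-- /\ A1 -> \/ A2; by convention, A1 = [] reads as \/ A2
data FlatClause = Flat [Atom] [Atom]
  deriving (Eq, Ord, Show)

-- (a -> b) -> c
data ImplClause = Impl Atom Atom Atom
  deriving (Eq, Ord, Show)

-- (R, X) is ->-closed iff b -> c is in R for every (a -> b) -> c in X
arrowClosed :: Set.Set FlatClause -> [ImplClause] -> Bool
arrowClosed r = all (\(Impl _ b c) -> Flat [b] [c] `Set.member` r)
```

For instance, the pair \((\{b\rightarrow c\},\,\{(a\rightarrow b)\rightarrow c\})\) is \(\rightarrow \)-closed.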

Fig. 1. The sequent calculus \(C^\rightarrow \); \(R,X \Rightarrow g\) is an r-sequent.

3 The Calculus \(C^\rightarrow \)

The sequent calculus \(C^\rightarrow \) consists of the rules \(\mathrm {cpl}_0\) and \(\mathrm {cpl}_1\) from Fig. 1. Rule \(\mathrm {cpl}_0\) (the axiom rule) can only be applied if the condition \({R}\,\vdash _{\mathrm {c}}\, g\) holds; rule \(\mathrm {cpl}_1\) requires that \({R,A}\,\vdash _{\mathrm {c}}\, b\) holds. In rule \(\mathrm {cpl}_1\), \((a\rightarrow b)\rightarrow c\) is the main formula and A the set of local assumptions; note that A is any set of propositional variables (not necessarily containing a). Derivations are defined as usual (see e.g. [14]); by \(\vdash _{C^\rightarrow }\sigma \) we mean that there exists a derivation of the r-sequent \(\sigma \) in \(C^\rightarrow \). In showing derivations, we leave out rule names and display the main formulas of \(\mathrm {cpl}_1\) applications. Soundness of rule \(\mathrm {cpl}_1\) relies on the following property:

  (a) If \({R,A}\,\vdash _{\mathrm {c}}\, b\), then \({R,(a\rightarrow b)\rightarrow c}\,\vdash _{\mathrm {i}}\, \varphi \), where \(\varphi = \bigwedge (A\setminus \{a\})\rightarrow c\).

Indeed, let \({R,A}\,\vdash _{\mathrm {c}}\, b\). By Lemma 1 \({R,A}\,\vdash _{\mathrm {i}}\, b\), thus \({R,A\setminus \{a\}}\,\vdash _{\mathrm {i}}\, a\rightarrow b\). It follows that \({R,(a\rightarrow b)\rightarrow c,A\setminus \{a\}}\,\vdash _{\mathrm {i}}\, c\), hence \({R,(a\rightarrow b)\rightarrow c}\,\vdash _{\mathrm {i}}\, \varphi \). By Lemma 1 and (a), the soundness of \(C^\rightarrow \) follows:

Proposition 1

\(\vdash _{C^\rightarrow }R,X \Rightarrow g\) implies \({R,X}\,\vdash _{\mathrm {i}}\, g\).

Fig. 2. Derivation of \(R_0,\,X \Rightarrow g\) in \(C^\rightarrow \) (\(0\le k\le m-1\)).

A derivation of \(\sigma _0=R_0,X \Rightarrow g\) has the plain form shown in Fig. 2: it consists of a single branch of sequents \(\sigma _k=R_k,X \Rightarrow g\) where the sets \(R_k\) are increasing. Nevertheless, the design of a root-first proof search strategy for \(C^\rightarrow \) is not obvious. Let \(\sigma _0\) be the r-sequent to be proved; we try to build the derivation in Fig. 2 bottom-up by running a loop where, at each iteration \(k\ge 0\), we search for a derivation of \(\sigma _k\). It is convenient to first check whether \({R_k}\,\vdash _{\mathrm {c}}\, g\), so that, by applying rule \(\mathrm {cpl}_0\), we immediately get a derivation of \(\sigma _k\). If this is not the case, we must pick an implication \(\lambda _k\) from X and guess a proper set of local assumptions \(A_k\) in order to apply rule \(\mathrm {cpl}_1\) bottom-up.

$$\begin{aligned}&\qquad \; \frac{{{R_k,\,b_k}\,\vdash _{\mathrm {c}}\, b_k} \qquad {R_k,\,X \Rightarrow g}}{{R_k,\,X \Rightarrow g}}\;{\lambda _k} \\&\lambda _k=(a_{k}\rightarrow b_{k})\rightarrow c_{k}\,\in \, X,\;b_k\rightarrow c_k\in R_k \\&A_k=\{b_k\},\; \varphi _k= b_k\rightarrow c_k,\;R_{k+1}=R_k \end{aligned}$$

If we made a blind choice, the procedure would be highly inefficient; for instance, the application of rule \(\mathrm {cpl}_1\) shown above triggers a non-terminating loop. Instead, we pursue the following strategy: we search for a countermodel for \(\sigma _k\); if we succeed, then \({R_k,X}\,\nvdash _{\mathrm {i}}\, g\) and, since \(R_0\subseteq R_k\), we conclude that \({R_0,X}\,\nvdash _{\mathrm {i}}\, g\) and proof search ends. Otherwise, from the failure we learn the proper \(\lambda _k\) and \(A_k\) to be used in the application of rule \(\mathrm {cpl}_1\); in the next iteration, proof search restarts with the sequent \(\sigma _{k+1}\), where \(R_{k+1}\) is obtained by adding the learned clause \(\varphi _k\) to \(R_k\). To check classical provability, we exploit a SAT-solver; each time the solver is invoked, the set \(R_k\) has grown, hence it is advantageous to use an incremental SAT-solver.

Countermodels Henceforth we define Kripke models by specifying the interpretations associated with their worlds. Let W be a finite set of interpretations with minimum \(M_0\), namely \(M_0\subseteq M\) for every \(M\in W\). By \(\mathcal {K}(W)\) we denote the Kripke model \(\langle W,\le ,M_0,\vartheta \rangle \) where \(\le \) coincides with the subset relation \(\subseteq \) and \(\vartheta \) is the identity map; thus \(M\Vdash p\) (in \(\mathcal {K}(W)\)) iff \(p\in M\). We introduce the following realizability relation \(\triangleright _{W}\) between W and implication clauses:

$$\begin{aligned} \begin{array}{lll} M\triangleright _{W} (a\rightarrow b)\rightarrow c &{} ~\mathrm{iff}~ &{} (a\in M) ~\mathrm{or}~ (b\in M) ~\mathrm{or}~ (c\in M) ~\mathrm{or}~ \\ &{}&{} \left( \; \exists M'\in W ~\mathrm{s.t.}~M\subset M' ~\mathrm{and}~ a\in M' ~\mathrm{and}~ b\not \in M' \;\right) . \end{array} \end{aligned}$$

By \(M\triangleright _{W} X\) we mean that \(M\triangleright _{W}\lambda \) for every \(\lambda \in X\). Countermodels of r-sequents can be characterized as follows:

Proposition 2

Let \(\sigma =R,X \Rightarrow g\) be an r-sequent and let W be a finite set of interpretations with minimum \(M_0\). Then, \(\mathcal {K}(W)\not \models \sigma \) iff:

(i) \(g\not \in M_0\); (ii) for every \(M\in W\), \(M\models R\) and \(M\triangleright _{W} X\).
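Both the realizability relation and the two conditions of Prop. 2 are directly executable. The sketch below is ours (not the intuit code) and assumes that W is non-empty and has a minimum, which then equals the intersection of all its members:

```haskell
import qualified Data.Set as Set
import Data.Set (Set)

type Atom  = String
type World = Set Atom

data FlatClause = Flat [Atom] [Atom]   -- /\ A1 -> \/ A2
data ImplClause = Impl Atom Atom Atom  -- (a -> b) -> c

-- M |= /\ A1 -> \/ A2
satisfies :: World -> FlatClause -> Bool
satisfies m (Flat a1 a2) =
  not (all (`Set.member` m) a1) || any (`Set.member` m) a2

-- M |>_W (a -> b) -> c
realizes :: [World] -> World -> ImplClause -> Bool
realizes ws m (Impl a b c) =
     any (`Set.member` m) [a, b, c]
  || any (\m' -> m `Set.isProperSubsetOf` m'
                 && a `Set.member` m' && not (b `Set.member` m')) ws

-- conditions (i) and (ii) of Prop. 2; assumes ws is non-empty with a minimum
isCountermodel :: [World] -> [FlatClause] -> [ImplClause] -> Atom -> Bool
isCountermodel ws r x g =
     not (g `Set.member` m0)                                       -- (i)
  && all (\m -> all (satisfies m) r && all (realizes ws m) x) ws   -- (ii)
  where m0 = foldr1 Set.intersection ws  -- the minimum, when it exists
```

For instance, with \(W=\{\emptyset ,\{a\}\}\), the root \(\emptyset \) realizes \((a\rightarrow b)\rightarrow c\) through the world \(\{a\}\), so \(\mathcal {K}(W)\) is a countermodel for \(\Rightarrow g\) extended with that implication clause.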

Fig. 3. Computation of .

4 The Procedure

The strategy outlined in Sec. 3 is implemented by the decision procedure (prove with Restart) defined by the flowchart in Fig. 3. The call returns Valid if the r-sequent \(\sigma =R,X \Rightarrow g\) is \(\mathrm {IPL} \)-valid, and CountSat otherwise; by tracing the computation, we can build a \(C^\rightarrow \)-derivation of \(\sigma \) in the former case and a countermodel for \(\sigma \) in the latter. We exploit a single incremental SAT-solver s: clauses can be added to s but not removed; by \(\mathrm {R}(s)\) we denote the set of clauses stored in s. The solver s is associated with a set of propositional variables \(\mathrm {U}(s)\) (the universe of s); we assume that every clause \(\varphi \) supplied to s is built over \(\mathrm {U}(s)\) (namely, every variable occurring in \(\varphi \) belongs to \(\mathrm {U}(s)\)). The SAT-solver is required to support the following operations:

  • Create a new SAT-solver.

  • // s is a SAT-solver, \(\varphi \) a flat clause built over \(\mathrm {U}(s)\)

    Add the clause \(\varphi \) to s.

  • // s is a SAT-solver, \(A\subseteq \mathrm {U}(s)\), \(g\in \mathrm {U}(s)\cup \{\bot \}\)

    Call s to decide whether \({\mathrm {R}(s),A}\,\vdash _{\mathrm {c}}\, g\) (A is a set of local assumptions). The solver outputs one of the following answers:

    • \(\mathrm {Yes}(A')\): thus, \(A'\subseteq A\) and \({\mathrm {R}(s),A'}\,\vdash _{\mathrm {c}}\, g\);

    • \(\mathrm {No}(M)\): thus, \(A \subseteq M \subseteq \mathrm {U}(s)\) and \(M \models \mathrm {R}(s)\) and \(g\not \in M\).

    In the former case it follows that \({\mathrm {R}(s),A}\,\vdash _{\mathrm {c}}\, g\), in the latter \({\mathrm {R}(s),A}\,\nvdash _{\mathrm {c}}\, g\).
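The three operations can be modelled abstractly as follows. This is a naive, purely functional stand-in, not the MiniSat interface used by intuit: the names `newSolver`, `addClause` and `satProve` are ours, entailment is decided by enumeration, and consequently the \(\mathrm {Yes}\) answer returns the whole assumption set instead of a minimized \(A'\subseteq A\):

```haskell
import Data.List (subsequences)
import qualified Data.Set as Set
import Data.Set (Set)

type Atom = String
data FlatClause = Flat [Atom] [Atom]          -- /\ A1 -> \/ A2
data Answer = Yes (Set Atom) | No (Set Atom)  -- Yes(A'): core; No(M): model
  deriving (Eq, Show)

data Solver = Solver { universe :: [Atom], clauses :: [FlatClause] }

newSolver :: [Atom] -> Solver
newSolver univ = Solver univ []

addClause :: Solver -> FlatClause -> Solver
addClause s phi = s { clauses = phi : clauses s }

-- decide R(s), A |-c g by enumerating all interpretations extending A
satProve :: Solver -> Set Atom -> Atom -> Answer
satProve s assms g =
  case [ m | vs <- subsequences (universe s)
           , let m = assms `Set.union` Set.fromList vs
           , all (holds m) (clauses s)
           , not (g `Set.member` m) ] of
    (m:_) -> No m        -- countermodel found: R(s), A |/-c g
    []    -> Yes assms   -- entailed (a real solver returns a smaller core)
  where
    holds m (Flat a1 a2) =
      not (all (`Set.member` m) a1) || any (`Set.member` m) a2
```

For example, with \(\mathrm {R}(s)=\{a\rightarrow b\}\) and \(A=\{a\}\), the query for goal b answers \(\mathrm {Yes}\), since every model of \(a\rightarrow b\) containing a contains b.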

The procedure , defined using the primitive operations, creates a new SAT-solver containing all the clauses in R. The computation of the call consists of the following steps:

(S0) A new SAT-solver s storing all the clauses in R is created.

(S1) A loop starts (the main loop) with W empty.

(S2) The SAT-solver s is called to check whether \({\mathrm {R}(s)}\,\vdash _{\mathrm {c}}\, g\). If the answer is \(\mathrm {Yes}(\emptyset )\), the computation stops yielding Valid. Otherwise, the output is \(\mathrm {No}(M)\) and the computation continues at Step (S3).

(S3) A loop starts (the inner loop) by adding the interpretation M computed at Step (S2) to the set W (thus, \(W=\{M\}\)).

(S4) We have to select a pair \(\langle w,\lambda \rangle \) such that \(w\in W\), \(\lambda \in X\) and \(w{\ntriangleright }_{W} \lambda \). If no such pair exists, the procedure ends with output CountSat. Otherwise, the computation continues at Step (S5).

(S5) Let \(\langle w,(a\rightarrow b)\rightarrow c \rangle \) be the pair selected at Step (S4). The SAT-solver s is called to check whether \({\mathrm {R}(s),w,a}\,\vdash _{\mathrm {c}}\, b\). If the result is \(\mathrm {No}(M)\), then a new iteration of the inner loop is performed where M is added to W. Otherwise, the answer is \(\mathrm {Yes}(A)\) and the computation continues at Step (S6); we call A the learned assumptions and \(\langle w,(a\rightarrow b)\rightarrow c \rangle \) the learned pair.

(S6) The clause \(\varphi = \bigwedge (A\setminus \{a\})\rightarrow c\) (the learned clause) is added to the solver s and the computation restarts from Step (S1) with a new iteration of the main loop.
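The two nested loops (S1)-(S6) can be condensed into a short purely functional sketch. This is our reconstruction, not the intuitR source: the incremental solver is replaced by a brute-force `query` oracle over the finite universe, pair selection at Step (S4) simply takes the first candidate, and, lacking unsat cores, the learned assumptions A are taken to be the whole set \(w\cup \{a\}\):

```haskell
import Data.List (subsequences)
import qualified Data.Set as Set
import Data.Set (Set)

type Atom  = String
type World = Set Atom

data FlatClause = Flat [Atom] [Atom]   deriving (Eq, Show)  -- /\ A1 -> \/ A2
data ImplClause = Impl Atom Atom Atom  deriving (Eq, Show)  -- (a -> b) -> c
data Result = Valid | CountSat [World] deriving (Eq, Show)

holds :: World -> FlatClause -> Bool
holds m (Flat a1 a2) =
  not (all (`Set.member` m) a1) || any (`Set.member` m) a2

-- naive oracle: Just m plays the role of No(m), Nothing of Yes(assms)
query :: [Atom] -> [FlatClause] -> Set Atom -> Atom -> Maybe World
query univ r assms g =
  case [ m | vs <- subsequences univ
           , let m = assms `Set.union` Set.fromList vs
           , all (holds m) r
           , not (g `Set.member` m) ] of
    (m:_) -> Just m
    []    -> Nothing

realizes :: [World] -> World -> ImplClause -> Bool
realizes ws m (Impl a b c) =
     any (`Set.member` m) [a, b, c]
  || any (\m' -> m `Set.isProperSubsetOf` m'
                 && a `Set.member` m' && not (b `Set.member` m')) ws

proveR :: [Atom] -> [FlatClause] -> [ImplClause] -> Atom -> Result
proveR univ r0 x g = mainLoop r0
  where
    mainLoop r =                                  -- (S1), (S2)
      case query univ r Set.empty g of
        Nothing -> Valid
        Just m  -> inner r [m]                    -- (S3)
    inner r ws =                                  -- (S4)
      case [ (w, l) | w <- ws, l <- x, not (realizes ws w l) ] of
        [] -> CountSat ws
        ((w, Impl a b c) : _) ->                  -- (S5)
          case query univ r (Set.insert a w) b of
            Just m  -> inner r (m : ws)           -- grow the model
            Nothing ->                            -- (S6): learn and restart
              mainLoop (Flat (Set.toList (Set.delete a w)) [c] : r)
```

On the r-sequent \(b\rightarrow g,\ a\rightarrow b,\ (a\rightarrow b)\rightarrow g \Rightarrow g\) the sketch performs one restart, learning the clause \(\bigwedge \emptyset \rightarrow g\) (i.e., g), and then answers Valid.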

Note that during the computation no new variables are created, thus \(\mathrm {U}(s)\) can be defined as the set of propositional variables occurring in \(R\cup X\cup \{g\}\). We show that the call is correct, namely: if R, X, g match the Input Assumptions, then the Output Properties hold (see Fig. 3). We stipulate that:

  • \(R_k\) denotes the set \(\mathrm {R}(s)\) at the beginning of iteration k of the main loop;

  • \(\varphi _k\) denotes the clause learned at iteration k of the main loop;

  • \(W_{k,j}\) denotes the set W at iteration k of the main loop and just after Step (S3) of iteration j of the inner loop;

  • \(\sim _c\) denotes classical equivalence, namely: \(\alpha \sim _c\beta \) iff \({}\,\vdash _{\mathrm {c}}\, \alpha \leftrightarrow \beta \).

We prove some properties about the computation of .

(P1) Let \(k,j\ge 0\) be such that \(W_{k,j}\) is defined. Then:

    (i) The set \(W_{k,j}\) has a minimum element \(M_0\) and \(g\not \in M_0\).

    (ii) For every \(M\in W_{k,j}\), \(M\models {R_k}\).

    (iii) If \(W_{k,j+1}\) is defined, then \(W_{k,j}\subset W_{k,j+1}\).

(P2) For every \(0\le h < k\) such that \(\varphi _k\) is defined, \(\varphi _h\not \sim _c\varphi _k\).

Let \(W_{k,0}=\{M\}\); one can easily check that, setting \(M_0=M\), (i) holds. Point (ii) follows from the fact that each M in \(W_{k,j}\) comes from an answer \(\mathrm {No}(M)\), thus \(M\models R_k\). Let \(W_{k,j+1}\) be defined and let \(W_{k,j+1}=W_{k,j}\cup \{M\}\), with M computed at Step (S5); there are \(w\in W_{k,j}\) and \(\lambda =(a\rightarrow b)\rightarrow c\in X\) such that \(w{\ntriangleright }_{W_{k,j}} \lambda \), \(w\cup \{a\}\subseteq M\) and \(b\not \in M\). We cannot have \(M\in W_{k,j}\): otherwise, since \(w\subseteq M\), \(a\in M\) and \(b\not \in M\), we would get \(w\triangleright _{W_{k,j}} \lambda \), a contradiction. Thus \(M\not \in W_{k,j}\), which proves (iii).

Let \(0\le h < k\) be such that \(\varphi _k\) is defined, let \(\langle w_k,\lambda _k=(a_{k}\rightarrow b_{k})\rightarrow c_{k} \rangle \) and \(A_k\) be the pair and the assumptions learned at iteration k respectively; note that \(A_k\subseteq w_k\cup \{a_k\}\). Since \(R_h\cup \{\varphi _h\}=R_{h+1} \subseteq R_k\), we have \(\varphi _h\in R_k\); by (P1)(ii), it holds that \(w_k\models R_k\), hence \(w_k\models \varphi _h\). We show that \(w_k\not \models \varphi _k\), and this proves (P2). Since \(\langle w_k,\lambda _k \rangle \) has been selected at Step (S4), \(c_k\not \in w_k\); by the fact that \(\varphi _k=\bigwedge (A_k\setminus \{a_k\})\rightarrow c_k\) and \(A_k\setminus \{a_k\}\subseteq w_k\), we conclude \(w_k\not \models \varphi _k\).

Exploiting the above properties, we prove the correctness of , also showing how to extract derivations and countermodels from computations.

Proposition 3

The call is correct.

Proof

We start by proving that the computation never diverges. By (P2), the learned clauses \(\varphi _k\) are pairwise not classically equivalent; since each \(\varphi _k\) is built over the finite set \(\mathrm {U}(s)\), at most \(2^{|\mathrm {U}(s)|}\) such clauses can be generated, and this proves the termination of the main loop. Since every interpretation M in W is a subset of \(\mathrm {U}(s)\), by (P1)(iii) the termination of the inner loop follows.

Let \(\sigma =R,X \Rightarrow g\). If the call returns CountSat, then the computation ends at Step (S4) because no pair \(\langle w,\lambda \rangle \) can be selected. By (P1), the current set W satisfies conditions (i) and (ii) of Prop. 2; accordingly, \(\mathcal {K}(W)\) is a countermodel for \(\sigma \), thus \({R,X}\,\nvdash _{\mathrm {i}}\, g\). If the call outputs Valid, then there exists \(m\ge 0\) such that, at Step (S2) of iteration m of the main loop, the SAT-solver yields \(\mathrm {Yes}(\emptyset )\), hence \({R_m}\,\vdash _{\mathrm {c}}\, g\). For every iteration k in \(0\dots m-1\) of the main loop, let \(\langle w_k,\lambda _k=(a_{k}\rightarrow b_{k})\rightarrow c_{k} \rangle \) be the learned pair and \(A_k\) the learned assumptions (thus, \({R_k,\,A_k}\,\vdash _{\mathrm {c}}\, b_k\)). We can apply rule \(\mathrm {cpl}_1\) as follows:

$$\begin{aligned} \frac{{{R_k,\,A_k}\,\vdash _{\mathrm {c}}\, b_k} \qquad \;\; {R_{k+1},\,X \Rightarrow g}}{{R_k,\,X \Rightarrow g}} \; {\lambda _k} \qquad \begin{array}{l} \varphi _k\,=\, \bigwedge (A_k\setminus \{a_k\})\rightarrow c_k \\ R_0=R,\quad R_{k+1}= R_k\cup \{\varphi _k\} \end{array} \end{aligned}$$

Accordingly, we can build the derivation of \(R,X \Rightarrow g\) displayed in Fig. 2 and, by Prop. 1, we conclude \({R,X}\,\vdash _{\mathrm {i}}\, g\).    \(\square \)

As a corollary, we get the completeness of the calculus \(C^\rightarrow \):

Proposition 4

For every r-sequent \(\sigma =R,X \Rightarrow g\), \(\;\vdash _{C^\rightarrow }\sigma \) iff \({R,X}\,\vdash _{\mathrm {i}}\, g\).

We give two examples of computations using formulas from the ILTP (Intuitionistic Logic Theorem Proving) library [13].

Example 1

Let \(\chi \) be the first instance of problem class SYJ201 from the ILTP library [13], where \(\eta _{ij}= p_i\leftrightarrow p_j\) and \(\gamma =\ p_1 \wedge p_2 \wedge p_3\):

$$ \chi \,=\,\left( (\eta _{12} \rightarrow \gamma ) \wedge ( \eta _{23} \rightarrow \gamma ) \wedge ( \eta _{31} \rightarrow \gamma ) \right) \,\rightarrow \, \gamma $$

The clausification of \(\chi \) yields the triple \((R_0,X,\tilde{g})\), where X contains the implication clauses \(\lambda _0,\dots ,\lambda _5\) defined in Fig. 4 and \(R_0\) the following 17 clauses (we mark by a tilde the fresh variables introduced during clausification):

$$ \begin{array}{l} \tilde{p}_{0}\rightarrow \tilde{p}_{4}, \quad \tilde{p}_{3}\rightarrow p_{2}, \quad \tilde{p}_{3}\rightarrow p_{3}, \quad \tilde{p}_{4}\rightarrow p_{1}, \quad \tilde{p}_{4}\rightarrow \tilde{p}_{3}, \quad \tilde{p}_{5}\rightarrow \tilde{p}_{4}, \quad \tilde{p}_{8}\rightarrow \tilde{p}_{4}, \\ \tilde{p}_{1}\wedge \tilde{p}_{2}\rightarrow \tilde{p}_{0}, \qquad \tilde{p}_{6}\wedge \tilde{p}_{7}\rightarrow \tilde{p}_{5}, \qquad \tilde{p}_{9}\wedge \tilde{p}_{10}\rightarrow \tilde{p}_{8}, \qquad p_{1}\wedge p_{2}\wedge p_{3}\rightarrow \tilde{g}, \\ p_1\rightarrow \tilde{p}_{2}, \qquad p_1\rightarrow \tilde{p}_{9}, \qquad p_2\rightarrow \tilde{p}_{1}, \qquad p_2\rightarrow \tilde{p}_{7}, \qquad p_3\rightarrow \tilde{p}_{6}, \qquad p_3\rightarrow \tilde{p}_{10}. \end{array} $$

The trace of the computation of is shown in Fig. 4. Each row displays the validity tests performed by the SAT-solver and the computed answers. If the result is \(\mathrm {No}(\_)\), the last two columns show the worlds \(w_k\) in the current set W and, for each \(w_k\), the list of \(\lambda \) such that \(w_k {\ntriangleright }_{W} \lambda \); the pair selected at Step (S4) is underlined. For instance, after call (0) we have \(W=\{w_0\}\) and \(w_0{\ntriangleright }_{W} \lambda _k\) for every \(0\le k\le 5\); the selected pair is \(\langle w_0,\lambda _0 \rangle \). After call (1), the set W is updated by adding the world \(w_1\) and \(w_1{\ntriangleright }_{W} \lambda _3\), \(w_1{\ntriangleright }_{W} \lambda _5\) and \(w_0{\ntriangleright }_{W} \lambda _k\) for every \(2\le k\le 5\) (since \(w_1\in W\), we get \(w_0\triangleright _{W} \lambda _0\)); the selected pair is \(\langle w_1,\lambda _3 \rangle \). Whenever the SAT-solver outputs \(\mathrm {Yes}(A)\), we display the learned clause \(\varphi _k\). The SAT-solver is invoked 15 times and there are 6 restarts. Fig. 4 also shows the derivation of \(R_0,X \Rightarrow \tilde{g}\) extracted from the computation.  \(\Diamond \)

Fig. 4. Computation of , see Ex. 1.

Fig. 5. Computation of , see Ex. 2.

Example 2

Let \(\psi \) be the second instance of problem class SYJ207 from the ILTP library [13], where \(\eta _{ij}= p_i\leftrightarrow p_j\) and \(\gamma = p_1 \wedge p_2 \wedge p_3 \wedge p_4\):

$$ \begin{array}{lcl} \psi&\,=\,&\left( (\eta _{12} \rightarrow \gamma ) \wedge ( \eta _{23} \rightarrow \gamma ) \wedge ( \eta _{34} \rightarrow \gamma ) \wedge ( \eta _{41} \rightarrow \gamma ) \right) \,\rightarrow \, (p_0\vee \lnot p_0\vee \gamma ) \end{array} $$

We proceed as in Ex. 1. The clausification procedure yields \((R_0,X,\tilde{g})\), where X consists of the implication clauses \(\lambda _0,\dots ,\lambda _8\) in Fig. 5 and the set \(R_0\) contains the 24 flat clauses below:

$$ \begin{array}{l} {p}_{0}\rightarrow \tilde{g}, {p}_{1}\rightarrow \tilde{p}_{2}, \, {p}_{1}\rightarrow \tilde{p}_{13}, \; {p}_{2}\rightarrow \tilde{p}_{1}, {p}_{2}\rightarrow \tilde{p}_{8}, \, {p}_{3}\rightarrow \tilde{p}_{7}, {p}_{3}\rightarrow \tilde{p}_{11}, \, {p}_{4}\rightarrow \tilde{p}_{10}, \, {p}_{4}\rightarrow \tilde{p}_{14}, \\ \tilde{p}_{0}\rightarrow \tilde{p}_{5}, \; \tilde{p}_{3}\rightarrow {p}_{3}, \; \tilde{p}_{3}\rightarrow {p}_{4}, \; \tilde{p}_{4}\rightarrow {p}_{2}, \; \tilde{p}_{4}\rightarrow \tilde{p}_{3}, \; \tilde{p}_{5}\rightarrow {p}_{1}, \; \tilde{p}_{5}\rightarrow \tilde{p}_{4}, \; \tilde{p}_{6}\rightarrow \tilde{p}_{5}, \; \tilde{p}_{9}\rightarrow \tilde{p}_{5} \\ \tilde{p}_{1}\wedge \tilde{p}_{2}\rightarrow \tilde{p}_{0}, \; \tilde{p}_{7}\wedge \tilde{p}_{8}\rightarrow \tilde{p}_{6}, \; \tilde{p}_{10}\wedge \tilde{p}_{11}\rightarrow \tilde{p}_{9}, \; \tilde{p}_{13}\wedge \tilde{p}_{14}\rightarrow \tilde{p}_{12}, \; \tilde{p}_{12}\rightarrow \tilde{p}_{5}, \; \gamma \rightarrow \tilde{g}. \end{array} $$

The execution of (see Fig. 5) requires 14 calls to the SAT-solver and 4 restarts. After the last call we get \(W=\{w_7,w_8, w_9\}\) and \(w_k\triangleright _{W} X\) for every \(w_k\in W\), thus the computation ends yielding CountSat. The model \(\mathcal {K}(W)\), depicted at the bottom left of the figure, is a countermodel for \(R_0,X \Rightarrow \tilde{g}\) and for \(\psi \) (see Sec. 2).  \(\Diamond \)

Fig. 6. The procedure of intuit  [2, 10].

5 Related Work and Experimental Results

We compare the procedure of intuitR with its intuit counterpart, namely the procedure defined in Fig. 6. Here we follow the presentation in [10], which is equivalent to the original one in [2]. The recursive auxiliary function plays the role of the main loop of (but the set of atoms \(\tilde{A}\) is not used); the loop inside corresponds to the inner loop. We point out some major differences. Firstly, the interpretations M computed by the SAT-solver are not collected; in the loop, only the interpretation M computed at line 8 is considered, thus at the beginning of each iteration only the “local” conditions of the test \(M{\ntriangleright }_{W} \lambda \) are checked (line 10). Secondly, the call to the SAT-solver at Step (S5) is replaced by the recursive call at line 11; as a consequence, we cannot build derivations by applying rule \(\mathrm {cpl}_1\). As thoroughly discussed in [10], the calculus underlying intuit is the sequent calculus \(\mathrm {LJT_{\mathtt {SAT}}}\) in Fig. 7, obtained from \(C^\rightarrow \) by replacing rule \(\mathrm {cpl}_1\) with the more general rule \(\mathrm {ljt}\) and by introducing a cut rule. Rule \(\mathrm {ljt}\) can be seen as a generalization of Dyckhoff’s implication-left rule from the calculus \(\mathrm {LJT}\) (alias \(\mathrm {G4ip}\)) [3, 14]. We remark that a \(C^\rightarrow \)-derivation is isomorphic to a cut-free \(\mathrm {LJT_{\mathtt {SAT}}}\)-derivation where, in every application of rule \(\mathrm {ljt}\), the left premise has a trivial proof (just apply rule \(\mathrm {cpl}_0\)). In [10] it is shown how countermodels and \(\mathrm {LJT_{\mathtt {SAT}}}\)-derivations can be extracted from computations. In brief, countermodels are obtained by collecting some of the interpretations coming from \(\mathrm {No}(\_)\) answers; such countermodels are in general bigger than the ones built by the procedure of Sec. 4, where at each restart the model is emptied.
As an example, let \(\sigma _0=R_0,X \Rightarrow \tilde{g}\) be defined as in Ex. 2; the intuit computation requires 31 calls to the SAT-solver (24 \(\mathrm {No}(\_)\) answers) and the computed countermodel for \(\sigma _0\) has 6 worlds (see Fig. 5); instead, the procedure of Sec. 4 requires 14 calls and the countermodel has 3 worlds. Derivation extraction presents some awkward aspects. The key insight is that, for every recursive call occurring in the computation, if the call returns \(\mathrm {Yes}(A)\) (where \(A\subseteq \tilde{A}\)), then we can build an \(\mathrm {LJT_{\mathtt {SAT}}}\)-derivation of a sequent \(R,R',A,\tilde{X} \Rightarrow q\), where \(R'\) contains some of the clauses added to the SAT-solver. The derivation is built either by applying rule \(\mathrm {cpl}_0\), if the computation ends at line 8, or else by applying rule \(\mathrm {ljt}\), exploiting the derivations obtained by the recursive calls at lines 11 and 14. Accordingly, the main call yields a derivation of \(R,R',X \Rightarrow g\). The crucial point is that the redundant clauses \(\varphi \) in \(R'\) satisfy \({R,X}\,\vdash _{\mathrm {i}}\, \varphi \) (this ultimately follows from property (a) in Sec. 3), thus we can eliminate them by applying the \(\mathrm {cut}\) rule.

Fig. 7. The calculus \(\mathrm {LJT_{\mathtt {SAT}}}\).

Example 3

Let \(\sigma _0=R_0,X \Rightarrow \tilde{g}\) be defined as in Ex. 1; the extraction procedure yields the \(\mathrm {LJT_{\mathtt {SAT}}}\)-derivation \(\mathcal {D}_0\) of \(R_2,\varphi _4,X \Rightarrow \tilde{g}\) in Fig. 8. By applying the cut rule three times, we get an \(\mathrm {LJT_{\mathtt {SAT}}}\)-derivation of \(\sigma _0\). We stress that the \(C^\rightarrow \)-derivation of \(\sigma _0\) obtained with intuitR (see Fig. 4) has a simpler structure.    \(\square \)

Fig. 8. Derivation \(\mathcal {D}_0\) of \(R_2,\varphi _4,X \Rightarrow \tilde{g}\) in \(\mathrm {LJT_{\mathtt {SAT}}}\) (see Ex. 3).

Fig. 9. For each prover, we report the number of problems solved within the 600s timeout and, in brackets, the total time in seconds required for the solved problems. The best prover is highlighted; a star indicates that some problems remain unsolved.

Finally, we remark that the clauses \(\varphi \) computed in do not enjoy property (P2) (Sec. 4); we have encountered cases (e.g., with formulas from class SYJ205 of the ILTP library) where such clauses are even duplicated.

Fig. 10.

Timings for problems \(k=1..50\) of SYJ212 (CountSat); "-" means timeout (600s).

Experimental results We have implemented intuitR in Haskell on top of intuit: we have replaced the function with and added some features (e.g., traces of computations, construction of derivations/countermodels); as in intuit, we exploit the module MiniSat, a Haskell bundle of the MiniSat SAT-solver [4] (but in principle we can use any incremental SAT-solver). We compare intuitR with intuit and with two of the state-of-the-art provers for \(\mathrm {IPL} \) by replicating the experiments in [2]. The first prover is fCube  [5]; it is based on a standard tableaux calculus and exploits a variety of simplification rules [6] that can significantly reduce branching and backtracking. The second prover is intHistGC  [11]; it relies on a sequent calculus with histories and uses dependency-directed backtracking with global caching to restrict the search space; we run it with its best flags (-b -c -c3). All tests were conducted on a machine with an Intel i7-8700 CPU@3.20GHz and 16GB memory. We considered the benchmarks provided with the intuit implementation, including the ILTP library, the intHistGC benchmarks and the API problems introduced by the intuit developers. This amounts to a total of 1200 problems, 498 Valid and 702 CountSat; we used a 600s (seconds) timeout. Fig. 9 reports the most significant results, including the classes where at least one prover fails and the classes where intuitR performs poorly. In all the tests, the time required by clausification is negligible. Even though no optimized data structures have been implemented, intuitR solves more problems than its competitors; in the families SYJ201 (Valid formulas) and SYJ207 (CountSat formulas) intuitR outperforms its rivals; in all the other cases, except the families EC, negEC and portia, intuitR is comparable to the best prover (which is intuit in most cases). The most remarkable improvement with respect to intuit occurs with class SYJ212 (see Fig. 10), where intuit's timings fluctuate.
To give a closer comparison, let us consider the case \(k=25\); clausification produces 246 flat clauses and 100 implication clauses (176 atoms). Our intuit implementation requires 11214 calls to the SAT-solver (10181 \(\mathrm {No}(\_)\)) and the computed countermodel has 1955 worlds. Instead, intuitR requires 45 calls to the SAT-solver and 8 restarts, and yields a countermodel consisting of 4 worlds; the set W contains 26 worlds before the first restart and one world before each of the remaining ones. Across all the benchmarks, the models generated during the computation are small (typically, large models only occur before the first restart); however, unlike [7, 8, 9], we cannot guarantee that the countermodels have minimum depth or a minimum number of worlds. To complete the picture, the scatter plot in Fig. 11 compares intuitR and intuit on all the benchmarks.

Fig. 11.

Comparison between intuitR and intuit (1172 problems; the 28 problems where both provers ran out of time have been omitted); the time axes are logarithmic, and the 8 red squares indicate that intuit exceeded the timeout.

To conclude, we point out that intuitR can be extended to deal with some superintuitionistic logics [1]. For instance, let us consider the Gödel-Dummett logic \(\mathrm {GL} \), characterized by linear models; at any step of the computation of , the model \(\mathcal {K}(W)\) must be kept linear. Whenever the insertion of a new world into W breaks linearity, we follow a “restart with learning” strategy [12]: let \(\gamma =(a\rightarrow b)\vee (b\rightarrow a)\) be the instance of the \(\mathrm {GL} \)-axiom falsified at the root of \(\mathcal {K}(W)\); we restart by taking \(\gamma \) as the “learned axiom”, so as to avoid repeating the flaw. However, we cannot add \(\gamma \) itself to the SAT-solver, because \(\gamma \) is not a clause; instead, we add the clausification of \(\gamma \), namely the clauses \(\tilde{q}_{1}\vee \tilde{q}_{2}\), \(\tilde{q}_{1}\wedge a \rightarrow b\), \(\tilde{q}_{2}\wedge b\rightarrow a\), where \(\tilde{q}_{1}\) and \(\tilde{q}_{2}\) are fresh atoms; although the language of the SAT-solver must thus be extended, the process converges. The other generalizations suggested in [2] (modal logics, fragments of first-order logic) seem more challenging.
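As a minimal illustration of this clausification step, the following Haskell sketch (our own names and clause representation, not the intuitR API) produces the three clauses for a learned axiom instance \((a\rightarrow b)\vee (b\rightarrow a)\), given two fresh atoms:

```haskell
-- Minimal sketch of the clausification step (our names, not the intuitR
-- API): a learned GL-axiom instance (a -> b) \/ (b -> a) is replaced by
-- three clauses over two fresh atoms q1, q2.
type Atom = String

data Clause
  = Disj [Atom]        -- flat clause: a1 \/ ... \/ an
  | Impl [Atom] Atom   -- implication clause: a1 /\ ... /\ an -> b
  deriving (Eq, Show)

-- Given the atoms a, b of the falsified axiom instance and two fresh
-- atoms q1, q2, produce the clauses q1 \/ q2, q1 /\ a -> b, q2 /\ b -> a.
clausifyGLAxiom :: Atom -> Atom -> (Atom, Atom) -> [Clause]
clausifyGLAxiom a b (q1, q2) =
  [ Disj [q1, q2]
  , Impl [q1, a] b
  , Impl [q2, b] a
  ]
```

For example, `clausifyGLAxiom "a" "b" ("q1", "q2")` yields the three clauses described above; each is in the flat or implication form already accepted by the clausification module, so only the atom alphabet of the SAT-solver grows.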