Introduction

The goal of software verification is to prove specified system properties. To perform verification using automated tools, first, the system needs to be transformed into a representation more suitable for rigorous analysis than source or machine code. In this work, we study a representation in propositional logic together with a system of linear inequalities. It allows for employing techniques and tools from the area of logic, such as SAT and SMT solvers [7, 15] and Craig interpolation [12].

In this paper, we focus on safety properties [29]. In particular, we aim at proving facts about parts of the programs and generalizing them. Such generalizations serve as a basis for inductive invariants—formulas representing loops in the program, which make the verification difficult—for guiding the search for a correctness proof in approaches such as IC3 [9] and k-induction [41], both known to scale to the verification of highly complex systems.

Finding good proofs and generalizing them is hard. A widely used approach, satisfiability modulo theories (SMT) [7, 15], models a system with fragments of first-order logic. Solvers for SMT combine a resolution-based variant of the DPLL-algorithm [13, 14, 42] for propositional logic with decision procedures for first-order theories. The SMT-LIB initiative [6] offers currently 55 different first-order fractions called SMT-LIB logics. What is common to these logics is that their solving requires typically only a handful of algorithms. Arguably, the two most important algorithms are a congruence closure algorithm for deciding quantifier-free equality with uninterpreted functions [32], and a Simplex-based procedure for linear arithmetic over real or rational numbers [18].

Generalizing proofs to inductive invariants is commonly done by Craig interpolation [12]. Here, the system of formulas is split into two parts, say, A and B, resulting in an interpolation problem (AB). The proof of unsatisfiability for \(A \wedge B\) is used to extract an interpolant I, a formula that is defined over the common symbols of A and B, is implied by A, and is unsatisfiable with B. We perceive the interpolant as a generalization of A with respect to B. Several interpolants can be computed for a given interpolation problem, and not all of them are useful for proving safety. This phenomenon gives rise to employing a portfolio [22] of interpolation algorithms that is then applied in the hopes of aiding to find the safety proof with the help of different interpolants.

The approaches to interpolation based on Farkas’ lemma construct a linear-real-arithmetic (LRA) interpolant by summing all inequalities appearing in A into a single inequality. We call the resulting interpolant the Farkas interpolant. While a single inequality is desirable in some cases, it prevents IC3-style algorithms from converging in others [37]. We show how methods from linear algebra can be applied on a Farkas interpolant to obtain decomposed interpolants that do not consist of a single inequality and guarantee the convergence of the model-checking algorithm for some cases where Farkas interpolants fail. A major advantage of decomposed interpolants is that they can still be computed from Farkas coefficients produced by Simplex-based decision procedures, allowing us to re-use the highly tuned implementations present in many state-of-the-art SMT solvers.

Intuitively, while computing the decomposed interpolants, we do not directly sum the inequalities in A, but, instead, we split the sum into sub-sums. The result is an interpolant that is a conjunction of often more than one component of the Farkas interpolant. This allows us not only to solve the convergence problem observed in some model-checking examples but also to gain more control over the strength of LRA interpolants. In summary, the contributions of this paper are:

  1. 1.

    a new Farkas-lemma-based interpolation algorithm for LRA conflicts, which guarantees to decompose a Farkas interpolant to more than one inequality if such decomposition exists;

  2. 2.

    establishing properties regarding logical strength of interpolants produced by our algorithm with respect to the original Farkas interpolants,

  3. 3.

    implementation of our new interpolation algorithm in OpenSMT, our SMT solver, and integration of our approach with the model checker sally [26],

  4. 4.

    a set of extensive experiments on a large set of model-checking benchmarks where we evaluate (1) the effect of replacing a traditional interpolation engine with our interpolation algorithm, and (2) the performance of the portfolio of interpolation techniques available in OpenSMT and MathSAT and the contribution of decomposition techniques to the performance of each portfolio.

This article is an extended version of a conference publication that appeared in [8]. With respect to the aforementioned points, the contribution of this paper over [8] includes:

  • Our previous algorithm, presented in [8], relied on a heuristic and did not provide a guarantee to discover a decomposition if one exists. The algorithm presented in this paper provides the guarantee.

  • We present complexity analysis of the proposed algorithm.

  • We present new results of the experiments reflecting the use of the new version of the algorithm and the progress of the SMT solvers used.

  • We provide a detailed comparison with a related approach [11].

The structure of the paper is as follows. In Sec. 2, we motivate the investigation of decomposed interpolants on a concrete model-checking problem where our approach guarantees immediate convergence, but Farkas interpolation diverges. In Sec. 3, we review the related work. In Sec. 4, we define the notation used in the paper, and in Sec. 5 and 6, we detail our main theoretical contribution. We provide experimental results in Sec. 7 and finally conclude in Sec. 8.

Motivation

To motivate our work, consider the code in Fig. 1.Footnote 1

Fig. 1
figure 1

Motivating example

In this code, the “\(*\)” character represents a non- deterministic choice (e.g. user input); thus, the body of the while loop can be executed any number of times. The assert statement captures the property of the program that variable “x” should always be nonnegative after exiting the while loop.

This code can be modelled as a transition system \(S = (I, T, Err )\) given in Eq. (1); here, I and \( Err \) are predicates that capture the initial and error states, respectively, and T is the transition relation. The symbols xy are real variables, and \(x', y'\) are their next-state versions.

$$\begin{aligned} S =\left\{ \begin{array}{l} I := (x = 0) \wedge (y = 0), \\ T := (x^\prime = x + y) \wedge (y^\prime = y + 1), \\ Err := (x < 0) \end{array}\right. \end{aligned}$$
(1)

The aforementioned example is one variant from a family of similar transition systems that are known not to converge in straightforward implementations of IC3-based algorithms using LRA interpolation. To prove the safety of the transition system \((I, T, Err )\), we search for a safe inductive invariant, i.e. a predicate R that satisfies (1) \(I(X) \rightarrow R(X)\), (2) \(R(X) \wedge T(X,X') \rightarrow R(X')\), and (3) \(R(X) \wedge Err (X) \rightarrow \bot \).

We demonstrate the problem that occurs in model checking when using Farkas interpolants on a simplified run of a model checker for our example. After checking that the initial state satisfies the property \(P :=x \ge 0\) (the negation of \( Err \)), the inductiveness of P is checked. The inductive check is reduced to a satisfiability check of a formula representing the question whether it is possible to reach a \(\lnot P\)-state (a state where \(\lnot P\) holds) by one step from any P-state:

$$\begin{aligned} x \ge 0 \wedge x' = x + y \wedge y' = y + 1 \wedge x' < 0. \end{aligned}$$

This formula is satisfiable, and a generalized counter-example to induction (CTI) is extracted. In our case, the CTI is \(x + y < 0\).Footnote 2 This means that if we make one step from a P-state that additionally satisfies \(x+y < 0\), we end up in a \(\lnot P\)-state. Therefore, we have to check whether this CTI is consistent with the initial states. This is again encoded as a satisfiability check of a formula

$$\begin{aligned} x = 0 \wedge y = 0 \wedge x + y < 0. \end{aligned}$$

This formula is unsatisfiable, and we can extract an interpolant to obtain a generalized reason why this CTI is not consistent with the initial states (not reachable in 0 steps in our system). The interpolant is computed for the partitioning \((x = 0 \wedge y = 0, x + y < 0)\). The Farkas interpolant for this partitioning is \(x + y \ge 0\), and we denote it as \(L_1\). Interpolation properties guarantee that \(L_1\) is valid in all initial states. Moreover, P is inductive relative to \(L_1\), formally

$$\begin{aligned} x \ge 0 \wedge x + y \ge 0 \wedge x' =&x + y \wedge y' = y + 1\\&\implies x' \ge 0. \end{aligned}$$

This means that by making one step from a P-state that is also an \(L_1\)-state we always end up in a P-state again. However, now we need to show that \(L_1\) holds in all reachable states. We check whether \(L_1\) is inductive (even relative to P). Similarly as before, we encode this as a satisfiability check of a formula

$$\begin{aligned} x + y \ge 0 \wedge x \ge 0&\wedge x' = x + y \\&\wedge y'= y + 1 \wedge x' + y' < 0. \end{aligned}$$

Again, this formula is satisfiable, and a generalized CTI is \(x+2y < -1\). This CTI is refuted as inconsistent with the initial states similarly to the first one. The formula

$$ x = 0 \wedge y = 0 \wedge x + 2y < -1 $$

is unsatisfiable, and Farkas interpolant generalizing the refutation is \(L_2 :=x + 2y \ge 0\). Similarly as before, it can be easily checked that \(L_1\) is inductive relative to \(L_2\), but \(L_2\) is not inductive (not even relative to P and \(L_1\)). The CTI is \(x + 3y < -1\), and it is refuted by a Farkas interpolant \(L_3 :=x + 3y \ge 0\). \(L_2\) is now inductive relative to \(L_3\), but \(L_3\) is not inductive, etc. The model checker diverges, since for \(L_n\) a CTI \(x + ny < -1\) is discovered and a new obligation to show inductiveness of \(L_{n+1}\) is generated.

However, let us get back to the first interpolation query \((x=0 \wedge y=0, x + y < 0)\). Farkas interpolation, which always computes an interpolant in the form of a single inequality, is not the only option. It is possible to compute an interpolant that is a conjunction of inequalities. In our case, \(L :=x \ge 0 \wedge y \ge 0\) is also an interpolant. This interpolant L is stronger than the Farkas interpolant; the property P is inductive relative to L, and, most importantly, L is inductive:

$$\begin{aligned} (x \ge 0 \wedge y \ge 0)&\wedge x' = x + y \mathrel {\wedge } \\&\wedge y'=y+1 \implies (x' \ge 0 \wedge y' \ge 0) \end{aligned}$$

is a valid formula. Actually, P follows from L, so L represents the inductive strengthening of P that witnesses the safety of our system.

In this work, we present an approach that allows the computation of Craig interpolants in LRA in this conjunctive form.

Related work

The possible weakness of Farkas interpolants for use in model checking was recognized in [37]. The authors demonstrate that Farkas interpolation does not satisfy the condition needed for proving convergence of a model-checking algorithm pd-kind [26]. Indeed, the model checker sally [26], which implements PD-KIND, diverges on our example from Sec. 2 if Farkas interpolation is used in its underlying interpolation engine. To resolve this problem, [37] introduces a new interpolation procedure that guarantees the convergence of a special sequence of interpolation problems often occurring in model-checking problems. However, this interpolation algorithm is based on a decision procedure called conflict resolution [28], which is not as efficient as the Simplex-based decision procedure used by most state-of-the-art SMT solvers. In contrast, we show how the original Simplex-based decision procedure using Farkas coefficients can be modified to produce interpolants not restricted to the single-inequality form, while additionally obtaining strength guarantees with respect to the original Farkas interpolants.

The reasoning engine Spacer [27] is also known to be affected by this weakness of Farkas interpolants. The verification framework SeaHorn [20], which relies on Spacer, uses additional invariants obtained from abstract interpretation to avoid the divergence.

The interpolation in linear real arithmetic (LRA) itself has received a significant amount of attention recently. The work on LRA interpolation dates back to 1997 [33]. A compact set of rules for deriving LRA interpolants from the proof of unsatisfiability in an inference system was presented in [31]. The interpolants in these works were the Farkas interpolants. Current methods usually compute Farkas interpolants from explanations of unsatisfiability extracted directly from the Simplex-based decision procedure inside the SMT solver [18]. Recently in [4], we presented a way of computing an infinite family of interpolants between a primal and a dual interpolant with variable strength. However, those interpolants are still restricted to single inequalities.

The first discussion on how to obtain interpolants in the form of conjunction of inequalities from Farkas coefficients is present in [11]. However, their approach is based on a simple heuristic which does not discover the possibility for decompositions in some cases where our approach finds the decomposition easily. Moreover, their focus was on the interpolation techniques themselves, and they do not discuss the applications of decomposed interpolants. We provide a detailed comparison to our approach in Sec. 6.3.

Other works on LRA interpolants include, e.g. [1, 36, 38]. Both [1] and [38] focus on producing simple overall interpolants by attempting to re-use (partial) interpolants from pure LRA conflicts. Our focus is not on the overall interpolant, but on a single LRA conflict. However, in the context of interpolants from proofs produced by SMT solvers, our approach also has the potential for re-using components of interpolants for LRA conflicts across the whole proof. Beside interpolation algorithms for LRA conflicts, there exists a large body of work on propositional interpolation [2, 16, 21, 25].

Preliminaries

We work in the domain of Satisfiability Modulo Theories (SMT) [7, 15], where satisfiability of formulas is determined with respect to some background theory. In particular, we are concerned with the lazy approach to SMT that combines a SAT solver dealing with the propositional structure of a formula and a theory solver for checking consistency of a conjunction of theory literals. The proof of unsatisfiability in this approach is basically a propositional proof that incorporates theory lemmas learned by the theory solver and propagated to the SAT solver. The proof-based interpolation algorithm then combines any propositional-proof-based interpolation algorithm with theory interpolator. Theory interpolator provides an interpolant for each theory conflict—an unsatisfiable conjunction of theory literals.

Linear arithmetic and linear algebra. We use the letters xyz to denote variables and ck to denote constants. Vector of n variables is denoted by \(\mathbf {x} = {(x_1, \dots , x_n)}^{\intercal }\) where n is usually known from context. \(\mathbf {x}[i]\) denotes the element of \(\mathbf {x}\) at position i, i.e. \(\mathbf {x}[i] = x_i\). The vector of all zeroes is denoted as \(\mathbf {0}\), and \(\mathbf {e_i}\) denotes the unit vector with \(\mathbf {e_i}[i] = 1\) and \(\mathbf {e_i}[j] =0\) for \(j \ne i\). For two vectors \(\mathbf {x} = {(x_1, \dots , x_n)}^{\intercal }\) and \(\mathbf {y} = {(y_1, \dots , y_n)}^{\intercal }\), we say that \(\mathbf {x} \le \mathbf {y}\) iff \(x_i \le y_i\) for each \(i \in \{1, \dots , n\}\). \(\mathbb {Q}\) denotes the set of rational numbers, \(\mathbb {Q}^n\) the n-dimensional vector space of rational numbers and \(\mathbb {Q}^{m \times n}\) the set of rational matrices with m rows and n columns. A transpose of matrix M is denoted as \({M}^{\intercal }\). A kernel (or null space) of a matrix M is the vector space \( ker (M) = \{\mathbf {x} \mathrel {|} M\mathbf {x}=\mathbf {0}\}\). A matrix is said to be in row echelon form (REF) if all nonzero rows are above all rows containing only zeros and the leading coefficient (first nonzero value) of each row is always strictly to the right of the leading coefficient of the row above. A matrix is said to be in reduced row echelon form (RREF) if it is in REF, the leading entry of each nonzero row is 1, and each column containing the leading entry of some row has zeros everywhere else. REF of a matrix can be obtained by Gaussian elimination, while RREF can be obtained by Gauss–Jordan elimination.

We adopt the notation of matrix product for linear arithmetic. For a linear term \(l = c_1x_1 + \dots + c_nx_n\), we write \({\mathbf {c}}^{\intercal }\mathbf {x}\) to denote l. Without loss of generality, we assume that all linear inequalities are of the form \({\mathbf {c}}^{\intercal }\mathbf {x} \bowtie c\) with \(\bowtie \in \{\le , <\}\). By linear system over variables \(\mathbf {x}\), we mean a finite set of linear inequalities \(S = \{C_i \mathrel {|} 1 \le i \le m\}\), where each \(C_i\) is a linear inequality over \(\mathbf {x}\). Note that from the logical perspective, each \(C_i\) is an atom in the language of the theory of linear arithmetic; thus, system S can be expressed as a formula \(\bigwedge _{i=1}^{m} C_i\) and we use these representations interchangeably. A linear system is satisfiable if there exists an evaluation of variables that satisfies all inequalities; otherwise, it is unsatisfiable. This is the same as the (un)satisfiability of the formula representing the system.

We extend the matrix notation also to the whole linear system. For the sake of simplicity, we use \(\le \) instead of \(\bowtie \), even if the system contains a mix of strict and non-strict inequalities. The only important difference is that a (weighted) sum of a linear system (as defined below) results in a strict inequality, instead of a non-strict one, when at least one strict inequality is present in the sum with a nonzero coefficient. The theory, proofs, and algorithm remain valid also in the presence of strict inequalities. We write \(C\mathbf {x} \le \mathbf {c}\) to denote the linear system S where C denotes the matrix of all coefficients of the system, \(\mathbf {x}\) are the variables, and \(\mathbf {c}\) is the vector of the right sides of the inequalities. With the matrix notation, we can easily express the sum of (multiples) of inequalities. Given a system of inequalities \(C\mathbf {x} \le \mathbf {c}\) and a vector of “weights” (multiples) of the inequalities \(\mathbf {k} \ge \mathbf {0}{}\), the inequality that is the (weighted) sum of the system can be expressed as \({\mathbf {k}}^{\intercal }C\mathbf {x} \le {\mathbf {k}}^{\intercal }\mathbf {c}\).

Craig interpolation. Given two formulas \(A(\mathbf {x},\mathbf {y})\) and \(B(\mathbf {y},\mathbf {z})\) such that \( A \wedge B\) is unsatisfiable, a Craig interpolant [12] is a formula \(I(\mathbf {y})\) such that \(A \implies I\) and \(I \implies \lnot B\).

The pair of formulas (AB) is also referred to as an interpolation problem. In linear arithmetic, the interpolation problem is a linear system S partitioned into two parts: A and B.

One way to compute a solution to an interpolation problem in linear arithmetic, used in many modern SMT solvers, is based on Farkas’ lemma [19, 39]. Farkas’ lemma states that for an unsatisfiable system of linear inequalities \(S \equiv C\mathbf {x} \le \mathbf {c}\), there exist Farkas coefficients \(\mathbf {k} \ge \mathbf {0}\) such that \({\mathbf {k}}^{\intercal }C\mathbf {x} \le {\mathbf {k}}^{\intercal }\mathbf {c} \equiv 0 \le -1 \). In other words, the weighted sum of the system given by the Farkas coefficients is a contradictory inequality. If a strict inequality is part of the sum, the result might also be \(0 < 0\).

The idea behind the interpolation algorithm based on Farkas coefficients is simple. Intuitively, given the partitioning of the linear system into A and B, we compute only the weighted sum of A. It is not hard to see that this sum is an interpolant. It follows from A because a weighted sum of a linear system with nonnegative weights is always implied by the system. It is inconsistent with B because its sum with the weighted sum of B (using Farkas coefficients) is a contradictory inequality by Farkas’ lemma. Finally, it cannot contain any A-local variables, as can be seen from the following reasoning: all variables are eliminated in the weighted sum of the whole system. Since A-local variables are by definition absent in B, they must be eliminated already in the weighted sum of A.

More formally, for an unsatisfiable linear system \(S := C\mathbf {x} \le \mathbf {c}\) over n variables, where \(C \in \mathbb {Q}^{m \times n}, \mathbf {c} \in \mathbb {Q}^{m}\), and its partition to \(A := C_{A}\mathbf {x} \le \mathbf {c_{A}}\) and \(B := C_{B}\mathbf {x} \le \mathbf {c_{B}}\), where \(C_A \in \mathbb {Q}^{k \times n}\), \(C_B \in \mathbb {Q}^{l \times n}\), \(\mathbf {c_A} \in \mathbb {Q}^k\), \(\mathbf {c_B} \in \mathbb {Q}^l\) and \(k + l = m\), there exist Farkas coefficients \(\mathbf {{k}^{\intercal }} = (\mathbf {{k}^{\intercal }_A} \, \mathbf {{k}^{\intercal }_B})\) such that

$$\begin{aligned} (\mathbf {{k}^{\intercal }_A} \, \mathbf {{k}^{\intercal }_B}) \left( \begin{array}{c} C_A\\ C_B\\ \end{array} \right) = 0, (\mathbf {{k}^{\intercal }_A} \, \mathbf {{k}^{\intercal }_B}) \left( \begin{array}{c} \mathbf {c_A}\\ \mathbf {c_B}\\ \end{array} \right) = -1, \end{aligned}$$

and the Farkas interpolant for (AB) is the inequality

$$\begin{aligned} I^F:=\mathbf {{k}^{\intercal }_A} C_A \mathbf {x} \le \mathbf {{k}^{\intercal }_A} \mathbf {c_A}. \end{aligned}$$
(2)

Decomposed Interpolants

In this section, we present our new approach to computing interpolants in linear arithmetic based on Farkas coefficients. The definition of Farkas interpolant of Eq. (2) corresponds to the weighted sum of A-part of the unsatisfiable linear system. This sum can be decomposed into j sums by decomposing the vector \(\mathbf {k_A}\) into j vectors

$$\begin{aligned} \mathbf {k_A} = \sum _{i=1}^j \mathbf {k_{A,i}}, \text { with } \mathbf {0} \le \mathbf {k_{A,i}} \le \mathbf {k_A} \text { for all } i, \end{aligned}$$
(3)

thus obtaining j inequalities

$$\begin{aligned} I_i :={\mathbf {{k}^{\intercal }_{A,i}}} C_A \mathbf {x} \le {\mathbf {{k}^{\intercal }_{A,i}}} \mathbf {c_A} \end{aligned}$$
(4)

If \(\mathbf {k_{A,i}}\) are such that the left-hand side of the inequalities \(I_i\) contains only shared variables, the decomposition has an interesting application in interpolation, as illustrated below.

Definition 1

(decomposed interpolants) Given an interpolation instance (AB), if there exists a sum of the form Eq. (3) such that the left side of Eq. (4) contains only shared variables for all \(1 \le i \le j\), then the set of inequalities \(D = \{I_1, \ldots , I_j\}\) is a decomposition.

In that case the formula \(\bigwedge _{i=1}^j I_i\) is a decomposed interpolant (DI) of size j for (AB).

The decomposed interpolants are proper interpolants, as stated in the following theorem.

Theorem 1

Let (AB) be an interpolation problem in linear arithmetic. If \(D = \{I_1, \ldots , I_k\}\) is a decomposition, then \(I^{D} = I_1 \wedge \ldots \wedge I_k\) is an interpolant for (AB).

Proof

Let \(I^{D} = I_1 \wedge \ldots \wedge I_k\). First, \(A \implies I^{D}\) since for all \(I_i\), \(A \implies I_i\). This is immediate from the fact that A is a system of linear inequalities \(C_A \mathbf {x} \le \mathbf {c_A}\), \(I_i = (\mathbf {{k}^{\intercal }_{A,i}}C_A \mathbf {x} \le \mathbf {{k}^{\intercal }_{A,i}}\mathbf {c_A})\) and \(\mathbf {0}\le \mathbf {k_{A,i}}\).

Second, \(I^{D} \wedge B \implies \bot \) since \(I^{D}\) implies Farkas interpolant \(I^F\). This holds because \(\mathbf {k_{A}}= \sum _i \mathbf {k_{A,i}}\) and \(\mathbf {0}\le \mathbf {k_{A,i}}\).

Third, \(I^{D}\) contains only the shared variables by the definition of decomposition (Definition 1). Therefore, \(I^{D}\) is an interpolant. \(\square \)

Each interpolation instance has a \(DI\) of size one, a trivial decomposition, corresponding to the Farkas interpolant of Eq. (2). However, interpolation problems, in general, can admit bigger decompositions. In the following, we give a concrete example of an instance with decomposition of size two.

Example 1

Let (AB) be an interpolation problem in linear arithmetic with \( A= (x_1 + x_2 \le 0) \wedge (x_1 + x_3 \le 0) \wedge (-x_1 \le 0)\) and \(B = (- x_2 - x_3 \le -1)\). The linear systems corresponding to A and B are

$$\begin{aligned} C_A&=\begin{pmatrix} 1&{} 1&{} 0\\ 1&{} 0&{} 1\\ -1&{} 0&{} 0 \end{pmatrix}, \quad&\mathbf {c_A}&= \begin{pmatrix} 0 \\ 0 \\ 0 \end{pmatrix} \\ C_B&=\begin{pmatrix} 0&-1&-1 \end{pmatrix}, \quad&\mathbf {c_B}&= \begin{pmatrix} -1\end{pmatrix}. \end{aligned}$$

Farkas coefficients are

$$\begin{aligned} \mathbf {{k}^{\intercal }_{A}}=\begin{pmatrix} 1&1&2 \end{pmatrix} \text { and } \mathbf {{k}^{\intercal }_{B}}= \begin{pmatrix} 1 \end{pmatrix}, \end{aligned}$$

while Farkas interpolant for (AB) is the inequality \(I^F:=x_2 + x_3 \le 0\). However, if we decompose \(\mathbf {k_{A}}\) into

$$\begin{aligned} \mathbf {{k}^{\intercal }_{A,1}} = \begin{pmatrix} 1&0&1\end{pmatrix} \text { and } \mathbf {{k}^{\intercal }_{A,2}} = \begin{pmatrix} 0&1&1\end{pmatrix}, \end{aligned}$$

we obtain the decomposition \(\{x_2 \le 0, x_3 \le 0\}\) producing the decomposed interpolant \(I^{DI} :=x_2 \le 0 \wedge x_3 \le 0\) of size two.

Strength-Based Ordering of Decompositions

Decomposition of Farkas coefficients for a single interpolation problem is in general not unique. However, we can provide some structure to the space of possible interpolants by ordering interpolants with respect to their logical strength. To achieve this, we define the coarseness of a decomposition based on its ability to partition the terms of the interpolant into finer sums and then prove that coarseness provides us with a way of measuring the interpolant strength.

Definition 2

Let \(D_1, D_2\) denote two decompositions of the same interpolation problem of size m, n, respectively, where \(n < m\). Let \((\mathbf {q_1}, \dots , \mathbf {q_m})\) denote the decomposition of Farkas coefficients corresponding to \(D_1\) and let \((\mathbf {r_1}, \dots , \mathbf {r_n})\) denote the decomposition of Farkas coefficients corresponding to \(D_2\). We say that decomposition \(D_1\) is finer than \(D_2\) (or equivalently \(D_2\) is coarser than \(D_1\)) and denote this as \(D_1 \prec D_2\) when there exists a partitioning \(P_{}= \{p_1, \dots , p_n\}\) of the set \(\{\mathbf {q_1}, \dots , \mathbf {q_m}\}\) such that for each i with \(1 \le i \le n\), \(\mathbf {r_i} = \sum _{\mathbf {q} \in p_i} \mathbf {q}\).

Interpolants of decompositions ordered by their coarseness can be ordered by logical strength, as stated by the following lemma:

Lemma 1

Assume \(D_1\), \(D_2\) are two decompositions of the same interpolation problem such that \(D_1 \prec D_2\). Let \(I^{D_{1}}, I^{D_{2}}\) be the decomposed interpolants corresponding to \(D_1, D_2\). Then, \(I^{D_{1}}\) implies \(I^{D_{2}}\).

Proof

Informally, the implication follows from the fact that each linear inequality of \(I^{D_{2}}\) is a sum of some inequalities in \(I^{D_{1}}\).

Formally, let \(I_i\) denote the i-th inequality in \(I^{D_{2}}\). Then, \(I_i = (\mathbf {{r}^{\intercal }_i} C_A \mathbf {x}\le \mathbf {{r}^{\intercal }_i} \mathbf {c_A})\). Since \(D_1 \prec D_2\), there is a set \(\{I_{i_1}, \dots , I_{i_j}\} \subseteq D_1\) such that for each k with \(1 \le k \le j\), \(I_{i_k} = (\mathbf {{q}^{\intercal }_{i_k}} C_A \mathbf {x}\le \mathbf {{q}^{\intercal }_{i_k}} \mathbf {c_A})\) and \(\mathbf {r_i} = \sum _{k=1}^{j} \mathbf {q_{i_k}}\).

Since \(\mathbf {q_{i_k}} \ge \mathbf {0}\), it holds that \(I_{i_1} \wedge \dots \wedge I_{i_j} \implies I_i\). This means that \(I^{D_{1}}\) implies every conjunct of \(I^{D_{2}}\). \(\square \)

Note that the trivial, single-element decomposition corresponding to Farkas interpolant is the greatest element of this decomposition ordering. Also, for any decomposition of size more than one, replacing any number of elements by their sum yields a coarser decomposition.

Finally, we emphasize that it is difficult to argue about the suitability of a decomposition for a particular purpose based solely on strength. For example, a user may opt for a coarser decomposition because summing up just some elements of a decomposition may result in eliminating a shared variable.

Strength of the Dual Interpolants

Before we describe the details of the decomposing interpolation procedure, we extend the picture of interpolation strength related to the decomposed interpolants.

Some applications of interpolation can benefit from computing coarser over-approximation (i.e. weaker interpolants). For example, a weaker function summary can cover more changes in an upgrade checking scenario [40], and weaker over-approximations of reachability in a transition system can converge to fix-point faster [16]. Using the notion of dual interpolation, decompositions can also be used to compute interpolants weaker than Farkas interpolant (or even its dual).

Given an interpolation problem (AB) and an interpolation procedure \( Itp \), we denote the interpolant computed by \( Itp \) for (AB) as \( Itp (A,B)\). Then, \( Itp ^{\prime }\) denotes the dual interpolation procedure, which works as follows: \( Itp ^{\prime }(A,B) = \lnot Itp {}(B,A)\). The well-known duality theorem for interpolation states that \( Itp ^{\prime }\) is a correct interpolation procedure.

Let us denote the interpolation procedure based on Farkas’ lemma as \( Itp_{F} \) and the decomposing interpolation procedure as \( Itp_{DI} \). The relation between \( Itp_{F} \) and its dual \( Itp^{\prime }_{F} \) has been established in [4], namely that \( Itp_{F} (A,B) \implies Itp^{\prime }_{F} (A,B)\). We have shown in Lemma 1 that a decomposed interpolant always implies Farkas interpolant computed from the same Farkas coefficients. Formally, \( Itp_{DI} (A,B) \implies Itp_{F} (A,B)\). Similar result can be established for the dual interpolation procedures: as \( Itp_{DI} (B,A) \implies Itp_{F} (B,A)\), it follows that \(\lnot Itp_{F} (B,A) \implies \lnot Itp_{DI} (B,A)\) and consequently \( Itp^{\prime }_{F} (A,B) \implies Itp^{\prime }_{DI} (A,B)\).

Combining the results on logical strength together, we obtain a chain of implications:

$$\begin{aligned} Itp_{DI} (A,B) \implies Itp_{F} (A,B)&\implies Itp^{\prime }_{F} (A,B) \\&\implies Itp^{\prime }_{DI} (A,B). \end{aligned}$$

Note that while both \( Itp_{F} \) and \( Itp^{\prime }_{F} \) compute interpolants as a single inequality and interpolants produced by \( Itp_{DI} \) are conjunctions of inequalities, interpolants produced by \( Itp^{\prime }_{DI} \) are disjunctions of inequalities.

In the following section, we describe the details of the \( Itp_{DI} \) interpolation procedure.

Finding Decompositions

In this section, we present our approach for finding decompositions for linear arithmetic interpolation problems given their Farkas coefficients.

We focus on the task of finding decomposition of \(\mathbf {{k}^{\intercal }_{A}}C_{A}\mathbf {x}\). Recall that \(C_{A}\in \mathbb {Q}^{l \times n}\) and \(\mathbf {x}\) is a vector of variables of length n. Without loss of generality, assume that there are no B-local variables since columns of \(C_{A}\) corresponding to B-local variables would contain all zeroes by definition in any case.

Furthermore, without loss of generality, assume the variables in the inequalities of A are ordered such that all A-local variables are before the shared ones. Then, let us write

$$\begin{aligned} C_{A}=\begin{pmatrix} L&S\end{pmatrix}, \quad {\mathbf {x}}^{\intercal } = \begin{pmatrix} {\mathbf {x}_{L}}^{\intercal }&{\mathbf {x}_{S}}^{\intercal } \end{pmatrix} \end{aligned}$$
(5)

where \(\mathbf {x}_{L}{}\) is the vector of A-local variables of size p, \(\mathbf {x}_{S}{}\) the vector of shared variables of size q, \(n = p + q\), \(L\in \mathbb {Q}^{l \times p}\), and \(S\in \mathbb {Q}^{l \times q}\). We know that \( \mathbf {{k}^{\intercal }_{A}}L= \mathbf {0}\) and the goal is to find \(\mathbf {k_{A,i}}\) such that \(\sum _i \mathbf {k_{A,i}}= \mathbf {k_{A}}\) and for each i \(\mathbf {0}\le \mathbf {k_{A,i}}\le \mathbf {k_{A}}\) and \(\mathbf {{k}^{\intercal }_{A,i}}L = \mathbf {0}\).

In the following, we will consider two cases for computing the decompositions. We first study a common special case where system A contains rows with no local variables and give a linear-time algorithm for computing the decompositions. We then move to the general case where the rows of A contain local variables and provide a decomposition algorithm based on computing a vector basis for a null space of a matrix obtained from A.

Trivial Elements

First, consider a situation where there is a linear inequality with no local variables. This means there is a row j in \(C_{A}\) (denoted as \({C_{A}}_j)\) such that all entries in columns corresponding to local variables are 0, i.e. \(L_j = {\mathbf {0}}^{\intercal }\). Then, \(\{I_1, I_2\}\) for \(\mathbf {k_{A,1}} = \mathbf {k_{A}}[j] \times \mathbf {e_{j}}\) and \(\mathbf {k_{A,2}} = \mathbf {k_{A}}- \mathbf {k_{A,1}}\) is a decomposition. Intuitively, any linear inequality that contains only shared variables can form a stand-alone element of a decomposition. When looking for finest decomposition, we do this iteratively for all inequalities with no local variables. In the next part, we show how to look for a non-trivial decomposition when dealing with local variables.

Decomposing in the Presence of Local Variables

For this section, assume that \(L\) has no zero rows. (We have shown above how to deal with such rows.) We are going to search for a non-trivial decomposition starting with the following observation:

Observation \(\mathbf {{k}^{\intercal }_{A}}L= 0\). Equivalently, there are no A-local variables in the Farkas interpolant. It follows that \({L}^{\intercal }\mathbf {k_{A}}= 0\) and \(\mathbf {k_{A}}\) is in the kernel of \({L}^{\intercal }\).

Let us denote by \(\mathbb {K}= ker ({L}^{\intercal })\) the kernel of \({L}^{\intercal }\).

Theorem 2

Let \(\mathbf {v_{1}}, \dots , \mathbf {v_{n}}\) be vectors from \(\mathbb {K}\) such that \(\exists \alpha _1, \dots , \alpha _{n}\) with \(\alpha _i \mathbf {v_{i}} \ge \mathbf {0}\) for all i and \(\mathbf {k_{A}}= \sum _{i = 1}^{n} \alpha _i \mathbf {v_{i}}\).

Then, \(\{\mathbf {w_1}, \dots , \mathbf {w_{n}}\}\) for \(\mathbf {w_i} = \alpha _i \mathbf {v_{i}}\) is a decomposition of \(\mathbf {k_{A}}\) and \(D = \{I_1, \dots , I_{n}\}\) for \(I_i :=\mathbf {w_i}C_{A}\mathbf {x}\le \mathbf {c_A}\) is a decomposition, i.e. the formula \(I^D = \bigwedge _{i=1}^{n} I_i\) is a decomposed interpolant.

Proof

The theorem follows from the definition of decomposition (Def. 1). From the assumptions of the theorem, we immediately obtain \(\mathbf {k_{A}}= \sum _{i = 1}^{n} \mathbf {w_{i}}\) and \(\mathbf {w_{i}} \ge \mathbf {0}\). Moreover, \(\mathbf {w_{i}} \in \mathbb {K}\), since \(\mathbf {v_{i}} \in \mathbb {K}\) and \(\mathbf {w_{i}} = \alpha _i \mathbf {v_{i}}\). As a consequence, \({L}^{\intercal }\mathbf {w_{i}} = 0\) and it follows that there are no A-local variables in \({\mathbf {w_{i}}}^{\intercal }C_{A}\mathbf {x}\). \(\square \)

Note that Theorem 2 permits redundant components of a decomposition. Consider vectors \(\mathbf {w_{1}}, \mathbf {w_{2}}, \mathbf {w_{3}} \in \mathbb {K}\) that are part of a decomposition in the sense of Theorem 2 and that \(\mathbf {w_{3}} = \mathbf {w_{1}} + \mathbf {w_{2}}\). Then, \(I_1 \wedge I_2 \implies I_3\) and \(I_3\) is a redundant conjunct in the corresponding decomposed interpolant.

Good candidates that satisfy most of the assumptions of Theorem 2 (and avoid redundancies) are bases of the vector space \(\mathbb {K}\). If \(B = \{\mathbf {b_1}, \dots , \mathbf {b_n}\}\) is a basis of \(\mathbb {K}\) such that \(\mathbf {k_{A}}= \sum _{i = 1}^{n} \alpha _i \mathbf {b_{i}}\) with \(\alpha _i \mathbf {b_i} \ge \mathbf {0}\) for all i, then \(\{\alpha _1 \mathbf {b_1}, \dots , \alpha _n \mathbf {b_n}\}\) is a decomposition. Our solution for computing the decomposition of Farkas coefficients \(\mathbf {k_{A}}\) is described in Algorithm 1. It is based on the above idea of computing bases of \( ker ({L}^{\intercal })\). First, after transforming the matrix to the RREF form, we compute a basis of the kernel using the standard linear-algebra algorithm. The basis is almost what we want, except that some vectors of this basis can have negative coefficients. In such a case, our algorithm gradually updates the basis until all vectors from the basis are nonnegative while preserving all the necessary properties. Such a basis is used to compute the desired decomposition. Now, we describe our algorithm in detail, show its termination and correctness, and discuss its complexity.

figure a

The algorithm runs on the matrix \(M = {L}^{\intercal }\) and vector \(\mathbf {v} = \mathbf {k_{A}}\). At the beginning, the reduced row echelon form (RREF) of the matrix is computed. (Recall definition of RREF from Sec. 4.) Importantly, the transformation of a matrix to RREF preserves its kernel. The dimension of the kernel, known as nullity, can now be efficiently computed using rank–nullity theorem, which states that the nullity of a matrix is equal to the number of its columns minus its rank. For a matrix in RREF, the rank is simply the number of nonzero rows.

We already know that there is a nonzero vector in the kernel; therefore, the nullity of the matrix is at least one. If it is exactly one (line 3), then no non-trivial decomposition of the vector exists. Intuitively, this means that the Farkas coefficients represent the unique way (up to positive scalar multiples) of summing up the inequalities of A-part to eliminate the A-local variables. However, if the nullity is greater than one, it is possible to compute a decomposition of size equal to the nullity.

Initial basis computation. First, a basis of the kernel of the matrix in RREF is computed by a standard algorithm (see, for example, [5]). This algorithm ensures that the coordinates of \(\mathbf {v}\), with respect to the basis it computes, are positive (lines 5, 6). Since this is an important property, we include the description of the algorithm with the proof. Given a matrix M in RREF with m columns, each column is denoted as either pivot or non-pivot. A pivot column contains the first nonzero entry for a particular row; non-pivot column does not. We say that a non-pivot column is free. The number of free columns is exactly the nullity of the matrix, i.e. \(n\), and the number of pivot columns is \(m - n\). Due to the need to iterate over the pivot and free columns separately, we introduce additional notation: we use \(f \in \{1, \dots , n\}\) to iterate over the free columns, \(p \in \{1, \dots , m - n\}\) to iterate over the pivot columns, and we use mapping functions \(F:\{1, \dots , n\}\rightarrow \{1, \dots , m\}\) and \(P:\{1, \dots , m - n\}\rightarrow \{1, \dots , m\}\) to get the original column indices in M.

Now, for each \(f \in \{1, \dots , n\}\) denote as \(\mathbf {b_{f}}\) the solution obtained by solving the system \(M\mathbf {x}= \mathbf {0}\) where all variables corresponding to free columns are set to 0, except for \(x_{F(f)}\) which is set to 1. Note that this uniquely determines the value of pivot variables since M is in RREF; thus,

$$\begin{aligned} x_{P(p)} = \sum _{f=1}^{n}-M_{pF(f)}x_{F(f)}, \forall p \in \{1, \dots , m - n\}\end{aligned}$$
(6)

Lemma 2

\(\mathcal {B}= \{\mathbf {b_{f}} \mathrel {|} f \in \{1, \dots , n\}\}\) is a basis of \( ker (M)\). Moreover, \(\forall \mathbf {v}{} \in ker (M) : \mathbf {v} = \sum _{f=1}^{n}\mathbf {v}_{F(f)}\mathbf {b_{f}}\).

Proof

Linear independence: For each \(f \in \{1, \dots , n\}\), \(\mathbf {b_{f}}\) has 1 at position F(f), while all other elements of \(\mathcal {B}\) have 0 at position F(f). Consequently, \(\mathbf {b_{f}}\) cannot be expressed as a linear combination of other elements of \(\mathcal {B}\).

Generators: We show that each vector \(\mathbf {v} \in ker (M)\) can be written as a linear combination of elements of \(\mathcal {B}\). More precisely, we show that \(\mathbf {v} = \sum _{f=1}^{n}\mathbf {v}_{F(f)}\mathbf {b_{f}}\).

  1. (a)

    For each \(f \in \{1, \dots , n\}: \mathbf {v}_{F(f)} = \sum _{\hat{f}=1}^{n}\mathbf {v}_{F(\hat{f})}{} \mathbf{b}_{\hat{\mathbf{f}}F(f)}\) as \(\mathbf {b_{f}}_{F(f)} = 1\) and \(\mathbf{b}_{\hat{\mathbf{f}}F(f)} = 0\) for \(\hat{f} \ne f\).

  2. (b)

    Fix a pivot index \(p \in \{1, \dots , m - n\}\). To see that \(\mathbf {v}_{P(p)} = \sum _{f=1}^{n}\mathbf {v}_{F(f)}\mathbf {b_{f}}_{P(p)}\), note that \(\mathbf {v}\) and all elements of \(\mathcal {B}\) are solutions to the system \(M\mathbf {x}=\mathbf {0}\), so they satisfy Eq. (6). Instantiating Eq. (6) with \(\mathbf {b_{f}}\) for \(f \in \{1, \dots , n\}\), we get

    $$\begin{aligned} \mathbf {b_{f}}_{P(p)} = \sum _{\hat{f}=1}^{n}-M_{pF(\hat{f})}\mathbf {b_{f}}_{F(\hat{f})} = -M_{pF(f)} \end{aligned}$$
    (7)

    since \(\mathbf {b_{f}}_{F(\hat{f})} = 1\) when \(\hat{f} = f\) and 0 otherwise. Now, \(\mathbf {v}_{P(p)} = \sum _{f=1}^{n}\mathbf {v}_{F(f)}\mathbf {b_{f}}_{P(p)}\) is obtained by instantiating Eq. (6) with \(\mathbf {v}\) and then replacing \(-M_{pF(f)}\) by \(\mathbf {b_{f}}_{P(p)}\) using Eq. (7).

Combining (a) and (b), we have shown that \(\mathbf {v}\) can be expressed as a linear combination of \(\mathcal {B}\), which together with the linear independence of \(\mathcal {B}\) concludes the proof. \(\square \)

A direct consequence of Lemma 2 is that the coordinates of \(\mathbf {v} \in ker (M)\) with respect to basis \(\mathcal {B}\), i.e. the coefficients of elements of \(\mathcal {B}\) in the linear combination expressing \(\mathbf {v}\), are positive if \(\mathbf {v} > 0\). These coordinates are denoted as \(\alpha _{1}, \dots , \alpha _{n}\) in Algorithm 1, and we have just shown that using this standard algorithm for the computation of a kernel’s basis the coordinates are guaranteed to be positive (line 6). However, the elements of the basis \(\mathcal {B}\) are not guaranteed to be nonnegative vectors.

Ensuring nonnegativity of the basis. The second part of the algorithm, the loop on lines 7-13, modifies the elements of the basis. It gradually makes all elements nonnegative, while at the same time it keeps the coordinates of vector \(\mathbf {v}\), corresponding to the current basis, positive. Given an element of the basis \(\mathbf {b_{i}}\) such that its \(j\)-th element is negative, the algorithm replaces the element \(\mathbf {b_{i}}\) with a new element \(\mathbf {b^{\prime }_{i}} :=\mathbf {b_{i}} + \frac{-\mathbf {b_{i}}_j}{\mathbf {v}_j}\mathbf {v}\). After replacing \(\mathbf {b_{i}}\) with \(\mathbf {b^{\prime }_{i}}\), the resulting set of vectors is still a basis of \( ker (M)\).

Lemma 3

The set of vectors \(\mathcal {B}^\prime = (\mathcal {B}\setminus \{\mathbf {b_{i}}\}) \cup \{\mathbf {b^{\prime }_{i}}\}\) is a basis of \( ker (M)\).

Proof

We show that \(\mathbf {b_{i}}\) can be expressed as a linear combination of vectors from \(\mathcal {B}^\prime \). This is sufficient to show that \(\mathcal {B}^\prime \) consists of linearly independent vectors and that it generates \( ker (M)\). Let us denote the constant \(\frac{-\mathbf {b_{i}}_j}{\mathbf {v}_j}\) as K and note that \(K > 0\) since \(\mathbf {v}_j> 0\) and \(\mathbf {b_{i}}_j< 0\). We first express \(\mathbf {b^{\prime }_{i}}\) as

$$\begin{aligned} \mathbf {b^{\prime }_{i}}&= \mathbf {b_{i}} + K\mathbf {v} = \mathbf {b_{i}} + K\sum _{f=1}^{n} \alpha _{f}\mathbf {b_{f}} \\&= \mathbf {b_{i}}(1 + K\alpha _{i}) + K\sum _{f\ne i} \alpha _{f}\mathbf {b_{f}} \end{aligned}$$

and now \(\mathbf {b_{i}}\) can be expressed as a linear combination of elements of \(\mathcal {B}^\prime \):

$$\begin{aligned} \mathbf {b_{i}}(1 + K\alpha _{i})&= \mathbf {b^{\prime }_{i}} - K\sum _{f\ne i} \alpha _{f}\mathbf {b_{f}} \\ \mathbf {b_{i}}&= \frac{\mathbf {b^{\prime }_{i}} + \sum _{f\ne i} -K\alpha _{f}\mathbf {b_{f}}}{1 + K\alpha _{i}} \end{aligned}$$

\(\square \)

After this replacement, (at least) one negative value has been successfully eliminated: as \(K > 0\) and \(\mathbf {v} > \mathbf {0}\), it follows that \(\mathbf {b^{\prime }_{i}} > \mathbf {b_{i}}\) and \(\mathbf {b^{\prime }_{i}}_j= 0\).

As the last step, we show that the new coordinates of \(\mathbf {v}\) (with respect to the new basis) are still positive.

Lemma 4

Let \({\varvec{\alpha }}^\prime \) denote the coordinates of \(\mathbf {v}\) with respect to the new basis \(\mathcal {B}^\prime \). Then, \({\varvec{\alpha }}^{\prime } > \mathbf {0}\).

Proof

First, consider the result of a linear combination of the new basis \(\mathcal {B}^\prime \) with the old coefficients \({\varvec{\alpha }}\):

$$\begin{aligned}&\alpha _{1}\mathbf {b_{1}} + \ldots + \alpha _{i}\mathbf {b^{\prime }_{i}} + \ldots + \alpha _{n}\mathbf {b_{n}} = \sum _{f=1}^n \alpha _{f}\mathbf {b_{f}} + \alpha _{i}K\mathbf {v} \\&\quad = \mathbf {v} + \alpha _{i}K\mathbf {v} =\mathbf {v}(1 + \alpha _{i}K) \end{aligned}$$

Now, set \(C :=1 + \alpha _{i}K\) and note that \(C > 1\) since \(K > 0\) and \(\alpha _{i} > 0\). It follows that

$$\begin{aligned} \mathbf {v} = \frac{\alpha _{1}}{C}\mathbf {b_{1}} + \ldots + \frac{\alpha _{i}}{C}\mathbf {b^{\prime }_{i}} + \ldots + \frac{\alpha _{n}}{C}\mathbf {b_{n}} \end{aligned}$$

and that \({\varvec{\alpha }}^{\prime } = \frac{{\varvec{\alpha }}}{C}\) is the vector of coordinates of \(\mathbf {v}\) with respect to the new basis \(\mathcal {B}^\prime \). Since \({\varvec{\alpha }}> \mathbf {0}\) and \(C > 0\), it follows that \({\varvec{\alpha }}^{\prime } > \mathbf {0}\) as required. \(\square \)

We have shown that the loop on lines 7-13 preserves the invariant that the coordinates of \(\mathbf {v}\) with respect to the current basis are all positive (lines 11,12) and that each iteration decreases the number of negative values of the basis vectors. As a result, Algorithm 1 terminates and returns a decomposition of the input vector \(\mathbf {v}{}\) of size equal to the nullity of the input matrix M.

We first simulate the run of the algorithm on an example, then discuss its complexity and finally compare it to other approaches for computing interpolants as a conjunction of inequalities.

Example 2

Consider an unsatisfiable system of inequalities \(A \wedge B\) where \(A = \{ x_1 + x_2 \le 0 ,-x_1 + x_3 \le 0 ,x_1 + x_4 \le 0 ,-x_1 + x_5 \le 0 \}\) and \(B = \{ -x_2 - x_3 -x_4 -x_5 \le -1 \}\). The vector of Farkas coefficients witnessing the unsatisfiability of \(A \wedge B\) is \(\mathbf {k} = {\begin{pmatrix} 1&1&1&1&1 \end{pmatrix}}^{\intercal }\) and its restriction to A-part is \(\mathbf {k_{A}}= {\begin{pmatrix} 1&1&1&1 \end{pmatrix}}^{\intercal }\). The only A-local variable is \(x_1\), so the matrix of A-local coefficients is \({L}^{\intercal } = \begin{pmatrix} 1&-1&\;1&-1 \end{pmatrix}\). We simulate the run of Algorithm 1 on \(\mathbf {k_{A}}\) and \({L}^{\intercal }\): since \({L}^{\intercal }\) is already in RREF, nothing changes on line 1. Now, the rank of \({L}^{\intercal }\) is 1 and it has 4 columns, and thus, its nullity is 3 and we can compute a decomposition of \(\mathbf {k_{A}}\) of size 3. The first column of \({L}^{\intercal }\) is pivot, while the other three columns are free. The computation of the initial basis of \( ker ({L}^{\intercal })\) (line 4) yields three vectors:

$$\begin{aligned} b_1 = \begin{pmatrix} 1 \\ 1 \\ 0 \\ 0\end{pmatrix},\quad b_2 = \begin{pmatrix} -1 \\ 0 \\ 1 \\ 0\end{pmatrix},\quad b_3 = \begin{pmatrix} 1 \\ 0 \\ 0 \\ 1\end{pmatrix}. \end{aligned}$$

The coordinates of \(\mathbf {k_{A}}\) with respect to this basis are \({\varvec{\alpha }}= {\begin{pmatrix} 1&1&1 \end{pmatrix}}^{\intercal }\). As \(\mathbf {b_{2}}_1 < 0\), we enter the loop on line 6 where the new vector \(\mathbf {b^{\prime }_{2}}\) is computed as \(\mathbf {b^{\prime }_{2}} = \mathbf {b_{2}} + \mathbf {k_{A}}= {\begin{pmatrix} 0&1&2&1 \end{pmatrix}}^{\intercal }\). Then, the coordinates are divided by a constant \(C = 2\) to obtain the new coordinates \({\varvec{\alpha }}= {\begin{pmatrix} 1/2&\;1/2&\;1/2 \end{pmatrix}}^{\intercal }\). Since there are no more negative elements in the vectors of the basis, the decomposition \(\mathbf {k_{A}}= 1/2*\begin{pmatrix} 1&1&0&0 \end{pmatrix} + 1/2*\begin{pmatrix} 0&1&2&1 \end{pmatrix} + 1/2*\begin{pmatrix} 1&0&0&1 \end{pmatrix}\) is returned. This decomposition results in the decomposed interpolant

$$\begin{aligned} I^{ Dec } =&(x_2 + x_3 \le 0) \\&\wedge (x_3 + 2x_4 + x_5 \le 0) \wedge (x_2 + x_5 \le 0). \end{aligned}$$

Complexity of Algorithm 1. Considering the matrix of A-local coefficients \(L\) for m inequalities and l A-local variables, the algorithm runs on matrix \(M={L}^{\intercal }\) with m columns and l rows. When the transformation of M to RREF is done by Gauss–Jordan elimination, it needs to perform \(\mathcal {O}(m^{2}l)\) arithmetic operations. After the transformation, the number of (nonzero) rows is r, which is the rank of M, and we know that \(r \le l\). With \(n\) denoting the nullity of M, rank–nullity theorem implies that \(r + n= m\) and consequently that \(n< m\). The complexity of the computation of an initial basis is \(\mathcal {O}(nm)\) since we are computing \(n\) basis vectors, each of size m. Determining the value for every element of each basis vector is immediate: it is 0 or 1 for positions corresponding to the free columns, and it is a negated coefficient from RREF(M) for positions corresponding to the pivot columns, see Eq. (7). Finally, one iteration of the loop that ensures nonnegativity of the basis needs just \(\mathcal {O}(m)\) arithmetic operations and the termination can be ensured after \(\mathcal {O}(n)\) iteration. To see this, note that a basis vector \(\mathbf {b_{i}}\) can be made nonnegative in one iteration when the index j is used that maximizes \(\frac{-\mathbf {b_{i}}_j}{\mathbf {v}_j}\). The whole loop thus requires \(\mathcal {O}(nm)\) arithmetic operations. The complexity of the algorithm is thus dominated by the first part—computing RREF of the input matrix.

Comparison with other approaches

Given an unsatisfiable system of inequalities (AB), Cimatti et al. [11] recognized two extreme points in the spectrum of possible interpolants. On one side, there is the Farkas interpolant in the form of single inequality obtained as a weighted sum of inequalities from A with weights given by Farkas coefficients. On the other side, it is possible to employ quantifier elimination to compute the strongest possible interpolant for (AB) which will result in a conjunction of inequalities (if possible). If all A-local variables are existentially quantified in A and eliminated, then this is guaranteed to yield an interpolant. However, as Cimatti et al. note, quantifier elimination is potentially a very expensive operation.Footnote 3 Therefore, they propose modifications to the procedure computing the interpolant from the proof of unsatisfiability. The observation they make is that the only purpose of the summation of inequalities when traversing the proof is to eliminate A-local variables. If the leaves of the proof do not contain A-local variables, no summation is needed, and the conjunction of the inequalities in the leaves is already an interpolant. This corresponds to our notion of trivial elements of the decomposition. Based on this observation, they proposed a modification to the proof-based algorithm that performs only the summations that are necessary for eliminating A-local variables.

Example 3

Consider the unsatisfiable system of inequalities from Example 2. Figure 2 shows a possible proof of unsatisfiability according to the description of [11].

Fig. 2
figure 2

Proof of unsatisfiability of the system from Example 2

The computation of Farkas interpolant as described by Eq. (2) can be simulated by replacing the leaves from B with \(0 \le 0\). The resulting Farkas interpolant is

$$\begin{aligned} I^F= x_2 + x_3 + x_4 +x_5 \le 0. \end{aligned}$$

Applying the modification from [11] avoids one unnecessary sum and results in an interpolant

$$\begin{aligned} I^M= (x_2 + x_3 \le 0)\wedge (x_4 + x_5 \le 0). \end{aligned}$$

Footnote 4 As seen in Example 2, our approach yields interpolant with three conjuncts

$$\begin{aligned} I^{ Dec }=&(x_2 + x_3 \le 0) \wedge (x_3 + 2x_4 + x_5 \le 0) \\&\wedge (x_2 + x_5 \le 0). \end{aligned}$$

Finally, existentially quantifying \(x_1\) in A and eliminating this quantifier yield interpolant with four conjuncts

$$\begin{aligned} I^{ QE }=&(x_2 + x_3 \le 0) \wedge (x_2 + x_5 \le 0) \\&\wedge (x_3 + x_4 \le 0) \wedge (x_4 + x_5 \le 0). \end{aligned}$$

Note that \(I^{ QE }\) is the strongest and \(I^F\) is the weakest interpolant in this quadruple, while \(I^M\) and \(I^{ Dec }\) are incomparable in terms of logical strength. However, the advantage of our algorithm is that even though its result depends on the order of the inequalities (the order of columns of \({L}^{\intercal }\)), it guarantees to find a decomposition of size 3 in our example. If the first and third inequalities are switched, the decomposed interpolant computed by Algorithm 1 is

$$\begin{aligned} I^{ Dec ^\prime } =&(x_4 + x_3 \le 0) \\&\wedge (x_3 + 2x_2 + x_5 \le 0) \wedge (x_4 + x_5 \le 0) \end{aligned}$$

while if the first and second inequalities are switched, the computed interpolant is

$$\begin{aligned} I^{ Dec ^{\prime \prime }} =&(x_2 + x_4 + 2x_5 \le 0)\\&\wedge (x_3 + x_4 \le 0) \wedge (x_3 + x_2 \le 0). \end{aligned}$$

On the other hand, the approach of [11] is, in some sense, even more sensitive to the order of the input inequalities (the shape of the proof) since the order can influence the size of the decomposition. If the second and the third inequalities are switched, then their approach does not detect the opportunity for decomposition and returns the Farkas interpolant \(I^F\). Our algorithm in this situation returns an interpolant equivalent to \(I^{ Dec }\).

Experiments

We have implemented the computation of decomposed interpolants and their duals using Algorithm 1 in our SMT solver OpenSMT [23], which already provided a variety of interpolation algorithms for propositional logic [24, 34], theory of uninterpreted functions [3] and theory of linear real arithmetic [4].

We evaluated the effect of decomposed interpolants in a model-checking scenario using the model checker sally [26] with Yices [17] for satisfiability queries and OpenSMT for interpolation queriesFootnote 5. We experimented with four LRA interpolation algorithms: the original interpolation algorithms based on Farkas’ lemma, (i) \( Itp_{F} \) and (ii) \( Itp^{\prime }_{F} \), and the interpolation algorithm computing decomposed interpolants, (iii) \( Itp_{DI} \), and (iv) \( Itp^{\prime }_{DI} \). OpenSMT computes interpolants from the proof of unsatisfiability. In this approach, the interpolants computed for LRA conflicts are combined based on interpolation rules for propositional logic and the structure of the proof. In our experiments, we fixed the propositional part of the interpolation algorithm to use McMillan’s interpolation rules [30]. We split our analysis of the experiments into two parts. In Sec. 7.1, we analyse the performance of the model checker using different LRA interpolation algorithms. We focus specifically on a detailed comparison of \( Itp_{F} \) and \( Itp_{DI} \), i.e. the default algorithm and our proposed algorithm. In Sec. 7.2, we analyse the performance of a portfolio of interpolation algorithms and measure the contribution of our proposed algorithm. For comparison, we also run a version of sally using MathSAT as the interpolation engine and compare to the contribution of the decomposing algorithm proposed in [11].

The experiments were run on a large set of benchmarks consisting of several problem sets related to fault-tolerant algorithms (azadmanesh, approxagree, om, hacms, misc, ttesynchro, ttastartup,unifapprox), software model checking (cav12, ctigar), simple concurrent programs (conc), and a lock-free hash table (lfht). A benchmark suite of the kind model checker is also included (lustre). Each benchmark is a transition system with formulas characterizing initial states, a transition relation and a property that should hold. sally can finish with two possible answers (or run out of resources with no answer): valid means the property holds and an invariant implying the property has been found; invalid means the property does not hold and a counterexample leading to a state where the property does not hold has been found. In the plots, we denote the answers as and , respectively. The benchmarks were run on Linux machines with the Intel E5-2650 v3 processor (2.3 GHz) and 64GB of memory. Each benchmark was restricted to 600 seconds of running time and to 4GB of memory.

Comparing individual configurations

Table 1 Performance of sally using different interpolation algorithms of OpenSMT

Table 1 presents the results of the model checker’s runs using different interpolation algorithms. The results are summarized by category with the name of the category and the number of corresponding benchmarks in the first column. The two columns per interpolation algorithm show the number of benchmarks solved successfully (validated/invalidated) within the resource limits and the total running time for the solved benchmarks.

The results suggest that \( Itp_{F} \) interpolation algorithm achieves the best result overall. However, there are certain cases where \( Itp_{DI} \) is faring better, for example, the lfht category. Before we present a more thorough comparison between these two algorithms, we note that the configuration using \( Itp^{\prime }_{DI} \), which computes the weakest interpolants, performs very poorly compared to the others. Closer inspection revealed that it did not solve any benchmarks not solvable by other configurations. It did solve a few benchmarks faster than others, but the improvement was negligible. On the other hand, the overall drop in performance is large. We conclude that computing very weak interpolants is a bad strategy in this model-checking scenario.

Fig. 3
figure 3

Evaluation of the decomposed interpolants in model checking scenario: comparison of performance of sally using OpenSMT with different interpolation procedures, \( Itp_{F} \) and \( Itp_{DI} \)

As mentioned before, the results summarized in Table 1 suggest that \( Itp_{F} \) performs better than \( Itp_{DI} \) overall. However, a closer look reveals that the situation is more complicated. Figure 3 illustrates a direct comparison between these two algorithms. Each point represents one benchmark, x-axis corresponds to the runtime (in seconds) of sally using \( Itp_{F} \) as the interpolation algorithm in OpenSMT, and y-axis corresponds to the runtime of sally using \( Itp_{DI} \). The direct comparison shows that in some cases, the use of decomposed interpolants outperforms the original procedure, sometimes by an order of magnitude. Even though \( Itp_{DI} \) solved 6 benchmarks less than \( Itp_{F} \), it still managed to solve 12 benchmarks that \( Itp_{F} \) was not able to solve within the resource limits. Moreover, on a common set of non-trivial (runtime at least 10 seconds) solved benchmarks, it improved the performance by more than 10% on 45 benchmarks (out of 116 such benchmarks).

During the evaluation, we realized that a small modification in the SMT solver sometimes had a huge effect on the performance of the model checker. It made previously unsolved instance easily solvable or the other way around. To confirm that using \( Itp_{DI} \) is indeed better than using \( Itp_{F} \) for particular benchmarks, we ran an additional set of experiments. For each of the 12 benchmarks solved by \( Itp_{DI} \) but not solved by \( Itp_{F} \), we ran the model checker 100 times, each time with a different random seed for the interpolating solver. The results are summarized in Table 2. For each of the two configurations, the table reports how many runs (out of 100) of the model checker finished successfully within the resource limits and the average time of the successful runs. This experiment demonstrates that there are indeed benchmarks where the decomposition is necessary, while using the original Farkas algorithm leads to divergence. In other cases, the use of decomposed interpolants leads to a higher chance of a successful result and/or better runtime of the model checker. Note that these benchmarks were picked deliberately to confirm that \( Itp_{DI} \) performs better on them than \( Itp_{F} \), based on our experiments on the whole benchmark set.

Table 2 Aggregated results from 100 runs of the model checker on selected benchmarks
Table 3 Interpolation statistics—pwd stands for “Number of problems with at least one decomposition”. The numbers in parentheses denote “Decompositions with trivial and with non-trivial elements” (trivial/non-trivial)

For the final aspect of the direct comparison of \( Itp_{F} \) and \( Itp_{DI} \), we collected statistics from the runs of sally with \( Itp_{DI} \) about how often \( Itp_{DI} \) manages to decompose the vector of Farkas coefficients, thus returning a different interpolant than \( Itp_{F} \) would. These results are summarized in Table 3. The column pwd reports the number of benchmarks with at least a single decomposition (any; with at least one trivial element; with at least one non-trivial element). The next column (“#non-triv. LRA itps”) reports the total number of interpolation problems for theory conflict, excluding those without even theoretical possibility for decomposition. There is no possibility for decomposition if all inequalities are from one part of the problem (resulting in trivial interpolants, either \(\top \) or \(\bot \)) or there is only a single inequality in the A-part (trivially yielding an interpolant equal to that inequality). The last column reports the number of successfully decomposed interpolants (with at least one trivial element; with at least one non-trivial element). Note that it can happen that a successful decomposition contains both trivial and non-trivial elements. We see that at least one decomposition was possible in only less than half of all the benchmarks. This explains why there are many points on the diagonal in Fig. 3. On the other hand, it shows that the test for the possibility of decomposition is cheap and does not represent a significant overhead. Another conclusion we can draw is that when the structure of the benchmark enables decomposition, it can often be discovered in many theory conflicts that appear during the solving.

Table 4 Contribution of the configurations to their respective portfolios

Analysis of the portfolio

In this part, we present yet another way to measure the contribution of the decomposed interpolants: the contribution to the virtual best configuration. We consider a virtual portfolio consisting of configurations of sally using different interpolation algorithms of OpenSMT. In addition, we also consider a separate virtual portfolio of configurations of sally using MathSAT. The result of a virtual portfolio on a benchmark is the best result achieved by any of the configurations of the portfolio. As noted before, the configuration using \( Itp^{\prime }_{DI} \) performed quite poorly on our benchmarks. Since MathSAT can compute Farkas interpolants and its duals, and restricted form of decomposed interpolants but not its dual, we also exclude \( Itp^{\prime }_{DI} \) from the portfolio of OpenSMT’s configurations, with minimal impact on the performance. We denote the heuristic for computing decompositions described in [11] and available in MathSAT as \( Itp_{M} \). We use the number of solved instances and PAR-2 score as a metric of measuring the performance. PAR-2 score is computed as the sum of runtime on solved instances plus two times the timeout for each unsolved instance. Finally, for each configuration we compute the number of uniquely solved instances (not solved by any other configuration in the portfolio) and regret, i.e. how much would the PAR-2 score of the portfolio worsen, if that particular configuration was excluded from the portfolio. The results are summarized in Table 4. Note that OpenSMT and MathSAT portfolios are considered separately.

OpenSMT configuration portfolio is able to solve 1,017 benchmarks with PAR-2 score 116,117. MathSAT configuration portfolio is able to solve 1,018 benchmarks with PAR-2 score 110,356. We hypothesize that the better performance of MathSAT can be at least partially attributed to the fact that it supports interpolation in combination with incremental solving while OpenSMT does not. In both portfolios, the ability to compute decomposed interpolants (even in the restricted form) significantly improves the performance of the portfolio. We also see that the contribution of our algorithm based on methods from linear algebra to OpenSMT portfolio is slightly larger than the contribution of the heuristic \( Itp_{M} \) to the MathSAT portfolio. Additionally, our algorithm solves more instances uniquely within its portfolio. Interestingly, the contribution of the configuration computing Farkas interpolants is non-trivial in OpenSMT, but almost non-existent in MathSAT. Our hypothesis is that \( Itp_{M} \), compared to \( Itp_{DI} \), decomposes less often and the decompositions are of smaller size (e.g. in the situation from Example 3). This would mean that the interpolants from \( Itp_{M} \) are more often similar (or even identical) to Farkas interpolants, which would make the MathSAT portfolio less diverse than the OpenSMT portfolio.

Conclusion

In this paper, we have presented a new interpolation algorithm for linear real arithmetic that generalizes the interpolation algorithm based on Farkas’ lemma used in modern SMT solvers. We showed that the algorithm is able to compute interpolants in the form of a conjunction of inequalities that are logically stronger than the single inequality returned by the original approach. This becomes useful in the IC3-style model-checking algorithms where Farkas interpolants have been shown to be a source of incompleteness. In our experiments, we have demonstrated that the opportunity to decompose Farkas interpolants frequently occurs in practice and that the decomposition often leads to (i) lower solving time and, in some cases, to (ii) solving a problem not solvable by the previous approach.

As the next steps, we plan to investigate how to automatically determine what kind of interpolant would be more useful for the current interpolation query in IC3-style model-checking algorithms. We also plan to investigate other uses of interpolation in model checking where stronger (or weaker) interpolants are desirable [35].