
1 Introduction

The first-order theory of linear real arithmetic (LRA) allows reasoning about numerical variables by means of formulas, built from potentially quantified Boolean combinations of linear constraints, which compare linear combinations of variables to constants. Even though multiplication between variables is not supported, LRA is expressive enough to be widely applicable, for example in static program analysis [12], scheduling [26] or neural network verification [13].

These applications sometimes require determining the truth or, when free variables are involved, the satisfiability of LRA formulas. Other times, a representation of all solutions for a set of free variables is needed. Both problems can be solved by quantifier elimination, or variable elimination in the case of quantifier-free or purely existentially quantified formulas.

We focus on variable elimination for conjunctions of LRA constraints. That is, given a finite set of linear constraints in the variables \(x_1,\ldots ,x_n\), we want to compute a set of linear constraints in the variables \(x_{q+1},\ldots ,x_{n}\), whose solutions coincide with the solutions of the original set when the \(x_{1},\ldots , x_q\)-dimensions are removed. From a geometric perspective, the solution set of an LRA constraint set forms a convex polyhedron. Variable elimination corresponds to projecting a convex polyhedron onto a subspace, resulting in another (lower-dimensional) convex polyhedron which can be represented by LRA constraints without the eliminated variables. Therefore, this problem is also called polyhedron projection.

Related Work. One of the first methods for variable elimination in LRA was discovered independently by Fourier [10] and Motzkin [24]. It is still often used in practice, but quickly suffers from a doubly exponential worst-case behavior (w.r.t. the number of eliminated variables) when the inputs become more complex. Different optimizations have been suggested to reduce the computational effort, most notably by avoiding redundant constraints in intermediate steps, e.g. [6, 14] and, more recently, [16].

An alternative approach is the Double-Description (DD) method [23], which was further developed and implemented by Fukuda and Prodon [11] in the CDD library [5], but also by others [2, 15]. A more recent development uses a reduction to parametric linear programming [17, 31]. These approaches stem from research focusing on the geometric, polyhedra-based view.

On the other hand, the algebraic view considers the more general problem of quantifier elimination for formulas with quantifier alternation and an arbitrary Boolean structure. As these methods are not optimized for our particular problem, we only highlight the virtual term substitution method proposed by Loos and Weispfenning [21] and thoroughly studied by others [19, 20], as well as the work by Monniaux [22], which reduces quantifier elimination for the general case to repeated polyhedron projections and satisfiability checking.

Most relevant for this work is the FMplex method [27], which was derived from the method of Fourier and Motzkin and aimed at satisfiability checking, but which can also be used for variable elimination from conjunctions of linear constraints. While a tree-shaped search allows FMplex to reduce the worst-case complexity from doubly to singly exponential, it has the consequence that the variable elimination result is provided only as a disjunction of conjunctions, and not as a single conjunction as provided by Fourier-Motzkin. Such disjunctions are harder for users to interpret, and they are larger than necessary.

Contributions. We overcome this limitation with the following contributions:

  • We propose a new variable elimination approach for sets of LRA constraints, derived from the FMplex method presented in [27], but returning the result as a set (or conjunction) of LRA constraints.

  • We show the correctness of this approach, give complexity estimates, and explain which of the improvements from [27] can be transferred to the new variable elimination algorithm.

  • We discuss interesting relations between our method and the virtual term substitution method [21].

  • We provide an implementation in the SMT-solving toolbox SMT-RAT.

  • We present an experimental evaluation, which shows that our implementation outperforms other established tools on three different benchmark sets.

Outline. The rest of the paper is organized as follows: After a formal problem description in Sect. 2, we present the FMplex method and derive our new variable elimination method in Sect. 3. We compare it to other methods and evaluate its performance in Sects. 4 and 5. Finally, in Sect. 6, we conclude the paper and give an outlook on future work.

2 Preliminaries

Let \(\mathbb {R}\), \(\mathbb {Q}\) and \(\mathbb {N}\) denote the real, rational, and natural numbers, respectively.

We use upper case letters (e.g. A) to denote matrices, bold lower case letters for vectors (e.g. \(\boldsymbol{b}, \boldsymbol{f}\)), and \(b_i\) to denote the i-th entry in \(\boldsymbol{b}\). The i-th row and the j-th column vectors of a matrix A are denoted by \(\boldsymbol{a}_{i,-}\) and \(\boldsymbol{a}_{-,j}\), respectively. We assume \(\mathbb {R}^n = \mathbb {R}^{n\times 1}\), i.e. \(\boldsymbol{f} \in \mathbb {R}^n\) is a column vector. The transpose of \(\boldsymbol{f}\) is \(\boldsymbol{f}^{\intercal }\), and by \(\boldsymbol{f} \ge \boldsymbol{0}\) we denote the component-wise comparison to zero, i.e. \(f_1 \ge 0 \wedge \ldots \wedge f_n \ge 0\). We write \(\boldsymbol{e_i}\) for the i-th unit vector and \(\boldsymbol{0}\) for zero-matrices or zero-vectors; their dimensions will be clear from the context.

Linear Real Arithmetic: Syntax. We fix \(n \in \mathbb {N}\) and a vector \(\boldsymbol{x} = (x_1, \ldots , x_n)^{\intercal }\) of \(\mathbb {R}\)-valued variables. When convenient, we view variable vectors as ordered sets, writing e.g. \(x_i\in \boldsymbol{x}\), or \(\boldsymbol{y}\subseteq \boldsymbol{x}\) to denote that \(\boldsymbol{y}=(x_{i_1},\ldots ,x_{i_k})^{\intercal }\) for some \(0\le k\le n\) and \(1\le i_1<\ldots <i_k\le n\), and we write \(|\boldsymbol{y}|=k\) for the length of \(\boldsymbol{y}\).

Our main objects of interest are linear constraints (from here on simply constraints), which are inequations of the form

$$\begin{aligned} a_1 x_1 + a_2 x_2 + \ldots + a_n x_n \le b \text {, or equivalently } \boldsymbol{a}^{\intercal } \boldsymbol{x} \le b \end{aligned}$$

for some rational constants \(\boldsymbol{a} = (a_1, \ldots , a_n)^{\intercal } \in \mathbb {Q}^n\) and \( b \in \mathbb {Q}\). Expressions of the form \(\boldsymbol{a}^{\intercal } \boldsymbol{x} + b\) are called (linear) terms. We sometimes write \(s\le t\) with linear terms \(s\) and \(t\), and implicitly assume a conversion to the above normal form.

Note that we do not consider strict inequations \(\boldsymbol{a}^{\intercal } \boldsymbol{x} < b\); we will discuss their integration in Sect. 3.3. Note furthermore that \(\boldsymbol{a}^{\intercal } \boldsymbol{x} \ge b\) is equivalent to \(-\boldsymbol{a}^{\intercal } \boldsymbol{x} \le -b\), and \(\boldsymbol{a}^{\intercal } \boldsymbol{x} = b\) is equivalent to \(\boldsymbol{a}^{\intercal } \boldsymbol{x} \le b \wedge \boldsymbol{a}^{\intercal } \boldsymbol{x} \ge b\).

A variable \(x_i\) occurs in \(\boldsymbol{a}^{\intercal }\boldsymbol{x} \le b\) if \(a_i\ne 0\). Let \(\textit{vars}(c)\) be the set of all variables that occur in the constraint c, and \(\textit{vars}(C):=\cup _{c\in C}\textit{vars}(c)\) for any constraint set C. For a given \(\boldsymbol{y}\subseteq \boldsymbol{x}\), we sometimes write \(C(\boldsymbol{y})\) to indicate \(\textit{vars}(C)\subseteq \boldsymbol{y}\). Note that in this paper, we always have \(\textit{vars}(C) \subseteq \boldsymbol{x}\).

LRA formulas are built from constraints using Boolean connectives (\(\wedge , \vee , \lnot \)) and quantifiers (\(\exists , \forall \)), according to the syntax of first-order logic. For an LRA formula \(\varphi \), a variable \(x_i\in \boldsymbol{x}\), and a term t, we write \(\varphi [t/x_i]\) to denote the substitution of t for each free occurrence of \(x_i\) in \(\varphi \).

Throughout the paper, we sometimes interpret sets of constraints as conjunctions, and sometimes as systems \(A\boldsymbol{x} \le \boldsymbol{b}\), with \(A\in \mathbb {Q}^{m\times n}\) and \(\boldsymbol{b}\in \mathbb {Q}^{m}\), representing the set \(\{\boldsymbol{a}_{i,-}\boldsymbol{x} \le b_i \mid i \in \{1,\ldots ,m\}\}\). Note that every finite set of constraints can be represented this way for suitable A and \(\boldsymbol{b}\). We use the representations as set, conjunction or system interchangeably.

Linear Real Arithmetic: Semantics. An assignment for \(\boldsymbol{y}\subseteq \boldsymbol{x}\) with \(|\boldsymbol{y}|=i\in \mathbb {N}\) is a vector \(\boldsymbol{\alpha }\in \mathbb {R}^{i}\). We define \(\varphi [\boldsymbol{\alpha }/\boldsymbol{y}]:=\varphi [\alpha _1/y_1]\ldots [\alpha _i/y_i]\), and say that \(\boldsymbol{\alpha }\) is a solution for \(\varphi \), written \(\boldsymbol{\alpha } \models \varphi \), if \(\varphi [\boldsymbol{\alpha }/\boldsymbol{y}]\) evaluates to true under the standard semantics.

If every solution of \(\varphi \) is also a solution of the formula \(\psi \), then we say that \(\varphi \) implies \(\psi \) and write \(\varphi \models \psi \). If both \(\varphi \models \psi \) and \(\psi \models \varphi \) hold, then the solution sets are equal and the formulas are equivalent, denoted by \(\varphi \equiv \psi \). Note that, when interpreting a set C of constraints as their conjunction, the statement \(\boldsymbol{\alpha } \models C\) is to be interpreted as \(\boldsymbol{\alpha } \models \bigwedge _{c \in C}c\).

A well-known result for such constraint systems is Farkas’ Lemma, which we will use in the following formulation.

Theorem 1

(Farkas’ Lemma [9]). Let \(A\in \mathbb {Q}^{m\times n}\) and \(\boldsymbol{b}\in \mathbb {Q}^{m}\). A constraint c is implied by \(A\boldsymbol{x} \le \boldsymbol{b}\) if and only if there are \(\boldsymbol{f} \in \mathbb {R}^m\) and \(f_0 \in \mathbb {R}\) with \(\boldsymbol{f} \ge \boldsymbol{0}\), \(f_0 \ge 0\) and \(c = (\boldsymbol{f}^{\intercal } A \boldsymbol{x} \le \boldsymbol{f}^{\intercal } \boldsymbol{b} + f_0 )\).
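
For instance (an illustration of our own, not from [9]), the constraint \(x_2 \le 3\) is implied by \(\{x_1 \le 1,\ -x_1 + x_2 \le 0\}\): with \(\boldsymbol{f} = (1,1)^{\intercal }\) and \(f_0 = 2\), we get \(\boldsymbol{f}^{\intercal } A \boldsymbol{x} = x_2\) and \(\boldsymbol{f}^{\intercal } \boldsymbol{b} + f_0 = 1 + 0 + 2 = 3\).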

Variable Elimination. Consider a finite set C of linear constraints, and let \(\boldsymbol{x^{{Q}}}\subseteq \boldsymbol{x}\). W.l.o.g. we assume \(\boldsymbol{x^{{Q}}}= (x_1,\ldots , x_q)^{\intercal }\) for some \(1\le q \le n\), and we set \(\boldsymbol{x^{{P}}}:= (x_{q+1},\ldots ,x_n)^{\intercal }\) and \(p:=|\boldsymbol{x^{{P}}}|=n-q\). Our goal is to find a (preferably small) constraint set \(D(\boldsymbol{x^{{P}}})\) with \(D \equiv \exists x_1. \ldots \exists x_q. C\). That is, \(\boldsymbol{\alpha ^{{P}}}\in \mathbb {R}^{p}\) is a solution for D if and only if it can be extended to a solution \(\boldsymbol{\alpha }\in \mathbb {R}^n\) for C.

We refer to the variables in \(\boldsymbol{x^{{Q}}}\) as quantified and to those in \(\boldsymbol{x^{{P}}}\) as parameters, and formulate constraint sets also as \(A^{{Q}}\boldsymbol{x^{{Q}}}+ A^{{P}}\boldsymbol{x^{{P}}}\le \boldsymbol{b}\). We fix \(\boldsymbol{x^{{Q}}}\), \(\boldsymbol{x^{{P}}}\), q and p as given above for the rest of this paper.

Example 1

A possible solution to the variable elimination of \(x_1\) and \(x_2\) from

$$ C := \{-x_1 + x_2 -x_3 \le -3,\ -x_1 + x_2 +x_3 \le 4,\ x_1 \le 3,\ -x_2 \le -1\} $$

is the set \(\{-x_3 \le -1,\ x_3 \le 6\}\).

Fig. 1. The constraint set from Example 1 in matrix representation (left) and its solution set, as well as the projections of that set onto the \(x_1\)-\(x_3\)-plane (blue), the \(x_2\)-\(x_3\)-plane (red) and the \(x_3\)-axis (violet). (Color figure online)

It is important to note that the ordering of the existential quantifiers does not change the task, as is illustrated in Fig. 1. Thus, it is possible and indeed helpful to change this order dynamically.

Polyhedron Projection. The above variable elimination problem is also known as polyhedron projection. This name comes from the fact that the solutions of a constraint set \(A \boldsymbol{x} \le \boldsymbol{b}\) describe a convex polyhedron in the n-dimensional space and eliminating the variables \(\boldsymbol{x^{{Q}}}\) corresponds to a projection onto the dimensions from \(\boldsymbol{x^{{P}}}\). That is, \((D\boldsymbol{x^{{P}}}\le \boldsymbol{f}) \equiv (\exists \boldsymbol{x^{{Q}}}. A^{{Q}}\boldsymbol{x^{{Q}}}+ A^{{P}}\boldsymbol{x^{{P}}}\le \boldsymbol{b})\) if and only if

$$\begin{aligned} \{\boldsymbol{\alpha ^{{P}}}\in \mathbb {R}^p \mid D\boldsymbol{\boldsymbol{\alpha ^{{P}}}} \le \boldsymbol{f}\} = \{\boldsymbol{\alpha ^{{P}}}\in \mathbb {R}^p \mid \exists \boldsymbol{\alpha ^{{Q}}}\in \mathbb {R}^q: A^{{Q}}\boldsymbol{\alpha ^{{Q}}}+ A^{{P}}\boldsymbol{\alpha ^{{P}}}\le \boldsymbol{b}\}. \end{aligned}$$

3 A Divide-and-Conquer Approach

3.1 Divide: The FMplex Method

We now summarize the idea of the FMplex method introduced in [27] and then refine it to make it better suited for the task of polyhedron projection.

Originally, this method was developed as a branching version of the Fourier-Motzkin (FM) variable elimination method. However, [27] does not further study the general task of variable elimination, but focuses on using FMplex for checking the satisfiability of a constraint set.

To eliminate a variable \(x_i \in \boldsymbol{x}\) from a constraint set C, FMplex partitions the constraints into three sets as follows.

Definition 1

For each constraint set C and variable \(x_i \in \boldsymbol{x}\), we define

  • \(C^{-}(x_i) := \{(\boldsymbol{a}^{\intercal }\boldsymbol{x} \le b) \in C \mid a_i < 0\}\), called the lower bounds of C on \(x_i\),

  • \(C^{+}(x_i) := \{(\boldsymbol{a}^{\intercal }\boldsymbol{x} \le b) \in C \mid a_i > 0\}\), called the upper bounds of C on \(x_i\),

  • \(C^{0}(x_i) := \{(\boldsymbol{a}^{\intercal }\boldsymbol{x} \le b) \in C \mid a_i = 0\}\), called the non-bounds of C on \(x_i\).

Moreover, for each \(c=(\boldsymbol{a}^{\intercal }\boldsymbol{x}\le b)\in C^{-}(x_i)\cup C^{+}(x_i)\), we define the term

$$\begin{aligned} \textit{bnd}(x_i,c) := 1/a_i\cdot (b-(a_1x_1+\ldots +a_{i-1}x_{i-1}+a_{i+1}x_{i+1}+\ldots +a_nx_n)). \end{aligned}$$

Each constraint \(c \in C^{-}(x_i)\) is equivalent to \(\textit{bnd}(x_i,c) \le x_i\), and it bounds from below the \(x_i\)-value of solutions for C w.r.t. the other variables. Similarly, \(c \in C^{+}(x_i)\) is equivalent to \(x_i \le \textit{bnd}(x_i,c)\). For \(c \in C^0(x_i)\) we have \(x_i \not \in \textit{vars}(c)\).
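
For instance, for \(c = (-x_1 + x_2 - x_3 \le -3)\) we get \(\textit{bnd}(x_1,c) = x_2 - x_3 + 3\), so c is equivalent to the lower bound \(x_2 - x_3 + 3 \le x_1\); this is exactly the bound used in Example 2 below.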

The method then uses the insight that at each point in the projection onto \(\textit{vars}(C) \setminus (x_i)\), one of the lower bounds is a greatest and one of the upper bounds is a lowest. Thus, the projection can be divided into a disjunction of sub-problems, each stating that a particular lower bound is a greatest and that no upper bound is below it. Symmetrically, one can express that a particular upper bound is a lowest and that no lower bound is above it. Theorem 2 formalizes the division into sub-problems by which FMplex eliminates the variable \(x_i\).

Definition 2

(Partial Projection). Let C be a constraint set, \(x_i\in \boldsymbol{x}\) and \(c\in C^{-}(x_i) \cup C^{+}(x_i)\). Further, let \(t := \textit{bnd}(x_i,c)\), then the partial projection of \(x_i\) from C with c is

$$\begin{aligned} C[c//x_i] := \{c'[t/x_i] \mid c' \in C\setminus \{c\}\}. \end{aligned}$$

Using \(L := \{\textit{bnd}(x_i,c') \mid c' \in C^{-}(x_i)\}\) and \(U := \{\textit{bnd}(x_i,c') \mid c' \in C^{+}(x_i)\}\), this is equivalent to \(\{l\le t \mid l \in L \setminus \{t\}\}\, \cup \,\{t\le u \mid u \in U \setminus \{t\}\}\, \cup \, C^0(x_i)\).

Theorem 2

([27]). If \(C^{+}(x_i) = \emptyset \) or \(C^{-}(x_i) = \emptyset \), then \(\exists x_i. C \equiv C^0(x_i)\).

$$ \text {Otherwise, }\qquad \exists x_i. C \quad \equiv \bigvee _{c \in C^{-}(x_i)} C[c//x_i]\quad \equiv \bigvee _{c \in C^{+}(x_i)} C[c//x_i]\ . $$

To eliminate multiple variables, the disjuncts can be handled independently. That is, in each of the disjuncts, one can eliminate any of the remaining variables, again obtaining a disjunction. Essentially, the method constructs a tree of constraint sets where the root is the initial input C, and the children of any node are the disjuncts that result from eliminating a variable from that node using Theorem 2. At the leaves of the tree, all desired variables are eliminated, but different paths from the root to a leaf may eliminate the variables in different orders and may alternate between using lower or upper bounds for branching.

Example 2

We revisit Example 1 and start by eliminating \(x_1\) using greatest lower bounds. With \(C^-(x_1) = \{-x_1 + x_2 - x_3 \le -3, -x_1 + x_2 + x_3 \le 4\}\), \(C^+(x_1) = \{x_1 \le 3\}\) and \(C^0(x_1) = \{-x_2 \le -1\}\), we get

$$\begin{aligned} \exists x_1. C\ \equiv &\quad C[(-x_1 + x_2 -x_3 \le -3)//x_1]\quad \vee \quad C[(-x_1 +x_2 +x_3 \le 4)//x_1]\\ \equiv &\quad \{2x_3 \le 7,\quad x_2 - x_3 \le 0,\quad -x_2 \le -1\}\quad \vee \\ &\quad \{-2x_3 \le -7,\quad x_2 + x_3 \le 7,\quad -x_2 \le -1\} \end{aligned}$$

We eliminate \(x_2\) from the two sets independently, using their only lower bound:

$$\begin{aligned} \exists x_1, x_2. C \equiv \quad \{2x_3 \le 7,\quad -x_3 \le -1\}\quad \vee \quad \{-2x_3 \le -7,\quad x_3 \le 6\}. \end{aligned}$$

3.2 Conquer: Obtaining a Conjunctive Result

When using FMplex for satisfiability checking (like in [27]), one eliminates all variables of the given constraint set, and if any leaf is satisfiable, then the input is satisfiable as well. For quantifier elimination, however, we need to consider all leaves. The problem is that the final result is a potentially large disjunction of conjunctions or, geometrically, a union of convex polyhedra. We know from Fourier-Motzkin that this union is again a single convex polyhedron and thus can be represented as a single conjunction. However, naively computing the union of the polyhedra would be too expensive, and we will show a more efficient way to extract a conjunction from the computations.

Our first step is to observe that all constraints constructed by the above method can be understood as linear combinations of the original constraints.

Lemma 1

Let \(A \in \mathbb {Q}^{m \times n}\) and \(\boldsymbol{b} \in \mathbb {Q}^m\). For every constraint \(\boldsymbol{v}^{\intercal }\boldsymbol{x} \le w\) constructed by FMplex (repeated application of Theorem 2) on the input \(A\boldsymbol{x} \le \boldsymbol{b}\), there is \(\boldsymbol{f} \in \mathbb {Q}^m\) with \( \boldsymbol{f}^{\intercal } A = \boldsymbol{v}^{\intercal } \text { and } \boldsymbol{f}^{\intercal }\boldsymbol{b} = w. \)

Proof

For any \(\boldsymbol{a}_{k,-}\boldsymbol{x} \le b_k\), we can simply choose \(\boldsymbol{f} = \boldsymbol{e}_k\). Let \(c = (\boldsymbol{a}^{\intercal } \boldsymbol{x} \le d)\) with \(a_i \ne 0\), \(t:=\textit{bnd}(x_i,c)\), and \(c' = (\boldsymbol{a'}^{\intercal } \boldsymbol{x} \le d')\). Assume that we already have \(\boldsymbol{f},\boldsymbol{f'} \in \mathbb {Q}^m\) with \(c = (\boldsymbol{f}^{\intercal }A\boldsymbol{x} \le \boldsymbol{f}^{\intercal }\boldsymbol{b})\) and \(c' = (\boldsymbol{f'}^{\intercal }A\boldsymbol{x} \le \boldsymbol{f'}^{\intercal }\boldsymbol{b})\). Then

$$ c'[t/x_i]\ \equiv \ ((\boldsymbol{a'} - \frac{a'_{i}}{a_{i}} \boldsymbol{a})^{\intercal }\boldsymbol{x} \le d' - \frac{a'_{i}}{a_{i}} d)\ =\ ((\boldsymbol{f'} - \frac{a'_{i}}{a_{i}} \boldsymbol{f})^{\intercal }A\boldsymbol{x} \le (\boldsymbol{f'} - \frac{a'_{i}}{a_{i}} \boldsymbol{f})^{\intercal }\boldsymbol{b}) . $$

Note that new constraints are only constructed by this kind of substitution.    \(\square \)

It can happen that the same constraint is derived in multiple ways, and from now on, we want to distinguish constraints also by the way they are generated.

Definition 3

(Annotated Constraints). An annotated constraint has the form \(c:\boldsymbol{f}\) with some constraint c and an \(\boldsymbol{f} \in \mathbb {Q}^m\), also called the construction vector. Constraints with different annotations are considered as different.

Instead of constraint sets, we now consider sets of annotated constraints. Most notions can be adapted straightforwardly, in particular the definitions for \(C^-(x_i), C^+(x_i), C^0(x_i)\) and \(\textit{vars}(C)\). For the restricted projection \(C[c:\boldsymbol{f}//x_i]\), the construction vector of a new constraint can easily be computed from the construction vectors of its parents, according to the proof of Lemma 1.

Our main result, which we will show in Theorem 3, is that the final disjunction computed by FMplex is equivalent to the conjunction of those constraints whose construction vectors have no negative entries. This is fairly easy to see for the elimination of a single variable, since the non-negative combinations are exactly the constraints the Fourier-Motzkin method would compute. The constraints whose construction vectors do have negative entries stem from the assumptions that some lower (upper) bound is greater than or equal to another lower (upper) bound. This means that these constraints do not define the boundary of the solution space; they only cut it into multiple parts, causing the disjunction. This intuition generalizes to the elimination of multiple variables.

Example 3

When going through Example 2 with annotations, we get the result

$$\begin{aligned} \exists x_1,x_2. C\quad \equiv \quad &\,\{2x_3 \le 7:\boldsymbol{(-1,1,0,0)^{\intercal }},\ -x_3 \le -1:\boldsymbol{(1,0,1,1)^{\intercal }}\}\ \vee \\ &\,\{-2x_3 \le -7:\boldsymbol{(1,-1,0,0)^{\intercal }},\ x_3 \le 6:\boldsymbol{(0,1,1,1)^{\intercal }}\}. \end{aligned}$$

Collecting the constraints with non-negative construction vectors gives the equivalent set \(\{-x_3 \le -1,\ x_3 \le 6\}\), as in Example 1. Note how the other two constraints partition the solution space, but they do not change it.

Theorem 3

Let C be a constraint set and \(\exists \boldsymbol{x^{{Q}}}. C \equiv D_1 \vee \ldots \vee D_k\), such that the disjunction on the right-hand side was constructed using the FMplex method, i.e. by repeated application of Theorem 2 with constraint annotation. Let further \(D_{pos} := \{c \mid c:\boldsymbol{f} \in \bigcup _{i=1}^k D_i,\ \boldsymbol{f} \ge 0\}\). Then \(\exists \boldsymbol{x^{{Q}}}. C \equiv D_{pos}\).

Proof

We show that \((\boldsymbol{\alpha ^{{P}}}\models \exists \boldsymbol{x^{{Q}}}. C) \Leftrightarrow (\boldsymbol{\alpha ^{{P}}}\models D_{pos})\) holds for every \(\boldsymbol{\alpha ^{{P}}}\in \mathbb {R}^{p}\). Farkas’ Lemma (Theorem 1) immediately yields \((\boldsymbol{\alpha ^{{P}}}\models \exists \boldsymbol{x^{{Q}}}. C) \Rightarrow (\boldsymbol{\alpha ^{{P}}}\models D_{pos})\), so it remains to show \(\boldsymbol{\alpha ^{{P}}}\not \models \exists \boldsymbol{x^{{Q}}}. C \Rightarrow \boldsymbol{\alpha ^{{P}}}\not \models D_{pos}.\) Assume that C has the form \(A^{{Q}}\boldsymbol{x^{{Q}}}+A^{{P}}\boldsymbol{x^{{P}}}\le \boldsymbol{b}\) and consider the following system, with \(\boldsymbol{b'} := \boldsymbol{b} - A^{{P}}\boldsymbol{\alpha ^{{P}}}\in \mathbb {R}^m\):

$$\begin{aligned} C'\quad :=\quad C[\boldsymbol{\alpha ^{{P}}}/\boldsymbol{x^{{P}}}]\quad =\quad (A^{{Q}}\boldsymbol{x^{{Q}}}+ A^{{P}}\boldsymbol{\alpha ^{{P}}}\le \boldsymbol{b})\quad \equiv \quad A^{{Q}}\boldsymbol{x^{{Q}}}\le \boldsymbol{b'}. \end{aligned}$$

Since \(\boldsymbol{\alpha ^{{P}}}\not \models \exists \boldsymbol{x^{{Q}}}. C\), this system is unsatisfiable and there is a minimal unsatisfiable subset \(K' \subseteq C'\). We know that there is \(K \subseteq C\) with \(K[\boldsymbol{\alpha ^{{P}}}/\boldsymbol{x^{{P}}}] = K'\). We will show that there is \(c \in D_{pos}\), constructed only from K, such that \(\boldsymbol{\alpha ^{{P}}}\not \models c\).

For this purpose, we will construct a sequence \((C_1,C_2,\ldots , C_{q+1})\), starting with \(C_1 := C\), corresponding to a path in the elimination tree of FMplex from the initial system to a leaf (where all variables in \(\boldsymbol{x^{{Q}}}\) have been eliminated). Starting with \(K_1 := K\), we will construct a second sequence \((K_1, K_2,\ldots , K_{q+1})\) such that for all \(1\le i \le q+1\): \(K_i \subseteq C_i\), the set \(K_i[\boldsymbol{\alpha ^{{P}}}/\boldsymbol{x^{{P}}}]\) is unsatisfiable, and \(K_i\) is minimal in the sense that \(L[\boldsymbol{\alpha ^{{P}}}/\boldsymbol{x^{{P}}}]\) is satisfiable for every \(L \subsetneq K_i\).

W.l.o.g. let \(x_i\) be the variable eliminated next from \(C_i\). We have \(K_i^{-}(x_i) \ne \emptyset \) if and only if \(K_i^{+}(x_i) \ne \emptyset \), because otherwise \(K_i^{0}(x_i)\) would be a strict subset of \(K_i\) and \(K_i^{0}(x_i)[\boldsymbol{\alpha ^{{P}}}/\boldsymbol{x^{{P}}}]\) would be unsatisfiable, contradicting the minimality of \(K_i\). Therefore, either \(x_i \not \in \textit{vars}(K_i)\) or \(K_i^{-}(x_i) \ne \emptyset \ne K_i^{+}(x_i)\) holds.

  • If \(x_i \not \in \textit{vars}(K_i)\), then \(K_i \subseteq C_i^{0}(x_i)\), therefore \(K_i\) is included in all children of \(C_i\). We choose one of them as \(C_{i+1}\) and use \(K_{i+1} := K_{i}\).

  • In the other case, there is a constraint \(c:\boldsymbol{f} \in K_i\) so that one of the constructed children is \(C_{i+1} := C_i[c:\boldsymbol{f}//x_i]\).

    Note that \(K_i[c:\boldsymbol{f}//x_i] \subseteq C_i[c:\boldsymbol{f}//x_i]\) holds and \(K_i[c:\boldsymbol{f}//x_i][\boldsymbol{\alpha ^{{P}}}/\boldsymbol{x^{{P}}}]\) is unsatisfiable (since otherwise \(K'\) would be satisfiable). Thus, there is a minimal subset \(K_{i+1} \subseteq K_i[c:\boldsymbol{f}//x_i]\) so that \(K_{i+1}[\boldsymbol{\alpha ^{{P}}}/\boldsymbol{x^{{P}}}]\) is unsatisfiable.

In the end, \(K_{q+1} \ne \emptyset \), \(\textit{vars}(K_{q+1}) \cap \boldsymbol{x^{{Q}}}= \emptyset \) and \(K_{q+1}[\boldsymbol{\alpha ^{{P}}}/\boldsymbol{x^{{P}}}]\) is unsatisfiable. Thus, there is \(c:\boldsymbol{f} \in K_{q+1}\) with \(\boldsymbol{\alpha ^{{P}}}\not \models c\). We now show \(\boldsymbol{f} \ge 0\) and thus \(c \in D_{pos}\).

For \(i \in \{1,\ldots , m\}\), let \(c_i := (\boldsymbol{a}^{{Q}}_{i,-}\boldsymbol{x^{{Q}}}+ \boldsymbol{a}^{{P}}_{i,-}\boldsymbol{x^{{P}}}\le b_i)\). For all i with \(c_i \in C \setminus K\), we have \(f_i = 0\) by construction. Towards a contradiction, assume there is \(1\le j \le m\) with \(c_j \in K\) and \(f_j < 0\). By Farkas’ Lemma and the minimality of K, there exists \(\boldsymbol{f'} \in \mathbb {R}^m\) with \(f'_i > 0\) for all \(c_i \in K\), \(f'_i = 0\) for all \(c_i \in C \setminus K\), \(\boldsymbol{f'}^{\intercal }A^{{Q}}= \boldsymbol{0}\) and \(\boldsymbol{f'}^{\intercal }\boldsymbol{b'} < 0\), i.e. \(\boldsymbol{\alpha ^{{P}}}\not \models (\boldsymbol{f'}^{\intercal }A^{{P}}\boldsymbol{x^{{P}}}\le \boldsymbol{f'}^{\intercal }\boldsymbol{b})\).

Using \(\lambda := \max \{\frac{-f_j}{f'_j} \mid 1\le j\le m,\ f_j < 0\}\) and \(\boldsymbol{g} := \boldsymbol{f} + \lambda \boldsymbol{f'}\), we observe \(\boldsymbol{g} \ge \boldsymbol{0}\), \(\boldsymbol{g}^{\intercal }A^{{Q}}= \boldsymbol{0}\) and \(\boldsymbol{\alpha ^{{P}}}\not \models (\boldsymbol{g}^{\intercal }A^{{P}}\boldsymbol{x^{{P}}}\le \boldsymbol{g}^{\intercal }\boldsymbol{b})\), but \(\{c_i \mid 1 \le i \le m,\ g_i \ne 0\} \subsetneq K\), contradicting the minimality of K. Therefore, \(\boldsymbol{f} \ge \boldsymbol{0}\), so \(c \in D_{pos}\) and \(\boldsymbol{\alpha ^{{P}}} \not \models D_{pos}\).   \(\square \)

Our method is formulated in Algorithm 1. It maintains a stack node_stack of annotated constraint sets, which correspond to the nodes of the elimination tree traversed by FMplex. The function push inserts the given set at the top of the stack, and pop removes its top element and returns that set.

Since a node’s children are smaller than that node, i.e. \(|N[c:\boldsymbol{f}//x_i]| < |N|\), and the tree branching is bounded by m, i.e. \(|N^-(x_i)| \le |N|\) and \(|N^+(x_i)| \le |N|\), the algorithm has exponential complexity (\(\mathcal {O}(m^{q+1})\)) in space and time with respect to \(q = |\boldsymbol{x^{{Q}}}|\). In fact, the stack never contains more than \(m^2 \cdot (q+1)\) constraints at the same time; only the output needs exponentially large space. Interestingly, we only insert new constraints into the output set and never read from or remove anything from it during the procedure.

Algorithm 1. \(\texttt {project}(C, \boldsymbol{x^{{Q}}})\)
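
Since the pseudocode of Algorithm 1 is only available as a figure here, the following Python sketch reconstructs the procedure from the description above. It is our own illustration, not the SMT-RAT implementation: constraints are triples of a coefficient vector, a right-hand side and a construction vector (Definition 3), and the variable/branch choice heuristic from Sect. 3.3 is anticipated.

```python
from fractions import Fraction

# Sketch of Algorithm 1 (our own reconstruction from the text, not the
# SMT-RAT code). A constraint a^T x <= b is a triple (a, b, f), where f is
# its construction vector; all numbers are exact Fractions.

def substitute(c, cp, i):
    """Compute cp[bnd(x_i, c)/x_i], combining construction vectors
    as in the proof of Lemma 1."""
    (a, b, f), (ap, bp, fp) = c, cp
    r = ap[i] / a[i]
    return ([u - r * v for u, v in zip(ap, a)],  # i-th coefficient becomes 0
            bp - r * b,
            [u - r * v for u, v in zip(fp, f)])

def project(C, Q):
    """Eliminate the variable indices Q from the constraint list C."""
    m = len(C)
    unit = lambda k: [Fraction(int(j == k)) for j in range(m)]
    node_stack = [[(a, b, unit(k)) for k, (a, b) in enumerate(C)]]
    D = []
    while node_stack:
        N = node_stack.pop()
        todo = [i for i in Q if any(c[0][i] != 0 for c in N)]
        if not todo:
            # leaf: keep constraints with non-negative construction
            # vectors (Theorem 3); duplicates could be filtered here
            D.extend((a, b) for a, b, f in N if all(v >= 0 for v in f))
            continue
        # variable and branch choice (Sect. 3.3): fewest children first
        i = min(todo, key=lambda j: min(sum(c[0][j] < 0 for c in N),
                                        sum(c[0][j] > 0 for c in N)))
        lower = [c for c in N if c[0][i] < 0]
        upper = [c for c in N if c[0][i] > 0]
        rest = [c for c in N if c[0][i] == 0]
        if not lower or not upper:            # first case of Theorem 2
            node_stack.append(rest)
            continue
        for c in min(lower, upper, key=len):  # second case of Theorem 2
            child = [substitute(c, cp, i) for cp in lower + upper if cp is not c]
            node_stack.append(child + rest)
    return D

# Example 1: eliminating x1 and x2 yields {-x3 <= -1, x3 <= 6}.
F = Fraction
C = [([F(-1), F(1), F(-1)], F(-3)), ([F(-1), F(1), F(1)], F(4)),
     ([F(1), F(0), F(0)], F(3)), ([F(0), F(-1), F(0)], F(-1))]
print(project(C, [0, 1]))  # -x3 <= -1 and x3 <= 6, printed as Fractions
```

On Example 1, this sketch branches on the single upper bound \(x_1 \le 3\), reaches one leaf, and returns exactly the two constraints from the example.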

3.3 Further Improvements

Thanks to its disjunctive structure, our approach admits many optimizations. The original paper for FMplex [27] describes several improvements for the version developed for satisfiability checking. We now show which of these improvements can be transferred to the new setting of variable elimination.

Variable and Branch Choice. In each of the processed nodes, we can choose the eliminated variable and whether to branch on lower or upper bounds. This choice is independent of the other nodes and can have a massive impact on the runtime of the algorithm. To minimize the number of children for each node N, we choose an \(x_i\) for which \(\min (|N^-(x_i)|,|N^+(x_i)|)\) is minimal and, if necessary, choose the branching \(* \in \{-,+\}\) accordingly.

Equations and Strict Constraints. In the presence of equations \(\boldsymbol{a}^{\intercal }\boldsymbol{x} = b\), we employ Gaussian elimination before the actual call to our procedure, in order to eliminate some of the desired variables using the equations.
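
For example (our own illustration), given the equation \(x_1 + x_2 = 2\) and the constraint \(x_1 - x_3 \le 0\), substituting \(x_1 = 2 - x_2\) yields \(-x_2 - x_3 \le -2\), removing \(x_1\) without any branching.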

To handle strict constraints \(\boldsymbol{a}^{\intercal }\boldsymbol{x} < b\), we introduce a new variable \(\delta \) as a placeholder for some infinitesimal value and instead consider \(\boldsymbol{a}^{\intercal }\boldsymbol{x} + \delta \le b\). We eliminate \(\delta \) from the final result using the additional constraint \(\delta >0\). That is, for a resulting constraint \(\boldsymbol{a}^{\intercal }\boldsymbol{x^{{P}}}+ d \delta \le b\), if \(d \ne 0\), then \(d > 0\) and we can deduce \(\boldsymbol{a}^{\intercal }\boldsymbol{x^{{P}}}< b\). This is a fairly standard way of dealing with strict constraints.
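
For instance (again our own illustration), to eliminate \(x_1\) from \(\{x_2 < x_1,\ x_1 \le 3\}\), we rewrite the strict constraint as \(x_2 - x_1 + \delta \le 0\) and obtain \(x_2 + \delta \le 3\) after the elimination; since the coefficient of \(\delta \) is positive, this is returned as \(x_2 < 3\).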

Pruning Equivalent Nodes. An interesting result in [27] is that for each node N in the elimination tree, one can partition the original input C into two sets \(\mathcal {N}, \mathcal {B}\), so that each constraint \(c \in N\) was constructed from the constraints in \(\mathcal {N}\) and exactly one of the constraints in \(\mathcal {B}\). The intuition is that we start with \(\mathcal {N} := \emptyset \) and \(\mathcal {B} := C\) and when constructing \(N[c//x_i]\), the constraint corresponding to c moves from \(\mathcal {B}\) to \(\mathcal {N}\). In that sense, \(\mathcal {N}\) contains the assumed strictest bounds.

Multiple nodes can have the same corresponding sets \(\mathcal {N}, \mathcal {B}\), with the intuition that the same bounds are chosen in a different order. Then, the nodes are equivalent, and [27] avoids visiting more than one of them using some bookkeeping.

This optimization can be transferred straightforwardly, as the relevant results are about equivalence and not just satisfiability.

Pruning Unsatisfiable Nodes. In the case of satisfiability checking, the goal is to find any satisfiable node in the elimination tree. Therefore, it is helpful to identify and prune unsatisfiable parts of the tree. A node is easily recognized as unsatisfiable if it contains a trivially false constraint, e.g. \(0 \le -1\). Then, the children of that node can be ignored, since they will also contain that constraint.

This can also be used in our version for variable elimination, though we omit the proof here for brevity. Essentially, one can show that the construction in the proof for Theorem 3 still works when leaving out the pruned nodes.

This idea is taken even further in [27], by analyzing how the trivially false constraints are constructed. If an unsatisfiable node is encountered, one may find an ancestor of that node in the elimination tree which already implies the trivially false constraint. Therefore, the remaining children of that ancestor can also be pruned. However, this cannot be used for variable elimination, as illustrated by the following example, which was already considered in [27]:

Example 4

We eliminate all variables with the static order \(x_3, x_2, x_1\) from

$$ \{-x_3 \le 0,\quad x_1 - x_2 - x_3 \le 0,\quad x_1 \le -1,\quad -x_1 + x_2 \le -1,\quad -x_2 + x_3 \le 0\}, $$

always branching on lower bounds and first considering greatest lower bounds whose construction vectors have negative entries. When pruning nodes according to which ancestor implied the unsatisfiability, our algorithm would incorrectly return the (satisfiable) empty constraint set, while the input is unsatisfiable.

There is one exception, though. If a trivially false constraint is annotated with an \(\boldsymbol{f} \ge 0\), then it is implied by the input system. Thus, it is added to the result, which is then unsatisfiable and we can stop immediately. However, in usual applications of polyhedron projection, the input is rarely unsatisfiable.
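
In the Python sketch after Algorithm 1 (our hypothetical illustration), the check itself is a one-liner; a node containing such a constraint with a negative annotation entry can be dropped before its children are pushed, and if the annotation is non-negative, the constraint is added to the result and the procedure stops.

```python
def trivially_false(c):
    """True for constraints of the form 0 <= b with b < 0."""
    a, b, _f = c
    return all(v == 0 for v in a) and b < 0
```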

Splitting Into Unrelated Systems. It can happen that the input system consists of several parts that are unrelated with regard to the eliminated variables. That is, we can find a decomposition \(C = C_1 \cup \ldots \cup C_k\) such that \(\textit{vars}(C_i) \cap \textit{vars}(C_j) \cap \boldsymbol{x^{{Q}}}= \emptyset \) for all \(i \ne j\). In that case, \(\boldsymbol{x^{{Q}}}\) can be eliminated from each single \(C_i\) independently. To avoid eliminating the variables in \(\textit{vars}(C_1) \cap \boldsymbol{x^{{Q}}}\) and then performing the elimination for \(C_2\) in every resulting subtree, we can split the node C into nodes \(C_1, C_2, \ldots \) and put them on the stack. Such a partition can easily be found by computing the connected components in a graph that has the constraints of C as vertices and an edge between c and \(c'\) whenever \(\textit{vars}(c) \cap \textit{vars}(c') \cap \boldsymbol{x^{{Q}}}\ne \emptyset \); see the sketch after this paragraph. Note that this can be applied to the initial input to help any projection method. However, our method can apply it to all the intermediate nodes, potentially leading to more savings. This improvement is not relevant for satisfiability checking and is a novel contribution of this work.
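
As an illustration (our own sketch, reusing the hypothetical constraint representation from the sketch in Sect. 3.2), such a decomposition can be computed with a union-find structure instead of an explicit graph:

```python
def split_unrelated(C, Q):
    """Group the indices of constraints in C that (transitively) share
    quantified variables; C is a list of (a, b) pairs, Q the quantified
    variable indices. Constraints without quantified variables end up in
    singleton groups and can be copied to the output unchanged."""
    parent = {}
    def find(u):
        parent.setdefault(u, u)
        while parent[u] != u:
            parent[u] = parent[parent[u]]  # path halving
            u = parent[u]
        return u
    def union(u, v):
        parent[find(u)] = find(v)
    for k, (a, _b) in enumerate(C):
        for i in Q:
            if a[i] != 0:
                union(('c', k), ('v', i))  # constraint k uses variable x_i
    groups = {}
    for k in range(len(C)):
        groups.setdefault(find(('c', k)), []).append(k)
    return list(groups.values())
```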

4 Relation to Virtual Term Substitution

Although FMplex was derived from the Fourier-Motzkin method, it exhibits striking similarities to virtual term substitution (VTS) introduced by Loos and Weispfenning [21]. The VTS method also eliminates variables from LRA formulas in a tree-like manner. In each step, it collects terms from the individual constraints and substitutes them for the eliminated variable, obtaining a disjunction similar to the result of Theorem 2. In fact, these terms can be chosen exactly as in our approach, i.e. so that they correspond to the lower bounds or the upper bounds. However, the special case that there are no lower bounds or no upper bounds is handled by an additional disjunct \(C[-\infty //x_i]\) or \(C[\infty //x_i]\), performing a virtual substitution of an infinite value for the variable.

It was observed in [19] that the constraints derived by the VTS are linear combinations of the original input, in the same sense as described in Sect. 3.2. Thus, we are certain that our main result, Theorem 3, can be transferred to the VTS method. To our knowledge, this has not been shown before. In particular, [19] is only about satisfiability checking and does not construct an equivalent conjunction. Interestingly though, it introduces a pruning mechanism which is similar to the ones described in the previous section.

All this only applies to the case where existential quantifiers are eliminated from a conjunction of weak linear constraints. In general, VTS can be applied to formulas with quantifier alternation and arbitrary Boolean structure, and to non-linear constraints with quadratic polynomials. Our results cannot be easily generalized to that setting (and, in that general case, the result of quantifier elimination does not necessarily define a convex polyhedron).

5 Experimental Evaluation

We implemented our algorithm in the Satisfiability Modulo Theories Real Arithmetic Toolbox (SMT-RAT, [7]) and compared the following tools.

  • SMT-RAT The implementation of the algorithm presented in this paper, including the optimizations described in Sect. 3.3. For source code see [29].

  • FM A basic implementation of the Fourier-Motzkin method in SMT-RAT, without any further optimizations. This is merely a baseline; substantially improved variants exist (cf. [6, 14, 16]).

  • CDD The projection method of the CDDlib library, which uses the double description method. For source version, see [5].

  • Redlog An optimized VTS implementation provided by the Redlog package of the computer algebra system Reduce [8]. For source version, see [28].

  • Z3 The z3 prover [25], which offers quantifier elimination based on quantifier instantiation and model-based projections [4]. For source version, see [32].

The methods provided by Redlog and Z3 are directed at much more general quantifier elimination tasks than polyhedron projection. As a consequence, they generally return disjunctions, which are harder to interpret and to use. The comparisons are still interesting, since firstly, VTS is closely related to our method and secondly, Z3 is state of the art for most SMT related tasks.

All tools use exact arithmetic. We tested the tools on three benchmark sets.

  • Random. We randomly generated a set of satisfiable conjunctions and varied certain parameters to ensure diversity. More precisely, for each combination of \(n \in \{3,6,\ldots ,30\}\), \(m \in \{3,6,\ldots ,60\}\) and \(d \in \{0.1,0.3,0.5,0.7,0.9\}\), we generated 10 conjunctions of m constraints with n variables and so that the density of the coefficients is around d (i.e., the probability that an entry is non-zero is set to d). The coefficients are random integers between \(-100\) and 100, though the right-hand sides are non-negative to ensure satisfiability. For each of the conjunctions, half of the variables are chosen at random as the quantified variables. This amounted to 10000 test cases.

  • SMT-LIB. The standard SMT-LIB [3] quantifier elimination benchmarks are not suitable as they contain universal quantifiers and more complex Boolean structures. Instead, we executed a DPLL(T)-style SMT-solver on the quantifier free linear real arithmetic benchmarks (QF_LRA) and collected all maximal conjunctions that were passed to the theory solver during the execution, leaving out disequalities (\(\ne \)) and replacing < by \(\le \), as CDD does not natively handle strict constraints. For each conjunction, half of the variables are chosen randomly to be eliminated. This produced 4798 test cases.

  • NN-Verif. Neural network verification approaches like [1, 30] use convex polyhedra to over-approximate the set of all possible outputs of a neural network for a given input set. This approximation uses auxiliary variables, whose elimination might simplify the representation and speed up computations.

    ACAS Xu is a standard data set with 45 neural networks belonging to an airborne collision avoidance system [18]. For each network, we over-approximated its output for two different input sets, using the method from [1]. The resulting constraint matrices have an almost triangular structure, with decreasing column density from left to right. Therefore, we derived three test cases from each polyhedron: one eliminating the first five variables, one eliminating the last five and one for five randomly chosen variables. This amounted to 270 instances.

We executed each tool on each test instance with a time limit of 5 min and a memory limit of 4 GB. The experiments were conducted on identical machines with two Intel Xeon Platinum 8160 CPUs (2.1 GHz, 24 cores). Consult the Data Availability Statement for more details on the collected data.

Table 1. Number of solved instances for each tool and benchmark set.

Table 1 summarizes the results. In each of the benchmark suites, our implementation in SMT-RAT solves more instances than any of the other tools. As was to be expected, the simple implementation of FM does not perform well and quickly uses too much memory as the number of constructed constraints explodes. In fact, all failures of FM were due to exceeding the memory limit. The weakness of FM becomes apparent especially for the random benchmarks.

Surprisingly, Z3 did not perform better than FM, though it never exceeded the memory limit. As mentioned before, the method implemented by Z3 is not specialized for our task and, while it solves more general and complex problems efficiently, it cannot compete with the specialized methods.

Redlog is not far behind our method: for each benchmark set, the difference in solved instances does not exceed 8% of the total number of instances. It is not surprising to see some similarities between the two, considering that they are closely related, as discussed in Sect. 4. However, note that the outputs of Redlog or Z3 are not always suitable for the respective application, as they can contain disjunctions. This is the case for 3352 (resp. 2069 for Z3) random instances, 1934 (2646) of the SMT-LIB set and 39 (76) of the NN-Verif set.

On the random and neural network verification benchmarks, CDD is the strongest contender after SMT-RAT. Compared to Redlog and FM, it is a more specialized implementation for the task at hand. However, its performance dramatically drops for the set derived from SMT-LIB, and we will see why this happens when we further inspect that benchmark set below.

Random Instances. With respect to running time, our implementation clearly outperforms FM, Z3 and Redlog. FM is faster than SMT-RAT on 2648 instances, and Z3 on 430; however, SMT-RAT solves most of these within 0.5 s and all within 10 s. Only in 154 cases is Redlog faster than SMT-RAT, in contrast to 7329 cases where SMT-RAT is faster.

The comparison to CDD, however, is more ambiguous. CDD is faster in “only” 1657 cases, but SMT-RAT significantly struggles with these cases and times out on 148 of them. On the other hand, many of the instances where SMT-RAT is faster are solved by both tools within one second. Within such short running times, implementation details, e.g. how the input is processed, can have a big impact, making the results harder to interpret. Nevertheless, there are 1060 instances for which SMT-RAT is faster and CDD takes more than a second.

Fig. 2. Running times in seconds of CDD compared to SMT-RAT, colored by the number of input constraints (left) and the number of input variables (right). Each dot represents one instance of the random benchmark set. Timeouts are clamped to 5 min (lines on the very right/top).

To further investigate the strengths and weaknesses of the two tools, we consider Fig. 2, which shows the running times of SMT-RAT compared to CDD on each individual random instance. On the left, the instances are colored according to the number of input constraints; on the right, the same image is colored according to the number of input variables. We can see that SMT-RAT usually performs better if the number of constraints is high, but there are not more than 15 variables. On the other hand, CDD seems to have an advantage for problems with a medium number (20 to 40) of constraints and many variables.

A similar analysis with regard to the sparsity of the input showed no clear pattern. However, we will see next that extreme sparsity can make a difference.

SMT-LIB. The problems derived from the SMT-LIB satisfiability checking benchmarks are structurally quite different from the other two sets. They contain many variables and constraints, but are extremely sparse. In numbers: over 88% of the instances contain 50 or more variables and over 90% contain 100 or more constraints. On the other hand, over 87% of the instances have a density of 0.05 or lower, which is much lower than in any of the random benchmarks.

This sparsity makes it very likely that the flexibility and the structural savings of our approach have a big impact. It also favors FM, since memory and the combinatorial blow-up inherent to FM are less of an issue. CDD on the other hand cannot exploit the sparsity that well. It is based on the double description method which is generally more expensive for larger numbers of constraints.

Neural Network Verification. All instances in the NN-Verif set have a density between 0.25 and 0.4, which nicely complements the set derived from SMT-LIB. The instances are still sizable, containing 20–102 variables and 55–301 constraints.

As described before, the NN-Verif set has three categories of 90 instances each, depending on which variables are eliminated. SMT-RAT, Redlog and CDD were able to solve all 90 problems where the last five variables were eliminated. FM and Z3 only solved 75 (76) of those and none in the other categories. Of the instances where five randomly chosen variables were eliminated, SMT-RAT solved 31, CDD 25 and Redlog 15. When eliminating the first five variables, SMT-RAT solved 3 and CDD 1. The structure of these benchmarks has a big impact, and they are generally more challenging than the other sets we tried.

Output Size. When variable elimination is used by an external algorithm, a concise representation of the projection helps to reduce the effort of further computations by that algorithm. Thus, we are also interested in the output size, i.e. the number of constraints in the result.

On all three benchmark sets, FM and CDD never give a smaller output than SMT-RAT, and the same holds for Redlog and Z3 on most of the instances. In fact, their output can be bigger than SMT-RAT’s by up to four orders of magnitude (FM, Redlog), three orders of magnitude (Z3) or two orders of magnitude (CDD), respectively. For roughly 3600 instances in the random and SMT-LIB-derived benchmarks, Redlog yields significantly fewer (by up to four orders of magnitude) constraints than SMT-RAT. These instances contain very few variables, and the difference is likely due to some redundancy removal used by Redlog, which could also be implemented in the other tools. Z3 yields fewer (by up to one order of magnitude) constraints than SMT-RAT for 53 random instances, and there are 64 instances of the SMT-LIB-derived set where Z3 recognizes that the input is unsatisfiable and returns a single unsatisfiable constraint, while SMT-RAT returns up to several hundred constraints. This difference could be avoided by having SMT-RAT perform a satisfiability check first. Note that only 68 instances in total are unsatisfiable, all of them derived from SMT-LIB.

6 Conclusion

We adapted the FMplex method from [27] to eliminate (existentially quantified) variables from conjunctions of linear inequations. While a straightforward adaption of the original method yields a disjunction as output, we showed that it is possible to find an equivalent conjunction with little additional effort.

Our new approach admits many improvements and structural savings, as the processing of the individual steps is quite flexible. We revealed strong similarities to the VTS method and are certain that our results can be easily transferred to it, in the case of a conjunctive linear input. First experiments show that our implementation outperforms other tools, though a comparison to more alternatives, like the ones in [2, 15, 16], would be interesting. Depending on the application, our method, just like other established tools, can hit its limitations even for few eliminated variables, as observed on our neural network verification benchmarks.

Accordingly, there is potential for further improvements. For example, one could try to find and remove redundant constraints during the computation, or prune more nodes of the elimination tree. The difficulty there is to ensure that the final result still contains all necessary constraints. Finally, our procedure is easily parallelizable, as the nodes in the search tree can be processed independently.