
1 Introduction

Satisfiability modulo theories (SMT) refers to the problem of determining whether a first-order formula is satisfiable with respect to (w.r.t.) certain theories, such as the theories of linear integer/real arithmetic, nonlinear integer/real arithmetic and strings. In this paper, we consider the theory of nonlinear real arithmetic (NRA) and restrict our attention to deciding the satisfiability of quantifier-free polynomial formulas.

Solving polynomial constraints has been a central problem in the development of mathematics. In 1951, Tarski’s decision procedure [33] made it possible to solve polynomial constraints in an algorithmic way. However, Tarski’s algorithm is impractical because of its super-exponential complexity. The first relatively practical method is the cylindrical algebraic decomposition (CAD) algorithm [13], proposed by Collins in 1975 and followed by many improvements; see for example [6, 14, 20, 22, 26]. Unfortunately, those variants do not improve the complexity of the original algorithm, which is doubly exponential. On the other hand, SMT(NRA) is important in theorem proving and program verification, since many complicated programs use real variables and perform nonlinear arithmetic operations on them. Particularly, SMT(NRA) has various applications in the formal analysis of hybrid systems, dynamical systems and probabilistic systems (see the book [12] for reference).

The most popular approach for solving SMT(NRA) is the lazy approach, also known as CDCL(T) [5]. It combines a propositional satisfiability (SAT) solver, which uses a conflict-driven clause learning (CDCL) style algorithm to find assignments of the propositional abstraction of a polynomial formula, with a theory solver, which checks the consistency of sets of polynomial constraints. The solving effort in this approach is divided between the Boolean layer and the theory layer. For the theory solver, the only complete method is the CAD method; there also exist many efficient but incomplete methods, such as linearisation [10], interval constraint propagation [34] and virtual substitution [35]. Recall that the complexity of the CAD method is doubly exponential. In order to ease the burden of using CAD, an improved CDCL-style search framework, the model constructing satisfiability calculus (MCSAT) framework [15, 21], was proposed. Further, there are many optimizations of the CAD projection operation custom-made for this framework, e.g. [7, 24, 29]. Besides, an alternative CAD-based algorithm for determining the satisfiability of conjunctions of nonlinear polynomial constraints over the reals is presented in [1].

The development of this approach has brought us effective SMT(NRA) solvers. Almost all state-of-the-art SMT(NRA) solvers are based on the lazy approach, including Z3 [28], CVC5 [3], Yices2 [16] and MathSAT5 [11]. These solvers have made great progress in solving SMT(NRA). However, their time and memory usage on some hard instances may be unacceptable, particularly when the proportion of nonlinear polynomials among all polynomials appearing in the formula is high. This motivates us to design algorithms which perform well on these hard instances.

Local search plays an important role in solving satisfiability problems. It is an incomplete method, since it can only determine satisfiability but not unsatisfiability. A local search algorithm moves in the space of candidate assignments (the search space) by applying local changes, until a satisfying assignment is found or a time bound is reached. It is well known that local search methods have been successfully applied to SAT problems [2, 4, 9, 23]. In recent years, several efforts to develop local search methods for SMT solving have been inspiring: under the DPLL(T) framework, Griggio et al. [19] introduced a general procedure for integrating a local search solver of the WalkSAT family with a theory solver. Pure local search algorithms [17, 30, 31] were proposed to solve SMT problems w.r.t. the theory of bit-vectors directly on the theory level. Cai et al. [8] developed a local search procedure for SMT on the theory of linear integer arithmetic (LIA) through the critical move operation, which works on the literal level and changes the value of one variable in a false LIA literal to make it true. We also notice that there exists a local search SMT solver for the theory of NRA, called NRA-LS, which performed well at the SMT Competition 2022. A brief description of the solver, without details about local search, can be found in [25].

In this paper, we propose a local search algorithm for a special subclass of SMT(NRA), where all constraints are strict inequalities. The idea of applying the local search method to SMT(NRA) comes from CAD, which is a decomposition of the search space \(\mathbb {R}^{n}\) into finitely many cells such that every polynomial in the formula is sign-invariant on each cell. CAD guarantees that the search space only has finitely many states. Similar to the local search method for SAT which moves between finitely many Boolean assignments, local search for SMT(NRA) should jump between finitely many cells. So, we may use a local search framework for SAT to solve SMT(NRA).

Local search algorithms require an operation to perform local changes. For SAT, a standard operation is flip, which modifies the current assignment by flipping the value of one Boolean variable from false to true or vice versa. For SMT(NRA), we propose a novel operation, called cell-jump, updating the current assignment \(x_1\mapsto a_1,\ldots ,x_n\mapsto a_n~(a_i\in \mathbb {Q})\) to a solution of a false polynomial constraint ‘\(p<0\)’ or ‘\(p>0\)’, where \(x_i\) is a variable appearing in the given polynomial formula. Unlike the case of the critical move operation for linear integer constraints [8], it is difficult to determine a threshold value of some variable \(x_i\) at which the false polynomial constraint becomes true. We deal with this issue by the method of real root isolation, which isolates every real root of the univariate polynomial \(p(a_1,\ldots ,a_{i-1},x_i,a_{i+1},\ldots ,a_n)\) in a sufficiently small open interval with rational endpoints. If at least one endpoint makes the false constraint true, a cell-jump operation assigns \(x_i\) to such an endpoint closest to \(a_i\). The procedure can be viewed as searching for a solution along a line parallel to the \(x_i\)-axis. In fact, a cell-jump operation can search along any fixed straight line, and thus one cell-jump may change the values of more than one variable. At each step, the local search algorithm picks a cell-jump operation to execute according to a two-level operation selection, and updates the current assignment, until a solution to the polynomial formula is found or the terminal condition is satisfied. Moreover, our algorithm can be generalized to deal with a wider subclass of SMT(NRA) where polynomial equations linear w.r.t. some variable are allowed.

The local search algorithm is implemented as a tool in Maple 2022. Experiments are conducted to evaluate the tool on two classes of benchmarks: selected instances from SMT-LIB, and some hard instances generated randomly with only nonlinear constraints. Experimental results show that our tool is competitive with state-of-the-art SMT solvers on the SMT-LIB benchmarks, and performs particularly well on the hard instances. We also combine our tool with Z3, CVC5, Yices2 and MathSAT5 respectively to obtain four sequential portfolio solvers, which show better performance.

The rest of the paper is organized as follows. The next section introduces some basic definitions and notation, as well as a general local search framework for solving satisfiability problems. Section 3 shows, from the CAD perspective, that the search space of SMT(NRA) has only finitely many states. In Sect. 4, we describe cell-jump operations, while in Sect. 5 we provide the scoring function which gives every operation a score. The main algorithm is presented in Sect. 6. In Sect. 7, experimental results are provided to indicate the efficiency of the algorithm. Finally, the paper is concluded in Sect. 8.

2 Preliminaries

2.1 Notation

Let \(\bar{\boldsymbol{x}}:=(x_1,\ldots ,x_n)\) be a vector of variables. Denote by \(\mathbb {Q}\), \(\mathbb {R}\) and \(\mathbb {Z}\) the set of rational numbers, real numbers and integer numbers, respectively. Let \(\mathbb {Q}[\bar{\boldsymbol{x}}]\) and \(\mathbb {R}[\bar{\boldsymbol{x}}]\) be the ring of polynomials in the variables \(x_1,\ldots ,x_n\) with coefficients in \(\mathbb {Q}\) and in \(\mathbb {R}\), respectively.

Definition 1 (Polynomial Formula)

Suppose \(\varLambda =\{P_1,\ldots ,P_m\}\) where every \(P_i\) is a non-empty finite subset of \(\mathbb {Q}[\bar{\boldsymbol{x}}]\). The following formula

$$\begin{aligned} F=\bigwedge _{P_i\in \varLambda }~\bigvee _{p_{ij}\in P_i}p_{ij}(x_1,\ldots ,x_n)\rhd _{ij}0,~\textrm{where}~\rhd _{ij}\in \{<,>,=\}, \end{aligned}$$

is called a polynomial formula. Additionally, we call \(p_{ij}(x_1,\ldots ,x_n)\rhd _{ij}0\) an atomic polynomial formula, and \(\bigvee _{p_{ij}\in P_i}p_{ij}(x_1,\ldots ,x_n)\rhd _{ij}0\) a polynomial clause.

For any polynomial formula F, \({\texttt{poly}}(F)\) denotes the set of polynomials appearing in F. For any atomic formula \(\ell \), \({\texttt{poly}}(\ell )\) denotes the polynomial appearing in \(\ell \) and \({\texttt{rela}}(\ell )\) denotes the relational operator (‘<’, ‘>’ or ‘\(=\)’) of \(\ell \).

For any polynomial formula F, an assignment is a mapping \(\alpha :\bar{\boldsymbol{x}}\rightarrow \mathbb {R}^n\) such that \(\alpha (\bar{\boldsymbol{x}})=(a_1,\ldots ,a_n)\) where \(a_i\in \mathbb {R}\). Given an assignment \(\alpha \),

  • an atomic polynomial formula \(p\rhd 0\) is true under \(\alpha \) if \(p(\alpha (\bar{\boldsymbol{x}}))\rhd 0\) holds, and otherwise it is false under \(\alpha \),

  • a polynomial clause is satisfied under \(\alpha \) if at least one atomic formula in the clause is true under \(\alpha \), and falsified under \(\alpha \) otherwise.

When the context is clear, we simply say a true (or false) atomic polynomial formula and a satisfied (or falsified) polynomial clause. A polynomial formula is satisfiable if there exists an assignment \(\alpha \) such that all clauses in the formula are satisfied under \(\alpha \); such an assignment is a solution to the polynomial formula. A polynomial formula is unsatisfiable if no assignment is a solution.
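The satisfaction semantics above is easy to mechanize. The sketch below is a hypothetical minimal encoding (atoms as pairs of a polynomial function and a relational operator, a formula as a list of clauses), not the data structure used by our tool:

```python
from fractions import Fraction

# An atomic polynomial formula is (p, op) where p maps an assignment tuple
# to a number and op is '<', '>' or '='. A clause is a list of atoms; a
# polynomial formula is a list of clauses (a conjunction of disjunctions).
def atom_true(atom, a):
    p, op = atom
    v = p(a)
    return v < 0 if op == '<' else v > 0 if op == '>' else v == 0

def clause_satisfied(clause, a):
    return any(atom_true(atom, a) for atom in clause)

def is_solution(formula, a):
    return all(clause_satisfied(c, a) for c in formula)

# Example formula: (x^2 + y^2 - 2 < 0) and (x - y > 0)
F = [[(lambda a: a[0]**2 + a[1]**2 - 2, '<')],
     [(lambda a: a[0] - a[1], '>')]]
print(is_solution(F, (Fraction(1), Fraction(0))))   # prints True
```

Rational assignments (`Fraction`) keep the evaluation exact, mirroring the fact that cell-jump always produces points with rational coordinates.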

2.2 A General Local Search Framework

When applying local search algorithms to a satisfiability problem, the search space is the set of all assignments. A general local search framework begins with a complete initial assignment. At each step, one of the operations with the highest score is picked and the assignment is updated by executing that operation, until the terminal condition is reached. Below, we give the formal definitions of operation and scoring function.

Definition 2 (Operation)

Let F be a formula. Given an assignment \(\alpha \) which is not a solution of F, an operation modifies \(\alpha \) to another assignment \(\alpha '\).

Definition 3 (Scoring Function)

Let F be a formula. Suppose \(\alpha \) is the current assignment and op is an operation. A scoring function is defined as \({\texttt{score}}(op,\alpha ):={\texttt{cost}}(\alpha )-{\texttt{cost}}(\alpha ')\), where the real-valued function \({\texttt{cost}}\) measures the cost of making F satisfied under an assignment according to some heuristic, and \(\alpha '\) is the assignment after executing op.

Example 1

In local search algorithms for SAT, a standard operation is flip, which modifies the current assignment by flipping the value of one Boolean variable from false to true or vice-versa. A commonly used scoring function measures the change on the number of falsified clauses by flipping a variable. Thus, operation op is \({\texttt{flip}}(b)\) for some Boolean variable b, and \({\texttt{cost}}(\alpha )\) is interpreted as the number of falsified clauses under the assignment \(\alpha \).

Actually, it only makes sense to execute an operation op when \({\texttt{score}}(op,\alpha )\) is positive, since such an operation guides the current assignment to an assignment with a lower cost of being a solution.

Definition 4 (Decreasing Operation)

Suppose \(\alpha \) is the current assignment. Given a scoring function \({\texttt{score}}\), an operation op is a decreasing operation under \(\alpha \) if \({\texttt{score}}(op,\alpha )>0\).

A general local search framework is described in Algorithm 1. The framework was used in GSAT [27] for solving SAT problems. Note that if the input formula F is satisfiable, Algorithm 1 outputs either (i) a solution of F if a solution is found successfully, or (ii) “unknown” if the algorithm fails.

Algorithm 1. A general local search framework.
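As a concrete instance of the framework, the sketch below applies it to SAT with the flip operation and the cost function of Example 1 (the number of falsified clauses). The clause encoding and the greedy tie-breaking are our own assumptions for illustration, not the GSAT implementation:

```python
import random

def local_search(clauses, n_vars, max_steps=1000, seed=0):
    """General framework: start from a complete assignment; while a best
    decreasing operation exists, execute it; return a solution or 'unknown'.
    Operations are Boolean flips; cost(a) = number of falsified clauses."""
    rng = random.Random(seed)
    a = [rng.choice([False, True]) for _ in range(n_vars)]

    def cost(assign):
        # a literal is (variable index, polarity); a clause is falsified
        # when no literal in it is satisfied by the assignment
        return sum(1 for cl in clauses
                   if not any(assign[v] == pol for v, pol in cl))

    for _ in range(max_steps):
        if cost(a) == 0:
            return a
        # score(flip(i), a) = cost(a) - cost(a with variable i flipped)
        scored = []
        for i in range(n_vars):
            a[i] = not a[i]
            scored.append((cost(a), i))
            a[i] = not a[i]
        best_cost, best_i = min(scored)
        if best_cost >= cost(a):      # no decreasing operation: give up
            return "unknown"
        a[best_i] = not a[best_i]
    return "unknown"

# (x1 or ~x2) and (x2 or x3) and (~x1 or ~x3)
cnf = [[(0, True), (1, False)], [(1, True), (2, True)], [(0, False), (2, False)]]
print(local_search(cnf, 3))
```

Returning "unknown" when no decreasing operation exists corresponds to the framework's failure case; practical solvers add restarts or random walks at that point.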

3 The Search Space of SMT(NRA)

The search space for SAT problems consists of finitely many assignments. So, theoretically speaking, a local search algorithm can eventually find a solution, as long as the formula indeed has a solution and there is no cycling during the search. At first glance, however, the search space of an SMT(NRA) problem, namely \(\mathbb {R}^{n}\), is infinite, and thus such search algorithms may not work.

Fortunately, due to Tarski’s work and the theory of CAD, SMT(NRA) is decidable. Given a polynomial formula in n variables, by the theory of CAD, \(\mathbb {R}^{n}\) is decomposed into finitely many cells such that every polynomial in the formula is sign-invariant on each cell. Therefore, the search space of the problem is essentially finite. The cells of SMT(NRA) are very similar to the Boolean assignments of SAT, so just like traversing all Boolean assignments in SAT, there exists a basic strategy to traverse all cells.

In this section, we describe the search space of SMT(NRA) based on the CAD theory from a local search perspective, providing a theoretical foundation for the operators and heuristics we will propose in the next sections.

Example 2

Consider the polynomial formula

$$\begin{aligned} F\;=\;(f_1> 0 \vee f_2 > 0) \wedge (f_1< 0 \vee f_2 < 0), \end{aligned}$$

where \(f_1 = 17x^2 + 2xy + 17y^2 + 48x - 48y\) and \(f_2=17x^2 - 2xy + 17y^2 - 48x - 48y.\)

The solution set of F is shown as the shaded area in Fig. 1. Notice that \({\texttt{poly}}(F)\) consists of two polynomials and decomposes \(\mathbb {R}^2\) into 10 areas: \(C_1,\ldots ,C_{10}\) (see Fig. 2). We refer to these areas as cells.
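A quick numeric check illustrates the sign-invariance that defines these cells; the three probe points below are our own choices (signs only; connectivity of the regions is not computed):

```python
# Signs of (f1, f2) from Example 2: all points of one cell share a sign vector.
def f1(x, y): return 17*x*x + 2*x*y + 17*y*y + 48*x - 48*y
def f2(x, y): return 17*x*x - 2*x*y + 17*y*y - 48*x - 48*y
def sign(v):  return (v > 0) - (v < 0)

for pt in [(1.5, 1.5), (-1.5, 1.5), (0.0, 5.0)]:
    print(pt, (sign(f1(*pt)), sign(f2(*pt))))
```

The first two points have sign vectors \((+,-)\) and \((-,+)\) and hence satisfy F, while the third has \((+,+)\) and falsifies the second clause.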

Fig. 1. The solution set of F in Example 2.

Fig. 2. The zero level set of \({\texttt{poly}}(F)\) decomposes \(\mathbb {R}^2\) into 10 cells.

Definition 5 (Cell)

For any finite set \(Q\subseteq \mathbb {R}[\bar{\boldsymbol{x}}]\), a cell of Q is a maximally connected set in \(\mathbb {R}^{n}\) on which the sign of every polynomial in Q is constant. For any point \(\bar{a}\in \mathbb {R}^{n}\), we denote by \({\texttt{cell}(Q,\bar{a})}\) the cell of Q containing \(\bar{a}\).

By the theory of CAD, we have

Corollary 1

For any finite set \(Q\subseteq \mathbb {R}[\bar{\boldsymbol{x}}]\), the number of cells of Q is finite.

It is obvious that any two cells of Q are disjoint and the union of all cells of Q equals \(\mathbb {R}^{n}\). Definition 5 implies that for a polynomial formula F with \({\texttt{poly}}(F)=Q\), the truth value of F is constant on every cell of Q, that is, either all the points in a cell are solutions to F or none of them are.

Example 3

Consider the polynomial formula F in Example 2. As shown in Fig. 3, assume that we start from point a to search for a solution to F. Jumping from a to b makes no difference, as both points are in the same cell and thus neither is a solution to F. However, jumping from a to c or from a to d crosses different cells, and we may discover a cell satisfying F. Here, the cell containing d satisfies F.

Fig. 3. Jumping from point a to search for a solution of F.

Fig. 4. A cylindrical expansion of a cylindrically complete set containing \({\texttt{poly}}(F)\).

In the remainder of this section, we demonstrate how to traverse all cells through point jumps between cells. The method of traversing the cells one by one, in a variable-by-variable direction, is explained step by step through Definitions 6 to 8.

Definition 6 (Expansion)

Let \(Q\subseteq \mathbb {R}[\bar{\boldsymbol{x}}]\) be finite and \(\bar{a}=(a_1,\ldots ,a_n)\in \mathbb {R}^n\). Given a variable \(x_i~(1\le i\le n)\), let \(r_1<\cdots <r_s\) be all real roots of \(\{q(a_1,\ldots ,a_{i-1},x_i,a_{i+1},\ldots ,a_n)\mid q(a_1,\ldots ,a_{i-1},x_i,a_{i+1},\ldots ,a_n)\not \equiv 0,\ q\in Q\}\), where \(s\in \mathbb {Z}_{\ge 0}\). An expansion of \(\bar{a}\) to \(x_i\) on Q is a point set \(\varLambda \subseteq \mathbb {R}^n\) satisfying

  (a) \(\bar{a}\in \varLambda \) and \((a_1,\ldots ,a_{i-1},r_j,a_{i+1},\ldots ,a_n)\in \varLambda \) for \(1\le j\le s\),

  (b) for any \(\bar{b}=(b_1,\ldots ,b_n)\in \varLambda \), \(b_j=a_j\) for \(j\in \{1,\ldots ,n\}\setminus \{i\}\), and

  (c) for any interval \(I\in \{(-\infty ,r_1),(r_1,r_2),\ldots ,(r_{s-1},r_s),(r_s,+\infty )\}\), there exists a unique \(\bar{b}=(b_1,\ldots ,b_n)\in \varLambda \) such that \(b_i\in I\).

For any point set \(\{\bar{a}^{(1)},\ldots ,\bar{a}^{(m)}\}\subseteq \mathbb {R}^{n}\), an expansion of the set to \(x_i\) on Q is \(\bigcup _{j=1}^m\varLambda _j\), where \(\varLambda _j\) is an expansion of \(\bar{a}^{(j)}\) to \(x_i\) on Q.

Example 4

Consider the polynomial formula F in Example 2. The set of black solid points in Fig. 3, denoted as \(\varLambda \), is an expansion of point (0, 0) to x on \({\texttt{poly}}(F)\). The set of all points (including black solid points and hollow points) is an expansion of \(\varLambda \) to y on \({\texttt{poly}}(F)\).

As shown in Fig. 3, an expansion of a point to some variable is the result of the point repeatedly jumping to adjacent cells along that variable's direction. Next, we describe the expansion to all variables in order, which is the result of jumping from cell to cell along the directions of the variables w.r.t. a variable order.
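As a small sketch of Definition 6, the expansion of the point (0, 0) to x on \({\texttt{poly}}(F)\) from Example 4 can be computed as follows. We use floating-point arithmetic and the quadratic formula for brevity (the substituted polynomials \(f_1(x,0)=17x^2+48x\) and \(f_2(x,0)=17x^2-48x\) are quadratics), whereas the paper works with exact rational endpoints:

```python
import math

def quad_roots(a, b, c):
    """Real roots of a*x^2 + b*x + c, sorted."""
    d = b*b - 4*a*c
    if d < 0:
        return []
    s = math.sqrt(d)
    return sorted([(-b - s) / (2*a), (-b + s) / (2*a)])

# Roots of f1(x,0) = 17x^2 + 48x and f2(x,0) = 17x^2 - 48x
roots = sorted(set(quad_roots(17, 48, 0) + quad_roots(17, -48, 0)))
# one representative per open interval, plus the root points and (0,0) itself
reps = [roots[0] - 1] + [(u + v) / 2 for u, v in zip(roots, roots[1:])] \
       + [roots[-1] + 1]
expansion = sorted(set(roots + reps + [0.0]))
print(expansion)   # 7 x-values: 3 roots and 4 interval representatives
```

These seven x-values correspond to the black solid points of Fig. 3 (one point per cell crossed by the line y = 0, plus the boundary points).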

Definition 7 (Cylindrical Expansion)

Let \(Q\subseteq \mathbb {R}[\bar{\boldsymbol{x}}]\) be finite and \(\bar{a}\in \mathbb {R}^n\). Given a variable order \(x_1\prec \cdots \prec x_n\), a cylindrical expansion of \(\bar{a}\) w.r.t. the variable order on Q is \(\bigcup _{i=1}^n \varLambda _i\), where \(\varLambda _1\) is an expansion of \(\bar{a}\) to \(x_1\) on Q, and for \(2\le i\le n\), \(\varLambda _{i}\) is an expansion of \(\varLambda _{i-1}\) to \(x_{i}\) on Q. When the context is clear, we simply call \(\bigcup _{i=1}^n \varLambda _i\) a cylindrical expansion of Q.

Example 5

Consider the formula F in Example 2. The set of all points in Fig. 3 is a cylindrical expansion of the point (0, 0) w.r.t. \(x\prec y\) on \({\texttt{poly}}(F)\). The expansion describes the following jumping process: first, the origin (0, 0) jumps along the x-axis direction to the black points; then those black points jump along the y-axis direction to the hollow points.

Clearly, a cylindrical expansion is similar to how a Boolean vector is flipped variable by variable. Note that the points in the expansion in Fig. 3 do not cover all the cells (e.g. \(C_7\) and \(C_8\) in Fig. 2), but if we start from (0, 2), all the cells can be covered. This implies that whether all the cells can be covered depends on the starting point.

Definition 8 (Cylindrically Complete)

Let \(Q\subseteq \mathbb {R}[\bar{\boldsymbol{x}}]\) be finite. Given a variable order \(x_1\prec \cdots \prec x_n\), Q is said to be cylindrically complete w.r.t. the variable order, if for any \(\bar{a}\in \mathbb {R}^n\) and for any cylindrical expansion \(\varLambda \) of \(\bar{a}\) w.r.t. the order on Q, every cell of Q contains at least one point in \(\varLambda \).

Theorem 1

For any finite set \(Q\subseteq \mathbb {R}[\bar{\boldsymbol{x}}]\) and any variable order, there exists \(Q'\) such that \(Q\subseteq Q'\subseteq \mathbb {R}[\bar{\boldsymbol{x}}]\) and \(Q'\) is cylindrically complete w.r.t. the variable order.

Proof

Let \(Q'\) be the projection set of Q [6, 13, 26] obtained from the CAD projection operator w.r.t. the variable order. According to the theory of CAD, \(Q'\) is cylindrically complete.\(\square \)

Corollary 2

For any polynomial formula F and any variable order, there exists a finite set \(Q\subseteq \mathbb {R}[\bar{\boldsymbol{x}}]\) such that for any cylindrical expansion \(\varLambda \) of Q, every cell of \({\texttt{poly}}(F)\) contains at least one point in \(\varLambda \). Furthermore, F is satisfiable if and only if F has solutions in \(\varLambda \).

Example 6

Consider the polynomial formula F in Example 2. By the proof of Theorem 1, \(Q':=\{ x, -2 - 3 x + x^2,\; -2 + 3 x + x^2,\; 10944 + 17 x^2,\; f_1,\; f_2\}\) is a cylindrically complete set w.r.t. \(x\prec y\) containing \({\texttt{poly}}(F)\). As shown in Fig. 4, the set of all (hollow) points is a cylindrical expansion of point (0, 0) w.r.t. \(x \prec y\) on \(Q'\), which covers all cells of \({\texttt{poly}}(F)\).
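The construction in Example 6 can be replayed programmatically. The sketch below is our own illustration: it uses floating-point root finding and exploits the fact that every polynomial in \(Q'\) has degree at most two in each variable. It computes a cylindrical expansion of (0, 0) w.r.t. \(x\prec y\) on \(Q'\) and checks that the expansion reaches a cell satisfying F:

```python
import math

# A polynomial in (x, y) is a dict {(i, j): coeff} for coeff * x^i * y^j.
f1 = {(2,0):17, (1,1):2, (0,2):17, (1,0):48, (0,1):-48}
f2 = {(2,0):17, (1,1):-2, (0,2):17, (1,0):-48, (0,1):-48}
Qp = [                                 # Q' from Example 6
    {(1,0):1},                         # x
    {(2,0):1, (1,0):-3, (0,0):-2},     # -2 - 3x + x^2
    {(2,0):1, (1,0):3, (0,0):-2},      # -2 + 3x + x^2
    {(2,0):17, (0,0):10944},           # 10944 + 17x^2 (no real roots)
    f1, f2,
]

def univariate_coeffs(p, var, point):
    """Substitute the other variable of `point` into p; dense coeffs in `var`."""
    coeffs = {}
    for (i, j), c in p.items():
        term = c
        for k in (0, 1):
            if k != var:
                term *= point[k] ** (i, j)[k]
        coeffs[(i, j)[var]] = coeffs.get((i, j)[var], 0.0) + term
    return [coeffs.get(d, 0.0) for d in range(max(coeffs) + 1)]

def real_roots(coeffs):
    """Real roots of a polynomial of degree <= 2 (all we need for Q')."""
    while len(coeffs) > 1 and abs(coeffs[-1]) < 1e-12:
        coeffs = coeffs[:-1]
    if len(coeffs) == 2:
        return [-coeffs[0] / coeffs[1]]
    if len(coeffs) == 3:
        c0, c1, c2 = coeffs
        d = c1*c1 - 4*c2*c0
        if d < 0:
            return []
        s = math.sqrt(d)
        return sorted([(-c1 - s) / (2*c2), (-c1 + s) / (2*c2)])
    return []

def expand(points, var):
    """Expansion of a point set to `var` on Q' (Definition 6): keep each
    point, add the root points, and one point per open interval."""
    result = []
    for pt in points:
        roots = sorted(set(r for q in Qp
                           for r in real_roots(univariate_coeffs(q, var, pt))))
        samples = {pt[var]} | set(roots)
        if roots:
            samples.add(roots[0] - 1)
            samples.add(roots[-1] + 1)
            samples.update((u + v) / 2 for u, v in zip(roots, roots[1:]))
        for v in samples:
            result.append((v, pt[1]) if var == 0 else (pt[0], v))
    return result

def evalp(p, pt):
    return sum(c * pt[0]**i * pt[1]**j for (i, j), c in p.items())

def F(pt):
    v1, v2 = evalp(f1, pt), evalp(f2, pt)
    return (v1 > 0 or v2 > 0) and (v1 < 0 or v2 < 0)

# Cylindrical expansion of (0, 0) w.r.t. x ≺ y on Q'.
expansion = expand(expand([(0.0, 0.0)], 0), 1)
solutions = [pt for pt in expansion if F(pt)]
print(len(expansion), len(solutions) > 0)
```

Because \(Q'\) is cylindrically complete, the expansion visits every cell of \({\texttt{poly}}(F)\), so it necessarily hits the shaded solution region of Fig. 1.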

Corollary 2 shows that for a polynomial formula F, there exists a finite set \(Q\subseteq \mathbb {R}[\bar{\boldsymbol{x}}]\) such that we can traverse all the cells of \({\texttt{poly}}(F)\) through a search path containing all points in a cylindrical expansion of Q. However, the cost of traversing all the cells is very high: in the worst case, the number of cells grows doubly exponentially with the number of variables.

The key to building a local search on SMT(NRA) is to construct a heuristic search based on the operation of jumping between cells.

4 The Cell-Jump Operation

In this section, we propose a novel operation, called cell-jump, that performs local changes in our algorithm. The operation is determined by means of real root isolation. We review the method of real root isolation and define sample points in Sect. 4.1. Sections 4.2 and 4.3 present a cell-jump operation along a line parallel to a coordinate axis and along any fixed straight line, respectively.

4.1 Sample Points

Real root isolation is a symbolic way to compute the real roots of a polynomial, and it is of fundamental importance in computational real algebraic geometry (e.g., it is a routine sub-algorithm of CAD). There are many efficient algorithms for isolating the real roots of polynomials, as well as popular tools in computer algebra systems such as Maple and Mathematica.

We first introduce the definition of sequences of isolating intervals for nonzero univariate polynomials, which can be obtained by any real root isolation tool, e.g. CLPoly.

Definition 9 (Sequence of Isolating Intervals)

For any nonzero univariate polynomial \(p(x)\in \mathbb {Q}[x]\), a sequence of isolating intervals of p(x) is a sequence of open intervals \((a_1,b_1),\ldots ,(a_s,b_s)\) where \(s\in \mathbb {Z}_{\ge 0}\), such that

  (i) for each \(i~(1\le i\le s)\), \(a_i,b_i\in \mathbb {Q}\) and \(a_i< b_i\), and \(b_i< a_{i+1}\) for \(1\le i\le s-1\),

  (ii) each interval \((a_i,b_i)~(1\le i\le s)\) contains exactly one real root of p(x), and

  (iii) every real root of p(x) lies in \(\bigcup _{i=1}^{s}(a_i,b_i)\).

In particular, the sequence of isolating intervals is empty, i.e., \(s=0\), when p(x) has no real roots.

By means of sequences of isolating intervals, we define sample points of univariate polynomials, which is the key concept of the cell-jump operation proposed in Sect. 4.2 and Sect. 4.3.

Definition 10 (Sample Point)

For any nonzero univariate polynomial \(p(x)\in \mathbb {Q}[x]\), let \((a_1,b_1),\ldots ,(a_s,b_s)\) be a sequence of isolating intervals of p(x) where \(s\in \mathbb {Z}_{\ge 0}\). Every point in the set \(\{a_1,b_s\}\cup \bigcup _{i=1}^{s-1}\{b_i,\frac{b_i+a_{i+1}}{2},a_{i+1}\}\) is a sample point of p(x). If \(x^{*}\) is a sample point of p(x) and \(p(x^{*})>0\) (or \(p(x^{*})<0\)), then \(x^{*}\) is a positive sample point (or negative sample point) of p(x). The zero polynomial has no sample points, and hence no positive or negative sample points.

Remark 1

For any nonzero univariate polynomial p(x) that has real roots, let \(r_1,\ldots ,r_s\;(s\in \mathbb {Z}_{\ge 1})\) be all distinct real roots of p(x). Obviously, the sign of p(x) is constant (either positive or negative) on each interval I of the set \(\{(-\infty ,r_1),(r_1,r_2),\ldots ,(r_{s-1},r_s),(r_s,+\infty )\}\). So, we only need to take a point \(x^{*}\) from the interval I, and then the sign of \(p(x^*)\) is the constant sign of p(x) on I. Specifically, we take \(a_1\) as the sample point for the interval \((-\infty ,r_1)\), \(b_i,\frac{b_i+a_{i+1}}{2}~\text {or}~a_{i+1}\) as a sample point for \((r_i,r_{i+1})\) where \(1\le i\le s-1\), and \(b_s\) as the sample point for \((r_s,+\infty )\). By Definition 10, neither the zero polynomial nor a univariate polynomial without real roots has any sample points.

Example 7

Consider the polynomial \(p(x)=x^8 - 4x^6 + 6x^4 - 4x^2 + 1=(x^2-1)^4\). It has two real roots, \(-1\) and 1, and a sequence of isolating intervals of it is \((-\frac{215}{128}, -\frac{19}{32})\), \((\frac{19}{32}, \frac{215}{128})\). Every point in the set \(\{-\frac{215}{128},-\frac{19}{32},0,\frac{19}{32},\frac{215}{128}\}\) is a sample point of p(x). Note that \(p(x)>0\) holds everywhere except at the two roots \(\pm 1\). Thus, all five sample points are positive sample points of p(x), and p(x) has no negative sample points.
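Definition 10 is easy to mechanize. The sketch below is our own illustration using exact rational arithmetic: it recovers the sample points of the polynomial above from its isolating intervals and confirms that all of them are positive sample points, since \(p(x)=(x^2-1)^4\) is positive away from its roots:

```python
from fractions import Fraction as Q

def sample_points(intervals):
    """Sample points per Definition 10:
    {a1, bs} ∪ {bi, (bi + a_{i+1})/2, a_{i+1} : 1 <= i <= s-1}."""
    if not intervals:
        return []
    pts = {intervals[0][0], intervals[-1][1]}
    for (a, b), (a2, b2) in zip(intervals, intervals[1:]):
        pts.update({b, (b + a2) / 2, a2})
    return sorted(pts)

def p(x):  # Example 7: p(x) = x^8 - 4x^6 + 6x^4 - 4x^2 + 1 = (x^2 - 1)^4
    return x**8 - 4*x**6 + 6*x**4 - 4*x**2 + 1

ivs = [(Q(-215, 128), Q(-19, 32)), (Q(19, 32), Q(215, 128))]
pts = sample_points(ivs)
print(pts)                       # the five sample points of Example 7
print([p(t) > 0 for t in pts])   # → [True, True, True, True, True]
```

Since `Fraction` endpoints stay exact under averaging, the computed midpoint between the two intervals is exactly 0, matching the sample point set of Example 7.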

4.2 Cell-Jump Along a Line Parallel to a Coordinate Axis

The critical move operation [8, Definition 2] is a literal-level operation. For any false LIA literal, the operation changes the value of one variable in it to make the literal true. In this subsection, we propose a similar operation which adjusts the value of one variable in a false atomic polynomial formula with relational operator ‘<’ or ‘>’.

Definition 11

Suppose the current assignment is \(\alpha :x_1\mapsto a_1,\ldots , x_n\mapsto a_n\) where \(a_i\in \mathbb {Q}\). Let \(\ell \) be a false atomic polynomial formula under \(\alpha \) with a relational operator ‘<’ or ‘>’.

  (i) Suppose \(\ell \) is \(p(\bar{\boldsymbol{x}})<0\). For each variable \(x_i\) such that the univariate polynomial \(p(a_1,\ldots ,a_{i-1},x_i,a_{i+1},\ldots ,a_n)\) has negative sample points, there exists a cell-jump operation, denoted as \({\texttt{cjump}}(x_i,\ell )\), assigning \(x_i\) to a negative sample point closest to \(a_i\).

  (ii) Suppose \(\ell \) is \(p(\bar{\boldsymbol{x}})>0\). For each variable \(x_i\) such that the univariate polynomial \(p(a_1,\ldots ,a_{i-1},x_i,a_{i+1},\ldots ,a_n)\) has positive sample points, there exists a cell-jump operation, denoted as \({\texttt{cjump}}(x_i,\ell )\), assigning \(x_i\) to a positive sample point closest to \(a_i\).

Every assignment in the search space can be viewed as a point in \(\mathbb {R}^n\). Then, performing a \({\texttt{cjump}}(x_i,\ell )\) operation is equivalent to moving one step from the current point \(\alpha (\bar{\boldsymbol{x}})\) along the line \((a_1,\ldots ,a_{i-1},\mathbb {R},a_{i+1},\ldots ,a_n)\). Since the line is parallel to the \(x_i\)-axis, we call \({\texttt{cjump}}(x_i,\ell )\) a cell-jump along a line parallel to a coordinate axis.

Theorem 2

Suppose the current assignment is \(\alpha :x_1\mapsto a_1,\ldots , x_n\mapsto a_n\) where \(a_i\in \mathbb {Q}\). Let \(\ell \) be a false atomic polynomial formula under \(\alpha \) with a relational operator ‘<’ or ‘>’. For every \(i~(1\le i\le n)\), there exists a solution of \(\ell \) in \(\{\alpha '\mid \alpha '(\bar{\boldsymbol{x}})\in (a_1,\ldots ,a_{i-1},\mathbb {R},a_{i+1},\ldots ,a_n)\}\) if and only if there exists a \({\texttt{cjump}}(x_i,\ell )\) operation.

Proof

\(\Leftarrow \) It is clear by the definition of negative (or positive) sample points.

\(\Rightarrow \) Let \(S:=\{\alpha '\mid \alpha '(\bar{\boldsymbol{x}})\in (a_1,\ldots ,a_{i-1},\mathbb {R},a_{i+1},\ldots ,a_n)\}\). It is equivalent to prove that if there exists no \({\texttt{cjump}}(x_i,\ell )\) operation, then no solution to \(\ell \) exists in S. We only prove it for \(\ell \) of the form \(p(\bar{\boldsymbol{x}})<0\). Recall Definition 10 and Remark 1. Let \(p^{*}\) denote the polynomial \(p(a_1,\ldots ,a_{i-1},x_i,a_{i+1},\ldots ,a_n)\). There are only three cases in which \({\texttt{cjump}}(x_i,\ell )\) does not exist: (1) \(p^{*}\) is the zero polynomial, (2) \(p^{*}\) has no real roots, (3) \(p^{*}\) has a finite number of real roots, say \(r_1,\ldots ,r_s~(s\in \mathbb {Z}_{\ge 1})\), and \(p^{*}\) is positive on \(\mathbb {R}\setminus \{r_1,\ldots ,r_s\}\). In the first case, \(p(\alpha '(\bar{\boldsymbol{x}}))=0\) and in the third case, \(p(\alpha '(\bar{\boldsymbol{x}}))\ge 0\) for any assignment \(\alpha '\in S\). In the second case, the sign of \(p^{*}\) is constant on the whole real axis. Since \(\ell \) is false under \(\alpha \), we have \(p(\alpha (\bar{\boldsymbol{x}}))\ge 0\), that is, \(p^{*}(a_i)\ge 0\); as \(p^{*}\) has no real roots, in fact \(p^{*}(a_i)>0\). So, \(p^{*}(x_i)>0\) for any \(x_i\in \mathbb {R}\), which means \(p(\alpha '(\bar{\boldsymbol{x}}))>0\) for any \(\alpha '\in S\). Therefore, no solution to \(\ell \) exists in S in any of the three cases. That completes the proof.\(\square \)

The above theorem shows that if \({\texttt{cjump}}(x_i,\ell )\) does not exist, then there is no need to search for a solution to \(\ell \) along the line \((a_1,\ldots ,a_{i-1},\mathbb {R},a_{i+1},\ldots ,a_n)\). Moreover, executing a \({\texttt{cjump}}(x_i,\ell )\) operation always yields a solution to \(\ell \).

Example 8

Assume the current assignment is \(\alpha :x_1\mapsto 1,\;x_2\mapsto 1\). Consider two false atomic polynomial formulas \(\ell _1:2x_1^2+2x_2^2-1<0\) and \(\ell _2:x_1^8x_2^3 - 4x_1^6 + 6x_1^4x_2 - 4x_1^2 + x_2>0\). Let \(p_1:={\texttt{poly}}(\ell _1)\) and \(p_2:={\texttt{poly}}(\ell _2)\).

We first consider \({\texttt{cjump}}(x_i,\ell _1)\). For the variable \(x_1\), the corresponding univariate polynomial is \(p_1(x_1,1)=2x_1^2+1\), and for \(x_2\), the corresponding one is \(p_1(1,x_2)=2x_2^2+1\). Neither of them has real roots, and thus there exists no \({\texttt{cjump}}(x_1,\ell _1)\) operation and no \({\texttt{cjump}}(x_2,\ell _1)\) operation for \(\ell _1\). Applying Theorem 2, we know a solution of \(\ell _1\) can only lie in \(\mathbb {R}^{2}\setminus ((1,\mathbb {R})\cup (\mathbb {R},1))\) (also see Fig. 5 (a)). So, we cannot find a solution of \(\ell _1\) through a one-step cell-jump from the assignment point (1, 1) along the lines \((1,\mathbb {R})\) and \((\mathbb {R},1)\).

Then consider \({\texttt{cjump}}(x_i,\ell _2)\). For the variable \(x_1\), the corresponding univariate polynomial is \(p_2(x_1,1)=x_1^8 - 4x_1^6 + 6x_1^4 - 4x_1^2 + 1=(x_1^2-1)^4\). Recall Example 7. Since \((x_1^2-1)^4\) is positive everywhere except at its roots \(\pm 1\), every sample point of \(p_2(x_1,1)\) is a positive sample point, and \(\frac{19}{32}\) is the one closest to \(\alpha (x_1)=1\). So, \({\texttt{cjump}}(x_1,\ell _2)\) assigns \(x_1\) to \(\frac{19}{32}\). After executing \({\texttt{cjump}}(x_1,\ell _2)\), the assignment becomes \(\alpha ':x_1\mapsto \frac{19}{32},\;x_2\mapsto 1\), which is a solution of \(\ell _2\). For the variable \(x_2\), the corresponding polynomial is \(p_2(1,x_2)=x_2^3 + 7x_2 - 8\), which has one real root 1. A sequence of isolating intervals of \(p_2(1,x_2)\) is \((\frac{19}{32},\frac{215}{128})\), and \(\frac{215}{128}\) is its only positive sample point. So, \({\texttt{cjump}}(x_2,\ell _2)\) assigns \(x_2\) to \(\frac{215}{128}\), and the assignment becomes \(\alpha '':x_1\mapsto 1,\;x_2\mapsto \frac{215}{128}\), which is another solution of \(\ell _2\).
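The selection step of \({\texttt{cjump}}(x_2,\ell _2)\) can be reproduced in a few lines. The helper below is a hypothetical sketch of the selection only; it takes the sample points as given rather than computing the isolating intervals:

```python
from fractions import Fraction as Q

def cjump_axis(sample_pts, pred, current):
    """cjump(x_i, l): among the sample points making the literal true,
    pick the one closest to the current value a_i; None if none exists."""
    good = [t for t in sample_pts if pred(t)]
    return min(good, key=lambda t: abs(t - current)) if good else None

# l2 with x1 fixed to 1: p2(1, x2) = x2^3 + 7*x2 - 8 > 0
p = lambda y: y**3 + 7*y - 8
pts = [Q(19, 32), Q(215, 128)]    # sample points of its isolating interval
new_x2 = cjump_axis(pts, lambda t: p(t) > 0, Q(1))
print(new_x2)                     # 215/128, as in the example
```

The same helper covers the ‘<’ case by passing the predicate `p(t) < 0`.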

4.3 Cell-Jump Along a Fixed Straight Line

Given the current assignment \(\alpha \) such that \(\alpha (\bar{\boldsymbol{x}})=(a_1,\ldots ,a_n)\in \mathbb {Q}^{n}\), a false atomic polynomial formula \(\ell \) of the form \(p(\bar{\boldsymbol{x}})>0\) or \(p(\bar{\boldsymbol{x}})<0\) and a vector \(dir=(d_1,\ldots ,d_n)\in \mathbb {Q}^{n}\), we propose Algorithm 2 to find a cell-jump operation along the straight line L specified by the point \(\alpha (\bar{\boldsymbol{x}})\) and the direction dir, denoted as \({\texttt{cjump}}(dir,\ell )\).

In order to analyze the values of \(p(\bar{\boldsymbol{x}})\) on line L, we introduce a new variable t and replace every \(x_i\) in \(p(\bar{\boldsymbol{x}})\) with \(a_{i}+d_{i}t\) to get \(p^{*}(t)\). If \({\texttt{rela}}(\ell )=\)‘<’ and \(p^{*}(t)\) has negative sample points, there exists a \({\texttt{cjump}}(dir,\ell )\) operation. Let \(t^{*}\) be a negative sample point of \(p^{*}(t)\) closest to 0. The assignment becomes \(\alpha ':x_1\mapsto a_{1}+d_{1}t^{*},\ldots ,x_n\mapsto a_{n}+d_{n}t^{*}\) after executing the operation \({\texttt{cjump}}(dir,\ell )\). It is obvious that \(\alpha '\) is a solution to \(\ell \). If \({\texttt{rela}}(\ell )=\)‘>’ and \(p^{*}(t)\) has positive sample points, the situation is similar. Otherwise, \(\ell \) has no cell-jump operation along line L.
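As a rough illustration of \({\texttt{cjump}}(dir,\ell )\), the Python sketch below substitutes the parametric line into \(p\) and walks outward from \(t=0\). Note that the actual algorithm isolates the real roots of \(p^{*}(t)\) exactly to obtain sample points; this naive grid search is a stand-in for illustration only, and all names are ours.

```python
from fractions import Fraction

def cjump_dir(p, point, direction, rel, step=Fraction(1, 64), tmax=4):
    """Sketch of cjump(dir, l): evaluate p along point + t*dir, walking
    outward from t = 0, and return the first t at which p satisfies rel.
    The paper uses exact real-root isolation of p*(t); this grid search
    is a simplified stand-in."""
    satisfied = (lambda v: v < 0) if rel == '<' else (lambda v: v > 0)
    t = step
    while t <= tmax:
        for s in (t, -t):  # try both sides, smallest |t| first
            pt = [a + d * s for a, d in zip(point, direction)]
            if satisfied(p(*pt)):
                return s, pt
        t += step
    return None  # no cell-jump along this line within the search range

# l1 : 2*x1^2 + 2*x2^2 - 1 < 0 from (1, 1) along dir = (1, 1)
p = lambda x1, x2: 2*x1**2 + 2*x2**2 - 1
t, pt = cjump_dir(p, [Fraction(1)] * 2, [Fraction(1)] * 2, '<')
assert p(*pt) < 0  # the jump lands inside the solution set
```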

Similarly, we have:

Theorem 3

Suppose the current assignment is \(\alpha :x_1\mapsto a_1,\ldots , x_n\mapsto a_n\) where \(a_i\in \mathbb {Q}\). Let \(\ell \) be a false atomic polynomial formula under \(\alpha \) with a relational operator ‘<’ or ‘>’, \(dir:=(d_1,\ldots ,d_n)\) a vector in \(\mathbb {Q}^{n}\) and \(L:=\{(a_1+d_1t,\ldots ,a_n+d_nt)\mid t\in \mathbb {R}\}\). There exists a solution of \(\ell \) in L if and only if there exists a \({\texttt{cjump}}(dir,\ell )\) operation.

Theorem 3 implies that through one-step cell-jump from the point \(\alpha (\bar{\boldsymbol{x}})\) along any line that intersects the solution set of \(\ell \), a solution to \(\ell \) will be found.

Example 9

Assume the current assignment is \(\alpha :x_1\mapsto 1,\;x_2\mapsto 1\). Consider the false atomic polynomial formula \(\ell _1:2x_1^2+2x_2^2-1<0\) from Example 8. Let \(p:={\texttt{poly}}(\ell _1)\). By Fig. 5 (b), the line \(L_3\) specified by the point \(\alpha (\bar{\boldsymbol{x}})\) and the direction vector \(dir=(1,1)\) intersects the solution set of \(\ell _1\). So, there exists a \({\texttt{cjump}}(dir,\ell _1)\) operation by Theorem 3. Notice that the line can be described in parametric form as \(\{(x_1,x_2)\mid x_1=1+t,\;x_2=1+t~\textrm{where}~t \in \mathbb {R}\}\). Then, analyzing the values of \(p(\bar{\boldsymbol{x}})\) on the line is equivalent to analyzing those of \(p^{*}(t)\) on the real axis, where \(p^{*}(t)=p(1+t,1+t)=4t^2 + 8t + 3\). A sequence of isolating intervals of \(p^{*}\) is \((-\frac{215}{128}, -\frac{75}{64})\), \((-\frac{19}{32}, -\frac{61}{128})\), and there are two negative sample points: \(-\frac{75}{64}\) and \(-\frac{19}{32}\). Since \(-\frac{19}{32}\) is the one closest to 0, the operation \({\texttt{cjump}}(dir,\ell _1)\) changes the assignment to \(\alpha ':x_1\mapsto \frac{13}{32},\;x_2\mapsto \frac{13}{32}\), which is a solution of \(\ell _1\). Again by Fig. 5, there are other lines (the dashed lines) that go through \(\alpha (\bar{\boldsymbol{x}})\) and intersect the solution set, so we can also find a solution to \(\ell _1\) along these lines. In fact, for any false atomic polynomial formula \(\ell \) with ‘<’ or ‘>’ that actually has a solution, there always exists some direction \(dir\in \mathbb {Q}^{n}\) such that \({\texttt{cjump}}(dir,\ell )\) finds one. Therefore, the more directions we try, the greater the probability of finding a solution of \(\ell \).

figure b
Fig. 5.
figure 5

The figure of the cell-jump operations along the lines \(L_1\), \(L_2\) and \(L_3\) for the false atomic polynomial formula \(\ell _1:2x_1^2+2x_2^2-1<0\) under the assignment \(\alpha :x_1\mapsto 1,x_2\mapsto 1\). The dashed circle denotes the circle \(2x_1^2+2x_2^2-1=0\) and the shaded part in it represents the solution set of the atom. The coordinate of point A is (1, 1). Lines \(L_1\), \(L_2\) and \(L_3\) pass through A and are parallel to the \(x_1\)-axis, the \(x_2\)-axis and the vector (1, 1), respectively.

Remark 2

For a false atomic polynomial formula \(\ell \) with ‘<’ or ‘>’, both \({\texttt{cjump}}(x_i,\ell )\) and \({\texttt{cjump}}(dir,\ell )\) move an assignment to a new assignment, and both assignments map to an element of \(\mathbb {Q}^{n}\). In fact, we can view \({\texttt{cjump}}(x_i,\ell )\) as a special case of \({\texttt{cjump}}(dir,\ell )\) where the i-th component of dir is 1 and all the other components are 0. The main difference between them is that \({\texttt{cjump}}(x_i,\ell )\) only changes the value of one variable while \({\texttt{cjump}}(dir,\ell )\) may change the values of several variables. The advantage of \({\texttt{cjump}}(x_i,\ell )\) is that it avoids the situation where some atoms can never become true because the values of many variables are adjusted together. However, performing \({\texttt{cjump}}(dir,\ell )\) is more efficient in some cases, since it may happen that a solution to \(\ell \) can be found through one step of \({\texttt{cjump}}(dir,\ell )\), but only through many steps of \({\texttt{cjump}}(x_i,\ell )\).

5 Scoring Functions

Scoring functions guide local search algorithms in picking an operation at each step. In this section, we introduce a scoring function which measures the difference between the distances to satisfaction under the assignments before and after performing an operation.

First, we define the distance to truth of an atomic polynomial formula.

Definition 12 (Distance to Truth)

Given the current assignment \(\alpha \) such that \(\alpha (\bar{\boldsymbol{x}})=(a_1,\ldots ,a_n)\in \mathbb {Q}^{n}\) and a positive parameter \(pp\in \mathbb {Q}_{>0}\), for an atomic polynomial formula \(\ell \) with \(p:={\texttt{poly}}(\ell )\), its distance to truth is

$$\begin{aligned} {\texttt{dtt}}(\ell ,\alpha ,pp):=\;{\left\{ \begin{array}{ll} 0, &{}\text {if } \alpha \text { is a solution to } \ell ,\\ |p(a_1,\ldots ,a_n)|+pp, &{}\text {otherwise}. \end{array}\right. } \end{aligned}$$

For an atomic polynomial formula \(\ell \), the parameter pp is introduced to guarantee that the distance to truth of \(\ell \) is 0 if and only if the current assignment \(\alpha \) is a solution of \(\ell \). Based on the definition of \({\texttt{dtt}}\), we use the method of [8, Definitions 3 and 4] to define the distance to satisfaction of a polynomial clause and the score of an operation, respectively.

Definition 13 (Distance to Satisfaction)

Given the current assignment \(\alpha \) and a parameter \(pp\in \mathbb {Q}_{>0}\), the distance to satisfaction of a polynomial clause c is \({\texttt{dts}}(c,\alpha ,pp):=\min _{\ell \in c}\{{\texttt{dtt}}(\ell ,\alpha ,pp)\}\).

Definition 14 (Score)

Given a polynomial formula F, the current assignment \(\alpha \) and a parameter \(pp\in \mathbb {Q}_{>0}\), the score of an operation op is defined as

$$\begin{aligned} {\texttt{score}}(op,\alpha ,pp):=\;\sum _{c\in F}({\texttt{dts}}(c,\alpha ,pp)-{\texttt{dts}}(c,\alpha ',pp))\cdot w(c), \end{aligned}$$

where w(c) denotes the weight of clause c, and \(\alpha '\) is the assignment after performing op.
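Definitions 12–14 translate directly into code. The sketch below is our own Python rendering: an atom is a pair of an evaluable polynomial and a relation, a clause is a list of atoms, and a formula is a list of clauses with a weight list; all names are ours.

```python
from fractions import Fraction

def holds(atom, assign):
    # An atom is (p, rel) with rel one of '<', '>'.
    p, rel = atom
    v = p(*assign)
    return v < 0 if rel == '<' else v > 0

def dtt(atom, assign, pp):
    """Definition 12: 0 if the atom holds, else |p(a)| + pp."""
    p, rel = atom
    return Fraction(0) if holds(atom, assign) else abs(p(*assign)) + pp

def dts(clause, assign, pp):
    """Definition 13: the minimum dtt over the clause's atoms."""
    return min(dtt(atom, assign, pp) for atom in clause)

def score(F, w, old, new, pp):
    """Definition 14: weighted sum of dts decreases over all clauses."""
    return sum((dts(c, old, pp) - dts(c, new, pp)) * w[i]
               for i, c in enumerate(F))

# l1 : 2*x1^2 + 2*x2^2 - 1 < 0 is false at (1, 1), so dtt = |3| + pp = 4
p1 = lambda x1, x2: 2*x1**2 + 2*x2**2 - 1
assert dtt((p1, '<'), (Fraction(1), Fraction(1)), Fraction(1)) == 4
```

For example, the cell-jump of Example 9 moves \((1,1)\) to \((\frac{13}{32},\frac{13}{32})\), and the score of that operation on the one-clause formula \(\{\ell _1\}\) with unit weight is \(4-0=4\).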

Note that the definition of the score involves the weights of clauses. In our algorithm, we employ the probabilistic version of the PAWS scheme [9, 32] to update clause weights. The initial weight of every clause is 1. Given a probability sp, the clause weights are updated as follows: with probability \(1-sp\), the weight of every falsified clause is increased by one; otherwise (with probability sp), the weight of every satisfied clause with weight greater than 1 is decreased by one.
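A minimal sketch of this probabilistic PAWS update, under our own data representation (clause weights in a list, falsified clauses given by their indices):

```python
import random

def paws_update(weights, falsified, sp, rng=random):
    """With probability 1 - sp, increase the weight of every falsified
    clause by one; with probability sp, decrease the weight of every
    satisfied clause whose weight exceeds 1."""
    if rng.random() >= sp:
        for i in falsified:
            weights[i] += 1
    else:
        for i in range(len(weights)):
            if i not in falsified and weights[i] > 1:
                weights[i] -= 1
    return weights
```

With `sp = 0` the update always increases falsified-clause weights; with `sp = 1` it always decays satisfied-clause weights toward 1.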

6 The Main Algorithm

Based on the proposed cell-jump operation (see Sect. 4) and scoring function (see Sect. 5), we develop a local search algorithm, called LS Algorithm, for solving satisfiability of polynomial formulas in this section. The algorithm is a refined extension of the general local search framework as described in Sect. 2.2, where we design a two-level operation selection. The section also explains the restart mechanism and an optimization strategy used in the algorithm.

Given a polynomial formula F such that every relational operator appearing in it is ‘<’ or ‘>’ and an initial assignment that maps to an element in \(\mathbb {Q}^{n}\), LS Algorithm (Algorithm 3) searches for a solution of F from the initial assignment, which has the following four steps:

(i):

If the terminal condition is satisfied, the algorithm terminates at once and returns “unknown”. Otherwise, test whether the current assignment is a solution. If it is, return it; if not, go to the next step.

(ii):

Try to find a decreasing cell-jump operation along a line parallel to a coordinate axis. We first need to check whether such an operation exists, that is, whether the set D is empty, where \(D=\{{\texttt{cjump}}(x_i,\ell )\mid \ell ~\mathrm{is~a~false~ atom},x_i~\mathrm{appears~in~ \ell }~\textrm{and}~{\texttt{cjump}}(x_i,\ell )~\mathrm{is~decreasing}\}\). If \(D=\emptyset \), go to the next step. Otherwise, we adopt the two-level heuristic in [8, Section 4.2]. The heuristic distinguishes a special subset \(S\subseteq D\) from the rest of D, where \(S=\{{\texttt{cjump}}(x_i,\ell )\in D\mid \ell ~\mathrm{appears~in~a~falsified~clause}\}\), and searches for an operation with the highest score from S. If it fails to find any operation from S (i.e. \(S=\emptyset \)), it searches for one with the highest score from \(D\setminus S\). Perform the found operation, update the assignment, and go to Step (i).

(iii):

Update clause weights according to the PAWS scheme.

(iv):

Generate some direction vectors and try to find a decreasing cell-jump operation along a line parallel to a generated vector. Since the algorithm has failed to find a decreasing cell-jump operation along any line parallel to a coordinate axis, we generate some new directions and search for a decreasing cell-jump operation along one of them. The candidate set of such operations is \(\{{\texttt{cjump}}(dir,\ell )\mid \ell ~\textrm{is a false atom},~dir~\textrm{is a generated direction}~ \textrm{and}~{\texttt{cjump}}(dir,\ell )~\textrm{is decreasing}\}.\) If the set is empty, the algorithm returns “unknown”. Otherwise, we use the two-level heuristic from Step (ii) again to choose an operation from the set. Perform the chosen operation, update the assignment, and go to Step (i).

We propose a two-level operation selection in LS Algorithm, which prefers an operation that changes the values of fewer variables. Concretely, only when there exists no decreasing \({\texttt{cjump}}(x_i,\ell )\) operation, which changes the value of a single variable, do we update clause weights and pick a \({\texttt{cjump}}(dir,\ell )\) operation, which may change the values of several variables. The strategy proves effective in experiments: it is observed that changing too many variables together at the beginning might make some atoms never become true.
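The two-level heuristic used in Steps (ii) and (iv) can be sketched as follows. This is a simplified illustration with caller-supplied predicates (our names, not the paper's); the paper's Algorithm 3 maintains the candidate sets differently, and here a positive score stands in for the operation being decreasing.

```python
def pick_operation(ops, in_falsified_clause, score):
    """Prefer operations whose atom occurs in a falsified clause (the
    subset S); only if S is empty consider the rest.  Among the
    candidates, return the highest-scoring one, provided it is
    decreasing (positive score)."""
    S = [op for op in ops if in_falsified_clause(op)]
    pool = S if S else ops
    best = max(pool, key=score, default=None)
    return best if best is not None and score(best) > 0 else None

# Toy usage: 'b' scores highest overall but lies outside S, so 'a' wins.
ops = ['a', 'b', 'c']
sc = {'a': 3, 'b': 5, 'c': 2}.get
assert pick_operation(ops, lambda op: op in ('a', 'c'), sc) == 'a'
```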

It remains to explain the restart mechanism and an optimization strategy.

Restart Mechanism. Given any initial assignment, LS Algorithm takes it as the starting point of the local search. If the algorithm returns “unknown”, we restart LS Algorithm with another initial assignment. A general local search framework, like Algorithm 1, searches for a solution from only one starting point, whereas the restart mechanism allows us to search from several starting points. Combining the restart mechanism with a local search procedure also aids global search, i.e., the search for a solution over the entire space.

We set the initial assignments for restarts as follows: For the first run, all variables are assigned 1. For the second, a variable \(x_i\) is assigned ub if there exists a clause \(x_i< ub\vee x_i=ub\), or lb if there exists a clause \(x_i> lb\vee x_i=lb\); otherwise, \(x_i\) is assigned 1. For the i-th run \((3\le i\le 7)\), every variable is assigned 1 or \(-1\) randomly. For the i-th run \((i\ge 8)\), every variable is assigned a random integer between \(-50(i-6)\) and \(50(i-6)\).
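The restart schedule above can be sketched as follows; the `bounds` lookup for the second run is our simplification of reading ub/lb off the corresponding clauses.

```python
import random

def initial_assignment(n, run, bounds=None, rng=random):
    """Initial assignment for the run-th start (run = 1 is the first
    search, run >= 2 are restarts).  `bounds` optionally maps a variable
    index to the bound ub or lb read off a clause x_i <= ub or
    x_i >= lb (our simplification)."""
    if run == 1:
        return [1] * n
    if run == 2:
        return [bounds[i] if bounds and i in bounds else 1 for i in range(n)]
    if run <= 7:
        return [rng.choice((1, -1)) for _ in range(n)]
    b = 50 * (run - 6)  # widening range: [-50(run-6), 50(run-6)]
    return [rng.randint(-b, b) for _ in range(n)]
```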

Forbidding Strategies. An inherent problem of the local search method is cycling, i.e., revisiting assignments. Cycling wastes time and prevents the search from escaping local minima. So, we employ a popular forbidding strategy, the tabu strategy [18], to deal with it. The tabu strategy forbids reversing recent changes and can be directly applied in LS Algorithm. Notice that every cell-jump operation increases or decreases the values of some variables. After executing an operation that increases (resp. decreases) the value of a variable, the tabu strategy forbids decreasing (resp. increasing) the value of that variable in the subsequent tt iterations, where \(tt\in \mathbb {Z}_{\ge 0}\) is a given parameter.
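A minimal sketch of the tabu bookkeeping, assuming each operation is summarized as a (variable index, direction) pair (our representation, not the paper's notation):

```python
def allowed(op, step, tabu_until):
    """An operation (i, d) changes variable i in direction d (+1 or -1).
    It is allowed unless a recent move has put this pair under tabu."""
    return step >= tabu_until.get(op, 0)

def record(op, step, tabu_until, tt):
    """After moving variable i in direction d, forbid the reverse move
    (i, -d) for the next tt iterations."""
    i, d = op
    tabu_until[(i, -d)] = step + tt + 1
```

For instance, after increasing \(x_1\) at iteration 0 with \(tt=10\), decreasing \(x_1\) stays forbidden through iteration 10 and is allowed again from iteration 11.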

figure c

Remark 3

If the input formula has equality constraints, then we need to define a cell-jump operation for a false atom of the form \(p(\bar{\boldsymbol{x}})=0\). Given the current assignment \(\alpha :x_1\mapsto a_1,\ldots , x_n\mapsto a_n~(a_i\in \mathbb {Q})\), the operation should assign some variable \(x_i\) to a real root of \(p(a_1,\ldots ,a_{i-1},x_i,a_{i+1},\ldots ,a_n)\), which may not be a rational number. Since it is time-consuming to isolate the real roots of a polynomial with algebraic coefficients, we must guarantee that all assignments remain rational during the search. Thus, we require that for every equality constraint \(p(\bar{\boldsymbol{x}})=0\) in the formula, there exists at least one variable such that the degree of p w.r.t. that variable is 1. Then, LS Algorithm also works for such a polynomial formula after some minor modifications: In Line 6 (or Line 9), for every atom \(\ell \in fal\_cl\) (or \(\ell \in sat\_cl\)) and for every variable \(x_i\), if \(\ell \) has the form \(p(\bar{\boldsymbol{x}})=0\), p is linear w.r.t. \(x_i\) and \(p(a_1,\ldots ,a_{i-1},x_i,a_{i+1},\ldots ,a_n)\) is not a constant polynomial, there is a candidate operation that changes the value of \(x_i\) to the (rational) solution of \(p(a_1,\ldots ,a_{i-1},x_i,a_{i+1},\ldots ,a_n)=0\); if \(\ell \) has the form \(p(\bar{\boldsymbol{x}})>0\) or \(p(\bar{\boldsymbol{x}})<0\), a candidate operation is \({\texttt{cjump}}(x_i,\ell )\). We perform a decreasing candidate operation with the highest score if such an operation exists, and update \(\alpha \) in Line 8 (or Line 11). In Line 15 (or Line 18), we only deal with inequality constraints from \(fal\_cl\) (or \(sat\_cl\)), and skip equality constraints.
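For an equality atom that is linear in \(x_i\), substituting the current values of the other variables leaves \(c_1x_i+c_0=0\), whose rational root is the cell-jump target. A small sketch, with the coefficient extraction left to a caller-supplied function (our interface, not the paper's):

```python
from fractions import Fraction

def linear_root(coeffs, i, assign):
    """Cell-jump target for an equality atom p = 0 linear in x_i:
    `coeffs(i, assign)` returns (c1, c0) of the substituted univariate
    polynomial c1*x_i + c0 (caller-supplied; our interface)."""
    c1, c0 = coeffs(i, assign)
    if c1 == 0:
        return None  # substituted polynomial is constant: no candidate
    return -Fraction(c0) / Fraction(c1)

# p = (x1 + 1)*x2 - 2 is linear in x2; at x1 = 1 the root is x2 = 1.
coeffs = lambda i, assign: (assign[0] + 1, -2)
assert linear_root(coeffs, 1, [Fraction(1)]) == 1
```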

7 Experiments

We carried out experiments to evaluate LS Algorithm on two classes of instances, one consisting of selected instances from SMT-LIB and the other generated randomly, and compared our tool with state-of-the-art SMT(NRA) solvers. Furthermore, we combined our tool with Z3, CVC5, Yices2 and MathSAT5 respectively to obtain four sequential portfolio solvers, which show better performance.

7.1 Experiment Preparation

Implementation: We implemented LS Algorithm in Maple 2022 as a tool, also named LS. There are three parameters in LS Algorithm: pp for computing the score of an operation, tt for the tabu strategy and sp for the PAWS scheme, which are set as \(pp=1\), \(tt=10\) and \(sp=0.003\). The direction vectors in LS Algorithm are generated as follows: Suppose the current assignment is \(x_1\mapsto a_1,\ldots ,x_n\mapsto a_n~(a_i\in \mathbb {Q})\) and the polynomial appearing in the atom under consideration is p. We generate 12 vectors. The first is the gradient vector \((\frac{\partial p}{\partial x_1},\ldots ,\frac{\partial p}{\partial x_n})|_{(a_1,\ldots ,a_n)}\). The second is the vector \((a_1,\ldots ,a_n)\). The rest are random vectors in which every component is a random integer between \(-1000\) and 1000.

Experiment Setup: All experiments were conducted on 16-Core Intel Core i9-12900KF with 128GB of memory and ARCH LINUX SYSTEM. We compare our tool with 4 state-of-the-art SMT(NRA) solvers, namely Z3 (4.11.2), CVC5 (1.0.3), Yices2 (2.6.4) and MathSAT5 (5.6.5). Each solver is executed with a cutoff time of 1200 seconds (as in the SMT Competition) for each instance. We also combine LS with every competitor solver as a sequential portfolio solver, referred to as “LS+OtherSolver", where we first run LS with a time limit of 10 seconds, and if LS fails to solve the instance within that time, we then proceed to run OtherSolver from scratch, allotting it the remaining 1190 seconds.
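The sequential portfolio scheme can be sketched as follows. The command lists are placeholders (not the actual tool invocations), and checking for a ‘sat’ answer in the output is our simplification of parsing solver results.

```python
import subprocess

def portfolio(instance, ls_cmd, other_cmd, total=1200, ls_budget=10):
    """Run LS for ls_budget seconds; if it produces no answer within
    that time, run the other solver from scratch with the remaining
    budget (1190 s under the defaults)."""
    try:
        out = subprocess.run(ls_cmd + [instance], timeout=ls_budget,
                             capture_output=True, text=True)
        if 'sat' in out.stdout:
            return out.stdout.strip()
    except subprocess.TimeoutExpired:
        pass  # LS exhausted its slice; fall back to the other solver
    out = subprocess.run(other_cmd + [instance], timeout=total - ls_budget,
                         capture_output=True, text=True)
    return out.stdout.strip()
```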

7.2 Instances

We prepare two classes of instances. One class consists of 2736 instances in total with status “unknown” or “sat” from SMT-LIB(NRA), where in every equality polynomial constraint, the degree of the polynomial w.r.t. each variable is less than or equal to 1.

The rest are random instances. Before introducing the generation approach of random instances, we first define some notation. Let \(\textbf{rn}(down,up)\) denote a random integer between two integers down and up, and \(\textbf{rp}(\{x_1,\ldots ,x_{n}\},d,m)\) denote a random polynomial \(\sum _{i=1}^m c_iM_i+c_0\), where \(c_i=\textbf{rn}(-1000,1000)\) for \(0\le i\le m\), \(M_1\) is a random monomial in \(\{x_{1}^{a_1}\cdots x_{n}^{a_{n}}\mid a_i\in \mathbb {Z}_{\ge 0},~a_1+\cdots +a_{n}=d\}\) and \(M_i~(2\le i\le m)\) is a random monomial in \(\{x_{1}^{a_1}\cdots x_{n}^{a_{n}}\mid a_i\in \mathbb {Z}_{\ge 0},~a_1+\cdots +a_{n}\le d\}\).
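The generators \(\textbf{rn}\) and \(\textbf{rp}\) can be sketched as follows, under our own representation in which a polynomial is a list of (coefficient, exponent-tuple) terms:

```python
import random

def rn(down, up, rng=random):
    """rn(down, up): a random integer between down and up."""
    return rng.randint(down, up)

def rp(variables, d, m, rng=random):
    """rp({x_1,...,x_n}, d, m): a random polynomial sum(c_i * M_i) + c_0
    with c_i in [-1000, 1000], where M_1 has total degree exactly d and
    M_i (i >= 2) has total degree at most d."""
    def monomial(total_degree):
        # distribute total_degree among the exponents
        exps = [0] * len(variables)
        for _ in range(total_degree):
            exps[rng.randrange(len(variables))] += 1
        return tuple(exps)
    terms = [(rn(-1000, 1000, rng), monomial(d))]            # M_1: degree d
    terms += [(rn(-1000, 1000, rng), monomial(rn(0, d, rng)))
              for _ in range(m - 1)]                          # M_2..M_m
    terms.append((rn(-1000, 1000, rng), (0,) * len(variables)))  # c_0
    return terms
```

For instance, `rp(['x1', 'x2', 'x3'], 4, 3)` returns four terms whose exponent tuples each sum to at most 4, the first summing to exactly 4.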

A randomly generated polynomial formula \(\textbf{rf}(\{v\_n_{1},v\_n_{2}\},\{p\_n_{1},p\_n_{2}\},\{d_{-},d_{+}\},\{n_{-},n_{+}\},\{m_{-},m_{+}\}, \{cl\_n_{1},cl\_n_{2}\},\{cl\_l_{1},cl\_l_{2}\}), \) where all parameters are in \(\mathbb {Z}_{\ge 0}\), is constructed as follows: First, let \(n:=\textbf{rn}(v\_n_1,v\_n_2)\) and generate n variables \(x_1,\ldots ,x_n\). Second, let \(num:=\textbf{rn}(p\_n_1,p\_n_2)\) and generate num polynomials \(p_1,\ldots ,p_{num}\). Every \(p_i\) is a random polynomial \(\textbf{rp}(\{x_{i_1},\ldots ,x_{i_{n_{i}}}\},d,m)\), where \(n_i=\textbf{rn}(n_{-},n_{+})\), \(d=\textbf{rn}(d_{-},d_{+})\), \(m=\textbf{rn}(m_{-},m_{+})\), and \(\{x_{i_1},\ldots ,x_{i_{n_i}}\}\) are \(n_i\) variables randomly selected from \(\{x_1,\ldots ,x_n\}\). Finally, let \(cl\_n:=\textbf{rn}(cl\_n_1,cl\_n_2)\) and generate \(cl\_n\) clauses such that the number of atoms in a generated clause is \(\textbf{rn}(cl\_l_1,cl\_l_2)\). The \(\textbf{rn}(cl\_l_1,cl\_l_2)\) atoms are randomly picked from \(\{p_i<0,p_i>0,p_i=0\mid 1\le i\le num\}\). If some picked atom has the form \(p_i=0\) and there exists a variable such that the degree of \(p_i\) w.r.t. the variable is greater than 1, we replace the atom with \(p_i<0\) or \(p_i>0\) with equal probability. In total, we generate 500 random polynomial formulas according to \(\textbf{rf}(\{30,40\}, \{60,80\}, \{20,30\},\{10,20\},\{20,30\}, \{40,60\}, \{3,5\})\).

The two classes of instances have different characteristics. The instances selected from SMT-LIB(NRA) usually contain lots of linear constraints, and their complexity is reflected in the propositional abstraction. For a random instance, all the polynomials in it are nonlinear and of high degrees, while its propositional abstraction is relatively simple.

7.3 Experimental Results

The experimental results are presented in Table 1. The column “#inst” records the number of instances. Let us first look at the columns “Z3” through “LS”. On instances from SMT-LIB(NRA), LS performs worse than all competitors except MathSAT5, but it is still comparable. It is crucial to note that our approach is much faster than both CVC5 and Z3 on \(90\%\) of the Meti-Tarski benchmarks of SMT-LIB (2194 instances in total). On random instances, only LS solved all of them, while Z3, the best-performing competitor, solved \(29\%\) of them. The results show that LS is not good at solving polynomial formulas with complex propositional abstraction and lots of linear constraints, but it has a great ability to handle those with high-degree polynomials. A possible explanation is that, as a local search solver, LS cannot exploit the propositional abstraction well to find a solution; however, for a formula with plenty of high-degree polynomials, cell-jump may ‘jump’ to a solution faster.

Table 1. Results on SMT-LIB(NRA) and random instances.

The data in the last column “LS+CVC5” of Table 1 indicates that the combination of LS and CVC5 manages to solve the majority of the instances in both classes, suggesting a complementary performance between LS and top-tier SMT(NRA) solvers. As Table 2 shows, when evaluating combinations of different solvers with LS, it becomes evident that our method significantly enhances the capabilities of existing solvers in portfolio configurations. The most striking improvement is witnessed in the “LS+MathSAT5” combination, which demonstrates superior performance and the most significant enhancement among all the combined solvers.

Table 2. Performance Comparison of Different Solver Combinations with LS.

Besides, Fig. 6 shows the performance of LS and the competitors on all instances. The horizontal axis represents time, while the vertical axis represents the number of solved instances within the corresponding time. Figure 7 presents the run time comparison between LS+CVC5 and CVC5. Every point in the figure represents an instance. The horizontal coordinate of the point is the computing time of LS+CVC5 while the vertical coordinate is the computing time of CVC5 (for every instance out of time, we record its computing time as 1200 seconds). The figure shows that LS+CVC5 improves the performance of CVC5. We also present the run time comparisons between LS and each competitor in Figs. 8–11.

Fig. 6.
figure 6

Number of solved instances within given time (sec: seconds).

Fig. 7.
figure 7

Comparing LS+CVC5 with CVC5.

Fig. 8.
figure 8

Comparing LS with Z3.

Fig. 9.
figure 9

Comparing LS with CVC5.

Fig. 10.
figure 10

Comparing LS with MathSAT5.

Fig. 11.
figure 11

Comparing LS with Yices2.

8 Conclusion

For a given SMT(NRA) formula, although the domain of the variables is infinite, the satisfiability of the formula can be decided through tests on a finite number of samples in the domain. A complete search over such samples is inefficient. In this paper, we propose a local search algorithm for a special class of SMT(NRA) formulas, where every equality polynomial constraint is linear with respect to at least one variable. The novelty of our algorithm lies in the cell-jump operation and a two-level operation selection, which guide the algorithm to jump from one sample to another heuristically. The algorithm has been applied to two classes of benchmarks, and the experimental results show that it is competitive with state-of-the-art SMT solvers and is good at solving formulas with high-degree polynomial constraints. Tests on the solvers obtained by combining this local search algorithm with Z3, CVC5, Yices2 or MathSAT5 indicate that the algorithm is complementary to these state-of-the-art SMT(NRA) solvers. In future work, we will improve our algorithm so that it can handle all polynomial formulas.