1 Introduction

Floating-point computations are pervasive in low-level control software and embedded applications. Such programs are frequently used in contexts where safety is critical, such as automotive and avionic applications. It is important to develop tools for accurate and scalable reasoning about programs manipulating floating-point variables.

Floating-point numbers have a dual nature that complicates complete logical reasoning. On the one hand, they are approximate representations of real numbers, which suggests reasoning about floating-point arithmetic using real arithmetic. On the other hand, floating-point numbers have a discrete, binary implementation, which suggests reasoning about them using bit-vector encodings and sat solvers. Both approaches suffer from an explosion of cases that arises when considering the possible results of evaluating floating-point expressions.

An alternative to existing approaches is to use abstractions that enable efficient but imprecise reasoning. This approach is standard in static program analysis, including analyses that target safety critical embedded software with floating-point variables [6]. Our solver uses intervals for sound, efficient but imprecise reasoning about floating-point formulae. If imprecise reasoning cannot determine whether a formula is unsatisfiable, we use decisions to increase the precision of deduction, and then use conflict analysis to generalise the results of deduction. Our approach combines ideas from static program analysis with satisfiability algorithms and allows us to trade efficiency for precision in a demand-driven fashion. The approach is generic and can be used to implement solvers for other logics and theories. The rest of this section provides a more detailed overview of our technique and how it compares to existing techniques for designing decision procedures.

1.1 Discussion of floating-point solver architectures

We now discuss in detail a few different possibilities for designing an smt solver for floating-point arithmetic. Interpreting a floating-point expression as a real arithmetic expression leads to incorrect conclusions because there are several cases where floating-point operations differ from real arithmetic operations. Encoding all these cases as constraints in a real arithmetic formula leads to large formulae with extensive case splits, which are difficult for real arithmetic solvers to handle.

A second approach to floating-point reasoning, called bit-blasting or propositional encoding, currently yields better performance than a real arithmetic encoding [11]. Floating-point operations are represented as circuits that are then translated into a Boolean formula that is solved by a propositional sat solver. This approach enables precise modelling of floating-point semantics and allows high-performance sat solvers to be reused to implement a floating-point solver. The disadvantage of this approach is that the sat solver only has an operational view of arithmetic operations and must reason about individual bits in adder and multiplier circuits without the ability to simplify formulae using high-level, numeric reasoning.

A third approach is to use the popular dpll(t) architecture [5]. The dpll(t) architecture uses a sat solver to reason about the Boolean structure of a formula and a specialised solver for conjunctions of theory literals. Thus, two efficient solvers for fragments of a theory are combined to obtain a solver for the theory. The first problem with using dpll(t) is that we would still require a solver for conjunctions of literals in floating-point logic, and no off-the-shelf solution is available. A second issue is that, in some cases, separating Boolean reasoning from theory reasoning is detrimental to performance [9, 51]. Details of the theory are not visible to the propositional solver and cannot guide search for a model, while information from previous runs is not available to the theory solver for conflict analysis.

The issues with dpll(t) mentioned above are known and have fuelled research in natural-domain smt procedures. The term ‘natural-domain smt’ was first used by Cotton [19] but we use it for smt procedures that perform all reasoning directly in a theory [19, 33, 47, 48, 51]. A fourth possibility is to develop a natural-domain floating-point solver which performs decisions, backtracking and learning using variables and atoms of the theory. The challenge in pursuing this approach is identifying which elements of the theory can be used for these operations, and developing efficient algorithms for propagation and learning.

In this paper, we pursue the fourth approach and develop a natural-domain smt solver for reasoning about floating-point arithmetic formulae. We address the efficiency concerns highlighted above by developing a natural-domain smt solver that supports imprecise reasoning. For insight into the operation of our solver, consider the formula

$$0.0 \le x \land x \le 10.0 \land y = x^5 \land y > 10^5 $$

where the variables x and y have double-precision floating-point values. Interval propagation [54] tracks the range of each variable and can derive the fact x∈[0.0,10.0] from the first two constraints, which implies the fact y∈[0.0,100000.0] from the third constraint. This range is not compatible with the final conjunct y>\(10^5\), so the formula is unsatisfiable. The computation requires a fraction of a second with our interval solver. In contrast, translating the formula above into a bit-vector formula and invoking the smt solver z3 requires 16 minutes on a modern processor to prove that the formula is unsatisfiable.
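
The interval reasoning on this example is simple enough to sketch directly. The Python fragment below is an illustration, not the solver's implementation: it propagates bounds through the monotone operation x^5 and conjoins the strict bound y>\(10^5\) by tightening it to the next representable double, so the empty intersection witnesses unsatisfiability.

```python
import math

def meet(a, b):
    """Intersect two intervals; None is the conflict element (empty interval)."""
    if a is None or b is None:
        return None
    lo, hi = max(a[0], b[0]), min(a[1], b[1])
    return (lo, hi) if lo <= hi else None

# x in [0.0, 10.0] from the first two conjuncts.
x = (0.0, 10.0)

# Propagate y = x**5; x**5 is monotone on [0, 10], so bounds map to bounds.
y = (x[0] ** 5, x[1] ** 5)                     # y in [0.0, 100000.0]

# The strict constraint y > 1e5 is the interval [succ(1e5), +inf) over doubles.
strict_lb = math.nextafter(1e5, math.inf)
print(meet(y, (strict_lb, math.inf)))          # None: the conjunction is UNSAT
```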

The efficiency of interval reasoning comes at the cost of completeness. Consider the floating-point formula below.

$$z = y \land x = y \cdot z \land x < 0 $$

After bit-vector encoding, the solver z3 can decide satisfiability of this formula in a fraction of a second. The interval abstraction cannot represent relationships between the values of variables. Interval propagation will not deduce that y and z are either both positive or both negative, so the interval solver cannot conclude that x must be non-negative and cannot show that the formula is unsatisfiable.

To recover completeness, we lift the Conflict Driven Clause Learning algorithm in sat solvers to reason about intervals. Our solver uses intervals to make decisions, propagates intervals for deduction, and uses a conflict analysis over intervals to refine the results of interval propagation. Our algorithm is a strict, mathematical generalisation of propositional cdcl in that replacing floating-point intervals with partial assignments yields the original cdcl algorithm. Our approach is parametric, allowing for abstractions such as equality graphs, difference graphs, or linear inequalities to be used in place of floating-point intervals to obtain natural-domain cdcl algorithms for equality logic, difference logic, and linear arithmetic respectively.

Clause learning is not the only approach one may take to obtain a complete solver based on interval propagation. One may eliminate imprecision by splitting intervals into ranges that can be analysed without loss of precision. There are at least two ways to perform such splitting.

Splitting can be integrated in a dpll(t) solver [4]. New propositions are required to represent intervals over ranges that do not occur explicitly in the original formula. Implementing good learning heuristics for this approach is difficult because the propositional learning algorithm is unaware of the intervals associated with these propositions.

Splitting can also be implemented in a natural-domain fashion. For the second example above, the solver can consider the cases y<0 and y≥0, and can in each case conclude that x is non-negative. Such splitting yields a complete, natural-domain smt solver that only manipulates intervals, but requires considering a potentially exponential number of cases. Moreover, the conclusions drawn from proving one case are not used to reason about another case, so the solver may repeat the same reasoning.

1.2 Content and contribution

In this paper, we present a Conflict Driven Clause Learning algorithm for floating-point logic. Our work exploits the insight presented in [28] that propositional sat solvers internally operate on a lattice-based abstraction that overapproximates the space of possible solutions. We show how the first-uip learning algorithm [70] used in cdcl solvers can be lifted to a wider range of domains. This lifting is non-trivial since it has to address the additional complexity of abstractions for domains that go beyond propositional logic.

Contribution

We make the following contributions.

  1. We present a novel, natural-domain smt solver for the theory of floating-point arithmetic. Our solver is based on a new perspective of sat and smt algorithms as techniques that manipulate lattice-based abstractions.

  2. We lift the first-uip algorithm used for conflict analysis in modern sat solvers to lattice-based abstractions. Our lifting gives lattice-based analysers access to learning techniques that were hitherto limited to propositional sat solvers.

  3. We present a new implementation of our approach for floating-point logic as part of the mathsat5 framework. The implementation significantly outperforms approaches based on bit-blasting on our set of benchmarks.

Outline

Section 2 provides a brief introduction to floating-point numbers and the theory of floating-point arithmetic. Section 3 recaps some of the formal background on lattices and abstract interpretation. Section 4 gives a high-level account of model search and conflict analysis over abstract domains. The main algorithmic contribution is presented in Sect. 5: A lifting of the first-uip algorithm to abstract domains. The implementation of our floating-point solver, the specific heuristics we used and experiments are discussed in Sect. 6. An extensive survey of related work from the areas of theorem proving, abstract interpretation, and decision procedures is given in Sect. 7.

2 A review of floating-point arithmetic

This section provides an informal introduction to floating-point numbers and some issues surrounding formal reasoning about floating-point. For a more in-depth treatment see [61].

2.1 Floating-point arithmetic

‘Floating-point’ is a style of encoding subsets of the rational numbers using bit-vectors of fixed width. The bit vectors are split into multiple, fixed-size parts, including a fractional part (the significand) and an integer power by which it is multiplied (the exponent). Historically there were a number of different floating-point systems in common usage. This created significant problems when moving data between machines and made writing portable numerical software prohibitively difficult. A standardisation process led to ieee-754, which defines multiple floating-point systems including their semantics and representation as bit strings. Since their introduction in 1985, the ieee-754 formats have become the dominant floating-point system and the most common way of representing non-integer quantities. Most systems that do not comply with ieee-754, such as some GPUs, simply implement a subset of its features. We focus on the binary encodings of ieee-754, for which the definitive reference is the standard [23].

The ieee-754 binary encodings specify several different classes of numbers: normal, subnormal, zeros, infinities and “not a number” (NaN). Normal numbers are represented using a triple of unsigned binary numbers (s,e,m) in which s is the sign bit and always uses a single bit, e is the exponent and m is the significand; the widths of e and m depend on the format. The rational number represented by this pattern is given by the formula:

$$ (-1)^{s} \cdot 2^{e - \mathit{bias}} \cdot m $$

where bias is fixed by the format. An example of an ieee-754 binary16 floating-point normal number is given below.

[Figure: bit-level layout of an example ieee-754 binary16 normal number]

Note that the exponent is 5 bits, the significand is 10 bits and the bias is 15 (\(2^{5-1}-1\)). An exponent which is a sequence of 0s or a sequence of 1s represents one of the other types of number. Subnormal numbers have a 0 exponent and non-zero significand. They act as fixed-point numbers between the smallest normal number and zero, allowing the difference between two numbers to always be representable and improving the error bounds of computation. If both exponent and significand are 0, the number represents 0. Note that there are two floating-point numbers, +0 and −0, that both represent the rational value 0. These two numbers allow distinguishing between convergence to 0 from above and from below. An exponent of all 1s with a 0 significand represents infinity. Finally, an exponent of all 1s and a non-zero significand represents NaN.
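
To make the encoding rules concrete, the sketch below decodes a binary16 bit pattern into an exact rational value. It is illustrative only: it makes the implicit leading 1 of normal significands explicit, which the formula above leaves to the interpretation of m, and it follows the class rules just described.

```python
from fractions import Fraction

def decode_binary16(bits: int):
    """Decode a 16-bit ieee-754 binary16 pattern into its exact value.

    Layout: 1 sign bit, 5 exponent bits (bias 15), 10 significand bits.
    Returns a Fraction for finite values, or '+inf', '-inf', 'NaN'.
    """
    s = (bits >> 15) & 0x1
    e = (bits >> 10) & 0x1F
    m = bits & 0x3FF
    sign = -1 if s else 1
    if e == 0x1F:                       # exponent all 1s
        if m == 0:
            return '-inf' if s else '+inf'
        return 'NaN'
    if e == 0:                          # exponent all 0s: zero or subnormal
        # subnormals are fixed-point: (-1)^s * 2^(1-15) * (m / 2^10)
        return sign * Fraction(m, 2 ** 10) * Fraction(1, 2 ** 14)
    # normal: (-1)^s * 2^(e-15) * (1 + m / 2^10), with the implicit leading 1
    significand = 1 + Fraction(m, 2 ** 10)
    exponent = e - 15
    scale = Fraction(2) ** exponent if exponent >= 0 else Fraction(1, 2 ** -exponent)
    return sign * significand * scale

print(decode_binary16(0x3C00))   # 1
print(decode_binary16(0xC000))   # -2
print(decode_binary16(0x0001))   # smallest positive subnormal, 2**-24
```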

ieee-754 gives a precise and semi-operational semantics for common operations, including +, −, ∗, / and \(\sqrt{.}\). For example, for normal and subnormal numbers and zeros, the standard specifies:

… every operation shall be performed as if it first produced an intermediate result correct to infinite precision and with unbounded range, and then rounded.

Rounding depends on which rounding mode is used. ieee-754 specifies five rounding modes, three directed (rounding up, rounding down, and rounding towards zero) and two round to nearest (with tie breaks to give an even significand or away from zero). Round to nearest with tie break to even is the default rounding mode.

We briefly discuss floating-point addition and multiplication. Floating-point addition is not associative even when restricted to normal numbers. Consider the two floating-point expressions below where the numbers are represented by 32 bits with a 24-bit significand (the value 16777216 is \(2^{24}\)).

$$ (1 + 16777216) + -16777216 = 0 \qquad 1 + (16777216 + -16777216) = 1 $$

25 bits are required to represent the sum of 1 and \(2^{24}\), so rounding is applied and \(2^{24}\) is returned. Thus, the expression on the left evaluates to 0. No rounding is required to represent the result of evaluating subexpressions in the expression on the right, so the result is 1.

Floating-point multiplication is not associative either, as the example below demonstrates with 32-bit floating-point numbers.

$$\begin{aligned} (3 * 2049) * 8191 = 50350076\qquad 3 * (2049 * 8191) = 50350080 \end{aligned}$$

When the result is a normal number there are tight bounds on the difference between the two orders of evaluation.

Moreover, floating-point multiplication does not distribute over addition because of rounding effects, as the example below demonstrates.

$$\begin{aligned} 2049 * (8189 + 1) = 16781310 \qquad (2049 * 8189) + (2049 * 1) = 16781308 \end{aligned}$$

In the absence of associativity and distributivity, several standard algebraic approaches to reasoning about arithmetic expressions are inapplicable.
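
All three examples can be reproduced with any conforming single-precision implementation. The sketch below uses numpy's float32 scalars and assumes the default round-to-nearest-even mode.

```python
import numpy as np

f = np.float32
# Addition is not associative (24-bit significand; 16777216 is 2**24).
print((f(1) + f(16777216)) + f(-16777216))     # 0.0
print(f(1) + (f(16777216) + f(-16777216)))     # 1.0

# Multiplication is not associative.
print((f(3) * f(2049)) * f(8191))              # 50350076.0
print(f(3) * (f(2049) * f(8191)))              # 50350080.0

# Multiplication does not distribute over addition.
print(f(2049) * (f(8189) + f(1)))              # 16781310.0
print((f(2049) * f(8189)) + (f(2049) * f(1)))  # 16781308.0
```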

While few algebraic equivalences hold over floating-point numbers, ordering properties are generally preserved. All of the basic operations are piecewise monotonic (but not strictly monotonic) over normal and subnormal numbers. This means that techniques based on ordering, for example, interval abstractions, are particularly suitable for floating-point numbers.

2.2 Floating-point logic

The previous sub-section demonstrated that floating-point arithmetic does not behave like real arithmetic. It is important to develop specialised decision procedures for reasoning precisely about the results and properties of floating-point operations. The smtlib theory of floating-point arithmetic (fpa) is a language for expressing constraints in terms of floating-point numbers, variables and operators. We refer to this theory as floating-point logic and review it in this section.

Terms

A term in fpa is constructed from floating-point variables, constants, standard arithmetic operators and special operators. Examples of special operators include square roots and combined multiply-accumulate operations used in signal processing. These operations are parameterized by one of five rounding modes. The result of floating-point operations is defined to match ieee-754; the real result (computed with ‘infinite precision’) rounded to a floating-point number using the chosen rounding mode.

Formulas in fpa are Boolean combinations of predicates over floating-point terms. In addition to the standard equality predicate =, fpa offers a number of floating-point specific predicates including a special floating-point equality \(=_{\mathbb{F}}\), and floating-point specific arithmetic inequalities < and ≤. These comparisons have to handle all classes of numbers. Normal and subnormal numbers are compared in the expected way. The two zeros, +0 and −0, are regarded as equal (despite having distinct floating-point representations) as they correspond to the same number. Infinities are respectively above (+∞) and below (−∞) all of the preceding classes. Finally, NaN is regarded as unordered and incomparable to all floating-point numbers; thus all comparisons involving NaN, including \(\mathit{NaN}=_{\mathbb{F}}\mathit {NaN}\), are false. Thus standard equality, =, is reflexive, but floating-point equality, \(=_{\mathbb{F}}\), is not.

3 Background on lattices and abstraction

We now introduce abstract satisfaction, a lattice-theoretic framework for designing decision procedures. Abstract satisfaction is based on abstract interpretation, which provides a similar framework for reasoning about programs. An in-depth account of abstract satisfaction is given in [31].

3.1 Review of abstract interpretation

Abstract interpretation is formulated in terms of domains, which are lattices equipped with monotone functions called transformers. The space of program behaviours is represented by a concrete domain and the behaviour of a program, called the concrete semantics, is characterised by a fixed-point expression. Checking properties of the concrete semantics is usually undecidable. An abstract domain is a lattice with transformers that can represent some but not all concrete behaviour. Checking properties of the abstract semantics is decidable but may be inaccurate.

Lattice and transformers

A poset (C,⊑) is a set C equipped with a partial order. A lattice (C,⊑,⊔,⊓) is a partially ordered set with a least upper bound operator ⊔:C×C→C, called join, and a greatest lower bound operator ⊓:C×C→C, called meet. A lattice C is complete if every subset X⊆C has a meet, denoted ⊓X, and a join, denoted ⊔X. The powerset lattice (℘(S),⊆) is the lattice of all subsets of S with the subset inclusion order.

A function f:C→A from a poset (C,⊑) to (A,≼) is monotone if x⊑y implies f(x)≼f(y) for all x and y in C. We use the term transformer for a monotone function f:C→C from a lattice to itself. A fixed point of a function f:S→S is an element satisfying the equality f(x)=x. On a poset, fixed points can be ordered. Tarski’s fixed point theorem guarantees that transformers on complete lattices have least and greatest fixed points. We denote the least fixed point of f by \(\operatorname {\mathsf{lfp}}(f)\) and the greatest fixed point of f by \(\operatorname{\mathsf {gfp}}(f)\).
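
Tarski’s theorem is not in itself algorithmic, but the extreme fixed points used later in the paper can often be reached by iteration. The sketch below is illustrative and assumes a monotone, reductive transformer whose descending chains are finite, so that Kleene iteration from a starting element stabilises at a fixed point.

```python
def meet(a, b):
    """Intersect two integer intervals; None is the empty interval."""
    if a is None or b is None:
        return None
    lo, hi = max(a[0], b[0]), min(a[1], b[1])
    return (lo, hi) if lo <= hi else None

def gfp(f, top):
    """Kleene iteration from 'top': for a monotone, reductive f whose
    descending chains are finite, the chain top, f(top), f(f(top)), ...
    stabilises at the greatest fixed point below the starting element."""
    x = top
    while f(x) != x:
        x = f(x)
    return x

# A toy reductive transformer: intersect with [0, 100], then with the
# halved interval; the only invariant range is the singleton {0}.
def f(iv):
    iv = meet(iv, (0, 100))
    if iv is None:
        return None
    return meet(iv, (iv[0] // 2, iv[1] // 2))

print(gfp(f, (-1000, 1000)))   # (0, 0)
```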

Approximation in abstract interpretation is formalised using pairs of functions. A Galois connection between posets (C,⊑) and (A,≼) is a pair of functions α:C→A and γ:A→C satisfying the conditions below.

  1. The functions α and γ are monotone.

  2. For all x in C, x⊑γ(α(x)).

  3. For all y in A, α(γ(y))≼y.

We can intuitively understand these conditions by interpreting x⊑y to mean that x has more information than y. For example, if x is a variable the bound x∈[1,3] is tighter than x∈[0,5] and provides more information about the range of x. The monotonicity condition above guarantees that order between elements is preserved. The second condition is about approximation and guarantees that every element in C can be represented by an element in A with some loss of information. The third condition is about precision and ensures that repeatedly moving between A and C does not increase information loss.

Abstract interpretation

A domain (C,⊑,⊔,⊓,{f 1,…,f k }) in abstract interpretation is a lattice equipped with transformers. The number of transformers depends on the details of the domain. A domain \((A,\preccurlyeq,\curlyvee ,\curlywedge,\{\operatorname{\mathit{af}}_{1},\ldots ,\operatorname{\mathit{af}}_{k}\})\) is a sound abstraction of (C,⊑,⊔,⊓,{f 1,…,f k }), if it contains a transformer \(\operatorname{\mathit{af}}_{i}\) for each f i and if the conditions below hold.

  1. There is a Galois connection between the lattices.

  2. Every pair of transformers f i and \(\operatorname{\mathit {af}}_{i}\) satisfies \(f_{i}(\gamma(x)) \sqsubseteq\gamma(\operatorname{\mathit{af}}_{i}(x))\) for all x in A.

The domain over C is called a concrete domain involving a concrete lattice and concrete transformers. The domain over A is called an abstract domain involving an abstract lattice and abstract transformers.

A powerset domain is one in which the lattice is of the form ℘(S). Assume the concrete domain is a powerset domain. The abstract domain is overapproximating if x⊆γ(α(x)) for all concrete elements x, and a transformer is overapproximating if \(f_{i}(\gamma(y)) \subseteq\gamma(\operatorname{\mathit{af}}_{i}(y))\) for all abstract elements y. The abstract domain is underapproximating if x⊇γ(α(x)) for all concrete elements x, and a transformer is underapproximating if \(\gamma(\operatorname{\mathit{af}}_{i}(y)) \subseteq f_{i}(\gamma(y))\) for all abstract elements y.

Properties of programs are formalised using fixed points over transformers in a concrete domain. These properties can be approximated by computing the corresponding fixed point in the abstract domain. A key result of abstract interpretation is the fixed point transfer theorem showing that fixed points in the abstract domain approximate fixed points in the concrete domain.

3.2 Review of abstract satisfaction

We apply abstract interpretation to reason about formula satisfiability. The concrete domain we consider is a set of structures over which formulae are interpreted. We consider two different concrete semantics, one for deductive reasoning and one for abductive reasoning. Computing the concrete deductive semantics amounts to computing all models, while computing the abductive semantics amounts to computing all countermodels of a formula, with both computations being at least as hard as deciding satisfiability.

We review abstract satisfaction, an abstract interpretation framework for deciding satisfiability. The account that follows does not delve into details of specific logics. Let Forms be a set of formulae and Structs be a set of structures. We assume the interpretation of a formula over structures is specified in the standard manner by a relation ⊨, which is a subset of Structs×Forms. A structure σ is a model of φ if it satisfies σφ and is a countermodel of φ otherwise. A formula φ is satisfiable if it has a model.

We now develop a framework for characterising satisfiability via lattices and transformers. The concrete domain of structures, introduced below, consists of all sets of structures. We introduce two structure transformers, which map between sets of structures and encode reasoning about models and countermodels of a formula.

Definition 1

The concrete domain of structures

$$\bigl(\wp(\mathit{Structs}),\subseteq,\cap, \cup, \{ \mathit{mods}_{\varphi},\mathit{confs}_{\varphi} \mid\varphi\in \mathit{Forms}\}\bigr) $$

is a powerset lattice of structures containing a model transformer mods φ and a conflict transformer confs φ defined below.

$$\begin{aligned} \mathit{mods}_{\varphi} & \mathrel{\hat{=}} S \mapsto\{\sigma\in\mathit{Structs}\mid \sigma\text{ is in } S \text{ and } \sigma\models\varphi\} \\ \mathit{confs}_{\varphi} & \mathrel{\hat{=}} S \mapsto\{\sigma\in\mathit{Structs}\mid \sigma\text{ is in } S \text{ or } \sigma \not\models\varphi\} \end{aligned}$$

The model transformer maps a set of structures S to the largest subset of S that contains the same models as S. The conflict transformer (called the universal countermodel transformer in [28]) maps a set of structures S to the largest superset of S that contains the same models as S. The model transformer can be viewed as refining an overapproximation of a set of models while the conflict transformer can be viewed as generalising an underapproximation of a set of countermodels. Observe also that mods φ (Structs) is the set of all models of a formula and confs φ (∅) is the set of all countermodels of a formula.

We introduce a concrete semantics below, which characterises satisfiability by fixed points over structure transformers. If a formula φ is unsatisfiable, it has no models, so mods φ (Structs) and the greatest fixed point of mods φ are the empty set. Moreover, every structure is a countermodel of φ, so confs φ (∅) is equal to Structs and so is the least fixed point of confs φ .

Theorem 1

The following statements are equivalent for a formula φ.

  1. The formula φ is unsatisfiable.

  2. The greatest fixed point \(\operatorname{\mathsf{gfp}}(\mathit {mods}_{\varphi})\) is the empty set.

  3. The least fixed point \(\operatorname{\mathsf{lfp}}(\mathit {confs}_{\varphi})\) contains all structures.

Our goal is to compute over- and underapproximations of the models and countermodels of a formula. If an overapproximation of the set of models of a formula is the empty set, the formula is unsatisfiable. If an underapproximation of the set of countermodels of a formula contains all structures, the formula is unsatisfiable. Abstract interpretation provides a framework for deriving these abstractions.

Consider an abstract domain

$$\bigl(A,\sqsubseteq,\sqcap,\sqcup,\{\mathit{amods}_{\varphi }, \mathit{aconfs}_{\varphi}| \varphi\in\mathit{Forms}\}\bigr) $$

that is a sound abstraction of the structures domain. If the domain is overapproximating, the inclusion mods φ (γ(x))⊆γ(amods φ (x)) holds. If the domain is underapproximating, the inclusion γ(aconfs φ (x))⊆confs φ (γ(x)) holds. The theorem below shows that we can iterate these transformers to obtain better approximations.

Theorem 2

Let amods φ be an overapproximation of mods φ and aconfs φ be an underapproximation of confs φ .

  1. If \(\gamma(\operatorname{\mathsf{gfp}}(\mathit {amods}_{\varphi})) = \emptyset\) then φ is unsatisfiable.

  2. If \(\gamma(\operatorname{\mathsf{lfp}}(\mathit {aconfs}_{\varphi})) = \mathit{Structs}\) then φ is unsatisfiable.

Domains of floating-point numbers

We introduce concrete and abstract domains of floating-point numbers. Let \(\mathbb{F}\) be the set of all floating-point numbers and \((\wp(\mathbb{F}),\subseteq) \) be the set of subsets of floating-point numbers ordered by inclusion. Let Vars be the set of variables occurring in a formula. A floating-point assignment is a function \(\sigma: \mathit{Vars}\to\mathbb{F}\). Floating-point assignments are the structures over which formulae in floating-point logic are interpreted. The concrete domain of floating-point logic structures

$$\bigl(\wp(\mathit{Vars}\to\mathbb{F}),\subseteq,\cap, \cup, \{ \mathit{mods}_{\varphi},\mathit{confs}_{\varphi}\}\bigr) $$

is defined over the lattice of sets of floating-point assignments with φ ranging over floating-point logic formulae.

We define the floating-point interval abstraction. Intervals approximate sets of numbers by their closest enclosing range. In addition to the arithmetic ordering ≤, the ieee-754 standard dictates a total order ⪯ over all floating-point values, including special values such as NaN. The interval abstraction is defined with respect to this total order. The lattice \((\mathbb{I},\sqsubseteq,\sqcap,\sqcup) \) of floating-point intervals is defined below. We write min and max for the minimum and maximum with respect to the ⪯ order.

  1. The set of lattice elements is \(\mathbb{I} \mathrel {\hat{=}} \{[a,b] \mid a,b \text{ are in } \mathbb{F} \text{ and } a \preceq b \} \cup\{\bot\}\).

  2. The meet ⊥⊓y=y⊓⊥=⊥ for all y. The meet [a,b]⊓[c,d] is the interval [max(a,c),min(b,d)] if max(a,c)⪯min(b,d) holds and is ⊥ otherwise.

  3. The join ⊥⊔y=y⊔⊥=y. The join [a,b]⊔[c,d] is [min(a,c),max(b,d)].

We write ⊤ for the greatest element of \(\mathbb{I}\). Given a set of variables Vars, the interval domain is the lattice

$$\bigl(\mathit{Vars}\to\mathbb{I},\sqsubseteq,\sqcap,\sqcup , \{ \mathit{amods}_{\varphi},\mathit{aconfs}_{\varphi}\}\bigr) $$

with the components defined as below. We defer the definition of the transformers to the next two sections.

  1. f⊑g for \(f,g : \mathit{Vars}\to\mathbb{I}\) if f(x)⊑g(x) for all variables x.

  2. f⊓g is the function that maps a variable x to f(x)⊓g(x).

  3. f⊔g is the function that maps a variable x to f(x)⊔g(x).

We denote an element of the form \(f: \{x,y\}\to\mathbb{I}\) as a tuple 〈x:f(x),y:f(y)〉 of variables paired with intervals. We omit from the tuple variables that map to ⊤. That is, if f(x) is ⊤ and f(y) is not, we write 〈y:f(y)〉. We follow the standard lattice-theoretic convention for overloading notation and write ⊤ for the greatest element of the interval domain. The interval lattice is related to the lattice of floating-point structures by a Galois connection.

The concretisation function γ maps an element f of \(\mathit{Vars}\to\mathbb{I}\) to the set of assignments σ such that σ(x) lies in the interval f(x) for every variable x, and the abstraction function α maps a set of assignments to the most precise element of \(\mathit{Vars}\to\mathbb{I}\) that represents it. A standard fact in abstract interpretation is that this pair of functions forms a Galois connection [20].
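
The sketch below illustrates the definitions of this section: the interval operations, the pointwise lifting to \(\mathit{Vars}\to\mathbb{I}\), and the abstraction and concretisation maps for finite sets of assignments. For brevity it uses the ordinary ≤ on Python floats, ignoring NaN, the signed zeros and the ⪯ total order over the special values.

```python
import math

TOP_IV = (-math.inf, math.inf)
BOT = None                       # the empty interval ⊥

def meet(a, b):
    if a is BOT or b is BOT:
        return BOT
    lo, hi = max(a[0], b[0]), min(a[1], b[1])
    return (lo, hi) if lo <= hi else BOT

def join(a, b):
    if a is BOT:
        return b
    if b is BOT:
        return a
    return (min(a[0], b[0]), max(a[1], b[1]))

# Elements of Vars -> I are dicts; a variable missing from the dict maps to ⊤.
def env_meet(f, g):
    h = {x: meet(f.get(x, TOP_IV), g.get(x, TOP_IV)) for x in set(f) | set(g)}
    return BOT if any(iv is BOT for iv in h.values()) else h

def gamma_contains(f, sigma):
    """sigma ∈ γ(f): every variable's value lies within its interval."""
    return all(lo <= sigma[x] <= hi for x, (lo, hi) in f.items())

def alpha(S):
    """Best abstraction of a finite, non-empty set of assignments:
    the componentwise interval hull."""
    return {x: (min(s[x] for s in S), max(s[x] for s in S)) for x in S[0]}

S = [{"x": 1.0, "y": 4.0}, {"x": 3.0, "y": -2.0}]
a = alpha(S)                                   # x:[1.0,3.0], y:[-2.0,4.0]
print(all(gamma_contains(a, s) for s in S))    # True: every σ in S is in γ(α(S))
print(env_meet(a, {"x": (2.0, 5.0)}))          # x tightened to [2.0,3.0]
```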

4 Lifting CDCL to abstractions

In this section we show how the cdcl algorithm can be generalised to abstract domains. We call the result of this lifting Abstract cdcl (acdcl). We focus on practical concerns. For a more formal perspective, and for soundness and completeness proofs, see [30]. We first recall the propositional cdcl algorithm. Then we provide a detailed explanation of how each step can be lifted to abstract lattices. In our descriptions of cdcl and acdcl we focus on a basic clause learning framework. Clause learning has been shown to be the most salient aspect of the cdcl algorithm with regards to efficiency [65]. Propositional cdcl benefits from numerous further algorithmic and engineering advances [66], such as smart variable selection heuristics for decisions, effective data structures like watched literals, and algorithmic improvements such as restarts. Discussing all these improvements in a lattice-based setting is beyond the scope of the paper. Some, for example restarts, lift to acdcl in a trivial manner, while others, such as variable selection heuristics, require domain-specific adaptations.

4.1 Review of propositional CDCL

The cdcl algorithm is shown in Algorithm 1. cdcl consists of two interacting phases, called model search and conflict analysis. Model search, shown in Algorithm 2, aims to find satisfying assignments for the formula. This process may fail and encounter a conflicting partial assignment, that is, a partial assignment that represents only countermodels. Conflict analysis, presented in Algorithm 4, extracts a general reason for the conflict, which is used to derive a new lemma over the search space in the form of a clause; this step is called learning.

[Algorithm 1: The propositional cdcl algorithm]

The fundamental data structure used within the cdcl algorithm is the partial assignment, which is a partial function from a set of logical propositions Props to the Boolean truth constants \(\mathbb{B} \mathrel{\hat{=}} \{ \mathsf{t}, \mathsf{f}\}\). Partial assignments can be ordered by precision, and extended with a special symbol ⊥, representing the empty set of models, to form the following lattice, as shown in Fig. 1.

$$(\mathit{PartAsg},\sqsubseteq,\sqcap,\sqcup) $$

Partial assignments are an abstraction of the concrete lattice of propositional truth assignments \(\wp(\mathit{Props}\to\mathbb{B})\). The concretisation function γ maps a partial assignment to the set of total truth assignments that agree with it on all assigned propositions, and the abstraction function α maps a set of truth assignments to the most precise partial assignment that covers all of them.

[Fig. 1: The lattice of partial assignments PartAsg]

Partial assignments are refined by applying the unit rule exhaustively in a step called Boolean Constraint Propagation (bcp). The unit rule is shown in Algorithm 3, and compares the literals of the clause with the current partial assignment. Given a clause and a partial assignment, the unit rule either returns a new variable assignment, a conflict element ⊥, or the empty set, signifying that no new information could be deduced. We introduce some notation used in Algorithm 3: A literal l is in positive phase if it is of the form p for some proposition p; if it is of the form ¬p, it is in negative phase. The function phase returns the phase of a literal, i.e., phase(l)=t if l is in positive phase, and phase(l)=f otherwise. For a literal l, we denote its opposite phase literal by flip(l), and by var(l) the proposition p such that l∈{¬p,p}.

From an abstract satisfaction perspective, a call to \(\mathtt{unit\;(}C, \cdot\mathtt{)}\) computes an overapproximation of the model transformer mods C . In fact, refining a partial assignment with the unit rule corresponds to the best abstract transformer of mods C available in the partial assignments lattice [28]. We may alternatively characterise the unit rule as a very natural abstract interpretation of the formula in which logical disjunction is interpreted as a join over the abstract lattice. For example, for C=p∨¬q∨r and π={p↦f,q↦t}:

$$\begin{aligned} \mathtt{unit\;(}C, \pi\mathtt{)} &= \mathtt{unit\;(}p, \pi\mathtt{)} \sqcup \mathtt{unit\;(}\lnot q, \pi\mathtt{)} \sqcup \mathtt{unit\;(}r, \pi\mathtt{)} \\ &= \bot\sqcup\bot\sqcup\{r \mapsto\mathsf{t}\} = \{r \mapsto\mathsf {t}\} \end{aligned}$$
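
The join-based reading of the unit rule can be written down directly. In the sketch below, which illustrates the characterisation above rather than Algorithm 3, ⊥ is None, the empty dict is ⊤, and literals are strings with a leading '-' marking negative phase.

```python
from functools import reduce

BOT = None    # conflict element ⊥; the empty dict {} is ⊤ (no information)

def unit_lit(lit, pi):
    """Best abstraction of the models of one literal inside the partial
    assignment pi (a dict from proposition names to booleans)."""
    p, phase = lit.lstrip("-"), not lit.startswith("-")
    if p in pi and pi[p] != phase:
        return BOT                 # literal falsified by pi
    return {p: phase}              # otherwise: its models force p to its phase

def join2(a, b):
    """Join of partial assignments: keep only the bindings they agree on."""
    if a is BOT:
        return b
    if b is BOT:
        return a
    return {p: v for p, v in a.items() if b.get(p) == v}

def unit(clause, pi):
    """unit(C, pi) as the join over the per-literal deductions."""
    return reduce(join2, (unit_lit(l, pi) for l in clause), BOT)

print(unit(["p", "-q", "r"], {"p": False, "q": True}))   # {'r': True}
```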

In addition to the partial assignment, the algorithm stores a trail tr, which is a sequence of singleton partial assignments of the form 〈p:t〉 or 〈p:f〉. We denote concatenation of an element to the trail by ⋅, and the ith element by tr[i], e.g., the first element of the trail is tr[1]. The symbol ϵ denotes the empty trail. The trail stores the sequence of propositional assignments made during algorithm execution in chronological order. A reasons array maps a proposition p to a clause C if p was derived from C via the unit rule. If a conflict is derived, the reasons array maps the conflict element ⊥ to the clause that was contradicted by the partial assignment.

The cdcl algorithm interleaves model search and conflict analysis as depicted in Fig. 2. Model search refines a partial assignment and extends the trail until either a satisfying assignment is found or a conflict is encountered. This is done in two ways: deduction with the unit rule identifies necessary consequences of the formula under the current partial assignments; decisions heuristically guess a value for unassigned propositions. If model search finds a satisfying assignment then the algorithm returns SAT.

[Fig. 2: A schematic depiction of the cdcl framework]

[Algorithm 2: Propositional model search]

[Algorithm 3: Propositional unit rule]

If a conflict is encountered, conflict analysis uses the first-uip algorithm [70], which extracts a general conflict reason from the specific partial assignment that originally caused the conflict. The algorithm chooses as an initial generalisation R the elements of the conflicting partial assignment that contradicted the clause in reasons[⊥]. It then steps backwards through the trail, removes elements a of R, and replaces them with partial assignments that are sufficient to deduce a. At the end of every iteration of the generalisation loop, the contents of R are a sufficient reason for a conflict. This process continues until the first unique implication point (UIP) is reached (see [70] for details). From the final conflict reason R, a clause is generated which expresses that R does not represent any models, and the result is added to the original formula φ. In future model search and conflict analysis steps, this new learnt clause acts as a deduction shortcut.

After learning, backtracking resets the solver to an earlier state that is consistent with the newly learnt clause, or if this fails, the algorithm returns UNSAT. The stopping criterion during conflict analysis is coordinated with the backjumping step to ensure that the algorithm automatically explores a new region of the search space after backjumping (and thus avoids cycles). This mechanism is referred to as backjumping with asserting clauses [66, 67].

4.2 Complementable meet irreducibles

As demonstrated above, partial assignments are an abstract lattice and the unit rule is an approximation of the mods transformer. We now identify the specific properties of these objects that are necessary to lift the algorithm to other lattices. In propositional cdcl, singleton partial assignments of the form {p↦t} or {p↦f} play a special role: (i) the unit rule returns singleton assignments as deduction results, (ii) they are the decision elements, i.e., a decision computes a meet between the current partial assignment and a singleton assignment, and (iii) they are stored in the trail data structure tr.

In lattice theoretic terms, singleton assignments have a special property in that they cannot be expressed in terms of the meet over a set of other elements. A partial assignment {pt,qf}, for example, may be represented as the meet {pt}⊓{qf}, whereas the element {pt} cannot be further decomposed in this way.

Definition 2

(Meet irreducibles)

A meet irreducible in a complete lattice L is an element m∈L different from ⊤ such that the following implication is valid.

$$\forall m_1,m_2 \in L\quad m_1 \sqcap m_2 = m \implies(m = m_1 \lor m = m_2). $$

Definition 3

(Meet Decomposition)

A meet decomposition of an element a∈A is a set Q⊆A of meet irreducibles such that ⊓Q=a.

In the case of partial assignments, meet irreducibles are exactly the singleton assignments. An important property of these elements is that they have precise complements. For example, a singleton assignment {pt} represents the set of all propositional assignments where p is mapped to t. The complement of this set may be represented by the singleton assignment {pf}. This property is not shared by arbitrary partial assignments, e.g., the partial assignment {pt,qt} represents a set of models whose complement has no precise representation in PartAsg.

Definition 4

(Complementable meet irreducibles)

An abstract lattice A has complementable meet irreducibles if every meet irreducible m∈A has a complementary meet irreducible \(\overline{m} \in A\) such that γ(m) is the set complement of \(\gamma(\overline{m})\).

Example 1

The interval domain has complementable meet irreducibles. Consider the interval element 〈x:[0.0,5.3],y:[−3.6,10.2]〉. We may decompose the above element into meet irreducibles as follows.

$$\langle x \succeq0.0 \rangle\sqcap\langle x \preceq5.3 \rangle \sqcap\langle y \succeq-3.6 \rangle\sqcap\langle y \preceq10.2 \rangle $$

Each of the elements of the decomposition above has a precise complement, e.g., \(\overline{\langle x \succeq0.0\rangle} = \langle x \prec0.0 \rangle\).
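
Since the floating-point numbers are finite and totally ordered by ⪯, the strict complement of a non-strict bound is again a non-strict bound on the adjacent representable value. A minimal sketch, ignoring NaN and the placement of the special values in the ⪯ order:

```python
import math

# Meet irreducibles of the interval domain are single bounds on one variable,
# e.g. ('x', '<=', 5.3) for ⟨x ⪯ 5.3⟩. Over a finite set such as the floats,
# the strict complement ⟨x ≻ b⟩ is the non-strict bound ⟨x ⪰ succ(b)⟩.
def complement(m):
    var, op, b = m
    if op == "<=":
        return (var, ">=", math.nextafter(b, math.inf))
    return (var, "<=", math.nextafter(b, -math.inf))

print(complement(("x", ">=", 0.0)))   # ('x', '<=', -5e-324), i.e. ⟨x ≺ 0.0⟩
```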

Meet irreducibles are also returned by the unit rule. A cnf formula φ can then be viewed as providing a set of sound approximations \(\{\mathtt{unit\;(}C, \cdot\mathtt{)} \mid C \in\varphi\}\) of the concrete model transformer mods φ . We provide a general concept which lifts the relevant properties of the unit rule to lattices.

Definition 5

(Meet irreducible deduction)

A meet-irreducible deduction transformer for a formula φ over an abstract domain A is a sound approximation amods φ :A→A of mods φ such that for any a∈A, the element amods φ (a) is ⊤, ⊥ or a meet irreducible.

Approximations of the model transformers are typically available in abstract domains in the form of strongest post-condition operators for logical guards. The required decomposition into meet irreducible deduction transformers can be achieved in practice by first applying a monolithic abstract model transformer and then computing a meet decomposition.

Example 2

Let ded be the best abstract transformer of mods φ over the intervals for the formula φ=(−x=y), and let σ=〈x:[5.0,10.0]〉. We have that ded(σ)=σ⊓〈y:[−10.0,−5.0]〉. We can decompose ded into a set of complementable rules \(\mathit{Ded}= \{\mathit{ded}_{x}^{l}, \mathit{ded}_{x}^{u},\mathit {ded}_{y}^{l},\mathit{ded}_{y}^{u}\}\) s.t. ⊓Ded=ded, and each of the elements of Ded infers a lower or an upper bound on x or y: \(\mathit{ded}_{x}^{l}(\sigma) = \langle x \succeq5.0 \rangle\), \(\mathit{ded}_{x}^{u}(\sigma) = \langle x \preceq10.0 \rangle\), \(\mathit{ded}_{y}^{l}(\sigma) = \langle y \succeq-10.0 \rangle\) and \(\mathit{ded}_{y}^{u}(\sigma) = \langle y \preceq-5.0 \rangle\).

4.3 Abstract CDCL

We now show how cdcl may be lifted to abstract domains that have (i) complementable meet irreducibles and (ii) an approximation of the concrete model transformer mods φ expressed in terms of a set of meet irreducible deduction transformers. We reinterpret the propositional algorithm as a lattice-based procedure using the correspondences listed in Table 1 to translate between the world of propositions and partial assignments and that of lattices and abstraction.

[Table 1: Propositional concepts and their abstract-satisfaction counterparts]

The resulting acdcl framework is shown in Algorithm 5. Partial assignments are replaced by an element of the abstract domain, and the set of input clauses is replaced by a set of meet-irreducible deduction transformers. The reasons array maps elements of the trail to the transformers that were used to derive them.

Abstract model search is shown in Algorithm 6. In place of bcp, a greatest fixed point over the transformers in F is computed. Narrowing [20] may be used to enforce convergence of the fixed point computation. A simple way to implement narrowing is to end the loop after a fixed number of iterations. Whenever a meet irreducible is inferred, it is put on the trail and the transformer that was used to infer it is stored as its reason.

Once a fixed point is reached, the result is checked to see if it precisely represents a set of models. Such a check is typically simple to implement. (In terms of abstract interpretation this step corresponds to a γ-completeness check, see [30] for details.) For example, in the propositional case one may check whether the current partial assignment assigns all variables, or alternatively, whether it satisfies at least one literal in each clause.

If the result of the fixed point computation is neither a conflict nor a witness of satisfiability, then the abstract element is refined using an abstract decision by calling \(\mathtt{adecide\;(}a\mathtt{)\;}\). The decision element d must be a meet irreducible and may be chosen heuristically. If the decision element fails to refine the current element, then we return an unknown result, since we have been unable to establish whether the instance is SAT or UNSAT.
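
The following self-contained sketch illustrates the fixed-point loop of abstract model search on the formula z=y ∧ x=y⋅z ∧ x<0 under the decision 〈y⪯−5.0〉. The transformers are hand-written, icp-style projections specialised to this instance (the projection for x=y⋅z covers only the both-negative case needed here); the actual solver derives such deductions generically.

```python
import math

TOP_IV = (-math.inf, math.inf)

def meet(a, b):
    lo, hi = max(a[0], b[0]), min(a[1], b[1])
    return (lo, hi) if lo <= hi else None

def propagate(F, env, trail, reasons):
    """Greatest-fixed-point loop of abstract model search (cf. Algorithm 6).
    Each transformer returns (var, interval), or None for 'nothing deduced';
    refinements are recorded on the trail with the transformer as reason."""
    changed = True
    while changed:
        changed = False
        for name, f in F.items():
            ded = f(env)
            if ded is None:
                continue
            var, iv = ded
            refined = meet(env.get(var, TOP_IV), iv)
            if refined is None:                   # conflict element ⊥
                reasons["conflict"] = name
                return None
            if refined != env.get(var, TOP_IV):   # strictly more precise
                env = dict(env, **{var: refined})
                trail.append((var, refined))
                reasons[len(trail) - 1] = name
                changed = True
    return env

# Hand-written projections for z = y and x = y*z (both-negative case only),
# plus the constraint x < 0, under the decision y ⪯ -5.0 (cf. Example 3).
F = {
    "z=y":   lambda e: ("z", e.get("y", TOP_IV)),
    "x=y*z": lambda e: (("x", (e["y"][1] * e["z"][1], math.inf))
                        if e.get("y", TOP_IV)[1] < 0 and e.get("z", TOP_IV)[1] < 0
                        else None),
    "x<0":   lambda e: ("x", (-math.inf, -5e-324)),
}

trail, reasons = [], {}
print(propagate(F, {"y": (-math.inf, -5.0)}, trail, reasons))  # None: conflict
print(trail, reasons["conflict"])   # z ⪯ -5.0 and x ⪰ 25.0; conflict via x<0
```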

5 Learning in abstract lattices

We now present our lifting of propositional conflict analysis to abstract lattices. In cdcl, the trail implicitly encodes a graph structure. The edge information is contained in the clauses associated with each element via the reasons array. The first-uip algorithm [70] shown in Algorithm 4 computes a set of nodes of this graph, called a cut, that suffices to produce a conflict. Naively lifting the algorithm is insufficient to learn good reasons in the interval abstraction, as the following example illustrates.

[Algorithm 4: first-uip conflict analysis]

[Algorithm 5: The acdcl algorithm]

[Algorithm 6: Abstract model search]

Example 3

Consider the fpa formula z=yx=yzx<0 and the interval assignment σ=〈z⪯−5.0〉. Starting from σ, we can make the following deductions.

[Figure: graph of the deductions from 〈z⪯−5.0〉, via 〈y⪯−5.0〉, to 〈x⪰25.0〉 and the conflict with x<0]

Arrows indicate sufficient conditions for deduction, e.g., 〈x⪰25.0〉 can be deduced from the conjunction of 〈z⪯−5.0〉 and 〈y⪯−5.0〉. The last deduction 〈x⪰25.0〉 conflicts with the constraint x<0. A classic conflict cutting algorithm may analyse the above graph to conclude that π=〈z⪯−5.0〉 is the reason for the conflict. It is easy to see though that there is a much more general reason: The conflict can be deduced in this way whenever z is negative.

5.1 Abductive reasoning and heuristic choice

A central insight is that conflict analysis performs a form of abductive reasoning: in each iteration of the conflict analysis loop, a singleton assignment a in the conflict reason is replaced by a partial assignment that is sufficient to infer a. In terms of abstract satisfaction, abduction corresponds to underapproximation of the conflict transformer [28, 30].

Example 4

Consider a clause pq and a partial assignment {qt}. In following concrete application of confs pq , we write (p,qv 1,v 2) to denote the function that maps p to v 1 and q to v 2.

$$\mathit{confs}_{p \lor q}\bigl(\gamma\bigl(\{q \mapsto\mathsf{t}\}\bigr) \bigr) = \bigl\{ (p,q \mapsto\mathsf{f},\mathsf{f}), (p,q \mapsto\mathsf {f}, \mathsf{t}), (p,q \mapsto\mathsf{t},\mathsf{t})\bigr\} $$

Above, confs pq returns the set of propositional structures that dissatisfy the formula or are approximated by {qt}. Informally, confs pq (γ({qt})) computes the most general set of circumstances under which the formula implies the truth of qt.

A sat solver may decide during conflict analysis to replace the partial assignment {qt} with {pf} in the conflict reason. This may be modelled as application of a transformer over partial assignments:

$$\mathit{aconfs}_{p \lor q}\bigl(\{q \mapsto\mathsf{t}\}\bigr) = \{p \mapsto \mathsf{f}\} $$

Note that aconfs pq underapproximates confs φ , since the result of applying it does not represent the case (p,qt,t) that is covered by the concrete computation.

5.1.1 Abductive generalisation

Propositional conflict analysis with first-uip uses only propositional abductive reasoning. In order to adapt the algorithm to perform domain-specific conflict analysis, a separate abduction transformer is necessary. The result of abduction should generalise the original conflict, to avoid its re-exploration and therefore the possibility of cycles. We define an abductive generalisation transformer that has this property.

Definition 6

An abductive generalisation transformer for a formula φ over an overapproximating abstract domain A is a function aconfs φ :A×A→A, such that the properties (i) and (ii) below hold for all r,a∈A where mods φ (γ(r))⊆γ(a).

  (i) aconfs φ (r,a)⊒r

  (ii) γ(aconfs φ (r,a))⊆confs φ (γ(a))

The definition above requires some explanation. In conditions (i) and (ii) the element r represents a semantically expressed reason for a. More formally, we require that every model of φ represented by r is also represented by a. The result of calling aconfs φ (r,a) is then an element that generalises r and is still a reason for a.

Example 5

Consider the formula x=−y. Then 〈x⪰0.0〉 is a sufficient reason for deducing 〈y⪯23.5〉 using an appropriate deduction transformer. We use abductive generalisation to find a more general reason:

$$\mathit{aconfs}_{x = -y}\bigl(\langle x \succeq0.0 \rangle, \langle y \preceq23.5 \rangle\bigr) = \langle x \succeq-23.5 \rangle $$

In some domains, it may be difficult to find good abductive generalisation transformers. Note that the function (r,a)↦r (that is, no generalisation) is always a sound choice and is sufficient for implementing acdcl. The use of generalisation is therefore an optional opportunity to increase performance rather than a strict requirement.

The better the results of generalisation, the more powerful the results of learning. Good generalisation can be expensive though; while powerful abductive generalisation techniques may reduce the overall number of iterations of the acdcl algorithm, the runtime required for each individual iteration may increase. As we shall see in Sect. 6, it is important to strike a careful balance to maintain overall performance.
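
As a concrete illustration of a cheap, test-based strategy, the sketch below relaxes the lower bound of the reason from Example 5 by bisection, keeping the bound only as strong as the deduction requires. The initial insufficient bound and the iteration limit are arbitrary choices made for the sketch.

```python
TARGET = 23.5   # we deduced ⟨y ⪯ 23.5⟩ from x = -y

def sufficient(lb):
    """Is ⟨x ⪰ lb⟩ still a reason for ⟨y ⪯ 23.5⟩?  From x ⪰ lb and x = -y
    we get y ⪯ -lb, which entails y ⪯ 23.5 exactly when -lb <= 23.5."""
    return -lb <= TARGET

def generalise(lb, insufficient=-1e6):
    """Relax the lower bound by bisection while the deduction still holds:
    a bounded, test-based search in the spirit of Sect. 6.1.3."""
    hi, lo = lb, insufficient      # hi is always sufficient, lo is not
    for _ in range(100):
        mid = (lo + hi) / 2
        if sufficient(mid):
            hi = mid
        else:
            lo = mid
    return hi

# aconfs_{x=-y}(⟨x ⪰ 0.0⟩, ⟨y ⪯ 23.5⟩) ≈ ⟨x ⪰ -23.5⟩, as in Example 5
print(generalise(0.0))             # -23.5, up to the last ulp
```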

5.1.2 Heuristic choice

Galois-connection based abstract domains have a property called best abstraction: in an overapproximating abstraction, concrete elements have unique, maximally precise overapproximating representations. Similarly, every concrete transformer is overapproximated by a unique, maximally precise abstract transformer called the best abstract transformer. The abductive generalisation transformer does not have this property, since it is an underapproximate transformer over an overapproximate abstraction. In practice, this means that there may be multiple incomparable choices when attempting to generalise an element.

Example 6

Consider the formula φ given by x=y+z, and the interval elements a=〈x⪯10.0〉 and r=〈y⪯0.0,z⪯0.0〉. The element r is a sufficient reason for a, and we may apply abductive generalisation in multiple, mutually incomparable ways.

$$\mathit{aconfs}_{\varphi}^1(r,a) = \langle y \preceq10.0, z \preceq 0.0\rangle \quad\quad \mathit{aconfs}_{\varphi}^2(r,a) = \langle y \preceq0.0, z \preceq 10.0\rangle $$

Both of the abductive generalisation transformers above are sound, but they return incomparable results since neither reason is weaker or stronger than the other one. Moreover, the join of the two reasons 〈y⪯10.0,z⪯10.0〉 is not a reason for a itself, because it allows, for example, that x=20.0.

In propositional conflict analysis, the lack of a best conflict reason is reflected in the cut heuristic used to extract the conflict reason. Different heuristics may produce distinct, incomparable conflict reasons.

In acdcl, heuristic choice between reasons plays a role during abductive generalisation. As indicated in the example above, multiple reasons may be available. Abductive generalisation may choose among them based on heuristic considerations such as the state of the solver or the history of the search. We will discuss an example of an abductive generalisation heuristic in Sect. 6.

5.2 Abstract FirstUIP

We present our lifting of first-uip to lattices in Algorithm 7. It takes as input a trail tr and a reasons array (which is required to contain a mapping for ⊥). Furthermore, we assume that for each abstract model transformer f, we have a corresponding abductive generalisation \(\mathit{aconfs}_{\varphi}^{f}\). We also assume that all abstract elements (except ⊥) have a unique, subset-minimal meet decomposition. The main data structure is a marking m which maps trail indices to meet irreducible elements or ⊤. This is similar to implementations of propositional conflict analysis: there, propositions receive binary markings to indicate that they are necessary to derive the conflict. The abstract markings we use instead store for each trail element a generalisation such that a conflict may still be derived. Initially, m maps all elements to ⊤. The procedure steps backwards through the trail, and replaces trail markings using reasons generated from abductive generalisation. The results of the generalisation step are decomposed into meet irreducibles and added to the marking.

[Algorithm 7: Abstract first-uip conflict analysis]

At the end, a transformer Unit R is returned from the final conflict reason R. We will discuss the construction of this transformer in the next section.

An example execution of the algorithm is illustrated in Fig. 3, which shows an implication graph and the corresponding trail recording the consequences of a decision x⪯0.0. As in propositional cdcl, no explicit graph is constructed. Instead, the algorithm implicitly explores the graph via markings, which overapproximate the trail pointwise and encode sufficient conditions for unsatisfiability. First the algorithm determines that ⊥ can be ensured whenever z⪯6.0 and y⪯4.0 are the case. In the first iteration, it finds that y⪯4.0 can be dropped from the reason if x⪯2.0 holds in addition to z⪯6.0.

[Fig. 3: Markings in abstract first-uip]

It is an invariant during the run of the procedure that the greatest lower bound over all markings is sufficient to ensure a conflict. Hence the procedure could terminate during any iteration and yield a sound global abduction result. We use the usual first-uip termination criterion and return once the number of open paths reaches 1. This number is defined as the number of indices j greater than or equal to the index of the most recent decision such that m[j]≠⊤.

5.3 Abstract clause learning

Propositional solvers learn new clauses that express the negation of the conflict analysis result. The new clauses open up further possibilities for deduction using the unit rule. We model learning directly as learning of a new deduction rule, rather than learning a formula in the logic. A lattice-theoretic generalisation of the unit rule is given below. Note that we define the rule directly in terms of a set of conflicting meet irreducibles, rather than their negation.

Definition 7

For an abstraction A of ℘(Structs) with complementable meet irreducibles, let R⊆A be a set of meet irreducibles such that ⊓R represents no models of φ. The abstract unit rule Unit R :A→A is defined as follows.

$$\mathit{Unit}_R(a) \mathrel{\hat{=}} \begin{cases} \bot& \text{if } a \sqsubseteq\sqcap R \\ \overline{r} & \text{otherwise, if } r \in R\text{ and }\\ & \forall r' \in R \setminus\{r\}.~a \sqsubseteq r'\\ \top& \text{otherwise} \end{cases} $$

Example 7

Let c=〈x:[0.0,10.0],y⪯3.2〉 be a conflicting element of φ with decomposition R={〈x⪰0.0〉,〈x⪯10.0〉,〈y⪯3.2〉}. Let a=〈x:[3.0,4.0],y:[1.0,1.0]〉; then Unit R (a)=⊥, since a⊑c. Let a′=〈x:[3.0,4.0]〉; then Unit R (a′)=〈y≻3.2〉, since a′⊑〈x⪰0.0〉 and a′⊑〈x⪯10.0〉.

The unit rule Unit R for a conflicting set of meet irreducibles R soundly overapproximates the model transformer [30]. Furthermore, it is a meet irreducible deduction transformer, so we may add it to the set of transformers F used during model search.
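
Definition 7 translates almost directly into code. The sketch below represents meet irreducibles as single-bound interval environments and reuses complementation in the style of Example 1; it illustrates the rule itself, not the solver's data structures for learnt transformers.

```python
import math

TOP_IV = (-math.inf, math.inf)

def leq(a, b):
    """a ⊑ b for interval environments (dicts; a missing variable is ⊤)."""
    return all(a.get(x, TOP_IV)[0] >= lo and a.get(x, TOP_IV)[1] <= hi
               for x, (lo, hi) in b.items())

def complement(r):
    """Complement of a single one-sided bound, via the adjacent float."""
    (x, (lo, hi)), = r.items()
    if lo == -math.inf:                   # ⟨x ⪯ hi⟩ -> ⟨x ⪰ succ(hi)⟩
        return {x: (math.nextafter(hi, math.inf), math.inf)}
    return {x: (-math.inf, math.nextafter(lo, -math.inf))}

def unit_R(R, a):
    """The abstract unit rule of Definition 7. R is a list of meet
    irreducibles (single-bound environments) whose meet is conflicting.
    Returns ⊥ (None), the complement of the one bound not entailed by a,
    or ⊤ ({}) when no deduction is possible."""
    not_entailed = [r for r in R if not leq(a, r)]
    if not not_entailed:
        return None                       # a ⊑ ⊓R: conflict
    if len(not_entailed) == 1:
        return complement(not_entailed[0])
    return {}                             # ⊤: no information

R = [{"x": (0.0, math.inf)}, {"x": (-math.inf, 10.0)}, {"y": (-math.inf, 3.2)}]
print(unit_R(R, {"x": (3.0, 4.0), "y": (1.0, 1.0)}))  # None (⊥), as in Example 7
print(unit_R(R, {"x": (3.0, 4.0)}))                   # y ≻ 3.2: (succ(3.2), inf)
```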

6 Implementation and experiments

We have implemented our approach over floating-point intervals inside the mathsat5 smt solver [16]. We call our prototype tool fp-acdcl. The implementation uses the mathsat5 infrastructure, but is independent of its dpll(t) framework (although it may be used as a theory solver inside dpll(t) if required). The implementation provides a generic, abstract cdcl framework with first-uip learning. The overall architecture is shown in Fig. 4. An instantiation requires abstraction-specific implementations of the components described earlier, including deduction, decision making and abduction with heuristic choice. We first elaborate on those aspects of the implementation and then report experimental results.

[Fig. 4: fp-acdcl solver architecture]

6.1 Implementation details

6.1.1 Deductions

We implement deduction using standard Interval Constraint Propagation (icp) techniques for floating-point numbers, defined e.g., in [8, 54]. The implementation operates on cnf formulae over floating-point predicates.

Propagation is performed using an occurrence-list approach, which associates with each variable a list of the fpa clauses in which the variable occurs. Learnt unit transformers are stored as vectors of meet irreducible elements and are propagated in a similar way. When a deduction is made, we scan the list of affected clauses to check for new deductions to be added to the trail. This is done by applying icp projection functions to the floating-point predicates in a way that combines purely propositional with theory-specific reasoning. A predicate is conflicting if some variable is assigned the empty interval during icp. If all predicates of a clause are conflicting, then we have found a conflict with the current interval assignment and ded returns ⊥. If all but one predicate in a clause are conflicting, then the result of applying icp to the remaining predicate is the deduction result. In this case, ded returns a list containing one meet irreducible 〈x⪰b〉 (or 〈x⪯b〉) for each new bound inferred.

6.1.2 Decisions

fp-acdcl performs decisions by adding to the trail one meet irreducible element 〈x⪯b〉 or 〈x⪰b〉 that does not contradict the previous value of x. Clearly, there are many possible choices for (i) selecting a variable x, (ii) selecting a bound b, and (iii) choosing between 〈x⪯b〉 and 〈x⪰b〉.

In propositional cdcl, each variable can be assigned at most once. In our lifting, a variable can be assigned multiple times with increasingly precise bounds. We have found some level of fairness to be critical for performance. Decisions should be balanced across different variables and upper and lower bounds. A strategy that proceeds in a “depth-first” manner, in which the same variable is refined using decisions until it has a singleton value, shows inferior performance compared to a “breadth-first” exploration, in which intervals of all the variables are restricted uniformly. We interpret this finding as indication that the value of abstraction lies in the fact that the search can be guided effectively using general, high-level reasoning, before considering very specific cases.

fp-acdcl currently performs decisions as follows: (i) variables are statically ordered, and the choice of the variable x to branch on cycles through this order; (ii) the bound b is chosen to be an approximation of the arithmetic average of the current bounds l and u on x; note that the arithmetic average differs from the median, since floating-point values are distributed non-uniformly across the reals; (iii) the choice between 〈x⪯b〉 and 〈x⪰b〉 is random. Considering the advances in heuristics for propositional sat, there is likely substantial room for improvement here. In particular, integrating fairness considerations with the activity-based heuristics typically used in modern cdcl solvers could lead to similar performance improvements. This is part of ongoing and future work.
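The gap between the arithmetic average and the median is easy to see in code. The sketch below (illustrative only, not the tool's implementation) computes both for an interval of doubles; the bit-pattern trick for ordering IEEE 754 doubles as integers is standard, and consecutive bit patterns correspond to consecutive doubles of the same sign.

```python
import struct

def midpoint(l, u):
    """Arithmetic average of the bounds, as used for the decision bound."""
    return l + (u - l) / 2.0          # written to avoid overflow in l + u

def ordered_bits(x):
    """Map a double to an integer whose ordering matches the ordering of
    doubles (standard IEEE 754 trick; negative values are folded)."""
    (i,) = struct.unpack("<q", struct.pack("<d", x))
    return i if i >= 0 else -(1 << 63) - i

def bits_to_double(i):
    i = i if i >= 0 else -(1 << 63) - i    # the folding is an involution
    return struct.unpack("<d", struct.pack("<q", i))[0]

def median(l, u):
    """The double with (roughly) as many representable doubles below it
    as above it within [l, u]: average of the ordered bit patterns."""
    return bits_to_double((ordered_bits(l) + ordered_bits(u)) // 2)

# Doubles are distributed non-uniformly, so the two notions diverge wildly:
print(midpoint(0.0, 1e300))   # 5e+299
print(median(0.0, 1e300))     # ~1.07e-04: half of all doubles in the
                              # interval lie below this tiny value
```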

6.1.3 Generalised explanations for conflict analysis

In abduction, a trade-off must be made between finding reasons quickly and finding very general reasons. We perform abduction that relaxes bounds iteratively. As mentioned earlier, there may be many incomparable relaxations. Our experiments suggest that the way in which bounds are relaxed is extremely important for performance. Fairness considerations similar to those mentioned for the decision heuristic need to be taken into account. However, there is an additional, important criterion. Learnt unit rules are used to drive backjumping. It is therefore preferable to learn deduction rules that allow for backjumping higher in the trail. This will lead to propagations that are affected by a smaller number of decisions, and thus will hold for a larger portion of the search space.

Our choice heuristic, called trail-guided choice, is abstraction-independent; it is fair and aims to increase backjump potential. In the first step, we remove from the initial reason q all bounds over variables which are irrelevant to the deduction. Then we step backwards through the trail and attempt to weaken the current element q using trail elements. The process works as follows.


When an element tr_j is encountered such that tr_j is a bound on a variable x that is used in q (that is, q⊑tr_j), the first thing we do is to attempt to weaken q by replacing the bound tr_j with the most recent trail element more general than tr_j. If no such element exists, we replace tr_j with the trivial bound \(\langle x \succeq\min_{\preceq}(\mathbb{F}) \rangle \). We check whether the weakened q is still sufficiently strong to deduce d. If so, we set q as the candidate generalisation and continue stepping backwards through the trail by processing the next element tr_{j−1}. If not, we undo the weakening and try intermediate cases, that is, elements weaker than q but stronger than tr_j, performing only a bounded number of attempts on the current variable x. For example, if q contains x∈[l:u] and tr_j is 〈x⪯c〉 with u⪯c, we try setting the interval for x to [l:u+(c−u)/2] and so on, until either no further generalisation on the upper bound of x is possible, or we reach the limit on the number of attempts. The algorithm terminates once no further generalisations are possible.

Since we step backwards in order of deduction, we heuristically increase the potential for backjumps: the procedure never weakens a bound that was introduced early during model search at the expense of having to uphold a bound that is ensured only at a deep level of the search.
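The following is a sketch of the trail-guided loop under simplifying assumptions: the trail is reduced to a list of upper bounds (lower bounds are symmetric), the replacement by the trivial bound is omitted, and still_deduces stands for an assumed callback into the deduction engine that checks whether the weakened reason still suffices to infer d. The bisection between the last sufficient bound and the trail bound corresponds to the bounded sequence of intermediate attempts described above.

```python
def trail_guided_generalise(q, trail, still_deduces, max_attempts=100):
    """Weaken the reason q (dict var -> (lo, hi)) by stepping backwards
    through the trail.  `trail` is simplified to (var, upper bound) pairs;
    `still_deduces(q)` is an assumed oracle checking that q still
    suffices for the deduction d."""
    for var, trail_hi in reversed(trail):
        if var not in q:
            continue                  # bound irrelevant to this reason
        lo, hi = q[var]
        if trail_hi <= hi:
            continue                  # tr_j is not a weakening for q here
        # Coarsest attempt first: adopt the more general trail bound.
        q[var] = (lo, trail_hi)
        if still_deduces(q):
            continue                  # keep the weakening, step back further
        # Otherwise bisect between the last sufficient upper bound and the
        # failed one, up to a cutoff (the genX value of Sect. 6.2.2).
        good, bad = hi, trail_hi
        for _ in range(max_attempts):
            mid = good + (bad - good) / 2.0
            if mid == good or mid == bad:
                break                 # no double strictly in between
            q[var] = (lo, mid)
            if still_deduces(q):
                good = mid
            else:
                bad = mid
        q[var] = (lo, good)           # weakest bound known to suffice
    return q
```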

We have experimented with stronger but computationally more expensive generalisation techniques such as finding maximal bounds for deductions by search over floating-point values, as well as with different limits on the number of generalisation attempts on a single bound. Our experiments, reported in Sect. 6.2.2, indicate that the cheaper technique described above is more effective overall, and that using a small cutoff value gives the best trade-off between cost and quality of the generalisations. However, we believe that more sophisticated strategies might provide further benefits. In particular, we see two main avenues for improvement: first, for many deductions it is possible to implement good or optimal abduction transformers effectively without search. Second, we expect that dynamic heuristics that take into account statistical information may guide conflict analysis towards useful clauses.

6.2 Experimental evaluation

We have evaluated our prototype fp-acdcl tool over a set of 213 benchmark formulas, both satisfiable and unsatisfiable. The formulas have been generated from problems that check

  1. ranges on numerical variables and expressions,

  2. error bounds on numerical computations using different orders of evaluation of subexpressions, and

  3. feasibility of systems of inequalities over bounded floating-point variables.

The first two sets originate from verification problems on C programs performing numerical computations, whereas the instances in the third set are randomly generated. Our benchmarks and the fp-acdcl tool are available for reproducing the experiments at http://es.fbk.eu/people/griggio/papers/FMSD-fmcad12.tar.bz2. All results have been obtained on a 2.6 GHz Intel Xeon machine with 16 GB of memory running Linux, with a time limit of 1200 seconds.

6.2.1 Comparison with bit-vector encodings

In the first set of experiments, we have compared fp-acdcl with the current state-of-the-art procedures for floating-point arithmetic based on encoding into bit-vectors. For this, we have compared against all the smt solvers supporting fpa that we were aware of, namely z3 [26], sonolar [50] and mathsat5 [16]. All three solvers use a bit-vector encoding of floating-point arithmetic which is then solved via reduction to sat (bit-blasting). For each tool, we used the default options. The results of this comparison are reported in Figs. 5 and 6. The plot in Fig. 5 shows the number of successfully solved instances for each system (on the Y axis) and the total time needed for solving them (on the X axis). From it, we can see that fp-acdcl clearly outperforms z3 both in the number of instances solved and in total execution time, and that fp-acdcl solves only one more instance than sonolar but is significantly faster overall. Compared to mathsat5, the results are mixed: as can be seen in the scatter plot of Fig. 6, on one hand fp-acdcl is much faster than mathsat5 on the majority of instances that both tools can solve, but on the other hand mathsat5 seems to still have an advantage in terms of scalability, solving overall 6 more instances than fp-acdcl. More generally, there are some instances that turn out to be relatively easy for solvers based on bit-blasting, but cannot be solved by fp-acdcl. This is not surprising, since there are simple instances that are not amenable to analysis with icp, even with the addition of decision-making and learning. To handle such cases, our framework can be instantiated with abstract domains or combinations of domains [21] that are better suited to the problems under analysis. Moreover, bit-blasting approaches can take advantage of highly efficient sat solvers, which are the result of years of development, optimisation and fine-tuning, whereas our fp-acdcl tool should still be considered a prototype.

Fig. 5: Comparison of fp-acdcl against various smt solvers using bit-vector encoding of floating-point operations

Fig. 6: Detailed comparison of fp-acdcl against mathsat5 using bit-vector encoding of floating-point operations. Circles indicate unsatisfiable instances, triangles satisfiable ones. Points on the borders indicate timeouts (1200 s)

6.2.2 Impact of optimizations

The second set of experiments aims at evaluating the impact of our variable selection and generalisation techniques. In order to evaluate our novel generalisation technique, we have first run fp-acdcl with generalisation of deductions turned off, and compared it with the default fp-acdcl. Essentially, fp-acdcl without generalisation corresponds to a naive lifting of the conflict analysis algorithm. The results are summarised in Fig. 7. From the plot, we can clearly see that generalisation is crucial for the performance of fp-acdcl: without it, the tool times out in 42 more cases, whereas no instance can be solved only when generalisation is disabled. However, there are a number of instances for which performance degrades when using generalisations, sometimes significantly. This can be explained by observing that (i) generalisations come at a runtime cost, which can sometimes induce a non-negligible overhead; and (ii) the performance degradation occurs on satisfiable instances (shown in a lighter colour in the plots), for which the behaviour of cdcl-based approaches is known to be typically unstable (even in the propositional case).

Fig. 7: Effects of generalisations in conflict analysis. Circles indicate unsatisfiable instances, triangles satisfiable ones. Points on the borders indicate timeouts (1200 s)

Subsequently, we have performed a more in-depth evaluation of the effects of using different generalisation strategies. In particular, we have compared versions of fp-acdcl with different cutoff values for the number of generalisation attempts in the trail-guided procedure described in Sect. 6.1.3. The results are collected in Fig. 8. In the figure, ‘genX’ stands for a configuration with a cutoff value of ‘X’ generalisation attempts, ‘nogen’ is the configuration in which no generalisation is performed, and ‘gen inf’ is the one which does not impose any bound on the number of generalisation attempts. From the plot, we can see that the strategies that impose a limit on the number of generalisation attempts significantly outperform both the unconstrained strategy and the naive one that uses no generalisation at all. The results also indicate that there is a trade-off between the quality of a generalisation and the cost of computing it, with a ‘sweet spot’ (for our benchmarks) reached using a cutoff value of 100 attempts per step.

Fig. 8: Comparison of different strategies for conflict generalisation in fp-acdcl

Finally, we have performed a further set of experiments in order to evaluate the impact of fairness in the variable selection heuristics for branching and conflict generalisation. We have compared the default version of fp-acdcl (which tries to ensure fairness as described in Sects. 6.1.2 and 6.1.3) with a version in which variables are selected randomly. The results, shown in Fig. 9, demonstrate that fairness is a very important factor for the performance of fp-acdcl: while there are instances which are solved more efficiently when selecting variables randomly, overall the use of fair selection strategies allows fp-acdcl to solve 23 more instances.

Fig. 9: Effects of fairness in branching heuristic. Circles indicate unsatisfiable instances, triangles satisfiable ones. Points on the borders indicate timeouts (1200 s)

7 A survey of related work

The work presented in this paper may be understood in the context of efforts to unify sat techniques and abstract interpretation: the work in [29] describes an abstract interpreter for programs that uses sat-style conflict-driven learning; [28] and [9] show, respectively, that dpll-based sat solvers and dpll(t)-based smt solvers are abstract interpreters; and the general acdcl algorithm is presented in [30]. In [10], we give an interpolation procedure for some instances of acdcl, including fp-acdcl.

We now separately survey work in three related branches of research: (1) the analysis of floating-point computations, (2) lifting existing decision procedure architectures to richer problem domains and (3) automatic and intelligent precision refinement of abstract analyses.

7.1 Reasoning about floating-point numbers

This section briefly surveys work in interactive theorem proving, abstract interpretation and decision procedures that target floating-point problems. For a discussion of the special difficulties that arise in this area, see [59].

7.1.1 Theorem proving

Various floating-point axiomatisations and libraries for interactive theorem provers exist [24, 40, 53, 57]. Theorem provers have been applied extensively to proving properties of floating-point algorithms or hardware [1, 41–44, 49, 60, 64]. While theorem proving approaches have the potential to be sound and complete, they require substantial manual work, although sophisticated (but incomplete) strategies exist to automate substeps of the proof, e.g., [2]. A preliminary attempt to integrate such techniques with smt solvers has recently been proposed in [18].

7.1.2 Abstract interpretation

Analysis of floating-point computations has also been extensively studied in abstract interpretation. An approach to specifying floating-point properties of programs was proposed in [7]. A number of general-purpose abstract domains have been constructed for the analysis of floating-point programs [12–15, 46, 56]. In addition, specialised approaches exist which target specific problem domains such as digital filters [32, 58]. The approaches discussed so far mainly aim at establishing the result of a floating-point computation. An orthogonal line of research analyses the deviation of a floating-point computation from its real counterpart by studying the propagation of rounding errors [35, 37]. Case studies for this approach are given in [27, 38]. Abstract interpretation techniques provide a soundness guarantee, but may yield imprecise results.

7.1.3 Decision procedures

In the area of decision procedures, study of floating-point problems is relatively scarce. Work in constraint programming [55] shows how approximation with real numbers can be used to soundly restrict the scope of floating-point values. In [8], a symbolic execution approach for floating-point problems is presented, which combines interval propagation with explicit search for satisfiable floating-point assignments. An smtlib theory of fpa was presented in [63]. Recent decision procedures for floating-point logic are based on propositional encodings of floating-point constraints. Examples of this approach are implemented in mathsat5 [16], cbmc [17] and Sonolar [45]. A difficulty of this approach is that even simple floating-point formulas can have extremely large propositional encodings, which can be hard for current sat solvers. This problem is addressed in [11], which uses a combination of over- and underapproximate propositional abstractions in order to keep the size of the search space as small as possible.

7.2 Lifting decision procedures

The practical success of cdcl solvers has given rise to various attempts to lift the algorithmic core of cdcl to new problem domains. This idea is extensively studied in the field of satisfiability modulo theories. The most popular such lifting is the dpll(t) framework [34], which separates theory-specific reasoning from Boolean reasoning over the structure of the formula. Typically a propositional cdcl solver is used to reason about the Boolean structure while an ad-hoc procedure is used for theory reasoning. The dpll(t) framework can suffer from some difficulties that arise from this separation. To alleviate these problems, approaches such as theory decisions on demand [4] and theory-based decision heuristics [36] have been proposed.

Our work is situated in the context of natural-domain smt [19], which aims to lift steps of the cdcl algorithm to operate directly over the theory. Notable examples of such approaches have been presented for equality logic with uninterpreted functions [3], linear real arithmetic and difference logic [19, 51], linear integer arithmetic [47], nonlinear integer arithmetic [33], and nonlinear real arithmetic [48]. The work in [33] is most similar to ours since it also operates over intervals and uses an implication graph construction.

We follow a slightly different approach to generalisation based on abstract interpretation. The work in [28] shows that sat solvers can naturally be considered as abstract interpreters for logical formulas. Generalisations can then be obtained by using different abstract domains. Our work is an application of this insight. A similar line of research was independently undertaken in [68, 69], which presents an abstract-interpretation based generalisation of Stålmarck’s method and an application to computation of abstract transformers.

7.3 Refining abstract analyses

A number of program analyses exist that use decision procedures or decision procedure architectures to refine a base analysis. A lifting of cdcl to program analyses over abstract domains is given in [29]. In [52], a decision-procedure based software model checker is presented that imitates the architecture of a cdcl solver. A lifting of dpll(t) to refinement of abstract analyses is presented in [39], which combines a cdcl solver with an abstract interpreter.

Modern cdcl solvers can be viewed as refinements of the original dpll algorithm [25], which is based on case-analysis. Case analysis has been studied in the abstract interpretation literature. The formal basis is given by cardinal power domains, already discussed in [21], in which a base domain is refined with a lattice of cases. The framework of trace partitioning [62] describes a systematic refinement framework for programs based on case analysis. The dpll algorithm can be viewed as a special instance of dynamic trace partitioning applied to the analysis of logical formulas.

8 Conclusions and future work

We have presented a decision procedure for the theory of floating-point arithmetic based on a strict lifting of the conflict analysis algorithm used in modern cdcl solvers to abstract domains. We have shown that, for a certain class of formulas, this approach significantly outperforms current complete solvers based on bit-vector encodings. Both our formalism and our implementation are modular and separate the cdcl algorithm from the details of the underlying abstraction. Furthermore, the overall architecture is not tied to analysing properties over floating-point formulas.

We are interested in a number of avenues of future research. One of these is a comparison of abstract cdcl and dpll(t)-based architectures, and an investigation of possible integrations. Another is instantiating acdcl with richer abstractions (e.g., octagons). Combinations and refinements of abstractions are well studied in the abstract interpretation literature [21]. Recent work [22] has shown that Nelson-Oppen theory combination is an instance of a product construction over abstract domains. We hope to apply this work to obtain effective theory combination within acdcl. In addition, product constructions can be used to enhance the reasoning capabilities within a single theory, e.g., by fusing interval-based reasoning over floating-point numbers with propositional reasoning about the corresponding bit-vector encoding.

We see this work as a step towards integrating the abstract interpretation point of view with algorithmic advances made in the area of decision procedures. Black-box frameworks such as dpll(t) abstract away from the details of their component procedures. Abstract interpretation can be used to express an orthogonal, algebraic “white-box” view which, we believe, has uses in both theory and practice.