Optimization Modulo the Theories of Signed Bit-Vectors and Floating-Point Numbers

Optimization Modulo Theories (OMT) is an important extension of SMT which allows for finding models that optimize given objective functions, typically consisting of linear-arithmetic or pseudo-Boolean terms. However, many SMT and OMT applications, in particular from SW and HW verification, require handling bit-precise representations of numbers, which in SMT are handled by means of the theory of Bit-Vectors (BV) for the integers and that of Floating-Point Numbers (FP) for the reals, respectively. Whereas an approach for OMT with (unsigned) BV has been proposed by Nadel & Ryvchin, we are not aware of any existing approach for OMT with FP. In this paper we fill this gap. We present a novel OMT approach, based on the novel concepts of attractor and dynamic attractor, which extends the work of Nadel & Ryvchin to signed BV and, most importantly, to FP. We have implemented some OMT(BV) and OMT(FP) procedures on top of OptiMathSAT and tested the latter ones on modified problems from the SMT-LIB repository. The empirical results support the validity and feasibility of the novel approach.

However, many SMT and OMT applications, in particular from SW and HW verification, require handling bit-precise representations of numbers, which in SMT are handled by means of the theory of Bit-Vectors (BV) for the integers and that of Floating-Point Numbers (FP) for the reals, respectively. (For instance, during the verification of a piece of software, one may look for the minimum/maximum value of some int [resp. float] parameter causing an SMT(BV) [resp. SMT(FP)] call to return SAT, which typically corresponds to the presence of some bug, so as to guarantee a safe range for that parameter.) OMT for the theory of (unsigned) bit-vectors (OMT(BV)) was proposed by Nadel and Ryvchin [31], although a reduction of the problem to MaxSAT was already implemented in the SMT/OMT solver Z3 [9]. The work in [31] was based on the observation that OMT on unsigned BV can be seen as lexicographic optimization over the bits in the bitwise representation of the objective, ordered from the most-significant bit (MSB) to the least-significant bit (LSB).
In this paper we address, for the first time to the best of our knowledge, OMT for the theory of signed Bit-Vectors and, most importantly, for the theory of Floating-Point Arithmetic (OMT(FP)), by exploiting some properties of the two's complement encoding for signed BV and of the IEEE 754-2008 encoding for FP, respectively.
We start by introducing the notion of attractor, which represents (the bitwise encoding of) the target value for the objective which the optimization process aims at. This allows us to easily leverage the procedure of [31] to work with both signed and unsigned Bit-Vectors, by minimizing lexicographically the bitwise distance between the objective and the attractor, that is, by minimizing lexicographically the bitwise-xor between the objective and the attractor.
Unfortunately, there is no such notion of (fixed) attractor for FP numbers, because the target value moves as the bits of the objective are updated from the MSB to the LSB, and the optimization process may have to change its aim dynamically, even towards the opposite direction. (For instance, as soon as the minimization process realizes there is no solution with a negative value for the objective and thus sets its MSB to 0, the target value is switched from −∞ to 0+, and the search switches direction, from the maximization of the exponent and the significand to their minimization.) To cope with this fact, we introduce the notions of dynamic attractor and attractor trajectory, representing the dynamics of the moving target value, which are progressively updated as the bits of the objective are updated from the MSB to the LSB. Based on these ideas, we present novel OMT(FP) procedures, which require at most n + 2 incremental calls to an SMT(FP) solver, n being the number of bits in the representation of the objective. Notice that these procedures do not depend on the underlying SMT(FP) procedure used, provided the latter allows for accessing and setting the single bits of the objective.
We have implemented these OMT(BV) and OMT(FP) procedures on top of the OPTIMATHSAT OMT solver [42]. We have run an experimental evaluation of the OMT(FP) procedures on modified SMT(FP) problems from the SMT-LIB library. The empirical results support the validity and feasibility of the novel approach.
The rest of the paper is organized as follows. In §2 we provide the necessary background on BV and FP theories and reasoning. In §3 we provide the novel theoretical definitions and results. In §4 we describe our novel OMT(FP) procedures. In §5 we present the empirical evaluation. In §6 we conclude, hinting at some future directions.

Background
We assume some basic knowledge on SAT and SMT and briefly introduce the reader to the Bit-Vector and Floating-Point theories.
Bit-Vectors. A bit is a Boolean variable that can be interpreted as 0 or 1. A Bit-Vector (BV) variable $v^{[n]}$ is a vector of $n$ bits, where $v[0]$ is the Most Significant Bit (MSB) and $v[n-1]$ is the Least Significant Bit (LSB). A BV constant of width $n$ is an interpreted vector of $n$ values in $\{0, 1\}$. We overline a bit value or a BV value to denote its complement (e.g., $\overline{[11010010]}$ is $[00101101]$). A BV variable/constant of width $n$ can be unsigned, in which case its domain is $[0, 2^n - 1]$, or signed, which we assume to comply with the two's complement representation, so that its domain is $[-2^{n-1}, 2^{n-1} - 1]$. Therefore, the vector $[11111111]$ can be interpreted either as the unsigned BV constant $255^{[8]}$ or as the signed BV constant $-1^{[8]}$. Following the SMT-LIBV2 standard [3], we may also represent a BV constant in binary (e.g. $28^{[8]}$ is written #b00011100) or in hexadecimal (e.g. $28^{[8]}$ is written #x1C) form. A BV term is built from BV constants, variables and interpreted BV functions which represent standard RTL operators: word concatenation (e.g. $3^{[8]} \circ x^{[8]}$), sub-word selection (e.g. $(3^{[8]}[6:3])^{[4]}$), modulo-$n$ sum and multiplication (e.g. $x^{[8]} +_8 y^{[8]}$ and $x^{[8]} \cdot_8 y^{[8]}$), bit-wise operators (like, e.g., $and_n$, $or_n$, $xor_n$, $nxor_n$, $not_n$), left and right shift $<<_n$, $>>_n$. A BV atom can be built by combining BV terms with interpreted predicates like $\geq_n$, $<_n$ (e.g. $0^{[8]} \geq_8 x^{[8]}$) and equality. We refer the reader to [3,24] for further details on the syntax and semantics of Bit-Vector theory.
There are two main techniques for BV satisfiability, the "eager" and the "lazy" approach, which are substantially complementary to one another [25]. In the eager approach, BV terms and constraints are encoded into SAT via bit-blasting [23,17,16,24,33,32]. In the lazy approach, BV terms are not immediately expanded, so as to avoid scalability issues, and the BV solver is composed of a layered set of techniques, each of which deals with a sub-portion of the BV theory [15,10,18,24].
Floating-Point. The theory of Floating-Point Numbers (FP) [3,36,13] is based on the IEEE standard 754-2008 [4] for floating-point arithmetic, restricted to the binary case. A FP sort is an indexed nullary sort identifier of the form (_ FP <ebits> <sbits>) s.t. both ebits and sbits are positive integers greater than one, ebits defines the number of bits in the exponent and sbits defines the number of bits in the significand, including the hidden bit. A FP variable $v^{[n]}$ with sort (_ FP <ebits> <sbits>) can be indifferently viewed as a vector of $n \stackrel{def}{=} ebits + sbits$ bits, where $v[0]$ is the Most Significant Bit (MSB) and $v[n-1]$ is the Least Significant Bit (LSB), or as a triplet of Bit-Vectors $\langle sign, exp, sig \rangle$ s.t. sign is a BV of size 1, exp is a BV of size ebits and sig is a BV of size sbits − 1. A FP constant is a triplet of BV constants. Given a fixed floating-point sort, i.e. a pair $\langle ebits, sbits \rangle$, the following FP constants are implicitly defined: plus and minus infinity, (fp #b0 #b1...1 #b0...0) and (fp #b1 #b1...1 #b0...0); plus and minus zero, (fp #b0 #b0...0 #b0...0) and (fp #b1 #b0...0 #b0...0); and not-a-number, (fp t #b1...1 s), where t is either 0 or 1 and s is a BV which contains at least a 1.
Setting aside special FP constants, the remaining FP values can be classified as either normal or subnormal (a.k.a. denormal) [4]. A FP number is said to be subnormal when every bit in its exponent is equal to zero, and normal otherwise. The significand of a normal FP number is always interpreted as if the leading binary digit is equal to 1, while for subnormal FP values the leading binary digit is always 0. This allows for the representation of numbers that are closer to zero, although with reduced precision.
Example 1. Let x be the normal FP constant (_ FP #b0 #b1100 #b0101000), and y be the subnormal FP constant (_ FP #b0 #b0000 #b0101000), so that their corresponding sort is (_ FP <4> <8>). Then, according to the semantics defined in the IEEE standard 754-2008 [4], with exponent bias $2^{4-1} - 1 = 7$, the floating-point value of x and y in decimal notation is given by:
$$x = 2^{12-7} \cdot (1 + 2^{-2} + 2^{-4}) = 42, \qquad y = 2^{1-7} \cdot (0 + 2^{-2} + 2^{-4}) = \frac{5}{1024}.$$
The theory of FP provides a variety of built-in floating-point operations as defined in the IEEE standard 754-2008. This includes binary arithmetic operations (e.g. +, −, ×, ÷), basic unary operations (e.g. abs, −), binary comparison operations (e.g. ≤, <, =, ≠, >, ≥), the remainder operation, the square root operation and more. Importantly, arithmetic operations are performed as if with infinite precision, but the result is then rounded to the "nearest" representable FP number according to the specified rounding mode. Five rounding modes are made available, as in [4].
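The interpretation of normal and subnormal values used in Example 1 can be checked with a minimal Python sketch. The function name `fp_value` and the encoding of each Bit-Vector as a string of '0'/'1' characters are assumptions made for illustration only; rounding modes play no role here because the function only decodes a constant.

```python
import math

def fp_value(sign, exp, sig):
    """Decode an FP constant given as three bit strings (sign, exponent,
    significand without the hidden bit), following IEEE 754 binary formats."""
    ebits = len(exp)
    bias = 2 ** (ebits - 1) - 1
    e = int(exp, 2)
    frac = int(sig, 2) / 2 ** len(sig)       # fractional part of the significand
    s = -1.0 if sign == "1" else 1.0
    if e == 2 ** ebits - 1:                  # all-ones exponent: infinity or NaN
        return s * math.inf if frac == 0 else math.nan
    if e == 0:                               # subnormal (or zero): hidden bit is 0
        return s * frac * 2.0 ** (1 - bias)
    return s * (1.0 + frac) * 2.0 ** (e - bias)  # normal: hidden bit is 1

x = fp_value("0", "1100", "0101000")   # the normal constant of Example 1
y = fp_value("0", "0000", "0101000")   # the subnormal constant of Example 1
```

Both constants share the same significand bits; only the exponent field decides whether the hidden bit is read as 1 or 0.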
The most common approach for FP-satisfiability is to encode FP expressions into BV formulas based on the circuits used to implement floating-point operations, using appropriate under- and over-approximation schemes (or a mixture of both) to improve performance [14,44,45,43]. Then, the BV-Solver is used to deal with the FP formula, using either the eager or the lazy BV approach. An alternative approach, based on abstract interpretation, is presented in [11,12,26]. With this technique, called Abstract CDCL (ACDCL), the set of feasible solutions is over-approximated with floating-point intervals, so that interval-based conflict analysis is performed to decide FP-satisfiability.

Theoretical Framework
We present our generalization of [31] to the case of signed/unsigned Bit-Vector Optimization, and then move on to deal with Floating-Point Optimization.

Bit-Vector Optimization
Without any loss of generality, we assume that every objective function $f(...)$ is replaced by a variable $obj$ of the same type by conjoining "$obj = f(...)$" to the input formula. We use the symbol $n$ to denote the bit-width of $obj$, and $obj[i]$ to denote the $i$-th bit of $obj$, where $obj[0]$ and $obj[n-1]$ are the Most Significant Bit (MSB) and the Least Significant Bit (LSB) of $obj$ respectively. We define the Bit-Vector Optimization problem as follows.
Definition 1. (OMT(BV)). Let $\varphi$ be an SMT(BV) formula and $obj$ be a (signed or unsigned) BV variable occurring in $\varphi$. We call an Optimization Modulo BV problem, OMT(BV), the problem of finding a model $M$ for $\varphi$ (if any) whose value of $obj$, denoted with $\min_{obj}(\varphi)$, is minimum wrt. the total order relation $\leq_n$ for signed BVs if $obj$ is signed, and the one for unsigned BVs otherwise. (The dual definition where we look for the maximum follows straightforwardly.)
Hereafter, we generalize the unsigned BV maximization procedures described in [31] to the case of signed and unsigned BV optimization. To this end, we introduce the novel notion of BV attractor.
Definition 2. (Attractor, attractor equalities). When minimizing [resp. maximizing], we call attractor for $obj$ the smallest [resp. greatest] BV-value $attr$ of the sort of $obj$. We call vector of attractor equalities the vector $A$ s.t.
$$A[k] \stackrel{def}{=} (obj[k] = attr[k]), \quad k \in [0..n-1].$$
In essence, the attractor can be seen as the target value of the optimization search and therefore it can be used to determine the desired improvement direction and to guide the decisions taken by the optimization search. By construction, if a model M satisfies all equalities A[i], then M(obj) = attr.
More generally, if $M$ is a model of $\varphi$, then the value of $obj$ in $M$, denoted with $M(obj)$, is given by
$$M(obj) = \sum_{k=0}^{n-1} 2^{n-1-k} \cdot M(obj[k])$$
when $obj$ is an unsigned BV objective, and by
$$M(obj) = -2^{n-1} \cdot M(obj[0]) + \sum_{k=1}^{n-1} 2^{n-1-k} \cdot M(obj[k])$$
when $obj$ is a signed BV objective, using the two's complement representation. We use the symbol $\mu_k$ to denote a generic (possibly partial) assignment which assigns at least the $k$ most-significant bits of $obj$. We use the symbol $\tau_k$ to denote an assignment to all and only the $k$ most-significant bits of $obj$. Given $i < k$, we denote by [...]
Definition 3. (Lexicographic maximization). Consider an OMT instance $\langle \varphi, obj \rangle$ and the vector of attractor equalities $A$. We say that an assignment $\tau_n$ to $obj$ lexicographically maximizes $A$ wrt. $\varphi$ iff, for every $k \in [0..n-1]$, [...]
[...] $\sum_{k=0}^{n-1} 2^{n-1-k} \cdot (obj[k]\ nxor_1\ attr[k])$], where $xor_n$ is the bitwise-xor operator and $nxor_n$ is its complement, because $2^{n-1-i} > \sum_{k=i+1}^{n-1} 2^{n-1-k}$.
The following fact derives from the above definitions and the properties of the two's complement representation adopted by the SMT-LIBV2 standard for signed BV.
Theorem 1. An optimal solution of an OMT(BV) problem $\langle \varphi, obj \rangle$ is any model $M$ of $\varphi$ which lexicographically maximizes the vector of attractor equalities $A$.
Proof. (We investigate the minimization case, since the maximization case is dual.) In the case of minimization with unsigned BV, $attr$ is $[00...00]$, so that the lexicographic optimization corresponds to minimizing $\sum_{k=0}^{n-1} 2^{n-1-k} \cdot obj[k]$, which is the standard minimization for unsigned BV.
In the case of minimization with signed BV, $attr$ is $[10...00]$, so that the lexicographic optimization corresponds to minimizing $2^{n-1} \cdot \overline{obj[0]} + \sum_{k=1}^{n-1} 2^{n-1-k} \cdot obj[k]$, which, by subtracting the constant value $2^{n-1}$ (recall that $\overline{obj[0]} = 1 - obj[0]$), is equivalent to minimizing $-2^{n-1} \cdot obj[0] + \sum_{k=1}^{n-1} 2^{n-1-k} \cdot obj[k]$, which is the standard minimization for two's complement BV. □
Definitions 2 and 3 with Theorem 1 thus suggest a direct extension to the minimization/maximization of signed BV of the algorithm for unsigned BV in [31]: apply the unsigned-BV maximization [resp. minimization] algorithm of [31] to the objective $obj' \stackrel{def}{=} (obj\ nxor_n\ attr)$ [resp. $obj' \stackrel{def}{=} (obj\ xor_n\ attr)$] instead of simply to $obj$.
Example 3. Let $obj^{[3]}$ be a signed BV goal of 3 bits to be minimized and $attr \stackrel{def}{=} [100]$ be its attractor, so that the corresponding vector of attractor equalities $A$ is equal to $[(obj[0] = 1), (obj[1] = 0), (obj[2] = 0)]$. [...], because the former satisfies the attractor equality corresponding to the MSB while the latter does not. Moreover, the assignment $\tau_3$ [...] is lexicographically worse than the assignment [...], because, all the rest being equal, the latter assignment makes the attractor equality $(obj[2] = 0)$ true.
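The reduction suggested by Theorem 1 can be illustrated with a small Python sketch. It is not the OBV-BS algorithm of [31]: the SMT solver is replaced by an explicit list of feasible bit patterns, and the function names are hypothetical. The sketch fixes the bits of an unsigned value from the MSB to the LSB, and obtains signed minimization by applying that routine to the bitwise-xor between the objective and the attractor [10...0].

```python
def to_signed(value, n):
    """Two's complement interpretation of an n-bit pattern."""
    return value - (1 << n) if value >> (n - 1) else value

def minimize_unsigned_bitwise(feasible, n):
    """Fix bits MSB-first, preferring 0 whenever some feasible pattern allows it."""
    mask, prefix = 0, 0
    for k in range(n):
        bit = 1 << (n - 1 - k)
        mask |= bit
        # Stand-in for one incremental solver call: can bit k be 0 given the prefix?
        if not any((v & mask) == prefix for v in feasible):
            prefix |= bit
    return prefix

def minimize_signed(feasible, n):
    """Signed minimization as unsigned minimization of (obj xor attr)."""
    attr = 1 << (n - 1)                      # attractor [10...0]
    distances = [v ^ attr for v in feasible]
    return minimize_unsigned_bitwise(distances, n) ^ attr

patterns = [0b011, 0b101, 0b110, 0b001]      # signed values 3, -3, -2, 1
best = minimize_signed(patterns, 3)
```

Xor-ing with the attractor flips only the sign bit, which is exactly the bit whose weight changes sign in the two's complement interpretation.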

Floating-Point Optimization
We define the Floating-Point Optimization problem as follows. (The dual definition where we look for the maximum follows straightforwardly.)
Definition 4 is made necessarily convoluted by the fact that $obj$ can be NAN. In fact, in the SMT-LIBV2 standard the comparisons {≤, <, ≥, >} between NAN and any other FP value are always evaluated to false because NAN has multiple representations at the binary level (see Table 1). Also, requiring the optimal solution to be always different from NAN makes the resulting OMT(FP) problem $\langle \varphi \wedge \neg IsNaN(obj), obj \rangle$ unsatisfiable when $\varphi$ is satisfied only by models $M$ s.t. $M(obj)$ is NAN. For these reasons, we admit NAN as the optimal solution value for $obj$ if and only if $\varphi$ is satisfied only by models $M$ s.t. $M(obj)$ is NAN.
In the rest of this section we assume that we have already checked, in sequence, that:
i) the input formula $\varphi$ is satisfiable, by invoking an SMT(FP) solver on $\varphi$. If the solver returns UNSAT, then there is no need to proceed;
ii) $\varphi$ is satisfied by at least one model $M$ s.t. $M(obj)$ is not NAN, by invoking an SMT(FP) solver on $\varphi \wedge \neg IsNaN(obj)$ if the model $M$ returned by the previous SMT call is s.t. $M(obj)$ is NAN. If the solver returns UNSAT, then we conclude that the minimum is NAN.
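The two preliminary checks can be sketched in Python as follows. This is only an illustration under strong assumptions: the SMT(FP) solver is replaced by a plain list holding the value of obj in each model of ϕ, in the order in which a solver might return them, and the function name `precheck` and its result labels are hypothetical.

```python
import math

def precheck(objective_values):
    """Return (status, value): 'unsat' if there is no model, 'optimum-is-nan'
    if every model maps obj to NaN, and 'optimize' with a non-NaN value of obj
    otherwise."""
    if not objective_values:                  # check i): is the formula satisfiable?
        return "unsat", None
    first = objective_values[0]
    if not math.isnan(first):                 # the first model is already non-NaN
        return "optimize", first
    # check ii): second call, on the formula conjoined with not IsNaN(obj)
    non_nan = [v for v in objective_values if not math.isnan(v)]
    if not non_nan:
        return "optimum-is-nan", first
    return "optimize", non_nan[0]

status, value = precheck([math.nan, 2.5, -1.0])
```

The second call is needed because every ordering comparison involving NaN evaluates to false, so a NaN model can never be improved upon through cuts of the form obj < c.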
After that, we can safely focus our investigation on the restricted OMT(FP) problem $\langle \varphi_{noNaN}, obj \rangle$, where $\varphi_{noNaN} \stackrel{def}{=} \varphi \wedge \neg IsNaN(obj)$, knowing it is satisfiable.
In Section §3.1, we have introduced the concept of a BV objective attractor, and we have shown how this value can be used to drive the optimization search towards the optimum value, when minimizing or maximizing a signed or unsigned BV goal. However, in the case of floating-point optimization, it is not possible to statically determine the attractor value in advance, before the search is even started. This is due to the more complex representation of FP variables, which uses three separate Bit-Vectors (i.e. sign, exponent and significand), and the presence of various classes of special values (i.e. zeros, infinity, NaN), which make Definition 2 ambiguous for FP optimization. We illustrate this problem with the following example.
Example 4. Let $\langle \varphi_{noNaN}, obj \rangle$ be an OMT(FP) problem where $obj$ is a FP objective, of sort (_ FP 3 5), to be minimized. To make our explanation easier to follow, we show in Table 1 a short list of sample values for an FP variable of the same sort as $obj$. Each FP value is represented as a triplet of Bit-Vectors $\langle sign, exp, sig \rangle$, following the SMT-LIBV2 conventions described in Section §2, and also in decimal notation.
Table 1. Sample values for a FP variable with sort (_ FP 3 5).
From Table 1, we immediately notice that the binary representation of both the exponent and the significand of a Floating-Point number grows in opposite directions in the positive and in the negative domains. In addition, by sorting the values according to their binary representation, we observe that −∞ [resp. +∞] is not the smallest [resp. greatest] representable FP value in the negative [resp. positive] domain. In fact, both extreme ends of the table are occupied by NAN, which has multiple binary representations.
In what follows, we temporarily disregard the effects of unit-propagation, which might assign some (or all) bits of $obj$ as a result of some constraints in $\varphi_{noNaN}$, and pick some values as candidate attractors for an FP goal to be minimized.
Suppose that the attractor is chosen to be equal to the value −∞ listed at row 9 in Table 1, which is the smallest FP value wrt. the total order relation ≤ for FP numbers. Assume that the optimal value of the FP goal is the sub-normal FP value (fp #b1 #b000 #b1111) (i.e. −15/64). Then, it can be seen that after both the sign and the exponent bits have been decided to be equal to #b1 and #b000 respectively, the remaining bits of the attractor pull the search in the wrong direction, that is, towards −0.
Selecting a different FP value as candidate attractor does not really solve the problem, or rather, it results in a different set of issues.
For instance, an attractor equal to the NAN value listed at row 10 in Table 1, which is the smallest representable FP value according to the binary ordering, would solve the problem for the previous case in which the optimum FP value is (fp #b1 #b000 #b1111). However, this attractor would remain an unsuitable choice for an OMT(FP) instance where the FP goal is forced to be positive, because after the sign bit of the objective function has been decided to be equal to #b0 the remaining bits of the attractor drive the search in the wrong direction, that is, towards +∞.
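The two failure cases discussed above can be reproduced with a short Python sketch over 8-bit patterns of sort (_ FP 3 5). The helper names are hypothetical, and the SMT solver is replaced by an explicit list of feasible non-NaN patterns. The sketch relies on the fact that, for non-NaN values, FP order coincides with sign-magnitude order on the bit pattern.

```python
def order_key(bits):
    """Sign-magnitude key: orders non-NaN FP bit patterns by numeric value."""
    magnitude = int(bits[1:], 2)
    return -magnitude if bits[0] == "1" else magnitude

def greedy_static(feasible, attr):
    """Fix bits MSB-first, matching the static attractor whenever possible."""
    candidates = list(feasible)
    for i in range(len(attr)):
        agreeing = [b for b in candidates if b[i] == attr[i]]
        if agreeing:
            candidates = agreeing
    return candidates[0]

# Static attractor -oo: wrong pull on the significand of negative subnormals.
negatives = ["10001111", "10000001"]          # -15/64 and -1/64
picked_neg = greedy_static(negatives, "11110000")

# Static NaN attractor: wrong pull on the exponent when obj must be positive.
positives = ["00000001", "01100000"]          # 1/64 and 8
picked_pos = greedy_static(positives, "11111111")
```

In both runs the selected pattern is feasible but not minimal, which is the behaviour that motivates replacing the fixed attractor with one that is recomputed as bits get decided.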
Since there is no statically-determined FP value that can be used as an attractor when dealing with floating-point optimization, we introduce the new concept of dynamic attractor.
[...].n] and $\tau_k$ be an assignment to the $k$ most-significant bits of $obj$. Then, we say that an FP-value $attr_{\tau_k}$ for $obj$ is a dynamic attractor for $obj$ wrt. $\tau_k$ iff it is the smallest [resp. largest] FP value different from NAN s.t. the $k$ most-significant bits of $attr_{\tau_k}$ have the same value as the $k$ most-significant bits of $obj$ in $\tau_k$. We call vector of attractor equalities the vector $A_{\tau_k}$ s.t.
$$A_{\tau_k}[i] \stackrel{def}{=} (obj[i] = attr_{\tau_k}[i]), \quad i \in [0..n-1].$$
The following fact derives from the above definitions and the properties of the IEEE 754-2008 standard representation adopted by the SMT-LIBV2 standard for FP.
Lemma 1. Let $\langle \varphi_{noNaN}, obj \rangle$ be a restricted minimization [resp. maximization] OMT(FP) problem, let $\tau_k$ be an assignment to $obj[0]...obj[k-1]$ and $attr_{\tau_k}$ be its corresponding dynamic attractor, for some $k$ [...]
Case $k \in [1..ebits]$ (exponent bits), where ebits is the number of bits in the exponent of $obj$. Then, [...] In the first case, $obj$ can only be negative-valued in both $M$ and $M'$. More precisely, $M(obj)$ can be either −∞ or a normal negative value, whereas $M'(obj)$ can be either a normal or a sub-normal negative value. Hereafter, we consider only the case in which both have a normal negative value, because the cases in which $M(obj) = -\infty$ or $M'(obj)$ is sub-normal are both trivial, given that the absolute value of any sub-normal FP number is smaller than the absolute value of any normal FP number. Furthermore, we disregard the significand bits in $M$ and $M'$ because their contribution to the value of $obj$ is always less significant than that of the bits in the exponent. Given these premises, the exponent value of $obj$ in every possible $M$ is larger than the exponent of $obj$ [...] $obj$ is negative-valued) and 0 otherwise (i.e. $obj$ is positive-valued). In both cases, we can disregard the exponent bits in $M$ and $M'$ because their contribution to the value of $obj$ is the same in either model. For the same reasons, since $M(obj)$ and $M'(obj)$ can only be either both normal or both sub-normal, we can ignore the contribution of the leading hidden bit and focus on the bits of the significand.
When $\tau_k[0] = 1$ and $obj$ must be negative-valued, the decimal value of the significand in $M$ is larger than the decimal value of every possible significand in $M'$ by exactly $2^{-(k-ebits)}$. Given that both $M(obj)$ and $M'(obj)$ are negative-valued, we have that $M(obj) \leq M'(obj)$.
The case in which $\tau_k[0] = 0$, that is when $obj$ can only be positive-valued in both $M$ and $M'$, is dual. □
Lemma 1 states that, given the current assignment $\tau_k$ to the $k$ most-significant bits of $obj$, $obj[k] = attr_{\tau_k}[k]$ is always the best extension of $\tau_k$ to the next bit (when consistent). A dynamic attractor $attr_{\tau_k}$ can thus be used by the optimization search to guide the assignment of the $(k+1)$-th bit of $obj$ towards the direction of maximum gain which is allowed by $\tau_k$, so as to obtain the "best" extension $\tau_{k+1}$ of $\tau_k$. Once the (new) assignment $\tau_{k+1}$ is found, the OMT solver can compute the dynamic attractor $attr_{\tau_{k+1}}$ for $obj$ wrt. $\tau_{k+1}$ and then use it to assign the $(k+2)$-th bit of $obj$, and so on.
Let $\langle \varphi_{noNaN}, obj \rangle$ be an OMT(FP) instance, s.t. $obj$ is a FP variable of $n$ bits, and $\tau_0$ be an initially empty assignment. If at each step of the optimization search the assignment of the $k$-th bit of $obj$ is guided by the dynamic attractor for $obj$ wrt. $\tau_k$, then the corresponding sequence of $n$ dynamic attractors (of increasing order $k$) is unique and depends exclusively on $\varphi_{noNaN}$. Intuitively, this is the case because the (current) dynamic attractor always points in the direction of maximum gain. We illustrate this in the following example.
Example 5. Let $\langle \varphi_{noNaN}, obj \rangle$ be an OMT(FP) problem where $obj$ is a FP objective, of sort (_ FP 3 5), to be minimized, as in Example 4. At the beginning of the search, nothing is known about the structure of the solution. Therefore, $\tau_0 = \emptyset$ and, since $obj$ is being minimized, the dynamic attractor for $obj$ wrt. $\tau_0$ (i.e. $attr_{\tau_0}$) is equal to (fp #b1 #b111 #b0000) (i.e. −∞), which gives a preference to any feasible value of $obj$ in the negative domain.
If at some point of the optimization search we discover that the domain of the objective function can only be positive, so that the first bit of $obj$ is permanently set to 0 in $\tau_1$, then the new dynamic attractor for $obj$ wrt. $\tau_1$ (i.e. $attr_{\tau_1}$) is equal to (fp #b0 #b000 #b0000) (i.e. +0).
Furthermore, if later on we also find out that at least one bit in the exponent of $obj$ can be assigned to 0 in a feasible solution of the problem that extends $\tau_i$, for some $i$, then we can remove +∞ from the optimization search interval.
[...] .., $A_{\tau_n}\}$, where each $\tau_k$ is an assignment to the first $k$ most-significant bits of $obj$ s.t. $\tau_k \subset \tau_{k+1}$, $attr_{\tau_k}$ is its corresponding dynamic attractor and $A_{\tau_k}$ is its corresponding vector of attractor equalities, so that, for every $k \in [0..n-1]$: [...]
Example 6. Let $\langle \varphi_{noNaN}, obj \rangle$ be a restricted OMT(FP) problem where $obj$ is a FP objective, of sort (_ FP 3 5), to be minimized, as in Example 4. We consider the case in which the input formula $\varphi_{noNaN}$ requires $obj$ to be larger than or equal to 29/2 and it does not impose any other constraint on the value of $obj$. Given the sequence of (partial) assignments $\tau_0, ..., \tau_8$ in Figure 1 [...]
[...] Hence, $\tau_n$ lexicographically maximizes $A_\varphi$ wrt. $\varphi_{noNaN}$. □
Finally, we make the following two observations. The first is that the sequence $\tau_0, \tau_1, ..., \tau_n$ in Definition 6 can be iteratively constructed using its list of requirements, for instance, by means of a sequence of incremental calls to an SMT solver. The second, more important, observation is that $\tau_n$ corresponds to the assignment of values which makes $obj$ optimal in $\varphi_{noNaN}$.
Using the above definitions, we show that the following fact holds.
Proof. (We prove the case of minimization, since that of maximization is dual.) By Lemma 2 we have that $\tau_n$ lexicographically maximizes $A_\varphi$. Let $M$ be a model of $\varphi_{noNaN}$ which lexicographically maximizes $A_\varphi$, and let $\mu$ be its restriction to $obj$. Since both $\tau_n$ and $M$ lexicographically maximize $A_\varphi$, by the uniqueness of $\tau_n$ we immediately notice that $\mu = \tau_n$, so that $\tau_k$ = [

OMT(FP) Procedures
In this paper, we consider two approaches for dealing with OMT(FP): a basic linear/binary search, based on the inline OMT schema for OMT(LRA ∪ T) presented in [38], and Floating-Point Optimization with Binary Search (OFP-BS), a brand-new engine inspired by the OBV-BS algorithm for unsigned Bit-Vectors in [31] and by Theorem 2 and the related definitions in §3.2.

OMT-based Approach
The OMT-based approach for OMT(FP) adapts the linear- and binary-search schemata for OMT(LRA ∪ T) presented in [38] to deal with FP objectives.
In the basic linear-search schema, the optimization search is advanced by means of a sequence of linear cuts, each of which forces the OMT solver to look for a new model $M'$ which improves the value of $obj$ wrt. the most recent model $M$. In the binary-search schema, instead, the OMT solver learns an incremental sequence of cuts which bisect the current domain of the objective function. For clarity, we recap here the essential elements of the binary-search schema presented in [37,38]. At the beginning of the optimization search and following each update of the lower (lb) and upper (ub) bounds of $obj$, the OMT solver computes a pivoting value $pivot \stackrel{def}{=} floor(\rho \cdot ub + (1 - \rho) \cdot lb)$, for some value of $\rho$ (e.g. 1/2). If pivot lies inside the range ]lb, ub], a cut of the form (obj < pivot) is learned. Otherwise, if pivot lies outside the range ]lb, ub] due to rounding side-effects of FP operations, a cut of the form (obj < ub) is learned instead. If the cut is satisfiable, the upper bound of $obj$ is updated with a new model value of $obj$. Otherwise, the lower bound is made equal to pivot [resp. ub]. The algorithm terminates when the search interval [lb, ub[ becomes empty. In general, it is reasonable to expect the binary-search schema to converge towards the optimal solution faster than the linear-search schema, because the feasible domain of a FP goal can be composed of an exponentially large number of values (wrt. the bit-width of the cost function).
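The binary-search schema recapped above can be sketched in Python as follows. This is an illustration under stated assumptions, not the OptiMathSAT implementation: the solver is replaced by a hypothetical callback `find_model_below(cut)` that returns the objective value of some model satisfying obj < cut (or None if the cut is unsatisfiable), objective values are finite Python floats, and the floor operation of the pivot formula is omitted.

```python
import sys

def minimize_binary_search(find_model_below, first_value, rho=0.5):
    """Bisect the interval [lb, ub[ until it is empty; ub is always a model value."""
    lb = -sys.float_info.max        # assumption: the objective is finite
    ub = first_value                # value of obj in an initial model
    while lb < ub:
        pivot = rho * ub + (1 - rho) * lb
        # Fall back to the cut (obj < ub) if rounding pushed pivot outside ]lb, ub].
        cut = pivot if lb < pivot <= ub else ub
        value = find_model_below(cut)
        if value is not None:
            ub = value              # satisfiable cut: improve the upper bound
        else:
            lb = cut                # unsatisfiable cut: raise the lower bound
    return ub

feasible = [14.5, 3.5, 7.25]
oracle = lambda cut: max((v for v in feasible if v < cut), default=None)
optimum = minimize_binary_search(oracle, 14.5)
```

The fallback cut is what guarantees progress in this sketch: once rounding makes the pivot collapse onto lb, the cut (obj < ub) either yields a strictly better model or closes the interval.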
In either schema, whenever the optimization engine encounters for the first time a solution s.t. obj = NAN, the OMT solver learns a unit-clause of the form ¬(ISNAN(obj)) so as to look for an optimal solution different from NAN (if any).
When dealing with FP objectives, differently from the case of LRA in [38], it is not necessary to implement a specialized optimization procedure within the FP-Solver in order to guarantee the termination of the optimization search. Indeed, such a procedure is not available when Floating-Point terms are bit-blasted into Bit-Vectors eagerly, or when the ACDCL FP-Solver is used, because by the time the optimization procedure is called the domain interval of any FP term contains a singleton value. Conversely, such a minimization procedure could be envisaged when the OMT solver uses a lazy FP-Solver as back-end, so as to speed up the convergence towards the optimal solution.

Floating-Point Optimization with Binary Search
The Floating-Point Optimization with Binary Search algorithm is a new engine for OMT(FP) which is inspired by the OBV-BS algorithm for OMT(BV) [31] and is a direct implementation of Definition 6 and Theorem 2.
The optimization search tries to lexicographically maximize an implicit attractor trajectory vector $A_\varphi$, which is incrementally derived from the current value of the dynamic attractor. The raw values of the dynamic attractor's bits drive the optimization search towards the direction of maximum gain at any given point in time, without disrupting any decision that has already been made. The dynamic attractor is incrementally updated along the search, based on the outcome of the previous rounds of the optimization search. At each round, one bit of the objective function is assigned its final value. The first round decides the sign, the next batch of rounds decides the exponent and the remaining rounds decide the fine-grained details of the significand.
The pseudo-code of OFP-BS is shown in Figure 2. The arguments of the algorithm are the input formula $\varphi$ and the FP objective $obj$, where $obj$ is a FP variable with ebits bits in the exponent, sbits − 1 in the significand and $n \stackrel{def}{=} ebits + sbits$ bits overall. The procedure starts by checking whether the input formula $\varphi$ is satisfiable and immediately terminates if that is not the case (lines 1-3). If $obj$ = NAN in $M$ then the procedure checks whether there exists a model $M'$ for $\varphi \wedge \neg IsNaN(obj)$ (lines 4-5). If this is not the case, the procedure terminates immediately and returns the pair $\langle SAT, M \rangle$ (line 7). Otherwise, the model $M$ is updated with the new model $M'$, and $\varphi$ is permanently extended with the constraint $\neg IsNaN(obj)$ (lines 9-10).
At this point, the procedure initializes the value of the dynamic attractor by invoking an external function UPDATE_DYNAMIC_ATTRACTOR() with the empty assignment $\tau$ as parameter, so that the returned value is equal to −∞ when minimizing and +∞ when maximizing (lines 11-12). Then, the execution moves to the section of code implementing the core part of the OFP-BS algorithm (lines 15-28), which consists of a loop over the bits of $obj$, starting from the MSB $obj[0]$ down to the LSB $obj[n-1]$.
Inside this loop, OFP-BS first checks whether the value of $obj[i]$ in $M$ matches the $i$-th bit of the (current) dynamic attractor $attr_\tau$. If this is the case, then the $i$-th bit is already set to its "best" value in $M$. Thus, the assignment $\tau$ is extended so as to permanently set $obj$ [...]. At this point, there is a mismatch between the value of the first $i + 1$ bits of $obj$ in $M$, corresponding to the assignment $\tau$, and those of the current dynamic attractor. This mismatch is resolved by calling the function UPDATE_DYNAMIC_ATTRACTOR() with the updated assignment $\tau$ as parameter (line 28). In either case, the execution moves to the next iteration of the loop.
After exactly $n$ iterations of the loop, the optimization search terminates with the pair $\langle SAT, M \rangle$, where $M$ is the optimum model of the given OMT(FP ∪ T) instance. The OFP-BS algorithm requires at most $n + 2$ incremental calls to an underlying SMT(FP) solver. The test in lines 17-18 saves many such SMT calls when the current model already assigns $obj[i]$ to its corresponding value in the attractor.
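The overall loop can be sketched in Python for the minimization case on sort (_ FP 3 5). This is a simplified reading of the description above, not the pseudo-code of Figure 2: the solver is replaced by a scan over an explicit list of feasible non-NaN bit patterns, the dynamic attractor is recomputed at every round by brute-force enumeration straight from its definition, and all names are hypothetical. The handling of an unsatisfiable bit flip (keeping the model's bit) is an assumption.

```python
from itertools import product

EBITS, SBITS = 3, 5
N = EBITS + SBITS
ALL_PATTERNS = ["".join(p) for p in product("01", repeat=N)]

def is_nan(bits):
    return bits[1:1 + EBITS] == "1" * EBITS and "1" in bits[1 + EBITS:]

def order_key(bits):
    """Sign-magnitude key: orders non-NaN FP bit patterns by numeric value."""
    magnitude = int(bits[1:], 2)
    return -magnitude if bits[0] == "1" else magnitude

def dynamic_attractor(tau):
    """Smallest non-NaN pattern whose most-significant bits equal tau."""
    candidates = [b for b in ALL_PATTERNS if b.startswith(tau) and not is_nan(b)]
    return min(candidates, key=order_key)

def ofp_bs_minimize(feasible):
    """Return (optimal pattern, number of solver calls) for a list of models."""
    calls = 0
    def solve(prefix):              # stand-in for one incremental SMT(FP) call
        nonlocal calls
        calls += 1
        return next((m for m in feasible if m.startswith(prefix)), None)
    model = solve("")
    if model is None:
        return None, calls
    tau = ""
    for i in range(N):
        attr = dynamic_attractor(tau)
        if model[i] != attr[i]:     # the model disagrees with the attractor on bit i
            better = solve(tau + attr[i])
            if better is not None:
                model = better      # the attractor bit is feasible: adopt it
        tau += model[i]             # bit i now has its final value
    return model, calls

# obj >= 29/2 as in Example 6: +oo, 15.5, 15 and 14.5 are the only models.
model, calls = ofp_bs_minimize(["01110000", "01101111", "01101110", "01101101"])
```

Each round issues at most one solver call, so together with the initial satisfiability check the sketch stays within n + 1 calls; the NaN pre-check described earlier accounts for the remaining call in the n + 2 bound.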
The function UPDATE_DYNAMIC_ATTRACTOR() takes as input $\tau$, a (partial) assignment over the $k$ most-significant bits of $obj$, and, when $obj$ is minimized, it essentially works as follows. If $\tau = \emptyset$, then nothing is known about the solution of the problem, so −∞ is returned. Otherwise, the procedure must compute the smallest FP value different from NAN (if any) which extends $\tau$. Since $\tau \neq \emptyset$, we know that the sign of the objective function has been permanently decided in $\tau$. If $obj[0] = 0$ in $\tau$, i.e. $obj$ must be positive, the procedure must return the smallest positive FP value admitted by $\tau$. Hence, we extend $\tau$ with $\bigwedge_{i=|\tau|}^{n-1} obj[i] = 0$ and return the corresponding FP value. If $obj[0] = 1$ in $\tau$, i.e. $obj$ must be negative, the procedure must return the negative FP value of largest magnitude admitted by $\tau$. We first check whether there exists a bit in the exponent of $obj$ which is assigned to 0 in $\tau$. If that is the case, we extend $\tau$ with $\bigwedge_{i=|\tau|}^{n-1} obj[i] = 1$ and return the corresponding FP value. Otherwise, the procedure returns the value −∞, which is still a viable extension of $\tau$.
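The case analysis above translates almost directly into code. The following Python sketch covers the minimization case only and assumes that $\tau$ is given as a string of '0'/'1' characters (MSB first) which can be extended to a non-NaN value; the Python function name and signature are chosen for illustration.

```python
def update_dynamic_attractor(tau, ebits, sbits):
    """Smallest non-NaN FP bit pattern extending the prefix tau (minimization)."""
    n = ebits + sbits
    minus_inf = "1" + "1" * ebits + "0" * (sbits - 1)
    if tau == "":
        return minus_inf                    # nothing decided yet: aim at -oo
    if tau[0] == "0":
        return tau + "0" * (n - len(tau))   # positive: smallest value, pad with 0s
    if "0" in tau[1:1 + ebits]:
        return tau + "1" * (n - len(tau))   # negative, -oo excluded: pad with 1s
    return minus_inf                        # negative, -oo still extends tau

attr_0 = update_dynamic_attractor("", 3, 5)     # (fp #b1 #b111 #b0000)
attr_1 = update_dynamic_attractor("0", 3, 5)    # (fp #b0 #b000 #b0000)
```

The first two results match the dynamic attractors of Example 5, namely −∞ for the empty assignment and +0 once the sign bit has been set to 0.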

Search Enhancements
Given an FP value attr and an FP goal obj, (a combination of) the following techniques can be used to adjust the behavior of the optimization search, similarly to what has been proposed for the case of OMT(BV) by Nadel et al. in [31].
-branching preference: the bits of the FP objective obj are marked, inside the OMT solver, as preferred variables for branching starting from the MSB down to the LSB. This ensures that conflicts involving the value of the objective function are handled as early as possible, possibly reducing the amount of work that needs to be redone after each back-jump.
-polarity initialization: the initial phase-saving value of each bit of the FP objective obj is set to the value of the corresponding bit of attr, so that the solver first tries the assignment suggested by attr.
In the case of the basic OMT schema described in Section §4.1, the effectiveness of either technique depends on the initial choice for attr. In the lucky case, the value of attr pulls the optimization search in the right direction and speeds up the search. In the unlucky case, when attr pulls in the wrong direction, there is no visible effect or an overall slowdown. For instance, in the case of the linear-search optimization schema, enabling both options with an unlucky choice of attr can cause the OMT solver to start the search from the furthest possible point from the optimal solution, and thus to enumerate an exponential number of intermediate solutions. Naturally, the OMT-based optimization search algorithm is still guaranteed to terminate even in the worst-case scenario, but the unpredictable performance makes either technique generally unsuitable in practice.
In the case of the OFP-BS algorithm described in Section §4.2, we use the latest value of the dynamic attractor attr τ for both the branching preference (lines 11 and 18 of Figure 2) and the polarity initialization (rows 12 and 19 of Figure 2) techniques. We observe that the value of every bit in the dynamic attractor can change after the sign of the objective function has been decided. Furthermore, the value of all the significand's bits in the dynamic attractor can also change during the process of determining the optimal exponent value of the objective function (see, e.g., Example 4). As a consequence, if the OMT solver applies either enhancement before the correct improving direction is known, this may cause the underlying OMT engine to advance the search starting from a sub-optimal set of initial decisions. Enabling both enhancements at the same time could make things even worse. In order to mitigate this issue, we have designed a variant of our optimization-search approach which does not apply either enhancement on those bits of the objective function for which the best improving direction is not yet known. We have called this variant safe bits restriction.

Experimental Evaluation
We assess the performance of OPTIMATHSAT (v. 1.6.2) on a set of OMT(FP) formulas that have been automatically generated using the SMT(FP) benchmark-set of [3]. The formulas, the results and the scripts necessary to reproduce these results are made publicly available and can be downloaded from [1].
Experiment Setup. This experiment has been performed on an i7-6500U 2.50GHz Intel Quad-Core machine with 16GB of RAM, running Ubuntu Linux 17.10. For each formula being tested we used a timeout of 600 seconds. The OMT(FP) instances used in this experiment have been automatically generated starting from the satisfiable formulas included in the SMT(FP) benchmark-set of [3]. We did not consider any of the unsatisfiable instances that are present in the remote repository.
For each of the original SMT(FP) formulas we applied the following transformations. First, we either relaxed or removed some of the constraints in the original problem, so as to broaden the set of feasible solutions. This step is necessary because the majority of the original SMT(FP) formulas admit only one solution, which is not an ideal situation for comparing different optimization approaches. Second, for each FP variable v appearing inside an SMT(FP) problem we generated a pair of OMT(FP) instances, one for the minimization and another for the maximization of v. At the end of this step, we obtained 39536 OMT(FP) formulas. Third, we randomly selected up to 300 OMT(FP) instances from each of the five groups of problems in the OMT(FP) benchmark-set. This filtering step yielded a total of 1120 SMT-LIBv2 formulas.
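As an illustration of the second transformation only, the following Python sketch derives one minimization and one maximization script for every FP variable declared in an SMT-LIB script. It is not the script used for the experiments, which is available from [1]. The function name omt_pairs and the regular expressions are hypothetical, and the (minimize ...), (maximize ...) and (get-objectives) commands are assumed to follow the objective syntax accepted by OPTIMATHSAT. The sketch does not reproduce the constraint-relaxation step, and it only recognizes simple (unquoted) variable names.

```python
import re

# Matches 0-ary FP declarations such as
#   (declare-fun x () (_ FloatingPoint 8 24))   or   (declare-const y Float64)
FP_DECL = re.compile(
    r"\(\s*declare-(?:fun\s+(?P<f>[^\s()|]+)\s*\(\s*\)|const\s+(?P<c>[^\s()|]+))\s+"
    r"(?:\(\s*_\s+FloatingPoint\s+\d+\s+\d+\s*\)|Float(?:16|32|64|128))\s*\)"
)
# Commands of the original SMT script that are dropped before adding a goal.
TAIL_CMDS = re.compile(r"\(\s*(?:check-sat|get-model|exit)\s*\)")

def omt_pairs(script):
    """Map (variable, direction) to an OMT script with a single objective."""
    body = TAIL_CMDS.sub("", script).rstrip()
    out = {}
    for m in FP_DECL.finditer(body):
        var = m.group("f") or m.group("c")
        for direction in ("minimize", "maximize"):
            out[(var, direction)] = (
                f"{body}\n({direction} {var})\n(check-sat)\n(get-objectives)\n"
            )
    return out

EXAMPLE = """(set-logic QF_FP)
(declare-fun x () (_ FloatingPoint 8 24))
(declare-const y Float64)
(declare-fun b () Bool)
(assert (fp.leq x x))
(check-sat)
(exit)
"""

pairs = omt_pairs(EXAMPLE)
```

On the example above, pairs contains four scripts, one per combination of the FP variables x and y with the two optimization directions; the Boolean variable b is ignored.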
We consider two OMT-based baseline configurations, OPTIMATHSAT(OMT+LIN) and OPTIMATHSAT(OMT+BIN), that run the linear- and the binary-search respectively. These configurations have been tested using both the eager and the lazy FP approaches. The third baseline approach, named OPTIMATHSAT(EAGER+OBV-BS), is based on a reduction of the OMT(FP) problem to OMT(BV) and it uses OPTIMATHSAT's implementation of the OBV-BS engine presented by Nadel et al. in [31]. For this test, we have generated an OMT(BV) benchmark-set using a BV encoding that mimics the essential aspects of the OFP-BS algorithm described in Section §4.2.
We compared these baseline approaches with a configuration using the OFP-BS algorithm and the eager FP approach, namely OPTIMATHSAT(EAGER+OFP-BS). We have separately tested the effect of enabling the branching preference (BP), the polarity initialization (PI) and the safe bits restriction (SO) enhancements described in Section §3.2, whenever these options were supported by the given configuration.
Last, in order to assess the significance of the optimization problems used in this experiment, we have collected the run-time statistics of OPTIMATHSAT on the SMT formulas obtained by stripping the objective function from each OMT instance. We named this configuration OPTIMATHSAT(EAGER+SMT).
We have not included other tools in our experiment because we are not aware of any other OMT(FP) solver. For all problem instances, we verified the correctness of the optimal solution found by each configuration with an SMT solver (MATHSAT5). When terminating, all configurations returned the same optimum value. In order to perform this crosscheck as efficiently as possible, we enabled model generation on every configuration so that the optimum model could be extracted and verified.
For what concerns OMT-based linear-search optimization, we observe that OPTIMATHSAT performs the best when no enhancement is enabled. In particular, the empirical evidence suggests that enabling branching preference significantly increases the number of timeouts, generally deteriorating the performance (plot 1A in Fig. 5). Enabling only polarity initialization does not result in an appreciable change on the running time of the solver (plot 1B in Fig. 5). In contrast, enabling both enhancements at the same time has a small chance to result in a small improvement of the search time (plot 2A in Fig. 5), but it generally worsens the performance and results in a drastic increase in the number of timeouts (Table 2).
We justify these results as follows. First, when only polarity initialization is used, the phase-saving value that is being set by OPTIMATHSAT does not really matter because the optimization search is dominated by the structure of the formula itself rather than by the bits of the FP objective. Second, when polarity initialization is used on top of branching preference, there is an even more drastic decrease in performance due to the fact that the initial phase-saving value that is statically assigned by the OMT solver to the bits of the FP objective cannot be expected to be "good enough" for any situation. In fact, as illustrated in Example 4, the initial phase-saving can be misleading and force the OMT solver, when running in linear-search, to explore an exponential number of intermediate satisfiable solutions.
In the case of the OMT-based binary-search optimization approach, we observe that it solves more formulas than linear-search and it generally appears to be faster (plot 3B in Fig. 5). Overall, polarity initialization does not seem to be beneficial, whereas enabling branching preference increases the number of formulas solved within the timeout. This behavior is different from the linear-search approach, and we conjecture that it is due to the fact that, with the OMT-based binary-search approach, branching over the bits of the objective function can reveal in advance any (partial) assignment to the bits of the objective function that is inconsistent with respect to the pivoting cuts learned by the optimization engine.
Using the lazy FP engine results in fewer formulas being solved, although a significant number of these benchmarks are solved faster than with any other configuration (over 90 instances, for both configurations).
The OPTIMATHSAT(EAGER+OBV-BS) configuration is able to solve 1013 formulas within the timeout, showing that OMT(FP) can be reduced to OMT(BV) effectively, and that, on the given benchmark-set, the performance of this approach is comparable with that of the best OMT(FP) configurations being tested.
Overall, the best performance is obtained by using the OFP-BS engine, with up to 1019 benchmark-set instances being solved by the OPTIMATHSAT(EAGER+OFP-BS+PI) configuration. In plot 2B of Figures 5 and 6, we show the pairwise comparison of the best OFP-BS configuration with the best OMT-based run. Similarly to the case of OMT-based optimization with linear-search, we observe that enabling branching preference generally makes the performance worse (plot 1A in Fig. 7). Instead, when polarity initialization is used we observe a general performance improvement that results not only in an increase in the number of formulas being solved within the timeout, but also in a noticeable reduction of the solving time as a whole. This is in contrast with the case of OMT-based optimization, and it can be explained by the fact that OFP-BS uses an internal heuristic function to dynamically determine and update the most appropriate phase-saving value for the bits of the objective function. An equally important role is played by the safe bits restriction, which limits the effects of branching preference and polarity initialization to only certain bits of the dynamic attractor. As illustrated by the plots in the second and third rows of Figure 7 and by the data in Table 2, this feature is particularly effective when used in combination with branching preference.
The results of OPTIMATHSAT over the SMT-only version of the benchmark-set are reported in Table 2 and in the scatter-plot 3B in Fig. 6, and show that for a large number of instances the OMT problem is considerably harder than its SMT-only version. There are a few exceptions to this rule, which we ascribe to the fact that the removal of the objective function alters the internal stack of formulas, and this can have unpredictable consequences on the behavior of various internal heuristics that depend on it. A solution can be found in a shorter amount of time when the sequence of (heuristic) choices is compatible with its assignment and requires little back-tracking effort.