On new methods to construct lower bounds in simplicial branch and bound based on interval arithmetic

Branch and Bound (B&B) algorithms in Global Optimization are used to perform an exhaustive search over the feasible area. One choice is to use simplicial partition sets. Obtaining sharp and cheap bounds of the objective function over a simplex is very important in the construction of efficient Global Optimization B&B algorithms. Although enclosing a simplex in a box implies an overestimation, boxes are more natural when dealing with individual coordinate bounds, and bounding ranges with Interval Arithmetic (IA) is computationally cheap. This paper introduces several linear relaxations using gradient information and Affine Arithmetic and experimentally studies their efficiency compared to traditional lower bounds obtained by natural and centered IA forms and their adaptation to simplices. A Global Optimization B&B algorithm with a monotonicity test over a simplex is used to compare their efficiency on a set of low-dimensional test problems whose instances either have a box-constrained search region or a feasible set that is a simplex. Numerical results show that it is possible to obtain tight lower bounds over simplicial subsets.


Introduction
A review of simplicial Branch and Bound (B&B) can be found in [15]. Recently, there has been renewed interest in generating tight bounds over simplicial partition sets. Karhbet and Kearfott [7] discuss the idea of using range computation over simplices based on Interval Arithmetic. In [12], the focus is on using second-derivative enclosures for generating bounds. These works do not take monotonicity considerations over the simplex into account, as discussed in [6]. Our research question is how information on the bounds of first derivatives can be used to derive tight bounds and to create new monotonicity tests in simplicial B&B. To investigate this question, we derive bounds based on derivative information and implement them in a B&B algorithm to compare the different techniques.
The rest of this paper is organized as follows. Section 2 introduces the notation. Section 3 presents several approaches to obtain lower bounds of a function over a simplex. Section 4 deals with monotonicity over a simplex. Section 5 describes the Global Optimization B&B algorithm to compare lower bounding methods over a simplex. Section 6 compares the results of the bounding techniques numerically on a large number of instances. Finally, Sect. 7 presents our findings.

Preliminaries
Consider a function f : Rⁿ → R which has to be minimized over a feasible set D ⊂ Rⁿ, either a box or a simplex, on which f is differentiable. The simplicial B&B algorithm to be investigated uses simplicial partition sets S and lower bounds of min x∈S f(x). Notation 1 Let V = {v_0, ..., v_n} ⊂ Rⁿ denote a set of n + 1 affinely independent vertices. For component i of vertex j, we use the notation (v_j)_i.

Notation 5 The boundary and interior of a set S are denoted by ∂S and int S, respectively, where S = ∂S ∪ int S and ∂S ∩ int S = ∅.
Notation 9 We denote the Baumann base-point for the optimal lower bound in the centered form on a box x by bb⁻. With [g_i, ḡ_i] an interval enclosure of ∂f/∂x_i over x = [x, x̄], component i is given by

(bb⁻)_i = x_i if g_i ≥ 0, (bb⁻)_i = x̄_i if ḡ_i ≤ 0, and (bb⁻)_i = (ḡ_i x_i − g_i x̄_i)/(ḡ_i − g_i) otherwise.

Any centered form (with a base-point y ∈ x) can be tightened based on the vertices of simplex S.
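As an illustration, the componentwise rule for the Baumann base-point can be sketched as follows. This is a minimal sketch assuming the standard definition from Baumann's optimal centered forms; the function name `baumann_point` and the pair-of-tuples data layout are our own, not from the paper.

```python
def baumann_point(box, grad):
    """Componentwise Baumann base-point bb- for the centered-form lower
    bound over a box.  `box` and `grad` are lists of (lo, hi) pairs:
    coordinate ranges and an interval enclosure of the gradient."""
    bb = []
    for (lo, hi), (g_lo, g_hi) in zip(box, grad):
        if g_lo >= 0:          # f non-decreasing in this coordinate
            bb.append(lo)
        elif g_hi <= 0:        # f non-increasing in this coordinate
            bb.append(hi)
        else:                  # mixed sign: interior balance point
            bb.append((g_lo * hi - g_hi * lo) / (g_lo - g_hi))
    return bb
```

For a symmetric gradient enclosure the balance point lands in the middle of the coordinate range; for a monotone coordinate the rule picks the endpoint where the minimum must lie.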

Then f y (S) ≤ min x∈S f (x).
Proof A first-order Taylor form provides a concave lower bounding function [3,25]. A concave function attains its minimum over a convex set at an extreme point. Consequently, the lower bounding function over the simplex attains its minimum at a vertex of the simplex. Thus, instead of computing the interval enclosure over the enclosing box x of S, taking the minimum over the simplex vertices provides a valid lower bound.
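The vertex-minimum bound above can be sketched in a few lines. This is a hypothetical helper (name `centered_form_lb` is ours): it evaluates f at a base-point y and adds, per simplex vertex, an interval lower bound of the gradient term, then takes the minimum over the vertices.

```python
def centered_form_lb(f, y, grad_box, vertices):
    """Lower bound f(y) + min_v lb(g^T (v - y)) over the simplex vertices,
    where grad_box is a list of (lo, hi) gradient component enclosures
    over the simplex S."""
    def dot_lb(v):
        # interval lower bound of sum_i g_i * (v_i - y_i)
        s = 0.0
        for (g_lo, g_hi), vi, yi in zip(grad_box, v, y):
            d = vi - yi
            s += min(g_lo * d, g_hi * d)
        return s
    return f(y) + min(dot_lb(v) for v in vertices)
```

For f(x) = x1² + x2² on the unit simplex with gradient enclosure [0, 2] per component, the bound with base-point at the origin already equals the true minimum 0.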

Remark 5
We can use y = cb or y = bb⁻ in (3). It is now interesting to see how the Baumann point bb⁻ can be generalized to a simplicial base-point. For bb⁻, the aim is to select the best base-point for the Taylor form, such that the lower bound is as high as possible. For a simplex, instead of using the limits of the enclosing box x of S, we use the simplex vertices.
The highest lower bound in (3) over a simplex is attained at the base-point defined by (4). Obviously, optimizing (4) is a nonlinear problem, as it includes the optimization of f(y) in y. Therefore, it is advisable to optimize only the second part.
Let (z*, y*) be the optimum of (5). Then we take base-point bs⁻ = y* with the corresponding lower bound f_bs⁻(x) = f(bs⁻) + z*. Writing (5) as a linear program requires 2n constraints for each vertex v ∈ V; the constraints in (6) can be written as 2n linear inequalities (7). Note that we do not force bs⁻ to be in simplex S, because a point outside S may give the best lower bound. In case we want to use f(bs⁻) to update the upper bound of the global minimum in a B&B algorithm, bs⁻ has to be in the initial search region. In our experiments, we force bs⁻ to be in S by adding y ∈ S via simplex inclusion constraints (1) to (7), in a similar way as is done in (10). Notice that (5), (6) and (7) are equivalent descriptions of the same problem, thus providing the same optimum corresponding to the same bound.
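A sketch of such a linear program follows, using `scipy.optimize.linprog`. All names (`best_base_point`, the variable layout y, z, t) are ours, and the formulation is only our reading of the 2n-constraints-per-vertex idea: the concave term min over the gradient enclosure of g^T(v − y) is split into auxiliary variables t_i with two inequalities each. It assumes every gradient interval contains zero; otherwise the LP may be unbounded (in which case monotonicity tests apply anyway).

```python
import numpy as np
from scipy.optimize import linprog

def best_base_point(vertices, grad):
    """Sketch of an LP for a simplicial base-point in the spirit of bs-.
    vertices: n+1 points of the simplex; grad: list of (lo, hi) gradient
    component enclosures.  Returns (y*, z*): the base-point and the
    optimal value of max_y min_v lb(g^T (v - y))."""
    n = len(grad)
    m = len(vertices)
    nvar = n + 1 + m * n               # variables: y, z, and t_i^v
    A, b = [], []
    for vi, v in enumerate(vertices):
        row = np.zeros(nvar)           # z <= sum_i t_i^v
        row[n] = 1.0
        for i in range(n):
            row[n + 1 + vi * n + i] = -1.0
        A.append(row); b.append(0.0)
        for i in range(n):             # t_i^v <= g * (v_i - y_i), g in {lo, hi}
            for g in grad[i]:
                row = np.zeros(nvar)
                row[n + 1 + vi * n + i] = 1.0
                row[i] = g             # t + g*y_i <= g*v_i
                A.append(row); b.append(g * v[i])
    c = np.zeros(nvar)
    c[n] = -1.0                        # maximize z
    res = linprog(c, A_ub=np.array(A), b_ub=np.array(b),
                  bounds=[(None, None)] * nvar, method="highs")
    return res.x[:n], res.x[n]
```

On the one-dimensional "simplex" [0, 1] with gradient enclosure [−1, 1], the optimal base-point is the midpoint 0.5 with bound term −0.5, matching the box case bb⁻ in one dimension.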

Linear relaxation based lower bounds
Following earlier results in interval based B&B [14,20,21], we can now define other lower bounds for simplicial subsets.

Standard linear relaxation of f over a box
Let w be a vertex of the enclosing box x of S and consider a first-order Taylor expansion (8). Since box x has 2^n vertices, we obtain 2^n inequalities from Eq. (8); see [10] for more details. Consider the linear program (9). Let (x*, z*) be the optimal solution of (9); then z* is a lower bound of f over x, and hence also a lower bound of f over S ⊂ x.
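A minimal one-dimensional sketch of this vertex-cut construction may clarify it. The helper name `box_lb_1d` is ours; it builds one linear cut per box endpoint (choosing the gradient bound that keeps the cut below f over the whole interval) and minimizes the pointwise maximum of the cuts, which for two cuts amounts to inspecting their crossing point and the endpoints.

```python
def box_lb_1d(f, a, b, g_lo, g_hi):
    """Lower bound of f on [a, b] from two vertex-based linear cuts:
    psi_a(x) = f(a) + g_lo*(x - a)  (valid since x - a >= 0),
    psi_b(x) = f(b) + g_hi*(x - b)  (valid since x - b <= 0).
    The bound is min over [a, b] of max(psi_a, psi_b), a tiny LP solved
    here by inspecting the crossing point and the endpoints."""
    psi_a = lambda x: f(a) + g_lo * (x - a)
    psi_b = lambda x: f(b) + g_hi * (x - b)
    candidates = [a, b]
    if g_lo != g_hi:                       # crossing point of the two cuts
        xc = (f(b) - f(a) + g_lo * a - g_hi * b) / (g_lo - g_hi)
        candidates.append(min(max(xc, a), b))
    return min(max(psi_a(x), psi_b(x)) for x in candidates)
```

For f(x) = x² on [−1, 1] with gradient enclosure [−2, 2], the two cuts cross at x = 0 and give the valid lower bound −1 ≤ min f = 0.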

Linear relaxation of f over a simplex
We now focus on the bounds of f over simplex S = conv(V). The earlier bound in (9) is valid for f over the enclosing box x of S, such that it is also a bound over the simplex S. However, it is interesting to force x ∈ x to be inside S, as in (1). Introducing the corresponding linear equations into problem (9) provides linear program (10). Let (x*, z*, λ*) be the solution of (10). Then z* is a lower bound of f over S. A straightforward idea is to consider the vertices of the simplex instead of the vertices of the enclosing box. Unfortunately, such a formulation leads to a Mixed Integer Programming problem, as the piecewise linear lower bounding function is neither convex nor concave anymore.
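A sketch of such a relaxation with inclusion constraints, again via `scipy.optimize.linprog`, is given below. The function name `simplex_lb` and the exact variable layout are ours: one linear cut per box vertex (with the gradient bound chosen per coordinate so the cut stays below f on the box), plus convex-combination constraints x = Σ λ_j v_j, Σ λ_j = 1, λ ≥ 0.

```python
import itertools
import numpy as np
from scipy.optimize import linprog

def simplex_lb(f, box, grad, vertices):
    """LP lower bound of f over simplex S = conv(vertices) inside its
    enclosing box: one linear cut per box vertex plus inclusion
    (convex-combination) constraints, in the spirit of relaxation (10)."""
    n = len(box)
    m = len(vertices)                  # n + 1 simplex vertices
    nvar = n + 1 + m                   # variables: x, z, lambda
    A_ub, b_ub = [], []
    for corner in itertools.product(*[(0, 1) for _ in range(n)]):
        w = [box[i][corner[i]] for i in range(n)]
        # slope choice keeps the cut below f on the whole box:
        c = [grad[i][0] if corner[i] == 0 else grad[i][1] for i in range(n)]
        row = np.zeros(nvar)
        row[:n] = c
        row[n] = -1.0                  # sum c_i x_i - z <= sum c_i w_i - f(w)
        A_ub.append(row)
        b_ub.append(sum(ci * wi for ci, wi in zip(c, w)) - f(w))
    A_eq, b_eq = [], []
    for i in range(n):                 # x_i = sum_j lambda_j (v_j)_i
        row = np.zeros(nvar)
        row[i] = 1.0
        for j, v in enumerate(vertices):
            row[n + 1 + j] = -v[i]
        A_eq.append(row); b_eq.append(0.0)
    row = np.zeros(nvar)               # sum_j lambda_j = 1
    row[n + 1:] = 1.0
    A_eq.append(row); b_eq.append(1.0)
    obj = np.zeros(nvar)
    obj[n] = 1.0                       # minimize z
    bounds = [box[i] for i in range(n)] + [(None, None)] + [(0, None)] * m
    res = linprog(obj, A_ub=np.array(A_ub), b_ub=np.array(b_ub),
                  A_eq=np.array(A_eq), b_eq=np.array(b_eq),
                  bounds=bounds, method="highs")
    return res.x[n]
```

For the linear function f(x) = −x1 − x2 on the unit box, the box relaxation alone would give −2, while the inclusion constraints tighten the bound to −1, the true minimum over the unit simplex.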

Bounding technique using Affine Arithmetic
This section describes the use of Affine Arithmetic (see [2,4,9,11,18,22]) to generate a linear underestimation of function f over the enclosing box x of S. We add the constraint that the solution has to be inside the simplex S = conv(V), see (1). This provides a linear program.
First, we focus on the transformation of an interval vector into a vector of affine forms. Second, we describe how the computations are made using Affine Arithmetic to provide linear equations. Third, we sketch how the so-obtained linear equations are used to provide linear underestimations of f over x and then we provide the linear program to find a lower bound of f over the simplex S. Fourth, we show a simple way to solve the linear program.

Conversion into affine forms
The interval vector x (the enclosing box of S) can be converted to an affine form vector, denoted by x̂, as follows: x̂_i = mid(x_i) + rad(x_i) ε_i, where ε_i ∈ [−1, 1] for all i ∈ {1, ..., n}. The affine form x̂ can be transformed back into an interval by changing ε_i to [−1, 1]. Moreover, for all x ∈ x, there is exactly one corresponding value of ε in the affine description, ε_i = (x_i − mid(x_i))/rad(x_i).
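The conversion in both directions can be sketched as follows. The representation (a pair of center and a dictionary mapping noise-symbol index to coefficient) and the helper names `to_affine`/`to_interval` are ours, chosen for illustration only.

```python
def to_affine(box):
    """Convert an interval vector into affine forms
    x_i = mid_i + rad_i * eps_i with a fresh noise symbol per component.
    Each form is stored as (center, {symbol_index: coefficient})."""
    forms = []
    for i, (lo, hi) in enumerate(box):
        mid = 0.5 * (lo + hi)
        rad = 0.5 * (hi - lo)
        forms.append((mid, {i: rad}))
    return forms

def to_interval(form):
    """Back-conversion: replace every eps_j by [-1, 1]."""
    c, terms = form
    spread = sum(abs(r) for r in terms.values())
    return (c - spread, c + spread)
```

Round-tripping an interval through the affine representation recovers it exactly, since no non-affine operation has introduced error terms yet.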

Affine arithmetic
By replacing all occurrences of the variable x_i by the corresponding affine form x̂_i in an expression of f, and by performing the computations using Affine Arithmetic, we obtain a resulting affine form, denoted by f̂ = r_0 + Σ_{j=1}^{N} r_j ε_j, where ε_j ∈ [−1, 1] for all j ∈ {1, ..., N}. Note that error terms r_k ε_k are added for all k ∈ {n + 1, ..., N}, which come from non-affine operations in f.
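As an example of how a non-affine operation introduces a fresh error term, consider multiplication. The sketch below uses the standard conservative AA product rule (linear part propagated exactly, quadratic part bounded by the product of total deviations); the function name `aa_mul` and the data layout are ours, and this is not the Kv library's implementation.

```python
def aa_mul(x, y, fresh):
    """Affine product with a conservative error term r * eps_fresh
    bounding the quadratic part (standard AA multiplication rule).
    Forms are (center, {symbol_index: coefficient}); `fresh` is the
    index of a noise symbol not yet in use."""
    cx, tx = x
    cy, ty = y
    terms = {}
    for j, a in tx.items():            # cy * (linear part of x)
        terms[j] = terms.get(j, 0.0) + cy * a
    for j, b in ty.items():            # cx * (linear part of y)
        terms[j] = terms.get(j, 0.0) + cx * b
    # bound on the neglected quadratic part:
    r = sum(abs(a) for a in tx.values()) * sum(abs(b) for b in ty.values())
    if r != 0.0:
        terms[fresh] = terms.get(fresh, 0.0) + r
    return (cx * cy, terms)
```

Squaring x ∈ [0, 2] (affine form 1 + 1·ε_0) yields 1 + 2 ε_0 + 1 ε_1, whose interval [−2, 4] safely encloses the true range [0, 4]; the new symbol ε_1 carries the linearization error.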

Linear underestimation of f over x
Using Affine Arithmetic, (12) underestimates f over x because all error terms are taken into account at their worst value. Hence, (13) is a linear underestimation of f over x in the new variables ε_i.

Linear program to provide lower bounds
In order to compute a lower bound of f over the simplex S (and not only over the enclosing box x of S), we constrain the point x to be inside S by adding (1). In this case, we describe x by its affine form T(x, ε) and thus obtain the following linear program (14). Denoting the exact solution of (14) by (ε*, λ*), we obtain a lower bound of f over S. Note that to solve the above linear program, we just need to evaluate the linear underestimation (13) at each vertex of S and take the minimum value. If rad(x_i) ≠ 0, each vertex of S corresponds to a unique value of ε_i (else ε_i is fixed); then the lower bound of (15) becomes (16). Therefore, instead of solving linear program (14), we can determine (16), which directly yields the lower bound of f over S. The solution corresponds to the solution of linear program (14).
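The vertex-evaluation shortcut can be sketched as follows. The helper name `aa_simplex_lb` is ours; it assumes the affine enclosure of f uses symbols 0..n−1 for the original variables and higher indices for error terms, pushes every error term to its worst value −|r_k|, and minimizes the remaining linear underestimator over the simplex vertices instead of solving an LP.

```python
def aa_simplex_lb(form, box, vertices):
    """Lower bound of f over a simplex from an affine enclosure
    form = (c, {j: r_j}): keep the linear part in the original
    variables (indices < n), push every error term (index >= n) to its
    worst value -|r_k|, and take the minimum of the resulting linear
    underestimator over the simplex vertices."""
    c, terms = form
    n = len(box)
    const = c - sum(abs(r) for j, r in terms.items() if j >= n)
    def underestimate(v):
        s = const
        for j, r in terms.items():
            if j < n:
                mid = 0.5 * (box[j][0] + box[j][1])
                rad = 0.5 * (box[j][1] - box[j][0])
                # vertex mapped to eps-coordinates; rad > 0 assumed here
                s += r * (v[j] - mid) / rad
        return s
    return min(underestimate(v) for v in vertices)
```

With the affine square of x ∈ [0, 2] from the previous sketch (1 + 2 ε_0 + 1 ε_1) restricted to the 1-simplex [0, 1], the underestimator gives −2, a valid bound on min x² = 0.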

Monotonicity test
In this paper, we use a concise monotonicity test which excludes an interior partition set S if it does not contain a stationary point. To be more precise:

then S does not contain a global minimum point.
Proof The condition implies that ∇f(x) ≠ 0 for all x ∈ S, such that S cannot contain an interior minimum point of D. Moreover, S does not touch the boundary, i.e. S ∩ ∂D = ∅, such that it cannot contain a boundary optimum point either.
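The test can be sketched in a few lines. The helper `can_discard` and its arguments are illustrative names, not from the paper's implementation: it takes an interval enclosure of the gradient over S and a flag for boundary contact, and discards S only when some gradient component has a definite sign on an interior partition set.

```python
def can_discard(grad_box, touches_boundary):
    """Monotonicity-based rejection: an interior partition set whose
    gradient enclosure excludes zero in some component contains no
    stationary point, hence no global minimizer of the problem."""
    if touches_boundary:          # boundary minima need separate care
        return False
    return any(lo > 0 or hi < 0 for lo, hi in grad_box)
```

Note that a set touching the boundary is never discarded by this test alone, exactly because a boundary minimum needs no stationary point.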
The test is not very strong, as initial partition sets typically touch the boundary. A slight relaxation is the following corollary, where a simplicial partition set has only vertices in common with the boundary.

Corollary 1 Let S ⊂ D be a partition set that satisfies the gradient condition of Proposition 2 and has only a finite number of points in common with the boundary, i.e. S ∩ ∂D is a finite set of vertices. Then S can be eliminated from the search tree.
Proof The same reasoning as in the proof of Proposition 2 applies with respect to the interior of S. Now a minimum point could be attained at a vertex v ∈ S ∩ ∂D. However, vertex v is also part of another simplicial partition set which covers part of the boundary of D, such that we do not have to store S anymore.
This corollary is not very strong, but it is relatively easy to check. For practical tests, Proposition 2 offers the conditions for removing an interior simplicial partition set S. We can also remove S if only some vertices of S touch the boundary of D, according to Corollary 1. Otherwise, we should store the facets of S which lie completely in a face of the search region as new simplicial partition sets with fewer than n + 1 vertices. In our former investigation [6], we focused on lower and upper bounds of the directional derivative in a direction d over S. In this context, one can consider for instance an upper bound of the directional derivative in the direction from a vertex w towards the opposite facet F of S; if this upper bound is nonpositive, facet F contains a minimum point of f over S. Proof Consider the vertices of V ordered such that w = v_0. Let x = Σ_{j=1}^{n} λ_j v_j + λ_0 w be a minimum point with x ∉ F. We construct a point z on F by walking from x in direction d; the nonpositive upper bound of the directional derivative gives f(z) ≤ f(x). Thus, minimum point x either does not exist, or z is also a minimum point of min x∈S f(x) and is located on facet F.
The corresponding test allows us to perform a dimension reduction of S by removing vertex w. If the conditions do not hold, one can check for each border facet whether it can contain a minimum point. If we show that it cannot, we do not have to deal further with that facet. If no border facet can contain the minimum, it follows that S can be disregarded.

Simplicial B&B algorithm (SBB)
Algorithm 1 uses an AVL tree [1], a self-balancing binary search tree, for storing partition sets. Such a structure has a computational complexity of O(log₂ N) for sorted insertion and extraction of an element, with N the number of stored sets. Evaluated and not rejected simplices are sorted by non-decreasing order of the bounds on the objective using any of the methods from Sect. 3. This means [x, x̄] < [y, ȳ] when x < y, or when x = y and x̄ < ȳ. Simplicial partition sets having the same bounds are stored in the same node of the AVL tree using a linked list.
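The lexicographic ordering on bound pairs can be illustrated with a small priority store. A binary heap stands in here for the AVL tree of the paper (both give logarithmic insertion/extraction); the class name `SimplexQueue` is ours, and the insertion counter only keeps comparisons away from the stored payloads.

```python
import heapq

class SimplexQueue:
    """Best-first store ordered by (lower bound, upper bound):
    [x_lo, x_hi] < [y_lo, y_hi] iff x_lo < y_lo, or x_lo == y_lo
    and x_hi < y_hi."""
    def __init__(self):
        self._heap = []
        self._n = 0                    # tie-breaker: insertion order
    def push(self, lb, ub, simplex):
        heapq.heappush(self._heap, (lb, ub, self._n, simplex))
        self._n += 1
    def pop(self):
        lb, ub, _, simplex = heapq.heappop(self._heap)
        return lb, ub, simplex
    def __len__(self):
        return len(self._heap)
```

Two sets with the same lower bound are ordered by their upper bound, as in the comparison rule above.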
All vertices of a simplex are also stored in an AVL tree. Vertices may be shared among several simplices, such that we avoid duplicate storage. Algorithm 1 describes how vertices are evaluated in order to update the incumbent f̃. Simplices with a lower bound greater than the incumbent f̃ are rejected. They are also rejected using Proposition 2 when they are in the relative interior of the search space D and 0 ∉ ∇f_i(S) for some component i (see Algorithm 1, lines 14 and 17, and Algorithm 2, line 7).
In case f is monotone on S and S ∩ ∂D ≠ ∅, S can be reduced to a number of facets by Corollary 3 (see calls to Algorithm 2 from Algorithm 1, lines 6 and 10). From a computational perspective, it is better to label each vertex as border or not-border. A border vertex means that when it is removed from S, the remaining facet is on the boundary of D. If the search region is a simplex, P contains just the initial simplex, and all initial facets are at the boundary, such that all vertices are labelled border. In case the search region is a box, P contains the result of the combinatorial vertex triangulation of the box into n! simplices [24,26].
This technical detail is not included in Algorithm 1 for the sake of simplicity. The specific triangulation is not appealing for large values of n; we use it here because box-constrained problems are used to compare methods. Each of the n! initial simplices has two border facets. They are determined by removing the smallest and largest vertex (numbered in a binary system), see the grey nodes in Fig. 1. In the binary system, 0 is the lower bound and 1 is the upper bound of the given component of the box.

Require: f: the n-dimensional objective function. P: initial simplicial partition of the search region D. α: termination criterion.

A simplicial partition set which was neither rejected nor reduced is divided using Longest Edge Bisection (LEB), see Algorithm 1, line 11. When several longest edges exist, the longest edge connecting the vertex with the lowest value of f and the vertex with the highest value of f is selected. In case vertices have not been evaluated, the first longest edge is selected.
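The LEB step can be sketched as follows. The function name `longest_edge_bisection` is ours, and the tie-breaking rule is an approximation of the one described above (among tied longest edges, prefer the one whose endpoint function values are furthest apart).

```python
import itertools
import math

def longest_edge_bisection(vertices, fvals=None):
    """Split a simplex by bisecting its longest edge.  When several
    longest edges tie and f-values are available, prefer the tied edge
    whose endpoint f-values span the widest range."""
    def length(e):
        return math.dist(vertices[e[0]], vertices[e[1]])
    edges = list(itertools.combinations(range(len(vertices)), 2))
    longest = max(length(e) for e in edges)
    tied = [e for e in edges if abs(length(e) - longest) < 1e-12]
    if fvals is not None and len(tied) > 1:
        best = max(tied, key=lambda e: abs(fvals[e[0]] - fvals[e[1]]))
    else:
        best = tied[0]                 # "first longest edge" rule
    i, j = best
    mid = tuple(0.5 * (a + b) for a, b in zip(vertices[i], vertices[j]))
    s1 = list(vertices); s1[i] = mid   # replace one endpoint by midpoint
    s2 = list(vertices); s2[j] = mid   # replace the other endpoint
    return s1, s2
```

Bisecting the unit triangle splits its hypotenuse at (0.5, 0.5), producing two sub-simplices that share the new vertex.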

Remark 8
The interior of a new facet generated by LEB always lies in the relative interior of the bisected simplex. This helps to reduce the number of vertices labelled as border in the new sub-simplices.
Descendants of a partition set having all its vertices labelled not-border have all facets in the interior of D, so labelling is no longer necessary. Notice that Kv Affine Arithmetic is slow in execution: when the rounding direction is fixed as "upward", the downward calculation is performed by sign inversion, and the library currently does not support division by affine variables containing zero. Additionally, the execution time of Interval Arithmetic can be reduced on processors supporting Advanced Vector Extensions SIMD (AVX-512) (see the last table on the kv-rounding web page), which is not our case. Moreover, the PNL library has support for MPI, which is not used here. Table 1 describes the studied instances. Their detailed description can be found in [7] and the optimization web page.
The termination accuracy used is α = 10⁻⁶, and Interval Arithmetic is applied with Automatic Differentiation to obtain bounds of f and ∇f on S. The following notation describes the variants used to calculate lower bounds, based on simplex inclusion (1) and the gradient on S. Rejection tests, like those on monotonicity, are checked after the bound calculations. This is not efficient, but it allows us to compare the calculated bounds. Improvement of the best function value found, f̃, is done by point evaluation. Together with the IA bound calculation, we evaluate simplex vertices. When other lower bound methods are added to IA, the evaluation of simplex vertices can be disabled in order to save computation. However, this may imply another (worse) update of f̃ and a different course of the algorithm, due to Longest Edge Bisection (LEB) by the first longest edge instead of the best LEB [19].
The following points are evaluated for each method. The +CFc* methods (* = b or s) evaluate only the center, and +CFb* evaluate only base-points bb⁻ or bs⁻. Such points are not stored. Notice that base-point bb⁻ might be located outside the simplicial search region. +CFvs and +AA evaluate and store simplex vertices. +LR and +LRS evaluate and store box vertices when the search region is a box. Additionally, simplex vertices are evaluated when the search region is a simplex, because vertices of the enclosing box of S may be outside the search region and should not be used to improve f̃.
The +CF*s methods (* = c, b or v) only update lower bounds. The other methods also update upper bounds, which may affect the partition set storage order. Detailed results are given in Table 2 to Table 21 in Appendix A. Reduction to border facets due to monotonicity does not occur in box-constrained problems; it happens in the simplex-constrained instances (see Corollary 3). The monotonicity test reduces the number of simplex evaluations significantly for all test problems. Without that test, the algorithm exceeds the limit of 15 minutes for several problems. Therefore, we always apply the monotonicity test.
Going over the results of the test problems, problem G7 appears to be a special case, see Table 7. Apparently, adding methods to IA does not provide better lower bounds there. For L8, only the +AA method improves the bound a few times, and for RB2, only the LR* methods show tighter bounds, see Table 12.
A value of > 15m in Figs. 2 and 3 means that (i) the algorithm reached the 15-minute time limit, (ii) there was a problem with the Linear Programming solver in methods +LR and +LRS, or (iii) a division by zero occurred. The latter only happens for the +AA method on problem L8, because the Kv library does not implement division by affine forms containing zero.
Focusing on the number of required simplex evaluations (NS), Fig. 2 shows that the +AA method requires the fewest evaluations for most of the test cases. The second best methods regarding the NS metric are those using LP (+LR* and +CFbs). The +LRS lower bounding requires fewer simplex evaluations than +LR for some cases, and +CFcb has the best NS values for only a few test cases. For smooth functions, the algorithm converges to a region which is captured by a convex quadratic function. To study the limit convergence behaviour of the algorithm, we ran all variants on the so-called Trid function from [23], which represents a convex quadratic function. The results can be found in Tables 22-24 (Appendix B). One can observe for this limit situation that the Linear Relaxation variants are relatively close models and require fewer simplex evaluations than the other lower bounding methods. This means that, for all cases, the +LR variants have an advantage in the final stages of the algorithm. It is worth mentioning that, when the dimension increases, the required Linear Programming becomes more time consuming and the +AA variant starts to do better.

The execution time is a difficult performance indicator, as it depends on the external subroutines used. Figure 3 provides normalized values. In the first 9 test cases (ordered according to NS), the execution time is similar for most of the methods, apart from those using LP (+LR* and +CFbs). In fact, methods using LP are in general the most time consuming due to the called routines, followed by +AA, which avoids solving an LP due to (16). According to the Kv library documentation, Affine Arithmetic is slow and its implementation could be improved. The +CFvs method requires the least execution time for most of the instances. Comparing +CFvs with the other +CF* methods, the centered form used in +CFvs has to evaluate one sum term less, and the base-point vertex may already have been evaluated and stored.
On average, +CFbb is the best method, but this is because it is the best for the ST5 test problem, which is one of the most time-consuming instances. The method requiring the least computing time in several test problems is the one based on the centered form on a simplex, using the simplex vertex with the highest function value as base-point. That vertex may already have been evaluated and stored, and the centered form requires the evaluation of one term fewer. This means that, for low-dimensional instances, it is preferable to evaluate cheap lower bounds that reuse previous information over more simplices than expensive lower bounds over fewer simplices.
Acknowledgements This research is supported by the Spanish Ministry (RTI2018-095993-B-I00), in part financed by the European Regional Development Fund (ERDF).

Conflicts of interest
The authors declare that they have no conflict of interest.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

A Extended numerical results
In Tables 2 to 21, the following notation is used. NS: number of simplex evaluations; NSV: number of simplex vertex evaluations; MTS: maximum number of simplices stored in the AVL tree; MTP: maximum number of points stored in the AVL tree; NNV: number of non-simplex-vertex evaluations (these can be cb, bb⁻, cs or bs⁻); NBV: number of S vertex evaluations; NI: number of times the natural inclusion lower bound is improved by another lower bounding method; T: wall clock time. Differences smaller than 0.005 s are not significant.

Table 24
Results for Trid on box

B Methods on a convex quadratic function
The Trid problem from the optimization literature is a convex quadratic function. According to the Some Hard Global Optimization Test Problems web page: "This is a simple discretized variational problem, convex, quadratic, with a unique local minimizer and a tridiagonal Hessian. The scaling behaviour for increasing n (search region is [−n², n²]) gives an idea on the efficiency of localizing minima once the region of attraction (which here is everything) is found; most local methods only need O(n²) function evaluations, or only O(n) (function+gradient) evaluations. A global optimization code that has difficulties with solving this problem for n=100, say, is of limited worth only. The strong coupling between the variables causes difficulties for genetic algorithms. The problem is typical for many problems from control theory, though the latter are usually nonquadratic and often nonconvex."