An approximation algorithm for multi-objective optimization problems using a box-coverage

For a continuous multi-objective optimization problem, it is usually not a practical approach to compute all its nondominated points because there are infinitely many of them. For this reason, a typical approach is to compute an approximation of the nondominated set. A common technique is to generate a polyhedron which contains the nondominated set. However, these approximations are often used for further evaluations, and for such applications a polyhedron is not an easy structure to handle. In this paper, we introduce an approximation with a simpler structure respecting the natural ordering. In particular, we compute a box-coverage of the nondominated set. To do so, we use an approach that, in general, allows us to update not only one but several boxes whenever a new nondominated point is found. The algorithm is guaranteed to stop with a finite number of boxes, each being sufficiently thin.


Introduction
The aim of multi-objective optimization is to minimize not only one but multiple objectives at the same time. Usually, it is not possible to find a feasible point that minimizes all objectives as these are conflicting. Hence, a commonly used approach is to find so-called nondominated points in the criterion space which belong to so-called efficient solutions in the decision space. The set of all these nondominated points is called nondominated set or Pareto front. Given an efficient solution, it is not possible to find a feasible point that leads to an improvement for any objective without deteriorating another. For an introduction to multi-objective optimization see [9,30,32].
In general, there is an infinite number of nondominated points for a continuous multi-objective optimization problem. Thus, a common technique is to compute a (finite) approximation of the nondominated set. In [34] a survey of such techniques including a classification can be found. There are basically two kinds of approximations. On the one hand, there are approaches which compute a finite number of nondominated points to represent the whole nondominated set, which we refer to as representation approaches (e.g., [4,16,42]). On the other hand, there are approaches which compute a set (instead of a finite number of points) that contains the nondominated set, which we refer to as coverage approaches. A common technique for coverage approaches is to combine inner and outer approximations (e.g., [11,26,36]). This is sometimes referred to as sandwiching, see [1].
Those sandwiching techniques using inner and outer approximations lead to a polyhedron which contains the nondominated set. Such polyhedra are usually represented by their vertices. Hence, updating such polyhedral approximations requires constantly alternating between computing inner and outer approximations and recomputing at least some of the vertices from a hyperplane representation, see for instance [8,11].
Another approach is to use coverages that consist of boxes. Boxes can be easily represented by their corners, which we refer to as lower and upper bound. Thus, one can expect that updating (a collection of) boxes requires less effort than updating a polyhedron. Moreover, boxes respect the natural ordering. This is an advantage for applications that perform further computations based on the approximation.
For example, one could think of fields such as mixed-integer multi-objective optimization, where the nondominated set for the mixed-integer problem can be computed by comparing the nondominated sets of the multi-objective optimization problems that arise by fixing the values of the integer variables.
Another field is set optimization where values of set-valued objective functions have to be compared, see [24]. Robust approaches for handling uncertainties in multi-objective optimization lead to such set optimization problems, see [21]. For instance, for the upper-type set relation comparing compact sets corresponds to comparing Pareto fronts, for which coverages can be used, see for example [13].
Boxes can be compared more easily than general polyhedra. Moreover, in case these coverages are not exact enough, it is important to be able to improve them iteratively. In view of this, we present such an improvement guarantee for our coverage approach, which we call the Halving Theorem.
An example for a box-coverage is presented in [19] for bi-objective problems and has been extended in [27] to tri-objective problems. The approach in [27] is to split a given box into seven subboxes using update points that are computed using a lexicographic ε-constraint scalarization. Boxes are removed if they do not contain any feasible points. Otherwise they are split again until their maximum width is smaller than a given tolerance. In [5] an algorithm to generate a box-coverage for bi-objective integer programs is presented. This algorithm also uses the approach to divide a given box into (two) subboxes. The update point, which decides where the box is split, is computed using an approach related to the Pascoletti-Serafini scalarization. It also takes into account the integrality of the problem to avoid working with infeasible subboxes. To the best of our knowledge there is no generalization of the approaches from [27] and [5] for an arbitrary number of objective functions.
Another approach for a box-based approximation, mainly focused on discrete tri-objective problems, is given in [6]. It is different from the approaches described above as all boxes have the same lower bound. In other words, only the upper bounds of the boxes are updated.
Moreover, the boxes are allowed to intersect. As a result, a single update point, i.e., a nondominated point, can lead to a split of multiple boxes and, consequently, redundant subboxes. However, approaches how to remove redundancy are presented as well.
In [25] a concept of search regions related to those from [6] and so-called local upper bounds are presented. They can be used for any number of objective functions and in particular for (non-discrete) multi-objective optimization problems. In [38] and very recently in [7] representation approaches that inherently also generate a box approximation have been presented, where [7] demonstrates the use of such approaches in radiotherapy planning as a real world application.
For completeness, we want to mention that branch-and-bound algorithms usually use boxes in the decision space and some of these algorithms also generate boxes in the criterion space, see [14,15,31,35]. However, we want to focus here on working in the criterion space without creating any substructure in the decision space. One reason for our focus on a criterion space based method is that the computation time of branch-and-bound approaches in the decision space increases quite fast when the number of decision variables increases. Since our algorithm works in the criterion space, its computation time depends more on the number of objectives than on the number of decision variables. Hence, our algorithm focuses on such cases where there are more decision variables than objective functions, whereas branch-and-bound approaches such as those from [14,15,31,35] are alternatives in case the dimension of the decision space is relatively small. Some of these methods like [14,35] are also limited to purely box-constrained optimization problems. Our approach works for an arbitrary feasible set as long as it is compact.
In this paper, we introduce a new approximation algorithm for multi-objective optimization problems using the bound concepts from [25] to compute a box-coverage of the nondominated set. To the best of our knowledge, we are the first to present a box approximation concept for an arbitrary dimension of the criterion space involving both upper and lower bound improvements with an exact bound on the number of iterations needed. Moreover, we show that in every iteration a certain improvement of the approximation can be guaranteed, which we refer to as Halving Theorem. While we recommend our algorithm most of all for convex multi-objective optimization problems, i.e., problems where all objective and constraint functions are smooth and convex, the theoretical results presented in this paper still hold when assuming continuous objective functions and a compact feasible set. In particular, the presented algorithm needs to solve a large number of single-objective optimization problems that are derived from the original multi-objective optimization problem. It is crucial for the performance of the algorithm that a fast and reliable solver for these single-objective subproblems is available. A well-known class of optimization problems for which such a solver is available are smooth convex problems. In case the problems are nonconvex, a suitable global solver needs to be used. Hence, the reader should be aware that while in theory the algorithm presented in this paper works even under weaker assumptions the best solvers for the subproblems exist for smooth convex optimization problems.
The remaining paper is structured as follows. We start in Sect. 2 with some notations and definitions. We also present the problem formulation (MOP) and characterize the kind of approximation that we aim for. In Sect. 3 we discuss the approach to characterize the boxes of our approximation using lower and upper bounds based on the concepts from [25]. Then, in Sect. 4 we introduce our new algorithm to compute the box-based approximation of the nondominated set including a detailed discussion of its properties, such as finiteness. Finally, in Sect. 5 some numerical results for using our algorithm to compute an approximation are presented.

Notations and definitions
For a positive integer n ∈ N we use the notation [n] := {1, . . . , n}. All relations in this paper are meant to be read component-wise, i.e., for x, x′ ∈ R^n it is x ≤ x′ if and only if x_i ≤ x′_i for all i ∈ [n]. For l, u ∈ R^n with l ≤ u we denote by [l, u] := {y ∈ R^n | l ≤ y ≤ u} the box with lower bound l and upper bound u.

As already mentioned in the introduction, we focus on multi-objective optimization problems. We denote by f_i : R^n → R, i ∈ [m] the objective functions and by S ⊆ R^n the feasible set. We also write f = (f_1, . . . , f_m) : R^n → R^m. Then, our multi-objective optimization problem is given as

min_{x ∈ S} f(x), (MOP)

where all functions f_i, i ∈ [m] are assumed to be continuous and S is assumed to be a nonempty, compact set. Since f(S) is bounded, it holds that there exist z, Z ∈ R^m with

f(S) ⊆ int(B) for the box B := [z, Z]. (2.1)

We assume in the following that such a box B is known. While there is no need to assume convexity of the objective functions f_i, i ∈ [m] and the set S for the theoretical results in this paper, one needs to be able to solve a single-objective subproblem related to (MOP), see (SUP(l, u)) in Sect. 4. Fast and reliable solvers for such optimization problems are available for example when assuming convexity of the objective functions f_i, i ∈ [m] and the set S. The corresponding subproblems are then single-objective convex optimization problems where every locally optimal solution is also a globally optimal solution. Hence, from a practical point of view, we recommend to use our algorithm first of all for smooth convex multi-objective optimization problems. We discuss this in more detail in Sect. 5. However, for the theoretical results of our paper we stick with the weaker assumptions of continuous objective functions and a compact feasible set S.

As the different objective functions of (MOP) are usually competing with each other, in general it is not possible to find a feasible point that minimizes all objectives at the same time. Thus, we use the following optimality concepts.
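Since all comparisons in this paper are meant component-wise, it is worth fixing them precisely in code. The following minimal Python sketch (helper names are ours, not from the paper) implements the component-wise relations and box membership:

```python
def leq(x, y):
    """Component-wise x <= y."""
    return all(a <= b for a, b in zip(x, y))

def lt(x, y):
    """Component-wise strict x < y."""
    return all(a < b for a, b in zip(x, y))

def in_box(y, l, u):
    """Membership in the box [l, u] = {y : l <= y <= u}."""
    return leq(l, y) and leq(y, u)
```

Note that, unlike for scalars, two vectors can be incomparable: neither leq(x, y) nor leq(y, x) needs to hold.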
Definition 2.1 For a given ε > 0 we call x̄ ∈ S an ε-efficient solution for (MOP) if there exists no x ∈ S with f(x) + εe ≤ f(x̄) and f(x) + εe ≠ f(x̄), and we call x̄ ∈ S a weakly ε-efficient solution for (MOP) if there exists no x ∈ S with f(x) + εe < f(x̄).

We use a related concept in the criterion space, called dominance.

Definition 2.2
Let y^1, y^2 ∈ R^m and ⪯ ∈ {≤, ≥}. Then y^2 is dominated by y^1 with respect to ⪯ if y^1 ≠ y^2 and y^1 ⪯ y^2. For a set N ⊆ R^m a vector y ∈ R^m is dominated given N with respect to ⪯ if ∃ ŷ ∈ N : ŷ ≠ y, ŷ ⪯ y.
If y is not dominated given N w.r.t. ⪯, it is called nondominated given N with respect to ⪯. Analogously, for ≺ ∈ {<, >} we say y^2 is strictly dominated by y^1 with respect to ≺ if y^1 ≺ y^2, and a vector y ∈ R^m is strictly dominated given a set N ⊆ R^m with respect to ≺ if ∃ ŷ ∈ N : ŷ ≺ y.
If y is not strictly dominated given N w.r.t. ≺, it is called weakly nondominated given N with respect to ≺.
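As a small sanity check of these notions, here is a Python sketch of dominance, strict dominance, and the nondominated points of a finite set w.r.t. ≤ (function names are ours):

```python
def dominates(y1, y2):
    """y2 is dominated by y1 w.r.t. <=: y1 <= y2 and y1 != y2."""
    return y1 != y2 and all(a <= b for a, b in zip(y1, y2))

def strictly_dominates(y1, y2):
    """y2 is strictly dominated by y1 w.r.t. <: y1 < y2."""
    return all(a < b for a, b in zip(y1, y2))

def nondominated(N):
    """Points of N that are nondominated given N w.r.t. <=."""
    return [y for y in N if not any(dominates(z, y) for z in N)]
```

For example, in N = [(1, 3), (2, 2), (3, 1), (2, 3)] only (2, 3) is dominated (by both (1, 3) and (2, 2)), but it is not strictly dominated by (1, 3) since the second components are equal.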
In general, the specification of the relation ⪯/≺ is left out if it is known by context. As the images f(x̄) of efficient solutions x̄ ∈ S are nondominated given f(S) w.r.t. ≤, they are called nondominated points of (MOP). We denote by E the set of efficient solutions (also efficient set) and by N the set of nondominated points (also nondominated set) of (MOP), i.e., N := {y ∈ R^m | y = f(x), x ∈ E} ⊆ R^m. Also, for an arbitrary ε > 0, we denote the ε-nondominated set for (MOP) by N_ε := {f(x) ∈ R^m | x is an ε-efficient solution for (MOP)} and the weakly ε-nondominated set for (MOP) by N^w_ε := {f(x) ∈ R^m | x is a weakly ε-efficient solution for (MOP)}.
In this paper we focus on the criterion space and hence, on finding an approximation of the set N . As already mentioned in the introduction, we aim for a box-based coverage of N . The concept of an enclosure, as presented in [12], realizes such a box-based coverage.
Definition 2.3 Let L, U ⊆ R^m be finite sets with N ⊆ L + R^m_+ and N ⊆ U − R^m_+. Then L is called lower bound set, U is called upper bound set, and the set A which is given as

A := (L + R^m_+) ∩ (U − R^m_+)

is called approximation or enclosure of the nondominated set N given L and U.
For an illustration of this concept, see Fig. 1. In this figure, the nondominated set N is given in orange and the sets L = {l^1, l^2} and U = {u^1, u^2} are lower and upper bound sets as in Definition 2.3. The box structure of the corresponding approximation A can also be seen.

We aim for an approximation of certain quality. For this reason, we use a quality criterion that is presented in [12]. There, the authors suggest to generalize the quality criterion given by the interval length u − l of enclosing intervals [l, u] from single-objective optimization to the so-called width w(A) of the enclosure A with respect to the direction of the all-ones vector e, i.e., to define w(A) as the optimal value of

sup_{y,t} t s.t. y ∈ A, y + te ∈ A, t ∈ R. (2.4)

This definition arises quite naturally, which we want to explain briefly. By Definition 2.3, we have N ⊆ A. Besides that, it is also reasonable to aim for an approximation A that only consists of points that are at least approximately nondominated. For example, we can demand that for some ε > 0 we have that y ∈ N_ε for all y ∈ A ∩ f(S). A sufficient criterion for this to hold would be that for any y ∈ A we have that y − εe ∉ A. In other words, the quality of the approximation A can be defined as the largest ε > 0 such that there exists some y ∈ A with y + εe ∈ A. This leads exactly to the definition of w(A) from (2.4). Moreover, this relation between A and N_ε is also shown in [12, Lemma 3.1]. In particular, it holds for any ε > 0 and any enclosure A with w(A) < ε that A ∩ f(S) ⊆ N_ε.

A similar result can be obtained for the polyhedral approach from [11] for convex multi-objective problems that generates an inner approximation P_i and an outer approximation P_o of the nondominated set. It is shown in [11, Theorem 4.3] that for the nondominated set N_{P_i} of P_i it holds that N_{P_i} ⊆ N^w_ε. Thereby, ε is an upper bound on the distance between any vertex v of the polyhedral approximation and the boundary of f(S).
More precisely, denote by V the vertex set of the polyhedron and choose a fixed interior point p ∈ f(S) + R^m_+. Then, for each v ∈ V, the corresponding (unique) boundary point is obtained by intersecting the boundary of f(S) + R^m_+ with the line through v and p, and ε bounds the distances between the vertices and their boundary points.

In [12] the authors have also shown that there is an equivalent formulation of (2.4) that better fits the box-approximation concept. They denote the shortest edge of a box [l, u] by s(l, u) := min_{i ∈ [m]} (u_i − l_i). Then it holds that

w(A) = max { s(l, u) | l ∈ L, u ∈ U, l ≤ u }.

We want to remark that [12] presents a branch-and-bound framework in the decision space while we focus on the criterion space. For our paper, we only make use of their enclosure concept and the corresponding quality measure w.
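Assuming, as discussed above for [12], that s(l, u) := min_{i ∈ [m]} (u_i − l_i) denotes the shortest edge of a box and that w(A) is the maximum of s(l, u) over all pairs l ∈ L, u ∈ U with l ≤ u, the width of a box-based enclosure reduces to a few lines of Python (function names are ours):

```python
def s(l, u):
    """Shortest edge length of the box [l, u]."""
    return min(ui - li for li, ui in zip(l, u))

def width(L, U):
    """w(A) = max of s(l, u) over all pairs (l, u) with l <= u."""
    pairs = [(l, u) for l in L for u in U
             if all(li <= ui for li, ui in zip(l, u))]
    return max(s(l, u) for l, u in pairs)
```

This also shows why the box structure is convenient: the quality measure only requires pairwise comparisons of bound vectors, with no vertex enumeration of a polyhedron.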

Fig. 2 Illustration of lower search region, lower search zone and local upper bounds
A set N ⊆ R^m is called stable with respect to an order relation ⪯ ∈ {≤, ≥} if its elements are pairwise nondominated, i.e., if there are no y^1, y^2 ∈ N with y^1 ≠ y^2 and y^1 ⪯ y^2. Analogously to the concept of dominance, the specification of the order relation is often left out if it is known by context. It is easy to see that the nondominated set N is a stable set w.r.t. ≤.

Definition 3.2
Let N ⊆ f(S) be a finite and stable (w.r.t. ≤) set. Then the lower search region for N is

s(N) := {y ∈ int(B) | y′ ≤ y holds for no y′ ∈ N}

and the lower search zone for a point u is c(u) := {y ∈ int(B) | y < u}. A finite set U(N) ⊆ B is called local upper bound set given N if
(i) s(N) = ∪_{u ∈ U(N)} c(u), and
(ii) U(N) is stable (w.r.t. ≤).

Each point u ∈ U (N ) is called a local upper bound (LUB).
Given the set N, the search region s(N) contains all potentially nondominated points in int(B) given N w.r.t. ≤ without N itself. The latter holds because, just by the definition, it is always N ∩ s(N) = ∅. In other words, s(N) contains all elements y ∈ int(B)\N that are not dominated by any y′ ∈ N. Hence, dominance in the context of local upper bounds is always dominance w.r.t. ≤ and also stable always means stable w.r.t. ≤. It is known that for any finite and stable set N ⊆ f(S) the local upper bound set U(N) is uniquely determined and finite, see [12].
For an illustration of the concept of local upper bounds, see Fig. 2. For a stable set N = {y^1, y^2} ⊆ R^2 a local upper bound set U(N) = {u^1, u^2, u^3} is shown and also the lower search zone c(u^2) and the lower search region s(N) are highlighted.
The following lemma presents a relation between local upper bound sets and upper bound sets as presented in Definition 2.3.

Lemma 3.3 Let N ⊆ f(S) be a finite and stable set. Then it holds N ⊆ cl(s(N)) ⊆ U(N) − R^m_+.
Proof We only need to show N ⊆ cl(s(N)). So let ȳ ∈ N be a nondominated point of (MOP) and assume ȳ ∉ s(N). Then there exists some y′ ∈ N ⊆ f(S) with y′ ≤ ȳ. This implies y′ = ȳ because ȳ is nondominated given f(S). Now, consider the sequence (y^k)_{k ∈ N} given by y^k := ȳ − (1/k)e, which satisfies y^k ∈ int(B) for all sufficiently large k ∈ N. Moreover, it holds that y^k < y′ for all k ∈ N. This implies that y^k ∈ s(N) for all sufficiently large k, because otherwise there exist y′′ ∈ N and an index k ∈ N with y′′ ≤ y^k < y′, which contradicts the assumption that N is stable. Thus, we obtain that ȳ = y′ = lim_{k→∞} y^k ∈ cl(s(N)).

Hence, U = U (N ) is an upper bound set in the sense of Definition 2.3 for any finite and stable set N ⊆ f (S).
This concept leads to upper bounds for the nondominated set of (MOP). Now, we show how to use it to gain lower bounds. This is one of the main differences when comparing our approach to that in [25] where only the local upper bounds are used. To distinguish between the local upper bounds and the closely related local lower bounds, which we present in the next definition, we use upper case notation instead of lower case notation for the search region and search zones.

Definition 3.4 Let N ⊆ int(B) be a finite and stable (w.r.t. ≥) set. Then the upper search region for N is

S(N) := {y ∈ int(B) | y′ ≥ y holds for no y′ ∈ N}

and the upper search zone for a point l is C(l) := {y ∈ int(B) | y > l}. A finite set L(N) ⊆ B is called local lower bound set given N if
(i) S(N) = ∪_{l ∈ L(N)} C(l), and
(ii) L(N) is stable (w.r.t. ≥).
Each point l ∈ L(N) is called a local lower bound (LLB).
In the context of local lower bounds, dominance is always dominance w.r.t. ≥ and stable sets are stable w.r.t. ≥ as well.
In Fig. 3 an illustration of the concept of local lower bounds is given for the same setting as in Fig. 2. We have the same stable set N = {y^1, y^2} ⊆ R^2, a local lower bound set L(N) = {l^1, l^2, l^3}, the upper search zone C(l^2), and the upper search region S(N).
Next, we show that for some specific sets N the local lower bound set L(N) is indeed a lower bound set in the sense of Definition 2.3.

Lemma 3.5 Let N ⊆ int(B) be a finite and stable (w.r.t. ≥) set such that no point of N dominates a nondominated point of (MOP) w.r.t. ≥, i.e., there are no nondominated point ȳ of (MOP) and y ∈ N with y ≥ ȳ and y ≠ ȳ. Then L(N) is a lower bound set in the sense of Definition 2.3.

Proof Let ȳ ∈ f(S) ⊆ int(B) be a nondominated point of (MOP). Then for every y ∈ N it holds y = ȳ or y ≱ ȳ. Hence, using Definition 3.4, every nondominated point of (MOP) belongs to N ∪ S(N) ⊆ cl(S(N)), where the inclusion in cl(S(N)) can be shown using similar arguments as in the proof of Lemma 3.3. Finally, this leads to cl(S(N)) ⊆ L(N) + R^m_+, i.e., L(N) is a lower bound set in the sense of Definition 2.3.
In particular, for any finite and stable set N ⊆ int(B) \ (f(S) + (R^m_+ \ {0})) the assumptions of Lemma 3.5 are satisfied. As we need this result later in Sect. 4 (Lemma 4.4), we briefly summarize the relation between local lower and local upper bound sets and bound sets as given in Definition 2.3.

Corollary 3.6 Let N_1 ⊆ f(S) and N_2 ⊆ int(B) \ (f(S) + (R^m_+ \ {0})) be finite and stable sets. Then U(N_1) is an upper bound set and L(N_2) is a lower bound set in the sense of Definition 2.3.

In the following, we present a method to compute local upper and local lower bounds. To provide initial local lower bound and local upper bound sets, we set U(∅) = {Z} and L(∅) = {z}. It is easy to see that these sets satisfy Definitions 3.2 and 3.4. The sets L and U are then updated using points y ∈ int(B) to obtain smaller search regions. As updating these sets is done by using projections, we use here the following notation from [25]. For y ∈ R^m, α ∈ R and an index i ∈ [m] we define

y_{−i} := (y_1, . . . , y_{i−1}, y_{i+1}, . . . , y_m) as well as (α, y_{−i}) := (y_1, . . . , y_{i−1}, α, y_{i+1}, . . . , y_m).
Using this notation, Algorithm 1 shows an updating procedure for local upper bound sets as presented in [25,Algorithm 3]. We briefly explain the algorithm after the forthcoming Lemma 3.7.
Due to the close relation between local upper bounds and local lower bounds, the concept of Algorithm 1 can also be used for updating local lower bound sets. To do so, one simply has to replace every < by > and every ≤ by ≥. This leads to the updating procedure as given in Algorithm 2.
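The pseudocode of Algorithms 1 and 2 is not reproduced in this chunk, so the following Python sketch (function names are ours) implements the update logic naively: an update point y splits every lower search zone with y < u into the m projections (y_i, u_{−i}), and redundant candidates are discarded afterwards. Note that [25, Algorithm 3] organizes this candidate filtering far more efficiently; this is only an illustration. The lower bound update then follows by the symmetry described above (swap < and >), realized here by negating all coordinates:

```python
def update_lub(U, y):
    """Naive update of a local upper bound set U with a point y."""
    A = [u for u in U if all(a < b for a, b in zip(y, u))]
    if not A:                      # y lies in no lower search zone c(u)
        return list(U)
    cand = [u for u in U if u not in A]
    for u in A:                    # split c(u): candidates (y_i, u_{-i})
        for i in range(len(y)):
            cand.append(tuple(y[i] if j == i else u[j]
                              for j in range(len(u))))
    # keep only the maximal candidates (stability of the bound set)
    return [u for u in cand
            if not any(v != u and all(a <= b for a, b in zip(u, v))
                       for v in cand)]

def update_llb(L, y):
    """Symmetric update for local lower bounds via coordinate negation."""
    neg = lambda S: [tuple(-c for c in p) for p in S]
    return neg(update_lub(neg(L), tuple(-c for c in y)))
```

For instance, starting from U(∅) = {(10, 10)}, the update point (4, 6) yields the two local upper bounds (4, 10) and (10, 6), and a point that is dominated given the current set leaves U unchanged, in line with the discussion of Algorithm 1 below.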
In Sect. 4 we present our new algorithm to generate a box-coverage of the nondominated set N. The properties of this algorithm, e.g., finiteness, depend strongly on the properties of the updating procedures for local lower and local upper bounds. For this reason, we discuss those properties in the remaining part of this section. Due to the analogies of both procedures, we focus on local upper bounds and Algorithm 1.
Our Algorithm 1 slightly differs from [25, Algorithm 3]. Compared to the original algorithm, we do not assume the update point y ∈ f(S) to be nondominated given N. However, the algorithm still works correctly. For any update point y that is nondominated given N the correctness of the algorithm is shown in [25]. For update points y that are dominated given N the correctness of the algorithm is shown in the following lemma.

Lemma 3.7 Let N ⊆ f(S) be a finite and stable set and let y ∈ f(S) be dominated given N. Then Algorithm 1 applied to U(N) and y returns the unchanged set U(N) = U(N ∪ {y}).

Proof As y ∈ f(S) is dominated given N, there exists y′ ∈ N with y′ ≤ y and y′ ≠ y. This implies that y ∉ s(N) and by property (i) of U = U(N) being a local upper bound set this implies that there exists no u ∈ U(N) with y ∈ c(u). Hence, there exists no u ∈ U(N) with y < u and for Algorithm 1 this means that A = ∅. As a result, the algorithm returns the same (unchanged) set U = U(N), which is the same local upper bound set as U(N ∪ {y}).

Algorithm 1 Updating a local upper bound set
Every element u′ of the updated set U(N ∪ {y}) is either an element of U(N) or of the form u′ = (y_i, u_{−i}) for some i ∈ [m] and u ∈ U(N). For the latter case we call u the parent of u′. Otherwise u′ is its own parent.

Theorem 3.8 Every u′ ∈ U(N ∪ {y}) has exactly one parent.

Proof If u′ is its own parent, i.e., u′ ∈ U(N), then there is nothing to show. Hence, we consider the case u′ ∉ U(N) and assume that there are two different parents u^1, u^2 ∈ U(N).

One can see that l^2 is the parent of l^{2,1} and l^{2,2} and that u^2 is the parent of u^{2,1} and u^{2,2}. All remaining bounds (i.e., u^1, l^1) are their own parents. For consistency their names are changed as well. Using this way of assigning a numeration to the local lower and local upper bounds also encodes the parents of the bounds.

Computing the box-coverage
In this section, we present our new algorithm to compute an approximation of the nondominated set of (MOP) with a guaranteed improvement in each iteration. The approach is to use local lower bound sets and local upper bound sets as presented in Sect. 3 to compute the approximation in the form of a box-coverage. First, we discuss the initialization of these sets.

Initialization
We initialize L = L(∅) = {z} and U = U(∅) = {Z} with z, Z ∈ R^m from (2.1). These first bounds should be chosen as tight as possible. For this reason, we use the ideal point z̃ and the anti-ideal point Z̃ of (MOP), given by z̃_i := min_{x ∈ S} f_i(x) and Z̃_i := max_{x ∈ S} f_i(x) for all i ∈ [m]. Then, as the ideal and anti-ideal point do not satisfy (2.1), we introduce a small offset σ > 0 and define, with e the all-ones vector, z := z̃ − σe and Z := Z̃ + σe. Those are now suited as initial local lower and local upper bounds. Both the ideal and the anti-ideal point can be hard to compute. However, any choice of z, Z ∈ R^m that satisfies (2.1) can be used for initialization.
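A minimal sketch of this initialization, assuming the ideal and anti-ideal point are approximated over a finite sample of objective vectors instead of solving the 2m single-objective problems exactly (function and parameter names are ours):

```python
def init_bounds(F_samples, sigma=1e-3):
    """z := ideal - sigma*e and Z := anti-ideal + sigma*e, so that
    the (sampled) image of f lies in the interior of B = [z, Z]."""
    m = len(F_samples[0])
    ideal = [min(f[i] for f in F_samples) for i in range(m)]
    anti_ideal = [max(f[i] for f in F_samples) for i in range(m)]
    z = tuple(ideal[i] - sigma for i in range(m))
    Z = tuple(anti_ideal[i] + sigma for i in range(m))
    return z, Z
```

The offset sigma realizes the strict inclusion f(S) ⊆ int(B) required by (2.1); any looser valid bounds would work as well.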

Updating the boxes
Next, we provide a method to shrink boxes [l, u] with s(l, u) > ε. To shrink a box, we need to improve at least one of its bounds l and u. Therefore, our aim is to find a new nondominated point in [l, u]. This nondominated point is then chosen as an update point for Algorithms 1 and 2. As a result, the old box [l, u] is replaced by new, smaller boxes, see Fig. 4. In the following, we formalize this approach. Let l, u ∈ R^m with l < u. Then the search for a nondominated (update) point is performed by solving the optimization problem

min t s.t. f(x) ≤ l + t(u − l), x ∈ S, t ∈ R. (SUP(l, u))
It is crucial for the performance of our algorithm that (SUP(l, u)) can be solved fast and reliably. This is possible for instance in the case of smooth and convex subproblems (SUP(l, u)). We recommend to take this into account when choosing a solver for the subproblems within our overall algorithm. However, the following theoretical results do not require convexity but only continuous objective functions and a compact feasible set S.
In [33] it is shown that for every optimal solution (x̄, t̄) of (SUP(l, u)) the point x̄ ∈ S is a weakly efficient solution for (MOP). Thus, f(x̄) is weakly nondominated given N. To perform an update step with Algorithms 1 or 2, i.e., to obtain a new local upper bound or local lower bound set, we need an update point y ∈ R^m that is nondominated and not only weakly nondominated. In case all objective functions f_i, i ∈ [m] are strictly convex and the feasible set S is convex as well, any weakly nondominated point is also nondominated, see [3].
For our general setting, examples on how to find a nondominated point ȳ ≤ y given y ∈ f(S) can be found in [4,10]. Another example for bi-objective problems can be found in [5], where the authors used a Pascoletti-Serafini scalarization and also encountered the problem of needing to derive a nondominated point from a weakly nondominated point. For our implementation, we use a subproblem as formulated in [41]. Let y ∈ f(S) be a weakly nondominated point of (MOP). Then according to [41, Theorem 2] any optimal solution x̄ of

min Σ_{i ∈ [m]} f_i(x) s.t. f(x) ≤ y, x ∈ S

is efficient for (MOP). Thus, ȳ := f(x̄) is a nondominated point with ȳ ≤ y.
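With (SUP(l, u)) of the form min t s.t. f(x) ≤ l + t(u − l), the smallest feasible t for a fixed x is max_i (f_i(x) − l_i)/(u_i − l_i), so on a finite sample of objective vectors the subproblem can be solved by plain enumeration. The sketch below (names are ours) is meant only to illustrate the scalarization; in practice (SUP(l, u)) would be handed to a smooth convex solver:

```python
def solve_sup(F, l, u):
    """Brute-force (SUP(l, u)) over sampled objective vectors F:
    min t s.t. f(x) <= l + t*(u - l)."""
    def t_of(f):
        # smallest t such that f <= l + t*(u - l), assuming l < u
        return max((fi - li) / (ui - li)
                   for fi, li, ui in zip(f, l, u))
    f_best = min(F, key=t_of)
    return t_of(f_best), f_best
```

Geometrically, t_of(f) locates f along the diagonal direction u − l emanating from l, and minimizing it over the sample picks the point that is reached first.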

Main algorithm
Our new algorithm BAMOP to generate a box-coverage is presented as Algorithm 3. The algorithm basically works as follows: It loops through all boxes [l, u] with l ≤ u and s(l, u) > ε. Then, a new point to update the bound sets is computed using the methods described above. The whole procedure is repeated until finally all boxes and hence the approximation A given L and U are sufficiently small, i.e., w(A) ≤ ε. In Algorithm 3 there is a case distinction concerning the computation of the update point for Algorithms 1 and 2. The reason for that is that in order to compute an approximation with w(A) ≤ ε, we need the boxes to get thinner. To do so, we guarantee that at least in one dimension the box length is halved, see Theorem 4.2.
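To make the control flow concrete, here is a strongly simplified, self-contained Python sketch of the main loop over a finite sample of nondominated image points (all names are ours; the real Algorithm 3 distinguishes more cases for the update point and solves (SUP(l, u)) exactly rather than by enumeration):

```python
def bamop_sketch(F, eps, tau=0.05, sigma=0.5, max_iter=30):
    """Box approximation of the nondominated points of a finite image set F."""
    leq = lambda a, b: all(x <= y for x, y in zip(a, b))
    shortest_edge = lambda l, u: min(y - x for x, y in zip(l, u))
    neg = lambda S: [tuple(-c for c in p) for p in S]

    def update_lub(U, y):              # naive variant of Algorithm 1
        A = [u for u in U if all(a < b for a, b in zip(y, u))]
        if not A:
            return list(U)
        cand = [u for u in U if u not in A]
        for u in A:
            for i in range(len(y)):
                cand.append(tuple(y[i] if j == i else u[j]
                                  for j in range(len(u))))
        return [u for u in cand
                if not any(v != u and leq(u, v) for v in cand)]

    def update_llb(L, y):              # naive variant of Algorithm 2
        return neg(update_lub(neg(L), tuple(-c for c in y)))

    m = len(F[0])
    z = tuple(min(f[i] for f in F) - sigma for i in range(m))
    Z = tuple(max(f[i] for f in F) + sigma for i in range(m))
    L, U = [z], [Z]
    for _ in range(max_iter):          # the repeat loop
        done = True
        for l in list(L):              # snapshots, as in Algorithm 3
            for u in list(U):
                if not (leq(l, u) and shortest_edge(l, u) > eps):
                    continue
                done = False
                # (SUP(l, u)) by enumeration over the sample F
                t_of = lambda f: max((fi - li) / (ui - li)
                                     for fi, li, ui in zip(f, l, u))
                f_bar = min(F, key=t_of)
                t_bar = t_of(f_bar)
                U = update_lub(U, f_bar)
                # lower update: keep a margin tau below the line point
                t_tld = max(0.5, t_bar - tau)
                y_low = f_bar if t_bar <= 0.5 else tuple(
                    li + t_tld * (ui - li) for li, ui in zip(l, u))
                L = update_llb(L, y_low)
        if done:
            break
    return L, U
```

The case distinction on t_bar mirrors the halving idea described above: if the sampled front is too far along the diagonal, the lower bounds are pushed at least half-way through the box instead of waiting for a nondominated point inside it.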
Before we present that theorem, we briefly illustrate a single update step of the algorithm in Fig. 5. Using the notation from Algorithm 3, one case is ŷ = ȳ, i.e., on the connection line between l and u there is a nondominated point ŷ ∈ N. The illustration shows the update of the bound sets for this case.

Algorithm 3 Box Approximation for (MOP)
Input: Tolerances ε > 0 and τ > 0, initial bounds z, Z from (2.1)
Output: Lists L, U of lower and upper bounds
procedure BAMOP(ε, τ, z, Z)

The update point is obtained from an optimal solution of (SUP(l, u)). This can be used together with Corollary 3.6 later in Theorem 4.4 to prove that valid bound sets as needed for Definition 2.3 are generated. Please be aware that this also implies that, in general, for the local upper bound set U = U(N_1) and the local lower bound set L = L(N_2) in Algorithm 3 it holds that N_1 ≠ N_2.

Next, we discuss the development of the sets L and U, see Fig. 6. Let l = l^3 ∈ L_loop be the lower bound selected for the current outer for loop in Algorithm 3. We have U = U_loop = {u^1, u^2, u^3} and assume that the inner for loop first selects u^2 and then u^3 (and finally u^1, which is not part of the illustration). During the first run of the inner for loop with u = u^2 we find the update point ŷ ∈ N. Using Algorithms 1 and 2, this leads to the projections as shown in Fig. 6. For the lower bounds none of the children of l^3 is part of the updated lower bound set L = {l^1, l^6, l^7, l^5}. The updated upper bound set is U = {u^1, u^4, u^5, u^3}.
As the run of the inner for loop with u = u^2 is finished, the algorithm continues with the run of the inner for loop with u = u^3 ∈ U_loop = U. Even if l = l^3 ∉ L, it is still the parameter l for (SUP(l, u)). However, for the same reason (i.e., l^3 ∉ L) it is not considered in the update of the lower bound set L using Algorithm 2. This is an important mechanism to keep in mind. At the end of this run of the inner for loop it is L = {l^1, l^6, l^8, l^9, l^5} and U = {u^1, u^4, u^6, u^7}.

Halving property and convergence
In this section we prove some important properties of Algorithm 3, such as finiteness and correctness. A key property is presented in the following theorem, which shows that with every run of the repeat loop the width of the boxes is in some sense halved.

Theorem 4.2 (Halving Theorem) Consider a single run of the repeat loop of Algorithm 3 and denote by L_start, U_start and L_end, U_end the bound sets at the beginning and at the end of that run, respectively. Then for all l_e ∈ L_end and u_e ∈ U_end with l_e ≤ u_e there exist l_s ∈ L_start and u_s ∈ U_start such that
(i) l_s ≤ l_e ≤ u_e ≤ u_s, and
(ii) there exists an index i ∈ [m] with (u_e − l_e)_i ≤ 0.5 (u_s − l_s)_i or (u_e − l_e)_i < ε.

Proof Suppose that there exist l_e ∈ L_end and u_e ∈ U_end with l_e ≤ u_e such that for all l_s ∈ L_start and u_s ∈ U_start one of the statements (i) or (ii) is violated.
We denote by P(l_e), P(u_e) ⊆ R^m the sets containing all elements of the parent history of l_e and u_e within the current iteration, i.e., their parents, their parents' parents and so on. By Theorem 3.8 we have that

|P(l_e) ∩ L| = 1 and |P(u_e) ∩ U| = 1 (4.2)

at any point of the current iteration, i.e., for L and U at any point of time. In particular, this holds for the local lower and local upper bound sets at the beginning of the current iteration. Thus we let l_s ∈ L_start ∩ P(l_e) and u_s ∈ U_start ∩ P(u_e). The update procedures for local upper bound and local lower bound sets, i.e., Algorithms 1 and 2, always ensure that the local lower and local upper bounds are not getting worse. In particular, for l_c, l_p ∈ P(l_e) and u_c, u_p ∈ P(u_e) with l_p being the parent of l_c and u_p being the parent of u_c it holds that

l_c ≥ l_p and u_c ≤ u_p. (4.3)

By the definition of the parent history and using that L and U are only updated using Algorithms 1 and 2, this also implies that

l_s ≤ l ≤ l_e for all l ∈ P(l_e) and u_s ≥ u ≥ u_e for all u ∈ P(u_e). (4.4)

In particular, we have l_s ≤ l_e ≤ u_e ≤ u_s and hence (i) is satisfied. Thus, (ii) has to be violated. As a result, it holds that

(u_e − l_e)_i > 0.5 (u_s − l_s)_i and (u_e − l_e)_i ≥ ε for all i ∈ [m]. (4.5)

Using the notation of Algorithm 3, in particular denoting by l and u the corresponding iteration variables of the for loop, at some point of the current run of the repeat loop it is l = l_s because L_loop = L_start. Now, fix l = l_s for the outer for loop and consider the inner for loop. By (4.2), we have that there exists a unique u ∈ U_loop ∩ P(u_e). In the following, we consider Algorithm 3 at the point of time where this specific assignment of l and u is present. It is important to mention that u is not necessarily the first upper bound chosen by the inner for loop. For this reason, the sets L and U may have been updated several times, which could lead to l ∉ L and/or u ∉ U, see also Fig. 6.
However, we know from (4.2) and (4.3) that at this point of the algorithm where l and u are assigned as described above, there exist l′ ∈ L ∩ P(l_e) and u′ ∈ U ∩ P(u_e) with l′ ≥ l and u′ ≤ u. For the remaining part of this proof, we discuss all possible update steps for the fixed assignment of l and u. In total, there are ten cases, see also Fig. 7. In the following, we always refer to the case numbering from that figure.
First, we consider the case (A), i.e., t̄ > 0.5 and ȳ ≠ ŷ. This case is also shown in Fig. 5b. With t̃ := max{0.5, t̄ − τ} it is

l < l + t̃ (u − l) =: y′ and u − y′ = (1 − t̃)(u − l) ≤ 0.5 (u − l). (4.7)

We start with (A.1), i.e., l′ < y′. In this setting the update procedure for L, i.e., Algorithm 2, removes l′ and creates m new candidates l^i := (y′_i, l′_{−i}), i ∈ [m], for lower bounds. At least one of these candidates (the child of l′ which is part of the parent history P(l_e)) is indeed added to the new local lower bound set L. Thus, there exists an index i ∈ [m] such that l^i ∈ L after executing Algorithm 2. By (4.4) we have that l_e ≥ l^i and together with (4.7) and again (4.4) we obtain that

(u_e − l_e)_i ≤ u_i − y′_i ≤ 0.5 (u − l)_i ≤ 0.5 (u_s − l_s)_i.

As a result, (ii) would be satisfied, which contradicts our assumption.
Next, we consider case (A.2), i.e., l′ ≮ y′, which implies that there exists an index i ∈ [m] with l′_i ≥ y′_i. Using (4.7), this implies again with (4.4) that (u_e − l_e)_i ≤ (u − l′)_i ≤ 0.5(u − l)_i ≤ 0.5(u_s − l_s)_i, which contradicts our assumption as well.
This concludes case (A). For case (B) let t̄ ≤ 0.5 or ȳ = ŷ. First we discuss case (B.1), that is, t̄ ≤ 0. In this case it is ȳ ≤ l ≤ l′. By Theorem 3.4 (ii) there is also no l* ∈ L with l* ≤ l′, l* ≠ l′, and in particular no l* ∈ L with l* < ȳ. Thus, the local lower bound set L is not updated in this step and only the local upper bound set has to be considered.
In case (B.1.1) with ȳ < u′, Algorithm 1 removes the upper bound u′ from the local upper bound set and creates m new candidates, where the i-th candidate u^i arises from u′ by replacing its i-th component with ȳ_i. At least one of these candidates (as part of the parent history P(u_e)) is added to the updated set of local upper bounds. Thus, there exists an index i ∈ [m] such that u^i ∈ U after the updating procedure. Using (4.4), we obtain (u_e − l_e)_i ≤ (u^i − l)_i = ȳ_i − l_i ≤ 0 < ε. Again, (ii) would be satisfied, which contradicts our assumption.
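The candidate-generation step described here (remove a local upper bound that is strictly dominated by the update point and replace it by m candidates, the i-th obtained by substituting the update point's i-th component) can be sketched in Python. This is a hedged illustration in our own notation; the function name, data layout, and the final dominance filter are our simplifications and may differ from Algorithm 1 in detail.

```python
def update_local_upper_bounds(U, y_bar):
    """Sketch of a local-upper-bound update: remove every u strictly dominated
    by y_bar, replace it by m candidates (one coordinate set to y_bar's), and
    keep only componentwise-maximal bounds."""
    m = len(y_bar)
    removed = [u for u in U if all(y_bar[i] < u[i] for i in range(m))]
    kept = [u for u in U if u not in removed]
    candidates = []
    for u in removed:
        for j in range(m):
            cand = list(u)
            cand[j] = y_bar[j]  # j-th candidate: replace u_j by y_bar_j
            candidates.append(tuple(cand))
    new_U = kept + candidates
    # discard bounds dominated (componentwise <=) by another bound
    return [u for u in new_U
            if not any(v != u and all(v[i] >= u[i] for i in range(m))
                       for v in new_U)]
```

For example, with U = [(4, 4)] and a new point ȳ = (2, 2), the bound (4, 4) is replaced by the two candidates (2, 4) and (4, 2).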
Next, we consider case (B.1.2). As ȳ ≮ u′, there exists an index i ∈ [m] such that ȳ_i ≥ u′_i. Using (4.7), this leads to (u_e − l_e)_i ≤ (u′ − l)_i ≤ ȳ_i − l_i ≤ 0 < ε and contradicts our assumption that (ii) is not satisfied.
This concludes case (B.1) and we continue with case (B.2). Consequently, for the remaining part of this proof we have that t̄ > 0. We start with case (B.2.1), i.e., ȳ = ŷ, see Fig. 5a. First, we consider case (B.2.1.1) with l′ < ȳ < u′. In this setting, the updating procedures Algorithms 1 and 2 remove l′ and u′ from L and U and create m new candidates each, see (4.8). Again, at least one of these candidates is contained in the updated local lower and local upper bound sets because it is part of the parent history P(l_e) or P(u_e), respectively. Hence, there exist i, j ∈ [m] with l^i ∈ L and u^j ∈ U after the updating procedures. For i = j we have (u^j − l^i)_i = ȳ_i − ȳ_i = 0 < ε which contradicts our assumption. Thus, we only consider i ≠ j. Using (4.4), (4.7), and ȳ = ŷ, we obtain (u_e − l_e)_i ≤ (u − l^i)_i = u_i − ȳ_i ≤ (1 − t̄)(u − l)_i and (u_e − l_e)_j ≤ (u^j − l)_j = ȳ_j − l_j ≤ t̄(u − l)_j. It is either t̄ ≤ 0.5 or (1 − t̄) < 0.5. Hence, there exists ι ∈ {i, j} with (u_e − l_e)_ι ≤ 0.5(u − l)_ι ≤ 0.5(u_s − l_s)_ι, which contradicts our assumption. Next, we consider case (B.2.1.2) with l′ < ȳ ≮ u′. Consequently, there exists an index j ∈ [m] with ȳ_j ≥ u′_j. For the set L, we know that l′ is removed and Algorithm 2 computes m new candidates, see (4.8). Again, there exists an index i ∈ [m] such that l^i ∈ L after the updating procedure. For i = j we have that (u′ − l^j)_j = u′_j − ȳ_j ≤ 0 which contradicts our assumption. For i ≠ j, using (4.4), (4.7), and ȳ = ŷ, we obtain (u_e − l_e)_i ≤ (u − l^i)_i ≤ (1 − t̄)(u − l)_i and (u_e − l_e)_j ≤ (u′ − l)_j ≤ t̄(u − l)_j. Again, it is either t̄ ≤ 0.5 or (1 − t̄) < 0.5 and there exists ι ∈ {i, j} with (u_e − l_e)_ι ≤ 0.5(u − l)_ι ≤ 0.5(u_s − l_s)_ι, which contradicts our assumption.
Case (B.2.1.3), i.e., l′ ≮ ȳ < u′, can be shown analogously. There exists an index i ∈ [m] with l′_i ≥ ȳ_i. Moreover, u′ is updated by Algorithm 1. As a result, m new candidates for local upper bounds are computed, see (4.8). Again, there exists an index j ∈ [m] such that u^j ∈ U after the updating procedure. For i = j we have that (u^i − l′)_i = ȳ_i − l′_i ≤ 0 which contradicts our assumption. For i ≠ j, using (4.4), (4.7), and ȳ = ŷ, we obtain (u_e − l_e)_i ≤ (u − l′)_i ≤ (1 − t̄)(u − l)_i and (u_e − l_e)_j ≤ (u^j − l)_j ≤ t̄(u − l)_j. Consequently, there exists ι ∈ {i, j} with (u_e − l_e)_ι ≤ 0.5(u − l)_ι ≤ 0.5(u_s − l_s)_ι which contradicts our assumption. Now, we consider case (B.2.1.4) with l′ ≮ ȳ ≮ u′. In this case, there exist i, j ∈ [m] with l′_i ≥ ȳ_i and ȳ_j ≥ u′_j. For i = j this implies (u′ − l′)_i ≤ 0 and (ii) would be satisfied. This contradicts our assumption. For i ≠ j, using (4.7) and ȳ = ŷ, we have that (u_e − l_e)_i ≤ (u − l′)_i ≤ (1 − t̄)(u − l)_i and (u_e − l_e)_j ≤ (u′ − l)_j ≤ t̄(u − l)_j. Again, there exists ι ∈ {i, j} with (u_e − l_e)_ι ≤ 0.5(u − l)_ι ≤ 0.5(u_s − l_s)_ι which contradicts our assumption.
Finally, we consider case (B.2.2.2) with ȳ ≮ u′. Then there exists an index j ∈ [m] with ȳ_j ≥ u′_j. Using the same arguments as before, this implies that (u_e − l_e)_j ≤ 0.5(u − l)_j ≤ 0.5(u_s − l_s)_j, which contradicts our assumption as well.
We have now considered all ten cases shown in Fig. 7. In each of them, the assumption that (ii) is not satisfied leads to a contradiction. Hence, we have to reject this assumption, and Theorem 4.2 is proven.
With Theorem 4.2 we can show that Algorithm 3 is finite. In particular, the next lemma provides an upper bound on the number of iterations (i.e., the number of runs of the repeat loop). This bound depends on z, Z ∈ R^m from (2.1), and thus they should be chosen tight as discussed on page 12. The lemma states that the number of iterations, i.e., the number of runs of the repeat loop, is bounded by max{κ, 1}, and hence, that Algorithm 3 is finite.
Proof If κ ≤ 1 then Δ ≤ ε and Algorithm 3 ends after the first iteration. Hence, we consider the case κ > 1 and define l^1 := z, u^1 := Z, and for k ≥ 1 let L^k, U^k be the local lower bound and local upper bound sets at the beginning of iteration k. By Theorem 4.2, for all k > 1, all l^k ∈ L^k, and all u^k ∈ U^k there exist l^{k−1} ∈ L^{k−1}, u^{k−1} ∈ U^{k−1}, and an index i_k ∈ [m] such that (u^k − l^k)_{i_k} ≤ 0.5 (u^{k−1} − l^{k−1})_{i_k}. (4.9) Suppose that Algorithm 3 has more than κ iterations. Then there exist l^κ ∈ L^κ and u^κ ∈ U^κ with l^κ + εe < u^κ (which is equivalent to εe < u^κ − l^κ). For every k ∈ {2, 3, …, κ} let i_k ∈ [m] be the index from the iterative application of (4.9), starting with k = κ. We define n(i) := |{k ∈ {2, 3, …, κ} : i_k = i}| for all i ∈ [m].
By the pigeonhole principle, there exists at least one index ī ∈ [m] with n(ī) ≥ log₂(Δ/ε). Iteratively applying (4.9), this implies that (u^κ − l^κ)_ī ≤ 0.5^{n(ī)} (u^1 − l^1)_ī ≤ (ε/Δ) Δ = ε, which contradicts εe < u^κ − l^κ. As a result, there cannot be more than κ iterations and Algorithm 3 is finite.
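The counting argument can be illustrated numerically. The helper below is hypothetical (not part of the paper); it merely computes how many times an edge of initial length Δ must be at least halved before its length drops below ε, which is the quantity log₂(Δ/ε) used above.

```python
import math

def halvings_needed(delta, eps):
    """Smallest integer n with delta / 2**n <= eps, assuming delta > eps > 0."""
    return math.ceil(math.log2(delta / eps))
```

For instance, an initial edge of length Δ = 8 with tolerance ε = 1 needs 3 halvings (8 → 4 → 2 → 1).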
So far, we have shown that Algorithm 3 generates a local lower bound set L and a local upper bound set U and stops after finitely many iterations. Next, we use Theorem 3.6 to show that these sets L and U are in fact lower and upper bound sets in the sense of Theorem 2.3.

Lemma 4.4 The output sets L and U of Algorithm 3 are lower bound and upper bound sets in the sense of Theorem 2.3.
Proof As the output set U is a local upper bound set and L is a local lower bound set, there exist N_1, N_2 ⊆ int(B) with U = U(N_1) and L = L(N_2). First, we discuss the local upper bound set U. The algorithm starts with U(∅) = {Z}. After that, the local upper bound set is only updated using points ȳ = f(x̄) ∈ N computed from optimal solutions x̄ of (GNP(f(x̃))). As a result, it is N_1 ⊆ N (or N_1 = ∅) and in particular N_1 ⊆ f(S).
Next, we consider the local lower bound set L. The algorithm starts with L(∅) = {z}. For all further updates of the set there are two cases. The first case is that L is updated within the if part of the if clause, i.e., using ȳ := l + t̃(u − l) with the notation from Algorithm 3. Thereby, it holds that t̃ < t̄ with (x̄, t̄) being an optimal solution of (SUP(l, u)). Thus, there exists no y ∈ f(S) with y ≤ ȳ, which implies ȳ ∉ f(S) + R^m_+. The second case is that L is updated by Algorithm 2 using an update point ȳ = f(x̄) ∈ N computed from an optimal solution x̄ of (GNP(f(x̃))). Again, it is ȳ ∉ f(S) + (R^m_+ \ {0}). Hence, we have N_2 ⊆ int(B) \ (f(S) + (R^m_+ \ {0})) (or N_2 = ∅). By Theorem 4.3 we have that Algorithm 3 is finite and so N_1 and N_2 are finite sets as well. Moreover, without loss of generality, we can assume that the sets N_1 and N_2 are stable, see page 10 or [25, Remark 2.2]. Together with Theorem 3.6 this implies that L and U are lower and upper bound sets in the sense of Theorem 2.3.
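The stability assumption on N_1 and N_2 (no point of the set dominates another) can be enforced by a simple dominance filter. The sketch below is our own illustration of that notion, not code from the paper.

```python
def stable_subset(points):
    """Remove every point dominated by another one (componentwise <= with at
    least one strict <), leaving a stable set of pairwise nondominated points."""
    def dominates(a, b):
        return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))
    return [p for p in points if not any(dominates(q, p) for q in points)]
```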
Finally, we can use all of the results from above to show that Algorithm 3 generates an approximation A of the nondominated set N of (MOP) with width w(A) ≤ ε. When Algorithm 3 terminates, for all l ∈ L the if clause is not satisfied, i.e., for all u ∈ U there exists at least one index i ∈ [m] with (u − l)_i ≤ ε. As a result, the approximation A has width w(A) ≤ ε.
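The termination condition just described, namely that every pair (l, u) admits an index i with (u − l)_i ≤ ε, can be sketched as a simple check. The helper name and data layout are our own (hypothetical), not from the paper.

```python
def width_condition_holds(L, U, eps):
    """True iff for every l in L and u in U there is an index i with
    (u - l)[i] <= eps, i.e., every box [l, u] has a sufficiently thin edge."""
    return all(any(u[i] - l[i] <= eps for i in range(len(l)))
               for l in L for u in U)
```

For example, with L = [(0, 0)] and ε = 1, the bound u = (0.5, 3) satisfies the condition via its first coordinate, while u = (2, 3) does not.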

Numerical results
In the following, we present numerical results for selected test instances. All numerical tests were performed using MATLAB R2018b on a machine with an Intel Core i9-10920X processor. The averaged results of bench(5) are: LU = 0.0470, FFT = 0.0559, ODE = 0.0137, Sparse = 0.0885, 2-D = 0.2302, 3-D = 0.2117. This provides a rough idea of the machine's performance, in particular for comparisons with results obtained on other machines. Please be aware that these results are version specific according to the MATLAB documentation, see [29].
In this section we mainly focus on convex problems (MOP), i.e., problems with convex objective functions f_i, i ∈ [m], and a convex set S ⊆ R^n. The main advantage of this setting is that all single-objective subproblems such as (SUP(l, u)) are convex as well and can be solved by any local solver. To solve (SUP(l, u)) and (GNP(y)) we use the local solver fmincon. For the initialization, we provide z, Z as in (4.1) with σ = 10⁻⁶ unless stated otherwise. We also fix τ = 10⁻⁶ for Algorithm 3. In addition, we discuss two nonconvex examples in this section. If we only use a local solver for the subproblems, we no longer have a guarantee of obtaining a valid approximation for those.
First, in Sect. 5.1 we present the numerical results for BAMOP as given in Algorithm 3. Then, in Sect. 5.2, we present and discuss the results for a modified version of the algorithm. Please be aware that while our algorithm works for problems (MOP) with an arbitrary number of objective functions, in this section we focus on bi- and tri-objective examples, since for those we can also provide a visualization of the results. In Test Instance 4 we discuss the results for an optimization problem with more than three objectives. The generated data and the MATLAB files are available from the authors on reasonable request.

Results for BAMOP
Test Instance 1 First, we consider a convex, box-constrained problem. The second objective function is the so-called Brent function, see [2,22].
We execute the algorithm for ε = 1 and ε = 5. The results are shown in Fig. 8. For ε = 1 the structure of the nondominated set is easily recognizable, and even for ε = 5 this is still the case.
The computation time is about 0.59 s for ε = 5 and about 2.33 s for ε = 1. Hence, the quality improvement from ε = 5 to ε = 1 requires roughly four times the computation time. We further investigate the computation time in the next test instance.

Test Instance 2 For the next example, we consider a scalable test problem from [23] that was also used in [39].
The results, in particular the computation times, for different choices of n and ε are shown in Table 1. They were computed using MATLAB's Run and Time feature, which causes some overhead that should be taken into account when comparing results. Table 1 shows that increasing n or decreasing ε increases the computation time. Moreover, the overall computation time of Algorithm 3 is essentially the time spent solving the subproblems (SUP(l, u)) and (GNP(y)), i.e., the time used by fmincon in our example. Thus, it is important to carefully choose a suitable solver for these subproblems.
In Fig. 9 the results for ε = 0.1 and ε = 0.01 are shown. Together with the results in Table 1, this shows that the parameter ε should be chosen carefully and with respect to the scale of the optimization problem.

Test Instance 3 The following tri-objective example can be found in [8] and depends on a parameter a > 0.
The parameter a controls the scale of the ellipsoid constraint function and, hence, the steepness of the nondominated set. In [8] the authors present a vertex-selection approach for a polyhedral approximation algorithm from [28]. They demonstrate that the computation time of their modification is only slightly affected by the choice of a. In contrast, for our algorithm the choice of a has a noticeable influence on the computation time, see Table 2. However, this is not surprising. Increasing the parameter a also increases the size of the initial box [z, Z]. In particular, we have z_2 = 1 − a − σ and Z_2 = 1 + a + σ. Thus, we can expect that more subboxes need to be computed to generate an approximation A with w(A) ≤ ε = 0.1.
For an illustration of the result for a = 5, see Fig. 10. Besides the illustration for ε = 0.1 we also included the result for ε = 0.5 where the box structure can be seen more clearly.
Test Instance 4 As mentioned in the beginning of this section, our algorithm is not restricted to the bi- and tri-objective problems that we use here mainly to be able to visualize the numerical results. The following test instance is convex and has a scalable number m ≤ n of objectives. It can be found in various formulations, for example in [20,39]; the objective functions f_j : R^n → R, j ∈ [m], and the feasible set are as stated there. This example demonstrates that, since our algorithm is a criterion space algorithm, its computation time highly depends on the number of objectives. Moreover, the computation time increases drastically with the number of objectives in this example. This might seem surprising, since by Theorem 4.3 the maximal number of iterations of our algorithm depends only linearly on the number of objective functions. The reason for the increase in computation time is that, in general, the number of lower and upper bounds grows with each iteration, which makes the iterations increasingly time consuming.
Test Instance 5 The next example is scalable in the decision space and nonconvex. It can be found in [20,39,40] and is a generalization of an example from [17].
As already mentioned at the beginning of this section, we keep using the local solver fmincon to solve all single-objective (nonconvex) subproblems. As a result, there is no longer a guarantee that our algorithm computes a valid approximation of the nondominated set N. To ensure that all theoretical results hold, a suitable (global) solver for the nonconvex subproblems (SUP(l, u)) and (GNP(y)) would be necessary. However, for this example it turns out that our algorithm still computes a valid result. This is also the case for other examples which are only "slightly" nonconvex. If one aims for a guaranteed valid approximation of the nondominated set, algorithms such as those presented in [14,31,35,42,43] should be considered. Note that the algorithms from [14], [31] and [35] also have the advantage that they compute not only an approximation of the nondominated set but also a representation or coverage of the set of (ε-)efficient solutions. On the other hand, these algorithms are often limited with respect to the dimension n.
For the dimension we choose n = 100. In Fig. 11 the results for ε = 0.1 and for ε = 0.01 are shown. The computation time for ε = 0.1 is about 1.8 s while for ε = 0.01 it is about 24.3 s. Considering Fig. 11, one has to decide whether the increase in computation time is worth the improvement of the result.
Test Instance 6 While in the previous example our algorithm delivered a valid approximation of the nondominated set, we now present another nonconvex test instance, taken from [37], for which our algorithm combined with the local solver fmincon fails to provide a valid approximation. The test instance is given as (Tan). We set the tolerance ε = 0.1 and for the initial box [z, Z] we choose z = (−σ, −σ)^⊤ and Z = (1.3 + σ, 1.3 + σ)^⊤. The result for using fmincon is shown in Fig. 12 on the left. On the right, the result for using GlobalSearch instead of fmincon to solve the subproblems is shown. This actually returns a valid approximation of the nondominated set of (Tan). Hence, in this example more effort is needed to obtain a valid approximation by BAMOP.

Results for BAMOP with selection criterion
For this section, we consider a modification of BAMOP (Algorithm 3): we replace the inner for loop over the upper bounds by a selection criterion for a single upper bound. As the selection criterion, we use the shortest edge s(l, u). More precisely, given l ∈ L we select an upper bound u ∈ ({l + εe} + int(R^m_+)) ∩ U_loop with s(l, u) ≥ s(l, u′) for all u′ ∈ ({l + εe} + int(R^m_+)) ∩ U_loop. We refer to this modified version of Algorithm 3 as BAMOPS. Please be aware that for this modification the finiteness of the algorithm is no longer guaranteed by Theorem 4.3. However, if BAMOPS terminates, then the computed approximation A still satisfies w(A) ≤ ε. We consider again the tri-objective optimization problem (Ex5.1) from Test Instance 3. For fixed ε = 0.1 we compare the computation time of BAMOP and BAMOPS for a ∈ {1, 3, 5, 7, …, 25}. The results are presented in Fig. 13. For better comparison, the dashed line represents half the computation time of BAMOP. Thus, for this specific instance BAMOPS is roughly twice as fast as BAMOP. Note that the computation time of BAMOP for Test Instance 3 on page 24 is longer because of the overhead introduced by the Run and Time feature.
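The selection rule of BAMOPS can be sketched as follows. This is a hedged illustration in our own notation: the function name and data layout are hypothetical, and s(l, u) is taken as the shortest edge min_i (u − l)_i as in the paper.

```python
def select_upper_bound(l, U_loop, eps):
    """Among upper bounds u with l + eps*e < u componentwise, return one that
    maximizes the shortest edge s(l, u) = min_i (u - l)_i; None if no such u."""
    m = len(l)
    admissible = [u for u in U_loop if all(u[i] > l[i] + eps for i in range(m))]
    if not admissible:
        return None
    return max(admissible, key=lambda u: min(u[i] - l[i] for i in range(m)))
```

For example, with l = (0, 0), ε = 0.1, and candidates (2, 0.5), (2, 2), (3, 1.5), the bound (2, 2) is selected, since its shortest edge (length 2) is the largest.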
Also for other test instances with m > 2 we obtained a significant reduction of the computation time when using BAMOPS instead of BAMOP.

Conclusions
We presented a new algorithm to compute an approximation of the nondominated set of (MOP) in the form of a box-coverage. This algorithm has been proven to compute an approximation A with width w(A) of at most ε for any predefined ε > 0 after finitely many iterations. In particular, we have shown that an improvement in every iteration can be guaranteed (see Theorem 4.2) and provided an upper bound for the number of iterations needed (see Lemma 4.3).
Algorithm 3 can theoretically be applied to a broad class of problems. In practice, one needs to be able to solve the single-objective subproblems fast and reliably. This is possible, for instance, for smooth convex optimization problems, for which such solvers are available. Even for nonconvex problems our algorithm is able to obtain valid approximations when using a local solver for the subproblems, as demonstrated in Sect. 5. If one wants to guarantee that BAMOP computes a valid approximation in the nonconvex case, a suitable global solver is necessary. In this case one might also consider other approaches like those from [12], [14], [31] and [42]. In Sect. 5.1 we illustrated the algorithm on six different test instances. As the results have shown, the parameter ε should be chosen carefully and with respect to the range of the objective values. There is a significant trade-off between the approximation quality, i.e., the choice of ε, and the computation time of the algorithm.
We found that the computation time of Algorithm 3 mainly depends on the time needed for solving the subproblems (SUP(l, u)) and (GNP(y)). Thus, an interesting question for further research is how often these subproblems have to be solved. This could then be used to estimate the overall computation time. In particular, a relation between the amount by which ε is decreased and the resulting increase in the number of subproblem evaluations would be of great interest.
Moreover, the modification of Algorithm 3 that we presented in Sect. 5.2 seems promising in view of the computation time. However, we lose the Halving Theorem with this modification. Still, it should be investigated further. In particular, one should check whether this modification can still be proved to terminate after finitely many iterations. Also, other selection criteria, e.g., based on the Hypervolume Indicator [44], should be examined.