1 Introduction

The aim of multi-objective optimization is to minimize not just one but multiple objectives at the same time. Usually, it is not possible to find a feasible point that minimizes all objectives simultaneously as these are conflicting. Hence, a commonly used approach is to find so-called nondominated points in the criterion space, which correspond to so-called efficient solutions in the decision space. The set of all these nondominated points is called the nondominated set or Pareto front. Given an efficient solution, it is not possible to find a feasible point that improves one objective without deteriorating another. For an introduction to multi-objective optimization see [9, 30, 32].

In general, there is an infinite number of nondominated points for a continuous multi-objective optimization problem. Thus, a common technique is to compute a (finite) approximation of the nondominated set. In [34] a survey of such techniques, including a classification, can be found. There are basically two kinds of approximations. On the one hand, there are approaches which compute a finite number of nondominated points to represent the whole nondominated set, which we refer to as the representation approach (e.g., [4, 16, 42]). On the other hand, there are approaches which compute a set (instead of a finite number of points) that contains the nondominated set, which we refer to as the coverage approach. A common technique for coverage approaches is to combine inner and outer approximations (e.g., [11, 26, 36]). This is sometimes referred to as sandwiching, see [1].

Those sandwiching techniques using inner and outer approximations lead to a polyhedron which contains the nondominated set. For representing such polyhedra, usually their vertices are used. Hence, updating such polyhedral approximations means constantly alternating between refining the inner and outer approximations and recomputing at least some of the vertices from a hyperplane representation, see for instance [8, 11].

Another approach are coverages that consist of boxes. Boxes can be easily represented by their corners, which we refer to as lower and upper bound. Thus, one can expect that updating (a collection of) boxes requires less effort than updating a polyhedron. Moreover, boxes respect the natural ordering. This is an advantage for applications that perform further computation based on the approximation.

For example, one could think of fields such as mixed-integer multi-objective optimization, where the nondominated set for the mixed-integer problem can be computed by comparing the nondominated sets of the multi-objective optimization problems that arise by fixing the values of the integer variables.

Another field is set optimization where values of set-valued objective functions have to be compared, see [24]. Robust approaches for handling uncertainties in multi-objective optimization lead to such set optimization problems, see [21]. For instance, for the upper-type set relation comparing compact sets corresponds to comparing Pareto fronts, for which coverages can be used, see for example [13].

Here, boxes can be compared more easily than general polyhedra. Moreover, in case these coverages are not exact enough, it is important to be able to improve them iteratively with a guaranteed gain in accuracy. For our coverage approach we are able to present such a guarantee, which we call the Halving Theorem.

An example for a box-coverage is presented in [19] for bi-objective problems and has been extended in [27] to tri-objective problems. The approach in [27] is to split a given box into seven subboxes using update points that are computed by a lexicographic \(\varepsilon \)-constraint scalarization. Boxes are removed if they do not contain any feasible points. Otherwise, they are split again until their maximum width is smaller than a given tolerance. In [5] an algorithm to generate a box-coverage for bi-objective integer programs is presented. This algorithm also follows the approach of dividing a given box into (two) subboxes. The update point, which decides where the box is split, is computed using an approach related to the Pascoletti–Serafini scalarization. It also takes into account the integrality of the problem to avoid working with infeasible subboxes. To the best of our knowledge, there is no generalization of the approaches from [27] and [5] to an arbitrary number of objective functions.

Another approach for a box-based approximation, mainly focused on discrete tri-objective problems, is given in [6]. It differs from the approaches described above in that all boxes have the same lower bound. In other words, only the upper bounds of the boxes are updated. Moreover, the boxes are allowed to intersect. As a result, a single update point, i.e., a nondominated point, can lead to a split of multiple boxes and, consequently, to redundant subboxes. However, approaches for removing this redundancy are presented as well.

In [25] a concept of search regions related to those from [6], together with so-called local upper bounds, is presented. They can be used for any number of objective functions and in particular for (non-discrete) multi-objective optimization problems. In [38] and, very recently, in [7] representation approaches that inherently also generate a box approximation have been presented, where [7] demonstrates the use of such approaches in radiotherapy planning as a real-world application.

For completeness, we want to mention that branch-and-bound algorithms usually use boxes in the decision space and some of these algorithms also generate boxes in the criterion space, see [14, 15, 31, 35]. However, we want to focus here on working in the criterion space without creating any substructure in the decision space. One reason for our focus on a criterion space based method is that the computation time of branch-and-bound approaches in the decision space increases rapidly with the number of decision variables. Since our algorithm works in the criterion space, its computation time depends more on the number of objectives than on the number of decision variables. Hence, our algorithm focuses on cases where there are more decision variables than objective functions, whereas branch-and-bound approaches such as those from [14, 15, 31, 35] are alternatives in case the dimension of the decision space is relatively small. Some of these methods, like [14, 35], are also limited to purely box-constrained optimization problems. Our approach works for an arbitrary feasible set as long as it is compact.

In this paper, we introduce a new approximation algorithm for multi-objective optimization problems using the bound concepts from [25] to compute a box-coverage of the nondominated set. To the best of our knowledge, we are the first to present a box approximation concept for an arbitrary dimension of the criterion space involving both upper and lower bound improvements with an exact bound on the number of iterations needed. Moreover, we show that in every iteration a certain improvement of the approximation can be guaranteed, which we refer to as the Halving Theorem. While we recommend our algorithm most of all for convex multi-objective optimization problems, i.e., problems where all objective and constraint functions are smooth and convex, the theoretical results presented in this paper still hold when assuming only continuous objective functions and a compact feasible set. In particular, the presented algorithm needs to solve a large number of single-objective optimization problems that are derived from the original multi-objective optimization problem. It is crucial for the performance of the algorithm that a fast and reliable solver for these single-objective subproblems is available. A well-known class of optimization problems for which such a solver is available are smooth convex problems. In case the problems are nonconvex, a suitable global solver needs to be used. Hence, the reader should be aware that, while in theory the algorithm presented in this paper works even under weaker assumptions, the best solvers for the subproblems exist for smooth convex optimization problems.

The remaining paper is structured as follows. We start in Sect. 2 with some notations and definitions. We also present the problem formulation (MOP) and characterize the kind of approximation that we aim for. In Sect. 3 we discuss the approach to characterize the boxes of our approximation using lower and upper bounds based on the concepts from [25]. Then, in Sect. 4 we introduce our new algorithm to compute the box-based approximation of the nondominated set including a detailed discussion of its properties, such as finiteness. Finally, in Sect. 5 some numerical results for using our algorithm to compute an approximation are presented.

2 Notations and definitions

For a positive integer \(n \in {\mathbb {N}}\) we use the notation \([n] := \{1, \ldots , n\}\). All relations in this paper are meant to be read component-wise, i.e., for \(x,x^\prime \in {\mathbb {R}}^n\) it is

$$\begin{aligned} x&\le x^\prime \Leftrightarrow x_i \le x^\prime _i \text { for all } i \in [n],\\ x&< x^\prime \Leftrightarrow x_i < x^\prime _i \text { for all } i \in [n]. \end{aligned}$$
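These component-wise relations translate directly into code. A minimal illustrative Python sketch (not part of the paper):

```python
def leq(x, xp):
    """Component-wise x <= x': every component of x is at most that of xp."""
    return all(a <= b for a, b in zip(x, xp))

def lt(x, xp):
    """Component-wise strict x < x': every component of x is strictly smaller."""
    return all(a < b for a, b in zip(x, xp))

# (1, 3) <= (2, 3) holds component-wise, but (1, 3) < (2, 3) fails
# because the second components are equal.
```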

For \(l,u \in {\mathbb {R}}^n\) with \(l \le u\) we denote by \([l,u] := \left\{ y \in {\mathbb {R}}^n \;\big |\; l \le y \le u \right\} \) the box with lower bound l and upper bound u. As already mentioned in the introduction, we focus on multi-objective optimization problems. We denote by \(f_i:{\mathbb {R}}^n\rightarrow {\mathbb {R}}\), \(i\in [m]\) the objective functions and by \(S \subseteq {\mathbb {R}}^n\) the feasible set. We also write \(f = (f_1,\ldots ,f_m) :{\mathbb {R}}^{n} \rightarrow {\mathbb {R}}^{m}\). Then, our multi-objective optimization problem is given as

$$\begin{aligned} \min _{x} \; f(x) \quad \text {s.t.} \quad x \in S \end{aligned}$$
(MOP)

where all functions \(f_i, i \in [m]\) are assumed to be continuous and S is assumed to be a nonempty, compact set. Since f(S) is bounded, it holds that

$$\begin{aligned} \exists \; z, Z \in {\mathbb {R}}^m :f(S) \subseteq {{\,\mathrm{int}\,}}(B) \text { with } B:=[z,Z]. \end{aligned}$$
(2.1)

We assume in the following that such a box B is known. While there is no need to assume convexity of the objective functions \(f_i, i \in [m]\) and the set S for the theoretical results in this paper, one needs to be able to solve a single-objective subproblem related to (MOP), see (SUP(l,u)) on page 13. Fast and reliable solvers for such optimization problems are available, for example, when assuming convexity of the objective functions \(f_i, i \in [m]\) and the set S. The corresponding subproblems are then single-objective convex optimization problems where every locally optimal solution is also a globally optimal solution. Hence, from a practical point of view, we recommend using our algorithm primarily for smooth convex multi-objective optimization problems. We discuss this in more detail in Sect. 5. However, for the theoretical results of our paper we stick with the weaker assumptions of continuous objective functions and a compact feasible set S.

As the different objective functions of (MOP) are usually competing with each other, in general it is not possible to find a feasible point that minimizes all objectives at the same time. Thus, we use the following optimality concepts.

Definition 2.1

A point \({\bar{x}} \in S\) is called an efficient solution for (MOP) if there exists no \(x \in S\) with \(f_i(x) \le f_i({\bar{x}})\) for all \(i \in [m]\) and with \(f_j(x) < f_j({\bar{x}})\) for at least one \(j \in [m]\). It is called a weakly efficient solution for (MOP) if there exists no \(x \in S\) with \(f_i(x) < f_i({\bar{x}})\) for all \(i \in [m]\).

For a given \(\varepsilon > 0\) we call \({\bar{x}} \in S\) an \(\varepsilon \)-efficient solution for (MOP) if there exists no \(x \in S\) with \(f_i(x) \le f_i({\bar{x}}) - \varepsilon \) for all \(i \in [m]\) and with \(f_j(x) < f_j({\bar{x}}) - \varepsilon \) for at least one \(j \in [m]\). It is called a weakly \(\varepsilon \)-efficient solution for (MOP) if there exists no \(x \in S\) with \(f_i(x) < f_i({\bar{x}}) - \varepsilon \) for all \(i \in [m]\).
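The definition of weak \(\varepsilon \)-efficiency quantifies over the whole feasible set S; for illustration, the following Python sketch checks it against a finite sample of images \(f(x)\), which can only approximate that quantifier (all names are illustrative):

```python
def weakly_eps_efficient(y_bar, images, eps):
    """Check weak eps-efficiency of a point with image y_bar against a finite
    sample `images` of objective vectors f(x), x in S.  y_bar passes if no
    sampled image improves *every* component by more than eps."""
    return not any(
        all(yi < ybi - eps for yi, ybi in zip(y, y_bar)) for y in images
    )
```

For example, with sampled images (0, 0) and (1, -2), the point with image (1, 1) fails the check for a small tolerance eps = 0.5 but passes for eps = 2, since then no sample beats it by more than 2 in every component.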

We use a related concept in the criterion space, called dominance.

Definition 2.2

Let \(y^1,y^2 \in {\mathbb {R}}^m\) and \(\preceq \, \in \{ \le , \ge \}\). Then \(y^2\) is dominated by \(y^1\) with respect to \(\preceq \) if \(y^1 \ne y^2, \; y^1 \preceq y^2\). For a set \(N \subseteq {\mathbb {R}}^m\) a vector \(y \in {\mathbb {R}}^m\) is dominated given N with respect to \(\preceq \) if

$$\begin{aligned} \exists \; {\hat{y}} \in N :{\hat{y}} \ne y, \; {\hat{y}} \preceq y. \end{aligned}$$

If y is not dominated given N w.r.t. \(\preceq \), it is called nondominated given N with respect to \(\preceq \). Analogously, for \(\prec \; \in \{ <, > \}\) we say \(y^2\) is strictly dominated by \(y^1\) with respect to \(\prec \) if \(y^1 \prec y^2\) and a vector \(y \in {\mathbb {R}}^m\) is strictly dominated given a set \(N \subseteq {\mathbb {R}}^m\) with respect to \(\prec \) if

$$\begin{aligned} \exists \; {\hat{y}} \in N :{\hat{y}} \prec y. \end{aligned}$$

If y is not strictly dominated given N w.r.t. \(\prec \), it is called weakly nondominated given N with respect to \(\prec \).
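Definition 2.2 for the relation \(\le \)/\(<\) can be sketched as two small predicates in Python (illustrative only; the analogous \(\ge \)/\(>\) variants just flip the comparisons):

```python
def dominated(y, N):
    """y is dominated given N w.r.t. <= : some yhat in N with yhat <= y, yhat != y."""
    return any(
        tuple(yh) != tuple(y) and all(a <= b for a, b in zip(yh, y)) for yh in N
    )

def strictly_dominated(y, N):
    """y is strictly dominated given N w.r.t. < : some yhat in N with yhat < y."""
    return any(all(a < b for a, b in zip(yh, y)) for yh in N)
```

For N = {(0, 1), (1, 0)}, the point (1, 1) is dominated (by (0, 1)) but not strictly dominated, while (2, 2) is strictly dominated.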

In general, the specification of the relation \(\preceq \)/\(\prec \) is left out if it is known by context. As the images \(f({\bar{x}})\) of efficient solutions \({\bar{x}} \in S\) are nondominated given f(S) w.r.t. \(\le \), they are called nondominated points of (MOP). We denote by \({\mathcal {E}}\) the set of efficient solutions (also efficient set) and by \({\mathcal {N}}\) the set of nondominated points (also nondominated set) of (MOP), i.e., \({\mathcal {N}}:= \left\{ y \in {\mathbb {R}}^m \;\big |\; y = f(x),\; x \in {\mathcal {E}} \right\} \subseteq {\mathbb {R}}^m\). Also, for an arbitrary \(\varepsilon > 0\), we denote the \(\varepsilon \)-nondominated set for (MOP) by

$$\begin{aligned} {\mathcal {N}}_\varepsilon := \left\{ y \in {\mathbb {R}}^m \;\big |\; y = f(x), \; x \text { is an}\, \varepsilon \text {-efficient solution for }(MOP) \right\} \end{aligned}$$

and the weakly \(\varepsilon \)-nondominated set for (MOP) by

$$\begin{aligned} {\mathcal {N}}_\varepsilon ^w := \left\{ y \in {\mathbb {R}}^m \;\big |\; y = f(x), \; x \text { is a weakly }\varepsilon \text {-efficient solution for }(MOP) \right\} . \end{aligned}$$

In this paper we focus on the criterion space and hence, on finding an approximation of the set \({\mathcal {N}}\). As already mentioned in the introduction, we aim for a box-based coverage of \({\mathcal {N}}\). The concept of an enclosure, as presented in [12], realizes such a box-based coverage.

Definition 2.3

Let \(L,U \subseteq {\mathbb {R}}^m\) be two finite sets with

$$\begin{aligned} {\mathcal {N}}\subseteq L + {\mathbb {R}}^m_+ \text { and } {\mathcal {N}}\subseteq U - {\mathbb {R}}^m_+. \end{aligned}$$
(2.2)

Then L is called lower bound set, U is called upper bound set, and the set \({\mathcal {A}}\) which is given as

$$\begin{aligned} {\mathcal {A}}= {\mathcal {A}}(L,U) := (L + {\mathbb {R}}^m_+) \cap (U - {\mathbb {R}}^m_+) = \bigcup _{l \in L} \; \bigcup _{\begin{array}{c} u \in U, \\ l \le u \end{array}} \; [l,u] \end{aligned}$$
(2.3)

is called approximation or enclosure of the nondominated set \({\mathcal {N}}\) given L and U.
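Formula (2.3) translates directly into code. The following Python sketch (hypothetical data, illustrative names) enumerates the boxes \([l,u]\) with \(l \in L\), \(u \in U\), \(l \le u\) and tests membership in \({\mathcal {A}}\):

```python
def enclosure_boxes(L, U):
    """All boxes [l, u] with l in L, u in U and l <= u component-wise, cf. (2.3)."""
    return [
        (l, u) for l in L for u in U
        if all(li <= ui for li, ui in zip(l, u))
    ]

def in_enclosure(y, L, U):
    """Membership test y in A(L, U): y lies in at least one box [l, u]."""
    return any(
        all(li <= yi <= ui for li, yi, ui in zip(l, y, u))
        for l, u in enclosure_boxes(L, U)
    )

# Two lower and two upper bounds; only the pairs with l <= u form boxes.
L = [(0, 2), (2, 0)]
U = [(1, 4), (4, 1)]
```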

For an illustration of this concept, see Fig. 1. In this figure, the nondominated set \({\mathcal {N}}\) is given in orange and the sets \(L = \{l^1,l^2\}\) and \(U = \{u^1,u^2\}\) are lower and upper bound sets as in Definition 2.3. The box structure of the corresponding approximation \({\mathcal {A}}\) can also be seen.

Fig. 1

Illustration of approximation \({\mathcal {A}}\) for \(L = \{l^1,l^2\}\) and \(U = \{u^1,u^2\}\)

We aim for an approximation of a certain quality. For this reason, we use a quality criterion that is presented in [12]. There, the authors suggest generalizing the quality criterion given by the interval length \(u-l\) of enclosing intervals [l,u] from single-objective optimization to the so-called width \(w({\mathcal {A}})\) of the enclosure \({\mathcal {A}}\) with respect to the direction of the all-ones vector e, i.e., to define \(w({\mathcal {A}})\) as the optimal value of

$$\begin{aligned} \sup _{y,t} \frac{\left\Vert (y+te)-y \right\Vert }{\sqrt{m}} \quad \text { s.t. } y,y+te \in {\mathcal {A}}, \; t \in {\mathbb {R}}_+. \end{aligned}$$
(2.4)

This definition arises quite naturally, which we want to explain briefly. By Definition 2.3, we have \({\mathcal {N}}\subseteq {\mathcal {A}}\). Besides that, it is also reasonable to aim for an approximation \({\mathcal {A}}\) that only consists of points that are at least approximately nondominated. For example, we can demand that for some \(\varepsilon > 0\) we have that \(y \in {\mathcal {N}}_\varepsilon \) for all \(y \in {\mathcal {A}}\cap f(S)\). A sufficient criterion for this to hold would be that for any \(y \in {\mathcal {A}}\) we have that \(y - \varepsilon e \not \in {\mathcal {A}}\). In other words, the quality of the approximation \({\mathcal {A}}\) can be defined as the largest \(\varepsilon > 0\) such that there exists some \(y \in {\mathcal {A}}\) with \(y + \varepsilon e \in {\mathcal {A}}\). This leads exactly to the definition of \(w({\mathcal {A}})\) from (2.4). Moreover, this relation between \({\mathcal {A}}\) and \({\mathcal {N}}_\varepsilon \) is also shown in [12, Lemma 3.1]. In particular, for any \(\varepsilon > 0\) and an approximation \({\mathcal {A}}\) with \(w({\mathcal {A}}) < \varepsilon \) it holds that \({\mathcal {A}}\cap f(S) \subseteq {\mathcal {N}}_\varepsilon \).

A similar result can be obtained for the polyhedral approach from [11] for convex multi-objective problems that generates an inner approximation \({\mathcal {P}}^i\) and an outer approximation \({\mathcal {P}}^o\) of the nondominated set. It is shown in [11, Theorem 4.3] that for the nondominated set \({\mathcal {N}}_{{\mathcal {P}}^i}\) of \({\mathcal {P}}^i\) it holds that \({\mathcal {N}}_{{\mathcal {P}}^i} \subseteq {\mathcal {N}}_\varepsilon ^w\). Here, \(\varepsilon \) is an upper bound on the distance between any vertex v of the polyhedral approximation and the boundary of f(S). More precisely, denote by V the vertex set of the polyhedron and choose a fixed interior point \(p \in f(S) + {\mathbb {R}}^m_+\). Then, for each \(v \in V\) and its corresponding (unique) boundary point \(b^v := \lambda v + (1-\lambda ) p \in f(S) + {\mathbb {R}}^m_+, \; \lambda \in (0,1)\) it holds that the distance \(d(v,b^v)\) is at most \(\varepsilon \).

In [12] the authors have also shown that there is an equivalent formulation of (2.4) that better fits the box-approximation concept. They denote the shortest edge of a box [l,u] by

$$\begin{aligned} s(l,u) := \min _{i \in [m]} \;(u_i-l_i) \end{aligned}$$

and show in [12, Lemma 3.2] that the width \(w({\mathcal {A}})\) of an enclosure \({\mathcal {A}}\) equals the optimal value of

$$\begin{aligned} \sup _{l,u} \; s(l,u) \quad \text { s.t. } \quad l \in L, \; u \in U, \; l \le u. \end{aligned}$$
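The shortest-edge formulation makes the width straightforward to evaluate for finite L and U. A Python sketch with hypothetical data (illustrative names, not from the paper):

```python
def shortest_edge(l, u):
    """s(l, u) = min_i (u_i - l_i), the shortest edge of the box [l, u]."""
    return min(ui - li for li, ui in zip(l, u))

def width(L, U):
    """w(A) as the maximum of s(l, u) over all l in L, u in U with l <= u."""
    values = [
        shortest_edge(l, u) for l in L for u in U
        if all(li <= ui for li, ui in zip(l, u))
    ]
    return max(values) if values else float("-inf")
```

For L = [(0, 2), (2, 0)] and U = [(1, 4), (4, 1)], both admissible boxes have a shortest edge of length 1, so the width of the enclosure is 1.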

We want to remark that [12] presents a branch-and-bound framework in the decision space while we focus on the criterion space. For our paper, we only make use of their enclosure concept and the corresponding quality measure w.

3 Computing lower and upper bounds

In this section, we present an approach for choosing and computing the lower and upper bound sets L and U. A suitable concept for both are the so-called local upper bounds (LUB). We use them as given in [25].

Definition 3.1

A set \(Y \subseteq {\mathbb {R}}^m\) is called stable with respect to \(\preceq \, \in \{\le ,\ge \}\) if no element of Y dominates another, i.e., \(y^1 \not \preceq y^2\) for all \(y^1,y^2 \in Y\) with \(y^1 \ne y^2\).

Analogously to the concept of dominance, the specification of the order relation \(\preceq \) is often left out if it is known by context. It is easy to see that the nondominated set \({\mathcal {N}}\) is a stable set w.r.t. \(\le \).

Definition 3.2

Let \(N \subseteq f(S)\) be a finite and stable (w.r.t. \(\le \)) set. Then the lower search region for N is \(s(N) := \left\{ y \in {{\,\mathrm{int}\,}}(B) \;\big |\; y^\prime \not \le y \text { for every } y^\prime \in N \right\} \) and the lower search zone for some \(u \in {\mathbb {R}}^m\) is \(c(u) := \left\{ y \in {{\,\mathrm{int}\,}}(B) \;\big |\; y < u \right\} \). A set \(U=U(N)\) is called local upper bound set given N if

  1. 1.

    \(s(N) = \bigcup _{u \in U(N)} c(u)\),

  2. 2.

    \(c(u^1) \not \subset c(u^2) \text { for all } u^1,u^2 \in U(N)\).

Each point \(u \in U(N)\) is called a local upper bound (LUB).

Given the set N, the search region s(N) contains all potentially nondominated points in \({{\,\mathrm{int}\,}}(B)\) given N w.r.t. \(\le \), excluding N itself. The latter holds because, just by the definition, we always have \(N \cap s(N) = \emptyset \). In other words, s(N) contains all elements \(y \in {{\,\mathrm{int}\,}}(B) {\setminus } N\) that are not dominated by any \(y^\prime \in N\). Hence, dominance in the context of local upper bounds is always dominance w.r.t. \(\le \), and stable always means stable w.r.t. \(\le \). It is known that for any finite and stable set \(N \subseteq f(S)\) the local upper bound set U(N) is uniquely determined and finite, see [12].

For an illustration of the concept of local upper bounds, see Fig. 2. For a stable set \(N = \{y^1,y^2\} \subseteq {\mathbb {R}}^2\) a local upper bound set \(U(N) = \{u^1,u^2,u^3\}\) is shown and also the lower search zone \(c(u^2)\) and the lower search region s(N) are highlighted.

Fig. 2

Illustration of lower search region, lower search zone and local upper bounds

The following lemma presents a relation between local upper bound sets and upper bound sets as presented in Definition 2.3.

Lemma 3.3

Let \(N \subseteq f(S)\) be a finite and stable set. Then it holds

$$\begin{aligned} {\mathcal {N}}\subseteq {{\,\mathrm{cl}\,}}(s(N)) = \bigcup _{u \in U(N)} {{\,\mathrm{cl}\,}}(c(u)) \subseteq \bigcup _{u \in U(N)} \{u\} - {\mathbb {R}}^m_+ = U(N) -{\mathbb {R}}^m_+. \end{aligned}$$

Proof

We only need to show \({\mathcal {N}}\subseteq {{\,\mathrm{cl}\,}}(s(N))\). So let \({\bar{y}} \in {\mathcal {N}}\) be a nondominated point of (MOP) and assume \({\bar{y}} \not \in s(N)\). Then there exists some \(y^\prime \in N \subseteq f(S)\) with \(y^\prime \le {\bar{y}}\). This implies that \(y^\prime = {\bar{y}}\) because otherwise \({\bar{y}}\) would be dominated by \(y^\prime \). Since \(y^\prime \in N \subseteq {{\,\mathrm{int}\,}}(B)\), there exists \(\varepsilon > 0\) such that \(B_\varepsilon (y^\prime ) := \left\{ y \in {\mathbb {R}}^m \;\big |\; \left\Vert y-y^\prime \right\Vert \le \varepsilon \sqrt{m} \right\} \subseteq {{\,\mathrm{int}\,}}(B)\).

Hence, for \(y^k := y^\prime -\frac{\varepsilon }{k}e, \; k \in {\mathbb {N}}\) we have that \((y^k)_{k \in {\mathbb {N}}} \subseteq B_\varepsilon (y^\prime ) \subseteq {{\,\mathrm{int}\,}}(B)\). Moreover, it holds that \(y^k < y^\prime \) for all \(k \in {\mathbb {N}}\). This implies that \((y^k)_{k \in {\mathbb {N}}} \subseteq s(N)\), because otherwise there exists \(y^{\prime \prime } \in N\) and an index \(k \in {\mathbb {N}}\) with \(y^{\prime \prime } \le y^k < y^\prime \), which contradicts the assumption that N is stable. Thus, we obtain that \({\bar{y}} = y^\prime = \lim _{k \rightarrow \infty } y^k \in {{\,\mathrm{cl}\,}}(s(N))\). \(\square \)

Hence, \(U = U(N)\) is an upper bound set in the sense of Definition 2.3 for any finite and stable set \(N \subseteq f(S)\).

This concept leads to upper bounds for the nondominated set of (MOP). Now, we show how to use it to gain lower bounds. This is one of the main differences when comparing our approach to that in [25] where only the local upper bounds are used. To distinguish between the local upper bounds and the closely related local lower bounds, which we present in the next definition, we use upper case notation instead of lower case notation for the search region and search zones.

Definition 3.4

Let \(N \subseteq {{\,\mathrm{int}\,}}(B)\) be a finite and stable (w.r.t \(\ge \)) set. Then the upper search region for N is \(S(N) := \left\{ y \in {{\,\mathrm{int}\,}}(B) \;\big |\; y^\prime \not \ge y \text { for every } y^\prime \in N \right\} \) and the upper search zone for some \(l \in {\mathbb {R}}^m\) is \(C(l) := \left\{ y \in {{\,\mathrm{int}\,}}(B) \;\big |\; y > l \right\} \). A set \(L=L(N)\) is called local lower bound set given N if

  1. 1.

    \(S(N) = \bigcup _{l \in L(N)} C(l)\),

  2. 2.

    \(C(l^1) \not \subset C(l^2) \text { for all } l^1,l^2 \in L(N)\).

Each point \(l \in L(N)\) is called a local lower bound (LLB).

In the context of local lower bounds, dominance is always dominance w.r.t. \(\ge \) and stable sets are stable w.r.t. \(\ge \) as well.

In Fig. 3 an illustration of the concept of local lower bounds is given for the same setting as in Fig. 2. We have the same stable set \(N = \{y^1,y^2\} \subseteq {\mathbb {R}}^2\), a local lower bound set \(L(N) = \{l^1,l^2,l^3\}\), the upper search zone \(C(l^2)\), and the upper search region S(N).

Fig. 3

Illustration of upper search region, upper search zone and local lower bounds

Next, we show that for some specific sets N the local lower bound set L(N) is indeed a lower bound set in the sense of Definition 2.3.

Lemma 3.5

Let \(N \subseteq {{\,\mathrm{int}\,}}(B)\) be a finite and stable (w.r.t. \(\ge \)) set such that for every \(y \in N\) there is no \({\hat{y}} \in f(S)\) with \({\hat{y}} \le y, \, {\hat{y}} \ne y\). Then \(L = L(N)\) is a lower bound set in the sense of Definition 2.3.

Proof

Let \({\bar{y}} \in {\mathcal {N}}\subseteq f(S) \subseteq {{\,\mathrm{int}\,}}(B)\) be a nondominated point of (MOP). Then for every \(y^\prime \in N\) it holds \(y^\prime = {\bar{y}}\) or \(y^\prime \not \ge {\bar{y}}\). Hence, using Definition 3.4, we have \({\mathcal {N}}\subseteq N \cup S(N) \subseteq {{\,\mathrm{cl}\,}}(S(N))\), where \(N \subseteq {{\,\mathrm{cl}\,}}(S(N))\) can be shown using similar arguments as in the proof of Lemma 3.3. Finally, this leads to

$$\begin{aligned} {\mathcal {N}}\subseteq {{\,\mathrm{cl}\,}}(S(N)) = \bigcup _{l \in L(N)} {{\,\mathrm{cl}\,}}(C(l)) \subseteq \bigcup _{l \in L(N)} \{l\} + {\mathbb {R}}^m_+ = L(N) + {\mathbb {R}}^m_+ \end{aligned}$$

and \(L = L(N)\) is a lower bound set in the sense of Definition 2.3. \(\square \)

In particular, for any finite and stable set \(N \subseteq {{\,\mathrm{int}\,}}(B) {\setminus } (f(S) + ({\mathbb {R}}^m_+ {\setminus } \{0\}))\) the assumptions of Lemma 3.5 are satisfied. As we need this result later in Sect. 4 (Lemma 4.4), we briefly summarize the relation between local lower and local upper bound sets and bound sets as given in Definition 2.3.

Corollary 3.6

Let \(N^1 \subseteq f(S)\) be a finite and stable set w.r.t. \(\le \) and \(N^2 \subseteq {{\,\mathrm{int}\,}}(B) {\setminus } (f(S) + ({\mathbb {R}}^m_+ {\setminus } \{0\}))\) a finite and stable set w.r.t \(\ge \). Then \(U(N^1)\) is an upper bound set and \(L(N^2)\) is a lower bound set in the sense of Definition 2.3.

For Definitions 3.2 and 3.4 one does not necessarily need to assume N to be stable. In particular, let \(N \subseteq f(S)\) be an arbitrary set and denote by \({\hat{N}} := \left\{ y \in N \;\big |\; y \text { is nondominated given } N \right\} \) the set of its nondominated elements. Then it is known from [25, Remark 2.2] that \(s(N) = s({\hat{N}})\). This also implies \(U(N) = U({\hat{N}})\). This holds analogously for the upper search regions and the corresponding local lower bound sets.

In the following, we present a method to compute local upper and local lower bounds. To provide initial local lower bound and local upper bound sets, we set \(U(\emptyset ) = \{Z\}\) and \(L(\emptyset ) = \{z\}\). It is easy to see that these sets satisfy Definitions 3.2 and 3.4. The sets L and U are then updated using points \(y \in {{\,\mathrm{int}\,}}(B)\) to obtain smaller search regions. As updating these sets is done by using projections, we use here the following notation from [25]. For \(y \in {\mathbb {R}}^m, \alpha \in {\mathbb {R}}\) and an index \(i \in [m]\) we define

$$\begin{aligned} y_{-i} \,&:= (y_1, \ldots , y_{i-1}, y_{i+1},\ldots ,y_m)^\top \text { as well as }\\ (\alpha ,y_{-i}) \,&:= (y_1, \ldots , y_{i-1},\alpha ,y_{i+1},\ldots ,y_m)^\top . \end{aligned}$$
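In code, these projections are simple tuple operations. The following sketch uses 0-based indices, while the paper's notation is 1-based:

```python
def drop(y, i):
    """y_{-i}: y with its i-th component removed (0-based index)."""
    return y[:i] + y[i + 1:]

def project(alpha, y, i):
    """(alpha, y_{-i}): y with its i-th component replaced by alpha."""
    return y[:i] + (alpha,) + y[i + 1:]
```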

Using this notation, Algorithm 1 shows an updating procedure for local upper bound sets as presented in [25, Algorithm 3]. We briefly explain the algorithm after the forthcoming Lemma 3.7.

Algorithm 1: Updating a local upper bound set (cf. [25, Algorithm 3])

Due to the close relation between local upper bounds and local lower bounds, the concept of Algorithm 1 can also be used for updating local lower bound sets. To do so, one simply has to replace every < by > and every \(\le \) by \(\ge \). This leads to the updating procedure as given in Algorithm 2.

Algorithm 2: Updating a local lower bound set

In Sect. 4 we present our new algorithm to generate a box-coverage of the nondominated set \({\mathcal {N}}\). The properties of this algorithm, e.g., its finiteness, depend heavily on the properties of the updating procedures for local lower and local upper bounds. For this reason, we discuss these properties in the remaining part of this section. Due to the analogies between both procedures, we focus on local upper bounds and Algorithm 1.

Our Algorithm 1 slightly differs from [25, Algorithm 3]. Compared to the original algorithm, we do not assume the update point \(y \in f(S)\) to be nondominated given N. However, the algorithm still works correctly. For any update point y that is nondominated given N the correctness of the algorithm is shown in [25]. For update points y that are dominated given N the correctness of the algorithm is shown in the following lemma.

Lemma 3.7

Let N be a finite and stable set, \(U = U(N)\) a local upper bound set, and \(y \in f(S)\) dominated given N. Then Algorithm 1 returns the (unchanged) set \(U = U(N) = U(N \cup \{y\})\).

Proof

As \(y \in f(S)\) is dominated given N, there exists \(y^\prime \in N\) with \(y^\prime \le y, y^\prime \ne y\). This implies that \(y \not \in s(N)\) and by property (i) of \(U = U(N)\) being a local upper bound set this implies that there exists no \(u \in U(N)\) with \(y \in c(u)\). Hence, there exists no \(u \in U(N)\) with \(y < u\) and for Algorithm 1 this means that \(A = \emptyset \). As a result, the algorithm returns the same (unchanged) set \(U = U(N)\). We already discussed that this is the same local upper bound set as \(U(N \cup \{y\})\), see also [25, Remark 2.2]. \(\square \)

This holds analogously for Algorithm 2. As already mentioned, new local upper bounds are generated using projections. We briefly explain how this works.

First, all local upper bounds \(u \in U(N)\) that are strictly dominated by the update point y are added to the set A. These are the only local upper bounds that are possibly updated by Algorithm 1. The sets \(B_i, i \in [m]\) contain all local upper bounds that are dominated but not strictly dominated by y. The sets \(P_i, i \in [m]\) contain the projections of the (old) local upper bounds \(u \in A\) to the i-th component of y, i.e., \((y_i,u_{-i})\). In other words, those sets \(P_i, i \in [m]\) contain all candidates for possible new local upper bounds. Then, in the last for loop, redundant candidates are filtered out of each of the sets \(P_i, i \in [m]\). Finally, the new local upper bound set \(U(N \cup \{y\})\) is computed out of the old set U(N) by removing the old local upper bounds contained in A and adding the (filtered) candidates from the sets \(P_i, i \in [m]\).
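The projection-and-filter idea just described can be condensed into a brute-force Python sketch. It replaces the component-wise filtering of the sets \(P_i\) by keeping the maximal elements of the whole candidate pool, which yields the same local upper bound set but is less efficient than Algorithm 1; all names and data here are illustrative, not the paper's code:

```python
def update_lubs(U, y):
    """Update a local upper bound set U = U(N) with an update point y,
    returning U(N ∪ {y}).  Assumes all points lie in the interior of B."""
    lt = lambda a, b: all(x < z for x, z in zip(a, b))
    leq = lambda a, b: all(x <= z for x, z in zip(a, b))
    A = [u for u in U if lt(y, u)]        # LUBs strictly dominated by y
    keep = [u for u in U if u not in A]   # LUBs that survive unchanged
    # candidate pool: survivors plus all projections (y_i, u_{-i}) for u in A
    pool = set(keep) | {
        u[:i] + (y[i],) + u[i + 1:] for u in A for i in range(len(y))
    }
    # filtering: c(p) is contained in c(q) iff p <= q, so keep only the
    # maximal candidates of the pool
    return {p for p in pool if not any(p != q and leq(p, q) for q in pool)}
```

For instance, starting from \(U(\emptyset ) = \{Z\}\) with Z = (4, 4) and inserting the points (1, 3) and then (2, 1) yields the three local upper bounds (1, 4), (2, 3), (4, 1), a pattern analogous to the two-dimensional situation of Fig. 2.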

Thus, for a local upper bound \(u \in U(N \cup \{y\})\) it is either \(u \in U(N)\) or \(u = (y_i,u^\prime _{-i})\) for some \(i \in [m]\) and \(u^\prime \in U(N)\). For the latter case we call \(u^\prime \) the parent of u. Otherwise u is its own parent.

Lemma 3.8

Let \(u \in U(N \cup \{y\})\) be a local upper bound. Then its parent \(u^\prime \in U(N)\) is unique.

Proof

If u is its own parent, i.e., \(u \in U(N)\), then there is nothing to show. Hence, we consider the case \(u \not \in U(N)\) and assume that there are two different parents \(u^1, u^2 \in A \subseteq U(N)\) of u with \(u^1 \ne u^2\). Then there exist \(i,j \in [m]\) with \(u = (y_i,u^1_{-i}) = (y_j,u^2_{-j})\). If \(i = j\), we have \(u^1_i \ne u^2_i\) and without loss of generality we assume \(u^1_i < u^2_i\). But then \(c(u^1) \subset c(u^2)\), which contradicts property (ii) of Definition 3.2 for U(N) being a local upper bound set. If \(i \ne j\), we obtain \(u^2_i = y_i\) and \(u^1_j = y_j\), which contradicts \(u^1,u^2 \in A\), as every \(u \in A\) satisfies \(y < u\) in all components. Thus, the parent \(u^\prime \) of u is unique. \(\square \)

Such parents exist not only for local upper bounds \(u \in U(N \cup \{y\})\) but also for the candidates for local upper bounds of \(N \cup \{y\}\) contained in the sets \(P_i, i \in [m]\), before the filtering step in the last for loop. Of course, this holds for updates of the local lower bound set as well.

For an illustration of the update procedures, see Fig. 4. For the stable set \(N = \{y^1\}\), a local upper bound set \(U(N) = \{u^1,u^2\}\) and a local lower bound set \(L(N) = \{l^1,l^2\}\) are already computed. The point \(y^2\) is then added to the set N and the bounds are updated using Algorithms 1 and 2. As a result, the local upper bound set is updated to \(U(N \cup \{y^2\}) = \{u^1=u^{1,1},u^{2,1},u^{2,2}\}\), and the local lower bound set is updated to \(L(N \cup \{y^2\}) = \{l^1=l^{1,1},l^{2,1},l^{2,2}\}\). One can see that \(l^{2}\) is the parent of \(l^{2,1}\) and \(l^{2,2}\) and that \(u^{2}\) is the parent of \(u^{2,1}\) and \(u^{2,2}\). All remaining bounds (i.e., \(u^1,l^1\)) are their own parents. For consistency, their names are changed as well. This numbering scheme for the local lower and local upper bounds also encodes the parents of the bounds.

Fig. 4

Illustration of update procedure based on Algorithms 1 and 2

4 Computing the box-coverage

In this section, we present our new algorithm to compute an approximation of the nondominated set of (MOP) with a guaranteed improvement in each iteration. The approach is to use local lower bound sets and local upper bound sets as presented in Sect. 3 to compute the approximation in the form of a box-coverage. First, we discuss the initialization of these sets.

4.1 Initialization

We initialize \(L = L(\emptyset ) = \{z\}\) and \(U = U(\emptyset ) = \{Z\}\) with \(z,Z \in {\mathbb {R}}^m\) from (2.1). These first bounds should be chosen as tight as possible. For this reason, we use the ideal and anti-ideal points, i.e.,

$$\begin{aligned} {\bar{z}}&\in {\mathbb {R}}^m \text { with } {\bar{z}}_i = \min \left\{ f_i(x) \;\big |\; x \in S \right\} \;\forall i \in [m],\\ {\bar{Z}}&\in {\mathbb {R}}^m \text { with } {\bar{Z}}_i = \max \left\{ f_i(x) \;\big |\; x \in S \right\} \;\forall i \in [m]. \end{aligned}$$

Then, as the ideal and anti-ideal points do not satisfy (2.1), we introduce a small offset \(\sigma > 0\) and define, with e denoting the all-ones vector,

$$\begin{aligned} z := {\bar{z}}-\sigma e\quad \text {and}\quad Z := {\bar{Z}} + \sigma e. \end{aligned}$$
(4.1)

These are suitable initial local lower and local upper bounds. Both the ideal and the anti-ideal point can be hard to compute. However, any choice of \(z,Z \in {\mathbb {R}}^m\) that satisfies (2.1) can be used for initialization.
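The initialization (4.1) can be sketched as follows. This is an illustrative Python sketch (function names are ours), where the feasible set S is replaced by a finite sample so that the ideal and anti-ideal points can be computed by brute force; in practice they would be obtained by solving the m single-objective problems.

```python
def initialize_bounds(f_list, samples, sigma=1e-6):
    """Sketch of the initialization (4.1): componentwise ideal and anti-ideal
    points over S (here: a finite sample of S), shifted by the offset sigma."""
    z_bar = [min(f(x) for x in samples) for f in f_list]   # ideal point
    Z_bar = [max(f(x) for x in samples) for f in f_list]   # anti-ideal point
    z = [zi - sigma for zi in z_bar]    # z := z_bar - sigma * e
    Z = [Zi + sigma for Zi in Z_bar]    # Z := Z_bar + sigma * e
    return z, Z
```

By construction, \(z < f(x) < Z\) holds componentwise for every sampled x, which is the purpose of the offset \(\sigma\).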

4.2 Updating the boxes

Next, we provide a method to shrink boxes \([l,u]\) with \(s(l,u) > \varepsilon \). To shrink a box, we need to improve at least one of its bounds l and u. Therefore, our aim is to find a new nondominated point in \([l,u]\). This nondominated point is then chosen as an update point for Algorithms 1 and 2. As a result, the old box \([l,u]\) is replaced by new, smaller boxes, see Fig. 4. In the following, we formalize this approach. Let \(l,u \in {\mathbb {R}}^m\) with \(l < u\). Then the search for a nondominated (update) point is performed by solving the optimization problem

$$\begin{aligned} \min _{x,t} \; t \quad \text { s.t. } \quad f(x) \le l + t(u-l),\; x \in S,\; t \in {\mathbb {R}}. \end{aligned}$$
(SUP(l,u))

It is crucial for the performance of our algorithm that (SUP(l,u)) can be solved quickly and reliably. This is possible, for instance, in the case of smooth and convex subproblems (SUP(l,u)). We recommend taking this into account when choosing a solver for the subproblems within our overall algorithm. However, the following theoretical results do not require convexity but only continuous objective functions and a compact feasible set S. The following lemma is based on [18, Proposition 2.3.4 and Theorem 2.3.1].
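To make the structure of the subproblem concrete: assuming (SUP(l,u)) has the Pascoletti–Serafini form \(\min t\) s.t. \(f(x) \le l + t(u-l)\), \(x \in S\) (an assumption consistent with [18] and with the use of the points \(l + {\hat{t}}(u-l)\) below), the optimal \({\hat{t}}\) is the smallest t for which a feasible x exists, and it can be sketched via bisection given a feasibility oracle. All names in this sketch are ours.

```python
def solve_sup_bisection(feasible, l, u, tol=1e-9):
    """Sketch: compute min{t : exists x in S with f(x) <= l + t*(u-l)}
    by bisection on t, given an oracle
    feasible(v) == True  iff  some x in S satisfies f(x) <= v componentwise.
    Assumes the optimal t lies in the bracket [-1, 2]."""
    point = lambda t: [li + t * (ui - li) for li, ui in zip(l, u)]
    lo, hi = -1.0, 2.0          # feasible(point(hi)) True, feasible(point(lo)) False
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if feasible(point(mid)):
            hi = mid            # a feasible x exists, so t can be decreased
        else:
            lo = mid
    return hi
```

For the bi-objective example \(f(x) = (x, 1-x)\) on \(S = [0,1]\) with \(l = (0,0)\), \(u = (1,1)\), the bisection returns \({\hat{t}} = 0.5\), i.e., the point \((0.5, 0.5)\) on the line segment between l and u, which lies on the nondominated set.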

Lemma 4.1

Let \(l,u \in {\mathbb {R}}^m\) with \(l < u\). Then there exists an optimal solution \(({\bar{x}},{\bar{t}})\) for (SUP(l,u)).

In [33] it is shown that for every optimal solution \(({\bar{x}},{\bar{t}})\) of (SUP(l,u)) the point \({\bar{x}} \in S\) is a weakly efficient solution for (MOP). Thus, \(f({\bar{x}})\) is weakly nondominated given \({\mathcal {N}}\). To perform an update step with Algorithms 1 or 2, i.e., to obtain a new local upper bound or local lower bound set, we need an update point \(y \in {\mathbb {R}}^m\) that is nondominated and not only weakly nondominated. In case all objective functions \(f_i, i \in [m]\) are strictly convex and the feasible set S is convex as well, any weakly nondominated point is also nondominated, see [3].

For our general setting, examples of how to find a nondominated point \({\bar{y}} \le y\) for a given \(y \in f(S)\) can be found in [4, 10]. Another example for bi-objective problems can be found in [5], where the authors use a Pascoletti-Serafini scalarization and also encounter the problem of needing to derive a nondominated point from a weakly nondominated one. For our implementation, we use a subproblem as formulated in [41]. Let \(y \in f(S)\) be a weakly nondominated point of (MOP). Then, according to [41, Theorem 2], any optimal solution \({\bar{x}} \in S\) of

$$\begin{aligned} \min _{x} \; \sum _{i=1}^m f_i(x) \quad \text { s.t. } \quad f(x) \le y,\; x \in S \end{aligned}$$
(GNP(y))

is efficient for (MOP). Thus, \({\bar{y}} := f({\bar{x}})\) is a nondominated point with \({\bar{y}} \le y\).
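This refinement step can be sketched by brute force. The following Python sketch is our own illustration (not the authors' implementation): assuming, as in [41], that the subproblem minimizes the sum of the objectives over all feasible points whose image is componentwise at most y, we restrict S to a finite sample so that the minimization becomes a simple search.

```python
def refine_to_nondominated(f_list, samples, y):
    """Sketch of the refinement step from [41]: among sampled x in S with
    f(x) <= y componentwise, pick one minimizing sum_i f_i(x).
    The returned image point ybar satisfies ybar <= y componentwise."""
    candidates = [x for x in samples
                  if all(f(x) <= yi for f, yi in zip(f_list, y))]
    xbar = min(candidates, key=lambda x: sum(f(x) for f in f_list))
    return [f(xbar) for f in f_list]
```

For instance, with \(f(x) = (x, (1-x)^2)\) on a sample of \([0,1]\) and \(y = (0.8, 0.36)\), the sketch returns the image \((0.5, 0.25) \le y\) of the sampled point with the smallest objective sum.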

4.3 Main algorithm

Our new algorithm BAMOP to generate a box-coverage is presented as Algorithm 3. The algorithm basically works as follows: It loops through all boxes \([l,u]\) with \(l \le u\) and \(s(l,u) > \varepsilon \). Then, a new point to update the bound sets is computed using the methods described above. The whole procedure is repeated until finally all boxes and hence the approximation \({\mathcal {A}}\) given L and U are sufficiently small, i.e., \(w({\mathcal {A}}) \le \varepsilon \).

Algorithm 3 (BAMOP)

In Algorithm 3 there is a case distinction concerning the computation of the update point for Algorithms 1 and 2. The reason is that, in order to compute an approximation with \(w({\mathcal {A}}) \le \varepsilon \), we need the boxes to become thinner. To achieve this, we guarantee that in at least one dimension the box length is halved, see Theorem 4.2.

Fig. 5

Two update scenarios from Algorithm 3

Before we present that theorem, we briefly illustrate a single update step of the algorithm in Fig. 5. Using the notation from Algorithm 3, one case is \({\hat{y}} = {\bar{y}}\), i.e., on the line segment between l and u there is a nondominated point \({\hat{y}} \in {\mathcal {N}}\). The illustration shows that in this case the boxes \([l^1,u^1]\) and \([l^2,u^2]\) indeed have at least one dimension that is at most half the length of \([l,u]\).

The case \({\hat{t}} > 0.5\) with \({\hat{y}} \ne {\bar{y}}\) is shown in Fig. 5b. This case corresponds to the if clause from Algorithm 3. Again, we consider the line segment between l and u, and by the definition of \({\bar{t}}\) it holds that \(l + {\bar{t}}(u-l) \ge l + 0.5(u-l) = 0.5(l+u)\). Thus, the new boxes \([l^1,u]\) and \([l^2,u]\) are at most half the length of \([l,u]\) in at least one dimension. Moreover, there is no \(y \in f(S)\) with \(y \le l + {\bar{t}}(u-l)\). This is because we use the small offset \(\tau > 0\) to ensure \({\bar{t}} < {\hat{t}}\) and \(({\hat{x}},{\hat{t}})\) is an optimal solution of (SUP(l,u)). Together with Corollary 3.6, this is used later in Lemma 4.4 to prove that valid bound sets as needed for Theorem 2.3 are generated. Please be aware that this also implies that, in general, for the local upper bound set \(U = U(N^1)\) and the local lower bound set \(L = L(N^2)\) in Algorithm 3 it holds that \(N^1 \ne N^2\).

Fig. 6

Two subsequent update steps from Algorithm 3 with \(l = l^3\)

Next, we discuss the development of the sets L and U, see Fig. 6. Let \(l = l^3 \in L_{\text {loop}}\) be the lower bound selected for the current outer for loop in Algorithm 3. We have \(U = U_{\text {loop}}= \{u^1,u^2,u^3\}\) and assume that the inner for loop first selects \(u^2\) and then \(u^3\) (and finally \(u^1\) which is not part of the illustration). During the first run of the inner for loop with \(u = u^2\) we find the update point \({\hat{y}} \in {\mathcal {N}}\). Using Algorithms 1 and 2, this leads to the projections as shown in Fig. 6. For the lower bounds none of the children of \(l^3\) is part of the updated lower bound set \(L = \{l^1,l^6,l^7,l^5\}\). The updated upper bound set is \(U = \{u^1,u^4,u^5,u^3\}\).

As the run of the inner for loop with \(u = u^2\) is finished, the algorithm continues with the run of the inner for loop with \(u = u^3 \in U_{\text {loop}}\ne U\). Even if \(l = l^3 \not \in L\), it is still the parameter l for (SUP(l,u)). However, for the same reason (i.e., \(l^3 \not \in L\)) it is not considered in the update of the lower bound set L using Algorithm 2. This is an important mechanism to keep in mind. At the end of this run of the inner for loop we have \(L = \{l^1,l^6,l^8,l^9,l^5\}\) and \(U = \{u^1,u^4,u^6,u^7\}\).

4.4 Halving property and convergence

In this section we prove some important properties of Algorithm 3, such as finiteness and correctness. A key property is presented in the following theorem, which shows that with every run of the repeat loop the width of the boxes is, in some sense, halved.

Theorem 4.2

(Halving Theorem) Let \(\varepsilon , \tau > 0\) and \(z,Z \in {\mathbb {R}}^m\) be the input parameters for Algorithm 3. Moreover, let \(L^{\text {start}}\) and \(U^{\text {start}}\) be the local lower bound and local upper bound sets at the beginning of some iteration of Algorithm 3, i.e., at the beginning of some run of the repeat loop. Accordingly denote by \(L^{\text {end}}, U^{\text {end}}\) the sets at the end of this iteration. Then for every \(l^e \in L^{\text {end}}\) and every \(u^e \in U^{\text {end}}\) with \(l^e \le u^e\) there exist \(l^s \in L^{\text {start}}\) and \(u^s \in U^{\text {start}}\) with

  (i) \(l^s \le l^e \le u^e \le u^s\),

  (ii) \((u^e - l^e)_i \le \max \{\varepsilon ,\; 0.5(u^s-l^s)_i\}\) for at least one index \(i \in [m]\).

Proof

Suppose that there exist \(l^e \in L^{\text {end}}\) and \(u^e \in U^{\text {end}}\) with \(l^e \le u^e\) such that for all \(l^s \in L^{\text {start}}\) and \(u^s \in U^{\text {start}}\) one of the statements (i) or (ii) is violated.

We denote by \(P(l^e),P(u^e) \subseteq {\mathbb {R}}^m\) the sets containing all elements of the parent history of \(l^e\) and \(u^e\) within the current iteration, i.e., their parents, their parents' parents, and so on. By Lemma 3.8 we have that

$$\begin{aligned} \left|P(l^e) \cap L \right| = 1 \text { and } \left|P(u^e) \cap U \right| = 1 \end{aligned}$$
(4.2)

at any point of the current iteration, i.e., for L and U at any point of time. In particular, this holds for the local lower and local upper bound sets at the beginning of the current iteration. Thus we let \(l^s \in L^{\text {start}} \cap P(l^e)\) and \(u^s \in U^{\text {start}} \cap P(u^e)\).

The update procedures for local upper bound and local lower bound sets, i.e., Algorithms 1 and 2, always ensure that the local lower and local upper bounds are not getting worse. In particular, for \(l^c,l^p \in P(l^e)\) and \(u^c,u^p \in P(u^e)\) with \(l^p\) being the parent of \(l^c\) and \(u^p\) being the parent of \(u^c\) it holds that

$$\begin{aligned} l^c \ge l^p \quad \text {and}\quad u^c \le u^p. \end{aligned}$$
(4.3)

By the definition of the parent history and using that L and U are only updated using Algorithms 1 and 2, this also implies that

$$\begin{aligned} l^s \le l \le l^e \quad \forall l \in P(l^e) \text { and } u^s \ge u \ge u^e \quad \forall u \in P(u^e). \end{aligned}$$
(4.4)

In particular, we have \(l^s \le l^e \le u^e \le u^s\) and hence (i) is satisfied. Thus, (ii) has to be violated. As a result, it holds that

$$\begin{aligned} (u^e-l^e)_i > \varepsilon \text { for all } i \in [m]. \end{aligned}$$
(4.5)

Using the notation of Algorithm 3, in particular denoting by l and u the corresponding iteration variables of the for loops, at some point of the current run of the repeat loop it is \(l = l^s\) because \(L_{\text {loop}}= L^\text {start}\). Now, fix \(l = l^s\) for the outer for loop and consider the inner for loop. By (4.2), there exists a unique \(u \in U_{\text {loop}}\cap P(u^e)\). In the following, we consider Algorithm 3 at the point in time where this specific assignment of l and u is present. It is important to mention that u is not necessarily the first upper bound chosen by the inner for loop. For this reason, the sets L and U may have been updated several times, which could lead to \(l \not \in L\) and/or \(u \not \in U\), see also Fig. 6. However, we know from (4.2) and (4.3) that at this point of the algorithm, where l and u are assigned as described above, there exist \(l^\prime \in L \cap P(l^e)\) and \(u^\prime \in U \cap P(u^e)\) with

$$\begin{aligned} l^\prime \ge l \text { and }u^\prime \le u. \end{aligned}$$
(4.6)

Together with (4.4), we obtain that

$$\begin{aligned} l = l^s \le l^\prime \le l^e \text { and } u \ge u^\prime \ge u^e. \end{aligned}$$
(4.7)
Fig. 7

Case distinction for Theorem 4.2

For the remaining part of this proof, we discuss all possible update steps for the fixed assignment of l and u. In total, there are ten cases, see also Fig. 7. In the following, we always refer to the case numbering from that figure.

First, we consider the case (A), i.e., \({\hat{t}} > 0.5\) and \({\hat{y}} \ne {\bar{y}}\). This case is also shown in Fig. 5b. With \({\bar{t}} := \max \{0.5, {\hat{t}}-\tau \}\) it is \(l < l+{\bar{t}}(u-l) =: y^\prime \). We start with (A.1), i.e., \(l^\prime < y^\prime \). In this setting, the update procedure for L, i.e., Algorithm 2, removes \(l^\prime \) and creates m new candidates for lower bounds

$$\begin{aligned} l^i := (l^\prime _1, \ldots , l^\prime _{i-1}, l_i + {\bar{t}}(u_i-l_i), l^\prime _{i+1}, \ldots , l^\prime _m)^\top \text { for all } i \in [m]. \end{aligned}$$

At least one of these candidates (the child of \(l^\prime \) which is part of the parent history \(P(l^e)\)) is indeed added to the new local lower bound set L. Thus, there exists an index \(i \in [m]\) such that \(l^i \in L\) after executing Algorithm 2. By (4.4) we have that \(l^e \ge l^i\) and together with (4.7) and again (4.4) we obtain that

$$\begin{aligned} (u^e-l^e)_i \le (u-l^i)_i = u_i - l_i - {\bar{t}}(u-l)_i = (1-{\bar{t}})(u-l)_i \le 0.5(u-l)_i \le 0.5(u^s-l^s)_i. \end{aligned}$$

As a result, (ii) would be satisfied which contradicts our assumption.

Next, we consider case (A.2), i.e., \(l^\prime \not < y^\prime \), which implies that there exists an index \(i \in [m]\) with \(l^\prime _i \ge y^\prime _i\). Using (4.7), this implies again with (4.4) that

$$\begin{aligned} (u^e-l^e)_i \le (u^e-l^\prime )_i \le u_i - y^\prime _i = u_i-l_i-{\bar{t}}(u-l)_i \le 0.5(u-l)_i \le 0.5(u^s-l^s)_i \end{aligned}$$

which contradicts our assumption as well.

This concludes case (A). For case (B) let \({\hat{t}} \le 0.5\) or \({\bar{y}} = {\hat{y}}\). First, we discuss case (B.1), that is, \({\hat{t}} \le 0\). In this case it is \({\bar{y}} \le l \le l^\prime \). By Theorem 3.4 (ii) there is also no \(l^* \in L\) with \(l^* \le l^\prime , l^* \ne l^\prime \) and in particular no \(l^* \in L\) with \(l^* < {\bar{y}}\). Thus, the local lower bound set L is not updated in this step and only the local upper bound set has to be considered.

In case (B.1.1) with \({\bar{y}} < u^\prime \), Algorithm 1 removes the upper bound \(u^\prime \) from the local upper bound set and creates m new candidates

$$\begin{aligned} u^i := (u^\prime _1, \ldots , u^\prime _{i-1}, {\bar{y}}_i, u^\prime _{i+1}, \ldots , u^\prime _m)^\top \text { for all } i \in [m]. \end{aligned}$$

At least one of these candidates (as part of the parent history \(P(u^e)\)) is added to the updated set of local upper bounds. Thus, there exists an index \(i \in [m]\) such that \(u^i \in U\) after the updating procedure. Using (4.4), we obtain

$$\begin{aligned} (u^e-l^e)_i \le (u^i-l)_i ={\bar{y}}_i-l_i \le 0 < \varepsilon . \end{aligned}$$

Again, (ii) would be satisfied, which contradicts our assumption.

Next, we consider case (B.1.2). As \({\bar{y}} \not < u^\prime \), there exists an index \(i \in [m]\) such that \({\bar{y}}_i \ge u^\prime _i\). Using (4.7), this leads to

$$\begin{aligned} (u^e-l^e)_i \le (u^\prime -l)_i \le {\bar{y}}_i-l_i \le 0 < \varepsilon \end{aligned}$$

and contradicts our assumption that (ii) is not satisfied.

This concludes case (B.1) and we continue with case (B.2). Consequently, for the remaining part of this proof we have that \({\hat{t}} > 0\). We start with case (B.2.1), i.e., \({\bar{y}} = {\hat{y}}\), see Fig. 5a. First, we consider case (B.2.1.1) with \(l^\prime< {\bar{y}} < u^\prime \). In this setting, the updating procedures Algorithms 1 and 2 remove \(l^\prime \) and \(u^\prime \) from L and U and create m new candidates

$$\begin{aligned} l^i := ({\bar{y}}_i,l^\prime _{-i}), \; u^i := ({\bar{y}}_i,u^\prime _{-i}) \text { for all } i \in [m]. \end{aligned}$$
(4.8)

Again, at least one of these candidates is contained in the updated local lower and local upper bound sets because it is part of the parent history \(P(l^e)\) or \(P(u^e)\), respectively. Hence, there exist \(i,j \in [m]\) with \(l^i \in L\) and \(u^j \in U\) after the updating procedures. For \(i=j\) we have \((u^j-l^i)_i = {\bar{y}}_i-{\bar{y}}_i = 0 < \varepsilon \) which contradicts our assumption. Thus, we only consider \(i \ne j\). Using (4.4), (4.7), and \({\bar{y}} = {\hat{y}}\), we obtain

$$\begin{aligned}&(u^e-l^e)_j \le (u^j-l^i)_j = l_j + {\hat{t}}(u-l)_j-l^\prime _j \le l_j + {\hat{t}}(u-l)_j-l_j = {\hat{t}}(u-l)_j,\\&(u^e-l^e)_i \le (u^j-l^i)_i = u^\prime _i - l_i - {\hat{t}}(u-l)_i \le u_i - l_i - {\hat{t}}(u-l)_i = (1-{\hat{t}})(u-l)_i. \end{aligned}$$

It is either \({\hat{t}} \le 0.5\) or \((1-{\hat{t}}) < 0.5\). Hence, there exists \(\iota \in \{i,j\}\) with

$$\begin{aligned} (u^e-l^e)_\iota \le 0.5(u-l)_\iota \le 0.5(u^s-l^s)_\iota \end{aligned}$$

which contradicts our assumption.

Next, we consider case (B.2.1.2) with \(l^\prime< {\bar{y}} \not < u^\prime \). Consequently, there exists an index \(j \in [m]\) with \({\bar{y}}_j \ge u^\prime _j\). For the set L, we know that \(l^\prime \) is removed and Algorithm 2 computes m new candidates, see (4.8). Again, there exists an index \(i \in [m]\) such that \(l^i \in L\) after the updating procedure. For \(i=j\) we have that \((u^\prime -l^j)_j = u^\prime _j - {\bar{y}}_j \le 0\) which contradicts our assumption. For \(i \ne j\), using (4.4), (4.7), and \({\bar{y}} = {\hat{y}}\), we obtain

$$\begin{aligned}&(u^e-l^e)_j \le (u^\prime -l^i)_j \le l_j + {\hat{t}}(u-l)_j-l^\prime _j \le l_j + {\hat{t}}(u-l)_j-l_j = {\hat{t}}(u-l)_j,\\&(u^e-l^e)_i \le (u^\prime -l^i)_i \le u^\prime _i - l_i - {\hat{t}}(u-l)_i \le u_i - l_i - {\hat{t}}(u-l)_i = (1-{\hat{t}})(u-l)_i. \end{aligned}$$

Again, it is either \({\hat{t}} \le 0.5\) or \((1-{\hat{t}}) < 0.5\) and there exists \(\iota \in \{i,j\}\) with

$$\begin{aligned} (u^e-l^e)_\iota \le 0.5(u-l)_\iota \le 0.5(u^s-l^s)_\iota \end{aligned}$$

which contradicts our assumption.

Case (B.2.1.3), i.e., \(l^\prime \not< {\bar{y}} < u^\prime \), can be shown analogously. There exists an index \(i \in [m]\) with \(l^\prime _i \ge {\bar{y}}_i\). Moreover, \(u^\prime \) is updated by Algorithm 1. As a result, m new candidates for local upper bounds are computed, see (4.8). Again, there exists an index \(j \in [m]\) such that \(u^j \in U\) after the updating procedure. For \(i=j\) we have that \((u^i-l^\prime )_i = {\bar{y}}_i - l^\prime _i \le 0\) which contradicts our assumption. For \(i \ne j\), using (4.4), (4.7), and \({\bar{y}} = {\hat{y}}\), we obtain

$$\begin{aligned}&(u^e-l^e)_j \le (u^j-l^\prime )_j \le l_j + {\hat{t}}(u-l)_j-l^\prime _j \le l_j + {\hat{t}}(u-l)_j-l_j = {\hat{t}}(u-l)_j,\\&(u^e-l^e)_i \le (u^j-l^\prime )_i \le u^\prime _i - l_i - {\hat{t}}(u-l)_i \le u_i - l_i - {\hat{t}}(u-l)_i = (1-{\hat{t}})(u-l)_i. \end{aligned}$$

Consequently, there exists \(\iota \in \{i,j\}\) with

$$\begin{aligned} (u^e-l^e)_\iota \le 0.5(u-l)_\iota \le 0.5(u^s-l^s)_\iota \end{aligned}$$

which contradicts our assumption.

Now, we consider case (B.2.1.4) with \(l^\prime \not< {\bar{y}} \not < u^\prime \). In this case, there exist \(i,j \in [m]\) with \(l^\prime _i \ge {\bar{y}}_i\) and \({\bar{y}}_j \ge u^\prime _j\). For \(i=j\) this implies \((u^\prime -l^\prime )_i \le 0\) and (ii) would be satisfied. This contradicts our assumption. For \(i \ne j\), using (4.7) and \({\bar{y}} = {\hat{y}}\) we have that

$$\begin{aligned}&(u^e-l^e)_j \le (u^\prime -l^\prime )_j \le l_j + {\hat{t}}(u-l)_j-l^\prime _j \le l_j + {\hat{t}}(u-l)_j-l_j = {\hat{t}}(u-l)_j,\\&(u^e-l^e)_i \le (u^\prime -l^\prime )_i \le u^\prime _i - l_i - {\hat{t}}(u-l)_i \le u_i - l_i - {\hat{t}}(u-l)_i = (1-{\hat{t}})(u-l)_i. \end{aligned}$$

Again, there exists \(\iota \in \{i,j\}\) with

$$\begin{aligned} (u^e-l^e)_\iota \le 0.5(u-l)_\iota \le 0.5(u^s-l^s)_\iota \end{aligned}$$

which contradicts our assumption.

This concludes case (B.2.1) and we continue with case (B.2.2), i.e., \({\hat{t}} \le 0.5\). In case (B.2.2.1), i.e., \({\bar{y}} < u^\prime \), the upper bound \(u^\prime \) is still removed (and m candidates created) as in (4.8). Again, there exists \(j \in [m]\) such that \(u^j \in U\) after the update procedure. Thus, using (4.4), (4.7), and that \({\bar{y}} \le {\hat{y}}\), we have that

$$\begin{aligned} (u^e-l^e)_j&\le (u^j-l^\prime )_j \le {\bar{y}}_j-l_j \le l_j+{\hat{t}}(u-l)_j-l_j\\&= {\hat{t}}(u-l)_j \le 0.5(u-l)_j \le 0.5(u^s-l^s)_j \end{aligned}$$

which contradicts our assumption.

Finally, we consider case (B.2.2.2) with \({\bar{y}} \not < u^\prime \). Then there exists an index \(j \in [m]\) with \({\bar{y}}_j \ge u^\prime _j\). Using the same arguments as before, this implies that

$$\begin{aligned} (u^e-l^e)_j&\le (u^\prime -l^\prime )_j \le {\bar{y}}_j-l_j \le l_j+{\hat{t}}(u-l)_j-l_j\\&= {\hat{t}}(u-l)_j \le 0.5(u-l)_j \le 0.5(u^s-l^s)_j \end{aligned}$$

which contradicts our assumption as well.

We considered all cases shown in Fig. 7. In each of them, the assumption that (ii) is violated leads to a contradiction. Hence, we have to reject our assumption and Theorem 4.2 is proven. \(\square \)

With Theorem 4.2 we can show that Algorithm 3 is finite. In particular, the next lemma provides an upper bound on the number of iterations (i.e., the number of runs of the repeat loop). This bound depends on \(z,Z \in {\mathbb {R}}^m\) from (2.1), and thus these should be chosen as tight as possible, as discussed on page 12.

Lemma 4.3

Let \(\varepsilon , \tau > 0\) and \(z,Z \in {\mathbb {R}}^m\) be the input parameters for Algorithm 3. Moreover, define

$$\begin{aligned} \varDelta := \left\Vert Z-z \right\Vert _\infty \quad \text {and}\quad \kappa := \left\lceil m \log _2 \left( \frac{\varDelta }{\varepsilon } \right) \right\rceil + 1. \end{aligned}$$

Then the number of iterations, i.e., the number of runs of the repeat loop, is bounded by \(\max \{\kappa ,1\}\), and hence, Algorithm 3 is finite.

Proof

If \(\kappa \le 1\) then \(\varDelta \le \varepsilon \) and Algorithm 3 ends after the first iteration.

Hence, we consider the case \(\kappa > 1\) and define \(l^1 := z, u^1 := Z\) and for \(k \ge 1\) let \(L^k,U^k\) be the local lower bound and local upper bound sets at the beginning of iteration k. By Theorem 4.2, for all \(k > 1\), all \(l^k \in L^k\), and all \(u^k \in U^k\) there exist \(l^{k-1} \in L^{k-1}, u^{k-1} \in U^{k-1}\), and an index \(i^k \in [m]\) such that

$$\begin{aligned} (u^k-l^k)_{i^k} \le \max \{\varepsilon , \; 0.5(u^{k-1}-l^{k-1})_{i^k} \}. \end{aligned}$$
(4.9)

Suppose that Algorithm 3 has more than \(\kappa \) iterations. Then there exist \(l^\kappa \in L^\kappa \) and \(u^\kappa \in U^\kappa \) with \(l^\kappa + \varepsilon e < u^\kappa \) (being equivalent to \(\varepsilon e < u^\kappa -l^\kappa \)). For every \(k \in \{2,3,\ldots ,\kappa \}\) let \(i^k \in [m]\) be the index from the iterative application of (4.9) starting with \(k = \kappa \). We define

$$\begin{aligned} n(i) := \left|\left\{ k \in \{2,3,\ldots ,\kappa \} \;\big |\; i^k = i \right\} \right| \text { for all } i \in [m]. \end{aligned}$$

As \(\kappa -1 = \left\lceil m \log _2 \left( \frac{\varDelta }{\varepsilon } \right) \right\rceil \) there exists at least one index \({\bar{i}} \in [m]\) with \(n({\bar{i}}) \ge \log _2 \left( \frac{\varDelta }{\varepsilon } \right) \). Iteratively using (4.9), this implies that

$$\begin{aligned} (u^\kappa -l^\kappa )_{{\bar{i}}} \le 2^{- \log _2 \left( \frac{\varDelta }{\varepsilon } \right) } (u^1-l^1)_{{\bar{i}}} \le \frac{\varepsilon }{\varDelta } \varDelta = \varepsilon \end{aligned}$$

which contradicts \(\varepsilon e < u^\kappa - l^\kappa \). As a result, there cannot be more than \(\kappa \) iterations and Algorithm 3 is finite. \(\square \)
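The iteration bound from Lemma 4.3 is straightforward to evaluate. A small Python sketch (function name ours; the input values in the example below are hypothetical):

```python
import math

def iteration_bound(z, Z, eps):
    """Upper bound max{kappa, 1} on the number of repeat-loop iterations of
    Algorithm 3, with Delta = ||Z - z||_inf and
    kappa = ceil(m * log2(Delta / eps)) + 1 (Lemma 4.3)."""
    m = len(z)
    delta = max(abs(Zi - zi) for zi, Zi in zip(z, Z))
    kappa = math.ceil(m * math.log2(delta / eps)) + 1
    return max(kappa, 1)
```

For example, with \(m = 2\), \(Z - z = (8,4)\), and \(\varepsilon = 1\) one obtains \(\varDelta = 8\) and \(\kappa = \lceil 2 \log_2 8 \rceil + 1 = 7\); for \(\varepsilon = 16 > \varDelta\) the bound collapses to a single iteration, matching the case \(\kappa \le 1\) in the proof.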

So far, we have shown that Algorithm 3 generates a local lower bound set L and a local upper bound set U and stops after finitely many iterations. Next, we use Theorem 3.6 to show that these sets L and U are in fact lower and upper bound sets in the sense of Theorem 2.3.

Lemma 4.4

The output sets L and U of Algorithm 3 are lower bound and upper bound sets in the sense of Theorem 2.3.

Proof

As the output set U is a local upper bound set and L is a local lower bound set, there exist \(N^1,N^2 \subseteq {{\,\mathrm{int}\,}}(B)\) with \(U = U(N^1)\) and \(L=L(N^2)\). First, we discuss the local upper bound set U. The algorithm starts with \(U(\emptyset ) = \{Z\}\). After that, the local upper bound set is only updated using points \({\bar{y}} = f({\bar{x}}) \in {\mathcal {N}}\) computed from optimal solutions \({\bar{x}}\) of (GNP(\(f({\hat{x}})\))). As a result, it is \(N^1 \subseteq {\mathcal {N}}\) (or \(N^1 = \emptyset \)) and in particular \(N^1 \subseteq f(S)\).

Next, we consider the local lower bound set L. The algorithm starts with \(L(\emptyset ) = \{z\}\). For all further updates of the set there are two cases. The first case is that L is updated within the if part of the if clause, i.e., using \(y := l + {\bar{t}}(u-l)\) with the notation from Algorithm 3. Thereby, it holds that \({\bar{t}} < {\hat{t}}\) with \(({\hat{x}},{\hat{t}})\) being an optimal solution of (SUP(l,u)). Thus, there exists no \(y^\prime \in f(S)\) with \(y^\prime \le y\), which implies \(y \not \in f(S) + {\mathbb {R}}^m_+\). The second case is that L is updated by Algorithm 2 using an update point \({\bar{y}} = f({\bar{x}}) \in {\mathcal {N}}\) computed from an optimal solution \({\bar{x}}\) of (GNP(\(f({\hat{x}})\))). Again, it is \({\bar{y}} \not \in f(S) + ({\mathbb {R}}^m_+ {\setminus } \{0\})\). Hence, we have \(N^2 \subseteq {{\,\mathrm{int}\,}}(B) {\setminus } (f(S) + ({\mathbb {R}}^m_+ {\setminus } \{0\}))\) (or \(N^2 = \emptyset \)).

By Lemma 4.3 we have that Algorithm 3 is finite and so \(N^1\) and \(N^2\) are finite sets as well. Moreover, without loss of generality, we can assume that the sets \(N^1\) and \(N^2\) are stable, see page 10 or [25, Remark 2.2]. Together with Theorem 3.6 this implies that L and U are lower and upper bound sets in the sense of Theorem 2.3. \(\square \)

Finally, we can use all of the results from above to show that Algorithm 3 generates an approximation \({\mathcal {A}}\) of the nondominated set \({\mathcal {N}}\) of (MOP) with width \(w({\mathcal {A}}) \le \varepsilon \).

Theorem 4.5

Let \(\varepsilon , \tau > 0\) and \(z,Z \in {\mathbb {R}}^m\) be the input parameters and \(L,U \subseteq {\mathbb {R}}^m\) be the output sets of Algorithm 3. Then \({\mathcal {A}}= (L+{\mathbb {R}}^m_+) \cap (U-{\mathbb {R}}^m_+)\) is an approximation of the nondominated set \({\mathcal {N}}\) of (MOP) with \(w({\mathcal {A}}) \le \varepsilon \).

Proof

By Lemma 4.4 the sets L and U are lower and upper bound sets in the sense of Theorem 2.3. Hence, \({\mathcal {A}}\) is an approximation of the nondominated set \({\mathcal {N}}\) given L and U. Moreover, we have shown in Lemma 4.3 that Algorithm 3 is finite, i.e., it terminates after finitely many iterations. This implies that at some point a run of the repeat loop begins and ends with done == true. Thus, for all \(l \in L\) and all \(u \in U\) the if clause is not satisfied, i.e., there exists at least one index \(i \in [m]\) with

$$\begin{aligned} l_i + \varepsilon \not < u_i \; \Leftrightarrow \; \varepsilon \ge u_i - l_i. \end{aligned}$$

As a result, the approximation \({\mathcal {A}}\) has a width \(w({\mathcal {A}}) \le \varepsilon \). \(\square \)
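The termination test behind this equivalence can be sketched as follows (an illustrative sketch; the function name is ours):

```python
def width_at_most_eps(L, U, eps):
    """Sketch of the termination test of Algorithm 3: the width satisfies
    w(A) <= eps iff for every pair (l, u) there is at least one coordinate i
    with u_i - l_i <= eps, i.e., no box [l, u] exceeds eps in every dimension.
    For pairs with l not <= u, some u_i - l_i is negative, so they pass."""
    return all(any(ui - li <= eps for li, ui in zip(l, u))
               for l in L for u in U)
```

For a single box with side lengths \((2, 0.5)\), the test passes for \(\varepsilon = 1\) (the second side is short enough) but fails for \(\varepsilon = 0.4\).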

5 Numerical results

In the following, we present numerical results for selected test instances. All numerical tests have been performed using MATLAB R2018b on a machine with an Intel Core i9-10920X processor. The averages of the results of bench(5) are: LU = 0.0470, FFT = 0.0559, ODE = 0.0137, Sparse = 0.0885, 2-D = 0.2302, 3-D = 0.2117. This provides a rough idea of the machine performance, in particular for comparing with results obtained on other machines. Please be aware that these results are version specific according to the MATLAB documentation, see [29].

In this section we mainly focus on convex problems (MOP), i.e., problems with convex objective functions \(f_i, i \in [m]\) and a convex set \(S \subseteq {\mathbb {R}}^n\). The big advantage of this setting is that all single-objective subproblems like (SUP(l,u)) are convex as well and can be solved by any local solver. To solve (SUP(l,u)) and (GNP(y)) we use the local solver fmincon. For the initialization, we provide z, Z as in (4.1) with \(\sigma = 10^{-6}\) unless stated otherwise. We also fix \(\tau = 10^{-6}\) for Algorithm 3. In addition, we discuss two nonconvex examples in this section. Since we only use a local solver for the subproblems, there is no longer a guarantee of obtaining a valid approximation for those.

First, in Sect. 5.1 we present the numerical results for BAMOP as given in Algorithm 3. Then, in Sect. 5.2, we present and discuss the results for a modified version of the algorithm. Please be aware that while our algorithm works for problems (MOP) with an arbitrary number of objective functions, in this section we focus on bi- and tri-objective examples since for those examples we can also provide a visualization of the results. In Test Instance 4 we discuss the results for an optimization problem with more than three objectives. The generated data and the MATLAB files are available from the authors on reasonable request.

5.1 Results for BAMOP

Test Instance 1

First, we consider a convex, box-constrained problem. The second objective function is the so-called Brent function, see [2, 22].

$$\begin{aligned} \min \; f(x) := \begin{pmatrix} 2x_1^2+x_2^2 \\ (x_1+10)^2+(x_2+10)^2+e^{\left( -x_1^2-x_2^2\right) } \end{pmatrix} \; \text { s.t. } \; x \in [-5,0]^2. \end{aligned}$$
(Br)
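To make the instance concrete, the objective vector of (Br) can be evaluated directly; a small Python sketch (the function name is ours):

```python
import math

def f_br(x1, x2):
    """Objective vector of test instance (Br); the second component is the
    Brent function. Feasible set: x in [-5, 0]^2."""
    f1 = 2 * x1**2 + x2**2
    f2 = (x1 + 10)**2 + (x2 + 10)**2 + math.exp(-x1**2 - x2**2)
    return (f1, f2)
```

For example, at the corner \(x = (0,0)\) of the feasible box one obtains \(f = (0, 201)\), and at \(x = (-5,-5)\) the value \(f \approx (75, 50)\).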

We execute the algorithm for \(\varepsilon = 1\) and \(\varepsilon = 5\). The results are shown in Fig. 8. For \(\varepsilon = 1\) the structure of the nondominated set is easily recognizable, but this is already the case for \(\varepsilon = 5\).

The computation time is about 0.59 seconds for \(\varepsilon = 5\) and about 2.33 seconds for \(\varepsilon = 1\). Hence, the quality improvement from \(\varepsilon = 5\) to \(\varepsilon = 1\) requires (roughly) four times the computation time. We further investigate the computation time in the next test instance.

Fig. 8 Numerical results for (Br)

Test Instance 2

For the next example, we consider a scalable test problem from [23] that was also used in [39].

$$\begin{aligned} \min \; f(x) := \begin{pmatrix} \frac{1}{n} \sum \limits _{i=1}^n x_i^2 \\ \\ \frac{1}{n} \sum \limits _{i=1}^n (x_i-2)^2 \end{pmatrix} \quad \text { s.t. } \quad x \in [0,1]^n. \end{aligned}$$
(Jin1)
Table 1 Numerical results for (Jin1)

The results, in particular the computation times, for different choices of n and \(\varepsilon \) are shown in Table 1. These were computed using MATLAB's Run and Time feature, which causes some overhead that should be taken into account when comparing results. Table 1 shows that increasing n or decreasing \(\varepsilon \) increases the computation time. Moreover, the overall computation time of Algorithm 3 is essentially the time spent on solving the subproblems (SUP(l, u)) and (GNP(y)), i.e., the time used by fmincon in our example. Thus, it is important to carefully choose a suitable solver for these subproblems.
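It is worth noting why only the decision space grows when n is scaled up: along vectors with identical components, x = c·e, the objective values of (Jin1) are \((c^2, (c-2)^2)\) independently of n. A small Python sketch illustrates this (function name ours; the experiments above use MATLAB):

```python
def f_jin1(x):
    """Objective vector of (Jin1) for a point x in [0, 1]^n."""
    n = len(x)
    f1 = sum(xi**2 for xi in x) / n
    f2 = sum((xi - 2.0)**2 for xi in x) / n
    return (f1, f2)

# Along x = c * e the objective values are independent of n, so the
# image set does not change when n grows; only the decision space
# (and hence the subproblem difficulty for fmincon) gets larger.
print(f_jin1([0.5] * 10))    # (0.25, 2.25)
print(f_jin1([0.5] * 1000))  # same values
```

This matches the observation in Table 1 that the growth in computation time with n comes from the subproblems, not from the criterion space part of the algorithm.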

In Fig. 9 the results for \(\varepsilon = 0.1\) and \(\varepsilon = 0.01\) are shown. Together with the results shown in Table 1, this indicates that the parameter \(\varepsilon \) should be chosen carefully and with respect to the scale of the optimization problem.

Fig. 9 Numerical results for (Jin1)

Test Instance 3

The following tri-objective example can be found in [8] and depends on a parameter \(a > 0\).

$$\begin{aligned} \min \; f(x) := x \quad \text { s.t. } \quad \left( \frac{x_1-1}{1}\right) ^2 + \left( \frac{x_2-1}{a}\right) ^2 + \left( \frac{x_3-1}{5}\right) ^2 \le 1. \end{aligned}$$
(Ex5.1)

The parameter a controls the scale of the ellipsoid constraint function and, hence, the steepness of the nondominated set. In [8] the authors present a vertex selection approach for a polyhedral approximation algorithm from [28]. They demonstrate that the computation time of their modification is only slightly affected by the choice of a.

Table 2 Numerical results for (Ex5.1) with \(\varepsilon = 0.1\)

In contrast, for our algorithm the choice of a has a noticeable influence on the computation time, see Table 2. However, this is not surprising. Increasing the parameter a also increases the size of the initial box [z, Z]. In particular, we have \(z_2 = 1-a-\sigma \) and \(Z_2 = 1+a+\sigma \). Thus, we can expect that more subboxes need to be computed to generate an approximation \({\mathcal {A}}\) with \(w({\mathcal {A}}) \le \varepsilon = 0.1\).
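Since f(x) = x for (Ex5.1), the componentwise extrema over the ellipsoid are the center 1 minus/plus the corresponding semi-axis, so the initial box can be written down explicitly. A minimal Python sketch (helper name ours, following the enlargement by \(\sigma\) as in (4.1)):

```python
def initial_box_ex51(a, sigma=1e-6):
    """Initial box [z, Z] for (Ex5.1): the ellipsoid is centered at
    (1, 1, 1) with semi-axes (1, a, 5), and f(x) = x, so the
    componentwise extrema are 1 -/+ the semi-axes, enlarged by sigma."""
    radii = (1.0, a, 5.0)
    z = tuple(1.0 - r - sigma for r in radii)
    Z = tuple(1.0 + r + sigma for r in radii)
    return z, Z

z, Z = initial_box_ex51(5.0)
# The box width in the second component grows linearly in a, which is
# consistent with the growing number of subboxes observed in Table 2.
print(Z[1] - z[1])  # 2 * (a + sigma)
```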

For an illustration of the result for \(a = 5\), see Fig. 10. Besides the illustration for \(\varepsilon = 0.1\) we also included the result for \(\varepsilon = 0.5\) where the box structure can be seen more clearly.

Fig. 10 Numerical results for (Ex5.1)

Test Instance 4

As mentioned at the beginning of this section, our algorithm is not restricted to the bi- and tri-objective problems that we use here mainly to be able to visualize the numerical results. The following test instance is convex and has a scalable number \(m \le n\) of objectives. It can be found in various formulations, for example in [20, 39]. The objective functions \(f_j :{\mathbb {R}}^{n} \rightarrow {\mathbb {R}}, j \in [m]\), are given by

$$\begin{aligned} f_j(x) := (x_j-1)^2+\sum \limits _{i \in [n], i \ne j} x_i^2. \end{aligned}$$

The test instance is then given as

$$\begin{aligned} \min \; f(x) \quad \text { s.t. } \quad x \in [-1000,1000]^n, \; f_j(x) \le 1 \, \forall j \in [m]. \end{aligned}$$
(ZLT1)

We added the constraints \(f_j(x) \le 1, j \in [m]\), to allow a tight initial box [z, Z] with \(Z_j := 1+\sigma \) for all \(j \in [m]\). Adding these constraints has no impact on the nondominated set of (ZLT1), since even without them the nondominated set is contained in \([0,1]^m\). For all choices of m we fixed \(\varepsilon = 0.2\) and \(n = 100\).
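The effect of the added constraints can be checked directly on the objective functions. A small Python sketch of f for (ZLT1) (function name ours; the experiments in the paper use MATLAB):

```python
def f_zlt1(x, m):
    """Objective vector of (ZLT1): f_j(x) = (x_j - 1)^2 + sum_{i != j} x_i^2.
    Computing total = sum_i x_i^2 once avoids an O(n) sum per objective."""
    sq = [xi * xi for xi in x]
    total = sum(sq)
    return [total - sq[j] + (x[j] - 1.0)**2 for j in range(m)]

# At the origin all objectives equal 1, so the added constraints
# f_j(x) <= 1 are active there; they cut the box [-1000, 1000]^n
# down without affecting the nondominated set.
print(f_zlt1([0.0] * 100, 5))  # [1.0, 1.0, 1.0, 1.0, 1.0]
```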

For \(m=2\) and \(m=3\) the computation times are 2.46 and 4.64 seconds. But then, with increasing m, they increase quite fast to 15.05 seconds for \(m=4\) and 122.25 seconds for \(m=5\). When choosing \(\varepsilon = 0.1\) we only need 22.90 seconds for \(m=3\) but more than 3600 seconds (which we used as a time limit) for \(m=4\).

This example demonstrates that, as a criterion space algorithm, our method has a computation time that depends strongly on the number of objectives; in this example it increases drastically. This might seem surprising, since by Theorem 4.3 the maximal number of iterations of our algorithm depends only linearly on the number of objective functions. The reason for the increase in computation time is that, in general, the number of lower and upper bounds grows with each iteration, which makes the iterations more time-consuming.

Test Instance 5

The next example is scalable in the decision space and nonconvex. It can be found in [20, 39, 40] and is a generalization of an example from [17].

$$\begin{aligned} \min \; f(x) := \begin{pmatrix} 1-\exp \left( -\sum \limits _{i=1}^n \left( x_i-\frac{1}{\sqrt{n}} \right) ^2 \right) \\ \\ 1-\exp \left( -\sum \limits _{i=1}^n \left( x_i+\frac{1}{\sqrt{n}} \right) ^2 \right) \end{pmatrix} \quad \text { s.t. } \quad x \in [-4,4]^n. \end{aligned}$$
(MOP2)
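Although nonconvex, the objectives of (MOP2) are simple to evaluate; a minimal Python sketch (function name ours):

```python
import math

def f_mop2(x):
    """Objective vector of (MOP2) for x in [-4, 4]^n."""
    s = 1.0 / math.sqrt(len(x))
    f1 = 1.0 - math.exp(-sum((xi - s)**2 for xi in x))
    f2 = 1.0 - math.exp(-sum((xi + s)**2 for xi in x))
    return (f1, f2)

# The individual minimizers are x = +/- (1/sqrt(n)) * e; at either one
# the other objective equals 1 - exp(-4), independently of n.
n = 100
x_star = [1.0 / math.sqrt(n)] * n
print(f_mop2(x_star))  # first component 0, second about 0.9817
```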

As already mentioned at the beginning of this section, we keep using the local solver fmincon to solve all single-objective (nonconvex) subproblems. As a result, there is no longer a guarantee that our algorithm computes a valid approximation of the nondominated set \({\mathcal {N}}\). To ensure that all theoretical results hold, a suitable (global) solver for the nonconvex subproblems (SUP(l, u)) and (GNP(y)) would be necessary. However, for this example it turns out that our algorithm still computes a valid result. This is also the case for other examples which are only “slightly” nonconvex. If one aims for a guaranteed valid approximation of the nondominated set, algorithms such as those presented in [14, 31, 35, 42, 43] should be considered. Note that the algorithms from [14, 31] and [35] also have the advantage that they compute not only an approximation of the nondominated set but also a representation or coverage of the set of (\(\varepsilon \)-)efficient solutions. On the other hand, these algorithms are often limited in the size of n.

For the dimension we choose \(n = 100\). In Fig. 11 the results for \(\varepsilon = 0.1\) and for \(\varepsilon = 0.01\) are shown. The computation time for \(\varepsilon = 0.1\) is about 1.8 seconds, while for \(\varepsilon = 0.01\) it is about 24.3 seconds. Considering Fig. 11, one has to decide whether the increase in computation time is worth the improvement of the result.

Fig. 11 Numerical results for (MOP2)

Test Instance 6

While in the previous example our algorithm delivered a valid approximation of the nondominated set, we now present another nonconvex test instance from [37] for which our algorithm combined with the local solver fmincon fails to provide a valid approximation. The test instance is given as

$$\begin{aligned} \begin{aligned} \min \; f(x) := x \quad \text { s.t. } \quad&-x_1^2-x_2^2+1+0.1\cos \left( 16 \arctan \left( \frac{x_1}{x_2}\right) \right) \le 0,\\&(x_1-0.5)^2+(x_2-0.5)^2 \le 0.5,\\&x \in [0,\pi ]^2. \end{aligned} \end{aligned}$$
(Tan)

We set the tolerance \(\varepsilon = 0.1\) and for the initial box [z, Z] we choose \(z = (-\sigma ,-\sigma )^\top \) and \(Z = (1.3+\sigma ,1.3+\sigma )^\top \).
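As a sanity check of the constraint system of (Tan), a small Python feasibility test can be used (function name ours; note that \(\arctan(x_1/x_2)\) is undefined for \(x_2 = 0\), which we treat as infeasible here purely as an assumption for the sketch):

```python
import math

def feasible_tan(x1, x2):
    """Check the three constraints of test instance (Tan)."""
    if not (0.0 <= x1 <= math.pi and 0.0 <= x2 <= math.pi):
        return False
    if x2 == 0.0:
        return False  # assumption: arctan(x1/x2) undefined here
    # g1 perturbs the unit circle by a cosine term; this is the
    # nonconvex part of the feasible set.
    g1 = -x1**2 - x2**2 + 1.0 + 0.1 * math.cos(16.0 * math.atan(x1 / x2))
    g2 = (x1 - 0.5)**2 + (x2 - 0.5)**2
    return g1 <= 0.0 and g2 <= 0.5

print(feasible_tan(1.0, 1.0))  # True: on the boundary of the disc g2 <= 0.5
print(feasible_tan(0.1, 0.1))  # False: violates g1
```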

The result when using fmincon is shown in Fig. 12 on the left; on the right, the result when using GlobalSearch instead of fmincon to solve the subproblems is shown. The latter actually returns a valid approximation of the nondominated set of (Tan). Hence, in this example more effort is needed to obtain a valid approximation by BAMOP.

Fig. 12 Numerical results for (Tan)

5.2 Results for BAMOP with selection criterion

For this section, we consider a modification of BAMOP (Algorithm 3). More precisely, we replace the inner for loop over the upper bounds by a selection criterion for a single upper bound. As selection criterion, we use the shortest edge s(l, u): given \(l \in L\) we select an upper bound \(u \in (\{l+\varepsilon e \} + {{\,\mathrm{int}\,}}({\mathbb {R}}^m_+)) \cap U_{\text {loop}}\) with

$$\begin{aligned} s(l,u) \ge s(l,u^\prime ) \text { for all } u^\prime \in (\{l+\varepsilon e \} + {{\,\mathrm{int}\,}}({\mathbb {R}}^m_+)) \cap U_{\text {loop}}. \end{aligned}$$

We refer to this modified version of Algorithm 3 as BAMOPS. Please be aware that for this modification finiteness of the algorithm is no longer guaranteed by Theorem 4.3. However, if BAMOPS terminates, then the computed approximation \({\mathcal {A}}\) still satisfies \(w({\mathcal {A}}) \le \varepsilon \).
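The selection rule above can be sketched in a few lines of Python (the paper's implementation is in MATLAB; here lower and upper bounds are plain tuples, and s(l, u) is taken as the shortest edge \(\min_j (u_j - l_j)\) of the box [l, u]):

```python
def select_upper_bound(l, U_loop, eps):
    """BAMOPS selection: among the upper bounds u in U_loop with
    u_j > l_j + eps for all j (i.e., u in {l + eps*e} + int(R^m_+)),
    return one maximizing the shortest edge s(l, u) = min_j (u_j - l_j).
    Returns None if no such candidate exists."""
    candidates = [u for u in U_loop
                  if all(uj > lj + eps for uj, lj in zip(u, l))]
    if not candidates:
        return None
    return max(candidates,
               key=lambda u: min(uj - lj for uj, lj in zip(u, l)))

# Example: for l = (0, 0) and eps = 0.1, the bound (0.05, 3.0) is not a
# candidate, and (1.0, 2.0) beats (2.0, 0.5) since its shortest edge is 1.0.
print(select_upper_bound((0.0, 0.0),
                         [(1.0, 2.0), (0.05, 3.0), (2.0, 0.5)], 0.1))
```

Picking the single candidate with the largest shortest edge targets the box that currently contributes most to the width \(w({\mathcal {A}})\), which is a plausible explanation for the speedup observed below.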

Fig. 13 Comparison of BAMOP and BAMOPS for (Ex5.1)

We consider again the tri-objective optimization problem (Ex5.1) from Test Instance 3. For fixed \(\varepsilon = 0.1\) we compare the computation time of BAMOP and BAMOPS for \(a \in \{1,3,5,7,\ldots ,25\}\). The results are presented in Fig. 13. For better comparison, the dashed line represents half the computation time of BAMOP. Thus, for this specific instance BAMOPS is roughly twice as fast as BAMOP. Please be aware that the computation time of BAMOP for Test Instance 3 on page 24 is longer because of the overhead introduced by the Run and Time feature.

Also for other test instances with \(m > 2\) we obtained a significant reduction of the computation time when using BAMOPS instead of BAMOP.

6 Conclusions

We presented a new algorithm to compute an approximation of the nondominated set of (MOP) in the form of a box-coverage. This algorithm has been proven to compute, after finitely many iterations, an approximation \({\mathcal {A}}\) with width \(w({\mathcal {A}})\) of at most \(\varepsilon \) for any predefined \(\varepsilon > 0\). In particular, we have shown that an improvement in every iteration can be guaranteed (see Theorem 4.2) and provided an upper bound for the number of iterations needed (see Theorem 4.3).

Algorithm 3 can theoretically be applied to a large class of problems. In practice, one needs to be able to solve the single-objective subproblems fast and reliably. This is possible, for instance, for smooth convex optimization problems, for which such solvers are available. Even for nonconvex problems our algorithm can obtain valid approximations when using a local solver for the subproblems, as demonstrated in Sect. 5. If one wants to guarantee that BAMOP computes a valid approximation in the nonconvex case, a suitable global solver is necessary. In this case one might also consider other approaches like those from [12, 14, 31] and [42]. In Sect. 5.1 we illustrated the algorithm on six different test instances. As the results have shown, the parameter \(\varepsilon \) should be chosen carefully and with respect to the range of the objective values. There is a significant trade-off between the approximation quality, i.e., the choice of \(\varepsilon \), and the computation time of the algorithm.

We found that the computation time of Algorithm 3 mainly depends on the time needed for solving the subproblems (SUP(l, u)) and (GNP(y)). Thus, an interesting question for further research is how often these subproblems have to be solved; this could then be used to estimate the overall computation time. In particular, a relation between the amount by which \(\varepsilon \) is decreased and the resulting increase in the number of subproblem evaluations would be of great interest.

Moreover, the modification of Algorithm 3 that we presented in Sect. 5.2 seems promising in view of the computation time. However, we lose the Halving Theorem with this modification. Still, it should be investigated further. In particular, one should check whether this modification can still be proved to terminate after finitely many iterations. Also, other selection criteria, e.g., based on the Hypervolume Indicator [44], should be examined.