1 Introduction

Multiobjective optimization problems arise whenever more than one objective is of interest and the objectives are in conflict with each other, i.e. in general it is not possible to obtain a solution that is optimal w.r.t. all objective functions simultaneously. Therefore, the nondominated set (cf. Definition 2.1) consists of so-called optimal compromises. Having an overview of (at least some of) these optimal compromises is an advantage, as one can observe the interplay between the objective functions in a direct and accessible way. Especially in real-world applications, it is rather rare that only one objective is of interest. This is also the case in our application, where we are looking for energy supply network plans that are optimal w.r.t. both the occurring costs and the accompanying \(\textrm{CO}_2\)-emissions. Naturally, a network plan that is low in emissions is more costly and vice versa. Thus, a list of optimal compromises between these two extreme solutions helps political decision-makers to get a broader picture of the situation.

In most cases, solving a multiobjective optimization problem (MOP) numerically means that one computes either an enclosure (cf. [23,24,25,26]) or an approximation (cf. [5, 10, 17, 30, 33]) of the nondominated set. This is due to the fact that the nondominated set is in general infinite. In the linear case, it is also possible to compute the entire nondominated set (cf., e.g., [47]). In this paper, we introduce a deterministic approach to compute an enclosure of the nondominated set with a prescribed quality \(\varepsilon > 0\). Although the focus of our algorithm is to compute such an enclosure, it also returns an approximation of the nondominated set, namely a finite set of \(\varepsilon \)-nondominated points of (MOP) (see Definition 2.3).

In this work, we present two new methods to compute such enclosures. The first method (see Algorithm 2) tackles the problem directly and is conceptually simple. In fact, it is a slight modification of the method presented in [23] in that we use the Restricted Weighted Sum Method (cf. [31, 32]) to obtain nondominated points of the problem. While it is very effective for problems whose single objective subproblems can be solved quickly, it is not applicable (in terms of efficiency) to more complex problems, which can occur even with a comparably small number of variables in the nonconvex mixed-integer setting. Furthermore, this approach depends heavily on the availability of solvers for the resulting single objective problems. These drawbacks are addressed with the second method, where we combine the direct approach with piecewise linear relaxations to bypass the nonlinearities of the original problems. As a result, only single objective mixed-integer linear programming (MILP) problems have to be solved, for which powerful solvers are available (cf. [14, 29]). The accuracy of these relaxations is chosen adaptively by the method itself during the solution process. Consequently, it is possible to use different levels of accuracy in different parts of the criterion space, which can be important if the relaxed problems themselves are difficult to solve.

This approach is the first of its kind for a (MOP). In [24] a method is presented to solve multiobjective convex mixed-integer optimization problems via relaxations. It makes use of cuts in the criterion space to discard infeasible integer assignments as well as to separate parts that do not belong to the image of the feasible set. In [25] the authors present the first deterministic method able to deal with multiobjective mixed-integer nonconvex problems. However, the method presented there is based on a branch-and-bound scheme in the decision space, which is a fundamental difference from the methods presented in this paper. Note further that, for the sake of clarity and conciseness, we restrict ourselves to nonconvex quadratically constrained problems in this paper, as this is the relevant setting for our application. However, the presented method can be straightforwardly generalized to polynomially constrained programs using reduction techniques that decompose polynomials into quadratic and bilinear terms (cf., e.g., [45] and Remark 4.3).

The manuscript is organized as follows: After introducing preliminary notions in Sect. 2, we define enclosures and related notions and provide a basic approach for computing such an enclosure in Sect. 3. In Sect. 4 we introduce piecewise linear relaxations of (MOP) and relate them to so-called lower bound sets of the nondominated set, which results in a new procedure, presented in Sect. 5, for computing an enclosure using adaptively chosen relaxations. We further prove correct and finite termination of this procedure. In Sect. 6 we demonstrate the advantages of this novel approach by applying it to different instances of an energy supply network optimization problem, which are in fact nonconvex multiobjective mixed-integer quadratically constrained problems. Finally, a conclusion is drawn in Sect. 7.

2 Notations and definitions

Throughout we write \(x \le y\) for \(x,y \in {\mathbb {R}}^k\) if \(x_i \le y_i\) for all \(i\in [k]:= \{1,\ldots ,k\}\). In the same manner we write \(x < y\) if \(x_i < y_i\) for all \(i\in [k]\).

The ingredients of a general multiobjective optimization problem are the continuous components \(f_i:{\mathbb {R}}^{n+m} \rightarrow {\mathbb {R}}\), \(i \in [k]\), of the objective function \(f = (f_1,\ldots ,f_k)^\top :{\mathbb {R}}^{n+m}\rightarrow {\mathbb {R}}^k\), the continuous components \(h_i:{\mathbb {R}}^{n+m}\rightarrow {\mathbb {R}}\), \(i\in [p]\), of the equality constraint function \(h = (h_1,\ldots ,h_p)^\top :{\mathbb {R}}^{n+m}\rightarrow {\mathbb {R}}^p\) and the continuous components \(g_i:{\mathbb {R}}^{n+m}\rightarrow {\mathbb {R}}\), \(i \in [q]\), of the inequality constraint function \(g = (g_1,\ldots ,g_q)^\top :{\mathbb {R}}^{n+m} \rightarrow {\mathbb {R}}^q\). For all of these functions, we only allow quadratic and bilinear terms. The corresponding multiobjective optimization problem is then given by

$$\begin{aligned} \begin{aligned} \min f(x) \quad \text { s.t. } \quad h(x) = 0,\, g(x) \le 0,\, x \in X := X_C \times X_I, \end{aligned} \end{aligned}$$
(MOP)

where \(X_C = [\ell _C, u_C] \subsetneq {\mathbb {R}}^n\) and \(X_I = [\ell _I,u_I] \cap {\mathbb {Z}}^m \subsetneq {\mathbb {R}}^m\) are non-empty boxes with \(\ell _C, u_C \in {\mathbb {R}}^n, \, \ell _C < u_C,\) and \(\ell _I,u_I \in {\mathbb {Z}}^m, \, \ell _I < u_I\), respectively. These boxes represent the box constraints of the continuous and integer variables, respectively. If \(m=0\), i.e. if there are no integrality constraints for any variable, we call (MOP) a continuous multiobjective optimization problem, and if \(n=0\) we call it a pure integer multiobjective optimization problem. Note that the components of the occurring functions \(f\), \(g\), and \(h\) are neither assumed to be linear nor convex. For a survey on solution methods for general versions of (MOP) we refer the reader to [22]. We refer to single objective versions of (MOP) as MIQCP (or, more generally, MINLP) problems; interested readers are referred to [1, 6, 9, 36, 43, 52].

As we may require integrality of some variables and due to the possible non-convexity of the occurring functions of (MOP) we obtain its possibly nonconvex feasible set

$$\begin{aligned} S = \left\{ x \in X \mid h(x) = 0, \, g(x) \le 0 \right\} , \end{aligned}$$

which we assume to be nonempty, as well as its possibly nonconvex image set

$$\begin{aligned} f\left( S\right) = \left\{ f(x) \in {\mathbb {R}}^k \mid x\in S\right\} . \end{aligned}$$

We define the projection of the feasible set on \({\mathbb {R}}^m\) by

$$\begin{aligned} S_I = \left\{ x_I \in {\mathbb {Z}}^m \mid \exists \, x_C \in {\mathbb {R}}^n:\left( x_C, x_I \right) \in S \right\} , \end{aligned}$$

which we call the set of feasible integer assignments of (MOP). For a given integer assignment \({\hat{x}}_I \in {\mathbb {Z}}^m\) we define the sets

$$\begin{aligned} S_{{\hat{x}}_I} = \left\{ x\in S \mid x_I = {\hat{x}}_I \right\} \text { and } X_{{\hat{x}}_I} = \left\{ x \in X \mid x_I = {\hat{x}}_I \right\} , \end{aligned}$$

describing the feasible and box-feasible solutions corresponding to \({\hat{x}}_I\), respectively. Notice that \(S_{{\hat{x}}_I}=\emptyset \) holds, provided that \({{\hat{x}}}_I\in {\mathbb {Z}}^m{\setminus } S_I\). Since we assume all constraint functions to be continuous and by the definition of X, the feasible set S is compact. Furthermore, by the continuity of the objective functions, we know that f(S) is also compact and we can therefore find a box \(B = [z^\ell , z^u] \subseteq {\mathbb {R}}^k\) with \(f(S) \subseteq \textrm{int}(B)\) for some \(z^\ell , z^u \in {\mathbb {R}}^k, \, z^\ell \le z^u, \, z^\ell \ne z^u\).

In general, the k objective functions of (MOP) are in conflict with each other, i.e. we cannot assume that there is a solution \({\bar{x}} \in S\) minimizing all objective functions simultaneously. This motivates the concepts of (non-)dominance and efficiency (cf. [20]).

Definition 2.1

  1. (1)

    Let \(y^1,y^2 \in {\mathbb {R}}^k\). Then \(y^1\) dominates \(y^2\) (and \(y^2\) is dominated by \(y^1\)) if

    $$\begin{aligned} y^1_i \le y^2_i \text { for all } i\in [k]\quad \text {and}\quad y^1_j < y^2_j \text { for some } j\in [k]. \end{aligned}$$

    We write \(y^1 \le y^2, \, y^1 \ne y^2\). If even \(y^1_i < y^2_i\) for all \(i\in [k]\), then \(y^1\) strictly dominates \(y^2\) (and \(y^2\) is strictly dominated by \(y^1\)) and we write \(y^1 < y^2\).

  2. (2)

    Let \(y \in {\mathbb {R}}^k\) and \(N\subset {\mathbb {R}}^k\). Then the vector y is (strictly) dominated by the set N if there exists a vector \({\hat{y}} \in N\) (strictly) dominating y. Similarly, if y is not (strictly) dominated by N we call y (weakly) nondominated by N.

  3. (3)

    A point \({{\bar{x}}} \in S\) is called an efficient solution of (MOP) if there exists no \(x\in S\) such that f(x) dominates \(f({\bar{x}})\). If there exists no \(x\in S\) such that f(x) strictly dominates \(f({\bar{x}})\), then \({\bar{x}}\) is called a weakly efficient solution of (MOP). The set of (weakly) efficient solutions of (MOP) is called its (weakly) efficient set.

  4. (4)

    A point \({\bar{y}} \in f(S)\) is called a (weakly) nondominated point of (MOP) if there exists no \(y\in f(S)\) (strictly) dominating \({\bar{y}}\). The set of (weakly) nondominated points of (MOP) is called its (weakly) nondominated set and is denoted by \({\mathcal {N}}\) (and \({\mathcal {N}}_w\)).
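To make the relations in Definition 2.1 concrete, the following minimal Python sketch implements the (strict) dominance tests between objective vectors and the dominance test against a set; the function names are our own and not part of any referenced implementation.

```python
def dominates(y1, y2):
    """y1 dominates y2: y1 <= y2 componentwise and y1 != y2 (Definition 2.1 (1))."""
    return all(a <= b for a, b in zip(y1, y2)) and any(a < b for a, b in zip(y1, y2))

def strictly_dominates(y1, y2):
    """y1 strictly dominates y2: y1 < y2 componentwise."""
    return all(a < b for a, b in zip(y1, y2))

def nondominated_by(y, N):
    """y is nondominated by the set N if no point of N dominates it (Definition 2.1 (2))."""
    return not any(dominates(v, y) for v in N)
```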

Remark 2.2

  1. (1)

    Note that \({\mathcal {N}}= \left\{ f(x) \in {\mathbb {R}}^k \mid x \in S \text { is efficient}\right\} \).

  2. (2)

    Note that even if we assume all objective and constraint functions of (MOP) to be convex (or linear), the nondominated set of (MOP) can be disconnected and nonconvex due to integrality constraints (cf., e.g., [16, Figure 1]).

The goal of multiobjective optimization is to determine the nondominated set \({\mathcal {N}}\) of a given (MOP). As the set \({\mathcal {N}}\) can be infinite or of a very complicated structure, it can be impossible to compute it exhaustively using numerical methods. Therefore, numerical methods for multiobjective optimization mostly focus on computing a finite approximation \({\mathcal {A}}\) of \({\mathcal {N}}\) pursuing the following three goals (cf. [49]):

  • Coverage: all parts of the nondominated set \({\mathcal {N}}\) should be represented in the approximation \({\mathcal {A}}\).

  • Uniformity: the points of the approximation \({\mathcal {A}}\) should be distributed uniformly along the nondominated set \({\mathcal {N}}\).

  • Cardinality: the set \({\mathcal {A}}\) should contain an appropriate number of points, naturally depending on the number and extent of the cost functions.

Another approach to numerically solving an (MOP) is, instead of computing a sole approximation, to compute an enclosure of the nondominated set \({\mathcal {N}}\) as presented in, e.g., [23,24,25,26]. Besides introducing enclosures of the nondominated set, the authors present a branch-and-bound framework for computing such an enclosure of the nondominated set with a prescribed quality. As we are aiming at the same goal in this paper, we present the enclosure-related notions in more detail in Sect. 3.

However, when speaking of enclosures of a certain quality, we face the issue that numerical methods mostly converge only in the limit, i.e. after possibly infinitely many iterations. In general, this is overcome by using termination criteria together with specific tolerances: one terminates a method once it has computed a solution satisfying an a-priori chosen criterion up to a specified tolerance. In single objective optimization, one way to use such tolerances is embodied in branch-and-bound methods, which iteratively compute lower and upper bounds of the optimal value while checking whether the gap between these two becomes smaller than the prescribed tolerance. In particular, these methods terminate with so-called \(\varepsilon \)-optimal solutions. This concept can also be applied to nondominance in multiobjective optimization.

Definition 2.3

Let \(\varepsilon > 0\). A point \({\bar{y}} \in f(S)\) is called a (weakly) \(\varepsilon \)-nondominated point of (MOP) if there exists no \(y \in f(S)\) such that \(y + \varepsilon e\) (strictly) dominates \({\bar{y}}\), where \(e \in {\mathbb {R}}^k\) denotes the all-one vector. Similarly, a solution \({\bar{x}}\in S\) is called (weakly) \(\varepsilon \)-efficient for (MOP) if there is no \(x\in S\) such that \(f(x)+\varepsilon e\) (strictly) dominates \(f({\bar{x}})\). The set of \(\varepsilon \)-nondominated points is denoted by \({\mathcal {N}}_\varepsilon \).
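As a small illustration of this notion, the following Python sketch (our own helper, with a finite sample Y in place of the full image set f(S)) checks whether a candidate point is \(\varepsilon \)-nondominated with respect to that sample.

```python
def eps_nondominated(y_bar, Y, eps):
    """Check eps-nondominance of y_bar w.r.t. the finite sample Y of f(S):
    no y in Y may be such that y + eps*e dominates y_bar (cf. Definition 2.3)."""
    def dominates(a, b):
        return all(ai <= bi for ai, bi in zip(a, b)) and any(ai < bi for ai, bi in zip(a, b))
    return not any(dominates([yi + eps for yi in y], y_bar) for y in Y)
```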

Remark 2.4

Depending on the considered problem (MOP), one could also use another vector instead of e. This can be useful, e.g., if the attained magnitudes of the objective functions differ greatly; see the explanations of the numerical examples in Sects. 3 and 6.

3 Enclosures and local bounds

In [23, 24, 26] the authors present methods for solving problems like (continuous or convex versions of) (MOP). The aim of these methods is to find an enclosure of the nondominated set \({\mathcal {N}}\). To follow the sandwiching idea of single objective branch-and-bound methods, an extension of lower and upper bounds to the multiobjective setting is needed.

Definition 3.1

Let \(L,U \subseteq {\mathbb {R}}^k\) be two finite sets satisfying \({\mathcal {N}}\subseteq L + {\mathbb {R}}^k_+\) and \({\mathcal {N}}\subseteq U - {\mathbb {R}}^k_+\). Then L is called a lower bound set, U is called an upper bound set, and the set \(E:= E(L,U)\) defined as

$$\begin{aligned} {\mathcal {N}}\subseteq E\left( L, U \right) := \left( L + {\mathbb {R}}^k_+\right) \cap \left( U - {\mathbb {R}}^k_+\right) = \bigcup _{\ell \in L} \bigcup _{\begin{array}{c} u\in U,\\ \ell \le u \end{array}} [\ell , u] \end{aligned}$$

is called enclosure of the nondominated set \({\mathcal {N}}\) of (MOP).

For transferring the gap termination criterion to the multiobjective setting we need some sort of quality measure for the set E(L, U). In [23,24,25,26] the authors use the so-called width w(E), which is defined as the optimal value of the problem

$$\begin{aligned} w(E) := \max \; s\left( \ell , u\right) \quad \text {s.t.}\quad \ell \in L,\; u \in U,\; \ell \le u, \end{aligned}$$

where \(s(\ell ,u):= \min \{ u_i - \ell _i \mid i\in [k]\}\) represents the shortest edge length of a given box \([\ell ,u]\). The justification for the choice of this width measure is given by the following lemma.

Lemma 3.2

(cf. [26, Lemma 3.1]) For sets \(L,U \subseteq {\mathbb {R}}^k\) with \({\mathcal {N}}\subseteq L +{\mathbb {R}}^k_+\) and some \(\varepsilon > 0\) let \(w(E) < \varepsilon \). Then the relation

$$\begin{aligned} E\left( L,U\right) \cap f(S) \subseteq {\mathcal {N}}_\varepsilon \end{aligned}$$
(1)

holds, i.e. for any feasible point \(x \in S\) with \(f(x) \in E\) we know that f(x) is an \(\varepsilon \)-nondominated point of (MOP).

Due to Lemma 3.2, finding lower and upper bound sets L, U and thus an enclosure E with \(w(E) < \varepsilon \) becomes important. In fact, computing such an enclosure is the aim of the methods from [23, 24], whereas in the branch-and-bound based method from [26] the authors use the width measure in the preimage space to declare a box sufficiently branched. As we propose a criterion space method, we focus on computing an enclosure E such that \(w(E) <\varepsilon \). In order to do so, we make use (as done in [23,24,25]) of the concept of Local Lower and Local Upper Bounds (LLBs and LUBs, respectively) which is an extension of the work in [15, 34] and also [21].
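For finite sets L and U, the width w(E) can be evaluated by direct enumeration of all pairs \(\ell \le u\). The following Python sketch (our own helper functions, not part of the implementations in [23,24,25,26]) illustrates this.

```python
def shortest_edge(l, u):
    """s(l, u): shortest edge length of the box [l, u]."""
    return min(ui - li for li, ui in zip(l, u))

def width(L, U):
    """w(E(L, U)): the largest shortest-edge length over all boxes [l, u] with l <= u."""
    gaps = [shortest_edge(l, u)
            for l in L for u in U
            if all(li <= ui for li, ui in zip(l, u))]
    return max(gaps) if gaps else float("-inf")  # empty enclosure: width taken as -inf
```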

In [15] the authors call a set \(Y \subseteq {\mathbb {R}}^k\) stable if there are no elements in Y dominating each other, i.e. for any two distinct \(y^1, y^2 \in Y\) there are \(i,j \in [k]\) such that \(y^1_i < y^2_i\) and \(y^2_j < y^1_j\). One can immediately see that for any (MOP) the corresponding nondominated set \({\mathcal {N}}\) is stable. We start with the definition of local upper bounds as introduced in [34].

Definition 3.3

Let \(N \subseteq f\left( S\right) \) be a finite and stable set. Then the lower search region for N is

$$\begin{aligned} s\left( N\right) := \left\{ y \in \textrm{int}\left( B\right) \mid y^\prime \nleq y \text { for every } y^\prime \in N\right\} , \end{aligned}$$

and the lower search zone for some \(u \in {\mathbb {R}}^k\) is given by

$$\begin{aligned} c(u) := \left\{ y\in \textrm{int}\left( B\right) \mid y < u \right\} . \end{aligned}$$

A set \(U = U\left( N\right) \) is called local upper bound set given N if

  1. (i)

    \(s\left( N\right) = \bigcup _{u \in U\left( N\right) } c(u)\),

  2. (ii)

    \(c\left( u^1\right) \nsubseteq c\left( u^2\right) \) for any \(u^1, u^2 \in U\left( N\right) \) with \(u^1 \ne u^2\).

Each point \(u\in U(N)\) is called a local upper bound (LUB).

In [23, 24] the authors extended the concept of LUBs to so-called Local Lower Bounds.

Definition 3.4

Let \(N \subseteq \textrm{int}\left( B\right) \) be a finite and stable set. Then the upper search region for N is

$$\begin{aligned} S\left( N\right) := \left\{ y \in \textrm{int}\left( B\right) \mid y^\prime \ngeq y \text { for every } y^\prime \in N \right\} , \end{aligned}$$

and the upper search zone for some \(\ell \in {\mathbb {R}}^k\) is given by

$$\begin{aligned} C(\ell ) := \left\{ y\in \textrm{int}\left( B\right) \mid \ell < y \right\} . \end{aligned}$$

A set \(L = L\left( N\right) \) is called a local lower bound set given N if

  1. (i)

    \(S\left( N\right) = \bigcup _{\ell \in L\left( N\right) } C(\ell )\),

  2. (ii)

    \(C\left( \ell ^1\right) \nsubseteq C\left( \ell ^2\right) \) for any \(\ell ^1, \ell ^2 \in L\left( N\right) \) with \(\ell ^1 \ne \ell ^2\).

Each point \(\ell \in L\left( N\right) \) is called a local lower bound (LLB).

In [26] the authors show that given a stable set N, the corresponding local upper bound set is uniquely determined. In the same manner, one can show that also the corresponding local lower bound set is unique. The following result from [23] relates the concept of local lower/upper bounds to lower/upper bounds as introduced in Definition 3.1.

Lemma 3.5

(cf. [23, Corollary 3.6]) Suppose that the sets \(N^1 \subseteq f(S)\) and \(N^2 \subseteq \textrm{int}(B) {\setminus } \left( f\left( S\right) + \textrm{int}( {\mathbb {R}}^k_+)\right) \) are finite and stable. Then \(U( N^1)\) is an upper bound set and \(L(N^2)\) is a lower bound set in the sense of Definition 3.1.

As the local lower and upper bound sets depend on the stable set N, we have to update both if we add a new point y to the set N. These procedures come from [34, Algorithm 3], [23, Algorithm 2]. In Algorithm 1 we provide the scheme for updating a local upper bound set. Similar to [34] we use the following notation: for \(y\in {\mathbb {R}}^k,\, c\in {\mathbb {R}}\) and \(i\in [k]\) we define

$$\begin{aligned} y_{-i}&:= \left( y_1,\ldots ,y_{i-1},y_{i+1},\ldots ,y_k\right) ^\top \text { and } \\ \left( c,y_{-i}\right)&:= \left( y_1,\ldots ,y_{i-1},c,y_{i+1},\ldots ,y_k\right) ^\top . \end{aligned}$$
Algorithm 1 Updating a local upper bound set U(N) with respect to an update point y (cf. [34, Algorithm 3], [23, Algorithm 2])

In order to obtain an algorithm for updating a local lower bound set L(N) w.r.t. an update point \(y \in \textrm{int}(B)\), one can modify Algorithm 1 in the following way. We replace U(N) by L(N), and change the following:

  • Step 1 to \(A = \left\{ \ell \in L\left( N\right) \mid y > \ell \right\} \),

  • Step 3 to \(B_i = \left\{ \ell \in L\left( N\right) \mid y_i = \ell _i \text { and } y_{-i} > \ell _{-i}\right\} \),

  • Step 12 to \(P_i = \left\{ \ell \in P_i \mid \ell \ngeq \ell ^\prime \text { for all } \ell ^\prime \in P_i \cup B_i,\, \ell ^\prime \ne \ell \right\} \).

Furthermore, by analyzing the update procedure of U(N) w.r.t. a point \(y\in f(S)\), one can relate the local upper bounds from U(N) to the ones from \(U\left( N\cup \{y\}\right) \) in the following way: for a local upper bound \(u\in U\left( N\cup \{y\}\right) \) we have either \(u \in U(N)\) or \(u = \left( y_i, u^\prime _{-i}\right) \) for some \(i\in [k]\) and \(u^\prime \in U(N)\). For the latter case, we call \(u^\prime \) the parent of u. Otherwise, u is its own parent.
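A simplified Python sketch of this update of \(U(N)\) with respect to a point \(y \in f(S)\) is given below; it follows the description above (every local upper bound u with \(y < u\) is replaced by the candidates \((y_i, u_{-i})\), and redundant candidates are filtered out), but it is not a verbatim transcription of Algorithm 1 or [34, Algorithm 5] and, in particular, does not maintain defining points.

```python
def update_lub(U, y):
    """Update a local upper bound set U (list of tuples) w.r.t. a new point y.
    Simplified O(|U|^2 k) sketch of the update described in the text."""
    k = len(y)
    A = [u for u in U if all(yi < ui for yi, ui in zip(y, u))]   # LUBs with y in their search zone
    new_U = [u for u in U if u not in A]
    for i in range(k):
        # candidates: lower the i-th component of an affected LUB to y_i (its 'children')
        P_i = list(dict.fromkeys(tuple(y[i] if j == i else u[j] for j in range(k)) for u in A))
        # unaffected LUBs that already attain y_i in their i-th component
        B_i = [u for u in U if u[i] == y[i]
               and all(y[j] < u[j] for j in range(k) if j != i)]
        for cand in P_i:
            others = [v for v in P_i + B_i if v != cand]
            # keep cand only if its search zone c(cand) is not contained in another one
            if cand not in new_U and not any(
                    all(ci <= vi for ci, vi in zip(cand, v)) for v in others):
                new_U.append(cand)
    return new_U
```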

Lemma 3.6

[23, Lemma 3.8] Let \(u \in U\left( N\cup \{y\}\right) \) be a local upper bound. Then its parent \(u^\prime \in U(N)\) is unique.

Similarly, we define the parents of a given local lower bound \(\ell \). Furthermore, the analogue of Lemma 3.6 holds also for any local lower bound \(\ell \in L\left( N\cup \{y\}\right) \).

In [15, 34] the authors present a general scheme for computing an approximation of the nondominated set of a multiobjective optimization problem (cf., e.g., [15, Algorithm 2]). In [23] the authors extend this scheme for computing an enclosure E of \({\mathcal {N}}\) which, in fact, consists of boxes \(\left[ \ell ,u\right] \), where \(\ell \) is a local lower and u a local upper bound, and satisfies \(w(E) < \varepsilon \) for given \(\varepsilon > 0\). Both methods make use of the search zones determined by a given local upper bound u.

For this paper, we use a slightly modified version of these approaches as our basic scheme, which is stated in Algorithm 2.

Algorithm 2 Basic scheme for computing an enclosure E of the nondominated set \({\mathcal {N}}\) with \(w(E) < \varepsilon \)

In Algorithm 2, we loop through the set \(U_\text {loop}\), i.e. the set of local upper bounds at the beginning of the iteration. If the current local upper bound u is part of a box that is not yet sufficiently small (see Step 5), we try to improve this local upper bound, i.e. to find a nondominated point in the search region determined by u. In order to do so, we solve the weighted-sum problem

$$\begin{aligned} \min \; \alpha ^\top f(x) \quad \text {s.t.}\quad x \in S,\quad f(x) \le u - \delta e \end{aligned}$$
(\(\hbox {WSP}\left( \alpha ;u\right) \))

for some weight vector \(\alpha \in \textrm{int}({\mathbb {R}}^k_+)\). Note that if the weight vector \(\alpha \) has only strictly positive entries, any global solution \({\bar{x}}\) of (\(\hbox {WSP}\left( \alpha ;u\right) \)) is efficient for (MOP) (cf. [4, Lemma 1.5.2], [20, p. 214f]), i.e. \({\bar{y}} = f\left( {\bar{x}}\right) \in {\mathcal {N}}\) and \({\bar{y}} < u- \delta e\). Furthermore, we know that there exists \(y \in {\mathcal {N}}\) with \(y < u - \delta e\) if and only if (\(\hbox {WSP}\left( \alpha ;u\right) \)) has a solution. In sum, this means that we can decide whether to enter and execute the \(\texttt {if}\)-statement in Step 6 after solving (\(\hbox {WSP}\left( \alpha ;u\right) \)) for a strictly positive weight vector \(\alpha \). This weight vector could always be chosen the same, e.g., \(\alpha =(1,\ldots ,1)^\top /k\). But one could also compute it individually for each u, as done for the numerical examples presented later in this work. For example, one could compute the weight vector \(\alpha \) for the problem (\(\hbox {WSP}\left( \alpha ;u\right) \)) using a computation technique proposed in [48]. Given the current local upper bound of interest \(u \in U(N)\), where N is the current approximation of \({\mathcal {N}}\), let \(y^i \in N\) be a defining point of the i-th component of u for any \(i \in [k]\), i.e. \(y^i_i = u_i\) and \( y^i_{-i} < u_{-i}\). We then solve the system

$$\begin{aligned} \begin{pmatrix} y^1_1 &{} \ldots &{} y^1_k \\ \vdots &{} &{} \vdots \\ y^k_1 &{} \ldots &{} y^k_k \end{pmatrix} \begin{pmatrix} {\tilde{\alpha }}_1 \\ \vdots \\ {\tilde{\alpha }}_k \end{pmatrix} = \begin{pmatrix} 1 \\ \vdots \\ 1 \end{pmatrix}, \end{aligned}$$
(2)

and set \(\alpha := \vert {{\tilde{\alpha }}}\vert / \Vert {{\tilde{\alpha }}}\Vert _2\). Note that \(\alpha \) is the normal vector of the hyperplane determined by the points \(y^1,\ldots ,y^k\). Using (\(\hbox {WSP}\left( \alpha ;u\right) \)), Algorithm 2 can be seen in the context of the so-called Adaptive Weighted Sum Method as presented in [31, 32].
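A small numpy sketch of this weight computation (the function name is ours; a fallback such as equal weights would be needed if the system (2) is singular):

```python
import numpy as np

def weight_from_defining_points(Y):
    """Solve the k x k system (2) with rows y^1, ..., y^k and return
    alpha = |alpha_tilde| / ||alpha_tilde||_2 (assumes the system is nonsingular)."""
    Y = np.asarray(Y, dtype=float)
    alpha_tilde = np.linalg.solve(Y, np.ones(Y.shape[0]))
    return np.abs(alpha_tilde) / np.linalg.norm(alpha_tilde)

# e.g. for defining points (1, 3) and (3, 1) this yields alpha ~ (0.707, 0.707)
```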

Remark 3.7

For a precise definition of defining points we refer to [34]. Furthermore, in [34] the authors present methods for updating local upper bound sets together with the sets of defining points, i.e. an algorithm doing the same as Algorithm 1 but additionally updating the corresponding defining points (see [34, Algorithm 5]). In fact, [34, Algorithm 5] is used instead of Algorithm 1 in all implementations for the present paper. We refrain from presenting [34, Algorithm 5] in detail to keep the presentation concise.

If (\(\hbox {WSP}\left( \alpha ;u\right) \)) is feasible, it is solved to optimality and the image \({\bar{y}}=f({\bar{x}})\in {\mathcal {N}}\) of its solution is used to update the local upper and lower bound sets. Afterward, the algorithm proceeds to the next iteration. If (\(\hbox {WSP}\left( \alpha ;u\right) \)) is infeasible, we know that there is no nondominated point y of (MOP) satisfying \(y < u - \delta e\) and therefore, in particular, none satisfying \(y < u - \varepsilon e\). Hence, we can use \(u - \delta e\) to update the set of local lower bounds (see Step 10).
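The overall control flow just described can be summarized by the following schematic Python sketch; solve_wsp, update_lub and update_llb are placeholders for a solver call for (\(\hbox {WSP}\left( \alpha ;u\right) \)) and for the bound-set updates (our own naming), and the sketch omits the details mentioned in Remark 3.7.

```python
def enclosure_scheme(z_l, z_u, eps, delta, solve_wsp, update_lub, update_llb):
    """Schematic main loop in the spirit of Algorithm 2 (not a verbatim transcription)."""
    def s(l, u):  # shortest edge length of the box [l, u]
        return min(ui - li for li, ui in zip(l, u))

    L, U = [tuple(z_l)], [tuple(z_u)]
    while True:
        boxes = [(l, u) for l in L for u in U if all(li <= ui for li, ui in zip(l, u))]
        if not boxes or max(s(l, u) for l, u in boxes) < eps:
            return L, U                                   # w(E) < eps: enclosure is good enough
        for u in list(U):                                 # U_loop: snapshot at iteration start
            large = [l for l in L
                     if all(li <= ui for li, ui in zip(l, u)) and s(l, u) >= eps]
            if not large:
                continue                                  # Step 5: boxes with corner u already small
            y = solve_wsp(u, delta)                       # returns f(x_bar) or None if infeasible
            if y is not None:
                U = update_lub(U, y)                      # Step 7
                L = update_llb(L, y)                      # Step 8
            else:
                L = update_llb(L, tuple(ui - delta for ui in u))   # Step 10
```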

An exemplary situation during Algorithm 2 is depicted in Fig. 1, where \({\hat{u}} \in U_\text {loop}\) denotes the current local upper bound of interest. The green boxes represent the current enclosure based on the set of local upper bounds U and local lower bounds L, which depend on the current approximation N of the nondominated set \({\mathcal {N}}\). Note that the enclosure simultaneously functions as the leftover search zone, i.e. the area where nondominated points can still lie. The dashed red line mimics the restrictions coming from (\(\hbox {WSP}\left( \alpha ;u\right) \)), i.e. we are only looking for nondominated points in the lower left quadrant w.r.t. \({\hat{u}} - \delta e\). Clearly, if there is no part of the nondominated set in this area, we can update the lower bound set w.r.t. \({\hat{u}} - \delta e\) and declare the search region determined by \({\hat{u}}\) as sufficiently explored. Furthermore, the picture illustrates the impact of enlarging \(\delta \): the distance between the new potentially nondominated point and the old ones increases as \(\delta \) gets closer to \(\varepsilon \).

Fig. 1 Exemplary configuration during Algorithm 2 tackling an instance of Example 3.14. \({\hat{u}} \in U\) denotes the current local upper bound of interest. The moving hyperplane \(\sigma ^T (x_1; x_2)\) together with the red dashed line mimics the behavior of the (\(\hbox {WSP}\left( \alpha ;u\right) \))

Remark 3.8

We briefly describe the method from [23] as it is closely related to ours from Algorithm 2.

The method from [23] iterates through the elements of the list of local lower bounds L in an outer loop and through the elements of the list of local upper bounds U in an inner loop. Given \({\hat{\ell }} \in L\), it iteratively chooses \({\hat{u}}\in U\) satisfying \({\hat{\ell }} \le {\hat{u}}\) and \(s({\hat{\ell }},{\hat{u}}) > \varepsilon \). After fixing a pair \(({\hat{\ell }},{\hat{u}})\) it solves the following optimization problem, where \(\ell = {\hat{\ell }}\) and \(u = {\hat{u}}\) are set,

$$\begin{aligned} \min \; t \quad \text {s.t.}\quad f(x) \le \ell + t\left( u - \ell \right) ,\quad x \in S,\quad t \in {\mathbb {R}} \end{aligned}$$
(\(\hbox {PSP}\left( \ell ,u\right) \))

Problem (\(\hbox {PSP}\left( \ell ,u\right) \)) can be interpreted as a Pascoletti–Serafini problem (cf. [46]) with reference point \({\hat{\ell }}\) and direction \({\hat{u}}-{\hat{\ell }}\). Subsequently, the method solves an optimization problem that ensures that one obtains a nondominated point \({\bar{y}}\). Note that for an optimal solution \(({\bar{x}},{\bar{t}})\) of (\(\hbox {PSP}\left( \ell ,u\right) \)) the point \({\bar{x}} \in S\) is only guaranteed to be weakly efficient (cf. [46]). If \({\bar{y}}\) satisfies a specific criterion (cf. [23, Section 4.3]), it is used to update the sets of local lower and upper bounds. Otherwise, if \({\bar{y}}\) does not satisfy this criterion, the point \({\hat{\ell }} + {\tilde{t}}({\hat{u}} - {\hat{\ell }})\) is used to update the local lower bound set, where \({\tilde{t}} \ge 0.5\) is computed in such a way that strict nondominance of \({\hat{\ell }} + {\tilde{t}}({\hat{u}} - {\hat{\ell }})\) w.r.t. \({\mathcal {N}}\) is ensured. We decided against solving problems like (\(\hbox {PSP}\left( \ell ,u\right) \)) since preliminary numerical experiments indicated that solving (\(\hbox {PSP}\left( \ell ,u\right) \)) is computationally not as efficient as solving (\(\hbox {WSP}\left( \alpha ;u\right) \)) for our test instances. Furthermore, by doing so, one avoids the procedure for ensuring strict nondominance of the update points.

After running through both loops, it is guaranteed that any new box \([\ell ^\text {new},u^\text {new}]\) evolving from the above-described updates satisfies

$$\begin{aligned} \left( u^\text {new} - \ell ^\text {new}\right) _i \le \max \left\{ \varepsilon ,\, 0.5\left( u^\text {old} - \ell ^\text {old}\right) _i\right\} \end{aligned}$$
(3)

for at least one index \(i\in [k]\), where \(\ell ^\text {old}\) and \(u^\text {old}\) are local lower and upper bounds from the beginning of the iteration satisfying \(\ell ^\text {old} \le \ell ^\text {new} \le u^\text {new} \le u^\text {old}\) (cf. [23, Theorem 4.2]). Using that, the authors prove correctness and finiteness of the proposed method and provide an upper bound on the number of iterations until termination (cf. [23, Lemma 4.3] and [23, Theorem 4.5]). Due to the modifications in our method, we obtain the decrease estimate (4) for the boxes, which differs from (3).

We proceed with proving correctness and finiteness of Algorithm 2.

Lemma 3.9

Let U be the local upper bound set at some arbitrary point in Algorithm 2. Then U is an upper bound set in the sense of Definition 3.1.

Proof

We initialize the set \(U = \{z^u\}\). During Algorithm 2 the set U is updated only in Step 7. The update procedure takes place w.r.t. a point \(y \in {\mathcal {N}}\) and therefore particularly \(y \in f(S)\). Thus, by correctness of Algorithm 1 we obtain that U is a local upper bound set w.r.t. some set \(N_1 \subseteq f(S)\). The claim follows by Lemma 3.5. \(\square \)

Lemma 3.10

Let L be the local lower bound set at some arbitrary point in Algorithm 2. Then L is a lower bound set in the sense of Definition 3.1.

Proof

We initialize the set \(L = \{z^\ell \}\). During Algorithm 2 the set L is updated only in the Steps 8 and 10. If updated in Step 8, L is updated w.r.t. a point \(y \in {\mathcal {N}}\) and therefore in particular \(y \notin f(S) + \textrm{int}({\mathbb {R}}^k_+)\). If otherwise updated in Step 10, it is updated w.r.t. \(y = {\hat{u}} - \delta e\) for some local upper bound \({\hat{u}} \in U\). Since we only enter Step 10, if there exists no \({\bar{y}} \in {\mathcal {N}}\) satisfying \({\bar{y}} \le {\hat{u}} - \delta e\) we particularly know that \(y = {\hat{u}} - \delta e \notin f(S) + \textrm{int}({\mathbb {R}}^k_+)\). Hence, by correctness of the modified version of Algorithm 1 for local lower bounds we obtain that L is a local lower bound set w.r.t. some set \(N_2 \subseteq \textrm{int}(B) {\setminus } (f(S) + \textrm{int}({\mathbb {R}}^k_+))\). The claim follows by Lemma 3.5. \(\square \)

The above results show that the sets L and U are lower and upper bound sets in the sense of Definition 3.1 at any time of Algorithm 2. To show that also the output sets are sets of that type, we have to ensure that the sets \(N_1 \subseteq f(S)\) and \(N_2 \subseteq \textrm{int}(B) {\setminus } (f(S) + \textrm{int}({\mathbb {R}}^k_+))\) are finite. If we show that Algorithm 2 is itself finite, i.e. terminates after a finite number of steps, the finiteness of \(N_1\) and \(N_2\) follows immediately. In order to show finiteness of Algorithm 2 we start with a result about the decrease of certain edge lengths during one iteration of the method. The following is essentially the same as the corresponding results in [23]. However, as some adaptations have to be made, we present the adapted proofs in detail.

Theorem 3.11

Let \(\varepsilon> \delta > 0\) and \(z^\ell , z^u \in {\mathbb {R}}^k\) be the input parameters of Algorithm 2. Moreover, let \(L^\text {start}\) and \(U^\text {start}\) be the local lower and upper bound sets at the beginning of some iteration in Algorithm 2, i.e. at the beginning of the \(\texttt {while}\)-loop of some iteration. Accordingly, denote by \(L^\text {end}, U^\text {end}\) the sets at the end of this iteration. Then for any \(\ell ^e \in L^\text {end}\) and any \(u^e \in U^\text {end}\) with \(\ell ^e \le u^e\) there exist \(\ell ^s \in L^\text {start}\) and \(u^s \in U^\text {start}\) such that the following hold:

  1. (1)

    \(\ell ^s \le \ell ^e \le u^e \le u^s\), i.e. the width does not increase during one iteration.

  2. (2)

    There exists an index \(j\in [k]\) such that

    $$\begin{aligned} \left( u^e - \ell ^e\right) _j < \max \, \left\{ \left( u^s - \ell ^s\right) _j - \delta , \varepsilon \right\} . \end{aligned}$$
    (4)

Proof

Let \(\ell ^e\in L^\text {end}\) and \(u^e \in U^\text {end}\) with \(\ell ^e \le u^e\). We denote by \(P(\ell ^e), P(u^e) \subseteq {\mathbb {R}}^k\) the sets containing the parent history of \(\ell ^e\) and \(u^e\) in the current iteration, i.e. their parents, their parents’ parents and so on until the ancestors of \(\ell ^e\) and \(u^e\) belonging to \(L^\text {start}\) and \(U^\text {start}\). By Lemma 3.6 we have that

$$\begin{aligned} \left| P\left( \ell ^e\right) \cap L \right| = 1 \text { and } \left| P\left( u^e\right) \cap U \right| = 1 \end{aligned}$$
(5)

at any point in the current iteration, i.e. for any assignment of L and U during the \(\texttt {while}\)-loop, and therefore in particular we obtain \(\ell ^s \in P(\ell ^e) \cap L^\text {start}\) and \(u^s \in P(u^e) \cap U^\text {start}\). By the definition of parents and the corresponding update procedures given in Algorithm 1 and its adaption for local lower bounds, we know that \(\ell ^p \le \ell ^c\) and \(u^c \le u^p\), where \(\ell ^c\) and \(u^c\) are the children of \(\ell ^p\) and \(u^p\), respectively. Hence, we obtain that

$$\begin{aligned} \begin{aligned} \ell ^s \le \ell \le \ell ^e \text { for all } \ell \in P\left( \ell ^e\right) ,~u^e \le u \le u^s \text { for all } u\in P\left( u^e\right) , \end{aligned} \end{aligned}$$
(6)

and therefore (1) is satisfied.

We now show that also (2) holds. We assume for a contradiction that there exist \(\ell ^e \in L^\text {end}\) and \(u^e\in U^\text {end}\) with \(\ell ^e \le u^e\) such that for any \(\ell \in L^\text {start}\) and \(u \in U^\text {start}\) with \(\ell \le \ell ^e \le u^e \le u\) and any index \(i\in [k]\) we have that

$$\begin{aligned} \left( u^e - \ell ^e\right) _i \ge \max \left\{ \left( u - \ell \right) _i - \delta , \varepsilon \right\} . \end{aligned}$$
(7)

In particular, (7) holds for the ancestors of \(\ell ^e\) and \(u^e\) in \(L^\text {start}\) and \(U^\text {start}\), namely \(\ell ^s\) and \(u^s\). We consider now the point in Algorithm 2, where \(u^s\) is chosen in the \(\texttt {for}\)-loop. Note that \(u^s\) might not be the first local upper bound from \(U_\text {loop}\) considered in the \(\texttt {for}\)-loop and therefore it might be the case that \(u^s \notin U_\text {current}\), where \(U_\text {current}\) is the current assignment of U. However, since for any \(i\in [k]\) we have that \((u^e - \ell ^e)_i \ge \varepsilon \) we know that for any \(\ell ^\prime \in P(\ell ^e)\cap L_\text {current}\) we have that \((u^s - \ell ^\prime )_i \ge \varepsilon \) for any \(i\in [k]\), where \(L_\text {current}\) denotes the current assignment of L. Hence, we have that

$$\begin{aligned} s\left( \ell ^\prime , u^s\right) \ge \varepsilon \end{aligned}$$
(8)

and we therefore enter the \(\texttt {if}\)-statement in Step 5. Similarly, we fix \(u^\prime \in P(u^e) \cap U_\text {current}\). Note that

$$\begin{aligned} \ell ^s \le \ell ^\prime \le \ell ^e \text { and } u^e \le u^\prime \le u^s. \end{aligned}$$

We denote by \(L_\text {updated}\) and \(U_\text {updated}\) the assignments of L and U after the current iteration of the for-loop is executed. We have to distinguish two main cases, namely we enter the \(\texttt {if}\)-statement in Step 6 (case A) or we do not (case B).

  1. (A)

    In that case we enter the \(\texttt {if}\)-statement in Step 6, i.e. there exists \(y \in {\mathcal {N}}\) with \(y < u^s - \delta e\). We have to distinguish two cases.

    Case A.1 \(y < u^\prime \). Then \(u^\prime \) would be removed during the update of \(U_\text {current}\) using Algorithm 1, i.e. \(u^\prime \notin U_\text {updated}\). Now, we have the candidates

    $$\begin{aligned} u^i = \left( y_i, u^\prime _{-i}\right) , \quad \text {for } i \in [k], \end{aligned}$$

    from which at least one belongs to \(U_\text {updated}\) by (5). Say \(u^j \in U_\text {updated}\). Using (6), we compute

    $$\begin{aligned} \left( u^e - \ell ^e\right) _j \le \left( u^j - \ell ^e\right) _j \le y_j - \ell ^s_j < \left( u^s - \ell ^s\right) _j - \delta , \end{aligned}$$

    a contradiction to (7).

    Case A.2 \(y \nless u^\prime \). Then there exists \(j\in [k]\) with \(u^\prime _j \le y_j\). Again, using (6), we compute

    $$\begin{aligned} \left( u^e - \ell ^e\right) _j \le \left( u^\prime - \ell ^e\right) _j \le y_j - \ell ^s_j < \left( u^s - \ell ^s\right) _j - \delta , \end{aligned}$$

    a contradiction to (7).

  2. (B)

    Assume now that we do not enter the \(\texttt {if}\)-statement in Step 6, i.e. there exists no \(y^\prime \in {\mathcal {N}}\) with \(y^\prime < u^s - \delta e =: y\). We have to distinguish two cases.

    Case B.1 \(\ell ^\prime < y\). Then \(\ell ^\prime \) would be removed during the update of \(L_\text {current}\) using Algorithm 1 for local lower bounds, i.e. \(\ell ^\prime \notin L_\text {updated}\). Now, we have the candidates

    $$\begin{aligned} \ell ^i = \left( y_i, \ell ^\prime _{-i}\right) , \quad \text {for } i \in [k], \end{aligned}$$

    from which at least one belongs to \(L_\text {updated}\) by (5). Say \(\ell ^j \in L_\text {updated}\). Using (6), we compute

    $$\begin{aligned} \left( u^e - \ell ^e\right) _j \le \left( u^e - \ell ^j\right) _j \le u^s_j - y_j = \delta < \varepsilon , \end{aligned}$$

    a contradiction to (7).

    Case B.2 \(\ell ^\prime \nless y\). Then there exists \(j\in [k]\) with \(\ell ^\prime _j \ge y_j = u^s_j - \delta \), i.e. particularly \(s(\ell ^\prime , u^s) \le \delta < \varepsilon \), a contradiction to (8).

Obtaining a contradiction in all possible cases shows that our assumption (7) cannot be true and therefore statement (2) is also true, which completes the proof. \(\square \)

Knowing that in every iteration of Algorithm 2 for any pair \((\ell ^e, u^e)\) we obtain a decrease w.r.t. the edge length for at least one edge \(j\in [k]\) we are able to prove that Algorithm 2 terminates after finitely many iterations.

Theorem 3.12

Let \(\varepsilon> \delta > 0\) and \(z^\ell , z^u\in {\mathbb {R}}^k\) be the input parameters of Algorithm 2. We define

$$\begin{aligned} \Delta := \big \Vert z^u - z^\ell \big \Vert _\infty \quad \text {and}\quad \kappa := k \left\lceil \frac{\Delta - \varepsilon }{\delta } \right\rceil +1. \end{aligned}$$

Then the number of iterations of Algorithm 2, i.e. the number of iterations of the \(\texttt {while}\)-loop, is bounded by \(\max \,\{1, \kappa \}\). Hence, Algorithm 2 is finite.

Proof

If \(\kappa \le 1\), then \(\Delta \le \varepsilon \) and Algorithm 2 terminates without entering the \(\texttt {while}\)-loop.

Therefore, let \(\kappa > 1\). We write \(\ell ^1 = z^\ell \) and \(u^1 = z^u\) and for any \(l\ge 1\) let \(L_l\) and \(U_l\) be the assignments of L and U at the beginning of the l-th execution of the \(\texttt {while}\)-loop.

By Theorem 3.11, we know that for any \(l\ge 2\) and any \(\ell ^l \in L_l\) and \(u^l \in U_l\) there exist \(\ell ^{l-1}\in L_{l-1}\) and \(u^{l-1}\in U_{l-1}\) with \(\ell ^{l-1} \le \ell ^l \le u^l \le u^{l-1}\) as well as an index \(i^l \in [k]\) such that

$$\begin{aligned} \left( u^l - \ell ^l\right) _{i^l} < \max \, \left\{ \left( u^{l-1} - \ell ^{l-1}\right) _{i^l} - \delta , \varepsilon \right\} . \end{aligned}$$
(9)

Suppose now that Algorithm 2 has more than \(\kappa \) iterations. Then there exist \(\ell ^\kappa \in L_\kappa \) and \(u^\kappa \in U_\kappa \) with \(s(\ell ^\kappa , u^\kappa ) \ge \varepsilon \). For any \(i\in [k]\) we define

$$\begin{aligned} n(i) = \big \vert \{l \in \{2,\ldots ,\kappa \} \mid i = i^l\}\big \vert . \end{aligned}$$

As \(\kappa > k \lceil \frac{\Delta - \varepsilon }{\delta } \rceil \) there exists at least one index \(j \in [k]\) with \(n\left( j\right) \ge \lceil \frac{\Delta - \varepsilon }{\delta } \rceil + 1\). Iterative use of (9) yields that

$$\begin{aligned} s\left( \ell ^\kappa , u^\kappa \right)&\le u^\kappa _{j} - \ell ^\kappa _{j} < u^1_{j} - \ell ^1_{j} - n(j) \delta \le z^u_{j} - z^\ell _{j} - \left\lceil \frac{\Delta - \varepsilon }{\delta }\right\rceil \delta \\&\le \Delta - \frac{\Delta - \varepsilon }{\delta } \delta \le \varepsilon , \end{aligned}$$

a contradiction to \(s(\ell ^\kappa , u^\kappa ) \ge \varepsilon \). Hence, we have that \(w(E_\kappa ) < \varepsilon \) and Algorithm 2 terminates. \(\square \)
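To illustrate the bound, consider k = 2, \(z^u - z^\ell = (1,1)^\top \) (so \(\Delta = 1\)), \(\varepsilon = 0.05\) and \(\delta = 0.95\varepsilon = 0.0475\): then \(\kappa = 2 \lceil (1-0.05)/0.0475 \rceil + 1 = 2 \cdot 20 + 1 = 41\), i.e. Algorithm 2 performs at most 41 iterations of the \(\texttt {while}\)-loop for this configuration.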

Corollary 3.13

Let \(\varepsilon> \delta > 0\) and \(z^\ell , z^u \in {\mathbb {R}}^k\) be the input parameters of Algorithm 2. Then after finitely many iterations of the \(\texttt {while}\)-loop, an enclosure E of the nondominated set \({\mathcal {N}}\) satisfying \(w(E) < \varepsilon \) is returned.

Proof

Theorem 3.12 tells us that Algorithm 2 terminates after at most \(\kappa \) calls of the \(\texttt {while}\)-loop, i.e. the output set \(E_\kappa \) satisfies \(w(E_\kappa ) < \varepsilon \). By Lemma 3.9 and Lemma 3.10 we know that the set L, resp. U, is a lower, resp. upper, bound set in the sense of Definition 3.1. In particular, this holds for \(L_\kappa \) and \(U_\kappa \). Thus, \(E_\kappa \) is an enclosure of the nondominated set \({\mathcal {N}}\) satisfying \(w(E_\kappa )< \varepsilon \). \(\square \)

We conclude this section with an example together with the numerical results. Note that for the numerical realization, we did not implement Algorithm 2 exactly, but a modification in which we do not iterate through all local upper bounds in every call of the \(\texttt {while}\)-loop. Instead, at the beginning of each \(\texttt {while}\)-loop we determine a local upper bound \({\hat{u}} \in U\) such that there exists \(\ell \in L\) with \(\ell \le {\hat{u}}\) and \(s(\ell ,{\hat{u}}) = w(E)\) and then perform the loop for \({\hat{u}}\) only. By doing so, we avoid performing computations for local upper bounds \(u \in U_\text {loop}\) which no longer belong to the current U. However, finiteness of this procedure is not guaranteed by Theorem 3.12 anymore. But if it terminates after finitely many steps, then we still have \(w(E) < \varepsilon \). Furthermore, we use a normalized shortest edge calculation, i.e. instead of \(s(\ell ,u) = \min _{i\in [k]}\, \{u_i-\ell _i\}\) we compute it via:

$$\begin{aligned} s\left( \ell , u\right) = \min _{i \in [k]} \, \left\{ \frac{u_i - \ell _i}{z^u_i - z^\ell _i}\right\} . \end{aligned}$$

Consequently, the termination tolerance \(\varepsilon \) is no longer an absolute measure, but a relative one. This normalized approach ensures that differences in the magnitudes of the objective functions are taken into account.
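In code, this normalization amounts to a one-line change of the shortest-edge helper sketched above (again our own naming):

```python
def normalized_shortest_edge(l, u, z_l, z_u):
    """Relative shortest edge length of [l, u], scaled by the extent of B = [z_l, z_u]."""
    return min((ui - li) / (zui - zli)
               for li, ui, zli, zui in zip(l, u, z_l, z_u))
```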

Example 3.14

We consider the optimization problem:

$$\begin{aligned} \begin{aligned} \min \; x=\left( x_1,\ldots ,x_n\right) ^\top \quad \text { s.t. } \quad 0&= b_i \left( \Vert x - m^i\Vert _2^2 - r^2 \right) ,\quad i \in [m],\\ 0&\le b_i \left( x_j - m^i_j\right) ,\quad i\in [m],\, j\in [n], \\ 1&= \sum _{i=1}^m b_i,\quad 0\le x, \\ b_i&\in \{0,1\},\quad i\in [m]. \end{aligned} \end{aligned}$$
(Circles)
Fig. 2 Computational results on the first instance in Example 3.14. Left: nondominated points of (Circles) obtained by Algorithm 2 with WSM. Right: enclosure given by the LLBs and LUBs

Fig. 3 Computational results on the second instance in Example 3.14. Two different viewpoints of the enclosure computed by Algorithm 2 with WSM

We consider two different instances of (Circles):

  • The first one is determined by \(n = 2\), \(r=1\), \(m = 3\) as well as \(m^1 = \left( 3,0\right) ^\top \), \(m^2 = \left( 2,1\right) ^\top \) and \(m^3 = \left( 0,3\right) ^\top \). The result of Algorithm 2 with relative tolerance \(\varepsilon = 0.0125\) (the equivalent of an absolute tolerance of \(\varepsilon = 0.05\)) and offset factor \(\delta = 0.95\varepsilon \) is depicted in Fig. 2. One can observe that the gap in the nondominated set is reflected in the enclosure by the edge with corner at \((2,2)^\top \). The corresponding local upper bound is not dominated by any point in the nondominated set and therefore the method updates the local lower bounds with the point \(u - \delta e\).

  • The second instance is determined by \(n = 3\), \(r=1\), \(m = 3\) as well as \(m^1 = \left( 1,0,0\right) ^\top \), \(m^2 = \left( 0,1,0\right) ^\top \) and \(m^3 = \left( 0,0,1\right) ^\top \). The result of Algorithm 2 with relative tolerance \(\varepsilon = 0.05\) (the equivalent of an absolute tolerance of \(\varepsilon = 0.1\)) and offset factor \(\delta = 0.95\varepsilon \) is depicted in Fig. 3. Again, similar to the two-dimensional case, we have a region in the image space with local upper bounds not dominated by any nondominated point of the problem. We observe the same behavior.

All numerical computations in this paper are realized using the Python-package (cf. [41]) of the SCIP optimization suite (cf. [7]) and are carried out on a machine with Intel Core i7-8565U processor and 32GB of RAM. In Fig. 2 (Right) and Fig. 3 the enclosure obtained by the method is depicted. One can observe a significant increase in computational time going from two dimensions to three. In Fig. 2 (Left) the approximation of the nondominated set is depicted.
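For illustration, the following PySCIPOpt sketch sets up one weighted-sum subproblem for the first (Circles) instance. It reflects our reading of the scalarization (\(\hbox {WSP}\left( \alpha ;u\right) \)) as a weighted sum restricted to the search zone below \(u - \delta e\); the variable bounds, parameter values and function names are illustrative and not taken from the actual implementation.

```python
from pyscipopt import Model, quicksum

def solve_wsp_circles(alpha, u, delta, centers, r=1.0):
    """Sketch: one weighted-sum subproblem for the (Circles) instance with centers m^i."""
    n, m = len(u), len(centers)
    mdl = Model("circles_wsp")
    x = [mdl.addVar(lb=0.0, ub=5.0, name=f"x{j}") for j in range(n)]   # box X_C (illustrative)
    b = [mdl.addVar(vtype="B", name=f"b{i}") for i in range(m)]
    mdl.addCons(quicksum(b) == 1)
    for i, c in enumerate(centers):
        # b_i * (||x - m^i||_2^2 - r^2) == 0  and  b_i * (x_j - m^i_j) >= 0
        mdl.addCons(b[i] * (quicksum((x[j] - c[j]) ** 2 for j in range(n)) - r ** 2) == 0)
        for j in range(n):
            mdl.addCons(b[i] * (x[j] - c[j]) >= 0)
    for j in range(n):
        mdl.addCons(x[j] <= u[j] - delta)              # restrict to the search zone below u - delta*e
    mdl.setObjective(quicksum(alpha[j] * x[j] for j in range(n)), "minimize")
    mdl.optimize()
    return [mdl.getVal(v) for v in x] if mdl.getStatus() == "optimal" else None

# first instance of Example 3.14:
# y = solve_wsp_circles([0.5, 0.5], u=[4.0, 4.0], delta=0.05, centers=[(3, 0), (2, 1), (0, 3)])
```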

4 Piecewise linear relaxations and lower bounding

As we have seen, using Algorithm 2 we are able to compute an enclosure of the nondominated set of a given (MOP). However, as all the weighted-sum problems that have to be solved during Algorithm 2 are in general still MINLP problems, difficulties may arise when tackling, e.g., larger instances (see Sect. 6). One idea to cope with this issue is to bypass the nonlinearity of the problems and, e.g., to solve only MILP problems.

We have seen before that the idea of computing enclosures instead of a finite approximation of the nondominated set is lifted from the single objective setting to the multiobjective one. In the same spirit, we are going to use an idea from single objective optimization with regard to solving MINLP problems. In [13, 40, 45] several methods are presented for solving MINLP problems, i.e. a (MOP) with \(k=1\), by iteratively solving adaptively refined convex or linear MIP relaxations of the original problem. Note that there is plenty of literature and ongoing research in the area of efficiently computing linear or convex relaxations (or approximations) of nonlinear problems (cf., e.g., [6, 12, 13, 28, 38, 44, 53]). Although the basic idea of successively refining the relaxations and thereby guaranteeing convergence of the sequence of optimal values of the relaxed problems to the optimal value of the original problem is the same (the realization is not), the respective termination criteria of the algorithms from [45] on the one hand and [13, 40] on the other hand differ. In the latter, both algorithms terminate if the maximal possible constraint violation of the relaxed problem in comparison to the original problem is below an a-priori given, but arbitrarily small, error bound. Loosely speaking, the original MINLP problem is considered solved to optimality if we have computed an optimal solution of a relaxation that violates the nonlinear constraints only in an acceptable manner. The authors prove that, under certain assumptions on the chosen relaxation technique, the proposed methods terminate after a finite number of steps.

In [45] the authors terminate their algorithm by the gap criterion known from classical branch-and-bound methods. The algorithm uses parts of the optimal solutions of the relaxed problems to fix variable values in the original problem. For example, one can fix all integer variables in the MINLP problem to the corresponding values of the relaxed solution and obtain an NLP problem which, if it is feasible, might be easier to solve and then yields an upper bound on the optimal value. Fixing the integer variables in the MINLP problem to the values of the relaxed solution \(({\tilde{x}}_C, {\tilde{x}}_I)\) results in an NLP problem of the following form

$$\begin{aligned} \min \; f\left( x_C, {\tilde{x}}_I\right) \quad \text {s.t.}\quad h\left( x_C, {\tilde{x}}_I\right) = 0,\; g\left( x_C, {\tilde{x}}_I\right) \le 0,\; x_C \in X_C. \end{aligned}$$

Furthermore, as before, the lower bound on the optimal value is successively improved by refining the relaxations. In every step, the smallest available upper bound and the best lower bound are used to sandwich the optimal value of the original MINLP problem, and the algorithm terminates if the gap between these bounds is small enough. However, it is not obvious that the resulting NLP problems become feasible at any time during the procedure, and therefore there is no guarantee that any upper bound is available for the gap criterion at all. Nevertheless, in practice it is rather unlikely that the solutions of the relaxed problems produce exclusively infeasible integer assignments, in particular with increasing accuracy.
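The interplay of lower bounds from relaxations and upper bounds from fixed-integer NLPs, as just described for the single objective case, can be sketched as follows; all callables (solve_relaxation, solve_fixed_nlp, refine) are placeholders for solver routines, and the loop is a schematic illustration rather than the method of [45] or the procedure developed in Sect. 5.

```python
def refine_and_check(solve_relaxation, solve_fixed_nlp, refine, tol, max_iter=100):
    """Schematic single objective refine-and-check loop: relaxations yield lower bounds,
    fixing the integer part of their solutions yields (possibly unavailable) upper bounds."""
    best_ub = float("inf")
    lb = float("-inf")
    for _ in range(max_iter):
        lb, x_tilde = solve_relaxation()        # optimal value and solution of the MILP relaxation
        ub = solve_fixed_nlp(x_tilde)           # None if the fixed-integer NLP is infeasible
        if ub is not None:
            best_ub = min(best_ub, ub)
        if best_ub - lb < tol:                  # gap criterion as in [45]
            break
        refine(x_tilde)                         # refine the piecewise linear relaxation
    return lb, best_ub
```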

In this paper, we want to merge both ideas, as the use of upper bounds together with the gap criterion may cause termination, and thereby accelerate the computations, even if the maximal possible constraint violation of the relaxation is not yet below the tolerance. In fact, we use the scheme presented in [13] to preserve the theoretically ensured convergence of the method, but also add the upper bounding part as well as the gap termination criterion from [45] to avoid unnecessary computations. Furthermore, the relaxation techniques used in [13] are able to handle general nonlinear functions, whereas the techniques in [45] are based on the so-called McCormick relaxations for bilinear and multilinear terms (cf. [42]) and are therefore only able to handle polynomial terms. For this paper, we restrict ourselves to the explicit handling of quadratic and bilinear terms, as these are the only ones appearing in our application (see Sect. 6). However, the method presented in the remainder of this paper is capable of handling general nonlinearities if an adequate relaxation technique is available. We start by introducing the necessary notions.

Definition 4.1

Let (MOP) be given. Further let \({\tilde{S}} \subseteq {\mathbb {R}}^{n+m}\) be a set and \({\tilde{f}} :{\mathbb {R}}^{n+m} \rightarrow {\mathbb {R}}^k\) a k-vector-valued function such that \(S \subseteq {\tilde{S}}\) and \({\tilde{f}}(x) \le f(x)\) for any \(x \in S\). Then we call

$$\begin{aligned} \min \; {\tilde{f}}(x) \quad \text {s.t.}\quad x \in {\tilde{S}} \end{aligned}$$
(\(\hbox {RMOP}_{{\tilde{S}}}\))

a relaxation of (MOP). We denote the nondominated set of (\(\hbox {RMOP}_{{\tilde{S}}}\)) by \({\mathcal {N}}^{{\tilde{S}}}\).

Note that, if \(k=1\), the optimal value \({\tilde{f}}({\tilde{x}}^*)\) of (\(\hbox {RMOP}_{{\tilde{S}}}\)) is a lower bound on the optimal value \(f(x^*)\) of (MOP). For brevity we assume without loss of generality that the objective f in (MOP) is linear. This is also motivated by our energy supply network considered in Sect. 6. Consequently, f needs no further relaxation, and for the remainder of this paper a relaxation (\(\hbox {RMOP}_{{\tilde{S}}}\)) is characterized by its feasible set \({\tilde{S}}\). Given two different relaxations \({\tilde{S}}\) and \({{\hat{S}}}\), we call the relaxation \({\tilde{S}}\) finer than \({{\hat{S}}}\) if \({\tilde{S}} \subseteq {{\hat{S}}}\). Again, if \(k=1\), for two relaxations \({\tilde{S}} \subseteq {{\hat{S}}}\) the optimal value \(f({{\hat{x}}}^*)\) of the relaxation \({{\hat{S}}}\) is a lower bound on the optimal value \(f({\tilde{x}}^*)\) of the relaxation \({\tilde{S}}\). There are plenty of possibilities to obtain relaxations of a given (MOP). One way is to use the concept of piecewise linear under- and overestimators (cf., e.g., [11, 13, 28]).

Definition 4.2

Let \(h(x_1,\ldots ,x_n)\) be a nonlinear real-valued function with compact domain \(D_h \subset {\mathbb {R}}^n\) appearing in (MOP), e.g., as a constraint function.

  • We call a continuous piecewise linear function \(h_u :D_h \rightarrow {\mathbb {R}}\) a piecewise linear underestimator of h if \(h_u(x) \le h(x)\) for all \(x \in D_h\).

  • We call a continuous piecewise linear function \(h_o :D_h \rightarrow {\mathbb {R}}\) a piecewise linear overestimator of h if \(h(x) \le h_o(x)\) for all \(x \in D_h\).

  • We call \({\mathcal {R}}_h = (h_u, h_o)\), where \(h_u\) is a piecewise linear underestimator and \(h_o\) a piecewise linear overestimator of h, a piecewise linear relaxation of h.

Given a piecewise linear relaxation \({\mathcal {R}}_h\) of a term h, we obtain a relaxation in the sense of Definition 4.1 by replacing any appearance of h by an additional variable \({\hat{h}}\) and adding the constraints \(h_u(x) \le {\hat{h}}\) and \({\hat{h}} \le h_o(x)\). Indeed, this yields a relaxation in the sense of Definition 4.1 by the fact that

$$\begin{aligned}&\big \{(h(x),x) \in {\mathbb {R}}^{1+n} \mid x \in D_h \big \} \\ {}&\subseteq \big \{\left( {\hat{h}}, x\right) \in {\mathbb {R}}^{1+n} \mid x \in D_h,\; h_u(x) \le {\hat{h}} \le h_o(x) \big \}. \end{aligned}$$

Thus, given (MOP) with nonlinear terms \(h_i, i\in {\mathcal {I}}\), for some finite index set \({\mathcal {I}}\) together with piecewise linear relaxations \({\mathcal {R}}_{h_i}, i\in {\mathcal {I}}\), and proceeding as above, we obtain a relaxation of (MOP) with feasible set denoted by \({\tilde{S}}_{\mathcal {R}}\) or simply by \({\mathcal {R}}\), where \({\mathcal {R}}\) is the collection of all \({\mathcal {R}}_{h_i}\). For the remainder of this paper we only consider relaxations coming from piecewise linear relaxations of all appearing nonlinear terms of (MOP). One can see immediately that given two relaxations \({\mathcal {R}}\) and \({\mathcal {R}}^\prime \) we have that \({\mathcal {R}}^\prime \subseteq {\mathcal {R}}\) if and only if for any nonlinear term h appearing in (MOP) we have \({\mathcal {R}}^\prime _h \subseteq {\mathcal {R}}_h\), i.e. \(h_u(x) \le h_u^\prime (x)\) and \(h_o^\prime (x) \le h_o (x)\) for all \(x \in D_h\). Consequently, by improving the piecewise linear under- and overestimators we refine the corresponding relaxation. In accordance with [11], we measure the quality of a given piecewise linear relaxation \({\mathcal {R}}_h\) of a nonlinear term h by

  • the overestimation error

    $$\begin{aligned} \varepsilon _o^{\left( h, {\mathcal {R}}_h\right) } := \max _{x\in D_h} \, h_o(x) - h(x), \end{aligned}$$
  • the underestimation error

    $$\begin{aligned} \varepsilon _u^{\left( h, {\mathcal {R}}_h \right) } := \max _{x\in D_h} \, h(x) - h_u(x), \end{aligned}$$
  • and the overall relaxation error \(\varepsilon _{rel}^{\left( h, {\mathcal {R}}_h\right) }:= \max \left\{ \varepsilon _u^{\left( h, {\mathcal {R}}_h \right) }, \varepsilon _o^{\left( h, {\mathcal {R}}_h\right) }\right\} \).

Clearly, the relaxation error decreases while refining relaxations, i.e. for two relaxations \({\mathcal {R}}^\prime _h \subseteq {\mathcal {R}}_h\) we have that \(\varepsilon _{rel}^{(h, {\mathcal {R}}^\prime _h)} \le \varepsilon _{rel}^{(h, {\mathcal {R}}_h)}\).

Naturally, we can also extend this concept to measure the quality of a relaxation \({\mathcal {R}}\) of a given (MOP). In fact, the overestimation error of \({\mathcal {R}}\) is defined by

$$\begin{aligned} \varepsilon _o^{{\mathcal {R}}} := \max \left\{ \varepsilon _o^{\left( h_i, {\mathcal {R}}_{h_i}\right) } \mid h_i \text { appears in } (\text {MOP}) \text { and } {\mathcal {R}}_{h_i} \in {\mathcal {R}}\text { for }i\in {\mathcal {I}}\right\} . \end{aligned}$$

The underestimation error \(\varepsilon _u^{{\mathcal {R}}}\) is defined analogously. The overall relaxation error is then given by \(\varepsilon _{rel}^{\mathcal {R}}:= \max \{\varepsilon _o^{{\mathcal {R}}}, \varepsilon _u^{{\mathcal {R}}}\}\). As before, if \({\mathcal {R}}^\prime \subseteq {\mathcal {R}}\) for two relaxations \({\mathcal {R}}\) and \({\mathcal {R}}^\prime \), we have that

$$\begin{aligned} \varepsilon _{rel}^{{\mathcal {R}}^\prime } \le \varepsilon _{rel}^{\mathcal {R}}. \end{aligned}$$

Let us now briefly introduce the relaxation technique of interest for this paper, namely the so-called McCormick piecewise linear relaxations (cf. [42]).

Remark 4.3

As already mentioned, we put the focus in this paper on the ingredients necessary for our application. As there appear only bilinear and quadratic constraints, we restrict ourselves to the description of exactly this case. However, by decomposing general polynomials into bilinear and quadratic terms one can apply the method from this work to general polynomially constrained problems. For example, the polynomial \(x^3y\) can be decomposed into

$$\begin{aligned} x_1&= x^2, \\ x_2&= xy, \\ x_3&= x_1 x_2, \end{aligned}$$

where \(x_i, i=1,2,3\), are additional variables.

We start with the bilinear case, i.e. \(h(x,y) = xy\), with compact domain

$$\begin{aligned}D_h = \left\{ (x,y) \in {\mathbb {R}}^2 \mid x^\ell \le x \le x^u, y^\ell \le y \le y^u\right\} .\end{aligned}$$

The McCormick piecewise linear relaxation depends on a so-called partition (or triangulation) of the domain \(D_h\). For our purpose it is enough to consider simple partitions of the intervals \([x^\ell , x^u]\) and \([y^\ell , y^u]\) into r equidistant subintervals. Consequently, we obtain gridpoints \(x_i = x^\ell + i (x^u - x^\ell )/r\) and \(y_i = y^\ell + i (y^u - y^\ell )/r\) for \(i \in \{0,\ldots ,r\}\) with corresponding intervals \([x_{i-1}, x_i]\) and \([y_{i-1}, y_i]\) for \(i\in [r]\).

For any \(i,j\in [r]\) and any \((x,y) \in [x_{i-1}, x_i] \times [y_{j-1},y_j]\) the linear underestimator function on \(D_h^{ij}:= [x_{i-1}, x_i] \times [y_{j-1},y_j]\) is given by

$$\begin{aligned} h_u^{ij} (x,y) := \max \left\{ x_{i-1}y + y_{j-1}x - x_{i-1}y_{j-1}, x_i y + y_j x - x_i y_j\right\} , \end{aligned}$$

and the linear overestimator is given by

$$\begin{aligned} h_o^{ij} (x,y) := \min \left\{ x_{i-1} y + y_j x - x_{i-1} y_j, x_i y + y_{j-1} x - x_i y_{j-1}\right\} . \end{aligned}$$

The piecewise linear under- and overestimator functions \(h_u\) and \(h_o\) are then given by concatenation as shown in Fig. 4.

Fig. 4 McCormick relaxations of the bilinear term xy using one partition per variable (green) and two partitions per variable (red)

Note that \(h_u\) and \(h_o\) are continuous piecewise linear functions and \(h_u (x,y) \le h(x,y) \le h_o (x,y)\) for all \((x,y) \in D_h\), i.e. \(h_u\) is a piecewise linear underestimator and \(h_o\) is a piecewise linear overestimator of h in the sense of Definition 4.2. We denote the piecewise linear relaxation based on r equidistant partitions for each variable by \({\mathcal {R}}_{xy}^r\). The resulting relaxation error is given by

$$\begin{aligned} \varepsilon _{rel}^{\left( h, {\mathcal {R}}_h^r\right) } = \frac{\left( x^u - x^\ell \right) \left( y^u - y^\ell \right) }{4r^2}. \end{aligned}$$

Although the quadratic case is just a special case of the bilinear one, we give its explicit formulation. To this end, let \(h(x) = x^2\) with \(D_h = \{x \in {\mathbb {R}}\mid x^\ell \le x \le x^u\}\). We consider again the partition into r equidistant intervals, i.e. \(D_h^i = [x_{i-1}, x_i]\) for \(i \in [r]\). For any \(i\in [r]\) and any \(x\in D_h^i\) the linear underestimator is given by

$$\begin{aligned} h_u^i (x) := \max \left\{ 2x_{i-1} x - x_{i-1}^2, 2x_i x - x_i^2\right\} , \end{aligned}$$

and the linear overestimator is given by

$$\begin{aligned} h_o^i (x) := \left( x_{i-1} + x_i\right) x - x_{i-1}x_i. \end{aligned}$$

The resulting relaxation error corresponding to r equidistant partitions is

$$\begin{aligned} \varepsilon _{rel}^{\left( h, {\mathcal {R}}_h^r\right) } = \frac{\left( x^u - x^\ell \right) ^2}{4r^2}. \end{aligned}$$

Clearly, for the McCormick piecewise linear relaxations we have that \(\varepsilon _{rel}^{(h, {\mathcal {R}}_h^r)} \rightarrow 0\) for \(r \rightarrow \infty \), i.e. using the McCormick piecewise linear relaxation technique, for any \(\varepsilon > 0\) we find \(r\in {\mathbb {N}}\) such that \(\varepsilon _{rel}^{(h, {\mathcal {R}}_h^r)} < \varepsilon \).
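The following Python sketch (ours, for illustration only) evaluates the piecewise McCormick under- and overestimators pointwise according to the formulas above; in the actual relaxed MILP problems these estimators are of course encoded via additional (binary) variables and linear constraints rather than evaluated pointwise. The grid construction and the index search are implementation choices of this sketch.

```python
import numpy as np

def mccormick_bilinear(x, y, xl, xu, yl, yu, r=1):
    """Piecewise McCormick under-/overestimator of h(x, y) = x*y on
    [xl, xu] x [yl, yu] with r equidistant partitions per variable."""
    xs, ys = np.linspace(xl, xu, r + 1), np.linspace(yl, yu, r + 1)
    i = int(np.clip(np.searchsorted(xs, x, side="right"), 1, r))  # subinterval containing x
    j = int(np.clip(np.searchsorted(ys, y, side="right"), 1, r))  # subinterval containing y
    x0, x1, y0, y1 = xs[i - 1], xs[i], ys[j - 1], ys[j]
    h_u = max(x0 * y + y0 * x - x0 * y0, x1 * y + y1 * x - x1 * y1)
    h_o = min(x0 * y + y1 * x - x0 * y1, x1 * y + y0 * x - x1 * y0)
    return h_u, h_o

def mccormick_quadratic(x, xl, xu, r=1):
    """Piecewise linear relaxation of h(x) = x**2 on [xl, xu] with r partitions."""
    xs = np.linspace(xl, xu, r + 1)
    i = int(np.clip(np.searchsorted(xs, x, side="right"), 1, r))
    x0, x1 = xs[i - 1], xs[i]
    h_u = max(2 * x0 * x - x0 ** 2, 2 * x1 * x - x1 ** 2)  # tangents at the gridpoints
    h_o = (x0 + x1) * x - x0 * x1                          # secant through the gridpoints
    return h_u, h_o

def bilinear_error(xl, xu, yl, yu, r):
    """Relaxation error (x^u - x^l)(y^u - y^l) / (4 r^2) of the bilinear case."""
    return (xu - xl) * (yu - yl) / (4 * r ** 2)

def quadratic_error(xl, xu, r):
    """Relaxation error (x^u - x^l)^2 / (4 r^2) of the quadratic case."""
    return (xu - xl) ** 2 / (4 * r ** 2)
```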

Remark 4.4

In [45] the authors provide an adaptive partitioning scheme, i.e. there is no requirement for equidistant intervals. New breakpoints are added in parts of the variable domains close to the value of the previous solution in order to refine the relaxation. The idea is to partition only those regions of the variable domain that appear to influence optimality the most. In contrast, using uniformly distributed breakpoints, one runs the risk of refining in uninteresting regions of the variable domain and therefore unnecessarily blowing up the problem. The scheme presented in [45] could also be incorporated into the methods of the present paper. However, for simplicity of presentation, and since the equidistant procedure works very well for our application, we restrict ourselves to the equidistant partitioning scheme presented in this paper.

Having introduced the relaxations of interest for the present paper, we now relate the relaxed problems (\(\hbox {RMOP}_{{\tilde{S}}}\)) to lower bounds of the nondominated set \({\mathcal {N}}\) of (MOP). For ease of notation, we subsequently assume that for a given relaxation \({\tilde{S}}\) we have \(f({\tilde{S}}) \subseteq \textrm{int}(B)\). We obtain the following lemma.

Lemma 4.5

Let \(f({\tilde{x}}) \in {\mathcal {N}}^{{\tilde{S}}}\) for some \({\tilde{x}} \in {\tilde{S}}\) and some relaxation \({\tilde{S}}\) of (MOP). Then \(f({\tilde{x}})\) is nondominated by \({\mathcal {N}}\), i.e. \(f({\tilde{x}}) \in \textrm{int}(B) {\setminus }(f(S) + \textrm{int}({\mathbb {R}}^k_+))\).

Proof

Assume for a contradiction that there is some \({\bar{y}} \in S\) such that \(f({\bar{y}})\) dominates \(f({\tilde{x}})\). Since \({\bar{y}} \in S \subseteq {\tilde{S}}\), this contradicts the nondominance of \(f({\tilde{x}})\) w.r.t. (\(\hbox {RMOP}_{{\tilde{S}}}\)). \(\square \)

The above lemma tells us that any stable set \({\tilde{N}}\) consisting of nondominated points of possibly different relaxations of a given (MOP) is nondominated w.r.t. \({\mathcal {N}}\). Therefore, by employing Lemma 3.5, we obtain a lower bound set \(L({\tilde{N}})\) of the nondominated set \({\mathcal {N}}\). If, furthermore, we have any potentially nondominated (and therefore in particular stable) set \(N \subseteq f(S)\) available, we obtain, again by Lemma 3.5, an enclosure \(E(L({\tilde{N}}), U(N))\) of the nondominated set in the sense of Definition 3.1. This is the core idea of the method presented in the next section.

We conclude this section with some aspects concerning the sets N and \({\tilde{N}}\). We want to make use of the idea from [45], where the optimal value is sandwiched between lower bounds coming from relaxed problems and upper bounds coming from solutions of the reduced problems. In the multiobjective setting, the lower bound is no longer a single value, but a stable set consisting of optimal solutions of possibly different relaxations, namely \({\tilde{N}}\). In the same spirit, the upper bound is also not a single value but a stable set consisting of potentially nondominated points, namely N. As we want to drive this set N towards \({\mathcal {N}}\), it is updated w.r.t. nondominance, i.e. if we compute a new potentially nondominated point y, we add it to N if y is nondominated w.r.t. N. Furthermore, we discard any points in N which are dominated by y. This update is given in Algorithm 3.

[Algorithm 3: update of the set N w.r.t. a new potentially nondominated point y]
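Since the listing of Algorithm 3 is only given as a figure, the following Python sketch (ours) captures the update rule just described, with the componentwise dominance convention assumed here (y dominates z if \(y \le z\) and \(y \ne z\)).

```python
import numpy as np

def dominates(y, z):
    """y dominates z: y <= z componentwise and y != z (assumed convention)."""
    y, z = np.asarray(y, dtype=float), np.asarray(z, dtype=float)
    return bool(np.all(y <= z) and np.any(y < z))

def update_N(N, y):
    """Sketch of Algorithm 3: update the stable set N of potentially
    nondominated points w.r.t. a new point y in f(S)."""
    if any(dominates(z, y) or np.array_equal(z, y) for z in N):
        return list(N)                              # y yields no improvement, keep N
    N2 = [z for z in N if not dominates(y, z)]      # discard points dominated by y
    return N2 + [np.asarray(y, dtype=float)]
```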

By doing so, we ensure that N stays stable and consequently improves towards \({\mathcal {N}}\) as shown in the next lemma. Note that since \(N\subseteq f(S)\) we have that \({\mathcal {N}}\) is nondominated w.r.t. N, i.e. N actually approximates \({\mathcal {N}}\) from above.

Lemma 4.6

Let \(N_1\) be a stable input set and let \(y \in f(S)\). Then Algorithm 3 returns a stable set \(N_2\). Furthermore, either \(N_1 \subseteq N_2\) or there exists \(y^\prime \in N_1\) such that y dominates \(y^\prime \).

Proof

We have to distinguish two main cases. Firstly, assume that there exists no \(y^\prime \in N_1\) with \(y \le y^\prime \). Then either y is not nondominated w.r.t. \(N_1\) and we have \(N_1 = N_2\), or y is nondominated w.r.t. \(N_1\) and we have \(N_1 \subset N_2 = N_1 \cup \{y\}\). In both cases, \(N_2\) is stable.

Secondly, assume that there exist \(y^i \in N_1\) with \(y \le y^i\), \(i \in [s]\), for some \(s \in {\mathbb {N}}\). Due to the stability of \(N_1\), y is nondominated w.r.t. \(N_1 {\setminus } \{y^i \mid i \in [s]\}\). Hence, \(N_2 = (N_1 {\setminus } \{y^i \mid i \in [s]\}) \cup \{y\}\) and \(N_2\) is stable. Further, if \(y = y^1\) we have \(N_1 = N_2\). Otherwise, y dominates \(y^1\). \(\square \)

Let us now turn to the set \({\tilde{N}}\), which is meant to approach \({\mathcal {N}}\) from below, i.e. we only want to use the best relaxed solutions available. In this setting, we call a nondominated point \({\tilde{y}}\) of a relaxation \({\tilde{S}}\) better than a nondominated point \({{\hat{y}}}\) of a relaxation \({{\hat{S}}}\) if \({\hat{y}}\) dominates \({\tilde{y}}\). Note that if \(y^\prime \) dominates \({\tilde{y}}\), we know that \({\tilde{S}} \subsetneq S^\prime \) holds for the corresponding relaxations. In a similar, but reversed, way as for N, we update the set \({\tilde{N}}\) as given in Algorithm 4.

[Algorithm 4: update of the set \({\tilde{N}}\) of best relaxed solutions w.r.t. a new relaxed nondominated point]
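Analogously to the sketch for Algorithm 3 above, a minimal (and again unofficial) version of this reversed update could read as follows.

```python
import numpy as np

def dominates(y, z):
    y, z = np.asarray(y, dtype=float), np.asarray(z, dtype=float)
    return bool(np.all(y <= z) and np.any(y < z))

def update_N_tilde(N_tilde, y):
    """Sketch of Algorithm 4: update the stable set of best relaxed solutions
    (approaching the nondominated set from below) w.r.t. a new relaxed point y."""
    if any(dominates(y, z) or np.array_equal(y, z) for z in N_tilde):
        return list(N_tilde)                            # y lies below (or equals) a stored point
    N2 = [z for z in N_tilde if not dominates(z, y)]    # drop stored points lying below y
    return N2 + [np.asarray(y, dtype=float)]
```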

We obtain the analogue to Lemma 4.6. Note that by Lemma 4.5 we know that \({\tilde{N}}\) is nondominated w.r.t. \({\mathcal {N}}\), i.e. \({\tilde{N}}\) actually approximates \({\mathcal {N}}\) from below.

Lemma 4.7

Let \({\tilde{N}}_1\) be a stable input set and let \(y \in \textrm{int}(B) {\setminus }(f(S) + \textrm{int}({\mathbb {R}}^k_+))\). Then Algorithm 4 returns a stable set \({\tilde{N}}_2\). Furthermore, either \({\tilde{N}}_1 \subseteq {\tilde{N}}_2\) or there exists \({\tilde{y}} \in {\tilde{N}}_1\) such that \({\tilde{y}}\) dominates y.

Proof

We have to distinguish two main cases. Firstly, assume that there exists no \({\tilde{y}} \in {\tilde{N}}_1\) with \({\tilde{y}} \le y\). Then either \({\tilde{N}}_1\) is not nondominated w.r.t. y and we have \({\tilde{N}}_1 = {\tilde{N}}_2\), or \({\tilde{N}}_1\) is nondominated w.r.t. y and we have \({\tilde{N}}_1 \subset {\tilde{N}}_2 = {\tilde{N}}_1 \cup \{y\}\). In both cases, \({\tilde{N}}_2\) is stable.

Secondly, assume that there exist \({\tilde{y}}^i \in {\tilde{N}}_1\) with \({\tilde{y}}^i \le y\), \(i \in [s]\), for some \(s \in {\mathbb {N}}\). Due to the stability of \({\tilde{N}}_1\), the set \({\tilde{N}}_1 {\setminus } \{{\tilde{y}}^i \mid i \in [s]\}\) is nondominated w.r.t. y. Hence, \({\tilde{N}}_2 = ({\tilde{N}}_1 {\setminus } \{{\tilde{y}}^i \mid i \in [s]\}) \cup \{y\}\) and \({\tilde{N}}_2\) is stable. Further, if \(y = {\tilde{y}}^1\) we have \({\tilde{N}}_1 = {\tilde{N}}_2\). Otherwise, \({\tilde{y}}^1\) dominates y. \(\square \)

5 General scheme

We have seen at the end of Sect. 3 that Algorithm 2 is able to compute an enclosure as well as an approximation of the nondominated set of a given (MOP). However, if nonlinear constraint functions are present in (MOP), the scalarized problems arising during Algorithm 2 are MINLP problems. There we have also seen that, if the solver used (e.g. SCIP) is capable of handling the occurring nonlinear constraints, one can solve the scalarized single objective problems directly. Nevertheless, if the complexity of (MOP), and therefore that of the resulting MINLP problems, increases, the run time of solvers like SCIP for such MINLP problems may increase, too. Note that for every computed nondominated point in our approximation we have to solve at least one such MINLP problem, so even a small increase in computational time per problem may cause a tremendous increase in the run time of the whole procedure.

In Sect. 4 we have presented ideas and concepts from single objective optimization of MINLP problems which are meant to reduce complexity and therefore facilitate the computations while solving a scalar MINLP problem. Now one could think of choosing a specific relaxation \({\mathcal {R}}\) and then using one of the existing algorithms for computing a representation (or enclosure) of the nondominated set of the relaxed mixed-integer linear (or convex) problem, see, e.g., [47, 51] for the biobjective linear case and [24] for the multiobjective convex case, or even Algorithm 2 or [23]. One could argue that if the relaxation \({\mathcal {R}}\) satisfies some quality criterion, e.g., a small enough relaxation error \(\varepsilon _{rel}^{{\mathcal {R}}}\), the approximation (or enclosure) of \({\mathcal {N}}^{{\mathcal {R}}}\) can be considered to be an approximation (or enclosure) of \({\mathcal {N}}\), similarly to what is proposed in [13] for the single objective case. Note that convergence then relies only on the theory of the multiobjective method used.

However, computing such a relaxation and solving the arising problem using an available solution method may be very time-consuming, as the complexity even of the relaxed problems may increase with ongoing refinement, in particular since the number of integer variables increases while tightening the relaxations. One strategy for avoiding this is to use cheap relaxations whenever possible and to refine them only when necessary, e.g., only in specific parts of the image space. This idea of adaptively refining the relaxations while computing an enclosure of the nondominated set of (MOP) is the core of this work. Note that one could additionally incorporate adaptivity in the variable domains into the refinement procedure (see Remark 4.4).

In the following, we present an algorithm similar to Algorithm 2 which makes use of these ideas in order to compute an enclosure of the nondominated set without solving scalarized MINLP problems, but only MILP and NLP problems (see Algorithm 5).

[Algorithm 5: computation of an enclosure of the nondominated set of (MOP) using adaptively refined piecewise linear relaxations]

Before starting the procedure we fix a relaxation technique guaranteeing that the relaxation error quality criterion, namely \(\varepsilon _{rel}^{{\mathcal {R}}} < \varepsilon _{rel}\), is satisfied after a finite number of refinement steps. We introduce the set consisting of all such relaxations

$$\begin{aligned} \Omega := \left\{ {\mathcal {R}}\mid \varepsilon _{rel}^{{\mathcal {R}}} < \varepsilon _{rel}\right\} , \end{aligned}$$

and assume the following for the remainder of the paper.

Assumption 5.1

For any initial relaxation \({\mathcal {R}}^I\) of the chosen relaxation technique, let \({\mathcal {R}}^I \supsetneq {\mathcal {R}}^1 \supsetneq {\mathcal {R}}^2 \supsetneq \ldots \) be a chain of strictly decreasing relaxations. Then for any \(\varepsilon _{rel}>0\) there exists an \(s\in {\mathbb {N}}\) with \(\varepsilon _{rel}^{{\mathcal {R}}^s} < \varepsilon _{rel}\), i.e. \({\mathcal {R}}^l \in \Omega \) for all \(l \ge s\).

Remark 5.2

We should mention that there is a wide range of alternative relaxation techniques in the literature, including the use of convex underestimators (cf. [2, 3, 37, 50]) instead of piecewise linear ones. One could then either solve the resulting convex MINLP problems or combine them with outer approximation techniques (cf. [18, 27, 35, 39, 54]). However, here we restrict ourselves to the case of piecewise linear relaxations since the relaxation error computation and especially convergence in the sense of Assumption 5.1 is straightforward. For the above-mentioned approaches, one has to ensure that both of these requirements are fulfilled.

Furthermore, if \({\mathcal {R}}\in \Omega \) we consider any feasible point \({\tilde{x}} \in {\tilde{S}}\) of the corresponding relaxed problem (\(\hbox {RMOP}_{{\tilde{S}}}\)) based on the feasible set \({\tilde{S}} = S^{{\mathcal {R}}}\) as a feasible point of (MOP), i.e. \({\tilde{x}} \in S\). Consequently, by Lemma 4.5, for any efficient point \({\tilde{x}} \in {\tilde{S}}\) of (\(\hbox {RMOP}_{{\tilde{S}}}\)) we have that \(f\left( {\tilde{x}}\right) \in {\mathcal {N}}\). This means that for a relaxation \({\mathcal {R}}\in \Omega \) we consider any nondominated point of (\(\hbox {RMOP}_{{\tilde{S}}}\)) as a nondominated point of (MOP). For ease of notation, we write \({\tilde{x}} \in S\) if \({\tilde{x}} \in S^{{\mathcal {R}}}\) for some \({\mathcal {R}}\in \Omega \) for the remainder of this paper. Note that Assumption 5.1 holds for the McCormick piecewise linear relaxations introduced in Sect. 4.
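For the McCormick relaxations the membership test \({\mathcal {R}}\in \Omega \) is straightforward, since the overall relaxation error is simply the maximum of the closed-form per-term errors from Sect. 4. A small Python sketch of ours (the term encoding is a hypothetical convention of this sketch):

```python
def overall_relaxation_error(terms):
    """eps_rel^R as the maximum of the per-term McCormick errors.
    Each term is encoded as ('quad', xl, xu, r) for x**2 or as
    ('bilin', xl, xu, yl, yu, r) for x*y (hypothetical convention)."""
    errors = []
    for term in terms:
        if term[0] == 'quad':
            _, xl, xu, r = term
            errors.append((xu - xl) ** 2 / (4 * r ** 2))
        else:
            _, xl, xu, yl, yu, r = term
            errors.append((xu - xl) * (yu - yl) / (4 * r ** 2))
    return max(errors)

def in_Omega(terms, eps_rel):
    """A relaxation belongs to Omega iff its overall error is strictly below eps_rel."""
    return overall_relaxation_error(terms) < eps_rel
```

Since each per-term error tends to zero as its partition number grows, increasing the partition numbers along a chain of refinements eventually places the relaxation in \(\Omega \), which is exactly Assumption 5.1 for this relaxation technique.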

In Step 3, we initialize the set

$$\begin{aligned} {\mathcal {D}}(U) := \left\{ \left( u, {\mathcal {R}}\right) \mid u\in U,\, {\mathcal {R}}\text { caused the computation of } u\right\} , \end{aligned}$$

consisting of any present local upper bound together with the relaxation \({\mathcal {R}}\) which was needed to obtain this specific local upper bound. We say that the relaxation \({\mathcal {R}}\) caused the computation of a local upper bound u if u entered the set of local upper bounds after it was updated w.r.t. a point y whose computation relied on \({\mathcal {R}}\). This could be either the case if y is the solution of the relaxed problem corresponding to \({\mathcal {R}}\) and \({\mathcal {R}}\in \Omega \) or if y is a nondominated point of (\(\hbox {redMOP}({\tilde{x}}_I)\)), where \({\tilde{x}}_I\) was computed using the relaxation \({\mathcal {R}}\). Furthermore, we initialize the set

$$\begin{aligned} {\mathcal {D}}\left( {\tilde{N}}\right) := \left\{ \left( {\tilde{y}}, {\mathcal {R}}\right) \mid {\tilde{y}}\in {\tilde{N}},\, {\mathcal {R}}\text { caused the computation of } {\tilde{y}}\right\} , \end{aligned}$$

consisting of solutions of relaxed problems together with their corresponding relaxation \({\mathcal {R}}\).

Suppose now we are at the beginning of the l-th call of the outer \(\texttt {while}\)-loop in Step 4 and we have that \(w(E) \ge \varepsilon _{\text {encl}}\). We fix the set \(U_\text {loop}\) to be the current assignment of the set of local upper bounds U and start the \(\texttt {for}\)-loop in Step 6. In that \(\texttt {for}\)-loop, let \({\hat{u}}\in U_\text {loop}\) be such that there exists \(\ell \in L\) with \(\ell \le {\hat{u}}\) and \(s(\ell ,{\hat{u}})\ge \varepsilon _{\text {encl}}\), i.e. the search zone determined by \(\ell \) and \({\hat{u}}\) is not yet well enough explored, and we set \(\texttt {done} = \text {false}\). We use the indicator \(\texttt {done}\) to determine whether we have achieved an improvement w.r.t. \({\hat{u}}\), i.e. the inner \(\texttt {while}\)-loop ensures that we concentrate on \({\hat{u}}\) until we have made some improvement. We say that we improved \({\hat{u}}\) if we entered one of the \(\texttt {if}\)-statements in Steps 15 and 19 or the \(\texttt {else}\)-statement in Step 29, as in these steps either a potentially nondominated point y with \(y < {\hat{u}} - \delta e\) is found or the search region \(c({\hat{u}})\) is declared to be well enough explored.

However, given the current local upper bound \({\hat{u}}\) we have to choose an appropriate relaxation \({\mathcal {R}}_\text {current}\) for executing our computations. This is realized in Step 10. We choose a relaxation at least as fine as the relaxation which led to \({\hat{u}}\), i.e. \({\mathcal {R}}^{{\hat{u}}} \supseteq {\mathcal {R}}_\text {current}\). Furthermore, if there exists some relaxed solution \({\tilde{y}} \in {\tilde{N}}\) with \({\tilde{y}} < {\hat{u}} - \varepsilon _{\text {encl}}e\) we choose a strictly finer relaxation than \({\mathcal {R}}^{{\tilde{y}}}\), i.e. we ensure that \({\mathcal {R}}^{{\tilde{y}}} \supsetneq {\mathcal {R}}_\text {current}\). Note that the strictness of the inclusion is not necessary for the convergence of Algorithm 5 since the method also refines the relaxations if the incumbent relaxed solution did not lead to an improvement of the lower bound set. However, for some problems, it may be of advantage to refine the relaxations more aggressively instead of solving cheaper problems that do not have a high chance of leading to a significant improvement of the lower bound set. In fact, not forcing the inclusion \({\mathcal {R}}_\text {current} \subseteq {\mathcal {R}}^{{\tilde{y}}}\) to be strict makes it impossible to find a relaxed solution \(y^\prime \) dominated by \({\tilde{y}}\), and therefore in particular one satisfying \({\hat{u}} - \varepsilon _{\text {encl}} e \le y^\prime \), as can be seen in Fig. 5. The possible negative effect of not forcing strictness can be seen in the comparison of Figs. 7 and 8, where we can observe an increase in runtime as well as in the number of problems to be solved. However, refining too aggressively may also be a problem as it may result in solving harder problems than necessary. From our first observations, it is a good strategy to take the coarsest possible relaxation without losing the strictness of the inclusion, at least when using our basic refinement strategy.

Fig. 5 A strict refinement of the relaxation is needed in order to obtain solutions closer to the desired area, since that area is dominated by \({\tilde{y}}\)
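Identifying each relaxation with its number r of equidistant partitions (as the paper does later for the numerical examples, with a larger r meaning a finer relaxation), our reading of the selection rule in Step 10 can be sketched as follows; the dictionaries play the role of \({\mathcal {D}}(U)\) and \({\mathcal {D}}({\tilde{N}})\), and all names are hypothetical.

```python
def choose_relaxation(u_hat, N_tilde, r_of_u, r_of_y, eps_encl):
    """Sketch of Step 10: coarsest partition number that is at least as fine as
    the relaxation that caused u_hat and strictly finer than the relaxation of
    any relaxed solution y with y < u_hat - eps_encl * e.

    r_of_u : dict mapping local upper bounds (as tuples) to their partition number
    r_of_y : dict mapping relaxed solutions (as tuples) to their partition number
    """
    r_current = r_of_u[tuple(u_hat)]                           # at least as fine as R^{u_hat}
    for y in N_tilde:
        if all(yi < ui - eps_encl for yi, ui in zip(y, u_hat)):
            r_current = max(r_current, r_of_y[tuple(y)] + 1)   # strictly finer than R^{y}
    return r_current
```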

After that, we initialize the relaxation \({\tilde{S}} = S^{{\mathcal {R}}_\text {current}}\), solve (\(\hbox {WSP}\left( \alpha ;u\right) \)) with \(u = {\hat{u}}\) for some \(\alpha \in \textrm{int}({\mathbb {R}}^k_+)\) and feasible set \({\tilde{S}}\), and then decide whether we are able to enter the \(\texttt {if}\)-statement in Step 12. If (\(\hbox {WSP}\left( \alpha ;u\right) \)) is infeasible, we declare the current search region \(c({\hat{u}})\) as well enough explored by the same arguments as in Algorithm 2 and set \(\texttt {done} = \text {true}\). If, otherwise, there exists a solution \({\tilde{y}}\) to (\(\hbox {WSP}\left( \alpha ;u\right) \)), we check whether \({\tilde{N}}\) is nondominated w.r.t. \({\tilde{y}}\), i.e. whether \({\tilde{y}}\) improves the set \({\tilde{N}}\). If that is not the case, i.e. if there exists \(y^\prime \in {\tilde{N}}\) with \({\tilde{y}} \le y^\prime \), we have to restart the inner \(\texttt {while}\)-loop with a finer relaxation, since the current one did not lead to any improvement. If, otherwise, \({\tilde{N}}\) is nondominated w.r.t. \({\tilde{y}}\), it is reasonable to move on, as \({\tilde{y}}\) suggests an improvement of \({\hat{u}}\). Consequently, we update the sets \({\tilde{N}}\) and L w.r.t. \({\tilde{y}}\) and save the corresponding relaxation. If now the relaxation is fine enough in the sense of [13], i.e. \({\mathcal {R}}_\text {current} \in \Omega \), we consider \({\tilde{y}}\) as a nondominated point of (MOP), update the sets N and U w.r.t. \({\tilde{y}}\) and set \(\texttt {done} = \text {true}\). If, otherwise, the relaxation is not yet fine enough, we try to make use of the idea from [45], i.e. we use parts of the relaxed solution to set up the reduced problem (\(\hbox {redMOP}({\tilde{x}}_I)\)). We solve the corresponding (\(\hbox {WSP}\left( \alpha ;u\right) \)) and, if it has a solution, we obtain a potentially nondominated point, i.e. we update the sets N and U and set \(\texttt {done} = \text {true}\). If it is infeasible, we have to restart with a finer relaxation, since \({\mathcal {R}}_\text {current}\) wrongly suggested that we would find a potentially nondominated point.
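As a reading aid (ours, not the actual listing of Algorithm 5), the control flow of the inner \(\texttt {while}\)-loop just described can be summarized as follows; all callables, in particular solve_relaxed_wsp and solve_reduced_wsp, are hypothetical placeholders for the respective subproblem solvers and set updates.

```python
def inner_loop(u_hat, r_current, solve_relaxed_wsp, solve_reduced_wsp,
               improves_N_tilde, in_Omega, update_lower, update_upper,
               mark_explored, refine):
    """Reading aid for Steps 10-31 of Algorithm 5 (hypothetical interfaces)."""
    done = False
    while not done:
        y_tilde = solve_relaxed_wsp(r_current, u_hat)    # WSP(alpha; u_hat) over the relaxed set
        if y_tilde is None:                              # relaxed problem infeasible:
            mark_explored(u_hat)                         # c(u_hat) is well enough explored
            done = True
        elif improves_N_tilde(y_tilde):                  # N_tilde is nondominated w.r.t. y_tilde
            update_lower(y_tilde, r_current)             # update N_tilde, L and D(N_tilde)
            if in_Omega(r_current):                      # relaxation fine enough:
                update_upper(y_tilde)                    # treat y_tilde as nondominated for (MOP)
                done = True
            else:
                y = solve_reduced_wsp(y_tilde, u_hat)    # fix integer part of the relaxed solution
                if y is not None:
                    update_upper(y)                      # potentially nondominated point found
                    done = True
                else:
                    r_current = refine(r_current)        # relaxation was misleading, refine it
        else:
            r_current = refine(r_current)                # no improvement of N_tilde, refine
    return r_current
```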

Remark 5.3

Note that it is not necessary to solve the (\(\hbox {WSP}\left( \alpha ;u\right) \)) corresponding to (\(\hbox {redMOP}({\tilde{x}}_I)\)) to global optimality. Since we aim to find a feasible point \(x\in S\) satisfying \(f(x) < {\hat{u}} - \delta e\), and this is already incorporated in the constraints, it suffices to find a feasible point of (\(\hbox {WSP}\left( \alpha ;u\right) \)). Furthermore, even solving to global optimality would not guarantee finding an efficient point of (MOP), i.e. some x with \(f(x) \in {\mathcal {N}}\), since we are restricted to a specific integer assignment. However, depending on the structure and complexity of (MOP), investing the time to compute global solutions of the possibly nonconvex (\(\hbox {WSP}\left( \alpha ;u\right) \)) may still speed up the algorithm.

We turn now to proving correctness and finiteness of Algorithm 5.

Lemma 5.4

Let \({\tilde{N}}\) be the set of best relaxed solutions at some arbitrary point in Algorithm 5. Then \({\tilde{N}} \subseteq \textrm{int}(B) {\setminus } (f(S) + \textrm{int}({\mathbb {R}}^k_+))\) and \({\tilde{N}}\) is stable.

Proof

We initialize the set \({\tilde{N}} = \emptyset \). During Algorithm 5 the set \({\tilde{N}}\) is updated only in Step 14. The update takes place w.r.t. a point \({\tilde{y}} \in \textrm{int}(B) {\setminus } (f(S) + \textrm{int}({\mathbb {R}}^k_+))\) by Lemma 4.5. Thus, by correctness of Algorithm 4 we obtain that \({\tilde{N}} \subseteq \textrm{int}(B) {\setminus } (f(S) + \textrm{int}({\mathbb {R}}^k_+))\) and \({\tilde{N}}\) is stable. \(\square \)

Lemma 5.5

Let N be the set of potentially nondominated points at some arbitrary point in Algorithm 5. Then \(N \subseteq f(S)\) and N is stable.

Proof

We initialize the set \(N = \emptyset \). During Algorithm 5 the set N is updated only in Steps 16 and 20. In both cases the update takes place w.r.t. a point \(y \in f(S)\)—note that in Step 16 we have \({\mathcal {R}}_\text {current}\in \Omega \) and therefore \({\tilde{y}} \in f(S)\). Thus, by correctness of Algorithm 3 we obtain that \(N \subseteq f(S)\) and N is stable. \(\square \)

Similar to Sect. 3 we prove correctness of the sets U and L.

Lemma 5.6

Let U be the local upper bound set at some arbitrary point in Algorithm 5. Then U is an upper bound set in the sense of Definition 3.1.

Proof

We initialize the set \(U = \{z^u\}\). During Algorithm 5 the set U is updated only in Steps 16 and 20. In both cases U is updated w.r.t. a point \(y \in f(S)\)—note that in Step 16 we have \({\mathcal {R}}_\text {current}\in \Omega \) and therefore \({\tilde{y}} \in f(S)\). Thus, by correctness of Algorithm 1 we obtain that U is a local upper bound set w.r.t. the set \(N \subseteq f(S)\). The claim follows by Lemma 3.5. \(\square \)

Lemma 5.7

Let L be the local lower bound set at some arbitrary point in Algorithm 5. Then L is a lower bound set in the sense of Definition 3.1.

Proof

We initialize the set \(L = \{z^\ell \}\). During Algorithm 5 the set L is updated only together with \({\tilde{N}}\) in Step 14 and within the \(\texttt {else}\)-statement in Step 29. If updated in Step 14, L is updated w.r.t. a point \({\tilde{y}} \in \textrm{int}(B) {\setminus } (f(S) + \textrm{int}({\mathbb {R}}^k_+))\). If it is updated within the \(\texttt {else}\)-statement instead, it is updated w.r.t. \(y = {\hat{u}} - \delta e\) for some local upper bound \({\hat{u}} \in U\). Since we only enter this \(\texttt {else}\)-statement if there exists no \({\bar{y}} \in {\mathcal {N}}^{{\tilde{S}}}\) satisfying \({\bar{y}} \le {\hat{u}} - \delta e\), we particularly know that \(y = {\hat{u}} - \delta e \notin f(S) + \textrm{int}({\mathbb {R}}^k_+)\). Hence, by correctness of Algorithm 1 adapted to local lower bounds we obtain that L is a local lower bound set w.r.t. some set \(N^\prime \subseteq \textrm{int}(B) {\setminus } (f(S) + \textrm{int}({\mathbb {R}}^k_+))\). Note that \({\tilde{N}} \subseteq N^\prime \). The claim follows by Lemma 3.5. \(\square \)

After proving that at any point in Algorithm 5 the sets N, \({\tilde{N}}\), L and U satisfy the requirements, we proceed by showing termination of Algorithm 5 after finitely many iterations of the outer \(\texttt {while}\)-loop. For that purpose, we first prove that the inner \(\texttt {while}\)-loop is terminated after a finite number of iterations.

Lemma 5.8

Let \(\varepsilon _\text {encl}> \delta > 0\), \(\varepsilon _{rel} > 0\), \({\mathcal {R}}^I\) and \(z^\ell , z^u \in {\mathbb {R}}^k\) be the input parameters of Algorithm 5. Moreover, let \({\hat{u}} \in U_\text {loop}\) be the local upper bound chosen in the \(\texttt {for}\)-loop at some arbitrary point of Algorithm 5. Assume that there exists \(\ell \in L\) with \(\ell \le {\hat{u}}\) and \(s\left( \ell ,{\hat{u}}\right) \ge \varepsilon _{\text {encl}}\). Then after finitely many iterations of the inner \(\texttt {while}\)-loop we have that \(\texttt {done} = \text {true}\).

Proof

We have to show that after finitely many iterations we either enter one of the \(\texttt {if}\)-statements in Steps 15 or 19 or we enter the \(\texttt {else}\)-statement in Step 29. For a contradiction, we assume that none of them is entered after finitely many iterations of the inner \(\texttt {while}\)-loop. Then, every iteration is executed with a relaxation strictly finer than the one before. This is due to the fact that any already executed iteration of the inner \(\texttt {while}\)-loop terminated either with Step 23 or Step 27. Thus, we have an infinite chain of relaxations \({\mathcal {R}}^1 \supsetneq {\mathcal {R}}^2 \supsetneq \ldots \supsetneq {\mathcal {R}}^l \supsetneq \ldots \), where \({\mathcal {R}}^l\) denotes the relaxation corresponding to the l-th iteration of the inner \(\texttt {while}\)-loop. Now, by Assumption 5.1 there exists \(s \in {\mathbb {N}}\) such that \({\mathcal {R}}^s \in \Omega \) and therefore \(\varepsilon _{rel}^{{\mathcal {R}}^s} < \varepsilon _{rel}\). Hence, in the s-th iteration of the inner \(\texttt {while}\)-loop we either enter the \(\texttt {if}\)-statement in Step 12 or the \(\texttt {else}\)-statement in Step 29. By our assumption, we do not enter the \(\texttt {else}\)-statement. By Lemma 4.5 we have that \({\tilde{N}} \subseteq \textrm{int}(B) {\setminus } (f(S) + \textrm{int}({\mathbb {R}}^k_+))\) and therefore \({\tilde{N}}\) is nondominated w.r.t. \({\mathcal {N}}\). By the fact that \({\mathcal {R}}^s \in \Omega \) we have that \({\tilde{y}}\in {\mathcal {N}}\) and therefore we enter the \(\texttt {if}\)-statement in Step 13 and subsequently the one in Step 15, a contradiction. \(\square \)

Similar to Algorithm 2 we can prove some decrease in some edge lengths after any iteration of the outer \(\texttt {while}\)-loop.

Theorem 5.9

Let \(\varepsilon _\text {encl}> \delta > 0\), \(\varepsilon _{rel} > 0\), \({\mathcal {R}}^I\) and \(z^\ell , z^u \in {\mathbb {R}}^k\) be the input parameters of Algorithm 5. Moreover, let \(L^\text {start}\) and \(U^\text {start}\) be the local lower and upper bound sets at the beginning of some iteration of Algorithm 5, i.e. at the beginning of the outer \(\texttt {while}\)-loop of that iteration. Accordingly, denote by \(L^\text {end}, U^\text {end}\) the sets at the end of this iteration. Then for any \(\ell ^e \in L^\text {end}\) and any \(u^e \in U^\text {end}\) with \(\ell ^e \le u^e\) there exist \(\ell ^s \in L^\text {start}\) and \(u^s \in U^\text {start}\) such that the following hold:

  1. (1)

    \(\ell ^s \le \ell ^e \le u^e \le u^s\), i.e. the width does not increase during one iteration.

  2. (2)

    There exists an index \(j\in [k]\) such that

    $$\begin{aligned} \left( u^e - \ell ^e\right) _j < \max \, \left\{ \left( u^s - \ell ^s\right) _j - \delta , \varepsilon _\text {encl}\right\} . \end{aligned}$$

Proof

The proof of part (1) works exactly the same as in the proof of Theorem 3.11. We therefore directly show part (2). Assume for a contradiction that there exist \(\ell ^e \in L^\text {end}\) and \(u^e \in U^\text {end}\) such that for any \(\ell \in L^\text {start}\) and \(u\in U^\text {start}\) with \(\ell \le \ell ^e \le u^e \le u\) and any index \(i\in [k]\) we have that

$$\begin{aligned} \left( u^e - \ell ^e\right) _i \ge \max \left\{ \left( u - \ell \right) _i - \delta ,\, \varepsilon _{\text {encl}}\right\} . \end{aligned}$$
(10)

In particular, (10) holds for the ancestors of \(\ell ^e\) and \(u^e\) in \(L^\text {start}\) and \(U^\text {start}\), namely \(\ell ^s\) and \(u^s\). We consider now the point in Algorithm 5, where \(u^s\) is chosen in the \(\texttt {for}\)-loop in Step 6. Recall that we have introduced the sets \(P(\ell ^e)\) and \(P(u^e)\) in the proof of Theorem 3.11. Note that \(u^s\) might not be the first local upper bound from \(U_\text {loop}\) considered in the \(\texttt {for}\)-loop and therefore it might be the case that \(u^s\notin U_\text {current}\), where \(U_\text {current}\) is the current assignment of U. However, since for any \(i\in [k]\) we have that \((u^e - \ell ^e)_i\ge \varepsilon _{\text {encl}}\) we know that for \(\ell ^\prime \in P(\ell ^e) \cap L_\text {current}\) we have that \((u^s - \ell ^\prime )_i \ge \varepsilon _{\text {encl}}\) for any \(i\in [k]\), where \(L_\text {current}\) denotes the current assignment of L. Hence, we have that

$$\begin{aligned} s\left( \ell ^\prime , u^s\right) \ge \varepsilon _{\text {encl}} \end{aligned}$$
(11)

and we therefore enter the \(\texttt {if}\)-statement in Step 7. Similarly, we fix \(u^\prime \in P(u^e) \cap U_\text {current}\). Note that

$$\begin{aligned} \ell ^s \le \ell \le \ell ^e \quad \text {and}\quad u^e \le u \le u^s \end{aligned}$$
(12)

for any \(\ell \in P(\ell ^e) \cap L\) and \(u \in P(u^e) \cap U\) and for any L and U. We denote by \(L_\text {updated}\) and \(U_\text {updated}\) the assignments of L and U at the end of that iteration of the \(\texttt {for}\)-loop. As the exit from the inner \(\texttt {while}\)-loop can happen in three different ways, we have to distinguish three main cases, namely, if we set \(\texttt {done} = \text {true}\) in Step 17 (case A), if we set \(\texttt {done} = \text {true}\) in Step 21 (case B) or if we set \(\texttt {done} = \text {true}\) in Step 31 (case C). Note that Lemma 5.8 guarantees that after finitely many iterations of the inner \(\texttt {while}\)-loop one of the three above options is actually chosen.

  1. (A)

    In that case we entered the \(\texttt {if}\)-statement in Step 15, i.e. there exists \({\tilde{y}} \in f(S)\) with \({\tilde{y}} < u^s - \delta e\). We have to distinguish two cases.

    Case A.1 \({\tilde{y}} < u^\prime \). Then \(u^\prime \) would be removed during the update of \(U_\text {current}\) using Algorithm 1, i.e. \(u^\prime \notin U_\text {updated}\). We have the candidates

    $$\begin{aligned} u^i = \left( {\tilde{y}}_i, u^\prime _{-i}\right) , \quad \text {for } i \in [k], \end{aligned}$$

    from which at least one belongs to \(U_\text {updated}\) by (5). Say \(u^j \in U_\text {updated}\). Using (12), we compute

    $$\begin{aligned} \left( u^e - \ell ^e\right) _j \le \left( u^j - \ell ^e\right) _j \le {\tilde{y}}_j - \ell ^s_j < \left( u^s - \ell ^s\right) _j - \delta , \end{aligned}$$

    a contradiction to (10).

    Case A.2 \({\tilde{y}} \nless u^\prime \). Then there exists \(j \in [k]\) with \(u^\prime _j \le {\tilde{y}}_j\). Again, using (12), we compute

    $$\begin{aligned} \left( u^e - \ell ^e\right) _j \le \left( u^\prime - \ell ^e\right) _j \le {\tilde{y}}_j - \ell ^s_j < \left( u^s - \ell ^s\right) _j - \delta , \end{aligned}$$

    a contradiction to (10).

  2. (B)

    In that case we entered the \(\texttt {if}\)-statement in Step 19, i.e. there exists \(y \in f(S)\) with \(y < u^s - \delta e\). Again, we have to distinguish two cases, which both work exactly the same as the ones from (A).

  3. (C)

    In that case we entered the \(\texttt {else}\)-statement in Step 29, i.e. there exists no \({\tilde{y}} \in {\mathcal {N}}^{{\tilde{S}}}\) with \({\tilde{y}} < u^s - \delta e =: y\). Therefore, the set L is updated w.r.t. y. We have to distinguish two cases.

    Case C.1 \(\ell ^\prime < y\). Then \(\ell ^\prime \) would be removed during the update of \(L_\text {current}\) using Algorithm 1 adapted to local lower bounds, i.e. \(\ell ^\prime \notin L_\text {updated}\). We have the candidates

    $$\begin{aligned} \ell ^i = \left( y_i, \ell ^\prime _{-i}\right) , \quad \text {for } i\in [k], \end{aligned}$$

    from which at least one belongs to \(L_\text {updated}\) by (5). Say \(\ell ^j \in L_\text {updated}\). Using (12), we compute

    $$\begin{aligned} \left( u^e - \ell ^e\right) _j \le \left( u^e - \ell ^j\right) _j \le u^s_j - y_j = \delta < \varepsilon _{\text {encl}}, \end{aligned}$$

    a contradiction to (10).

    Case C.2 \(\ell ^\prime \nless y\). Then there exists \(j\in [k]\) with \(\ell ^\prime _j \ge y_j = u^s_j - \delta \), i.e. particularly \(s(\ell ^\prime , u^s) \le \delta < \varepsilon _{\text {encl}}\), a contradiction to (11).

Obtaining a contradiction in all possible cases shows that our assumption (10) cannot be true and therefore statement (2) is true, which completes the proof. \(\square \)

As in the case of Algorithm 2, Theorem 5.9 enables us to prove finiteness of Algorithm 5.

Theorem 5.10

Let \(\varepsilon _\text {encl}> \delta > 0\), \(\varepsilon _{rel} > 0\), \({\mathcal {R}}^I\) and \(z^\ell , z^u \in {\mathbb {R}}^k\) be the input parameters of Algorithm 5. We define

$$\begin{aligned} \Delta := \big \Vert z^u - z^\ell \big \Vert _\infty \quad \text {and}\quad \kappa := k \left\lceil \frac{\Delta - \varepsilon _\text {encl}}{\delta } \right\rceil +1. \end{aligned}$$

Then the number of iterations of Algorithm 5, i.e. the number of iterations of the outer \(\texttt {while}\)-loop, is bounded by \(\max \,\{1, \kappa \}\). Furthermore, Algorithm 5 terminates after finitely many steps.

Proof

The proof for bounding the number of calls of the outer \(\texttt {while}\)-loop works exactly the same as the one of Theorem 3.12. Termination after finitely many steps follows by the fact that in every iteration of the outer \(\texttt {while}\)-loop, the inner \(\texttt {while}\)-loop is only called finitely many times as shown in Lemma 5.8. \(\square \)

Corollary 5.11

Let \(\varepsilon _\text {encl}> \delta > 0\), \(\varepsilon _{rel} > 0\), \({\mathcal {R}}^I\) and \(z^\ell , z^u \in {\mathbb {R}}^k\) be the input parameters of Algorithm 5. Then after finitely many iterations of the outer \(\texttt {while}\)-loop an enclosure E of the nondominated set \({\mathcal {N}}\) satisfying \(w(E) < \varepsilon _\text {encl}\) is returned.

Proof

Theorem 5.10 tells us that Algorithm 5 terminates after at most \(\kappa \) iterations of the outer \(\texttt {while}\)-loop, i.e. the output set \(E_\kappa \) satisfies \(w(E_\kappa ) < \varepsilon _{\text {encl}}\). By Lemma 5.6 and Lemma 5.7 we know that the set L, resp. U, is a lower, resp. upper, bound set in the sense of Definition 3.1 at any point of Algorithm 5. In particular, this holds for \(L_\kappa \) and \(U_\kappa \). Thus, \(E_\kappa \) is an enclosure of the nondominated set \({\mathcal {N}}\) satisfying \(w(E_\kappa ) < \varepsilon _{\text {encl}}\). \(\square \)

Remark 5.12

Note that one could also use two different off-set factors such that \(0< \delta< {\tilde{\delta }} < \varepsilon _{\text {encl}}\). The idea is that for a relaxed solution \({\tilde{y}}\) we expect the eligible solutions y of (\(\hbox {redMOP}({\tilde{x}}_I)\)) to satisfy \({\tilde{y}} + \rho e < y\) for some sufficiently small \(\rho > 0\). If now \({\tilde{y}}\) satisfies \({\tilde{y}} < u - \delta e\) too tightly, e.g., if \({\tilde{y}} + \rho e \nless u - \delta e\), we do not find any point y with the desired property in Step 19 and therefore would refine the relaxation. This is not a problem in general, as we expect \(\rho \rightarrow 0\) for finer relaxations, i.e. at some point we either do not find any admissible points \({\tilde{y}}\) anymore and therefore declare the search region c(u) to be well enough explored, or we find admissible points y. However, this may be very time-consuming, as we might have to refine the relaxations and repeat the computations a few times. Thus, using two different off-set factors \(\delta \) and \({\tilde{\delta }}\) with \(0< \delta < {\tilde{\delta }}\) may help in terms of run time: we are more restrictive for \({\tilde{y}}\) by requiring \({\tilde{y}} < u - {\tilde{\delta }} e\) in Step 12, whereas for y we only require \(y < u - \delta e\) in Step 19.

We conclude this section with the performance of the described method on the problem of Example 3.14. Again, as for Algorithm 2, we did not implement exactly the procedure given in Algorithm 5, but a slight modification. In fact, we do not iterate through all local upper bounds in every iteration of the outer \(\texttt {while}\)-loop, but choose only one local upper bound \({\hat{u}}\) where the current width is attained, i.e. \({\hat{u}} \in \{u \in U \mid \exists \ell \in L:s(\ell , u) = w(E)\}\). Again, finiteness of this implementation is no longer guaranteed by Theorem 5.10, but if it terminates after finitely many steps, we still have that \(w(E) < \varepsilon _{\text {encl}}\). Again, we use a relative shortest edge calculation as described in Sect. 3. Furthermore, we use a relative relaxation error calculation. For example, given a quadratic term \(h(x) = x^2\) on the interval \(x^\ell \le x \le x^u\), we compute the relative relaxation error via

$$\begin{aligned} \varepsilon _{rel}^{\left( h, {\mathcal {R}}_h^r\right) } = \frac{\left( x^u - x^\ell \right) ^2}{4r^2} \frac{1}{\max \{x^2 \mid x^\ell \le x \le x^u\}}, \end{aligned}$$

where \(r\in {\mathbb {N}}\) is the number of considered equidistant partitions. Note that \(\varepsilon _{rel}^{(h, {\mathcal {R}}_h^r)} \rightarrow 0\) for \(r \rightarrow \infty \), i.e. refining relaxations by increasing the number of equidistant partitions satisfies Assumption 5.1. Furthermore, note that the relaxation \({\mathcal {R}}\) only depends on the number of equidistant partitions, i.e. we identify \({\mathcal {R}}\) by r in the following.
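For a quadratic term, the relative error above and the smallest admissible number of partitions can be computed as in the following sketch of ours (note that membership in \(\Omega \) requires the error to be strictly below the tolerance).

```python
def relative_quadratic_error(xl, xu, r):
    """Relative relaxation error of x**2 on [xl, xu] with r equidistant partitions,
    i.e. the absolute error divided by max{x**2 : xl <= x <= xu}."""
    return (xu - xl) ** 2 / (4 * r ** 2) / max(xl ** 2, xu ** 2)

def partitions_needed(xl, xu, eps_rel):
    """Smallest r whose relative relaxation error is strictly below eps_rel."""
    r = 1
    while relative_quadratic_error(xl, xu, r) >= eps_rel:
        r += 1
    return r
```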

Fig. 6 Computational results on first instance in Example 3.14. Left: enclosure given by LLBs and LUBs of (Circles) obtained by Algorithm 5 with WSM. Right: counter of used degrees of relaxations

Fig. 7 Computational results on second instance in Example 3.14. Left: enclosure given by LLBs and LUBs of (Circles) obtained by Algorithm 5 with WSM. Right: counter of used degrees of relaxations

We consider the same instances of Example 3.14 as in Sect. 3. In Figs. 6, 7 and 8, the respective enclosure computed by Algorithm 5 is depicted on the left, together with the computational time and the number of WSM problems solved, where we count the solution of (\(\hbox {RMOP}_{{\tilde{S}}}\)) and the corresponding (\(\hbox {redMOP}({\tilde{x}}_I)\)) as one. On the respective right-hand side, for each of these solved relaxed problems, we count the corresponding degree of relaxation, in our case the number of equidistant partitions.

  • The first one is determined by \(n=2\), \(r=1\), \(m=3\) as well as \(m^1 = (3,0)^\top \), \(m^2 = (2,1)^\top \) and \(m^3 = (0,3)^\top \). The enclosure computed by Algorithm 5 with relative tolerance \(\varepsilon _{\text {encl}} = 0.0125\) (the equivalent of an absolute tolerance of \(\varepsilon _{\text {encl}} = 0.05\)) and a single off-set factor \(\delta = 0.95 \varepsilon _{\text {encl}}\) is depicted in Fig. 6 (Left). In Fig. 6 (Right) the usage counter of each degree of relaxation is depicted. A relaxation degree equal to 1 corresponds to the McCormick relaxation with no extra partition of the intervals, i.e. the intervals are not partitioned at all. A relaxation degree equal to 2 corresponds to one additional breakpoint per considered interval, i.e. the original intervals are partitioned into two subintervals. One can see that the method uses at most four partitions per interval, but mostly one or two. We used a tolerance for the relative relaxation error of \(\varepsilon _{rel} = 0.01\). This requires at least five partitions per variable, i.e. a degree of relaxation equal to five, to satisfy the relaxation error criterion. We can see that the method does not need to go all the way it is allowed to, since four partitions seem to be enough. Of course, if we increase the tolerance of the relative relaxation error, the number of needed partitions decreases.

  • The second one is determined by \(n=3\), \(r=1\), \(m=3\) as well as \(m^1 = (1,0,0)^\top \), \(m^2 = (0,1,0)^\top \) and \(m^3 = (0,0,1)^\top \). The enclosure computed by Algorithm 5 with relative tolerance \(\varepsilon _{\text {encl}} = 0.05\) (this is equivalent to an absolute tolerance of \(\varepsilon _{\text {encl}} = 0.1\)) and a single off-set factor \(\delta =0.95\varepsilon _{\text {encl}}\) is depicted in Fig. 7 (Left). In Fig. 7 (Right) the usage counter of each degree of relaxation is depicted. Again we used a relative relaxation error tolerance of \(\varepsilon _{rel}=0.01\), which yields a minimum of five partitions per variable to obtain an error smaller than the tolerance. In particular, this means that if the current number of partitions is greater than five, we do not need to refine anymore. We can see that the method makes extensive use of that, i.e. in most of the iterations a relaxation satisfying the relaxation error criterion is chosen. In fact, the method decides early in the solution process that coarse relaxations do not suffice to terminate the method. Together with the fact that the degree of relaxation is inherited via the parent history of local upper bounds, this means that only a few problems are solved with relaxation degrees 1 and 2. The method then sticks for some time with relaxation degree 4, while it needs relaxation degree 8 for a large number of problems. In Fig. 8 we have the same setting as in Fig. 7, but do not require a strict inclusion in Step 10 of Algorithm 5. One can see the effect in an increase in computational time and in the number of problems to be solved. Furthermore, on the right one can observe that the algorithm tries to stick longer with a coarse relaxation before actually going for the refinement step (see the increased number of problems with relaxation degrees 1 and 4).

Fig. 8 Computational results on second instance in Example 3.14 with no strict inclusion in Step 10 of Algorithm 5. Left: enclosure given by LLBs and LUBs of (Circles) obtained by Algorithm 5 with WSM. Right: counter of used degrees of relaxations

Comparing the number of problems to be solved, and especially the computational time, with the results from Sect. 3, one can see that the power of available solvers like SCIP for quadratically constrained problems makes the use of relaxations redundant in some cases, e.g. the ones considered above. However, with increasing complexity of the problems one might gain an advantage by only considering relaxations instead of the original problem, as can be seen in the next section.

6 Application to the multiobjective optimization of decentralized energy supply networks

In this section, we present numerical results of the described method on a network optimization problem. Modeling a decentralized energy supply network leads to an MIQCP problem. The general network structure is a graph, where the nodes represent individual consumers and the edges connect the consumer nodes with the so-called source node, where energy is supplied. The mixed-integer character comes from certain decision options available in the optimization process, e.g., whether a gas pipe is laid at some edge or not. Furthermore, we take stationary models of energy flow into account, namely an equation based on Ohm's law for the electricity flow as well as the Darcy–Weisbach equation for the gas flow. As both of them contain bilinear or quadratic terms, the resulting optimization problem has the mentioned MIQCP structure. As objective functions, we use the overall costs for realizing a given network plan on the one hand and the carbon emissions of that network plan on the other hand. Naturally, a cheap network plan results in high carbon emissions, and low carbon emissions can be obtained by, e.g., investing in energy-efficient house renovation, which results in higher costs. Thus, we have a classical (MOP) with two conflicting objective functions. Details regarding the modeling aspects can be found in [38] and more recently in [19].

For the present paper, we consider three network instances of such decentralized energy supply networks, namely:

 

                 network 1   \(\subset \)   network 2   \(\subset \)   network 3
  # Nodes               12                       20                       39
  # Binaries           108                      189                      360
  # Variables          484                      829                     1570
  # Constraints        620                     1064                     2014

If, e.g., we set up a single objective optimization problem with cost minimization as the objective function and move the carbon emissions to the constraints, we obtain the following computational times

  • network 1: 0.57 s

  • network 2: 3.35 s

  • network 3: > 3 h,

using the SCIP solver with the standard settings from the pyscipopt package (cf. [41]).

For testing the new method we use a relative width tolerance \(\varepsilon _{\text {encl}} = 0.03\) as well as two off-set factors \({\tilde{\delta }} = 0.95 \varepsilon _{\text {encl}}\) and \(\delta = 0.8 \varepsilon _{\text {encl}}\). In the network models the present nonlinearities are of the following form:

  • For modeling the low-voltage energy flow we use for instance

    $$\begin{aligned} \text {R}^\text {e}_{i,j} f^\text {e}_{\text {in},i,j}&= a^\text {e} u_j {\bar{u}}_{i,j}, \end{aligned}$$
    (13)

    where \(\text {R}^\text {e}_{i,j}>0\) denotes the resistance of the underlying cable at arc (ij), \(a^\text {e}>0\) the calorific multiplier of three-phase electric power flows, \(f^\text {e}_{\text {in},i,j}\) is a variable representing the electric power flow on arc (ij) into j. The variable \(u_i\) denotes the electrical voltage at node i and the variable \({\bar{u}}_{i,j} = u_i - u_j\) the voltage drop on arc (ij). Consequently, we have a quadratic term \(u_i^2\) and a bilinear term \(u_i u_j\) appearing in (13). For the computation of the corresponding relative relaxation errors the box constraints \(360 \le u_i, u_j \le 440\) are relevant. Thus, if we partition the corresponding intervals into r equidistant intervals, i.e. use the relaxation \({\mathcal {R}}^r\), we obtain

    $$\begin{aligned} \varepsilon _{rel}^{{\mathcal {R}}^r} = \frac{80^2}{4r^2} \frac{1}{\max \{x^2 \mid 360 \le x \le 440\}}, \end{aligned}$$

    and therefore the number of partitions of each interval to fall below a given tolerance \(\varepsilon _{rel}\) is given by

    $$\begin{aligned} r = \left\lceil \frac{40}{440} \frac{1}{\sqrt{\varepsilon _{rel}}}\right\rceil . \end{aligned}$$

    Thus, if we require a relative relaxation error \(\varepsilon _{rel} = 0.01\) we have to partition the corresponding intervals into at least \(r = 1\) partitions, i.e. we do not have to partition at all.

  • For modeling low-pressure gas supply we use a reformulation of the Darcy–Weisbach equation avoiding the use of the \(\text {sign}\)-function as proposed in [8]. By doing so, we obtain for instance

    $$\begin{aligned} \text {R}^\text {g}_{i,j} {\bar{q}}_{ij}^2 \le {\bar{p}}_{\max } y_{i,j}, \end{aligned}$$
    (14)

    where \(\text {R}^\text {g}_{i,j}>0\) denotes the resistance constant of the underlying gas pipeline on arc (ij), \({\bar{q}}_{ij}\) the gas flow on arc (ij), \({\bar{p}}_{\max }>0\) the maximal pressure loss allowed in the network as well as a binary decision variable \(y_{i,j}\) indicating if a gas pipe is laid at arc (ij). The relevant box constraints are \(-150 \le {\bar{q}}_{ij} \le 150\) and partitioning into r intervals, i.e. using relaxation \({\mathcal {R}}^r\), we obtain

    $$\begin{aligned} \varepsilon _{rel}^{{\mathcal {R}}^r} = \frac{300^2}{4r^2} \frac{1}{\max \{x^2 \mid -150\le x \le 150\}}, \end{aligned}$$

    and therefore the number of partitions of each interval to fall below a given tolerance \(\varepsilon _{rel}\) is given by

    $$\begin{aligned} r = \left\lceil \frac{1}{\sqrt{\varepsilon _{rel}}} \right\rceil . \end{aligned}$$

    Thus, if we require a relative relaxation error \(\varepsilon _{rel} = 0.01\) we have to partition the corresponding intervals into at least \(r=10\) partitions. Note that even if we just require a relative relaxation error \(\varepsilon _{rel} = 0.03\) we still have to partition into at least \(r=6\) intervals.

In sum, this yields that—if we use r equidistant partitions for any variable appearing in any nonlinear term of our problem—we fall below a relaxation error tolerance of \(\varepsilon _{rel}=0.01\) as soon as we use a relaxation \({\mathcal {R}}^r\) with \(r\ge 10\). Note that if we did not use the adaptive approach given in Algorithm 5, but chose a relaxation with \(r\ge 10\) and then used a method for solving multiobjective linear mixed-integer problems we would have to solve a problem with at least \(10 \cdot \vert \text {Edges}\vert \) additional integer variables, i.e. in the case of network 3 about 400 if we just use the ones for the quadratic terms.
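The minimal partition numbers stated above can be reproduced from the ceiling formulas in the two bullet points; the following small sketch of ours simply evaluates them for the stated box constraints.

```python
import math

def min_partitions(width, max_sq, eps_rel):
    """Partitions needed so that the relative error width**2 / (4 r**2 max_sq)
    falls below eps_rel, following the ceiling formulas above."""
    return math.ceil(width / (2 * math.sqrt(eps_rel * max_sq)))

print(min_partitions(440 - 360, 440 ** 2, eps_rel=0.01))   # electricity flow: 1 (no partitioning)
print(min_partitions(300, 150 ** 2, eps_rel=0.01))         # gas flow, eps_rel = 0.01: 10
print(min_partitions(300, 150 ** 2, eps_rel=0.03))         # gas flow, eps_rel = 0.03: 6
```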

Looking at the results for network 3 (see Fig. 9; the results for network 1 and network 2 are similar) we can see that the method uses only the relaxation \({\mathcal {R}}^r\) with \(r=1\), i.e. the coarsest relaxation possible using the McCormick relaxations. This shows the power of the proposed method dealing with the considered large network instances.

Fig. 9 Computational results on network 3. Left: enclosure given by LLBs and LUBs of (Circles) obtained by Algorithm 5 with WSM. Right: counter of used degrees of relaxations

7 Conclusion

In the present work, a general MIQCP problem is considered and two novel methods for computing an enclosure of the nondominated set are presented. For both of them, we proved correctness and finite termination and demonstrated their respective advantages and disadvantages. The implementation of the second approach is currently only able to deal with bilinear and quadratic terms. However, general polynomial terms could be handled while still relying on McCormick relaxations. For general nonlinear terms, one has to resort to a more elaborate relaxation technique as presented in, e.g., [11]. However, these are only implementation issues. As long as the relaxation technique satisfies Assumption 5.1, the theoretical results presented in this paper still apply.