Cooperative multiobjective optimization with bounds on objective functions

When solving large-scale multiobjective optimization problems, solvers can get stuck because of memory and/or time limitations. In such cases, one is left with no information on the distance from the best feasible solution found before the optimization process stopped to the true Pareto optimal solution. In this work, we show how to provide such information. To this aim, we make use of the concept of lower shells and upper shells, developed in our earlier works. No specific assumptions about the problems to be solved are made. We illustrate the proposed approach on biobjective multidimensional knapsack problems derived from single-objective multidimensional knapsack problems in the Beasley OR Library. We address cases in which a top-class commercial mixed-integer linear solver fails to provide the Pareto optimal solutions that scalarization attempts to derive.


Introduction
The ubiquitous access to ever more powerful computing platforms stimulates the appetite to solve ever larger optimization problems. However, various barriers (time and/or memory limits, "the curse of dimensionality" in combinatorial problems) mean that in practice one is often left with suboptimal solutions. In such cases, the existence of lower and upper bounds on the exact solution is essential. The multiobjective optimization literature is rich with various approaches to this issue, cf. [14,15] for brief overviews.
This work has been inspired by the fact that when solving large-scale multiobjective optimization (MO) problems, it can happen that Pareto optimal (efficient) solutions are not derived. The reasons for that can be twofold. When making use of heuristics, such as Evolutionary Multiobjective Optimization (EMO) or other methods, Pareto optimality of the derived solutions is not guaranteed. When making use of exact methods, such as mixed-integer programming (MIP), solvers can reach memory limits, and, less often, time limits, before the optimal solution is found. In both cases, one is left with a feasible solution which may be believed to be Pareto optimal or close to Pareto optimality, but no information is offered about how wide the actual gap (the Pareto suboptimality gap) is between the image (in the objective space) of the derived feasible solution and the image (in the objective space) of the set of Pareto optimal solutions (the Pareto front).
To cope with such unfavorable situations and to provide the lacking information, here we employ the general methodology developed for calculating bounds on objective function values of implicit Pareto optimal solutions [12,14,15,17,18]. The central concept of this methodology is to build two-sided approximations of the Pareto front (PF). To this aim, two sets of elements are produced: one composed of feasible solutions, called a lower shell, and another composed of feasible or infeasible solutions, called an upper shell. The images of these sets form, respectively, an internal and an external approximation of the PF. No element of the lower shell needs to be Pareto optimal.
The lower and upper shells we make use of are composed of elements (points, solutions) only. No relationship between them (in the form of constructed lines, triangles, polygons, etc.), except nondominance, is assumed. Following the terminology introduced by Ruzika and Wiecek [28], they are 0-order approximations of PFs, whereas, e.g., the bound sets [8] or the attainment function [9] are of an order higher than 0.
In the literature, two general approaches to approximate the PF are present. The first approach, based on exact optimization methods, aims at building approximating constructs starting from a number of efficient solutions derived by an exact optimization method. From a vast literature on that approach one can quote [28] (a survey paper), [1,8,11] and [20].
The second approach, based on inexact, mostly population-based optimization methods (cf., e.g., [2,4-6,10,25-27,29,30]), aims at producing discrete feasible approximations (lower shells) of Pareto fronts. In this approach, no guarantee is offered that the resulting approximations include actual elements of the PF. In consequence, the actual behavior of such methods can be observed only on test problems with known sets of Pareto optimal solutions.
The methodology we refer to [12,15,17,24] follows the second approach. However, in contrast to other methods consistent with that approach, it provides two-sided (lower and upper) bounds on objective function values of Pareto optimal solutions. By this, the behavior of inexact methods can be put under control. Moreover, responding to the needs of Multiple Criteria Decision Making, the methodology abstains from approximating the whole Pareto front and takes a more pragmatic course. Namely, it produces ad hoc local approximations of the Pareto front, only in regions of explicit decision maker interest. This provides large savings by forgoing needless computations.
As the said methodology works only for a special case (see Sect. 4), in this work, we extend this methodology to the general MO setting.
The outline of the work is as follows. In Sect. 2, we recall the multiobjective problem formulation in its most general form and recall a method for Pareto optimal solution derivation, which is instrumental for the subsequent development. In Sect. 3, we present our development of general lower and upper bounds and in Sect. 4, we discuss a special case. Section 5 addresses the issue of derivation of upper shells. We illustrate the development in Sect. 6 on large-scale biobjective multidimensional knapsack problems. Section 7 concludes.
By the "only if" part of this result, no Pareto optimal solution is a priori excluded from being derived by solving an instance of optimization problem (2). In contrast to that, maximization of a weighted sum of objective functions over X 0 does not possess, in general (and especially in the case of problems with discrete variables), this property (cf., e.g., [12], pp. 37-38, [19], pp. 54-56, [7], p. 121, [23], p. 78).
At first glance, the objective function in (2) seems to be difficult to handle. However, problem (2) is equivalent to

min s
subject to: s ≥ λ_l ((y*_l − f_l(x)) + ρ Σ_{j=1,…,k} (y*_j − f_j(x))), l = 1, …, k, x ∈ X_0.   (3)

In the sequel, we will assume that Pareto optimal solutions are derived by solving problem (2) with varying λ_l, l = 1, …, k.
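The scalarization above can be illustrated on a toy problem. The following sketch (not code from the paper; all data are made up) solves problem (2) with ρ = 0 by enumeration over a small discrete set of objective-space images, assuming maximization of both objectives and a reference point y* dominating all images:

```python
# Sketch of the Chebyshev scalarization (2) with rho = 0, solved by
# enumeration over a small discrete feasible set (illustrative data).

def chebyshev_value(f_x, y_star, lam):
    """Objective of (2) for rho = 0: max_l lam_l * (y*_l - f_l(x))."""
    return max(l * (ys - fx) for l, ys, fx in zip(lam, y_star, f_x))

def solve_scalarization(images, y_star, lam):
    """Return the image minimizing the Chebyshev objective."""
    return min(images, key=lambda f_x: chebyshev_value(f_x, y_star, lam))

# Images f(x) of a toy feasible set X_0 (both objectives maximized).
images = [(1, 9), (4, 7), (6, 6), (8, 3), (9, 1), (3, 3)]
y_star = (9, 9)                       # reference point dominating all images
best = solve_scalarization(images, y_star, (0.5, 0.5))
print(best)                           # (6, 6), a Pareto optimal image
```

Varying the weights λ steers the solution along the Pareto front, which is how the bounds in the sequel are parameterized.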
Fig. 1 With this lower bound, f(x^Popt(λ)) is located somewhere in the dotted area

Lower bounds
The lower bound we developed in our earlier works [12] is general, i.e., it is valid for any problem of the form (1). Here, for completeness' sake, we present the underlying arguments. To simplify the presentation, we assume here ρ = 0 (cf. [16], pp. 13-15, for the derivation of the formula for lower bounds with ρ > 0). In this case, x^Popt(λ) can be weakly efficient.
In contrast to single-objective problems, where any feasible solution yields a lower bound for the optimal value of the objective function, in MO, where the notion of Pareto optimal solutions (elements of N) replaces the notion of optimal solutions and the set f(N) is not a singleton, this is not the case. Suppose for a while that k = 2. Since x^Popt(λ) is a solution to (2) and thus Pareto optimal, a given solution x̄ of problem (1) cannot dominate it; hence f_l(x^Popt(λ)) ≥ f_l(x̄) must hold for at least one l ∈ {1, 2}, but it is not known for which. Similar observations are valid for any k ≥ 2. Lower bounds provided on f(x^Popt(λ)) by elements of X_0 are thus of a disjunctive type and therefore not constructive.
To propose constructive, i.e., conjunctive bounds, one has to exploit the fact that Pareto optimal solutions are derived, as assumed, by solving problem (2).
The components L_l(λ, {x̄}), l = 1, …, k, of the element at which the compromise half line intercepts the set {y | y ∈ R^k, y_l ≤ f_l(x̄), l = 1, …, k} (as illustrated in Fig. 1 for k = 2) are such lower bounds: no x whose image lies in that set can be a solution to (2), since any such solution yields a greater value of the objective function of (2) than that attained at x̄. Thus, L(λ, {x̄}) is the vector of lower bounds for the corresponding components of f(x^Popt(λ)).
Given multiple solutions x̄ ∈ X_0, for each coordinate the maximal lower bound is selected. Solutions which are dominated by other elements of X_0 clearly provide lower bounds which are lower than or equal to the lower bounds provided by the dominating solutions. Therefore, to avoid redundancy, it is reasonable to work with solutions which do not dominate one another, and this is formalized by the notion of a lower shell.
A lower shell to problem (1) is a finite nonempty set S_L ⊆ X_0, elements of which satisfy

x', x'' ∈ S_L ⇒ neither x' dominates x'' nor x'' dominates x'.   (4)

The general formula for lower bounds with ρ ≥ 0 is derived in [12], pp. 92-94, and [16], pp. 13-15; for ρ = 0 it takes the form

L_l(λ, S_L) = max { L̃_l, max_{x̄ ∈ S_L} ( y*_l − (1/λ_l) max_{j=1,…,k} λ_j (y*_j − f_j(x̄)) ) }, l = 1, …, k,   (5)

where L̃_l are lower bounds implied by some a priori information on f_l(x^Popt(λ)), if available. If such lower bounds are not available, then one can simply set L̃_l = −∞. U_l(λ), which enter the general formula, are known upper bounds on f_l(x^Popt(λ)) (e.g., when one can infer these bounds from properties of problem (1)). If U_l(λ) are not known, then one can simply set U_l(λ) = y*_l.
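The lower-bound mechanism for ρ = 0 can be sketched in a few lines (illustrative code, not from the paper; L̃_l = −∞ is assumed, and the data reuse the toy images from the example above):

```python
# Sketch: build a lower shell by nondominance filtering, then compute,
# per coordinate l, the bound y*_l - s(x_bar)/lam_l for each shell
# element x_bar, where s(x_bar) is its Chebyshev objective value, and
# keep the maximal bound per coordinate (formula (5) with rho = 0).

def lower_shell(images):
    """Keep only mutually nondominated images (maximization)."""
    def dominates(q, p):
        return all(qi >= pi for qi, pi in zip(q, p)) and q != p
    return [p for p in images if not any(dominates(q, p) for q in images)]

def lower_bounds(shell, y_star, lam):
    """Componentwise lower bounds on f(x_Popt(lam)) from a lower shell."""
    s = lambda fx: max(lj * (yj - fj) for lj, yj, fj in zip(lam, y_star, fx))
    return [max(y_star[l] - s(fx) / lam[l] for fx in shell)
            for l in range(len(y_star))]

shell = lower_shell([(1, 9), (4, 7), (6, 6), (8, 3), (9, 1), (3, 3)])
print(lower_bounds(shell, (9, 9), (0.5, 0.5)))  # [6.0, 6.0]
```

Here the shell contains the image (6, 6) of the Pareto optimal solution for λ = (0.5, 0.5), so both bounds are tight; with poorer shells the bounds are correspondingly looser.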

Upper bounds
An upper shell 1 to problem (1) is a finite nonempty set S_U ⊆ R^n, elements of which satisfy

x', x'' ∈ S_U ⇒ neither x' dominates x'' nor x'' dominates x',   (6)
x ∈ S_U ⇒ no element of N dominates x.   (7)

Not every element of an upper shell provides an upper bound on a component of f(x^Popt(λ)). To do so, elements of the upper shell have to be appropriately located with respect to the lower bound L(λ, S_L).
Let S U be an upper shell to problem (1).

Proposition 1 x̄ ∈ S_U provides an upper bound for some f_l(x^Popt(λ)) only if it is appropriately located with respect to the lower bound L(λ, S_L); otherwise, nothing precludes f_l(x^Popt(λ)) ≥ f_l(x̄) for one or several l.

Remark 1
It is worth observing that some Pareto optimal solutions can also satisfy the conditions of Proposition 1. This relates to the development of [12], where lower and upper bounds were proposed with shells (subsets of the set of Pareto optimal solutions), a concept in which lower and upper shells coincide. Later on, with applications to space sampling algorithms (e.g., evolutionary ones) in mind [14,17,24], to clearly delineate feasible and infeasible sampling spaces, we adopted the condition S_U ⊆ R^n \ X_0. However, for the sake of generality, here we stick to the definition of upper shell given above.

The special case
Lower and upper bounds were originally developed ([12], pp. 92-102, [16], pp. 13-17) for the special case when at x^Popt(λ), the optimal solution to (3) (and hence also to (2)), all constraints of (3) are active, i.e., for ρ = 0,

λ_1(y*_1 − f_1(x^Popt(λ))) = ⋯ = λ_k(y*_k − f_k(x^Popt(λ))).

In other words, f(x^Popt(λ)) lies at the apex of the contour of the objective function of (2). As said, the lower bounds developed in the three references cited above are valid for the general case; only the upper bounds are case specific.
Given x̄ ∈ S_U, by the definition of S_U, the part of the half line emanating from y* (the locus of apexes of the contours of the objective function of (2)) on which f(x^Popt(λ)) can be located is bounded by f(x̄), as illustrated in Fig. 2. The formula for upper bounds for this case can be found in the three references cited above.

Derivation of upper shells
To make use of Proposition 2 we need S U . Usually it is not known which part of R n should be searched for elements of S U . However, upper shells can be derived from some relaxations of problem (1), as shown by Proposition 3.
Consider a relaxation of problem (1), i.e., problem (1) with X_0 replaced by a set containing it, with the set of efficient elements N̄.

Proposition 3 Any finite nonempty subset A of N̄ is an upper shell to problem (1).

Proof
Elements of N̄ do not dominate one another; thus A satisfies condition (6). Since the feasible set of the relaxation contains X_0, no element of N dominates an element of N̄; thus A satisfies condition (7). In consequence, A is an upper shell. Figure 3 gives a schematic illustration of Proposition 3. The significance of Proposition 3 is that it paves a way to derive upper shells in an algorithmic way. Its practical usefulness depends on whether the relaxed problems can be solved to optimality in a fraction of the time spent on deriving incumbents of the original problems not solved to optimality.
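Proposition 3 can be checked numerically on a toy instance (all data below are made up for illustration): relaxing a small biobjective knapsack problem by dropping its constraint enlarges the feasible set, and no efficient image of the original problem dominates an efficient image of the relaxation:

```python
# Illustrative check of Proposition 3: the efficient images of a
# relaxation (here, the knapsack constraint is dropped) are dominated
# by no efficient image of the original problem, i.e., condition (7).
from itertools import product

def dominates(q, p):
    """q dominates p (maximization): q >= p componentwise and q != p."""
    return all(qi >= pi for qi, pi in zip(q, p)) and q != p

def nondominated(pts):
    pts = list(pts)
    return {p for p in pts if not any(dominates(q, p) for q in pts)}

c1, c2 = [4, 3, 5], [2, 6, 4]          # two profit vectors (made up)
a, b = [3, 4, 5], 7                    # one knapsack constraint

def image(x):
    return (sum(ci * xi for ci, xi in zip(c1, x)),
            sum(ci * xi for ci, xi in zip(c2, x)))

X0 = [x for x in product((0, 1), repeat=3)
      if sum(ai * xi for ai, xi in zip(a, x)) <= b]
X_rel = list(product((0, 1), repeat=3))    # relaxation: constraint dropped

N = nondominated(image(x) for x in X0)          # efficient images, original
N_rel = nondominated(image(x) for x in X_rel)   # efficient images, relaxation

assert not any(dominates(q, p) for p in N_rel for q in N)
```

In practice the relaxations are continuous or surrogate problems solved by an exact method, not enumerations, but the dominance relation being verified is the same.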

Numerical experiments
We apply the development of Sect. 5 to cases in which large-scale multiobjective multidimensional knapsack problems are not solved to optimality by an optimization solver. It is well known (see, e.g., [21]) that the knapsack problem is NP-hard. The multidimensional knapsack problem, being its generalization, is also NP-hard, and so is the multiobjective multidimensional knapsack problem. It is also known that the difficulty of solving this problem with an MIP solver depends on the number of decision variables as well as on the structure of the constraint set (see, e.g., [3]).
In this work, we say that the multiobjective multidimensional knapsack problem is a large-scale problem if it cannot be solved to optimality by a highly specialized MIP solver within a reasonable memory or time limit. We show how in such cases the lacking information about the size of the Pareto suboptimality gap can be retrieved.
The multiobjective multidimensional knapsack problem has the form

max f_l(x) = Σ_{j=1,…,n} c_{l,j} x_j, l = 1, …, k,
subject to: Σ_{j=1,…,n} a_{i,j} x_j ≤ b_i, i = 1, …, m,   (10)

where x_j ∈ {0, 1}, j ∈ {1, …, n}, and all a_{i,j}, c_{l,j} are nonnegative.
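A minimal evaluator for an instance of (10) can look as follows (a sketch with made-up data, not one of the Beasley OR Library instances):

```python
# Sketch: feasibility check and objective evaluation for a small
# instance of the multiobjective multidimensional knapsack problem (10).

def feasible(x, A, b):
    """All m knapsack constraints satisfied."""
    return all(sum(aij * xj for aij, xj in zip(row, x)) <= bi
               for row, bi in zip(A, b))

def objectives(x, C):
    """Vector of k objective function values."""
    return tuple(sum(cij * xj for cij, xj in zip(row, x)) for row in C)

A = [[2, 3, 4], [1, 5, 2]]     # m = 2 constraint rows (made up)
b = [6, 6]
C = [[4, 3, 5], [3, 5, 4]]     # k = 2 profit rows (made up)

x = (1, 0, 1)
assert feasible(x, A, b)
print(objectives(x, C))        # (9, 7)
```

Real instances of course have n in the hundreds and are handed to an MIP solver rather than evaluated point by point.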

Solving multidimensional knapsack problems on the NEOS platform
To solve multiobjective mixed-integer linear problems (3) (weights λ_l are real numbers, hence variable s, the objective function in (3), assumes real values) we have to rely on MIP solvers. Our choice was CPLEX, because it is renowned as an effective and robust MIP optimizer. Moreover, it is accessible free of charge, e.g., via the NEOS platform (https://neos-server.org/neos/solvers/lp:CPLEX/LP.html). Running optimization problems on an open-access platform ensures perfect result reproducibility. At the time of the numerical experiments, this platform ran CPLEX version 12.6.2.0. With NEOS, jobs are terminated whenever a time limit of 8 CPU hours or a memory limit of 3 GB is reached. However, as NEOS provides no output log when one of these limits is reached, we set the memory limit to 2.048 GB 2.

Multidimensional knapsack test problems
For test problems, we selected single-objective multidimensional knapsack problems from the Beasley OR Library (http://people.brunel.ac.uk/mastjjb/jeb/info.html), collected in [3], and we extended them to biobjective problems (cf. Sect. 6.3). In these problems, b_i = α Σ_{j=1,…,n} a_{i,j}, where α is the tightness ratio, equal to 0.25 for the first ten, 0.50 for the next ten, and 0.75 for the remaining ten problems in a group.
To prepare input data for CPLEX we converted all the 270 knapsack problems in the Beasley OR Library to the LP format. We made the converted problems publicly available via the web 3 .
With CPLEX running on NEOS, we were able to solve to optimality almost all single-objective problems from the (5, 500) and (30, 100) groups, a few problems from the (10, 250) group, and no problem from the (10, 500), (30, 250) and (30, 500) groups. The detailed results are given in [13]. In each case, the suboptimality gap 4 is defined by the following formula

subopt. gap = 100 · (current MIP best bound − objective function value for incumbent) / (objective function value for incumbent),

and all those values are provided in CPLEX logs (where the suboptimality gap is termed gap).
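The gap formula above is a one-liner; the following sketch (illustrative values, not log data from the experiments) assumes a maximization problem, so the best bound is at least the incumbent value:

```python
# Sketch of the suboptimality gap reported in CPLEX logs as "gap"
# (maximization: best bound >= incumbent objective value).

def suboptimality_gap(best_bound, incumbent_value):
    """Relative gap between the MIP best bound and the incumbent, in %."""
    return 100.0 * (best_bound - incumbent_value) / incumbent_value

print(suboptimality_gap(105.0, 100.0))  # 5.0
```

A gap of 0 certifies optimality of the incumbent; a positive gap quantifies how far, at worst, the incumbent can be from the single-objective optimum.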

Biobjective multidimensional knapsack problems
We generated biobjective multidimensional knapsack (BMK) problems (10) from Chu and Beasley problems. For each original Chu and Beasley problem we added the second objective function obtained by permuting the coefficients of the first.
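The construction of the second objective can be sketched as follows (the seed and the resulting permutation are illustrative; they are not the ones used for the published instances):

```python
# Sketch: build the second objective of a BMK problem as a fixed
# pseudorandom permutation of the first objective's coefficients.
import random

def second_objective(c1, seed=0):
    """Return a permuted copy of the coefficient vector c1."""
    c2 = list(c1)
    random.Random(seed).shuffle(c2)   # deterministic for a fixed seed
    return c2

c1 = [10, 40, 30, 50]
c2 = second_objective(c1)
assert sorted(c1) == sorted(c2)       # same multiset of coefficients
```

Permuting rather than regenerating the coefficients keeps the two objectives equally scaled while making them conflict, which is what produces a nontrivial Pareto front.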
To solve BMK problems constructed in that manner, we made use of transformation (3). In the case of knapsack problems, the resulting problems are MIP problems, and thus they can be handled by CPLEX.
To simplify modifications of the input LP files, reflecting varying weights λ_l, we introduced two unconstrained (free) variables f_1 and f_2, equal to the first and the second objective function, respectively; both variables are clearly nonnegative. With that convention, the transformed problems have the following form, consistent with the LP format:

min s
subject to:
λ_1((y*_1 − f_1) + ρ((y*_1 − f_1) + (y*_2 − f_2))) ≤ s,
λ_2((y*_2 − f_2) + ρ((y*_1 − f_1) + (y*_2 − f_2))) ≤ s,
f_l − Σ_{j=1,…,n} c_{l,j} x_j = 0, l = 1, 2,
Σ_{j=1,…,n} a_{i,j} x_j ≤ b_i, i = 1, …, m,
x_j ∈ {0, 1}, j = 1, …, n.   (11)

We set ρ = 0.001. In order to make y* a relatively good approximation of ŷ, we maximized each objective function of problem (10) separately, setting the relative MIP gap tolerance to 1%. We set y*_l to the optimal values of the respective objective functions derived with this tolerance. All these problems were solved in seconds. Obviously, ŷ_l ≤ y*_l. Hence, the use of y* was legitimate.
We made all 270 transformed BMK problems (in the LP format) publicly available via the web 5. For each problem in the set, λ_1 and λ_2 were set to the default value 0.5. For different settings of λ_l, only the first two constraints in (11) have to be modified accordingly.

Solving biobjective multidimensional knapsack problems on the NEOS platform
We attempted to solve selected biobjective multidimensional problems (in the form (11) and with λ_1 = λ_2 = 0.5), namely the first, the eleventh and the twenty-first problem in each of the groups (10, 500), (30, 100), (30, 250) and (30, 500). For a given problem and λ, let x^inc(λ) denote the incumbent. The results are given in Table 1. In all problems not solved to optimality, the memory limit was reached.
We also attempted to solve one of these problems with as much memory as was available on NEOS. We selected for that the first problem from the (10, 500) group and set the memory limit to 10 GB 6. Even in this case the problem was not solved to optimality, but a better (in terms of the value of the objective function s) incumbent was found; see the results in Table 2.

Bounds at work
In this section, we consider BMK problems, formulated as in (11), which we were not able to solve to optimality because CPLEX reached the memory limit. For example, this is the case for biobjective multidimensional knapsack problems from group (10, 500) of single-objective problems. It is the first group of BMK problems (counting from low dimensions up) in which we were not able to solve to optimality any BMK problem formulated as in (11) with CPLEX run on NEOS, because of the memory limit (we checked this for λ_1 = λ_2 = 0.5, cf. Sect. 6.4). Hence, for problems of these dimensions or higher, complementing CPLEX (or any other MIP solver) with a method providing lower and upper bound information on Pareto optimal solutions is of utmost importance. However, having no access to the CPLEX code, we can do this only in the form of an external procedure.

Table 2 Results of solving the first problem from the (10, 500) group with the extended memory limit 10 GB; x^inc time: conservative estimate of time to reach the incumbent, read from the CPLEX logs; λ = (0.5, 0.5)

Moreover, it is well known that MIP solvers derive relatively quickly incumbents close to optimal solutions (in terms of objective function values), or even optimal solutions but without the proof of optimality. Small incremental improvements, or proving (by implicit enumeration) optimality of the incumbent, usually consume the majority of the computing time. The existence of lower and upper bounds on objective function values of Pareto optimal solutions allows one to decide whether a satisfactory solution within an acceptability range has already been derived and, in consequence, allows one to stop the optimization process early.
We applied lower bounds (formula (5)) and upper bounds (Proposition 2) to selected BMK problems which were not solved to optimality on the NEOS platform because the memory limit was reached. In formula (5), we set L̃_l = 0 (as the absolute lower bound on the values of all objective functions of problem (10) is 0) and U_l(λ) = y*_l, l = 1, 2. The values of y*_l for the considered problems are shown in Table 3.
At first glance, it seems impossible to populate upper shells, since the efficient set N, the object to be determined, is unknown. However, this can be done with the help of relaxations of problem (1) (see [15]). In this work, we take advantage of the fact that an MIP solver, when solving problem (11), also solves a series of its relaxations. To populate S_U, we used the MIP best bound on the objective function s (resulting from a problem relaxation), provided by CPLEX, and we derived the corresponding values of objective functions f_1 and f_2, denoted f^bb_1(λ) and f^bb_2(λ), by solving the two equations (cf. (3))

s^bb(λ) = λ_l((y*_l − f^bb_l(λ)) + ρ((y*_1 − f^bb_1(λ)) + (y*_2 − f^bb_2(λ)))), l = 1, 2,

where s^bb(λ) is the current MIP best bound and f^bb(λ) = (f^bb_1(λ), f^bb_2(λ)).

Remark 2 Let x^bb ∈ R^n be such that f(x^bb) = f^bb(λ). No element of N dominates x^bb. Indeed, if an element of N dominated x^bb, then it is immediate to show that this element would produce an s smaller than the s^bb(λ) corresponding to f^bb(λ), a contradiction to x^bb defining the MIP best bound. If so, x^bb is a valid upper shell element.
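Recovering f^bb(λ) from s^bb(λ) amounts to solving a 2 × 2 linear system in the slacks u_l = y*_l − f_l. The sketch below (illustrative, with made-up numbers) does this for k = 2; with ρ = 0 it reduces to f_l = y*_l − s^bb/λ_l:

```python
# Sketch: recover f_bb(lam) from the MIP best bound s_bb by solving
#   s_bb = lam_l*((y*_l - f_l) + rho*((y*_1 - f_1) + (y*_2 - f_2))),
# l = 1, 2, as a 2x2 linear system in u_l = y*_l - f_l (Cramer's rule).

def f_from_best_bound(s_bb, y_star, lam, rho=0.0):
    a11, a12 = lam[0] * (1 + rho), lam[0] * rho
    a21, a22 = lam[1] * rho,       lam[1] * (1 + rho)
    det = a11 * a22 - a12 * a21
    u1 = (s_bb * a22 - a12 * s_bb) / det
    u2 = (a11 * s_bb - a21 * s_bb) / det
    return (y_star[0] - u1, y_star[1] - u2)

# With rho = 0 this is simply f_l = y*_l - s_bb / lam_l:
print(f_from_best_bound(2.0, (10.0, 8.0), (0.5, 0.5)))  # (6.0, 4.0)
```

Note that f^bb(λ) is a point in the objective space only; no decision vector x^bb attaining it needs to be known for it to serve as an upper shell image, per Remark 2.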
At this point, we have no method to ensure condition (7) other than requesting that elements x of an upper shell satisfy f(x) ≥ f^bb(λ) componentwise, x^bb being clearly the best choice.
By Remark 2, it is guaranteed that x^bb is a legitimate element of an upper shell, e.g., of the one-element upper shell {x^bb}, but it may or may not satisfy the conditions of Proposition 2. However, x^bb may satisfy these conditions for x^Popt(λ') derived with λ' ≠ λ. Along this line, for problems not solved to optimality for a given λ̄ and with the conditions of Proposition 2 not satisfied, one should attempt to solve another two problems with λ' and λ'', where λ'_1 > λ̄_1, λ'_2 < λ̄_2 and λ''_1 < λ̄_1, λ''_2 > λ̄_2. These two problems, even if not solved to optimality, produce (f^bb_1(λ'), f^bb_2(λ')) and (f^bb_1(λ''), f^bb_2(λ'')), which may yield valid upper bounds for f(x^Popt(λ̄)). If successful, these actions can be repeated with another two values of λ closer to λ̄, in the hope of getting tighter upper bounds. However, since the problems being solved are discrete, there is a range of λ around λ̄ in which all elements of the range yield the same f(x^Popt(λ)). Thus, there is a limit of upper bound tightness, specific to each individual problem.
It is clear that if f^bb_1(λ') is an upper bound for f_1(x^Popt(λ̄)) and f^bb_2(λ'') is an upper bound for f_2(x^Popt(λ̄)), then bounds derived for weight vectors closer to λ̄ can only tighten them. However, the last observation does not apply to problems with k ≥ 3. We experimented with the first problems (formulated as in (11)) from the (10, 500), (30, 250) and (30, 500) groups (as seen in Table 1, the first problem from the (30, 100) group was solved to optimality). The results of the computations are presented in Table 4. The Pareto suboptimality gap is defined as the vector G_Psub = (G_Psub,1, …, G_Psub,k), where

G_Psub,l = 100 · (U_l(λ, S_U) − L_l(λ, S_L)) / L_l(λ, S_L), l = 1, …, k.

For each problem in Table 4, the lower shell S_L was expanded after each new incumbent was derived. After deriving x^inc((0.5, 0.5)), S_L = {x^inc((0.5, 0.5))}. After deriving subsequent incumbents for other weight vectors, S_L was extended accordingly, subject to condition (4); cf. Table 4 and Table 5.
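The Pareto suboptimality gap vector is computed componentwise from the bound vectors (a sketch with illustrative numbers; following the analogy with the single-objective gap, the denominator is taken as the lower bound, which is strictly positive for knapsack profits):

```python
# Sketch: componentwise Pareto suboptimality gap, in %, from the
# vectors of upper bounds U and lower bounds L (L assumed positive).

def pareto_subopt_gap(U, L):
    return [100.0 * (u - l) / l for u, l in zip(U, L)]

print(pareto_subopt_gap([105.0, 210.0], [100.0, 200.0]))  # [5.0, 5.0]
```

A small gap in every component certifies that the incumbent's image is close to the (unknown) Pareto optimal image in all objectives at once, which is exactly the stopping information the solver logs do not provide.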
In Fig. 4 (for the first problem from group (30, 500)), one can see the images (in the objective space) of incumbents (marked by rectangles) derived for the set of lambda vectors Λ = {λ̄, λ', λ''}. Circles represent images of upper shells derived for the set Λ. Filled rectangles represent images of incumbents derived for a second, two-element set of lambda vectors Λ̃, and bullets represent images of upper shells derived for Λ̃. The bottom left corner of the dashed rectangle determines the vector of lower bounds on the components of f(x^Popt(λ̄)), and the upper right corner the vector of upper bounds on the components of f(x^Popt(λ̄)). One can also see that the set of incumbents derived for λ ∈ Λ ∪ Λ̃ forms a lower shell. The union of the one-element upper shells derived for λ ∈ Λ ∪ Λ̃ forms, in turn, an upper shell.
It is worth noting here that the branch-and-bound based method for deriving approximations of the efficient set (hence, lower shells) of the multiobjective multidimensional knapsack problem presented in [22] also exploits the concept of relaxation to derive upper shells. However, in that approach, elements of an upper shell are used only to control the derivation of elements of the efficient set approximation.

Final remarks
When it comes to solving large-scale optimization problems, the concept of deriving the whole Pareto front loses its appeal and becomes simply impractical. At most, the Pareto front can be navigated by solving instances of problem (2) with various λ (a method to interpret λ in decision-making terms was given in [12,17,19]). Then, as suggested by Proposition 2, a number of upper shell elements can mutually and cooperatively provide upper bounds, and these bounds can be strengthened by deriving new upper shell elements.
In this work, we illustrated the proposed approach on pure combinatorial optimization problems. However, it can be applied to other classes of problems, e.g., to mixed-integer quadratic programming problems. The cardinality-constrained Markowitz portfolio selection problem is another example of a practical decision-making problem whose large-scale instances can be treated with the help of the proposed approach. The strength of the proposed development, as well as its weakness, lies in its generality (no assumption about problem (1) has been used). Though we present here experiments with biobjective problems, the very principle applies to any number of objectives. However, working efficiently with more than two objectives requires more sophisticated methods to populate upper shells. This will be the subject of further research.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.