Proximity measures based on KKT points for constrained multi-objective optimization

An important aspect of optimization algorithms, for instance evolutionary algorithms, is the use of termination criteria that measure the proximity of the found solution to the optimal solution set. A frequently used approach is the numerical verification of necessary optimality conditions such as the Karush–Kuhn–Tucker (KKT) conditions. In this paper, we present a proximity measure which characterizes the violation of the KKT conditions. It can be computed easily and is continuous in every efficient solution. Hence, it can be used as an indicator for the proximity of a certain point to the set of efficient (Edgeworth–Pareto-minimal) solutions, and its continuity properties make it well suited for algorithmic use. This is especially useful within evolutionary algorithms for candidate selection and termination, which we also illustrate numerically for some test problems.


Introduction
In applications, one often has to deal with not only one but multiple objectives at the same time. This leads to multi-objective optimization problems. The aim is then to find globally optimal solutions, called efficient solutions, for such optimization problems using optimization algorithms, e.g., evolutionary algorithms as proposed in [6].
Especially evolutionary algorithms are often considered to be able to overcome regions with only locally efficient solutions and to generate points close to the global Pareto front, i.e., close to the image set of all globally efficient solutions. An important aspect in such algorithms is then the question when the algorithm can finally be stopped as one is sufficiently close to the Pareto front. For that decision, for instance in [7] and [10], proximity measures for termination have been examined by Deb, Dutta and co-authors. We follow this line of research for multi-objective problems, and we do this without the detour of scalarization.
A necessary condition for being efficient is to satisfy necessary optimality conditions, at least under certain constraint qualifications. This can also be used, based on the results of this paper, to evaluate the proximity of the generated points to the Pareto front: a necessary condition for being close to the Pareto front is that certain necessary optimality conditions are satisfied at least approximately.
In this paper we present two proximity measures which characterize the approximate fulfillment of the so-called Karush–Kuhn–Tucker (KKT) conditions. They can be used for candidate selection and, moreover, as a termination criterion for evolutionary algorithms, similar to what was proposed in [7] and [10]. In particular, proximity measures can be used in addition to already implemented techniques within evolutionary algorithms. We will show that the presented proximity measures are continuous in every efficient solution. Thereby we only assume that the objective and constraint functions are continuously differentiable and that some constraint qualifications hold. The continuity implies that when the algorithm produces points which come closer and closer to the Pareto front, the values of the measure also decrease continuously to zero.
Of course, proximity measures can also be used for deterministic algorithms. Within iterative approaches, an abort has to be done after a finite number of iterations. Hence, an approach to check whether the solution found by the algorithm is actually an efficient solution or at least an approximately efficient solution is needed in this case as well.
Finding exact KKT points can be a hard task, in particular when using computer algorithms. A common approach to handle this problem is to relax the KKT conditions and we will also do this within this paper. Within the last decade, several concepts for such a relaxation have been presented. An often followed approach is to provide a sequence of points that satisfy some relaxed KKT conditions and to show convergence of this sequence towards a KKT point.
One such relaxation is given by the Approximate-KKT (AKKT) conditions, which were presented in 2011 in [3] for single-objective optimization problems. They are satisfied for a feasible point if there exists a sequence of Lagrange multipliers and (not necessarily feasible) points, called an AKKT sequence, along which the KKT error decreases to 0 in the limit. This concept was extended to multi-objective optimization problems in [14] and [12].
The idea to use a relaxed version of KKT points to define proximity measures (also called error measures) appeared more recently in [10] for single-objective optimization problems. There, the relaxations of KKT points are the so-called ε-KKT and modified ε-KKT points. Their definition no longer relies on a sequence of points but only on a single feasible point. The authors propose a proximity measure which is based on this relaxation and hence on the KKT error. One possibility to use this approach also for multi-objective optimization is to first scalarize the vector-valued optimization problem into a single-objective optimization problem, which was discussed in [7], [2], and [1]. Within this paper, we avoid the detour of a scalarization. The concept of (modified) ε-KKT points has also been extended to multi-objective optimization problems without using scalarization in [9] and [18], but no proximity measure was derived so far.
In all these papers the authors only show that an AKKT sequence or a convergent sequence of modified ε-KKT points with ε → 0 indeed converges towards a KKT point. However, in practice, for instance within an evolutionary algorithm, one generates a sequence for which one does not know whether it consists of modified ε-KKT points with ε → 0 or is an AKKT sequence. Still, one wants to know if the points of the sequence come close to a KKT point. With the results of this paper we give a necessary condition for that. Thus, if the proximity measure is too large, the points of the generated sequence are for sure not close to an efficient solution and the algorithm should not be stopped.
Thereby, it is important that the proximity measure is continuous in the efficient solutions. Otherwise, the value of the measure could be large for points being arbitrarily close to an efficient solution and could drop to zero only in the efficient solution itself.
So while in the literature for AKKT points and (modified) ε-KKT points it is shown that a sequence with certain properties converges to a KKT point, we will show that a sequence that converges to an efficient solution (which is a KKT point under constraint qualifications) has certain properties. Thus, for our new proximity measures in this paper we will show that they converge (and decrease to zero) for any such sequence.
As a consequence, the value of the presented proximity measures can be used as an indicator for non-efficiency as a point with positive value for the proximity measure is definitely not an efficient solution. Hence, for evolutionary algorithms, a small value of the proximity measure could be added as a criterion within candidate selection or as an additional termination criterion. In other words, we will show that a small value of the proximity measure is a necessary condition for being close to an efficient solution.
For the remaining part of this paper, we start in Sect. 2 with some notations and definitions as well as the problem formulation, and we recall basic results on necessary optimality conditions. In Sect. 3, we briefly discuss the concept of modified ε-KKT points from [10] and [18]. Then, we introduce our new proximity measures for multi-objective optimization problems. Finally, in Sect. 4, some numerical results for the proposed proximity measure for candidate selection are presented.

Notations and basic definitions
Within this paper, for a positive integer n ∈ N we use the notation [n] := {1, …, n}. For a differentiable function f : R^n → R^m, the Jacobian matrix of f at x ∈ R^n is denoted by Df(x). Now let x, x̄ ∈ R^n. Then ≤ and < are meant component-wise, i.e.,

x ≤ x̄ ⇔ x_i ≤ x̄_i for all i ∈ [n],  and  x < x̄ ⇔ x_i < x̄_i for all i ∈ [n].

We focus on multi-objective optimization problems with inequality constraints. We denote by f_i : R^n → R, i ∈ [m] the objective functions and by g_j : R^n → R, j ∈ [p] the constraint functions. We also write f = (f_1, …, f_m) and g = (g_1, …, g_p). The multi-objective optimization problem of this paper is then defined by

min f(x) subject to g(x) ≤ 0.  (CMOP)

We denote the feasible set by S = {x ∈ R^n | g(x) ≤ 0} and assume that it is nonempty. Moreover, for x ∈ R^n the active index set is I(x) := {j ∈ [p] | g_j(x) = 0}. All functions f_i, g_j for i ∈ [m], j ∈ [p] are assumed to be continuously differentiable. Optimality for (CMOP) is defined as follows.
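To fix the notation in code, the following minimal sketch (helper names are our own, not from the paper) implements the feasibility test x ∈ S and the active index set I(x) for constraints g(x) ≤ 0, with a hypothetical constraint function g for illustration.

```python
import numpy as np

def is_feasible(x, g, tol=1e-9):
    """x lies in S iff g_j(x) <= 0 for all j (up to a numerical tolerance)."""
    return bool(np.all(np.asarray(g(x)) <= tol))

def active_set(x, g, tol=1e-9):
    """Active index set I(x) = { j : g_j(x) = 0 }, detected within tol."""
    return [j for j, gj in enumerate(np.asarray(g(x))) if abs(gj) <= tol]

# Hypothetical constraints: g_1(x) = x_1^2 + x_2^2 - 1 and g_2(x) = -x_1
g = lambda x: np.array([x[0] ** 2 + x[1] ** 2 - 1.0, -x[0]])
```

At the boundary point (1, 0) only the first constraint is active, so `active_set` returns `[0]`.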
Definition 2.1 A point x̄ ∈ S is called an efficient or an Edgeworth–Pareto-minimal solution for (CMOP) if there exists no x ∈ S with f(x) ≤ f(x̄) and f(x) ≠ f(x̄). A point x̄ ∈ S is called a weakly efficient or a weakly Edgeworth–Pareto-minimal solution for (CMOP) if there exists no x ∈ S with f(x) < f(x̄).

Every efficient solution x̄ ∈ S is also weakly efficient. If all objective functions f_i, i ∈ [m] are strictly convex and the set S is convex, then every weakly efficient solution x̄ ∈ S is also an efficient solution.
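Both efficiency notions compare objective vectors component-wise. As a small illustration (the function names are ours), the corresponding dominance tests read as follows: a point y cannot be efficient if some feasible x satisfies the first test, and not weakly efficient if some feasible x satisfies the second.

```python
import numpy as np

def dominates(fx, fy):
    """f(x) dominates f(y): f(x) <= f(y) component-wise with at least one
    strict inequality -- this excludes y from being efficient."""
    fx, fy = np.asarray(fx), np.asarray(fy)
    return bool(np.all(fx <= fy) and np.any(fx < fy))

def strictly_dominates(fx, fy):
    """f(x) < f(y) in every component -- this excludes y from being
    weakly efficient."""
    return bool(np.all(np.asarray(fx) < np.asarray(fy)))
```

Strict dominance implies dominance, mirroring the fact that every efficient solution is weakly efficient.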
In unconstrained single-objective optimization, i.e., for S = R^n and m = 1, a well-known necessary optimality condition is ∇f(x̄) = 0. For necessary optimality conditions in constrained optimization, i.e., for the KKT conditions, it is known that a constraint qualification has to hold at an optimal solution x̄ to guarantee that the necessary optimality conditions are satisfied. Several constraint qualifications are used in the literature. We use here the Abadie Constraint Qualification (Abadie CQ) as it is quite general and as it is implied by other well-known constraint qualifications such as the LICQ or the MFCQ, as explained below.
Thus, we shortly recall the definition of the contingent cone as well as the linearized contingent cone to the set S at some point x̄ ∈ cl(S). So let S ⊆ R^n be a nonempty set as defined above and x̄ ∈ cl(S). Then the contingent cone to S at x̄ is given as

T(S, x̄) := {d ∈ R^n | ∃ (x^k)_{k∈N} ⊆ S, (t_k)_{k∈N} ⊆ int(R_+) with x^k → x̄, t_k → 0, and (x^k − x̄)/t_k → d}.

Moreover, the linearized (contingent) cone is given by

T_lin(S, x̄) := {d ∈ R^n | ∇g_j(x̄)ᵀ d ≤ 0 for all j ∈ I(x̄)}.

It always holds that T(S, x̄) ⊆ T_lin(S, x̄). We say that the Abadie CQ holds for some x̄ ∈ S if T(S, x̄) = T_lin(S, x̄).
In case of affine linear constraint functions g_j, j ∈ [p], the Abadie CQ is satisfied for all x ∈ S. Often stronger constraint qualifications than the Abadie CQ are used, as those are easier to verify. One of them is the Mangasarian–Fromovitz Constraint Qualification (MFCQ). We say that the MFCQ is satisfied for some x̄ ∈ S if there exists a direction d ∈ R^n such that ∇g_j(x̄)ᵀ d < 0 for all j ∈ I(x̄). Another well-known constraint qualification is the Linear Independence Constraint Qualification (LICQ), which is satisfied for some x̄ ∈ S if the gradients ∇g_j(x̄), j ∈ I(x̄), are linearly independent. For these constraint qualifications it holds that

LICQ ⇒ MFCQ ⇒ Abadie CQ.

Finally, if all constraint functions g_j, j ∈ [p] are convex, another useful constraint qualification is Slater's Constraint Qualification (Slater's CQ). It is satisfied if there exists some x* ∈ S such that g(x*) < 0. If Slater's CQ is satisfied, then the Abadie CQ holds for all feasible points x̄ ∈ S. In this paper, in general neither the objective functions f_i, i ∈ [m] nor the constraint functions g_j, j ∈ [p] are assumed to be convex.

For x ∈ R^n, η ∈ R^m, and λ ∈ R^p consider the following conditions:

(KKT1) Σ_{i∈[m]} η_i ∇f_i(x) + Σ_{j∈[p]} λ_j ∇g_j(x) = 0,
(KKT2) g(x) ≤ 0,
(KKT3) λ ≥ 0,
(KKT4) λ_j g_j(x) = 0 for all j ∈ [p],
(KKT5) η ≥ 0,
(KKT6) Σ_{i∈[m]} η_i = 1.

If (x, η, λ) satisfies (KKT1)–(KKT6), then (x, η, λ) is called a KKT point.

Theorem 2.2 Let x̄ ∈ S be a weakly efficient solution for (CMOP) in which the Abadie CQ holds. Then there exist η ∈ R^m_+ and λ ∈ R^p_+ such that (x̄, η, λ) is a KKT point.

Note that no convexity is required for the necessary optimality condition given in Theorem 2.2. One obtains similar necessary optimality conditions by making use of a weighted sum scalarization of (CMOP). However, then convexity is needed: in case of convex objective and constraint functions, for any weakly efficient solution x̄ ∈ S there exist weights η_i ∈ R_+, i ∈ [m], which also satisfy (KKT6), such that x̄ also minimizes the weighted sum problem

min_{x∈S} Σ_{i∈[m]} η_i f_i(x).  (2.1)

The necessary optimality conditions for (2.1) are similar to the KKT conditions for (CMOP) above, but they only hold in case of convexity. We illustrate the difference with the next example.
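Given candidate multipliers, the multi-objective KKT system (stationarity, feasibility, complementarity, nonnegative multipliers, and normalized η) can be checked numerically. The helper below is our own sketch, not from the paper; it returns the largest violation, so a value of (numerically) zero certifies a KKT point.

```python
import numpy as np

def kkt_violation(x, eta, lam, jac_f, jac_g, g):
    """Largest violation of the multi-objective KKT system at (x, eta, lam):
    stationarity  sum_i eta_i grad f_i(x) + sum_j lam_j grad g_j(x) = 0,
    feasibility   g(x) <= 0,
    complementarity  lam_j g_j(x) = 0 for all j,
    sign conditions  eta >= 0, lam >= 0,  and  sum_i eta_i = 1."""
    eta = np.asarray(eta, float)
    lam = np.asarray(lam, float)
    gx = np.asarray(g(x), float)
    stationarity = np.linalg.norm(jac_f(x).T @ eta + jac_g(x).T @ lam, np.inf)
    feasibility = max(float(gx.max()), 0.0)
    complementarity = float(np.abs(lam * gx).max())
    sign = max(float(-eta.min()), float(-lam.min()), 0.0)
    normalization = abs(float(eta.sum()) - 1.0)
    return max(stationarity, feasibility, complementarity, sign, normalization)
```

For a hypothetical bi-objective problem f(x) = (x_1² + x_2², (x_1 − 1)² + x_2²) with the single constraint g_1(x) = −x_2, the point (0.5, 0) with η = (0.5, 0.5) and λ = 0 is a KKT point, while (0.5, 0.3) is not.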

Example 2.3
The point x̄ is an efficient solution for the non-convex multi-objective optimization problem under consideration. There exist no weights w_1, w_2 ≥ 0, w_1 + w_2 = 1, such that x̄ minimizes the weighted-sum scalarization of this multi-objective optimization problem. However, there exist multipliers η̄ ∈ R²_+ and λ̄ ∈ R³_+ such that (x̄, η̄, λ̄) is a KKT point, i.e., the KKT conditions for (CMOP) are satisfied. This illustrates that for the KKT conditions for (CMOP) no convexity assumption is required.
In the special case of unconstrained multi-objective optimization problems, the KKT conditions reduce to

(KKT1') Σ_{i∈[m]} η_i ∇f_i(x) = 0.

In this particular setting, x ∈ R^n is also called Pareto critical if there exists η ∈ R^m such that (x, η) satisfies (KKT1'), (KKT5), and (KKT6), see [13].
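Pareto criticality can be tested by minimizing the norm of the convex combination of gradients over the unit simplex; the value is zero exactly when suitable multipliers η exist. The sketch below is our own construction (not from [13]), using the squared norm for smoothness.

```python
import numpy as np
from scipy.optimize import minimize

def criticality_measure(x, jac_f):
    """min over eta in the unit simplex of || sum_i eta_i grad f_i(x) ||.
    The value is zero exactly when x is Pareto critical, i.e., when
    (KKT1'), (KKT5), and (KKT6) admit a solution."""
    Df = np.atleast_2d(jac_f(x))                         # m x n Jacobian
    m = Df.shape[0]
    obj = lambda eta: float(np.sum((Df.T @ eta) ** 2))   # squared norm (smooth)
    res = minimize(obj, np.full(m, 1.0 / m),
                   bounds=[(0.0, 1.0)] * m,
                   constraints=[{"type": "eq", "fun": lambda e: e.sum() - 1.0}],
                   method="SLSQP")
    return float(np.sqrt(max(res.fun, 0.0)))
```

For the hypothetical unconstrained bi-objective problem f(x) = (x_1² + x_2², (x_1 − 1)² + x_2²), every point on the segment between the two individual minimizers is Pareto critical, while points off the segment are not.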
Only for giving a sufficient optimality condition for (CMOP) do we need assumptions on the convexity of the problem. Recall that a continuously differentiable function h : R^n → R is called pseudoconvex if for all x, y ∈ R^n with ∇h(x)ᵀ(y − x) ≥ 0 it holds that h(y) ≥ h(x). It is called quasiconvex if it holds for all x, y ∈ R^n and λ ∈ [0, 1] that

h(λx + (1 − λ)y) ≤ max{h(x), h(y)}.

Every continuously differentiable convex function is also pseudoconvex and quasiconvex. Using these convexity concepts, we obtain a sufficient optimality condition which can also be found in [17, Corollary 7.24].

Lemma 2.4 Let x̄ ∈ S, let f_i be pseudoconvex for all i ∈ [m], and let g_j be quasiconvex for all j ∈ I(x̄). If there exist η ∈ R^m_+ and λ ∈ R^p_+ such that (x̄, η, λ) is a KKT point, then x̄ is a weakly efficient solution for (CMOP).

In case the functions f_i, i ∈ [m] are even strictly convex, this is sufficient for x̄ to be efficient for (CMOP).
Again, we want to point out that for the remaining part of this paper there are no convexity assumptions concerning the functions f i , i ∈ [m] and g j , j ∈ [p] unless otherwise stated. Finally, we recall a basic result on the convergence of sequences which we need in the proof of our main result later on.

Lemma 2.5 Let a sequence (x^i)_{i∈N} ⊆ R^n and a point x̄ ∈ R^n be given. If every subsequence of (x^i)_{i∈N} contains a further subsequence which converges to x̄, then the whole sequence (x^i)_{i∈N} converges to x̄.

Proof Assume that (x^i)_{i∈N} does not converge to x̄. Then there exist some ε > 0 and a subsequence (x^{i_k})_{k∈N} with ‖x^{i_k} − x̄‖ ≥ ε for all k ∈ N. This subsequence contains no further subsequence converging to x̄, in contradiction to the assumption of the lemma. Hence, the assumption does not hold and (x^i)_{i∈N} converges to x̄.

Proximity measures
As motivated in the introduction, we want to provide a proximity measure which characterizes the fulfillment of a necessary optimality condition. Moreover, it should be easy to compute and provide good numerical properties, as we want to use it with optimization algorithms, e.g., evolutionary algorithms. First, we present the desired properties of such a proximity measure in the next definition.

Definition 3.1 A function ω : R^n → R is called a proximity measure if for every efficient solution x̄ ∈ S of (CMOP) in which the Abadie CQ holds and every sequence (x^i)_{i∈N} ⊆ R^n with lim_{i→∞} x^i = x̄ the following three properties are satisfied:

(PM1) ω(x) ≥ 0 for all x ∈ R^n,
(PM2) ω(x̄) = 0,
(PM3) lim_{i→∞} ω(x^i) = 0 = ω(x̄).

While properties (PM1) and (PM2) are quite easy to realize, property (PM3) is more challenging. It ensures that a proximity measure ω is continuous at least in every efficient solution x̄ ∈ S of (CMOP) in which the Abadie CQ holds. Hence, we can expect a proximity measure to have small values locally around every efficient solution. This makes it suitable for applications, e.g., for termination or candidate selection in evolutionary algorithms. Convergence statements also appear in [10] and [18]. However, these statements only hold for certain sequences (x^i)_{i∈N} or rely on stronger assumptions, e.g., convexity of the objective and constraint functions. Whenever a function ω is a proximity measure in the sense of Definition 3.1, then property (PM3) holds for any sequence (x^i)_{i∈N} (converging towards an efficient solution x̄ in which the Abadie CQ holds).
Next to the three properties from Definition 3.1, we also aim to provide a non-trivial proximity measure, i.e., not to choose ω ≡ 0, as such a proximity measure would provide no additional information for optimization algorithms. In the introduction of [7], the authors presented an example for a naively defined candidate ω̂ for a proximity measure based on the KKT conditions to motivate their further examinations. We shortly recall the numerical example from [7] which illustrates that this function does not satisfy property (PM3) in general. This also shows that more effort is needed to find a suitable proximity measure, i.e., a function that is not only based on the KKT conditions but also satisfies the properties from Definition 3.1.
Example 3.2 We consider the constrained multi-objective optimization problem (P1) and a sequence of feasible points (x_α) ⊆ S. Using (3.1), we derive that the only efficient solution in this sequence is x_0 = (0.2, 0)ᵀ, in which also the Abadie CQ is satisfied. With the Euclidean norm in the definition of ω̂, we obtain that there exists a sequence (x_α) ⊆ S and an efficient solution x_0 such that lim_{α→0} ω̂(x_α) ≠ 0 = ω̂(x_0), so that property (PM3) is not satisfied and ω̂ is not a proximity measure. In Fig. 1, it can be seen that ω̂ is not continuous at the efficient solution x_0 ∈ S of (P1). This is exactly what we want to prevent and why we introduced property (PM3) in Definition 3.1. It is also important to notice that the function values ω̂(x_α), α ≠ 0, are monotonically increasing for α decreasing to 0. Hence, the closer we get to the efficient solution x_0, the higher the function value gets, whereas a proximity measure should return values close to zero.
In [7], the authors used a scalarization approach to further investigate proximity measures for (CMOP). Moreover, they did not check the properties (PM1), (PM2), and (PM3). In the remaining part of this section, we first recall a proximity measure from the literature for single-objective optimization problems. After that, in Sect. 3.2, we introduce a new proximity measure for the multi-objective problem (CMOP) based on the results from the single-objective case. This new measure is not based on a scalarization approach. Finally, another new proximity measure with a simpler structure will be introduced in Sect. 3.3. That proximity measure is especially useful for optimization algorithms as it can be computed by solving only a linear optimization problem.

Single-objective case
For m = 1, the problem (CMOP) is a constrained single-objective optimization problem which we denote by (CSOP). In the single-objective case, it is not common to use the term efficient solution for an optimal solution, but to use the term minimal solution. For the remaining part of Sect. 3.1 we follow this convention.
As already mentioned in the introduction, in general it is a hard task to compute exact KKT points. Therefore, different relaxations have been presented in the literature. One of these relaxations is given by the so-called modified ε-KKT points. This concept was introduced in [10]. Modified ε-KKT points can be seen as a relaxation of KKT points where a small deviation from the KKT conditions is allowed. In [10], the objective and constraint functions were not assumed to be differentiable but only Lipschitz continuous, and the definitions used Clarke's subdifferential, see [4]. With the assumptions of our paper, this leads to the following definition.
Definition 3.3 Let x ∈ S be a feasible point for (CSOP) and ε > 0. If there exist x̃ ∈ R^n and λ ∈ R^p_+ with

‖x̃ − x‖ ≤ √ε,  ‖∇f(x̃) + Σ_{j∈[p]} λ_j ∇g_j(x̃)‖ ≤ √ε,  and  Σ_{j∈[p]} λ_j |g_j(x)| ≤ ε,

then x is called a modified ε-KKT point for (CSOP).
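Assuming the modified ε-KKT conditions take the standard form from [10] (√ε bounds on the KKT stationarity error and on ‖x̃ − x‖, and an ε bound on the complementarity term — an assumption, since our copy elides the exact inequalities), the conditions can be verified for given candidates x̃ and λ as follows (helper names are ours).

```python
import numpy as np

def is_modified_eps_kkt(x, x_tilde, lam, eps, grad_f, jac_g, g):
    """Verify the (assumed) modified eps-KKT conditions for (CSOP):
    sqrt(eps) bounds on the KKT error and on ||x_tilde - x||,
    eps bound on the relaxed complementarity term."""
    lam = np.asarray(lam, float)
    stat = grad_f(x_tilde) + jac_g(x_tilde).T @ lam      # stationarity residual
    ok_stat = np.linalg.norm(stat) <= np.sqrt(eps)
    ok_dist = np.linalg.norm(np.asarray(x_tilde) - np.asarray(x)) <= np.sqrt(eps)
    ok_comp = float(lam @ np.abs(g(x))) <= eps           # relaxed complementarity
    return bool(ok_stat and ok_dist and ok_comp)
```

For the hypothetical problem min (x_1 − 1)² + x_2² with g_1(x) = −x_1, the minimizer (1, 0) passes the test with x̃ = x and λ = 0 for any ε > 0, while a point far from it fails for small ε.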
Based on this concept, the authors of [10] introduced a candidate for a proximity measure that we present in the following definition.

Definition 3.4 Define a function ω̄ : R^n → R based on modified ε-KKT points by

ω̄(x) := min { ε ≥ 0 | ∃ x̃ ∈ R^n, λ ∈ R^p_+ : ‖x̃ − x‖ ≤ √ε, ‖∇f(x̃) + Σ_{j∈[p]} λ_j ∇g_j(x̃)‖ ≤ √ε, Σ_{j∈[p]} λ_j |g_j(x)| ≤ ε, g_j(x) ≤ ε for all j ∈ [p] }.

The proximity measure in its formulation in [10] is only defined on the feasible set S. To match our Definition 3.1 of proximity measures, we added the constraints g_j(x) ≤ ε, j ∈ [p]. First, we show that properties (PM1) and (PM2) are satisfied for ω̄.

Lemma 3.5 The function ω̄ satisfies (PM1) and (PM2).
Proof By definition we have ω̄(x) ≥ 0 for all x ∈ R^n and hence, property (PM1) holds. Now let x̄ ∈ S be a minimal solution for (CSOP) in which the Abadie CQ holds. Then property (PM2) is satisfied by Theorem 2.2.
As already mentioned, our focus in this paper is on property (PM3). While this property has not been examined in [10], in the following theorem we show that it does indeed hold for ω̄.

Theorem 3.6 The function ω̄ satisfies property (PM3).
Proof Let x̄ ∈ S be a minimal solution for (CSOP) in which the Abadie CQ holds and let (x^i)_{i∈N} ⊆ R^n be a sequence of points with lim_{i→∞} x^i = x̄. We are interested in the sequence (ω̄(x^i))_{i∈N} and aim to apply Lemma 2.5. Let (x^{i_k})_{k∈N} be a subsequence of (x^i)_{i∈N}, which will be denoted by (x^p)_{p∈N}, i.e., (ω̄(x^p))_{p∈N} is a subsequence of (ω̄(x^i))_{i∈N}. We now construct a subsequence of (x^p)_{p∈N}, and hence of (ω̄(x^p))_{p∈N}, and show that ω̄ converges to 0 on this subsequence.

As x̄ is a minimal solution for (CSOP), by Theorem 2.2 there exists λ̄ ∈ R^p_+ such that (x̄, 1, λ̄) is a KKT point. Hence, (KKT1), (KKT2), and (KKT4) hold. Now let ε > 0. All functions g_j, j ∈ [p] and the objective function f are continuously differentiable. Hence, every composition of those continuous functions and their continuous derivatives is continuous itself. So there exist δ¹_ε > 0, δ²_ε > 0, and δ³_ε > 0 such that the corresponding continuity estimates hold for all x ∈ R^n with ‖x − x̄‖ bounded by the respective δ^k_ε. Now define δ_ε := min{δ¹_ε, δ²_ε, δ³_ε} > 0. Then for all x ∈ R^n with ‖x − x̄‖ ≤ δ_ε, the estimates (3.4), (3.5), (3.6), and (3.3) yield (3.7). As (x^p)_{p∈N} converges to x̄, there exists p_ε ∈ N with ‖x^p − x̄‖ ≤ δ_ε for all p ≥ p_ε. Hence, for all p ≥ p_ε we obtain from (3.7) that ε, x̃ = x^p, and λ = λ̄ are feasible for the optimization problem defining ω̄(x^p) as given in Definition 3.4. This implies that ω̄(x^p) ≤ ε. Now let (ε_n)_{n∈N} ⊆ int(R_+) be a monotonically decreasing sequence with lim_{n→∞} ε_n = 0. Then for all n ∈ N there exists p_n = p_{ε_n} ∈ N such that ω̄(x^{p_n}) ≤ ε_n. Moreover, without loss of generality, we can assume that p_{n+1} > p_n for all n ∈ N. This is just a result of what was discussed above. So there exists a subsequence (x^{p_n})_{n∈N}. As ε_n → 0, using the squeeze theorem and (PM2), this leads to lim_{n→∞} ω̄(x^{p_n}) = 0 = ω̄(x̄).

Overall, for every subsequence (ω̄(x^{i_k}))_{k∈N} = (ω̄(x^p))_{p∈N} of (ω̄(x^i))_{i∈N} there exists a subsubsequence (ω̄(x^{i_{k_l}}))_{l∈N} (described by (ω̄(x^{p_n}))_{n∈N} above) with lim_{l→∞} ω̄(x^{i_{k_l}}) = 0 = ω̄(x̄).

In the following, we give a brief comparison of our results to those in [10]. Then, in the next section, we extend the results to the case with multiple objectives. The main difference is that in [10] not the proximity measure ω̄ was examined but (modified) ε-KKT points and sequences of (modified) ε-KKT points. One of their main results is [10, Theorem 3.6]. For that, let (ε_i)_{i∈N} ⊆ R_+ be a sequence which converges to 0 and let (x^i)_{i∈N} ⊆ S be a sequence of feasible points which converges to x̄. The theorem states that if a certain CQ holds in x̄ ∈ R^n and, most of all, if x^i is a modified ε_i-KKT point for all i ∈ N, then x̄ is a KKT point.
This result says that a convergent sequence of modified ε-KKT points with ε decreasing to 0 converges to a KKT point. Instead, we have shown that for any sequence of (not necessarily feasible) points that converges to a minimal solution (in which a CQ holds and which is thus a KKT point), the value of ω̄ has to converge to 0. For the feasible points of that sequence, this implies that they form a sequence of modified ε-KKT points with ε decreasing to 0.
For evolutionary algorithms that are used to solve (CSOP), we usually expect them to generate a sequence (x^i)_{i∈N} ⊆ R^n that converges to a minimal solution x̄ ∈ S. For such algorithms it is important to have a criterion to decide when to terminate. In other words, one needs a criterion to decide whether the current point x^i, i ∈ N, should be improved or not. We have shown that a small value of ω̄ is at least a necessary condition for being close to a minimal solution. Hence, if ω̄(x^i) is not small, then x^i should possibly be improved. As a consequence, a small value of ω̄ can be used as a termination criterion. This is not the case regarding the result in [10], as it already assumes that (x^i)_{i∈N} ⊆ R^n is a sequence of modified ε-KKT points with ε decreasing to 0.
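The termination idea can be sketched for the simplest possible setting: an unconstrained single-objective problem, where the KKT error reduces to ‖∇f(x)‖ and plays the role of the proximity measure. The loop below is our own illustration (plain gradient descent, not an evolutionary algorithm): iterate until the measure is small, since a small value is a necessary condition for being close to a minimal solution.

```python
import numpy as np

def descend_until_proximity_small(grad, x0, tol=1e-6, step=0.1,
                                  max_iter=10_000):
    """Plain gradient descent that terminates as soon as the proximity
    surrogate ||grad f(x)|| (the unconstrained KKT error) drops below tol.
    A hedged sketch of 'small measure as termination criterion'."""
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        gx = grad(x)
        if np.linalg.norm(gx) <= tol:   # necessary condition: near-critical
            break
        x = x - step * gx
    return x
```

For f(x) = (x − 3)² the iterates contract towards the minimizer, and the loop stops once ‖∇f(x)‖ ≤ tol rather than after a fixed iteration budget.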
Another main result of [10] is their Theorem 3.5, which applies when all functions f, g_j, j ∈ [p] are convex, Slater's CQ holds, and x ∈ S is feasible. This result can be used to show that for any minimal solution x̄ ∈ S of (CSOP) and any sequence (x^i)_{i∈N} ⊆ S of feasible points converging to x̄, the values ω̄(x^i) converge to 0. This is closely related to (PM3) from Definition 3.1. However, for our result no convexity assumptions are needed.

A first approach for multi-objective problems
A possible approach to find proximity measures for the multi-objective optimization problem (CMOP) (with m ≥ 2) is to generalize the results from the single-objective case from Sect. 3.1. While no proximity measures were presented in [18], the authors there did generalize the concept of modified ε-KKT points from Definition 3.3 as follows.
Definition 3.7 Let x ∈ S be a feasible point for (CMOP) and ε > 0. If there exist x̃ ∈ R^n, η ∈ R^m_+, and λ ∈ R^p_+ with

‖x̃ − x‖ ≤ √ε,  ‖Σ_{i∈[m]} η_i ∇f_i(x̃) + Σ_{j∈[p]} λ_j ∇g_j(x̃)‖ ≤ √ε,  Σ_{j∈[p]} λ_j |g_j(x)| ≤ ε,  and  Σ_{i∈[m]} η_i = 1,

then x is called a modified ε-KKT point for (CMOP).
Again, we adapted the definition to fit our assumption of differentiability. In [18, Theorem 3.4] it was shown that for a sequence (x^i)_{i∈N} of modified ε-KKT points with ε decreasing to 0 and lim_{i→∞} x^i = x̄, the point x̄ is a KKT point. As for instance evolutionary algorithms generate an arbitrary sequence of points, for which it is typically not guaranteed that the points are modified ε-KKT points, this result is not necessarily usable for such applications. A more applicable result when having evolutionary algorithms in mind is presented in [9]. There, a relation between so-called weakly ε-efficient solutions and modified ε-KKT points was given. For ε > 0, a feasible point x̄ ∈ S is called a weakly ε-efficient solution for (CMOP) with respect to d ∈ int(R^m_+) if there exists no x ∈ S with f(x) < f(x̄) − εd. The following theorem is an adaptation of [9, Theorem 3.7], where the result was proven in a more general setting.

Theorem 3.8 Let all functions f_i, i ∈ [m] and g_j, j ∈ [p] be convex and let Slater's CQ be satisfied, i.e., there exists x* ∈ S with g(x*) < 0. Then every weakly ε-efficient solution for (CMOP) with respect to d ∈ int(R^m_+) is also a modified ε-KKT point.
Weakly ε-efficient solutions x̄ ∈ S have images f(x̄) which are close to the image of the set of all weakly efficient solutions for (CMOP). By Theorem 3.8, a necessary condition (in case of Slater's CQ and convexity) for this 'ε-closeness', i.e., for x̄ to be weakly ε-efficient, is that x̄ is a modified ε-KKT point. This can be used as a selection or termination criterion in evolutionary algorithms. Such a relation as in Theorem 3.8 was also shown for the single-objective case in [10, Theorem 3.5], see the discussion on page 12. The downside of this result is that it requires convexity of the functions.
We now introduce a new proximity measure for which we need no convexity assumption to prove (PM1)–(PM3). Thereby we generalize the concept which was used in Definition 3.4 for the single-objective case.

Definition 3.9 A function ω : R^n → R based on modified ε-KKT points is given as

ω(x) := min { ε ≥ 0 | ∃ x̃ ∈ R^n, η ∈ R^m_+, λ ∈ R^p_+ : ‖x̃ − x‖ ≤ √ε, ‖Σ_{i∈[m]} η_i ∇f_i(x̃) + Σ_{j∈[p]} λ_j ∇g_j(x̃)‖ ≤ √ε, Σ_{j∈[p]} λ_j |g_j(x)| ≤ ε, Σ_{i∈[m]} η_i = 1, g_j(x) ≤ ε for all j ∈ [p] }.

We show that this function ω is indeed a proximity measure. We first state that properties (PM1) and (PM2) hold.

Lemma 3.10 The function ω satisfies (PM1) and (PM2).
Proof By definition we have ω(x) ≥ 0 for all x ∈ R^n and hence, property (PM1) holds. Now let x̄ ∈ S be an efficient solution for (CMOP) in which the Abadie CQ holds. Then property (PM2) is satisfied by Theorem 2.2.
In addition to property (PM2), we can also show a stronger relation between the zeros of ω and (exact) KKT points of (CMOP).

Lemma 3.11
For any x ∈ R n it holds that ω(x) = 0 if and only if there exist η ∈ R m + and λ ∈ R p + such that (x, η, λ) is a KKT point.
Proof Let x ∈ R^n with ω(x) = 0. This is the case if and only if x ∈ S and there exist x̃ ∈ R^n, η ∈ R^m_+, and λ ∈ R^p_+ which are feasible for the optimization problem defining ω(x) with ε = 0. For ε = 0, the constraints force x̃ = x and yield exact stationarity and complementarity, i.e., (x, η, λ) is a KKT point. Conversely, if (x, η, λ) is a KKT point, then x̃ = x together with these multipliers is feasible for ε = 0, so ω(x) = 0.
Finally, we show that property (PM3) is satisfied for ω as well. This implies that ω is indeed a proximity measure.

Theorem 3.12 The function ω satisfies (PM3).
Proof Let x̄ ∈ S be an efficient solution for (CMOP) in which the Abadie CQ holds, let (x^i)_{i∈N} ⊆ R^n be a sequence with lim_{i→∞} x^i = x̄, and let (x^{i_k})_{k∈N} be a subsequence of (x^i)_{i∈N}, which will be denoted by (x^p)_{p∈N}. As in the proof of Theorem 3.6, we construct a subsequence of (x^p)_{p∈N} and show that ω converges to 0 on this subsequence.
As x̄ is efficient for (CMOP), there exist η̄ ∈ R^m_+ and λ̄ ∈ R^p_+ such that (x̄, η̄, λ̄) is a KKT point. Now let ε > 0. As in the proof of Theorem 3.6, by continuity there exist δ¹_ε > 0, δ²_ε > 0, and δ³_ε > 0 such that the corresponding estimates hold. Now define δ_ε := min{δ¹_ε, δ²_ε, δ³_ε} > 0. Then for all x ∈ R^n with ‖x − x̄‖ ≤ δ_ε, the estimates (3.10), (3.11), (3.12), and (3.9) yield (3.13). As (x^p)_{p∈N} converges to x̄, there exists p_ε ∈ N with ‖x^p − x̄‖ ≤ δ_ε for all p ≥ p_ε. Hence, for all p ≥ p_ε we obtain from (3.13) that ε, x̃ = x^p, η = η̄, and λ = λ̄ define a feasible point for the optimization problem defining ω(x^p) as given in Definition 3.9, which implies ω(x^p) ≤ ε. Now by taking a sequence (ε_n)_{n∈N} ⊆ int(R_+) with lim_{n→∞} ε_n = 0 as in the proof of Theorem 3.6, we can construct a subsequence (x^{p_n})_{n∈N} with ω(x^{p_n}) ≤ ε_n for all n ∈ N. With the same arguments as in the proof of Theorem 3.6, we obtain lim_{n→∞} ω(x^{p_n}) = 0 = ω(x̄), and thus ω satisfies property (PM3).
At the beginning of this section, in Example 3.2, a naively defined candidate for a proximity measure was presented. This candidate function ω̂ was not a proximity measure. In particular, it was not continuous in every efficient solution in which the Abadie CQ holds. We now reconsider the problem (P1) from Example 3.2 and illustrate the advantages of ω compared to ω̂.

Example 3.13 We consider again the constrained multi-objective optimization problem (P1) from Example 3.2 and the sequence of feasible points (x_α) ⊆ S.
The values for ω(x_α) are shown in Fig. 2. The function ω is continuous in x_0, which corresponds to α = 0. Moreover, for α < 0.6 the function values ω(x_α) are monotonically decreasing to zero.

An easy to compute proximity measure
Although ω is a proximity measure in the sense of Definition 3.1, it is not really suited for computation. In general, the functions x̃ ↦ Dg(x̃) and x̃ ↦ Df(x̃) are nonlinear and even nonconvex. This makes solving the optimization problem to compute ω(x) for x ∈ R^n a hard task.
The introduction of x̃ in the definition of ω is a result of the weaker assumptions used in [18] and [9] (as well as in [10] for the single-objective case). There, all functions were not assumed to be continuously differentiable but only locally Lipschitz continuous. Hence, the gradients of those functions do not necessarily exist in all feasible points. This is why the authors used the generalized gradient as presented by Clarke in [4]. However, the generalized gradient is not really suited for computation, and a proximity measure should thus avoid using it. This is why x̃ was introduced. By Rademacher's theorem, every Lipschitz continuous function is differentiable almost everywhere. As all functions f_i, i ∈ [m] and g_j, j ∈ [p] are at least locally Lipschitz, this motivates that near x ∈ S there should be an x̃ ∈ R^n where those functions are differentiable and the generalized gradients can be replaced by gradients as in Definition 3.7. However, in our paper, the functions are assumed to be continuously differentiable. Thus, it is possible to evaluate the gradients at every x̃ ∈ R^n and especially at x̃ = x. We will take this idea of fixing x̃ = x as a starting point to define a new relaxation of KKT points. This approach was already briefly mentioned in [7] for single-objective optimization but without any further examination. The new relaxation of KKT points for (CMOP) which we introduce next is called simplified ε-KKT points.

Definition 3.14 Let x ∈ S be a feasible point for (CMOP) and ε > 0. If there exist η ∈ R^m_+ and λ ∈ R^p_+ with

(i) ‖Σ_{i∈[m]} η_i ∇f_i(x) + Σ_{j∈[p]} λ_j ∇g_j(x)‖_∞ ≤ ε,
(ii) Σ_{j∈[p]} λ_j |g_j(x)| ≤ ε, and
(iii) Σ_{i∈[m]} η_i = 1,

then x is called a simplified ε-KKT point for (CMOP).
The term 'simplified' is chosen as it is a simplified version of Definition 3.7 and as the corresponding proximity measure, which we will introduce shortly, can be computed more easily. Compared to Definition 3.7, we have not only removed x̃ from the definition but also replaced √ε by ε and fixed the norm in (i) to the maximum norm. As a consequence, in our new proximity measure, the function value at x relies only on the evaluation of g, Df, and Dg at x itself, and the optimization problem to compute the value of this proximity measure can easily be formulated as a linear optimization problem.

Definition 3.15
Define a function ω_s : R^n → R based on simplified ε-KKT points, analogously to the definition of ω. We can show that ω_s is indeed a proximity measure. This can be done analogously to the proofs of Lemma 3.10 and Theorem 3.12. In particular, fixing the norm to the maximum norm and replacing √ε by ε has no effect on the proofs, which rely on continuity statements and which already used x̄ = x. Also the characterization of exact KKT points from Lemma 3.11 holds for ω_s. We summarize these results in the following theorem.

Theorem 3.16
The function ω_s is a proximity measure. Moreover, for any x ∈ R^n it holds that ω_s(x) = 0 if and only if there exist η ∈ R^m_+ and λ ∈ R^p_+ such that (x, η, λ) is a KKT point.
Not all results for modified ε-KKT points can easily be transferred to simplified ε-KKT points. For instance, the proof of Theorem 3.8, which can be found in [9], relies on Ekeland's variational principle. Hence, the relation cannot directly be transferred, as we would have to ensure x̄ = x in that case. Whether such a statement can be found or not is an open question.

Numerical results
In [10] the authors investigated the behavior of their proximity measure (which we also presented as ω̄ in Definition 3.4) for single-objective optimization problems on iterates of the evolutionary algorithm RGA. They observed that the function value of their proximity measure decreased throughout the iterations for several test instances. As a result, the authors proposed to use their proximity measure as a termination criterion for evolutionary algorithms.
While this could also be done with the new proximity measures ω and ω_s which we introduced in this paper for multi-objective optimization problems (CMOP), we focus on another illustration of the measures, in particular of ω_s. In the previous section it was already discussed why this proximity measure is well suited for numerical evaluation and for use with optimization algorithms: it can be calculated by solving only a linear optimization problem.
For the following examples we generated k points equidistantly distributed in the preimage space and then computed the value of the proximity measure ω_s in MATLAB using linprog. If the computed value was below a specified limit α > 0, the corresponding point was selected as a possible efficient solution (also called a solution candidate). The set of those points is denoted by C in this section; the set of efficient solutions is denoted by E. In the figures, the value of ω_s is visualized by a color scale: dark blue corresponds to a value of ω_s close to 0, and as the value rises the color turns green and finally yellow for the highest value of ω_s reached within the generated discretization of the preimage space. The sets C and f(C) are illustrated by red triangles. The generated data and the MATLAB files are available from the corresponding author on reasonable request.
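The selection procedure just described can be sketched as follows. This is a schematic Python version, not the authors' MATLAB code; `omega` stands for any callable returning the value of a proximity measure at a point.

```python
import numpy as np

def select_candidates(omega, lower, upper, points_per_axis, alpha):
    """Evaluate a proximity measure on an equidistant grid and return
    all grid points whose value is at most the tolerance alpha.

    omega           : callable mapping a point (1d array) to a float
    lower, upper    : per-dimension bounds of the box to discretize
    points_per_axis : number of equidistant points along each axis
    alpha           : acceptance bound for solution candidates
    """
    axes = [np.linspace(lo, up, points_per_axis) for lo, up in zip(lower, upper)]
    # build the full equidistant grid (k = points_per_axis ** n points)
    grid = np.stack(np.meshgrid(*axes, indexing="ij"), axis=-1).reshape(-1, len(lower))
    values = np.array([omega(x) for x in grid])
    return grid[values <= alpha], values
```

The returned candidate set plays the role of C, and the value array can be used directly for the color-coded plots.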

Test instance 1
The first example is convex and taken from [5]:

min f(x) = (x1^2 + x2^2, (x1 − 5)^2 + (x2 − 5)^2)  s.t.  x ∈ S = [−5, 10]^2.  (BK1)

The set of efficient solutions for this multi-objective optimization problem is E = {(t, t) : t ∈ [0, 5]}. For the computation in MATLAB a total of k = (64 + 1)^2 = 4225 points were generated, equidistantly distributed in S = [−5, 10]^2. A number of |C| = 21 points led to a value of ω_s lower than or equal to α = 0.001. In particular, for the set C delivered by MATLAB it holds that C ⊆ E. The result is shown in Fig. 3a, b. As (BK1) is a convex problem and the Abadie CQ holds for all x ∈ S due to the linear constraints, the KKT conditions are not only a necessary optimality condition (see Theorem 2.2) but also sufficient by Lemma 2.4. Moreover, it was already discussed that ω_s(x) = 0 if and only if there exist η ∈ R^m_+ and λ ∈ R^p_+ such that (x, η, λ) is a KKT point. This implies that for (BK1), ω_s(x) = 0 if and only if x ∈ S is a weakly efficient solution of this problem. Thus, ω_s is well suited for characterizing weakly efficient solutions of (BK1) and also as a termination criterion for optimization algorithms.
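For (BK1), the statement ω_s(x) = 0 on the efficient set can also be checked by hand: along the segment x1 = x2 = t, t ∈ [0, 5], the multipliers η = ((5 − t)/5, t/5) with λ = 0 make the weighted objective gradients cancel exactly. A small numerical sketch of this closed-form check, assuming the standard BK1 objectives f1(x) = x1^2 + x2^2 and f2(x) = (x1 − 5)^2 + (x2 − 5)^2:

```python
import numpy as np

def bk1_jacobian(x):
    """Jacobian of the (assumed) BK1 objectives
    f1(x) = x1^2 + x2^2 and f2(x) = (x1-5)^2 + (x2-5)^2."""
    return np.array([[2 * x[0], 2 * x[1]],
                     [2 * (x[0] - 5), 2 * (x[1] - 5)]])

def kkt_residual_on_segment(t):
    """Max-norm of eta1 * grad f1 + eta2 * grad f2 at x = (t, t)
    with the closed-form multipliers eta = ((5-t)/5, t/5), lambda = 0.
    For t in [0, 5] these multipliers are nonnegative and sum to 1."""
    x = np.array([t, t])
    eta = np.array([(5 - t) / 5, t / 5])
    return float(np.abs(eta @ bk1_jacobian(x)).max())
```

Since the residual vanishes along the whole segment, the KKT conditions hold there exactly, in line with C ⊆ E observed in the experiment.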
Test instance 2
In [22] Srinivas and Deb presented the following non-convex problem with convex but nonlinear constraint functions:

min f(x) = (2 + (x1 − 2)^2 + (x2 − 1)^2, 9x1 − (x2 − 1)^2)
s.t. g1(x) = x1^2 + x2^2 − 225 ≤ 0,
     g2(x) = x1 − 3x2 + 10 ≤ 0.  (SRN)
The set of efficient solutions for this problem is presented in [8]. For the computation with MATLAB, a total of k = (64 + 1)^2 = 4225 points were generated, equidistantly distributed in X = [−20, 20]^2. The results can be seen in Fig. 4a, b.
In the figures, it may look as if the Pareto front f(E) is larger than the representation covered by the candidates f(C). This is not the case, as many points within the set X are infeasible. To clarify this, Fig. 4c, d shows only feasible points and their images.
Most of the |C| = 25 candidates presented by the algorithm are efficient solutions of (SRN). However, there are 5 candidates which actually do not belong to E. An approach to improve this could be to reduce the acceptance bound α, but the result remains unchanged even for α = 1 · 10^−8. If α is decreased further, the set C gets smaller and gaps appear in the preimage space (and, as a consequence, in the image space as well). In terms of quality, the result remains the same, as the set C still contains some points that do not belong to E. The effect is the same when choosing a finer discretization. This is shown in Fig. 4e, f for k = (128 + 1)^2 = 16641 equidistantly distributed points in X.
The reason why there are points in C that do not belong to E is simply that these points (approximately) satisfy the KKT conditions without being efficient. Recall that the KKT conditions are just a necessary optimality condition and that they are sufficient only in the convex case, see Lemma 2.4. Due to the specific structure of E for this test instance, we examined the influence of discretizing not equidistantly but randomly. The results using 16641 points are given in Fig. 5 for two different values of α. First, we chose α = 0.001, as this worked very well for (BK1) and (SRN) with equidistantly distributed points in the preimage space, see Figs. 3 and 4b. However, for (SRN) with randomly distributed points, only a single solution candidate is found by MATLAB for α = 0.001, namely x_c = (−2.3746, 2.5611), see Fig. 5a. When increasing α, the set C starts to grow. For instance, Fig. 5b shows the results for α = 0.01. Compared to the results in Fig. 4b, the structure of the set of solution candidates is quite similar. This is exactly what we would expect due to the continuity of ω_s in every efficient solution. This specific run shows that the tolerance α should be chosen carefully. If it is too small, as in this case for α = 0.001, only few solution candidates will be found. On the other hand, choosing a large α can result in solution candidates that are not close to the set of efficient solutions E at all.

Test instance 3
This test instance by Osyczka and Kundu is taken from [20]. It has a larger number of n = 6 optimization variables and a larger number of constraints as well. The objective function f : R^6 → R^2 is given by

f1(x) = −25(x1 − 2)^2 − (x2 − 2)^2 − (x3 − 1)^2 − (x4 − 4)^2 − (x5 − 1)^2,
f2(x) = x1^2 + x2^2 + x3^2 + x4^2 + x5^2 + x6^2.  (OSY)
A characterization of the set E for this problem can be found in [8]. A total of k = (16 + 1)^4 = 83521 points were generated, equidistantly distributed in X̄. The MATLAB implementation found 70 candidates with a value of ω_s less than or equal to α = 0.001. The result can be seen in Fig. 6a.
For a better characterization of the set E of all efficient solutions we decompose it into three subsets E1, E2, and E3 with E = E1 ∪ E2 ∪ E3. Considering the (discretized) image f(S) in Fig. 6b, E1 contains all efficient solutions belonging to the upper half of the left 'stroke' and E2 those belonging to its lower half. The set E3 contains all efficient solutions belonging to the stroke at about f1 ≈ −120. In Fig. 6a it might look as if none of the elements of C is an efficient solution. However, the images of infeasible points are also included in that figure. For this reason, Fig. 6b shows only the images of feasible points, and there it can be seen that C contains efficient as well as locally efficient solutions.
In particular, out of the 69 points in C there are 17 within E1, 17 within E2, and 11 within E3. These are all of the 83521 generated points that belong to the sets E1, E2, and E3. Also the point x_c := (0.625, 1.375, 1, 0, 1, 0) ∈ C is part of the set of all efficient solutions E. Nevertheless, there are 24 points which do not belong to E. There is one outlier which is easy to spot in Fig. 6a. In addition, there are also some locally efficient solutions which can be seen as an extension of E3; they form two sets C1 and C2. In the image space, f(E3) is the lower part of the line at about f1 ≈ −120, rising up to f2 ≈ 18. The upper end of this line with f2 ≥ 30 is the image of C2. The points in between belong to C1.
In a next step, we aimed to obtain those points at the bottom of the 'strokes' which are indeed images of efficient solutions. As a first approach, α was increased to α = 0.1. The result is shown in Fig. 6c. Another idea was to keep α = 0.001 and refine the discretization, in particular to choose k = (32 + 1)^4 = 1185921 points equidistantly distributed in the preimage space. This leads to the results shown in Fig. 6d. In both cases some of the (efficient) bottom tips were found. However, in both cases C also gained more points which are not approximately efficient solutions.

Conclusions
We presented two new proximity measures ω and ω_s for (CMOP) in the sense of Definition 3.1. The proximity measure ω is a generalization of the proximity measure presented in [10] for the single-objective case, using the results from [9] and [18]. One drawback of this approach is that ω can be hard to compute; thus, it is not well suited for use within optimization algorithms. This is why we introduced the proximity measure ω_s. Compared to ω, the computation of ω_s(x) for some x ∈ R^n relies only on a single evaluation of g, Df, and Dg. Moreover, it only requires solving a linear optimization problem, which makes its computation considerably faster.
In addition, the ability of ω_s to characterize the proximity of a certain point x ∈ R^n to the set of efficient solutions for (CMOP) was demonstrated in the previous section. As a result, ω_s is well suited for numerical applications. In particular, we illustrated its capabilities as an additional criterion for candidate selection or termination in evolutionary algorithms.
It could now be argued that the computation of ω_s also relies on solving an optimization problem and that, hence, all the issues mentioned in the introduction, such as limited accuracy, are critical aspects here as well. While such limitations should always be taken into account, the computation of ω_s requires solving only a linear optimization problem. For linear optimization problems, exact solvers such as SoPlex (see [15]) are available and can handle even numerically troublesome problems.
At no point did we have to assume m ≥ 2 for the dimension of the image space. Hence, all results still hold for single-objective optimization problems. Moreover, the case of unconstrained problems is contained as a special case. As a consequence, our proximity measure ω_s can be used for a large class of optimization problems.