A Steepest Descent Method for Set Optimization Problems with Set-Valued Mappings of Finite Cardinality

In this paper, we study a first order solution method for a particular class of set optimization problems where the solution concept is given by the set approach. We consider the case in which the set-valued objective mapping is identified by a finite number of continuously differentiable selections. The corresponding set optimization problem is then equivalent to finding optimistic solutions to vector optimization problems under uncertainty with a finite uncertainty set. We develop optimality conditions for these types of problems, and introduce two concepts of critical points. Furthermore, we propose a descent method and provide a convergence result to points satisfying the optimality conditions previously derived. Some numerical examples illustrating the performance of the method are also discussed. This paper is a modified and polished version of Chapter 5 in the PhD thesis by Quintana (On set optimization with set relations: a scalarization approach to optimality conditions and algorithms, Martin-Luther-Universität Halle-Wittenberg, 2020).


Introduction
Set optimization is the class of mathematical problems that consists in minimizing set-valued mappings acting between two vector spaces, in which the image space is partially ordered by a given closed, convex and pointed cone. There are two main approaches for defining solution concepts for this type of problem, namely the vector approach and the set approach. In this paper, we deal with the latter. The main idea of this approach lies in defining a preorder on the power set of the image space, and in considering minimal solutions of the set-valued problem accordingly. Research in this area started with the works of Young [46], Nishnianidze [41], and Kuroiwa [35,36], in which the first set relations for defining a preorder were considered. Furthermore, Kuroiwa [34] was the first who considered set optimization problems where the solution concept is given by the set approach. Since then, research in this direction has expanded immensely due to its applications in finance, optimization under uncertainty, game theory, and socioeconomics. We refer the reader to [29] for a comprehensive overview of the field. The research topic that concerns us in this paper is the development of efficient algorithms for the solution of set optimization problems. In this setting, the current approaches in the literature can be roughly clustered into four different groups:

• Derivative free methods [23,24,30].
In this context, the derived algorithms are descent methods and use a derivative free strategy [7]. These algorithms are designed to deal with unconstrained problems, and they assume no particular structure of the set-valued objective mapping. The first method of this type was described in [23]. There, the case in which both the epigraphical and hypographical multifunctions of the set-valued objective mapping have convex values was analyzed. This convexity assumption was then relaxed in [30] for the so-called upper set less relation. Finally, in [24], a new method with this strategy was studied. An interesting feature of the algorithm in this reference is that, instead of choosing only one descent direction at every iteration, it considers several of them at the same time. Thus, the method generates a tree with the initial point as the root, and the possible solutions as leaves.
• Sorting type methods [14,15,31,32].

The methods in this class are specifically designed to treat set optimization problems with a finite feasible set. Because of this, they are based on simple comparisons between the images of the set-valued objective mapping. In [31,32], the algorithms are extensions of those by Jahn [21,26] for vector optimization problems. They use so-called forward and backward reduction procedures that, in practice, avoid making many of the previously mentioned comparisons. Therefore, these methods perform more efficiently than a naive implementation in which every pair of sets must be compared. More recently, in [14,15], an extension of the algorithm by Günther and Popovici [18] for vector problems was studied. The idea is to first find an enumeration of the images of the set-valued mapping whose values under a scalarization by a strongly monotone functional are increasing. In a second step, a forward iteration procedure is performed. Due to the presorting step, these methods enjoy an almost optimal computational complexity, compare [33].
• Algorithms based on scalarization [11,19,20,27,44].

The methods in this group follow a scalarization approach, and are derived for problems where the set-valued objective mapping has a particular structure that comes from the so-called robust counterpart of a vector optimization problem under uncertainty, see [20]. In [11,19,20], a linear scalarization was employed for solving the set optimization problem. Furthermore, the ε-constraint method was also extended in [11,19] for the particular case in which the ordering cone is the nonnegative orthant. Weighted Chebyshev scalarization and some of its variants (augmented, min-ordering) were also studied in [19,27,44].
• A branch and bound method [12].

The algorithm in [12] is also designed for uncertain vector optimization problems, but in particular it is assumed that only the decision variable is the source of uncertainty. There, the authors propose a branch and bound method for finding a box covering of the solution set.
The strategy that we consider in this paper is different from the ones previously described, and is designed for dealing with unconstrained set optimization problems in which the set-valued objective mapping is given by a finite number of continuously differentiable selections. Our motivation for studying problems with this particular structure is twofold:

• Problems of this type have important applications in optimization under uncertainty.
Indeed, set optimization problems with this structure arise when computing robust solutions to vector optimization problems under uncertainty, if the so-called uncertainty set is finite, see [20]. Furthermore, the solvability of problems with a finite uncertainty set is an important component in the treatment of the general case with an infinite uncertainty set, see the cutting plane strategy in [40] and the reduction results in [3, Proposition 2.1] and [11, Theorem 5.9].
• Current algorithms in the literature pose different theoretical and practical difficulties when solving these types of problems.
Indeed, although derivative free methods can be directly applied in this setting, they suffer from the same drawbacks as their counterparts in the scalar case. Specifically, because they make no use of first order information (which we assume is available in our context), we expect them to be slower in practice than a method that exploits these additional properties. Even worse, in the set-valued setting, there is now an increased cost of performing comparisons between sets, which was almost negligible for scalar problems. On the other hand, the algorithms of a sorting type described earlier cannot be used in our setting, since they require a finite feasible set. Similarly, the branch and bound strategy is designed for problems that do not fit the particular structure considered in this paper, and so it cannot be taken into account either. Finally, we can also consider the algorithms based on scalarization in our context. However, the main drawback of these methods is that, in general, they are not able to recover all the solutions of the set optimization problem. In fact, the ε-constraint method, which is known to overcome this difficulty in standard multiobjective optimization, fails in this setting.
Thus, we address in this paper the need for a first order method that exploits the particular structure of the set-valued objective mapping previously mentioned, and does not share the drawbacks of the other approaches in the literature. The rest of the paper is structured as follows. We start in Section 2 by introducing the main notation, basic concepts and results that will be used throughout the paper. In Section 3, we derive optimality conditions for set optimization problems with the aforementioned structure. These optimality conditions constitute the basis of the descent method described in Section 4, where the full convergence of the algorithm is also obtained. In Section 5, we illustrate the performance of the method on different test instances. We conclude in Section 6 by summarizing our results and proposing ideas for further research.

Preliminaries
We start this section by introducing the main notation used in the paper. First, the class of all nonempty subsets of R^m will be denoted by P(R^m). Furthermore, for A ∈ P(R^m), we denote by int A, cl A, bd A and conv A the interior, closure, boundary and convex hull of the set A, respectively. All the considered vectors are column vectors, and we denote the transpose operator by the symbol ⊤. On the other hand, ‖·‖ will stand either for the Euclidean norm of a vector or for the standard spectral norm of a matrix, depending on the context. We also denote the cardinality of a finite set A by |A|. Finally, for k ∈ N, we put [k] := {1, . . . , k}.

We next consider the most important definitions and properties involved in the results of the paper. Recall that a set K ∈ P(R^m) is said to be a cone if ty ∈ K for every y ∈ K and every t ≥ 0. Moreover, a cone K is called convex if it is a convex set, pointed if K ∩ (−K) = {0}, and solid if int K ≠ ∅. An important related concept is that of the dual cone. For a cone K, this is the set

K* := {y* ∈ R^m | ∀ y ∈ K : y*⊤y ≥ 0}.

Throughout, we suppose that K ∈ P(R^m) is a cone. It is well known [16] that when K is convex and pointed, it generates a partial order ⪯ on R^m as follows:

y ⪯ z :⟺ z − y ∈ K.

Furthermore, if K is solid, one can also consider the so-called strict order ≺, which is defined by

y ≺ z :⟺ z − y ∈ int K.

In the following definition, we collect the concepts of minimal and weakly minimal elements of a set with respect to ⪯.

Definition 2.1 Let A ∈ P(R^m) and suppose that K is closed, convex, pointed, and solid.
(i) The set of minimal elements of A with respect to K is defined as

Min(A, K) := {y ∈ A | (y − (K \ {0})) ∩ A = ∅}.

(ii) The set of weakly minimal elements of A with respect to K is defined as

WMin(A, K) := {y ∈ A | (y − int K) ∩ A = ∅}.

The following proposition will often be used.

Proposition 2.2 Suppose that K is closed, convex, pointed, and solid, and let A ∈ P(R^m) be nonempty and finite. Then Min(A, K) is nonempty and the domination property holds, that is, A ⊆ Min(A, K) + K.
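For the standard cone K = R^m_+, both notions reduce to componentwise nondomination, and for finite sets they can be computed by direct comparison. The following sketch illustrates this; the helper name `min_elements` is ours, and K = R^m_+ is an assumption rather than the general cone of the definition:

```python
import numpy as np

def min_elements(A, weak=False):
    """Minimal (weak=False) or weakly minimal (weak=True) elements of a
    finite set A, given as the rows of a 2-D array, w.r.t. K = R^m_+:
    y is minimal if no a in A satisfies a <= y componentwise with a != y,
    and weakly minimal if no a in A satisfies a < y in every component."""
    A = np.asarray(A, dtype=float)
    keep = []
    for i, y in enumerate(A):
        if weak:
            dominated = any((a < y).all() for a in A)
        else:
            dominated = any((a <= y).all() and (a != y).any() for a in A)
        if not dominated:
            keep.append(i)
    return A[keep]
```

For instance, for A = {(1,0), (0,1), (1,1)} the point (1,1) is dominated by (1,0), so it is not minimal, yet it is weakly minimal since no point is strictly smaller in both components.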
The Gerstewitz scalarizing functional will also play an important role in the main results.

Definition 2.3 Let K be closed, convex, pointed, and solid. For a given element e ∈ int K, the Gerstewitz functional associated to e and K is ψ_e : R^m → R defined as

ψ_e(y) := min{t ∈ R | te ∈ y + K}.

Some properties of this functional are collected in the following proposition.

Proposition 2.4 Let K be closed, convex, pointed, and solid, and let e ∈ int K. Then:

(i) ψ_e is Lipschitz continuous on R^m.

(ii) ψ_e is both monotone and strictly monotone with respect to the partial order ⪯, that is,

∀ y, z ∈ R^m : y ⪯ z ⟹ ψ_e(y) ≤ ψ_e(z)

and

∀ y, z ∈ R^m : y ≺ z ⟹ ψ_e(y) < ψ_e(z),

respectively.

(iii) ψ_e satisfies the so-called representability property, that is,

−K = {y ∈ R^m | ψ_e(y) ≤ 0} and − int K = {y ∈ R^m | ψ_e(y) < 0}.

We next introduce the set relations between the nonempty subsets of R^m that will be used in the definition of the set optimization problem we consider. We refer the reader to [25,28] and the references therein for other set relations.
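For K = R^m_+ and e with positive components, the minimum in the definition of the Gerstewitz functional has the closed form ψ_e(y) = max_i y_i/e_i. A minimal sketch under this assumption (the helper name `psi` is ours):

```python
import numpy as np

def psi(y, e):
    """Gerstewitz functional for K = R^m_+: the smallest t with t*e - y in K.
    For e with positive components this is max_i y_i / e_i."""
    y, e = np.asarray(y, dtype=float), np.asarray(e, dtype=float)
    return float(np.max(y / e))
```

In particular, psi(y, e) < 0 exactly when every component of y is negative, i.e. y ∈ −int R^m_+, matching the representability property.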

Definition 2.5 ([37]) For the given cone K, the lower set less relation ⪯^l is the binary relation defined on P(R^m) as follows:

A ⪯^l B :⟺ B ⊆ A + K.

Similarly, if K is solid, the strict lower set less relation ≺^l is the binary relation defined on P(R^m) by:

A ≺^l B :⟺ B ⊆ A + int K.

Remark 2.6 Note that for any two vectors y, z ∈ R^m the following equivalences hold:

{y} ⪯^l {z} ⟺ y ⪯ z and {y} ≺^l {z} ⟺ y ≺ z.

Thus, the restrictions of ⪯^l and ≺^l to the singletons in P(R^m) are equivalent to ⪯ and ≺, respectively.
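For finite sets and K = R^m_+, the inclusion B ⊆ A + K can be checked by pairwise componentwise comparisons: every b ∈ B must dominate some a ∈ A. A hedged sketch (the helper name `lower_set_less` is ours; K = R^m_+ is an assumption):

```python
import numpy as np

def lower_set_less(A, B, strict=False):
    """Check the lower set less relation A <= B, i.e. B is contained in
    A + K, for finite sets given as rows and K = R^m_+: every b in B must
    lie above some a in A componentwise (strictly so if strict=True)."""
    A, B = np.asarray(A, dtype=float), np.asarray(B, dtype=float)
    dom = (lambda a, b: (a < b).all()) if strict else (lambda a, b: (a <= b).all())
    return all(any(dom(a, b) for a in A) for b in B)
```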
We are now ready to present the set optimization problem together with a solution concept based on set relations.

Definition 2.7 Let F : R^n ⇒ R^m be a given set-valued mapping and suppose that K is closed, convex, pointed, and solid. The set optimization problem with this data is formally represented as

min_{x ∈ R^n} F(x),    (SP)

where minimization is understood with respect to the lower set less relation ⪯^l, and a solution is understood in the following sense: we say that a point x̄ ∈ R^n is a local weakly minimal solution of (SP) if there exists a neighborhood U of x̄ such that the following holds:

∄ x ∈ U : F(x) ≺^l F(x̄).

Moreover, if we can choose U = R^n above, we simply say that x̄ is a weakly minimal solution of (SP).

Remark 2.8 A problem related to (SP) that is relevant in our paper is the so-called vector optimization problem [22,38]. There, for a vector-valued mapping f : R^n → R^m, one considers the problem

min_{x ∈ R^n} f(x),

where minimality is understood with respect to the partial order ⪯, and local weakly minimal solutions are defined analogously to Definition 2.7 with ≺ in place of ≺^l.

We conclude the section by establishing the main assumption employed in the rest of the paper for the treatment of (SP):

Assumption 1 Suppose that K ∈ P(R^m) is a closed, convex, pointed and solid cone, and that e ∈ int K is fixed. Furthermore, consider a reference point x̄ ∈ R^n, given vector-valued functions f_1, f_2, . . . , f_p : R^n → R^m that are continuously differentiable, and assume that the set-valued mapping F in (SP) is defined by

F(x) := {f_1(x), f_2(x), . . . , f_p(x)}.
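A toy instance of this structure can be set up as follows; the particular selections f_1, f_2 below are illustrative choices of ours, not taken from the paper:

```python
import numpy as np

# Illustrative smooth selections f_i : R^n -> R^m, here with n = 1, m = 2.
def f1(x):
    return np.array([x[0] ** 2, (x[0] - 1.0) ** 2])

def f2(x):
    return np.array([(x[0] + 1.0) ** 2, x[0] ** 2])

def F(x):
    """Finite image F(x) = {f_1(x), ..., f_p(x)}, returned as rows."""
    return np.vstack([f(x) for f in (f1, f2)])
```

At x = 0 this gives F(0) = {(0, 1), (1, 0)}, a two-element set with two incomparable (hence minimal) elements for K = R^2_+.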

Optimality Conditions
In this section, we study optimality conditions for weakly minimal solutions of (SP) under Assumption 1. These conditions are the foundation on which the proposed algorithm is built. In particular, because of the resemblance of our method with standard gradient descent in the scalar case, we are interested in Fermat rules for set optimization problems. Recently, results of this type were derived in [5], see also [2]. There, the optimality conditions involve the computation of the limiting normal cone [39] of the set-valued mapping F at different points in its graph. However, this is a difficult task in our case because the graph of F is the union of the graphs of the vector-valued functions f_i, and to the best of our knowledge there is no exact formula for finding the normal cone to the union of sets (at a given point) in terms of the initial data. Thus, instead of considering the results from [5], we exploit the particular structure of F and the differentiability of the functions f_i to deduce new necessary conditions. We start by defining some index-related set-valued mappings that will be of importance. They make use of the concepts introduced in Definition 2.1.

Definition 3.1 The following set-valued mappings are defined:

(i) The active index of minimal elements, given by

I(x) := {i ∈ [p] | f_i(x) ∈ Min(F(x), K)},

together with, for each v ∈ Min(F(x), K), the mapping

I_v(x) := {i ∈ I(x) | f_i(x) = v}.

(ii) The active index of weakly minimal elements, given by

I_0(x) := {i ∈ [p] | f_i(x) ∈ WMin(F(x), K)}.

It follows directly from the definition that

I(x) = ⋃_{v ∈ Min(F(x),K)} I_v(x) ⊆ I_0(x).    (3.1)

Definition 3.2 The map ω : R^n → N is defined as the cardinality of the set of minimal elements of F, that is,

ω(x) := |Min(F(x), K)|.

Furthermore, we set ω̄ := ω(x̄).

From now on we consider that, for any point x ∈ R^n, an enumeration {v_1^x, . . . , v_{ω(x)}^x} of the set Min(F(x), K) has been chosen in advance.

Definition 3.3 For a given point x ∈ R^n, the partition set at x is defined as

P_x := I_{v_1^x}(x) × · · · × I_{v_{ω(x)}^x}(x),

where the index sets I_{v_j^x}(x) are those of Definition 3.1. In particular, every a ∈ P_x is a tuple a = (a_1, . . . , a_{ω(x)}) with f_{a_j}(x) = v_j^x for each j ∈ [ω(x)].
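For K = R^m_+, the index sets I_{v_j}(x) and the partition set P_x can be enumerated directly from the finite image F(x). A sketch under that assumption (all helper names are ours):

```python
import itertools
import numpy as np

def partition_set(images):
    """For a finite image F(x) given as the rows of `images` and K = R^m_+,
    return the distinct minimal values v_1, ..., v_w together with the
    partition set P_x = I_{v_1}(x) x ... x I_{v_w}(x) as index tuples."""
    A = np.asarray(images, dtype=float)
    # active indices: rows not dominated componentwise by a different row
    active = [i for i, y in enumerate(A)
              if not any((a <= y).all() and (a != y).any() for a in A)]
    groups = []  # pairs (minimal value v_j, active index set I_{v_j})
    for i in active:
        for v, idx in groups:
            if np.array_equal(A[i], v):
                idx.append(i)
                break
        else:
            groups.append((A[i], [i]))
    P = list(itertools.product(*(idx for _, idx in groups)))
    return [v for v, _ in groups], P
```

For example, if f_1(x) = f_2(x) = (0, 1) and f_3(x) = (1, 0), then ω(x) = 2 and P_x = {(1, 3), (2, 3)} (0-based: {(0, 2), (1, 2)}).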
The optimality conditions for (SP ) we will present are based on the following idea: from the particular structure of F, we will construct a family of vector optimization problems that, together, locally represent (SP ) (in a sense to be specified) around the point which must be checked for optimality. Then, (standard) optimality conditions are applied to the family of vector optimization problems. The following lemma is the key step in that direction.

Lemma 3.4 Consider the cone

K̃ := K × K × · · · × K (ω̄ times) ⊆ R^{mω̄},    (3.2)

and let us denote by ⪯_K̃ and ≺_K̃ the partial order and the strict order in R^{mω̄} induced by K̃, respectively (see (2.2)). Furthermore, consider the partition set P_x̄ associated to x̄ and define, for every a ∈ P_x̄, the function f̃_a : R^n → R^{mω̄} as

f̃_a(x) := (f_{a_1}(x), . . . , f_{a_ω̄}(x)).

Then, x̄ is a local weakly minimal solution of (SP) if and only if, for every a ∈ P_x̄, x̄ is a local weakly minimal solution of the vector optimization problem

min_{x ∈ R^n} f̃_a(x)    (VP_a)

with respect to ⪯_K̃.

Proof. We argue by contradiction in both cases. First, assume that x̄ is a local weakly minimal solution of (SP) and that, for some a ∈ P_x̄, x̄ is not a local weakly minimal solution of (VP_a). Then, we could find a sequence {x_k}_{k≥1} ⊂ R^n such that x_k → x̄ and f̃_a(x_k) ≺_K̃ f̃_a(x̄) for every k ∈ N. Hence, we deduce that

∀ k ∈ N, ∀ j ∈ [ω̄] : f_{a_j}(x_k) ≺ v_j^x̄.

Since this is equivalent to F(x_k) ≺^l F(x̄) for every k ∈ N and x_k → x̄, we contradict the weak minimality of x̄ for (SP).

Next, suppose that x̄ is a local weakly minimal solution of (VP_a) for every a ∈ P_x̄, but not a local weakly minimal solution of (SP). Then, we could find a sequence {x_k}_{k≥1} ⊂ R^n with x_k → x̄ and indices i_{(j,k)} ∈ [p] such that

∀ k ∈ N, ∀ j ∈ [ω̄] : f_{i_{(j,k)}}(x_k) ≺ v_j^x̄.    (3.5)

Since the indices i_{(j,k)} are being chosen from the finite set [p], we can assume without loss of generality that i_{(j,k)} is independent of k, that is, i_{(j,k)} = ī_j for every k ∈ N and some ī_j ∈ [p]. Hence, taking the limit in (3.5) when k → +∞, we get

∀ j ∈ [ω̄] : f_{ī_j}(x̄) ⪯ v_j^x̄.    (3.6)

Because v_j^x̄ ∈ Min(F(x̄), K), it follows from (3.6) that f_{ī_j}(x̄) = v_j^x̄ and that ī_j ∈ I(x̄) for every j ∈ [ω̄]. Consider now the tuple ā := (ī_1, . . . , ī_ω̄). Then, it can be verified that ā ∈ P_x̄. Moreover, from (3.5) we deduce that f̃_ā(x_k) ≺_K̃ f̃_ā(x̄) for every k ∈ N. Since x_k → x̄, this contradicts the weak minimality of x̄ for (VP_a) when a = ā.
We now establish the necessary optimality condition for (SP ) that will be used in our descent method.
Theorem 3.5 Suppose that x̄ is a local weakly minimal solution of (SP). Then,

∀ a ∈ P_x̄, ∀ u ∈ R^n, ∃ j ∈ [ω̄] : ∇f_{a_j}(x̄)u ∉ − int K.    (3.7)

Conversely, assume that f_i is K-convex for each i ∈ I(x̄), that is,

∀ x, y ∈ R^n, ∀ t ∈ [0, 1] : f_i(tx + (1 − t)y) ⪯ t f_i(x) + (1 − t) f_i(y).

Then, condition (3.7) is also sufficient for the local weak minimality of x̄.
Proof. By Lemma 3.4, we get that x̄ is a local weakly minimal solution of (VP_a) for every a ∈ P_x̄. Applying now [38, Theorem 4.1] for every a ∈ P_x̄, we get

∀ a ∈ P_x̄, ∀ u ∈ R^n : ∇f̃_a(x̄)u ∉ − int K̃.    (3.8)

Since K̃ is the product cone K × · · · × K, it is easy to verify that (3.8) is equivalent to the first part of the statement. In order to see the sufficiency under convexity, assume that x̄ satisfies (3.7). Note that, for any a ∈ P_x̄, the function f̃_a is K̃-convex provided that f_i is K-convex for every i ∈ I(x̄). Then, in this case, it is well known that (3.8) is equivalent to x̄ being a local weakly minimal solution of (VP_a) for every a ∈ P_x̄, see [17]. Applying now Lemma 3.4, we obtain that x̄ is a local weakly minimal solution of (SP).
Based on Theorem 3.5, we define the following concepts of stationarity for (SP ).

Definition 3.6 We say that x̄ is a stationary point of (SP) if there exists a nonempty set Q ⊆ P_x̄ such that the following assertion holds:

∀ a ∈ Q, ∀ u ∈ R^n, ∃ j ∈ [ω̄] : ∇f_{a_j}(x̄)u ∉ − int K.    (3.9)

In that case, we also say that x̄ is stationary with respect to Q. If, in addition, we can choose Q = P_x̄ in (3.9), we simply call x̄ a strongly stationary point.

Remark 3.7 It follows from Definition 3.6 that a strongly stationary point of (SP) is also stationary with respect to Q for every nonempty Q ⊆ P_x̄. Furthermore, from Theorem 3.5, it is clear that stationarity is also a necessary optimality condition for (SP).
In the following example, we illustrate a comparison of our optimality conditions with previous ones in the literature for standard optimization problems.
and problem (SP ) associated to this data. Hence, in this case, It is then easy to verify that the following statements hold:

On the other hand, it is straightforward to verify thatx is a weakly minimal solution of (SP ) if and only ifx is a solution of the problem
Moreover, if we denote by ∂̂f(x̄) and ∂f(x̄) the Fréchet and the Mordukhovich subdifferential of f at the point x̄, respectively (see [39]), it follows from [39, Proposition 1.114] that the inclusions

0 ∈ ∂̂f(x̄)    (3.10)    and    0 ∈ ∂f(x̄)    (3.11)

are necessary for x̄ being a solution of (P); we refer to these conditions as Fréchet and Mordukhovich stationarity, respectively. Thus, from (3.10), (3.12) and (i), we deduce that

(iii) x̄ is strongly stationary for (SP) if and only if x̄ is Fréchet stationary for (P).
We close the section with the following proposition, that presents an alternative characterization of stationary points.

Descent Method and its Convergence Analysis
Now, we present our solution approach. It is based on the result shown in Lemma 3.4. At every iteration, an element a in the partition set of the current iterate is selected, and then a descent direction for (VP_a) is found using ideas from [6,17]. However, one must be careful with the selection process of the element a in order to guarantee convergence. Thus, we propose a specific way to achieve this. After the descent direction is determined, we follow a classical backtracking procedure of Armijo type to determine a suitable step size, and we update the iterate in the chosen direction. Formally, the method is the following:
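A minimal sketch of such a descent loop for the toy case m = 1 and K = R_+ (where minimizing F with respect to the lower set less relation amounts to descending on x ↦ min_i f_i(x), and the direction subproblem for the active selection f_a has the closed-form solution u = −∇f_a(x) with optimal value −½‖∇f_a(x)‖²). All names below are ours; β and ν are the Armijo parameters:

```python
import numpy as np

def descent_method(fs, grads, x0, beta=1e-4, nu=0.5, tol=1e-8, max_iter=200):
    """Sketch of the descent loop for m = 1, K = R_+, where minimizing
    F(x) = {f_1(x), ..., f_p(x)} reduces to descending on min_i f_i."""
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        a = int(np.argmin([f(x) for f in fs]))  # Step 1: active minimal selection
        u = -grads[a](x)                        # Step 2: direction for (VP_a)
        phi = -0.5 * float(u @ u)               # optimal value of the subproblem
        if abs(phi) < tol:                      # Step 3: stationarity test
            return x
        t = 1.0                                 # Step 4: Armijo backtracking
        while fs[a](x + t * u) > fs[a](x) + beta * t * phi:
            t *= nu
        x = x + t * u
    return x
```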
Proof. It suffices to show the existence of the neighborhood U for each item independently, as we could later take the intersection of them to satisfy all the properties.
(i) Assume that this is not satisfied in any neighborhood U of x̄. Then, we could find a sequence {x_k}_{k≥1} ⊂ R^n such that x_k → x̄ and (4.1) holds. Because of the finite cardinality of all possible differences in (4.1), we can assume without loss of generality that there exists a common ī ∈ [p] such that (4.2) holds. In particular, (4.2) implies that ī ∈ I_0(x_k). Hence, we get the corresponding membership for every k ∈ N. Since R^m \ int K is closed, taking the limit when k → +∞ we deduce that f_ī(x̄) ∈ WMin(F(x̄), K) and ī ∈ I_0(x̄), a contradiction to (4.1).
(ii) Consider the same neighborhood U on which statement (i) holds. Note that, under the given assumption, we have I_0(x̄) = I(x̄). This, together with statement (i), implies the assertion.

(iii) For this statement, it is also sufficient to show that the neighborhood U can be chosen for any fixed point in the set Min(F(x̄), K). Hence, fix v ∈ Min(F(x̄), K) and assume that there is no neighborhood U of x̄ on which the statement is satisfied. Then, we could find sequences {x_k}_{k≥1} ⊂ R^n and {i_k}_{k≥1} ⊆ I_v(x̄) such that x_k → x̄ and (4.3) holds. Since I_v(x̄) is finite, we deduce that there is only a finite number of different elements in the sequence {i_k}. Hence, we can assume without loss of generality that there exists ī ∈ I_v(x̄) such that i_k = ī for every k ∈ N. Then, (4.3) is equivalent to (4.4). From (4.4), we get in particular that f_ī(x_k) ∉ Min(F(x_k), K) for every k ∈ N. This, together with the domination property in Proposition 2.2 and the fact that the sets I(x_k) are contained in the finite set [p], allows us to obtain without loss of generality the existence of ĩ ∈ I(x̄) such that

∀ k ∈ N : f_ĩ(x_k) ⪯ f_ī(x_k) and f_ĩ(x_k) ≠ f_ī(x_k).    (4.5)

Now, taking the limit in (4.5) when k → +∞, we obtain f_ĩ(x̄) ⪯ f_ī(x̄) = v. Since v is a minimal element of F(x̄), it can only be f_ĩ(x̄) = v and, hence, ĩ ∈ I_v(x̄). From this, the first inequality in (4.5), and the fact that f_ī(x_k) ∈ Min({f_i(x_k)}_{i ∈ I_v(x̄)}, K) for every k ∈ N, we get that f_ī(x_k) = f_ĩ(x_k) for all k ∈ N. This contradicts the second part of (4.5), and hence our statement is true.
(iv) It follows directly from the continuity of the functionals f i , i ∈ [p].
(v) The statement is an immediate consequence of (iii) and (iv).
For the main convergence theorem of our method, we will need the notion of regularity of a point for a set-valued mapping.

Definition 4.3 We say that x̄ is a regular point of F if the following conditions are satisfied:

(i) Min(F(x̄), K) = WMin(F(x̄), K);

(ii) the functional ω is constant in a neighborhood of x̄.

Remark 4.4
Since we will analyze the stationarity of the regular limit points of the sequence generated by Algorithm 1, the following points must be addressed: • Notice that, by definition, the regularity property of a point is independent of our optimality concept. Thus, by only knowing that a point is regular, we cannot infer anything about whether it is optimal or not.
• The concept of regularity seems to be linked to the complexity of comparing sets in a high dimensional space. For example, in case m = 1 or p = 1, every point in R^n is regular for the set-valued mapping F. Indeed, in these cases, we have ω(x) = 1 for every x ∈ R^n. A natural question is whether regularity is a strong assumption to impose on a point. In that sense, given the finite structure of the sets F(x), condition (i) in Definition 4.3 seems to be very reasonable. In fact, we would expect that, for most practical cases, this condition is fulfilled at almost every point. For condition (ii), a formalized statement is derived in Proposition 4.5 below.

Proposition 4.5 The set S of points at which ω is locally constant is open and, provided that ω is bounded above, dense in R^n.

Proof. (i) The openness is trivial. Suppose now that S is not dense in R^n. Then, R^n \ (cl S) is nonempty and open. Furthermore, since ω is bounded above, the number p_0 := max{ω(x) | x ∈ R^n \ (cl S)} is well defined. Consider the set A := {x ∈ R^n | ω(x) ≤ p_0 − 1}. From Lemma 4.2 (v), it follows that ω is lower semicontinuous. Hence, A is closed, as it is the sublevel set of a lower semicontinuous functional, see [43, Lemma 1.7.2]. Consider now the set U := (R^n \ cl S) \ A. Then, U is a nonempty open subset of R^n \ (cl S). This, together with the definition of A, gives us ω(x) = p_0 for every x ∈ U. However, this contradicts the fact that ω is not locally constant at any point of R^n \ (cl S). Hence, S is dense in R^n.
An essential property of regular points of a set-valued mapping is described in the next lemma.

Lemma 4.6 Suppose that x̄ is a regular point of F. Then, there exists a neighborhood U of x̄ such that the following properties hold for every x ∈ U:

(i) ω(x) = ω̄;

(ii) there is an enumeration {w_1^x, . . . , w_ω̄^x} of the set Min(F(x), K) such that I_{w_j^x}(x) ⊆ I_{v_j^x̄}(x̄) for every j ∈ [ω̄].

In particular, without loss of generality, we have P_x ⊆ P_x̄ for every x ∈ U.
Proof. Let U be the neighborhood of x̄ from Lemma 4.2. Since x̄ is a regular point of F, we can assume without loss of generality that ω is constant on U. Hence, property (i) is fulfilled. Fix now x ∈ U and consider the enumeration {v_1^x̄, . . . , v_ω̄^x̄} of Min(F(x̄), K). Then, from properties (iii) and (iv) in Lemma 4.2 and the fact that ω(x) = ω̄, we deduce (4.6). Next, for j ∈ [ω̄], we define w_j^x as the unique element of the set Min({f_i(x)}_{i ∈ I_{v_j^x̄}(x̄)}, K). Then, from (4.6), property (iii) in Lemma 4.2 and the fact that ω is constant on U, we obtain that {w_1^x, . . . , w_ω̄^x} is an enumeration of the set Min(F(x), K). It remains to show now that this enumeration satisfies (ii). In order to see this, fix j ∈ [ω̄] and ī ∈ I_{w_j^x}(x). Then, from the regularity of x̄ and property (ii) in Lemma 4.2, we get that I(x) ⊆ I(x̄). In particular, this implies ī ∈ I(x̄). From this and (3.1), we have the existence of j′ ∈ [ω̄] such that ī ∈ I_{v_{j′}^x̄}(x̄). Hence, we deduce (4.7). Then, from (4.6), (4.7) and the definition of w_{j′}^x, we find that w_{j′}^x ⪯ w_j^x. Moreover, because w_j^x, w_{j′}^x ∈ Min(F(x), K), it can only be w_j^x = w_{j′}^x. Thus, it follows that j = j′, since {w_1^x, . . . , w_ω̄^x} is an enumeration of the set Min(F(x), K). This shows that ī ∈ I_{v_j^x̄}(x̄), as desired.
For the rest of the analysis we need to introduce the parametric family of functionals {ϕ_x}_{x ∈ R^n}, whose elements ϕ_x : P_x × R^n → R are defined as follows:

ϕ_x(a, u) := max_{j ∈ [ω(x)]} ψ_e(∇f_{a_j}(x)u) + ½‖u‖².    (4.8)

It is easy to see that, for every x ∈ R^n and a ∈ P_x, the functional ϕ_x(a, ·) is strongly convex on R^n, that is, there exists a constant α > 0 such that the inequality

ϕ_x(a, tu + (1 − t)u′) ≤ t ϕ_x(a, u) + (1 − t) ϕ_x(a, u′) − α t(1 − t)‖u − u′‖²

is satisfied for every u, u′ ∈ R^n and t ∈ [0, 1]. According to [13, Lemma 3.9], the functional ϕ_x(a, ·) attains its minimum over R^n, and this minimum is unique. In particular, we can check that

min_{u ∈ R^n} ϕ_x(a, u) ≤ ϕ_x(a, 0) = 0,    (4.9)

and that, if u_a ∈ R^n is such that ϕ_x(a, u_a) = min_{u ∈ R^n} ϕ_x(a, u), then

ϕ_x(a, u_a) = 0 ⟺ u_a = 0.    (4.10)

Taking into account that P_x is finite, we also obtain that ϕ_x attains its minimum over the set P_x × R^n. Hence, we can consider the functional φ : R^n → R given by

φ(x) := min_{(a,u) ∈ P_x × R^n} ϕ_x(a, u).    (4.11)

Then, because of (4.9), it can be verified that φ(x) ≤ 0 for every x ∈ R^n. Moreover, if (a_x, u_x) is such that φ(x) = ϕ_x(a_x, u_x), it follows from (4.10) (see also [17]) that

φ(x) = 0 ⟺ u_x = 0.

In the following two propositions we show that Algorithm 1 is well defined. We start by proving that, if Algorithm 1 stops in Step 3, a stationary point was found.
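For K = R^m_+ and e = (1, . . . , 1)⊤, one has ψ_e(y) = max_i y_i, so minimizing ϕ_x(a, ·) amounts to min_u max_i g_i⊤u + ½‖u‖² over the rows g_i of the stacked active Jacobians. A standard dual formulation (as in multiobjective steepest descent) gives u* = −G⊤λ* with λ* minimizing ½‖G⊤λ‖² over the probability simplex. The projected-gradient sketch below is our own illustration under these assumptions, not the paper's implementation (which uses CVXPY/ECOS, see Section 5):

```python
import numpy as np

def project_simplex(v):
    """Euclidean projection of v onto the probability simplex."""
    u = np.sort(v)[::-1]
    css = np.cumsum(u)
    rho = np.nonzero(u + (1.0 - css) / (np.arange(len(v)) + 1.0) > 0)[0][-1]
    theta = (1.0 - css[rho]) / (rho + 1.0)
    return np.maximum(v + theta, 0.0)

def descent_direction(G, iters=2000):
    """Solve min_u max_i g_i^T u + 0.5*||u||^2 (rows g_i of G) via the dual
    min over the simplex of 0.5*||G^T lam||^2 (projected gradient descent);
    then u* = -G^T lam*, with optimal value -0.5*||u*||^2."""
    k = G.shape[0]
    lam = np.full(k, 1.0 / k)
    Q = G @ G.T
    step = 1.0 / (np.linalg.norm(Q, 2) + 1e-12)
    for _ in range(iters):
        lam = project_simplex(lam - step * (Q @ lam))
    u = -G.T @ lam
    return u, -0.5 * float(u @ u)
```

A zero direction u* = 0 (equivalently, optimal value 0) then corresponds to the stationarity test φ(x) = 0 above.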

Remark 4.8 A similar statement to the one in Proposition 4.7 can be made for stationary points of (SP). Indeed, for a nonempty set Q ⊆ P_x̄, the following assertions are equivalent:

(i) x̄ is stationary for (SP) with respect to Q;

(ii) min_{(a,u) ∈ Q × R^n} ϕ_x̄(a, u) = 0.

Next, we show that the line search in Step 4 of Algorithm 1 terminates in finitely many steps.

Proposition 4.9 Fix β ∈ (0, 1) and consider the functionals ϕ_x̄ and φ given in (4.8) and (4.11), respectively. Furthermore, let (ā, ū) ∈ P_x̄ × R^n be such that φ(x̄) = ϕ_x̄(ā, ū), and suppose that x̄ is not a strongly stationary point of (SP). The following assertions hold:

(i) There exists t̄ > 0 such that

∀ t ∈ (0, t̄], ∀ j ∈ [ω̄] : f_{ā_j}(x̄ + tū) − v_j^x̄ − βt ϕ_x̄(ā, ū)e ∈ − int K.

(ii) Let t̄ be the parameter in statement (i). Then,

∀ t ∈ (0, t̄] : F(x̄ + tū) ≺^l F(x̄).

In particular, ū is a descent direction of F at x̄ with respect to the preorder ⪯^l.
Proof. (i) Assume otherwise. Then, we could find a sequence {t_k}_{k≥1} and j̄ ∈ [ω̄] such that t_k → 0 and

∀ k ∈ N : f_{ā_j̄}(x̄ + t_k ū) − v_j̄^x̄ − βt_k ϕ_x̄(ā, ū)e ∉ − int K.    (4.16)

As (R^m \ − int K) ∪ {0} is a cone, we can multiply (4.16) by 1/t_k for each k ∈ N to obtain

∀ k ∈ N : (1/t_k)(f_{ā_j̄}(x̄ + t_k ū) − v_j̄^x̄) − βϕ_x̄(ā, ū)e ∉ − int K.    (4.17)

Taking now the limit in (4.17) when k → +∞, we get ∇f_{ā_j̄}(x̄)ū − βϕ_x̄(ā, ū)e ∉ − int K. Since β ∈ (0, 1), this is equivalent to

ψ_e(∇f_{ā_j̄}(x̄)ū) ≥ βϕ_x̄(ā, ū).    (4.18)

On the other hand, since x̄ is not strongly stationary, we can apply Proposition 4.7 to obtain that ū ≠ 0 and that φ(x̄) < 0. This implies that ϕ_x̄(ā, ū) < 0, and hence

max_{j ∈ [ω̄]} ψ_e(∇f_{ā_j}(x̄)ū) = ϕ_x̄(ā, ū) − ½‖ū‖² < ϕ_x̄(ā, ū) < βϕ_x̄(ā, ū).    (4.19)

From this, we deduce that ∀ j ∈ [ω̄] : ψ_e(∇f_{ā_j}(x̄)ū) < 0 and, by Proposition 2.4 (iii),

∀ j ∈ [ω̄] : ∇f_{ā_j}(x̄)ū ∈ − int K.

However, this is a contradiction to (4.18), and hence the statement is proven.
(ii) From (4.19), we know that (4.20) holds. Then, it follows that F(x̄ + tū) ≺^l F(x̄) for every t ∈ (0, t̄], as desired.
We are now ready to establish the convergence of Algorithm 1.

Theorem 4.10 Suppose that Algorithm 1 generates an infinite sequence for which x̄ is an accumulation point. Furthermore, assume that x̄ is regular for F. Then, x̄ is a stationary point of (SP). If, in addition, |P_x̄| = 1, then x̄ is a strongly stationary point of (SP).
Proof. Consider the functional ζ : R^n → R. The proof will be divided into several steps.

Step 1: We show the result (4.21). Indeed, because of the monotonicity property of ψ_e in Proposition 2.4 (ii), the functional ζ is monotone with respect to the preorder ⪯^l. On the other hand, from Proposition 4.9 (ii), we deduce the descent property of the iterates. Hence, using the monotonicity of ζ, we obtain the corresponding inequality for any k ∈ N ∪ {0}. This inequality, together with the definition of φ in (4.11), implies (4.21). On the other hand, since x̄ is an accumulation point of the sequence {x_k}_{k≥0}, we can find a subsequence K ⊆ N such that x_k →_K x̄.
Step 2: The following inequality holds: (4.22). Indeed, from Proposition 4.9 (ii), we can guarantee that the sequence {F(x_k)}_{k≥0} is decreasing with respect to the preorder ⪯^l, that is, (4.23) holds. Fix now k̄ ∈ N and i ∈ [p]. Then, according to (4.23), we have (4.24). Since there are only a finite number of possible values for i_k, we assume without loss of generality that there is ī ∈ [p] such that i_k = ī for every k ∈ K, k ≥ k̄. Hence, (4.24) is equivalent to (4.25). Taking now the limit in (4.25) when k →_K +∞, and recalling that i was chosen arbitrarily in [p], the statement follows.
Step 3: We prove that the sequence {u k } k∈K is bounded.
In order to see this, note that, since x_k is not a stationary point, we have by Proposition 4.7 that φ(x_k) < 0 for every k ∈ N ∪ {0}. By the definition of a_k and u_k, we then obtain the corresponding bound. Let ρ be the Lipschitz constant of ψ_e from Proposition 2.4 (i). Then, we deduce the estimate (4.27). Since {x_k}_{k∈K} is bounded, the statement follows from (4.27).
Step 4: We show thatx is stationary.
Fix κ ∈ N. Then, it follows from (4.21) that (4.28) holds. Adding this inequality for k = 0, . . . , κ, we obtain (4.29). On the other hand, similarly to (4.19) in the proof of Proposition 4.9 (i), we obtain (4.30). In particular, applying Proposition 2.4 (iii) in (4.30), we find (4.31). Taking now the limit in the resulting inequality when κ → +∞, we deduce (4.32). Since there are only a finite number of subsets of [p] and x̄ is regular for F, we can apply Lemma 4.6 to obtain, without loss of generality, the existence of Q ⊆ P_x̄ and a_k ∈ Q such that (4.33) holds. Furthermore, since the sequences {t_k}_{k≥1} and {u_k}_{k∈K} are bounded, we can also assume without loss of generality the existence of t̄ ∈ R and ū ∈ R^n such that (4.34) holds. The rest of the proof is devoted to showing that x̄ is a stationary point with respect to Q. First, observe that, by (4.33) and the definition of a_k, we have the corresponding minimality property. Then, taking into account that ω_k = ω̄ in (4.33), we can take the limit when k →_K +∞ in the above expression to obtain

∀ a ∈ Q, u ∈ R^n : ϕ_x̄(ā, ū) ≤ ϕ_x̄(a, u).
Equivalently, we have

(ā, ū) ∈ argmin_{(a,u) ∈ Q × R^n} ϕ_x̄(a, u).    (4.35)

Next, we analyze two cases. In the first case, according to (4.32) and (4.33), it follows that ū = 0. This, together with (4.35) and Remark 4.8, implies that x̄ is a stationary point with respect to Q.

Implementation and Numerical Illustrations
In this section, we report some preliminary numerical experience with the proposed method. Algorithm 1 was implemented in Python 3, and the experiments were run on a PC with an Intel(R) Core(TM) i5-4200U CPU and 4.0 GB of RAM. We describe below some details of the implementation and the experiments: • We considered instances of problem (SP) only for the case in which K is the standard ordering cone, that is, K = R^m_+. In addition, we chose the parameter e ∈ int K for the scalarizing functional ψ_e as e = (1, . . . , 1)⊤.
• The parameters β and ν for the line search in Step 4 of the method were chosen as β = 0.0001, ν = 0.500.
• The stopping criterion employed was that ‖u_k‖ < 0.0001, or that a maximum number of 200 iterations was reached.
• For finding the set Min(F(x_k), K) at the kth iteration in Step 1 of the algorithm, we implemented the method developed by Günther and Popovici in [18]. This procedure requires a functional ψ : R^m → R that is strongly monotone with respect to the partial order for a so-called presorting phase. In our implementation, we used ψ defined as follows: The other possibility for finding the set Min(F(x_k), K) would have been to use the method introduced by Jahn in [21,22,26] with ideas from [45]. However, as mentioned in the introduction, the first approach has better computational complexity, so the algorithm proposed in [18] was a clear choice.
• At the kth iteration in Step 2 of the algorithm, we used the modeling language CVXPY 1.0 [1,8] to solve the problem min_{(a,u) ∈ P_k×R^n} ϕ_{x_k}(a, u). Since the variable a is constrained to lie in the discrete set P_k, we proceeded as follows: using the solver ECOS [9] within CVXPY, we computed, for every a ∈ P_k, the unique solution u_a of the strongly convex problem min_{u ∈ R^n} ϕ_{x_k}(a, u). Then, we set (a_k, u_k) = argmin_{a ∈ P_k} ϕ_{x_k}(a, u_a).
• For each test instance considered in the experimental part, we generated initial points randomly on a specific set and ran the algorithm. We declared as solved those runs in which the algorithm stopped because ‖u_k‖ < 0.0001, that is, a strongly stationary point was found. For a given run, its final error is the value of ‖u_k‖ at the last iteration. The following variables are collected for each test instance:
Solved: the number of initial points for which the problem was solved.
Iterations: a 3-tuple (min, mean, max) indicating the minimum, the mean, and the maximum number of iterations among the runs reported as solved.
Mean CPU Time: the mean of the CPU time (in seconds) among the solved cases.
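For reference, when K = R^m_+ the scalarizing functional ψ_e admits a simple closed form. A minimal sketch, assuming the Gerstewitz (Tammer–Weidner) form ψ_e(z) = min{t ∈ R : z ∈ te − K}, which for K = R^m_+ reduces to max_i z_i/e_i:

```python
import numpy as np

def psi_e(z, e):
    """Gerstewitz scalarization for K = R^m_+: the smallest t with z <= t*e
    componentwise.  With e = (1, ..., 1) this is simply the maximum component."""
    z, e = np.asarray(z, dtype=float), np.asarray(e, dtype=float)
    return float(np.max(z / e))
```

With e = (1, . . . , 1), as in our experiments, ψ_e(z) is just the largest component of z.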
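The line search in Step 4 with parameters β and ν can be sketched as a standard Armijo backtracking loop. The merit function `f` and the predicted-decrease value `slope` below are hypothetical stand-ins for the quantities actually used in Algorithm 1:

```python
def backtracking(f, x, u, slope, beta=0.0001, nu=0.5, max_backtracks=50):
    """Armijo backtracking: shrink the step by the factor nu until the
    sufficient decrease condition f(x + t*u) <= f(x) + beta*t*slope holds,
    where slope < 0 is a (hypothetical) predicted decrease at x along u."""
    t = 1.0
    fx = f(x)
    for _ in range(max_backtracks):
        trial = [xi + t * ui for xi, ui in zip(x, u)]
        if f(trial) <= fx + beta * t * slope:
            return t
        t *= nu
    return t
```

For instance, minimizing f(x) = ‖x‖² from x = (1, 0) along the steepest descent direction u = (−2, 0) with slope = −4 accepts the step t = 0.5 after one halving.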
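The presorting idea behind the Günther–Popovici procedure used in Step 1 can be illustrated as follows. This is a simplified sketch for K = R^m_+ with the strongly monotone choice ψ(z) = Σ_i z_i (the actual algorithm in [18] is more refined); it exploits that, after sorting by ψ, a point can only be dominated by points listed before it, so one forward scan suffices:

```python
import numpy as np

def minimal_elements(points):
    """Minimal elements of a finite set w.r.t. the componentwise order
    (K = R^m_+).  Presorting by psi(z) = sum(z) guarantees that any
    dominating point appears earlier in the list, so each candidate is
    only checked against the points already kept as minimal."""
    pts = sorted((np.asarray(p, dtype=float) for p in points),
                 key=lambda p: float(p.sum()))
    kept = []
    for p in pts:
        # q dominates p iff q <= p componentwise with at least one strict inequality
        if not any(np.all(q <= p) and np.any(q < p) for q in kept):
            kept.append(p)
    return kept
```

For example, among {(1, 2), (2, 1), (2, 2), (0, 3)} the point (2, 2) is dominated by (1, 2) and is discarded, leaving three minimal elements.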
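The decomposition used in Step 2 — enumerating the discrete variable a and solving a strongly convex inner problem in u — can be sketched as below. The quadratic model ϕ(a, u) = ⟨g_a, u⟩ + ½‖u‖² is a hypothetical stand-in for the paper's ϕ_{x_k}, chosen so the inner problem has the closed-form minimizer u_a = −g_a; in the actual implementation each inner problem is solved with ECOS via CVXPY:

```python
import numpy as np

def argmin_over_pairs(gradients):
    """Enumerate the discrete variable a and solve the inner strongly convex
    problem exactly.  Here phi(a, u) = <g_a, u> + 0.5*||u||^2 (a hypothetical
    surrogate), whose unique minimizer in u is u_a = -g_a with value
    -0.5*||g_a||^2.  Returns the best (a, u_a, value) triple."""
    best = None
    for a, g in gradients.items():
        g = np.asarray(g, dtype=float)
        u_a = -g                                   # closed-form inner minimizer
        val = float(g @ u_a + 0.5 * (u_a @ u_a))   # = -0.5 * ||g||^2
        if best is None or val < best[2]:
            best = (a, u_a, val)
    return best
```

The enumeration is exact because P_k is finite: the outer minimum is simply the smallest of the |P_k| inner optimal values.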
Furthermore, for clarity, all numerical values are displayed to four decimal places. We now proceed to the different instances on which our algorithm was tested. Our first test instance can be seen as a continuous version of an example in [14].
Test Instance 5.1 We consider F : R ⇒ R^2 defined as where, for i ∈ [5], f_i : R → R^2 is given as The objective values in this case are discretized segments moving around a curve and being contracted (dilated) by a factor depending on the argument. We generated 100 initial points x_0 randomly on the interval [−5π, 5π] and ran our algorithm. Some of the metrics are collected in Table 1. As we can see, in this case all the runs terminated by finding a strongly stationary point. Moreover, we observed that not too many iterations were needed for this problem. In Figure 1, the sequence {F(x_k)}_{k ∈ {0}∪[7]} generated by Algorithm 1 for a selected starting point is shown. In this case, strong stationarity was declared after 7 iterations. The traces of the curves f_i for i ∈ [5] are displayed, with arrows indicating their direction of movement. Moreover, the sets F(x_0) and F(x_7) are represented by black and red points respectively, and the elements of the sets F(x_k) with k ∈ [6] are shown in gray. The improvement of the objective values after every iteration is clearly observed.

Test Instance 5.2 We define, for i ∈ [100], the functional f_i : R^2 → R^3 as Finally, the set-valued mapping F : R^2 ⇒ R^3 is defined by Note that problem (SP) corresponds in this case to the robust counterpart of a vector location problem under uncertainty [20], where U represents the uncertainty set on the location facilities l_1, l_2, l_3. Furthermore, with the aid of Theorem 3.5, it is possible to show that a point x̄ is a local weakly minimal solution of (SP) if and only if x̄ ∈ conv {l_j + u_i | (i, j) ∈ I(x̄) × {1, 2, 3}}.
Thus, in particular, the local weakly minimal solutions lie in the set C := conv((l_1 + U) ∪ (l_2 + U) ∪ (l_3 + U)). (5.1) In this test instance, 100 initial points x_0 were generated in the square [−50, 50] × [−50, 50], and Algorithm 1 was run in each case. A summary of the results is presented in Table 2. Again, for every initial point the sequence generated by the algorithm stopped at a local solution to our problem. Perhaps the most noticeable parameter recorded in this case is the number of iterations required to declare a solution. Indeed, in most cases only one iteration was enough, even when the starting point was far away from the locations l_1, l_2, l_3.

Solved    Iterations       Mean CPU Time
100       (0, 1.32, 2)     0.0637
Table 2: Performance of Algorithm 1 in Test Instance 5.2

In Figure 2, the set of solutions found in this experiment is shown in red. The locations l_1, l_2, l_3 are represented by black points, and the elements of the set (l_1 + U) ∪ (l_2 + U) ∪ (l_3 + U) are colored gray. We can observe, as expected, that all the local solutions found are contained in the set C given in (5.1). Our last test example comes from [24]. The images of the set-valued mapping in this example are discretized, shifted, rotated, and deformed rhombuses; see Figure 3. We randomly generated 100 initial points in the square [−10π, 10π] × [−10π, 10π] and ran our algorithm. A summary of the results is collected in Table 3. In this case, a solution was found for only 88 of the initial points. In the remaining cases, the algorithm stopped because the maximum number of iterations was reached. Further examination of these unsolved cases revealed that, except for two of the initial points, the final error was of the order of 10^{−1} (and even 10^{−3} or 10^{−4} in half of the cases). Thus, perhaps only a few more iterations were needed in order to declare strong stationarity. Figure 3 illustrates the sequence {F(x_k)}_{k ∈ {0}∪[18]} generated by Algorithm 1 for a selected starting point. Strong stationarity was declared after 18 iterations in this experiment. The sets F(x_0) and F(x_18) are represented by black and red points respectively, and the elements of the sets F(x_k) with k ∈ [17] are shown in gray. Similarly to the other test instances, we can observe that at every iteration the images decrease with respect to the preorder.

Conclusions
In this paper, we considered set optimization problems with respect to the lower less set relation, where the set-valued objective mapping can be decomposed into a finite number of continuously differentiable selections. The main contributions are tailored optimality conditions derived using first order information of the selections in the decomposition, together with an algorithm for the solution of problems with this structure. An attractive feature of our method is that we are able to guarantee convergence towards points satisfying the previously mentioned optimality conditions. To the best of our knowledge, this is the first procedure with such a property in the context of set optimization. Finally, because of the applications of problems with this structure to optimization under uncertainty, ideas for further research include the development of cutting plane strategies for general set optimization problems, as well as the extension of our results to other set relations.