1 Introduction

Set optimization is the class of mathematical problems that consists of minimizing set-valued mappings acting between two vector spaces, in which the image space is partially ordered by a given closed, convex and pointed cone. There are two main approaches for defining solution concepts for this type of problem, namely the vector approach and the set approach. In this paper, we deal with the latter. The main idea of this approach lies in defining a preorder on the power set of the image space and considering minimal solutions of the set-valued problem accordingly. Research in this area started with the works of Young [46], Nishnianidze [41] and Kuroiwa [35, 36], in which the first set relations for defining a preorder were considered. Furthermore, Kuroiwa [34] was the first to consider set optimization problems where the solution concept is given by the set approach. Since then, research in this direction has expanded immensely due to its applications in finance, optimization under uncertainty, game theory and socioeconomics. We refer the reader to [29] for a comprehensive overview of the field.

The research topic that concerns us in this paper is the development of efficient algorithms for the solution of set optimization problems. In this setting, the current approaches in the literature can be roughly clustered into four different groups:

  • Derivative-free methods [23, 24, 30].

    In this context, the derived algorithms are descent methods and use a derivative-free strategy [7]. These algorithms are designed to deal with unconstrained problems, and they assume no particular structure of the set-valued objective mapping. The first method of this type was described in [23]. There, the case in which both the epigraphical and hypographical multifunctions of the set-valued objective mapping have convex values was analyzed. This convexity assumption was then relaxed in [30] for the so-called upper set less relation. Finally, in [24], a new method with this strategy was studied. An interesting feature of the algorithm in this reference is that, instead of choosing only one descent direction at every iteration, it considers several of them at the same time. Thus, the method generates a tree with the initial point as the root and the possible solutions as leaves.

  • Algorithms of a sorting type [17, 18, 31, 32].

The methods in this class are specifically designed to treat set optimization problems with a finite feasible set. Because of this, they are based on simple comparisons between the images of the set-valued objective mapping. In [31, 32], the algorithms are extensions of those by Jahn [21, 26] for vector optimization problems. They use so-called forward and backward reduction procedures that, in practice, avoid many of the comparisons mentioned above. Therefore, these methods perform more efficiently than a naive implementation in which every pair of sets must be compared. More recently, in [17, 18], an extension of the algorithm by Günther and Popovici [16] for vector problems was studied. The idea is to first find an enumeration of the images of the set-valued mapping whose values under a scalarization by a strongly monotone functional are increasing. In a second step, a forward iteration procedure is performed. Due to the presorting step, these methods enjoy an almost optimal computational complexity, compare [33].

  • Algorithms based on scalarization [11, 12, 19, 20, 27, 44].

The methods in this group follow a scalarization approach and are derived for problems where the set-valued objective mapping has a particular structure that comes from the so-called robust counterpart of a vector optimization problem under uncertainty, see [20]. In [11, 19, 20], a linear scalarization was employed for solving the set optimization problem. Furthermore, the \(\epsilon \)-constraint method was also extended in [11, 19] for the particular case in which the ordering cone is the nonnegative orthant. Weighted Chebyshev scalarization and some of its variants (augmented, min-ordering) were also studied in [19, 27, 44].

  • Branch and bound [12].

The algorithm in [12] is also designed for uncertain vector optimization problems, but it is assumed that the decision variable is the only source of uncertainty. There, the authors propose a branch and bound method for finding a box covering of the solution set.

The strategy that we consider in this paper is different from the ones previously described and is designed for dealing with unconstrained set optimization problems in which the set-valued objective mapping is given by a finite number of continuously differentiable selections. Our motivation for studying problems with this particular structure is twofold:

  • Problems of this type have important applications in optimization under uncertainty.

    Indeed, set optimization problems with this structure arise when computing robust solutions to vector optimization problems under uncertainty, if the so-called uncertainty set is finite, see [20]. Furthermore, the solvability of problems with a finite uncertainty set is an important component in the treatment of the general case with an infinite uncertainty set, see the cutting plane strategy in [40] and the reduction results in [3, Proposition 2.1] and [11, Theorem 5.9].

  • Current algorithms in the literature pose different theoretical and practical difficulties when solving these types of problems.

Indeed, although derivative-free methods can be directly applied in this setting, they suffer from the same drawbacks as their counterparts in the scalar case. Specifically, because they make no use of first-order information (which we assume is available in our context), we expect them to perform slower in practice than a method that exploits these additional properties. Even worse, in the set-valued setting, there is now an increased cost of performing comparisons between sets, which was almost negligible for scalar problems. On the other hand, the algorithms of a sorting type described earlier cannot be used in our setting since they require a finite feasible set. Similarly, the branch and bound strategy is designed for problems that do not fit the particular structure that we consider in this paper, and so it cannot be applied either. Finally, we can also consider the algorithms based on scalarization in our context. However, the main drawback of these methods is that, in general, they are not able to recover all the solutions of the set optimization problem. In fact, the \(\epsilon \)-constraint method, which is known to overcome this difficulty in standard multiobjective optimization, will fail in this setting.

Thus, we address in this paper the need for a first-order method that exploits the particular structure of the set-valued objective mapping previously mentioned and does not share the drawbacks of the other approaches in the literature.

The rest of the paper is structured as follows. We start in Sect. 2 by introducing the main notations, basic concepts and results that will be used throughout the paper. In Sect. 3, we derive optimality conditions for set optimization problems with the aforementioned structure. These optimality conditions constitute the basis of the descent method described in Sect. 4, where the full convergence of the algorithm is also obtained. In Sect. 5, we illustrate the performance of the method on different test instances. We conclude in Sect. 6 by summarizing our results and proposing ideas for further research.

2 Preliminaries

We start this section by introducing the main notations used in the paper. First, the class of all nonempty subsets of \({{\mathbb {R}}}^m\) will be denoted by \(\mathscr {P}({{\mathbb {R}}}^m).\) Furthermore, for \(A \in \mathscr {P}({{\mathbb {R}}}^m)\), we denote by \(int A\), \(cl A\), \(bd A\) and \(conv A\) the interior, closure, boundary and convex hull of the set A,  respectively. All the considered vectors are column vectors, and we denote the transpose operator with the symbol \(\top .\) On the other hand, \(\Vert \cdot \Vert \) will stand for either the Euclidean norm of a vector or for the standard spectral norm of a matrix, depending on the context. We also denote the cardinality of a finite set A by |A|. Finally, for \(k\in \mathbb {N},\) we put \([k] := \{1,\ldots ,k\}.\)

We next consider the most important definitions and properties involved in the results of the paper. Recall that a set \(K \in \mathscr {P}({{\mathbb {R}}}^m)\) is said to be a cone if \(t y\in K\) for every \(y\in K\) and every \(t \ge 0.\) Moreover, a cone K is called convex if \(K + K = K,\) pointed if \(K\cap (-K)=\{0\},\) and solid if \(int K \ne \emptyset .\) An important related concept is that of the dual cone. For a cone K,  this is the set

$$\begin{aligned} K^*:=\{v \in {{\mathbb {R}}}^m \mid \forall \; y\in K: v^\top y\ge 0\}. \end{aligned}$$
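
For instance, the nonnegative orthant \({{\mathbb {R}}}^m_+\) is a closed, convex, pointed and solid cone that is self-dual, that is, \(({{\mathbb {R}}}^m_+)^* = {{\mathbb {R}}}^m_+.\)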

Throughout, we suppose that \(K \in \mathscr {P}({{\mathbb {R}}}^m)\) is a cone.

It is well known (see [14]) that when K is convex and pointed, it generates a partial order \(\preceq \) on \({{\mathbb {R}}}^m\) as follows:

$$\begin{aligned} y\preceq z: \Longleftrightarrow z-y\in K. \end{aligned}$$
(1)

Furthermore, if K is solid, one can also consider the so-called strict order \(\prec \) which is defined by

$$\begin{aligned} y\prec z: \Longleftrightarrow z-y\in int K. \end{aligned}$$
(2)

In the following definition, we collect the concepts of minimal and weakly minimal elements of a set with respect to \(\preceq .\)

Definition 2.1

Let \(A \in \mathscr {P}({{\mathbb {R}}}^m)\) and suppose that K is closed, convex, pointed and solid.

  (i)

    The set of minimal elements of A with respect to K is defined as

    $$\begin{aligned} Min (A,K):= \{y \in A\mid \left( y- K\right) \cap A =\{y\}\}. \end{aligned}$$
  (ii)

    The set of weakly minimal elements of A with respect to K is defined as

    $$\begin{aligned} WMin (A,K):= \{y \in A\mid \left( y- int K\right) \cap A =\emptyset \}. \end{aligned}$$
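
To make these notions concrete, the following is a minimal numerical sketch of how \(Min (A,K)\) and \(WMin (A,K)\) can be computed for a finite set A in the common case \(K = {{\mathbb {R}}}^m_+;\) the function names and the tolerance handling are our own illustrative choices, not part of the formal development.

```python
import numpy as np

def minimal_elements(A, tol=1e-12):
    """Min(A, R^m_+) for a finite A given as a list of 1-D arrays:
    y is minimal iff no other point z satisfies z <= y with z != y."""
    return [y for i, y in enumerate(A)
            if not any(np.all(z <= y + tol) and np.any(z < y - tol)
                       for j, z in enumerate(A) if j != i)]

def weakly_minimal_elements(A, tol=1e-12):
    """WMin(A, R^m_+): y is weakly minimal iff no z in A satisfies z < y
    strictly in every component, i.e., (y - int K) does not meet A."""
    return [y for y in A
            if not any(np.all(z < y - tol) for z in A)]
```

For example, with \(A = \{(1,0)^\top ,(0,1)^\top ,(1,1)^\top \}\) and \(K = {{\mathbb {R}}}^2_+,\) the points \((1,0)^\top \) and \((0,1)^\top \) are minimal, while \((1,1)^\top \) is not minimal (it is dominated by \((1,0)^\top \)) but is still weakly minimal, since no point of A is strictly smaller in every component.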

The following proposition will often be used.

Proposition 2.1

([22, Theorem 6.3 c)]) Let \(A \in \mathscr {P}({{\mathbb {R}}}^m)\) be compact, and K be closed, convex and pointed. Then, A satisfies the so-called domination property with respect to K, that is,

$$\begin{aligned} A + K = Min (A,K) + K. \end{aligned}$$

The Gerstewitz scalarizing functional will play also an important role in the main results.

Definition 2.2

Let K be closed, convex, pointed and solid. For a given element \(e \in int K,\) the Gerstewitz functional associated with e and K is \(\psi _e: {{\mathbb {R}}}^m \rightarrow {{\mathbb {R}}}\) defined as

$$\begin{aligned} \psi _e(y):= \min \{t\in \mathbb {R} \mid te\in y+K\}. \end{aligned}$$
(3)
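
For instance, in the frequently used case \(K = {{\mathbb {R}}}^m_+\) and \(e = (1,\ldots ,1)^\top ,\) one obtains \(\psi _e(y) = \max _{i \in [m]} y_i,\) since \(te - y \in {{\mathbb {R}}}^m_+\) holds if and only if \(t \ge y_i\) for every \(i \in [m].\)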

Useful properties of this functional are summarized in the next proposition.

Proposition 2.2

([29, Section 5.2]) Let K be closed, convex, pointed and solid, and consider an element \(e \in int K\). Then, the functional \(\psi _e\) satisfies the following properties:

  (i)

    \(\psi _e\) is sublinear and Lipschitz on \({{\mathbb {R}}}^m.\)

  (ii)

    \(\psi _e\) is both monotone and strictly monotone with respect to the partial order \(\preceq \), that is,

    $$\begin{aligned} \forall \; y,z \in {{\mathbb {R}}}^m: y \preceq z \Longrightarrow \psi _e(y)\le \psi _e(z) \end{aligned}$$

    and

    $$\begin{aligned} \forall \; y,z \in {{\mathbb {R}}}^m: y \prec z \Longrightarrow \psi _e(y) < \psi _e(z), \end{aligned}$$

    respectively.

  (iii)

    \(\psi _e\) satisfies the so-called representability property, that is,

    $$\begin{aligned} - K= \{y \in {{\mathbb {R}}}^m \mid \psi _e(y)\le 0 \}, \quad - int K = \{y \in {{\mathbb {R}}}^m \mid \psi _e(y)< 0 \}. \end{aligned}$$

We next introduce the set relations between the nonempty subsets of \({{\mathbb {R}}}^m\) that will be used in the definition of the set optimization problem we consider. We refer the reader to [25, 28] and the references therein for other set relations.

Definition 2.3

[37] For the given cone K,  the lower set less relation \(\preceq ^\ell \) is the binary relation defined on \(\mathscr {P}({{\mathbb {R}}}^m)\) as follows:

$$\begin{aligned} \forall \; A,B \in \mathscr {P}({{\mathbb {R}}}^m): A\preceq ^\ell B: \Longleftrightarrow B\subseteq A+K. \end{aligned}$$

Similarly, if K is solid, the strict lower set less relation \(\prec ^\ell \) is the binary relation defined on \(\mathscr {P}({{\mathbb {R}}}^m)\) by:

$$\begin{aligned} \forall \; A,B \in \mathscr {P}({{\mathbb {R}}}^m): A\prec ^\ell B: \Longleftrightarrow B\subseteq A+ int K. \end{aligned}$$
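
As an illustration, for finite sets and \(K = {{\mathbb {R}}}^m_+,\) both relations can be checked by a direct double loop; the following sketch (with an illustrative tolerance parameter of our own choosing) exploits that \(B\subseteq A+K\) holds if and only if every \(b \in B\) is componentwise dominated by some \(a \in A:\)

```python
import numpy as np

def lower_set_less(A, B, strict=False, tol=1e-12):
    """A <=^l B (or A <^l B if strict=True) for finite sets of 1-D arrays
    and K = R^m_+: every b in B must lie in a + K (or a + int K) for some a in A."""
    if strict:
        dominates = lambda a, b: np.all(a < b - tol)   # b in a + int K
    else:
        dominates = lambda a, b: np.all(a <= b + tol)  # b in a + K
    return all(any(dominates(a, b) for a in A) for b in B)
```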

Remark 2.1

Note that for any two vectors \(y,z \in {{\mathbb {R}}}^m\) the following equivalences hold:

$$\begin{aligned} \{y\} \preceq ^\ell \{z\} \Longleftrightarrow y \preceq z, \;\; \{y\} \prec ^\ell \{z\} \Longleftrightarrow y \prec z. \end{aligned}$$

Thus, the restrictions of \(\preceq ^\ell \) and \(\prec ^\ell \) to the singletons in \(\mathscr {P}({{\mathbb {R}}}^m)\) are equivalent to \(\preceq \) and \(\prec ,\) respectively.

We are now ready to present the set optimization problem together with a solution concept based on set relations.

Definition 2.4

Let \(F:{{\mathbb {R}}}^n \rightrightarrows {{\mathbb {R}}}^m\) be a given set-valued mapping taking only nonempty values, and suppose that K is closed, convex, pointed and solid. The set optimization problem with these data is formally represented as

$$\begin{aligned} \preceq ^\ell \text {-}\min \limits _{x\in {{\mathbb {R}}}^n} \; F(x) \qquad (\mathcal {SP}_\ell ) \end{aligned}$$

and a solution is understood in the following sense: We say that a point \(\bar{x} \in {{\mathbb {R}}}^n\) is a local weakly minimal solution of (\(\mathcal {SP}_\ell \)) if there exists a neighborhood U of \(\bar{x}\) such that the following holds:

$$\begin{aligned} \not \exists \; x \in U: F(x) \prec ^\ell F(\bar{x}). \end{aligned}$$

Moreover, if we can choose \(U = {{\mathbb {R}}}^n\) above, we simply say that \(\bar{x}\) is a weakly minimal solution of (\(\mathcal {SP}_\ell \)).

Remark 2.2

A related problem to (\(\mathcal {SP}_\ell \)) that is relevant in our paper is the so-called vector optimization problem [22, 38]. There, for a vector-valued mapping \(f: {{\mathbb {R}}}^n \rightarrow {{\mathbb {R}}}^m,\) one considers

$$\begin{aligned} \preceq \text {-}\min \limits _{ x\in {{\mathbb {R}}}^n} \; f(x), \end{aligned}$$

where a point \(\bar{x}\) is said to be a weakly minimal solution if

$$\begin{aligned} f(\bar{x}) \in WMin (f[{{\mathbb {R}}}^n],K) \end{aligned}$$

(corresponding to Definition 2.1). Taking into account Remark 2.1, it is easy to verify that this solution concept coincides with ours for (\(\mathcal {SP}_\ell \)) when the set-valued mapping F is given by \(F(x):= \{f(x)\}\) for every \(x \in {{\mathbb {R}}}^n.\)

We conclude the section by establishing the main assumption employed in the rest of the paper for the treatment of (\(\mathcal {SP}_\ell \)):

Assumption 1

Suppose that \(K \in \mathscr {P}({{\mathbb {R}}}^m)\) is a closed, convex, pointed and solid cone and that \( e\in int K\) is fixed. Furthermore, consider a reference point \(\bar{x} \in {{\mathbb {R}}}^n,\) given vector-valued functions \(f^1, f^2,\ldots , f^p: {{\mathbb {R}}}^n \rightarrow {{\mathbb {R}}}^m\) that are continuously differentiable, and assume that the set-valued mapping F in (\(\mathcal {SP}_\ell \)) is defined by

$$\begin{aligned} F(x):=\bigg \{f^1(x), f^2(x),\ldots , f^p(x) \bigg \}. \end{aligned}$$
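
For instance, if \(f:{{\mathbb {R}}}^n \times \mathcal {U} \rightarrow {{\mathbb {R}}}^m\) is the objective of a vector optimization problem under uncertainty and the uncertainty set \(\mathcal {U} = \{s_1,\ldots ,s_p\}\) is finite, then setting \(f^i := f(\cdot ,s_i)\) for \(i \in [p]\) yields \(F(x) = \{f(x,s_1),\ldots ,f(x,s_p)\},\) which is precisely the structure arising in the robust counterpart discussed in the introduction, see [20].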

3 Optimality Conditions

In this section, we study optimality conditions for weakly minimal solutions of (\(\mathcal {SP}_\ell \)) under Assumption 1. These conditions are the foundation on which the proposed algorithm is built. In particular, because of the resemblance of our method to standard gradient descent in the scalar case, we are interested in Fermat rules for set optimization problems. Recently, results of this type were derived in [5], see also [2]. There, the optimality conditions involve the computation of the limiting normal cone [39] of the set-valued mapping F at different points in its graph. However, this is a difficult task in our case because the graph of F is the union of the graphs of the vector-valued functions \(f^i,\) and, to the best of our knowledge, there is no exact formula for the normal cone to a union of sets (at a given point) in terms of the initial data. Thus, instead of considering the results from [5], we exploit the particular structure of F and the differentiability of the functions \(f^i\) to deduce new necessary conditions.

We start by defining some index-related set-valued mappings that will be of importance. They make use of the concepts introduced in Definition 2.1.

Definition 3.1

The following set-valued mappings are defined:

  (i)

    The active index of minimal elements associated with F is \(I:{{\mathbb {R}}}^n \rightrightarrows [p]\) given by

    $$\begin{aligned} I(x):= \big \{i \in [p] \mid f^i(x) \in Min (F(x),K) \big \}. \end{aligned}$$
  (ii)

    The active index of weakly minimal elements associated with F is \(I_W:{{\mathbb {R}}}^n \rightrightarrows [p]\) defined as

    $$\begin{aligned} I_W(x):= \big \{i \in [p] \mid f^i(x) \in WMin (F(x),K) \big \}. \end{aligned}$$
  (iii)

    For a vector \(v\in {{\mathbb {R}}}^m,\) we define \(I_v:{{\mathbb {R}}}^n \rightrightarrows [p]\) as

    $$\begin{aligned} I_v(x):= \{i \in I(x) \mid f^i(x) = v\}. \end{aligned}$$

It follows from the definition that \(I_v(x) = \emptyset \) whenever \(v \notin Min (F(x),K)\) and that

$$\begin{aligned} \forall \; x \in {{\mathbb {R}}}^n: I(x) = \bigcup \limits _{v \in Min (F(x),K)} I_v(x). \end{aligned}$$
(4)

Definition 3.2

The map \(\omega :{{\mathbb {R}}}^n \rightarrow {{\mathbb {R}}}\) is defined as the cardinality of the set of minimal elements of F, that is,

$$\begin{aligned} \omega (x): = |Min (F(x),K)|. \end{aligned}$$

Furthermore, we set \(\bar{\omega }:= \omega (\bar{x}).\)

From now on, we consider that, for any point \(x \in {{\mathbb {R}}}^n,\) an enumeration \(\{v^x_1,\ldots , v^x_{\omega (x)}\}\) of the set \(Min (F(x),K)\) has been chosen in advance.

Definition 3.3

Let \(x\in {{\mathbb {R}}}^n,\) and consider the enumeration \(\{v^x_1,\ldots , v^x_{\omega (x)}\}\) of the set \(Min (F(x),K).\) The partition set of x is defined as

$$\begin{aligned} P_x:= \prod \limits _{j=1}^{\omega (x)}I_{v^x_j}(x), \end{aligned}$$

where \(I_{v^x_j}(x)\) is given in Definition 3.1 (iii) for \(j \in [\omega (x)].\)
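
For \(K = {{\mathbb {R}}}^m_+,\) the partition set can be enumerated directly from the definitions. The following sketch reuses the minimal_elements routine sketched in Sect. 2 and groups the active indices by their common minimal value; the rounding-based test for the equality \(f^i(x) = v\) is only an illustrative device for finite-precision arithmetic.

```python
import itertools
import numpy as np

def partition_set(f_list, x, decimals=12):
    """Enumerate P_x (Definition 3.3) for K = R^m_+ as tuples of indices.

    f_list: the selections f^1, ..., f^p, each mapping R^n -> R^m.
    """
    values = [f(x) for f in f_list]
    minimal = minimal_elements(values)  # Min(F(x), K), possibly with repetitions
    min_keys = {tuple(np.round(v, decimals)) for v in minimal}
    # I_v(x): active indices i with f^i(x) = v, for each distinct minimal value v.
    groups = [[i for i, y in enumerate(values)
               if tuple(np.round(y, decimals)) == key]
              for key in sorted(min_keys)]
    # P_x is the Cartesian product of the index sets I_{v_1}(x), ..., I_{v_omega}(x).
    return list(itertools.product(*groups))
```

In particular, \(\omega (x)\) is recovered as the number of groups, and \(|P_x| = \prod _{j=1}^{\omega (x)} |I_{v^x_j}(x)|.\)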

The optimality conditions for (\(\mathcal {SP}_\ell \)) we will present are based on the following idea: from the particular structure of F,  we will construct a family of vector optimization problems that, together, locally represent (\(\mathcal {SP}_\ell \)) (in a sense to be specified) around the point which must be checked for optimality. Then, (standard) optimality conditions are applied to the family of vector optimization problems. The following lemma is the key step in that direction.

Lemma 3.1

Let \(\tilde{K} \in \mathscr {P}\left( {{\mathbb {R}}}^{m \bar{\omega }} \right) \) be the cone defined as

$$\begin{aligned} \tilde{K}:= \prod \limits _{j=1}^{\bar{\omega }} K, \end{aligned}$$
(5)

and let us denote by \(\preceq _{\tilde{K}}\) and \(\prec _{\tilde{K}}\) the partial order and the strict order in \({{\mathbb {R}}}^{m\bar{\omega }}\) induced by \(\tilde{K},\) respectively (see (1) and (2)). Furthermore, consider the partition set \(P_{\bar{x}}\) associated with \(\bar{x}\) and define, for every \(a = (a_1, \ldots , a_{\bar{\omega }})\in P_{\bar{x}},\) the function \(\tilde{f}^a: {{\mathbb {R}}}^n \rightarrow \prod \nolimits _{j=1}^{\bar{\omega }} {{\mathbb {R}}}^m\) as

$$\begin{aligned} \tilde{f}^a(x):= \begin{pmatrix} f^{a_1}(x)\\ \vdots \\ f^{a_{\bar{\omega }}}(x) \end{pmatrix}. \end{aligned}$$
(6)

Then, \(\bar{x}\) is a local weakly minimal solution of (\(\mathcal {SP}_\ell \)) if and only if, for every \(a \in P_{\bar{x}},\) \(\bar{x}\) is a local weakly minimal solution of the vector optimization problem

$$\begin{aligned} \preceq _{\tilde{K}}\text {-}\min \limits _{x\in {{\mathbb {R}}}^n} \; \tilde{f}^a(x). \qquad (\mathcal {VP}_a) \end{aligned}$$

Proof

We argue by contradiction in both cases. First, assume that \(\bar{x}\) is a local weakly minimal solution of (\(\mathcal {SP}_\ell \)) and that, for some \(a\in P_{\bar{x}},\) \(\bar{x}\) is not a local weakly minimal solution of (\(\mathcal {VP}_a\)). Then, we could find a sequence \(\{x_k\}_{k\ge 1}\subseteq {{\mathbb {R}}}^n\) such that \(x_k \rightarrow \bar{x}\) and

$$\begin{aligned} \forall \; k \in \mathbb {N}: \tilde{f}^a(x_k) \prec _{\tilde{K}} \tilde{f}^a(\bar{x}). \end{aligned}$$
(7)

Hence, we deduce that

$$\begin{aligned} \forall \; k \in \mathbb {N}: F(\bar{x})&\overset{(\text {Proposition } 2.1)}{\subseteq } \{f^{a_1}(\bar{x}),\ldots , f^{a_{\bar{\omega }}} (\bar{x})\} + K\\&\overset{(7)}{\subseteq } \{f^{a_1}(x_k),\ldots , f^{a_{\bar{\omega }}} (x_k)\} + int K+ K\\&\subseteq F(x_k) + int K. \end{aligned}$$

Since this is equivalent to \(F(x_k) \prec ^\ell F(\bar{x})\) for every \(k \in \mathbb {N}\) and \(x_k \rightarrow \bar{x},\) it contradicts the weak minimality of \(\bar{x}\) for (\(\mathcal {SP}_\ell \)).

Next, suppose that \(\bar{x}\) is a local weakly minimal solution of (\(\mathcal {VP}_a\)) for every \(a \in P_{\bar{x}}\), but not a local weakly minimal solution of (\(\mathcal {SP}_\ell \)). Then, we could find a sequence \(\{x_k\}_{k\ge 1} \subseteq {{\mathbb {R}}}^n\) such that \(x_k \rightarrow \bar{x}\) and \(F(x_k) \prec ^\ell F(\bar{x})\) for every \(k\in \mathbb {N}.\) Consider the enumeration \(\{v^{\bar{x}}_1,\ldots , v^{\bar{x}}_{\bar{\omega }}\}\) of the set \(Min (F(\bar{x}),K).\) Then,

$$\begin{aligned} \forall \; j\in [\bar{\omega }], k \in \mathbb {N}, \exists \; i_{(j,k)} \in [p]:f^{i_{(j,k)}}(x_k)\prec v^{\bar{x}}_j. \end{aligned}$$
(8)

Since the indexes \(i_{(j,k)}\) are chosen from the finite set [p], we can assume without loss of generality that \(i_{(j,k)}\) is independent of k,  that is, \(i_{(j,k)} = \bar{i}_j\) for every \(k\in \mathbb {N}\) and some \(\bar{i}_j \in [p].\) Hence, taking the limit in (8) as \(k \rightarrow + \infty \), we get

$$\begin{aligned} \forall \; j \in [\bar{\omega }]: f^{\bar{i}_j}(\bar{x})\preceq v^{\bar{x}}_j. \end{aligned}$$
(9)

Because \(v^{\bar{x}}_j \in Min (F(\bar{x}),K),\) it follows from (9) that \(f^{\bar{i}_j}(\bar{x})= v^{\bar{x}}_j\) and that \(\bar{i}_j \in I(\bar{x})\) for every \(j \in [\bar{\omega }].\) Consider now the tuple \(\bar{a}:= (\bar{i}_1,\ldots ,\bar{i}_{\bar{\omega }}).\) Then, it can be verified that \(\bar{a}\in P_{\bar{x}}.\) Moreover, from (8) we deduce that \(\tilde{f}^{\bar{a}}(x_k) \prec _{\tilde{K}} \tilde{f}^{\bar{a}}(\bar{x})\) for every \(k \in \mathbb {N}.\) Since \(x_k \rightarrow \bar{x}\), this contradicts the weak minimality of \(\bar{x}\) for (\(\mathcal {VP}_a\)) when \(a = \bar{a}.\)

\(\square \)

We now establish the necessary optimality condition for (\(\mathcal {SP}_\ell \)) that will be used in our descent method.

Theorem 3.1

Suppose that \(\bar{x}\) is a local weakly minimal solution of (\(\mathcal {SP}_\ell \)). Then,

$$\begin{aligned} \forall \; a \in P_{\bar{x}}, \; \exists \; \mu _1, \mu _2, \ldots , \mu _{\bar{\omega }} \in K^* : \sum _{j=1}^{\bar{\omega }}\nabla f^{a_j}(\bar{x})\mu _j=0,\; (\mu _1,\ldots ,\mu _{\bar{\omega }})\ne 0. \end{aligned}$$
(10)

Conversely, assume that \(f^i\) is K-convex for each \(i \in I(\bar{x})\), that is,

$$\begin{aligned} \forall \; i \in I(\bar{x}), x_1,x_2 \in {{\mathbb {R}}}^n, t \in [0,1]: f^i(t x_1 +(1-t)x_2) \preceq tf^i(x_1)+ (1-t)f^i(x_2). \end{aligned}$$

Then, condition (10) is also sufficient for the local weak minimality of \(\bar{x}.\)

Proof

By Lemma 3.1, we get that \(\bar{x}\) is a local weakly minimal solution of (\(\mathcal {VP}_a\)) for every \(a\in P_{\bar{x}}.\) Applying now [38, Theorem 4.1] for every \(a\in P_{\bar{x}}\), we get

$$\begin{aligned} \forall \; a\in P_{\bar{x}},\exists \; \mu \in \tilde{K}^*\setminus \{0\}: \nabla \tilde{f}^a(\bar{x})\mu =0. \end{aligned}$$
(11)

Since \(\tilde{K}^* = \prod \limits _{j=1}^{\bar{\omega }} K^*,\) it is easy to verify that (11) is equivalent to the first part of the statement.

In order to see the sufficiency under convexity, assume that \(\bar{x}\) satisfies (10). Note that for any \(a \in P_{\bar{x}}\), the function \(\tilde{f}^a\) is \(\tilde{K}\)-convex, provided that \(f^i\) is K-convex for every \(i \in I(\bar{x})\). Then, in this case, it is well known that (11) is equivalent to \(\bar{x}\) being a local weakly minimal solution of (\(\mathcal {VP}_a\)) for every \(a \in P_{\bar{x}},\) see [15]. Applying now Lemma 3.1, we obtain that \(\bar{x}\) is a local weakly minimal solution of (\(\mathcal {SP}_\ell \)).

\(\square \)

Based on Theorem 3.1, we define the following concepts of stationarity for (\(\mathcal {SP}_\ell \)).

Definition 3.4

We say that \(\bar{x}\) is a stationary point of (\(\mathcal {SP}_\ell \)) if there exists a nonempty set \(Q \subseteq P_{\bar{x}}\) such that the following assertion holds:

$$\begin{aligned} \forall \; a \in Q, \; \exists \; \mu _1, \mu _2, \ldots , \mu _{\bar{\omega }} \in K^* : \sum _{j=1}^{\bar{\omega }}\nabla f^{a_j}(\bar{x})\mu _j=0,\; (\mu _1,\ldots ,\mu _{\bar{\omega }})\ne 0. \end{aligned}$$
(12)

In that case, we also say that \(\bar{x}\) is stationary with respect to Q. If, in addition, we can choose \(Q = P_{\bar{x}}\) in (12), we simply call \(\bar{x}\) a strongly stationary point.

Remark 3.1

It follows from Definition 3.4 that a point \(\bar{x}\) is stationary for (\(\mathcal {SP}_\ell \)) if and only if

$$\begin{aligned} \exists \; a \in P_{\bar{x}}, \mu _1, \mu _2, \ldots , \mu _{\bar{\omega }} \in K^* : \sum _{j=1}^{\bar{\omega }}\nabla f^{a_j}(\bar{x})\mu _j=0,\; (\mu _1,\ldots ,\mu _{\bar{\omega }})\ne 0. \end{aligned}$$

Furthermore, a strongly stationary point of (\(\mathcal {SP}_\ell \)) is also stationary with respect to Q for every nonempty set \(Q \subseteq P_{\bar{x}}.\) Moreover, from Theorem 3.1, it is clear that stationarity is also a necessary optimality condition for (\(\mathcal {SP}_\ell \)).

In the following example, we compare our optimality conditions with known ones from the literature for standard optimization problems.

Example 3.1

Suppose that in Assumption 1 we have \(m = 1, K = {{\mathbb {R}}}_+.\) Furthermore, consider the functional \(f : {{\mathbb {R}}}^n \rightarrow {{\mathbb {R}}}\) defined as

$$\begin{aligned} f(x) := \min \limits _{i \in [p]} f^i(x) \end{aligned}$$

and problem (\(\mathcal {SP}_\ell \)) associated with these data. Hence, in this case,

$$\begin{aligned} P_{\bar{x}} = I(\bar{x}) = \{i \in [p] \mid f^i(\bar{x}) = f(\bar{x})\}. \end{aligned}$$

It is then easy to verify that the following statements hold:

  (i)

    \(\bar{x}\) is strongly stationary for (\(\mathcal {SP}_\ell \)) if and only if

    $$\begin{aligned} \forall \; i \in I(\bar{x}) : \nabla f^i(\bar{x}) = 0. \end{aligned}$$
  (ii)

    \(\bar{x}\) is stationary for (\(\mathcal {SP}_\ell \)) if and only if

    $$\begin{aligned} \exists \; i \in I(\bar{x}) : \nabla f^i(\bar{x}) = 0. \end{aligned}$$
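
As a concrete illustration of these characterizations, take \(n = 1,\) \(p = 2,\) \(f^1(x) = x^2\) and \(f^2(x) = x.\) At \(\bar{x} = 0\) we have \(I(\bar{x}) = \{1,2\}\) and \(\nabla f^1(\bar{x}) = 0,\) so \(\bar{x}\) is stationary; however, since \(\nabla f^2(\bar{x}) = 1 \ne 0,\) it is not strongly stationary, and indeed \(\bar{x}\) is not a local minimizer of \(f = \min \{f^1,f^2\},\) because \(f(x) = x < 0\) for \(x < 0.\)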

On the other hand, it is straightforward to verify that \(\bar{x}\) is a weakly minimal solution of (\(\mathcal {SP}_\ell \)) if and only if \(\bar{x}\) is a solution of the problem

$$\begin{aligned} \min \limits _{x\in {{\mathbb {R}}}^n} \; f(x). \qquad (\mathcal {P}) \end{aligned}$$

Moreover, if we denote by \(\widehat{\partial } f(\bar{x})\) and \(\partial f (\bar{x})\) the Fréchet and Mordukhovich subdifferentials of f at the point \(\bar{x},\) respectively (see [39]), it follows from [39, Proposition 1.114] that the inclusions

$$\begin{aligned} 0 \in \widehat{\partial } f(\bar{x}) \end{aligned}$$
(13)

and

$$\begin{aligned} 0\in \partial f (\bar{x}) \end{aligned}$$
(14)

are necessary for \(\bar{x}\) being a solution of (\(\mathcal {P}\)). A point \(\bar{x}\) satisfying (13) and (14) is said to be Fréchet and Mordukhovich stationary for (\(\mathcal {P}\)), respectively. Furthermore, from [10, Proposition 5] and [39, Proposition 1.113], we have

$$\begin{aligned} \widehat{\partial } f(\bar{x}) = \bigcap _{i \in I(\bar{x})} \{\nabla f^i(\bar{x})\} \end{aligned}$$
(15)

and

$$\begin{aligned} \partial f(\bar{x}) \subseteq \bigcup _{i \in I(\bar{x})} \{\nabla f^i(\bar{x})\}, \end{aligned}$$
(16)

respectively. Thus, from (13), (15) and (i), we deduce that

  (iii)

    \(\bar{x}\) is strongly stationary for (\(\mathcal {SP}_\ell \)) if and only if \(\bar{x}\) is Fréchet stationary for (\(\mathcal {P}\)).

Similarly, from (14), (16) and (ii), we find that:

  (iv)

    If \(\bar{x}\) is Mordukhovich stationary for (\(\mathcal {P}\)), then \(\bar{x}\) is stationary for (\(\mathcal {SP}_\ell \)).

We close the section with the following proposition that presents an alternative characterization of stationary points.

Proposition 3.1

Let \(Q \subseteq P_{\bar{x}}\) be given. Then, \(\bar{x}\) is stationary for (\(\mathcal {SP}_\ell \)) with respect to Q if and only if

$$\begin{aligned} \forall \; a \in Q, u \in {{\mathbb {R}}}^n, \exists \;j \in [\bar{\omega }] : \nabla f^{a_j}(\bar{x})^\top u \notin - int K. \end{aligned}$$
(17)

Proof

Suppose first that \(\bar{x}\) is stationary with respect to Q. Fix now \(a \in Q, u \in {{\mathbb {R}}}^n,\) and consider the vectors \(\mu _1,\mu _2,\ldots , \mu _{\bar{\omega }} \in K^*\) that satisfy (12). We argue by contradiction. Assume that

$$\begin{aligned} \forall \;j \in [\bar{\omega }]: \nabla f^{a_j}(\bar{x})^\top u \in - int K. \end{aligned}$$
(18)

From (18) and the fact that \(\left( \mu _1,\ldots ,\mu _{\bar{\omega }}\right) \in \left( \prod \limits _{j=1}^{\bar{\omega }} K^* \right) \setminus \{0\},\) we deduce that

$$\begin{aligned} \left( \mu _1^\top \left( \nabla f^{a_1}(\bar{x})^\top u\right) , \ldots , \mu _{\bar{\omega }}^\top \left( \nabla f^{a_{\bar{\omega }}}(\bar{x})^\top u\right) \right) \in - {{\mathbb {R}}}^{\bar{\omega }}_+ \setminus \{0\}. \end{aligned}$$
(19)

Hence, we get

$$\begin{aligned} 0 \overset{(12)}{=} \left( \sum _{j=1}^{\bar{\omega }}\nabla f^{a_j}(\bar{x})\mu _j\right) ^\top u =\sum _{j=1}^{\bar{\omega }} \mu _j^\top \left( \nabla f^{a_j}(\bar{x})^\top u \right) \overset{(19)}{<}0, \end{aligned}$$

a contradiction.

Suppose now that (17) holds, and fix \(a \in Q.\) Consider the functional \(\tilde{f}^a\) and the cone \(\tilde{K}\) from Lemma 3.1, together with the set

$$\begin{aligned} A:= \left\{ \nabla \tilde{f}^a(\bar{x})^{\top } u \mid u \in {{\mathbb {R}}}^n \right\} . \end{aligned}$$

Then, we deduce from (17) that

$$\begin{aligned} A \cap int \tilde{K} = \emptyset . \end{aligned}$$

Applying now Eidelheit’s separation theorem for convex sets [22, Theorem 3.16], we obtain some \((\mu _1, \ldots , \mu _{\bar{\omega }}) \in \left( \prod \nolimits _{j=1}^{\bar{\omega }} {{\mathbb {R}}}^m \right) \setminus \{0\}\) such that

$$\begin{aligned} \forall \; u \in {{\mathbb {R}}}^n, v_1, \ldots ,v_{\bar{\omega }} \in K: \left( \sum \limits _{j=1}^{\bar{\omega }} \nabla f^{a_j}(\bar{x})\mu _j \right) ^{\top } u \le \sum \limits _{j=1}^{\bar{\omega }} \mu _j^\top v_j. \end{aligned}$$
(20)

By fixing \(\bar{j} \in [\bar{\omega }]\) and substituting \(u = 0, v_j = 0\) for \(j \ne \bar{j}\) in (20), we obtain

$$\begin{aligned} \forall \; v_{\bar{j}} \in K: \mu _{\bar{j}}^\top v_{\bar{j}} \ge 0. \end{aligned}$$

Hence, \(\mu _{\bar{j}} \in K^*.\) Since \(\bar{j}\) was chosen arbitrarily, we get that \((\mu _1, \ldots , \mu _{\bar{\omega }}) \in \left( \prod \limits _{j=1}^{\bar{\omega }} K^* \right) \setminus \{0\}.\) Define now

$$\begin{aligned} \bar{u}:= \sum \limits _{j=1}^{\bar{\omega }} \nabla f^{a_j}(\bar{x})\mu _j . \end{aligned}$$

To finish the proof, we need to show that \(\bar{u}= 0.\) In order to see this, substitute \(u = \bar{u}\) and \(v_j = 0\) for each \(j \in [\bar{\omega }]\) in (20). Then, we obtain

$$\begin{aligned} \left\| \sum \limits _{j=1}^{\bar{\omega }} \nabla f^{a_j}(\bar{x})\mu _j \right\| ^2 \le 0. \end{aligned}$$

Hence, we must have \(\bar{u}=0,\) and statement (12) is true. \(\square \)

4 Descent Method and Its Convergence Analysis

Now, we present our solution approach, which is based on the result shown in Lemma 3.1. At every iteration, an element a in the partition set of the current iterate is selected, and a descent direction for (\(\mathcal {VP}_a\)) is found using ideas from [6, 15]. However, one must be careful with the selection process of the element a in order to guarantee convergence, and thus we propose a specific way to achieve this. After the descent direction is determined, we follow a classical backtracking procedure of Armijo type to determine a suitable step size, and we update the iterate in the obtained direction. Algorithm 1 formally describes our method.

[Algorithm 1: Descent method for (\(\mathcal {SP}_\ell \)).]
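
To fix ideas, the following is a schematic sketch of such a descent loop in the case \(K = {{\mathbb {R}}}^m_+\) with \(e = (1,\ldots ,1)^\top \) (so that \(\psi _e\) is the componentwise maximum, see Sect. 2). It reuses the partition_set helper sketched after Definition 3.3 and the solver direction_subproblem sketched after (33) below; the parameter names, the stopping tolerance and the selection rule for a (simply minimizing over the whole partition set, cf. (31)) are illustrative choices and need not coincide with the exact statement of Algorithm 1.

```python
import numpy as np

def descent_method(f_list, jac_list, x0, beta=0.5, nu=0.5, tol=1e-8, max_iter=500):
    """Schematic descent method for (SP_l) with K = R^m_+.

    f_list:   the selections f^1, ..., f^p (each R^n -> R^m).
    jac_list: their Jacobians Df^i(x), returned as (m, n) arrays.
    """
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        # Choose (a, u) minimizing phi_x over P_x x R^n, cf. (31).
        best_a, best_u, best_val = None, None, np.inf
        for a in partition_set(f_list, x):
            u, val = direction_subproblem([jac_list[i](x) for i in a])
            if val < best_val:
                best_a, best_u, best_val = a, u, val
        # Stop if x is (numerically) strongly stationary, cf. Proposition 4.2.
        if best_val > -tol:
            return x
        # Armijo backtracking: accept t with
        # f^{a_j}(x + t u) <= f^{a_j}(x) + beta * t * Df^{a_j}(x) u for all j,
        # which terminates by Proposition 4.3 (i).
        t = 1.0
        while not all(
            np.all(f_list[i](x + t * best_u)
                   <= f_list[i](x) + beta * t * jac_list[i](x) @ best_u)
            for i in best_a
        ):
            t *= nu
        x = x + t * best_u
    return x
```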

Remark 4.1

Algorithm 1 extends the approaches proposed in [6, 15] for vector optimization problems to the setting of (\(\mathcal {SP}_\ell \)). The main difference is that, in Step 2, the authors of [6] and [15] use the well-known Hiriart-Urruty functional and the support of a so-called generator of the dual cone, respectively, instead of \(\psi _e\). However, in our framework, the functional \(\psi _e\) is a particular case of those employed in the other methods, see [4, Corollary 2]. Thus, for vector optimization problems, the three algorithms are equivalent.

Now, we start the convergence analysis of Algorithm 1. Our first lemma describes local properties of the active indexes.

Lemma 4.1

Under our assumptions, there exists a neighborhood U of \(\bar{x}\) such that the following properties are satisfied (some of them under additional conditions to be established below) for every \(x \in U:\)

  (i)

    \( I_W(x)\subseteq I_W(\bar{x}),\)

  (ii)

    \(I(x)\subseteq I(\bar{x}),\) provided that \(Min (F(\bar{x}),K)= WMin (F(\bar{x}),K),\)

  (iii)

    \(\forall \; v\in Min (F(\bar{x}),K): Min \left( \{f^i(x)\}_{i \in I_v(\bar{x})},K\right) \subseteq Min (F(x),K), \)

  (iv)

    For every \(v_1,v_2 \in Min (F(\bar{x}),K)\) with \(v_1 \ne v_2 : \)

    $$\begin{aligned} Min \left( \{f^i(x)\}_{i\in I_{v_1}(\bar{x})},K\right) \cap Min \left( \{f^i(x)\}_{i\in I_{v_2}(\bar{x})},K\right) = \emptyset , \end{aligned}$$
  (v)

    \( \omega (x)\ge \omega (\bar{x}).\)

Proof

It suffices to show the existence of the neighborhood U for each item independently, since we can then take the intersection of these neighborhoods to satisfy all the properties simultaneously.

(i) Assume that this is not satisfied in any neighborhood U of \(\bar{x}\). Then, we could find a sequence \(\{x_k\}_{k\ge 1} \subseteq {{\mathbb {R}}}^n\) such that \(x_k \rightarrow \bar{x}\) and

$$\begin{aligned} \forall \; k \in \mathbb {N}: I_W(x_k) \setminus I_W(\bar{x}) \ne \emptyset . \end{aligned}$$
(21)

Because of the finite cardinality of all possible differences in (21), we can assume without loss of generality that there exists a common \(\bar{i}\in [p]\) such that

$$\begin{aligned} \forall \; k\in \mathbb {N}: \bar{i} \in I_W(x_k) \setminus I_W(\bar{x}). \end{aligned}$$
(22)

In particular, (22) implies that \(\bar{i}\in I_W(x_k)\). Hence, we get

$$\begin{aligned} \forall \; k \in \mathbb {N},\; i \in [p]:\;f^{i}(x_k) - f^{\bar{i}}(x_k) \in - \left( {{\mathbb {R}}}^m\setminus int K\right) . \end{aligned}$$

Since \({{\mathbb {R}}}^m\setminus int K\) is closed, taking the limit when \(k \rightarrow +\infty \) we obtain

$$\begin{aligned} \forall \; i \in [p]:\;f^i(\bar{x}) - f^{\bar{i}}(\bar{x}) \in - \left( {{\mathbb {R}}}^m\setminus int K\right) . \end{aligned}$$

Hence, we deduce that \(f^{\bar{i}}(\bar{x}) \in WMin (F(\bar{x}),K)\) and \(\bar{i} \in I_W(\bar{x}),\) a contradiction to (21).

(ii) Consider the same neighborhood U on which statement (i) holds. Note that, under the given assumption, we have \(I_W(\bar{x}) = I(\bar{x}).\) This, together with statement (i),  implies:

$$\begin{aligned} \forall \; x\in U:\; I(x) \subseteq I_W(x) \subseteq I_W(\bar{x}) = I(\bar{x}). \end{aligned}$$

(iii) For this statement, it is also sufficient to show that the neighborhood U can be chosen for any point in the set \( Min (F(\bar{x}),K).\) Hence, fix \(v \in Min (F(\bar{x}),K)\) and assume that there is no neighborhood U of \(\bar{x}\) on which the statement is satisfied. Then, we could find sequences \(\{x_k\}_{k\ge 1}\subseteq {{\mathbb {R}}}^n \) and \(\{i_k\}_{k\ge 1} \subseteq I_v(\bar{x})\) such that \(x_k \rightarrow \bar{x}\) and

$$\begin{aligned} \forall \; k \in \mathbb {N}: f^{i_k}(x_k) \in Min (\{f^i(x_k)\}_{i \in I_v(\bar{x})},K) \setminus Min (F(x_k),K). \end{aligned}$$
(23)

Since \(I_v(\bar{x})\) is finite, we deduce that there are only a finite number of different elements in the sequence \(\{i_k\}.\) Hence, we can assume without loss of generality that there exists \(\bar{i} \in I_v(\bar{x})\) such that \(i_k = \bar{i}\) for every \(k \in \mathbb {N}.\) Then, (23) is equivalent to

$$\begin{aligned} \forall \; k \in \mathbb {N}: f^{\bar{i}}(x_k) \in Min (\{f^i(x_k)\}_{i \in I_v(\bar{x})},K) \setminus Min (F(x_k),K). \end{aligned}$$
(24)

From (24), we get in particular that \(f^{\bar{i}}(x_k) \notin Min (F(x_k),K)\) for every \(k \in \mathbb {N}.\) This, together with the domination property in Proposition 2.1 and the fact that the sets \(I(x_k)\) are contained in the finite set [p], allows us to obtain without loss of generality the existence of \(\tilde{i} \in [p]\) such that

$$\begin{aligned} \forall \; k\in \mathbb {N}:\; f^{\tilde{i}}(x_k)\preceq f^{\bar{i}}(x_k), \; f^{\tilde{i}}(x_k) \ne f^{\bar{i}}(x_k). \end{aligned}$$
(25)

Now, taking the limit in (25) as \(k \rightarrow +\infty ,\) we obtain \(f^{\tilde{i}}(\bar{x})\preceq f^{\bar{i}}(\bar{x}) = v.\) Since v is a minimal element of \(F(\bar{x})\), we must have \(f^{\tilde{i}}(\bar{x})= v\) and, hence, \(\tilde{i} \in I_v(\bar{x}).\) From this, the first inequality in (25), and the fact that \(f^{\bar{i}}(x_k) \in Min (\{f^i(x_k)\}_{i \in I_v(\bar{x})},K)\) for every \(k \in \mathbb {N},\) we get that \(f^{\bar{i}}(x_k) = f^{\tilde{i}}(x_k)\) for all \(k\in \mathbb {N}.\) This contradicts the second part of (25), and hence, our statement is true.

(iv) It follows directly from the continuity of the functionals \(f^i, \; i \in [p].\)

(v) The statement is an immediate consequence of (iii) and (iv). \(\square \)

For the main convergence theorem of our method, we will need the notion of regularity of a point for a set-valued mapping.

Definition 4.1

We say that \(\bar{x}\) is a regular point of F if the following conditions are satisfied:

  (i)

    \(Min (F(\bar{x}),K)= WMin (F(\bar{x}),K),\)

  (ii)

    the cardinality functional \(\omega \) introduced in Definition 3.2 is constant in a neighborhood of \(\bar{x}.\)

Remark 4.2

Since we will analyze the stationarity of the regular limit points of the sequence generated by Algorithm 1, the following points must be addressed:

  • Notice that, by definition, the regularity property of a point is independent of our optimality concept. Thus, by only knowing that a point is regular, we cannot infer anything about whether it is optimal or not.

  • The concept of regularity seems to be linked to the complexity of comparing sets in a high-dimensional space. For example, in case \(m=1\) or \(p = 1,\) every point in \({{\mathbb {R}}}^n\) is regular for the set-valued mapping F. Indeed, in these cases, we have \(\omega (x) = 1\) and

    $$\begin{aligned} Min (F(x),K)= WMin (F(x),K) = \left\{ \begin{array}{ll} \left\{ \min \limits _{i \in \; [p]} f^i(x)\right\} &{} \text {if } m=1, \\ \{f^1(x)\} &{} \text {if } p=1\\ \end{array} \right. \end{aligned}$$

    for all \(x\in {{\mathbb {R}}}^n.\)

A natural question is whether regularity is a strong assumption to impose on a point. In that sense, given the finite structure of the sets F(x), condition (i) in Definition 4.1 seems to be very reasonable. In fact, we would expect that, for most practical cases, this condition is fulfilled at almost every point. For condition (ii), a formalized statement is derived in Proposition 4.1.

Proposition 4.1

The set

$$\begin{aligned} S: = \{x \in {{\mathbb {R}}}^n \mid \omega \text { is locally constant at } x\} \end{aligned}$$

is open and dense in \({{\mathbb {R}}}^n.\)

Proof

The openness of S is trivial. Suppose now that S is not dense in \({{\mathbb {R}}}^n.\) Then, \({{\mathbb {R}}}^n\setminus (cl S)\) is nonempty and open. Furthermore, since \(\omega \) is bounded above, the real number

$$\begin{aligned} p_0 := \max _{x \in {{\mathbb {R}}}^n\setminus (cl S) } \omega (x) \end{aligned}$$

is well defined. Consider the set

$$\begin{aligned} A:= \left\{ x \in {{\mathbb {R}}}^n \mid \omega (x)\le p_0-\frac{1}{2}\right\} . \end{aligned}$$

From Lemma 4.1 (v), it follows that \(\omega \) is lower semicontinuous. Hence, A is closed as it is the sublevel set of a lower semicontinuous functional, see [43, Lemma 1.7.2]. Consider now the set

$$\begin{aligned} U:= \left( {{\mathbb {R}}}^n\setminus (cl S)\right) \cap \left( {{\mathbb {R}}}^n \setminus A\right) . \end{aligned}$$

Then, U is a nonempty open subset of \({{\mathbb {R}}}^n\setminus (cl S).\) This, together with the definition of A,  gives us \(\omega (x) = p_0\) for every \(x \in U.\) However, this contradicts the fact that \(\omega \) is not locally constant at any point of \({{\mathbb {R}}}^n\setminus (cl S).\) Hence, S is dense in \({{\mathbb {R}}}^n.\) \(\square \)

An essential property of regular points of a set-valued mapping is described in the next lemma.

Lemma 4.2

Suppose that \(\bar{x}\) is a regular point of F. Then, there exists a neighborhood U of \(\bar{x}\) such that the following properties hold for every \(x \in U\):

  (i)

    \(\omega (x) = \bar{\omega },\)

  (ii)

    there is an enumeration \(\{w^x_1, \ldots ,w^x_{\bar{\omega }}\}\) of \(Min (F(x),K)\) such that

    $$\begin{aligned} \forall \; j \in [\bar{\omega }]: I_{w^x_j}(x) \subseteq I_{v^{\bar{x}}_j}(\bar{x}). \end{aligned}$$

In particular, without loss of generality, we have \(P_x\subseteq P_{\bar{x}}\) for every \(x \in U.\)

Proof

Let U be the neighborhood of \(\bar{x}\) from Lemma 4.1. Since \(\bar{x}\) is a regular point of F,  we assume without loss of generality that \(\omega \) is constant on U. Hence, property (i) is fulfilled. Fix now \(x \in U\) and consider the enumeration \(\{v^{\bar{x}}_1,\ldots , v^{\bar{x}}_{\bar{\omega }}\}\) of \(Min (F(\bar{x}),K).\) Then, from properties (iii) and (iv) in Lemma 4.1 and the fact that \(\omega (x)= \bar{\omega },\) we deduce that

$$\begin{aligned} \forall \; j\in [\bar{\omega }]:\; \left| Min \left( \{f^i(x)\}_{i\in I_{v^{\bar{x}}_j}(\bar{x})},K\right) \right| =1. \end{aligned}$$
(26)

Next, for \(j \in [\bar{\omega }],\) we define \(w^x_j\) as the unique element of the set

$$\begin{aligned} Min \left( \{f^i(x)\}_{i\in I_{v^{\bar{x}}_j}(\bar{x})},K\right) . \end{aligned}$$

Then, from (26), property (iii) in Lemma 4.1 and the fact that \(\omega \) is constant on U,  we obtain that \(\{w^x_1,\ldots , w^x_{\bar{\omega }}\}\) is an enumeration of the set \(Min (F(x),K).\)

It remains to show now that this enumeration satisfies (ii). In order to see this, fix \(j \in [\bar{\omega }]\) and \(\bar{i} \in I_{w^x_j}(x).\) Then, from the regularity of \(\bar{x}\) and property (ii) in Lemma 4.1, we get that \(I(x)\subseteq I(\bar{x}).\) In particular, this implies \(\bar{i} \in I(\bar{x}).\) From this and (4), we have the existence of \(j' \in [\bar{\omega }]\) such that \(\bar{i} \in I_{v^{\bar{x}}_{j'}}(\bar{x}).\) Hence, we deduce that

$$\begin{aligned} w^x_j = f^{\bar{i}}(x) \in \left\{ f^i(x)\right\} _{i \in I_{v^{\bar{x}}_{j'}}(\bar{x})}. \end{aligned}$$
(27)

Then, from (26), (27) and the definition of \(w^x_{j'},\) we find that \(w^x_{j'}\preceq w^x_j.\) Moreover, because \(w^x_{j'}, w^x_j \in Min (F(x),K),\) we must have \(w^x_{j'}= w^x_j.\) Thus, it follows that \(j = j',\) since \(\{w^x_1,\ldots , w^x_{\bar{\omega }}\}\) is an enumeration of the set \(Min (F(x),K).\) This shows that \(\bar{i} \in I_{v^{\bar{x}}_j}(\bar{x}),\) as desired. \(\square \)

For the rest of the analysis, we need to introduce the parametric family of functionals \(\{\varphi _x\}_{x \in {{\mathbb {R}}}^n},\) whose elements \(\varphi _x: P_x\times {{\mathbb {R}}}^n \rightarrow {{\mathbb {R}}}\) are defined as follows:

$$\begin{aligned} \forall \; a \in P_x,u \in {{\mathbb {R}}}^n: \varphi _x(a,u):=\max _{ j \in \;[\omega (x)]} \big \{\psi _e\big (\nabla f^{a_j}(x)^\top u\big )\big \} +\frac{1}{2}\Vert u\Vert ^2, \end{aligned}$$
(28)

where the functional \(\psi _e\) is given by (3). It is easy to see that, for every \(x \in {{\mathbb {R}}}^n\) and \(a \in P_x,\) the functional \(\varphi _x(a, \cdot )\) is strongly convex on \({{\mathbb {R}}}^n,\) that is, there exists a constant \(\alpha >0\) such that the inequality

$$\begin{aligned} \varphi _x\left( a,tu + (1-t)u'\right) + \alpha t(1-t)\Vert u-u'\Vert ^2 \le t \varphi _x(a,u) + (1-t)\varphi _x\left( a,u'\right) \end{aligned}$$

is satisfied for every \(u, u'\in {{\mathbb {R}}}^n\) and \(t \in [0,1].\) According to [13, Lemma 3.9], the functional \(\varphi _x(a,\cdot )\) attains its minimum over \({{\mathbb {R}}}^n,\) and the minimizer is unique. In particular, we can check that

$$\begin{aligned} \forall \; x \in {{\mathbb {R}}}^n, a \in P_x : \min _{u \in {{\mathbb {R}}}^n} \varphi _x(a,u) \le 0 \end{aligned}$$
(29)

and that, if \(u_a \in {{\mathbb {R}}}^n\) is such that \(\varphi _x(a,u_a) = \min \limits _{u \in {{\mathbb {R}}}^n} \varphi _x(a,u),\) then

$$\begin{aligned} \varphi _x(a,u_a) = 0 \Longleftrightarrow u_a = 0. \end{aligned}$$
(30)

Taking into account that \(P_x\) is finite, we also obtain that \(\varphi _x\) attains its minimum over the set \(P_x\times {{\mathbb {R}}}^n.\) Hence, we can consider the functional \(\phi : {{\mathbb {R}}}^n \rightarrow {{\mathbb {R}}}\) given by

$$\begin{aligned} \phi (x):= \min _{(a,u)\in P_x \times {{\mathbb {R}}}^n} \varphi _x(a,u). \end{aligned}$$
(31)

Then, because of (29), it can be verified that

$$\begin{aligned} \forall \; x \in {{\mathbb {R}}}^n: \phi (x)\le 0. \end{aligned}$$
(32)

Furthermore, if \((a,u) \in P_x\times {{\mathbb {R}}}^n\) is such that \(\phi (x) = \varphi _x(a,u),\) it follows from (30) (see also [15]) that

$$\begin{aligned} \phi (x) = 0 \Longleftrightarrow u = 0. \end{aligned}$$
(33)
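
In the case \(K = {{\mathbb {R}}}^m_+\) and \(e = (1,\ldots ,1)^\top ,\) the inner problem \(\min _{u \in {{\mathbb {R}}}^n} \varphi _x(a,u)\) is a convex minimax problem, and a standard epigraph reformulation makes it tractable for off-the-shelf solvers: minimize \(t + \frac{1}{2}\Vert u\Vert ^2\) over \((u,t)\) subject to \(\big (\nabla f^{a_j}(x)^\top u\big )_i \le t\) for all \(j \in [\omega (x)]\) and \(i \in [m].\) The following sketch uses SciPy's general-purpose SLSQP routine for simplicity; a dedicated quadratic programming solver would be the natural choice in an efficient implementation.

```python
import numpy as np
from scipy.optimize import minimize

def direction_subproblem(jacobians):
    """Solve min_u max_j psi_e(Df^{a_j}(x) u) + 0.5*||u||^2 for K = R^m_+.

    jacobians: list of (m, n) arrays J_j = Df^{a_j}(x).
    Returns the minimizer u_a and the optimal value, cf. (28)-(30).
    """
    n = jacobians[0].shape[1]
    z0 = np.zeros(n + 1)  # z = (u, t), starting from the feasible point 0

    def objective(z):
        u, t = z[:n], z[n]
        return t + 0.5 * u @ u

    # Componentwise constraints (J_j u)_i <= t, written as t - J_j u >= 0.
    constraints = [{'type': 'ineq', 'fun': lambda z, J=J: z[n] - J @ z[:n]}
                   for J in jacobians]
    res = minimize(objective, z0, method='SLSQP', constraints=constraints)
    return res.x[:n], res.fun
```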

In the following two propositions, we show that Algorithm 1 is well defined. We start by proving that, if Algorithm 1 stops in Step 3, a stationary point has been found.

Proposition 4.2

Consider the functionals \(\varphi _{\bar{x}}\) and \(\phi \) given in (28) and (31), respectively. Furthermore, let \((\bar{a},\bar{u})\in P_{\bar{x}} \times {{\mathbb {R}}}^n\) be such that \(\phi (\bar{x}) = \varphi _{\bar{x}}(\bar{a},\bar{u}).\) Then, the following statements are equivalent:

  (i)

    \(\bar{x}\) is a strongly stationary point of (\(\mathcal {SP}_\ell \)),

  (ii)

    \(\phi (\bar{x})=0,\)

  (iii)

    \(\bar{u}=0.\)

Proof

The result will be a consequence of [6, Proposition 2.2] where, using the Hiriart-Urruty functional, a similar statement is proved for vector optimization problems. Consider the cone \(\tilde{K}\) given by (5), the vector \(\tilde{e}: = \begin{pmatrix} e \\ \vdots \\ e \end{pmatrix} \in int \tilde{K},\) and the scalarizing functional \(\psi _{\tilde{e}}\) associated with \(\tilde{e}\) and \(\tilde{K},\) see Definition 2.2. Then, for any \(v_1,\ldots ,v_{\bar{\omega }} \in {{\mathbb {R}}}^m\) and \(v:= \begin{pmatrix} v_1 \\ \vdots \\ v_{\bar{\omega }} \end{pmatrix},\) we get

$$\begin{aligned} \psi _{\tilde{e}}(v)&= \min \{t \in {{\mathbb {R}}}\mid t\tilde{e} \in v + \tilde{K} \}\nonumber \\&= \min \{t \in {{\mathbb {R}}}\mid \forall \; j\in [\bar{\omega }]: te \in v_j + K\}\nonumber \\&= \max _{ j \in \;[\bar{\omega }]}\psi _e(v_j). \end{aligned}$$
(34)

From [4, Theorem 4], we know that \(\psi _{\tilde{e}}\) is a Hiriart-Urruty functional. Hence, for a fixed \(a \in P_{\bar{x}},\) we can apply [6, Proposition 2.2] to (\(\mathcal {VP}_a\)) to obtain that

$$\begin{aligned} \bar{x} \text { is a stationary point of }({\mathcal {VP}_a}) \Longleftrightarrow \min \limits _{u \in {{\mathbb {R}}}^n} \;\psi _{\tilde{e}}\left( \nabla \tilde{f}^a(\bar{x})^\top u\right) + \frac{1}{2}\Vert u\Vert ^2 = 0. \end{aligned}$$
(35)

Thus, we deduce that

$$\begin{aligned} \bar{x} \text { is strongly stationary }&\overset{(\text {Remark } 3.1)}{\Longleftrightarrow } \; \forall \; a\in P_{\bar{x}}: \bar{x} \text { is stationary for } (\mathcal {VP}_a)\\&\overset{((35) + (28) + (34))}{\Longleftrightarrow } \; \forall \; a\in P_{\bar{x}}: \min \limits _{u \in {{\mathbb {R}}}^n}\varphi _{\bar{x}}(a,u) = 0\\&\Longleftrightarrow \; \min \limits _{(a,u) \in P_{\bar{x}}\times {{\mathbb {R}}}^n} \varphi _{\bar{x}}(a,u) = 0\\&\overset{(31)}{\Longleftrightarrow } \; \phi (\bar{x}) = 0\\&\overset{(33)}{\Longleftrightarrow } \; \bar{u} = 0, \end{aligned}$$

as desired. \(\square \)

Remark 4.3

A similar statement to the one in Proposition 4.2 can be made for stationary points of (\(\mathcal {SP}_\ell \)). Indeed, for a set \(Q \subseteq P_{\bar{x}},\) consider a point \(\left( \bar{a}_Q,\bar{u}_Q\right) \in Q \times {{\mathbb {R}}}^n\) such that \(\varphi _{\bar{x}}\left( \bar{a}_Q,\bar{u}_Q\right) = \min \nolimits _{(a,u) \in Q \times {{\mathbb {R}}}^n} \varphi _{\bar{x}}(a,u).\) Then, by replacing \(P_{\bar{x}}\) by Q in the proof of Proposition 4.2, we can show that the following statements are equivalent:

  (i)

    \(\bar{x}\) is stationary for (\(\mathcal {SP}_\ell \)) with respect to Q,

  (ii)

    \(\min \limits _{(a,u) \in Q \times {{\mathbb {R}}}^n} \varphi _{\bar{x}}(a,u) = 0,\)

  (iii)

    \(\bar{u}_Q = 0.\)

Next, we show that the line search in Step 4 of Algorithm 1 terminates in finitely many steps.

Proposition 4.3

Fix \(\beta \in (0,1)\) and consider the functionals \(\varphi _{\bar{x}}\) and \(\phi \) given in (28) and (31), respectively. Furthermore, let \(\left( \bar{a},\bar{u}\right) \in P_{\bar{x}} \times {{\mathbb {R}}}^n\) be such that \(\phi (\bar{x}) = \varphi _{\bar{x}}(\bar{a},\bar{u})\) and suppose that \(\bar{x}\) is not a strongly stationary point of (\(\mathcal {SP}_\ell \)). The following assertions hold:

  (i)

    There exists \(\tilde{t} > 0\) such that

    $$\begin{aligned} \forall \; t \in (0,\tilde{t}\;], j \in [\bar{\omega }]: f^{\bar{a}_j}(\bar{x} +t\bar{u})\preceq f^{\bar{a}_j}(\bar{x}) +\beta t\nabla f^{\bar{a}_j}(\bar{x})^\top \bar{u}. \end{aligned}$$
  (ii)

    Let \(\tilde{t}\) be the parameter in statement (i). Then,

    $$\begin{aligned} \forall \; t\in (0,\tilde{t}\;]: F(\bar{x} + t \bar{u}) \preceq ^\ell \left\{ f^{\bar{a}_j}(\bar{x}) + \beta t \nabla f^{\bar{a}_j}(\bar{x})^\top \bar{u}\right\} _{j \in [\bar{\omega }]} \prec ^\ell F(\bar{x}). \end{aligned}$$

    In particular, \(\bar{u}\) is a descent direction of F at \(\bar{x}\) with respect to the preorder \(\preceq ^\ell .\)

Proof

(i) Assume that (i) does not hold. Then, we could find a sequence \(\{t_k\}_{k\ge 1}\) and \(\bar{j}\in [\bar{\omega }]\) such that \(t_k \rightarrow 0\) and

$$\begin{aligned} \forall \; k \in \mathbb {N}: f^{\bar{a}_{\bar{j}}}(\bar{x} +t_k\bar{u}) - f^{\bar{a}_{\bar{j}}}(\bar{x}) - \beta t_k\nabla f^{\bar{a}_{\bar{j}}}(\bar{x})^\top \bar{u} \notin -K. \end{aligned}$$
(36)

As \(({{\mathbb {R}}}^m\setminus - K)\cup \{0\}\) is a cone, we can multiply (36) by \(\frac{1}{t_k}\) for each \(k \in \mathbb {N}\) to obtain

$$\begin{aligned} \forall \; k \in \mathbb {N}: \frac{f^{\bar{a}_{\bar{j}}}(\bar{x} +t_k\bar{u}) - f^{\bar{a}_{\bar{j}}}(\bar{x})}{t_k} - \beta \nabla f^{\bar{a}_{\bar{j}}}(\bar{x})^\top \bar{u} \notin -K. \end{aligned}$$
(37)

Taking now the limit in (37), when \(k \rightarrow +\infty \) we get

$$\begin{aligned} (1-\beta )\nabla f^{\bar{a}_{\bar{j}}}(\bar{x})^\top \bar{u} \notin - int K. \end{aligned}$$

Since \(\beta \in (0,1),\) this is equivalent to

$$\begin{aligned} \nabla f^{\bar{a}_{\bar{j}}}(\bar{x})^\top \bar{u} \notin - int K. \end{aligned}$$
(38)

On the other hand, since \(\bar{x}\) is not strongly stationary, we can apply Proposition 4.2 to obtain that \(\bar{u} \ne 0\) and that \(\phi (\bar{x})<0.\) This implies that \(\varphi _{\bar{x}}(\bar{a},\bar{u}) <0,\) and hence,

$$\begin{aligned} \max \limits _{j \in [\bar{\omega }]} \left\{ \psi _e \left( \nabla f^{\bar{a}_j}(\bar{x})^\top \bar{u}\right) \right\}< -\frac{1}{2}\Vert \bar{u}\Vert ^2 <0. \end{aligned}$$

From this, we deduce that

$$\begin{aligned} \forall \; j \in [\bar{\omega }]: \psi _e \left( \nabla f^{\bar{a}_j}(\bar{x})^\top \bar{u}\right) < 0 \end{aligned}$$

and, by Proposition 2.2 (iii),

$$\begin{aligned} \forall \; j \in [\bar{\omega }]: \nabla f^{\bar{a}_j}(\bar{x})^\top \bar{u} \in - int K. \end{aligned}$$
(39)

However, this is a contradiction to (38), and hence, the statement is proven.

(ii) From (39), we know that

$$\begin{aligned} \forall \; j \in [\bar{\omega }], t \in (0,\tilde{t}\;] : f^{\bar{a}_j}(\bar{x}) + \beta t\nabla f^{\bar{a}_j}(\bar{x})^\top \bar{u} \prec f^{\bar{a}_j}(\bar{x}) . \end{aligned}$$
(40)

Then, it follows that

$$\begin{aligned} \forall \; t \in (0,\tilde{t}\;]: F(\bar{x})&\overset{(\text {Proposition } 2.1)}{\subseteq } \left\{ f^{\bar{a}_1}(\bar{x}), \ldots , f^{\bar{a}_{\bar{\omega }}}(\bar{x}) \right\} + K\\&\overset{(40)}{\subseteq } \left\{ f^{\bar{a}_j}(\bar{x}) + \beta t\nabla f^{\bar{a}_j}(\bar{x})^\top \bar{u}\right\} _{j \in [\bar{\omega }]} + int K\\&\overset{(\text {Statement } (i))}{\subseteq } \left\{ f^{\bar{a}_1}(\bar{x}+ t \bar{u}), \ldots , f^{\bar{a}_{\bar{\omega }}}(\bar{x}+ t \bar{u})\right\} + int K\\&\subseteq F(\bar{x} + t \bar{u}) + int K, \end{aligned}$$

as desired. \(\square \)

We are now ready to establish the convergence of Algorithm 1.

Theorem 4.1

Suppose that Algorithm 1 generates an infinite sequence for which \(\bar{x}\) is an accumulation point. Furthermore, assume that \(\bar{x}\) is regular for F. Then, \(\bar{x}\) is a stationary point of (\(\mathcal {SP}_\ell \)). If in addition \(|P_{\bar{x}}| = 1\), then \(\bar{x}\) is a strongly stationary point of (\(\mathcal {SP}_\ell \)).

Proof

Consider the functional \(\zeta : \mathscr {P}({{\mathbb {R}}}^m) \rightarrow {{\mathbb {R}}}\cup \{- \infty \}\) defined as

$$\begin{aligned} \forall \;A \in \mathscr {P}({{\mathbb {R}}}^m): \; \zeta (A):= \inf _{y \in A} \psi _e(y). \end{aligned}$$

The proof will be divided in several steps:

Step 1: We show the following result:

$$\begin{aligned} \forall \; k\in \mathbb {N}\cup \{0\}: \; (\zeta \circ F)(x_{k+1})\le (\zeta \circ F)(x_k) + \beta t_k \left[ \phi (x_k)- \frac{1}{2}\Vert u_k\Vert ^2 \right] . \end{aligned}$$
(41)

Indeed, because of the monotonicity property of \(\psi _e\) in Proposition 2.2 (ii), the functional \(\zeta \) is monotone with respect to the preorder \(\preceq ^\ell \), that is,

$$\begin{aligned} \forall \; A,B \in \mathscr {P}({{\mathbb {R}}}^m): A\preceq ^\ell B \Longrightarrow \zeta (A)\le \zeta (B). \end{aligned}$$

On the other hand, from Proposition 4.3 (ii), we deduce that

$$\begin{aligned} \forall \; k \in \mathbb {N}\cup \{0\}: F(x_k+t_k u_k)\preceq ^\ell \left\{ f^{a_{k,j}}(x_k)+ \beta t_k \nabla f^{a_{k,j}}(x_k)^\top u_k \right\} _{j \in [\omega _k]}. \end{aligned}$$

Hence, using the monotonicity of \(\zeta \) and the sublinearity of \(\psi _e\) from Proposition 2.2 (i), we obtain for any \(k\in \mathbb {N}\cup \{0\}:\)

$$\begin{aligned} (\zeta \circ F)(x_{k+1})&\le \min _{ j \in \; [\omega _k]}\left\{ \psi _e \left( f^{a_{k,j}}(x_k)+ \beta t_k \nabla f^{a_{k,j}}(x_k)^\top u_k\right) \right\} \\&\le \min _{ j \in \; [\omega _k]}\left\{ \psi _e \left( f^{a_{k,j}}(x_k) \right) + \beta t_k \psi _e \left( \nabla f^{a_{k,j}}(x_k)^\top u_k\right) \right\} \\&\le \min _{ j \in \; [\omega _k]}\left\{ \psi _e \left( f^{a_{k,j}}(x_k) \right) \right\} + \beta t_k \max _{j' \in \; [\omega _k]} \left\{ \psi _e \left( \nabla f^{a_{k,j'}}(x_k)^\top u_k \right) \right\} \\&= (\zeta \circ F)(x_k) + \beta t_k \max _{ j \in \;[\omega _k]} \left\{ \psi _e \left( \nabla f^{a_{k,j}}(x_k)^\top u_k \right) \right\} . \end{aligned}$$

The above inequality, together with the definition of \(\phi \) in (31), implies (41).

On the other hand, since \(\bar{x}\) is an accumulation point of the sequence \(\{x_k\}_{k\ge 0}\), we can find a subsequence \(\mathcal {K}\) in \(\mathbb {N}\) such that \(x_k\overset{\mathcal {K}}{ \rightarrow } \bar{x}.\)

Step 2: The following inequality holds

$$\begin{aligned} \forall \; k \in \mathbb {N}\cup \{0\}:\; F(\bar{x})\preceq ^\ell F(x_k). \end{aligned}$$
(42)

Indeed, from Proposition 4.3 (ii), we can guarantee that the sequence \(\{F(x_k)\}_{k\ge 0}\) is decreasing with respect to the preorder \(\preceq ^\ell \), that is,

$$\begin{aligned} \forall \; k\in \mathbb {N}\cup \{0\}: F(x_{k+1})\preceq ^\ell F(x_k). \end{aligned}$$
(43)

Fix now \(k \in \mathbb {N}\) and \(i \in [p].\) Then, chaining (43) for consecutive indices, we have

$$\begin{aligned} \forall \; k' \in \mathcal {K},k' \ge k, \exists \;i_{k'} \in [p]: f^{i_{k'}}(x_{k'})\preceq f^i(x_k). \end{aligned}$$
(44)

Since there are only a finite number of possible values for \(i_{k'},\) we may assume without loss of generality (passing to a subsequence of \(\mathcal {K}\) if necessary) that there is \(\bar{i} \in [p]\) such that \(i_{k'} = \bar{i}\) for every \(k' \in \mathcal {K}, k' \ge k.\) Hence, (44) is equivalent to

$$\begin{aligned} \forall \; k' \in \mathcal {K},k' \ge k: f^{\bar{i}}(x_{k'}) - f^i(x_k) \in -K. \end{aligned}$$
(45)

Taking the limit now in (45) when \(k' \overset{\mathcal {K}}{\rightarrow } + \infty ,\) we find that

$$\begin{aligned} f^i(x_k) \in f^{\bar{i}}(\bar{x}) + K. \end{aligned}$$

Since \(i\) was chosen arbitrarily in \([p],\) this implies the statement.

Step 3: We prove that the sequence \(\{u_k\}_{k\in \mathcal {K}}\) is bounded.

In order to see this, note that, since the algorithm generates an infinite sequence, no iterate \(x_k\) is a stationary point, and hence Proposition 4.2 yields \(\phi (x_k) < 0\) for every \(k\in \mathbb {N}\cup \{0\}.\) By the definition of \(a_k\) and \(u_k,\) we then have

$$\begin{aligned} \forall \; k\in \mathbb {N}\cup \{0\}: \varphi _{x_k}(a_k,u_k)<0. \end{aligned}$$
(46)

Let \(\rho \) be the Lipschitz constant of \(\psi _e\) from Proposition 2.2 (i). Then, we deduce that

$$\begin{aligned} \forall \; k\in \mathbb {N}\cup \{0\}:\Vert u_k\Vert ^2&\overset{((46) + (28)) }{<}&-2\max \limits _{j \in \; [\omega _k]}\left\{ \psi _e\left( \nabla f^{a_{k,j}}(x_k)^\top u_k \right) \right\} \\\le & {} 2\max \limits _{j \in \; [\omega _k]}\left\{ \left| \psi _e\left( \nabla f^{a_{k,j}}(x_k)^\top u_k \right) \right| \right\} \\&\overset{(\text {Proposition } 2.2\, (i))}{\le }&2 \rho \max \limits _{j \in [\omega _k]}\left\{ \left\| \nabla f^{a_{k,j}}(x_k)^\top u_k \right\| \right\} \\\le & {} 2 \rho \Vert u_k\Vert \max \limits _{j \in \; [\omega _k] } \left\{ \Vert \nabla f^{a_{k,j}}(x_k)\Vert \right\} . \end{aligned}$$

Hence,

$$\begin{aligned} \forall \; k\in \mathbb {N}\cup \{0\}: \Vert u_k\Vert \le 2 \rho \max \limits _{j \in \; [\omega _k] } \left\{ \Vert \nabla f^{a_{k,j}}(x_k)\Vert \right\} . \end{aligned}$$
(47)

Since \(\{x_k\}_{k\in \mathcal {K}}\) is bounded and the gradients \(\nabla f^i\) are continuous, the statement follows from (47).

Step 4: We show that \(\bar{x}\) is stationary.

Fix \(\kappa \in \mathbb {N}.\) Then, it follows from (41) that

$$\begin{aligned} \forall \; k\in \mathbb {N}\cup \{0\}: \; -\beta t_k \max _{ j \in \; [\omega _k]} \left\{ \psi _e \left( \nabla f^{a_{k,j}}(x_k)^\top u_k \right) \right\} \le (\zeta \circ F)(x_k)- (\zeta \circ F)(x_{k+1}). \end{aligned}$$
(48)

Adding this inequality for \(k= 0,\ldots , \kappa ,\) we obtain

$$\begin{aligned} -\beta \sum _{k=0}^\kappa t_k \max _{ j \in \; [\omega _k]} \left\{ \psi _e \left( \nabla f^{a_{k,j}}(x_k)^\top u_k \right) \right\} \le (\zeta \circ F)(x_0)- (\zeta \circ F)(x_{\kappa +1}). \end{aligned}$$
(49)

On the other hand, similarly to (39) in the proof of Proposition 4.3 (i), we obtain that

$$\begin{aligned} \forall \; k \in \mathbb {N}\cup \{0\},j \in [\omega _k]: \nabla f^{a_{k,j}}(x_k)^\top u_k \in - int K. \end{aligned}$$
(50)

In particular, applying Proposition 2.2 (iii) in (50), we find that

$$\begin{aligned} \forall \; k \in \mathbb {N}\cup \{0\},j \in [\omega _k]: \psi _e\left( \nabla f^{a_{k,j}}(x_k)^\top u_k \right) < 0. \end{aligned}$$
(51)

We then have

$$\begin{aligned} 0 \overset{(51)}{<} -\sum \limits _{k=0}^\kappa t_k \max \limits _{ j \in \; [\omega _k]} \left\{ \psi _e \left( \nabla f^{a_{k,j}}(x_k)^\top u_k \right) \right\}&\overset{(49)}{\le }&\frac{(\zeta \circ F)(x_0)- (\zeta \circ F)(x_{\kappa +1})}{\beta }\\&\overset{(42)}{\le }&\frac{(\zeta \circ F)(x_0)- (\zeta \circ F)(\bar{x})}{\beta }. \end{aligned}$$

Taking now the limit in the previous inequality when \(\kappa \rightarrow +\infty ,\) we deduce that

$$\begin{aligned} 0 \le -\sum _{k=0}^{\infty } t_k \max _{ j \in \; [\omega _k]} \left\{ \psi _e \left( \nabla f^{a_{k,j}}(x_k)^\top u_k \right) \right\} < +\infty . \end{aligned}$$

In particular, this implies

$$\begin{aligned} \lim _{k\rightarrow \infty }t_k \max _{ j \in \; [\omega _k]} \left\{ \psi _e \left( \nabla f^{a_{k,j}}(x_k)^\top u_k \right) \right\} =0. \end{aligned}$$
(52)

Since there are only a finite number of subsets of [p] and \(\bar{x}\) is regular for F,  we can apply Lemma 4.2 to obtain, without loss of generality, the existence of \(Q \subseteq P_{\bar{x}}\) and \(\bar{a} \in Q\) such that

$$\begin{aligned} \forall \; k\in \mathcal {K}:\; \omega _{k} = \bar{\omega },\;P_{x_{k}} = Q,\;a_{k} = \bar{a}. \end{aligned}$$
(53)

Furthermore, since the sequences \(\{t_k\}_{k\ge 0}, \{u_{k}\}_{k\in \mathcal {K}}\) are bounded, we can also assume without loss of generality the existence of \(\bar{t} \in {{\mathbb {R}}}, \bar{u} \in {{\mathbb {R}}}^n\) such that

$$\begin{aligned} t_k \overset{\mathcal {K}}{\rightarrow } \bar{t},\;\; u_k \overset{\mathcal {K}}{\rightarrow } \bar{u}. \end{aligned}$$
(54)

The rest of the proof is devoted to showing that \(\bar{x}\) is a stationary point with respect to Q. First, observe that by (53) and the definition of \(a_k,\) we have

$$\begin{aligned} \forall \; a\in Q,\; k\in \mathcal {K}, \; u\in {{\mathbb {R}}}^n: \phi (x_k)=\varphi _{x_k}(\bar{a}, u_k)\le \varphi _{x_k}(a,u). \end{aligned}$$

Then, taking into account that \(\omega _k= \bar{\omega }\) in (53), we can take the limit when \(k \overset{ \mathcal {K}}{ \rightarrow } +\infty \) in the above expression to obtain

$$\begin{aligned} \forall \; a\in Q,\; u\in {{\mathbb {R}}}^n: \varphi _{\bar{x}}(\bar{a}, \bar{u})\le \varphi _{\bar{x}}(a,u). \end{aligned}$$

Equivalently, we have

$$\begin{aligned} (\bar{a},\bar{u}) \in \underset{(a,u)\in Q\times {{\mathbb {R}}}^n}{{{\,\mathrm{argmin}\,}}} \varphi _{\bar{x}}(a,u). \end{aligned}$$
(55)

Next, we analyze two cases:

Case 1: \(\bar{t}>0.\)

According to (52) and (53), we have in this case

$$\begin{aligned} \lim _{k \overset{ \mathcal {K}}{\rightarrow } + \infty } \max _{j \in [\bar{\omega }]} \left\{ \psi _e \left( \nabla f^{\bar{a}_j}(x_k)^\top u_k \right) \right\} =0. \end{aligned}$$
(56)

Then, it follows that

$$\begin{aligned} 0\le & {} \frac{1}{2} \Vert \bar{u}\Vert ^2\\&\overset{ ((53) \; + \;(54)\;+ \;(56)) }{=}&\lim _{k \overset{ \mathcal {K}}{\rightarrow } + \infty } \left[ \max _{ j \in \;[\bar{\omega }]} \left\{ \psi _e \left( \nabla f^{\bar{a}_j}(x_k)^\top u_k \right) \right\} + \frac{1}{2} \Vert u_k\Vert ^2 \right] \\= & {} \lim _{k \overset{ \mathcal {K}}{\rightarrow } + \infty } \phi (x_k)\\&\overset{(32)}{\le }&0, \end{aligned}$$

from which we deduce \(\bar{u} = 0.\) This, together with (55) and Remark 4.3, implies that \(\bar{x}\) is a stationary point with respect to Q.

Case 2: \(\bar{t}=0.\)

Fix an arbitrary \(\kappa \in \mathbb {N}.\) Since \(t_k \overset{\mathcal {K}}{\rightarrow } 0,\) we have \(t_k < \nu ^{\kappa }\) for \(k \in \mathcal {K}\) large enough, and hence the step size \(\nu ^{\kappa }\) was tested and rejected by Armijo's line search criterion in Step 4 of Algorithm 1. By (53) and the finiteness of \(\bar{\omega },\) we can assume without loss of generality the existence of \(\bar{j} \in [\bar{\omega }]\) such that

$$\begin{aligned} \forall \; k\in \mathcal {K}:\; f^{\bar{a}_{\bar{j}}}(x_k+ \nu ^{\kappa }u_k) \npreceq f^{\bar{a}_{\bar{j}}}(x_k) +\beta \nu ^{\kappa }\nabla f^{\bar{a}_{\bar{j}}}(x_k)^\top u_k. \end{aligned}$$

From this, it follows that

$$\begin{aligned} \forall \; k\in \mathcal {K}:\;\frac{f^{\bar{a}_{\bar{j}}}(x_k+ \nu ^{\kappa }u_k) - f^{\bar{a}_{\bar{j}}}(x_k)}{\nu ^{\kappa }} -\beta \nabla f^{\bar{a}_{\bar{j}}}(x_k)^\top u_k \notin -K. \end{aligned}$$

Now, taking the limit when \(k \overset{\mathcal {K}}{\rightarrow } +\infty ,\) we obtain

$$\begin{aligned} \frac{f^{\bar{a}_{\bar{j}}}(\bar{x}+ \nu ^{\kappa }\bar{u}) - f^{\bar{a}_{\bar{j}}}(\bar{x})}{\nu ^{\kappa }} -\beta \nabla f^{\bar{a}_{\bar{j}}}(\bar{x})^\top \bar{u} \notin - int K. \end{aligned}$$

Next, taking the limit when \(\kappa \rightarrow +\infty ,\) we get

$$\begin{aligned} (1-\beta )\nabla f^{\bar{a}_{\bar{j}}}(\bar{x})^\top \bar{u}\notin - int K. \end{aligned}$$

Since \(\beta \in (0,1),\) we deduce that \(\nabla f^{\bar{a}_{\bar{j}}}(\bar{x})^\top \bar{u}\notin - int K\) and, according to Proposition 2.2 (iii), this is equivalent to

$$\begin{aligned} \psi _e(\nabla f^{\bar{a}_{\bar{j}}}(\bar{x})^\top \bar{u})\ge 0. \end{aligned}$$
(57)

Finally, we find that

$$\begin{aligned} 0 \overset{(57)}{\le } \psi _e(\nabla f^{\bar{a}_{\bar{j}}}(\bar{x})^\top \bar{u}) \le \varphi _{\bar{x}}(\bar{a},\bar{u}) \overset{(55)}{=} \min _{(a,u)\in Q\times {{\mathbb {R}}}^n} \varphi _{\bar{x}}(a,u) \overset{(29)}{\le } 0, \end{aligned}$$

which implies

$$\begin{aligned} \min _{(a,u)\in Q\times {{\mathbb {R}}}^n} \varphi _{\bar{x}}(a,u) = 0. \end{aligned}$$
(58)

The stationarity of \(\bar{x}\) follows then from (58) and Remark 4.3. The proof is complete. \(\square \)

5 Implementation and Numerical Illustrations

In this section, we report some preliminary numerical experience with the proposed method. Algorithm 1 was implemented in Python 3, and the experiments were run on a PC with an Intel(R) Core(TM) i5-4200U CPU and 4.0 GB of RAM. In the following, we describe some details of the implementation and of the experiments:

  • We considered instances of problem (\(\mathcal {SP}_\ell \)) only for the case in which K is the standard ordering cone, that is, \( K = {{\mathbb {R}}}_+^m.\) In addition, we chose the parameter \(e \in int K\) of the scalarizing functional \(\psi _e\) as \(e = (1,\ldots ,1)^{\top }.\)

  • The parameters \(\beta \) and \(\nu \) for the line search in Step 4 of the method were chosen as \(\beta = 0.0001\) and \(\nu = 0.5.\)

  • The stopping criteria employed were that \(\Vert u_k\Vert < 0.0001,\) or that a maximum of 200 iterations was reached.

  • For finding the set \(Min (F(x_k),K)\) at the \(k\)th iteration in Step 1 of the algorithm, we implemented the method developed by Günther and Popovici in [16]. This procedure requires, for a so-called presorting phase, a functional \(\psi : {{\mathbb {R}}}^m \rightarrow {{\mathbb {R}}}\) that is strongly monotone with respect to the partial order \(\preceq .\) In our implementation, we used \(\psi \) defined as follows:

    $$\begin{aligned} \forall \; v\in {{\mathbb {R}}}^m: \psi (v) := \sum _{i = 1}^m v_i. \end{aligned}$$

    The other possibility for finding the set \(Min (F(x_k),K)\) would be to use the method introduced by Jahn in [21, 22, 26] with ideas from [45]. However, as mentioned in the Introduction, the first approach has a better computational complexity, so the algorithm proposed in [16] was a clear choice. A sketch of this step is given after this list.

  • At the \(k\)th iteration in Step 2 of the algorithm, we used the modeling language CVXPY 1.0 [1, 8] for the solution of the problem

    $$\begin{aligned} \underset{(a,u)\in P_k \times {{\mathbb {R}}}^n}{\min } \varphi _{x_k}(a,u). \end{aligned}$$

    Since the variable a is constrained to lie in the discrete set \(P_k,\) we proceeded as follows (see also the sketch after this list): using the solver ECOS [9] within CVXPY, we computed, for every \(a \in P_k,\) the unique solution \(u_a\) of the strongly convex problem

    $$\begin{aligned} \underset{u\in {{\mathbb {R}}}^n}{\min } \varphi _{x_k}(a,u). \end{aligned}$$

    Then, we set

    $$\begin{aligned} (a_k,u_k) := \underset{a \in P_k}{{{\,\mathrm{argmin}\,}}}\; \varphi _{x_k}(a,u_a). \end{aligned}$$
  • For each test instance considered in the experimental part, we generated initial points randomly on a specific set and ran the algorithm. We count a run as solved if the algorithm stopped because \(\Vert u_k\Vert < 0.0001\) and hence declared that a strongly stationary point was found. For a given run, its final error is the value of \(\Vert u_k\Vert \) at the last iteration. The following variables are collected for each test instance:

    • Solved: the number of initial points for which the problem was solved.

    • Iterations: a 3-tuple (min, mean, max) indicating the minimum, the mean and the maximum number of iterations among the runs reported as solved.

    • Mean CPU Time: the mean of the CPU time (in seconds) among the solved cases.

    Furthermore, for clarity, all numerical values are displayed with up to four decimal places.
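To make Steps 1, 2 and 4 concrete in the setting just described, the following Python sketch assembles the main ingredients of one iteration. It is a minimal sketch rather than our actual implementation: all function names are ours, it assumes \(K = {{\mathbb {R}}}_+^m\) and \(e = (1,\ldots ,1)^\top \) (so that \(\psi _e\) reduces to the componentwise maximum), and it takes \(\varphi _{x_k}(a,u)\) in the form \(\max _{j \in [\omega _k]} \psi _e\left( \nabla f^{a_j}(x_k)^\top u\right) + \frac{1}{2}\Vert u\Vert ^2\) appearing in the proof of Theorem 4.1.

```python
import numpy as np
import cvxpy as cp

def psi_e(y):
    # Gerstewitz scalarizing functional for K = R^m_+ and e = (1,...,1):
    # psi_e(y) = min{ t in R : y in t*e - K } = max_i y_i.
    return float(np.max(y))

def min_elements(points):
    # Step 1: Min(F(x_k), K) for K = R^m_+, in the spirit of [16]:
    # presort by the strongly monotone functional psi(v) = sum_i v_i,
    # then keep every point not dominated by an already kept one.
    order = sorted(range(len(points)), key=lambda i: float(np.sum(points[i])))
    kept = []
    for i in order:
        dominated = any(np.all(points[j] <= points[i]) and
                        np.any(points[j] < points[i]) for j in kept)
        if not dominated:
            kept.append(i)
    return kept  # indices of the K-minimal elements

def direction_subproblem(jacobians):
    # Step 2, inner problem for a fixed a in P_k: the strongly convex program
    #   min_u  max_j psi_e(J_j u) + 0.5 * ||u||^2,
    # where J_j = (nabla f^{a_j}(x_k))^T is an m x n matrix.
    n = jacobians[0].shape[1]
    u = cp.Variable(n)
    objective = cp.max(cp.hstack([cp.max(J @ u) for J in jacobians])) \
        + 0.5 * cp.sum_squares(u)
    problem = cp.Problem(cp.Minimize(objective))
    problem.solve(solver=cp.ECOS)
    return u.value, problem.value

def choose_direction(candidates):
    # Step 2, outer problem: enumerate the discrete variable a in P_k.
    # candidates is a list of pairs (a, [J_1, ..., J_omega]).
    best = None
    for a, jacobians in candidates:
        u, value = direction_subproblem(jacobians)
        if best is None or value < best[2]:
            best = (a, u, value)
    return best  # (a_k, u_k, phi(x_k))

def armijo_step(selections, jacobians, x, u, beta=1e-4, nu=0.5, kappa_max=50):
    # Step 4: backtracking line search. Accept t = nu^kappa as soon as every
    # selection satisfies f(x + t*u) <= f(x) + beta*t*J*u componentwise,
    # which is the order relation induced by K = R^m_+.
    # kappa_max is a practical safeguard only.
    t = 1.0
    for _ in range(kappa_max):
        if all(np.all(f(x + t * u) <= f(x) + beta * t * (J @ u))
               for f, J in zip(selections, jacobians)):
            return t
        t *= nu
    return t
```

The enumeration in choose_direction mirrors the ECOS-based procedure described above; only the closed form of \(\psi _e\) depends on the particular choice of \(K\) and \(e.\)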

Now, we proceed to the different instances on which our algorithm was tested. Our first test instance can be seen as a continuous version of an example in [17].

Test Instance 5.1

We consider \(F:\mathbb {R} \rightrightarrows \mathbb {R}^2\) defined as

$$\begin{aligned} F(x) := \left\{ f^1(x), \ldots , f^5(x)\right\} , \end{aligned}$$

where, for \(i \in [5],\) \(f^i: {{\mathbb {R}}}\rightarrow {{\mathbb {R}}}^2\) is given as

$$\begin{aligned} f^i(x) := \begin{pmatrix} x \\ \frac{x}{2}\sin (x) \end{pmatrix} + \sin ^2(x)\left[ \frac{(i-1)}{4 }\begin{pmatrix} 1 \\ -1 \end{pmatrix}+ \left( 1- \frac{(i-1)}{4}\right) \begin{pmatrix} -1 \\ 1 \end{pmatrix} \right] . \end{aligned}$$
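For reproducibility, these selections admit a direct transcription into code; the following snippet is a sketch under the same conventions as before (the names f and F are ours):

```python
import numpy as np

def f(i, x):
    # Selection f^i of Test Instance 5.1 for i in {1,...,5} and scalar x.
    lam = (i - 1) / 4.0
    shift = lam * np.array([1.0, -1.0]) + (1.0 - lam) * np.array([-1.0, 1.0])
    return np.array([x, 0.5 * x * np.sin(x)]) + np.sin(x) ** 2 * shift

def F(x):
    # A discretized segment attached to the curve x -> (x, (x/2) sin x).
    return [f(i, x) for i in range(1, 6)]
```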

The objective values in this case are discretized segments that move along a curve and are contracted or dilated by a factor depending on the argument. We generated 100 initial points \(x_0\) randomly in the interval \([-5\pi ,5\pi ]\) and ran our algorithm. Some of the metrics are listed in Table 1. As we can see, in this case all runs terminated by finding a strongly stationary point. Moreover, we observed that relatively few iterations were needed for this problem.

Table 1 Performance of Algorithm 1 in Test Instance 5.1

In Fig. 1, the sequence \(\left\{ F(x_k)\right\} _{k \in \{0\}\cup [7]}\) generated by Algorithm 1 for a selected starting point is shown. In this case, strong stationarity was declared after seven iterations. The traces of the curves \(f^i\) for \(i \in [5]\) are displayed, with arrows indicating their direction of movement. Moreover, the sets \(F(x_0)\) and \(F(x_{7})\) are represented by black and red points, respectively, and the elements of the sets \(F(x_k)\) for \(k \in [6]\) are shown in gray. The improvement of the objective values at every iteration is clearly visible.

Fig. 1 Sequence generated in the image space by Algorithm 1 for a selected starting point in Test Instance 5.1

Test Instance 5.2

In this example, we start by taking a uniform partition \(\mathcal {U}_1\) of the interval \([-1,1]\) with 10 points, that is,

$$\begin{aligned} \mathcal {U}_1= \{-1, -0.7778, -0.5556, -0.3333, -0.1111, 0.1111, 0.3333, 0.5556, 0.7778, 1 \}. \end{aligned}$$

Then, the set \(\;\mathcal {U}: = \mathcal {U}_1\times \mathcal {U}_1\) is a mesh of 100 points of the square \([-1,1]\times [-1,1].\) Let \(\{u_1,\ldots , u_{100}\}\) be an enumeration of \(\mathcal {U}\) and consider the points

$$\begin{aligned} l_1:= \begin{pmatrix} 0\\ 0 \end{pmatrix}, \; l_2:= \begin{pmatrix} 8\\ 0 \end{pmatrix},\; l_3:= \begin{pmatrix} 0\\ 8 \end{pmatrix}. \end{aligned}$$

We define, for \(i \in [100],\) the function \(f^i:{{\mathbb {R}}}^2 \rightarrow {{\mathbb {R}}}^3\) as

$$\begin{aligned} f^i(x) := \frac{1}{2}\begin{pmatrix} \Vert x- l_1-u_i\Vert ^2 \\ \Vert x- l_2-u_i\Vert ^2 \\ \Vert x- l_3 -u_i\Vert ^2 \end{pmatrix}. \end{aligned}$$

Finally, the set-valued mapping \(F: {{\mathbb {R}}}^2 \rightrightarrows {{\mathbb {R}}}^3\) is defined by

$$\begin{aligned} F(x) := \left\{ f^1(x), \ldots , f^{100}(x)\right\} . \end{aligned}$$
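A possible construction of the mesh \(\mathcal {U}\) and of the selections is sketched below; the enumeration of \(\mathcal {U}\) chosen here is one of many valid ones:

```python
import itertools
import numpy as np

U1 = np.linspace(-1.0, 1.0, 10)                       # uniform 10-point partition of [-1, 1]
U = [np.array(u) for u in itertools.product(U1, U1)]  # 100-point mesh of the square
L = [np.array([0.0, 0.0]), np.array([8.0, 0.0]), np.array([0.0, 8.0])]  # l_1, l_2, l_3

def f(i, x):
    # Selection f^i for i in {1,...,100}: halved squared distances of x to the
    # three facility sites, each shifted by the uncertainty element u_i.
    u = U[i - 1]
    return 0.5 * np.array([np.linalg.norm(x - l - u) ** 2 for l in L])

def F(x):
    return [f(i, x) for i in range(1, 101)]
```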

Note that problem (\(\mathcal {SP}_\ell \)) corresponds in this case to the robust counterpart of a vector-valued facility location problem under uncertainty [20], where \(\mathcal {U}\) represents the uncertainty set with respect to the facility sites \(l_1,l_2,l_3.\) Furthermore, with the aid of Theorem 3.1, it is possible to show that a point \(\bar{x}\) is a local weakly minimal solution of (\(\mathcal {SP}_\ell \)) if and only if

$$\begin{aligned} \bar{x} \in conv \left\{ l_j + u_i \mid (i,j) \in I(\bar{x})\times \{1,2,3\} \right\} . \end{aligned}$$

Thus, in particular, the local weakly minimal solutions lie in the set

$$\begin{aligned} C:=conv \left( (l_1+ \mathcal {U})\cup (l_2+ \mathcal {U}) \cup (l_3+ \mathcal {U}) \right) . \end{aligned}$$
(59)

In this test instance, 100 initial points \(x_0\) were generated in the square \([-50,50]\times [-50,50],\) and Algorithm 1 was run for each of them. A summary of the results is presented in Table 2. Again, for every initial point, the sequence generated by the algorithm stopped at a solution of our problem. Perhaps the most noticeable statistic recorded in this case is the number of iterations required to declare a solution: in most cases, a single iteration was enough, even when the starting point was far away from the locations \(l_1, l_2, l_3.\)

Table 2 Performance of Algorithm 1 in Test Instance 5.2

In Fig. 2, the set of solutions found in this experiment is shown in red. The locations \(l_1,l_2,l_3\) are represented by black points, and the elements of the set \(\left( l_1 + \mathcal {U}\right) \cup \left( l_2 + \mathcal {U}\right) \cup \left( l_3 + \mathcal {U} \right) \) are shown in gray. As expected, all the local solutions found are contained in the set C given in (59).

Fig. 2 Solutions found (in red) in the argument space for Test Instance 5.2

Our last test example comes from [24].

Test Instance 5.3

For \(i \in [100],\) we consider the function \(f^i: {{\mathbb {R}}}^2 \rightarrow {{\mathbb {R}}}^2\) defined as

$$\begin{aligned} f^i(x):= \begin{pmatrix} e^{\frac{x_1}{2}} \cos (x_2)+ x_1 \cos (x_2) \cos ^3\left( \frac{2\pi (i-1)}{100}\right) - x_2\sin (x_2)\sin ^3\left( \frac{2\pi (i-1)}{100}\right) \\ e^{\frac{x_2}{20}}\sin (x_1) + x_1 \sin (x_2)\cos ^3\left( \frac{2\pi (i-1)}{100}\right) + x_2\cos (x_2)\sin ^3\left( \frac{2\pi (i-1)}{100}\right) \end{pmatrix}. \end{aligned}$$

Hence, \(F: {{\mathbb {R}}}^2 \rightrightarrows {{\mathbb {R}}}^2\) is given by

$$\begin{aligned} F(x):= \{f^1(x), \ldots , f^{100}(x)\}. \end{aligned}$$
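Again, a direct transcription of the selections may help with reproducibility; the following is a sketch (the names are ours):

```python
import numpy as np

def f(i, x):
    # Selection f^i of Test Instance 5.3 for i in {1,...,100}, x = (x1, x2).
    x1, x2 = x
    theta = 2.0 * np.pi * (i - 1) / 100.0
    c, s = np.cos(theta) ** 3, np.sin(theta) ** 3
    return np.array([
        np.exp(x1 / 2.0) * np.cos(x2) + x1 * np.cos(x2) * c - x2 * np.sin(x2) * s,
        np.exp(x2 / 20.0) * np.sin(x1) + x1 * np.sin(x2) * c + x2 * np.cos(x2) * s,
    ])

def F(x):
    return [f(i, x) for i in range(1, 101)]
```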

The images of the set-valued mapping in this example are discretized, shifted, rotated and deformed rhombuses, as shown in Fig. 3. We randomly generated 100 initial points in the square \([-10\pi , 10 \pi ] \times [-10\pi , 10 \pi ]\) and ran our algorithm. A summary of the results is given in Table 3. In this case, a solution was found for only 88 of the initial points; in the remaining runs, the algorithm stopped because the maximum number of iterations was reached. Further examination of these unsolved cases revealed that, except for two of the initial points, the final error was of the order of \(10^{-1}\) (and even \(10^{-3}\) or \(10^{-4}\) in half of the cases). Thus, perhaps only a few more iterations were needed in order to declare strong stationarity.

Fig. 3 Sequence generated in the image space by Algorithm 1 for a selected starting point in Test Instance 5.3

Table 3 Performance of Algorithm 1 in Test Instance 5.3

Figure 3 illustrates the sequence \(\left\{ F(x_k)\right\} _{k \in \{0\} \cup [18]}\) generated by Algorithm 1 for a selected starting point. Strong stationarity was declared after 18 iterations in this experiment. The sets \(F(x_0)\) and \(F(x_{18})\) are represented by black and red points, respectively, and the elements of the sets \(F(x_k)\) for \(k \in [17]\) are shown in gray. As in the other test instances, we can observe that at every iteration the images decrease with respect to the preorder \(\preceq ^\ell .\)

6 Conclusions

In this paper, we considered set optimization problems with respect to the lower set less relation, where the set-valued objective mapping can be decomposed into a finite number of continuously differentiable selections. The main contributions are tailored optimality conditions, derived using first-order information of the selections in the decomposition, together with an algorithm for the solution of problems with this structure. An attractive feature of our method is that we can guarantee convergence toward points satisfying these optimality conditions; to the best of our knowledge, it is the first procedure with this property in the context of set optimization. Finally, because of the applications of problems with this structure to optimization under uncertainty, ideas for further research include the development of cutting plane strategies for general set optimization problems, as well as the extension of our results to other set relations.