1 Introduction

Mathematical programming has long been recognized as one of the fundamental branches of applied mathematics, since a large number of problems in engineering, economics, management science, etc., involve an optimal decision-making process which gives rise to an optimization model. This fact has strongly motivated the development of the theoretical foundations of optimization and the study and design of algorithms.

Among the main issues in mathematical optimization, optimality conditions play a key role in the theoretical understanding of solutions and in their numerical computation. At first, such conditions were established for linear or smooth optimization problems. Later developments in variational analysis allowed researchers to extend this theory to general nonlinear, nonsmooth, convex programming problems defined in infinite-dimensional spaces (see e.g. [39]). In the same spirit, generalized differentiation became an intensive field of research with numerous applications to nonsmooth and nonconvex mathematical programming problems. Nevertheless, in order to provide such general calculus rules and derive optimality conditions, a certain degree of smoothness is required, working either in Banach spaces with a smooth norm or in Asplund spaces (see e.g. [24,25,26,27, 34]). In this paper, we provide necessary and sufficient optimality conditions for a general class of optimization problems under new qualification conditions which constitute genuine alternatives to the well-known Slater constraint qualification. Our approach is based on the notions of the (regular) subdifferential and coderivative, and we show that these tools work outside the scope of Asplund spaces, not for the whole family of lower semicontinuous functions, but for the so-called class of B-DC mappings. This class of mappings is introduced in Definition 2.3 and constitutes a slight extension of the concept of DC functions/mappings.

The paper focuses on the study of (necessary and sufficient) optimality conditions for a constrained programming problem. First, we study the case where the constraint is of the form \(\Phi (x)\in {\mathcal {C}}\), where \({\mathcal {C}}\) is a nonempty, closed and convex set in a vector space Y, and \(\Phi \) is a vector-valued function from the decision space X into Y. Second, we study an abstract conic constraint, that is, the case where \({\mathcal {C}}=-{\mathcal {K}}\) for a nonempty closed convex cone \({\mathcal {K}}\). These abstract representations allow us to cover infinite, stochastic and semidefinite programming problems.

With this general aim of establishing necessary and sufficient optimality conditions, we first introduce an extension of the concept of vector-valued DC mappings, also called \(\delta \)-convex mappings, given in [38] (see also [21] for classical notions and further references). Our Definition 2.3 addresses two fundamental aspects of mathematical optimization. First, the convenience of using functions with extended real values, and mappings which are not defined on the whole space, has been widely recognized. This allows us to handle different classes of constraint systems from an abstract point of view and, to this purpose, we enlarge the space Y with an extra element \(\infty _{Y}\) (in particular, \(\infty _{Y}=+\infty \) whenever \(Y= \mathbb {R}\)). Second, we consider specific scalarization sets for the mapping \(\Phi \), which vary along dual directions with respect to the set involved in the constraint; more specifically, directions in the polar set of \({\mathcal {C}}\), or in the positive polar cone of \({\mathcal {K}}\), respectively (see the definitions below).

The aforementioned notions make it possible to exploit the geometrical properties of mappings (convexity) combined with tools taken from variational analysis and generalized differentiation. Using these tools, we obtain necessary and sufficient optimality conditions for abstract optimization problems defined on general locally convex spaces under new qualification conditions, which represent alternatives to classical notions.

The paper is organized as follows. In Sect. 2, we introduce the necessary notation and preliminary results; we give the definition of the set \(\Gamma _{h}(X,Y,B)\) and introduce the tools of generalized differentiation. Together with those notions, we provide the first calculus rules and machinery needed in the paper, which constitute the workhorse in the formulation of our optimality conditions. In Sect. 3, we deal with a constraint of the form \(\Phi (x)\in {\mathcal {C}}\); we transform the problem into an unconstrained mathematical program, where the objective function is a difference of two convex functions, and this reformulation yields necessary and sufficient conditions of global and local optimality. The main result in this section, concerning global optimality, is Theorem 3.1, and the result for necessary conditions of local optimality is Theorem 3.7; sufficient conditions are given in Theorem 3.11. Later, in Sect. 4, we confine ourselves to studying problems with abstract conic constraints given by \(\Phi (x)\in -{\mathcal {K}}\). In that section, the cone structure is exploited, and a set of scalarizations, generated by the positive polar cone \({\mathcal {K}}^+\) (see Definition 4.1), is used. Appealing to that notion, and thanks to a suitable reformulation of the problem, we derive specific necessary and sufficient optimality conditions for conic programming problems. In particular, Theorem 4.3 presents global optimality conditions, and Theorems 4.5 and 4.7 are devoted to local optimality. In the final section, we apply our developments to establish ad hoc optimality conditions for fundamental problems in applied mathematics such as infinite, stochastic and semidefinite programming problems.

2 Notation and Preliminary Results

The paper uses the main notations and definitions which are standard in convex and variational analysis (see e.g. [2, 24,25,26,27, 34, 39]).

2.1 Tools from Convex Analysis

In this paper, X and Y are locally convex (Hausdorff) spaces (lcs, in brief) with respective topological duals \(X^{*}\) and \(Y^{*}\). We denote by \(w(X^{*},X)\) and \(w(Y^{*},Y)\) the corresponding weak\(^*\) topologies on \(X^{*}\) and \(Y^{*}\), respectively. We enlarge Y by adding the element \(\infty _{Y}\). The extended real line is denoted by \(\overline{\mathbb {R}}:=[-\infty ,+\infty ]\), and we adopt the convention \(+\infty -(+\infty )=+\infty \). Given a family of sets \(\{A_{i}\}_{i\in I}\), we denote its union by brackets:

$$\begin{aligned} \bigcup \left[ A_{i}:i\in I\right] :=\bigcup _{i\in I}A_{i}. \end{aligned}$$

Given a set \(A\subset X\), we denote by \({\text {cl}}(A)\), \({\text {int}}(A)\), \({\text {co}}(A)\), \({\text {cone}}(A)\) the closure, the interior, the convex hull and the convex cone generated by A, respectively. By \(0_{X}\) we represent the zero vector in X, and similarly for the spaces \(Y,X^{*}\) and \(Y^{*}\). For two sets \(A,B\subset X\) and \(\lambda \in \mathbb {R}\) we define the following operations:

$$\begin{aligned} A+B:=\left\{ a+b:\ a\in A\text { and }b\in B\right\} ,\quad \lambda A:=\{\lambda a:a\in A\}, \end{aligned}$$

and

$$\begin{aligned} A\ominus B:=\{x\in X:x+B\subseteq A\}. \end{aligned}$$

In the previous operations, we consider the following conventions:

$$\begin{aligned} A+\emptyset =\emptyset +A=\emptyset ,\quad {\text {co}}(\emptyset )={\text {cone}}(\emptyset )=\emptyset ,\quad \text {and}\quad 0\emptyset =\{0_{X}\}. \end{aligned}$$

Given a set T, we represent the generalized simplex on T by \(\Delta (T)\), which is the set of all the functions \(\alpha :T\rightarrow [0,1]\) such that \(\alpha _{t}\ne 0\) only for finitely many \(t\in T\) and \(\sum _{t\in T}\alpha _{t}=1;\) for \(\alpha \in \Delta (T)\) we denote \({\text {supp}}\alpha :=\{t\in T:\alpha _{t}\ne 0\}\). We also introduce the symbols \(\Delta _{n}:=\Delta (\{1,2,\ldots ,n\})\), \(\Delta _{n}^{\epsilon }:=\epsilon \Delta _{n}.\)

If A is convex and \(\epsilon \ge 0\), we define the \(\epsilon \)-normal set to A at x as

$$\begin{aligned} N_{A}^{\epsilon }(x):=\{x^{*}\in X^{*}:\ \langle x^{*},y-x\rangle \le \epsilon \text { for all }y\in A\}, \end{aligned}$$
(1)

if \(x\in A,\) and \(N_{A}^{\epsilon }(x)=\emptyset \) if \(x\notin A\). If \(\epsilon =0\), \(N_{A}^{0}(x)\equiv N_{A}(x)\) is the so-called normal cone to A at x.
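For illustration (an elementary computation of ours), take \(X=\mathbb {R}\) and \(A=[0,+\infty )\). Then, for every \(\epsilon \ge 0\),

$$\begin{aligned} N_{A}^{\epsilon }(1)=\{x^{*}\in \mathbb {R}:\ x^{*}(y-1)\le \epsilon \text { for all }y\ge 0\}=[-\epsilon ,0], \end{aligned}$$

which shrinks to the normal cone \(N_{A}(1)=\{0\}\) as \(\epsilon \rightarrow 0\), whereas \(N_{A}^{\epsilon }(0)=N_{A}(0)=(-\infty ,0]\) for every \(\epsilon \ge 0\).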

If \(A\subseteq X\), the polar set of A is given by

$$\begin{aligned} A^{\circ }:=\{x^{*}\in X^{*}:\ \langle x^{*},y\rangle \le 1\text { for all }y\in A\}. \end{aligned}$$

The positive polar cone of A is given by

$$\begin{aligned} A^{+}:=\left\{ x^{*}\in X^{*}: \langle x^{*}, x \rangle \ge 0, \text { for all } x\in A \right\} . \end{aligned}$$
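When A is a nonempty cone, these two notions are linked by an elementary identity, which we record here because it connects the constraint sets used in Sects. 3 and 4 (the verification below is ours): for a nonempty cone \({\mathcal {K}}\subseteq X\),

$$\begin{aligned} (-{\mathcal {K}})^{\circ }={\mathcal {K}}^{+}. \end{aligned}$$

Indeed, \(\langle x^{*},-k\rangle \le 1\) for all \(k\in {\mathcal {K}}\) forces, after scaling k by \(t\rightarrow +\infty \), \(\langle x^{*},k\rangle \ge 0\) for all \(k\in {\mathcal {K}}\), while the converse inclusion is immediate.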

Given a mapping \(F:X\rightarrow Y\cup \{\infty _{Y}\}\), the (effective) domain of F is given by \({\text {dom}}F:=\{x\in X:\ F(x)\in Y\}\). We say that F is proper if \({\text {dom}}F\ne \emptyset \). Given \(y^*\in Y^*\), we define \(\langle y^{*},F\rangle : X\rightarrow \mathbb {R}\cup \{+\infty \}\) by

$$\begin{aligned} \langle y^{*},F\rangle (x):=\left\{ \begin{array}{ll} \langle y^{*},F(x)\rangle , &{} \text {if }x\in {{\,\mathrm{{\text {dom}}}\,}}F,\\ +\infty , &{} \text {if }x\notin {{\,\mathrm{{\text {dom}}}\,}}F. \end{array} \right. \end{aligned}$$
(2)

We use similar notations for functions \(f:X\rightarrow \mathbb {R}\cup \{+\infty \}\). We represent by \(\Gamma _{0}(X)\) the set of all functions \(f:X\rightarrow \mathbb {R}\cup \{+\infty \}\) which are proper, convex and lower-semicontinuous (lsc, in brief).

The continuity of functions and mappings will be only considered at points of their domains.

Given the set A, the indicator function of A is

$$\begin{aligned} \delta _{A}(x):=\left\{ \begin{array}{cc} 0, &{} \text {if }x\in A,\\ +\infty , &{} \text {if }x\notin A. \end{array} \right. \end{aligned}$$

For \(\epsilon \ge 0\), the \(\epsilon \)-subdifferential (or approximate subdifferential) of a function \(f:X\rightarrow \overline{ \mathbb {R}}\) at a point \(x\in X\) is the set

$$\begin{aligned} \partial _{\epsilon }f(x):=\{x^{*}\in X^{*}:\ \langle x^{*},y-x\rangle \le f(y)-f(x)+\epsilon ,\ \forall y\in X\}; \end{aligned}$$

if \(|f(x)|=+\infty \), we set \(\partial _{\epsilon }f(x)=\emptyset \). The special case \(\epsilon =0\) yields the classical (Moreau–Rockafellar) convex subdifferential, denoted by \(\partial f(x)\).
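For instance (a standard computation, included only for illustration), for the absolute value function \(f=|\cdot |\) on \(X=\mathbb {R}\) one gets

$$\begin{aligned} \partial _{\epsilon }f(0)=[-1,1] \text { for all }\epsilon \ge 0, \quad \text {and}\quad \partial _{\epsilon }f(x)=[\max \{-1,1-\epsilon /x\},1] \text { for } x>0, \end{aligned}$$

so that, in contrast with \(\partial f(x)=\{1\}\) for \(x>0\), the \(\epsilon \)-subdifferential also keeps track of subgradients supported at nearby points.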

We finish this subsection by recalling the following alternative formulation of a general constrained optimization problem in terms of a maximum function. Since it follows from standard arguments, we omit the proof.

Lemma 2.1

Given the functions \(g, h: X \rightarrow \overline{\mathbb {R}}\) and the nonempty set \(C \subseteq X\), let us consider the optimization problem

$$\begin{aligned} \begin{array}{ll} &{}\min g(x)\\ &{}\text {s.t. }h(x) \le 0\\ &{}x\in C. \end{array} \end{aligned}$$
(3)

Assume that the optimal value, \(\alpha \), of problem (3) is finite. Then, \(\bar{x}\) is an optimal solution of (3) if and only if \(\bar{x}\) is an optimal solution of the optimization problem

$$\begin{aligned} \begin{array}{l} \min \max \{ g(x)-\alpha , h(x) \}\\ \text {s.t. } \hspace{0.1cm} x\in C. \end{array} \end{aligned}$$
(4)

Moreover, the optimal value of problem (4) is zero.
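A one-dimensional instance (ours, added only to fix ideas): take \(g(x)=x\), \(h(x)=x^{2}-1\) and \(C=\mathbb {R}\). Problem (3) becomes \(\min \{x: x^{2}\le 1\}\), with optimal value \(\alpha =-1\) attained at \(\bar{x}=-1\), and problem (4) reads

$$\begin{aligned} \min _{x\in \mathbb {R}}\ \max \{x+1,\ x^{2}-1\}, \end{aligned}$$

whose optimal value is zero and is attained exactly at \(\bar{x}=-1\), in accordance with Lemma 2.1.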

Remark 2.2

The function \(H: X\times X\rightarrow \overline{\mathbb {R}}\) defined by

$$\begin{aligned} H(x,y):=\max \{ g(x) - g(y), h(x) \} \end{aligned}$$

is called the standard improvement function (see e.g. [3]). In particular, the objective function used in problem (4) corresponds to the improvement function at \(y=\bar{x}\), since \(g(\bar{x})=\alpha \).

2.2 B-DC Functions and Basic Properties

Next we introduce a new class of DC functions which constitutes the keystone of this paper. It extends the notion of DC vector-valued mappings introduced in [15] and is also related to the concept of delta-convex functions in [38].

Definition 2.3

Let X and Y be lcs spaces and \(h\in \Gamma _{0}(X)\).

  i)

    Consider a nonempty set \(B \subseteq Y^*\). We define the set of B-DC mappings with control h, denoted by \(\Gamma _{h}(X,Y,B)\), as the set of all mappings \(F: X\rightarrow Y \cup \{ \infty _Y\}\) such that \({\text {dom}} h \supseteq {\text {dom}} F\) and

    $$\langle \lambda ^*,F \rangle +h \in \Gamma _{0}(X) \text { for all } \lambda ^*\in B.$$

    We also say that F is a DC mapping with control function h relative to B, or that F is controlled by h relative to B.

  ii)

    We represent by \(\Gamma _h(X)\) the set of all functions \(f: X\rightarrow \mathbb {R}\cup \{ +\infty \}\) such that \({\text {dom}} f \subseteq {\text {dom}} h\) and \(f + h\in \Gamma _{0}(X)\).
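To illustrate Definition 2.3 with a simple one-dimensional example of our own, let \(X=Y=\mathbb {R}\), \(h(x)=x^{2}\) and \(B=[-1,1]\). The nonconvex mapping \(F(x)=-x^{2}\) belongs to \(\Gamma _{h}(\mathbb {R},\mathbb {R},B)\), since for every \(\lambda ^*\in B\)

$$\begin{aligned} \langle \lambda ^*,F \rangle (x)+h(x)=(1-\lambda ^*)x^{2}, \end{aligned}$$

which is proper, convex and lsc because \(1-\lambda ^*\ge 0\). Likewise, \(f(x)=-x^{2}\) belongs to \(\Gamma _{h}(\mathbb {R})\), since \(f+h\equiv 0\in \Gamma _{0}(\mathbb {R})\).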

Remark 2.4

It is worth mentioning that Definition 2.3 corresponds to a natural extension of the notion used in [38] under the name of delta-convex functions. More precisely, following [38], for a continuous mapping \(F: U \subseteq X\rightarrow Y\), where X, Y are Banach spaces and U is an open convex set, we say:

  a)

    F is a weak-DC mapping if for every \(\lambda ^*\in Y^*\) there exists \(h_{\lambda ^*}: U \rightarrow \mathbb {R}\) convex and continuous such that \( \langle \lambda ^*,F \rangle +h_{\lambda ^*} \) is convex and continuous on U.

  b)

    F is a DC mapping if there exists \(h: U \rightarrow \mathbb {R}\) convex and continuous such that \( \langle \lambda ^*,F \rangle +h \) is convex and continuous for all \(\lambda ^*\in \mathbb {B}_{Y^*}\), where \(\mathbb {B}_{Y^*}\) is the closed unit ball in \(Y^*\). In this case, it is said that F is a DC mapping with control function h.

Moreover, by [38, Corollary 1.8] both definitions are equivalent when Y is finite-dimensional. In [38], the focus is on analytic properties of vector-valued mappings defined on convex open sets. Here, according to the tradition in optimization theory, we deal with mappings which admit extended values. It is important to mention that Definition 2.3 reduces to the concept of DC mapping introduced in [38] when B is the unit ball of the dual space of Y, in the normed space setting. Moreover, if a real-valued function f is a DC mapping with control h, then necessarily \(-f\) is a DC mapping with control h (see Proposition 2.5 c) below for more details).

The next proposition gathers some elementary properties of the class \(\Gamma _{h}(X,Y,B)\).

Proposition 2.5

(Basic properties of B-DC mappings)

  a)

    Let \(F\in \Gamma _{h}(X,Y,B)\); then \({\text {dom}} F\) is convex.

  b)

    Let \(F_i\in \Gamma _{h_i}(X,Y,B)\) for \(i=1,\ldots ,p\) with \(\cap _{i=1}^p {{\,\mathrm{{\text {dom}}}\,}}F_i\ne \emptyset \); then \(\sum _{i=1}^p F_i \in \Gamma _{h}(X,Y,B)\), where \(h= \sum _{i=1}^p h_i\).

  c)

    Let \(F \in \Gamma _{h}(X,Y,B)\) with \({\text {dom}} F = X\) and suppose that B is symmetric; then \(-F\in \Gamma _{h} (X,Y,B)\).

Proof

a) It follows from the fact that \({\text {dom}} F = {\text {dom}} \left( \langle \lambda ^*, F\rangle + h \right) \) for every \(\lambda ^*\in B\). b) Let \(F:= \sum _{i=1}^p F_i \) and \(\lambda ^*\in B\). Then, for all \(x\in X\) we have that \(\langle \lambda ^*, F\rangle + h= \sum _{i=1}^p \left( \langle \lambda ^*, F_i\rangle + h_i\right) ,\) which is a proper, convex and lsc function. c) It follows from the fact that \( \langle \lambda ^*, -F \rangle + h= \langle -\lambda ^*, F \rangle + h\) and \(-\lambda ^*\in B\), due to the symmetry of B.

\(\square \)

2.3 Generalized Differentiation and Calculus Rules

In this subsection, we introduce the notation needed to deal with nonsmooth and nonconvex functions and mappings, and we develop some calculus rules.

The following notions are based on classical bornological constructions in Banach spaces (see e.g. [5, 6, 18, 24] for more details and similar constructions). Given a locally convex space X, we consider \(\beta (X)\) as the family of all bounded sets of X (i.e. those sets on which every seminorm generating the topology of X is bounded). We simply write \(\beta \) when there is no ambiguity about the space.

Definition 2.6

We say that \(g:X\rightarrow \overline{\mathbb {R}}\) is differentiable at x, with \(|g(x)|<+\infty \), if there exists \(x^{*}\in X^{*}\) such that

$$\begin{aligned} \lim _{t\rightarrow 0^+}\sup _{h\in S}\left| \frac{g(x+th)-g(x)}{t}-\left\langle x^{*},h\right\rangle \right| =0,\text { for all } S\in \beta . \end{aligned}$$
(5)

It is not difficult to see that when such an \(x^*\) exists, it is unique. In that case, and following the usual notation, we simply write \(\nabla g(x)=x^{*}\). Here, it is important to recall that in a general locally convex space X, the differentiability of a function does not imply its continuity. For instance, the square of the norm in any infinite-dimensional Hilbert space is Fréchet differentiable, but not weakly continuous (see e.g. [7]).

Definition 2.7

The regular (Fréchet) subdifferential of \(f:X\rightarrow \overline{\mathbb {R}}\) at \(x\in X,\) with \(|f(x)|<+\infty \), is the set, denoted by \(\hat{\partial } f(x),\) of all \(x^{*}\) such that

$$\begin{aligned} \liminf \limits _{t\rightarrow 0^+ } \sup \limits _{h\in S}\left( \frac{f(x+th)-f(x) }{t}-\langle x^{*},h\rangle \right) \ge 0,\; \text { for all } S\in \beta . \end{aligned}$$

For a point \(x\in X\) where \(|f(x)|=+\infty \), we simply define \(\hat{\partial } f(x)=\emptyset \).

Now, let us formally prove that the regular subdifferential coincides with the classical convex subdifferential for functions in \(\Gamma _0(X)\).

Lemma 2.8

Let \(f\in \Gamma _0(X)\). Then, the regular subdifferential of f coincides with the classical convex subdifferential of f, that is, \( \hat{\partial } f(x)=\partial f(x) \) for every \(x\in X\).

Proof

Since \(\partial f(x) \subseteq \hat{\partial } f(x) \) obviously holds, we focus on the opposite inclusion. Let \(x^*\in \hat{\partial } f(x)\), which implies that \(|f(x)| <+\infty \). Now, consider arbitrary \(y\in X\) and \(\epsilon >0\), and take \(S:=\{ y-x \}\in \beta \). Hence, we have that for small enough \(t\in (0,1)\), the following inequality holds

$$\begin{aligned}{} & {} \frac{ f((1-t)x + ty) - f(x)}{t}-\langle x^*, y -x\rangle \\{} & {} \quad =\frac{f(x + t(y-x)) - f(x)}{t} -\langle x^*, y -x\rangle \ge - \epsilon , \end{aligned}$$

so, using the convexity of f (the difference quotient of a convex function being nondecreasing in t), we obtain \( f(y)-f(x) -\langle x^*, y -x\rangle \ge -\epsilon ;\) then, letting \(\epsilon \rightarrow 0\), we get \(f(y) - f(x) \ge \langle x^*, y-x\rangle \), which, by the arbitrariness of \(y\in X\), implies the result. \(\square \)

The following sum rule is applied in the paper, and we provide its proof for completeness.

Lemma 2.9

Let \(x\in X\), and \(g: X \rightarrow \overline{\mathbb {R}}\) be differentiable at x. Then, for any function \(f: X\rightarrow \overline{\mathbb {R}}\) we have

$$\begin{aligned} \hat{\partial } ( f + g)(x)= \hat{\partial } f(x) + \nabla g(x). \end{aligned}$$
(6)

Proof

Let us suppose that \(x^*\in \hat{\partial } ( f + g)(x)\) and \(S\in \beta \). Hence,

$$\begin{aligned} 0&\le \liminf \limits _{t\rightarrow 0^+ } \sup \limits _{h\in S}\left( \frac{f(x+th)+g(x+th)-f(x)-g(x) }{t}-\langle x^{*},h\rangle \right) \\&\le \liminf \limits _{t\rightarrow 0^+ } \sup \limits _{h\in S}\left( \frac{f(x+th) -f(x) }{t}-\langle x^{*}-\nabla g(x),h\rangle \right) \\&\quad + \lim \limits _{t\rightarrow 0^+ } \sup \limits _{h\in S}\left( \frac{ g(x+th)- g(x) }{t}-\langle \nabla g(x),h\rangle \right) \\&= \liminf \limits _{t\rightarrow 0^+ } \sup \limits _{h\in S}\left( \frac{f(x+th) -f(x) }{t}-\langle x^{*}-\nabla g(x),h\rangle \right) , \end{aligned}$$

which shows that \(x^*- \nabla g(x) \in \hat{\partial } f(x)\). To prove the converse inclusion, it is enough to notice that \(\hat{\partial } f(x) = \hat{\partial } (f +g - g)(x) \subseteq \hat{\partial } ( f + g)(x) - \nabla g(x),\) where the final inclusion follows from the previous part, and that ends the proof. \(\square \)
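Let us point out that the differentiability of g in Lemma 2.9 cannot be dropped. For instance (a one-dimensional illustration of ours), take \(X=\mathbb {R}\), \(f(x)=|x|\) and \(g(x)=-|x|\): then

$$\begin{aligned} \hat{\partial }(f+g)(0)=\hat{\partial }0(0)=\{0\}, \quad \text {while}\quad \hat{\partial }f(0)=[-1,1] \text { and } \hat{\partial }g(0)=\emptyset , \end{aligned}$$

so the right-hand side of (6) is empty and the equality fails.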

Next, we employ the regular subdifferential to provide machinery to differentiate nonsmooth vector-valued mappings.

Definition 2.10

Given a mapping \(F:X\rightarrow Y\cup \{\infty _{Y}\}\) and \(x\in {{\,\mathrm{{\text {dom}}}\,}}F\), we define the regular coderivative of F at x as the set-valued map \(\hat{D}^{*}F(x):Y^{*}\rightrightarrows X^{*}\) given by:

$$\begin{aligned} \hat{D}^{*}F(x)(y^{*}):=\hat{\partial }\left( \langle y^{*},F\rangle \right) (x), \end{aligned}$$
(7)

where \(\langle y^{*},F\rangle \) is the function defined in (2).

This operator is positively homogeneous. In particular, definition (7) coincides with the general construction for the regular coderivative of set-valued mappings on Banach spaces when F is calm at x, that is,

$$\begin{aligned} \left\| F(u)-F(x)\right\| \le \ell \left\| u-x\right\| , \end{aligned}$$

for some \(\ell >0\) and u close to x (see e.g. [19, Proposition 1.32]).
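As an elementary illustration (ours): if \(F=A\) is a continuous linear operator between Banach spaces, then \(\langle y^{*},F\rangle =A^{*}y^{*}\in X^*\) is linear and continuous, and Lemma 2.8 yields

$$\begin{aligned} \hat{D}^{*}A(x)(y^{*})=\{A^{*}y^{*}\}, \text { for all } x\in X \text { and } y^{*}\in Y^{*}, \end{aligned}$$

that is, the regular coderivative reduces to the adjoint operator, as expected.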

The following lemma yields a sum rule for functions in \(\Gamma _{h}(X)\).

Lemma 2.11

Let \(\bar{x}\in X\), and let \(f_1,f_2 \in \Gamma _{h}(X)\) for some function \(h \in \Gamma _0(X)\). Suppose that there exists a point in \({\text {dom}} (f_2+h)\) where \(f_1+h\) is continuous. Then,

$$\begin{aligned} \hat{\partial } ( f_1+ f_2)(\bar{x}) = \hat{\partial } f_1(\bar{x} ) + \hat{\partial } f_2(\bar{x}), \end{aligned}$$

provided that h is differentiable at \(\bar{x}\).

Proof

Let us compute the subdifferential \(\partial ( f_1+ f_2+2h)(\bar{x})\). On the one hand, by Lemmas 2.8 and 2.9, we have that \(\partial ( f_1+ f_2+2h)(\bar{x}) = \hat{\partial }( f_1+ f_2 )(\bar{x}) + 2 \nabla h(\bar{x})\). On the other hand, since the convex function \(f_1 + h\) is continuous at some point in the domain of \(f_2 + h\), we get (see e.g. [39, Theorem 2.8.7]) that

$$\begin{aligned} \partial ( f_1+ f_2+2h)(\bar{x})&= \partial ( f_1+h) (\bar{x} ) + \partial ( f_2+ h)(\bar{x})\\&= \hat{\partial }f_1 (\bar{x}) + \nabla h(\bar{x}) + \hat{\partial }f_2(\bar{x}) + \nabla h(\bar{x}), \end{aligned}$$

where in the last equality we used Lemma 2.9 again, and that concludes the proof. \(\square \)

Next, we present some calculus rules for the subdifferential of an extended-real-valued DC function.

Proposition 2.12

[23, Theorem 1] Let \(g,h\in \Gamma _{0}(X)\) be such that both are finite at x. Then, for every \( \epsilon \ge 0\)

$$\begin{aligned} \partial _{\epsilon }\left( g-h\right) (x)=\bigcap \limits _{\eta \ge 0}\left( \partial _{\eta +\epsilon }g(x)\ominus \partial _{\eta }h(x)\right) . \end{aligned}$$

Particularly, \(g-h\) attains a global minimum at x if and only if

$$\begin{aligned} \partial _{\eta }h( x) \subseteq \partial _{\eta }g( x), \text { for all } \eta \ge 0. \end{aligned}$$
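By way of illustration (a computation of ours on \(X=\mathbb {R}\)), take \(g(x)=x^{2}\) and \(h(x)=|x|\), for which \(\partial _{\eta }g(x)=[2x-2\sqrt{\eta },2x+2\sqrt{\eta }]\) and \(\partial _{\eta }h(0)=[-1,1]\) for all \(\eta \ge 0\). At \(x=0\) the inclusion

$$\begin{aligned} \partial _{\eta }h(0)=[-1,1]\subseteq \partial _{\eta }g(0)=[-2\sqrt{\eta },2\sqrt{\eta }] \end{aligned}$$

fails for \(\eta <1/4\), and indeed \(x=0\) is not a global minimizer of \(g-h\), whose minimal value \(-1/4\) is attained at \(x=\pm 1/2\).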

The following result characterizes the \(\epsilon \)-subdifferential of the supremum function of an arbitrary family of functions.

Proposition 2.13

[30, Proposition 3.1] Let \(\{f_t: t \in T \} \subseteq \Gamma _{0}(X)\) and define \(f =\sup _{t\in T} f_t\). Then, for every \(\epsilon \ge 0\),

$$\begin{aligned} \partial _{\epsilon }f(x) = \bigcap \limits _{\gamma >\epsilon }{\text {cl}}^*\left\{ \sum \limits _{t\in {\text {supp}}\alpha }\alpha _{t}\partial _{\eta _{t}}f_{t}(x): \begin{array}{c} \alpha \in \Delta (T),\;\eta _{t}\ge 0,\;\sum \limits _{t\in {\text {supp}} \alpha }\alpha _{t}\eta _{t}\in [0,\gamma ), \\ \sum \limits _{t\in {\text {supp}}\alpha }\alpha _{t}f_{t}(x)\ge f(x)+\sum \limits _{t\in {\text {supp}}\alpha }\alpha _{t}\eta _{t}-\gamma \end{array} \right\} , \end{aligned}$$
(8)

where \({\text {cl}}^*\) represents the closure with respect to the \(w^{*}\)-topology.

The following calculus rules play a key role in our analysis. Given a mapping \(F:X\rightarrow Y\cup \{\infty _{Y}\}\), a set \(C\subset Y^{*}\) and \(\epsilon \ge 0\), we denote the \(\epsilon \)-active index set at \(x\in {{\,\mathrm{{\text {dom}}}\,}}F\) by

$$\begin{aligned} C_\epsilon (x):=\left\{ \lambda ^*\in C: \sup \limits _{\nu ^*\in C} \langle \nu ^*,F \rangle (x) \le \langle \lambda ^*, F\rangle (x) + \epsilon \right\} . \end{aligned}$$

For \(\epsilon =0\), we simply write \( C(x):= C_0(x)\).

Theorem 2.14

Let \(F:X \rightarrow Y\cup \{ \infty _{Y}\}\), and let C be a convex and compact subset of \(Y^*\) with respect to the \(w^*\)-topology. Let \(g\in \Gamma _{0}(X)\) be such that \(\langle \lambda ^*, F\rangle + g \in \Gamma _{0}(X)\) for all \(\lambda ^*\in C\). Then, for every \(\epsilon \ge 0\) and all \(x\in X\), we have

$$\begin{aligned} \partial _\epsilon \left( \sup \limits _{\lambda ^*\in C} \langle \lambda ^*, F\rangle + g \right) (x)=\bigcup \left[ \partial _\eta \left( \langle \lambda ^*, F\rangle + g \right) (x) : \begin{array}{c} \eta \in [0,\epsilon ] \text { and } \\ \lambda ^*\in C_{\epsilon -\eta }(x) \end{array} \right] . \end{aligned}$$
(9)

Proof

First, let us show the inclusion \(\supseteq \) in (9). Consider \(x^*\) in the right-hand side of (9); then there exist \(\eta \in [0,\epsilon ]\) and \(\lambda ^*\in C_{\epsilon -\eta }(x) \) such that \(x^*\in \partial _\eta \left( \langle \lambda ^*, F\rangle + g \right) (x) \). Hence, for all \(y\in X\)

$$\begin{aligned} \langle x^*, y - x\rangle&\le \langle \lambda ^*, F\rangle (y) + g(y) - \langle \lambda ^*, F\rangle (x) - g(x) +\eta \\&\le \sup \limits _{\nu ^*\in C} \langle \nu ^*, F\rangle (y) + g(y) - \sup \limits _{\nu ^*\in C} \langle \nu ^*,F \rangle (x) -g(x) +\epsilon - \eta +\eta \\&\le \sup \limits _{\nu ^*\in C} \langle \nu ^*, F\rangle (y) + g(y) - \sup \limits _{\nu ^*\in C} \langle \nu ^*,F \rangle (x)-g(x) +\epsilon . \end{aligned}$$

that is, \(x^*\) belongs to the left-hand side of (9). Second, to show the inclusion \(\subseteq \), let us consider \(T=C\) and the family of functions \(f_t =\langle t, F\rangle + g \) for \(t=\lambda ^*\), and set \(f:=\sup _{t\in T} f_t\). Then, by Proposition 2.13 we have

$$\begin{aligned} \partial _{\epsilon } f(x) \subseteq \bigcap \limits _{\gamma > \epsilon } {{\,\textrm{cl}\,}}^*\left( \bigcup \left[ \partial _\eta \left( \langle \lambda ^*, F\rangle + g \right) (x) : \begin{array}{c} \lambda ^*\in C, \; \eta \in [0,\gamma ), \text { and }\\ \langle \lambda ^*, F\rangle (x) \ge \sup \limits _{\nu ^*\in C} \langle \nu ^*,F \rangle (x) + \eta - \gamma \end{array} \right] \right) , \end{aligned}$$
(10)

where we have simplified (8) using the convexity of C and the linearity of the mapping \(\lambda ^*\mapsto \langle \lambda ^*, F\rangle (w) \) for all \(w\in X\). In fact, for \((\alpha _t)_{t\in T} \in \Delta (T)\) we have

$$\begin{aligned} \sum \limits _{t\in {{\,\mathrm{\text {supp}}\,}}\alpha } \alpha _t f_t (\cdot )&= \left( \langle \lambda ^*, F\rangle + g \right) (\cdot )&\text { and }&\sum \limits _{ t\in {\text {supp}} \alpha } \alpha _t \partial _{\eta _t } f_t(x) \subseteq \partial _\eta \left( \langle \lambda ^*, F\rangle + g \right) (x), \end{aligned}$$

where \(\lambda ^*= \sum _{ t\in {\text {supp}} \alpha } \alpha _t t\in C\) and \(\eta := \sum _{t\in T}\alpha _t \eta _t\). Now, consider \(x^*\in \partial _{\epsilon } f(x)\); then, by (10), there exist nets \(\gamma _\ell \rightarrow \epsilon \), \(\eta _\ell \in [0,\gamma _\ell )\) and \(\lambda ^*_\ell \in C\) with

$$\begin{aligned} \langle \lambda ^*_\ell , F\rangle (x) \ge \sup \limits _{\nu ^*\in C} \langle \nu ^*,F \rangle (x) + \eta _\ell - \gamma _\ell \end{aligned}$$

and \({x_\ell }^*\in \partial _{\eta _\ell } \left( \langle \lambda _\ell ^*, F\rangle + g \right) (x) \) such that \(x^*=\lim x_\ell ^*\). Hence, by compactness of C, we can assume that \(\lambda _\ell ^*\rightarrow \lambda ^*\in C\), as well as that \(\eta _\ell \rightarrow \eta \in [0, \epsilon ]\). Furthermore, taking limits \(\langle \lambda ^*, F\rangle (x) \ge \sup _{\nu ^*\in C} \langle \nu ^*,F \rangle (x) + \eta - \epsilon \), so \(\lambda ^*\in C_{\epsilon -\eta }(x) \). Finally, let us show that \(x^*\in \partial _\eta \left( \langle \lambda ^*, F\rangle + g \right) (x) \). Indeed, for all \(y\in X\)

$$\begin{aligned} \langle x^*_\ell , y- x\rangle \le \langle \lambda ^*_\ell , F(y) \rangle + g(y) - \langle \lambda ^*_\ell , F(x) \rangle - g(x) +\eta _\ell , \text { for all } \ell , \end{aligned}$$

so taking limits in \(\ell \) we conclude that

$$\begin{aligned} \langle x^*, y- x\rangle \le \langle \lambda ^*, F(y) \rangle + g(y) - \langle \lambda ^*, F(x) \rangle - g(x) +\eta , \end{aligned}$$

from which, and from the arbitrariness of \(y\in X\), we conclude that \(x^*\) belongs to \(\partial _\eta \left( \langle \lambda ^*, F\rangle + g \right) (x) \). \(\square \)

3 DC Mathematical Programming

This section is devoted to establishing necessary and sufficient conditions for general DC mathematical programming problems. More precisely, we consider the following optimization problem

$$\begin{aligned} \begin{array}{ll} \min &{} \quad \varphi (x) \\ {\mathrm{s.t.}} &{}\quad \Phi (x)\in {\mathcal {C}}, \end{array} \end{aligned}$$
(11)

where \(\varphi :X\rightarrow \mathbb {R}\cup \{+\infty \},\) \(\Phi :X\rightarrow Y\cup \{\infty _{Y}\}\) is a vector-valued mapping, and \({\mathcal {C}}\subseteq Y\) is a closed convex set. This section has two parts devoted to global and local optimality conditions.

3.1 Global Optimality Conditions

Let us establish our main result in this section.

Theorem 3.1

Let \({\mathcal {C}}\subseteq Y\) be a closed and convex set such that \(0_Y \in {\mathcal {C}}\) and \({\mathcal {C}}^\circ \) is \(\hbox {weak}^*\)-compact. Let \(\varphi \in \Gamma _{h}(X)\) and \( \Phi \in \Gamma _{h}(X,Y, {\mathcal {C}}^\circ )\) for some control function \(h\in \Gamma _0(X)\), and suppose that one of the following conditions holds:

  a)

    the function \(x \mapsto \varphi (x) + h(x)\) is continuous at some point of \({\text {dom}} \Phi \),

  b)

    the function \(x \mapsto \sup \{ \langle \lambda ^*, \Phi (x)\rangle : \lambda ^*\in {\mathcal {C}}^\circ \} +h(x)\) is continuous at some point of \({\text {dom}} \varphi \).

Then, if \(\bar{x}\) is an optimal solution of the optimization problem (11), we have that

$$\begin{aligned} \partial _{\eta } h( \bar{x}) \subseteq \bigcup \Big [ \alpha _1 \partial _{\eta _1 }( \varphi + h ) ( \bar{x}) + \alpha _2 \partial _{\eta _2} \left( \langle \lambda ^*, \Phi \rangle + h \right) (\bar{x}) \Big ], \text { for all } \eta \ge 0, \end{aligned}$$
(12)

where the union is taken over all \( \eta _1,\eta _2 \ge 0\), \((\alpha _1,\alpha _2) \in \Delta _2\) and \( \lambda ^*\in {\mathcal {C}}^\circ \) such that

$$\begin{aligned} \alpha _1 \eta _1+ \alpha _2 ( \eta _2 +1- \langle \lambda ^*, \Phi (\bar{x} ) \rangle ) = \eta . \end{aligned}$$
(13)

Conversely, assume that \(\bar{x}\) is a feasible point of (11) and that (12) always holds with \(\alpha _1 >0 \); then \(\bar{x}\) is a solution of (11) relative to \({\text {dom}} \partial h\), that is, \(\bar{x}\) is an optimum of \(\min \{ \varphi (x): x\in \Phi ^{-1}({\mathcal {C}})\cap {\text {dom}} \partial h\}\).

Remark 3.2

(before the proof) It is important to note that the assumption that \({\mathcal {C}}^\circ \) is \(w^*\)-compact is not restrictive. Indeed, whenever there exists some \(z_0\) in \({\mathcal {C}}\) such that \({\mathcal {C}}_{z_0}^\circ :=\left( {\mathcal {C}}- z_0\right) ^\circ \) is \(\hbox {weak}^*\)-compact, Theorem 3.1 can easily be translated in terms of the mapping \(\hat{\Phi } (x):=\Phi (x)-z_0\) and the set \(\hat{{\mathcal {C}}}:= {\mathcal {C}}-z_0\). Moreover, according to the Banach–Alaoglu–Bourbaki theorem, in order to guarantee that \({\mathcal {C}}^\circ _{z_0}\) is \(\hbox {weak}^{*}\)-compact, it is enough to suppose that \(z_0\in {{\,\textrm{int}\,}}({\mathcal {C}})\). More precisely, [8, Theorem I-14] establishes that \(z_0\) belongs to the interior of \({\mathcal {C}}\) with respect to the Mackey topology if and only if \({\mathcal {C}}_{z_0}^\circ \) is \(\hbox {weak}^*\)-compact. Here, it is important to mention that there are several relations between the \(\hbox {weak}^*\)-compactness of \({\mathcal {C}}^\circ _{z_0}\) and the nonemptiness of the interior of \({\mathcal {C}}\) with respect to the Mackey topology (see e.g. [8, 20] for more details), and they can be connected even with the classical James’ theorem (see [31]) and other variational and geometric properties of functions (see [9, 10, 14]).

Proof

First let us suppose that \(\bar{x}\) is a solution of (11) and let \(\alpha \) be the optimal value of the optimization problem (11). In the first part, we prove three claims.

Claim 1: First we prove that

$$\begin{aligned} \partial _{\eta }h(\bar{x}) \subseteq \partial _{\eta }\psi (\bar{x}), \text { for all } \eta \ge 0, \end{aligned}$$
(14)

where

$$\begin{aligned} \psi (x)&:=\max \left\{ \psi _1 (x),\psi _2(x) \right\} , \;\; \end{aligned}$$
(15)

and

$$\begin{aligned} \psi _1 (x) :=\varphi (x) + h(x)-\alpha , \text { and } \psi _2(x):= f(x) +h(x), \end{aligned}$$

with

$$\begin{aligned} f(x):=\sup \left\{ \langle \lambda ^*, \Phi (x) \rangle : \lambda ^*\in {\mathcal {C}}^\circ \right\} -1. \end{aligned}$$
(16)

Indeed, first let us notice that, by the bipolar theorem,

$$\begin{aligned} \Phi (x) \in {\mathcal {C}} \Leftrightarrow \Phi (x)\in ({\mathcal {C}}^{\circ })^{\circ } \Leftrightarrow \langle \lambda ^{*},\Phi (x)\rangle \le 1\text { for all }\lambda ^{*}\in {\mathcal {C}}^{\circ }\Leftrightarrow f(x)\le 0. \end{aligned}$$

Therefore, by Lemma 2.1, the optimization problem (11) has the same optimal solutions as the problem

$$\begin{aligned} \min _{x\in X} \left( \max \{ \varphi (x) + h(x)-\alpha , f(x) +h(x) \} - h(x) \right) . \end{aligned}$$

Hence, by Proposition 2.12, we have that \(\bar{x}\) is a solution of (11) if and only if (14) holds.

Claim 2: The following formulas hold

$$\begin{aligned} \partial _{\eta }\psi ( \bar{x}) = \bigcup \left[ \alpha _1 \partial _{\eta _1 } \psi _1 ( \bar{x}) + \alpha _2 \partial _{\eta _0} \psi _2( \bar{x}) : \begin{array}{c} \eta _1 ,\eta _0 \ge 0, \, (\alpha _1,\alpha _2) \in \Delta _2 \\ \alpha _1 \eta _1+ \alpha _2 ( \eta _0 - f(\bar{x} ) ) \le \eta \end{array} \right] , \end{aligned}$$
(17)

and

$$\begin{aligned} \partial _{\eta _0} \psi _2( \bar{x}) = \bigcup \left[ \partial _{\eta _2} \left( \langle \lambda ^*, \Phi \rangle + h \right) (\bar{x}) : \begin{array}{c} \eta _2 \in [0,\eta _0] \text { and } \\ \lambda ^*\in ({\mathcal {C}}^\circ )_{\eta _0-\eta _2}(\bar{x}) \end{array} \right] . \end{aligned}$$
(18)

Indeed, first let us compute \(\partial _{\eta }\psi ( \bar{x})\) using the formula for the \(\eta \)-subdifferential of the max-function given in [39, Corollary 2.8.15] (the assumptions are actually satisfied thanks to condition a) or b)), so we get

$$\begin{aligned} \partial _{\eta }\psi ( \bar{x}) = \bigcup \left[ \alpha _1 \partial _{\frac{\epsilon _1}{\alpha _1} } \psi _1 ( \bar{x}) + \alpha _2 \partial _{\frac{ \epsilon _2}{\alpha _2}} \psi _2( \bar{x}) : \begin{array}{c} (\epsilon _0,\epsilon _1, \epsilon _2) \in \Delta _3^\eta , \, (\alpha _1,\alpha _2) \in \Delta _2 \\ \alpha _1 \psi _1 ( \bar{x}) + \alpha _2 \psi _2 ( \bar{x}) \ge \psi ( \bar{x}) -\epsilon _0 \end{array} \right] . \end{aligned}$$

Now, relabeling \(\eta _1:= {\epsilon _1}/{\alpha _1}\), \(\eta _0:= {\epsilon _2}/{\alpha _2}\) when \(\alpha _1 \ne 0\ne \alpha _2\), and setting \(\eta _1= \eta _0=0\) otherwise, and using that \( \psi (\bar{x})=\psi _1 (\bar{x})=h(\bar{x})\) and \(\psi _2(\bar{x})= f(\bar{x}) + h(\bar{x})\le h(\bar{x})\), we get (17).

Second, we get \( \partial _{\eta _0} \psi _2( \bar{x}) \) using Theorem 2.14, so (18) holds.

Claim 3: We have that (12) holds.

To prove that inclusion (12) holds, we can rely on (14) and simply prove that the right-hand side of (12) contains \(\partial _{\eta }\psi (\bar{x})\). Indeed, consider a vector \(x^*\in \partial _{\eta }\psi ( \bar{x})\); using formula (17), we get that there are \( \eta _1,\eta _0 \ge 0\) and \( (\alpha _1,\alpha _2) \in \Delta _2 \), together with subgradients \(w_1^*\in \partial _{\eta _1 } \psi _1 ( \bar{x}) \) and \(w_2^*\in \partial _{\eta _0} \psi _2( \bar{x}) \), such that \( x^*= \alpha _1 w_1^*+ \alpha _2 w_2^*\) and

$$\begin{aligned} \alpha _1 \eta _1+ \alpha _2 ( \eta _0 - f(\bar{x} ) ) \le \eta . \end{aligned}$$
(19)

Now, using (18), we have that there exist \( \eta _2 \in [0,\eta _0]\) and \( \lambda ^*\in ({\mathcal {C}}^\circ )_{\eta _0-\eta _2}(\bar{x})\) such that

$$\begin{aligned} w_2^*\in \partial _{\eta _2} \left( \langle \lambda ^*, \Phi \rangle + h \right) (\bar{x}). \end{aligned}$$

So far, we have proved that \(x^*\) belongs to the set \( \alpha _1 \partial _{\eta _1 }( \varphi + h ) ( \bar{x}) + \alpha _2 \partial _{\eta _2} \left( \langle \lambda ^*, \Phi \rangle + h \right) (\bar{x}) \) for some \( \eta _1,\eta _2 \ge 0\), \((\alpha _1,\alpha _2) \in \Delta _2\) and \( \lambda ^*\in {\mathcal {C}}^\circ \). Moreover, by the definition of \(({\mathcal {C}}^\circ )_{\eta _0-\eta _2}(\bar{x})\), we have that \(f(\bar{x}) \le \langle \lambda ^*, \Phi (\bar{x} ) \rangle -1 + \eta _0- \eta _2\). Hence, using (19), we get

$$\begin{aligned} \alpha _1 \eta _1+ \alpha _2 ( \eta _2 +1- \langle \lambda ^*, \Phi (\bar{x} ) \rangle ) \le \alpha _1 \eta _1+ \alpha _2 ( \eta _0 - f(\bar{x}) ) \le \eta , \end{aligned}$$

so, by increasing the values of \(\eta _1\) and \(\eta _2\) if needed, we can ensure that \((\eta _1,\eta _2,\alpha _1,\alpha _2, \lambda ^*)\) satisfies (13).

Now, to prove the converse, consider a feasible point \(y \in {\text {dom}} \partial h\) of the optimization problem (11). Then, for \(x^*\in \partial h(y)\), we have that

$$\begin{aligned} h(y) -h(\bar{x}) +\eta&= \langle x^*, y- \bar{x}\rangle , \text { and }x^*\in \partial _\eta h(\bar{x}), \end{aligned}$$
(20)

where \(\eta := h^*(x^*) + h(\bar{x}) -\langle x^*, \bar{x}\rangle \ge 0.\) Hence, by (12) we see that there are \( \eta _1,\eta _2 \ge 0\), \((\alpha _1,\alpha _2) \in \Delta _2\) and \( \lambda ^*\in {\mathcal {C}}^\circ \) satisfying (13) with \(\alpha _1\ne 0\), and

$$\begin{aligned} x^*\in \alpha _1 \partial _{\eta _1 }( \varphi + h ) ( \bar{x}) + \alpha _2 \partial _{\eta _2} \left( \langle \lambda ^*, \Phi \rangle + h \right) (\bar{x}). \end{aligned}$$

Particularly,

$$\begin{aligned} \langle x^*, y -\bar{x} \rangle&\le \alpha _1 \left( \varphi (y) + h(y) -\varphi (\bar{x} ) -h(\bar{x}) +\eta _1 \right) \nonumber \\&\quad + \alpha _2 \left( \langle \lambda ^*, \Phi \rangle (y) + h(y) - \langle \lambda ^*, \Phi \rangle (\bar{x}) - h (\bar{x}) +\eta _2 \right) \nonumber \\&\le \alpha _1 \left( \varphi (y) - \varphi (\bar{x}) \right) + \alpha _1 \eta _1 + \alpha _2 \left( \eta _2 + 1 - \langle \lambda ^*, \Phi (\bar{x}) \rangle \right) + h(y)-h(\bar{x})\nonumber \\&\le \alpha _1 \left( \varphi (y) - \varphi (\bar{x}) \right) + h(y)-h(\bar{x}) +\eta . \end{aligned}$$
(21)

We conclude, using (20) and (21), that

$$\begin{aligned} 0 \le \alpha _1( \varphi (y) - \varphi (\bar{x})), \end{aligned}$$
(22)

which, recalling that \(\alpha _1 > 0\), ends the proof. \(\square \)

Remark 3.3

Let us briefly comment on some facts about the last result:

  i)

    Notice that \( \lambda ^*\in {\mathcal {C}}^\circ \) if and only if \(\lambda ^*\in N^{\eta _3}_{{\mathcal {C}}}(\Phi (\bar{x}))\), where \(\eta _3:=1 - \langle \lambda ^*, \Phi (\bar{x})\rangle \ge 0\). Indeed, since \(\Phi (\bar{x}) \in {\mathcal {C}}\), we have

    $$\begin{aligned} \lambda ^*\in {\mathcal {C}}^\circ&\Leftrightarrow \langle \lambda ^*, u\rangle \le 1, \; \text { for all } u \in {\mathcal {C}}\\&\Leftrightarrow \langle \lambda ^*, u- \Phi (\bar{x})\rangle \le 1- \langle \lambda ^*, \Phi (\bar{x})\rangle \text { for all } u \in {\mathcal {C}}\\&\Leftrightarrow \lambda ^*\in N^{\eta _3}_{{\mathcal {C}}}(\Phi (\bar{x})) \text { with } \eta _3:=1 - \langle \lambda ^*, \Phi (\bar{x})\rangle \ge 0. \end{aligned}$$

    Therefore, the existence of multipliers can be equivalently described in terms of the \(\epsilon \)-normal set defined in (1). Nevertheless, we prefer to use the set \({\mathcal {C}}^\circ \) in order to take advantage of the compactness of this set, which will be exploited later.

  ii)

    The converse in Theorem 3.1 can be proved assuming that (12) always holds with multiplier \(\alpha _1 \ne 0\) for all \(\eta \in [0, \bar{\eta }]\), where

    $$\begin{aligned} \bar{\eta }:= \sup \left\{ h^*(y^*) + h( \bar{x}) - \langle y^*, \bar{x} \rangle : y^*\in \partial h(y), \; y \in \Phi ^{-1}({\mathcal {C}}) \right\} . \end{aligned}$$
  iii)

    It is worth mentioning that, due to the assumed existence of a continuity point of \(\varphi +h\), we can prove that \(\varphi \) and h are bounded above on a neighbourhood of a point of their domains. Indeed, suppose that \(\varphi +h\) is bounded above on a neighbourhood U of \(x_0\), that is, \(\varphi (x) + h(x) \le M\) for all \(x\in U\) and some scalar M. Since \(\varphi \) and h are lsc at \(x_0\), we can assume (shrinking the neighbourhood U if necessary) that \(\inf _{x\in U}\varphi (x)>-m \) and \(\inf _{x\in U} h(x) >-m\) for some constant \(m\in \mathbb {R}\), which implies that \(\varphi (x) \le M+m\) and \( h(x) \le M +m\) for all \(x\in U\). Hence, due to the convexity of the involved functions, \(\varphi \) and h are continuous on \( {{\,\textrm{int}\,}}{\text {dom}} \varphi \) and \({{\,\textrm{int}\,}}{\text {dom}} h\), respectively (see e.g. [39, Theorem 2.2.9]). In particular, the latter implies that \({\text {dom}} \partial h \supseteq {{\,\textrm{int}\,}}{\text {dom}} h\) is dense in \({\text {dom}} \varphi \) (recall that \({{\,\mathrm{{\text {dom}}}\,}}\varphi \subseteq {{\,\mathrm{{\text {dom}}}\,}}h\)).

Now, let us establish the following corollary when the problem (11) is convex. The proof follows directly from Theorem 3.1, so we omit the details.

Corollary 3.4

Let \({\mathcal {C}}\subseteq Y\) be a closed and convex set such that \(0_Y \in {\mathcal {C}}\) and \({\mathcal {C}}^\circ \) is \(\hbox {weak}^*\)-compact. Let \(\varphi \in \Gamma _{0}(X)\) and \( \Phi \in \Gamma _{0}(X,Y, {\mathcal {C}}^\circ )\) and suppose that one of the following conditions holds:

  a)

    the function \(\varphi \) is continuous at some point of \({\text {dom}} \Phi \),

  b)

    the function \(x \mapsto \sup \{ \langle \lambda ^*, \Phi (x)\rangle : \lambda ^*\in {\mathcal {C}}^\circ \}\) is continuous at some point of \({\text {dom}} \varphi \).

Then, if \(\bar{x}\) is an optimal solution of the optimization problem (11), we have that there exist \((\alpha _1,\alpha _2) \in \Delta _2\) and \( \lambda ^*\in {\mathcal {C}}^\circ \) such that

$$\begin{aligned} 0_{X^*} \in \alpha _1 \partial \varphi ( \bar{x}) + \alpha _2 \partial \left( \langle \lambda ^*, \Phi \rangle \right) (\bar{x}) \text { and } \alpha _2 ( 1- \langle \lambda ^*, \Phi (\bar{x} ) \rangle ) =0. \end{aligned}$$
(23)

Conversely, assume that \(\bar{x}\) is a feasible point of (11) and that (23) holds with \(\alpha _1 >0 \); then \(\bar{x}\) is a solution of (11).
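As a sanity check of Corollary 3.4 (a toy instance of ours), let \(X=Y=\mathbb {R}\), \(\varphi (x)=x\), \(\Phi (x)=x\) and \({\mathcal {C}}=[-1,1]\), so that \({\mathcal {C}}^\circ =[-1,1]\) is \(\hbox {weak}^*\)-compact and \(0\in {\mathcal {C}}\). At the optimal solution \(\bar{x}=-1\), condition (23) holds with \(\lambda ^*=-1\in {\mathcal {C}}^\circ \) and \((\alpha _1,\alpha _2)=(1/2,1/2)\):

$$\begin{aligned} 0\in \tfrac{1}{2}\,\partial \varphi (-1)+\tfrac{1}{2}\,\partial \left( \langle -1,\Phi \rangle \right) (-1) =\tfrac{1}{2}\{1\}+\tfrac{1}{2}\{-1\}, \quad \alpha _2(1-\langle \lambda ^*,\Phi (\bar{x})\rangle )=\tfrac{1}{2}(1-1)=0. \end{aligned}$$

Conversely, since \(\alpha _1>0\), the corollary confirms that \(\bar{x}=-1\) solves (11).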

The following result shows that the fulfilment of (12) with \(\alpha _1 \ge \epsilon _0\), for some \(\epsilon _0 >0\), can be used to establish that \(\bar{x}\) is a solution to problem (11).

Theorem 3.5

In the setting of Theorem 3.1, suppose that \(\bar{x}\) is a feasible point of (11) and that (12) always holds with \(\alpha _1 \ge \epsilon _0 \), for some \(\epsilon _0>0\). Then, \(\bar{x}\) is an optimal solution of (11).

Proof

The proof follows arguments similar to those given in Theorem 3.1. The only difference is that, given \(y\in {\text {dom}}h\) and \(\epsilon ^{\prime }\in (0,\epsilon _{0})\), we take \(x^{*}\in \partial _{\epsilon ^{\prime }}h(y)\) (recall that \(\partial _{\epsilon ^{\prime }}h(y)\) is nonempty because h is proper, convex and lsc). Again, we consider \(\eta :=h^{*}(x^{*})+h(\bar{x})-\langle x^{*},\bar{x}\rangle \ge 0\), entailing \(x^{*}\in \partial _{\eta }h(\bar{x})\). Therefore, (12) gives rise to the existence of \(\eta _{1},\eta _{2}\ge 0\), \((\alpha _{1},\alpha _{2})\in \Delta _{2}\) and \(\lambda ^{*}\in {\mathcal {C}}^{\circ }\) satisfying (13) with \(\alpha _{1}\ge \epsilon _{0}\), and

$$\begin{aligned} x^{*}\in \alpha _{1}\partial _{\eta _{1}}(\varphi +h)(\bar{x})+\alpha _{2}\partial _{\eta _{2}}\left( \langle \lambda ^{*},\Phi \rangle +h\right) (\bar{x}), \end{aligned}$$

which itself yields

$$\begin{aligned} \langle x^{*},y-\bar{x}\rangle \le \alpha _{1}\left( \varphi (y)-\varphi (\bar{x})\right) +h(y)-h(\bar{x})+\eta . \end{aligned}$$

Then, using that \(-h^*(x^*)+ \langle x^{*},y\rangle = \langle x^{*},y-\bar{x} \rangle + h(\bar{x}) - \eta \), we write

$$\begin{aligned} \langle x^{*},y\rangle -h^*(x^*)-h(y)\le \langle x^{*},y-\bar{x} \rangle +h(\bar{x}) - \eta - h(y) \le \alpha _{1}\left( \varphi (y)-\varphi (\bar{x})\right) . \end{aligned}$$

But now \(x^{*}\in \partial _{\epsilon ^{\prime }}h(y)\) leads us to the inequality \(-\epsilon ^{\prime }\le \alpha _{1}(\varphi (y)-\varphi (\bar{x}))\) instead of (22); hence \(-\frac{\epsilon ^{\prime }}{\epsilon _{0}} \le \varphi (y)-\varphi (\bar{x})\) and, taking \(\epsilon ^{\prime }\rightarrow 0^{+}\), we get the desired conclusion. \(\square \)

Corollary 3.6

Let \({\mathcal {Q}}\subseteq X\) and \({\mathcal {C}}\subseteq Y\) be closed and convex sets with \(0_Y \in {\mathcal {C}}\), and suppose that \({\mathcal {C}}^\circ \) is \(\hbox {weak}^*\)-compact. Let \(\varphi \in \Gamma _{h}(X)\) and \( \Phi \in \Gamma _{h}(X,Y,{\mathcal {C}}^\circ )\) for some function \(h\in \Gamma _0(X)\). Consider the following optimization problem

$$\begin{aligned} \begin{aligned}&\min \varphi (x) \\&\text {s.t. } \Phi (x)\in {\mathcal {C}},\; \\&\hspace{0.5cm}x \in {\mathcal {Q}}. \end{aligned} \end{aligned}$$
(24)

Additionally, assume that there exists a point in \({\mathcal {Q}}\) at which \(\varphi \), \(\Phi \) and h are continuous. Then, if \(\bar{x}\) is an optimal solution of the optimization problem (24), we have that

$$\begin{aligned} \partial _{\eta } h( \bar{x}) \subseteq \bigcup \Big [ \alpha _1 \partial _{\eta _1 }( \varphi + h ) ( \bar{x}) + \alpha _2 \partial _{\eta _2} \left( \langle \lambda ^*, \Phi \rangle + h \right) (\bar{x}) + N_{{\mathcal {Q}}}^{\eta _3} (\bar{x}) \Big ] \text { for all } \eta \ge 0, \end{aligned}$$
(25)

where the union is taken over all \( \eta _1,\eta _2,\eta _3 \ge 0\), \((\alpha _1,\alpha _2) \in \Delta _2\) and \( \lambda ^*\in {\mathcal {C}}^\circ \) such that

$$\begin{aligned} \alpha _1 \eta _1+ \alpha _2 ( \eta _2 +1- \langle \lambda ^*, \Phi (\bar{x} ) \rangle ) + \eta _3= \eta . \end{aligned}$$

Conversely, assume that \(\bar{x}\) is a feasible point of (24) and that (25) always holds with \(\alpha _1 >0 \), then \(\bar{x}\) is a solution of (24) relative to \({\text {dom}} \partial h\).

Proof

Let us observe that the optimization problem (24) is equivalent to

$$\begin{aligned} \begin{aligned} \min \varphi (x) \text { s.t. } \Phi _{\mathcal {Q}}(x)\in {\mathcal {C}}, \end{aligned} \end{aligned}$$
(26)

where

$$\begin{aligned} \Phi _{\mathcal {Q}}(x) :=\left\{ \begin{array}{cc} \Phi (x),&{} \text { if } x\in {\mathcal {Q}},\\ \infty _{Y},&{} \text { if } x\notin {\mathcal {Q}}. \end{array} \right. \end{aligned}$$
(27)

Furthermore, it is easy to prove that \(\langle \lambda ^*,\Phi _{\mathcal {Q}}\rangle = \langle \lambda ^*, \Phi \rangle + \delta _{{\mathcal {Q}}}\) for every \(\lambda ^*\in {\mathcal {C}}^\circ \), and consequently \(\Phi _{\mathcal {Q}}\in \Gamma _{h}(X,Y,{\mathcal {C}}^\circ )\). Then, we apply Theorem 3.1 to the optimization problem (26) (notice that \(\varphi +h\) is continuous at some point of \({{\,\mathrm{{\text {dom}}}\,}}\Phi _{{\mathcal {Q}}}\)) and use the sum rule for the \(\epsilon \)-subdifferential (see e.g. [39, Theorem 2.8.3]) to compute the \(\epsilon \)-subdifferential of \(\langle \lambda ^*, \Phi \rangle + h + \delta _{{\mathcal {Q}}} \) in terms of the corresponding subdifferentials of \(\langle \lambda ^*, \Phi \rangle + h\) and \(\delta _{{\mathcal {Q}}} \) (recall here that \( \langle \lambda ^*, \Phi \rangle \) and h are continuous at some point of \({\mathcal {Q}}\)). \(\square \)

3.2 Local Optimality Conditions

In this section, we present necessary and sufficient conditions for local optimality of problem (11). The first result corresponds to a necessary optimality condition.

Theorem 3.7

In the setting of Theorem 3.1, let \(\bar{x}\) be a local optimal solution of the optimization problem (11), and suppose that h is differentiable at \(\bar{x}\). Then, there are multipliers \((\alpha _1,\alpha _2)\in \Delta _2\) and \(\lambda ^*\in {\mathcal {C}}^\circ \) such that

$$\begin{aligned} 0_{X^*} \in \alpha _1 \hat{\partial } \varphi ( \bar{x}) + \alpha _2 \hat{D}^*\Phi (\bar{x})( \lambda ^*) \text { with }\alpha _2( 1 -\langle \lambda ^*, \Phi (\bar{x} )\rangle )=0. \end{aligned}$$
(28)

In addition, if the following qualification condition holds

$$\begin{aligned} 0_{X^*} \notin \bigcup \left[ \hat{D}^*\Phi (\bar{x})( \lambda ^*) : \begin{array}{c} \lambda ^*\in {\mathcal {C}}^\circ \text { such that}\\ \langle \lambda ^*, \Phi (\bar{x} ) \rangle = 1 \end{array} \right] , \end{aligned}$$
(29)

then we have

$$\begin{aligned} 0_{X^*} \in \hat{\partial }\varphi ( \bar{x}) + {\text {cone}}\left( \bigcup \left[ \hat{D}^*\Phi (\bar{x})( \lambda ^*) : \begin{array}{c} \lambda ^*\in {\mathcal {C}}^\circ \text { such that}\\ \langle \lambda ^*, \Phi (\bar{x} ) \rangle = 1 \end{array} \right] \right) . \end{aligned}$$
(30)

Proof

Consider a closed convex neighbourhood U of \(\bar{x}\) such that \(\bar{x}\) is a global optimum of the following optimization problem

$$\begin{aligned} \min \varphi (x) \text { s.t. } \Phi (x)\in {\mathcal {C}},\; x \in U. \end{aligned}$$
(31)

Following the proof of Theorem 3.1, we can prove that \(\bar{x}\) is a solution to the unconstrained minimization problem

$$\begin{aligned} \min \left( \max \{ \varphi (x) + h(x)-\alpha , f(x) +h(x) \}+\delta _{U}(x) - h(x) \right) , \end{aligned}$$

where \(\alpha \) is the optimal value of (31), and f is defined in (16). Now, applying the Fermat rule, and using Proposition 2.12 (with \(\epsilon =\eta =0\)), we have that \(\bar{x}\) satisfies the following subdifferential inclusion

$$\begin{aligned} \nabla h( \bar{x}) \in \partial \left( \psi + \delta _{U} \right) (\bar{x}), \end{aligned}$$

where \(\psi \) is the function introduced in (15). Since U is a neighbourhood of \(\bar{x}\), we have that \(\partial \left( \psi + \delta _{U} \right) (\bar{x}) =\partial \psi (\bar{x})\). Moreover, using Claim 2 of the proof of Theorem 3.1, we conclude the existence of \((\alpha _1,\alpha _2) \in \Delta _2 \) and \(\lambda ^*\in {\mathcal {C}}^\circ \) with \( \alpha _2( 1 -\langle \lambda ^*, \Phi (\bar{x} )\rangle )=0\) such that

$$\begin{aligned} \nabla h( \bar{x}) \in \alpha _1 \partial ( \varphi + h ) ( \bar{x}) + \alpha _2\partial \left( \langle \lambda ^*, \Phi \rangle + h \right) (\bar{x}). \end{aligned}$$
(32)

Moreover, by Lemmas 2.8 and 2.9 (recall that h is differentiable at \(\bar{x}\)) we can compute the corresponding subdifferentials as

$$\begin{aligned} \partial ( \varphi + h ) ( \bar{x})&=\hat{\partial }( \varphi + h ) ( \bar{x}) =\hat{\partial } \varphi (\bar{x})+ \nabla h(\bar{x}),\\ \partial \left( \langle \lambda ^*, \Phi \rangle + h \right) (\bar{x})&= \hat{\partial }\left( \langle \lambda ^*, \Phi \rangle + h \right) (\bar{x}) =\hat{\partial }\langle \lambda ^*, \Phi \rangle (\bar{x}) + \nabla h(\bar{x})\\ {}&=\hat{D}^*\Phi (\bar{x} )(\lambda ^*)+ \nabla h(\bar{x}). \end{aligned}$$

Therefore, inclusion (32) reduces to (28). Moreover, under the qualification condition (29), the multiplier \(\alpha _1\) in (28) cannot vanish; dividing by \(\alpha _1\), we get (30). \(\square \)

Remark 3.8

(On normality of multiplier \(\lambda ^*\)) It is important to emphasize that the conditions \(\lambda ^*\in {\mathcal {C}}^\circ \) and \( \langle \lambda ^*, \Phi (\bar{x} ) \rangle = 1 \) imply that \(\lambda ^*\in N_{{\mathcal {C}}}(\Phi (\bar{x}))\). Therefore, in Theorem 3.7 the multiplier \(\lambda ^*\) is necessarily a normal vector to \({\mathcal {C}}\) at \(\Phi (\bar{x})\). Furthermore, the qualification condition (29) is equivalent to

$$\begin{aligned} 0_{X^*} \notin \bigcup \left[ \hat{D}^*\Phi (\bar{x})( \lambda ^*) : \begin{array}{c} \lambda ^*\in N_{{\mathcal {C}}}(\Phi (\bar{x})) \text { such that}\\ \langle \lambda ^*, \Phi (\bar{x} ) \rangle = 1 \end{array} \right] . \end{aligned}$$

Here, the equality \( \langle \lambda ^*, \Phi (\bar{x} ) \rangle = 1\) is relevant because, without this condition, \(0_{X^*}\) always belongs to the set \(\bigcup \left[ \hat{D}^*\Phi (\bar{x})( \lambda ^*): \lambda ^*\in N_{{\mathcal {C}}}(\Phi (\bar{x})) \right] .\)
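If, in addition, each function \(\langle \lambda ^*,\Phi \rangle \) is differentiable at \(\bar{x}\) in the sense of Definition 2.6 (an extra assumption made here only for orientation), then Lemma 2.9 applied with \(f\equiv 0\) gives \(\hat{D}^*\Phi (\bar{x})(\lambda ^*)=\{\nabla \langle \lambda ^*,\Phi \rangle (\bar{x})\}\), and (29) takes the familiar Mangasarian–Fromovitz-type form

$$\begin{aligned} \nabla \langle \lambda ^*,\Phi \rangle (\bar{x})\ne 0_{X^*}, \text { for all } \lambda ^*\in N_{{\mathcal {C}}}(\Phi (\bar{x})) \text { with } \langle \lambda ^*,\Phi (\bar{x})\rangle =1. \end{aligned}$$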

Remark 3.9

(On abstract differentiability) It is worth mentioning that in Theorem 3.7 the notions of differentiability and subdifferentiability can be replaced by more general ones, based on an abstract concept of subdifferential. Indeed, following the notion of presubdifferential (see e.g. [35, 36]), we can adapt the definition there in the following way: for every \(x \in X\), consider a family of functions \({\mathcal {F}}_x\) which are finite-valued at x. Now, consider an operator \({\tilde{\partial }}\) which associates with any lower semicontinuous function \(f: X\rightarrow \overline{\mathbb {R}}\) and any \(x \in X\) a subset \({\tilde{\partial }}f(x)\) of \(X^*\) with the following properties:

  i)

    \({\tilde{\partial }}f(x) =\emptyset \) for all x where \(|f(x)|=+\infty \).

  ii)

    \({\tilde{\partial }}f(x) \) is equal to the convex subdifferential whenever f is proper, convex and lower semicontinuous.

  iii)

    \({\tilde{\partial }}\phi (x)\) is single-valued for every \(x \in X\) and \(\phi \in {\mathcal {F}}_x\). In that case, \(\phi \) is called \({\tilde{\partial }}\)-differentiable at x, and we represent by \( \tilde{\nabla } \phi (x)\) the unique point in \({\tilde{\partial }}\phi (x)\).

  iv)

    For every \(x \in {\text {dom}} f\) and \(\phi \in {\mathcal {F}}_x\), we have

    $$\begin{aligned} {\tilde{\partial }}\left( f + \phi \right) (x) \subseteq {\tilde{\partial }}f(x) + \tilde{\nabla } \phi (x). \end{aligned}$$

The above notion covers several classes of subdifferentials, for instance:

  1)

    Bornological subdifferential (Fréchet, Hadamard, Gateaux, etc.) with \({\mathcal {F}}_x\), the family of differentiable functions at x with respect to that bornology.

  2)

    Viscosity bornological subdifferential with \({\mathcal {F}}_x\), the family of smooth functions (with respect to that bornology) at x.

  3)

    Proximal subdifferential with \({\mathcal {F}}_x\), the family of \({\mathcal {C}}^2\)-functions at x.

  4)

    Basic subdifferential with \({\mathcal {F}}_x\), the family of \({\mathcal {C}}^1\)-functions at x.

Using the above definition, we can define the notion of \({\tilde{\partial }}\)-coderivative similarly to (7), namely \(\tilde{D}^*\Phi (x)(y^{*}):={\tilde{\partial }}\left( \langle y^{*},\Phi \rangle \right) (x).\) Using these tools, it is easy to adapt the proof of Theorem 3.7, requiring instead that the convex function h belong to \( {\mathcal {F}}_{\bar{x}}\). In this way, the corresponding inclusion in Theorem 3.7 holds with \( {\tilde{\partial }}\) and \(\tilde{D}^*\) replacing the regular subdifferential and coderivative, respectively.

Therefore, the assumptions on the operator \({\tilde{\partial }}\) and the family \({\mathcal {F}}_{\bar{x}} \) at the optimal point \(\bar{x}\) correspond to a trade-off between the differentiability of the datum \(h\in {\mathcal {F}}_{\bar{x}} \) and the robustness of the objects \({\tilde{\partial }}\) and \(\tilde{D}^*\): the weaker the notion of differentiability, the larger the objects \({\tilde{\partial }}\) and \(\tilde{D}^*\) for which the optimality conditions are presented. In view of this, the reader might believe that it would be best to directly assume that h enjoys a high level of smoothness at \(\bar{x}\). Nevertheless, in infinite-dimensional applications such smoothness does not always hold (see Example 5.3).

Similarly to Theorem 3.7, we provide a necessary optimality condition for problem (24).

Corollary 3.10

In the setting of Corollary 3.6, let \(\bar{x}\) be a local optimal solution of problem (24), and suppose that h is differentiable at \(\bar{x}\). Then, there exist \(\eta \ge 0\) and \(\lambda ^*\in {\mathcal {C}}^\circ \) such that \(\langle \lambda ^*, \Phi (\bar{x} ) \rangle = 1\) and

$$\begin{aligned} 0_{X^*} \in \hat{\partial }\varphi (\bar{x}) + \eta \hat{D}^*\Phi (\bar{x})(\lambda ^*) + N_{\mathcal {Q}}(\bar{x}), \end{aligned}$$
(33)

provided that the following qualification holds

$$\begin{aligned} 0_{X^*} \notin \bigcup \left[ \hat{D}^*\Phi (\bar{x})( \lambda ^*) +N_{\mathcal {Q}}(\bar{x}) : \begin{array}{c} \lambda ^*\in {\mathcal {C}}^\circ \text { such that}\\ \langle \lambda ^*, \Phi (\bar{x} ) \rangle = 1 \end{array} \right] . \end{aligned}$$
(34)

Proof

Following the proof of Corollary 3.6, we have that the optimization problem (26) has a local optimal solution at \(\bar{x}\). Then, by Theorem 3.7 we get that

$$\begin{aligned} 0_{X^*} \in \hat{\partial }\varphi ( \bar{x}) + {\text {cone}}\left( \bigcup \left[ \hat{D}^*\Phi _{\mathcal {Q}}(\bar{x})( \lambda ^*) : \begin{array}{c} \lambda ^*\in {\mathcal {C}}^\circ \text { such that}\\ \langle \lambda ^*, \Phi (\bar{x} ) \rangle = 1 \end{array} \right] \right) , \end{aligned}$$
(35)

where \(\Phi _{{\mathcal {Q}}}\) is defined in (27), provided that the following qualification holds:

$$\begin{aligned} 0_{X^*} \notin \bigcup \left[ \hat{D}^*\Phi _{\mathcal {Q}}(\bar{x})( \lambda ^*) : \begin{array}{c} \lambda ^*\in {\mathcal {C}}^\circ \text { such that}\\ \langle \lambda ^*, \Phi (\bar{x} ) \rangle = 1 \end{array} \right] . \end{aligned}$$
(36)

Now, using Lemma 2.11 we have that (35) and (36) reduce to (33) and (34), which concludes the proof. \(\square \)

The final result of this section shows that the fulfilment of inclusion (12), for all small \(\eta \ge 0\), is sufficient for a point to be a local optimum of problem (11).

Theorem 3.11

Let \(\bar{x}\) be a feasible point of the optimization problem (11) which satisfies the subdifferential inclusion (12) for all \(\eta \) small enough. In addition, suppose that \({\mathcal {C}}^\circ \) is \(\hbox {weak}^*\)-compact, that h is continuous at \(\bar{x}\) and the following qualification condition holds

$$\begin{aligned} \partial h(\bar{x}) \cap \bigcup \left[ {\partial } \left( \langle \lambda ^*, \Phi \rangle + h\right) (\bar{x}) : \begin{array}{c} \lambda ^*\in N_{{\mathcal {C}}}(\Phi (\bar{x})) \text { such that}\\ \langle \lambda ^*, \Phi (\bar{x} ) \rangle = 1 \end{array} \right] =\emptyset . \end{aligned}$$
(37)

Then, \(\bar{x}\) is a local solution of (11).

Proof

First, we claim that \(\bar{x}\) satisfies the subdifferential inclusion (12) with multiplier \(\alpha _1\ne 0\) for all \(\eta \ge 0 \) small enough. Indeed, suppose by contradiction that there are sequences \(\eta _n,\eta _n' \rightarrow 0^+\) and

$$\begin{aligned} x_n^*\in \partial _{\eta _n} h( \bar{x}) \cap \partial _{\eta '_n} \left( \langle \lambda _n^*, \Phi \rangle + h \right) (\bar{x}), \end{aligned}$$

where \( \eta '_n +1- \langle \lambda _n^*, \Phi (\bar{x} ) \rangle \le \eta _n\) and \( \lambda _n^*\in {\mathcal {C}}^\circ \). Since h is continuous at \(\bar{x}\) we have that \(\partial _{\eta _n} h( \bar{x}) \) is \(\hbox {weak}^*\)-compact (see e.g. [39, Theorem 2.4.9]). Hence, there exist subnets (with respect to the \(\hbox {weak}^*\)-topology) \(x_{n_\nu }^*\) and \( \lambda ^*_{n_\nu } \) converging to \( x^*\) and \( \lambda ^*\), respectively. Then, it is easy to see that \(x^*\in \partial h(\bar{x}) \cap \partial \left( \langle \lambda ^*, \Phi \rangle + h \right) (\bar{x}),\) which contradicts (37) and proves our claim.

Now, let us denote by \(\epsilon _0>0\) a number such that \(\bar{x}\) satisfies the subdifferential inclusion (12) with \(\alpha _1\ne 0\) for all \(\eta \in [0,\epsilon _0]\).

Since h is locally Lipschitz at \(\bar{x}\), there exists a neighbourhood U of \(\bar{x}\) such that for all \(x,y \in U\) and all \(x^*\in \partial h(x) \)

$$\begin{aligned} | \langle x^*, y -x \rangle | + |h(y) -h(x) | \le \epsilon _0. \end{aligned}$$

In particular, for each \(y \in U\), (20) holds with \(\eta \le \epsilon _0\). Now, for \(y\in U \cap \Phi ^{-1}({\mathcal {C}})\), repeating the arguments given in the proof of Theorem 3.1 we get that \(\varphi (\bar{x}) \le \varphi (y)\), which ends the proof. \(\square \)

Remark 3.12

Let us notice that, when h is differentiable at \(\bar{x}\), condition (37) reduces to

$$\begin{aligned} 0_{X^*} \notin \bigcup \left[ \hat{D}^*\Phi (\bar{x})( \lambda ^*) : \begin{array}{c} \lambda ^*\in {\mathcal {C}}^\circ \text { such that}\\ \langle \lambda ^*, \Phi (\bar{x} ) \rangle = 1 \end{array} \right] . \end{aligned}$$
(38)

Indeed, if h is differentiable at \(\bar{x}\), we can use the sum rule (6) to get that

$$\begin{aligned} {\partial } \left( \langle \lambda ^*, \Phi \rangle + h\right) (\bar{x}) =\hat{D}^*\Phi (\bar{x})( \lambda ^*) + \nabla h(\bar{x}). \end{aligned}$$

Therefore, (37) turns out to be equivalent to (38).

4 DC Cone-Constrained Optimization Problems

This section is devoted to establishing necessary and sufficient conditions for cone-constrained optimization problems. More precisely, we consider the following optimization problem

$$\begin{aligned} \begin{aligned}&\min \varphi (x) \\&\text {{s.t. }} \Phi (x)\in -{\mathcal {K}}, \end{aligned} \end{aligned}$$
(39)

where \(\varphi :X \rightarrow \mathbb {R}\cup \{ +\infty \},\) \(\Phi :X \rightarrow Y\cup \{ \infty _{Y}\}\), and \({\mathcal {K}}\subset Y\) is a closed convex cone.

The approach in this section is slightly different from the one followed in the previous section, where there was an abstract convex constraint involving a general closed convex set \({\mathcal {C}}\). More precisely, we will take advantage of the particular structure of the cone constraint \(\Phi (x)\in -{\mathcal {K}}\) in terms of a more suitable supremum function. In order to do that, we need to introduce the following notion for convex cones.

Definition 4.1

Let \(\Theta \subseteq Y^*\) be a closed convex cone. We say that \(\Theta \) is \(w^*\)-compactly generated if there exists a \(\hbox {weak}^*\)-compact and convex set \({\mathcal {B}} \) such that

$$\begin{aligned} \Theta = {{\,\textrm{cl}\,}}^{*}\text {cone}( {\mathcal {B}} ). \end{aligned}$$

In this case, we say that \(\Theta \) is \(w^*\)-compactly generated by \({\mathcal {B}}\).

The next lemma establishes sufficient conditions to ensure that the polar of a convex cone is \(w^*\)-compactly generated.

Lemma 4.2

Let \({\mathcal {K}} \subseteq Y\) be a convex closed cone. Suppose that one of the following conditions is satisfied:

a) Y is a normed space.

b) \({\mathcal {K}}\) has nonempty interior.

Then, \({\mathcal {K}}^{+}\) is \(w^*\)-compactly generated.

Proof

a) Let us consider the convex set \({\mathcal {B}}:=\{ x^*\in {\mathcal {K}}^{+}: \Vert x^*\Vert \le 1 \}\), which is \(\hbox {weak}^*\)-compact by the Banach–Alaoglu–Bourbaki Theorem. Then, \({\mathcal {K}}^{+}\) is \(w^*\)-compactly generated by \({\mathcal {B}}\). b) Consider an interior point \(y_0\) of \({\mathcal {K}}\), and take a convex balanced neighbourhood of zero, V, such that \(y_0 + V \subseteq {\mathcal {K}}\). Therefore \({\mathcal {K}}^+ \subseteq \{ x^*\in Y^*: \langle x^*, y_0 \rangle \ge \sup _{y \in V} \langle x^*, y \rangle \}\). Then, consider \({\mathcal {B}}:=\{ x^*\in {\mathcal {K}}^+: \sup _{y \in V} \langle x^*, y \rangle \le 1 \}\), which is a \(\hbox {weak}^*\)-compact (convex) set due to the Banach–Alaoglu–Bourbaki Theorem. Moreover, for every \(x^*\in {\mathcal {K}}^+\) we have that \(\frac{x^*}{ |\langle x^*, y_0\rangle | + 1 } \in {\mathcal {B}}\), so \({\mathcal {K}}^+\) is \(w^*\)-compactly generated by \({\mathcal {B}}\). \(\square \)

4.1 Global Optimality Conditions

The next theorem gives necessary and sufficient optimality conditions for problem (39).

Theorem 4.3

Let \({\mathcal {K}}\) be a closed convex cone such that \({\mathcal {K}}^+\) is \(\hbox {w}^*\)-compactly generated by \({\mathcal {B}}\), and assume that \(\Phi \in \Gamma _{h}(X,Y, {\mathcal {B}})\) and \( \varphi \in \Gamma _{h}(X)\) for some function \(h \in \Gamma _{0}(X)\). Furthermore, suppose that one of the following conditions holds:

a) the function \(x \mapsto \varphi (x) + h(x)\) is continuous at some point of \({\text {dom}} \Phi \),

b) the function \(x \mapsto \sup \{ \langle \lambda ^*, \Phi (x)\rangle : \lambda ^*\in {\mathcal {B}}\} +h(x)\) is continuous at some point of \({\text {dom}} \varphi \).

Then, if \(\bar{x}\) is a minimum of (39), we have that for every \(\eta \ge 0\)

$$\begin{aligned} \partial _{\eta } h( \bar{x}) \subseteq \bigcup \Big [ \alpha _1 \partial _{\eta _1 }( \varphi + h ) ( \bar{x}) + \alpha _2 \partial _{\eta _2} \left( \langle \lambda ^*, \Phi \rangle + h \right) (\bar{x}) \Big ], \end{aligned}$$
(40)

where the union is taken over all \( \eta _1,\eta _2 \ge 0\), \((\alpha _1,\alpha _2) \in \Delta _2\) and \( \lambda ^*\in {\mathcal {B}} \) such that

$$\begin{aligned} \alpha _1 \eta _1+ \alpha _2 ( \eta _2 - \langle \lambda ^*, \Phi (\bar{x} )\rangle ) = \eta . \end{aligned}$$

Conversely, if \(\bar{x}\) is a feasible point of (39) and (40) always holds with \(\alpha _1 > 0\), then \(\bar{x}\) is a solution of (39) relative to \({\text {dom}} \partial h\).

Proof

Suppose that \(\bar{x}\) is a minimum of (39). First, let us notice that \(\Phi (x) \in -{\mathcal {K}}\) if and only if \(\sup _{ y^*\in {\mathcal {B}}} \langle y^*, \Phi \rangle (x) \le 0\). Then, by Lemma 2.1 we have that \(\bar{x}\) is a solution of the DC program

$$\begin{aligned} \min \, \max \Big \{ \varphi (x)+h(x)-\alpha ,\; \sup \limits _{ y^*\in {\mathcal {B}}} \langle y^*, \Phi \rangle (x) +h(x) \Big \} - h(x), \end{aligned}$$

where \(\alpha \) is the optimal value of (39). Now, using the notation of Theorem 3.1 we consider

$$\begin{aligned}\psi _1 (x): =\varphi (x) + h(x)-\alpha ,\; \psi _2(x):= \sup _{ y^*\in {\mathcal {B}}} \langle y^*, \Phi \rangle (x) +h(x) \end{aligned}$$

and

$$\begin{aligned} \psi (x):=\max \left\{ \psi _1 (x),\psi _2(x) \right\} . \end{aligned}$$

Now, mimicking the proof of Theorem 3.1 and taking into account that, instead of (18), we have

$$\begin{aligned} \partial _{\eta _0} \psi _2( \bar{x}) = \bigcup \left[ \partial _{\eta _2} \left( \langle \lambda ^*, \Phi \rangle + h \right) (\bar{x}) : \hspace{-0.2cm}\begin{array}{c} \eta _2 \in [0,\eta _0], \lambda ^*\in {\mathcal {B}} \text { and } \\ \vspace{-0.2cm} \\ \sup \limits _{ y^*\in {\mathcal {B}}} \langle y^*, \Phi \rangle (\bar{x}) \le \langle \lambda ^*, \Phi \rangle (\bar{x}) + \eta _0-\eta _2 \end{array} \right] , \end{aligned}$$

we have that (40) holds. The converse follows as in the proof of Theorem 3.1, so we omit the details. \(\square \)
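The reformulation at the start of the proof is easy to visualize in finite dimensions. The following Python sketch is our illustration (the cone \({\mathcal {K}}=\mathbb {R}^m_+\) and the sample sizes are our choices): here \({\mathcal {K}}^+=\mathbb {R}^m_+\) is \(w^*\)-compactly generated by the unit simplex \({\mathcal {B}}\), and \(\Phi (x)\in -{\mathcal {K}}\) if and only if \(\sup _{y^*\in {\mathcal {B}}}\langle y^*, \Phi (x)\rangle = \max _i \Phi _i(x) \le 0\).

```python
# Illustration (ours): for K = R^m_+, membership Phi(x) in -K is equivalent to
# the sup over the simplex B of <y*, Phi(x)> being <= 0; the sup of a linear
# functional over B is attained at a vertex e_i, so it equals max_i Phi_i(x).
import numpy as np

rng = np.random.default_rng(0)
m = 5
vertices = np.eye(m)                            # extreme points of B

for _ in range(3):
    v = rng.normal(size=m)                      # a candidate value Phi(x)
    in_minus_K = bool(np.all(v <= 0))
    sup_B = (vertices @ v).max()                # exact sup over B (= max_i v_i)
    sampled = (rng.dirichlet(np.ones(m), 20000) @ v).max()  # inner points of B
    print(f"Phi(x) in -K: {in_minus_K},  sup over B = {sup_B:+.4f},"
          f"  sampled sup ~ {sampled:+.4f}")
```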

Now, we present a result about optimality conditions for a DC cone-constrained optimization problem with an extra abstract convex constraint. The proof follows arguments similar to those of Corollary 3.6, but uses Theorem 4.3 instead of Theorem 3.1, so we omit it.

Corollary 4.4

Consider the optimization problem

$$\begin{aligned} \begin{aligned}&\min \varphi (x) \\&\text {s.t. } \Phi (x)\in -{\mathcal {K}},\; \\&\hspace{0.6cm}x \in {\mathcal {Q}}, \end{aligned} \end{aligned}$$
(41)

where \({\mathcal {Q}}\) is closed and convex, \({\mathcal {K}}\) is a closed convex cone such that \({\mathcal {K}}^+\) is \(w^*\)-compactly generated by \({\mathcal {B}}\), and \(\Phi \in \Gamma _{h}(X,Y, {\mathcal {B}})\), \( \varphi \in \Gamma _{h}(X)\) for some function \(h \in \Gamma _{0}(X)\). Assume that there exists a point in \({\mathcal {Q}}\) at which \(\varphi + h \) and \(\Phi + h\) are continuous. Then, if \(\bar{x}\) is an optimal solution of problem (41), we have that

$$\begin{aligned} \partial _{\eta } h( \bar{x}) \subseteq \bigcup \Big [ \alpha _1 \partial _{\eta _1 }( \varphi + h ) ( \bar{x}) + \alpha _2 \partial _{\eta _2} \left( \langle \lambda ^*, \Phi \rangle + h \right) (\bar{x}) + N_{{\mathcal {Q}}}^{\eta _3} (\bar{x}) \Big ] \text { for all } \eta \ge 0, \end{aligned}$$
(42)

where the union is taken over all \( \eta _1,\eta _2,\eta _3 \ge 0\), \((\alpha _1,\alpha _2) \in \Delta _2\) and \( \lambda ^*\in {\mathcal {B}}\) such that

$$\begin{aligned} \alpha _1 \eta _1+ \alpha _2 ( \eta _2 - \langle \lambda ^*, \Phi (\bar{x} )\rangle ) + \eta _3= \eta . \end{aligned}$$

Conversely, if \(\bar{x}\) is a feasible point of (41) and (42) always holds with \(\alpha _1 >0\), then \(\bar{x}\) is a solution of (41) relative to \({\text {dom}} \partial h\).

4.2 Local Optimality Conditions

Now, we focus on necessary and sufficient local optimality conditions for DC cone-constrained optimization problems. The following two results provide necessary conditions for optimality of problem (39) and for a variant with an additional abstract convex constraint. The proofs of both results follow similar arguments to the ones used in Theorem 3.7 and Corollary 3.10, respectively; accordingly, we omit them.

Theorem 4.5

In the setting of Theorem 4.3, let \(\bar{x}\) be a local optimal solution of problem (39), and suppose that h is differentiable at \(\bar{x}\). Then, there are multipliers \((\alpha _1,\alpha _2)\in \Delta _2\) and \(\lambda ^*\in {\mathcal {B}}\) such that

$$\begin{aligned} 0_{X^*} \in \alpha _1 \hat{\partial }\varphi ( \bar{x}) + \alpha _2 \hat{D}^*\Phi (\bar{x})( \lambda ^*) \text { and } \alpha _2 \langle \lambda ^*, \Phi (\bar{x} ) \rangle =0. \end{aligned}$$

In addition, if the following qualification holds

$$\begin{aligned} 0_{X^*} \notin \bigcup \left[ \hat{D}^*\Phi (\bar{x})( \lambda ^*) : \begin{array}{c}\lambda ^*\in {\mathcal {B}} \text { such that } \langle \lambda ^*, \Phi (\bar{x} ) \rangle = 0 \end{array} \right] , \end{aligned}$$

we have

$$\begin{aligned} 0_{X^*} \in \hat{\partial }\varphi ( \bar{x}) + {\text {cone}}\left( \bigcup \left[ \hat{D}^*\Phi (\bar{x})( \lambda ^*) : \begin{array}{c}\lambda ^*\in {\mathcal {B}} \text { such that } \langle \lambda ^*, \Phi (\bar{x} ) \rangle = 0 \end{array} \right] \right) . \end{aligned}$$

Corollary 4.6

Under the assumptions of Corollary 4.4, let \(\bar{x}\) be a local optimal solution of problem (41), and suppose that h is differentiable at \(\bar{x}\) and \(\varphi \) and \(\Phi \) are continuous at \(\bar{x}\). Then, there exists \(\lambda ^*\in {\mathcal {K}}^+\) such that

$$\begin{aligned} 0_{X^*} \in \hat{ \partial } \varphi ( \bar{x}) + \hat{D}^*\Phi (\bar{x})( \lambda ^*) + N_{{\mathcal {Q}}} (\bar{x}) \text { and } \langle \lambda ^*, \Phi (\bar{x} ) \rangle = 0, \end{aligned}$$

provided that the following constraint qualification holds

$$\begin{aligned} 0_{X^*} \notin \hat{D}^*\Phi (\bar{x})( \lambda ^*) + N_{{\mathcal {Q}}} (\bar{x}), \text { for all } \lambda ^*\in {\mathcal {B}}. \end{aligned}$$

Similarly to Theorem 3.11, we provide sufficient conditions for local optimality in terms of (40).

Theorem 4.7

Let \(\bar{x}\) be a feasible point of problem (39) satisfying the subdifferential inclusion (40) for all \(\eta \) small enough. Additionally, suppose that h is continuous at \(\bar{x}\) and that the following qualification holds

$$\begin{aligned} \partial h(\bar{x}) \cap \bigcup \left[ {\partial } \left( \langle \lambda ^*, \Phi \rangle + h\right) (\bar{x}) : \begin{array}{c} \lambda ^*\in {\mathcal {B}} \text { such that } \langle \lambda ^*, \Phi (\bar{x} ) \rangle = 0 \end{array} \right] =\emptyset . \end{aligned}$$
(43)

Then, \(\bar{x}\) is a local solution of (39).

Remark 4.8

Notice that, when h is differentiable at \(\bar{x}\), condition (43) reduces to

$$\begin{aligned} 0_{X^*} \notin \bigcup \left[ \hat{ D}^*\Phi (\bar{x})( \lambda ^*) : \begin{array}{c} \lambda ^*\in {\mathcal {B}} \text { such that } \langle \lambda ^*, \Phi (\bar{x} ) \rangle = 0 \hspace{-0.15cm} \end{array}\right] . \end{aligned}$$

5 Applications to Mathematical Programming Problems

In this section, we provide some applications of the theory developed in the previous sections.

5.1 Infinite Programming

We consider the optimization problem

$$\begin{aligned} \begin{array}{l} \min \varphi (x) \\ \text {{s.t. }} \phi _t(x)\le 0, \;t\in T, \end{array} \end{aligned}$$
(44)

where T is a locally compact Hausdorff space, the functions \(\phi _t: X\rightarrow \mathbb {R}\), \(t\in T\), are such that, for all \(x\in X\), the function \(t\mapsto \phi _t(x)\equiv \phi (t,x)\) is continuous with compact support, and \(\varphi : X\rightarrow \overline{\mathbb {R}}\). Problem (44) belongs to the class of infinite programming problems (called semi-infinite programming when X is finite-dimensional); we refer to [22] for more details about the theory.

The space of continuous functions defined on T and with compact support is denoted by \({\mathcal {C}}_c(T)\). A finite measure \(\mu : {{\mathcal {B}}}(T) \rightarrow [0,+\infty )\), where \({{\mathcal {B}}}(T)\) is the Borel \(\sigma \)-algebra, is called regular if for every \(A \in {{\mathcal {B}}}(T) \)

$$\begin{aligned} \begin{aligned} \mu (A)=\inf \left\{ \mu (V): \hspace{-0.05cm}V \text { is open and }V \supseteq A \right\} =\sup \left\{ \mu (K): \hspace{-0.05cm}K \text { is compact and }K \subseteq A \right\} . \end{aligned} \end{aligned}$$

We denote by \({{\mathcal {M}}}_+(T)\) the set of all (finite) regular Borel measures. Let us recall (e.g. [1, Theorem 14.14]) that the dual of \({\mathcal {C}}_c(T)\), endowed with the uniform norm, can be identified with the linear space generated by \({{\mathcal {M}}}_+(T)\).

The following result provides necessary optimality conditions for problem (44) using a cone representation in the space \({\mathcal {C}}_c(T)\).

Theorem 5.1

Let X be a Banach space and T be a locally compact Hausdorff space. Suppose that \(\varphi , \phi _t\in \Gamma _{h}(X)\), \(t\in T\), for some function \(h\in \Gamma _0(X)\), and assume that the function \(x\mapsto \inf _{t\in T} \phi _t(x)\) is locally bounded from below. Let \(\bar{x}\) be a local optimum of problem (44), and assume that h and \(\phi _t \), \(t\in T\), are differentiable at \(\bar{x}\), and that there are \(\ell , \epsilon > 0\) such that

$$\begin{aligned} |\phi _t (y) -\phi _t(\bar{x}) |&\le \ell \Vert y- \bar{x}\Vert , \quad \text { for all } (t,y)\in T\times \mathbb {B}_\epsilon (\bar{x}). \end{aligned}$$
(45)

Then, there exists \(\mu \in {{\mathcal {M}}}_+(T)\) such that \({{\,\mathrm{\text {supp}}\,}}\mu \subseteq T(\bar{x}):=\{ t\in T: \phi _t(\bar{x} )=0\}\),

$$\begin{aligned} - \int _T \nabla \phi (t, \bar{x}) \mu ({\textrm{d}}t)\in \hat{\partial }\varphi (\bar{x} ) \text { and }\int _T \phi _t(\bar{x})\mu ({\textrm{d}}t) =0, \end{aligned}$$

provided that

$$\begin{aligned} 0_{X^*} \notin \bigcup \left\{ \int _T \nabla \phi (t, \bar{x}) \nu ({\textrm{d}}t) : \begin{array}{c}\nu \in {{\mathcal {M}}}_+(T) \text { such that } \nu (T)=1, \\ \vspace{-0.3cm} \\ {{\,\mathrm{\text {supp}}\,}}\nu \subseteq T(\bar{x}) \text { and } \int _T \phi _t(\bar{x}) \nu ({\textrm{d}}t) = 0 \end{array} \right\} , \end{aligned}$$

where the integrals are in the sense of Gelfand (also called \(w^{*}\)-integrals).

Remark 5.2

(before the proof) It is important to recall that a mapping \(x^*: T \rightarrow X^*\) is Gelfand integrable if for every \(x \in X\) the function \(t\mapsto \langle x^*(t),x\rangle \) is integrable. In that case, the integral \( \int _T x^*(t) \nu ({\text {d}}t) \) is well-defined as the unique element of \(X^*\) representing the linear functional

$$\begin{aligned} X\ni x \mapsto \int _T \langle x^*(t), x\rangle \nu ({\text {d}}t). \end{aligned}$$

We refer to [16, Chapter II.3, p. 53] for more details.

Proof

First, let us define \(\Phi : X\rightarrow {\mathcal {C}}_c(T)\) as the evaluation function \(\Phi (x): T \rightarrow \mathbb {R}\) given by \(\Phi (x)(t):= \phi _t(x)\), which is well-defined thanks to our assumptions. Given \({\mathcal {K}}:=\{ x \in {\mathcal {C}}_c(T): x(t) \ge 0 \text { for all }t\in T \}\), by [1, Theorem 14.12] we have \({\mathcal {K}}^+= {{\mathcal {M}}}_+(T)\), and it is easy to see that \({\mathcal {K}}^+\) is \(w^*\)-compactly generated by \({\mathcal {B}}:=\{ \mu \in {{\mathcal {M}}}_+(T): \mu (T)=1 \}\) (in the sense of Definition 4.1). Let us first prove two claims.

Claim 1: The function \(x \mapsto \sup \{ \langle \nu , \Phi (x)\rangle :\nu \in {\mathcal {B}}\} +h(x)\) is continuous, and the mapping \(\Phi \) belongs to \(\Gamma _{h}(X,{\mathcal {C}}_c(T),{\mathcal {B}})\).

To this purpose, fix a measure \(\nu \in {\mathcal {B}}\). By the assumptions, the function \(x \mapsto \phi _{t}(x)+h(x)\) is convex for all \(t\in T\), so integration over T with respect to \(\nu \) preserves the convexity on X (see e.g. [11,12,13, 28, 29, 32, 33]); hence, the function \(\langle \nu , \Phi \rangle + h\) is convex. Moreover, consider a sequence \(x_k \rightarrow x\). Since the function \( x\mapsto \inf _{t\in T} \phi _t(x)\) is locally bounded from below and \(h \in \Gamma _{0}(X)\), we can take \(\alpha \in \mathbb {R}\) and \(k_0\in \mathbb {N}\) such that \(\phi _t(x_k ) + h(x_k)\ge \alpha \), for all \(t\in T\) and all \(k \ge k_0\). Then, Fatou’s lemma and the lower semicontinuity of the functions \(\phi _t + h\), \(t\in T\), yield

$$\begin{aligned} \langle \nu , \Phi (x)\rangle + h(x)&= \int _{T} \left( \phi _t(x) + h(x) \right) \nu ({\text {d}}t) \le \int _{T} \liminf _{k\rightarrow \infty }\left( \phi _t(x_k) + h(x_k) \right) \nu ({\text {d}}t) \\&\le \liminf _{k\rightarrow \infty } \int _{T} \left( \phi _t(x_k) + h(x_k) \right) \nu ({\text {d}}t) = \liminf _{k\rightarrow \infty } \left( \langle \nu , \Phi (x_k)\rangle + h(x_k) \right) , \end{aligned}$$

showing the lower semicontinuity of \(x \mapsto \langle \nu , \Phi \rangle (x)+h(x)\) and, consequently, that the mapping \(\Phi \) belongs to \( \Gamma _{h}(X,{\mathcal {C}}_c(T),{\mathcal {B}})\). Finally, the function \(x \mapsto \sup \{ \langle \nu , \Phi (x)\rangle :\nu \in {\mathcal {B}}\} +h(x)\) is convex, lsc and finite-valued because \({\mathcal {B}}\) is \(w^*\)-compact; hence it is also continuous (recall that X is a Banach space).

Claim 2: For every \(h \in X\) the function \(t\mapsto \langle \nabla \phi _t(\bar{x}), h \rangle \) is measurable and, for every \(\nu \in {\mathcal {B}}\), we have that

$$\begin{aligned} \hat{D}^*\Phi (\bar{x}) (\nu ) \subseteq \left\{ \int _T \nabla \phi _t (\bar{x}) \nu ({\text {d}}t)\right\} . \end{aligned}$$

Fix \(h\in X\). Since the functions \(\phi _t\), \(t\in T\), are differentiable at \(\bar{x}\), we get that \(\langle \nabla \phi _t(\bar{x}), h \rangle = \lim _{ k \rightarrow \infty } k\left( { \phi _t( \bar{x} + k^{-1} h) - \phi _t(\bar{x})}\right) \). In particular, the function \(t\mapsto \langle \nabla \phi _t(\bar{x}), h \rangle \) is measurable as the pointwise limit of a sequence of measurable functions. Moreover, by (45) we get that \(\langle \nabla \phi _t(\bar{x}), h \rangle \le \ell \Vert h\Vert \), for all \(t\in T\), which shows the integrability and, consequently, that the Gelfand integral is well-defined (see e.g. [1, 16]). Finally, let \(x^*\in \hat{D}^*\Phi (\bar{x}) (\nu )\); the definition of the regular subdifferential, with \(S=\{h\} \in \beta \), implies that

$$\begin{aligned} \langle x^*,h \rangle \le \lim \limits _{k \rightarrow +\infty } \frac{ \langle \nu , \Phi (\bar{x} + k^{-1} h ) - \Phi (\bar{x}) \rangle }{k^{-1}} = \int _T \langle \nabla \phi (t, \bar{x}),h \rangle \nu ({\text {d}}t), \quad \forall h \in X, \end{aligned}$$
(46)

where in the last equality we use Lebesgue’s dominated convergence theorem, which can be applied thanks to (45). The proof of this claim ends by considering h and \(-h\) in (46).

Finally, observe that, by [1, Lemma 12.16], any measure \(\nu \in {\mathcal {B}}\) such that \(\int _T \phi _t(\bar{x}) \nu ({\text {d}}t) = 0\) satisfies \({{\,\mathrm{\text {supp}}\,}}\nu \subseteq T(\bar{x})\). Then, applying Theorem 4.5 we get the desired result. \(\square \)
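To see the conclusion of Theorem 5.1 at work, consider the following toy instance of (44); the data below are our construction, not taken from the paper. Let \(X=\mathbb {R}^2\), \(T=[0,2\pi ]\), \(\phi _t(x)=\cos (t)x_1+\sin (t)x_2-1\) (so the feasible set is the closed unit disk) and \(\varphi (x)=x_1+x_2\). The minimizer is \(\bar{x}=-(1/\sqrt{2},1/\sqrt{2})\), the active set \(T(\bar{x})\) is the singleton \(\{5\pi /4\}\), and the atomic measure \(\mu =\sqrt{2}\,\delta _{5\pi /4}\) satisfies the optimality system; the script checks the residuals.

```python
# Toy instance of (44) (our construction) verifying Theorem 5.1:
#   min x1 + x2  s.t.  cos(t)*x1 + sin(t)*x2 - 1 <= 0 for all t in [0, 2*pi].
import numpy as np

xbar = np.array([-1.0, -1.0]) / np.sqrt(2)     # minimizer over the unit disk
grad_varphi = np.array([1.0, 1.0])             # gradient of the objective

def phi(t, x):
    return np.cos(t) * x[0] + np.sin(t) * x[1] - 1.0

def grad_phi(t):                               # gradient in x (independent of x)
    return np.array([np.cos(t), np.sin(t)])

t0 = 5 * np.pi / 4                             # unique active index
assert abs(phi(t0, xbar)) < 1e-12

mu_weight = np.sqrt(2)                         # mu = sqrt(2) * Dirac at t0
stationarity = grad_varphi + mu_weight * grad_phi(t0)   # -int grad phi dmu = grad varphi
complementarity = mu_weight * phi(t0, xbar)             # int phi_t(xbar) dmu = 0

print("stationarity residual   :", np.linalg.norm(stationarity))   # ~ 0
print("complementarity residual:", complementarity)                # 0
```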

5.2 Stochastic Programming

Before introducing our optimization problem in this subsection, let us give some additional notations. In the sequel, X is a separable Banach space, Y a general locally convex space, and \((\Omega ,{\mathcal {A}}, \mu )\) a complete \(\sigma \)-finite measure space. A set-valued mapping \(M: \Omega \rightrightarrows X\) is said to be measurable if for every open set \(U \subseteq X\), we have \( \{ \omega \in \Omega : M(\omega ) \cap U \ne \emptyset \} \in {\mathcal {A}}\). A function \(\varphi : \Omega \times X \rightarrow \mathbb {R}\cup \{ +\infty \}\) is said to be a normal integrand provided the set-valued mapping \(\omega \mapsto {{\,\textrm{epi}\,}}\varphi _\omega :=\{ (x,\alpha ) \in X\times \mathbb {R}: \varphi _\omega (x):=\varphi (\omega , x)\le \alpha \} \) is measurable with closed values. In addition, \(\varphi \) is a convex normal integrand if \(\varphi _\omega \) is convex for all \(\omega \in \Omega \).

Given a set-valued mapping \(S: \Omega \rightrightarrows X^*\) we define the (Gelfand) integral of S by

$$\begin{aligned} \int _{\Omega } S(\omega ) \mu ({\text {d}}\omega ) := \left\{ \int _\Omega x^*(\omega ) \mu ({\text {d}}\omega ) : \begin{array}{c} x^*\text { is Gelfand integrable and } \\ x^*(\omega ) \in S(\omega ) \text { a.e. } \omega \in \Omega \end{array}\right\} . \end{aligned}$$

We refer to [8, 16, 17, 34] for more details about the theory of measurable multifunctions and integration on Banach spaces.

Given a normal integrand \(\varphi : \Omega \times X \rightarrow \mathbb {R}\cup \{ +\infty \}\), we define the integral functional (also called the expected functional) associated with \(\varphi \) as \( {\mathcal {I}}_{\varphi }: X\rightarrow \mathbb {R}\cup \{ +\infty ,-\infty \}\) given by

$$\begin{aligned} {\mathcal {I}}_{\varphi }(x):=\int \limits _{\Omega } \varphi _\omega (x) \mu ({\text {d}}\omega ) := \int \limits _{\Omega } \max \{ \varphi _\omega (x) , 0\} \mu ({\text {d}}\omega ) + \int \limits _{\Omega }\min \{ \varphi _\omega (x), 0\} \mu ({\text {d}}\omega ), \end{aligned}$$
(47)

with the inf-addition convention \(+\infty +(-\infty ) = +\infty \).

Finally, a normal integrand \(\varphi \) is integrably differentiable at \(\bar{x}\) provided that \({\mathcal {I}}_{\varphi }\) is differentiable at \(\bar{x}\) and the following integral formula holds

$$\begin{aligned} \nabla {\mathcal {I}}_{\varphi }( \bar{x} ) = \int \limits _{\Omega } \nabla \varphi _\omega (\bar{x}) \mu ({\text {d}}\omega ). \end{aligned}$$
(48)

The next example shows that the last notion is meaningful for integral functionals, since their smoothness cannot be taken for granted even when all the data functions are smooth.

Example 5.3

It is important to mention here that the integral functional \( {\mathcal {I}}_{\varphi } \), for a normal integrand \(\varphi \), could fail to be Fréchet differentiable even when the data functions \(\varphi _\omega \), \(\omega \in \Omega \), are Fréchet differentiable. Let us consider the measure space \((\mathbb {N},{\mathcal {P}}(\mathbb {N}), \mu )\), where the \(\sigma \)-finite measure is given by the counting measure \(\mu (A):=|A|\), and the Banach space \(X=\ell _1\). Next, consider the convex normal integrand \(\varphi (n, x):= | \langle x, e_n\rangle |^{1 +\frac{1}{n}} \), where \(\{ e_1, e_2, \ldots , e_n,\ldots \} \) is the canonical basis of \(\ell _1\). It has been shown in [11, Example 2] that \({\mathcal {I}}_{\varphi }\) is Gateaux differentiable at any point and the integral formula (48) holds for the Gateaux derivative. Nevertheless, as was also proved in that paper, the function \({\mathcal {I}}_{\varphi }\) fails to be Fréchet differentiable at zero.
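The mechanism behind this failure can be displayed numerically; the following sketch is ours. Since \({\mathcal {I}}_{\varphi }(r e_n)=r^{1+1/n}\) and \(\Vert r e_n\Vert _1=r\), the quotient \({\mathcal {I}}_{\varphi }(r e_n)/r=r^{1/n}\) tends to 0 as \(r\rightarrow 0\) for each fixed n (consistent with the zero Gateaux derivative at 0), but for every fixed \(r\in (0,1)\) we have \(\sup _n r^{1/n}\) close to 1, so the remainder is not \(o(r)\) uniformly on spheres.

```python
# Illustration (ours) of Example 5.3: I(r e_n)/r = r^(1/n) vanishes for each
# fixed direction e_n (Gateaux), but not uniformly in n (no Frechet).
import numpy as np

for r in [1e-1, 1e-3, 1e-6]:
    along_fixed_n = [r ** (1.0 / n) for n in (1, 2, 5)]
    large_n = r ** (1.0 / 10**6)           # quotient along e_n for n = 10^6
    print(f"r = {r:.0e}:  quotients for n = 1, 2, 5: {np.round(along_fixed_n, 4)};"
          f"  n = 10^6: {large_n:.6f}")
```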

Now, we extend a classical formula for the subdifferential of convex normal integrand functions to the case of nonconvex normal integrands. This result is interesting in itself, and for that reason, we present it as an independent proposition.

Proposition 5.4

Let \(\bar{x}\in X\), and \(\varphi : \Omega \times X \rightarrow \mathbb {R}\cup \{ +\infty \}\) be a normal integrand. Suppose that \(\varphi _\omega \in \Gamma _{h_\omega }(X)\) for some convex normal integrand h such that \({\text {dom}} {\mathcal {I}}_{\varphi } \subseteq {\text {dom}} {\mathcal {I}}_{h}\). Then, \({\mathcal {I}}_{\varphi } \in \Gamma _{{\mathcal {I}}_{h} }(X)\) provided that \({\mathcal {I}}_{\varphi }\) is proper. In addition, suppose that h is integrably differentiable at \(\bar{x} \) and the functions \( {\mathcal {I}}_{\varphi }\) and \(\varphi _\omega \), \(\omega \in \Omega \), are continuous at some common point. Then,

$$\begin{aligned} \hat{ \partial } {\mathcal {I}}_{\varphi }(\bar{x}) = \int _{\Omega } \hat{\partial } \varphi _\omega (\bar{x}) {\textrm{d}}\mu + N_{ {\text {dom}} {\mathcal {I}}_{\varphi } }(\bar{x}). \end{aligned}$$
(49)

Proof

Let us consider the convex normal integrand \(\psi := \varphi +h\). By our assumptions \({\text {dom}} {\mathcal {I}}_{\psi } = {\text {dom}} {\mathcal {I}}_{\varphi }\) and \({\mathcal {I}}_{\psi }= {\mathcal {I}}_{\varphi } + {\mathcal {I}}_{h}\). Consequently, \({\mathcal {I}}_{\psi }\) is proper, entailing that \({\mathcal {I}}_{\varphi } \in \Gamma _{{\mathcal {I}}_{h} }(X)\). Then, by [11, Theorem 2] we have that \( \partial {\mathcal {I}}_{\psi }(\bar{x})= \int _{\Omega } {\partial } \psi _\omega (\bar{x}) {\text {d}}\mu + N_{ {\text {dom}} {\mathcal {I}}_{\psi } }(\bar{x}).\) Now, by Lemmas 2.8 and 2.9, and the integral formula (48) we have that

$$\begin{aligned} \partial {\mathcal {I}}_{\psi }(\bar{x})&= \hat{ \partial } \left( {\mathcal {I}}_{\varphi } + {\mathcal {I}}_{h} \right) (\bar{x}) = \hat{ \partial } {\mathcal {I}}_{\varphi }(\bar{x}) + \nabla {\mathcal {I}}_{h}( \bar{x} ), \\ \int _{\Omega } {\partial } \psi _\omega (\bar{x}) {\text {d}}\mu&=\int _{\Omega } \hat{\partial } \varphi _\omega (\bar{x}) {\text {d}}\mu + \nabla {\mathcal {I}}_{h}( \bar{x} ), \end{aligned}$$

which implies that (49) holds. \(\square \)

Remark 5.5

(On the use of Gelfand integrals) The above result does not require \(\varphi \) to be locally Lipschitzian at \(\bar{x}\), as in other classical results about differentiation of nonconvex integral functionals (see e.g. [12, 29] and the references therein). Consequently, we cannot expect formula (49) to hold for the Bochner integral (see e.g. [1, Definition 11.42]). Indeed, adapting [11, Example 1], let us consider the measure space \((\mathbb {N},{\mathcal {P}}(\mathbb {N}), \mu )\), where \(\mu (A):=\sum _{j\in A}2^{-j}\), the Hilbert space \(X=\ell _2\), and the normal integrand \(\varphi (n, x):= 2^{n} \left( \langle x, e_n\rangle \right) ^2 - \Vert x\Vert ^2 \), where \(\{ e_1, e_2, \ldots , e_n, \ldots \}\) is the canonical basis of \(\ell _2\). Clearly, the integrand \(\varphi \) satisfies all the assumptions of Proposition 5.4 at any point, therefore (49) holds. Nevertheless, the function \(n \mapsto \nabla \varphi _n ( x) = 2^{n+1}\left( \langle x, e_n\rangle \right) e_n - x \) is not always integrable in the Bochner sense because, otherwise, the function \(n \mapsto 2^{n+1}\left( \langle x, e_n\rangle \right) e_n\) would be integrable (see e.g. [1, Theorem 11.44]). Indeed, we always have

$$\begin{aligned} \int _{\mathbb {N}} \Vert 2^{n+1}\left( \langle x, e_n\rangle \right) e_n \Vert {\text {d}}\mu = 2 \sum _{n\in \mathbb {N}} |x_n|. \end{aligned}$$

Nonetheless, the right-hand side is equal to \(+\infty \) for \(x=(\frac{1}{n})_{n \in \mathbb {N}} \in \ell _2\backslash \ell _1\).
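The divergence in the last display can be made explicit with partial sums; the following check is ours. With \(x_n=1/n\), the right-hand side \(2\sum _n 1/n\) is a multiple of the harmonic series, although \(\Vert x\Vert _2\) stays finite.

```python
# Partial sums (our check) for the last display: 2 * sum 1/n diverges while
# the l2-norm of x = (1/n)_n converges (to pi/sqrt(6)).
import numpy as np

for N in [10, 10**3, 10**6]:
    n = np.arange(1, N + 1, dtype=float)
    print(f"N = {N:>7}:  2*sum(1/n) ~ {2 * np.sum(1 / n):9.3f},"
          f"  ||x||_2 ~ {np.sqrt(np.sum(1 / n**2)):.4f}")
```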

Given the normal integrand \(\varphi :\Omega \times X\rightarrow \mathbb {R}\cup \{ +\infty \}\), the mapping \(\Phi : X \rightarrow Y\) and the nonempty convex closed set \({\mathcal {C}}\subseteq Y\), we consider the problem

$$\begin{aligned} \begin{array}{l} \min {\mathcal {I}}_{\varphi }(x) \;\\ \text {s.t. }\; \Phi ( x) \in {\mathcal {C}}. \end{array} \end{aligned}$$
(50)

Theorem 5.6

Given problem (50), let us assume that \({\mathcal {C}}^\circ \) is \(\hbox {weak}^*\)-compact, let \(\varphi _\omega \in \Gamma _{h_\omega }(X)\), \(\omega \in \Omega \), for some convex normal integrand h such that \({\text {dom}} {h_\omega } ={\text {dom}} {\mathcal {I}}_{h}=X\), \(\omega \in \Omega \), and let \( \Phi \in \Gamma _{g}(X,Y,{\mathcal {C}}^\circ )\) for some function \(g\in \Gamma _0(X)\) such that \({\text {dom}} \Phi =X\). Let \(\bar{x}\in {{\,\textrm{int}\,}}\left( {\text {dom}} {\mathcal {I}}_{\varphi }\right) \) be a local optimal solution of problem (50) and assume that h is integrably differentiable at \(\bar{x}\) and g is differentiable at \(\bar{x}\). Then, we have

$$\begin{aligned} 0_{X^*} \in \int _{\Omega } \hat{\partial } \varphi _\omega (\bar{x}) {\textrm{d}}\mu + {\text {cone}}\left( \bigcup \left[ \hat{D}^*\Phi (\bar{x})( \lambda ^*) : \begin{array}{c} \lambda ^*\in {\mathcal {C}}^\circ \text { such that}\\ \langle \lambda ^*, \Phi (\bar{x} ) \rangle = 1 \end{array} \right] \right) , \end{aligned}$$
(51)

provided that the following qualification condition holds

$$\begin{aligned} 0_{X^*} \notin \bigcup \left[ \hat{D}^*\Phi (\bar{x})( \lambda ^*) : \begin{array}{c} \lambda ^*\in {\mathcal {C}}^\circ \text { such that}\\ \langle \lambda ^*, \Phi (\bar{x} ) \rangle = 1 \end{array} \right] . \end{aligned}$$

Proof

Let us define \(h_2:= {\mathcal {I}}_{h} +g\), and notice that \({\text {dom}} h_2= X\), \({\mathcal {I}}_{\varphi }\in \Gamma _{h_2} (X)\) (see Proposition 5.4), and \(\Phi \in \Gamma _{h_2}(X,Y,{\mathcal {C}}^\circ )\). Moreover, since X is a Banach space and \({\text {dom}} \Phi =X\), condition b) of Theorem 3.1 holds. In addition, \(h_2\) is continuous (\({\text {dom}} h_2 =X\)) and differentiable at \(\bar{x}\). Then, by Theorem 3.7 and Proposition 5.4 we have that (51) is satisfied. \(\square \)

5.3 Semidefinite Programming

In this subsection, we consider the optimization problem

$$\begin{aligned} \begin{array}{l} \min \varphi (x) \\ \text {s.t. } \Phi (x) \preceq 0,\; \\ \hspace{0.6cm}x \in {\mathcal {Q}}, \end{array} \end{aligned}$$
(52)

where \(\varphi : X\rightarrow \overline{\mathbb {R}}\), \(\Phi : X\rightarrow \mathbb {S}^p\cup \{ \infty _{ \mathbb {S}^p} \}\), with \( \mathbb {S}^p\) being the set of \(p\times p\) symmetric (real) matrices, and \({\mathcal {Q}}\) is a nonempty closed and convex subset of X. Here, \(A \preceq 0 \) (respectively, \(A \succeq 0\)) means that the matrix A is negative (respectively, positive) semidefinite. We recall that \( \mathbb {S}^p\) is a Hilbert space with the inner product \(\langle A, B \rangle :={{\,\textrm{tr}\,}}(AB)\), where \({{\,\textrm{tr}\,}}\) represents the trace operator (see e.g. [4]). Recall also that, for any symmetric matrix A, \({{\,\textrm{tr}\,}}(A)= \sum _{i=1}^p \lambda _i(A)\), where \(\lambda _1(A)\ge \ldots \ge \lambda _p(A)\) are the eigenvalues of A (see e.g. [37, Theorem 2.6.6]).
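As a quick numerical sanity check of the Hilbert structure just recalled, the following snippet (our illustration) verifies the symmetry of \(\langle A,B\rangle ={{\,\textrm{tr}\,}}(AB)\) on \( \mathbb {S}^p\) and the eigenvalue formula for the trace.

```python
# Illustration (ours): <A, B> = tr(AB) is symmetric on S^p, and tr(A) equals
# the sum of the eigenvalues of A.
import numpy as np

rng = np.random.default_rng(1)
p = 4
A = rng.normal(size=(p, p)); A = (A + A.T) / 2   # random symmetric matrices
B = rng.normal(size=(p, p)); B = (B + B.T) / 2

print("tr(AB) == tr(BA):", np.isclose(np.trace(A @ B), np.trace(B @ A)))
print("tr(A) - sum of eigenvalues:", np.trace(A) - np.linalg.eigvalsh(A).sum())
```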

Classical studies of problem (52) suggested imposing some degree of convexity on the mapping \(\Phi \), more precisely, the so-called matrix convexity (see e.g. [4, Section 5.3.2]). This notion is equivalent to assuming that, for every \(v\in \mathbb {R}^p\) with \(\Vert v\Vert =1\), the function \(x \mapsto v^\top \Phi (x) v\) is convex, where \(v^\top \) is the transpose of v (see e.g. [4, Proposition 5.72]). In the spirit of DC optimization, a natural extension of this concept is given by the following notion. We say that \(\Phi : X\rightarrow \mathbb {S}^p\cup \{ \infty _{ \mathbb {S}^p}\}\) is a DC matrix-mapping, with control function \(h \in \Gamma _0(X)\), if \({\text {dom}} \Phi \subseteq {\text {dom}} h\) and, for all \(v\in \mathbb {R}^p\) with \( \sum _{i=1}^p v_i^2 =1\), the mapping \(x \mapsto v^\top \Phi (x) v + h(x)\) belongs to \(\Gamma _{0}(X)\), where

$$\begin{aligned} v^\top \Phi (x) v :=\left\{ \begin{array}{ll} v^\top \Phi (x) v, &{} \text { if } x\in {\text {dom}} \Phi ,\\ +\infty , &{} \text { if } x\notin {\text {dom}} \Phi . \end{array} \right. \end{aligned}$$

Moreover, it is well known that problem (52) can be reformulated similarly to problem (41), involving the cone \( \mathbb {S}^p_+\) of positive semidefinite matrices. Due to this observation, and the topological structure of \( \mathbb {S}^p\), another natural assumption in our framework is that \(\Phi \in \Gamma _{h}(X, \mathbb {S}^p, {\mathcal {B}})\), for some convex control function h and \({\mathcal {B}}\) a \(w^*\)-compact generator of \( \mathbb {S}^p_+\) (see Definition 4.1). The following result formally establishes that both notions, DC matrix-mapping and \( {\mathcal {B}}\)-DC mapping, coincide.

Proposition 5.7

Consider a locally convex space X, \(\Phi : X\rightarrow \mathbb {S}^p\cup \{ \infty _{ \mathbb {S}^p}\}\), and \(h\in \Gamma _{0}(X)\) with \({\text {dom}} \Phi \subseteq {\text {dom}} h\). Then, the following are equivalent:

a) \(\Phi \in \Gamma _h(X, \mathbb {S}^p, {\mathcal {B}})\), where \({\mathcal {B}}:=\left\{ A \in \mathbb {S}^p_+: {{\,\textrm{tr}\,}}(A) = 1 \right\} \);

b) \(\Phi \) is a DC matrix-mapping with control h.

Moreover, in such a case, we have that, for all \(x\in X\), the following equality holds

$$\begin{aligned} \max \{ \langle A, \Phi \rangle (x) + h(x) : A\in {\mathcal {B}} \} = \max \left\{ v^\top \Phi (x) v + h(x) : \sum _{i=1}^p v_i^2 =1 \right\} . \end{aligned}$$
(53)

Proof

Let us suppose that a) holds, and fix \({u}\in \mathbb {R}^p\) with \(\Vert u\Vert =1\), where \(\Vert \cdot \Vert \) is the Euclidean norm. Then, consider the symmetric matrix \(A=u u^\top =(u_i u_j)_{ij}\). We have that \(v^\top A v = ( \langle u,v \rangle )^2\), for all \(v\in \mathbb {R}^p\), which shows that A is positive semidefinite with \({{\,\textrm{tr}\,}}(A)= \Vert u\Vert ^2=1\); hence, \(A \in {\mathcal {B}}\). Finally, \( \langle A, \Phi \rangle (x)=\langle A, \Phi (x) \rangle = u^\top \Phi (x) u\), for every \(x\in {\text {dom}} \Phi \), which shows that the function \(x\mapsto u^\top \Phi (x)u +h(x)\) is convex, proper and lsc because \(\Phi \in \Gamma _h(X, \mathbb {S}^p, {\mathcal {B}})\).

Now, suppose that b) holds, and consider \(A \in \mathbb {S}^p_+ \) with \({{\,\textrm{tr}\,}}(A)=1\). Employing its spectral decomposition, we write \(A= PDP^\top = \sum _{i=1}^p \lambda _i(A) v_i v_i^\top \), where P is an orthogonal matrix whose columns are \(v_i \in \mathbb {R}^p\), \(i=1,\ldots , p\), and D is the diagonal matrix formed by \( \lambda _1(A),\ldots , \lambda _p(A)\). Then,

$$\begin{aligned} \langle A, \Phi (x) \rangle = \sum \limits _{i=1}^p \lambda _i(A) \langle v_i v_i^\top , \Phi (x) \rangle = \sum \limits _{i=1}^p \lambda _i(A) v_i^\top \Phi (x)v_i, \text { for all } x\in {\text {dom}} \Phi . \end{aligned}$$

Next, using the fact that \({{\,\textrm{tr}\,}}(A)=1\) and \(\lambda _i(A) \ge 0\), we get that

$$\begin{aligned} \langle A, \Phi \rangle (x) + h(x) = \sum _{ i=1 }^p \lambda _i (A) \left( v_i^\top \Phi (x)v_i+ h(x) \right) , \end{aligned}$$
(54)

which shows the desired convexity as well as the lower semicontinuity of the function \(x \mapsto \langle A, \Phi \rangle (x) + h(x)\).

Finally, it remains to prove (53). On the one hand, for any \(v \in \mathbb {R}^p\) with \(\sum _{i=1}^p v_i^2 =1\), the matrix \(A=v v^\top \) belongs to \({\mathcal {B}}\), so

$$\begin{aligned} v^\top \Phi (x) v + h(x) \le \max \{ \langle A, \Phi \rangle (x) + h(x): A\in {\mathcal {B}} \}, \end{aligned}$$

which proves the inequality \(\ge \) in (53). On the other hand, for a given matrix \(A\in {\mathcal {B}}\), and taking into account (54), we have that

$$\begin{aligned} \langle A, \Phi (x) \rangle + h(x)&\le \sum _{ i=1 }^p \lambda _i (A) \max \left\{ v^\top \Phi (x) v + h(x) : \sum _{i=1}^p v_i^2 =1\right\} \\&= \max \left\{ v^\top \Phi (x) v + h(x) : \sum _{i=1}^p v_i^2 =1\right\} \end{aligned}$$

and we are done. \(\square \)
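Equality (53) can be checked numerically in small dimensions. The sketch below is our illustration, with \(h=0\) and a randomly generated value \(M=\Phi (x)\in \mathbb {S}^p\): both sides of (53) then equal the largest eigenvalue \(\lambda _1(M)\), the maximum is attained at \(A=uu^\top \) for a top eigenvector u, and no element of \({\mathcal {B}}\) or unit vector exceeds it.

```python
# Illustration (ours) of (53) with h = 0: both sides equal lambda_max(M).
import numpy as np

rng = np.random.default_rng(2)
p = 5
M = rng.normal(size=(p, p)); M = (M + M.T) / 2      # a value Phi(x) in S^p

w, V = np.linalg.eigh(M)                            # ascending eigenvalues
lam_max, u = w[-1], V[:, -1]
A_star = np.outer(u, u)                             # in B: psd with trace 1
print("attained at A = u u^T:", np.isclose(np.trace(A_star @ M), lam_max),
      "and at v = u:", np.isclose(u @ M @ u, lam_max))

violations = 0
for _ in range(10000):
    G = rng.normal(size=(p, p)); A = G @ G.T; A /= np.trace(A)  # random A in B
    v = rng.normal(size=p); v /= np.linalg.norm(v)              # random unit v
    violations += int(np.trace(A @ M) > lam_max + 1e-10)
    violations += int(v @ M @ v > lam_max + 1e-10)
print("upper-bound violations:", violations)                    # 0
```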

The following proposition establishes some sufficient conditions ensuring that \(\Phi \) is a DC matrix-mapping.

Proposition 5.8

Let \(\Phi : X\rightarrow \mathbb {S}^p\) be a mapping with \(\Phi (x)=(\phi _{ij}(x))\), where \(\phi _{ij} \in \Gamma _{h_{ij}}(X,\mathbb {R},[-1,1])\) for some control functions \(h_{ij}\), \(i,j=1,\ldots ,p\). Then, \(\Phi \) is a DC matrix-mapping with control \(h:=\sum _{ij} h_{ij}\).

Proof

Let us notice that, for every \(v\in \mathbb {R}^p \) with \(\Vert v\Vert =1\), we have that

$$\begin{aligned} v^\top \Phi (x) v + h(x) = \sum _{i,j=1}^p \left( v_iv_j \phi _{ij}(x) + h_{ij} (x) \right) , \text { for all }x\in X. \end{aligned}$$

Therefore, since \(|v_iv_j|\le 1\), each summand is convex and lower semicontinuous, and hence so is the function \(x\mapsto v^\top \Phi (x) v + h(x)\), which yields that \(\Phi \) is a DC matrix-mapping with control h. \(\square \)
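A concrete instance of Proposition 5.8 (our construction, with data chosen for illustration): on \(X=\mathbb {R}\) take \(\phi _{ij}(x)=\sin ((i+j)x)\) with controls \(h_{ij}(x)=(i+j)^2x^2/2\), so that \(t\phi _{ij}+h_{ij}\) is convex for every \(t\in [-1,1]\). The script samples second difference quotients of \(g_v(x)=v^\top \Phi (x)v+h(x)\) and checks their nonnegativity.

```python
# Illustration (ours) of Proposition 5.8: entrywise controls h_ij produce a
# control h = sum_ij h_ij for the matrix mapping Phi(x) = (sin((i+j)x))_ij.
import numpy as np

p = 3
rng = np.random.default_rng(3)
I, J = np.indices((p, p)) + 1                 # index matrices with i, j = 1..p

def Phi(x):
    return np.sin((I + J) * x)

def h(x):
    return np.sum((I + J) ** 2) * x**2 / 2.0  # sum of the controls h_ij

xs = np.linspace(-2.0, 2.0, 401)
d = xs[1] - xs[0]
worst = np.inf
for _ in range(20):
    v = rng.normal(size=p); v /= np.linalg.norm(v)
    g = np.array([v @ Phi(x) @ v + h(x) for x in xs])
    # central second difference quotients of g_v; convexity makes them >= 0
    worst = min(worst, ((g[2:] - 2 * g[1:-1] + g[:-2]) / d**2).min())
print("smallest sampled second difference quotient:", worst)   # >= 0
```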

Although the notion of DC matrix-mapping has a simpler description in terms of the quadratic forms \( v^\top \Phi (x) v \), the set \(\Gamma _h(X,Y,{\mathcal {B}})\) in Definition 2.3 provides enhanced properties depending on the choice of the scalarization set \({\mathcal {B}}\) and, consequently, better calculus properties for operations with the matrix mapping \(\Phi \).

Given a mapping \(\Phi : X\rightarrow \mathbb {S}^p\cup \{ \infty _{ \mathbb {S}^p}\}\), we define its k-th eigenvalue function \(\lambda _k^\Phi : X\rightarrow \overline{\mathbb {R}}\) by

$$\begin{aligned} \lambda _k^\Phi (x)&:=\left\{ \begin{array}{ll} \lambda _k(\Phi (x)), &{}\quad \text { if } x\in {\text {dom}} \Phi ,\\ +\infty , &{}\quad \text { if } x\notin {\text {dom}} \Phi . \end{array} \right. \end{aligned}$$

Moreover, we define the sum of the first k eigenvalue functions by \( \Lambda _k^\Phi : X\rightarrow \overline{\mathbb {R}}\) given by \(\Lambda _k^\Phi (x):=\sum _{j=1}^{k} \lambda _j^\Phi (x)\).

The following proposition gives sufficient conditions to ensure that the above functions are DC.

Proposition 5.9

Let \(\Phi : X\rightarrow \mathbb {S}^p\cup \{ \infty _{ \mathbb {S}^p}\}\) and \(h \in \Gamma _0(X)\).

a) If \(\Phi \) is a DC matrix-mapping with control h, then its largest eigenvalue function, \(\lambda _1^\Phi \), belongs to \(\Gamma _h(X)\).

b) If \(\Phi \in \Gamma _h(X, \mathbb {S}^p,{\mathcal {P}}_k)\), where \({\mathcal {P}}_k:=\{ P \in \mathbb {S}^p: P \text { has rank } k \text { and } P^2=P \}\), then \(\Lambda _k^\Phi \in \Gamma _{h}(X)\).

c) If X is a Banach space, \(\Phi \in \Gamma _{h}(X, \mathbb {S}^p,\mathbb {B}_{ \mathbb {S}^p})\) with \({\text {dom}} \Phi =X\), where \(\mathbb {B}_{ \mathbb {S}^p}\) is the unit ball of \( \mathbb {S}^p\), then all the eigenvalue functions \(\lambda _k^\Phi \) are differences of convex functions.

Proof

Let us recall that \( \lambda _1^\Phi (x) = \sup \{ v^\top \Phi (x) v: \Vert v\Vert =1 \} \), so \(\lambda _1^\Phi + h\) is the supremum of the convex lsc functions \(x\mapsto v^\top \Phi (x) v + h(x)\), which shows statement a). Second, let us notice that

$$\begin{aligned}\Lambda _k^\Phi (x) +h(x)= \max \{ \langle P, \Phi (x) \rangle +h(x): P \in {\mathcal {P}}_k \}\end{aligned}$$

(see e.g. [34, Exercise 2.54]), so \(\Lambda _k^\Phi \in \Gamma _{h}(X)\), which shows b). Finally, to prove c), we have that \(\lambda _k^\Phi (x)= \Lambda _k^\Phi (x) - \Lambda _{k-1}^\Phi (x)\) for all \(k \ge 2\), so \(\lambda _k^\Phi \) is a difference of convex functions. \(\square \)
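The representation of \(\Lambda _k^\Phi \) used in the proof of b) is Ky Fan's maximum principle. The following sketch (our illustration, with \(h=0\) and a random symmetric matrix M in place of \(\Phi (x)\)) verifies both the attainment at the projection onto a top-k eigenspace and the upper bound over random rank-k projections.

```python
# Illustration (ours): Lambda_k(M) = max{ <P, M> : P orthogonal projection of
# rank k } is attained at the projection onto a top-k eigenspace (Ky Fan).
import numpy as np

rng = np.random.default_rng(4)
p, k = 6, 2
M = rng.normal(size=(p, p)); M = (M + M.T) / 2

w, V = np.linalg.eigh(M)                            # ascending eigenvalues
Lambda_k = w[-k:].sum()                             # sum of the k largest
P_top = V[:, -k:] @ V[:, -k:].T                     # projection onto top-k space
print("attained:", np.isclose(np.trace(P_top @ M), Lambda_k))

exceed = 0
for _ in range(10000):
    Q, _ = np.linalg.qr(rng.normal(size=(p, k)))    # random orthonormal k-frame
    P = Q @ Q.T                                     # rank-k projection, P^2 = P
    exceed += int(np.trace(P @ M) > Lambda_k + 1e-10)
print("projections exceeding Lambda_k:", exceed)    # 0
```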

Finally, let us go back to problem (52).

Theorem 5.10

Let \(\varphi \in \Gamma _{h}(X)\) and \(\Phi \) be a DC matrix-mapping with control h such that \(x \mapsto \varphi (x) + h(x)\) is continuous at some point of \({\text {dom}} \Phi \). Let \(\bar{x}\) be a local optimal solution of the optimization problem (52) and suppose that h is differentiable at \(\bar{x}\). Then, there exists \(A \in \mathbb {S}^p_+\) with \({{\,\textrm{tr}\,}}(A) =1\) such that

$$\begin{aligned} 0_{X^*} \in \hat{\partial } \varphi (\bar{x}) + \hat{ D}^*\Phi (\bar{x}) (A) + N_{\mathcal {Q}}(\bar{x}), \text { and } v^\top \Phi (\bar{x}) v=0 \text { for each eigenvector }v \text { of }A \text { with positive eigenvalue}, \end{aligned}$$
(55)

provided that the following qualification holds

$$\begin{aligned} 0_{X^*} \notin \hat{D}^*\Phi (\bar{x})( A) + N_{\mathcal {Q}}(\bar{x}), \text { for all } A \in \mathbb {S}^p_+ \text { with } {{\,\textrm{tr}\,}}(A) =1. \end{aligned}$$
(56)

Proof

First, let us notice that by Proposition 5.7 the mapping \(\Phi \) belongs to \(\Gamma _{h}(X, \mathbb {S}^p,{\mathcal {B}})\), where \({\mathcal {B}}:=\{ A \in \mathbb {S}^p_+: {{\,\textrm{tr}\,}}(A) = 1 \}\). Then, applying Corollary 4.6 we get the result. \(\square \)

Consider normed spaces X, Y and recall that a mapping \(F:X \rightarrow Y\) is called \({\mathcal {C}}^{1,+}\) at \(\bar{x}\) if there exists a neighbourhood U of \(\bar{x}\) such that F is Fréchet differentiable on U and its derivative is Lipschitz continuous on U.

Corollary 5.11

Let \(\bar{x}\) be a local optimal solution of problem (52). Suppose that X is a Hilbert space and that \(\varphi \) and \(\Phi \) are \({\mathcal {C}}^{1,+}\) at \(\bar{x}\). Then, there exist \(v_i \in \mathbb {R}^p\) with \(\Vert v_i\Vert =1\), \(i=1, \ldots ,p\), and \(( \lambda _i)_{i=1}^p \in \Delta _{p}\) such that \(v_i^\top \Phi (\bar{x}) v_i=0\) and

$$\begin{aligned} 0_{X} \in \nabla \varphi (\bar{x}) + \sum _{i=1}^p \lambda _i v_i^\top \nabla \Phi (\bar{x}) v_i+ N_{\mathcal {Q}}(\bar{x}), \end{aligned}$$

provided that the following qualification holds

$$\begin{aligned} 0_{X} \notin \sum _{i=1}^p \lambda _i v_i^\top \nabla \Phi (\bar{x}) v_i + N_{\mathcal {Q}}(\bar{x}), \text { for all } ( \lambda _i)_{i=1}^p \in \Delta _{p} \text { and } v_i \in \mathbb {R}^p \text { with }\Vert v_i\Vert =1, \end{aligned}$$
(57)

where \( v_i^\top \nabla \Phi (\bar{x}) v_i\) is the gradient of \(x \mapsto v^\top _i \Phi ( x) v_i\) at \(\bar{x}\).

Proof

Let us consider a closed and convex neighbourhood U of \( \bar{x}\) and \(\rho >0\) such that the functions \(\varphi (x) + \rho \Vert x\Vert ^2\) and \(\langle A, \Phi (x)\rangle +\rho \Vert x\Vert ^2 \) are convex over U for all \( A \in \mathbb {S}^p_+ \) with \({{\,\textrm{tr}\,}}(A) = 1\) (see e.g. [38, Proposition 1.11]). Hence, for \(h(x):=\rho \Vert x\Vert ^2\), \(\varphi _U:=\varphi +\delta _{U}\in \Gamma _{h}(X)\). Furthermore, by Proposition 5.7 the mapping

$$\begin{aligned} \Phi _U(x) :=\left\{ \begin{array}{cc} \Phi (x),&{} \text { if } x\in U,\\ \infty _{ \mathbb {S}^p},&{} \text { if } x\notin U, \end{array} \right. \end{aligned}$$

is a DC matrix-mapping with control h. Then, it is easy to see that \(\bar{x}\) is also a local solution of \(\min \{ \varphi _U (x): \Phi _U(x) \preceq 0, \; x\in {\mathcal {Q}}\}\). Let us notice that for every matrix \(A \in \mathbb {S}^p_+\), and its spectral decomposition \(A= \sum _{i=1}^p \lambda _i u_i u_i^\top \), we get

$$\begin{aligned} \hat{D}^*\Phi (\bar{x})(A)= \sum _{i=1}^p \lambda _i u_i^\top \nabla \Phi (\bar{x}) u_i. \end{aligned}$$

Hence, condition (57) implies (56). Therefore, Theorem 5.10 implies the existence of \(A \in \mathbb {S}^p_+\) with \({{\,\textrm{tr}\,}}(A) =1\) such that (55) holds. Now, consider \(\lambda _i:=\lambda _i(A)\) and associated eigenvectors \(v_i\), \(i=1,\ldots , p\). Then, \((\lambda _i)\in \Delta _p\) and \( v_i^\top \Phi (\bar{x})v_i=0\), which ends the proof. \(\square \)
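We close the subsection with a toy instance of (52) (our construction, not from the paper) illustrating the system of Theorem 5.10: take \(X=\mathbb {R}\), \({\mathcal {Q}}=\mathbb {R}\), \(\varphi (x)=x\) and \(\Phi (x)={{\,\textrm{diag}\,}}(x-1,-x)\), i.e. the constraint \(0\le x\le 1\), whose minimizer is \(\bar{x}=0\).

```python
# Toy instance of (52) (our construction): min x s.t. diag(x-1, -x) <= 0,
# i.e. 0 <= x <= 1, minimized at xbar = 0. Multiplier: A = e2 e2^T, tr(A) = 1.
import numpy as np

xbar = 0.0
Phi_bar = np.diag([xbar - 1.0, -xbar])   # = diag(-1, 0), negative semidefinite
dPhi = np.diag([1.0, -1.0])              # d/dx Phi(x), constant in x
grad_varphi = 1.0

A = np.outer([0.0, 1.0], [0.0, 1.0])     # psd, trace 1
v = np.array([0.0, 1.0])                 # eigenvector of A with eigenvalue 1

# for smooth Phi the coderivative reduces to <A, dPhi>
stationarity = grad_varphi + np.trace(A @ dPhi)
print("stationarity residual        :", stationarity)              # 0.0
print("complementarity v^T Phi(x) v :", v @ Phi_bar @ v)            # 0.0
print("feasibility  Phi(xbar) <= 0  :",
      bool(np.all(np.linalg.eigvalsh(Phi_bar) <= 0)))               # True
```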

6 Conclusions

The paper deals with optimization problems involving the so-called class of B-DC mappings (see Definition 2.3), which slightly extends the concept of delta-convex functions. The most general model studied in the paper is an optimization problem with an abstract constraint given by a closed convex set \({\mathcal {C}}\). The proposed methodology consists in transforming the original problem into an unconstrained optimization problem (by means of the notion of improvement function), and in using this reformulation to derive necessary and sufficient conditions for global and local optimality. The case in which the abstract constraint set is a closed convex cone \(-{\mathcal {K}}\) is discussed in detail: global optimality conditions are stated in Theorem 4.3, while Theorems 4.5 and 4.7 deal with local optimality. Our developments are applied in the last section to establish ad hoc optimality conditions for fundamental problems in applied mathematics such as infinite, stochastic and semidefinite programming problems. Next, we summarize the main conclusions of the paper:

1) Nonsmooth tools like the (regular) subdifferential and the (regular) coderivative proved to be appropriate technical instruments in our approach, even outside the scope of Asplund spaces.

2) New qualification conditions, which are an alternative to the Slater condition, are introduced in the paper. These conditions require a certain degree of continuity of the objective/constraint functions and \(w^{*}\)-compactness of the set \({\mathcal {C}}^\circ \).

3) Some properties of the B-DC mappings are supplied by Proposition 2.5.

4) Theorem 2.14 is a key result in our analysis. It is based on Proposition 2.13, a useful characterization of the \(\varepsilon \)-subdifferential of the supremum of convex functions.

5) The particular structure of the cone-constrained problem allows us to build more suitable supremum functions. This is the case when the polar cone \({\mathcal {K}}^{+}\) is \(w^{*}\)-compactly generated; a representative example of that situation is the infinite programming model of Subsection 5.1, where \( {\mathcal {K}}^{+}\) is the set of all (finite) regular Borel measures.

6) In Proposition 5.4, a classical formula for the subdifferential of convex normal integrand functions is extended to the case of nonconvex normal integrands.

7) In the last subsection, devoted to semidefinite programming, the notion of DC matrix-mapping is introduced. This concept leads to the main associated optimality result, which is Theorem 5.10.