1 Introduction

It is well known that differentiable nonlinear constrained programs can be approached by means of the Karush-Kuhn-Tucker (KKT) conditions, which provide necessary and/or sufficient optimality conditions. The KKT conditions are paired with complementarity conditions, which are needed to manage the binding constraints. Some constraint qualification condition is needed too.

In the particular case of linearly constrained programs, the Abadie constraint qualification is automatically verified, and a suitable generalized convexity property of the objective function ensures that the KKT conditions are both necessary and sufficient. As a consequence, particular Max-Min problems can be approached by replacing the inner minimization subproblem with its KKT conditions, thus obtaining a flat maximization problem. Unfortunately, the complementarity conditions make such a flat problem NP-hard. For this reason, various approaches have been proposed in the literature to manage them (dummy binary variables, big-M, outer approximations, branch-and-bound, and so on).

The aim of this paper is to study a new transversality condition which can replace the complementarity conditions while still guaranteeing optimality conditions based on the KKT ones. Such a transversality condition can also be used in duality theory and makes it possible to solve a particular class of Max-Min problems very efficiently.

In Sect. 2 the studied class of differentiable linearly constrained problems is introduced and some well-known results are recalled. In Sect. 3 the new transversality conditions are used to state necessary and sufficient optimality conditions, while in Sect. 4 they provide two pairs of dual problems together with the related weak and strong duality results. Finally, in Sect. 5 it is shown how the transversality condition characterizes the optimality of convex quadratic problems and makes it possible to efficiently solve a particular class of Max-Min problems. A final applied example is provided too.

2 Definitions and preliminary results

Let us first introduce the main problem studied throughout this paper. Notice that all the different kinds of linear constraints are considered, that is, inequalities, equalities, and lower and upper bounds, since all of them usually appear in applied problems.

Definition 1

The following linearly constrained programming problem is introduced:

$$\begin{aligned} P:\left\{ \begin{array}{c} \min \ f(x)\\ A x \underline{\ge } b\\ M x = q\\ l \underline{\le } x \underline{\le } u \end{array} \right. \end{aligned}$$

where \(f:X\rightarrow \mathbb {R}\) is a differentiable function, with \(X\subseteq \mathbb {R}^{n}\) open convex set such that:

$$\begin{aligned} \left\{ x\in \mathbb {R}^{n}: \ A x \underline{\ge } b, \ M x = q, \ l \underline{\le } x \underline{\le } u \right\} \subset X \end{aligned}$$

Moreover, \(x,l,u\in \mathbb {R}^{n}\), \(A\in \mathbb {R}^{m\times n}\), \(b\in \mathbb {R}^{m}\), \(M\in \mathbb {R}^{p\times n}\), \(q\in \mathbb {R}^{p}\). The i-th row of A is denoted with \(A_i\in \mathbb {R}^{1\times n}\), \(i=1,\dots ,m\), the i-th element of b is denoted with \(b_i\in \mathbb {R}\), \(i=1,\dots ,m\), while the i-th elements of x, l and u are denoted with \(x_i\in \mathbb {R}\), \(l_i\in \mathbb {R}\) and \(u_i\in \mathbb {R}\), respectively, with \(i=1,\dots ,n\).

As is known (see for instance Bazaraa et al. 2006, Sect. 5.1), the Karush-Kuhn-Tucker (KKT) conditions are always necessary for problems with linear constraints (recall that if the constraints are linear then the Abadie constraint qualification is automatically satisfied). This means that the following known necessary optimality condition holds.

Theorem 1

If \(\bar{x}\in X\) is an optimal solution for P then there exists \((\bar{\lambda },\bar{\mu },\bar{\nu },\bar{\rho })\in \mathbb {R}^{m}\times \mathbb {R}^{p}\times \mathbb {R}^{n}\times \mathbb {R}^{n}\) such that:

$$\begin{aligned}&\nabla f(\bar{x})=A^T\bar{\lambda }+M^T\bar{\mu }+\bar{\nu }-\bar{\rho }&\\&\bar{\lambda } \underline{\ge } 0\, \ \bar{\nu }\underline{\ge } 0\, \ \bar{\rho }\underline{\ge } 0&\\&\bar{\lambda }_i(A_i\bar{x}-b_i)=0 \ \forall i=1,\dots ,m&\\&\bar{\nu }_i(\bar{x}_i-l_i)=0 \, \ \bar{\rho }_i(u_i-\bar{x}_i)=0\ \forall i=1,\dots ,n&\end{aligned}$$

In the next sections, some optimality conditions will be stated under suitable convexity or generalized convexity assumptions on function f. For this reason, the following well known definitions are recalled (Avriel et al. 2010; Cambini and Martein 2009).

Definition 2

A differentiable function \(f:X\rightarrow \mathbb {R}\), with \(X\subseteq \mathbb {R}^{n}\) nonempty open convex set, is said to be pseudoconvex on X if the following property holds for all \(x,x_0\in X\):

$$\begin{aligned} f(x)<f(x_0)\quad \Rightarrow \quad \nabla f(x_0)^T(x-x_0)<0 \end{aligned}$$

Moreover, f is said to be convex on X if the following property holds for all \(x,x_0\in X\):

$$\begin{aligned} f(x)\ge f(x_0)+ \nabla f(x_0)^T(x-x_0) \end{aligned}$$

3 Necessary and sufficient optimality conditions

It is very well known that the necessary KKT optimality conditions together with the corresponding complementarity conditions are also sufficient under suitable generalized convexity assumptions. The following results point out that it is possible to state necessary and sufficient conditions not based on complementarities but using transversality conditions involving both variables and multipliers.

Theorem 2

Consider problem P and assume f to be pseudoconvex. The following properties are equivalent:

  i) \(\bar{x}\in \mathbb {R}^{n}\) is an optimal solution for P;

  ii) \((\bar{x},\bar{\lambda },\bar{\mu },\bar{\nu },\bar{\rho })\in \mathbb {R}^{n}\times \mathbb {R}^{m}\times \mathbb {R}^{p}\times \mathbb {R}^{n}\times \mathbb {R}^{n}\) is a solution of the following system of equalities and inequalities:

    $$\begin{aligned} {(S1): } \left\{ \begin{array}{c} \nabla f(x)=A^T\lambda +M^T\mu +\nu -\rho \\ A x \underline{\ge } b\,\ M x =q\,\ l \underline{\le } x \underline{\le } u\\ \lambda \underline{\ge } 0\, \ \nu \underline{\ge } 0\, \ \rho \underline{\ge } 0\\ \lambda _i(A_ix-b_i)=0 \ \forall i=1,\dots ,m\\ \nu _i(x_i-l_i)=0 \, \ \rho _i(u_i-x_i)=0\ \forall i=1,\dots ,n \end{array} \right. \end{aligned}$$

  iii) \((\bar{x},\bar{\lambda },\bar{\mu },\bar{\nu },\bar{\rho })\in \mathbb {R}^{n}\times \mathbb {R}^{m}\times \mathbb {R}^{p}\times \mathbb {R}^{n}\times \mathbb {R}^{n}\) is a solution of the following system of equalities and inequalities:

    $$\begin{aligned} {(S2): } \left\{ \begin{array}{c} \nabla f(x)=A^T\lambda +M^T\mu +\nu -\rho \\ A x \underline{\ge } b\,\ M x =q\,\ l \underline{\le } x \underline{\le } u\\ \lambda \underline{\ge } 0\, \ \nu \underline{\ge } 0\, \ \rho \underline{\ge } 0\\ \nabla f(x)^Tx=\lambda ^Tb+\mu ^Tq+\nu ^Tl-\rho ^Tu \end{array} \right. \end{aligned}$$

  iv) \((\bar{x},\bar{\lambda },\bar{\mu },\bar{\nu },\bar{\rho })\in \mathbb {R}^{n}\times \mathbb {R}^{m}\times \mathbb {R}^{p}\times \mathbb {R}^{n}\times \mathbb {R}^{n}\) is a solution of the following system of equalities and inequalities:

    $$\begin{aligned} {(S3): } \left\{ \begin{array}{c} \nabla f(x)=A^T\lambda +M^T\mu +\nu -\rho \\ A x \underline{\ge } b\,\ M x =q\,\ l \underline{\le } x \underline{\le } u\\ \lambda \underline{\ge } 0\, \ \nu \underline{\ge } 0\, \ \rho \underline{\ge } 0\\ \nabla f(x)^Tx\le \lambda ^Tb+\mu ^Tq+\nu ^Tl-\rho ^Tu \end{array} \right. \end{aligned}$$

Proof

\(i)\Rightarrow ii)\) Follows from Theorem 1.

\(ii)\Rightarrow iii)\) First notice that being \((\bar{x},\bar{\lambda },\bar{\mu },\bar{\nu },\bar{\rho })\) a solution of system (S1), then \(\bar{\lambda }^T(A\bar{x}-b)=0\), \(\bar{\nu }^T(\bar{x}-l)=0\), \(\bar{\rho }^T(u-\bar{x})=0\), \(M\bar{x}=q\), which yield:

$$\begin{aligned} \bar{\lambda }^TA\bar{x}=\bar{\lambda }^Tb \, \ \bar{\nu }^T\bar{x}=\bar{\nu }^Tl \, \ \bar{\rho }^T\bar{x}=\bar{\rho }^Tu \, \ \bar{\mu }^TM\bar{x}=\bar{\mu }^Tq \end{aligned}$$

Hence, from \(\nabla f(\bar{x})=A^T\bar{\lambda }+M^T\bar{\mu }+\bar{\nu }-\bar{\rho }\) it results:

$$\begin{aligned} \nabla f(\bar{x})^T\bar{x}=\bar{\lambda }^TA\bar{x}+\bar{\mu }^TM\bar{x}+\bar{\nu }^T\bar{x}-\bar{\rho }^T\bar{x} =\bar{\lambda }^Tb+\bar{\mu }^Tq+\bar{\nu }^Tl-\bar{\rho }^Tu \end{aligned}$$

and the result is proved.

\(iii)\Rightarrow iv)\) Trivial.

\(iv)\Rightarrow i)\) First notice that being \((\bar{x},\bar{\lambda },\bar{\mu },\bar{\nu },\bar{\rho })\) a solution of system (S3) then \(\bar{x}\) is feasible for P. Notice also that:

$$\begin{aligned} \nabla f(\bar{x})^T\bar{x}\le \bar{\lambda }^Tb+\bar{\mu }^Tq+\bar{\nu }^Tl-\bar{\rho }^Tu \end{aligned}$$
(1)

Assume by contradiction that \(\bar{x}\) is not an optimal solution for P, that is to say that there exists \(\hat{x}\) such that:

$$\begin{aligned} A\hat{x}\underline{\ge } b \ , \ M\hat{x}=q \ , \ l\underline{\le }\hat{x}\underline{\le }u \quad \text{ and } \quad f(\hat{x})<f(\bar{x}) \end{aligned}$$
(2)

The pseudoconvexity of f and (2) imply:

$$\begin{aligned} \nabla f(\bar{x})^T(\hat{x}-\bar{x})<0 \end{aligned}$$
(3)

so that:

$$\begin{aligned} \nabla f(\bar{x})^T\hat{x}<\nabla f(\bar{x})^T\bar{x} \end{aligned}$$
(4)

The sign conditions in system (S3), together with the feasibility of \(\hat{x}\) stated in (2), yield \(\bar{\lambda }^T(A\hat{x}-b)\ge 0\), \(\bar{\mu }^T(M\hat{x}-q)=0\), \(\bar{\nu }^T(\hat{x}-l)\ge 0\) and \(\bar{\rho }^T(u-\hat{x})\ge 0\), so that:

$$\begin{aligned} \bar{\lambda }^TA\hat{x}\ge \bar{\lambda }^Tb \, \ \bar{\mu }^TM\hat{x}=\bar{\mu }^Tq \, \ \bar{\nu }^T\hat{x}\ge \bar{\nu }^Tl \, \ \bar{\rho }^T\hat{x}\le \bar{\rho }^Tu \end{aligned}$$

Being \(\nabla f(\bar{x})=A^T\bar{\lambda }+M^T\bar{\mu }+\bar{\nu }-\bar{\rho }\) condition (4) implies:

$$\begin{aligned} \nabla f(\bar{x})^T\bar{x}>\nabla f(\bar{x})^T\hat{x}=\bar{\lambda }^TA\hat{x}+\bar{\mu }^TM\hat{x}+\bar{\nu }^T\hat{x}-\bar{\rho }^T\hat{x} \ge \bar{\lambda }^Tb+\bar{\mu }^Tq+\bar{\nu }^Tl-\bar{\rho }^Tu \end{aligned}$$
(5)

which contradicts (1). \(\square \)
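The link between the single inequality in (S3) and the complementarity conditions in (S1) can also be seen directly. The following short derivation is only an illustrative sketch: it uses nothing but the stationarity equation \(\nabla f(x)=A^T\lambda +M^T\mu +\nu -\rho \), the feasibility of x and the sign conditions on the multipliers, all of which are shared by (S1), (S2) and (S3):

$$\begin{aligned} \nabla f(x)^Tx-\left( \lambda ^Tb+\mu ^Tq+\nu ^Tl-\rho ^Tu\right)&=\lambda ^T(Ax-b)+\mu ^T(Mx-q)+\nu ^T(x-l)+\rho ^T(u-x)\\&=\lambda ^T(Ax-b)+\nu ^T(x-l)+\rho ^T(u-x)\ \ge \ 0 \end{aligned}$$

Each of the three remaining terms is a sum of nonnegative products. Hence, requiring the left-hand side to be nonpositive, as in the last row of (S3), forces every product \(\lambda _i(A_ix-b_i)\), \(\nu _i(x_i-l_i)\) and \(\rho _i(u_i-x_i)\) to vanish, which gives back the complementarity conditions of (S1).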

Notice that the pseudoconvexity of f has been used to prove “\(iv)\Rightarrow i)\)” only. In other words, ii), iii) and iv) are necessary optimality conditions which become sufficient assuming the pseudoconvexity of f. Notice also that sufficiency is guaranteed even when all the complementarity conditions are substituted with just one of the following transversality conditions:

$$\begin{aligned} \nabla f(x)^Tx=\lambda ^Tb+\mu ^Tq+\nu ^Tl-\rho ^Tu \end{aligned}$$
(6)
$$\begin{aligned} \nabla f(x)^Tx\le \lambda ^Tb+\mu ^Tq+\nu ^Tl-\rho ^Tu \end{aligned}$$
(7)

These two transversality conditions behave very differently when they are used to solve a problem computationally. In general, transversality conditions (6) and (7) do not guarantee the convexity of the feasible region. On the other hand, the feasible region becomes convex under suitable convexity or generalized convexity assumptions on the functions involved. The convexity of the feasible region is a fundamental property required by many solution methods and computational solvers. In this light, the following properties can be proved just by recalling that every nonempty level set of a semistrictly quasilinear function is the intersection of the feasible region and a hyperplane (see Theorem 3.3.5 in Cambini and Martein (2009)), that the lower level sets of a quasiconvex function are convex sets, and that the sum of a convex function and a linear function is convex and hence quasiconvex:

  (p1) if \(\nabla f(x)^Tx-\lambda ^Tb-\mu ^Tq-\nu ^Tl+\rho ^Tu\) is a semistrictly quasilinear function then the set of points verifying equality (6) is convex;

  (p2) if \(\nabla f(x)^Tx-\lambda ^Tb-\mu ^Tq-\nu ^Tl+\rho ^Tu\) is a quasiconvex function then the set of points verifying inequality (7) is convex;

  (p3) if \(\nabla f(x)^Tx\) is a convex function then the set of points verifying inequality (7) is convex.

Notice that property (p3) will be used to study the applicative example of Sect. 5.

Remark 1

Notice that property (p3) does not hold in the case \(\nabla f(x)^Tx\) is pseudoconvex but not convex. Specifically speaking, the sum of a strictly pseudoconvex function and a linear one is not necessarily quasiconvex. In this light, consider for example the function \(f(x,y)=g(x)+h(y)=(x^3+x)+2y\). It results \(f(0,0)=f(-1,1)=0\) and

$$\begin{aligned} f\left( \frac{1}{2}(0,0)+\frac{1}{2}(-1,1)\right) =f\left( -\frac{1}{2},\frac{1}{2}\right) =\frac{3}{8}>0 \end{aligned}$$

As a consequence, \(f(x,y)\) is not quasiconvex even though \(g(x)=x^3+x\) is strictly pseudoconvex and \(h(y)=2y\) is linear.
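The arithmetic of the counterexample in Remark 1 can be reproduced with exact rational numbers. The snippet below is a minimal sketch; the variable names are illustrative and are not taken from the paper.

```python
from fractions import Fraction


def f(x, y):
    # f(x, y) = g(x) + h(y) with g(x) = x^3 + x and h(y) = 2y (Remark 1)
    return x**3 + x + 2 * y


p0 = (Fraction(0), Fraction(0))
p1 = (Fraction(-1), Fraction(1))
# Midpoint of the segment joining p0 and p1
mid = tuple((a + b) / 2 for a, b in zip(p0, p1))

f0, f1, fmid = f(*p0), f(*p1), f(*mid)

# Quasiconvexity would require f(mid) <= max(f(p0), f(p1)).
violates_quasiconvexity = fmid > max(f0, f1)
print(f0, f1, fmid, violates_quasiconvexity)  # 0 0 3/8 True
```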

4 Duality results

The aim of this section is to investigate the use of the transversality conditions (6) and (7) in duality theory. Specifically, two different pairs of dual problems will be proposed: the first one deals with convex programs and has the elements of the transversality condition in the objective function of the dual problem, while the second one deals with pseudoconvex programs and has the transversality condition as an inequality constraint of the dual.

Notice that both duals have the KKT conditions as an equality constraint. Nevertheless, they differ from the classical dual problems used in nonlinear programming, such as the Wolfe, Mond-Weir and Fenchel duals (Avriel 2003; Bazaraa et al. 2006; Arana and Cambini 2015; Cambini and Carosi 2010; Cambini et al. 2005; Cambini and Carosi 2005; Geoffrion 1971; Mangasarian 1994; Rockafellar 1974, 1970; Wolfe 1961).

4.1 A first pair of dual problems

Let us consider the following pair of constrained programs:

$$\begin{aligned} P:\left\{ \begin{array}{c} \min \ f(x)\\ A x \underline{\ge } b\\ M x = q\\ l \underline{\le } x \underline{\le } u \end{array} \right. \qquad , \qquad D:\left\{ \begin{array}{c} \max \ f(x)-\nabla f(x)^Tx+\lambda ^Tb+\mu ^Tq+\nu ^Tl-\rho ^Tu\\ \nabla f(x)=A^T\lambda +M^T\mu +\nu -\rho \\ \lambda \underline{\ge } 0\, \ \nu \underline{\ge } 0\, \ \rho \underline{\ge } 0 \end{array} \right. \end{aligned}$$

where \(x,l,u\in \mathbb {R}^{n}\), \(A\in \mathbb {R}^{m\times n}\), \(b\in \mathbb {R}^{m}\), \(M\in \mathbb {R}^{p\times n}\), \(q\in \mathbb {R}^{p}\), \(\lambda \in \mathbb {R}^{m}\), \(\mu \in \mathbb {R}^{p}\), \(\nu \in \mathbb {R}^{n}\), \(\rho \in \mathbb {R}^{n}\).

Let us prove the weak and strong duality results assuming f to be convex.

Theorem 3

(Weak Duality) Let \(x_P\) be any feasible solution for P and let \((x_D,\lambda ,\mu ,\nu ,\rho )\) be any feasible solution for D. If f is convex then:

$$\begin{aligned} f(x_P)\ge f(x_D)-\nabla f(x_D)^Tx_D+\lambda ^Tb+\mu ^Tq+\nu ^Tl-\rho ^Tu \end{aligned}$$

Proof

Taking into account that \(A x_P \underline{\ge } b\), \(M x_P = q\), \(x_P\underline{\ge } l\), \(x_P\underline{\le } u\), \(\lambda \underline{\ge } 0\), \(\nu \underline{\ge } 0\), \(\rho \underline{\ge } 0\), it results:

$$\begin{aligned} \lambda ^TAx_P\ge \lambda ^Tb \, \ \mu ^TMx_P=\mu ^Tq \, \ \nu ^Tx_P\ge \nu ^Tl \, \ \rho ^Tx_P\le \rho ^Tu \end{aligned}$$

These inequalities, the convexity of f and the equality constraint in D imply:

$$\begin{aligned} f(x_P)-f(x_D)\ge & {} \nabla f(x_D)^T(x_P-x_D)\\= & {} \nabla f(x_D)^Tx_P- \nabla f(x_D)^Tx_D\\= & {} \lambda ^TAx_P+\mu ^TMx_P+\nu ^Tx_P-\rho ^Tx_P- \nabla f(x_D)^Tx_D\\\ge & {} \lambda ^Tb+\mu ^Tq+\nu ^Tl-\rho ^Tu- \nabla f(x_D)^Tx_D \end{aligned}$$

and the result is proved. \(\square \)
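Weak duality for the pair (P, D) can be illustrated numerically on a toy instance. The sketch below assumes a hypothetical one-dimensional problem, \(\min x^2\) subject to \(1\le x\le 3\), with no constraints of type A or M; the grids are arbitrary and only serve to sample feasible points of both problems.

```python
def f(x):
    return x * x


def grad_f(x):
    return 2.0 * x


LOWER, UPPER = 1.0, 3.0  # hypothetical bounds l and u


def dual_objective(x, nu, rho):
    # Objective of D: f(x) - grad f(x)^T x + nu^T l - rho^T u
    return f(x) - grad_f(x) * x + nu * LOWER - rho * UPPER


def dual_point(x, rho):
    # Stationarity grad f(x) = nu - rho fixes nu once x and rho are chosen;
    # the point is feasible for D only if nu >= 0 (rho >= 0 by construction).
    nu = grad_f(x) + rho
    return (x, nu, rho) if nu >= 0 else None


# Sample the primal feasible interval [l, u].
primal_values = [f(LOWER + k * (UPPER - LOWER) / 40) for k in range(41)]

# Sample dual feasible points over a grid of x and rho values.
dual_values = []
for i in range(-40, 41):
    for j in range(6):
        point = dual_point(i / 10, float(j))
        if point is not None:
            dual_values.append(dual_objective(*point))

min_primal = min(primal_values)
max_dual = max(dual_values)
print(min_primal, max_dual)
```

On this instance the dual objective reduces to \(-x^2+2x-2\rho \), so both sampled extrema are attained at \(x=1\), \(\rho =0\), in agreement with Theorem 3.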

Corollary 1

Let \(\bar{x}_P\) be a feasible solution for P and let \((\bar{x}_D,\bar{\lambda },\bar{\mu },\bar{\nu },\bar{\rho })\in \mathbb {R}^{n}\times \mathbb {R}^{m}\times \mathbb {R}^{p}\times \mathbb {R}^{n}\times \mathbb {R}^{n}\) be a feasible solution for D such that:

$$\begin{aligned} f(\bar{x}_P)=f(\bar{x}_D)-\nabla f(\bar{x}_D)^T\bar{x}_D+\bar{\lambda }^Tb+\bar{\mu }^Tq+\bar{\nu }^Tl-\bar{\rho }^Tu \end{aligned}$$

If f is convex then \(\bar{x}_P\) is an optimal solution of P and \((\bar{x}_D,\bar{\lambda },\bar{\mu },\bar{\nu },\bar{\rho })\) is an optimal solution of D.

Proof

Assume by contradiction that \(\bar{x}_P\) is not an optimal solution of P. Then, there exists \(\hat{x}_P\) feasible for P such that \(f(\hat{x}_P)<f(\bar{x}_P)\). This condition and the weak duality theorem imply:

$$\begin{aligned} f(\bar{x}_P)>f(\hat{x}_P)\ge f(\bar{x}_D)-\nabla f(\bar{x}_D)^T\bar{x}_D+\bar{\lambda }^Tb+\bar{\mu }^Tq+\bar{\nu }^Tl-\bar{\rho }^Tu=f(\bar{x}_P) \end{aligned}$$

which is a contradiction. The optimality of \((\bar{x}_D,\bar{\lambda },\bar{\mu },\bar{\nu },\bar{\rho })\) follows analogously. \(\square \)

Theorem 4

(Strong Duality) Let \(\bar{x}_P\) be an optimal solution of P and let f be convex. Then, there exists an optimal solution \((\bar{x}_D,\bar{\lambda },\bar{\mu },\bar{\nu },\bar{\rho })\in \mathbb {R}^{n}\times \mathbb {R}^{m}\times \mathbb {R}^{p}\times \mathbb {R}^{n}\times \mathbb {R}^{n}\) of D such that \(\bar{x}_D=\bar{x}_P\) and:

$$\begin{aligned} f(\bar{x}_P)=f(\bar{x}_D)-\nabla f(\bar{x}_D)^T\bar{x}_D +\bar{\lambda }^Tb+\bar{\mu }^Tq+\bar{\nu }^Tl-\bar{\rho }^Tu \end{aligned}$$

Moreover, for all optimal solutions \((\hat{x}_D,\hat{\lambda },\hat{\mu },\hat{\nu },\hat{\rho })\in \mathbb {R}^{n}\times \mathbb {R}^{m}\times \mathbb {R}^{p}\times \mathbb {R}^{n}\times \mathbb {R}^{n}\) of D it results:

$$\begin{aligned} f(\bar{x}_P)=f(\hat{x}_D)-\nabla f(\hat{x}_D)^T\hat{x}_D+\hat{\lambda }^Tb+\hat{\mu }^Tq+\hat{\nu }^Tl-\hat{\rho }^Tu \end{aligned}$$

Proof

From Theorem 2 it follows that there exists \((\bar{\lambda },\bar{\mu },\bar{\nu },\bar{\rho })\in \mathbb {R}^{m}\times \mathbb {R}^{p}\times \mathbb {R}^{n}\times \mathbb {R}^{n}\) such that:

$$\begin{aligned}&\nabla f(\bar{x}_P)=A^T\bar{\lambda }+M^T\bar{\mu }+\bar{\nu }-\bar{\rho }&\\&\bar{\lambda } \underline{\ge } 0\, \ \bar{\nu }\underline{\ge } 0\, \ \bar{\rho }\underline{\ge } 0&\\&\nabla f(\bar{x}_P)^T\bar{x}_P =\bar{\lambda }^Tb+\bar{\mu }^Tq+\bar{\nu }^Tl-\bar{\rho }^Tu&\end{aligned}$$

Let \(\bar{x}_D=\bar{x}_P\). It then results that \((\bar{x}_D,\bar{\lambda },\bar{\mu },\bar{\nu },\bar{\rho })\) is feasible for D and that:

$$\begin{aligned} \bar{\lambda }^Tb+\bar{\mu }^Tq+\bar{\nu }^Tl-\bar{\rho }^Tu-\nabla f(\bar{x}_D)^T\bar{x}_D=0 \end{aligned}$$

The weak duality theorem implies that for all feasible points \((x,\lambda ,\mu ,\nu ,\rho )\) of D it results:

$$\begin{aligned}{} & {} f(x)-\nabla f(x)^Tx+\lambda ^Tb+\mu ^Tq+\nu ^Tl-\rho ^Tu \le f(\bar{x}_P)=f(\bar{x}_D)\\{} & {} \quad =f(\bar{x}_D)-\nabla f(\bar{x}_D)^T\bar{x}_D +\bar{\lambda }^Tb+\bar{\mu }^Tq+\bar{\nu }^Tl-\bar{\rho }^Tu \end{aligned}$$

which means that \((\bar{x}_D,\bar{\lambda },\bar{\mu },\bar{\nu },\bar{\rho })\) is an optimal solution of D. The whole result then follows since all the optimal solutions of D have the same image value. \(\square \)

4.2 A second pair of dual problems

Let us consider the following pair of constrained programs:

$$\begin{aligned} P:\left\{ \begin{array}{c} \min \ f(x)\\ A x \underline{\ge } b\\ M x = q\\ l \underline{\le } x \underline{\le } u \end{array} \right. \qquad , \qquad D':\left\{ \begin{array}{c} \max \ f(x)\\ \nabla f(x)=A^T\lambda +M^T\mu +\nu -\rho \\ \nabla f(x)^Tx\le \lambda ^Tb+\mu ^Tq+\nu ^Tl-\rho ^Tu\\ \lambda \underline{\ge } 0\, \ \nu \underline{\ge } 0\, \ \rho \underline{\ge } 0 \end{array} \right. \end{aligned}$$

where \(x,l,u\in \mathbb {R}^{n}\), \(A\in \mathbb {R}^{m\times n}\), \(b\in \mathbb {R}^{m}\), \(M\in \mathbb {R}^{p\times n}\), \(q\in \mathbb {R}^{p}\), \(\lambda \in \mathbb {R}^{m}\), \(\mu \in \mathbb {R}^{p}\), \(\nu \in \mathbb {R}^{n}\), \(\rho \in \mathbb {R}^{n}\).

Notice that this pair of dual problems is quite unusual since they share the same objective function. Nevertheless, the feasible regions differ, the primal is a minimization problem while \(D'\) is a maximization one, and the weak and strong duality results hold. Hence, for the sake of completeness, the following results are provided pointing out that they offer the opportunity to solve problem P in a different way. Notice also that the following weak and strong duality results assume f to be pseudoconvex.

Theorem 5

(Weak Duality) Let \(x_P\) be any feasible solution for P and let \((x_D,\lambda ,\mu ,\nu ,\rho )\) be any feasible solution for \(D'\). If f is pseudoconvex then:

$$\begin{aligned} f(x_P)\ge f(x_D) \end{aligned}$$

Proof

First recall that the pseudoconvexity of f yields:

$$\begin{aligned} \nabla f(x_D)^T(x_P-x_D)\ge 0 \quad \Rightarrow \quad f(x_P)\ge f(x_D) \end{aligned}$$

Taking into account that \(A x_P \underline{\ge } b\), \(M x_P = q\), \(x_P\underline{\ge } l\), \(x_P\underline{\le } u\), \(\lambda \underline{\ge } 0\), \(\nu \underline{\ge } 0\), \(\rho \underline{\ge } 0\), it results:

$$\begin{aligned} \lambda ^TAx_P\ge \lambda ^Tb \, \ \mu ^TMx_P=\mu ^Tq \, \ \nu ^Tx_P\ge \nu ^Tl \, \ \rho ^Tx_P\le \rho ^Tu \end{aligned}$$

These inequalities and the equality/inequality constraints in \(D'\) imply:

$$\begin{aligned} \nabla f(x_D)^T(x_P-x_D)= & {} \nabla f(x_D)^Tx_P- \nabla f(x_D)^Tx_D\\= & {} \lambda ^TAx_P+\mu ^TMx_P+\nu ^Tx_P-\rho ^Tx_P- \nabla f(x_D)^Tx_D\\\ge & {} \lambda ^Tb+\mu ^Tq+\nu ^Tl-\rho ^Tu- \nabla f(x_D)^Tx_D \ge 0 \end{aligned}$$

Hence, the result follows from the pseudoconvexity of f. \(\square \)
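The same kind of check can be sketched for the pair (P, \(D'\)) with an objective that is pseudoconvex but not convex. The example assumes the hypothetical one-dimensional problem \(\min x^3+x\) subject to \(-1\le x\le 2\) (the derivative \(3x^2+1\) never vanishes, so the function is pseudoconvex on \(\mathbb {R}\)); exact rational arithmetic avoids rounding issues on the boundary.

```python
from fractions import Fraction

LOWER, UPPER = Fraction(-1), Fraction(2)  # hypothetical bounds l and u


def f(x):
    return x**3 + x


def grad_f(x):
    return 3 * x**2 + 1


def dual_feasible(x, nu, rho):
    # Constraints of D': stationarity, transversality inequality, signs.
    stationarity = grad_f(x) == nu - rho
    transversality = grad_f(x) * x <= nu * LOWER - rho * UPPER
    return stationarity and transversality and nu >= 0 and rho >= 0


# Sample the primal feasible interval [l, u].
primal_grid = [LOWER + Fraction(k, 30) * (UPPER - LOWER) for k in range(31)]
primal_values = [f(x) for x in primal_grid]

# Sample candidate dual points; nu is fixed by stationarity.
dual_values = []
for i in range(-30, 31):
    x = Fraction(i, 10)
    for j in range(5):
        rho = Fraction(j, 2)
        nu = grad_f(x) + rho
        if dual_feasible(x, nu, rho):
            dual_values.append(f(x))

best_primal = min(primal_values)
best_dual = max(dual_values)
print(best_primal, best_dual)  # -2 -2
```

Here the transversality inequality reduces to \((3x^2+1)(x+1)+3\rho \le 0\), so the sampled dual points all satisfy \(x\le -1\) and the two optimal values coincide at \(x=-1\).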

Corollary 2

Let \(\bar{x}_P\) be a feasible solution for P and let \((\bar{x}_D,\bar{\lambda },\bar{\mu },\bar{\nu },\bar{\rho })\in \mathbb {R}^{n}\times \mathbb {R}^{m}\times \mathbb {R}^{p}\times \mathbb {R}^{n}\times \mathbb {R}^{n}\) be a feasible solution for \(D'\) such that \(f(\bar{x}_P)=f(\bar{x}_D)\). If f is pseudoconvex then \(\bar{x}_P\) is an optimal solution of P and \((\bar{x}_D,\bar{\lambda },\bar{\mu },\bar{\nu },\bar{\rho })\) is an optimal solution of \(D'\).

Proof

Assume by contradiction that \(\bar{x}_P\) is not an optimal solution of P. Then, there exists \(\hat{x}_P\) feasible for P such that \(f(\hat{x}_P)<f(\bar{x}_P)\). This condition and the weak duality theorem imply:

$$\begin{aligned} f(\bar{x}_P)>f(\hat{x}_P)\ge f(\bar{x}_D)=f(\bar{x}_P) \end{aligned}$$

which is a contradiction. The optimality of \((\bar{x}_D,\bar{\lambda },\bar{\mu },\bar{\nu },\bar{\rho })\) follows analogously. \(\square \)

Theorem 6

(Strong Duality) Let \(\bar{x}_P\) be an optimal solution of P and let f be pseudoconvex. Then, there exists an optimal solution \((\bar{x}_D,\bar{\lambda },\bar{\mu },\bar{\nu },\bar{\rho })\in \mathbb {R}^{n}\times \mathbb {R}^{m}\times \mathbb {R}^{p}\times \mathbb {R}^{n}\times \mathbb {R}^{n}\) of \(D'\) such that \(\bar{x}_D=\bar{x}_P\) and hence \(f(\bar{x}_P)=f(\bar{x}_D)\). Moreover, for all optimal solutions \((\hat{x}_D,\hat{\lambda },\hat{\mu },\hat{\nu },\hat{\rho })\in \mathbb {R}^{n}\times \mathbb {R}^{m}\times \mathbb {R}^{p}\times \mathbb {R}^{n}\times \mathbb {R}^{n}\) of \(D'\) it results \(f(\bar{x}_P)=f(\hat{x}_D)\).

Proof

From Theorem 2 it follows that there exists \((\bar{\lambda },\bar{\mu },\bar{\nu },\bar{\rho })\in \mathbb {R}^{m}\times \mathbb {R}^{p}\times \mathbb {R}^{n}\times \mathbb {R}^{n}\) such that:

$$\begin{aligned}&\nabla f(\bar{x}_P)=A^T\bar{\lambda }+M^T\bar{\mu }+\bar{\nu }-\bar{\rho }&\\&\bar{\lambda } \underline{\ge } 0\, \ \bar{\nu }\underline{\ge } 0\, \ \bar{\rho }\underline{\ge } 0&\\&\nabla f(\bar{x}_P)^T\bar{x}_P =\bar{\lambda }^Tb+\bar{\mu }^Tq+\bar{\nu }^Tl-\bar{\rho }^Tu&\end{aligned}$$

Let \(\bar{x}_D=\bar{x}_P\). It then results that \((\bar{x}_D,\bar{\lambda },\bar{\mu },\bar{\nu },\bar{\rho })\) is feasible for \(D'\). The weak duality theorem implies that for all feasible points \((x,\lambda ,\mu ,\nu ,\rho )\) of \(D'\) it results:

$$\begin{aligned} f(x) \le f(\bar{x}_P)=f(\bar{x}_D) \end{aligned}$$

which means that \((\bar{x}_D,\bar{\lambda },\bar{\mu },\bar{\nu },\bar{\rho })\) is an optimal solution of \(D'\). The whole result then follows since all the optimal solutions of \(D'\) have the same image value. \(\square \)

5 An application to convex quadratic problems

In this section the results of Sect. 3 are specialized to convex quadratic problems and applied to a particular class of Max-Min problems. The main maximization problem has some linear constraints and a further constraint requiring the variable to belong to the set of optimal solutions of a secondary QP minimization problem. This class of problems is approached by means of both the complementarity conditions and the transversality one. Then, for the special case of a linear objective, a computational test is provided to show the usefulness of the transversality constraint. Finally, an applied example of the studied model is given.

5.1 Transversality condition and quadratic problems

In the light of property (p3) at the end of Sect. 3, the nonlinear transversality condition (7) becomes easily manageable in the case f is linear or quadratic. Consider the following particular case of problem P:

$$\begin{aligned} P_Q:\left\{ \begin{array}{c} \min \ \frac{1}{2}x^TQx+c^Tx\\ A x \underline{\ge } b\\ M x = q\\ l \underline{\le } x \underline{\le } u \end{array} \right. \end{aligned}$$

with \(c\in \mathbb {R}^{n}\) and \(Q\in \mathbb {R}^{n\times n}\) symmetric and positive semidefinite (notice that Q may be the zero matrix, so that also the linear case is considered). The following result straightforwardly follows from Theorem 2.

Corollary 3

Consider problem \(P_Q\). The following properties are equivalent:

  i) \(\bar{x}\in \mathbb {R}^{n}\) is an optimal solution for \(P_Q\);

  ii) \((\bar{x},\bar{\lambda },\bar{\mu },\bar{\nu },\bar{\rho })\in \mathbb {R}^{n}\times \mathbb {R}^{m}\times \mathbb {R}^{p}\times \mathbb {R}^{n}\times \mathbb {R}^{n}\) is a solution of the following system of equalities and inequalities:

    $$\begin{aligned} \left\{ \begin{array}{c} Qx+c=A^T\lambda +M^T\mu +\nu -\rho \\ A x \underline{\ge } b\,\ M x =q\,\ l \underline{\le } x \underline{\le } u\\ \lambda \underline{\ge } 0\, \ \nu \underline{\ge } 0\, \ \rho \underline{\ge } 0\\ \lambda _i(A_ix-b_i)=0 \ \forall i=1,\dots ,m\\ \nu _i(x_i-l_i)=0 \, \ \rho _i(u_i-x_i)=0\ \forall i=1,\dots ,n \end{array} \right. \end{aligned}$$

  iii) \((\bar{x},\bar{\lambda },\bar{\mu },\bar{\nu },\bar{\rho })\in \mathbb {R}^{n}\times \mathbb {R}^{m}\times \mathbb {R}^{p}\times \mathbb {R}^{n}\times \mathbb {R}^{n}\) is a solution of the following system of equalities and inequalities:

    $$\begin{aligned} \left\{ \begin{array}{c} Qx+c=A^T\lambda +M^T\mu +\nu -\rho \\ A x \underline{\ge } b\,\ M x =q\,\ l \underline{\le } x \underline{\le } u\\ \lambda \underline{\ge } 0\, \ \nu \underline{\ge } 0\, \ \rho \underline{\ge } 0\\ x^TQx+c^Tx\le \lambda ^Tb+\mu ^Tq+\nu ^Tl-\rho ^Tu \end{array} \right. \end{aligned}$$

Remark 2

It is worth comparing, from a computational point of view, conditions ii) and iii) of Corollary 3. Complementarity conditions require the use of time-consuming approaches, such as branch and bound methods, dummy variables or big-M methods. On the other hand, the quadratic transversality condition can be efficiently managed by the main solvers (recall property (p3) at the end of Sect. 3 and notice that \(\nabla f(x)^Tx=x^TQx+c^Tx\) is convex, Q being positive semidefinite).
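The two characterizations in Corollary 3 can be compared on a small instance that is solvable by hand. The sketch below assumes a hypothetical separable problem with diagonal Q and bound constraints only (no A and no M), so that the minimizer is obtained componentwise by clipping; the data and helper names are illustrative and do not come from the paper.

```python
from fractions import Fraction as F

# Hypothetical data: Q = diag(Q_DIAG) is positive semidefinite.
Q_DIAG = [F(2), F(1), F(0)]
C = [F(-10), F(3), F(1)]
LOW = [F(-1), F(1), F(-2)]
UP = [F(4), F(4), F(4)]


def solve_box_qp():
    # Componentwise minimizer of 0.5*q*x^2 + c*x over [l, u].
    x = []
    for q, c, lo, up in zip(Q_DIAG, C, LOW, UP):
        if q > 0:
            x.append(min(max(-c / q, lo), up))
        else:
            x.append(lo if c >= 0 else up)
    return x


def multipliers(x):
    # Split the gradient Qx + c into nu - rho with nu, rho >= 0.
    grad = [q * xi + c for q, xi, c in zip(Q_DIAG, x, C)]
    return [max(g, 0) for g in grad], [max(-g, 0) for g in grad]


def transversality_gap(x, nu, rho):
    # x^T Q x + c^T x - (nu^T l - rho^T u); condition iii) asks for gap <= 0.
    lhs = sum(q * xi * xi + c * xi for q, xi, c in zip(Q_DIAG, x, C))
    rhs = sum(n * lo - r * up for n, r, lo, up in zip(nu, rho, LOW, UP))
    return lhs - rhs


def complementarity_holds(x, nu, rho):
    # Condition ii): nu_i (x_i - l_i) = 0 and rho_i (u_i - x_i) = 0 for all i.
    return all(n * (xi - lo) == 0 and r * (up - xi) == 0
               for xi, n, r, lo, up in zip(x, nu, rho, LOW, UP))


x_opt = solve_box_qp()
nu_opt, rho_opt = multipliers(x_opt)
gap_opt = transversality_gap(x_opt, nu_opt, rho_opt)

# A feasible but non-optimal point, with multipliers built the same way.
x_bad = [F(0), F(2), F(0)]
nu_bad, rho_bad = multipliers(x_bad)
gap_bad = transversality_gap(x_bad, nu_bad, rho_bad)

print(x_opt, gap_opt, gap_bad)
```

At the minimizer the gap is zero and the complementarity products vanish, while at the non-optimal point the single transversality inequality is violated.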

5.2 A class of max-min problems

Let us consider the following class of Max-Min problems:

$$\begin{aligned} \max \limits _{x,y}\quad&g(x,y)&\qquad (8)\\ \text{s.t.}\quad&B_x x+B_y y\underline{\ge } h&\qquad (9)\\&N_x x+N_y y= v&\qquad (10)\\&l_y\underline{\le }y\underline{\le }u_y&\qquad (11)\\&x\in \arg \min \limits _{\tilde{x}} \left\{ \frac{1}{2}\tilde{x}^TQ\tilde{x}+c^T\tilde{x}: \ A\tilde{x}\underline{\ge } b,\ M\tilde{x}=q,\ l\underline{\le }\tilde{x}\underline{\le }u \right\}&\qquad (12) \end{aligned}$$

with \(g:(\mathbb {R}^{n}\times \mathbb {R}^{n_y})\rightarrow \mathbb {R}\), \(x,c,l,u\in \mathbb {R}^{n}\), \(y,l_y,u_y\in \mathbb {R}^{n_y}\), \(B_x\in \mathbb {R}^{r\times n}\), \(B_y\in \mathbb {R}^{r\times {n_y}}\), \(h\in \mathbb {R}^{r}\), \(N_x\in \mathbb {R}^{s\times n}\), \(N_y\in \mathbb {R}^{s\times {n_y}}\), \(v\in \mathbb {R}^{s}\), \(A\in \mathbb {R}^{m\times n}\), \(b\in \mathbb {R}^{m}\), \(M\in \mathbb {R}^{p\times n}\), \(q\in \mathbb {R}^{p}\), \(Q\in \mathbb {R}^{n\times n}\) symmetric and positive semidefinite. Notice that the subproblem in (12) is nothing but problem \(P_Q\). By means of the necessary and sufficient optimality conditions in Corollary 3 it is possible to characterize optimality in (12), thus obtaining a flat reformulation of the Max-Min problem. Specifically speaking, ii) and iii) of Corollary 3 yield:

$$\begin{aligned} P_C: \ \left\{ \begin{array}{l} \max \limits _{x,y}\ g(x,y)\\ B_x x+B_y y\underline{\ge } h\\ N_x x+N_y y= v, \ l_y\underline{\le }y\underline{\le }u_y\\ Ax\underline{\ge } b, \ Mx=q,\ l\underline{\le }x\underline{\le }u\\ Qx+c=A^T\lambda +M^T\mu +\nu -\rho \\ \lambda \underline{\ge }0 \, \ \nu \underline{\ge }0 \, \ \rho \underline{\ge }0\\ \lambda ^T(Ax-b)=0 \\ \ \nu ^T(x-l)=0 \, \ \rho ^T(u-x)=0 \end{array} \right. \, \ P_T: \ \left\{ \begin{array}{l} \max \limits _{x,y}\ g(x,y)\\ B_x x+B_y y\underline{\ge } h\\ N_x x+N_y y= v, \ l_y\underline{\le }y\underline{\le }u_y\\ Ax\underline{\ge } b, \ Mx=q,\ l\underline{\le }x\underline{\le }u\\ Qx+c=A^T\lambda +M^T\mu +\nu -\rho \\ \lambda \underline{\ge }0 \, \ \nu \underline{\ge }0 \, \ \rho \underline{\ge }0\\ x^TQx+c^Tx\le \lambda ^Tb+\mu ^Tq+\nu ^Tl-\rho ^Tu \end{array} \right. \end{aligned}$$

with \(\lambda \in \mathbb {R}^{m}\), \(\mu \in \mathbb {R}^{p}\), \(\nu ,\rho \in \mathbb {R}^{n}\). Notice that \(P_C\) uses \(m+2n\) complementarity conditions, while \(P_T\) is based on just one quadratic transversality constraint.

5.3 A computational test

The behavior of the above problems has been studied by means of a computational test executed in a macOS 14 environment with an M1 Pro 10-core processor, MATLAB 2023b for coding, AMPL 2024.03.06 as modeler and Gurobi 11.0.0 as solver for LP and QP problems.

For the sake of simplicity, we assumed \(n_y=n\), \(r=n\), \(s=0\), \(m=0\), \(p=round(n/2)\) and \(g(x,y)=d_x^Tx+d_y^Ty\), with \(d_x,d_y\in \mathbb {R}^{n}\). Problems with a number of variables from \(n = 20\) to \(n = 35\) have been considered, and 49000 instances have been randomly generated, for a grand total of 196000 problems solved. Specifically, matrices and vectors \(d_x\), \(d_y\), c, \(B_x\), \(B_y\), M have been randomly generated with components in the interval \([-5,5]\) by using the “randi()” MATLAB function (integer numbers generated with uniform distribution); analogously, \(l_x\) has components in the interval \([-5,5]\), \(l_y\) has components in the interval \([-10,0]\), \(u_x-l_x\) and \(u_y-l_y\) have components in the interval [20, 30]; finally, b and q are chosen to guarantee a nonempty feasible region while \(Q=S^TS\) where \(S\in \mathbb {R}^{n\times n}\) is generated with components in the interval \([-3,3]\). The average times spent solving the instances (measured with the “tic” and “toc” MATLAB commands) are reported as the result of the computational test.
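The instance generation described above can be sketched in Python as follows. This is not the MATLAB code used for the test: the function names are hypothetical, the standard `random` module stands in for `randi()`, and, since the text does not say how the right-hand sides are chosen, the sketch assumes they are built from a random reference point inside the bounds.

```python
import random


def randi_vector(size, lo, hi, rng):
    # Integers drawn uniformly from {lo, ..., hi}, in the spirit of randi().
    return [rng.randint(lo, hi) for _ in range(size)]


def randi_matrix(rows, cols, lo, hi, rng):
    return [randi_vector(cols, lo, hi, rng) for _ in range(rows)]


def matvec(mat, vec):
    return [sum(a * b for a, b in zip(row, vec)) for row in mat]


def generate_instance(n, seed=0):
    rng = random.Random(seed)
    p = (n + 1) // 2  # same value as MATLAB round(n/2) for positive n
    inst = {
        "d_x": randi_vector(n, -5, 5, rng),
        "d_y": randi_vector(n, -5, 5, rng),
        "c": randi_vector(n, -5, 5, rng),
        "B_x": randi_matrix(n, n, -5, 5, rng),
        "B_y": randi_matrix(n, n, -5, 5, rng),
        "M": randi_matrix(p, n, -5, 5, rng),
        "l_x": randi_vector(n, -5, 5, rng),
        "l_y": randi_vector(n, -10, 0, rng),
    }
    inst["u_x"] = [lo + rng.randint(20, 30) for lo in inst["l_x"]]
    inst["u_y"] = [lo + rng.randint(20, 30) for lo in inst["l_y"]]
    # Q = S^T S is symmetric and positive semidefinite by construction.
    s = randi_matrix(n, n, -3, 3, rng)
    inst["Q"] = [[sum(s[k][i] * s[k][j] for k in range(n)) for j in range(n)]
                 for i in range(n)]
    # ASSUMPTION (not stated in the paper): right-hand sides are derived
    # from a reference point (x0, y0) that satisfies the bounds.
    x0 = [rng.randint(lo, up) for lo, up in zip(inst["l_x"], inst["u_x"])]
    y0 = [rng.randint(lo, up) for lo, up in zip(inst["l_y"], inst["u_y"])]
    inst["q"] = matvec(inst["M"], x0)
    lhs = [a + b for a, b in zip(matvec(inst["B_x"], x0),
                                 matvec(inst["B_y"], y0))]
    inst["h"] = [v - rng.randint(0, 5) for v in lhs]  # nonnegative slack
    inst["x0"], inst["y0"] = x0, y0
    return inst


instance = generate_instance(20, seed=1)
print(len(instance["M"]), len(instance["Q"]))
```

With this choice the reference point satisfies all the linear constraints, so they are jointly consistent; it is not claimed to be optimal for the inner problem.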

Remark 3

Complementarity conditions can be managed in various ways (outer approximation approach, big-M method, SOS1 variables, transformations based on squares, max values or absolute values). Some of them are time consuming and some others have numerical issues. In order to find a good equilibrium between performance and numerical errors we propose the following transformation based on dummy binary variables:

$$\begin{aligned} \left\{ \begin{array}{c} \nu _i\ge 0\, \ (x_i-l_i)\ge 0 \\ \nu _i (x_i-l_i)=0 \end{array}\right. \quad \Leftrightarrow \quad \left\{ \begin{array}{c} \nu _i\ge 0\, \ (x_i-l_i)\ge 0\, \ \delta ^\nu _i\in \{0,1\} \\ \delta ^\nu _i \nu _i +(1-\delta ^\nu _i)(x_i-l_i)\le 0 \end{array}\right. \end{aligned}$$

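To prove the previous equivalence, notice that, since \(\nu _i\ge 0\), \((x_i-l_i)\ge 0\) and \(\delta ^\nu _i\in \{0,1\}\), both terms \(\delta ^\nu _i \nu _i\) and \((1-\delta ^\nu _i)(x_i-l_i)\) are nonnegative, so that their sum is nonpositive if and only if both of them vanish. Hence, for a suitable choice of \(\delta ^\nu _i\), we have: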

$$\begin{aligned} \nu _i (x_i-l_i)=0\Leftrightarrow & {} \delta ^\nu _i \nu _i=0\ \text{ and } \ (1-\delta ^\nu _i)(x_i-l_i)=0 \\\Leftrightarrow & {} \delta ^\nu _i \nu _i +(1-\delta ^\nu _i)(x_i-l_i)=0 \quad \Leftrightarrow \quad \delta ^\nu _i \nu _i +(1-\delta ^\nu _i)(x_i-l_i)\le 0 \end{aligned}$$

The other complementarity conditions can be transformed analogously. The use of inequalities reduces numerical errors, while the use of just one dummy variable and one constraint for each complementarity condition keeps performance at a reasonable level.
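The equivalence stated in Remark 3 can also be checked numerically on a small grid of values. The following Python fragment is a minimal sketch written for illustration only: the function names and the grid are assumptions, and `slack` stands for \(x_i-l_i\).

```python
from itertools import product


def complementarity_holds(nu, slack):
    """Original condition: nu >= 0, slack >= 0 and nu * slack = 0."""
    return nu >= 0 and slack >= 0 and nu * slack == 0


def binary_form_feasible(nu, slack):
    """Reformulated condition: nu >= 0, slack >= 0 and some delta in {0, 1}
    satisfies delta * nu + (1 - delta) * slack <= 0."""
    if nu < 0 or slack < 0:
        return False
    return any(delta * nu + (1 - delta) * slack <= 0 for delta in (0, 1))


# Compare the two conditions on every pair of nonnegative grid values.
grid = [0, 0.5, 1, 2, 10]
mismatches = [(nu, s) for nu, s in product(grid, repeat=2)
              if complementarity_holds(nu, s) != binary_form_feasible(nu, s)]
print(mismatches)  # prints []
```

Choosing \(\delta ^\nu _i=1\) forces \(\nu _i=0\), while \(\delta ^\nu _i=0\) forces \(x_i=l_i\), which is exactly the case distinction encoded by the complementarity condition.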

The results obtained in the computational test are summarized in Table 1, where column n provides the number of variables, while column num gives the number of generated instances. In the same table, the columns \(P_T\) and \(P_C\) provide the average elapsed time (in seconds) needed to solve problems \(P_T\) and \(P_C\), respectively. Specifically, \(P_C\) has been approached as described in Remark 3. For the sake of completeness, both the linear case (\(Q=0\)) and the quadratic case (\(Q\ne 0\)) have been considered.

Table 1 Average Elapsed Times (secs)

The obtained results point out that the use of the transversality condition in \(P_T\) provides a stable performance as the number of variables increases. On the other hand, with the complementarity conditions in \(P_C\) the elapsed time grows exponentially, and seems to double whenever the number of variables n is increased by just two units. Finally, the performance differences between the linear case (\(Q=0\)) and the quadratic one (\(Q\ne 0\)) do not seem to be relevant.
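As a rough reading of this trend, assume (as an idealization, not a measured result) that the doubling holds exactly over the whole tested range, and denote by \(t_C(n)\) the average elapsed time of \(P_C\) with n variables (a symbol introduced here only for illustration). Then:

$$\begin{aligned} t_C(n)\approx t_C(20)\cdot 2^{(n-20)/2}, \qquad \frac{t_C(35)}{t_C(20)}\approx 2^{7.5}\approx 181 \end{aligned}$$

Under this idealization each additional variable multiplies the elapsed time by about \(\sqrt{2}\approx 1.41\); the measured values are those reported in Table 1.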

5.4 An illustrative example: application to electricity market

In this section we propose an example that can be treated with the methodology presented in the previous sections. Let us consider an electricity company that sells electricity both in the spot and in the forward/future market. Selling electricity in the future market allows the company to curb price volatility and to recover the investment costs related to new plants (similar approaches can be found in Arega et al. (2021, 2022)). Let us then suppose that the future market is used to ensure profitability from the electricity produced by the new plants that can be installed. We can, for instance, think of a capacity expansion devoted to increasing the electricity produced by renewable plants that have to be installed to reduce pollution and reach net-zero carbon targets. The electricity company has to decide the quantity of electricity to be produced and devoted to the spot and to the future market in order to satisfy the demand of its customers. We suppose that there is a zonal demand that has to be fully covered by the company.

The amount of electricity sold through derivative contracts is established by a market operator that clears the forward/future market taking into account the objective of reducing the total emission of pollutants to a pre-determined target. In this way there is an incentive to ask for derivative contracts related to electricity produced with less polluting technologies.

The quantity of electricity to be sold in the spot market is decided as a function of the number of contracts assigned to the electricity company by the operator of the future market. In addition, the electricity company cannot emit more than a maximum target level for each pollutant. A full description of the model is provided below.

Let us introduce the following parameters and variables.

  • Sets

  • \(I:=\{1,\dots , |I|\}\) set of pollutants.

  • \(J:=\{1,\dots , |J|\}\) set of technologies.

  • \(K:=\{1,\dots , |K|\}\) set of zones.

  • Variables

  • \(x:=\{x_j\}\) electricity produced with technology \(j\in J\) and negotiated with derivative contracts, \(x\in \mathbb {R}^{|J|}_+\).

  • \(y:=\{y_j\}\) electricity produced with technology \(j\in J\) to be sold in the spot market, \(y\in \mathbb {R}^{|J|}_+\).

  • Parameters

  • \(l_x,u_x\) lower and upper capacity bounds for the production of electricity with technology \(j\in J\) negotiated with derivative contracts, \(l_x,u_x\in \mathbb {R}^{|J|}_+\).

  • \(l_y,u_y\) lower and upper capacity bounds for the production of electricity with technology \(j\in J\) to be sold in the spot market, \(l_y,u_y\in \mathbb {R}^{|J|}_+\).

  • \(p^x:=\{p^x_j\}\) price for electricity produced with technology \(j\in J\) and negotiated in the future market, \(p^x\in \mathbb {R}^{|J|}_+\).

  • \(p^y:=\{p^y_j\}\) spot price for electricity sold in the spot market with technology \(j\in J\), \(p^y\in \mathbb {R}^{|J|}_+\).

  • \(w^x:=\{w^x_j\}\) investment and operative costs for electricity sold in the future market with technology \(j\in J\), \(w^x\in \mathbb {R}^{|J|}_+\).

  • \(w^y:=\{w^y_j\}\) investment and operative costs for electricity sold in the spot market with technology \(j\in J\), \(w^y\in \mathbb {R}^{|J|}_+\).

  • \( B^x:=\{b^x_{ij}\}\) emission factor of pollutant \(i\in I\) for electricity produced with technology \(j\in J\) and negotiated in the future market

  • \( B^y:=\{b^y_{ij}\}\) emission factor of pollutant \(i\in I\) for electricity produced with technology \(j\in J\) and sold in the spot market

  • \( h:=\{h_{i}\}\) total maximum emission allowed of pollutant \(i\in I\) for electricity negotiated both in the spot and future market

  • \( q:=\{q_{i}\}\) maximum emission allowed of pollutant \(i\in I\) related to electricity traded with derivative contracts

  • \( N^x:=\{n^x_{kj}\}\) electricity produced with technology \(j\in J\) to be sold in zone \(k\in K\) and negotiated through derivative contracts

  • \( N^y:=\{n^y_{kj}\}\) electricity produced with technology \(j\in J\) to be sold in the spot market in zone \(k\in K\)

  • \( v:=\{v_{k}\}\) electricity demand in zone \(k\in K\)

  • \(c^h:=\{c^h_{i}\}\) cost for emission of pollutant \(i\in I\)

  • \(c^x:=\{c^x_{j}\}\) cost for derivative contracts related to electricity produced with technology \(j\in J\)

  • \( { A:=\{a_{kj}\}}\) number of derivative contracts for selling electricity in zone \(k\in K\) produced with technology \(j\in J\)

  • \( M:=\{m^x_{ij}\}\) emission factor of pollutant \(i\in I\) related to electricity produced with technology \(j\in J\) and negotiated in the future market

  • \( b:=\{b_{k}\}\) minimum amount of electricity to be traded with derivative contracts in zone \(k\in K\).

The problem can be formulated as follows:

$$\begin{aligned}&\max \limits _{x,y}&(p^x)^Tx+(p^y)^Ty-((w^x)^Tx+(w^y)^Ty)-({ c^h})^T(N^x x+N^y y)\end{aligned}$$
(13)
$$\begin{aligned}&s.t.&B^x x+B^y y\underline{\le } h\end{aligned}$$
(14)
$$\begin{aligned}{} & {} N^x x+N^y y= v\end{aligned}$$
(15)
$$\begin{aligned}{} & {} l_y\underline{\le }y\underline{\le }u_y \end{aligned}$$
(16)
$$\begin{aligned}{} & {} x\in \arg \min \limits _{\tilde{x}} \left\{ { (c^x)^T}\tilde{x}: \ A\tilde{x}\underline{\ge } b,\ M\tilde{x}=q,\ l_x\underline{\le }\tilde{x}\underline{\le }u_x \right\} \end{aligned}$$
(17)

It is easy to check that the proposed model can be efficiently solved by using the necessary and sufficient condition of Sect. 3 and reformulated as in Sects. 5.1 and 5.2 by simply noticing that \(d_x=p^x-w^x-(N^x)^T{ c^h}\), \(d_y=p^y-w^y-(N^y)^T{ c^h}\), \(g(x,y)=d_x^Tx+d_y^Ty\) and \(Q=0\).
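For the reader's convenience, the identification of \(d_x\) and \(d_y\) can be verified by expanding the objective function (13) and collecting the terms in \(x\) and \(y\). This is a direct computation which only uses the identity \((c^h)^TN^x x=((N^x)^Tc^h)^Tx\) and its analogue for \(N^y\):

$$\begin{aligned}&(p^x)^Tx+(p^y)^Ty-((w^x)^Tx+(w^y)^Ty)-(c^h)^T(N^x x+N^y y) \\&\quad = \left( p^x-w^x-(N^x)^Tc^h\right) ^Tx+\left( p^y-w^y-(N^y)^Tc^h\right) ^Ty \\&\quad = d_x^Tx+d_y^Ty \end{aligned}$$

The objective function is therefore linear in \((x,y)\), which is consistent with the choice \(Q=0\).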

Remark 4

Notice that \(p^x\) represents the price of the derivative contracts established in the market. In this model formulation it is assumed to be exogenous, but a more general formulation can be adopted where \(p^x\) corresponds to the dual variable of the inequality constraints \(A\tilde{x}\underline{\ge } b\). The resulting model turns out to be an indefinite quadratic optimization problem, and a solution approach can be found in Cambini et al. (2023).

6 Conclusions

In this paper new transversality conditions have been proposed and studied, thus obtaining necessary and/or sufficient optimality conditions and duality results. The transversality conditions also prove useful in stating an efficient approach to solve a particular class of Max-Min problems where a convex quadratic function is minimized. Future developments may focus on the study of bilevel problems, to see whether or not such transversality conditions can be helpful. Moreover, it will be worth studying solution methods that exploit transversality conditions for a general convex, linear fractional or pseudoconvex objective function.