Abstract
This paper presents sufficient conditions for strong metric subregularity (SMsR) of the optimality mapping associated with the local Pontryagin maximum principle for Mayer-type optimal control problems with pointwise control constraints given by a finite number of inequalities \(G_j(u)\le 0\). It is assumed that all data are twice continuously differentiable, and that at each feasible point the gradients \(G_j'(u)\) of the active constraints are linearly independent. The main result is that the second-order sufficient optimality condition for a weak local minimum is also sufficient for a version of the SMsR property, which involves two norms in the control space in order to deal with the so-called two-norm discrepancy.
1 Introduction
This paper contributes to the analysis of Lipschitz stability with respect to perturbations of the following Mayer type optimal control problem:
where \(F:\mathbb {R}^{2n}\rightarrow \mathbb {R}\), \(f: \mathbb {R}^{n+m}\rightarrow \mathbb {R}^n\), and \(G:\mathbb {R}^m\rightarrow \mathbb {R}^k\) are of class \(C^2\), \(u\in L^\infty \), \(x\in W^{1,1}\). More precisely, we investigate the property of Strong Metric subRegularity (SMsR) of the so-called optimality mapping, associated with the system of first order necessary optimality conditions (Pontryagin’s conditions in local form) for problem (1)–(3). These optimality conditions may have various forms. In this paper we deal with the representation using the augmented Hamiltonian, where the control constraints are included with corresponding Lagrange multipliers (see next section for a detailed formulation).
In general, the local Pontryagin principle can be written in the form of an inclusion (also called the optimality system)
where y incorporates the state, control, adjoint variables, and possibly the Lagrange multipliers associated with the control constraints. In this general setting, y belongs to a metric space \((Y, d_Y)\) and the image of \(\varPhi \) is contained in another metric space \((Z, d_Z)\). Each of these spaces is endowed with an additional metric: \(d_{Y\!\circ }\) in Y, and \(d_{Z\circ }\) in Z.
The definition of strong metric subregularity of the mapping \(\varPhi \) that we use is a slight (yet substantial) extension of the standard one, introduced under this name in [9]; see also [10, Chapter 3.9] and the recent paper [6]. The difference is that the definition below involves four metrics, \(d_Y, d_{Y\!\circ }\) in Y and \(d_Z, d_{Z\circ }\) in Z, instead of a single metric in each of the two spaces.
Definition 1.1
The set-valued mapping \(\varPhi :Y \rightrightarrows Z\) is strongly metrically subregular (SMsR) at \((\hat{y}, \hat{z}) \in Y\times Z\) if \(\hat{z} \in \varPhi (\hat{y})\) and there exist a number \(\kappa \ge 0\) and neighborhoods \(B_Y\) of \(\hat{y}\) in the metric \(d_{Y\!\circ }\) and \(B_Z\) of \(\hat{z}\) in the metric \(d_{Z\circ }\), such that for any \(z \in B_Z\) and any solution \(y \in B_Y\) of the inclusion \(z \in \varPhi (y)\), it holds that \(d_Y(y, \hat{y}) \le \kappa \,d_Z(z, \hat{z})\).
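To build intuition for this definition, consider a purely illustrative scalar example (not part of the problem studied in this paper): the map \(\varphi _1(y)=y\) is SMsR at (0, 0) with \(\kappa =1\), whereas \(\varphi _2(y)=y^3\) is not, since the solution of \(z=y^3\) satisfies \(|y|=|z|^{1/3}\), which admits no bound of the form \(\kappa |z|\) near the origin. A short numerical sketch:

```python
import numpy as np

# Toy scalar illustration of Definition 1.1 (hypothetical example, not from
# the paper). Take Y = Z = R with the usual metric and y_hat = z_hat = 0.
# For phi1(y) = y, the solution of z = phi1(y) satisfies |y| = |z|,
# so SMsR holds with kappa = 1. For phi2(y) = y**3, the solution is
# y = z**(1/3) and |y| / |z| = |z|**(-2/3) -> infinity as z -> 0,
# so no finite kappa works and SMsR fails at (0, 0).

def smsr_ratio(solve, z):
    """Ratio d_Y(y, y_hat) / d_Z(z, z_hat) at y_hat = z_hat = 0."""
    return abs(solve(z)) / abs(z)

for z in [1e-2, 1e-4, 1e-6]:
    r1 = smsr_ratio(lambda s: s, z)           # phi1: ratio stays equal to 1
    r2 = smsr_ratio(lambda s: np.cbrt(s), z)  # phi2: ratio blows up
    print(f"z = {z:.0e}: phi1 ratio = {r1:.1f}, phi2 ratio = {r2:.1e}")
```
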
Versions of the SMsR property have also been introduced and utilized in [3, 5, 11]. Metric regularity properties with two norms in the space Z (a Banach space) were first introduced in [22], while the utilization of two metrics in Y, in relation to the SMsR property, plays an important role in [2]. It is well recognized that the SMsR of the optimality mapping in optimal control is a key property for ensuring convergence, with error estimates, of numerous methods for solving optimal control problems: discretization methods, gradient methods, Newton-type methods, etc. (see e.g. [3, 6, 21], in addition to a large number of papers where the SMsR property is implicitly used).
We mention that there exists a substantial body of literature on Lipschitz continuity (related to the property of strong metric regularity) and differentiability of the optimal solution with respect to parameters; see e.g. [8] and [13], respectively, as well as the bibliography therein. These properties are stronger than SMsR; therefore the corresponding sufficient conditions for their validity are also stronger. On the other hand, the SMsR property is strong enough for the applications mentioned in the last paragraph.
The SMsR property of the optimality mapping associated with optimal control problems has been investigated and used in several papers, e.g. [1, 7, 20, 21]. However, the sufficient conditions obtained in these papers require various kinds of coercivity conditions for a quadratic form defined by the second derivatives of the (augmented) Hamiltonian. These conditions have to be satisfied for all (sufficiently small) admissible variations of the reference solution of the optimality system. In the present paper, we require coercivity of this quadratic form on an extended critical cone only, which is a subset of the set of all admissible variations. Namely, we establish that the known second-order sufficient optimality conditions for problem (1)–(3) (in terms of the extended critical cone) are also sufficient for SMsR. This makes the conditions for SMsR close to those in mathematical programming. A remarkable additional result is that in the second-order sufficient optimality conditions, the extended critical cone can be replaced with the usual critical cone, provided that a point-wise Legendre-type condition is satisfied. Moreover, we show that the converse is also true: the latter condition together with coercivity of the quadratic form on the critical cone implies coercivity on the extended critical cone.
In Sect. 2 we introduce some basic notations and assumptions. In Sect. 3 we define the extended critical cone and recall a second order sufficient optimality condition ensuring local quadratic growth of the objective function (1). This condition involves coercivity of the quadratic form associated with the Hamiltonian along the directions of the extended critical cone. In Sect. 4 we prove that for the local quadratic growth it suffices to require coercivity on the usual (not extended) critical cone, together with a Legendre-type condition. The main result—the sufficient conditions for SMsR—is formulated in Sect. 5, while the long Sect. 6 contains its proof.
2 Notations and Assumptions
First we recall some standard notations. The scalar product and the norm in the Euclidean space \(\mathbb {R}^n\) are defined in the usual way: \(\langle x, x' \rangle := x_1 x'_1 + \ldots + x_n x'_n\) and \(|x| = \sqrt{ \langle x,x \rangle }\) for any \(x = (x_1, \ldots , x_n) \in \mathbb {R}^n\) and \(x' = (x'_1, \ldots , x'_n) \in \mathbb {R}^n\). The elements of \(\mathbb {R}^n\) are regarded as column vectors, with the exception of the adjoint variables p and \(\lambda \) (to appear later), which are row vectors. For a function \(\psi :\mathbb {R}^k \rightarrow \mathbb {R}^r\) of the variable z we denote by \(\psi '(z)\) its derivative (Jacobian), represented by an \((r \times k)\)-matrix. For \(r=1\), \(\psi ''(z)\) denotes the second derivative (Hessian), represented by a \((k \times k)\)-matrix. For a function \(\psi :\mathbb {R}^{k} \times \mathbb {R}^{q} \rightarrow \mathbb {R}\) of the variables (z, v), \(\psi '(z,v)\) and \(\psi ''(z,v)\) still denote the first and the second derivatives with respect to (z, v); however, the partial derivatives are denoted by \(\psi _z\), \(\psi _v\), \(\psi _{zz}\), \(\psi _{zv}\) and \(\psi _{vv}\).
The space \(L^k = L^k([0,1],\mathbb {R}^r)\), with \(k = 1, 2\) or \(k = \infty \), consists of all (classes of equivalent) Lebesgue measurable r-dimensional vector functions defined on the interval [0, 1] for which the standard norm \(\Vert \cdot \Vert _k\) is finite. As usual, \(W^{1,1} = W^{1,1}([0,1],\mathbb {R}^r)\) denotes the space of absolutely continuous functions \(x:[0,1] \rightarrow \mathbb {R}^r\) whose first derivative belongs to \(L^1\). For convenience, the norm in \(W^{1,1}\) is defined as \(\Vert x \Vert _{1,1} := |x(0)| + \Vert \dot{x} \Vert _1\), so that \(\Vert x \Vert _\infty \le \Vert x \Vert _{1,1}\). The specification \(([0,1],\mathbb {R}^r)\) will be omitted if clear from the context.
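As a quick sanity check of the inequality \(\Vert x \Vert _\infty \le \Vert x \Vert _{1,1}\), one can evaluate both norms on a discretized concrete function; the following sketch uses the hypothetical example \(x(t)=\sin (2\pi t)\) and is purely illustrative:

```python
import numpy as np

# Numerical check, on a uniform grid, of ||x||_inf <= ||x||_{1,1} for the
# hypothetical example x(t) = sin(2*pi*t) on [0, 1]. Here ||x||_inf = 1 and
# ||x||_{1,1} = |x(0)| + ||x'||_1 = 0 + 4 (the total variation of one full
# sine period).

t = np.linspace(0.0, 1.0, 10_001)
x = np.sin(2 * np.pi * t)
dx = np.gradient(x, t)                             # approximate derivative

sup_norm = np.max(np.abs(x))                       # ||x||_inf
# trapezoidal rule for the L^1 norm of the derivative
l1_dx = np.sum(0.5 * (np.abs(dx[:-1]) + np.abs(dx[1:])) * np.diff(t))
w11_norm = abs(x[0]) + l1_dx                       # ||x||_{1,1}
print(sup_norm <= w11_norm)                        # the inequality holds
```
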
According to (3), the set of admissible control values is
Let \(G_i\) denote the ith component of the vector G. For any \(v\in U\) define the set of active indices
Assumption 2.1
(regularity of the control constraints) The set U is nonempty and at each point \(v\in U\) the gradients \(G_i'(v)\), \(i\in I(v)\) are linearly independent.
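For a concrete problem instance, Assumption 2.1 can be checked numerically at any given point of U by computing the rank of the matrix whose rows are the active gradients. The following sketch uses hypothetical constraint data (a triangle in \(\mathbb {R}^2\)), not data from this paper:

```python
import numpy as np

# Minimal sketch of verifying Assumption 2.1 at a given point (hypothetical
# constraint data): m = 2 controls, k = 3 constraints
#   G_1(u) = u_1 + u_2 - 1,  G_2(u) = -u_1,  G_3(u) = -u_2,
# so U = {u : G(u) <= 0} is the standard triangle in R^2.

def G(u):
    return np.array([u[0] + u[1] - 1.0, -u[0], -u[1]])

def G_jacobian(u):
    # row i is the gradient G_i'(u); here G is affine, so the rows are constant
    return np.array([[1.0, 1.0],
                     [-1.0, 0.0],
                     [0.0, -1.0]])

def active_gradients_independent(u, tol=1e-9):
    active = np.abs(G(u)) <= tol          # active index set I(u)
    rows = G_jacobian(u)[active]
    if rows.shape[0] == 0:
        return True                       # no active constraints at u
    return np.linalg.matrix_rank(rows) == rows.shape[0]

print(active_gradients_independent(np.array([1.0, 0.0])))  # vertex: G_1, G_3 active
print(active_gradients_independent(np.array([0.3, 0.3])))  # interior point
```
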
In the sequel we use the notation
Similarly, we denote \(\hat{w}=(\hat{x},\hat{u})\in \mathcal{W}\), \(\hat{q}=(\hat{x}(0),\hat{x}(1))\).
Assumption 2.2
The triplet \((\hat{w},\hat{p}, \hat{\lambda }) \in \mathcal{W}\times W^{1,1} \times L^\infty \) satisfies the following system of equations and inequalities:
Observe that this system represents the first order necessary optimality condition for a weak local minimum of the pair \(\hat{w}=(\hat{x},\hat{u})\) (see e.g. [14, part 1, section 18]); later on we refer to it as the optimality system. Namely, if \(\hat{w}\) is a point of weak local minimum in problem (1)–(3), then there exist \(\hat{p}\in W^{1,1}\) and \(\hat{\lambda }\in L^\infty \) such that the optimality system is fulfilled. Note that for a given \(\hat{w}\) the pair \((\hat{p},\hat{\lambda })\) is uniquely determined by these conditions. Indeed, the adjoint variable p is uniquely determined by the adjoint equation (6) and the transversality conditions (5), and then \(\hat{\lambda }\) is uniquely determined by equation (7) and the complementary slackness condition in (4), due to Assumption 2.1.
Introduce the Hamiltonian and the augmented Hamiltonian
Then equations (6) and (7) take the form
Notice that here and below, the dual variables p and \(\lambda \) are treated as row vectors, while x, u, w, f, and G are treated as column vectors.
3 Second-Order Sufficient Conditions for a Weak Local Minimum
Now we discuss the second-order sufficient conditions for a weak local minimum (references will be given at the end of Sect. 4). Set
Define the critical cone
It can easily be verified that \( F'(\hat{q} )q =0\), with \(q=(x(0),x(1))\), for any element \(w=(x,u)\) of the critical cone.
Indeed, let \(w\in K\). Then \(\dot{x}(t)=f'(\hat{w}(t))w(t)\) a.e. in [0, 1]. Multiplying this equation by \(\hat{p}(t)\), we get \(\hat{p}(t)\dot{x}(t) =\hat{p}(t)f_x(\hat{w}(t))x(t) + \hat{p}(t)f_u(\hat{w}(t))u(t)\) a.e. in [0, 1]. The equalities \(\hat{p}(t)f_x(\hat{w}(t))=-\dot{\hat{p}}(t)\) and \(\hat{p}(t)f_u(\hat{w}(t)) u(t)=0\) a.e. in [0, 1] give \(\hat{p}(t)\dot{x}(t) + \dot{\hat{p}}(t) x(t)=0\) a.e. in [0, 1]. Integrating this equation on [0, 1], we obtain \(\hat{p}(1)x(1)-\hat{p}(0)x(0) =0\). Using the transversality conditions (5), we get \(F_{x_0}(\hat{q})x(0)+F_{x_1}(\hat{q})x(1)=0\), q.e.d.
In many cases (in “smooth problems” of mathematical programming and the calculus of variations) it is sufficient for local minimality that the critical cone consists only of the zero element. However, this is not the case for optimal control problems with a control constraint of the type \(u(t)\in U\).
An equivalent definition of the critical cone is the following. Set
Then, due to (7).
We introduce an extension of the critical cone. For any \(\Delta >0\) and \(j=1,\ldots ,k\) we set
For any \(\Delta >0\) we set
Notice that the cones \(K_\Delta \) form a non-increasing family as \(\Delta \rightarrow 0+\). In particular, \(K \subset K_\Delta \) for any \(\Delta > 0\).
Define the quadratic form:
Assumption 3.1
There exist \(\Delta >0\) and \(c_\Delta >0\) such that
Remark 3.1
Assumption 3.1 is equivalent to the following: there exist \(\Delta >0\) and \(c_\Delta >0\) such that
Indeed, if \(w\in K_\Delta \), then \(\dot{x}(t)=f_x(\hat{w}(t))x(t)+f_u(\hat{w}(t))u(t)\) a.e. in [0, 1], whence
with some \(c>0\). The required equivalence follows.
Remark 3.2
Notice that if (14) is true for some \(\Delta >0\) and \(c_\Delta >0\), then it is true for any positive \(\Delta '<\Delta \) and the same \(c_\Delta \).
In the sequel we use the notations c, \(c'\), \(c''\), \(c_1\), \(c_2\), etc. for constants which may have different values in different estimations.
We recall the following theorem, first published in [15, 16] in a slightly different formulation.
Theorem 3.1
(sufficient second order condition) Let Assumptions 2.1, 2.2, and 3.1 be fulfilled. Then there exist \(\delta >0\) and \(c>0\) such that
for all admissible \(w=(x,u)\in W^{1,1}\times L^\infty \) such that \(\Vert w-\hat{w}\Vert _\infty <\delta \).
In the next section, we discuss an equivalent formulation of this theorem and then provide references to the literature where proofs can be found.
4 An Equivalent Form of the Second-Order Sufficient Condition for Local Optimality
In this section we show that Assumption 3.1 can be reformulated in terms of the critical cone K, instead of \(K_\Delta \), provided that an additional condition of Legendre type is fulfilled.
Let \((\hat{w},\hat{p}, \hat{\lambda }) \in \mathcal{W}\times W^{1,1} \times L^\infty \), and let Assumptions 2.1 and 2.2 hold.
Assumption 4.1
There exists \(c_0>0\) such that
Further, for any \(\Delta >0\) and any \(t\in [0,1]\), denote by \(\mathbb {C}_\Delta (t)\) the cone of all vectors \(v\in \mathbb {R}^m\) satisfying, for all \(j=1,\ldots ,k\), the conditions
For any \(\Delta >0\) and any \(j\in \{1,\ldots ,k\}\) we set
Clearly, \(\textrm{meas}\, m_\Delta \rightarrow 0\) as \(\Delta \rightarrow 0+\).
Assumption 4.2
(strengthened Legendre condition on \(m_\Delta \)). There exist \(\Delta >0\) and \(c_{\Delta }^L>0\) such that for a.a. \(t\in m_\Delta \) we have
Remark 4.1
Similarly as in Remark 3.2, if (18) is true for some \(\Delta >0\) and \(c_\Delta ^L>0\), then it is true for any positive \(\Delta '<\Delta \) and the same \(c_\Delta ^L\).
In the sequel, we often omit the argument t of x, u, \(\hat{x}\), \(\hat{u}\), etc.
The following lemma follows from the definition of \(\Omega \) in (13).
Lemma 4.1
Let \(w=(x,u)\in \mathcal{W}\), \(w'=(x',u')\in \mathcal{W}\). Then
where
Moreover, there exists a constant c, independent of w and \(w'\), such that
Henceforth, for \(w=(x,u)\in \mathcal{W}\) we set
It is clear that \(\gamma _0( w)\le \gamma ( w)\), and, as shown in Remark 3.1, if \(\dot{x}=f_w(\hat{w})w\), then there exists \(c>0\), independent of w, such that
Proposition 4.1
Assumptions 4.1 and 4.2 imply Assumption 3.1.
Proof
Let Assumptions 4.1 and 4.2 hold with some \(c_0>0\), \(\Delta >0\) and \(c_{\Delta }^L>0\), where \(\Delta \) will be fixed later as small enough, see Remark 4.1. Set
Note that \(\alpha (\Delta ) \rightarrow 0+\) as \(\Delta \rightarrow 0+\). We may assume that \(\Delta \) is so small that \(\alpha (\Delta )\le 1\).
Let \(\tilde{w} \in K_\Delta \). Set
where \(\chi _{m_\Delta }\) is the characteristic function of the set \(m_\Delta \). Obviously, \(u'(t)\in \mathbb {C}_\Delta (t)\) a.e. on [0, 1] and, therefore,
Hence,
Let \(x'\) be the solution to the equation
Then
Hence,
Set
Since \(x'(0)=0\), we have
Obviously,
Using the estimate (20) in Lemma 4.1, Assumptions 4.1, 4.2, and the third relation in (23), we obtain the inequality
We consecutively estimate
where \(c'\) and \(c''\) are appropriate constants. Using these relations and (22) in (24), we obtain that
with some constant \(c'''\). Take \(\Delta >0\) such that
keeping the same constant \(c_{\Delta }^L\) (see Remark 4.1). Then
which completes the proof, since \(c_\Delta \) is independent of \(\tilde{w} \in K_{\Delta }\). \(\square \)
The converse is also true.
Proposition 4.2
Assumption 3.1 implies Assumptions 4.1 and 4.2.
Proof
Let Assumption 3.1 be fulfilled, i.e., there exist \(\Delta >0\) and \(c_\Delta >0\) such that
According to Remark 3.2, one may fix \(\Delta > 0\) arbitrarily small without changing \(c_\Delta \), which will be done below.
Since \(K\subset K_\Delta \), this inequality holds also on K, therefore Assumption 4.1 is fulfilled.
Let us prove that Assumption 4.2 is also fulfilled. Take any \(u\in L^\infty \) satisfying the conditions
where \(\chi _{m_\Delta }\) is the characteristic function of the set \(m_\Delta \). Define x by the conditions
Set \(w=(x,u)\). Then, obviously, \(w\in K_\Delta \), whence it follows that
Moreover,
where \(\alpha (\Delta )\) is defined in (21). The latter implies that
with some \(c'>0\). Using these estimates and (13), we get
Take any \(\Delta >0\) such that
Then we have
This inequality holds for any \(u\in L^\infty \) satisfying (25). The strengthened Legendre condition on \(m_\Delta \) follows. \(\square \)
Thus, instead of Assumption 3.1 we can use Assumptions 4.1 and 4.2 in the sufficient second-order conditions of Theorem 3.1.
The connection between the strengthened Legendre condition and the so-called “local quadratic growth of the Hamiltonian” (defined below) was studied in [4]. Let us formulate the corresponding result from [4], which may be useful for the problem under consideration.
Definition 4.1
We say that the local quadratic growth condition of the Hamiltonian is fulfilled if there exist \(c_H>0\), \(\delta >0\) and \(\Delta >0\) such that for a.a. \(t\in m_\Delta \) we have
for all \(u\in \mathbb {R}^m\) such that \(G(u)\le 0\) and \(|u-\hat{u}(t)| <\delta \).
Proposition 4.3
[4] Assumption 4.2 implies the local quadratic growth condition of the Hamiltonian.
The converse is not true. As shown in [4], the condition of the local quadratic growth of the Hamiltonian is somewhat finer than Assumption 4.2.
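Since the local quadratic growth condition of Definition 4.1 is a pointwise inequality, it can be verified directly in simple situations; the following sketch checks the growth inequality for a hypothetical scalar Hamiltonian (toy data, not related to the problem above):

```python
import numpy as np

# Toy pointwise check (hypothetical data) of the local quadratic growth
# condition of Definition 4.1 at a fixed t: suppose the Hamiltonian, as a
# function of the control alone, is H(u) = u**2 + u**4, with u_hat = 0 and
# feasible set {u : G(u) = -u <= 0}. We verify
#   H(u) >= H(u_hat) + c_H * |u - u_hat|**2
# with c_H = 1 for feasible u satisfying |u - u_hat| < delta = 0.5.

H = lambda u: u**2 + u**4
u_hat, c_H, delta = 0.0, 1.0, 0.5
grid = np.linspace(0.0, delta, 1001)      # feasible controls near u_hat
growth_ok = bool(np.all(H(grid) >= H(u_hat) + c_H * (grid - u_hat)**2))
print(growth_ok)
```

Here the inequality holds with \(c_H = 1\) because \(H(u)-H(0)-u^2 = u^4 \ge 0\) on the feasible set.
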
There is the following more subtle second-order sufficient condition for a weak local minimum at the point \(\hat{w}\) in problem (1)–(3).
Theorem 4.1
(sufficient second order condition) Let Assumptions 2.1, 2.2, and 4.1 hold and the local quadratic growth condition of the Hamiltonian be satisfied. Then there exist \(\delta >0\) and \(c>0\) such that
for all admissible \(w=(x,u)\in W^{1,1}\times L^\infty \) such that \(\Vert w-\hat{w}\Vert _\infty <\delta \).
A sufficient second order condition of this type for a much more general optimal control problem (together with the corresponding second order necessary condition) was first published by the first author back in 1978 in [12]. A relatively simple proof of Theorem 4.1 in the case of \(k=1\) was recently published in [19]. Proofs of much more general results of this type can be found, for example, in [17] and [18].
5 Strong Metric Subregularity
In this section we formulate the main result of this paper. Namely, we prove that the optimality mapping associated with problem (1)–(3) is strongly metrically subregular at a reference solution \((\hat{w},\hat{p},\hat{\lambda }) = (\hat{x},\hat{u},\hat{p},\hat{\lambda }) \in \mathcal{W}\times W^{1,1} \times L^\infty \) of the optimality system (4)–(9), provided that Assumptions 2.1, 2.2 and 3.1 hold.
In the sequel, for \(w=(x,u)\in \mathcal{W}\) we set
Consider the perturbed system of optimality conditions (4)–(9):
where \(p \in W^{1,1}\), \(\lambda \in L^\infty \), \(\nu \in \mathbb {R}^{2n}\), \(\pi \in L^1\), \(\rho \in L^\infty \), \(\xi \in L^1\), \(\eta \in L^\infty \). Note that \(\nu \), \(\pi \), and \(\rho \) are treated as row vectors, while \(\xi \) and \(\eta \) are treated as column vectors. Below we set
Theorem 5.1
Let Assumptions 2.1, 2.2, and 3.1 be fulfilled. Then there exist reals \(\delta >0\) and \(\kappa > 0\) such that if
then for any solution \((x,u,p,\lambda )\) of the perturbed system (27)–(32) such that \(\Vert \Delta w\Vert _\infty \le \delta \) the following estimates hold:
Observe that if the disturbance \(\eta \) is not present in the perturbed optimality system (27)–(32), that is, \(\eta = 0\), then inequality (34) follows (modulo a multiplicative constant) from the assumption \(\Vert \Delta w \Vert _\infty \le \delta \), together with equations (28)–(31). Therefore, in this case the claim of the theorem is valid without assuming (34). Moreover, in this case two metrics are needed in Definition 1.1 of SMsR only in the space \(Y := W^{1,1} \times L^\infty \times W^{1,1} \times L^\infty \). The neighborhood \(B_Y\) in Definition 1.1 is \(B_Y := \{ (w,p,\lambda ) \, :\;\Vert w - \hat{w}\Vert _\infty \le \delta \}\), while the metric \(d_Y\) is induced by the norm \(\Vert (w,p,\lambda ) \Vert := \Vert x\Vert _{1,1} + \Vert p\Vert _{1,1} + \Vert u\Vert _2 + \Vert \lambda \Vert _2\). The metric in Z is induced by the norm \(\Vert \omega \Vert \) in (33).
6 Proof of Theorem 5.1
1. We start with the following auxiliary statement related to the constraint \(G(u)\le 0.\) Let
be a nonempty set of indices, and let \(G_I(v)\) be a column vector with elements \(G_{i_1}(v), \ldots , G_{i_s}(v) \). Set
where B is a fixed closed ball in \(\mathbb {R}^m\). Then, according to Assumption 2.1,
For any \(\varepsilon >0\), we set
Lemma 6.1
There exist positive numbers \(\hat{c}\) and \(\hat{\varepsilon }\) such that
Proof
Since there is only a finite number of subsets \(I \subset \{ 1, \ldots , k\}\), it is enough to prove the lemma for a fixed I. If the statement is false, then there exists a sequence \(v_s \in B\) such that \(G_I(v_s) \rightarrow 0\) as \(s \rightarrow \infty \) and \(\mu _I(v_s) \le s^{-1}\). Without loss of generality we may assume that \(v_s\) converges to some vector \(v\in B\). Then \(G_I(v)=0\) and \(\mu _I(v)=0\), a contradiction. \(\square \)
Since G is uniformly continuous on the compact set B, there exists \(\hat{\delta }>0\) such that
Decreasing, if necessary, \(\hat{\delta }\), we can assume that \(\hat{\delta }\le \hat{\varepsilon }\).
2. We analyze conditions (27)–(32). Take any \(\delta >0\) such that \(\delta \le \hat{\delta }\). Suppose that a collection \((\nu ,\pi ,\rho ,\xi ,\eta )\) satisfies condition (34) and that there exists a solution \((x,u,p,\lambda )\) of the perturbed system (27)–(32) such that \(\Vert \Delta w\Vert _\infty \le \delta \). Consider this solution. It is clear that \(\Vert w\Vert _\infty \) is bounded (that is, \(\Vert w\Vert _\infty \le C\), where \(C>0\) does not depend on w) and that \(\Vert \omega \Vert \le \delta \).
Further, note that \(\Vert p\Vert _{1,1}\) is bounded due to conditions (28) and (29) and also because \(\Vert w\Vert _\infty \), \(|\nu |\) and \(\Vert \pi \Vert _1\) are bounded. Therefore, \(\Vert \Delta p\Vert _{1,1} \) is also bounded. Moreover, the following is true.
Proposition 6.1
The norms \(\Vert \lambda \Vert _\infty \) and \(\Vert \Delta \lambda \Vert _\infty \) are bounded.
Proof
For the ball appearing in Part 1 of the proof we choose \(B := \{v \in \mathbb {R}^m \, :\;|v| \le \Vert \hat{u} \Vert _\infty + \delta \}\). Consider equation (30):
We assume that \(\lambda \ne 0\), otherwise the claims of the proposition are obvious. Set
Then \(\textrm{meas}\, M(\lambda )>0\). For any \(t\in M(\lambda )\) we set
Let \(t\in M(\lambda )\). The complementary slackness conditions
imply that \(G_i(u(t))-\eta _i(t)=0\) for all \(i\in I(t)\), and then \(|G_i(u(t))|=|\eta _i(t)|\) for all \(i\in I(t)\). Therefore, by virtue of (34),
Since \(\delta \le \hat{\delta }\), we obtain
Here \(G_{I(t)}\) and \(Q_{I(t),\hat{\delta }}\) are defined similarly to \(G_{I}\) and \(Q_{I,\hat{\delta }}\) in Part 1 of the proof. Hence, by Lemma 6.1, and since \(\hat{\delta }\le \hat{\varepsilon }\),
where
Obviously, \( \lambda (t) G'(u(t))= \lambda _{I(t)}(t)G'_{I(t)}(u(t)) \) for a.a. \(t\in M(\lambda )\), and, therefore,
(Note that the dimensions of the vector \(\lambda _{I(t)}(t)\) and the matrices \(G'_{I(t)}(u(t))\) and \( A_{I(t)}(u(t))\) depend on t.) Multiplying this equation by the transposed matrix \((G'_{I(t)}(u(t)))^*\) on the right, we get
Then
for a.a. \(t\in M(\lambda )\). Since here all matrices are essentially bounded and \(|\lambda (t)|=|\lambda _{I(t)}(t)|\) for a.a. \(t\in M(\lambda )\), we obtain the estimate
with some \(C>0\), and therefore,
Since \(\Vert p\Vert _\infty \) is bounded and \(\Vert \rho \Vert _\infty \le \delta \), we obtain that \(\Vert \lambda \Vert _\infty \) is bounded. Hence \(\Vert \Delta \lambda \Vert _\infty \) is also bounded. \(\square \)
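The algebraic step in this proof, recovering \(\lambda _{I(t)}(t)\) from the product \(\lambda _{I(t)}(t)G'_{I(t)}(u(t))\) by multiplying on the right with the transposed Jacobian, is the usual right pseudo-inverse construction for a full-row-rank matrix; a hypothetical numerical sketch:

```python
import numpy as np

# Hypothetical sketch of the algebraic step in the proof of Proposition 6.1:
# if the rows of Gp (standing in for the active-constraint Jacobian G'_I(u))
# are linearly independent, then the row vector lam can be recovered from
# r = lam @ Gp via
#     lam = r @ Gp.T @ inv(Gp @ Gp.T),
# since Gp @ Gp.T is then an invertible Gram matrix; this also yields the
# bound |lam| <= C * |r| used to show boundedness of the multipliers.

Gp = np.array([[1.0, 0.0, 2.0],
               [0.0, 1.0, -1.0]])        # 2 active constraints, m = 3, full row rank
lam = np.array([3.0, -1.5])              # a hypothetical multiplier (row vector)

r = lam @ Gp                             # what the optimality condition provides
A = Gp @ Gp.T                            # invertible Gram matrix
lam_rec = r @ Gp.T @ np.linalg.inv(A)    # recovered multiplier

print(np.allclose(lam_rec, lam))
```
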
3. Further, subtracting (8) from (31) we obtain that
It follows that
with some \(L>0\), where
Using the Grönwall inequality, we get
with some \(C>0\). In what follows we use a coarser estimate. Namely, since \(\Vert \Delta u\Vert _1\le \Vert \Delta u\Vert _2\) and \(\Vert \xi \Vert _1\le \Vert \omega \Vert \), we have
Consequently,
Clearly, relation (36) implies
As usual, for \(\varepsilon \in \mathbb {R}_+\), the symbol \(O(\varepsilon )\) means that there exists a constant \(C>0\), independent of \(\varepsilon \), such that \(|O(\varepsilon )|\le C|\varepsilon |\) as \(\varepsilon \rightarrow 0+\), and the symbol \(o(\varepsilon )\) means that \(o(\varepsilon )/\varepsilon \rightarrow 0\) as \(\varepsilon \rightarrow 0+\). We use the symbols \(O(\varepsilon )\) and \(o(\varepsilon )\) for quantities taking values in \(\mathbb {R}\) or in \(\mathbb {R}^n\). Moreover, throughout the paper, the functions O and o may depend directly on \(\Delta w\), not only on the norms appearing as arguments in place of \(\varepsilon \). However, the “smallness” with respect to the arguments of O and o will be uniform in \(\Delta w\) satisfying \(\Vert \Delta w \Vert _\infty \le \delta \). For example, \(O(|\Delta w|^2)\) in (40), which is a shortening of \(O(|\Delta w(t)|^2)\), means that there exists a constant C such that \(|O(|\Delta w(t)|^2)| \le C |\Delta w(t)|^2\) for all \(\Delta w\) satisfying \(\Vert \Delta w \Vert _\infty \le \delta \) and for a.e. \(t \in [0,1]\). Similarly, \(o(\gamma (\Delta w))\), appearing later, means that \(o(\gamma (\Delta w))/\gamma (\Delta w) \rightarrow 0\) as \(\gamma (\Delta w) \rightarrow 0\), uniformly with respect to \(\Delta w\) satisfying \(\Vert \Delta w \Vert _\infty \le \delta \).
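The Grönwall step used in part 3 (and again in part 5) can be illustrated on a hypothetical scalar example: if \(|\Delta \dot{x}(t)| \le L|\Delta x(t)| + |\xi (t)|\) on [0, 1], then \(\Vert \Delta x\Vert _\infty \le e^L\big (|\Delta x(0)| + \Vert \xi \Vert _1\big )\). A minimal numerical sketch, with all data hypothetical:

```python
import numpy as np

# Hypothetical scalar illustration of the Gronwall estimate:
# if |dx/dt| <= L*|x| + |xi(t)| on [0, 1], then
#     ||x||_inf <= e**L * (|x(0)| + ||xi||_1).
# We integrate the worst case dx/dt = L*x + xi(t) with explicit Euler and
# compare both sides (discretization error is negligible at this step size).

L = 2.0
N = 100_000
h = 1.0 / N
x = 0.1                                   # x(0)
xi = lambda t: np.sin(10.0 * t)           # a fixed perturbation
sup_x, xi_l1 = abs(x), 0.0
for k in range(N):
    t = k * h
    x += h * (L * x + xi(t))
    sup_x = max(sup_x, abs(x))            # running sup norm of x
    xi_l1 += h * abs(xi(t))               # running L^1 norm of xi

gronwall_bound = np.exp(L) * (0.1 + xi_l1)
print(sup_x <= gronwall_bound)
```
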
4. Subtracting (5) from (28) we obtain
hence,
This implies that
with some \(C>0\). Multiplying (41) by \(\Delta q=(\Delta x(0),\Delta x(1))\), we obtain
5. Subtracting (6) from (29) we obtain
Using the Grönwall inequality and the inequality \(\Vert \Delta u\Vert _1\le \Vert \Delta u\Vert _2\) we get
with some \(c>0\). Using (38), (39), (42) in this inequality, and also taking into account the definition of \(\Vert \omega \Vert \), we obtain
with some \(C>0\). Moreover, since \(\Vert \Delta w\Vert _\infty \le \delta \) and \(\Vert \omega \Vert \le \delta \), we also get
Further, we have
Therefore, relation (44) implies
6. Next we analyze condition (30). Subtracting (7) from (30), we obtain
Consequently,
From here
Here,
Therefore,
Since \(\bar{H}=H+\lambda G\),
Using this equality and the boundedness of \(\Vert \Delta \lambda \Vert _\infty \) and \(\Vert \Delta w\Vert _\infty \), we estimate
with some \(C>0\).
In the next paragraphs, we shall utilize Assumption 2.1 and Lemma 6.1 to establish, for a.e. \(t \in [0,1]\), the estimate
with some \(C'>0\).
Set
If \(\textrm{meas}\, M(\Delta \lambda ) = 0\) the estimate is trivial, therefore we assume that \(\textrm{meas}\, M(\Delta \lambda )>0\). For any \(t\in M(\Delta \lambda )\), we set
Let \(\Delta \lambda _{J(t)}(t)\) be a row vector, composed of all nonzero components of \(\Delta \lambda (t)\), and let \(G_{J(t)}\) be a column vector with the components \(G_j\) for all \(j\in J(t)\). Then, obviously,
Let \( t \in M(\Delta \lambda )\), \(j\in J(t)\). If \( \lambda _j(t)>0\), then, by the complementary slackness condition in (27), we have \(G_j(u(t))=\eta _j(t)\), and hence, \(|G_j(u(t))| \le \hat{\varepsilon }\) since \(\Vert \eta \Vert _\infty \le \delta \le \hat{\delta }\le \hat{\varepsilon }\).
If \( \lambda _j(t)=0\), then \(\hat{\lambda }_j(t)>0\), and then, by the complementary slackness condition in (4), we have \(G_j(\hat{u}(t))=0\). But then, since \(\Vert u-\hat{u}\Vert _\infty \le \hat{\delta }\), by condition (35) we again have \(|G_j(u(t))| \le \hat{\varepsilon }\).
Thus, for all \(j\in J(t)\) we have \(|G_j(u(t))| \le \hat{\varepsilon }\). This implies that
where the set \(Q_{J(t),\hat{\varepsilon }}\) is defined similarly to the set \(Q_{I,\varepsilon }\) and the ball B is defined as at the beginning of the proof of Proposition 6.1. By Lemma 6.1, it follows that
where
Let
According to (50) and the second equality in (52) we have
for a.a. \(t\in M(\Delta \lambda )\). Consequently,
hence,
This equality, the inequality in (53), and the equality \(|\Delta \lambda (t)|=|\Delta \lambda _{J(t)}(t)|\), satisfied for a.a. \(t\in M(\Delta \lambda )\), imply estimate (51).
Estimate (51) together with the inequalities \(\Vert \Delta w\Vert _\infty \le \delta \), (34), and (47) imply
with some \(C>0\). In addition, from (38), (46), and (51) it follows that
with some \(C>0\).
7. Next, we estimate \(\Omega (\Delta w)\). Multiplying (48) by \(\Delta x\), we get
Further, since
and \(\Vert \Delta \lambda \Vert _\infty \) is bounded, relation (49) implies
Multiplying this relation by \(\Delta u\), we get
Adding equalities (56) and (57), we get
Further, we have
Moreover,
Consequently,
Integrating this equality over the segment [0,1], we obtain
Integrating by parts the first integral on the left side of this equality and applying (43), we get
Substituting this expression into the previous equality and taking into account definition (13) of \(\Omega \), we get
Notice that
Using this equality and equality (40) in equality (58), we obtain
According to (47), we have \(\Vert \Delta p\Vert _\infty \le 2C\delta \). Therefore,
with some \(c>0\). Similarly,
In addition, in view of (54),
with some \(c>0\). Hence, (59) gives
with some \(C>0\).
8. Now we estimate the first term
in the right-hand side of inequality (63). Let us fix \(j\in \{1,\ldots ,k\}\) and consider the term
We use conditions (4), (9), (27), and (32). If \(\Delta \lambda _j=0\), then this term is equal to zero. Therefore, we assume that the set
has a positive Lebesgue measure.
8.1. Consider the set
A.e. on this set we have
Then, by the complementary slackness condition in (4), \(G_j(\hat{u})=0\). In this case, the condition \(G_j(u)\le \eta _j\) yields \( G'_j(\hat{u})\Delta u +O(|\Delta u|^2)\le \eta _j\), whence, multiplying by \(-\Delta \lambda _j>0\), we get
8.2. Consider the set
Then, by the complementary slackness condition in (27), a.e. on this set we have
(a) Let also \(G_j(\hat{u})=0\). Then
Multiplying this equality by \(-\Delta \lambda _j\), we get
(b) Let now \(G_j(\hat{u})<0\). Then, by the complementary slackness condition in (4), we have \(\hat{\lambda }_j=0\), and then \(\Delta \lambda _j=\lambda _j>0.\)
Again, by the complementary slackness condition (but now in (27)), we have \( G_j(u) =\eta _j\), which implies
Multiplying this equality by \(-\Delta \lambda _j<0\), we get
Since \( -\Delta \lambda _j \cdot G_j(\hat{u})>0\), we obtain
Consequently, inequality (64) holds a.e. on the set \( M(\Delta \lambda _j)\), and then it holds a.e. on [0, 1]. This implies that
Recall that according to (54), \(\Vert \Delta \lambda \Vert _\infty \le C\delta \). Therefore,
with some \(C'>0\). This and (65) imply
If \(\Delta \lambda _j=0\), then this equality also holds. Thus, it is true for all \(j=1,\ldots ,k\). Consequently,
This and inequality (63) imply
with some \(c>0\). Using now the inequality \(\Vert \eta \Vert _2\le \Vert \omega \Vert \), we obtain from this that
9. Let \(\Delta >0\) be as in Assumption 3.1. In order to apply this assumption, we pass, with the help of (31) and (32), from the element \(\Delta w\) to an element \(\delta w\in K_\Delta \), using a “small correction” \(w' =\delta w-\Delta w\).
First we use the condition \(G(u)\le \eta .\) Let \(j\in \{1,\ldots ,k\}\). We recall the notations \(M_j := \{t \in [0,1] \, :\;G_j(\hat{u}(t)) = 0 \}\) and \( M_\Delta ^+(\hat{\lambda }_j) := \{ t \in [0,1] \, :\;\hat{\lambda }_j(t) > \Delta \}\) used in the definition (12) of the cone \(K_\Delta \). Set
Then
Since \(G_j(u)\le \eta _j\) and \(G_j(\hat{u})=0\) a.e. on \(M_j\), and since \(M_\Delta (\hat{\lambda }_j)\subset M_j\), we obtain that
Now we use the complementary slackness condition in (27). According to this condition, we have \(\lambda _j (G_j(u) -\eta _j)=0.\) Using (54), we get
whenever \(C\,\delta <\Delta .\) Let \( \delta >0\) be so small that this condition is fulfilled. Then, it follows that \(G_j(u) =\eta _j\) a.e. on \( M^+_\Delta (\hat{\lambda }_j)\). Since \(G_j(\hat{u})=0\) on \(M_j\), we get
By virtue of Assumption 2.1, relations (68) and (69) imply that there exists \( u'\) such that for all \(j\in \{1,\ldots ,k\}\) we have
with some \(c>0\), and, therefore,
Here we use \(\Vert \eta \Vert _1\le \Vert \eta \Vert _2\le \Vert \omega \Vert \). Moreover, due to (72) and since \(\Vert \Delta u\Vert _\infty \le \delta \), the product of functions \(|\Delta u|\cdot |u'|\) satisfies the estimate
with some \(c'>0\), and also by virtue of (72) for the function \(| u'|^2\) we have the estimate
with some \(c>0\) and \(c'>0\).
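In summary, the sign argument of this step can be displayed as follows: for a.e. \(t\in M^+_\Delta (\hat{\lambda }_j)\),
\[
\lambda _j(t)=\hat{\lambda }_j(t)+\Delta \lambda _j(t)> \Delta -\Vert \Delta \lambda \Vert _\infty \ge \Delta -C\delta >0,
\]
so the complementary slackness condition \(\lambda _j (G_j(u) -\eta _j)=0\) in (27) forces \(G_j(u)=\eta _j\) a.e. on \( M^+_\Delta (\hat{\lambda }_j)\), provided that \(C\delta <\Delta \).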
10. Set
There exists \(\delta x\in W^{1,1}\) such that
Recall that by (40)
Then \(\delta x= \Delta x+ x'\), where \( x'\) satisfies
This and (73) imply the following estimate
with some \(c>0\) and \(c'>0\). Set \( w'=( x', u')\). Then \(\delta w=\Delta w+ w'\). Due to (70) and (71), it is easy to verify that
and hence, by Assumption 3.1 (see also Remark 3.1),
11. Let us compare \(\Omega (\delta w)\) with \(\Omega (\Delta w)\). According to Lemma 4.1, we have
where
According to the above estimates (72)-(75), and (77) (we replace \(c'\) with c, taking the maximum of these two constants as the new c), we have
This implies that
with some \(c_\Omega > 0\), where (provided that \(\delta > 0\) is sufficiently small)
12. Let us compare \(\gamma (\delta w)\) with \(\gamma (\Delta w)\). We have
where
Here
with some \(c>0\). This implies that
with some \(c_r>0\). All these terms are contained in the estimate (80) for \(|E(\Delta w,w')|\). Consequently,
with some \(c_\gamma >0\).
13. Inequality (78) along with relations (79) and (82) implies the inequality
whence
Using estimates (81) and (83) in this inequality, we get
14. Combining inequality (67) with (84) we get
Consequently,
Substituting the expression for \(\, R_\delta (\Delta w,\omega )\) in this inequality, we obtain that
where \(\tilde{c}=c_\Delta c_\gamma +c_\Omega \). Then
Take \(\delta >0\) so small that \(c_\Delta ' := c_\Delta -\tilde{c}\,\delta -c\,\delta >0\). Then
Moreover, according to (55), we have
Using these relations in (85) together with the definition \( \Vert \omega \Vert :=|\nu |+ \Vert \pi \Vert _1+\Vert \rho \Vert _2+ \Vert \xi \Vert _1+\Vert \eta \Vert _2\) and taking into account the inequalities \(|\Delta x_0|\le |\Delta q|\le 2 \Vert \Delta x\Vert _\infty \), we get
with some \( c_\Delta ''>0\) provided that \(\delta >0\) is small enough. Set \(z=|\Delta x_0|+\Vert \Delta u\Vert _2\), \(y=\Vert \omega \Vert \). Since \(|\Delta x_0|^2+\Vert \Delta u\Vert _2^2\ge \frac{1}{2} z^2\), we obtain
where \(a=c_\Delta ''/2\). This implies that
Consequently, \( b (|\Delta x_0|+\Vert \Delta u\Vert _2)\le \Vert \omega \Vert ,\) or equivalently,
where \(c_1=1/b\). Then relations (38), (46), and (55) imply
with some \(c_2>0\), \(c_3>0\), and \(c_4>0\). The theorem is proved.
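The purely algebraic step used at the end of the proof rests on the elementary inequality \(p^2+q^2\ge \frac{1}{2}(p+q)^2\), applied with \(p=|\Delta x_0|\) and \(q=\Vert \Delta u\Vert _2\). A quick numerical sanity check with arbitrary sampled values (not data from the problem):

```python
import numpy as np

# Check p**2 + q**2 >= (p + q)**2 / 2, which follows from (p - q)**2 >= 0;
# here p plays the role of |Delta x_0| and q of ||Delta u||_2.
rng = np.random.default_rng(0)
p, q = rng.uniform(0.0, 10.0, size=(2, 10_000))
z = p + q
print(bool(np.all(p**2 + q**2 >= 0.5 * z**2 - 1e-9)))  # → True
```

Once a bound that is quadratic in \(z=|\Delta x_0|+\Vert \Delta u\Vert _2\) and linear in \(y=\Vert \omega \Vert \) is available, dividing by \(z>0\) yields \(b\,z\le y\), which is how the final estimate \(b\,(|\Delta x_0|+\Vert \Delta u\Vert _2)\le \Vert \omega \Vert \) is extracted.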
Notes
This means that \(J(\hat{x},\hat{u}) \le J(x,u)\) for every admissible pair (x, u) which is close enough to \((\hat{x}, \hat{u})\) in the space \(\mathcal{W}\).
Funding
Open access funding provided by Austrian Science Fund (FWF).
This research is supported by the Austrian Science Foundation (FWF) under grant P 31400-N32.
Osmolovskii, N.P., Veliov, V.M. On the Strong Subregularity of the Optimality Mapping in an Optimal Control Problem with Pointwise Inequality Control Constraints. Appl Math Optim 87, 43 (2023). https://doi.org/10.1007/s00245-022-09959-9