
Continuity/constancy of the Hamiltonian function in a Pontryagin maximum principle for optimal sampled-data control problems with free sampling times

Original Article · Mathematics of Control, Signals, and Systems

Abstract

In a recent paper by Bourdin and Trélat, a version of the Pontryagin maximum principle (in short, PMP) has been stated for general nonlinear finite-dimensional optimal sampled-data control problems. Unfortunately, their result is only concerned with fixed sampling times, and thus, it does not take into account the possibility of free sampling times. The present paper aims to fill this gap in the literature. Precisely, we establish a new version of the PMP that can handle free sampling times. As in the aforementioned work by Bourdin and Trélat, we obtain a first-order necessary optimality condition written as a nonpositive averaged Hamiltonian gradient condition. Furthermore, from the freedom of choosing sampling times, we get a new and additional necessary optimality condition which happens to coincide with the continuity of the Hamiltonian function. In an autonomous context, even the constancy of the Hamiltonian function can be derived. Our proof is based on the Ekeland variational principle. Finally, a linear–quadratic example is numerically solved using shooting methods, illustrating the possible discontinuity of the Hamiltonian function in the case of fixed sampling times and highlighting its continuity in the instance of optimal sampling times.


References

  1. Ackermann JE (1985) Sampled-data control systems: analysis and synthesis, robust system design. Springer, Berlin

  2. Åström KJ (1963) On the choice of sampling rates in optimal linear systems. Engineering Studies, IBM Research

  3. Åström KJ, Wittenmark B (1997) Computer-controlled systems. Prentice Hall, Upper Saddle River

  4. Bakir T, Bonnard B, Bourdin L, Rouot J (2019) Pontryagin-type conditions for optimal muscular force response to functional electrical stimulations. Accepted for publication in J Optim Theory Appl

  5. Bergounioux M, Bourdin L (2019) Pontryagin maximum principle for general Caputo fractional optimal control problems with Bolza cost and terminal constraints. Accepted for publication in ESAIM Control Optim Calc Var

  6. Bini E, Buttazzo GM (2014) The optimal sampling pattern for linear control systems. IEEE Trans Autom Control 59(1):78–90

  7. Biryukov RS (2016) Generalized \(H_\infty \)-optimal control of linear continuous-discrete plant. Avtomat i Telemekh 77(3):33–51

  8. Boltyanski VG, Poznyak AS (2012) The robust maximum principle: theory and applications. Systems & Control: Foundations & Applications. Birkhäuser/Springer, New York

  9. Boltyanskii VG (1978) Optimal control of discrete systems. Wiley, New York–Toronto

  10. Bourdin L (2016) Note on Pontryagin maximum principle with running state constraints and smooth dynamics—proof based on the Ekeland variational principle. Research notes, hal-01302222

  11. Bourdin L, Trélat E (2013) Pontryagin maximum principle for finite dimensional nonlinear optimal control problems on time scales. SIAM J Control Optim 51(5):3781–3813

  12. Bourdin L, Trélat E (2015) Pontryagin maximum principle for optimal sampled-data control problems. In: Proceedings of the IFAC workshop CAO

  13. Bourdin L, Trélat E (2016) Optimal sampled-data control, and generalizations on time scales. Math Control Relat Fields 6(1):53–94

  14. Bourdin L, Trélat E (2017) Linear–quadratic optimal sampled-data control problems: convergence result and Riccati theory. Automatica 79:273–281

  15. Chen T, Francis B (1996) Optimal sampled-data control systems. Springer, London

  16. Dmitruk AV, Kaganovich AM (2011) Maximum principle for optimal control problems with intermediate constraints. Comput Math Model 22(2):180–215. Translation of Nelineĭnaya Din Upr 6(2008):101–136

  17. Ekeland I (1974) On the variational principle. J Math Anal Appl 47:324–353

  18. Fadali MS, Visioli A (2013) Digital control engineering: analysis and design. Elsevier, Amsterdam

  19. Fattorini HO (1999) Infinite-dimensional optimization and control theory. Encyclopedia of Mathematics and Its Applications, vol 62. Cambridge University Press, Cambridge

  20. Grasse KA, Sussmann HJ (1990) Global controllability by nice controls. In: Nonlinear controllability and optimal control. Monogr Textbooks Pure Appl Math, vol 133. Dekker, New York, pp 33–79

  21. Grüne L, Pannek J (2017) Nonlinear model predictive control: theory and algorithms, 2nd edn. Communications and Control Engineering Series. Springer, Cham

  22. Haberkorn T, Trélat E (2011) Convergence results for smooth regularizations of hybrid nonlinear optimal control problems. SIAM J Control Optim 49(4):1498–1522

  23. Halkin H (1966) A maximum principle of the Pontryagin type for systems described by nonlinear difference equations. SIAM J Control 4(1):90–111

  24. Holtzman JM, Halkin H (1966) Discretional convexity and the maximum principle for discrete systems. SIAM J Control 4:263–275

  25. Landau ID, Zito G (2006) Digital control systems: design, identification and implementation. Springer, London

  26. Levis AH, Schlueter RA (1971) On the behaviour of optimal linear sampled-data regulators. Int J Control 13(2):343–361

  27. Margaliot M (2006) Stability analysis of switched systems using variational principles: an introduction. Autom J IFAC 42(12):2059–2077

  28. Melzer SM, Kuo BC (1971) Sampling period sensitivity of the optimal sampled data linear regulator. Autom J IFAC 7:367–370

  29. Pontryagin LS, Boltyanskii VG, Gamkrelidze RV, Mishchenko EF (1962) The mathematical theory of optimal processes. Wiley, New York

  30. Santina MS, Stubberud AR (2005) Basics of sampling and quantization. In: Handbook of networked and embedded control systems. Control Engineering. Birkhäuser, Boston, pp 45–69

  31. Schlueter RA (1973) The optimal linear regulator with constrained sampling times. IEEE Trans Autom Control AC-18(5):515–518

  32. Schlueter RA, Levis AH (1973) The optimal linear regulator with state-dependent sampling. IEEE Trans Autom Control AC-18(5):512–515

  33. Sethi SP, Thompson GL (2000) Optimal control theory: applications to management science and economics, 2nd edn. Kluwer Academic Publishers, Boston

  34. Sussmann HJ (1999) A maximum principle for hybrid optimal control problems. In: Proceedings of the 38th IEEE conference on decision and control, vol 1. IEEE, pp 425–430

  35. Vinter R (2010) Optimal control. Modern Birkhäuser Classics. Birkhäuser, Boston. Paperback reprint of the 2000 edition

  36. Volz RA, Kazda LF (1966) Design of a digital controller for a tracking telescope. IEEE Trans Autom Control AC-12(4):359–367

Author information

Correspondence to Gaurav Dhar.

Appendix: Proof of Theorem 2.1

This Appendix is devoted to the detailed proof of Theorem 2.1 in the case \(L=0\) (that is, without Lagrange cost). Indeed, reducing a Bolza problem (with \(L\ne 0\)) to a Mayer problem (with \(L = 0\)) is standard in the literature (see, e.g., [8, Section 2.1.4 p. 12]).

We start with some required preliminaries in Sect. A.1. Then, the proof is based on the sensitivity analysis of the state equation in Sect. A.2 and on the application of the Ekeland variational principle in Sect. A.3.

1.1 Preliminaries

The first part of this section is devoted to some basics of convex analysis; recall that \(\mathrm {S}\) denotes the nonempty closed convex subset of \({\mathbb {R}}^j\) involved in the terminal constraint of Problem (OSCP). Let \(\mathrm {d}_\mathrm {S}:{\mathbb {R}}^j \rightarrow {\mathbb {R}}_{+}\) denote the standard distance function to \(\mathrm {S}\), defined by \(\mathrm {d}_\mathrm {S}(z):= \inf _{z' \in \mathrm {S}} \Vert z-z' \Vert _{{\mathbb {R}}^j}\) for all \(z \in {\mathbb {R}}^j\). We recall that, for all \(z \in {\mathbb {R}}^j\), there exists a unique element \(\mathrm {P}_\mathrm {S}(z) \in \mathrm {S}\) (called the projection of z onto \(\mathrm {S}\)) such that \(\mathrm {d}_\mathrm {S}(z)=\Vert z-\mathrm {P}_\mathrm {S}(z) \Vert _{{\mathbb {R}}^j}\). It can easily be shown that the map \(\mathrm {P}_\mathrm {S}:{\mathbb {R}}^j\rightarrow \mathrm {S}\) is 1-Lipschitz continuous. Moreover, it holds that \(\langle z- \mathrm {P}_\mathrm {S}(z),z'-\mathrm {P}_\mathrm {S}(z) \rangle _{{\mathbb {R}}^j}\le 0\) for all \(z \in {\mathbb {R}}^j\) and all \(z'\in \mathrm {S}\). Let us recall the following three useful lemmas.

Lemma A.1

It holds that \(z-\mathrm {P}_\mathrm {S}(z)\in \mathrm {N}_\mathrm {S}[\mathrm {P}_\mathrm {S}(z)]\) for all \(z\in {\mathbb {R}}^j\).

Lemma A.2

Let \((z_k)_{k\in {\mathbb {N}}}\) be a sequence in \({\mathbb {R}}^j\) converging to some point \(z\in \mathrm {S}\), and let \((\zeta _k)_{k\in {\mathbb {N}}}\) be a sequence in \({\mathbb {R}}_+\). If \(\zeta _k(z_k-\mathrm {P}_\mathrm {S}(z_k))\) converges to some \({\overline{z}} \in {\mathbb {R}}^j\), then \({\overline{z}} \in \mathrm {N}_\mathrm {S}[z]\).

Lemma A.3

The map

$$\begin{aligned} \mathrm {d}^2_\mathrm {S} : {\mathbb {R}}^j \longrightarrow {\mathbb {R}}_+ , \qquad z \longmapsto \mathrm {d}^2_\mathrm {S}(z) := \mathrm {d}_\mathrm {S}(z)^2 \end{aligned}$$

is differentiable on \({\mathbb {R}}^j\), and its differential \(D\mathrm {d}^2_\mathrm {S}(z)\) at every \(z \in {\mathbb {R}}^j\) can be expressed as

$$\begin{aligned} D\mathrm {d}^2_\mathrm {S}(z)(z') = 2 \langle z-\mathrm {P}_\mathrm {S}(z), z' \rangle _{{\mathbb {R}}^j}, \end{aligned}$$

for all \(z' \in {\mathbb {R}}^j\).
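The differential formula of Lemma A.3 can be checked numerically. The following minimal Python sketch (with \(\mathrm {S} = [0,1]^2\), a closed convex subset of \({\mathbb {R}}^2\) chosen purely for illustration, so that \(\mathrm {P}_\mathrm {S}\) is the componentwise clipping) compares a finite-difference quotient of \(\mathrm {d}^2_\mathrm {S}\) with \(2 \langle z-\mathrm {P}_\mathrm {S}(z), z' \rangle \).

```python
import numpy as np

# Check of Lemma A.3 for S = [0, 1]^2: the Euclidean projection onto a box
# is the componentwise clipping, and d_S^2(z) = ||z - P_S(z)||^2.
def proj_S(z):
    return np.clip(z, 0.0, 1.0)

def d2_S(z):
    return float(np.sum((z - proj_S(z)) ** 2))

rng = np.random.default_rng(0)
z = 3.0 * rng.normal(size=2)   # arbitrary point, possibly outside S
zp = rng.normal(size=2)        # arbitrary direction z'

h = 1e-6
finite_diff = (d2_S(z + h * zp) - d2_S(z)) / h
formula = 2.0 * float(np.dot(z - proj_S(z), zp))
print(finite_diff, formula)    # the two values agree up to O(h)
```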

The second part of this section is devoted to additional notions about piecewise constant functions and to several technical results useful for the proof of Theorem 2.1. Precisely, we will introduce a technical control set [see Eq. (5)] which allows us to avoid two degenerate situations in the behavior of the sequence of sampling times produced by the Ekeland variational principle in Sect. A.3. We also refer to the Introduction and to Proposition A.2 for details. Let \(\tau > 0\) and \(N \in {\mathbb {N}}^*\) be fixed. For all \({\mathbb {T}}= (t_i)_{i=0,\ldots ,N} \in {\mathcal {P}}_N^{\tau }\) and \(u \in \mathrm {PC}^{\mathbb {T}}_N([0,\tau ],{\mathbb {R}}^m)\), we denote by

$$\begin{aligned} \Vert {\mathbb {T}}\Vert := \min \{ t_{i+1}-t_i \mid i=0,\ldots ,N-1 \} > 0, \end{aligned}$$

and we define the set

$$\begin{aligned}&{\mathcal {P}}^\tau _{N,(u,{\mathbb {T}})} \\&\quad := \left\{ {\mathbb {T}}' = (t'_i)_{i=0,\ldots ,N} \in {\mathcal {P}}_N^{\tau }\mid \forall i=1,\ldots ,N-1, \; \vert t'_i - t_i \vert \le \delta _{ \{ u_{i-1} \ne u_i \} } \dfrac{\Vert {\mathbb {T}}\Vert }{4} \right\} , \end{aligned}$$

where \(\delta _{ \{ u_{i-1} \ne u_i \} } = 1\) if \(u_{i-1} \ne u_i\), and \(\delta _{ \{ u_{i-1} \ne u_i \} } = 0\) otherwise. In particular, if \( {\mathbb {T}}' = (t'_i)_{i=0,\ldots ,N} \in {\mathcal {P}}^\tau _{N,(u,{\mathbb {T}})}\), it holds that

$$\begin{aligned} 0 = t'_{0}< t_1 - \dfrac{\Vert {\mathbb {T}}\Vert }{4} \le t'_{1} \le t_1 + \dfrac{\Vert {\mathbb {T}}\Vert }{4}< t_2 - \dfrac{\Vert {\mathbb {T}}\Vert }{4} \le t'_{2} \le t_2 + \dfrac{\Vert {\mathbb {T}}\Vert }{4}< \cdots< t_{N-1} - \dfrac{\Vert {\mathbb {T}}\Vert }{4} \le t'_{N-1} \le t_{N-1} + \dfrac{\Vert {\mathbb {T}}\Vert }{4} < t'_{N} = \tau , \end{aligned}$$

with \(t'_{i}=t_i\) for all \(i \in \{ 1,\ldots ,N-1 \}\) such that \(u_{i-1} = u_i\). Hence, for all \({\mathbb {T}}' = (t'_i)_{i=0,\ldots ,N} \in {\mathcal {P}}^\tau _{N,(u,{\mathbb {T}})}\), the sampling times \(t'_i = t_i\) remain unchanged whenever u is constant over the two consecutive sampling intervals \([t_{i-1},t_i)\) and \([t_i,t_{i+1})\), and all the \(t'_i\) lie in pairwise (strictly) disjoint intervals. Finally, we introduce the following technical control set

$$\begin{aligned} \mathrm {PC}_{N,(u,{\mathbb {T}})}( [0,\tau ] , {\mathbb {R}}^m ) := \bigcup _{{\mathbb {T}}' \in {\mathcal {P}}^\tau _{N,(u,{\mathbb {T}})}} \mathrm {PC}^{{\mathbb {T}}'}_N ( [0,\tau ] , {\mathbb {R}}^m ). \end{aligned}$$
(5)

Note that \({\mathbb {T}}\in {\mathcal {P}}^\tau _{N,(u,{\mathbb {T}})}\), and thus \(u \in \mathrm {PC}_{N,(u,{\mathbb {T}})}( [0,\tau ] , {\mathbb {R}}^m )\). Also note that the inclusion \( {\mathcal {P}}^\tau _{N,(u,{\mathbb {T}})} \subset {\mathcal {P}}_N^{\tau }\) holds, and thus \( \mathrm {PC}_{N,(u,{\mathbb {T}})}( [0,\tau ] , {\mathbb {R}}^m )\) is included in \(\mathrm {PC}_{N}( [0,\tau ] , {\mathbb {R}}^m ) \subset \mathrm {L}^\infty ([0,\tau ],{\mathbb {R}}^m)\); however, it is neither a linear subspace nor a convex subset.
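As a concrete illustration of the partition set \({\mathcal {P}}^\tau _{N,(u,{\mathbb {T}})}\) entering Eq. (5), here is a small Python sketch (the helper `in_technical_set` is ours, not part of the paper) that tests the defining condition \(\vert t'_i - t_i \vert \le \delta _{ \{ u_{i-1} \ne u_i \} } \Vert {\mathbb {T}}\Vert /4\) on a toy partition; in particular, a sampling time \(t_i\) at which \(u_{i-1} = u_i\) must remain unchanged.

```python
import numpy as np

# Hypothetical helper (ours): test whether a candidate partition T' belongs
# to the set P^tau_{N,(u,T)} entering Eq. (5), for the piecewise constant
# control with partition t (0 = t_0 < ... < t_N = tau) and values (u_0, ..., u_{N-1}).
def in_technical_set(t_prime, t, u_values, tau):
    t_prime, t = np.asarray(t_prime, float), np.asarray(t, float)
    N = len(t) - 1
    norm_T = np.min(np.diff(t))              # ||T|| = smallest sampling step
    if t_prime[0] != 0.0 or t_prime[-1] != tau:
        return False                         # t'_0 = 0 and t'_N = tau are required
    if np.any(np.diff(t_prime) <= 0.0):      # T' must be increasing
        return False
    for i in range(1, N):
        delta = 0.0 if u_values[i - 1] == u_values[i] else 1.0
        if abs(t_prime[i] - t[i]) > delta * norm_T / 4.0:
            return False
    return True

t = [0.0, 1.0, 2.0, 3.0]       # N = 3 sampling times on [0, 3], ||T|| = 1
u_values = [0.0, 1.0, 1.0]     # u is constant over [1, 2) and [2, 3)
print(in_technical_set([0.0, 1.2, 2.0, 3.0], t, u_values, 3.0))  # True: |1.2 - 1| <= 1/4
print(in_technical_set([0.0, 1.0, 2.1, 3.0], t, u_values, 3.0))  # False: t_2 must stay put
```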

Lemma A.4

Let \(\tau > 0\) and \(N \in {\mathbb {N}}^*\). Let \({\mathbb {T}}= (t_i)_{i=0,\ldots ,N} \in {\mathcal {P}}_N^{\tau }\) and \(u \in \mathrm {PC}^{\mathbb {T}}_N([0,\tau ],{\mathbb {R}}^m)\). Then \({\mathbb {T}}\) is the unique element \({\mathbb {T}}' \in {\mathcal {P}}^\tau _{N,(u,{\mathbb {T}})}\) such that \(u \in \mathrm {PC}^{{\mathbb {T}}'}_N([0,\tau ],{\mathbb {R}}^m)\).

Proof

Let \({\mathbb {T}}' =(t'_i)_{i=0,\ldots ,N} \in {\mathcal {P}}^\tau _{N,(u,{\mathbb {T}})}\) be such that \(u \in \mathrm {PC}^{{\mathbb {T}}'}_N([0,\tau ],{\mathbb {R}}^m)\). Let us assume by contradiction that \({\mathbb {T}}' \ne {\mathbb {T}}\). Since the elements \(t'_i\) lie in pairwise disjoint intervals centered at the \(t_i\), there exists \(i \in \{ 1, \ldots ,N-1 \}\) such that \(t_i \notin {\mathbb {T}}'\). Necessarily it holds that \(u_{i-1} \ne u_i\), and there exists \(j \in \{ i-1 ,i \}\) such that \(t'_j< t_i < t'_{j+1}\). Since \(u \in \mathrm {PC}^{{\mathbb {T}}'}_N([0,\tau ],{\mathbb {R}}^m)\), there exists \(c \in {\mathbb {R}}^m\) such that \(u(t)=c\) for almost every \(t \in [t'_j,t'_{j+1}]\). Since \(u(t) = u_{i-1}\) for almost every \(t \in [t_{i-1},t_i]\) and \(u(t)=u_i\) for almost every \(t \in [t_{i},t_{i+1}]\), we deduce that \(c = u_{i-1}\) and \(c = u_i\), which raises a contradiction since \(u_{i-1} \ne u_i\). The proof is complete. \(\square \)

Proposition A.1

Let \(\tau > 0\) and \(N \in {\mathbb {N}}^*\). Let \({\mathbb {T}}= (t_i)_{i=0,\ldots ,N} \in {\mathcal {P}}_N^{\tau }\) and \(u \in \mathrm {PC}^{\mathbb {T}}_N([0,\tau ],{\mathbb {R}}^m)\). The set \(\mathrm {PC}_{N,(u,{\mathbb {T}})}( [0,\tau ] , {\mathbb {R}}^m )\) is a closed subset of \(\mathrm {L}^1([0,\tau ],{\mathbb {R}}^m)\).

Proof

Let \((u_k)_{k \in {\mathbb {N}}}\) be a sequence in \(\mathrm {PC}_{N,(u,{\mathbb {T}})}( [0,\tau ] , {\mathbb {R}}^m )\) converging in \(\mathrm {L}^1([0,\tau ],{\mathbb {R}}^m)\) to some \(u' \in \mathrm {L}^1([0,\tau ],{\mathbb {R}}^m)\). Our aim is to prove that \(u' \in \mathrm {PC}_{N,(u,{\mathbb {T}})}( [0,\tau ] , {\mathbb {R}}^m )\). The proof is divided into three steps.

First step Let \({\mathbb {T}}_k = (t_{i,k})_{i=0,\ldots ,N} \in {\mathcal {P}}^\tau _{N,(u,{\mathbb {T}})}\) be a partition associated with \(u_k\) for all \(k \in {\mathbb {N}}\). It holds for all \(k \in {\mathbb {N}}\) that

$$\begin{aligned} 0 = t_{0,k}< t_1 - \dfrac{\Vert {\mathbb {T}}\Vert }{4} \le t_{1,k} \le t_1 + \dfrac{\Vert {\mathbb {T}}\Vert }{4}< t_2 - \dfrac{\Vert {\mathbb {T}}\Vert }{4} \le t_{2,k} \le t_2 + \dfrac{\Vert {\mathbb {T}}\Vert }{4}< \cdots< t_{N-1} - \dfrac{\Vert {\mathbb {T}}\Vert }{4} \le t_{N-1,k} \le t_{N-1} + \dfrac{\Vert {\mathbb {T}}\Vert }{4} < t_{N,k} = \tau , \end{aligned}$$

and \(t_{i,k}=t_i\) for all \(i \in \{ 1,\ldots ,N-1 \}\) such that \(u_{i-1} = u_i\). Extracting a finite number of subsequences (that we do not relabel), we know that, for all \(i \in \{ 0,\ldots ,N \}\), \(t_{i,k}\) converges to some \(t'_i\) satisfying

$$\begin{aligned} 0 = t'_{0}< t_1 - \dfrac{\Vert {\mathbb {T}}\Vert }{4} \le t'_{1} \le t_1 + \dfrac{\Vert {\mathbb {T}}\Vert }{4}< t_2 - \dfrac{\Vert {\mathbb {T}}\Vert }{4} \le t'_{2} \le t_2 + \dfrac{\Vert {\mathbb {T}}\Vert }{4}< \cdots< t_{N-1} - \dfrac{\Vert {\mathbb {T}}\Vert }{4} \le t'_{N-1} \le t_{N-1} + \dfrac{\Vert {\mathbb {T}}\Vert }{4} < t'_{N} = \tau , \end{aligned}$$

and \(t'_{i}=t_i\) for all \(i \in \{ 1,\ldots ,N-1 \}\) such that \(u_{i-1} = u_i\). Hence, we have obtained a partition \({\mathbb {T}}' := (t'_i)_{i=0,\ldots ,N} \in {\mathcal {P}}^\tau _{N,(u,{\mathbb {T}})}\).

Second step From the partial converse of the Lebesgue dominated convergence theorem, extracting a subsequence (that we do not relabel), we know that \(u_k(t)\) converges to \(u'(t)\) for almost every \(t \in [0,\tau ]\). We introduce the subset A of \([0,\tau ]\) of full measure defined by

$$\begin{aligned} A := \{ t \in [0,\tau ] \mid u_k(t) \text { converges to } u'(t) \} \end{aligned}$$

and the subset B of \([0,\tau ]\) of full measure defined by \(B := \bigcap _{k \in {\mathbb {N}}} B_k\) where

$$\begin{aligned} B_k := \bigcup _{i=0}^{N-1} \{ t \in [t_{i,k},t_{i+1,k}) \mid u_k(t) = u_{i,k} \} , \end{aligned}$$

for all \(k \in {\mathbb {N}}\).

Third step Let \(i \in \{ 0,\ldots ,N-1 \}\) and let \(t \in (t'_i,t'_{i+1}) \cap (A \cap B)\). For \(k \in {\mathbb {N}}\) sufficiently large, it holds that \(t \in (t_{i,k},t_{i+1,k})\). Since \(t \in A \cap B\), we know that \(u_k(t) = u_{i,k}\), which converges to \(u'(t)\). Since the limit of \(u_{i,k}\) is independent of the choice of \(t \in (t'_i,t'_{i+1}) \cap (A \cap B)\), we deduce that \(u'\) is equal almost everywhere over \([t'_i,t'_{i+1}]\) to a constant. Since this is true for every \(i \in \{ 0,\ldots ,N-1 \}\), we conclude that \(u' \in \mathrm {PC}^{{\mathbb {T}}'}_N([0,\tau ],{\mathbb {R}}^m) \subset \mathrm {PC}_{N,(u,{\mathbb {T}})}( [0,\tau ] , {\mathbb {R}}^m )\). The proof is complete. \(\square \)

Proposition A.2

Let \(\tau > 0\) and \(N \in {\mathbb {N}}^*\). Let \({\mathbb {T}}= (t_i)_{i=0,\ldots ,N} \in {\mathcal {P}}_N^{\tau }\) and \(u \in \mathrm {PC}^{\mathbb {T}}_N([0,\tau ],{\mathbb {R}}^m)\). Let \((u_k)_{k\in {\mathbb {N}}}\) be a sequence in \(\mathrm {PC}_{N,(u,{\mathbb {T}})}( [0,\tau ] , {\mathbb {R}}^m )\) converging in \(\mathrm {L}^1([0,\tau ],{\mathbb {R}}^m)\) to u. Let \({\mathbb {T}}_k = (t_{i,k})_{i=0,\ldots ,N} \in {\mathcal {P}}^\tau _{N,(u,{\mathbb {T}})}\) be a partition associated with \(u_k\) for all \(k \in {\mathbb {N}}\). Then there exists a subsequence of \((u_k)_{k\in {\mathbb {N}}}\) (that we do not relabel) such that:

  1. (i)

    \(u_k(t)\) converges to u(t) for almost every \(t\in [0,\tau ]\);

  2. (ii)

    \(t_{i,k}\) converges to \(t_i\) for all \(i=0,\ldots ,N\);

  3. (iii)

    \(u_{i,k}\) converges to \(u_i\) for all \(i=0,\ldots ,N-1\).

Proof

Following exactly the same steps as in the proof of Proposition A.1 (replacing \(u'\) by u), we construct a partition \({\mathbb {T}}' = (t'_i)_{i=0,\ldots ,N} \in {\mathcal {P}}^\tau _{N,(u,{\mathbb {T}})}\) such that \(u \in \mathrm {PC}^{{\mathbb {T}}'}_N([0,\tau ],{\mathbb {R}}^m)\). From Lemma A.4, it follows that \({\mathbb {T}}' ={\mathbb {T}}\). From the construction of \({\mathbb {T}}'\), we conclude that, up to subsequences (which we do not relabel), \(t_{i,k}\) converges to \(t_i\) for all \(i = 0,\ldots ,N\). Let us prove the last statement. Let us consider the sets A and B defined in the proof of Proposition A.1 and let us introduce the subset \(B'\) of \([0,\tau ]\) of full measure defined by

$$\begin{aligned} B' := \bigcup _{i=0}^{N-1} \{ t \in [t_i,t_{i+1}) \mid u(t)=u_i \}. \end{aligned}$$

Let \(i=0,\ldots ,N-1\) and let \(t \in (t_i,t_{i+1}) \cap (A \cap B \cap B')\). For \(k \in {\mathbb {N}}\) sufficiently large, it holds that \(t \in (t_{i,k},t_{i+1,k})\). Moreover, since \(t \in A \cap B \cap B'\), we know that \(u_k(t) = u_{i,k}\) converges to \(u(t) = u_i\). Since the last statement is true for all \(i=0,\ldots ,N-1\), the proof is complete. \(\square \)

1.2 Sensitivity analysis of the state equation

In this section, we focus on the Cauchy problem given by

$$\begin{aligned} \left\{ \begin{array}{l} {\dot{x}}(t) = f(x(t),u(t),t), \qquad \text {a.e. } t \ge 0, \\ x(0)= x_0, \end{array} \right. \end{aligned}$$
(CP)

for any \((u,x_0) \in \mathrm {L}^\infty ({\mathbb {R}}_+,{\mathbb {R}}^m) \times {\mathbb {R}}^n\). Before proceeding to the sensitivity analysis of the Cauchy problem (CP) (with respect to the control u and the initial condition \(x_0\)), we first recall some definitions and results from the classical Cauchy–Lipschitz (or Picard–Lindelöf) theory.

Definition A.1

Let \((u,x_0) \in \mathrm {L}^\infty ({\mathbb {R}}_+,{\mathbb {R}}^m) \times {\mathbb {R}}^n\). A (local) solution to the Cauchy problem (CP) is a couple \((x,I)\) such that:

  1. (i)

    I is an interval such that \(\{ 0 \} \varsubsetneq I \subset {\mathbb {R}}_+\);

  2. (ii)

    \(x \in \mathrm {AC}([0,\tau ],{\mathbb {R}}^n)\), with \({\dot{x}}(t) = f(x(t),u(t),t)\) for almost every \(t \in [0,\tau ]\), for all \(\tau \in I\);

  3. (iii)

    \(x(0)=x_0\).

Let \((x_1,I_1)\) and \((x_2,I_2)\) be two (local) solutions to the Cauchy problem (CP). We say that \((x_2,I_2)\) is an extension (resp. a strict extension) of \((x_1,I_1)\) if \(I_1 \subset I_2\) (resp. \(I_1 \varsubsetneq I_2\)) and \(x_2(t) = x_1(t)\) for all \(t \in I_1\). A maximal solution to the Cauchy problem (CP) is a (local) solution that does not admit any strict extension. Finally, a global solution to the Cauchy problem (CP) is a solution \((x,I)\) such that \(I={\mathbb {R}}_+\). In particular, a global solution is necessarily a maximal solution.

Lemma A.5

Let \((u,x_0) \in \mathrm {L}^\infty ({\mathbb {R}}_+,{\mathbb {R}}^m) \times {\mathbb {R}}^n\). Any (local) solution to the Cauchy problem (CP) can be extended into a maximal solution.

Lemma A.6

Let \((u,x_0) \in \mathrm {L}^\infty ({\mathbb {R}}_+,{\mathbb {R}}^m) \times {\mathbb {R}}^n\). A couple \((x,I)\) is a (local) solution to the Cauchy problem (CP) if and only if:

  1. (i)

    I is an interval such that \(\{ 0 \} \varsubsetneq I \subset {\mathbb {R}}_+\);

  2. (ii)

    \(x \in \mathrm {C}(I,{\mathbb {R}}^n)\);

  3. (iii)

x satisfies the integral representation \( x(t) = x_0 + \int _0^t f(x(s),u(s),s) \, \text {d}s \) for all \(t \in I\).

Proposition A.3

For all \((u,x_0) \in \mathrm {L}^\infty ({\mathbb {R}}_+,{\mathbb {R}}^m) \times {\mathbb {R}}^n\), the Cauchy problem (CP) admits a unique maximal solution denoted by \((x(\cdot ,u,x_0),I(u,x_0))\). Moreover, the maximal interval \(I(u,x_0)\) is semi-open and we write \(I(u,x_0) = [0,\tau (u,x_0))\) where \(\tau (u,x_0) \in (0,+\infty ]\). Furthermore, if \(\tau (u,x_0)<+\infty \), that is, if the maximal solution \((x(\cdot ,u,x_0),I(u,x_0))\) is not global, then \(x(\cdot ,u,x_0)\) is not bounded over \(I(u,x_0) = [0,\tau (u,x_0))\).
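The dichotomy of Proposition A.3 can be observed numerically. A minimal sketch, assuming the scalar dynamics \(f(x,u,t) = x^2\) (with no actual control dependence, chosen only for illustration) and \(x_0 = 1\): the maximal solution is \(x(t) = 1/(1-t)\), so \(\tau (u,x_0) = 1 < +\infty \) and the trajectory is unbounded over \([0,\tau (u,x_0))\).

```python
import numpy as np
from scipy.integrate import solve_ivp

# f(x, u, t) = x^2, x_0 = 1: the maximal solution x(t) = 1/(1 - t) has a
# finite escape time tau = 1, and it blows up as t approaches 1 from below.
sol = solve_ivp(lambda t, x: x**2, (0.0, 0.999), [1.0],
                rtol=1e-10, dense_output=True)
for t in (0.9, 0.99, 0.999):
    print(t, sol.sol(t)[0], 1.0 / (1.0 - t))  # numerical vs exact: both explode
```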

Remark A.1

Let \((u,x_0) \in \mathrm {L}^\infty ({\mathbb {R}}_+,{\mathbb {R}}^m) \times {\mathbb {R}}^n\). The maximal solution \((x(\cdot ,u,x_0),I(u,x_0))\) to the Cauchy problem (CP) coincides with the maximal extension (see Lemma A.5) of any other local solution.

Our aim in the next subsections is to study the behavior of \(x(\cdot ,u,x_0)\) with respect to perturbations on the control u and on the initial condition \(x_0\).

1.2.1 A continuity result

Let \((u,x_0) \in \mathrm {L}^\infty ({\mathbb {R}}_+,{\mathbb {R}}^m) \times {\mathbb {R}}^n\). In the sequel, for ease of notation, we write \(\Vert \cdot \Vert _{\mathrm {L}^\infty } := \Vert \cdot \Vert _{\mathrm {L}^\infty ({\mathbb {R}}_+,{\mathbb {R}}^m)}\) and we introduce two sets:

  1. (i)

    For all \(R \ge \Vert u \Vert _{\mathrm {L}^\infty }\) and all \(0< \tau < \tau (u,x_0)\), we denote by

    $$\begin{aligned} \mathrm {K}( (u,x_0), (R,\tau ) ) := \{(y,v,t) \in {\mathbb {R}}^n\times {\overline{\mathrm {B}}}_{{\mathbb {R}}^m}(0_{{\mathbb {R}}^m},R) \times [0,\tau ] \mid \Vert y-x(t,u,x_0) \Vert _{{\mathbb {R}}^n} \le 1 \}. \end{aligned}$$

Firstly, note that \(\mathrm {K}( (u,x_0), (R,\tau ) )\) is convex with respect to its first two variables. Secondly, since \(x(\cdot ,u,x_0)\) is continuous over \([0,\tau ]\), \(\mathrm {K}( (u,x_0), (R,\tau ) )\) is a compact subset of \({\mathbb {R}}^n\times {\mathbb {R}}^m\times {\mathbb {R}}_+\). Thus, we denote by \( L ( (u,x_0), (R,\tau ) ) \ge 0 \) the Lipschitz constant of f over the compact subset \(\mathrm {K}( (u,x_0), (R,\tau ) )\) [see Inequality (1) in Sect. 2.2].

  2. (ii)

    For all \(R \ge \Vert u \Vert _{\mathrm {L}^\infty }\) and all \(0< \tau < \tau (u,x_0)\), we denote by

    $$\begin{aligned}&{\mathcal {N}}( (u,x_0), (R,\tau ) ,\varepsilon ) \\&\quad := \left\{ (u',x'_0) \in \Big ( {\overline{\mathrm {B}}}_{\mathrm {L}^1 ([0,\tau ],{\mathbb {R}}^m)}(u,\varepsilon ) \cap {\overline{\mathrm {B}}}_{\mathrm {L}^\infty }(0_{\mathrm {L}^\infty },R) \Big ) \right. \\&\left. \qquad \times \, {\overline{\mathrm {B}}}_{{\mathbb {R}}^n}(x_0,\varepsilon ) \; \Big | \; u' = u \text { over } [\tau ,+\infty ) \right\} , \end{aligned}$$

    for all \(\varepsilon > 0\), which can be seen as a neighborhood of the couple \((u,x_0)\) in the \(\mathrm {L}^1 ([0,\tau ],{\mathbb {R}}^m) \times {\mathbb {R}}^n\)-space. The second part of the above definition, imposing that \(u'=u\) over \([\tau ,+\infty )\), allows us in the sequel to endow the above set with the \(\mathrm {L}^1 ([0,\tau ],{\mathbb {R}}^m) \times {\mathbb {R}}^n\)-distance.

In the next proposition, we state a continuous dependence result for the trajectory \(x(\cdot ,u,x_0)\) with respect to the couple \((u,x_0)\).

Proposition A.4

Let \((u,x_0) \in \mathrm {L}^\infty ({\mathbb {R}}_+,{\mathbb {R}}^m) \times {\mathbb {R}}^n\). For all \(R \ge \Vert u \Vert _{\mathrm {L}^\infty }\) and all \(0< \tau < \tau (u,x_0)\), there exists \(\varepsilon > 0\) such that

$$\begin{aligned} \forall (u',x'_0) \in {\mathcal {N}}( (u,x_0), (R,\tau ) ,\varepsilon ), \quad \tau (u',x'_0) > \tau . \end{aligned}$$

Moreover, considering the \(\mathrm {L}^1 ([0,\tau ],{\mathbb {R}}^m) \times {\mathbb {R}}^n\)-distance over the set \({\mathcal {N}}( (u,x_0), (R,\tau ),\varepsilon )\), the map

$$\begin{aligned} (u',x'_0) \in {\mathcal {N}}( (u,x_0), (R,\tau ) ,\varepsilon ) \longmapsto x(\cdot ,u',x'_0) \in \mathrm {C}([0,\tau ],{\mathbb {R}}^n) \end{aligned}$$

is Lipschitz continuous and

$$\begin{aligned} (x(t,u',x'_0),u'(t),t) \in \mathrm {K}( (u,x_0), (R,\tau ) ), \end{aligned}$$

for almost every \(t \in [0,\tau ]\) and for all \((u',x'_0) \in {\mathcal {N}}( (u,x_0), (R,\tau ) ,\varepsilon )\).

Proof

The proof is standard and left to the reader. For similar statements with detailed proofs, we refer to [11, Lemmas 1 and 3 pp. 3795–3797], [13, Lemmas 4.3 and 4.5 pp. 73–74] (in the general framework of time scale calculus) or to [10, Propositions 1 and 2 pp. 4–5] (in a more classical framework, closer to the present considerations).\(\square \)

Remark A.2

Let \((u,x_0) \in \mathrm {L}^\infty ({\mathbb {R}}_+,{\mathbb {R}}^m) \times {\mathbb {R}}^n\). Let \(R \ge \Vert u \Vert _{\mathrm {L}^\infty }\) and \(0< \tau < \tau (u,x_0)\). Let \(\varepsilon > 0\) given in Proposition A.4. Let \((u_k,x_{0,k})_{k \in {\mathbb {N}}}\) be a sequence in \({\mathcal {N}}( (u,x_0), (R,\tau ) ,\varepsilon )\) and let \((u',x'_0) \in {\mathcal {N}}( (u,x_0), (R,\tau ) ,\varepsilon )\). From Proposition A.4, if \((u_k,x_{0,k})\) converges to \((u',x'_0)\) in \(\mathrm {L}^1([0,\tau ],{\mathbb {R}}^m) \times {\mathbb {R}}^n\), then the sequence \((x(\cdot ,u_k,x_{0,k}))_{k\in {\mathbb {N}}}\) uniformly converges to \(x(\cdot ,u',x'_0)\) over \([0,\tau ]\).

1.2.2 Perturbation of the control

In the next proposition, we state a differentiability result for the trajectory \(x(\cdot ,u,x_0)\) with respect to a convex \(\mathrm {L}^\infty \)-perturbation of the control u.

Proposition A.5

Let \((u,x_0) \in \mathrm {L}^\infty ({\mathbb {R}}_+,{\mathbb {R}}^m) \times {\mathbb {R}}^n\) and \(0< \tau < \tau (u,x_0)\). Let \(v \in \mathrm {L}^\infty ({\mathbb {R}}_+,{\mathbb {R}}^m)\) be fixed. We consider the convex \(\mathrm {L}^\infty \)-perturbation given by

$$\begin{aligned} u_v (\cdot ,\alpha ) := \left\{ \begin{array}{ll} u + \alpha (v -u) &{} \text {over} \ [0,\tau ), \\ u &{} \text {over}\ [\tau ,+\infty ), \end{array} \right. \end{aligned}$$

for all \(0 \le \alpha \le 1\). Then:

  1. (i)

    there exists \(0 < \alpha _0 \le 1\) such that \(\tau ( u_v(\cdot ,\alpha ) , x_0 ) > \tau \) for all \(0 \le \alpha \le \alpha _0\);

  2. (ii)

    the map

    $$\begin{aligned} \alpha \in [0,\alpha _0] \longmapsto x(\cdot ,u_v(\cdot ,\alpha ),x_0) \in \mathrm {C}([0,\tau ],{\mathbb {R}}^n), \end{aligned}$$

is differentiable at \(\alpha = 0\), and its derivative is equal to \(w_v\), the unique (global) solution to the linear Cauchy problem given by

    $$\begin{aligned} \left\{ \begin{array}{l} {\dot{w}}(t) = \partial _1 f(x(t,u,x_0),u(t),t) \times w(t) \\ \qquad \qquad +\, \partial _2 f (x(t,u,x_0),u(t),t) \times (v(t)-u(t)), \quad \text {a.e. } t \in [0,\tau ], \\ w(0)= 0_{{\mathbb {R}}^n}. \end{array} \right. \end{aligned}$$

Proof

The proof is standard and left to the reader. For a similar statement with detailed proof, we refer to [11, Lemma 4 and Proposition 1 pp. 3797–3798]. \(\square \)
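As a sanity check of Proposition A.5, consider the assumed scalar dynamics \(f(x,u,t) = -x + u\) with \(u \equiv 0\), \(v \equiv 1\) and \(x_0 = 1\) (an example of ours, not taken from the paper): then \(\partial _1 f = -1\) and \(\partial _2 f = 1\), so \(w_v\) solves \({\dot{w}} = -w + 1\), \(w(0) = 0\), that is, \(w_v(t) = 1 - e^{-t}\). The sketch below compares this with a finite-difference quotient in \(\alpha \).

```python
import numpy as np
from scipy.integrate import solve_ivp

# Check of Proposition A.5 for the assumed dynamics f(x, u, t) = -x + u on
# [0, tau], with u = 0, v = 1, x_0 = 1: the variation vector is
# w_v(t) = 1 - exp(-t).
tau, x0 = 2.0, 1.0

def x_at_tau(alpha):
    # trajectory for the convex perturbation u + alpha * (v - u) = alpha
    sol = solve_ivp(lambda t, x: -x + alpha, (0.0, tau), [x0],
                    rtol=1e-12, atol=1e-12)
    return sol.y[0, -1]

alpha = 1e-6
finite_diff = (x_at_tau(alpha) - x_at_tau(0.0)) / alpha
print(finite_diff, 1.0 - np.exp(-tau))  # both close to w_v(tau)
```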

We conclude this section with a technical lemma on the convergence of the variation vectors. This result is needed in the proof of our main result (see Sect. A.3.2).

Lemma A.7

Let \((u,x_0)\in \mathrm {L}^\infty ({\mathbb {R}}_+,{\mathbb {R}}^m)\times {\mathbb {R}}^n\). Let \(R \ge \Vert u \Vert _{\mathrm {L}^\infty }\) and \(0<\tau <\tau (u,x_0)\). We take \(\varepsilon >0\) as in Proposition A.4. Let \((u_k,x_{0,k})_{k \in {\mathbb {N}}}\) be a sequence of elements in \({\mathcal {N}}( (u,x_0), (R,\tau ),\varepsilon )\) such that \(x_{0,k}\) converges to \(x_0\) and \(u_k(t)\) converges to u(t) for almost every \(t\in [0,\tau ]\). Let \((v_k)_{k \in {\mathbb {N}}}\) be a sequence in \(\mathrm {L}^\infty ([0,\tau ],{\mathbb {R}}^m)\) converging in \(\mathrm {L}^1([0,\tau ],{\mathbb {R}}^m)\) to some \(v \in \mathrm {L}^\infty ([0,\tau ],{\mathbb {R}}^m)\). Finally, let \(w^k_{v_k}\) be the unique solution (that is global) to the linear Cauchy problem given by

$$\begin{aligned} \left\{ \begin{array}{l} {\dot{w}}(t) = \partial _1 f(x(t,u_k,x_{0,k}),u_k(t),t) \times w(t) \\ \qquad \qquad \quad +\, \partial _2 f (x(t,u_k,x_{0,k}),u_k(t),t) \times (v_k(t)-u_k(t)), \quad \text {a.e. } t \in [0,\tau ], \\ w(0)= 0_{{\mathbb {R}}^n}, \end{array} \right. \end{aligned}$$

for all \(k \in {\mathbb {N}}\). Then, the sequence \((w^k_{v_k})_{k\in {\mathbb {N}}}\) uniformly converges to \(w_v\) over \([0,\tau ]\) where \(w_v\) is defined as in Proposition A.5.

Proof

The proof is standard and left to the reader. For a similar statement with detailed proof, we refer to [13, Lemmas 4.8 and 4.9 pp. 77–78]. \(\square \)

1.2.3 Perturbation of the initial condition

In the next proposition, we state a differentiability result for the trajectory \(x(\cdot ,u,x_0)\) with respect to a simple perturbation of the initial condition \(x_0\).

Proposition A.6

Let \((u,x_0) \in \mathrm {L}^\infty ({\mathbb {R}}_+,{\mathbb {R}}^m) \times {\mathbb {R}}^n\) and \(0< \tau < \tau (u,x_0)\). Let \(y \in {\mathbb {R}}^n\) be fixed. Then:

  1. (i)

    there exists \(\alpha _0 > 0\) such that \(\tau ( u , x_0 + \alpha y ) > \tau \) for all \(0 \le \alpha \le \alpha _0\);

  2. (ii)

    the map

    $$\begin{aligned} \alpha \in [0,\alpha _0] \longmapsto x(\cdot ,u,x_0 + \alpha y) \in \mathrm {C}([0,\tau ],{\mathbb {R}}^n), \end{aligned}$$

is differentiable at \(\alpha = 0\), and its derivative is equal to \(w_y\), the unique (global) solution to the linear homogeneous Cauchy problem given by

    $$\begin{aligned} \left\{ \begin{array}{l} {\dot{w}}(t) = \partial _1 f(x(t,u,x_0),u(t),t) \times w(t), \quad \text {a.e. } t \in [0,\tau ], \\ w(0)= y. \end{array} \right. \end{aligned}$$

Proof

The proof is standard and left to the reader. For similar statements with detailed proofs, we refer to [11, Lemma 10 and Proposition 3 pp. 3802–3803] and to [13, Lemma 4.13 and Proposition 5 pp. 81–83]. \(\square \)
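Similarly, Proposition A.6 can be checked for the same assumed dynamics \(f(x,u,t) = -x + u\) with \(u \equiv 0\): the homogeneous variation vector is \(w_y(t) = y e^{-t}\).

```python
import numpy as np
from scipy.integrate import solve_ivp

# Check of Proposition A.6 for f(x, u, t) = -x + u with u = 0: the variation
# vector for the perturbation x_0 + alpha * y is w_y(t) = y * exp(-t).
tau, x0, y = 2.0, 1.0, 0.5

def x_at_tau(x_init):
    sol = solve_ivp(lambda t, x: -x, (0.0, tau), [x_init],
                    rtol=1e-12, atol=1e-12)
    return sol.y[0, -1]

alpha = 1e-6
finite_diff = (x_at_tau(x0 + alpha * y) - x_at_tau(x0)) / alpha
print(finite_diff, y * np.exp(-tau))  # both close to w_y(tau)
```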

We conclude this section with a technical lemma on the convergence of the variation vectors. This result is needed in the proof of our main result (see Sect. A.3.2).

Lemma A.8

Let \((u,x_0)\in \mathrm {L}^\infty ({\mathbb {R}}_{+},{\mathbb {R}}^m) \times {\mathbb {R}}^n\). Let \(R\ge \Vert u \Vert _{\mathrm {L}^\infty }\) and \(0<\tau <\tau (u,x_0)\). We take \(\varepsilon >0\) as in Proposition A.4. Let \((u_k,x_{0,k})_{k\in {\mathbb {N}}}\) be a sequence of elements in \({\mathcal {N}}((u,x_0), (R,\tau ),\varepsilon )\) such that \(x_{0,k}\) converges to \(x_0\) and \(u_k(t)\) converges to u(t) for almost every \(t\in [0,\tau ]\). Let \(y\in {\mathbb {R}}^n\) be fixed. Finally, let \(w^k_y\) be the unique solution (that is global) to the linear homogeneous Cauchy problem given by

$$\begin{aligned} \left\{ \begin{array}{l} {\dot{w}}(t) = \partial _1 f(x(t,u_k,x_{0,k}),u_k(t),t) \times w(t), \quad \text {a.e. } t \in [0,\tau ], \\ w(0)= y, \end{array} \right. \end{aligned}$$

for all \(k \in {\mathbb {N}}\). Then the sequence \((w^k_y)_{k \in {\mathbb {N}}}\) uniformly converges to \(w_y\) over \([0,\tau ]\) where \(w_y\) is defined as in Proposition A.6.

Proof

The proof is standard and left to the reader. It is similar to the proof of Lemma A.7. \(\square \)

1.2.4 Perturbation of a switching time

Let us introduce the following notion of switching time for a control \(u \in \mathrm {L}^\infty ({\mathbb {R}}_+,{\mathbb {R}}^m) \).

Definition A.2

(Switching time) Let \(u \in \mathrm {L}^\infty ({\mathbb {R}}_+,{\mathbb {R}}^m) \). We say that \(r > 0\) is a switching time of u if there exist \(0 < \eta _r \le r\) and \(u(r^-)\), \(u(r^+) \in {\mathbb {R}}^m\) such that:

  1. (i)

    \(u = u(r^-)\) almost everywhere over \([r-\eta _r,r)\);

  2. (ii)

    \(u = u(r^+)\) almost everywhere over \([r,r+\eta _r)\).

This notion is particularly relevant when dealing with piecewise constant controls as in Problem (OSCP). Indeed, let \(u \in \mathrm {L}^\infty ({\mathbb {R}}_+,{\mathbb {R}}^m) \) be such that \(u \in \mathrm {PC}^{\mathbb {T}}_N ([0,\tau ],{\mathbb {R}}^m)\) for some \(\tau > 0\), \(N \in {\mathbb {N}}^*\) and \({\mathbb {T}}= (t_i)_{i=0,\ldots ,N} \in {\mathcal {P}}_N^{\tau }\). Then \(t_i\) is a switching time of u, with \(u(t_i^-) = u_{i-1}\), \(u(t_i^+) = u_{i}\) and \(\eta _{t_i} = \min (t_i-t_{i-1},t_{i+1}-t_i) > 0\), for every \(i \in \{1,\ldots ,N-1 \}\).

In the next proposition, we prove a differentiability result for the trajectory \(x(\cdot ,u,x_0)\) with respect to a perturbation of a switching time of the control u.

Proposition A.7

Let \((u,x_0) \in \mathrm {L}^\infty ({\mathbb {R}}_+,{\mathbb {R}}^m) \times {\mathbb {R}}^n\) and \(0< \tau < \tau (u,x_0)\). Let \(0<r< \tau \) be a switching time of u and let \(\mu \in \{ -1,1 \}\). We consider the perturbation

$$\begin{aligned} u^\mu _r (\cdot ,\alpha ) := \left\{ \begin{array}{ll} u(r^-) &{}\quad \text {over } [r-\eta _r, r+\mu \alpha ), \\ u(r^+) &{}\quad \text {over } [r+\mu \alpha ,r+\eta _r), \\ u &{} \text {otherwise}, \end{array} \right. \end{aligned}$$

for all \(0 \le \alpha \le \frac{\eta _r}{2}\). Then:

  1. (i)

    there exists \(0 < \alpha _0 \le \frac{\eta _r}{2}\) such that \(\tau ( u^\mu _r (\cdot ,\alpha ) , x_0 ) > \tau \) for all \(0 \le \alpha \le \alpha _0\);

  2. (ii)

    for any \(0 < \lambda \le \tau -r\) fixed, the map

    $$\begin{aligned} \alpha \in [0,\alpha _0] \longmapsto x(\cdot ,u^\mu _r (\cdot ,\alpha ),x_0 ) \in \mathrm {C}( [r+\lambda ,\tau ], {\mathbb {R}}^n), \end{aligned}$$

is differentiable at \(\alpha = 0\), and its derivative is equal to \(w^\mu _r\), the unique (global) solution to the linear homogeneous Cauchy problem given by

    $$\begin{aligned} \left\{ \begin{array}{l} {\dot{w}}(t) = \partial _1 f(x(t,u,x_0),u(t),t) \times w(t), \qquad \text {a.e. } t \in [r,\tau ], \\ w(r)= \mu \Big ( f(x(r,u,x_0),u(r^-),r) - f(x(r,u,x_0),u(r^+),r) \Big ). \end{array} \right. \end{aligned}$$

Proof

We only prove the case \(\mu =1\) (the proof for the case \(\mu =-1\) is similar). First of all, note that the variation vector \(w^\mu _r\) is global (in the sense that it is defined over the whole interval \([r,\tau ]\)) since the corresponding Cauchy problem is linear. Let \(R:=\Vert u \Vert _{\mathrm {L}^\infty }\). For ease of notation, we denote by \(\mathrm {K}:= \mathrm {K}( (u,x_0), (R,\tau ) )\) and by \(L := L ( (u,x_0), (R,\tau ) )\) (see the beginning of Sect. A.2.1 for these two notations). From Proposition A.4, there exists \(\varepsilon >0\) such that \(\tau (u',x'_0)>\tau \) for all \((u',x'_0) \in {\mathcal {N}}( (u,x_0), (R,\tau ),\varepsilon )\). Let us take \(0<\alpha _0\le \frac{\eta _r}{2}\) small enough such that \(r+\alpha _0 < \tau \) and \(2R\alpha _0\le \varepsilon \). Then it holds that \(u^\mu _r(\cdot ,\alpha ) = u\) over \([\tau ,+\infty )\), \(\Vert u^\mu _r(\cdot ,\alpha )-u \Vert _{\mathrm {L}^1([0,\tau ],{\mathbb {R}}^m)} \le 2R\alpha \le \varepsilon \) and \(\Vert u^\mu _r(\cdot ,\alpha ) \Vert _{\mathrm {L}^\infty } \le R\) for all \(0\le \alpha \le \alpha _0\). It follows that \((u^\mu _r(\cdot ,\alpha ),x_0)\in {\mathcal {N}}( (u,x_0), (R,\tau ),\varepsilon )\) and \(\tau (u^\mu _r(\cdot ,\alpha ),x_0)>\tau \) for all \(0\le \alpha \le \alpha _0\). The first item of Proposition A.7 is proved. Since \((u^\mu _r(\cdot ,\alpha ),x_0) \in {\mathcal {N}}( (u,x_0), (R,\tau ),\varepsilon )\) for all \(0\le \alpha \le \alpha _0\) and \(u^\mu _r(\cdot ,\alpha )\) converges to u in \(\mathrm {L}^1([0,\tau ],{\mathbb {R}}^m)\) as \(\alpha \) tends to zero, we know from Proposition A.4 that \(x(\cdot ,u^\mu _r(\cdot ,\alpha ),x_0)\) converges uniformly to \(x(\cdot ,u,x_0)\) over \([0,\tau ]\) as \(\alpha \) tends to zero and that \((x(t,u^\mu _r(\cdot ,\alpha ),x_0),u^\mu _r(t,\alpha ),t) \in \mathrm {K}\) for almost every \(t \in [0,\tau ]\) and for all \(0\le \alpha \le \alpha _0\). Now let us define the function

$$\begin{aligned} \varepsilon (t,\alpha ):=\frac{x(t,u^\mu _r(\cdot ,\alpha ),x_0)-x(t,u,x_0)}{\alpha }-w^\mu _r(t), \end{aligned}$$

for all \(t\in [r,\tau ]\) and all \(\alpha \in (0,\alpha _0]\). Let \(0 < \lambda \le \tau -r\) be fixed. We will prove that \(\varepsilon (\cdot ,\alpha )\) uniformly converges to the zero function on \([r+\lambda ,\tau ]\) as \(\alpha \) tends to 0. From the integral representation of \(\varepsilon (\cdot ,\alpha )\), it holds that

$$\begin{aligned} \begin{aligned} \varepsilon (t,\alpha )&=\varepsilon (r+\alpha ,\alpha )\\&\quad +\int _{r+\alpha }^t\Bigg [\frac{f(x(s,u^\mu _r(\cdot ,\alpha ),x_0),u(s),s)-f(x(s,u,x_0),u(s),s)}{\alpha }\\&\quad -\partial _1 f(x(s,u,x_0),u(s),s)\times w^\mu _r(s)\Bigg ]\,\text {d}s, \end{aligned} \end{aligned}$$

for all \(t\in [r+\alpha ,\tau ]\) and all \(\alpha \in (0,\alpha _0]\). Expanding this expression using Taylor’s theorem with integral remainder, we obtain

$$\begin{aligned} \begin{aligned} \varepsilon (t,\alpha )=\,&\varepsilon (r+\alpha ,\alpha )+\int _{r+\alpha }^t\int _{0}^{1} \partial _1 f(\star _{\alpha \theta s})\,\text {d}\theta \times \varepsilon (s,\alpha )\,\text {d}s\\&+\int _{r+\alpha }^{t} \left( \int _{0}^{1} \partial _1 f(\star _{\alpha \theta s})-\partial _1 f(x(s,u,x_0),u(s),s)\,\text {d}\theta \right) \times w^\mu _r(s)\,\text {d}s, \end{aligned} \end{aligned}$$

for all \(t\in [r+\alpha ,\tau ]\) and all \(\alpha \in (0,\alpha _0]\), where

$$\begin{aligned} \star _{\alpha \theta s}:=(x(s,u,x_0)+\theta (x(s,u^\mu _r(\cdot ,\alpha ),x_0)-x(s,u,x_0)),u(s),s) \in \mathrm {K}, \end{aligned}$$

since \(\mathrm {K}\) is convex with respect to its first two variables. From the triangle inequality, it holds that

$$\begin{aligned} \Vert \varepsilon (t,\alpha ) \Vert _{{\mathbb {R}}^n}\le \Vert \varepsilon (r+\alpha ,\alpha ) \Vert _{{\mathbb {R}}^n}+\Phi (\alpha )+L\int _{r+\alpha }^{t} \Vert \varepsilon (s,\alpha ) \Vert _{{\mathbb {R}}^n}\,\text {d}s, \end{aligned}$$

for all \(t\in [r+\alpha ,\tau ]\) and all \(\alpha \in (0,\alpha _0]\), where the term \(\Phi (\alpha )\) is defined to be:

$$\begin{aligned} \Phi (\alpha ):=\int _{r}^{\tau }\int _{0}^{1}\Vert \partial _1 f(\star _{\alpha \theta s})-\partial _1 f(x(s,u,x_0),u(s),s) \Vert _{{\mathbb {R}}^{n\times n}}\,\text {d}\theta \Vert w^\mu _r(s) \Vert _{{\mathbb {R}}^n}\,\text {d}s . \end{aligned}$$

From the Gronwall lemma, it holds that

$$\begin{aligned} \Vert \varepsilon (t,\alpha ) \Vert _{{\mathbb {R}}^n}\le ( \Vert \varepsilon (r+\alpha ,\alpha ) \Vert _{{\mathbb {R}}^n}+\Phi (\alpha ))e^{L\tau }, \end{aligned}$$

for all \(t\in [r+\alpha ,\tau ]\) and all \(\alpha \in (0,\alpha _0]\). Since we only want to prove the uniform convergence of \(\varepsilon (\cdot ,\alpha )\) to the zero function on \([r+\lambda ,\tau ]\) as \(\alpha \) tends to 0 and since the estimate on the right-hand side is independent of t, we only need to prove that \(\varepsilon (r+\alpha ,\alpha )\) tends to \(0_{{\mathbb {R}}^n}\) and \(\Phi (\alpha )\) tends to 0 as \(\alpha \) tends to zero. The convergence of \(\Phi (\alpha )\) can be obtained with the Lebesgue dominated convergence theorem. Now let us prove that \(\varepsilon (r+\alpha ,\alpha )\) tends to \(0_{{\mathbb {R}}^n}\) as \(\alpha \) tends to zero. Since \(x(r,u^\mu _r(\cdot ,\alpha ),x_0) = x(r,u,x_0)\) and from the integral representations of \(x(\cdot ,u^\mu _r(\cdot ,\alpha ),x_0)\) and \(x(\cdot ,u,x_0)\), it holds that

$$\begin{aligned} \begin{aligned}&\varepsilon (r+\alpha ,\alpha )\\&\quad =\int _r^{r+\alpha } \frac{f(x(s,u^\mu _r(\cdot ,\alpha ),x_0),u^\mu _r(s,\alpha ),s)-f(x(s,u,x_0),u(s),s)}{\alpha }\,\text {d}s\\&\qquad -w^\mu _r(r+\alpha )\\&\quad = \int _r^{r+\alpha } \frac{f(x(s,u^\mu _r(\cdot ,\alpha ),x_0),u(r^-),s)-f(x(s,u,x_0),u(r^+),s)}{\alpha }\,\text {d}s\\&\qquad -w^\mu _r(r+\alpha )\\&\quad =\int _r^{r+\alpha }\frac{f(x(s,u,x_0),u(r^-),s)-f(x(s,u,x_0),u(r^+),s)}{\alpha }\,\text {d}s-w^\mu _r(r+\alpha )\\&\quad \quad +\int _r^{r+\alpha }\frac{f(x(s,u^\mu _r(\cdot ,\alpha ),x_0),u(r^-),s)-f(x(s,u,x_0),u(r^-),s)}{\alpha }\,\text {d}s , \end{aligned} \end{aligned}$$

for all \(\alpha \in (0, \alpha _0]\). Let us deal with the three terms above. Since the first integrand above is continuous, r is a Lebesgue point and we get that

$$\begin{aligned}&\lim \limits _{\alpha \rightarrow 0}\int _r^{r+\alpha }\frac{f(x(s,u,x_0),u(r^-),s)-f(x(s,u,x_0),u(r^+),s)}{\alpha }\,\text {d}s \\&\quad = f(x(r,u,x_0),u(r^-),r)-f(x(r,u,x_0),u(r^+),r)=w^\mu _r(r). \end{aligned}$$

Secondly, from the continuity of \(w^\mu _r\), we know that \(w^\mu _r(r+\alpha )\) tends to \(w^\mu _r(r)\) as \(\alpha \) tends to 0. Finally, using the Lipschitz continuity of f over \(\mathrm {K}\), we get that

$$\begin{aligned}&\left\| \int _r^{r+\alpha }\frac{f(x(s,u^\mu _r(\cdot ,\alpha ),x_0),u(r^-),s)-f(x(s,u,x_0),u(r^-),s)}{\alpha }\,\text {d}s \right\| _{{\mathbb {R}}^n} \\&\quad \le \frac{L}{\alpha } \int _{r}^{r+\alpha } \Vert x(s,u^\mu _r(\cdot ,\alpha ),x_0)-x(s,u,x_0) \Vert _{{\mathbb {R}}^n}\,\text {d}s\\&\quad \le L \Vert x(\cdot ,u^\mu _r(\cdot ,\alpha ),x_0)-x(\cdot ,u,x_0) \Vert _{\mathrm {C}([0,\tau ],{\mathbb {R}}^n)}. \end{aligned}$$

Since \(x(\cdot ,u^\mu _r(\cdot ,\alpha ),x_0)\) converges uniformly to \(x(\cdot ,u,x_0)\) over \([0,\tau ]\) as \(\alpha \) tends to 0, the proof is complete. \(\square \)
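A numerical check of Proposition A.7 (case \(\mu = 1\)), again for the assumed dynamics \(f(x,u,t) = -x + u\), now with \(u = 1\) on \([0,r)\) and \(u = 0\) on \([r,\tau )\): the initial condition of the variation vector is \(w(r) = f(x(r),1,r) - f(x(r),0,r) = 1\), hence \(w^1_r(t) = e^{-(t-r)}\). The sketch splits the integration exactly at the perturbed switching time \(r+\alpha \), so the solver never integrates across a discontinuity.

```python
import numpy as np
from scipy.integrate import solve_ivp

# Check of Proposition A.7 (mu = 1) for f(x, u, t) = -x + u, with u = 1 on
# [0, r) and u = 0 on [r, tau): then w(r) = 1 and w_r(t) = exp(-(t - r)).
r, tau, x0 = 1.0, 2.0, 0.0

def x_at_tau(alpha):
    # two legs split exactly at the perturbed switching time r + alpha
    ts = r + alpha
    leg1 = solve_ivp(lambda t, x: -x + 1.0, (0.0, ts), [x0],
                     rtol=1e-12, atol=1e-12)
    leg2 = solve_ivp(lambda t, x: -x, (ts, tau), [leg1.y[0, -1]],
                     rtol=1e-12, atol=1e-12)
    return leg2.y[0, -1]

alpha = 1e-6
finite_diff = (x_at_tau(alpha) - x_at_tau(0.0)) / alpha
print(finite_diff, np.exp(-(tau - r)))  # both close to w_r(tau)
```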

We conclude this section with a technical lemma on the convergence of the variation vectors. This result is needed in the proof of our main result (see Sect. A.3.2).

Lemma A.9

Let \((u,x_0)\in \mathrm {L}^\infty ({\mathbb {R}}_+,{\mathbb {R}}^m)\times {\mathbb {R}}^n\). Let \(R\ge \Vert u \Vert _{\mathrm {L}^\infty }\) and let \(0<\tau <\tau (u,x_0)\). We take \(\varepsilon >0\) as in Proposition A.4. Let \((u_k,x_{0,k})_{k\in {\mathbb {N}}}\) be a sequence of elements in \({\mathcal {N}}((u,x_0), (R,\tau ),\varepsilon )\) such that \(x_{0,k}\) converges to \(x_0\) and \(u_k(t)\) converges to u(t) for almost every \(t\in [0,\tau ]\). Let \(0< r < \tau \) be a switching time of u and \(r_k\) be a switching time of \(u_k\) for all \(k \in {\mathbb {N}}\). Let us assume that \(r_k\) converges to r and that \(u_k(r^-_k)\) and \(u_k(r^+_k)\) converge, respectively, to \(u(r^-)\) and \(u(r^+)\). Finally, let \(\mu \in \{-1,1\}\) and let \(w^{\mu ,k}_{r_k}\) be the unique solution (that is global) to the linear homogeneous Cauchy problem given by

$$\begin{aligned} \left\{ \begin{array}{l} {\dot{w}}(t) = \partial _1 f(x(t,u_k,x_{0,k}),u_k(t),t) \times w(t), \quad \text {a.e. } t \in [r_k,\tau ], \\ w(r_k)= \mu \Big ( f(x(r_k,u_k,x_{0,k}),u_k(r_k^-),r_k) - f(x(r_k,u_k,x_{0,k}),u_k(r_k^+),r_k) \Big ), \end{array} \right. \end{aligned}$$

for all \(k \in {\mathbb {N}}\). Then, for any \(0<\lambda \le \tau -r\) fixed, the sequence \((w^{\mu ,k}_{r_k})_{k\in {\mathbb {N}}}\) uniformly converges to \(w^{\mu }_r\) over \([r+\lambda ,\tau ]\), where the variation vector \(w^{\mu }_r\) is defined as in Proposition A.7.

Proof

First of all, for all \(k \in {\mathbb {N}}\), note that the variation vector \(w^{\mu ,k}_{r_k}\) is global (in the sense that it is defined over the whole interval \([r_k,\tau ]\)) since the corresponding Cauchy problem is linear. In this proof, we denote by \(\mathrm {K}:= \mathrm {K}( (u,x_0), (R,\tau ) )\) and by \(L := L ( (u,x_0), (R,\tau ) )\) (see the beginning of Sect. A.2.1 for these two notations). From Proposition A.4, it is clear that \( \Vert \partial _1 f(x(t,u_k,x_{0,k}),u_k(t),t) \Vert _{{\mathbb {R}}^{n \times n}} \le L\) for almost every \(t \in [0,\tau ]\) and all \(k \in {\mathbb {N}}\). From the integral representation of \(w^{\mu ,k}_{r_k}\), it holds that

$$\begin{aligned} w^{\mu ,k}_{r_k}(t) = w^{\mu ,k}_{r_k} (r_k) + \int _{r_k}^t \partial _1 f(x(s,u_k,x_{0,k}),u_k(s),s) \times w^{\mu ,k}_{r_k}(s) \, \text {d}s, \end{aligned}$$

for all \(t \in [r_k,\tau ]\) and all \(k \in {\mathbb {N}}\). We deduce that

$$\begin{aligned} \Vert w^{\mu ,k}_{r_k}(t) \Vert _{{\mathbb {R}}^n} \le \Vert w^{\mu ,k}_{r_k} (r_k) \Vert _{{\mathbb {R}}^n} + L \int _{r_k}^t \Vert w^{\mu ,k}_{r_k}(s) \Vert _{{\mathbb {R}}^n} \, \text {d}s, \end{aligned}$$

and, from the Gronwall lemma, that \( \Vert w^{\mu ,k}_{r_k}(t) \Vert _{{\mathbb {R}}^n} \le \Vert w^{\mu ,k}_{r_k} (r_k) \Vert _{{\mathbb {R}}^n} e^{L\tau }\) for all \(t \in [r_k,\tau ]\) and all \(k \in {\mathbb {N}}\). From Proposition A.4, we know that \(x(r_k,u_k,x_{0,k})\) converges to \(x(r,u,x_0)\) and, from the continuity of f and the hypotheses, it is clear that \(w^{\mu ,k}_{r_k}(r_k)\) tends to \(w^\mu _r(r)\). We deduce that there exists a constant \(C \ge 0\) such that \( \Vert w^{\mu ,k}_{r_k}(t) \Vert _{{\mathbb {R}}^n} \le C\) for all \(t \in [r_k,\tau ]\) and all \(k \in {\mathbb {N}}\). Now we define \({\overline{r}}_k:=\max (r_k,r)\) for all \(k \in {\mathbb {N}}\). Note that \({\overline{r}}_k\) tends to r. From the integral representations of \(w^{\mu ,k}_{r_k}\) and \(w^{\mu }_r\), it holds that

$$\begin{aligned} w^{\mu ,k}_{r_k}(t)-w^{\mu }_r(t)= & {} w^{\mu ,k}_{r_k}({\overline{r}}_k)-w^{\mu }_r({\overline{r}}_k)+\int _{{\overline{r}}_k}^t \partial _1f(x(s,u_k,x_{0,k}),u_k(s),s) \\&\times \, (w^{\mu ,k}_{r_k}(s)-w^{\mu }_r(s)) \,\text {d}s \\&+\int _{{\overline{r}}_k}^t (\partial _1f(x(s,u_k,x_{0,k}),u_k(s),s)\\&- \,\partial _1 f(x(s,u,x_0),u(s),s))\times w^{\mu }_r(s)\,\text {d}s, \end{aligned}$$

for all \(t\in [{\overline{r}}_k,\tau ]\) and all \(k\in {\mathbb {N}}\). From the triangle inequality, it holds that

$$\begin{aligned} \Vert w^{\mu ,k}_{r_k}(t)-w^{\mu }_r(t) \Vert _{{\mathbb {R}}^n}\le & {} \Vert w^{\mu ,k}_{r_k}({\overline{r}}_k)-w^{\mu }_r({\overline{r}}_k) \Vert _{{\mathbb {R}}^n}+\Gamma _k\\&+\,L\int _{{\overline{r}}_k}^t \Vert w^{\mu ,k}_{r_k}(s)-w^{\mu }_r(s) \Vert _{{\mathbb {R}}^n} \, \text {d}s, \end{aligned}$$

for all \(t\in [{\overline{r}}_k,\tau ]\) and all \(k\in {\mathbb {N}}\), where the term \(\Gamma _k\) is defined to be:

$$\begin{aligned} \Gamma _k:= & {} \int _{r}^\tau \Vert \partial _1f(x(s,u_k,x_{0,k}),u_k(s),s) \\&-\, \partial _1 f(x(s,u,x_0),u(s),s) \Vert _{{\mathbb {R}}^{n \times n }}\Vert w^{\mu }_r(s) \Vert _{{\mathbb {R}}^n} \, \text {d}s. \end{aligned}$$

From the Gronwall lemma, we obtain

$$\begin{aligned} \Vert w^{\mu ,k}_{r_k}(t)-w^{\mu }_r(t) \Vert _{{\mathbb {R}}^n}\le (\Vert w^{\mu ,k}_{r_k}({\overline{r}}_k)-w^{\mu }_r({\overline{r}}_k) \Vert _{{\mathbb {R}}^n}+\Gamma _k)e^{L\tau }, \end{aligned}$$

for all \(t\in [{\overline{r}}_k,\tau ]\) and all \(k\in {\mathbb {N}}\). Since we only want to prove the uniform convergence of \(w^{\mu ,k}_{r_k}\) to \(w^{\mu }_r\) on \([r+\lambda ,\tau ]\) (and since \({\overline{r}}_k\) converges to r) and since the estimate on the right-hand side is independent of t, we only need to prove that \(\Vert w^{\mu ,k}_{r_k}({\overline{r}}_k)-w^{\mu }_r({\overline{r}}_k) \Vert _{{\mathbb {R}}^n}\) and \(\Gamma _k\) converge to 0 as k tends to \(+\infty \). The convergence of \(\Gamma _k\) can be obtained with the Lebesgue dominated convergence theorem. Now let us prove that \(\Vert w^{\mu ,k}_{r_k}({\overline{r}}_k)-w^{\mu }_r({\overline{r}}_k) \Vert _{{\mathbb {R}}^n}\) tends to 0 as k tends to \(+\infty \). It holds that

$$\begin{aligned}&\Vert w^{\mu ,k}_{r_k}({\overline{r}}_k)-w^\mu _r({\overline{r}}_k) \Vert _{{\mathbb {R}}^n} \\&\quad \le \Vert w^{\mu ,k}_{r_k}({\overline{r}}_k)-w^{\mu ,k}_{r_k}(r_k) \Vert _{{\mathbb {R}}^n} + \Vert w^{\mu ,k}_{r_k}(r_k)-w^{\mu }_r(r) \Vert _{{\mathbb {R}}^n} + \Vert w^{\mu }_r(r)-w^{\mu }_r({\overline{r}}_k) \Vert _{{\mathbb {R}}^n}, \end{aligned}$$

for all \(k\in {\mathbb {N}}\). Let us deal with the three terms above. Firstly, from the integral representation of \(w^{\mu ,k}_{r_k}\), it holds that

$$\begin{aligned} \Vert w^{\mu ,k}_{r_k}({\overline{r}}_k)-w^{\mu ,k}_{r_k}(r_k) \Vert _{{\mathbb {R}}^n}\le & {} \int _{r_k}^{{\overline{r}}_k} \Vert \partial _1f(x(s,u_k,x_{0,k}),u_k(s),s) \Vert _{{\mathbb {R}}^{n \times n} } \Vert w^{\mu ,k}_{r_k}(s) \Vert _{{\mathbb {R}}^n} \, \text {d}s \\\le & {} LC ({\overline{r}}_k -r_k), \end{aligned}$$

for all \(k \in {\mathbb {N}}\). Secondly, we have already mentioned that \(w^{\mu ,k}_{r_k}(r_k)\) tends to \(w^\mu _r(r)\) as k tends to \(+\infty \). Thirdly, we use the continuity of \(w^\mu _r\) to conclude the proof. \(\square \)

1.3 Application of the Ekeland variational principle in the case \(L=0\)

From the sensitivity analysis of the state equation given in Sect. A.2, we are now in a position to give a proof of Theorem 2.1 based on the following simplified version of the Ekeland variational principle (see [17, Theorem 1.1 p. 324]).

Proposition A.8

(Ekeland variational principle) Let \((\mathrm {E},\mathrm {d}_\mathrm {E})\) be a complete metric space. Let \({\mathcal {J}}:\mathrm {E}\rightarrow {\mathbb {R}}_+\) be a continuous nonnegative map. Let \(\varepsilon >0\) and \(\lambda ^*\in \mathrm {E}\) be such that \({\mathcal {J}}(\lambda ^*)\le \varepsilon \). Then there exists \(\lambda _\varepsilon \in \mathrm {E}\) such that \(\mathrm {d}_\mathrm {E}(\lambda _\varepsilon ,\lambda ^*)\le \sqrt{\varepsilon }\) and \(-\sqrt{\varepsilon } \; \mathrm {d}_\mathrm {E}(\lambda ,\lambda _\varepsilon )\le {\mathcal {J}}(\lambda )-{\mathcal {J}}(\lambda _\varepsilon )\) for all \(\lambda \in \mathrm {E}\).

Without loss of generality (see details at the beginning of the Appendix), we will assume that \(L=0\) in Problem (OSCP). We will also assume that the final time and the N-partition are free in Problem (OSCP) (the two simpler cases, where only the final time is fixed and where both are fixed, can be treated in very similar ways).

Let \((T,{\mathbb {T}},x,u)\) be a solution to Problem (OSCP). In the sequel, we will view u as an element of \(\mathrm {L}^\infty ({\mathbb {R}}_+,{\mathbb {R}}^m)\) by introducing the extension

$$\begin{aligned} \left\{ \begin{array}{ll} u &{} \text {over } [0,T), \\ u_{N-1} &{} \text {over } [T,+\infty ). \end{array} \right. \end{aligned}$$

In particular, using the notations of Sect. A.2, note that \(x = x(\cdot ,u,x(0))\) and that \(\tau (u,x(0)) > T\). In the rest of the proof, we fix some \( \tau _0\), \(\tau \) such that

$$\begin{aligned} \tau _0 := T - \dfrac{T-t_{N-1}}{3} \quad \text {and} \quad T< \tau < \min \left( T + \dfrac{T-t_{N-1}}{3} , \tau (u,x(0)) \right) . \end{aligned}$$

In particular, it holds that \(t_{N-1}< \tau _0< T< \tau < \tau (u,x(0))\). Replacing \(t_N = T\) by \(t_N = \tau \), it holds that \({\mathbb {T}}\in {\mathcal {P}}_N^{\tau }\) and, with the above extension of u, it holds that \(u \in \mathrm {PC}^{\mathbb {T}}_N([0,\tau ],{\mathbb {R}}^m)\). We conclude by noting that, with the new value of \(\Vert {\mathbb {T}}\Vert \), it holds that \( t_{N-1}+\frac{\Vert {\mathbb {T}}\Vert }{4} < \tau _0 \).

1.3.1 Fix \(R \in {\mathbb {N}}\) such that \(R \ge \Vert u \Vert _{\mathrm {L}^\infty }\)

In this section, we fix \(R \in {\mathbb {N}}\) such that \(R \ge \Vert u \Vert _{\mathrm {L}^\infty }\) and we denote by

$$\begin{aligned} {\mathcal {N}}^R_\varepsilon := \left\{ (u',x'_0) \in {\mathcal {N}}( (u,x(0)) , (R,\tau ) ,\varepsilon ) \; \Big | \; u' \in \mathrm {PC}_{N,(u,{\mathbb {T}})}([0,\tau ],{\mathbb {R}}^m) \text { with } u'(t) \in \Omega \text { for a.e. } t \in [0,\tau ] \right\} , \end{aligned}$$

where \(\varepsilon > 0\) is given in Proposition A.4. We endow the set \( {\mathcal {N}}^R_\varepsilon \times [ \tau _0, \tau ] \) with the \(\mathrm {L}^1([0,\tau ],{\mathbb {R}}^m) \times {\mathbb {R}}^n \times {\mathbb {R}}\)-distance. Endowed with this distance, it can be seen from Proposition A.1, from the closedness assumption on \(\Omega \) and from the partial converse of the Lebesgue dominated convergence theorem that \({\mathcal {N}}^R_\varepsilon \times [ \tau _0, \tau ]\) is a complete metric space.

Let us consider a sequence \((\varepsilon _k)_{k \in {\mathbb {N}}}\) converging to zero such that \(0< \sqrt{\varepsilon _k} < \varepsilon \) for all \(k \in {\mathbb {N}}\). Then we define the penalized functional \({\mathcal {J}}^R_k : {\mathcal {N}}^R_\varepsilon \times [ \tau _0, \tau ] \rightarrow {\mathbb {R}}_+\) by

$$\begin{aligned}&{\mathcal {J}}^R_k (u',x'_0,T' ) \\&\quad := \sqrt{ \Big ( \varphi (x'_0,x(T',u',x'_0),T') - \varphi (x(0),x(T),T) + \varepsilon _k \Big )^{+2} + \mathrm {d}^2_\mathrm {S}\Big ( g(x'_0,x(T',u',x'_0),T') \Big ) }, \end{aligned}$$

for all \((u',x'_0,T' ) \in {\mathcal {N}}^R_\varepsilon \times [ \tau _0, \tau ] \) and all \(k \in {\mathbb {N}}\).
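To fix ideas, here is a toy Python sketch of the penalized functional (with scalar placeholder values of \(\varphi \) and g, and \(\mathrm {S} = {\mathbb {R}}_-\) so that \(\mathrm {d}_\mathrm {S}(z) = \max (z,0)\); these choices are ours, purely for illustration). It evaluates \({\mathcal {J}}^R_k\) at the nominal triple, where it equals \(\varepsilon _k\) as noted below, and at a perturbed one.

```python
import numpy as np

# Toy version of the penalized functional J_k^R with scalar placeholder data:
# phi_val and g_val stand for phi(x'_0, x(T',u',x'_0), T') and g(...); S = R_-
# gives d_S(z) = max(z, 0); phi_opt stands for phi(x(0), x(T), T).
def J_k(phi_val, g_val, phi_opt, eps_k):
    pos_part = max(phi_val - phi_opt + eps_k, 0.0)  # the (...)^+ term
    dist = max(g_val, 0.0)                          # d_S(g) for S = R_-
    return np.sqrt(pos_part**2 + dist**2)

phi_opt, eps_k = 1.0, 1e-2
print(J_k(phi_opt, -0.3, phi_opt, eps_k))  # equals eps_k at the nominal triple
print(J_k(1.2, 0.1, phi_opt, eps_k))       # positive for a worse, infeasible one
```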

From the continuity of \(\varphi \), g and \(\mathrm {d}^2_\mathrm {S}\) and from Proposition A.4, it follows that \({\mathcal {J}}^R_k\) is a continuous nonnegative map over \( {\mathcal {N}}^R_\varepsilon \times [ \tau _0, \tau ]\) for all \(k \in {\mathbb {N}}\). Furthermore, it is clear that \({\mathcal {J}}^R_k(u,x(0),T)=\varepsilon _k\) for all \(k \in {\mathbb {N}}\). Therefore, from the Ekeland variational principle (see Proposition A.8), we conclude that there exists a sequence \((u_k,x_{0,k},T_k)_{k \in {\mathbb {N}}} \subset {\mathcal {N}}^R_\varepsilon \times [ \tau _0, \tau ]\) such that

$$\begin{aligned} \mathrm {d}_{\mathrm {L}^1([0,\tau ],{\mathbb {R}}^m) \times {\mathbb {R}}^n\times {\mathbb {R}}}((u_k,x_{0,k},T_k),(u,x(0),T))\le \sqrt{\varepsilon _k}, \end{aligned}$$
(6)

and

$$\begin{aligned}&-\sqrt{\varepsilon _k} \; \mathrm {d}_{\mathrm {L}^1([0,\tau ],{\mathbb {R}}^m) \times {\mathbb {R}}^n\times {\mathbb {R}}}((u',x'_0,T'),(u_k,x_{0,k},T_k)) \nonumber \\&\quad \le {\mathcal {J}}^R_k(u',x'_0,T')-{\mathcal {J}}^R_k(u_k,x_{0,k},T_k), \end{aligned}$$
(7)

for all \((u',x'_{0},T') \in {\mathcal {N}}^R_\varepsilon \times [ \tau _0, \tau ]\) and all \(k\in {\mathbb {N}}\).

By contradiction, let us assume that there exist \(k \in {\mathbb {N}}\) and \((u',x'_{0},T')\in {\mathcal {N}}^R_\varepsilon \times [ \tau _0, \tau ]\) such that \({\mathcal {J}}^R_k(u',x'_{0},T') = 0\). In particular, we have \(0 < T' \le \tau \). Let us denote \(x' := x(\cdot ,u',x'_0) \in \mathrm {AC}([0,T'],{\mathbb {R}}^n)\). Since \(u' \in \mathrm {PC}_{N,(u,{\mathbb {T}})}([0,\tau ],{\mathbb {R}}^m)\), there exists \({\mathbb {T}}' = (t'_i)_{i=0,\ldots ,N} \in {\mathcal {P}}^{\tau }_{N,(u,{\mathbb {T}})}\) such that \(u' \in \mathrm {PC}^{{\mathbb {T}}'}_N([0,\tau ],{\mathbb {R}}^m)\), and we know that \(t'_{N-1} \le t_{N-1} + \frac{\Vert {\mathbb {T}}\Vert }{4} < \tau _0 \le T' \le \tau \). Then, replacing \(t'_N = \tau \) by \(t'_N = T'\), we get that \({\mathbb {T}}' \in {\mathcal {P}}^{T'}_N\) and \(u' \in \mathrm {PC}^{{\mathbb {T}}'}_N([0,T'],{\mathbb {R}}^m)\). Moreover, it holds that \(\dot{x'}(t) = f(x'(t),u'(t),t)\) and \(u'(t) \in \Omega \) for almost every \(t \in [0,T']\). Since \({\mathcal {J}}^R_k(u',x'_{0},T') = 0\), we deduce moreover that \(g(x'(0),x'(T'),T') \in \mathrm {S}\). Thus, the quadruple \((T',{\mathbb {T}}',x',u')\) satisfies all constraints of Problem (OSCP), and the optimality of the quadruple \((T,{\mathbb {T}},x,u)\) yields \(\varphi (x'(0),x'(T'),T') \ge \varphi (x(0),x(T),T)\). On the other hand, the equality \({\mathcal {J}}^R_k(u',x'_{0},T') = 0\) imposes \(\varphi (x'(0),x'(T'),T') \le \varphi (x(0),x(T),T) - \varepsilon _k\), which raises a contradiction. We conclude that \({\mathcal {J}}^R_k(u',x'_{0},T') > 0\) for all \((u',x'_{0},T')\in {\mathcal {N}}^R_\varepsilon \times [ \tau _0, \tau ]\) and all \(k \in {\mathbb {N}}\).

From the above paragraph, the couple \((\psi ^{0R}_k, \psi ^R_k)\in {\mathbb {R}}\times {\mathbb {R}}^j\) is well defined by

$$\begin{aligned} \psi ^{0R}_k:=\frac{-1}{{\mathcal {J}}^R_k(u_k,x_{0,k},T_k)} \Big ( \varphi (x_{0,k},x(T_k,u_k,x_{0,k}),T_k) - \varphi (x(0),x(T),T) + \varepsilon _k \Big )^{+} \end{aligned}$$

and

$$\begin{aligned} \psi ^R_k:= & {} \frac{-1}{{\mathcal {J}}^R_k(u_k,x_{0,k},T_k)} \Big ( g(x_{0,k},x(T_k,u_k,x_{0,k}),T_k)\\&-\,\mathrm {P}_\mathrm {S}(g(x_{0,k},x(T_k,u_k,x_{0,k}),T_k)) \Big ) , \end{aligned}$$

for all \(k \in {\mathbb {N}}\). Note that \(\psi ^{0R}_k \in {\mathbb {R}}_-\) and \(-\psi ^{R}_k \in \mathrm {N}_\mathrm {S}[ \mathrm {P}_\mathrm {S}(g(x_{0,k},x(T_k,u_k,x_{0,k}),T_k)) ]\) from Lemma A.1 for all \(k \in {\mathbb {N}}\).

Since \((u_k,x_{0,k},T_k) \in {\mathcal {N}}^R_\varepsilon \times [ \tau _0, \tau ]\), we know that \(u_k \in \mathrm {PC}_{N,(u,{\mathbb {T}})}([0,\tau ],{\mathbb {R}}^m)\) for all \(k \in {\mathbb {N}}\). Let us denote by \({\mathbb {T}}_k = (t_{i,k})_{i=0,\ldots ,N} \in {\mathcal {P}}^\tau _{N,(u,{\mathbb {T}})}\) a partition associated with \(u_k\) for all \(k \in {\mathbb {N}}\). Moreover, from Inequality (6), the sequence \((u_k)_{k\in {\mathbb {N}}}\) converges to u in \(\mathrm {L}^1([0,\tau ],{\mathbb {R}}^m)\). Thus, using Proposition A.2, we can extract a subsequence (which we do not relabel) such that \(u_k(t)\) converges to u(t) for almost every \(t\in [0,\tau ]\), \(t_{i,k}\) converges to \(t_i\) for all \(i=0,\ldots ,N\) and \(u_{i,k}\) converges to \(u_i\) for all \(i=0,\ldots ,N-1\). From Inequality (6), we know that \(x_{0,k}\) and \(T_k\) converge, respectively, to x(0) and T. From Proposition A.4, we deduce that \(x(T_k,u_k,x_{0,k})\) converges to x(T), and thus, \(g(x_{0,k},x(T_k,u_k,x_{0,k}),T_k)\) converges to \(g(x(0),x(T),T) \in \mathrm {S}\). Finally, from the definition of \({\mathcal {J}}^R_k\), it holds that \(| \psi ^{0R}_k |^2+ \Vert \psi ^R_k \Vert ^2_{{\mathbb {R}}^j}=1\) for all \(k \in {\mathbb {N}}\) (see the computation below). By a compactness argument, we can extract subsequences (which we do not relabel) such that \(\psi ^{0R}_k\) converges to some \(\psi ^{0R} \in {\mathbb {R}}_-\) and \(\psi ^{R}_k\) converges to some \(\psi ^{R} \in {\mathbb {R}}^j\) which satisfies \(-\psi ^R \in \mathrm {N}_\mathrm {S}[ g(x(0),x(T),T) ]\) from Lemma A.2. Note that \(| \psi ^{0R} |^2+ \Vert \psi ^R \Vert ^2_{{\mathbb {R}}^j}=1\).
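Let us also verify the normalization claimed above. With the shorthand notations (used here only for readability) \(\varphi _k := \varphi (x_{0,k},x(T_k,u_k,x_{0,k}),T_k)\) and \(g_k := g(x_{0,k},x(T_k,u_k,x_{0,k}),T_k)\), and recalling that \(\mathrm {d}_{\mathrm {S}}(z) = \Vert z - \mathrm {P}_{\mathrm {S}}(z) \Vert _{{\mathbb {R}}^j}\), the definitions of \(\psi ^{0R}_k\) and \(\psi ^R_k\) give

$$\begin{aligned} \vert \psi ^{0R}_k \vert ^2 + \Vert \psi ^R_k \Vert ^2_{{\mathbb {R}}^j} = \frac{ \big ( \varphi _k - \varphi (x(0),x(T),T) + \varepsilon _k \big )^{+2} + \mathrm {d}^2_{\mathrm {S}} ( g_k ) }{ {\mathcal {J}}^R_k(u_k,x_{0,k},T_k)^2 } = 1. \end{aligned}$$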

A.3.2 Crucial inequalities depending on the parameter R fixed in the previous section

In this section, we will use Inequality (7) along with the perturbations defined in Sect. A.2 to obtain four crucial inequalities (depending on the parameter R fixed in the previous section). The perturbations will bear on \(u_k\), \(x_{0,k}\) and \(t_{i,k}\), but also on \(T_k\).

Lemma A.10

Let \(v \in \mathrm {PC}^{\mathbb {T}}_N([0,\tau ],{\mathbb {R}}^m)\) be a function taking values in \(\Omega \cap {\overline{\mathrm {B}}}_{{\mathbb {R}}^m}(0_{{\mathbb {R}}^m},R)\). Then the inequality

$$\begin{aligned} \Big \langle \psi ^{0R} \partial _2 \varphi (x(0),x(T),T)+ \partial _2 g(x(0),x(T),T)^\top \times \psi ^R,w_v(T) \Big \rangle _{{\mathbb {R}}^n} \le 0, \end{aligned}$$
(8)

where \(w_v\) is defined in Proposition A.5, holds true.

Proof

The proof is divided into three steps.

First step For all \(k \in {\mathbb {N}}\), let us define

$$\begin{aligned} v_k(t) := v_i \quad \text { if } t \in [t_{i,k},t_{i+1,k}) \text { for some } i \in \{ 0,\ldots ,N-1 \}, \end{aligned}$$

for all \(t \in [0,\tau )\). Then \(v_k \in \mathrm {PC}^{{\mathbb {T}}_k}_N([0,\tau ],{\mathbb {R}}^m)\) for all \(k \in {\mathbb {N}}\) and, since \(t_{i,k}\) converges to \(t_i\) for all \(i=0,\ldots ,N\), the sequence \((v_k)_{k \in {\mathbb {N}}}\) converges to \(v\) in \(\mathrm {L}^1([0,\tau ],{\mathbb {R}}^m)\) (see the bound below). It is also true that \(v_k\) takes its values in \(\Omega \cap {\overline{\mathrm {B}}}_{{\mathbb {R}}^m}(0_{{\mathbb {R}}^m},R)\) for all \(k \in {\mathbb {N}}\).
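The \(\mathrm {L}^1\) convergence invoked here can be quantified: \(v_k\) and \(v\) may differ only on the set \(\bigcup _{i=1}^{N-1} [\min (t_i,t_{i,k}),\max (t_i,t_{i,k}))\), on which both take values in \({\overline{\mathrm {B}}}_{{\mathbb {R}}^m}(0_{{\mathbb {R}}^m},R)\), so that

$$\begin{aligned} \Vert v_k - v \Vert _{\mathrm {L}^1([0,\tau ],{\mathbb {R}}^m)} \le 2R \sum _{i=1}^{N-1} \vert t_{i,k} - t_i \vert \longrightarrow 0 \quad \text {as } k \rightarrow +\infty . \end{aligned}$$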

Second step Let us fix \(k\in {\mathbb {N}}\). We define as in Proposition A.5 the convex perturbation

$$\begin{aligned} u_{k,v_k} (\cdot ,\alpha ) := \left\{ \begin{array}{ll} u_k+\alpha (v_k-u_k) &{} \text {over } [0,\tau ), \\ u_k=u &{} \text {over } [\tau ,+\infty ), \end{array} \right. \end{aligned}$$

for all \(0 \le \alpha \le 1\). First of all, note that \(u_{k,v_k} (\cdot ,\alpha ) \in \mathrm {PC}^{{\mathbb {T}}_k}_N([0,\tau ],{\mathbb {R}}^m) \subset \mathrm {PC}_{N,(u,{\mathbb {T}})}([0,\tau ],{\mathbb {R}}^m)\) and, since \(\Omega \) is convex, that \(u_{k,v_k} (\cdot ,\alpha )\) takes its values in \(\Omega \) for all \(0 \le \alpha \le 1\). Moreover, it holds that \(\Vert u_{k,v_k}(\cdot ,\alpha ) \Vert _{\mathrm {L}^\infty } \le R\) and

$$\begin{aligned} \Vert u_{k,v_k}(\cdot ,\alpha )-u \Vert _{\mathrm {L}^1([0,\tau ],{\mathbb {R}}^m)}\le & {} \Vert u_{k,v_k}(\cdot ,\alpha )-u_k \Vert _{\mathrm {L}^1([0,\tau ],{\mathbb {R}}^m)}+\Vert u_k-u \Vert _{\mathrm {L}^1([0,\tau ],{\mathbb {R}}^m)} \\\le & {} \alpha \Vert v_k-u_k \Vert _{\mathrm {L}^1([0,\tau ],{\mathbb {R}}^m)}+\sqrt{\varepsilon _k}. \end{aligned}$$

Since \(\sqrt{\varepsilon _k}<\varepsilon \), it follows that there exists \(0<\alpha _0\le 1\) small enough such that \((u_{k,v_k}(\cdot ,\alpha ),x_{0,k}) \in {\mathcal {N}}^R_\varepsilon \) for all \(\alpha \in [0,\alpha _0]\). From Inequality (7), we obtain

$$\begin{aligned} -\sqrt{\varepsilon _k} \Vert u_{k,v_k}(\cdot ,\alpha )-u_k \Vert _{\mathrm {L}^1([0,\tau ],{\mathbb {R}}^m)} \le {\mathcal {J}}^R_k(u_{k,v_k}(\cdot ,\alpha ),x_{0,k},T_k)-{\mathcal {J}}^R_k(u_k,x_{0,k},T_k), \end{aligned}$$

and thus, using the factorization \(a-b=\frac{a^2-b^2}{a+b}\) (licit here since \({\mathcal {J}}^R_k\) takes positive values),

$$\begin{aligned}&-\sqrt{\varepsilon _k}\Vert v_k-u_k \Vert _{\mathrm {L}^1([0,\tau ],{\mathbb {R}}^m)} \\&\quad \le \frac{1}{{\mathcal {J}}^R_k(u_{k,v_k}(\cdot ,\alpha ),x_{0,k},T_k)+{\mathcal {J}}^R_k(u_k,x_{0,k},T_k)}\\&\qquad \times \, \frac{{\mathcal {J}}^R_k(u_{k,v_k}(\cdot ,\alpha ),x_{0,k},T_k)^2-{\mathcal {J}}^R_k(u_k,x_{0,k},T_k)^2}{\alpha }, \end{aligned}$$

for all \(\alpha \in (0,\alpha _0]\). Taking the limit as \(\alpha \) tends to 0 and using the definitions of \(\psi ^{0R}_k\) and \(\psi ^R_k\), we obtain from Proposition A.5 that

$$\begin{aligned}&\Big \langle \psi ^{0R}_k \partial _2 \varphi (x_{0,k},x(T_k,u_k,x_{0,k}),T_k) + \partial _2 g(x_{0,k},x(T_k,u_k,x_{0,k}),T_k) ^\top \times \psi ^R_k, w^k_{v_k}(T_k) \Big \rangle _{{\mathbb {R}}^n} \\&\quad \le \sqrt{\varepsilon _k} \Vert v_k-u_k \Vert _{\mathrm {L}^1([0,\tau ],{\mathbb {R}}^m)}, \end{aligned}$$

where \(w^k_{v_k}\) is defined in Lemma A.7.

Third step We take the limit of the above inequality as k tends to \(+\infty \). Since \(\varphi \) and g are both of class \(\mathrm {C}^1\), using the uniform convergence of \((w^k_{v_k})_{k\in {\mathbb {N}}}\) to \(w_v\) over \([0,\tau ]\) (see Lemma A.7), it holds that

$$\begin{aligned} \Big \langle \psi ^{0R} \partial _2 \varphi (x(0),x(T),T)+ \partial _2 g(x(0),x(T),T)^\top \times \psi ^R,w_v(T) \Big \rangle _{{\mathbb {R}}^n} \le 0. \end{aligned}$$

The proof is complete. \(\square \)

Lemma A.11

Let \(y\in {\mathbb {R}}^n\) be fixed. Then the inequality

$$\begin{aligned}&\Big \langle \psi ^{0R}\partial _1 \varphi (x(0),x(T),T)+\partial _1g(x(0),x(T),T)^\top \times \psi ^R, y \Big \rangle _{{\mathbb {R}}^n} \nonumber \\&\quad +\,\Big \langle \psi ^{0R}\partial _2 \varphi (x(0),x(T),T)+\partial _2g(x(0),x(T),T)^\top \times \psi ^R,w_y(T) \Big \rangle _{{\mathbb {R}}^n}\le 0,\qquad \quad \end{aligned}$$
(9)

where \(w_y\) is defined in Proposition A.6, holds true.

Proof

The proof is standard and left to the reader. For similar statements with detailed proofs, we refer to [11, Lemma 17 pp. 3807–3808] or [13, Lemma 4.20 p. 87]. \(\square \)

Lemma A.12

Let \(i \in \{1,\ldots ,N-1 \}\) be such that \(u_{i-1} \ne u_i\) and let \(\mu \in \{-1,1 \}\). Then the inequality

$$\begin{aligned} \Big \langle \psi ^{0R}\partial _2\varphi (x(0),x(T),T)+\partial _2 g(x(0),x(T),T)^\top \times \psi ^R,w^\mu _{t_i}(T) \Big \rangle _{{\mathbb {R}}^n} \le 0,\qquad \end{aligned}$$
(10)

where \(w^\mu _{t_i}\) is defined in Proposition A.7, holds true.

Proof

The proof is divided into two steps.

First step Since \(t_{i,k}\) converges to \(t_i\) and since \(t_i - \frac{\Vert {\mathbb {T}}\Vert }{4} \le t_{i,k} \le t_i + \frac{\Vert {\mathbb {T}}\Vert }{4}\), we fix \(k\in {\mathbb {N}}\) sufficiently large in order to guarantee that \(t_i - \frac{\Vert {\mathbb {T}}\Vert }{8} \le t_{i,k} \le t_i + \frac{\Vert {\mathbb {T}}\Vert }{8}\) and that \(u_{i-1,k} \ne u_{i,k}\) (the latter being possible since \(u_{i-1,k}\) and \(u_{i,k}\) converge, respectively, to \(u_{i-1}\) and \(u_i\) with \(u_{i-1} \ne u_i\)). Since \(u_k \in \mathrm {PC}^{{\mathbb {T}}_k}_N([0,\tau ],{\mathbb {R}}^m)\), the point \(t_{i,k}\) is then a switching time of \(u_k\) with \(\eta _{t_{i,k}} := \min (t_{i,k}-t_{i-1,k},t_{i+1,k}-t_{i,k} ) > 0\). We define the perturbation \(u^{\mu }_{k,t_{i,k}}(\cdot ,\alpha )\) as

$$\begin{aligned} u^{\mu }_{k,t_{i,k}} (\cdot ,\alpha ) := \left\{ \begin{array}{ll} u_k(t_{i,k}^-)=u_{i-1,k} &{} \text {over } [t_{i,k}-\eta _{t_{i,k}}, t_{i,k}+\mu \alpha ), \\ u_k(t_{i,k}^+)=u_{i,k} &{} \text {over } [t_{i,k}+\mu \alpha ,t_{i,k}+\eta _{t_{i,k}}), \\ u_k &{} \text {otherwise}, \end{array} \right. \end{aligned}$$

for all \(0 \le \alpha \le \frac{\eta _{t_{i,k}}}{2}\). Considering \({\mathbb {T}}_k^{i,\alpha }\) the N-partition given by

$$\begin{aligned} 0 = t_{0,k}< t_{1,k}< \cdots< t_{i-1,k}< t_{i,k} + \mu \alpha< t_{i+1,k}< \cdots< t_{N-1,k} < t_{N,k} = \tau , \end{aligned}$$

it is clear that \( u^{\mu }_{k,t_{i,k}} (\cdot ,\alpha ) \in \mathrm {PC}^{{\mathbb {T}}_k^{i,\alpha }}_N([0,\tau ],{\mathbb {R}}^m)\) for all \(0 \le \alpha \le \frac{\eta _{t_{i,k}}}{2}\). Since \(t_i - \frac{\Vert {\mathbb {T}}\Vert }{8} \le t_{i,k} \le t_i + \frac{\Vert {\mathbb {T}}\Vert }{8}\), it holds that \(t_i - \frac{\Vert {\mathbb {T}}\Vert }{4} \le t_{i,k} + \mu \alpha \le t_i + \frac{\Vert {\mathbb {T}}\Vert }{4}\) whenever \(\alpha \le \frac{\Vert {\mathbb {T}}\Vert }{8}\), and thus \({\mathbb {T}}_k^{i,\alpha } \in {\mathcal {P}}^\tau _{N,(u,{\mathbb {T}})}\) and \( u^{\mu }_{k,t_{i,k}} (\cdot ,\alpha ) \in \mathrm {PC}_{N,(u,{\mathbb {T}})}([0,\tau ],{\mathbb {R}}^m)\) for small enough \(0 \le \alpha \le \frac{\eta _{t_{i,k}}}{2}\). Note that \(u^{\mu }_{k,t_{i,k}} (\cdot ,\alpha )\) takes its values in \(\Omega \) for all \(0 \le \alpha \le \frac{\eta _{t_{i,k}}}{2}\). Moreover, it holds that \(\Vert u^{\mu }_{k,t_{i,k}}(\cdot ,\alpha ) \Vert _{\mathrm {L}^\infty } \le R\) and

$$\begin{aligned} \Vert u^{\mu }_{k,t_{i,k}}(\cdot ,\alpha )-u \Vert _{\mathrm {L}^1([0,\tau ],{\mathbb {R}}^m)}\le & {} \Vert u^{\mu }_{k,t_{i,k}}(\cdot ,\alpha )-u_k \Vert _{\mathrm {L}^1([0,\tau ],{\mathbb {R}}^m)}+\Vert u_k-u \Vert _{\mathrm {L}^1([0,\tau ],{\mathbb {R}}^m)} \\\le & {} 2R\alpha +\sqrt{\varepsilon _k} , \end{aligned}$$

for all \(0 \le \alpha \le \frac{\eta _{t_{i,k}}}{2}\), where the first term is bounded by \(2R\alpha \) since \(u^{\mu }_{k,t_{i,k}}(\cdot ,\alpha )\) and \(u_k\) differ only on an interval of length \(\alpha \), on which both are bounded by \(R\). Since \(\sqrt{\varepsilon _k}<\varepsilon \), we conclude that there exists \(0 < \alpha _0 \le \frac{\eta _{t_{i,k}}}{2} \) small enough such that \((u^{\mu }_{k,t_{i,k}}(\cdot ,\alpha ),x_{0,k}) \in {\mathcal {N}}^R_\varepsilon \) for all \(0 \le \alpha \le \alpha _0\). From Inequality (7), we obtain

$$\begin{aligned} -\sqrt{\varepsilon _k} \Vert u^{\mu }_{k,t_{i,k}}(\cdot ,\alpha )-u_k \Vert _{\mathrm {L}^1([0,\tau ],{\mathbb {R}}^m)} \le {\mathcal {J}}^R_k(u^{\mu }_{k,t_{i,k}}(\cdot ,\alpha ),x_{0,k},T_k)-{\mathcal {J}}^R_k(u_k,x_{0,k},T_k), \end{aligned}$$

and thus,

$$\begin{aligned} -2R\sqrt{\varepsilon _k}&\le \frac{1}{{\mathcal {J}}^R_k(u^{\mu }_{k,t_{i,k}}(\cdot ,\alpha ),x_{0,k},T_k)+{\mathcal {J}}^R_k(u_k,x_{0,k},T_k)}\\&\qquad \times \, \frac{{\mathcal {J}}^R_k(u^{\mu }_{k,t_{i,k}}(\cdot ,\alpha ),x_{0,k},T_k)^2-{\mathcal {J}}^R_k(u_k,x_{0,k},T_k)^2 }{\alpha }, \end{aligned}$$

for all \(\alpha \in (0,\alpha _0]\). Taking the limit as \(\alpha \) tends to 0 and using the definitions of \(\psi ^{0R}_k\) and \(\psi ^R_k\), we obtain from Proposition A.7 that

$$\begin{aligned}&\Big \langle \psi ^{0R}_k \partial _2 \varphi (x_{0,k},x(T_k,u_k,x_{0,k}),T_k) + \partial _2 g(x_{0,k},x(T_k,u_k,x_{0,k}),T_k) ^\top \\&\quad \times \psi ^R_k, w^{\mu ,k}_{t_{i,k}}(T_k) \Big \rangle _{{\mathbb {R}}^n}\le 2R\sqrt{\varepsilon _k}, \end{aligned}$$

where \(w^{\mu ,k}_{t_{i,k}}\) is defined in Lemma A.9.

Second step We take the limit of the above inequality as k tends to \(+\infty \). Since \(\varphi \) and g are of class \(\mathrm {C}^1\) and since \(t_i < \tau _0\), using the uniform convergence of \((w^{\mu ,k}_{t_{i,k}})_{k\in {\mathbb {N}}}\) to \(w^\mu _{t_i}\) over \([\tau _0,\tau ]\) (see Lemma A.9), it holds that

$$\begin{aligned} \Big \langle \psi ^{0R} \partial _2 \varphi (x(0),x(T),T)+ \partial _2 g(x(0),x(T),T)^\top \times \psi ^R,w^\mu _{t_i}(T) \Big \rangle _{{\mathbb {R}}^n} \le 0. \end{aligned}$$

The proof is complete. \(\square \)

Lemma A.13

The equality

$$\begin{aligned}&\Big \langle \psi ^{0R} \partial _2 \varphi (x(0),x(T),T) + \partial _2g(x(0),x(T),T)^\top \times \psi ^R, f(x(T),u_{N-1},T)\Big \rangle _{{\mathbb {R}}^n} \nonumber \\&\quad +\,\psi ^{0R} \partial _3 \varphi (x(0),x(T),T) + \partial _3 g(x(0),x(T),T)^\top \times \psi ^R = 0 , \end{aligned}$$
(11)

holds.

Proof

The proof is divided into two steps.

First step Let \(\mu \in \{ -1 ,1 \}\). Since \((T_k)_{k \in {\mathbb {N}}}\) converges to \(T \in (\tau _0,\tau )\), we have \(T_k \in (\tau _0,\tau )\) for \(k \in {\mathbb {N}}\) sufficiently large. Let us fix such an integer \(k \in {\mathbb {N}}\). Thus, there exists \(\alpha _0 > 0\) small enough such that \((u_k,x_{0,k},T_k+\mu \alpha ) \in {\mathcal {N}}^R_\varepsilon \times [\tau _0,\tau ]\) for all \(0\le \alpha \le \alpha _0\). From Inequality (7), we obtain

$$\begin{aligned} -\sqrt{\varepsilon _k} \; \vert T_k+\mu \alpha - T_k \vert \le {\mathcal {J}}^R_k(u_k,x_{0,k},T_k+\mu \alpha ) - {\mathcal {J}}^R_k(u_k,x_{0,k},T_k), \end{aligned}$$

and thus,

$$\begin{aligned} -\sqrt{\varepsilon _k}\le & {} \frac{1}{{\mathcal {J}}^R_k(u_k,x_{0,k},T_k+\mu \alpha ) + {\mathcal {J}}^R_k(u_k,x_{0,k},T_k)} \\&\times \frac{{\mathcal {J}}^R_k(u_k,x_{0,k},T_k+\mu \alpha )^2 - {\mathcal {J}}^R_k(u_k,x_{0,k},T_k)^2}{\alpha }, \end{aligned}$$

for all \(\alpha \in (0,\alpha _0]\). Taking the limit as \(\alpha \) tends to 0 and using the definitions of \(\psi ^{0R}_k\) and \(\psi ^R_k\), we obtain from the differentiability of \(x(\cdot ,u_k,x_{0,k})\) at \(T_k\) (since \(u_k\) is constant over the interval \([ \tau _0,\tau ] \subset [t_{N-1,k},t_{N,k}]\) and since \(T_k \in (\tau _0,\tau )\)) that

$$\begin{aligned}&\mu \Big \langle \psi ^{0R}_k \partial _2\varphi (x_{0,k},x(T_k,u_k,x_{0,k}),T_k)+ \partial _2 g(x_{0,k},x(T_k,u_k,x_{0,k}),T_k)^\top \times \psi ^R_k, \\&\quad f(x(T_k,u_k,x_{0,k}),u_k(T_k),T_k) \Big \rangle _{{\mathbb {R}}^n} +\mu \psi ^{0R}_k \partial _3 \varphi (x_{0,k},x(T_k,u_k,x_{0,k}),T_k) \\&\quad +\,\mu \partial _3 g(x_{0,k},x(T_k,u_k,x_{0,k}),T_k)^\top \times \psi ^R_k \le \sqrt{\varepsilon _k}, \end{aligned}$$

where \(u_k(T_k) = u_{N-1,k}\).

Second step We take the limit of the above inequality as k tends to \(+\infty \). Let us recall that \(u_{N-1,k}\) converges to \(u_{N-1}\). Furthermore, since f is continuous, since \(\varphi \) and g are of class \(\mathrm {C}^1\), and since \(u_k(T_k)\) converges to u(T) from Proposition A.2, it holds that

$$\begin{aligned}&\mu \Big \langle \psi ^{0R} \partial _2 \varphi (x(0),x(T),T) + \partial _2g(x(0),x(T),T)^\top \times \psi ^R, f(x(T),u_{N-1},T) \Big \rangle _{{\mathbb {R}}^n} \\&\quad +\,\mu \psi ^{0R} \partial _3 \varphi (x(0),x(T),T) +\mu \partial _3 g(x(0),x(T),T)^\top \times \psi ^R \le 0 . \end{aligned}$$

Since \(\mu \) can be chosen arbitrarily in \(\{-1,1 \}\), the proof is complete. \(\square \)

A.3.3 Crucial inequalities: letting R go to \(+\infty \)

In the previous section, we have obtained Inequalities (8), (9) and (10) and Equality (11), which are valid for a fixed \(R \in {\mathbb {N}}\) such that \(R \ge \Vert u \Vert _{\mathrm {L}^\infty }\). In particular, Inequality (8) is satisfied only for \(v \in \mathrm {PC}^{\mathbb {T}}_N([0,\tau ],{\mathbb {R}}^m)\) taking values in \(\Omega \cap {\overline{\mathrm {B}}}_{{\mathbb {R}}^m}(0_{{\mathbb {R}}^m},R)\). Our goal in this section is to get rid of the dependence on R. From the equality \(\vert \psi ^{0R} \vert ^2 + \Vert \psi ^{R} \Vert _{{\mathbb {R}}^j}^2 = 1\) (see the end of Sect. A.3.1), we can extract subsequences (which we do not relabel) such that \((\psi ^{0R})_{R \in {\mathbb {N}}}\) converges to some \(\psi ^{0}\) in \({\mathbb {R}}\) and \((\psi ^{R})_{R \in {\mathbb {N}}}\) converges to some \(\psi \) in \({\mathbb {R}}^j\) as \(R \rightarrow +\infty \). It clearly holds that \(\vert \psi ^{0} \vert ^2 + \Vert \psi \Vert _{{\mathbb {R}}^j}^2 = 1\) and, since \({\mathbb {R}}_-\) and \(\mathrm {N}_\mathrm {S}[g(x(0),x(T),T)]\) are closed, that \(\psi ^{0} \in {\mathbb {R}}_-\) and \(-\psi \in \mathrm {N}_\mathrm {S}[g(x(0),x(T),T)]\).

Now let us fix \(v\in \mathrm {PC}^{\mathbb {T}}_N([0,\tau ],{\mathbb {R}}^m)\) taking its values in \(\Omega \). Taking \(R \in {\mathbb {N}}\) large enough so that \(R \ge \Vert u \Vert _{\mathrm {L}^\infty }\) and \(R \ge \Vert v \Vert _{\mathrm {L}^\infty }\), we know from Lemma A.10 that Inequality (8) is satisfied. Taking the limit as R tends to \(+\infty \), we conclude that

$$\begin{aligned} \langle \psi ^0 \partial _2 \varphi (x(0),x(T),T)+ \partial _2 g(x(0),x(T),T)^\top \times \psi , w_v(T) \rangle _{{\mathbb {R}}^n} \le 0. \end{aligned}$$
(12)

Similarly, letting R go to \(+\infty \) in Inequalities (9) and (10) and in Equality (11), we get that

$$\begin{aligned}&\langle \psi ^0 \partial _1 \varphi (x(0),x(T),T)+\partial _1g(x(0),x(T),T)^\top \times \psi , y \rangle _{{\mathbb {R}}^n}\nonumber \\&\quad +\langle \psi ^0\partial _2 \varphi (x(0),x(T),T)+\partial _2g(x(0),x(T),T)^\top \times \psi ,w_y(T)\rangle _{{\mathbb {R}}^n}\le 0, \end{aligned}$$
(13)

for any \(y\in {\mathbb {R}}^n\), that

$$\begin{aligned} \langle \psi ^0\partial _2\varphi (x(0),x(T),T)+\partial _2 g(x(0),x(T),T)^\top \times \psi ,w^\mu _{t_i}(T)\rangle _{{\mathbb {R}}^n} \le 0, \end{aligned}$$
(14)

for any \(i \in \{1,\ldots ,N-1\}\) such that \(u_{i-1}\ne u_i\) and any \(\mu \in \{ -1,1 \}\), and that

$$\begin{aligned}&\langle \psi ^0 \partial _2 \varphi (x(0),x(T),T) + \partial _2g(x(0),x(T),T)^\top \times \psi , f(x(T),u_{N-1},T)\rangle _{{\mathbb {R}}^n} \nonumber \\&\quad +\psi ^0 \partial _3 \varphi (x(0),x(T),T) + \partial _3 g(x(0),x(T),T)^\top \times \psi = 0 . \end{aligned}$$
(15)

A.3.4 End of the proof

Now we can end the proof of Theorem 2.1 with the introduction of the adjoint vector p. Before coming to this point, let us first define \(p^0:=\psi ^0\) and \(\Psi :=\psi \). In particular, note that \(p^0 \in {\mathbb {R}}_-\), that \(\Psi \in {\mathbb {R}}^j\) is such that \(-\Psi \in \mathrm {N}_\mathrm {S}[g(x(0),x(T),T)]\) and that \(\vert p^{0} \vert ^2 + \Vert \Psi \Vert _{{\mathbb {R}}^j}^2 = 1\).

We define the adjoint vector \(p\in \mathrm {AC}([0,T],{\mathbb {R}}^n)\) as the unique solution (which is global, the Cauchy problem being linear) to the backward linear Cauchy problem given by

$$\begin{aligned} \left\{ \begin{array}{l} {\dot{p}}(t)= -\partial _1f(x(t),u(t),t)^\top \times p(t), \quad \text {a.e. } t\in [0,T],\\ p(T)=p^0 \partial _2\varphi (x(0),x(T),T)+\partial _2 g(x(0),x(T),T)^\top \times \Psi . \end{array} \right. \end{aligned}$$

From the Duhamel formula, recall that

$$\begin{aligned} p(t)=\mathrm {Z}(T,t)^\top \times \Big ( p^0 \partial _2\varphi (x(0),x(T),T)+\partial _2 g(x(0),x(T),T)^\top \times \Psi \Big ) , \end{aligned}$$

for all \(t \in [0,T]\), where \(\mathrm {Z}(\cdot ,\cdot ):[0,T]^2 \rightarrow {\mathbb {R}}^{n\times n}\) stands for the state transition matrix associated with the matrix function \(t \mapsto \partial _1 f(x(t),u(t),t)\).
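For illustration purposes only, let us mention that this backward linear Cauchy problem is precisely what a numerical implementation integrates in order to recover the adjoint vector. The following minimal Python sketch does so for a hypothetical Jacobian map \(t \mapsto \partial _1 f(x(t),u(t),t)\); the names f_x and p_T, the dimension and the toy matrix are assumptions of this sketch and do not come from the paper.

```python
# Minimal sketch (illustration only): integrate the backward linear Cauchy
# problem  p'(t) = -d_1 f(x(t), u(t), t)^T p(t),  p(T) = p_T,  over [0, T].
import numpy as np
from scipy.integrate import solve_ivp

T = 1.0

def f_x(t):
    # Hypothetical stand-in for the Jacobian d_1 f(x(t), u(t), t) along the
    # optimal pair; a real shooting code would evaluate it on the stored
    # state/control trajectories.
    return np.array([[0.0, 1.0],
                     [-1.0, -0.1 * t]])

p_T = np.array([1.0, 0.0])  # stand-in for p^0 d_2 phi + d_2 g^T Psi

def adjoint_rhs(t, p):
    # Right-hand side of the adjoint equation.
    return -f_x(t).T @ p

# solve_ivp accepts a decreasing time span, so we integrate from T down to 0.
sol = solve_ivp(adjoint_rhs, (T, 0.0), p_T, rtol=1e-10, atol=1e-12)
p_0 = sol.y[:, -1]  # approximation of p(0), entering the transversality condition
print("p(0) ~", p_0)
```

In a full shooting implementation, such a backward pass would be coupled with a forward integration of the state dynamics and with the optimality conditions below; the snippet above is a sketch under the stated assumptions, not an excerpt from the paper's numerical experiments.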

Adjoint equation and transversality conditions on the adjoint vector From the above definition of the adjoint vector p, it is clear that the adjoint equation in Theorem 2.1 and the transversality condition \(p(T)=p^0 \partial _2\varphi (x(0),x(T),T)+\partial _2 g(x(0),x(T),T)^\top \times \Psi \) are satisfied. Moreover, from the Duhamel formula, it holds that \(w_y(T)= \mathrm {Z}(T,0)\times y\), and thus, Inequality (13) can be rewritten as

$$\begin{aligned} \langle p^0\partial _1\varphi (x(0),x(T),T) +\partial _1 g(x(0),x(T),T)^\top \times \Psi + p(0) , y\rangle _{{\mathbb {R}}^n} \le 0, \end{aligned}$$

for all \(y\in {\mathbb {R}}^n\). Applying this inequality to both \(y\) and \(-y\), we conclude that the transversality condition \(-p(0)= p^0\partial _1\varphi (x(0),x(T),T)+\partial _1 g(x(0),x(T),T)^\top \times \Psi \) holds.

Nonpositive averaged Hamiltonian gradient condition Let us fix \(\omega \in \Omega \) and \(i \in \{ 0,\ldots ,N-1 \}\). Let us consider \(v \in \mathrm {PC}^{\mathbb {T}}_N([0,T],{\mathbb {R}}^m)\) defined by

$$\begin{aligned} v(t):= \left\{ \begin{array}{ll} \omega &{} \text {if } t\in [t_i,t_{i+1}),\\ u(t) &{} \text {otherwise} , \end{array} \right. \end{aligned}$$

for all \(t \in [0,T]\). From the Duhamel formula given by

$$\begin{aligned} w_{v}(T)= \int _0^T \mathrm {Z}(T,t) \times \partial _2 f(x(t),u(t),t) \times (v(t)-u(t))\,\text {d}t , \end{aligned}$$

Inequality (12) can be rewritten as

$$\begin{aligned} \displaystyle \int _0^T \left\langle \partial _2 f(x(t),u(t),t)^\top \times p(t), v(t)-u(t) \right\rangle _{{\mathbb {R}}^m} \, \text {d}t\le 0, \end{aligned}$$

that is,

$$\begin{aligned} \left\langle \int _{t_i}^{t_{i+1}} \partial _2 H(x(t),u_i,p(t),p^0,t)\,\text {d}t, \omega -u_i \right\rangle _{{\mathbb {R}}^m} \le 0. \end{aligned}$$
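Let us detail this last rewriting. Since \(L = 0\) here, the Hamiltonian reduces to \(H(x,u,p,p^0,t)=\langle p , f(x,u,t) \rangle _{{\mathbb {R}}^n}\), hence \(\partial _2 H(x,u,p,p^0,t) = \partial _2 f(x,u,t)^\top \times p\). Since moreover \(v-u\) vanishes outside \([t_i,t_{i+1})\), where \(u = u_i\) and \(v - u = \omega - u_i\), we get

$$\begin{aligned} \int _0^T \left\langle \partial _2 f(x(t),u(t),t)^\top \times p(t), v(t)-u(t) \right\rangle _{{\mathbb {R}}^m} \text {d}t = \left\langle \int _{t_i}^{t_{i+1}} \partial _2 H(x(t),u_i,p(t),p^0,t)\,\text {d}t, \omega -u_i \right\rangle _{{\mathbb {R}}^m}, \end{aligned}$$

which is exactly the nonpositive averaged Hamiltonian gradient condition of Theorem 2.1.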

Transversality conditions on the optimal sampling times Let us fix some \(i \in \{ 1,\ldots ,N-1 \}\) and \(\mu \in \{ -1,1 \}\). If \(u_{i-1}=u_i\), then the transversality condition (3) in Theorem 2.1 is obviously satisfied. Now let us assume that \(u_{i-1} \ne u_i\). From the Duhamel formula given by

$$\begin{aligned} w^\mu _{t_i}(T)=\mu \mathrm {Z}(T,t_i) \times \Big ( f(x(t_i),u_{i-1},t_i)-f(x(t_i),u_i,t_i) \Big ), \end{aligned}$$

Inequality (14) can be rewritten as

$$\begin{aligned} \mu \langle p(t_i), f(x(t_i),u_{i-1},t_i)-f(x(t_i),u_i,t_i) \rangle _{{\mathbb {R}}^n} \le 0. \end{aligned}$$

Since \(\mu \) can be arbitrarily chosen in \(\{ -1,1 \}\), the above inner product vanishes and, from the definition of the Hamiltonian H, we get that

$$\begin{aligned} H(x(t_i),u_{i-1},p(t_i),p^0,t_i)=H(x(t_i),u_i,p(t_i),p^0,t_i) . \end{aligned}$$

Transversality condition on the optimal final time Equality (15) can be directly rewritten as

$$\begin{aligned} -H(x(T),u_{N-1},p(T),p^0,T) = p^0 \partial _3 \varphi (x(0),x(T),T) +\partial _3 g(x(0),x(T),T)^\top \times \Psi . \end{aligned}$$
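Indeed, recalling the transversality condition \(p(T)=p^0 \partial _2\varphi (x(0),x(T),T)+\partial _2 g(x(0),x(T),T)^\top \times \Psi \) and the reduction \(H(x,u,p,p^0,t)=\langle p , f(x,u,t) \rangle _{{\mathbb {R}}^n}\) (recall that \(L=0\)), Equality (15) reads

$$\begin{aligned} H(x(T),u_{N-1},p(T),p^0,T) + p^0 \partial _3 \varphi (x(0),x(T),T) + \partial _3 g(x(0),x(T),T)^\top \times \Psi = 0, \end{aligned}$$

which is the displayed condition.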

Nontriviality of the couple \((p,p^0)\) Let us assume by contradiction that the couple \((p,p^0)\) is trivial. Then \(p(0)=p(T)=0_{{\mathbb {R}}^n}\) and \(p^0=0\). We get from the transversality conditions on the adjoint vector and on the optimal final time that \( Dg( x(0) , x(T) , T)^\top \times \Psi = 0_{{\mathbb {R}}^{2n+1}} \). From the submersion property, we deduce that \(\Psi = 0_{{\mathbb {R}}^j}\), which contradicts the equality \(\vert p^0 \vert ^2+ \Vert \Psi \Vert ^2_{{\mathbb {R}}^j}=1\). The proof of Theorem 2.1 is complete.
