Introduction

The theory of second-order optimality conditions for different types of minima (strong, weak and the so-called Pontryagin) in optimal control (OC) is well-developed. It is associated with the names of Bonnans, Dmitruk, Frankowska, Hestenes, Ioffe, Malanowski, Maurer, Milyutin, Osmolovskii, Warga, Zeidan, and many others. We refer the interested reader to, e.g., [1,2,3,4,5] for historical comments and bibliographical remarks. Rather complete and advanced results were obtained by the Moscow group headed by Milyutin, to which the author of the present publication belongs. The main distinguishing feature of these results is that there is no gap between necessary and sufficient conditions. The conditions have the character of sign definiteness of a quadratic form (which, in simple cases, is the second variation of the Lagrange function) on the so-called critical cone. The type of the minimum affects only the choice of Lagrange multipliers involved in the conditions.

The necessary conditions, to which this work is devoted, were stated by the author (together with the relevant sufficient conditions) back in 1978; see [1, Supplement to Chapter VI, S2]. Later, much more general results were obtained in the author’s thesis [6]. The second-order conditions in [6] were obtained for problems with regular mixed state–control constraints, considered on a fixed or nonfixed time interval. Moreover, the conditions in [6] were obtained for various types of minimum, and they took into account possible discontinuities of the first kind of the optimal control, if any. The necessary conditions contained in [6], together with their proofs, were published in [7]. (The relevant sufficient conditions were published in [8].) But the proofs presented in [7] are complex and long, owing to the generality of the results obtained. Therefore, the question of obtaining simpler proofs, even if only for partial results, remains pertinent. This is what we propose to do in the present work.

Recent years have been marked by renewed interest in second-order conditions in OC. Some progress was due to the fact that an arbitrary compact set, not specified by a finite number of smooth inequalities, was considered as the control constraint; see, e.g., [3, 9,10,11]. To work with such a control constraint, the authors passed from the control system to a differential inclusion and then applied the differential calculus for multivalued mappings developed by Aubin and Frankowska in [12]. These ideas allowed us to obtain the necessary second-order condition for a strong local minimum for a problem with an arbitrary compact control set and a finite number of inequality-type endpoint constraints (see [10]), and then for a problem with a finite number of inequality-type state constraints, in the absence of any constraint qualification assumptions (see [11]). Moreover, the conditions in [10] and [11] were obtained for an arbitrary measurable optimal control. Unfortunately, the developed approach did not allow equality-type terminal constraints to be included in the problem directly.

A new interesting approach to obtaining necessary conditions for a strong local minimum in OC has been proposed by Ioffe in [13]. First-order necessary conditions for a strong local minimum in the form of the maximum principle (MP) were obtained in [13] for an OC problem with state, control and endpoint constraints, for a system controlled by a differential inclusion, and under fairly general assumptions (in particular, the endpoint constraint was specified by an arbitrary closed set). The idea of the proof was based entirely on the reduction of the original OC problem to a nonsmooth problem of Bolza type, with the subsequent application of the necessary conditions for a strong local minimum in the latter.

This approach was further developed in [14], where conditions of both the first and second order for a strong local minimum were obtained for an OC problem with state constraints, standard Pontryagin dynamics, a control constraint U(t) with closed values, and a finite number of endpoint constraints. An important difference between [11] and [14] is that in [14], the OC problem under consideration contained not only endpoint constraints of the inequality type (as in [11]), but also endpoint constraints of the equality type. Moreover, the necessary second-order condition in [14] contains a new quadratic term (the last term in inequality (21) there), which is absent in traditional second-order conditions. The presence of this term determines the essential novelty of the necessary condition in [14], whose effectiveness is confirmed by an example. At the same time, note that for the problem considered in the present paper, the conditions from [14], in comparison with the conditions of this paper, require an additional regularity assumption associated with the equality-type endpoint constraint.

It is worth noting that in all the mentioned works, including the present one, the idea of convexification of the right-hand side of the differential inclusion or the differential equation was used in one way or another.

In the present work, although the control constraint is not an arbitrary closed or compact set, we discuss second-order necessary conditions for a strong local minimum in a problem with endpoint constraints of both equality and inequality type, in the absence of any qualifying assumption for endpoint constraints, and again the reference control is an arbitrary bounded measurable function. Therefore, the present work can serve as a basis for further research.

As mentioned in the abstract, we use the sliding mode (relaxation) method to prove the necessary second-order conditions for a strong local minimum, based on the necessary second-order conditions for a weak local minimum. (A relatively short proof of the latter was given in [15].) The key role in the transition from conditions for a weak minimum to conditions for a strong one is played by the Dmitruk theorem [16, Theorem 1].

The paper is organized as follows. The main results are presented in Sect. 2. In particular, an OC problem is set in Sect. 2.1, the Lyusternik condition for the equality constraints of the OC problem and the concepts of the weak and strong local minima are recalled in Sect. 2.2, the second-order necessary conditions [15] for a weak local minimum are discussed in Sect. 2.3, and the second-order necessary conditions for a strong local minimum are formulated in Sect. 2.4 (see Theorem 2.2). Section 3 is entirely devoted to the proof of Theorem 2.2, which is the main one in the paper. Concluding remarks are given in Sect. 4.

First- and Second-Order Necessary Conditions in the Main Problem

Statement of the Main Problem

Denote by \(W^{1,1}([0,1],\mathrm{I\! R}^{d(x)})\) the Sobolev space of absolutely continuous functions \(x:[0,1]\rightarrow \mathrm{I\! R}^{d(x)}\) with the norm \(\Vert x(\cdot )\Vert _{1,1}=|x(0)|+\int _0^1|\dot{x}(t)|\mathrm{d}t\), and by \(L^\infty ([0,1],\mathrm{I\! R}^{d(u)})\), the space of measurable essentially bounded functions \(u:[0,1]\rightarrow \mathrm{I\! R}^{d(u)}\) with the norm \(\Vert u(\cdot )\Vert _{\infty }=\mathrm{ess\,sup}_{[0,1]}|u(t)|,\) where \(|\cdot |\) denotes the Euclidean norm. Hereafter, by d(a), we denote the dimension of the vector a. Define the space \(\mathcal{W}:=W^{1,1}([0,1],\mathrm{I\! R}^{d(x)})\times L^\infty ([0,1],\mathrm{I\! R}^{d(u)})\) with the norm of \(w(\cdot )=(x(\cdot ),u(\cdot ))\in \mathcal{W}\) given by \(\Vert w(\cdot )\Vert = \Vert x(\cdot )\Vert _{1,1}+\Vert u(\cdot )\Vert _{\infty }.\) In the sequel, for \((x,u)\in \mathcal{W}\), we set \(\xi _0=x(0)\), \(\xi _1=x(1)\) and \(\xi =(\xi _0,\xi _1).\)
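As an aside, the two norms are easy to approximate for grid-sampled functions; the following sketch (our own illustration, with hypothetical function names, not part of the source) computes \(\Vert x\Vert _{1,1}\) and \(\Vert u\Vert _{\infty }\) for \(x(t)=t^2\) and \(u(t)=(\cos 2\pi t,\sin 2\pi t)\), both of which have norm 1.

```python
import numpy as np

def norm_W11(x):
    """Grid approximation of ||x||_{1,1} = |x(0)| + int_0^1 |x'(t)| dt."""
    # the total variation of the sampled values approximates the integral of |x'|
    return abs(x[0]) + np.sum(np.abs(np.diff(x)))

def norm_Linf(u):
    """Grid approximation of ||u||_inf = ess sup_t |u(t)|, |.| Euclidean."""
    u = np.atleast_2d(u)                      # shape (d(u), n): one row per component
    return np.max(np.linalg.norm(u, axis=0))  # pointwise Euclidean norm, then sup

t = np.linspace(0.0, 1.0, 1001)
x = t**2                                               # ||x||_{1,1} = 0 + int_0^1 2t dt = 1
u = np.vstack([np.cos(2*np.pi*t), np.sin(2*np.pi*t)])  # |u(t)| = 1 for all t

print(round(float(norm_W11(x)), 6), round(float(norm_Linf(u)), 6))   # 1.0 1.0
```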

Consider the Mayer OC problem in the space \(\mathcal{W}\):

$$\begin{aligned}&{\text {Minimize}}\quad J(x,u):=F_0(x(0),x(1)), \end{aligned}$$
(1)
$$\begin{aligned}&F_i(x(0),x(1)) \le 0, \quad i=1,\ldots ,k, \quad K(x(0),x(1)) =0, \end{aligned}$$
(2)
$$\begin{aligned}&\dot{x}(t)=f(x(t),u(t)),\quad u(t)\in U \quad {\text {a.e. in}}\quad [0,1], \end{aligned}$$
(3)

where

$$\begin{aligned} U=\{u\in \mathrm{I\! R}^{d(u)}: \; g(u)\le 0\}. \end{aligned}$$
(4)

The functions \(f:\mathrm{I\! R}^{{d(x)}} \times \mathrm{I\! R}^{d(u)}\rightarrow \mathrm{I\! R}^{d(x)}\), \(g:\mathrm{I\! R}^{d(u)}\rightarrow \mathrm{I\! R}^r\), \(F_i:\mathrm{I\! R}^{2{d(x)}}\rightarrow \mathrm{I\! R}\), \(i=0,\ldots ,k\), and \(K:\mathrm{I\! R}^{2{d(x)}}\rightarrow \mathrm{I\! R}^s\) are assumed to be twice continuously differentiable. We also assume that at every point \(u\in \mathrm{I\! R}^{d(u)}\) such that \(g(u)=0\), the gradients \( g'_i(u)\), \(i\in I_g(u)\), are linearly independent, where

$$\begin{aligned} I_g(u)=\{i\in \{1,\ldots ,r\}:\;g_i(u)=0\} \end{aligned}$$

is the set of active indices at u. We call (1)–(3) the main problem.
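For orientation, here is a minimal concrete instance of the main problem (our own illustrative choice, not taken from the source), with \(d(x)=d(u)=r=s=1\) and \(k=0\):

$$\begin{aligned} {\text {Minimize}}\quad J(x,u)=x(1),\qquad K(x(0),x(1))=x(0)=0,\qquad \dot{x}(t)=u(t),\qquad U=\{u\in \mathrm{I\! R}:\; g(u)=u^2-1\le 0\}=[-1,1]. \end{aligned}$$

Here the optimal pair is \(\hat{u}\equiv -1\), \(\hat{x}(t)=-t\), and at any boundary point \(u=\pm 1\) the single active gradient \(g'(u)=2u\ne 0\), so the linear-independence assumption holds trivially.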

Lyusternik Condition, Weak and Strong Minima in the Main Problem

Define a nonlinear operator \(G:\mathcal{W}\rightarrow L^1([0,1],\mathrm{I\! R}^{d(x)})\times \mathrm{I\! R}^s\) (which corresponds to the equality-type constraints of the main problem) as follows:

$$\begin{aligned} G:w=(x,u)\in \mathcal{W}\mapsto \Big (f(x,u)-\dot{x},\; K(x(0),x(1))\Big )\in L^1([0,1],\mathrm{I\! R}^{d(x)})\times \mathrm{I\! R}^s. \end{aligned}$$

This operator is continuously Fréchet differentiable, and its derivative at a point \(\hat{w}=(\hat{x},\hat{u})\in \mathcal{W}\) is a linear operator

$$\begin{aligned} G'(\hat{w}): w=(x,u)\in \mathcal{W}\mapsto \Big (f'(\hat{w})w-\dot{x},\; K'(\hat{\xi })\xi \Big )\in L^1([0,1],\mathrm{I\! R}^{d(x)})\times \mathrm{I\! R}^s, \end{aligned}$$

where \(\hat{\xi }=(\hat{x}(0),\hat{x}(1))\), \(\xi =(x(0),x(1))\), \(w=(x,u)\).

Definition 2.1

Let a point \(\hat{w}=(\hat{x},\hat{u})\in \mathcal{W}\) satisfy the equality constraints of the main problem, i.e., \(G(\hat{w})=0.\) We say that the Lyusternik condition holds at \(\hat{w}\), if the operator \(G'(\hat{w})\) is surjective.

Any trajectory-control pair \((x,u)\in \mathcal{W}\) satisfying (2)–(3) is called admissible. Recall that a weak local minimum is a local minimum over admissible pairs in the space \(\mathcal{W}\). Further, an admissible pair \((\hat{x},\hat{u})\) is called a strong local minimizer, if there exists an \(\varepsilon >0\) such that \(J(x,u)\ge J(\hat{x},\hat{u})\) for any admissible \((x,u)\in \mathcal{W}\) such that \(\Vert x-\hat{x}\Vert _\infty <\varepsilon \). Obviously, any strong local minimizer is a weak local minimizer.

Second-Order Necessary Conditions for a Weak Local Minimum in the Main Problem

The Pontryagin (Hamiltonian) function and the terminal Lagrange function are defined, respectively, by

$$\begin{aligned} H(x,u,p)=pf(x,u),\quad l(\xi ,\alpha ,\beta )=\sum _{i=0}^k\alpha _iF_i(\xi _0,\xi _1)+\beta K(\xi _0,\xi _1), \end{aligned}$$

where \(p=(p_1,\ldots ,p_{d(x)})\), \(\alpha =(\alpha _0,\ldots ,\alpha _k)\), and \(\beta =(\beta _1,\ldots ,\beta _s)\) are considered as row vectors. The augmented Pontryagin (Hamiltonian) function has the form:

$$\begin{aligned} \bar{H}(x,u,p,\mu )=H(x,u,p)+\mu g(u), \end{aligned}$$

where \(\mu =(\mu _1,\ldots ,\mu _r)\) is a row vector.

Let \((\hat{x},\hat{u})\) be an admissible pair. Denote by \(\Lambda \) the set of all tuples \(\lambda =(\alpha ,\beta ,p,\mu )\) such that

$$\begin{aligned} \begin{array}{l}\alpha \in \mathrm{I\! R}^{k+1},\quad \beta \in \mathrm{I\! R}^s,\quad p\in W^{1,\infty }([0,1],\mathrm{I\! R}^{d(x)}), \quad \mu \in L^\infty ([0,1],\mathrm{I\! R}^r),\\ \alpha \ge 0,\quad |\alpha |+|\beta |=1, \quad \mu \ge 0,\\ \alpha _iF_i(\hat{x}(0),\hat{x}(1))=0,\quad i=1,\ldots ,k,\quad \mu g(\hat{u})=0,\\ -\,\dot{p}=p f_x(\hat{x},\hat{u}) \quad \Leftrightarrow \quad -\dot{p}= H_x, \\ -\,p(0)=l_{\xi _0}(\hat{\xi },\alpha ,\beta ),\quad p(1)= l_{\xi _1}(\hat{\xi },\alpha ,\beta ),\\ p f_u(\hat{x},\hat{u})+\mu g'(\hat{u})=0 \quad \Leftrightarrow \quad \bar{H}_{u}=0. \end{array} \end{aligned}$$

If \((\hat{x},\hat{u})\) is a weak local minimum, then the set \(\Lambda \) is nonempty. This is the well-known first-order necessary condition for a weak local minimum. Since the gradients \(g_i'\) of active control constraints are linearly independent, each \(\lambda \in \Lambda \) is uniquely defined by its components \(\alpha ,\beta \). It follows that the equality \( |\alpha |+|\beta |=1\) is the normalization condition, and the set \(\Lambda \) does not contain a zero element. Moreover, it follows that \(\Lambda \) is a finite-dimensional compact set.
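To make the structure of \(\Lambda \) concrete, consider the illustrative instance (ours, not from the source) \(\dot{x}=u\), \(J(x,u)=x(1)\), \(K(\xi )=\xi _0\), \(g(u)=u^2-1\le 0\), with the reference pair \(\hat{u}\equiv -1\), \(\hat{x}(t)=-t\). The conditions above determine the multipliers uniquely:

$$\begin{aligned} \begin{array}{l} l(\xi ,\alpha ,\beta )=\alpha _0\xi _1+\beta \xi _0,\qquad -\dot{p}=pf_x=0\;\Rightarrow \;p\equiv {\text {const}},\\ p(1)=l_{\xi _1}=\alpha _0,\qquad -p(0)=l_{\xi _0}=\beta \;\Rightarrow \;\beta =-\alpha _0,\\ |\alpha |+|\beta |=2\alpha _0=1\;\Rightarrow \;\alpha _0=\tfrac{1}{2},\quad \beta =-\tfrac{1}{2},\quad p\equiv \tfrac{1}{2},\\ \bar{H}_u=pf_u+\mu g'(\hat{u})=\tfrac{1}{2}-2\mu =0\;\Rightarrow \;\mu =\tfrac{1}{4}\ge 0. \end{array} \end{aligned}$$

Thus \(\Lambda \) is a singleton here; the condition \(\mu g(\hat{u})=0\) holds automatically, since \(g(\hat{u})=0\).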

For the point \(\hat{w}=(\hat{x},\hat{u})\), define the critical cone \(\mathcal{C}\) as the set of all pairs \(w=(x,u)\in \mathcal{W}\) such that

$$\begin{aligned} \begin{array}{l} F'_i(\hat{\xi })\xi \le 0,\quad i\in I_F\cup \{0\}, \quad K'(\hat{\xi })\xi =0,\quad {\text {where}}\quad \xi =(x(0),x(1)),\\ \dot{x}=f'(\hat{w})w,\quad g_i'(\hat{u}(t))u(t)\le 0 \quad {\text {a.e. on}} \quad \mathcal{M}_{i0},\quad i=1,\ldots ,r,\end{array} \end{aligned}$$

where \(I_F=\{i\in \{1,\ldots ,k\}:\; F_i(\hat{\xi })=0\}\) is the set of active indices of the endpoint inequality constraints and \(\mathcal{M}_{i0}=\{t\in [0,1]:\; g_i(\hat{u}(t))=0\}\), \(i=1,\ldots ,r\).

For any \(\lambda \in \Lambda \) and \(w=(x,u)\in \mathcal{W}\), we set

$$\begin{aligned} \Omega (w,\lambda )=\langle l_{\xi \xi }(\hat{\xi },\alpha ,\beta )\xi ,\xi \rangle +\int _0^1\langle \bar{H}_{ww}(\hat{w},p,\mu )w,w\rangle \mathrm{d}t, \end{aligned}$$

where \( \langle \bar{H}_{ww}w,w\rangle =\langle \bar{H}_{xx}x,x\rangle +2\langle \bar{H}_{xu}u,x\rangle +\langle \bar{H}_{uu}u,u\rangle ,\) and let

$$\begin{aligned} \Omega _\Lambda (w)=\sup _{\lambda \in \Lambda } \Omega (w,\lambda ). \end{aligned}$$
(5)

Here and in the sequel, \(\sup _\emptyset (\cdot )=-\infty .\) Note that the supremum in (5) is attained. A direct proof of the following result can be found in [15].

Theorem 2.1

If \(\hat{w}=(\hat{x},\hat{u})\) is a weak local minimum, then the set \(\Lambda \) is nonempty and

$$\begin{aligned} \Omega _\Lambda (w)\ge 0\quad \forall \,w\in \mathcal{C}. \end{aligned}$$
(6)

Here, (6) is a second-order necessary condition for a weak local minimum.

If the Lyusternik condition does not hold at \(\hat{w}\), then it can easily be proved (see, e.g., [15]) that there exist \(p\in W^{1,\infty }([0,1],\mathrm{I\! R}^{d(x)}) \) and \(\beta \in \mathrm{I\! R}^s\) with \(|\beta |=1\) such that

$$\begin{aligned} -\dot{p}=pf_x(\hat{w}), \quad pf_u(\hat{w})=0,\quad -p(0)=\beta K_{\xi _0}(\hat{\xi }),\quad p(1)=\beta K_{\xi _1}(\hat{\xi }). \end{aligned}$$

Set \( \alpha =0, \quad \mu =0,\quad \lambda =(0,\beta ,p,0).\) Then, obviously, \(\lambda \in \Lambda \) and \( -\lambda \in \Lambda .\) Since \(\Omega (w,\cdot )\) is linear in \(\lambda \), for every \(w\in \mathcal{W}\) we obtain \(\Omega _\Lambda (w)\ge \max \{\Omega (w,\lambda ),\,\Omega (w,-\lambda )\}=|\Omega (w,\lambda )|\ge 0\). This implies the following lemma.

Lemma 2.1

If the Lyusternik condition does not hold at \(\hat{w}\), then

$$\begin{aligned} \Lambda \ne \emptyset \quad {\text {and}} \quad \Omega _\Lambda (w)\ge 0\quad \forall \,w\in \mathcal{W}. \end{aligned}$$

This simple lemma is an important complement to Theorem 2.1. We emphasize that if the Lyusternik condition does not hold, then the inequality \(\Omega _\Lambda (w)\ge 0\) holds for all \(w\in \mathcal{W}\), and not only for \(w\in \mathcal{C}\).

The idea of the proof of Theorem 2.1 in [15] is simple. Thanks to Lemma 2.1, we can assume that the Lyusternik condition holds at the point \(\hat{w}\) of the weak local minimum. Under this assumption, we consider the system of second-order approximations to the cost and the constraints and, using the Lyusternik theorem, we prove that this system of sets has an empty intersection. Then, we apply the separation theorem to this system.

Note that the necessary conditions for a weak local minimum in OC do not play such a dominant role as in the calculus of variations, which is largely a theory of the weak local minimum. One of the reasons is that, in OC, we are dealing with a constraint of the form \(u\in U\), which does not always allow us to use control variations that are small in absolute value. For example, if U consists of a finite number of elements, then such variations simply do not exist. Therefore, when studying the weak local minimum in OC, it is necessary to restrict ourselves to some special classes of sets U. For a long time, the only such class was the class of sets represented in form (4). In recent studies, this class has been significantly expanded. For example, the results of [9,10,11] can also be effectively applied if U is a cross, a star, etc.

The necessary conditions for a weak local minimum in OC should be considered rather as the first step in the analysis of the conditions for a strong minimum. This is precisely the role of Theorem 2.1. It is worth noting that the necessary condition (6) for a weak local minimum, contained in this theorem, cannot be regarded as complete and final, since its natural strengthening does not turn it into a sufficient condition for a weak local minimum; see [15] for details. However, it is the condition (6) (the wording of which is relatively simple) that is fundamental, when moving to a strong minimum. This ‘incomplete condition’ already leads to the desired result.

Second-Order Necessary Conditions for a Strong Local Minimum in the Main Problem

In Sect. 3, using Theorem 2.1, we will obtain a necessary second-order condition for a strong local minimum. We will use the same method of proof as in [7, Chapter 4, Section 4.4]. (This approach was proposed by A.A. Milyutin in the 1980s.) The theorem that we want to prove is formulated after the following definitions.

Denote by M the set of all \(\lambda =(\alpha ,\beta ,p,\mu )\in \Lambda \) such that the minimum condition holds at the point \(\hat{u}\):

$$\begin{aligned} H(\hat{x}(t),u,p(t))\ge H(\hat{x}(t),\hat{u}(t),p(t)) \; \forall \,u\in U, \; {\text {for a.a.}}\; t\in [0,1]. \end{aligned}$$
(7)

Set

$$\begin{aligned} \Omega _M(w)=\sup _{\lambda \in M} \Omega (w,\lambda ). \end{aligned}$$
(8)

The condition \(M\ne \emptyset \) is equivalent to the Pontryagin minimum principle, which is a necessary first-order condition for a strong local minimum. Note that M is a finite-dimensional compact set, and the supremum in (8) is attained.
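For the illustrative instance \(\dot{x}=u\), \(J=x(1)\), \(U=[-1,1]\), \(\hat{u}\equiv -1\) with constant adjoint \(p\equiv \tfrac{1}{2}\) (our own toy example; the grid and names below are ours), the minimum condition (7) can be checked numerically:

```python
import numpy as np

# Toy instance (our own illustration, not from the source): dot(x) = u, J = x(1),
# U = [-1, 1], reference control u_hat = -1, constant adjoint p = 1/2.
# The Pontryagin function is H(x, u, p) = p*u, independent of x here.
p = 0.5
u_hat = -1.0

U_grid = np.linspace(-1.0, 1.0, 2001)   # discretization of the control set U
H_vals = p * U_grid

# Minimum condition (7): H(x_hat(t), u, p(t)) >= H(x_hat(t), u_hat(t), p(t))
# for all u in U; here it reduces to p*u >= p*u_hat on [-1, 1].
assert np.all(H_vals >= p * u_hat - 1e-12)
print(float(U_grid[np.argmin(H_vals)]))   # -1.0: the minimum over U is attained at u_hat
```

So this λ lies in M, i.e., the Pontryagin minimum principle holds along \(\hat{u}\) in this toy instance.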

Theorem 2.2

If \((\hat{x},\hat{u})\) is a strong local minimum, then the set M is nonempty and

$$\begin{aligned} \Omega _M(w)\ge 0\quad \forall \,w\in \mathcal{C}. \end{aligned}$$
(9)

A much more refined result for a more general OC problem was obtained in [7, Theorem 4.10], but, as remarked earlier, the proofs in [7] are long and complicated. Our aim now is to give a relatively simple proof of Theorem 2.2, based on Theorem 2.1 and using the so-called sliding mode (relaxation) method.

Proof of the Main Result

Refinement of Theorem 2.1

The following concept will be used to prove the main result.

We say that a weak s-necessity  [1] holds at an admissible point \(\hat{w}=(\hat{x},\hat{u})\) of the main problem, if there is no sequence of admissible points \(w^n=(x^n,u^n)\), \(n=1,2,\ldots \) such that for all n

$$\begin{aligned}&F_0(x^n(0),x^n(1))<F_0(\hat{x}(0),\hat{x}(1)), \end{aligned}$$
(10)
$$\begin{aligned}&F_i(x^n(0),x^n(1))<0,\quad i=1,\ldots ,k, \end{aligned}$$
(11)
$$\begin{aligned}&\mathrm{ess\,sup}_{t\in [0,1]} g_j(u^n(t))<0,\quad j=1,\ldots ,r, \end{aligned}$$
(12)

and \(\Vert w^n-\hat{w}\Vert \rightarrow 0\) as \(n\rightarrow \infty \).

Clearly, a weak local minimum implies weak s-necessity.

A direct proof of the following result can be found in [15].

Theorem 3.1

If \(\hat{w}=(\hat{x},\hat{u})\) is a point of weak s-necessity or the Lyusternik condition does not hold at \(\hat{w}\), then the set \(\Lambda \) is nonempty and condition (6) holds.

Theorem 3.1 immediately implies Theorem 2.1 and is an important refinement of it. We will use this refinement in the proof of the main result.

Unfortunately, Theorem 3.1 was not formulated in [15]. Instead, Theorem 2.1 was formulated and proved as the main result. However, from the proof given in [15, Section 3], it easily follows that Theorem 3.1 also holds (see the proof of Lemma 3 in [15]).

Associated Problem, Sliding Modes

We shall use the following notation:

$$\begin{aligned} \mathcal {A}=\big \{a:= (u^1(\cdot ),\ldots ,u^N(\cdot ))\,:\, N\ge 1\;{\text {is an integer and }} u^i(\cdot )\in \mathcal {U}\, \big \}, \end{aligned}$$

where \(\mathcal {U}\) is the set of admissible controls, that is,

$$\begin{aligned} \mathcal {U}=\big \{u(\cdot )\in L^\infty ([0,1],\mathrm{I\! R}^{d(u)}) \,: u(t)\in U\, {\text {a.e. in [0,1]}} \, \big \} . \end{aligned}$$

Let \(a=(u^1,\ldots , u^N)\in {\mathcal {A}}\). Along with the main problem, consider the so-called associated (or relaxed) problem, defined by (1), (2) and the relations

$$\begin{aligned}&\dot{x}(t)=f(x(t),u(t)) +\sum _{i=1}^N v^i(t) \Big (f(x(t),u^i(t))-f(x(t),u(t))\Big ), \end{aligned}$$
(13)
$$\begin{aligned}&v^i(t)\ge 0,\quad i=1,\ldots ,N, \quad \sum _{i=1}^N v^i(t)\le 1\quad {\text {a.e. in}}\quad [0,1], \end{aligned}$$
(14)
$$\begin{aligned}&g(u(t))\le 0,\quad {\text {a.e. in}}\quad [0,1], \end{aligned}$$
(15)

where \(u\in L^\infty ([0,1],\mathrm{I\! R}^{d(u)})\), \(v^i\in L^\infty ([0,1],\mathrm{I\! R})\), \(i=1,\ldots ,N\). In the new problem, the control is the tuple \((u,v^1,\ldots ,v^N)\), and x is the state variable.

Let us check the linear independence of the gradients of active inequality control constraints in the associated problem. As will be seen later, the constraint \(\sum _{i=1}^N v^i\le 1\) will always be inactive at the reference point, and therefore, when studying the weak minimum, it can be ignored. So, we consider only the constraints

$$\begin{aligned} g(u)\le 0, \quad v^1\ge 0,\quad \ldots \quad , v^N\ge 0. \end{aligned}$$

The gradients of these constraints at the point \((u,v^1,\ldots ,v^N)\), considered as row vectors, are of the form

$$\begin{aligned} \begin{array}{cccccc}(&g'(u),&0,&\ldots ,&0&),\\ (&0,&1,&\ldots ,&0&),\\ &\ldots &\ldots &\ldots &\ldots &\\ (&0,&0,&\ldots ,&1&), \end{array} \end{aligned}$$
(16)

respectively. Take any reals \(\eta ^1\), \(\ldots \), \(\eta ^N\) and a vector \(\mu \in \mathrm{I\! R}^{r}\). Suppose that all components \(\mu _i\) of the vector \(\mu \) corresponding to inactive constraints are zero, i.e., the complementary slackness conditions \(\mu _ig_i(u)=0\) hold for all i. Suppose that a linear combination of the gradients (16) with the coefficients \(\mu \), \(\eta ^1\), \(\ldots \), \(\eta ^N\) is equal to zero. Then, obviously, we get \(\eta ^1=\cdots =\eta ^N=0\) and \(\mu g'(u)=0.\) The latter implies that \(\mu =0,\) since the gradients of active control constraints are linearly independent in the main problem. This means that the gradients of active control constraints are linearly independent in the associated problem as well.
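The rank argument just given can be checked numerically on a toy configuration (our own choice of \(g\), dimensions, and names): stacking \(g'(u)\) for one active constraint with the unit rows of (16) yields a matrix of full row rank.

```python
import numpy as np

# Our own numerical check of the block structure (16): the active gradients
# in the associated problem stack g'(u) against unit rows for v^i >= 0.
N = 2                                    # number of sliding-mode components v^i
g_prime = np.array([[2.0, 0.0]])         # g'(u) for g(u) = u1^2 + u2^2 - 1 at u = (1, 0)
r_active, d_u = g_prime.shape

top = np.hstack([g_prime, np.zeros((r_active, N))])    # rows ( g'(u), 0, ..., 0 )
bottom = np.hstack([np.zeros((N, d_u)), np.eye(N)])    # rows ( 0, ..., 1, ..., 0 )
G = np.vstack([top, bottom])

# Full row rank <=> linear independence of the stacked gradients, exactly as
# concluded in the text whenever g'(u) itself has independent active rows.
print(int(np.linalg.matrix_rank(G)), G.shape[0])   # 3 3
```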

Note that the associated problem (1), (2), (13)–(15) has the same type as the main problem (1)–(3). It is considered in the space

$$\begin{aligned} \mathcal{Z}:=W^{1,1}([0,1],\mathrm{I\! R}^{d(x)})\times L^\infty ([0,1],\mathrm{I\! R}^{d(u)})\times \left( L^\infty ([0,1],\mathrm{I\! R})\right) ^{N} \end{aligned}$$

with elements \(z=(x,u,v^1,\ldots ,v^N)\) and the norm

$$\begin{aligned} \Vert z\Vert = \Vert x\Vert _{1,1}+\Vert u\Vert _\infty +\sum _{i=1}^N\Vert v^i\Vert _\infty . \end{aligned}$$

The local minimum in this norm is a weak local minimum in the associated problem.

Let \(\hat{w}=(\hat{x},\hat{u})\in \mathcal{W}\) be an admissible point in the main problem. For this point, we define an admissible point \(\hat{z}\) in the associated problem by setting \(x=\hat{x}\), \( u=\hat{u}\), \(v^1=\cdots = v^N=0,\) that is, \(\hat{z}= (\hat{x},\hat{u},0,\ldots ,0)\). Weak s-necessity at the point \(\hat{z}\) in the associated problem means that there is no sequence \( z_n= ( x_n, u_n, v^1_n,\ldots , v^N_n)\), \( n=1,2,\ldots \), such that for all n

$$\begin{aligned}&F_0( \xi _n)<F_0(\hat{\xi }),\quad F_i( \xi _n)< 0, \quad i=1,\ldots ,k,\quad K( \xi _n)=0,\end{aligned}$$
(17)
$$\begin{aligned}&\dot{ x}_n=f( x_n, u_n) +\sum _{i=1}^N v^i_n \Big (f( x_n, u^i)-f( x_n,u_n)\Big ), \end{aligned}$$
(18)
$$\begin{aligned}&\mathrm{ess\,sup}_{[0,1]}(- v^i_n)<0, \quad i=1,\ldots ,N,\end{aligned}$$
(19)
$$\begin{aligned}&\mathrm{ess\,sup}_{[0,1]}\Big (\sum _{i=1}^N v^i_n-1\Big )<0, \end{aligned}$$
(20)
$$\begin{aligned}&\mathrm{ess\,sup}_{[0,1]}g(u_n)<0,\end{aligned}$$
(21)
$$\begin{aligned}&\Vert x_n-\hat{x}\Vert _\infty \rightarrow 0, \quad \Vert u_n- \hat{u}\Vert _\infty \rightarrow 0, \end{aligned}$$
(22)
$$\begin{aligned}&\Vert v^i_n\Vert _\infty \rightarrow 0,\quad i=1,\ldots ,N, \end{aligned}$$
(23)

where \( \xi _n=( x_n(0), x_n(1))\), \(\hat{\xi }=(\hat{x}(0),\hat{x}(1))\).

The following important lemma will be proved in this section.

Lemma 3.1

If there exists \(a\in \mathcal A\) such that \(\hat{z}\) is not a point of weak s-necessity in the associated problem and the Lyusternik condition holds at this point in the associated problem, then \(\hat{w}\) is not a strong local minimum in the main problem.

The proof of this lemma will be based on the theorem of Dmitruk; see [16, Theorem 3]. Below, we give this theorem in a simplified version, convenient for application in our case.

Theorem 3.2

Let \(a=(u^1,\ldots , u^N)\in {\mathcal {A}}\). Suppose that the Lyusternik condition holds at a point \(\tilde{z}= (\tilde{x},\tilde{u},\tilde{v}^1,\ldots ,\tilde{v}^N),\) satisfying the equality constraints of the associated problem and such that

$$\begin{aligned} \mathrm{ess\,sup}_{[0,1]}(-\tilde{v}^i(t))<0,\quad i=1,\ldots ,N,\quad \mathrm{ess\,sup}_{[0,1]}\sum _{i=1}^N \tilde{v}^i(t)<1. \end{aligned}$$

Then, there is a sequence of points \( z_n= ( x_n, u_n, v^1_n,\ldots , v^N_n)\), \( n=1,2,\ldots \), satisfying the equality constraints of the associated problem and such that

  1. (i)

    \(\Vert x_n-\tilde{x}\Vert _\infty \rightarrow 0\)   as \(n\rightarrow \infty \),

  2. (ii)

    \(\Vert u_n-\tilde{u}\Vert _\infty \rightarrow 0\)   as \(n\rightarrow \infty \),

  3. (iii)

    each difference \( v_n^i-\tilde{v}^i\) converges weakly* to zero in \(L^\infty \) (i.e., \(L^1\)-weakly) as   \(n\rightarrow \infty \),   \(i=1,\ldots ,N\), and

  4. (iv)

    each function \( v_n^i\) takes only two values, zero or one, and the same is true for each sum \(\sum _{i=1}^N v^i_n\).

(To get Theorem 3.2 from [16, Theorem 3] for some \(\hat{a}=(\hat{u}^1,\ldots ,\hat{u}^N)\in \mathcal A\), we need to put \(g(x,u^i,t)=u^i-\hat{u}^i\), \(i=1,\ldots ,N\), in the definition of the ‘extended system’ (4)–(7) in [16], and then apply [16, Theorem 3] to this particular system.)

Proof of Lemma 3.1

Note that we will not use assertion (iii) of Theorem 3.2 in the proof of Lemma 3.1, while assertion (iv) of this theorem will be very important. By the definition of weak s-necessity, there is a sequence of points \( z_n=(x_n,u_n,v^1_n,\ldots ,v^N_n)\) satisfying conditions (17)–(23). Without loss of generality, we may assume that the Lyusternik condition holds at each point \( z_n\): the set of points at which it holds is open (surjectivity of a bounded linear operator is stable under small perturbations), and \(\Vert z_n-\hat{z}\Vert \rightarrow 0\) as \(n\rightarrow \infty \). According to Theorem 3.2, each point \( z_n\) can be “approximated” by a point \(\tilde{z}_n=(\tilde{x}_n,\tilde{u}_n,\tilde{v}_n^1,\ldots ,\tilde{v}_n^N)\) such that the norms \(\Vert \tilde{x}_n- x_n\Vert _\infty \) and \(\Vert \tilde{u}_n- u_n\Vert _\infty \) are so small that

  1. (a)

    Conditions

    $$\begin{aligned}&F_0(\tilde{\xi }_n)<F_0( \hat{\xi }),\quad F_i(\tilde{\xi }_n)< 0, \quad i=1,\ldots ,k, \end{aligned}$$
    (24)
    $$\begin{aligned}&\mathrm{ess\,sup}_{[0,1]}g(\tilde{u}_n)<0 \end{aligned}$$
    (25)

    hold for all \(n=1,2,\ldots \),

  2. (b)

    The equality constraints of the associated problem are satisfied:

    $$\begin{aligned} K(\tilde{\xi }_n)=0,\quad \dot{\tilde{x}}_n=f(\tilde{x}_n,\tilde{u}_n) +\sum _{i=1}^N \tilde{v}^i_n \Big (f(\tilde{x}_n,u^i)-f(\tilde{x}_n,\tilde{u}_n)\Big ) \end{aligned}$$
    (26)

    for all \(n=1,2,\ldots \),

  3. (c)

    Each function \(\tilde{v}_n^i\) takes only two values, zero or one, \(i=1,\ldots ,N\), and the same is true for each sum \(\sum _{i=1}^N \tilde{v}_n^i\),

  4. (d)

    \(\Vert \tilde{x}_n-\hat{x}\Vert _\infty \rightarrow 0\) as \(n\rightarrow \infty .\)

Set

$$\begin{aligned} \tilde{u}_n'=\tilde{u}_n+\sum _{i=1}^N \tilde{v}_n^i( u^i-\tilde{u}_n), \quad n=1,2,\ldots \end{aligned}$$

Then, in view of condition (c),

$$\begin{aligned} f(\tilde{x}_n,\tilde{u}_n) +\sum _{i=1}^N \tilde{v}^i_n \Big (f(\tilde{x}_n,u^i)-f(\tilde{x}_n,\tilde{u}_n)\Big ) =f(\tilde{x}_n,\tilde{u}_n'),\quad n=1,2,\ldots , \end{aligned}$$

and hence

$$\begin{aligned} \dot{\tilde{x}}_n=f(\tilde{x}_n,\tilde{u}_n'), \quad n=1,2,\ldots . \end{aligned}$$
(27)

Moreover, in view of (c), conditions (25) imply

$$\begin{aligned} \mathrm{ess\,sup}_{[0,1]}g(\tilde{u}_n')\le 0,\quad n=1,2,\ldots . \end{aligned}$$
(28)

Conditions (24), (26), (27), (28), together with condition (d), mean that the sequence of admissible pairs \(\tilde{w}_n'=(\tilde{x}_n,\tilde{u}_n')\), \(n=1,2,\ldots \), violates the strong local minimality of \(\hat{w}\) in the main problem. The lemma is proved. \(\square \)
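The collapse of the convexified right-hand side used to obtain (27) rests on condition (c). A quick numerical check (our own toy \(f\) and data, with names invented for illustration) confirms that for bang-bang weights \(\tilde{v}^i\in \{0,1\}\) the convex combination equals a single evaluation \(f(x,\tilde{u}')\), while for fractional weights it generally does not.

```python
import numpy as np

def f(x, u):
    # an arbitrary nonlinear right-hand side (our own illustrative choice)
    return x * u + np.sin(u)

x, u = 0.7, -0.3                       # state value and reference control value
us = np.array([0.8, -1.1, 0.4])        # sliding-mode controls u^1, u^2, u^3

def convexified(v):
    return f(x, u) + sum(v[i] * (f(x, us[i]) - f(x, u)) for i in range(3))

def u_prime(v):
    return u + sum(v[i] * (us[i] - u) for i in range(3))

# Bang-bang weights (condition (c): each v^i in {0,1}, sum <= 1): the convex
# combination collapses to a single evaluation f(x, u'), which yields (27).
for v in ([0, 0, 0], [1, 0, 0], [0, 1, 0], [0, 0, 1]):
    assert abs(convexified(v) - f(x, u_prime(v))) < 1e-12

# For fractional weights the identity fails in general, since f is nonlinear:
print(abs(convexified([0.5, 0, 0]) - f(x, u_prime([0.5, 0, 0]))) > 1e-6)   # True
```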

Second-Order Necessary Conditions for a Weak Local Minimum in the Associated Problem

Let \(\hat{w}=(\hat{x},\hat{u})\) be a strong local minimum in the main problem and \(a=(u^1,\ldots ,u^N)\in \mathcal A\). In the sequel, it will be convenient to supply the objects related to the associated problem with the superscript a. It follows from Lemma 3.1 that either \(\hat{z}= (\hat{x},\hat{u},0,\ldots ,0)\) is a point of weak s-necessity in the associated problem, or the Lyusternik condition does not hold at this point in the associated problem. Then, applying Theorem 3.1 to the associated problem, we obtain the following result.

Lemma 3.2

Suppose that \(\hat{w}=(\hat{x},\hat{u})\) is a strong local minimum in the main problem. Then, for any \(a=(u^1,\ldots ,u^N)\in \mathcal A\), the following second-order necessary condition is satisfied at the point \(\hat{z}= (\hat{x},\hat{u},0,\ldots ,0)\) in the associated problem: the set \(\Lambda ^{a}\) is nonempty and

$$\begin{aligned} \max _{\lambda ^{a}\in \Lambda ^{a}}\Omega ^{a}( z,\lambda ^{a})\ge 0\quad \forall \, z\in \mathcal{C}^{a}. \end{aligned}$$
(29)

Let us describe the set \(\Lambda ^{a}\), the functional \(\Omega ^{a}( z,\lambda ^{a})\), and the critical cone \(\mathcal{C}^{a}\) at the point \(\hat{z}\) in the associated problem.

Set \(\Lambda ^{a}\)

Define the functions

$$\begin{aligned} \begin{array}{l} l^{a}(\xi ,\alpha ,\beta ):= \sum _{i=0}^k\alpha _iF_i(\xi _0,\xi _1)+\beta K(\xi _0,\xi _1)=l(\xi ,\alpha ,\beta ), \\ H^{a}(u,v^1,\ldots ,v^N,x,p):=pf(x,u) +\sum _{i=1}^N v^i p (f(x,u^i)-f(x,u)),\\ \bar{H}^{a}(u,v^1,\ldots ,v^N,x,p,\mu , \eta ^1,\ldots ,\eta ^N,\zeta ) :=H^{a}(u,v^1,\ldots ,v^N,x,p)\\ \quad -\,\sum _{i=1}^N \eta ^i v^i+\zeta \Big (\sum _{i=1}^N v^i-1\Big )+\mu g(u). \end{array} \end{aligned}$$

Since \(\hat{v}^i=0 \), \(i=1,\ldots , N\), the set \(\Lambda ^{a}\) at the point \(\hat{z}\) in the associated problem consists of all tuples \(\lambda ^{a}:=(\alpha ,\beta ,p,\mu ,\eta ^1,\ldots ,\eta ^N,\zeta )\) such that

$$\begin{aligned}&\alpha \in \mathrm{I\! R}^{k+1},\quad \beta \in \mathrm{I\! R}^s, \quad p\in W^{1,\infty }([0,1],\mathrm{I\! R}^{d(x)}), \end{aligned}$$
(30)
$$\begin{aligned}&\mu \in L^\infty ([0,1],\mathrm{I\! R}^r), \end{aligned}$$
(31)
$$\begin{aligned}&\eta ^i\in L^\infty ([0,1],\mathrm{I\! R}),\quad i=1,\ldots ,N,\quad \zeta \in L^\infty ([0,1],\mathrm{I\! R}), \end{aligned}$$
(32)
$$\begin{aligned}&\alpha \ge 0,\quad |\alpha | +|\beta |=1,\quad \mu \ge 0,\end{aligned}$$
(33)
$$\begin{aligned}&\eta ^i\ge 0, \quad i=1,\ldots ,N,\quad \zeta \ge 0,\end{aligned}$$
(34)
$$\begin{aligned}&\alpha _iF_i(\hat{x}(0),\hat{x}(1))=0,\quad i=1,\ldots ,k,\end{aligned}$$
(35)
$$\begin{aligned}&\mu g(\hat{u})=0,\end{aligned}$$
(36)
$$\begin{aligned}&\zeta \Big (\sum _{i=1}^N \hat{v}^i-1\Big )=0,\end{aligned}$$
(37)
$$\begin{aligned}&-\dot{p}=pf_x(\hat{x},\hat{u}) \quad \Leftrightarrow \quad -\dot{p}= H^a_x, \end{aligned}$$
(38)
$$\begin{aligned}&-p(0)=l_{\xi _0},\quad p(1)= l_{\xi _1},\end{aligned}$$
(39)
$$\begin{aligned}&p f_u(\hat{x},\hat{u})+\mu g'(\hat{u})=0 \quad \Leftrightarrow \quad \bar{H}^a_{u}=0, \end{aligned}$$
(40)
$$\begin{aligned}&p( f(\hat{x},u^i)- f(\hat{x},\hat{u}))- \eta ^i+\zeta =0 \quad \Leftrightarrow \quad \bar{H}^a_{v^i}=0, \quad i=1,\ldots ,N. \end{aligned}$$
(41)

Let us analyze these conditions. The complementarity condition (37), together with \(\hat{v}^1=\cdots =\hat{v}^N=0\), implies \(\zeta =0\). Then, taking into account that all \(\eta ^i\ge 0\), from conditions (41) we obtain

$$\begin{aligned} p f(\hat{x},u^i)\ge p f(\hat{x},\hat{u}),\quad i=1,\ldots ,N. \end{aligned}$$
(42)

Thus, \(\lambda =(\alpha ,\beta ,p,\mu )\) is a tuple from \(\Lambda \) satisfying conditions (42). Moreover, it follows from (41) that

$$\begin{aligned} \eta ^i= p f(\hat{x},u^i)-p f(\hat{x},\hat{u}), \quad i=1,\ldots ,N. \end{aligned}$$
(43)

Conversely, if \(\lambda =(\alpha ,\beta ,p,\mu )\) is an arbitrary element of \(\Lambda \) satisfying conditions (42), then setting \(\zeta =0\), and defining \(\eta ^i\) by (43), we get the element \(\lambda ^{a}:=(\alpha ,\beta ,p,\mu ,\eta ^1,\ldots ,\eta ^N,\zeta ) \in \Lambda ^{a}\).

Denote by \(M^{a}\) the set of all \(\lambda =(\alpha ,\beta ,p,\mu )\in \Lambda \) satisfying conditions (42). We have proved the following lemma.

Lemma 3.3

The set \(\Lambda ^{a}\) consists of all tuples \(\lambda ^{a}:=(\alpha ,\beta ,p,\mu ,\eta ^1,\ldots ,\eta ^N,\zeta )\) such that \(\lambda =(\alpha ,\beta ,p,\mu )\in M^{a}\), the components \(\eta ^i\) are determined by conditions (43), and \(\zeta =0\).

Critical Cone \(\mathcal{C}^{a}\)

For \(a\in \mathcal A\) and the corresponding point \(\hat{z}\), let us describe the critical cone \(\mathcal{C}^{a}\) at the point \(\hat{z}\) in the associated problem. First, we write the equation in variations for the differential equation (13) at \(\hat{z}\). Since \(\hat{v}^1=\cdots =\hat{v}^N=0\), we get

$$\begin{aligned} \dot{ x}=f_x(\hat{x},\hat{u}) x+f_u(\hat{x},\hat{u}) u +\sum _{i=1}^N v^i \Big (f(\hat{x},u^i)-f(\hat{x},\hat{u})\Big ). \end{aligned}$$
(44)
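As a purely illustrative sketch (not part of the paper), the equation in variations (44) can be integrated numerically once \(f\), \(\hat u\), and the collection \(u^1,\ldots ,u^N\) are fixed. The scalar dynamics \(f(x,u)=-x+u\) below, with \(\hat u\equiv 0\) and \(N=1\), are an assumed toy choice, so that \(f_x\equiv -1\), \(f_u\equiv 1\), and \(f(\hat x,u^1)-f(\hat x,\hat u)=u^1-\hat u\):

```python
import math

def integrate_variation(x0, u, v1, u1, T=1.0, steps=1000):
    """Forward-Euler integration of the equation in variations (44)
    for the toy dynamics f(x, u) = -x + u (so f_x = -1, f_u = 1).
    Here u, v1, u1 are constants: the control variation, the weight
    of the single needle control, and the needle control value."""
    dt = T / steps
    x = x0
    for _ in range(steps):
        # dx/dt = f_x*x + f_u*u + v1*(f(xhat, u1) - f(xhat, uhat))
        #       = -x + u + v1*(u1 - 0)
        x += dt * (-x + u + v1 * u1)
    return x

# With u = 0 and v1 = 0 the variation simply decays: x(1) is close to x0*e^{-1}.
print(integrate_variation(1.0, 0.0, 0.0, 1.0))
```

With the constant forcing \(c=u+v^1 u^1\), the exact solution is \((x_0-c)e^{-t}+c\), which the Euler scheme approximates to first order in the step size.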

We must also take into account that the constraint \(\sum _{i=1}^N v^i -1 \le 0 \) is not active at the point \(\hat{z}\). Hence, \(\mathcal{C}^{a}\) consists of all tuples \( z=( x, u, v^1,\ldots , v^N)\) such that \( x\in W^{1,\infty }([0,1],\mathrm{I\! R}^{d(x)})\), \(u\in L^\infty ([0,1],\mathrm{I\! R}^{d(u)})\), \(v^i\in L^\infty ([0,1],\mathrm{I\! R})\), \(v^i\ge 0\), \(i=1,\ldots ,N\), Eq. (44) holds, and

$$\begin{aligned} &F'_{i}(\hat{x}(0),\hat{x}(1))\, \xi \le 0,\quad i\in I_F\cup \{0\}, \qquad K'(\hat{x}(0),\hat{x}(1))\, \xi =0, \qquad \xi =( x(0), x(1)),\\ &g_j'(\hat{u}(t))\, u(t)\le 0 \;\; {\text {a.e. on}}\; \mathcal{M}_{j0},\quad j=1,\ldots ,r. \end{aligned}$$

Let \(\mathcal{C}_0^{a}\) be the subset of tuples \( z=( x, u, v^1,\ldots , v^N)\in \mathcal{C}^{a}\) such that \( v^1=0,\ldots , v^N=0.\) The following lemma is obvious.

Lemma 3.4

The projection

$$\begin{aligned} z=( x, u, v^1,\ldots , v^N)\mapsto w=( x, u) \end{aligned}$$
(45)

maps the cone \(\mathcal{C}^{a}_0\) onto the critical cone \(\mathcal{C}\).

Quadratic Form \(\Omega ^{a}\)

It can be easily verified that for any \(\lambda ^{a}\in \Lambda ^{a}\) and any z,

$$\begin{aligned} \Omega ^{a}( z,\lambda ^{a})&=\langle l_{\xi \xi }(\hat{\xi },\alpha ,\beta )\xi ,\xi \rangle +\int _0^1\langle \bar{H}_{ww}(\hat{w},p,\mu ) w, w\rangle \,\mathrm{d}t\\ &\quad +2\int _0^1\Big (\sum _{i=1}^N v^i p\Big ( (f_x(\hat{x},u^i)-f_x(\hat{x},\hat{u})) x -f_u(\hat{x},\hat{u}) u\Big )\Big )\,\mathrm{d}t, \end{aligned}$$

where \( w=( x, u)\) and \(\xi =(x(0),x(1))\); the following lemma is then obvious.

Lemma 3.5

If an element z satisfies \( v^1=0,\ldots , v^N=0,\) then

$$\begin{aligned} \Omega ^{a}( z,\lambda ^{a})=\Omega ( w,\lambda ), \end{aligned}$$

where \(\lambda \) is the projection of \(\lambda ^{a}\) under the mapping

$$\begin{aligned} \lambda ^{a}:=(\alpha ,\beta ,p,\mu ,\eta ^1,\ldots ,\eta ^N,\zeta )\rightarrow \lambda =(\alpha ,\beta ,p,\mu ) \end{aligned}$$

and w is the projection of z under the mapping (45).
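The computation behind Lemma 3.5 is one line: setting \(v^1=\cdots =v^N=0\) annihilates the last integral in the above expression for \(\Omega ^{a}\), and what remains is exactly the quadratic form of the main problem (assuming \(\Omega (w,\lambda )\) denotes that form):

```latex
\Omega^{a}(z,\lambda^{a})\Big|_{v^1=\cdots=v^N=0}
  =\langle l_{\xi\xi}(\hat\xi,\alpha,\beta)\xi,\xi\rangle
  +\int_0^1\langle \bar{H}_{ww}(\hat w,p,\mu)\,w,w\rangle\,\mathrm{d}t
  =\Omega(w,\lambda).
```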

Let us return to Lemma 3.2. Since \(\mathcal{C}^{a}_0\subset \mathcal{C}^{a}\), we can replace the critical cone \(\mathcal{C}^{a}\) in condition (29) with the smaller cone \(\mathcal{C}^{a}_0\), and with this change, Lemma 3.2 remains valid. From here, taking into account Lemmas 3.3–3.5, we obtain the following result.

Theorem 3.3

If \(\hat{w}=(\hat{x},\hat{u})\) is a strong local minimum in the main problem, then, for any \(a\in \mathcal A\), the following second-order necessary condition holds in the main problem: the set \(M^{a}\) is nonempty and

$$\begin{aligned} \max _{\lambda \in M^{a}}\Omega (w,\lambda )\ge 0\quad \forall \, w\in \mathcal{C}. \end{aligned}$$
(46)

Now, we can derive Theorem 2.2 from Theorem 3.3. This will be done in the next section.

Proof of Theorem 2.2

Assume that \(\hat{w}=(\hat{x},\hat{u})\) is a strong local minimum in the main problem. Then, by Theorem 3.3, for any \(a\in \mathcal A\), the set \(M^{a}\) is nonempty and condition (46) holds. The set \(\mathcal A\) is directed by inclusion: \(a\) is followed by \(a'\) if the set of controls of \(a'\) contains the set of controls of \(a\); in this case, we write \(a \subset a'\). Moreover, any two collections \(a_1\), \(a_2\) are followed by a third collection \(a_3\), whose set of controls is the union of the sets of controls of \(a_1\) and \(a_2\).

Now, consider the family of sets \(\{M^a\}_{a\in \mathcal {A}}\). This family is directed by the inverse inclusion: if \(a \subset a'\), then \(M^a\supset M^{a'}\). For any two collections \(a_1\) and \(a_2\) and a collection \(a_3\) such that \(a_1 \subset a_3\) and \(a_2 \subset a_3\), we have \(M^{a_1}\cap M^{a_2}\supset M^{a_3}.\) Clearly, each of the sets \(M^{a}\) is closed. Thus, \(\{M^a\}_{a\in \mathcal {A}}\) is a centered family of nonempty closed subsets of the finite-dimensional compact set \(\Lambda \), and hence the intersection

$$\begin{aligned} M^0:=\bigcap _{a\in \mathcal {A}} M^{a} \end{aligned}$$

is nonempty. Since \(M^0\subset M^{a}\) for any \(a\in \mathcal A\), it follows from (42) that for any \(\lambda =(\alpha ,\beta ,p,\mu )\in M^0\) and any \(u(\cdot )\in L^\infty ([0,1],\mathrm{I\! R}^{d(u)})\) satisfying the control constraint \(u(t)\in U\) a.e. in [0, 1], we have

$$\begin{aligned} p(t)f(\hat{x}(t),u(t))\ge p(t)f(\hat{x}(t),\hat{u}(t)) \quad {\text {a.e. in}}\quad [0,1]. \end{aligned}$$

By the measurable selection theorem, this implies the minimum condition (7) for the given \(\lambda \). Consequently, \(M^0\subset M\). The opposite inclusion is obvious. Hence, \(M^0= M\).
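The measurable-selection step can be sketched as follows (a hedged reconstruction, assuming (7) is the pointwise minimum condition \(p(t)f(\hat x(t),\hat u(t))=\min _{u\in U}p(t)f(\hat x(t),u)\) a.e.). If the pointwise condition failed, the set

```latex
A:=\bigl\{(t,u)\in[0,1]\times U:\;
    p(t)f(\hat x(t),u)<p(t)f(\hat x(t),\hat u(t))\bigr\}
```

would project onto a subset of \([0,1]\) of positive measure; a measurable selection \(u(\cdot )\) of \(A\) over that subset, extended by \(\hat u\) elsewhere, would then be an admissible control violating the displayed inequality on a set of positive measure, a contradiction.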

Let \(w\in \mathcal{C}\) be an arbitrary element. By virtue of Theorem 3.3, and taking into account the compactness of \(M^a\) for every \(a\in \mathcal A\), for any \(a\in \mathcal A\) there is an element \(\lambda (a)\in M^a\) such that

$$\begin{aligned} \Omega (w,\lambda (a))\ge 0. \end{aligned}$$
(47)

Let \(\lambda \) be a limit point of the net \(\{\lambda (a)\}\). For each fixed \(a\in \mathcal A\), we have \(\lambda (a')\in M^{a'}\subset M^{a}\) for all \(a'\supset a\), and \(M^a\) is closed; hence \(\lambda \in M^a\) for every \(a\in \mathcal A\), and therefore \(\lambda \in M\). Passing to the limit in condition (47), we get \(\Omega (w,\lambda )\ge 0\), and hence condition (9) of Theorem 2.2 holds. Theorem 2.2 is completely proved.

Conclusions

In this paper, we consider an optimal control problem in the Mayer form, with endpoint constraints of equality and inequality type and a control constraint specified by a finite number of inequalities. We assume that all data are twice smooth and that the gradients of the active control constraints are linearly independent at any point satisfying these constraints. We prove a necessary second-order condition for a strong local minimum for an arbitrary measurable and essentially bounded optimal control. No constraint qualification is assumed for the control system and the endpoint constraints. The proof method uses the transition from the main problem to the so-called associated problem, generated by a finite collection of admissible controls. Using Dmitruk's theorem, we show that, for each finite collection of admissible controls, a strong local minimum in the main problem implies the necessary second-order condition for a weak local minimum in the associated problem. Then, analyzing these conditions, we obtain the desired result.

A natural question arises: is it possible to prove a similar result for the case of a control constraint given by an arbitrary compact set, using the first- and second-order tangents to this set and without any qualification assumptions regarding the control system and endpoint constraints?