1 Introduction

In recent years, there has been a surge of interest in optimal control problems involving the controlled sweeping process of the form

$$\begin{aligned} \dot{x}(t) \in f(t,x(t),u(t))- N_{C(t)}(x(t)), ~u(t)\in U, ~~x(0) \in C_0. \end{aligned}$$
(1)

In this respect, we refer to, for example, [3,4,5, 8,9,10, 15, 23] (see also the accompanying correction [11]) and [6, 13, 14]. Sweeping processes first appeared in the seminal paper [17] by J.J. Moreau as a mathematical framework for problems in plasticity and friction theory. They have proven useful for tackling problems in mechanics, engineering, economics and crowd motion, to name but a few; see [1, 5, 15, 16, 21]. In the last decades, systems of the form (1) have caught the attention and interest of the optimal control community. This interest stems not only from the range of applications but also from the remarkable challenge such systems raise in the derivation of necessary conditions, owing to the presence of the normal cone \(N_{C(t)}(x(t))\) in the dynamics. Indeed, the normal cone renders the right-hand side of the differential inclusion in (1) discontinuous, destroying a regularity property central to many known optimal control results.

Lately, there have been several successful attempts to derive necessary conditions for optimal control problems involving (1). Assuming that the set C is time independent, necessary conditions for optimal control problems with free end point have been derived under different assumptions and using different techniques. In [10], the set C has the form \(C=\{ x:~\psi (x)\le 0\}\), and the analysis relies on an approximating sequence of optimal control problems in which (1) is replaced by the differential equation

$$\begin{aligned} \dot{x}_{\gamma _k}(t)= f(t,x_{\gamma _k}(t),u(t))-\gamma _k e^{\gamma _k \psi (x_{\gamma _k}(t))}\nabla \psi (x_{\gamma _k}(t)), \end{aligned}$$
(2)

for some positive sequence \(\gamma _k\rightarrow +\infty \). Similar techniques are also applied to somewhat more general problems in [23]. A useful feature of those approximations is explored in [12] to define numerical schemes to solve such problems.
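To see how the exponential penalty in (2) behaves in practice, the following minimal sketch integrates the penalized dynamics with forward Euler on entirely hypothetical one-dimensional data: \(C=\{x:\psi (x)\le 0\}\) with \(\psi (x)=x\) (so \(C=\,]-\infty ,0]\)) and constant drift \(f\equiv 1\). The trajectory settles near the equilibrium \(x^*_\gamma =-\log (\gamma )/\gamma \), which approaches the boundary of C as \(\gamma \rightarrow +\infty \):

```python
import math

def sweep_penalty_euler(gamma, x0=-1.0, T=2.0, dt=1e-3):
    """Forward-Euler integration of the penalized dynamics (2) for the
    toy data psi(x) = x and constant drift f = 1, i.e.
        x' = 1 - gamma * exp(gamma * x).
    """
    x = x0
    for _ in range(int(round(T / dt))):
        x += dt * (1.0 - gamma * math.exp(gamma * x))
    return x

# The penalized trajectory settles near x* = -log(gamma)/gamma, which
# tends to the boundary point 0 of C as gamma grows.
for gamma in (10.0, 50.0, 200.0):
    print(gamma, sweep_penalty_euler(gamma), -math.log(gamma) / gamma)
```

Note that the penalty term makes the equation stiff (its linearization near the boundary grows like \(\gamma \)), so an explicit scheme needs step sizes of order \(1/\gamma \); this is one reason such approximations must be discretized with care.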

More recently, an adaptation of the family of approximating systems (2) is used in [13] to generalize the results in [10] to cover problems with additional end-point constraints and with a moving set of the form \(C(t)=\{ x:~\psi (t,x)\le 0\}\).

In this paper, we generalize the maximum principle proven in [13] to cover problems with possibly nonsmooth sets. Our problem of interest is

$$\begin{aligned} (P) \left\{ \begin{array}{l} \text{ Minimize } \; \phi (x(T))\\ \text{ over } \text{ processes } (x,u) \text{ such } \text{ that } \\ \hspace{8mm} \dot{x}(t) \in f(t,x(t),u(t))- N_{C(t)}(x(t)), \hspace{0.2cm}\text{ a.e. }\ \ t\in [0,T],\\ \hspace{8mm} u(t)\in U, \ \ \;\, \text{ a.e. }\ \ t\in [0,T],\\ \hspace{8mm} (x(0),x(T)) \in C_0\times C_T { ~\subset C(0)\times C(T)}, \end{array} \right. \end{aligned}$$

where \(T>0\) is fixed, \(\phi : R^n\rightarrow R\), \(f:[0,T]\times R^n\times R^m\rightarrow R^n\), \(U \subset R^m\) and

$$\begin{aligned} C(t):=\left\{ x\in R^n: ~\psi ^i(t,x)\le 0,\; i=1,\ldots , I\right\} \end{aligned}$$
(3)

for some functions \(\psi ^i:[0,T]\times R^n\rightarrow R\), \( i=1,\ldots , I\).

The case where \(I=1\) in (3) and \(\psi ^1\) is \(C^2\) is covered in [13]. Here, we assume \(I>1\) and that the functions \(\psi ^i\) are also \(C^2\). Although going from \(I=1\) to \(I>1\) in (3) may be seen as a small generalization, it demands a significant revision of the technical approach and, in addition, the introduction of a constraint qualification. This is because the set (3) may be nonsmooth. We focus on sets (3) satisfying a certain constraint qualification, introduced in assumption (A1) in Sect. 2; this is, indeed, a restriction on the nonsmoothness of (3). A similar problem with a nonsmooth moving set is considered in [14]. Our results cannot be obtained from those of [14] and do not generalize them.

This paper is organized in the following way. In Sect. 2, we introduce the main notation and we state and discuss the assumptions under which we work. In the same section, we also introduce the family of systems approximating \(\dot{x}(t) \in f(t,x(t),u(t))- N_{C(t)}(x(t))\) and establish a crucial convergence result, Theorem 2.1. In Sect. 3, we dwell on the approximating family of optimal control problems to (P) and we state the associated necessary conditions. The maximum principle for (P) is then deduced and stated in Theorem 4.1, covering, additionally, problems in the form of (P) where the end-point constraint \(x(T) \in C_T\) is absent. We present an illustrative example of our main result, Theorem 4.1, in Sect. 5. We end this paper with some brief conclusions.

2 Preliminaries

In this section, we introduce a summary of the notation and state the assumptions on the data of (P) enforced throughout. Furthermore, we extract information from the assumptions establishing relations crucial for the forthcoming analysis.

Notation

For a set \(S\subset R^n\), \(\partial S\), \( \text {cl}\,S\) and \( \text{ int }\, S\) denote the boundary, closure and interior of S.

If \(g: R^p\rightarrow R^q\), \(\nabla g\) represents the derivative and \(\nabla ^2g\) the second derivative. If \(g: R\times R^p\rightarrow R^q\), then \(\nabla _x g\) represents the derivative w.r.t. \(x\in R^p\) and \(\nabla ^2_xg\) the second derivative, while \(\partial _t g(t,x)\) represents the derivative w.r.t. \(t\in R\).

The Euclidean norm or the induced matrix norm on \( R^{p\times q}\) is denoted by \( |\cdot |\). We denote by \(B_n\) the closed unit ball in \( R^n\) centered at the origin. The inner product of x and y is denoted by \(\langle x, y\rangle \). For a set \(A\subset R^n\), \(d(x,A)\) denotes the distance between x and A. We denote the support function of A at z by \(S(z,A)=\sup \{\langle z,a\rangle \mid a\in A\}\).

The space \(L^{\infty }([a,b]; R^p)\) (or simply \(L^{\infty }\) when the domains are clearly understood) is the Lebesgue space of essentially bounded functions \(h:[a,b]\rightarrow R^p\). We say that \(h\in BV([a,b]; R^p)\) if h is a function of bounded variation. The space of continuous functions is denoted by \(C([a,b]; R^p)\).

Standard concepts from nonsmooth analysis will also be used. Those can be found in [7, 18] or [22], to name but a few. The Mordukhovich normal cone to a set S at \(s\in S\) is denoted by \(N_{S}(s)\) and \(\partial f(s)\) is the Mordukhovich subdifferential of f at s (also known as limiting subdifferential).

For any set \(A\subset R^n\), \( \text {cone}\, A\) is the cone generated by the set A.

We now turn to problem (P). We first state the definition of admissible processes for (P) and then we describe the assumptions under which we will derive our main results.

Definition 2.1

A pair \((x,u)\) is called an admissible process for (P) when x is an absolutely continuous function and u is a measurable function satisfying the constraints of (P).

Assumptions on the data of \(\mathbf {(P)}\)

  1. A1:

    The functions \(\psi ^i\), \( i=1,\ldots , I\), are \(C^2\). The graph of \(C(\cdot )\) is compact and contained in the interior of a ball \(rB_{n+1}\), for some \(r>0\). There exist constants \(\beta >0\), \(\eta >0\) and \(\rho \in ]0,1[\) such that

    $$\begin{aligned} \psi ^i(t,x) \in [ -\beta ,\beta ] \Longrightarrow |\nabla _x \psi ^i (t,x) | > \eta \; \; \textrm{for all }\; (t,x)\in [0,T]\times R^n, \end{aligned}$$
    (4)

    and, for \(I(t,x)=\{ i=1,\ldots , I\mid \psi ^i(t,x)\in ]-2\beta ,\beta ]\}\),

    $$\begin{aligned} \langle \nabla _x\psi ^i(t,x),\nabla _x\psi ^j(t,x)\rangle \ge 0,\;\; i,j\in I(t,x). \end{aligned}$$
    (5)

    Moreover, if \(i\in I(t,x)\), then

    $$\begin{aligned} \sum _{j\in I(t,x)\setminus \{i\}} \big | \langle \nabla _x\psi ^i(t,x),\nabla _x\psi ^j(t,x)\rangle \big | \le \rho |\nabla _x\psi ^i(t,x)|^2. \end{aligned}$$
    (6)

    Additionally,

    $$\begin{aligned} \psi ^i(t,x)\le -2\beta ~\Longrightarrow ~\nabla _x \psi ^i(t,x)=0~\text { for } i=1,\ldots , I. \end{aligned}$$
    (7)
  2. A2:

    The function f is continuous, and \(x\mapsto f(t,x,u)\) is continuously differentiable for all \((t,u)\in [0,T]\times R^m\). The constant \(M>0\) is such that \(|f(t,x,u)|\le M\) and \(|\nabla _x f(t,x,u)|\le M\) for all \((t,x,u)\in rB_{n+1}\times U\).

  3. A3:

    For each \((t,x)\), the set \(f(t,x,U)\) is convex.

  4. A4:

    The set U is compact.

  5. A5:

    The sets \(C_0\) and \(C_T\) are compact.

  6. A6:

    There exists a constant \(L_\phi \) such that \(|\phi (x)-\phi (x')|\le L_{\phi }|x-x'|\) for all \(x, x' \in R^n\).

Assumption (A1) concerns the functions \(\psi ^i\) defining the set C, and it plays a crucial role in the analysis. All \(\psi ^i\) are assumed to be smooth with gradients bounded away from the origin when \(\psi ^i\) takes values in a neighborhood of zero. Moreover, the boundary of C may be nonsmooth at the intersection points of the level sets \(\left\{ x: \psi ^i(t,x)=0\right\} \). However, nonsmoothness at those corner points is restricted by (5), which excludes the cases where the angle between two gradients of the functions defining the boundary of C is obtuse; see Fig. 1.

Fig. 1
figure 1

Examples of two different sets C. On the left side, a set that does not satisfy (5). On the right side, the set C is nonsmooth and fulfills (5)

On the other hand, (6) guarantees that the Gramian matrix of the gradients of the functions taking values near the boundary of C(t) is diagonally dominant and, hence, the gradients are linearly independent.
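To make (6) concrete: after dividing row i of the Gramian by \(|\nabla _x\psi ^i|^2\), it says exactly that the Gramian of the active gradients is strictly diagonally dominant (since \(\rho <1\)), and strictly diagonally dominant matrices are nonsingular by the Levy–Desplanques theorem, whence the linear independence. The following minimal sketch checks this on hypothetical gradient data in \(R^3\):

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def gramian(vs):
    # Gramian matrix of a family of vectors: G[i][j] = <v_i, v_j>
    return [[dot(u, v) for v in vs] for u in vs]

def satisfies_cq(G, rho):
    # condition (6): off-diagonal row sums bounded by rho * diagonal entry
    n = len(G)
    return all(sum(abs(G[i][j]) for j in range(n) if j != i) <= rho * G[i][i]
               for i in range(n))

def det3(G):
    # determinant of a 3x3 matrix by cofactor expansion
    (a, b, c), (d, e, f), (g, h, i) = G
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

# hypothetical gradients of three active constraints at a corner point
grads = [(1.0, 0.0, 0.2), (0.0, 1.0, 0.2), (0.05, 0.05, 1.0)]
G = gramian(grads)
print(satisfies_cq(G, rho=0.6), det3(G))  # dominance holds, determinant nonzero
```

A nonzero Gramian determinant confirms that the three gradients are linearly independent, as claimed.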

In many situations, as in the example we present in the last section, we can guarantee the fulfillment of (A1), in particular (7), by replacing the function \(\psi ^i\) by

$$\begin{aligned} \tilde{\psi }^i(t,x)=h\circ \psi ^i(t,x), \end{aligned}$$
(8)

where

$$\begin{aligned} h(z)=\left\{ \begin{array}{lcl} z &{} \quad \text { if} &{} z>-\beta , \\ h_s (z) &{} \quad \text { if} &{}-2\beta \le z\le -\beta ,\\ -2\beta &{} \quad \text { if} &{} z<-2\beta , \end{array}\right. \end{aligned}$$

Here, h is a \(C^2\) function and \(h_s\) is an increasing function defined on \([-2\beta ,-\beta ]\). For example, \(h_s\) may be a cubic polynomial with positive derivative on the interval \(]-2\beta ,-\beta [\). For all \(t\in [0,T]\), set

$$\begin{aligned} \tilde{C}(t): =\left\{ x\in R^n:~\tilde{\psi }^i(t,x )\le 0,~i=1, \ldots ,I\right\} . \end{aligned}$$

It is then a simple matter to see that

$$\begin{aligned} C(t)=\tilde{C}(t) \text { for all } t \in [0,T], \end{aligned}$$

and that the functions \(\tilde{\psi }^i(\cdot )\) satisfy the assumption (A1).
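As a concrete instance of the smoothing (8), the sketch below builds, for the hypothetical value \(\beta =1\), a cubic \(h_s\) matching the values and first derivatives of the two outer pieces at \(-\beta \) and \(-2\beta \); matching second derivatives as well, as needed for full \(C^2\) regularity of h, can be done analogously with a higher-degree polynomial:

```python
def h_s(z, beta=1.0):
    # Cubic Hermite piece on [-2*beta, -beta] matching the outer pieces:
    # h_s(-2b) = -2b, h_s'(-2b) = 0, h_s(-b) = -b, h_s'(-b) = 1.
    t = (z + 2.0 * beta) / beta  # rescaled variable in [0, 1]
    return beta * (-2.0 + 2.0 * t ** 2 - t ** 3)

def h(z, beta=1.0):
    # The truncation (8): identity near 0, constant -2*beta far inside C
    if z > -beta:
        return z
    if z >= -2.0 * beta:
        return h_s(z, beta)
    return -2.0 * beta

# h is increasing, equals z on ]-beta, +inf[ and -2*beta on ]-inf, -2*beta[
print(h(-2.5), h(-1.5), h(0.3))
```

Since \(h_s'\) takes the form \(4t-3t^2=t(4-3t)>0\) on \(]0,1]\), the middle piece is indeed increasing, so \(\tilde{\psi }^i=h\circ \psi ^i\) has the same zero sublevel set as \(\psi ^i\).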

The assumption that the graph of \(C(\cdot )\) is compact and contained in the interior of a ball is introduced to avoid technicalities in our forthcoming analysis. In applied problems, this may easily be sidestepped by considering the intersection of the graph of \(C(\cdot )\) with a tube around the optimal trajectory.

Condition (A1) implies the conditions of Theorem 3.1 in [2], and so our set C(t) is uniformly prox-regular.

We now proceed by introducing a family of controlled systems approximating (1). Let \(x(\cdot )\) be a solution to the differential inclusion

$$\begin{aligned} \dot{x}(t) \in f(t,x(t),U)- N_{C(t)}(x(t)). \end{aligned}$$

Under our assumptions, measurable selection theorems assert the existence of measurable functions u and \(\xi ^i\) such that \(u(t) \in U\), \(\xi ^i(t)\ge 0\) a.e. \(t\in [0,T]\), \(\xi ^i(t)=0\) if \(\psi ^i(t,x(t))<0\), and

$$\begin{aligned} \dot{x}(t)= f(t,x(t),u(t))-\sum _{i=1}^I\xi ^i(t)\nabla _x\psi ^i(t,x(t))\; \mathrm{a.e.}\; t\in [0,T]. \end{aligned}$$

Considering the trajectory x, some observations are called for. Let \(\mu \) be such that

$$\begin{aligned} \max \left\{ (|\nabla _x\psi ^i(t,x)||f(t,x,u)|+|\partial _t\psi ^i(t,x)|)+1:~t\in [0,T],\; u\in U,\; x\in C(t)+B_n, \; i=1,\ldots ,I\right\} \le \mu . \end{aligned}$$

The properties of the graph of \(C(\cdot )\) in (A1) guarantee the existence of such maximum.

Consider now some t such that, for some \(j\in \{1, \ldots , I\}\), \(\psi ^j (t,x(t))=0\) and \(\dot{x}(t)\) exists. Since the trajectory x always remains in C, we have (see (5))

$$\begin{aligned} \begin{aligned} 0&=\frac{\hbox {d}}{\hbox {d}t}\psi ^j(t,x(t))=\langle \nabla _x\psi ^j(t,x(t)),\dot{x}(t)\rangle +\partial _t\psi ^j(t,x(t)) \\&=\langle \nabla _x\psi ^j(t,x(t)),f(t,x(t),u(t))\rangle -\xi ^j(t)| \nabla _x\psi ^j(t,x(t))|^2 \\&\quad -\sum _{i\in I(t,x(t))\setminus \{ j\}}\xi ^i(t) \langle \nabla _x\psi ^i(t,x(t)),\nabla _x\psi ^j(t,x(t))\rangle +\partial _t\psi ^j(t,x(t)) \\&\quad \le \langle \nabla _x\psi ^j(t,x(t)),f(t,x(t),u(t))\rangle -\xi ^j(t)| \nabla _x\psi ^j(t,x(t))|^2 +\partial _t\psi ^j(t,x(t)), \end{aligned} \end{aligned}$$

and hence (see (4)),

$$\begin{aligned} \xi ^j(t)\le \frac{1}{| \nabla _x\psi ^j(t,x(t))|^2}(\langle \nabla _x\psi ^j(t,x(t)),f(t,x(t),u(t))\rangle +\partial _t\psi ^j(t,x(t))) \le \frac{\mu }{\eta ^2}. \end{aligned}$$

Define the function

$$\begin{aligned} \mu (\gamma )=\frac{1}{\gamma }\log \left( \frac{\mu }{\eta ^2\gamma }\right) , \quad \gamma >0, \end{aligned}$$

consider a sequence \(\{\sigma _k\}\) such that \(\sigma _k\downarrow 0\) and choose another sequence \(\{\gamma _k\}\) with \(\gamma _k\uparrow +\infty \) and

$$\begin{aligned} C(t)\subset \textrm{int}\, C^k(t)= \textrm{int}\, \left\{ x: \psi ^i(t,x)-\sigma _k\le \mu _k,\; i=1,\ldots ,I\right\} , \end{aligned}$$

where

$$\begin{aligned} \mu _k=\mu (\gamma _k). \end{aligned}$$
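The function \(\mu (\gamma )\) is chosen exactly so that the penalty coefficient on the level set \(\psi ^i-\sigma _k=\mu _k\) equals the a priori bound \(\mu /\eta ^2\) on the multipliers: indeed, \(\gamma e^{\gamma \mu (\gamma )}=\gamma \cdot \mu /(\eta ^2\gamma )=\mu /\eta ^2\) for every \(\gamma >0\). A quick numerical check of this identity, with placeholder values for the constants \(\mu \) and \(\eta \):

```python
import math

def mu_of_gamma(gamma, mu=4.0, eta=0.5):
    # mu and eta play the roles of the constants above; values are placeholders
    return math.log(mu / (eta ** 2 * gamma)) / gamma

for gamma in (10.0, 100.0, 1000.0):
    lhs = gamma * math.exp(gamma * mu_of_gamma(gamma))
    print(gamma, lhs)  # lhs equals mu/eta**2 = 16 for every gamma
```

Note also that \(\mu (\gamma )\rightarrow 0\) as \(\gamma \rightarrow +\infty \), so the inflated sets \(C^k(t)\) shrink toward C(t).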

Let \(x_k\) be a solution to the differential equation

$$\begin{aligned} \dot{x}_k(t)=f(t,x_k(t),u_k(t))-\sum _{i=1}^I\gamma _k e^{\gamma _k(\psi ^i(t,x_k(t))-\sigma _k)}\nabla _x\psi ^i(t,x_k(t)) \end{aligned}$$
(9)

for some \(u_k(t) \in U\) a.e. \(t\in [0,T]\). Take any \(t\in [0,T]\) such that \(\dot{x}_k(t)\) exists and \(\psi ^j (t,x_k(t))-\sigma _k=\mu _k\). Assume that \(|\psi ^j(t,x_k(t))|\le \beta \) and \(\psi ^i(t,x_k(t)) \le \beta \) for all \(i\).

Then, whenever \(\gamma _k\) is sufficiently large, we have (see (5) and (7))

$$\begin{aligned} \begin{aligned} \frac{\hbox {d}}{\hbox {d}t}\psi ^j(t,x_k(t))&=\langle \nabla _x\psi ^j(t,x_k(t)),f(t,x_k(t),u_k(t))\rangle \\&\quad -\gamma _ke^{\gamma _k(\psi ^j(t,x_k(t))-\sigma _k)}|\nabla _x\psi ^j(t,x_k(t))|^2 \\&\quad -\sum _{i\in I(t,x_k(t))\setminus \{ j\}}\hspace{-2em} \gamma _ke^{\gamma _k(\psi ^i(t,x_k(t))-\sigma _k)}\langle \nabla _x\psi ^i(t,x_k(t)),\nabla _x\psi ^j(t,x_k(t))\rangle \\&\quad -\sum _{i\not \in I(t,x_k(t))}\hspace{-1em} \gamma _ke^{\gamma _k(\psi ^i(t,x_k(t))-\sigma _k)}\langle \nabla _x\psi ^i(t,x_k(t)),\nabla _x\psi ^j(t,x_k(t))\rangle \\&\quad + \partial _t\psi ^j(t,x_k(t))\\&\quad \le \langle \nabla _x\psi ^j(t,x_k(t)),f(t,x_k(t),u_k(t))\rangle + \partial _t\psi ^j(t,x_k(t))\\&\quad -\gamma _ke^{\gamma _k(\psi ^j(t,x_k(t))-\sigma _k)}|\nabla _x\psi ^j(t,x_k(t))|^2 \\&\quad \le \mu -1 -\eta ^2\gamma _k e^{\gamma _k\mu _k}\\&=-1. \end{aligned} \end{aligned}$$

In the last inequality, we have used the definition of \(\mu \).

Thus, if \(x_k(0)\in C^k(0)\), we have \(x_k(t)\in C^k(t)\), for all \(t\in [0,T]\), and

$$\begin{aligned} \gamma _ke^{\gamma _k(\psi ^j(t,x_k(t))-\sigma _k)}\le \gamma _ke^{\gamma _k\mu _k}= \frac{\mu }{\eta ^2}. \end{aligned}$$
(10)

It follows that, for k sufficiently large, we have

$$\begin{aligned} |\dot{x}_k(t)|\le (\textrm{const}). \end{aligned}$$

We remark that the inclusion \(x_k(t)\in C^k(t)\) is a direct consequence of Theorem 3 in [20].

We are now in a position to state and prove our first result, Theorem 2.1. This is in the vein of Theorem 4.1 in [23] (see also Lemma 1 in [10] when \(\psi \) is independent of t and convex), deviating from it insofar as the approximating sequence of control systems (9) differs from the one introduced in [10]. The proof of Theorem 2.1 relies on (10).

Theorem 2.1

Let \(\{(x_k,u_k)\}\), with \(u_k(t)\in U\) a.e., be a sequence of solutions of Cauchy problems

$$\begin{aligned} \begin{array}{rcl} \dot{x}_k(t) &{} = &{} f(t,x_k(t),u_k(t))-\displaystyle \sum _{i=1}^I\gamma _k e^{\gamma _k(\psi ^i(t,x_k(t))-\sigma _k)}\nabla _x\psi ^i(t,x_k(t)),\\ x_k(0) &{} = &{} b_k\in C^k(0). \end{array} \end{aligned}$$
(11)

If \(b_k\rightarrow x_0\), then there exists a subsequence \(\{x_k\}\) (we do not relabel) converging uniformly to x, a unique solution to the Cauchy problem

$$\begin{aligned} \dot{x}(t)\in f(t,x(t),u(t))-N_{C(t)}(x(t)),\;\;\; x(0)=x_0, \end{aligned}$$
(12)

where u is a measurable function such that \(u(t)\in U\) a.e. \(t\in [0,T]\).

If, moreover, all the controls \(u_k\) are equal, i.e., \(u_k=u\), then the subsequence converges to the unique solution of (12); that is, any solution of

$$\begin{aligned} \dot{x}(t)\in f(t,x(t),U)-N_{C(t)}(x(t)),\;\;\; x(0)=x_0\in C(0) \end{aligned}$$
(13)

can be approximated by solutions of (11).

Proof

Consider the sequence \(\{x_k\}\), where \((x_k,u_k)\) solves (11). Recall that \(x_k(t)\in C^k(t)\) for all \(t\in [0,T]\), and

$$\begin{aligned} |\dot{x}_k(t)|\le (\textrm{const})\;\;\;\textrm{and}\;\;\; \xi _k^i(t)=\gamma _k e^{\gamma _k(\psi ^i(t,x_k(t))-\sigma _k)}\le (\textrm{const}). \end{aligned}$$
(14)

Then, there exist subsequences (we do not relabel) weakly-\(*\) converging in \(L^{\infty }\) to some v and \(\xi ^i\). Hence,

$$\begin{aligned} x_{k}(t)=x_0+ \int _0^t \dot{x}_{k}(s) \hbox {d}s \longrightarrow x(t)=x_0+ \int _0^t v(s)\hbox {d}s, ~\forall ~t\in [0,T], \end{aligned}$$

for an absolutely continuous function x. Obviously, \(x(t)\in C(t)\) for all \(t\in [0,T]\). Considering the sequence \(\{x_k\}\), recall that

$$\begin{aligned} \dot{x}_k(t)\in f(t,x_k(t),U)-\sum _{i=1}^I\xi _k^i(t)\nabla _x \psi ^i(t,x_k(t)). \end{aligned}$$
(15)

Inclusion (15) is equivalent to

$$\begin{aligned} \langle z, \dot{x}_k(t)\rangle \le S(z,f(t,x_k(t),U))-\sum _{i=1}^I\xi _k^i(t)\langle z, \nabla _x\psi ^i(t,x_k(t))\rangle ,\;\;\;\forall \, z\in R^n. \end{aligned}$$

Integrating this inequality, we get

$$\begin{aligned} \left\langle z,\frac{x_k(t+\tau )-x_k(t)}{\tau }\right\rangle&\le \frac{1}{\tau }\int _t^{t+\tau }\left( S(z,f(s,x_k(s),U))-\sum _{i=1}^I\xi _k^i(s)\langle z, \nabla _x\psi ^i(s,x_k(s))\rangle \right) \hbox {d}s\\&=\frac{1}{\tau }\int _t^{t+\tau }\left( S(z,f(s,x_k(s),U))-\sum _{i=1}^I\xi _k^i(s)\langle z, \nabla _x\psi ^i(s,x(s))\rangle \right. \\&\qquad + \left. \sum _{i=1}^I\xi _k^i(s)\langle z, \nabla _x\psi ^i(s,x(s))- \nabla _x\psi ^i(s,x_k(s))\rangle \right) \hbox {d}s. \end{aligned}$$
(16)

Passing to the limit as \(k\rightarrow \infty \), we obtain

$$\begin{aligned} \left\langle z,\frac{x(t+\tau )-x(t)}{\tau }\right\rangle \le \frac{1}{\tau }\int _t^{t+\tau }\left( S(z,f(s,x(s),U))-\sum _{i=1}^I\xi ^i(s)\langle z, \nabla _x\psi ^i(s,x(s))\rangle \right) \hbox {d}s. \end{aligned}$$
(17)

Let \(t\in [0,T]\) be a Lebesgue point of \(\dot{x}\) and \(\xi \). Passing to the limit as \(\tau \downarrow 0\) in the last inequality, we obtain

$$\begin{aligned} \langle z,\dot{x}(t)\rangle \le S(z,f(t,x(t),U))-\sum _{i=1}^I\xi ^i(t)\langle z, \nabla _x\psi ^i(t,x(t))\rangle . \end{aligned}$$

Since \(z\in R^n\) is an arbitrary vector and the set f(tx(t), U) is convex, we conclude that

$$\begin{aligned} \dot{x}(t)\in f(t,x(t),U)-\sum _{i=1}^I\xi ^i(t)\nabla _x\psi ^i(t,x(t)). \end{aligned}$$

By the Filippov lemma, there exists a measurable control \(u(t)\in U\) such that

$$\begin{aligned} \dot{x}(t)= f(t,x(t),u(t))-\sum _{i=1}^I\xi ^i(t)\nabla _x\psi ^i(t,x(t)). \end{aligned}$$

Furthermore, observe that \(\xi ^i\) is zero if \(\psi ^i(t,x(t))<0\). If \(u_k=u\) for all k, for some measurable u with \(u(t)\in U\) a.e., then the sequence \(x_k\) converges to the solution of

$$\begin{aligned} \dot{x}(t)= f(t,x(t),u(t))-\sum _{i=1}^I\xi ^i(t)\nabla _x\psi ^i(t,x(t)). \end{aligned}$$

Indeed, to see this, it suffices to pass to the limit as \(k\rightarrow \infty \) and then as \(\tau \downarrow 0\), in the equality

$$\begin{aligned} \frac{x_k(t+\tau )-x_k(t)}{\tau }= \frac{1}{\tau }\int _t^{t+\tau }\left( f(s,x_k(s),u(s))-\sum _{i=1}^I\xi _k^i(s) \nabla _x\psi ^i(s,x_k(s))\right) \hbox {d}s. \end{aligned}$$

Recall that the set C(t) is uniformly prox-regular. The proof of uniqueness of solutions for general sweeping processes with prox-regular sets can be found in [19]; it holds under the requirement that the moving set is Lipschitz continuous with respect to time. Although we do not assume the Lipschitz dependence directly, under our assumptions we can appeal to the implicit function theorem to show that C(t) is locally Lipschitz. However, for our special case, a simple alternative proof is possible, which we present next for the convenience of the reader. The proof is in the vein of that of Theorem 4.1 in [23]. Suppose that there exist two different solutions of (12): \(x_1\) and \(x_2\). We have

$$\begin{aligned} \frac{1}{2}\frac{\hbox {d}}{\hbox {d}t}|x_1(t)-x_2(t)|^2&=\langle x_1(t)-x_2(t),\dot{x}_1(t)-\dot{x}_2(t)\rangle \nonumber \\&=\langle x_1(t)-x_2(t),f(t,x_1(t),u(t))-f(t,x_2(t),u(t))\rangle \nonumber \\&\quad -\left\langle x_1(t)-x_2(t),\sum _{i=1}^I\xi _1^i(t)\nabla _x\psi ^i(t,x_1(t))-\sum _{i=1}^I\xi _2^i(t)\nabla _x\psi ^i(t,x_2(t))\right\rangle . \end{aligned}$$
(18)

If, for all i, \(\psi ^i(t,x_1(t))<0\) and \(\psi ^i(t,x_2(t))<0\), then \(\xi _1^i(t)=\xi _2^i(t)=0\), and we obtain

$$\begin{aligned} \frac{1}{2}\frac{\hbox {d}}{\hbox {d}t}|x_1(t)-x_2(t)|^2\le L_f|x_1(t)-x_2(t)|^2. \end{aligned}$$

Suppose that \(\psi ^j(t,x_1(t))=0\). Then, by the Taylor formula we get

$$\begin{aligned} \psi ^j(t,x_2(t))&=\psi ^j(t,x_1(t))+\langle \nabla _x\psi ^j(t,x_1(t)),x_2(t)-x_1(t)\rangle \nonumber \\&\quad +\frac{1}{2}\langle x_2(t)-x_1(t), \nabla _x^2\psi ^j(t,\theta x_2(t)+(1-\theta )x_1(t))( x_2(t)-x_1(t))\rangle , \end{aligned}$$
(19)

where \(\theta \in [0,1]\). Since \(\psi ^j(t,x_2(t))\le 0\), we have

$$\begin{aligned} \langle \nabla _x\psi ^j(t,x_1(t)),x_2(t)-x_1(t)\rangle&\le - \frac{1}{2}\langle x_2(t)-x_1(t), \nabla _x^2\psi ^j(t,\theta x_2(t)+(1-\theta )x_1(t))( x_2(t)-x_1(t))\rangle \nonumber \\&\le (\textrm{const}) |x_1(t)-x_2(t)|^2. \end{aligned}$$
(20)

Now, if \(\psi ^j(t,x_2(t))=0\), we deduce in the same way that

$$\begin{aligned} \langle \nabla _x\psi ^j(t,x_2(t)),x_1(t)-x_2(t)\rangle \le (\textrm{const})|x_1(t)-x_2(t)|^2. \end{aligned}$$

Thus, we have

$$\begin{aligned} \frac{1}{2}\frac{\hbox {d}}{\hbox {d}t}|x_1(t)-x_2(t)|^2\le (\textrm{const})|x_1(t)-x_2(t)|^2. \end{aligned}$$

Hence, since \(x_1(0)=x_2(0)=x_0\), Gronwall's lemma yields \(|x_1(t)-x_2(t)|=0\) for all \(t\in [0,T]\). \(\square \)

3 Approximating Family of Optimal Control Problems

In this section, we define an approximating family of optimal control problems to (P) and we state the corresponding necessary conditions.

Let \((\hat{x},\hat{u})\) be a global solution to (P) and consider sequences \(\{\gamma _k\}\) and \(\{\sigma _k\}\) as defined above. Let \(\hat{x}_k(\cdot )\) be the solution to

$$\begin{aligned} \begin{array}{rcl} \dot{x}(t) &{} = &{} f(t,x(t),\hat{u}(t))-\displaystyle \sum _{i=1}^I\gamma _k e^{\gamma _k(\psi ^i(t,x(t))-\sigma _k)}\nabla _x\psi ^i(t,x(t)),\\ x(0) &{} = &{} \hat{x}(0). \end{array} \end{aligned}$$
(21)

Set \(\epsilon _k=|\hat{x}_k(T)-\hat{x}(T)|\). It follows from Theorem 2.1 that \(\epsilon _k\downarrow 0\). Take \(\alpha >0\) and define the problem

$$\begin{aligned} (P_k^\alpha ) \left\{ \begin{array}{l} \text{ Minimize } \; \phi (x(T))+|x(0)-\hat{x}(0)|^2+\alpha \displaystyle \int _0^T|u(t)-\hat{u}(t)|\hbox {d}t\\ \text{ over } \text{ processes } (x,u) \text{ such } \text{ that } \\ \dot{x}(t) = f(t,x(t),u(t))-\displaystyle \sum _{i=1}^I\nabla _x e^{\gamma _k(\psi ^i(t,x(t))-\sigma _k)} \hspace{0.2cm}\text{ a.e. }\ \ t\in [0,T],\\ u(t)\in U~~~~\text{ a.e. }\ \ t\in [0,T],\\ x(0)\in C_0,~~ x(T)\in C_T+\epsilon _k B_n, \end{array} \right. \end{aligned}$$

Clearly, the problem \((P_k^\alpha )\) has admissible solutions. Consider the space

$$\begin{aligned} W=\{ (c,u)\mid c\in C_0,\; u\in L^{\infty }\; \textrm{with }\; u(t)\in U\} \end{aligned}$$

and the distance

$$\begin{aligned} d_{W}((c_1,u_1),(c_2,u_2))=|c_1-c_2|+\int _0^T|u_1(t)-u_2(t)|\hbox {d}t. \end{aligned}$$

Endowed with \(d_{W}\), W is a complete metric space. Take any \((c,u)\in W\) and a solution y to the Cauchy problem

$$\begin{aligned} \begin{array}{rcl} \dot{y}(t) &{} = &{} f(t,y(t),u(t))-\displaystyle \sum _{i=1}^I\nabla _x e^{\gamma _k(\psi ^i(t,y(t))-\sigma _k)} \hspace{0.2cm}\text{ a.e. }\ \ t\in [0,T],\\ y(0) &{} = &{} c. \end{array} \end{aligned}$$

Under our assumptions, the function

$$\begin{aligned} (c,u)~\rightarrow ~ \phi (y(T))+ | c - \hat{x}(0) |^2+\alpha \int _{0}^{T} | u-\hat{u}|~\hbox {d}t \end{aligned}$$

is continuous on \((W,d_{W})\) and bounded below. Appealing to Ekeland’s theorem, we deduce the existence of a pair \((x_k,u_k)\) solving the following problem

$$\begin{aligned} (AP_k) \left\{ \begin{array}{l} \text{ Minimize } \; \Phi (x,{u})= \phi (x(T))+|x(0)-\hat{x}(0)|^2+\alpha \displaystyle \int _0^T|u(t)-\hat{u}(t)|\hbox {d}t\\ \qquad \qquad +\epsilon _k\left( |x(0)-x_k(0)|+ \displaystyle \int _0^T|u(t)-u_k(t)|\hbox {d}t\right) ,\\ \text{ over } \text{ processes } (x,u) \text{ such } \text{ that } \\ \dot{x}(t) = f(t,x(t),u(t))-\displaystyle \sum _{i=1}^I\nabla _x e^{\gamma _k(\psi ^i(t,x(t))-\sigma _k)} \hspace{0.2cm}\text{ a.e. }\ \ t\in [0,T],\\ u(t)\in U~~~~\text{ a.e. }\ \ t\in [0,T],\\ x(0)\in C_0,~~ x(T)\in C_T+\epsilon _k B_n, \end{array} \right. \end{aligned}$$

Lemma 3.1

Take \(\gamma _k\rightarrow \infty \), \(\sigma _k\rightarrow 0\) and \(\epsilon _k \rightarrow 0\) as defined above. For each k, let \((x_k,u_k)\) be the solution to \((AP_k)\). Then, there exists a subsequence (we do not relabel) such that

$$\begin{aligned} u_k(t)\rightarrow \hat{u}(t)~ \text {a.e.}, \quad x_k\rightarrow \hat{x}\; \text { uniformly in }\; [0,T]. \end{aligned}$$

Proof

We deduce from Theorem 2.1 that \(\{x_k\}\) converges uniformly to an admissible solution \(\tilde{x}\) of (P). Since U and \(C_0\) are compact, we have \(U\subset KB_m\) and \(C_0\subset KB_n\) for some \(K>0\). Without loss of generality, \(u_k\) weakly-\(*\) converges to a function \(\tilde{u}\in L^{\infty }([0,T];U)\). Hence, it weakly converges to \(\tilde{u}\) in \(L^1\). From the optimality of the processes \((x_k,u_k)\), we have

$$\begin{aligned}&\phi (x_k(T))+|x_k(0)-\hat{x}(0)|^2+\alpha \int _0^T|u_k(t)-\hat{u}(t)|\hbox {d}t\\&\quad \le \phi (\hat{x}_k(T))+\epsilon _k\left( |\hat{x}_k(0)-x_k(0)|+\int _0^T|u_k(t)-\hat{u}(t)|\hbox {d}t\right) \\&\quad \le \phi (\hat{x}_k(T))+2K(1+T)\epsilon _k. \end{aligned}$$

Since \((\hat{x},\hat{u})\) is a global solution of the problem, passing to the limit, we get

$$\begin{aligned}&\phi (\tilde{x}(T))+|\tilde{x}(0)-\hat{x}(0)|^2+\alpha \int _0^T|\tilde{u}(t)-\hat{u}(t)|\hbox {d}t\\&\quad \le \lim _{k\rightarrow \infty }\left( \phi (x_k(T))+|x_k(0)-\hat{x}(0)|^2\right) + \alpha \liminf _{k\rightarrow \infty } \int _0^T|u_k(t)-\hat{u}(t)|\hbox {d}t\\&\quad \le \lim _{k\rightarrow \infty }\phi (\hat{x}_k(T))=\phi (\hat{x}(T))\le \phi (\tilde{x}(T)). \end{aligned}$$

Hence, \(\tilde{x}(0)=\hat{x}(0)\) and \(\tilde{u}=\hat{u}\) a.e.; thus \(u_k\) converges to \(\hat{u}\) in \(L^1\), and some subsequence (we do not relabel) converges to \(\hat{u}\) almost everywhere. \(\square \)

We now finish this section with the statement of the necessary conditions of optimality for the family of problems \((AP_k)\). These can be seen as a direct consequence of Theorem 6.2.1 in [22].

Proposition 3.1

For each k, let \((x_k,u_k)\) be a solution to \((AP_k)\). Then, there exist absolutely continuous functions \(p_k\) and scalars \(\lambda _k\ge 0\) such that

(a):

(nontriviality condition)

$$\begin{aligned} \lambda _k+|p_{k}(T)| =1, \end{aligned}$$
(22)
(b):

(adjoint equation)

$$\begin{aligned} \begin{array}{c}\dot{p}_{k} =-(\nabla _x f_{k})^* p_{k} +\sum _{i=1}^I\gamma _k e^{\gamma _k (\psi _{k}^i-\sigma _k)}\nabla ^2_x\psi _{k}^ip_{k}\\ +\sum _{i=1}^I\gamma _k^2e^{\gamma _k (\psi _{k}^i-\sigma _k)}\nabla _x\psi _{k}^i\langle \nabla _x\psi _{k}^i,p_{k}\rangle , \end{array} \end{aligned}$$
(23)

where the superscript \(*\) stands for transpose,

(c):

(maximization condition)

$$\begin{aligned} \max _{u\in U}\left\{ \langle f(t,x_{k}, u) , p_{k} \rangle - \alpha \lambda _k|u-\hat{u}| -\epsilon _k \lambda _k|u-u_k|\right\} \end{aligned}$$
(24)

is attained at \(u_k (t) \), for almost every \(t\in [0,T]\),

(d):

(transversality condition)

$$\begin{aligned} ( p_{k}(0), - p_{k}(T)) \in \lambda _k\left( 2(x_k(0)-\hat{x}(0))+\epsilon _k B_n, \partial \phi (x_{k}(T))\right) \nonumber \\ + N_{C_0}(x_{k}(0))\times N_{C_T+\epsilon _kB_n}(x_{k}(T)). \hspace{1cm} \end{aligned}$$
(25)

To simplify the notation above, we drop the t dependence in \(p_k\), \(\dot{p}_k\), \(x_k\), \(u_k\), \(\hat{x}\) and \(\hat{u}\). Moreover, in (b), we write \(\psi _{k}^i\) instead of \(\psi ^i(t,x_k(t))\) and \(f_k\) instead of \(f(t,x_k(t),u_k(t))\). The same holds for the derivatives of \(\psi ^i\) and f.

4 Maximum Principle for (P)

In this section, we establish our main result, a Maximum Principle for (P). This is done by taking limits of the conclusions of Proposition 3.1, following closely the analysis done in the proof of [10, Theorem 2].

Observe that

$$\begin{aligned} \begin{aligned} \frac{1}{2} \frac{\hbox {d}}{\hbox {d}t} |p_k(t)|^2&= - \langle \nabla _x f_k p_k , p_k \rangle + \sum _{i=1}^I\gamma _k e^{\gamma _k(\psi _k^i-\sigma _k)} \langle \nabla _x^2\psi _k^ip_k, p_k \rangle \\&\hspace{1cm} + \sum _{i=1}^I\gamma _k^2e^{\gamma _k(\psi _k^i-\sigma _k)}\langle \nabla _x\psi _k^i, p_k \rangle ^2 \\&\ge - \langle \nabla _x f_k p_k , p_k \rangle + \sum _{i=1}^I \gamma _k e^{\gamma _k(\psi _k^i-\sigma _k)} \langle \nabla _x^2\psi _k^ip_k, p_k \rangle \\&\ge \ - M | p_k|^2 +\sum _{i=1}^I\gamma _k e^{\gamma _k(\psi _k^i-\sigma _k)} \langle \nabla _x^2\psi _k^ip_k, p_k \rangle , \end{aligned} \end{aligned}$$

where M is the constant of (A2). Taking into account hypothesis (A1) and (10), we deduce the existence of a constant \(K_0>0\) such that

$$\begin{aligned} \frac{1}{2} \frac{\hbox {d}}{\hbox {d}t} |p_k(t)|^2\ge -K_0| p_k(t)|^2. \end{aligned}$$

This last inequality implies that \(t\mapsto e^{2K_0t}|p_k(t)|^2\) is nondecreasing, and hence

$$\begin{aligned} | p_k(t)|^2 \ \le \ e^{2 K_0 (T-t)} | p_k(T)|^2 \le \ e^{2 K_0T} |p_k(T)|^2. \end{aligned}$$

Since, by (a) of Proposition 3.1, \(|p_k(T)|\le 1\), we deduce from the above that there exists \(M_0>0\) such that

$$\begin{aligned} | p_k(t)| \ \le M_0. \end{aligned}$$
(26)

Now, we claim that the sequence \(\{\dot{p}_k\}\) is uniformly bounded in \(L^1\). To prove our claim, we need to establish bounds for the three terms in (23). Following [10, 13], we start by deducing some inequalities that will be of help.

Denote \(I_k=I(t,x_k(t))\) and \(S_k^j=\textrm{sign}\left( \langle \nabla _x\psi _k^j, p_ k \rangle \right) \). We have

$$\begin{aligned} \sum _{j=1}^I\frac{\hbox {d}}{\hbox {d}t} \left| \langle \nabla _x\psi _k^j, p_ k \rangle \right|&=\sum _{j=1}^I\left( \langle \nabla ^2_x\psi _k^j \dot{x}_ k, p_ k \rangle +\langle \partial _t\nabla _x\psi _k^j,p_k\rangle + \langle \nabla _x\psi _k^j,\dot{p}_ k\rangle \right) \, S_k^j \\&= \sum _{j=1}^I\left( \langle p_ k, \nabla ^2_x\psi _k^j f_ k \rangle - \sum _{i=1}^I\gamma _k e^{\gamma _k(\psi _k^i-\sigma _k)} \langle p_ k, \nabla ^2_x\psi _k^j \nabla _x\psi _k^i \rangle \right) S_k^j \\&\quad +\sum _{j=1}^I\left( \langle \partial _t\nabla _x\psi _k^j,p_k\rangle - \langle \nabla _x \psi _k^j, (\nabla _x f_ k)^* p_ k \rangle \right) S_k^j \\&\quad +\sum _{j=1}^I\left( \sum _{i=1}^I\gamma _k e^{\gamma _k(\psi _k^i-\sigma _k)} \langle \nabla _x\psi _k^j , \nabla ^2_x\psi _k^i p_ k\rangle \right) S_k^j \\&\quad +\sum _{i=1}^I \sum _{j=1}^I \gamma _k^2e^{\gamma _k(\psi _k^i-\sigma _k)} \langle \nabla _x\psi _k^j , \nabla _x\psi _k^i \rangle \langle \nabla _x\psi _k^i,p_ k\rangle S_k^j. \end{aligned}$$

Observe that (see (6) and (7))

$$\begin{aligned}{} & {} \sum _{i=1}^I \sum _{j=1}^I \gamma _k^2e^{\gamma _k(\psi _k^i-\sigma _k)} \langle \nabla _x\psi _k^j, \nabla _x\psi _k^i \rangle \langle \nabla _x\psi _k^i,p_ k\rangle S_k^j \\{} & {} \quad =\sum _{i=1}^I \sum _{j\in I_k} \gamma _k^2e^{\gamma _k(\psi _k^i-\sigma _k)} \langle \nabla _x\psi _k^j, \nabla _x\psi _k^i \rangle \langle \nabla _x\psi _k^i,p_ k\rangle S_k^j \\{} & {} \quad =\sum _{i\not \in I_k} \gamma _k^2e^{\gamma _k(\psi _k^i-\sigma _k)} \sum _{j\in I_k} \langle \nabla _x\psi _k^j, \nabla _x\psi _k^i \rangle \langle \nabla _x\psi _k^i,p_ k\rangle S_k^j \\{} & {} \qquad +\sum _{i\in I_k}\gamma _k^2e^{\gamma _k(\psi _k^i-\sigma _k)}\left( |\nabla _x\psi _k^i|^2+ \sum _{j\in I_k\setminus \{i\}} \langle \nabla _x\psi _k^j, \nabla _x\psi _k^i \rangle S_k^j~S_k^i \right) | \langle \nabla _x\psi _k^i,p_ k\rangle | \\{} & {} \quad = \sum _{i\in I_k}\gamma _k^2e^{\gamma _k(\psi _k^i-\sigma _k)}\left( |\nabla _x\psi _k^i|^2+ \sum _{j\in I_k\setminus \{i\}} \langle \nabla _x\psi _k^j, \nabla _x\psi _k^i \rangle S_k^j~S_k^i \right) | \langle \nabla _x\psi _k^i,p_ k\rangle |\\{} & {} \quad \ge \displaystyle (1-\rho )\sum _{i\in I_k}\gamma _k^2e^{\gamma _k(\psi _k^i-\sigma _k)}|\nabla _x\psi _k^i|^2 | \langle \nabla _x\psi _k^i,p_ k\rangle | \\{} & {} \quad = (1-\rho )\sum _{i=1}^I\gamma _k^2e^{\gamma _k( \psi _k^i-\sigma _k)}|\nabla _x\psi _k^i|^2 | \langle \nabla _x\psi _k^i,p_ k\rangle |. \hspace{2cm} \end{aligned}$$

Using this and integrating the previous equality, we deduce the existence of \(M_1>0\) such that:

$$\begin{aligned} \int _0^T \sum _{i=1}^I\gamma _k^2e^{\gamma _k(\psi _k^i-\sigma _k)}| \nabla _x\psi _k^i|^2 | \langle \nabla _x\psi _k^i,p_ k\rangle |\hbox {d}t \le M_1. \end{aligned}$$
(27)

We are now in a position to show that

$$\begin{aligned} \displaystyle \int _{0}^{T} \sum _{i=1}^{I} \gamma _k^2 e^{\gamma _k( \psi _k^i-\sigma _k)} |\nabla _x \psi _k^i| \left| \langle \nabla _x \psi _k^i,p_ k\rangle \right| \ \hbox {d}t \end{aligned}$$

is bounded. For simplicity, set \(L_k^i(t) =\gamma _k^2 e^{\gamma _k( \psi _k^i-\sigma _k)} |\nabla _x \psi _k^i| \left| \langle \nabla _x \psi _k^i,p_ k\rangle \right| \). Notice that

$$\begin{aligned} \displaystyle \sum _{i=1}^I \int _{0}^{T} L_k^i(t) \hbox {d}t= \displaystyle \sum _{i=1}^I \left\{ \int _{\{t: |\nabla _x \psi _k^i| < \eta \}} \hspace{-0.7cm}L_k^i(t)~\hbox {d}t+ \displaystyle \int _{\{t: |\nabla _x\psi _k^i| \ge \eta \}}\hspace{-0.7cm} L_k^i(t) \hbox {d}t \right\} . \end{aligned}$$

Using (A1) and (27), we deduce that

$$\begin{aligned} \begin{aligned} \displaystyle \sum _{i=1}^I \int _{0}^{T} L_k^i (t)~\hbox {d}t&\le \displaystyle \sum _{i=1}^I \left( \gamma _k^2 e^{-\gamma _k(\beta +\sigma _k)} \eta ^2 \max _{t} |p_ k(t)|\right) \\&\hspace{0.5cm}+\displaystyle \sum _{i=1}^I \left( \gamma _k^2 \int _{ \{t: |\nabla _x \psi _k^i | \ge \eta \} } \hspace{-1cm}e^{\gamma _k(\psi _k^i-\sigma _k)} \frac{|\nabla _x\psi _k^i |^2}{ | \nabla _x\psi _k^i |} \left| \langle \nabla _x\psi _k^i,p_ k\rangle \right| \ \hbox {d}t \right) \\&\le \gamma _k^2 I~e^{-\gamma _k (\beta +\sigma _k)} \eta ^2 M_0 \\&\hspace{0.5cm} +\frac{1}{\eta }\displaystyle \sum _{i=1}^I \left( \int _{0}^{T} \gamma _k^2 e^{\gamma _k(\psi _k^i-\sigma _k)} | \nabla _x\psi _k^i|^2 \left| \langle \nabla _x\psi _k^i,p_ k\rangle \right| \ \hbox {d}t\right) \\&\le \ \eta ^2 M_0 I+ \frac{M_1}{\eta }, \end{aligned} \end{aligned}$$

for k large enough. Summarizing, there exists \(M_2>0\) such that

$$\begin{aligned} \displaystyle \sum _{i=1}^I \gamma _k^2 \int _{0}^{T} e^{\gamma _k(\psi _k^i-\sigma _k)} |\nabla _x\psi _k^i| \left| \langle \nabla _x\psi _k^i,p_ k\rangle \right| \ \hbox {d}t \ \ \le M_2. \end{aligned}$$
(28)

Mimicking the analysis conducted in Step 1, b) and c) of the proof of Theorem 2 in [10] and taking into account (b) of Proposition 3.1, we conclude that there exists a constant \(N_1>0\) such that

$$\begin{aligned} \int _{0}^{T} \left| \dot{p}_{k}(t)\right| \hbox {d}t \le N_1, \end{aligned}$$
(29)

for k sufficiently large, proving our claim.

Before proceeding, observe that it is a simple matter to assert the existence of a constant \(N_2>0\) such that

$$\begin{aligned} \displaystyle \sum _{i=1}^I \int _0^T \gamma _k^2 e^{\gamma _k(\psi _k^i-\sigma _k)} |\langle \nabla _x\psi _k^i,p_{k}\rangle |\hbox {d}t\le N_2. \end{aligned}$$
(30)

This inequality will be of help in what follows.

Let us now recall that

$$\xi _k^i(t)=\gamma _k e^{\gamma _k( \psi ^i(t,x_k(t))-\sigma _k)}$$

and that the second inequality in (14) holds. We turn to the analysis of Step 2 in the proof of Theorem 2 in [10] (see also [13]). Adapting those arguments, we can conclude the existence of some function \(p\in BV([0,T],R^n)\) and, for \(i=1, \ldots , I\), functions \(\xi ^i\in L^{\infty }([0,T],R)\) with \(\xi ^i(t) \ge 0\) a.e. \(t\) and \(\xi ^i(t) = 0\) for \(t \in I_b^i\), where

$$\begin{aligned} I_b^i=\left\{ t\in [0,T]:~ \psi ^i(t, \hat{x}(t))<0\right\} , \end{aligned}$$

and finite signed Radon measures \(\eta ^i\), null in \( I_b^i\), such that, for any \(z\in C([0,T],R^n)\)

$$\begin{aligned} \int _0^T \langle z,dp\rangle =-\int _0^T \langle z, (\nabla _x\hat{f})^*p\rangle \hbox {d}t +\displaystyle \sum _{i=1}^{I} \left( \int _0^T\xi ^i\langle z,\nabla ^2_x\hat{\psi } ^ip\rangle \hbox {d}t +\int _0^T\langle z, \nabla _x\hat{\psi }^i \rangle {d}\eta ^i\right) , \end{aligned}$$

where \(\nabla _x \hat{\psi }^i =\nabla _x \psi ^i(t,\hat{x}(t))\). Set \(\nabla _x \psi ^i_k=\nabla _x \psi ^i(t,x_k(t))\). The finite signed Radon measures \(\eta ^i\) are weak-\(*\) limits of

$$\begin{aligned} \gamma _k^2 e^{\gamma _k(\psi ^i(t, x_k(t))-\sigma _k)} \langle \nabla _x\psi ^i(t, x_k(t)),p_{k}(t)\rangle \hbox {d}t. \end{aligned}$$

Observe that the measures

$$\begin{aligned} \langle \nabla _x\psi ^i(t,\hat{x}(t)),p(t)\rangle {d}\eta ^i(t) \end{aligned}$$
(31)

are nonnegative.

For each \(i=1,\ldots , I\), the sequence \(\xi _k^i\) is weakly-\(*\) convergent in \(L^{\infty }\) to \(\xi ^i\ge 0\). Following [13], we deduce from (30) that, for each \(i=1,\ldots , I\),

$$\begin{aligned}{} & {} \int _0^T|\xi ^i\langle \nabla _x\hat{\psi }^i,p\rangle | \hbox {d}t=\lim _{k\rightarrow \infty }\int _0^T|\xi _k^i\langle \nabla _x\hat{\psi }^i,p\rangle | \hbox {d}t\\{} & {} \quad \le \lim _{k\rightarrow \infty } \left( \int _0^T\xi _k^i|\langle \nabla _x\hat{\psi }^i,p\rangle -\langle \nabla _x \psi _k^i,p_k\rangle | \hbox {d}t+ \int _0^T\xi _k^i|\langle \nabla _x \psi _k^i,p_k\rangle | \hbox {d}t\right) \\{} & {} \quad \le \lim _{k\rightarrow \infty }\left( \Big |\xi _k^i\Big |_{L^{\infty }}\Big | \langle \nabla _x\hat{\psi }^i,p\rangle -\langle \nabla _x \psi _k^i,p_k\rangle \Big |_{L^1}+\frac{N_2}{\gamma _k}\right) =0. \end{aligned}$$

It follows that

$$\begin{aligned} \xi ^i \langle \nabla _x\hat{\psi }^i ,p\rangle =0 \ \text{ a.e. } \end{aligned}$$
(32)

Consider now the sequence of scalars \(\{\lambda _k\}\). It is an easy matter to show that there exists a subsequence of \(\{\lambda _k\}\) converging to some \(\lambda \ge 0\). This, together with the convergence of \(p_k\) to p, allows us to take limits in (a) and (c) of Proposition 3.1 to deduce that

$$\begin{aligned} \lambda +|p(T)|=1 \end{aligned}$$

and

$$\begin{aligned} \langle p(t), f(t,\hat{x}(t),u)\rangle -\alpha \lambda |u-\hat{u}(t)| \le \langle p(t), f(t,\hat{x}(t),\hat{u}(t))\rangle ~\forall u\in U, \text { a.e. } t \in [0,T]. \end{aligned}$$

It remains to take limits of the transversality conditions (d) in Proposition 3.1. First, observe that

$$\begin{aligned} C_T+\epsilon _kB_n=\left\{ x:~d(x,C_T)\le \epsilon _k\right\} . \end{aligned}$$

From the basic properties of the Mordukhovich normal cone and subdifferential (see [18], Section 1.3.3), we have

$$\begin{aligned} N_{C_T+\epsilon _kB_n}(x_k(T))\subset \text { cl cone}\,\partial d(x_k(T), C_T) \end{aligned}$$

and

$$\begin{aligned} N_{C_T}(\hat{x}(T))= \text { cl cone}\,\partial d(\hat{x}(T), C_T). \end{aligned}$$

Passing to the limit as \(k\rightarrow \infty \), we get

$$\begin{aligned} (p(0),-p(T))\in N_{C_0}(\hat{x}(0))\times N_{C_T}(\hat{x}(T))+\{0\}\times \lambda ~ \partial \phi (\hat{x}(T)). \end{aligned}$$

Finally, mimicking Step 3 in the proof of Theorem 2 in [10], we remove the dependence of the conditions on the parameter \(\alpha \). This is done by taking further limits, this time along a sequence \(\alpha _j\downarrow 0\).

We summarize our conclusions in the following theorem.

Theorem 4.1

Let \((\hat{x}, \hat{u})\) be the optimal solution to (P). Suppose that assumptions (A1)–(A6) are satisfied. For \(i=1,\ldots , I\), set

$$\begin{aligned} I^{i}_b= \{ t \in [0,T]: ~ \psi ^{i}(t,\hat{x}(t) ) < 0 \}. \end{aligned}$$

There exist \( \lambda \ge 0\), \( p\in BV([0,T],R^n)\), finite signed Radon measures \( \eta ^i\), null in \(I^{i}_b\), for \(i=1,\ldots , I\), and functions \( \xi ^{i}\in L^\infty ([0,T],R)\), \(i=1,\ldots , I \), with \( \xi ^{i}(t) \ge 0\) a.e. \(t\) and \(\xi ^{i}(t) = 0\) for \( t \in I^{i}_b, \) such that

a):

\(\lambda +|p(T)|\ne 0\),

b):

\(\dot{ \hat{x}}(t)=f(t,\hat{x}(t),\hat{u}(t))- \displaystyle \sum _{i=1}^{I}\xi ^i(t)\nabla _x \hat{\psi }^{i} (t),\)

c):

for any \(z\in C([0,T];R^n)\)

$$\begin{aligned} \begin{array}{l} \displaystyle \int _0^T \langle z(t),dp(t)\rangle = -\displaystyle \int _0^T \langle z(t), ( \nabla _x \hat{f}(t))^*p(t)\rangle \textrm{d}t \\ \quad \displaystyle + \sum _{i=1}^{I}\displaystyle \left( \int _0^T \xi ^{i}(t) \langle z(t), \nabla ^2_x\hat{\psi }^{i}(t) p(t)\rangle \textrm{d}t \right. \ +\displaystyle \left. \int _0^T \langle z(t), \nabla _x \hat{\psi }^{i}(t)\rangle \hbox {d}\eta ^i\right) ,\end{array} \end{aligned}$$

where \( \nabla _x \hat{f}(t) = \nabla _x f(t,\hat{x}(t),\hat{u}(t))\), \(\nabla _x \hat{\psi }^i(t)=\nabla _x \psi ^i(t,\hat{x}(t))\) and \(\nabla ^2_x \hat{\psi }^i(t)=\nabla ^2_x \psi ^i(t, \hat{x}(t)),\)

d):

\(\xi ^i(t)\langle \nabla _x \psi ^{i}(t,\hat{x}(t)),p(t)\rangle =0\) a.e. \(t\), for all \(i=1, \ldots , I\),

e):

for all \(i=1, \ldots , I\), the measures \(\langle \nabla _x\psi ^i(t,\hat{x}(t)),p(t)\rangle {d}\eta ^i(t)\) are nonnegative,

f):

\(\displaystyle \langle p(t), f(t,\hat{x}(t),u)\rangle \le \langle p(t), f(t,\hat{x}(t),\hat{u}(t))\rangle \) for all \(u \in U\), a.e. \(t\),

g):

\(\displaystyle \begin{array}{c}(p(0),-p(T))\in N_{C_0}(\hat{x}(0))\times N_{C_T}(\hat{x}(T)) +\{0\}\times \lambda \partial \phi (\hat{x}(T)).\end{array}\)

Notably, condition (e) is not considered in any of our previous works.

We now turn to the free end-point case, i.e., to the problem

$$\begin{aligned} (P_f) \left\{ \begin{array}{l} \text{ Minimize } \; \phi (x(T))\\ \text{ over } \text{ processes } (x,u) \text{ such } \text{ that } \\ \hspace{8mm} \dot{x}(t) \in f(t,x(t),u(t))- N_{C(t)}(x(t)), \hspace{0.2cm}\text{ a.e. }\ \ t\in [0,T],\\ \hspace{8mm} u(t)\in U, \ \ \;\, \text{ a.e. }\ \ t\in [0,T],\\ \hspace{8mm} x(0) \in C_0 \subset C(0). \end{array} \right. \end{aligned}$$

Problem \((P_f)\) differs from (P) in that x(T) is not constrained to take values in \(C_T\). We apply Theorem 4.1 to \((P_f)\). Since x(T) is free, we deduce from (g) of the above theorem that \(-p(T)\in \lambda \, \partial \phi (\hat{x}(T))\). Suppose that \(\lambda =0\). Then \(p(T)=0\), contradicting the nontriviality condition (a) of Theorem 4.1. We then conclude that, without loss of generality, the conditions of Theorem 4.1 hold with \(\lambda =1\). We summarize our findings in the following corollary.

Corollary 4.1

Let \((\hat{x}, \hat{u})\) be the optimal solution to \((P_f)\). Suppose that assumptions (A1)–(A6) are satisfied. For \(i=1,\ldots , I\), set

$$\begin{aligned} I^{i}_b= \{ t \in [0,T]: ~ \psi ^{i}(t,\hat{x}(t) ) < 0 \}. \end{aligned}$$

There exist \( p\in BV([0,T],R^n)\), finite signed Radon measures \( \eta ^i\), null in \(I^{i}_b\), for \(i=1,\ldots , I\), and functions \( \xi ^{i}\in L^\infty ([0,T],R)\), \(i=1,\ldots , I \), with \( \xi ^{i}(t) \ge 0\) a.e. \(t\) and \(\xi ^{i}(t) = 0\) for \(t \in I^{i}_b, \) such that

a):

\(\dot{ \hat{x}}(t)=f(t,\hat{x}(t),\hat{u}(t))- \displaystyle \sum _{i=1}^{I}\xi ^i(t)\nabla _x \hat{\psi }^{i} (t),\)

b):

for any \(z\in C([0,T];R^n)\)

$$\begin{aligned}{} & {} \displaystyle \int _0^T \langle z(t),dp(t)\rangle = -\displaystyle \int _0^T \langle z(t), ( \nabla _x \hat{f}(t))^*p(t)\rangle \textrm{d}t \\{} & {} \quad \displaystyle + \sum _{i=1}^{I}\displaystyle \left( \int _0^T \xi ^{i}(t) \langle z(t), \nabla ^2_x\hat{\psi }^{i}(t) p(t)\rangle \hbox {d}t \right. \ +\displaystyle \left. \int _0^T \langle z(t), \nabla _x \hat{\psi }^{i}(t)\rangle \hbox {d}\eta ^i\right) , \end{aligned}$$

where \( \nabla _x \hat{f}(t) = \nabla _x f(t,\hat{x}(t),\hat{u}(t))\), \(\nabla _x \hat{\psi }^i(t)=\nabla _x \psi ^i(t,\hat{x}(t))\) and \(\nabla ^2_x \hat{\psi }^i(t)=\nabla ^2_x \psi ^i(t, \hat{x}(t)),\)

c):

\(\xi ^i(t)\langle \nabla _x \psi ^{i}(t,\hat{x}(t)),p(t)\rangle =0\) a.e. \(t\), for all \(i=1, \ldots , I\),

d):

for all \(i=1, \ldots , I\), the measures \(\langle \nabla _x\psi ^i(t,\hat{x}(t)),p(t)\rangle {d}\eta ^i(t)\) are nonnegative,

e):

\(\displaystyle \langle p(t), f(t,\hat{x}(t),u)\rangle \le \langle p(t), f(t,\hat{x}(t),\hat{u}(t))\rangle \) for all \(u \in U\), \(a.e.\, t\),

f):

\(\displaystyle \begin{array}{c}(p(0),-p(T))\in N_{C_0}(\hat{x}(0))\times \partial \phi (\hat{x}(T)).\end{array}\)

5 Example

Let us consider the following problem (Fig. 2)

$$\begin{aligned} \begin{array}{l} \text{ Minimize } \; -x(T)\\ \text{ over } \text{ processes } ((x,y,z),u) \text{ such } \text{ that } \\ \hspace{8mm} \begin{bmatrix} \dot{x}(t)\\ \dot{y}(t)\\ \dot{z}(t) \end{bmatrix} \in \begin{bmatrix} 0 &{} \quad \sigma &{} \quad 0\\ 0 &{} \quad 0 &{} \quad 0\\ 0 &{} \quad 0 &{} \quad 0 \end{bmatrix} \begin{bmatrix} x\\ y\\ z \end{bmatrix} +\begin{bmatrix} 0\\ u\\ 0 \end{bmatrix}-N_C(x,y,z),\\ u\in [-1,1],\\ (x,y,z)(0)=(x_0,y_0,z_0), \\ (x,y,z)(T)\in C_T, \end{array} \end{aligned}$$

where

  • \(0<\sigma \ll 1\),

  • \(C=\{ (x,y,z)\mid x^2+y^2+(z+h)^2\le 1,\; x^2+y^2+(z-h)^2\le 1\}\), where \(2h^2<1 \),

  • \((x_0,y_0,z_0)\in \textrm{int }C\), with \(x_0<-\delta \), \(y_0=0\) and \(z_0>0\),

  • \(C_T=\{ (x,y,z)\mid x\le 0,\; y\ge 0,\; \delta y-y_2x\le \delta y_2\}\cap C\), where

    $$\begin{aligned} \delta <\frac{y_2|x_0|}{y_1},~ \textrm{with } ~y_1=\sqrt{1-x_0^2-(z_0+h)^2} \text { and }y_2=\sqrt{1-h^2}. \end{aligned}$$

We choose \(T>0\) small but, nonetheless, large enough to guarantee that, when \(\sigma =0\), the system can reach the interior of \(C_T\) but not the segment \(\{ (x,0,0)\mid x\in [-\delta ,0]\}\). Since \(\sigma \) and T are small, the optimal trajectory must reach \(C_T\) at the face \(\delta y-y_2x=\delta y_2\) of \(C_T\).

To significantly increase the value of x(T), the optimal trajectory needs to stay on the boundary of C for some interval of time. Before reaching and after leaving the boundary of C, the optimal trajectory lies in the interior of C. Since \(\delta \) is small, the trajectory cannot reach \(C_T\) from any point of the sphere \(x^2+y^2+(z+h)^2=1\) with \(z>0\). This means that, while on the boundary of C, the trajectory should move on the sphere \(x^2+y^2+(z+h)^2=1\) until reaching the plane \(z=0\), and then move along the intersection of the two spheres.

While in the interior of C, the control can change sign from \(-1\) to 1 or from 1 to \(-1\). Certainly, the control should be 1 right before reaching the boundary and \(-1\) right before arriving at \(C_T\). Switching the control from 1 to \(-1\) or from \(-1\) to 1 before reaching the boundary wastes time and leads to smaller values of x(T). It then follows that the optimal control should be of the form

$$\begin{aligned} u(t)=\left\{ \begin{array}{cl} 1,&{} \quad t\in [0,\tilde{t}],\\ -1, &{} \quad t\in \ ]\tilde{t},T], \end{array} \right. \end{aligned}$$
(33)

for some value \(\tilde{t}\in ]0,T[\).
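To illustrate this discussion, the following is a small numerical sketch, not taken from the paper: it integrates the exponential-penalty approximation (2), specialized to the example's data, with a bang-bang control of the form (33). All numerical values (sigma, h, gamma, the switching time t_sw, the step size and horizon) are assumptions chosen purely for illustration.

```python
import math

# A numerical sketch (assumed values, not from the paper): explicit Euler
# integration of the exponential-penalty approximation (2) for the example,
# with C the intersection of the two balls and a bang-bang control (33).
sigma, h = 0.05, 0.5            # drift coupling and ball offset (2*h**2 < 1)
gamma = 50.0                    # penalty parameter gamma_k in (2)
dt, T, t_sw = 5e-4, 3.0, 2.0    # step size, horizon, switching time in (33)

def psi(s):
    """The two constraints whose 0-sublevel sets intersect to give C."""
    x, y, z = s
    return (x * x + y * y + (z + h) ** 2 - 1.0,
            x * x + y * y + (z - h) ** 2 - 1.0)

def rhs(s, u):
    """f(s, u) minus the exponential penalty term approximating N_C."""
    x, y, z = s
    p1, p2 = psi(s)
    w1 = gamma * math.exp(gamma * p1)    # weight of grad psi^1
    w2 = gamma * math.exp(gamma * p2)    # weight of grad psi^2
    return (sigma * y - 2.0 * x * (w1 + w2),
            u - 2.0 * y * (w1 + w2),
            -2.0 * ((z + h) * w1 + (z - h) * w2))

s = (-0.5, 0.0, 0.3)                     # interior starting point (z_0 > 0)
worst = max(psi(s))                      # largest constraint value observed
t = 0.0
while t < T:
    u = 1.0 if t <= t_sw else -1.0       # bang-bang control of the form (33)
    d = rhs(s, u)
    s = tuple(si + dt * di for si, di in zip(s, d))
    worst = max(worst, *psi(s))
    t += dt

# The penalty keeps the trajectory inside C up to an O(1/gamma) boundary layer.
print(f"max constraint value along trajectory: {worst:.4f}")
```

With these assumed values one can observe the state sliding down the first sphere toward the plane \(z=0\), matching the qualitative analysis above; larger \(\gamma \) tightens the boundary layer at the cost of a smaller stable step size.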

Fig. 2

The set C, consisting of the intersection of two balls, in thin solid line; the set \(C_T\) in dashed line; and the optimal trajectory in bold line

After the modification (8), the data of the problem satisfy the conditions under which Theorem 4.1 holds. We now show that the conclusions of Theorem 4.1 completely identify the structure (33) of the optimal control.

From Theorem 4.1, we deduce the existence of \( \lambda \ge 0\), \( p,~q,~r \in BV([0,T],R)\), finite signed Radon measures \( \eta _1\) and \(\eta _2\), null, respectively, in

$$\begin{aligned} I^{1}_b=\left\{ (x,y,z)\mid x^2+y^2+(z+h)^2-1<0\right\} \end{aligned}$$

and

$$\begin{aligned} I^{2}_b=\left\{ (x,y,z)\mid x^2+y^2+(z-h)^2-1<0\right\} , \end{aligned}$$

\( \xi _{i}\in L^\infty ([0,T],R)\), \(i=1,2 \), with \( \xi _{i}(t) \ge 0\) a.e. \(t\) and \(\xi _{i}(t) = 0\) for \( t \in I^{i}_b, \) such that

$$\begin{aligned}{} & {} \text {(i)} \begin{bmatrix} \dot{x}(t)\\ \dot{y}(t)\\ \dot{z}(t) \end{bmatrix} = \begin{bmatrix} 0 &{} \quad \sigma &{} \quad 0\\ 0 &{} \quad 0 &{} \quad 0\\ 0 &{} \quad 0 &{} \quad 0 \end{bmatrix} \begin{bmatrix} x\\ y\\ z \end{bmatrix} +\begin{bmatrix} 0\\ u\\ 0 \end{bmatrix}-2\xi _1\begin{bmatrix} x\\ y\\ z+h \end{bmatrix}-2\xi _2\begin{bmatrix} x\\ y\\ z-h \end{bmatrix} \\{} & {} \text {(ii)}\, d\begin{bmatrix} p\\ q\\ r \end{bmatrix} = \begin{bmatrix} 0 &{} \quad 0 &{} \quad 0\\ -\sigma &{} \quad 0 &{} \quad 0\\ 0 &{} \quad 0 &{} \quad 0 \end{bmatrix}\begin{bmatrix} p\\ q\\ r \end{bmatrix} \hbox {d}t \\{} & {} +2(\xi _1+\xi _2) \begin{bmatrix} p\\ q\\ r \end{bmatrix} \hbox {d}t +2\begin{bmatrix} x\\ y\\ z+h \end{bmatrix} \hbox {d}\eta _1 +2\begin{bmatrix} x\\ y\\ z-h \end{bmatrix}\hbox {d}\eta _2,\\{} & {} \text {(iii)} \begin{bmatrix} p\\ q\\ r \end{bmatrix}(T) = \begin{bmatrix} \lambda \\ 0\\ 0 \end{bmatrix}+\mu \begin{bmatrix} y_2\\ -\delta \\ 0 \end{bmatrix},\text { where } \mu \ge 0,\\{} & {} \text {(iv)}\, \xi _1(xp+yq+(z+h)r)=0,\; \xi _2(xp+yq+(z-h)r)=0,\\{} & {} \text {(v)}\, \text {the measures } (xp+yq+(z+h)r)\hbox {d}\eta _1\text { and }(xp+yq+(z-h)r)\hbox {d}\eta _2\\{} & {} \, \text { are nonnegative,} \\{} & {} \text {(vi)}\, \max _{u\in [-1,1]}uq=\hat{u}q, \end{aligned}$$

where \(\hat{u}\) is the optimal control.
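Condition (iii) can be read off from the transversality condition (g) of Theorem 4.1; the following sketch assumes, as argued above, that the terminal point lies on the face \(\delta y-y_2x=\delta y_2\) of \(C_T\), where only that constraint is active.

```latex
% phi(x,y,z) = -x gives lambda * subdifferential = {(-lambda, 0, 0)}; the active
% face g(x,y,z) = delta*y - y_2*x - delta*y_2 <= 0 has gradient (-y_2, delta, 0),
% so N_{C_T} at such a point is the cone generated by (-y_2, delta, 0).
\begin{aligned}
-(p,q,r)(T) &\in \{\mu(-y_2,\delta ,0)\mid \mu \ge 0\}+\{(-\lambda ,0,0)\},\\
(p,q,r)(T) &= (\lambda ,0,0)+\mu \,(y_2,-\delta ,0),\qquad \mu \ge 0.
\end{aligned}
```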

Let \(t_1\) be the instant of time when the trajectory reaches the sphere \(x^2+y^2+(z+h)^2=1\), \(t_2\) the instant when it reaches the intersection of the two spheres, and \(t_3\) the instant when it leaves the boundary of C. We have \(0<t_1<t_2<t_3<T\).

Next, we show that the multiplier q changes sign only once, thereby identifying the structure (33) of the optimal control uniquely. We start by looking at the case \(t=T\). We have

$$\begin{aligned} \left[ \begin{array}{c} p\\ q \end{array}\right] (T) = \left[ \begin{array}{c} \lambda \\ 0 \end{array}\right] +\mu \left[ \begin{array}{c} y_2\\ -\delta \ \end{array}\right] . \end{aligned}$$

Starting from \(t=T\), let us go backwards in time until the instant \(t_3\) when the trajectory leaves the boundary of C. If \(q(T)=0\), then \(p(T)=\lambda >0\) and we would have \(q(t)>0\) for \(t \in ]t_3,T[\) (see (ii) above), which is impossible. We then have \(p(T)>0\) and \(q(T)<0\) and, in \(]t_3,T[\), since \(\sigma \) is small, the vector (p(t), q(t)) does not change much. At \(t=t_3\), the vector (p,q) has a jump, and this jump can only occur along the vector \((x(t_3),y(t_3))\). Therefore, we have \(p(t_3-0)>0\) and \(q(t_3-0)<0\).

Let us now consider \(t\in ]t_2,t_3[\). We have the following:

1. when \( t\in [t_2,t_3]\), we have \(z=0\);

2. condition (i) implies that \(\xi _1=\xi _2=\xi \) with \(\xi >0\), since otherwise the motion along \(x^2+y^2=1-h^2\) would not be possible;

3. from \(0=\frac{\hbox {d}}{\hbox {d}t}(x^2+y^2)=2\sigma xy-8\xi x^2+2uy-8\xi y^2\), we get \(\xi =\frac{\sigma xy+uy}{4(1-h^2)}\);

4. condition (iv) implies that \(r=0\), leading to \(xp+yq=0\); since \(x<0\) and \(y>0\), \(q=0\) implies \(p=0\);

5. condition (ii) implies \(\hbox {d}\eta _1=\hbox {d}\eta _2=\hbox {d}\eta \);

6. \(0=d(xp+yq)=uq\,\hbox {d}t+4(1-h^2)\,\hbox {d}\eta \) \(\Rightarrow \) \(\frac{\hbox {d}\eta }{\hbox {d}t}=-\frac{uq}{4(1-h^2)}\);

7. from the above analysis, we deduce that

   $$\begin{aligned}{} & {} \dot{p}= \frac{\sigma xy+uy}{(1-h^2)}~p-\frac{xuq}{(1-h^2)},\\{} & {} \dot{q}=-\sigma p+\frac{\sigma xy}{(1-h^2)}~q. \end{aligned}$$

   Thus, \((p,q)\) is a solution to a linear system and can never be equal to zero. It follows that q cannot vanish, because \(q=0\) implies \(p=0\). Since \(q\ne 0\), we have \(q>0\).
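The nonvanishing claim above is the standard uniqueness argument for linear systems; writing \(v=(p,q)\) and \(v'=A(t)v\) with \(\Vert A(t)\Vert \le L\) on \([t_2,t_3]\), a Grönwall estimate gives the sketch below.

```latex
% Comparing |v| at two instants s, t in [t_2, t_3]:
\begin{aligned}
|v(t)|\ \le\ |v(s)|\,e^{L|t-s|},\qquad s,t\in [t_2,t_3],
\end{aligned}
```

so if \(v\) vanished at a single instant it would vanish identically on \([t_2,t_3]\), which is excluded since \((p(t_3-0),q(t_3-0))\ne (0,0)\).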

Let us consider the case when \(t=t_2\). We claim that

$$\begin{aligned} (p(t_2-0),q(t_2-0))\ne (0,0). \end{aligned}$$

Seeking a contradiction, assume that \((p(t_2-0),q(t_2-0))=(0,0)\). Then, we have

$$\begin{aligned} (p(t_2+0), q(t_2+0))=(0,0)+(2x(t_2), 2y(t_2))(\hbox {d}\eta _1+\hbox {d}\eta _2) \end{aligned}$$

and this jump has to be normal to \((x(t_2),y(t_2))\), since \(r(t_2+0)=0\) (see (iv)). It follows that \((x^2(t_2)+y^2(t_2))(\hbox {d}\eta _1+ \hbox {d}\eta _2)=0\) and, since \(x^2(t_2)+y^2(t_2)>0\), we get \(\hbox {d}\eta _1+ \hbox {d}\eta _2=0\), proving our claim.

We now consider \( t\in ]t_1,t_2[\). It is easy to see that \(\xi _2=0\) and \(\hbox {d}\eta _2=0\). We also deduce the following:

1. \(0=\frac{\hbox {d}}{\hbox {d}t}(x^2+y^2+(z+h)^2)=2\sigma xy+2uy-4\xi _1 y^2-4\xi _1 x^2-4\xi _1(z+h)^2\), which implies that \(\xi _1=\frac{\sigma xy+uy}{2}\);

2. also, \(0=d(xp+yq+(z+h)r)=uq\,\hbox {d}t+2\,\hbox {d}\eta _1\) implies that \(\frac{\hbox {d}\eta _1}{\hbox {d}t}=-\frac{uq}{2}\);

3. from the above, we deduce that

   $$\begin{aligned}{} & {} \dot{p}=(\sigma xy+uy)p-xuq,\\{} & {} \dot{q}=-\sigma p+\sigma xyq. \end{aligned}$$

   Thus, \((p,q)\) is a solution to a linear system and is never equal to zero. The second equation implies that if \(q=0\), then \(\dot{q}\ne 0\). Hence, \(q>0\).

Now, we need to consider \(t=t_1\). We claim that

$$\begin{aligned} (p(t_1-0),q(t_1-0),r(t_1-0))\ne (0,0,0). \end{aligned}$$

Let us then assume that \((p(t_1-0),q(t_1-0),r(t_1-0))=(0,0,0)\). It then follows that \((p(t_1+0),q(t_1+0),r(t_1+0))=(0,0,0)+(2x(t_1)\hbox {d}\eta _1, 2y(t_1)\hbox {d}\eta _1, 2(z(t_1)+h)\hbox {d}\eta _1)\). We now show that there is no such jump. Set \(r(t_1-0)=r_0\). Then, it follows from (iv) that \(x(t_1)\cdot 0+y(t_1)\cdot 0 +(z(t_1)+h)\,r_0=0\), which implies that \(r_0=0\). We also have \((x^2(t_1)+y^2(t_1)+(z(t_1)+h)^2)\hbox {d}\eta _1=0\) from (v). But this implies that \(\hbox {d}\eta _1=0\). Consequently, the multipliers do not exhibit a jump at \(t_1\).

From the previous analysis, we deduce that q should be positive almost everywhere on the boundary. It then follows that, to find the optimal solution, we have to analyze admissible trajectories corresponding to controls with the structure (33) and choose the optimal value of \(\tilde{t}\).

6 Conclusions

We proved necessary conditions for an optimal control problem involving sweeping processes with a nonsmooth sweeping set depending on time. The main feature of our work is the use of exponential penalization functions. We have successfully applied this approach in previous works on optimal control problems involving sweeping processes with a smooth set. In this work, to deal with the nonsmoothness of the sweeping set, we impose rather strong constraint qualifications. The weakening of these hypotheses will be the subject of future work.