1 Introduction

In recent years, there has been a surge of interest in optimal control problems involving the controlled sweeping process of the form

$$\begin{aligned} \dot{x}(t) \in f(t,x(t),u(t))- N_{C(t)}(x(t)), ~u(t)\in U, ~~x(0) \in C_0. \end{aligned}$$
(1)

In this respect, we refer to, for example, [3,4,5, 8,9,10, 15, 23] (see also the accompanying correction [11]) and [6, 13, 14]. Sweeping processes first appeared in the seminal paper [17] by J.J. Moreau as a mathematical framework for problems in plasticity and friction theory. They have proven useful for tackling problems in mechanics, engineering, economics and crowd motion, to name but a few; see [1, 5, 15, 16, 21]. In the last decades, systems of the form (1) have caught the attention and interest of the optimal control community. This interest stems not only from the range of applications but also from the remarkable challenge such systems raise in the derivation of necessary conditions, owing to the presence of the normal cone \(N_{C(t)}(x(t))\) in the dynamics. Indeed, the normal cone renders the right-hand side of the differential inclusion in (1) discontinuous, destroying a regularity property central to many known optimal control results.

Lately, there have been several successful attempts to derive necessary conditions for optimal control problems involving (1). Assuming that the set C is time independent, necessary conditions for optimal control problems with free end point have been derived under different assumptions and using different techniques. In [10], the set C has the form \(C=\{ x:~\psi (x)\le 0\}\), and the analysis relies on an approximating sequence of optimal control problems in which (1) is replaced by the differential equation

$$\begin{aligned} \dot{x}_{\gamma _k}(t)= f(t,x_{\gamma _k}(t),u(t))-\gamma _k e^{\gamma _k \psi (x_{\gamma _k}(t))}\nabla \psi (x_{\gamma _k}(t)), \end{aligned}$$
(2)

for some positive sequence \(\gamma _k\rightarrow +\infty \). Similar techniques are also applied to somewhat more general problems in [23]. A useful feature of those approximations is explored in [12] to define numerical schemes to solve such problems.
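To see how the exponential penalty in (2) behaves in practice, the following minimal sketch integrates the penalized dynamics with forward Euler on entirely hypothetical one-dimensional data: \(C=\{x:\psi (x)\le 0\}\) with \(\psi (x)=x\) (so \(C=\,]-\infty ,0]\)) and constant drift \(f\equiv 1\). The trajectory settles near the equilibrium \(x^*_\gamma =-\log (\gamma )/\gamma \), which approaches the boundary of C as \(\gamma \rightarrow +\infty \):

```python
import math

def sweep_penalty_euler(gamma, x0=-1.0, T=2.0, dt=1e-3):
    """Forward-Euler integration of the penalized dynamics (2) for the
    toy data psi(x) = x and constant drift f = 1, i.e.
        x' = 1 - gamma * exp(gamma * x).
    """
    x = x0
    for _ in range(int(round(T / dt))):
        x += dt * (1.0 - gamma * math.exp(gamma * x))
    return x

# The penalized trajectory settles near x* = -log(gamma)/gamma, which
# tends to the boundary point 0 of C as gamma grows.
for gamma in (10.0, 50.0, 200.0):
    print(gamma, sweep_penalty_euler(gamma), -math.log(gamma) / gamma)
```

Note that the penalty term makes the equation stiff (its linearization near the boundary grows like \(\gamma \)), so an explicit scheme needs step sizes of order \(1/\gamma \); this is one reason such approximations must be discretized with care.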

More recently, an adaptation of the family of approximating systems (2) is used in [13] to generalize the results in [10] to cover problems with additional end-point constraints and with a moving set of the form \(C(t)=\{ x:~\psi (t,x)\le 0\}\).

In this paper, we generalize the maximum principle proven in [13] to cover problems with possibly nonsmooth sets. Our problem of interest is

$$\begin{aligned} (P) \left\{ \begin{array}{l} \text{ Minimize } \; \phi (x(T))\\ \text{ over } \text{ processes } (x,u) \text{ such } \text{ that } \\ \hspace{8mm} \dot{x}(t) \in f(t,x(t),u(t))- N_{C(t)}(x(t)), \hspace{0.2cm}\text{ a.e. }\ \ t\in [0,T],\\ \hspace{8mm} u(t)\in U, \ \ \;\, \text{ a.e. }\ \ t\in [0,T],\\ \hspace{8mm} (x(0),x(T)) \in C_0\times C_T { ~\subset C(0)\times C(T)}, \end{array} \right. \end{aligned}$$

where \(T>0\) is fixed, \(\phi : R^n\rightarrow R\), \(f:[0,T]\times R^n\times R^m\rightarrow R^n\), \(U \subset R^m\) and

$$\begin{aligned} C(t):=\left\{ x\in R^n: ~\psi ^i(t,x)\le 0,\; i=1,\ldots , I\right\} \end{aligned}$$
(3)

for some functions \(\psi ^i:[0,T]\times R^n\rightarrow R\), \( i=1,\ldots , I\).

The case where \(I=1\) in (3) and \(\psi ^1\) is \(C^2\) is covered in [13]. Here, we assume \(I>1\) and that the functions \(\psi ^i\) are also \(C^2\). Although going from \(I=1\) to \(I>1\) in (3) may be seen as a small generalization, it demands a significant revision of the technical approach and, in addition, the introduction of a constraint qualification. This is because the set (3) may be nonsmooth. We focus on sets (3) satisfying a certain constraint qualification, introduced in assumption (A1) in Sect. 2; this is, indeed, a restriction on the nonsmoothness of (3). A similar problem with a nonsmooth moving set is considered in [14]. Our results cannot be obtained from those of [14] and do not generalize them.

This paper is organized in the following way. In Sect. 2, we introduce the main notation and we state and discuss the assumptions under which we work. In the same section, we also introduce the family of systems approximating \(\dot{x}(t) \in f(t,x(t),u(t))- N_{C(t)}(x(t))\) and establish a crucial convergence result, Theorem 2.1. In Sect. 3, we dwell on the approximating family of optimal control problems to (P) and we state the associated necessary conditions. The maximum principle for (P) is then deduced and stated in Theorem 4.1, covering, additionally, problems in the form of (P) where the end-point constraint \(x(T) \in C_T\) is absent. We present an illustrative example of our main result, Theorem 4.1, in Sect. 5. We end this paper with some brief conclusions.

2 Preliminaries

In this section, we introduce a summary of the notation and state the assumptions on the data of (P) enforced throughout. Furthermore, we extract information from the assumptions establishing relations crucial for the forthcoming analysis.

Notation

For a set \(S\subset R^n\), \(\partial S\), \( \text {cl}\,S\) and \( \text{ int }\, S\) denote the boundary, closure and interior of S.

If \(g: R^p\rightarrow R^q\), \(\nabla g\) represents the derivative and \(\nabla ^2g\) the second derivative. If \(g: R\times R^p\rightarrow R^q\), then \(\nabla _x g\) represents the derivative w.r.t. \(x\in R^p\) and \(\nabla ^2_xg\) the second derivative, while \(\partial _t g(t,x)\) represents the derivative w.r.t. \(t\in R\).

The Euclidean norm or the induced matrix norm on \( R^{p\times q}\) is denoted by \( |\cdot |\). We denote by \(B_n\) the closed unit ball in \( R^n\) centered at the origin. The inner product of x and y is denoted by \(\langle x, y\rangle \). For a set \(A\subset R^n\), \(d(x,A)\) denotes the distance between x and A. We denote the support function of A at z by \(S(z,A)=\sup \{\langle z,a\rangle \mid a\in A\}\).

The space \(L^{\infty }([a,b]; R^p)\) (or simply \(L^{\infty }\) when the domains are clearly understood) is the Lebesgue space of essentially bounded functions \(h:[a,b]\rightarrow R^p\). We say that \(h\in BV([a,b]; R^p)\) if h is a function of bounded variation. The space of continuous functions is denoted by \(C([a,b]; R^p)\).

Standard concepts from nonsmooth analysis will also be used. Those can be found in [7, 18] or [22], to name but a few. The Mordukhovich normal cone to a set S at \(s\in S\) is denoted by \(N_{S}(s)\) and \(\partial f(s)\) is the Mordukhovich subdifferential of f at s (also known as limiting subdifferential).

For any set \(A\subset R^n\), \( \text {cone}\, A\) is the cone generated by the set A.

We now turn to problem (P). We first state the definition of admissible processes for (P) and then we describe the assumptions under which we will derive our main results.

Definition 2.1

A pair \((x,u)\) is called an admissible process for (P) when x is an absolutely continuous function and u is a measurable function satisfying the constraints of (P).

Assumptions on the data of \(\mathbf {(P)}\)

  1. A1:

    The functions \(\psi ^i\), \( i=1,\ldots , I\), are \(C^2\). The graph of \(C(\cdot )\) is compact and contained in the interior of a ball \(rB_{n+1}\), for some \(r>0\). There exist constants \(\beta >0\), \(\eta >0\) and \(\rho \in ]0,1[\) such that

    $$\begin{aligned} \psi ^i(t,x) \in [ -\beta ,\beta ] \Longrightarrow |\nabla _x \psi ^i (t,x) | > \eta \; \; \textrm{for all }\; (t,x)\in [0,T]\times R^n, \end{aligned}$$
    (4)

    and, for \(I(t,x)=\{ i=1,\ldots , I\mid \psi ^i(t,x)\in ]-2\beta ,\beta ]\}\),

    $$\begin{aligned} \langle \nabla _x\psi ^i(t,x),\nabla _x\psi ^j(t,x)\rangle \ge 0,\;\; i,j\in I(t,x). \end{aligned}$$
    (5)

    Moreover, if \(i\in I(t,x)\), then

    $$\begin{aligned} \sum _{j\in I(t,x)\setminus \{i\}} \big | \langle \nabla _x\psi ^i(t,x),\nabla _x\psi ^j(t,x)\rangle \big | \le \rho |\nabla _x\psi ^i(t,x)|^2. \end{aligned}$$
    (6)

    Additionally,

    $$\begin{aligned} \psi ^i(t,x)\le -2\beta ~\Longrightarrow ~\nabla _x \psi ^i(t,x)=0~\text { for } i=1,\ldots , I. \end{aligned}$$
    (7)
  2. A2:

    The function f is continuous, and \(x\mapsto f(t,x,u)\) is continuously differentiable for all \((t,u)\in [0,T]\times R^m\). The constant \(M>0\) is such that \(|f(t,x,u)|\le M\) and \(|\nabla _x f(t,x,u)|\le M\) for all \((t,x,u)\in rB_{n+1}\times U\).

  3. A3:

    For each \((t,x)\), the set \(f(t,x,U)\) is convex.

  4. A4:

    The set U is compact.

  5. A5:

    The sets \(C_0\) and \(C_T\) are compact.

  6. A6:

    There exists a constant \(L_\phi \) such that \(|\phi (x)-\phi (x')|\le L_{\phi }|x-x'|\) for all \(x, x' \in R^n\).

Assumption (A1) concerns the functions \(\psi ^i\) defining the set C, and it plays a crucial role in the analysis. All \(\psi ^i\) are assumed to be smooth with gradients bounded away from the origin when \(\psi ^i\) takes values in a neighborhood of zero. Moreover, the boundary of C may be nonsmooth at the intersection points of the level sets \(\left\{ x: \psi ^i(t,x)=0\right\} \). However, nonsmoothness at those corner points is restricted by (5), which excludes the cases where the angle between two gradients of the functions defining the boundary of C is obtuse; see Fig. 1.

Fig. 1
figure 1

Examples of two different sets C. On the left side, a set that does not satisfy (5). On the right side, the set C is nonsmooth and fulfills (5)

On the other hand, (6) guarantees that the Gramian matrix of the gradients of the functions taking values near the boundary of C(t) is diagonally dominant and, hence, the gradients are linearly independent.
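To make (6) concrete: after dividing row i of the Gramian by \(|\nabla _x\psi ^i|^2\), it says exactly that the Gramian of the active gradients is strictly diagonally dominant (since \(\rho <1\)), and strictly diagonally dominant matrices are nonsingular by the Levy–Desplanques theorem, whence the linear independence. The following minimal sketch checks this on hypothetical gradient data in \(R^3\):

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def gramian(vs):
    # Gramian matrix of a family of vectors: G[i][j] = <v_i, v_j>
    return [[dot(u, v) for v in vs] for u in vs]

def satisfies_cq(G, rho):
    # condition (6): off-diagonal row sums bounded by rho * diagonal entry
    n = len(G)
    return all(sum(abs(G[i][j]) for j in range(n) if j != i) <= rho * G[i][i]
               for i in range(n))

def det3(G):
    # determinant of a 3x3 matrix by cofactor expansion
    (a, b, c), (d, e, f), (g, h, i) = G
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

# hypothetical gradients of three active constraints at a corner point
grads = [(1.0, 0.0, 0.2), (0.0, 1.0, 0.2), (0.05, 0.05, 1.0)]
G = gramian(grads)
print(satisfies_cq(G, rho=0.6), det3(G))  # dominance holds, determinant nonzero
```

A nonzero Gramian determinant confirms that the three gradients are linearly independent, as claimed.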

In many situations, as in the example we present in the last section, we can guarantee the fulfillment of (A1), in particular (7), by replacing the function \(\psi ^i\) by

$$\begin{aligned} \tilde{\psi }^i(t,x)=h\circ \psi ^i(t,x), \end{aligned}$$
(8)

where

$$\begin{aligned} h(z)=\left\{ \begin{array}{lcl} z &{} \quad \text { if} &{} z>-\beta , \\ h_s (z) &{} \quad \text { if} &{}-2\beta \le z\le -\beta ,\\ -2\beta &{} \quad \text { if} &{} z<-2\beta , \end{array}\right. \end{aligned}$$

Here, h is a \(C^2\) function and \(h_s\) is an increasing function defined on \([-2\beta ,-\beta ]\). For example, \(h_s\) may be a cubic polynomial with positive derivative on the interval \(]-2\beta ,-\beta [\). For all \(t\in [0,T]\), set

$$\begin{aligned} \tilde{C}(t): =\left\{ x\in R^n:~\tilde{\psi }^i(t,x )\le 0,~i=1, \ldots ,I\right\} . \end{aligned}$$

It is then a simple matter to see that

$$\begin{aligned} C(t)=\tilde{C}(t) \text { for all } t \in [0,T], \end{aligned}$$

and that the functions \(\tilde{\psi }^i(\cdot )\) satisfy the assumption (A1).
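As a concrete instance of the smoothing (8), the sketch below builds, for the hypothetical value \(\beta =1\), a cubic \(h_s\) matching the values and first derivatives of the two outer pieces at \(-\beta \) and \(-2\beta \); matching second derivatives as well, as needed for full \(C^2\) regularity of h, can be done analogously with a higher-degree polynomial:

```python
def h_s(z, beta=1.0):
    # Cubic Hermite piece on [-2*beta, -beta] matching the outer pieces:
    # h_s(-2b) = -2b, h_s'(-2b) = 0, h_s(-b) = -b, h_s'(-b) = 1.
    t = (z + 2.0 * beta) / beta  # rescaled variable in [0, 1]
    return beta * (-2.0 + 2.0 * t ** 2 - t ** 3)

def h(z, beta=1.0):
    # The truncation (8): identity near 0, constant -2*beta far inside C
    if z > -beta:
        return z
    if z >= -2.0 * beta:
        return h_s(z, beta)
    return -2.0 * beta

# h is increasing, equals z on ]-beta, +inf[ and -2*beta on ]-inf, -2*beta[
print(h(-2.5), h(-1.5), h(0.3))
```

Since \(h_s'\) takes the form \(4t-3t^2=t(4-3t)>0\) on \(]0,1]\), the middle piece is indeed increasing, so \(\tilde{\psi }^i=h\circ \psi ^i\) has the same zero sublevel set as \(\psi ^i\).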

The assumption that the graph of \(C(\cdot )\) is compact and contained in the interior of a ball is introduced to avoid technicalities in our forthcoming analysis. In applied problems, this may easily be sidestepped by considering the intersection of the graph of \(C(\cdot )\) with a tube around the optimal trajectory.

Condition (A1) implies the conditions of Theorem 3.1 in [2], and so our set C(t) is uniformly prox-regular.

We now proceed by introducing a family of controlled systems approximating (1). Let \(x(\cdot )\) be a solution to the differential inclusion

$$\begin{aligned} \dot{x}(t) \in f(t,x(t),U)- N_{C(t)}(x(t)). \end{aligned}$$

Under our assumptions, measurable selection theorems assert the existence of measurable functions u and \(\xi ^i\) such that \(u(t) \in U\), \(\xi ^i(t)\ge 0\) a.e. \(t\in [0,T]\), \(\xi ^i(t)=0\) if \(\psi ^i(t,x(t))<0\), and

$$\begin{aligned} \dot{x}(t)= f(t,x(t),u(t))-\sum _{i=1}^I\xi ^i(t)\nabla _x\psi ^i(t,x(t))\; \mathrm{a.e.}\; t\in [0,T]. \end{aligned}$$

Considering the trajectory x, some observations are called for. Let \(\mu \) be such that

$$\begin{aligned} \max \left\{ (|\nabla _x\psi ^i(t,x)||f(t,x,u)|+|\partial _t\psi ^i(t,x)|)+1:~t\in [0,T],\; u\in U,\; x\in C(t)+B_n, \; i=1,\ldots ,I\right\} \le \mu . \end{aligned}$$

The properties of the graph of \(C(\cdot )\) in (A1) guarantee the existence of such maximum.

Consider now some t such that, for some \(j\in \{1, \ldots , I\}\), \(\psi ^j (t,x(t))=0\) and \(\dot{x}(t)\) exists. Since the trajectory x always remains in C, we have (see (5))

$$\begin{aligned} \begin{aligned} 0&=\frac{\hbox {d}}{\hbox {d}t}\psi ^j(t,x(t))=\langle \nabla _x\psi ^j(t,x(t)),\dot{x}(t)\rangle +\partial _t\psi ^j(t,x(t)) \\&=\langle \nabla _x\psi ^j(t,x(t)),f(t,x(t),u(t))\rangle -\xi ^j(t)| \nabla _x\psi ^j(t,x(t))|^2 \\&\quad -\sum _{i\in I(t,x(t))\setminus \{ j\}}\xi ^i(t) \langle \nabla _x\psi ^i(t,x(t)),\nabla _x\psi ^j(t,x(t))\rangle +\partial _t\psi ^j(t,x(t)) \\&\quad \le \langle \nabla _x\psi ^j(t,x(t)),f(t,x(t),u(t))\rangle -\xi ^j(t)| \nabla _x\psi ^j(t,x(t))|^2 +\partial _t\psi ^j(t,x(t)), \end{aligned} \end{aligned}$$

and hence (see (4)),

$$\begin{aligned} \xi ^j(t)\le \frac{1}{| \nabla _x\psi ^j(t,x(t))|^2}(\langle \nabla _x\psi ^j(t,x(t)),f(t,x(t),u(t))\rangle +\partial _t\psi ^j(t,x(t))) \le \frac{\mu }{\eta ^2}. \end{aligned}$$

Define the function

$$\begin{aligned} \mu (\gamma )=\frac{1}{\gamma }\log \left( \frac{\mu }{\eta ^2\gamma }\right) , \quad \gamma >0, \end{aligned}$$

consider a sequence \(\{\sigma _k\}\) such that \(\sigma _k\downarrow 0\) and choose another sequence \(\{\gamma _k\}\) with \(\gamma _k\uparrow +\infty \) and

$$\begin{aligned} C(t)\subset \textrm{int}\, C^k(t)= \textrm{int}\, \left\{ x: \psi ^i(t,x)-\sigma _k\le \mu _k,\; i=1,\ldots ,I\right\} , \end{aligned}$$

where

$$\begin{aligned} \mu _k=\mu (\gamma _k). \end{aligned}$$
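The function \(\mu (\gamma )\) is chosen exactly so that the penalty coefficient on the level set \(\psi ^i-\sigma _k=\mu _k\) equals the a priori bound \(\mu /\eta ^2\) on the multipliers: indeed, \(\gamma e^{\gamma \mu (\gamma )}=\gamma \cdot \mu /(\eta ^2\gamma )=\mu /\eta ^2\) for every \(\gamma >0\). A quick numerical check of this identity, with placeholder values for the constants \(\mu \) and \(\eta \):

```python
import math

def mu_of_gamma(gamma, mu=4.0, eta=0.5):
    # mu and eta play the roles of the constants above; values are placeholders
    return math.log(mu / (eta ** 2 * gamma)) / gamma

for gamma in (10.0, 100.0, 1000.0):
    lhs = gamma * math.exp(gamma * mu_of_gamma(gamma))
    print(gamma, lhs)  # lhs equals mu/eta**2 = 16 for every gamma
```

Note also that \(\mu (\gamma )\rightarrow 0\) as \(\gamma \rightarrow +\infty \), so the inflated sets \(C^k(t)\) shrink toward C(t).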

Let \(x_k\) be a solution to the differential equation

$$\begin{aligned} \dot{x}_k(t)=f(t,x_k(t),u_k(t))-\sum _{i=1}^I\gamma _k e^{\gamma _k(\psi ^i(t,x_k(t))-\sigma _k)}\nabla _x\psi ^i(t,x_k(t)) \end{aligned}$$
(9)

for some \(u_k(t) \in U\) a.e. \(t\in [0,T]\). Take any \(t\in [0,T]\) such that \(\dot{x}_k(t)\) exists and \(\psi ^j (t,x_k(t))-\sigma _k=\mu _k\). Assume that \(|\psi ^j(t,x_k(t))|\le \beta \) and \(\psi ^i(t,x_k(t)) \le \beta \) for all \(i\).

Then, whenever \(\gamma _k\) is sufficiently large, we have (see (5) and (7))

$$\begin{aligned} \begin{aligned} \frac{\hbox {d}}{\hbox {d}t}\psi ^j(t,x_k(t))&=\langle \nabla _x\psi ^j(t,x_k(t)),f(t,x_k(t),u_k(t))\rangle \\&\quad -\gamma _ke^{\gamma _k(\psi ^j(t,x_k(t))-\sigma _k)}|\nabla _x\psi ^j(t,x_k(t))|^2 \\&\quad -\sum _{i\in I(t,x_k(t))\setminus \{ j\}}\hspace{-2em} \gamma _ke^{\gamma _k(\psi ^i(t,x_k(t))-\sigma _k)}\langle \nabla _x\psi ^i(t,x_k(t)),\nabla _x\psi ^j(t,x_k(t))\rangle \\&\quad -\sum _{i\not \in I(t,x_k(t))}\hspace{-1em} \gamma _ke^{\gamma _k(\psi ^i(t,x_k(t))-\sigma _k)}\langle \nabla _x\psi ^i(t,x_k(t)),\nabla _x\psi ^j(t,x_k(t))\rangle \\&\quad + \partial _t\psi ^j(t,x_k(t))\\&\quad \le \langle \nabla _x\psi ^j(t,x_k(t)),f(t,x_k(t),u_k(t))\rangle + \partial _t\psi ^j(t,x_k(t))\\&\quad -\gamma _ke^{\gamma _k(\psi ^j(t,x_k(t))-\sigma _k)}|\nabla _x\psi ^j(t,x_k(t))|^2 \\&\quad \le \mu -1 -\eta ^2\gamma _k e^{\gamma _k\mu _k}\\&=-1. \end{aligned} \end{aligned}$$

In the last inequality, we have used the definition of \(\mu \).

Thus, if \(x_k(0)\in C^k(0)\), we have \(x_k(t)\in C^k(t)\), for all \(t\in [0,T]\), and

$$\begin{aligned} \gamma _ke^{\gamma _k(\psi ^j(t,x_k(t))-\sigma _k)}\le \gamma _ke^{\gamma _k\mu _k}= \frac{\mu }{\eta ^2}. \end{aligned}$$
(10)

It follows that, for k sufficiently large, we have

$$\begin{aligned} |\dot{x}_k(t)|\le (\textrm{const}). \end{aligned}$$

We remark that the inclusion \(x_k(t)\in C^k(t)\) is a direct consequence of Theorem 3 in [20].

We are now in a position to state and prove our first result, Theorem 2.1. This is in the vein of Theorem 4.1 in [23] (see also Lemma 1 in [10] when \(\psi \) is independent of t and convex), deviating from it insofar as the approximating sequence of control systems (9) differs from the one introduced in [10]. The proof of Theorem 2.1 relies on (10).

Theorem 2.1

Let \(\{(x_k,u_k)\}\), with \(u_k(t)\in U\) a.e., be a sequence of solutions of Cauchy problems

$$\begin{aligned} \begin{array}{rcl} \dot{x}_k(t) &{} = &{} f(t,x_k(t),u_k(t))-\displaystyle \sum _{i=1}^I\gamma _k e^{\gamma _k(\psi ^i(t,x_k(t))-\sigma _k)}\nabla _x\psi ^i(t,x_k(t)),\\ x_k(0) &{} = &{} b_k\in C^k(0). \end{array} \end{aligned}$$
(11)

If \(b_k\rightarrow x_0\), then there exists a subsequence \(\{x_k\}\) (we do not relabel) converging uniformly to x, a unique solution to the Cauchy problem

$$\begin{aligned} \dot{x}(t)\in f(t,x(t),u(t))-N_{C(t)}(x(t)),\;\;\; x(0)=x_0, \end{aligned}$$
(12)

where u is a measurable function such that \(u(t)\in U\) a.e. \(t\in [0,T]\).

If, moreover, all the controls \(u_k\) are equal, i.e., \(u_k=u\), then the subsequence converges to the unique solution of (12); that is, any solution of

$$\begin{aligned} \dot{x}(t)\in f(t,x(t),U)-N_{C(t)}(x(t)),\;\;\; x(0)=x_0\in C(0) \end{aligned}$$
(13)

can be approximated by solutions of (11).

Proof

Consider the sequence \(\{x_k\}\), where \((x_k,u_k)\) solves (11). Recall that \(x_k(t)\in C^k(t)\) for all \(t\in [0,T]\), and

$$\begin{aligned} |\dot{x}_k(t)|\le (\textrm{const})\;\;\;\textrm{and}\;\;\; \xi _k^i(t)=\gamma _k e^{\gamma _k(\psi ^i(t,x_k(t))-\sigma _k)}\le (\textrm{const}). \end{aligned}$$
(14)

Then, there exist subsequences (we do not relabel) weakly-\(*\) converging in \(L^{\infty }\) to some v and \(\xi ^i\). Hence,

$$\begin{aligned} x_{k}(t)=x_0+ \int _0^t \dot{x}_{k}(s) \hbox {d}s \longrightarrow x(t)=x_0+ \int _0^t v(s)\hbox {d}s, ~\forall ~t\in [0,T], \end{aligned}$$

for an absolutely continuous function x. Obviously, \(x(t)\in C(t)\) for all \(t\in [0,T]\). Considering the sequence \(\{x_k\}\), recall that

$$\begin{aligned} \dot{x}_k(t)\in f(t,x_k(t),U)-\sum _{i=1}^I\xi _k^i(t)\nabla _x \psi ^i(t,x_k(t)). \end{aligned}$$
(15)

Inclusion (15) is equivalent to

$$\begin{aligned} \langle z, \dot{x}_k(t)\rangle \le S(z,f(t,x_k(t),U))-\sum _{i=1}^I\xi _k^i(t)\langle z, \nabla _x\psi ^i(t,x_k(t))\rangle ,\;\;\;\forall \, z\in R^n. \end{aligned}$$

Integrating this inequality, we get

$$\begin{aligned} \left\langle z,\frac{x_k(t+\tau )-x_k(t)}{\tau }\right\rangle&\le \frac{1}{\tau }\int _t^{t+\tau }\left( S(z,f(s,x_k(s),U))-\sum _{i=1}^I\xi _k^i(s)\langle z, \nabla _x\psi ^i(s,x_k(s))\rangle \right) \hbox {d}s\\&=\frac{1}{\tau }\int _t^{t+\tau }\left( S(z,f(s,x_k(s),U))-\sum _{i=1}^I\xi _k^i(s)\langle z, \nabla _x\psi ^i(s,x(s))\rangle \right. \\&\qquad + \left. \sum _{i=1}^I\xi _k^i(s)\langle z, \nabla _x\psi ^i(s,x(s))- \nabla _x\psi ^i(s,x_k(s))\rangle \right) \hbox {d}s. \end{aligned}$$
(16)

Passing to the limit as \(k\rightarrow \infty \), we obtain

$$\begin{aligned} \left\langle z,\frac{x(t+\tau )-x(t)}{\tau }\right\rangle \le \frac{1}{\tau }\int _t^{t+\tau }\left( S(z,f(s,x(s),U))-\sum _{i=1}^I\xi ^i(s)\langle z, \nabla _x\psi ^i(s,x(s))\rangle \right) \hbox {d}s. \end{aligned}$$
(17)

Let \(t\in [0,T]\) be a Lebesgue point of \(\dot{x}\) and \(\xi \). Passing to the limit as \(\tau \downarrow 0\) in the last inequality, we obtain

$$\begin{aligned} \langle z,\dot{x}(t)\rangle \le S(z,f(t,x(t),U))-\sum _{i=1}^I\xi ^i(t)\langle z, \nabla _x\psi ^i(t,x(t))\rangle . \end{aligned}$$

Since \(z\in R^n\) is an arbitrary vector and the set f(tx(t), U) is convex, we conclude that

$$\begin{aligned} \dot{x}(t)\in f(t,x(t),U)-\sum _{i=1}^I\xi ^i(t)\nabla _x\psi ^i(t,x(t)). \end{aligned}$$

By the Filippov lemma, there exists a measurable control \(u(t)\in U\) such that

$$\begin{aligned} \dot{x}(t)= f(t,x(t),u(t))-\sum _{i=1}^I\xi ^i(t)\nabla _x\psi ^i(t,x(t)). \end{aligned}$$

Furthermore, observe that \(\xi ^i\) is zero if \(\psi ^i(t,x(t))<0\). If \(u_k=u\) for all k, for some measurable u with \(u(t)\in U\) a.e., then the sequence \(x_k\) converges to the solution of

$$\begin{aligned} \dot{x}(t)= f(t,x(t),u(t))-\sum _{i=1}^I\xi ^i(t)\nabla _x\psi ^i(t,x(t)). \end{aligned}$$

Indeed, to see this, it suffices to pass to the limit as \(k\rightarrow \infty \) and then as \(\tau \downarrow 0\), in the equality

$$\begin{aligned} \frac{x_k(t+\tau )-x_k(t)}{\tau }= \frac{1}{\tau }\int _t^{t+\tau }\left( f(s,x_k(s),u(s))-\sum _{i=1}^I\xi _k^i(s) \nabla _x\psi ^i(s,x_k(s))\right) \hbox {d}s. \end{aligned}$$

Recall that the set C(t) is uniformly prox-regular. The proof of uniqueness of solutions for general sweeping processes with prox-regular sets can be found in [19]; it holds under the requirement that the moving set is Lipschitz continuous with respect to time. Although we do not assume the Lipschitz dependence directly, under our assumptions we can appeal to the implicit function theorem to show that C(t) is locally Lipschitz. However, for our special case, a simple alternative proof is possible, which we present next for the convenience of the reader. The proof is in the vein of that of Theorem 4.1 in [23]. Suppose that there exist two different solutions of (12): \(x_1\) and \(x_2\). We have

$$\begin{aligned} \frac{1}{2}\frac{\hbox {d}}{\hbox {d}t}|x_1(t)-x_2(t)|^2&=\langle x_1(t)-x_2(t),\dot{x}_1(t)-\dot{x}_2(t)\rangle \nonumber \\&=\langle x_1(t)-x_2(t),f(t,x_1(t),u(t))-f(t,x_2(t),u(t))\rangle \nonumber \\&\quad -\left\langle x_1(t)-x_2(t),\sum _{i=1}^I\xi _1^i(t)\nabla _x\psi ^i(t,x_1(t))-\sum _{i=1}^I\xi _2^i(t)\nabla _x\psi ^i(t,x_2(t))\right\rangle . \end{aligned}$$
(18)

If, for all i, \(\psi ^i(t,x_1(t))<0\) and \(\psi ^i(t,x_2(t))<0\), then \(\xi _1^i(t)=\xi _2^i(t)=0\), and we obtain

$$\begin{aligned} \frac{1}{2}\frac{\hbox {d}}{\hbox {d}t}|x_1(t)-x_2(t)|^2\le L_f|x_1(t)-x_2(t)|^2. \end{aligned}$$

Suppose that \(\psi ^j(t,x_1(t))=0\). Then, by the Taylor formula we get

$$\begin{aligned} \psi ^j(t,x_2(t))&=\psi ^j(t,x_1(t))+\langle \nabla _x\psi ^j(t,x_1(t)),x_2(t)-x_1(t)\rangle \nonumber \\&\quad +\frac{1}{2}\langle x_2(t)-x_1(t), \nabla _x^2\psi ^j(t,\theta x_2(t)+(1-\theta )x_1(t))( x_2(t)-x_1(t))\rangle , \end{aligned}$$
(19)

where \(\theta \in [0,1]\). Since \(\psi ^j(t,x_2(t))\le 0\), we have

$$\begin{aligned} \langle \nabla _x\psi ^j(t,x_1(t)),x_2(t)-x_1(t)\rangle&\le - \frac{1}{2}\langle x_2(t)-x_1(t), \nabla _x^2\psi ^j(t,\theta x_2(t)+(1-\theta )x_1(t))( x_2(t)-x_1(t))\rangle \nonumber \\&\le (\textrm{const}) |x_1(t)-x_2(t)|^2. \end{aligned}$$
(20)

Now, if \(\psi ^j(t,x_2(t))=0\), we deduce in the same way that

$$\begin{aligned} \langle \nabla _x\psi ^j(t,x_2(t)),x_1(t)-x_2(t)\rangle \le (\textrm{const})|x_1(t)-x_2(t)|^2. \end{aligned}$$

Thus, we have

$$\begin{aligned} \frac{1}{2}\frac{\hbox {d}}{\hbox {d}t}|x_1(t)-x_2(t)|^2\le (\textrm{const})|x_1(t)-x_2(t)|^2. \end{aligned}$$

Hence, since \(x_1(0)=x_2(0)=x_0\), Gronwall's lemma yields \(|x_1(t)-x_2(t)|=0\) for all \(t\in [0,T]\). \(\square \)

3 Approximating Family of Optimal Control Problems

In this section, we define an approximating family of optimal control problems to (P) and we state the corresponding necessary conditions.

Let \((\hat{x},\hat{u})\) be a global solution to (P) and consider sequences \(\{\gamma _k\}\) and \(\{\sigma _k\}\) as defined above. Let \(\hat{x}_k(\cdot )\) be the solution to

$$\begin{aligned} \begin{array}{rcl} \dot{x}(t) &{} = &{} f(t,x(t),\hat{u}(t))-\displaystyle \sum _{i=1}^I\gamma _k e^{\gamma _k(\psi ^i(t,x(t))-\sigma _k)}\nabla _x\psi ^i(t,x(t)),\\ x(0) &{} = &{} \hat{x}(0). \end{array} \end{aligned}$$
(21)

Set \(\epsilon _k=|\hat{x}_k(T)-\hat{x}(T)|\). It follows from Theorem 2.1 that \(\epsilon _k\downarrow 0\). Take \(\alpha >0\) and define the problem

$$\begin{aligned} (P_k^\alpha ) \left\{ \begin{array}{l} \text{ Minimize } \; \phi (x(T))+|x(0)-\hat{x}(0)|^2+\alpha \displaystyle \int _0^T|u(t)-\hat{u}(t)|\hbox {d}t\\ \text{ over } \text{ processes } (x,u) \text{ such } \text{ that } \\ \dot{x}(t) = f(t,x(t),u(t))-\displaystyle \sum _{i=1}^I\nabla _x e^{\gamma _k(\psi ^i(t,x(t))-\sigma _k)} \hspace{0.2cm}\text{ a.e. }\ \ t\in [0,T],\\ u(t)\in U~~~~\text{ a.e. }\ \ t\in [0,T],\\ x(0)\in C_0,~~ x(T)\in C_T+\epsilon _k B_n, \end{array} \right. \end{aligned}$$

Clearly, the problem \((P_k^\alpha )\) has admissible solutions. Consider the space

$$\begin{aligned} W=\{ (c,u)\mid c\in C_0,\; u\in L^{\infty }\; \textrm{with }\; u(t)\in U\} \end{aligned}$$

and the distance

$$\begin{aligned} d_{W}((c_1,u_1),(c_2,u_2))=|c_1-c_2|+\int _0^T|u_1(t)-u_2(t)|\hbox {d}t. \end{aligned}$$

Endowed with \(d_{W}\), W is a complete metric space. Take any \((c,u)\in W\) and a solution y to the Cauchy problem

$$\begin{aligned} \begin{array}{rcl} \dot{y}(t) &{} = &{} f(t,y(t),u(t))-\displaystyle \sum _{i=1}^I\nabla _x e^{\gamma _k(\psi ^i(t,y(t))-\sigma _k)} \hspace{0.2cm}\text{ a.e. }\ \ t\in [0,T],\\ y(0) &{} = &{} c. \end{array} \end{aligned}$$

Under our assumptions, the function

$$\begin{aligned} (c,u)~\rightarrow ~ \phi (y(T))+ | c - \hat{x}(0) |^2+\alpha \int _{0}^{T} | u-\hat{u}|~\hbox {d}t \end{aligned}$$

is continuous on \((W,d_{W})\) and bounded below. Appealing to Ekeland’s theorem, we deduce the existence of a pair \((x_k,u_k)\) solving the following problem

$$\begin{aligned} (AP_k) \left\{ \begin{array}{l} \text{ Minimize } \; \Phi (x,{u})= \phi (x(T))+|x(0)-\hat{x}(0)|^2+\alpha \displaystyle \int _0^T|u(t)-\hat{u}(t)|\hbox {d}t\\ \qquad \qquad +\epsilon _k\left( |x(0)-x_k(0)|+ \displaystyle \int _0^T|u(t)-u_k(t)|\hbox {d}t\right) ,\\ \text{ over } \text{ processes } (x,u) \text{ such } \text{ that } \\ \dot{x}(t) = f(t,x(t),u(t))-\displaystyle \sum _{i=1}^I\nabla _x e^{\gamma _k(\psi ^i(t,x(t))-\sigma _k)} \hspace{0.2cm}\text{ a.e. }\ \ t\in [0,T],\\ u(t)\in U~~~~\text{ a.e. }\ \ t\in [0,T],\\ x(0)\in C_0,~~ x(T)\in C_T+\epsilon _k B_n, \end{array} \right. \end{aligned}$$

Lemma 3.1

Take \(\gamma _k\rightarrow \infty \), \(\sigma _k\rightarrow 0\) and \(\epsilon _k \rightarrow 0\) as defined above. For each k, let \((x_k,u_k)\) be the solution to \((AP_k)\). Then, there exists a subsequence (we do not relabel) such that

$$\begin{aligned} u_k(t)\rightarrow \hat{u}(t)~ \text {a.e.}, \quad x_k\rightarrow \hat{x}\; \text { uniformly in }\; [0,T]. \end{aligned}$$

Proof

We deduce from Theorem 2.1 that \(\{x_k\}\) converges uniformly to an admissible solution \(\tilde{x}\) of (P). Since U and \(C_0\) are compact, we have \(U\subset KB_m\) and \(C_0\subset KB_n\) for some \(K>0\). Without loss of generality, \(u_k\) weakly-\(*\) converges to a function \(\tilde{u}\in L^{\infty }([0,T];U)\). Hence, it weakly converges to \(\tilde{u}\) in \(L^1\). From the optimality of the processes \((x_k,u_k)\), we have

$$\begin{aligned}&\phi (x_k(T))+|x_k(0)-\hat{x}(0)|^2+\alpha \int _0^T|u_k(t)-\hat{u}(t)|\hbox {d}t\\&\quad \le \phi (\hat{x}_k(T))+\epsilon _k\left( |\hat{x}_k(0)-x_k(0)|+\int _0^T|u_k(t)-\hat{u}(t)|\hbox {d}t\right) \\&\quad \le \phi (\hat{x}_k(T))+2K(1+T)\epsilon _k. \end{aligned}$$

Since \((\hat{x},\hat{u})\) is a global solution of the problem, passing to the limit, we get

$$\begin{aligned}&\phi (\tilde{x}(T))+|\tilde{x}(0)-\hat{x}(0)|^2+\alpha \int _0^T|\tilde{u}(t)-\hat{u}(t)|\hbox {d}t\\&\quad \le \lim _{k\rightarrow \infty }\left( \phi (x_k(T))+|x_k(0)-\hat{x}(0)|^2\right) + \alpha \liminf _{k\rightarrow \infty } \int _0^T|u_k(t)-\hat{u}(t)|\hbox {d}t\\&\quad \le \lim _{k\rightarrow \infty }\phi (\hat{x}_k(T))=\phi (\hat{x}(T))\le \phi (\tilde{x}(T)). \end{aligned}$$

Hence, \(\tilde{x}(0)=\hat{x}(0)\) and \(\tilde{u}=\hat{u}\) a.e.; thus \(u_k\) converges to \(\hat{u}\) in \(L^1\), and some subsequence (we do not relabel) converges to \(\hat{u}\) almost everywhere. \(\square \)

We now finish this section with the statement of the necessary conditions of optimality for the family of problems \((AP_k)\). These can be seen as a direct consequence of Theorem 6.2.1 in [22].

Proposition 3.1

For each k, let \((x_k,u_k)\) be a solution to \((AP_k)\). Then, there exist absolutely continuous functions \(p_k\) and scalars \(\lambda _k\ge 0\) such that

(a):

(nontriviality condition)

$$\begin{aligned} \lambda _k+|p_{k}(T)| =1, \end{aligned}$$
(22)
(b):

(adjoint equation)

$$\begin{aligned} \begin{array}{c}\dot{p}_{k} =-(\nabla _x f_{k})^* p_{k} +\sum _{i=1}^I\gamma _k e^{\gamma _k (\psi _{k}^i-\sigma _k)}\nabla ^2_x\psi _{k}^ip_{k}\\ +\sum _{i=1}^I\gamma _k^2e^{\gamma _k (\psi _{k}^i-\sigma _k)}\nabla _x\psi _{k}^i\langle \nabla _x\psi _{k}^i,p_{k}\rangle , \end{array} \end{aligned}$$
(23)

where the superscript \(*\) stands for transpose,

(c):

(maximization condition)

$$\begin{aligned} \max _{u\in U}\left\{ \langle f(t,x_{k}, u) , p_{k} \rangle - \alpha \lambda _k|u-\hat{u}| -\epsilon _k \lambda _k|u-u_k|\right\} \end{aligned}$$
(24)

is attained at \(u_k (t) \), for almost every \(t\in [0,T]\),

(d):

(transversality condition)

$$\begin{aligned} ( p_{k}(0), - p_{k}(T)) \in \lambda _k\left( 2(x_k(0)-\hat{x}(0))+\epsilon _k B_n, \partial \phi (x_{k}(T))\right) \nonumber \\ + N_{C_0}(x_{k}(0))\times N_{C_T+\epsilon _kB_n}(x_{k}(T)). \hspace{1cm} \end{aligned}$$
(25)

To simplify the notation above, we drop the t dependence in \(p_k\), \(\dot{p}_k\), \(x_k\), \(u_k\), \(\hat{x}\) and \(\hat{u}\). Moreover, in (b), we write \(\psi _{k}^i\) instead of \(\psi ^i(t,x_k(t))\) and \(f_k\) instead of \(f(t,x_k(t),u_k(t))\). The same holds for the derivatives of \(\psi ^i\) and f.

4 Maximum Principle for (P)

In this section, we establish our main result, a Maximum Principle for (P). This is done by taking limits of the conclusions of Proposition 3.1, following closely the analysis done in the proof of [10, Theorem 2].

Observe that

$$\begin{aligned} \begin{aligned} \frac{1}{2} \frac{\hbox {d}}{\hbox {d}t} |p_k(t)|^2&= - \langle \nabla _x f_k p_k , p_k \rangle + \sum _{i=1}^I\gamma _k e^{\gamma _k(\psi _k^i-\sigma _k)} \langle \nabla _x^2\psi _k^ip_k, p_k \rangle \\&\hspace{1cm} + \sum _{i=1}^I\gamma _k^2e^{\gamma _k(\psi _k^i-\sigma _k)}\langle \nabla _x\psi _k^i, p_k \rangle ^2 \\&\ge - \langle \nabla _x f_k p_k , p_k \rangle + \sum _{i=1}^I \gamma _k e^{\gamma _k(\psi _k^i-\sigma _k)} \langle \nabla _x^2\psi _k^ip_k, p_k \rangle \\&\ge \ - M | p_k|^2 +\sum _{i=1}^I\gamma _k e^{\gamma _k(\psi _k^i-\sigma _k)} \langle \nabla _x^2\psi _k^ip_k, p_k \rangle , \end{aligned} \end{aligned}$$

where M is the constant of (A2). Taking into account hypothesis (A1) and (10), we deduce the existence of a constant \(K_0>0\) such that

$$\begin{aligned} \frac{1}{2} \frac{\hbox {d}}{\hbox {d}t} |p_k(t)|^2\ge -K_0| p_k(t)|^2. \end{aligned}$$

This last inequality implies that \(t\mapsto e^{2K_0t}|p_k(t)|^2\) is nondecreasing, and hence

$$\begin{aligned} | p_k(t)|^2 \ \le \ e^{2 K_0 (T-t)} | p_k(T)|^2 \le \ e^{2 K_0T} |p_k(T)|^2. \end{aligned}$$

Since, by (a) of Proposition 3.1, \(|p_k(T)|\le 1\), we deduce from the above that there exists \(M_0>0\) such that

$$\begin{aligned} | p_k(t)| \ \le M_0. \end{aligned}$$
(26)

Now, we claim that the sequence \(\{\dot{p}_k\}\) is uniformly bounded in \(L^1\). To prove our claim, we need to establish bounds for the three terms in (23). Following [10, 13], we start by deducing some inequalities that will be of help.

Denote \(I_k=I(t,x_k(t))\) and \(S_k^j=\textrm{sign}\left( \langle \nabla _x\psi _k^j, p_ k \rangle \right) \). We have

$$\begin{aligned} \sum _{j=1}^I\frac{\hbox {d}}{\hbox {d}t} \left| \langle \nabla _x\psi _k^j, p_ k \rangle \right|&=\sum _{j=1}^I\left( \langle \nabla ^2_x\psi _k^j \dot{x}_ k, p_ k \rangle +\langle \partial _t\nabla _x\psi _k^j,p_k\rangle + \langle \nabla _x\psi _k^j,\dot{p}_ k\rangle \right) \, S_k^j \\&= \sum _{j=1}^I\left( \langle p_ k, \nabla ^2_x\psi _k^j f_ k \rangle - \sum _{i=1}^I\gamma _k e^{\gamma _k(\psi _k^i-\sigma _k)} \langle p_ k, \nabla ^2_x\psi _k^j \nabla _x\psi _k^i \rangle \right) S_k^j \\&\quad +\sum _{j=1}^I\left( \langle \partial _t\nabla _x\psi _k^j,p_k\rangle - \langle \nabla _x \psi _k^j, (\nabla _x f_ k)^* p_ k \rangle \right) S_k^j \\&\quad +\sum _{j=1}^I\left( \sum _{i=1}^I\gamma _k e^{\gamma _k(\psi _k^i-\sigma _k)} \langle \nabla _x\psi _k^j , \nabla ^2_x\psi _k^i p_ k\rangle \right) S_k^j \\&\quad +\sum _{i=1}^I \sum _{j=1}^I \gamma _k^2e^{\gamma _k(\psi _k^i-\sigma _k)} \langle \nabla _x\psi _k^j , \nabla _x\psi _k^i \rangle \langle \nabla _x\psi _k^i,p_ k\rangle S_k^j. \end{aligned}$$

Observe that (see (6) and (7))

$$\begin{aligned}{} & {} \sum _{i=1}^I \sum _{j=1}^I \gamma _k^2e^{\gamma _k(\psi _k^i-\sigma _k)} \langle \nabla _x\psi _k^j, \nabla _x\psi _k^i \rangle \langle \nabla _x\psi _k^i,p_ k\rangle S_k^j \\{} & {} \quad =\sum _{i=1}^I \sum _{j\in I_k} \gamma _k^2e^{\gamma _k(\psi _k^i-\sigma _k)} \langle \nabla _x\psi _k^j, \nabla _x\psi _k^i \rangle \langle \nabla _x\psi _k^i,p_ k\rangle S_k^j \\{} & {} \quad =\sum _{i\not \in I_k} \gamma _k^2e^{\gamma _k(\psi _k^i-\sigma _k)} \sum _{j\in I_k} \langle \nabla _x\psi _k^j, \nabla _x\psi _k^i \rangle \langle \nabla _x\psi _k^i,p_ k\rangle S_k^j \\{} & {} \qquad +\sum _{i\in I_k}\gamma _k^2e^{\gamma _k(\psi _k^i-\sigma _k)}\left( |\nabla _x\psi _k^i|^2+ \sum _{j\in I_k\setminus \{i\}} \langle \nabla _x\psi _k^j, \nabla _x\psi _k^i \rangle S_k^j~S_k^i \right) | \langle \nabla _x\psi _k^i,p_ k\rangle | \\{} & {} \quad = \sum _{i\in I_k}\gamma _k^2e^{\gamma _k(\psi _k^i-\sigma _k)}\left( |\nabla _x\psi _k^i|^2+ \sum _{j\in I_k\setminus \{i\}} \langle \nabla _x\psi _k^j, \nabla _x\psi _k^i \rangle S_k^j~S_k^i \right) | \langle \nabla _x\psi _k^i,p_ k\rangle |\\{} & {} \quad \ge \displaystyle (1-\rho )\sum _{i\in I_k}\gamma _k^2e^{\gamma _k(\psi _k^i-\sigma _k)}|\nabla _x\psi _k^i|^2 | \langle \nabla _x\psi _k^i,p_ k\rangle | \\{} & {} \quad = (1-\rho )\sum _{i=1}^I\gamma _k^2e^{\gamma _k( \psi _k^i-\sigma _k)}|\nabla _x\psi _k^i|^2 | \langle \nabla _x\psi _k^i,p_ k\rangle |. \hspace{2cm} \end{aligned}$$

Using this and integrating the previous equality, we deduce the existence of \(M_1>0\) such that:

$$\begin{aligned} \int _0^T \sum _{i=1}^I\gamma _k^2e^{\gamma _k(\psi _k^i-\sigma _k)}| \nabla _x\psi _k^i|^2 | \langle \nabla _x\psi _k^i,p_ k\rangle |\hbox {d}t \le M_1. \end{aligned}$$
(27)

We are now in a position to show that

$$\begin{aligned} \displaystyle \int _{0}^{T} \sum _{i=1}^{I} \gamma _k^2 e^{\gamma _k( \psi _k^i-\sigma _k)} |\nabla _x \psi _k^i| \left| \langle \nabla _x \psi _k^i,p_ k\rangle \right| \ \hbox {d}t \end{aligned}$$

is bounded. For simplicity, set \(L_k^i(t) =\gamma _k^2 e^{\gamma _k( \psi _k^i-\sigma _k)} |\nabla _x \psi _k^i| \left| \langle \nabla _x \psi _k^i,p_ k\rangle \right| \). Notice that

$$\begin{aligned} \displaystyle \sum _{i=1}^I \int _{0}^{T} L_k^i(t) \hbox {d}t= \displaystyle \sum _{i=1}^I \left\{ \int _{\{t: |\nabla _x \psi _k^i| < \eta \}} \hspace{-0.7cm}L_k^i(t)~\hbox {d}t+ \displaystyle \int _{\{t: |\nabla _x\psi _k^i| \ge \eta \}}\hspace{-0.7cm} L_k^i(t) \hbox {d}t \right\} . \end{aligned}$$

Using (A1) and (27), we deduce that

$$\begin{aligned} \begin{aligned} \displaystyle \sum _{i=1}^I \int _{0}^{T} L_k^i (t)~\hbox {d}t&\le \displaystyle \sum _{i=1}^I \left( \gamma _k^2 e^{-\gamma _k(\beta +\sigma _k)} \eta ^2 \max _{t} |p_ k(t)|\right) \\&\hspace{0.5cm}+\displaystyle \sum _{i=1}^I \left( \gamma _k^2 \int _{ \{t: |\nabla _x \psi _k^i | \ge \eta \} } \hspace{-1cm}e^{\gamma _k(\psi _k^i-\sigma _k)} \frac{|\nabla _x\psi _k^i |^2}{ | \nabla _x\psi _k^i |} \left| \langle \nabla _x\psi _k^i,p_ k\rangle \right| \ \hbox {d}t \right) \\&\le \gamma _k^2 I~e^{-\gamma _k (\beta +\sigma _k)} \eta ^2 M_0 \\&\hspace{0.5cm} +\frac{1}{\eta }\displaystyle \sum _{i=1}^I \left( \int _{0}^{T} \gamma _k^2 e^{\gamma _k(\psi _k^i-\sigma _k)} | \nabla _x\psi _k^i|^2 \left| \langle \nabla _x\psi _k^i,p_ k\rangle \right| \ \hbox {d}t\right) \\&\le \ \eta ^2 M_0 I+ \frac{M_1}{\eta }, \end{aligned} \end{aligned}$$

for k large enough. Summarizing, there exists \(M_2>0\) such that

$$\begin{aligned} \displaystyle \sum _{i=1}^I \gamma _k^2 \int _{0}^{T} e^{\gamma _k(\psi _k^i-\sigma _k)} |\nabla _x\psi _k^i| \left| \langle \nabla _x\psi _k^i,p_ k\rangle \right| \ \hbox {d}t \ \ \le M_2. \end{aligned}$$
(28)

Mimicking the analysis conducted in Step 1, b) and c) of the proof of Theorem 2 in [10] and taking into account (b) of Proposition 3.1, we conclude that there exists a constant \(N_1>0\) such that

$$\begin{aligned} \int _{0}^{T} \left| \dot{p}_{k}(t)\right| \hbox {d}t \le N_1, \end{aligned}$$
(29)

for k sufficiently large, proving our claim.

Before proceeding, observe that it is a simple matter to assert the existence of a constant \(N_2>0\) such that

$$\begin{aligned} \displaystyle \sum _{i=1}^I \int _0^T \gamma _k^2 e^{\gamma _k(\psi _k^i-\sigma _k)} |\langle \nabla _x\psi _k^i,p_{k}\rangle |\hbox {d}t\le N_2. \end{aligned}$$
(30)

This inequality will be of help in what follows.

Let us now recall that

$$\xi _k^i(t)=\gamma _k e^{\gamma _k( \psi ^i(t,x_k(t))-\sigma _k)}$$

and that the second inequality in (14) holds. We turn to the analysis of Step 2 in the proof of Theorem 2 in [10] (see also [13]). Adapting those arguments, we can conclude the existence of some function \(p\in BV([0,T],R^n)\) and, for \(i=1, \ldots , I\), functions \(\xi ^i\in L^{\infty }([0,T],R)\) with \(\xi ^i(t) \ge 0\) a.e. \(t\) and \(\xi ^i(t) = 0\) for \(t \in I_b^i\), where

$$\begin{aligned} I_b^i=\left\{ t\in [0,T]:~ \psi ^i(t, \hat{x}(t))<0\right\} , \end{aligned}$$

and finite signed Radon measures \(\eta ^i\), null in \( I_b^i\), such that, for any \(z\in C([0,T],R^n)\)

$$\begin{aligned} \int _0^T \langle z,dp\rangle =-\int _0^T \langle z, (\nabla _x\hat{f})^*p\rangle \hbox {d}t +\displaystyle \sum _{i=1}^{I} \left( \int _0^T\xi ^i\langle z,\nabla ^2_x\hat{\psi } ^ip\rangle \hbox {d}t +\int _0^T\langle z, \nabla _x\hat{\psi }^i \rangle {d}\eta ^i\right) , \end{aligned}$$

where \(\nabla _x \hat{\psi }^i =\nabla _x \psi ^i(t,\hat{x}(t))\). Set \(\nabla _x \psi ^i_k=\nabla _x \psi ^i(t,x_k(t))\). The finite signed Radon measures \(\eta ^i\) are weak-\(*\) limits of

$$\begin{aligned} \gamma _k^2 e^{\gamma _k(\psi ^i(t, x_k(t))-\sigma _k)} \langle \nabla _x\psi ^i(t, x_k(t)),p_{k}(t)\rangle \hbox {d}t. \end{aligned}$$

Observe that the measures

$$\begin{aligned} \langle \nabla _x\psi ^i(t,\hat{x}(t)),p(t)\rangle {d}\eta ^i(t) \end{aligned}$$
(31)

are nonnegative.

For each \(i=1,\ldots , I\), the sequence \(\xi _k^i\) is weakly-\(*\) convergent in \(L^{\infty }\) to \(\xi ^i\ge 0\). Following [13], we deduce from (30) that, for each \(i=1,\ldots , I\),

$$\begin{aligned}{} & {} \int _0^T|\xi ^i\langle \nabla _x\hat{\psi }^i,p\rangle | \hbox {d}t=\lim _{k\rightarrow \infty }\int _0^T|\xi _k^i\langle \nabla _x\hat{\psi }^i,p\rangle | \hbox {d}t\\{} & {} \quad \le \lim _{k\rightarrow \infty } \left( \int _0^T\xi _k^i|\langle \nabla _x\hat{\psi }^i,p\rangle -\langle \nabla _x \psi _k^i,p_k\rangle | \hbox {d}t+ \int _0^T\xi _k^i|\langle \nabla _x \psi _k^i,p_k\rangle | \hbox {d}t\right) \\{} & {} \quad \le \lim _{k\rightarrow \infty }\left( \Big |\xi _k^i\Big |_{L^{\infty }}\Big | \langle \nabla _x\hat{\psi }^i,p\rangle -\langle \nabla _x \psi _k^i,p_k\rangle \Big |_{L^1}+\frac{N_2}{\gamma _k}\right) =0. \end{aligned}$$

It follows that

$$\begin{aligned} \xi ^i \langle \nabla _x\hat{\psi }^i ,p\rangle =0 \ \text{ a.e. } \end{aligned}$$
(32)

Consider now the sequence of scalars \(\{\lambda _k\}\). It is an easy matter to show that there exists a subsequence of \(\{\lambda _k\}\) converging to some \(\lambda \ge 0\). This, together with the convergence of \(p_k\) to p, allows us to take limits in (a) and (c) of Proposition 3.1 to deduce that

$$\begin{aligned} \lambda +|p(T)|=1 \end{aligned}$$

and

$$\begin{aligned} \langle p(t), f(t,\hat{x}(t),u)\rangle -\alpha \lambda |u-\hat{u}(t)| \le \langle p(t), f(t,\hat{x}(t),\hat{u}(t))\rangle ~\forall u\in U, \text { a.e. } t \in [0,T]. \end{aligned}$$

It remains to take limits of the transversality conditions (d) in Proposition 3.1. First, observe that

$$\begin{aligned} C_T+\epsilon _kB_n=\left\{ x:~d(x,C_T)\le \epsilon _k\right\} . \end{aligned}$$

From the basic properties of the Mordukhovich normal cone and subdifferential (see [18], Section 1.3.3), we have

$$\begin{aligned} N_{C_T+\epsilon _kB_n}(x_k(T))\subset \text { cl cone}\,\partial d(x_k(T), C_T) \end{aligned}$$

and

$$\begin{aligned} N_{C_T}(\hat{x}(T))= \text { cl cone}\,\partial d(\hat{x}(T), C_T). \end{aligned}$$

Passing to the limit as \(k\rightarrow \infty \), we get

$$\begin{aligned} (p(0),-p(T))\in N_{C_0}(\hat{x}(0))\times N_{C_T}(\hat{x}(T))+\{0\}\times \lambda ~ \partial \phi (\hat{x}(T)). \end{aligned}$$

Finally, mimicking Step 3 in the proof of Theorem 2 in [10], we remove the dependence of the conditions on the parameter \(\alpha \). This is done by taking further limits, this time along a sequence \(\alpha _j\downarrow 0\).

We summarize our conclusions in the following theorem.

Theorem 4.1

Let \((\hat{x}, \hat{u})\) be the optimal solution to (P). Suppose that assumptions (A1)–(A6) are satisfied. For \(i=1,\ldots , I\), set

$$\begin{aligned} I^{i}_b= \{ t \in [0,T]: ~ \psi ^{i}(t,\hat{x}(t) ) < 0 \}. \end{aligned}$$

There exist \( \lambda \ge 0\), \( p\in BV([0,T],R^n)\), finite signed Radon measures \( \eta ^i\), null in \(I^{i}_b\), for \(i=1,\ldots , I\), and functions \( \xi ^{i}\in L^\infty ([0,T],R)\), \(i=1,\ldots , I \), with \( \xi ^{i}(t) \ge 0\) a.e. \(t\) and \(\xi ^{i}(t) = 0\) for \( t \in I^{i}_b, \) such that

a):

\(\lambda +|p(T)|\ne 0\),

b):

\(\dot{ \hat{x}}(t)=f(t,\hat{x}(t),\hat{u}(t))- \displaystyle \sum _{i=1}^{I}\xi ^i(t)\nabla _x \hat{\psi }^{i} (t),\)

c):

for any \(z\in C([0,T];R^n)\)

$$\begin{aligned} \begin{array}{l} \displaystyle \int _0^T \langle z(t),dp(t)\rangle = -\displaystyle \int _0^T \langle z(t), ( \nabla _x \hat{f}(t))^*p(t)\rangle \textrm{d}t \\ \quad \displaystyle + \sum _{i=1}^{I}\displaystyle \left( \int _0^T \xi ^{i}(t) \langle z(t), \nabla ^2_x\hat{\psi }^{i}(t) p(t)\rangle \textrm{d}t \right. \ +\displaystyle \left. \int _0^T \langle z(t), \nabla _x \hat{\psi }^{i}(t)\rangle \hbox {d}\eta ^i\right) ,\end{array} \end{aligned}$$

where \( \nabla _x \hat{f}(t) = \nabla _x f(t,\hat{x}(t),\hat{u}(t))\), \(\nabla _x \hat{\psi }^i(t)=\nabla _x \psi ^i(t,\hat{x}(t))\) and \(\nabla ^2_x \hat{\psi }^i(t)=\nabla ^2_x \psi ^i(t, \hat{x}(t)),\)

d):

\(\xi ^i(t)\langle \nabla _x \psi ^{i}(t,\hat{x}(t)),p(t)\rangle =0\) a.e. \(t\), for all \(i=1, \ldots , I\),

e):

for all \(i=1, \ldots , I\), the measures \(\langle \nabla _x\psi ^i(t,\hat{x}(t)),p(t)\rangle {d}\eta ^i(t)\) are nonnegative,

f):

\(\displaystyle \langle p(t), f(t,\hat{x}(t),u)\rangle \le \langle p(t), f(t,\hat{x}(t),\hat{u}(t))\rangle \) for all \(u \in U\), a.e. \(t\),

g):

\(\displaystyle \begin{array}{c}(p(0),-p(T))\in N_{C_0}(\hat{x}(0))\times N_{C_T}(\hat{x}(T)) +\{0\}\times \lambda \partial \phi (\hat{x}(T)).\end{array}\)

Notably, condition (e) is not considered in any of our previous works.

We now turn to the free end-point case, i.e., to the problem

$$\begin{aligned} (P_f) \left\{ \begin{array}{l} \text{ Minimize } \; \phi (x(T))\\ \text{ over } \text{ processes } (x,u) \text{ such } \text{ that } \\ \hspace{8mm} \dot{x}(t) \in f(t,x(t),u(t))- N_{C(t)}(x(t)), \hspace{0.2cm}\text{ a.e. }\ \ t\in [0,T],\\ \hspace{8mm} u(t)\in U, \ \ \;\, \text{ a.e. }\ \ t\in [0,T],\\ \hspace{8mm} x(0) \in C_0 \subset C(0). \end{array} \right. \end{aligned}$$

Problem \((P_f)\) differs from (P) in that x(T) is not constrained to take values in \(C_T\). We apply Theorem 4.1 to \((P_f)\). Since x(T) is free, we deduce from (g) of the above theorem that \(-p(T)\in \lambda \, \partial \phi (\hat{x}(T))\). Suppose that \(\lambda =0\). Then \(p(T)=0\), contradicting the nontriviality condition (a) of Theorem 4.1. We then conclude that, without loss of generality, the conditions of Theorem 4.1 hold with \(\lambda =1\). We summarize our findings in the following corollary.

Corollary 4.1

Let \((\hat{x}, \hat{u})\) be the optimal solution to \((P_f)\). Suppose that assumptions (A1)–(A6) are satisfied. For \(i=1,\ldots , I\), set

$$\begin{aligned} I^{i}_b= \{ t \in [0,T]: ~ \psi ^{i}(t,\hat{x}(t) ) < 0 \}. \end{aligned}$$

There exist \( p\in BV([0,T],R^n)\), finite signed Radon measures \( \eta ^i\), null in \(I^{i}_b\), for \(i=1,\ldots , I\), and functions \( \xi ^{i}\in L^\infty ([0,T],R)\), \(i=1,\ldots , I \), with \( \xi ^{i}(t) \ge 0\) a.e. \(t\) and \(\xi ^{i}(t) = 0\) for \(t \in I^{i}_b, \) such that

a):

\(\dot{ \hat{x}}(t)=f(t,\hat{x}(t),\hat{u}(t))- \displaystyle \sum _{i=1}^{I}\xi ^i(t)\nabla _x \hat{\psi }^{i} (t),\)

b):

for any \(z\in C([0,T];R^n)\)

$$\begin{aligned}{} & {} \displaystyle \int _0^T \langle z(t),dp(t)\rangle = -\displaystyle \int _0^T \langle z(t), ( \nabla _x \hat{f}(t))^*p(t)\rangle \textrm{d}t \\{} & {} \quad \displaystyle + \sum _{i=1}^{I}\displaystyle \left( \int _0^T \xi ^{i}(t) \langle z(t), \nabla ^2_x\hat{\psi }^{i}(t) p(t)\rangle \hbox {d}t \right. \ +\displaystyle \left. \int _0^T \langle z(t), \nabla _x \hat{\psi }^{i}(t)\rangle \hbox {d}\eta ^i\right) , \end{aligned}$$

where \( \nabla _x \hat{f}(t) = \nabla _x f(t,\hat{x}(t),\hat{u}(t))\), \(\nabla _x \hat{\psi }^i(t)=\nabla _x \psi ^i(t,\hat{x}(t))\) and \(\nabla ^2_x \hat{\psi }^i(t)=\nabla ^2_x \psi ^i(t, \hat{x}(t)),\)

c):

\(\xi ^i(t)\langle \nabla _x \psi ^{i}(t,\hat{x}(t)),p(t)\rangle =0\) a.e. \(t\), for all \(i=1, \ldots , I\),

d):

for all \(i=1, \ldots , I\), the measures \(\langle \nabla _x\psi ^i(t,\hat{x}(t)),p(t)\rangle {d}\eta ^i(t)\) are nonnegative,

e):

\(\displaystyle \langle p(t), f(t,\hat{x}(t),u)\rangle \le \langle p(t), f(t,\hat{x}(t),\hat{u}(t))\rangle \) for all \(u \in U\), \(a.e.\, t\),

f):

\(\displaystyle \begin{array}{c}(p(0),-p(T))\in N_{C_0}(\hat{x}(0))\times \partial \phi (\hat{x}(T)).\end{array}\)

5 Example

Let us consider the following problem (Fig. 2)

$$\begin{aligned} \begin{array}{l} \text{ Minimize } \; -x(T)\\ \text{ over } \text{ processes } ((x,y,z),u) \text{ such } \text{ that } \\ \hspace{8mm} \begin{bmatrix} \dot{x}(t)\\ \dot{y}(t)\\ \dot{z}(t) \end{bmatrix} \in \begin{bmatrix} 0 &{} \quad \sigma &{} \quad 0\\ 0 &{} \quad 0 &{} \quad 0\\ 0 &{} \quad 0 &{} \quad 0 \end{bmatrix} \begin{bmatrix} x\\ y\\ z \end{bmatrix} +\begin{bmatrix} 0\\ u\\ 0 \end{bmatrix}-N_C(x,y,z),\\ u\in [-1,1],\\ (x,y,z)(0)=(x_0,y_0,z_0), \\ (x,y,z)(T)\in C_T, \end{array} \end{aligned}$$

where

  • \(0<\sigma \ll 1\),

  • \(C=\{ (x,y,z)\mid x^2+y^2+(z+h)^2\le 1,\; x^2+y^2+(z-h)^2\le 1\}\), where \(2h^2<1 \),

  • \((x_0,y_0,z_0)\in \textrm{int }C\), with \(x_0<-\delta \), \(y_0=0\) and \(z_0>0\),

  • \(C_T=\{ (x,y,z)\mid x\le 0,\; y\ge 0,\; \delta y-y_2x\le \delta y_2\}\cap C\), where

    $$\begin{aligned} \delta <\frac{y_2|x_0|}{y_1},~ \textrm{with } ~y_1=\sqrt{1-x_0^2-(z_0+h)^2} \text { and }y_2=\sqrt{1-h^2}. \end{aligned}$$

We choose \(T>0\) small but, nonetheless, large enough to guarantee that, when \(\sigma =0\), the system can reach the interior of \(C_T\) but not the segment \(\{ (x,0,0)\mid x\in [-\delta ,0]\}\). Since \(\sigma \) and T are small, the optimal trajectory must reach \(C_T\) at the face \(\delta y-y_2x=\delta y_2\) of \(C_T\).

To significantly increase the value of x(T), the optimal trajectory needs to stay on the boundary of C for some interval of time. Before reaching and after leaving the boundary of C, the optimal trajectory lies in the interior of C. Since \(\delta \) is small, the trajectory cannot reach \(C_T\) from any point of the sphere \(x^2+y^2+(z+h)^2=1\) with \(z>0\). This means that, while on the boundary of C, the trajectory should move on the sphere \(x^2+y^2+(z+h)^2=1\) until reaching the plane \(z=0\), and then move along the intersection of the two spheres.

While in the interior of C, the control can change sign from \(-1\) to 1 or from 1 to \(-1\). Certainly, the control should be 1 right before reaching the boundary and \(-1\) right before arriving at \(C_T\). Switching the control from 1 to \(-1\) or from \(-1\) to 1 before reaching the boundary wastes time and leads to smaller values of x(T). It then follows that the optimal control should be of the form

$$\begin{aligned} u(t)=\left\{ \begin{array}{cl} 1,&{} \quad t\in [0,\tilde{t}],\\ -1, &{} \quad t\in \ ]\tilde{t},T], \end{array} \right. \end{aligned}$$
(33)

for some value \(\tilde{t}\in ]0,T[\).
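To illustrate this discussion, the following is a small numerical sketch, not taken from the paper: it integrates the exponential-penalty approximation (2), specialized to the example's data, with a bang-bang control of the form (33). All numerical values (sigma, h, gamma, the switching time t_sw, the step size and horizon) are assumptions chosen purely for illustration.

```python
import math

# A numerical sketch (assumed values, not from the paper): explicit Euler
# integration of the exponential-penalty approximation (2) for the example,
# with C the intersection of the two balls and a bang-bang control (33).
sigma, h = 0.05, 0.5            # drift coupling and ball offset (2*h**2 < 1)
gamma = 50.0                    # penalty parameter gamma_k in (2)
dt, T, t_sw = 5e-4, 3.0, 2.0    # step size, horizon, switching time in (33)

def psi(s):
    """The two constraints whose 0-sublevel sets intersect to give C."""
    x, y, z = s
    return (x * x + y * y + (z + h) ** 2 - 1.0,
            x * x + y * y + (z - h) ** 2 - 1.0)

def rhs(s, u):
    """f(s, u) minus the exponential penalty term approximating N_C."""
    x, y, z = s
    p1, p2 = psi(s)
    w1 = gamma * math.exp(gamma * p1)    # weight of grad psi^1
    w2 = gamma * math.exp(gamma * p2)    # weight of grad psi^2
    return (sigma * y - 2.0 * x * (w1 + w2),
            u - 2.0 * y * (w1 + w2),
            -2.0 * ((z + h) * w1 + (z - h) * w2))

s = (-0.5, 0.0, 0.3)                     # interior starting point (z_0 > 0)
worst = max(psi(s))                      # largest constraint value observed
t = 0.0
while t < T:
    u = 1.0 if t <= t_sw else -1.0       # bang-bang control of the form (33)
    d = rhs(s, u)
    s = tuple(si + dt * di for si, di in zip(s, d))
    worst = max(worst, *psi(s))
    t += dt

# The penalty keeps the trajectory inside C up to an O(1/gamma) boundary layer.
print(f"max constraint value along trajectory: {worst:.4f}")
```

With these assumed values one can observe the state sliding down the first sphere toward the plane \(z=0\), matching the qualitative analysis above; larger \(\gamma \) tightens the boundary layer at the cost of a smaller stable step size.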

Fig. 2

The set C, consisting of the intersection of two balls, in thin solid line; the set \(C_T\) in dashed line; and the optimal trajectory in bold line

After the modification (8), the data of the problem satisfy the conditions under which Theorem 4.1 holds. We now show that the conclusions of Theorem 4.1 completely identify the structure (33) of the optimal control.

From Theorem 4.1, we deduce the existence of \( \lambda \ge 0\), \( p,~q,~r \in BV([0,T],R)\), finite signed Radon measures \( \eta _1\) and \(\eta _2\), null, respectively, in

$$\begin{aligned} I^{1}_b=\left\{ (x,y,z)\mid x^2+y^2+(z+h)^2-1<0\right\} \end{aligned}$$

and

$$\begin{aligned} I^{2}_b=\left\{ (x,y,z)\mid x^2+y^2+(z-h)^2-1<0\right\} , \end{aligned}$$

\( \xi _{i}\in L^\infty ([0,T],R)\), \(i=1,2 \), with \( \xi _{i}(t) \ge 0\) a.e. \(t\) and \(\xi _{i}(t) = 0\) for \( t \in I^{i}_b, \) such that

$$\begin{aligned}{} & {} \text {(i)} \begin{bmatrix} \dot{x}(t)\\ \dot{y}(t)\\ \dot{z}(t) \end{bmatrix} = \begin{bmatrix} 0 &{} \quad \sigma &{} \quad 0\\ 0 &{} \quad 0 &{} \quad 0\\ 0 &{} \quad 0 &{} \quad 0 \end{bmatrix} \begin{bmatrix} x\\ y\\ z \end{bmatrix} +\begin{bmatrix} 0\\ u\\ 0 \end{bmatrix}-2\xi _1\begin{bmatrix} x\\ y\\ z+h \end{bmatrix}-2\xi _2\begin{bmatrix} x\\ y\\ z-h \end{bmatrix} \\{} & {} \text {(ii)}\, d\begin{bmatrix} p\\ q\\ r \end{bmatrix} = \begin{bmatrix} 0 &{} \quad 0 &{} \quad 0\\ -\sigma &{} \quad 0 &{} \quad 0\\ 0 &{} \quad 0 &{} \quad 0 \end{bmatrix}\begin{bmatrix} p\\ q\\ r \end{bmatrix} \hbox {d}t \\{} & {} +2(\xi _1+\xi _2) \begin{bmatrix} p\\ q\\ r \end{bmatrix} \hbox {d}t +2\begin{bmatrix} x\\ y\\ z+h \end{bmatrix} \hbox {d}\eta _1 +2\begin{bmatrix} x\\ y\\ z-h \end{bmatrix}\hbox {d}\eta _2,\\{} & {} \text {(iii)} \begin{bmatrix} p\\ q\\ r \end{bmatrix}(T) = \begin{bmatrix} \lambda \\ 0\\ 0 \end{bmatrix}+\mu \begin{bmatrix} y_2\\ -\delta \\ 0 \end{bmatrix},\text { where } \mu \ge 0,\\{} & {} \text {(iv)}\, \xi _1(xp+yq+(z+h)r)=0,\; \xi _2(xp+yq+(z-h)r)=0,\\{} & {} \text {(v)}\, \text {the measures } (xp+yq+(z+h)r)\hbox {d}\eta _1\text { and }(xp+yq+(z-h)r)\hbox {d}\eta _2\\{} & {} \, \text { are nonnegative,} \\{} & {} \text {(vi)}\, \max _{u\in [-1,1]}uq=\hat{u}q, \end{aligned}$$

where \(\hat{u}\) is the optimal control.
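Condition (iii) can be read off from the transversality condition (g) of Theorem 4.1; the following sketch assumes, as argued above, that the terminal point lies on the face \(\delta y-y_2x=\delta y_2\) of \(C_T\), where only that constraint is active.

```latex
% phi(x,y,z) = -x gives lambda * subdifferential = {(-lambda, 0, 0)}; the active
% face g(x,y,z) = delta*y - y_2*x - delta*y_2 <= 0 has gradient (-y_2, delta, 0),
% so N_{C_T} at such a point is the cone generated by (-y_2, delta, 0).
\begin{aligned}
-(p,q,r)(T) &\in \{\mu(-y_2,\delta ,0)\mid \mu \ge 0\}+\{(-\lambda ,0,0)\},\\
(p,q,r)(T) &= (\lambda ,0,0)+\mu \,(y_2,-\delta ,0),\qquad \mu \ge 0.
\end{aligned}
```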

Let \(t_1\) be the instant of time when the trajectory reaches the sphere \(x^2+y^2+(z+h)^2=1\), \(t_2\) the instant when it reaches the intersection of the two spheres, and \(t_3\) the instant when it leaves the boundary of C. We have \(0<t_1<t_2<t_3<T\).

Next, we show that the multiplier q changes sign only once, thereby identifying the structure (33) of the optimal control uniquely. We start by looking at the case \(t=T\). We have

$$\begin{aligned} \left[ \begin{array}{c} p\\ q \end{array}\right] (T) = \left[ \begin{array}{c} \lambda \\ 0 \end{array}\right] +\mu \left[ \begin{array}{c} y_2\\ -\delta \ \end{array}\right] . \end{aligned}$$

Starting from \(t=T\), let us go backwards in time until the instant \(t_3\) when the trajectory leaves the boundary of C. If \(q(T)=0\), then \(p(T)=\lambda >0\) and we would have \(q(t)>0\) for \(t \in ]t_3,T[\) (see (ii) above), which is impossible. We then have \(p(T)>0\) and \(q(T)<0\) and, in \(]t_3,T[\), since \(\sigma \) is small, the vector (p(t), q(t)) does not change much. At \(t=t_3\), the vector (p,q) has a jump, and this jump can only occur along the vector \((x(t_3),y(t_3))\). Therefore, we have \(p(t_3-0)>0\) and \(q(t_3-0)<0\).

Let us now consider \(t\in ]t_2,t_3[\). We have the following:

1. when \( t\in [t_2,t_3]\), we have \(z=0\);

2. condition (i) implies that \(\xi _1=\xi _2=\xi \) with \(\xi >0\), since otherwise the motion along \(x^2+y^2=1-h^2\) would not be possible;

3. from \(0=\frac{\hbox {d}}{\hbox {d}t}(x^2+y^2)=2\sigma xy-8\xi x^2+2uy-8\xi y^2\), we get \(\xi =\frac{\sigma xy+uy}{4(1-h^2)}\);

4. condition (iv) implies that \(r=0\), leading to \(xp+yq=0\); since \(x<0\) and \(y>0\), \(q=0\) implies \(p=0\);

5. condition (ii) implies \(\hbox {d}\eta _1=\hbox {d}\eta _2=\hbox {d}\eta \);

6. \(0=d(xp+yq)=uq\,\hbox {d}t+4(1-h^2)\,\hbox {d}\eta \) \(\Rightarrow \) \(\frac{\hbox {d}\eta }{\hbox {d}t}=-\frac{uq}{4(1-h^2)}\);

7. from the above analysis, we deduce that

   $$\begin{aligned}{} & {} \dot{p}= \frac{\sigma xy+uy}{(1-h^2)}~p-\frac{xuq}{(1-h^2)},\\{} & {} \dot{q}=-\sigma p+\frac{\sigma xy}{(1-h^2)}~q. \end{aligned}$$

   Thus, \((p,q)\) is a solution to a linear system and can never be equal to zero. It follows that q cannot vanish, because \(q=0\) implies \(p=0\). Since \(q\ne 0\), we have \(q>0\).
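The nonvanishing claim above is the standard uniqueness argument for linear systems; writing \(v=(p,q)\) and \(v'=A(t)v\) with \(\Vert A(t)\Vert \le L\) on \([t_2,t_3]\), a Grönwall estimate gives the sketch below.

```latex
% Comparing |v| at two instants s, t in [t_2, t_3]:
\begin{aligned}
|v(t)|\ \le\ |v(s)|\,e^{L|t-s|},\qquad s,t\in [t_2,t_3],
\end{aligned}
```

so if \(v\) vanished at a single instant it would vanish identically on \([t_2,t_3]\), which is excluded since \((p(t_3-0),q(t_3-0))\ne (0,0)\).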

Let us consider the case when \(t=t_2\). We claim that

$$\begin{aligned} (p(t_2-0),q(t_2-0))\ne (0,0). \end{aligned}$$

Seeking a contradiction, assume that \((p(t_2-0),q(t_2-0))=(0,0)\). Then, we have

$$\begin{aligned} (p(t_2+0), q(t_2+0))=(0,0)+(2x(t_2), 2y(t_2))(\hbox {d}\eta _1+\hbox {d}\eta _2) \end{aligned}$$

and this jump has to be normal to \((x(t_2),y(t_2))\), since \(r(t_2+0)=0\) (see (iv)). It follows that \((x^2(t_2)+y^2(t_2))(\hbox {d}\eta _1+ \hbox {d}\eta _2)=0\) and, since \(x^2(t_2)+y^2(t_2)>0\), we get \(\hbox {d}\eta _1+ \hbox {d}\eta _2=0\), proving our claim.

We now consider \( t\in ]t_1,t_2[\). It is easy to see that \(\xi _2=0\) and \(\hbox {d}\eta _2=0\). We also deduce the following:

1. \(0=\frac{\hbox {d}}{\hbox {d}t}(x^2+y^2+(z+h)^2)=2\sigma xy+2uy-4\xi _1 y^2-4\xi _1 x^2-4\xi _1(z+h)^2\), which implies that \(\xi _1=\frac{\sigma xy+uy}{2}\);

2. also, \(0=d(xp+yq+(z+h)r)=uq\,\hbox {d}t+2\,\hbox {d}\eta _1\) implies that \(\frac{\hbox {d}\eta _1}{\hbox {d}t}=-\frac{uq}{2}\);

3. from the above, we deduce that

   $$\begin{aligned}{} & {} \dot{p}=(\sigma xy+uy)p-xuq,\\{} & {} \dot{q}=-\sigma p+\sigma xyq. \end{aligned}$$

   Thus, \((p,q)\) is a solution to a linear system and is never equal to zero. The second equation implies that if \(q=0\), then \(\dot{q}\ne 0\). Hence, \(q>0\).

Now, we need to consider \(t=t_1\). We claim that

$$\begin{aligned} (p(t_1-0),q(t_1-0),r(t_1-0))\ne (0,0,0). \end{aligned}$$

Let us then assume that \((p(t_1-0),q(t_1-0),r(t_1-0))=(0,0,0)\). It then follows that \((p(t_1+0),q(t_1+0),r(t_1+0))=(0,0,0)+(2x(t_1)\hbox {d}\eta _1, 2y(t_1)\hbox {d}\eta _1, 2(z(t_1)+h)\hbox {d}\eta _1)\). We now show that there is no such jump. Set \(r(t_1-0)=r_0\). Then, it follows from (iv) that \(x(t_1)\cdot 0+y(t_1)\cdot 0 +(z(t_1)+h)\,r_0=0\), which implies that \(r_0=0\). We also have \((x^2(t_1)+y^2(t_1)+(z(t_1)+h)^2)\hbox {d}\eta _1=0\) from (v). But this implies that \(\hbox {d}\eta _1=0\). Consequently, the multipliers do not exhibit a jump at \(t_1\).

From the previous analysis, we deduce that q should be positive almost everywhere on the boundary. It then follows that, to find the optimal solution, we have to analyze admissible trajectories corresponding to controls with the structure (33) and choose the optimal value of \(\tilde{t}\).

6 Conclusions

We proved necessary conditions for an optimal control problem involving sweeping processes with a nonsmooth sweeping set depending on time. The main feature of our work is the use of exponential penalization functions. We have successfully applied this approach in previous works on optimal control problems involving sweeping processes with a smooth set. In this work, to deal with the nonsmoothness of the sweeping set, we impose rather strong constraint qualifications. The weakening of these hypotheses will be the subject of future work.