Abstract
We generalize a maximum principle for optimal control problems involving sweeping systems previously derived in de Pinho et al. (Optimization 71(11):3363–3381, 2022, https://doi.org/10.1080/02331934.2022.2101111) to cover the case where the moving set may be nonsmooth. Notably, we consider problems with a constrained end point. A remarkable feature of our work is that we rely upon an ingenious smooth approximating family of standard differential equations in the vein of that used in de Pinho et al. (Set Valued Var Anal 27:523–548, 2019, https://doi.org/10.1007/s11228-018-0501-8).
1 Introduction
In recent years, there has been a surge of interest in optimal control problems involving the controlled sweeping process of the form
In this respect, we refer to, for example, [3, 4, 5, 8, 9, 10, 15, 23] (see also the accompanying correction [11]) and [6, 13, 14]. Sweeping processes first appeared in the seminal paper [17] by J.J. Moreau as a mathematical framework for problems in plasticity and friction theory. They have proven of interest in tackling problems in mechanics, engineering, economics and crowd motion, to name but a few; see [1, 5, 15, 16, 21]. In the last decades, systems of the form (1) have caught the attention of the optimal control community. Such interest resides not only in the range of applications but also in the remarkable challenge they raise concerning the derivation of necessary conditions. This is due to the presence of the normal cone \(N_{C(t)}(x(t))\) in the dynamics. Indeed, the normal cone renders the right-hand side of the differential inclusion in (1) discontinuous, destroying a regularity property central to many known optimal control results.
Lately, there have been several successful attempts to derive necessary conditions for optimal control problems involving (1). Assuming that the set C is time independent, necessary conditions for optimal control problems with free end point have been derived under different assumptions and using different techniques. In [10], the set C has the form \(C=\{ x:~\psi (x)\le 0\}\) and an approximating sequence of optimal control problems, where (1) is approximated by the differential equation
for some positive sequence \(\gamma _k\rightarrow +\infty \), is used. Similar techniques are also applied to somewhat more general problems in [23]. A useful feature of those approximations is exploited in [12] to define numerical schemes to solve such problems.
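To make the approximation idea concrete, here is a minimal numerical sketch on a toy problem. It assumes that the approximating equation has the exponential-penalty form \(\dot{x}=f-\gamma e^{\gamma \psi (x)}\nabla \psi (x)\), which is our reading of the adjoint equation (23) later in the paper and of the cited works, not a restatement of (2). The data \(f\equiv 1\), \(\psi (x)=x\), \(C=\{x\le 0\}\) and the function name `simulate_penalized` are hypothetical. For this toy problem the sweeping trajectory started at \(x_0=-1\) is \(x(t)=\min \{x_0+t,0\}\), while the penalized trajectory settles at \(-\ln (\gamma )/\gamma \), which tends to 0 as \(\gamma \rightarrow +\infty \).

```python
import math

def simulate_penalized(gamma, x0=-1.0, T=2.0, steps=20000):
    """Explicit Euler for the toy penalized equation x' = 1 - gamma*exp(gamma*x).

    Assumed (hypothetical) data: f = 1, psi(x) = x, so C = {x <= 0}.
    """
    h = T / steps
    x = x0
    for _ in range(steps):
        # drift f = 1 pushes towards the boundary x = 0;
        # the exponential penalty pushes back only near the boundary
        x += h * (1.0 - gamma * math.exp(gamma * x))
    return x

if __name__ == "__main__":
    for gamma in (10.0, 50.0, 200.0):
        x_T = simulate_penalized(gamma)
        # equilibrium of the penalized equation: gamma*exp(gamma*x) = 1
        print(gamma, x_T, -math.log(gamma) / gamma)
```

In this sketch the penalized trajectory stays inside C and its end point approaches the end point 0 of the sweeping trajectory as \(\gamma \) grows, which is the behaviour that the convergence results of the paper formalize in much greater generality.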
More recently, an adaptation of the family of approximating systems (2) is used in [13] to generalize the results in [10] to cover problems with additional end-point constraints and with a moving set of the form \(C(t)=\{ x:~\psi (t,x)\le 0\}\).
In this paper, we generalize the maximum principle proven in [13] to cover problems with possibly nonsmooth sets. Our problem of interest is
where \(T>0\) is fixed, \(\phi : R^n\rightarrow R\), \(f:[0,T]\times R^n\times R^m\rightarrow R^n\), \(U \subset R^m\) and
for some functions \(\psi ^i:[0,T]\times R^n\rightarrow R\), \( i=1,\ldots , I\).
The case where \(I=1\) in (3) and \(\psi ^1\) is \(C^2\) is covered in [13]. Here, we assume \(I>1\) and that the functions \(\psi ^i\) are also \(C^2\). Although going from \(I=1\) in (3) to \(I>1\) may be seen as a small generalization, it demands a significant revision of the technical approach and, moreover, the introduction of a constraint qualification. This is because set (3) may be nonsmooth. We focus on sets (3) satisfying a certain constraint qualification, introduced in assumption (A1) in Sect. 2. This is, indeed, a restriction on the nonsmoothness of (3). A similar problem with nonsmooth moving set is considered in [14]. Our results cannot be obtained from the results of [14] and do not generalize them.
This paper is organized in the following way. In Sect. 2, we introduce the main notation and we state and discuss the assumptions under which we work. In this same section, we also introduce the family of approximating systems to \(\dot{x}(t) \in f(t,x(t),u(t))- N_{C(t)}(x(t))\) and establish a crucial convergence result, Theorem 2.1. In Sect. 3, we dwell on the approximating family of optimal control problems to (P) and we state the associated necessary conditions. The maximum principle for (P) is then deduced and stated in Theorem 4.1, which also covers problems in the form of (P) where the end-point constraint \(x(T) \in C_T\) is absent. We present an illustrative example of our main result, Theorem 4.1, in Sect. 5. We end this paper with some brief conclusions.
2 Preliminaries
In this section, we introduce a summary of the notation and state the assumptions on the data of (P) enforced throughout. Furthermore, we extract information from the assumptions establishing relations crucial for the forthcoming analysis.
Notation
For a set \(S\subset R^n\), \(\partial S\), \( \text {cl}\,S\) and \( \text{ int }\, S\) denote the boundary, closure and interior of S.
If \(g: R^p\rightarrow R^q\), \(\nabla g\) represents the derivative and \(\nabla ^2g\) the second derivative. If \(g: R\times R^p\rightarrow R^q\), then \(\nabla _x g\) represents the derivative w.r.t. \(x\in R^p\) and \(\nabla ^2_xg\) the second derivative, while \(\partial _t g(t,x)\) represents the derivative w.r.t. \(t\in R\).
The Euclidean norm or the induced matrix norm on \( R^{p\times q}\) is denoted by \( |\cdot |\). We denote by \(B_n\) the closed unit ball in \( R^n\) centered at the origin. The inner product of x and y is denoted by \(\langle x, y\rangle \). For some \(A\subset R^n\), d(x, A) denotes the distance between x and A. We denote the support function of A at z by \(S(z,A)=\sup \{\langle z,a\rangle \mid a\in A\}\).
The space \(L^{\infty }([a,b]; R^p)\) (or simply \(L^{\infty }\) when the domains are clearly understood) is the Lebesgue space of essentially bounded functions \(h:[a,b]\rightarrow R^p\). We say that \(h\in BV([a,b]; R^p)\) if h is a function of bounded variation. The space of continuous functions is denoted by \(C([a,b]; R^p)\).
Standard concepts from nonsmooth analysis will also be used. Those can be found in [7, 18] or [22], to name but a few. The Mordukhovich normal cone to a set S at \(s\in S\) is denoted by \(N_{S}(s)\) and \(\partial f(s)\) is the Mordukhovich subdifferential of f at s (also known as limiting subdifferential).
For any set \(A\subset R^n\), \( \text {cone}\, A\) is the cone generated by the set A.
We now turn to problem (P). We first state the definition of admissible processes for (P) and then we describe the assumptions under which we will derive our main results.
Definition 2.1
A pair (x, u) is called an admissible process for (P) when x is an absolutely continuous function and u is a measurable function satisfying the constraints of (P).
Assumptions on the data of \(\mathbf {(P)}\)
- A1: The functions \(\psi ^i\), \( i=1,\ldots , I\), are \(C^2\). The graph of \(C(\cdot )\) is compact and it is contained in the interior of a ball \(rB_{n+1}\), for some \(r>0\). There exist constants \(\beta >0\), \(\eta >0\) and \(\rho \in ]0,1[\) such that
$$\begin{aligned} \psi ^i(t,x) \in [ -\beta ,\beta ] \Longrightarrow |\nabla _x \psi ^i (t,x) | > \eta \; \; \textrm{for all }\; (t,x)\in [0,T]\times R^n, \end{aligned}$$ (4)
and, for \(I(t,x)=\{ i=1,\ldots , I\mid \psi ^i(t,x)\in ]-2\beta ,\beta ]\}\),
$$\begin{aligned} \langle \nabla _x\psi ^i(t,x),\nabla _x\psi ^j(t,x)\rangle \ge 0,\;\; i,j\in I(t,x). \end{aligned}$$ (5)
Moreover, if \(i\in I(t,x)\), then
$$\begin{aligned} \sum _{j\in I(t,x)\setminus \{i\}} \big | \langle \nabla _x\psi ^i(t,x),\nabla _x\psi ^j(t,x)\rangle \big | \le \rho |\nabla _x\psi ^i(t,x)|^2. \end{aligned}$$ (6)
Additionally,
$$\begin{aligned} \psi ^i(t,x)\le -2\beta ~\Longrightarrow ~\nabla _x \psi ^i(t,x)=0~\text { for } i=1,\ldots , I. \end{aligned}$$ (7)
- A2: The function f is continuous, \(x\rightarrow f(t,x,u)\) is continuously differentiable for all \((t,u)\in [0,T]\times R^m\). The constant \(M>0\) is such that \(|f(t,x,u)|\le M\) and \(|\nabla _x f(t,x,u)|\le M\) for all \((t,x,u)\in rB_{n+1}\times U\).
- A3: For each (t, x), the set f(t, x, U) is convex.
- A4: The set U is compact.
- A5: The sets \(C_0\) and \(C_T\) are compact.
- A6: There exists a constant \(L_\phi \) such that \(|\phi (x)-\phi (x')|\le L_{\phi }|x-x'|\) for all \(x, x' \in R^n\).
Assumption (A1) concerns the functions \(\psi ^i\) defining the set C, and it plays a crucial role in the analysis. All \(\psi ^i\) are assumed to be smooth with gradients bounded away from the origin when \(\psi ^i\) takes values in a neighborhood of zero. Moreover, the boundary of C may be nonsmooth at the intersection points of the level sets \(\left\{ x: \psi ^i(t,x)=0\right\} \). However, nonsmoothness at those corner points is restricted to (5) which excludes the cases where the angle between the two gradients of the functions defining the boundary of C is obtuse; see Fig. 1.
On the other hand, (6) guarantees that the Gramian matrix of the gradients of the functions taking values near the boundary of C(t) is diagonally dominant and, hence, the gradients are linearly independent.
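Conditions (5) and (6) are easy to test numerically at a given point once the active gradients are known. The following minimal sketch does so; the function names and the sample data are hypothetical. As sample data it uses the gradients \(2(x,y,z+h)\) and \(2(x,y,z-h)\) of the two sphere constraints of the example in Sect. 5, assuming they are written as \(x^2+y^2+(z\pm h)^2-1\le 0\), evaluated at a point of the intersection circle \(z=0\) with \(h=0.5\).

```python
import math

def dot(a, b):
    """Euclidean inner product of two vectors given as sequences."""
    return sum(ai * bi for ai, bi in zip(a, b))

def check_gradient_conditions(gradients, rho):
    """Return True when the active gradients satisfy (5) and (6) for this rho."""
    for i, gi in enumerate(gradients):
        off_diagonal = 0.0
        for j, gj in enumerate(gradients):
            if i == j:
                continue
            inner = dot(gi, gj)
            if inner < 0.0:
                return False  # obtuse angle between gradients: (5) fails
            off_diagonal += abs(inner)
        if off_diagonal > rho * dot(gi, gi):
            return False  # Gram matrix row not dominated by its diagonal: (6) fails
    return True

# Hypothetical sample point on the circle x^2 + y^2 = 1 - h^2, z = 0, with h = 0.5
h = 0.5
x, y, z = -0.5, math.sqrt(0.5), 0.0
g1 = (2 * x, 2 * y, 2 * (z + h))
g2 = (2 * x, 2 * y, 2 * (z - h))
print(check_gradient_conditions([g1, g2], rho=0.6))
```

Under the stated assumption on the form of the constraints, a direct computation on the intersection circle gives \(\langle \nabla \psi ^1,\nabla \psi ^2\rangle =4(1-2h^2)\) and \(|\nabla \psi ^i|^2=4\), so (6) holds there exactly when \(\rho \ge 1-2h^2\); this is compatible with \(\rho <1\) because \(2h^2<1\) is imposed in the example.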
In many situations, as in the example we present in the last section, we can guarantee the fulfillment of (A1), in particular (7), replacing the function \(\psi ^i\) by
where
Here, h is a \(C^2\) function, with \(h_s\) an increasing function defined on \([-2\beta ,-\beta ]\). For example, \(h_s\) may be a cubic polynomial with positive derivative on the interval \(]-2\beta ,-\beta [\). For all \(t\in [0,T]\), set
It is then a simple matter to see that
and that the functions \(\tilde{\psi }^i(\cdot )\) satisfy the assumption (A1).
The assumption that the graph of \(C(\cdot )\) is compact and contained in the interior of a ball is introduced to avoid technicalities in our forthcoming analysis. In applied problems, this may be easily sidestepped by considering the intersection of the graph of \(C(\cdot )\) with a tube around the optimal trajectory.
Condition (A1) implies the conditions of Theorem 3.1 in [2] and so our set C(t) is uniformly prox-regular.
We now proceed to introduce an approximating family of controlled systems to (1). Let \(x(\cdot )\) be a solution to the differential inclusion
Under our assumptions, measurable selection theorems assert the existence of measurable functions u and \(\xi ^i\) such that \(u(t) \in U\), \(\xi ^i(t)\ge 0\) a.e. \(t\in [0,T]\), \(\xi ^i(t)=0\) if \(\psi ^i(t,x(t))<0\), and
Considering the trajectory x, some observations are called for. Let \(\mu \) be such that
The properties of the graph of \(C(\cdot )\) in (A1) guarantee the existence of such maximum.
Consider now some t such that, for some \(j\in \{1, \ldots I\}\), \(\psi ^j (t,x(t))=0\) and \(\dot{x}(t)\) exists. Since the trajectory x is always in C, we have (see (5))
and hence (see (4)),
Define the function
consider a sequence \(\{\sigma _k\}\) such that \(\sigma _k\downarrow 0\) and choose another sequence \(\{\gamma _k\}\) with \(\gamma _k\uparrow +\infty \) and
where
Let \(x_k\) be a solution to the differential equation
for some \(u_k(t) \in U\) a.e. \(t\in [0,T]\). Take any \(t\in [0,T]\) such that \(\dot{x}_k(t)\) exists and \(\psi ^j (t,x_k(t))-\sigma _k=\mu _k\). Assume that \(|\psi ^j(t,x_k(t))|\le \beta \) and \(\psi ^i(t,x_k(t)) \le \beta \), for all \(\displaystyle i\).
Then, whenever \(\gamma _k\) is sufficiently large, we have (see (5) and (7))
In the last inequality, we have used the definition of \(\mu \).
Thus, if \(x_k(0)\in C^k(0)\), we have \(x_k(t)\in C^k(t)\), for all \(t\in [0,T]\), and
It follows that, for k sufficiently large, we have
We remark that the inclusion \(x_k(t)\in C^k(t)\) is a direct consequence of Theorem 3 in [20].
We are now in a position to state and prove our first result, Theorem 2.1. This is in the vein of Theorem 4.1 in [23] (see also Lemma 1 in [10] when \(\psi \) is independent of t and convex), deviating from it insofar as the approximating sequence of control systems (9) differs from the one introduced in [10]; see the note at the end of the paper. The proof of Theorem 2.1 relies on (10).
Theorem 2.1
Let \(\{(x_k,u_k)\}\), with \(u_k(t)\in U\) a.e., be a sequence of solutions of Cauchy problems
If \(b_k\rightarrow x_0\), then there exists a subsequence \(\{x_k\}\) (we do not relabel) converging uniformly to x, a unique solution to the Cauchy problem
where u is a measurable function such that \(u(t)\in U\) a.e. \(t\in [0,T]\).
If, moreover, all the controls \(u_k\) are equal, i.e., \(u_k=u\), then the subsequence converges to a unique solution of (12), i.e., any solution of
can be approximated by solutions of (11).
Proof
Consider the sequence \(\{x_k\}\), where \((x_k,u_k)\) solves (11). Recall that \(x_k(t)\in C^k(t)\) for all \(t\in [0,T]\), and
Then, there exist subsequences (we do not relabel) weakly-\(*\) converging in \(L^{\infty }\) to some v and \(\xi ^i\). Hence,
for an absolutely continuous function x. Obviously, \(x(t)\in C(t)\) for all \(t\in [0,T]\). Considering the sequence \(\{x_k\}\), recall that
Inclusion (15) is equivalent to
Integrating this inequality, we get
Passing to the limit as \(k\rightarrow \infty \), we obtain
Let \(t\in [0,T]\) be a Lebesgue point of \(\dot{x}\) and \(\xi \). Passing to the limit as \(\tau \downarrow 0\) in the last inequality leads to
Since \(z\in R^n\) is an arbitrary vector and the set f(t, x(t), U) is convex, we conclude that
By the Filippov lemma, there exists a measurable control \(u(t)\in U\) such that
Furthermore, observe that \(\xi ^i\) is zero if \(\psi ^i(t,x(t))<0\). If for some u such that \(u(t)\in U\) a.e., \(u_k=u\) for all k, then the sequence \(x_k\) converges to the solution of
Indeed, to see this, it suffices to pass to the limit as \(k\rightarrow \infty \) and then as \(\tau \downarrow 0\), in the equality
Recall that the set C(t) is uniformly prox-regular. The proof of uniqueness of solution for general sweeping processes with prox-regular sets can be found in [19], and it holds under the requirement that the moving set is Lipschitz continuous with respect to time. Although we do not assume the Lipschitz dependence directly, under our assumptions we can appeal to the implicit function theorem to show that C(t) is locally Lipschitz. However, for our special case, it is possible to have a simple alternative proof, which we present next for the convenience of the reader. The proof is in the vein of that of Theorem 4.1 in [23]. Suppose that there exist two different solutions of (12): \(x_1\) and \(x_2\). We have
If, for all i, \(\psi ^i(t,x_1(t))<0\) and \(\psi ^i(t,x_2(t))<0\), then \(\xi _1^i(t)=\xi _2^i(t)=0\), and we obtain
Suppose that \(\psi ^j(t,x_1(t))=0\). Then, by the Taylor formula we get
where \(\theta \in [0,1]\). Since \(\psi ^j(t,x_2(t))\le 0\), we have
Now, if \(\psi ^j(t,x_2(t))=0\), we deduce in the same way that
Thus, we have
Hence, \(|x_1(t)-x_2(t)|=0\). \(\square \)
3 Approximating Family of Optimal Control Problems
In this section, we define an approximating family of optimal control problems to (P) and we state the corresponding necessary conditions.
Let \((\hat{x},\hat{u})\) be a global solution to (P) and consider sequences \(\{\gamma _k\}\) and \(\{\sigma _k\}\) as defined above. Let \(\hat{x}_k(\cdot )\) be the solution to
Set \(\epsilon _k=|\hat{x}_k(T)-\hat{x}(T)|\). It follows from Theorem 2.1 that \(\epsilon _k\downarrow 0\). Take \(\alpha >0\) and define the problem
Clearly, the problem \((P_k^\alpha )\) has admissible solutions. Consider the space
and the distance
Endowed with \(d_{W}\), W is a complete metric space. Take any \((c,u)\in W\) and a solution y to the Cauchy problem
Under our assumptions, the function
is continuous on \((W,d_{W})\) and bounded below. Appealing to Ekeland’s theorem, we deduce the existence of a pair \((x_k,u_k)\) solving the following problem
Lemma 3.1
Take \(\gamma _k\rightarrow \infty \), \(\sigma _k\rightarrow 0\) and \(\epsilon _k \rightarrow 0\) as defined above. For each k, let \((x_k,u_k)\) be the solution to \((AP_k)\). Then, there exists a subsequence (we do not relabel) such that
Proof
We deduce from Theorem 2.1 that \(\{x_k\}\) uniformly converges to an admissible solution \(\tilde{x}\) to (P). Since U and \(C_0\) are compact, we have \(U\subset KB_m\) and \(C_0\subset KB_n\). Without loss of generality, \(u_k\) weakly-\(*\) converges to a function \(\tilde{u}\in L_{\infty }([0,T],U)\). Hence, it weakly converges to \(\tilde{u}\) in \(L_1\). From optimality of the processes \((x_k,u_k)\), we have
Since \((\hat{x},\hat{u})\) is a global solution of the problem, passing to the limit, we get
Hence, \(\tilde{x}(0)=\hat{x}(0)\), \(\tilde{u}=\hat{u}\) a.e., and \(u_k\) converges to \(\hat{u}\) in \(L_1\), and some subsequence converges to \(\hat{u}\) almost everywhere (we do not relabel). \(\square \)
We now finish this section with the statement of the optimality necessary conditions for the family of problems \((AP_k)\). These can be seen as a direct consequence of Theorem 6.2.1 in [22].
Proposition 3.1
For each k, let \((x_k,u_k)\) be a solution to \((AP_k)\). Then, there exist absolutely continuous functions \(p_k\) and scalars \(\lambda _k\ge 0\) such that
- (a): (nontriviality condition)
$$\begin{aligned} \lambda _k+|p_{k}(T)| =1, \end{aligned}$$ (22)
- (b): (adjoint equation)
$$\begin{aligned} \begin{array}{c}\dot{p}_{k} =-(\nabla _x f_{k})^* p_{k} +\sum _{i=1}^I\gamma _k e^{\gamma _k (\psi _{k}^i-\sigma _k)}\nabla ^2_x\psi _{k}^ip_{k}\\ +\sum _{i=1}^I\gamma _k^2e^{\gamma _k (\psi _{k}^i-\sigma _k)}\nabla _x\psi _{k}^i\langle \nabla _x\psi _{k}^i,p_{k}\rangle , \end{array} \end{aligned}$$ (23)
where the superscript \(*\) stands for transpose,
- (c): (maximization condition)
$$\begin{aligned} \max _{u\in U}\left\{ \langle f(t,x_{k}, u) , p_{k} \rangle - \alpha \lambda _k|u-\hat{u}| -\epsilon _k \lambda _k|u-u_k|\right\} \end{aligned}$$ (24)
is attained at \(u_k (t) \), for almost every \(t\in [0,T]\),
- (d): (transversality condition)
$$\begin{aligned} ( p_{k}(0), - p_{k}(T)) \in \lambda _k\left( 2(x_k(0)-\hat{x}(0))+\epsilon _k B_n, \partial \phi (x_{k}(T))\right) \\ + N_{C_0}(x_{k}(0))\times N_{C_T+\epsilon _kB_n}(x_{k}(T)). \end{aligned}$$ (25)
To simplify the notation above, we drop the t dependence in \(p_k\), \(\dot{p}_k\), \(x_k\), \(u_k\), \(\hat{x}\) and \(\hat{u}\). Moreover, in (b), we write \(\psi _k\) instead of \(\psi (t,x_k(t))\), \(f_k\) instead of \(f(t,x_k(t),u_k(t))\). The same holds for the derivatives of \(\psi \) and f.
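The structure of the adjoint equation (23) can be checked by a short computation. The sketch below assumes that the approximating dynamics have the exponential-penalty form \(\dot{x}_k=F_k(t,x_k,u_k)\) with \(F_k=f-\sum _{i=1}^I\gamma _k e^{\gamma _k (\psi ^i-\sigma _k)}\nabla _x\psi ^i\); this is our reading of (9), inferred from (23) itself, and the symbol \(F_k\) is introduced here only for illustration. Differentiating one penalty term with respect to x by the product and chain rules gives

$$\begin{aligned} \nabla _x\Big ( \gamma _k e^{\gamma _k (\psi ^i-\sigma _k)}\nabla _x\psi ^i\Big ) =\gamma _k e^{\gamma _k (\psi ^i-\sigma _k)}\nabla ^2_x\psi ^i +\gamma _k^2 e^{\gamma _k (\psi ^i-\sigma _k)}\nabla _x\psi ^i(\nabla _x\psi ^i)^*. \end{aligned}$$

Both matrices on the right are symmetric and \(\nabla _x\psi ^i(\nabla _x\psi ^i)^*p=\nabla _x\psi ^i\langle \nabla _x\psi ^i,p\rangle \), so the standard adjoint relation \(\dot{p}_k=-(\nabla _xF_k)^*p_k\) becomes

$$\begin{aligned} \dot{p}_k=-(\nabla _x f_k)^*p_k+\sum _{i=1}^I\gamma _k e^{\gamma _k (\psi _k^i-\sigma _k)}\nabla ^2_x\psi _k^i p_k+\sum _{i=1}^I\gamma _k^2 e^{\gamma _k (\psi _k^i-\sigma _k)}\nabla _x\psi _k^i\langle \nabla _x\psi _k^i,p_k\rangle , \end{aligned}$$

which agrees term by term with (23). The terms \(\alpha \lambda _k|u-\hat{u}|\) and \(\epsilon _k\lambda _k|u-u_k|\) in (24) do not depend on x and therefore do not contribute to the adjoint equation.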
4 Maximum Principle for (P)
In this section, we establish our main result, a Maximum Principle for (P). This is done by taking limits of the conclusions of Proposition 3.1, following closely the analysis done in the proof of [10, Theorem 2].
Observe that
where M is the constant of (A2). Taking into account hypothesis (A1) and (10), we deduce the existence of a constant \(K_0>0\) such that
This last inequality leads to
Since, by (a) of Proposition 3.1, \(|p_k(T)|\le 1\), we deduce from the above that there exists \(M_0>0\) such that
Now, we claim that the sequence \(\{\dot{p}_k\}\) is uniformly bounded in \(L^1\). To prove our claim, we need to establish bounds for the three terms in (23). Following [10, 13], we start by deducing some inequalities that will be of help.
Denote \(I_k=I(t,x_k(t))\) and \(S_k^j=\textrm{sign}\left( \langle \nabla _x\psi _k^j, p_ k \rangle \right) \). We have
Observe that (see (6) and (7))
Using this and integrating the previous equality, we deduce the existence of \(M_1>0\) such that:
We are now in a position to show that
is bounded. For simplicity, set \(L_k^i(t) =\gamma _k^2 e^{\gamma _k( \psi _k^i-\sigma _k)} |\nabla _x \psi _k^i| \left| \langle \nabla _x \psi _k^i,p_ k\rangle \right| \). Notice that
Using (A1) and (27), we deduce that
for k large enough. Summarizing, there exists \(M_2>0\) such that
Mimicking the analysis conducted in Step 1, b) and c) of the proof of Theorem 2 in [10] and taking into account (b) of Proposition 3.1, we conclude that there exists a constant \(N_1>0\) such that
for k sufficiently large, proving our claim.
Before proceeding, observe that it is a simple matter to assert the existence of a constant \(N_2\) such that
This inequality will be of help in what follows.
Let us now recall that
and that the second inequality in (14) holds. We turn to the analysis of Step 2 in the proof of Theorem 2 in [10] (see also [13]). Adapting those arguments, we can conclude the existence of some function \(p\in BV([0,T],R^n)\) and, for \(i=1, \ldots , I\), functions \(\xi ^i\in L^{\infty }([0,T],R)\) with \(\xi ^i(t) \ge 0\) for a.e. t and \(\xi ^i(t) = 0\) for \(t \in I_b^i\), where
and finite signed Radon measures \(\eta ^i\), null in \( I_b^i\), such that, for any \(z\in C([0,T],R^n)\)
where \(\nabla _x \hat{\psi }^i =\nabla _x \psi ^i(t,\hat{x}(t))\). Set \(\nabla _x \psi ^i_k=\nabla _x \psi ^i(t,x_k(t))\). The finite signed Radon measures \(\eta ^i\) are weak-\(*\) limits of
Observe that the measures
are nonnegative.
For each \(i=1,\ldots , I\), the sequence \(\xi _k^i\) is weakly-\(*\) convergent in \(L^{\infty }\) to \(\xi ^i\ge 0\). Following [13], we deduce from (30) that, for each \(i=1,\ldots , I\),
It turns out that
Consider now the sequence of scalars \(\{\lambda _k\}\). It is an easy matter to show that there exists a subsequence of \(\{\lambda _k\}\) converging to some \(\lambda \ge 0\). This, together with the convergence of \(p_k\) to p, allows us to take limits in (a) and (c) of Proposition 3.1 to deduce that
and
It remains to take limits of the transversality conditions (d) in Proposition 3.1. First, observe that
From the basic properties of the Mordukhovich normal cone and subdifferential (see [18], section 1.3.3), we have
and
Passing to the limit as \(k\rightarrow \infty \), we get
Finally, and mimicking Step 3 in the proof of Theorem 2 in [10], we remove the dependence of the conditions on the parameter \(\alpha \). This is done by taking further limits, this time considering a sequence of \(\alpha _j\downarrow 0\).
We then summarize our conclusions in the following Theorem.
Theorem 4.1
Let \((\hat{x}, \hat{u})\) be the optimal solution to (P). Suppose that assumptions A1–A6 are satisfied. For \(i=1,\cdots , I\), set
There exist \( \lambda \ge 0\), \( p\in BV([0,T],R^n)\), finite signed Radon measures \( \eta ^i\), null in \(I^{i}_b\), for \(i=1,\cdots , I\), \( \xi ^{i}\in L^\infty ([0,T],R)\), with \(i=1,\cdots , I \), where \(\xi ^{i}(t) \ge 0\) for a.e. t and \(\xi ^{i}(t) = 0\) for \(t \in I^{i}_b\), such that
- a): \(\lambda +|p(T)|\ne 0\),
- b): \(\dot{ \hat{x}}(t)=f(t,\hat{x}(t),\hat{u}(t))- \displaystyle \sum _{i=1}^{I}\xi ^i(t)\nabla _x \hat{\psi }^{i} (t),\)
- c): for any \(z\in C([0,T];R^n)\)
$$\begin{aligned} \begin{array}{l} \displaystyle \int _0^T \langle z(t),\textrm{d}p(t)\rangle = -\displaystyle \int _0^T \langle z(t), ( \nabla _x \hat{f}(t))^*p(t)\rangle \textrm{d}t \\ \quad \displaystyle + \sum _{i=1}^{I}\left( \int _0^T \xi ^{i}(t) \langle z(t), \nabla ^2_x\hat{\psi }^{i}(t) p(t)\rangle \textrm{d}t +\int _0^T \langle z(t), \nabla _x \hat{\psi }^{i}(t)\rangle \textrm{d}\eta ^i\right) ,\end{array} \end{aligned}$$
where \( \nabla _x \hat{f}(t) = \nabla _x f(t,\hat{x}(t),\hat{u}(t))\), \(\nabla _x \hat{\psi }^i(t)=\nabla _x \psi ^i(t,\hat{x}(t))\) and \(\nabla ^2_x \hat{\psi }^i(t)=\nabla ^2_x \psi ^i(t, \hat{x}(t)),\)
- d): \(\xi ^i(t)\langle \nabla _x \psi ^{i}(t,\hat{x}(t)),p(t)\rangle =0\) for a.e. t and for all \(i=1, \ldots , I\),
- e): for all \(i=1, \ldots , I\), the measures \(\langle \nabla _x\psi ^i(t,\hat{x}(t)),p(t)\rangle \textrm{d}\eta ^i(t)\) are nonnegative,
- f): \(\displaystyle \langle p(t), f(t,\hat{x}(t),u)\rangle \le \langle p(t), f(t,\hat{x}(t),\hat{u}(t))\rangle \) for all \(u \in U\) and a.e. t,
- g): \(\displaystyle (p(0),-p(T))\in N_{C_0}(\hat{x}(0))\times N_{C_T}(\hat{x}(T)) +\{0\}\times \lambda \partial \phi (\hat{x}(T)).\)
Notably, condition e) does not appear in any of our previous works.
We now turn to the free end-point case, i.e., to the problem
Problem \((P_f)\) differs from (P) because x(T) is not constrained to take values in \(C_T\). We apply Theorem 4.1 to \((P_f)\). Since x(T) is free, we deduce from (g) in the above Theorem that \(-p(T)\in \lambda \partial \phi (\hat{x}(T))\). Suppose that \(\lambda =0\). Then, \(p(T)=0\), contradicting the nontriviality condition (a) of Theorem 4.1. Hence \(\lambda >0\) and, after normalizing the multipliers, we conclude that the conditions of Theorem 4.1 hold with \(\lambda =1\). We summarize our findings in the following Corollary.
Corollary 4.1
Let \((\hat{x}, \hat{u})\) be the optimal solution to \((P_f)\). Suppose that assumptions A1–A6 are satisfied. For \(i=1,\cdots , I\), set
There exist \( p\in BV([0,T],R^n)\), finite signed Radon measures \( \eta ^i\), null in \(I^{i}_b\), for \(i=1,\cdots , I\), \( \xi ^{i}\in L^\infty ([0,T],R)\), with \(i=1,\cdots , I \), where \(\xi ^{i}(t) \ge 0\) for a.e. t and \(\xi ^{i}(t) = 0\) for \(t \in I^{i}_b\), such that
- a): \(\dot{ \hat{x}}(t)=f(t,\hat{x}(t),\hat{u}(t))- \displaystyle \sum _{i=1}^{I}\xi ^i(t)\nabla _x \hat{\psi }^{i} (t),\)
- b): for any \(z\in C([0,T];R^n)\)
$$\begin{aligned} \begin{array}{l} \displaystyle \int _0^T \langle z(t),\textrm{d}p(t)\rangle = -\displaystyle \int _0^T \langle z(t), ( \nabla _x \hat{f}(t))^*p(t)\rangle \textrm{d}t \\ \quad \displaystyle + \sum _{i=1}^{I}\left( \int _0^T \xi ^{i}(t) \langle z(t), \nabla ^2_x\hat{\psi }^{i}(t) p(t)\rangle \textrm{d}t +\int _0^T \langle z(t), \nabla _x \hat{\psi }^{i}(t)\rangle \textrm{d}\eta ^i\right) ,\end{array} \end{aligned}$$
where \( \nabla _x \hat{f}(t) = \nabla _x f(t,\hat{x}(t),\hat{u}(t))\), \(\nabla _x \hat{\psi }^i(t)=\nabla _x \psi ^i(t,\hat{x}(t))\) and \(\nabla ^2_x \hat{\psi }^i(t)=\nabla ^2_x \psi ^i(t, \hat{x}(t)),\)
- c): \(\xi ^i(t)\langle \nabla _x \psi ^{i}(t,\hat{x}(t)),p(t)\rangle =0\) for a.e. t and for all \(i=1, \ldots , I\),
- d): for all \(i=1, \ldots , I\), the measures \(\langle \nabla _x\psi ^i(t,\hat{x}(t)),p(t)\rangle \textrm{d}\eta ^i(t)\) are nonnegative,
- e): \(\displaystyle \langle p(t), f(t,\hat{x}(t),u)\rangle \le \langle p(t), f(t,\hat{x}(t),\hat{u}(t))\rangle \) for all \(u \in U\) and a.e. t,
- f): \(\displaystyle (p(0),-p(T))\in N_{C_0}(\hat{x}(0))\times \partial \phi (\hat{x}(T)).\)
5 Example
Let us consider the following problem (Fig. 2)
where
- \(0<\sigma \ll 1\),
- \(C=\{ (x,y,z)\mid x^2+y^2+(z+h)^2\le 1,\; x^2+y^2+(z-h)^2\le 1\},\;\; 2\,h^2<1 \),
- \((x_0,y_0,z_0)\in \textrm{int }C\), with \(x_0<-\delta \), \(y_0=0\) and \(z_0>0\),
- \(C_T=\{ (x,y,z)\mid x\le 0,\; y\ge 0,\; \delta y-y_2x\le \delta y_2\}\cap C\), where
$$\begin{aligned} \delta <\frac{y_2|x_0|}{y_1},~ \textrm{with } ~y_1=\sqrt{1-x_0^2-(z_0+h)^2} \text { and }y_2=\sqrt{1-h^2}. \end{aligned}$$
We choose \(T>0\) small and, nonetheless, sufficiently large to guarantee that, when \(\sigma =0\), the system can reach the interior of \(C_T\) but not the segment \(\{ (x,0,0)\mid x\in [-\delta ,0]\}\). Since \(\sigma \) and T are small, it follows that the optimal trajectory should reach \(C_T\) at the face \(\delta y-y_2x=\delta y_2\) of \(C_T\).
To significantly increase the value of x(T), the optimal trajectory needs to live on the boundary of C for some interval of time. Then, before reaching and after leaving the boundary of C, the optimal trajectory lives in the interior of C. Since \(\delta \) is small, the trajectory cannot reach \(C_T\) from any point of the sphere \(x^2+y^2+(z+h)^2=1\) with \(z>0\). This means that, while on the boundary of C, the trajectory should move on the sphere \(x^2+y^2+(z+h)^2=1\) until reaching the plane \(z=0\) and then it moves on the intersection of the two spheres.
While in the interior of C, the control can change sign from \(-1\) to 1 or from 1 to \(-1\). Certainly, the control should be 1 right before reaching the boundary and \(-1\) right before arriving at \(C_T\). Changes of the control from 1 to \(-1\) or \(-1\) to 1 before reaching the boundary translate into time waste and lead to smaller values of x(T). It then follows that the optimal control should be of the form
for some value \(\tilde{t}\in ]0,T[\).
After the modification (8), the data of the problem satisfy the conditions under which Theorem 4.1 holds. We now show that the conclusions of Theorem 4.1 completely identify the structure (33) of the optimal control.
From Theorem 4.1, we deduce the existence of \( \lambda \ge 0\), \( p,~q,~r \in BV([0,T],R)\), finite signed Radon measures \( \eta _1\) and \(\eta _2\), null, respectively, in
and
\( \xi _{i}\in L^\infty ([0,T],R)\), with \(i=1,2 \), where \(\xi _{i}(t) \ge 0\) for a.e. t and \(\xi _{i}(t) = 0\) for \(t \in I^{i}_b\), such that
where \(\hat{u}\) is the optimal control.
Let \(t_1\) be the instant of time when the trajectory reaches the sphere \(x^2+y^2+(z+h)^2=1\), \(t_2\) the instant of time when the trajectory reaches the intersection of the two spheres and \(t_3\) be the instant of time the trajectory leaves the boundary of C. We have \(0<t_1<t_2<t_3<T\).
Next, we show that the multiplier q changes sign only once, thereby identifying the structure (33) of the optimal control in a unique way. We start by looking at the case when \(t=T\). We have
Starting from \(t=T\), let us go backwards in time until the instant \(t_3\) when the trajectory leaves the boundary of C. If \(q(T)=0\), then \(p(T)=\lambda >0\) and we would have \(q(t)>0\) for \(t \in ]t_3,T[\) (see (ii) above), which is impossible. We then have \(p(T)>0\) and \(q(T)<0\) and, in \(]t_3,T[\), since \(\sigma \) is small, the vector (p(t), q(t)) does not change much. At \(t=t_3\), the vector (p, q) has a jump and such jump can only occur along the vector \((x(t_3),y(t_3))\). Therefore, we have \(p(t_3-0)>0\) and \(q(t_3-0)<0\).
Let us now consider \(t\in ]t_2,t_3[\). We have the following
1. when \( t\in [t_2,t_3]\), we have \(z=0\);
2. condition (i) implies that \(\xi _1=\xi _2=\xi \), \(\xi >0\) since, otherwise, the motion along \(x^2+y^2=1-h^2\) would not be possible;
3. from \(0=\frac{\hbox {d}}{\hbox {d}t}(x^2+y^2)=2\sigma xy-8\xi x^2+2uy-8\xi y^2\), we get \(\xi =\frac{\sigma xy+uy}{4(1-h^2)}\);
4. condition (iv) implies that \(r=0\) leading to \(xp+yq=0\). Since \(x<0\), \(y>0\), then \(q=0\) implies \(p=0\);
5. condition (ii) implies \(\hbox {d}\eta _1=\hbox {d}\eta _2=\hbox {d}\eta \);
6. \(0=\hbox {d}(xp+yq)=uq\hbox {d}t+4(1-h^2)\hbox {d}\eta \) \(\Rightarrow \) \(\frac{\hbox {d}\eta }{\hbox {d}t}=-\frac{uq}{4(1-h^2)}\);
7. from the above analysis, we deduce that
$$\begin{aligned} \dot{p}&= \frac{\sigma xy+uy}{(1-h^2)}~p-\frac{xuq}{(1-h^2)},\\ \dot{q}&=-\sigma p+\frac{\sigma xy}{(1-h^2)}~q. \end{aligned}$$
Thus, (p, q) is a solution to a linear system and it can never be equal to zero. It follows that q cannot be zero because \(q=0\) implies \(p=0\). Since \(q\ne 0\), we have \(q>0\).
Let us consider the case when \(t=t_2\). We claim that
Seeking a contradiction, assume that it is \((p(t_2-0),q(t_2-0))=(0,0)\). Then, we have
and such jump has to be normal to \((x(t_2),y(t_2))\) since \(r(t_2+0)=0\) (see (iv)). It follows that \((x^2(t_2)+y^2(t_2))(\hbox {d}\eta _1+ \hbox {d}\eta _2)=0\) and, since \(x^2(t_2)+y^2(t_2)>0\), we get \(\hbox {d}\eta _1+ \hbox {d}\eta _2=0\), proving our claim.
We now consider \( t\in ]t_1,t_2[\). It is easy to see that \(\xi _2=0\) and \(\hbox {d}\eta _2=0\). We also deduce that
1. \(0=\frac{\hbox {d}}{\hbox {d}t}(x^2+y^2+(z+h)^2)=2\sigma xy+2uy-4\xi _1 y^2-4\xi _1 x^2-4\xi _1(z+h)^2\), which implies that \(\xi _1=\frac{\sigma xy+uy}{2}\);
2. also \(0=\hbox {d}(xp+yq+(z+h)r)=uq\hbox {d}t+2\hbox {d}\eta _1\) implies that \(\frac{\hbox {d}\eta _1}{\hbox {d}t}=-\frac{uq}{2}\);
3. from the above, we deduce that
$$\begin{aligned} \dot{p}&=(\sigma xy+uy)p-xuq,\\ \dot{q}&=-\sigma p+\sigma xyq. \end{aligned}$$
Thus, \((p, q)\) is a solution to a linear system and is never equal to zero. The second equation implies that if \(q=0\), then \(\dot{q}\ne 0\). Hence, \(q>0\).
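The two steps used above can be written out as follows. This is a sketch under two assumptions that we read off from the text rather than from an explicit statement: that \(x^2+y^2+(z+h)^2=1\) on this part of the boundary (which is what the value of \(\xi _1\) in item 1 requires), and that \(\sigma \ne 0\). The instant \(t^*\) is a hypothetical time in \(]t_1,t_2[\) at which \(q\) vanishes.

```latex
\begin{aligned}
0 &= 2(\sigma xy + uy) - 4\xi_1\bigl(x^2+y^2+(z+h)^2\bigr)
   = 2(\sigma xy + uy) - 4\xi_1
  &&\Longrightarrow\quad \xi_1 = \frac{\sigma xy + uy}{2}, \\
q(t^*) &= 0,\ \ (p(t^*),q(t^*)) \neq (0,0)
  &&\Longrightarrow\quad p(t^*) \neq 0, \\
\dot q(t^*) &= -\sigma p(t^*) + \sigma x y\, q(t^*) = -\sigma p(t^*)
  &&\Longrightarrow\quad \dot q(t^*) \neq 0 .
\end{aligned}
```

Under these assumptions, every zero of \(q\) on this interval is isolated, so \(q\) cannot vanish on a set of positive measure.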
Now, we need to consider \(t=t_1\). We claim that
Seeking a contradiction, let us assume that \((p(t_1-0),q(t_1-0),r(t_1-0))=(0,0,0)\). It then follows that \((p(t_1+0),q(t_1+0),r(t_1+0))=(0,0,0)+(2x(t_1)\hbox {d}\eta _1, 2y(t_1)\hbox {d}\eta _1, 2(z(t_1)+h)\hbox {d}\eta _1)\). We now show that there is no such jump. Set \(r(t_1-0)=r_0\). Then, it follows from (iv) that \(x(t_1)\cdot 0+y(t_1)\cdot 0 +(z(t_1)+h)r_0=0\), which implies that \(r_0=0\). We also have \((x^2(t_1)+y^2(t_1)+(z(t_1)+h)^2)\hbox {d}\eta _1=0\) from (v). But this implies that \(\hbox {d}\eta _1=0\). Consequently, the multipliers do not exhibit a jump at \(t_1\).
From the previous analysis, we deduce that \(q\) should be positive almost everywhere on the boundary. It then follows that, to find the optimal solution, we have to analyze admissible trajectories corresponding to controls with the structure (33) and choose the optimal value of \(\tilde{t}\).
6 Conclusions
We proved necessary conditions for an optimal control problem involving sweeping processes with a nonsmooth, time-dependent sweeping set. The main feature of our work is the use of exponential penalization functions. We have successfully applied this approach in previous works on optimal control problems involving sweeping processes with a smooth set. In this work, to deal with the nonsmoothness of the sweeping set, we impose rather strong constraint qualifications. The weakening of these hypotheses will be the subject of future work.
Notes
See also Theorem 2.2 in [13]
References
Addy, K., Adly, S., Brogliato, B., Goeleven, D.: A method using the approach of Moreau and Panagiotopoulos for the mathematical formulation of non-regular circuits in electronics. Nonlinear Anal. Hybrid Syst 1, 30–43 (2013). https://doi.org/10.1016/j.nahs.2006.04.00
Adly, S., Nacry, F., Thibault, L.: Preservation of prox-regularity of sets with applications to constrained optimization. SIAM J. Optim. 26, 448–473 (2016). https://doi.org/10.1137/15M1032739
Arroud, C., Colombo, G.: A maximum principle for the controlled sweeping process. Set Valued Var. Anal 26, 607–629 (2018). https://doi.org/10.1007/s11228-017-0400-4
Brokate, M., Krejčí, P.: Optimal control of ODE systems involving a rate independent variational inequality. Discrete Contin. Dyn. Syst. Ser. B 18(2), 331–348 (2013). https://doi.org/10.3934/dcdsb.2013.18.33
Cao, T.H., Mordukhovich, B.: Optimality conditions for a controlled sweeping process with applications to the crowd motion model. Discrete Contin. Dyn. Syst. Ser. B 22, 267–306 (2017). https://doi.org/10.3934/dcdsb.2017014
Cao, T.H., Colombo, G., Mordukhovich, B., Nguyen, D.: Optimization of fully controlled sweeping processes. J. Differ. Equ. 295, 138–186 (2021). https://doi.org/10.1016/j.jde.2021.05.042
Clarke, F.: Optimization and Nonsmooth Analysis. Wiley, New York (1983)
Colombo, G., Palladino, M.: The minimum time function for the controlled Moreau’s sweeping process. SIAM J. Control. Optim. 54(4), 2036–2062 (2016). https://doi.org/10.1137/15M1043364
Colombo, G., Henrion, R., Hoang, N.D., Mordukhovich, B.S.: Optimal control of the sweeping process over polyhedral controlled sets. J. Differ. Equ. 260(2), 3397–3447 (2016). https://doi.org/10.1016/j.jde.2015.10.039
de Pinho, M.R., Ferreira, M.M.A., Smirnov, G.: Optimal control involving sweeping processes. Set Valued Var. Anal 27, 523–548 (2019). https://doi.org/10.1007/s11228-018-0501-8
de Pinho, M.R., Ferreira, M.M.A., Smirnov, G.: Correction to: optimal control involving sweeping processes. Set Valued Var. Anal 27, 1025–1027 (2019). https://doi.org/10.1007/s11228-019-00520-5
de Pinho, M.R., Ferreira, M.M.A., Smirnov, G.: Optimal control with sweeping processes: numerical method. J. Optim. Theory Appl. 185, 845–858 (2020). https://doi.org/10.1007/s10957-020-01670-5
de Pinho, M.R., Ferreira, M.M.A., Smirnov, G.: Necessary conditions for optimal control problems with sweeping systems and end point constraints. Optimization 71(11), 3363–3381 (2022). https://doi.org/10.1080/02331934.2022.2101111
Hermosilla, C., Palladino, M.: Optimal control of the sweeping process with a non-smooth moving set. SIAM J. Control. Optim. 60(5), 2811–2834 (2022). https://doi.org/10.1137/21M1405472
Kunze, M., Monteiro Marques, M.D.P.: An Introduction to Moreau’s sweeping process. In: Brogliato, B. (ed.) Impacts in Mechanical Systems Lecture Notes in Physics, vol. 551. Springer, Berlin (2000). https://doi.org/10.1007/3-540-45501-9_1
Maury, B., Venel, J.: A discrete contact model for crowd motion. ESAIM M2AN 45(1), 145–168 (2011). https://doi.org/10.1051/m2an/2010035
Moreau, J.J.: On unilateral constraints, friction and plasticity. In: Capriz, G., Stampacchia, G. (eds.) New Variational Techniques in Mathematical Physics, CIME ciclo Bressanone 1973, pp. 171–322. Edizioni Cremonese, Rome (1974). https://doi.org/10.1007/978-3-642-10960-7_7
Mordukhovich, B.: Variational Analysis and Generalized Differentiation I: Basic Theory. In: Fundamental Principles of Mathematical Sciences, vol. 330. Springer, Berlin (2006). https://doi.org/10.1007/3-540-31247-1
Sene, M., Thibault, L.: Regularization of dynamical systems associated with prox-regular moving sets. J. Nonlinear Convex Anal. 15(4), 647–663 (2014)
Tallos, P.: Viability problems for nonautonomous differential inclusions. SIAM J. Control. Optim. 29(2), 253–263 (1991). https://doi.org/10.1137/0329014
Thibault, L.: Moreau sweeping process with bounded truncated retraction. J. Convex Anal. 23, 1051–1098 (2016)
Vinter, R.B.: Optimal Control. Systems & Control: Foundations & Applications. Birkhäuser, Boston (2000)
Zeidan, V., Nour, C., Saoud, H.: A nonsmooth maximum principle for a controlled nonconvex sweeping process. J. Differ. Equ. 269(11), 9531–9582 (2020). https://doi.org/10.1016/j.jde.2020.06.053
Acknowledgements
The authors gratefully acknowledge the support of the Portuguese Foundation for Science and Technology (FCT) in the framework of the Strategic Funding UIDB/04650/2020. We also acknowledge the support of the ERDF (European Regional Development Fund) through the Operational Programme for Competitiveness and Internationalisation (COMPETE 2020), INCO.2030, under the Portugal 2020 Partnership Agreement, and of National Funds, Norte 2020, through CCDRN and FCT, within projects To Chair (POCI-01-0145-FEDER-028247), Upwind (PTDC/EEI-AUT/31447/2017 - POCI-01-0145-FEDER-031447) and Systec R&D unit (UIDB/00147/2020).
Funding
Open access funding provided by FCT|FCCN (b-on).
Additional information
Communicated by Boris S. Mordukhovich.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Pinho, M.d.R.d., Ferreira, M.M.A. & Smirnov, G. A Maximum Principle for Optimal Control Problems Involving Sweeping Processes with a Nonsmooth Set. J Optim Theory Appl 199, 273–297 (2023). https://doi.org/10.1007/s10957-023-02283-4
DOI: https://doi.org/10.1007/s10957-023-02283-4