1 Introduction

The stochastic maximum principle and the characterization of optimal controls via Pontryagin’s maximum principle have been studied in the classical case by many authors, such as Bismut [9] and Bensoussan [8]. This principle was extended to more general cases such as controlled McKean–Vlasov systems, in which both the drift and the diffusion coefficients of the dynamics depend at time t on the solution X(t) and on its law \(P_{X(t)}\), as follows

$$\begin{aligned} \left\{ \begin{array} [c]{ll} dX(t) &{}= b(t,X(t),P_{X(t)},u(t))dt+\sigma (t,X(t),P_{X(t)},u(t))dB(t), \quad t\in [0,T],\\ X(0) &{}= x. \end{array} \right. \end{aligned}$$

We refer for example to Carmona and Delarue [13, 14], Buckdahn et al. [10], Agram et al. [2,3,4].

In this paper, we want to extend Pontryagin’s stochastic maximum principle to a more general case, the case of controlled McKean–Vlasov SDE with anticipating law, i.e., our dynamics are assumed to satisfy for a given positive constant \(\delta \), the equation

$$\begin{aligned} \left\{ \begin{array} [c]{ll} dX(t) &{} =b(t,X(t),P_{X(t+\delta )},u(t))dt+\sigma (t,X(t),P_{X(t+\delta )},u(t))dB(t), \quad t\in [0,T],\\ X(0) &{} =x,\\ X(t) &{} =X(T); \quad t\ge T, \end{array} \right. \end{aligned}$$
(1.1)

where u(t) is our control process. Here the coefficients depend at time t on both the solution X(t) and the law \(P_{X(t+\delta )}\) of the anticipating solution. Note that although the SDE (1.1) is anticipative w.r.t. the law of the solution process, it does not anticipate the driving Brownian motion B.

The performance functional for a control u is given by

$$\begin{aligned} J\left( u\right) ={{\mathbb {E}}}\Bigg [g\Bigg (X(T) ,P_{X(T)}\Bigg ) + \int _{0}^{T} l\Bigg (s,X(s) ,P_{X(s+\delta )},u(s)\Bigg ) ds\Bigg ], \end{aligned}$$

for some given bounded functions g and l. We want to maximize this performance over the set \({\mathcal {U}}\) of admissible control processes (to be specified later), that is, to find \(u^{*}\in {\mathcal {U}}\) such that \(J(u^{*})=\sup _{u\in {\mathcal {U}}}J\left( u\right) \).

We define the Hamiltonian H associated to this problem to be

$$\begin{aligned} H(t,x,\mu ,u,p,q)=l(t,x,\mu ,u)+b(t,x,\mu ,u)p+\sigma (t,x,\mu ,u)q, \end{aligned}$$

and we show that the pair \((p,q)\) is the solution of the adjoint backward stochastic differential equation given by

$$\begin{aligned} dp(t)= & {} -\,\{(\partial _{x}b)(t,X^{*}\left( t\right) ,P_{X^{^{*}}\left( t+\delta \right) },u^{*}\left( t\right) )p(t)\nonumber \\&+\,(\partial _{x}\sigma )(t,X^{*}\left( t\right) ,P_{X^{^{*}}\left( t+\delta \right) },u^{*}\left( t\right) )q(t)+\left( \partial _{x}l\right) (t,X^{*}(t),P_{X^{*}(t+\delta )},u^{*}(t))\nonumber \\&+\,{\tilde{{{\mathbb {E}}}}}[(\partial _{\mu }b)(t-\delta ,{\tilde{X}}^{*}\left( t-\delta \right) ,P_{X^{*}\left( t\right) },X^{*}\left( t\right) ,{\tilde{u}}^{*}\left( t-\delta \right) ){\tilde{p}}\left( t-\delta \right) ]I_{\left[ \delta ,T\right] }\left( t\right) \nonumber \\&+\,{\tilde{{{\mathbb {E}}}}}[(\partial _{\mu }\sigma )(t-\delta ,{\tilde{X}}^{*}\left( t-\delta \right) ,P_{X^{*}\left( t\right) },X^{*}\left( t\right) ,{\tilde{u}}^{*}\left( t-\delta \right) ){\tilde{q}}\left( t-\delta \right) ]I_{\left[ \delta ,T\right] }\left( t\right) \nonumber \\&+\,{\tilde{{{\mathbb {E}}}}}[(\partial _{\mu }l)(t-\delta ,{\tilde{X}}^{*}\left( t-\delta \right) ,P_{X^{*}\left( t\right) },X^{*}\left( t\right) ,{\tilde{u}}^{*}\left( t-\delta \right) )]I_{\left[ \delta ,T\right] }\left( t\right) \}dt\nonumber \\&+\,q(t)dB(t)\text {, }t\in \left[ 0,T\right] , \end{aligned}$$
(1.2)

with terminal condition

$$\begin{aligned} p(T)= & {} (\partial _{x}g)(X^{*}(T),P_{X^{*}(T)})+{\tilde{{{\mathbb {E}}}}} [(\partial _{\mu }g)({\tilde{X}}^{*}(T),P_{X^{*}(T)},X^{*}(T))] \nonumber \\&+ \,\int _{T-\delta }^{T} ({\tilde{{{\mathbb {E}}}}}[(\partial _{\mu }b)(t,{\tilde{X}}^{*}\left( t\right) ,P_{X^{*}\left( T\right) },X^{*}\left( T\right) ,{\tilde{u}}^{*}\left( t\right) ){\tilde{p}}\left( t\right) ] \nonumber \\&+\,{\tilde{{{\mathbb {E}}}}}[(\partial _{\mu }\sigma )(t,{\tilde{X}}^{*}\left( t\right) ,P_{X^{*}\left( T\right) },X^{*}\left( T\right) ,\tilde{u}^{*}\left( t\right) ){\tilde{q}}\left( t\right) ] \nonumber \\&+\,{\tilde{{{\mathbb {E}}}}}[(\partial _{\mu }l)(t,{\tilde{X}}^{*}(t),P_{X^{*} (T)},X^{*}(T),{\tilde{u}}^{*}(t))])dt. \end{aligned}$$
(1.3)

The adjoint equation (1.2), (1.3) is a new type of delayed McKean–Vlasov BSDE with implicit terminal condition, i.e., its terminal value is a function of the law of the solution itself. In order to write it in a more comprehensible form, we use the fact that the expectation of any random variable is a function of its law. Under suitable assumptions on both the driver and the terminal value, we obtain existence and uniqueness for this delayed BSDE, which generalises the adjoint equation of the control problem described above.

Pontryagin’s stochastic maximum principle, under both partial and complete information, in the case where the mean-field term is expressed by the expected value of the state, has been studied for example by Andersson and Djehiche [7], Hu et al. [5, 18] and Agram and Røse [6].

For more general mean-field problems, we refer to Lions [19], Cardaliaguet [12] and Buckdahn et al. [11].

Delayed BSDEs have been studied by Delong and Imkeller [15]; the same authors later extended them to the jump case and studied them with the help of the Malliavin calculus [16]. For more details, we refer to Delong’s book [17].

For mean-field delayed BSDE, we refer to Agram [1].

To the best of our knowledge our paper is the first to study optimal control problems of mean-field SDEs with anticipating law.

The paper is organized as follows: in the next section, we give some preliminaries which will be used throughout this work. In Sect. 3, existence and uniqueness of solutions of McKean–Vlasov SDEs with anticipating law are investigated. Section 4 is devoted to the study of Pontryagin’s stochastic maximum principle. In the last section, we prove existence and uniqueness for the associated delayed McKean–Vlasov BSDEs with implicit terminal condition.

This work has been presented at seminars and conferences in Brest, Biskra, Marrakech, Mans and Oslo.

2 Framework

We introduce some notations, definitions and spaces which will be used throughout this work. Let \(\left( \Omega ,{\mathcal {F}},P\right) \) be a complete probability space, B a d-dimensional Brownian motion and \({\mathbb {F}}=\left( {\mathcal {F}}_{t}\right) _{t\ge 0}\) the Brownian filtration generated by B and completed by all P-null sets. Let \({\mathcal {P}} _{2}({\mathbb {R}}^{d}):=\{\mu \in {\mathcal {P}}({\mathbb {R}}^{d}): \int _{{\mathbb {R}^{d}}} |x|^{2}\mu (dx)<+\infty \},\) where \({\mathcal {P}}({\mathbb {R}}^{d})\) is the space of all the probability measures on \(({\mathbb {R}}^{d},{\mathcal {B}}({\mathbb {R}}^{d} ))\); recall that \({\mathcal {B}}({\mathbb {R}}^{d})\) denotes the Borel \(\sigma \)-field over \({\mathbb {R}}^{d}\). We endow \({\mathcal {P}}_{2}({\mathbb {R}}^{d})\) with the 2-Wasserstein metric \(W_{2}\) on \({\mathcal {P}}_{2}({\mathbb {R}}^{d})\): For \(\mu _{1},\mu _{2}\in {\mathcal {P}}_{2}({\mathbb {R}}^{d})\), the 2-Wasserstein distance is defined by

$$\begin{aligned} W_{2}(\mu _{1},\mu _{2})= & {} \inf \Bigg \{\Bigg ( \int _{{\mathbb {R}}^{d}\times {\mathbb {R}}^{d}} \vert x-y \vert ^{2}\mu (dx,dy)\Bigg )^{\frac{1}{2}}:\mu \in {\mathcal {P}} _{2}({\mathbb {R}}^{d}\times {\mathbb {R}}^{d})\\&\quad \text { with } \mu ({\cdot }\times {\mathbb {R}}^{d})=\mu _{1},\text { }\mu ({\mathbb {R}}^{d}\times {\cdot })=\mu _{2}\Bigg \}{.} \end{aligned}$$

We also remark that, if \((\Omega ,{\mathcal {F}},P)\) is “rich enough” in the sense that

$$\begin{aligned} {\mathcal {P}}_{2}({\mathbb {R}}^{d}\times {\mathbb {R}}^{d})=\left\{ P_{\zeta }\text {, }\zeta \in L^{2}\left( \Omega ,{\mathcal {F}},P;{\mathbb {R}}^{d}\times {\mathbb {R}} ^{d}\right) \right\} , \end{aligned}$$

then we also have

$$\begin{aligned} W_{2}(\mu _{1},\mu _{2}) =\inf \Bigg \{\Big ({{\mathbb {E}}}\Big [\vert \zeta -\eta \vert ^{2}\Big ]\Big )^{\frac{1}{2}}\text {, }\zeta \text {, }\eta \in L^{2}\Bigg ( \Omega ,{\mathcal {F}},P;{\mathbb {R}}^{d}\Bigg ) \text { with } P_{\zeta }=\mu _{1}\text {, }P_{\eta }=\mu _{2}\Bigg \}{.} \end{aligned}$$
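To make the coupling formulation concrete, here is a minimal Python sketch for the special case \(d=1\) and two empirical measures with the same number of equally weighted atoms. It relies on the standard fact that, on the real line, the monotone (sorted) coupling is optimal for the quadratic cost. The function names `w2_empirical_1d` and `l2_distance` are illustrative and do not come from the paper.

```python
import math


def w2_empirical_1d(xs, ys):
    """2-Wasserstein distance between two empirical measures on the real line.

    Assumes both samples have the same size n and uniform weights 1/n.
    In one dimension the sorted (monotone) coupling is optimal for the
    quadratic cost, so no optimisation over couplings is needed.
    """
    if len(xs) != len(ys) or not xs:
        raise ValueError("samples must be non-empty and of equal size")
    paired = zip(sorted(xs), sorted(ys))
    mean_sq = sum((x - y) ** 2 for x, y in paired) / len(xs)
    return math.sqrt(mean_sq)


def l2_distance(xs, ys):
    """(E|zeta - eta|^2)^(1/2) for the coupling that pairs xs[i] with ys[i]."""
    if len(xs) != len(ys) or not xs:
        raise ValueError("samples must be non-empty and of equal size")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(xs, ys)) / len(xs))


if __name__ == "__main__":
    zeta = [0.3, -1.2, 2.5, 0.0]
    eta = [1.0, 0.4, -0.7, 3.1]
    # Any particular coupling gives an upper bound on W_2.
    print(w2_empirical_1d(zeta, eta), l2_distance(zeta, eta))
```

Any fixed pairing of the samples is one admissible pair \((\zeta ,\eta )\) in the infimum, so `l2_distance` always dominates `w2_empirical_1d`. This is the same bound \(W_{2}^{2}(P_{\zeta },P_{\eta })\le {\mathbb {E}}[|\zeta -\eta |^{2}]\) that is used in the contraction argument of Sect. 3.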

Let \(({\tilde{\Omega }},{\tilde{{{\mathcal {F}}}}},{\tilde{P}}) :=(\Omega ,{\mathcal {F}},P) \), \(\mathbb {{\tilde{F}}}:={\mathbb {F}}\) and \(({\bar{\Omega }},{\bar{{{\mathcal {F}}}}},{\bar{P}}) =( \Omega ,{\mathcal {F}} ,P) \otimes ( {\tilde{\Omega }},{\tilde{{{\mathcal {F}}}}},{\tilde{P}}) \). For any measurable space \((E,{\mathcal {E}}) \) and any random variable \(\zeta :(\Omega ,{\mathcal {F}},P) \rightarrow (E,{\mathcal {E}}) \), we put \({\tilde{\zeta }}( {\tilde{\omega }}) :=\zeta ({\tilde{\omega }})\), \({\tilde{\omega }}\in {\tilde{\Omega }}=\Omega \), and \(\zeta (\omega ,{\tilde{\omega }}) :=\zeta (\omega ) \), \({\tilde{\zeta }}(\omega ,{\tilde{\omega }}) :={\tilde{\zeta }}({\tilde{\omega }}) \), \((\omega ,{\tilde{\omega }}) \in \Omega \times {\tilde{\Omega }}\). We observe that \({\tilde{\zeta }}\) on \(({\tilde{\Omega }},{\tilde{{{\mathcal {F}}}}},{\tilde{P}}) \) is a copy of \(\zeta \) on \((\Omega ,{\mathcal {F}},P) \), and \(\zeta \), \({\tilde{\zeta }}\) are i.i.d. under \({\bar{P}}\). Moreover, for random variables \(\zeta \), \(\eta :(\Omega ,{\mathcal {F}},P) \rightarrow (E,{\mathcal {E}}) \) and a bounded measurable function \(\varphi : (E^{2},{\mathcal {E}}^{2}) \rightarrow ( {\mathbb {R}},{\mathcal {B}}( {\mathbb {R}} )) \), we have

$$\begin{aligned} {\tilde{{{\mathbb {E}}}}}[\varphi ({\tilde{\zeta }},\eta )](\omega )= & {} \int _{{\tilde{\Omega }}} \varphi ({\tilde{\zeta }}\left( {\tilde{\omega }}\right) ,\eta \left( \omega \right) ){\tilde{P}}\left( d{\tilde{\omega }}\right) \\= & {} {\mathbb {E}}\left[ \varphi \left( \zeta ,y\right) \right] \Big |_{y=\eta \left( \omega \right) }. \end{aligned}$$

We recall now the notion of derivative of a function \(\varphi :{\mathcal {P}} _{2}({\mathbb {R}}^{d})\rightarrow {\mathbb {R}}\) w.r.t. a probability measure \(\mu \), which was studied by Lions in his course at Collège de France [19]; see also the notes of Cardaliaguet [12] and the works by Carmona and Delarue [14] and Buckdahn et al. [11]. We say that \(\varphi \) is differentiable at \(\mu \) if, for the lifted function \({\tilde{\varphi }}(\zeta ):=\varphi (P_{\zeta })\), \(\zeta \in L^{2}\left( \Omega ,{\mathcal {F}},P;{\mathbb {R}}^{d}\right) \), there is some \(\zeta _{0}\in L^{2}\left( \Omega ,{\mathcal {F}},P;{\mathbb {R}}^{d}\right) \) with \(P_{\zeta _{0}}=\mu \), such that \({\tilde{\varphi }}\) is differentiable in the Fréchet sense at \(\zeta _{0}\), i.e., there exists a continuous linear mapping \(D{\tilde{\varphi }}(\zeta _{0}):L^{2}(\Omega ,{\mathcal {F}} ,P;{\mathbb {R}}^{d}) \rightarrow {\mathbb {R}}\) (that is, \(D{\tilde{\varphi }}(\zeta _{0})\in L( L^{2}( \Omega ,{\mathcal {F}},P;{\mathbb {R}}^{d}) ;{\mathbb {R}}) \)) such that

$$\begin{aligned} \varphi (P_{\zeta _{0}+\eta })-\varphi (P_{\zeta _{0}})= & {} {\tilde{\varphi }} (\zeta _{0}+\eta )-{\tilde{\varphi }}(\zeta _{0})\\= & {} (D{\tilde{\varphi }})(\zeta _{0})(\eta )+o(\left| \eta \right| _{L^{2}\left( \Omega \right) }^{2}), \end{aligned}$$

for \(\left| \eta \right| _{L^{2}(\Omega ) }^{2} \rightarrow 0\), \(\eta \in L^{2}(\Omega ,{\mathcal {F}},P;{\mathbb {R}} ^{d}).\) With the identification that \(L(L^{2}(\Omega ,{\mathcal {F}},P;{\mathbb {R}}^{d}) ;{\mathbb {R}}) \equiv L^{2}(\Omega ,{\mathcal {F}},P;{\mathbb {R}}^{d})\), given by Riesz’ representation theorem, we can write

$$\begin{aligned} \varphi (P_{\zeta _{0}+\eta })-\varphi (P_{\zeta _{0}})={\mathbb {E}}\left[ (D{\tilde{\varphi }})(\zeta _{0})\cdot \eta \right] +o(\left| \eta \right| _{L^{2}\left( \Omega \right) }^{2})\text {, }\eta \in L^{2}\left( \Omega ,{\mathcal {F}},P;{\mathbb {R}}^{d}\right) {.} \end{aligned}$$

In Lions [19] and Cardaliaguet [12], it has been proved that there exists a Borel function \(h:{\mathbb {R}}\rightarrow {\mathbb {R}}\), such that \((D{\tilde{\varphi }})(\zeta _{0})=h(\zeta _{0})\) P-a.s. Note that \(h(\zeta _{0})\) is P-a.s. uniquely determined. Consequently, h(y) is \(P_{\zeta _{0}}(dy)\)-a.e. uniquely determined. We define

$$\begin{aligned} (\partial _{\mu }\varphi )(P_{\zeta _{0}},y):=h(y)\text {, } \quad y\in {\mathbb {R}}{.} \end{aligned}$$

Hence

$$\begin{aligned} \varphi (P_{\zeta _{0}+\eta })-\varphi (P_{\zeta _{0}})={\mathbb {E}}\Bigg [ (\partial _{\mu }\varphi )(P_{\zeta _{0}},\zeta _{0})\cdot \eta \Bigg ] +o\Bigg (\left| \eta \right| _{L^{2}(\Omega ) }^{2}\Bigg )\text {, } \quad \left| \eta \right| _{L^{2}\left( \Omega \right) }^{2}\rightarrow 0{.} \end{aligned}$$

Example 2.1

Consider the function \(\varphi (P_{\zeta })=g\left( {\mathbb {E}}\left[ f(\zeta )\right] \right) \), with \(g,f\in C_{l,b}^{1}({\mathbb {R}})\) and \(\zeta \in L^{2}( \Omega ,{\mathcal {F}},P;{\mathbb {R}}^{d})\). Then

$$\begin{aligned} {\mathbb {E}}\left[ (\partial _{\mu }\varphi )(P_{\zeta },\zeta )\cdot \eta \right]= & {} \underset{\lambda \rightarrow 0}{\lim }\frac{\varphi (P_{\zeta +\lambda \eta })-\varphi (P_{\zeta })}{\lambda }\\= & {} \underset{\lambda \rightarrow 0}{\lim }\frac{g\left( {\mathbb {E}}\left[ f(\zeta +\lambda \eta )\right] \right) -g\left( {\mathbb {E}}\left[ f(\zeta )\right] \right) }{\lambda }\\= & {} g^{\prime }\left( {\mathbb {E}}\left[ f(\zeta )\right] \right) {\mathbb {E}}\left[ f^{\prime }(\zeta )\cdot \eta \right] \\= & {} {\mathbb {E}}\left[ g^{\prime }\left( {\mathbb {E}}\left[ f(\zeta )\right] \right) f^{\prime }(\zeta )\cdot \eta \right] , \quad \text {for all }\eta \in L^{2}\left( \Omega ,{\mathcal {F}},P;{\mathbb {R}}^{d}\right) {.} \end{aligned}$$
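Since the identity in Example 2.1 holds for all \(\eta \), one can read off \((\partial _{\mu }\varphi )(P_{\zeta },y)=g^{\prime }\left( {\mathbb {E}}\left[ f(\zeta )\right] \right) f^{\prime }(y)\). The Python sketch below checks the directional-derivative formula numerically on an empirical law with \(d=1\). The choices \(f=\sin \), \(g=\tanh \), the sample size and the step `lam` are illustrative assumptions, not taken from the paper.

```python
import math
import random


def phi(sample, f, g):
    """phi(P_zeta) = g(E[f(zeta)]), with E taken under the empirical law of `sample`."""
    return g(sum(f(z) for z in sample) / len(sample))


def closed_form_derivative(sample, direction, f, f_prime, g_prime):
    """g'(E[f(zeta)]) * E[f'(zeta) * eta], the closed form from Example 2.1."""
    n = len(sample)
    mean_f = sum(f(z) for z in sample) / n
    inner = sum(f_prime(z) * e for z, e in zip(sample, direction)) / n
    return g_prime(mean_f) * inner


def finite_difference(sample, direction, f, g, lam=1e-6):
    """(phi(P_{zeta + lam*eta}) - phi(P_zeta)) / lam for a small step lam."""
    shifted = [z + lam * e for z, e in zip(sample, direction)]
    return (phi(shifted, f, g) - phi(sample, f, g)) / lam


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the run is reproducible
    zeta = [rng.gauss(0.0, 1.0) for _ in range(1000)]
    eta = [rng.gauss(0.0, 1.0) for _ in range(1000)]
    g_prime = lambda x: 1.0 - math.tanh(x) ** 2
    exact = closed_form_derivative(zeta, eta, math.sin, math.cos, g_prime)
    approx = finite_difference(zeta, eta, math.sin, math.tanh)
    print(exact, approx)
```

The two printed numbers should agree up to the finite-difference error, which is of order `lam` for smooth \(f\) and \(g\).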

Throughout this work, we will use also the following spaces:

  • \(S_{{\mathbb {F}}}^{2}([0,T])\) is the set of real valued \({\mathbb {F}} \)-adapted continuous processes \((X(t))_{t\in [0,T]}\) such that

    $$\begin{aligned} {\Vert X\Vert }_{S_{{\mathbb {F}}}^{2}}^{2}:={{\mathbb {E}}}\Bigg [\sup _{t\in [0,T]}|X(t)|^{2}\Bigg ]<\infty . \end{aligned}$$
  • \(L_{{\mathbb {F}}}^{2}([0,T])\) is the set of real valued \({\mathbb {F}} \)-adapted processes \((Q(t))_{t\in [0,T]}\) such that

    $$\begin{aligned} \Vert Q\Vert _{L_{{\mathbb {F}}}^{2}}^{2}:={{\mathbb {E}}}\Bigg [ \int _{0}^{T} |Q(t)|^{2}dt\Bigg ]<\infty . \end{aligned}$$
  • \(L^{2}({\mathcal {F}}_{t})\) is the set of real valued square integrable \({\mathcal {F}}_{t}\)-measurable random variables.

3 Solvability of the anticipated forward McKean–Vlasov equations

Let us consider the following anticipated SDE for a given positive constant \(\delta \)

$$\begin{aligned} \left\{ \begin{array} [c]{l} dX(t) =\sigma \left( t,X(t),P_{X\left( t+\delta \right) }\right) dB(t)+b\left( t,X(t),P_{X\left( t+\delta \right) }\right) dt\text {, } \quad t\in \left[ 0,T\right] ,\\ X\left( 0\right) =x\in {\mathbb {R}} ^{d},\\ X(t) =X(T)\text {,} \quad t\ge T{.} \end{array} \right. \end{aligned}$$
(3.1)

The functions \(\sigma :\left[ 0,T\right] \times \Omega \times {\mathbb {R}} ^{d}\times {\mathcal {P}}_{2}\left( {\mathbb {R}} ^{d}\right) \rightarrow {\mathbb {R}} ^{d\times d}\) and \(b:\left[ 0,T\right] \times \Omega \times {\mathbb {R}} ^{d}\times {\mathcal {P}}_{2}\left( {\mathbb {R}} ^{d}\right) \rightarrow {\mathbb {R}} ^{d}\) are progressively measurable and are assumed to satisfy the following set of assumptions.

Assumptions (H.1): There exists \(C>0\), such that

  1. For all \(t\in \left[ 0,T\right] \), \(x,x^{\prime }\in {\mathbb {R}} ^{d},\mu ,\mu ^{\prime }\in {\mathcal {P}}_{2}\left( {\mathbb {R}} ^{d}\right) \)

    $$\begin{aligned} \left| \sigma \left( t,x,\mu \right) -\sigma \left( t,x^{\prime } ,\mu ^{\prime }\right) \right| +\left| b\left( t,x,\mu \right) -b\left( t,x^{\prime },\mu ^{\prime }\right) \right| \le C\left( \left| x-x^{\prime }\right| +W_{2}\left( \mu ,\mu ^{\prime }\right) \right) . \end{aligned}$$
  2. For all \(t\in \left[ 0,T\right] \)

    $$\begin{aligned} \left| \sigma \left( t,0,P_{0}\right) \right| +\left| b\left( t,0,P_{0}\right) \right| \le C, \end{aligned}$$

    where \(P_{0}\) is the distribution law of zero, i.e., the Dirac measure with mass at zero.

Remark 3.1

Note that Assumption (H.1) implies that the coefficients b and \(\sigma \) are of linear growth. Indeed we have

$$\begin{aligned} \left| \sigma \left( t,x,\mu \right) \right|\le & {} \left| \sigma \left( t,0,P_{0}\right) \right| +\left| \sigma \left( t,x,\mu \right) -\sigma \left( t,0,P_{0}\right) \right| \\\le & {} C\left( 1+\left| x\right| +W_{2}\left( \mu ,P_{0}\right) \right) \\= & {} C\left( 1+\left| x\right| + \left( \int _{{\mathbb {R}}^{d}} \left| y\right| ^{2}\mu \left( dy\right) \right) ^{\frac{1}{2}}\right) , \quad \left( t,x,\mu \right) \in \left[ 0,T\right] \times {\mathbb {R}} ^{d}\times {\mathcal {P}}_{2}\left( {\mathbb {R}} ^{d}\right) , \end{aligned}$$

and a similar estimate holds for b.

Proposition 3.2

Under the above Assumption (H.1), there is some \(\delta _{0}>0\), such that for all \(\delta \in \left( 0,\delta _{0}\right] \), there exists a unique solution \(X\in S_{{\mathbb {F}}}^{2}\left( \left[ 0,T\right] \right) \) of SDE (3.1); \(\delta _{0}\) depends only on the Lipschitz constant C of the coefficients b and \(\sigma \) (see (H.1)) but not on the coefficients themselves.

Proof

For \(U\in S_{{\mathbb {F}}}^{2}\left( \left[ 0,T\right] \right) ,\) we can make the identification with the continuous process

$$\begin{aligned} \left( U\left( t\wedge T\right) \right) _{t\in \left[ 0,T+\delta \right] }\equiv (\left( U\left( t\right) \right) _{t\in \left[ 0,T\right] },U\left( T\right) )\in L_{{\mathbb {F}}}^{2}\left( \left[ 0,T\right] \right) \times L^{2}\left( {\mathcal {F}}_{T}\right) :=H. \end{aligned}$$

Given \(U\in H\), we put

$$\begin{aligned} V\left( t\right) :=x+ \int _{0}^{t} \sigma \left( s,U\left( s\right) ,P_{U\left( s+\delta \right) }\right) dB\left( s\right) + \int _{0}^{t} b\left( s,U\left( s\right) ,P_{U\left( s+\delta \right) }\right) ds, \quad t\in \left[ 0,T\right] . \end{aligned}$$

Then \(V\in S_{{\mathbb {F}}}^{2}\left( \left[ 0,T\right] \right) \subset H\) (with the above identification), and setting \(\Phi \left( U\right) :=V\) we define a mapping \(\Phi :H\rightarrow H.\) Fixing \(\beta >0\) (\(\beta \) will be specified later), we introduce the norm

$$\begin{aligned} \left\| U\right\| _{-\beta }^{2}:={{\mathbb {E}}}\left[ e^{-\beta T}\left| U\left( T\right) \right| ^{2}\right] +\tfrac{6}{7}\beta {{\mathbb {E}}}\left[ \int _{0}^{T} e^{-\beta s}\left| U\left( s\right) \right| ^{2}ds\right] \text {, } \quad U\in H. \end{aligned}$$

Obviously, \((H,\left\| \cdot \right\| _{-\beta })\) is a Banach space, and the norm \(\left\| \cdot \right\| _{-\beta }\) is equivalent to the norm \(\left\| \cdot \right\| _{0}\) (obtained from \(\left\| \cdot \right\| _{-\beta }\) by taking \(\beta =0\)). We are going to prove that \(\Phi :(H,\left\| \cdot \right\| _{-\beta })\rightarrow (H,\left\| \cdot \right\| _{-\beta })\) is contracting. Indeed, we consider arbitrary \(U^{i}\in H\), \(i=1,2,\) and we put \(V^{i}:=\Phi ( U^{i}) \), \(i=1,2\). Let \({\bar{U}}:=U^{1}-U^{2}\) and \({\bar{V}}:=V^{1}-V^{2}.\) Then, applying Itô’s formula to \((e^{-\beta t}\left| {\bar{V}}\left( t\right) \right| ^{2})_{t\ge 0}\), we get from the Assumptions (H.1)

$$\begin{aligned}&{{\mathbb {E}}}\left[ e^{-\beta t}\left| {\bar{V}}\left( t\right) \right| ^{2}\right] +{{\mathbb {E}}}\left[ \int _{0}^{t} \beta e^{-\beta s}\left| {\bar{V}}\left( s\right) \right| ^{2}ds\right] \\&\quad =2{{\mathbb {E}}}\left[ \int _{0}^{t} e^{-\beta s}{\bar{V}}\left( s\right) (b(s,U^{1}\left( s\right) ,P_{U^{1}\left( s+\delta \right) })-b(s,U^{2}\left( s\right) ,P_{U^{2} \left( s+\delta \right) }))ds\right] \\&\qquad +\,{{\mathbb {E}}}\left[ \int _{0}^{t} e^{-\beta s}|\sigma (s,U^{1}\left( s\right) ,P_{U^{1}\left( s+\delta \right) })-\sigma (s,U^{2}\left( s\right) ,P_{U^{2}\left( s+\delta \right) } )|^{2}ds\right] \\&\quad \le C{{\mathbb {E}}}\left[ \int _{0}^{t} e^{-\beta s}\left| {\bar{V}}\left( s\right) \right| (\left| {\bar{U}}\left( s\right) \right| +W_{2}(P_{U^{1}\left( s+\delta \right) },P_{U^{2}\left( s+\delta \right) }))ds\right] \\&\qquad +\,C{{\mathbb {E}}}\left[ \int _{0}^{t} e^{-\beta s}(\left| {\bar{U}}\left( s\right) \right| +W_{2} (P_{U^{1}\left( s+\delta \right) },P_{U^{2}\left( s+\delta \right) } ))^{2}ds\right] \\&\quad \le C{{\mathbb {E}}}\left[ \int _{0}^{t} e^{-\beta s}\left| {\bar{V}}\left( s\right) \right| ^{2} ds\right] +C{{\mathbb {E}}}\left[ \int _{0}^{t} e^{-\beta s}\left| {\bar{U}}\left( s\right) \right| ^{2}ds\right] \\&\qquad +\,C{{\mathbb {E}}}\left[ \int _{0}^{t} e^{-\beta s}\left| {\bar{U}}\left( s+\delta \right) \right| ^{2}ds\right] ,t\in \left[ 0,T\right] . \end{aligned}$$

Indeed, we recall that

$$\begin{aligned} W_{2}^{2}(P_{U^{1}\left( s+\delta \right) },P_{U^{2}\left( s+\delta \right) })\le {{\mathbb {E}}}\left[ \left| U^{1}\left( s+\delta \right) -U^{2}\left( s+\delta \right) \right| ^{2}\right] ={{\mathbb {E}}}\left[ \left| {\bar{U}}\left( s+\delta \right) \right| ^{2}\right] . \end{aligned}$$

Hence for \(t=T\), we have

$$\begin{aligned}&{{\mathbb {E}}}\left[ e^{-\beta T}\left| {\bar{V}}\left( T\right) \right| ^{2}\right] +{{\mathbb {E}}}\left[ \int _{0}^{T} \beta e^{-\beta s}\left| {\bar{V}}\left( s\right) \right| ^{2}ds\right] \\&\quad \le C{{\mathbb {E}}}\left[ \int _{0}^{T} e^{-\beta s}\left| {\bar{V}}\left( s\right) \right| ^{2} ds\right] +C{{\mathbb {E}}}\left[ \int _{0}^{T} e^{-\beta s}\left| {\bar{U}}\left( s\right) \right| ^{2}ds\right] \\&\qquad +\,Ce^{\beta \delta }{{\mathbb {E}}}\left[ \int _{0}^{T} e^{-\beta s}\left| {\bar{U}}\left( s\right) \right| ^{2} ds\right] +Ce^{\beta \delta }\delta {{\mathbb {E}}}\left[ e^{-\beta T}\left| {\bar{U}}\left( T\right) \right| ^{2}\right] . \end{aligned}$$
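The last two terms come from a change of variables in the anticipated integral, together with the convention \({\bar{U}}(r)={\bar{U}}(T)\) for \(r\ge T\). As a sketch of this omitted step, substituting \(r=s+\delta \) gives

$$\begin{aligned} \int _{0}^{T} e^{-\beta s}\left| {\bar{U}}\left( s+\delta \right) \right| ^{2}ds&=e^{\beta \delta } \int _{\delta }^{T+\delta } e^{-\beta r}\left| {\bar{U}}\left( r\right) \right| ^{2}dr\\&\le e^{\beta \delta } \int _{0}^{T} e^{-\beta r}\left| {\bar{U}}\left( r\right) \right| ^{2}dr+e^{\beta \delta }\left| {\bar{U}}\left( T\right) \right| ^{2} \int _{T}^{T+\delta } e^{-\beta r}dr\\&\le e^{\beta \delta } \int _{0}^{T} e^{-\beta r}\left| {\bar{U}}\left( r\right) \right| ^{2}dr+e^{\beta \delta }\delta e^{-\beta T}\left| {\bar{U}}\left( T\right) \right| ^{2}, \end{aligned}$$

and taking expectations yields the stated bound.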

We seek suitable \(\beta >0\), \(\delta >0\) with \(\delta \le \frac{1}{\beta }\), i.e., \(\beta \delta \le 1\), in order to estimate

$$\begin{aligned}&{{\mathbb {E}}}\left[ e^{-\beta T}\left| {\bar{V}}\left( T\right) \right| ^{2}\right] +\beta {{\mathbb {E}}}\left[ \int _{0}^{T} e^{-\beta s}\left| {\bar{V}}\left( s\right) \right| ^{2}ds\right] \\&\quad \le Ce\delta {{\mathbb {E}}}\left[ e^{-\beta T}\left| {\bar{U}}\left( T\right) \right| ^{2}\right] +C{{\mathbb {E}}}\left[ \int _{0}^{T} e^{-\beta s}\left| {\bar{V}}\left( s\right) \right| ^{2}ds\right] \\&\qquad +\,C\left( 1+e\right) {{\mathbb {E}}}\left[ \int _{0}^{T} e^{-\beta s}\left| {\bar{U}}\left( s\right) \right| ^{2}ds\right] . \end{aligned}$$

Choosing \(\beta :=7C\), \(\delta _{0}:=\tfrac{1}{7C}\) \((=\tfrac{1}{\beta })\), we have for all \(\delta \in \left( 0,\delta _{0}\right] \):

$$\begin{aligned}&{{\mathbb {E}}}\left[ e^{-\beta T}\left| {\bar{V}}\left( T\right) \right| ^{2}\right] +6C{{\mathbb {E}}}\left[ \int _{0}^{T} e^{-\beta s}\left| {\bar{V}}\left( s\right) \right| ^{2}ds\right] \\&\quad \le \tfrac{2}{3}\left( {{\mathbb {E}}}\left[ e^{-\beta T}\left| {\bar{U}}\left( T\right) \right| ^{2}\right] +6C{{\mathbb {E}}}\left[ \int _{0}^{T} e^{-\beta s}\left| {\bar{U}}\left( s\right) \right| ^{2}ds\right] \right) . \end{aligned}$$
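The constants can be checked directly, under the assumption that C denotes one fixed constant in the preceding estimate (the text lets C change from line to line). Moving the term \(C{{\mathbb {E}}}[ \int _{0}^{T} e^{-\beta s}| {\bar{V}}( s) | ^{2}ds]\) to the left-hand side leaves the factor \(\beta -C=6C\) there, and for \(\delta \le \tfrac{1}{7C}\) the coefficients on the right-hand side satisfy

$$\begin{aligned} Ce\delta \le \tfrac{e}{7}<\tfrac{2}{3}, \qquad C\left( 1+e\right) <4C=\tfrac{2}{3}\cdot 6C. \end{aligned}$$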

Then

$$\begin{aligned} \left\| {\bar{V}}\right\| _{-\beta }\le \left( \tfrac{2}{3}\right) ^{\frac{1}{2} }\left\| {\bar{U}}\right\| _{-\beta }, \end{aligned}$$

i.e.,

$$\begin{aligned} \left\| \Phi \left( U^{1}\right) -\Phi \left( U^{2}\right) \right\| _{-\beta }\le \left( \tfrac{2}{3}\right) ^{\frac{1}{2}}\left\| U^{1}-U^{2}\right\| _{-\beta },\quad \text { for all }U^{1},U^{2}\in H. \end{aligned}$$

This proves that \(\Phi :(H,\left\| \cdot \right\| _{-\beta })\rightarrow (H,\left\| \cdot \right\| _{-\beta })\) is a contraction on the Banach space \(\left( H,\left\| \cdot \right\| _{-\beta }\right) \). Hence, there is a unique fixed point \(X\in H,\) such that \(X=\Phi \left( X\right) ,\) i.e.,

$$\begin{aligned} X\left( t\right) =x+ \int _{0}^{t} \sigma \left( s,X\left( s\right) ,P_{X\left( s+\delta \right) }\right) dB\left( s\right) + \int _{0}^{t} b\left( s,X\left( s\right) ,P_{X\left( s+\delta \right) }\right) ds, \end{aligned}$$

\(v\left( dt\right) \)-a.e. on \(\left[ 0,T\right] \), P-a.s., with \(v\left( dt\right) =I_{\left[ 0,T\right] }\left( t\right) dt+P_{T} \left( dt\right) \), where \(P_{T}\) is the Dirac measure at T (recall the definition of H). For a \(v\otimes P\)-modification of X, also denoted by X, we have \(X\in S_{{\mathbb {F}}} ^{2}\left( \left[ 0,T\right] \right) \) and

$$\begin{aligned} X\left( t\right)= & {} x+ \int _{0}^{t} \sigma \left( s,X\left( s\right) ,P_{X\left( s+\delta \right) }\right) dB\left( s\right) \\&+\, \int _{0}^{t} b\left( s,X\left( s\right) ,P_{X\left( s+\delta \right) }\right) ds, \quad t\in \left[ 0,T\right] \ P\text {-a.s. } \end{aligned}$$

\(\square \)
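The fixed-point map \(\Phi \) can be illustrated on a toy case. The Python sketch below assumes the hypothetical coefficients \(b(t,x,\mu )=-x+a\int y\,\mu (dy)\) and constant \(\sigma \), which are not taken from the paper. For these coefficients the mean \(m(t)={\mathbb {E}}[X(t)]\) solves the anticipated ODE \(m^{\prime }(t)=-m(t)+a\,m(t+\delta )\) with \(m(t)=m(T)\) for \(t\ge T\). The iteration freezes the anticipated argument at the previous iterate, as in the definition of \(\Phi \), and solves forward with an explicit Euler scheme.

```python
def picard_mean(x0=1.0, a=0.5, delta=0.1, T=1.0, n_steps=1000, n_iter=60):
    """Picard iteration m -> Phi(m) for m'(t) = -m(t) + a*m(min(t+delta, T)), m(0) = x0.

    Returns the last iterate on a uniform grid and the list of sup-distances
    between successive iterates.
    """
    dt = T / n_steps
    shift = int(round(delta / dt))  # number of grid steps corresponding to delta
    m = [x0] * (n_steps + 1)        # initial guess: constant path
    gaps = []
    for _ in range(n_iter):
        new = [x0]
        for k in range(n_steps):
            # Anticipated value from the previous iterate; indices beyond T are
            # clamped, which encodes the convention X(t) = X(T) for t >= T.
            ahead = m[min(k + shift, n_steps)]
            new.append(new[k] + dt * (-new[k] + a * ahead))
        gaps.append(max(abs(p - q) for p, q in zip(new, m)))
        m = new
    return m, gaps


if __name__ == "__main__":
    m, gaps = picard_mean()
    print(m[-1], gaps[:5])
```

In this toy case the successive gaps shrink geometrically because \(|a|\) is small. That differs from the mechanism in Proposition 3.2, where the contraction is obtained from the weighted norm \(\left\| \cdot \right\| _{-\beta }\) and the smallness of \(\delta \).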

4 Pontryagin’s stochastic maximum principle

Let us introduce now our stochastic control problem.

4.1 Controlled stochastic differential equation

As control state space we consider a bounded convex subset U of \({\mathbb {R}} ^{d}\). A process \(u=\left( u(t)\right) _{t\in \left[ 0,T\right] }:\left[ 0,T\right] \times \Omega \rightarrow U\) which is progressively measurable is called an admissible control; \({\mathcal {U}}=L_{{\mathbb {F}}}^{0}(\left[ 0,T\right] ;U)\) is the set of all admissible controls. The dynamics of our controlled system are driven by functions \(\sigma :\left[ 0,T\right] \times \Omega \times {\mathbb {R}} ^{d}\times {\mathcal {P}}_{2}\left( {\mathbb {R}} ^{d}\right) \times U\rightarrow {\mathbb {R}} ^{d\times d}\), \(b:\left[ 0,T\right] \times \Omega \times {\mathbb {R}} ^{d}\times {\mathcal {P}}_{2}\left( {\mathbb {R}} ^{d}\right) \times U\rightarrow {\mathbb {R}} ^{d}\).

Assumptions (H.2): The coefficients \(\sigma \) and b are supposed to be continuous on \(\left[ 0,T\right] \times \Omega \times {\mathbb {R}} ^{d}\times {\mathcal {P}}_{2}\left( {\mathbb {R}} ^{d}\right) \times U\) and Lipschitz on \( {\mathbb {R}} ^{d}\times {\mathcal {P}}_{2}\left( {\mathbb {R}} ^{d}\right) ,\) uniformly w.r.t. \(u\in U\) and \(\omega \in \Omega \) i.e., there is some \(C>0\), such that for all \(\left( x,\mu \right) ,\left( x^{\prime } ,\mu ^{\prime }\right) \in {\mathbb {R}} ^{d}\times {\mathcal {P}}_{2}\left( {\mathbb {R}} ^{d}\right) \), \(u\in U\), we have

$$\begin{aligned} \left| \sigma \left( t,x,\mu ,u\right) -\sigma \left( t,x^{\prime } ,\mu ^{\prime },u\right) \right|&\le C\left( \left| x-x^{\prime }\right| +W_{2}\left( \mu ^{\prime },\mu \right) \right) ,\\ \left| b\left( t,x,\mu ,u\right) -b\left( t,x^{\prime },\mu ^{\prime },u\right) \right|&\le C\left( \left| x-x^{\prime }\right| +W_{2}\left( \mu ^{\prime },\mu \right) \right) . \end{aligned}$$

On the other hand, from the continuity of the coefficients on \(\left[ 0,T\right] \times \Omega \times {\mathbb {R}} ^{d}\times {\mathcal {P}}_{2}\left( {\mathbb {R}} ^{d}\right) \times U\), we have

$$\begin{aligned} \left| \sigma \left( t,0,P_{0},u\right) \right| +\left| b\left( t,0,P_{0},u\right) \right| \le C, \quad \text {for all }u\in U. \end{aligned}$$

This shows that, for every \(u\in {\mathcal {U}}:=L_{{\mathbb {F}}}^{0}\left( \left[ 0,T\right] ;U\right) \) and \(\omega \in \Omega \), the coefficients \(\sigma \) and b satisfy Assumptions (H.1). Thus, with \(\delta _{0}>0\) from Proposition 3.2 and \(\delta \in \left( 0,\delta _{0}\right] \), for all \(u\in {\mathcal {U}}\) and \(x\in {\mathbb {R}} ^{d}\), there is a unique solution \(X^{u}\in S_{{\mathbb {F}} }^{2}\left( \left[ 0,T\right] ; {\mathbb {R}} ^{d}\right) \) of the equation

$$\begin{aligned} X^{u}\left( t\right)= & {} x+ \int _{0}^{t} \sigma \left( s,X^{u}\left( s\right) ,P_{X^{u}(s+\delta )},u(s)\right) dB\left( s\right) \\&+\, \int _{0}^{t} b\left( s,X^{u}\left( s\right) ,P_{X^{u}(s+\delta )},u\left( s\right) \right) ds,t\in \left[ 0,T\right] . \end{aligned}$$

4.2 Cost functional

Let us endow our control problem with a terminal cost \(g:\left[ 0,T\right] \times \Omega \times {\mathbb {R}} ^{d}\times {\mathcal {P}}_{2}\left( {\mathbb {R}} ^{d}\right) \rightarrow {\mathbb {R}} \), and a running cost \(l:\left[ 0,T\right] \times \Omega \times {\mathbb {R}} ^{d}\times {\mathcal {P}}_{2}\left( {\mathbb {R}} ^{d}\right) \times U\rightarrow {\mathbb {R}} \).

Assumptions (H.3): We suppose that \(g:\left[ 0,T\right] \times \Omega \times {\mathbb {R}} ^{d}\times {\mathcal {P}}_{2}\left( {\mathbb {R}} ^{d}\right) \rightarrow {\mathbb {R}} \) is continuous and satisfies a linear growth assumption: For some constant \(C>0\),

$$\begin{aligned} \left| g\left( x,\mu \right) \right| \le C\left( 1+\left| x\right| +\left( \int _{{\mathbb {R}}^{d}} \left| y\right| ^{2}\mu \left( dy\right) \right) ^{\frac{1}{2}}\right) , \quad \left( x,\mu \right) \in {\mathbb {R}} ^{d}\times {\mathcal {P}}_{2}\left( {\mathbb {R}} ^{d}\right) {.} \end{aligned}$$

Let \(l:\left[ 0,T\right] \times \Omega \times {\mathbb {R}} ^{d}\times {\mathcal {P}}_{2}\left( {\mathbb {R}} ^{d}\right) \times U\rightarrow {\mathbb {R}} \) be continuous and such that, for some \(C>0\), for all \(\left( x,\mu ,u\right) \in {\mathbb {R}} ^{d}\times {\mathcal {P}}_{2}\left( {\mathbb {R}} ^{d}\right) \times U\), we have

$$\begin{aligned} \left| l\left( x,\mu ,u\right) \right| \le C\left( 1+\left| x\right| +\left( \int _{{\mathbb {R}}^{d}} \left| y\right| ^{2}\mu \left( dy\right) \right) ^{\frac{1}{2}}\right) {.} \end{aligned}$$

For any admissible control u, we define the performance functional:

$$\begin{aligned} J\left( u\right) :={{\mathbb {E}}}\left[ g\left( X^{u}\left( T\right) ,P_{X^{u} (T)}\right) + \int _{0}^{T} l\left( s,X^{u}\left( s\right) ,P_{X^{u}(s+\delta )},u\left( s\right) \right) ds\right] {.} \end{aligned}$$

A control process \(u^{*}\in {\mathcal {U}}\) is called optimal, if

$$\begin{aligned} J\left( u^{*}\right) \le J\left( u\right) , \quad \text {for all } u\in {\mathcal {U}}{.} \end{aligned}$$

Let us suppose that there is an optimal control \(u^{*}\in {\mathcal {U}}\). Our objective is to characterise this optimal control. To this end, we impose some additional assumptions.

Assumptions (H.4): Let U be convex (and, hence, \({\mathcal {U}}\) is convex). The functions \(\sigma \left( \cdot ,\cdot ,\cdot ,u\right) \), \(b\left( \cdot ,\cdot ,\cdot ,u\right) \), \(l\left( \cdot ,\cdot ,\cdot ,u\right) \) and \(g\left( \cdot ,\cdot \right) \) are continuously differentiable over \( {\mathbb {R}} ^{d}\times {\mathcal {P}}_{2}\left( {\mathbb {R}} ^{d}\right) \times U\) with bounded derivatives.

Given an arbitrary but fixed control \(u\in {\mathcal {U}}\), we define

$$\begin{aligned} u^{\theta }:=u^{*}+\theta \left( u-u^{*}\right) , \quad \theta \in \left[ 0,1\right] . \end{aligned}$$

Note that, thanks to the convexity of U and \({\mathcal {U}}\), also \(u^{\theta }\in {\mathcal {U}},\theta \in \left[ 0,1\right] \). We denote by \(X^{\theta }:=X^{u^{\theta }}\) and by \(X^{*}:=X^{u^{*}}\) the solution processes corresponding to \(u^{\theta }\) and \(u^{*},\) respectively. For simplicity of the computations, we set \(d=1.\)

4.3 Variational SDE

Given \(u^{*}\in {\mathcal {U}}\) and the associated controlled state process \(X^{*}\), let \(Y=\left( Y(t)\right) _{t\in \left[ 0,T\right] }\in S_{{\mathbb {F}}}^{2}\left( \left[ 0,T\right] \right) \) be the unique solution of the following SDE

$$\begin{aligned} \left\{ \begin{array} [c]{ll} Y\left( t\right) &{} = \int _{0}^{t} \{(\partial _{x}\sigma )(s,X^{*}\left( s\right) ,P_{X^{*}(s+\delta )},u^{*}\left( s\right) )Y\left( s\right) \\ &{} \quad +\,{\tilde{{{\mathbb {E}}}}}[(\partial _{\mu }\sigma )(s,X^{*}\left( s\right) ,P_{X^{*}(s+\delta )},{\tilde{X}}^{*}(s+\delta )){\tilde{Y}}(s+\delta )]\\ &{} \quad +\,(\partial _{u}\sigma )\left( s,X^{*}\left( s\right) ,P_{X^{*}(s+\delta )},u^{*}\left( s\right) \right) (u\left( s\right) -u^{*}\left( s\right) )\}dB\left( s\right) \\ &{} \quad +\, \int _{0}^{t} \{(\partial _{x}b)(s,X^{*}\left( s\right) ,P_{X^{*}(s+\delta )},u^{*}\left( s\right) )Y(s)\\ &{} \quad +\,{\tilde{{{\mathbb {E}}}}}[(\partial _{\mu }b)(s,X^{*}\left( s\right) ,P_{X^{*}(s+\delta )},{\tilde{X}}^{*}(s+\delta )){\tilde{Y}}(s+\delta )]\\ &{} \quad +\,(\partial _{u}b)\left( s,X^{*}\left( s\right) ,P_{X^{*}(s+\delta )},u^{*}\left( s\right) \right) (u\left( s\right) -u^{*}\left( s\right) )\}ds,t\in [0,T],\\ Y\left( 0\right) &{} =0\text {, }Y(t)=Y(T)\text {, }t\ge T{.} \end{array} \right. \end{aligned}$$
(4.1)

Remark 4.1

Note that SDE (4.1) is obtained by formal differentiation of Eq. (6.1) (with \(u=u^{\theta }\)) at \(\theta =0\).
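
As a sketch of this formal differentiation, consider the drift term only. Assuming the chain rule for the derivative with respect to the measure, and writing Y for the formal derivative of \(X^{\theta }\) w.r.t. \(\theta \) at \(\theta =0\), one expects

$$\begin{aligned} \frac{d}{d\theta }b(s,X^{\theta }(s),P_{X^{\theta }(s+\delta )},u^{\theta }(s))\Big |_{\theta =0}&=(\partial _{x}b)(s,X^{*}(s),P_{X^{*}(s+\delta )},u^{*}(s))Y(s)\\&\quad +\,{\tilde{{{\mathbb {E}}}}}[(\partial _{\mu }b)(s,X^{*}(s),P_{X^{*}(s+\delta )},{\tilde{X}}^{*}(s+\delta ),u^{*}(s)){\tilde{Y}}(s+\delta )]\\&\quad +\,(\partial _{u}b)(s,X^{*}(s),P_{X^{*}(s+\delta )},u^{*}(s))(u(s)-u^{*}(s)), \end{aligned}$$

where the last term uses \(\tfrac{d}{d\theta }u^{\theta }=u-u^{*}\). The diffusion coefficient \(\sigma \) is treated in the same way, which produces the two integrands in (4.1).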

From the previous section, we have the existence and the uniqueness of a solution for all \(\delta \in \left( 0,\delta _{0}^{\prime }\right] \), where \(0<\delta _{0}^{\prime }\le \delta _{0}\) is small enough.

Indeed, Eq. (4.1) is of the form

$$\begin{aligned} Y\left( t\right)= & {} \int _{0}^{t} \alpha _{1}(s)Y\left( s\right) ds+ \int _{0}^{t} \alpha _{2}(s)ds+ \int _{0}^{t} {\tilde{{{\mathbb {E}}}}}[\beta _{1}(s){\tilde{Y}}(s+\delta )]ds\\&+\, \int _{0}^{t} \alpha _{3}(s)Y(s)dB(s)+ \int _{0}^{t} \alpha _{4}(s)dB(s) \\&+\, \int _{0}^{t} {\tilde{{{\mathbb {E}}}}}[\beta _{2}(s){\tilde{Y}}(s+\delta )]dB(s),t\in \left[ 0,T\right] , \end{aligned}$$

where \(\alpha _{i}:\left[ 0,T\right] \times \Omega \rightarrow {\mathbb {R}} ,i=1,2,3,4,\) are bounded progressively measurable processes and \(\beta _{j}:\left[ 0,T\right] \times \Omega \times {\tilde{\Omega }}\rightarrow {\mathbb {R}} ,j=1,2,\) are two bounded \(({\mathcal {F}}_{t}\otimes {\tilde{{{\mathcal {F}}}}}_{T})\)-progressively measurable processes. With the method used in the proof of Proposition 3.2, we get the existence of \(\delta _{0}^{\prime }\in \left( 0,\delta _{0}\right] \) stated above. It turns out that Y(t) is the \(L^{2}\)-derivative of \(X^{\theta }(t)\) w.r.t. \(\theta \) at \(\theta =0.\) More precisely, the following property holds.

Lemma 4.2

\({{\mathbb {E}}}[\sup _{t\in \left[ 0,T\right] }|Y(t)-\tfrac{X^{\theta }(t)-X^{*}(t)}{\theta }|^{2}]\rightarrow 0\) as \(\theta \rightarrow 0\).

Proof

The proof is obtained with standard computations. For the sake of completeness, we give details in the Appendix. \(\square \)

4.4 Variational inequality

We know that if \(u^{*}\) is an optimal control, we have \(J(u^{*})\le J(u^{\theta })\), for all \(\theta \in \left[ 0,1\right] \), and hence

$$\begin{aligned} 0\le \underset{\theta \rightarrow 0}{{\underline{\lim }}}\tfrac{J(u^{\theta })-J(u^{*})}{\theta }{.} \end{aligned}$$
(4.2)

Lemma 4.3

Under Assumptions (H.3), (H.4), Lemma 4.2 and inequality (4.2), we have

$$\begin{aligned} 0\le & {} {{\mathbb {E}}}\left[ \left\{ (\partial _{x}g)(X^{*}(T), P_{X^{*}(T)})+{\tilde{{{\mathbb {E}}}}}[(\partial _{\mu }g)({\tilde{X}}^{*}(T), P_{X^{*}(T)},X^{*}(T))]\right. \right. \nonumber \\&\left. \left. +\, \int _{T-\delta }^{T} {\tilde{{{\mathbb {E}}}}}\left[ (\partial _{\mu }l)(t,{\tilde{X}}^{*}(t),P_{X^{*} (T)},X^{*}(T),{\tilde{u}}^{*}(t))\right] dt\right\} Y(T)\right] \nonumber \\&+\,{{\mathbb {E}}}\left[ \int _{0}^{T} \Bigg \{(\partial _{x}l)(t,X^{*}(t),P_{X^{*}(t+\delta )},u^{*}(t))\right. \nonumber \\&\left. +\,{\tilde{{{\mathbb {E}}}}}\Bigg [(\partial _{\mu }l)(t-\delta ,{\tilde{X}}^{*}(t-\delta ),P_{X^{*}(t)},X^{*}(t),{\tilde{u}}^{*}(t-\delta ))\Bigg ]I_{\left[ \delta ,T\right] }(t)\Bigg \}Y(t)dt\right] \nonumber \\&+\,{{\mathbb {E}}}\left[ \int _{0}^{T} (\partial _{u}l)(t,X^{*}(t),P_{X^{*}(t+\delta )},u^{*}(t))(u(t)-u^{*}(t))dt\right] . \end{aligned}$$
(4.3)

Proof

From the definition of \(J\), we have

$$\begin{aligned} \underset{\theta \rightarrow 0}{{\underline{\lim }}}\tfrac{1}{\theta }(J(u^{\theta })-J(u^{*}))= & {} {{\mathbb {E}}}\Bigg [(\partial _{x}g)\left( X^{*}(T),P_{X^{*}(T)}\right) Y(T)\\&+\,{\tilde{{{\mathbb {E}}}}}\Bigg [(\partial _{\mu }g)(X^{*}(T),P_{X^{*}(T)},\tilde{X}^{*}(T)){\tilde{Y}}(T)\Bigg ]\\&+\, \int _{0}^{T} \Bigg \{(\partial _{x}l)(t,X^{*}(t),P_{X^{*}(t+\delta )},u^{*}(t))Y(t)\\&+\,{\tilde{{{\mathbb {E}}}}}\Bigg [(\partial _{\mu }l)(t,X^{*}(t),P_{X^{*}(t+\delta )},{\tilde{X}}^{*}(t+\delta ),u^{*}(t)){\tilde{Y}}(t+\delta )\Bigg ]\\&+\,(\partial _{u}l)(t,X^{*}(t),P_{X^{*}(t+\delta )},u^{*} (t))(u(t)-u^{*}(t))\Bigg \}dt\Bigg ], \end{aligned}$$

where we have used the fact that

$$\begin{aligned} {{\mathbb {E}}}\left[ \underset{t\in \left[ 0,T\right] }{\sup }|Y(t)-\tfrac{X^{\theta }(t)-X^{*}(t)}{\theta }|^{2}\right] \rightarrow 0\text { as }\theta \rightarrow 0. \end{aligned}$$

Combining this with (4.2) and repeating previous arguments, we obtain (4.3). \(\square \)
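
For orientation, the passage from the limit identity in the proof to (4.3) can be sketched for the \(\partial _{\mu }l\) term. The sketch assumes that \(\delta \le T\) and that \(({\tilde{X}}^{*},{\tilde{Y}},{\tilde{u}}^{*})\) is a copy of \((X^{*},Y,u^{*})\) on \(({\tilde{\Omega }},{\tilde{{{\mathcal {F}}}}},{\tilde{P}})\). Exchanging the roles of the two probability spaces gives

$$\begin{aligned}&{{\mathbb {E}}}\left[ \int _{0}^{T} {\tilde{{{\mathbb {E}}}}}\Big [(\partial _{\mu }l)(t,X^{*}(t),P_{X^{*}(t+\delta )},{\tilde{X}}^{*}(t+\delta ),u^{*}(t)){\tilde{Y}}(t+\delta )\Big ]dt\right] \\&\quad ={{\mathbb {E}}}\left[ \int _{0}^{T} {\tilde{{{\mathbb {E}}}}}\Big [(\partial _{\mu }l)(t,{\tilde{X}}^{*}(t),P_{X^{*}(t+\delta )},X^{*}(t+\delta ),{\tilde{u}}^{*}(t))\Big ]Y(t+\delta )dt\right] . \end{aligned}$$

Shifting the time variable by \(\delta \) and using \(X^{*}(t)=X^{*}(T)\), \(Y(t)=Y(T)\) for \(t\ge T\) then splits the right-hand side into the integral over \(\left[ T-\delta ,T\right] \) that multiplies Y(T) and the integral carrying the indicator \(I_{\left[ \delta ,T\right] }(t)\) in (4.3). The \(\partial _{\mu }g\) term is handled by the same exchange, without any time shift.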

4.5 Adjoint processes

Let us first recall the equation satisfied by the derivative process

$$\begin{aligned} \left\{ \begin{array} [c]{ll} dY\left( t\right) &{} =\Bigg \{\left( \partial _{x}\sigma \right) (t)Y\left( t\right) +{\tilde{{{\mathbb {E}}}}}\left[ \left( \partial _{\mu }\sigma \right) (t){\tilde{Y}}\left( t+\delta \right) \right] \\ &{} \quad +\,\left( \partial _{u}\sigma \right) (t)\left( u\left( t\right) -u^{*}\left( t\right) \right) \Bigg \}dB\left( t\right) \\ &{} \quad +\,\Bigg \{\left( \partial _{x}b\right) (t)Y\left( t\right) +{\tilde{{{\mathbb {E}}}}} \Bigg [\left( \partial _{\mu }b\right) (t){\tilde{Y}}\left( t+\delta \right) \Bigg ]\\ &{} \quad +\,\left( \partial _{u}b\right) (t)\left( u\left( t\right) -u^{*}\left( t\right) \right) \Bigg \}dt,\\ Y\left( 0\right) &{} =0, \end{array} \right. \end{aligned}$$

where, for notational convenience, we have used the shorthand notation

$$\begin{aligned}&\left( \partial _{x}\sigma \right) (t,X^{^{*}}\left( t\right) ,P_{X^{^{*}}\left( t+\delta \right) },u^{*}\left( t\right) )=:\left( \partial _{x}\sigma \right) (t),\\&\quad {\tilde{{{\mathbb {E}}}}}\Bigg [\left( \partial _{\mu }\sigma \right) (t,X^{^{*}}\left( t\right) ,P_{X^{^{*}}\left( t+\delta \right) },{\tilde{X}}^{*}\left( t+\delta \right) ,u^{*}(t)){\tilde{Y}}\left( t+\delta \right) \Bigg ]=:{\tilde{{{\mathbb {E}}}}}\Bigg [\left( \partial _{\mu }\sigma \right) (t){\tilde{Y}}\left( t+\delta \right) \Bigg ], \end{aligned}$$

and similarly for the other coefficients. In order to determine the adjoint backward equation, we suppose that it has the form

$$\begin{aligned} \left\{ \begin{array} [c]{ll} dp(t) &{} =-\alpha (t)dt+q(t)dB(t),t\in \left[ 0,T\right] ,\\ p(T), &{} \end{array} \right. \end{aligned}$$
(4.4)

for some adapted process \(\alpha \) and terminal value p(T) which we have to determine. Applying Itô’s formula to \(p\left( t\right) Y\left( t\right) ,\) we obtain

$$\begin{aligned} d{\mathbb {E}}\left[ p\left( t\right) Y\left( t\right) \right]= & {} \Bigg \{{{\mathbb {E}}}\Bigg [\left( \partial _{x}b\right) (t)p\left( t\right) Y\left( t\right) \Bigg ]\nonumber \\&+{{\mathbb {E}}}\Bigg [{\tilde{{{\mathbb {E}}}}}\Bigg [\left( \partial _{\mu }b\right) (t){\tilde{Y}}\left( t+\delta \right) \Bigg ]p\left( t\right) \Bigg ]+{\mathbb {E}}\left[ \left( \partial _{u}b\right) \left( t\right) \left( u\left( t\right) -u^{*}\left( t\right) \right) p\left( t\right) \right] \Bigg \}dt\nonumber \\&-{\mathbb {E}}\left[ \alpha (t)Y(t)\right] dt+\Bigg \{{{\mathbb {E}}}\Bigg [\left( \partial _{x}\sigma \right) (t)q(t)Y\left( t\right) \Bigg ]+{{\mathbb {E}}}\Bigg [{\tilde{{{\mathbb {E}}}}}\Bigg [\left( \partial _{\mu }\sigma \right) (t){\tilde{Y}}(t+\delta )\Bigg ]q(t)\Bigg ]\nonumber \\&+{{\mathbb {E}}}\Bigg [(\partial _{u}\sigma )(t)\left( u\left( t\right) -u^{*}\left( t\right) \right) q(t)\Bigg ]\Bigg \}dt, \end{aligned}$$
(4.5)

with \({\mathbb {E}}\left[ Y\left( 0\right) p\left( 0\right) \right] =0.\)

For random variables \(\xi ,\eta \) on \((\Omega ,{\mathcal {F}},P)\) with copies \({\tilde{\xi }},{\tilde{\eta }}\) on \(({\tilde{\Omega }},{\tilde{{{\mathcal {F}}}}},{\tilde{P}})\), so that \(P_{\xi }={\tilde{P}}_{{\tilde{\xi }}}\) and \(P_{\eta }={\tilde{P}}_{{\tilde{\eta }}}\), and for a suitably integrable function f, we have that

$$\begin{aligned} {{\mathbb {E}}}[{\tilde{{{\mathbb {E}}}}}[f(\xi ,{\tilde{\eta }})]]&= \int _{\Omega } \left( \int _{{\tilde{\Omega }}} f(\xi (\omega ),{\tilde{\eta }}({\tilde{\omega }})){\tilde{P}}(d\tilde{\omega })\right) P(d\omega )\nonumber \\&= \int _{\Omega } \left( \int _{{\mathbb {R}}} f(\xi ,y){\tilde{P}}_{{\tilde{\eta }}}(dy)\right) dP\nonumber \\&= \int _{{\mathbb {R}}} \left( \int _{{\mathbb {R}}} f(x,y){\tilde{P}}_{{\tilde{\eta }}}(dy)\right) P_{\xi }(dx)\nonumber \\&= \int _{{\mathbb {R}}} \left( \int _{\Omega } f(x,\eta )dP\right) {\tilde{P}}_{{\tilde{\xi }}}(dx)\nonumber \\&= \int _{{\tilde{\Omega }}} \left( \int _{\Omega } f({\tilde{\xi }},\eta )dP\right) d{\tilde{P}}\nonumber \\&={\tilde{{{\mathbb {E}}}}}[{{\mathbb {E}}}[f({\tilde{\xi }},\eta )]]. \end{aligned}$$
(4.6)

Using the above computations, we obtain

$$\begin{aligned}&{{\mathbb {E}}}\Bigg [{\tilde{{{\mathbb {E}}}}}\Bigg [\left( \partial _{\mu }b\right) (t,X^{*}\left( t\right) ,P_{X^{^{*}}\left( t+\delta \right) },{\tilde{X}}^{*}\left( t+\delta \right) ,u^{*}\left( t\right) ){\tilde{Y}}\left( t+\delta \right) \Bigg ]p\left( t\right) \Bigg ]\nonumber \\&\quad ={{\mathbb {E}}}\Bigg [{\tilde{{{\mathbb {E}}}}}\Bigg [\left( \partial _{\mu }b\right) (t,{\tilde{X}}^{*}\left( t\right) ,P_{X^{^{*}}\left( t+\delta \right) },X^{*}\left( t+\delta \right) ,{\tilde{u}}^{*}\left( t\right) ){\tilde{p}}\left( t\right) \Bigg ]Y\left( t+\delta \right) \Bigg ], \end{aligned}$$
(4.7)

and similarly

$$\begin{aligned}&{{\mathbb {E}}}\Bigg [{\tilde{{{\mathbb {E}}}}}\Bigg [\left( \partial _{\mu }\sigma \right) (t,X^{*}\left( t\right) ,P_{X^{*}\left( t+\delta \right) },{\tilde{X}}^{*}\left( t+\delta \right) ,u^{*}\left( t\right) ){\tilde{Y}}(t+\delta )\Bigg ]q(t)\Bigg ]\nonumber \\&\quad ={{\mathbb {E}}}\Bigg [{\tilde{{{\mathbb {E}}}}}\Bigg [\left( \partial _{\mu }\sigma \right) (t,{\tilde{X}}^{*}\left( t\right) ,P_{X^{*}\left( t+\delta \right) },X^{*}\left( t+\delta \right) ,{\tilde{u}}^{*}\left( t\right) ){\tilde{q}}(t)\Bigg ]Y(t+\delta )\Bigg ]. \end{aligned}$$
(4.8)

Substituting (4.7), (4.8) into (4.5) and integrating over \(\left[ 0,T\right] \), we get

$$\begin{aligned} {{\mathbb {E}}}[p\left( T\right) Y\left( T\right) ]= & {} \int _{0}^{T} {{\mathbb {E}}}[((\partial _{x}b)(t)p(t)+(\partial _{x}\sigma )(t)q(t))Y(t)]dt\nonumber \\&+\, \int _{0}^{T} {{\mathbb {E}}}\Bigg [{\tilde{{{\mathbb {E}}}}}\Bigg [(( \partial _{\mu }b) (t){\tilde{p}}\left( t\right) +( \partial _{\mu }\sigma ) (t){\tilde{q}}\left( t\right) )\Bigg ]Y\left( t+\delta \right) \Bigg ]dt\nonumber \\&-\, \int _{0}^{T} {{\mathbb {E}}}[\alpha (t)Y(t)]dt\nonumber \\&+\, \int _{0}^{T} {{\mathbb {E}}}\Bigg [((\partial _{u}b)(t)p(t)+(\partial _{u}\sigma )(t)q(t))\left( u\left( t\right) -u^{*}\left( t\right) \right) \Bigg ]dt. \end{aligned}$$
(4.9)

As \(X^{*}\left( t\right) =X^{*}\left( T\right) \) and \(Y(t)=Y(T)\) for \(t\ge T\), we get

$$\begin{aligned}&\int _{0}^{T} {{\mathbb {E}}}\Bigg [{\tilde{{{\mathbb {E}}}}}\Bigg [\left( \partial _{\mu }b\right) (t){\tilde{p}}\left( t\right) \Bigg ]Y\left( t+\delta \right) \Bigg ]dt\nonumber \\&\quad ={{\mathbb {E}}}\left[ \left( \int _{T-\delta }^{T} {\tilde{{{\mathbb {E}}}}}[\left( \partial _{\mu }b\right) (t){\tilde{p}}\left( t\right) ]dt\right) Y\left( T\right) \right] \nonumber \\&\qquad +\,{{\mathbb {E}}}\left[ \int _{0}^{T} {\tilde{{{\mathbb {E}}}}}[\left( \partial _{\mu }b\right) (t-\delta ){\tilde{p}}\left( t-\delta \right) ]I_{\left[ \delta ,T\right] }\left( t\right) Y\left( t\right) dt\right] . \end{aligned}$$
(4.10)
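
The change of variables behind (4.10) can be written out pathwise as follows. Here \(F(t):={\tilde{{{\mathbb {E}}}}}[\left( \partial _{\mu }b\right) (t){\tilde{p}}\left( t\right) ]\) is a shorthand introduced only for this sketch, and \(\delta \le T\) is assumed:

$$\begin{aligned} \int _{0}^{T} F(t)Y(t+\delta )dt&= \int _{\delta }^{T+\delta } F(s-\delta )Y(s)ds\\&= \int _{\delta }^{T} F(s-\delta )Y(s)ds+\left( \int _{T}^{T+\delta } F(s-\delta )ds\right) Y(T)\\&= \int _{0}^{T} F(s-\delta )I_{\left[ \delta ,T\right] }(s)Y(s)ds+\left( \int _{T-\delta }^{T} F(t)dt\right) Y(T). \end{aligned}$$

The second equality uses \(Y(s)=Y(T)\) for \(s\ge T\). Taking expectations gives (4.10).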

Analogously,

$$\begin{aligned}&\int _{0}^{T} {{\mathbb {E}}}\Bigg [{\tilde{{{\mathbb {E}}}}}\Bigg [\left( \partial _{\mu }\sigma \right) (t){\tilde{q}}\left( t\right) \Bigg ]Y\left( t+\delta \right) \Bigg ]dt\nonumber \\&\quad ={{\mathbb {E}}}\left[ \left( \int _{T-\delta }^{T} {\tilde{{{\mathbb {E}}}}}[\left( \partial _{\mu }\sigma \right) (t){\tilde{q}}\left( t\right) ]dt\right) Y\left( T\right) \right] \nonumber \\&\qquad +\,{{\mathbb {E}}}\left[ \int _{0}^{T} {\tilde{{{\mathbb {E}}}}}[\left( \partial _{\mu }\sigma \right) (t-\delta ){\tilde{q}}\left( t-\delta \right) ]I_{\left[ \delta ,T\right] }\left( t\right) Y\left( t\right) dt\right] . \end{aligned}$$
(4.11)

Combining (4.9)–(4.11), we obtain

$$\begin{aligned}&{{\mathbb {E}}}\Bigg [\Bigg (p(T)- \int _{T-\delta }^{T} {\tilde{{{\mathbb {E}}}}}\Bigg [\left( \partial _{\mu }b\right) (t){\tilde{p}}\left( t\right) +\left( \partial _{\mu }\sigma \right) (t){\tilde{q}}\left( t\right) \Bigg ]dt\Bigg )Y(T)\Bigg ]\nonumber \\&\quad ={{\mathbb {E}}}\left[ \int _{0}^{T} \Bigg \{(\partial _{x}b)(t)p(t)+(\partial _{x}\sigma )(t)q(t)\right. \nonumber \\&\qquad \left. +\,{\tilde{{{\mathbb {E}}}}}\Bigg [\left( \partial _{\mu }b\right) (t-\delta ){\tilde{p}}\left( t-\delta \right) +\left( \partial _{\mu }\sigma \right) (t-\delta ){\tilde{q}}\left( t-\delta \right) \Bigg ]I_{\left[ \delta ,T\right] }\left( t\right) -\alpha (t)\Bigg \}Y(t)dt\right] \nonumber \\&\qquad +\,{{\mathbb {E}}}\left[ \int _{0}^{T} \Bigg \{(\partial _{u}b)(t)p(t)+(\partial _{u}\sigma )(t)q(t)\Bigg \}\left( u\left( t\right) -u^{*}\left( t\right) \right) dt\right] . \end{aligned}$$
(4.12)

Hence, putting

$$\begin{aligned} \zeta (t):= & {} (\partial _{x}b)(t)p(t)+(\partial _{x}\sigma )(t)q(t)\nonumber \\&+\,{\tilde{{{\mathbb {E}}}}}[( \partial _{\mu }b) (t-\delta ){\tilde{p}}\left( t-\delta \right) ]I_{\left[ \delta ,T\right] }\left( t\right) \nonumber \\&+\,{\tilde{{{\mathbb {E}}}}}[( \partial _{\mu }\sigma ) (t-\delta ){\tilde{q}}\left( t-\delta \right) ]I_{\left[ \delta ,T\right] }\left( t\right) , \end{aligned}$$
(4.13)

and

$$\begin{aligned} \zeta := \int _{T-\delta }^{T} {\tilde{{{\mathbb {E}}}}}\Bigg [\left( \partial _{\mu }b\right) (t){\tilde{p}}\left( t\right) +\left( \partial _{\mu }\sigma \right) (t){\tilde{q}}\left( t\right) \Bigg ]dt, \end{aligned}$$
(4.14)

we see that (4.12) takes the form

$$\begin{aligned}&{{\mathbb {E}}}[(p(T)-\zeta )Y(T)]\nonumber \\&\quad ={{\mathbb {E}}}\left[ \int _{0}^{T}((\partial _{u}b)(t)p(t)+(\partial _{u}\sigma )(t)q(t))\left( u\left( t\right) -u^{*}\left( t\right) \right) dt\right] \nonumber \\&\qquad +\,{{\mathbb {E}}}\left[ \int _{0}^{T} (\zeta (t)-\alpha (t))Y(t)dt\right] {.} \end{aligned}$$
(4.15)

We are now able to determine our adjoint process, putting

$$\begin{aligned} p(T):= & {} \zeta +\left( \partial _{x}g\right) (X^{*}(T),P_{X^{*} (T)})+{\tilde{{{\mathbb {E}}}}}\Bigg [\left( \partial _{\mu }g\right) ({\tilde{X}}^{*}(T),P_{X^{*}(T)},X^{*}(T))\Bigg ]\nonumber \\&+\, \int _{T-\delta }^{T} {\tilde{{{\mathbb {E}}}}}\Bigg [\left( \partial _{\mu }l\right) (t,{\tilde{X}}^{*}(t),P_{X^{*}(T)},X^{*}(T),{\tilde{u}}^{*}(t))\Bigg ]dt, \end{aligned}$$
(4.16)

and

$$\begin{aligned} \alpha (t):=\zeta (t)+\left( \partial _{x}l\right) (t)+{\tilde{{{\mathbb {E}}}}} [\left( \partial _{\mu }l\right) (t-\delta )]I_{\left[ \delta ,T\right] }(t), \end{aligned}$$
(4.17)

where we denote by

$$\begin{aligned}&\left( \partial _{x}l\right) (t):=\left( \partial _{x}l\right) (t,X^{*}(t),P_{X^{*}(t+\delta )},u^{*}(t)),\\&{\tilde{{{\mathbb {E}}}}}\Bigg [\left( \partial _{\mu }l\right) (t-\delta )\Bigg ]:={\tilde{{{\mathbb {E}}}}}\Bigg [\left( \partial _{\mu }l\right) (t-\delta ,{\tilde{X}}^{*}(t-\delta ),P_{X^{*}(t)},X^{*}(t),{\tilde{u}}^{*}(t-\delta ))\Bigg ]. \end{aligned}$$
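
To see why the choices (4.16) and (4.17) are convenient, one can insert them into (4.15); the following is only a sketch based on the identities above. By (4.16), \(p(T)-\zeta \) is exactly the coefficient of Y(T) in (4.3), and by (4.17), \(\zeta (t)-\alpha (t)\) is minus the coefficient of Y(t) in (4.3). Hence (4.15) can be rearranged as

$$\begin{aligned}&{{\mathbb {E}}}[(p(T)-\zeta )Y(T)]+{{\mathbb {E}}}\left[ \int _{0}^{T} \Big \{\left( \partial _{x}l\right) (t)+{\tilde{{{\mathbb {E}}}}}[\left( \partial _{\mu }l\right) (t-\delta )]I_{\left[ \delta ,T\right] }(t)\Big \}Y(t)dt\right] \\&\quad ={{\mathbb {E}}}\left[ \int _{0}^{T} \Big \{(\partial _{u}b)(t)p(t)+(\partial _{u}\sigma )(t)q(t)\Big \}\left( u\left( t\right) -u^{*}\left( t\right) \right) dt\right] . \end{aligned}$$

The left-hand side is the sum of the first two terms on the right of (4.3), so every term containing Y is eliminated from the variational inequality and only terms in \(u-u^{*}\) remain.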

Combining (4.13), (4.14) with (4.16) and (4.17), we see that (4.4) takes the following form

$$\begin{aligned} dp(t)= & {} -\Bigg \{(\partial _{x}b)(t)p(t)+(\partial _{x}\sigma )(t)q(t)+\left( \partial _{x}l\right) (t)\nonumber \\&+\,{\tilde{{{\mathbb {E}}}}}\Bigg [\left( \partial _{\mu }b\right) (t-\delta ){\tilde{p}}\left( t-\delta \right) \Bigg ]I_{\left[ \delta ,T\right] }\left( t\right) +{\tilde{{{\mathbb {E}}}}}\Bigg [\left( \partial _{\mu }\sigma \right) (t-\delta )\tilde{q}\left( t-\delta \right) \Bigg ]I_{\left[ \delta ,T\right] }\left( t\right) \nonumber \\&+\,{\tilde{{{\mathbb {E}}}}}\Bigg [\left( \partial _{\mu }l\right) (t-\delta )]I_{\left[ \delta ,T\right] }\left( t\right) \Bigg \}dt+q(t)dB(t),\quad t\in \left[ 0,T\right] , \end{aligned}$$
(4.18)

with terminal condition

$$\begin{aligned} p(T)= & {} \left( \partial _{x}g\right) (X^{*}(T),P_{X^{*}(T)} )+{\tilde{{{\mathbb {E}}}}}\Bigg [\left( \partial _{\mu }g\right) ({\tilde{X}}^{*}(T),P_{X^{*}(T)},X^{*}(T))\Bigg ]\nonumber \\&+\, \int _{T-\delta }^{T} \Bigg ({\tilde{{{\mathbb {E}}}}}\Bigg [\left( \partial _{\mu }b\right) (t){\tilde{p}}\left( t\right) \Bigg ]+{\tilde{{{\mathbb {E}}}}}\Bigg [\left( \partial _{\mu }\sigma \right) (t){\tilde{q}}\left( t\right) \Bigg ] \nonumber \\&+\, {\tilde{{{\mathbb {E}}}}}\Bigg [\left( \partial _{\mu }l\right) (t)\Bigg ]\Bigg )dt. \end{aligned}$$
(4.19)

We suppose that the above BSDE (4.18), (4.19) has a unique solution \((p,q)\in S_{{\mathbb {F}}}^{2}([0,T])\times L_{{\mathbb {F}}}^{2}([0,T])\). We will discuss this BSDE in the next section.

4.6 Stochastic maximum principle

We define now the Hamiltonian \(H:[0,T]\times \Omega \times {\mathbb {R}} \times {\mathcal {P}}_{2}\left( {\mathbb {R}} \right) \times U\times {\mathbb {R}} \times {\mathbb {R}} \rightarrow {\mathbb {R}} \), as

$$\begin{aligned} H(t,x,\mu ,u,p,q)=l(t,x,\mu ,u)+b(t,x,\mu ,u)p+\sigma (t,x,\mu ,u)q. \end{aligned}$$
(4.20)
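
For later use, note that differentiating (4.20) with respect to u, with H carrying the time argument t as in Theorem 4.4, gives

$$\begin{aligned} \partial _{u}H(t,x,\mu ,u,p,q)=(\partial _{u}l)(t,x,\mu ,u)+(\partial _{u}b)(t,x,\mu ,u)p+(\partial _{u}\sigma )(t,x,\mu ,u)q. \end{aligned}$$

Evaluated at \((t,X^{*}(t),P_{X^{*}(t+\delta )},u^{*}(t),p(t),q(t))\), this is \(\left( \partial _{u}l\right) (t)+\left( \partial _{u}b\right) (t)p(t)+\left( \partial _{u}\sigma \right) (t)q(t)\) in the shorthand notation of Sect. 4.5.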

Theorem 4.4

(Maximum principle) Let \(u^{*}(t)\) be an optimal control and \(X^{*}(t)\) the corresponding trajectory. Then, we have

$$\begin{aligned} \partial _{u}H(t,X^{*}(t),P_{X^{*}(t+\delta )},u^{*} (t),p(t),q(t))(u(t)-u^{*}(t))&\geqslant 0,\\ dt\text { }dP\text {-a.e., for all }u&\in {\mathcal {U}}, \end{aligned}$$

where \((p,q)\in S_{{\mathbb {F}}}^{2}([0,T])\times L_{{\mathbb {F}}}^{2}([0,T])\) is the solution of the adjoint equation (4.18), (4.19).

Proof

From (4.15) and (4.3) with the choice (4.16) and (4.17), we get

$$\begin{aligned} 0\le {{\mathbb {E}}}\left[ \int _{0}^{T} \Bigg \{\left( \partial _{u}b\right) (t)p(t)+\left( \partial _{u}\sigma \right) (t)q(t)+\left( \partial _{u}l\right) (t)\Bigg \}(u(t)-u^{*}(t))dt\right] , \end{aligned}$$

for all \(u\in {\mathcal {U}}\). Assume that, for some \(u\in {\mathcal {U}},\) the set

$$\begin{aligned}&\Gamma _{u}:=\Bigg \{\left( t,\omega \right) \in \left[ 0,T\right] \times \Omega \,\Big |\\&\quad \{\left( \partial _{u}b\right) (t)p(t)+\left( \partial _{u}\sigma \right) (t)q(t)+\left( \partial _{u}l\right) (t)\}(u(t)-u^{*}(t))(\omega )<0\Bigg \} \end{aligned}$$

is such that

$$\begin{aligned} {{\mathbb {E}}}\left[ \int _{0}^{T} I_{\Gamma _{u}}(t)dt\right] >0. \end{aligned}$$

Then the control \({\tilde{u}}(t):=u(t)I_{\Gamma _{u}}(t)+u^{*}(t)I_{\Gamma _{u}^{c} }(t),t\in \left[ 0,T\right] ,\) belongs to \({\mathcal {U}}\) and, applying the above inequality to \({\tilde{u}}\), we get

$$\begin{aligned} 0\le & {} {{\mathbb {E}}}\Bigg [ \int _{0}^{T} \Bigg \{\left( \partial _{u}b\right) (t)p(t)+\left( \partial _{u}\sigma \right) (t)q(t)+\left( \partial _{u}l\right) (t)\Bigg \} \\&\times (u(t)-u^{*}(t))I_{\Gamma _{u}}(t)dt\Bigg ]<0. \end{aligned}$$

But this is a contradiction and proves that

$$\begin{aligned} \Bigg \{\left( \partial _{u}b\right) (t)p(t)+\left( \partial _{u}\sigma \right) (t)q(t)+\left( \partial _{u}l\right) (t)\Bigg \}(u(t)-u^{*}(t))\ge 0, \end{aligned}$$

\(dt\,dP\)-a.e., for all \(u\in {\mathcal {U}}.\) By the definition of H in (4.20), the proof is complete. \(\square \)

5 Solvability of the delayed McKean–Vlasov BSDE

We now study the BSDE which is the adjoint equation to the above control problem. We consider the BSDE

$$\begin{aligned} \left\{ \begin{array} [c]{ll} dp(t) &{} =-\alpha (t)dt+q(t)dB(t)\text {, }t\in \left[ 0,T\right] ,\\ p(T), &{} \end{array} \right. \end{aligned}$$

which, by the computations above, has the form

$$\begin{aligned} dp(t)= & {} -\Bigg \{(\partial _{x}b)(t,X^{*}\left( t\right) ,P_{X^{*}\left( t+\delta \right) },u^{*}\left( t\right) )p(t)\nonumber \\&+\,(\partial _{x}\sigma )(t,X^{*}\left( t\right) ,P_{X^{*}\left( t+\delta \right) },u^{*}\left( t\right) )q(t)+\left( \partial _{x}l\right) (t,X^{*}(t),P_{X^{*}(t+\delta )},u^{*}(t))\nonumber \\&+\,{\tilde{{{\mathbb {E}}}}}[\left( \partial _{\mu }b\right) \left( t-\delta ,{\tilde{X}}^{*}\left( t-\delta \right) ,P_{X^{*}\left( t\right) },X^{*}\left( t\right) ,{\tilde{u}}^{*}\left( t-\delta \right) \right) {\tilde{p}}\left( t-\delta \right) ]I_{\left[ \delta ,T\right] }\left( t\right) \nonumber \\&+\,{\tilde{{{\mathbb {E}}}}}[\left( \partial _{\mu }\sigma \right) \left( t-\delta ,{\tilde{X}}^{*}\left( t-\delta \right) ,P_{X^{*}\left( t\right) },X^{*}\left( t\right) ,{\tilde{u}}^{*}\left( t-\delta \right) \right) {\tilde{q}}\left( t-\delta \right) ]I_{\left[ \delta ,T\right] }\left( t\right) \nonumber \\&+\,{\tilde{{{\mathbb {E}}}}}[\left( \partial _{\mu }l\right) \left( t-\delta ,{\tilde{X}}^{*}\left( t-\delta \right) ,P_{X^{*}\left( t\right) },X^{*}\left( t\right) ,{\tilde{u}}^{*}\left( t-\delta \right) \right) ]I_{\left[ \delta ,T\right] }\left( t\right) \Bigg \}dt\nonumber \\&+\,q(t)dB(t)\text {, }t\in \left[ 0,T\right] , \end{aligned}$$
(5.1)

with

$$\begin{aligned} p(T)= & {} (\partial _{x}g)(X^{*}(T),P_{X^{*}(T)})+{\tilde{{{\mathbb {E}}}}} [(\partial _{\mu }g)({\tilde{X}}^{*}(T),P_{X^{*}(T)},X^{*}(T))]\nonumber \\&+\, \int _{T-\delta }^{T} \left( {\tilde{{{\mathbb {E}}}}}[(\partial _{\mu }b)(t,{\tilde{X}}^{*}\left( t\right) ,P_{X^{*}\left( T\right) },X^{*}\left( T\right) ,{\tilde{u}}^{*}\left( t\right) ){\tilde{p}}\left( t\right) ]\right. \nonumber \\&+\,{\tilde{{{\mathbb {E}}}}}\left[ (\partial _{\mu }\sigma )\left( t,{\tilde{X}}^{*}\left( t\right) ,P_{X^{*}\left( T\right) },X^{*}\left( T\right) ,{\tilde{u}}^{*}\left( t\right) \right) {\tilde{q}}\left( t\right) \right] \nonumber \\&\left. +\,{\tilde{{{\mathbb {E}}}}}\left[ (\partial _{\mu }l)(t,{\tilde{X}}^{*}(t),P_{X^{*}(T)},X^{*}(T),{\tilde{u}}^{*}(t))\right] \right) dt. \end{aligned}$$
(5.2)

Let us better understand the form of this BSDE: for \(\left( t,\omega ,{\tilde{\omega }}\right) \in \left[ 0,T\right] \times \Omega \times {\tilde{\Omega }},x_{1},x_{2},x_{3},x_{4}\in {\mathbb {R}} \), putting

$$\begin{aligned} \theta _{t}(\omega ,{\tilde{\omega }},x_{1},x_{2},x_{3},x_{4})&:=(\partial _{x}b)(t,X^{*}\left( t,\omega \right) ,P_{X^{*}\left( t+\delta \right) },u^{*}\left( t,\omega \right) )x_{1}\\&\quad +\,(\partial _{x}\sigma )(t,X^{*}\left( t,\omega \right) ,P_{X^{*}\left( t+\delta \right) },u^{*}\left( t,\omega \right) )x_{2}\\&\quad +\,(\partial _{x}l)(t,X^{*}\left( t,\omega \right) ,P_{X^{*}\left( t+\delta \right) },u^{*}\left( t,\omega \right) )\\&\quad +\,(\partial _{\mu }b)(t-\delta ,{\tilde{X}}^{*}\left( t-\delta ,{\tilde{\omega }}\right) ,P_{X^{*}\left( t\right) },X^{*}\left( t,\omega \right) ,\\&\quad {\tilde{u}}^{*}\left( t-\delta ,{\tilde{\omega }}\right) )x_{3}I_{\left[ \delta ,T\right] }\left( t\right) \\&\quad +\,(\partial _{\mu }\sigma )(t-\delta ,{\tilde{X}}^{*}\left( t-\delta ,{\tilde{\omega }}\right) ,P_{X^{*}\left( t\right) },X^{*}\left( t,\omega \right) ,\\&\quad {\tilde{u}}^{*}\left( t-\delta ,{\tilde{\omega }}\right) )x_{4}I_{\left[ \delta ,T\right] }\left( t\right) \\&\quad +\,(\partial _{\mu }l)(t-\delta ,{\tilde{X}}^{*}\left( t-\delta ,{\tilde{\omega }}\right) ,P_{X^{*}\left( t\right) },X^{*}\left( t,\omega \right) ,\\&\quad {\tilde{u}}^{*}\left( t-\delta ,{\tilde{\omega }}\right) )I_{\left[ \delta ,T\right] }\left( t\right) , \end{aligned}$$

and in order to describe also the terminal condition of our BSDE, we consider the coefficient

$$\begin{aligned} \vartheta _{t}(\omega ,{\tilde{\omega }},x_{1},x_{2})&:=(\partial _{\mu }b)(t,{\tilde{X}}^{*}\left( t,{\tilde{\omega }}\right) ,P_{X^{*}\left( T\right) },X^{*}\left( T,\omega \right) ,{\tilde{u}}^{*}\left( t,\tilde{\omega }\right) )x_{1}\\&\quad +\,(\partial _{\mu }\sigma )(t,{\tilde{X}}^{*}\left( t,{\tilde{\omega }}\right) ,P_{X^{*}\left( T\right) },X^{*}\left( T,\omega \right) ,\tilde{u}^{*}\left( t,{\tilde{\omega }}\right) )x_{2}\\&\quad +\,(\partial _{\mu }l)(t,{\tilde{X}}^{*}\left( t,{\tilde{\omega }}\right) ,P_{X^{*}\left( T\right) },X^{*}\left( T,\omega \right) ,{\tilde{u}}^{*}\left( t,{\tilde{\omega }}\right) ). \end{aligned}$$

We know that

$$\begin{aligned} \varphi ({\tilde{P}}_{\zeta })(\omega )&={\tilde{{{\mathbb {E}}}}}[\zeta ](\omega )\nonumber \\&= \int _{{\tilde{\Omega }}} \zeta (\omega ,{\tilde{\omega }}){\tilde{P}}(d{\tilde{\omega }}),\omega \in \Omega \text {, for }\zeta \in L^{2}\left( {\bar{\Omega }},{\bar{{{\mathcal {F}}}}},{\bar{P}}\right) {.} \end{aligned}$$
(5.3)

By using (5.3) the BSDE (5.1), (5.2) takes the form

Definition 5.1

The delayed BSDE for \((p,q)\in S_{{\mathbb {F}}}^{2}([0,T])\times L_{{\mathbb {F}}}^{2}([0,T])\) is defined by

$$\begin{aligned} \left\{ \begin{array} [c]{ll} dp\left( t\right) &{} =-\varphi ({\tilde{P}}_{\theta _{t}(p\left( t\right) ,q\left( t\right) ,{\tilde{p}}\left( t-\delta \right) ,{\tilde{q}}\left( t-\delta \right) )})dt+q\left( t\right) dB\left( t\right) \text {, } t\in \left[ 0,T\right] ,\\ p\left( T\right) &{} =\zeta + \int _{T-\delta }^{T} \varphi ({\tilde{P}}_{\vartheta _{t}({\tilde{p}}\left( t\right) ,{\tilde{q}}\left( t\right) )})dt, \end{array} \right. \end{aligned}$$
(5.4)

where

$$\begin{aligned} \zeta :=(\partial _{x}g)(X^{*}\left( T\right) ,P_{X^{*}\left( T\right) })+{\tilde{{{\mathbb {E}}}}}[(\partial _{\mu }g)({\tilde{X}}^{*}\left( T\right) ,P_{X^{*}\left( T\right) },X^{*}\left( T\right) )]. \end{aligned}$$

Since the derivatives of g are bounded by (H.4), we see that \(\zeta \in L^{2}\left( \Omega ,{\mathcal {F}},P\right) \).

Remark 5.2

We call the BSDE (5.4) a delayed BSDE because the driver at time t depends on both the solution at time t and its earlier value, i.e., the solution at time \(t-\delta \).

Note that \(\theta \) satisfies the following:

Assumptions (H.5):

  1. \(\theta :\left[ 0,T\right] \times \Omega \times {\tilde{\Omega }}\times {\mathbb {R}} ^{4}\rightarrow {\mathbb {R}} \) is jointly measurable,

  2. \(\theta _{t}\left( \cdot ,\cdot ,x\right) \) is \({\mathcal {F}}_{t} \otimes {\tilde{{{\mathcal {F}}}}}_{T}\)-progressively measurable, for all \(x\in {\mathbb {R}} ^{4}\),

  3. for all \(x,x^{\prime }\in {\mathbb {R}} ^{4}\),

    $$\begin{aligned} \left| \theta _{t}\left( \omega ,{\tilde{\omega }},x\right) -\theta _{t}\left( \omega ,{\tilde{\omega }},x^{\prime }\right) \right| \le C\left| x-x^{\prime }\right| \text {, }dtP(d\omega ){\tilde{P}} (d{\tilde{\omega }})\text {-a.e.} \end{aligned}$$

Similarly, \(\vartheta \) is assumed to satisfy the following:

Assumptions (H.6):

  1. \(\vartheta :\left[ T-\delta ,T\right] \times \Omega \times \tilde{\Omega }\times {\mathbb {R}} ^{2}\rightarrow {\mathbb {R}} \) is jointly measurable,

  2. \(\vartheta _{t}\left( \cdot ,\cdot ,x\right) \) is \({\mathcal {F}}_{T} \otimes {\tilde{{{\mathcal {F}}}}}_{T}\)-measurable, for all \(\left( t,x\right) \in \left[ T-\delta ,T\right] \times {\mathbb {R}} ^{2}\),

  3. \(\left| \vartheta _{t}\left( \omega ,{\tilde{\omega }},0\right) \right| \le C\), \(dtP\left( d\omega \right) {\tilde{P}}\left( d{\tilde{\omega }}\right) \)-a.e., for some constant \(C>0\),

  4. \(|\vartheta _{t}\left( \omega ,{\tilde{\omega }},x\right) -\vartheta _{t}(\omega ,{\tilde{\omega }},x^{\prime })|\le C\left| x-x^{\prime }\right| \), for all x, \(x^{\prime }\in {\mathbb {R}} ^{2}\), \(dtP(d\omega ){\tilde{P}}(d{\tilde{\omega }})\)-a.e.

Moreover, the function \(\varphi :{\mathcal {P}}_{2}( {\mathbb {R}} )\rightarrow {\mathbb {R}} \) appearing in the delayed BSDE (5.1), (5.2) is Lipschitz continuous. This leads us to consider the following more general form of our BSDE.

We consider arbitrary \(\theta ,\vartheta ,\varphi ,\psi ,\zeta \) with \(\theta \) satisfying the Assumption (H.5), \(\vartheta \) satisfying (H.6), \(\varphi ,\psi :{\mathcal {P}}_{2}( {\mathbb {R}} )\rightarrow {\mathbb {R}} \) being Lipschitz and \(\zeta \in L^{2}\left( \Omega ,{\mathcal {F}},P\right) \), and we study the delayed BSDE,

$$\begin{aligned} \left\{ \begin{array} [c]{ll} dp\left( t\right) &{} =-\varphi ({\tilde{P}}_{\theta _{t}(p\left( t\right) ,q\left( t\right) ,{\tilde{p}}\left( t-\delta \right) ,{\tilde{q}}\left( t-\delta \right) )})dt+q\left( t\right) dB\left( t\right) \text {, } t\in \left[ 0,T\right] ,\\ p\left( T\right) &{} =\zeta + \int _{T-\delta }^{T} \psi ({\tilde{P}}_{\vartheta _{t}({\tilde{p}}\left( t\right) ,{\tilde{q}}\left( t\right) )})dt{.} \end{array} \right. \end{aligned}$$
(5.5)

Remark 5.3

The adjoint BSDE described above is a special case of (5.5). Indeed, for the adjoint BSDE we have:

$$\begin{aligned} \varphi ({\tilde{P}}_{\vartheta _{t}})(\omega )&=\psi ({\tilde{P}}_{\vartheta _{t} })(\omega )={\tilde{{{\mathbb {E}}}}}[\vartheta _{t}](\omega )\\&= \int _{{\tilde{\Omega }}} \vartheta _{t}(\omega ,{\tilde{\omega }}){\tilde{P}}(d{\tilde{\omega }}),\omega \in \Omega \text {, for }\vartheta _{t}\in L^{2}\left( {\bar{\Omega }},{\bar{{{\mathcal {F}}}}},{\bar{P}}\right) . \end{aligned}$$
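
In this special case the Lipschitz requirement on \(\varphi \) and \(\psi \) can be checked directly. As a sketch, write the functional as \(\varphi (\mu )=\int _{{\mathbb {R}}}y\,\mu (dy)\), let \(\mu ,\nu \in {\mathcal {P}}_{2}({\mathbb {R}})\), and let \(\pi \) be any coupling of \(\mu \) and \(\nu \), i.e. a probability measure on \({\mathbb {R}}^{2}\) with marginals \(\mu \) and \(\nu \) (the symbol \(\pi \) is introduced only for this check). By the Cauchy–Schwarz inequality,

$$\begin{aligned} \left| \varphi (\mu )-\varphi (\nu )\right| =\left| \int _{{\mathbb {R}}^{2}}(y-y^{\prime })\pi (dy,dy^{\prime })\right| \le \left( \int _{{\mathbb {R}}^{2}}\left| y-y^{\prime }\right| ^{2}\pi (dy,dy^{\prime })\right) ^{\frac{1}{2}}. \end{aligned}$$

Taking the infimum over all couplings gives \(\left| \varphi (\mu )-\varphi (\nu )\right| \le W_{2}(\mu ,\nu )\), so the mean functional is Lipschitz with constant 1 with respect to the 2-Wasserstein metric.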

Definition 5.4

We say that \(\left( p,q\right) \in S_{{\mathbb {F}}}^{2}\left( \left[ 0,T\right] \right) \times L_{{\mathbb {F}}}^{2}\left( \left[ 0,T\right] \right) \) is a solution of (5.5), if

$$\begin{aligned} \left\{ \begin{array} [c]{lll} p\left( t\right) &{} :=p\left( 0\right) , &{} t\le 0,\\ q\left( t\right) &{} :=0, &{} t\le 0, \end{array} \right. \end{aligned}$$

and if (5.5) is satisfied.

Theorem 5.5

Under the above assumptions there is some \(\delta _{0}>0\) small enough such that for all \(\delta \in \left( 0,\delta _{0}\right] \), BSDE (5.5) has a unique solution \(\left( p,q\right) \in S_{{\mathbb {F}}}^{2}\left( \left[ 0,T\right] \right) \times L_{{\mathbb {F}}}^{2}\left( \left[ 0,T\right] \right) \).

Proof

We embed \(S_{{\mathbb {F}}}^{2}\left( \left[ 0,T\right] \right) \subset {\mathbb {R}} \times L_{{\mathbb {F}}}^{2}\left( \left[ 0,T\right] \right) \): for \(U\in S_{{\mathbb {F}}}^{2}\left( \left[ 0,T\right] \right) \) we put \(U\left( t\right) =U\left( 0\right) \), \(t\in \left[ -\delta ,0\right] \), and we observe that

$$\begin{aligned} \left( U\left( t\vee 0\right) \right) _{t\in \left[ -\delta ,T\right] }\equiv \left( U\left( 0\right) ,\left( U\left( t\right) \right) _{t\in \left[ 0,T\right] }\right) \in {\mathbb {R}} \times L_{{\mathbb {F}}}^{2}\left( \left[ 0,T\right] \right) {.} \end{aligned}$$

For \(V\in L_{{\mathbb {F}}}^{2}\left( \left[ 0,T\right] \right) \) we use the convention that \(V\left( t\right) =0\), \(t\le 0\). Let \(\left( U,V\right) \in H:={\mathbb {R}} \times L_{{\mathbb {F}}}^{2}\left( \left[ 0,T\right] \right) \times L_{{\mathbb {F}}}^{2}\left( \left[ 0,T\right] \right) \), and let \(\left( p,q\right) \in S_{{\mathbb {F}}}^{2}\left( \left[ 0,T\right] \right) \times L_{{\mathbb {F}}}^{2}\left( \left[ 0,T\right] \right) \left( \subset H\right) \) be the unique solution of the equation

$$\begin{aligned} \left\{ \begin{array} [c]{l} dp\left( t\right) =-\varphi ({\tilde{P}}_{\theta _{t}(U\left( t\right) ,V\left( t\right) ,{\tilde{U}}\left( t-\delta \right) ,{\tilde{V}}\left( t-\delta \right) )})dt+q\left( t\right) dB\left( t\right) ,\quad t\in \left[ 0,T\right] ,\\ p\left( t\right) =p(0),\quad q\left( t\right) =0,\quad t\le 0,\\ p\left( T\right) =\zeta + \int _{T-\delta }^{T} \psi ({\tilde{P}}_{\vartheta _{t}({\tilde{U}}\left( t\right) ,{\tilde{V}}\left( t\right) )})dt. \end{array} \right. \end{aligned}$$

For this observe that the terminal condition is in \(L^{2}\left( \Omega ,{\mathcal {F}},P\right) \) and the given coefficient of the BSDE is \({\mathbb {F}}\)-progressively measurable and square integrable. Let us define

$$\begin{aligned} \Phi \left( U,V\right) :=\left( p,q\right) ,\Phi :H\rightarrow H. \end{aligned}$$

For a suitable \(\beta >0\) which will be specified later, we define the norm

$$\begin{aligned} \left\| \left( U,V\right) \right\| _{\beta }:=\left( {{\mathbb {E}}}[U^{2}\left( 0\right) ]+{{\mathbb {E}}}\left[ \int _{0}^{T} e^{\beta t}(\left| U\left( t\right) \right| ^{2}+\left| V\left( t\right) \right| ^{2})dt\right] \right) ^{\frac{1}{2}},\quad \left( U,V\right) \in H, \end{aligned}$$

which is equivalent to the standard norm \(\left\| \cdot \right\| _{0}\) (obtained for \(\beta =0\)) on H. Note that \(\left( H,\left\| \cdot \right\| _{0}\right) \) is a Banach space, and so is \((H,\left\| \cdot \right\| _{\beta })\). We show that for some \(\delta _{0}>0\), we have for all \(\delta \in \left( 0,\delta _{0}\right] \) that

$$\begin{aligned} \Phi :(H,\left\| \cdot \right\| _{\beta })\rightarrow (H,\left\| \cdot \right\| _{\beta }) \end{aligned}$$

is a contraction, so that there is a unique fixed point \(\left( p,q\right) \in H\) such that \(\Phi \left( p,q\right) =\left( p,q\right) .\) Then \(\left( p,q\right) \) solves BSDE (5.5) and belongs in particular to \(S_{{\mathbb {F}}}^{2}\left( \left[ 0,T\right] \right) \times L_{{\mathbb {F}}}^{2}\left( \left[ 0,T\right] \right) .\) Let \(\left( U^{i},V^{i}\right) \in H,i=1,2,\) and consider \(\left( p^{i},q^{i}\right) =\Phi \left( U^{i},V^{i}\right) \), i.e.,

$$\begin{aligned} \left\{ \begin{array} [c]{l} dp^{i}\left( t\right) =-\varphi \left( {\tilde{P}}_{\theta _{t}\left( U^{i}\left( t\right) ,V^{i}\left( t\right) ,{\tilde{U}}^{i}\left( t-\delta \right) ,{\tilde{V}} ^{i}\left( t-\delta \right) \right) }\right) dt+q^{i}\left( t\right) dB\left( t\right) ,\quad t\in \left[ 0,T\right] ,\\ p^{i}\left( T\right) =\zeta + \int _{T-\delta }^{T} \psi \left( {\tilde{P}}_{\vartheta _{t}\left( {\tilde{U}}^{i}\left( t\right) ,{\tilde{V}} ^{i}\left( t\right) \right) }\right) dt\text {, }i=1,2. \end{array} \right. \end{aligned}$$

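Set \({\bar{p}}:=p^{1}-p^{2}\), \({\bar{q}}:=q^{1}-q^{2}\), \({\bar{U}}:=U^{1}-U^{2}\) and \({\bar{V}}:=V^{1}-V^{2}\). From Itô’s formula applied to \(e^{\beta t}\left| {\bar{p}}\left( t\right) \right| ^{2}\), we obtain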

$$\begin{aligned}&{{\mathbb {E}}}[|{\bar{p}}\left( 0\right) |^{2}]+{{\mathbb {E}}}\left[ \int _{0}^{T} e^{\beta t}\left( \beta \left| {\bar{p}}\left( t\right) \right| ^{2}+\left| {\bar{q}}\left( t\right) \right| ^{2}\right) dt\right] \\&\quad ={{\mathbb {E}}}\left[ e^{\beta T}|{\bar{p}}\left( T\right) |^{2}\right] \\&\qquad +\,2{{\mathbb {E}}}\left[ \int _{0}^{T} e^{\beta t}{\bar{p}}\left( t\right) \left\{ \varphi \left( {\tilde{P}}_{\theta _{t} \left( U^{1}\left( t\right) ,V^{1}\left( t\right) ,{\tilde{U}}^{1}\left( t-\delta \right) ,{\tilde{V}}^{1}\left( t-\delta \right) \right) }\right) \right. \right. \\&\qquad \left. \left. -\,\varphi \left( {\tilde{P}}_{\theta _{t}\left( U^{2}\left( t\right) ,V^{2}\left( t\right) ,{\tilde{U}} ^{2}\left( t-\delta \right) ,{\tilde{V}}^{2}\left( t-\delta \right) \right) }\right) \right\} dt\right] . \end{aligned}$$
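The identity above can be traced back to the product rule. The following is only a sketch: it assumes that the stochastic integral appearing below is a true martingale (which is standard for square integrable solutions), and it writes \(\Delta \varphi \left( t\right) \) for the difference of the two drivers at time t, a shorthand introduced here for illustration only. Subtracting the equations for \(p^{1}\) and \(p^{2}\) gives \(d(p^{1}-p^{2})\left( t\right) =-\Delta \varphi \left( t\right) dt+(q^{1}-q^{2})\left( t\right) dB\left( t\right) \), so that, with the bar denoting differences,

$$\begin{aligned} d\left( e^{\beta t}\left| {\bar{p}}\left( t\right) \right| ^{2}\right)&=e^{\beta t}\left( \beta \left| {\bar{p}}\left( t\right) \right| ^{2}-2{\bar{p}}\left( t\right) \Delta \varphi \left( t\right) +\left| {\bar{q}}\left( t\right) \right| ^{2}\right) dt\\&\quad +2e^{\beta t}{\bar{p}}\left( t\right) {\bar{q}}\left( t\right) dB\left( t\right) . \end{aligned}$$

Integrating over \(\left[ 0,T\right] \), taking expectations (the \(dB\)-integral then vanishes under the martingale assumption) and moving the term containing \(\Delta \varphi \) to the side of \(e^{\beta T}\left| {\bar{p}}\left( T\right) \right| ^{2}\) yields an identity of the stated form.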

Observe that, thanks to Assumptions (H.5) and (H.6),

$$\begin{aligned}&\Bigg |\varphi \Bigg ({\tilde{P}}_{\theta _{t}(U^{1}\left( t\right) ,V^{1}\left( t\right) ,{\tilde{U}}^{1}\left( t-\delta \right) ,{\tilde{V}}^{1}\left( t-\delta \right) )}\Bigg )-\varphi \Bigg ({\tilde{P}}_{\theta _{t}(U^{2}\left( t\right) ,V^{2}\left( t\right) ,{\tilde{U}}^{2}\left( t-\delta \right) ,{\tilde{V}}^{2}\left( t-\delta \right) )}\Bigg )\Bigg |\\&\quad \le CW_{2}\Bigg ({\tilde{P}}_{\theta _{t}(U^{1}\left( t\right) ,V^{1}\left( t\right) ,{\tilde{U}}^{1}\left( t-\delta \right) ,{\tilde{V}}^{1}\left( t-\delta \right) )},{\tilde{P}}_{\theta _{t}(U^{2}\left( t\right) ,V^{2}\left( t\right) ,{\tilde{U}}^{2}\left( t-\delta \right) ,{\tilde{V}}^{2}\left( t-\delta \right) )}\Bigg )\\&\quad \le C({\tilde{{{\mathbb {E}}}}}[|\theta _{t}(U^{1}\left( t\right) ,V^{1}\left( t\right) ,{\tilde{U}}^{1}\left( t-\delta \right) ,{\tilde{V}}^{1}\left( t-\delta \right) )\\&\qquad -\,\theta _{t}(U^{2}\left( t\right) ,V^{2}\left( t\right) ,{\tilde{U}}^{2}\left( t-\delta \right) ,{\tilde{V}}^{2}\left( t-\delta \right) )|^{2}])^{\frac{1}{2}}\\&\quad \le C\Bigg ({\tilde{{{\mathbb {E}}}}}\Bigg [\left| {\bar{U}}\left( t\right) \right| ^{2}+\left| {\bar{V}}\left( t\right) \right| ^{2}+\left| {\bar{U}}\left( t-\delta \right) \right| ^{2}+\left| {\bar{V}}\left( t-\delta \right) \right| ^{2}\Bigg ]\Bigg )^{\frac{1}{2}}\\&\quad \le C\Bigg (\left| {\bar{U}}\left( t\right) \right| +\left| {\bar{V}}\left( t\right) \right| +\Bigg ({{\mathbb {E}}}\Bigg [\left| {\bar{U}}\left( t-\delta \right) \right| ^{2}+\left| {\bar{V}}\left( t-\delta \right) \right| ^{2}\Bigg ]\Bigg )^{\frac{1}{2}}\Bigg ). \end{aligned}$$

Hence, for some small \(\rho >0,\)

$$\begin{aligned}&2{{\mathbb {E}}}\left[ \int _{0}^{T} e^{\beta t}{\bar{p}}\left( t\right) \Bigg \{\varphi \Bigg ({\tilde{P}}_{\theta _{t} (U^{1}\left( t\right) ,V^{1}\left( t\right) ,{\tilde{U}}^{1}\left( t-\delta \right) ,{\tilde{V}}^{1}\left( t-\delta \right) )}\Bigg )-\varphi \Bigg ({\tilde{P}}_{\theta _{t}(U^{2}\left( t\right) ,V^{2}\left( t\right) ,{\tilde{U}} ^{2}\left( t-\delta \right) ,{\tilde{V}}^{2}\left( t-\delta \right) )}\Bigg )\Bigg \}dt\right] \\&\quad \le C_{\rho }{{\mathbb {E}}}\left[ \int _{0}^{T} e^{\beta t}\left| {\bar{p}}\left( t\right) \right| ^{2}dt\right] +\rho C{{\mathbb {E}}}\left[ \int _{0}^{T} e^{\beta t}\Bigg (\left| {\bar{U}}\left( t\right) \right| ^{2}+\left| {\bar{V}}\left( t\right) \right| ^{2}\Bigg )dt\right] \\&\qquad +\rho C{{\mathbb {E}}}\left[ \int _{0}^{T} e^{\beta t}\Bigg (\left| {\bar{U}}\left( t-\delta \right) \right| ^{2}+\left| {\bar{V}}\left( t-\delta \right) \right| ^{2}\Bigg )dt\right] . \end{aligned}$$
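The step from the Lipschitz estimate to this bound is an application of Young's inequality. As a sketch, with real numbers \(a,b,x,y,z\) used purely as placeholders, the two elementary facts involved are

$$\begin{aligned} 2ab\le \tfrac{1}{\rho }a^{2}+\rho b^{2},\qquad \left( x+y+z\right) ^{2}\le 3\left( x^{2}+y^{2}+z^{2}\right) . \end{aligned}$$

Taking \(a={\bar{p}}\left( t\right) \) and letting b be the absolute value of the difference of the two drivers, the first inequality produces the term in \(\left| {\bar{p}}\left( t\right) \right| ^{2}\), so that one admissible choice (an assumption of this sketch, not a statement from the text) is \(C_{\rho }=1/\rho \). The second inequality, applied to the three summands of the preceding Lipschitz bound and followed by taking expectations, produces the two terms with prefactor \(\rho C\), where the constant C may change from line to line.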

Note that

$$\begin{aligned}&\rho C{{\mathbb {E}}}\left[ \int _{0}^{T} e^{\beta t}\Bigg (\left| {\bar{U}}\left( t-\delta \right) \right| ^{2}+\left| {\bar{V}}\left( t-\delta \right) \right| ^{2}\Bigg )dt\right] \\&\quad \le \rho Ce^{\beta \delta }{{\mathbb {E}}}\left[ \int _{0}^{T-\delta } e^{\beta t}\Bigg (\left| {\bar{U}}\left( t\right) \right| ^{2}+\left| {\bar{V}}\left( t\right) \right| ^{2}\Bigg )dt\right] +\rho Ce^{\beta \delta } {{\mathbb {E}}}\Bigg [\left| {\bar{U}}\left( 0\right) \right| ^{2}\Bigg ]. \end{aligned}$$
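This estimate rests on the change of variables \(s=t-\delta \). A sketch for the \({\bar{U}}\)-term, assuming \(\delta \le \min (1,T)\) and assuming that \({\bar{U}}\) is extended to negative times by \({\bar{U}}\left( s\right) ={\bar{U}}\left( 0\right) \) (this extension is inferred from the boundary term in the bound and is not restated in this part of the proof):

$$\begin{aligned} \int _{0}^{T} e^{\beta t}\left| {\bar{U}}\left( t-\delta \right) \right| ^{2}dt&=e^{\beta \delta }\int _{-\delta }^{T-\delta } e^{\beta s}\left| {\bar{U}}\left( s\right) \right| ^{2}ds\\&=e^{\beta \delta }\int _{-\delta }^{0} e^{\beta s}\left| {\bar{U}}\left( 0\right) \right| ^{2}ds+e^{\beta \delta }\int _{0}^{T-\delta } e^{\beta s}\left| {\bar{U}}\left( s\right) \right| ^{2}ds\\&\le e^{\beta \delta }\delta \left| {\bar{U}}\left( 0\right) \right| ^{2}+e^{\beta \delta }\int _{0}^{T-\delta } e^{\beta s}\left| {\bar{U}}\left( s\right) \right| ^{2}ds. \end{aligned}$$

The same substitution applies to the \({\bar{V}}\)-term, where the integral over \(\left[ -\delta ,0\right] \) drops out as soon as \({\bar{V}}\) vanishes for negative times.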

Here we have used that \({\bar{V}}\left( t\right) =0\) for \(t\le 0.\) On the other hand,

$$\begin{aligned}&{{\mathbb {E}}}\Bigg [e^{\beta T}\left| {\bar{p}}\left( T\right) \right| ^{2}\Bigg ]\\&\quad ={{\mathbb {E}}}\Bigg [e^{\beta T}\left( \int _{T-\delta }^{T} \Bigg (\varphi \Bigg ({\tilde{P}}_{\vartheta _{t}({\tilde{U}}^{1}\left( t\right) ,{\tilde{V}}^{1}\left( t\right) )}\Bigg )-\varphi \Bigg ({\tilde{P}}_{\vartheta _{t}({\tilde{U}} ^{2}\left( t\right) ,{\tilde{V}}^{2}\left( t\right) )}\Bigg )\Bigg )dt\right) ^{2}\Bigg ]\\&\quad \le C\delta e^{\beta \delta }{{\mathbb {E}}}\left[ \int _{T-\delta }^{T} e^{\beta t}\Bigg (\left| {\bar{U}}\left( t\right) \right| ^{2}+\left| {\bar{V}}\left( t\right) \right| ^{2}\Bigg )dt\right] . \end{aligned}$$
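The factors \(\delta \) and \(e^{\beta \delta }\) in this bound come from two elementary observations. As a sketch, for a generic square integrable function f on \(\left[ T-\delta ,T\right] \) (a placeholder standing for the difference of the two integrands), the Cauchy–Schwarz inequality and the monotonicity of the exponential give

$$\begin{aligned} \left( \int _{T-\delta }^{T} f\left( t\right) dt\right) ^{2}\le \delta \int _{T-\delta }^{T} \left| f\left( t\right) \right| ^{2}dt,\qquad e^{\beta T}\le e^{\beta \delta }e^{\beta t}\quad \text {for }t\in \left[ T-\delta ,T\right] . \end{aligned}$$

Combining these with the Lipschitz property of \(\varphi \) with respect to the Wasserstein distance \(W_{2}\) leads to an estimate of the stated type, with a constant C that depends only on the Lipschitz constant.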

Letting \(0<\delta \le \rho \), we obtain

$$\begin{aligned}&{{\mathbb {E}}}\Bigg [\left| {\bar{p}}\left( 0\right) \right| ^{2}\Bigg ]+{{\mathbb {E}}}\left[ \int _{0}^{T}e^{\beta t}(\beta \left| {\bar{p}}\left( t\right) \right| ^{2}+\left| {\bar{q}}\left( t\right) \right| ^{2})dt\right] \\&\quad \le C\rho e^{\beta \delta }{{\mathbb {E}}}\Bigg [\left| {\bar{U}}\left( 0\right) \right| ^{2}\Bigg ]+C\rho \left( 1+e^{\beta \delta }\right) {{\mathbb {E}}}\left[ \int _{0}^{T} e^{\beta t}\Bigg (\left| {\bar{U}}\left( t\right) \right| ^{2}+\left| {\bar{V}}\left( t\right) \right| ^{2}\Bigg )dt\right] \\&\qquad +C_{\rho }{{\mathbb {E}}}\left[ \int _{0}^{T} e^{\beta t}\left| {\bar{p}}\left( t\right) \right| ^{2}dt\right] . \end{aligned}$$

We now choose \(\rho =\tfrac{1}{8C}\), \(\beta =C_{\rho }+1\) and \(\delta _{0}\in (0,\frac{1}{8C})\) such that \(\tfrac{1+e^{\beta \delta _{0}}}{8}\le \tfrac{1}{2}\). Then, for all \(\delta \in \left( 0,\delta _{0}\right] ,\)

$$\begin{aligned}&{{\mathbb {E}}}\Bigg [\left| {\bar{p}}\left( 0\right) \right| ^{2}\Bigg ]+{{\mathbb {E}}}\left[ \int _{0}^{T} e^{\beta t}(\left| {\bar{p}}\left( t\right) \right| ^{2}+\left| {\bar{q}}\left( t\right) \right| ^{2})dt\right] \\&\quad \le \tfrac{1}{2}\left( {{\mathbb {E}}}\Bigg [\left| {\bar{U}}\left( 0\right) \right| ^{2}\Bigg ]+{{\mathbb {E}}}\left[ \int _{0}^{T} e^{\beta t}\Bigg (\left| {\bar{U}}\left( t\right) \right| ^{2}+\left| {\bar{V}}\left( t\right) \right| ^{2}\Bigg )dt\right] \right) , \end{aligned}$$

i.e.,

$$\begin{aligned} \left\| \Phi \left( U^{1},V^{1}\right) -\Phi \left( U^{2},V^{2}\right) \right\| _{\beta }^{2}\le \tfrac{1}{2}\left\| \left( U^{1},V^{1}\right) -\left( U^{2},V^{2}\right) \right\| _{\beta }^{2}, \end{aligned}$$

for all \(\left( U^{1},V^{1}\right) ,\left( U^{2},V^{2}\right) \in H.\) This completes the proof. \(\square \)
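The choice of constants at the end of the proof can be made explicit. The following is a sketch under the choices \(\rho =\tfrac{1}{8C}\) and \(\beta =C_{\rho }+1\) made there. The requirement on \(\delta _{0}\) is equivalent to an explicit upper bound:

$$\begin{aligned} \tfrac{1+e^{\beta \delta _{0}}}{8}\le \tfrac{1}{2}\iff e^{\beta \delta _{0}}\le 3\iff \delta _{0}\le \frac{\ln 3}{\beta }=\frac{\ln 3}{C_{\rho }+1}. \end{aligned}$$

Hence any \(\delta _{0}\) with \(0<\delta _{0}<\tfrac{1}{8C}\) and \(\delta _{0}\le \tfrac{\ln 3}{C_{\rho }+1}\) is admissible. For such \(\delta _{0}\) and \(\delta \in \left( 0,\delta _{0}\right] \), the three constants appearing in the final estimate satisfy

$$\begin{aligned} C\rho e^{\beta \delta }\le \tfrac{3}{8}\le \tfrac{1}{2},\qquad C\rho \left( 1+e^{\beta \delta }\right) \le \tfrac{1}{2},\qquad \beta -C_{\rho }=1, \end{aligned}$$

where the last identity is what allows the term in \(\left| {\bar{p}}\left( t\right) \right| ^{2}\) on the right-hand side to be absorbed into the left-hand side.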