1 Introduction

In this paper we consider a class of stochastic optimal control problems where the state equation is a stochastic delay differential equation in \({\mathbb {R}}^n\) of the form

$$\begin{aligned} {\left\{ \begin{array}{ll} dx(t)=f\left( x(t), u(t), \int _{-\infty }^0 \langle \alpha (r), u(t+r)\rangle dr \right) dt \\ \qquad \quad + g \left( x(t), u(t), \int _{-\infty }^0 \langle \beta (r), u(t+r)\rangle dr\right) dW(t), \ \ \ \ t\ge 0,\\ x(0)=x_0\in {\mathbb {R}}^n, \\ u(s)=\varphi (s) \in {\mathbb {R}}^k, \ s\in (-\infty ,0), \end{array}\right. }\nonumber \\ \end{aligned}$$
(1.1)

where W is a Brownian motion with values in \({\mathbb {R}}^m\), x is the state variable with values in \({\mathbb {R}}^n\), u is a control process taking values in a suitable set \(U\subset {\mathbb {R}}^k\), \(x_0 \in {\mathbb {R}}^n\) is the initial value of the state variable, \(\varphi \in L^2({\mathbb {R}}^-,{\mathbb {R}}^k)\) is the given initial path of the control, \(f:{\mathbb {R}}^n\times U\times {\mathbb {R}}\longrightarrow {\mathbb {R}}^n\), \(g:{\mathbb {R}}^n\times U\times {\mathbb {R}}\longrightarrow {\mathbb {R}}^{n\times m}\) and \(\alpha ,\beta \) are \(L^2({\mathbb {R}}^-,{\mathbb {R}}^k)\) functions. The goal is to maximize, over all \(u \in {\mathcal {U}}\), the functional

$$\begin{aligned} J(x_0, u)={\mathbb {E}}\left[ \int _ 0^\infty e^{-\rho t} l\left( x^{x_0,u}(t),u(t),\int _{-\infty }^0\langle \gamma (r),u(t+r)\rangle dr\right) dt \right] , \end{aligned}$$
(1.2)

where \(\gamma \in L^2({\mathbb {R}}^-,{\mathbb {R}}^k)\) and \(l:{\mathbb {R}}^n\times U\times {\mathbb {R}}\longrightarrow {\mathbb {R}}\). The key feature of this class of problems is the integral dependence of all the ingredients (the coefficients f, g of the state equation and the running reward function l) on the path of the control u.

As in the case where the delay dependence is with respect to the state variable, the models that we address lack Markovianity. Consequently, the dynamic programming approach cannot be applied directly. When the delay appears in the state variable but not in the control variable, one available approach to overcome this difficulty consists in rephrasing the finite dimensional problem in a Hilbert space setting, where the constituents of the new problem no longer exhibit a delay-type dependence. The benefit of this approach is to recover Markovianity, hence to allow for an application of the dynamic programming machinery. Clearly, there is a cost to pay in doing so, because a more technical theory is required, in particular for dealing with unbounded second-order Hamilton-Jacobi-Bellman equations on Hilbert spaces. Nevertheless, such a theory has been developed and is available for application (Fabbri et al. 2017). For stochastic optimal control problems with a delay dependence on the state variable, but not on the control variable, see Biffis et al. (2020), Biagini et al. (2022), Djehiche et al. (2022), De Feo et al. (2023), Di Giacinto et al. (2011), Federico (2011), Federico and Tankov (2015), Fuhrman et al. (2010), Masiero and Tessitore (2022), Pang and Yong (2019). See also Cosso et al. (2023); Ren and Rosestolato (2020); Cosso et al. (2023) for a different approach, where no representation in Hilbert space is performed: the problem, presented in a path-dependent framework, is addressed via dynamic programming in the original setting, making use of the so-called pathwise derivatives (see Cont and Fournie (2010a, 2010b) for an account on this topic).

On the other hand, for models with a distributed delay dependence on the control variable, as in the present work, the infinite dimensional representation trick to overcome the lack of Markovianity is not obvious. One way to do it is the one followed originally by Vinter and Kwong (1981), extended to the stochastic case with additive noise in Gozzi and Marinelli (2004), and recently generalized in De Feo (2023) by considering a nonlinear dependence on the present value of the control variable in the diffusion coefficient. In these works, the authors rephrase the original dynamics as an equivalent abstract SDE in a Hilbert space, controlled now only through the present value of the control variable. The drift of such an abstract controlled SDE depends linearly on an unbounded linear operator acting on the infinite dimensional state variable. The setting thereby recovered is then suitable to apply the theory of optimal control in infinite dimension, as developed in Fabbri et al. (2017).

Such a strategy to rephrase the problem strongly relies on the fact that the integral dependence on the past of the control variable appears only linearly in the drift of the original state dynamics. If in our model we used the same representation as in Gozzi and Marinelli (2004), De Feo (2023), the corresponding abstract equation would show a nonlinear dependence, both in the drift and in the diffusion coefficient, on an unbounded linear operator acting on the infinite dimensional state. This structure would make the problem much more difficult, and indeed intractable within the theory in Fabbri et al. (2017).

It is to overcome this issue, hence to let the delay dependence on the control appear nonlinearly both in the drift and in the diffusion coefficient, that we present an alternative representation.

The starting point is the simple observation that the function

$$\begin{aligned} x_1(t)=\int _{-\infty }^tu(s)ds\qquad \qquad \big ( u(s)=\varphi (s)\ \textrm{for}\ s\le 0 \big ) \end{aligned}$$

can be introduced as a second state variable to rewrite the integral dependence in (1.1) and in (1.2) as

$$\begin{aligned} \int _{-\infty }^0 \langle \alpha (r),u(t+r)\rangle dr= & {} \langle \alpha (0),\int _{-\infty }^0u(t+r)dr\rangle - \int _{-\infty }^0 \langle {\alpha '(r)},\int _{-\infty }^r u(t+s)ds\rangle dr\nonumber \\= & {} \langle \alpha (0),x_1(t)\rangle -\int _{-\infty }^0 \langle { \alpha '(r)},x_1(t+r)\rangle dr,\nonumber \\ \end{aligned}$$
(1.3)

and similarly for the other terms involving \(\beta ,\gamma \). The point is that in (1.3) the control variable u no longer appears, but only the newly introduced state \(x_1\), whose dynamics is trivially \(dx_1(t)=u(t)dt\). Of course, if we use (1.3) (and similarly for \(\beta ,\gamma \)) in (1.1) and in (1.2), we still have to deal with a delay model, but now the delay is in the state variable \(x_1\). This fact is important, because we can now perform the standard representation in infinite dimension for models with delay in the state: as said above, an unbounded linear operator will appear in the infinite dimensional dynamics, but it will appear linearly and only in the drift coefficient. Of course, differently from Gozzi and Marinelli (2004), in order to compute (1.3), we need some regularity, namely \(\alpha ,\beta ,\gamma \in W^{1,2}({\mathbb {R}}^-,{\mathbb {R}}^k)\).
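As a sanity check of (1.3), the following minimal sketch compares its two sides numerically in the scalar case k = 1. The choices \(\alpha (r)=e^{r}\), \(\varphi (s)=e^{s}\) for \(s\le 0\), \(u(s)=\cos s\) for \(s>0\), the truncation of the half-line at -40, and all function names are illustrative assumptions, not taken from the paper.

```python
import math

def u(s):
    # Assumed control path: phi(s) = exp(s) for s <= 0, cos(s) for s > 0.
    return math.exp(s) if s <= 0.0 else math.cos(s)

def x1(t):
    # Closed form of x_1(t) = int_{-inf}^t u(s) ds for the control above.
    return math.exp(t) if t <= 0.0 else 1.0 + math.sin(t)

def alpha(r):
    # Assumed kernel in W^{1,2}(R^-); it coincides with its own derivative.
    return math.exp(r)

def alpha_prime(r):
    return math.exp(r)

def trapezoid(func, a, b, n):
    # Composite trapezoidal rule on [a, b] with n subintervals.
    h = (b - a) / n
    total = 0.5 * (func(a) + func(b))
    for i in range(1, n):
        total += func(a + i * h)
    return total * h

LOWER, N = -40.0, 200_000  # truncation of (-inf, 0] and grid size (assumptions)
t = 1.0

# Left-hand side of (1.3): delay written in terms of the control u.
lhs = trapezoid(lambda r: alpha(r) * u(t + r), LOWER, 0.0, N)

# Right-hand side of (1.3): delay written in terms of the new state x_1.
rhs = alpha(0.0) * x1(t) - trapezoid(
    lambda r: alpha_prime(r) * x1(t + r), LOWER, 0.0, N
)

print(lhs, rhs)
```

For this particular choice both sides can also be computed by hand and equal \((\cos 1+\sin 1)/2\) at \(t=1\), so the two printed numbers should agree up to the quadrature error.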

Using this approach, once the problem has been represented in an infinite dimensional setting, one ends up with a structure that can be tackled by appealing to the available theory for dynamic programming in infinite dimension, including the theory of B-viscosity solutions, as presented in Fabbri et al. (2017).

Concerning applications, the model that we present here can be exploited for optimal advertising. Within this field, a basic setting has been provided by the seminal papers (Nerlove and Arrow 1962; Vidale and Wolfe 1957), then extended to the stochastic case, in particular, by Grosset and Viscolani (2004), Marinelli (2007), Motte and Pham (2021), Prasad and Sethi (2008). The delay in the control variable, representing the advertisement spending, is then introduced, in our model as in Gozzi et al. (2009); Gozzi and Marinelli (2004); De Feo (2023), in order to account for a delay effect in the spending, often called carryover effect (see Gozzi et al. (2009); Hartl (1984); Feichtinger et al. (1994)). A further extension of the stochastic linear model with delay in the control, and additive noise, has been recently provided by Gozzi et al. (2024) and Ricciardi and Rosestolato (2024), where a mean field term is introduced, in order to account for non-competitive and competitive environments, respectively.

We point out that when the delay dependence on the control is not in an integral form, as we assumed in the discussion above, but e.g. pointwise, then the representation in infinite dimension is in general more difficult to perform, and other strategies have to be exploited (see e.g. Lefebvre and Miller (2021)).

The plan of the paper is the following. In Sect. 2 we introduce the needed notations. In Sect. 3 we formulate the optimal control problem in finite dimension with delay in the control variable. In Sect. 4 we introduce the infinite dimensional setting and prove Theorem 6, which states the equivalence between the finite dimensional control problem with delay in the control, introduced in Sect. 3, and an infinite dimensional control problem, where there is no delay in the control variable. Finally, in Sect. 5, we show how the representation of Sect. 4 can be used to find an explicit solution for an LQ model, where both the drift and the diffusion coefficient of the state dynamics depend on the path of the delay.

2 Notation and preliminaries

We fix natural numbers \(n,k,m\), which represent the dimensions of the state variable, the control variable, and the Brownian motion, respectively. By \(M_{n\times m}({\mathbb {R}})\) we denote the space of \(n\times m\) matrices with real entries, endowed with the Frobenius norm. For finite dimensional spaces, the Euclidean norm and scalar product will always be denoted by \(|\cdot |\) and \(\langle \cdot ,\cdot \rangle \), respectively, without any subscript. We denote \({\mathbb {R}}^+=[0,+\infty )\) and \({\mathbb {R}}^-=(-\infty ,0]\). If \({\mathcal {T}}\) is any topological space, \({\mathcal {B}}_{\mathcal {T}}\) denotes its Borel sigma-algebra. We fix a filtered probability space \((\Omega ,{\mathcal {F}},{\mathbb {F}}=\{{\mathcal {F}}_t\}_{t\in {\mathbb {R}}^+},{\mathbb {P}})\) satisfying the usual conditions, and an m-dimensional Brownian motion W defined on it. We assume \({\mathbb {F}}\) to be the completion of the natural filtration of W.

Given any separable Banach space \((E,|\cdot |_E)\), we introduce the following function spaces.

  1. (i)

    For \(p\ge 1\), \(L^p(E)\) denotes the space \(L^p({\mathbb {R}}^-,E)\) of E-valued p-Lebesgue integrable functions defined on \({\mathbb {R}}^-\). Its usual \(L^p\)-norm will be denoted by \(|\cdot |_{L^p}\).

  2. (ii)

    For \(p\ge 1\) and any sub-sigma-algebra \({\mathcal {G}}\subset {\mathcal {F}}\), \(L^p_{{\mathcal {G}}}(E)\) denotes the space of \({\mathcal {G}}\)-measurable random variables \(\xi \) such that

    $$\begin{aligned} |\xi |_p{:}{=}\left( {\mathbb {E}} \left[ |\xi |^p_E \right] \right) ^{1/p}<\infty . \end{aligned}$$
  3. (iii)

    For \(t\ge 0\), \(L^0_{{\mathbb {F}},t}(E)\) denotes the space of \(\{{\mathcal {F}}_s\}_{s\in [t,\infty )}\)-progressively measurable processes \(X:\Omega \times [t,\infty )\rightarrow E\) endowed with the (quotient) metrizable topology associated to convergence in measure, when \((\Omega \times [t,\infty ),{\mathcal {F}}\otimes {\mathcal {B}}_{[t,\infty )})\) is endowed with the product measure \({\mathbb {P}}\otimes \lambda \) (\(\lambda \) is the Lebesgue measure).

  4. (iv)

    For \(p\ge 1\) and \(0\le t\le T\), \(L^p_{{\mathbb {F}},t,T}(E)\) denotes the space of \(\{{\mathcal {F}}_s\}_{s\in [t,T]}\)-progressively measurable processes \(X:\Omega \times [t,T]\rightarrow E\) such that

    $$\begin{aligned} |X|_{p,t,T} {:}{=}\left( {\mathbb {E}} \left[ \int _t^T |X_s|^p_E ds \right] \right) ^{1/p}<\infty . \end{aligned}$$

    The couple \((L^p_{{\mathbb {F}},t,T}(E),|\cdot |_{p,t,T})\) is a Banach space.

  5. (v)

    For \(p\ge 1\) and \(t\ge 0\), \(L^p_{{\mathbb {F}},t}(E)\) denotes the Fréchet space of processes \(X\in L^0_{{\mathbb {F}},t}(E)\) such that \(|X|_{p,t,T} <\infty \) for all \(T >t\).

  6. (vi)

    For \(p\ge 1\) and \(0\le t\le T\), \({\textbf{S}}^p_{{\mathbb {F}},t,T}(E)\) denotes the Fréchet space of continuous processes \(X\in L^p_{{\mathbb {F}},t,T}(E)\) such that

    $$\begin{aligned} \Vert X\Vert _{p,t,T} {:}{=}\left( {\mathbb {E}} \left[ \sup _{s\in [t,T]} |X_s|^p_E \right] \right) ^{1/p}<\infty . \end{aligned}$$
  7. (vii)

    For \(p\ge 1\) and \(t\ge 0\), \({\textbf{S}}^p_{{\mathbb {F}},t}(E)\) denotes the Fréchet space of continuous processes \(X\in L^p_{{\mathbb {F}},t}(E)\) such that \(\Vert X\Vert _{p,t,T}<\infty \) for all \(T>t\).

If E, F are Banach spaces, the space L(E, F) of linear and continuous operators \(E\rightarrow F\) is considered as endowed with the operator norm, denoted by \(|\cdot |_{{\mathcal {L}}(E,F)}\).

If K is a Hilbert space, its scalar product will be denoted by \(\langle \cdot ,\cdot \rangle _K\). When \(K=L^2({\mathbb {R}}^k)\), we simply write \(\langle \cdot ,\cdot \rangle _{L^2}\).

We assume that the control variable takes values in a nonempty Borel set \(U\subset {\mathbb {R}}^k\). The control processes that we take into consideration are those belonging to the set

$$\begin{aligned} {\mathcal {U}}{:}{=}\left\{ u:\Omega \times {\mathbb {R}}^+\rightarrow U\ \mathrm {such\ that}\ u\in L^2_{{\mathbb {F}},0}({\mathbb {R}}^k) \right\} \end{aligned}$$

For given \(\alpha :{\mathbb {R}}^-\rightarrow {\mathbb {R}}^k\) and \(\beta :{\mathbb {R}}^+\rightarrow {\mathbb {R}}^k\), and for given times \(t_0,t\in {\mathbb {R}}^+\), \(t_0\le t\), we denote by \(\alpha \otimes ^{t_0}_t\beta \) the function \({\mathbb {R}}^-\rightarrow {\mathbb {R}}^k\) defined by

$$\begin{aligned} \alpha \otimes ^{t_0}_t\beta (s) {:}{=}{\left\{ \begin{array}{ll} \alpha ((t-t_0)+s) &{}\quad \textrm{if}\ s\in (-\infty ,-(t-t_0)]\\ \beta (t+s) &{}\quad \textrm{if}\ s\in (-(t-t_0),0]. \end{array}\right. } \end{aligned}$$

Notice that, if \(\varphi \in L^2({\mathbb {R}}^k)\) and \(u\in {\mathcal {U}}\), then \(\varphi \otimes ^{t_0} u=\{\varphi \otimes ^{t_0}_t u\}_{t\ge t_0}\) belongs to \(L^2_{{\mathbb {F}},t_0}(L^2({\mathbb {R}}^k))\).
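To make the concatenation \(\alpha \otimes ^{t_0}_t\beta \) concrete, here is a minimal sketch for deterministic scalar paths (k = 1). The function name `concatenate` and the sample paths `phi` and `u` are hypothetical and serve only as an illustration of the definition above.

```python
from typing import Callable

Path = Callable[[float], float]

def concatenate(phi: Path, u: Path, t0: float, t: float) -> Path:
    """Return s -> (phi ⊗^{t0}_t u)(s), a function defined for s <= 0."""
    if t < t0:
        raise ValueError("the definition requires t0 <= t")
    lag = t - t0

    def glued(s: float) -> float:
        if s > 0.0:
            raise ValueError("the concatenated path lives on R^- only")
        # Past part: phi shifted by the elapsed time; recent part: u.
        return phi(lag + s) if s <= -lag else u(t + s)

    return glued

# Hypothetical data: past control phi on R^-, control u on R^+.
phi = lambda s: s
u = lambda s: 10.0 + s

w = concatenate(phi, u, t0=1.0, t=3.0)
samples = [w(-3.0), w(-2.0), w(-1.0), w(0.0)]
print(samples)
```

Seen from time t = 3 with initial time 1, the window of the last two time units is filled by u, and everything older by the shifted past control `phi`.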

3 The optimal control problem

3.1 State equation

For an initial time \(t\in {\mathbb {R}}^+\), an initial state \(\xi \in L^2_{{\mathcal {F}}_t}({\mathbb {R}}^n)\), and a control process \(u\in {\mathcal {U}}\), we consider a state process x evolving according to the following delayed controlled stochastic differential equation:

$$\begin{aligned} {\left\{ \begin{array}{ll} &{} dx(s)= f(x(s),u(s),\langle \alpha ,\varphi \otimes ^t_s u\rangle _{L^2} )ds + g({x(s),u(s)},\langle \beta ,\varphi \otimes ^t_s u\rangle _{L^2})dW(s) \qquad \\ &{}\qquad \forall s\in (t,+\infty )\\ &{} x(t)=\xi \end{array}\right. } \end{aligned}$$
(3.1)

where we recall that \(\langle \cdot ,\cdot \rangle _{L^2}\) denotes the scalar product in \(L^2({\mathbb {R}}^k)\), and the data \(f,g,\alpha ,\beta ,\varphi \) are assumed to satisfy the following assumptions.

Assumption 1

The functions

$$\begin{aligned} f:{\mathbb {R}}^n\times U\times {\mathbb {R}}\rightarrow {\mathbb {R}}^n \qquad \textrm{and} \qquad g:{\mathbb {R}}^n\times U\times {\mathbb {R}}\rightarrow M_{n\times m}({\mathbb {R}}), \end{aligned}$$

are such that

  1. (i)

    f, g are measurable;

  2. (ii)

    there exists a constant L such that

    $$\begin{aligned} \begin{aligned} |f(x,u,r)- f(x',u,r)|&\le L|x-x'|\\ |g(x,u,r)- g(x',u,r)|&\le L|x-x'|\\ |f(0,u,r)| + |g(0,u,r)|&\le L(1+|u|+|r|) \end{aligned} \end{aligned}$$

    for all \((x,u,r)\in {\mathbb {R}}^n\times U\times {\mathbb {R}}\).

Assumption 2

  1. (i)

    \(\alpha \), \(\beta \) are functions belonging to \(W^{1,2}({\mathbb {R}}^-,{\mathbb {R}}^k)\);

  2. (ii)

    \(\varphi \in L^1({\mathbb {R}}^k)\cap L^2({\mathbb {R}}^k)\) is such that \(\int _{-\infty }^\cdot \varphi (r)dr \in L^2({\mathbb {R}}^k)\).

Notice that Assumption 2(ii) is satisfied whenever \(\varphi \in L^2({\mathbb {R}}^k)\) has compact support.
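To see why, here is a short sketch under the illustrative assumption that \(\varphi \) vanishes outside \([-R,0]\) for some \(R>0\) (the symbol R is introduced only for this remark). By the Cauchy-Schwarz inequality,

$$\begin{aligned} |\varphi |_{L^1}=\int _{-R}^0|\varphi (r)|dr\le \sqrt{R}\,|\varphi |_{L^2}, \qquad \left| \int _{-\infty }^r\varphi (h)dh\right| \le |\varphi |_{L^1}{\textbf{1}}_{[-R,0]}(r)\quad \forall r\in {\mathbb {R}}^-, \end{aligned}$$

so that \(\left| \int _{-\infty }^\cdot \varphi (h)dh\right| _{L^2}\le \sqrt{R}\,|\varphi |_{L^1}\le R\,|\varphi |_{L^2}<\infty \).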

We have the following well-posedness result for the state equation and continuity and growth properties of the strong solution.

Proposition 3

For \(t\in {\mathbb {R}}^+\), \(\xi \in L^2_{{\mathcal {F}}_t}({\mathbb {R}}^n)\), and \(u\in {\mathcal {U}}\), there exists a unique strong solution \(x^{t,\xi ,u}\in L^0_{{\mathbb {F}},t}({\mathbb {R}}^n)\) of (3.1). Moreover, \(x^{t,\xi ,u}\in {\textbf{S}}^2_{{\mathbb {F}},t}({\mathbb {R}}^n)\) and

  1. (a)

    for any \(M>1\), there exists a constant \(C(M,L)\) depending only on \(M,L\) such that

    $$\begin{aligned}{} & {} \sup _{\begin{array}{c} u\in {\mathcal {U}}\\ \varphi \in L^2({\mathbb {R}}^k) \end{array}} \Vert x^{t,\xi ,u}- x^{t,\xi ',u} \Vert _{2,t,T}\le M e^{C(M,L)\cdot (T-t)}\cdot |\xi -\xi '|_2\nonumber \\{} & {} \quad \forall 0\le t\le T,\ \xi ,\xi '\in L^2_{{\mathcal {F}}_t}({\mathbb {R}}^n); \end{aligned}$$
    (3.2)
  2. (b)

    there exists \({\hat{C}}={\hat{C}}(L,|\alpha |_{L^2},|\beta |_{L^2})\), depending only on \(L,|\alpha |_{L^2},|\beta |_{L^2}\), and \({\hat{D}}={\hat{D}}(L)\), depending only on L, such that

    $$\begin{aligned} \Vert x^{t,\xi ,u} \Vert _{2,t,T}\le {\hat{C}} \left( 1+|\xi |_2+|\varphi |_{L^2({\mathbb {R}}^k)}+|u|_{2,t,T} \right) \cdot \, e^{{\hat{D}}\cdot (T-t)}, \end{aligned}$$
    (3.3)

    for all \(\varphi \in L^2({\mathbb {R}}^k)\), \(u\in {\mathcal {U}}\), \(0\le t\le T\), and \(\xi \in L^2_{{\mathcal {F}}_t}({\mathbb {R}}^n)\).

We omit the proof, since it is based on standard arguments. To give the reader an idea for (3.2), consider e.g. Proposition 2.8 in Cosso et al. (2023). There, the Lipschitz constant is not expressed as in (3.2). Nevertheless, by inspection, one can check that the constant \(\gamma \) in Claim III of the proof of Proposition 2.8, at p. 2897 in Cosso et al. (2023), can be made arbitrarily close to 0, as long as \(\varepsilon \) is small enough. This fact entails that the Lipschitz constant in Proposition 2.8 in Cosso et al. (2023) can be made arbitrarily close to 1, as long as \(T-t\) is small enough. This provides our M in (3.2), as long as \(T-t\) is small enough. For general intervals [t, T], one can use the estimate obtained for small \(T-t\), combined with the flow property of solutions. In this way one obtains the exponential term in (3.2).

To obtain (3.3), with an explicit growth constant \({\hat{D}}\), one can argue as in the proof of (Fabbri et al. (2017), Proposition 3.24, p. 187).

3.2 Objective functional and value function

We consider a discount factor \(\rho >0\) and a current reward function \( l:{\mathbb {R}}^n\times {\mathbb {R}}^k\times {\mathbb {R}} \rightarrow {\mathbb {R}} \) on which we impose the following assumptions.

Assumption 4

  1. (i)

    The function l is measurable.

  2. (ii)

    There exist constants \(a\ge 0\), \(0\le q \le 2\), \(d>0,{\theta > q}\) such that

    $$\begin{aligned} l(x,u,r)\le a(1+|x|^q+|r|^q)-d |u|^\theta \qquad \forall u\in U,\ x\in {\mathbb {R}}^n,\ r\in {\mathbb {R}}. \end{aligned}$$
  3. (iii)

    \(\rho > 2{\hat{D}}\), where \({\hat{D}}\) is as in (3.3).

  4. (iv)

    \(\gamma \) is a function belonging to \(W^{1,2}({\mathbb {R}}^-,{\mathbb {R}}^k)\).

Under Assumptions 1 and 4, it follows from Proposition 3 that the reward functional J, given by

$$\begin{aligned} J(x_0,u)\,{:}{=}\, {\mathbb {E}} \left[ \int _0^\infty e^{-\rho t} l\big ( x^{0,x_0,u}(t),u(t),\langle \gamma ,\varphi \otimes ^0_t u\rangle _{L^2} \big )dt \right] \quad \forall x_0\in {\mathbb {R}}^n,\ u\in {\mathcal {U}},\nonumber \\ \end{aligned}$$
(3.4)

is well-defined as a function \({\mathbb {R}}^n\times {\mathcal {U}}\rightarrow {\mathbb {R}}\).

We then consider the optimal control problem consisting in maximizing J over the set of admissible controls \({\mathcal {U}}\), for any given \(x_0\in {\mathbb {R}}^n\):

$$\begin{aligned} \sup _{u\in {\mathcal {U}}} J(x_0,u). \end{aligned}$$
(f-OCP)

For \(x_0\in {\mathbb {R}}^n\), we define the value function

$$\begin{aligned} V(x_0)\,{:}{=}\, \sup _{u\in {\mathcal {U}}} J(x_0,u). \end{aligned}$$

4 Representation in infinite dimension

Due to the dependence on the past of the control variable u, the finite dimensional stochastic dynamics (3.1) is not Markovian. This feature entails that the standard dynamic programming approach cannot be applied to the finite dimensional stochastic optimal control problem (f-OCP). A classical workaround to regain Markovianity consists in rephrasing the model in a functional space setting.

In order to do that, we start by introducing the Hilbert space

$$\begin{aligned} H\,{:}{=}\, {\mathbb {R}}^n\times {\mathbb {R}}^k\times L^2({\mathbb {R}}^k), \end{aligned}$$

endowed with the induced scalar product

$$\begin{aligned} \langle z,y\rangle _H = \langle z_0,y_0\rangle + \langle z_1,y_1\rangle + \langle z_2,y_2\rangle _{L^2}, \end{aligned}$$

where \(z=(z_0,z_1,z_2)\), \(z_0\in {\mathbb {R}}^n,\ z_1\in {\mathbb {R}}^k,\ z_2\in L^2({\mathbb {R}}^k)\), and similarly for y.

4.1 Reformulation of the state equation in H

Then consider functions \({\hat{F}}, G\), associated to \(f,g\), respectively, defined by

$$\begin{aligned} \begin{aligned}&{\hat{F}}:H\times U\rightarrow H,\ (z,u)\mapsto \left( f\left( z_0,u,\langle \alpha (0),z_1\rangle -\langle \alpha ',z_2\rangle _{L^2}\right) ,u,0 \right) \\&G:H\times U\rightarrow L({\mathbb {R}}^m,H),\ (z,u)\mapsto \left( g\left( z_0,u, \langle \beta (0),z_1\rangle -\langle \beta ',z_2\rangle _{L^2}\right) ,0,0\right) \end{aligned}\nonumber \\ \end{aligned}$$
(4.1)

where \(z=(z_0,z_1,z_2)\) denotes a generic point of \( H= {\mathbb {R}}^n\times {\mathbb {R}}^k\times L^2({\mathbb {R}}^k)\). Notice that the pointwise evaluations \(\alpha (0),\beta (0)\) and the square-integrable derivatives \(\alpha ',\beta '\) exist because of Assumption 2(i) on \(\alpha ,\beta \).

Consider the family of operators \({\hat{S}}=\{{\hat{S}}_t\}_{t\in {\mathbb {R}}^+}\) defined, for \(t\in {\mathbb {R}}^+\), by

$$\begin{aligned} {\hat{S}}_t:H\rightarrow H,\ z\mapsto (z_0,z_1,z_2(t+\cdot ){\textbf{1}}_{(-\infty ,-t)}(\cdot )+z_1{\textbf{1}}_{[-t,0]}(\cdot )). \end{aligned}$$

Then \({\hat{S}}\) is a strongly continuous semigroup, with infinitesimal generator \((D({\hat{A}}),{\hat{A}})\) specified by

$$\begin{aligned} {\hat{A}}:D({\hat{A}})\rightarrow H,\ z\mapsto (0,0,z_2') \end{aligned}$$

with

$$\begin{aligned} D({\hat{A}})= \left\{ z\in H:z_2\in W^{1,2}({\mathbb {R}}^-,{\mathbb {R}}^k),\ z_1=z_2(0) \right\} . \end{aligned}$$
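The action of \({\hat{S}}_t\) and its semigroup property can be illustrated with a minimal sketch in the scalar case n = k = 1, where the third component \(z_2\) is represented as a plain Python function on \({\mathbb {R}}^-\). The name `shift` and the sample state are assumptions made only for this illustration.

```python
import math
from typing import Callable, Tuple

State = Tuple[float, float, Callable[[float], float]]

def shift(t: float, z: State) -> State:
    """Apply S-hat_t: keep z0 and z1, shift z2 and fill [-t, 0] with z1."""
    z0, z1, z2 = z

    def shifted(r: float) -> float:
        # r < -t: old profile moved back by t; otherwise the constant z1.
        return z2(t + r) if r < -t else z1

    return (z0, z1, shifted)

# Hypothetical state in D(A-hat): z2 = exp on R^- and z1 = z2(0) = 1.
z: State = (2.0, 1.0, lambda r: math.exp(r))

t, s = 0.5, 0.25
lhs = shift(t, shift(s, z))   # S-hat_t applied after S-hat_s
rhs = shift(t + s, z)         # S-hat_{t+s}

# Dyadic grid on [-2, 0], so that all sums below are exact in floating point.
grid = [-0.125 * i for i in range(17)]
max_gap = max(abs(lhs[2](r) - rhs[2](r)) for r in grid)
print(max_gap)
```

The first two components are left untouched, which is consistent with the generator \({\hat{A}}\) acting only on the third component.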

Then we consider the H-valued dynamics

$$\begin{aligned} {\left\{ \begin{array}{ll} d{\hat{Z}}(s)= \left( {\hat{A}} {\hat{Z}}(s) + {\hat{F}}({\hat{Z}}(s),u(s)) \right) ds\\ \qquad \qquad + G ({\hat{Z}}(s), { u(s)})dW(s)&{}\qquad s\in (t,T]\\ {\hat{Z}}(t) = \zeta , \end{array}\right. } \end{aligned}$$
(4.2)

where \(u\in {\mathcal {U}}\), \(\zeta \in L^2_{{\mathcal {F}}_t}(H)\), \(t\in {\mathbb {R}}^+\).

Observe that, for fixed \(u\in {\mathcal {U}}\), Assumption 1 on f, g entails Lipschitz continuity and sublinear growth with respect to z of \({\hat{F}}\) and G.

As can be easily checked, \({\hat{S}}\) is a \(C_0\)-semigroup of pseudo-contractions (see e.g. Appendix B.4 in Fabbri et al. (2017) for the definition). It follows that there exists a unique mild solution \({\hat{Z}}\) to (4.2), and the mild solution has a continuous version (Gawarecki and Mandrekar 2011, Theorem 3.3).

We could directly link \({\hat{Z}}^{t,\zeta ,u}\) to \(x^{t,\xi ,u}\), but, with the purpose to set up a framework suitable to be investigated in future works within the theory of B-viscosity solutions, as presented in Fabbri et al. (2017), Chapter 3, we need a dynamic representation \({\hat{Z}}\) similar to (4.2) but with the unbounded term appearing in the drift being the generator of a \(C_0\)-semigroup of contractions (see e.g. Appendix B.4 in Fabbri et al. (2017) for the definition). A simple way to do that consists in introducing the semigroup

$$\begin{aligned} S\,{:}{=}\, \{S_t\,{:}{=}\, e^{-t/2}{\hat{S}}_t\}_{t\in {\mathbb {R}}^+}, \end{aligned}$$

which is a semigroup of contractions, as it is easily seen by straightforward computations. The generator (D(A), A) of S is specified by

$$\begin{aligned} D(A)= D({\hat{A}})\qquad \textrm{and} \qquad Az={\hat{A}}z-\frac{z}{2},\ \forall z\in D(A). \end{aligned}$$
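Regarding the contraction property of S stated above, the computation can be sketched as follows, using only the definition of \({\hat{S}}_t\) and of the norm of H. For \(z\in H\) and \(t\ge 0\),

$$\begin{aligned} |{\hat{S}}_tz|_H^2&= |z_0|^2+|z_1|^2+\int _{-\infty }^{-t}|z_2(t+r)|^2dr+\int _{-t}^0|z_1|^2dr\\&= |z_0|^2+(1+t)|z_1|^2+|z_2|^2_{L^2} \le (1+t)|z|_H^2\le e^{t}|z|_H^2, \end{aligned}$$

so that \(|{\hat{S}}_t|_{{\mathcal {L}}(H,H)}\le e^{t/2}\), and hence \(|S_t|_{{\mathcal {L}}(H,H)}=e^{-t/2}|{\hat{S}}_t|_{{\mathcal {L}}(H,H)}\le 1\) for all \(t\ge 0\).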

To use A in place of \({\hat{A}}\) in (4.2), we apply a translation to the bounded part of the drift \({\hat{F}}\), defining

$$\begin{aligned} F:H\times U\rightarrow H,\ (z,u)\mapsto \left( f\left( z_0,u, \langle \alpha (0),z_1\rangle -\langle \alpha ',z_2\rangle _{L^2}\right) ,u,0 \right) +\frac{z}{2}.\nonumber \\ \end{aligned}$$
(4.3)

Finally, we consider the H-valued dynamics

$$\begin{aligned} {\left\{ \begin{array}{ll} d Z(s)= \left( A Z(s) + F( Z(s),u(s)) \right) ds+ G( Z(s), { u(s)})dW(s)&{}\qquad s\in (t,T]\\ Z(t) = \zeta , \end{array}\right. }\nonumber \\ \end{aligned}$$
(4.4)

where \(u\in {\mathcal {U}}\), \(\zeta \in L^2_{{\mathcal {F}}_t}(H)\), \(t\in {\mathbb {R}}^+\). As noticed for (4.2), also (4.4) admits a unique mild solution \(Z^{t,\zeta ,u}\), that can be assumed to be pathwise continuous. It should also be clear that \(Z^{t,\zeta ,u}={\hat{Z}}^{t,\zeta ,u}\). For future reference, we state this result in a proposition.

Proposition 5

For \(u\in {\mathcal {U}}, \zeta \in L^2_{{\mathcal {F}}_t}(H), t\in {\mathbb {R}}^+\), there exists a unique (up to indistinguishability) pathwise-continuous mild solution \(Z^{t,\zeta ,u}\). Moreover, \(Z^{t,\zeta ,u}\in {\textbf{S}}^2_{{\mathbb {F}},t}(H)\), and \(Z^{t,\zeta ,u}={\hat{Z}}^{t,\zeta ,u}\), where \({\hat{Z}}^{t,\zeta ,u}\) is the unique mild solution to (4.2).

Proof

For existence and uniqueness, and integral estimates, see (Gawarecki and Mandrekar (2011), Theorem 3.3).

Regarding the fact that \(Z^{t,\zeta ,u}={\hat{Z}}^{t,\zeta ,u}\), argue as in Rosestolato and Swiech (2017), pp. 1901–1902. \(\square \)

Denote by

$$\begin{aligned} P_0:H\rightarrow {\mathbb {R}}^n,\ z\mapsto z_0\qquad P_1:H\rightarrow {\mathbb {R}}^k,\ z\mapsto z_1\qquad P_2:H\rightarrow L^2({\mathbb {R}}^k),\ z\mapsto z_2 \end{aligned}$$

the orthogonal projections of \(H={\mathbb {R}}^n\times {\mathbb {R}}^k\times L^2({\mathbb {R}}^k)\) onto \({\mathbb {R}}^n,{\mathbb {R}}^k, L^2({\mathbb {R}}^k)\), respectively.

The following result explains the link between the mild solution \(Z^{t,\zeta ,u}\) of (4.4) and the strong solution \(x^{t,\xi ,u}\) of (3.1).

Theorem 6

Let \(t\in {\mathbb {R}}^+\), \(\xi \in L^2_{{\mathcal {F}}_t}({\mathbb {R}}^n)\), \(u\in {\mathcal {U}}\). Let \(\zeta ^{\xi ,\varphi }=(\zeta _0,\zeta _1,\zeta _2)\in L^2_{{\mathcal {F}}_t}(H)\) be defined by

$$\begin{aligned} \zeta _0{:}{=}\xi , \qquad \zeta _1{:}{=}\int _{-\infty }^0\varphi (h)dh, \qquad \textrm{and} \qquad \zeta _2(r){:}{=}\int _{-\infty }^r \varphi (h)dh\quad \forall r\in {\mathbb {R}}^-. \end{aligned}$$

Then \(Z^{t,\zeta ^{\xi ,\varphi },u}=(Y_0,Y_1,Y_2)\), where, for \(s\ge t\),

$$\begin{aligned} Y_0(s)\,{:}{=}\, x^{t,\xi ,u}(s), \qquad Y_1(s)\,{:}{=}\,\zeta _1+\int _t^s u(h)dh, \qquad \textrm{and} \qquad \big ( Y_2(s) \big ) (r)\,{:}{=}\,\int _{-\infty }^r \varphi \otimes ^t_s u(h)dh\quad \forall r\in {\mathbb {R}}^-. \end{aligned}$$
(4.5)

Proof

First, we notice that, by Assumption 2(ii), \(\zeta _1\) and \(\zeta _2\) are well-defined, and \(\zeta _2\in L^2({\mathbb {R}}^k)\). Then, we observe that, integrating by parts, for \(s\ge t\),

$$\begin{aligned} \begin{aligned} \langle \alpha ,\varphi \otimes ^t_su\rangle _{L^2}&= \left\langle \alpha (0),\int _{-\infty }^0 \varphi \otimes ^t_su(h)dh\right\rangle -\int _{-\infty }^0\left\langle \alpha '(h), \int _{-\infty }^h \varphi \otimes ^t_su(r)dr \right\rangle dh\\&=\langle \alpha (0),Y_1(s)\rangle -\langle \alpha ', Y_2(s)\rangle _{L^2}\\ \langle \beta ,\varphi \otimes ^t_su\rangle _{L^2}&=\mathrm {(similarly)} =\langle \beta (0),Y_1(s)\rangle -\langle \beta ', Y_2(s)\rangle _{L^2}, \end{aligned} \end{aligned}$$
(4.6)

where we have used the fact that

$$\begin{aligned} Y_1(s)=\zeta _1+\int _t^s u(h)dh=\int _{-\infty }^0 \varphi \otimes ^t_s u(h)dh. \end{aligned}$$

Now let \(Y=(Y_0,Y_1,Y_2)\), where \(Y_0,Y_1,Y_2\) are as defined by (4.5). Let \(x=x^{t,\xi ,u}\) be the strong solution of (3.1). Due to the fact that the operator \({\hat{S}}_s\) (\(s\in {\mathbb {R}}^+\)) is the identity with respect to the first component, we can write, for \(s\ge t\),

$$\begin{aligned} Y_0(s)=x(s)= & {} \xi + \int _t^s f(x(h),u(h),\langle \alpha ,\varphi \otimes ^t_h u \rangle _{L^2})dh\nonumber \\{} & {} \quad + \int _t^s g(x(h),u(h),\langle \beta ,\varphi \otimes ^t_h u \rangle _{L^2})dW(h)\nonumber \\= & {} Y_0(t) + \int _t^s f(Y_0(h),u(h),\langle \alpha (0),Y_1(h)\rangle -\langle \alpha ',Y_2(h)\rangle _{L^2})dh\nonumber \\{} & {} + \int _t^s g(Y_0(h),u(h),\langle \beta (0),Y_1(h)\rangle -\langle \beta ',Y_2(h)\rangle _{L^2})dW(h)\nonumber \\= & {} P_0\left( {\hat{S}}_{s-t}\zeta ^{\xi ,\varphi }\right) + \int _t^s P_0\left( {\hat{S}}_{s-h}{\hat{F}}(Y(h),u(h))\right) dh\nonumber \\{} & {} \quad + \int _t^s P_0\left( {\hat{S}}_{s-h} G(Y(h),u(h))\right) dW(h)\nonumber \\= & {} P_0\left( {\hat{S}}_{s-t}\zeta ^{\xi ,\varphi } + \int _t^s {\hat{S}}_{s-h}{\hat{F}}(Y(h),u(h))dh \right. \nonumber \\ {}{} & {} \quad \left. + \int _t^s {\hat{S}}_{s-h} G(Y(h),u(h))dW(h)\right) . \end{aligned}$$
(4.7)

Regarding \(Y_1\), exploiting now the fact that \({\hat{S}}_s\) (\(s\in {\mathbb {R}}^+\)) is the identity in the second component, we have

$$\begin{aligned} Y_1(s)= & {} \int _{-\infty }^0\varphi \otimes ^t_su(h)dh =\int _{-\infty }^0 \varphi (h) dh +\int _t^s u(h)dh =\zeta _1 +\int _t^s u(h)dh \nonumber \\= & {} P_1\left( {\hat{S}}_{s-t}\zeta ^{\xi ,\varphi }\right) + \int _t^s P_1\left( {\hat{S}}_{s-h}{\hat{F}}(Y(h),u(h))\right) dh\nonumber \\{} & {} + \int _t^s P_1\left( {\hat{S}}_{s-h} G(Y(h),u(h))\right) dW(h)\nonumber \\= & {} P_1\left( {\hat{S}}_{s-t}\zeta ^{\xi ,\varphi }\right. \nonumber \\{} & {} \left. + \int _t^s {\hat{S}}_{s-h}{\hat{F}}(Y(h),u(h))dh + \int _t^s {\hat{S}}_{s-h} G(Y(h),u(h))dW(h)\right) \end{aligned}$$
(4.8)

Regarding \(Y_2\), we have, denoting by \({\textbf{0}}\) the function zero in \(L^2({\mathbb {R}}^k)\),

$$\begin{aligned} Y_2(s)&= \int _{-\infty }^\cdot \varphi \otimes ^t_s u(h)dh\\&= \left( \int _{-\infty }^\cdot \varphi \otimes ^t_s u(h)dh\right) {\textbf{1}}_{(-\infty ,-(s-t))} + \left( \int _{-\infty }^\cdot \varphi \otimes ^t_s u(h)dh\right) {\textbf{1}}_{[-(s-t),0]}\\&= \zeta _2((s-t)+\cdot ){\textbf{1}}_{(-\infty ,-(s-t))} + \left( \int _{-\infty }^\cdot \varphi \otimes ^t_s {\textbf{0}}(h)dh+ \int _{-\infty }^\cdot {\textbf{0}}\otimes ^t_s u(h)dh\right) {\textbf{1}}_{[-(s-t),0]}\\&= \zeta _2((s-t)+\cdot ){\textbf{1}}_{(-\infty ,-(s-t))} +\left( \int _{-\infty }^0\varphi (h)dh\right) {\textbf{1}}_{[-(s-t),0]} + \left( \int _{-\infty }^\cdot {\textbf{0}}\otimes ^t_s u(h)dh\right) {\textbf{1}}_{[-(s-t),0]}\\&= \zeta _2((s-t)+\cdot ){\textbf{1}}_{(-\infty ,-(s-t))} +\zeta _1{\textbf{1}}_{[-(s-t),0]} + \left( \int _{-(s-t)}^\cdot {\textbf{0}}\otimes ^t_s u(h)dh\right) {\textbf{1}}_{[-(s-t),0]}\\&= P_2({\hat{S}}_{s-t}\zeta ^{\xi ,\varphi })+ \left( \int _{-(s-t)}^\cdot {\textbf{0}}\otimes ^t_s u(h)dh\right) {\textbf{1}}_{[-(s-t),0]}\\&= \text {(this passage is justified below)}\\&= P_2({\hat{S}}_{s-t}\zeta ^{\xi ,\varphi })+ \int _t^s u(h) {\textbf{1}}_{[-(s-h),0]}(\cdot )dh \qquad \text {(this is a Bochner integral in the function space }L^2({\mathbb {R}}^k))\\&= P_2({\hat{S}}_{s-t}\zeta ^{\xi ,\varphi })+ \int _t^s P_2\left( {\hat{S}}_{s-h}{\hat{F}}(Y(h),u(h)) \right) dh + \int _t^s P_2\left( {\hat{S}}_{s-h} G(Y(h),u(h)) \right) dW(h)\\&= P_2\left( {\hat{S}}_{s-t}\zeta ^{\xi ,\varphi }+ \int _t^s {\hat{S}}_{s-h}{\hat{F}}(Y(h),u(h)) dh + \int _t^s {\hat{S}}_{s-h} G(Y(h),u(h)) dW(h) \right) . \end{aligned}$$
(4.9)

To justify the equality in \(L^2({\mathbb {R}}^k)\)

$$\begin{aligned} \left( \int _{-(s-t)}^\cdot {\textbf{0}}\otimes ^t_s u(h)dh\right) {\textbf{1}}_{[-(s-t),0]} = \int _t^s u(h) {\textbf{1}}_{[-(s-h),0]}(\cdot )dh, \end{aligned}$$
(4.10)

we pick any \(a\in L^2({\mathbb {R}}^k)\), and compute

$$\begin{aligned} \langle a, \left( \int _{-(s-t)}^\cdot {\textbf{0}}\otimes ^t_s u(h)dh\right){} & {} {\textbf{1}}_{[-(s-t),0]} \rangle _{L^2({\mathbb {R}}^k)} \\ {}{} & {} =\int _{-(s-t)}^0 a(r) \left( \int _{-(s-t)}^r {\textbf{0}}\otimes ^t_s u(h)dh \right) dr \\= & {} \int _{-(s-t)}^0 a(r) \left( \int _{-(s-t)}^r u(s+h)dh \right) dr\\= & {} \int _{-(s-t)}^0 \left( \int _h^0 a(r)u(s+h)dr \right) dh\\= & {} \int _t^s \left( \int _{h-s}^0 a(r)u(h)dr \right) dh\\= & {} \int _t^s \left( \int _{-\infty }^0 a(r)u(h) {\textbf{1}}_{[-(s-h),0]}(r)dr \right) dh\\= & {} \int _t^s \langle a,u(h) {\textbf{1}}_{[-(s-h),0]} \rangle _{L^2({\mathbb {R}}^k)} dh\\= & {} \langle a, \int _t^s u(h) {\textbf{1}}_{[-(s-h),0]} dh \rangle _{L^2({\mathbb {R}}^k)}. \end{aligned}$$

This proves (4.10). Collecting (4.7), (4.8), (4.9), we obtain

$$\begin{aligned} Y(s)= {\hat{S}}_{s-t}\zeta ^{\xi ,\varphi } + \int _t^s {\hat{S}}_{s-h}{\hat{F}}(Y(h),u(h))dh + \int _t^s {\hat{S}}_{s-h} G(Y(h),u(h))dW(h).\nonumber \\ \end{aligned}$$
(4.11)

Equality (4.11) tells us that Y is the unique mild solution to (4.2), i.e., \(Y={\hat{Z}}^{t,\zeta ^{\xi ,\varphi },u}\). Finally, by Proposition 5, we conclude \(Y=Z^{t,\zeta ^{\xi ,\varphi },u}\). \(\square \)
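
As a sanity check on the identity (4.10), the following Python sketch compares its two sides pointwise in r for a deterministic scalar control. The control u = cos, the times t = 0.5 and s = 2, the midpoint quadrature, and the helper names `lhs`, `rhs`, `max_gap` are illustrative assumptions and do not come from the paper.

```python
import math


def lhs(u, t, s, r, n=20000):
    """Midpoint rule for int_{-(s-t)}^{r} u(s+h) dh, with -(s-t) <= r <= 0."""
    a = -(s - t)
    dh = (r - a) / n
    return sum(u(s + a + (i + 0.5) * dh) for i in range(n)) * dh


def rhs(u, t, s, r, n=20000):
    """Midpoint rule for int_t^s u(h) 1_{[-(s-h),0]}(r) dh."""
    dh = (s - t) / n
    total = 0.0
    for i in range(n):
        h = t + (i + 0.5) * dh
        if -(s - h) <= r <= 0.0:  # indicator of [-(s-h), 0] evaluated at r
            total += u(h)
    return total * dh


def max_gap(u, t, s, points=11):
    """Largest difference between the two sides on a grid of r in [-(s-t), 0]."""
    grid = [-(s - t) * j / (points - 1) for j in range(points)]
    return max(abs(lhs(u, t, s, r) - rhs(u, t, s, r)) for r in grid)


gap = max_gap(math.cos, t=0.5, s=2.0)
print(f"max pointwise gap between the two sides of (4.10): {gap:.2e}")
```

Both sides reduce to the integral of u over [t, s+r], which is the change of variables used in the computation above; the sketch only confirms this numerically for one choice of u.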

4.2 Reformulation of the optimal control problem in H

Thanks to Theorem 6, we can rephrase the finite dimensional optimal control problem (f-OCP), with delay in the control variable u, in an infinite dimensional setting in which the delay in the control variable no longer appears. Indeed, if \(\zeta ^{\xi ,\varphi },Z^{t,\zeta ^{\xi ,\varphi },u}\) are as in Theorem 6, the functional J defined by (3.4) can be written as

$$\begin{aligned} J(x_0,u)= & {} {\mathbb {E}} \left[ \int _0^\infty e^{-\rho t} l\big ( x^{0,x_0,u}(t),u(t),\langle \gamma ,\varphi \otimes ^0_t u\rangle _{L^2} \big )dt \right] \nonumber \\= & {} \text {(integrating by parts as in (4.6))}\nonumber \\= & {} {\mathbb {E}} \left[ \int _0^\infty e^{-\rho t} l\left( Z_0^{0,\zeta ^{x_0,\varphi },u}(t),u(t), \langle \gamma (0),Z_1^{0,\zeta ^{x_0,\varphi },u}(t)\rangle \right. \right. \nonumber \\{} & {} \quad \left. \left. -\langle \gamma ', Z_2^{0,\zeta ^{x_0,\varphi },u}(t)\rangle _{L^2} \right) dt \right] \nonumber \\= & {} {\widetilde{J}}(\zeta ^{x_0,\varphi },u), \end{aligned}$$
(4.12)

where \(\widetilde{J}\) is defined by

$$\begin{aligned} \widetilde{J}(z,u)\,{:}{=}\, {\mathbb {E}} \left[ \int _0^\infty e^{-\rho t} {\widetilde{l}}\big (Z^{0,z,u}(t),u(t) \big ) dt \right] \qquad \qquad \forall z\in H,\ u\in {\mathcal {U}}, \end{aligned}$$

with

$$\begin{aligned} \widetilde{l}(z,u)\,{:}{=}\, l\big (z_0,u,\langle \gamma (0),z_1\rangle -\langle \gamma ',z_2\rangle _{L^2}\big ) \qquad \qquad \forall z\in H,\ u\in U. \end{aligned}$$

It then follows that the problem (f-OCP) is a particular case of the infinite dimensional optimal control problem

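$$\begin{aligned} \text {maximize } \widetilde{J}(z,u) \text { over } u\in {\mathcal {U}}, \qquad z\in H. \end{aligned}$$
(\(\infty \)-OCP)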

Indeed, by introducing the value function \(\widetilde{V}\) associated to (\(\infty \)-OCP), defined as

$$\begin{aligned} \widetilde{V}(z)\,{:}{=}\, \sup _{u\in {\mathcal {U}}} \widetilde{J}(z,u)\qquad \qquad \forall z\in H, \end{aligned}$$

we have

$$\begin{aligned} V(x_0)=\widetilde{V}(\zeta ^{x_0,{\varphi }}),\qquad \qquad \forall x_0\in {\mathbb {R}}^n. \end{aligned}$$
(4.13)

4.3 Hamilton-Jacobi-Bellman equation and verification theorem

Denote by \({\mathcal {S}}(H)\) the space of self-adjoint operators in L(H). Following the dynamic programming approach, the Hamilton-Jacobi-Bellman equation associated to (\(\infty \)-OCP) is

$$\begin{aligned} \rho v-\langle Az,Dv\rangle _H- {\mathcal {H}}(z,Dv,D^2v)=0\qquad \qquad z\in H \end{aligned}$$
(4.14)

where

$$\begin{aligned} {\mathcal {H}}(z,p,X)\,{:}{=}\, \sup _{u\in U} {\mathcal {H}}_{CV}(z,p,X,u) \qquad \forall z\in H,\ p\in H,\ X\in {\mathcal {S}}(H), \end{aligned}$$

with

$$\begin{aligned}{} & {} {\mathcal {H}}_{CV}(z,p,X,u)\,{:}{=}\, \frac{1}{2}\textrm{Tr} \left( G(z,u)G^*(z,u)X \right) +\langle p,F(z,u) \rangle _{H} +{\widetilde{l}}(z,u)\\{} & {} \qquad \forall z\in H,\ p\in H,\ X\in {\mathcal {S}}(H),\ u\in U. \end{aligned}$$

We recall the definition of classical solution of (4.14).

Definition 1

A function \(v:H\rightarrow {\mathbb {R}}\) is a classical solution of (4.14) if \(v\in C^2(H)\), \(Dv\in D(A^*)\), \(A^*Dv\in C(H,H)\), and v satisfies

$$\begin{aligned} \rho v{(z)}-\langle z,A^*Dv{(z)}\rangle _H- {\mathcal {H}}(z,Dv{(z)},D^2v{(z)})=0 \end{aligned}$$

for all \(z\in H\).

Assumption 7

There exists a constant \(C>0\) such that

$$\begin{aligned} \begin{aligned} |f(x, u,r)|&\le C(1+|x|) \quad \quad \forall x \in {\mathbb {R}}^n, u \in U,\ r\in {\mathbb {R}}\\ |g(x, u,r)|&\le C(1+|x|) \quad \quad \forall x\in {\mathbb {R}}^n, u \in U,\ r\in {\mathbb {R}}. \end{aligned} \end{aligned}$$
(4.15)

Assumption 8

  1. (i)

    The functions f, g and l are continuous, and l(x, u, r) is uniformly continuous in x on bounded subsets of \({\mathbb {R}}^n\), uniformly for \(u \in {\mathbb {R}}^k,\ r\in {\mathbb {R}}\). Moreover, there exists C such that

    $$\begin{aligned} |l(x, u,r)| \le C(1+|x|) \end{aligned}$$

    for all \((x, u,r) \in {\mathbb {R}}^n \times {\mathbb {R}}^k\times {\mathbb {R}}\).

  2. (ii)

    The function \(v: H \rightarrow {\mathbb {R}}\) and its derivatives \(D v, D^2 v\) are uniformly continuous on bounded subsets of H. Moreover, \(D v: H \rightarrow D\left( A^*\right) \) and \(A^* D v\) is uniformly continuous on bounded subsets of H, and there exist C, N such that

    $$\begin{aligned} |v(z)|+|D v(z)|_H+\textrm{Tr}\left( D^2 v(z){(D^2v(z))^*}\right) +\left| A^* D v(z)\right| _H \le C(1+|z|)^N\nonumber \\ \end{aligned}$$
    (4.16)

    for all \(z \in H\).

Theorem 9

(Theorem 2.42 in Fabbri et al. (2017)) Let \(v:H \rightarrow {\mathbb {R}}\) be a classical solution of

$$\begin{aligned} \rho v-\langle Dv, Az\rangle _H -{\mathcal {H}}(z,Dv,D^2v)=0. \end{aligned}$$

In addition to our standing Assumptions 1, 2 and 4, let Assumptions 7 and 8 be satisfied. Assume that

$$\begin{aligned} \rho >{\bar{\rho }}{:}{=}(N+2)\left( C+\frac{1}{2}(N+1) C^2\right) , \end{aligned}$$

where C is the constant appearing in (4.15) and N is as in (4.16). Then, we have the following statements:

  1.

    For all \(z \in H\)

    $$\begin{aligned} v(z) \ge \widetilde{V}(z). \end{aligned}$$
  2.

    Let \(u^*\in {\mathcal {U}}\) be such that

    $$\begin{aligned} u^*(s) \in \arg \max _{u \in U} {\mathcal {H}}_{C V} \left( Z^{0,z,u^*}(s), D v\left( Z^{0,z,u^*}(s)\right) , D^2 v\left( Z^{0,z,u^*}(s)\right) , u \right) \end{aligned}$$

    for almost every \(s \in [0,+\infty )\) and \({\mathbb {P}}\)-almost surely. Then \(u^*\) is an optimal control and \(v(z)=\widetilde{V}(z)\).
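
The discount threshold in Theorem 9 is plain arithmetic in the growth constant C of (4.15) and the exponent N of (4.16). The following Python sketch evaluates it; the function names and the sample values of C, N and \(\rho \) are hypothetical and serve only as an illustration.

```python
def discount_threshold(C: float, N: float) -> float:
    """Return rho_bar = (N + 2) * (C + 0.5 * (N + 1) * C**2) as in Theorem 9."""
    return (N + 2) * (C + 0.5 * (N + 1) * C ** 2)


def satisfies_discount_condition(rho: float, C: float, N: float) -> bool:
    """Check the strict inequality rho > rho_bar required by Theorem 9."""
    return rho > discount_threshold(C, N)


# Illustrative values only: C = 1 and N = 1 give rho_bar = 3 * (1 + 1) = 6.
rho_bar = discount_threshold(1.0, 1)
print(rho_bar, satisfies_discount_condition(7.0, 1.0, 1))
```

The threshold grows quadratically in both C and N, so a function v with faster polynomial growth requires a larger discount rate.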

5 Explicit solution in the LQ case

We now take into consideration a simple example to show how the representation in H leads to an explicit solution.

As coefficients for the state equation we consider, for real numbers a, b, c,

$$\begin{aligned} \begin{aligned} f(x,u,r)&={ax+cu+r}\\ g(x,u,r)&=bx+r, \end{aligned} \end{aligned}$$

for all \(x\in {\mathbb {R}},\ u\in U={\mathbb {R}},\ r\in {\mathbb {R}}\). Dynamics (3.1) is written as

$$\begin{aligned} {\left\{ \begin{array}{ll} dx(s) = \left( ax(s)+cu(s)+ \int _{-\infty }^0 \alpha (h)\varphi \otimes ^t_s u(h)dh \right) ds \\ \qquad \qquad + \left( bx(s)+ \int _{-\infty }^0 \beta (h)\varphi \otimes ^t_s u(h)dh \right) dW(s) \qquad &{} \forall s\in (t,+\infty )\\ x(t)=\xi , \end{array}\right. } \end{aligned}$$
(5.1)

where \(\alpha ,\beta ,\varphi \) are assumed as in Assumption 2. As running reward we consider, for strictly positive real numbers \(c_1,c_2\), the linear-quadratic function

$$\begin{aligned} l(x,u,r)=c_1x-\frac{c_2}{2}u^2,\qquad \qquad \forall x\in {\mathbb {R}},\ u\in {\mathbb {R}},\ r\in {\mathbb {R}}. \end{aligned}$$

The value function V is

$$\begin{aligned} V(x_0)=\sup _{u\in L^2_{{\mathcal {F}}_0}({\mathbb {R}})} {\mathbb {E}} \left[ \int _0^\infty e^{-\rho t} \left( c_1x^{0,x_0,u}(t) -\frac{c_2}{2}u^2(t) \right) dt \right] \qquad \qquad \forall x_0\in {\mathbb {R}}. \end{aligned}$$

Then, for \(\rho \) sufficiently large, Assumptions 1 and 4 are satisfied.

Now we describe the corresponding infinite dimensional representation. We have \(H={\mathbb {R}}\times {\mathbb {R}}\times L^2({\mathbb {R}})\). The coefficients F, G, as defined by (4.1), (4.3), are linear:

$$\begin{aligned} F(z,u)= Bz+Cu,\quad G(z,u)=\Sigma z \end{aligned}$$

for all \(z\in H,\ u\in L^2_{{\mathcal {F}}_0}({\mathbb {R}})\), where

$$\begin{aligned} \begin{aligned} Bz=&\left( \langle B_0,z\rangle _H,0,0 \right) +\frac{z}{2}\quad \textrm{with}\quad B_0= \left( a,\alpha (0),-\alpha ' \right) ,\\ C=&{(c,1,0),}\\ \Sigma z=&( \langle \Sigma _0,z\rangle _H,0,0)\quad \textrm{with}\quad \Sigma _0 = \left( b,\beta (0),-\beta ' \right) \end{aligned} \end{aligned}$$

The value function associated to the infinite dimensional problem (\(\infty \)-OCP) is

$$\begin{aligned} \widetilde{V}(z)=\sup _{u\in L^2_{{\mathcal {F}}_0}({\mathbb {R}})} {\mathbb {E}} \left[ \int _0^\infty e^{-\rho t} \left( c_1 Z_0^{0,z,u}(t)-\frac{c_2}{2}u^2(t) \right) dt \right] , \end{aligned}$$

where Z is the mild solution of (4.4), with F, G as specified above. The Hamiltonian \({\mathcal {H}}\) in (4.14) is

$$\begin{aligned} {\mathcal {H}}(z,p,X)= \frac{1}{2} |\langle \Sigma _0,z\rangle _H|^2 X_{00} + \langle p, Bz\rangle _H +c_1z_0 +\sup _{u\in U} \left\{ \langle p, Cu\rangle _H -\frac{c_2}{2}u^2 \right\} \end{aligned}$$

for \(z\in H,\ p\in H,\ X\in {\mathcal {S}}(H)\), \(X_{00}{:}{=}\langle X(1,0,0),(1,0,0)\rangle _H\), and it is maximized by

$$\begin{aligned} { u_{\textrm{max}}\, {:}{=}\, \frac{\langle p,C\rangle }{c_2}.} \end{aligned}$$
(5.2)
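
The maximizer (5.2) follows from the first-order condition of the concave map \(u\mapsto \langle p,C\rangle u-\frac{c_2}{2}u^2\), and the corresponding supremum is \(\langle p,C\rangle ^2/(2c_2)\). The Python sketch below checks both facts against a brute-force grid search; the scalar `q` stands for \(\langle p,C\rangle _H\), and the function names, grid bounds and sample values are illustrative assumptions.

```python
def objective(u: float, q: float, c2: float) -> float:
    """Control-dependent part of the Hamiltonian: q*u - (c2/2)*u^2, with q = <p, C>_H."""
    return q * u - 0.5 * c2 * u * u


def u_max(q: float, c2: float) -> float:
    """Closed-form maximizer from (5.2)."""
    return q / c2


def sup_value(q: float, c2: float) -> float:
    """Closed-form supremum q^2 / (2*c2) appearing in the HJB equation."""
    return q * q / (2.0 * c2)


def grid_argmax(q: float, c2: float, lo: float = -5.0, hi: float = 5.0, n: int = 10001) -> float:
    """Brute-force maximizer on a uniform grid, used only as a cross-check."""
    grid = [lo + (hi - lo) * i / (n - 1) for i in range(n)]
    return max(grid, key=lambda u: objective(u, q, c2))


q, c2 = 1.5, 2.0  # hypothetical sample values with c2 > 0
print(u_max(q, c2), grid_argmax(q, c2), sup_value(q, c2))
```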

Then the HJB equation (4.14) reads

$$\begin{aligned}{} & {} \rho v-\langle Az,Dv\rangle _H- \frac{1}{2} |\langle \Sigma _0,z\rangle _H|^2 D^2_{00}v - \langle Dv,Bz\rangle _H -c_1z_0 - \frac{\langle Dv,C\rangle ^2}{2c_2} =0\qquad \qquad \nonumber \\{} & {} \quad z\in H, \end{aligned}$$
(5.3)

where \(D^2_{00}v= \langle D^2v(1,0,0),(1,0,0)\rangle _H\) and \(D_1v = \langle Dv,(0,1,0)\rangle _H\).

Although the data do not satisfy the assumptions of Theorem 9, in this case an explicit solution of (5.3) is given by a suitably chosen affine function.

Proposition 10

The function \(v:H\rightarrow {\mathbb {R}}\), defined by

$$\begin{aligned} v(z)=\langle \Gamma ,z\rangle _H+\Gamma _3, \end{aligned}$$

where \(\Gamma =(\Gamma _0,\Gamma _1,\Gamma _2)\),

$$\begin{aligned} \begin{aligned}&\Gamma _0=\frac{c_1}{\rho -a},\qquad \Gamma _1=\frac{\Gamma _2(0)+\alpha (0)\Gamma _0}{\rho },\qquad \Gamma _2(\cdot )=-\Gamma _0\int _{-\infty }^\cdot \alpha '(s)e^{\rho (s-\cdot )}ds,\\ \qquad \Gamma _3=&\frac{(c\Gamma _0+\Gamma _1)^2}{2\rho c_2}, \end{aligned} \end{aligned}$$

is a classical solution of (5.3). Moreover, \(v=\widetilde{V}\), and the control \(u^*{:}{=}\frac{c\Gamma _0+\Gamma _1}{c_2}\) is optimal.

Proof

Clearly \(v\in C^2(H)\). To argue that \(\Gamma \in D(A^*)\), which is equivalent to \(\Gamma _2\in W^{1,2}({\mathbb {R}}^-,{\mathbb {R}})\), it is sufficient, first, to notice that Young’s inequality for convolutions implies that \(\Gamma _2\in L^2({\mathbb {R}}^-,{\mathbb {R}})\), then to consider the fact that \(\Gamma _2\) solves the differential equation

$$\begin{aligned} \Gamma _2' = -\rho \Gamma _2-\Gamma _0\alpha ', \end{aligned}$$

which entails \(\Gamma _2'\in L^2({\mathbb {R}}^-,{\mathbb {R}})\). Now, since \(\Gamma \in D(A^*)\), we have

$$\begin{aligned} \langle A^*Dv(z),z\rangle _H=\langle A^*\Gamma ,z\rangle _H =\langle (0,\Gamma _2(0),-\Gamma _2'),z\rangle _H -\frac{1}{2}\langle \Gamma ,z\rangle _H. \end{aligned}$$

Moreover, \(\langle C,Dv(z)\rangle _H=\langle C,\Gamma \rangle _H\), and

$$\begin{aligned} \langle Dv,Bz \rangle _H+c_1z_0 +\frac{\langle Dv,C\rangle ^2}{2c_2}= & {} \langle \Gamma ,Bz \rangle _H+c_1z_0 +\frac{\langle C,\Gamma \rangle ^2}{2c_2}\nonumber \\= & {} \Gamma _0 \langle B_0,z\rangle _H+ \frac{1}{2}\langle \Gamma ,z\rangle _H +c_1z_0 +\frac{\langle C,\Gamma \rangle ^2}{2c_2}\nonumber \\= & {} \Gamma _0az_0+\Gamma _0\alpha (0)z_1-\Gamma _0\langle \alpha ',z_2\rangle _{L^2}\nonumber \\{} & {} \quad + \frac{1}{2}\langle \Gamma ,z\rangle _H +c_1z_0 +\frac{\langle C,\Gamma \rangle ^2}{2c_2}. \end{aligned}$$
(5.4)

Then

$$\begin{aligned} \begin{aligned} \rho v-\langle Az,Dv\rangle _H-&\frac{1}{2} |\langle \Sigma _0,z\rangle _H|^2 D^2_{00}v - \langle Dv,Bz\rangle _H -c_1z_0 - \frac{\langle C,\Gamma \rangle ^2}{2c_2} =\\ =&\rho \left( \Gamma _0 z_0+\Gamma _1z_1+\langle \Gamma _2,z_2\rangle _{L^2}+\Gamma _3 \right) - \langle (0,\Gamma _2(0),-\Gamma _2'),z\rangle _H \\ {}&- \Gamma _0az_0-\Gamma _0\alpha (0)z_1+\Gamma _0\langle \alpha ',z_2\rangle _{L^2} -c_1z_0 -\frac{\langle C,\Gamma \rangle ^2}{2c_2}=0, \end{aligned} \end{aligned}$$

where we used \(\Gamma _2' = -\rho \Gamma _2-\Gamma _0\alpha '\).
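
To make the coefficient matching concrete, the following Python sketch evaluates \(\Gamma _0,\Gamma _1,\Gamma _2,\Gamma _3\) for one illustrative kernel and checks that the \(z_0\), \(z_1\), \(z_2\) and constant parts of (5.3) vanish. The kernel \(\alpha (r)=e^{\kappa r}\) with \(\kappa >0\), all parameter values, and the helper names are assumptions made for this example; they are not taken from the paper, and whether this kernel meets Assumption 2 is not verified here.

```python
import math


def gamma_coefficients(a, c, c1, c2, rho, kappa):
    """Coefficients of Proposition 10 for the assumed kernel alpha(r) = exp(kappa*r), r <= 0."""
    gamma0 = c1 / (rho - a)

    def gamma2(r):
        # Closed form of -Gamma_0 * int_{-inf}^r alpha'(s) exp(rho*(s - r)) ds for this kernel.
        return -gamma0 * kappa * math.exp(kappa * r) / (kappa + rho)

    alpha_at_0 = 1.0  # alpha(0) for the assumed kernel
    gamma1 = (gamma2(0.0) + alpha_at_0 * gamma0) / rho
    gamma3 = (c * gamma0 + gamma1) ** 2 / (2.0 * rho * c2)
    return gamma0, gamma1, gamma2, gamma3


def gamma2_by_quadrature(gamma0, rho, kappa, r, lower=-40.0, n=100_000):
    """Midpoint-rule approximation of the integral defining Gamma_2(r), truncated at `lower`."""
    ds = (r - lower) / n
    total = 0.0
    for i in range(n):
        s = lower + (i + 0.5) * ds
        total += kappa * math.exp(kappa * s) * math.exp(rho * (s - r))
    return -gamma0 * total * ds


def hjb_residuals(a, c, c1, c2, rho, kappa, r=-0.7, eps=1e-5):
    """Residuals of the z_0, z_1, z_2 and constant parts of (5.3) for v = <Gamma, z> + Gamma_3."""
    g0, g1, g2, g3 = gamma_coefficients(a, c, c1, c2, rho, kappa)
    d_g2 = (g2(r + eps) - g2(r - eps)) / (2.0 * eps)  # central difference for Gamma_2'(r)
    d_alpha = kappa * math.exp(kappa * r)             # alpha'(r) for the assumed kernel
    return (
        rho * g0 - a * g0 - c1,                       # coefficient of z_0
        rho * g1 - g2(0.0) - 1.0 * g0,                # coefficient of z_1, with alpha(0) = 1
        d_g2 + rho * g2(r) + g0 * d_alpha,            # ODE for Gamma_2 evaluated at r
        rho * g3 - (c * g0 + g1) ** 2 / (2.0 * c2),   # constant term
    )


params = dict(a=0.1, c=0.5, c1=1.0, c2=2.0, rho=1.0, kappa=1.0)  # hypothetical values
g0, g1, g2, g3 = gamma_coefficients(**params)
u_star = (params["c"] * g0 + g1) / params["c2"]  # constant candidate control of Proposition 10
residuals = hjb_residuals(**params)
print(g0, g1, g3, u_star, residuals)
```

For this kernel the formulas simplify to \(\Gamma _2(r)=-\Gamma _0\kappa e^{\kappa r}/(\kappa +\rho )\) and \(\Gamma _1=\Gamma _0/(\kappa +\rho )\), which the quadrature helper can be used to cross-check.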

We sketch the rest of the proof, as it is standard (see Fabbri et al. (2017), Theorem 2.42). Let \(Z=Z^{0,z,u}\). Since \(\Gamma \in D(A^*)\), Itô’s formula can be applied, and then, since v is a classical solution of (5.3), we obtain

$$\begin{aligned} v(z)= & {} e^{-\rho t}{\mathbb {E}} \left[ v(Z_t) \right] +\int _0^t e^{-\rho s} {\mathbb {E}} \left[ \widetilde{ l}(Z_s,u_s)\right] ds\\{} & {} -\int _0^t e^{-\rho s} {\mathbb {E}} \left[ {\mathcal {H}}_{CV}(Z_s,Dv(Z_s),D^2v(Z_s),u_s) \right. \\{} & {} \quad \left. - {\mathcal {H}}(Z_s,Dv(Z_s),D^2v(Z_s))\right] ds. \end{aligned}$$

Letting t go to \(\infty \) (with \(\rho \) large enough), we recover the fundamental identity

$$\begin{aligned} v(z)= & {} \int _0^\infty e^{-\rho s} {\mathbb {E}} \left[ \widetilde{ l}(Z_s,u_s)\right] ds\nonumber \\{} & {} +\int _0^\infty e^{-\rho s} {\mathbb {E}} \left[ {\mathcal {H}}(Z_s,Dv(Z_s),D^2v(Z_s)) \right. \nonumber \\{} & {} \quad \left. -{\mathcal {H}}_{CV}(Z_s,Dv(Z_s),D^2v(Z_s),u_s) \right] ds. \end{aligned}$$
(5.5)

Since (5.5) holds true for any control u, recalling that \({\mathcal {H}}\ge {\mathcal {H}}_{CV}\), we conclude \(v\ge \widetilde{V}\). Finally, by (5.2) and (5.5), we have

$$\begin{aligned} v(z)=\int _0^\infty e^{-\rho s} {\mathbb {E}} \left[ \widetilde{ l}(Z_s,u^*_s)\right] ds\le \widetilde{ V}(z), \end{aligned}$$

which concludes the proof. \(\square \)