Representation of stochastic optimal control problems with delay in the control variable

Girolami, Cristina Di; Rosestolato, Mauro

doi:10.1007/s10203-024-00465-x

Open access
Published: 25 June 2024

Decisions in Economics and Finance

Abstract

In this manuscript we provide a representation in infinite dimension for stochastic optimal control problems with delay in the control variable. The main novelty consists in the fact that the representation can be applied also to dynamics where the delay in the control appears as a nonlinear term and in the diffusion coefficient. We then apply the representation to a LQ case where an explicit solution can be found.

1 Introduction

In this paper we consider a class of stochastic optimal control problems where the state equation is a stochastic delay differential equation in ${\mathbb {R}}^n$ of the form

$$\begin{aligned} {\left\{ \begin{array}{ll} dx(t)=f\left( x(t), u(t), {{\,\mathrm{\displaystyle \int }\,}}_{-\infty }^0 \langle \alpha (r), u(t+r)\rangle dr \right) dt \\ \qquad \quad + g \left( x(t), u(t), {{\,\mathrm{\displaystyle \int }\,}}_{-\infty }^0 \langle \beta (r), u(t+r)\rangle dr\right) dW{(t)}, \ \ \ \ t\ge 0,\\ x(0)=x_0\in {\mathbb {R}}^n, \\ u(s)=\varphi (s) \in {\mathbb {R}}^k, \ s\in (-\infty ,0), \end{array}\right. }\nonumber \\ \end{aligned}$$

(1.1)

where W is a Brownian motion with values in ${\mathbb {R}}^m$, x is the state variable with values in ${\mathbb {R}}^n$, u is a control process taking values in a suitable set $U\subset {\mathbb {R}}^k$, $x_0 \in {\mathbb {R}}^n$ is the initial value of the state variable, $\varphi \in L^2({\mathbb {R}}^-,{\mathbb {R}}^k)$ is the given initial control, $f:{\mathbb {R}}^n\times U\times {\mathbb {R}}\longrightarrow {\mathbb {R}}^n$, $g:{\mathbb {R}}^n\times U\times {\mathbb {R}}\longrightarrow {\mathbb {R}}^{n\times m}$ and $\alpha ,\beta $ are $L^2({\mathbb {R}}^-,{\mathbb {R}}^k)$ functions. The goal is to maximize, over all $u \in {\mathcal {U}}$, the functional

$$\begin{aligned} J(x_0, u)={\mathbb {E}}\left[ \int _ 0^\infty e^{-\rho t} l\left( x^{x_0,u}(t),u(t),\int _{-\infty }^0\langle \gamma (r),u(t+r)\rangle dr\right) dt \right] , \end{aligned}$$

(1.2)

where $\gamma \in L^2({\mathbb {R}}^-,{\mathbb {R}}^k)$ and $l:{\mathbb {R}}^n\times U\times {\mathbb {R}}\longrightarrow {\mathbb {R}}$. The key feature of such class of problems is the integral dependence of all the ingredients (coefficients f, g of the state equation and running reward function l) on the path of the control u.

As in the case where the delay dependence is with respect to the state variable, the models that we address lack Markovianity. Due to this fact, the dynamic programming approach cannot be directly applied. To overcome this difficulty, when the delay appears in the state variable but not in the control variable, one available approach consists in rephrasing the finite dimensional problem in a Hilbert space setting, where the constituents of the new problem no longer present a delay-type dependence. The benefit of this approach is to recover Markovianity, hence to allow for an application of the dynamic programming machinery. Clearly, there is a price to pay in doing so, since a more technical theory is required, in particular for dealing with unbounded second-order Hamilton-Jacobi-Bellman equations on Hilbert spaces. Nevertheless, such a theory has been developed and is available for application (Fabbri et al. 2017). For stochastic optimal control problems with a delay dependence on the state variable, but not on the control variable, see Biffis et al. (2020), Biagini et al. (2022), Djehiche et al. (2022), De Feo et al. (2023), Di Giacinto et al. (2011), Federico (2011), Federico and Tankov (2015), Fuhrman et al. (2010), Masiero and Tessitore (2022), Pang and Yong (2019). See also Cosso et al. (2023); Ren and Rosestolato (2020); Cosso et al. (2023) for a different approach, where no representation in Hilbert space is performed: the problem, presented in a path-dependent framework, is addressed via dynamic programming in the original setting, making use of the so-called pathwise derivatives (see Cont and Fournie (2010a, 2010b) for an account of this topic).

On the other hand, if we take into consideration models where we have a distributed delay dependence, as we do in the present work, the infinite dimensional representation trick to overcome the lack of Markovianity is not obvious.

A way to do it is the one followed originally by Vinter and Kwong (1981), extended to the stochastic case with additive noise in Gozzi and Marinelli (2004), then recently generalized in De Feo (2023) by considering a nonlinear dependence on the present value of the control variable in the diffusion coefficient. In these works, the authors rephrase the original dynamics as an equivalent abstract SDE in a Hilbert space, controlled now only by the present value of the control variable. The drift of such an abstract controlled SDE is linearly dependent on an unbounded linear operator acting on the infinite dimensional state variable. The setting thereby recovered is then suitable to apply the theory of optimal control in infinite dimension, as developed in Fabbri et al. (2017).

Such a strategy to rephrase the problem strongly relies on the fact that the integral dependence on the past of the control variable appears only linearly in the drift of the original state dynamics. If in our model we used the same representation as in Gozzi and Marinelli (2004), De Feo (2023), the corresponding abstract equation would show a nonlinear dependence, both in the drift and in the diffusion coefficient, on an unbounded linear operator acting on the infinite dimensional state. This structure would make the problem considerably more difficult, and intractable within the theory in Fabbri et al. (2017).

It is to overcome this issue, hence to let the delay dependence on the control appear nonlinearly both in the drift and in the diffusion coefficient, that we present an alternative representation.

The starting point is the simple observation that the function

$$\begin{aligned} x_1(t)=\int _{-\infty }^tu(s)ds\qquad \qquad \big ( u(s)=\varphi (s)\ \textrm{for}\ s\le 0 \big ) \end{aligned}$$

can be introduced as a second state variable to rewrite the integral dependence in (1.1) and in (1.2) as

$$\begin{aligned} \int _{-\infty }^0 \langle \alpha (r),u(t+r)\rangle dr= & {} \langle \alpha (0),\int _{-\infty }^0u(t+r)dr\rangle - \int _{-\infty }^0 \langle {\alpha '(r)},\int _{-\infty }^r u(t+s)ds\rangle dr\nonumber \\= & {} \langle \alpha (0),x_1(t)\rangle -\int _{-\infty }^0 \langle { \alpha '(r)},x_1(t+r)\rangle dr,\nonumber \\ \end{aligned}$$

(1.3)

and similarly for the other terms involving $\beta ,\gamma $. The point is that the control variable u no longer appears in (1.3), but only the newly introduced state $x_1$, whose dynamics is trivially $dx_1(t)=u(t)dt$. Of course, if we use (1.3) (and similarly for $\beta ,\gamma $) in (1.1) and in (1.2), we still have to deal with a delay model: but now the delay is in the state variable $x_1$. This fact is important, because we can now perform the standard representation in infinite dimension for models with delay in the state, and, as said above, an unbounded linear operator will appear in the infinite dimensional dynamics, but it will appear linearly and only in the drift coefficient. Of course, differently from Gozzi and Marinelli (2004), in order to compute (1.3), we need some regularity, meaning $\alpha ,\beta ,\gamma \in W^{1,2}({\mathbb {R}}^-,{\mathbb {R}}^k)$.
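The identity (1.3) can be checked numerically on a simple example. The following is only an illustrative sketch under assumptions that are not in the paper: the scalar case $k=1$, the kernel $\alpha (r)=e^{r}$, a control path equal to $\cos $ on $[-2,t]$ and zero before, and trapezoidal quadrature (the helper names are hypothetical).

```python
import numpy as np

def trapezoid(y, x):
    """Composite trapezoidal rule for samples y on the grid x."""
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x)))

def cumulative_trapezoid(y, x):
    """Running trapezoidal integral of y from x[0] up to each grid point."""
    increments = 0.5 * (y[1:] + y[:-1]) * np.diff(x)
    return np.concatenate(([0.0], np.cumsum(increments)))

# Assumed data (scalar case k = 1): the control path vanishes before time -2.
t = 1.0
support_start = -2.0
r = np.linspace(support_start - t, 0.0, 20001)  # delay variable r in [-(t + 2), 0]

alpha = np.exp(r)          # alpha(r) = e^r, an element of W^{1,2}(R^-)
alpha_prime = np.exp(r)    # its derivative alpha'(r) = e^r
u_shifted = np.cos(t + r)  # u(t + r) on the support of the control path

# x_1(t + r): integral of u from -infinity up to time t + r
x1_shifted = cumulative_trapezoid(u_shifted, r)

# Left- and right-hand sides of (1.3)
lhs = trapezoid(alpha * u_shifted, r)
rhs = alpha[-1] * x1_shifted[-1] - trapezoid(alpha_prime * x1_shifted, r)
```

Both quantities approximate the same number, so the delay in the control has been traded for a delay in the auxiliary state $x_1$.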

Using this approach, once the problem has been represented in an infinite dimensional setting, one ends up with a structure that can be tackled by appealing to the available theory for dynamic programming in infinite dimension, including the theory of B-viscosity solutions, as presented in Fabbri et al. (2017).

Concerning applications, the model that we present here can be exploited for optimal advertising. Within this field, a basic setting has been provided by the seminal papers (Nerlove and Arrow 1962; Vidale and Wolfe 1957), then extended to the stochastic case, in particular, by Grosset and Viscolani (2004), Marinelli (2007), Motte and Pham (2021), Prasad and Sethi (2008). The delay in the control variable, representing the advertisement spending, is then introduced, in our model as in Gozzi et al. (2009); Gozzi and Marinelli (2004); De Feo (2023), in order to account for a delay effect in the spending, often called carryover effect (see Gozzi et al. (2009); Hartl (1984); Feichtinger et al. (1994)). A further extension of the stochastic linear model with delay in the control, and additive noise, has been recently provided by Gozzi et al. (2024) and Ricciardi and Rosestolato (2024), where a mean field term is introduced, in order to account for non-competitive and competitive environments, respectively.

We point out that when the delay dependence on the control is not in an integral form, as we assumed in the discussion above, but e.g. pointwise, then the representation in infinite dimension is in general more difficult to perform, and other strategies have to be exploited (see e.g. Lefebvre and Miller (2021)).

The plan of the paper is the following. In Sect. 2 we introduce the needed notations. In Sect. 3 we formulate the optimal control problem in finite dimension with delay in the control variable. In Sect. 4 we introduce the infinite dimensional setting and prove Theorem 6, which states the equivalence between the finite dimensional control problem with delay in the control, introduced in Sect. 3, and an infinite dimensional control problem, where there is no delay in the control variable. Finally, in Sect. 5, we show how the representation of Sect. 4 can be used to find an explicit solution for an LQ model, where both the drift and the diffusion coefficient of the state dynamics depend on the path of the delay.

2 Notation and preliminaries

We fix natural numbers n, k, m, which will represent the dimensions of the state variable, the control variable, and the Brownian motion, respectively. By $M_{n\times m}({\mathbb {R}})$ we denote the space of $n\times m$ matrices with real entries, endowed with the Frobenius norm. For finite dimensional spaces, the Euclidean norm and scalar product will always be denoted by $|\cdot |$ and $\langle \cdot ,\cdot \rangle $, respectively, without any subscript. We denote ${\mathbb {R}}^+=[0,+\infty )$ and ${\mathbb {R}}^-=(-\infty ,0]$. If ${\mathcal {T}}$ is any topological space, ${\mathcal {B}}_{\mathcal {T}}$ denotes its Borel sigma-algebra. We fix a filtered probability space $(\Omega ,{\mathcal {F}},{\mathbb {F}}=\{{\mathcal {F}}_t\}_{t\in {\mathbb {R}}^+},{\mathbb {P}})$ satisfying the usual conditions, and an m-dimensional Brownian motion W defined on it. We assume ${\mathbb {F}}$ to be the completion of the natural filtration of W.

Given any separable Banach space $(E,|\cdot |_E)$, we introduce the following function spaces.

(i)
For $p\ge 1$, $L^p(E)$ denotes the space $L^p({\mathbb {R}}^-,E)$ of E-valued p-Lebesgue integrable functions defined on ${\mathbb {R}}^-$. Its usual $L^p$-norm will be denoted by $|\cdot |_{L^p}$.
(ii)
For $p\ge 1$ and any sub-sigma-algebra ${\mathcal {G}}\subset {\mathcal {F}}$, $L^p_{{\mathcal {G}}}(E)$ denotes the space of ${\mathcal {G}}$-measurable random variables $\xi $ such that
$$\begin{aligned} |\xi |_p{:}{=}\left( {\mathbb {E}} \left[ |\xi |^p_E \right] \right) ^{1/p}<\infty . \end{aligned}$$
(iii)
For $t\ge 0$, $L^0_{{\mathbb {F}},t}(E)$ denotes the space of $\{{\mathcal {F}}_s\}_{s\in [t,\infty )}$-progressively measurable processes $X:\Omega \times [t,\infty )\rightarrow E$ endowed with the (quotient) metrizable topology associated to convergence in measure, when $(\Omega \times [t,\infty ),{\mathcal {F}}\otimes {\mathcal {B}}_{[t,\infty )})$ is endowed with the product measure ${\mathbb {P}}\otimes \lambda $ ($\lambda $ is the Lebesgue measure).
(iv)
For $p\ge 1$ and $0\le t\le T$, $L^p_{{\mathbb {F}},t,T}(E)$ denotes the space of $\{{\mathcal {F}}_s\}_{s\in [t,T]}$-progressively measurable processes $X:\Omega \times [t,T]\rightarrow E$ such that
$$\begin{aligned} |X|_{p,t,T} {:}{=}\left( {\mathbb {E}} \left[ \int _t^T |X_s|^p_E ds \right] \right) ^{1/p}<\infty . \end{aligned}$$
The couple $(L^p_{{\mathbb {F}},t,T}(E),|\cdot |_{p,t,T})$ is a Banach space.
(v)
For $p\ge 1$ and $t\ge 0$, $L^p_{{\mathbb {F}},t}(E)$ denotes the Fréchet space of processes $X\in L^0_{{\mathbb {F}},t}(E)$ such that $|X|_{p,t,T} <\infty $ for all $T >t$.
(vi)
For $p\ge 1$ and $0\le t\le T$, ${\textbf{S}}^p_{{\mathbb {F}},t,T}(E)$ denotes the Fréchet space of continuous processes $X\in L^p_{{\mathbb {F}},t,T}(E)$ such that
$$\begin{aligned} \Vert X\Vert _{p,t,T} {:}{=}\left( {\mathbb {E}} \left[ \sup _{s\in [t,T]} |X_s|^p_E \right] \right) ^{1/p}<\infty . \end{aligned}$$
(vii)
For $p\ge 1$ and $t\ge 0$, ${\textbf{S}}^p_{{\mathbb {F}},t}(E)$ denotes the Fréchet space of continuous processes $X\in L^p_{{\mathbb {F}},t}(E)$ such that $\Vert X\Vert _{p,t,T}<\infty $ for all $T>t$.

If E, F are Banach spaces, the space $L(E,F)$ of linear and continuous operators $E\rightarrow F$ is endowed with the operator norm, denoted by $|\cdot |_{L(E,F)}$.

If K is a Hilbert space, its scalar product will be denoted by $\langle \cdot ,\cdot \rangle _K$. When $K=L^2({\mathbb {R}}^k)$, we simply write $\langle \cdot ,\cdot \rangle _{L^2}$.

We assume that the control variable takes values in a nonempty Borel set $U\subset {\mathbb {R}}^k$. The control processes that we take into consideration are those belonging to the set

$$\begin{aligned} {\mathcal {U}}{:}{=}\left\{ u:\Omega \times {\mathbb {R}}^+\rightarrow U\ \mathrm {such\ that}\ u\in L^2_{{\mathbb {F}},0}({\mathbb {R}}^k) \right\} \end{aligned}$$

For given $\alpha :{\mathbb {R}}^-\rightarrow {\mathbb {R}}^k$ and $\beta :{\mathbb {R}}^+\rightarrow {\mathbb {R}}^k$, and for given times $t_0,t\in {\mathbb {R}}^+$, $t_0\le t$, we denote by $\alpha \otimes ^{t_0}_t\beta $ the function ${\mathbb {R}}^-\rightarrow {\mathbb {R}}^k$ defined by

$$\begin{aligned} \alpha \otimes ^{t_0}_t\beta (s) {:}{=}{\left\{ \begin{array}{ll} \alpha ((t-t_0)+s) &{}\quad \textrm{if}\ s\in (-\infty ,-(t-t_0)]\\ \beta (t+s) &{}\quad \textrm{if}\ s\in (-(t-t_0),0]. \end{array}\right. } \end{aligned}$$

Notice that, if $\varphi \in L^2({\mathbb {R}}^k)$ and $u\in {\mathcal {U}}$, then $\varphi \otimes ^{t_0} u=\{\varphi \otimes ^{t_0}_t u\}_{t\ge t_0}$ belongs to $L^2_{{\mathbb {F}},t_0}(L^2({\mathbb {R}}^k))$.
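The concatenation $\alpha \otimes ^{t_0}_t\beta $ simply glues the shifted past path $\alpha $ to the segment of $\beta $ on $(t_0,t]$. A minimal sketch for deterministic scalar paths follows; the function name `concatenate_paths` and the sample paths are hypothetical and serve only as an illustration.

```python
import numpy as np

def concatenate_paths(alpha, beta, t0, t):
    """Return the map s -> (alpha concatenated with beta at t0, seen from t)(s), s <= 0.

    alpha is defined on (-inf, 0], beta on [0, +inf); both must accept arrays.
    """
    if not 0.0 <= t0 <= t:
        raise ValueError("need 0 <= t0 <= t")
    shift = t - t0

    def path(s):
        s = np.asarray(s, dtype=float)
        in_past = s <= -shift
        # np.where evaluates both branches, so the arguments are clamped
        # to keep each function inside its own domain.
        past_values = alpha(np.minimum(s + shift, 0.0))
        recent_values = beta(np.maximum(t + s, t0))
        return np.where(in_past, past_values, recent_values)

    return path

# Assumed sample data: initial control phi(r) = e^r, applied control u(s) = 1 + s.
phi = lambda r: np.exp(r)
u = lambda s: 1.0 + s
glued = concatenate_paths(phi, u, t0=0.0, t=2.0)
```

With these choices, `glued(-3.0)` evaluates `phi(-1.0)` and `glued(-0.5)` evaluates `u(1.5)`; the boundary point `s = -2.0` is assigned to the `phi` branch, as in the definition.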

3 The optimal control problem

3.1 State equation

For an initial time $t\in {\mathbb {R}}^+$, an initial state $\xi \in L^2_{{\mathcal {F}}_t}({\mathbb {R}}^n)$, and a control process $u\in {\mathcal {U}}$, we consider a state process x evolving according to the following delayed controlled stochastic differential equation:

$$\begin{aligned} {\left\{ \begin{array}{ll} &{} dx(s)= f(x(s),u(s),\langle \alpha ,\varphi \otimes ^t_s u\rangle _{L^2} )ds + g({x(s),u(s)},\langle \beta ,\varphi \otimes ^t_s u\rangle _{L^2})dW(s) \qquad \\ &{}\qquad \forall s\in (t,+\infty )\\ &{} x(t)=\xi \end{array}\right. } \end{aligned}$$

(3.1)

where we recall that $\langle \cdot ,\cdot \rangle _{L^2}$ denotes the scalar product in $L^2({\mathbb {R}}^k)$, and the data $f,g,\alpha ,\beta ,\varphi $ are assumed to satisfy the following assumptions.

Assumption 1

The functions

$$\begin{aligned} f:{\mathbb {R}}^n\times U\times {\mathbb {R}}\rightarrow {\mathbb {R}}^n \qquad \textrm{and} \qquad g:{\mathbb {R}}^n\times U\times {\mathbb {R}}\rightarrow M_{n\times m}({\mathbb {R}}), \end{aligned}$$

are such that

(i)
f, g are measurable;
(ii)
there exists a constant L such that
$$\begin{aligned} \begin{aligned} |f(x,u,r)- f(x',u,r)|&\le L|x-x'|\\ |g(x,u,r)- g(x',u,r)|&\le L|x-x'|\\ |f(0,u,r)| + |g(0,u,r)|&\le L(1+|u|+|r|) \end{aligned} \end{aligned}$$
for all $(x,u,r)\in {\mathbb {R}}^n\times U\times {\mathbb {R}}$.

Assumption 2

(i)
$\alpha $, $\beta $ are functions belonging to $W^{1,2}({\mathbb {R}}^-,{\mathbb {R}}^k)$;
(ii)
$\varphi \in L^1({\mathbb {R}}^k)\cap L^2({\mathbb {R}}^k)$ is such that $\int _{-\infty }^\cdot \varphi (r)dr \in L^2({\mathbb {R}}^k)$.

Notice that Assumption 2(ii) is satisfied whenever $\varphi \in L^2({\mathbb {R}}^k)$ has compact support.

We have the following well-posedness result for the state equation and continuity and growth properties of the strong solution.

Proposition 3

For $t\in {\mathbb {R}}^+$, $\xi \in L^2_{{\mathcal {F}}_t}({\mathbb {R}}^n)$, and $u\in {\mathcal {U}}$, there exists a unique strong solution $x^{t,\xi ,u}\in L^0_{{\mathbb {F}},t}({\mathbb {R}}^n)$ of (3.1). Moreover, $x^{t,\xi ,u}\in {\textbf{S}}^2_{{\mathbb {F}},t}({\mathbb {R}}^n)$ and

(a)
for any $M>1$, there exists a constant C(M, L) depending only on M, L such that
$$\begin{aligned}{} & {} \sup _{\begin{array}{c} u\in {\mathcal {U}}\\ \varphi \in L^2({\mathbb {R}}^k) \end{array}} \Vert x^{t,\xi ,u}- x^{t,\xi ',u} \Vert _{2,t,T}\le M e^{C(M,L)\cdot (T-t)}\cdot |\xi -\xi '|_2\nonumber \\{} & {} \quad \forall 0\le t\le T,\ \xi ,\xi '\in L^2_{{\mathcal {F}}_t}({\mathbb {R}}^n); \end{aligned}$$
(3.2)
(b)
there exists ${\hat{C}}={\hat{C}}(L,|\alpha |_{L^2},|\beta |_{L^2})$, depending only on $L,|\alpha |_{L^2},|\beta |_{L^2}$, and ${\hat{D}}={\hat{D}}(L)$, depending only on L, such that
$$\begin{aligned} \Vert x^{t,\xi ,u} \Vert _{2,t,T}\le {\hat{C}} \left( 1+|\xi |_2+|\varphi |_{L^2({\mathbb {R}}^k)}+|u|_{2,t,T} \right) \cdot \, e^{{\hat{D}}\cdot (T-t)}, \end{aligned}$$
(3.3)
for all $\varphi \in L^2({\mathbb {R}}^k)$, $u\in {\mathcal {U}}$, $0\le t\le T$, and $\xi \in L^2_{{\mathcal {F}}_t}({\mathbb {R}}^n)$.

We omit the proof, since it is based on standard arguments. To give the reader an idea for (3.2), consider e.g. Proposition 2.8 in Cosso et al. (2023). There, the Lipschitz constant is not expressed as in (3.2). Nevertheless, by inspection, one can check that the constant $\gamma $ in Claim III of the proof of Proposition 2.8, at p. 2897 in Cosso et al. (2023), can be arbitrarily close to 0, as long as $\varepsilon $ is small enough. This fact entails that the Lipschitz constant in Proposition 2.8 in Cosso et al. (2023) can be arbitrarily close to 1, as long as $T-t$ is small enough. This provides our M in (3.2), as long as $T-t$ is small enough. For general intervals [t, T], one can use the estimate obtained for small $T-t$, combined with the flow property of solutions. In this way one obtains the exponential term in (3.2).

To obtain (3.3), with an explicit growth constant ${\hat{D}}$, one can argue as in the proof of (Fabbri et al. (2017), Proposition 3.24, p. 187).
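To make the structure of (3.1) concrete, the following is a minimal Euler-Maruyama sketch for the scalar case $n=k=m=1$ with initial time $t=0$ and a deterministic (open-loop) control. It is not the authors' method: the function `simulate_state`, the truncation of ${\mathbb {R}}^-$ to a finite horizon, the trapezoidal quadrature for the delay terms, and all sample coefficients are assumptions made for illustration.

```python
import numpy as np

def simulate_state(f, g, alpha, beta, phi, u, x0, T, n_steps, horizon, n_quad, rng):
    """Euler-Maruyama path of (3.1) with t = 0 in the scalar case n = k = m = 1."""
    dt = T / n_steps
    r = np.linspace(-horizon, 0.0, n_quad)   # truncation of R^- to [-horizon, 0]
    w = np.full(n_quad, r[1] - r[0])         # trapezoidal quadrature weights
    w[0] = w[-1] = 0.5 * (r[1] - r[0])
    alpha_w, beta_w = alpha(r) * w, beta(r) * w
    x = np.empty(n_steps + 1)
    x[0] = x0
    for i in range(n_steps):
        s = i * dt
        tau = s + r
        # Concatenated control path: initial control where s + r <= 0,
        # applied control elsewhere (arguments clamped to each domain).
        path = np.where(tau <= 0.0, phi(np.minimum(tau, 0.0)), u(np.maximum(tau, 0.0)))
        delay_f = float(alpha_w @ path)      # approximates <alpha, phi (x) u>_{L^2}
        delay_g = float(beta_w @ path)       # approximates <beta, phi (x) u>_{L^2}
        dW = rng.normal(0.0, np.sqrt(dt))
        u_now = float(u(s))
        x[i + 1] = x[i] + f(x[i], u_now, delay_f) * dt + g(x[i], u_now, delay_g) * dW
    return x

# Assumed sample data: exponential kernels, zero initial control, constant control.
kernel = lambda r: np.exp(r)
zero_history = lambda r: np.zeros_like(r, dtype=float)
unit_control = lambda s: np.ones_like(s, dtype=float)
drift = lambda x, u, r: -x + np.tanh(r)                  # nonlinear in the delay term
diffusion = lambda x, u, r: 0.2 * np.sqrt(1.0 + r * r)   # delay term in the diffusion

path = simulate_state(drift, diffusion, kernel, kernel, zero_history, unit_control,
                      x0=1.0, T=1.0, n_steps=1000, horizon=10.0, n_quad=10001,
                      rng=np.random.default_rng(0))
```

The sample drift and diffusion are Lipschitz in x with linear growth in the delay term, in the spirit of Assumption 1; the direct quadrature at every time step keeps the sketch general at the price of a cost proportional to `n_steps * n_quad`.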

3.2 Objective functional and value function

We consider a discount factor $\rho >0$ and a current reward function $ l:{\mathbb {R}}^n\times {\mathbb {R}}^k\times {\mathbb {R}} \rightarrow {\mathbb {R}} $ on which we impose the following assumptions.

Assumption 4

(i)
The function l is measurable.
(ii)
There exist constants $a\ge 0$, $0\le q \le 2$, $d>0,{\theta > q}$ such that
$$\begin{aligned} l(x,u,r)\le a(1+|x|^q+|r|^q)-d |u|^\theta \qquad \forall u\in U,\ x\in {\mathbb {R}}^n,\ r\in {\mathbb {R}}. \end{aligned}$$
(iii)
$\rho > 2{\hat{D}}$, where ${\hat{D}}$ is as in (3.3).
(iv)
$\gamma $ is a function belonging to $W^{1,2}({\mathbb {R}}^-,{\mathbb {R}}^k)$.

Under Assumptions 1 and 4, Proposition 3 implies that the reward functional J, given by

$$\begin{aligned} J(x_0,u)\,{:}{=}\, {\mathbb {E}} \left[ \int _0^\infty e^{-\rho t} l\big ( x^{0,x_0,u}(t),u(t),\langle \gamma ,\varphi \otimes ^0_t u\rangle _{L^2} \big )dt \right] \qquad \forall x_0\in {\mathbb {R}}^n,\ u\in {\mathcal {U}},\nonumber \\ \end{aligned}$$

(3.4)

is well-defined as a function ${\mathbb {R}}^n\times {\mathcal {U}}\rightarrow {\mathbb {R}}$.

We then consider the optimal control problem consisting in maximizing J over the set of admissible controls ${\mathcal {U}}$, for any given $x_0\in {\mathbb {R}}^n$:

$$\begin{aligned} \sup _{u\in {\mathcal {U}}} J(x_0,u). \end{aligned}$$

(f-OCP)

For $x_0\in {\mathbb {R}}^n$, we define the value function

$$\begin{aligned} V(x_0)\,{:}{=}\, \sup _{u\in {\mathcal {U}}} J(x_0,u). \end{aligned}$$

4 Representation in infinite dimension

Due to the dependence on the past of the control variable u, the finite dimensional stochastic dynamics (3.1) is not Markovian. This feature entails that the standard dynamic programming approach cannot be applied to the finite dimensional stochastic optimal control problem (f-OCP). A classical workaround to regain Markovianity consists in rephrasing the model in a functional space setting.

In order to do that, we start by introducing the Hilbert space

$$\begin{aligned} H\,{:}{=}\, {\mathbb {R}}^n\times {\mathbb {R}}^k\times L^2({\mathbb {R}}^k), \end{aligned}$$

endowed with the induced scalar product

$$\begin{aligned} \langle z,y\rangle _H = \langle z_0,y_0\rangle + \langle z_1,y_1\rangle + \langle z_2,y_2\rangle _{L^2}, \end{aligned}$$

where $z=(z_0,z_1,z_2)$, $z_0\in {\mathbb {R}}^n,\ z_1\in {\mathbb {R}}^k,\ z_2\in L^2({\mathbb {R}}^k)$, and similarly for y.

4.1 Reformulation of the state equation in H

We then consider the functions ${\hat{F}}, G$, associated to $f,g$, respectively, defined by

$$\begin{aligned} \begin{aligned}&{\hat{F}}:H\times U\rightarrow H,\ (z,u)\mapsto \left( f\left( z_0,u,\langle \alpha (0),z_1\rangle -\langle \alpha ',z_2\rangle _{L^2}\right) ,u,0 \right) \\&G:H\times U\rightarrow L({\mathbb {R}}^m,H),\ (z,u)\mapsto \left( g\left( z_0,u, \langle \beta (0),z_1\rangle -\langle \beta ',z_2\rangle _{L^2}\right) ,0,0\right) \end{aligned}\nonumber \\ \end{aligned}$$

(4.1)

where $z=(z_0,z_1,z_2)$ denotes a generic point of $ H= {\mathbb {R}}^n\times {\mathbb {R}}^k\times L^2({\mathbb {R}}^k)$. Notice that the pointwise evaluations $\alpha (0),\beta (0)$ and the square-integrable derivatives $\alpha ',\beta '$ exist because of Assumption 2(i) on $\alpha ,\beta $.

Consider the family of operators ${\hat{S}}=\{{\hat{S}}_t\}_{t\in {\mathbb {R}}^+}$ defined, for $t\in {\mathbb {R}}^+$, by

$$\begin{aligned} {\hat{S}}_t:H\rightarrow H,\ z\mapsto (z_0,z_1,z_2(t+\cdot ){\textbf{1}}_{(-\infty ,-t)}(\cdot )+z_1{\textbf{1}}_{[-t,0]}(\cdot )). \end{aligned}$$

Then ${\hat{S}}$ is a strongly continuous semigroup, with infinitesimal generator $(D({\hat{A}}),{\hat{A}})$ specified by

$$\begin{aligned} {\hat{A}}:D({\hat{A}})\rightarrow H,\ z\mapsto (0,0,z_2') \end{aligned}$$

with

$$\begin{aligned} D({\hat{A}})= \left\{ z\in H:z_2\in W^{1,2}({\mathbb {R}}^-,{\mathbb {R}}^k),\ z_1=z_2(0) \right\} . \end{aligned}$$

Then we consider the H-valued dynamics

$$\begin{aligned} {\left\{ \begin{array}{ll} d{\hat{Z}}(s)= \left( {\hat{A}} {\hat{Z}}(s) + {\hat{F}}({\hat{Z}}(s),u(s)) \right) ds\\ \qquad \qquad + G ({\hat{Z}}(s), { u(s)})dW(s)&{}\qquad s\in (t,T]\\ {\hat{Z}}(t) = \zeta , \end{array}\right. } \end{aligned}$$

(4.2)

where $u\in {\mathcal {U}}$, $\zeta \in L^2_{{\mathcal {F}}_t}(H)$, $t\in {\mathbb {R}}^+$.

Observe that, for fixed $u\in {\mathcal {U}}$, Assumptions 1 on f, g entail Lipschitz continuity and sublinear growth with respect to z of ${\hat{F}}$ and G.

As can be easily checked, ${\hat{S}}$ is a $C_0$-semigroup of pseudo-contractions (see e.g. Appendix B.4 in Fabbri et al. (2017) for the definition). It follows that there exists a unique mild solution ${\hat{Z}}$ to (4.2), and the mild solution has a continuous version (Gawarecki and Mandrekar 2011, Theorem 3.3).

We could directly link ${\hat{Z}}^{t,\zeta ,u}$ to $x^{t,\xi ,u}$, but, with the purpose of setting up a framework suitable to be investigated in future works within the theory of B-viscosity solutions, as presented in Fabbri et al. (2017), Chapter 3, we need a dynamic representation ${\hat{Z}}$ similar to (4.2) but with the unbounded term appearing in the drift being the generator of a $C_0$-semigroup of contractions (see e.g. Appendix B.4 in Fabbri et al. (2017) for the definition). A simple way to do that consists in introducing the semigroup

$$\begin{aligned} S\,{:}{=}\, \{S_t\,{:}{=}\, e^{-t/2}{\hat{S}}_t\}_{t\in {\mathbb {R}}^+}, \end{aligned}$$

which is a semigroup of contractions, as it is easily seen by straightforward computations. The generator (D(A), A) of S is specified by

$$\begin{aligned} D(A)= D({\hat{A}})\qquad \textrm{and} \qquad Az={\hat{A}}z-\frac{z}{2},\ \forall z\in D(A). \end{aligned}$$
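As a worked check of the contraction property of S (a direct computation from the definition of ${\hat{S}}_t$, spelled out here for convenience and not taken from the source), for $z\in H$ and $t\ge 0$ one has

$$\begin{aligned} |{\hat{S}}_tz|_H^2= & {} |z_0|^2+|z_1|^2+\int _{-\infty }^{-t}|z_2(t+r)|^2dr+\int _{-t}^0|z_1|^2dr\\= & {} |z_0|^2+(1+t)|z_1|^2+|z_2|_{L^2}^2 \le (1+t)|z|_H^2\le e^{t}|z|_H^2, \end{aligned}$$

so that $|{\hat{S}}_tz|_H\le e^{t/2}|z|_H$ and therefore $|S_tz|_H=e^{-t/2}|{\hat{S}}_tz|_H\le |z|_H$.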

To use A in place of ${\hat{A}}$ in (4.2), we apply a translation to the bounded part of the drift ${\hat{F}}$, defining

$$\begin{aligned} F:H\times U\rightarrow H,\ (z,u)\mapsto \left( f\left( z_0,u, \langle \alpha (0),z_1\rangle -\langle \alpha ',z_2\rangle _{L^2}\right) ,u,0 \right) +\frac{z}{2}.\nonumber \\ \end{aligned}$$

(4.3)

Finally, we consider the H-valued dynamics

$$\begin{aligned} {\left\{ \begin{array}{ll} d Z(s)= \left( A Z(s) + F( Z(s),u(s)) \right) ds+ G( Z(s), { u(s)})dW(s)&{}\qquad s\in (t,T]\\ Z(t) = \zeta , \end{array}\right. }\nonumber \\ \end{aligned}$$

(4.4)

where $u\in {\mathcal {U}}$, $\zeta \in L^2_{{\mathcal {F}}_t}(H)$, $t\in {\mathbb {R}}^+$. As noticed for (4.2), also (4.4) admits a unique mild solution $Z^{t,\zeta ,u}$, that can be assumed to be pathwise continuous. It should also be clear that $Z^{t,\zeta ,u}={\hat{Z}}^{t,\zeta ,u}$. For future reference, we state this result in a proposition.

Proposition 5

For $u\in {\mathcal {U}}, \zeta \in L^2_{{\mathcal {F}}_t}(H), t\in {\mathbb {R}}^+$, there exists a unique (up to indistinguishability) pathwise-continuous mild solution $Z^{t,\zeta ,u}$ to (4.4). Moreover, $Z^{t,\zeta ,u}\in {\textbf{S}}^2_{{\mathbb {F}},t}(H)$, and $Z^{t,\zeta ,u}={\hat{Z}}^{t,\zeta ,u}$, where ${\hat{Z}}^{t,\zeta ,u}$ is the unique mild solution to (4.2).

Proof

For existence and uniqueness, and integral estimates, see (Gawarecki and Mandrekar (2011), Theorem 3.3).

Regarding the fact that $Z^{t,\zeta ,u}={\hat{Z}}^{t,\zeta ,u}$, argue as in Rosestolato and Swiech (2017), pp. 1901–1902. $\square $

Denote by

$$\begin{aligned} P_0:H\rightarrow {\mathbb {R}}^n,\ z\mapsto z_0\qquad P_1:H\rightarrow {\mathbb {R}}^k,\ z\mapsto z_1\qquad P_2:H\rightarrow L^2({\mathbb {R}}^k),\ z\mapsto z_2 \end{aligned}$$

the orthogonal projections of $H={\mathbb {R}}^n\times {\mathbb {R}}^k\times L^2({\mathbb {R}}^k)$ onto ${\mathbb {R}}^n,{\mathbb {R}}^k, L^2({\mathbb {R}}^k)$, respectively.

The following result explains the link between the mild solution $Z^{t,\zeta ,u}$ of (4.4) and the strong solution $x^{t,\xi ,u}$ of (3.1).

Theorem 6

Let $t\in {\mathbb {R}}^+$, $\xi \in L^2_{{\mathcal {F}}_t}({\mathbb {R}}^n)$, $u\in {\mathcal {U}}$. Let $\zeta ^{\xi ,\varphi }=(\zeta _0,\zeta _1,\zeta _2)\in L^2_{{\mathcal {F}}_t}(H)$ be defined by

$$\begin{aligned} \zeta _0{:}{=}\xi , \qquad \zeta _1{:}{=}\int _{-\infty }^0\varphi (h)dh, \qquad \textrm{and} \qquad \zeta _2(r){:}{=}\int _{-\infty }^r \varphi (h)dh\quad \forall r\in {\mathbb {R}}^-. \end{aligned}$$

Then $Z^{t,\zeta ^{\xi ,\varphi },u}=(Y_0,Y_1,Y_2)$, where, for $s\ge t$,

$$\begin{aligned}{} & {} Y_0(s)\,{:}{=}\, x^{t,\xi ,u}(s), \qquad Y_1(s)\,{:}{=}\,\zeta _1+\int _t^s u(h)dh, \nonumber \\{} & {} \textrm{and} \qquad \big ( Y_2(s) \big ) (r)\,{:}{=}\,\int _{-\infty }^r \varphi \otimes ^t_s u(h)dh\quad \forall r\in {\mathbb {R}}^-. \end{aligned}$$

(4.5)

Proof

First, we notice that, by Assumption 2(ii), $\zeta _1$ and $\zeta _2$ are well-defined, and $\zeta _2\in L^2({\mathbb {R}}^k)$. Then, we observe that, integrating by parts, for $s\ge t$,

$$\begin{aligned} \begin{aligned} \langle \alpha ,\varphi \otimes ^t_su\rangle _{L^2}&= \langle \alpha (0),\int _{-\infty }^0 \varphi \otimes ^t_su(h)dh\rangle -\int _{-\infty }^0 \langle \alpha '(h), \int _{-\infty }^h \varphi \otimes ^t_su(r)dr \rangle dh\\&=\langle \alpha (0),Y_1(s)\rangle -\langle \alpha ', Y_2(s)\rangle _{L^2}\\ \langle \beta ,\varphi \otimes ^t_su\rangle _{L^2}&=\mathrm {(similarly)} =\langle \beta (0),Y_1(s)\rangle -\langle \beta ', Y_2(s)\rangle _{L^2}, \end{aligned} \end{aligned}$$

(4.6)

where we have used the fact that

$$\begin{aligned} Y_1(s)=\zeta _1+\int _t^s u(h)dh=\int _{-\infty }^0 \varphi \otimes ^t_s u(h)dh. \end{aligned}$$

Now let $Y=(Y_0,Y_1,Y_2)$, where $Y_0,Y_1,Y_2$ are as defined by (4.5). Let $x=x^{t,\xi ,u}$ be the strong solution of (3.1). Due to the fact that the operator ${\hat{S}}_s$ ($s\in {\mathbb {R}}^+$) is the identity with respect to the first component, we can write, for $s\ge t$,

$$\begin{aligned} Y_0(s)=x(s)= & {} \xi + \int _t^s f(x(h),u(h),\langle \alpha ,\varphi \otimes ^t_h u \rangle _{L^2})dh\nonumber \\{} & {} \quad + \int _t^s g(x(h),u(h),\langle \beta ,\varphi \otimes ^t_h u \rangle _{L^2})dW(h)\nonumber \\= & {} Y_0(t) + \int _t^s f(Y_0(h),u(h),\langle \alpha (0),Y_1(h)\rangle -\langle \alpha ',Y_2(h)\rangle _{L^2})dh\nonumber \\{} & {} + \int _t^s g(Y_0(h),u(h),\langle \beta (0),Y_1(h)\rangle -\langle \beta ',Y_2(h)\rangle _{L^2})dW(h)\nonumber \\= & {} P_0\left( {\hat{S}}_{s-t}\zeta ^{\xi ,\varphi }\right) + \int _t^s P_0\left( {\hat{S}}_{s-h}{\hat{F}}(Y(h),u(h))\right) dh\nonumber \\{} & {} \quad + \int _t^s P_0\left( {\hat{S}}_{s-h} G(Y(h),u(h))\right) dW(h)\nonumber \\= & {} P_0\left( {\hat{S}}_{s-t}\zeta ^{\xi ,\varphi } + \int _t^s {\hat{S}}_{s-h}{\hat{F}}(Y(h),u(h))dh \right. \nonumber \\ {}{} & {} \quad \left. + \int _t^s {\hat{S}}_{s-h} G(Y(h),u(h))dW(h)\right) . \end{aligned}$$

(4.7)

Regarding $Y_1$, exploiting now the fact that ${\hat{S}}_s$ ($s\in {\mathbb {R}}^+$) is the identity in the second component, we have

$$\begin{aligned} Y_1(s)= & {} \int _{-\infty }^0\varphi \otimes ^t_su(h)dh =\int _{-\infty }^0 \varphi (h) dh +\int _t^s u(h)dh =\zeta _1 +\int _t^s u(h)dh \nonumber \\= & {} P_1\left( {\hat{S}}_{s-t}\zeta ^{\xi ,\varphi }\right) + \int _t^s P_1\left( {\hat{S}}_{s-h}{\hat{F}}(Y(h),u(h))\right) dh\nonumber \\{} & {} + \int _t^s P_1\left( {\hat{S}}_{s-h} G(Y(h),u(h))\right) dW(h)\nonumber \\= & {} P_1\left( {\hat{S}}_{s-t}\zeta ^{\xi ,\varphi }\right. \nonumber \\{} & {} \left. + \int _t^s {\hat{S}}_{s-h}{\hat{F}}(Y(h),u(h))dh + \int _t^s {\hat{S}}_{s-h} G(Y(h),u(h))dW(h)\right) \end{aligned}$$

(4.8)

Regarding $Y_2$, we have, denoting by ${\textbf{0}}$ the function zero in $L^2({\mathbb {R}}^k)$,

$$\begin{aligned} Y_2(s)= & {} \int _{-\infty }^\cdot \varphi \otimes ^t_s u(h)dh = \left( \int _{-\infty }^\cdot \varphi \otimes ^t_s u(h)dh\right) {\textbf{1}}_{(-\infty ,-(s-t))}\nonumber \\{} & {} + \left( \int _{-\infty }^\cdot \varphi \otimes ^t_s u(h)dh\right) {\textbf{1}}_{[-(s-t),0]}\nonumber \\= & {} \zeta _2((s-t)+\cdot ){\textbf{1}}_{(-\infty ,-(s-t))}\nonumber \\{} & {} + \left( \int _{-\infty }^\cdot \varphi \otimes ^t_s {\textbf{0}}(h)dh+ \int _{-\infty }^\cdot {\textbf{0}}\otimes ^t_s u(h)dh\right) {\textbf{1}}_{[-(s-t),0]}\nonumber \\= & {} \zeta _2((s-t)+\cdot ){\textbf{1}}_{(-\infty ,-(s-t))} +\left( \int _{-\infty }^0\varphi (h)dh\right) {\textbf{1}}_{[-(s-t),0]}\nonumber \\{} & {} + \left( \int _{-\infty }^\cdot {\textbf{0}}\otimes ^t_s u(h)dh\right) {\textbf{1}}_{[-(s-t),0]}\nonumber \\= & {} \zeta _2((s-t)+\cdot ){\textbf{1}}_{(-\infty ,-(s-t))} +\zeta _1{\textbf{1}}_{[-(s-t),0]} \nonumber \\{} & {} + \left( \int _{-(s-t)}^\cdot {\textbf{0}}\otimes ^t_s u(h)dh\right) {\textbf{1}}_{[-(s-t),0]}\nonumber \\= & {} P_2({\hat{S}}_{s-t}\zeta ^{\xi ,\varphi })+ \left( \int _{-(s-t)}^\cdot {\textbf{0}}\otimes ^t_s u(h)dh\right) {\textbf{1}}_{[-(s-t),0]}\nonumber \\= & {} \text {(this passage is justified below)}\nonumber \\= & {} P_2({\hat{S}}_{s-t}\zeta ^{\xi ,\varphi })+ \int _t^s u(h) {\textbf{1}}_{[-(s-h),0]}(\cdot )dh \qquad \qquad \nonumber \\{} & {} \text {(this is a Bochner integral in the function space }L^2({\mathbb {R}}^k))\nonumber \\= & {} P_2({\hat{S}}_{s-t}\zeta ^{\xi ,\varphi })+ \int _t^s P_2\left( {\hat{S}}_{s-h}{\hat{F}}(Y(h),u(h)) \right) dh\nonumber \\{} & {} + \int _t^s P_2\left( {\hat{S}}_{s-h} G(Y(h),u(h)) \right) dW(h) \nonumber \\= & {} P_2\left( {\hat{S}}_{s-t}\zeta ^{\xi ,\varphi }+ \int _t^s {\hat{S}}_{s-h}{\hat{F}}(Y(h),u(h)) dh \right. \nonumber \\{} & {} \quad \left. + \int _t^s {\hat{S}}_{s-h} G(Y(h),u(h)) dW(h) \right) . \end{aligned}$$

(4.9)

To justify the equality in $L^2({\mathbb {R}}^k)$

$$\begin{aligned} \left( \int _{-(s-t)}^\cdot {\textbf{0}}\otimes ^t_s u(h)dh\right) {\textbf{1}}_{[-(s-t),0]} = \int _t^s u(h) {\textbf{1}}_{[-(s-h),0]}(\cdot )dh, \end{aligned}$$

(4.10)

we pick any $a\in L^2({\mathbb {R}}^k)$, and compute

$$\begin{aligned} \left\langle a, \left( \int _{-(s-t)}^\cdot {\textbf{0}}\otimes ^t_s u(h)dh\right) {\textbf{1}}_{[-(s-t),0]} \right\rangle _{L^2({\mathbb {R}}^k)}&=\int _{-(s-t)}^0 a(r) \left( \int _{-(s-t)}^r {\textbf{0}}\otimes ^t_s u(h)dh \right) dr \\&= \int _{-(s-t)}^0 a(r) \left( \int _{-(s-t)}^r u(s+h)dh \right) dr\\&= \int _{-(s-t)}^0 \left( \int _h^0 a(r)u(s+h)dr \right) dh\\&= \int _t^s \left( \int _{h-s}^0 a(r)u(h)dr \right) dh\\&= \int _t^s \left( \int _{-\infty }^0 a(r)u(h) {\textbf{1}}_{[-(s-h),0]}(r)dr \right) dh\\&= \int _t^s \left\langle a,u(h) {\textbf{1}}_{[-(s-h),0]} \right\rangle _{L^2({\mathbb {R}}^k)} dh\\&= \left\langle a, \int _t^s u(h) {\textbf{1}}_{[-(s-h),0]} dh \right\rangle _{L^2({\mathbb {R}}^k)}. \end{aligned}$$

This proves (4.10). Collecting (4.7), (4.8), (4.9), we obtain

$$\begin{aligned} Y(s)= {\hat{S}}_{s-t}\zeta ^{\xi ,\varphi } + \int _t^s {\hat{S}}_{s-h}{\hat{F}}(Y(h),u(h))dh + \int _t^s {\hat{S}}_{s-h} G(Y(h),u(h))dW(h).\nonumber \\ \end{aligned}$$

(4.11)

Equality (4.11) tells us that Y is the unique mild solution to (4.2), i.e., $Y={\hat{Z}}^{t,\zeta ^{\xi ,\varphi },u}$. Finally, by Proposition 5, we conclude $Y=Z^{t,\zeta ^{\xi ,\varphi },u}$. $\square $
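
As an informal sanity check of identity (4.10) used in the proof above, the following sketch compares both sides pointwise in the scalar case $k=1$. It relies on the step of the computation where $({\textbf{0}}\otimes ^t_s u)(h)=u(s+h)$ for $h\in [-(s-t),0]$. The function names `trapezoid`, `lhs`, `rhs`, the test control $u=\cos $, and the times $t=0.5$, $s=2$ are illustrative assumptions, not part of the paper.

```python
import math

def trapezoid(f, a, b, n=4000):
    """Composite trapezoidal rule for f on [a, b] (returns 0 when a == b)."""
    step = (b - a) / n
    total = 0.5 * (f(a) + f(b))
    for i in range(1, n):
        total += f(a + i * step)
    return total * step

def lhs(r, s, t, u):
    """Left side of (4.10) at r in [-(s-t), 0]: integral of u(s+h), h in [-(s-t), r]."""
    return trapezoid(lambda h: u(s + h), -(s - t), r)

def rhs(r, s, t, u, n=20000):
    """Right side of (4.10) at r: integral over h in [t, s] of u(h) * 1_{[-(s-h),0]}(r)."""
    step = (s - t) / n
    total = 0.0
    for i in range(n):
        h = t + (i + 0.5) * step      # midpoint rule node
        if -(s - h) <= r <= 0.0:      # indicator of [-(s-h), 0] evaluated at r
            total += u(h)
    return total * step

if __name__ == "__main__":
    s, t = 2.0, 0.5                   # hypothetical times with t < s
    for r in (-1.5, -1.0, -0.3, 0.0):
        print(r, lhs(r, s, t, math.cos), rhs(r, s, t, math.cos))
```

Both sides reduce to $\int _t^{s+r}u(h)dh$, so for $u=\cos $ they should agree with $\sin (s+r)-\sin (t)$ up to quadrature error; the indicator in `rhs` is kept explicit on purpose, which limits its accuracy to the order of the step size.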

4.2 Reformulation of the optimal control problem in H

Thanks to Theorem 6, we can rephrase the finite dimensional optimal control problem (f-OCP), with delay in the control variable u, in an infinite dimensional setting in which the delay in the control variable u no longer appears. Indeed, if $\zeta ^{\xi ,\varphi },Z^{t,\zeta ^{\xi ,\varphi },u}$ are as in Theorem 6, the functional J defined by (3.4) can be written as

$$\begin{aligned} J(x_0,u)= & {} {\mathbb {E}} \left[ \int _0^\infty e^{-\rho t} l\big ( x^{0,x_0,u}(t),u(t),\langle \gamma ,\varphi \otimes ^0_t u\rangle _{L^2} \big )dt \right] \nonumber \\= & {} \text {(integrating by parts as in (4.6))}\nonumber \\= & {} {\mathbb {E}} \left[ \int _0^\infty e^{-\rho t} l\left( Z_0^{0,\zeta ^{x_0,\varphi },u}(t),u(t), \langle \gamma (0),Z_1^{0,\zeta ^{x_0,\varphi },u}(t)\rangle \right. \right. \nonumber \\{} & {} \quad \left. \left. -\langle \gamma ', Z_2^{0,\zeta ^{x_0,\varphi },u}(t)\rangle _{L^2} \right) dt \right] \nonumber \\= & {} {\widetilde{J}}(\zeta ^{x_0,\varphi },u), \end{aligned}$$

(4.12)

where $\widetilde{J}$ is defined by

$$\begin{aligned} \widetilde{J}(z,u)\,{:}{=}\, {\mathbb {E}} \left[ \int _0^\infty e^{-\rho t} {\widetilde{l}}\big (Z^{0,z,u}(t),u(t) \big ) dt \right] \qquad \qquad \forall z\in H,\ u\in {\mathcal {U}}, \end{aligned}$$

with

$$\begin{aligned} {\widetilde{l}}(z,u)\,{:}{=}\, l\big (z_0,u,\langle \gamma (0),z_1\rangle -\langle \gamma ',z_2\rangle _{L^2}\big ) \qquad \qquad \forall z\in H,\ u\in U. \end{aligned}$$

It then follows that the problem (f-OCP) is a particular case of the infinite dimensional optimal control problem ($\infty $-OCP) of maximizing $\widetilde{J}(z,u)$ over $u\in {\mathcal {U}}$.

Indeed, by introducing the value function $\widetilde{V}$ associated to ($\infty $-OCP), defined as

$$\begin{aligned} \widetilde{V}(z)\,{:}{=}\, \sup _{u\in {\mathcal {U}}} \widetilde{J}(z,u)\qquad \qquad \forall z\in H, \end{aligned}$$

we have

$$\begin{aligned} V(x_0)=\widetilde{V}(\zeta ^{x_0,{\varphi }}),\qquad \qquad \forall x_0\in {\mathbb {R}}^n. \end{aligned}$$

(4.13)

4.3 Hamilton-Jacobi-Bellman equation and verification theorem

Denote by ${\mathcal {S}}(H)$ the space of self-adjoint operators in L(H). Following the dynamic programming approach, the Hamilton-Jacobi-Bellman equation associated to ($\infty $-OCP) is

$$\begin{aligned} \rho v-\langle Az,Dv\rangle _H- {\mathcal {H}}(z,Dv,D^2v)=0\qquad \qquad z\in H \end{aligned}$$

(4.14)

where

$$\begin{aligned} {\mathcal {H}}(z,p,X)\,{:}{=}\, \sup _{u\in U} {\mathcal {H}}_{CV}(z,p,X,u) \qquad \forall z\in H,\ p\in H,\ X\in {\mathcal {S}}(H), \end{aligned}$$

with

$$\begin{aligned}{} & {} {\mathcal {H}}_{CV}(z,p,X,u)\,{:}{=}\, \frac{1}{2}\textrm{Tr} \left( G(z,u)G^*(z,u)X \right) +\langle p,F(z,u) \rangle _{H} +{\widetilde{l}}(z,u)\\{} & {} \qquad \forall z\in H,\ p\in H,\ X\in {\mathcal {S}}(H),\ u\in U. \end{aligned}$$

We recall the definition of classical solution of (4.14).

Definition 1

A function $v:H\rightarrow {\mathbb {R}}$ is a classical solution of (4.14) if $v\in C^2(H)$, $Dv\in D(A^*)$, $A^*Dv\in C(H,H)$, and v satisfies

$$\begin{aligned} \rho v{(z)}-\langle z,A^*Dv{(z)}\rangle _H- {\mathcal {H}}(z,Dv{(z)},D^2v{(z)})=0 \end{aligned}$$

for all $z\in H$.

Assumption 7

There exists a constant $C>0$ such that

$$\begin{aligned} \begin{aligned} |f(x, u,r)|&\le C(1+|x|) \quad \quad \forall x \in {\mathbb {R}}^n, u \in U,\ r\in {\mathbb {R}}\\ |g(x, u,r)|&\le C(1+|x|) \quad \quad \forall x\in {\mathbb {R}}^n, u \in U,\ r\in {\mathbb {R}}. \end{aligned} \end{aligned}$$

(4.15)

Assumption 8

(i)
The functions f, g and l are continuous, l(x, u, r) is uniformly continuous in x on bounded subsets of ${\mathbb {R}}^n$, uniformly for $u \in {\mathbb {R}}^k,\ r\in {\mathbb {R}}$. Moreover, there exists C such that
$$\begin{aligned} |l(x, u,r)| \le C(1+|x|) \end{aligned}$$
for all $(x, u,r) \in {\mathbb {R}}^n \times {\mathbb {R}}^k\times {\mathbb {R}}$.
(ii)
The function $v: H \rightarrow {\mathbb {R}}$ and its derivatives $D v, D^2 v$ are uniformly continuous on bounded subsets of H. Moreover, $D v: H \rightarrow D\left( A^*\right) $, $A^* D v$ is uniformly continuous on bounded subsets of H, and there exist constants C, N such that
$$\begin{aligned} |v(z)|+|D v(z)|_H+\textrm{Tr}\left( D^2 v(z){(D^2v(z))^*}\right) +\left| A^* D v(z)\right| _H \le C(1+|z|)^N\nonumber \\ \end{aligned}$$
(4.16)
for all $z \in H$.

Theorem 9

(Theorem 2.42 in Fabbri et al. (2017)) Let $v:H \rightarrow {\mathbb {R}}$ be a classical solution of

$$\begin{aligned} \rho v-\langle Dv, Az\rangle _H -{\mathcal {H}}(z,Dv,D^2v)=0 \end{aligned}$$

In addition to our standing Assumptions 1,2,4, let Assumptions 7 and 8 be satisfied. Assume that

$$\begin{aligned} \rho >{\bar{\rho }}{:}{=}(N+2)\left( C+\frac{1}{2}(N+1) C^2\right) , \end{aligned}$$

where C is the constant appearing in (4.15) and N is as in (4.16). Then, we have the following statements:

1.
For all $z \in H$
$$\begin{aligned} v(z) \ge \widetilde{V}(z). \end{aligned}$$
2.
Let $u^*\in {\mathcal {U}}$ be such that
$$\begin{aligned} u^*(s) \in \arg \max _{u \in U} {\mathcal {H}}_{C V} \left( Z^{0,z,u}(s), D v\left( Z^{0,z,u}(s)\right) , D^2 v\left( Z^{0,z,u}(s)\right) , u \right) \end{aligned}$$
for almost every $s \in [0,+\infty )$ and ${\mathbb {P}}$-almost surely. Then $u^*$ is an optimal control and $v(z)=\widetilde{V}(z)$.
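
To make the discount condition of Theorem 9 concrete, the threshold ${\bar{\rho }}=(N+2)\left( C+\frac{1}{2}(N+1)C^2\right) $ can be evaluated directly. The sketch below does only this arithmetic; the function name `discount_threshold` and the sample values of C and N are hypothetical and are not taken from the paper.

```python
def discount_threshold(C, N):
    """Return rho_bar = (N + 2) * (C + 0.5 * (N + 1) * C**2).

    C is the linear-growth constant of (4.15) and N the polynomial-growth
    exponent of (4.16); Theorem 9 asks for rho > rho_bar.
    """
    return (N + 2) * (C + 0.5 * (N + 1) * C ** 2)

if __name__ == "__main__":
    # Hypothetical constants, chosen only to illustrate the formula.
    for C, N in ((0.1, 1), (1.0, 2)):
        print(f"C={C}, N={N}: rho must exceed {discount_threshold(C, N)}")
```

The threshold grows quadratically in C and in N, so a faster-growing candidate solution v or larger coefficients require a stronger discount.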

5 Explicit solution in the LQ case

We now take into consideration a simple example to show how the representation in H leads to an explicit solution.

As coefficients for the state equation we consider, for real numbers a, b, c,

$$\begin{aligned} \begin{aligned} f(x,u,r)&={ax+cu+r}\\ g(x,u,r)&=bx+r, \end{aligned} \end{aligned}$$

for all $x\in {\mathbb {R}},\ u\in U={\mathbb {R}},\ r\in {\mathbb {R}}$. Dynamics (3.1) is written as

$$\begin{aligned} {\left\{ \begin{array}{ll} dx(s) = \left( ax(s)+cu(s)+ \int _{-\infty }^0 \alpha (h)\varphi \otimes ^t_s u(h)dh \right) ds \\ \qquad \qquad + \left( bx(s)+ \int _{-\infty }^0 \beta (h)\varphi \otimes ^t_s u(h)dh \right) dW(s) \qquad &{} \forall s\in (t,+\infty )\\ x(t)=\xi , \end{array}\right. } \end{aligned}$$

(5.1)

where $\alpha ,\beta ,\varphi $ are assumed as in Assumption 2. As running reward we consider, for strictly positive real numbers $c_1,c_2$, the linear-quadratic function

$$\begin{aligned} l(x,u,r)=c_1x-\frac{c_2}{2}u^2,\qquad \qquad \forall x\in {\mathbb {R}},\ u\in {\mathbb {R}}. \end{aligned}$$

The value function V is

$$\begin{aligned} V(x_0)=\sup _{u\in L^2_{{\mathcal {F}}_0}({\mathbb {R}})} {\mathbb {E}} \left[ \int _0^\infty e^{-\rho t} \left( c_1x^{0,x_0,u}(t) -\frac{c_2}{2}u^2(t) \right) dt \right] \qquad \qquad \forall x_0\in {\mathbb {R}}. \end{aligned}$$

Then, for $\rho $ sufficiently large, Assumptions 1 and 4 are satisfied.

Now we describe the corresponding infinite dimensional representation. We have $H={\mathbb {R}}\times {\mathbb {R}}\times L^2({\mathbb {R}})$. The coefficients F, G, as defined by (4.1), (4.3), are linear:

$$\begin{aligned} F(z,u)= Bz+Cu,\quad G(z,u)=\Sigma z \end{aligned}$$

for all $z\in H,\ u\in L^2_{{\mathcal {F}}_0}({\mathbb {R}})$, where

$$\begin{aligned} \begin{aligned} Bz=&\left( \langle B_0,z\rangle _H,0,0 \right) +\frac{z}{2}\quad \textrm{with}\quad B_0= \left( a,\alpha (0),-\alpha ' \right) ,\\ C=&{(c,1,0),}\\ \Sigma z=&( \langle \Sigma _0,z\rangle _H,0,0)\quad \textrm{with}\quad \Sigma _0 = \left( b,\beta (0),-\beta ' \right) \end{aligned} \end{aligned}$$

The value function associated to the infinite dimensional problem ($\infty $-OCP) is

$$\begin{aligned} \widetilde{V}(z)=\sup _{u\in L^2_{{\mathcal {F}}_0}({\mathbb {R}})} {\mathbb {E}} \left[ \int _0^\infty e^{-\rho t} \left( c_1 Z_0^{0,z,u}(t)-\frac{c_2}{2}u^2(t) \right) dt \right] , \end{aligned}$$

where Z is the mild solution of (4.4), with F, G as specified here above. The Hamiltonian ${\mathcal {H}}$ in (4.14) is

$$\begin{aligned} {\mathcal {H}}(z,p,X)= \frac{1}{2} |\langle \Sigma _0,z\rangle _H|^2 X_{00} + \langle p, Bz\rangle _H +c_1z_0 +\sup _{u\in U} \left\{ \langle p, Cu\rangle _H -\frac{c_2}{2}u^2 \right\} \end{aligned}$$

for $z\in H,\ p\in H,\ X\in {\mathcal {S}}(H)$, $X_{00}{:}{=}\langle X(1,0,0),(1,0,0)\rangle _H$, and it is maximized by

$$\begin{aligned} { u_{\textrm{max}}\, {:}{=}\, \frac{\langle p,C\rangle }{c_2}.} \end{aligned}$$

(5.2)
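
The maximizer (5.2) comes from the concave quadratic $u\mapsto \langle p,Cu\rangle _H-\frac{c_2}{2}u^2=qu-\frac{c_2}{2}u^2$ with $q=\langle p,C\rangle _H$, whose first-order condition $q-c_2u=0$ gives $u=q/c_2$ and the supremum $q^2/(2c_2)$. The short sketch below checks this against a grid search; the names `control_part`, `maximizer` and the values of `q` and `c2` are assumptions made for illustration.

```python
def control_part(u, q, c2):
    """Control-dependent part of the Hamiltonian: q*u - (c2/2)*u**2, q = <p, C>_H."""
    return q * u - 0.5 * c2 * u * u

def maximizer(q, c2):
    """Candidate maximizer from the first-order condition q - c2*u = 0."""
    return q / c2

q, c2 = 1.7, 0.8                      # hypothetical values, with c2 > 0
u_max = maximizer(q, c2)

# Grid centred at u_max (k = 0 reproduces u_max exactly).
grid = [u_max + 0.01 * k for k in range(-500, 501)]
best_on_grid = max(grid, key=lambda u: control_part(u, q, c2))

if __name__ == "__main__":
    print("u_max =", u_max, "best on grid =", best_on_grid)
    print("sup =", control_part(u_max, q, c2), "q^2/(2 c2) =", q * q / (2 * c2))
```

Since $c_2>0$ the quadratic is strictly concave, so the grid search can only confirm the stationary point; the value at the maximizer is the term $\langle Dv,C\rangle ^2/(2c_2)$ that appears in the HJB equation once $p=Dv$.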

Then the HJB Eq. (4.14) is

$$\begin{aligned}{} & {} \rho v-\langle Az,Dv\rangle _H- \frac{1}{2} |\langle \Sigma _0,z\rangle _H|^2 D^2_{00}v - \langle Dv,Bz\rangle _H -c_1z_0 - \frac{\langle Dv,C\rangle ^2}{2c_2} =0\qquad \qquad \nonumber \\{} & {} \quad z\in H, \end{aligned}$$

(5.3)

where $D^2_{00}v= \langle D^2v(1,0,0),(1,0,0)\rangle _H$ and $D_1v = \langle Dv,(0,1,0)\rangle _H$.

Though the data do not satisfy the assumptions of Theorem 9, in this case an explicit solution of (5.3) is given by a suitably chosen linear function.

Proposition 10

The function $v:H\rightarrow {\mathbb {R}}$, defined by

$$\begin{aligned} v(z)=\langle \Gamma ,z\rangle _H+\Gamma _3, \end{aligned}$$

where $\Gamma =(\Gamma _0,\Gamma _1,\Gamma _2)$,

$$\begin{aligned} \begin{aligned} \Gamma _0&=\frac{c_1}{\rho -a},\qquad \Gamma _1=\frac{\Gamma _2(0)+\alpha (0)\Gamma _0}{\rho },\qquad \Gamma _2(\cdot )=-\Gamma _0\int _{-\infty }^\cdot \alpha '(s)e^{\rho (s-\cdot )}ds,\\ \Gamma _3&=\frac{(c\Gamma _0+\Gamma _1)^2}{2\rho c_2}, \end{aligned} \end{aligned}$$

is a classical solution of (5.3). Moreover, $v=\widetilde{V}$, and the control $u^*{:}{=}\frac{c\Gamma _0+\Gamma _1}{c_2}$ is optimal.

Proof

Clearly $v\in C^2(H)$. To argue that $\Gamma \in D(A^*)$, which is equivalent to $\Gamma _2\in W^{1,2}({\mathbb {R}}^-,{\mathbb {R}})$, it is sufficient, first, to notice that Young’s inequality for convolutions implies that $\Gamma _2\in L^2({\mathbb {R}}^-,{\mathbb {R}})$, then to consider the fact that $\Gamma _2$ solves the differential equation

$$\begin{aligned} \Gamma _2' = -\rho \Gamma _2-\Gamma _0\alpha ', \end{aligned}$$

which entails $\Gamma _2'\in L^2({\mathbb {R}}^-,{\mathbb {R}})$. Now, since $\Gamma \in D(A^*)$, we have

$$\begin{aligned} \langle A^*Dv(z),z\rangle _H=\langle A^*\Gamma ,z\rangle _H =\langle (0,\Gamma _2(0),-\Gamma _2'),z\rangle _H -\frac{1}{2}\langle \Gamma ,z\rangle _H. \end{aligned}$$

Moreover, $\langle C,Dv(z)\rangle _H=\langle C,\Gamma \rangle _H$, and

$$\begin{aligned} \langle Dv,Bz \rangle _H+c_1z_0 +\frac{\langle Dv,C\rangle ^2}{2c_2}= & {} \langle \Gamma ,Bz \rangle _H+c_1z_0 +\frac{\langle C,\Gamma \rangle ^2}{2c_2}\nonumber \\= & {} \Gamma _0 \langle B_0,z\rangle _H+ \frac{1}{2}\langle \Gamma ,z\rangle _H +c_1z_0 +\frac{\langle C,\Gamma \rangle ^2}{2c_2}\nonumber \\= & {} \Gamma _0az_0+\Gamma _0\alpha (0)z_1-\Gamma _0\langle \alpha ',z_2\rangle _{L^2}\nonumber \\{} & {} \quad + \frac{1}{2}\langle \Gamma ,z\rangle _H +c_1z_0 +\frac{\langle C,\Gamma \rangle ^2}{2c_2}. \end{aligned}$$

(5.4)

Then

$$\begin{aligned} \begin{aligned} \rho v-\langle Az,Dv\rangle _H-&\frac{1}{2} |\langle \Sigma _0,z\rangle _H|^2 D^2_{00}v - \langle Dv,Bz\rangle _H -c_1z_0 - \frac{\langle C,\Gamma \rangle ^2}{2c_2} \\ =&\rho \left( \Gamma _0 z_0+\Gamma _1z_1+\langle \Gamma _2,z_2\rangle _{L^2}+\Gamma _3 \right) - \langle (0,\Gamma _2(0),-\Gamma _2'),z\rangle _H \\ {}&- \Gamma _0az_0-\Gamma _0\alpha (0)z_1+\Gamma _0\langle \alpha ',z_2\rangle _{L^2} -c_1z_0 -\frac{\langle C,\Gamma \rangle ^2}{2c_2}=0, \end{aligned} \end{aligned}$$

where we used $\Gamma _2' = -\rho \Gamma _2-\Gamma _0\alpha '$.

We sketch the rest of the proof, as it goes in a standard way (see Fabbri et al. (2017), Theorem 2.42). Let $Z=Z^{0,\zeta ^{z,\varphi },u}$. Since $\Gamma \in D(A^*)$, Itô’s formula can be applied, and then, since v is a classical solution of (5.3), we obtain

$$\begin{aligned} v(z)= & {} e^{-\rho t}{\mathbb {E}} \left[ v(Z_t) \right] +\int _0^t e^{-\rho s} {\mathbb {E}} \left[ \widetilde{ l}(Z_s,u_s)\right] ds\\{} & {} -\int _0^t e^{-\rho s} {\mathbb {E}} \left[ {\mathcal {H}}_{CV}(Z_s,Dv(Z_s),D^2v(Z_s),u_s) \right. \\{} & {} \quad \left. - {\mathcal {H}}(Z_s,Dv(Z_s),D^2v(Z_s))\right] ds. \end{aligned}$$

Letting t go to $\infty $ (for $\rho $ large enough), we recover the fundamental identity

$$\begin{aligned} v(z)= & {} \int _0^\infty e^{-\rho s} {\mathbb {E}} \left[ \widetilde{ l}(Z_s,u_s)\right] ds\nonumber \\{} & {} +\int _0^\infty e^{-\rho s} {\mathbb {E}} \left[ {\mathcal {H}}(Z_s,Dv(Z_s),D^2v(Z_s)) \right. \nonumber \\{} & {} \quad \left. -{\mathcal {H}}_{CV}(Z_s,Dv(Z_s),D^2v(Z_s),u_s) \right] ds. \end{aligned}$$

(5.5)

Since (5.5) holds true for any control u, recalling that ${\mathcal {H}}\ge {\mathcal {H}}_{CV}$, we conclude $v\ge \widetilde{V}$. Finally, by (5.2) and (5.5), we have

$$\begin{aligned} v(z)=\int _0^\infty e^{-\rho s} {\mathbb {E}} \left[ \widetilde{ l}(Z_s,u^*_s)\right] ds\le \widetilde{ V}(z), \end{aligned}$$

which concludes the proof. $\square $
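
The coefficient identities behind Proposition 10 can be checked numerically for a concrete kernel. The sketch below assumes the exponential kernel $\alpha (r)=e^{\kappa r}$ on $(-\infty ,0]$ with $\kappa >0$; this is an illustrative choice, and whether it meets Assumption 2 is not verified here. For this kernel the integral defining $\Gamma _2$ has the closed form $\Gamma _2(x)=-\Gamma _0\kappa e^{\kappa x}/(\kappa +\rho )$. The function names and all parameter values are hypothetical.

```python
import math

def lq_coefficients(a, c, c1, c2, rho, kappa):
    """Gamma_0, Gamma_1, Gamma_2(.), Gamma_3 and u* for alpha(r) = exp(kappa * r)."""
    g0 = c1 / (rho - a)

    def g2(x):
        # Closed form of -g0 * int_{-inf}^x alpha'(s) * exp(rho * (s - x)) ds.
        return -g0 * kappa * math.exp(kappa * x) / (kappa + rho)

    g1 = (g2(0.0) + 1.0 * g0) / rho            # uses alpha(0) = 1
    g3 = (c * g0 + g1) ** 2 / (2.0 * rho * c2)
    u_star = (c * g0 + g1) / c2
    return g0, g1, g2, g3, u_star

def hjb_residuals(a, c, c1, c2, rho, kappa, x=-0.7, eps=1e-5):
    """Residuals of the four identities that make v(z) = <Gamma, z> + Gamma_3 solve (5.3)."""
    g0, g1, g2, g3, _ = lq_coefficients(a, c, c1, c2, rho, kappa)
    alpha_prime = kappa * math.exp(kappa * x)
    g2_prime = (g2(x + eps) - g2(x - eps)) / (2.0 * eps)   # central difference
    return (
        rho * g0 - a * g0 - c1,                      # coefficient of z_0
        rho * g1 - g2(0.0) - 1.0 * g0,               # coefficient of z_1
        rho * g2(x) + g2_prime + g0 * alpha_prime,   # z_2 component at the point x
        rho * g3 - (c * g0 + g1) ** 2 / (2.0 * c2),  # constant term
    )

if __name__ == "__main__":
    params = dict(a=0.1, c=0.5, c1=1.0, c2=2.0, rho=1.0, kappa=0.5)  # hypothetical
    print("residuals:", hjb_residuals(**params))
    print("u* =", lq_coefficients(**params)[4])
```

All four residuals should vanish up to rounding and finite-difference error. The sample parameters take $\rho >a$ so that $\Gamma _0$ is well defined; this says nothing about how large $\rho $ must be for the verification argument itself.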

References

Biagini, S., Gozzi, F., Zanella, M.: Robust portfolio choice with sticky wages. SIAM J. Financ. Math. 13(3), 1004–1039 (2022)
Biffis, E., Gozzi, F., Prosdocimi, C.: Optimal portfolio choice with path dependent labor income: the infinite horizon case. SIAM J. Control Optim. 58(4), 1906–1938 (2020)
Cont, R., Fournie, D.: Change of variable formulas for non-anticipative functionals on path space. J. Funct. Anal. 259, 1043–1072 (2010)
Cont, R., Fournie, D.: A functional extension of the Ito formula. C. R. Math. Acad. Sci. Paris 348(1–2), 57–61 (2010)
Cosso, A., Gozzi, F., Rosestolato, M., Russo, F.: Path-dependent hamilton-jacobi-bellman equation: Uniqueness of crandall-lions viscosity solutions (2023) arXiv:2107.05959
Cosso, A., Gozzi, F., Kharroubi, I., Pham, H., Rosestolato, M.: Optimal control of path-dependent McKean-Vlasov SDEs in infinite-dimension. Ann. Appl. Prob. 33(4), 2863–2918 (2023)
De Feo, F., Federico, S., Swiech, A.: Optimal control of stochastic delay differential equations and applications to path-dependent financial and economic models (2023) arXiv:2302.08809
De Feo, F.: Stochastic optimal control problems with delays in the state and in the control via viscosity solutions and applications to optimal advertising and optimal investment problems. Decisions in Economics and Finance (2024). https://doi.org/10.1007/s10203-024-00456-y
Di Giacinto, M., Federico, S., Gozzi, F.: Pension funds with a minimum guarantee: a stochastic control approach. Financ. Stoch. 15, 297–342 (2011)
Djehiche, B., Gozzi, F., Zanco, G., Zanella, M.: Optimal portfolio choice with path dependent benchmarked labor income: a mean field model. Stoch. Process. Appl. 145, 48–85 (2022)
Fabbri, G., Gozzi, F., Swiech, A.: Stochastic Optimal Control in Infinite Dimension. Dynamic Programming and HJB Equations. Probability Theory and Stochastic Modelling, vol. 82. Springer, (2017)
Federico, S.: A stochastic control problem with delay arising in a pension fund model. Financ. Stoch. 15(3), 421–459 (2011)
Federico, S., Tankov, P.: Exact or approximate finite-dimensional Markovian representation for stochastic control problems with delay. Appl. Math. Optim. 71(1), 165–194 (2015)
Feichtinger, G., Hartl, R.F., Sethi, S.P.: Dynamic optimal control models in advertising: recent developments. Manag. Sci. 40(2), 195–226 (1994)
Fuhrman, M., Masiero, F., Tessitore, G.: Stochastic equations with delay: optimal control via BSDEs and regular solutions of Hamilton-Jacobi-Bellman equations. SIAM J. Control Optim. 48(7), 4624–4651 (2010)
Gawarecki, L., Mandrekar, V.: Stochastic differential equations in infinite dimensions with applications to stochastic partial differential equations. Probability and its Applications (New York). Springer, Heidelberg (2011)
Gozzi, F., Marinelli, C.: Stochastic optimal control of delay equations arising in advertising models. Stochastic partial differential equations and applications VII - Papers of the 7th meeting, Levico Terme, Italy, January 5-10, 2004, Lecture Notes in Pure and Applied Mathematics 245, 133–148 (2004)
Gozzi, F., Masiero, F., Rosestolato, M.: An optimal advertising model with carryover effect and mean field terms. Mathematics and Financial Economics (2024). https://doi.org/10.1007/s11579-024-00361-3
Gozzi, F., Marinelli, C., Savin, S.: On controlled linear diffusions with delay in a model of optimal advertising under uncertainty with memory effects. J. Optim. Theory Appl. 142, 291–321 (2009)
Grosset, L., Viscolani, B.: Advertising for a new product introduction: a stochastic approach. Top 12(1), 149–167 (2004)
Hartl, R.F.: Optimal dynamic advertising policies for hereditary processes. J. Optim. Theory Appl. 43(1), 51–72 (1984)
Lefebvre, W., Miller, E.: Linear-quadratic stochastic delayed control and deep learning resolution. J. Optim. Theory Appl. 191(1), 134–168 (2021)
Marinelli, C.: The stochastic goodwill problem. Eur. J. Oper. Res. 176(1), 389–404 (2007)
Masiero, F., Tessitore, G.: Partial smoothing of delay transition semigroups acting on special functions. J. Diff. Equ. 316, 599–640 (2022)
Motte, M., Pham, H.: Optimal bidding strategies for digital advertising. Working Papers hal-03429785, HAL (November 2021). https://ideas.repec.org/p/hal/wpaper/hal-03429785.html
Nerlove, M., Arrow, K.J.: Optimal advertising policy under dynamic conditions. Economica 29(114), 129–142 (1962)
Pang, T., Yong, Y.: A New Stochastic Model For Stock Price with Delay Effects, pp. 110–117. Society for Industrial and Applied Mathematics, (2019)
Prasad, A., Sethi, S.P.: Dynamic optimization of an oligopoly model of advertising. UTD School of Management Working Paper (2008)
Ren, Z., Rosestolato, M.: Viscosity solutions of path-dependent pdes with randomized time. SIAM J. Math. Anal. 52(2), 1943–1979 (2020)
Ricciardi, M., Rosestolato, M.: Mean field games incorporating carryover effects: Optimizing advertising models (2024) arXiv:2403.00413v1
Rosestolato, M., Swiech, A.: Partial regularity of viscosity solutions for a class of Kolmogorov equations arising from mathematical finance. J. Diff. Equ. 262(3), 1897–1930 (2017)
Vidale, M.L., Wolfe, H.B.: An operations-research study of sales response to advertising. Oper. Res. 5, 370–381 (1957)
Vinter, R.B., Kwong, R.H.: The infinite time quadratic control problem for linear system with state control delays: an evolution equation approach. SIAM J. Control Optim. 19, 139–153 (1981)

Funding

Open access funding provided by Alma Mater Studiorum - Università di Bologna within the CRUI-CARE Agreement. This research was partially financed by the INdAM (Istituto Nazionale di Alta Matematica F. Severi) – GNAMPA (Gruppo Nazionale per l’Analisi Matematica, la Probabilità e le loro Applicazioni) Project CUP E53C23001670001 Problemi di controllo ottimo stocastico con memoria a informazione parziale.

Author information

Authors and Affiliations

Dipartimento di Matematica, Università di Bologna, Piazza di Porta San Donato, 5, 40126, Bologna, Italy
Cristina Di Girolami
Dipartimento di Economia, Università di Genova, Via Vivaldi, 5, 16126, Genoa, Italy
Mauro Rosestolato

Corresponding author

Correspondence to Cristina Di Girolami.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Girolami, C.D., Rosestolato, M. Representation of stochastic optimal control problems with delay in the control variable. Decisions Econ Finan (2024). https://doi.org/10.1007/s10203-024-00465-x

Received: 03 October 2023
Accepted: 10 June 2024
Published: 25 June 2024
DOI: https://doi.org/10.1007/s10203-024-00465-x
