Abstract
In this manuscript we provide a representation in infinite dimension for stochastic optimal control problems with delay in the control variable. The main novelty consists in the fact that the representation can be applied also to dynamics where the delay in the control appears as a nonlinear term and in the diffusion coefficient. We then apply the representation to an LQ case where an explicit solution can be found.
1 Introduction
In this paper we consider a class of stochastic optimal control problems where the state equation is a stochastic delay differential equation in \({\mathbb {R}}^n\) of the form
where W is a Brownian motion with values in \({\mathbb {R}}^m\), x is the state variable with values in \({\mathbb {R}}^n\), u is a control process taking values in a suitable set \(U\subset {\mathbb {R}}^k\), \(x_0 \in {\mathbb {R}}^n\) is the initial value of the state variable, \(\varphi \in L^2({\mathbb {R}}^-,{\mathbb {R}}^k)\) is the initial given control, \(f:{\mathbb {R}}^n\times U\times {\mathbb {R}}\longrightarrow {\mathbb {R}}^n\), \(g:{\mathbb {R}}^n\times U\times {\mathbb {R}}\longrightarrow {\mathbb {R}}^{m\times n}\) and \(\alpha ,\beta \) are \(L^2({\mathbb {R}}^-,{\mathbb {R}}^k)\) functions. The goal is to maximize, over all \(u \in {\mathcal {U}}\), the functional
where \(\gamma \in L^2({\mathbb {R}}^-,{\mathbb {R}}^k)\) and \(l:{\mathbb {R}}^n\times U\times {\mathbb {R}}\longrightarrow {\mathbb {R}}\). The key feature of this class of problems is the integral dependence of all the ingredients (coefficients f, g of the state equation and running reward function l) on the path of the control u.
As in the case where the delay dependence is with respect to the state variable, the models that we address also lack Markovianity. Due to this fact, the dynamic programming approach cannot be directly applied. To overcome this difficulty, when it is the delay in the state variable but not in the control variable that appears in the problem, one available approach consists in rephrasing the finite dimensional problem in a Hilbert space setting, where the constituents of the new problem no longer present a delay-type dependence. The benefit of this approach is to recover Markovianity, hence to allow for an application of the dynamic programming machinery. Clearly, there is a cost to pay in doing so, since a more technical theory is required, in particular for dealing with unbounded second-order Hamilton-Jacobi-Bellman equations on Hilbert spaces. Nevertheless, such a theory has been developed and is available for application (Fabbri et al. 2017). For stochastic optimal control problems with a delay dependence on the state variable, but not on the control variable, see Biffis et al. (2020), Biagini et al. (2022), Djehiche et al. (2022), De Feo et al. (2023), Di Giacinto et al. (2011), Federico (2011), Federico and Tankov (2015), Fuhrman et al. (2010), Masiero and Tessitore (2022), Pang and Yong (2019). See also Cosso et al. (2023); Ren and Rosestolato (2020); Cosso et al. (2023) for a different approach, where no representation in Hilbert space is performed: the problem, presented in a path-dependent framework, is addressed via dynamic programming in the original setting, making use of the so-called pathwise derivatives (see Cont and Fournie (2010a, 2010b) for an account of this topic).
On the other hand, if we take into consideration models where we have a distributed delay dependence, as we do in the present work, the infinite dimensional representation trick to overcome the lack of Markovianity is not obvious. A way to do it is the one followed originally by Vinter and Kwong (1981), extended to the stochastic case with additive noise in Gozzi and Marinelli (2004), then recently generalized in De Feo (2023) by considering a nonlinear dependence on the present value of the control variable in the diffusion coefficient. In these works, the authors rephrase the original dynamics as an equivalent abstract SDE in a Hilbert space, controlled now only on the present value of the control variable. The drift of such an abstract controlled SDE is linearly dependent on an unbounded linear operator acting on the infinite dimensional state variable. The setting thereby recovered is then suitable to apply the theory of optimal control in infinite dimension, as developed in Fabbri et al. (2017).
Such a strategy to rephrase the problem strongly relies on the fact that the integral dependence on the past of the control variable appears only linearly in the drift of the original state dynamics. If in our model we used the same representation as in Gozzi and Marinelli (2004), De Feo (2023), the corresponding abstract equation would show a nonlinear dependence, both in the drift and in the diffusion coefficient, on an unbounded linear operator acting on the infinite dimensional state. This structure would make the problem very difficult, and intractable, within the theory in Fabbri et al. (2017).
It is to overcome this issue, hence to let the delay dependence on the control appear nonlinearly both in the drift and in the diffusion coefficient, that we present an alternative representation.
The starting point is the simple observation that the function
can be introduced as a second state variable to rewrite the integral dependence in (1.1) and in (1.2) as
and similarly for the other terms involving \(\beta ,\gamma \). The point is that the control variable u no longer appears in (1.3), but only the newly introduced state \(x_1\), whose dynamics is trivially \(dx_1(t)=u(t)dt\). Of course, if we use (1.3) (and similarly for \(\beta ,\gamma \)) in (1.1) and in (1.2), we still have to deal with a delay model, but now the delay is in the state variable \(x_1\). This fact is important, because we can now perform the standard representation in infinite dimension for models with delay in the state: as said above, an unbounded linear operator will appear in the infinite dimensional dynamics, but it will appear linearly and only in the drift coefficient. Of course, differently from Gozzi and Marinelli (2004), in order to compute (1.3), we need some regularity, meaning \(\alpha ,\beta ,\gamma \in W^{1,2}({\mathbb {R}}^-,{\mathbb {R}}^k)\).
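The change of variable behind (1.3) can be checked numerically. The sketch below is only an illustration under stated assumptions: it takes k = 1, assumes that the new state is the running integral \(x_1(t)=\int _{-\infty }^t u(s)ds\) (with u equal to the initial control on \({\mathbb {R}}^-\)), and uses the integration by parts identity \(\int _{-\infty }^0\alpha (r)u(t+r)dr=\alpha (0)x_1(t)-\int _{-\infty }^0\alpha '(r)x_1(t+r)dr\). The kernel, the control path, the truncation of the past and all function names are hypothetical choices, not taken from the paper.

```python
import math

def alpha(r):
    # Hypothetical kernel alpha(r) = e^{r}, which belongs to W^{1,2}(R^-).
    return math.exp(r)

def alpha_prime(r):
    return math.exp(r)

def u(s):
    # Hypothetical control path: initial control e^{2s} on R^-, then constant 1.
    return math.exp(2.0 * s) if s <= 0.0 else 1.0

def x1(t):
    # Assumed second state x1(t) = int_{-inf}^t u(s) ds, in closed form.
    return 0.5 * math.exp(2.0 * t) if t <= 0.0 else 0.5 + t

def trapezoid(func, a, b, n):
    # Composite trapezoid rule on [a, b] with n subintervals.
    h = (b - a) / n
    total = 0.5 * (func(a) + func(b))
    for i in range(1, n):
        total += func(a + i * h)
    return total * h

def delay_term_control(t, cutoff=-30.0, n=300_000):
    # int_{-inf}^0 alpha(r) u(t + r) dr, truncated to [cutoff, 0].
    return trapezoid(lambda r: alpha(r) * u(t + r), cutoff, 0.0, n)

def delay_term_state(t, cutoff=-30.0, n=300_000):
    # alpha(0) x1(t) - int_{-inf}^0 alpha'(r) x1(t + r) dr, truncated to [cutoff, 0].
    integral = trapezoid(lambda r: alpha_prime(r) * x1(t + r), cutoff, 0.0, n)
    return alpha(0.0) * x1(t) - integral

t = 1.0
lhs = delay_term_control(t)   # delay expressed through the control u
rhs = delay_term_state(t)     # same quantity expressed through the state x1
exact = 1.0 - (2.0 / 3.0) * math.exp(-1.0)  # hand computation for these choices
```

For these particular choices both quantities can be computed by hand and equal \(1-\frac{2}{3}e^{-1}\); the two numerical values agree with it up to the quadrature error.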
Using this approach, once the problem has been represented in an infinite dimensional setting, one ends up with a structure that can be tackled by appealing to the available theory for dynamic programming in infinite dimension, including the theory of B-viscosity solutions, as presented in Fabbri et al. (2017).
Concerning applications, the model that we present here can be exploited for optimal advertising. Within this field, a basic setting has been provided by the seminal papers (Nerlove and Arrow 1962; Vidale and Wolfe 1957), then extended to the stochastic case, in particular, by Grosset and Viscolani (2004), Marinelli (2007), Motte and Pham (2021), Prasad and Sethi (2008). The delay in the control variable, representing the advertisement spending, is then introduced, in our model as in Gozzi et al. (2009); Gozzi and Marinelli (2004); De Feo (2023), in order to account for a delay effect in the spending, often called carryover effect (see Gozzi et al. (2009); Hartl (1984); Feichtinger et al. (1994)). A further extension of the stochastic linear model with delay in the control, and additive noise, has been recently provided by Gozzi et al. (2024) and Ricciardi and Rosestolato (2024), where a mean field term is introduced, in order to account for non-competitive and competitive environments, respectively.
We point out that when the delay dependence on the control is not in an integral form, as we assumed in the discussion above, but e.g. pointwise, then the representation in infinite dimension is in general more difficult to perform, and other strategies have to be exploited (see e.g. Lefebvre and Miller (2021)).
The plan of the paper is the following. In Sect. 2 we introduce the needed notations. In Sect. 3 we formulate the optimal control problem in finite dimension with delay in the control variable. In Sect. 4 we introduce the infinite dimensional setting and prove Theorem 6, which states the equivalence between the finite dimensional control problem with delay in the control, introduced in Sect. 3, and an infinite dimensional control problem, where there is no delay in the control variable. Finally, in Sect. 5, we show how the representation of Sect. 4 can be used to find an explicit solution for an LQ model, where both the drift and the diffusion coefficient of the state dynamics depend on the path of the delay.
2 Notation and preliminaries
We fix natural numbers n, k, m, that will represent the dimension of the state variable, the control variable, the Brownian motion, respectively. By \(M_{n\times m}({\mathbb {R}})\) we denote the space of \(n\times m\) matrices with real entries, endowed with the Frobenius norm. For finite dimensional spaces, the Euclidean norm and scalar product will be always denoted by \(|\cdot |\) and \(\langle \cdot ,\cdot \rangle \), respectively, without any subscript. We denote \({\mathbb {R}}^+=[0,+\infty )\) and \({\mathbb {R}}^-=(-\infty ,0]\). If \({\mathcal {T}}\) is any topological space, \({\mathcal {B}}_{\mathcal {T}}\) denotes its Borel sigma-algebra. We fix a filtered probability space \((\Omega ,{\mathcal {F}},{\mathbb {F}}=\{{\mathcal {F}}_t\}_{t\in {\mathbb {R}}^+},{\mathbb {P}})\) satisfying the usual conditions, and an m-dimensional Brownian motion W defined on it. We assume \({\mathbb {F}}\) to be the completion of the natural filtration of W.
Given any separable Banach space \((E,|\cdot |_E)\), we introduce the following function spaces.
(i) For \(p\ge 1\), \(L^p(E)\) denotes the space \(L^p({\mathbb {R}}^-,E)\) of E-valued p-Lebesgue integrable functions defined on \({\mathbb {R}}^-\). Its usual \(L^p\)-norm will be denoted by \(|\cdot |_{L^p}\).
(ii) For \(p\ge 1\) and any sub-sigma-algebra \({\mathcal {G}}\subset {\mathcal {F}}\), \(L^p_{{\mathcal {G}}}(E)\) denotes the space of \({\mathcal {G}}\)-measurable random variables \(\xi \) such that
$$\begin{aligned} |\xi |_p{:}{=}\left( {\mathbb {E}} \left[ |\xi |^p_E \right] \right) ^{1/p}<\infty . \end{aligned}$$
(iii) For \(t\ge 0\), \(L^0_{{\mathbb {F}},t}(E)\) denotes the space of \(\{{\mathcal {F}}_s\}_{s\in [t,\infty )}\)-progressively measurable processes \(X:\Omega \times [t,\infty )\rightarrow E\) endowed with the (quotient) metrizable topology associated to convergence in measure, when \((\Omega \times [t,\infty ),{\mathcal {F}}\otimes {\mathcal {B}}_{[t,\infty )})\) is endowed with the product measure \({\mathbb {P}}\otimes \lambda \) (\(\lambda \) is the Lebesgue measure).
(iv) For \(p\ge 1\) and \(0\le t\le T\), \(L^p_{{\mathbb {F}},t,T}(E)\) denotes the space of \(\{{\mathcal {F}}_s\}_{s\in [t,T]}\)-progressively measurable processes \(X:\Omega \times [t,T]\rightarrow E\) such that
$$\begin{aligned} |X|_{p,t,T} {:}{=}\left( {\mathbb {E}} \left[ \int _t^T |X_s|^p_E ds \right] \right) ^{1/p}<\infty . \end{aligned}$$
The couple \((L^p_{{\mathbb {F}},t,T}(E),|\cdot |_{p,t,T})\) is a Banach space.
(v) For \(p\ge 1\) and \(t\ge 0\), \(L^p_{{\mathbb {F}},t}(E)\) denotes the Fréchet space of processes \(X\in L^0_{{\mathbb {F}},t}(E)\) such that \(|X|_{p,t,T} <\infty \) for all \(T >t\).
(vi) For \(p\ge 1\) and \(0\le t\le T\), \({\textbf{S}}^p_{{\mathbb {F}},t,T}(E)\) denotes the Fréchet space of continuous processes \(X\in L^p_{{\mathbb {F}},t,T}(E)\) such that
$$\begin{aligned} \Vert X\Vert _{p,t,T} {:}{=}\left( {\mathbb {E}} \left[ \sup _{s\in [t,T]} |X_s|^p_E \right] \right) ^{1/p}<\infty . \end{aligned}$$
(vii) For \(p\ge 1\) and \(t\ge 0\), \({\textbf{S}}^p_{{\mathbb {F}},t}(E)\) denotes the Fréchet space of continuous processes \(X\in L^p_{{\mathbb {F}},t}(E)\) such that \(\Vert X\Vert _{p,t,T}<\infty \) for all \(T>t\).
If E, F are Banach spaces, the space L(E, F) of linear and continuous operators \(E\rightarrow F\) is considered as endowed with the operator norm, denoted by \(|\cdot |_{{\mathcal {L}}(E,F)}\).
If K is a Hilbert space, its scalar product will be denoted by \(\langle \cdot ,\cdot \rangle _K\). When \(K=L^2({\mathbb {R}}^k)\), we simply write \(\langle \cdot ,\cdot \rangle _{L^2}\).
We assume that the control variable takes values in a nonempty Borel set \(U\subset {\mathbb {R}}^k\). The control processes that we take into consideration are those belonging to the set
For given \(\alpha :{\mathbb {R}}^-\rightarrow {\mathbb {R}}^k\) and \(\beta :{\mathbb {R}}^+\rightarrow {\mathbb {R}}^k\), and for given times \(t_0,t\in {\mathbb {R}}^+\), \(t_0\le t\), we denote by \(\alpha \otimes ^{t_0}_t\beta \) the function \({\mathbb {R}}^-\rightarrow {\mathbb {R}}^k\) defined by
Notice that, if \(\varphi \in L^2({\mathbb {R}}^k)\) and \(u\in {\mathcal {U}}\), then \(\varphi \otimes ^{t_0} u=\{\varphi \otimes ^{t_0}_t u\}_{t\ge t_0}\) belongs to \(L^2_{{\mathbb {F}},t}(L^2({\mathbb {R}}^k))\).
3 The optimal control problem
3.1 State equation
For an initial time \(t\in {\mathbb {R}}^+\), an initial state \(\xi \in L^2_{{\mathcal {F}}_t}({\mathbb {R}}^n)\), and a control process \(u\in {\mathcal {U}}\), we consider a state process x evolving according to the following delayed controlled stochastic differential equation:
where we recall that \(\langle \cdot ,\cdot \rangle _{L^2}\) denotes the scalar product in \(L^2({\mathbb {R}}^k)\), and the data \(f,g,\alpha ,\beta ,\varphi \) are assumed to satisfy the following assumptions.
Assumption 1
The functions
are such that
(i) f, g are measurable;
(ii) there exists a constant L such that
$$\begin{aligned} \begin{aligned} |f(x,u,r)- f(x',u,r)|&\le L|x-x'|\\ |g(x,u,r)- g(x',u,r)|&\le L|x-x'|\\ |f(0,u,r)| + |g(0,u,r)|&\le L(1+|u|+|r|) \end{aligned} \end{aligned}$$
for all \((x,u,r)\in {\mathbb {R}}^n\times U\times {\mathbb {R}}\).
Assumption 2
(i) \(\alpha \), \(\beta \) are functions belonging to \(W^{1,2}({\mathbb {R}}^-,{\mathbb {R}}^k)\);
(ii) \(\varphi \in L^1({\mathbb {R}}^k)\cap L^2({\mathbb {R}}^k)\) is such that \(\int _{-\infty }^\cdot \varphi (r)dr \in L^2({\mathbb {R}}^k)\).
Notice that Assumption 2(ii) is satisfied whenever \(\varphi \in L^2({\mathbb {R}}^k)\) has compact support.
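To make the role of the delay terms concrete, the following is a minimal Euler-Maruyama sketch for a scalar instance of the state equation. It is an illustration under assumptions, not the construction used in the paper: it assumes that the equation has the form \(dx(s)=f(x(s),u(s),\langle \alpha ,u(s+\cdot )\rangle _{L^2})ds+g(x(s),u(s),\langle \beta ,u(s+\cdot )\rangle _{L^2})dW(s)\), with u equal to \(\varphi \) before the initial time. The coefficients, kernels, initial control, constant control, the truncation of the past to a finite window and all names are hypothetical.

```python
import math
import random
from collections import deque

def f(x, u, r):
    # Hypothetical drift, Lipschitz in x with linear growth in (u, r).
    return -x + u + r

def g(x, u, r):
    # Hypothetical diffusion coefficient depending on the delay term r.
    return 0.2 * (1.0 + r)

def phi(r):
    # Hypothetical initial control on R^-; integrable with integrable primitive.
    return math.exp(2.0 * r)

def control(s):
    # Hypothetical admissible control: constant for s >= 0.
    return 1.0

def simulate_state(x0=1.0, T=1.0, dt=0.01, memory=10.0, seed=0):
    """Euler-Maruyama path of a scalar SDE with distributed delay in the control."""
    rng = random.Random(seed)
    n_hist = int(round(memory / dt))
    n_steps = int(round(T / dt))
    # Kernels sampled on the grid r_j = -j * dt, j = 0, ..., n_hist.
    alpha = [math.exp(-j * dt) for j in range(n_hist + 1)]             # alpha(r) = e^{r}
    beta = [0.5 * math.exp(-2.0 * j * dt) for j in range(n_hist + 1)]  # beta(r) = e^{2r} / 2
    # past[j] approximates the control at time s - j * dt (most recent first).
    past = deque((phi(-j * dt) for j in range(1, n_hist + 1)), maxlen=n_hist + 1)
    x = x0
    path = [x0]
    for i in range(n_steps):
        u_now = control(i * dt)
        past.appendleft(u_now)  # once the window is full, the oldest value is dropped
        # Riemann sums approximating <alpha, u(s + .)> and <beta, u(s + .)>.
        r_alpha = dt * sum(a * v for a, v in zip(alpha, past))
        r_beta = dt * sum(b * v for b, v in zip(beta, past))
        dW = math.sqrt(dt) * rng.gauss(0.0, 1.0)
        x = x + f(x, u_now, r_alpha) * dt + g(x, u_now, r_beta) * dW
        path.append(x)
    return path

path = simulate_state()
```

The history buffer is what makes the discretized dynamics non-Markovian in x alone: the next value of x depends on the whole window of past controls, not only on the current pair (x, u).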
We have the following well-posedness result for the state equation and continuity and growth properties of the strong solution.
Proposition 3
For \(t\in {\mathbb {R}}^+\), \(\xi \in L^2_{{\mathcal {F}}_t}({\mathbb {R}}^n)\), and \(u\in {\mathcal {U}}\), there exists a unique strong solution \(x^{t,\xi ,u}\in L^0_{{\mathbb {F}},t}({\mathbb {R}}^n)\) of (3.1). Moreover, \(x^{t,\xi ,u}\in {\textbf{S}}^2_{{\mathbb {F}},t}({\mathbb {R}}^n)\) and
(a) for any \(M>1\), there exists a constant C(M, L) depending only on M, L such that
$$\begin{aligned} \sup _{\begin{array}{c} u\in {\mathcal {U}}\\ \varphi \in L^2({\mathbb {R}}^k) \end{array}} \Vert x^{t,\xi ,u}- x^{t,\xi ',u} \Vert _{2,t,T}\le M e^{C(M,L)\cdot (T-t)}\cdot |\xi -\xi '|_2 \qquad \forall \, 0\le t\le T,\ \xi ,\xi '\in L^2_{{\mathcal {F}}_t}({\mathbb {R}}^n); \end{aligned}\tag{3.2}$$
(b) there exists \({\hat{C}}={\hat{C}}(L,|\alpha |_{L^2},|\beta |_{L^2})\), depending only on \(L,|\alpha |_{L^2},|\beta |_{L^2}\), and \({\hat{D}}={\hat{D}}(L)\), depending only on L, such that
$$\begin{aligned} \Vert x^{t,\xi ,u} \Vert _{2,t,T}\le {\hat{C}} \left( 1+|\xi |_2+|\varphi |_{L^2({\mathbb {R}}^k)}+|u|_{2,t,T} \right) \cdot \, e^{{\hat{D}}\cdot (T-t)}, \end{aligned}\tag{3.3}$$
for all \(\varphi \in L^2({\mathbb {R}}^k)\), \(u\in {\mathcal {U}}\), \(0\le t\le T\), and \(\xi \in L^2_{{\mathcal {F}}_t}({\mathbb {R}}^n)\).
We omit the proof, since it is based on standard arguments. To give the reader an idea for (3.2), consider e.g. Proposition 2.8 in Cosso et al. (2023). There, the Lipschitz constant is not expressed as in (3.2). Nevertheless, by inspection, one can check that the constant \(\gamma \) in Claim III of Proof of Proposition 2.8, at p. 2897 in Cosso et al. (2023), can be arbitrarily close to 0, as long as \(\varepsilon \) is small enough. This fact entails that the Lipschitz constant in Proposition 2.8 in Cosso et al. (2023) can be arbitrarily close to 1, as long as \(T-t\) is small enough. This provides our M in (3.2), as long as \(T-t\) is small enough. For general intervals [t, T], one can use the estimate obtained for small \(T-t\), combined with the flow property of solutions. In this way one obtains the exponential term in (3.2).
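The passage from small intervals to general ones can be made explicit with a short computation. Suppose, as a hypothetical parametrisation that is not taken from the paper, that the short-time estimate holds with constant M on every interval of length at most \(\delta \). Chaining \(N=\lceil (T-t)/\delta \rceil \) such estimates through the flow property gives the constant \(M^N\), and since \(N\le (T-t)/\delta +1\) one gets \(M^N\le Me^{C(T-t)}\) with \(C=\ln (M)/\delta \). The sketch below only checks this arithmetic; the function name and the values of M and \(\delta \) are illustrative.

```python
import math

def chained_lipschitz_bound(M, delta, t, T):
    """Compare the chained constant M**N with the closed form M * exp(C * (T - t)).

    M > 1 is the short-time Lipschitz constant and delta > 0 the (hypothetical)
    interval length on which it is valid.
    """
    n_steps = max(1, math.ceil((T - t) / delta))  # number of short intervals covering [t, T]
    chained = M ** n_steps                        # constant obtained via the flow property
    C = math.log(M) / delta                       # exponential rate playing the role of C(M, L)
    closed_form = M * math.exp(C * (T - t))       # right-hand side shape of (3.2)
    return n_steps, chained, closed_form

n_steps, chained, closed_form = chained_lipschitz_bound(M=1.1, delta=0.1, t=0.0, T=2.35)
```

Choosing M closer to 1 forces a smaller \(\delta \), which is why the rate C depends on M as well as on L.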
To obtain (3.3), with an explicit growth constant \({\hat{D}}\), one can argue as in the proof of (Fabbri et al. (2017), Proposition 3.24, p. 187).
3.2 Objective functional and value function
We consider a discount factor \(\rho >0\) and a current reward function \( l:{\mathbb {R}}^n\times {\mathbb {R}}^k\times {\mathbb {R}} \rightarrow {\mathbb {R}} \) on which we impose the following assumptions.
Assumption 4
(i) The function l is measurable.
(ii) There exist constants \(a\ge 0\), \(0\le q \le 2\), \(d>0,{\theta > q}\) such that
$$\begin{aligned} l(x,u,r)\le a(1+|x|^q+|r|^q)-d |u|^\theta \qquad \forall u\in U,\ x\in {\mathbb {R}}^n,\ r\in {\mathbb {R}}. \end{aligned}$$
(iii) \(\rho > 2{\hat{D}}\), where \({\hat{D}}\) is as in (3.3).
(iv) \(\gamma \) is a function belonging to \(W^{1,2}({\mathbb {R}}^-,{\mathbb {R}}^k)\).
Under Assumptions 1 and 4, from Proposition 3 we get that the reward functional J, given by
is well-defined as a function \({\mathbb {R}}^n\times {\mathcal {U}}\rightarrow {\mathbb {R}}\).
We then consider the optimal control problem consisting in maximizing J over the set of admissible controls \({\mathcal {U}}\), for any given \(x_0\in {\mathbb {R}}^n\):
For \(x_0\in {\mathbb {R}}^n\), we define the value function
4 Representation in infinite dimension
Due to the dependence on the past of the control variable u, the finite dimensional stochastic dynamics (3.1) is not Markovian. This feature entails that the standard dynamic programming approach cannot be applied to the finite dimensional stochastic optimal control problem (f-OCP). A classical workaround to regain Markovianity consists in rephrasing the model in a functional space setting.
In order to do that, we start by introducing the Hilbert space
endowed with the induced scalar product
where \(z=(z_0,z_1,z_2)\), \(z_0\in {\mathbb {R}}^n,\ z_1\in {\mathbb {R}}^k,\ z_2\in L^2({\mathbb {R}}^k)\), and similarly for y.
4.1 Reformulation of the state equation in H
Then consider functions \({\hat{F}}, G\), associated to f, g, respectively, defined by
where \(z=(z_0,z_1,z_2)\) denotes a generic point of \( H= {\mathbb {R}}^n\times {\mathbb {R}}^k\times L^2({\mathbb {R}}^k)\). Notice that the pointwise evaluations \(\alpha (0),\beta (0)\) and the square-integrable derivatives \(\alpha ',\beta '\) exist because of our initial assumption on \(\alpha ,\beta \).
For \(t\in {\mathbb {R}}^+\), consider the family of operators \({\hat{S}}=\{{\hat{S}}_t\}_{t\in {\mathbb {R}}^+}\) defined by
Then \({\hat{S}}\) is a strongly continuous semigroup, with infinitesimal generator \((D({\hat{A}}),{\hat{A}})\) specified by
with
Then we consider the H-valued dynamics
where \(u\in {\mathcal {U}}\), \(\zeta \in L^2_{{\mathcal {F}}_t}(H)\), \(t\in {\mathbb {R}}^+\).
Observe that, for fixed \(u\in {\mathcal {U}}\), Assumptions 1 on f, g entail Lipschitz continuity and sublinear growth with respect to z of \({\hat{F}}\) and G.
As can be easily checked, \({\hat{S}}\) is a \(C_0\)-semigroup of pseudo-contractions (see e.g. Appendix B.4 in Fabbri et al. (2017) for the definition). It follows that there exists a unique mild solution \({\hat{Z}}\) to (4.2), and the mild solution has a continuous version (Gawarecki and Mandrekar 2011, Theorem 3.3).
We could directly link \({\hat{Z}}^{t,\zeta ,u}\) to \(x^{t,\xi ,u}\), but, with the purpose to set up a framework suitable to be investigated in future works within the theory of B-viscosity solutions, as presented in Fabbri et al. (2017), Chapter 3, we need a dynamic representation \({\hat{Z}}\) similar to (4.2) but with the unbounded term appearing in the drift being the generator of a \(C_0\)-semigroup of contractions (see e.g. Appendix B.4 in Fabbri et al. (2017) for the definition). A simple way to do that consists in introducing the semigroup
which is a semigroup of contractions, as it is easily seen by straightforward computations. The generator (D(A), A) of S is specified by
To use A in place of \({\hat{A}}\) in (4.2), we apply a translation to the bounded part of the drift \({\hat{F}}\), defining
Finally, we consider the H-valued dynamics
where \(u\in {\mathcal {U}}\), \(\zeta \in L^2_{{\mathcal {F}}_t}(H)\), \(t\in {\mathbb {R}}^+\). As noticed for (4.2), also (4.4) admits a unique mild solution \(Z^{t,\zeta ,u}\), that can be assumed to be pathwise continuous. It should also be clear that \(Z^{t,\zeta ,u}={\hat{Z}}^{t,\zeta ,u}\). For future reference, we state this result in a proposition.
Proposition 5
For \(u\in {\mathcal {U}}, \zeta \in L^2_{{\mathcal {F}}_t}(H), t\in {\mathbb {R}}^+\), there exists a unique (up to indistinguishability) pathwise-continuous mild solution \(Z^{t,\zeta ,u}\). Moreover, \(Z^{t,\zeta ,u}\in {\textbf{S}}^2_{{\mathbb {F}},t}(H)\), and \(Z^{t,\zeta ,u}={\hat{Z}}^{t,\zeta ,u}\), where \({\hat{Z}}^{t,\zeta ,u}\) is the unique mild solution to (4.2).
Proof
For existence and uniqueness, and integral estimates, see (Gawarecki and Mandrekar (2011), Theorem 3.3).
Regarding the fact that \(Z^{t,\zeta ,u}={\hat{Z}}^{t,\zeta ,u}\), argue as in Rosestolato and Swiech (2017), pp. 1901–1902. \(\square \)
Denote by
the orthogonal projections of \(H={\mathbb {R}}^n\times {\mathbb {R}}^k\times L^2({\mathbb {R}}^k)\) onto \({\mathbb {R}}^n,{\mathbb {R}}^k, L^2({\mathbb {R}}^k)\), respectively.
The following result explains the link between the mild solution \(Z^{t,\zeta ,u}\) of (4.4) and the strong solution \(x^{t,\xi ,u}\) of (3.1).
Theorem 6
Let \(t\in {\mathbb {R}}^+\), \(\xi \in L^2_{{\mathcal {F}}_t}({\mathbb {R}}^n)\), \(u\in {\mathcal {U}}\). Let \(\zeta ^{\xi ,\varphi }=(\zeta _0,\zeta _1,\zeta _2)\in L^2_{{\mathcal {F}}_t}(H)\) be defined by
Then \(Z^{t,\zeta ^{\xi ,\varphi },u}=(Y_0,Y_1,Y_2)\), where, for \(s\ge t\),
Proof
First, we notice that, by Assumption 2(ii), \(\zeta _1\) and \(\zeta _2\) are well-defined, and \(\zeta _2\in L^2({\mathbb {R}}^k)\). Then, we observe that, integrating by parts, for \(s\ge t\),
where we have used the fact that
Now let \(Y=(Y_0,Y_1,Y_2)\), where \(Y_0,Y_1,Y_2\) are as defined by (4.5). Let \(x=x^{t,\xi ,u}\) be the strong solution of (3.1). Due to the fact that the operator \({\hat{S}}_s\) (\(s\in {\mathbb {R}}^+\)) is the identity with respect to the first component, we can write, for \(s\ge t\),
Regarding \(Y_1\), exploiting now the fact that \({\hat{S}}_s\) (\(s\in {\mathbb {R}}^+\)) is the identity in the second component, we have
Regarding \(Y_2\), we have, denoting by \({\textbf{0}}\) the function zero in \(L^2({\mathbb {R}}^k)\),
To justify the equality in \(L^2({\mathbb {R}}^k)\)
we pick any \(a\in L^2({\mathbb {R}}^k)\), and compute
This proves (4.10). Collecting (4.7), (4.8), (4.9), we obtain
Equality (4.11) tells us that Y is the unique mild solution to (4.2), i.e., \(Y={\hat{Z}}^{t,\zeta ^{\xi ,\varphi },u}\). Finally, by Proposition 5, we conclude \(Y=Z^{t,\zeta ^{\xi ,\varphi },u}\). \(\square \)
4.2 Reformulation of the optimal control problem in H
Thanks to Theorem 6, we can rephrase the finite dimensional optimal control problem (f-OCP), with delay in the control variable u, in an infinite dimensional setting, where there is no more delay in the control variable u. Indeed, if \(\zeta ^{\xi ,\varphi },Z^{t,\zeta ^{\xi ,\varphi },u}\) are as in Theorem 6, the functional J defined by (3.4) can be written as
where \(\widetilde{J}\) is defined by
with
It then follows that the problem (f-OCP) is a particular case of the infinite dimensional optimal control problem
![figure a](http://media.springernature.com/lw685/springer-static/image/art%3A10.1007%2Fs10203-024-00465-x/MediaObjects/10203_2024_465_Figa_HTML.png)
Indeed, by introducing the value function \(\widetilde{V}\) associated to (\(\infty \)-OCP), defined as
we have
4.3 Hamilton-Jacobi-Bellman equation and verification theorem
Denote by \({\mathcal {S}}(H)\) the space of self-adjoint operators in L(H). Following the dynamic programming approach, the Hamilton-Jacobi-Bellman equation associated to (\(\infty \)-OCP) is
where
with
We recall the definition of classical solution of (4.14).
Definition 1
A function \(v:H\rightarrow {\mathbb {R}}\) is a classical solution of (4.14) if \(v\in C^2(H)\), \(Dv\in D(A^*)\), \(A^*Dv\in C(H,H)\), and v satisfies
for all \(z\in H\).
Assumption 7
There exists a constant \(C>0\) such that
Assumption 8
(i) The functions f, g and l are continuous, l(x, u, r) is uniformly continuous in x on bounded subsets of \({\mathbb {R}}^n\), uniformly for \(u \in {\mathbb {R}}^k,\ r\in {\mathbb {R}}\). Moreover, there exists C such that
$$\begin{aligned} |l(x, u,r)| \le C(1+|x|) \end{aligned}$$
for all \((x, u,r) \in {\mathbb {R}}^n \times {\mathbb {R}}^k\times {\mathbb {R}}\).
(ii) The function \(v: H \rightarrow {\mathbb {R}}\) and its derivatives \(D v, D^2 v\) are uniformly continuous on bounded subsets of H. Moreover, \(D v: H \rightarrow D\left( A^*\right) \) and \(A^* D v\) is uniformly continuous on bounded subsets of H, and there exist C, N such that
$$\begin{aligned} |v(z)|+|D v(z)|_H+\textrm{Tr}\left( D^2 v(z){(D^2v(z))^*}\right) +\left| A^* D v(z)\right| _H \le C(1+|z|)^N \end{aligned}\tag{4.16}$$
for all \(z \in H\).
Theorem 9
(Theorem 2.42 in Fabbri et al. (2017)) Let \(v:H \rightarrow {\mathbb {R}}\) be a classical solution of
In addition to our standing Assumptions 1, 2, 4, let Assumptions 7 and 8 be satisfied. Assume that
where C is the constant appearing in (4.15) and N is as in (4.16). Then, we have the following statements:
1. For all \(z \in H\)
$$\begin{aligned} v(z) \ge \widetilde{V}(z). \end{aligned}$$
2. Let \(u^*\in {\mathcal {U}}\) be such that
$$\begin{aligned} u^*(s) \in \arg \max _{u \in U} {\mathcal {H}}_{C V} \left( Z^{0,z,u}(s), D v\left( Z^{0,z,u}(s)\right) , D^2 v\left( Z^{0,z,u}(s)\right) , u \right) \end{aligned}$$
for almost every \(s \in [0,+\infty )\) and \({\mathbb {P}}\)-almost surely. Then \(u^*\) is an optimal control and \(v(z)=\widetilde{V}(z)\).
5 Explicit solution in the LQ case
We now take into consideration a simple example to show how the representation in H leads to an explicit solution.
As coefficients for the state equation we consider, for real numbers a, b, c,
for all \(x\in {\mathbb {R}},\ u\in U={\mathbb {R}},\ r\in {\mathbb {R}}\). Dynamics (3.1) is written as
where \(\alpha ,\beta ,\varphi \) are assumed as in Assumption 2. As running reward we consider, for strictly positive real numbers \(c_1,c_2\), the linear-quadratic function
The value function V is
Then, for \(\rho \) sufficiently large, Assumptions 1 and 4 are satisfied.
Now we describe the corresponding infinite dimensional representation. We have \(H={\mathbb {R}}\times {\mathbb {R}}\times L^2({\mathbb {R}})\). The coefficients F, G, as defined by (4.1), (4.3), are linear:
for all \(z\in H,\ u\in L^2_{{\mathcal {F}}_0}({\mathbb {R}})\), where
The value function associated to the infinite dimensional problem (\(\infty \)-OCP) is
where Z is the mild solution of (4.4), with F, G as specified here above. The Hamiltonian \({\mathcal {H}}\) in (4.14) is
for \(z\in H,\ p\in H,\ X\in {\mathcal {S}}(H)\), \(X_{00}{:}{=}\langle X(1,0,0),(1,0,0)\rangle _H\), and it is maximized by
Then the HJB Eq. (4.14) is
where \(D^2_{00}v= \langle D^2v(1,0,0),(1,0,0)\rangle _H\) and \(D_1v = \langle Dv,(0,1,0)\rangle _H\).
Though the data do not satisfy the assumptions of Theorem 9, in this case an explicit solution of (5.3) is given by a suitably chosen linear function.
Proposition 10
The function \(v:H\rightarrow {\mathbb {R}}\), defined by
where \(\Gamma =(\Gamma _0,\Gamma _1,\Gamma _2)\),
is a classical solution of (5.3). Moreover, \(v=\widetilde{V}\), and the control \(u^*{:}{=}\frac{c_0\Gamma _0+\Gamma _1}{c_2}\) is optimal.
Proof
Clearly \(v\in C^2(H)\). To argue that \(\Gamma \in D(A^*)\), which is equivalent to \(\Gamma _2\in W^{1,2}({\mathbb {R}}^-,{\mathbb {R}})\), it is sufficient, first, to notice that Young’s inequality for convolutions implies that \(\Gamma _2\in L^2({\mathbb {R}}^-,{\mathbb {R}})\), then to consider the fact that \(\Gamma _2\) solves the differential equation
which entails \(\Gamma _2'\in L^2({\mathbb {R}}^-,{\mathbb {R}})\). Now, since \(\Gamma \in D(A^*)\), we have
Moreover, \(\langle C,Dv(z)\rangle _H=\langle C,\Gamma \rangle _H\), and
Then
where we used \(\Gamma _2' = -\rho \Gamma _2-\Gamma _0\alpha '\).
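The differential equation \(\Gamma _2' = -\rho \Gamma _2-\Gamma _0\alpha '\) used above can be explored numerically. The sketch below rests on assumptions that are not stated in the text: it takes the decaying solution on \({\mathbb {R}}^-\) in convolution form, \(\Gamma _2(r)=-\Gamma _0\int _{-\infty }^r e^{-\rho (r-s)}\alpha '(s)ds\) (any other solution differs by a multiple of \(e^{-\rho r}\), which is not square integrable on \({\mathbb {R}}^-\)), and it uses hypothetical values for \(\rho \), \(\Gamma _0\) and the kernel \(\alpha (r)=e^{r}\). The normalization of \(\Gamma _2\) in Proposition 10 may differ.

```python
import math

RHO = 2.0      # hypothetical discount rate
GAMMA0 = 0.7   # hypothetical value of Gamma_0

def alpha_prime(r):
    # Derivative of the hypothetical kernel alpha(r) = e^{r}.
    return math.exp(r)

def gamma2(r, window=40.0, n=100_000):
    # Trapezoid approximation of -Gamma_0 * int_{-inf}^r exp(-rho (r - s)) alpha'(s) ds,
    # with the lower limit truncated to r - window.
    a = r - window
    h = window / n
    def integrand(s):
        return math.exp(-RHO * (r - s)) * alpha_prime(s)
    total = 0.5 * (integrand(a) + integrand(r))
    for i in range(1, n):
        total += integrand(a + i * h)
    return -GAMMA0 * total * h

def closed_form(r):
    # Hand computation valid only for alpha(r) = e^{r}.
    return -GAMMA0 * math.exp(r) / (RHO + 1.0)

def ode_residual(r, eps=1e-4):
    # Gamma_2' + rho * Gamma_2 + Gamma_0 * alpha', with a central difference for Gamma_2'.
    derivative = (gamma2(r + eps) - gamma2(r - eps)) / (2.0 * eps)
    return derivative + RHO * gamma2(r) + GAMMA0 * alpha_prime(r)

value = gamma2(-1.0)
residual = ode_residual(-1.0)
```

The convolution structure is the one to which Young's inequality applies: the exponential kernel is integrable and \(\alpha '\) is square integrable, so the resulting function is square integrable on \({\mathbb {R}}^-\).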
We sketch the rest of the proof, as it goes in a standard way (see Fabbri et al. (2017), Theorem 2.42). Let \(Z=Z^{0,\zeta ^{z,\varphi },u}\). Since \(\Gamma \in D(A^*)\), Itô’s formula can be applied, and then, since v is a classical solution of (5.3), we obtain
Letting t go to \(\infty \) (\(\rho \) large enough), we recover the fundamental identity
Since (5.5) holds true for any control u, recalling that \({\mathcal {H}}\ge {\mathcal {H}}_{CV}\), we conclude \(v\ge \widetilde{V}\). Finally, by (5.2) and (5.5), we have
which concludes the proof. \(\square \)
References
Biagini, S., Gozzi, F., Zanella, M.: Robust portfolio choice with sticky wages. SIAM J. Financ. Math. 13(3), 1004–1039 (2022)
Biffis, E., Gozzi, F., Prosdocimi, C.: Optimal portfolio choice with path dependent labor income: the infinite horizon case. SIAM J. Control Optim. 58(4), 1906–1938 (2020)
Cont, R., Fournie, D.: Change of variable formulas for non-anticipative functionals on path space. J. Funct. Anal. 259, 1043–1072 (2010)
Cont, R., Fournie, D.: A functional extension of the Ito formula. C. R. Math. Acad. Sci. Paris 348(1–2), 57–61 (2010)
Cosso, A., Gozzi, F., Rosestolato, M., Russo, F.: Path-dependent Hamilton-Jacobi-Bellman equation: uniqueness of Crandall-Lions viscosity solutions (2023) arXiv:2107.05959
Cosso, A., Gozzi, F., Kharroubi, I., Pham, H., Rosestolato, M.: Optimal control of path-dependent McKean-Vlasov SDEs in infinite-dimension. Ann. Appl. Prob. 33(4), 2863–2918 (2023)
De Feo, F., Federico, S., Swiech, A.: Optimal control of stochastic delay differential equations and applications to path-dependent financial and economic models (2023) arXiv:2302.08809
De Feo, F.: Stochastic optimal control problems with delays in the state and in the control via viscosity solutions and applications to optimal advertising and optimal investment problems. Decisions in Economics and Finance (2024). https://doi.org/10.1007/s10203-024-00456-y
Di Giacinto, M., Federico, S., Gozzi, F.: Pension funds with a minimum guarantee: a stochastic control approach. Financ. Stoch. 15, 297–342 (2011)
Djehiche, B., Gozzi, F., Zanco, G., Zanella, M.: Optimal portfolio choice with path dependent benchmarked labor income: a mean field model. Stoch. Process. Appl. 145, 48–85 (2022)
Fabbri, G., Gozzi, F., Swiech, A.: Stochastic Optimal Control in Infinite Dimension. Dynamic Programming and HJB Equations. Probability Theory and Stochastic Modelling, vol. 82. Springer, (2017)
Federico, S.: A stochastic control problem with delay arising in a pension fund model. Financ. Stoch. 15(3), 421–459 (2011)
Federico, S., Tankov, P.: Exact or approximate finite-dimensional Markovian representation for stochastic control problems with delay. Appl. Math. Optim. 71(1), 165–194 (2015)
Feichtinger, G., Hartl, R.F., Sethi, S.P.: Dynamic optimal control models in advertising: recent developments. Manag. Sci. 40(2), 195–226 (1994)
Fuhrman, M., Masiero, F., Tessitore, G.: Stochastic equations with delay: optimal control via BSDEs and regular solutions of Hamilton-Jacobi-Bellman equations. SIAM J. Control Optim. 48(7), 4624–4651 (2010)
Gawarecki, L., Mandrekar, V.: Stochastic differential equations in infinite dimensions with applications to stochastic partial differential equations. Probability and its Applications (New York). Springer, Heidelberg (2011)
Gozzi, F., Marinelli, C.: Stochastic optimal control of delay equations arising in advertising models. Stochastic partial differential equations and applications VII - Papers of the 7th meeting, Levico Terme, Italy, January 5-10, 2004, Lecture Notes in Pure and Applied Mathematics 245, 133–148 (2004)
Gozzi, F., Masiero, F., Rosestolato, M.: An optimal advertising model with carryover effect and mean field terms. Mathematics and Financial Economics (2024). https://doi.org/10.1007/s11579-024-00361-3
Gozzi, F., Marinelli, C., Savin, S.: On controlled linear diffusions with delay in a model of optimal advertising under uncertainty with memory effects. J. Optim. Theory Appl. 142, 291–321 (2009)
Grosset, L., Viscolani, B.: Advertising for a new product introduction: a stochastic approach. Top 12(1), 149–167 (2004)
Hartl, R.F.: Optimal dynamic advertising policies for hereditary processes. J. Optim. Theory Appl. 43(1), 51–72 (1984)
Lefebvre, W., Miller, E.: Linear-quadratic stochastic delayed control and deep learning resolution. J. Optim. Theory Appl. 191(1), 134–168 (2021)
Marinelli, C.: The stochastic goodwill problem. Eur. J. Oper. Res. 176(1), 389–404 (2007)
Masiero, F., Tessitore, G.: Partial smoothing of delay transition semigroups acting on special functions. J. Diff. Equ. 316, 599–640 (2022)
Motte, M., Pham, H.: Optimal bidding strategies for digital advertising. Working Papers hal-03429785, HAL (November 2021). https://ideas.repec.org/p/hal/wpaper/hal-03429785.html
Nerlove, M., Arrow, K.J.: Optimal advertising policy under dynamic conditions. Economica 29(114), 129–142 (1962)
Pang, T., Yong, Y.: A New Stochastic Model For Stock Price with Delay Effects, pp. 110–117. Society for Industrial and Applied Mathematics (2019)
Prasad, A., Sethi, S.P.: Dynamic optimization of an oligopoly model of advertising. UTD School of Management Working Paper (2008)
Ren, Z., Rosestolato, M.: Viscosity solutions of path-dependent PDEs with randomized time. SIAM J. Math. Anal. 52(2), 1943–1979 (2020)
Ricciardi, M., Rosestolato, M.: Mean field games incorporating carryover effects: Optimizing advertising models (2024) arXiv:2403.00413v1
Rosestolato, M., Swiech, A.: Partial regularity of viscosity solutions for a class of Kolmogorov equations arising from mathematical finance. J. Diff. Equ. 262(3), 1897–1930 (2017)
Vidale, M.L., Wolfe, H.B.: An operations-research study of sales response to advertising. Oper. Res. 5, 370–381 (1957)
Vinter, R.B., Kwong, R.H.: The infinite time quadratic control problem for linear systems with state and control delays: an evolution equation approach. SIAM J. Control Optim. 19, 139–153 (1981)
Funding
Open access funding provided by Alma Mater Studiorum - Università di Bologna within the CRUI-CARE Agreement. This research was partially financed by the INdAM (Istituto Nazionale di Alta Matematica F. Severi) – GNAMPA (Gruppo Nazionale per l’Analisi Matematica, la Probabilità e le loro Applicazioni) Project CUP E53C23001670001 Problemi di controllo ottimo stocastico con memoria a informazione parziale.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Girolami, C.D., Rosestolato, M. Representation of stochastic optimal control problems with delay in the control variable. Decisions Econ Finan (2024). https://doi.org/10.1007/s10203-024-00465-x
DOI: https://doi.org/10.1007/s10203-024-00465-x
Keywords
- stochastic control problems
- dynamic programming
- delay in the control
- infinite dimensional reformulation
- optimal advertising