1 Introduction

In optimal control problems for diffusions of mean-field type the performance functional, drift and diffusion coefficient depend not only on the state and the control but also on the probability distribution of the state-control pair. The mean-field coupling makes the control problem time-inconsistent in the sense that the Bellman Principle is no longer valid, which motivates the use of the stochastic maximum principle (SMP) to solve this type of optimal control problem instead of trying extensions of the dynamic programming principle (DPP). This class of control problems has been studied by many authors including [1, 2, 5, 7, 15, 20]. The performance functionals considered in these papers are of risk-neutral type, i.e. the running cost/profit terms are expected values of stage-additive payoff functions. Not all behavior, however, can be captured by risk-neutral performance. One way of capturing risk-averse and risk-seeking behaviors is by exponentiating the performance functional before taking the expectation (see [17]).

The first paper that we are aware of and which deals with risk-sensitive optimal control in a mean field context is [24]. Using a matching argument, the authors derive a verification theorem for a risk-sensitive mean-field game whose underlying dynamics is a Markov diffusion. This matching argument freezes the mean-field coupling in the dynamics, which yields a standard risk-sensitive HJB equation for the value-function. The mean-field coupling is then retrieved through the Fokker-Planck equation satisfied by the marginal law of the optimal state.

In a recent paper [11], the authors have established a risk-sensitive SMP for mean-field type control. The risk-sensitive control problem was first reformulated in terms of an augmented state process and terminal payoff problem. An intermediate stochastic maximum principle was then obtained by applying the SMP of ([5], Theorem 2.1.) for loss functionals without running cost but with augmented state in higher dimension and complete observation of the state. Then, the intermediate first- and second-order adjoint processes are transformed into a simpler form using a logarithmic transformation derived in [12].

Optimal control of partially observed diffusions (without mean-field coupling) has been studied by many authors including the non-exhaustive references [3, 4, 8–10, 13, 14, 16, 19, 21, 23, 26, 27], using both the DPP and SMP approaches. Reference [23] derives an SMP for the most general model of optimal control of partially observed diffusions under risk-neutral performance functionals. Recently, Wang et al. [25] extended the SMP for partially observable optimal control of diffusions to risk-neutral performance functionals of mean-field type.

The purpose of this paper is to establish a stochastic maximum principle for a class of risk-sensitive mean-field type control problems under partial observation. Following the above-mentioned papers on optimal control under partial observation, in particular [23], our strategy is to transform the partially observable control problem into a completely observable one and then apply the approach suggested in [11] to derive a suitable risk-sensitive SMP. To the best of our knowledge, a risk-sensitive maximum principle under partial observation that does not pass through the DPP, in particular for mean-field type controls, has not been established in earlier works.

The paper is organized as follows. In Sect. 2, we present the model and state the partially observable risk-sensitive SMP which constitutes the main result, whose proof is given in Sect. 3. Finally, in Sect. 4, we apply the risk-sensitive SMP to the linear-exponential-quadratic setup under partial observation. To streamline the presentation, we only consider the one-dimensional case. The extension to the multidimensional case is by now straightforward. Furthermore, we consider diffusion models where the control enters only the drift coefficient, which leads to an SMP with only one pair of adjoint processes. The general Peng-type SMP can be derived following e.g. [11, 23].

2 Statement of the Problem

Let \(T>0\) be a fixed time horizon and \((\varOmega ,{\mathscr {F}},{\mathrm{l\negthinspace F}}, {\mathrm{l\negthinspace P}})\) be a given filtered probability space on which there are defined two independent standard one-dimensional Brownian motions \(W=\{W_s\}_{s\ge 0}\) and \(Y=\{Y_s\}_{s\ge 0}\). Let \(\mathscr {F}_t^{W}\) and \(\mathscr {F}_t^{Y}\) be the \({\mathrm{l\negthinspace P}}\)-completed natural filtrations generated by W and Y, respectively. Set \({\mathrm{l\negthinspace F}}^Y:=\{\mathscr {F}_t^{Y},\ 0\le t \le T\}\) and \({\mathrm{l\negthinspace F}}:=\{{\mathscr {F}}_t,\ 0\le t \le T\}\), where \(\mathscr {F}_t=\mathscr {F}_t^{W} \vee \mathscr {F}_t^{Y}\).

We consider a mean-field type version of the stochastic controlled system with partial observation considered in [23], which is an extension of the model considered in [4, 14], to which we refer for further details.

The model is defined as follows.

(i) An admissible control u is an \({\mathrm{l\negthinspace F}}^{Y}\)-adapted process with values in a non-empty subset (not necessarily convex) U of \({\mathrm{l\negthinspace R}}\) and satisfies \(E[\int _0^T|u(t)|^2dt]<\infty \). We denote the set of all admissible controls by \(\mathscr {U}\). The control u is called partially observable.

(ii) Given a control process \(u\in \mathscr {U}\), we consider the signal-observation pair \((x^u,Y)\) which satisfies the following SDE of mean-field type

$$\begin{aligned} \left\{ \begin{array}{lll} dx^u(t) &{}=&{} b(t,x^u(t),E[x^u(t)], u(t))dt+\sigma (t,x^u(t),E[x^u(t)])dW_{t}\\ &{}+&{} \alpha (t,x^u(t),E[x^u(t)])d\widetilde{W}^u_{t},\,\,\, x^u(0)=x_0, \\ dY_t &{}=&{} \beta (t,x^u(t))dt+ d\widetilde{W}^u_{t},\,\, Y_0=0, \end{array}\right. \end{aligned}$$
(1)

where,

$$\begin{aligned} b(t,x,m,u): \,\,[0,T] \times {\mathrm{l\negthinspace R}}\times {\mathrm{l\negthinspace R}}\times U\longrightarrow {\mathrm{l\negthinspace R}}, \end{aligned}$$
$$\begin{aligned} \alpha (t,x,m),\,\,\sigma (t,x,m): \,\,[0,T] \times {\mathrm{l\negthinspace R}}\times {\mathrm{l\negthinspace R}}\longrightarrow {\mathrm{l\negthinspace R}}\end{aligned}$$

and \(\beta (t,x): [0,T] \times {\mathrm{l\negthinspace R}}\longrightarrow {\mathrm{l\negthinspace R}}\) are Borel measurable functions.

In this model, the observation process Y, on which the controls u are based, is assumed to be a given Brownian motion independent of W and is supposed to admit a decomposition as a trend \(\int _0^{\cdot }\beta (t,x^u(t))dt\) (a functional of the state process \(x^u\)) corrupted by a noise process \(\widetilde{W}^u\), both of which are a priori not observable.

The case \(\alpha =0\) corresponds to the model considered in [4, 14]. A more general model would let the function \(\beta \) depend on the control u and be of mean-field type. To keep the presentation simple, we do not treat this case here, although the main results do extend to it.

Before we formulate the control problem, we show that the system (1) has a weak solution. Introduce the density process defined on \((\varOmega ,{\mathscr {F}},{\mathrm{l\negthinspace F}}, {\mathrm{l\negthinspace P}})\) by

$$\begin{aligned} \rho ^u(t):=\exp {\left\{ \int _0^t \beta (s,x^u(s))dY_s-\frac{1}{2}\int _0^t|\beta (s,x^u(s))|^2ds \right\} }, \end{aligned}$$

which solves the linear SDE

$$\begin{aligned} d\rho ^u(t)=\rho ^u(t)\beta (t,x^u(t)) dY_t,\,\, \ \rho ^u(0)=1.\ \end{aligned}$$
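As a quick check of this claim, one may apply Itô's formula to \(\rho ^u(t)=e^{M_t}\), where \(M_t\) is shorthand (introduced here only for this verification) for the exponent in the definition of \(\rho ^u\). Since Y is a Brownian motion under \({\mathrm{l\negthinspace P}}\), we have \(d\langle M\rangle _t=|\beta (t,x^u(t))|^2dt\), and the drift terms cancel:

$$\begin{aligned} d\rho ^u(t)&=\rho ^u(t)\,dM_t+\frac{1}{2}\rho ^u(t)\,d\langle M\rangle _t\\ &=\rho ^u(t)\left\{ \beta (t,x^u(t))dY_t-\frac{1}{2}|\beta (t,x^u(t))|^2dt\right\} +\frac{1}{2}\rho ^u(t)|\beta (t,x^u(t))|^2dt\\ &=\rho ^u(t)\beta (t,x^u(t))dY_t. \end{aligned}$$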

Assuming the function \(\beta \) bounded (see Assumption 1, below), \(\rho ^u\) is a uniformly integrable martingale such that, for every \(p\ge 2\),

$$\begin{aligned} E[\sup _{0\le t\le T}|\rho ^u(t)|^p]\le C, \end{aligned}$$
(2)

where, C is a constant which depends only on the bound of \(\beta \), p and T. Define \(d{\mathrm{l\negthinspace P}}^{u}=\rho ^u(T)d{\mathrm{l\negthinspace P}}\). By Girsanov’s Theorem, \({\mathrm{l\negthinspace P}}^{u}\) is a probability measure. Moreover, \(\widetilde{W}^u\) is a \( {\mathrm{l\negthinspace P}}^u\)-standard Brownian motion independent of W. This in turn entails that \(({\mathrm{l\negthinspace P}}^u,x^u,Y,W,\widetilde{W}^u)\) is a weak solution of (1).

The objective is to characterize admissible controls which minimize the risk-sensitive cost functional given by

$$\begin{aligned} J^{\theta }(u(\cdot ))=E^{u}\left[ \exp {\left( \theta \left[ \int _0^Tf(t,x^u(t),E^u[x^u(t)], u(t))\,dt+ h(x^u(T),E^u[x^u(T)])\right] \right) }\right] , \end{aligned}$$

where, \(\theta \) is the risk-sensitivity index,

$$ \begin{array}{l} f(t,x,m,u): \,\,[0,T] \times {\mathrm{l\negthinspace R}}\times {\mathrm{l\negthinspace R}}\times U\longrightarrow {\mathrm{l\negthinspace R}}, \\ h(x,m): \,\,{\mathrm{l\negthinspace R}}\times {\mathrm{l\negthinspace R}}\longrightarrow {\mathrm{l\negthinspace R}}. \end{array} $$

Any \(\bar{u}(\cdot )\in {\mathscr {U}}\) which satisfies

$$\begin{aligned} J^{\theta }(\bar{u}(\cdot ))=\inf _{u(\cdot )\in {\mathscr {U}}}J^{\theta }(u(\cdot )) \end{aligned}$$
(3)

is called a risk-sensitive optimal control under partial observation.

Let \(\varPsi _T=\int _0^T f(t, x(t),E^u[x(t)], u(t)) dt\,+\,h(x(T), E^u[x(T)])\) and consider the payoff functional given by

$$ \widetilde{\varPsi }_{\theta }:=\frac{1}{\theta }\log E^u e^{\theta \varPsi _T}. $$

When the risk-sensitivity index \(\theta \) is small, the loss functional \(\widetilde{\varPsi }_{\theta }\) can be expanded as

$$ E^u[ \varPsi _T] +\frac{\theta }{2}\text{ var }_u(\varPsi _T)+O(\theta ^2), $$

where, \(\text{ var }_u(\varPsi _T)\) denotes the variance of \(\varPsi _T\) w.r.t. \( {\mathrm{l\negthinspace P}}^u\). If \(\theta <0\), the variance of \(\varPsi _T\), as a measure of risk, improves the performance \(\widetilde{\varPsi }_{\theta }\), in which case the optimizer is called risk seeking. When \(\theta >0\), the variance of \(\varPsi _T\) worsens the performance \(\widetilde{\varPsi }_{\theta }\), in which case the optimizer is called risk averse. The risk-neutral loss functional \(E^u[\varPsi _{T}]\) can be seen as the limit of the risk-sensitive functional \( \widetilde{\varPsi }_{\theta }\) as \(\theta \rightarrow 0\).
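The small-\(\theta \) expansion can be illustrated numerically. The following minimal sketch evaluates \(\frac{1}{\theta }\log E e^{\theta \varPsi }\) exactly for a finite-valued random cost and compares it with the mean plus \(\frac{\theta }{2}\) times the variance. The three-point distribution and the helper names `entropic_cost`, `mean_and_variance` and `gap` are assumptions made only for this illustration; they do not come from the control problem above.

```python
import math

def entropic_cost(values, probs, theta):
    """(1/theta) * log E[exp(theta * Psi)] for a finite-valued cost Psi."""
    shift = max(theta * v for v in values)  # log-sum-exp shift for stability
    total = sum(p * math.exp(theta * v - shift) for v, p in zip(values, probs))
    return (shift + math.log(total)) / theta

def mean_and_variance(values, probs):
    """Mean and variance of the finite-valued cost."""
    mean = sum(p * v for v, p in zip(values, probs))
    var = sum(p * (v - mean) ** 2 for v, p in zip(values, probs))
    return mean, var

# Hypothetical three-point cost distribution, chosen only for illustration.
values = [0.0, 1.0, 3.0]
probs = [0.5, 0.3, 0.2]
mean, var = mean_and_variance(values, probs)

def gap(theta):
    """Exact entropic cost minus its second-order approximation."""
    return entropic_cost(values, probs, theta) - (mean + 0.5 * theta * var)

for theta in (0.1, 0.05, 0.025):
    print(theta, entropic_cost(values, probs, theta), gap(theta))
```

In this toy example the gap should shrink by roughly a factor of four each time \(\theta \) is halved, which is what an \(O(\theta ^2)\) remainder predicts.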

Since \(d{\mathrm{l\negthinspace P}}^{u}=\rho ^u(T)d{\mathrm{l\negthinspace P}}\), the associated risk-sensitive cost functional becomes

$$\begin{aligned} J^{\theta }(u(\cdot ))=E\left[ \rho ^u(T)e^{\theta \left[ \int _0^Tf(t,x^u(t),E[\rho ^u(t)x^u(t)], u(t))\,dt+ h(x^u(T),E[\rho ^u(T)x^u(T)])\right] }\right] , \end{aligned}$$
(4)

where, on \((\varOmega ,{\mathscr {F}},{\mathrm{l\negthinspace F}}, {\mathrm{l\negthinspace P}})\), the process \((\rho ^u,x^u)\) satisfies the following dynamics:

$$\begin{aligned} \left\{ \begin{array}{lll} d\rho ^u(t) &{}=&{} \rho ^u(t)\beta (t,x^u(t)) dY_t, \\ dx^u(t) &{}=&{} \left\{ b(t,x^u(t),E[x^u(t)], u(t))-\alpha (t,x^u(t),E[x^u(t)])\beta (t,x^u(t))\right\} dt \\ &{}+&{} \sigma (t,x^u(t),E[x^u(t)])dW_{t} +\alpha (t,x^u(t),E[x^u(t)])dY_{t}, \\ \rho ^u(0)&{}=&{} 1,\,\, x^u(0)=x_0. \end{array}\right. \end{aligned}$$
(5)
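For the reader's convenience, here is a sketch of the two substitutions behind (4) and (5); it uses only the definitions above. First, the observation equation in (1) gives \(d\widetilde{W}^u_t=dY_t-\beta (t,x^u(t))dt\), so the noise term driven by \(\widetilde{W}^u\) in (1) becomes

$$\begin{aligned} \alpha (t,x^u(t),E[x^u(t)])d\widetilde{W}^u_t=\alpha (t,x^u(t),E[x^u(t)])dY_t-\alpha (t,x^u(t),E[x^u(t)])\beta (t,x^u(t))dt, \end{aligned}$$

which accounts for the extra drift term in (5). Second, since \(\rho ^u\) is a \({\mathrm{l\negthinspace P}}\)-martingale and \(x^u(t)\) is \({\mathscr {F}}_t\)-measurable, the tower property yields

$$\begin{aligned} E^u[x^u(t)]=E[\rho ^u(T)x^u(t)]=E\left[ E[\rho ^u(T)\,|\,{\mathscr {F}}_t]\,x^u(t)\right] =E[\rho ^u(t)x^u(t)], \end{aligned}$$

which is the mean-field argument appearing in (4).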

We have recast the partially observable control problem into the following completely observable control problem: Minimize \(J^{\theta }(u(\cdot ))\) defined by (4) subject to (5).
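To make the completely observable formulation concrete, the sketch below simulates (5) with an Euler scheme and a particle approximation of the mean-field terms, and then estimates the cost (4) by Monte Carlo. It is an illustration under stated assumptions, not part of the analysis: the coefficients `b`, `sigma`, `alpha`, `beta`, `f`, `h`, the feedback control and the function name `simulate_cost` are all hypothetical choices, and no claim is made about the accuracy of this discretization.

```python
import math
import random

def simulate_cost(theta, control, n_paths=1000, n_steps=50, T=1.0, x0=1.0, seed=0):
    """Euler/particle sketch of system (5) and a Monte Carlo estimate of (4).

    Every coefficient below is a hypothetical choice made for illustration.
    Returns (estimate of J^theta, sample mean of rho(T)).
    """
    rng = random.Random(seed)
    dt = T / n_steps
    sqdt = math.sqrt(dt)

    def b(t, x, m, u): return -x + 0.5 * m + u
    def sigma(t, x, m): return 0.3
    def alpha(t, x, m): return 0.2
    def beta(t, x): return math.tanh(x)  # bounded observation trend
    def f(t, x, m, u): return 0.5 * math.tanh(x - m) ** 2 + 0.05 * u * u
    def h(x, m): return 0.5 * math.tanh(x - m) ** 2

    x = [x0] * n_paths
    log_rho = [0.0] * n_paths  # log of the density process, keeps rho > 0
    y = [0.0] * n_paths        # observation paths (Brownian under P)
    xi = [0.0] * n_paths       # accumulated running cost

    for k in range(n_steps):
        t = k * dt
        # Empirical averages stand in for E[x(t)] and E[rho(t) x(t)].
        m_x = sum(x) / n_paths
        m_rx = sum(math.exp(lr) * xv for lr, xv in zip(log_rho, x)) / n_paths
        for i in range(n_paths):
            dY = sqdt * rng.gauss(0.0, 1.0)
            dW = sqdt * rng.gauss(0.0, 1.0)
            u = control(t, y[i])  # depends on the observation only
            be = beta(t, x[i])
            al = alpha(t, x[i], m_x)
            xi[i] += f(t, x[i], m_rx, u) * dt
            log_rho[i] += be * dY - 0.5 * be * be * dt
            x[i] += (b(t, x[i], m_x, u) - al * be) * dt \
                + sigma(t, x[i], m_x) * dW + al * dY
            y[i] += dY

    rho = [math.exp(lr) for lr in log_rho]
    m_rx_T = sum(r * xv for r, xv in zip(rho, x)) / n_paths
    cost = sum(r * math.exp(theta * (c + h(xv, m_rx_T)))
               for r, xv, c in zip(rho, x, xi)) / n_paths
    return cost, sum(rho) / n_paths

# Hypothetical feedback on the current observation value.
feedback = lambda t, y: -0.5 * math.tanh(y)
cost, mean_rho = simulate_cost(theta=0.5, control=feedback)
print(cost, mean_rho)
```

The sample mean of \(\rho (T)\) should be close to 1, reflecting the martingale property of the density process. The control is evaluated on the observation path only, mimicking the \({\mathrm{l\negthinspace F}}^Y\)-adaptedness required of admissible controls.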

The main result of this paper is a stochastic maximum principle (SMP) in terms of necessary optimality conditions for the problem (3) subject to (5).

We will only consider the case where the risk-sensitive parameter is positive, \(\theta >0\). The case \(\theta <0\) can be treated in a similar fashion by considering \(\theta =-\bar{\theta }, \bar{\theta }>0\), and \(\bar{f}=-f,\bar{h}=-h\) in the performance functional (4).

We will make the following assumption.

Assumption 1

The functions \(b, \sigma ,\alpha ,\beta , f, h\) are twice continuously differentiable with respect to (x, m). Moreover, these functions and their first derivatives with respect to (x, m) are bounded and continuous in (x, m, u).

To keep the presentation less technical, we impose these assumptions although they are restrictive and can be made weaker.

Under these assumptions, in view of Girsanov’s theorem and [18], Proposition 1.2., for each \(u\in {\mathscr {U}}\), the SDE (5) admits a unique weak solution \((\rho ^u, x^u)\).

We now state an SMP to characterize optimal controls \(\bar{u}(\cdot )\in \mathscr {U}\) which minimize (4), subject to (5). Let \((\bar{\rho },\bar{x}):=(\rho ^{\bar{u}},x^{\bar{u}})\) denote the corresponding state process, defined as the solution of (5).

We introduce the following notation.

$$\begin{aligned} \begin{array}{lll} X:=\left( \begin{array}{lll} \rho \\ x \end{array}\right) ,\,\,\, \bar{X}:=\left( \begin{array}{lll} \bar{\rho }\\ \bar{x}\\ \end{array}\right) ,\,\,\, \,\,\, X_0=\bar{X}_0:=\left( \begin{array}{lll} 1\\ x_0\end{array}\right) ,\,\,\, B_t:=\left( \begin{array}{lll} Y_t \\ W_t\end{array}\right) ,\\ F(t,X,m,u):=\left( \begin{array}{lll} 0\\ c(t,x,m,u)\end{array}\right) ,\,\,\, G(t,X,m):=\left( \begin{array}{lll} \rho \beta (t,x) &{} 0\\ \alpha (t,x,m) &{} \sigma (t,x,m)\end{array}\right) , \\ c(t,x,m,u):= b(t,x,m,u)-\alpha (t,x,m)\beta (t,x),\,\,\phi (X):=x,\,\,\tilde{\phi }(X):=\rho x,\\ \phi (\bar{X}):=\bar{x},\,\, \tilde{\phi }(\bar{X}):=\bar{\rho }\bar{x}. \end{array} \end{aligned}$$
(6)

We define the risk-neutral Hamiltonian as follows. For \((p,q)\in {\mathrm{l\negthinspace R}}^2\times {\mathrm{l\negthinspace R}}^{2\times 2}\),

$$\begin{aligned} H(t,X,m,p,q,u):=\langle F(t,X,m,u),p\rangle +\text{ tr }(G^*(t,X,m)q)-f (t, x,m,u), \end{aligned}$$
(7)

where, \(^{\prime }*^{\prime }\) denotes the transposition operation of a matrix or a vector.

We also introduce the risk-sensitive Hamiltonian. For \((p,q, \ell )\in {\mathrm{l\negthinspace R}}^2\times {\mathrm{l\negthinspace R}}^{2\times 2}\times {\mathrm{l\negthinspace R}}^2\),

$$\begin{aligned} \begin{array}{lll} H^{\theta }(t, X,m, u, p, q,\ell ):= \langle F(t,X,m,u),p\rangle -f (t, x,m, u)\\ \qquad \quad \qquad \qquad \qquad \qquad \qquad +\,\text{ tr }(G^*(t,X,m)(q +\theta \ell p^*)). \end{array} \end{aligned}$$
(8)

We have \(H=H^0\).

Setting

$$\begin{aligned} \ell :=\left( \begin{array}{lll} \ell _1\\ \ell _2 \end{array}\right) , \quad p:=\left( \begin{array}{lll} p_1\\ p_2 \end{array}\right) ,\quad q:=\left( \begin{array}{lll} q_{11} &{} q_{12}\\ q_{21} &{} q_{22} \end{array}\right) , \end{aligned}$$

the explicit form of the Hamiltonian (8) reads

$$\begin{aligned}\begin{array}{lll} H^{\theta }(t, X,m, u, p, q,\ell ) &{}:=&{} c(t,x,m,u)p_2-f(t,x,m,u)+\rho \beta (t,x)(q_{11}+\theta \ell _1 p_1)\\ &{}+&{} \alpha (t,x,m)(q_{21}+\theta \ell _2 p_1)+\sigma (t,x,m)(q_{22}+\theta \ell _2 p_2). \end{array} \end{aligned}$$

Setting \(\theta =0\), we obtain the explicit form of the Hamiltonian (7):

$$\begin{aligned}\begin{array}{lll} H(t, X,m, u, p, q)&{}:=&{}c(t,x,m,u)p_2-f(t,x,m,u)+\rho \beta (t,x)q_{11}\\ \qquad \qquad \qquad &{}+&{}\alpha (t,x,m)q_{21}+\sigma (t,x,m)q_{22}. \end{array} \end{aligned}$$
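The explicit forms above follow from the identity \(\text{ tr }(G^*M)=\sum _{i,j}G_{ij}M_{ij}\) applied to \(M=q+\theta \ell p^*\). A minimal numerical sketch of this check is given below. The function names are hypothetical, and the inputs `c`, `f`, `beta`, `alpha`, `sigma` stand for the coefficient values already evaluated at a fixed \((t,x,m,u)\).

```python
import random

def hamiltonian_matrix_form(c, f, rho, beta, alpha, sigma, p, q, ell, theta):
    """H^theta as in (8): <F, p> - f + tr(G^T (q + theta * ell p^T))."""
    F = [0.0, c]
    G = [[rho * beta, 0.0], [alpha, sigma]]
    M = [[q[i][j] + theta * ell[i] * p[j] for j in range(2)] for i in range(2)]
    # tr(G^T M) equals the entrywise sum of G[i][j] * M[i][j].
    trace = sum(G[i][j] * M[i][j] for i in range(2) for j in range(2))
    return F[0] * p[0] + F[1] * p[1] - f + trace

def hamiltonian_explicit(c, f, rho, beta, alpha, sigma, p, q, ell, theta):
    """The explicit form of H^theta written out component by component."""
    return (c * p[1] - f
            + rho * beta * (q[0][0] + theta * ell[0] * p[0])
            + alpha * (q[1][0] + theta * ell[1] * p[0])
            + sigma * (q[1][1] + theta * ell[1] * p[1]))

rng = random.Random(0)
draw = lambda: rng.uniform(-1.0, 1.0)
args = dict(c=draw(), f=draw(), rho=abs(draw()), beta=draw(), alpha=draw(),
            sigma=draw(), p=[draw(), draw()],
            q=[[draw(), draw()], [draw(), draw()]],
            ell=[draw(), draw()], theta=0.7)
difference = hamiltonian_matrix_form(**args) - hamiltonian_explicit(**args)
print(abs(difference) < 1e-12)
```

Setting `theta=0` in both functions gives the corresponding check for the risk-neutral Hamiltonian (7).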

With the obvious notation for the derivatives of the functions \(b,\alpha ,\beta , \sigma ,f,h\), w.r.t. the arguments x and m, we further set

$$\begin{aligned}\left\{ \begin{array}{lll} H^{\theta }_x(t, X,m, u, p, q,\ell ) &{}:=&{} c_x(t,x,m,u)p_2-f_x(t,x,m,u)+\rho \beta _x(t,x)(q_{11}+\theta \ell _1 p_1) \\ &{}+&{}\alpha _x(t,x,m)(q_{21}+\theta \ell _2 p_1)+\sigma _x(t,x,m)(q_{22}+\theta \ell _2 p_2),\\ \check{H}^{\theta }_m(t, X,m, u, p, q,\ell ) &{}:=&{} c_m(t,x,m,u)p_2+\alpha _m(t,x,m)(q_{21}+\theta \ell _2 p_1)\\ &{}+&{}\sigma _m(t,x,m)(q_{22}+\theta \ell _2 p_2),\\ H^{\theta }_{\rho }(t,X,m,u,p,q,\ell ) &{}:=&{} \beta (t,x)(q_{11}+\theta \ell _1 p_1). \end{array}\right. \end{aligned}$$

With this notation, the system (5) can be rewritten in the following compact form

$$\begin{aligned} \left\{ \begin{array}{lll} dX(t)&{}=&{}F (t, X(t),E[\phi (X(t))],u(t)) dt + G(t, X(t),E[\phi (X(t))]) dB_t,\\ X(0)&{}=&{}X_0. \end{array} \right. \end{aligned}$$
(9)

We define the risk-neutral Hamiltonian associated with random variables X such that \(\phi (X)\) and \(\tilde{\phi }(X)\) belong to \(L^1(\varOmega ,{\mathscr {F}},{\mathrm{l\negthinspace P}})\) as follows (with the obvious abuse of notation): For \((p,q)\in {\mathrm{l\negthinspace R}}^2\times {\mathrm{l\negthinspace R}}^{2\times 2}\),

$$\begin{aligned} \begin{array}{lll} H(t,X,p,q,u):=\langle F(t,X,E[\phi (X)],u),p\rangle -f(t,x,E[\tilde{\phi }(X)],u)\\ \qquad \qquad \qquad \quad +\,\text{ tr }(G^*(t,X,E[\phi (X)])q). \end{array} \end{aligned}$$
(10)

We also introduce the corresponding risk-sensitive Hamiltonian. For \((p,q, \ell )\in {\mathrm{l\negthinspace R}}^2\times {\mathrm{l\negthinspace R}}^{2\times 2}\times {\mathrm{l\negthinspace R}}^2\),

$$\begin{aligned} \begin{array}{lll} H^{\theta }(t, X, u, p, q,\ell ):= \langle F(t,X,E[\phi (X)],u),p\rangle -f (t,x ,E[\tilde{\phi }(X)], u)\\ \qquad \qquad \qquad \qquad \qquad +\,\text{ tr }(G^*(t,X,E[\phi (X)])(q +\theta \ell p^*)). \end{array} \end{aligned}$$
(11)

For \(g\in \{b,c, \sigma ,\alpha ,\beta \}\) and \(u\in U\), we set

$$\begin{aligned} \begin{array}{llll} g_x(t):=g_x(t,\bar{x}(t),E[\bar{x}(t)],\bar{u}(t)),\,\,\, g_m(t):=g_m(t,\bar{x}(t),E[\bar{x}(t)],\bar{u}(t)) \end{array} \end{aligned}$$
(12)

and

$$\begin{aligned} \left\{ \begin{array}{llll} f_x(t):=f_x(t,\bar{x}(t),E[\bar{\rho }(t)\bar{x}(t)],\bar{u}(t)),\,\,\, f_m(t):=f_m(t,\bar{x}(t),E[\bar{\rho }(t)\bar{x}(t)],\bar{u}(t)),\\ h_x(t):=h_x(\bar{x}(t),E[\bar{\rho }(t)\bar{x}(t)]),\,\,\, h_m(t):=h_m(\bar{x}(t),E[\bar{\rho }(t)\bar{x}(t)]). \end{array} \right. \end{aligned}$$
(13)

Let

$$\begin{aligned} \psi ^{\theta }_T:=\bar{\rho }(T)\exp {\theta \left[ \int _0^Tf(t,\bar{x}(t), E[\bar{\rho }(t)\bar{x}(t)], \bar{u}(t)) dt+h(\bar{x}(T), E[\bar{\rho }(T)\bar{x}(T)])\right] }. \end{aligned}$$

We introduce the adjoint equations involved in the risk-sensitive SMP for our control problem.

$$\begin{aligned} \left\{ \begin{array}{lll} d\hat{p}(t)&{}=&{}-\left( \begin{array}{ccc} H^{\theta }_{\rho }(t)+\frac{1}{v^{\theta }(t)}E[v^{\theta }(t){\check{H}}^{\theta }_m(t)]-\frac{\bar{x}(t)}{v^{\theta }(t)}E[v^{\theta }(t)f_m(t)] \\ H^{\theta }_x(t)+\frac{1}{v^{\theta }(t)}E[v^{\theta }(t){\check{H}}^{\theta }_m(t)]-\frac{\bar{\rho }(t)}{v^{\theta }(t)}E[v^{\theta }(t)f_m(t)]\end{array}\right) dt\\ &{}&{}\quad +\,\hat{q}(t)(-\theta \ell (t)dt+dB_t),\\ dv^{\theta }(t)&{}=&{}\theta v^{\theta }(t)\langle \ell (t),dB_t\rangle ,\\ \hat{p}(T)&{}=&{}-\left( \begin{array}{ccc} (\theta \bar{\rho }(T))^{-1}\\ h_x(T) \end{array}\right) -\left( \begin{array}{ccc} \bar{x}(T)\\ \bar{\rho }(T)\end{array}\right) \frac{1}{\psi ^{\theta }_T}E[\psi ^{\theta }_Th_m(T)],\\ v^{\theta }(T)&{}=&{}\psi ^{\theta }_T, \end{array} \right. \end{aligned}$$
(14)

where, in view of (12) and (13), for \(k\in \{\rho ,x\}\),

$$\begin{aligned}\begin{array}{lll} H^{\theta }_k(t):=\langle F_k(t,\bar{X}(t),E[\phi (\bar{X}(t))],\bar{u}(t)), \hat{p}(t)\rangle -f_k (t, \bar{x}(t),E[\tilde{\phi }(\bar{X}(t))], \bar{u}(t))\\ \qquad \qquad \qquad \qquad \quad +\,\text{ tr }\left( G_k^*(t,\bar{X}(t),E[\phi (\bar{X}(t))])(\hat{q}(t) +\theta \ell (t){\hat{p}}^*(t))\right) \end{array} \end{aligned}$$

and

$$\begin{aligned}\begin{array}{lll} {\check{H}}^{\theta }_m(t):=\langle F_m(t,\bar{X}(t),E[\phi (\bar{X}(t))],\bar{u}(t)), \hat{p}(t)\rangle \\ \qquad \qquad \qquad \qquad \quad +\,\text{ tr }\left( G_m^*(t,\bar{X}(t),E[\phi (\bar{X}(t))])(\hat{q}(t) +\theta \ell (t){\hat{p}}^*(t))\right) . \end{array} \end{aligned}$$

We note that the processes \((\hat{p},\hat{q},\ell )\) may depend on the sensitivity index \(\theta \). To ease notation, we omit to make this dependence explicit.

Below, we will show that, under Assumption 1, (14) admits a unique \({\mathrm{l\negthinspace F}}\)-adapted solution \((\hat{p},\hat{q},v^{\theta },\ell )\) such that

$$\begin{aligned} E\left[ \sup _{t\in [0,T]}|\hat{p}(t)|^2+\sup _{t\in [0,T]}|v^{\theta }(t)|^2+\int _0^T \left( |\hat{q}(t)|^2+|\ell (t)|^2\right) dt\right] <\infty . \end{aligned}$$
(15)

Moreover,

Lemma 1

The process defined on \((\varOmega ,{\mathscr {F}},{\mathrm{l\negthinspace F}}, {\mathrm{l\negthinspace P}})\) by

$$\begin{aligned} L^{\theta }_t:=\frac{v^{\theta }(t)}{v^{\theta }(0)}=\exp {\left( \int _0^t \theta \langle \ell (s),dB_s\rangle - \frac{\theta ^2}{2}\int _0^t |\ell (s)|^2 ds\right) }, \quad 0\le t\le T, \end{aligned}$$
(16)

is a uniformly integrable \({\mathrm{l\negthinspace F}}\)-martingale.

The process \(L^{\theta }\) defines a new probability measure \({\mathrm{l\negthinspace P}}^{\theta }\) equivalent to \({\mathrm{l\negthinspace P}}\) by setting \(L^{\theta }_t:=\frac{d{\mathrm{l\negthinspace P}}^{\theta }}{d{\mathrm{l\negthinspace P}}}|_{{\mathscr {F}}_t}\). By Girsanov’s theorem, the process \(B_t^{\theta }:=B_t-\theta \int _0^t \ell (s)ds,\,\, 0\le t\le T\) is a \({\mathrm{l\negthinspace P}}^{\theta }\)-Brownian motion.
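One way to read the first equation in (14) in light of this change of measure: since \(dB^{\theta }_t=dB_t-\theta \ell (t)dt\), the last term of that equation satisfies

$$\begin{aligned} \hat{q}(t)(-\theta \ell (t)dt+dB_t)=\hat{q}(t)dB^{\theta }_t, \end{aligned}$$

so that, under \({\mathrm{l\negthinspace P}}^{\theta }\), the equation for \(\hat{p}\) keeps the same drift vector as in (14) and is driven by the \({\mathrm{l\negthinspace P}}^{\theta }\)-Brownian motion \(B^{\theta }\).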

The following theorem is the main result of the paper. Let \(E^{\theta }[\,\cdot \,]\) denote the expectation w.r.t. \({\mathrm{l\negthinspace P}}^{\theta }\).

Theorem 1

(Risk-sensitive maximum principle) Let Assumption 1 hold. If the process \((\bar{\rho }(\cdot ),\bar{x}(\cdot ),\bar{u}(\cdot ))\) is an optimal solution of the risk-sensitive control problem (3)–(5), then there are two pairs of \({\mathrm{l\negthinspace F}}\)-adapted processes \((v^{\theta }, \ell )\) and \((\hat{p},\hat{q})\) which satisfy (14)–(15), such that

$$\begin{aligned}\begin{array}{lll} E^{\theta }[H^{\theta }(t, \bar{\rho }(t),\bar{x}(t), \hat{p}(t), \hat{q}(t),\ell (t),u)-H^{\theta }(t, \bar{\rho }(t),\bar{x}(t), \hat{p}(t), \hat{q}(t),\ell (t),\bar{u}(t))|\mathscr {F}^Y_t] \le 0, \end{array} \end{aligned}$$

for all \(u\in U,\) almost every t and \({\mathrm{l\negthinspace P}}^{\theta }-\)almost surely.

Remark 1

The boundedness assumption on the involved coefficients and their derivatives imposed in Assumption 1, in Theorem 1, guarantees the solvability of the system of forward-backward SDEs (5) and (14). In fact, Theorem 1 applies provided we can solve the system of forward-backward SDEs (5) and (14). A typical example of such a situation is the classical Linear-Quadratic (LQ) control problem (see Sect. 4 below), in which the involved coefficients are at most quadratic, but not necessarily bounded.

3 Proof of the Main Result

In this section we prove Theorem 1 in several steps.

3.1 An Intermediate SMP for Mean-Field Type Control

In this subsection we first reformulate the risk-sensitive control problem associated with (4)–(5) in terms of an augmented state process and terminal payoff problem. An intermediate stochastic maximum principle is then obtained by applying the SMP obtained in ([1], Theorem 3.1 or [5], Theorem 2.1) for loss functionals without running cost. Then, we transform the intermediate first-order adjoint processes to a simpler form. The mean-field type control problem for the cost functional (4) under the dynamics (5) is equivalent to

$$\begin{aligned} \inf _{u(\cdot )\in {\mathscr {U}}} E\left[ \rho (T) e^{\theta \left[ h(x(T), E[\rho (T)x(T)])+ \xi (T)\right] }\right] , \end{aligned}$$
(17)

subject to

$$\begin{aligned} \left\{ \begin{array}{lll} d\rho (t) &{}=&{} \rho (t)\beta (t,x(t)) dY_t,\\ dx(t) &{}=&{} \left\{ b(t,x(t),E[x(t)], u(t))-\alpha (t,x(t),E[x(t)])\beta (t,x(t))\right\} dt\\ &{}+&{}\sigma (t,x(t),E[x(t)])dW_{t} +\alpha (t,x(t),E[x(t)])dY_{t},\\ d\xi &{}=&{} f(t, x(t),E[\rho (t) x(t)], u(t))dt, \\ \rho (0) &{}=&{} 1,\,\, x(0)=x_0,\,\xi (0)=0. \end{array} \right. \end{aligned}$$
(18)

We introduce the following notation.

$$\begin{aligned}\begin{array}{ccc} R:=\left( \begin{array}{ccc} \rho \\ x \\ \xi \end{array}\right) =\left( \begin{array}{ccc} X \\ \xi \end{array}\right) ,\,\, \bar{R}:=\left( \begin{array}{ccc} \bar{\rho }\\ \bar{x}\\ \bar{\xi } \end{array}\right) =\left( \begin{array}{ccc} \bar{X} \\ \bar{\xi } \end{array}\right) ,\,\,\, R_0=\bar{R}_0:=\left( \begin{array}{ccc} X_0\\ 0 \end{array}\right) ,\\ \varGamma (t,R,m):=\left( \begin{array}{ccc} G(t,X,m)\\ 0 \end{array}\right) , \\ \phi (R)=\phi (X)=x,\,\,\tilde{\phi }(R)=\tilde{\phi }(X)=\rho x,\, \,\,\phi (\bar{R})=\phi (\bar{X})=\bar{x},\,\,\tilde{\phi }(\bar{R})=\tilde{\phi }(\bar{X})=\bar{\rho }\bar{x}. \end{array} \end{aligned}$$

With this notation, the system (18) can be rewritten in the following compact form

$$\begin{aligned}\left\{ \begin{array}{lll} dR(t) &{}=&{} \left( \begin{array}{ccc} F(t, R(t),E[\phi (R(t))],u(t))\\ f(t, R(t),E[\tilde{\phi }(R(t))],u(t))\end{array}\right) dt + \varGamma (t, R(t),E[\phi (R(t))]) dB_t,\\ R(0) &{}=&{} R_0 \end{array} \right. \end{aligned}$$

and the risk-sensitive cost functional (4) becomes

$$\begin{aligned} J^{\theta }(u(\cdot )):=E[\Phi \left( R(T),E[\tilde{\phi }(R(T))]\right) ], \end{aligned}$$

where,

$$\begin{aligned} \Phi \left( R(T),E[\tilde{\phi }(R(T))]\right) :=\rho (T) \exp \left( \theta h(x(T), E[\rho (T)x(T)]) +\theta \xi (T)\right) . \end{aligned}$$

We define the Hamiltonian associated with random variables R such that \(\phi (R)\) and \(\tilde{\phi }(R)\) belong to \(L^1(\varOmega ,{\mathscr {F}},{\mathrm{l\negthinspace P}})\) as follows. For \((p,q)\in {\mathrm{l\negthinspace R}}^3\times {\mathrm{l\negthinspace R}}^{3\times 2}\),

$$\begin{aligned} \begin{array}{lll} H^e(t,R,p,q,u):=\langle \left( \begin{array}{ccc} F(t, R(t),E[\phi (R(t))],u(t))\\ f(t, R(t),E[\tilde{\phi }(R(t))],u(t))\end{array}\right) ,p\rangle \\ \qquad \qquad \qquad \qquad \qquad \quad \qquad \qquad +\,\text{ tr }(\varGamma ^*(t,R,E[\phi (R)])q), \end{array} \end{aligned}$$
(19)

where, \(\varGamma ^*\) denotes the transpose of the matrix \(\varGamma \).

Setting

$$\begin{aligned} p:=\left( \begin{array}{lll} p_1\\ p_2 \\ p_3 \end{array}\right) ,\quad q:=\left( \begin{array}{lll} q_{11} &{} q_{12}\\ q_{21} &{} q_{22}\\ q_{31} &{} q_{32} \end{array}\right) , \end{aligned}$$
(20)

the explicit form of the Hamiltonian (19) reads

$$\begin{aligned}\begin{array}{lll} H^e(t,\rho ,x,\xi ,p,q,u):=H^e(t,R,p,q,u)=c(t,x,E[x],u)p_2+f(t,x,E[\rho x],u)p_3\\ \qquad \qquad \qquad \qquad \quad +\sigma (t,x,E[x])q_{22}+\rho \beta (t,x) q_{11}+\alpha (t,x,E[x]) q_{21}. \end{array} \end{aligned}$$

In view of (12), we set

$$\begin{aligned}\left\{ \begin{array}{lll} H^e_x(t) &{}:=&{} c_x(t)p_2(t)+f_x(t)p_3(t)+\sigma _x(t)q_{22}(t)+\bar{\rho }(t)\beta _x(t) q_{11}(t)+\alpha _x(t)q_{21}(t),\\ {\check{H}}^e_m(t) &{}:=&{} c_m(t)p_2(t)+f_m(t)p_3(t)+\sigma _m(t)q_{22}(t)+\alpha _m(t) q_{21}(t),\\ H^e_{\rho }(t) &{}=&{} \beta (t,\bar{x}(t))q_{11}(t). \end{array} \right. \end{aligned}$$

We may apply the SMP for risk-neutral mean-field type control (cf. [1], Theorem 3.1 or [5], Theorem 2.1) to the augmented state dynamics \((\rho , x,\xi )\) to derive the first order adjoint equation:

$$\begin{aligned} \left\{ \begin{array}{lll} dp(t) &{}=&{} -\left( \begin{array}{ccc} H^e_{\rho }(t)+E[{\check{H}}^e_m(t)]+\bar{x}(t)E[f_m(t)p_3(t)]\\ H^e_x(t)+E[{\check{H}}^e_m(t)]+\bar{\rho }(t)E[f_m(t)p_3(t)]\\ 0\end{array}\right) dt+q(t)dB_t,\\ p(T) &{}=&{} -\theta \psi ^{\theta }_T\left( \begin{array}{ccc} (\theta \bar{\rho }(T))^{-1}\\ h_x(T)\\ 1\end{array}\right) -\theta \left( \begin{array}{lll} \bar{x}(T)\\ \bar{\rho }(T)\\ 0\end{array}\right) E[\psi ^{\theta }_Th_m(T)]. \end{array}\right. \end{aligned}$$
(21)

This is a system of linear backward SDEs of mean-field type which, in view of ([6], Theorem 3.1), under Assumption 1, admits a unique \({\mathrm{l\negthinspace F}}\)-adapted solution (p, q) satisfying

$$\begin{aligned} E\left[ \sup _{t\in [0,T]}|p(t)|^2+\int _0^T |q(t)|^2 dt\right] <\infty , \end{aligned}$$
(22)

where, \(|\cdot |\) denotes the usual Euclidean norm with appropriate dimension.
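The terminal condition in (21) can be recovered, at least formally, by differentiating the terminal cost. Write \(\Phi (\rho ,x,\xi ,m)=\rho \exp (\theta h(x,m)+\theta \xi )\), where the symbol m for the frozen mean-field argument \(E[\rho x]\) is used only for this check. Evaluated along the optimal pair, where \(\Phi =\psi ^{\theta }_T\), the partial derivatives are

$$\begin{aligned} \partial _{\rho }\Phi =\frac{\psi ^{\theta }_T}{\bar{\rho }(T)}=\theta \psi ^{\theta }_T(\theta \bar{\rho }(T))^{-1},\quad \partial _x\Phi =\theta \psi ^{\theta }_Th_x(T),\quad \partial _{\xi }\Phi =\theta \psi ^{\theta }_T,\quad \partial _m\Phi =\theta \psi ^{\theta }_Th_m(T), \end{aligned}$$

while the gradient of \(\tilde{\phi }(R)=\rho x\) with respect to \((\rho ,x,\xi )\) is \((\bar{x}(T),\bar{\rho }(T),0)^*\). Combining minus the state gradient of \(\Phi \) with minus this gradient times \(E[\partial _m\Phi ]\) gives

$$\begin{aligned} p(T)=-\theta \psi ^{\theta }_T\left( \begin{array}{ccc} (\theta \bar{\rho }(T))^{-1}\\ h_x(T)\\ 1\end{array}\right) -\theta \left( \begin{array}{ccc} \bar{x}(T)\\ \bar{\rho }(T)\\ 0\end{array}\right) E[\psi ^{\theta }_Th_m(T)], \end{aligned}$$

in agreement with (21).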

We may apply the SMP for SDEs of mean-field type control from ([1], Theorem 3.1 or [5], Theorem 2.1) together with the SMP for risk-neutral partially observable SDEs derived in ([23], Theorem 2.1) to obtain the following SMP.

Proposition 1

Let Assumption 1 hold. If \((\bar{R}(\cdot ), \bar{u}(\cdot ))\) is an optimal solution of the risk-neutral control problem (17) subject to the dynamics (18), then there is a unique pair of \({\mathrm{l\negthinspace F}}\)-adapted processes (p, q) which satisfies (21)–(22) such that

$$\begin{aligned}\begin{array}{lll} E[H^e(t, \bar{R}(t), p(t), q(t),u)-H^e(t, \bar{R}(t), p(t), q(t),\bar{u}(t))|\mathscr {F}^Y_t] \le 0, \end{array} \end{aligned}$$

for all \(u\in U,\) almost every t and \({\mathrm{l\negthinspace P}}-\)almost surely.

3.2 Transformation of the First Order Adjoint Process

Although the result of Proposition 1 is a good SMP for the risk-sensitive mean-field type control with partial observations, augmenting the state process with the third component \(\xi \) yields a system of three adjoint equations that appears complicated to solve in concrete situations. In this subsection we apply the transformation of the adjoint processes (p, q) introduced in [11] so as to get rid of the third component \((p_3,q_{31},q_{32})\) in (21) and express the SMP in terms of only two adjoint processes that we denote \((\hat{p},\hat{q})\), where

$$\begin{aligned} \hat{p}:=\left( \begin{array}{ccc} \hat{p}_1\\ \hat{p}_2\end{array}\right) ,\quad \hat{q}:=\left( \begin{array}{ccc} \hat{q}_1\\ \hat{q}_2\end{array}\right) ,\quad \hat{q}_i:=(\hat{q}_{i1},\hat{q}_{i2}),\,\,\, i=1,2. \end{aligned}$$
(23)

Indeed, from (21) we have \(dp_3(t) =\langle q_3(t),dB_t\rangle \), with \(q_3:=(q_{31},q_{32})\), and \( p_3(T)=- \theta \psi ^{\theta }_T\). The explicit solution of this backward SDE is

$$\begin{aligned} p_3(t)=- \theta E[\psi ^{\theta }_T \ | \ {\mathscr {F}}_t ]=-\theta v^{\theta }(t) , \end{aligned}$$
(24)

where,

$$\begin{aligned} v^{\theta }(t):= E[\psi ^{\theta }_T \ | \ {\mathscr {F}}_t ], \qquad 0\le t\le T. \end{aligned}$$

In particular, we have \(v^{\theta }(0)= E[\psi ^{\theta }_T].\) Therefore, in view of (24), it would be natural to choose a transformation of (p, q) into an adjoint process \((\hat{p}, \hat{q})\) , where,

$$ \hat{p}:=\left( \begin{array}{ccc} \hat{p}_1\\ \hat{p}_2 \\ \hat{p}_3\end{array}\right) ,\quad \hat{q}:=\left( \begin{array}{ccc} \hat{q}_{11} &{} \hat{q}_{12}\\ \hat{q}_{21} &{} \hat{q}_{22} \\ \hat{q}_{31} &{} \hat{q}_{32}\end{array}\right) , $$

such that

$$\begin{aligned} \hat{p}_3(t)=\frac{p_3(t)}{ \theta v^{\theta }(t)}=-1, \qquad 0\le t\le T. \end{aligned}$$
(25)

This would imply that, for almost every \(0\le t\le T\),

$$\begin{aligned} \hat{q}_3(t)=(\hat{q}_{31}(t), \hat{q}_{32}(t))=0, \,\,\, {\mathrm{l\negthinspace P}}-\text{ a.s. }, \end{aligned}$$
(26)

which in turn would reduce the number of adjoint processes to those of the form given by (23).

We consider the following transform:

$$\begin{aligned} \hat{p}(t):=\frac{1}{\theta v^{\theta }(t)} p(t),\qquad 0\le t\le T. \end{aligned}$$
(27)

In view of (21), we have

$$\begin{aligned} \hat{p}(T)=-\left( \begin{array}{ccc} (\theta \bar{\rho }(T))^{-1}\\ h_x(T)\\ 1\end{array}\right) -\left( \begin{array}{ccc} \bar{x}(T)\\ \bar{\rho }(T)\\ 0\end{array}\right) \frac{1}{\psi ^{\theta }_T}E[\psi ^{\theta }_Th_m(T)]. \end{aligned}$$
(28)

We should identify the processes \(\hat{\alpha }\) and \(\hat{q}\) such that

$$\begin{aligned} d\hat{p}(t)=-\hat{\alpha }(t)dt+\hat{q}(t) dB_t, \end{aligned}$$
(29)

for which (25) and (26) are satisfied.

In order to investigate the properties of these new processes \((\hat{p},\hat{q})\), the following properties of the generic martingale \(v^{\theta }\), used in [11], are essential. We reproduce them here for the sake of completeness. Since, by Assumption 1, f and h are bounded by some constant \(C>0\), we have

$$\begin{aligned} 0< e^{-(1 + T)C\theta }\bar{\rho }(T) \le \psi ^{\theta }_T \le e^{(1 + T)C\theta }\bar{\rho }(T). \end{aligned}$$

Therefore, \(v^{\theta }\) is a uniformly integrable \({\mathrm{l\negthinspace F}}\)-martingale satisfying

$$\begin{aligned} 0< e^{-(1 + T)C\theta }\bar{\rho }(t) \le v^{\theta }(t) \le e^{(1 + T)C\theta }\bar{\rho }(t), \,\qquad 0\le t\le T. \end{aligned}$$

Hence, in view of (2), we have

$$\begin{aligned} E[\sup _{0\le t\le T}|v^{\theta }(t)|^2]\le C. \end{aligned}$$
(30)

Furthermore, the martingale \(v^{\theta }\) enjoys the following useful logarithmic transform established in ([12], Proposition 3.1)

$$\begin{aligned} v^{\theta }(t)=\exp \left( \theta Z_t+\theta \int _0^t f(s,\bar{x}(s),E[\bar{\rho }(s)\bar{x}(s)],\bar{u}(s))ds\right) , \quad 0\le t\le T, \end{aligned}$$
(31)

and

$$\begin{aligned} v^{\theta }(0)=E[\psi ^{\theta }_T]=\exp (\theta Z_0). \end{aligned}$$

Moreover, the process Z is the first component of the \({\mathrm{l\negthinspace F}}\)-adapted pair of processes \((Z,\ell )\) which is the unique solution to the following quadratic BSDE:

$$\begin{aligned} \left\{ \begin{array}{lll} dZ_t=-\{f(t, \bar{x}(t), E[\bar{\rho }(t)\bar{x}(t)],\bar{u}(t))+\frac{\theta }{2}|\ell (t)|^2\}dt+\langle \ell (t), dB_t \rangle ,\\ \\ Z_T=\frac{1}{\theta }\ln \bar{\rho }(T)+ h(\bar{x}(T), E[\bar{\rho }(T)\bar{x}(T)]). \end{array} \right. \end{aligned}$$
(32)

where, \(\ell (t)=(\ell _1(t),\ell _2(t))\) satisfies

$$\begin{aligned} E\left[ \int _0^T |\ell (t)|^2dt\right] <\infty . \end{aligned}$$
(33)

In particular, \(v^{\theta }\) solves the following linear backward SDE

$$\begin{aligned} dv^{\theta }(t)=\theta v^{\theta }(t)\langle \ell (t),dB_t\rangle ,\quad v^{\theta }(T)=\psi ^{\theta }_T. \end{aligned}$$
(34)
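As a consistency check, (34) can be recovered from (31) and (32) by Itô’s formula. The following is only a sketch; the shorthand \(U_t\) and the abbreviation \(f(s)\) for the running cost evaluated along the optimal pair, as in (31), are introduced here for illustration. Set \(U_t:=\theta Z_t+\theta \int _0^t f(s)ds\), so that \(v^{\theta }(t)=\exp (U_t)\). By (32),

$$\begin{aligned} dU_t=-\frac{\theta ^2}{2}|\ell (t)|^2dt+\theta \langle \ell (t),dB_t\rangle ,\qquad d\langle U\rangle _t=\theta ^2|\ell (t)|^2dt, \end{aligned}$$

and therefore

$$\begin{aligned} dv^{\theta }(t)=v^{\theta }(t)\left( dU_t+\frac{1}{2}d\langle U\rangle _t\right) =\theta v^{\theta }(t)\langle \ell (t),dB_t\rangle . \end{aligned}$$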

Hence,

Proof of Lemma 1. In view of (30),

$$\begin{aligned} \frac{v^{\theta }(t)}{v^{\theta }(0)}=\exp {\left( \int _0^t \theta \langle \ell (s),dB_s\rangle - \frac{\theta ^2}{2}\int _0^t |\ell (s)|^2 ds\right) }=:L^{\theta }_t, \quad 0\le t\le T, \end{aligned}$$
(35)

is a uniformly integrable \({\mathrm{l\negthinspace F}}\)-martingale. \(\square \)

To identify the processes \(\hat{\alpha }\) and \(\hat{q}\) such that

$$\begin{aligned} d\hat{p}(t)=-\hat{\alpha }(t)dt+\hat{q}(t) dB_t, \end{aligned}$$

we may apply Itô’s formula to the process \(p(t)=\theta v^{\theta }(t)\hat{p}(t)\), use (21) and (34) and identify the coefficients. We obtain

$$\begin{aligned} \left\{ \begin{array}{lll} \hat{\alpha }(t)&{}=&{}\frac{1}{\theta v^{\theta }(t)}\left( \begin{array}{ccc} H^e_{\rho }(t)+E[{\check{H}}^e_m(t)]+\bar{x}(t)E[f_m(t)p_3(t)]\\ H^e_x(t)+E[{\check{H}}^e_m(t)]+\bar{\rho }(t)E[f_m(t)p_3(t)]\\ 0\end{array}\right) +\theta \hat{q}(t)\ell (t),\\ \hat{q}(t)&{}=&{}\frac{1}{\theta v^{\theta }(t)}q(t)-\theta \hat{p}(t)\ell (t).\\ \end{array} \right. \end{aligned}$$
(36)
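The identification leading to (36) can be sketched as follows, assuming that (21) is written in the form \(dp(t)=-F(t)dt+q(t)dB_t\), where \(F\) is a shorthand introduced here for the drift vector of (21). Applying the product rule to \(p(t)=\theta v^{\theta }(t)\hat{p}(t)\) and using (34), we get

$$\begin{aligned} dp(t)=\theta v^{\theta }(t)\left( -\hat{\alpha }(t)+\theta \hat{q}(t)\ell (t)\right) dt+\theta v^{\theta }(t)\left( \hat{q}(t)+\theta \hat{p}(t)\ell (t)\right) dB_t, \end{aligned}$$

where \(\hat{p}(t)\ell (t)\) is read as the matrix with entries \(\hat{p}_i(t)\ell _j(t)\). Matching the \(dB_t\)-terms gives \(q(t)=\theta v^{\theta }(t)(\hat{q}(t)+\theta \hat{p}(t)\ell (t))\), and matching the \(dt\)-terms gives

$$\begin{aligned} \hat{\alpha }(t)=\frac{1}{\theta v^{\theta }(t)}F(t)+\theta \hat{q}(t)\ell (t), \end{aligned}$$

which is the structure displayed in (36).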

Therefore,

$$\begin{aligned} \left\{ \begin{array}{lll} d\hat{p}(t) &{}=&{} -\frac{1}{\theta v^{\theta }(t)}\left( \begin{array}{ccc} H^e_{\rho }(t)+E[{\check{H}}^e_m(t)]+\bar{x}(t)E[f_m(t)p_3(t)]\\ H^e_x(t)+E[{\check{H}}^e_m(t)]+\bar{\rho }(t)E[f_m(t)p_3(t)]\\ 0\end{array}\right) dt+\hat{q}(t)dB^{\theta }_t,\\ \hat{q}(t) &{}=&{} \frac{1}{\theta v^{\theta }(t)}q(t)-\theta \hat{p}(t)\ell (t),\\ dv^{\theta }(t) &{}=&{} \theta v^{\theta }(t)\langle \ell (t),dB_t\rangle ,\\ \hat{p}(T) &{}=&{} -\left( \begin{array}{ccc} (\theta \bar{\rho }(T))^{-1}\\ h_x(T)\\ 1\end{array}\right) -\left( \begin{array}{lll} \bar{x}(T)\\ \bar{\rho }(T)\\ 0\end{array}\right) \frac{1}{\psi ^{\theta }_T}E[\psi ^{\theta }_Th_m(T)],\\ v^{\theta }(T) &{}=&{} \psi ^{\theta }_T, \end{array} \right. \end{aligned}$$
(37)

where, \(B_t^{\theta }:=B_t-\theta \int _0^t \ell (s)ds,\,\, 0\le t\le T\), which is, in view of (35) and Girsanov’s Theorem, a \({\mathrm{l\negthinspace P}}^{\theta }\)-Brownian motion, where \(\frac{d{\mathrm{l\negthinspace P}}^{\theta }}{d{\mathrm{l\negthinspace P}}}\Big |_{{\mathscr {F}}_t}:=L^{\theta }_t\).
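The change of measure above can be illustrated numerically in the simplest possible setting. The following Python sketch is ours and not part of the proof: it assumes a one-dimensional Brownian motion and a constant, deterministic \(\ell \) (in the paper \(\ell \) is a process), and checks by Monte Carlo that \(E[L^{\theta }_T]=1\) and \(E[L^{\theta }_TB^{\theta }_T]=0\), i.e. that \(B^{\theta }_T\) is centered under \({\mathrm{l\negthinspace P}}^{\theta }\). The function name and the parameter values are hypothetical.

```python
import math
import random


def girsanov_check(theta, ell, T, n_samples=100_000, seed=0):
    """Monte Carlo check of the Girsanov density for constant ell (an assumption).

    Returns (mean of L_T, mean of L_T * B_theta_T), where
    L_T = exp(theta*ell*B_T - 0.5*theta^2*ell^2*T) and
    B_theta_T = B_T - theta*ell*T.
    """
    rng = random.Random(seed)
    sum_density = 0.0
    sum_shifted = 0.0
    for _ in range(n_samples):
        b_T = rng.gauss(0.0, math.sqrt(T))           # B_T under P
        density = math.exp(theta * ell * b_T - 0.5 * theta**2 * ell**2 * T)
        sum_density += density                        # estimates E[L_T]
        sum_shifted += density * (b_T - theta * ell * T)  # estimates E^theta[B_theta_T]
    return sum_density / n_samples, sum_shifted / n_samples


if __name__ == "__main__":
    mean_density, mean_shifted = girsanov_check(theta=0.5, ell=0.8, T=1.0)
    print(mean_density, mean_shifted)  # both subject to Monte Carlo error
```

Both estimates carry a sampling error of order \(n^{-1/2}\), so they should be compared with 1 and 0 only up to that tolerance.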

In particular,

$$ d\hat{p}_3(t)= \langle \hat{q}_3(t),-\theta \ell (t)dt+dB_t\rangle ,\quad \hat{p}_3(T)=-1. $$

Therefore, noting that \(\hat{p}_3(t):=[\theta v^{\theta }(t)]^{-1}p_3(t)\) is square-integrable, we obtain

$$\hat{p}_3(t)=E^{{\mathrm{l\negthinspace P}}^{\theta }}[\hat{p}_3(T)| {\mathscr {F}}_t]=-1.$$

Thus, its quadratic variation satisfies \(\int _0^T|\hat{q}_3(t)|^2dt=0,\,\,{\mathrm{l\negthinspace P}}^{\theta }\text{-a.s.}\) Since \({\mathrm{l\negthinspace P}}^{\theta }\) and \({\mathrm{l\negthinspace P}}\) are equivalent, this implies that, for almost every \(0\le t\le T\), \(\hat{q}_3(t)=0\), both \({\mathrm{l\negthinspace P}}^{\theta }\)-a.s. and \({\mathrm{l\negthinspace P}}\)-a.s.

Hence, we can drop the last components from the adjoint processes \((\hat{p},\hat{q})\) and only consider (keeping the same notation)

$$\begin{aligned} \hat{p}:=\left( \begin{array}{ccc} \hat{p}_1\\ \hat{p}_2 \end{array}\right) ,\quad \hat{q}:=\left( \begin{array}{ccc} \hat{q}_{11} &{} \hat{q}_{12}\\ \hat{q}_{21} &{} \hat{q}_{22}\end{array}\right) , \end{aligned}$$

for which (37) reduces to the risk-sensitive adjoint equation:

$$\begin{aligned} \left\{ \begin{array}{lll} d\hat{p}(t) &{}=&{}-\frac{1}{\theta v^{\theta }(t)}\left( \begin{array}{ccc} H^e_{\rho }(t)+E[{\check{H}}^e_m(t)]-\bar{x}(t)E[f_m(t)] \\ H^e_x(t)+E[{\check{H}}^e_m(t)]-\bar{\rho }(t)E[f_m(t)]\end{array}\right) dt+\hat{q}(t)dB^{\theta }_t,\\ \hat{q}(t) &{}=&{} \frac{1}{\theta v^{\theta }(t)}q(t)-\theta \hat{p}(t)\ell (t),\\ dv^{\theta }(t) &{}=&{} \theta v^{\theta }(t)\langle \ell (t),dB_t\rangle ,\\ \hat{p}(T) &{}=&{} -\left( \begin{array}{ccc} (\theta \bar{\rho }(T))^{-1}\\ h_x(T) \end{array}\right) -\left( \begin{array}{ccc} \bar{x}(T)\\ \bar{\rho }(T)\end{array}\right) \frac{1}{\psi ^{\theta }_T}E[\psi ^{\theta }_Th_m(T)],\\ v^{\theta }(T) &{}=&{} \psi ^{\theta }_T. \end{array} \right. \end{aligned}$$
(38)

Since the \({\mathrm{l\negthinspace F}}\)-adapted solution (p, q) of (21) is unique, and so is the pair \((v^{\theta },\ell )\) satisfying (33) and (34), the solution of the system of backward SDEs (38) is unique and satisfies (15).

3.3 Risk-Sensitive Stochastic Maximum Principle

We may use the transform (27) and (36) to obtain the explicit form (11) of the risk-sensitive Hamiltonian \(H^{\theta }\) defined by

$$\begin{aligned} H^{\theta }(t,\bar{X}(t),\hat{p}(t),\hat{q}(t),\ell (t), u):=\frac{1}{\theta v^{\theta }(t)}H^e(t,\bar{R}(t),p(t), q(t),u), \end{aligned}$$
(39)

where, \(H^e\) is defined by (19).

Let

$$ \delta H^{e}(t):=H^e(t,\bar{R}(t), p(t), q(t),u)-H^e(t, \bar{R}(t), p(t), q(t),\bar{u}(t)) $$

and

$$ \delta H^{\theta }(t)=H^{\theta }(t,\bar{X}(t),\hat{p}(t),\hat{q}(t),\ell (t), u)-H^{\theta }(t,\bar{X}(t),\hat{p}(t),\hat{q}(t),\ell (t), \bar{u}(t)). $$

We have

$$ E[\delta H^{e}(t)|\mathscr {F}^Y_t]= \theta E[v^{\theta }(t)\delta H^{\theta }(t)|\mathscr {F}^Y_t]=\theta v^{\theta }(0)E^{\theta }[\delta H^{\theta }(t)|\mathscr {F}^Y_t], $$

where, we recall that \(v^{\theta }(t)/v^{\theta }(0)=L^{\theta }_t=d{\mathrm{l\negthinspace P}}^{\theta }/d{\mathrm{l\negthinspace P}}|_{{\mathscr {F}}_t}\).

Now, since \(\theta >0\) and \(v^{\theta }(0)=E[\psi _T^{\theta }]>0\), the variational inequality (1) translates into

$$\begin{aligned}\begin{array}{lll} E^{\theta }[H^{\theta }(t, \bar{\rho }(t),\bar{x}(t), \hat{p}(t), \hat{q}(t),\ell (t),u)-H^{\theta }(t, \bar{\rho }(t),\bar{x}(t), \hat{p}(t), \hat{q}(t),\ell (t),\bar{u}(t))|\mathscr {F}^Y_t] \le 0. \end{array} \end{aligned}$$

for all \(u\in U,\) almost every t and \({\mathrm{l\negthinspace P}}^{\theta }-\)almost surely. This finishes the proof of Theorem 1.

4 Illustrative Example: Linear-Quadratic Risk-Sensitive Model Under Partial Observation

To illustrate our approach, we consider a one-dimensional linear diffusion with exponential quadratic cost functional. Perhaps the simplest example of a linear-quadratic (LQ) risk-sensitive control problem with mean-field coupling is

$$\begin{aligned} \left\{ \begin{array}{lll} \inf _{u(\cdot )\in {\mathscr {U}}} E^u e^{\theta \left[ \frac{1}{2} \int _0^T u^2(t)dt +\frac{1}{2}x^2(T)+\mu E^u [x(T)] \right] },\\ \displaystyle { \text{ subject } \text{ to } \ }\\ dx(t)=\left( ax(t)+bu(t)\right) dt+\sigma dW_t+\alpha d{\widetilde{W}}^u_t,\\ dY_t=\beta x(t)dt+d{\widetilde{W}}^u_t,\\ x(0)=x_{0},\, Y_0=0,\\ \end{array} \right. \end{aligned}$$

where, \(a, b, \alpha ,\beta ,\mu \) and \(\sigma \) are real constants.

In this section we illustrate our approach by considering only the LQ risk-sensitive control problem under partial observation without the mean-field coupling, i.e. \(\mu =0\), so that our result can be compared with [8], where a similar example (in many dimensions) is studied using the Dynamic Programming Principle. The case \(\mu \ne 0\) can be treated in a similar fashion (cf. [11]).

We consider the linear-quadratic risk-sensitive control problem:

$$\begin{aligned} \left\{ \begin{array}{lll} \inf _{u(\cdot )\in {\mathscr {U}}} E^u e^{\theta \left[ \frac{1}{2} \int _0^T u^2(t)dt +\frac{1}{2}x^2(T)\right] },\\ \displaystyle { \text{ subject } \text{ to } \ }\\ dx(t)=\left( ax(t)+bu(t)\right) dt+\sigma dW_t+\alpha d{\widetilde{W}}^u_t,\\ dY_t=\beta x(t)dt+d{\widetilde{W}}^u_t,\\ x(0)=x_{0},\, Y_0=0,\\ \end{array} \right. \end{aligned}$$
(40)

where, \(a, b, \alpha ,\beta \) and \(\sigma \) are real constants.

An admissible process \((\bar{\rho }(\cdot ), \bar{x}(\cdot ), \bar{u}(\cdot ))\) satisfying the necessary optimality conditions of Theorem 1 is obtained by solving the following system of forward-backward SDEs (cf. (5) and (14); see also Remark 1 above):

$$\begin{aligned} \left\{ \begin{array}{lll} d\bar{\rho }(t) &{}=&{} \beta \bar{\rho }(t)\bar{x}(t) dY_t,\\ d\bar{x}(t) &{}=&{} \left\{ c\bar{x}(t)+b\bar{u}(t)\right\} dt+\sigma dW_{t} +\alpha dY_{t}, \\ dp(t) &{}=&{}-\left( \begin{array}{ccc} H^{\theta }_{\rho }(t) \\ H^{\theta }_x(t)\end{array}\right) dt+q(t)(-\theta \ell (t)dt+dB_t),\\ dv^{\theta }(t) &{}=&{} \theta v^{\theta }(t)\langle \ell (t),dB_t\rangle ,\\ p(T) &{}=&{} -\left( \begin{array}{ccc} (\theta \bar{\rho }(T))^{-1}\\ \bar{x}(T) \end{array}\right) ,\\ v^{\theta }(T) &{}=&{} \psi ^{\theta }_T,\\ \bar{\rho }(0) &{}=&{} 1,\,\, \bar{x}(0)=x_0, \end{array}\right. \end{aligned}$$
(41)

where,

$$ c:=a-\alpha \beta ,\,\, B_t:=\left( \begin{array}{lll} Y_t \\ W_t\end{array}\right) ,\,\, \ell :=\left( \begin{array}{ccc} \ell _1\\ \ell _2 \end{array}\right) ,\,\, p:=\left( \begin{array}{ccc} p_1\\ p_2 \end{array}\right) ,\,\, q:=\left( \begin{array}{ccc} q_{11} &{} q_{12}\\ q_{21} &{} q_{22} \end{array}\right) , $$
$$ \psi ^{\theta }_T:=\bar{\rho }(T)e^{\theta \left[ \frac{1}{2} \int _0^T \bar{u}^2(t)dt +\frac{1}{2}\bar{x}^2(T)\right] }, $$

and the associated risk-sensitive Hamiltonian is

$$\begin{aligned} \begin{array}{lll} H^{\theta }(t,\rho ,x, u, p, q,\ell ):=(cx+bu)p_2-\frac{1}{2}u^2+\rho \beta x(q_{11}+\theta \ell _1 p_1)\\ \qquad \qquad \qquad \qquad \qquad \quad +\alpha (q_{21}+\theta \ell _1 p_2)+\sigma (q_{22}+\theta \ell _2 p_2). \end{array} \end{aligned}$$
(42)

In general the solution \((v^{\theta },\ell )\) primarily gives the correct form of the process \(\ell \), which may be a function of the optimal control \(\bar{u}\). Inserting \(\ell \) in the BSDE satisfied by (p, q) in the system (41) and solving for (p, q), we arrive at the characterization of the optimal control of our problem.

For the LQ-control problem it turns out that, by considering the BSDE satisfied by \((v^{\theta },\ell )\), we find an explicit form of the optimal control \(\bar{u}\). Indeed, by (31), this is equivalent to considering the BSDE satisfied by \((Z,\ell )\):

$$\begin{aligned} \left\{ \begin{array}{lll} dZ_t &{}=&{} -\{\frac{1}{2}\bar{u}^2(t)+\frac{\theta }{2}|\ell (t)|^2\}dt+\langle \ell (t), dB_t \rangle ,\\[4pt] Z_T &{}=&{} \frac{1}{\theta }\ln \bar{\rho }(T)+ \frac{1}{2} \bar{x}^2_T. \end{array} \right. \end{aligned}$$

Since \(\bar{u}(t)\) is \({\mathscr {F}}^Y_t\)-measurable, the form of \(Z_T\) suggests that we characterize \(\bar{u}\) and \(\ell \) such that

$$\begin{aligned} E^{\theta }[Z_t|{\mathscr {F}}^Y_t]=E^{\theta }[\frac{\gamma (t)}{2}\bar{x}^2(t)+\frac{1}{\theta }\ln \bar{\rho }(t)+\eta (t)|{\mathscr {F}}^Y_t],\quad 0\le t\le T, \end{aligned}$$

where, \(\gamma \) and \(\eta \) are deterministic functions such that \(\gamma (T)=1\) and \(\eta (T)=0\). In view of the SDEs satisfied by \((\bar{\rho },\bar{x})\) in (41), applying Itô’s formula and identifying the coefficients, we get

$$\begin{aligned} \ell _1(t)=(\alpha \gamma (t)+\beta /\theta )\bar{x}(t),\quad \ell _2(t)=\sigma \gamma (t)\bar{x}(t) \end{aligned}$$
(43)

and

$$\begin{aligned}\begin{array}{lll} E^{\theta }[\frac{1}{2}\left( \dot{\gamma }(t)+2(c+\alpha \beta )\gamma (t)+(\theta (\sigma ^2+\alpha ^2)-b^2)\gamma ^2(t)\right) \bar{x}^2(t)|{\mathscr {F}}^Y_t]\\ \qquad \qquad \quad +E^{\theta }[\dot{\eta }(t)+\frac{1}{2}(\sigma ^2+\alpha ^2)\gamma (t)+\frac{1}{2}(\bar{u}(t)+b\gamma (t)\bar{x}(t))^2|{\mathscr {F}}^Y_t]=0. \end{array} \end{aligned}$$

Hence,

$$\begin{aligned} \left\{ \begin{array}{ll} \dot{\gamma }(t)+2(c+\alpha \beta )\gamma (t)+(\theta (\sigma ^2+\alpha ^2)-b^2)\gamma ^2(t)=0,\quad \gamma (T)=1,\\ \dot{\eta }(t)+\frac{1}{2}(\sigma ^2+\alpha ^2)\gamma (t)=0,\quad \eta (T)=0, \end{array} \right. \end{aligned}$$
(44)

where, the first equation is the risk-sensitive Riccati equation, and

$$\begin{aligned} E^{\theta }[(\bar{u}(t)+b\gamma (t)\bar{x}(t))^2|{\mathscr {F}}^Y_t]=0. \end{aligned}$$

By the conditional Jensen’s inequality, we have

$$ \left| E^{\theta }[\bar{u}(t)+b\gamma (t)\bar{x}(t)|{\mathscr {F}}^Y_t]\right| ^2\le E^{\theta }[(\bar{u}(t)+b\gamma (t)\bar{x}(t))^2|{\mathscr {F}}^Y_t]. $$

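Since \(\bar{u}(t)\) is \({\mathscr {F}}^Y_t\)-measurable, the left-hand side equals \(|\bar{u}(t)+b\gamma (t)E^{\theta }[\bar{x}(t)|{\mathscr {F}}^Y_t]|^2\), which must therefore vanish. Hence, the optimal control is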

$$\begin{aligned} \bar{u}(t)=-b\gamma (t)E^{\theta }[\bar{x}(t)|{\mathscr {F}}^Y_t], \end{aligned}$$
(45)

and the optimal dynamics solves the linear SDE

$$\begin{aligned} d\bar{x}(t)=\left( c\bar{x}(t)-b^2\gamma (t) E^{\theta }[\bar{x}(t)|{\mathscr {F}}^Y_t]\right) dt+\sigma dW_t+\alpha dY_t,\quad \bar{x}(0)=x_0, \end{aligned}$$
(46)

where, by the filter equation of Theorem 8.1 in [22], \(\pi _t(\bar{x}):=E^{\theta }[\bar{x}(t)|{\mathscr {F}}^Y_t]\) is the solution of the SDE on \((\varOmega ,{\mathscr {F}},{\mathrm{l\negthinspace F}}, {\mathrm{l\negthinspace P}}^{\theta })\):

$$\begin{aligned} \pi _t(\bar{x})=x_0+\int _0^t (c-b^2\gamma (s))\pi _s(\bar{x})ds+\int _0^t \left( \alpha +(\theta \alpha \gamma (s)+\beta )\left[ \pi _s(\bar{x}^2)-\pi _s^2(\bar{x})\right] \right) d\bar{Y}^{\theta }_s, \end{aligned}$$

where, \(\bar{Y}^{\theta }_t=Y_t-\int _0^t(\theta \alpha \gamma (s)+\beta ) \pi _s(\bar{x})ds\) is an \((\varOmega ,{\mathscr {F}},{\mathrm{l\negthinspace F}}^Y, {\mathrm{l\negthinspace P}}^{\theta })\)-Brownian motion.

Inserting the form (43) of \(\ell \) in the BSDE satisfied by (p, q) in the system (41) and solving for (p, q), we arrive at the same characterization of the optimal control of our problem, obtained as a maximizer of the associated \(H^{\theta }\) given by (42). We sketch the main steps and omit the details.

We have

$$ H^{\theta }_u=bp_2-u,\quad H^{\theta }_{\rho }=\beta x(q_{11}+\theta \ell _1 p_1),\quad H^{\theta }_x=cp_2+\beta \rho (q_{11}+\theta \ell _1 p_1). $$
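These partial derivatives can be checked numerically by central finite differences. The Python sketch below is ours; it keeps only the part of \(H^{\theta }\) in (42) that depends on \((\rho ,x,u)\), since the remaining terms vanish under differentiation in these variables. The function names and the test point are hypothetical.

```python
def h_theta(rho, x, u, p1, p2, q11, l1, theta, b, c, beta):
    """(rho, x, u)-dependent part of the risk-sensitive Hamiltonian (42)."""
    return (c * x + b * u) * p2 - 0.5 * u * u + rho * beta * x * (q11 + theta * l1 * p1)


def central_diff(func, z, step=1e-6):
    """Central finite-difference approximation of func'(z)."""
    return (func(z + step) - func(z - step)) / (2.0 * step)


def partials_error(rho, x, u, p1, p2, q11, l1, theta, b, c, beta):
    """Largest gap between numerical and stated partial derivatives."""
    common = (p1, p2, q11, l1, theta, b, c, beta)
    k = q11 + theta * l1 * p1  # factor shared by H_rho and H_x
    num_u = central_diff(lambda z: h_theta(rho, x, z, *common), u)
    num_rho = central_diff(lambda z: h_theta(z, x, u, *common), rho)
    num_x = central_diff(lambda z: h_theta(rho, z, u, *common), x)
    return max(
        abs(num_u - (b * p2 - u)),               # H_u   = b p_2 - u
        abs(num_rho - beta * x * k),             # H_rho = beta x (q_11 + theta l_1 p_1)
        abs(num_x - (c * p2 + beta * rho * k)),  # H_x   = c p_2 + beta rho (q_11 + theta l_1 p_1)
    )


if __name__ == "__main__":
    print(partials_error(1.3, 0.7, -0.4, -0.9, 0.5, 0.2, 0.6, 0.5, 1.1, -0.3, 0.8))
```

Since \(H^{\theta }\) is at most quadratic in each of these variables, the central difference is exact up to rounding, so the reported gap should be at the level of floating-point noise.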

The BSDE satisfied by (p, q) then reads

$$\begin{aligned} \left\{ \begin{array}{lll} dp_1(t) &{}=&{} -\left\{ q_{11}(t)(\beta \bar{x}(t)+\theta \ell _1(t))+\theta (\beta \ell _1(t)p_1(t)\bar{x}(t)+q_{12}(t)\ell _2(t))\right\} dt \\ &{}+&{} q_{11}(t)dY_t+q_{12}(t)dW_t,\\ dp_2(t) &{}=&{} -\left\{ cp_2(t)+\beta \bar{\rho }(t)(q_{11}(t)+\theta \ell _1(t)p_1(t))\right\} dt\\ &{}-&{} \theta (q_{21}(t)\ell _1(t)+q_{22}(t)\ell _2(t))dt+q_{21}(t)dY_t+q_{22}(t)dW_t,\\ p_1(T) &{}=&{} -\frac{1}{\theta \bar{\rho }(T)},\quad p_2(T)=-\bar{x}(T). \end{array} \right. \end{aligned}$$
(47)

In view of Theorem 1, if \(\bar{u}\) is an optimal control of the system (40), it is necessary that

$$\begin{aligned} E^{\theta }[bp_2(t)-\bar{u}(t)|{\mathscr {F}}^Y_t]=0. \end{aligned}$$

This yields

$$\begin{aligned} \bar{u}(t)=bE^{\theta }[p_2(t)|{\mathscr {F}}^Y_t]. \end{aligned}$$

The associated state dynamics \(\bar{x}\) then solves the SDE

$$\begin{aligned} d\bar{x}(t)=\left\{ c\bar{x}(t)+b^2 E^{\theta }[p_2(t)|{\mathscr {F}}^Y_t]\right\} dt+\sigma dW_{t} +\alpha dY_{t}. \end{aligned}$$

It remains to compute \(E^{\theta }[p_2(t)|{\mathscr {F}}^Y_t]\). Inserting the form (43) of \(\ell \) into the BSDE (47), applying Itô’s formula and identifying the coefficients, one checks that \((p_1(t),q_{11}(t),q_{12}(t))\) given by

$$\begin{aligned} p_1(t):=-\frac{1}{\theta \bar{\rho }(t)},\quad q_{11}(t):=\frac{\beta }{\theta }\frac{\bar{x}(t)}{\bar{\rho }(t)}, \quad q_{12}(t):=0 \end{aligned}$$

solves the first adjoint equation in (47). Furthermore, since \(p_2(T)=-\bar{x}(T)\), setting

$$\begin{aligned} E^{\theta }[p_2(t)|{\mathscr {F}}^Y_t]=-\lambda (t) E^{\theta }[\bar{x}(t)|{\mathscr {F}}^Y_t], \end{aligned}$$

where, \(\lambda \) is a deterministic function such that \(\lambda (T)=1\), and identifying the coefficients, we find that \(\lambda \) satisfies the risk-sensitive Riccati equation in (44). Moreover,

$$\begin{aligned} q_{21}(t)=-\alpha \lambda (t),\quad q_{22}(t)=-\sigma \lambda (t). \end{aligned}$$

By uniqueness of the solution of the risk-sensitive Riccati equation in (44), it follows that \(\lambda =\gamma \). Therefore,

$$\begin{aligned} E^{\theta }[p_2(t)|{\mathscr {F}}^Y_t]=-\gamma (t) E^{\theta }[\bar{x}(t)|{\mathscr {F}}^Y_t],\quad q_{21}(t)=-\alpha \gamma (t),\quad q_{22}(t)=-\sigma \gamma (t). \end{aligned}$$
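The earlier claim that \((p_1,q_{11},q_{12})\) solves the first adjoint equation can be checked by hand. The following is a sketch of that computation, using only the dynamics of \(\bar{\rho }\) and the backward equation for p in (41), together with the expression of \(H^{\theta }_{\rho }\). By Itô’s formula applied to \(1/\bar{\rho }\),

$$\begin{aligned} dp_1(t)=-\frac{1}{\theta }d\left( \frac{1}{\bar{\rho }(t)}\right) =-\frac{\beta ^2\bar{x}^2(t)}{\theta \bar{\rho }(t)}dt+\frac{\beta }{\theta }\frac{\bar{x}(t)}{\bar{\rho }(t)}dY_t, \end{aligned}$$

which identifies \(q_{11}\) and gives \(q_{12}=0\). On the other hand, the drift prescribed by (41) is

$$\begin{aligned} -H^{\theta }_{\rho }(t)-\theta (q_{11}(t)\ell _1(t)+q_{12}(t)\ell _2(t))=-\beta \bar{x}(t)q_{11}(t)-\theta \ell _1(t)\left( \beta \bar{x}(t)p_1(t)+q_{11}(t)\right) =-\frac{\beta ^2\bar{x}^2(t)}{\theta \bar{\rho }(t)}, \end{aligned}$$

since \(\beta \bar{x}(t)p_1(t)+q_{11}(t)=0\). The two drifts agree, whatever the form of \(\ell _1\).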

Summing up: the optimal control of the LQ-problem (40) is

$$\begin{aligned} \bar{u}(t)=-b\gamma (t)E^{\theta }[\bar{x}(t)|{\mathscr {F}}^Y_t], \end{aligned}$$
(48)

where, \(\gamma \) solves the risk-sensitive Riccati equation

$$\begin{aligned} \dot{\gamma }(t)+2(c+\alpha \beta )\gamma (t)+(\theta (\sigma ^2+\alpha ^2)-b^2)\gamma ^2(t)=0,\quad \gamma (T)=1. \end{aligned}$$
(49)
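For numerical purposes, the Riccati equation (49) can be integrated explicitly. The Python sketch below is ours and not part of the derivation. With the shorthand \(a:=c+\alpha \beta \) and \(\kappa :=\theta (\sigma ^2+\alpha ^2)-b^2\) (both introduced here only for illustration), the substitution \(w=1/\gamma \) turns (49) into the linear equation \(\dot{w}=2aw+\kappa \), \(w(T)=1\). The code compares the resulting closed form with a Runge-Kutta integration and also approximates \(\eta (0)\) from (44). Function names and parameter values are hypothetical.

```python
import math


def gamma_closed_form(t, T, a, kappa):
    """gamma(t) solving gamma' + 2*a*gamma + kappa*gamma**2 = 0, gamma(T) = 1."""
    tau = T - t
    if abs(a) < 1e-12:
        w = 1.0 - kappa * tau  # limit a -> 0 of the formula below
    else:
        w = (1.0 + kappa / (2.0 * a)) * math.exp(-2.0 * a * tau) - kappa / (2.0 * a)
    if w <= 0.0:
        # w = 1/gamma has reached zero somewhere on [t, T): finite-time blow-up
        raise ValueError("Riccati solution blows up on [t, T]")
    return 1.0 / w


def gamma_rk4(T, a, kappa, n_steps=1000):
    """RK4 in reversed time tau = T - t; entry i approximates gamma(T - i*T/n_steps)."""
    h = T / n_steps
    rhs = lambda g: 2.0 * a * g + kappa * g * g  # d(gamma)/d(tau)
    g = 1.0
    values = [g]
    for _ in range(n_steps):
        k1 = rhs(g)
        k2 = rhs(g + 0.5 * h * k1)
        k3 = rhs(g + 0.5 * h * k2)
        k4 = rhs(g + h * k3)
        g += h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
        values.append(g)
    return values


def eta_at_zero(gammas, T, sigma, alpha):
    """Trapezoid rule for eta(0) = 0.5*(sigma^2 + alpha^2) * int_0^T gamma(s) ds."""
    h = T / (len(gammas) - 1)
    integral = h * (0.5 * gammas[0] + sum(gammas[1:-1]) + 0.5 * gammas[-1])
    return 0.5 * (sigma**2 + alpha**2) * integral


if __name__ == "__main__":
    theta, sigma, alpha, b, a, T = 0.5, 0.3, 0.2, 1.0, -0.5, 1.0
    kappa = theta * (sigma**2 + alpha**2) - b**2
    grid = gamma_rk4(T, a, kappa)
    print(grid[-1], gamma_closed_form(0.0, T, a, kappa))
    print(eta_at_zero(grid, T, sigma, alpha))
```

In this sketch a blow-up of \(\gamma \) on \([0,T]\) can occur only when \(\kappa >0\), i.e. when \(\theta (\sigma ^2+\alpha ^2)>b^2\); the closed-form routine raises an error in that case rather than returning a value.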

The optimal dynamics solves the linear SDE

$$\begin{aligned} d\bar{x}(t)=\left( c\bar{x}(t)-b^2\gamma (t) E^{\theta }[\bar{x}(t)|{\mathscr {F}}^Y_t]\right) dt+\sigma dW_t+\alpha dY_t,\quad \bar{x}(0)=x_0, \end{aligned}$$
(50)

and the filter \(\pi _t(\bar{x}):=E^{\theta }[\bar{x}(t)|{\mathscr {F}}^Y_t]\) is the solution of the SDE on \((\varOmega ,{\mathscr {F}},{\mathrm{l\negthinspace F}}, {\mathrm{l\negthinspace P}}^{\theta })\):

$$\begin{aligned} \pi _t(\bar{x})=x_0+\int _0^t (c-b^2\gamma (s))\pi _s(\bar{x})ds+\int _0^t \left( \alpha +(\theta \alpha \gamma (s)+\beta )\left[ \pi _s(\bar{x}^2)-\pi _s^2(\bar{x})\right] \right) d\bar{Y}^{\theta }_s, \end{aligned}$$

where, \(\bar{Y}^{\theta }_t=Y_t-\int _0^t(\theta \alpha \gamma (s)+\beta ) \pi _s(\bar{x})ds\) is an \((\varOmega ,{\mathscr {F}},{\mathrm{l\negthinspace F}}^Y, {\mathrm{l\negthinspace P}}^{\theta })\)-Brownian motion.