1 Introduction

The optimal control problem for regime-switching models has received considerable attention recently; see, for example, [1,2,3,4,5]. There are two main approaches to solving stochastic optimal control problems in the literature: dynamic programming and the stochastic maximum principle. We refer the reader to [6,7,8] and the references therein for more information on the dynamic programming approach. The stochastic maximum principle is a generalization of the Pontryagin maximum principle, in which optimizing a value function corresponds to optimizing a functional called the Hamiltonian. The stochastic maximum principle is expressed in terms of an adjoint equation, whose solution is given by a backward stochastic differential equation (BSDE). There is a vast literature on the stochastic maximum principle, and the reader may consult [7, 9,10,11,12,13] for more information. Applications of the stochastic maximum principle in finance include mean–variance portfolio selection (see, e.g. [7, 12] and the references therein) and utility maximization (classical and recursive) or risk minimization (see, e.g. [12, 14, 15]).

The stochastic maximum principle for regime-switching models was introduced in [1, 2] for Markov regime-switching diffusion systems and extended in [5] to Markov regime-switching jump–diffusion systems. In both cases, the authors developed a sufficient stochastic maximum principle. However, one of the main assumptions of the sufficient maximum principle is the concavity of the Hamiltonian, which may be violated in some applications. In [3], the authors prove a weak sufficient and necessary maximum principle (which does not require the concavity assumption) for Markov regime-switching diffusion systems. Let us mention that [5, Theorem 3.1] does not cover the cases in which the profit Hamiltonian is not concave (see, e.g. Sect. 5.1).

This paper discusses a partial information stochastic maximum principle for optimal control of forward–backward stochastic differential equations (FBSDEs) driven by a Markov regime-switching jump–diffusion process. We first prove a general sufficient maximum principle for optimal control with partial information (Theorem 3.1). This can be seen as a generalization of [5, Theorem 3.1] to the FBSDE setting and of [15, Theorem 2.3] to the regime-switching setting. Second, we prove a stochastic maximum principle that does not require a concavity condition (Theorem 3.2). In fact, we prove the following: a critical point of the performance functional of a partial information FBSDE problem is a conditional critical point of the associated Hamiltonian, and vice versa. The proof of Theorem 3.2 requires the use of some variational equations (compare with [16, Section 4]), and the maximum principle obtained is of a local form. One drawback of the two preceding maximum principles is the need for an assumption on the existence and uniqueness of the solution to the BSDE characterizing the adjoint processes. These equations are usually hard to solve explicitly in the partial information case and, worse, may not have a solution. Therefore, a stochastic maximum principle via Malliavin calculus is proposed to overcome this problem. In this approach, the adjoint processes are given in terms of the coefficients of the system and their Malliavin derivatives, rather than by a BSDE. The Malliavin calculus approach was introduced in [17] and further developed in [18, 19], where the authors study optimal control in the presence of additional information. Let us mention that the latter works consider neither systems of forward–backward stochastic differential equations nor the presence of an external Markov chain in the coefficients of the systems. Using the aforementioned technique, the results obtained in [3, Example 4.7] can be extended to the jump–diffusion case. Our results also generalize those derived in [4].

One of the motivations of this paper is the problem of stochastic differential utility (SDU) maximization of terminal wealth under Markov switching. The notion of recursive utility in discrete time was introduced in [20, 21] in order to separate the risk aversion and the intertemporal substitution aversion of a decision maker. This concept was generalized to stochastic differential utility in [22]. The cost function in the SDU framework depends on an intermediate consumption rate and a future utility and can be expressed as a BSDE. For more information on the maximization of SDU, the reader may consult [14, 15, 23, 24] and the references therein.

The paper is organized as follows: In Sect. 2, the framework for the partial information control problem is introduced. Section 3 presents a partial information sufficient maximum principle for a forward–backward stochastic differential equation (FBSDE) driven by a Markov switching jump–diffusion process. An equivalent maximum principle is also given, and we end the section by presenting the Malliavin calculus approach. In Sect. 4, we prove the main results. Section 5 uses the results obtained to solve a problem of optimal control for a Markov switching jump–diffusion model. A problem of recursive utility maximization with Markov regime switching is also studied.

2 Framework

This section presents the model and formulates the stochastic control problem for a Markov regime-switching forward–backward SDE with jumps. The model in [5] is adopted for the forward Markov regime-switching jump diffusion.

Let \(({\varOmega },{\mathscr {F}},P)\) be a complete probability space, where P is a reference probability measure. On this probability space, we assume that we are given a one-dimensional Brownian motion \(B=\{B(t)\}_{0\le t\le T}\), an irreducible homogeneous continuous-time, finite state space Markov chain \(\alpha :=\{\alpha (t)\}_{0\le t\le T}\) and an independent Poisson random measure \(N(\mathrm {d}\zeta ,\mathrm {d}s)\) on \(({\mathbb {R}}_+\times {\mathbb {R}}_0,\mathscr {B}({\mathbb {R}}_+)\otimes \mathscr {B}_0)\) under P. Here \({\mathbb {R}}_0={\mathbb {R}} \backslash \{0\}\) and \(\mathscr {B}_0\) is the Borel \(\sigma \)-algebra generated by the open subsets of \({\mathbb {R}}_0\).

We suppose that the filtration \({\mathbb {F}}=\{{\mathscr {F}}_t\}_{0\le t\le T}\) is the P-augmented natural filtration generated by B, N and \(\alpha \) (see, e.g. [1, Section 2] or [25, Page 369]).

\(\alpha :=\{\alpha (t)\}_{0\le t\le T}\) is an irreducible homogeneous continuous-time Markov chain with a finite state space \({\mathbb {S}}=\{e_1,e_2, \ldots ,e_D\}\subset {\mathbb {R}}^D\), where \(D\in {\mathbb {N}}\), and the jth component of \(e_i\) is the Kronecker delta \(\delta _{ij}\) for each \(i,j=1,\ldots , D\). The Markov chain is characterized by a rate (or intensity) matrix \({\varLambda }:=\{\lambda _{ij}:1\le i,j\le D\}\) under P. Note that, for each \(1\le i,j\le D\) with \(i\ne j\), \(\lambda _{ij}\) is the constant transition intensity of the chain from state \(e_i\) to state \(e_j\). In addition, \(\lambda _{ij}\ge 0\) for \(i\ne j\) and \(\sum _{j=1}^D \lambda _{ij}=0\); therefore, \(\lambda _{ii}\le 0\). It follows from [26] that the dynamics of the semimartingale \(\alpha \) are given by

$$\begin{aligned} \alpha (t)=\alpha (0)+\int _0^t{\varLambda }^T\alpha (s)\mathrm {d}s+M(t), \end{aligned}$$
(1)

where \(M:=\{M(t)\}_{t\in [0,T]}\) is an \({\mathbb {R}}^D\)-valued \(({\mathbb {F}},P)\)-martingale and \({\varLambda }^T\) is the transpose of the matrix \({\varLambda }\). Let us now present the set of jump martingales associated with the Markov chain \(\alpha \); see, for example, [5] or [26] for more information. For each \(1\le i,j\le D\) with \(i\ne j\) and \(t\in [0,T]\), let \(J^{ij}(t)\) denote the number of jumps from state \(e_i\) to state \(e_j\) up to time t. It follows from [26] that

$$\begin{aligned} J^{ij}(t)=\lambda _{ij}\int _0^t\langle \alpha (s-),e_i\rangle \mathrm {d}s +m_{ij}(t), \end{aligned}$$
(2)

where \(m_{ij}(t):=\int _0^t\langle \alpha (s-),e_i\rangle \langle \mathrm {d}M(s),e_j\rangle \) and \(m_{ij}:=\{m_{ij}(t)\}_{t\in [0,T]}\) is an \(({\mathbb {F}},P)\)-martingale.

Fix \(j\in \{1,2,\ldots ,D\}\) and let \({\varPhi }_j(t)\) be the number of jumps into state \(e_j\) up to time t. Then

$$\begin{aligned} {\varPhi }_j(t)&:=\sum _{i=1,i\ne j}^D J^{ij}(t)= \sum _{i=1,i\ne j}^D \lambda _{ij} \int _0^t\langle \alpha (s-),e_i\rangle \mathrm {d}s +\widetilde{{\varPhi }}_{j}(t)\nonumber \\&= \lambda _j(t) + \widetilde{{\varPhi }}_{j}(t), \end{aligned}$$
(3)

where \(\widetilde{{\varPhi }}_{j}(t)=\sum _{i=1,i\ne j}^D m_{ij}(t)\) and \(\lambda _j(t)=\sum _{i=1,i\ne j}^D \lambda _{ij}\int _0^t\langle \alpha (s-),e_i\rangle \mathrm {d}s \). It is worth mentioning that, for each \(j\in \{1,2,\ldots ,D\}\), \(\widetilde{{\varPhi }}_{j}:=\{\widetilde{{\varPhi }}_{j}(t)\}_{t\in [0,T]}\) is an \(({\mathbb {F}},P)\)-martingale.
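To make the constructions (1)–(3) concrete, the following minimal Python sketch simulates a path of \(\alpha \) by exponential holding times and records the jump counters \(J^{ij}\) and \({\varPhi }_j\); the two-state rate matrix, horizon and random seed are illustrative assumptions, not taken from the paper.

    import numpy as np

    rng = np.random.default_rng(0)

    # Hypothetical rate matrix Lambda = (lambda_ij): off-diagonal entries are
    # the jump intensities; each row sums to zero (D = 2 states, T = 1).
    Lam = np.array([[-0.5, 0.5],
                    [0.8, -0.8]])
    T, D = 1.0, Lam.shape[0]

    def simulate_chain(i0=0):
        """Simulate alpha on [0, T]; return the jump times, the visited states
        and the counters J[i, j] = number of jumps from e_i to e_j, cf. (2)."""
        t, i = 0.0, i0
        times, states = [0.0], [i0]
        J = np.zeros((D, D), dtype=int)
        while True:
            t += rng.exponential(1.0 / -Lam[i, i])   # exponential sojourn in e_i
            if t > T:
                break
            j = rng.choice(D, p=Lam[i].clip(min=0.0) / -Lam[i, i])
            J[i, j] += 1                             # one more jump e_i -> e_j
            times.append(t)
            states.append(j)
            i = j
        return times, states, J

    times, states, J = simulate_chain()
    Phi = J.sum(axis=0)   # Phi_j = number of jumps into state e_j, cf. (3)
    print(times, states, Phi)

Accumulating \(\lambda _{ij}\int _0^t\langle \alpha (s-),e_i\rangle \mathrm {d}s\) along the simulated path and subtracting it from \(J^{ij}(t)\) yields a realization of the martingale \(m_{ij}\) in (2).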

Assume that the compensator of \(N(\mathrm {d}\zeta ,\mathrm {d}s)\) is defined by

$$\begin{aligned} \eta _\alpha (\mathrm {d}\zeta ,\mathrm {d}s):=\nu _\alpha (\mathrm {d}\zeta |s)\eta (\mathrm {d}s)=\langle \alpha (s-),\nu (\mathrm {d}\zeta |s)\rangle \eta (\mathrm {d}s), \end{aligned}$$
(4)

where \(\eta (\mathrm {d}s)\) is a \(\sigma \)-finite measure on \({\mathbb {R}}_+\) and

\(\nu (\mathrm {d}\zeta |s):=(\nu _{e_1}(\mathrm {d}\zeta |s),\nu _{e_2}(\mathrm {d}\zeta |s),\ldots ,\nu _{e_D}(\mathrm {d}\zeta |s))\in {\mathbb {R}}^D\) is a function of s. Let us observe that for each \(j=1,\ldots ,D\), \(\nu _{e_j}(\mathrm {d}\zeta |s)=\nu _j(\mathrm {d}\zeta |s) \) is the conditional Lévy density of the jump sizes of \(N(\mathrm {d}\zeta ,\mathrm {d}s)\) at time s when \(\alpha (s^-)=e_j\) and satisfies \(\int _{{\mathbb {R}}_{0}}\min (1,\zeta ^{2})\nu _j (\mathrm {d}\zeta |s)< \infty \). In this work, we further suppose that \(\eta (\mathrm {d}s)=\mathrm {d}s\) and that \(\nu (\mathrm {d}\zeta |s)\) depends only on \(\zeta \), that is,

$$\begin{aligned} \nu (\mathrm {d}\zeta |s)=\nu (\mathrm {d}\zeta ). \end{aligned}$$

Let

$$\begin{aligned} \widetilde{N}_\alpha (\mathrm {d}\zeta ,\mathrm {d}s):=N(\mathrm {d}\zeta ,\mathrm {d}s)-\nu _\alpha (\mathrm {d}\zeta )\mathrm {d}s, \end{aligned}$$
(5)

be the compensated Markov regime-switching Poisson random measure.
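In particular, by the definition of the compensator (4), for every \(A\in \mathscr {B}_0\) with \(\nu _j(A)<\infty \) for \(j=1,\ldots ,D\), and writing \(\nu (A):=(\nu _1(A),\ldots ,\nu _D(A))\), the process

$$\begin{aligned} t\mapsto \widetilde{N}_\alpha (A\times (0,t])=N(A\times (0,t])-\int _0^t\langle \alpha (s-),\nu (A)\rangle \,\mathrm {d}s \end{aligned}$$

is an \(({\mathbb {F}},P)\)-martingale; it is this property that makes the expectations of the \(\widetilde{N}_\alpha \)-integrals vanish in the proofs of Sect. 4.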

Suppose that the state process \(X(t)=X^{(u)}(t,\omega );\,\,0 \le t \le T,\,\omega \in {\varOmega }\) is a controlled Markov regime-switching jump diffusion of the form

$$\begin{aligned} \begin{array}{llll} \,\mathrm {d}X (t) &{}=&{} b(t,X(t),\alpha (t),u(t),\omega )\,\mathrm {d}t +\sigma (t,X(t),\alpha (t),u(t),\omega )\,\mathrm {d}B(t) \\ &{} &{} +\displaystyle \int _{{\mathbb {R}}_0}\gamma (t,X(t),\alpha (t),u(t),\zeta , \omega ) \,\widetilde{N}_\alpha (\mathrm {d}\zeta ,\mathrm {d}t)\\ &{} &{}+\,\eta (t,X(t),\alpha (t),u(t),\omega )\cdot \mathrm {d}\widetilde{{\varPhi }}(t),\,\,\,\,\,\, t \in [ 0,T] \\ X(0) &{}=&{} x_0, \end{array} \end{aligned}$$
(6)

where \(T>0\) is a given constant and \(u(\cdot )\) is the control process.

The functions \(b:[0,T]\times {\mathbb {R}} \times {\mathbb {S}} \times {\mathscr {U}} \times {\varOmega }\rightarrow {\mathbb {R}}\,,\,\, \sigma :[0,T]\times {\mathbb {R}} \times {\mathbb {S}} \times {\mathscr {U}} \times {\varOmega }\rightarrow {\mathbb {R}}\),

\(\gamma :[0,T]\times {\mathbb {R}} \times {\mathbb {S}} \times {\mathscr {U}} \times {\mathbb {R}}_0\times {\varOmega }\rightarrow {\mathbb {R}}\) and \(\eta :[0,T]\times {\mathbb {R}} \times {\mathbb {S}} \times {\mathscr {U}} \times {\varOmega }\rightarrow {\mathbb {R}}\) are given such that for all \(t,\,\,\,b(t,x,e_i,u,\cdot )\), \( \sigma (t,x,e_i,u,\cdot )\), \(\gamma (t,x,e_i,u,z,\cdot )\) and \( \eta (t,x,e_i,u,\cdot )\) are \({\mathbb {F}}\)-progressively measurable for all \(x \in {\mathbb {R}},\,\,\,e_i \in {\mathbb {S}},\,\,u \in {\mathscr {U}}\) and \(z \in {\mathbb {R}}_0\).

We suppose that we are given a subfiltration

$$\begin{aligned} {\mathscr {E}}_t\subset {\mathscr {F}}_t\,;\,\,\,t\in [0,T], \end{aligned}$$
(7)

representing the information available to the controller at time t. A possible subfiltration \({\mathscr {E}}_{t}\) in (7) is the \( \delta \)-delayed information given by \( {\mathscr {E}}_{t}={\mathscr {F}}_{(t-\delta )^+ };\,\,\,t\ge 0,\) where \(\delta \ge 0\) is a known constant delay.

We consider the associated BSDE in the unknowns \(\Big (Y(t), Z(t), K(t,\zeta ), V(t)\Big )\) of the form

$$\begin{aligned} \left\{ \begin{array}{ll} \,\mathrm {d}Y (t) &{}= - g(t,X(t),\alpha (t),Y(t), Z(t), K(t,\cdot ),V(t),u(t)) \,\mathrm {d}t \,+\,Z(t)\,\mathrm {d}B(t) \\ &{}\quad + \displaystyle \int _{{\mathbb {R}}_0}K(t,\zeta )\,\widetilde{N}_\alpha (\mathrm {d}\zeta ,\mathrm {d}t)+V(t)\cdot \mathrm {d}\widetilde{{\varPhi }}(t);\,\,\, t \in [ 0,T] \\ Y(T) &{}= h(X(T),\alpha (T))\,, \end{array}\right. \end{aligned}$$
(8)

where \(g:[0,T]\times {\mathbb {R}} \times {\mathbb {S}} \times {\mathbb {R}} \times {\mathbb {R}} \times \mathscr {R} \times {\mathbb {R}} \times {\mathscr {U}} \times {\varOmega }\rightarrow {\mathbb {R}}\) and \(h:{\mathbb {R}}\times {\mathbb {S}} \rightarrow {\mathbb {R}}\) are such that the BSDE (8) has a unique solution. For sufficient conditions for the existence and uniqueness of solutions to Markov regime-switching BSDEs, we refer the reader, for example, to [27,28,29] and the references therein.

Let \(f:[0,T]\times {\mathbb {R}} \times {\mathbb {S}} \times {\mathbb {R}} \times {\mathbb {R}}\times \mathscr {R} \times {\mathbb {R}} \times {\mathscr {U}} \times {\varOmega }\rightarrow {\mathbb {R}}, \,\,\,\varphi :{\mathbb {R}} \times {\mathbb {S}}\rightarrow {\mathbb {R}}\) and \(\psi :{\mathbb {R}} \rightarrow {\mathbb {R}}\) be given \(C^1\) functions with respect to their arguments. Assume that the performance functional is as follows

$$\begin{aligned} J(u):=E\left[ \int _0^T f(s,X(s),\alpha (s),Y(s), Z(s), K(s,\cdot ),V(s),u(s))\,\mathrm {d}s + \varphi (X(T),\alpha (T))\,+\,\psi (Y(0))\right] . \end{aligned}$$
(9)

Here, \(f,\,\varphi \) and \(\psi \) represent the profit rate, the bequest function and the “utility evaluation” of the controller, respectively.

Let \({\mathscr {A}}_{{\mathscr {E}}}\) denote the family of admissible controls u, that is, \({\mathscr {E}}_t\)-predictable controls such that the system (6)–(8) has a unique solution and

$$\begin{aligned}&E\left[ \int _0^T\left\{ | f(t,X(t),\alpha (t),Y(t), Z(t), K(t,\cdot ),V(t),u(t))|\right. \right. \\&\qquad \left. +\Big | \frac{\partial f}{\partial x_i}(t,X(t),\alpha (t),Y(t), Z(t), K(t,\cdot ),V(t),u(t)) \Big |^2 \right\} \mathrm {d}t \\&\qquad \left. +|\varphi (X(T),\alpha (T))|+|\varphi ^\prime (X(T),\alpha (T))|^2 +|\psi (Y(0))|+|\psi ^\prime (Y(0))|^2\right] \\ {}&\quad <\infty \text { for } x_i=x,y,z,k \text { and } u. \end{aligned}$$

The set \({\mathscr {U}}\subset {\mathbb {R}}\) is a given convex set such that \(u(t)\in {\mathscr {U}}\) for all \(t\in [0,T]\) a.s., for all \(u \in {\mathscr {A}}_{{\mathscr {E}}}\).

Remark 2.1

The system (6)–(8) is a semi-coupled forward–backward stochastic differential equation. Under global Lipschitz continuity and linear growth conditions on the coefficients, there exists a unique strong solution to the SDE (6). Therefore, the existence and uniqueness of the solution to the system (6)–(8) follows from the existence and uniqueness of the solution to the BSDE (8).

The problem we consider is the following: find \(u^*\in {\mathscr {A}}_{{\mathscr {E}}} \) such that

$$\begin{aligned} J(u^*)=\sup _{u\in {\mathscr {A}}_{{\mathscr {E}} }}J(u). \end{aligned}$$
(10)
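A simple instance of problem (10), included only for orientation (the coefficients \(\sigma _0:{\mathbb {S}}\rightarrow {\mathbb {R}}\) and \(c\in {\mathbb {R}}^D\) below are our own illustrative choices, not an example treated in this paper), is obtained from (6), (8) and (9) by taking \(\gamma =\eta =0\), \(f=-\frac{1}{2}u^2\), \(\varphi =0\), \(h(x,e_i)=x\) and \(g(t,x,e_i,y,z,k,v,u)=\langle e_i,c\rangle y\):

$$\begin{aligned} \mathrm {d}X(t)&=u(t)\,\mathrm {d}t+\sigma _0(\alpha (t))\,\mathrm {d}B(t),\qquad X(0)=x_0,\\ \mathrm {d}Y(t)&=-\langle \alpha (t),c\rangle Y(t)\,\mathrm {d}t+Z(t)\,\mathrm {d}B(t)+\int _{{\mathbb {R}}_0}K(t,\zeta )\,\widetilde{N}_\alpha (\mathrm {d}\zeta ,\mathrm {d}t)+V(t)\cdot \mathrm {d}\widetilde{{\varPhi }}(t),\qquad Y(T)=X(T),\\ J(u)&=E\left[ -\int _0^T\tfrac{1}{2}u^2(t)\,\mathrm {d}t+\psi (Y(0))\right] . \end{aligned}$$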

3 Maximum Principle for a Markov Regime-Switching Forward–Backward Stochastic Differential Equation with Jumps

In this section, we derive a general sufficient stochastic maximum principle for a forward–backward Markov regime-switching jump–diffusion model. Afterwards, we derive an equivalent maximum principle.

For this purpose, define the Hamiltonian

$$\begin{aligned} H:[0,T] \times {\mathbb {R}}\times {\mathbb {S}}\times {\mathbb {R}}\times {\mathbb {R}}\times \mathscr {R}\times {\mathbb {R}}\times {\mathscr {U}}\times {\mathbb {R}} \times {\mathbb {R}}\times {\mathbb {R}} \times \mathscr {R} \times {\mathbb {R}} \longrightarrow {\mathbb {R}}, \end{aligned}$$

by

$$\begin{aligned}&H\left( t,x,e_i,y,z,k,v,u,a,p,q,r(\cdot ),w\right) \nonumber \\&:= f(t,x,e_i,y,z,k,v,u)+a g(t,x,e_i,y,z,k,v,u)+ p b(t,x,e_i,u) \,\nonumber \\&+q\sigma (t,x,e_i,u)+\int _{{\mathbb {R}}_0}r(t,\zeta )\gamma (t,x,e_i,u,\zeta )\nu _i(\mathrm {d}\zeta )+\sum _{j=1}^D\eta ^j(t,x,e_i,u)w^j(t)\lambda _{ij} , \end{aligned}$$
(11)

where \(\mathscr {R} \) denotes the set of all functions \(k:[0,T]\times {\mathbb {R}}_0 \rightarrow {\mathbb {R}}\) for which the integral in (11) converges.

We suppose that H is Fréchet differentiable in the variables \(x,y,z,k,v,u\) and that \(\nabla _k H(t,\zeta )\) is a random measure which is absolutely continuous with respect to \(\nu _\alpha \). This happens, for example, when f and g are of “quasi-strong generator” type, that is,

$$\begin{aligned} g(t,x,e_i,y,z,k,v,u)=g(t,x,e_i,y,z,\int _{{\mathbb {R}}_0}k(\zeta ) {\varPsi }(t,\zeta )\nu _i(\mathrm {d}\zeta ),v,u), \end{aligned}$$

where \({\varPsi }\) is predictable and satisfies \(C_1\min (1,|\zeta |)\le {\varPsi }(t,\zeta )\le C_2\min (1,|\zeta |)\) P-a.e. In addition, the constants \(C_1\) and \(C_2\) are such that: \(C_2\ge 0\) and \(C_1\in ]-1,0]\). Letting \(\tilde{k}=\int _{{\mathbb {R}}_0}k(\zeta ){\varPsi }(t,\zeta )\nu _i(\mathrm {d}\zeta )\) on the right-hand side, one can show that

$$\begin{aligned} \nabla _kg(t,x,e_i,y,z,k,v,u)(h)= & {} \nabla _{\tilde{k}}g(t,x,e_i,y,z,\int _{{\mathbb {R}}_0}k(\zeta ){\varPsi }(\zeta )\nu _i(\mathrm {d}\zeta ),v,u)\\&\times \int _{{\mathbb {R}}_0}h(\zeta ){\varPsi }(t,\zeta )\nu _i(\mathrm {d}\zeta ). \end{aligned}$$

Next, define the adjoint processes \(A(t),\,p(t),\,q(t), r(t,\cdot )\) and \(w(t),\,\,\,t\in [0,T]\) associated with these Hamiltonians by the following system of Markov regime-switching FBSDEJs:

  1.

    Forward SDE in A(t):

    $$\begin{aligned} \mathrm {d}A (t)= & {} \frac{\partial H}{\partial y} (t) \,\mathrm {d}t + \frac{\partial H}{\partial z} (t) \mathrm {d}B(t)+ \displaystyle \int _{{\mathbb {R}}_0} \frac{\mathrm {d}\nabla _k H}{\mathrm {d}\nu _\alpha (\zeta )} (t,\zeta ) \,\widetilde{N}_\alpha (\mathrm {d}\zeta ,\mathrm {d}t)\nonumber \\&+ \nabla _vH(t)\cdot \mathrm {d}\widetilde{{\varPhi }}(t);\,\,\,t\in [0,T] \nonumber \\ A(0)= & {} \psi ^\prime (Y(0)). \end{aligned}$$
    (12)

    Here and in the sequel, we use the notation

    $$\begin{aligned} \frac{\partial H}{\partial y} (t)= & {} \frac{\partial H}{\partial y} (t,X(t),\alpha (t),u(t),Y(t), Z(t), K(t,\cdot ),V(t),A(t),p(t),\\&q(t),r(t,\cdot ),w(t)), \end{aligned}$$

    etc., \(\frac{\mathrm {d}\nabla _k H}{\mathrm {d}\nu _\alpha (\zeta )} (t,\zeta ) \) is the Radon–Nikodym derivative of \( \nabla _k H(t,\zeta )\) with respect to \(\nu _\alpha (\zeta )\) and \(\nabla _vH(t)\cdot \mathrm {d}\widetilde{{\varPhi }}(t)=\sum _{j=1}^D \frac{\partial H}{\partial v^j} (t)\mathrm {d}\widetilde{{\varPhi }}_j(t)\), with \(V^j(t)=V(t,e_j)\).

  2.

    The Markov regime-switching BSDE in \((p(t),q(t),r(t,\cdot ),w(t))\):

    $$\begin{aligned} \mathrm {d}p (t)= & {} - \frac{\partial H}{\partial x} (t) \mathrm {d}t + q(t)\,\mathrm {d}B(t)+ \displaystyle \int _{{\mathbb {R}}_0} r (t,\zeta ) \, \widetilde{N}_\alpha (\mathrm {d}\zeta ,\mathrm {d}t)+ w(t)\cdot \mathrm {d}\widetilde{{\varPhi }}(t);\,\,\,t\in [0,T] \nonumber \\ p (T)= & {} \frac{\partial \varphi }{\partial x}(X(T),\alpha (T))\, +A(T) \frac{\partial h}{\partial x} (X(T),\alpha (T)). \end{aligned}$$
    (13)
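For orientation, in the illustrative instance sketched after (10), the Hamiltonian (11) reduces to \(H=-\frac{1}{2}u^2+a\langle e_i,c\rangle y+pu+q\sigma _0(e_i)\), so that \(\frac{\partial H}{\partial x}=0\), \(\frac{\partial H}{\partial y}=a\langle e_i,c\rangle \) and \(\frac{\partial H}{\partial z}=\nabla _kH=\nabla _vH=0\). The adjoint system (12)–(13) then reads

$$\begin{aligned} \mathrm {d}A(t)=\langle \alpha (t),c\rangle A(t)\,\mathrm {d}t,\quad A(0)=\psi ^\prime (Y(0));\qquad \mathrm {d}p(t)=q(t)\,\mathrm {d}B(t)+\int _{{\mathbb {R}}_0}r(t,\zeta )\,\widetilde{N}_\alpha (\mathrm {d}\zeta ,\mathrm {d}t)+w(t)\cdot \mathrm {d}\widetilde{{\varPhi }}(t),\quad p(T)=A(T), \end{aligned}$$

so \(A(t)=\psi ^\prime (Y(0))\exp \big (\int _0^t\langle \alpha (s),c\rangle \mathrm {d}s\big )\) and \(p(t)=E[A(T)|{\mathscr {F}}_t]\).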

Remark 3.1

Let V be an open subset of a Banach space \({\mathscr {X}}\) and let \(F: V \rightarrow {\mathbb {R}}\).

  • We say that F has a directional derivative (or Gâteaux derivative) at \(x\in V\) in the direction \(y\in {\mathscr {X}}\), if

    $$\begin{aligned} D_yF(x):=\underset{\varepsilon \rightarrow 0}{\lim } \frac{1}{\varepsilon }(F(x + \varepsilon y)-F(x)) \text { exists.} \end{aligned}$$
  • We say that F is Fréchet differentiable at \(x \in V\), if there exists a linear map

    $$\begin{aligned} L:{\mathscr {X}} \rightarrow {\mathbb {R}}, \end{aligned}$$

    such that

    $$\begin{aligned} \underset{\underset{h \in {\mathscr {X}}}{h \rightarrow 0}}{\lim } \frac{1}{\Vert h\Vert }|F(x+h)-F(x)-L(h)|=0. \end{aligned}$$

    In this case, we call L the Fréchet derivative of F at x, and we write

    $$\begin{aligned} L=\nabla _x F. \end{aligned}$$
  • If F is Fréchet differentiable, then F has a directional derivative in all directions \(y \in {\mathscr {X}}\) and

    $$\begin{aligned} D_yF(x)= \nabla _x F(y). \end{aligned}$$
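As a simple illustration of these notions (standard, and stated here only for orientation), take \({\mathscr {X}}=L^2(\nu )\) and \(F(k)=\int _{{\mathbb {R}}_0}k^2(\zeta )\,\nu (\mathrm {d}\zeta )\). Then

$$\begin{aligned} D_hF(k)=\lim _{\varepsilon \rightarrow 0}\frac{1}{\varepsilon }\big (F(k+\varepsilon h)-F(k)\big )=2\int _{{\mathbb {R}}_0}k(\zeta )h(\zeta )\,\nu (\mathrm {d}\zeta )=\nabla _kF(h), \end{aligned}$$

and \(\nabla _kF\), viewed as the measure \(2k(\zeta )\nu (\mathrm {d}\zeta )\), is absolutely continuous with respect to \(\nu \) with Radon–Nikodym derivative \(2k(\zeta )\); this is exactly the structure required of \(\nabla _kH(t,\zeta )\) in (12).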

3.1 A Sufficient Maximum Principle

In what follows, we give the sufficient maximum principle.

Theorem 3.1

(Sufficient maximum principle) Let \(\widehat{u}\in {\mathscr {A}}_{{\mathscr {E}} }\) with corresponding solutions \(\widehat{X}(t),(\widehat{Y}(t), \widehat{Z}(t), \widehat{K}(t,\zeta ), \widehat{V}(t)), \widehat{A}(t),(\widehat{p}(t),\widehat{q}(t),\widehat{r}(t,\zeta ),\widehat{w}(t))\) of (6), (8), (12) and (13), respectively. Suppose that the following are true:

  1.

    For each \(e_i\in {\mathbb {S}}\), the functions

    $$\begin{aligned} x \mapsto h(x, e_i),\,\,x \mapsto \varphi (x, e_i),\,\,y \mapsto \psi (y) \end{aligned}$$
    (14)

    are concave.

  2.

    The function

    $$\begin{aligned}&\widetilde{H}(x,y,z,k,v)\nonumber \\&\quad ={{\mathrm{ess \text { } sup}}}_{u\in {\mathscr {U}} }E\Big [H( t,x,e_i,y,z,k,v,u,\widehat{a},\widehat{p}(t),\widehat{q}(t),\widehat{r}(t,\cdot ), \widehat{w}(t))| {\mathscr {E}}_t\Big ] \end{aligned}$$
    (15)

    is concave for all \((t,e_i) \in [0,T]\times {\mathbb {S}}\) a.s.

  3.
    $$\begin{aligned} \underset{u\in {\mathscr {U}} }{{{\mathrm{ess \text { } sup}}}}&\Big \{E\Big [H (t,\widehat{X}(t),\alpha (t), u,\widehat{Y}(t), \widehat{Z}(t), \widehat{K}(t,\cdot ),\widehat{V}(t), \widehat{A}(t),\widehat{p}(t),\widehat{q}(t),\widehat{r}(t,\cdot ),\widehat{w}(t)) \Big . \Big | {\mathscr {E}}_t\Big ]\Big \} \nonumber \\&\le E\Big [H (t,\widehat{X}(t),\alpha (t), \widehat{u},\widehat{Y}(t), \widehat{Z}(t), \widehat{K}(t,\cdot ),\widehat{V}(t),\widehat{A}(t), \widehat{p}(t),\widehat{q}(t),\widehat{r}(t,\cdot ),\widehat{w}(t)) \Big . \Big | {\mathscr {E}}_t\Big ] \end{aligned}$$
    (16)

    for all \(t\in [0,T]\), a.s.

  4.

    Assume that \(\frac{\mathrm {d}}{\mathrm {d}\nu }\nabla _k\widehat{H}(t,\zeta )>-1\).

  5.

    In addition, assume the following integrability condition:

    $$\begin{aligned}&E\left[ \int _0^T\left\{ \widehat{p}^2(t) \left( (\sigma (t)-\widehat{\sigma }(t))^2+ {\int }_{{\mathbb {R}}_0}( \gamma (t,\zeta )-\widehat{\gamma } (t,\zeta ) )^2\,\nu _\alpha (\mathrm {d}\zeta )\right. \right. \right. \nonumber \\ {}&\quad \left. +\sum _{j=1}^D(\eta ^j(t)-\widehat{\eta }^j(t) )^2\lambda _{j}(t) \right) +(X(t)-\widehat{X}(t))^2 \left( \widehat{q}^2(t)+ {\int }_{{\mathbb {R}}_0} \widehat{r}^2 (t,\zeta ) \nu _\alpha (\mathrm {d}\zeta )\Big . \Big .\right. \nonumber \\&\quad \left. +\sum _{j=1}^D(w^j)^2(t) \lambda _{j}(t) \right) +(Y(t)-\widehat{Y}(t))^2 \left( (\frac{\partial \widehat{H}}{\partial z} )^2(t)\right. \nonumber \\&\left. \quad \ + {\int }_{{\mathbb {R}}_0} \Big \Vert \frac{\mathrm {d}\nabla _k H(t,\zeta ) }{\mathrm {d}\nu _\alpha (\zeta )} \Big \Vert ^2 \nu _\alpha (\mathrm {d}\zeta )+ \sum _{j=1}^D (\frac{\partial \widehat{H}}{\partial v^j} )^2(t) \lambda _{j}(t) \right) \nonumber \\&\quad \ \Big . \Big . +\widehat{A}^2(t) \left( (Z(t)-\widehat{Z}(t))^2 + {\int }_{{\mathbb {R}}_0}\left( K (t,\zeta )-\widehat{K} (t,\zeta ) \right) ^2\nu _\alpha (\mathrm {d}\zeta ) + \sum _{j=1}^D(V^j(t)\right. \nonumber \\&\left. \left. \left. \quad \ -\widehat{V}^j(t) )^2\lambda _{j}(t) \right) \right\} \mathrm {d}t \right] <\infty . \end{aligned}$$
    (17)

Then, \(\widehat{u}\) is an optimal control process and \(\widehat{X}\) is the corresponding controlled state process.

Remark 3.2

In Theorem 3.1 and in the following, we shall use the notation \(\widehat{X}(t)=X^{\widehat{u}}(t)\) and \(\widehat{Y}(t)=Y^{\widehat{u}}(t)\) for the processes associated with the control \(\widehat{u}(t)\). Furthermore, put \( \frac{\partial \widehat{H}}{\partial x}(t):=\frac{\partial H}{\partial x} (t,\widehat{X}(t),\alpha (t), \widehat{u},\widehat{Y}(t), \widehat{Z}(t), \widehat{K}(t,\cdot ),\widehat{V}(t),\widehat{A}(t),\widehat{p}(t),\widehat{q}(t),\widehat{r}(t,\cdot ),\widehat{w}(t)) \) and similarly for \( \frac{\partial \widehat{H}}{\partial y}(t), \frac{\partial \widehat{H}}{\partial z}(t), \nabla _{k} \widehat{H}(t,\zeta ), \frac{\partial \widehat{H}}{\partial v^j}(t)\) and \(\frac{\partial \widehat{H}}{\partial u}(t)\).

Remark 3.3

Let us mention that the above maximum principle requires some concavity assumptions. However, these concavity assumptions may not be satisfied in some applications; see, for example, Sect. 5. Therefore, we need a maximum principle which does not require them. The maximum principle derived in the next section gives a first-order necessary and sufficient condition for a critical point, but does not by itself guarantee the optimality of the control. In fact, if an optimal control exists, then the equivalent maximum principle enables us to derive an expression for it.

3.2 An Equivalent Maximum Principle

In this section, we prove a version of the maximum principle that does not require a concavity condition. We call it an equivalent maximum principle. Let us make the following additional assumptions:

Assumption A1

For all \(t_0\in [0,T]\) and all bounded \({\mathscr {E}}_{t_0}\)-measurable random variables \(\theta (\omega )\), the control process \(\beta (t)\) defined by

$$\begin{aligned} \beta (t):= \chi _{]t_0,T[}(t)\theta (\omega );\,\,t\in [0,T], \text { belongs to } {\mathscr {A}}_{{\mathscr {E}}}. \end{aligned}$$
(18)

Assumption A2

For all \(u \in {\mathscr {A}}_{{\mathscr {E}}}\) and all bounded \(\beta \in {\mathscr {A}}_{{\mathscr {E}}}\), there exists \(\delta >0\) such that

$$\begin{aligned} \widetilde{u}(t):=u(t)+\ell \beta (t);\,\,t\in [0,T] , \text { belongs to } {\mathscr {A}}_{{\mathscr {E}}} \text { for all } \ell \in ]-\delta ,\delta [. \end{aligned}$$
(19)

Assumption A3

For all bounded \(\beta \in {\mathscr {A}}_{{\mathscr {E}}}\), the derivatives processes

$$\begin{aligned} x_1(t)&=\frac{\mathrm {d}}{\mathrm {d}\ell }X^{(u+\ell \beta )}(t)\Big . \Big |_{\ell =0}; \,\,\,\,y_1(t)=\frac{\mathrm {d}}{\mathrm {d}\ell }Y^{(u+\ell \beta )}(t)\Big . \Big |_{\ell =0} ;\\ z_1(t)&=\frac{\mathrm {d}}{\mathrm {d}\ell }Z^{(u+\ell \beta )}(t)\Big . \Big |_{\ell =0}; \,\,\,\,k_1(t,\cdot )=\frac{\mathrm {d}}{\mathrm {d}\ell }K^{(u+\ell \beta )}(t,\cdot ) \Big . \Big |_{\ell =0};\\ v_1^j(t)&=\frac{\mathrm {d}}{\mathrm {d}\ell }V^{j,(u+\ell \beta )}(t) \Big . \Big |_{\ell =0},\,\,\,j=1,\ldots , D \end{aligned}$$

exist and belong to \(L^2([0,T]\times {\varOmega })\).

In the following, we write \(\frac{\partial b}{\partial x}(t)\) for \(\frac{\partial b}{\partial x}(t,X(t),\alpha (t),u(t))\), etc. It follows from (6) and (8) that

$$\begin{aligned} \mathrm {d}x_1(t)= & {} \left\{ \frac{\partial b}{\partial x}(t)x_1(t) +\frac{\partial b}{\partial u}(t)\beta (t)\right\} \mathrm {d}t +\left\{ x_1(t)\frac{\partial \sigma }{\partial x}(t)+ \frac{\partial \sigma }{\partial u}(t)\beta (t)\right\} \mathrm {d}B(t)\nonumber \\&+\displaystyle \int _{{\mathbb {R}}_0}\left\{ \frac{\partial \gamma }{\partial x} (t,\zeta ) x_1(t)+ \frac{\partial \gamma }{\partial u}(t,\zeta )\beta (t)\right\} \widetilde{N}_\alpha (\mathrm {d}t,\mathrm {d}\zeta )\nonumber \\&+\left\{ \frac{\partial \eta }{\partial x}(t) x_1(t) +\frac{\partial \eta }{\partial u}(t)\beta (t)\right\} \cdot \mathrm {d}\widetilde{{\varPhi }}(t);\,\,t\in [0,T] \nonumber \\ x_1(0)= & {} 0 \end{aligned}$$
(20)

and

$$\begin{aligned} \mathrm {d}y_1(t)= & {} -\left\{ \frac{\partial g}{\partial x}(t)x_1(t) +\frac{\partial g}{\partial y}(t)y_1(t)+\frac{\partial g}{\partial z}(t)z_1(t) +\displaystyle \int _{{\mathbb {R}}_0} \nabla _k g(t,\zeta ) k_1(t,\zeta )\nu _\alpha (\mathrm {d}\zeta )\Big . \right. \nonumber \\&\left. +\sum _{j=1}^D \frac{\partial g}{\partial v^j}(t)v_1^j(t)\lambda _j(t) +\frac{\partial g}{\partial u}(t)\beta (t)\right\} \mathrm {d}t +z_1(t)\,\mathrm {d}B(t) \nonumber \\&+\displaystyle \int _{{\mathbb {R}}_0}k_1(t,\zeta ) \widetilde{N}_\alpha (\mathrm {d}\zeta , \mathrm {d}t) + v_1(t)\cdot \mathrm {d}\widetilde{{\varPhi }}(t) ;\,\,\,t\in [0,T]\nonumber \\ y_1(T)= & {} \frac{\partial h}{\partial x}(X(T),\alpha (T))x_1(T) . \end{aligned}$$
(21)

Remark 3.4

For sufficient conditions for the existence and uniqueness of solutions to (20) and (21), we refer the reader to [16, (4.1)]. A set of sufficient conditions under which the system (20)–(21) admits a unique solution is as follows:

  1.

    Assume that the coefficients \(b,\sigma , \gamma , \eta , g, f, \psi \) and \(\phi \) are continuous with respect to their arguments and are continuously differentiable with respect to \((x,y,z,k,v,u)\). (Here, the dependence of g and f on k is through \(\int _{{\mathbb {R}}_0}k(\zeta )\rho (t,\zeta )\nu (\mathrm {d}\zeta )\), where \(\rho \) is a measurable function satisfying \(0\le \rho (t,\zeta )\le c(1\wedge |\zeta |),\) \( \forall \zeta \in {\mathbb {R}}_0\). Hence, the differentiability in this argument is in the Fréchet sense.)

  2.

    The derivatives of \(b,\sigma , \gamma , \eta \) and g are bounded.

  3.

    The derivatives of f are bounded by \(C(1+|x|+|y|+(\int _{{\mathbb {R}}_0} |k(\cdot ,\zeta )|^2\nu (\mathrm {d}\zeta ))^{1/2}+|v|+|u|)\).

  4.

    The derivatives of \(\psi \) and \(\phi \) with respect to x are bounded by \(C(1+|x|).\)

Remark 3.5

Assumption A1 (which includes linear models) is common in the literature and allows one to construct the control step by step; see, for example, [15, 18, 30]. However, a drawback of this method is that it does not work when the set of controls is not convex.

Theorem 3.2

(Equivalent Maximum Principle) Let \(u\in {\mathscr {A}}_{{\mathscr {E}}}\) with corresponding solutions X(t) of (6), \((Y(t),Z(t),K(t,\zeta ),V(t))\) of (8), A(t) of (12), \((p(t),q(t),r(t,\zeta ),w(t))\) of (13) and corresponding derivative processes \(x_1(t)\) and \((y_1(t),z_1(t),k_1(t,\zeta ),v_1(t))\) given by (20) and (21), respectively. Suppose that Assumptions A1, A2 and A3 hold. Moreover, assume the following growth conditions

$$\begin{aligned}&E\left[ \int _0^T p^2(t)\left\{ \left( \frac{\partial \sigma }{\partial x} \right) ^2(t)x^2_1(t) +\left( \frac{\partial \sigma }{\partial u}\right) ^2(t) \beta ^2(t) +\int _{{\mathbb {R}}_0}\left( \left( \frac{\partial \gamma }{\partial x}\right) ^2(t,\zeta )x_1^2(t)\right. \right. \right. \nonumber \\&\left. \left. \left. +\left( \frac{\partial \gamma }{\partial u}\right) ^2(t,\zeta ) \beta ^2(t)\right) \nu _\alpha (\mathrm {d}\zeta ) +\sum _{j=1}^D \left( \left( \frac{\partial \eta ^j}{\partial x}\right) ^2(t)x^2_1(t)+\left( \frac{\partial \eta ^j}{\partial u}\right) ^2(t)\beta ^2(t) \right) \lambda _j(t)\right\} \mathrm {d}t \right. \nonumber \\&\left. +\int _0^Tx_1^2(t)\left\{ q^2(t)+\int _{{\mathbb {R}}_0}r^2(t,\zeta )\nu _\alpha (\mathrm {d}\zeta )+\sum _{j=1}^D (\eta ^j)^2(t)\lambda _j(t)\right\} \mathrm {d}t \right] <\infty \end{aligned}$$
(22)

and

$$\begin{aligned}&E\left[ \int _0^Ty_1^2(t) \left\{ (\frac{\partial H}{\partial z})^2 (t) \, +\,\int _{{\mathbb {R}}_0} \Vert \nabla _k H(t,\zeta ) \Vert ^2\nu _\alpha (\mathrm {d}\zeta )+ \sum _{j=1}^D(\frac{\partial H}{\partial v^j})^2 (t) \lambda _j(t)\right\} \mathrm {d}t \right. \nonumber \\&\left. \quad \ + \int _0^TA^2(t)\left\{ z_1^2(t)+\int _{{\mathbb {R}}_0}k_1^2(t,\zeta )\nu _\alpha (\mathrm {d}\zeta )+ \sum _{j=1}^D (v^j_1)^2(t)\lambda _j(t)\right\} \mathrm {d}t\right] <\infty . \end{aligned}$$
(23)

Then the following are equivalent:

(A) \(\frac{\mathrm {d}}{\mathrm {d}\ell }J^{(u+\ell \beta )}(t)\Big . \Big |_{\ell =0}=0 \text { for all bounded } \beta \in {\mathscr {A}}_{{\mathscr {E}}}.\)

(B) \(E\Big [\frac{\partial H}{\partial u} (t,X(t),\alpha (t),Y(t), Z(t), K(t,\cdot ),V(t),u,A(t),p(t),q(t),r(t,\cdot ),w(t))_{u=u(t)} \Big . \Big | {\mathscr {E}}_t\Big ]=0\) for almost all (a.a.) \( t \in [0,T].\)
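To indicate how condition (B) is used in practice, consider a hedged sketch with hypothetical constant coefficients \(b_1,\sigma _1\) (not a case treated in this paper) in which the Hamiltonian is quadratic in u, say \(H=-\frac{1}{2}u^2+u(b_1p+\sigma _1q)+(\text {terms free of } u)\). Since u(t) is \({\mathscr {E}}_t\)-measurable, (B) yields

$$\begin{aligned} 0=E\Big [\frac{\partial H}{\partial u}(t)\Big |{\mathscr {E}}_t\Big ]=-u(t)+E\big [b_1p(t)+\sigma _1q(t)\big |{\mathscr {E}}_t\big ],\quad \text {that is,}\quad u(t)=E\big [b_1p(t)+\sigma _1q(t)\big |{\mathscr {E}}_t\big ], \end{aligned}$$

a conditional expectation representation of the candidate optimal control.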

Remark 3.6

Let us observe that the two previous maximum principles require the existence and uniqueness of the solution to the BSDE satisfied by the adjoint processes. In the partial information case, such existence and uniqueness is not always guaranteed. Hence, in the next section, we propose a stochastic maximum principle via Malliavin calculus. In this approach, the adjoint processes depend on the coefficients of the system, their Malliavin derivatives and a modified Hamiltonian.

3.3 A Malliavin Calculus Approach

In this section, we present a method based on Malliavin calculus. The setup we adopt here is that of a Markov regime-switching forward–backward stochastic differential equation with jumps, as in the previous sections, and the notation is the same. For basic concepts of Malliavin calculus, we refer the reader to [31, 32].

In the sequel, let us denote by \(D^B_{t}F\) (respectively \(D^{\widetilde{N}_\alpha }_{t,\zeta } F\) and \(D^{\widetilde{{\varPhi }}}_t F\)) the Malliavin derivative in the direction of the Brownian motion B (respectively pure jump Lévy process \(\widetilde{N}_\alpha \) and the pure jump process \(\widetilde{{\varPhi }}\)) of a given (Malliavin differentiable) random variable \(F=F(\omega );\,\,\,\omega \in {\varOmega }\). We denote by \(\mathbb {D}_{1,2}\) the set of all random variables which are Malliavin differentiable with respect to \(B(\cdot ),\,\widetilde{N}_\alpha (\cdot ,\cdot )\) and \(\widetilde{{\varPhi }}(\cdot )\). A crucial argument in the proof of our general maximum principle rests on duality formulas for the Malliavin derivatives \( D_{t}\) and \(D_{t,\zeta }\) (see, e.g. [31, 32]):

$$\begin{aligned} E\left[ F\int _{0}^{T}\varphi (t)\mathrm {d}B(t)\right]&=E\left[ \int _{0}^{T}\varphi (t)D^B_{t}F\mathrm {d}t\right] , \end{aligned}$$
(24)
$$\begin{aligned} E\left[ F\int _{0}^{T}\int _{{\mathbb {R}}_{0}}\psi (t,\zeta )\widetilde{N}_\alpha (\mathrm {d}t,\mathrm {d}\zeta )\right]&=E\left[ \int _{0}^{T}\int _{{\mathbb {R}}_{0}}\psi (t,\zeta )D^{\widetilde{N}_\alpha }_{t,\zeta }F\nu _\alpha (\mathrm {d}\zeta )\mathrm {d}t\right] ,\end{aligned}$$
(25)
$$\begin{aligned} E\left[ F\int _{0}^{T}\varphi (t)\mathrm {d}\widetilde{{\varPhi }}(t) \right]&=E\left[ \int _{0}^{T}\varphi (t)D^{\widetilde{{\varPhi }}}_{t}F\lambda \mathrm {d}t\right] . \end{aligned}$$
(26)

These formulae hold for all Malliavin differentiable random variables F and all \({\mathscr {F}}_t\)-predictable processes \(\varphi \) and \(\psi \) such that the integrals on the right-hand side converge absolutely.
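As a sanity check of (24) (a standard computation, included only for orientation), take \(F=B(T)\) and \(\varphi \equiv 1\); then \(D^B_tB(T)=1\) for \(t\le T\), and both sides equal T:

$$\begin{aligned} E\Big [B(T)\int _0^T\mathrm {d}B(t)\Big ]=E\big [B(T)^2\big ]=T=E\Big [\int _0^TD^B_tB(T)\,\mathrm {d}t\Big ]. \end{aligned}$$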

We also need some basic properties of the Malliavin derivatives. Let \(F \in \mathbb {D}_{1,2}\) be an \({\mathscr {F}}_s\)-measurable random variable; then \( D^B_{t}F=D^{\widetilde{N}_\alpha }_{t,\zeta }F=D^{\widetilde{{\varPhi }}}_{t}F=0 \text { for all } t>s.\) We also have the following results, known as the fundamental theorems of calculus:

$$\begin{aligned} D^B_s\left( \int _0^t\varphi (r)\,\mathrm {d}B(r) \right)&=\varphi (s)1_{[0,t]}(s) +\int _s^tD^B_s\varphi (r)\,\mathrm {d}B(r),\end{aligned}$$
(27)
$$\begin{aligned} D^{\widetilde{N}_\alpha }_{s,\zeta }\left( \int _0^t \int _{ {\mathbb {R}}_0} \psi (r,\zeta )\widetilde{N}_\alpha (\mathrm {d}r,\mathrm {d}\zeta )\right)&=\psi (s,\zeta )1_{[0,t]}(s) +\int _s^t\int _{ {\mathbb {R}}_0} D^{\widetilde{N}_\alpha }_{s,\zeta } \psi (r,\zeta )\widetilde{N}_\alpha (\mathrm {d}r,\mathrm {d}\zeta ) ,\end{aligned}$$
(28)
$$\begin{aligned} D^{\widetilde{{\varPhi }}}_{s}\left( \int _0^t \varphi (r)\mathrm {d}\widetilde{{\varPhi }}( r)\right)&=\varphi (s)1_{[0,t]}(s) +\int _s^t D^{\widetilde{{\varPhi }}}_{s} \varphi (r)\mathrm {d}\widetilde{{\varPhi }}( r) , \end{aligned}$$
(29)

under the assumption that all the terms involved are well defined and belong to \(\mathbb {D}_{1,2}\).
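For instance (again a standard check), applying (27) with \(\varphi (r)=B(r)\) gives, for \(s\le t\),

$$\begin{aligned} D^B_s\Big (\int _0^tB(r)\,\mathrm {d}B(r)\Big )=B(s)+\int _s^t\mathrm {d}B(r)=B(t), \end{aligned}$$

which agrees with applying the chain rule to the closed form \(\int _0^tB(r)\,\mathrm {d}B(r)=\frac{1}{2}(B^2(t)-t)\).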

In view of the optimization problem (10), we define the following processes. Suppose that, for all \(u\in {\mathscr {A}}_{{\mathscr {E}}}\), the processes

$$\begin{aligned} \kappa (t)&:= \nabla _xh (X (T),\alpha (T))\widetilde{A}(T) + \nabla _x\varphi (X (T),\alpha (T)) \nonumber \\&\quad \ + \int _{t}^{T}\frac{\partial f }{\partial x }(s,X(s),\alpha (s),Y(s), Z(s), K(s,\cdot ),V(s),u(s)) \mathrm {d}s, \end{aligned}$$
(30)
$$\begin{aligned} H_{0}\left( t,x,e_i,y,z,k,v,u,\widetilde{a},\kappa \right)&:= \widetilde{a} g(t,x,e_i,y,z,k,v,u) + \kappa (t)b(t,x,e_i,u)\nonumber \\&\quad \ +D_t^B\kappa (t)\sigma (t,x,e_i,u) \nonumber \\&\quad \ +\int _{{\mathbb {R}}_0}D_{t,\zeta }^{\widetilde{N}}\kappa (t) \gamma (t,x,e_i,u,\zeta )\nu _i(\mathrm {d}\zeta )\nonumber \\&\quad \ +\sum _{j=1}^DD_{t}^{\widetilde{{\varPhi }_j}} \kappa (t)\eta ^j(t,x,e_i,u)\lambda _{ij}, \end{aligned}$$
(31)
$$\begin{aligned} F(T):= \frac{\partial h}{\partial x} (X (T),\alpha (T))\tilde{A}(T) + \frac{\partial \varphi }{\partial x} (X (T),\alpha (T)), \end{aligned}$$
(32)
$$\begin{aligned} {\varTheta }(t,s):= \frac{\partial H_{0}}{\partial x }(s)G(t,s), \end{aligned}$$
(33)
$$\begin{aligned} G(t,s)&:=\exp \left( \int _t^s\left\{ \frac{\partial b}{\partial x }(r) -\frac{1}{2}\left( \frac{\partial \sigma }{\partial x }(r)\right) ^2\right. \right. \nonumber \\&\quad \ +\int _{{\mathbb {R}}_0}\left( \ln \left( 1+\frac{\partial \gamma }{\partial x }\left( r,\zeta \right) \right) -\frac{\partial \gamma }{\partial x }\left( r,\zeta \right) \right) \nu _\alpha (\mathrm {d}\zeta ) \nonumber \\&\quad \left. + \sum _{j=1}^D \left( \ln \left( 1+\frac{\partial \eta ^j}{\partial x }(r)\right) -\frac{\partial \eta ^j}{\partial x }(r)\right) \lambda _j(r) \right\} \mathrm {d}r +\int _t^s\frac{\partial \sigma }{\partial x }\left( r\right) \mathrm {d}B(r)\nonumber \\&\quad \ +\int _t^s\int _{{\mathbb {R}}_0}\ln \left( 1+\frac{\partial \gamma }{\partial x } \left( r,\zeta \right) \right) \widetilde{N}_\alpha (\mathrm {d}\zeta ,\mathrm {d}r)\nonumber \\&\quad \left. +\sum _{j=1}^D\int _t^s \ln \left( 1+\frac{\partial \eta ^j}{\partial x }(r)\right) \mathrm {d}\widetilde{{\varPhi }_j}( r) \right) , \end{aligned}$$
(34)

are all well defined. In (35) and in the sequel, we use the shorthand notation \(H_0(t)=H_{0}\Big (t,X(t),\alpha (t),Y(t),Z(t),K(t,\cdot ),V(t),u, \widetilde{A}(t),\kappa (t) \Big )\). We also assume that the following modified adjoint processes \((\tilde{p}(t),\tilde{q}(t),\tilde{r}(t,\zeta ),\tilde{w}(t))\) and \(\tilde{A}(t)\) given by

$$\begin{aligned} \tilde{p}(t)&:=\kappa (t)+\int _{t}^{T} \frac{\partial H_{0}}{\partial x } (s)G(t,s)\mathrm {d}s , \end{aligned}$$
(35)
$$\begin{aligned} \tilde{q}(t)&:= D^B_{t}\tilde{p}(t) , \end{aligned}$$
(36)
$$\begin{aligned} \tilde{r}(t,\zeta )&:=D^{\widetilde{N}_\alpha }_{t,\zeta }\tilde{p}(t) , \end{aligned}$$
(37)
$$\begin{aligned} \tilde{w}^j(t)&:=D_{t}^{\widetilde{{\varPhi }_j}}\tilde{p}(t),\,\,\,j=1,\ldots ,D \end{aligned}$$
(38)

and

$$\begin{aligned} \begin{array}{ll} \mathrm {d}\tilde{A} (t) &{}= \frac{\partial H}{\partial y} (t) \,\mathrm {d}t + \frac{\partial H}{\partial z} (t) \mathrm {d}B(t) +\displaystyle \int _{{\mathbb {R}}_0} \frac{\mathrm {d}\nabla _k H}{\mathrm {d}\nu (\zeta )} (t,\zeta ) \,\widetilde{N}_\alpha (\mathrm {d}\zeta ,\mathrm {d}t)\\ &{}\quad + \nabla _vH(t)\cdot \mathrm {d}\widetilde{{\varPhi }}(t);\,\,\,t\in [0,T]\\ \tilde{A}(0) = \psi ^\prime (Y(0)). \end{array} \end{aligned}$$
(39)

are well defined. Here, the general Hamiltonian H is given by (11), with \(p, q, r, w\) replaced by \(\tilde{p}, \tilde{q}, \tilde{r}, \tilde{w}\).
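Let us also note (a reformulation under the standing differentiability assumptions, not an additional hypothesis) that G(t,s) in (34) is a Doléans-Dade stochastic exponential: by Itô's formula it satisfies, for \(s\in [t,T]\),

$$\begin{aligned} \mathrm {d}_sG(t,s)=G(t,s^-)\Big [\frac{\partial b}{\partial x}(s)\,\mathrm {d}s+\frac{\partial \sigma }{\partial x}(s)\,\mathrm {d}B(s)+\int _{{\mathbb {R}}_0}\frac{\partial \gamma }{\partial x}(s,\zeta )\,\widetilde{N}_\alpha (\mathrm {d}\zeta ,\mathrm {d}s)+\sum _{j=1}^D\frac{\partial \eta ^j}{\partial x}(s)\,\mathrm {d}\widetilde{{\varPhi }}_j(s)\Big ],\qquad G(t,t)=1, \end{aligned}$$

that is, G is the fundamental solution of the homogeneous part of the variational equation (20).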

Remark 3.7

Assume that the coefficients of the control problem satisfy the conditions for existence and uniqueness of solution to systems (6)–(8). Assume moreover that conditions in Remark 3.4 hold. Then, the processes given by (30)–(39) are well defined. The conditions on existence of processes defined in (30)–(39) play an important role in the proof of Theorem 3.3. For example, if the Malliavin differentiability of the process \(\tilde{p}\) with respect to \(B,\widetilde{N}_\alpha , \widetilde{{\varPhi }}\) is not guaranteed, then the theorem cannot be proved. We can now state a general stochastic maximum principle for our control problem (10):

Theorem 3.3

Let \(u\in {\mathscr {A}}_{{\mathscr {E}}}\) with corresponding solutions X(t) of (6), \((Y(t),Z(t),K(t,\zeta ),V(t))\) of (8), \(\tilde{A}(t)\) of (39), \(\tilde{p}(t),\tilde{q}(t),\tilde{r}(t,\zeta ),\tilde{w}^j(t)\) of (35)–(38) and corresponding derivative processes \(x_1(t)\) and \((y_1(t),z_1(t),k_1(t,\zeta ),v_1(t))\) given by (20) and (21), respectively. Suppose that Assumptions A1, A2 and A3 hold. Moreover, suppose that the random variables \(F(T),{\varTheta }(t,s)\) given by (32) and (33), and \(\frac{\partial f}{\partial x}(t)\) are Malliavin differentiable with respect to \(B,\widetilde{N}\) and \(\widetilde{{\varPhi }}\). Furthermore, suppose the following integrability conditions:

$$\begin{aligned}&E\left[ \int _0^T\Big \{ \Big (\frac{\partial \sigma }{\partial x}\Big )^2(t) x^2_1(t) +\Big (\frac{\partial \sigma }{\partial u}\Big )^2(t)\beta ^2(t) +\int _{{\mathbb {R}}_0}\Big ( \Big (\frac{\partial \gamma }{\partial x}\Big )^2 (t,\zeta )x_1^2(t)\right. \nonumber \\ {}&\quad \ +\Big (\frac{\partial \gamma }{\partial u}\Big )^2(t,\zeta ) \beta ^2(t)\Big )\nu _\alpha (\mathrm {d}\zeta )+\sum _{j=1}^D \Big (\Big (\frac{\partial \eta ^j}{\partial x}\Big ) ^2(t)x^2_1(t)\nonumber \\&\quad \ \left. \Big .+\Big (\frac{\partial \eta ^j}{\partial u}\Big )^2(t) \beta ^2(t)\Big )\lambda _j(t)\Big \} \mathrm {d}t\right]<\infty , \nonumber \\&E\left[ \int _0^T\int _0^T\left\{ \Big (D^B_sF(T)\Big )^2+\int _{{\mathbb {R}}_0} \Big (D^{\widetilde{N}_\alpha }_{s,\zeta }F(T)\Big )^2 \nu _\alpha (\mathrm {d}\zeta )\right. \right. \nonumber \\&\left. \left. \quad \ +\sum _{j=1}^D \Big (D^{\widetilde{{\varPhi }}_j}_{s}F(T)\Big )^2\lambda _j(t)\right\} \mathrm {d}s\,\mathrm {d}t \right]<\infty \nonumber ,\\&E\left[ \int _0^T\int _0^T\left\{ \Big (D^B_s\Big ( \frac{\partial f }{\partial x }(t)\Big )\Big )^2 +\int _{{\mathbb {R}}_0}\Big (D^{\widetilde{N}_\alpha }_{s,\zeta } \Big ( \frac{\partial f }{\partial x }(t)\Big )\Big )^2 \nu _\alpha (\mathrm {d}\zeta )\right. \right. \nonumber \\&\left. \left. \quad \ +\sum _{j=1}^D \left( D^{\widetilde{{\varPhi }}_j}_{s}\Big ( \frac{\partial f }{\partial x } (t)\Big )\right) ^2\lambda _j(t)\right\} \mathrm {d}s\,\mathrm {d}t \right]<\infty ,\nonumber \\&E\left[ \int _0^T\int _0^T\left\{ \Big (D^B_s {\varTheta }(t,s)\Big )^2 +\int _{{\mathbb {R}}_0}\Big (D^{\widetilde{N}_\alpha }_{s,\zeta }{\varTheta }(t,s)\Big )^2 \nu _\alpha (\mathrm {d}\zeta )\right. \right. \nonumber \\&\quad \ \left. \left. +\sum _{j=1}^D \Big (D^{\widetilde{{\varPhi }}_j}_{s}{\varTheta }(t,s)\Big )^2\lambda _j(t)\right\} \mathrm {d}s\,\mathrm {d}t \right] <\infty . \end{aligned}$$
(40)

Then, the following are equivalent:

(A) \(\frac{\mathrm {d}}{\mathrm {d}\ell }J^{(u+\ell \beta )}(t)\Big . \Big |_{\ell =0}=0 \text { for all bounded } \beta \in {\mathscr {A}}_{{\mathscr {E}}}.\)

(B) \(E\Big [\frac{\partial H}{\partial u} (t,X(t),\alpha (t),Y(t), Z(t), K(t,\cdot ),V(t),u,\tilde{A}(t),\tilde{p}(t),\tilde{q}(t),\tilde{r}(t,\cdot ),\tilde{w}(t))_{u=u(t)}\Big . \Big | {\mathscr {E}}_t\Big ] =0\) for a.a. \((t ,\omega )\in [0,T]\times {\varOmega }\).

Remark 3.8

Assume that conditions in Remark 3.7 hold. Assume moreover that the coefficients are twice continuously differentiable with the second-order derivatives satisfying the conditions in Remark 3.7. Then, \(F(T),{\varTheta }(t,s)\) and \(\frac{\partial f}{\partial x}(t)\) are Malliavin differentiable with respect to \(B,\widetilde{N}_{\alpha }\) and \(\widetilde{{\varPhi }}\).

4 Proof of the Results

In this section, we prove the main results.

Proof

(Proof of Theorem 3.1) We prove that \(J(x,\widehat{u},e_i)\ge J(x,u,e_i) \text { for all } u \in {\mathscr {A}}_{{\mathscr {E}}}.\)

Choose \(u \in {\mathscr {A}}_{{\mathscr {E}}}\) and consider

$$\begin{aligned} J(x,u,e_i)-J(x,\widehat{u},e_i)=I_1+I_2+I_3, \end{aligned}$$
(41)

where

$$\begin{aligned} I_1&=E\left[ \int _0^T \left\{ f(t,X(t),\alpha (t),Y(t), Z(t), K(t,\cdot ),V(t),u(t))\right. \right. \nonumber \\&\left. \left. \quad \ -f(t, \widehat{X}(t),\alpha (t), \widehat{Y}(t), \widehat{Z}(t), \widehat{K}(t,\cdot ), \widehat{V}(t), \widehat{u}(t)) \right\} \,\mathrm {d}t \right] , \end{aligned}$$
(42)
$$\begin{aligned} I_2&=E\Big [ \varphi (X(T),\alpha (T)) -\varphi (\widehat{X}(T), \alpha (T))\Big ], \end{aligned}$$
(43)
$$\begin{aligned} I_3&=E\Big [\psi (Y(0))\,-\,\psi (\widehat{Y}(0))\Big ]. \end{aligned}$$
(44)

By the definition of H, we get

$$\begin{aligned} I_1&=E\left[ \int _0^T \left\{ H(t)-\widehat{H}(t)- \widehat{A}(t)(g(t) -\widehat{g}(t))-\widehat{p}(t)(b(t)-\widehat{b}(t))-\widehat{q}(t)(\sigma (t) -\widehat{\sigma }(t))\Big .\Big .\right. \right. \nonumber \\&\left. \left. \quad \ \Big .\Big . -\int _{{\mathbb {R}}_0}\widehat{r}(t,\zeta )(\gamma (t,\zeta ) -\widehat{\gamma }(t,\zeta ))\nu _\alpha (\,\mathrm {d}\zeta ) -\sum _{j=1}^D\widehat{w}^j(t)(\eta ^j(t)-\widehat{\eta }^j(t) )\lambda _{j}(t) \right\} \,\mathrm {d}t\right] . \end{aligned}$$
(45)

By the concavity of \(\varphi \) in x, Itô’s formula (see, e.g. [5, Theorem 4.1]), (6), (13) and (17), we get

$$\begin{aligned} I_2&\le E\left[ \frac{\partial \varphi }{\partial x} (\widehat{X}(T),\alpha (T))(X(T) -\widehat{X}(T)) \right] \nonumber \\&= E\left[ \widehat{p}(T) (X(T)-\widehat{X}(T))\right] -E\left[ \widehat{A}(T) \frac{\partial h}{\partial x}(\widehat{X}(T),\alpha (T))(X(T) -\widehat{X}(T))\right] \nonumber \\&= E\left[ \int _0^T\left\{ \widehat{p}(t) (b(t)-\widehat{b}(t))\,\mathrm {d}t +(X(t^-)-\widehat{X}(t^-)) \left( -\frac{\partial \widehat{H}}{\partial x}(t)\right) +(\sigma (t) -\widehat{\sigma }(t))\widehat{q}(t) \right. \right. \nonumber \\&+\left. \left. \int _{{\mathbb {R}}_0}(\gamma (t,\zeta )-\widehat{\gamma }(t,\zeta ))\widehat{r}(t,\zeta ) \nu _\alpha (\mathrm {d}\zeta )+ \sum _{j=1}^D\widehat{w}^j(t)(\eta ^j(t) -\widehat{\eta }^j(t) )\lambda _{j}(t) \right\} \,\mathrm {d}t \right] \nonumber \\&-E\Big [\widehat{A}(T)\frac{\partial h}{\partial x} (\widehat{X}(T), \alpha (T))(X(T)-\widehat{X}(T))\Big ]. \end{aligned}$$
(46)

By the concavity of \(\psi \) and h, Itô’s formula, (8) and (12), we get

$$\begin{aligned} I_3&\le E\Big [\psi ^\prime (\widehat{Y}(0))(Y(0)-\widehat{Y}(0))\Big ]\nonumber \\&= E\Big [\widehat{A}(0)(Y(0)-\widehat{Y}(0))\Big ]\nonumber \\&= E\Big [\widehat{A}(T) \{h(X(T),\alpha (T))-h(\widehat{X}(T),\alpha (T))\} \Big ] -E\left[ \int _0^T \left\{ \frac{\partial \widehat{H}}{\partial y}(t) (Y(t) -\widehat{Y}(t)) \right. \right. \nonumber \\&\quad \ +\widehat{A}(t) (-g(t)+\widehat{g}(t)) + (Z(t)-\widehat{Z}(t))\frac{\partial \widehat{H}}{\partial z}(t) \nonumber \\&\left. \left. \quad \ +\int _{{\mathbb {R}}_0}(K(t,\zeta )-\widehat{K}(t,\zeta )) \nabla _k \widehat{H}(t,\zeta ) \nu _\alpha (d\zeta )+\sum _{j=1}^D\frac{\partial \widehat{H}}{\partial v^j}(t) (V^j(t)-\widehat{V}^j(t) )\lambda _{j}(t) \right\} \,\mathrm {d}t \right] \nonumber \\&\le E\left[ \widehat{A}(T)\frac{\partial h}{\partial x}( \widehat{X}(T), \alpha (T))(X(T)-\widehat{X}(T)) \right] -E\left[ \int _0^T\left\{ \frac{\partial \widehat{H}}{\partial y}(t) (Y(t)-\widehat{Y}(t)) \right. \right. \nonumber \\&\quad \ +\widehat{A}(t) (-g(t)+\widehat{g}(t)) + (Z(t)-\widehat{Z}(t))\frac{\partial \widehat{H}}{\partial z}(t) \nonumber \\&\left. \left. \quad \ +\int _{{\mathbb {R}}_0}(K(t,\zeta )-\widehat{K}(t,\zeta ))\nabla _k \widehat{H}(t,\zeta ) \nu _\alpha (d\zeta ) + \sum _{j=1}^D\frac{\partial \widehat{H}}{\partial v^j}(t) (V^j(t)-\widehat{V}^j(t) )\lambda _{j}(t) \right\} \,\mathrm {d}t \right] . \end{aligned}$$
(47)

Summing (45)–(47), we have

$$\begin{aligned} I_1+I_2+I_3&\le E\left[ \int _0^T \Big \{H(t)- \widehat{H}(t) -\frac{\partial \widehat{H}}{\partial x}(t)(X(t)-\widehat{X}(t)) -\frac{\partial \widehat{H}}{\partial y}(t)(Y(t)-\widehat{Y}(t)) \Big . \right. \nonumber \\&\quad \ \Big . -\frac{\partial \widehat{H}}{\partial z}(t)(Z(t)-\widehat{Z}(t)) -\int _{{\mathbb {R}}_0}(K(t,\zeta )-\widehat{K}(t,\zeta ))\nabla _k \widehat{H}(t,\zeta ) \nu _\alpha (\mathrm {d}\zeta )\nonumber \\&\quad \left. - \sum _{j=1}^D\frac{\partial \widehat{H}}{\partial v^j}(t) (V^j(t)-\widehat{V}^j(t) ) \lambda _{j}(t) \Big \} \mathrm {d}t\right] . \end{aligned}$$
(48)

One can show, using arguments similar to those in [33] (see also [5]), that the right-hand side of (48) is non-positive. For the sake of completeness, we give the details here. Fix \(t\in [0,T]\). Since \(\widetilde{H}(x,y,z,k,v)\) is concave, it follows by the standard hyperplane argument (see, e.g. [34, Chapter 5, Section 23]) that there exists a subgradient \(d=(d_1,d_2,d_3,d_4(\cdot ),d_5) \in {\mathbb {R}}^3\times \mathscr {R}\times {\mathbb {R}}^D\) for \(\widetilde{H}(x,y,z,k,v)\) at \(x=\widehat{X}(t),\,y=\widehat{Y}(t),\,z=\widehat{Z}(t),\) \(k=\widehat{K}(t,\cdot ),\,v=\widehat{V}(t)\) such that, if we define \(i(x,y,z,k,v)\) by

$$\begin{aligned} i(x,y,z,k,v)&:=\widetilde{H}(x,y,z,k,v)-\widehat{H}(t)-d_1 (x-\widehat{X}(t))-d_2(y-\widehat{Y}(t))\nonumber \\&\quad \ -d_3(z-\widehat{Z}(t))-\int _{{\mathbb {R}}_0} d_4(\zeta )(k(\zeta )-\widehat{K}(t,\zeta )) \nu _\alpha (\mathrm {d}\zeta )\nonumber \\&\quad \ -\sum _{j=1}^Dd_5^j (v^j-\widehat{V}^j(t) )\lambda _{j}(t) , \end{aligned}$$
(49)

then \(i(x,y,z,k,v)\le 0\) for all \((x,y,z,k,v)\).

Furthermore, we have \(i(\widehat{X}(t),\widehat{Y}(t),\widehat{Z}(t),\widehat{K}(t,\cdot ),\widehat{V}(t))=0\), so that i attains its maximum at this point. It follows that

$$\begin{aligned} d_1=\frac{\partial \widetilde{H}}{\partial x}(\widehat{X}(t),\widehat{Y}(t), \widehat{Z}(t),\widehat{K}(t,\cdot ),\widehat{V}(t)),\\ d_2=\frac{\partial \widetilde{H}}{\partial y}(\widehat{X}(t),\widehat{Y}(t), \widehat{Z}(t),\widehat{K}(t,\cdot ),\widehat{V}(t)),\\ d_3=\frac{\partial \widetilde{H}}{\partial z}(\widehat{X}(t),\widehat{Y}(t), \widehat{Z}(t),\widehat{K}(t,\cdot ),\widehat{V}(t)),\\ d_4=\nabla _k \widetilde{H} (\widehat{X}(t),\widehat{Y}(t),\widehat{Z}(t), \widehat{K}(t,\cdot ),\widehat{V}(t)),\\ d_5^j=\frac{\partial \widetilde{H}}{\partial v^j}(\widehat{X}(t), \widehat{Y}(t),\widehat{Z}(t),\widehat{K}(t,\cdot ),\widehat{V}(t)). \end{aligned}$$

Substituting this into (48), using conditions 2. and 3. in Theorem 3.1, and the concavity of \(\widetilde{H}\), we conclude that \(J(x,\widehat{u},e_i)\ge J(x,u,e_i) \text { for all } u \in {\mathscr {A}}_{{\mathscr {E}}}.\) This completes the proof. \(\square \)

Proof

(Proof of Theorem 3.2) We have that

$$\begin{aligned}&\frac{\mathrm {d}}{\mathrm {d}\ell }J^{(u+\ell \beta )}(t)\Big . \Big |_{\ell =0}\nonumber \\&\quad =E\left[ \int _0^T \left\{ \frac{\partial f}{\partial x}(t)x_1(t) +\frac{\partial f}{\partial y}(t)y_1(t)+\frac{\partial f}{\partial z}(t) z_1(t) +\int _{{\mathbb {R}}_0} \nabla _k f (t,\zeta )k_1(t,\zeta )\nu _\alpha (\mathrm {d}\zeta )\right. \right. \nonumber \\&\qquad \ \left. \left. + \sum _{j=1}^D \frac{\partial f}{\partial v^j}(t)v_1^j(t)\lambda _j(t)+ \frac{\partial f}{\partial u}(t)\beta (t)\right\} \mathrm {d}t+\frac{\partial \varphi }{\partial x} (X(T),\alpha (T))x_1(T)+\psi ^\prime (Y(0))y_1(0)\right] . \end{aligned}$$
(50)

By (13), Itô’s formula, (20) and (22), we have

$$\begin{aligned}&E\left[ \frac{\partial \varphi }{\partial x}(X(T),\alpha (T))x_1(T)\right] \nonumber \\&=E\Big [p(T)x_1(T)\Big ]- E\left[ \frac{\partial h}{\partial x}(X(T),\alpha (T))A(T)x_1(T)\right] \nonumber \\&=E\left[ \int _0^T\left\{ p(t)\left( \frac{\partial b}{\partial x}(t)x_1(t)+\frac{\partial b}{\partial u}(t)\beta (t)\right) -x_1(t) \frac{\partial H}{\partial x}(t)\right. \right. \nonumber \\&\qquad \ +q(t)\left( \frac{\partial \sigma }{\partial x}(t)x_1(t)+\frac{\partial \sigma }{\partial u}(t)\beta (t)\right) \nonumber \\&\qquad \ +\int _{{\mathbb {R}}_0} r(t,\zeta )\left( \frac{\partial \gamma }{\partial x}(t,\zeta )x_1(t) +\frac{\partial \gamma }{\partial u}(t,\zeta )\beta (t)\right) \nu _\alpha (\mathrm {d}\zeta )\nonumber \\&\left. \left. \qquad \ +\sum _{j=1}^Dw^j(t)\Big (\frac{\partial \eta ^j}{\partial x}(t)x_1(t)+\frac{\partial \eta ^j}{\partial u}(t)\beta (t) \Big )\lambda _{j}(t) \right\} \mathrm {d}t\right] \nonumber \\&\qquad \ - E\Big [\frac{\partial h}{\partial x}(X(T),\alpha (T))A(T)x_1(T)\Big ]. \end{aligned}$$
(51)

By (12), Itô’s formula, (21) and (23), we get

$$\begin{aligned}&E\Big [\psi ^\prime (Y(0))y_1(0)\Big ] \nonumber \\&\quad = E\Big [A(0)y_1(0)\Big ]\nonumber \\&\quad = E\Big [A(T)y_1(T)\Big ] - E\left[ \int _0^T\left\{ A(t^-)\,\mathrm {d}y_1(t)+y_1(t^-)\,\mathrm {d}A(t) +\frac{\partial H}{\partial z}(t)z_1(t)\,\mathrm {d}t\right. \right. \nonumber \\&\qquad \ \left. \left. +\int _{{\mathbb {R}}_0}\nabla _k H (t,\zeta )k_1(t,\zeta )\nu _\alpha (\mathrm {d}\zeta )\,\mathrm {d}t+\sum _{j=1}^D\frac{\partial H}{\partial v^j}(t) v^j_1(t)\lambda _{j}(t) \,\mathrm {d}t \right\} \right] \nonumber \\&\quad = E\left[ \frac{\partial h}{\partial x}(X(T),\alpha (T))A(T)x_1(T)+\int _0^T\left\{ A(t)\left( \frac{\partial g}{\partial x}(t)x_1(t) +\frac{\partial g}{\partial y}(t)y_1(t) +\frac{\partial g}{\partial z}(t)z_1(t) \right. \right. \right. \nonumber \\&\left. \qquad \ +\int _{{\mathbb {R}}_0} \nabla _k g (t,\zeta ) k_1(t,\zeta )\nu _\alpha (\mathrm {d}\zeta ) + \sum _{j=1}^D\frac{\partial g}{\partial v^j}(t) v^j_1(t)\lambda _{j}(t) +\frac{\partial g}{\partial u}(t)\beta (t) \right) -\frac{\partial H}{\partial y}(t)y_1(t) \nonumber \\&\left. \left. \qquad \ -\frac{\partial H}{\partial z}(t)z_1(t)-\int _{{\mathbb {R}}_0} \nabla _k H (t,\zeta ) k_1(t,\zeta )\nu _\alpha (\mathrm {d}\zeta ) -\sum _{j=1}^D\frac{\partial H}{\partial v^j}(t) v^j_1(t)\lambda _{j}(t) \right. \}\mathrm {d}t\right] . \end{aligned}$$
(52)

Substituting (51) and (52) into (50), we get

$$\begin{aligned}&\frac{\mathrm {d}}{\mathrm {d}\ell }J^{(u+\ell \beta )}(t)\Big . \Big |_{\ell =0}\nonumber \\&=E\left[ \int _0^T\left( x_1(t) \left\{ \frac{\partial f}{\partial x}(t) +A(t)\frac{\partial g}{\partial x}(t)+p(t)\frac{\partial b}{\partial x}(t)+q(t)\frac{\partial \sigma }{\partial x}(t)\right. \right. \right. \nonumber \\&\left. \qquad \ +\int _{{\mathbb {R}}_0} r(t,\zeta )\frac{\partial \gamma }{\partial x}(t,\zeta )\nu _\alpha (\mathrm {d}\zeta )+\sum _{j=1}^Dw^j(t)\frac{\partial \eta ^j}{\partial x}(t)\lambda _{j}(t) -\frac{\partial H}{\partial x}(t)\right\} \nonumber \\&\qquad \ + y_1(t) \left\{ \frac{\partial f}{\partial y}(t) +A(t)\frac{\partial g}{\partial y}(t)-\frac{\partial H}{\partial y}(t)\right\} +z_1(t) \left\{ \frac{\partial f}{\partial z}(t) +A(t)\frac{\partial g}{\partial z}(t)-\frac{\partial H}{\partial z}(t)\right\} \nonumber \\&\qquad \ +\int _{{\mathbb {R}}_0}k_1(t,\zeta ) \Big \{\nabla _kf(t,\zeta )+A(t)\nabla _kg(t,\zeta )-\nabla _kH(t,\zeta )\Big \}\nu _\alpha (\mathrm {d}\zeta ) \nonumber \\&\qquad \ + \sum _{j=1}^Dv^j_1(t) \left\{ \frac{\partial f}{\partial v^j}(t) +A(t)\frac{\partial g}{\partial v^j}(t)-\frac{\partial H}{\partial v^j}(t)\right\} \nonumber \\&\qquad \ +\beta (t) \left\{ \frac{\partial f}{\partial u}(t) +A(t)\frac{\partial g}{\partial u}(t)+p(t)\frac{\partial b}{\partial u}(t)+q(t)\frac{\partial \sigma }{\partial u}(t)\right. \nonumber \\&\left. \left. \left. \qquad \ +\int _{{\mathbb {R}}_0} r(t,\zeta )\frac{\partial \gamma }{\partial u}(t,\zeta )\nu _\alpha (\mathrm {d}\zeta )+\sum _{j=1}^Dw^j(t)\frac{\partial \eta ^j}{\partial u}(t)\lambda _{j}(t) \right\} \right) \mathrm {d}t\right] . \end{aligned}$$
(53)

By the definition of H, the coefficients of \(x_1(t),y_1(t),z_1(t), k_1(t,\zeta )\) and \(v_1(t)\) are all equal to zero in (53). Hence, if

$$\begin{aligned} \frac{\mathrm {d}}{\mathrm {d}\ell }J^{(u+\ell \beta )}(t)\Big . \Big |_{\ell =0}=0 \text { for all bounded } \beta \in {\mathscr {A}}_{{\mathscr {E}}}, \end{aligned}$$

it follows that

$$\begin{aligned} E\Big [\displaystyle \int _0^T\frac{\partial H}{\partial u}(t)\beta (t) \,\mathrm {d}t \Big ]=0 \text { for all bounded } \beta \in {\mathscr {A}}_{{\mathscr {E}}}. \end{aligned}$$

This holds in particular for \(\beta \in {\mathscr {A}}_{{\mathscr {E}}}\) of the form \( \beta (t)=\beta _{t_0}(t,\omega )=\theta (\omega )\chi _{[t_0,T]}(t)\) for a fixed \(t_0\in [0,T[\), where \(\theta (\omega )\) is a bounded \({\mathscr {E}}_{t_0}\)-measurable random variable. Hence

$$\begin{aligned} E\Big [\displaystyle \int _{t_0}^T\frac{\partial H}{\partial u}(t)\,\mathrm {d}t\,\theta \Big ]=0. \end{aligned}$$

Differentiating with respect to \(t_0\), we have

$$\begin{aligned} E\Big [\frac{\partial H}{\partial u}(t_0)\,\theta \Big ]=0 \text { for a.a. } t_0. \end{aligned}$$

Since this equality holds for all bounded \({\mathscr {E}}_{t_0}\)-measurable random variables \(\theta \), we conclude that

$$\begin{aligned} E\Big [\frac{\partial H}{\partial u}(t_0)|{\mathscr {E}}_{t_0}\Big ]=0 \text { for a.a., } t_0\in [0,T]. \end{aligned}$$

This shows that (A) \(\Rightarrow \) (B).

Conversely, using the fact that every bounded \(\beta \in {\mathscr {A}}_{{\mathscr {E}}}\) can be approximated by linear combinations of controls \(\beta (t)\) of the form (18), the above argument can be reversed to show that (B) \(\Rightarrow \) (A). \(\square \)

Proof

(Proof of Theorem 3.3) (A) \(\Rightarrow \) (B). We split this proof into two steps: in the first step, we show that the directional derivative of the performance functional can be written as the sum of the two terms \(J_1\) and \(J_2\) given by (55) and (56). In the second step, we show that condition (A) implies condition (B).

Lemma 4.1

Assume that the conditions of Theorem 3.3 hold and let \(\beta =\beta _{\theta }(s)=\theta (\omega )\chi _{(t,t+h]}(s)\), where \(\theta (\omega )\) is bounded and \({\mathscr {E}}_t\)-measurable and \(0\le t\le t+h\le T\). Then

$$\begin{aligned} \frac{\mathrm {d}}{\mathrm {d}\ell }J^{(u+\ell \beta )}(t)\Big . \Big |_{\ell =0}= J_1(h)+J_2(h), \end{aligned}$$
(54)

where

$$\begin{aligned} J_1(h)&=E\left[ \int _t^T \left\{ \kappa (s) \frac{\partial b}{\partial x}(s)+ D_s^B\kappa (s)\frac{\partial \sigma }{\partial x}(s) +\int _{{\mathbb {R}}_0} D_{s,\zeta }^{\widetilde{N}_\alpha }\kappa (s) \frac{\partial \gamma }{\partial x}(s,\zeta ) \nu _\alpha (\mathrm {d}\zeta )\right. \right. \nonumber \\&\left. \left. \quad \ +\sum _{j=1}^DD_s^{\widetilde{{\varPhi }}_j}\kappa (s) \frac{\partial \eta ^j}{\partial x}(s)\lambda _{j}(s)+\tilde{A}(s)\frac{\partial g}{\partial x}(s)\right\} x_1(s)\mathrm {d}s\right] , \end{aligned}$$
(55)
$$\begin{aligned} J_2(h)&=E\left[ \theta \int _t^{t+h} \left\{ \kappa (s) \frac{\partial b}{\partial u}(s)+ D_s^B\kappa (s)\frac{\partial \sigma }{\partial u}(s) +\int _{{\mathbb {R}}_0} D_{s,\zeta }^{\widetilde{N}_\alpha }\kappa (s) \frac{\partial \gamma }{\partial u}(s,\zeta ) \nu _\alpha (\mathrm {d}\zeta )\right. \right. \nonumber \\&\left. \left. \quad \ +\sum _{j=1}^DD_s^{\widetilde{{\varPhi }}_j}\kappa (s) \frac{\partial \eta ^j}{\partial u}(s)\lambda _{j}(s)+\frac{\partial f}{\partial u}(s)+\tilde{A}(s)\frac{\partial g}{\partial u}(s)\right\} \mathrm {d}s\right] . \end{aligned}$$
(56)

Proof

$$\begin{aligned} \frac{\mathrm {d}}{\mathrm {d}\ell }J^{(u+\ell \beta )}(t)\Big . \Big |_{\ell =0}&=E\left[ \int _0^T \left\{ \frac{\partial f}{\partial x}(t)x_1(t) +\frac{\partial f}{\partial y}(t)y_1(t)+\frac{\partial f}{\partial z}(t) z_1(t)\right. \right. \nonumber \\&\qquad \ +\int _{{\mathbb {R}}_0} \nabla _k f (t,\zeta )k_1(t,\zeta )\nu _\alpha (\mathrm {d}\zeta )+ \sum _{j=1}^D \frac{\partial f}{\partial v^j}(t)v_1^j(t)\lambda _j(t)\Big .\nonumber \\&\qquad \ \left. + \frac{\partial f}{\partial u}(t)\beta (t)\right\} \mathrm {d}t+\frac{\partial \varphi }{\partial x} (X(T),\alpha (T))x_1(T)+\psi ^\prime (Y(0))y_1(0)\nonumber \\&\qquad \ \left. +\frac{\partial h}{\partial x} (X (T),\alpha (T))\Big (\tilde{A}(T)-\tilde{A}(T)\Big )x_1(T) \right] . \end{aligned}$$
(57)

It follows from (20) and the duality formula that, for F(T) defined by (32), we get

$$\begin{aligned} E\Big [F(T)x_1(T) \Big ]&=E\left[ F(T) \left\{ \int _0^T\left( \frac{\partial b}{\partial x}(t)x_1(t)+\frac{\partial b}{\partial u}(t)\beta (t)\right) \mathrm {d}t \right. \right. \nonumber \\&\quad \ +\int _0^T \left( \frac{\partial \sigma }{\partial x}(t)x_1(t)+\frac{\partial \sigma }{\partial u}(t)\beta (t)\right) \mathrm {d}B(t)\nonumber \\&\quad \ +\int _0^T \int _{{\mathbb {R}}_0} \left( \frac{\partial \gamma }{\partial x}(t,\zeta )x_1(t) +\frac{\partial \gamma }{\partial u}(t,\zeta )\beta (t)\right) \widetilde{N}_\alpha (\mathrm {d}\zeta ,\mathrm {d}t) \nonumber \\&\left. \left. \quad \ +\sum _{j=1}^D\int _0^T \left( \frac{\partial \eta ^j}{\partial x}(t)x_1(t)+\frac{\partial \eta ^j}{\partial u}(t)\beta (t) \right) \mathrm {d}\widetilde{{\varPhi }}_j(t) \right\} \right] \nonumber \\&=E\left[ \int _0^T \left\{ F(T) \left( \frac{\partial b}{\partial x}(t)x_1(t)+\frac{\partial b}{\partial u}(t)\beta (t)\right) \right. \right. \nonumber \\&\quad \ + D_t^BF(T)\left( \frac{\partial \sigma }{\partial x}(t)x_1(t)+\frac{\partial \sigma }{\partial u}(t)\beta (t)\right) \nonumber \\&\quad \ +\int _{{\mathbb {R}}_0} D_{t,\zeta }^{\widetilde{N}_\alpha }F(T)\left( \frac{\partial \gamma }{\partial x}(t,\zeta )x_1(t) +\frac{\partial \gamma }{\partial u}(t,\zeta )\beta (t)\right) \nu _\alpha (\mathrm {d}\zeta )\nonumber \\&\quad \ \left. \left. +\sum _{j=1}^DD_t^{\widetilde{{\varPhi }}_j}F(T)\left( \frac{\partial \eta ^j}{\partial x}(t)x_1(t)+\frac{\partial \eta ^j}{\partial u}(t)\beta (t) \right) \lambda _{j}(t) \right\} \mathrm {d}t\right] . \end{aligned}$$
(58)

Similarly, we have

$$\begin{aligned} E\left[ \int _0^T \frac{\partial f }{\partial x }(t)x_1(t)\mathrm {d}t \right]&=E\left[ \int _0^T \frac{\partial f }{\partial x }(t) \left\{ \int _0^t\left( \frac{\partial b}{\partial x}(s)x_1(s)+\frac{\partial b}{\partial u}(s)\beta (s)\right) \mathrm {d}s \right. \right. \\&\quad \ +\int _0^t \left( \frac{\partial \sigma }{\partial x}(s)x_1(s)+\frac{\partial \sigma }{\partial u}(s)\beta (s)\right) \mathrm {d}B(s)\\&\quad \ +\int _0^t \int _{{\mathbb {R}}_0} \Big (\frac{\partial \gamma }{\partial x}(s,\zeta )x_1(s) +\frac{\partial \gamma }{\partial u}(s,\zeta )\beta (s)\Big ) \widetilde{N}_\alpha (\mathrm {d}\zeta ,\mathrm {d}s) \\&\left. \left. \quad \ +\sum _{j=1}^D\int _0^t \left( \frac{\partial \eta ^j}{\partial x}(s)x_1(s)+\frac{\partial \eta ^j}{\partial u}(s)\beta (s) \right) \mathrm {d}\widetilde{{\varPhi }}_j(s) \right\} \mathrm {d}t\right] \\&=E\left[ \int _0^T\Big \{ \left( \int _s^T \frac{\partial f }{\partial x }(t)\mathrm {d}t\right) \left( \frac{\partial b}{\partial x}(s)x_1(s)+\frac{\partial b}{\partial u}(s)\beta (s)\right) \right. \\&\quad \ + \left( \int _s^T D_s^B\left( \frac{\partial f }{\partial x }(t)\right) \mathrm {d}t\right) \left( \frac{\partial \sigma }{\partial x}(s)x_1(s)+\frac{\partial \sigma }{\partial u}(s)\beta (s)\right) \\&\quad \ +\int _{{\mathbb {R}}_0} \left( \int _s^T D_{s,\zeta }^{\widetilde{N}_\alpha }\left( \frac{\partial f }{\partial x }(t)\right) \mathrm {d}t\right) \nonumber \\&\quad \ \times \left( \frac{\partial \gamma }{\partial x}(s,\zeta )x_1(s) +\frac{\partial \gamma }{\partial u}(s,\zeta )\beta (s)\right) \nu _\alpha (\mathrm {d}\zeta )\\&\quad +\sum _{j=1}^D \left( \int _s^T D_s^{\widetilde{{\varPhi }}_j}\left( \frac{\partial f }{\partial x }(t)\right) \mathrm {d}t \right) \nonumber \\&\left. \left. \quad \ \times \Big (\frac{\partial \eta ^j}{\partial x}(s)x_1(s)+\frac{\partial \eta ^j}{\partial u}(s)\beta (s) \Big )\lambda _{j}(s) \right\} \mathrm {d}s\right] . \end{aligned}$$

Changing the notation \(s \leftrightarrow t\), this becomes

$$\begin{aligned}&=E\left[ \int _0^T\Big \{ \left( \int _t^T \frac{\partial f }{\partial x }(s)\mathrm {d}s\right) \left( \frac{\partial b}{\partial x}(t)x_1(t)+\frac{\partial b}{\partial u}(t)\beta (t)\right) \right. \nonumber \\&\quad \ + \left( \int _t^T D_t^B\Big (\frac{\partial f }{\partial x }(s)\Big )\mathrm {d}s\right) \left( \frac{\partial \sigma }{\partial x}(t)x_1(t)+\frac{\partial \sigma }{\partial u}(t)\beta (t)\right) \nonumber \\&\quad \ +\int _{{\mathbb {R}}_0} \left( \int _t^T D_{t,\zeta }^{\widetilde{N}_\alpha }\Big (\frac{\partial f }{\partial x }(s)\Big )\mathrm {d}s\right) \left( \frac{\partial \gamma }{\partial x}(t,\zeta )x_1(t) +\frac{\partial \gamma }{\partial u}(t,\zeta )\beta (t)\right) \nu _\alpha (\mathrm {d}\zeta )\nonumber \\&\quad \left. \left. +\sum _{j=1}^D \left( \int _t^T D_t^{\widetilde{{\varPhi }}_j}\left( \frac{\partial f }{\partial x }(s)\right) \mathrm {d}s \right) \left( \frac{\partial \eta ^j}{\partial x}(t)x_1(t)+\frac{\partial \eta ^j}{\partial u}(t)\beta (t) \right) \lambda _{j}(t) \right\} \mathrm {d}t\right] . \end{aligned}$$
(59)

Combining (30), (32), (58) and (59), we have

$$\begin{aligned}&E\left[ \int _0^T \left( \frac{\partial f}{\partial x}(t)x_1(t) + \frac{\partial f}{\partial u}(t)\beta (t)\right) \mathrm {d}t+\frac{\partial \varphi }{\partial x} (X(T),\alpha (T))x_1(T)\right] \nonumber \\&\quad \ = E\left[ \int _0^T \frac{\partial f}{\partial x}(t)x_1(t) \mathrm {d}t + F(T)x_1(T)+ \int _0^T \frac{\partial f}{\partial u}(t)\beta (t) \mathrm {d}t \right. \nonumber \\&\left. \qquad \ -\frac{\partial h}{\partial x} (X(T),\alpha (T))\tilde{A}(T)x_1(T)\right] \nonumber \\&\quad \ = E\left[ \int _0^T \left\{ \kappa (t) \left( \frac{\partial b}{\partial x}(t)x_1(t)+\frac{\partial b}{\partial u}(t)\beta (t)\right) + D_t^B\kappa (t) \left( \frac{\partial \sigma }{\partial x}(t)x_1(t)+\frac{\partial \sigma }{\partial u}(t)\beta (t)\right) \right. \right. \nonumber \\&\qquad \ +\int _{{\mathbb {R}}_0} D_{t,\zeta }^{\widetilde{N}_\alpha }\kappa (t) \Big (\frac{\partial \gamma }{\partial x}(t,\zeta )x_1(t) +\frac{\partial \gamma }{\partial u}(t,\zeta )\beta (t)\Big ) \nu _\alpha (\mathrm {d}\zeta )\nonumber \\&\qquad \ \left. +\sum _{j=1}^DD_t^{\widetilde{{\varPhi }}_j}\kappa (t) \left( \frac{\partial \eta ^j}{\partial x}(t)x_1(t)+\frac{\partial \eta ^j}{\partial u}(t)\beta (t) \right) \lambda _{j}(t) \right\} \mathrm {d}t\nonumber \\&\left. \qquad \ + \int _0^T \frac{\partial f}{\partial u}(t)\beta (t) \mathrm {d}t-\frac{\partial h}{\partial x} (X(T),\alpha (T))\tilde{A}(T)x_1(T) \right] . \end{aligned}$$
(60)

By Itô’s formula and (39), it follows as in (52) that

$$\begin{aligned}&E\Big [\psi ^\prime (Y(0))y_1(0)\Big ] \nonumber \\&\quad \ =E\left[ \tilde{A}(0)y_1(0)\right] \nonumber \\&\quad \ =E\left[ \frac{\partial h}{\partial x}(X(T),\alpha (T))\tilde{A}(T)x_1(T)\right] + E\left[ \int _0^T\left\{ \tilde{A}(t)\left( \frac{\partial g}{\partial x}(t)x_1(t) +\frac{\partial g}{\partial y}(t)y_1(t) \right. \right. \right. \nonumber \\&\qquad \ +\frac{\partial g}{\partial z}(t)z_1(t)+\int _{{\mathbb {R}}_0} \nabla _k g (t,\zeta ) k_1(t,\zeta )\nu _\alpha (\mathrm {d}\zeta ) + \sum _{j=1}^D\frac{\partial g}{\partial v^j}(t) v^j_1(t)\lambda _{j}(t) \nonumber \\&\left. \qquad \ +\frac{\partial g}{\partial u}(t)\beta (t) \right) -\frac{\partial H}{\partial y}(t)y_1(t) -\frac{\partial H}{\partial z}(t)z_1(t)-\int _{{\mathbb {R}}_0} \nabla _k H (t,\zeta ) k_1(t,\zeta )\nu _\alpha (\mathrm {d}\zeta ) \nonumber \\&\left. \left. \qquad \ -\sum _{j=1}^D\frac{\partial H}{\partial v^j}(t) v^j_1(t)\lambda _{j}(t) \right\} \mathrm {d}t\right] . \end{aligned}$$

But, directly from the definition of the Hamiltonian H,

$$\begin{aligned} \frac{\partial H}{\partial y}(t)&=\frac{\partial f}{\partial y}(t)+\tilde{A}(t)\frac{\partial g}{\partial y}(t);\,\,\, \frac{\partial H}{\partial z}(t)=\frac{\partial f}{\partial z}(t)+\tilde{A}(t)\frac{\partial g}{\partial z}(t),\\ \nabla _k H(t,\zeta )&=\nabla _kf(t,\zeta )+\tilde{A}(t)\nabla _kg(t,\zeta ); \,\,\,\frac{\partial H}{\partial v^j}(t)=\frac{\partial f}{\partial v^j}(t)+\tilde{A}(t)\frac{\partial g}{\partial v^j}(t),\,\,j=1,\ldots ,D. \end{aligned}$$

Hence we have

$$\begin{aligned}&E\Big [\psi ^\prime (Y(0))y_1(0)\Big ] \nonumber \\&\quad \ = E\left[ \frac{\partial h}{\partial x}(X(T),\alpha (T))\tilde{A}(T)x_1(T)\right] + E\left[ \int _0^T\left. \tilde{A}(t)\left( \frac{\partial g}{\partial x}(t)x_1(t)+\frac{\partial g}{\partial u}(t)\beta (t) \right) \mathrm {d}t \right. \right. \nonumber \\&\qquad \ -\int _0^T\left\{ \frac{\partial f}{\partial y}(t)y_1(t) +\frac{\partial f}{\partial z}(t)z_1(t) +\int _{{\mathbb {R}}_0} \nabla _k f (t,\zeta ) k_1(t,\zeta )\nu _\alpha (\mathrm {d}\zeta ) \right. \nonumber \\&\left. \left. \qquad \ + \sum _{j=1}^D\frac{\partial f}{\partial v^j}(t) v^j_1(t)\lambda _{j}(t) \right\} \mathrm {d}t\right] . \end{aligned}$$
(61)

Substituting (58)–(61) into (57), we get

$$\begin{aligned}&\frac{\mathrm {d}}{\mathrm {d}\ell }J^{(u+\ell \beta )}(t)\Big . \Big |_{\ell =0}\nonumber \\&\quad \ =E\left[ \int _0^T \Big \{ \kappa (t) \frac{\partial b}{\partial x}(t)+ D_t^B\kappa (t)\frac{\partial \sigma }{\partial x}(t) +\int _{{\mathbb {R}}_0} D_{t,\zeta }^{\widetilde{N}_\alpha }\kappa (t) \frac{\partial \gamma }{\partial x}(t,\zeta ) \nu _\alpha (\mathrm {d}\zeta )\right. \nonumber \\&\left. \qquad \ +\sum _{j=1}^DD_t^{\widetilde{{\varPhi }}_j}\kappa (t) \frac{\partial \eta ^j}{\partial x}(t)\lambda _{j}(t)+\tilde{A}(t)\frac{\partial g}{\partial x}(t)\Big \}x_1(t)\mathrm {d}t\right] \nonumber \\&\qquad \ +E\left[ \int _0^T \Big \{ \kappa (t) \frac{\partial b}{\partial u}(t)+ D_t^B\kappa (t)\frac{\partial \sigma }{\partial u}(t) +\int _{{\mathbb {R}}_0} D_{t,\zeta }^{\widetilde{N}_\alpha }\kappa (t) \frac{\partial \gamma }{\partial u}(t,\zeta ) \nu _\alpha (\mathrm {d}\zeta )\right. \nonumber \\&\left. \qquad \ +\sum _{j=1}^DD_t^{\widetilde{{\varPhi }}_j}\kappa (t) \frac{\partial \eta ^j}{\partial u}(t)\lambda _{j}(t)+\frac{\partial f}{\partial u}(t)+\tilde{A}(t)\frac{\partial g}{\partial u}(t)\Big \}\beta (t)\mathrm {d}t\right] . \end{aligned}$$
(62)

Equation (62) holds for all \(\beta \in {\mathscr {A}}_{{\mathscr {E}}}\). In particular, choose \( \beta _{\theta }=\beta _{\theta }(s)=\theta (\omega )\chi _{(t,t+h]}(s),\) where \(\theta (\omega )\) is \({\mathscr {E}}_t\)-measurable and \(0\le t\le t+h \le T.\) Then, since \(\beta _{\theta }\) vanishes on \([0,t]\) and \(x_1(0)=0\), (20) yields \( x_1(s)=x_1^{(\beta _{\theta })}(s)=0 \text { for } 0\le s\le t.\) Hence, (62) can be rewritten as

$$\begin{aligned} J_1(h)+J_2(h)=0, \end{aligned}$$
(63)

where

$$\begin{aligned} J_1(h)&=E\left[ \int _t^T \left\{ \kappa (s) \frac{\partial b}{\partial x}(s)+ D_s^B\kappa (s)\frac{\partial \sigma }{\partial x}(s) +\int _{{\mathbb {R}}_0} D_{s,\zeta }^{\widetilde{N}_\alpha }\kappa (s) \frac{\partial \gamma }{\partial x}(s,\zeta ) \nu _\alpha (\mathrm {d}\zeta )\right. \right. \nonumber \\&\left. \left. \quad \ +\sum _{j=1}^DD_s^{\widetilde{{\varPhi }}_j}\kappa (s) \frac{\partial \eta ^j}{\partial x}(s)\lambda _{j}(s)+\tilde{A}(s)\frac{\partial g}{\partial x}(s)\right\} x_1(s)\mathrm {d}s\right] , \end{aligned}$$
(64)
$$\begin{aligned} J_2(h)&=E\left[ \theta \int _t^{t+h} \left\{ \kappa (s) \frac{\partial b}{\partial u}(s)+ D_s^B\kappa (s)\frac{\partial \sigma }{\partial u}(s) +\int _{{\mathbb {R}}_0} D_{s,\zeta }^{\widetilde{N}_\alpha }\kappa (s) \frac{\partial \gamma }{\partial u}(s,\zeta ) \nu _\alpha (\mathrm {d}\zeta )\right. \right. \nonumber \\&\left. \left. \quad \ +\sum _{j=1}^DD_s^{\widetilde{{\varPhi }}_j}\kappa (s) \frac{\partial \eta ^j}{\partial u}(s)\lambda _{j}(s)+\frac{\partial f}{\partial u}(s)+\tilde{A}(s)\frac{\partial g}{\partial u}(s)\right\} \mathrm {d}s\right] . \end{aligned}$$
(65)

This completes the first step. \(\square \)

Next, we conclude the proof of (A) \(\Rightarrow \) (B).

Lemma 4.2

Assume that the conditions of Theorem 3.3 are satisfied. Assume that (A) in Theorem 3.3 holds. Then, (B) in Theorem 3.3 also holds.

Proof

Assume that (A) holds, that is, \(\frac{\mathrm {d}}{\mathrm {d}\ell }J^{(u+\ell \beta )}(t)\Big . \Big |_{\ell =0}= 0\). Then, from Lemma 4.1, we have \(J_1(h)+J_2(h)=0\).

Let \(x_1(s)=x_1^{(\beta _{\theta })}(s)\). Assume that \(s\ge t+h\). Then, it follows from the choice of \(\beta _{\theta }\) and (20) that

$$\begin{aligned} \mathrm {d}x_1(s)= & {} x_1(s-) \left\{ \frac{\partial b}{\partial x}(s)\mathrm {d}s+\frac{\partial \sigma }{\partial x}(s)\mathrm {d}B(s)+\int _{{\mathbb {R}}_0} \frac{\partial \gamma }{\partial x}(s,\zeta )\widetilde{N}_\alpha (\mathrm {d}s,\mathrm {d}\zeta )\right. \nonumber \\&\left. \qquad \qquad \ + \frac{\partial \eta }{\partial x}(s)\cdot \mathrm {d}\widetilde{{\varPhi }}(s) \right\} ;\,\,s\in [t+h,T]. \end{aligned}$$

By Itô’s formula, it is easy to show that \(x_1(s)=x_1(t+h)G(t+h,s); \,\,\,s\ge t+h,\) where G is defined by (34). Let us observe that \(G(t,s)\) does not depend on h.
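To verify this claim, recall from (34) that \(s\mapsto G(t,s)\) solves the homogeneous linear equation (a sketch, assuming the normalization \(G(t,t)=1\)):

$$\begin{aligned} \mathrm {d}_sG(t,s)=G(t,s^-)\left\{ \frac{\partial b}{\partial x}(s)\mathrm {d}s+\frac{\partial \sigma }{\partial x}(s)\mathrm {d}B(s)+\int _{{\mathbb {R}}_0} \frac{\partial \gamma }{\partial x}(s,\zeta )\widetilde{N}_\alpha (\mathrm {d}s,\mathrm {d}\zeta ) + \frac{\partial \eta }{\partial x}(s)\cdot \mathrm {d}\widetilde{{\varPhi }}(s) \right\} , \end{aligned}$$

so \(x_1(s)\) and \(x_1(t+h)G(t+h,s)\) solve the same linear SDE on \([t+h,T]\) with the same value at \(s=t+h\), and the claim follows by uniqueness. It follows from the definition of \(H_0\) [see (31)] that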

$$\begin{aligned} J_1(h)&= E\left[ \int _t^T \frac{\partial H_0}{\partial x}(s)x_1(s)\mathrm {d}s\right] =E\left[ \int _t^{t+h} \frac{\partial H_0}{\partial x}(s)x_1(s)\mathrm {d}s\right] \nonumber \\ {}&\quad \ +\,E\left[ \int _{t+h}^T \frac{\partial H_0}{\partial x}(s)x_1(s)\mathrm {d}s\right] . \end{aligned}$$

Differentiating with respect to h at \(h=0\) gives

$$\begin{aligned} \frac{\mathrm {d}}{\mathrm {d}h} J_1(h)\Big |_{h=0}&=\frac{\mathrm {d}}{\mathrm {d}h}E\left[ \int _t^{t+h} \frac{\partial H_0}{\partial x}(s)x_1(s)\mathrm {d}s\right] _{h=0}\nonumber \\&\quad \ +\frac{\mathrm {d}}{\mathrm {d}h}E\left[ \int _{t+h}^T \frac{\partial H_0}{\partial x}(s)x_1(s)\mathrm {d}s\right] _{h=0}. \end{aligned}$$
(66)

Since \(x_1(t)=0\), we get \( \frac{\mathrm {d}}{\mathrm {d}h}E\left[ \displaystyle \int _t^{t+h} \frac{\partial H_0}{\partial x}(s)x_1(s)\mathrm {d}s\right] _{h=0}=0.\) Using the definition of \(x_1(s)\), we have

$$\begin{aligned} \frac{\mathrm {d}}{\mathrm {d}h}E\left[ \int _{t+h}^T \frac{\partial H_0}{\partial x}(s)x_1(s)\mathrm {d}s\right] _{h=0}&=\frac{\mathrm {d}}{\mathrm {d}h}E\left[ \int _{t+h}^T \frac{\partial H_0}{\partial x}(s)x_1(t+h)G(t+h,s) \mathrm {d}s\right] _{h=0}\nonumber \\&=\int _{t}^T\frac{\mathrm {d}}{\mathrm {d}h}E\left[ \frac{\partial H_0}{\partial x}(s)x_1(t+h)G(t+h,s) \right] _{h=0}\mathrm {d}s\nonumber \\&=\int _{t}^T\frac{\mathrm {d}}{\mathrm {d}h}E\left[ \frac{\partial H_0}{\partial x}(s)x_1(t+h)G(t,s) \right] _{h=0}\mathrm {d}s, \end{aligned}$$
(67)

where \(x_1(t+h)\) is given by

$$\begin{aligned} x_1(t+h)&=\int _t^{t+h}\left( x_1(r-) \left\{ \frac{\partial b}{\partial x}(r)\mathrm {d}r+\frac{\partial \sigma }{\partial x}(r) \mathrm {d}B(r)\right. \right. \nonumber \\&\left. \quad \ +\int _{{\mathbb {R}}_0} \frac{\partial \gamma }{\partial x}(r,\zeta )\widetilde{N}_\alpha (\mathrm {d}r,\mathrm {d}\zeta ) + \frac{\partial \eta }{\partial x}(r)\cdot \mathrm {d}\widetilde{{\varPhi }}(r) \right\} \nonumber \\&\quad \ +\theta \left\{ \frac{\partial b}{\partial u}(r)\mathrm {d}r+\frac{\partial \sigma }{\partial u}(r) \mathrm {d}B(r)+\int _{{\mathbb {R}}_0} \frac{\partial \gamma }{\partial u}(r,\zeta )\widetilde{N}_\alpha (\mathrm {d}r,\mathrm {d}\zeta )\right. \nonumber \\&\left. \left. \quad \ + \frac{\partial \eta }{\partial u}(r)\cdot \mathrm {d}\widetilde{{\varPhi }}(r) \right\} \right) . \end{aligned}$$
(68)

Therefore, by (67) and (68), \( \frac{\mathrm {d}}{\mathrm {d}h} J_1(h)\Big |_{h=0}=J_{1,1}(0)+J_{1,2}(0),\) with

$$\begin{aligned} J_{1,1}(0)&=\int _{t}^T\frac{\mathrm {d}}{\mathrm {d}h}E\left[ \frac{\partial H_0}{\partial x}(s) G(t,s) \theta \int _t^{t+h}\left\{ \frac{\partial b}{\partial u}(r)\mathrm {d}r+\frac{\partial \sigma }{\partial u}(r) \mathrm {d}B(r)\right. \right. \nonumber \\&\left. \left. \quad \ +\int _{{\mathbb {R}}_0} \frac{\partial \gamma }{\partial u}(r,\zeta )\widetilde{N}_\alpha (\mathrm {d}r,\mathrm {d}\zeta ) + \frac{\partial \eta }{\partial u}(r)\cdot \mathrm {d}\widetilde{{\varPhi }}(r) \right\} \right] _{h=0}\mathrm {d}s \end{aligned}$$
(69)
$$\begin{aligned} J_{1,2}(0)&=\int _{t}^T\frac{\mathrm {d}}{\mathrm {d}h}E\left[ \frac{\partial H_0}{\partial x}(s) G(t,s) \int _t^{t+h}x_1(r-)\left\{ \frac{\partial b}{\partial x}(r)\mathrm {d}r+\frac{\partial \sigma }{\partial x}(r) \mathrm {d}B(r)\right. \right. \nonumber \\&\left. \left. \quad \ +\int _{{\mathbb {R}}_0} \frac{\partial \gamma }{\partial x}(r,\zeta )\widetilde{N}_\alpha (\mathrm {d}r,\mathrm {d}\zeta ) + \frac{\partial \eta }{\partial x}(r)\cdot \mathrm {d}\widetilde{{\varPhi }}(r) \right\} \right] _{h=0}\mathrm {d}s. \end{aligned}$$
(70)

Since \(x_1(t)=0\), we have \(J_{1,2}(0)=0\), from which we get \(\frac{\mathrm {d}}{\mathrm {d}h} J_1(h)\Big |_{h=0}=J_{1,1}(0).\) Using once more the duality formula, we get from (33) that

$$\begin{aligned} J_{1,1}(0)&=\int _{t}^T\frac{\mathrm {d}}{\mathrm {d}h}E\Big [ \theta \int _t^{t+h}\left\{ \frac{\partial b}{\partial u}(r) {\varTheta }(t,s)+\frac{\partial \sigma }{\partial u}(r) D_r^B{\varTheta }(t,s)\right. \nonumber \\&\left. \quad \ +\int _{{\mathbb {R}}_0} \frac{\partial \gamma }{\partial u}(r,\zeta )D_{r,\zeta }^{\widetilde{N}_\alpha } {\varTheta }(t,s) \nu _\alpha (\mathrm {d}\zeta ) + \sum _{j=1}^D \frac{\partial \eta ^j}{\partial u}(r) D_{r}^{\widetilde{{\varPhi }}_j} {\varTheta }(t,s)\lambda _{j}(r) \right\} \mathrm {d}r \Big ]_{h=0}\mathrm {d}s \nonumber \\&=\int _{t}^TE\left[ \theta \left\{ \frac{\partial b}{\partial u}(t) {\varTheta }(t,s)+\frac{\partial \sigma }{\partial u}(t) D_t^B{\varTheta }(t,s)\right. \right. \nonumber \\&\left. \left. \quad \ +\int _{{\mathbb {R}}_0} \frac{\partial \gamma }{\partial u}(t,\zeta )D_{t,\zeta }^{\widetilde{N}_\alpha } {\varTheta }(t,s) \nu _\alpha (\mathrm {d}\zeta ) + \sum _{j=1}^D \frac{\partial \eta ^j}{\partial u}(t) D_{t}^{\widetilde{{\varPhi }}_j} {\varTheta }(t,s)\lambda _j(t) \right\} \right] \mathrm {d}s. \end{aligned}$$
(71)

On the other hand, differentiating (65) with respect to h at \(h=0\), we have

$$\begin{aligned} \frac{\mathrm {d}}{\mathrm {d}h} J_2(h)\Big |_{h=0}&=E\left[ \theta \left\{ \kappa (t) \frac{\partial b}{\partial u}(t)+ D_t^B\kappa (t)\frac{\partial \sigma }{\partial u}(t) +\int _{{\mathbb {R}}_0} D_{t,\zeta }^{\widetilde{N}_\alpha }\kappa (t) \frac{\partial \gamma }{\partial u}(t,\zeta ) \nu _\alpha (\mathrm {d}\zeta )\right. \right. \nonumber \\&\left. \left. \quad \ +\sum _{j=1}^DD_t^{\widetilde{{\varPhi }}_j}\kappa (t) \frac{\partial \eta ^j}{\partial u}(t)\lambda _j(t)+\frac{\partial f}{\partial u}(t)+\tilde{A}(t)\frac{\partial g}{\partial u}(t)\right\} \right] . \end{aligned}$$
(72)

Since \(J_1(h)+J_2(h)=0\) for all \(h\) by (63), differentiating at \(h=0\) and summing (71) and (72) yields

$$\begin{aligned}&E\left[ \theta \left\{ \left( \kappa (t) + \int _{t}^T {\varTheta }(t,s)\mathrm {d}s \right) \frac{\partial b}{\partial u}(t)+ D_t^B\left( \kappa (t) + \int _{t}^T {\varTheta }(t,s)\mathrm {d}s \right) \frac{\partial \sigma }{\partial u}(t)\right. \right. \nonumber \\&\quad \ +\int _{{\mathbb {R}}_0} D_{t,\zeta }^{\widetilde{N}_\alpha }\left( \kappa (t) + \int _{t}^T {\varTheta }(t,s)\mathrm {d}s \right) \frac{\partial \gamma }{\partial u}(t,\zeta ) \nu _\alpha (\mathrm {d}\zeta )\nonumber \\&\left. \left. \quad \ +\sum _{j=1}^DD_t^{\widetilde{{\varPhi }}_j}\left( \kappa (t) + \int _{t}^T {\varTheta }(t,s)\mathrm {d}s \right) \frac{\partial \eta ^j}{\partial u}(t)\lambda _j(t)+\frac{\partial f}{\partial u}(t)+\tilde{A}(t)\frac{\partial g}{\partial u}(t)\right\} \right] =0 . \end{aligned}$$
(73)

Using (36)–(38) and (11) with \(A,p,q,r,w\) replaced by \(\tilde{A},\tilde{p}, \tilde{q}, \tilde{r}, \tilde{w}\), we get

$$\begin{aligned} E\left[ \theta \frac{\partial H}{\partial u}\left( t,X(t),\alpha (t),Y(t), Z(t), K(t,\cdot ),V(t),u,\tilde{A}(t),\tilde{p}(t),\tilde{q}(t), \tilde{r}(t,\cdot ),\tilde{w}(t)\right) _{u=u(t)}\right] =0. \end{aligned}$$

Since this holds for all bounded \({\mathscr {E}}_t\)-measurable random variables \(\theta \), we conclude that

$$\begin{aligned} E\Big [\frac{\partial H}{\partial u} (t,X(t),\alpha (t),Y(t), Z(t), K(t,\cdot ),V(t),u,\tilde{A}(t),\tilde{p}(t),\tilde{q}(t),\tilde{r}(t,\cdot ),\tilde{w}(t))_{u=u(t)} \Big . \Big | {\mathscr {E}}_t\Big ] =0. \end{aligned}$$
(74)

The proof of (A) \(\Rightarrow \) (B) is completed. \(\square \)

Finally, we prove that (B) \(\Rightarrow \) (A). Assume that there exists \(u\in {\mathscr {A}}_{{\mathscr {E}}}\) such that (74) holds. Then, by reversing the previous argument, we obtain that (A) holds for \(\beta _{\theta }(s)=\theta (\omega )\chi _{(t,t+h]}(s) \in {\mathscr {A}}_{{\mathscr {E}}} \), where \(\theta \) is bounded and \({\mathscr {E}}_t\)-measurable. Hence, (63) holds for all linear combinations of such \(\beta _{\theta }\). Since every bounded \(\beta \in {\mathscr {A}}_{{\mathscr {E}}} \) can be approximated pointwise boundedly in \((t,\omega )\) by such linear combinations, it follows that (63) is satisfied for all bounded \(\beta \in {\mathscr {A}}_{{\mathscr {E}}} \). Thus, reversing the remaining part of the previous proof, we get \(\frac{\mathrm {d}}{\mathrm {d}\ell }J^{(u+\ell \beta )}(t)\Big . \Big |_{\ell =0}=0\) for all bounded \(\beta \in {\mathscr {A}}_{{\mathscr {E}}}\). \(\square \)

5 Applications

5.1 Application to an Optimal Control Problem for Markov Regime Switching with Non-Concave Value Function

In this section, we apply the results obtained above to study an optimal control problem for a Markov regime-switching system in which the value function is not concave. Suppose that the state process \(X(t)=X^{(u)}(t,\omega );\,\,0 \le t \le T,\,\omega \in {\varOmega }\), is a controlled Markov regime-switching jump–diffusion of the form

$$\begin{aligned} \mathrm {d}X (t) = u(t)\left\{ \sigma (t) \,\mathrm {d}B(t)+ \displaystyle \int _{{\mathbb {R}}_0}\gamma (t,\zeta )\,\widetilde{N}_\alpha (\mathrm {d}\zeta ,\mathrm {d}t)\right\} ,\,\,\,\, t \in [ 0,T], \,\,\,\,X(0)= 0, \end{aligned}$$
(75)

where \(T>0\) is a given constant and \(u(\cdot )\) is the control process. We assume here that \(\widetilde{N}_\alpha =\widetilde{N}\) for every state of the Markov chain. Let us introduce the performance functional

$$\begin{aligned} J(u)=E\left[ \int _0^T\left\{ C_1(\alpha (t))u(t)+C_2(\alpha (t))u^2(t)+C_3(\alpha (t))X^2(t)\right\} \mathrm {d}t +C_4(\alpha (T))X^2(T)\right] . \end{aligned}$$
(76)

In this case, we have that

$$\begin{aligned}&f(t,x,\alpha ,y,z,k,v,u)=C_1(\alpha )u+C_2(\alpha )u^2+C_3(\alpha )x^2, \,\,\,\varphi (x,\alpha )=C_4(\alpha )x^2,\,\,\, g= \psi =0,\\&\kappa (t)= 2 C_4(\alpha (T))X(T)+2\int _t^TC_3(\alpha (s))X(s)\mathrm {d}s,\,\,\,\, A(t)=G(t,s)=0,\\&H_0\left( t,x,e_i,y,z,k,v,u,\widetilde{a},\kappa \right) = D_t^B\kappa (t) u\sigma (t)+\int _{{\mathbb {R}}_0} D_{t,\zeta }^{\widetilde{N}_\alpha }\kappa (t)\gamma (t,\zeta )u\nu _i(\mathrm {d}\zeta ),\\&H\left( t,x,e_i,y,z,k,v,u,a,p,q,r,w\right) =C_1(e_i)u+C_2(e_i)u^2+C_3(e_i)x^2 + q \sigma (t)u\\&\qquad \qquad \qquad \qquad \qquad \qquad \ +\int _{{\mathbb {R}}_0}r(\zeta )\gamma (t,\zeta )u\nu _i(\mathrm {d}\zeta ), \end{aligned}$$

where the modified adjoint processes reduce to

$$\begin{aligned} \tilde{p}(t)&= \kappa (t)+\int _t^T\frac{\partial H_0}{\partial x}(s)G(t,s) \mathrm {d}s=\kappa (t),\,\,\,\tilde{q}(t)= D_t^B\kappa (t),\\ \tilde{r}(t,\zeta )&=D_{t,\zeta }^{\widetilde{N}_\alpha }\kappa (t),\,\,\, \tilde{w}^j(t)=D_{t}^{\widetilde{{\varPhi }_j}}\kappa (t),\,\,\,j=1,\ldots ,D. \end{aligned}$$

Remark 5.1

The Hamiltonian in this case is not concave and therefore Theorem 3.1 cannot be applied. Using the Malliavin calculus approach (Theorem 3.3), we derive the expression of the optimal control if it exists. Note that, when \({\mathscr {E}}_t={\mathscr {F}}_t \text { for all } t\in [0,T]\), one can also use Theorem 3.2 to derive the optimal control. In fact, in this case, it is possible to guess the form of the adjoint processes and employ techniques from ordinary differential equations to get the solution and hence the optimal control.

Theorem 5.1

Assume that the state process is given by (75) and let the performance functional be given by (76). Moreover, assume that \(\alpha (t)\) is a two-state Markov chain and \({\mathscr {E}}_t={\mathscr {F}}_t \text { for all } t\in [0,T]\). Assume in addition that an optimal control exists. Then, \(u^*\) is an optimal control for (10) if and only if

$$\begin{aligned} u^*(t)&=\frac{-C_1(1)}{2C_2(1)+ 2{\varGamma }(t,T,1)\Big (\sigma ^2(t)+\int _{{\mathbb {R}}_0}\gamma ^2(t,\zeta ) \nu (\mathrm {d}\zeta )\Big )}\chi _{\{\alpha (t-)=1\}}\nonumber \\&\quad \ +\frac{-C_1(2)}{2C_2(2)+ 2{\varGamma }(t,T,2)\Big (\sigma ^2(t)+\int _{{\mathbb {R}}_0}\gamma ^2(t,\zeta ) \nu (\mathrm {d}\zeta )\Big )}\chi _{\{\alpha (t-)=2\}}, \end{aligned}$$
(77)

where

$$\begin{aligned} {\varGamma }(t,T,1)&=C_4(1)+C_3(1)(T-t)+C_3(2,1)\frac{\lambda _{1,2}}{\lambda _{1,2} +\lambda _{2,1}}(T-t)\nonumber \\&\quad \ +\frac{\lambda _{1,2}\Big \{C_4(2,1)(\lambda _{1,2} +\lambda _{2,1})-C_3(2,1)\Big \}}{(\lambda _{1,2} +\lambda _{2,1})^2}\Big \{1-e^{(\lambda _{1,2} +\lambda _{2,1})(t-T)}\Big \} \end{aligned}$$
(78)

and \({\varGamma }(t,T,2)\) is obtained in a similar way.

Proof

Condition (B) in Theorem 3.3 for an optimal control \(u^*(t)\) can be written in either of the two equivalent forms

$$\begin{aligned} E\left[ C_1(\alpha (t))+2C_2(\alpha (t))u(t)+\sigma (t)\tilde{q}(t)+\int _{{\mathbb {R}}_0}\tilde{r}(t,\zeta )\gamma (t,\zeta )\nu _\alpha (\mathrm {d}\zeta )\Big |{\mathscr {E}}_t\right] =0, \end{aligned}$$
(79)
$$\begin{aligned} E\left[ C_1(\alpha (t))+2C_2(\alpha (t))u(t)+\sigma (t)D_t^B\tilde{p}(t) +\int _{{\mathbb {R}}_0}D_{t,\zeta }^{\widetilde{N}_\alpha }\tilde{p}(t)\gamma (t,\zeta )\nu _\alpha (\mathrm {d}\zeta )\Big |{\mathscr {E}}_t\right] =0. \end{aligned}$$
(80)

Equation (80) can be seen as a partial information, Markov regime-switching, Malliavin-differential type of equation in the unknown random variable \(\tilde{p}(t)\). A similar equation was solved in [15] without regime switching, in the case \({\mathscr {E}}_t={\mathscr {F}}_t\). For simplicity, we assume from now on that \({\mathscr {E}}_t={\mathscr {F}}_t \text { for all } t\in [0,T]\) and that \(\alpha \) is a two-state Markov chain. Using the fundamental theorem of calculus (see, e.g. [31, Theorem 3.1]), we have

$$\begin{aligned} \tilde{q}(t)=D_t^B\tilde{p}(t)&=2C_4(\alpha (T))D_t^BX(T) +2\int _t^TC_3(\alpha (s))D_t^BX(s)\mathrm {d}s\\&=2C_4(\alpha (T))\left\{ \int _t^T D_t^B \Big (u(r) \sigma (r)\Big )\mathrm {d}B(r) +u(t)\sigma (t)\right. \\&\left. \quad \ +\int _t^T\int _{{\mathbb {R}}_0}D_t^B\Big (u(r)\gamma (r,\zeta )\Big ) \widetilde{N}_\alpha (\mathrm {d}\zeta ,\mathrm {d}r)\right\} \\&\quad \ + 2\int _t^T C_3(\alpha (s))\left\{ \int _t^s D_t^B \Big (u(r)\sigma (r)\Big )\mathrm {d}B(r) +u(t)\sigma (t) \right. \\&\left. \quad \ +\int _t^s\int _{{\mathbb {R}}_0}D_t^B\Big (u(r)\gamma (r,\zeta )\Big ) \widetilde{N}_\alpha (\mathrm {d}\zeta ,\mathrm {d}r)\right\} \mathrm {d}s. \end{aligned}$$
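Here we used the fundamental theorem of calculus for Malliavin derivatives: for a Malliavin differentiable, adapted integrand \(\psi \), one has (we state the Brownian case; the versions for \(\widetilde{N}_\alpha \) and \(\widetilde{{\varPhi }}\) are analogous)

$$\begin{aligned} D_t^B\left( \int _0^T\psi (s)\,\mathrm {d}B(s)\right) =\psi (t)+\int _t^T D_t^B\psi (s)\,\mathrm {d}B(s). \end{aligned}$$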

Using the integration by parts formula (product rule), we get

$$\begin{aligned} \tilde{q}(t)=D_t^B\tilde{p}(t)&=2\left\{ C_4(\alpha (t))u(t)\sigma (t)+\int _t^TC_4(\alpha (r)) D_t^B \Big (u(r)\sigma (r)\Big )\mathrm {d}B(r)\right. \nonumber \\&\quad \ +\int _t^T\int _{{\mathbb {R}}_0}C_4(\alpha (r))D_t^B\Big (u(r)\gamma (r,\zeta )\Big )\widetilde{N}_\alpha (\mathrm {d}\zeta ,\mathrm {d}r)\nonumber \\&\quad \ +\int _t^TD_t^BX(r)\sum _{j=1,i\ne j}^D \lambda _{i,j}(C_4(j)-C_4(i))\chi _{(\alpha (r)=i)}\mathrm {d}r\nonumber \\&\left. \quad \ +\int _t^TD_t^BX(r)\sum _{j=1,i\ne j}^D \lambda _{i,j}(C_4(j)-C_4(i))\chi _{(\alpha (r)=i)}\mathrm {d}m_{ij}(r)\right\} \nonumber \\&\quad \ +2\left\{ \int _t^T \left( C_3(\alpha (t))u(t)\sigma (t) +\int _t^sC_3(\alpha (r)) D_t^B\Big (u(r) \sigma (r)\Big )\mathrm {d}B(r)\right. \right. \nonumber \\&\quad \ +\int _{{\mathbb {R}}_0}\int _t^sC_3(\alpha (r))D_t^B\Big (u(r)\gamma (r,\zeta )\Big )\widetilde{N}_\alpha (\mathrm {d}\zeta ,\mathrm {d}r)\nonumber \\&\quad \ +\int _t^sD_t^BX(r)\sum _{j=1,i\ne j}^D \lambda _{i,j}(C_3(j)-C_3(i))\chi _{(\alpha (r)=i)}\mathrm {d}r \nonumber \\&\left. \left. \quad \ +\int _t^sD_t^BX(r)\sum _{j=1,i\ne j}^D \lambda _{i,j}(C_3(j)-C_3(i))\chi _{(\alpha (r)=i)}\mathrm {d}m_{ij}(r)\right) \mathrm {d}s\right\} . \end{aligned}$$
(81)

Taking the conditional expectation with respect to \({\mathscr {F}}_t\) (the \(\mathrm {d}B\), \(\widetilde{N}_\alpha \) and \(\mathrm {d}m_{ij}\) integrals over \([t,T]\) have zero conditional expectation), we have

$$\begin{aligned} E\Big [\tilde{q}(t)\Big |{\mathscr {F}}_t\Big ]&=2C_4(\alpha (t))u(t)\sigma (t)+2\int _t^Tu(t)\sigma (t)\sum _{j=1,i\ne j}^D \lambda _{i,j}(C_4(j)-C_4(i))\nonumber \\&\quad \ E\Big [\chi _{(\alpha (r)=i)}\Big | {\mathscr {F}}_t\Big ]\mathrm {d}r+2C_3(\alpha (t))u(t)\sigma (t)(T-t) +2\int _t^T\int _t^su(t)\sigma (t)\nonumber \\&\quad \ \times \sum _{j=1,i\ne j}^D \lambda _{i,j}(C_3(j)-C_3(i))E\Big [\chi _{(\alpha (r)=i)} \Big |{\mathscr {F}}_t\Big ]\mathrm {d}r \,\mathrm {d}s. \end{aligned}$$
(82)

Let \(\alpha (t)=e_1\) and, for \(n=1,2,3,4\), let \(C_n(i)\) denote the value of the function \(C_n\) at \(e_i\). Define \(C_n(2,1)\) for \(n=1,2,3,4\) by \(C_n(2,1):=C_n(2)-C_n(1).\) Then, we have

$$\begin{aligned} E\Big [\tilde{q}(t)\Big |{\mathscr {F}}_t\Big ]&=2C_4(1)u(t)\sigma (t) +2\int _t^Tu(t)\sigma (t) \Big (\lambda _{1,2}(C_4(2)-C_4(1))E\Big [\chi _{(\alpha (r)=1)} \Big |\alpha (t)=1\Big ]\\&\quad \ +\lambda _{2,1}(C_4(1)-C_4(2))E\Big [\chi _{(\alpha (r)=2)} \Big |\alpha (t)=1\Big ]\Big )\mathrm {d}r+2 C_3(1)u(t)\sigma (t)(T-t)\\&\quad \ +2\int _t^T\int _t^su(t)\sigma (t)\Big (\lambda _{1,2}(C_3(2)-C_3(1))E \Big [\chi _{(\alpha (r)=1)}\Big |\alpha (t)=1\Big ]\\&\quad \ +\lambda _{2,1}(C_3(1)-C_3(2))E\Big [\chi _{(\alpha (r)=2)}\Big |\alpha (t)=1\Big ]\Big )\mathrm {d}r \,\mathrm {d}s\\&=2C_4(1)u(t)\sigma (t)+2\int _t^Tu(t)\sigma (t) \Big (\lambda _{1,2}(C_4(2)-C_4(1))P(\alpha (r)=1|\alpha (t)=1)\\&\quad \ +\lambda _{2,1}(C_4(1)-C_4(2))P(\alpha (r)=2|\alpha (t)=1)\Big )\mathrm {d}r+2 C_3(1)u(t)\sigma (t)(T-t)\\&\quad \ +2\int _t^T\int _t^su(t)\sigma (t)\Big (\lambda _{1,2}(C_3(2)-C_3(1))P(\alpha (r)=1|\alpha (t)=1)\\&\quad \ +\lambda _{2,1}(C_3(1)-C_3(2))P(\alpha (r)=2|\alpha (t)=1)\Big )\mathrm {d}r \,\mathrm {d}s. \end{aligned}$$
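Recall that, for a two-state Markov chain with transition intensities \(\lambda _{1,2}\) and \(\lambda _{2,1}\), the transition probabilities are given explicitly, for \(r\ge t\), by

$$\begin{aligned} P(\alpha (r)=1|\alpha (t)=1)&=\frac{\lambda _{2,1}+\lambda _{1,2}e^{(\lambda _{1,2}+\lambda _{2,1})(t-r)}}{\lambda _{1,2}+\lambda _{2,1}},\\ P(\alpha (r)=2|\alpha (t)=1)&=\frac{\lambda _{1,2}-\lambda _{1,2}e^{(\lambda _{1,2}+\lambda _{2,1})(t-r)}}{\lambda _{1,2}+\lambda _{2,1}}. \end{aligned}$$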

Inserting the transition probabilities recalled above, it follows that

$$\begin{aligned} E\Big [\tilde{q}(t)\Big |{\mathscr {F}}_t\Big ]&=2C_4(1)u(t)\sigma (t)+2u(t)\sigma (t)C_4(2,1)\int _t^T\left( \lambda _{1,2}\frac{\lambda _{1,2}e^{(\lambda _{1,2} +\lambda _{2,1})(t-r)}+\lambda _{2,1}}{\lambda _{1,2} +\lambda _{2,1}}\right. \nonumber \\&\left. \quad \ -\lambda _{2,1}\frac{\lambda _{1,2}-\lambda _{1,2}e^{(\lambda _{1,2} +\lambda _{2,1})(t-r)}}{\lambda _{1,2} +\lambda _{2,1}}\right) \mathrm {d}r+2 C_3(1)u(t)\sigma (t)(T-t) \nonumber \\&\quad \ +2C_3(2,1)u(t)\sigma (t)\int _t^T\int _t^s\left( \lambda _{1,2}\frac{\lambda _{1,2}e^{(\lambda _{1,2} +\lambda _{2,1})(t-r)}+\lambda _{2,1}}{\lambda _{1,2} +\lambda _{2,1}}\right. \nonumber \\&\left. \quad \ -\lambda _{2,1}\frac{\lambda _{1,2}-\lambda _{1,2}e^{(\lambda _{1,2} +\lambda _{2,1})(t-r)}}{\lambda _{1,2} +\lambda _{2,1}}\right) \mathrm {d}r \,\mathrm {d}s \nonumber \\&=2C_4(1)u(t)\sigma (t)+2u(t)\sigma (t)C_4(2,1)\frac{\lambda _{1,2}}{\lambda _{1,2} +\lambda _{2,1}}\left( 1-e^{(\lambda _{1,2} +\lambda _{2,1})(t-T)}\right) \nonumber \\&\quad \ +2 C_3(1)u(t)\sigma (t)(T-t)+ 2C_3(2,1)u(t)\sigma (t) \frac{\lambda _{1,2}}{\lambda _{1,2} +\lambda _{2,1}}(T-t)\nonumber \\&\quad \ -2C_3(2,1)u(t)\sigma (t)\frac{\lambda _{1,2}}{(\lambda _{1,2} +\lambda _{2,1})^2}\Big (1-e^{(\lambda _{1,2} +\lambda _{2,1})(t-T)}\Big )\nonumber \\&=2u(t)\sigma (t)\Big (C_4(1)+C_3(1)(T-t)+C_3(2,1)\frac{\lambda _{1,2}}{\lambda _{1,2} +\lambda _{2,1}}(T-t)\nonumber \\&\quad \ +\frac{\lambda _{1,2}\Big \{C_4(2,1)(\lambda _{1,2} +\lambda _{2,1})-C_3(2,1)\Big \}}{(\lambda _{1,2} +\lambda _{2,1})^2}\Big \{1-e^{(\lambda _{1,2} +\lambda _{2,1})(t-T)}\Big \}\Big ). \end{aligned}$$
(83)

On the other hand, still with \(\alpha (t)=e_1\), using the integration by parts formula and the fundamental theorem of calculus, we have

$$\begin{aligned}&E\Big [\tilde{r}(t,\zeta )\Big |{\mathscr {F}}_t\Big ]\\&\quad \ =2C_4(1)u(t)\gamma (t,\zeta )+2\int _t^Tu(t)\gamma (t,\zeta ) \Big (\lambda _{1,2}(C_4(2)-C_4(1))E\Big [\chi _{(\alpha (r)=1)}\Big |\alpha (t)=1\Big ]\\&\qquad \ +\lambda _{2,1}(C_4(1)-C_4(2))E\Big [\chi _{(\alpha (r)=2)}\Big |\alpha (t)=1\Big ]\Big )\mathrm {d}r+2 C_3(1)u(t)\gamma (t,\zeta )(T-t)\\&\qquad \ +2\int _t^T\int _t^su(t)\gamma (t,\zeta )\Big (\lambda _{1,2}(C_3(2)-C_3(1))E\Big [\chi _{(\alpha (r)=1)}\Big |\alpha (t)=1\Big ]\\&\qquad \ +\lambda _{2,1}(C_3(1)-C_3(2))E\Big [\chi _{(\alpha (r)=2)}\Big |\alpha (t)=1\Big ]\Big )\mathrm {d}r \,\mathrm {d}s\\&\quad \ =2C_4(1)u(t)\gamma (t,\zeta )+2\int _t^Tu(t)\gamma (t,\zeta ) \Big (\lambda _{1,2}(C_4(2)-C_4(1))P(\alpha (r)=1|\alpha (t)=1)\\&\qquad \ +\lambda _{2,1}(C_4(1)-C_4(2))P(\alpha (r)=2|\alpha (t)=1)\Big )\mathrm {d}r+2 C_3(1)u(t)\gamma (t,\zeta )(T-t) \\&\qquad \ +2\int _t^T\int _t^su(t)\gamma (t,\zeta )\Big (\lambda _{1,2}(C_3(2)-C_3(1))P(\alpha (r)=1|\alpha (t)=1)\\&\qquad \ +\lambda _{2,1}(C_3(1)-C_3(2))P(\alpha (r)=2|\alpha (t)=1)\Big )\mathrm {d}r \,\mathrm {d}s. \end{aligned}$$

Similarly, we get

$$\begin{aligned} E\Big [\tilde{r}(t,\zeta )\Big |{\mathscr {F}}_t\Big ]&=2u(t)\gamma (t,\zeta )\left( C_4(1)+C_3(1)(T-t)+C_3(2,1)\frac{\lambda _{1,2}}{\lambda _{1,2} +\lambda _{2,1}}(T-t)\right. \nonumber \\&\left. \quad \ +\frac{\lambda _{1,2}\left\{ C_4(2,1)(\lambda _{1,2} +\lambda _{2,1})-C_3(2,1)\right\} }{(\lambda _{1,2} +\lambda _{2,1})^2}\left\{ 1-e^{(\lambda _{1,2} +\lambda _{2,1})(t-T)}\right\} \right) . \end{aligned}$$
(84)
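Substituting (83) and (84) into (79) and restricting to the event \(\{\alpha (t^-)=1\}\), the first-order condition becomes

$$\begin{aligned} C_1(1)+2C_2(1)u(t)+2{\varGamma }(t,T,1)\Big (\sigma ^2(t)+\int _{{\mathbb {R}}_0}\gamma ^2(t,\zeta ) \nu (\mathrm {d}\zeta )\Big )u(t)=0, \end{aligned}$$

and solving for \(u(t)\) yields the first summand in (77).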

Performing the same computations with \(\alpha (t)=e_2\), one gets the expression for \({\varGamma }(t,T,2)\) and the second summand in (77). This completes the proof. \(\square \)

The following corollary is a generalization of [3, Example 4.7].

Corollary 5.1

Assume that the conditions of Theorem 5.1 are satisfied. Moreover, assume that \(C_1,C_2,C_3,C_4:I\rightarrow {\mathbb {R}}\) satisfy \(C_1(1)=-1,C_1(2)=0,C_2(1)=0,C_2(2)=-\frac{1}{2}, C_3(1)=0,C_3(2)=1\), \(C_4(1)=\frac{1}{2},C_4(2)=1\).

Then, the optimal control \(u^*\) for (10) satisfies:

$$\begin{aligned} u^*(t)&=\frac{1}{2{\varGamma }(t,T,1)\Big (\sigma ^2(t)+\int _{{\mathbb {R}}_0}\gamma ^2(t,\zeta ) \nu (\mathrm {d}\zeta )\Big )}\chi _{\{\alpha (t-)=1\}}+0\times \chi _{\{\alpha (t-)=2\}}, \end{aligned}$$
(85)

where \({\varGamma }(t,T,1)=\frac{1}{2}+\frac{\lambda _{1,2}}{\lambda _{1,2} +\lambda _{2,1}}(T-t)+\frac{\lambda _{1,2}\Big \{\frac{1}{2}(\lambda _{1,2} +\lambda _{2,1})-1\Big \}}{(\lambda _{1,2} +\lambda _{2,1})^2}\Big \{1-e^{(\lambda _{1,2} +\lambda _{2,1})(t-T)}\Big \}.\)
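For instance, in the symmetric case \(\lambda _{1,2}=\lambda _{2,1}=1\), the factor \(\frac{1}{2}(\lambda _{1,2}+\lambda _{2,1})-1\) vanishes, so that \({\varGamma }(t,T,1)=\frac{1}{2}\big (1+(T-t)\big )\) and (85) reduces to

$$\begin{aligned} u^*(t)=\frac{1}{\big (1+(T-t)\big )\Big (\sigma ^2(t)+\int _{{\mathbb {R}}_0}\gamma ^2(t,\zeta ) \nu (\mathrm {d}\zeta )\Big )}\chi _{\{\alpha (t-)=1\}}. \end{aligned}$$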

5.2 Application to Recursive Utility Maximization

In this section, we use the results from Sect. 3.3 to study a problem of recursive utility maximization. Consider a financial market with two investment possibilities: a risk-free asset (bond) with unit price \(S_0(t)\) at time t and a risky asset (stock) with unit price S(t) at time t. Let r(t) be the instantaneous interest rate of the risk-free asset at time t. If \(r_t:=r(t,\alpha (t))=\langle \underline{r}|\alpha (t)\rangle \), where \(\langle \cdot |\cdot \rangle \) is the usual scalar product in \({\mathbb {R}}^D\) and \(\underline{r}=(r_1,r_2, \ldots ,r_D)\in {\mathbb {R}}_+^D\), then the price dynamics of \(S_0\) is given by:

$$\begin{aligned} \mathrm {d}S_0(t)&=r(t)S_0(t)\mathrm {d}t,\,\,\,S_0(0)=1. \end{aligned}$$
(86)
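Equation (86) integrates at once to

$$\begin{aligned} S_0(t)=\exp \left( \int _0^t r(s)\,\mathrm {d}s\right) . \end{aligned}$$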

The appreciation rate \(\mu (t)\) and the volatility \(\sigma (t)\) of the stock at time t are defined by

$$\begin{aligned} \mu (t):=\mu (t,\alpha (t))=\langle \underline{\mu }|\alpha (t)\rangle , \,\,\,\sigma (t):=\sigma (t,\alpha (t))=\langle \underline{\sigma }|\alpha (t)\rangle \quad t\in [0,T], \end{aligned}$$
(87)

where \(\underline{\mu }=(\mu _1,\mu _2, \ldots ,\mu _D)\in {\mathbb {R}}^D\) and \(\underline{\sigma }=(\sigma _1,\sigma _2, \ldots ,\sigma _D)\in {\mathbb {R}}_+^D\). The stock price process S is described by the following Markov-modulated Lévy process:

$$\begin{aligned} \mathrm {d}S(t)=S(t^-)\left( \mu (t)\mathrm {d}t+\sigma (t)\mathrm {d}B (t)+\int _{{\mathbb {R}}_0}\gamma (t,\zeta )\widetilde{N}_\alpha (\mathrm {d}t,\mathrm {d}\zeta )\right) ,\quad S(0)>0. \end{aligned}$$
(88)

The general setting considered here can be seen as an extension of the exponential-Lévy model described in [35], in which a modulating factor is introduced. Hence, we can retrieve in a simple way some of the existing models in the literature (e.g. the classical Black–Scholes model and the family of exponential-Lévy models).

Here \(r(t)\ge 0,\,\,\mu (t),\,\,\sigma (t)\) and \(\gamma (t,\zeta )>-1+\varepsilon \) (for some constant \(\varepsilon >0\)) are given \({\mathscr {E}}_t\)-predictable, integrable processes, with \(\left\{ {\mathscr {E}}_t\right\} _{t\in \left[ 0,T\right] }\) being a given filtration, such that

$$\begin{aligned} {\mathscr {E}}_t\subset {\mathscr {F}}_t \text { for all } t\in [0,T]. \end{aligned}$$

Suppose that a trader in this market chooses a portfolio u(t), representing the amount she invests in the risky asset at time t. In the partial information case, this portfolio is an \({\mathscr {E}}_t\)-predictable stochastic process. Choosing \(S_0(t)\) as a numeraire, and setting without loss of generality \(r(t)=0\), one can show (see [18] for such a derivation) that the corresponding wealth process \(X(t)=X^{(u)}(t)\) satisfies

$$\begin{aligned} \mathrm {d}X(t)=u(t)\left[ \mu (t)\mathrm {d}t+\sigma (t)\mathrm {d}B(t)+\displaystyle \int _{{\mathbb {R}}_0}\gamma (t,\zeta )\widetilde{N}_\alpha (\mathrm {d}t,\mathrm {d}\zeta )\right] ,\,\,\,X(0)=x> 0. \end{aligned}$$
(89)

The above process is a controlled Itô–Lévy process.
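For the reader's convenience, here is a minimal sketch of this derivation: under the self-financing assumption, the amount \(X(t)-u(t)\) is held in the bond, so that, using \(r(t)=0\),

$$\begin{aligned} \mathrm {d}X(t)=\big (X(t)-u(t)\big )\frac{\mathrm {d}S_0(t)}{S_0(t)}+u(t)\frac{\mathrm {d}S(t)}{S(t^-)}=u(t)\left[ \mu (t)\mathrm {d}t+\sigma (t)\mathrm {d}B(t)+\int _{{\mathbb {R}}_0}\gamma (t,\zeta )\widetilde{N}_\alpha (\mathrm {d}t,\mathrm {d}\zeta )\right] . \end{aligned}$$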

We consider a small agent endowed with an initial wealth x, who can choose her portfolio between time 0 and time T. We suppose that there exists a terminal reward X(T) at time T. In this setting, the utility at time t depends on the utility up to time t and on the future utility. More precisely, the recursive utility at time t is defined by

$$\begin{aligned} Y(t)=E\left[ X(T)+\int _t^Tg(s,Y(s),\alpha (s),\omega )\mathrm {d}s\,\Big |\,{\mathscr {F}}_t\right] , \end{aligned}$$
(90)

where g is called the driver. One can show as in [14] (see also [29, 36]) that the above process can be regarded as a solution to the following Markov regime-switching BSDE:

$$\begin{aligned} \left\{ \begin{array}{ll} \,\mathrm {d}Y (t) &{}= - g(t,Y(t),\alpha (t),\omega )\,\mathrm {d}t \,+\,Z(t)\,\mathrm {d}B(t) \\ &{}\quad + \displaystyle \int _{{\mathbb {R}}_0}K(t,\zeta )\,\widetilde{N}_\alpha (\mathrm {d}\zeta ,\mathrm {d}t)+V(t)\cdot \mathrm {d}\widetilde{{\varPhi }}(t);\,\,\, t \in [ 0,T] \\ Y(T) &{}= X(T), \end{array}\right. \end{aligned}$$
(91)

where \(g:[0,T]\times {\mathbb {R}} \times {\mathbb {S}} \times {\mathscr {U}} \times {\varOmega }\rightarrow {\mathbb {R}}\) is such that the BSDE (91) has a unique solution and \((t,\omega ) \rightarrow g(t,x,e_i,\omega )\) is \({\mathscr {F}}_t\)-predictable for each given x and \(e_i\). Such a unique solution exists, for instance, if \(g(\cdot , y,e_i)\) is uniformly Lipschitz continuous with respect to y, the random variable X(T) is square integrable and \(g(t,0,e_i)\) is uniformly bounded. For more information about recursive utility, the reader may consult [14, 20, 22].
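The passage from (90) to (91) can be sketched as follows: the process

$$\begin{aligned} M(t):=E\left[ X(T)+\int _0^Tg(s,Y(s),\alpha (s),\omega )\,\mathrm {d}s\,\Big |\,{\mathscr {F}}_t\right] ,\quad t\in [0,T], \end{aligned}$$

is an \({\mathscr {F}}_t\)-martingale, so a martingale representation theorem for the present filtration (see, e.g. [29, 36]) provides processes \((Z,K,V)\) with

$$\begin{aligned} M(t)=M(0)+\int _0^tZ(s)\,\mathrm {d}B(s)+\int _0^t\int _{{\mathbb {R}}_0}K(s,\zeta )\,\widetilde{N}_\alpha (\mathrm {d}\zeta ,\mathrm {d}s)+\int _0^tV(s)\cdot \mathrm {d}\widetilde{{\varPhi }}(s), \end{aligned}$$

and a direct computation shows that \(Y(t)=M(t)-\int _0^tg(s,Y(s),\alpha (s),\omega )\,\mathrm {d}s\) satisfies (91).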

We want to apply Theorem 3.3 to find the control u (if it exists) that maximizes the recursive utility Y(0) defined by (91). This means that we aim at finding \(u^*\) and \(Y^*\) such that

$$\begin{aligned} Y^{(u^*)}(0)=\sup _{u\in {\mathscr {A}}_{{\mathscr {E}} }} Y^{(u)}(0)=Y^*. \end{aligned}$$

Note that the performance functional J(u) given by (9) is reduced to:

$$\begin{aligned} J(u)=Y^{(u)}(0). \end{aligned}$$

This means that

$$\begin{aligned} f=0,\,\,\, \varphi =0, \text { and } \psi (x)=x. \end{aligned}$$

We also have

$$\begin{aligned} h(x,\alpha )&=x,\\ b(t,x,\alpha ,u,\omega )&=u\mu (t,\alpha ,\omega ),\\ \sigma (t,x,\alpha ,u,\omega )&=u\sigma (t,\alpha ,\omega ),\\ \gamma (t,x,\alpha ,\zeta ,u,\omega )&=u \gamma (t,\alpha ,\zeta ,\omega ),\\ \eta ^j(t,x,\alpha ,u,\omega )&=0,\,\,\,j=1,\ldots ,D. \end{aligned}$$

The Hamiltonian is therefore reduced to:

$$\begin{aligned}&H\left( t,x,e_i,y,z,k,v,u,a,p,q,r,w,\omega \right) \nonumber \\&\quad \ = a g(t,x,e_i,\omega )+ p u\mu (t,e_i,\omega )+q u\sigma (t,e_i,\omega )\nonumber \\&\qquad \ +\int _{{\mathbb {R}}_0}r(\zeta ) u\gamma (t,e_i,\zeta ,\omega ) \nu _i(\mathrm {d}\zeta ) , \end{aligned}$$
(92)

with the modified adjoint processes \(\tilde{A}\) and \((\tilde{p}(t),\tilde{q}(t),\tilde{r}(t,\zeta ),\tilde{w}(t))\) given, respectively, by:

$$\begin{aligned} \begin{array}{ll} \,\mathrm {d}\tilde{A} (t) &{}= \tilde{A}(t) \nabla _x g(t,Y(t),\alpha (t),\omega )\,\mathrm {d}t\\ \tilde{A}(0) &{}= 1, \end{array} \end{aligned}$$
(93)

and

$$\begin{aligned} \tilde{p}(t)&:= \kappa (t)+\int _{t}^{T} \frac{\partial H_{0}}{\partial x } (s)G(t,s)\mathrm {d}s=\tilde{A}(T) , \end{aligned}$$
(94)
$$\begin{aligned} \tilde{q}(t)&:= D^B_{t}\tilde{A}(T) , \end{aligned}$$
(95)
$$\begin{aligned} \tilde{r}(t,\zeta )&:= D^{\widetilde{N}}_{t,\zeta }\tilde{A}(T) , \end{aligned}$$
(96)
$$\begin{aligned} \tilde{w}^j(t)&:=D_{t}^{\widetilde{{\varPhi }_j}}\tilde{A}(T),\,\,\,j=1,\ldots ,D . \end{aligned}$$
(97)

Equation (93) can be solved explicitly and the solution is given by:

$$\begin{aligned} \tilde{A} (t)=\exp \left( \int _0^t \nabla _x g(s,Y(s),\alpha (s),\omega )\,\mathrm {d}s \right) . \end{aligned}$$
(98)

Condition (B) in Theorem 3.3 for an optimal control \(u^*\) becomes (differentiate (92) with respect to u and insert the modified adjoint processes (94)–(96)):

$$\begin{aligned} E\left[ \mu (t,e_i)\tilde{A}(T) +\sigma (t,e_i)D^B_{t}\tilde{A}(T)+\int _{{\mathbb {R}}_0} \gamma (t,e_i,\zeta ) D^{\widetilde{N}}_{t,\zeta }\tilde{A}(T) \nu _i(\mathrm {d}\zeta )|{\mathscr {E}}_t\right] =0 \end{aligned}$$
(99)

for \(i=1,\ldots ,D\). For each \(i=1,\ldots ,D\), Eq. (99) is a partial information, Malliavin-differential type of equation in the unknown random variable \(\tilde{A}(T)\); see, for example, [15, 17]. For \({\mathscr {E}}_t={\mathscr {F}}_t\), one can solve this equation explicitly (see [15]) and get

$$\begin{aligned} \tilde{A} (T)&=E[\tilde{A} (T)]\exp \left( \int _0^T \beta (t,\alpha )\mathrm {d}B(t) -\frac{1}{2} \int _0^T \beta ^2(t,\alpha )\mathrm {d}t \right. \nonumber \\&\quad \ +\int _0^T\int _{{\mathbb {R}}_0} \ln (1+ \theta (t,\alpha ,\zeta )) \widetilde{N}_\alpha (\mathrm {d}t,\mathrm {d}\zeta )\nonumber \\&\left. \quad \ + \int _0^T\int _{{\mathbb {R}}_0} \left\{ \ln (1+ \theta (t,\alpha ,\zeta ))-\theta (t,\alpha ,\zeta )\right\} \nu _\alpha (\mathrm {d}\zeta ) \mathrm {d}t \right) \end{aligned}$$
(100)

for some \({\mathscr {F}}_t\)-predictable processes \(\beta (t,\alpha )\) and \(\theta (t,\alpha ,\zeta )\) such that

$$\begin{aligned} \mu (t,\alpha ) +\sigma (t,\alpha )\beta (t,\alpha )+\int _{{\mathbb {R}}_0} \gamma (t,\alpha ,\zeta ) \theta (t,\alpha ,\zeta ) \nu _\alpha (\mathrm {d}\zeta )=0 \text { for a.a. } (t,\omega ). \end{aligned}$$
(101)
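To see why (100) solves (99), recall that for a stochastic exponential of this form the Malliavin derivatives act multiplicatively (a sketch, assuming \(\beta \) and \(\theta \) are deterministic functions of t and of the chain; see [15] for the general statement):

$$\begin{aligned} D_t^B\tilde{A} (T)=\beta (t,\alpha )\tilde{A} (T),\qquad D^{\widetilde{N}}_{t,\zeta }\tilde{A} (T)=\theta (t,\alpha ,\zeta )\tilde{A} (T). \end{aligned}$$

Substituting these into (99) reduces it to \(E\big [\tilde{A}(T)\big \{\mu (t,e_i)+\sigma (t,e_i)\beta (t,e_i)+\int _{{\mathbb {R}}_0}\gamma (t,e_i,\zeta )\theta (t,e_i,\zeta )\nu _i(\mathrm {d}\zeta )\big \}\big |{\mathscr {E}}_t\big ]=0\), which holds by (102).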

The processes \(\beta \) and \(\theta \) are completely determined by the vectors \((\beta _1,\ldots , \beta _D)\) and \((\theta _1,\ldots ,\theta _D)\), which solve the system of equations

$$\begin{aligned} \mu (t,e_i) +\sigma (t,e_i)\beta (t,e_i)+\int _{{\mathbb {R}}_0} \gamma (t,e_i,\zeta ) \theta (t,e_i,\zeta ) \nu _i(\mathrm {d}\zeta )=0 \text { for a.a. } (t,\omega ) \end{aligned}$$
(102)

for all \(i=1,\ldots ,D\). Under condition (101), the measure Q defined by

$$\begin{aligned} \mathrm {d}Q(\omega )=\frac{\tilde{A} (T)}{E[\tilde{A} (T)]}\mathrm {d}P(\omega ) \text { on } {\mathscr {F}}_T \end{aligned}$$
(103)

is an equivalent local martingale measure (ELMM) for the process X(t). For more discussion on this, we refer the reader to [15, Section 5].

Assume that \(\alpha (t)\) is a two-state Markov process and that \(g(t,Y(t),\alpha (t),\omega )\) is given by:

$$\begin{aligned} g(t,Y(t),1,\omega )&=-c_1(t) Y(t)\ln Y(t) +c_2(t)Y(t),\nonumber \\ g(t,Y(t),2,\omega )&=c(t)Y(t)+c_0(t). \end{aligned}$$
(104)

Using Theorem 3.3 and arguments similar to those in [15, Section 5], we obtain the following:

Theorem 5.2

Suppose that \(g(t,y,\alpha )\) is as in (104) and \(c_1\) is deterministic. Let \(\tilde{A} (T)\) be the solution of the modified forward adjoint equation and suppose that \(\beta \) and \(\theta \) satisfy

$$\begin{aligned} \mu (t,\alpha ) +\sigma (t,\alpha )\beta (t,\alpha )+\int _{{\mathbb {R}}_0} \gamma (t,\alpha ,\zeta ) \theta (t,\alpha ,\zeta ) \nu _\alpha (\mathrm {d}\zeta )=0 \text { for a.a. } (t,\omega ). \end{aligned}$$

Moreover, assume that \( E\Big [\exp \Big (\int _0^Tc(t) \mathrm {d}t\Big )\Big (1+\int _0^T|c_0(t)| \mathrm {d}t\Big )\Big ]<\infty .\) In addition, suppose that an optimal control \(u^*\) exists. Then, the maximal differential utility is given by:

$$\begin{aligned} Y^*(0,1)= & {} x\left( \exp \int _0^Tc_1(t) \mathrm {d}t \right) E[\tilde{A} (T)], \end{aligned}$$
(105)
$$\begin{aligned} Y^*(0,2)= & {} xE\left[ \exp \int _0^Tc(t) \mathrm {d}t\right] + \int _0^T E\Big [c_0(t)\exp \int _0^Tc(t) \mathrm {d}t\Big ]\mathrm {d}t. \end{aligned}$$
(106)

Proof

It follows from Theorem 3.3 and the arguments in [15, Section 5]. \(\square \)

6 Conclusions

In this paper, we presented three versions of the stochastic maximum principle for Markov regime-switching forward–backward stochastic differential equations with jumps. We then applied the results to study both the problem of optimal control when the Hamiltonian is not concave and the problem of recursive utility maximization. In the former case, the Malliavin calculus approach was used. There are several advantages to the Malliavin calculus approach. First, it does not require the study of existence and uniqueness of the solution of the BSDE usually satisfied by the adjoint processes. Second, it does not assume concavity of the Hamiltonian. Third, it enables us to get an “explicit” solution of the optimal control problem for a non-concave Hamiltonian in some cases.

In this work, it is assumed that the sensitivity towards risk of the controller when making decisions is implicitly given in the utility function. It is often the case that the risk-sensitivity parameter is explicitly taken into consideration when dealing with the controller's preferences. Such a control problem is known as risk-sensitive control and has been studied in the past years by several authors; see, for example, [37,38,39,40]. It would therefore be interesting to extend the current Malliavin calculus approach to the risk-sensitive case. A risk-sensitive maximum principle for a Markov regime-switching jump–diffusion system is derived in [41] using the classical approach.

Another interesting study would be to address the problem of a partial observation maximum principle for Markov regime-switching systems. In fact, in many economic applications the target variables are not always observed and a specific observation process is given (see, e.g. [37]). A way of solving the control problem in this case is to derive the stochastic partial differential equation satisfied by the associated filter and to consider an optimal control problem for stochastic partial differential equations.

In this paper, we do not analyse the effect that a change in a parameter (e.g. volatility, initial value) of the state process could have on the obtained optimal control. Such a study could also be of interest.