1 Introduction

Expected utility theory can be seen as a theory of decision making under uncertainty based on postulates about an agent’s preferences. In general, the agent’s preferences are driven by a time-additive functional that discounts future rewards at a constant rate. The standard expected utility maximization problem supposes that the agent knows the initial probability measure that governs the dynamics of the underlying. However, it is difficult or even impossible to single out one trustworthy probability distribution for the uncertainty. Moreover, in finance and insurance, there is no consensus on which underlying probability measure should be used to model uncertainty. This led to the study of utility maximization under model uncertainty, the uncertainty being represented by a family of absolutely continuous (or equivalent) probability measures. The idea is to solve the problem for each probability measure in the above-mentioned class and choose the one that gives the worst objective value. More specifically, the investor maximizes the expected utility with respect to each measure in this class and chooses, among all of them, the portfolio with the lowest value. This is also known as the robust optimization problem and has been studied intensively in recent years. For more information, the reader may consult (Bordigoni et al. 2005; Elliott and Siu 2011; Faidi et al. 2011; Jeanblanc et al. 2012; Menoukeu-Pamen 2015; Øksendal and Sulem 2012) and references therein.

Our paper is motivated by the ideas developed in Menoukeu-Pamen (2015), Menoukeu-Pamen (2014) and Øksendal and Sulem (2012), where general maximum principles for Forward–backward stochastic differential games with or without delay are presented. We give a general maximum principle for Forward–backward Markov regime-switching stochastic differential equations under model uncertainty. Then we study a problem of recursive utility maximization with entropy penalty. We show that the value function is the unique solution to a quadratic Markov regime-switching backward stochastic differential equation. This result extends the results in Bordigoni et al. (2005) and Jeanblanc et al. (2012) by considering a Markov regime-switching state process and a more general stochastic differential utility (SDU). The notion of SDU was introduced in Duffie and Epstein (1992) as a continuous-time extension of the concept of recursive utility proposed in Epstein and Zin (1989) and Weil (1990). The latter notion was developed in order to disentangle the concepts of risk aversion and aversion to intertemporal substitution, which are not treated independently in the standard utility formulation.

The other motivation is to study a stochastic differential game problem for Markov regime-switching systems. In a financial market, one may assume that this corresponds to the case in which the mean relative growth rate of the risky asset is not known to the agent, but subject to uncertainty; hence it is regarded as a stochastic control that plays against the agent, that is, a (zero-sum) stochastic differential game between the agent and the market. A similar problem was studied in Elliott and Siu (2011), where the objective of an insurance company is to choose an optimal investment strategy so as to maximize the expected exponential utility of terminal wealth in the worst-case scenario. The authors use the dynamic programming approach to derive the explicit optimal investment of the company and the optimal mean growth rate of the market when the interest rate is zero. In this paper, our general stochastic maximum principle extends their results to the framework of (nonzero-sum) Forward–backward stochastic differential games and more general dynamics for the state process. In addition, when the company and the market have the same level of information, we obtain explicit forms for the optimal strategies of the market and the insurance company when the Markov chain has two states and the interest rate is not zero. Let us mention that our general result can also be applied to study utility maximization under a risk constraint and model uncertainty. This is due to the fact that risk measures can be written as solutions to BSDEs; hence transforming the constrained problem into an unconstrained one leads to the setting discussed here. Another application of our result pertains to risk minimization under model uncertainty in a regime-switching market.

The remainder of the paper is organized as follows: In Sect. 2, we formulate the control problem. In Sect. 3, we derive a partial information stochastic maximum principle for forward–backward stochastic differential games for a Markov switching Lévy process under model uncertainty. In Sect. 4, we apply the results to study, first, robust utility maximization with entropy penalty and, second, a problem of optimal investment of an insurance company under model uncertainty. In the latter case, explicit expressions for the optimal strategies are derived.

2 Model and problem formulation

In this section, we formulate the general problem of stochastic differential games of Markov regime-switching Forward–backward SDEs. Let \((\Omega ,\mathcal {F},P)\) be a complete probability space, where P is a reference probability measure. On this probability space, we assume that we are given a one-dimensional Brownian motion \(B=\{B(t)\}_{0\le t\le T}\), an irreducible homogeneous continuous-time, finite state space Markov chain \(\alpha :=\{\alpha (t)\}_{0\le t\le T}\) and an independent Poisson random measure \(N(\mathrm {d}\zeta ,\mathrm {d}s)\) on \((\mathbb {R}^+\times \mathbb {R}_0,\mathcal {B}(\mathbb {R}^+)\otimes \mathcal {B}_0)\) under P. Here \(\mathbb {R}_0=\mathbb {R} \backslash \{0\}\) and \(\mathcal {B}_0\) is the Borel \(\sigma \)-algebra generated by the open subsets of \(\mathbb {R}_0\).

We suppose that the filtration \(\mathbb {F}=\{\mathcal {F}_t\}_{0\le t\le T}\) is the P-augmented natural filtration generated by B, N and \(\alpha \) [see for example Donnelly (2011, Section 2) or Elliott and Siu (2011, p. 369)].

We assume that the Markov chain takes values in a finite state space \(\mathbb {S}=\{e_1,e_2, \ldots ,e_D\}\subset \mathbb {R}^D\), where \(D\in \mathbb {N}\), and the jth component of \(e_n\) is the Kronecker delta \(\delta _{nj}\) for each \(n,j=1,\ldots , D\). Denote by \(\Lambda :=\{\lambda _{nj}:1\le n,j\le D\}\) the rate (or intensity) matrix of the Markov chain under P. Hence, for each \(1\le n,j\le D,\,\,\lambda _{nj}\) is the constant transition intensity of the chain from state \(e_n\) to state \(e_j\) at time t. Recall that for \(n\ne j,\,\,\lambda _{nj}\ge 0\) and \(\sum _{j=1}^D \lambda _{nj}=0\), hence \(\lambda _{nn}\le 0\). As shown in Elliott et al. (1994), \(\alpha \) admits the following semimartingale representation

$$\begin{aligned} \alpha (t)=\alpha (0)+\int _0^t\Lambda ^T\alpha (s)\mathrm {d}s+M(t), \end{aligned}$$
(2.1)

where \(M:=\{M(t)\}_{t\in [0,T]}\) is an \(\mathbb {R}^D\)-valued \((\mathbb {F},P)\)-martingale and \(\Lambda ^T\) denotes the transpose of the matrix \(\Lambda \). Next we introduce the Markov jump martingales associated with \(\alpha \); for more information, the reader should consult Elliott et al. (1994) or Zhang et al. (2012). For each \(1\le n,j\le D\), with \(n\ne j\), and \(t\in [0,T]\), denote by \(J^{nj}(t)\) the number of jumps from state \(e_n\) to state \(e_j\) up to time t. It can be shown (see Elliott et al. 1994) that

$$\begin{aligned} J^{nj}(t)=\lambda _{nj}\int _0^t\langle \alpha (s-),e_n\rangle \mathrm {d}s +m_{nj}(t), \end{aligned}$$
(2.2)

where \(m_{nj}:=\{m_{nj}(t)\}_{t\in [0,T]}\) with \(m_{nj}(t):=\int _0^t\langle \alpha (s-),e_n\rangle \langle \mathrm {d}M(s),e_j\rangle \) is a \((\mathbb {F},P)\)-martingale.

Fix \(j\in \{1,2,\ldots ,D\}\) and denote by \(\Phi _j(t)\) the number of jumps into state \(e_j\) up to time t. Then

$$\begin{aligned} \Phi _j(t)&:=\sum _{n=1,n\ne j}^D J^{nj}(t)= \sum _{n=1,n\ne j}^D \lambda _{nj}\int _0^t\langle \alpha (s-),e_n\rangle \mathrm {d}s +\widetilde{\Phi }_{j}(t)\nonumber \\&= \lambda _j(t) + \widetilde{\Phi }_{j}(t), \end{aligned}$$
(2.3)

with \(\widetilde{\Phi }_{j}(t)=\sum _{n=1,n\ne j}^D m_{nj}(t)\) and \(\lambda _j(t)=\sum _{n=1,n\ne j}^D \lambda _{nj}\int _0^t\langle \alpha (s-),e_n\rangle \mathrm {d}s \). Note that for each \(j\in \{1,2,\ldots ,D\},\,\,\,\widetilde{\Phi }_{j}:=\{\widetilde{\Phi }_{j}(t)\}_{t\in [0,T]}\) is a \((\mathbb {F},P)\)-martingale.
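For intuition, the following sketch (not part of the paper) simulates a small chain and checks numerically that the compensated jump counts \(\widetilde{\Phi }_{j}(T)=\Phi _j(T)-\lambda _j(T)\) from (2.3) have mean approximately zero; the two-state rate matrix and all numerical values are hypothetical choices made only for illustration.

```python
import numpy as np

# Illustrative check of (2.3): for a hypothetical two-state chain, the
# compensated jump counts Phi_j(T) - lambda_j(T) should have mean ~ 0.
rng = np.random.default_rng(0)
D = 2
Lam = np.array([[-0.5, 0.5],
                [0.8, -0.8]])          # rows sum to zero, off-diagonal >= 0
T, dt, n_paths = 5.0, 1e-2, 2000
n_steps = int(T / dt)

def compensated_counts():
    state = 0                          # start in e_1
    phi = np.zeros(D)                  # Phi_j: number of jumps into e_j
    comp = np.zeros(D)                 # lambda_j(t): compensator in (2.3)
    for _ in range(n_steps):
        for j in range(D):
            if j != state:
                comp[j] += Lam[state, j] * dt
        if rng.random() < -Lam[state, state] * dt:   # a jump occurs
            probs = Lam[state].copy()
            probs[state] = 0.0
            probs /= probs.sum()
            state = rng.choice(D, p=probs)
            phi[state] += 1
    return phi - comp                  # one realization of tilde(Phi)_j(T)

samples = np.array([compensated_counts() for _ in range(n_paths)])
print("mean of compensated jump counts (should be close to 0):",
      samples.mean(axis=0))
```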

Suppose that the compensator of \(N(\mathrm {d}\zeta ,\mathrm {d}s)\) is given by

$$\begin{aligned} \eta _\alpha (\mathrm {d}\zeta ,\mathrm {d}s):=\nu _\alpha (\mathrm {d}\zeta |s)\eta (\mathrm {d}s)=\langle \alpha (s-),\nu (\mathrm {d}\zeta |s)\rangle \eta (\mathrm {d}s), \end{aligned}$$
(2.4)

where \(\eta (\mathrm {d}s)\) is a \(\sigma \)-finite measure on \(\mathbb {R}^+\) and \(\nu (\mathrm {d}\zeta |s):=(\nu _{e_1}(\mathrm {d}\zeta |s),\nu _{e_2}(\mathrm {d}\zeta |s),\ldots ,\nu _{e_D}(\mathrm {d}\zeta |s))\in \mathbb {R}^D\) is a function of s. Let us mention that for each \(j=1,\ldots ,D\), \(\nu _{e_j}(\mathrm {d}\zeta |s)=\nu _j(\mathrm {d}\zeta |s) \) represents the conditional Lévy density of jump sizes of \(N(\mathrm {d}\zeta ,\mathrm {d}s)\) at time s when \(\alpha (s^-)=e_j\) and satisfies \(\int _{\mathbb {R}_{0}}\min (1,\zeta ^{2})\nu _j (\mathrm {d}\zeta |s)< \infty \). In this paper, we further assume that \(\eta (\mathrm {d}s)=\mathrm {d}s\) and that \(\nu (\mathrm {d}\zeta |s)\) depends only on \(\zeta \), that is,

$$\begin{aligned} \nu (\mathrm {d}\zeta |s)=\nu (\mathrm {d}\zeta ) \end{aligned}$$

and denote by \(\widetilde{N}_\alpha (\mathrm {d}\zeta ,\mathrm {d}s):=N(\mathrm {d}\zeta ,\mathrm {d}s)-\nu _\alpha (\mathrm {d}\zeta )\,\mathrm {d}s\) the compensated Markov regime-switching Poisson random measure.

Suppose that the state process \(X(t)=X^{(u)}(t,\omega );\,\,0 \le t \le T,\,\omega \in \Omega \), is a controlled Markov regime-switching jump-diffusion process of the form

$$\begin{aligned} \left\{ \begin{array}{llll} \,\mathrm {d}X (t) &{}=&{} b(t,X(t),\alpha (t),u(t),\omega )\,\mathrm {d}t +\sigma (t,X(t),\alpha (t),u(t),\omega )\,\mathrm {d}B(t) \\ &{}&{}+\,\displaystyle \int _{\mathbb {R}_0}\gamma (t,X(t),\alpha (t),u(t),\zeta , \omega )\,\widetilde{N}_\alpha (\mathrm {d}\zeta ,\mathrm {d}t)\\ &{}&{}+\,\eta (t,X(t),\alpha (t),u(t),\omega )\cdot \mathrm {d}\widetilde{\Phi }(t),\,\,\,\,\,\, t \in [ 0,T] , \\ X(0) &{}=&{} x_0. \end{array}\right. \end{aligned}$$
(2.5)

In a financial market, the above model makes it possible to incorporate the impact of changes in macro-economic conditions on the dynamics of an asset’s price, as well as the occurrence of unpredictable events that could affect that dynamics. One could think of the Brownian motion part as the random shocks in the price of a risky asset. The Poisson jump part takes into account the jumps in the asset price caused by lack of information or unexpected events. The Markov chain describes economic cycles: the states of the underlying Markov chain represent the different states of the economy, whereas the jumps given by the martingale of the underlying Markov chain represent transitions in economic conditions.
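To fix ideas, here is a minimal Euler-type discretization sketch of a simplified scalar version of (2.5): two regimes, a constant control, mean-zero compound-Poisson jumps (so the compensator drift vanishes) and the \(\eta \cdot \mathrm {d}\widetilde{\Phi }\) term omitted. All coefficients are hypothetical and chosen only to illustrate how the Brownian, jump and regime-switching components enter the dynamics.

```python
import numpy as np

# Euler-type sketch of a simplified version of (2.5); all coefficients are
# hypothetical.  The jump sizes have mean zero, so the compensated jump
# integral is approximated by the raw jump sum.
rng = np.random.default_rng(1)
T, n = 1.0, 1000
dt = T / n
sig = [0.2, 0.4]                                 # sigma in regimes e_1, e_2
jump_rate, jump_scale = 1.0, 0.1                 # Poisson intensity, size scale
Lam = np.array([[-0.5, 0.5], [0.8, -0.8]])       # rate matrix of the chain

x, regime, u = 1.0, 0, 0.05                      # state, regime, constant control
for _ in range(n):
    dB = rng.normal(0.0, np.sqrt(dt))
    n_jumps = rng.poisson(jump_rate * dt)
    dJ = jump_scale * rng.normal(size=n_jumps).sum()
    x += (0.05 * x + u) * dt + sig[regime] * x * dB + x * dJ
    if rng.random() < -Lam[regime, regime] * dt:  # regime switch
        regime = 1 - regime
print("X(T) along one simulated path:", x)
```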

In this paper, we consider a nonzero-sum stochastic differential game problem. This means that one player’s gain (respectively loss) does not necessarily result in the other player’s loss (respectively gain). In our model, the control \(u=(u_1,u_2)\) is such that \(u_i\) is the control of player \(i;\, i=1,2.\) We suppose that the different levels of information available at time t to player \(i;\, i=1,2\), are modelled by two subfiltrations

$$\begin{aligned} \mathcal {E}^{(i)}_t\subset \mathcal {F}_t\,;\,\,\,t\in [0,T]. \end{aligned}$$
(2.6)

Note that one possible subfiltration \((\mathcal {E}^{(i)}_{t})_{t\ge 0}\) in (2.6) is the \(\delta \)-delayed information given by

$$\begin{aligned} \mathcal {E}^{(i)}_{t}=\mathcal {F}_{(t-\delta )^+ };\,\,\,t\ge 0 \end{aligned}$$

where \(\delta \ge 0\) is a given constant delay. Denote by \(\mathcal {A}_i\) the set of admissible controls of player i, contained in the set of \(\mathcal {E}^{(i)}_t\)-predictable processes, \(i=1,2\), with values in \(\mathbb {A}_i\subset \mathbb {R}\).

The functions \(b,\sigma , \gamma \) and \(\eta \) are given such that, for all t, \(b(t,x,e_n,u,\cdot )\), \( \sigma (t,x ,e_n,u,\cdot )\), \(\gamma (t,x,e_n,u,\zeta ,\cdot )\) and \(\eta (t,x,e_n,u,\cdot ),\,\,n=1,\ldots ,D\), are \(\mathcal {F}_t\)-progressively measurable for all \(x \in \mathbb {R},\,\,u\in \mathbb {A}_1\times \mathbb {A}_2\) and \(\zeta \in \mathbb {R}_0\); in addition, \(b(\cdot ,x,e_n,u,\omega )\), \( \sigma (\cdot ,x ,e_n,u,\omega )\), \(\gamma (\cdot ,x,e_n,u,\zeta ,\omega )\) and \(\eta (\cdot ,x,e_n,u,\omega ),\,\,n=1,\ldots ,D\), are measurable in t for each \(x\in \mathbb {R}, u\in \mathbb {A}_1\times \mathbb {A}_2, \zeta \in \mathbb {R}_0\) and \(\omega \in \Omega \), and (2.5) has a unique strong solution for any admissible control \(u\in \mathcal {A}_1\times \mathcal {A}_2\). Under these conditions, existence and uniqueness of a solution to (2.5) is ensured if \(b, \sigma , \gamma \) and \(\eta \) are globally Lipschitz continuous in x and satisfy a linear growth condition in x; see for example Applebaum (2009, Theorem 6.2.3), Mao and Yuan (2006, Theorem 3.13) and Kulinich and Kushnirenko (2014, Theorem).

For each player i, we consider the associated BSDE in the unknowns \((Y_i(t), Z_i(t), K_i(t,\zeta ), V_i(t))\) of the form

$$\begin{aligned} \left\{ \begin{array}{llll} \,\mathrm {d}Y_i (t) &{}=&{} - g_i(t,X(t),\alpha (t),Y_i(t), Z_i(t), K_i(t,\cdot ),V_i(t),u(t))\,\mathrm {d}t \,+\,Z_i(t)\,\mathrm {d}B(t) \\ &{}&{}+\, \displaystyle \int _{\mathbb {R}_0}K_i(t,\zeta )\,\widetilde{N}_\alpha (\mathrm {d}\zeta ,\mathrm {d}t)+V_i(t)\cdot \mathrm {d}\widetilde{\Phi }(t);\,\,\, t \in [ 0,T] , \\ Y_i(T) &{}=&{} h_i(X(T),\alpha (T))\,; \quad i=1,2. \end{array}\right. \end{aligned}$$
(2.7)

Here \(g_i:[0,T]\times \mathbb {R} \times \mathbb {S} \times \mathbb {R} \times \mathbb {R} \times \mathcal {R} \times \mathbb {R} \times \mathbb {A}_1\times \mathbb {A}_2 \rightarrow \mathbb {R}\) and \(h_i:\mathbb {R}\times \mathbb {S} \rightarrow \mathbb {R}\) are such that the BSDE (2.7) has a unique solution for any admissible control \(u \in \mathcal {A}_1\times \mathcal {A}_2\). For sufficient conditions for existence and uniqueness of Markov regime-switching BSDEs, we refer the reader to Cohen and Elliott (2010, Theorem 1.1), Crepey (2010, Proposition 14.4.1) or Tang and Li (1994, Lemma 2.4) and references therein. For example, such a unique solution exists if one assumes that \(g_i(\cdot , x,e_n,y,z,k,v,u)\) is uniformly Lipschitz continuous with respect to (x, y, z, k, v), the random variable \(h_i(X(T),\alpha (T))\) is square integrable and \(g_i(t,0,e_n,0,0,0,0,u)\) is uniformly bounded.

Let \(f_i:[0,T]\times \mathbb {R} \times \mathbb {S} \times \mathbb {A}_1\times \mathbb {A}_2 \rightarrow \mathbb {R}, \,\,\,\varphi _i:\mathbb {R}\times \mathbb {S} \rightarrow \mathbb {R}\) and \(\psi _i:\mathbb {R} \rightarrow \mathbb {R},\,i=1,2\) be given \(C^1\) functions with respect to their arguments and \(\psi _i^\prime (x)\ge 0\) for all \(x,\,i=1,2\). For the nonzero-sum games, the control actions are not free and generate for each player \(i,\,\,i=1,2\), a performance functional

$$\begin{aligned} J_i(t,u)&:=E\Big [ \int _t^T f_i(s,X(s),\alpha (s),u(s))\,\mathrm {d}s + \varphi _i(X(T),\alpha (T))\nonumber \\&\quad +\,\psi _i(Y_i(t))\Big | \mathcal {E}^{(i)}_t\Big ];\quad i=1,2. \end{aligned}$$
(2.8)

Here, \(f_i,\,\varphi _i\) and \(\psi _i\) may be seen as profit rates, bequest functions and “utility evaluations” respectively, of the player \(i;\, i=1,2\). For \(t=0\), we put

$$\begin{aligned} J_i(u):=J_i(0,u),\,\, i=1,2. \end{aligned}$$
(2.9)

Let us note that in the nonzero-sum game the players do not share the same performance functional; instead, each of them uses his own performance functional. In addition, they have the same type of objective, that is, each maximizes his own performance functional. To be more precise, the nonzero-sum game is the following:

Problem 2.1

Find \((u_1^*,u_2^*)\in \mathcal {A}_1\times \mathcal {A}_2\) (if it exists) such that

  1.

    \(J_1(t,u_1,u_2^*)\le J_1(t,u_1^*,u_2^*)\) for all \(u_1\in \mathcal {A}_1\),

  2.

    \(J_2(t,u_1^*,u_2)\le J_2(t,u_1^*,u_2^*)\) for all \(u_2\in \mathcal {A}_2\).

If it exists, we call such a pair \((u_1^*,u_2^*)\) a Nash equilibrium. Intuitively, player I controls \(u_1\) and player II controls \(u_2\). We assume that each player knows the equilibrium strategy of the other player and does not gain anything by changing his strategy unilaterally. If each player is making the best decision he can, based on the other player’s decision, then we say that the two players are in Nash equilibrium.

3 A stochastic maximum principle for Markov regime-switching forward–backward stochastic differential games

In this section, we derive the Nash equilibrium for Problem 2.1 based on a stochastic maximum principle for Markov regime-switching Forward–backward stochastic differential equations.

Define the Hamiltonians

$$\begin{aligned} H_i:[0,T] \times \mathbb {R}\times \mathbb {S}\times \mathbb {R}^2\times \mathcal {R}\times \mathbb {R} \times \mathbb {A}_1\times \mathbb {A}_2 \times \mathbb {R}^3 \times \mathcal {R} \times \mathbb {R} \longrightarrow \mathbb {R}, \end{aligned}$$

by

$$\begin{aligned}&H_i\left( t,x,e_n,y,z,k,v,u_1,u_2,a,p,q,r(\cdot ),w\right) \nonumber \\&\quad := f_i(t,x,e_n,u_1,u_2)+a g_i(t,x,e_n,y,z,k,v,u_1,u_2)+ p_i b(t,x,e_n,u_1,u_2) \,\nonumber \\&\quad \quad \;+\,q_i\sigma (t,x,e_n,u_1,u_2)+\int _{\mathbb {R}_0}r_i(\zeta )\gamma (t,x,e_n,u_1,u_2,\zeta )\nu _\alpha (\mathrm {d}\zeta )\nonumber \\&\quad \quad \; +\,\sum _{j=1}^D\eta _j(t,x,e_n,u_1,u_2)w_n^j(t)\lambda _{nj} ,\quad i=1,2 \end{aligned}$$
(3.1)

where \(\mathcal {R} \) denotes the set of all functions \(k:[0,T]\times \mathbb {R}_0 \rightarrow \mathbb {R}\) for which the integral in (3.1) converges. An example of such a set is \(L^2(\nu _\alpha )\). We suppose that \(H_i,\,i=1,2\), is Fréchet differentiable in the variables x, y, z, k, v, u and that \(\nabla _k H_i(t,\zeta ),\,i=1,2\), is a random measure which is absolutely continuous with respect to \(\nu \). Next, we define the associated adjoint processes \(A_i(t),\,p_i(t),\,q_i(t), r_i(t,\cdot )\) and \(w_i(t)\), for \(t\in [0,T]\) and \(\zeta \in \mathbb {R}_0\), by the following Forward–backward SDEs:

  1.

    The Markovian regime-switching forward SDE in \(A_i(t); \,i=1,2\)

    $$\begin{aligned} \left\{ \begin{array}{llll} \mathrm {d}A_i (t) &{}=&{} \dfrac{\partial H_i}{\partial y} (t) \,\mathrm {d}t + \dfrac{\partial H_i}{\partial z} (t) \mathrm {d}B(t)+ \displaystyle \int _{\mathbb {R}_0} \dfrac{\mathrm {d}\nabla _k H_i}{\mathrm {d}\nu (\zeta )} (t,\zeta ) \,\widetilde{N}_\alpha (\mathrm {d}\zeta ,\mathrm {d}t)\\ &{}&{} + \nabla _vH_i(t)\cdot \mathrm {d}\widetilde{\Phi }(t);\,\,\,t\in [0,T], \\ A_i(0) &{}=&{} \psi _i^\prime (Y(0)). \end{array}\right. \end{aligned}$$
    (3.2)

    Here and in what follows, we use the notation

    $$\begin{aligned}&\dfrac{\partial H_i}{\partial y} (t) = \dfrac{\partial H_i}{\partial y} (t,X(t),\alpha (t),u_1(t),u_2(t),Y_i(t), Z_i(t), K_i(t,\cdot ),V_i(t),A_i(t),\\&\quad p_i(t),q_i(t),r_i(t,\cdot ),w_i(t)), \end{aligned}$$

    etc, \(\dfrac{\mathrm {d}\nabla _k H_i}{\mathrm {d}\nu (\zeta )} (t,\zeta ) \) is the Radon-Nikodym derivative of \( \nabla _k H_i(t,\zeta )\) with respect to \(\nu (\zeta )\) and \(\nabla _v H_i(t)\cdot \mathrm {d}\widetilde{\Phi }(t)=\sum _{j=1}^D \dfrac{\partial H_i}{\partial v^j} (t)\mathrm {d}\widetilde{\Phi }_j(t)\) with \(V_i^j=V_i(t,e_j)\).

  2.

    The Markovian regime-switching BSDE in \((p_i(t),q_i(t),r_i(t,\cdot ),w_i(t)); \,i=1,2\)

    $$\begin{aligned} \left\{ \begin{array}{llll} \mathrm {d}p_i (t) &{}=&{} - \dfrac{\partial H_i}{\partial x} (t) \mathrm {d}t + q_i(t)\,\mathrm {d}B(t)+ \displaystyle \int _{\mathbb {R}_0} r_i (t,\zeta ) \,\widetilde{N}_\alpha (\mathrm {d}\zeta ,\mathrm {d}t)\\ &{}&{}+\, w_i(t)\cdot \mathrm {d}\widetilde{\Phi }(t);\,\,\,t\in [0,T], \\ p_i (T) &{}=&{} \dfrac{\partial \varphi _i}{\partial x}(X(T),\alpha (T))\,+A_i(T) \dfrac{\partial h_i}{\partial x} (X(T),\alpha (T)). \end{array}\right. \end{aligned}$$
    (3.3)
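The following schematic sketch (not part of the paper's formal development) shows how the terms of the Hamiltonian (3.1) are assembled for fixed arguments; the coefficient functions, the jump grid and the quadrature weights approximating \(\nu _\alpha (\mathrm {d}\zeta )\) are all placeholders to be supplied by the user.

```python
# Schematic assembly of the Hamiltonian H_i in (3.1) for fixed arguments.
# f, g, b, sigma, gamma, eta are user-supplied callables; zeta_grid and
# nu_weights give a quadrature rule approximating nu_alpha(d zeta); lam_row
# is the row (lambda_{n1},...,lambda_{nD}) of the rate matrix and w_row is
# (w^1,...,w^D).  Everything here is illustrative only.
def hamiltonian(t, x, e_n, y, z, k, v, u1, u2, a, p, q, r, w_row, lam_row,
                f, g, b, sigma, gamma, eta, zeta_grid, nu_weights):
    val = f(t, x, e_n, u1, u2)
    val += a * g(t, x, e_n, y, z, k, v, u1, u2)
    val += p * b(t, x, e_n, u1, u2)
    val += q * sigma(t, x, e_n, u1, u2)
    # integral term: int r(zeta) gamma(t,x,e_n,u1,u2,zeta) nu_alpha(d zeta)
    val += sum(wgt * r(zz) * gamma(t, x, e_n, u1, u2, zz)
               for zz, wgt in zip(zeta_grid, nu_weights))
    # regime-switching term: sum_j eta_j(t,x,e_n,u1,u2) w^j lambda_{nj}
    val += sum(eta(t, x, e_n, u1, u2, j) * w_row[j] * lam_row[j]
               for j in range(len(lam_row)))
    return val
```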

3.1 A sufficient maximum principle

In what follows, we give the sufficient maximum principle.

Theorem 3.1

(Sufficient maximum principle for Regime-switching FBSDE nonzero-sum games) Let \((\widehat{u}_1,\widehat{u}_2)\in \mathcal {A}_1\times \mathcal {A}_2 \) with corresponding solutions \(\widehat{X}(t),(\widehat{Y}_i(t), \widehat{Z}_i(t), \widehat{K}_i(t,\zeta ), \widehat{V}_i(t)), \widehat{A}_i(t),(\widehat{p}_i(t),\widehat{q}_i(t),\widehat{r}_i(t,\zeta ),\widehat{w}_i(t))\) of (2.5), (2.7), (3.2) and (3.3) respectively for \(i=1,2\). Suppose that the following holds:

  1.

    For each \(e_n \in \mathbb {S} \), the functions

    $$\begin{aligned} x \mapsto h_i(x, e_n),\quad \,x \mapsto \varphi _i(x, e_n),\quad \,y \mapsto \psi _i(y), \end{aligned}$$
    (3.4)

    are concave for \(i=1,2\).

  2.

    The functions

    $$\begin{aligned}&\widetilde{\mathcal {H}}_1(t,x,e_n,y,z,k,v)\nonumber \\&\quad =\,\underset{\mu _1\in \mathcal {A}_1 }{{{\mathrm{ess \text { } sup}}}} E\Big [ H_1( t,x,e_n,y,z,k,v,\mu _1,\widehat{u}_2(t),\widehat{A}_1(t),\widehat{p}_1(t),\widehat{q}_1(t),\nonumber \\&\quad \quad \quad \widehat{r}_1(t,\cdot ),\widehat{w}_1(t))\Big | \mathcal {E}^{(1)}_t\Big ] \end{aligned}$$
    (3.5)

    and

    $$\begin{aligned}&\widetilde{\mathcal {H}}_2(t,x,e_n,y,z,k,v)\nonumber \\&\quad =\,\underset{\mu _2\in \mathcal {A}_2 }{{{\mathrm{ess \text { } sup}}}} E\Big [ H_2( t,x,e_n,y,z,k,v,\widehat{u}_1(t),\mu _2,\widehat{A}_2(t),\widehat{p}_2(t),\widehat{q}_2(t),\nonumber \\&\quad \quad \quad \widehat{r}_2(t,\cdot ),\widehat{w}_2(t))\Big | \mathcal {E}^{(2)}_t\Big ] \end{aligned}$$
    (3.6)

    are all concave for all \((t,e_n) \in [0,T]\times \mathbb {S}\) a.s.

  3.
    $$\begin{aligned} E\Big [\hat{H}_1(t,\widehat{u}_1(t),\widehat{u}_2(t))) \Big . \Big | \mathcal {E}^{(1)}_t\Big ]=\underset{\mu _1\in \mathcal {A}_1 }{{{\mathrm{ess \text { } sup}}}}&\Big \{E\Big [\hat{H}_1(t,\mu _1,\widehat{u}_2(t)) \Big . \Big | \mathcal {E}^{(1)}_t\Big ]\Big \} \end{aligned}$$
    (3.7)

    for all \(t\in [0,T]\), a.s. and

    $$\begin{aligned} E\Big [ \hat{H}_2(t,\widehat{u}_1(t),\widehat{u}_2(t)) \Big . \Big | \mathcal {E}^{(2)}_t\Big ]=\underset{\mu _2\in \mathcal {A}_2 }{{{\mathrm{ess \text { } sup}}}}\Big \{E\Big [\hat{H}_2(t,\widehat{u}_1(t),\mu _2(t)) \Big . \Big | \mathcal {E}^{(2)}_t\Big ]\Big \} \end{aligned}$$
    (3.8)

    for all \(t\in [0,T]\), a.s. Here

    $$\begin{aligned}&\hat{H}_i(t,u_1(t),u_2(t))\\&\quad =\,H_i(t,\widehat{X}(t),\alpha (t),\widehat{Y}_i(t), \widehat{Z}_i(t), \widehat{K}_i(t,\cdot ),\widehat{V}_i(t),u_1(t),u_2(t),\widehat{A}_i(t),\widehat{p}_i(t), \widehat{q}_i(t),\\&\quad \quad \quad \;\widehat{r}_i(t,\cdot ),\widehat{w}_i(t)) \end{aligned}$$

    for \(i=1,2.\)

  4.

    \(\frac{\mathrm {d}}{\mathrm {d}\nu }\nabla _k\widehat{H}_i(t,\xi )>-1\) for \(i=1,2.\)

  5.

    In addition, the integrability condition

    $$\begin{aligned}&E\left[ \int _0^T\left\{ \widehat{p}_i^2(t) \left( \left( \sigma (t)-\widehat{\sigma }(t)\right) ^2+ {\int }_{\mathbb {R}_0}( \gamma (t,\zeta )-\widehat{\gamma } (t,\zeta ) )^2\,\nu _\alpha (\mathrm {d}\zeta )\right. \right. \right. \nonumber \\&\left. \quad \quad +\,\sum _{j=1}^D\left( \eta _j(t)-\widehat{\eta }_j(t) \right) ^2\lambda _{j}(t) \right) \nonumber \\&\quad \quad +\,(X(t)-\widehat{X}(t))^2 \left( \widehat{q}_i^2(t)+ {\int }_{\mathbb {R}_0}\widehat{r}_i^2 (t,\zeta ) \nu _\alpha (\mathrm {d}\zeta )+\sum _{j=1}^D(w^j_i)^2(t)\lambda _{j}(t) \right) \nonumber \\&\quad \quad +\,(Y_i(t)-\widehat{Y}_i(t))^2 \left( \left( \dfrac{\partial \widehat{H}_i}{\partial z} \right) ^2(t) + {\int }_{\mathbb {R}_0} \left\| \nabla _k \widehat{H}_i(t,\zeta )\right\| ^2 \nu _\alpha (\mathrm {d}\zeta )\right. \nonumber \\&\left. \quad \quad +\,\sum _{j=1}^D \left( \dfrac{\partial \widehat{H}_i}{\partial v^j} \right) ^2(t) \lambda _{j}(t) \right) \nonumber \\&\quad \quad \Big . \Big . +\,\widehat{A}_i^2(t) \Big ( (Z_i(t)-\widehat{Z}_i(t))^2+ {\int }_{\mathbb {R}_0}( K_i (t,\zeta )-\widehat{K}_i (t,\zeta ) )^2\nu _\alpha (\mathrm {d}\zeta )\nonumber \\&\quad \quad \left. +\,\sum _{j=1}^D(V_i^j(t)-\widehat{V}_i^j(t) )^2\lambda _{j}(t) \Big )\Big \} \mathrm {d}t \right] <\infty \end{aligned}$$
    (3.9)

    holds for \(i=1,2\).

Then \(\widehat{u}=(\widehat{u}_1(t),\widehat{u}_2(t))\) is a Nash equilibrium for (2.5), (2.7) and (2.8).

Proof of Theorem 3.1

See “Appendix”. \(\square \)

Remark 3.2

In the above Theorem and in its proof, we have used the following shorthand notation: for \( i = 1\), the processes corresponding to \(u=( u_1,\hat{u}_2)\) are given, for example, by \(X(t) = X^{(u_1,\hat{u}_2)}(t)\) and \(Y_1(t) = Y_1^{(u_1,\hat{u}_2)}(t)\), and the processes corresponding to \(u= (\hat{u}_1,\hat{u}_2)\) are \(\hat{X}(t) = X^{(\hat{u}_1,\hat{u}_2)}(t)\) and \(\hat{Y}_1(t) = Y_1^{(\hat{u}_1,\hat{u}_2)}(t) \). Similar notation is used for \(i=2\). The integrability condition (3.9) ensures the existence of the stochastic integrals when applying the Itô formula in the proof of the Theorem.

Remark 3.3

Let V be an open subset of a Banach space \(\mathcal {X}\) and let \(F: V \rightarrow \mathbb {R}\).

  • We say that F has a directional derivative (or Gateaux derivative) at \(x\in V\) in the direction \(y\in \mathcal {X}\) if

    $$\begin{aligned} D_yF(x):=\underset{\varepsilon \rightarrow 0}{\lim } \frac{1}{\varepsilon }(F(x + \varepsilon y)-F(x)) \text { exists.} \end{aligned}$$
  • We say that F is Fréchet differentiable at \(x \in V\) if there exists a linear map

    $$\begin{aligned} L:\mathcal {X} \rightarrow \mathbb {R} \end{aligned}$$

    such that

    $$\begin{aligned} \underset{\underset{h \in \mathcal {X}}{h \rightarrow 0}}{\lim } \frac{1}{\Vert h\Vert }|F(x+h)-F(x)-L(h)|=0. \end{aligned}$$

    In this case we call L the Fréchet derivative of F at x, and we write

    $$\begin{aligned} L=\nabla _x F. \end{aligned}$$
  • If F is Fréchet differentiable, then F has a directional derivative in all directions \(y \in \mathcal {X}\) and

    $$\begin{aligned} D_yF(x)= \nabla _x F(y). \end{aligned}$$
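As a simple example related to the term \(\nabla _k H_i\) used above, take \(\mathcal {X}=L^2(\nu )\) and, for a fixed \(\gamma \in L^2(\nu )\), \(F(k)=\int _{\mathbb {R}_0}k(\zeta )\gamma (\zeta )\,\nu (\mathrm {d}\zeta )\). Since F is linear, for any direction \(h\in \mathcal {X}\),

$$\begin{aligned} F(k+h)-F(k)=\int _{\mathbb {R}_0}h(\zeta )\gamma (\zeta )\,\nu (\mathrm {d}\zeta ), \end{aligned}$$

so \(\nabla _k F(h)=\int _{\mathbb {R}_0}h(\zeta )\gamma (\zeta )\,\nu (\mathrm {d}\zeta )\); that is, \(\nabla _k F\) can be identified with the measure \(\gamma (\zeta )\nu (\mathrm {d}\zeta )\), which is absolutely continuous with respect to \(\nu \) with Radon-Nikodym density \(\gamma \). This is the sense in which \(\dfrac{\mathrm {d}\nabla _k H_i}{\mathrm {d}\nu }\) appears in the adjoint equation (3.2).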

3.2 An equivalent maximum principle

The concavity condition on the Hamiltonians does not hold in many applications. In this section, we shall prove an equivalent stochastic maximum principle which does not require this assumption. We shall assume the following:

Assumption A.1

For all \(t_0\in [0,T]\) and all bounded \(\mathcal {E}^{(i)}_{t_0}\)-measurable random variable \(\theta _i(\omega )\), the control process \(\beta _i\) defined by

$$\begin{aligned} \beta _i(t):= \chi _{]t_0,T[}(t)\theta _i(\omega );\,\,t\in [0,T], \end{aligned}$$
(3.10)

belongs to \(\mathcal {A}_i,\, i=1,2\).

Assumption A.2

For all \(u_i \in \mathcal {A}_i\) and all bounded \(\beta _i \in \mathcal {A}_i\), there exists \(\delta _i>0\) such that

$$\begin{aligned} \widetilde{u}_i(t):=u_i(t)+\ell \beta _i(t) \,\,t\in [0,T] , \end{aligned}$$
(3.11)

belongs to \(\mathcal {A}_i\) for all \(\ell \in ]-\delta _i,\delta _i[,\, i=1,2\).

Assumption A.3

For all bounded \(\beta _i \in \mathcal {A}_i\), the derivative processes

$$\begin{aligned} X_1(t)&=\dfrac{\mathrm {d}}{\mathrm {d}\ell }X^{(u_1+\ell \beta _1,u_2)}(t)\Big . \Big |_{\ell =0}; X_2(t)=\dfrac{\mathrm {d}}{\mathrm {d}\ell }X^{(u_1,u_2+\ell \beta _2)}(t)\Big . \Big |_{\ell =0};\\ y_1(t)&=\dfrac{\mathrm {d}}{\mathrm {d}\ell }Y_1^{(u_1+\ell \beta _1,u_2)}(t)\Big . \Big |_{\ell =0}; y_2(t)=\dfrac{\mathrm {d}}{\mathrm {d}\ell }Y_2^{(u_1,u_2+\ell \beta _2)}(t)\Big . \Big |_{\ell =0};\\ z_1(t)&=\dfrac{\mathrm {d}}{\mathrm {d}\ell }Z_1^{(u_1+\ell \beta _1,u_2)}(t)\Big . \Big |_{\ell =0}; z_2(t)=\dfrac{\mathrm {d}}{\mathrm {d}\ell }Z_2^{(u_1,u_2+\ell \beta _2)}(t)\Big . \Big |_{\ell =0};\\ k_1(t,\zeta )&=\dfrac{\mathrm {d}}{\mathrm {d}\ell }K_1^{(u_1+\ell \beta _1,u_2)}(t,\zeta )\Big . \Big |_{\ell =0}; k_2(t,\zeta )=\dfrac{\mathrm {d}}{\mathrm {d}\ell }K_2^{(u_1,u_2+\ell \beta _2)}(t,\zeta )\Big . \Big |_{\ell =0};\\ v_1^j(t)&=\dfrac{\mathrm {d}}{\mathrm {d}\ell }V_1^{j,{(u_1+\ell \beta _1,u_2)}}(t)\Big . \Big |_{\ell =0},\quad \,j=1,\ldots , n; v_2^j(t)\\&=\dfrac{\mathrm {d}}{\mathrm {d}\ell }V_2^{j,{(u_1,u_2+\ell \beta _1)}}(t)\Big . \Big |_{\ell =0},\quad \,j=1,\ldots , n \end{aligned}$$

exist and belong to \(L^2([0,T] \times \Omega )\).

It follows from (2.5) and (2.7) that

$$\begin{aligned} \left\{ \begin{array}{llll} \mathrm {d}X_1(t) &{}=&{}X_1(t) \left\{ \dfrac{\partial b}{\partial x}(t)\mathrm {d}t+\dfrac{\partial \sigma }{\partial x}(t) \mathrm {d}B(t)+ \displaystyle \int _{\mathbb {R}_0} \dfrac{\partial \gamma }{\partial x}(t,\zeta )\widetilde{N}_\alpha (\mathrm {d}t,\mathrm {d}\zeta ) + \dfrac{\partial \eta }{\partial x}(t)\cdot \mathrm {d}\widetilde{\Phi }(t) \right\} \\ &{}&{}+\,\beta _1(t)\left\{ \dfrac{\partial b}{\partial u_1}(t)\mathrm {d}t+\dfrac{\partial \sigma }{\partial u_1}(t) \mathrm {d}B(t)+ \displaystyle \int _{\mathbb {R}_0} \dfrac{\partial \gamma }{\partial u_1}(t,\zeta )\widetilde{N}_\alpha (\mathrm {d}t,\mathrm {d}\zeta ) +\dfrac{\partial \eta }{\partial u_1}(t)\cdot \mathrm {d}\widetilde{\Phi }(t) \right\} ;\,\,t\in (0,T] \\ X_1(0)&{}=&{}0 \end{array}\right. \nonumber \\ \end{aligned}$$
(3.12)

and

$$\begin{aligned} \left\{ \begin{array}{llll} \mathrm {d}y_1(t)&{}=&{}-\left\{ \dfrac{\partial g_1}{\partial x}(t)X_1(t)+\dfrac{\partial g_1}{\partial y}(t)y_1(t)+\dfrac{\partial g_1}{\partial z}(t)z_1(t)+\displaystyle \int _{\mathbb {R}_0}\nabla _k g_1 (t)k_1(t,\zeta )\nu _\alpha (\mathrm {d}\zeta )\Big . \right. \\ &{}&{}\quad \left. +\,\sum \limits _{j=1}^D \dfrac{\partial g_1}{\partial v_1^j}(t)v_1^j(t)\lambda _j(t)+\dfrac{\partial g_1}{\partial u}(t)\beta _1(t)\right\} \mathrm {d}t +z_1(t)\,\mathrm {d}B(t) \\ &{}&{}\quad +\,\displaystyle \int _{\mathbb {R}_0}k_1(t,\zeta ) \widetilde{N}_\alpha (\mathrm {d}\zeta , \mathrm {d}t) + v_1(t)\cdot \mathrm {d}\widetilde{\Phi }(t) ;\quad \,t\in [0,T]\\ y_1(T)&{}=&{}\dfrac{\partial h_1}{\partial x}(X(T),\alpha (T))X_1(T) . \end{array}\right. \end{aligned}$$
(3.13)

We can obtain \(\mathrm {d}X_2(t)\) and \(\mathrm {d}y_2(t)\) in a similar way.

Remark 3.4

For sufficient conditions for the existence and uniqueness of solutions to (3.12) and (3.13), the reader may consult Peng (1993, Eq. 4.1) (in the case of diffusion state processes).

As an example, a set of sufficient conditions under which (3.12) and (3.13) admit a unique solution is as follows:

  1.

    Assume that the coefficients \(b,\sigma , \gamma , \eta , g_i, h_i,f_i, \psi _i\) and \(\phi _i\) for \(i=1,2\) are continuous with respect to their arguments and are continuously differentiable with respect to (xyzkvu). (Here, the dependence of \(g_i\) and \(f_i\) on k is through \(\int _{\mathbb {R}_0}k(\zeta )\rho (t,\zeta )\nu (\mathrm {d}\zeta )\), where \(\rho \) is a measurable function satisfying \(0\le \rho (t,\zeta )\le c(1\wedge |\zeta |), \text { } \forall \zeta \in \mathbb {R}_0\). Hence the differentiability in this argument is in the Fréchet sense.)

  2.

    The derivatives of \(b,\sigma , \gamma , \eta \) with respect to xu, the derivative of \(h_i,\,i=1,2\) with respect to x and the derivatives of \(g_i,\,i=1,2\) with respect to xyzkvu are bounded.

  3.

    The derivatives of \(f_i,\,i=1,2\) with respect to xu are bounded by \(C(1+|x|+|u|)\).

  4.

    The derivatives of \(\psi _i\) and \(\phi _i\) with respect to x are bounded by \(C(1+|x|).\)

We can state the following equivalent maximum principle:

Theorem 3.5

(Equivalent Maximum Principle) Let \(u_i\in \mathcal {A}_i\) with corresponding solutions X(t) of (2.5), \((Y_i(t),Z_i(t),K_i(t,\zeta ),V_i(t))\) of (2.7), \(A_i(t)\) of (3.2), \((p_i(t),q_i(t),r_i(t,\zeta ),w_i(t))\) of (3.3) and corresponding derivative processes \(X_i(t)\) and \((y_i(t),z_i(t),k_i(t,\zeta ),v_i(t))\) given by (3.12) and (3.13), respectively. Suppose that Assumptions A.1, A.2 and A.3 hold. Moreover, assume the following integrability conditions

$$\begin{aligned}&E\left[ \int _0^T p_i^2(t)\left\{ \left( \dfrac{\partial \sigma }{\partial x}\right) ^2(t)X^2_i(t) +\left( \dfrac{\partial \sigma }{\partial u_i}\right) ^2(t)\beta _i^2(t) \right. \right. \nonumber \\&\quad \quad +\,\int _{\mathbb {R}_0}\left( \left( \dfrac{\partial \gamma }{\partial x}\right) ^2(t,\zeta )X_i^2(t)+\left( \dfrac{\partial \gamma }{\partial u_i}\right) ^2(t,\zeta )\beta _i^2(t)\right) \nu _\alpha (\mathrm {d}\zeta )\nonumber \\&\quad \quad \Big .+\,\sum _{j=1}^D \Big (\Big (\dfrac{\partial \eta ^j}{\partial x}\Big )^2(t)x^2_i(t)+\Big (\dfrac{\partial \eta ^j}{\partial u_i}\Big )^2(t)\beta _i^2(t)\Big )\lambda _j(t)\Big \} \mathrm {d}t\nonumber \\&\quad \quad +\,\int _0^TX_i^2(t)\Big \{ q_i^2(t)+\int _{\mathbb {R}_0}r_i^2(t,\zeta )\nu _\alpha (\mathrm {d}\zeta )+\sum _{j=1}^D (\eta ^j)^2(t)\lambda _j(t)\Big \}\mathrm {d}t\Big ] <\infty \end{aligned}$$
(3.14)

and

$$\begin{aligned}&E\left[ \int _0^Ty_i^2(t) \left\{ \left( \dfrac{\partial H_i}{\partial z}\right) ^2 (t) +\int _{\mathbb {R}_0} \Vert \nabla _k H_i\Vert ^2(t,\zeta )\nu _\alpha (\mathrm {d}\zeta )+ \sum _{j=1}^D\left( \dfrac{\partial H_i}{\partial v^j}\right) ^2 (t) \lambda _j(t)\right\} \mathrm {d}t\right. \nonumber \\&\quad \quad \left. +\, \int _0^TA_i^2(t)\left\{ z_i^2(t)+\int _{\mathbb {R}_0}k_i^2(t,\zeta )\nu _\alpha (\mathrm {d}\zeta )+ \sum _{j=1}^D (v^j_i)^2(t)\lambda _j(t)\right\} \mathrm {d}t\right] <\infty \nonumber \\&\quad \text { for } \quad i=1,2. \end{aligned}$$
(3.15)

Then the following are equivalent:

  1.

    \(\dfrac{\mathrm {d}}{\mathrm {d}\ell }J_1^{(u_1+\ell \beta _1,u_2)}(t)\Big . \Big |_{\ell =0}=\dfrac{\mathrm {d}}{\mathrm {d}\ell }J_2^{(u_1,u_2+\ell \beta _2)}(t)\Big . \Big |_{\ell =0}=0\) for all bounded \(\beta _1\in \mathcal {A}_{1},\,\beta _2\in \mathcal {A}_{2}\)

  2.
    $$\begin{aligned} 0&=E\left[ \dfrac{\partial H_1}{\partial \mu _1} (t,X(t),\alpha (t),\mu _1,u_2,Y_1(t), Z_1(t), K_1(t,\cdot ),V_1(t),\right. \nonumber \\&\quad \left. A_1(t),p_1(t),q_1(t),r_1(t,\cdot ),w_1(t))\Big . \Big | \mathcal {E}^{(1)}_t\right] _{\mu _1=u_1(t)}\nonumber \\&=E\left[ \dfrac{\partial H_2}{\partial \mu _2} (t,X(t),\alpha (t),u_1,\mu _2,Y_2(t), Z_2(t), K_2(t,\cdot ),V_2(t),\right. \nonumber \\&\quad \left. A_2(t),p_2(t),q_2(t),r_2(t,\cdot ),w_2(t))\Big . \Big | \mathcal {E}^{(2)}_t\right] _{\mu _2=u_2(t)} \end{aligned}$$
    (3.16)

    for a.a. \(t \in [0,T].\)

Proof

See “Appendix”. \(\square \)

Remark 3.6

The integrability conditions (3.14) and (3.15) guarantee the existence of the stochastic integrals when applying the Itô formula in the proof of the Theorem. Note also that the result is the same if we start from \(t\ge 0\) in the performance functional, hence extending Øksendal and Sulem (2012, Theorem 2.2) to the Markov regime-switching setting.

3.3 Zero-sum Game

In this section, we solve the zero-sum Markov regime-switching Forward–backward stochastic differential game problem (or worst-case scenario optimal control problem): that is, we assume that the performance functional of Player II is the negative of that of Player I, i.e.,

$$\begin{aligned}&J(t,u_1,u_2)=J_1(t,u_1,u_2)\nonumber \\&\quad :=\,E\Big [ \int _t^T f(s,X(s),\alpha (s),u_1(s),u_2(s))\,\mathrm {d}s + \varphi (X(T),\alpha (T))\,+\,\psi (Y(t))\Big .\Big | {\mathcal {E}}_t^1\Big ]\nonumber \\&\quad =:\,-J_2(t,u_1,u_2). \end{aligned}$$
(3.17)

In this case \((u_1^*,u_2^*)\) is a Nash equilibrium iff

$$\begin{aligned} \underset{u_1 \in \mathcal {A}_1}{{{\mathrm{ess \text { } sup}}}}\,J(t,u_1,u_2^*)=J(t,u_1^*,u_2^*)=\underset{u_2 \in \mathcal {A}_2}{{{\mathrm{ess \text { } inf}}}}\,J(t,u_1^*,u_2). \end{aligned}$$
(3.18)

On one hand (3.18) implies that

$$\begin{aligned} \underset{u_2 \in \mathcal {A}_2}{{{\mathrm{ess \text { } inf}}}}(\underset{u_1 \in \mathcal {A}_1}{{{\mathrm{ess \text { } sup}}}}\,J(t,u_1,u_2))&\le \underset{u_1 \in \mathcal {A}_1}{{{\mathrm{ess \text { } sup}}}}\,J(t,u_1,u_2^*)\\&=J(t,u_1^*,u_2^*)=\underset{u_2 \in \mathcal {A}_2}{{{\mathrm{ess \text { } inf}}}}\,J(t,u_1^*,u_2)\\&\le \underset{u_1 \in \mathcal {A}_1}{{{\mathrm{ess \text { } sup}}}}( \underset{u_2 \in \mathcal {A}_2}{{{\mathrm{ess \text { } inf}}}}\,J(t,u_1,u_2)). \end{aligned}$$

On the other hand we always have \({{\mathrm{ess \text { } inf}}}({{\mathrm{ess \text { } sup}}}) \ge {{\mathrm{ess \text { } sup}}}({{\mathrm{ess \text { } inf}}})\). Hence, if \((u_1^*,u_2^*)\) is a saddle point, then

$$\begin{aligned} \underset{u_2 \in \mathcal {A}_2}{{{\mathrm{ess \text { } inf}}}}(\underset{u_1 \in \mathcal {A}_1}{{{\mathrm{ess \text { } sup}}}}\,J(t,u_1,u_2))=\underset{u_1 \in \mathcal {A}_1}{{{\mathrm{ess \text { } sup}}}}( \underset{u_2 \in \mathcal {A}_2}{{{\mathrm{ess \text { } inf}}}}\,J(t,u_1,u_2)). \end{aligned}$$

The zero-sum Markov regime-switching Forward–backward stochastic differential games problem is therefore the following:

Problem 3.7

Find \(u_1^*\in \mathcal {A}_1\) and \(u_2^*\in \mathcal {A}_2\) (if they exist) such that

$$\begin{aligned} \underset{u_2 \in \mathcal {A}_2}{{{\mathrm{ess \text { } inf}}}}(\underset{u_1 \in \mathcal {A}_1}{{{\mathrm{ess \text { } sup}}}}\,J(t,u_1,u_2))=J(t,u_1^*,u_2^*)=\underset{u_1 \in \mathcal {A}_1}{{{\mathrm{ess \text { } sup}}}}( \underset{u_2 \in \mathcal {A}_2}{{{\mathrm{ess \text { } inf}}}}\,J(t,u_1,u_2)). \end{aligned}$$
(3.19)

When it exists, a control \((u_1^*,u_2^*)\) satisfying (3.19) is called a saddle point. The objectives of the players are opposite: there is a single payoff \(J(t,u_1, u_2)\), which is a reward for Player I and a cost for Player II.

Remark 3.8

As in the nonzero-sum case, we give the result for \(t=0\) and get the result for \(t\in ]0,T]\) as a corollary. The results obtained in this section generalize the ones in Øksendal and Sulem (2012), Bordigoni et al. (2005), Faidi et al. (2011), Jeanblanc et al. (2012) and Elliott and Siu (2011).

In the case of a zero-sum game, there is only one value function for the players and therefore Theorem 3.1 becomes

Theorem 3.9

(Sufficient maximum principle for Regime-switching FBSDE zero-sum games) Let \((\widehat{u}_1,\widehat{u}_2)\in \mathcal {A}_1\times \mathcal {A}_2 \) with corresponding solutions \(\widehat{X}(t),(\widehat{Y}(t), \widehat{Z}(t), \widehat{K}(t,\zeta ), \widehat{V}(t)), \widehat{A}(t),(\widehat{p}(t),\widehat{q}(t),\widehat{r}(t,\zeta ),\widehat{w}(t))\) of (2.5), (2.7), (3.2) and (3.3) respectively. Suppose that the following hold:

  1.

    For each \(e_n \in \mathbb {S}\), the functions

    $$\begin{aligned} x \mapsto \varphi (x, e_n) \text { and } y \mapsto \psi (y), \end{aligned}$$
    (3.20)

    are affine and \(x \mapsto h(x, e_n)\) is concave.

  2.

    The functions

    $$\begin{aligned}&\widetilde{\mathcal {H}}(t,x,e_n,y,z,k,v)\nonumber \\&\quad =\,\underset{\mu _1\in \mathcal {A}_1 }{{{\mathrm{ess \text { } sup}}}} E\Big [ H( t,x,e_n,y,z,k,v,\mu _1,\widehat{u}_2(t),\widehat{A}(t),\nonumber \\&\quad \quad \;\widehat{p}(t),\widehat{q}(t),\widehat{r}(t,\cdot ),\widehat{w}(t))\Big | \mathcal {E}^{(1)}_t\Big ] \end{aligned}$$
    (3.21)

    is concave for all \((t,e_n) \in [0,T]\times \mathbb {S}\) a.s. and

    $$\begin{aligned}&\widetilde{\mathcal {H}}(t,x,e_n,y,z,k,v)\nonumber \\&\quad =\underset{\mu _2\in \mathcal {A}_2 }{{{\mathrm{ess \text { } inf}}}} E\Big [ H( t,x,e_n,y,z,k,v,\widehat{u}_1(t),\mu _2,\widehat{A}(t),\nonumber \\&\quad \widehat{p}(t),\widehat{q}(t),\widehat{r}(t,\cdot ),\widehat{w}(t))\Big | \mathcal {E}^{(2)}_t\Big ] \end{aligned}$$
    (3.22)

    is convex for all \((t,e_n) \in [0,T]\times \mathbb {S}\) a.s.

  3.
    $$\begin{aligned} E\Big [\hat{H}(t,\widehat{u}_1(t),\widehat{u}_2(t))) \Big . \Big | \mathcal {E}^{(1)}_t\Big ]=\underset{\mu _1\in \mathcal {A}_1 }{{{\mathrm{ess \text { } sup}}}}&\Big \{E\Big [\hat{H}(t,\mu _1,\widehat{u}_2(t)) \Big . \Big | \mathcal {E}^{(1)}_t\Big ]\Big \} \end{aligned}$$
    (3.23)

    for all    \(t\in [0,T]\), a.s. and

    $$\begin{aligned} E\Big [ \hat{H}(t,\widehat{u}_1(t),\widehat{u}_2(t)) \Big . \Big | \mathcal {E}^{(2)}_t\Big ]=\underset{\mu _2\in \mathcal {A}_2 }{{{\mathrm{ess \text { } inf}}}}\Big \{E\Big [\hat{H}(t,\widehat{u}_1(t),\mu _2(t)) \Big . \Big | \mathcal {E}^{(2)}_t\Big ]\Big \} \end{aligned}$$
    (3.24)

    for all \(t\in [0,T]\), a.s. Here

    $$\begin{aligned}&\hat{H}(t,u_1(t),u_2(t))\\&\quad =\,H(t,\widehat{X}(t),\alpha (t),\widehat{Y}(t), \widehat{Z}(t), \widehat{K}(t,\cdot ),\widehat{V}(t),u_1(t),u_2(t),\widehat{A}(t),\widehat{p}(t),\\&\quad \quad \;\widehat{q}(t),\widehat{r}(t,\cdot ),\widehat{w}(t)) \end{aligned}$$
  4.

    \(\frac{\mathrm {d}}{\mathrm {d}\nu }\nabla _k\widehat{g}(t,\xi )>-1\).

  5.

    In addition, the integrability condition (3.9) is satisfied for \(\widehat{p}_i=\widehat{p}\), etc.

Then \(\widehat{u}=(\widehat{u}_1(t),\widehat{u}_2(t))\) is a saddle point for \(J(u_1, u_2)\).

The equivalent maximum principle (Theorem 3.5) is then reduced to

Theorem 3.10

(Equivalent maximum principle for zero-sum games) Let \(u=(u_1,u_2)\in \mathcal {A}_1\times \mathcal {A}_2\) with corresponding solutions X(t) of (2.5), \((Y(t),Z(t),K(t,\zeta ),V(t))\) of (2.7), A(t) of (3.2), \((p(t),q(t),r(t,\zeta ),w(t))\) of (3.3) and corresponding derivative processes \(X_i(t)\) and \((y_i(t),z_i(t),k_i(t,\zeta ),v_i(t))\), \(i=1,2\), given by (3.12) and (3.13), respectively. Assume that the conditions of Theorem 3.5 are satisfied. Then the following statements are equivalent:

  1.
    $$\begin{aligned} \dfrac{\mathrm {d}}{\mathrm {d}\ell }J^{(u_1+\ell \beta _1,u_2)}(t)\Big . \Big |_{\ell =0}=\dfrac{\mathrm {d}}{\mathrm {d}\ell }J^{(u_1,u_2+\ell \beta _2)}(t)\Big . \Big |_{\ell =0}=0 \end{aligned}$$
    (3.25)

    for all bounded    \(\beta _1\in \mathcal {A}_1,\,\,\,\beta _2\in \mathcal {A}_2\).

  2.
    $$\begin{aligned}&E\Big [\dfrac{\partial H}{\partial \mu _1} (t,\mu _1(t),u_2(t))\Big . \Big | \mathcal {E}_t^{(1)}\Big ]_{\mu _1=u_1(t)} =E\Big [\dfrac{\partial H}{\partial \mu _2} (t,u_1(t),\mu _2(t))\Big . \Big | \mathcal {E}_t^{(2)}\Big ]_{\mu _2=u_2(t)}\nonumber \\&\quad =0 \end{aligned}$$
    (3.26)

    for   a.a \( t\in [0,T]\), where

    $$\begin{aligned}&H(t,u_1(t),u_2(t))\\&=H(t,X(t),\alpha (t),u_1,u_2,Y(t), Z(t), K(t,\cdot ),V(t),A(t),p(t),\\&\quad q(t),r(t,\cdot ),w(t)). \end{aligned}$$

Proof

It follows directly from Theorem 3.5. \(\square \)

Corollary 3.11

If \(u=(u_1,u_2)\in \mathcal {A}_1\times \mathcal {A}_2\) is a Nash equilibrium for the zero-sum game in Theorem 3.10, then the equalities in (3.26) hold.

Proof

If \(u=(u_1,u_2)\in \mathcal {A}_1\times \mathcal {A}_2\) is a Nash equilibrium, then (3.25) holds by (3.18), and hence (3.26) follows from Theorem 3.10. \(\square \)

4 Applications

4.1 Application to robust utility maximization with entropy penalty

In this section, we apply the results obtained in Sect. 3 to study a utility maximization problem under model uncertainty. We assume that \(\mathcal {E}^{(1)}_t=\mathcal {E}^{(2)}_t=\mathcal {F}_t\). The framework is that of Bordigoni et al. (2005). For any probability measure Q on \((\Omega ,\mathcal {F}_T)\), let

$$\begin{aligned} H(Q|P):=\left\{ \begin{array}{ll} E_Q\left[ \ln \frac{\mathrm {d}Q}{\mathrm {d}P}\right] &{}\quad \text {if}\; Q\ll P\; \text {on}\; \mathcal {F}_T\\ +\infty &{}\quad \text {otherwise} \end{array} \right. \end{aligned}$$
(4.1)

be the relative entropy of Q with respect to P.
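For a concrete feel for (4.1), if \(\Omega \) is finite and Q, P have probability vectors \((q_i)\) and \((p_i)\) with \(Q\ll P\), then \(H(Q|P)=\sum _i q_i\ln (q_i/p_i)\). A minimal numerical check, with hypothetical vectors, is the following.

```python
import numpy as np

# Relative entropy (4.1) for two hypothetical discrete measures with Q << P:
# H(Q|P) = E_Q[ln dQ/dP] = sum_i q_i * ln(q_i / p_i).
p = np.array([0.5, 0.3, 0.2])
q = np.array([0.4, 0.4, 0.2])
H_QP = float(np.sum(q * np.log(q / p)))
print("H(Q|P) =", H_QP)   # nonnegative, and zero iff Q = P
```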

We aim at finding a probability measure \(Q\in \mathcal {Q}_{\mathcal {F}}\) that minimizes the functional

$$\begin{aligned} E_{Q}\Big [\int _0^T a_0 S^{\kappa }(s)U_1(s)\mathrm {d}s+\overline{a}_0 S^{\kappa } (T)U_2(T)\Big ]+E_{Q}\Big [\mathcal {R}^{\kappa }(0,T)\Big ], \end{aligned}$$
(4.2)

where

$$\begin{aligned} \mathcal {Q}_{\mathcal {F}}:=\Big \{Q \,\big |\, Q\ll P \text { on } \mathcal {F}_T ,\,\, Q=P \text { on } \mathcal {F}_0 \text { and } H(Q|P)<+\infty \Big \} , \end{aligned}$$

with \(a_0\) and \(\overline{a}_0\) being non-negative constants; \(\kappa =(\kappa (t))_{0\le t\le T}\) a non-negative, bounded and progressively measurable process; \(U_1=(U_1(t))_{0\le t\le T}\) a progressively measurable process with \(E_{P}\Big [\exp [\gamma _1\int _0^T|U_1(t)|\mathrm {d}t]\Big ]<\infty , \, \forall \gamma _1>0\); \(U_2(T)\) an \(\mathcal {F}_T\)-measurable random variable with \(E_{P}\Big [\exp [|\gamma _1 U_2(T)|]\Big ]<\infty , \, \forall \gamma _1>0\); \(S^{\kappa }(t)=\exp (-\int _0^t\kappa (s)\mathrm {d}s) \) the discount factor; and \(\mathcal {R}^{\kappa }(t,T)\) the penalization term, representing the sum of the entropy rate and the terminal entropy, i.e.

$$\begin{aligned} \mathcal {R}^{\kappa }(t,T)=\frac{1}{S^{\kappa }(t)}\int _t^T\kappa (s)S^{\kappa }(s)\ln \frac{G_0^Q(s)}{G_0^Q(t)}\mathrm {d}s+\frac{S^{\kappa }(T)}{S^{\kappa }(t)}\ln \frac{G^Q(T)}{G_0^Q(t)}, \end{aligned}$$
(4.3)

where \(G^Q=(G^Q(t))_{0\le t\le T}\) is the RCLL P-martingale representing the density of Q with respect to P, i.e.

$$\begin{aligned} G^Q(t)=\frac{\mathrm {d}Q}{\mathrm {d}P}\Big |_{\mathcal {F}_t}. \end{aligned}$$

\(G^Q(T)\) represents the Radon-Nikodym derivative of Q with respect to P on \(\mathcal {F}_T\). More precisely, the problem is the following:

Problem 4.1

Find \(Q^*\in \mathcal {Q}_{\mathcal {F}}\) such that

$$\begin{aligned} Y^{Q^*}(t)={{\mathrm{ess \text { } inf}}}_{Q\in \mathcal {Q}_{\mathcal {F}}} Y^Q(t) \end{aligned}$$
(4.4)

with

$$\begin{aligned} Y^Q(t)&:= \frac{1}{S^{\kappa }(t)}E_{Q}\Big [\int _t^Ta_0 S^{\kappa }(s)U_1(s)\mathrm {d}s+\overline{a}_0S^{\kappa } (T)U_2(T)\Big |\mathcal {F}_t\Big ]\nonumber \\&\quad +E_{Q}\Big [\mathcal {R}^{\kappa }(t,T)\Big |\mathcal {F}_t\Big ]. \end{aligned}$$
(4.5)

In the present regime-switching jump-diffusion setup, we consider model uncertainty given by a probability measure Q having a density \((G^{\theta }(t))_{0\le t\le T}\) with respect to P, whose dynamics is given by the following stochastic differential equation:

$$\begin{aligned} \left\{ \begin{array}{llll} \,\mathrm {d}G^{\theta }(t)= G^{\theta }(t^-)\Big [\theta _0(t)\mathrm {d}B(t)+\theta _1(t)\cdot \mathrm {d}\widetilde{\Phi }(t)+\displaystyle \int _{\mathbb {R}_0}\theta _2(t,\zeta )\,\widetilde{N}_\alpha (\mathrm {d}\zeta ,\mathrm {d}t)\Big ],\quad t \in [ 0,T] \\ G^{\theta }(0) = 1. \end{array}\right. \nonumber \\ \end{aligned}$$
(4.6)

Here \(\theta = (\theta _0, \theta _1, \theta _2)\) (with \(\theta _1=(\theta _{1,1},\theta _{1,2},\ldots , \theta _{1,D})\in \mathbb {R}^D\)) may be seen as a scenario control. Denote by \(\mathcal {A}\) the set of all admissible controls \(\theta =(\theta _0,\theta _1,\theta _2)\) such that

$$\begin{aligned} E\left[ \int _0^T\left( \theta ^2_0(t)+\sum _{j=1}^D\theta _{1,j}^2(t)\lambda _j(t)+\displaystyle \int _{\mathbb {R}_0}\theta _2^2(t,\zeta )\nu _\alpha (\mathrm {d}\zeta )\right) \mathrm {d}t\right] <\infty \end{aligned}$$

and \(\theta _2(t,\zeta )\ge -1+\epsilon \) for some \(\epsilon >0.\)

Using Itô’s formula (see Zhang et al. (2012, Theorem 4.1)), one can easily check that

$$\begin{aligned} G^{\theta }(t)= & {} \exp \Big [\int _0^t\theta _0(s)\mathrm {d}B(s)-\frac{1}{2}\int _0^t\theta ^2_0(s)\mathrm {d}s+\int _0^t\int _{\mathbb {R}_0}\ln (1+\theta _2(s,\zeta ))\widetilde{N}_\alpha (\mathrm {d}\zeta ,\mathrm {d}s)\nonumber \\&+\int _0^t\int _{\mathbb {R}_0}\{\ln (1+\theta _2(s,\zeta ))-\theta _2(s,\zeta )\}\nu _\alpha (\mathrm {d}\zeta )\mathrm {d}s\nonumber \\&+\sum _{j=1}^D \int _0^t\ln ( 1+\theta _{1,j}(s))\cdot \mathrm {d}\widetilde{\Phi }_j(s)\nonumber \\&+\sum _{j=1}^D \int _0^t\{\ln ( 1+\theta _{1,j}(s))-\theta _{1,j}(s)\}\lambda _j(s)\mathrm {d}s\Big ]. \end{aligned}$$
(4.7)
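As a quick numerical sanity check of the exponential-martingale form (4.7), consider the Brownian component only (\(\theta _1=\theta _2=0\)) with a constant \(\theta _0\); then \(G^\theta \) reduces to the usual Doléans-Dade exponential and should have unit expectation. The values below are hypothetical.

```python
import numpy as np

# Monte Carlo check that E[G^theta(T)] = 1 for the Brownian-only case of (4.7):
# G^theta(T) = exp( int theta_0 dB - 0.5 * int theta_0^2 ds ), constant theta_0.
rng = np.random.default_rng(2)
theta0, T, n, n_paths = 0.3, 1.0, 500, 20000
dt = T / n
dB = rng.normal(0.0, np.sqrt(dt), size=(n_paths, n))
G_T = np.exp(theta0 * dB.sum(axis=1) - 0.5 * theta0**2 * T)
print("E[G^theta(T)] (should be close to 1):", G_T.mean())
```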

Now put \(G^{\theta }(t,s)=\frac{G^{\theta }(s)}{G^{\theta }(t)},\, \, s\ge t\). Then \((G^{\theta }(t,s))_{0\le t\le s\le T}\) satisfies

$$\begin{aligned} \left\{ \begin{array}{llll} \mathrm {d}G^{\theta }(t,s) &{}=&{} G^{\theta }(t,s^-)\Big [\theta _0(s)\mathrm {d}B(s)+\theta _1(s)\cdot \mathrm {d}\widetilde{\Phi }(s)+\displaystyle \int _{\mathbb {R}_0}\theta _2(s,\zeta )\,\widetilde{N}_\alpha (\mathrm {d}s,\mathrm {d}\zeta )\Big ],\quad s \in [ t,T] \\ G^{\theta }(t,t) &{}=&{} 1. \end{array}\right. \nonumber \\ \end{aligned}$$
(4.8)

Hence (4.5) can be rewritten as

$$\begin{aligned} Y^Q(t)&=E_{Q}\left[ \int _t^Ta_0e^{-\int _t^s\kappa (r)\mathrm {d}r}U_1(s)\mathrm {d}s+\overline{a}_0e^{-\int _t^T\kappa (r)\mathrm {d}r}U_2(T)\Big |\mathcal {F}_t\right] \nonumber \\&\quad +\,E_{Q}\left[ \int _t^T\kappa (s)e^{-\int _t^s\kappa (r)\mathrm {d}r}\ln G^{\theta }(t,s)ds+e^{-\int _t^T\kappa (r)\mathrm {d}r}\ln G^{\theta }(t,T)\Big |\mathcal {F}_t\right] \nonumber \\&=E\left[ \int _t^T a_0 G^{\theta }(t,s)e^{-\int _t^s\kappa (r)\mathrm {d}r}U_1(s)\mathrm {d}s+\overline{a}_0 G^{\theta }(t,T)e^{-\int _t^T\kappa (r)\mathrm {d}r}U_2(T)\Big |\mathcal {F}_t\right] \nonumber \\&\quad +\,E\left[ \int _t^T\kappa (s)e^{-\int _t^s\kappa (r)\mathrm {d}r}G^{\theta }(t,s)\ln G^{\theta }(t,s)\mathrm {d}s\right. \nonumber \\&\quad \left. +\,e^{-\int _t^T\kappa (r)\mathrm {d}r}G^{\theta }(t,T)\ln G^{\theta }(t,T)\Big |\mathcal {F}_t\right] . \end{aligned}$$
(4.9)

Now, define \(h_1\) by

$$\begin{aligned} h_1(\theta (t))&:= \frac{1}{2} \theta _0^2(t)+\sum _{j=1}^D\{(1+\theta _{1,j}(t))\ln (1+\theta _{1,j}(t))-\theta _{1,j}(t)\}\lambda _j(t)\nonumber \\&\quad +\,\int _{\mathbb {R}_0}\{(1+\theta _2(t,\zeta ))\ln (1+\theta _2(t,\zeta ))-\theta _2(t,\zeta )\}\nu _{\alpha }(\mathrm {d}\zeta ). \end{aligned}$$
(4.10)

Using the Itô-Lévy product rule, we have

$$\begin{aligned}&E\left[ \int _t^T\kappa (s)e^{-\int _t^s\kappa (r)\mathrm {d}r}G^{\theta }(t,s)\ln G^{\theta }(t,s)\mathrm {d}s + e^{-\int _t^T\kappa (r)\mathrm {d}r}G^{\theta }(t,T)\ln G^{\theta }(t,T)\Big |\mathcal {F}_t\right] \nonumber \\&\quad = E\left[ \int _t^Te^{-\int _t^s\kappa (r)\mathrm {d}r}G^{\theta }(t,s)h(\theta (s))ds\Big |\mathcal {F}_t\right] . \end{aligned}$$
(4.11)
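To see where (4.11) comes from, it may help to carry out the computation in the simplest case: only the Brownian component (\(\theta _1=\theta _2=0\)) and no discounting (\(\kappa \equiv 0\)); this is a special case of the general argument, not a substitute for it. From (4.8), \(\mathrm {d}G^{\theta }(t,s)=G^{\theta }(t,s)\theta _0(s)\mathrm {d}B(s)\), and Itô's formula applied to \(G\ln G\) gives

$$\begin{aligned} \mathrm {d}\big (G^{\theta }(t,s)\ln G^{\theta }(t,s)\big )=\big (\ln G^{\theta }(t,s)+1\big )G^{\theta }(t,s)\theta _0(s)\,\mathrm {d}B(s)+\tfrac{1}{2}G^{\theta }(t,s)\theta _0^2(s)\,\mathrm {d}s. \end{aligned}$$

Taking conditional expectations, the stochastic integral vanishes and \(E\big [G^{\theta }(t,T)\ln G^{\theta }(t,T)\,\big |\,\mathcal {F}_t\big ]=E\big [\int _t^T G^{\theta }(t,s)\tfrac{1}{2}\theta _0^2(s)\mathrm {d}s\,\big |\,\mathcal {F}_t\big ]\), which is (4.11) in this special case, since the penalty rate of (4.10) reduces to \(h_1(\theta )=\tfrac{1}{2}\theta _0^2\) when \(\theta _1=\theta _2=0\).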

Substituting (4.11) into (4.9) leads to

$$\begin{aligned} Y^Q(t)&=E_t\left[ \int _t^T a_0 G^{\theta }(t,s)e^{-\int _t^s\kappa (r)\mathrm {d}r}U_1(s)\mathrm {d}s+\overline{a}_0 G^{\theta }(t,T)e^{-\int _t^T\kappa (r)\mathrm {d}r}U_2(T)\right] \nonumber \\&\quad +\,E_t\Big [\int _t^T\kappa (s)e^{-\int _t^s\kappa (r)\mathrm {d}r}G^{\theta }(t,s)\ln G^{\theta }(t,s)\mathrm {d}s\nonumber \\&\quad +\,e^{-\int _t^T\kappa (r)\mathrm {d}r}G^{\theta }(t,T)\ln G^{\theta }(t,T)\Big ]\nonumber \\&=\,E_t\left[ \int _t^Te^{-\int _t^s\kappa (r)\mathrm {d}r}G^{\theta }(t,s)\left( a_0 U_1(s)+h(\theta (s))\right) \mathrm {d}s\right. \nonumber \\&\quad \left. +\,\overline{a}_0 G^{\theta }(t,T)e^{-\int _t^T\kappa (r)\mathrm {d}r}U_2(T)\right] . \end{aligned}$$
(4.12)

Here \(E_t\) denotes the conditional expectation with respect to \(\mathcal {F}_t\).

We have the following theorem.

Theorem 4.2

Suppose that the penalty function is given by (4.10). Then the optimal \(Y^{Q^*}\) is such that \((Y^{Q^*},Z,W,K)\) is the unique solution to the following quadratic BSDE

$$\begin{aligned} \left\{ \begin{array}{llll} &{}\,\mathrm {d}Y(t) = -\left[ -\kappa (t)Y(t)+a U_1(t)-Z^2(t)+\sum _{j=1}^D\lambda _j(t)(-e^{W_j}-W_j+1)\right. \\ &{}\qquad \qquad \quad \left. +\,\displaystyle \int _{\mathbb {R}_0}(-e^{-K(t,\zeta )}-K(t,\zeta )+1)\nu _{\alpha }\mathrm {d}\zeta \right. ] \mathrm {d}t +Z(t) \mathrm {d}B(t)\\ &{}\qquad \qquad \quad +\,\sum _{j=1}^D W_j(t) \mathrm {d}\widetilde{\Phi }_j(t)+\displaystyle \int _{\mathbb {R}_0}K(t,\zeta )\widetilde{N}_{\alpha }(\mathrm {d}t,\mathrm {d}\zeta )\\ &{}Y(T) = \overline{a}_0U_2(T). \end{array}\right. \end{aligned}$$
(4.13)

Moreover, the optimal measure \(Q^*\), solution of Problem 4.1, admits the Radon-Nikodym density \((G^{Q^*}(t,s))_{0\le t\le s\le T}\) given by

$$\begin{aligned} \left\{ \begin{array}{llll} \mathrm {d}G^{\theta }(t,s) = G^{\theta }(t,s^-)\left[ -Z(s)\mathrm {d}B(s)+\sum _{j=1}^D(e^{-W_j}-1)\cdot \mathrm {d}\widetilde{\Phi }_j(s)\right. \\ \quad \left. +\displaystyle \int _{\mathbb {R}_0}(e^{-K(s,\zeta )}-1)\,\widetilde{N}_\alpha (\mathrm {d}s,\mathrm {d}\zeta )\right] ,\quad s \in [ t,T] \\ G^{\theta }(t,t) = 1. \end{array}\right. \end{aligned}$$
(4.14)

Proof

Fix \(u_1\) and denote by X(T) the corresponding terminal wealth. One can see that Problem 4.1 can be obtained from our general control problem by setting \(X(t)=0,\, \forall t\in [0,T]\), \(h(X(T),\alpha (T))=\overline{a}_0U_2(T)\), \(f=0\), \(\varphi (x)=0\) and \(\psi \) equal to the identity function. Since \(h_1(\theta )\) given by (4.10) is convex in \(\theta _0, \theta _1\) and \(\theta _2\), it follows that the conditions of Theorem 3.1 are satisfied. The Hamiltonian in this case reduces to:

$$\begin{aligned} H(t,y,z,K,W)= & {} \lambda (U_1(t)+h(\theta )+\theta _0 z)+\,\sum _{j=1}^D\lambda _j\theta _{1,j}W_j\nonumber \\&+\displaystyle \int _{\mathbb {R}_0}\theta _2(\cdot ,\zeta )K(\cdot ,\zeta ) \nu _{\alpha }(d \zeta ). \end{aligned}$$
(4.15)

Minimizing H with respect to \(\theta =(\theta _0,\theta _1,\theta _2)\) gives the following first-order conditions of optimality for an optimal \(\theta ^*\):

$$\begin{aligned} \left\{ \begin{array}{llll} \frac{\partial H}{\partial \theta _0}=0 \quad \text {i.e.,}\; \theta _0^{*}(t)=-Z(t),\\ \frac{\partial H}{\partial \theta _{1,j}}=0 \quad \text {i.e.,}\;-\ln (1+\theta _{1,j}^*)(t)=-W_{1,j}(t)\quad \text { for }\; j=1,\ldots ,D,\\ \nabla _{\theta _2}H=0 \quad \text {i.e., }\;-\ln (1+\theta _2^*)(t,\zeta )=-K(\cdot ,\zeta ),\quad \nu _{\alpha }\text {- a.e.} \end{array}\right. \end{aligned}$$
(4.16)
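For instance, for the \(\theta _0\)-component, \(\partial h_1/\partial \theta _0=\theta _0\) by (4.10), so differentiating (4.15) gives, up to the multiplicative factor in front of the generator term,

$$\begin{aligned} \frac{\partial H}{\partial \theta _0}\propto \theta _0(t)+Z(t)=0\quad \Longrightarrow \quad \theta _0^{*}(t)=-Z(t), \end{aligned}$$

in agreement with the first condition in (4.16); the components \(\theta _{1,j}\) and \(\theta _2\) are handled in the same way.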

On the other hand, one can show using the product rule (see e.g., Menoukeu-Pamen 2015) that Y given by (4.12) is a solution to the following linear BSDE:

$$\begin{aligned} \left\{ \begin{array}{llll} \,\mathrm {d}Y(t) =&{}&{} -\Big [ -\kappa (t)Y(t)+a U_1(t)+ h(\theta )+\theta _0Z(t)+\sum _{j=1}^D\theta _{1,j}(t)\lambda _j(t){W_j}\\ &{}&{} +\displaystyle \int _{\mathbb {R}_0}\theta _2(t,\zeta )K(t,\zeta )\nu _{\alpha }\mathrm {d}\zeta \Big ]\mathrm {d}t+Z(t) dB(t)+ W(t)\cdot \mathrm {d}\widetilde{\Phi }(t)\\ &{}&{}+\displaystyle \int _{\mathbb {R}_0}K(t,\zeta )\widetilde{N}_{\alpha }(\mathrm {d}t,\mathrm {d}\zeta )\\ Y(T) =&{}&{} \overline{a}_0U_2(T). \end{array}\right. \end{aligned}$$
(4.17)

Using the comparison theorem for BSDEs, \(Q^*\) is an optimal measure for Problem 4.1 if \(\theta ^*\) is such that

$$\begin{aligned} g(\theta ^*)=\underset{\theta }{\min }\,g(\theta ) \end{aligned}$$
(4.18)

for each t and \(\omega \), with \(g(\theta ):=h(\theta )+\theta _0Z(t)+\sum _{j=1}^D\theta _{1,j}(t)\lambda _j(t){W_j}+\displaystyle \int _{\mathbb {R}_0}\theta _2(t,\zeta )K(t,\zeta )\nu _{\alpha }(\mathrm {d}\zeta )\). This is equivalent to the first-order conditions of optimality. Hence \((\theta ^*_0,\theta ^*_{1,1},\ldots ,\theta ^*_{1,D},\theta ^*_2)\) satisfying (4.16) will satisfy (4.18). Substituting \(\theta ^*_0,\theta ^*_{1,1},\ldots ,\theta ^*_{1,D},\theta ^*_2\) into (4.17) leads to (4.13). Furthermore, substituting \(\theta ^*_0,\theta ^*_{1,1},\ldots ,\theta ^*_{1,D},\theta ^*_2\) into (4.8) gives (4.14). The proof of the theorem is complete. \(\square \)

Remark 4.3

  • This result can be seen as an extension to the Markov regime-switching setting of Jeanblanc et al. (2012, Theorem 1) or Bordigoni et al. (2005, Theorem 2).

  • Let us mention that in the case where \((X(t))_{0\le t\le T}\) is not zero and has particular dynamics (mean-reverting or exponential Markov Lévy switching), one can use Theorem 3.1 to solve a problem of recursive robust utility maximization as in Øksendal and Sulem (2012, Section 4.2) or Menoukeu-Pamen (2015, Theorem 4.1).

4.2 Application to optimal investment of an insurance company under model uncertainty

In this section, we use our general framework to study a problem of optimal investment of an insurance company under model uncertainty. The uncertainty here is also described by a family of probability measures. Such a problem was solved in Elliott and Siu (2011) using a dynamic programming approach when the interest rate is 0. We show that the general maximum principle enables us to find the explicit optimal investment when \(r\ne 0\). We restrict ourselves to the case \(\mathcal {E}_t^{(1)}=\mathcal {E}_t^{(2)}=\mathcal {F}_t,\,\,\,t\in [0,T]\), in order to obtain explicit results. Let us mention that if \(\mathcal {E}_t^{(i)}\) is strictly contained in \(\mathcal {F}_t,\,\,\,t\in [0,T], i=1,2\), then the problem is non-Markovian and hence the dynamic programming approach used in Elliott and Siu (2011) cannot be applied.

The model is that of Elliott and Siu (2011, Section 2.1). Let \((\Omega , \mathcal {F},P) \) be a complete probability space, with P representing a reference probability measure from which a family of real-world probability measures is generated. We shall suppose that \((\Omega , \mathcal {F},P) \) is rich enough to take into account uncertainties coming from future insurance claims, fluctuations of financial prices and structural changes in economic conditions. We consider a continuous-time Markov regime-switching economic model with a bond and a stock or share index.

The evolution of the state of an economy over time is modeled by a continuous-time, finite-state, observable Markov chain \(\alpha :=\{\alpha (t),t\in [0,T];\, T<\infty \}\) on \((\Omega , \mathcal {F},P)\), taking values in the state space \(\mathbb {S}=\{e_1,e_2,\ldots ,e_D\}\), where \(D\ge 2\). We denote by \(\Lambda :=\{\lambda _{nj}:1\le n,j\le D\}\) the intensity matrix of the Markov chain under P. Hence, for each \(1\le n,j\le D,\,\,\lambda _{nj}\) is the transition intensity of the chain from state \(e_n\) to state \(e_j\) at time t. It is assumed that for \(n\ne j,\,\,\lambda _{nj}> 0\) and \(\sum _{j=1}^D \lambda _{nj}=0\), hence \(\lambda _{nn}< 0\). The dynamics of \((\alpha (t))_{0\le t\le T}\) is given in Sect. 2.

Let \(r=\{r(t)\}_{t\in [0, T]}\) be the instantaneous interest rate of the money market account \(S_0\) at time t. Then

$$\begin{aligned} r(t):=\langle \underline{r},\alpha (t)\rangle =\sum _{j=1}^D r_j\langle \alpha (t), e_j\rangle \;, \end{aligned}$$
(4.19)

where \(\langle \cdot ,\cdot \rangle \) is the usual scalar product in \(\mathbb {R}^D\) and \(\underline{r}=(r_1,\dots ,r_D)\in \mathbb R^D_+\). Here the value \(r_j\), the \(j^{th}\) entry of the vector \(\underline{r}\), represents the value of the interest rate when the Markov chain is in the state \(e_j\), i.e., when \(\alpha (t)=e_j\). The price dynamics of B can now be written as

$$\begin{aligned} \mathrm {d}S_0(t)=S_0(t)r(t)\mathrm {d}t,\, S_0(0)=1, \quad t\in [0,T]. \end{aligned}$$
(4.20)

Moreover, let \(\mu =\{\mu (t)\}_{t\in [0, T]}\) and \(\sigma =\{\sigma (t)\}_{t\in [0, T]}\) denote respectively the mean return and the volatility of the stock at time t. Using the same convention, we have

$$\begin{aligned} \mu (t)=&\langle \underline{\mu },\alpha (t)\rangle =\sum _{j=1}^D\mu _j\langle \alpha (t),e_j\rangle \;,\\ \sigma (t)=&\langle \underline{\sigma },\alpha (t)\rangle =\sum _{j=1}^D\sigma _j\langle \alpha (t),e_j \rangle \;, \end{aligned}$$

where

$$\begin{aligned} \underline{\mu }=(\mu _1,\mu _2,\ldots ,\mu _D)\in \mathbb {R}^D, \end{aligned}$$

and

$$\begin{aligned} \underline{\sigma }=(\sigma _1,\sigma _2,\ldots ,\sigma _D)\in \mathbb {R}_+^D. \end{aligned}$$

In a similar way, \(\mu _j\) and \(\sigma _j\) represent respectively the appreciation rate and the volatility of the stock when the Markov chain is in state \(e_j\), i.e., when \(\alpha (t)=e_j\). Let \(B=\{B(t)\}_{t\in [0, T]}\) denote a standard Brownian motion on \((\Omega ,\mathcal {F},P)\) with respect to its right-continuous complete filtration \(\mathcal {F}^B:=\{\mathcal {F}^B_t\}_{0\le t\le T}\). Then the dynamics of the stock price \(S=\{S(t)\}_{t\in [0, T]}\) are given by the following Markov regime-switching geometric Brownian motion

$$\begin{aligned} \mathrm {d}S(t)=S(t)\left[ \mu (t)\mathrm {d}t+\sigma (t)\mathrm {d}B(t)\right] , \quad S(0)=S_0. \end{aligned}$$
(4.21)
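
The regime dependence of the market coefficients enters only through the inner product with the state indicator \(\alpha (t)\). The following minimal sketch (in which the two-state parameter values are hypothetical and purely illustrative) shows this selection mechanism in code:

```python
import numpy as np

# Hypothetical two-state example (D = 2); the numbers are illustrative only.
r_bar     = np.array([0.02, 0.04])   # interest rates r_1, r_2
mu_bar    = np.array([0.06, 0.10])   # appreciation rates mu_1, mu_2
sigma_bar = np.array([0.15, 0.25])   # volatilities sigma_1, sigma_2

def coefficient(values, alpha_t):
    """Evaluate <values, alpha(t)>, where alpha_t is the unit vector e_j of the current regime."""
    return float(values @ alpha_t)

alpha_t = np.array([0.0, 1.0])               # chain currently in state e_2
r_t     = coefficient(r_bar, alpha_t)        # = r_2 = 0.04
mu_t    = coefficient(mu_bar, alpha_t)       # = mu_2 = 0.10
sigma_t = coefficient(sigma_bar, alpha_t)    # = sigma_2 = 0.25
```

The same pattern applies to every regime-modulated quantity below, such as \(\lambda ^0(t)\) and \(P_0(t)\).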

Let \(Z_0:=\{Z_0(t)\}_{t\in [0, T]}\) be a real-valued Markov regime-switching pure jump process on \((\Omega ,\mathcal {F},P)\). Here \(Z_0(t)\) can be considered as the aggregate amount of claims up to and including time t. Since \(Z_0\) is a pure jump process, one has

$$\begin{aligned} Z_0(t)=\sum _{0<u\le t}\Delta Z_0(u),\,\,Z_0(0)=0,\,\, P\text {-a.s, } t\in [0,T], \end{aligned}$$
(4.22)

where, for each \(u\in [0,T]\), \(\Delta Z_0(u)=Z_0(u)-Z_0(u^-)\) represents the jump size of \(Z_0\) at time u.

Assume that the state space of claim sizes, denoted by \(\mathcal {Z}\), is \((0,\infty )\). Let \(\mathcal {M}\) be the product space \([0,T]\times \mathcal {Z}\) of claim arrival times and claim sizes. Define a random measure \(N^0(\cdot ,\cdot )\) on the product space \(\mathcal {M}\) which selects the claim arrival times u and the claim sizes \(\zeta :=Z_0(u)-Z_0(u^-)\). Then the aggregate insurance claim process \(Z_0\) can be written as

$$\begin{aligned} Z_0(t)=\int _0^t\int _0^{\infty } \zeta N^0(\mathrm {d}u,\mathrm {d}\zeta ),\,\,\, t\in [0,T]. \end{aligned}$$
(4.23)

Define, for each \(t\in [0,T]\)

$$\begin{aligned} N_{\Lambda ^0}(t)=\int _0^t\int _0^{\infty } N^0(\mathrm {d}u,\mathrm {d}\zeta ),\,\,\,t\in [0,T]. \end{aligned}$$
(4.24)

Then \(N_{\Lambda ^0}(t)\) counts the number of claim arrivals up to time t. Assume that, under the measure P, \(N_{\Lambda ^0}:=\{N_{\Lambda ^0}(t)\}_{t\in [0, T]}\) is a conditional Poisson process on \((\Omega ,\mathcal {F},P)\) with intensity \(\Lambda ^0:=\{\lambda ^0(t)\}_{t\in [0, T]}\) modulated by the chain \(\alpha \) given by

$$\begin{aligned} \lambda ^0(t):=\langle \underline{\lambda }^0,\alpha (t)\rangle =\sum _{j=1}^D \lambda _j^0\langle \alpha (t), e_j\rangle \;, \end{aligned}$$
(4.25)

with \(\underline{\lambda }^0=(\lambda _1^0,\ldots ,\lambda ^0_D)\in \mathbb R^D_+\). Here the value \(\lambda _j^0\), the \(j^{th}\) entry of the vector \(\underline{\lambda }^0\), represents the intensity rate of \(N_{\Lambda ^0}\) when the Markov chain is in the state \(e_j\), i.e., when \(\alpha (t^-)=e_j\). Denote by \(F_j(\zeta ),\,j=1,\ldots ,D\), the probability distribution of the claim size \(\zeta :=Z_0(u)-Z_0(u^-)\) when \(\alpha (u^-)=e_j\). Then the compensator of the Markov regime-switching random measure \(N^0(\cdot ,\cdot )\) under P is given by

$$\begin{aligned} \nu _{\alpha }^0(\mathrm {d}\zeta )\mathrm {d}u:=\sum _{j=1}^D\langle \alpha (u^-),e_j\rangle \lambda _j^0F_j(\mathrm {d}\zeta )\mathrm {d}u. \end{aligned}$$
(4.26)

Hence a compensated version \(\widetilde{N}^0_{\alpha }(\cdot ,\cdot )\) of the Markov regime-switching random measure is defined by

$$\begin{aligned} \widetilde{N}^0_{\alpha }(\mathrm {d}u,\mathrm {d}\zeta )=N^0(\mathrm {d}u,\mathrm {d}\zeta )-\nu ^0_{\alpha }(\mathrm {d}\zeta )\mathrm {d}u. \end{aligned}$$
(4.27)

The premium rate \(P_0(t)\) at time t is given by

$$\begin{aligned} P_0(t):=\langle \underline{P_0},\alpha (t)\rangle =\sum _{j=1}^D P_{0,j}\langle \alpha (t), e_j\rangle , \end{aligned}$$
(4.28)

with \(\underline{P_0}=(P_{0,1},\ldots ,P_{0,D})\in \mathbb R^D_+\). Let \(R_0:=\{R_0(t)\}_{t\in [0, T]}\) be the surplus process of the insurance company without investment. Then

$$\begin{aligned} R_0(t)&:=\,r_0+\int _0^tP_0(u)\mathrm {d}u -Z_0(t)\nonumber \\&=r_0+\sum _{j=1}^D P_{0,j}\mathcal {J}_j(t)-\int _0^t\int _0^{\infty } \zeta N^0(\mathrm {d}u,\mathrm {d}\zeta ),\,\,\, t\in [0,T], \end{aligned}$$
(4.29)

with \(R_0(0)=r_0\). For each \(j=1,\ldots ,D\) and each \(t\in [0,T]\), \(\mathcal {J}_j(t)\) is the occupation time of the chain \(\alpha \) in the state \(e_j\) up to time t, that is

$$\begin{aligned} \mathcal {J}_j(t)=\int _0^t\langle \alpha (u), e_j\rangle \mathrm {d}u. \end{aligned}$$
(4.30)

The following information structure will be important for the derivation of the dynamics of the company’s surplus process. Let \(\mathcal {F}^{Z_0}:=\{\mathcal {F}^{Z_0}_t\}_{0\le t\le T}\) denote the right-continuous P-completed filtration generated by \(Z_0\). For each \(t\in [0,T]\), define \(\mathcal {F}_t:=\mathcal {F}^{Z_0}_t\vee \mathcal {F}_t^{B}\vee \mathcal {F}_t^{\alpha }\) as the minimal \(\sigma \)-algebra generated by \(\mathcal {F}^{Z_0}_t\), \(\mathcal {F}_t^{B}\) and \(\mathcal {F}_t^{\alpha }\), and write \(\mathbb {F}=\{\mathcal {F}_t\}_{0\le t\le T}\) for the information accessible to the company.

From now on, we assume that the insurance company invests the amount \(\pi (t)\) in the stock at time t, for each \(t\in [0,T]\). Then \(\pi =\{\pi (t),t\in [0,T]\}\) represents the portfolio process. Denote by \(X=\{X^{\pi }(t)\}_{t\in [0, T]}\) the wealth process of the company. One can show that the dynamics of the surplus process are given by

$$\begin{aligned} \left\{ \begin{aligned} \,\mathrm {d}X(t)&= \Big \{ P_0(t)+r(t)X(t)+ \pi (t)(\mu (t)-r(t))\Big \}\mathrm {d}t+\sigma (t)\pi (t)\mathrm {d}B(t)\\&\quad -\displaystyle \int _0^{\infty } \zeta N^0(\mathrm {d}t,\mathrm {d}\zeta )\\&= \Big \{ P_0(t)+r(t)X(t)+ \pi (t)(\mu (t)-r(t))-\displaystyle \int _0^{\infty }\zeta \nu ^0_{\alpha }(\,\mathrm {d}\zeta )\Big \}\mathrm {d}t\\&\quad +\sigma (t)\pi (t)\mathrm {d}B(t)-\displaystyle \int _0^{\infty }\zeta \widetilde{N}_{\alpha }^0(\mathrm {d}t,\mathrm {d}\zeta ),\,\,t\in [0,T],\\ X(0)&= X_0. \end{aligned}\right. \end{aligned}$$
(4.31)
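
To fix ideas, here is a minimal Euler-type simulation sketch of the surplus dynamics (4.31) for a single frozen regime and a constant investment amount; all numerical values are hypothetical, and exponential claim sizes are used purely for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical parameters for one fixed regime (illustrative only).
T, n_steps = 1.0, 1000
dt = T / n_steps
P0, r, mu, sigma = 2.0, 0.03, 0.08, 0.20   # premium rate, interest rate, stock drift, volatility
lam0, claim_mean = 5.0, 0.5                # claim arrival intensity and mean claim size
pi, x0 = 1.5, 10.0                         # constant invested amount and initial surplus

X = np.empty(n_steps + 1)
X[0] = x0
for k in range(n_steps):
    dB = rng.normal(0.0, np.sqrt(dt))                        # Brownian increment
    n_claims = rng.poisson(lam0 * dt)                        # claims arriving in (t, t + dt]
    claims = rng.exponential(claim_mean, size=n_claims).sum()
    # Euler step of dX = {P_0 + r X + pi (mu - r)} dt + sigma pi dB - (claims over dt)
    X[k + 1] = X[k] + (P0 + r * X[k] + pi * (mu - r)) * dt + sigma * pi * dB - claims
```

In the full model the coefficients switch with \(\alpha (t)\) and \(\pi \) is a control; the sketch only illustrates the structure of (4.31).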

Definition 4.4

A portfolio \(\pi \) is admissible if it satisfies

  1. \(\pi \) is \(\mathbb {F}\)-progressively measurable;

  2. (4.31) admits a unique strong solution;

  3. \(\sum _{j=1}^D E\Big [\int _0^T\Big \{|P_{0,j}+r_jX(t)+\pi (t)(\mu _j-r_j)|+\sigma ^2_j\pi ^2(t)+\lambda _j^0\int _0^{\infty }\zeta ^2F_j(\,\mathrm {d}\zeta )\Big \}\mathrm {d}t\Big ]<\infty \);

  4. \(X(t)\ge 0,\,\,\forall t\in [0,T]\), P-a.s.

We denote by \(\mathcal {A}\) the space of all admissible portfolios.

Note that although condition (4) is strong, it is intuitively natural to consider only nonnegative wealth for the insurance company. Define \(\mathbb {G}:=\{\mathcal {G}_t, t\in [0,T]\}\), where \(\mathcal {G}_t:=\mathcal {F}_t^B\vee \mathcal {F}_t^{Z_0}\), and for \(n,j=1,\ldots ,D\), let \(\{C_{nj}(t),t\in [0,T]\}\) be a real-valued, \(\mathbb {G}\)-predictable, bounded, stochastic process on \((\Omega ,\mathcal {F},P)\) such that, for each \(t\in [0,T]\), \(C_{nj}(t)\ge 0\) for \(n\ne j\) and \(\sum _{n=1}^D C_{nj}(t)=0\), i.e., \(C_{nn}(t)\le 0\).

We consider a model uncertainty setup given by a probability measure \(Q=Q^{\theta ,\mathbf {C}}\) which is equivalent to P, with Radon-Nikodym derivative on \(\mathcal {F}_t\) given by

$$\begin{aligned} \frac{\mathrm {d}Q}{\mathrm {d}P}\Big |_{\mathcal {F}_t}=G^{\theta ,C}(t), \end{aligned}$$
(4.32)

where \(G^{\theta ,\mathbf {C}}=\{G^{\theta ,\mathbf {C}}(t)\}_{0\le t\le T}\) is an \(\mathbb {F}\)-martingale. Under \(Q^{\theta ,\mathbf {C}}\), \(\mathbf {C}:=\{\mathbf {C}(t),t\in [0,T]\}\) with \(\mathbf {C}(t):=[C_{nj}(t)]_{n,j=1,\ldots ,D}\) is a family of rate matrices of the Markov chain \(\alpha (t)\); see for example Dufour and Elliott (1999). For each \(t\in [0,T]\), we set

$$\begin{aligned} \mathbf {D}_0^{\mathbf {C}}(t):=\mathbf {D}^\mathbf {C}(t) -\mathbf {diag}(\mathbf {d}^C(t)), \end{aligned}$$

with \(\mathbf {d}^C(t)=(d^C_{11},\ldots ,d^C_{DD})^\prime \in \mathbb {R}^D\) and

$$\begin{aligned} \mathbf {D}^C:=\Big [\frac{C_{nj}(t)}{\lambda _{nj}(t)}\Big ]_{n,j=1,\cdots ,D}=[d^{\mathbf {C}} _{nj}(t)]. \end{aligned}$$
(4.33)
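
For instance, when \(D=2\), (4.33) and the definition above give

$$\begin{aligned} \mathbf {D}^{\mathbf {C}}(t)=\begin{pmatrix} C_{11}(t)/\lambda _{11} & C_{12}(t)/\lambda _{12}\\ C_{21}(t)/\lambda _{21} & C_{22}(t)/\lambda _{22} \end{pmatrix},\qquad \mathbf {D}_0^{\mathbf {C}}(t)=\begin{pmatrix} 0 & C_{12}(t)/\lambda _{12}\\ C_{21}(t)/\lambda _{21} & 0 \end{pmatrix}, \end{aligned}$$

that is, \(\mathbf {D}_0^{\mathbf {C}}(t)\) is \(\mathbf {D}^{\mathbf {C}}(t)\) with its diagonal entries set to zero.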

We denote by \(\mathcal {C}\) the space of all families of rate matrices \(\mathbf {C}\) with bounded components.

The Radon-Nikodym derivative or density process \(G^{\theta ,\mathbf {C}}\) is given by

$$\begin{aligned} \left\{ \begin{aligned} \,\mathrm {d}G^{\theta ,\mathbf {C}}(t)&= G^{\theta ,\mathbf {C}}(t^-)\Big \{ \theta (t)\mathrm {d}B(t)+\displaystyle \int _0^{\infty }\theta (t)\widetilde{N}^0_{\alpha }(\mathrm {d}t,\mathrm {d}\zeta )\\&\quad +(\mathbf {D}^{\mathbf {C}}_0(t)\alpha (t)-\mathbf {1})^\prime \cdot \,\mathrm {d}\widetilde{\Phi }(t)\Big \},\,\,\,t\in [0,T],\\ G^{\theta ,\mathbf {C}}(0)&= 1, \end{aligned}\right. \end{aligned}$$
(4.34)

where \(^\prime \) denotes the transpose. Here \((\theta , \mathbf {C})\) may be regarded as a scenario control. A control \(\theta \) is admissible if \(\theta \) is \(\mathbb {F}\)-progressively measurable, with \(\theta (t)=\theta (t,\omega )\le 1\) for a.a. \((t,\omega )\in [0,T]\times \Omega \), and \(\int _0^T\theta ^2(t)\mathrm {d}t<\infty .\) We denote by \(\Theta \) the space of such admissible processes.

Next, we formulate the optimal investment problem under model uncertainty. Let \(U:(0,\infty )\longrightarrow \mathbb {R}\) be a utility function which is strictly increasing, strictly concave and twice continuously differentiable. The objectives of the insurance firm and the market are the following:

Problem 4.5

Find a portfolio process \(\pi ^*\in \mathcal {A}\) and the process \((\theta ^*, \mathbf {C}^*)\in \Theta \times \mathcal {C}\) such that

$$\begin{aligned} \underset{\pi \in \mathcal {A}}{\sup }\,\,\underset{(\theta ,\mathbf {C}) \in \Theta \times \mathcal {C}}{\inf }\,E_{Q^{\theta ,\mathbf {C}}}\Big [U^{\pi }(X_T)\Big ]=&E_{Q^{\theta ^*,\mathbf {C}^*}} \Big [U^{\pi ^*}(X_T)\Big ]\nonumber \\ =&\underset{(\theta ,\mathbf {C}) \in \Theta \times \mathcal {C}}{\inf }\,\,\underset{\pi \in \mathcal {A}}{\sup }\,E_{Q^{\theta ,\mathbf {C}}}\Big [U^{\pi }(X_T)\Big ]. \end{aligned}$$
(4.35)

This problem can be seen as a zero-sum stochastic differential game between the insurance firm and the market. We have

$$\begin{aligned} E_{Q^{\theta ,\mathbf {C}}}\Big [U^{\pi }(X_T)\Big ]=E\Big [G^{\theta ,\mathbf {C}}(T)U(X^{\pi }(T))\Big ]. \end{aligned}$$
(4.36)

Now, define \(Y(t)=Y^{\theta ,\mathbf {C},\pi }(t)\) by

$$\begin{aligned} Y(t)=E\Big [\frac{G^{\theta ,\mathbf {C}}(T)}{G^{\theta ,\mathbf {C}}(t)}U(X^{\pi }(T))\Big |\mathcal {F}_t\Big ]. \end{aligned}$$
(4.37)

Then, it can easily be shown that Y(t) is the solution to the following linear BSDE

$$\begin{aligned} \left\{ \begin{array}{llll} \mathrm {d}Y (t) &{}=&{}-\Big [\theta (t)Z(t)+\displaystyle \int _{\mathbb {R}_0}\theta (t)K(t,\zeta )\nu ^0_{\alpha }(\,\mathrm {d}\zeta )+\sum _{j=1}^D(\mathbf {D}_0^{\mathbf {C}}(t)\alpha (t)-\mathbf {1})_j\lambda _jV_j(t)\Big ]\mathrm {d}t \\ &{}&{} +Z(t)\mathrm {d}B(t) +\displaystyle \int _{\mathbb {R}_0}K(t,\zeta )\widetilde{N}^0_\alpha (\mathrm {d}\zeta ,\mathrm {d}t)+V(t)\cdot \mathrm {d}\widetilde{\Phi }(t),\,\,t\in [0,T], \\ Y(T) &{}=&{} U(X^{\pi }(T)). \end{array}\right. \end{aligned}$$
(4.38)

Noting that

$$\begin{aligned} Y(0)=Y^{\theta ,\mathbf {C},\pi }(0)=E_{Q^{\theta ,\mathbf {C}}}\Big [U^{\pi }(X_T)\Big ], \end{aligned}$$
(4.39)

Problem 4.5 becomes

Problem 4.6

Find a portfolio process \(\pi ^*\in \mathcal {A}\) and the process \((\theta ^*, \mathbf {C}^*)\in \Theta \times \mathcal {C}\) such that

$$\begin{aligned} \underset{\pi \in \mathcal {A}}{\sup }\,\,\underset{(\theta ,\mathbf {C}) \in \Theta \times \mathcal {C}}{\inf }\,Y^{\theta ,\mathbf {C},\pi }(0)=Y^{\theta ^*,\mathbf {C}^*,\pi ^*}(0) =\underset{(\theta ,\mathbf {C}) \in \Theta \times \mathcal {C}}{\inf }\,\,\underset{\pi \in \mathcal {A}}{\sup }\,Y^{\theta ,\mathbf {C},\pi }(0), \end{aligned}$$
(4.40)

where \(Y^{\theta ,\mathbf {C},\pi }\) is described by the Forward–backward system (4.31) and (4.38).

Theorem 4.7

Let \(X^{\pi }(t)\) be the surplus process with dynamics given by (4.31), with r deterministic. Consider the optimization problem of finding \(\pi ^*\in \mathcal {A}\) and \((\theta ^*,\mathbf {C}^*)\in \Theta \times \mathcal {C}\) such that (4.35) (or equivalently (4.40)) holds, with

$$\begin{aligned} Y^{\theta ,\mathbf {C},\pi }(t)=E\Big [\frac{G^{\theta ,\mathbf {C}}(T)}{G^{\theta ,\mathbf {C}}(t)}U(X^{\pi }(T))\Big |\mathcal {F}_t\Big ]. \end{aligned}$$
(4.41)

In addition, suppose that \(U(x)=-e^{-\beta x},\, \beta > 0.\) Then the optimal investment \(\pi ^*(t)\) and the optimal scenario \((\theta ^*,\mathbf {C}^*)\) of the market are given respectively by

$$\begin{aligned} \theta ^*(t)=&-\sum _{j=1}^D\Big (\frac{\mu _j-r_j-\sigma ^2_j\pi ^*(t,e_j)\beta e^{\int _t^Tr(s)\mathrm {d}s}}{\sigma _j}\Big )\langle \alpha (t),e_j\rangle , \end{aligned}$$
(4.42)
$$\begin{aligned} \pi ^*(t)=&\sum _{n=1}^D\Big (\frac{\displaystyle \int _{\mathbb {R}^+}(e^{\beta \zeta e^{\int _t^Tr(s)\mathrm {d}s}}-1)\lambda _n^0 F_n(\mathrm {d}\zeta )}{\beta \sigma _ne^{\int _t^Tr(s)\mathrm {d}s}}\Big )\langle \alpha (t),e_n \rangle . \end{aligned}$$
(4.43)

and the optimal \(\mathbf {C}^*\) solves the following constrained linear optimization problem:

$$\begin{aligned} \underset{C_{1j},\ldots ,C_{Dj}}{\min }\,\, \sum _{j=1}^D(\mathbf {D}^{\mathbf {C}}_0(t)e_n-\mathbf {1})_j\lambda _{nj} V_j(t)\,\,\, j=1,\ldots ,D, \end{aligned}$$
(4.44)

subject to the linear constraints

$$\begin{aligned} \sum _{n=1}^DC_{nj}(t)=0, \end{aligned}$$

where \(V_j\) is given by (4.67).

Moreover, assume that the family of rate matrices \((C_{nj})_{n,j=1,2}\) is bounded and write \(C_{nj}(t)\in \Big [C^l(n,j), C^u(n,j)\Big ]\) with \(C^l(n,j)< C^u(n,j),\,\,n,j=1,2\). Then the optimal \(\mathbf {C}^*\) is given by:

$$\begin{aligned} \mathbf {C}^*_{21}(t)&= C^l(2,1)\mathbb {I}_{V_1(t)-V_2(t)>0}+C^u(2,1)\mathbb {I}_{V_1(t)-V_2(t)<0}, \end{aligned}$$
(4.45)
$$\begin{aligned} \mathbf {C}^*_{11}(t)&=-\mathbf {C}^*_{21}(t), \end{aligned}$$
(4.46)
$$\begin{aligned} \mathbf {C}^*_{12}(t)&= C^l(1,2)\mathbb {I}_{V_2(t)-V_1(t)>0}+C^u(1,2)\mathbb {I}_{V_2(t)-V_1(t)<0}, \end{aligned}$$
(4.47)
$$\begin{aligned} \mathbf {C}^*_{22}(t)&=-\mathbf {C}^*_{12}(t). \end{aligned}$$
(4.48)

Proof

One can see that this is a particular case of a zero-sum stochastic differential game for the Forward–backward system of the form (2.5) and (2.7) with \(\psi =Id\), \(\varphi =f=0\) and \( h(x)=U(x).\) The Hamiltonian of Sect. 3 reduces to

$$\begin{aligned}&H(t, x,e_n, y,z,k,v,\pi ,\theta ,a,p,q,r^0,w)\nonumber \\&\quad = \,a\left[ \theta z+\displaystyle \int _{\mathbb {R}^+}\theta k(t,\zeta )\nu ^0_{e_n}(\,\mathrm {d}\zeta )+\sum _{j=1}^D(\mathbf {D}_0^\mathbf {C}(t)e_n-\mathbf {1})_j\lambda _{nj}v_j(t)\right] \nonumber \\&\quad \quad +\,\left[ P_0(t)+rx+\pi (\mu -r)-\displaystyle \int _{\mathbb {R}^+}\zeta \nu ^0_{e_n}(\mathrm {d}\zeta )\right] p\nonumber \\&\quad \quad +\,\sigma \pi q-\displaystyle \int _{\mathbb {R}^+}\zeta r^0(t,\zeta )\nu _{e_n}^0(\mathrm {d}\zeta ). \end{aligned}$$
(4.49)

The adjoint processes \(A(t)\) and \((p(t),q(t),r^0(t,\zeta ),w(t))\) associated with the Hamiltonian are given by the following Forward–backward SDEs

$$\begin{aligned} \left\{ \begin{array}{llll} \mathrm {d}A(t)&{}=&{}A(t)\Big [\theta (t)\mathrm {d}B(t)+\displaystyle \int _{\mathbb {R}^+}\theta (t)\widetilde{N}^0_\alpha (\mathrm {d}\zeta ,\mathrm {d}t)+(\mathbf {D}_0^{\mathbf {C}}(t)\alpha (t)-\mathbf {1})^\prime \cdot \mathrm {d}\widetilde{\Phi }(t)\Big ],\,t\in [0,T], \\ A(0)&{}=&{}1, \end{array}\right. \nonumber \\ \end{aligned}$$
(4.50)

and

$$\begin{aligned} \left\{ \begin{array}{llll} \mathrm {d}p(t)&{}=&{}-r(t)p(t)\mathrm {d}t+q(t)\mathrm {d}B(t)+\displaystyle \int _{\mathbb {R}^+}r^0(t,\zeta )\widetilde{N}^0_\alpha (\mathrm {d}\zeta ,\mathrm {d}t)+w(t)\cdot \mathrm {d}\widetilde{\Phi }(t),\,t\in [0,T], \\ p(T)&{}=&{}A(T)U^\prime (X(T)). \end{array}\right. \nonumber \\ \end{aligned}$$
(4.51)

It is easy to see that the functions h and H satisfy the assumptions of Theorem 3.9.

Let us now find \(\theta ^*\) and \(\pi ^*\). First, maximizing the Hamiltonian H with respect to \(\pi \) gives the first-order condition for an optimal \(\pi ^*\):

$$\begin{aligned} E[(\mu -r) p+\sigma q|\mathcal {F}_t]=0. \end{aligned}$$
(4.52)

The BSDE (4.51) is linear in p, hence we shall try a process p(t) of the form

$$\begin{aligned} p(t)=\beta f(t,\alpha (t))A(t)e^{-\beta X(t)e^{\int _t^Tr(s)\mathrm {d}s}}, \end{aligned}$$
(4.53)

where \(f(\cdot ,e_n)\) satisfies a differential equation to be determined. Applying the Itô-Lévy formula for jump-diffusion processes, we have

$$\begin{aligned} \mathrm {d}\Big (e^{-\beta X(t)e^{\int _t^Tr(s)\mathrm {d}s}}\Big )&= e^{-\beta X(t)e^{\int _t^Tr(s)\mathrm {d}s}}\Big [\Big ( -\beta e^{\int _t^Tr(s)\mathrm {d}s}\Big \{ P_0(t)+\pi (t)(\mu (t)-r(t))\Big \}\nonumber \\&\quad +\,\frac{1}{2}\beta ^2e^{2\int _t^Tr(s)\mathrm {d}s}\sigma ^2(t)\pi ^2(t) + \int _{\mathbb {R}^+}(e^{\beta \zeta e^{\int _t^Tr(s)\mathrm {d}s} }-1) \nu ^0_{\alpha }(\mathrm {d}\zeta )\Big )\mathrm {d}t\nonumber \\&\quad -\,\beta e^{\int _t^Tr(s)\mathrm {d}s}\sigma (t)\pi (t)\mathrm {d}B(t)\nonumber \\&\quad +\,\displaystyle \int _{\mathbb {R}^+}(e^{\beta \zeta e^{\int _t^Tr(s)\mathrm {d}s}}-1)\widetilde{N}^0_\alpha (\mathrm {d}\zeta ,\mathrm {d}t) \Big ]. \end{aligned}$$
(4.54)

Applying the Itô-Lévy formula for jump-diffusion Markov regime-switching processes (see e.g., Zhang et al. 2012, Theorem 4.1), we get

$$\begin{aligned}&\mathrm {d}\Big ( A(t)e^{-\beta X(t)e^{\int _t^Tr(s)\mathrm {d}s}}\Big )\nonumber \\&\quad \displaystyle =e^{-\beta X(t)e^{\int _t^Tr(s)\mathrm {d}s}}A(t)\Big [\theta (t)\mathrm {d}B(t)+ \int _{\mathbb {R}^+}\theta (t)\widetilde{N}^0_\alpha (\mathrm {d}\zeta ,\mathrm {d}t)+(\mathbf {D}_0^{\mathbf {C}}(t)\alpha (t)-\mathbf {1})^\prime \cdot \mathrm {d}\widetilde{\Phi }(t)\Big ]\nonumber \\&\quad \quad +\,A(t) e^{-\beta X(t)e^{\int _t^Tr(s)\mathrm {d}s}}\Big [\Big ( -\beta e^{\int _t^Tr(s)\mathrm {d}s}\Big \{ P_0(t)+\pi (t)(\mu (t)-r(t))\Big \}\nonumber \\&\quad \quad +\,\frac{1}{2}\beta ^2e^{2\int _t^Tr(s)\mathrm {d}s}\sigma ^2(t)\pi ^2(t) + \int _{\mathbb {R}^+}(e^{\beta \zeta e^{\int _t^Tr(s)\mathrm {d}s} }-1) \nu ^0_{\alpha }(\mathrm {d}\zeta )\Big )\mathrm {d}t\nonumber \\&\quad \quad -\,\beta e^{\int _t^Tr(s)\mathrm {d}s} \sigma (t)\pi (t)\mathrm {d}B(t)+\displaystyle \int _{\mathbb {R}^+}(e^{\beta \zeta e^{\int _t^Tr(s)\mathrm {d}s}}-1)\widetilde{N}^0_\alpha (\mathrm {d}\zeta ,\mathrm {d}t)\Big ]\nonumber \\&\quad \quad -\,\beta A(t)e^{-\beta X(t)e^{\int _t^Tr(s)\mathrm {d}s}}\theta (t)e^{\int _t^Tr(s)\mathrm {d}s}\sigma (t)\pi (t)\mathrm {d}t\nonumber \\&\quad \quad +\displaystyle \int _{\mathbb {R}^+}\theta (t)A(t) e^{-\beta X(t)e^{\int _t^Tr(s)\mathrm {d}s}}(e^{\beta \zeta e^{\int _t^Tr(s)\mathrm {d}s}}-1)N^0_\alpha (\mathrm {d}\zeta ,\mathrm {d}t) \nonumber \\&\quad = A(t)e^{-\beta X(t)e^{\int _t^Tr(s)\mathrm {d}s}} \Big [\Big ( -\beta e^{\int _t^Tr(s)\mathrm {d}s}\Big \{ P_0(t)+\pi (t)(\mu (t)-r(t))\Big \}\nonumber \\&\quad \quad -\,\beta e^{\int _t^Tr(s)\mathrm {d}s} \theta (t)\sigma (t)\pi (t)\nonumber \\&\quad \quad +\,\frac{1}{2}\beta ^2e^{2\int _t^Tr(s)\mathrm {d}s}\sigma ^2(t)\pi ^2(t) + \int _{\mathbb {R}^+}(1+\theta (t))(e^{\beta \zeta e^{\int _t^Tr(s)\mathrm {d}s} }-1) \nu ^0_{\alpha }(\mathrm {d}\zeta )\Big )\mathrm {d}t\nonumber \\&\quad \quad +\, ( \theta (t)-\beta e^{\int _t^Tr(s)\mathrm {d}s} \sigma (t)\pi (t))\mathrm {d}B(t)+\displaystyle \int _{\mathbb {R}^+}\Big \{(1\nonumber \\&\quad \quad +\,\theta (t))(e^{\beta \zeta e^{\int _t^Tr(s)\mathrm {d}s}}-1)+\theta (t)\Big \}\widetilde{N}^0_\alpha (\mathrm {d}\zeta ,\mathrm {d}t)\nonumber \\&\quad \quad +\,(\mathbf {D}_0^{\mathbf {C}}(t)\alpha (t)-\mathbf {1})^\prime \cdot \mathrm {d}\widetilde{\Phi }(t)\Big ]. \end{aligned}$$
(4.55)

Putting \(P_1(t):=A(t)e^{-\beta X(t) e^{\int _t^Tr(s)\mathrm {d}s}}\), so that \(p(t)=\beta f(t,\alpha (t))P_1(t)\), and using once more the Itô-Lévy formula for jump-diffusion Markov regime-switching processes, we get

$$\begin{aligned} \mathrm {d}p(t)&= \beta \mathrm {d}\Big (f(t,\alpha (t))P_1(t)\Big )\nonumber \\&= \beta \Big [ f^\prime (t,\alpha (t))P_1(t)\mathrm {d}t +f(t,\alpha (t))P_1(t)\Big [\Big ( -\beta e^{\int _t^Tr(s)\mathrm {d}s}\Big \{ P_0(t)\nonumber \\&\quad +\,\pi (t)(\mu (t)-r(t))\Big \}\nonumber \\&\quad -\,\beta e^{\int _t^Tr(s)\mathrm {d}s} \theta (t)\sigma (t)\pi (t)+\frac{1}{2}\beta ^2e^{2\int _t^Tr(s)\mathrm {d}s}\sigma ^2(t)\pi ^2(t) \nonumber \\&\quad +\, \int _{\mathbb {R}^+}(1+\theta (t))(e^{\beta \zeta e^{\int _t^Tr(s)\mathrm {d}s} }-1) \nu ^0_{\alpha }(\mathrm {d}\zeta )\Big )\mathrm {d}t \nonumber \\&\quad +\, ( \theta (t)-\beta e^{\int _t^Tr(s)\mathrm {d}s} \sigma (t)\pi (t))\mathrm {d}B(t)\Big ]+ \sum _{j=1}^D\Big (f(t,e_j)\nonumber \\&\quad -\,f(t,\alpha (t))\Big ) P_1(t)(\mathbf {D}_{0,\alpha }^{\mathbf {C}}(t))^j\lambda _j(t)\mathrm {d}t\nonumber \\&\quad +\,\displaystyle \int _{\mathbb {R}^+}f(t,\alpha (t))P_1(t)\Big \{(1+\theta (t))(e^{\beta \zeta e^{\int _t^Tr(s)\mathrm {d}s}}-1)+\theta (t)\Big \}\widetilde{N}^0_\alpha (\mathrm {d}\zeta ,\mathrm {d}t)\nonumber \\&\quad +\, \sum _{j=1}^DP_1(t)\Big ( f(t,e_j)(\mathbf {D}_{0,\alpha }^{\mathbf {C}}(t))^j -f(t,\alpha (t)) \Big ) \mathrm {d}\widetilde{\Phi }_j(t)\Big ], \end{aligned}$$
(4.56)

where \((\mathbf {D}_{0,\alpha }^{\mathbf {C}}(t))^j=(\mathbf {D}_0^{\mathbf {C}}(t)\alpha (t))^j\). Comparing (4.56) with (4.51) and equating the terms in \(\mathrm {d}t\), \(\mathrm {d}B(t)\), \(\widetilde{N}^0_\alpha (\mathrm {d}\zeta ,\mathrm {d}t)\) and \(\mathrm {d}\widetilde{\Phi }_j(t)\), \(j=1,\ldots ,D\), respectively, we get

$$\begin{aligned} q(t)=(\theta ^*(t)-\beta \sigma (t)\pi ^*(t)e^{\int _t^Tr(s)\mathrm {d}s})p(t). \end{aligned}$$
(4.57)

Substituting this into (4.52) gives

$$\begin{aligned} E[(\mu (t)-r(t)) p(t)|\mathcal {F}_t]&=-E[\sigma (t)\left( \theta ^*(t)-\sigma (t)\pi ^*(t)\beta e^{\int _t^Tr(s)\mathrm {d}s}\right) p(t)|\mathcal {F}_t],\nonumber \\ \text {i.e., }\,\, \theta ^*(t)&=-\sum _{j=1}^D\left( \frac{\mu _j-r_j-\sigma ^2_j\pi ^*(t,e_j)\beta e^{\int _t^Tr(s)\mathrm {d}s}}{\sigma _j}\right) \langle \alpha (t),e_j\rangle , \end{aligned}$$
(4.58)

where the last equality follows since all coefficients are adapted to \(\mathcal {F}_t\). Thus (4.42) in the Theorem is proved. On the other hand, we also have

$$\begin{aligned} r^0(t,\zeta )=p(t)\Big \{(1+\theta (t))(e^{\beta \zeta e^{\int _t^Tr(s)\mathrm {d}s}}-1)+\theta (t)\Big \} \end{aligned}$$
(4.59)

and

$$\begin{aligned} w_j(t)=\beta \Big \{P_1(t)\Big ( f(t,e_j)(\mathbf {D}_{0,\alpha }^{\mathbf {C}}(t))^j -f(t,\alpha (t)) \Big )\Big \}, \end{aligned}$$
(4.60)

with the function \(f(\cdot ,e_n)\) satisfying the following backward differential equation:

$$\begin{aligned}&f^\prime (t,e_n) +f(t,e_n)\Big [-\beta e^{\int _t^Tr(s)\mathrm {d}s}\Big \{ P_0(t,e_n)+\pi (t)(\mu (t,e_n)-r(t,e_n))\Big \} \nonumber \\&\quad \quad -\,\beta e^{\int _t^Tr(s)\mathrm {d}s} \theta (t)\sigma (t,e_n)\pi (t)\nonumber \\&\quad \quad +\,\frac{1}{2}\beta ^2 e^{2\int _t^Tr(s)\mathrm {d}s}\sigma ^2(t,e_n)\pi ^2(t)+\displaystyle \int _{\mathbb {R}^+}(1+\theta (t))(e^{\beta \zeta e^{\int _t^Tr(s)\mathrm {d}s}}-1) \lambda ^0_{n}F_{e_n}(\mathrm {d}\zeta )\Big ] \nonumber \\&\quad \quad +\,\sum _{j=1}^D\Big (f(t,e_j)-f(t,e_n)\Big )(\mathbf {D}_{0,e_n}^{\mathbf {C}}(t))_{nj}\lambda _{n j}=0, \end{aligned}$$
(4.61)

with the terminal condition \(f(T,e_n)=1\), for \(n=1,\ldots , D\). For \(r=0\), the solution of such a backward equation can be found in Elliott and Siu (2011). Next, minimizing the Hamiltonian H with respect to \(\theta \) gives the first-order condition for an optimal \(\theta ^*\):

$$\begin{aligned} E[z+\displaystyle \int _{\mathbb {R}^+}k(t,\zeta )\nu ^0_{\alpha }(\mathrm {d}\zeta )|\mathcal {F}_t]=0. \end{aligned}$$
(4.62)

The BSDE (4.38) is linear in Y, hence we shall try a process Y(t) of the form

$$\begin{aligned} Y(t)=f_1(t,\alpha (t))Y_1(t)\,\, \text { with }\,\, Y_1(t)=e^{-\beta X(t) e^{\int _t^Tr(s)\mathrm {d}s}}, \end{aligned}$$
(4.63)

where \(f_1(\cdot ,e_n),\,\,n=1,\ldots , D\), is a deterministic function satisfying a backward differential equation to be determined. Applying the Itô-Lévy formula for jump-diffusion Markov regime-switching processes, we get

$$\begin{aligned} \mathrm {d}Y(t)&= f_1^\prime (t,\alpha (t) )e^{-\beta X(t) e^{\int _t^Tr(s)\mathrm {d}s}}\mathrm {d}t- f_1(t, \alpha (t))Y_1(t)\beta e^{\int _t^Tr(s)\mathrm {d}s}\Big \{P_0(t)\nonumber \\&\quad +\,\pi (t)(\mu (t)-r(t))\nonumber \\&\quad -\,\frac{1}{2}\beta e^{\int _t^Tr(s)\mathrm {d}s}\sigma ^2(t)\pi ^2(t)+\frac{1}{\beta }\displaystyle \int _{\mathbb {R}^+}(e^{\beta \zeta e^{\int _t^Tr(s)\mathrm {d}s}}-1)\nu ^0_{\alpha }(\mathrm {d}\zeta )\Big \}\mathrm {d}t\nonumber \\&\quad +\,\sum _{j=1}^D\Big (f_1(t,e_j)-f_1(t,\alpha (t))\Big )Y_1(t)\lambda _j(t)\mathrm {d}t\nonumber \\&\quad -\, f_1(t,\alpha (t))Y_1(t)\beta e^{\int _t^Tr(s)\mathrm {d}s} \sigma (t)\pi (t)\mathrm {d}B(t)\nonumber \\&\quad +\, \displaystyle \int _{\mathbb {R}^+}f_1(t,\alpha (t))Y_1(t)(e^{\beta \zeta e^{\int _t^Tr(s)\mathrm {d}s}}-1)\widetilde{N}^0_\alpha (\mathrm {d}\zeta ,\mathrm {d}t)\nonumber \\&\quad +\, \sum _{j=1}^D\Big (f_1(t,e_j)-f_1(t,\alpha (t))\Big )Y_1(t) \mathrm {d}\widetilde{\Phi }_j(t). \end{aligned}$$
(4.64)

Comparing (4.64) and (4.38), we get

$$\begin{aligned} Z(t)=&-\beta e^{\int _t^Tr(s)\mathrm {d}s} Y(t)\sigma (t)\pi (t), \end{aligned}$$
(4.65)
$$\begin{aligned} K(t,\zeta )=&Y(t)(e^{\beta \zeta e^{\int _t^Tr(s)\mathrm {d}s}}-1), \end{aligned}$$
(4.66)
$$\begin{aligned} V_j(t)=&\Big \{f_1(t,e_j)-f_1(t,\alpha (t)\Big \}Y_1(t). \end{aligned}$$
(4.67)

Substituting \(Z(t)\) and \(K(t,\zeta )\) into (4.62), we get

$$\begin{aligned} E[\beta e^{\int _t^Tr(s)\mathrm {d}s}\sigma (t)\pi ^*(t)|\mathcal {F}_t]=&E\Big [\displaystyle \int _{\mathbb {R}^+}(e^{\beta \zeta e^{\int _t^Tr(s)\mathrm {d}s}}-1)\nu ^0_{\alpha }(\mathrm {d}\zeta )|\mathcal {F}_t\Big ],\nonumber \\ \text { i.e., }\,\, \pi ^*(t)=&\sum _{n=1}^D\left( \frac{\displaystyle \int _{\mathbb {R}^+}(e^{\beta \zeta e^{\int _t^Tr(s)\mathrm {d}s}}-1)\lambda _n^0 F_n(\mathrm {d}\zeta )}{\beta \sigma _ne^{\int _t^Tr(s)\mathrm {d}s}}\right) \langle \alpha (t),e_n \rangle . \end{aligned}$$
(4.68)

Thus (4.43) in the Theorem is proved. Substituting (4.65)-(4.67) into (4.64), we deduce that the function \(f_1(\cdot , e_n)\) satisfies the following backward differential equation

$$\begin{aligned}&f_1^\prime (t,e_n) +f_1(t,e_n)\Big [-\beta e^{\int _t^Tr(s)\mathrm {d}s} \Big \{ P_0(t,e_n)+\pi (t)(\mu (t,e_n)-r(t,e_n))\Big \}\nonumber \\&\quad \quad -\,\beta e^{\int _t^Tr(s)\mathrm {d}s} \theta (t)\sigma (t,e_n)\pi (t)\nonumber \\&\quad \quad +\,\frac{1}{2}\beta ^2 e^{2\int _t^Tr(s)\mathrm {d}s}\sigma ^2(t,e_n)\pi ^2(t) +\displaystyle \int _{\mathbb {R}^+}(1+\theta (t))(e^{\beta \zeta e^{\int _t^Tr(s)\mathrm {d}s}}-1) \lambda ^0_{n}F_{e_n}(\mathrm {d}\zeta )\Big ] \nonumber \\&\quad \quad +\,\sum _{j=1}^D\Big (f_1(t,e_j)-f_1(t,e_n)\Big )(\mathbf {D}_{0,e_n}^{\mathbf {C}}(t))_{nj}\lambda _{n j}=0, \end{aligned}$$
(4.69)

with the terminal condition \(f_1(T,e_n)=-1\) for \(n=1,\ldots ,D\).
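
The system (4.69) is a coupled system of backward ordinary differential equations for \(f_1(\cdot ,e_1),\ldots ,f_1(\cdot ,e_D)\). The following minimal numerical sketch integrates it by stepping backwards from the terminal condition, with the bracketed coefficient of (4.69) and the products \((\mathbf {D}_{0,e_n}^{\mathbf {C}}(t))_{nj}\lambda _{nj}\) frozen at hypothetical constant values (\(\pi \), \(\theta \) and \(\mathbf {C}\) being regarded as given):

```python
import numpy as np

D, T, n_steps = 2, 1.0, 1000
h = T / n_steps

# Hypothetical frozen coefficients (illustrative only):
a  = np.array([-0.8, -1.2])          # bracketed coefficient of (4.69) in states e_1, e_2
dl = np.array([[0.0, 0.5],           # placeholder for (D_0^C)_{nj} * lambda_{nj}, n != j
               [0.7, 0.0]])

f1 = np.full(D, -1.0)                # terminal condition f_1(T, e_n) = -1
for _ in range(n_steps):
    # backward Euler: f_1(t - h) ~ f_1(t) + h * [f_1(t) * a + coupling(t)]
    coupling = (dl * (f1[None, :] - f1[:, None])).sum(axis=1)
    f1 = f1 + h * (f1 * a + coupling)
# f1 now approximates (f_1(0, e_1), f_1(0, e_2)) for these frozen coefficients
```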

As for the optimal \((C_{nj})_{n,j=1,\ldots ,D}\), the only part of the Hamiltonian that depends on \(\mathbf {C}\) is the sum \(\sum _{j=1}^D(\mathbf {D}^{\mathbf {C}}_0(t)e_n-\mathbf {1})_j\lambda _{nj} V_j(t)\). Hence minimizing the Hamiltonian with respect to \(\mathbf {C}\) is equivalent to solving the following constrained linear minimization problems

$$\begin{aligned} \underset{C_{1j},\ldots ,C_{Dj}}{\min }\,\, \sum _{j=1}^D(\mathbf {D}^{\mathbf {C}}_0(t)e_n-\mathbf {1})_j\lambda _{nj} V_j(t)\,\,\, j=1,\ldots ,D, \end{aligned}$$
(4.70)

subject to the linear constraints

$$\begin{aligned} \sum _{n=1}^DC_{nj}(t)=0. \end{aligned}$$

Hence, one can obtain the solution in the two-state case (since \(\mathbf {C}\) is bounded), with \(V_j\) and \(f_1\) given by (4.67) and (4.69) respectively. More specifically, if the Markov chain has only two states, we have to solve the following two linear programming problems:

$$\begin{aligned} \underset{C_{11}(t),C_{21}(t)}{\min }\,\, ( V_1(t)-V_2(t))C_{21}(t)+\lambda _{21} (V_2(t)-V_1(t)) \end{aligned}$$
(4.71)

subject to the linear constraint

$$\begin{aligned} C_{11}+C_{21}=0 \end{aligned}$$

and

$$\begin{aligned} \underset{C_{12}(t),C_{22}(t)}{\min }\,\, ( V_2(t)-V_1(t))C_{12}(t)+\lambda _{12} (V_1(t)-V_2(t)) \end{aligned}$$
(4.72)

subject to the linear constraint

$$\begin{aligned} C_{12}+C_{22}=0. \end{aligned}$$

By imposing that the family of rate matrices \((C_{nj})_{n,j=1,2}\) is bounded, we can write \(C_{nj}(t)\in \Big [C^l(n,j), C^u(n,j)\Big ]\) with \(C^l(n,j)< C^u(n,j),\,\,n,j=1,2\). The solutions to the preceding two linear programming problems are then given by:

$$\begin{aligned} C^*_{21}(t)= & {} C^l(2,1)\mathbb {I}_{V_1(t)-V_2(t)>0}+C^u(2,1)\mathbb {I}_{V_1(t)-V_2(t)<0},\nonumber \\ C^*_{11}(t)= & {} -C^*_{21}(t) \end{aligned}$$
(4.73)

and

$$\begin{aligned} C^*_{12}(t)= & {} C^l(1,2)\mathbb {I}_{V_2(t)-V_1(t)>0}+C^u(1,2)\mathbb {I}_{V_2(t)-V_1(t)<0},\nonumber \\ C^*_{22}(t)= & {} -C^*_{12}(t). \end{aligned}$$
(4.74)

The proof is complete. \(\square \)
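
The selection rules (4.73)–(4.74) are of bang-bang type and straightforward to implement. A minimal sketch for the two-state case (with hypothetical bounds, and an arbitrary choice on the set \(V_1(t)=V_2(t)\), where any admissible value is optimal) is:

```python
def optimal_C(V1, V2, C_low, C_up):
    """Bang-bang selection (4.73)-(4.74) for a two-state chain.

    C_low, C_up: dicts of lower/upper bounds keyed by (n, j); hypothetical inputs.
    When V1 == V2 the coefficient of C vanishes and the choice below is arbitrary.
    """
    C21 = C_low[(2, 1)] if V1 - V2 > 0 else C_up[(2, 1)]
    C12 = C_low[(1, 2)] if V2 - V1 > 0 else C_up[(1, 2)]
    return {(2, 1): C21, (1, 1): -C21, (1, 2): C12, (2, 2): -C12}

# Example with V_1(t) > V_2(t): C*_21 sits at its lower bound, C*_12 at its upper bound.
print(optimal_C(V1=0.3, V2=0.1,
                C_low={(2, 1): 0.1, (1, 2): 0.2},
                C_up={(2, 1): 0.9, (1, 2): 1.1}))
```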

Remark 4.8

  • Assume for example that the distribution of the claim size is of exponential type (with parameter \(\tilde{\lambda }_j^0>2\beta ,\,j=1,\ldots ,D\)); a closed-form evaluation of \(\pi ^*\) under this assumption is sketched after this remark. Moreover, assume that \(\pi , \theta \) and \(\mathbf {C}\) are given by (4.68), (4.58) and (4.70), respectively. Then each of the equations (4.31), (4.38), (4.50) and (4.51) admits a unique solution. The solution \((\widehat{Y}(t), \widehat{Z}(t), \widehat{K}(t,\zeta ), \widehat{V}(t))\) (respectively \((\widehat{p}(t),\widehat{q}(t),\widehat{r}^0(t,\zeta ),\widehat{w}(t))\)) to (4.38) (respectively (4.51)) is given by (4.63), (4.65), (4.66) and (4.67) (respectively (4.53), (4.57), (4.59) and (4.60)).

  • We note that f given by (4.61) and \(f_1\) given by (4.69) coincide. Moreover, for \(r=0\), the backward differential equation (4.61) is the same as Elliott and Siu (2011, Eq. (4.13)).
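
To illustrate the first bullet point, suppose that in state \(e_n\) the claim sizes are exponentially distributed with parameter \(\tilde{\lambda }_n^0\), i.e., \(F_n(\mathrm {d}\zeta )=\tilde{\lambda }_n^0e^{-\tilde{\lambda }_n^0\zeta }\mathrm {d}\zeta \). Writing \(c(t):=\beta e^{\int _t^Tr(s)\mathrm {d}s}\) and assuming \(c(t)<\tilde{\lambda }_n^0\), the integral in (4.43) can be evaluated in closed form:

$$\begin{aligned} \int _{\mathbb {R}^+}(e^{c(t)\zeta }-1)\lambda _n^0F_n(\mathrm {d}\zeta )=\lambda _n^0\Big (\frac{\tilde{\lambda }_n^0}{\tilde{\lambda }_n^0-c(t)}-1\Big )=\frac{\lambda _n^0\,c(t)}{\tilde{\lambda }_n^0-c(t)}, \end{aligned}$$

so that (4.43) becomes

$$\begin{aligned} \pi ^*(t)=\sum _{n=1}^D\frac{\lambda _n^0}{\sigma _n\big (\tilde{\lambda }_n^0-\beta e^{\int _t^Tr(s)\mathrm {d}s}\big )}\langle \alpha (t),e_n\rangle . \end{aligned}$$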

5 Conclusion

In this paper, we use a general maximum principle for Markov regime-switching Forward–backward stochastic differential equations to study optimal strategies for stochastic differential games. The proposed model covers the model uncertainty in Bordigoni et al. (2005), Elliott and Siu (2011), Faidi et al. (2011), Jeanblanc et al. (2012), Øksendal and Sulem (2012). The results obtained are applied to study two problems: first, we study robust utility maximization under relative entropy penalization. We show that the value function in this case is described by a quadratic regime-switching backward stochastic differential equation. Second, we study a problem of optimal investment of an insurance company under model uncertainty. This can be formulated as a two-player zero-sum stochastic differential game between the market and the insurance company, where the market controls the mean relative growth rate of the risky asset and the company controls the investment. We find “closed form” solutions for the optimal strategies of the insurance company and the market, when the utility is of exponential type and the Markov chain has two states.

Optimal control for delayed systems has also received attention recently, due to the memory dependence of some processes. In this situation, the dynamics at the present time t depend not only on the situation at time t but also on a finite part of the past history. An extension of the present work to the delayed case could be of interest. Such results were derived in Menoukeu-Pamen (2015) in the case of no regime-switching.

It would also be interesting to study the sensitivity of the optimal controls with respect to the given parameters. However, this is not straightforward since the parameters (coefficients) in this case depend on the regime and are thus stochastic. This is the object of future work.