1 Introduction

Expected utility theory can be seen as a theory of decision making under uncertainty based on postulates about an agent’s preferences. In general, the agent’s preferences are driven by a time-additive functional that discounts future rewards at a constant rate. The standard expected utility maximization problem supposes that the agent knows the initial probability measure that governs the dynamics of the underlying. However, it is difficult or even impossible to single out one trustworthy probability distribution for the uncertainty. Moreover, in finance and insurance, there is no consensus on which underlying probability measure should be used to model uncertainty. This led to the study of utility maximization under model uncertainty, the uncertainty being represented by a family of absolutely continuous (or equivalent) probability measures. The idea is to solve the problem for each probability measure in the above-mentioned class and choose the one that gives the worst objective value. More specifically, the investor maximizes the expected utility with respect to each measure in this class and chooses, among all of them, the portfolio with the lowest value. This is also known as the robust optimization problem and has been studied intensively in recent years. For more information, the reader may consult (Bordigoni et al. 2005; Elliott and Siu 2011; Faidi et al. 2011; Jeanblanc et al. 2012; Menoukeu-Pamen 2015; Øksendal and Sulem 2012) and references therein.

Our paper is motivated by the ideas developed in Menoukeu-Pamen (2015), Menoukeu-Pamen (2014) and Øksendal and Sulem (2012), where general maximum principles for Forward–backward stochastic differential games with or without delay are presented. We give a general maximum principle for Forward–backward Markov regime-switching stochastic differential equations under model uncertainty. Then we study a problem of recursive utility maximization with entropy penalty. We show that the value function is the unique solution to a quadratic Markov regime-switching backward stochastic differential equation. This result extends the results in Bordigoni et al. (2005) and Jeanblanc et al. (2012) by considering a Markov regime-switching state process and a more general stochastic differential utility (SDU). The notion of SDU was introduced in Duffie and Epstein (1992) as a continuous-time extension of the concept of recursive utility proposed in Epstein and Zin (1989) and Weil (1990). The latter notion was developed in order to disentangle the concepts of risk aversion and aversion to intertemporal substitution, which are not treated independently in the standard utility formulation.

The other motivation is to study a stochastic differential game problem for Markov regime-switching systems. In a financial market, one may assume that this corresponds to the case in which the mean relative growth rate of the risky asset is not known to the agent, but subject to uncertainty; hence it is regarded as a stochastic control that plays against the agent, that is, a (zero-sum) stochastic differential game between the agent and the market. A similar problem was studied in Elliott and Siu (2011), where the objective of an insurance company is to choose an optimal investment strategy so as to maximize the expected exponential utility of terminal wealth in the worst-case scenario. The authors use the dynamic programming approach to derive the explicit optimal investment of the company and the optimal mean growth rate of the market when the interest rate is zero. In this paper, our general stochastic maximum principle extends their results to the framework of (nonzero-sum) Forward–backward stochastic differential games and more general dynamics for the state process. In addition, when the company and the market have the same level of information, we obtain explicit forms for the optimal strategies of the market and the insurance company when the Markov chain has two states and the interest rate is not zero. Let us mention that our general result can also be applied to study utility maximization under a risk constraint and model uncertainty. This is due to the fact that risk measures can be written as solutions to BSDEs; hence transforming the constrained problem into an unconstrained one leads to the setting discussed here. Another application of our result pertains to risk minimization under model uncertainty in a regime-switching market.

The remainder of the paper is organized as follows: In Sect. 2, we formulate the control problem. In Sect. 3, we derive a partial information stochastic maximum principle for forward–backward stochastic differential games for a Markov switching Lévy process under model uncertainty. In Sect. 4, we apply the results to study, first, robust utility maximization with entropy penalty and, second, a problem of optimal investment of an insurance company under model uncertainty. In the latter case, explicit expressions for the optimal strategies are derived.

2 Model and problem formulation

In this section, we formulate the general problem of stochastic differential games of Markov regime-switching Forward–backward SDEs. Let \((\Omega ,\mathcal {F},P)\) be a complete probability space, where P is a reference probability measure. On this probability space, we assume that we are given a one-dimensional Brownian motion \(B=\{B(t)\}_{0\le t\le T}\), an irreducible homogeneous continuous-time, finite state space Markov chain \(\alpha :=\{\alpha (t)\}_{0\le t\le T}\) and an independent Poisson random measure \(N(\mathrm {d}\zeta ,\mathrm {d}s)\) on \((\mathbb {R}^+\times \mathbb {R}_0,\mathcal {B}(\mathbb {R}^+)\otimes \mathcal {B}_0)\) under P. Here \(\mathbb {R}_0=\mathbb {R} \backslash \{0\}\) and \(\mathcal {B}_0\) is the Borel \(\sigma \)-algebra generated by the open subsets of \(\mathbb {R}_0\).

We suppose that the filtration \(\mathbb {F}=\{\mathcal {F}_t\}_{0\le t\le T}\) is the P-augmented natural filtration generated by B, N and \(\alpha \) [see for example Donnelly (2011, Section 2) or Elliott and Siu (2011, p. 369)].

We assume that the Markov chain takes values in a finite state space \(\mathbb {S}=\{e_1,e_2, \ldots ,e_D\}\subset \mathbb {R}^D\), where \(D\in \mathbb {N}\), and the jth component of \(e_n\) is the Kronecker delta \(\delta _{nj}\) for each \(n,j=1,\ldots , D\). Denote by \(\Lambda :=\{\lambda _{nj}:1\le n,j\le D\}\) the rate (or intensity) matrix of the Markov chain under P. Hence, for each \(1\le n,j\le D,\,\,\lambda _{nj}\) is the constant transition intensity of the chain from state \(e_n\) to state \(e_j\) at time t. Recall that for \(n\ne j,\,\,\lambda _{nj}\ge 0\) and \(\sum _{j=1}^D \lambda _{nj}=0\), hence \(\lambda _{nn}\le 0\). As shown in Elliott et al. (1994), \(\alpha \) admits the following semimartingale representation

$$\begin{aligned} \alpha (t)=\alpha (0)+\int _0^t\Lambda ^T\alpha (s)\mathrm {d}s+M(t), \end{aligned}$$
(2.1)

where \(M:=\{M(t)\}_{t\in [0,T]}\) is an \(\mathbb {R}^D\)-valued \((\mathbb {F},P)\)-martingale and \(\Lambda ^T\) denotes the transpose of the matrix \(\Lambda \). Next we introduce the Markov jump martingales associated with \(\alpha \); for more information, the reader should consult Elliott et al. (1994) or Zhang et al. (2012). For each \(1\le n,j\le D\), with \(n\ne j\), and \(t\in [0,T]\), denote by \(J^{nj}(t)\) the number of jumps from state \(e_n\) to state \(e_j\) up to time t. It can be shown (see Elliott et al. 1994) that

$$\begin{aligned} J^{nj}(t)=\lambda _{nj}\int _0^t\langle \alpha (s-),e_n\rangle \mathrm {d}s +m_{nj}(t), \end{aligned}$$
(2.2)

where \(m_{nj}:=\{m_{nj}(t)\}_{t\in [0,T]}\) with \(m_{nj}(t):=\int _0^t\langle \alpha (s-),e_n\rangle \langle \mathrm {d}M(s),e_j\rangle \) is a \((\mathbb {F},P)\)-martingale.

Fix \(j\in \{1,2,\ldots ,D\}\) and denote by \(\Phi _j(t)\) the number of jumps into state \(e_j\) up to time t. Then

$$\begin{aligned} \Phi _j(t)&:=\sum _{n=1,n\ne j}^D J^{nj}(t)= \sum _{n=1,n\ne j}^D \lambda _{nj}\int _0^t\langle \alpha (s-),e_n\rangle \mathrm {d}s +\widetilde{\Phi }_{j}(t)\nonumber \\&= \lambda _j(t) + \widetilde{\Phi }_{j}(t), \end{aligned}$$
(2.3)

with \(\widetilde{\Phi }_{j}(t)=\sum _{n=1,n\ne j}^D m_{nj}(t)\) and \(\lambda _j(t)=\sum _{n=1,n\ne j}^D \lambda _{nj}\int _0^t\langle \alpha (s-),e_n\rangle \mathrm {d}s \). Note that for each \(j\in \{1,2,\ldots ,D\},\,\,\,\widetilde{\Phi }_{j}:=\{\widetilde{\Phi }_{j}(t)\}_{t\in [0,T]}\) is a \((\mathbb {F},P)\)-martingale.
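For intuition, the following sketch (not part of the paper) simulates a small chain and checks numerically that the compensated jump counts \(\widetilde{\Phi }_{j}(T)=\Phi _j(T)-\lambda _j(T)\) from (2.3) have mean approximately zero; the two-state rate matrix and all numerical values are hypothetical choices made only for illustration.

```python
import numpy as np

# Illustrative check of (2.3): for a hypothetical two-state chain, the
# compensated jump counts Phi_j(T) - lambda_j(T) should have mean ~ 0.
rng = np.random.default_rng(0)
D = 2
Lam = np.array([[-0.5, 0.5],
                [0.8, -0.8]])          # rows sum to zero, off-diagonal >= 0
T, dt, n_paths = 5.0, 1e-2, 2000
n_steps = int(T / dt)

def compensated_counts():
    state = 0                          # start in e_1
    phi = np.zeros(D)                  # Phi_j: number of jumps into e_j
    comp = np.zeros(D)                 # lambda_j(t): compensator in (2.3)
    for _ in range(n_steps):
        for j in range(D):
            if j != state:
                comp[j] += Lam[state, j] * dt
        if rng.random() < -Lam[state, state] * dt:   # a jump occurs
            probs = Lam[state].copy()
            probs[state] = 0.0
            probs /= probs.sum()
            state = rng.choice(D, p=probs)
            phi[state] += 1
    return phi - comp                  # one realization of tilde(Phi)_j(T)

samples = np.array([compensated_counts() for _ in range(n_paths)])
print("mean of compensated jump counts (should be close to 0):",
      samples.mean(axis=0))
```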

Suppose that the compensator of \(N(\mathrm {d}\zeta ,\mathrm {d}s)\) is given by

$$\begin{aligned} \eta _\alpha (\mathrm {d}\zeta ,\mathrm {d}s):=\nu _\alpha (\mathrm {d}\zeta |s)\eta (\mathrm {d}s)=\langle \alpha (s-),\nu (\mathrm {d}\zeta |s)\rangle \eta (\mathrm {d}s), \end{aligned}$$
(2.4)

where \(\eta (\mathrm {d}s)\) is a \(\sigma \)-finite measure on \(\mathbb {R}^+\) and \(\nu (\mathrm {d}\zeta |s):=(\nu _{e_1}(\mathrm {d}\zeta |s),\nu _{e_2}(\mathrm {d}\zeta |s),\ldots ,\nu _{e_D}(\mathrm {d}\zeta |s))\in \mathbb {R}^D\) is a function of s. Let us mention that for each \(j=1,\ldots ,D\), \(\nu _{e_j}(\mathrm {d}\zeta |s)=\nu _j(\mathrm {d}\zeta |s) \) represents the conditional Lévy density of jump sizes of \(N(\mathrm {d}\zeta ,\mathrm {d}s)\) at time s when \(\alpha (s^-)=e_j\) and satisfies \(\int _{\mathbb {R}_{0}}\min (1,\zeta ^{2})\nu _j (\mathrm {d}\zeta |s)< \infty \). In this paper, we further assume that \(\eta (\mathrm {d}s)=\mathrm {d}s\) and that \(\nu (\mathrm {d}\zeta |s)\) depends only on \(\zeta \), that is,

$$\begin{aligned} \nu (\mathrm {d}\zeta |s)=\nu (\mathrm {d}\zeta ) \end{aligned}$$

and denote by \(\widetilde{N}_\alpha (\mathrm {d}\zeta ,\mathrm {d}s):=N(\mathrm {d}\zeta ,\mathrm {d}s)-\nu _\alpha (\mathrm {d}\zeta )\,\mathrm {d}s\) the compensated Markov regime-switching Poisson random measure.

Suppose that the state process \(X(t)=X^{(u)}(t,\omega );\,\,0 \le t \le T,\,\omega \in \Omega \), is a controlled Markov regime-switching jump-diffusion process of the form

$$\begin{aligned} \left\{ \begin{array}{llll} \,\mathrm {d}X (t) &{}=&{} b(t,X(t),\alpha (t),u(t),\omega )\,\mathrm {d}t +\sigma (t,X(t),\alpha (t),u(t),\omega )\,\mathrm {d}B(t) \\ &{}&{}+\,\displaystyle \int _{\mathbb {R}_0}\gamma (t,X(t),\alpha (t),u(t),\zeta , \omega )\,\widetilde{N}_\alpha (\mathrm {d}\zeta ,\mathrm {d}t)\\ &{}&{}+\,\eta (t,X(t),\alpha (t),u(t),\omega )\cdot \mathrm {d}\widetilde{\Phi }(t),\,\,\,\,\,\, t \in [ 0,T] , \\ X(0) &{}=&{} x_0. \end{array}\right. \end{aligned}$$
(2.5)

In a financial market, the above model makes it possible to incorporate the impact of changes in macro-economic conditions on the dynamics of an asset’s price, as well as the occurrence of unpredictable events that could affect that dynamics. One could think of the Brownian motion part as the random shocks in the price of a risky asset. The Poisson jump part takes into account the jumps in the asset price caused by lack of information or unexpected events. The Markov chain describes economic cycles: the states of the underlying Markov chain represent the different states of the economy, whereas the jumps given by the martingale of the underlying Markov chain represent transitions in economic conditions.
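To fix ideas, here is a minimal Euler-type discretization sketch of a simplified scalar version of (2.5): two regimes, a constant control, mean-zero compound-Poisson jumps (so the compensator drift vanishes) and the \(\eta \cdot \mathrm {d}\widetilde{\Phi }\) term omitted. All coefficients are hypothetical and chosen only to illustrate how the Brownian, jump and regime-switching components enter the dynamics.

```python
import numpy as np

# Euler-type sketch of a simplified version of (2.5); all coefficients are
# hypothetical.  The jump sizes have mean zero, so the compensated jump
# integral is approximated by the raw jump sum.
rng = np.random.default_rng(1)
T, n = 1.0, 1000
dt = T / n
sig = [0.2, 0.4]                                 # sigma in regimes e_1, e_2
jump_rate, jump_scale = 1.0, 0.1                 # Poisson intensity, size scale
Lam = np.array([[-0.5, 0.5], [0.8, -0.8]])       # rate matrix of the chain

x, regime, u = 1.0, 0, 0.05                      # state, regime, constant control
for _ in range(n):
    dB = rng.normal(0.0, np.sqrt(dt))
    n_jumps = rng.poisson(jump_rate * dt)
    dJ = jump_scale * rng.normal(size=n_jumps).sum()
    x += (0.05 * x + u) * dt + sig[regime] * x * dB + x * dJ
    if rng.random() < -Lam[regime, regime] * dt:  # regime switch
        regime = 1 - regime
print("X(T) along one simulated path:", x)
```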

In this paper, we consider a nonzero-sum stochastic differential game problem. This means that one player’s gain (respectively loss) does not necessarily result in the other player’s loss (respectively gain). In our model, the control \(u=(u_1,u_2)\) is such that \(u_i\) is the control of player \(i;\, i=1,2.\) We suppose that the different levels of information available at time t to player \(i;\, i=1,2\), are modelled by two subfiltrations

$$\begin{aligned} \mathcal {E}^{(i)}_t\subset \mathcal {F}_t\,;\,\,\,t\in [0,T]. \end{aligned}$$
(2.6)

Note that one possible subfiltration \((\mathcal {E}^{(i)}_{t})_{t\ge 0}\) in (2.6) is the \(\delta \)-delayed information given by

$$\begin{aligned} \mathcal {E}^{(i)}_{t}=\mathcal {F}_{(t-\delta )^+ };\,\,\,t\ge 0 \end{aligned}$$

where \(\delta \ge 0\) is a given constant delay. Denote by \(\mathcal {A}_i\) the set of admissible controls of player i, contained in the set of \(\mathcal {E}^{(i)}_t\)-predictable processes, \(i=1,2\), with values in \(\mathbb {A}_i\subset \mathbb {R}\).

The functions \(b,\sigma , \gamma \) and \(\eta \) are given such that, for all t, \(b(t,x,e_n,u,\cdot )\), \( \sigma (t,x ,e_n,u,\cdot )\), \(\gamma (t,x,e_n,u,\zeta ,\cdot )\) and \(\eta (t,x,e_n,u,\cdot ),\,\,n=1,\ldots ,D\), are \(\mathcal {F}_t\)-progressively measurable for all \(x \in \mathbb {R},\,\,u\in \mathbb {A}_1\times \mathbb {A}_2\) and \(\zeta \in \mathbb {R}_0\); in addition, \(b(\cdot ,x,e_n,u,\omega )\), \( \sigma (\cdot ,x ,e_n,u,\omega )\), \(\gamma (\cdot ,x,e_n,u,\zeta ,\omega )\) and \(\eta (\cdot ,x,e_n,u,\omega ),\,\,n=1,\ldots ,D\), are measurable in t for each \(x\in \mathbb {R}, u\in \mathbb {A}_1\times \mathbb {A}_2, \zeta \in \mathbb {R}_0\) and \(\omega \in \Omega \), and (2.5) has a unique strong solution for any admissible control \(u\in \mathcal {A}_1\times \mathcal {A}_2\). Under these conditions, existence and uniqueness of a solution to (2.5) is ensured if \(b, \sigma , \gamma \) and \(\eta \) are globally Lipschitz continuous in x and satisfy a linear growth condition in x; see for example Applebaum (2009, Theorem 6.2.3), Mao and Yuan (2006, Theorem 3.13) and Kulinich and Kushnirenko (2014, Theorem).

For each player i, we consider the associated BSDE in the unknowns \((Y_i(t), Z_i(t), K_i(t,\zeta ), V_i(t))\) of the form

$$\begin{aligned} \left\{ \begin{array}{llll} \,\mathrm {d}Y_i (t) &{}=&{} - g_i(t,X(t),\alpha (t),Y_i(t), Z_i(t), K_i(t,\cdot ),V_i(t),u(t))\,\mathrm {d}t \,+\,Z_i(t)\,\mathrm {d}B(t) \\ &{}&{}+\, \displaystyle \int _{\mathbb {R}_0}K_i(t,\zeta )\,\widetilde{N}_\alpha (\mathrm {d}\zeta ,\mathrm {d}t)+V_i(t)\cdot \mathrm {d}\widetilde{\Phi }(t);\,\,\, t \in [ 0,T] , \\ Y_i(T) &{}=&{} h_i(X(T),\alpha (T))\,; \quad i=1,2. \end{array}\right. \end{aligned}$$
(2.7)

Here \(g_i:[0,T]\times \mathbb {R} \times \mathbb {S} \times \mathbb {R} \times \mathbb {R} \times \mathcal {R} \times \mathbb {R} \times \mathbb {A}_1\times \mathbb {A}_2 \rightarrow \mathbb {R}\) and \(h_i:\mathbb {R}\times \mathbb {S} \rightarrow \mathbb {R}\) are such that the BSDE (2.7) has a unique solution for any admissible control \(u \in \mathcal {A}_1\times \mathcal {A}_2\). For sufficient conditions for existence and uniqueness of Markov regime-switching BSDEs, we refer the reader to Cohen and Elliott (2010, Theorem 1.1), Crepey (2010, Proposition 14.4.1) or Tang and Li (1994, Lemma 2.4) and references therein. For example, such a unique solution exists if one assumes that \(g_i(\cdot , x,e_n,y,z,k,v,u)\) is uniformly Lipschitz continuous with respect to (x, y, z, k, v), the random variable \(h_i(X(T),\alpha (T))\) is square integrable and \(g_i(t,0,e_n,0,0,0,0,u)\) is uniformly bounded.

Let \(f_i:[0,T]\times \mathbb {R} \times \mathbb {S} \times \mathbb {A}_1\times \mathbb {A}_2 \rightarrow \mathbb {R}, \,\,\,\varphi _i:\mathbb {R}\times \mathbb {S} \rightarrow \mathbb {R}\) and \(\psi _i:\mathbb {R} \rightarrow \mathbb {R},\,i=1,2\) be given \(C^1\) functions with respect to their arguments and \(\psi _i^\prime (x)\ge 0\) for all \(x,\,i=1,2\). For the nonzero-sum games, the control actions are not free and generate for each player \(i,\,\,i=1,2\), a performance functional

$$\begin{aligned} J_i(t,u)&:=E\Big [ \int _t^T f_i(s,X(s),\alpha (s),u(s))\,\mathrm {d}s + \varphi _i(X(T),\alpha (T))\nonumber \\&\quad +\,\psi _i(Y_i(t))\Big | \mathcal {E}^{(i)}_t\Big ];\quad i=1,2. \end{aligned}$$
(2.8)

Here, \(f_i,\,\varphi _i\) and \(\psi _i\) may be seen as profit rates, bequest functions and “utility evaluations” respectively, of the player \(i;\, i=1,2\). For \(t=0\), we put

$$\begin{aligned} J_i(u):=J_i(0,u),\,\, i=1,2. \end{aligned}$$
(2.9)

Let us note that in the nonzero-sum game the players do not share the same performance functional; instead, each of them uses his own performance functional. In addition, they have the same type of objective, that is, each maximizes his own performance functional. To be more precise, the nonzero-sum game is the following:

Problem 2.1

Find \((u_1^*,u_2^*)\in \mathcal {A}_1\times \mathcal {A}_2\) (if it exists) such that

  1.

    \(J_1(t,u_1,u_2^*)\le J_1(t,u_1^*,u_2^*)\) for all \(u_1\in \mathcal {A}_1\),

  2.

    \(J_2(t,u_1^*,u_2)\le J_2(t,u_1^*,u_2^*)\) for all \(u_2\in \mathcal {A}_2\).

If it exists, we call such a pair \((u_1^*,u_2^*)\) a Nash equilibrium. Intuitively, player I controls \(u_1\) and player II controls \(u_2\). We assume that each player knows the equilibrium strategy of the other player and does not gain anything by changing his strategy unilaterally. If each player is making the best decision he can, based on the other player’s decision, then we say that the two players are in Nash equilibrium.

3 A stochastic maximum principle for Markov regime-switching forward–backward stochastic differential games

In this section, we derive the Nash equilibrium for Problem 2.1 based on a stochastic maximum principle for Markov regime-switching Forward–backward stochastic differential equations.

Define the Hamiltonians

$$\begin{aligned} H_i:[0,T] \times \mathbb {R}\times \mathbb {S}\times \mathbb {R}^2\times \mathcal {R}\times \mathbb {R} \times \mathbb {A}_1\times \mathbb {A}_2 \times \mathbb {R}^3 \times \mathcal {R} \times \mathbb {R} \longrightarrow \mathbb {R}, \end{aligned}$$

by

$$\begin{aligned}&H_i\left( t,x,e_n,y,z,k,v,u_1,u_2,a,p,q,r(\cdot ),w\right) \nonumber \\&\quad := f_i(t,x,e_n,u_1,u_2)+a g_i(t,x,e_n,y,z,k,v,u_1,u_2)+ p_i b(t,x,e_n,u_1,u_2) \,\nonumber \\&\quad \quad \;+\,q_i\sigma (t,x,e_n,u_1,u_2)+\int _{\mathbb {R}_0}r_i(\zeta )\gamma (t,x,e_n,u_1,u_2,\zeta )\nu _\alpha (\mathrm {d}\zeta )\nonumber \\&\quad \quad \; +\,\sum _{j=1}^D\eta _j(t,x,e_n,u_1,u_2)w_n^j(t)\lambda _{nj} ,\quad i=1,2 \end{aligned}$$
(3.1)

where \(\mathcal {R} \) denotes the set of all functions \(k:[0,T]\times \mathbb {R}_0 \rightarrow \mathbb {R}\) for which the integral in (3.1) converges. An example of such a set is \(L^2(\nu _\alpha )\). We suppose that \(H_i,\,i=1,2\), is Fréchet differentiable in the variables x, y, z, k, v, u and that \(\nabla _k H_i(t,\zeta ),\,i=1,2\), is a random measure which is absolutely continuous with respect to \(\nu \). Next, we define the associated adjoint processes \(A_i(t),\,p_i(t),\,q_i(t), r_i(t,\cdot )\) and \(w_i(t)\), for \(t\in [0,T]\) and \(\zeta \in \mathbb {R}_0\), by the following Forward–backward SDEs:

  1.

    The Markovian regime-switching forward SDE in \(A_i(t); \,i=1,2\)

    $$\begin{aligned} \left\{ \begin{array}{llll} \mathrm {d}A_i (t) &{}=&{} \dfrac{\partial H_i}{\partial y} (t) \,\mathrm {d}t + \dfrac{\partial H_i}{\partial z} (t) \mathrm {d}B(t)+ \displaystyle \int _{\mathbb {R}_0} \dfrac{\mathrm {d}\nabla _k H_i}{\mathrm {d}\nu (\zeta )} (t,\zeta ) \,\widetilde{N}_\alpha (\mathrm {d}\zeta ,\mathrm {d}t)\\ &{}&{} + \nabla _vH_i(t)\cdot \mathrm {d}\widetilde{\Phi }(t);\,\,\,t\in [0,T], \\ A_i(0) &{}=&{} \psi _i^\prime (Y(0)). \end{array}\right. \end{aligned}$$
    (3.2)

    Here and in what follows, we use the notation

    $$\begin{aligned}&\dfrac{\partial H_i}{\partial y} (t) = \dfrac{\partial H_i}{\partial y} (t,X(t),\alpha (t),u_1(t),u_2(t),Y_i(t), Z_i(t), K_i(t,\cdot ),V_i(t),A_i(t),\\&\quad p_i(t),q_i(t),r_i(t,\cdot ),w_i(t)), \end{aligned}$$

    etc, \(\dfrac{\mathrm {d}\nabla _k H_i}{\mathrm {d}\nu (\zeta )} (t,\zeta ) \) is the Radon-Nikodym derivative of \( \nabla _k H_i(t,\zeta )\) with respect to \(\nu (\zeta )\) and \(\nabla _v H_i(t)\cdot \mathrm {d}\widetilde{\Phi }(t)=\sum _{j=1}^D \dfrac{\partial H_i}{\partial v^j} (t)\mathrm {d}\widetilde{\Phi }_j(t)\) with \(V_i^j=V_i(t,e_j)\).

  2.

    The Markovian regime-switching BSDE in \((p_i(t),q_i(t),r_i(t,\cdot ),w_i(t)); \,i=1,2\)

    $$\begin{aligned} \left\{ \begin{array}{llll} \mathrm {d}p_i (t) &{}=&{} - \dfrac{\partial H_i}{\partial x} (t) \mathrm {d}t + q_i(t)\,\mathrm {d}B(t)+ \displaystyle \int _{\mathbb {R}_0} r_i (t,\zeta ) \,\widetilde{N}_\alpha (\mathrm {d}\zeta ,\mathrm {d}t)\\ &{}&{}+\, w_i(t)\cdot \mathrm {d}\widetilde{\Phi }(t);\,\,\,t\in [0,T], \\ p_i (T) &{}=&{} \dfrac{\partial \varphi _i}{\partial x}(X(T),\alpha (T))\,+A_i(T) \dfrac{\partial h_i}{\partial x} (X(T),\alpha (T)). \end{array}\right. \end{aligned}$$
    (3.3)
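The following schematic sketch (not part of the paper's formal development) shows how the terms of the Hamiltonian (3.1) are assembled for fixed arguments; the coefficient functions, the jump grid and the quadrature weights approximating \(\nu _\alpha (\mathrm {d}\zeta )\) are all placeholders to be supplied by the user.

```python
# Schematic assembly of the Hamiltonian H_i in (3.1) for fixed arguments.
# f, g, b, sigma, gamma, eta are user-supplied callables; zeta_grid and
# nu_weights give a quadrature rule approximating nu_alpha(d zeta); lam_row
# is the row (lambda_{n1},...,lambda_{nD}) of the rate matrix and w_row is
# (w^1,...,w^D).  Everything here is illustrative only.
def hamiltonian(t, x, e_n, y, z, k, v, u1, u2, a, p, q, r, w_row, lam_row,
                f, g, b, sigma, gamma, eta, zeta_grid, nu_weights):
    val = f(t, x, e_n, u1, u2)
    val += a * g(t, x, e_n, y, z, k, v, u1, u2)
    val += p * b(t, x, e_n, u1, u2)
    val += q * sigma(t, x, e_n, u1, u2)
    # integral term: int r(zeta) gamma(t,x,e_n,u1,u2,zeta) nu_alpha(d zeta)
    val += sum(wgt * r(zz) * gamma(t, x, e_n, u1, u2, zz)
               for zz, wgt in zip(zeta_grid, nu_weights))
    # regime-switching term: sum_j eta_j(t,x,e_n,u1,u2) w^j lambda_{nj}
    val += sum(eta(t, x, e_n, u1, u2, j) * w_row[j] * lam_row[j]
               for j in range(len(lam_row)))
    return val
```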

3.1 A sufficient maximum principle

In what follows, we give the sufficient maximum principle.

Theorem 3.1

(Sufficient maximum principle for Regime-switching FBSDE nonzero-sum games) Let \((\widehat{u}_1,\widehat{u}_2)\in \mathcal {A}_1\times \mathcal {A}_2 \) with corresponding solutions \(\widehat{X}(t),(\widehat{Y}_i(t), \widehat{Z}_i(t), \widehat{K}_i(t,\zeta ), \widehat{V}_i(t)), \widehat{A}_i(t),(\widehat{p}_i(t),\widehat{q}_i(t),\widehat{r}_i(t,\zeta ),\widehat{w}_i(t))\) of (2.5), (2.7), (3.2) and (3.3) respectively for \(i=1,2\). Suppose that the following holds:

  1.

    For each \(e_n \in \mathbb {S} \), the functions

    $$\begin{aligned} x \mapsto h_i(x, e_n),\quad \,x \mapsto \varphi _i(x, e_n),\quad \,y \mapsto \psi _i(y), \end{aligned}$$
    (3.4)

    are concave for \(i=1,2\).

  2.

    The functions

    $$\begin{aligned}&\widetilde{\mathcal {H}}_1(t,x,e_n,y,z,k,v)\nonumber \\&\quad =\,\underset{\mu _1\in \mathcal {A}_1 }{{{\mathrm{ess \text { } sup}}}} E\Big [ H_1( t,x,e_n,y,z,k,v,\mu _1,\widehat{u}_2(t),\widehat{A}_1(t),\widehat{p}_1(t),\widehat{q}_1(t),\nonumber \\&\quad \quad \quad \widehat{r}_1(t,\cdot ),\widehat{w}_1(t))\Big | \mathcal {E}^{(1)}_t\Big ] \end{aligned}$$
    (3.5)

    and

    $$\begin{aligned}&\widetilde{\mathcal {H}}_2(t,x,e_n,y,z,k,v)\nonumber \\&\quad =\,\underset{\mu _2\in \mathcal {A}_2 }{{{\mathrm{ess \text { } sup}}}} E\Big [ H_2( t,x,e_n,y,z,k,v,\widehat{u}_1(t),\mu _2,\widehat{A}_2(t),\widehat{p}_2(t),\widehat{q}_2(t),\nonumber \\&\quad \quad \quad \widehat{r}_2(t,\cdot ),\widehat{w}_2(t))\Big | \mathcal {E}^{(2)}_t\Big ] \end{aligned}$$
    (3.6)

    are all concave for all \((t,e_n) \in [0,T]\times \mathbb {S}\) a.s.

  3.
    $$\begin{aligned} E\Big [\hat{H}_1(t,\widehat{u}_1(t),\widehat{u}_2(t))) \Big . \Big | \mathcal {E}^{(1)}_t\Big ]=\underset{\mu _1\in \mathcal {A}_1 }{{{\mathrm{ess \text { } sup}}}}&\Big \{E\Big [\hat{H}_1(t,\mu _1,\widehat{u}_2(t)) \Big . \Big | \mathcal {E}^{(1)}_t\Big ]\Big \} \end{aligned}$$
    (3.7)

    for all \(t\in [0,T]\), a.s. and

    $$\begin{aligned} E\Big [ \hat{H}_2(t,\widehat{u}_1(t),\widehat{u}_2(t)) \Big . \Big | \mathcal {E}^{(2)}_t\Big ]=\underset{\mu _2\in \mathcal {A}_2 }{{{\mathrm{ess \text { } sup}}}}\Big \{E\Big [\hat{H}_2(t,\widehat{u}_1(t),\mu _2(t)) \Big . \Big | \mathcal {E}^{(2)}_t\Big ]\Big \} \end{aligned}$$
    (3.8)

    for all \(t\in [0,T]\), a.s. Here

    $$\begin{aligned}&\hat{H}_i(t,u_1(t),u_2(t))\\&\quad =\,H_i(t,\widehat{X}(t),\alpha (t),\widehat{Y}_i(t), \widehat{Z}_i(t), \widehat{K}_i(t,\cdot ),\widehat{V}_i(t),u_1(t),u_2(t),\widehat{A}_i(t),\widehat{p}_i(t), \widehat{q}_i(t),\\&\quad \quad \quad \;\widehat{r}_i(t,\cdot ),\widehat{w}_i(t)) \end{aligned}$$

    for \(i=1,2.\)

  4.

    \(\frac{\mathrm {d}}{\mathrm {d}\nu }\nabla _k\widehat{H}_i(t,\xi )>-1\) for \(i=1,2.\)

  5.

    In addition, the integrability condition

    $$\begin{aligned}&E\left[ \int _0^T\left\{ \widehat{p}_i^2(t) \left( \left( \sigma (t)-\widehat{\sigma }(t)\right) ^2+ {\int }_{\mathbb {R}_0}( \gamma (t,\zeta )-\widehat{\gamma } (t,\zeta ) )^2\,\nu _\alpha (\mathrm {d}\zeta )\right. \right. \right. \nonumber \\&\left. \quad \quad +\,\sum _{j=1}^D\left( \eta _j(t)-\widehat{\eta }_j(t) \right) ^2\lambda _{j}(t) \right) \nonumber \\&\quad \quad +\,(X(t)-\widehat{X}(t))^2 \left( \widehat{q}_i^2(t)+ {\int }_{\mathbb {R}_0}\widehat{r}_i^2 (t,\zeta ) \nu _\alpha (\mathrm {d}\zeta )+\sum _{j=1}^D(w^j_i)^2(t)\lambda _{j}(t) \right) \nonumber \\&\quad \quad +\,(Y_i(t)-\widehat{Y}_i(t))^2 \left( \left( \dfrac{\partial \widehat{H}_i}{\partial z} \right) ^2(t) + {\int }_{\mathbb {R}_0} \left\| \nabla _k \widehat{H}_i(t,\zeta )\right\| ^2 \nu _\alpha (\mathrm {d}\zeta )\right. \nonumber \\&\left. \quad \quad +\,\sum _{j=1}^D \left( \dfrac{\partial \widehat{H}_i}{\partial v^j} \right) ^2(t) \lambda _{j}(t) \right) \nonumber \\&\quad \quad \Big . \Big . +\,\widehat{A}_i^2(t) \Big ( (Z_i(t)-\widehat{Z}_i(t))^2+ {\int }_{\mathbb {R}_0}( K_i (t,\zeta )-\widehat{K}_i (t,\zeta ) )^2\nu _\alpha (\mathrm {d}\zeta )\nonumber \\&\quad \quad \left. +\,\sum _{j=1}^D(V_i^j(t)-\widehat{V}_i^j(t) )^2\lambda _{j}(t) \Big )\Big \} \mathrm {d}t \right] <\infty \end{aligned}$$
    (3.9)

    holds for \(i=1,2\).

Then \(\widehat{u}=(\widehat{u}_1(t),\widehat{u}_2(t))\) is a Nash equilibrium for (2.5), (2.7) and (2.8).

Proof of Theorem 3.1

See “Appendix”. \(\square \)

Remark 3.2

In the above Theorem and in its proof, we have used the following shorthand notation: for \( i = 1\), the processes corresponding to \(u=( u_1,\hat{u}_2)\) are given, for example, by \(X(t) = X^{(u_1,\hat{u}_2)}(t)\) and \(Y_1(t) = Y_1^{(u_1,\hat{u}_2)}(t)\), and the processes corresponding to \(u= (\hat{u}_1,\hat{u}_2)\) are \(\hat{X}(t) = X^{(\hat{u}_1,\hat{u}_2)}(t)\) and \(\hat{Y}_1(t) = Y_1^{(\hat{u}_1,\hat{u}_2)}(t) \). Similar notation is used for \(i=2\). The integrability condition (3.9) ensures the existence of the stochastic integrals when applying the Itô formula in the proof of the Theorem.

Remark 3.3

Let V be an open subset of a Banach space \(\mathcal {X}\) and let \(F: V \rightarrow \mathbb {R}\).

  • We say that F has a directional derivative (or Gateaux derivative) at \(x\in V\) in the direction \(y\in \mathcal {X}\) if

    $$\begin{aligned} D_yF(x):=\underset{\varepsilon \rightarrow 0}{\lim } \frac{1}{\varepsilon }(F(x + \varepsilon y)-F(x)) \text { exists.} \end{aligned}$$
  • We say that F is Fréchet differentiable at \(x \in V\) if there exists a linear map

    $$\begin{aligned} L:\mathcal {X} \rightarrow \mathbb {R} \end{aligned}$$

    such that

    $$\begin{aligned} \underset{\underset{h \in \mathcal {X}}{h \rightarrow 0}}{\lim } \frac{1}{\Vert h\Vert }|F(x+h)-F(x)-L(h)|=0. \end{aligned}$$

    In this case we call L the Fréchet derivative of F at x, and we write

    $$\begin{aligned} L=\nabla _x F. \end{aligned}$$
  • If F is Fréchet differentiable, then F has a directional derivative in all directions \(y \in \mathcal {X}\) and

    $$\begin{aligned} D_yF(x)= \nabla _x F(y). \end{aligned}$$
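As a simple example related to the term \(\nabla _k H_i\) used above, take \(\mathcal {X}=L^2(\nu )\) and, for a fixed \(\gamma \in L^2(\nu )\), \(F(k)=\int _{\mathbb {R}_0}k(\zeta )\gamma (\zeta )\,\nu (\mathrm {d}\zeta )\). Since F is linear, for any direction \(h\in \mathcal {X}\),

$$\begin{aligned} F(k+h)-F(k)=\int _{\mathbb {R}_0}h(\zeta )\gamma (\zeta )\,\nu (\mathrm {d}\zeta ), \end{aligned}$$

so \(\nabla _k F(h)=\int _{\mathbb {R}_0}h(\zeta )\gamma (\zeta )\,\nu (\mathrm {d}\zeta )\); that is, \(\nabla _k F\) can be identified with the measure \(\gamma (\zeta )\nu (\mathrm {d}\zeta )\), which is absolutely continuous with respect to \(\nu \) with Radon-Nikodym density \(\gamma \). This is the sense in which \(\dfrac{\mathrm {d}\nabla _k H_i}{\mathrm {d}\nu }\) appears in the adjoint equation (3.2).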

3.2 An equivalent maximum principle

The concavity condition on the Hamiltonians does not hold in many applications. In this section, we shall prove an equivalent stochastic maximum principle which does not require this assumption. We shall assume the following:

Assumption A.1

For all \(t_0\in [0,T]\) and all bounded \(\mathcal {E}^{(i)}_{t_0}\)-measurable random variable \(\theta _i(\omega )\), the control process \(\beta _i\) defined by

$$\begin{aligned} \beta _i(t):= \chi _{]t_0,T[}(t)\theta _i(\omega );\,\,t\in [0,T], \end{aligned}$$
(3.10)

belongs to \(\mathcal {A}_i,\, i=1,2\).

Assumption A.2

For all \(u_i \in \mathcal {A}_i\) and all bounded \(\beta _i \in \mathcal {A}_i\), there exists \(\delta _i>0\) such that

$$\begin{aligned} \widetilde{u}_i(t):=u_i(t)+\ell \beta _i(t) \,\,t\in [0,T] , \end{aligned}$$
(3.11)

belongs to \(\mathcal {A}_i\) for all \(\ell \in ]-\delta _i,\delta _i[,\, i=1,2\).

Assumption A.3

For all bounded \(\beta _i \in \mathcal {A}_i\), the derivative processes

$$\begin{aligned} X_1(t)&=\dfrac{\mathrm {d}}{\mathrm {d}\ell }X^{(u_1+\ell \beta _1,u_2)}(t)\Big . \Big |_{\ell =0}; X_2(t)=\dfrac{\mathrm {d}}{\mathrm {d}\ell }X^{(u_1,u_2+\ell \beta _2)}(t)\Big . \Big |_{\ell =0};\\ y_1(t)&=\dfrac{\mathrm {d}}{\mathrm {d}\ell }Y_1^{(u_1+\ell \beta _1,u_2)}(t)\Big . \Big |_{\ell =0}; y_2(t)=\dfrac{\mathrm {d}}{\mathrm {d}\ell }Y_2^{(u_1,u_2+\ell \beta _2)}(t)\Big . \Big |_{\ell =0};\\ z_1(t)&=\dfrac{\mathrm {d}}{\mathrm {d}\ell }Z_1^{(u_1+\ell \beta _1,u_2)}(t)\Big . \Big |_{\ell =0}; z_2(t)=\dfrac{\mathrm {d}}{\mathrm {d}\ell }Z_2^{(u_1,u_2+\ell \beta _2)}(t)\Big . \Big |_{\ell =0};\\ k_1(t,\zeta )&=\dfrac{\mathrm {d}}{\mathrm {d}\ell }K_1^{(u_1+\ell \beta _1,u_2)}(t,\zeta )\Big . \Big |_{\ell =0}; k_2(t,\zeta )=\dfrac{\mathrm {d}}{\mathrm {d}\ell }K_2^{(u_1,u_2+\ell \beta _2)}(t,\zeta )\Big . \Big |_{\ell =0};\\ v_1^j(t)&=\dfrac{\mathrm {d}}{\mathrm {d}\ell }V_1^{j,{(u_1+\ell \beta _1,u_2)}}(t)\Big . \Big |_{\ell =0},\quad \,j=1,\ldots , n; v_2^j(t)\\&=\dfrac{\mathrm {d}}{\mathrm {d}\ell }V_2^{j,{(u_1,u_2+\ell \beta _1)}}(t)\Big . \Big |_{\ell =0},\quad \,j=1,\ldots , n \end{aligned}$$

exist and belong to \(L^2([0,T] \times \Omega )\).

It follows from (2.5) and (2.7) that

$$\begin{aligned} \left\{ \begin{array}{llll} \mathrm {d}X_1(t) &{}=&{}X_1(t) \left\{ \dfrac{\partial b}{\partial x}(t)\mathrm {d}t+\dfrac{\partial \sigma }{\partial x}(t) \mathrm {d}B(t)+ \displaystyle \int _{\mathbb {R}_0} \dfrac{\partial \gamma }{\partial x}(t,\zeta )\widetilde{N}_\alpha (\mathrm {d}t,\mathrm {d}\zeta ) + \dfrac{\partial \eta }{\partial x}(t)\cdot \mathrm {d}\widetilde{\Phi }(t) \right\} \\ &{}&{}+\,\beta _1(t)\left\{ \dfrac{\partial b}{\partial u_1}(t)\mathrm {d}t+\dfrac{\partial \sigma }{\partial u_1}(t) \mathrm {d}B(t)+ \displaystyle \int _{\mathbb {R}_0} \dfrac{\partial \gamma }{\partial u_1}(t,\zeta )\widetilde{N}_\alpha (\mathrm {d}t,\mathrm {d}\zeta ) +\dfrac{\partial \eta }{\partial u_1}(t)\cdot \mathrm {d}\widetilde{\Phi }(t) \right\} ;\,\,t\in (0,T] \\ X_1(0)&{}=&{}0 \end{array}\right. \nonumber \\ \end{aligned}$$
(3.12)

and

$$\begin{aligned} \left\{ \begin{array}{llll} \mathrm {d}y_1(t)&{}=&{}-\left\{ \dfrac{\partial g_1}{\partial x}(t)X_1(t)+\dfrac{\partial g_1}{\partial y}(t)y_1(t)+\dfrac{\partial g_1}{\partial z}(t)z_1(t)+\displaystyle \int _{\mathbb {R}_0}\nabla _k g_1 (t)k_1(t,\zeta )\nu _\alpha (\mathrm {d}\zeta )\Big . \right. \\ &{}&{}\quad \left. +\,\sum \limits _{j=1}^D \dfrac{\partial g_1}{\partial v_1^j}(t)v_1^j(t)\lambda _j(t)+\dfrac{\partial g_1}{\partial u}(t)\beta _1(t)\right\} \mathrm {d}t +z_1(t)\,\mathrm {d}B(t) \\ &{}&{}\quad +\,\displaystyle \int _{\mathbb {R}_0}k_1(t,\zeta ) \widetilde{N}_\alpha (\mathrm {d}\zeta , \mathrm {d}t) + v_1(t)\cdot \mathrm {d}\widetilde{\Phi }(t) ;\quad \,t\in [0,T]\\ y_1(T)&{}=&{}\dfrac{\partial h_1}{\partial x}(X(T),\alpha (T))X_1(T) . \end{array}\right. \end{aligned}$$
(3.13)

We can obtain \(\mathrm {d}X_2(t)\) and \(\mathrm {d}y_2(t)\) in a similar way.

Remark 3.4

For sufficient conditions for the existence and uniqueness of solutions to (3.12) and (3.13), the reader may consult Peng (1993, Eq. 4.1) (in the case of diffusion state processes).

As an example, a set of sufficient conditions under which (3.12) and (3.13) admit a unique solution is as follows:

  1.

    Assume that the coefficients \(b,\sigma , \gamma , \eta , g_i, h_i,f_i, \psi _i\) and \(\phi _i\) for \(i=1,2\) are continuous with respect to their arguments and are continuously differentiable with respect to (xyzkvu). (Here, the dependence of \(g_i\) and \(f_i\) on k is through \(\int _{\mathbb {R}_0}k(\zeta )\rho (t,\zeta )\nu (\mathrm {d}\zeta )\), where \(\rho \) is a measurable function satisfying \(0\le \rho (t,\zeta )\le c(1\wedge |\zeta |), \text { } \forall \zeta \in \mathbb {R}_0\). Hence the differentiability in this argument is in the Fréchet sense.)

  2.

    The derivatives of \(b,\sigma , \gamma , \eta \) with respect to xu, the derivative of \(h_i,\,i=1,2\) with respect to x and the derivatives of \(g_i,\,i=1,2\) with respect to xyzkvu are bounded.

  3.

    The derivatives of \(f_i,\,i=1,2\) with respect to xu are bounded by \(C(1+|x|+|u|)\).

  4.

    The derivatives of \(\psi _i\) and \(\phi _i\) with respect to x are bounded by \(C(1+|x|).\)

We can state the following equivalent maximum principle:

Theorem 3.5

(Equivalent Maximum Principle) Let \(u_i\in \mathcal {A}_i\) with corresponding solutions X(t) of (2.5), \((Y_i(t),Z_i(t),K_i(t,\zeta ),V_i(t))\) of (2.7), \(A_i(t)\) of (3.2), \((p_i(t),q_i(t),r_i(t,\zeta ),w_i(t))\) of (3.3) and corresponding derivative processes \(X_i(t)\) and \((y_i(t),z_i(t),k_i(t,\zeta ),v_i(t))\) given by (3.12) and (3.13), respectively. Suppose that Assumptions A.1, A.2 and A.3 hold. Moreover, assume the following integrability conditions

$$\begin{aligned}&E\left[ \int _0^T p_i^2(t)\left\{ \left( \dfrac{\partial \sigma }{\partial x}\right) ^2(t)X^2_i(t) +\left( \dfrac{\partial \sigma }{\partial u_i}\right) ^2(t)\beta _i^2(t) \right. \right. \nonumber \\&\quad \quad +\,\int _{\mathbb {R}_0}\left( \left( \dfrac{\partial \gamma }{\partial x}\right) ^2(t,\zeta )X_i^2(t)+\left( \dfrac{\partial \gamma }{\partial u_i}\right) ^2(t,\zeta )\beta _i^2(t)\right) \nu _\alpha (\mathrm {d}\zeta )\nonumber \\&\quad \quad \Big .+\,\sum _{j=1}^D \Big (\Big (\dfrac{\partial \eta ^j}{\partial x}\Big )^2(t)x^2_i(t)+\Big (\dfrac{\partial \eta ^j}{\partial u_i}\Big )^2(t)\beta _i^2(t)\Big )\lambda _j(t)\Big \} \mathrm {d}t\nonumber \\&\quad \quad +\,\int _0^TX_i^2(t)\Big \{ q_i^2(t)+\int _{\mathbb {R}_0}r_i^2(t,\zeta )\nu _\alpha (\mathrm {d}\zeta )+\sum _{j=1}^D (\eta ^j)^2(t)\lambda _j(t)\Big \}\mathrm {d}t\Big ] <\infty \end{aligned}$$
(3.14)

and

$$\begin{aligned}&E\left[ \int _0^Ty_i^2(t) \left\{ \left( \dfrac{\partial H_i}{\partial z}\right) ^2 (t) +\int _{\mathbb {R}_0} \Vert \nabla _k H_i\Vert ^2(t,\zeta )\nu _\alpha (\mathrm {d}\zeta )+ \sum _{j=1}^D\left( \dfrac{\partial H_i}{\partial v^j}\right) ^2 (t) \lambda _j(t)\right\} \mathrm {d}t\right. \nonumber \\&\quad \quad \left. +\, \int _0^TA_i^2(t)\left\{ z_i^2(t)+\int _{\mathbb {R}_0}k_i^2(t,\zeta )\nu _\alpha (\mathrm {d}\zeta )+ \sum _{j=1}^D (v^j_i)^2(t)\lambda _j(t)\right\} \mathrm {d}t\right] <\infty \nonumber \\&\quad \text { for } \quad i=1,2. \end{aligned}$$
(3.15)

Then the following are equivalent:

  1.

    \(\dfrac{\mathrm {d}}{\mathrm {d}\ell }J_1^{(u_1+\ell \beta _1,u_2)}(t)\Big . \Big |_{\ell =0}=\dfrac{\mathrm {d}}{\mathrm {d}\ell }J_2^{(u_1,u_2+\ell \beta _2)}(t)\Big . \Big |_{\ell =0}=0\) for all bounded \(\beta _1\in \mathcal {A}_{1},\,\beta _2\in \mathcal {A}_{2}\)

  2.
    $$\begin{aligned} 0&=E\left[ \dfrac{\partial H_1}{\partial \mu _1} (t,X(t),\alpha (t),\mu _1,u_2,Y_1(t), Z_1(t), K_1(t,\cdot ),V_1(t),\right. \nonumber \\&\quad \left. A_1(t),p_1(t),q_1(t),r_1(t,\cdot ),w_1(t))\Big . \Big | \mathcal {E}^{(1)}_t\right] _{\mu _1=u_1(t)}\nonumber \\&=E\left[ \dfrac{\partial H_2}{\partial \mu _2} (t,X(t),\alpha (t),u_1,\mu _2,Y_2(t), Z_2(t), K_2(t,\cdot ),V_2(t),\right. \nonumber \\&\quad \left. A_2(t),p_2(t),q_2(t),r_2(t,\cdot ),w_2(t))\Big . \Big | \mathcal {E}^{(2)}_t\right] _{\mu _2=u_2(t)} \end{aligned}$$
    (3.16)

    for a.a. \(t \in [0,T].\)

Proof

See “Appendix”. \(\square \)

Remark 3.6

The integrability conditions (3.14) and (3.15) guarantee the existence of the stochastic integrals when applying the Itô formula in the proof of the Theorem. Note also that the result is the same if we start from \(t\ge 0\) in the performance functional, hence extending Øksendal and Sulem (2012, Theorem 2.2) to the Markov regime-switching setting.

3.3 Zero-sum Game

In this section, we solve the zero-sum Markov regime-switching Forward–backward stochastic differential game problem (or worst-case scenario optimal control problem): that is, we assume that the performance functional of Player II is the negative of that of Player I, i.e.,

$$\begin{aligned}&J(t,u_1,u_2)=J_1(t,u_1,u_2)\nonumber \\&\quad :=\,E\Big [ \int _t^T f(s,X(s),\alpha (s),u_1(s),u_2(s))\,\mathrm {d}s + \varphi (X(T),\alpha (T))\,+\,\psi (Y(t))\Big .\Big | {\mathcal {E}}_t^1\Big ]\nonumber \\&\quad =:\,-J_2(t,u_1,u_2). \end{aligned}$$
(3.17)

In this case \((u_1^*,u_2^*)\) is a Nash equilibrium iff

$$\begin{aligned} \underset{u_1 \in \mathcal {A}_1}{{{\mathrm{ess \text { } sup}}}}\,J(t,u_1,u_2^*)=J(t,u_1^*,u_2^*)=\underset{u_2 \in \mathcal {A}_2}{{{\mathrm{ess \text { } inf}}}}\,J(t,u_1^*,u_2). \end{aligned}$$
(3.18)

On one hand (3.18) implies that

$$\begin{aligned} \underset{u_2 \in \mathcal {A}_2}{{{\mathrm{ess \text { } inf}}}}(\underset{u_1 \in \mathcal {A}_1}{{{\mathrm{ess \text { } sup}}}}\,J(t,u_1,u_2))&\le \underset{u_1 \in \mathcal {A}_1}{{{\mathrm{ess \text { } sup}}}}\,J(t,u_1,u_2^*)\\&=J(t,u_1^*,u_2^*)=\underset{u_2 \in \mathcal {A}_2}{{{\mathrm{ess \text { } inf}}}}\,J(t,u_1^*,u_2)\\&\le \underset{u_1 \in \mathcal {A}_1}{{{\mathrm{ess \text { } sup}}}}( \underset{u_2 \in \mathcal {A}_2}{{{\mathrm{ess \text { } inf}}}}\,J(t,u_1,u_2)). \end{aligned}$$

On the other hand we always have \({{\mathrm{ess \text { } inf}}}({{\mathrm{ess \text { } sup}}}) \ge {{\mathrm{ess \text { } sup}}}({{\mathrm{ess \text { } inf}}})\). Hence, if \((u_1^*,u_2^*)\) is a saddle point, then

$$\begin{aligned} \underset{u_2 \in \mathcal {A}_2}{{{\mathrm{ess \text { } inf}}}}(\underset{u_1 \in \mathcal {A}_1}{{{\mathrm{ess \text { } sup}}}}\,J(t,u_1,u_2))=\underset{u_1 \in \mathcal {A}_1}{{{\mathrm{ess \text { } sup}}}}( \underset{u_2 \in \mathcal {A}_2}{{{\mathrm{ess \text { } inf}}}}\,J(t,u_1,u_2)). \end{aligned}$$

The zero-sum Markov regime-switching Forward–backward stochastic differential games problem is therefore the following:

Problem 3.7

Find \(u_1^*\in \mathcal {A}_1\) and \(u_2^*\in \mathcal {A}_2\) (if they exist) such that

$$\begin{aligned} \underset{u_2 \in \mathcal {A}_2}{{{\mathrm{ess \text { } inf}}}}(\underset{u_1 \in \mathcal {A}_1}{{{\mathrm{ess \text { } sup}}}}\,J(t,u_1,u_2))=J(t,u_1^*,u_2^*)=\underset{u_1 \in \mathcal {A}_1}{{{\mathrm{ess \text { } sup}}}}( \underset{u_2 \in \mathcal {A}_2}{{{\mathrm{ess \text { } inf}}}}\,J(t,u_1,u_2)). \end{aligned}$$
(3.19)

When it exists, a control \((u_1^*,u_2^*)\) satisfying (3.19) is called a saddle point. The objectives of the players are opposite: there is a single payoff \(J(t,u_1, u_2)\), which is a reward for Player I and a cost for Player II.

Remark 3.8

As in the nonzero-sum case, we give the result for \(t=0\) and get the result for \(t\in ]0,T]\) as a corollary. The results obtained in this section generalize the ones in Øksendal and Sulem (2012), Bordigoni et al. (2005), Faidi et al. (2011), Jeanblanc et al. (2012) and Elliott and Siu (2011).

In the case of a zero-sum game, there is only one value function for the players and therefore Theorem 3.1 becomes

Theorem 3.9

(Sufficient maximum principle for Regime-switching FBSDE zero-sum games) Let \((\widehat{u}_1,\widehat{u}_2)\in \mathcal {A}_1\times \mathcal {A}_2 \) with corresponding solutions \(\widehat{X}(t),(\widehat{Y}(t), \widehat{Z}(t), \widehat{K}(t,\zeta ), \widehat{V}(t)), \widehat{A}(t),(\widehat{p}(t),\widehat{q}(t),\widehat{r}(t,\zeta ),\widehat{w}(t))\) of (2.5), (2.7), (3.2) and (3.3) respectively. Suppose that the following hold:

  1.

    For each \(e_n \in \mathbb {S}\), the functions

    $$\begin{aligned} x \mapsto \varphi (x, e_n) \text { and } y \mapsto \psi (y), \end{aligned}$$
    (3.20)

    are affine and \(x \mapsto h(x, e_n)\) is concave.

  2.

    The functions

    $$\begin{aligned}&\widetilde{\mathcal {H}}(t,x,e_n,y,z,k,v)\nonumber \\&\quad =\,\underset{\mu _1\in \mathcal {A}_1 }{{{\mathrm{ess \text { } sup}}}} E\Big [ H( t,x,e_n,y,z,k,v,\mu _1,\widehat{u}_2(t),\widehat{A}(t),\nonumber \\&\quad \quad \;\widehat{p}(t),\widehat{q}(t),\widehat{r}(t,\cdot ),\widehat{w}(t))\Big | \mathcal {E}^{(1)}_t\Big ] \end{aligned}$$
    (3.21)

    is concave for all \((t,e_n) \in [0,T]\times \mathbb {S}\) a.s. and

    $$\begin{aligned}&\widetilde{\mathcal {H}}(t,x,e_n,y,z,k,v)\nonumber \\&\quad =\underset{\mu _2\in \mathcal {A}_2 }{{{\mathrm{ess \text { } inf}}}} E\Big [ H( t,x,e_n,y,z,k,v,\widehat{u}_1(t),\mu _2,\widehat{A}(t),\nonumber \\&\quad \widehat{p}(t),\widehat{q}(t),\widehat{r}(t,\cdot ),\widehat{w}(t))\Big | \mathcal {E}^{(2)}_t\Big ] \end{aligned}$$
    (3.22)

    is convex for all \((t,e_n) \in [0,T]\times \mathbb {S}\) a.s.

  3.
    $$\begin{aligned} E\Big [\hat{H}(t,\widehat{u}_1(t),\widehat{u}_2(t))) \Big . \Big | \mathcal {E}^{(1)}_t\Big ]=\underset{\mu _1\in \mathcal {A}_1 }{{{\mathrm{ess \text { } sup}}}}&\Big \{E\Big [\hat{H}(t,\mu _1,\widehat{u}_2(t)) \Big . \Big | \mathcal {E}^{(1)}_t\Big ]\Big \} \end{aligned}$$
    (3.23)

    for all    \(t\in [0,T]\), a.s. and

    $$\begin{aligned} E\Big [ \hat{H}(t,\widehat{u}_1(t),\widehat{u}_2(t)) \Big . \Big | \mathcal {E}^{(2)}_t\Big ]=\underset{\mu _2\in \mathcal {A}_2 }{{{\mathrm{ess \text { } inf}}}}\Big \{E\Big [\hat{H}(t,\widehat{u}_1(t),\mu _2(t)) \Big . \Big | \mathcal {E}^{(2)}_t\Big ]\Big \} \end{aligned}$$
    (3.24)

    for all \(t\in [0,T]\), a.s. Here

    $$\begin{aligned}&\hat{H}(t,u_1(t),u_2(t))\\&\quad =\,H(t,\widehat{X}(t),\alpha (t),\widehat{Y}(t), \widehat{Z}(t), \widehat{K}(t,\cdot ),\widehat{V}(t),u_1(t),u_2(t),\widehat{A}(t),\widehat{p}(t),\\&\quad \quad \;\widehat{q}(t),\widehat{r}(t,\cdot ),\widehat{w}(t)) \end{aligned}$$
  4.

    \(\frac{\mathrm {d}}{\mathrm {d}\nu }\nabla _k\widehat{g}(t,\xi )>-1\).

  5.

    In addition, the integrability condition (3.9) is satisfied for \(\widehat{p}_i=\widehat{p}\), etc.

Then \(\widehat{u}=(\widehat{u}_1(t),\widehat{u}_2(t))\) is a saddle point for \(J(u_1, u_2)\).

The equivalent maximum principle (Theorem 3.5) is then reduced to

Theorem 3.10

(Equivalent maximum principle for zero-sum games) Let \(u=(u_1,u_2)\in \mathcal {A}_1\times \mathcal {A}_2\) with corresponding solutions X(t) of (2.5), \((Y(t),Z(t),K(t,\zeta ),V(t))\) of (2.7), A(t) of (3.2), \((p(t),q(t),r(t,\zeta ),w(t))\) of (3.3) and corresponding derivative processes \(X_i(t)\) and \((y_i(t),z_i(t),k_i(t,\zeta ),v_i(t))\), \(i=1,2\), given by (3.12) and (3.13), respectively. Assume that the conditions of Theorem 3.5 are satisfied. Then the following statements are equivalent:

  1.
    $$\begin{aligned} \dfrac{\mathrm {d}}{\mathrm {d}\ell }J^{(u_1+\ell \beta _1,u_2)}(t)\Big . \Big |_{\ell =0}=\dfrac{\mathrm {d}}{\mathrm {d}\ell }J^{(u_1,u_2+\ell \beta _2)}(t)\Big . \Big |_{\ell =0}=0 \end{aligned}$$
    (3.25)

    for all bounded    \(\beta _1\in \mathcal {A}_1,\,\,\,\beta _2\in \mathcal {A}_2\).

  2.
    $$\begin{aligned}&E\Big [\dfrac{\partial H}{\partial \mu _1} (t,\mu _1(t),u_2(t))\Big . \Big | \mathcal {E}_t^{(1)}\Big ]_{\mu _1=u_1(t)} =E\Big [\dfrac{\partial H}{\partial \mu _2} (t,u_1(t),\mu _2(t))\Big . \Big | \mathcal {E}_t^{(2)}\Big ]_{\mu _2=u_2(t)}\nonumber \\&\quad =0 \end{aligned}$$
    (3.26)

    for   a.a \( t\in [0,T]\), where

    $$\begin{aligned}&H(t,u_1(t),u_2(t))\\&=H(t,X(t),\alpha (t),u_1,u_2,Y(t), Z(t), K(t,\cdot ),V(t),A(t),p(t),\\&\quad q(t),r(t,\cdot ),w(t)). \end{aligned}$$

Proof

It follows directly from Theorem 3.5. \(\square \)

Corollary 3.11

If \(u=(u_1,u_2)\in \mathcal {A}_1\times \mathcal {A}_2\) is a Nash equilibrium for the zero-sum game in Theorem 3.10, then the equalities in (3.26) hold.

Proof

If \(u=(u_1,u_2)\in \mathcal {A}_1\times \mathcal {A}_2\) is a Nash equilibrium, then (3.25) holds by (3.18), and hence (3.26) follows from Theorem 3.10. \(\square \)

4 Applications

4.1 Application to robust utility maximization with entropy penalty

In this section, we apply the results obtained in Sect. 3 to study a utility maximization problem under model uncertainty. We assume that \(\mathcal {E}^{(1)}_t=\mathcal {E}^{(2)}_t=\mathcal {F}_t\). The framework is that of Bordigoni et al. (2005). For any probability measure Q on \((\Omega ,\mathcal {F}_T)\), let

$$\begin{aligned} H(Q|P):=\left\{ \begin{array}{ll} E_Q\left[ \ln \frac{\mathrm {d}Q}{\mathrm {d}P}\right] &{}\quad \text {if}\; Q\ll P\; \text {on}\; \mathcal {F}_T\\ +\infty &{}\quad \text {otherwise} \end{array} \right. \end{aligned}$$
(4.1)

be the relative entropy of Q with respect to P.
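For a concrete feel for (4.1), if \(\Omega \) is finite and Q, P have probability vectors \((q_i)\) and \((p_i)\) with \(Q\ll P\), then \(H(Q|P)=\sum _i q_i\ln (q_i/p_i)\). A minimal numerical check, with hypothetical vectors, is the following.

```python
import numpy as np

# Relative entropy (4.1) for two hypothetical discrete measures with Q << P:
# H(Q|P) = E_Q[ln dQ/dP] = sum_i q_i * ln(q_i / p_i).
p = np.array([0.5, 0.3, 0.2])
q = np.array([0.4, 0.4, 0.2])
H_QP = float(np.sum(q * np.log(q / p)))
print("H(Q|P) =", H_QP)   # nonnegative, and zero iff Q = P
```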

We aim at finding a probability measure \(Q\in \mathcal {Q}_{\mathcal {F}}\) that minimizes the functional

$$\begin{aligned} E_{Q}\Big [\int _0^T a_0 S^{\kappa }(s)U_1(s)\mathrm {d}s+\overline{a}_0 S^{\kappa } (T)U_2(T)\Big ]+E_{Q}\Big [\mathcal {R}^{\kappa }(0,T)\Big ], \end{aligned}$$
(4.2)

where

$$\begin{aligned} \mathcal {Q}_{\mathcal {F}}:=\Big \{Q \,\big |\, Q\ll P \text { on } \mathcal {F}_T ,\,\, Q=P \text { on } \mathcal {F}_0 \text { and } H(Q|P)<+\infty \Big \} , \end{aligned}$$

with \(a_0\) and \(\overline{a}_0\) being non-negative constants; \(\kappa =(\kappa (t))_{0\le t\le T}\) a non-negative, bounded and progressively measurable process; \(U_1=(U_1(t))_{0\le t\le T}\) a progressively measurable process with \(E_{P}\Big [\exp [\gamma _1\int _0^T|U_1(t)|\mathrm {d}t]\Big ]<\infty , \, \forall \gamma _1>0\); \(U_2(T)\) an \(\mathcal {F}_T\)-measurable random variable with \(E_{P}\Big [\exp [|\gamma _1 U_2(T)|]\Big ]<\infty , \, \forall \gamma _1>0\); \(S^{\kappa }(t)=\exp (-\int _0^t\kappa (s)\mathrm {d}s) \) the discount factor; and \(\mathcal {R}^{\kappa }(t,T)\) the penalization term, representing the sum of the entropy rate and the terminal entropy, i.e.

$$\begin{aligned} \mathcal {R}^{\kappa }(t,T)=\frac{1}{S^{\kappa }(t)}\int _t^T\kappa (s)S^{\kappa }(s)\ln \frac{G_0^Q(s)}{G_0^Q(t)}\mathrm {d}s+\frac{S^{\kappa }(T)}{S^{\kappa }(t)}\ln \frac{G^Q(T)}{G_0^Q(t)}, \end{aligned}$$
(4.3)

where \(G^Q=(G^Q(t))_{0\le t\le T}\) is the RCLL P-martingale representing the density of Q with respect to P, i.e.

$$\begin{aligned} G^Q(t)=\frac{\mathrm {d}Q}{\mathrm {d}P}\Big |_{\mathcal {F}_t}. \end{aligned}$$

\(G^Q(T)\) represents the Radon-Nikodym derivative of Q with respect to P on \(\mathcal {F}_T\). More precisely, the problem is the following:

Problem 4.1

Find \(Q^*\in \mathcal {Q}_{\mathcal {F}}\) such that

$$\begin{aligned} Y^{Q^*}(t)={{\mathrm{ess \text { } inf}}}_{Q\in \mathcal {Q}_{\mathcal {F}}} Y^Q(t) \end{aligned}$$
(4.4)

with

$$\begin{aligned} Y^Q(t)&:= \frac{1}{S^{\kappa }(t)}E_{Q}\Big [\int _t^Ta_0 S^{\kappa }(s)U_1(s)\mathrm {d}s+\overline{a}_0S^{\kappa } (T)U_2(T)\Big |\mathcal {F}_t\Big ]\nonumber \\&\quad +E_{Q}\Big [\mathcal {R}^{\kappa }(t,T)\Big |\mathcal {F}_t\Big ]. \end{aligned}$$
(4.5)

In the present regime-switching jump-diffusion setup, we consider model uncertainty given by a probability measure Q having a density \((G^{\theta }(t))_{0\le t\le T}\) with respect to P, whose dynamics is given by the following stochastic differential equation:

$$\begin{aligned} \left\{ \begin{array}{llll} \,\mathrm {d}G^{\theta }(t)= G^{\theta }(t^-)\Big [\theta _0(t)\mathrm {d}B(t)+\theta _1(t)\cdot \mathrm {d}\widetilde{\Phi }(t)+\displaystyle \int _{\mathbb {R}_0}\theta _2(t,\zeta )\,\widetilde{N}_\alpha (\mathrm {d}\zeta ,\mathrm {d}t)\Big ],\quad t \in [ 0,T] \\ G^{\theta }(0) = 1. \end{array}\right. \nonumber \\ \end{aligned}$$
(4.6)

Here \(\theta = (\theta _0, \theta _1, \theta _2)\) (with \(\theta _1=(\theta _{1,1},\theta _{1,2},\ldots , \theta _{1,D})\in \mathbb {R}^D\)) may be seen as a scenario control. Denote by \(\mathcal {A}\) the set of all admissible controls \(\theta =(\theta _0,\theta _1,\theta _2)\) such that

$$\begin{aligned} E\left[ \int _0^T\left( \theta ^2_0(t)+\sum _{j=1}^D\theta _{1,j}^2(t)\lambda _j(t)+\displaystyle \int _{\mathbb {R}_0}\theta _2^2(t,\zeta )\nu _\alpha (\mathrm {d}\zeta )\right) \mathrm {d}t\right] <\infty \end{aligned}$$

and \(\theta _2(t,\zeta )\ge -1+\epsilon \) for some \(\epsilon >0.\)

Using Itô’s formula (see Zhang et al. (2012, Theorem 4.1)), one can easily check that

$$\begin{aligned} G^{\theta }(t)= & {} \exp \Big [\int _0^t\theta _0(s)\mathrm {d}B(s)-\frac{1}{2}\int _0^t\theta ^2_0(s)\mathrm {d}s+\int _0^t\int _{\mathbb {R}_0}\ln (1+\theta _2(s,\zeta ))\widetilde{N}_\alpha (\mathrm {d}\zeta ,\mathrm {d}s)\nonumber \\&+\int _0^t\int _{\mathbb {R}_0}\{\ln (1+\theta _2(s,\zeta ))-\theta _2(s,\zeta )\}\nu _\alpha (\mathrm {d}\zeta )\mathrm {d}s\nonumber \\&+\sum _{j=1}^D \int _0^t\ln ( 1+\theta _{1,j}(s))\cdot \mathrm {d}\widetilde{\Phi }_j(s)\nonumber \\&+\sum _{j=1}^D \int _0^t\{\ln ( 1+\theta _{1,j}(s))-\theta _{1,j}(s)\}\lambda _j(s)\mathrm {d}s\Big ]. \end{aligned}$$
(4.7)
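As a quick numerical sanity check of the exponential-martingale form (4.7), consider the Brownian component only (\(\theta _1=\theta _2=0\)) with a constant \(\theta _0\); then \(G^\theta \) reduces to the usual Doléans-Dade exponential and should have unit expectation. The values below are hypothetical.

```python
import numpy as np

# Monte Carlo check that E[G^theta(T)] = 1 for the Brownian-only case of (4.7):
# G^theta(T) = exp( int theta_0 dB - 0.5 * int theta_0^2 ds ), constant theta_0.
rng = np.random.default_rng(2)
theta0, T, n, n_paths = 0.3, 1.0, 500, 20000
dt = T / n
dB = rng.normal(0.0, np.sqrt(dt), size=(n_paths, n))
G_T = np.exp(theta0 * dB.sum(axis=1) - 0.5 * theta0**2 * T)
print("E[G^theta(T)] (should be close to 1):", G_T.mean())
```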

Now put \(G^{\theta }(t,s)=\frac{G^{\theta }(s)}{G^{\theta }(t)},\, \, s\ge t\). Then \((G^{\theta }(t,s))_{0\le t\le s\le T}\) satisfies

$$\begin{aligned} \left\{ \begin{array}{llll} \mathrm {d}G^{\theta }(t,s) &{}=&{} G^{\theta }(t,s^-)\Big [\theta _0(s)\mathrm {d}B(s)+\theta _1(s)\cdot \mathrm {d}\widetilde{\Phi }(s)+\displaystyle \int _{\mathbb {R}_0}\theta _2(s,\zeta )\,\widetilde{N}_\alpha (\mathrm {d}s,\mathrm {d}\zeta )\Big ],\quad s \in [ t,T] \\ G^{\theta }(t,t) &{}=&{} 1. \end{array}\right. \nonumber \\ \end{aligned}$$
(4.8)

Hence (4.5) can be rewritten as

$$\begin{aligned} Y^Q(t)&=E_{Q}\left[ \int _t^Ta_0e^{-\int _t^s\kappa (r)\mathrm {d}r}U_1(s)\mathrm {d}s+\overline{a}_0e^{-\int _t^T\kappa (r)\mathrm {d}r}U_2(T)\Big |\mathcal {F}_t\right] \nonumber \\&\quad +\,E_{Q}\left[ \int _t^T\kappa (s)e^{-\int _t^s\kappa (r)\mathrm {d}r}\ln G^{\theta }(t,s)ds+e^{-\int _t^T\kappa (r)\mathrm {d}r}\ln G^{\theta }(t,T)\Big |\mathcal {F}_t\right] \nonumber \\&=E\left[ \int _t^T a_0 G^{\theta }(t,s)e^{-\int _t^s\kappa (r)\mathrm {d}r}U_1(s)\mathrm {d}s+\overline{a}_0 G^{\theta }(t,T)e^{-\int _t^T\kappa (r)\mathrm {d}r}U_2(T)\Big |\mathcal {F}_t\right] \nonumber \\&\quad +\,E\left[ \int _t^T\kappa (s)e^{-\int _t^s\kappa (r)\mathrm {d}r}G^{\theta }(t,s)\ln G^{\theta }(t,s)\mathrm {d}s\right. \nonumber \\&\quad \left. +\,e^{-\int _t^T\kappa (r)\mathrm {d}r}G^{\theta }(t,T)\ln G^{\theta }(t,T)\Big |\mathcal {F}_t\right] . \end{aligned}$$
(4.9)

Now, define \(h_1\) by

$$\begin{aligned} h_1(\theta (t))&:= \frac{1}{2} \theta _0^2(t)+\sum _{j=1}^D\{(1+\theta _{1,j}(t))\ln (1+\theta _{1,j}(t))-\theta _{1,j}(t)\}\lambda _j(t)\nonumber \\&\quad +\,\int _{\mathbb {R}_0}\{(1+\theta _2(t,\zeta ))\ln (1+\theta _2(t,\zeta ))-\theta _2(t,\zeta )\}\nu _{\alpha }(\mathrm {d}\zeta ). \end{aligned}$$
(4.10)

Using the Itô-Lévy product rule, we have

$$\begin{aligned}&E\left[ \int _t^T\kappa (s)e^{-\int _t^s\kappa (r)\mathrm {d}r}G^{\theta }(t,s)\ln G^{\theta }(t,s)\mathrm {d}s + e^{-\int _t^T\kappa (r)\mathrm {d}r}G^{\theta }(t,T)\ln G^{\theta }(t,T)\Big |\mathcal {F}_t\right] \nonumber \\&\quad = E\left[ \int _t^Te^{-\int _t^s\kappa (r)\mathrm {d}r}G^{\theta }(t,s)h(\theta (s))ds\Big |\mathcal {F}_t\right] . \end{aligned}$$
(4.11)
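To see where (4.11) comes from, it may help to carry out the computation in the simplest case: only the Brownian component (\(\theta _1=\theta _2=0\)) and no discounting (\(\kappa \equiv 0\)); this is a special case of the general argument, not a substitute for it. From (4.8), \(\mathrm {d}G^{\theta }(t,s)=G^{\theta }(t,s)\theta _0(s)\mathrm {d}B(s)\), and Itô's formula applied to \(G\ln G\) gives

$$\begin{aligned} \mathrm {d}\big (G^{\theta }(t,s)\ln G^{\theta }(t,s)\big )=\big (\ln G^{\theta }(t,s)+1\big )G^{\theta }(t,s)\theta _0(s)\,\mathrm {d}B(s)+\tfrac{1}{2}G^{\theta }(t,s)\theta _0^2(s)\,\mathrm {d}s. \end{aligned}$$

Taking conditional expectations, the stochastic integral vanishes and \(E\big [G^{\theta }(t,T)\ln G^{\theta }(t,T)\,\big |\,\mathcal {F}_t\big ]=E\big [\int _t^T G^{\theta }(t,s)\tfrac{1}{2}\theta _0^2(s)\mathrm {d}s\,\big |\,\mathcal {F}_t\big ]\), which is (4.11) in this special case, since the penalty rate of (4.10) reduces to \(h_1(\theta )=\tfrac{1}{2}\theta _0^2\) when \(\theta _1=\theta _2=0\).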

Substituting (4.11) into (4.9) leads to

$$\begin{aligned} Y^Q(t)&=E_t\left[ \int _t^T a_0 G^{\theta }(t,s)e^{-\int _t^s\kappa (r)\mathrm {d}r}U_1(s)\mathrm {d}s+\overline{a}_0 G^{\theta }(t,T)e^{-\int _t^T\kappa (r)\mathrm {d}r}U_2(T)\right] \nonumber \\&\quad +\,E_t\Big [\int _t^T\kappa (s)e^{-\int _t^s\kappa (r)\mathrm {d}r}G^{\theta }(t,s)\ln G^{\theta }(t,s)\mathrm {d}s\nonumber \\&\quad +\,e^{-\int _t^T\kappa (r)\mathrm {d}r}G^{\theta }(t,T)\ln G^{\theta }(t,T)\Big ]\nonumber \\&=\,E_t\left[ \int _t^Te^{-\int _t^s\kappa (r)\mathrm {d}r}G^{\theta }(t,s)\left( a_0 U_1(s)+h(\theta (s))\right) \mathrm {d}s\right. \nonumber \\&\quad \left. +\,\overline{a}_0 G^{\theta }(t,T)e^{-\int _t^T\kappa (r)\mathrm {d}r}U_2(T)\right] . \end{aligned}$$
(4.12)

Here \(E_t\) denotes the conditional expectation with respect to \(\mathcal {F}_t\).

We have the following theorem.

Theorem 4.2

Suppose that the penalty function is given by (4.10). Then the optimal \(Y^{Q^*}\) is such that \((Y^{Q^*},Z,W,K)\) is the unique solution to the following quadratic BSDE

$$\begin{aligned} \left\{ \begin{array}{llll} &{}\,\mathrm {d}Y(t) = -\left[ -\kappa (t)Y(t)+a U_1(t)-Z^2(t)+\sum _{j=1}^D\lambda _j(t)(-e^{W_j}-W_j+1)\right. \\ &{}\qquad \qquad \quad \left. +\,\displaystyle \int _{\mathbb {R}_0}(-e^{-K(t,\zeta )}-K(t,\zeta )+1)\nu _{\alpha }\mathrm {d}\zeta \right. ] \mathrm {d}t +Z(t) \mathrm {d}B(t)\\ &{}\qquad \qquad \quad +\,\sum _{j=1}^D W_j(t) \mathrm {d}\widetilde{\Phi }_j(t)+\displaystyle \int _{\mathbb {R}_0}K(t,\zeta )\widetilde{N}_{\alpha }(\mathrm {d}t,\mathrm {d}\zeta )\\ &{}Y(T) = \overline{a}_0U_2(T). \end{array}\right. \end{aligned}$$
(4.13)

Moreover, the optimal measure \(Q^*\), solution of Problem 4.1, admits the Radon-Nikodym density \((G^{Q^*}(t,s))_{0\le t\le s\le T}\) given by

$$\begin{aligned} \left\{ \begin{array}{llll} \mathrm {d}G^{\theta }(t,s) = G^{\theta }(t,s^-)\left[ -Z(s)\mathrm {d}B(s)+\sum _{j=1}^D(e^{-W_j}-1)\cdot \mathrm {d}\widetilde{\Phi }_j(s)\right. \\ \quad \left. +\displaystyle \int _{\mathbb {R}_0}(e^{-K(s,\zeta )}-1)\,\widetilde{N}_\alpha (\mathrm {d}s,\mathrm {d}\zeta )\right] ,\quad s \in [ t,T] \\ G^{\theta }(t,t) = 1. \end{array}\right. \end{aligned}$$
(4.14)

Proof

Fix \(u_1\) and denote by X(T) the corresponding terminal wealth. One can see that Problem 4.1 can be obtained from our general control problem by setting \(X(t)=0,\, \forall t\in [0,T]\), \(h(X(T),\alpha (T))=\overline{a}_0U_2(T)\), \(f=0\), \(\varphi (x)=0\) and \(\psi \) equal to the identity function. Since \(h_1(\theta )\) given by (4.10) is convex in \(\theta _0, \theta _1\) and \(\theta _2\), it follows that the conditions of Theorem 3.1 are satisfied. The Hamiltonian in this case reduces to:

$$\begin{aligned} H(t,y,z,K,W)= & {} \lambda (U_1(t)+h(\theta )+\theta _0 z)+\,\sum _{j=1}^D\lambda _j\theta _{1,j}W_j\nonumber \\&+\displaystyle \int _{\mathbb {R}_0}\theta _2(\cdot ,\zeta )K(\cdot ,\zeta ) \nu _{\alpha }(d \zeta ). \end{aligned}$$
(4.15)

Minimizing H with respect to \(\theta =(\theta _0,\theta _1,\theta _2)\) gives the following first-order conditions of optimality for an optimal \(\theta ^*\):

$$\begin{aligned} \left\{ \begin{array}{llll} \frac{\partial H}{\partial \theta _0}=0 \quad \text {i.e.,}\; \theta _0^{*}(t)=-Z(t),\\ \frac{\partial H}{\partial \theta _{1,j}}=0 \quad \text {i.e.,}\;-\ln (1+\theta _{1,j}^*)(t)=-W_{1,j}(t)\quad \text { for }\; j=1,\ldots ,D,\\ \nabla _{\theta _2}H=0 \quad \text {i.e., }\;-\ln (1+\theta _2^*)(t,\zeta )=-K(\cdot ,\zeta ),\quad \nu _{\alpha }\text {- a.e.} \end{array}\right. \end{aligned}$$
(4.16)
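For instance, for the \(\theta _0\)-component, \(\partial h_1/\partial \theta _0=\theta _0\) by (4.10), so differentiating (4.15) gives, up to the multiplicative factor in front of the generator term,

$$\begin{aligned} \frac{\partial H}{\partial \theta _0}\propto \theta _0(t)+Z(t)=0\quad \Longrightarrow \quad \theta _0^{*}(t)=-Z(t), \end{aligned}$$

in agreement with the first condition in (4.16); the components \(\theta _{1,j}\) and \(\theta _2\) are handled in the same way.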

On the other hand, one can show using the product rule (see e.g., Menoukeu-Pamen 2015) that Y given by (4.12) is a solution to the following linear BSDE:

$$\begin{aligned} \left\{ \begin{array}{llll} \,\mathrm {d}Y(t) =&{}&{} -\Big [ -\kappa (t)Y(t)+a U_1(t)+ h(\theta )+\theta _0Z(t)+\sum _{j=1}^D\theta _{1,j}(t)\lambda _j(t){W_j}\\ &{}&{} +\displaystyle \int _{\mathbb {R}_0}\theta _2(t,\zeta )K(t,\zeta )\nu _{\alpha }\mathrm {d}\zeta \Big ]\mathrm {d}t+Z(t) dB(t)+ W(t)\cdot \mathrm {d}\widetilde{\Phi }(t)\\ &{}&{}+\displaystyle \int _{\mathbb {R}_0}K(t,\zeta )\widetilde{N}_{\alpha }(\mathrm {d}t,\mathrm {d}\zeta )\\ Y(T) =&{}&{} \overline{a}_0U_2(T). \end{array}\right. \end{aligned}$$
(4.17)

Using the comparison theorem for BSDEs, \(Q^*\) is an optimal measure for Problem 4.1 if \(\theta ^*\) is such that

$$\begin{aligned} g(\theta ^*)=\underset{\theta }{\min }\,g(\theta ) \end{aligned}$$
(4.18)

for each t and \(\omega \), with \(g(\theta ):=h(\theta )+\theta _0Z(t)+\sum _{j=1}^D\theta _{1,j}(t)\lambda _j(t){W_j}+\displaystyle \int _{\mathbb {R}_0}\theta _2(t,\zeta )K(t,\zeta )\nu _{\alpha }(\mathrm {d}\zeta )\). This is equivalent to the first-order conditions of optimality. Hence \((\theta ^*_0,\theta ^*_{1,1},\ldots ,\theta ^*_{1,D},\theta ^*_2)\) satisfying (4.16) will satisfy (4.18). Substituting \(\theta ^*_0,\theta ^*_{1,1},\ldots ,\theta ^*_{1,D},\theta ^*_2\) into (4.17) leads to (4.13). Furthermore, substituting \(\theta ^*_0,\theta ^*_{1,1},\ldots ,\theta ^*_{1,D},\theta ^*_2\) into (4.8) gives (4.14). The proof of the theorem is complete. \(\square \)

Remark 4.3

  • This result can be seen as an extension to the Markov regime-switching setting of Jeanblanc et al. (2012, Theorem 1) or Bordigoni et al. (2005, Theorem 2).

  • Let us mention that in the case where \((X(t))_{0\le t\le T}\) is not zero and has particular dynamics (mean-reverting or exponential Markov Lévy switching), one can use Theorem 3.1 to solve a problem of recursive robust utility maximization as in Øksendal and Sulem (2012, Section 4.2) or Menoukeu-Pamen (2015, Theorem 4.1).

4.2 Application to optimal investment of an insurance company under model uncertainty

In this section, we use our general framework to study a problem of optimal investment of an insurance company under model uncertainty. The uncertainty here is also described by a family of probability measures. Such a problem was solved in Elliott and Siu (2011) using a dynamic programming approach when the interest rate is 0. We show that the general maximum principle enables us to find the explicit optimal investment when \(r\ne 0\). We restrict ourselves to the case \(\mathcal {E}_t^{(1)}=\mathcal {E}_t^{(2)}=\mathcal {F}_t,\,\,\,t\in [0,T]\), in order to obtain explicit results. Let us mention that if \(\mathcal {E}_t^{(i)}\) is strictly contained in \(\mathcal {F}_t,\,\,\,t\in [0,T], i=1,2\), then the problem is non-Markovian and hence the dynamic programming approach used in Elliott and Siu (2011) cannot be applied.

The model is that of Elliott and Siu (2011, Section 2.1). Let \((\Omega , \mathcal {F},P) \) be a complete probability space, with P representing a reference probability measure from which a family of real-world probability measures is generated. We shall suppose that \((\Omega , \mathcal {F},P) \) is rich enough to take into account uncertainties coming from future insurance claims, fluctuations of financial prices and structural changes in economic conditions. We consider a continuous-time Markov regime-switching economic model with a bond and a stock or share index.

The evolution of the state of an economy over time is modeled by a continuous-time, finite-state, observable Markov chain \(\alpha :=\{\alpha (t),t\in [0,T];\, T<\infty \}\) on \((\Omega , \mathcal {F},P)\), taking values in the state space \(\mathbb {S}=\{e_1,e_2,\ldots ,e_D\}\), where \(D\ge 2\). We denote by \(\Lambda :=\{\lambda _{nj}:1\le n,j\le D\}\) the intensity matrix of the Markov chain under P. Hence, for each \(1\le n,j\le D,\,\,\lambda _{nj}\) is the transition intensity of the chain from state \(e_n\) to state \(e_j\) at time t. It is assumed that for \(n\ne j,\,\,\lambda _{nj}> 0\) and \(\sum _{j=1}^D \lambda _{nj}=0\), hence \(\lambda _{nn}< 0\). The dynamics of \((\alpha (t))_{0\le t\le T}\) is given in Sect. 2.

Let \(r=\{r(t)\}_{t\in [0, T]}\) be the instantaneous interest rate of the money market account \(S_0\) at time t. Then

$$\begin{aligned} r(t):=\langle \underline{r},\alpha (t)\rangle =\sum _{j=1}^D r_j\langle \alpha (t), e_j\rangle \;, \end{aligned}$$
(4.19)

where \(\langle \cdot ,\cdot \rangle \) is the usual scalar product in \(\mathbb {R}^D\) and \(\underline{r}=(r_1,\dots ,r_D)\in \mathbb R^D_+\). Here the value \(r_j\), the \(j^{th}\) entry of the vector \(\underline{r}\), represents the value of the interest rate when the Markov chain is in the state \(e_j\), i.e., when \(\alpha (t)=e_j\). The price dynamics of B can now be written as

$$\begin{aligned} \mathrm {d}S_0(t)=S_0(t)r(t)\mathrm {d}t,\, S_0(0)=1, \quad t\in [0,T]. \end{aligned}$$
(4.20)

Moreover, let \(\mu =\{\mu (t)\}_{t\in [0, T]}\) and \(\sigma =\{\sigma (t)\}_{t\in [0, T]}\) denote respectively the mean return and the volatility of the stock at time t. Using the same convention, we have

$$\begin{aligned} \mu (t)=&\langle \underline{\mu },\alpha (t)\rangle =\sum _{j=1}^D\mu _j\langle \alpha (t),e_j\rangle \;,\\ \sigma (t)=&\langle \underline{\sigma },\alpha (t)\rangle =\sum _{j=1}^D\sigma _j\langle \alpha (t),e_j \rangle \;, \end{aligned}$$

where

$$\begin{aligned} \underline{\mu }=(\mu _1,\mu _2,\ldots ,\mu _D)\in \mathbb {R}^D, \end{aligned}$$

and

$$\begin{aligned} \underline{\sigma }=(\sigma _1,\sigma _2,\ldots ,\sigma _D)\in \mathbb {R}_+^D. \end{aligned}$$

In a similar way, \(\mu _j\) and \(\sigma _j\) represent respectively the appreciation rate and the volatility of the stock when the Markov chain is in state \(e_j\), i.e., when \(\alpha (t)=e_j\). Let \(B=\{B(t)\}_{t\in [0, T]}\) denote a standard Brownian motion on \((\Omega ,\mathcal {F},P)\) with respect to its right-continuous complete filtration \(\mathcal {F}^B:=\{\mathcal {F}^B_t\}_{0\le t\le T}\). Then the dynamics of the stock price \(S=\{S(t)\}_{t\in [0, T]}\) are given by the following Markov regime-switching geometric Brownian motion

$$\begin{aligned} \mathrm {d}S(t)=S(t)\left[ \mu (t)\mathrm {d}t+\sigma (t)\mathrm {d}B(t)\right] , \quad S(0)=S_0. \end{aligned}$$
(4.21)
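
The regime dependence of the market coefficients enters only through the inner product with the state indicator \(\alpha (t)\). The following minimal sketch (in which the two-state parameter values are hypothetical and purely illustrative) shows this selection mechanism in code:

```python
import numpy as np

# Hypothetical two-state example (D = 2); the numbers are illustrative only.
r_bar     = np.array([0.02, 0.04])   # interest rates r_1, r_2
mu_bar    = np.array([0.06, 0.10])   # appreciation rates mu_1, mu_2
sigma_bar = np.array([0.15, 0.25])   # volatilities sigma_1, sigma_2

def coefficient(values, alpha_t):
    """Evaluate <values, alpha(t)>, where alpha_t is the unit vector e_j of the current regime."""
    return float(values @ alpha_t)

alpha_t = np.array([0.0, 1.0])               # chain currently in state e_2
r_t     = coefficient(r_bar, alpha_t)        # = r_2 = 0.04
mu_t    = coefficient(mu_bar, alpha_t)       # = mu_2 = 0.10
sigma_t = coefficient(sigma_bar, alpha_t)    # = sigma_2 = 0.25
```

The same pattern applies to every regime-modulated quantity below, such as \(\lambda ^0(t)\) and \(P_0(t)\).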

Let \(Z_0:=\{Z_0(t)\}_{t\in [0, T]}\) be a real-valued Markov regime-switching pure jump process on \((\Omega ,\mathcal {F},P)\). Here \(Z_0(t)\) can be considered as the aggregate amount of claims up to and including time t. Since \(Z_0\) is a pure jump process, one has

$$\begin{aligned} Z_0(t)=\sum _{0<u\le t}\Delta Z_0(u),\,\,Z_0(0)=0,\,\, P\text {-a.s, } t\in [0,T], \end{aligned}$$
(4.22)

where, for each \(u\in [0,T]\), \(\Delta Z_0(u)=Z_0(u)-Z_0(u^-)\) represents the jump size of \(Z_0\) at time u.

Assume that the state space of claim sizes, denoted by \(\mathcal {Z}\), is \((0,\infty )\). Let \(\mathcal {M}\) be the product space \([0,T]\times \mathcal {Z}\) of claim arrival times and claim sizes. Define a random measure \(N^0(\cdot ,\cdot )\) on the product space \(\mathcal {M}\) which selects the claim arrival times u and the claim sizes \(\zeta :=Z_0(u)-Z_0(u^-)\). Then the aggregate insurance claim process \(Z_0\) can be written as

$$\begin{aligned} Z_0(t)=\int _0^t\int _0^{\infty } \zeta N^0(\mathrm {d}u,\mathrm {d}\zeta ),\,\,\, t\in [0,T]. \end{aligned}$$
(4.23)

Define, for each \(t\in [0,T]\)

$$\begin{aligned} N_{\Lambda ^0}(t)=\int _0^t\int _0^{\infty } N^0(\mathrm {d}u,\mathrm {d}\zeta ),\,\,\,t\in [0,T]. \end{aligned}$$
(4.24)

Then \(N_{\Lambda ^0}(t)\) counts the number of claim arrivals up to time t. Assume that, under the measure P, \(N_{\Lambda ^0}:=\{N_{\Lambda ^0}(t)\}_{t\in [0, T]}\) is a conditional Poisson process on \((\Omega ,\mathcal {F},P)\) with intensity \(\Lambda ^0:=\{\lambda ^0(t)\}_{t\in [0, T]}\) modulated by the chain \(\alpha \) given by

$$\begin{aligned} \lambda ^0(t):=\langle \underline{\lambda }^0,\alpha (t)\rangle =\sum _{j=1}^D \lambda _j^0\langle \alpha (t), e_j\rangle \;, \end{aligned}$$
(4.25)

with \(\underline{\lambda }^0=(\lambda _1^0,\ldots ,\lambda ^0_D)\in \mathbb R^D_+\). Here the value \(\lambda _j^0\), the \(j^{th}\) entry of the vector \(\underline{\lambda }^0\), represents the intensity rate of \(N_{\Lambda ^0}\) when the Markov chain is in the state \(e_j\), i.e., when \(\alpha (t^-)=e_j\). Denote by \(F_j(\zeta ),\,j=1,\ldots ,D\), the probability distribution of the claim size \(\zeta :=Z_0(u)-Z_0(u^-)\) when \(\alpha (u^-)=e_j\). Then the compensator of the Markov regime-switching random measure \(N^0(\cdot ,\cdot )\) under P is given by

$$\begin{aligned} \nu _{\alpha }^0(\mathrm {d}\zeta )\mathrm {d}u:=\sum _{j=1}^D\langle \alpha (u^-),e_j\rangle \lambda _j^0F_j(\mathrm {d}\zeta )\mathrm {d}u. \end{aligned}$$
(4.26)

Hence a compensated version \(\widetilde{N}^0_{\alpha }(\cdot ,\cdot )\) of the Markov regime-switching random measure is defined by

$$\begin{aligned} \widetilde{N}^0_{\alpha }(\mathrm {d}u,\mathrm {d}\zeta )=N^0(\mathrm {d}u,\mathrm {d}\zeta )-\nu ^0_{\alpha }(\mathrm {d}\zeta )\mathrm {d}u. \end{aligned}$$
(4.27)

The premium rate \(P_0(t)\) at time t is given by

$$\begin{aligned} P_0(t):=\langle \underline{P_0},\alpha (t)\rangle =\sum _{j=1}^D P_{0,j}\langle \alpha (t), e_j\rangle , \end{aligned}$$
(4.28)

with \(\underline{P_0}=(P_{0,1},\ldots ,P_{0,D})\in \mathbb R^D_+\). Let \(R_0:=\{R_0(t)\}_{t\in [0, T]}\) be the surplus process of the insurance company without investment. Then

$$\begin{aligned} R_0(t)&:=\,r_0+\int _0^tP_0(u)\mathrm {d}u -Z_0(t)\nonumber \\&=r_0+\sum _{j=1}^D P_{0,j}\mathcal {J}_j(t)-\int _0^t\int _0^{\infty } \zeta N^0(\mathrm {d}u,\mathrm {d}\zeta ),\,\,\, t\in [0,T], \end{aligned}$$
(4.29)

with \(R_0(0)=r_0\). For each \(j=1,\ldots ,D\) and each \(t\in [0,T]\), \(\mathcal {J}_j(t)\) is the occupation time of the chain \(\alpha \) in the state \(e_j\) up to time t, that is

$$\begin{aligned} \mathcal {J}_j(t)=\int _0^t\langle \alpha (u), e_j\rangle \mathrm {d}u. \end{aligned}$$
(4.30)

The following information structure will be important for the derivation of the dynamics of the company’s surplus process. Let \(\mathcal {F}^{Z_0}:=\{\mathcal {F}^{Z_0}_t\}_{0\le t\le T}\) denote the right-continuous P-completed filtration generated by \(Z_0\). For each \(t\in [0,T]\), define \(\mathcal {F}_t:=\mathcal {F}^{Z_0}_t\vee \mathcal {F}_t^{B}\vee \mathcal {F}_t^{\alpha }\) as the minimal \(\sigma \)-algebra generated by \(\mathcal {F}^{Z_0}_t\), \(\mathcal {F}_t^{B}\) and \(\mathcal {F}_t^{\alpha }\), and write \(\mathbb {F}=\{\mathcal {F}_t\}_{0\le t\le T}\) for the information accessible to the company.

From now on, we assume that the insurance company invests the amount \(\pi (t)\) in the stock at time t, for each \(t\in [0,T]\). Then \(\pi =\{\pi (t),t\in [0,T]\}\) represents the portfolio process. Denote by \(X=\{X^{\pi }(t)\}_{t\in [0, T]}\) the wealth process of the company. One can show that the dynamics of the surplus process are given by

$$\begin{aligned} \left\{ \begin{aligned} \,\mathrm {d}X(t)&= \Big \{ P_0(t)+r(t)X(t)+ \pi (t)(\mu (t)-r(t))\Big \}\mathrm {d}t+\sigma (t)\pi (t)\mathrm {d}B(t)\\&\quad -\displaystyle \int _0^{\infty } \zeta N^0(\mathrm {d}t,\mathrm {d}\zeta )\\&= \Big \{ P_0(t)+r(t)X(t)+ \pi (t)(\mu (t)-r(t))-\displaystyle \int _0^{\infty }\zeta \nu ^0_{\alpha }(\,\mathrm {d}\zeta )\Big \}\mathrm {d}t\\&\quad +\sigma (t)\pi (t)\mathrm {d}B(t)-\displaystyle \int _0^{\infty }\zeta \widetilde{N}_{\alpha }^0(\mathrm {d}t,\mathrm {d}\zeta ),\,\,t\in [0,T],\\ X(0)&= X_0. \end{aligned}\right. \end{aligned}$$
(4.31)
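
To fix ideas, here is a minimal Euler-type simulation sketch of the surplus dynamics (4.31) for a single frozen regime and a constant investment amount; all numerical values are hypothetical, and exponential claim sizes are used purely for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical parameters for one fixed regime (illustrative only).
T, n_steps = 1.0, 1000
dt = T / n_steps
P0, r, mu, sigma = 2.0, 0.03, 0.08, 0.20   # premium rate, interest rate, stock drift, volatility
lam0, claim_mean = 5.0, 0.5                # claim arrival intensity and mean claim size
pi, x0 = 1.5, 10.0                         # constant invested amount and initial surplus

X = np.empty(n_steps + 1)
X[0] = x0
for k in range(n_steps):
    dB = rng.normal(0.0, np.sqrt(dt))                        # Brownian increment
    n_claims = rng.poisson(lam0 * dt)                        # claims arriving in (t, t + dt]
    claims = rng.exponential(claim_mean, size=n_claims).sum()
    # Euler step of dX = {P_0 + r X + pi (mu - r)} dt + sigma pi dB - (claims over dt)
    X[k + 1] = X[k] + (P0 + r * X[k] + pi * (mu - r)) * dt + sigma * pi * dB - claims
```

In the full model the coefficients switch with \(\alpha (t)\) and \(\pi \) is a control; the sketch only illustrates the structure of (4.31).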

Definition 4.4

A portfolio \(\pi \) is admissible if it satisfies

  1. \(\pi \) is \(\mathbb {F}\)-progressively measurable;

  2. (4.31) admits a unique strong solution;

  3. \(\sum _{j=1}^D E\Big [\int _0^T\Big \{|P_{0,j}+r_jX(t)+\pi (t)(\mu _j-r_j)|+\sigma ^2_j\pi ^2(t)+\lambda _j^0\int _0^{\infty }\zeta ^2F_j(\,\mathrm {d}\zeta )\Big \}\mathrm {d}t\Big ]<\infty \);

  4. \(X(t)\ge 0,\,\,\forall t\in [0,T]\), P-a.s.

We denote by \(\mathcal {A}\) the space of all admissible portfolios.

Note that although condition (4) is strong, it is intuitively natural to consider only nonnegative wealth for the insurance company. Define \(\mathbb {G}:=\{\mathcal {G}_t, t\in [0,T]\}\), where \(\mathcal {G}_t:=\mathcal {F}_t^B\vee \mathcal {F}_t^{Z_0}\), and for \(n,j=1,\ldots ,D\), let \(\{C_{nj}(t),t\in [0,T]\}\) be a real-valued, \(\mathbb {G}\)-predictable, bounded, stochastic process on \((\Omega ,\mathcal {F},P)\) such that, for each \(t\in [0,T]\), \(C_{nj}(t)\ge 0\) for \(n\ne j\) and \(\sum _{n=1}^D C_{nj}(t)=0\), i.e., \(C_{nn}(t)\le 0\).

We consider a model uncertainty setup given by a probability measure \(Q=Q^{\theta ,\mathbf {C}}\) which is equivalent to P, with Radon-Nikodym derivative on \(\mathcal {F}_t\) given by

$$\begin{aligned} \frac{\mathrm {d}Q}{\mathrm {d}P}\Big |_{\mathcal {F}_t}=G^{\theta ,C}(t), \end{aligned}$$
(4.32)

where \(G^{\theta ,\mathbf {C}}=\{G^{\theta ,\mathbf {C}}(t)\}_{0\le t\le T}\) is an \(\mathbb {F}\)-martingale. Under \(Q^{\theta ,\mathbf {C}}\), \(\mathbf {C}:=\{\mathbf {C}(t),t\in [0,T]\}\) with \(\mathbf {C}(t):=[C_{nj}(t)]_{n,j=1,\ldots ,D}\) is a family of rate matrices of the Markov chain \(\alpha (t)\); see for example Dufour and Elliott (1999). For each \(t\in [0,T]\), we set

$$\begin{aligned} \mathbf {D}_0^{\mathbf {C}}(t):=\mathbf {D}^\mathbf {C}(t) -\mathbf {diag}(\mathbf {d}^C(t)), \end{aligned}$$

with \(\mathbf {d}^C(t)=(d^C_{11},\ldots ,d^C_{DD})^\prime \in \mathbb {R}^D\) and

$$\begin{aligned} \mathbf {D}^C:=\Big [\frac{C_{nj}(t)}{\lambda _{nj}(t)}\Big ]_{n,j=1,\cdots ,D}=[d^{\mathbf {C}} _{nj}(t)]. \end{aligned}$$
(4.33)
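
For instance, when \(D=2\), (4.33) and the definition above give

$$\begin{aligned} \mathbf {D}^{\mathbf {C}}(t)=\begin{pmatrix} C_{11}(t)/\lambda _{11} & C_{12}(t)/\lambda _{12}\\ C_{21}(t)/\lambda _{21} & C_{22}(t)/\lambda _{22} \end{pmatrix},\qquad \mathbf {D}_0^{\mathbf {C}}(t)=\begin{pmatrix} 0 & C_{12}(t)/\lambda _{12}\\ C_{21}(t)/\lambda _{21} & 0 \end{pmatrix}, \end{aligned}$$

that is, \(\mathbf {D}_0^{\mathbf {C}}(t)\) is \(\mathbf {D}^{\mathbf {C}}(t)\) with its diagonal entries set to zero.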

We denote by \(\mathcal {C}\) the space of all families of rate matrices \(\mathbf {C}\) with bounded components.

The Radon-Nikodym derivative or density process \(G^{\theta ,\mathbf {C}}\) is given by

$$\begin{aligned} \left\{ \begin{aligned} \,\mathrm {d}G^{\theta ,\mathbf {C}}(t)&= G^{\theta ,\mathbf {C}}(t^-)\Big \{ \theta (t)\mathrm {d}B(t)+\displaystyle \int _0^{\infty }\theta (t)\widetilde{N}^0_{\alpha }(\mathrm {d}t,\mathrm {d}\zeta )\\&\quad +(\mathbf {D}^{\mathbf {C}}_0(t)\alpha (t)-\mathbf {1})^\prime \cdot \,\mathrm {d}\widetilde{\Phi }(t)\Big \},\,\,\,t\in [0,T],\\ G^{\theta ,\mathbf {C}}(0)&= 1, \end{aligned}\right. \end{aligned}$$
(4.34)

where \(^\prime \) denotes the transpose. Here \((\theta , \mathbf {C})\) may be regarded as a scenario control. A control \(\theta \) is admissible if \(\theta \) is \(\mathbb {F}\)-progressively measurable, with \(\theta (t)=\theta (t,\omega )\le 1\) for a.a. \((t,\omega )\in [0,T]\times \Omega \), and \(\int _0^T\theta ^2(t)\mathrm {d}t<\infty .\) We denote by \(\Theta \) the space of such admissible processes.

Next, we formulate the optimal investment problem under model uncertainty. Let \(U:(0,\infty )\longrightarrow \mathbb {R}\) be a utility function which is strictly increasing, strictly concave and twice continuously differentiable. The objectives of the insurance firm and the market are the following:

Problem 4.5

Find a portfolio process \(\pi ^*\in \mathcal {A}\) and the process \((\theta ^*, \mathbf {C}^*)\in \Theta \times \mathcal {C}\) such that

$$\begin{aligned} \underset{\pi \in \mathcal {A}}{\sup }\,\,\underset{(\theta ,\mathbf {C}) \in \Theta \times \mathcal {C}}{\inf }\,E_{Q^{\theta ,\mathbf {C}}}\Big [U^{\pi }(X_T)\Big ]=&E_{Q^{\theta ^*,\mathbf {C}^*}} \Big [U^{\pi ^*}(X_T)\Big ]\nonumber \\ =&\underset{(\theta ,\mathbf {C}) \in \Theta \times \mathcal {C}}{\inf }\,\,\underset{\pi \in \mathcal {A}}{\sup }\,E_{Q^{\theta ,\mathbf {C}}}\Big [U^{\pi }(X_T)\Big ]. \end{aligned}$$
(4.35)

This problem can be seen as a zero-sum stochastic differential game between the insurance firm and the market. We have

$$\begin{aligned} E_{Q^{\theta ,\mathbf {C}}}\Big [U^{\pi }(X_T)\Big ]=E\Big [G^{\theta ,\mathbf {C}}(T)U(X^{\pi }(T))\Big ]. \end{aligned}$$
(4.36)

Now, define \(Y(t)=Y^{\theta ,\mathbf {C},\pi }(t)\) by

$$\begin{aligned} Y(t)=E\Big [\frac{G^{\theta ,\mathbf {C}}(T)}{G^{\theta ,\mathbf {C}}(t)}U(X^{\pi }(T))\Big |\mathcal {F}_t\Big ]. \end{aligned}$$
(4.37)

Then, it can easily be shown that Y(t) is the solution to the following linear BSDE

$$\begin{aligned} \left\{ \begin{array}{llll} \mathrm {d}Y (t) &{}=&{}-\Big [\theta (t)Z(t)+\displaystyle \int _{\mathbb {R}_0}\theta (t)K(t,\zeta )\nu ^0_{\alpha }(\,\mathrm {d}\zeta )+\sum _{j=1}^D(\mathbf {D}_0^{\mathbf {C}}(t)\alpha (t)-\mathbf {1})_j\lambda _jV_j(t)\Big ]\mathrm {d}t \\ &{}&{} +Z(t)\mathrm {d}B(t) +\displaystyle \int _{\mathbb {R}_0}K(t,\zeta )\widetilde{N}^0_\alpha (\mathrm {d}\zeta ,\mathrm {d}t)+V(t)\cdot \mathrm {d}\widetilde{\Phi }(t),\,\,t\in [0,T], \\ Y(T) &{}=&{} U(X^{\pi }(T)). \end{array}\right. \end{aligned}$$
(4.38)

Noting that

$$\begin{aligned} Y(0)=Y^{\theta ,\mathbf {C},\pi }(0)=E_{Q^{\theta ,\mathbf {C}}}\Big [U^{\pi }(X_T)\Big ], \end{aligned}$$
(4.39)

Problem 4.5 becomes

Problem 4.6

Find a portfolio process \(\pi ^*\in \mathcal {A}\) and the process \((\theta ^*, \mathbf {C}^*)\in \Theta \times \mathcal {C}\) such that

$$\begin{aligned} \underset{\pi \in \mathcal {A}}{\sup }\,\,\underset{(\theta ,\mathbf {C}) \in \Theta \times \mathcal {C}}{\inf }\,Y^{\theta ,\mathbf {C},\pi }(0)=Y^{\theta ^*,\mathbf {C}^*,\pi ^*}(0) =\underset{(\theta ,\mathbf {C}) \in \Theta \times \mathcal {C}}{\inf }\,\,\underset{\pi \in \mathcal {A}}{\sup }\,Y^{\theta ,\mathbf {C},\pi }(0), \end{aligned}$$
(4.40)

where \(Y^{\theta ,\mathbf {C},\pi }\) is described by the Forward–backward system (4.31) and (4.38).

Theorem 4.7

Let \(X^{\pi }(t)\) be the surplus process with dynamics given by (4.31), with r deterministic. Consider the optimization problem of finding \(\pi ^*\in \mathcal {A}\) and \((\theta ^*,\mathbf {C}^*)\in \Theta \times \mathcal {C}\) such that (4.35) (or equivalently (4.40)) holds, with

$$\begin{aligned} Y^{\theta ,\mathbf {C},\pi }(t)=E\Big [\frac{G^{\theta ,\mathbf {C}}(T)}{G^{\theta ,\mathbf {C}}(t)}U(X^{\pi }(T))\Big |\mathcal {F}_t\Big ]. \end{aligned}$$
(4.41)

In addition, suppose that \(U(x)=-e^{-\beta x},\, \beta > 0.\) Then the optimal investment \(\pi ^*(t)\) and the optimal scenario \((\theta ^*,\mathbf {C}^*)\) of the market are given respectively by

$$\begin{aligned} \theta ^*(t)=&-\sum _{j=1}^D\Big (\frac{\mu _j-r_j-\sigma ^2_j\pi ^*(t,e_j)\beta e^{\int _t^Tr(s)\mathrm {d}s}}{\sigma _j}\Big )\langle \alpha (t),e_j\rangle , \end{aligned}$$
(4.42)
$$\begin{aligned} \pi ^*(t)=&\sum _{n=1}^D\Big (\frac{\displaystyle \int _{\mathbb {R}^+}(e^{\beta \zeta e^{\int _t^Tr(s)\mathrm {d}s}}-1)\lambda _n^0 F_n(\mathrm {d}\zeta )}{\beta \sigma _ne^{\int _t^Tr(s)\mathrm {d}s}}\Big )\langle \alpha (t),e_n \rangle . \end{aligned}$$
(4.43)

and the optimal \(\mathbf {C}^*\) solves the following constrained linear optimization problem:

$$\begin{aligned} \underset{C_{1j},\ldots ,C_{Dj}}{\min }\,\, \sum _{j=1}^D(\mathbf {D}^{\mathbf {C}}_0(t)e_n-\mathbf {1})_j\lambda _{nj} V_j(t)\,\,\, j=1,\ldots ,D, \end{aligned}$$
(4.44)

subject to the linear constraints

$$\begin{aligned} \sum _{n=1}^DC_{nj}(t)=0, \end{aligned}$$

where \(V_j\) is given by (4.67).

Moreover, assume that the family of rate matrices \((C_{nj})_{n,j=1,2}\) is bounded and write \(C_{nj}(t)\in \Big [C^l(n,j), C^u(n,j)\Big ]\) with \(C^l(n,j)< C^u(n,j),\,\,n,j=1,2\). Then the optimal \(\mathbf {C}^*\) is given by:

$$\begin{aligned} \mathbf {C}^*_{21}(t)&= C^l(2,1)\mathbb {I}_{V_1(t)-V_2(t)>0}+C^u(2,1)\mathbb {I}_{V_1(t)-V_2(t)<0}, \end{aligned}$$
(4.45)
$$\begin{aligned} \mathbf {C}^*_{11}(t)&=-\mathbf {C}^*_{21}(t), \end{aligned}$$
(4.46)
$$\begin{aligned} \mathbf {C}^*_{12}(t)&= C^l(1,2)\mathbb {I}_{V_2(t)-V_1(t)>0}+C^u(1,2)\mathbb {I}_{V_2(t)-V_1(t)<0}, \end{aligned}$$
(4.47)
$$\begin{aligned} \mathbf {C}^*_{22}(t)&=-\mathbf {C}^*_{12}(t). \end{aligned}$$
(4.48)

Proof

One can see that this is a particular case of a zero-sum stochastic differential game for the Forward–backward system of the form (2.5) and (2.7) with \(\psi =Id\), \(\varphi =f=0\) and \( h(x)=U(x).\) The Hamiltonian of Sect. 3 reduces to

$$\begin{aligned}&H(t, x,e_n, y,z,k,v,\pi ,\theta ,a,p,q,r^0,w)\nonumber \\&\quad = \,a\left[ \theta z+\displaystyle \int _{\mathbb {R}^+}\theta k(t,\zeta )\nu ^0_{e_n}(\,\mathrm {d}\zeta )+\sum _{j=1}^D(\mathbf {D}_0^\mathbf {C}(t)e_n-\mathbf {1})_j\lambda _{nj}v_j(t)\right] \nonumber \\&\quad \quad +\,\left[ P_0(t)+rx+\pi (\mu -r)-\displaystyle \int _{\mathbb {R}^+}\zeta \nu ^0_{e_n}(\mathrm {d}\zeta )\right] p\nonumber \\&\quad \quad +\,\sigma \pi q-\displaystyle \int _{\mathbb {R}^+}\zeta r^0(t,\zeta )\nu _{e_n}^0(\mathrm {d}\zeta ). \end{aligned}$$
(4.49)

The adjoint processes \(A(t)\) and \((p(t),q(t),r^0(t,\zeta ),w(t))\) associated with the Hamiltonian are given by the following Forward–backward SDEs

$$\begin{aligned} \left\{ \begin{array}{llll} \mathrm {d}A(t)&{}=&{}A(t)\Big [\theta (t)\mathrm {d}B(t)+\displaystyle \int _{\mathbb {R}^+}\theta (t)\widetilde{N}^0_\alpha (\mathrm {d}\zeta ,\mathrm {d}t)+(\mathbf {D}_0^{\mathbf {C}}(t)\alpha (t)-\mathbf {1})^\prime \cdot \mathrm {d}\widetilde{\Phi }(t)\Big ],\,t\in [0,T], \\ A(0)&{}=&{}1, \end{array}\right. \nonumber \\ \end{aligned}$$
(4.50)

and

$$\begin{aligned} \left\{ \begin{array}{llll} \mathrm {d}p(t)&{}=&{}-r(t)p(t)\mathrm {d}t+q(t)\mathrm {d}B(t)+\displaystyle \int _{\mathbb {R}^+}r^0(t,\zeta )\widetilde{N}^0_\alpha (\mathrm {d}\zeta ,\mathrm {d}t)+w(t)\cdot \mathrm {d}\widetilde{\Phi }(t),\,t\in [0,T], \\ p(T)&{}=&{}A(T)U^\prime (X(T)). \end{array}\right. \nonumber \\ \end{aligned}$$
(4.51)

It is easy to see that the functions h and H satisfy the assumptions of Theorem 3.9.

Let us now find \(\theta ^*\) and \(\pi ^*\). First, maximizing the Hamiltonian H with respect to \(\pi \) gives the first-order condition for an optimal \(\pi ^*\):

$$\begin{aligned} E[(\mu -r) p+\sigma q|\mathcal {F}_t]=0. \end{aligned}$$
(4.52)

The BSDE (4.51) is linear in p, hence we shall try a process p(t) of the form

$$\begin{aligned} p(t)=\beta f(t,\alpha (t))A(t)e^{-\beta X(t)e^{\int _t^Tr(s)\mathrm {d}s}}, \end{aligned}$$
(4.53)

where \(f(\cdot ,e_n)\) satisfies a differential equation to be determined. Applying the Itô-Lévy formula for jump-diffusion processes, we have

$$\begin{aligned} \mathrm {d}\Big (e^{-\beta X(t)e^{\int _t^Tr(s)\mathrm {d}s}}\Big )&= e^{-\beta X(t)e^{\int _t^Tr(s)\mathrm {d}s}}\Big [\Big ( -\beta e^{\int _t^Tr(s)\mathrm {d}s}\Big \{ P_0(t)+\pi (t)(\mu (t)-r(t))\Big \}\nonumber \\&\quad +\,\frac{1}{2}\beta ^2e^{2\int _t^Tr(s)\mathrm {d}s}\sigma ^2(t)\pi ^2(t) + \int _{\mathbb {R}^+}(e^{\beta \zeta e^{\int _t^Tr(s)\mathrm {d}s} }-1) \nu ^0_{\alpha }(\mathrm {d}\zeta )\Big )\mathrm {d}t\nonumber \\&\quad -\,\beta e^{\int _t^Tr(s)\mathrm {d}s}\sigma (t)\pi (t)\mathrm {d}B(t)\nonumber \\&\quad +\,\displaystyle \int _{\mathbb {R}^+}(e^{\beta \zeta e^{\int _t^Tr(s)\mathrm {d}s}}-1)\widetilde{N}^0_\alpha (\mathrm {d}\zeta ,\mathrm {d}t) \Big ]. \end{aligned}$$
(4.54)

Applying the Itô-Lévy formula for jump-diffusion Markov regime-switching processes (see e.g., Zhang et al. 2012, Theorem 4.1), we get

$$\begin{aligned}&\mathrm {d}\Big ( A(t)e^{-\beta X(t)e^{\int _t^Tr(s)\mathrm {d}s}}\Big )\nonumber \\&\quad \displaystyle =e^{-\beta X(t)e^{\int _t^Tr(s)\mathrm {d}s}}A(t)\Big [\theta (t)\mathrm {d}B(t)+ \int _{\mathbb {R}^+}\theta (t)\widetilde{N}^0_\alpha (\mathrm {d}\zeta ,\mathrm {d}t)+(\mathbf {D}_0^{\mathbf {C}}(t)\alpha (t)-\mathbf {1})^\prime \cdot \mathrm {d}\widetilde{\Phi }(t)\Big ]\nonumber \\&\quad \quad +\,A(t) e^{-\beta X(t)e^{\int _t^Tr(s)\mathrm {d}s}}\Big [\Big ( -\beta e^{\int _t^Tr(s)\mathrm {d}s}\Big \{ P_0(t)+\pi (t)(\mu (t)-r(t))\Big \}\nonumber \\&\quad \quad +\,\frac{1}{2}\beta ^2e^{2\int _t^Tr(s)\mathrm {d}s}\sigma ^2(t)\pi ^2(t) + \int _{\mathbb {R}^+}(e^{\beta \zeta e^{\int _t^Tr(s)\mathrm {d}s} }-1) \nu ^0_{\alpha }(\mathrm {d}\zeta )\Big )\mathrm {d}t\nonumber \\&\quad \quad -\,\beta e^{\int _t^Tr(s)\mathrm {d}s} \sigma (t)\pi (t)\mathrm {d}B(t)+\displaystyle \int _{\mathbb {R}^+}(e^{\beta \zeta e^{\int _t^Tr(s)\mathrm {d}s}}-1)\widetilde{N}^0_\alpha (\mathrm {d}\zeta ,\mathrm {d}t)\Big ]\nonumber \\&\quad \quad -\,\beta A(t)e^{-\beta X(t)e^{\int _t^Tr(s)\mathrm {d}s}}\theta (t)e^{\int _t^Tr(s)\mathrm {d}s}\sigma (t)\pi (t)\mathrm {d}t\nonumber \\&\quad \quad +\displaystyle \int _{\mathbb {R}^+}\theta (t)A(t) e^{-\beta X(t)e^{\int _t^Tr(s)\mathrm {d}s}}(e^{\beta \zeta e^{\int _t^Tr(s)\mathrm {d}s}}-1)N^0_\alpha (\mathrm {d}\zeta ,\mathrm {d}t) \nonumber \\&\quad = A(t)e^{-\beta X(t)e^{\int _t^Tr(s)\mathrm {d}s}} \Big [\Big ( -\beta e^{\int _t^Tr(s)\mathrm {d}s}\Big \{ P_0(t)+\pi (t)(\mu (t)-r(t))\Big \}\nonumber \\&\quad \quad -\,\beta e^{\int _t^Tr(s)\mathrm {d}s} \theta (t)\sigma (t)\pi (t)\nonumber \\&\quad \quad +\,\frac{1}{2}\beta ^2e^{2\int _t^Tr(s)\mathrm {d}s}\sigma ^2(t)\pi ^2(t) + \int _{\mathbb {R}^+}(1+\theta (t))(e^{\beta \zeta e^{\int _t^Tr(s)\mathrm {d}s} }-1) \nu ^0_{\alpha }(\mathrm {d}\zeta )\Big )\mathrm {d}t\nonumber \\&\quad \quad +\, ( \theta (t)-\beta e^{\int _t^Tr(s)\mathrm {d}s} \sigma (t)\pi (t))\mathrm {d}B(t)+\displaystyle \int _{\mathbb {R}^+}\Big \{(1\nonumber \\&\quad \quad +\,\theta (t))(e^{\beta \zeta e^{\int _t^Tr(s)\mathrm {d}s}}-1)+\theta (t)\Big \}\widetilde{N}^0_\alpha (\mathrm {d}\zeta ,\mathrm {d}t)\nonumber \\&\quad \quad +\,(\mathbf {D}_0^{\mathbf {C}}(t)\alpha (t)-\mathbf {1})^\prime \cdot \mathrm {d}\widetilde{\Phi }(t)\Big ]. \end{aligned}$$
(4.55)

Putting \(P_1(t):=A(t)e^{-\beta X(t) e^{\int _t^Tr(s)\mathrm {d}s}}\), so that \(p(t)=\beta f(t,\alpha (t))P_1(t)\), and using once more the Itô-Lévy formula for jump-diffusion Markov regime-switching processes, we get

$$\begin{aligned} \mathrm {d}p(t)&= \beta \mathrm {d}\Big (f(t,\alpha (t))P_1(t)\Big )\nonumber \\&= \beta \Big [ f^\prime (t,\alpha (t))P_1(t)\mathrm {d}t +f(t,\alpha (t))P_1(t)\Big [\Big ( -\beta e^{\int _t^Tr(s)\mathrm {d}s}\Big \{ P_0(t)\nonumber \\&\quad +\,\pi (t)(\mu (t)-r(t))\Big \}\nonumber \\&\quad -\,\beta e^{\int _t^Tr(s)\mathrm {d}s} \theta (t)\sigma (t)\pi (t)+\frac{1}{2}\beta ^2e^{2\int _t^Tr(s)\mathrm {d}s}\sigma ^2(t)\pi ^2(t) \nonumber \\&\quad +\, \int _{\mathbb {R}^+}(1+\theta (t))(e^{\beta \zeta e^{\int _t^Tr(s)\mathrm {d}s} }-1) \nu ^0_{\alpha }(\mathrm {d}\zeta )\Big )\mathrm {d}t \nonumber \\&\quad +\, ( \theta (t)-\beta e^{\int _t^Tr(s)\mathrm {d}s} \sigma (t)\pi (t))\mathrm {d}B(t)\Big ]+ \sum _{j=1}^D\Big (f(t,e_j)\nonumber \\&\quad -\,f(t,\alpha (t))\Big ) P_1(t)(\mathbf {D}_{0,\alpha }^{\mathbf {C}}(t))^j\lambda _j(t)\mathrm {d}t\nonumber \\&\quad +\,\displaystyle \int _{\mathbb {R}^+}f(t,\alpha (t))P_1(t)\Big \{(1+\theta (t))(e^{\beta \zeta e^{\int _t^Tr(s)\mathrm {d}s}}-1)+\theta (t)\Big \}\widetilde{N}^0_\alpha (\mathrm {d}\zeta ,\mathrm {d}t)\nonumber \\&\quad +\, \sum _{j=1}^DP_1(t)\Big ( f(t,e_j)(\mathbf {D}_{0,\alpha }^{\mathbf {C}}(t))^j -f(t,\alpha (t)) \Big ) \mathrm {d}\widetilde{\Phi }_j(t)\Big ], \end{aligned}$$
(4.56)

where \((\mathbf {D}_{0,\alpha }^{\mathbf {C}}(t))^j=(\mathbf {D}_0^{\mathbf {C}}(t)\alpha (t))^j\). Comparing (4.56) with (4.51) and equating the terms in \(\mathrm {d}t\), \(\mathrm {d}B(t)\), \(\widetilde{N}^0_\alpha (\mathrm {d}\zeta ,\mathrm {d}t)\) and \(\mathrm {d}\widetilde{\Phi }_j(t)\), \(j=1,\ldots ,D\), respectively, we get

$$\begin{aligned} q(t)=(\theta ^*(t)-\beta \sigma (t)\pi ^*(t)e^{\int _t^Tr(s)\mathrm {d}s})p(t). \end{aligned}$$
(4.57)

Substituting this into (4.52) gives

$$\begin{aligned} E[(\mu (t)-r(t)) p(t)|\mathcal {F}_t]&=-E[\sigma (t)\left( \theta ^*(t)-\sigma (t)\pi ^*(t)\beta e^{\int _t^Tr(s)\mathrm {d}s}\right) p(t)|\mathcal {F}_t],\nonumber \\ \text {i.e., }\,\, \theta ^*(t)&=-\sum _{j=1}^D\left( \frac{\mu _j-r_j-\sigma ^2_j\pi ^*(t,e_j)\beta e^{\int _t^Tr(s)\mathrm {d}s}}{\sigma _j}\right) \langle \alpha (t),e_j\rangle , \end{aligned}$$
(4.58)

where the last equality follows since all coefficients are adapted to \(\mathcal {F}_t\). Thus (4.42) in the Theorem is proved. On the other hand, we also have

$$\begin{aligned} r^0(t,\zeta )=p(t)\Big \{(1+\theta (t))(e^{\beta \zeta e^{\int _t^Tr(s)\mathrm {d}s}}-1)+\theta (t)\Big \} \end{aligned}$$
(4.59)

and

$$\begin{aligned} w_j(t)=\beta \Big \{P_1(t)\Big ( f(t,e_j)(\mathbf {D}_{0,\alpha }^{\mathbf {C}}(t))^j -f(t,\alpha (t)) \Big )\Big \}, \end{aligned}$$
(4.60)

with the function \(f(\cdot ,e_n)\) satisfying the following backward differential equation:

$$\begin{aligned}&f^\prime (t,e_n) +f(t,e_n)\Big [-\beta e^{\int _t^Tr(s)\mathrm {d}s}\Big \{ P_0(t,e_n)+\pi (t)(\mu (t,e_n)-r(t,e_n))\Big \} \nonumber \\&\quad \quad -\,\beta e^{\int _t^Tr(s)\mathrm {d}s} \theta (t)\sigma (t,e_n)\pi (t)\nonumber \\&\quad \quad +\,\frac{1}{2}\beta ^2 e^{2\int _t^Tr(s)\mathrm {d}s}\sigma ^2(t,e_n)\pi ^2(t)+\displaystyle \int _{\mathbb {R}^+}(1+\theta (t))(e^{\beta \zeta e^{\int _t^Tr(s)\mathrm {d}s}}-1) \lambda ^0_{n}F_{e_n}(\mathrm {d}\zeta )\Big ] \nonumber \\&\quad \quad +\,\sum _{j=1}^D\Big (f(t,e_j)-f(t,e_n)\Big )(\mathbf {D}_{0,e_n}^{\mathbf {C}}(t))_{nj}\lambda _{n j}=0, \end{aligned}$$
(4.61)

with the terminal condition \(f(T,e_n)=1\), for \(n=1,\ldots , D\). For \(r=0\), the solution of such a backward equation can be found in Elliott and Siu (2011). Next, minimizing the Hamiltonian H with respect to \(\theta \) gives the first-order condition for an optimal \(\theta ^*\):

$$\begin{aligned} E[z+\displaystyle \int _{\mathbb {R}^+}k(t,\zeta )\nu ^0_{\alpha }(\mathrm {d}\zeta )|\mathcal {F}_t]=0. \end{aligned}$$
(4.62)

The BSDE (4.38) is linear in Y, hence we shall try a process Y(t) of the form

$$\begin{aligned} Y(t)=f_1(t,\alpha (t))Y_1(t)\,\, \text { with }\,\, Y_1(t)=e^{-\beta X(t) e^{\int _t^Tr(s)\mathrm {d}s}}, \end{aligned}$$
(4.63)

where \(f_1(\cdot ,e_n),\,\,n=1,\ldots , D\), is a deterministic function satisfying a backward differential equation to be determined. Applying the Itô-Lévy formula for jump-diffusion Markov regime-switching processes, we get

$$\begin{aligned} \mathrm {d}Y(t)&= f_1^\prime (t,\alpha (t) )e^{-\beta X(t) e^{\int _t^Tr(s)\mathrm {d}s}}\mathrm {d}t- f_1(t, \alpha (t))Y_1(t)\beta e^{\int _t^Tr(s)\mathrm {d}s}\Big \{P_0(t)\nonumber \\&\quad +\,\pi (t)(\mu (t)-r(t))\nonumber \\&\quad -\,\frac{1}{2}\beta e^{\int _t^Tr(s)\mathrm {d}s}\sigma ^2(t)\pi ^2(t)+\frac{1}{\beta }\displaystyle \int _{\mathbb {R}^+}(e^{\beta \zeta e^{\int _t^Tr(s)\mathrm {d}s}}-1)\nu ^0_{\alpha }(\mathrm {d}\zeta )\Big \}\mathrm {d}t\nonumber \\&\quad +\,\sum _{j=1}^D\Big (f_1(t,e_j)-f_1(t,\alpha (t))\Big )Y_1(t)\lambda _j(t)\mathrm {d}t\nonumber \\&\quad -\, f_1(t,\alpha (t))Y_1(t)\beta e^{\int _t^Tr(s)\mathrm {d}s} \sigma (t)\pi (t)\mathrm {d}B(t)\nonumber \\&\quad +\, \displaystyle \int _{\mathbb {R}^+}f_1(t,\alpha (t))Y_1(t)(e^{\beta \zeta e^{\int _t^Tr(s)\mathrm {d}s}}-1)\widetilde{N}^0_\alpha (\mathrm {d}\zeta ,\mathrm {d}t)\nonumber \\&\quad +\, \sum _{j=1}^D\Big (f_1(t,e_j)-f_1(t,\alpha (t))\Big )Y_1(t) \mathrm {d}\widetilde{\Phi }_j(t). \end{aligned}$$
(4.64)

Comparing (4.64) and (4.38), we get

$$\begin{aligned} Z(t)=&-\beta e^{\int _t^Tr(s)\mathrm {d}s} Y(t)\sigma (t)\pi (t), \end{aligned}$$
(4.65)
$$\begin{aligned} K(t,\zeta )=&Y(t)(e^{\beta \zeta e^{\int _t^Tr(s)\mathrm {d}s}}-1), \end{aligned}$$
(4.66)
$$\begin{aligned} V_j(t)=&\Big \{f_1(t,e_j)-f_1(t,\alpha (t)\Big \}Y_1(t). \end{aligned}$$
(4.67)

Substituting \(Z(t)\) and \(K(t,\zeta )\) into (4.62), we get

$$\begin{aligned} E[\beta e^{\int _t^Tr(s)\mathrm {d}s}\sigma (t)\pi ^*(t)|\mathcal {F}_t]=&E\Big [\displaystyle \int _{\mathbb {R}^+}(e^{\beta \zeta e^{\int _t^Tr(s)\mathrm {d}s}}-1)\nu ^0_{\alpha }(\mathrm {d}\zeta )|\mathcal {F}_t\Big ],\nonumber \\ \text { i.e., }\,\, \pi ^*(t)=&\sum _{n=1}^D\left( \frac{\displaystyle \int _{\mathbb {R}^+}(e^{\beta \zeta e^{\int _t^Tr(s)\mathrm {d}s}}-1)\lambda _n^0 F_n(\mathrm {d}\zeta )}{\beta \sigma _ne^{\int _t^Tr(s)\mathrm {d}s}}\right) \langle \alpha (t),e_n \rangle . \end{aligned}$$
(4.68)

Thus (4.43) in the Theorem is proved. Substituting (4.65)-(4.67) into (4.64), we deduce that the function \(f_1(\cdot , e_n)\) satisfies the following backward differential equation

$$\begin{aligned}&f_1^\prime (t,e_n) +f_1(t,e_n)\Big [-\beta e^{\int _t^Tr(s)\mathrm {d}s} \Big \{ P_0(t,e_n)+\pi (t)(\mu (t,e_n)-r(t,e_n))\Big \}\nonumber \\&\quad \quad -\,\beta e^{\int _t^Tr(s)\mathrm {d}s} \theta (t)\sigma (t,e_n)\pi (t)\nonumber \\&\quad \quad +\,\frac{1}{2}\beta ^2 e^{2\int _t^Tr(s)\mathrm {d}s}\sigma ^2(t,e_n)\pi ^2(t) +\displaystyle \int _{\mathbb {R}^+}(1+\theta (t))(e^{\beta \zeta e^{\int _t^Tr(s)\mathrm {d}s}}-1) \lambda ^0_{n}F_{e_n}(\mathrm {d}\zeta )\Big ] \nonumber \\&\quad \quad +\,\sum _{j=1}^D\Big (f_1(t,e_j)-f_1(t,e_n)\Big )(\mathbf {D}_{0,e_n}^{\mathbf {C}}(t))_{nj}\lambda _{n j}=0, \end{aligned}$$
(4.69)

with the terminal condition \(f_1(T,e_n)=-1\) for \(n=1,\ldots ,D\).
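
The system (4.69) is a coupled system of backward ordinary differential equations for \(f_1(\cdot ,e_1),\ldots ,f_1(\cdot ,e_D)\). The following minimal numerical sketch integrates it by stepping backwards from the terminal condition, with the bracketed coefficient of (4.69) and the products \((\mathbf {D}_{0,e_n}^{\mathbf {C}}(t))_{nj}\lambda _{nj}\) frozen at hypothetical constant values (\(\pi \), \(\theta \) and \(\mathbf {C}\) being regarded as given):

```python
import numpy as np

D, T, n_steps = 2, 1.0, 1000
h = T / n_steps

# Hypothetical frozen coefficients (illustrative only):
a  = np.array([-0.8, -1.2])          # bracketed coefficient of (4.69) in states e_1, e_2
dl = np.array([[0.0, 0.5],           # placeholder for (D_0^C)_{nj} * lambda_{nj}, n != j
               [0.7, 0.0]])

f1 = np.full(D, -1.0)                # terminal condition f_1(T, e_n) = -1
for _ in range(n_steps):
    # backward Euler: f_1(t - h) ~ f_1(t) + h * [f_1(t) * a + coupling(t)]
    coupling = (dl * (f1[None, :] - f1[:, None])).sum(axis=1)
    f1 = f1 + h * (f1 * a + coupling)
# f1 now approximates (f_1(0, e_1), f_1(0, e_2)) for these frozen coefficients
```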

As for the optimal \((C_{nj})_{n,j=1,\ldots ,D}\), the only part of the Hamiltonian that depends on \(\mathbf {C}\) is the sum \(\sum _{j=1}^D(\mathbf {D}^{\mathbf {C}}_0(t)e_n-\mathbf {1})_j\lambda _{nj} V_j(t)\). Hence minimizing the Hamiltonian with respect to \(\mathbf {C}\) is equivalent to solving the following constrained linear minimization problems

$$\begin{aligned} \underset{C_{1j},\ldots ,C_{Dj}}{\min }\,\, \sum _{j=1}^D(\mathbf {D}^{\mathbf {C}}_0(t)e_n-\mathbf {1})_j\lambda _{nj} V_j(t)\,\,\, j=1,\ldots ,D, \end{aligned}$$
(4.70)

subject to the linear constraints

$$\begin{aligned} \sum _{n=1}^DC_{nj}(t)=0. \end{aligned}$$

Hence, one can obtain the solution in the two-state case (since \(\mathbf {C}\) is bounded), with \(V_j\) and \(f_1\) given by (4.67) and (4.69) respectively. More specifically, if the Markov chain has only two states, we have to solve the following two linear programming problems:

$$\begin{aligned} \underset{C_{11}(t),C_{21}(t)}{\min }\,\, ( V_1(t)-V_2(t))C_{21}(t)+\lambda _{21} (V_2(t)-V_1(t)) \end{aligned}$$
(4.71)

subject to the linear constraint

$$\begin{aligned} C_{11}+C_{21}=0 \end{aligned}$$

and

$$\begin{aligned} \underset{C_{12}(t),C_{22}(t)}{\min }\,\, ( V_2(t)-V_1(t))C_{12}(t)+\lambda _{12} (V_1(t)-V_2(t)) \end{aligned}$$
(4.72)

subject to the linear constraint

$$\begin{aligned} C_{12}+C_{22}=0. \end{aligned}$$

By imposing that the family of rate matrices \((C_{nj})_{n,j=1,2}\) is bounded, we can write \(C_{nj}(t)\in \Big [C^l(n,j), C^u(n,j)\Big ]\) with \(C^l(n,j)< C^u(n,j),\,\,n,j=1,2\). The solutions to the preceding two linear programming problems are then given by:

$$\begin{aligned} C^*_{21}(t)= & {} C^l(2,1)\mathbb {I}_{V_1(t)-V_2(t)>0}+C^u(2,1)\mathbb {I}_{V_1(t)-V_2(t)<0},\nonumber \\ C^*_{11}(t)= & {} -C^*_{21}(t) \end{aligned}$$
(4.73)

and

$$\begin{aligned} C^*_{12}(t)= & {} C^l(1,2)\mathbb {I}_{V_2(t)-V_1(t)>0}+C^u(1,2)\mathbb {I}_{V_2(t)-V_1(t)<0},\nonumber \\ C^*_{22}(t)= & {} -C^*_{12}(t). \end{aligned}$$
(4.74)

The proof is complete. \(\square \)
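
The selection rules (4.73)–(4.74) are of bang-bang type and straightforward to implement. A minimal sketch for the two-state case (with hypothetical bounds, and an arbitrary choice on the set \(V_1(t)=V_2(t)\), where any admissible value is optimal) is:

```python
def optimal_C(V1, V2, C_low, C_up):
    """Bang-bang selection (4.73)-(4.74) for a two-state chain.

    C_low, C_up: dicts of lower/upper bounds keyed by (n, j); hypothetical inputs.
    When V1 == V2 the coefficient of C vanishes and the choice below is arbitrary.
    """
    C21 = C_low[(2, 1)] if V1 - V2 > 0 else C_up[(2, 1)]
    C12 = C_low[(1, 2)] if V2 - V1 > 0 else C_up[(1, 2)]
    return {(2, 1): C21, (1, 1): -C21, (1, 2): C12, (2, 2): -C12}

# Example with V_1(t) > V_2(t): C*_21 sits at its lower bound, C*_12 at its upper bound.
print(optimal_C(V1=0.3, V2=0.1,
                C_low={(2, 1): 0.1, (1, 2): 0.2},
                C_up={(2, 1): 0.9, (1, 2): 1.1}))
```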

Remark 4.8

  • Assume for example that the distribution of the claim size is of exponential type (with parameter \(\tilde{\lambda }_j^0>2\beta ,\,j=1,\ldots ,D\)); a closed-form evaluation of \(\pi ^*\) under this assumption is sketched after this remark. Moreover, assume that \(\pi , \theta \) and \(\mathbf {C}\) are given by (4.68), (4.58) and (4.70), respectively. Then each of the equations (4.31), (4.38), (4.50) and (4.51) admits a unique solution. The solution \((\widehat{Y}(t), \widehat{Z}(t), \widehat{K}(t,\zeta ), \widehat{V}(t))\) (respectively \((\widehat{p}(t),\widehat{q}(t),\widehat{r}^0(t,\zeta ),\widehat{w}(t))\)) to (4.38) (respectively (4.51)) is given by (4.63), (4.65), (4.66) and (4.67) (respectively (4.53), (4.57), (4.59) and (4.60)).

  • We note that f given by (4.61) and \(f_1\) given by (4.69) coincide. Moreover, for \(r=0\), the backward differential equation (4.61) is the same as Elliott and Siu (2011, Eq. (4.13)).
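
To illustrate the first bullet point, suppose that in state \(e_n\) the claim sizes are exponentially distributed with parameter \(\tilde{\lambda }_n^0\), i.e., \(F_n(\mathrm {d}\zeta )=\tilde{\lambda }_n^0e^{-\tilde{\lambda }_n^0\zeta }\mathrm {d}\zeta \). Writing \(c(t):=\beta e^{\int _t^Tr(s)\mathrm {d}s}\) and assuming \(c(t)<\tilde{\lambda }_n^0\), the integral in (4.43) can be evaluated in closed form:

$$\begin{aligned} \int _{\mathbb {R}^+}(e^{c(t)\zeta }-1)\lambda _n^0F_n(\mathrm {d}\zeta )=\lambda _n^0\Big (\frac{\tilde{\lambda }_n^0}{\tilde{\lambda }_n^0-c(t)}-1\Big )=\frac{\lambda _n^0\,c(t)}{\tilde{\lambda }_n^0-c(t)}, \end{aligned}$$

so that (4.43) becomes

$$\begin{aligned} \pi ^*(t)=\sum _{n=1}^D\frac{\lambda _n^0}{\sigma _n\big (\tilde{\lambda }_n^0-\beta e^{\int _t^Tr(s)\mathrm {d}s}\big )}\langle \alpha (t),e_n\rangle . \end{aligned}$$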

5 Conclusion

In this paper, we use a general maximum principle for Markov regime-switching Forward–backward stochastic differential equations to study optimal strategies for stochastic differential games. The proposed model covers the model uncertainty in Bordigoni et al. (2005), Elliott and Siu (2011), Faidi et al. (2011), Jeanblanc et al. (2012), Øksendal and Sulem (2012). The results obtained are applied to study two problems: first, we study robust utility maximization under relative entropy penalization. We show that the value function in this case is described by a quadratic regime-switching backward stochastic differential equation. Second, we study a problem of optimal investment of an insurance company under model uncertainty. This can be formulated as a two-player zero-sum stochastic differential game between the market and the insurance company, where the market controls the mean relative growth rate of the risky asset and the company controls the investment. We find “closed form” solutions for the optimal strategies of the insurance company and the market, when the utility is of exponential type and the Markov chain has two states.

Optimal control for delayed systems has also received attention recently, due to the memory dependence of some processes. In this situation, the dynamics at the present time t depend not only on the situation at time t but also on a finite part of the past history. An extension of the present work to the delayed case could be of interest. Such results were derived in Menoukeu-Pamen (2015) in the case of no regime-switching.

It would also be interesting to study the sensitivity of the optimal controls with respect to the given parameters. However, this is not straightforward since the parameters (coefficients) in this case depend on the regime and are thus stochastic. This is the object of future work.