1 Introduction

Optimal investment problems are extensively studied in financial mathematics, and a key example is the exponential utility maximization problem. Among many papers in this field, we can mention the works by Hu et al. (2005), Morlais (2009), Ankirchner et al. (2010), Lim and Quenez (2011), Jiao et al. (2013) and Jeanblanc et al. (2015). In these papers the authors consider dynamic investment problems for an agent who values his terminal wealth with an exponential utility whose absolute risk aversion coefficient is constant in time. However, when deciding on dynamic asset allocation, it seems more reasonable to assume that the investor’s risk preferences are time-varying.

The motivation for considering time-varying and stochastic risk preferences is clear. In a bull market investors are willing to take more risk, which should be modeled with a lower risk aversion coefficient, whereas in a bear market investors are willing to take less risk, which should be modeled with a higher risk aversion coefficient. Hence, a coefficient of risk aversion that depends on the state of the economy should be used in dynamic portfolio selection problems. Pirvu and Zhang (2013) and Kwak et al. (2014) study exponential utility indifference pricing and optimal investment strategies under exponential utility with a regime-switching risk aversion coefficient. Gordon and St-Amour (2000) show that a state-dependent risk aversion can explain asset price movements which cannot be explained by constant risk aversion. There is also strong empirical evidence that the degree of risk aversion depends on prior gains and losses, or on the available wealth in general. Thaler and Johnson (1990) claim that after a gain on a prior gamble people are more risk seeking than usual, while after a prior loss they become more risk averse. The observation that the risk aversion goes down after a prior gain is called the “house money” effect.

We investigate an exponential utility maximization problem for an insurer who faces a stream of non-hedgeable claims. The policyholders are entitled to annuity, life insurance and endowment benefits. The benefits are contingent on a non-tradeable financial index correlated with a stock available for trading in the financial market. The deaths of the policyholders and the benefits’ occurrence times are modelled with a counting process. We assume that the insurer’s risk aversion coefficient changes in time and its value depends on the current insurer’s net asset value (the excess of assets over liabilities). If the assets are above the liabilities, then the insurer is less risk averse and is willing to implement riskier investment strategies. If the assets are below the liabilities, the insurer is more risk averse and switches to more conservative investment strategies. Hence, we take into account the “house money” effect when the insurer solves his asset allocation problem. To the best of our knowledge, there is only one paper (by Dong and Sircar 2014) which studies exponential utility maximization for an investor with wealth-dependent risk aversion. At the same time we can find papers in which mean-variance optimization problems with wealth-dependent risk aversion coefficients are considered, see e.g. Zeng and Li (2011), Björk et al. (2014) and Kronborg and Steffensen (2015).

It is known that exponential utility maximization problems with a time-varying risk aversion coefficient are time-inconsistent and classical techniques of stochastic control cannot be applied. We follow the game-theoretic approach from Ekeland and Lazrak (2006), Ekeland and Pirvu (2008), Björk and Murgoci (2014) and Björk et al. (2017) and we derive the HJB equation for our time-inconsistent optimization problem with wealth-dependent risk aversion. The HJB equation characterizes the so-called equilibrium investment strategy and the equilibrium value function. In order to solve our HJB equation, we use the expansion techniques from Fouque et al. (2011), Fouque et al. (2014), Fouque and Hu (2017), Fouque et al. (2017) and Dong and Sircar (2014). We assume that the insurer’s risk aversion coefficient consists of a constant risk aversion and a small amount of a wealth-dependent risk aversion. We apply perturbation theory and expand the solution to the HJB equation in the parameter controlling the degree of risk aversion depending on wealth. In the first step, we investigate an exponential utility maximization problem for an insurer with a constant risk aversion coefficient and we derive some new results for this problem. In particular, we investigate the derivative of the value function with respect to the risk aversion coefficient. We show existence of solutions to systems of nonlinear BSDEs and nonlinear PDEs which describe the value function for our exponential utility maximization problem with constant risk aversion and the derivative of the value function with respect to risk aversion. We show that the PDEs have smooth solutions. Finally, we use these results to postulate the first-order approximation to the solution to our HJB equation. We derive the first-order approximations to the equilibrium value function and the equilibrium investment strategy. Our first-order approximation to the equilibrium investment strategy is new and agrees with intuition.

Dong and Sircar (2014) investigate time-inconsistent optimization problems, including an indifference pricing problem for a terminal claim under exponential utility with wealth-dependent risk aversion coefficient. They also assume that a small amount of wealth-dependent risk aversion is added to constant risk aversion and apply perturbation theory to find the first-order approximation to the solution to their HJB equation. Our model and results are much more general than the model and results from Dong and Sircar (2014). We consider an insurance portfolio where the run-off is modelled with a counting process and the insurer is exposed to a stream of non-hedgeable claims of three different types. Since we consider an insurance portfolio with an arbitrary number of policies, we study a recursive system of HJB equations. The results presented in Dong and Sircar (2014) are heuristic and in a summary form, whereas we present formal proofs of our results. We use not only PDEs but also BSDEs to characterize the first-order approximation to the solution. Finally, Dong and Sircar (2014) are only interested in the exponential utility indifference price of a terminal claim and they do not give the first-order equilibrium investment strategy for their problem.

The remainder of the paper is organized as follows. Sections 2 and 3 describe the model and the optimization problem. In Sect. 4 we recall perturbation theory and explain the idea behind the (asymptotic) first-order approximation to a solution to a problem. In Sect. 5 we investigate an exponential utility maximization problem with constant risk aversion coefficient whereas in the subsequent Sect. 6 we study an exponential utility maximization problem with wealth-dependent risk aversion. Section 7 contains some examples which illustrate our key result from Sect. 6. All proofs are presented in Sect. 8.

2 The financial and insurance model

We deal with a probability space \((\varOmega ,{\mathbb {F}},\mathbb {P})\) with a filtration \({\mathbb {F}}=(\mathcal {F}_{t})_{0\le t\le T}\) and a finite time horizon \(T<\infty \). On the probability space \((\varOmega ,{\mathbb {F}},\mathbb {P})\) we define a standard two-dimensional Brownian motion \((W,B)=(W(t),B(t), 0\le t\le T)\) and a càdlàg (right-continuous with left limits) counting process \(N=(N(t), 0\le t\le T)\). The uncorrelated Brownian motions \((W,B)\) are used to model the financial risk and \(\mathcal {F}^{W,B}_{t}=\sigma (W(u), B(u), u\in [0,t])\) contains information on the evolution of the financial indices. The counting process N is used to model the insurance risk and \(\mathcal {F}^{N}_{t}=\sigma (N(u), u\in [0,t])\) contains information on the number of in-force policies in the insurance portfolio. We assume that

  1. (A1)

    the subfiltrations \(\mathcal {F}^{W,B}_{t}\) and \(\mathcal {F}^{N}_{t}\) are independent and we set \(\mathcal {F}_t=\bigcap _{\epsilon >0}\big (\mathcal {F}^{W,B}_{t+\epsilon }\vee \mathcal {F}^{N}_{t+\epsilon }\big )\) for \(0\le t\le T\),

Under assumption (A1), the financial risk is independent of the insurance risk. As far as the filtration \({\mathbb {F}}\) is concerned, we use the standard approach of progressive enlargement of the Brownian filtration. The filtration \({\mathbb {F}}\) is right-continuous and completed with sets of measure zero.

The financial market consists of a risk-free deposit \(D=(D(t), 0\le t\le T)\) and two risky indices: \(S=(S(t), 0\le t\le T)\) and \(P=(P(t),0\le t\le T)\). The value of the risk-free deposit is constant:

$$\begin{aligned} D(t)=1, \quad 0\le t\le T, \end{aligned}$$
(2.1)

i.e. we assume that the interest rate is zero or we consider discounted quantities in our problem. The prices of the risky indices are modelled with correlated Brownian motions. We assume that the prices of S and P satisfy the dynamics

$$\begin{aligned} \frac{dS(t)}{S(t)}= & {} \mu dt+\sigma dW(t),\quad 0\le t\le T,\nonumber \\ S(0)= & {} s_0, \end{aligned}$$
(2.2)
$$\begin{aligned} \frac{dP(t)}{P(t)}= & {} a dt+b\Big (\rho dW(t)+\sqrt{1-\rho ^2}dB(t)\Big ),\quad 0\le t\le T,\nonumber \\ P(0)= & {} p_0, \end{aligned}$$
(2.3)

where \(\mu , a, \sigma , b\) are positive constants which denote drifts and volatilities and \(\rho \in [-1,1]\) denotes the correlation coefficient between the log-returns of S and P. The insurance company can invest in the deposit D and in the index S. The index P is not available for trading. The index P is the underlying investment fund for the insurance contracts sold by the insurance company. We use two indices in our model since in practice equity-linked life insurance contracts may be contingent on non-tradeable indices.
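To make the dynamics (2.2)–(2.3) concrete, the following minimal sketch simulates one joint path of S and P on a uniform time grid, using the exact log-normal solution of each equation over a grid step. It is only an illustration: the function name `simulate_indices`, the grid size and all parameter values are assumptions and are not taken from the model above.

```python
import math
import random


def simulate_indices(s0, p0, mu, sigma, a, b, rho, T, steps, seed=0):
    """Simulate one path of (S, P) from (2.2)-(2.3) on a uniform grid.

    Illustrative sketch only; names and parameter values are hypothetical.
    """
    rng = random.Random(seed)
    dt = T / steps
    S, P = [s0], [p0]
    for _ in range(steps):
        # Independent Brownian increments of W and B over one grid step.
        dW = rng.gauss(0.0, math.sqrt(dt))
        dB = rng.gauss(0.0, math.sqrt(dt))
        # Driver of P, correlated with W through rho as in (2.3).
        dZ = rho * dW + math.sqrt(1.0 - rho * rho) * dB
        # Exact geometric Brownian motion update over one grid step.
        S.append(S[-1] * math.exp((mu - 0.5 * sigma ** 2) * dt + sigma * dW))
        P.append(P[-1] * math.exp((a - 0.5 * b ** 2) * dt + b * dZ))
    return S, P


# Hypothetical parameters: one year, 250 steps, correlation 0.5.
S, P = simulate_indices(1.0, 1.0, 0.05, 0.2, 0.03, 0.15, 0.5, 1.0, 250)
```

With \(\rho =1\), equal drifts, equal volatilities and equal initial values, the two simulated paths coincide, which is a quick sanity check of the correlation structure.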

The insurance company keeps a homogeneous portfolio of n unit-linked policies. The counting process N is used to count the number of deaths in the insurance portfolio. We assume that the lifetimes of the policyholders are independent and exponentially distributed, i.e. we assume that

  1. (A2)

    \(\Big (N(t)-\int _0^t(n-N(s-))\lambda ds, \ 0\le t\le T\Big )\) is an \({\mathbb {F}}\)-martingale, where \(\lambda >0\).

Parameter \(\lambda \) denotes the mortality intensity for the policyholders. Since mortality intensity depends on age, we should assume that \(\lambda \) depends on time t. Such a modification of (A2) can be easily introduced. However, we keep (A2) to simplify the presentation of our results. Let

$$\begin{aligned} J(t)=n-N(t), \quad 0\le t\le T, \end{aligned}$$

count the number of policies in force in the insurance portfolio.
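Since the lifetimes are assumed independent and exponentially distributed with intensity \(\lambda \), a path of N and J can be generated by drawing n lifetimes and counting those that end before a given time. The sketch below illustrates this under that assumption; the helper names `simulate_death_times` and `policies_in_force` and the numerical values are hypothetical.

```python
import random


def simulate_death_times(n, lam, T, seed=0):
    """Draw n independent Exp(lam) lifetimes; return the sorted deaths in [0, T]."""
    rng = random.Random(seed)
    lifetimes = [rng.expovariate(lam) for _ in range(n)]
    return sorted(t for t in lifetimes if t <= T)


def policies_in_force(death_times, n, t):
    """J(t) = n - N(t), where N(t) counts the deaths up to and including time t."""
    return n - sum(1 for d in death_times if d <= t)


# Hypothetical portfolio: 100 policies, intensity 0.02, horizon 10 years.
deaths = simulate_death_times(100, 0.02, 10.0)
in_force_at_T = policies_in_force(deaths, 100, 10.0)
```

The resulting J is a non-increasing step function that starts at n and drops by one at each death time.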

The insurer faces a stream of non-hedgeable claims which is modelled with the process \(C=(C(t), 0\le t\le T)\). The process C is described with the equation

$$\begin{aligned} C(t)= & {} \int _{0}^{t}(n-N(s-))\alpha (P(s))ds+\int _{0}^{t}\beta (P(s))dN(s)\nonumber \\&+\,(n-N(T))\eta (P(T)){\mathbf {1}}_{t=T},\quad 0\le t\le T. \end{aligned}$$
(2.4)

Each policyholder in the insurance portfolio is entitled to three types of benefits: annuity \(\alpha \) paid as long as the policyholder lives, life insurance benefit \(\beta \) paid if the policyholder dies and endowment benefit \(\eta \) paid if the policyholder survives till the terminal time T. The benefits \(\alpha , \beta \) and \(\eta \) are contingent on the value of the index P. We assume that

  1. (A3)

    the functions \(\alpha , \beta , \eta :(0,\infty )\mapsto [0,\infty )\) are bounded and Lipschitz continuous.
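As an illustration of (2.4) and (A3), the sketch below evaluates the total claim C(T) on a discretized path of P for given death times. The capped benefit functions are hypothetical examples of bounded, Lipschitz continuous, non-negative pay-offs; they are not specified in the model. The annuity integral is approximated with a left-point rule and each death benefit is evaluated at the grid point at or just before the death time, so the result is only an approximation of (2.4).

```python
def alpha(p, cap=2.0):
    """Hypothetical annuity rate: proportional to the index, capped."""
    return 0.05 * min(p, cap)


def beta(p, cap=2.0):
    """Hypothetical death benefit: the index value, capped."""
    return min(p, cap)


def eta(p, cap=2.0):
    """Hypothetical endowment benefit: index value with floor 1 and a cap."""
    return min(max(p, 1.0), cap)


def total_claims(P, death_times, n, T):
    """Approximate C(T) from (2.4) on a uniform grid with len(P) - 1 steps.

    Assumes every entry of death_times lies in [0, T].
    """
    steps = len(P) - 1
    dt = T / steps
    total = 0.0
    for i in range(steps):
        t = i * dt
        in_force = n - sum(1 for d in death_times if d <= t)
        total += in_force * alpha(P[i]) * dt  # annuity paid to the living
    for d in death_times:
        i = min(int(d / dt), steps)
        total += beta(P[i])  # life insurance benefit paid at death
    survivors = n - len(death_times)
    total += survivors * eta(P[steps])  # endowment paid at the terminal time
    return total
```

For a flat index path equal to 1, ten policies, no deaths and \(T=1\), the annuity part contributes 0.5 and the endowment part contributes 10 under these hypothetical pay-offs.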

In order to fulfill the future obligations, the insurer must hold a reserve. The reserve is set for the policies in force. We define the reserve:

$$\begin{aligned} F^k(t,p)= & {} {\mathbb {E}}^{\tilde{\mathbb {Q}}}\Big [C(T)-C(t)\big |P(t)=p, J(t)=k\Big ],\nonumber \\&\quad (t,p,k)\in [0,T]\times (0,\infty )\times \{0,1,\ldots ,n\}, \end{aligned}$$
(2.5)

where \({\tilde{\mathbb {Q}}}\) denotes a pricing measure for C. Here, by reserve we mean an amount of money which the insurer sets aside to cover the future benefits. The insurer can choose any pricing measure to calculate the reserve (2.5). We don’t make any assumptions on the pricing measure \({\tilde{{\mathbb {Q}}}}\) in (2.5). However, we assume that

  1. (A4)

    \(F^k(t,p)=kF^1(t,p), \ (t,p,k)\in [0,T]\times (0,\infty )\times \{0,\ldots ,n\},\) and the function \(F^1:[0,T]\times (0,\infty )\mapsto [0,\infty )\) is \(\mathcal {C}^{1,2}([0,T]\times (0,\infty ))\).

In the sequel, the reserve for one policy in force \(F^1\) is simply denoted by F. If the counting process N is independent of \((S,P)\) under the pricing measure \({\tilde{{\mathbb {Q}}}}\) and the prices of the pay-offs \(\alpha , \beta , \eta \) are smooth functions of time and the underlying index P, then (A4) is satisfied.

3 The optimization problem and the HJB equation

Let \(\pi :=(\pi (t),0\le t\le T)\) denote a strategy which determines the amount of wealth invested in the index S. The wealth process of the insurer, denoted by \(X^\pi =(X^\pi (t),0\le t\le T)\), satisfies the SDE

$$\begin{aligned} dX^\pi (t)= & {} \pi (t)\Big (\mu dt+\sigma dW(t)\Big )\nonumber \\&-J(t-)\alpha (P(t))dt+\beta (P(t))dJ(t),\quad 0\le t\le T,\nonumber \\ X^\pi (0)= & {} x, \end{aligned}$$
(3.1)

where \(x>0\) denotes the initial wealth. We assume that the survival benefits \(\eta \) are subtracted from \(X^\pi (T)\) at the terminal time T.

In this paper we study the optimization problem:

$$\begin{aligned} \sup _{\pi }{\mathbb {E}}\Big [-e^{-\varGamma \big (X^\pi (t)-J(t)F(t,P(t))\big )\times \big (X^\pi (T)-J(T)\eta (P(T))\big )}|{\mathcal {F}}_t\Big ],\quad 0\le t\le T,\quad \ \end{aligned}$$
(3.2)

where \(\varGamma \) denotes a time-varying risk aversion coefficient whose value at time t depends on the process

$$\begin{aligned} R(t)=X^\pi (t)-J(t)F(t,P(t)),\quad 0\le t\le T. \end{aligned}$$

The process R is interpreted as the insurer’s net asset value - the excess of the insurer’s assets over his liabilities. By the liability we mean the value of the reserve (2.5). The dynamics of the net asset value process R is given by the equation

$$\begin{aligned} dR(t)= & {} \pi (t)\Big (\mu dt+\sigma dW(t)\Big )-J(t-)\alpha (P(t))dt+\beta (P(t))dJ(t)\\&-\,J(t-)F_t(t,P(t))dt-J(t-)F_p(t,P(t))P(t)\Big (a\,dt+b\big (\rho dW(t)+\sqrt{1-\rho ^2}dB(t)\big )\Big )\\&-\,J(t-)\frac{1}{2}F_{pp}(t,P(t))b^2P^2(t)dt-F(t,P(t))dJ(t),\quad 0\le t\le T. \end{aligned}$$

We assume that the risk aversion coefficient in (3.2) satisfies:

  1. (A5)

    \(\varGamma :{\mathbb {R}}\mapsto (0,\infty )\) is bounded, decreasing, Lipschitz continuous and \(\mathcal {C}^2({\mathbb {R}})\).

The motivation for considering the wealth-dependent risk aversion in the optimization problem (3.2) is the following. At different points in time, the insurer is likely to have different exponential utilities which are characterized with different risk aversion coefficients. We expect that the insurer’s risk aversion coefficient should change in time and the dynamics of the risk aversion coefficient should be modelled with an adapted process related to some observable factors. It is very reasonable to assume that the risk aversion coefficient and the willingness to take the financial risk depend on the financial position of the investor. We assume that the value of the insurer’s risk aversion coefficient at time t depends on the current insurer’s net asset value. If the assets are above the liabilities, then the insurer is less risk averse and is willing to implement more risky investment strategies. If the assets are below the liabilities, the insurer is more risk averse and switches to more conservative investment strategies. Hence, the risk aversion coefficient \(\varGamma \) should be a decreasing function of the net asset value.

Let us introduce the set of admissible investment strategies for (3.2).

Definition 3.1

A strategy \(\pi =(\pi (t), 0\le t\le T)\) is called admissible, \(\pi \in \mathcal {A}\), if it satisfies the following conditions:

  1. 1.

    \(\pi :[0,T]\times \varOmega \rightarrow \mathbb {R}\) is an \({\mathbb {F}}\)-predictable process determined with a measurable mapping \(\varPi :[0,T]\times {\mathbb {R}}\times (0,\infty )\times \{0,\ldots ,n\}\mapsto {\mathbb {R}}\) such that \(\pi (t)=\varPi (t,X^\pi (t-),P(t),J(t-))\),

  2. 2.

    The process \(\Big (\int _0^t\pi (s)dW(s), \ 0\le t\le T\Big )\) is a \(BMO({\mathbb {F}})\)-martingale,

  3. 3.

    The stochastic differential equation (3.1) has a unique solution \(X^{\pi }\) on [0, T],

  4. 4.

    \({\mathbb {E}}\Big [e^{- \varGamma (r) \big (X^\pi (T)-J(T)\eta (P(T))\big )}|{\mathcal {F}}_t\Big ]<\infty \) for all \(t\in [0,T]\) and all \(r\in {\mathbb {R}}\).

The above definition of admissible investment strategies is standard for exponential utility maximization problems, see e.g. Hu et al. (2005) and Jeanblanc et al. (2015), except for point 4 where we require that the expected utility of the terminal wealth exists for all risk aversion coefficients defined by \(\varGamma \). However, this requirement is clear since we aim at solving an exponential utility optimization problem with a risk aversion coefficient which changes in time. Let us remark that points 2, 4 and boundedness of \(\eta \) imply that the family \(\{e^{-\varGamma (r) X^\pi (\mathcal {T})}, \ \mathcal {T} \text { is an } {\mathbb {F}}\text {-stopping time}\}\) is uniformly integrable for \(\pi \in \mathcal {A}\) and \(r\in {\mathbb {R}}\), which is often used in the definition of an admissible strategy instead of points 2 and 4, see Remark 8 in Hu et al. (2005). From a financial point of view, points 2 and 4 of Definition 3.1 or the uniform integrability of \(\{e^{-\varGamma (r) X^\pi (\mathcal {T})}, \ \mathcal {T} \text { is an } {\mathbb {F}}\text {-stopping time}\}\) exclude arbitrage investment strategies from consideration, see Remark 2 in Hu et al. (2005). The assumption of uniform integrability is slightly weaker than the other common assumption that the wealth process should be bounded from below, which is used to introduce so-called tame arbitrage-free strategies as admissible strategies, see Definition 3 in Levental and Skorohod (1995). Tame strategies limit borrowing and prevent doubling strategies.

The optimization problem (3.2) is an exponential utility maximization problem for an investor with wealth-dependent risk aversion coefficient \(\varGamma \). We can define the objective function for (3.2):

$$\begin{aligned} v^{k,\pi }(t,x,p)= & {} {\mathbb {E}}\Big [-e^{-\varGamma \big (x-kF(t,p)\big )\big (X^\pi (T)-J(T)\eta (P(T))\big )}|X(t)=x,P(t)=p,J(t)=k\Big ],\nonumber \\&(t,x,p,k)\in [0,T]\times {\mathbb {R}}\times (0,\infty )\times \{0,1,\ldots ,n\},\ \pi \in \mathcal {A}. \end{aligned}$$
(3.3)

The objective function (3.3) is well-defined for any \(\pi \in \mathcal {A}\) by point 4 of Definition 3.1. However, the optimization problem (3.2) is time-inconsistent and Bellman’s principle of optimality cannot be used to find the optimal strategy and the optimal value defined by \(\sup _{\pi \in \mathcal {A}}v^{k,\pi }(t,x,p)\). We use the game-theoretic approach developed by Ekeland and Lazrak (2006), Ekeland and Pirvu (2008), Björk et al. (2017) and Björk and Murgoci (2014). In order to find the solution to (3.2), we consider a game played by a continuum of players with different utilities where the player at time t has its own risk aversion coefficient and only chooses the strategy at time t. We look for the sub-game perfect Nash equilibrium in the game with the reward given by (3.3).

Definition 3.2

Let us consider an admissible strategy \(\pi ^*\in \mathcal {A}\). Fix an arbitrary point \((t,x,p,k)\in [0,T)\times {\mathbb {R}}\times (0,\infty )\times \{0,1,\ldots ,n\}\) and choose an admissible strategy \(\pi \in \mathcal {A}\). For \(\delta >0\) we define a new admissible strategy

$$\begin{aligned} \pi _\delta (s)=\left\{ \begin{array}{ll} \pi (s),\quad t\le s\le t+\delta ,\\ \pi ^*(s),\quad t+\delta < s\le T. \end{array}\right. \end{aligned}$$

If

$$\begin{aligned} \liminf _{\delta \rightarrow 0} \ \frac{v^{k,\pi ^*}(t,x,p)-v^{k,\pi _\delta }(t,x,p)}{\delta }\ge 0, \end{aligned}$$

for all \((t,x,p,k)\in [0,T)\times {\mathbb {R}}\times (0,\infty )\times \{0,1,\ldots ,n\}\) and \(\pi \in \mathcal {A}\), then \(\pi ^*\) is called an equilibrium strategy and \(v^{k,\pi ^*}\) is called the equilibrium value function corresponding to the equilibrium strategy \(\pi ^*\).

In order to characterize the equilibrium value function and the equilibrium strategy with an HJB equation, we need to introduce the second function:

$$\begin{aligned} w^{k,\pi }(t,x,p,r)= & {} {\mathbb {E}}\big [-e^{-\varGamma (r)\big (X^\pi (T)-J(T)\eta (P(T))\big )}|X(t)=x,P(t)=p,J(t)=k,R(t)=r],\nonumber \\&(t,x,p,r,k)\in [0,T]\times {\mathbb {R}}\times (0,\infty )\times {\mathbb {R}}\times \{0,1,\ldots ,n\},\ \pi \in \mathcal {A}.\quad \ \quad \ \end{aligned}$$
(3.4)

The function \(w^k\) gives the value of the objective (3.3) for the optimization problem with the risk aversion depending on an auxiliary parameter r. The function \(w^k\) describes the time-consistent part of the time-inconsistent optimization problem. Under the game-theoretic approach, the agent at time t forms a coalition for an infinitesimal time period and solves a time-consistent exponential utility maximization problem with a constant risk aversion coefficient over the infinitesimal time period, see Remark 2.3 in Björk and Murgoci (2014). The value function for this optimization problem at time t is determined by \(w^k(t,x,p,r)\) where \(r=x-kF(t,p)\). However, the evolution of \(w^k(t,x,p,r)\) cannot characterize the dynamics of the value function of the time-inconsistent optimization problem with time-varying risk aversion since the variable r is held fixed in the definition of \(w^k\). Hence, we need the function \(v^k\) and its dynamics to fully characterize the equilibrium strategy and the equilibrium value function of the exponential utility maximization problem with time-varying risk aversion.

We finish this section by presenting the HJB equation and a verification theorem for our time-inconsistent optimization problem (3.2). First, we introduce operators associated with the continuous parts of \((X^\pi ,P,R)\).

Definition 3.3

Let \(\mathcal {L}^\pi _k\) and \(\mathcal {M}_k^\pi \) denote second order differential operators given by

$$\begin{aligned} \mathcal {L}_{k}^{\pi } \phi (t,x,p)= & {} \phi _x(t,x,p)\big (\pi \mu -k\alpha (p)\big )+\frac{1}{2}\phi _{xx}(t,x,p)\pi ^2\sigma ^2\\&+\,\phi _{px}(t,x,p)\pi bp\sigma \rho +\phi _p(t,x,p) ap+\frac{1}{2}\phi _{pp}(t,x,p) b^2p^2, \\ \mathcal {M}_{k}^{\pi } \phi (t,x,p,r)= & {} \mathcal {L}^\pi _k\phi (t,x,p,r)\\&+\,\phi _r(t,x,p,r)\Big (\pi \mu -k\alpha (p)-kF_t(t,p)-kF_p(t,p)ap-\frac{1}{2}kF_{pp}(t,p)b^2p^2\Big )\\&+\,\frac{1}{2}\phi _{rr}(t,x,p,r)\Big (\pi ^2\sigma ^2+(kF_p(t,p))^2b^2p^2-2\pi kF_p(t,p)bp\sigma \rho \Big )\\&+\,\phi _{rp}(t,x,p,r)\Big (\pi bp\sigma \rho -kF_p(t,p) b^2p^2\Big )\\&+\,\phi _{rx}(t,x,p,r)\Big (\pi ^2\sigma ^2-\pi kF_p(t,p)bp\sigma \rho \Big ). \end{aligned}$$

The operators \(\mathcal {L}^\pi _k\) and \(\mathcal {M}_k^\pi \) are defined, respectively, for \(\phi \in \mathcal {C}^{1,2,2}([0,T]\times {\mathbb {R}}\times (0,\infty ))\) and \(\phi \in \mathcal {C}^{1,2,2,2}([0,T]\times {\mathbb {R}}\times (0,\infty )\times {\mathbb {R}})\). The operator \(\mathcal {L}_k^\pi \phi (t,x,p,r)\) only acts on \((t,x,p)\) and r is kept as a constant.

Theorem 3.1

Let (A1)–(A5) hold. Assume there exist functions \((v^k)_{k=0}^n\in \mathcal {C}([0,T]\times {\mathbb {R}}\times (0,\infty ))\cap \mathcal {C}^{1,2,2}([0,T)\times {\mathbb {R}}\times (0,\infty )), \ (w^k)_{k=0}^n\in \mathcal {C}([0,T]\times {\mathbb {R}}\times (0,\infty )\times {\mathbb {R}})\cap \mathcal {C}^{1,2,2,2}([0,T)\times {\mathbb {R}}\times (0,\infty )\times {\mathbb {R}})\) and an admissible strategy \(\pi ^*=(\pi ^{k,*})_{k=0}^n\in \mathcal {A}\) which solve the system of HJB equations:

$$\begin{aligned}&v_t^k(t,x,p)+\sup _\pi \Big \{\mathcal {L}_k^\pi v^k(t,x,p)-\mathcal {M}^\pi _kw^k(t,x,p,x-kF(t,p))\nonumber \\&\quad +\,\mathcal {L}^\pi _kw^k(t,x,p,x-kF(t,p))\Big \}+\Big (v^{k-1}(t,x-\beta (p),p)-v^k(t,x,p)\Big )k\lambda \nonumber \\&\quad +\,\Big (w^{k-1}(t,x-\beta (p),p,x-kF(t,p))\nonumber \\&\quad -\,w^{k-1}(t,x-\beta (p),p,x-\beta (p)-(k-1)F(t,p))\Big )k\lambda =0,\nonumber \\&\qquad (t,x,p)\in [0,T)\times {\mathbb {R}}\times (0,\infty ),\nonumber \\&v^k(T,x,p)=-e^{-\varGamma (x-k\eta (p))(x-k\eta (p))},\quad (x,p)\in {\mathbb {R}}\times (0,\infty ),\nonumber \\&\pi ^{k,*}=arg \ sup_\pi \ \Big \{\mathcal {L}_k^\pi v^k(t,x,p)-\mathcal {M}^\pi _kw^k(t,x,p,x-kF(t,p))\nonumber \\&\quad +\,\mathcal {L}^\pi _kw^k(t,x,p,x-kF(t,p))\Big \}, \nonumber \\&\qquad (t,x,p)\in [0,T]\times {\mathbb {R}}\times (0,\infty ), \end{aligned}$$
(3.5)

and

$$\begin{aligned}&w_t^k(t,x,p,r)+\mathcal {L}_k^{\pi ^{k,*}}w^k(t,x,p,r)+\Big (w^{k-1}(t,x-\beta (p),p,r)-w^k(t,x,p,r)\Big )k\lambda =0,\nonumber \\&\qquad (t,x,p)\in [0,T)\times {\mathbb {R}}\times (0,\infty ), \ r\in {\mathbb {R}},\nonumber \\&w^k(T,x,p,r)=-e^{-\varGamma (r)(x-k\eta (p))},\quad (x,p)\in {\mathbb {R}}\times (0,\infty ), \ r\in {\mathbb {R}}, \end{aligned}$$
(3.6)

for \(k\in \{0,1,\ldots ,n\}\). In addition, assume that the families

$$\begin{aligned}&\Big \{v^k(\mathcal {T},X^{\pi }(\mathcal {T}),P(\mathcal {T})), \ \mathcal {T} \text { is an } {\mathbb {F}}^{W,B}\text {-stopping time}, \ \mathcal {T}\in [0,T] \Big \},\\&\Big \{w^k(\mathcal {T},X^{\pi }(\mathcal {T}),P(\mathcal {T}),R(\mathcal {T})), \ \mathcal {T} \text { is an } {\mathbb {F}}^{W,B}\text {-stopping time}, \ \mathcal {T}\in [0,T] \Big \},\\&\Big \{w^k(\mathcal {T},X^{\pi }(\mathcal {T}),P(\mathcal {T}),r), \ \mathcal {T} \text { is an } {\mathbb {F}}^{W,B}\text {-stopping time}, \ \mathcal {T}\in [0,T] \Big \}, \ r\in {\mathbb {R}}, \end{aligned}$$

are uniformly integrable for any \(\pi \in \mathcal {A}\) and \(k\in \{0,1,\ldots ,n\}\). Then the strategy \(\pi ^*=(\pi ^{k,*})_{k=0}^n\) is an equilibrium strategy for the time-inconsistent optimization problem (3.2) with wealth-dependent risk aversion coefficient and \(v^k(t,x,p)=v^{k,\pi ^*}(t,x,p)\) is the value function corresponding to the equilibrium strategy \(\pi ^*\).

4 Perturbation theory and first-order approximations

It is known that it is hard to solve HJB equations for time-inconsistent optimization problems, see Ekeland and Lazrak (2006), Ekeland and Pirvu (2008), Björk et al. (2017), Ekeland et al. (2012) and Dong and Sircar (2014). In particular, we are not able to solve our HJB equations (3.5)–(3.6) since standard separation methods cannot be applied and we cannot split the variables in \(v^k\) and \(w^k\). We use perturbation theory to approximate the solutions to the HJB equations (3.5)–(3.6).

Perturbation theory deals with finding an approximate solution to a problem by starting from the exact solution of a related, simpler problem. Perturbation theory can be applied if our problem can be formulated by adding a small term to some parameter of the exactly solvable problem. The solution to the main problem is next expanded in powers of this small parameter. The zeroth-order term in the expansion is the exact solution to the simpler problem and the higher order terms in the expansion describe deviations in the solution to the main problem from the solution of the simpler problem. Since the perturbation technique is based on adding a small parameter, we can truncate the series expansion of the solution to the main problem and keep the first two terms of the expansion as the first-order approximate solution. In financial applications, perturbation theory was developed by Fouque et al. (2011, 2014, 2017) and Fouque and Hu (2017).

It is clear that our exponential utility maximization problem with wealth-dependent risk aversion can be related to a simpler exponential utility maximization problem with constant risk aversion. In order to apply the perturbation theory to solve the optimization problem (3.2), we consider a special structure of the wealth-dependent risk aversion coefficient \(\varGamma \). We choose

$$\begin{aligned} \varGamma (r)=\gamma _0+\gamma _1(r)\epsilon ,\quad r\in {\mathbb {R}}. \end{aligned}$$
(4.1)

We now assume that the insurer’s risk aversion coefficient \(\varGamma \) consists of a constant risk aversion \(\gamma _0>0\) and a small amount \(\epsilon >0\) of wealth-dependent risk aversion \(\gamma _1\). We impose the technical condition:

  1. (A6)

    The function \(\gamma _1:{\mathbb {R}}\mapsto {\mathbb {R}}\) is bounded, decreasing, Lipschitz continuous and \(\mathcal {C}^2({\mathbb {R}})\). Moreover, \(\gamma _1(0)=0\).

The assumption that \(\gamma _1(0)=0\) is a normalizing assumption for the risk aversion coefficient. We note that if \(r>0\) then \(\varGamma (r)<\gamma _0\), and if \(r<0\) then \(\varGamma (r)>\gamma _0\).
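One concrete choice that satisfies (A6) is \(\gamma _1(r)=-\tanh (r)\), which is bounded, strictly decreasing, Lipschitz continuous, smooth and vanishes at zero. This particular function is an assumption made here for illustration only; the paper does not prescribe a specific \(\gamma _1\). The sketch below evaluates the resulting coefficient (4.1); with this choice \(\varGamma \) stays positive whenever \(\epsilon <\gamma _0\).

```python
import math


def gamma1(r, scale=1.0):
    """Hypothetical wealth-dependent part: bounded, decreasing, gamma1(0) = 0."""
    return -math.tanh(r / scale)


def risk_aversion(r, gamma0=1.0, eps=0.1, scale=1.0):
    """Gamma(r) = gamma0 + gamma1(r) * eps, as in (4.1)."""
    return gamma0 + eps * gamma1(r, scale)


# A positive net asset value lowers the coefficient below gamma0,
# a negative net asset value raises it above gamma0.
low = risk_aversion(2.0)
base = risk_aversion(0.0)
high = risk_aversion(-2.0)
```

The parameter `scale` is a hypothetical tuning constant that controls how quickly the coefficient reacts to changes in the net asset value.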

Since our risk aversion coefficient (4.1) consists of a constant risk aversion and a small amount of wealth-dependent risk aversion, we expect that the solution to the exponential utility maximization problem with the wealth-dependent risk aversion \(\varGamma (r)=\gamma _0+\gamma _1(r)\epsilon \) should be expanded around the solution to the exponential utility maximization problem with the constant risk aversion \(\gamma _0\). In particular, the zeroth-order approximation to the equilibrium value function and the equilibrium strategy for the time-inconsistent exponential utility maximization problem (3.2) with the wealth-dependent risk aversion (4.1) should coincide with the value function and the optimal strategy for the time-consistent exponential utility maximization problem with the constant risk aversion \(\gamma _0\). Hence, in the next section we start with investigating the optimization problem (3.2) with \(\varGamma (r)=\gamma _0=\gamma \). In Sect. 5 we study some properties of the zeroth-order solution which allow us in Sect. 6 to derive the first-order correction resulting from adding a small amount of wealth-dependent risk aversion to constant risk aversion.

The goal of this paper is to establish the first-order approximations to the equilibrium value function and the equilibrium strategy for the optimization problem (3.2) in the case of risk aversion \(\varGamma (r)=\gamma _0+\gamma _1(r)\epsilon \) with small \(\epsilon >0\). Formally, we are interested in finding functions \((v_0^k)_{k=0}^n, (v_1^k)_{k=0}^n,(\pi _0^{*,k})_{k=0}^n,(\pi _1^{*,k})_{k=0}^n\) such that

$$\begin{aligned} v^k(t,x,p)= & {} v_0^k(t,x,p)+v_1^k(t,x,p)\epsilon +\mathcal {O}(\epsilon ^2), \quad \epsilon \rightarrow 0, \end{aligned}$$
(4.2)
$$\begin{aligned} \pi ^{*,k}(t,x,p)= & {} \pi _0^{*,k}(t,x,p)+\pi _1^{*,k}(t,x,p)\epsilon +\mathcal {O}(\epsilon ^2),\quad \epsilon \rightarrow 0, \end{aligned}$$
(4.3)

where \((v^{k},\pi ^{*,k})_{k=0}^n\) solve the system of the HJB equations (3.5)–(3.6) with the risk aversion \(\varGamma (r)=\gamma _0+\gamma _1(r)\epsilon \). The formulas (4.2)–(4.3) mean that

$$\begin{aligned} v_0^k(t,x,p)+v_1^k(t,x,p)\epsilon \quad \text {and} \quad \pi _0^{*,k}(t,x,p)+\pi _1^{*,k}(t,x,p)\epsilon , \end{aligned}$$
(4.4)

are the first-order approximations to \(v^k(t,x,p)\) and \(\pi ^{*,k}(t,x,p)\) for small \(\epsilon \). More precisely, the functions (4.4) which satisfy (4.2)–(4.3) are called the asymptotic first-order approximations to \(v^k(t,x,p)\) and \(\pi ^{*,k}(t,x,p)\) as \(\epsilon \rightarrow 0\). The error of the approximation in (4.2), or (4.3), is of a higher order than the approximating function and it is controlled with a function of order \(\mathcal {O}(\epsilon ^2)\), see Definitions 1.1 and 2.1 in Holmes (2013). Let us recall that

$$\begin{aligned} z^\epsilon (x)\sim \mathcal {O}(\epsilon ^\delta ) \ as \ \epsilon \rightarrow 0 \quad if \quad |z^\epsilon (x)|\le K\epsilon ^\delta ,\quad 0\le \epsilon \le \epsilon _0, \end{aligned}$$

for some \(\epsilon _0>0\), where K is independent of \(\epsilon \) but may depend on \((x,\epsilon _0)\).

For details on perturbation theory we refer, e.g., to Holmes (2013). In order to clarify the idea behind finding the asymptotic first-order approximation to a solution of an equation, we present a simple example from Chapter 1.5 in Holmes (2013). Let us consider the equation

$$\begin{aligned} x^2+2\epsilon x-1=0. \end{aligned}$$
(4.5)

We postulate that the solution to (4.5) has the asymptotic expansion

$$\begin{aligned} x=x_0+x_1\epsilon +\mathcal {O}(\epsilon ^2),\quad \epsilon \rightarrow 0. \end{aligned}$$

We substitute the expansion to (4.5) and collect the terms of order \(\mathcal {O}(1),\mathcal {O}(\epsilon ),\mathcal {O}(\epsilon ^2)\):

$$\begin{aligned} x_0^2-1+2\big (x_0+x_0x_1\big )\epsilon +\mathcal {O}(\epsilon ^2)=0. \end{aligned}$$

We choose \(x_0\) and \(x_1\) so that the terms of orders \(\mathcal {O}(1),\mathcal {O}(\epsilon )\) are zero. We find

$$\begin{aligned} x_0=\pm 1,\quad x_1=-1. \end{aligned}$$

The solution \({\tilde{x}}=\pm 1-\epsilon \) is the first-order approximation to the true solution \(x=\pm \sqrt{\epsilon ^2+1}-\epsilon \) of Eq. (4.5) with small \(\epsilon \), or the asymptotic first-order approximation as \(\epsilon \rightarrow 0\), since an expansion of the form (4.2) holds. In other words, the error of approximating \(x=\pm \sqrt{\epsilon ^2+1}-\epsilon \) with \({\tilde{x}}=\pm 1-\epsilon \) is a function of order \(\mathcal {O}(\epsilon ^2)\) as \(\epsilon \rightarrow 0\). We can note that the first-order approximation to the solution to (4.5) results from expanding the true solution around the exact solution to (4.5) with \(\epsilon =0\). We will use the same reasoning in Sect. 6 where we postulate the asymptotic first-order approximation to the solution to our optimization problem (3.2) with the risk aversion coefficient (4.1). We remark that, by construction of the approximate solution inspired by perturbation theory, we only consider the wealth-dependent risk aversion coefficient \(\varGamma (r)=\gamma _0+\gamma _1(r)\epsilon \) with small \(\epsilon >0\).
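The order of the error in this toy example can be checked numerically. The following Python sketch is an illustration only and is not part of the original derivation; the function names are hypothetical. It compares the exact roots of (4.5) with \(\pm 1-\epsilon \) and reports the error divided by \(\epsilon ^2\), which should remain bounded as \(\epsilon \rightarrow 0\) (it tends to 1/2, because \(\sqrt{1+\epsilon ^2}=1+\epsilon ^2/2+\mathcal {O}(\epsilon ^4)\)).

```python
import math


def exact_root(eps: float, sign: int = 1) -> float:
    """Exact root of x^2 + 2*eps*x - 1 = 0; sign = +1 or -1 selects the branch."""
    return sign * math.sqrt(eps * eps + 1.0) - eps


def first_order_root(eps: float, sign: int = 1) -> float:
    """First-order approximation x0 + x1*eps with x0 = sign and x1 = -1."""
    return sign - eps


def scaled_error(eps: float, sign: int = 1) -> float:
    """|exact - approximation| / eps^2; bounded if the error is O(eps^2)."""
    return abs(exact_root(eps, sign) - first_order_root(eps, sign)) / eps**2


if __name__ == "__main__":
    for eps in (0.1, 0.01, 0.001):
        print(eps, scaled_error(eps, +1), scaled_error(eps, -1))
```

A bounded ratio is exactly the condition \(|z^\epsilon |\le K\epsilon ^2\) recalled above, with K close to 1/2 in this example.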

5 The optimization problem with constant risk aversion coefficient

Since we expect that the zeroth-order approximations to the solutions to the HJB equations (3.5)–(3.6) are given by the solution to the exponential utility maximization problem with constant risk aversion, we start by investigating the optimization problem (3.2) with \(\varGamma \big (x-kF(t,p)\big )=\gamma \).

First, let us introduce some spaces and their norms. Let \(\mathbb {G}\) be some filtration and \(q\ge 1\). Let \(\mathcal {R}^q(\mathbb {G})\) denote the space of \(\mathbb {G}\)-adapted processes \(\mathcal {X}\) such that \(||\mathcal {X}||_{\mathcal {R}^q}=\big ({\mathbb {E}}\big [\sup _{t\in [0,T]}|\mathcal {X}(t)|^{q}\big ]\big )^{\frac{1}{q}}<\infty \). By \(\mathcal {R}^\infty (\mathbb {G})\) we denote the space of bounded \(\mathbb {G}\)-adapted processes equipped with the norm \(||\mathcal {X}||_{\mathcal {R}^\infty }=\sup _{t\in [0,T]}|\mathcal {X}(t)|\). Let \(\mathcal {H}^q(\mathbb {G})\) denote the space of \(\mathbb {G}\)-predictable processes \(\mathcal {X}\) such that \(||\mathcal {X}||_{\mathcal {H}^q}=\big ({\mathbb {E}}\big [\big (\int _0^T|\mathcal {X}(t)|^{2}dt\big )^{q/2}\big ]\big )^{\frac{1}{q}}<\infty \). Finally, let \(BMO^q(\mathbb {G})\) denote the space of uniformly integrable \(\mathbb {G}\)-martingales \(\mathcal {X}\) such that \(||\mathcal {X}||_{BMO^q}=\sup _{\mathbb {G}-stopping \ time \ \tau }\big ({\mathbb {E}}\big [|\mathcal {X}(T)-\mathcal {X}(\tau )|^{q}|{\mathcal {F}}_\tau \big ]\big )^{\frac{1}{q}}<\infty \). The \(BMO^{q_1}\)-norm is equivalent to the \(BMO^{q_2}\)-norm, and we will use the \(BMO^2\)-norm, see Corollary 2.1 in Kazamaki (1997). For a martingale \(\mathcal {X}(t)=\int _0^t\mathcal {Z}(s)dW(s)\) we have \(||\mathcal {X}||_{BMO^2}=\sup _{\mathbb {G}-stopping \ time \ \tau }\big ({\mathbb {E}}\big [\int _\tau ^T|\mathcal {Z}(s)|^2ds|{\mathcal {F}}_\tau \big ]\big )^{\frac{1}{2}}\). If \(\mathcal {Z}\in \mathcal {H}^2(\mathbb {G})\), we will abuse the notation and set \(||\mathcal {Z}||_{BMO^2}=||\int _0^\cdot \mathcal {Z}(s)dW(s)||_{BMO^2}=\sup _{\mathbb {G}-stopping \ time \ \tau }\big ({\mathbb {E}}\big [\int _\tau ^T|\mathcal {Z}(s)|^2ds|{\mathcal {F}}_\tau \big ]\big )^{\frac{1}{2}}\). Moreover, the norm \(||\cdot ||_{BMO^2}\) will be simply denoted by \(||\cdot ||_{BMO}\).

We define the objective function and the value function for the exponential utility maximization problem with constant risk aversion:

$$\begin{aligned} V^{k,\pi }(t,x,p)= & {} {\mathbb {E}}\Big [-e^{-\gamma \big (X^\pi (T)-J(T)\eta (P(T))\big )}\big |X(t)=x,P(t)=p,J(t)=k\Big ],\nonumber \\&\quad \ (t,x,p,k)\in [0,T]\times {\mathbb {R}}\times (0,\infty )\times \{0,1,\ldots ,n\},\ \pi \in \mathcal {A},\quad \ \quad \ \end{aligned}$$
(5.1)
$$\begin{aligned} V^k(t,x,p)= & {} \sup _{\pi \in \mathcal {A}}V^{k,\pi }(t,x,p),\nonumber \\&\quad (t,x,p,k)\in [0,T]\times {\mathbb {R}}\times (0,\infty )\times \{0,1,\ldots ,n\}. \end{aligned}$$
(5.2)

It is known that the solution to the optimization problem (5.2) can be characterized with solutions to BSDEs or PDEs.

Let us study the system of BSDEs:

$$\begin{aligned} Y^{k}(t)= & {} k\eta (P(T))-\int _t^T\Big (\frac{\mu ^2}{2\sigma ^2\gamma }-k\alpha (P(s))+\frac{\mu }{\sigma }Z_1^{k}(s)-\frac{1}{2}\gamma (Z_2^{k}(s))^2\nonumber \\&-\frac{e^{\gamma (\beta (P(s))+Y^{k-1}(s)-Y^{k}(s))}-1}{\gamma }k\lambda \Big )ds\nonumber \\&-\int _t^TZ_1^{k}(s)dW(s)-\int _t^TZ_2^{k}(s)dB(s),\quad 0\le t\le T, \quad k\in \{0,1,\ldots ,n\},\nonumber \\ \end{aligned}$$
(5.3)

Proposition 5.1

Let (A1)–(A3) hold.

  1. (i)

    There exist unique solutions \((Y^k,Z_1^k,Z_2^k)_{k=0}^n\in \mathcal {R}^2({\mathbb {F}}^{W,B})\times \mathcal {H}^2({\mathbb {F}}^{W,B})\times \mathcal {H}^2({\mathbb {F}}^{W,B})\) to the system of BSDEs (5.3) such that, for each \(k\in \{0,1,\ldots ,n\}\), the process \(Y^k\) is bounded and \(\big (\int _0^tZ^k_1(s)dW(s), 0\le t\le T\big ), \big (\int _0^tZ^k_2(s)dB(s), 0\le t\le T\big )\) are \(BMO({\mathbb {F}}^{W,B})\)-martingales.

  2. (ii)

    The norms \(||Y^{k,\gamma }||_{\mathcal {R}^\infty }, \ ||Z^{k,\gamma }_1||_{BMO},\ ||Z^{k,\gamma }_2||_{BMO}\) are bounded uniformly in \(k\in \{0,\ldots ,n\}\) and \(\gamma \in (\gamma _0-\epsilon ,\gamma _0+\epsilon )\) for \(\epsilon <\gamma _0\).

  3. (iii)

    Let \((Y^{k,t,p})_{k=0}^n\) denote the solutions to the BSDEs (5.3) with the forward equation (2.3) with the initial condition \(P(t)=p\). For each \(k\in \{0,1,\ldots ,n\}\), we have

    $$\begin{aligned} {\mathbb {E}}\Big [\sup _{s\in [0,T]}\big |Y^{k,t,p}(s)-Y^{k,t,p'}(s)\big |^{2q}\Big ]\le K|p-p'|^{2q},\quad q>1, \end{aligned}$$
    (5.4)

    for any \((t,p), (t,p')\in [0,T]\times (0,\infty )\), where the constant K is independent of \((k,t,p,p')\).

Alternatively, we can consider the system of PDEs:

$$\begin{aligned}&h_t^k(t,p)+\Big (a-\frac{\mu b\rho }{\sigma }\Big )ph_p^k(t,p)+\frac{1}{2}b^2p^2h_{pp}^k(t,p)+k\alpha (p)-\frac{\mu ^2}{2\sigma ^2\gamma }-\frac{k\lambda }{\gamma }\nonumber \\&\quad +\frac{e^{\gamma \beta (p)} e^{\gamma h^{k-1}(t,p)}}{\gamma }k\lambda e^{-\gamma h^k(t,p)}\nonumber \\&\quad +\frac{1}{2}\gamma (1-\rho ^2)b^2p^2(h^k_p(t,p))^2=0,\quad (t,p)\in [0,T)\times (0,\infty ),\nonumber \\&h^k(T,p)=k\eta (p),\quad p\in (0,\infty ),\quad k\in \{0,\ldots ,n\}. \end{aligned}$$
(5.5)

Proposition 5.2

Let (A1)–(A3) hold.

  1. (i)

    There exist unique solutions \((h^k)_{k=0}^n\in \mathcal {C}([0,T]\times (0,\infty ))\cap \mathcal {C}^{1,2}([0,T)\times (0,\infty ))\) to the system of PDEs (5.5).

  2. (ii)

    We have

    $$\begin{aligned} Y^k(t)= & {} h^k(t,P(t)),\quad Z_1^k(t)=h^k_p(t,P(t))bP(t)\rho ,\nonumber \\ Z^k_2(t)= & {} h^k_p(t,P(t))bP(t)\sqrt{1-\rho ^2},\quad 0\le t\le T, \quad k\in \{0,\ldots ,n\}, \end{aligned}$$
    (5.6)

    where \((Y^k,Z_1^k,Z_2^k)_{k=0}^n\) are defined in point (i) of Proposition 5.1.

The optimal solution to (5.2) is characterized in the following theorem.

Theorem 5.1

Let (A1)–(A3) hold. The strategy

$$\begin{aligned} \pi ^*(t)= & {} \sum _{k=0}^n\pi ^{k,*}(t){\mathbf {1}}\{J(t-)=k\},\quad 0\le t\le T,\nonumber \\ \pi ^{k,*}(t)= & {} \frac{\mu }{\sigma ^2\gamma }+\frac{Z^k_1(t)}{\sigma },\quad 0\le t\le T, \end{aligned}$$
(5.7)

is the optimal admissible investment strategy for the optimization problem (5.1)–(5.2) and \(V^k(t,x,p)=V^{k,\pi ^*}(t,x,p)=-e^{-\gamma x}e^{\gamma Y^k(t)}|_{P(t)=p}\) is the value function corresponding to the strategy \(\pi ^*\). Alternatively, we can characterize the optimal strategy (5.7) with the functions \((h^k)_{k=0}^n\) from Proposition 5.2.
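To make the two equivalent characterizations concrete, the following Python sketch evaluates (5.6) and (5.7) at a single point. It is only an illustration: the function names are hypothetical, and the derivative \(h^k_p(t,p)\) is assumed to be supplied from outside (for instance by a numerical solver for (5.5), which is not shown here).

```python
import math
from typing import Tuple


def bsde_controls(h_p: float, p: float, b: float, rho: float) -> Tuple[float, float]:
    """(Z_1, Z_2) from (5.6): Z_1 = h_p*b*p*rho, Z_2 = h_p*b*p*sqrt(1 - rho^2)."""
    scale = h_p * b * p
    return scale * rho, scale * math.sqrt(1.0 - rho * rho)


def optimal_strategy(mu: float, sigma: float, gamma: float, z1: float) -> float:
    """pi^{k,*} from (5.7): the Merton term plus the hedging term Z_1/sigma."""
    return mu / (sigma**2 * gamma) + z1 / sigma


if __name__ == "__main__":
    # Illustrative numbers only; h_p = 0.5 is an assumed input, not a computed value.
    z1, z2 = bsde_controls(h_p=0.5, p=100.0, b=0.2, rho=0.6)
    print(z1, z2, optimal_strategy(mu=0.06, sigma=0.2, gamma=2.0, z1=z1))
```

Only \(Z_1^k\) enters the strategy; \(Z_2^k\) carries the part of the index risk that cannot be hedged with the stock and affects the strategy only indirectly through the generator of (5.3).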

Expansions in perturbation theory are often justified by recalling Taylor’s theorem and expanding the function in powers of the small parameter \(\epsilon \). This implies that the term of order \(\mathcal {O}(\epsilon )\) in the expansion is related to the first derivative of the function with respect to the parameter which is perturbed by adding \(\epsilon \). The value function from Theorem 5.1 depends on the risk aversion coefficient \(\gamma \), in particular the solutions \((Y^k)_{k=0}^n\) and \((h^k)_{k=0}^n\) depend on \(\gamma \). Consequently, our next step is to investigate the derivative of the process \(Y^k\), and the derivative of the function \(h^k\), with respect to the risk aversion coefficient \(\gamma \). The following propositions are crucial for establishing the first-order correction in the expansion of the equilibrium value function.

Let us introduce the system of BSDEs:

$$\begin{aligned} \mathcal {Y}^{k,\gamma }(t)= & {} -\int _t^T\Big (-\frac{\mu ^2}{2\sigma ^2\gamma ^2}-\frac{1}{2}(Z_2^{k,\gamma }(s))^2\nonumber \\&-\,\frac{e^{\gamma (\beta (P(s))+Y^{k-1,\gamma }(s)-Y^{k,\gamma }(s))}\Big (\gamma \Big (\beta (P(s))+Y^{k-1,\gamma }(s)-Y^{k,\gamma }(s)\Big )-1\Big )+1}{\gamma ^2}k\lambda \nonumber \\&-\,e^{\gamma \big (\beta (P(s))+Y^{k-1,\gamma }(s)-Y^{k,\gamma }(s)\big )}k\lambda \mathcal {Y}^{k-1,\gamma }(s)\nonumber \\&+\,e^{\gamma \big (\beta (P(s))+Y^{k-1,\gamma }(s)-Y^{k,\gamma }(s)\big )}k\lambda \mathcal {Y}^{k,\gamma }(s)+\frac{\mu }{\sigma }\mathcal {Z}_1^{k,\gamma }(s)-\gamma Z_2^{k,\gamma }(s)\mathcal {Z}_2^{k,\gamma }(s)\Big )ds\nonumber \\&-\,\int _t^T\mathcal {Z}_1^{k,\gamma }(s)dW(s)-\int _t^T\mathcal {Z}_2^{k,\gamma }(s)dB(s),\quad 0\le t\le T, \quad k\in \{0,\ldots ,n\}, \end{aligned}$$
(5.8)
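The generator of (5.8) is the formal derivative of the generator of (5.3) with respect to \(\gamma \). The least obvious piece is the jump term. The Python sketch below is an illustration with hypothetical function names: it freezes \(d=\beta +Y^{k-1}-Y^k\) and checks by central finite differences that the partial derivative of \((e^{\gamma d}-1)/\gamma \) in \(\gamma \) equals \(\big (e^{\gamma d}(\gamma d-1)+1\big )/\gamma ^2\). The remaining terms \(e^{\gamma d}k\lambda (\mathcal {Y}^{k-1}-\mathcal {Y}^{k})\) in (5.8) come from the dependence of \(d\) itself on \(\gamma \) and are not covered by this check.

```python
import math
from typing import Callable


def jump_term(gamma: float, d: float) -> float:
    """(exp(gamma*d) - 1) / gamma with d = beta + Y^{k-1} - Y^k held fixed."""
    return (math.exp(gamma * d) - 1.0) / gamma


def jump_term_dgamma(gamma: float, d: float) -> float:
    """Closed-form partial derivative in gamma, as it appears in (5.8)."""
    e = math.exp(gamma * d)
    return (e * (gamma * d - 1.0) + 1.0) / gamma**2


def central_difference(f: Callable[[float], float], x: float, step: float = 1e-6) -> float:
    """Symmetric finite-difference approximation to f'(x)."""
    return (f(x + step) - f(x - step)) / (2.0 * step)


if __name__ == "__main__":
    gamma, d = 1.5, 0.3  # illustrative values
    numeric = central_difference(lambda g: jump_term(g, d), gamma)
    print(numeric, jump_term_dgamma(gamma, d))
```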

Proposition 5.3

Let (A1)–(A3) hold. Consider the processes \((Y^{k,\gamma },Z_1^{k,\gamma },Z_2^{k,\gamma })_{k=0}^n\) which solve the BSDEs (5.3).

  1. (i)

    The processes \((Y^{k,\gamma },Z_1^{k,\gamma },Z_2^{k,\gamma })_{k=0}^n\) are differentiable with respect to the risk aversion coefficient \(\gamma \) in \(\mathcal {R}^2({\mathbb {F}}^{W,B})\times \mathcal {H}^2({\mathbb {F}}^{W,B})\times \mathcal {H}^2({\mathbb {F}}^{W,B})\) and the derivatives \(\mathcal {Y}^{k,\gamma }(t)=\frac{d}{d\gamma }Y^{k,\gamma }(t), \ \mathcal {Z}^{k,\gamma }_1(t)=\frac{d}{d\gamma }Z_1^{k,\gamma }(t), \ \mathcal {Z}^{k,\gamma }_2(t)=\frac{d}{d\gamma }Z_2^{k,\gamma }(t)\) solve the system of the BSDEs (5.8).

  2. (ii)

    For each \(k\in \{0,1,\ldots ,n\}\), the process \(\mathcal {Y}^{k,\gamma }\) is \({\mathbb {F}}^{W,B}\)-adapted, \((\mathcal {Z}^{k,\gamma }_1,\mathcal {Z}^{k,\gamma }_2)\) are \({\mathbb {F}}^{W,B}\)-predictable, \(\mathcal {Y}^{k,\gamma }\) is bounded and \(\big (\int _0^t\mathcal {Z}^{k,\gamma }_1(s)dW(s), 0\le t\le T\big ), \big (\int _0^t\mathcal {Z}^{k,\gamma }_2(s)dB(s), 0\le t\le T\big )\) are \(BMO({\mathbb {F}}^{W,B})\)-martingales.

  3. (iii)

    The norms \(||\mathcal {Y}^{k,\gamma }||_{\mathcal {R}^\infty }, \ ||\mathcal {Z}^{k,\gamma }_1||_{BMO},\ ||\mathcal {Z}^{k,\gamma }_2||_{BMO}\) are bounded uniformly in \(k\in \{0,\ldots ,n\}\) and \(\gamma \in (\gamma _0-\epsilon ,\gamma _0+\epsilon )\) for \(\epsilon <\gamma _0\).

  4. (iv)

    For each \(k\in \{0,1,\ldots ,n\}\), the solution to the BSDE (5.8) is unique in \(\mathcal {R}^2({\mathbb {F}}^{W,B})\times \mathcal {H}^2({\mathbb {F}}^{W,B})\times \mathcal {H}^2({\mathbb {F}}^{W,B})\).

  5. (v)

    Let \((\mathcal {Y}^{k,t,p})_{k=0}^n\) denote the solutions to the BSDEs (5.8) with the forward equation (2.3) with the initial condition \(P(t)=p\). For each \(k\in \{0,1,\ldots ,n\}\), we have

    $$\begin{aligned} {\mathbb {E}}\Big [\sup _{s\in [0,T]}\big |\mathcal {Y}^{k,t,p}(s)-\mathcal {Y}^{k,t,p'}(s)\big |^{2q}\Big ]\le K|p-p'|^{2q},\quad q>1, \end{aligned}$$
    (5.9)

    for any \((t,p), (t,p')\in [0,T]\times (0,\infty )\), where the constant K is independent of \((k,t,p,p')\).

The last result of this section establishes the relation between the solutions to the BSDEs (5.8) and solutions to PDEs. We investigate the system of PDEs:

$$\begin{aligned}&g_t^k(t,p)+\Big (a-\frac{\mu b\rho }{\sigma }+\gamma (1-\rho ^2)b^2ph^k_p(t,p)\Big )pg_p^k(t,p)+\frac{1}{2}b^2p^2g_{pp}^k(t,p)\nonumber \\&\quad -\,e^{\gamma \big (\beta (p)+h^{k-1}(t,p)-h^{k}(t,p)\big )}k\lambda g^k(t,p)\nonumber \\&\quad +\,\frac{\mu ^2}{2\sigma ^2\gamma ^2}+\frac{1}{2}(1-\rho ^2)b^2p^2\big (h^k_p(t,p)\big )^2\nonumber \\&\quad +\,\frac{e^{\gamma \big (\beta (p)+h^{k-1}(t,p)-h^{k}(t,p)\big )}\Big (\gamma \Big (\beta (p)+h^{k-1}(t,p)-h^{k}(t,p)\Big )-1\Big )+1}{\gamma ^2}k\lambda \nonumber \\&\quad +\,e^{\gamma \big (\beta (p)+h^{k-1}(t,p)-h^{k}(t,p)\big )}k\lambda g^{k-1}(t,p)=0,\quad (t,p)\in [0,T)\times (0,\infty ),\nonumber \\&g^k(T,p)=0,\quad p\in (0,\infty ),\quad k\in \{0,\ldots ,n\}, \end{aligned}$$
(5.10)

where \((h^k)_{k=0}^n\) are defined in Proposition 5.2. We need to impose an additional smoothness condition for the functions \((h^k)_{k=0}^n\) in order to guarantee smooth solutions \((g^k)_{k=0}^n\) to (5.10). We assume that

  1. (A7)

    There exist mixed derivatives \((h^k_{tp})_{k=0}^n\in \mathcal {C}([0,T)\times (0,\infty ))\).

Assumption (A7) is not needed if \(\rho ^2=1\), e.g. when the benefits \(\alpha ,\beta ,\eta \) are contingent on the tradeable risky asset S.

Proposition 5.4

Let (A1)–(A3) and (A7) hold.

  1. (i)

    There exist unique solutions \((g^k)_{k=0}^n\in \mathcal {C}([0,T]\times (0,\infty ))\cap \mathcal {C}^{1,2}([0,T)\times (0,\infty ))\) to the system of PDEs (5.10).

  2. (ii)

    We have

    $$\begin{aligned} \mathcal {Y}^k(t)= & {} g^k(t,P(t)),\quad \mathcal {Z}_1^k(t)=g^k_p(t,P(t))bP(t)\rho ,\nonumber \\ \mathcal {Z}^k_2(t)= & {} g^k_p(t,P(t))bP(t)\sqrt{1-\rho ^2},\quad 0\le t\le T, \quad k\in \{0,\ldots ,n\},\nonumber \\ \end{aligned}$$
    (5.11)

    where \((\mathcal {Y}^k,\mathcal {Z}_1^k,\mathcal {Z}_2^k)_{k=0}^n\) are defined in Proposition 5.3.

6 The optimization problem with wealth-dependent risk aversion coefficient

In view of the discussion from Sect. 4, we postulate the following first-order expansions:

$$\begin{aligned} v^k(t,x,p)= & {} v_0^k(t,x,p)+v_1^k(t,x,p)\epsilon +\mathcal {O}(\epsilon ^2), \quad \epsilon \rightarrow 0, \end{aligned}$$
(6.1)
$$\begin{aligned} w^k(t,x,p,r)= & {} w_0^k(t,x,p)+w_1^k(t,x,p,r)\epsilon +\mathcal {O}(\epsilon ^2),\quad \epsilon \rightarrow 0, \end{aligned}$$
(6.2)
$$\begin{aligned} \pi ^{k,*}(t,x,p)= & {} \pi ^{k,*}_0(t,x,p)+\pi ^{k,*}_1(t,x,p)\epsilon +\mathcal {O}(\epsilon ^2),\quad \epsilon \rightarrow 0. \end{aligned}$$
(6.3)

We also assume that derivatives of \((v^k)_{k=0}^n, (w^k)_{k=0}^n\) satisfy the first-order expansions of the same form (6.1)–(6.2), see Chapter 1.4.3 in Holmes (2013).

From Eq. (3.5) we deduce that the true equilibrium strategy takes the form

$$\begin{aligned}&\pi ^{k,*}(t,x,p)\nonumber \\&\quad =-\frac{\Big (v_x^k(t,x,p)-w_r^k(t,x,p,x-kF(t,p))\Big )\mu }{\Big (v_{xx}^k(t,x,p)-w_{rr}^k(t,x,p,x-kF(t,p))-2w_{xr}^k(t,x,p,x-kF(t,p))\Big )\sigma ^2}\nonumber \\&\qquad -\frac{\Big (v_{px}^k(t,x,p)-w_{pr}^k(t,x,p,x-kF(t,p))\Big ) bp\sigma \rho }{\Big (v_{xx}^k(t,x,p)-w_{rr}^k(t,x,p,x-kF(t,p))-2w_{xr}^k(t,x,p,x-kF(t,p))\Big )\sigma ^2}\nonumber \\&\qquad -\frac{\Big (w_{rr}^k(t,x,p,x-kF(t,p))+w_{rx}^k(t,x,p,x-kF(t,p))\Big )kF_p(t,p) bp\sigma \rho }{\Big (v_{xx}^k(t,x,p)-w_{rr}^k(t,x,p,x-kF(t,p))-2w_{xr}^k(t,x,p,x-kF(t,p))\Big )\sigma ^2},\nonumber \\&\quad \qquad (t,x,p,k)\in [0,T]\times {\mathbb {R}}\times (0,\infty )\times \{0,\ldots ,n\}. \end{aligned}$$
(6.4)

We remark that \(w_r\) in (6.4) denotes the derivative of w with respect to r evaluated at \(r=x-kF(t,p)\). If the first-order expansions (6.1)–(6.2) for the functions \((v^k)_{k=0}^n, \ (w^k)_{k=0}^n\) and their derivatives are substituted into the equilibrium strategy (6.4), then we can confirm the first-order expansion for the equilibrium strategy (6.3). In the expansion (6.3) we have to use

$$\begin{aligned}&\pi ^{k,*}_0(t,x,p)=-\frac{v_{0,x}^k(t,x,p)\mu +v_{0,px}^k(t,x,p)bp\sigma \rho }{v_{0,xx}^k(t,x,p)\sigma ^2},\nonumber \\&\pi ^{k,*}_1(t,x,p)=\frac{v_{0,x}^k(t,x,p)\mu +v_{0,px}^k(t,x,p)bp\sigma \rho }{(v_{0,xx}^k(t,x,p))^2\sigma ^2}\nonumber \\&\quad \times \,\Big (v_{1,xx}^k(t,x,p)-w_{1,rr}^k(t,x,p,x-kF(t,p))-2w_{1,xr}^k(t,x,p,x-kF(t,p))\Big )\nonumber \\&\quad -\,\frac{\Big (v_{1,x}^k(t,x,p)-w_{1,r}^k(t,x,p,x-kF(t,p))\Big )\mu }{v_{0,xx}^k(t,x,p)\sigma ^2}\\&\quad -\,\frac{\Big (v^k_{1,px}(t,x,p)-w^k_{1,pr}(t,x,p,x-kF(t,p))\Big ) bp\sigma \rho }{v_{0,xx}^k(t,x,p)\sigma ^2}\\&\quad -\,\frac{\Big (w_{1,rr}^k(t,x,p,x-kF(t,p))+w_{1,xr}^k(t,x,p,x-kF(t,p))\Big )kF_p(t,p) bp\sigma \rho }{v_{0,xx}^k(t,x,p)\sigma ^2}. \end{aligned}$$
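The formulas for \(\pi ^{k,*}_0\) and \(\pi ^{k,*}_1\) follow from expanding a quotient. Writing the numerator of (6.4) as \(N_0+N_1\epsilon \) and the bracket in the denominator as \(D_0+D_1\epsilon \), with \(D_0=v^k_{0,xx}\) because \(w_0^k\) does not depend on r, we get \(-\frac{N_0+N_1\epsilon }{(D_0+D_1\epsilon )\sigma ^2}=-\frac{N_0}{D_0\sigma ^2}+\Big (\frac{N_0D_1}{D_0^2\sigma ^2}-\frac{N_1}{D_0\sigma ^2}\Big )\epsilon +\mathcal {O}(\epsilon ^2)\). The Python sketch below checks this expansion numerically. The scalars are placeholders standing for the derivative combinations evaluated at one point; they are assumptions made for illustration, not values from the model.

```python
def ratio(n0: float, n1: float, d0: float, d1: float, sigma: float, eps: float) -> float:
    """-(N0 + eps*N1) / ((D0 + eps*D1) * sigma^2), the structure of (6.4)."""
    return -(n0 + eps * n1) / ((d0 + eps * d1) * sigma**2)


def ratio_first_order(n0: float, n1: float, d0: float, d1: float, sigma: float, eps: float) -> float:
    """pi_0 + pi_1*eps with pi_0 = -N0/(D0*sigma^2) and
    pi_1 = N0*D1/(D0^2*sigma^2) - N1/(D0*sigma^2)."""
    pi0 = -n0 / (d0 * sigma**2)
    pi1 = n0 * d1 / (d0**2 * sigma**2) - n1 / (d0 * sigma**2)
    return pi0 + pi1 * eps


def expansion_error(eps: float) -> float:
    """Absolute error of the first-order expansion for fixed placeholder inputs."""
    args = (1.0, 0.5, -2.0, 0.3, 0.2)  # hypothetical N0, N1, D0, D1, sigma
    return abs(ratio(*args, eps) - ratio_first_order(*args, eps))


if __name__ == "__main__":
    for eps in (1e-1, 1e-2, 1e-3):
        print(eps, expansion_error(eps), expansion_error(eps) / eps**2)
```

Dividing \(\epsilon \) by ten should reduce the error by a factor of about one hundred, which is the signature of an \(\mathcal {O}(\epsilon ^2)\) remainder.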

We now substitute the expansions (6.1)–(6.3) for \((v^k)_{k=0}^n, \ (w^k)_{k=0}^n\) and \((\pi ^{k,*})_{k=0}^n\) into the system of HJB equations (3.5)–(3.6). We collect the terms of orders \(\mathcal {O}(1)\) and \(\mathcal {O}(\epsilon )\) and set them to zero, see Sect. 4. After some calculations, we can derive the system of PDEs:

$$\begin{aligned}&v_{0,t}^k(t,x,p)+\mathcal {L}_{k}^{\pi _0^{k,*}} v_0^k(t,x,p)+\Big (v_0^{k-1}(t,x-\beta (p),p)-v_0^k(t,x,p)\Big )k\lambda =0,\nonumber \\&\quad \ \quad \ (t,x,p)\in [0,T)\times {\mathbb {R}}\times (0,\infty ),\nonumber \\&v_0^k(T,x,p)=-e^{-\gamma _0(x-k\eta (p))},\quad (x,p)\in {\mathbb {R}}\times (0,\infty ), \end{aligned}$$
(6.5)
$$\begin{aligned}&v_{1,t}^k(t,x,p)+\mathcal {L}_k^{\pi ^{k,*}_0} v_1^k(t,x,p)-\mathcal {M}^{\pi ^{k,*}_0}_kw_1^k(t,x,p,x-kF(t,p))\nonumber \\&\quad +\,\mathcal {L}^{\pi ^{k,*}_0}_kw_1^k(t,x,p,x-kF(t,p))+\Big (v_1^{k-1}(t,x-\beta (p),p)-v_1^k(t,x,p)\Big )k\lambda \nonumber \\&\quad +\,\Big (w_1^{k-1}(t,x-\beta (p),p,x-kF(t,p))\nonumber \\&\quad -\,w_1^{k-1}(t,x-\beta (p),p,x-\beta (p)-(k-1)F(t,p))\Big )k\lambda =0,\nonumber \\&\quad \ \quad \ (t,x,p)\in [0,T)\times {\mathbb {R}}\times (0,\infty ),\nonumber \\&v_1^k(T,x,p)=\gamma _1(x-k\eta (p))(x-k\eta (p))e^{-\gamma _0(x-k\eta (p))},\quad (x,p)\in {\mathbb {R}}\times (0,\infty ),\quad \ \quad \ \end{aligned}$$
(6.6)
$$\begin{aligned}&w^k_{0,t}(t,x,p)+\mathcal {L}_k^{\pi ^{k,*}_0}w^k_{0}(t,x,p)+\Big (w_0^{k-1}(t,x-\beta (p),p)-w_0^k(t,x,p)\Big )k\lambda =0,\nonumber \\&\quad \ \quad \ (t,x,p)\in [0,T)\times {\mathbb {R}}\times (0,\infty ),\nonumber \\&w_0^k(T,x,p)=-e^{-\gamma _0(x-k\eta (p))},\quad (x,p)\in {\mathbb {R}}\times (0,\infty ), \end{aligned}$$
(6.7)
$$\begin{aligned}&w^k_{1,t}(t,x,p,r)+\mathcal {L}_k^{\pi ^{k,*}_0}w^k_{1}(t,x,p,r)+\Big (w_1^{k-1}(t,x-\beta (p),p,r)-w_1^k(t,x,p,r)\Big )k\lambda =0\nonumber \\&\quad \ \quad \ (t,x,p)\in [0,T)\times {\mathbb {R}}\times (0,\infty ), \ r\in {\mathbb {R}},\nonumber \\&w_1^k(T,x,p,r)=\gamma _1(r)(x-k\eta (p))e^{-\gamma _0(x-k\eta (p))},\quad (x,p)\in {\mathbb {R}}\times (0,\infty ), \ r\in {\mathbb {R}}, \end{aligned}$$
(6.8)

for \(k=0,1,\ldots ,n\).

Recalling the discussion from Sect. 4, we expect that

$$\begin{aligned} v_0^k(t,x,p)= & {} V^{k,\gamma _0}(t,x,p)=-e^{-\gamma _0x}e^{\gamma _0 h^{k,\gamma _0}(t,p)}, \end{aligned}$$
(6.9)
$$\begin{aligned} w_0^k(t,x,p)= & {} V^{k,\gamma _0}(t,x,p)=-e^{-\gamma _0x}e^{\gamma _0 h^{k,\gamma _0}(t,p)}, \end{aligned}$$
(6.10)

where \((h^k)_{k=0}^n\) are defined in Proposition 5.2. If (6.9)–(6.10) indeed hold, then

$$\begin{aligned} \pi ^{k,*}_0(t,p)=\frac{\mu }{\sigma ^2\gamma _0}+\frac{h^{k,\gamma _0}_p(t,p)bp\rho }{\sigma }. \end{aligned}$$
(6.11)

Our choices (6.9)–(6.11) can be verified by direct substitution into (6.5) and (6.7). We can also expect that

$$\begin{aligned} v_1^k(t,x,p)= & {} \gamma _1(x-kF(t,p))\frac{d}{d\gamma }V^{k,\gamma }(t,x,p)|_{\gamma =\gamma _0}\nonumber \\= & {} \gamma _1(x-kF(t,p))\Big (x-h^{k,\gamma _0}(t,p)-\gamma _0g^{k,\gamma _0}(t,p)\Big )e^{-\gamma _0x}e^{\gamma _0 h^{k,\gamma _0}(t,p)},\quad \ \quad \ \end{aligned}$$
(6.12)
$$\begin{aligned} w_1^k(t,x,p,r)= & {} \gamma _1(r)\frac{d}{d\gamma }V^{k,\gamma }(t,x,p)|_{\gamma =\gamma _0}\nonumber \\= & {} \gamma _1(r)\Big (x-h^{k,\gamma _0}(t,p)-\gamma _0g^{k,\gamma _0}(t,p)\Big )e^{-\gamma _0x}e^{\gamma _0 h^{k,\gamma _0}(t,p)}, \end{aligned}$$
(6.13)

where \((g^k)_{k=0}^n\) are defined in Proposition 5.4. Our guesses can again be confirmed by direct substitution into (6.6) and (6.8). We now get

$$\begin{aligned} \pi ^{k,*}_1(t,x,p)=-\frac{\mu \gamma _1(x-kF(t,p))}{\sigma ^2\gamma ^2_0}+\frac{g_p^{k,\gamma _0}(t,p)\gamma _1(x-kF(t,p))bp\rho }{\sigma }.\nonumber \\ \end{aligned}$$
(6.14)
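The factor \(\big (x-h^{k,\gamma _0}-\gamma _0g^{k,\gamma _0}\big )e^{-\gamma _0x}e^{\gamma _0h^{k,\gamma _0}}\) in (6.12)–(6.13) is the derivative of \(V^{k,\gamma }=-e^{-\gamma x}e^{\gamma h^{k,\gamma }}\) with respect to \(\gamma \), with \(g^{k,\gamma }=\frac{d}{d\gamma }h^{k,\gamma }\). The Python sketch below checks this in the only case where closed forms are available in the text, namely \(k=0\), for which \(h^{0,\gamma }(t)=-\frac{\mu ^2}{2\sigma ^2\gamma }(T-t)\) and \(g^{0,\gamma }(t)=\frac{\mu ^2}{2\sigma ^2\gamma ^2}(T-t)\). The function names and parameter values are illustrative assumptions.

```python
import math


def h0(gamma: float, tau: float, mu: float, sigma: float) -> float:
    """h^{0,gamma} at time to maturity tau = T - t."""
    return -mu**2 * tau / (2.0 * sigma**2 * gamma)


def g0(gamma: float, tau: float, mu: float, sigma: float) -> float:
    """g^{0,gamma}, the derivative of h^{0,gamma} with respect to gamma."""
    return mu**2 * tau / (2.0 * sigma**2 * gamma**2)


def value_function(gamma: float, x: float, tau: float, mu: float, sigma: float) -> float:
    """V^{0,gamma}(t,x) = -exp(-gamma*x) * exp(gamma*h^{0,gamma}(t))."""
    return -math.exp(-gamma * x + gamma * h0(gamma, tau, mu, sigma))


def dvalue_dgamma(gamma: float, x: float, tau: float, mu: float, sigma: float) -> float:
    """(x - h - gamma*g) * exp(-gamma*x) * exp(gamma*h), as in (6.12)-(6.13)."""
    h = h0(gamma, tau, mu, sigma)
    g = g0(gamma, tau, mu, sigma)
    return (x - h - gamma * g) * math.exp(-gamma * x + gamma * h)


if __name__ == "__main__":
    gamma, x, tau, mu, sigma, step = 2.0, 1.0, 1.0, 0.06, 0.2, 1e-6
    numeric = (value_function(gamma + step, x, tau, mu, sigma)
               - value_function(gamma - step, x, tau, mu, sigma)) / (2.0 * step)
    print(numeric, dvalue_dgamma(gamma, x, tau, mu, sigma))
```

For \(k=0\) the product \(\gamma h^{0,\gamma }\) does not depend on \(\gamma \), so the bracket reduces to x; for \(k\ge 1\) the functions \(h^k, g^k\) have to be obtained from (5.5) and (5.10).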

By Propositions 5.2 and 5.4, our solutions \((v^k_0, v^k_1, w_0^k)_{k=0}^n\in \mathcal {C}([0,T]\times {\mathbb {R}}\times (0,\infty ))\cap \mathcal {C}^{1,2,2}([0,T)\times {\mathbb {R}}\times (0,\infty ))\) and \((w^k_1)_{k=0}^n\in \mathcal {C}([0,T]\times {\mathbb {R}}\times (0,\infty )\times {\mathbb {R}})\cap \mathcal {C}^{1,2,2}([0,T)\times {\mathbb {R}}\times (0,\infty )\times {\mathbb {R}})\). Hence, we have found smooth solutions to the PDEs (6.5)–(6.8). These solutions allow us to define the first-order approximations (6.1)–(6.3) to the true equilibrium investment strategy and the true equilibrium value function of our optimization problem with small amount \(\epsilon \) of wealth-dependent risk aversion.

We can now state our main result.

Theorem 6.1

Let (A1)–(A7) hold. Consider the BSDEs (5.3) and (5.8). For a sufficiently small \(\epsilon >0\), the strategy

$$\begin{aligned} {\hat{\pi }}^*(t)= & {} \sum _{k=0}^n{\hat{\pi }}^{k,*}(t){\mathbf {1}}\{J(t-)=k\},\quad 0\le t\le T,\nonumber \\ {\hat{\pi }}^{k,*}(t)= & {} \frac{\mu }{\sigma ^2\gamma _0}+\frac{Z_1^{k,\gamma _0}(t)}{\sigma }+\,\Big (-\frac{\mu }{\sigma ^2\gamma ^2_0}+\frac{\mathcal {Z}_1^{k,\gamma _0}(t)}{\sigma }\Big )\gamma _1\big (X^{{\hat{\pi }}^*}(t-)-kF(t,P(t))\big )\epsilon ,\nonumber \\&\quad 0\le t\le T, \end{aligned}$$
(6.15)

is admissible, i.e. \({\hat{\pi }}^*=({\hat{\pi }}^{k,*})_{k=0}^n\in \mathcal {A}\). The investment strategy (6.15) is a candidate asymptotic first-order approximation to the equilibrium investment strategy for the optimization problem (3.2) with the wealth-dependent risk aversion coefficient \(\varGamma (r)=\gamma _0+\gamma _1(r)\epsilon \) as \(\epsilon \rightarrow 0\). Alternatively, we can characterize the investment strategy (6.15) with the functions \((h^k)_{k=0}^n, (g^k)_{k=0}^n\) from Propositions 5.2 and 5.4, see (6.11) and (6.14).

Remark

In this paper we have not formally confirmed the order of the approximation error in (6.1)–(6.3), see Sect. 4 for the definition of the asymptotic first-order approximation. Hence, the strategy (6.15) is only a candidate asymptotic first-order approximation to the equilibrium investment strategy. We remark that only the order of the approximation error has not been proved, whereas the first-order approximations have been justified and formally derived on the grounds of perturbation theory, the discussion in Sect. 4 and the calculations in this section. In Delong (2018b) we study the asymptotic optimality of our investment strategy and we formally show that (6.15) performs better than any strategy in the class \(\pi _0(t)+\pi _1(t)\epsilon \) up to the second order \(\mathcal {O}(\epsilon ^2)\) in the asymptotic expansion of the value function as \(\epsilon \rightarrow 0\). We refer the reader to Delong (2018b) for details. \(\square \)

Our investment strategy (6.15) agrees with intuition. The zeroth-order strategy, i.e. the first term in (6.15), is the optimal investment strategy for the insurer with constant risk aversion \(\gamma _0\) who aims at maximizing the expected exponential utility of the terminal wealth. The zeroth-order strategy consists of the constant Merton strategy and the hedging strategy for the claims, which are optimal if the constant risk aversion \(\gamma _0\) is used over the whole investment period. Since the insurer uses the risk aversion coefficient \(\varGamma \) consisting of the constant risk aversion \(\gamma _0\) and the wealth-dependent risk aversion \(\gamma _1\), the insurer should adjust the strategy and allow for the time-varying risk aversion. The first-order correction, the second term in (6.15), describes the first-order change in the zeroth-order strategy if the constant risk aversion coefficient \(\gamma _0\) is modified by adding a small amount of the wealth-dependent component \(\gamma _1\). The Merton strategy and the hedging strategy, which are optimal for the constant risk aversion \(\gamma _0\), are both adjusted in (6.15) to reflect changes in the risk aversion coefficient and they now take into account the new value of the insurer’s wealth-dependent risk aversion \(\varGamma \) at a given time.
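The structure of (6.15) can be summarized in a few lines of Python. The sketch is illustrative only: the function name is hypothetical, the values of \(Z_1^{k,\gamma _0}(t)\) and \(\mathcal {Z}_1^{k,\gamma _0}(t)\) are assumed to be given (they require solving (5.3) and (5.8), which is not done here), the argument `liability` stands for \(kF(t,P(t))\), and the example choice of \(\gamma _1\) is an assumption and not the specification (4.1) of the paper.

```python
import math
from typing import Callable


def first_order_strategy(
    mu: float,
    sigma: float,
    gamma0: float,
    eps: float,
    z1: float,
    z1_dgamma: float,
    gamma1: Callable[[float], float],
    wealth: float,
    liability: float,
) -> float:
    """Strategy (6.15) in a fixed state k.

    z1        -- Z_1^{k,gamma0}(t), hedging component from (5.3)
    z1_dgamma -- derivative of Z_1 with respect to gamma at gamma0, from (5.8)
    liability -- k * F(t, P(t)), so wealth - liability is the net asset value
    """
    zeroth_order = mu / (sigma**2 * gamma0) + z1 / sigma
    sensitivity = -mu / (sigma**2 * gamma0**2) + z1_dgamma / sigma
    return zeroth_order + sensitivity * gamma1(wealth - liability) * eps


if __name__ == "__main__":
    # Hypothetical gamma1: risk aversion falls when assets exceed liabilities.
    gamma1 = lambda r: -math.tanh(r)
    for wealth in (1.0, 3.0, 5.0):
        print(wealth, first_order_strategy(0.06, 0.2, 2.0, 0.1, 0.0, 0.0, gamma1, wealth, 3.0))
```

With a decreasing \(\gamma _1\) as in this example, a surplus of assets over liabilities raises the position in the stock above the zeroth-order level and a deficit lowers it, in line with the “house money” effect.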

7 Examples

In this section we illustrate Theorem 6.1 with examples. We investigate the BSDEs (5.3), (5.8) and the investment strategy (6.15) in some special cases relevant for insurance and financial applications.

Example 1

Let us assume that the insurer is not exposed to insurance risk and has no liability. Hence, in this example we consider a pure investment problem for an investor with the wealth-dependent risk aversion (4.1). We expect that the equilibrium strategy is related to the Merton strategy. It is easy to see that we can set \(Z^\gamma _1(t)=0\) and \(\mathcal {Z}^\gamma _1(t)=0\) in (5.3), (5.8). The first-order approximation to the equilibrium strategy is

$$\begin{aligned} {\hat{\pi }}^{*}(t)= & {} \frac{\mu }{\sigma ^2\gamma _0}-\frac{\mu }{\sigma ^2\gamma ^2_0}\gamma _1\big (X^{{\hat{\pi }}^*}(t)\big )\epsilon . \end{aligned}$$
(7.1)

We end up with the Merton strategy with the constant risk aversion \(\gamma _0\) which is adjusted with a wealth-dependent term when the insurer’s wealth-dependent risk aversion deviates from \(\gamma _0\).

One may wonder if

$$\begin{aligned} {\tilde{\pi }}^{*}(t)= & {} \frac{\mu }{\sigma ^2\varGamma \big (X^{{\hat{\pi }}^*}(t)\big )}, \end{aligned}$$
(7.2)

is the true equilibrium strategy for our time-inconsistent pure investment problem, since

$$\begin{aligned} {\tilde{\pi }}^{*}(t)=\frac{\mu }{\sigma ^2\Big (\gamma _0+\gamma _1\big (X^{{\hat{\pi }}^*}(t)\big )\epsilon \Big )}=\frac{\mu }{\sigma ^2\gamma _0}-\frac{\mu }{\sigma ^2\gamma ^2_0}\gamma _1\big (X^{{\hat{\pi }}^*}(t)\big )\epsilon +\mathcal {O}(\epsilon ^2). \end{aligned}$$

The answer is no. The strategy (7.2) is called a naive strategy. For simplicity of presentation, let us slightly move away from the model considered in this paper and assume that \(\varGamma (r)=\gamma _0/r\). The wealth process (3.1) under the strategy (7.2) takes the form

$$\begin{aligned} dX^{{\tilde{\pi }}^*}(t)=\frac{\mu }{\sigma ^2\gamma _0}X^{{\tilde{\pi }}^*}(t)\big (\mu dt+\sigma dW(t)\big ). \end{aligned}$$
(7.3)

From (3.3)–(3.4) and (7.3), we can conclude that

$$\begin{aligned} w^{{\tilde{\pi }}^*}(t,x,r)={\mathbb {E}}\Big [-e^{-\frac{\gamma _0x}{r}\xi _{t,T}}\Big ],\quad v^{{\tilde{\pi }}^*}(t,x)={\mathbb {E}}\Big [-e^{-\gamma _0\xi _{t,T}}\Big ], \end{aligned}$$
(7.4)

where \(\xi _{t,T}\) is a random variable with log-normal law. If the strategy (7.2) were the true equilibrium strategy, then (7.4) would be the functions which satisfy the HJB equation (3.5). In particular, from (6.4) we could recalculate the equilibrium strategy. We get

$$\begin{aligned} {\tilde{\pi }}^{*}(t)= & {} \frac{\mu }{\sigma ^2\gamma _0}X^{{\hat{\pi }}^*}(t)\frac{{\mathbb {E}}\Big [e^{-\gamma _0\xi _{t,T}}\xi _{t,T}\Big ]}{{\mathbb {E}}\Big [-e^{-\gamma _0\xi _{t,T}}\xi ^2_{t,T}\Big ]}, \end{aligned}$$
(7.5)

which does not coincide with the strategy assumed in (7.2). Summing up, the first-order approximation (7.1) to the equilibrium strategy agrees with our intuition, but the naive strategy (7.2) is not the equilibrium strategy for our investment problem. A numerical comparison of the true equilibrium strategy and the naive strategy is presented in Delong (2018a). \(\square \)
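The agreement between (7.1) and the naive strategy (7.2) up to \(\mathcal {O}(\epsilon ^2)\) at a fixed wealth level can be illustrated numerically. In the Python sketch below the function names and parameter values are assumptions, and `gamma1_value` stands for \(\gamma _1\) evaluated at the current wealth. The difference of the two formulas equals \(\frac{\mu }{\sigma ^2}\frac{\gamma _1^2\epsilon ^2}{\gamma _0^2(\gamma _0+\gamma _1\epsilon )}\), so after division by \(\epsilon ^2\) it tends to \(\mu \gamma _1^2/(\sigma ^2\gamma _0^3)\).

```python
def merton_first_order(mu: float, sigma: float, gamma0: float, gamma1_value: float, eps: float) -> float:
    """Strategy (7.1): mu/(sigma^2*gamma0) - mu/(sigma^2*gamma0^2)*gamma1*eps."""
    return mu / (sigma**2 * gamma0) - mu / (sigma**2 * gamma0**2) * gamma1_value * eps


def naive_strategy(mu: float, sigma: float, gamma0: float, gamma1_value: float, eps: float) -> float:
    """Strategy (7.2): the Merton formula with Gamma = gamma0 + gamma1*eps plugged in."""
    return mu / (sigma**2 * (gamma0 + gamma1_value * eps))


def scaled_gap(eps: float, mu: float = 0.06, sigma: float = 0.2,
               gamma0: float = 2.0, gamma1_value: float = 1.0) -> float:
    """(naive - first order) / eps^2 for illustrative default parameters."""
    gap = naive_strategy(mu, sigma, gamma0, gamma1_value, eps) - merton_first_order(
        mu, sigma, gamma0, gamma1_value, eps)
    return gap / eps**2


if __name__ == "__main__":
    for eps in (0.1, 0.01, 0.001):
        print(eps, scaled_gap(eps))
```

This check concerns only the pointwise formulas at a given wealth level; it says nothing about whether (7.2) is an equilibrium strategy, which is the question settled above.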

Example 2

Let us assume that the insurer is not exposed to insurance risk but has a terminal liability \(\eta \). Clearly, the Merton strategy must be complemented with a hedging strategy for \(\eta \). We assume that the market is complete, i.e. the liability \(\eta \) is contingent on the index P which coincides with the tradeable index S. We can set \(\mathcal {Z}^\gamma _1(t)=0\) in (5.8), but we cannot set \(Z^\gamma _1(t)=0\) in (5.3). Fortunately, we can set \(Z^\gamma _2(t)=0\) in (5.3), and we end up with the linear BSDE:

$$\begin{aligned} Y^{\gamma }(t)=\eta (P(T))-\int _t^T\Big (\frac{\mu ^2}{2\sigma ^2\gamma }+\frac{\mu }{\sigma }Z_1^{\gamma }(s)\Big )ds-\int _t^TZ_1^{\gamma }(s)dW(s). \end{aligned}$$
(7.6)

The solution to (7.6) can be derived by classical techniques, see e.g. Proposition 3.3.1 in Delong (2013). The solution \(Z^\gamma _1\) to (7.6) gives us the hedging strategy for \(\eta \) which should be applied by the insurer with the constant risk aversion \(\gamma \). However, the process \(Z_1^\gamma \) does not depend on the risk aversion coefficient \(\gamma \). The independence of the hedging strategy of the risk aversion is due to market completeness as the liability \(\eta \) can be perfectly hedged. Consequently, the insurer does not have to modify the hedging strategy when his risk aversion changes. The first-order approximation to the equilibrium strategy is

$$\begin{aligned} {\hat{\pi }}^{*}(t)= & {} \frac{\mu }{\sigma ^2\gamma _0}+\frac{Z_1^{\gamma _0}(t)}{\sigma }-\frac{\mu }{\sigma ^2\gamma ^2_0}\gamma _1\big (X^{{\hat{\pi }}^*}(t)-F(t,P(t))\big )\epsilon . \end{aligned}$$

The strategy consists of the Merton strategy and the hedging strategy for \(\eta \), but only the Merton strategy with the constant risk aversion \(\gamma _0\) is adjusted with a wealth-dependent term as the insurer’s wealth-dependent risk aversion varies in time.

Example 3

In this example we assume that the insurer is exposed to a terminal liability \(\eta \) which is paid if the policyholder survives. We assume that the liability \(\eta \) is contingent on the index P which coincides with the tradeable index S. Since the market is incomplete due to insurance risk, the hedging strategy for \(\eta \) now depends on the insurer’s risk aversion coefficient and should be updated when the risk aversion changes. In this example we have to solve both (5.3) and (5.8). We can set \(Z^\gamma _2(t)=0\) and \(\mathcal {Z}^\gamma _2(t)=0\). We deal with two BSDEs:

$$\begin{aligned} Y^{1,\gamma }(t)= & {} \eta (P(T))-\int _t^T\Big (\frac{\mu ^2}{2\sigma ^2\gamma }+\frac{\mu }{\sigma }Z_1^{1,\gamma }(s)-\frac{e^{\gamma \big (Y^{0,\gamma }(s)-Y^{1,\gamma }(s)\big )}-1}{\gamma }\lambda \Big )ds\nonumber \\&-\,\int _t^TZ_1^{1,\gamma }(s)dW(s), \end{aligned}$$
(7.7)
$$\begin{aligned} \mathcal {Y}^{1,\gamma }(t)= & {} -\int _t^T\Big (-\frac{\mu ^2}{2\sigma ^2\gamma ^2}-\frac{e^{\gamma \big (Y^{0,\gamma }(s)-Y^{1,\gamma }(s)\big )}\Big (\gamma \Big (Y^{0,\gamma }(s)-Y^{1,\gamma }(s)\Big )-1\Big )+1}{\gamma ^2}\lambda \nonumber \\&-\,e^{\gamma \big (Y^{0,\gamma }(s)-Y^{1,\gamma }(s)\big )}\lambda \mathcal {Y}^{0,\gamma }(s)+e^{\gamma \big (Y^{0,\gamma }(s)-Y^{1,\gamma }(s)\big )}\lambda \mathcal {Y}^{1,\gamma }(s)+\frac{\mu }{\sigma }\mathcal {Z}_1^{1,\gamma }(s)\Big )ds\nonumber \\&-\,\int _t^T\mathcal {Z}_1^{1,\gamma }(s)dW(s), \end{aligned}$$
(7.8)

where \(Y^{0,\gamma }(t)=-\frac{\mu ^2}{2\sigma ^2\gamma }(T-t)\) and \(\mathcal {Y}^{0,\gamma }(t)=\frac{\mu ^2}{2\sigma ^2\gamma ^2}(T-t)\). The first-order approximation to the equilibrium strategy, applied if the policyholder lives, is

$$\begin{aligned} {\hat{\pi }}^{1,*}(t)= & {} \frac{\mu }{\sigma ^2\gamma _0}+\frac{Z_1^{1,\gamma _0}(t)}{\sigma }\nonumber \\&+\,\Big (-\frac{\mu }{\sigma ^2\gamma ^2_0}+\frac{\mathcal {Z}_1^{1,\gamma _0}(t)}{\sigma }\Big )\gamma _1\big (X^{{\hat{\pi }}^*}(t-)-F(t,P(t))\big )\epsilon . \end{aligned}$$
(7.9)

The Merton strategy and the hedging strategy for \(\eta \), which are optimal for the constant risk aversion \(\gamma _0\), are both adjusted with wealth-dependent and liability-dependent terms as the insurer’s wealth-dependent risk aversion varies in time. If the policyholder dies, then the strategy from Example 2 is applied. The BSDE (7.8) is a linear equation and we can give a probabilistic representation of the solution, see Proposition 3.3.1 in Delong (2013). The solution to the BSDE (7.7) is investigated in Moore and Young (2003) and Ankirchner et al. (2010) in the context of different optimization problems. \(\square \)
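As a sanity check on (7.7), one can look at the special case in which \(\eta \) is a constant, an assumption made here only for illustration. Then \(Z_1^{1,\gamma }=0\) is a natural candidate, (7.7) reduces to an ordinary differential equation, and a candidate solution is \(Y^{1,\gamma }(t)=Y^{0,\gamma }(t)+\frac{1}{\gamma }\ln \big (1+(e^{\gamma \eta }-1)e^{-\lambda (T-t)}\big )\). The Python sketch below (all function names and parameter values are hypothetical) checks this candidate numerically against the integrand of (7.7); it is not part of the original derivation.

```python
import math

def y0(t, T, mu, sigma, gamma):
    """Y^{0,gamma}(t) = -mu^2 / (2 sigma^2 gamma) * (T - t), as given after (7.8)."""
    return -mu**2 / (2.0 * sigma**2 * gamma) * (T - t)

def y1_constant_eta(t, T, mu, sigma, gamma, lam, eta):
    """Candidate for Y^{1,gamma}(t) when eta is a constant (illustrative assumption)."""
    survival = math.exp(-lam * (T - t))  # survival factor over [t, T] for constant intensity lam
    return y0(t, T, mu, sigma, gamma) + math.log(
        1.0 + (math.exp(gamma * eta) - 1.0) * survival) / gamma

def driver(t, y, T, mu, sigma, gamma, lam):
    """ds-integrand of (7.7) with Z_1 = 0, evaluated at Y^{1,gamma}(t) = y."""
    x = y0(t, T, mu, sigma, gamma) - y
    return mu**2 / (2.0 * sigma**2 * gamma) - (math.exp(gamma * x) - 1.0) / gamma * lam

def ode_residual(t, T, mu, sigma, gamma, lam, eta, h=1e-5):
    """Central-difference residual of dY/dt = driver(t, Y), which (7.7) implies when Z_1 = 0."""
    up = y1_constant_eta(t + h, T, mu, sigma, gamma, lam, eta)
    down = y1_constant_eta(t - h, T, mu, sigma, gamma, lam, eta)
    mid = y1_constant_eta(t, T, mu, sigma, gamma, lam, eta)
    return (up - down) / (2.0 * h) - driver(t, mid, T, mu, sigma, gamma, lam)

# Hypothetical parameter values, chosen only to exercise the functions.
params = dict(T=1.0, mu=0.05, sigma=0.2, gamma=2.0, lam=0.1, eta=1.0)
print(y1_constant_eta(1.0, **params))  # terminal value, equal to eta up to rounding
print(ode_residual(0.5, **params))     # should be close to zero
```

In this special case the correction to \(Y^{0,\gamma }\) is the exponential premium of a payment \(\eta \) made with probability \(e^{-\lambda (T-t)}\), which makes the dependence of the hedging component on \(\gamma \) explicit.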

Example 4

Finally, let us assume that the insurer is not exposed to insurance risk but has a terminal liability \(\eta \) which is contingent on the non-tradeable index P correlated with the tradeable index S. The market is incomplete due to non-hedgeable financial risk. As in the previous example, the hedging strategy for \(\eta \) depends on the insurer’s risk aversion coefficient. We have to solve both (5.3) and (5.8), and we cannot set \(Z^\gamma _2(t)=0, \ \mathcal {Z}^\gamma _2(t)=0\). We consider two BSDEs:

$$\begin{aligned} Y^{\gamma }(t)= & {} \eta (P(T))-\int _t^T\Big (\frac{\mu ^2}{2\sigma ^2\gamma }+\frac{\mu }{\sigma }Z_1^{\gamma }(s)-\frac{1}{2}\gamma (Z^\gamma _2(s))^2 \Big )ds\nonumber \\&-\int _t^TZ_1^{\gamma }(s)dW(s)-\int _t^TZ_2^{\gamma }(s)dB(s), \end{aligned}$$
(7.10)
$$\begin{aligned} \mathcal {Y}^{\gamma }(t)= & {} -\int _t^T\Big (-\frac{\mu ^2}{2\sigma ^2\gamma ^2}-\frac{1}{2}(Z^\gamma _2(s))^2+\frac{\mu }{\sigma }\mathcal {Z}_1^{\gamma }(s)-\gamma Z_2^\gamma (s)\mathcal {Z}_2^\gamma (s)\Big )ds\nonumber \\&-\int _t^T\mathcal {Z}_1^{\gamma }(s)dW(s)-\int _t^T\mathcal {Z}_2^{\gamma }(s)dB(s), \end{aligned}$$
(7.11)

and the first-order approximation to the equilibrium strategy takes the form

$$\begin{aligned} {\hat{\pi }}^{*}(t)= & {} \frac{\mu }{\sigma ^2\gamma _0}+\frac{Z_1^{\gamma _0}(t)}{\sigma }\nonumber \\&+\,\Big (-\frac{\mu }{\sigma ^2\gamma ^2_0}+\frac{\mathcal {Z}_1^{\gamma _0}(t)}{\sigma }\Big )\gamma _1\big (X^{{\hat{\pi }}^*}(t)-F(t,P(t))\big )\epsilon . \end{aligned}$$
(7.12)

Again, the Merton strategy and the hedging strategy for \(\eta \), which are optimal for the constant risk aversion \(\gamma _0\), are both adjusted with wealth-dependent and liability-dependent terms as the insurer’s wealth-dependent risk aversion varies in time. \(\square \)
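The first-order strategies in Examples 2, 3 and 4 share one algebraic form: a Merton term, a hedging term and a correction proportional to \(\gamma _1\big (X-F\big )\epsilon \). The following Python sketch evaluates that form at a single time point. It assumes that the values of \(Z_1^{\gamma _0}(t)\), \(\mathcal {Z}_1^{\gamma _0}(t)\), the wealth and \(F(t,P(t))\) have already been obtained elsewhere; the function name, argument names and numbers are hypothetical.

```python
def first_order_strategy(mu, sigma, gamma0, gamma1, eps, wealth, liability, z1, cal_z1=0.0):
    """Evaluate the first-order approximation of the form (7.9)/(7.12) at one time point.

    z1      : value of Z_1^{gamma_0}(t) (hedging component for constant risk aversion)
    cal_z1  : value of calligraphic Z_1^{gamma_0}(t); leave at 0.0 for the complete-market
              case of Example 2, where the hedging term is not adjusted
    wealth, liability : values of X(t) and F(t, P(t))
    """
    merton = mu / (sigma**2 * gamma0)            # Merton strategy for gamma_0
    hedge = z1 / sigma                           # hedging strategy for eta
    sensitivity = -mu / (sigma**2 * gamma0**2) + cal_z1 / sigma
    adjustment = sensitivity * gamma1 * (wealth - liability) * eps
    return merton + hedge + adjustment

# Hypothetical inputs, for illustration only.
print(first_order_strategy(0.05, 0.2, 2.0, 1.0, 0.1, 1.5, 1.0, z1=0.1))
print(first_order_strategy(0.05, 0.2, 2.0, 1.0, 0.1, 1.5, 1.0, z1=0.1, cal_z1=0.02))
```

Keeping `cal_z1` as an optional argument mirrors the text: only in the incomplete-market examples does the hedging component react to changes in the risk aversion.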

8 Proofs

Proof of Theorem 3.1

The proof is standard and we refer to the proof of Theorem 5.2 from Björk et al. (2017). \(\square \)

Proof of Proposition 5.1

Assertion (i): Let \(k=0\). By direct substitution, we can check that the processes

$$\begin{aligned} Y^0(t)=-\frac{\mu ^2}{2\sigma ^2\gamma }(T-t),\quad Z_1^0(t)=Z_2^0(t)=0,\quad 0\le t\le T, \end{aligned}$$
(8.1)

satisfy the BSDE (5.3) with \(k=0\). The uniqueness of the solution to (5.3) for \(k=0\) follows from Lemma 4.11 in Jeanblanc et al. (2015). The properties of the solution (8.1) are immediate from its explicit form. Next, we fix \(k\in \{1,\ldots ,n\}\) and suppose that \(Y^{k-1}\) is given and bounded, which is the case for \(Y^0\). Assertion (i) for the BSDE (5.3) then follows from Lemma 4.11 in Jeanblanc et al. (2015), as the assumptions of this lemma are satisfied for our BSDE with \(k\) fixed.
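Returning to the case \(k=0\), the direct substitution can be written out. The display below assumes that (5.3) with \(k=0\) has zero terminal value and the generator (8.2) with \(k=0\), which is how (8.13) reads for \(k=0\); this form is inferred from (8.2) and (8.13) rather than quoted from (5.3). Inserting \(Z_1^0=Z_2^0=0\) gives

$$\begin{aligned}&-\int _t^T\Big (\frac{\mu ^2}{2\sigma ^2\gamma }+\frac{\mu }{\sigma }Z_1^0(s)-\frac{1}{2}\gamma (Z_2^0(s))^2\Big )ds-\int _t^TZ_1^0(s)dW(s)-\int _t^TZ_2^0(s)dB(s)\\&\quad =-\frac{\mu ^2}{2\sigma ^2\gamma }(T-t)=Y^0(t),\quad 0\le t\le T. \end{aligned}$$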

Assertion (ii): The bounds for \(||Y^k||_{\mathcal {R}^\infty }, \ ||Z^k_1||_{BMO},\ ||Z^k_2||_{BMO}\) can be deduced from Lemma 4.11 in Jeanblanc et al. (2015) (Steps 2 and 3 in their proof). Let us consider the generator of the BSDE (5.3):

$$\begin{aligned} q^k(t,\omega ,y,z_1,z_2)=\frac{\mu ^2}{2\sigma ^2\gamma }-k\alpha (\omega )+\frac{\mu }{\sigma }z_1-\frac{1}{2}\gamma |z_2|^2. \end{aligned}$$
(8.2)

Let \(M_{k,\gamma }=K(1+\gamma +k+\frac{1}{\gamma })\) where K is a constant independent of \((k,\gamma )\). The generator (8.2) satisfies the following conditions [Assumption 4.8 from Jeanblanc et al. (2015)]:

$$\begin{aligned} |q^k(t,\omega ,0,0,0)|\le & {} M_{k,\gamma },\\ |q^k(t,\omega ,y,z_1,z_2)|\le & {} M_{k,\gamma }\big (1+|z_1|^2+|z_2|^2\big ),\\ |q^k(t,\omega ,y,z_1,z_2)-q^k(t,\omega ,y',z_1,z_2)|= & {} 0,\\ |q^k(t,\omega ,y,z_1,z_2)-q^k(t,\omega ,y,z'_1,z'_2)|\le & {} M_{k,\gamma }(1+|z_2|+|z'_2|)\sqrt{|z_1-z_1'|^2+|z_2-z'_2|^2}. \end{aligned}$$

By Lemma 4.11 from Jeanblanc et al. (2015), we can directly establish the bounds for the solution to (5.3):

$$\begin{aligned}&||Y^{k,\gamma }||_{\mathcal {R}^\infty }\le Ke^{TM_{k,\gamma }}\big (k+M_{k,\gamma }+||Y^{k-1,\gamma }||_{\mathcal {R}^\infty }+1\big ):=K_{k,\gamma },\nonumber \\&||Z_1^{k,\gamma }||^2_{BMO}+||Z_2^{k,\gamma }||^2_{BMO}\le Ke^{6M_{k,\gamma }K_{k,\gamma }}\Big (1+M_{k,\gamma }(1+K_{k,\gamma })\nonumber \\&\quad +M_{k,\gamma }\frac{1+e^{\gamma (K+||Y^{k-1,\gamma }||_{\mathcal {R}^\infty }+K_{k,\gamma })}}{\gamma }(n-k)\Big )\frac{1}{M^2_{k,\gamma }}, \end{aligned}$$
(8.3)

where K denotes another constant independent of \((k,\gamma )\). Since we have a finite sequence \((Y^k,Z_1^k,Z_2^k)_{k=0}^n\) and \((Y^0,Z_1^0,Z_2^0)\) is given by (8.1), the assertion (ii) holds.

Assertion (iii): Similarly to (8.3), we can deduce that

$$\begin{aligned} ||Y^{k,t,p}||_{\mathcal {R}^\infty }+||Z_1^{k,t,p}||^2_{BMO}+||Z_2^{k,t,p}||^2_{BMO}\le K, \end{aligned}$$
(8.4)

where the constant K is independent of \((k,t,p)\). Let us introduce the function

$$\begin{aligned} \psi ^{k,t,p}(s,Y(s),Z_1(s))= & {} \frac{\mu ^2}{2\sigma ^2\gamma }-k\alpha (P^{t,p}(s))+\frac{\mu }{\sigma }Z_1(s)\\&-\frac{e^{\gamma (\beta (P^{t,p}(s))+Y^{k-1,t,p}(s)-Y(s))}-1}{\gamma }k\lambda . \end{aligned}$$

We remark that the parameter p in \(\psi ^{k,t,p}\) also affects the process \(Y^{k-1,t,p}\). We fix \(k\in \{1,\ldots ,n\}\) and suppose that \(Y^{k-1}\) is given. We apply Theorem 5.1 from Ankirchner et al. (2007). For any \(q>1\), we have the following estimate for the solutions to the BSDE (5.3):

$$\begin{aligned}&{\mathbb {E}}\Big [\sup _{s\in [0,T]}\big |Y^{k,t,p}(s)-Y^{k,t,p'}(s)\big |^{2q}\nonumber \\&\qquad +\Big (\int _0^T\big |Z_1^{k,t,p}(s)-Z_1^{k,t,p'}(s)\big |^{2}ds+\int _0^T\big |Z_2^{k,t,p}(s)-Z_2^{k,t,p'}(s)\big |^{2}ds\Big )^q\Big ]\nonumber \\&\quad \le K\Big ({\mathbb {E}}\Big [\Big |k|\eta (P^{t,p}(T))-\eta (P^{t,p'}(T))\big |\Big |^{2qr^2}\nonumber \\&\qquad +\Big (\int _0^T\big |\psi ^{k,t,p}(s,Y^{k,t,p}(s),Z_1^{k,t,p}(s))\nonumber \\&\qquad \qquad -\psi ^{k,t,p'}(s,Y^{k,t,p}(s),Z_1^{k,t,p}(s))\big |ds\Big )^{2qr^2}\Big ]\Big )^{\frac{1}{r^2}}, \end{aligned}$$
(8.5)

where the constant K depends on \(q\), \(T\), the Lipschitz constant of \((y,z_1)\mapsto \psi ^{k,t,p}(s,y,z_1)\) and \(||Z_2^{k,t,p}+Z_2^{k,t,p'}||_{BMO}\). The constant r is also related to \(||Z_2^{k,t,p}+Z_2^{k,t,p'}||_{BMO}\) by Theorem 5.1 from Ankirchner et al. (2007) and Theorem 3.1 from Kazamaki (1997). By (A3) and the assertion (ii), the Lipschitz constant of \((y,z_1)\mapsto \psi ^{k,t,p}(s,y,z_1)\) is independent of \((k,t,p)\). Moreover, by (8.4), the norm \(||Z_2^{k,t,p}||_{BMO}\) can be bounded by a constant independent of \((k,t,p)\). Consequently, we can choose universal constants \(r>1\) and K in (8.5) for all \((t,p), (t,p')\in [0,T]\times (0,\infty )\) and \(k\in \{1,\ldots ,n\}\). Since \((Y^k)_{k=0}^n\) is uniformly bounded in \((t,p,k)\), the functions \(\alpha , \beta , \eta \) are bounded and Lipschitz continuous, and

$$\begin{aligned} {\mathbb {E}}\Big [\sup _{s\in [0,T]}\big |P^{t,p}(s)-P^{t,p'}(s)\big |^{q}\Big ]\le K|p-p'|^q,\quad q\ge 2, \end{aligned}$$

we can deduce that

$$\begin{aligned}&{\mathbb {E}}\Big [\sup _{s\in [0,T]}\big |Y^{k,t,p}(s)-Y^{k,t,p'}(s)\big |^{2q}\nonumber \\&\qquad +\Big (\int _0^T\big |Z_1^{k,t,p}(s)-Z_1^{k,t,p'}(s)\big |^{2}ds+\int _0^T\big |Z_2^{k,t,p}(s)-Z_2^{k,t,p'}(s)\big |^{2}ds\Big )^q\Big ]\nonumber \\&\quad \le K\Big (|p-p'|^{2qr^2}+{\mathbb {E}}\Big [\sup _{s\in [0,T]}\big |Y^{k-1,t,p}(s)-Y^{k-1,t,p'}(s)\big |^{2qr^2}\Big ]\Big )^{\frac{1}{r^2}}. \end{aligned}$$
(8.6)

The result (5.4) can be proved if we iterate (8.6) starting with (8.1). \(\square \)

Proof of Proposition 5.2

Assertion (i): If \(|\rho |=1\), then we deal with the PDEs:

$$\begin{aligned}&h_t^k(t,p)+\big (a-\frac{\mu b\rho }{\sigma }\big )ph_p^k(t,p)+\frac{1}{2}b^2p^2h_{pp}^k(t,p)+k\alpha (p)-\frac{\mu ^2}{2\sigma ^2\gamma }-\frac{k\lambda }{\gamma }\nonumber \\&\quad +\frac{e^{\gamma \beta (p)} e^{\gamma h^{k-1}(t,p)}}{\gamma }k\lambda e^{-\gamma h^k(t,p)}=0,\quad (t,p)\in [0,T)\times (0,\infty ),\nonumber \\&h^k(T,p)=k\eta (p),\quad p\in (0,\infty ). \end{aligned}$$
(8.7)

If \(|\rho |<1\), then we introduce \({\tilde{h}}^k(t,p)=e^{(1-\rho ^2)\gamma h^k(t,p)}\) and we deal with the PDEs:

$$\begin{aligned}&{\tilde{h}}_t^k(t,p)+\big (a-\frac{\mu b\rho }{\sigma }\big )p{\tilde{h}}_p^k(t,p)+\frac{1}{2}b^2p^2{\tilde{h}}_{pp}^k(t,p)\nonumber \\&\quad +\,\big (\gamma k\alpha (p)-\frac{\mu ^2}{2\sigma ^2}-k\lambda \big )(1-\rho ^2){\tilde{h}}^k(t,p)\nonumber \\&\quad +\,e^{\gamma \beta (p)}(1-\rho ^2)({\tilde{h}}^{k-1}(t,p))^{\frac{1}{1-\rho ^2}}k\lambda ({\tilde{h}}^k(t,p))^{-\frac{\rho ^2}{1-\rho ^2}}=0,\quad (t,p)\in [0,T)\times (0,\infty ),\nonumber \\&{\tilde{h}}^k(T,p)=e^{(1-\rho ^2)\gamma k\eta (p)},\quad p\in (0,\infty ). \end{aligned}$$
(8.8)

For \(k=0\) we immediately get \(h^0(t,p)=-\frac{\mu ^2}{2\sigma ^2\gamma }(T-t)\) and uniqueness of solution follows from Proposition 2.3 from Becherer (2005).

Equation (8.7): The result follows from Propositions 2.1 and 2.3 from Becherer (2005), which should be applied iteratively to the PDEs (8.7) starting with \(k=1\) and \(h^0\). Fix \(k\in \{1,\ldots ,n\}\) and suppose that \(h^{k-1}\) is given. Assume that \(h^{k-1}\) is uniformly bounded on \([0,T]\times (0,\infty )\) and \(h^{k-1}\in \mathcal {C}([0,T]\times (0,\infty ))\cap \mathcal {C}^{1,2}([0,T)\times (0,\infty ))\), which is the case for \(h^0\). We define an operator based on the Feynman-Kac formula and Proposition 2.1 from Becherer (2005):

$$\begin{aligned}&(\mathcal {A}\phi )(t,p)={\mathbb {E}}^{{\mathbb {Q}}}\Big [k\eta (P(T))+\int _t^T\Big (k\alpha (P(s))-\frac{\mu ^2}{2\sigma ^2\gamma }-\frac{k\lambda }{\gamma }\\&\quad +\frac{e^{\gamma \beta (P(s))} e^{\gamma h^{k-1}(s,P(s))}}{\gamma }k\lambda e^{-\gamma \phi (s,P(s))}\Big )ds|P(t)=p\Big ],\quad (t,p)\in [0,T]\times (0,\infty ), \end{aligned}$$

where

$$\begin{aligned} \frac{dP(t)}{P(t)}=\big (a-\frac{\mu b\rho }{\sigma }\big )dt+b\big (\rho dW^{{\mathbb {Q}}}(t)+\sqrt{1-\rho ^2}dB^{{\mathbb {Q}}}(t)\big ),\quad 0\le t\le T. \end{aligned}$$

Let us consider a sequence \((h_m^k)_{m=0}^\infty \) defined with \(h_{m+1}^k(t,p)=(\mathcal {A}h_m^k)(t,p)\). We can observe that if \(h_m^{k}(t,p)\ge -(\frac{\mu ^2}{2\sigma ^2\gamma }+\frac{k\lambda }{\gamma })T\), then \(h_{m+1}^k(t,p)\ge -(\frac{\mu ^2}{2\sigma ^2\gamma }+\frac{k\lambda }{\gamma })T\). Since \(h_m^{k}(t,p)\) is uniformly bounded from below in \((t,p,m)\), it is also easy to see that \(h_{m+1}^k(t,p)\) is uniformly bounded from above in \((t,p,m)\). Hence, the assumptions of Proposition 2.1 from Becherer (2005) are satisfied. We conclude that there exists a unique fixed point of the operator \(\mathcal {A}\) and a unique solution \(h^k\) to the equation \(h^k(t,p)=(\mathcal {A}h^k)(t,p)\), which can be derived from \((h_m^k)_{m=0}^\infty \). Next, we use Proposition 2.3 from Becherer (2005) to show that the fixed point \(h^k\) is a smooth function and satisfies the PDE (8.7). We investigate smoothness properties of the successive elements in the sequence \(h_{m+1}^k(t,p)=(\mathcal {A}h_m^k)(t,p)\). Assumptions (2.9)–(2.12) from Becherer (2005) are satisfied, but (2.13) is not clear. However, a closer look at the proof [see (2.16)] shows that it is sufficient to require that

$$\begin{aligned} (t,p,\phi )\mapsto k\alpha (p)-\frac{\mu ^2}{2\sigma ^2\gamma }-\frac{k\lambda }{\gamma }+\frac{e^{\gamma \beta (p)} e^{\gamma h^{k-1}(t,p)}}{\gamma }k\lambda e^{-\gamma \phi }, \end{aligned}$$

is uniformly Hölder continuous on \([0,T-\epsilon ]\times {\bar{D}}\times [K_l,K_u]\), where \(\epsilon >0\), D is a bounded subset of \((0,\infty )\) such that \({\bar{D}}\subset (0,\infty )\), and \(K_l, K_u\) denote the lower and upper bounds for the sequence \((h^k_m)_{m=0}^\infty \). Since \(h^{k-1}\in \mathcal {C}([0,T]\times (0,\infty ))\cap \mathcal {C}^{1,2}([0,T)\times (0,\infty ))\), this assumption holds in our case. Hence, from Proposition 2.3 in Becherer (2005) we can conclude that the sequence \((h^k_m)_{m=0}^\infty \) is in \(\mathcal {C}([0,T]\times (0,\infty ))\cap \mathcal {C}^{1,2}([0,T)\times (0,\infty ))\). Moreover, the PDE (8.7) has a unique solution in \(\mathcal {C}([0,T]\times (0,\infty ))\cap \mathcal {C}^{1,2}([0,T)\times (0,\infty ))\), uniformly bounded on \([0,T]\times (0,\infty )\), which is determined by the fixed point of the operator \(\mathcal {A}\) and the sequence \((h^k_m)_{m=0}^\infty \).
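The fixed-point construction can be illustrated in a simplified setting. The Python sketch below assumes that \(\alpha \), \(\beta \), \(\eta \) are constants, so that \(h^k\) does not depend on \(p\) and the operator \(\mathcal {A}\) reduces to a deterministic time integral. The time grid, the trapezoidal rule, the number of iterations and all names are choices made here for illustration; they are not part of the proof.

```python
import math

def picard_fixed_point(k, gamma, mu, sigma, lam, alpha, beta, eta, h_prev, T, n_iter=50):
    """Iterate h_{m+1} = A h_m on a uniform time grid, for constant alpha, beta, eta.

    h_prev : list with the values of h^{k-1} on the grid t_i = T * i / n, i = 0..n
    Returns the list of values of the last iterate on the same grid.
    """
    n = len(h_prev) - 1
    dt = T / n
    const = k * alpha - mu**2 / (2.0 * sigma**2 * gamma) - k * lam / gamma
    phi = [k * eta] * (n + 1)  # initial guess: the terminal value k * eta
    for _ in range(n_iter):
        # integrand of the operator A along the current iterate phi
        f = [const + math.exp(gamma * (beta + h_prev[i] - phi[i])) * k * lam / gamma
             for i in range(n + 1)]
        new = [0.0] * (n + 1)
        new[n] = k * eta
        for i in range(n - 1, -1, -1):
            # (A phi)(t_i) = (A phi)(t_{i+1}) + integral of f over [t_i, t_{i+1}]
            new[i] = new[i + 1] + 0.5 * dt * (f[i] + f[i + 1])
        phi = new
    return phi

# Hypothetical parameters; k = 1 with h^0(t) = -mu^2 / (2 sigma^2 gamma) * (T - t).
T, mu, sigma, gamma, lam = 1.0, 0.05, 0.2, 1.0, 0.1
n_steps = 200
grid = [T * i / n_steps for i in range(n_steps + 1)]
h0 = [-mu**2 / (2.0 * sigma**2 * gamma) * (T - t) for t in grid]
h1 = picard_fixed_point(1, gamma, mu, sigma, lam, 0.0, 0.0, 1.0, h0, T)
print(h1[0], h1[-1])
```

With \(\alpha =\beta =0\) the reduced equation for \(k=1\) can be solved by hand, which gives a reference value for judging the accuracy of the iteration on a given grid.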

Equation (8.8): The proof is analogous. This time we assume that \({\tilde{h}}^{k-1}\) is uniformly bounded on \([0,T]\times (0,\infty )\), positive and uniformly bounded away from zero on \([0,T]\times (0,\infty )\) and \({\tilde{h}}^{k-1}\in \mathcal {C}([0,T]\times (0,\infty ))\cap \mathcal {C}^{1,2}([0,T)\times (0,\infty ))\), which is the case for \({\tilde{h}}^0\). We introduce the appropriate operator \(\mathcal {A}\) based on the Feynman-Kac formula. We note that if \({\tilde{h}}_m^{k}(t,p)\ge e^{-(\frac{\mu ^2}{2\sigma ^2}+k\lambda )(1-\rho ^2)T}>0\), then \({\tilde{h}}_{m+1}^k(t,p)=(\mathcal {A}{\tilde{h}}_m^k)(t,p)\ge e^{-(\frac{\mu ^2}{2\sigma ^2}+k\lambda )(1-\rho ^2)T}>0\). Since \({\tilde{h}}_m^{k}(t,p)\) is positive and uniformly bounded away from zero in \((t,p,m)\), it is also easy to see that \({\tilde{h}}_{m+1}^k(t,p)=(\mathcal {A}{\tilde{h}}_m^k)(t,p)\) is uniformly bounded from above in \((t,p,m)\). Hence, the assumptions of Propositions 2.1 and 2.3 from Becherer (2005) are satisfied.

Assertion (ii): The case \(k=0\) is trivial: it suffices to compare the explicit solutions to the BSDE and the PDE for \(k=0\). Fix \(k\in \{1,\ldots ,n\}\). Assume that \(Y^{k-1}(t)=h^{k-1}(t,P(t))\), which is the case for \(Y^0\) and \(h^0\). Since we have a sequence of smooth functions \((h^k)_{k=0}^n\), we can apply Itô’s formula to derive the dynamics of \(h^k(t,P(t))\) on \([0,T-\epsilon ]\) and compare the resulting dynamics with the dynamics of \(Y^k\) given by (5.3) [this step is standard, see e.g. Proposition 4.3 in El Karoui et al. (1997)]. We can deduce candidate solutions for \((Y^k,Z_1^k,Z_2^k)\) on \([0,T]\). Next, we have to prove that the candidate solutions (5.6) are in the appropriate class of processes. The candidate solution for \(Y^k\) is bounded by point (i). We prove the BMO property for the candidate solutions for \((Z_1^k, Z_2^k)\). Let us choose a localizing sequence of stopping times \((\tau _m)_{m=1}^\infty \) for the process P and a stopping time \(\tau \in [0,T]\). Applying Itô’s formula to \(h^k\), changing the measure to \({\mathbb {Q}}\sim {\mathbb {P}}\) with the exponential martingale \(\mathcal {E}\big (-\int _0^\cdot \frac{\mu }{\sigma }dW(s)\big )\) and using the PDE (5.5), we can derive

$$\begin{aligned}&h^k((T-\epsilon )\wedge \tau _m\wedge \tau ,P((T-\epsilon )\wedge \tau _m\wedge \tau ))=h^k(\tau ,P(\tau ))\nonumber \\&\quad +\int _\tau ^{(T-\epsilon )\wedge \tau _m\wedge \tau }\Big (\frac{\mu ^2}{2\sigma ^2\gamma }-k\alpha (P(s))-\frac{1}{2}\gamma \Big (h^k_p(s,P(s))bP(s)\sqrt{1-\rho ^2}\Big )^2\nonumber \\&\quad -\frac{e^{\gamma (\beta (P(s))+h^{k-1}(s,P(s))-h^k(s,P(s)))}-1}{\gamma }k\lambda \Big )ds\nonumber \\&\quad +\int _\tau ^{(T-\epsilon )\wedge \tau _m\wedge \tau }h^k_p(s,P(s))bP(s)\rho dW^{\mathbb {Q}}(s)\nonumber \\&\quad +\int _\tau ^{(T-\epsilon )\wedge \tau _m\wedge \tau }h^k_p(s,P(s))bP(s)\sqrt{1-\rho ^2}dB^{\mathbb {Q}}(s). \end{aligned}$$
(8.9)

If \(|\rho |=1\), we take the square on both sides of (8.9) and the expected value. If \(|\rho |<1\), we just take the expected value. In both cases, by boundedness of \((h^k)_{k=0}^n, \alpha , \beta \), we can establish the inequality

$$\begin{aligned} {\mathbb {E}}^{\mathbb {Q}}\Big [\int _\tau ^{(T-\epsilon )\wedge \tau _m\wedge \tau }\big (h^k_p(s,P(s))P(s)\big )^2ds|{\mathcal {F}}^{W,B}_\tau \Big ]\le K. \end{aligned}$$

Taking \(m\rightarrow \infty , \epsilon \rightarrow 0\), by the monotone convergence theorem we deduce that \(\big (\int _0^th^k_p(s,P(s))P(s)dW^{\mathbb {Q}}(s), 0\le t\le T\big )\) is a \(BMO({\mathbb {Q}},{\mathbb {F}}^{W,B})\)-martingale, and also a \(BMO({\mathbb {P}},{\mathbb {F}}^{W,B})\)-martingale [see Theorem 3.6 in Kazamaki (1997)]. By uniqueness of the solution to the BSDE (5.3), we have characterized \((Y^k,Z_1^k,Z_2^k)\) with \(h^k\). \(\square \)

Proof of Theorem 5.1

Step 1: Let us assume there exists a unique solution \((Y,Z_1,Z_2,Q)\in \mathcal {R}^2({\mathbb {F}})\times \mathcal {H}^2({\mathbb {F}})\times \mathcal {H}^2({\mathbb {F}})\times \mathcal {H}^2({\mathbb {F}})\) to the BSDE

$$\begin{aligned} Y(t)= & {} J(T)\eta (P(T))-\int _t^T\Big (\frac{\mu ^2}{2\sigma ^2\gamma }-J(s-)\alpha (P(s))+\frac{\mu }{\sigma }Z_1(s)-\frac{1}{2}\gamma (Z_2(s))^2\nonumber \\&-\frac{e^{\gamma (\beta (P(s))+Q(s))}-1}{\gamma }J(s-)\lambda \Big )ds\nonumber \\&-\int _t^TZ_1(s)dW(s)-\int _t^TZ_2(s)dB(s)-\int _t^TQ(s)dN(s),\quad 0\le t\le T,\nonumber \\ \end{aligned}$$
(8.10)

such that \((Y,Q)\) are bounded and \(\big (\int _0^tZ_1(s)dW(s), 0\le t\le T\big ), \big (\int _0^t Z_2(s) dB(s), 0\le t\le T\big )\) are \(BMO({\mathbb {F}})\)-martingales. Using standard techniques from optimal control, see e.g. Hu et al. (2005) or Chapter 11 in Delong (2013), we can prove that the strategy

$$\begin{aligned} \pi ^*(t)=\frac{\mu }{\sigma ^2\gamma }+\frac{Z_1(t)}{\sigma },\quad 0\le t\le T, \end{aligned}$$
(8.11)

is the optimal admissible investment strategy for the optimization problem (3.2) and \(V^k(t,x,p)=V^{k,\pi ^*}(t,x,p)=-e^{-\gamma x}e^{\gamma Y(t)}|_{P(t)=p, J(t)=k}\) is the value function corresponding to the strategy \(\pi ^*\). Moreover,

$$\begin{aligned} e^{-\gamma (X^{\pi ^*}(t)- Y(t))}= & {} e^{-\gamma (x-Y(0))}\times \mathcal {E}\Big (-\int _0^t \frac{\mu }{\sigma }dW(s)+\int _0^t \gamma Z_2(s)dB(s)\Big )\nonumber \\&\times \mathcal {E}\Big (\int _0^t\big (e^{\gamma (\beta (P(s))+Q(s))}-1\big )(dN(s)-(n-N(s-))\lambda ds)\Big ),\nonumber \\ \end{aligned}$$
(8.12)

where \(\mathcal {E}(M)\) denotes the stochastic exponential of the martingale M. Since \(\int _0^t Z_2(s)dB(s)\) is a BMO-martingale, \(\beta \) and Q are bounded and the process N only jumps finitely many times upward, we can conclude that the product of the stochastic exponentials of martingales in (8.12) is a true martingale, see Lemma 1 in Morlais (2010) and Theorem 2.3 in Kazamaki (1997).

Step 2: We prove that there exists a solution to the BSDE (8.10), which we assume in Step 1. The BSDE (8.10) is a quadratic-exponential BSDE with jumps. Jeanblanc et al. (2015), Kharroubi et al. (2013) and Jiao et al. (2013) showed how to transform a quadratic-exponential BSDE with a finite number of jumps into a system of BSDEs without jumps. We apply their methods. Let \(\tau _n=0, \ \tau _k=\inf \{t>\tau _{k+1}: J(t)<J(\tau _{k+1})\}\wedge T, k=n-1,\ldots ,0\). For \(k\in \{0,\ldots ,n\}\), let us write the BSDE (8.10) on \(\tau _{k}\le t\le \tau _{k-1}\), where we assume that \(\tau _{-1}=T\). We get the equation:

$$\begin{aligned} Y(t)= & {} Y(\tau _{k-1})-\int _{t}^{\tau _{k-1}}\Big (\frac{\mu ^2}{2\sigma ^2\gamma }-k\alpha (P(s))+\frac{\mu }{\sigma }Z_1^k(s)-\frac{1}{2}\gamma (Z_2^k(s))^2\nonumber \\&-\frac{e^{\gamma (\beta (P(s))+Q^k(s))}-1}{\gamma }k\lambda \Big )ds-\int _{t}^{\tau _{k-1}}Z_1^k(s)dW(s)\nonumber \\&-\int _{t}^{\tau _{k-1}}Z_2^k(s)dB(s)-\int _{t}^{\tau _{k-1}}Q^k(s)dN(s),\quad \tau _{k}\le t\le \tau _{k-1}. \end{aligned}$$
(8.13)

The \({\mathcal {F}}_{\tau _{k-1}}\)-measurable random variable \(Y(\tau _{k-1})\) has the decomposition:

$$\begin{aligned}&Y(\tau _{k-1})=k\eta (P(T)){\mathbf {1}}\{\tau _{k-1}=T\}+{\tilde{Y}}(\tau _{k-1}){\mathbf {1}}\{\tau _{k-1}<T\},\nonumber \\&\quad k=1,\ldots ,n, \end{aligned}$$
(8.14)

where \({\tilde{Y}}\) is an \({\mathbb {F}}^{W,B}\)-adapted process. In order to match the terminal condition of the BSDE (8.13), given by (8.14), at the jump time \(\tau _{k-1}<T\), we set

$$\begin{aligned} Q^k(t)={\tilde{Y}}(\tau _{k-1})-Y(t),\quad \tau _k\le t \le \tau _{k-1}, \quad k=1,\ldots ,n. \end{aligned}$$

Consequently, the problem of solving the BSDE (8.10) can be replaced with the problem of solving the system of BSDEs (5.3). The existence of solutions to the system of BSDEs (5.3) is established in Proposition 5.1. Gluing the solutions to the BSDEs (5.3), we can derive the solution to the BSDE (8.10). For details of the construction, we refer to Proposition 4.4, Lemma 4.11, Theorems 4.12 and 4.17 in Jeanblanc et al. (2015). We set

$$\begin{aligned} Y(t)= & {} \sum _{k=0}^n Y^k(t){\mathbf {1}}\{J(t)=k\},\quad 0\le t\le T,\nonumber \\ Z_1(t)= & {} \sum _{k=0}^n Z_1^k(t){\mathbf {1}}\{J(t-)=k\},\quad 0\le t\le T,\nonumber \\ Z_2(t)= & {} \sum _{k=0}^n Z_2^k(t){\mathbf {1}}\{J(t-)=k\},\quad 0\le t\le T,\nonumber \\ Q(t)= & {} \sum _{k=0}^n\big (Y^{k-1}(t)-Y^k(t)\big ){\mathbf {1}}\{J(t-)=k\},\quad 0\le t\le T. \end{aligned}$$
(8.15)

The optimal strategy (5.7) follows from (8.11) and (8.15).
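The gluing step (8.15) and the strategy (8.11) amount to selecting, at each time, the component indexed by the current value of J. A minimal Python sketch of this selection at a single time point is given below. It assumes that the values \(Y^k(t), Z_1^k(t), Z_2^k(t)\) for \(k=0,\ldots ,n\) are already available as lists; the function names and the convention \(Q=0\) when \(J(t-)=0\) are choices made here for illustration.

```python
def glue_solution(j_now, j_before, Y, Z1, Z2):
    """Evaluate (8.15) at one time t.

    Y, Z1, Z2 : sequences indexed by k = 0..n holding Y^k(t), Z_1^k(t), Z_2^k(t)
    j_now     : J(t);  j_before : J(t-)
    """
    y = Y[j_now]
    z1 = Z1[j_before]
    z2 = Z2[j_before]
    # For J(t-) = 0 there is no Y^{-1}; Q is set to 0 here (assumption: no further jump can occur).
    q = Y[j_before - 1] - Y[j_before] if j_before >= 1 else 0.0
    return y, z1, z2, q

def optimal_strategy(mu, sigma, gamma, z1):
    """Strategy (8.11): pi*(t) = mu / (sigma^2 gamma) + Z_1(t) / sigma."""
    return mu / (sigma**2 * gamma) + z1 / sigma

# Hypothetical values for n = 2, at a time when J jumps from 2 to 1.
Y, Z1, Z2 = [0.0, 1.0, 2.5], [0.0, 0.1, 0.2], [0.0, 0.3, 0.4]
y, z1, z2, q = glue_solution(1, 2, Y, Z1, Z2)
print(y, z1, z2, q, optimal_strategy(0.05, 0.2, 2.0, z1))
```

The control processes are indexed by \(J(t-)\) because they must be predictable, whereas \(Y\) is indexed by \(J(t)\) and therefore jumps by \(Q\) at a jump time of J.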

Step 3: We investigate properties of the solution (8.15). By uniqueness of solutions to the BSDEs (5.3) and the arguments from Step 2, there exists a unique solution \((Y,Z_1,Z_2,Q)\in \mathcal {R}^2({\mathbb {F}})\times \mathcal {H}^2({\mathbb {F}})\times \mathcal {H}^2({\mathbb {F}})\times \mathcal {H}^2({\mathbb {F}})\) to the BSDE (8.10) given by (8.15). We notice that

$$\begin{aligned}&\sup _{{\mathbb {F}}-stopping \ times \ \mathcal {T}}{\mathbb {E}}\Big [\int _\mathcal {T}^T|Z_i(t)|^2dt\big |{\mathcal {F}}_\mathcal {T}\Big ]\\&\quad \le (n+1)\sum _{k=0}^n\sup _{{\mathbb {F}}-stopping \ times \ \mathcal {T}}{\mathbb {E}}\Big [\int _\mathcal {T}^T|Z^k_i(t)|^2dt\big |{\mathcal {F}}_{\mathcal {T}}\Big ]<\infty ,\quad \quad i=1,2, \end{aligned}$$

by the \(BMO({\mathbb {F}})\) property of \(\int _0^t Z_1^k(s)dW(s), \int _0^t Z_2^k(s)dB(s)\) which solve (8.13) [see point (i) in Proposition 5.1]. Consequently, the processes \(\big (\int _0^tZ_1(s)dW(s), 0\le t\le T\big ), \big (\int _0^tZ_2(s)dB(s), 0\le t\le T\big )\) are \(BMO({\mathbb {F}})\)-martingales. By point (ii) of Proposition 5.1 and (8.15), the processes \((Y,Q)\) are bounded. \(\square \)

Proof of Proposition 5.3

Step 1: We will apply the a priori estimates from Ankirchner et al. (2007) which we adapt to our setting. We will often use the properties of the solutions to the BSDEs (5.3) which we specify in points (i) and (ii) of Proposition 5.1 (without recalling them). We will also use the energy inequality [see p. 29 in Kazamaki (1997)], which says that for a \(BMO(\mathbb {G})\)-martingale \(\mathcal {X}(t)=\int _0^t\mathcal {Z}(s)dW(s)\) and a \(\mathbb {G}\)-stopping time \(\tau \), we have the inequality

$$\begin{aligned} {\mathbb {E}}\Big [\Big (\int _\tau ^T|\mathcal {Z}(s)|^2ds\Big )^\kappa |\mathcal {G}_\tau \Big ]\le \kappa !||\mathcal {Z}||^{2\kappa }_{BMO},\quad \kappa =1,2,\ldots . \end{aligned}$$
(8.16)

We fix \(q>1\). We can deduce that \((Y^{k,\gamma },Z_1^{k,\gamma },Z_2^{k,\gamma })_{k=0}^n\in \mathcal {R}^q({\mathbb {F}}^{W,B})\times \mathcal {H}^q({\mathbb {F}}^{W,B})\times \mathcal {H}^q({\mathbb {F}}^{W,B})\). We choose \((\epsilon ,\epsilon ')\in [-\epsilon _0,\epsilon _0]\) and \(0<\epsilon _0<\gamma \).

Step 2: We claim that the mapping \(\gamma \mapsto (Y^{k,\gamma },Z_1^{k,\gamma },Z_2^{k,\gamma })\) is continuous as a mapping \((0,\infty )\mapsto \mathcal {R}^q({\mathbb {F}}^{W,B})\times \mathcal {H}^q({\mathbb {F}}^{W,B})\times \mathcal {H}^q({\mathbb {F}}^{W,B})\). The explicit solution \((Y^{0,\gamma },Z_1^{0,\gamma },Z_2^{0,\gamma })\) can be directly investigated and the assertion holds for \(k=0\). We fix \(k\in \{1,\ldots ,n\}\) and we assume that the assertion holds for \(k-1\). We prove that the assertion holds for k. Let us introduce the function

$$\begin{aligned} \psi ^{k,\gamma }(t,Y(t),Z_1(t))= & {} \frac{\mu ^2}{2\sigma ^2\gamma }-k\alpha (P(t))+\frac{\mu }{\sigma }Z_1(t)\\&-\frac{e^{\gamma (\beta (P(t))+Y^{k-1,\gamma }(t)-Y(t))}-1}{\gamma }k\lambda . \end{aligned}$$

We remark that the parameter \(\gamma \) in \(\psi ^{k,\gamma }\) also affects the process \(Y^{k-1,\gamma }\). The assumptions of Theorem 5.1 and Lemma 5.2 from Ankirchner et al. (2007) are satisfied. However, the quadratic term in our Eq. (5.3) is of the form \(\gamma (Z_2^{k,\gamma })^2\) and both terms \(\gamma \) and \(Z_2^{k,\gamma }(t)\) are perturbed when we add \(\epsilon \) to \(\gamma \). If we write

$$\begin{aligned}&(\gamma +\epsilon )\big (Z_2^{k,\gamma +\epsilon }(t)\big )^2-\gamma \big (Z_2^{k,\gamma }(t)\big )^2\\&\quad =(\gamma +\epsilon )\big (Z_2^{k,\gamma +\epsilon }(t)+Z_2^{k,\gamma }(t)\big )\big (Z_2^{k,\gamma +\epsilon }(t)-Z_2^{k,\gamma }(t)\big )+\epsilon (Z_2^{k,\gamma }(t))^2, \end{aligned}$$

then we can observe that we have one additional term compared to Ankirchner et al. (2007). Adapting the proofs of Theorem 5.1 and Lemma 5.2 from Ankirchner et al. (2007) to our setting, we can derive the estimate

$$\begin{aligned}&{\mathbb {E}}\Big [\sup _{0\le t\le T}\big |Y^{k,\gamma }(t)-Y^{k,\gamma +\epsilon }(t)\big |^{2q}\nonumber \\&\qquad +\,\Big (\int _0^T\big |Z_1^{k,\gamma }(t)-Z_1^{k,\gamma +\epsilon }(t)\big |^{2}dt+\int _0^T\big |Z_2^{k,\gamma }(t)-Z_2^{k,\gamma +\epsilon }(t)\big |^{2}dt\Big )^q\Big ]\nonumber \\&\quad \le K\Big ({\mathbb {E}}\Big [\Big (\int _0^T\big |\psi ^{k,\gamma }(t,Y^{k,\gamma }(t),Z_1^{k,\gamma }(t))-\psi ^{k,\gamma +\epsilon }(t,Y^{k,\gamma }(t),Z_1^{k,\gamma }(t))\nonumber \\&\qquad +\,|\epsilon |\big (Z^{k,\gamma }_2(t)\big )^2\big |dt\Big )^{2qr^2}\Big ]\Big )^{\frac{1}{r^2}}, \end{aligned}$$
(8.17)

where the constant K depends on \(q, \ T\), the Lipschitz constant of \((y,z_1)\mapsto \psi ^{k,\gamma }(t,y,z_1)\) and \(||(\gamma +\epsilon )\big (Z_2^{k,\gamma +\epsilon }+Z_2^{k,\gamma }\big )||_{BMO}\). The constant r is also related to \(||(\gamma +\epsilon )\big (Z_2^{k,\gamma +\epsilon }+Z_2^{k,\gamma }\big )||_{BMO}\) by Theorem 5.1 from Ankirchner et al. (2007) and Theorem 3.1 from Kazamaki (1997). Since \(||Z_2^{k,\gamma +\epsilon }||_{BMO}\) can be bounded by a constant independent of \(\epsilon \in [-\epsilon _0,\epsilon _0]\), we can choose universal constants \(r>1\) and K in (8.17) for all \(\epsilon \in [-\epsilon _0,\epsilon _0]\). Since we assume that \(Y^{k-1,\gamma +\epsilon }\rightarrow Y^{k-1,\gamma }\) in \(\mathcal {R}^q({\mathbb {F}}^{W,B})\) as \(\epsilon \rightarrow 0\), we have \(\lim _{\epsilon \rightarrow 0}\psi ^{k,\gamma +\epsilon }(t,Y^{k,\gamma }(t),Z_1^{k,\gamma }(t))=\psi ^{k,\gamma }(t,Y^{k,\gamma }(t),Z_1^{k,\gamma }(t))\), a.s. for a.a. \(t\in [0,T]\). Taking \(\epsilon \rightarrow 0\) and using the dominated convergence theorem, we can prove that the right hand side of (8.17) converges to zero. The convergence in \(\mathcal {R}^{2q}({\mathbb {F}}^{W,B})\times \mathcal {H}^{2q}({\mathbb {F}}^{W,B})\times \mathcal {H}^{2q}({\mathbb {F}}^{W,B})\) implies the convergence in \(\mathcal {R}^q({\mathbb {F}}^{W,B})\times \mathcal {H}^q({\mathbb {F}}^{W,B})\times \mathcal {H}^q({\mathbb {F}}^{W,B})\). Consequently, the assertion of Step 2 is proved.
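The decomposition of the quadratic term used in Step 2 can be checked by direct expansion. Writing \(u=Z_2^{k,\gamma +\epsilon }(t)\) and \(v=Z_2^{k,\gamma }(t)\), a shorthand introduced here only for this check, we get

$$\begin{aligned} (\gamma +\epsilon )(u+v)(u-v)+\epsilon v^2= & {} (\gamma +\epsilon )u^2-(\gamma +\epsilon )v^2+\epsilon v^2\\= & {} (\gamma +\epsilon )u^2-\gamma v^2. \end{aligned}$$

The first term on the left is linear in the difference \(u-v\) with a coefficient controlled in BMO, while \(\epsilon v^2\) is the additional term that appears under \(|\epsilon |\) on the right hand side of (8.17).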

Step 3: Let \(k=0\). We can easily see that \((Y^{0,\gamma },Z_1^{0,\gamma },Z_2^{0,\gamma })\), given by (8.1), is differentiable with respect to \(\gamma \) and the derivatives \(\mathcal {Y}^{0,\gamma }(t)=\frac{\mu ^2}{2\sigma ^2\gamma ^2}(T-t), \mathcal {Z}^{0,\gamma }_1(t)=\mathcal {Z}^{0,\gamma }_2(t)=0\) satisfy the BSDE (5.8). The properties of \((\mathcal {Y}^{0,\gamma },\mathcal {Z}_1^{0,\gamma },\mathcal {Z}_2^{0,\gamma })\) are obvious. Hence, the result of this proposition holds for \(k=0\). Fix \(k\in \{1,\ldots ,n\}\) and assume that the result holds for \(k-1\). We prove that the result holds for k. For \(\epsilon \ne 0\), we introduce

$$\begin{aligned}&\mathcal {U}^{k,\epsilon }(t)=\frac{Y^{k,\gamma +\epsilon }(t)-Y^{k,\gamma }(t)}{\epsilon }, \quad \mathcal {V}_1^{k,\epsilon }(t)=\frac{Z_1^{k,\gamma +\epsilon }(t)-Z_1^{k,\gamma }(t)}{\epsilon }, \\&\quad \mathcal {V}_2^{k,\epsilon }(t)=\frac{Z_2^{k,\gamma +\epsilon }(t)-Z_2^{k,\gamma }(t)}{\epsilon },\quad 0\le t\le T, \end{aligned}$$

and we investigate the BSDE

$$\begin{aligned} \mathcal {U}^{k,\epsilon }(t)= & {} -\int _t^T\Big (A^{k,\epsilon }(s)+\varphi ^{k,\epsilon }(s,\mathcal {U}^{k,\epsilon }(s),\mathcal {V}_1^{k,\epsilon }(s))+H^{k,\epsilon }(s)\mathcal {V}^{k,\epsilon }_2(s)\Big )ds\nonumber \\&-\int _t^T\mathcal {V}_1^{k,\epsilon }(s)dW(s)-\int _t^T\mathcal {V}_2^{k,\epsilon }(s)dB(s),\quad 0\le t\le T, \end{aligned}$$
(8.18)

where

$$\begin{aligned}&A^{k,\epsilon }(t)=-\frac{1}{2}\big (Z_2^{k,\gamma +\epsilon }(t)\big )^2+\int _0^1\Big (-\frac{\mu ^2}{2\sigma ^2(\theta (\gamma +\epsilon )+(1-\theta )\gamma )^2}\nonumber \\&\qquad -\,\frac{e^{(\theta (\gamma +\epsilon )+(1-\theta )\gamma )(\beta (P(t))+Y^{k-1,\gamma +\epsilon }(t)-Y^{k,\gamma +\epsilon }(t))}\big (\beta (P(t))+Y^{k-1,\gamma +\epsilon }(t)-Y^{k,\gamma +\epsilon }(t)\big )}{\theta (\gamma +\epsilon )+(1-\theta )\gamma }k\lambda \nonumber \\&\qquad +\,\frac{e^{(\theta (\gamma +\epsilon )+(1-\theta )\gamma )(\beta (P(t))+Y^{k-1,\gamma +\epsilon }(t)-Y^{k,\gamma +\epsilon }(t))}-1}{(\theta (\gamma +\epsilon )+(1-\theta )\gamma )^2}k\lambda \Big ) d\theta \nonumber \\&\qquad -\,\Big (\int _0^1e^{\gamma (\beta (P(t))+\theta Y^{k-1,\gamma +\epsilon }(t)+(1-\theta ) Y^{k-1,\gamma }(t)-Y^{k,\gamma +\epsilon }(t))}k\lambda d\theta \Big )\mathcal {U}^{k-1,\epsilon }(t),\nonumber \\&\varphi ^{k,\epsilon }(t,\mathcal {U}(t),\mathcal {V}_1(t))\nonumber \\&\quad =\Big (\int _0^1e^{\gamma (\beta (P(t))+Y^{k-1,\gamma }(t)-\theta Y^{k,\gamma +\epsilon }(t)-(1-\theta )Y^{k,\gamma }(t))}k\lambda d\theta \Big )\mathcal {U}(t)+\frac{\mu }{\sigma }\mathcal {V}_1(t), \nonumber \\&H^{k,\epsilon }(t)=-\frac{1}{2}\gamma \big (Z_2^{k,\gamma +\epsilon }(t)+Z_2^{k,\gamma }(t)\big ). \end{aligned}$$
(8.19)

By the assumption made for Step 3, the sequence \((\mathcal {U}^{k-1,\epsilon },\mathcal {V}_1^{k-1,\epsilon },\mathcal {V}_2^{k-1,\epsilon })\), for \(\epsilon \in [-\epsilon _0,\epsilon _0]{\setminus }\{0\}\), converges in \(\mathcal {R}^q({\mathbb {F}}^{W,B})\times \mathcal {H}^q({\mathbb {F}}^{W,B})\times \mathcal {H}^q({\mathbb {F}}^{W,B})\) as \(\epsilon \rightarrow 0\). We also have \((\mathcal {U}^{k-1,0},\mathcal {V}_1^{k-1,0},\mathcal {V}_2^{k-1,0})=(\mathcal {Y}^{k-1,\gamma },\mathcal {Z}_1^{k-1,\gamma },\mathcal {Z}_2^{k-1,\gamma })\) where \((\mathcal {U}^{k-1,0}, \mathcal {V}_1^{k-1,0},\mathcal {V}_2^{k-1,0})\) is interpreted as the limit of the sequence \((\mathcal {U}^{k-1,\epsilon },\mathcal {V}_1^{k-1,\epsilon },\mathcal {V}_2^{k-1,\epsilon })\) as \(\epsilon \rightarrow 0\).

Step 3.1: Let us assume that \(||\mathcal {U}^{k-1,\epsilon }||_{\mathcal {R}^\infty }, ||\mathcal {V}_1^{k-1,\epsilon }||_{BMO}, ||\mathcal {V}_2^{k-1,\epsilon }||_{BMO}\) are uniformly bounded in \(\epsilon \in [-\epsilon _0,\epsilon _0]\). Our assumption clearly holds for \(k=0\). We prove that \(||\mathcal {U}^{k,\epsilon }||_{\mathcal {R}^\infty }, ||\mathcal {V}_1^{k,\epsilon }||_{BMO}, ||\mathcal {V}_2^{k,\epsilon }||_{BMO}\) are finite for any \(\epsilon \in [-\epsilon _0,\epsilon _0]{\setminus }\{0\}\) and the upper bound does not depend on \(\epsilon \). The assumptions of Theorem 4.1 and Lemma 4.2 from Ankirchner et al. (2007) are satisfied. Let \(\tau \in [0,T]\) denote an \({\mathbb {F}}^{W,B}\)-stopping time. Using the conditional version of the a priori estimate from Lemma 4.2 from Ankirchner et al. (2007), see (20)–(22), we can derive the estimate

$$\begin{aligned}&\big |\mathcal {U}^{k,\epsilon }(\tau )\big |^{2q}+{\mathbb {E}}\Big [\Big (\int _\tau ^T\big |\mathcal {V}_1^{k,\epsilon }(s)\big |^{2}ds+\int _\tau ^T\big |\mathcal {V}_2^{k,\epsilon }(s)\big |^{2}ds\Big )^q\big |{\mathcal {F}}^{W,B}_\tau \Big ]\nonumber \\&\quad \le K\Big \{{\mathbb {E}}\Big [\Big (\int _\tau ^T\big |A^{k,\epsilon }(s)\big |ds\Big )^{2q}|{\mathcal {F}}^{W,B}_\tau \Big ]\nonumber \\&\qquad +\Big ({\mathbb {E}}\Big [\Big (\int _\tau ^T|A^{k,\epsilon }(s)|ds\Big )^{2q}|{\mathcal {F}}^{W,B}_\tau \Big ]\Big )^{\frac{1}{2}} \times \Big ({\mathbb {E}}\Big [\Big (\int _\tau ^T|H^{k,\epsilon }(s)|^2ds\Big )^{2q}|{\mathcal {F}}^{W,B}_\tau \Big ]\Big )^{\frac{1}{2}}\Big \},\nonumber \\ \end{aligned}$$
(8.20)

where the constant K depends on \(q,\ T\) and the Lipschitz constant of \((u,v_1)\mapsto \varphi ^{k,\epsilon }(t,u,v_1)\). We remark that we simply use \(Q=P\) in Lemma 4.2 from Ankirchner et al. (2007). Moreover, by the energy inequality (8.16) we can deduce

$$\begin{aligned}&{\mathbb {E}}\Big [\Big (\int _\tau ^T|H^{k,\epsilon }(s)|^2ds\Big )^{2q}|{\mathcal {F}}^{W,B}_\tau \Big ]\le K\Big (1+||Z_2^{k,\gamma }||^{2[2q]+2}_{BMO}+||Z_2^{k,\gamma +\epsilon }||^{2[2q]+2}_{BMO}\Big )\le K,\nonumber \\&{\mathbb {E}}\Big [\Big (\int _\tau ^T|A^{k,\epsilon }(s)|ds\Big )^{2q}|{\mathcal {F}}^{W,B}_\tau \Big ]\le K\Big (1+||Z_2^{k,\gamma +\epsilon }||^{2[2q]+2}_{BMO}+||\mathcal {U}^{k-1,\epsilon }||^{2q}_{\mathcal {R}^\infty }\Big )\le K,\nonumber \\ \end{aligned}$$
(8.21)

where the final constant K can be chosen uniformly for all \(\epsilon \in [-\epsilon _0,\epsilon _0]\). Hence, the upper bound in (8.20) is independent of \(\tau \) and \(\epsilon \). The assertion of Step 3.1 is proved. The case for \(\epsilon =0\) will be resolved in Step 3.3.
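For the reader's convenience, the following sketch spells out how an estimate of the form (8.20), with a right hand side bounded by a constant \(K\) independent of \(\tau \) and \(\epsilon \), yields the \(\mathcal {R}^\infty \) and BMO bounds. It assumes \(q\ge 1\) and the usual definition of the BMO norm of an integrand through its conditional energy; both are our reading of the setting, not statements taken from the cited references.

$$\begin{aligned}&\big |\mathcal {U}^{k,\epsilon }(\tau )\big |^{2q}\le K\ \text {for all stopping times } \tau \ \Longrightarrow \ ||\mathcal {U}^{k,\epsilon }||_{\mathcal {R}^\infty }\le K^{\frac{1}{2q}},\\&\Big ({\mathbb {E}}\Big [\int _\tau ^T\big |\mathcal {V}_i^{k,\epsilon }(s)\big |^{2}ds\big |{\mathcal {F}}^{W,B}_\tau \Big ]\Big )^q\le {\mathbb {E}}\Big [\Big (\int _\tau ^T\big |\mathcal {V}_i^{k,\epsilon }(s)\big |^{2}ds\Big )^q\big |{\mathcal {F}}^{W,B}_\tau \Big ]\le K\\&\quad \Longrightarrow \ ||\mathcal {V}_i^{k,\epsilon }||^2_{BMO}\le K^{\frac{1}{q}},\quad i=1,2. \end{aligned}$$

The first inequality in the second line is the conditional Jensen inequality applied to the convex function \(x\mapsto x^q\) on \([0,\infty )\); taking the supremum over all stopping times \(\tau \) then gives the BMO bound.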

Step 3.2: We prove that \((\mathcal {U}^{k,\epsilon },\mathcal {V}_1^{k,\epsilon },\mathcal {V}_2^{k,\epsilon })\) converges in \(\mathcal {R}^q({\mathbb {F}}^{W,B})\times \mathcal {H}^q({\mathbb {F}}^{W,B})\times \mathcal {H}^q({\mathbb {F}}^{W,B})\) as \(\epsilon \rightarrow 0\). Theorem 4.1 from Ankirchner et al. (2007) gives us the key estimate:

$$\begin{aligned}&{\mathbb {E}}\Big [\sup _{0\le t\le T}\big |\mathcal {U}^{k,\epsilon }(t)-\mathcal {U}^{k,\epsilon '}(t)\big |^{2q}\nonumber \\&\qquad +\Big (\int _0^T\big |\mathcal {V}_1^{k,\epsilon }(t)-\mathcal {V}_1^{k,\epsilon '}(t)\big |^{2}dt+\int _0^T\big |\mathcal {V}_2^{k,\epsilon }(t)-\mathcal {V}_2^{k,\epsilon '}(t)\big |^{2}dt\Big )^q\Big ]\nonumber \\&\quad \le K\Big \{\Big ({\mathbb {E}}\Big [\Big (\int _0^T\big |\varphi ^{k,\epsilon }(t,\mathcal {U}^{k,\epsilon '}(t),\mathcal {V}_1^{k,\epsilon '}(t))-\varphi ^{k,\epsilon '}(t,\mathcal {U}^{k,\epsilon '}(t),\mathcal {V}_1^{k,\epsilon '}(t))\big |dt\Big )^{2qr^2}\nonumber \\&\qquad \ +\Big (\int _0^T\big |A^{k,\epsilon }(t)-A^{k,\epsilon '}(t)\big |dt\Big )^{2qr^2}\Big ]\Big )^{\frac{1}{r^2}}\nonumber \\&\qquad +\Big ({\mathbb {E}}\Big [\Big (\int _0^T|A^{k,\epsilon }(t)|dt\Big )^{2qr^2}\Big ]\Big )^{\frac{1}{2r^2}}\nonumber \\&\qquad \ \times \Big ({\mathbb {E}}\Big [\Big (\int _0^T|H^{k,\epsilon }(t)-H^{k,\epsilon '}(t)|^2dt\Big )^{2qr^2}\Big ]\Big )^{\frac{1}{2r^2}}\Big \}, \end{aligned}$$
(8.22)

where the constant K depends on \(q, \ T\), the Lipschitz constant of \((u,v_1)\mapsto \varphi ^{k,\epsilon }(t,u,v_1)\) and \(||Z_2^{k,\gamma +\epsilon }+Z_2^{k,\gamma }||_{BMO}\). The constant r is also related to \(||Z_2^{k,\gamma +\epsilon }+Z_2^{k,\gamma }||_{BMO}\) by Theorem 4.1 from Ankirchner et al. (2007) and Theorem 3.1 from Kazamaki (1997). As in Step 2, we can choose universal constants \(r>1\) and K in (8.22) for all \(\epsilon \in [-\epsilon _0,\epsilon _0]\).

We prove the convergence for each term on the right hand side of (8.22). Using the bound (8.21), the estimate (8.17) and the result from Step 2, we conclude that the last term in (8.22) converges to zero as \((\epsilon , \epsilon ')\rightarrow 0\). Next, we derive

$$\begin{aligned}&\big |\varphi ^{k,\epsilon }(t,\mathcal {U}^{k,\epsilon '}(t),\mathcal {V}_1^{k,\epsilon '}(t))-\varphi ^{k,\epsilon '}(t,\mathcal {U}^{k,\epsilon '}(t),\mathcal {V}_1^{k,\epsilon '}(t))\big |\nonumber \\&\quad \le K\Big (\int _0^1\big |e^{-\gamma (\theta Y^{k,\gamma +\epsilon }(t)+(1-\theta )Y^{k,\gamma }(t))}\nonumber \\&\qquad -e^{-\gamma (\theta Y^{k,\gamma +\epsilon '}(t)+(1-\theta )Y^{k,\gamma }(t))}\big |d\theta \Big ) ||\mathcal {U}^{k,\epsilon '}||_{\mathcal {R}^\infty }. \end{aligned}$$
(8.23)

By (8.20)–(8.21) and Step 3.1, the norm \(||\mathcal {U}^{k,\epsilon '}||_{\mathcal {R}^\infty }\) can be bounded by a constant independent of \(\epsilon '\). Since \(\gamma \mapsto Y^{k,\gamma }\) is continuous in \(\mathcal {R}^q({\mathbb {F}}^{W,B})\) by Step 2, we deduce that the right hand side of (8.23) converges to zero a.s. for a.a. \(t\in [0,T]\), as \((\epsilon , \epsilon ')\rightarrow 0\). Consequently, by the dominated convergence theorem, the first term after the inequality in (8.22) converges to zero as \((\epsilon , \epsilon ')\rightarrow 0\). We are left with one more term in (8.22). We have the estimate

$$\begin{aligned}&{\mathbb {E}}\Big [\Big (\int _0^T\big |A^{k,\epsilon }(t)-A^{k,\epsilon '}(t)\big |dt\Big )^{2qr^2}\Big ]\nonumber \\&\quad \le K{\mathbb {E}}\Big [\Big (\int _0^T\big ||Z_2^{k,\gamma +\epsilon }(t)|^2-|Z_2^{k,\gamma +\epsilon '}(t)|^2\big |dt\Big )^{2qr^2}\nonumber \\&\qquad +\Big (\int _0^T\big |G_1^{k,\gamma +\epsilon }(t)-G_1^{k,\gamma +\epsilon '}(t)\big |dt\Big )^{2qr^2}\nonumber \\&\qquad +\Big (\int _0^T\big |G_2^{k,\gamma +\epsilon }(t)\mathcal {U}^{k-1,\epsilon }(t)-G_2^{k,\gamma +\epsilon '}(t)\mathcal {U}^{k-1,\epsilon '}(t)\big |dt\Big )^{2qr^2}\Big ], \end{aligned}$$
(8.24)

where \(G_1^{k,\gamma +\epsilon }, G_2^{k,\gamma +\epsilon }\) can be deduced from the definition of \(A^{k,\epsilon }\). We can see that

$$\begin{aligned}&{\mathbb {E}}\Big [\Big (\int _0^T\big ||Z_2^{k,\gamma +\epsilon }(t)|^2-|Z_2^{k,\gamma +\epsilon '}(t)|^2\big |dt\Big )^{2qr^2}\Big ]\nonumber \\&\quad \le \Big ({\mathbb {E}}\Big [\Big (\int _0^T|Z_2^{k,\gamma +\epsilon }(t)+Z_2^{k,\gamma +\epsilon '}(t)|^2dt\Big )^{2qr^2}\Big ]\Big )^{\frac{1}{2}}\nonumber \\&\qquad \times \Big ({\mathbb {E}}\Big [\Big (\int _0^T|Z_2^{k,\gamma +\epsilon }(t)-Z_2^{k,\gamma +\epsilon '}(t)|^2dt\Big )^{2qr^2}\Big ]\Big )^{\frac{1}{2}}\nonumber \\&\quad \le K\Big ({\mathbb {E}}\Big [\Big (\int _0^T|Z_2^{k,\gamma +\epsilon }(t)-Z_2^{k,\gamma +\epsilon '}(t)|^2dt\Big )^{2qr^2}\Big ]\Big )^{\frac{1}{2}},\quad \end{aligned}$$
(8.25)

where K depends on \(||Z_2^{k,\gamma +\epsilon }||_{BMO}\) [by the energy inequality (8.16)] and is universal for all \(\epsilon \in [-\epsilon _0,\epsilon _0]\). By the result of Step 2, we know that \(\gamma \mapsto Z_2^{k,\gamma }\) is continuous in \(\mathcal {H}^q({\mathbb {F}}^{W,B})\), for any \(q>1\). Consequently, the first term on the right hand side of (8.24) converges to zero as \((\epsilon , \epsilon ')\rightarrow 0\). We observe that the norms \(||G_1^{k,\gamma +\epsilon }||_{\mathcal {R}^\infty }, ||G_2^{k,\gamma +\epsilon }||_{\mathcal {R}^\infty },||\mathcal {U}^{k-1,\epsilon }||_{\mathcal {R}^\infty }\) are bounded in \(\epsilon \in [-\epsilon _0,\epsilon _0]\) (in particular by the assumption made for Step 3.1). Moreover, \(\lim _{\epsilon \rightarrow 0}G_1^{k,\gamma +\epsilon }(t)=G_1^{k,\gamma }(t)\), a.s. for a.a. \(t\in [0,T]\), and the same limits hold for \(G_2^{k,\gamma +\epsilon }\) and \(\mathcal {U}^{k-1,\epsilon }\) (by the result of Step 2 and the assumption made for Step 3, which guarantee a.s. convergence of \(Y^{k-1,\gamma +\epsilon }(t), Y^{k,\gamma +\epsilon }(t), \mathcal {U}^{k-1,\epsilon }(t)\) for a.a. \(t\in [0,T]\) as \(\epsilon \rightarrow 0\)). By the dominated convergence theorem, the remaining two terms on the right hand side of (8.24) converge to zero as \((\epsilon , \epsilon ')\rightarrow 0\). Collecting our results and the estimate (8.22), we can conclude that \((\mathcal {U}^{k,\epsilon },\mathcal {V}_1^{k,\epsilon },\mathcal {V}_2^{k,\epsilon })\), for \(\epsilon \in [-\epsilon _0,\epsilon _0]{\setminus }\{0\}\), is a Cauchy sequence which converges to a unique triple \((\mathcal {U}^{k,0},\mathcal {V}_1^{k,0},\mathcal {V}_2^{k,0})\) in \(\mathcal {R}^q({\mathbb {F}}^{W,B})\times \mathcal {H}^q({\mathbb {F}}^{W,B})\times \mathcal {H}^q({\mathbb {F}}^{W,B})\) as \(\epsilon \rightarrow 0\).
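The first inequality in (8.25) can be unpacked as follows. This is only a sketch of the elementary steps, written with the shorthand \(a(t)=Z_2^{k,\gamma +\epsilon }(t)\) and \(b(t)=Z_2^{k,\gamma +\epsilon '}(t)\), which is our notation and not the paper's. We use the factorization of a difference of squares, the Cauchy–Schwarz inequality in \(t\) and then the Cauchy–Schwarz inequality in \(\omega \):

$$\begin{aligned}&\Big (\int _0^T\big ||a(t)|^2-|b(t)|^2\big |dt\Big )^{2qr^2}=\Big (\int _0^T|a(t)+b(t)||a(t)-b(t)|dt\Big )^{2qr^2}\\&\quad \le \Big (\int _0^T|a(t)+b(t)|^2dt\Big )^{qr^2}\Big (\int _0^T|a(t)-b(t)|^2dt\Big )^{qr^2},\\&{\mathbb {E}}\Big [\Big (\int _0^T|a(t)+b(t)|^2dt\Big )^{qr^2}\Big (\int _0^T|a(t)-b(t)|^2dt\Big )^{qr^2}\Big ]\\&\quad \le \Big ({\mathbb {E}}\Big [\Big (\int _0^T|a(t)+b(t)|^2dt\Big )^{2qr^2}\Big ]\Big )^{\frac{1}{2}}\Big ({\mathbb {E}}\Big [\Big (\int _0^T|a(t)-b(t)|^2dt\Big )^{2qr^2}\Big ]\Big )^{\frac{1}{2}}. \end{aligned}$$

The factor involving \(a+b\) is then absorbed into the constant K by the energy inequality, which leaves the second factor as the only quantity that has to converge.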

Step 3.3: We start with the BSDE (8.18) and its solution \((\mathcal {U}^{k,\epsilon },\mathcal {V}_1^{k,\epsilon },\mathcal {V}_2^{k,\epsilon })\). As above, we can prove the convergence for each term in the BSDE (8.18) - for the process \(\mathcal {U}^{k,\epsilon }\), the generator and the stochastic integrals. We can conclude that the limit \((\mathcal {U}^{k,0},\mathcal {V}_1^{k,0},\mathcal {V}_2^{k,0})\) satisfies the BSDE (5.8). Hence, the assertion of Step 3 is proved.

We now investigate the BSDE (5.8) and its solution \((\mathcal {Y}^{k,\gamma },\mathcal {Z}_1^{k,\gamma },\mathcal {Z}_2^{k,\gamma })\). We can derive bounds similar to (8.20)–(8.21) for \((\mathcal {Y}^{k,\gamma },\mathcal {Z}_1^{k,\gamma },\mathcal {Z}_2^{k,\gamma })\). We can deduce that \(\mathcal {Y}^{k,\gamma }\) is bounded and \(\big (\int _0^t\mathcal {Z}^{k,\gamma }_1(s)dW(s), 0\le t\le T\big ), \big (\int _0^t\mathcal {Z}^{k,\gamma }_2(s)dB(s), 0\le t\le T\big )\) are \(BMO({\mathbb {F}}^{W,B})\)-martingales. From (8.20)–(8.21) for \((\mathcal {Y}^{k,\gamma },\mathcal {Z}_1^{k,\gamma },\mathcal {Z}_2^{k,\gamma })\), we can also deduce that the norms \(||\mathcal {Y}^{k,\gamma }||_{\mathcal {R}^\infty }, \ ||\mathcal {Z}^{k,\gamma }_1||_{BMO},\ ||\mathcal {Z}^{k,\gamma }_2||_{BMO}\) are bounded uniformly in \(k\in \{0,\ldots ,n\}\) and \(\gamma \in (\gamma _0-\epsilon ,\gamma _0+\epsilon )\) for \(\epsilon <\gamma _0\).

Let us now investigate the BSDE (5.8) with the forward equation (2.3) with the initial condition \(P(t)=p\). The solution to (5.8) is now denoted by \((\mathcal {Y}^{k,t,p},\mathcal {Z}_1^{k,t,p},\mathcal {Z}_2^{k,t,p})\). Let

$$\begin{aligned}&A^{k,t,p}(s)=-\frac{1}{2}(Z_2^{k,t,p}(s))^2-\frac{\mu ^2}{2\sigma ^2\gamma ^2}\nonumber \\&\quad -\frac{e^{\gamma (\beta (P^{t,p}(s))+Y^{k-1,t,p}(s)-Y^{k,t,p}(s))}\Big (\gamma \Big (\beta (P^{t,p}(s))+Y^{k-1,t,p}(s)-Y^{k,t,p}(s)\Big )-1\Big )+1}{\gamma ^2}k\lambda \nonumber \\&\quad -e^{\gamma \big (\beta (P^{t,p}(s))+Y^{k-1,t,p}(s)-Y^{k,t,p}(s)\big )}k\lambda \mathcal {Y}^{k-1,t,p}(s),\nonumber \\&\varphi ^{k,t,p}(s,\mathcal {Y}(s),\mathcal {Z}(s))=e^{\gamma \big (\beta (P^{t,p}(s))+Y^{k-1,t,p}(s)-Y^{k,t,p}(s)\big )}k\lambda \mathcal {Y}(s)+\frac{\mu }{\sigma }\mathcal {Z}_1(s),\nonumber \\&H^{k,t,p}(s)=-\gamma Z_2^{k,t,p}(s). \end{aligned}$$
(8.26)

Similarly to (8.22), we state the estimate

$$\begin{aligned}&{\mathbb {E}}\Big [\sup _{s\in [0,T]}\big |\mathcal {Y}^{k,t,p}(s)-\mathcal {Y}^{k,t,p'}(s)\big |^{2q}\Big ]\nonumber \\&\quad \le K\Big \{\Big ({\mathbb {E}}\Big [\Big (\int _0^T\big |\varphi ^{k,t,p}(s,\mathcal {Y}^{k,t,p'}(s),\mathcal {Z}_1^{k,t,p'}(s))-\varphi ^{k,t,p'}(s,\mathcal {Y}^{k,t,p'}(s),\mathcal {Z}_1^{k,t,p'}(s))\big |ds\Big )^{2qr^2}\nonumber \\&\qquad \ +\Big (\int _0^T\big |A^{k,t,p}(s)-A^{k,t,p'}(s)\big |ds\Big )^{2qr^2}\Big ]\Big )^{\frac{1}{r^2}}\nonumber \\&\qquad +\Big ({\mathbb {E}}\Big [\Big (\int _0^T|A^{k,t,p}(s)|ds\Big )^{2qr^2}\Big ]\Big )^{\frac{1}{2r^2}}\nonumber \\&\qquad \times \Big ({\mathbb {E}}\Big [\Big (\int _0^T|H^{k,t,p}(s)-H^{k,t,p'}(s)|^2ds\Big )^{2qr^2}\Big ]\Big )^{\frac{1}{2r^2}}\Big \}. \end{aligned}$$
(8.27)

Using the assertion (iii), the arguments from Step 3.2 with (8.23)–(8.25) and the arguments leading to (5.4), (8.4), (8.6), we can deduce the estimate

$$\begin{aligned}&{\mathbb {E}}\Big [\sup _{s\in [0,T]}\big |\mathcal {Y}^{k,t,p}(s)-\mathcal {Y}^{k,t,p'}(s)\big |^{2q}\Big ]\nonumber \\&\quad \ \le K\Big \{\Big (|p-p'|^{2qr^2}+{\mathbb {E}}\Big [\sup _{s\in [0,T]}\big |\mathcal {Y}^{k-1,t,p}(s)-\mathcal {Y}^{k-1,t,p'}(s)\big |^{2qr^2}\Big ]\Big )^{\frac{1}{r^2}}\nonumber \\&\qquad +\,|p-p'|^{2q}\Big \}, \end{aligned}$$
(8.28)

where the constant K is independent of \((k,t,p,p')\). The result (5.9) can be derived if we iterate (8.28) starting with the explicit solution \(\mathcal {Y}^{0,t,p}\).

Step 4: We finally prove that the BSDE (5.8) has a unique solution. Fix \(k\in \{0,\ldots ,n\}\). By Step 3.3 there exists at least one solution to the BSDE (5.8). Let us assume that there exist two solutions \((\mathcal {Y}^{k,\gamma },\mathcal {Z}_1^{k,\gamma },\mathcal {Z}_2^{k,\gamma })\in \mathcal {R}^q({\mathbb {F}}^{W,B})\times \mathcal {H}^q({\mathbb {F}}^{W,B})\times \mathcal {H}^q({\mathbb {F}}^{W,B})\) and \((\tilde{\mathcal {Y}}^{k,\gamma },\tilde{\mathcal {Z}}_1^{k,\gamma },\tilde{\mathcal {Z}}_2^{k,\gamma })\in \mathcal {R}^q({\mathbb {F}}^{W,B})\times \mathcal {H}^q({\mathbb {F}}^{W,B})\times \mathcal {H}^q({\mathbb {F}}^{W,B})\). Changing the measure, we get the BSDE

$$\begin{aligned} \mathcal {Y}^{k,\gamma }(t)-\tilde{\mathcal {Y}}^{k,\gamma }(t)= & {} -\int _t^Te^{\gamma (\beta (P(s))+Y^{k-1,\gamma }(s)-Y^{k,\gamma }(s))}k\lambda (\mathcal {Y}^{k,\gamma }(s)-\tilde{\mathcal {Y}}^{k,\gamma }(s))ds\nonumber \\&-\int _t^T(\mathcal {Z}_1^{k,\gamma }(s)-\tilde{\mathcal {Z}}_1^{k,\gamma }(s))dW^{\mathbb {Q}}(s)\nonumber \\&-\int _t^T(\mathcal {Z}_2^{k,\gamma }(s)-\tilde{\mathcal {Z}}_2^{k,\gamma }(s))dB^{\mathbb {Q}}(s),\quad 0\le t\le T, \end{aligned}$$
(8.29)

where \({\mathbb {Q}}\) denotes an equivalent probability measure \({\mathbb {Q}}\sim {\mathbb {P}}\). By Theorem 5.1 from El Karoui et al. (1997) the BSDE (8.29) has a unique solution in \(\mathcal {R}^q({\mathbb {F}}^{W,B})\times \mathcal {H}^q({\mathbb {F}}^{W,B})\times \mathcal {H}^q({\mathbb {F}}^{W,B})\) under \({\mathbb {Q}}\), for any \(q>1\). This solution is (0, 0, 0). Since the martingale \(\mathcal {E}(\int _0^\cdot \frac{\mu }{\sigma }dW^{\mathbb {Q}}(s)-\int _0^\cdot \gamma Z_2^{k,\gamma }(s)dB^{\mathbb {Q}}(s))\), which is used to change the measure from \({\mathbb {Q}}\) to \({\mathbb {P}}\), is r-integrable under \({\mathbb {Q}}\) for some \(r>1\), see e.g. Theorems 3.1 and 3.6 in Kazamaki (1997), we deduce that the BSDE (5.8) has a unique solution in \(\mathcal {R}^q({\mathbb {R}})\times \mathcal {H}^q({\mathbb {R}})\times \mathcal {H}^q({\mathbb {R}})\) under \({\mathbb {P}}\). \(\square \)
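The claim that the solution to (8.29) is (0, 0, 0) can also be seen directly with an integrating factor. The following is a sketch only: \(\delta \mathcal {Y}, \delta \mathcal {Z}_1, \delta \mathcal {Z}_2\) denote the differences of the two solutions and \(\varPsi (s)\ge 0\) denotes the coefficient multiplying \(\delta \mathcal {Y}(s)\) in the drift of (8.29) (these symbols are our shorthand), and we assume that the stochastic integrals below are true \({\mathbb {Q}}\)-martingales.

$$\begin{aligned}&d\Big (e^{-\int _0^s\varPsi (u)du}\delta \mathcal {Y}(s)\Big )=e^{-\int _0^s\varPsi (u)du}\Big (\delta \mathcal {Z}_1(s)dW^{\mathbb {Q}}(s)+\delta \mathcal {Z}_2(s)dB^{\mathbb {Q}}(s)\Big ),\\&e^{-\int _0^t\varPsi (u)du}\delta \mathcal {Y}(t)={\mathbb {E}}^{\mathbb {Q}}\Big [e^{-\int _0^T\varPsi (u)du}\delta \mathcal {Y}(T)\big |{\mathcal {F}}^{W,B}_t\Big ]=0,\quad 0\le t\le T, \end{aligned}$$

since both solutions share the same terminal value, so that \(\delta \mathcal {Y}(T)=0\). Hence \(\delta \mathcal {Y}=0\), and then \(\delta \mathcal {Z}_1=\delta \mathcal {Z}_2=0\) because the martingale part of the zero process vanishes.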

Proof of Proposition 5.4

The assertions (i) and (ii) hold for \(k=0\): it suffices to compare the explicit solutions to the BSDE and the PDE for \(k=0\). Uniqueness of the solution to the PDE (5.10) for \(k=0\) follows from Proposition 2.3 in Becherer (2005). Fix \(k\in \{1,\ldots ,n\}\) and assume that the assertions (i) and (ii) hold for \(k-1\). In particular, \(g^{k-1}\in \mathcal {C}([0,T]\times (0,\infty ))\cap \mathcal {C}^{1,2}([0,T)\times (0,\infty ))\) and \(\mathcal {Y}^{k-1}(t)=g^{k-1}(t,P(t))\). We prove that the assertions (i) and (ii) hold for k.

Step 1: By (5.9), the mapping \(p\mapsto \mathcal {Y}^{k,t,p}(t)\) is continuous on \((0,\infty )\) for any fixed \(t\in [0,T]\). We prove that the mapping \((t,p)\mapsto \mathcal {Y}^{k,t,p}(t)\) is continuous on \([0,T]\times (0,\infty )\). We introduce the parametrized dynamics:

$$\begin{aligned} \frac{dP^{t,p}(s)}{P^{t,p}(s)}= & {} ads+b(\rho dW(s)+\sqrt{1-\rho ^2}dB(s)),\quad t\le s\le T,\\ P^{t,p}(t)= & {} p. \end{aligned}$$

We can observe that \((t,p)\mapsto P^{t,p}(s)\) is continuous on \([0,T]\times (0,\infty )\), for any \(s\in [t,T]\). Let \(t'\le t\). Recalling (8.5), (8.27) and Theorems 4.1, 5.1 from Ankirchner et al. (2007), we can deduce the following estimates:

$$\begin{aligned}&|\mathcal {Y}^{k,t,p}(t)-\mathcal {Y}^{k,t',p'}(t')|^{2q}=|\mathcal {Y}^{k,t,p}(0)-\mathcal {Y}^{k,t',p'}(0)|^{2q}\nonumber \\&\quad \le {\mathbb {E}}\Big [\sup _{s\in [0,T]}|\mathcal {Y}^{k,t,p}(s)-\mathcal {Y}^{k,t',p'}(s)|^{2q}\Big ]\nonumber \\&\quad \le K\Big \{\Big ({\mathbb {E}}\Big [\Big (\int _t^T\big |\varphi ^{k,t,p}(s,\mathcal {Y}^{k,t',p'}(s),\mathcal {Z}_1^{k,t',p'}(s))-\varphi ^{k,t',p'}(s,\mathcal {Y}^{k,t',p'}(s),\mathcal {Z}_1^{k,t',p'}(s))\big |ds\nonumber \\&\qquad +\,\int _{t'}^t\big |\varphi ^{k,t',p'}(s,\mathcal {Y}^{k,t',p'}(s),\mathcal {Z}_1^{k,t',p'}(s))\big |ds\Big )^{2qr^2}\nonumber \\&\qquad +\,\Big (\int _t^T\big |A^{k,t,p}(s)-A^{k,t',p'}(s)\big |ds+\int _{t'}^t\big |A^{k,t',p'}(s)\big |ds\Big )^{2qr^2}\Big ]\Big )^{\frac{1}{r^2}}\nonumber \\&\qquad +\,\Big ({\mathbb {E}}\Big [\Big (\int _t^T|A^{k,t,p}(s)|ds\Big )^{2qr^2}\Big ]\Big )^{\frac{1}{2r^2}}\nonumber \\&\qquad \times \,\Big ({\mathbb {E}}\Big [\Big (\int _t^T|H^{k,t,p}(s)-H^{k,t',p'}(s)|^2ds+\int _{t'}^t|H^{k,t',p'}(s)|^2ds\Big )^{2qr^2}\Big ]\Big )^{\frac{1}{2r^2}}\Big \}, \end{aligned}$$

and

$$\begin{aligned}&{\mathbb {E}}\Big [\Big (\int _0^T\big |Z_2^{k,t,p}(s)-Z_2^{k,t',p'}(s)\big |^{2}ds\Big )^q\Big ]\nonumber \\&\quad \le K\Big ({\mathbb {E}}\Big [\big |k\eta (P^{t,p}(T))-k\eta (P^{t',p'}(T))\big |^{2qr^2}\Big ]\nonumber \\&\qquad +\,{\mathbb {E}}\Big [\Big (\int _t^T\big |\psi ^{k,t,p}(s,Y^{k,t',p'}(s),Z_1^{k,t',p'}(s))-\psi ^{k,t',p'}(s,Y^{k,t',p'}(s),Z_1^{k,t',p'}(s))\big |ds\nonumber \\&\qquad +\,\int _{t'}^t\big |\psi ^{k,t',p'}(s,Y^{k,t',p'}(s),Z_1^{k,t',p'}(s))\big |ds\Big )^{2qr^2}\Big ]\Big )^{\frac{1}{r^2}}. \end{aligned}$$

Using similar arguments as in the proof of Proposition 5.3 (Steps 3.2–3.3) and the results of Proposition 5.2 (in particular the properties that \(Y^k(t)=h^k(t,P(t))\) and \(h^k\in \mathcal {C}([0,T]\times (0,\infty ))\)), we can show that \(\lim _{(t,p)\rightarrow (t',p')}|\mathcal {Y}^{k,t,p}(t)-\mathcal {Y}^{k,t',p'}(t')|=0\). Consequently, the assertion is proved.

Step 2: We derive a representation for \(\mathcal {Y}^k\). Let us change the measure to \({\mathbb {Q}}\sim {\mathbb {P}}\) with the exponential martingale \(\mathcal {E}(-\int _0^\cdot \frac{\mu }{\sigma }dW(s)+\int _0^\cdot \gamma Z_2^{k,\gamma }(s)dB(s))\). The BSDE (5.8) and the price process (2.3) take the form

$$\begin{aligned} d\mathcal {Y}^{k,t,p}(s)= & {} \Big (\varUpsilon (s,P^{t,p}(s))\nonumber \\&-\,\varPsi (s,P^{t,p}(s))g^{k-1}(s,P^{t,p}(s))+\varPsi (s,P^{t,p}(s))\mathcal {Y}^{k,t,p}(s)\Big )ds\nonumber \\&+\,\mathcal {Z}_1^{k,t,p}(s)dW^{\mathbb {Q}}(s)+\mathcal {Z}_2^{k,t,p}(s)dB^{\mathbb {Q}}(s),\quad t\le s\le T, \nonumber \\ \frac{dP^{t,p}(s)}{P^{t,p}(s)}= & {} \Big (a-\frac{\mu b \rho }{\sigma }+\gamma (1-\rho ^2)b^2P^{t,p}(s)h^k_p(s,P^{t,p}(s))\Big )ds\nonumber \\&+\,b\big (\rho dW^{\mathbb {Q}}(s)+\sqrt{1-\rho ^2}dB^{\mathbb {Q}}(s)\big ),\quad t\le s\le T, \end{aligned}$$
(8.30)

where

$$\begin{aligned}&\varUpsilon (s,p)=-\frac{\mu ^2}{2\sigma ^2\gamma ^2}-\frac{1}{2}(1-\rho ^2)b^2p^2\big (h^k_p(s,p)\big )^2 \nonumber \\&\quad -\frac{e^{\gamma (\beta (p)+h^{k-1}(s,p)-h^{k}(s,p))}\big (\gamma \big (\beta (p)+h^{k-1}(s,p)-h^{k}(s,p)\big )-1\big )+1}{\gamma ^2}k\lambda ,\nonumber \\&\varPsi (s,p)=e^{\gamma (\beta (p)+h^{k-1}(s,p)-h^{k}(s,p))}k\lambda . \end{aligned}$$
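The drift of \(P^{t,p}\) in (8.30) can be checked by hand with the Girsanov theorem. The sketch below assumes the identification \(Z_2^{k,t,p}(s)=b\sqrt{1-\rho ^2}P^{t,p}(s)h^k_p(s,P^{t,p}(s))\), which we infer from comparing the term \(\frac{1}{2}(Z_2^{k,t,p}(s))^2\) in (8.26) with the term \(\frac{1}{2}(1-\rho ^2)b^2p^2(h^k_p(s,p))^2\) in \(\varUpsilon \); the authoritative formula is the one established in Proposition 5.2.

$$\begin{aligned}&dW(s)=dW^{\mathbb {Q}}(s)-\frac{\mu }{\sigma }ds,\qquad dB(s)=dB^{\mathbb {Q}}(s)+\gamma Z_2^{k,t,p}(s)ds,\\&\frac{dP^{t,p}(s)}{P^{t,p}(s)}=ads+b\big (\rho dW(s)+\sqrt{1-\rho ^2}dB(s)\big )\\&\quad =\Big (a-\frac{\mu b\rho }{\sigma }+\gamma b\sqrt{1-\rho ^2}Z_2^{k,t,p}(s)\Big )ds+b\big (\rho dW^{\mathbb {Q}}(s)+\sqrt{1-\rho ^2}dB^{\mathbb {Q}}(s)\big )\\&\quad =\Big (a-\frac{\mu b\rho }{\sigma }+\gamma (1-\rho ^2)b^2P^{t,p}(s)h^k_p(s,P^{t,p}(s))\Big )ds+b\big (\rho dW^{\mathbb {Q}}(s)+\sqrt{1-\rho ^2}dB^{\mathbb {Q}}(s)\big ). \end{aligned}$$

The first line records the shifts of the two Brownian motions induced by the stochastic exponential used in the change of measure.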

The stochastic integrals in (8.30) are \({\mathbb {Q}}\)-martingales, see e.g. Theorem 3.6 in Kazamaki (1997). Taking the expected value in (8.30), we derive that

$$\begin{aligned}&\mathcal {Y}^{k,t,p}(t) \\&\quad ={\mathbb {E}}^{\mathbb {Q}}\Big [-\int _t^Te^{-\int _t^s\varPsi (u,P^{t,p}(u))du}\Big (\varUpsilon (s,P^{t,p}(s))-\varPsi (s,P^{t,p}(s))g^{k-1}(s,P^{t,p}(s))\Big )ds\Big ],\\&\quad \quad (t,p)\in [0,T]\times (0,\infty ),\quad k\in \{0,\ldots ,n\}. \end{aligned}$$
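The step from (8.30) to this representation is the usual integrating factor argument. As a sketch, write \(D(s)=e^{-\int _t^s\varPsi (u,P^{t,p}(u))du}\) (our shorthand) and assume that the terminal value \(\mathcal {Y}^{k,t,p}(T)\) vanishes, which is what the absence of a terminal term in the representation suggests:

$$\begin{aligned}&d\big (D(s)\mathcal {Y}^{k,t,p}(s)\big )=D(s)\Big (\varUpsilon (s,P^{t,p}(s))-\varPsi (s,P^{t,p}(s))g^{k-1}(s,P^{t,p}(s))\Big )ds\\&\qquad +\,D(s)\Big (\mathcal {Z}_1^{k,t,p}(s)dW^{\mathbb {Q}}(s)+\mathcal {Z}_2^{k,t,p}(s)dB^{\mathbb {Q}}(s)\Big ),\\&{\mathbb {E}}^{\mathbb {Q}}\big [D(T)\mathcal {Y}^{k,t,p}(T)\big ]-\mathcal {Y}^{k,t,p}(t)={\mathbb {E}}^{\mathbb {Q}}\Big [\int _t^TD(s)\Big (\varUpsilon (s,P^{t,p}(s))-\varPsi (s,P^{t,p}(s))g^{k-1}(s,P^{t,p}(s))\Big )ds\Big ]. \end{aligned}$$

The term \(\varPsi \mathcal {Y}^{k,t,p}ds\) in the drift cancels against the derivative of \(D\), and the stochastic integrals have zero expectation because they are \({\mathbb {Q}}\)-martingales and \(D\) is bounded by one.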

Step 3: Using (A7), Steps 1–2 and Theorem 1 from Heath and Schweizer (2001), we can conclude that \(\mathcal {Y}^{k,t,p}(t)=g^k(t,p)\) where \(g^k\in \mathcal {C}([0,T]\times (0,\infty ))\cap \mathcal {C}^{1,2}([0,T-\epsilon )\times (0,\infty ))\) and \(g^k\) satisfies the PDE (5.10) for \((t,p)\in [0,T-\epsilon )\times (0,\infty )\) with the terminal condition \(\mathcal {Y}^{k,T-\epsilon ,p}(T-\epsilon )\), for any \(\epsilon >0\). Moreover, the solution to such a PDE is unique. Since \(\epsilon >0\) is arbitrary, the result is proved.

Step 4: The formulas for \(\mathcal {Z}_1\) and \(\mathcal {Z}_2\) can be proved as in Proposition 5.2. \(\square \)

Proof of Theorem 6.1

From the calculations in Sect. 6 we conclude that the first-order expansion of the equilibrium strategy is given by (6.3) with (6.11) and (6.14). If we use the relations between \((h^k)_{k=0}^n, (g^k)_{k=0}^n\) and \((Y^k,Z^k_1)_{k=0}^n, (\mathcal {Y}^k,\mathcal {Z}_1^k)_{k=0}^n\) established in Propositions 5.2 and 5.4, we get the strategy (6.15). We now confirm that our strategy (6.15) is admissible, i.e. it satisfies all points of Definition 3.1.

Point 1: The strategy \({\hat{\pi }}^*\) is \({\mathbb {F}}\)-predictable and is determined with a measurable mapping.

Point 2: By Propositions 5.1 and 5.3, the processes \(\big (\int _0^tZ^{k,\gamma _0}_1(s)dW(s), 0\le t\le T\big ), \big (\int _0^t\mathcal {Z}^{k,\gamma _0}_1(s)dW(s), 0\le t\le T\big )\) are \(BMO({\mathbb {F}}^{W,B})\)-martingales, for each \(k\in \{0,\ldots ,n\}\). Since \(\gamma _1\) is bounded, we can deduce that \(\big (\int _0^t{\hat{\pi }}^*(s)dW(s), 0\le t\le T\big )\) is a \(BMO({\mathbb {F}})\)-martingale.

Point 3: The dynamics of the insurer’s wealth process (3.1) under the strategy \({\hat{\pi }}^*\) is given by the equation

$$\begin{aligned} dX^{{\hat{\pi }}^*}(t)= & {} {\hat{\pi }}^{J(t-),*}(t,X^{{\hat{\pi }}^*}(t),P(t))\big (\mu dt+\sigma dW(t)\big )\nonumber \\&-J(t-)\alpha (P(t))dt+\beta (P(t))dJ(t),\quad 0\le t\le T. \end{aligned}$$
(8.31)

By Lipschitz continuity of \(\gamma _1\) and the formula (5.11) for \(\mathcal {Z}^k_1\), we have

$$\begin{aligned} |{\hat{\pi }}^{k,*}(t,x,p)-{\hat{\pi }}^{k,*}(t,x',p)|\le K|g^k_p(t,p)p||x-x'|, \end{aligned}$$

for any \((t,x,p), (t,x',p)\in [0,T]\times {\mathbb {R}}\times (0,\infty )\) and \(k\in \{0,\ldots ,n\}\). By Propositions 5.3 and 5.4, the mapping \(p\mapsto g^k(t,p)\) is Lipschitz continuous on \((0,\infty )\) uniformly in \(t\in [0,T]\) and \(g^k\in \mathcal {C}([0,T]\times (0,\infty ))\cap \mathcal {C}^{1,2}([0,T)\times (0,\infty ))\). Consequently, the derivative \((t,p)\mapsto g^k_p(t,p)\) is uniformly bounded and continuous on \([0,T)\times (0,\infty )\). In the definition of the investment strategy (6.15) we can choose \(g^k_p(T,p)=\lim _{t\rightarrow T-}g^k_p(t,p)\) and we have a continuous, finite mapping \(t\mapsto g^k_p(t,P(t,\omega ))P(t,\omega )\) on [0, T] for a.a. \(\omega \). We can conclude that the SDE (8.31) is an SDE with a process Lipschitz coefficient, see Chapter V in Protter (2005). Hence, by Theorem V.7 in Protter (2005), there exists a unique solution to (8.31).

Point 4: We use the decomposition: \({\hat{\pi }}^*(t)={\hat{\pi }}^*_0(t)+{\hat{\pi }}^*_1(t)\epsilon \). We choose \(r\in {\mathbb {R}}\) and set \(\gamma _1:=\gamma _1(r)\). We fix \(t\in [0,T]\). We have to study the expected value:

$$\begin{aligned}&{\mathbb {E}}\Big [e^{-\varGamma (r)\big (X^{{\hat{\pi }}^*}(T)-J(T)\eta (P(T))\big )}|{\mathcal {F}}_t\Big ]={\mathbb {E}}\Big [e^{-(\gamma _0+\gamma _1\epsilon )\big (X^{{\hat{\pi }}^*}(T)-J(T)\eta (P(T))\big )}|{\mathcal {F}}_t\Big ]\\&\quad =e^{-(\gamma _0+\gamma _1\epsilon )X^{{\hat{\pi }}^*}(t)+\gamma _0Y(t)}{\mathbb {E}}\Big [e^{-\gamma _0\big (X^{{\hat{\pi }}_0^*}(T)-X^{{\hat{\pi }}_0^*}(t)-(Y(T)-Y(t))\big )}\\&\qquad \times e^{-\epsilon \big (\int _t^T{\tilde{\pi }}^{*}(s)\mu ds+\int _t^T{\tilde{\pi }}^{*}(s)\sigma dW(s)\big )}e^{\gamma _1\epsilon \big (\int _t^TJ(s)\alpha (P(s))ds-\int _t^T\beta (P(s))dJ(s)+J(T)\eta (P(T))\big )}|{\mathcal {F}}_t\Big ], \end{aligned}$$

where Y solves the BSDE (8.10), and we introduce \({\tilde{\pi }}^*(s)=\gamma _1{\hat{\pi }}_0^*(s)+(\gamma _0+\gamma _1\epsilon ){\hat{\pi }}_1^*(s)\). We have the property:

$$\begin{aligned} ||{\tilde{\pi }}^*||_{BMO}\le |\gamma _1|||{\hat{\pi }}_0^*||_{BMO}+(\gamma _0+|\gamma _1|\epsilon )||{\hat{\pi }}_1^*||_{BMO}<\infty . \end{aligned}$$
(8.32)

The process \((M_1(s), t\le s\le T)\) given by

$$\begin{aligned} M_1(s)= & {} e^{-\gamma _0\big (X^{{\hat{\pi }}_0^*}(s)-X^{{\hat{\pi }}_0^*}(t)-(Y(s)-Y(t))\big )}, \end{aligned}$$

is an exponential martingale generated by a BMO-martingale, see (8.12). By the Hölder inequality and the reverse Hölder inequality (see Theorem 3.1 in Kazamaki (1997)), we can derive

$$\begin{aligned}&{\mathbb {E}}\Big [e^{-\varGamma (r)\big (X^{{\hat{\pi }}^*}(T)-J(T)\eta (P(T))\big )}|{\mathcal {F}}_t\Big ]\nonumber \\&\quad \le e^{-(\gamma _0+\gamma _1\epsilon )X^{{\hat{\pi }}^*}(t)+\gamma _0Y(t)}\big ({\mathbb {E}}\big [|M_1(T)|^{q_1}|{\mathcal {F}}_t\big ]\big )^{\frac{1}{q_1}}\nonumber \\&\qquad \times \Big ({\mathbb {E}}\Big [e^{-q_1^*\epsilon \big (\int _t^T{\tilde{\pi }}^{*}(s)\mu ds+ \int _t^T{\tilde{\pi }}^{*}(s)\sigma dW(s)\big )}e^{q_1^*\gamma _1\epsilon \big (\int _t^TJ(s)\alpha (P(s))ds-\int _t^T\beta (P(s))dJ(s)+J(T)\eta (P(T))\big )}|{\mathcal {F}}_t\Big ]\Big )^{\frac{1}{q_1^*}}\nonumber \\&\quad \le Ke^{-(\gamma _0+\gamma _1\epsilon )X^{{\hat{\pi }}^*}(t)+\gamma _0Y(t)}\nonumber \\&\qquad \times \Big ({\mathbb {E}}\Big [e^{-\int _t^Tq_1^*\epsilon {\tilde{\pi }}^{*}(s)\sigma dW(s)-\frac{1}{2}\int _t^T|q_1^*\epsilon {\tilde{\pi }}^{*}(s)\sigma |^2ds}e^{\frac{1}{2}\int _t^T|q_1^*\epsilon {\tilde{\pi }}^{*}(s)\sigma |^2ds-\int _t^Tq_1^*\epsilon {\tilde{\pi }}^{*}(s)\mu ds}|{\mathcal {F}}_t\Big ]\Big )^{\frac{1}{q_1^*}}, \end{aligned}$$

for some sufficiently small \(q_1>1\) and its conjugate \(q_1^*\). The process \((M_2(s), t\le s\le T)\) given by \(M_2(s)=e^{-\int _t^sq_1^*\epsilon {\tilde{\pi }}^{*}(u)\sigma dW(u)-\frac{1}{2}\int _t^s|q_1^*\epsilon {\tilde{\pi }}^{*}(u)\sigma |^2du}\) is also an exponential martingale generated by a BMO-martingale. Applying the Hölder inequality and the reverse Hölder inequality again, we get

$$\begin{aligned}&{\mathbb {E}}\Big [e^{-\varGamma (r)\big (X^{{\hat{\pi }}^*}(T)-J(T)\eta (P(T))\big )}|{\mathcal {F}}_t\Big ]\nonumber \\&\quad \le Ke^{-(\gamma _0+\gamma _1\epsilon )X^{{\hat{\pi }}^*}(t)+\gamma _0Y(t)}\nonumber \\&\qquad \times \Big ({\mathbb {E}}\Big [e^{\frac{1}{2}\int _t^Tq_2^*|q_1^*\epsilon {\tilde{\pi }}^{*}(s)\sigma |^2ds-\int _t^Tq_2^*q^*_1\epsilon {\tilde{\pi }}^*(s)\mu ds}|{\mathcal {F}}_t\Big ]\Big )^{\frac{1}{q_1^*q_2^*}}, \end{aligned}$$
(8.33)

for some sufficiently small \(q_2>1\) and its conjugate \(q_2^*\). Finally, for a sufficiently small \(\epsilon >0\), we have the inequality

$$\begin{aligned}&{\mathbb {E}}\Big [e^{\frac{1}{2}\int _t^Tq_2^*|q_1^*\epsilon {\tilde{\pi }}^{*}(s)\sigma |^2ds-\int _t^Tq_2^*q^*_1\epsilon {\tilde{\pi }}^*(s)\mu ds}|{\mathcal {F}}_t\Big ]\nonumber \\&\quad \le K{\mathbb {E}}\Big [e^{K\epsilon ^2\int _t^T|{\tilde{\pi }}^{*}(s)|^2ds}|{\mathcal {F}}_t\Big ]\le \frac{K}{1-K\epsilon ^2||{\tilde{\pi }}^*||^2_{BMO}}<\infty , \end{aligned}$$
(8.34)

by (8.32) and the John–Nirenberg inequality, see Theorem 2.2 in Kazamaki (1997). Collecting (8.33) and (8.34), we can conclude that our strategy \({\hat{\pi }}^*\) satisfies point 4. \(\square \)
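The two inequalities in (8.34) can be traced as follows. This is a sketch under our own bookkeeping: we write \(c=q_1^*q_2^*\), we let K stand for a generic constant that may change from line to line, and we assume that \(||{\tilde{\pi }}^*||_{BMO}\) denotes the BMO norm of the martingale \(\int _0^\cdot {\tilde{\pi }}^*(s)dW(s)\). The linear term is absorbed by the elementary inequality \(|xy|\le \frac{1}{2}x^2+\frac{1}{2}y^2\):

$$\begin{aligned}&\frac{1}{2}q_2^*|q_1^*\epsilon {\tilde{\pi }}^{*}(s)\sigma |^2-c\epsilon {\tilde{\pi }}^*(s)\mu \le \Big (\frac{1}{2}q_2^*(q_1^*)^2\sigma ^2+\frac{1}{2}\Big )\epsilon ^2|{\tilde{\pi }}^{*}(s)|^2+\frac{1}{2}c^2\mu ^2,\\&{\mathbb {E}}\Big [e^{K\epsilon ^2\int _t^T|{\tilde{\pi }}^{*}(s)|^2ds}|{\mathcal {F}}_t\Big ]\le \frac{1}{1-K\epsilon ^2||{\tilde{\pi }}^*||^2_{BMO}}\quad \text {provided that}\quad K\epsilon ^2||{\tilde{\pi }}^*||^2_{BMO}<1. \end{aligned}$$

The second line is the John–Nirenberg inequality applied to the martingale \(\epsilon \sqrt{K}\int _0^\cdot {\tilde{\pi }}^*(s)dW(s)\), whose quadratic variation is \(K\epsilon ^2\int _0^\cdot |{\tilde{\pi }}^*(s)|^2ds\). The smallness condition on \(\epsilon \) is exactly what makes the denominator positive; the bound (8.32) is uniform for \(\epsilon \) in a bounded range, so such an \(\epsilon \) exists.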