1 Introduction

In Delong [8] we investigate an exponential utility maximization problem for an insurer who faces a stream of non-hedgeable claims. We assume that the insurer’s risk aversion coefficient changes over time and depends on the insurer’s current net asset value (the excess of assets over liabilities). Since the optimization problem is time-inconsistent, we follow the game-theoretic approach developed by Ekeland and Lazrak [10], Ekeland and Pirvu [11], Björk and Murgoci [4] and Björk et al. [3]. We use the notion of an equilibrium strategy and derive the HJB equation for the equilibrium value function. In order to solve the HJB equation, we use perturbation theory. We assume that the insurer’s risk aversion coefficient consists of a constant risk aversion and a small amount of wealth-dependent risk aversion. The equilibrium value function is expanded in powers of the parameter \(\epsilon \) controlling the degree of the insurer’s risk aversion depending on wealth. We derive candidates for the first-order approximations to the equilibrium value function and the equilibrium investment strategy.

Delong [8] proves a number of results which are essential for characterizing the first-order approximation to the equilibrium investment strategy and for justifying the choice of this investment strategy as the first-order approximation. However, the order of the error of approximating the true equilibrium investment strategy with the candidate first-order approximate solution has not been established. In this paper we formally study the asymptotic optimality of the investment strategy postulated by Delong [8]. More precisely, we show that the zeroth-order investment strategy \(\pi _0^*\) postulated by Delong [8] performs better than any strategy \(\pi _0\) when we compare the asymptotic expansions of the objective functions up to order \({\mathcal {O}}(1)\) as \(\epsilon \rightarrow 0\), and the first-order investment strategy \(\pi _0^*+\pi _1^*\epsilon \) postulated by Delong [8] is the equilibrium strategy in the class of strategies \(\pi ^*_0+\pi _1\epsilon \) when we compare the asymptotic expansions of the objective functions up to order \({\mathcal {O}}(\epsilon ^2)\) as \(\epsilon \rightarrow 0\), where \(\epsilon \) denotes the parameter controlling the degree of the insurer’s risk aversion depending on wealth. From a mathematical point of view, the results complete the results from Delong [8] and give a more rigorous justification for the strategy derived in Delong [8]. From an economic point of view, the proof that the candidate strategy from Delong [8] is optimal (in some sense) is crucial for applications and conclusions derived from the model.

The assumption of constant risk aversion is best known in economics, finance and insurance. However, many empirical studies suggest that agents’ risk attitudes are correlated with their wealth, see e.g. Shaw [21], Wik et al. [22], Anderson and Galinsky [1], Bucciol and Miniaci [5], Courbage et al. [6]. Consequently, we should use wealth-dependent risk aversion coefficients to model, and understand, economic decisions of investors and insurance companies in a risky environment. In practice, insurance companies implement investment strategies for asset and liability management in order to pay random claims and earn a profit. In this paper we study the optimality of an investment strategy for a risk-averse insurer with a time-varying risk aversion depending on the available wealth. The framework with stochastic risk preferences should better reflect the risk attitude of an insurer trying to make optimal decisions in financial markets. The investment strategy, which we derive in our theoretical model, can be used as a reference point for developing investment strategies for asset and liability management in real life.

To the best of our knowledge, only two papers, Dong and Sircar [9] and Delong [8], study exponential utility maximization problems for investors with wealth-dependent risk aversion coefficients. Moreover, the first-order approximation to the equilibrium investment strategy postulated by Delong [8] is a new investment strategy and its properties are worth investigating.

Perturbation techniques have been popularized in financial mathematics by Fouque et al. [13], Fouque et al. [14], Fouque and Hu [12], Fouque et al. [15]. In particular, the asymptotic optimality of a candidate strategy in the class of strategies given by \(\pi _0+\pi _1\epsilon \) is investigated by Fouque and Hu [12] in a model where an investor maximizes an expected utility in a market with stochastic volatility. The idea to study the asymptotic expansions of the objective function up to orders \({\mathcal {O}}(1), {\mathcal {O}}(\epsilon ), {\mathcal {O}}(\epsilon ^2)\) as \(\epsilon \rightarrow 0\) and the asymptotic optimality of the candidate strategy in the class of strategies given by \(\pi _0+\pi _1\epsilon \) is taken from Fouque and Hu [12]. However, the techniques which we use in this paper are different from the techniques used by Fouque and Hu [12], since the models are different. Moreover, we deal with an equilibrium strategy, which is not the optimal strategy in Bellman’s sense, and we introduce a new asymptotic criterion for the equilibrium in order to formalize our asymptotic results.

In Sects. 2 and 3 we briefly recall the model and the main results from Delong [8] for the reader’s convenience. The results from Delong [8] are used in the proofs in this paper. In Sect. 4 we present the main result of this paper and we study the asymptotic optimality, in an appropriate sense, of the investment strategy from Delong [8]. The proofs can be found in Sect. 5.

In the sequel, the conditional expected value will be denoted by \({{\mathbb {E}}}_y[\cdot ]={{\mathbb {E}}}[\cdot |Y(t)=y]\) where Y denotes the stochastic process which is used in the conditional expected value. We will use functions of order \({\mathcal {O}}(\epsilon ^\theta )\). Let us recall that

$$\begin{aligned} z^\epsilon (x)\sim {\mathcal {O}}(\epsilon ^\theta ) \ \text {as} \ \epsilon \rightarrow 0 \quad \text {if} \quad |z^\epsilon (x)|\le K\epsilon ^\theta ,\quad 0\le \epsilon \le \epsilon _0, \end{aligned}$$
(1.1)

for some \(\epsilon _0>0\), where K is independent of \(\epsilon \) but may depend on \((x,\epsilon _0)\).
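As an illustration of (1.1), the following sketch checks the order of a remainder numerically. The function \(z^\epsilon (x)=e^{\epsilon x}-1-\epsilon x\) is our own example and is not part of the model; by Taylor's theorem it satisfies \(|z^\epsilon (x)|\le \frac{1}{2}x^2e^{\epsilon _0|x|}\epsilon ^2\) for \(0\le \epsilon \le \epsilon _0\), so it is of order \({\mathcal {O}}(\epsilon ^2)\) with a constant K depending on \((x,\epsilon _0)\). The names `taylor_remainder`, `order_ratio` and `bound_constant` are hypothetical.

```python
import math


def taylor_remainder(eps: float, x: float) -> float:
    """z^eps(x) = exp(eps * x) - 1 - eps * x (illustrative example only)."""
    return math.exp(eps * x) - 1.0 - eps * x


def order_ratio(eps: float, x: float, theta: float) -> float:
    """|z^eps(x)| / eps^theta, the quantity that (1.1) requires to be bounded by K."""
    return abs(taylor_remainder(eps, x)) / eps ** theta


def bound_constant(x: float, eps0: float) -> float:
    """K = x^2 / 2 * exp(eps0 * |x|), valid for theta = 2 and 0 < eps <= eps0."""
    return 0.5 * x * x * math.exp(eps0 * abs(x))


if __name__ == "__main__":
    x, eps0 = 2.0, 0.1
    for eps in (0.1, 0.01, 0.001):
        # The ratio with theta = 2 stays below K; with theta = 3 it grows as eps -> 0.
        print(eps, order_ratio(eps, x, 2.0), order_ratio(eps, x, 3.0), bound_constant(x, eps0))
```

The ratio with \(\theta =2\) remains bounded as \(\epsilon \) decreases, whereas the ratio with \(\theta =3\) grows, which is consistent with the remainder being of order \({\mathcal {O}}(\epsilon ^2)\) but not \({\mathcal {O}}(\epsilon ^3)\).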

2 The Financial and Insurance Model

We deal with a probability space \((\varOmega ,{{\mathbb {F}}},{\mathbb {P}})\) with a filtration \({{\mathbb {F}}}=({\mathcal {F}}_{t})_{0\le t\le T}\) and a finite time horizon \(T<\infty \). On the probability space \((\varOmega ,{{\mathbb {F}}},{\mathbb {P}})\) we define a standard two-dimensional Brownian motion \((W,B)=(W(t),B(t), 0\le t\le T)\) and a càdlàg (right-continuous with left limits) counting process \(N=(N(t), 0\le t\le T)\). We assume that

  1. (A1)

    The filtration \({\mathcal {F}}_t=\bigcap _{\epsilon >0}\big ({\mathcal {F}}^{W,B}_{t+\epsilon }\vee {\mathcal {F}}^{N}_{t+\epsilon }\big )\), \(0\le t\le T\), where \({\mathcal {F}}^{W,B}_{t}=\sigma (W(u), B(u), u\in [0,t]),\ {\mathcal {F}}^{N}_{t}=\sigma (N(u), u\in [0,t])\). Moreover, \({\mathcal {F}}^{W,B}_{t}\) and \({\mathcal {F}}^{N}_{t}\) are independent.

The filtration \({{\mathbb {F}}}\) is right-continuous and completed with sets of measure zero.

The financial market consists of a risk-free deposit \(D=(D(t), 0\le t\le T)\) and two risky indices: \(S=(S(t), 0\le t\le T)\), \(P=(P(t),0\le t\le T)\). The value of the risk-free deposit is constant, i.e.:

$$\begin{aligned} D(t)=1, \quad 0\le t\le T. \end{aligned}$$
(2.1)

The prices of the risky indices are modelled with correlated geometric Brownian motions:

$$\begin{aligned} \frac{dS(t)}{S(t)}= & {} \mu dt+\sigma dW(t),\quad 0\le t\le T,\nonumber \\ S(0)= & {} s_0, \end{aligned}$$
(2.2)
$$\begin{aligned} \frac{dP(t)}{P(t)}= & {} a dt+b\Big (\rho dW(t)+\sqrt{1-\rho ^2}dB(t)\Big ),\quad 0\le t\le T,\nonumber \\ P(0)= & {} p_0, \end{aligned}$$
(2.3)

where \(\mu , a, \sigma , b\) are positive constants which denote drifts and volatilities, and \(\rho \in [-1,1]\) denotes the correlation coefficient between the log-returns of S and P. The insurance company can invest in the deposit D and in the index S. The index P is not available for trading. The index P is the underlying investment fund for the insurance contracts sold by the insurance company, see below for a detailed description.
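The dynamics (2.2)–(2.3) can be simulated on a time grid as follows. This is a minimal sketch which uses the exact log-normal transition of a geometric Brownian motion; the function name `simulate_indices`, the uniform grid and the use of Python's standard `random` module are our assumptions.

```python
import math
import random
from typing import List, Tuple


def simulate_indices(s0: float, p0: float, mu: float, sigma: float,
                     a: float, b: float, rho: float, T: float,
                     steps: int, rng: random.Random) -> Tuple[List[float], List[float]]:
    """Simulate one path of (S, P) from (2.2)-(2.3) on a uniform grid of [0, T]."""
    dt = T / steps
    s_path, p_path = [s0], [p0]
    for _ in range(steps):
        # Independent Brownian increments of W and B over one step.
        dW = rng.gauss(0.0, math.sqrt(dt))
        dB = rng.gauss(0.0, math.sqrt(dt))
        # Noise driving P: correlation rho with the noise driving S.
        dZ = rho * dW + math.sqrt(1.0 - rho * rho) * dB
        # Exact log-normal step for each geometric Brownian motion.
        s_path.append(s_path[-1] * math.exp((mu - 0.5 * sigma ** 2) * dt + sigma * dW))
        p_path.append(p_path[-1] * math.exp((a - 0.5 * b ** 2) * dt + b * dZ))
    return s_path, p_path


if __name__ == "__main__":
    s, p = simulate_indices(1.0, 1.0, 0.05, 0.2, 0.03, 0.15, 0.5, 1.0, 250, random.Random(1))
    print(s[-1], p[-1])
```

For \(\rho =\pm 1\) the index P is driven by W alone, and for \(\rho =0\) the two indices are independent.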

The insurance company keeps a homogeneous portfolio consisting of n unit-linked policies. The counting process N is used to count the number of deaths in the insurance portfolio. We assume that the lifetimes of the policyholders are independent and exponentially distributed, i.e. we assume that

  1. (A2)

    \(\Big (N(t)-\int _0^t(n-N(s-))\lambda ds, \ 0\le t\le T\Big )\) is an \({{\mathbb {F}}}\)-martingale, where \(\lambda >0\).

The parameter \(\lambda \) denotes the mortality intensity in the population of the policyholders. We will use the process

$$\begin{aligned} J(t)=n-N(t), \quad 0\le t\le T, \end{aligned}$$

which counts the number of policies in force in the insurance portfolio. We remark that (A1) means that we assume that the insurance risk is independent of the financial risk under the real-world measure \({{\mathbb {P}}}\).
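The process J can be sampled directly from the lifetimes. The sketch below draws n independent exponential lifetimes with intensity \(\lambda \) and evaluates \(J(t)=n-N(t)\) on a grid; since each policyholder survives till time t with probability \(e^{-\lambda t}\), we have \({{\mathbb {E}}}[J(t)]=ne^{-\lambda t}\). The function names are hypothetical.

```python
import math
import random
from typing import List, Sequence


def simulate_policies_in_force(n: int, lam: float, grid: Sequence[float],
                               rng: random.Random) -> List[int]:
    """Return J(t) = n - N(t) on the grid for n i.i.d. Exp(lam) lifetimes."""
    lifetimes = [rng.expovariate(lam) for _ in range(n)]
    # A policy is in force at time t if the policyholder's lifetime exceeds t.
    return [sum(1 for tau in lifetimes if tau > t) for t in grid]


def expected_policies_in_force(n: int, lam: float, t: float) -> float:
    """E[J(t)] = n * exp(-lam * t) under independent exponential lifetimes."""
    return n * math.exp(-lam * t)


if __name__ == "__main__":
    grid = [0.0, 1.0, 2.5, 5.0]
    print(simulate_policies_in_force(100, 0.1, grid, random.Random(7)))
    print([expected_policies_in_force(100, 0.1, t) for t in grid])
```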

The insurer faces a stream of non-hedgeable claims which is modelled with the process \(C=(C(t), 0\le t\le T)\) given by the equation

$$\begin{aligned} C(t)= & {} \int _{0}^{t}J(s-)\alpha (P(s))ds+\int _{0}^{t}\beta (P(s))dN(s)\nonumber \\&+J(T)\eta (P(T)){\mathbf {1}}_{t=T},\quad 0\le t\le T. \end{aligned}$$
(2.4)

Each policyholder in the insurance portfolio is entitled to three types of benefits: annuity \(\alpha \) paid as long as the policyholder lives, life insurance benefit \(\beta \) paid if the policyholder dies and endowment benefit \(\eta \) paid if the policyholder survives till the terminal time T. The benefits \(\alpha , \beta , \eta \) are contingent on the non-tradeable index P. We assume that

  1. (A3)

    the functions \(\alpha , \beta , \eta :(0,\infty )\mapsto [0,\infty )\) are bounded and Lipschitz continuous.
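Given discretised paths of P and N, the claims process (2.4) can be accumulated step by step. The following sketch uses a left-point rule for the annuity integral and adds the endowment benefit at the terminal time; the function name `cumulative_claims` and the discretisation are our assumptions, and the benefit functions passed in are expected to satisfy (A3).

```python
from typing import Callable, List, Sequence

Benefit = Callable[[float], float]


def cumulative_claims(p_path: Sequence[float], n_path: Sequence[int], dt: float,
                      alpha: Benefit, beta: Benefit, eta: Benefit, n: int) -> List[float]:
    """Discretised version of (2.4) on a uniform grid with step dt.

    p_path[i] and n_path[i] are the values of P and N at the i-th grid point.
    """
    claims = [0.0]
    total = 0.0
    for i in range(1, len(p_path)):
        j_prev = n - n_path[i - 1]            # J(s-): policies in force before the step
        deaths = n_path[i] - n_path[i - 1]    # increment of the counting process N
        total += j_prev * alpha(p_path[i - 1]) * dt   # annuity payments
        total += beta(p_path[i]) * deaths             # death benefits
        claims.append(total)
    # Endowment benefit paid at T to the survivors, the indicator term in (2.4).
    claims[-1] += (n - n_path[-1]) * eta(p_path[-1])
    return claims


if __name__ == "__main__":
    # Two policies, one death in the second step, constant index value.
    print(cumulative_claims([1.0, 1.0, 1.0], [0, 0, 1], 0.5,
                            lambda p: 1.0, lambda p: 10.0, lambda p: 5.0, n=2))
```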

In order to fulfill the future liabilities, the insurer must hold a reserve. The reserve is set for the policies in force. The reserve is defined by

$$\begin{aligned} F^k(t,p)= & {} {{\mathbb {E}}}_{t,p,k}^{\tilde{{\mathbb {Q}}}}\big [C(T)-C(t)\big ],\nonumber \\&\quad (t,p,k)\in [0,T]\times (0,\infty )\times \{0,1,\ldots ,n\}, \end{aligned}$$
(2.5)

where \({\tilde{{\mathbb {Q}}}}\) denotes a pricing measure for C. Here, by reserve we mean an amount of money which the insurer sets aside to cover the future claims. In practice, the insurer can use best estimate, market-consistent or first-order assumptions to calculate the reserve, see e.g. Chapter 2 in Møller and Steffensen [18]. The pricing and reserving assumptions are reflected in the measure \({\tilde{{{\mathbb {Q}}}}}\), under which the real-world dynamics of the risk factors are modified in accordance with the assumptions. We do not make any assumptions on the pricing measure \({\tilde{{{\mathbb {Q}}}}}\) in (2.5). However, we assume that

  1. (A4)

    \(F^k(t,p)=kF^1(t,p), \ (t,p,k)\in [0,T]\times (0,\infty )\times \{0,\ldots ,n\},\) and the function \(F^1:[0,T]\times (0,\infty )\mapsto [0,\infty )\) is \({\mathcal {C}}^{1,2}([0,T]\times (0,\infty ))\).

In most cases, the insurance risk would be assumed to be independent of the financial risk under the pricing measure \({\tilde{{{\mathbb {Q}}}}}\). If (A1) also holds under \({\tilde{{{\mathbb {Q}}}}}\), then (A4) is satisfied. In the sequel, the reserve for one policy in force \(F^1\) is simply denoted by F.

For a detailed description of the financial and insurance model and a motivation for the optimization problem we refer to Delong [8].

3 The Optimization Problem and the Candidate First-Order Approximate Strategy

Let \(\pi :=(\pi (t),0\le t\le T)\) denote an investment strategy which specifies the amount of money that the insurer invests in the index S. The wealth process of the insurer, denoted by \(X^\pi =(X^\pi (t),0\le t\le T)\), satisfies the SDE:

$$\begin{aligned} dX^\pi (t)= & {} \pi (t)\Big (\mu dt+\sigma dW(t)\Big )\nonumber \\&-J(t-)\alpha (P(t))dt+\beta (P(t))dJ(t),\quad 0\le t\le T,\nonumber \\ X^\pi (0)= & {} x, \end{aligned}$$
(3.1)

where \(x>0\) denotes the initial wealth. The survival benefits \(\eta \) are subtracted from \(X^\pi (T)\) at the terminal time T.
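One Euler step of the wealth equation (3.1) can be sketched as follows. Since \(J=n-N\), a death reduces J by one, so the term \(\beta dJ\) equals \(-\beta dN\) and represents an outflow. The function name `wealth_step` and the argument layout are hypothetical.

```python
from typing import Callable

Benefit = Callable[[float], float]


def wealth_step(x: float, pi: float, mu: float, sigma: float, dt: float, dW: float,
                j_prev: int, p: float, deaths: int,
                alpha: Benefit, beta: Benefit) -> float:
    """One Euler step of (3.1).

    x       wealth at the start of the step
    pi      amount invested in the index S during the step
    dW      Brownian increment of W over the step
    j_prev  policies in force at the start of the step, J(t-)
    deaths  increment of N over the step, so that dJ = -deaths
    """
    investment_gain = pi * (mu * dt + sigma * dW)
    annuities = j_prev * alpha(p) * dt
    death_benefits = beta(p) * deaths
    return x + investment_gain - annuities - death_benefits


if __name__ == "__main__":
    print(wealth_step(10.0, 2.0, 0.05, 0.2, 1.0, 0.5, 3, 1.0, 1,
                      lambda p: 1.0, lambda p: 4.0))
```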

We study the time-inconsistent optimization problem:

$$\begin{aligned} \sup _{\pi }{{\mathbb {E}}}\Big [-e^{-\varGamma \big (X^\pi (t)-J(t)F(t,P(t))\big )\cdot \big (X^\pi (T)-J(T)\eta (P(T))\big )}|{{\mathcal {F}}}_t\Big ],\quad 0\le t\le T,\quad \ \end{aligned}$$
(3.2)

where \(\varGamma \) denotes a time-varying risk aversion coefficient whose value at time t depends on the process

$$\begin{aligned} R(t)=X^\pi (t)-J(t)F(t,P(t)),\quad 0\le t\le T. \end{aligned}$$

The process R is interpreted as the insurer’s net asset value, i.e. the excess of the insurer’s assets over its liabilities. By the liability we mean the value of the reserve (2.5). The optimization problem (3.2) is called an exponential utility maximization problem with wealth-dependent risk aversion. We assume that the risk aversion coefficient in (3.2) satisfies the condition:

  1. (A5)

    \(\varGamma :{{\mathbb {R}}}\mapsto (0,\infty )\) is bounded, decreasing, Lipschitz continuous and \({\mathcal {C}}^2({{\mathbb {R}}})\).

Let us introduce the set of admissible investment strategies for our optimization problem (3.2).

Definition 3.1

A strategy \(\pi =(\pi (t), 0\le t\le T)\) is called admissible, \(\pi \in {\mathcal {A}}\), if it satisfies the following conditions:

  1. 1.

    \(\pi :[0,T]\times \varOmega \rightarrow {\mathbb {R}}\) is an \({{\mathbb {F}}}\)-predictable process determined with a measurable mapping \(\pi :[0,T]\times {{\mathbb {R}}}\times (0,\infty )\times \{0,\ldots ,n\}\mapsto {{\mathbb {R}}}\) such that \(\pi (t)=\pi ^{J(t-)}(t,X^\pi (t-),P(t))\),

  2. 2.

    The process \(\Big (\int _0^t\pi (s)dW(s), \ 0\le t\le T\Big )\) is a \(BMO({{\mathbb {F}}})\)-martingale,

  3. 3.

    The stochastic differential equation (3.1) has a unique solution \(X^{\pi }\) on [0, T],

  4. 4.

    \({{\mathbb {E}}}\Big [e^{- \varGamma (r) \big (X^\pi (T)-J(T)\eta (P(T))\big )}|{{\mathcal {F}}}_t\Big ]<\infty \), for all \(t\in [0,T]\) and all \(r\in {{\mathbb {R}}}\), including \(\varGamma (-\infty )=\sup _{r\in {{\mathbb {R}}}}\varGamma (r)\) and \( \varGamma (+\infty )=\inf _{r\in {{\mathbb {R}}}}\varGamma (r)\).

We can now define the objective function for (3.2):

$$\begin{aligned}&{v^{k,\pi }(t,x,p)}\nonumber \\&\quad ={{\mathbb {E}}}_{t,x,p,k}\Big [-e^{-\varGamma \big (x-kF(t,p)\big ) \big (X^\pi (T)-J(T)\eta (P(T))\big )}\Big ],\nonumber \\&\quad (t,x,p,k)\in [0,T]\times {{\mathbb {R}}}\times (0,\infty )\times \{0,1,\ldots ,n\},\ \pi \in {\mathcal {A}}.\quad \ \end{aligned}$$
(3.3)

We will also need the auxiliary objective function:

$$\begin{aligned}&{w^{k,\pi }(t,x,p,r)}\nonumber \\&\quad ={{\mathbb {E}}}_{t,x,p,k}\Big [-e^{-\varGamma (r)\big (X^\pi (T)-J(T)\eta (P(T))\big )}\Big ],\nonumber \\&\quad (t,x,p,r,k)\in [0,T]\times {{\mathbb {R}}}\times (0,\infty )\times {{\mathbb {R}}}\times \{0,1,\ldots ,n\},\ \pi \in {\mathcal {A}}.\quad \ \quad \ \end{aligned}$$
(3.4)

Due to the time-inconsistency caused by the wealth-dependent risk aversion, we cannot find a strategy \(\pi \) which maximizes the objective function (3.3) and is optimal in Bellman’s sense. We look for the sub-game perfect Nash equilibrium in the game with the reward given by (3.3), see e.g. Björk et al. [3].

Definition 3.2

Let us consider an admissible strategy \(\pi ^*\in {\mathcal {A}}\). Fix an arbitrary point \((t,x,p,k)\in [0,T)\times {{\mathbb {R}}}\times (0,\infty )\times \{0,1,\ldots ,n\}\) and choose an admissible strategy \(\pi \in {\mathcal {A}}\). For \(\delta >0\) we define a new admissible strategy

$$\begin{aligned} \pi ^\delta (s)=\left\{ \begin{array}{ll} \pi (s),\quad t\le s\le t+\delta ,\\ \pi ^*(s),\quad t+\delta < s\le T. \end{array}\right. \end{aligned}$$

If

$$\begin{aligned} \lim _{\delta \rightarrow 0} \frac{1}{\delta }\Big (v^{k,\pi ^*}(t,x,p)-v^{k,\pi ^\delta }(t,x,p)\Big )\ge 0, \end{aligned}$$
(3.5)

for all \((t,x,p,k)\in [0,T)\times {{\mathbb {R}}}\times (0,\infty )\times \{0,1,\ldots ,n\}\) and all \(\pi \in {\mathcal {A}}\), then \(\pi ^*\) is called the equilibrium strategy and \(v^{k,\pi ^*}\) is called the equilibrium value function corresponding to the equilibrium strategy \(\pi ^*\).
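The perturbed strategy \(\pi ^\delta \) used in Definition 3.2 can be written as a small helper. For simplicity, the sketch treats a strategy as a function of time only, as in the displayed definition of \(\pi ^\delta \); in the model the strategies also depend on the wealth, the index and the number of policies in force. The name `spike_variation` is hypothetical.

```python
from typing import Callable

Strategy = Callable[[float], float]


def spike_variation(pi: Strategy, pi_star: Strategy, t: float, delta: float) -> Strategy:
    """Return pi^delta: follow pi on [t, t + delta] and pi_star on (t + delta, T]."""
    def pi_delta(s: float) -> float:
        if s < t:
            raise ValueError("pi^delta is only defined for s >= t")
        # Deviation on the short interval, equilibrium candidate afterwards.
        return pi(s) if s <= t + delta else pi_star(s)
    return pi_delta


if __name__ == "__main__":
    pi_delta = spike_variation(lambda s: 1.0, lambda s: 2.0, t=0.5, delta=0.25)
    print(pi_delta(0.5), pi_delta(0.75), pi_delta(0.8))
```

The criterion (3.5) then compares the objective function under \(\pi ^*\) with the objective function under this deviation, scaled by the length \(\delta \) of the deviation interval.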

We consider a special structure of the wealth-dependent risk aversion coefficient \(\varGamma \). We choose

$$\begin{aligned} \varGamma (r)=\gamma _0+\gamma _1(r)\epsilon ,\quad r\in {{\mathbb {R}}}, \quad \epsilon >0. \end{aligned}$$
(3.6)

In this paper we assume that the insurer’s risk aversion coefficient \(\varGamma \) consists of a constant risk aversion \(\gamma _0>0\) and a small amount \(\epsilon >0\) of wealth-dependent risk aversion \(\gamma _1\). Similar to (A5), we impose the condition:

  1. (A6)

    The function \(\gamma _1:{{\mathbb {R}}}\mapsto {{\mathbb {R}}}\) is bounded, decreasing, Lipschitz continuous and \({\mathcal {C}}^2({{\mathbb {R}}})\). Moreover, \(\gamma _1(0)=0\).
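A concrete choice satisfying (A6) is \(\gamma _1(r)=-c\tanh (r/\ell )\) with \(c,\ell >0\). This example is ours and is not taken from Delong [8]: the function is bounded by c, decreasing, Lipschitz continuous with constant \(c/\ell \), smooth, and vanishes at zero. The resulting coefficient (3.6) is positive provided that \(\epsilon c<\gamma _0\). A minimal sketch, with hypothetical names and default parameters:

```python
import math


def gamma1(r: float, c: float = 1.0, scale: float = 1.0) -> float:
    """Illustrative wealth-dependent part: gamma_1(r) = -c * tanh(r / scale)."""
    return -c * math.tanh(r / scale)


def risk_aversion(r: float, gamma0: float, eps: float,
                  c: float = 1.0, scale: float = 1.0) -> float:
    """Gamma(r) = gamma0 + gamma_1(r) * eps as in (3.6)."""
    if eps * c >= gamma0:
        # Positivity of Gamma on the whole real line needs eps * c < gamma0.
        raise ValueError("eps * c must be smaller than gamma0")
    return gamma0 + gamma1(r, c, scale) * eps


if __name__ == "__main__":
    for r in (-5.0, -1.0, 0.0, 1.0, 5.0):
        # Risk aversion decreases as the net asset value r increases.
        print(r, risk_aversion(r, gamma0=2.0, eps=0.1))
```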

The assumption (3.6) allows us to apply perturbation theory and find the first-order approximation to the true solution to the optimization problem (3.2) for small \(\epsilon >0\).

We can apply perturbation theory since our problem (3.2) can be formulated by adding a small term to a parameter of a related and exactly solvable problem. In our case, the exactly solvable problem is (3.2) with \(\varGamma (r)=\gamma _0\). We expect that the solution to the time-inconsistent exponential utility maximization problem (3.2) with the wealth-dependent risk aversion coefficient \(\varGamma (r)=\gamma _0+\gamma _1(r)\epsilon \) should be expanded in powers of \(\epsilon \) around the solution to the time-consistent exponential utility maximization problem with the constant risk aversion \(\gamma _0\).

Let us describe the first-order approximate solution to (3.2). For details we refer to Delong [8]. We consider two systems of PDEs:

$$\begin{aligned}&h_t^k(t,p)+\Big (a-\frac{\mu b\rho }{\sigma }\Big )ph_p^k(t,p)+\frac{1}{2}b^2p^2h_{pp} ^k(t,p)+k\alpha (p)-\frac{\mu ^2}{2\sigma ^2\gamma }-\frac{k\lambda }{\gamma }\nonumber \\&\quad +\frac{e^{\gamma \beta (p)} e^{\gamma h^{k-1}(t,p)}}{\gamma }k\lambda e^{-\gamma h^k(t,p)}\nonumber \\&\quad +\frac{1}{2}\gamma (1-\rho ^2)b^2p^2(h^k_p(t,p))^2=0,\quad (t,p)\in [0,T)\times (0,\infty ),\nonumber \\&\quad h^k(T,p)=k\eta (p),\quad p\in (0,\infty ),\quad k\in \{0,\ldots ,n\}, \end{aligned}$$
(3.7)

and

$$\begin{aligned}&g_t^k(t,p)+\Big (a-\frac{\mu b\rho }{\sigma }+\gamma (1-\rho ^2)b^2ph^k_p(t,p)\Big )pg_p^k(t,p)+\frac{1}{2}b^2p^2g_{pp}^k(t,p)\nonumber \\&\quad -\,e^{\gamma \big (\beta (p)+h^{k-1}(t,p)-h^{k}(t,p)\big )}k\lambda g^k(t,p)\nonumber \\&\quad +\,\frac{\mu ^2}{2\sigma ^2\gamma ^2}+\frac{1}{2}(1-\rho ^2)b^2p^2 \big (h^k_p(t,p)\big )^2\nonumber \\&\quad +\frac{e^{\gamma \big (\beta (p)+h^{k-1}(t,p)-h^{k}(t,p)\big )} \Big (\gamma \Big (\beta (p)+h^{k-1}(t,p)-h^{k}(t,p)\Big )-1\Big )+1}{\gamma ^2}k\lambda \nonumber \\&\quad +e^{\gamma \big (\beta (p)+h^{k-1}(t,p)-h^{k}(t,p)\big )}k\lambda g^{k-1}(t,p) =0,\quad (t,p)\in [0,T)\times (0,\infty ),\nonumber \\&\quad g^k(T,p)=0,\quad p\in (0,\infty ),\quad k\in \{0,\ldots ,n\}. \end{aligned}$$
(3.8)

By Proposition 5.1 presented in Sect. 5, there exist unique solutions \((h^k,g^k)_{k=0}^n\in {\mathcal {C}}([0,T]\times (0,\infty ))\cap {\mathcal {C}}^{1,2}([0,T)\times (0,\infty ))\) to (3.7)–(3.8). We assume that

  1. (A7)

    There exist mixed derivatives \((h^k_{tp})_{k=0}^n\in {\mathcal {C}}([0,T)\times (0,\infty ))\).
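For \(k=0\) (no policies in force) every term in (3.7)–(3.8) which carries the factor k vanishes, where we assume the convention that the terms involving \(h^{k-1}, g^{k-1}\) are dropped since they are multiplied by \(k=0\). Direct substitution then suggests the closed forms \(h^0(t,p)=-\frac{\mu ^2(T-t)}{2\sigma ^2\gamma }\) and \(g^0(t,p)=\frac{\mu ^2(T-t)}{2\sigma ^2\gamma ^2}\). This special case is worked out here and is not quoted from Delong [8]. The sketch below checks it by evaluating the residuals of the two equations with central finite differences; all function names are hypothetical.

```python
from typing import Callable, Tuple

Field = Callable[[float, float], float]


def h0(t: float, p: float, T: float, mu: float, sigma: float, gamma: float) -> float:
    """Candidate solution of (3.7) for k = 0."""
    return -mu ** 2 * (T - t) / (2.0 * sigma ** 2 * gamma)


def g0(t: float, p: float, T: float, mu: float, sigma: float, gamma: float) -> float:
    """Candidate solution of (3.8) for k = 0."""
    return mu ** 2 * (T - t) / (2.0 * sigma ** 2 * gamma ** 2)


def derivatives(f: Field, t: float, p: float, dt: float, dp: float) -> Tuple[float, float, float]:
    """Central finite differences for f_t, f_p and f_pp."""
    f_t = (f(t + dt, p) - f(t - dt, p)) / (2.0 * dt)
    f_p = (f(t, p + dp) - f(t, p - dp)) / (2.0 * dp)
    f_pp = (f(t, p + dp) - 2.0 * f(t, p) + f(t, p - dp)) / dp ** 2
    return f_t, f_p, f_pp


def residuals_k0(h: Field, g: Field, t: float, p: float, mu: float, sigma: float,
                 a: float, b: float, rho: float, gamma: float,
                 dt: float = 1e-4, dp: float = 1e-4) -> Tuple[float, float]:
    """Left-hand sides of (3.7) and (3.8) with k = 0; both should be close to zero."""
    h_t, h_p, h_pp = derivatives(h, t, p, dt, dp)
    g_t, g_p, g_pp = derivatives(g, t, p, dt, dp)
    drift = a - mu * b * rho / sigma
    quad = (1.0 - rho ** 2) * b ** 2 * p ** 2 * h_p ** 2
    res_h = (h_t + drift * p * h_p + 0.5 * b ** 2 * p ** 2 * h_pp
             - mu ** 2 / (2.0 * sigma ** 2 * gamma) + 0.5 * gamma * quad)
    res_g = (g_t + (drift + gamma * (1.0 - rho ** 2) * b ** 2 * p * h_p) * p * g_p
             + 0.5 * b ** 2 * p ** 2 * g_pp
             + mu ** 2 / (2.0 * sigma ** 2 * gamma ** 2) + 0.5 * quad)
    return res_h, res_g


if __name__ == "__main__":
    T, mu, sigma, gamma = 1.0, 0.05, 0.2, 2.0
    h = lambda t, p: h0(t, p, T, mu, sigma, gamma)
    g = lambda t, p: g0(t, p, T, mu, sigma, gamma)
    print(residuals_k0(h, g, 0.3, 1.5, mu, sigma, a=0.03, b=0.15, rho=0.5, gamma=gamma))
```

For \(k\ge 1\) no such closed form is claimed here, and the systems would have to be solved numerically.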

We define the strategies:

$$\begin{aligned} \pi ^{k,*}_0(t,p)= & {} \frac{\mu }{\sigma ^2\gamma _0} +\frac{h^{k,\gamma _0}_p(t,p)bp\rho }{\sigma }, \end{aligned}$$
(3.9)
$$\begin{aligned} \pi ^{k,*}_1(t,x,p)= & {} -\frac{\mu \gamma _1(x-kF(t,p))}{\sigma ^2\gamma ^2_0} +\frac{g_p^{k,\gamma _0}(t,p)\gamma _1(x-kF(t,p))bp\rho }{\sigma },\quad \ \nonumber \\ \end{aligned}$$
(3.10)

and the functions:

$$\begin{aligned} v_0^k(t,x,p)= & {} -e^{-\gamma _0x}e^{\gamma _0 h^{k,\gamma _0}(t,p)}, \end{aligned}$$
(3.11)
$$\begin{aligned} v_1^k(t,x,p)= & {} \gamma _1(x-kF(t,p))\nonumber \\&\times \Big (x-h^{k,\gamma _0}(t,p)-\gamma _0g^{k,\gamma _0}(t,p) \Big )e^{-\gamma _0x}e^{\gamma _0 h^{k,\gamma _0}(t,p)}. \end{aligned}$$
(3.12)

We remark that \(h^{k,\gamma _0}, g^{k,\gamma _0}\) in (3.9)–(3.12) denote the solutions to the PDEs (3.7)–(3.8) with \(\gamma =\gamma _0\).
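The formulas (3.9)–(3.10) translate directly into code once the derivatives \(h_p^{k,\gamma _0}, g_p^{k,\gamma _0}\) and the value \(\gamma _1(x-kF(t,p))\) are available. The sketch below takes these quantities as inputs, since computing them requires solving (3.7)–(3.8); the function names are hypothetical.

```python
def pi0_star(mu: float, sigma: float, gamma0: float,
             h_p: float, b: float, p: float, rho: float) -> float:
    """Zeroth-order strategy (3.9); h_p is the p-derivative of h^{k,gamma0} at (t, p)."""
    return mu / (sigma ** 2 * gamma0) + h_p * b * p * rho / sigma


def pi1_star(mu: float, sigma: float, gamma0: float, g_p: float,
             b: float, p: float, rho: float, gamma1_value: float) -> float:
    """First-order correction (3.10); gamma1_value stands for gamma_1(x - k F(t, p))."""
    return (-mu * gamma1_value / (sigma ** 2 * gamma0 ** 2)
            + g_p * gamma1_value * b * p * rho / sigma)


def first_order_strategy(pi0: float, pi1: float, eps: float) -> float:
    """Candidate strategy pi0* + pi1* * eps."""
    return pi0 + pi1 * eps


if __name__ == "__main__":
    mu, sigma, gamma0, eps, g1 = 0.05, 0.2, 2.0, 0.01, -0.5
    # With rho = 0 the terms involving h_p and g_p drop out of (3.9)-(3.10).
    approx = first_order_strategy(
        pi0_star(mu, sigma, gamma0, h_p=0.3, b=0.15, p=1.0, rho=0.0),
        pi1_star(mu, sigma, gamma0, g_p=0.1, b=0.15, p=1.0, rho=0.0, gamma1_value=g1),
        eps)
    print(approx, mu / (sigma ** 2 * (gamma0 + g1 * eps)))
```

As a purely arithmetic observation, for \(\rho =0\) the two terms \(\frac{\mu }{\sigma ^2\gamma _0}\) and \(-\frac{\mu \gamma _1}{\sigma ^2\gamma _0^2}\) are the zeroth- and first-order Taylor coefficients of \(\frac{\mu }{\sigma ^2(\gamma _0+\gamma _1\epsilon )}\) in \(\epsilon \), so the two printed numbers differ by a term of order \({\mathcal {O}}(\epsilon ^2)\).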

Theorem 3.1

(Theorem 6.1 from Delong [8]) Let (A1)–(A7) hold. Let us introduce the investment strategy

$$\begin{aligned} {\hat{\pi }}^{k,*}(t,x,p)= & {} \pi _0^{k,*}(t,p)+\pi _1^{k,*}(t,x,p)\epsilon , \end{aligned}$$
(3.13)

and the function

$$\begin{aligned} {\hat{v}}^{k,*}(t,x,p)= & {} v_0^{k}(t,x,p)+v_1^{k}(t,x,p)\epsilon . \end{aligned}$$
(3.14)
  1. 1.

    For a sufficiently small \(\epsilon >0\), the investment strategy (3.13) is admissible, i.e. \({\hat{\pi }}^*=({\hat{\pi }}^{k,*})_{k=0}^n\in {\mathcal {A}}\).

  2. 2.

    The investment strategy (3.13) and the function (3.14) are candidate asymptotic first-order approximations, respectively, to the equilibrium investment strategy and the equilibrium value function for the optimization problem (3.2) with the wealth-dependent risk aversion coefficient \(\varGamma (r)=\gamma _0+\gamma _1(r)\epsilon \) as \(\epsilon \rightarrow 0\).

4 The Main Result: Asymptotic Optimality of the Candidate First-Order Approximate Strategy

First, we specify the class of investment strategies in which we show that the investment strategy (3.13) is asymptotically optimal for our optimization problem (3.2) with the wealth-dependent risk aversion coefficient \(\varGamma (r)=\gamma _0+\gamma _1(r)\epsilon \) as \(\epsilon \rightarrow 0\). Next, we formalize and explain what we mean by the asymptotic optimality of (3.13) in our optimization problem. Finally, we present the main result of this paper.

Definition 4.1

Let us consider the utility maximization problem (3.2) with the wealth-dependent risk aversion coefficient \(\varGamma (r)=\gamma _0+\gamma _1(r)\epsilon \) with \(\epsilon >0\). A strategy \(\pi :=(\pi (t),0\le t\le T)\) is in the class \({\mathcal {B}}\) if

  1. 1.

    \(\pi :[0,T]\times \varOmega \rightarrow {\mathbb {R}}\) is an \({{\mathbb {F}}}\)-predictable process determined with a measurable mapping \(\pi :[0,T]\times {{\mathbb {R}}}\times (0,\infty )\times \{0,\ldots ,n\}\mapsto {{\mathbb {R}}}\) such that \(\pi (t)=\pi ^{J(t-)}(t,X^\pi (t-),P(t))\) and \(\pi \) has the representation:

    $$\begin{aligned}&\pi ^k(t,x,p)=\pi ^k_0(t,x,p)+\pi _1^k(t,x,p)\epsilon ,\nonumber \\&\quad \ \quad (t,x,p,k)\in [0,T]\times {{\mathbb {R}}}\times (0,\infty )\times \{0,1,\ldots ,n\},\quad \end{aligned}$$
    (4.1)
  2. 2.

    The mappings \(x\mapsto \pi ^k_i\big (t,x,P(t,\omega )\big )\) satisfy the Lipschitz conditions:

    $$\begin{aligned}&{\big |\pi ^k_i\big (t,x,P(t,\omega )\big )-\pi ^k_i \big (t,x',P(t,\omega )\big )\big |\le H(t,\omega )\big |x-x'\big |,\quad i=0,1,}\\&\quad \ \quad (t,x,\omega ,k), (t,x',\omega ,k)\in [0,T]\times {{\mathbb {R}}}\times \varOmega \times \{0,1,\ldots ,n\}, \end{aligned}$$

    where \(H:[0,T]\times \varOmega \mapsto [0,\infty )\) is a continuous process, adapted to the filtration \(\sigma (P(u), u\in [0,T])\), such that \((\int _0^tH(s)dW(s),0\le t\le T)\) is a BMO-martingale and

    $$\begin{aligned} H(t,\omega )\le K\big (1+P(t,\omega )\big ),\quad (t,\omega )\in [0,T]\times \varOmega , \end{aligned}$$
  3. 3.

    The mappings \(x\mapsto \pi ^k_i\big (t,x,P(t,\omega )\big )\) satisfy the growth conditions:

    $$\begin{aligned}&{\big |\pi ^k_i\big (t,x,P(t,\omega )\big )\big |\le {\tilde{H}}(t,\omega ),\quad i=0,1,}\\&\quad \ \quad (t,x,\omega ,k)\in [0,T]\times {{\mathbb {R}}}\times \varOmega \times \{0,1,\ldots ,n\}, \end{aligned}$$

    where \({\tilde{H}}:[0,T]\times \varOmega \mapsto [0,\infty )\) is a continuous process, adapted to the filtration \(\sigma (P(u), u\in [0,T])\), such that \((\int _0^t{\tilde{H}}(s)dW(s),0\le t\le T)\) is a BMO-martingale and

    $$\begin{aligned} {\tilde{H}}(t,\omega )\le K\big (1+P(t,\omega )\big ),\quad (t,\omega )\in [0,T]\times \varOmega , \end{aligned}$$
  4. 4.

    \({{\mathbb {E}}}\Big [e^{- \varGamma (r) \big (X^{\pi _0}(T)-J(T)\eta (P(T))\big )}|{{\mathcal {F}}}_t\Big ]<\infty \), for all \(t\in [0,T]\) and all \(r\in {{\mathbb {R}}}\), including \(\varGamma (-\infty )=\sup _{r\in {{\mathbb {R}}}}\varGamma (r)\) and \( \varGamma (+\infty )=\inf _{r\in {{\mathbb {R}}}}\varGamma (r)\).

We remark that the amount of \(\pi _1\) added to \(\pi _0\), in order to define the admissible strategy (4.1), is controlled with the parameter \(\epsilon \) which represents the degree of the insurer’s risk aversion depending on wealth. If we choose \(\pi _1=0\), then we can consider strategies independent of the parameter \(\epsilon \) within the class \({\mathcal {B}}\). Finally, the processes H and \({\tilde{H}}\), which appear in the Lipschitz and growth conditions, may depend on the strategies \(\pi _0, \pi _1\).

Since we use perturbation techniques, the idea of which is to expand the true solution in powers of the small parameter \(\epsilon \), it is natural to consider the investment strategies of the form (4.1) in point 1 of Definition 4.1, see also Fouque and Hu [12]. Points 2–4 from Definition 4.1 are closely related to points 2–4 from Definition 3.1. Points 2–3 from Definition 4.1 describe in more detail the measurable mapping \((t,x,p,k)\mapsto \pi ^k(t,x,p)\) which characterizes the investment strategy. In particular, points 2–3 from Definition 4.1 imply that points 2–3 from Definition 3.1 are satisfied. They are rather standard in the theory of stochastic differential equations and backward stochastic differential equations with BMO-martingales, see Chapter V.3 in Protter [20] and Ankirchner et al. [2]. Finally, since we add a small amount \(\epsilon \) of \(\pi _1\) to \(\pi _0\) in order to define the strategy \(\pi \in {\mathcal {B}}\) in (4.1), we expect that point 4 from Definition 3.1 should only be needed for \(\pi _0\) (which is point 4 from Definition 4.1). In Proposition 5.2 below, we show that \({\mathcal {B}}\subset {\mathcal {A}}\) and the candidate first-order approximation to the equilibrium strategy \({\hat{\pi }}^*\in {\mathcal {B}}\) for a sufficiently small \(\epsilon >0\). Although Definition 4.1 may look technical, we believe that it describes a very reasonable class of investment strategies which are important for our exponential utility maximization problem (3.2) with a small amount \(\epsilon \) of wealth-dependent risk aversion and does not exclude any relevant strategies.

We now present the main theorem of this paper.

Theorem 4.1

We assume that (A1)–(A7) hold. The strategies \(\pi _0^*=(\pi _0^{k,*})_{k=0}^n, \pi _1^*=(\pi _1^{k,*})_{k=0}^n\) are given by (3.9)–(3.10), and the functions \((v^k_0)_{k=0}^n, (v^k_1)_{k=0}^n\) are given by (3.11), (3.12). Let \((v^{k,\pi }_\epsilon )_{k=0}^n\) and \((w_\epsilon ^{k,\pi })_{k=0}^n\) denote the objective functions (3.3)–(3.4) for the utility maximization problem (3.2) with the wealth-dependent risk aversion coefficient \(\varGamma (r)=\gamma _0+\gamma _1(r)\epsilon \) with \(\epsilon >0\) when the strategy \(\pi \) is applied. We allow for strategies \(\pi =(\pi ^k)_{k=0}^n\in {\mathcal {B}}\) such that \((v_\epsilon ^{k,\pi })_{k=0}^n\in {\mathcal {C}}([0,T]\times {{\mathbb {R}}}\times (0,\infty ))\cap {\mathcal {C}}^{1,2,2}([0,T)\times {{\mathbb {R}}}\times (0,\infty ))\) and \((w_\epsilon ^{k,\pi })_{k=0}^n\in {\mathcal {C}}([0,T]\times {{\mathbb {R}}}\times (0,\infty )\times {{\mathbb {R}}})\cap {\mathcal {C}}^{1,2,2,2}([0,T)\times {{\mathbb {R}}}\times (0,\infty )\times {{\mathbb {R}}})\). We fix \((t,x,p,k)\in [0,T)\times {{\mathbb {R}}}\times (0,\infty )\times \{0,1,\ldots ,n\}\).

  1. (i)

    For any strategy \(\pi _0\), we have the asymptotic zeroth-order approximation to the objective function:

    $$\begin{aligned} v_\epsilon ^{k,\pi _0}(t,x,p)=V^{k,\pi _0}(t,x,p)+{\mathcal {O}}(\epsilon ),\quad \epsilon \rightarrow 0, \end{aligned}$$
    (4.2)

    where \(V^{k,\pi _0}\) denotes the objective function for the time-consistent optimization problem (3.3) with \(\varGamma (r)=\gamma _0\) under the strategy \(\pi _0\).

  2. (ii)

    The strategy \(\pi ^*_0\) performs better than any strategy \(\pi _0\) when we compare the asymptotic approximations to the objective functions up to order \({\mathcal {O}}(1)\), i.e.

    $$\begin{aligned} \lim _{\epsilon \rightarrow 0}\big (v_\epsilon ^{k,\pi ^*_0}(t,x,p)-v_\epsilon ^{k,\pi _0}(t,x,p)\big )\ge 0. \end{aligned}$$
    (4.3)

    The equality in (4.3) holds only for \(\pi _0=\pi _0^*\).

  3. (iii)

    For any strategy \(\pi ^*_0+\pi _1\epsilon \), we have the asymptotic first-order approximation to the objective function:

    $$\begin{aligned} v_\epsilon ^{k,\pi ^*_0+\pi _1\epsilon }(t,x,p)=v^k_0(t,x,p)+v^k_1(t,x,p)\epsilon +{\mathcal {O}}(\epsilon ^2),\quad \epsilon \rightarrow 0. \end{aligned}$$
    (4.4)
  4. (iv)

    The strategy \(\pi ^*_0+\pi ^*_1\epsilon \) is the equilibrium strategy in the class of strategies \(\pi ^*_0+\pi _1\epsilon \) when we compare the asymptotic approximations to the objective functions up to order \({\mathcal {O}}(\epsilon ^2)\), i.e.

    $$\begin{aligned} \lim _{\delta \rightarrow 0} \frac{1}{\delta } \Bigg (\lim _{\epsilon \rightarrow 0} \frac{v_\epsilon ^{k,\pi ^*_0+\pi _1^*\epsilon }(t,x,p)-v_\epsilon ^{k,\pi ^*_0 +\pi ^\delta _1\epsilon }(t,x,p)}{\epsilon ^2}\Bigg )\ge 0, \end{aligned}$$
    (4.5)

    where, for \(\delta \in [0,T-t]\), we define

    $$\begin{aligned} \pi _1^\delta (s)=\left\{ \begin{array}{ll} \pi _1(s),\quad t\le s\le t+\delta ,\\ \pi _1^*(s),\quad t+\delta < s\le T. \end{array}\right. \end{aligned}$$
    (4.6)

    The equality in (4.5) holds only for \(\pi ^\delta _1=\pi _1^*\).
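The assertions (i)–(ii) can be illustrated in the simplest case \(k=0\) with a constant strategy \(\pi _0\). This special case is our own worked example: with no policies in force, the terminal wealth \(X^{\pi _0}(T)=x+\pi _0\big (\mu (T-t)+\sigma (W(T)-W(t))\big )\) is Gaussian, so the objective function (3.3) has a closed form via the moment generating function of the normal distribution. The function names and the numerical parameters below are assumptions.

```python
import math


def objective_k0(x: float, tau: float, pi0: float, mu: float, sigma: float,
                 risk_aversion: float) -> float:
    """E[-exp(-G * X(T))] for k = 0 and a constant strategy pi0.

    X(T) is normal with mean x + pi0 * mu * tau and variance pi0^2 * sigma^2 * tau,
    where tau = T - t and G = risk_aversion is the frozen value Gamma(x).
    """
    mean = x + pi0 * mu * tau
    var = pi0 ** 2 * sigma ** 2 * tau
    return -math.exp(-risk_aversion * mean + 0.5 * risk_aversion ** 2 * var)


def pi0_star_k0(mu: float, sigma: float, gamma0: float) -> float:
    """Maximiser of the constant-risk-aversion objective over constant strategies."""
    return mu / (sigma ** 2 * gamma0)


if __name__ == "__main__":
    mu, sigma, gamma0, x, tau, g1 = 0.05, 0.2, 2.0, 1.0, 1.0, -0.5
    best = objective_k0(x, tau, pi0_star_k0(mu, sigma, gamma0), mu, sigma, gamma0)
    for pi0 in (0.0, 0.3, 1.0):
        limit = objective_k0(x, tau, pi0, mu, sigma, gamma0)      # V^{0, pi0}
        for eps in (0.1, 0.01, 0.001):
            v_eps = objective_k0(x, tau, pi0, mu, sigma, gamma0 + g1 * eps)
            # (4.2): the gap divided by eps stays bounded; (4.3): best - limit >= 0.
            print(pi0, eps, abs(v_eps - limit) / eps, best - limit)
```

In this example the gap \(|v_\epsilon ^{0,\pi _0}-V^{0,\pi _0}|/\epsilon \) remains bounded as \(\epsilon \rightarrow 0\), in line with (4.2), and the limit in (4.3) is strictly positive for every constant \(\pi _0\ne \frac{\mu }{\sigma ^2\gamma _0}\).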

Remark 4.1

(a):

The function \(v_\epsilon ^{k,\pi _0}\) depends on \(\epsilon \) since we use \(\varGamma (r)=\gamma _0+\gamma _1(r)\epsilon \). The function \(v_\epsilon ^{k,\pi _0+\pi _1\epsilon }\) depends on \(\epsilon \) since we use \(\pi =\pi _0+\pi _1\epsilon \) and \(\varGamma (r)=\gamma _0+\gamma _1(r)\epsilon \). The subscript \(\epsilon \) in \((v^{k}_\epsilon )_{k=0}^n\) will be omitted in the sequel.

(b):

If we use \(\varGamma (r)=\gamma _0\), then \(\pi _0^*\) is the optimal investment strategy for the time-consistent exponential utility maximization problem (3.2) with the constant risk aversion coefficient \(\gamma _0\), and the functions \((v_0^k)_{k=0}^n\) define the corresponding optimal value function, see Proposition 5.1 in Delong [8]. We note that \((v_0^k)_{k=0}^n\in {\mathcal {C}}([0,T]\times {{\mathbb {R}}}\times (0,\infty ))\cap {\mathcal {C}}^{1,2,2}([0,T)\times {{\mathbb {R}}}\times (0,\infty ))\) by Proposition 5.1 below.

(c):

The objective function \(V^{k,\pi }\) in (4.2) is given by

$$\begin{aligned} V^{k,\pi }(t,x,p)={{\mathbb {E}}}_{t,x,p,k}\Big [-e^{-\gamma _0\big (X^\pi (T)-J(T)\eta (P(T))\big )}\Big ]. \end{aligned}$$

By Remark 4.1.b, \(\sup _{\pi \in {\mathcal {A}}}V^{k,\pi }(t,x,p)=V^{k,\pi _0^*}(t,x,p)=v_0^k(t,x,p)\).

(d):

We consider a class of strategies which is potentially smaller than the class \({\mathcal {B}}\) since we require that the objective functions (3.3)–(3.4) are smooth for the strategies considered in Theorem 4.1. This assumption is reasonable since in this paper we work with smooth (classical) solutions to HJB equations and PDEs. In Theorem 3.1 we assume that the equilibrium value function (i.e. the objective function for our optimization problem for the equilibrium strategy) is a smooth solution to HJB equations. In Proposition 5.1 below we prove that the candidate first-order approximation to the equilibrium value function is a smooth solution to PDEs. Finally, Remark 4.1.b shows that the optimal value function for the time-consistent optimization problem with constant risk aversion (i.e. the objective function for our optimization problem with \(\epsilon =0\)) is also smooth.

Theorem 4.1 gives a more rigorous justification for the investment strategy derived in Delong [8]. The assertions (i)–(ii) from Theorem 4.1 are intuitively clear in view of Remark 4.1.b. The zeroth-order investment strategy \(\pi _0^*\) postulated in Theorem 3.1 and by Delong [8] performs better than any strategy \(\pi _0\) when we compare the asymptotic expansions of the objective functions up to order \({\mathcal {O}}(1)\) as \(\epsilon \rightarrow 0\). If we want to study investment strategies which are series expansions in powers of \(\epsilon \), then, by perturbation theory and Remark 4.1.b, it is natural to consider expansions around the strategy \(\pi _0^*\). The most interesting are the assertions (iii)–(iv) from Theorem 4.1 where we show that the first-order investment strategy \(\pi _0^*+\pi _1^*\epsilon \) postulated in Theorem 3.1 and by Delong [8] is the equilibrium strategy in a reasonable class of strategies \(\pi ^*_0+\pi _1\epsilon \) when we compare the asymptotic approximations to the objective functions up to order \({\mathcal {O}}(\epsilon ^2)\) as \(\epsilon \rightarrow 0\). The criterion (4.5) is a modification of the well-established criterion (3.5) for the equilibrium in continuous-time models. In (3.5) we compare the objective functions for the exponential utility maximization problem with the risk aversion coefficient \(\varGamma (r)=\gamma _0+\gamma _1(r)\epsilon \) for the strategies \(\pi ^*\) and \(\pi ^\delta \). In (4.5) we use the asymptotic expansions (4.4) of the objective functions for the exponential utility maximization problem with the risk aversion coefficient \(\varGamma (r)=\gamma _0+\gamma _1(r)\epsilon \) for the strategies \(\pi _0^*+\pi _1^*\epsilon \) and \(\pi _0^*+\pi _1^\delta \epsilon \) and compare the terms in these expansions up to order \({\mathcal {O}}(\epsilon ^2)\). To the best of our knowledge the criterion (4.5) is new and has not been investigated in the literature. We point out that (4.5) is not related to \(\epsilon \)-equilibrium.

5 The Proof of the Main Result

First, we introduce operators associated with the continuous parts of the processes \((X^\pi ,P,R)\).

Definition 5.1

Let \({\mathcal {L}}^\pi _k\) and \({\mathcal {M}}_k^\pi \) denote second-order differential operators given by

$$\begin{aligned}&{{\mathcal {L}}_{k}^{\pi } \phi (t,x,p)=\phi _x(t,x,p)\big (\pi \mu -k\alpha (p)\big ) +\frac{1}{2}\phi _{xx}(t,x,p)\pi ^2\sigma ^2}\\&\quad +\,\phi _{px}(t,x,p)\pi bp\sigma \rho +\phi _p(t,x,p) ap+\frac{1}{2} \phi _{pp}(t,x,p) b^2p^2, \\&\quad {{\mathcal {M}}_{k}^{\pi } \phi (t,x,p,r)={\mathcal {L}}^\pi _k\phi (t,x,p,r)}\\&\quad +\,\phi _r(t,x,p,r)\Big (\pi \mu -k\alpha (p)-kF_t(t,p)-kF_p(t,p)ap-\frac{1}{2}kF_{pp}(t,p)b^2p^2\Big )\\&\quad +\,\frac{1}{2}\phi _{rr}(t,x,p,r)\Big (\pi ^2\sigma ^2+(kF_p(t,p))^2b^2p^2-2\pi kF_p(t,p)bp\sigma \rho \Big )\\&\quad +\,\phi _{rp}(t,x,p,r)\Big (\pi bp\sigma \rho -kF_p(t,p) b^2p^2\Big )\\&\quad +\,\phi _{rx}(t,x,p,r)\Big (\pi ^2\sigma ^2-\pi kF_p(t,p)bp\sigma \rho \Big ). \end{aligned}$$

The operators \({\mathcal {L}}^\pi _k\) and \({\mathcal {M}}_k^\pi \) are defined, respectively, for \(\phi \in {\mathcal {C}}^{1,2,2}([0,T]\times {{\mathbb {R}}}\times (0,\infty ))\) and \(\phi \in {\mathcal {C}}^{1,2,2,2}([0,T]\times {{\mathbb {R}}}\times (0,\infty )\times {{\mathbb {R}}})\). When applied to \(\phi (t,x,p,r)\), the operator \({\mathcal {L}}_k^\pi \) acts only on \((t,x,p)\), and \(r\) is kept constant.
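The additional terms in \({\mathcal {M}}_k^\pi \) can be read as the generator terms contributed by the fourth state variable \(r\). The following is only a sketch under assumptions that are not restated in this section: we assume that, between claims and for \(J(t)=k\), the processes follow the dynamics implied by the coefficients of \({\mathcal {L}}_k^\pi \), i.e. \(dX^\pi (t)=(\pi \mu -k\alpha (P(t)))dt+\pi \sigma dW(t)\) and \(dP(t)=aP(t)dt+bP(t)dB(t)\) with \(d\langle W,B\rangle (t)=\rho dt\), and that the variable \(r\) tracks \(R(t)=X^\pi (t)-kF(t,P(t))\). Itô's formula then gives, with \(F\) and its derivatives evaluated at \((t,P(t))\),

$$\begin{aligned} dR(t)= & {} \Big (\pi \mu -k\alpha (P(t))-kF_t-kF_paP(t)-\frac{1}{2}kF_{pp}b^2P(t)^2\Big )dt+\pi \sigma dW(t)-kF_pbP(t)dB(t),\\ d\langle R\rangle (t)= & {} \Big (\pi ^2\sigma ^2+(kF_p)^2b^2P(t)^2-2\pi kF_pbP(t)\sigma \rho \Big )dt,\\ d\langle R,P\rangle (t)= & {} \Big (\pi bP(t)\sigma \rho -kF_pb^2P(t)^2\Big )dt,\\ d\langle R,X^\pi \rangle (t)= & {} \Big (\pi ^2\sigma ^2-\pi kF_pbP(t)\sigma \rho \Big )dt. \end{aligned}$$

Under these assumptions, the four brackets coincide with the factors multiplying \(\phi _r\), \(\frac{1}{2}\phi _{rr}\), \(\phi _{rp}\) and \(\phi _{rx}\) in the definition of \({\mathcal {M}}_k^\pi \).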

Next, we briefly recall some results from Delong [8] which we will use in the sequel.

Proof of Theorem 3.1

By Theorem 3.1 from Delong [8], the equilibrium strategy and the equilibrium value function for (3.2) are characterized by the system of HJB equations:

$$\begin{aligned}&{v_t^k(t,x,p)+\sup _\pi \Big \{{\mathcal {L}}_k^\pi v^k(t,x,p)-{\mathcal {M}}^\pi _kw^k(t,x,p,x-kF(t,p))}\nonumber \\&\quad +\,{\mathcal {L}}^\pi _kw^k(t,x,p,x-kF(t,p))\Big \}+\Big (v^{k-1}(t,x-\beta (p),p) -v^k(t,x,p)\Big )k\lambda \nonumber \\&\quad +\,\Big (w^{k-1}(t,x-\beta (p),p,x-kF(t,p))\nonumber \\&\quad \ -\,w^{k-1}(t,x-\beta (p),p,x-\beta (p)-(k-1)F(t,p))\Big )k\lambda =0,\nonumber \\&\quad \ \quad \ \quad \ (t,x,p)\in [0,T)\times {{\mathbb {R}}}\times (0,\infty ),\nonumber \\&\quad v^k(T,x,p)=-e^{-\varGamma (x-k\eta (p))(x-k\eta (p))},\quad (x,p)\in {{\mathbb {R}}}\times (0,\infty ),\nonumber \\ \quad \pi ^{k,*}&=arg \ sup_\pi \ \Big \{{\mathcal {L}}_k^\pi v^k(t,x,p)-{\mathcal {M}}^\pi _kw^k(t,x,p,x-kF(t,p))\nonumber \\&\quad +\,{\mathcal {L}}^\pi _kw^k(t,x,p,x-kF(t,p))\Big \}, \nonumber \\&\quad \ (t,x,p)\in [0,T]\times {{\mathbb {R}}}\times (0,\infty ), \end{aligned}$$
(5.1)

and

$$\begin{aligned}&{w_t^k(t,x,p,r)+{\mathcal {L}}_k^{\pi ^{k,*}}w^k(t,x,p,r)}\nonumber \\&\quad +\Big (w^{k-1}(t,x-\beta (p),p,r)-w^k(t,x,p,r)\Big )k\lambda =0,\nonumber \\&\quad \ \quad \ \quad \ (t,x,p)\in [0,T)\times {{\mathbb {R}}}\times (0,\infty ), \ r\in {{\mathbb {R}}},\nonumber \\&\quad w^k(T,x,p,r)=-e^{-\varGamma (r)(x-k\eta (p))},\quad (x,p)\in {{\mathbb {R}}}\times (0,\infty ), \ r\in {{\mathbb {R}}}, \end{aligned}$$
(5.2)

for \(k\in \{0,1,\ldots ,n\}\). If we assume the risk aversion coefficient \(\varGamma (r)=\gamma _0+\gamma _1(r)\epsilon \) with small \(\epsilon >0\), then we can postulate the following first-order expansions for the solutions to the HJB equations (5.1)–(5.2):

$$\begin{aligned} v^k(t,x,p)= & {} v_0^k(t,x,p)+v_1^k(t,x,p)\epsilon +{\mathcal {O}}(\epsilon ^2), \quad \epsilon \rightarrow 0,\nonumber \\&\quad (t,x,p,k)\in [0,T]\times {{\mathbb {R}}}\times (0,\infty )\times \{0,1,\ldots ,n\}, \end{aligned}$$
(5.3)
$$\begin{aligned} w^k(t,x,p,r)= & {} w_0^k(t,x,p)+w_1^k(t,x,p,r)\epsilon +{\mathcal {O}}(\epsilon ^2),\quad \epsilon \rightarrow 0,\nonumber \\&\quad (t,x,p,k)\in [0,T]\times {{\mathbb {R}}}\times (0,\infty )\times \{0,1,\ldots ,n\}. \end{aligned}$$
(5.4)

We also assume that the derivatives of \((v^k)_{k=0}^n, (w^k)_{k=0}^n\) satisfy first-order expansions of the same form as (5.3)–(5.4). From equation (5.1), we can now deduce the first-order expansion for the equilibrium strategy:

$$\begin{aligned} \pi ^{k,*}(t,x,p)= & {} \pi ^{k,*}_0(t,x,p)+\pi ^{k,*}_1(t,x,p)\epsilon +{\mathcal {O}}(\epsilon ^2),\quad \epsilon \rightarrow 0,\nonumber \\&\quad (t,x,p,k)\in [0,T]\times {{\mathbb {R}}}\times (0,\infty )\times \{0,1,\ldots ,n\}, \end{aligned}$$
(5.5)

where

$$\begin{aligned}&{\pi ^{k,*}_0(t,x,p)=-\frac{v_{0,x}^k(t,x,p)\mu +v_{0,px}^k(t,x,p)bp\sigma \rho }{v_{0,xx}^k(t,x,p)\sigma ^2},} \end{aligned}$$
(5.6)
$$\begin{aligned}&{\pi ^{k,*}_1(t,x,p)=\frac{v_{0,x}^k(t,x,p)\mu +v_{0,px}^k(t,x,p)bp\sigma \rho }{(v_{0,xx}^k(t,x,p))^2\sigma ^2}}\nonumber \\&\quad \times \Big (v_{1,xx}^k(t,x,p)-w_{1,rr}^k(t,x,p,x-kF(t,p))-2w_{1,xr}^k(t,x,p,x-kF(t,p))\Big )\nonumber \\&\quad -\frac{\Big (v_{1,x}^k(t,x,p)-w_{1,r}^k(t,x,p,x-kF(t,p))\Big )\mu }{v_{0,xx} ^k(t,x,p)\sigma ^2}\nonumber \\&\quad -\frac{\Big (v^k_{1,px}(t,x,p)-w^k_{1,pr}(t,x,p,x-kF(t,p)) \Big ) bp\sigma \rho }{v_{0,xx}^k(t,x,p)\sigma ^2}\nonumber \\&\quad -\frac{\Big (w_{1,rr}^k(t,x,p,x-kF(t,p))+w_{1,xr}^k(t,x,p,x-kF(t,p))\Big )kF_p(t,p) bp \sigma \rho }{v_{0,xx}^k(t,x,p)\sigma ^2}.\quad \nonumber \\ \end{aligned}$$
(5.7)
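The formulas (5.6)–(5.7) can be traced back to the first-order condition for the supremum in (5.1). The following is a sketch, assuming that the expression in the supremum is strictly concave in \(\pi \) and that mixed derivatives commute; the symbols \(N, D, N_i, D_i\) are shorthand introduced only for this check. Collecting the terms with \(\pi \) in \({\mathcal {L}}_k^\pi v^k-{\mathcal {M}}^\pi _kw^k+{\mathcal {L}}^\pi _kw^k\), the expression to be maximized has the form \(\pi N+\frac{1}{2}\pi ^2D\) plus terms free of \(\pi \), where

$$\begin{aligned} N= & {} (v^k_x-w^k_r)\mu +(v^k_{px}-w^k_{pr})bp\sigma \rho +(w^k_{rr}+w^k_{xr})kF_p(t,p)bp\sigma \rho ,\\ D= & {} (v^k_{xx}-w^k_{rr}-2w^k_{xr})\sigma ^2, \end{aligned}$$

and the derivatives of \(w^k\) are evaluated at \(r=x-kF(t,p)\). Hence \(\pi ^{k,*}=-N/D\). Writing \(N=N_0+N_1\epsilon +{\mathcal {O}}(\epsilon ^2)\) and \(D=D_0+D_1\epsilon +{\mathcal {O}}(\epsilon ^2)\) with \(D_0\ne 0\), we get

$$\begin{aligned} \pi ^{k,*}=-\frac{N_0}{D_0}+\Big (\frac{N_0D_1}{D_0^2}-\frac{N_1}{D_0}\Big )\epsilon +{\mathcal {O}}(\epsilon ^2),\quad \epsilon \rightarrow 0. \end{aligned}$$

Since \(w_0^k\) in (5.4) does not depend on \(r\), all \(r\)-derivatives vanish at order \({\mathcal {O}}(1)\), so that \(N_0=v^k_{0,x}\mu +v^k_{0,px}bp\sigma \rho \) and \(D_0=v^k_{0,xx}\sigma ^2\), while \(N_1\) and \(D_1\) are obtained from \(N\) and \(D\) by replacing \(v^k, w^k\) with \(v_1^k, w_1^k\). Substituting these expressions reproduces (5.6) and (5.7) term by term.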

We substitute the expansions for \((v^k)_{k=0}^n, \ (w^k)_{k=0}^n\) and \((\pi ^{k,*})_{k=0}^n\) into the system of HJB equations (5.1)–(5.2). We collect the terms of order \({\mathcal {O}}(1), {\mathcal {O}}(\epsilon ), {\mathcal {O}}(\epsilon ^2)\) and set them to zero. We can derive the system of PDEs:

$$\begin{aligned}&v_{0,t}^k(t,x,p)+{\mathcal {L}}_{k}^{\pi _0^{k,*}} v_0^k(t,x,p)+\Big (v_0^{k-1}(t,x-\beta (p),p)-v_0^k(t,x,p)\Big )k\lambda =0,\nonumber \\&\quad \ \quad \ (t,x,p)\in [0,T)\times {{\mathbb {R}}}\times (0,\infty ),\nonumber \\&\quad v_0^k(T,x,p)=-e^{-\gamma _0(x-k\eta (p))},\quad (x,p)\in {{\mathbb {R}}}\times (0,\infty ), \end{aligned}$$
(5.8)
$$\begin{aligned}&v_{1,t}^k(t,x,p)+{\mathcal {L}}_k^{\pi ^{k,*}_0} v_1^k(t,x,p)-{\mathcal {M}}^{\pi ^{k,*}_0}_kw_1^k(t,x,p,x-kF(t,p))\nonumber \\&\quad +{\mathcal {L}}^{\pi ^{k,*}_0}_kw_1^k(t,x,p,x-kF(t,p))+\Big (v_1^{k-1}(t,x -\beta (p),p)-v_1^k(t,x,p)\Big )k\lambda \nonumber \\&\quad + \Big (w_1^{k-1}(t,x-\beta (p),p,x-kF(t,p))\nonumber \\&\quad - w_1^{k-1}(t,x-\beta (p),p,x-\beta (p) -(k-1)F(t,p))\Big )k\lambda =0,\nonumber \\&\quad \quad (t,x,p)\in [0,T)\times {{\mathbb {R}}}\times (0,\infty ),\nonumber \\&v_1^k(T,x,p)=\gamma _1(x-k\eta (p))(x-k\eta (p))e^{-\gamma _0(x-k\eta (p))}, \quad (x,p)\in {{\mathbb {R}}}\times (0,\infty ),\quad \ \quad \ \end{aligned}$$
(5.9)
$$\begin{aligned}&w^k_{0,t}(t,x,p)+{\mathcal {L}}_k^{\pi ^{k,*}_0}w^k_{0}(t,x,p)+\Big (w_0^{k-1}(t,x -\beta (p),p)-w_0^k(t,x,p)\Big )k\lambda =0,\nonumber \\&\quad \ \quad \ (t,x,p)\in [0,T)\times {{\mathbb {R}}}\times (0,\infty ),\nonumber \\&\quad w_0^k(T,x,p)=-e^{-\gamma _0(x-k\eta (p))},\quad (x,p)\in {{\mathbb {R}}}\times (0,\infty ), \end{aligned}$$
(5.10)
$$\begin{aligned}&\quad w^k_{1,t}(t,x,p,r)+{\mathcal {L}}_k^{\pi ^{k,*}_0}w^k_{1}(t,x,p,r)\nonumber \\&\quad \ \quad \ +\Big (w_1^{k-1} (t,x-\beta (p),p,r)-w_1^k(t,x,p,r)\Big )k\lambda =0\nonumber \\&\quad \ \quad \ (t,x,p)\in [0,T)\times {{\mathbb {R}}}\times (0,\infty ), \ r\in {{\mathbb {R}}},\nonumber \\&\quad w_1^k(T,x,p,r)=\gamma _1(r)(x-k\eta (p))e^{-\gamma _0(x-k\eta (p))},\quad (x,p) \in {{\mathbb {R}}}\times (0,\infty ), \ r\in {{\mathbb {R}}},\nonumber \\ \end{aligned}$$
(5.11)

for \(k=0,1,\ldots ,n\). We can find the solutions to the PDEs (5.8)–(5.11). These solutions are given by

$$\begin{aligned} v_0^k(t,x,p)= & {} -e^{-\gamma _0x}e^{\gamma _0 h^{k,\gamma _0}(t,p)}, \end{aligned}$$
(5.12)
$$\begin{aligned} w_0^k(t,x,p)= & {} -e^{-\gamma _0x}e^{\gamma _0 h^{k,\gamma _0}(t,p)}, \end{aligned}$$
(5.13)
$$\begin{aligned} v_1^k(t,x,p)= & {} \gamma _1(x-kF(t,p))\nonumber \\&\times \Big (x-h^{k,\gamma _0}(t,p)-\gamma _0g^{k,\gamma _0}(t,p)\Big ) e^{-\gamma _0x}e^{\gamma _0 h^{k,\gamma _0}(t,p)},\quad \ \quad \ \end{aligned}$$
(5.14)
$$\begin{aligned} w_1^k(t,x,p,r)= & {} \gamma _1(r)\Big (x-h^{k,\gamma _0}(t,p) -\gamma _0g^{k,\gamma _0}(t,p)\Big )e^{-\gamma _0x}e^{\gamma _0 h^{k,\gamma _0}(t,p)},\quad \ \end{aligned}$$
(5.15)

where the functions \((h^k)_{k=0}^n\) and \((g^k)_{k=0}^n\) solve the PDEs (3.7) and (3.8). The first-order approximation to the equilibrium strategy (5.5) is determined by (3.9)–(3.10). \(\square \)
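The terminal conditions in (5.8)–(5.11) come from a first-order Taylor expansion in \(\epsilon \) of the terminal values in (5.1)–(5.2). As a short check, with the shorthand \(y=x-k\eta (p)\) (introduced only here) and \(\varGamma (r)=\gamma _0+\gamma _1(r)\epsilon \), for fixed \((r,y)\) we have

$$\begin{aligned} -e^{-(\gamma _0+\gamma _1(r)\epsilon )y}=-e^{-\gamma _0y}+\gamma _1(r)ye^{-\gamma _0y}\epsilon +{\mathcal {O}}(\epsilon ^2),\quad \epsilon \rightarrow 0. \end{aligned}$$

The term of order \({\mathcal {O}}(1)\) gives the terminal values in (5.8) and (5.10). The coefficient of \(\epsilon \) gives the terminal value in (5.11) and, after setting \(r=y\), the terminal value in (5.9).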

Proposition 5.1

(Propositions 5.1–5.4 from Delong [8]) Let (A1)–(A3) hold.

  1. 1.

    There exist unique solutions \((h^k)_{k=0}^n\in {\mathcal {C}}([0,T]\times (0,\infty ))\cap {\mathcal {C}}^{1,2}([0,T)\times (0,\infty ))\) to the system of PDEs (3.7). Moreover, the functions \((h^k)_{k=0}^n:[0,T]\times (0,\infty )\mapsto {{\mathbb {R}}}\) are uniformly bounded in \((t,p)\), and Lipschitz continuous in \(p\) uniformly in \(t\).

  2. 2.

    In addition, assume that

    1. (A7)

      There exist mixed derivatives \((h^k_{tp})_{k=0}^n\in {\mathcal {C}}([0,T)\times (0,\infty ))\).

    Then there exist unique solutions \((g^k)_{k=0}^n\in {\mathcal {C}}([0,T]\times (0,\infty ))\cap {\mathcal {C}}^{1,2}([0,T)\times (0,\infty ))\) to the system of PDEs (3.8). Moreover, the functions \((g^k)_{k=0}^n:[0,T]\times (0,\infty )\mapsto {{\mathbb {R}}}\) are uniformly bounded in \((t,p)\), and Lipschitz continuous in \(p\) uniformly in \(t\).

  3. 3.

    Let us define

    $$\begin{aligned} Z(t)= & {} \sum _{k=0}^nh_p^k(t,P(t))P(t){\mathbf {1}}\{J(t-)=k\},\quad 0\le t\le T, \\ {\mathcal {Z}}(t)= & {} \sum _{k=0}^ng_p^k(t,P(t))P(t){\mathbf {1}}\{J(t-)=k\},\quad 0\le t\le T. \end{aligned}$$

    The processes \((\int _0^tZ(s)dW(s), 0\le t\le T), (\int _0^t{\mathcal {Z}}(s)dW(s),0\le t\le T)\) are BMO-martingales.

  4. 4.

    There exist solutions \((v^k_0, v^k_1, w_0^k)_{k=0}^n\in {\mathcal {C}}([0,T]\times {{\mathbb {R}}}\times (0,\infty ))\cap {\mathcal {C}}^{1,2,2}([0,T)\times {{\mathbb {R}}}\times (0,\infty ))\) and \((w^k_1)_{k=0}^n\in {\mathcal {C}}([0,T]\times {{\mathbb {R}}}\times (0,\infty )\times {{\mathbb {R}}})\cap {\mathcal {C}}^{1,2,2}([0,T)\times {{\mathbb {R}}}\times (0,\infty )\times {{\mathbb {R}}})\) to the PDEs (5.8)–(5.11) given by (5.12)–(5.15).

We are now heading towards the proof of our main result. We prove Theorem 4.1 by using a series of lemmas and propositions.

Proposition 5.2

Let us consider the utility maximization problem (3.2) with the wealth-dependent risk aversion coefficient \(\varGamma (r)=\gamma _0+\gamma _1(r)\epsilon \) with a sufficiently small \(\epsilon >0\).

  1. (i)

    Any strategy \(\pi =\pi _0+\pi _1\epsilon \in {\mathcal {B}}\) is in the class \({\mathcal {A}}\).

  2. (ii)

    The strategies \(\pi _0^*\) and \({\hat{\pi }}^*=\pi ^*_0+\pi ^*_1\epsilon \) are in the class \({\mathcal {B}}\).

Proof

Assertion (i) We choose \(\pi =\pi _0+\pi _1\epsilon \in {\mathcal {B}}\) from Definition 4.1. We will show that all points from Definition 3.1 are satisfied. Point 1 is obvious. Point 2 follows from the growth conditions for \(\pi _0\) and \(\pi _1\). Point 3 can be deduced from Theorem V.7 in Protter [20] since \(\pi _0\) and \(\pi _1\) are process Lipschitz. We are left with point 4. Let us introduce the process

$$\begin{aligned}&{Y(t)=J(T)\eta (P(T))-\int _t^T\Big (\frac{\mu ^2}{2\sigma ^2 \gamma }-J(s-)\alpha (P(s))+\frac{\mu }{\sigma }Z_1(s)-\frac{1}{2}\gamma (Z_2(s))^2}\nonumber \\&\quad -\frac{e^{\gamma (\beta (P(s))+Q(s))}-1}{\gamma }J(s-)\lambda \Big )ds\nonumber \\&\quad -\int _t^TZ_1(s)dW(s)-\int _t^TZ_2(s)dB(s)-\int _t^TQ(s)dN(s),\quad 0\le t\le T.\quad \ \end{aligned}$$
(5.16)

The process Y is used to define the solution to the exponential utility maximization problem (3.2) with the constant risk aversion coefficient \(\varGamma (r)=\gamma _0=\gamma \), see Theorem 5.1 in Delong [8]. We can show that

$$\begin{aligned} v^k_0(t,x,p)=-e^{-\gamma x}e^{\gamma h^k(t,p)}=-e^{-\gamma x}e^{\gamma Y(t)}|_{P(t)=p,J(t)=k}, \end{aligned}$$
(5.17)

where \(v_0^k(t,x,p)\) is the optimal value function for the time-consistent exponential utility maximization problem for the initial point \((t,x,p,k)\).

We choose \(r\in {{\mathbb {R}}}\) and set \(\gamma _1:=\gamma _1(r)\). We choose \(t\in [0,T]\). We have the following decomposition:

$$\begin{aligned}&{(\gamma _0+\gamma _1\epsilon )\big (X^{\pi }(T)-J(T)\eta (P(T))\big )}\nonumber \\&\quad =\gamma _0\Big (X^{\pi }(t)+\int _t^T\pi _0(s)\mu ds+\int _t^T\pi _0(s)\sigma dW(s)\nonumber \\&\qquad -\,\int _t^TJ(s-)\alpha (P(s))ds+\int _t^T\beta (P(s))dJ(s)-Y(T)\Big )\nonumber \\&\qquad +\,\gamma _0\Big (\int _t^T\pi _1(s)\mu ds+\int _t^T\pi _1(s)\sigma dW(s)\Big )\epsilon \nonumber \\&\qquad +\,\gamma _1\epsilon \Big (X^{\pi }(t)+\int _t^T(\pi _0(s)+\pi _1(s)\epsilon )\mu ds\nonumber \\&\qquad +\,\int _t^T(\pi _0(s)+\pi _1(s)\epsilon )\sigma dW(s)\nonumber \\&\qquad -\,\int _t^TJ(s-)\alpha (P(s))ds+\int _t^T\beta (P(s))dJ(s)-J(T)\eta (P(T))\Big )\nonumber \\&\quad =(\gamma _0+\gamma _1\epsilon )X^{\pi }(t)-\gamma _0Y(t)\nonumber \\&\qquad +\,\gamma _0\Big (\int _t^T\pi _0(s)\mu ds+\int _t^T\pi _0(s)\sigma dW(s)\nonumber \\&\qquad -\,\int _t^TJ(s-)\alpha (P(s))ds+\int _t^T\beta (P(s))dJ(s)-(Y(T)-Y(t))\Big )\nonumber \\&\qquad +\,\epsilon \Big (\int _t^T{\tilde{\pi }}(s)\mu ds+\int _t^T{\tilde{\pi }}(s)\sigma dW(s)\Big )\nonumber \\&\qquad -\,\gamma _1\epsilon \Big (\int _t^TJ(s-)\alpha (P(s))ds-\int _t^T\beta (P(s))dJ(s)+J(T)\eta (P(T))\Big ),\quad \ \nonumber \\ \end{aligned}$$
(5.18)

where we introduce the strategy

$$\begin{aligned} {\tilde{\pi }}(s)=\gamma _1\pi _0(s)+(\gamma _0+\gamma _1\epsilon )\pi _1(s),\quad 0\le s\le T. \end{aligned}$$
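The strategy \({\tilde{\pi }}\) collects all terms in (5.18) which multiply \(\epsilon \) and are driven by the investment strategy. As a quick check of this grouping, for fixed \(s\) we have

$$\begin{aligned} \gamma _0\pi _1(s)\epsilon +\gamma _1\epsilon \big (\pi _0(s)+\pi _1(s)\epsilon \big )=\epsilon \big (\gamma _1\pi _0(s)+(\gamma _0+\gamma _1\epsilon )\pi _1(s)\big )=\epsilon {\tilde{\pi }}(s), \end{aligned}$$

so that both the \(ds\)-integral (with the factor \(\mu \)) and the \(dW\)-integral (with the factor \(\sigma \)) of these terms can be written with the single integrand \({\tilde{\pi }}\). Note that \({\tilde{\pi }}\) itself depends on \(\epsilon \) and, through \(\gamma _1=\gamma _1(r)\), on the chosen \(r\).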

From point 3 from Definition 4.1 and (A6), we deduce that the process \(\big (\int _0^t{\tilde{\pi }}(s)dW(s),0\le t\le T\big )\) is a BMO-martingale, and

$$\begin{aligned}&{\Big \Vert \int _0^T{\tilde{\pi }}(s)dW(s)\Big \Vert ^2_{BMO}}\nonumber \\&\quad \le K\Big (\Big \Vert \int _0^T\pi _0(s)dW(s)\Big \Vert ^2_{BMO}+\Big \Vert \int _0^T\pi _1(s)dW(s)\Big \Vert ^2_{BMO}\Big )<\infty .\nonumber \\ \end{aligned}$$
(5.19)

We now study the expected value:

$$\begin{aligned}&{{{\mathbb {E}}}\Big [e^{-\varGamma (r)\big (X^{\pi }(T)-J(T)\eta (P(T))\big )}| {{\mathcal {F}}}_t\Big ]}\nonumber \\&\quad =e^{-(\gamma _0+\gamma _1\epsilon )X^{\pi }(t)+\gamma _0Y(t)}{{\mathbb {E}}}\Big [e^{-\gamma _0\big (X^{\pi _0}(T)-X^{\pi _0}(t)-(Y(T)-Y(t))\big )}\nonumber \\&\qquad \times e^{-\epsilon \big (\int _t^T{\tilde{\pi }}(s)\mu ds +\int _t^T{\tilde{\pi }}(s)\sigma dW(s)\big )}\nonumber \\&\qquad \times e^{\gamma _1\epsilon \big (\int _t^TJ(s-)\alpha (P(s))ds -\int _t^T\beta (P(s))dJ(s)+J(T)\eta (P(T))\big )}|{{\mathcal {F}}}_t\Big ]. \end{aligned}$$
(5.20)

By Hölder’s inequality and boundedness of \(\alpha , \beta , \eta \), we can derive

$$\begin{aligned}&{{{\mathbb {E}}}\Big [e^{-\varGamma (r)\big (X^{\pi }(T)-J(T)\eta (P(T))\big )}| {{\mathcal {F}}}_t\Big ]}\nonumber \\&\quad \le Ke^{-(\gamma _0+\gamma _1\epsilon )X^{\pi }(t)+\gamma _0Y(t)} \Big ({{\mathbb {E}}}\Big [e^{-\gamma _0q_1\big (X^{\pi _0}(T)-X^{\pi _0}(t)-(Y(T)-Y(t))\big )}| {{\mathcal {F}}}_t\Big ]\Big )^{\frac{1}{q_1}}\nonumber \\&\quad \times \Big ({{\mathbb {E}}}\Big [e^{-q_1^*\epsilon \big (\int _t^T{\tilde{\pi }}(s) \mu ds+\int _t^T{\tilde{\pi }}(s)\sigma dW(s)\big )}|{{\mathcal {F}}}_t\Big ]\Big )^{\frac{1}{q_1^*}}, \end{aligned}$$
(5.21)

for a sufficiently small \(q_1>1\) and its conjugate \(q_1^*>1\). We can choose a sufficiently small \(q_1>1\) such that \(\gamma _0q_1=\gamma _0+\gamma _1(r)\epsilon =\varGamma (r)\), if a sufficiently small \(\epsilon >0\) is used. Consequently, by point 4 from Definition 4.1, the first expected value in (5.21) is finite. As far as the second expected value is concerned, we introduce the process

$$\begin{aligned} M(s)=e^{-\int _t^sq_1^*\epsilon {\tilde{\pi }}(u)\sigma dW(u)-\frac{1}{2}\int _t^s|q_1^*\epsilon {\tilde{\pi }}(u)\sigma |^2du},\quad t\le s\le T. \end{aligned}$$

The process \(M\) is an exponential martingale generated by a BMO-martingale since (5.19) holds. Consequently, applying Hölder’s inequality and then the reverse Hölder inequality to the exponential martingale, see Theorem 3.1 in Kazamaki [19], we get

$$\begin{aligned}&{{{\mathbb {E}}}\Big [e^{-q_1^*\epsilon \big (\int _t^T{\tilde{\pi }}(s)\mu ds+\int _t^T{\tilde{\pi }}(s)\sigma dW(s)\big )}|{{\mathcal {F}}}_t\Big ]}\nonumber \\&\quad \le \Big ({{\mathbb {E}}}\big [|M(T)|^{q_2}|{{\mathcal {F}}}_t\big ]\Big )^{\frac{1}{q_2}}\nonumber \\&\quad \times \Big ({{\mathbb {E}}}\Big [e^{\frac{1}{2}\int _t^Tq_2^*|q_1^*\epsilon {\tilde{\pi }}(s) \sigma |^2ds-\int _t^Tq_2^*q^*_1\epsilon {\tilde{\pi }}(s)\mu ds}|{{\mathcal {F}}}_t\Big ]\Big )^{\frac{1}{q_2^*}}\nonumber \\&\quad \le K\Big ({{\mathbb {E}}}\Big [e^{\frac{1}{2}\int _t^Tq_2^*|q_1^*\epsilon {\tilde{\pi }}(s)\sigma |^2ds -\int _t^Tq_2^*q^*_1\epsilon {\tilde{\pi }}(s)\mu ds}|{{\mathcal {F}}}_t\Big ]\Big )^{\frac{1}{q_2^*}}, \end{aligned}$$
(5.22)

for a sufficiently small \(q_2>1\) and its conjugate \(q_2^*>1\). We remark that the constant \(q_2\) depends on \(\Big \Vert \int _0^Tq_1^*\epsilon {\tilde{\pi }}(s)\sigma dW(s)\Big \Vert _{BMO}\). Finally, for a sufficiently small \(\epsilon \), we have the inequality

$$\begin{aligned}&{{{\mathbb {E}}}\Big [e^{\frac{1}{2}\int _t^Tq_2^*|q_1^*\epsilon {\tilde{\pi }}(s) \sigma |^2ds-\int _t^Tq_2^*q^*_1\epsilon {\tilde{\pi }}(s)\mu ds}|{{\mathcal {F}}}_t\Big ]}\nonumber \\&\quad \le K_1{{\mathbb {E}}}\Big [e^{K_2\epsilon ^2\int _t^T|{\tilde{\pi }}(s)|^2ds}| {{\mathcal {F}}}_t\Big ]\nonumber \\&\quad \le \frac{K_1}{1-K_2\epsilon ^2\Big \Vert \int _0^T{\tilde{\pi }}(s)dW(s)\Big \Vert ^2_{BMO}}<\infty , \end{aligned}$$
(5.23)

by (5.19) and the John–Nirenberg inequality, see Theorem 2.2 in Kazamaki [19]. Collecting (5.21)–(5.23), we can conclude that the expected value (5.20) is a.s. finite and our strategy \(\pi \) satisfies point 4 from Definition 3.1. Hence, \(\pi \in {\mathcal {B}}\) implies that \(\pi \in {\mathcal {A}}\).

Assertion (ii) Point 1 from Definition 4.1 is obvious. Points 2–3 can be deduced from (A6) and the properties specified in Proposition 5.1. In particular, the properties that the mapping \(p\mapsto h^k(t,p)\) is Lipschitz continuous on \((0,\infty )\) uniformly in \(t\in [0,T]\) and \(h^k\in {\mathcal {C}}([0,T]\times (0,\infty ))\cap {\mathcal {C}}^{1,2}([0,T)\times (0,\infty ))\) imply that the derivative \((t,p)\mapsto h^k_p(t,p)\) is uniformly bounded and jointly continuous on \([0,T)\times (0,\infty )\). In the definition of the investment strategy (3.13) we choose the left limit \(\lim _{t\rightarrow T-}h^k_p(t,P(t,\omega ))\) and we have a continuous, finite mapping \(t\mapsto h^k_p(t,P(t,\omega ))P(t,\omega )\) on \([0,T]\) for a.a. \(\omega \). The same arguments hold for \(g^k_p(t,p)\). It remains to prove point 4. In fact, we only have to prove that the first expected value in (5.21) is finite if \(\pi _0^*\) is used. By Remark 4.1.b, the strategy \(\pi _0^*\) is the optimal investment strategy for the optimization problem (3.2) with constant risk aversion, see also Theorems 5.1, 6.1 in Delong [8]. From properties of the optimal value function (5.17) for the time-consistent exponential utility maximization problem (3.2) with the constant risk aversion \(\varGamma (r)=\gamma _0\), we can deduce that

$$\begin{aligned} {\mathcal {M}}(s)= & {} e^{-\gamma _0\big (X^{\pi _0^*}(s)-X^{\pi ^*_0}(t)-(Y(s)-Y(t))\big )},\quad t\le s\le T, \end{aligned}$$

is an exponential martingale generated by a BMO-martingale, see eq. (8.12) in Delong [8] or the general theory in Hu et al. [16]. Hence, by the reverse Hölder inequality, we can choose a sufficiently small \(q_1>1\) such that

$$\begin{aligned} {{\mathbb {E}}}\big [|{\mathcal {M}}(T)|^{q_1}|{{\mathcal {F}}}_t\big ]\le K. \end{aligned}$$
(5.24)

We can now use the same arguments as in the first part of the proof. \(\square \)

Lemma 5.1

Let \(\pi \in {\mathcal {A}}\) denote an admissible strategy for the utility maximization problem (3.2) with the wealth-dependent risk aversion coefficient \(\varGamma (r)=\gamma _0+\gamma _1(r)\epsilon \) with \(\epsilon >0\), and let \((v^k_0, v^k_1, w_0^k,w^k_1)_{k=0}^n\) denote the solutions to the PDEs (5.8)–(5.11). The families

$$\begin{aligned}&\Big \{v_i^{J({\mathcal {T}})}({\mathcal {T}},X^{\pi }({\mathcal {T}}),P({\mathcal {T}})), \ {\mathcal {T}} \ is \ an \ {{\mathbb {F}}}-stopping \ time, {\mathcal {T}}\in [0,T] \Big \},\\&\Big \{w_i^{J({\mathcal {T}})}({\mathcal {T}},X^{\pi }({\mathcal {T}}), P({\mathcal {T}}),R({\mathcal {T}})), \ {\mathcal {T}} \ is \ an \ {{\mathbb {F}}}-stopping \ time, {\mathcal {T}}\in [0,T] \Big \},\\&\Big \{w_i^{J({\mathcal {T}})}({\mathcal {T}},X^{\pi }({\mathcal {T}}), P({\mathcal {T}}),r), \ {\mathcal {T}} \ is \ an \ {{\mathbb {F}}}-stopping \ time, {\mathcal {T}}\in [0,T] \Big \}, \ r\in {{\mathbb {R}}}, \end{aligned}$$

are uniformly integrable, for \(i=0,1\).

Proof

The solutions to (5.8)–(5.11) are given by (5.12)–(5.15). By Proposition 5.1, the functions \((h^k)_{k=0}^n, (g^k)_{k=0}^n\) are bounded in \((t,p,k)\). Since \(\gamma _1\) is bounded by (A6), it is sufficient to prove that \(\{e^{-\gamma _0 X^{\pi }({\mathcal {T}})}, {\mathcal {T}} \ is\ an \ {{\mathbb {F}}}-stopping \ time\}\) and \(\{e^{-\gamma _0 X^{\pi }({\mathcal {T}})}X^\pi ({\mathcal {T}}), {\mathcal {T}} \ is\ an \ {{\mathbb {F}}}-stopping \ time\}\) are uniformly integrable for any \(\pi \in {\mathcal {A}}\).

We choose \(\pi \in {\mathcal {A}}\). Points 2 and 4 from Definition 3.1 and the assumption (A6) that \(\gamma _1(0)=0\) imply that the family \(\Big \{e^{-\gamma _0 X^\pi ({\mathcal {T}})}, {\mathcal {T}} \ is \ an \ {{\mathbb {F}}}-stopping \ time\Big \}\) is uniformly integrable, see Remark 8 in Hu et al. [16]. We now consider the second family. We choose a sufficiently small \(q>1\). We have the inequality

$$\begin{aligned}&{\sup _{t\in [0,T]}{{\mathbb {E}}}\Big [\big |e^{-\gamma _0 X^{\pi }(t)}X^\pi (t)\big |^{q}\Big ]}\nonumber \\&\quad \le \Big (\sup _{t\in [0,T]}{{\mathbb {E}}}\Big [e^{-\gamma _0\kappa q X^{\pi }(t)}\Big ]\Big )^{\frac{1}{\kappa }}\times \Big (\sup _{t\in [0,T]}{{\mathbb {E}}}\Big [|X^\pi (t)|^{\kappa ^*q}\Big ]\Big )^{\frac{1}{\kappa ^*}}, \end{aligned}$$
(5.25)

where we choose a sufficiently small \(\kappa >1\), and \(\kappa ^*\) denotes its conjugate. Since we can set \(\gamma _0\kappa q=\gamma _0+\gamma _1(r)\epsilon =\varGamma (r)\) for some \(r\in {{\mathbb {R}}}\) and sufficiently small \(\epsilon>0, \kappa>1, q>1\), the first term in (5.25) is finite by uniform integrability of \(\{e^{-\gamma _0\kappa q X^{\pi }({\mathcal {T}})},{\mathcal {T}} \ is\ an \ {{\mathbb {F}}}-stopping \ time\}\) (by points 2 and 4 from Definition 3.1 and the arguments from above). As far as the second term in (5.25) is concerned, let us recall the dynamics (3.1) for the process \(X^{\pi }\). For any \(\kappa ^*>1\) and \(q>1\), we have the inequalities

$$\begin{aligned} \sup _{t\in [0,T]}{{\mathbb {E}}}\Big [|X^\pi (t)|^{\kappa ^*q}\Big ]\le & {} K\Big (1+{{\mathbb {E}}}\Big [\Big |\int _0^T|\pi (s)|^2ds\Big |^{\frac{\kappa ^*q}{2}}\Big ]\nonumber \\&+\,{{\mathbb {E}}}\Big [\sup _{t\in [0,T]}\Big |\int _0^t\pi (s)dW(s)\Big |^{\kappa ^*q}\Big ]\Big )\nonumber \\\le & {} K\Big (1+{{\mathbb {E}}}\Big [\Big |\int _0^T|\pi (s)|^2ds\Big |^{\frac{\kappa ^*q}{2}}\Big ]\Big )\nonumber \\\le & {} K\Big (1+\Big \Vert \int _0^T\pi (s)dW(s)\Big \Vert ^{2[\frac{\kappa ^*q}{2}]+2}_{BMO}\Big )<\infty , \quad \ \end{aligned}$$
(5.26)

where we use the Burkholder–Davis–Gundy inequality and the energy inequality (see e.g. page 29 in Kazamaki [19]). \(\square \)
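The moment bound (5.25) enters through the standard criterion that boundedness in \(L^q\) for some \(q>1\) implies uniform integrability. As a reminder, stated for a generic family \((\xi _i)_{i\in I}\) (a symbol used only here) with \(\sup _{i\in I}{{\mathbb {E}}}[|\xi _i|^q]<\infty \), for any \(C>0\) we have

$$\begin{aligned} \sup _{i\in I}{{\mathbb {E}}}\big [|\xi _i|{\mathbf {1}}\{|\xi _i|>C\}\big ]\le C^{1-q}\sup _{i\in I}{{\mathbb {E}}}\big [|\xi _i|^q\big ]\rightarrow 0,\quad C\rightarrow \infty , \end{aligned}$$

since \(|\xi _i|\le |\xi _i|^qC^{1-q}\) on the set \(\{|\xi _i|>C\}\).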

Lemma 5.2

Let \(\pi \in {\mathcal {A}}\) denote an admissible strategy and \((v^{k,\pi }, w^{k,\pi })_{k=0}^n\) denote the corresponding objective functions (3.3)–(3.4) for the utility maximization problem (3.2) with the wealth-dependent risk aversion coefficient \(\varGamma (r)=\gamma _0+\gamma _1(r)\epsilon \) with \(\epsilon >0\). The families

$$\begin{aligned}&\Big \{v^{J({\mathcal {T}}),\pi }({\mathcal {T}},X^{\pi }({\mathcal {T}}),P({\mathcal {T}})), \ {\mathcal {T}} \ is \ an \ {{\mathbb {F}}}-stopping \ time, {\mathcal {T}}\in [0,T] \Big \},\\&\Big \{w^{J({\mathcal {T}}),\pi }({\mathcal {T}},X^{\pi }({\mathcal {T}}),P({\mathcal {T}}),R({\mathcal {T}})), \ {\mathcal {T}} \ is \ an \ {{\mathbb {F}}}-stopping \ time, {\mathcal {T}}\in [0,T] \Big \},\\&\Big \{w^{J({\mathcal {T}}),\pi }({\mathcal {T}},X^{\pi }({\mathcal {T}}),P({\mathcal {T}}),r), \ {\mathcal {T}} \ is \ an \ {{\mathbb {F}}}-stopping \ time, {\mathcal {T}}\in [0,T] \Big \}, \ r\in {{\mathbb {R}}}, \end{aligned}$$

are uniformly integrable.

Proof

This is a modification of a well-known result which concerns uniform integrability of conditional expectations. We choose \(\pi \in {\mathcal {A}}\).

Step 1 Let us consider the family

$$\begin{aligned} w^{J({\mathcal {T}}),\pi }({\mathcal {T}},X^{\pi }({\mathcal {T}}),P({\mathcal {T}}), R({\mathcal {T}})) ={{\mathbb {E}}}\Big [e^{-\varGamma (R({\mathcal {T}}))\big (X^{\pi }(T)-J(T)\eta (P(T)) \big )}|{{\mathcal {F}}}_{{\mathcal {T}}}\Big ],\qquad \quad \end{aligned}$$
(5.27)

indexed with stopping times \({\mathcal {T}}\). We can observe that

$$\begin{aligned} {e^{-\varGamma (R({\mathcal {T}}))\big (X^{\pi }(T)-J(T)\eta (P(T))\big )}}\le & {} e^{-(\gamma _0+\gamma _1(-\infty ) \epsilon )\big (X^{\pi }(T)-J(T)\eta (P(T))\big )}\nonumber \\&+e^{-(\gamma _0+\gamma _1(+\infty )\epsilon )\big (X^{\pi }(T) -J(T)\eta (P(T))\big )}:={\mathcal {U}},\qquad \quad \end{aligned}$$
(5.28)

where \(\gamma _1(-\infty )=\sup _{r\in {{\mathbb {R}}}}\gamma _1(r)\) and \(\gamma _1( +\infty )=\inf _{r\in {{\mathbb {R}}}}\gamma _1(r)\). From point 4 from Definition 3.1, we conclude that \({{\mathbb {E}}}[{\mathcal {U}}]<\infty \). We can establish the first property:

$$\begin{aligned}&{\sup _{{\mathcal {T}}}{{\mathbb {E}}}\big [w^{J({\mathcal {T}}),\pi } ({\mathcal {T}},X^{\pi }({\mathcal {T}}),P({\mathcal {T}}), R({\mathcal {T}}))\big ]}\nonumber \\&\quad =\sup _{{\mathcal {T}}}{{\mathbb {E}}}\Big [e^{-\varGamma (R({\mathcal {T}})) \big (X^{\pi }(T)-J(T)\eta (P(T))\big )}\Big ]\le {{\mathbb {E}}}[{\mathcal {U}}]<\infty . \end{aligned}$$
(5.29)
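The bound (5.28) can be verified by distinguishing the sign of the exponent. As a sketch, write \(\xi =X^{\pi }(T)-J(T)\eta (P(T))\), \({\bar{\gamma }}_1=\sup _{r\in {{\mathbb {R}}}}\gamma _1(r)\) and \({\underline{\gamma }}_1=\inf _{r\in {{\mathbb {R}}}}\gamma _1(r)\) (shorthand used only here), where both quantities are finite since \(\gamma _1\) is bounded by (A6). For \(\epsilon >0\) and any \(r\in {{\mathbb {R}}}\) we have

$$\begin{aligned} e^{-(\gamma _0+\gamma _1(r)\epsilon )\xi }\le & {} e^{-(\gamma _0+{\bar{\gamma }}_1\epsilon )\xi }\quad \text {if } \xi <0,\\ e^{-(\gamma _0+\gamma _1(r)\epsilon )\xi }\le & {} e^{-(\gamma _0+{\underline{\gamma }}_1\epsilon )\xi }\quad \text {if } \xi \ge 0. \end{aligned}$$

In either case the left-hand side is dominated by the sum of the two non-negative exponentials, which is the random variable \({\mathcal {U}}\), and the bound does not depend on the value of \(r\), in particular not on \(R({\mathcal {T}})\).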

Step 2 By Markov’s inequality and (5.29), we derive the inequality

$$\begin{aligned}&{Pr\Big (w^{J({\mathcal {T}}),\pi }({\mathcal {T}},X^{\pi } ({\mathcal {T}}),P({\mathcal {T}}),R({\mathcal {T}}))>C\Big )}\\&\quad \le \frac{{{\mathbb {E}}}\big [w^{J({\mathcal {T}}),\pi }({\mathcal {T}},X^{\pi } ({\mathcal {T}}),P({\mathcal {T}}),R({\mathcal {T}}))\big ]}{C}\le \frac{{{\mathbb {E}}}[{\mathcal {U}}]}{C}. \end{aligned}$$

Consequently, for any \(\delta >0\), we can choose a sufficiently large C such that

$$\begin{aligned} Pr\Big (w^{J({\mathcal {T}}),\pi }({\mathcal {T}},X^{\pi }({\mathcal {T}}),P({\mathcal {T}}),R({\mathcal {T}}))>C\Big )<\delta . \end{aligned}$$

Step 3 Since the random variable \({\mathcal {U}}\) defined in (5.28) is integrable, and a single integrable random variable is trivially uniformly integrable, for any \(\delta _0>0\) we can choose \(\delta >0\) such that

$$\begin{aligned} Pr(A)<\delta \Rightarrow {{\mathbb {E}}}[{\mathcal {U}}{\mathbf {1}}_{A}]<\delta _0. \end{aligned}$$

By Step 2, for any \(\delta _0>0\), we can choose \(\delta \) and C such that

$$\begin{aligned}&Pr\Big (w^{J({\mathcal {T}}),\pi }({\mathcal {T}},X^{\pi }({\mathcal {T}}), P({\mathcal {T}}),R({\mathcal {T}}))>C\Big )<\delta \Rightarrow \\&\quad {{\mathbb {E}}}\big [{\mathcal {U}}{\mathbf {1}}_{w^{J({\mathcal {T}}),\pi }({\mathcal {T}}, X^{\pi }({\mathcal {T}}),P({\mathcal {T}}),R({\mathcal {T}}))>C}\big ]<\delta _0. \end{aligned}$$

Step 4 By (5.27)–(5.28) and the property of conditional expectations, we get the inequality

$$\begin{aligned}&{{\mathbb {E}}}\big [w^{J({\mathcal {T}}),\pi }({\mathcal {T}},X^{\pi }({\mathcal {T}}), P({\mathcal {T}}),R({\mathcal {T}})){\mathbf {1}}_{w^{J({\mathcal {T}}), \pi }({\mathcal {T}},X^{\pi }({\mathcal {T}}),P({\mathcal {T}}),R({\mathcal {T}}))>C} \big ]\\&\quad \le {{\mathbb {E}}}\big [{\mathcal {U}}{\mathbf {1}}_{w^{J({\mathcal {T}}),\pi }({\mathcal {T}}, X^{\pi }({\mathcal {T}}),P({\mathcal {T}}),R({\mathcal {T}}))>C}\big ]. \end{aligned}$$

Consequently, by Step 3, for any \(\delta _0>0\), we can choose \(\delta \) and C such that

$$\begin{aligned}&Pr\Big (w^{J({\mathcal {T}}),\pi }({\mathcal {T}},X^{\pi }({\mathcal {T}}), P({\mathcal {T}}),R({\mathcal {T}}))>C\Big )<\delta \Rightarrow \\&\quad {{\mathbb {E}}}[w^{J({\mathcal {T}}),\pi }({\mathcal {T}},X^{\pi }({\mathcal {T}}),P({\mathcal {T}}), R({\mathcal {T}})){\mathbf {1}}_{w^{J({\mathcal {T}}),\pi }({\mathcal {T}},X^{\pi } ({\mathcal {T}}),P({\mathcal {T}}),R({\mathcal {T}}))>C}]<\delta _0. \end{aligned}$$

We conclude that the family \(w^{J({\mathcal {T}}),\pi }({\mathcal {T}},X^{\pi } ({\mathcal {T}}),P({\mathcal {T}}),R({\mathcal {T}}))\) indexed with stopping times \({\mathcal {T}}\) is uniformly integrable. The remaining families of random variables can be studied in exactly the same way. \(\square \)
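The mechanism behind Steps 2–4 can be illustrated on a finite probability space. The following Python sketch is a toy example and not part of the proof: the sample space, the values of `U` and the partitions (which stand in for the \(\sigma \)-fields \({{\mathcal {F}}}_{{\mathcal {T}}}\)) are hypothetical, and the family is taken to be the conditional expectations of `U` themselves rather than random variables dominated by them. It uses exact rational arithmetic, so the identity \({{\mathbb {E}}}[w{\mathbf {1}}_{w>C}]={{\mathbb {E}}}[{\mathcal {U}}{\mathbf {1}}_{w>C}]\) for \(w={{\mathbb {E}}}[{\mathcal {U}}|{\mathcal {G}}]\) holds without rounding.

```python
from fractions import Fraction


def conditional_expectation(values, probs, partition):
    """Return E[U | G] pointwise, where G is generated by `partition`."""
    cond = [Fraction(0)] * len(values)
    for block in partition:
        mass = sum(probs[i] for i in block)
        mean = sum(values[i] * probs[i] for i in block) / mass
        for i in block:
            cond[i] = mean
    return cond


def tail_expectation(target, trigger, probs, level):
    """Return E[target * 1{trigger > level}]."""
    return sum(t * p for t, g, p in zip(target, trigger, probs) if g > level)


# Hypothetical toy data: six equally likely outcomes and a non-negative U.
probs = [Fraction(1, 6)] * 6
U = [Fraction(v) for v in (1, 2, 3, 4, 10, 40)]

# Three partitions standing in for the information available at three stopping times.
partitions = [
    [[0, 1, 2, 3, 4, 5]],
    [[0, 1, 2], [3, 4, 5]],
    [[0], [1], [2], [3], [4], [5]],
]
family = [conditional_expectation(U, probs, part) for part in partitions]


def worst_tail(level):
    """Return the supremum over the family of E[w * 1{w > level}]."""
    return max(tail_expectation(w, w, probs, level) for w in family)
```

Here `worst_tail(5)`, `worst_tail(20)` and `worst_tail(40)` evaluate to 10, 20/3 and 0, respectively: the worst tail expectation over the family is controlled by the tail of the single variable `U` and vanishes as the level grows, which is the content of uniform integrability.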

Proposition 5.3

Let \(\pi \in {\mathcal {A}}\). We consider functions \((\vartheta ^k)_{k=0}^n\in {\mathcal {C}}([0,T]\times {{\mathbb {R}}}\times (0,\infty )) \cap {\mathcal {C}}^{1,2,2}([0,T)\times {{\mathbb {R}}}\times (0,\infty )), \ (\varphi ^k)_{k=0}^n\in {\mathcal {C}}([0,T]\times {{\mathbb {R}}}\times (0,\infty )\times {{\mathbb {R}}})\cap {\mathcal {C}}^{1,2,2,2}([0,T)\times {{\mathbb {R}}}\times (0,\infty )\times {{\mathbb {R}}})\) such that the families

$$\begin{aligned}&\Big \{\vartheta ^{J({\mathcal {T}})}({\mathcal {T}},X^{\pi }({\mathcal {T}}), P({\mathcal {T}})), \ {\mathcal {T}} \ is \ an \ {{\mathbb {F}}}-stopping \ time, {\mathcal {T}}\in [0,T] \Big \},\\&\Big \{\varphi ^{J({\mathcal {T}})}({\mathcal {T}},X^{\pi }({\mathcal {T}}), P({\mathcal {T}}),R({\mathcal {T}})), \ {\mathcal {T}} \ is \ an \ {{\mathbb {F}}}-stopping \ time, {\mathcal {T}}\in [0,T] \Big \},\\&\Big \{\varphi ^{J({\mathcal {T}})}({\mathcal {T}},X^{\pi }({\mathcal {T}}), P({\mathcal {T}}),r), \ {\mathcal {T}} \ is \ an \ {{\mathbb {F}}}-stopping \ time, {\mathcal {T}}\in [0,T] \Big \}, \ r\in {{\mathbb {R}}}, \end{aligned}$$

are uniformly integrable, and \((\vartheta ^k)_{k=0}^n, \ (\varphi ^k)_{k=0}^n\) satisfy the PDEs:

$$\begin{aligned}&{\vartheta _t^k(t,x,p)+{\mathcal {L}}_k^\pi \vartheta ^k(t,x,p)-{\mathcal {M}}^\pi _k\varphi ^k(t,x,p,x-kF(t,p))}\nonumber \\&\quad +{\mathcal {L}}^\pi _k\varphi ^k(t,x,p,x-kF(t,p)) +\Big (\vartheta ^{k-1}(t,x-\beta (p),p)-\vartheta ^k(t,x,p)\Big )k\lambda \nonumber \\&\quad +\Big (\varphi ^{k-1}(t,x-\beta (p),p,x-kF(t,p))\nonumber \\&\quad \ -\varphi ^{k-1}(t,x-\beta (p),p,x-\beta (p)-(k-1)F(t,p))\Big )k\lambda \nonumber \\&\quad +\varPsi ^k(t,x,p,x-kF(t,p))=0,\nonumber \\&\quad \ \quad \ \quad \ (t,x,p)\in [0,T)\times {{\mathbb {R}}}\times (0,\infty ),\nonumber \\&\quad \vartheta ^k(T,x,p)=\varPhi ^k(x,p,x-kF(T,p)),\quad (x,p)\in {{\mathbb {R}}}\times (0,\infty ), \end{aligned}$$
(5.30)

and

$$\begin{aligned}&{\varphi _t^k(t,x,p,r)+{\mathcal {L}}_k^{\pi }\varphi ^k(t,x,p,r)}\nonumber \\&\quad +\Big (\varphi ^{k-1}(t,x-\beta (p),p,r)-\varphi ^k(t,x,p,r)\Big )k\lambda +\varPsi ^k(t,x,p,r)=0,\nonumber \\&\quad \ \quad \ \quad \ (t,x,p)\in [0,T)\times {{\mathbb {R}}}\times (0,\infty ), \ r\in {{\mathbb {R}}},\nonumber \\&\quad \varphi ^k(T,x,p,r)=\varPhi ^k(x,p,r),\quad (x,p)\in {{\mathbb {R}}}\times (0,\infty ), \ r\in {{\mathbb {R}}}, \end{aligned}$$
(5.31)

for \(k\in \{0,1,\ldots ,n\}\). Moreover, we assume that the functions \((\varPsi ^k)_{k=0}^n,(\varPhi ^k)_{k=0}^n\) satisfy the integrability conditions:

$$\begin{aligned}&{{\mathbb {E}}}\Big [\int _0^T\big |\varPsi ^{J(s)}(s,X^\pi (s),P(s),r)\big |ds\Big ]<\infty ,\quad r\in {{\mathbb {R}}},\\&{{\mathbb {E}}}\Big [\int _0^T\big |\varPsi ^{J(s)}(s,X^\pi (s),P(s),R(s))\big |ds\Big ]<\infty ,\\&{{\mathbb {E}}}\Big [|\varPhi ^{J(T)}(X^\pi (T),P(T),r)\big |\Big ]<\infty ,\quad r\in {{\mathbb {R}}},\\&{{\mathbb {E}}}\Big [\big |\varPhi ^{J(T)}(X^\pi (T),P(T),R(T))\big |\Big ]<\infty . \end{aligned}$$

We have the representations:

$$\begin{aligned}&{\varphi ^{k}(t,x,p,r)}\\&\quad ={{\mathbb {E}}}_{t,x,p,k}\Big [\varPhi ^{J(T)}(X^\pi (T),P(T),r)+\int _t^T\varPsi ^{J(s)} (s,X^\pi (s),P(s),r)ds\Big ],\\&\quad \ \quad \ (t,x,p,k)\in [0,T]\times {{\mathbb {R}}}\times (0,\infty ) \times \{0,1,\ldots ,n\},\ r\in {{\mathbb {R}}},\quad \ \quad \ \end{aligned}$$

and

$$\begin{aligned} \vartheta ^k(t,x,p)= & {} \varphi ^k(t,x,p,x-kF(t,p)),\\&(t,x,p,k)\in [0,T]\times {{\mathbb {R}}}\times (0,\infty )\times \{0,1,\ldots ,n\}. \end{aligned}$$

Proof

Let \((\tau _m)_{m=0}^\infty \) denote a localizing sequence of stopping times for \((X^\pi ,P,R)\). We fix \((t,x,p,k,r)\in [0,T)\times {{\mathbb {R}}}\times (0,\infty )\times \{0,\ldots ,n\}\times {{\mathbb {R}}}\) and choose \(\pi \in {\mathcal {A}}\). Applying Itô’s formula to \(\varphi \), with r fixed, and using equation (5.31), we can deduce that

$$\begin{aligned}&{{\mathbb {E}}}_{t,x,p,k}\big [\varphi ^{J(\tau _m)}(\tau _m,X^{\pi }(\tau _m),P(\tau _m),r)\big ] -\varphi ^k(t,x,p,r)\\&\quad \ \quad ={{\mathbb {E}}}_{t,x,p,k}\Big [-\int _t^{\tau _m}\varPsi ^{J(s)}(s,X^\pi (s),P(s),r)ds\Big ]. \end{aligned}$$

We take \(\tau _m\rightarrow T\). Since the jumps of the process J are totally inaccessible, we have \(J(T-)=J(T), a.s.\). By uniform integrability and the dominated convergence theorem, we derive that

$$\begin{aligned}&{\varphi ^{k}(t,x,p,r)}\\&\quad ={{\mathbb {E}}}_{t,x,p,k}\Big [\varPhi ^{J(T)}(X^{\pi }(T),P(T),r) +\int _t^T\varPsi ^{J(s)}(s,X^\pi (s),P(s),r)ds\Big ]. \end{aligned}$$

Applying Itô’s formula to \(\vartheta \) and using equation (5.30), we can show that

$$\begin{aligned}&{{{\mathbb {E}}}_{t,x,p,k}\big [\vartheta ^{J(\tau _m)}(\tau _m,X^{\pi }(\tau _m),P(\tau _m))] -\vartheta ^k(t,x,p)}\\&\quad ={{\mathbb {E}}}_{t,x,p,k}\Big [\int _t^{\tau _m}\Big \{{\mathcal {M}}^\pi _{J(s)} \varphi ^{J(s)}\big (s,X^\pi (s),P(s),X^\pi (s)-J(s)F(s,P(s))\big )\\&\quad -{\mathcal {L}}^\pi _{J(s)}\varphi ^{J(s)}\big (s,X^\pi (s),P(s),X^\pi (s) -J(s)F(s,P(s))\big )\\&\quad -\Big (\varphi ^{J(s)-1}\big (s,X^\pi (s)-\beta (P(s)),P(s),X^\pi (s) -J(s)F(s,P(s))\big )\\&\quad \ -\varphi ^{J(s)-1}\big (s,X^\pi (s)-\beta (P(s)),P(s),X^\pi (s)-\beta (P(s)) -(J(s)-1)F(s,P(s))\big )\Big )J(s)\lambda \\&\quad -\varPsi ^{J(s)}\big (s,X^\pi (s),P(s),X^\pi (s)-J(s)F(s,P(s))\big )\Big \}ds\Big ]. \end{aligned}$$

Since the PDEs (5.31) also hold for \(r=x-kF(t,p)\), we conclude that

$$\begin{aligned}&{{{\mathbb {E}}}_{t,x,p,k}\big [\vartheta ^{J(\tau _m)}(\tau _m,X^{\pi }(\tau _m),P(\tau _m))] -\vartheta ^k(t,x,p)}\\&\quad ={{\mathbb {E}}}_{t,x,p,k}\Big [\int _t^{\tau _m}\Big \{\varphi ^{J(s)}_t \big (s,X^\pi (s),P(s),X^\pi (s) -J(s)F(s,P(s))\big )\\&\quad +{\mathcal {M}}^\pi _{J(s)}\varphi ^{J(s)}\big (s,X^\pi (s),P(s),X^\pi (s) -J(s)F(s,P(s)) \big )\\&\quad +\Big (\varphi ^{J(s)-1}\big (s,X^\pi (s)-\beta (P(s)),P(s),X^\pi (s)-\beta (P(s)) -(J(s)-1)F(s,P(s))\big )\\&\quad \ - \varphi ^{J(s)}\big (s,X^\pi (s),P(s),X^\pi (s)-J(s)F(s,P(s))\big )\Big )J(s) \lambda \Big \}ds\Big ]\\&\quad ={{\mathbb {E}}}_{t,x,p,k}\big [\varphi ^{J(\tau _m)}(\tau _m,X^{\pi }(\tau _m),P(\tau _m),R(\tau _m)) \big ]-\varphi ^k(t,x,p,x-kF(t,p)), \end{aligned}$$

where the last term follows from Itô’s formula applied to \(\varphi \). We take \(\tau _m\rightarrow T\). By uniform integrability, we arrive at \(\vartheta ^k(t,x,p)=\varphi ^k(t,x,p,x-kF(t,p))\). \(\square \)

Proof of Theorem 4.1

First, we present detailed proofs of the assertions (iii) and (iv). At the end, we give a sketch of the proof for the assertions (i)–(ii). Let \(\epsilon _0\) denote a sufficiently small positive constant. We consider \(\epsilon \in (0,\epsilon _0]\). By K we denote a constant which may change from line to line.

Step 1 We choose \(\pi _1\) so that \(\pi =\pi ^*_0+\pi _1\epsilon \in {\mathcal {B}}\). By Proposition 5.2, \(\pi \in {\mathcal {A}}\). Let \((v^{k,\pi })_{k=0}^n, (w^{k,\pi })_{k=0}^n\) denote the corresponding objective functions (3.3)–(3.4) for the optimization problem with the wealth-dependent risk aversion (3.6). By our assumption, \((v^{k,\pi })_{k=0}^n\in {\mathcal {C}}([0,T]\times {{\mathbb {R}}}\times (0,\infty ))\cap {\mathcal {C}}^{1,2,2}([0,T)\times {{\mathbb {R}}}\times (0,\infty ))\) and \((w^{k,\pi })_{k=0}^n\in {\mathcal {C}}([0,T]\times {{\mathbb {R}}}\times (0,\infty )\times {{\mathbb {R}}})\cap {\mathcal {C}}^{1,2,2,2}([0,T)\times {{\mathbb {R}}}\times (0,\infty )\times {{\mathbb {R}}})\). We will use the following four properties:

Step 1a From (3.3)–(3.4) we deduce that \(v^{k,\pi }(t,x,p)=w^{k,\pi }(t,x,p,x-kF(t,p))\). We have the following relations for the derivatives:

$$\begin{aligned} v_x^{k,\pi }(t,x,p)= & {} w_x^{k,\pi }(t,x,p,x-kF(t,p))+w_r^{k,\pi }(t,x,p,x-kF(t,p)),\nonumber \\ v_{xp}^{k,\pi }(t,x,p)= & {} w_{xp}^{k,\pi }(t,x,p,x-kF(t,p))+w_{rp}^{k,\pi }(t,x,p,x-kF(t,p))\nonumber \\&-w_{xr}^{k,\pi }(t,x,p,x-kF(t,p))kF_p(t,p)\nonumber \\&-w_{rr}^{k,\pi }(t,x,p,x-kF(t,p))kF_p(t,p),\nonumber \\ v_{xx}^{k,\pi }(t,x,p)= & {} w_{xx}^{k,\pi }(t,x,p,x-kF(t,p))+w_{rr}^{k,\pi }(t,x,p,x-kF(t,p))\nonumber \\&+2w_{xr}^{k,\pi }(t,x,p,x-kF(t,p)). \end{aligned}$$
(5.32)
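
The relations (5.32) are instances of the chain rule. As a brief sketch, write \(r(x,p)=x-kF(t,p)\) (a shorthand introduced here only for illustration), so that \(r_x=1\) and \(r_p=-kF_p(t,p)\), and suppress the arguments \((t,x,p,r(x,p))\) on the right hand sides:

$$\begin{aligned} v_x^{k,\pi }&=w_x^{k,\pi }+w_r^{k,\pi }r_x=w_x^{k,\pi }+w_r^{k,\pi },\\ v_{xx}^{k,\pi }&=\big (w_{xx}^{k,\pi }+w_{xr}^{k,\pi }r_x\big )+\big (w_{rx}^{k,\pi }+w_{rr}^{k,\pi }r_x\big )=w_{xx}^{k,\pi }+2w_{xr}^{k,\pi }+w_{rr}^{k,\pi },\\ v_{xp}^{k,\pi }&=\big (w_{xp}^{k,\pi }+w_{xr}^{k,\pi }r_p\big )+\big (w_{rp}^{k,\pi }+w_{rr}^{k,\pi }r_p\big )\\&=w_{xp}^{k,\pi }+w_{rp}^{k,\pi }-\big (w_{xr}^{k,\pi }+w_{rr}^{k,\pi }\big )kF_p(t,p). \end{aligned}$$

The symmetry \(w_{xr}^{k,\pi }=w_{rx}^{k,\pi }\) used in the second line relies on the assumed \({\mathcal {C}}^{1,2,2,2}\) regularity of \(w^{k,\pi }\).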

Step 1b Since \(\pi ^*_0\) is determined by (5.6) and \(v^k_0(t,x,p)=w^k_0(t,x,p)\) [see (5.12)–(5.13)], we can also use the strategy

$$\begin{aligned} \pi ^{k,*}_0(t,x,p)=-\frac{w_{0,x}^k(t,x,p)\mu +w_{0,px}^k(t,x,p)bp\sigma \rho }{w_{0,xx}^k(t,x,p)\sigma ^2}, \end{aligned}$$

and the equation

$$\begin{aligned} w_{0,x}^k(t,x,p)\mu +w_{0,px}^k(t,x,p)bp\sigma \rho +\pi _0^{k,*}(t,x,p)w_{0,xx}^k(t,x,p)\sigma ^2=0.\quad \ \end{aligned}$$
(5.33)

Since the left hand side of (5.33) vanishes, it can be added to any equation without changing that equation. Clearly, (5.33) also holds if we replace \(w_0^k\) with \(v^k_0\).

Step 1c We claim that the mapping \(\epsilon \mapsto X^{\pi _0^*+\pi _1\epsilon }(.,\omega )\) is continuous in the topology of uniform convergence on \([0,\epsilon _0]\times [0,T]\) for a.a. \(\omega \). By Theorem V.7 from Protter [20] and points 2–3 from Definition 4.1, there exists a unique solution \(X^{\pi _0^*+\pi _1\epsilon }\) to the SDE (3.1) for any \(\epsilon \in [0,\epsilon _0]\). We have the dynamics:

$$\begin{aligned}&{d\big (X^{\pi ^*_0+\pi _1\epsilon }(t)-X^{\pi ^*_0+\pi _1\epsilon '}(t)\big )}\\&\quad =\left\{ \pi _0^{J(t-),*}(t,X^{\pi ^*_0+\pi _1\epsilon }(t),P(t)) -\pi _0^{J(t-),*}(t,X^{\pi ^*_0+\pi _1\epsilon '}(t),P(t))\right. \\&\qquad +\,\epsilon \Big (\pi _1^{J(t-)}(t,X^{\pi ^*_0+\pi _1\epsilon }(t),P(t)) -\pi _1^{J(t-)}(t,X^{\pi ^*_0+\pi _1\epsilon '}(t),P(t))\Big )\\&\qquad \left. +\,\big (\epsilon -\epsilon '\big )\pi _1^{J(t-)}(t,X^{\pi ^*_0 +\pi _1\epsilon '}(t),P(t))\right\} \Big (\mu dt+\sigma dW(t)\Big ). \end{aligned}$$
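
The coefficient in front of \(\mu dt+\sigma dW(t)\) arises from adding and subtracting \(\epsilon \pi _1\) evaluated along the second wealth process. As a sketch, with the shorthands \(X=X^{\pi ^*_0+\pi _1\epsilon }(t)\) and \(X'=X^{\pi ^*_0+\pi _1\epsilon '}(t)\) (introduced here for illustration only), and with the arguments \(t, P(t)\) and the index \(J(t-)\) suppressed:

$$\begin{aligned} \big (\pi _0^*(X)+\epsilon \pi _1(X)\big )-\big (\pi _0^*(X')+\epsilon '\pi _1(X')\big )&=\pi _0^*(X)-\pi _0^*(X')+\epsilon \big (\pi _1(X)-\pi _1(X')\big )\\&\quad +\big (\epsilon -\epsilon '\big )\pi _1(X'). \end{aligned}$$

The first two differences are presumably the ones controlled, up to a suitable stopping time, by the conditions in points 2–3 of Definition 4.1, while the last term is proportional to \(\epsilon -\epsilon '\); this is the structure one expects behind a moment estimate of order \(|\epsilon -\epsilon '|^q\).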

We recall the continuous processes H and \({\tilde{H}}\) from points 2–3 of Definition 4.1 and define the stopping times \(\tau _n=\inf \{t\in [0,T]: H(t)+{\tilde{H}}(t)\ge n\}\) for \(n\in {\mathbb {N}}\). Standard estimates for SDEs lead us to the inequality:

$$\begin{aligned} {{\mathbb {E}}}\Big [\sup _{t\in [0,\tau _n]}\big |X^{\pi ^*_0+\pi _1\epsilon }(t) -X^{\pi ^*_0+\pi _1\epsilon '}(t)\big |^q\Big ]\le K_{\epsilon _0,n}|\epsilon -\epsilon '|^q,\quad q\ge 2. \end{aligned}$$

By Kolmogorov’s lemma, our claim holds on \([0,\epsilon _0]\times [0,\tau _n]\) for a.a. \(\omega \). The continuity of the mapping \(\epsilon \mapsto X^{\pi _0^*+\pi _1\epsilon }(.,\omega )\) on \([0,\epsilon _0]\times [0,T]\) for a.a. \(\omega \) follows from the arguments from the proofs of Theorems V.7 and V.37 in Protter [20].

Step 1d We improve the estimates (5.20)–(5.23). Let us choose sufficiently small \(q>1, \kappa>1, \iota >1\), and let \(q^*, \kappa ^*, \iota ^*\) denote their conjugates. We introduce the martingales:

$$\begin{aligned} {\mathcal {M}}(t)= & {} e^{-\gamma _0(X^{\pi _0^*}(t)-Y(t))},\quad 0\le t\le T,\\ M(t)= & {} e^{-\int _0^tq\kappa ^*\epsilon \gamma _0\pi _1(s)\sigma dW(s)-\frac{1}{2}\int _0^t|q\kappa ^*\epsilon \gamma _0\pi _1(s)\sigma |^2ds},\quad 0\le t\le T. \end{aligned}$$

We note that

$$\begin{aligned} \Big \Vert \int _0^Tq\kappa ^*\epsilon \gamma _0\pi _1(s)\sigma dW(s)\Big \Vert _{BMO}\le K\Big \Vert \int _0^T\pi _1(s)dW(s)\Big \Vert _{BMO}, \end{aligned}$$

for all \(\epsilon \in [0,\epsilon _0]\), and the constant K is independent of \(\epsilon \). Consequently, by Theorem 3.1 from Kazamaki [19], for all \(\epsilon \in [0,\epsilon _0]\) we can find a universal, sufficiently small \(\iota >1\) such that all stochastic exponentials of \(\int _0^tq\kappa ^*\epsilon \gamma _0\pi _1(s)\sigma dW(s)\) satisfy the reverse Hölder inequality with the common power \(\iota \). The reverse Hölder inequality gives us the estimate

$$\begin{aligned} {{\mathbb {E}}}\big [\big |M(t)\big |^{\iota }\big ]\le K, \end{aligned}$$

for all \(\epsilon \in [0,\epsilon _0]\), and the constant K depends on \(\iota \) but is independent of \(\epsilon \). Using the arguments from the proof of Proposition 5.2 together with Doob’s inequality, we can now conclude that

$$\begin{aligned} {{{\mathbb {E}}}\Big [\sup _{t\in [0,T]}\big |e^{-\gamma _0X^{\pi ^*_0+\pi _1\epsilon }(t)}\big |^q\Big ]}&\le K_1\Big ({{\mathbb {E}}}\Big [\sup _{t\in [0,T]}\big |{\mathcal {M}}(t)\big |^{q\kappa }\Big ] \Big )^{\frac{1}{\kappa }}\Big ({{\mathbb {E}}}\Big [\sup _{t\in [0,T]}\big |M(t)\big |^{\iota }\Big ]\Big )^{\frac{1}{\kappa ^*\iota }}\nonumber \\&\quad \times \Big ({{\mathbb {E}}}\Big [e^{K_2\epsilon ^2\int _0^T|\pi _1(s)|^2ds}\Big ]\Big )^{\frac{1}{\kappa ^*\iota ^*}}\nonumber \\&\le \frac{K_1}{\Big (1-K_2\epsilon ^2\Big \Vert \int _0^T\pi _1(s)dW(s)\Big \Vert ^2_{BMO} \Big )^{\frac{1}{r}}}, \end{aligned}$$
(5.34)

with some \(r>1\), for all \(\epsilon \in [0,\epsilon _0]\). The constants \(K_1, K_2, r\) in (5.34) are independent of \(\epsilon \). In this step and in the sequel we emphasize that our constants are independent of \(\epsilon \), because in Steps 3–4 we prove that the approximation error is of order \({\mathcal {O}}(\epsilon ^2)\) in the sense of (1.1), and the constant K in (1.1) must be independent of \(\epsilon \).

We also improve the estimate (5.26). Let us choose any \(q>1\). Applying the Burkholder–Davis–Gundy inequality as in the proof of Lemma 5.1, we can show that

$$\begin{aligned} {{\mathbb {E}}}\Big [\sup _{t\in [0,T]}|X^{\pi ^*_0+\pi _1\epsilon }(t)|^{q}\Big ]\le K\Big (1+\epsilon ^r\Big \Vert \int _0^T\pi _1(s)dW(s)\Big \Vert ^{r}_{BMO}\Big ), \end{aligned}$$
(5.35)

with some \(r\ge 2\), for all \(\epsilon \in [0,\epsilon _0]\). The constants \(K, r\) in (5.35) are independent of \(\epsilon \). We remark that the constants in (5.34) depend on \(\Big \Vert \int _0^T\pi _1(s)dW(s)\Big \Vert _{BMO}\). However, the dependence of constants on the applied strategies will not be pointed out if this dependence is not needed for the proof.

Step 2 Let us introduce the functions:

$$\begin{aligned} Q^k(t,x,p,r)= & {} w^{k,\pi }(t,x,p,r)-w^k_0(t,x,p)-w^k_1(t,x,p,r)\epsilon ,\\ U^k(t,x,p)= & {} v^{k,\pi }(t,x,p)-v^k_0(t,x,p)-v^k_1(t,x,p)\epsilon . \end{aligned}$$

The functions quantify the approximation errors which we want to study. In this step we derive probabilistic representations for \((Q^{k})_{k=0}^n\) and \((U^{k})_{k=0}^n\). Since Lemma 5.2 holds and we assume that \((v^{k,\pi })_{k=0}^n\in {\mathcal {C}}([0,T]\times {{\mathbb {R}}}\times (0,\infty ))\cap {\mathcal {C}}^{1,2,2}([0,T)\times {{\mathbb {R}}}\times (0,\infty ))\), \((w^{k,\pi })_{k=0}^n\in {\mathcal {C}}([0,T]\times {{\mathbb {R}}}\times (0,\infty )\times {{\mathbb {R}}})\cap {\mathcal {C}}^{1,2,2,2}([0,T)\times {{\mathbb {R}}}\times (0,\infty )\times {{\mathbb {R}}})\), we can apply Proposition 5.3 and derive PDEs for \((v^{k,\pi })_{k=0}^n, (w^{k,\pi })_{k=0}^n\). Using the PDEs (5.8)–(5.11) for \((v^k_0, v^k_1, w^k_0, w^k_1)_{k=0}^n\), we can next derive the PDEs for \((Q^{k})_{k=0}^n, (U^{k})_{k=0}^n\):

$$\begin{aligned}&Q_t^k(t,x,p,r)+{\mathcal {L}}_k^{\pi }Q^k(t,x,p,r)+\Big (Q^{k-1}(t,x-\beta (p),p,r)-Q^k(t,x,p,r)\Big )k\lambda \nonumber \\&\quad +\big ({\mathcal {L}}_k^{\pi }-{\mathcal {L}}_k^{\pi _0^*}\big ) \big (w_0^k(t,x,p)+w_1^k(t,x,p,r)\epsilon \big )=0,\nonumber \\&Q^k(T,x,p,r)=-e^{-\varGamma (r)(x-k\eta (p))}\nonumber \\&\quad +e^{-\gamma _0(x-k\eta (p))}-\gamma _1(r)(x-k\eta (p))e^{-\gamma _0(x-k\eta (p))}\epsilon , \end{aligned}$$
(5.36)

and

$$\begin{aligned}&U_t^k(t,x,p)+{\mathcal {L}}_k^\pi U^k(t,x,p)-{\mathcal {M}}^\pi _kQ^k(t,x,p,x-kF(t,p))\nonumber \\&\quad +\,{\mathcal {L}}^\pi _kQ^k(t,x,p,x-kF(t,p)) +\,\Big (U^{k-1}(t,x-\beta (p),p)-U^k(t,x,p)\Big )k\lambda \nonumber \\&\quad +\,\Big (Q^{k-1}(t,x-\beta (p),p,x-kF(t,p))\nonumber \\&\quad -Q^{k-1}(t,x-\beta (p),p,x-\beta (p)-(k-1)F(t,p))\Big )k\lambda \nonumber \\&\quad +\,\big ({\mathcal {L}}^\pi _k-{\mathcal {L}}^{\pi ^*_0}_k\big )\big (v_0^k(t,x,p) +v_1^k(t,x,p)\epsilon \big )\nonumber \\&\quad -\big ({\mathcal {M}}^\pi _k -{\mathcal {L}}^\pi _k-{\mathcal {M}}^{\pi ^*_0}_k+{\mathcal {L}}^{\pi ^*_0}_k\big ) w^k_1(t,x,p,x-kF(t,p))\epsilon =0\nonumber \\&U^k(T,x,p)= -e^{-\varGamma (x-k\eta (p))(x-k\eta (p))} +\,e^{-\gamma _0(x-k\eta (p))}\nonumber \\&\quad -\gamma _1(x-k\eta (p))(x-k\eta (p))e^{-\gamma _0(x-k\eta (p))}\epsilon . \end{aligned}$$
(5.37)

Recalling Definition 5.1, the strategy \(\pi =\pi ^*_0+\pi _1\epsilon \) and using (5.32)–(5.33), we can show that

$$\begin{aligned}&{\big ({\mathcal {L}}_k^{\pi }-{\mathcal {L}}_k^{\pi _0^*}\big ) \big (w_0^k(t,x,p)+w_1^k(t,x,p,r)\epsilon \big )}\nonumber \\&\quad =\pi ^k(t,x,p)\big (w_{0,x}^k(t,x,p)+w_{1,x}^k(t,x,p,r)\epsilon \big )\mu \nonumber \\&\qquad -\,\pi _0^{k,*}(t,x,p)\big (w_{0,x}^k(t,x,p)+w_{1,x}^k(t,x,p,r)\epsilon \big ) \mu \nonumber \\&\qquad +\,\frac{1}{2}|\pi ^k(t,x,p)|^2\big (w_{0,xx}^k(t,x,p)+w_{1,xx}^k(t,x,p,r) \epsilon \big )\sigma ^2\nonumber \\&\qquad -\,\frac{1}{2}|\pi _0^{k,*}(t,x,p)|^2\big (w_{0,xx}^k(t,x,p)+w_{1,xx}^k(t,x,p,r) \epsilon \big )\sigma ^2\nonumber \\&\qquad +\,\pi ^k(t,x,p)\big (w_{0,px}^k(t,x,p)+w_{1,px}^k(t,x,p,r)\epsilon \big )bp\sigma \rho \nonumber \\&\qquad -\,\pi _0^{k,*}(t,x,p)\big (w_{0,px}^k(t,x,p)+w_{1,px}^k(t,x,p,r)\epsilon \big )bp\sigma \rho \nonumber \\&\qquad -\,\pi ^k_1(t,x,p)\Big (w_{0,x}^k(t,x,p)\mu \nonumber \\&\qquad +\,w_{0,px}^k(t,x,p)bp\sigma \rho +\pi _0^{k,*}(t,x,p)w_{0,xx}^k(t,x,p)\sigma ^2\Big ) \epsilon \nonumber \\&\quad =\pi ^k_1(t,x,p)\big (w_{1,x}^k(t,x,p,r)\mu +w_{1,px}^k(t,x,p,r)bp\sigma \rho \big )\epsilon ^2\nonumber \\&\qquad +\,\pi _0^{k,*}(t,x,p)\pi ^k_1(t,x,p)w_{1,xx}^k(t,x,p,r)\sigma ^2\epsilon ^2\nonumber \\&\qquad +\,\frac{1}{2}|\pi _1^k(t,x,p)|^2\big (w_{0,xx}^k(t,x,p)+w_{1,xx}^k(t,x,p,r) \epsilon \big )\sigma ^2\epsilon ^2\nonumber \\&\quad :=\varPsi ^{k,\pi _1}(t,x,p,r), \end{aligned}$$
(5.38)

and

$$\begin{aligned}&\big ({\mathcal {L}}^\pi _k-{\mathcal {L}}^{\pi ^*_0}_k\big )\big (v_0^k(t,x,p) +v_1^k(t,x,p)\epsilon \big )\nonumber \\&\quad -\big ({\mathcal {M}}^\pi _k-{\mathcal {L}}^\pi _k-{\mathcal {M}}^{\pi ^*_0}_k +{\mathcal {L}}^{\pi ^*_0}_k\big )w^k_1(t,x,p,x-kF(t,p))\epsilon \nonumber \\&\quad \ \quad \ \quad =\varPsi ^{k,\pi _1}(t,x,p,x-kF(t,p)). \end{aligned}$$
(5.39)
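
The second equality in (5.38) can be checked by substituting \(\pi ^k=\pi _0^{k,*}+\pi _1^k\epsilon \). The following is a sketch with all arguments suppressed, assuming (as the first equality in (5.38) indicates) that the operator \({\mathcal {L}}_k^\pi \) depends on the strategy only through the terms \(\pi f_x\mu +\frac{1}{2}|\pi |^2f_{xx}\sigma ^2+\pi f_{px}bp\sigma \rho \):

$$\begin{aligned} \pi ^k-\pi _0^{k,*}&=\pi _1^k\epsilon ,\qquad \tfrac{1}{2}|\pi ^k|^2-\tfrac{1}{2}|\pi _0^{k,*}|^2=\pi _0^{k,*}\pi _1^k\epsilon +\tfrac{1}{2}|\pi _1^k|^2\epsilon ^2,\\ \big ({\mathcal {L}}_k^{\pi }-{\mathcal {L}}_k^{\pi _0^*}\big )\big (w_0^k+w_1^k\epsilon \big )&=\pi _1^k\big (w_{0,x}^k\mu +w_{0,px}^kbp\sigma \rho +\pi _0^{k,*}w_{0,xx}^k\sigma ^2\big )\epsilon \\&\quad +\pi _1^k\big (w_{1,x}^k\mu +w_{1,px}^kbp\sigma \rho +\pi _0^{k,*}w_{1,xx}^k\sigma ^2\big )\epsilon ^2\\&\quad +\tfrac{1}{2}|\pi _1^k|^2\big (w_{0,xx}^k+w_{1,xx}^k\epsilon \big )\sigma ^2\epsilon ^2. \end{aligned}$$

The term of order \(\epsilon \) vanishes by (5.33), and the remaining terms coincide with \(\varPsi ^{k,\pi _1}\).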

We investigate the function \(\varPsi ^{k,\pi _1}\). We can calculate derivatives of \(w_0^k, w_1^k\) since the explicit solutions (5.13), (5.15) are available. By the properties of \((h^k)_{k=0}^n, (g^k)_{k=0}^n\) specified in Proposition 5.1, point 3 of Definition 4.1 and (A6), we can derive the estimates:

$$\begin{aligned}&|w^k_{1,x}(t,x,p,r)|\le Ke^{-\gamma _0 x}(1+|x|),\quad |w^k_{1,px}(t,x,p,r)|\le Ke^{-\gamma _0 x}(1+|x|),\\&|w^k_{1,xx}(t,x,p,r)|\le Ke^{-\gamma _0 x}(1+|x|),\quad |w^k_{0,xx}(t,x,p)|\le Ke^{-\gamma _0 x},\\&|\pi _0^{k,*}(t,x,p)|\le K(1+p), \end{aligned}$$

which lead us to the following estimate for the function \(\varPsi ^{k,\pi _1}\):

$$\begin{aligned}&{|\varPsi ^{k,\pi _1}(t,x,p,r)|}\nonumber \\&\quad \le K\Big (|\pi ^k_1(t,x,p)|^2e^{-\gamma _0x}+|\pi ^k_1(t,x,p)|e^{ -\gamma _0x}\big (1+|x|\big )\big (1+p\big )\Big )\epsilon ^2\nonumber \\&\qquad + K|\pi ^k_1(t,x,p)|^2e^{-\gamma _0x}\big (1+|x|\big )\epsilon ^3\nonumber \\&\quad \le Ke^{-\gamma _0x}\big (1+|x|\big )\big (1+p\big )^2\big (\epsilon ^2 +\epsilon ^3\big ). \end{aligned}$$
(5.40)

Applying Hölder’s inequality and using (5.34)–(5.35) together with

$$\begin{aligned} {{\mathbb {E}}}\big [\sup _{t\in [0,T]}|P(t)|^q\big ]<\infty ,\quad for \ all \ q\ge 1, \end{aligned}$$
(5.41)

we can deduce that

$$\begin{aligned}&{{\mathbb {E}}}\Big [\int _0^T|\varPsi ^{J(s),\pi _1}(s,X^\pi (s),P(s),r)|ds\Big ]<\infty ,\nonumber \\&{{\mathbb {E}}}\Big [\int _0^T|\varPsi ^{J(s),\pi _1}(s,X^\pi (s),P(s),R(s))|ds\Big ]<\infty . \end{aligned}$$
(5.42)

Finally, using the above results, Proposition 5.1, Lemmas 5.1–5.2 and the assumption that \((v^{k,\pi })_{k=0}^n\in {\mathcal {C}}([0,T]\times {{\mathbb {R}}}\times (0,\infty ))\cap {\mathcal {C}}^{1,2,2}([0,T)\times {{\mathbb {R}}}\times (0,\infty )), \ (w^{k,\pi })_{k=0}^n\in {\mathcal {C}}([0,T]\times {{\mathbb {R}}}\times (0,\infty )\times {{\mathbb {R}}})\cap {\mathcal {C}}^{1,2,2,2}([0,T)\times {{\mathbb {R}}}\times (0,\infty )\times {{\mathbb {R}}})\), we can apply Proposition 5.3 and establish probabilistic representations for the functions \((Q^k)_{k=0}^n, (U^k)_{k=0}^n\). We derive the key representation for the approximation error:

$$\begin{aligned}&{U^k(t,x,p)={{\mathbb {E}}}_{t,x,p,k}\Big [-e^{-\varGamma (x-k\eta (p))\big (X^\pi (T)-J(T)\eta (P(T))\big )}}\nonumber \\&\qquad +e^{-\gamma _0\big (X^\pi (T)-J(T)\eta (P(T))\big )}\nonumber \\&\qquad -\gamma _1(x-k\eta (p))\big (X^\pi (T)-J(T)\eta (P(T))\big )e^{-\gamma _0\big (X^\pi (T)-J(T)\eta (P(T))\big )}\epsilon \Big ]\nonumber \\&\qquad +{{\mathbb {E}}}_{t,x,p,k}\Big [\int _t^T\varPsi ^{J(s),\pi _1}(s,X^\pi (s),P(s),x-kF(t,p))ds\Big ]\nonumber \\&\quad :=U^k_1(t,x,p)+U^k_2(t,x,p), \end{aligned}$$
(5.43)

where \(U^k_1(t,x,p)\) denotes the first expected value in (5.43), and \(U_2^k(t,x,p)\) denotes the second expected value in (5.43). In Steps 3–4 below we derive estimates for the functions \(U_1^k(t,x,p), U_2^k(t,x,p)\), and for the approximation error \(U^k(t,x,p)\).

Step 3 We fix \(t\in [0,T)\) and \(\delta \in [0,T-t]\). From now on, we consider \(\pi =\pi _0^*+\pi _1^\delta \epsilon \), where \(\pi _1^\delta \) is defined in (4.6). We choose \(\pi _1\) so that \(\pi _0^*+\pi ^\delta _1\epsilon \in {\mathcal {B}}\) for any \(\delta \in [0,T-t]\). By Proposition 5.2 and point 3 from Definition 4.1, we note that

$$\begin{aligned}&{\Big \Vert \int _t^T\pi _1^\delta (s)dW(s)\Big \Vert ^2_{BMO}}\nonumber \\&\quad \le 2\Big (\Big \Vert \int _t^T\pi _1^*(s)dW(s)\Big \Vert ^2_{BMO} +\Big \Vert \int _t^T\pi _1(s)dW(s)\Big \Vert ^2_{BMO}\Big )\le K, \end{aligned}$$
(5.44)

for all \(\delta \in [0,T-t]\). The constant K in (5.44) is independent of \(\delta \).

We study the first expected value in (5.43). Let \(\gamma _1:=\gamma _1(x-k\eta (p))\). We investigate the random variable

$$\begin{aligned}&{-e^{-(\gamma _0+\gamma _1\epsilon )\big (X^\pi (T)-J(T)\eta (P(T))\big )} +e^{-\gamma _0\big (X^\pi (T)-J(T)\eta (P(T))\big )}}\nonumber \\&\quad -\gamma _1\big (X^\pi (T)-J(T)\eta (P(T))\big )e^{-\gamma _0\big (X^\pi (T)-J(T) \eta (P(T))\big )}\epsilon \nonumber \\&\quad =-\gamma _1^2\int _0^1\big |X^\pi (T)-J(T)\eta (P(T))\big |^2e^{-\gamma _0\big (X^\pi (T) -J(T)\eta (P(T))\big )}\nonumber \\&\quad \times \,e^{-\gamma _1\epsilon z\big (X^\pi (T)-J(T)\eta (P(T))\big )}(1-z)dz\epsilon ^2. \end{aligned}$$
(5.45)
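
The identity (5.45) is Taylor's formula with the integral form of the remainder. As a sketch, set \(A=X^\pi (T)-J(T)\eta (P(T))\) and \(f(\epsilon )=-e^{-(\gamma _0+\gamma _1\epsilon )A}\) (both shorthands are introduced here for illustration only):

$$\begin{aligned} f'(\epsilon )&=\gamma _1Ae^{-(\gamma _0+\gamma _1\epsilon )A},\qquad f''(\epsilon )=-\gamma _1^2A^2e^{-(\gamma _0+\gamma _1\epsilon )A},\\ f(\epsilon )-f(0)-f'(0)\epsilon&=\epsilon ^2\int _0^1f''(\epsilon z)(1-z)dz\\&=-\gamma _1^2\int _0^1A^2e^{-\gamma _0A}e^{-\gamma _1\epsilon zA}(1-z)dz\,\epsilon ^2. \end{aligned}$$

Here \(f(0)=-e^{-\gamma _0A}\) and \(f'(0)\epsilon =\gamma _1Ae^{-\gamma _0A}\epsilon \), which reproduces the left hand side of (5.45).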

Let us choose \(z\in [0,1]\). As in the proof of Proposition 5.2 and Step 1d of this proof, we can observe that

$$\begin{aligned}&{e^{-(\gamma _0+\gamma _1\epsilon z)\big (X^\pi (T)-J(T)\eta (P(T))\big )}} = e^{-(\gamma _0+\gamma _1\epsilon z)X^{\pi }(t)+\gamma _0Y(t)} {\mathcal {M}}(T)M(T)\\&\quad \times \,e^{\gamma _1\epsilon z\big (\int _t^TJ(s-)\alpha (P(s))ds-\int _t^T\beta (P(s))dJ(s)+J(T)\eta (P(T))\big )}, \end{aligned}$$

where we introduce the strategy

$$\begin{aligned} {\tilde{\pi }}_z(s)=\gamma _1z\pi _0^*(s)+(\gamma _0+\gamma _1\epsilon z)\pi _1^\delta (s),\quad t\le s\le T, \end{aligned}$$

and the exponential martingales

$$\begin{aligned} {\mathcal {M}}(s)= & {} e^{-\gamma _0\big (X^{\pi _0^*}(s)-X^{\pi _0^*}(t) -(Y(s)-Y(t))\big )},\quad t\le s\le T,\\ M(s)= & {} e^{-\epsilon \big (\int _t^s{\tilde{\pi }}_z(u)\mu du+\int _t^s{\tilde{\pi }}_z(u)\sigma dW(u)\big )}\quad t\le s\le T. \end{aligned}$$

By point 3 from Definition 4.1, Proposition 5.2, the properties (5.44) and (A6), we have the estimate

$$\begin{aligned} \Big \Vert \int _t^T{\tilde{\pi }}_z(s)dW(s)\Big \Vert ^2_{BMO}\le K, \end{aligned}$$
(5.46)

for all \(z\in [0,1], \ \epsilon \in [0,\epsilon _0]\) and \(\delta \in [0,T-t]\). Moreover, the constant K is independent of \((z,\epsilon ,\delta )\).

We choose \(q=1\), or a sufficiently small \(q>1\), and a sufficiently small \(\kappa >1\). Using (5.44), (5.46) and the same arguments which led us to (5.20)–(5.23), (5.34)–(5.35), we can deduce the estimate

$$\begin{aligned}&{{\mathbb {E}}}_{t,x,p,k}\Big [\big |X^\pi (T)-J(T)\eta (P(T))\big |^{2q}e^{-q(\gamma _0+\gamma _1\epsilon z)\big (X^\pi (T)-J(T)\eta (P(T))\big )}\Big ]\nonumber \\&\quad \le \Big ({{\mathbb {E}}}_{t,x,p,k}\Big [\big |X^\pi (T)-J(T)\eta (P(T))\big |^{2q\kappa ^*}\Big ]\Big )^{\frac{1}{\kappa ^*}}\Big ({{\mathbb {E}}}\Big [e^{-q\kappa (\gamma _0+\gamma _1\epsilon z) \big (X^\pi (T)-J(T)\eta (P(T))\big )}\Big ]\Big )^{\frac{1}{\kappa }}\nonumber \\&\quad \le \frac{K_1\Big (1+\epsilon ^{r_1}||\int _t^T\pi _1^\delta (s)dW(s)||^{r_1}_{BMO} \Big )^{\frac{1}{r_2}}}{\Big (1-K_2\epsilon ^2||\int _t^T{\tilde{\pi }}_z(s)dW(s)||^2_{BMO} \Big )^{\frac{1}{r_3}}}\le K, \end{aligned}$$
(5.47)

with some \(r_1\ge 2, r_2>1, r_3>1\), for all \(z\in [0,1], \epsilon \in [0,\epsilon _0], \delta \in [0,T-t]\). The final constant K in (5.47) is independent of \((z,\epsilon ,\delta )\). Let us remark that in Step 1d we concluded that the constants in (5.34) depend on the investment strategy. However, due to (5.44), (5.46), we can indeed conclude that the constants in (5.47) are independent of \((\epsilon ,\delta )\), but they depend on \(\pi _1\) used for \(\pi _1^\delta \).
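
The first inequality in (5.47) is Hölder's inequality with the conjugate exponents \(\kappa ^*\) and \(\kappa \). As a sketch, for generic random variables \(\xi , \zeta \) (placeholders introduced here for illustration only):

$$\begin{aligned} {{\mathbb {E}}}\big [|\xi \zeta |\big ]\le \Big ({{\mathbb {E}}}\big [|\xi |^{\kappa ^*}\big ]\Big )^{\frac{1}{\kappa ^*}}\Big ({{\mathbb {E}}}\big [|\zeta |^{\kappa }\big ]\Big )^{\frac{1}{\kappa }},\qquad \frac{1}{\kappa }+\frac{1}{\kappa ^*}=1, \end{aligned}$$

applied with \(\xi =|A|^{2q}\) and \(\zeta =e^{-q(\gamma _0+\gamma _1\epsilon z)A}\), where \(A=X^\pi (T)-J(T)\eta (P(T))\). The two factors are then, presumably, controlled by a moment estimate of the type (5.35) and an exponential-moment estimate of the type (5.34), respectively.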

By Fubini’s theorem and (5.45)–(5.47), we can write

$$\begin{aligned} U_1^k(t,x,p)= & {} -\gamma _1^2\int _0^1{{\mathbb {E}}}_{t,x,p,k}\Big [\big |X^{\pi _0^*+\pi _1^\delta \epsilon }(T)-J(T)\eta (P(T))\big |^{2}\nonumber \\&\quad \cdot e^{-(\gamma _0+\gamma _1\epsilon z)\big (X^{\pi _0^*+\pi _1^\delta \epsilon }(T)-J(T)\eta (P(T))\big )}\Big ](1-z)dz\epsilon ^2. \end{aligned}$$
(5.48)

From (5.47) we conclude that

$$\begin{aligned} |U_1^k(t,x,p)|\le K\epsilon ^2, \end{aligned}$$
(5.49)

for all \(\epsilon \in [0,\epsilon _0], \delta \in [0,T-t]\), and the constant K is independent of \((\epsilon ,\delta )\). By Step 1c, Lebesgue’s dominated convergence theorem and uniform integrability (justified with (5.47)), we can prove the limit:

$$\begin{aligned}&{\lim _{\epsilon \rightarrow 0}\frac{U_1^k(t,x,p)}{\epsilon ^2}}\nonumber \\&\quad =-\frac{1}{2}\gamma _1^2{{\mathbb {E}}}_{t,x,p,k}\Big [\big |X^{\pi ^*_0}(T)-J(T)\eta (P(T))\big |^2e^{-\gamma _0\big (X^{\pi ^*_0}(T)-J(T)\eta (P(T))\big )}\Big ],\quad \nonumber \\ \end{aligned}$$
(5.50)

where the right hand side of (5.50) only depends on \(\pi ^*_0\), and is independent of \(\pi _1^\delta \).
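
The factor \(\frac{1}{2}\) in (5.50) comes from the weight \((1-z)\) in (5.48). As a sketch, suppose the passage to the limit under the integral in (5.48) is justified as described above, and denote by \(E_0\) (a shorthand introduced here) the expected value on the right hand side of (5.50), which does not depend on z. Then

$$\begin{aligned} \lim _{\epsilon \rightarrow 0}\frac{U_1^k(t,x,p)}{\epsilon ^2}=-\gamma _1^2E_0\int _0^1(1-z)dz=-\frac{1}{2}\gamma _1^2E_0. \end{aligned}$$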

Step 4 We study the second expected value in (5.43). Recalling (5.38), we deal with

$$\begin{aligned} \varPsi ^{k,\pi _1}(t,x,p,r)=\varPsi _1^{k,\pi _1}(t,x,p,r)\epsilon ^2 +\varPsi _2^{k,\pi _1}(t,x,p,r)\epsilon ^3, \end{aligned}$$
(5.51)

where

$$\begin{aligned}&{\varPsi _1^{k,\pi _1}(t,x,p,r)=\pi ^k_1(t,x,p)\Big (w_{1,x}^k(t,x,p,r) \mu +w_{1,px}^k(t,x,p,r)bp\sigma \rho }\\&\quad +\pi _0^{k,*}(t,x,p)w_{1,xx}^k(t,x,p,r)\sigma ^2\Big ) +\frac{1}{2}|\pi _1^k(t,x,p)|^2w_{0,xx}^k(t,x,p)\sigma ^2,\\&\quad {\varPsi _2^{k,\pi _1}(t,x,p,r)=\frac{1}{2}| \pi _1^k(t,x,p)|^2w_{1,xx}^k(t,x,p,r)\sigma ^2.} \end{aligned}$$

Let us choose \(q=1\), or a sufficiently small \(q>1\). Using the upper bound (5.40), the estimates (5.34), (5.35), (5.41) and (5.44), we can deduce the estimate:

$$\begin{aligned}&{{{\mathbb {E}}}_{t,x,p,k}\Big [\sup _{s\in [t,T]}\big |\varPsi _1^{J(s),\pi _1^\delta } (s,X^{\pi _0^*+\pi _1^\delta \epsilon }(s),P(s),x-kF(t,p))\big |^q\Big ]}\nonumber \\&\quad \le \frac{K_1\Big (1+\epsilon ^{r_1}||\int _t^T\pi _1^\delta (s)dW(s)||^{r_1}_{BMO} \Big )^{\frac{1}{r_2}}}{\Big (1-K_2\epsilon ^2||\int _t^T\pi _1^\delta (s)dW(s)||^2_{BMO} \Big )^{\frac{1}{r_3}}}\le K, \end{aligned}$$
(5.52)

with some \(r_1\ge 2, r_2>1, r_3>1\), for all \(\epsilon \in [0,\epsilon _0], \delta \in [0,T-t]\). The constant K is independent of \((\epsilon ,\delta )\). We have the same estimate for \(\varPsi ^{k,\pi _1^\delta }_2\). Consequently, we can conclude that

$$\begin{aligned} |U_2^k(t,x,p)|\le K\epsilon ^2, \end{aligned}$$
(5.53)

for all \(\epsilon \in [0,\epsilon _0], \delta \in [0,T-t]\), and the constant K is independent of \((\epsilon ,\delta )\). By (5.51)–(5.52), we can also calculate the limit:

$$\begin{aligned}&{\lim _{\epsilon \rightarrow 0}\frac{U_2^k(t,x,p)}{\epsilon ^2}}\nonumber \\&\quad ={{\mathbb {E}}}_{t,x,p,k}\Big [\int _t^T\varPsi _1^{J(s),\pi _1^\delta }(s,X^{\pi _0^*}(s),P(s),x -kF(t,p))ds\Big ],\quad \end{aligned}$$
(5.54)

where we use the property that \((w^{k}_0)_{k=0}^n\in {\mathcal {C}}([0,T]\times {{\mathbb {R}}}\times (0,\infty )\times {{\mathbb {R}}})\cap {\mathcal {C}}^{1,2,2,2}([0,T)\times {{\mathbb {R}}}\times (0,\infty )\times {{\mathbb {R}}}), (w^{k}_1)_{k=0}^n\in {\mathcal {C}}([0,T]\times {{\mathbb {R}}}\times (0,\infty )\times {{\mathbb {R}}})\cap {\mathcal {C}}^{1,2,2,2}([0,T)\times {{\mathbb {R}}}\times (0,\infty )\times {{\mathbb {R}}})\), point 2 from Definition 4.1 and similar arguments which led us to the limit (5.50). We note that the right hand side of (5.54) depends on \(\pi ^*_0\) and \(\pi _1\), since \(\varPsi _1^{k,\pi _1^\delta }\) depends on \(\pi _1^\delta \).
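
The bound (5.53) can be traced as follows. This is a sketch, assuming that the estimate (5.52) with \(q=1\) holds for both \(\varPsi _1\) and \(\varPsi _2\) with a common constant K, and writing \(\varPsi _i(s)\) as a shorthand (introduced here) for \(\varPsi _i^{J(s),\pi _1^\delta }(s,X^{\pi _0^*+\pi _1^\delta \epsilon }(s),P(s),x-kF(t,p))\), \(i=1,2\):

$$\begin{aligned} |U_2^k(t,x,p)|&\le {{\mathbb {E}}}_{t,x,p,k}\Big [\int _t^T|\varPsi _1(s)|ds\Big ]\epsilon ^2+{{\mathbb {E}}}_{t,x,p,k}\Big [\int _t^T|\varPsi _2(s)|ds\Big ]\epsilon ^3\\&\le (T-t)K\big (\epsilon ^2+\epsilon ^3\big )\le (T-t)K(1+\epsilon _0)\epsilon ^2. \end{aligned}$$

The second inequality bounds each time integral by \((T-t)\) times the expected supremum over \([t,T]\), so the resulting constant does not depend on \((\epsilon ,\delta )\).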

Step 5 Assertion (iii) follows from (1.1), (5.43), (5.49), (5.53) and \(\delta =T-t\). We prove assertion (iv). Recalling (5.50), (5.54) and the definition of \(\pi _1^\delta \), we have to study the limit:

$$\begin{aligned}&\lim _{\delta \rightarrow 0} \frac{1}{\delta }\Big ({{\mathbb {E}}}_{t,x,p,k}\Big [\int _t^{t+\delta }\varPsi _1^{J(s),\pi _1^*} (s,X^{\pi _0^*}(s),P(s),x-kF(t,p))ds\\&\quad -\int _t^{t+\delta }\varPsi _1^{J(s),\pi _1}(s,X^{\pi _0^*}(s),P(s),x-kF(t,p))ds\Big ]\Big ). \end{aligned}$$

By (5.52), Lebesgue’s dominated convergence theorem and Lebesgue’s differentiation theorem, we derive

$$\begin{aligned}&\lim _{\delta \rightarrow 0} \frac{1}{\delta }\Big ({{\mathbb {E}}}_{t,x,p,k}\Big [\int _t^{t +\delta }\varPsi _1^{J(s),\pi _1^*}(s,X^{\pi _0^*}(s),P(s),x-kF(t,p))ds\nonumber \\&\quad -\int _t^{t+\delta }\varPsi _1^{J(s),\pi _1}(s,X^{\pi _0^*}(s),P(s),x-kF(t,p))ds \Big ]\Big )\nonumber \\&\quad \ \quad =\varPsi _1^{k,\pi ^*_1}(t,x,p,x-kF(t,p))-\varPsi _1^{k,\pi _1}(t,x,p,x-kF(t,p)). \end{aligned}$$
(5.55)

Since \(w^k_{0,xx}(t,x,p)<0\) by (5.13), we can find \(\pi ^k_1\) which maximizes \(\varPsi ^{k,\pi _1}_1(t,x,p,x-kF(t,p))\). The optimal strategy takes the form

$$\begin{aligned}&{{\tilde{\pi }}_1^{k,*}(t,x,p)=-\pi _0^{k,*}(t,x,p)\frac{w_{1,xx}^k (t,x,p,x-kF(t,p))}{w_{0,xx}^k(t,x,p)}}\\&\quad -\frac{w_{1,x}^k(t,x,p,x-kF(t,p))\mu +w_{1,px}^k(t,x,p,x-kF(t,p))bp \sigma \rho }{w_{0,xx}^k(t,x,p)\sigma ^2}. \end{aligned}$$

Using (5.6) and (5.32), we can confirm (5.7). Consequently, the optimal \({\tilde{\pi }}_1^{k,*}\), which maximizes \(\varPsi ^{k,\pi _1}_1(t,x,p,x-kF(t,p))\), is given by (3.10) and coincides with \(\pi _1^{k,*}\). Hence, we conclude that

$$\begin{aligned} \varPsi _1^{k,\pi ^*_1}(t,x,p,x-kF(t,p))-\varPsi _1^{k,\pi _1}(t,x,p,x-kF(t,p))\ge 0, \end{aligned}$$

and the equality holds only for \(\pi _1^k=\pi _1^{k,*}\). Since the limit (5.55) holds, the assertion (iv) is proved.
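
The maximization in this step is that of a concave quadratic function. As a sketch, introduce the shorthands (for illustration only) \(a=w_{1,x}^k\mu +w_{1,px}^kbp\sigma \rho +\pi _0^{k,*}w_{1,xx}^k\sigma ^2\) and \(c=w_{0,xx}^k\sigma ^2<0\), with \(w_1^k\) evaluated at \((t,x,p,x-kF(t,p))\) and \(w_0^k\) at \((t,x,p)\), and assume \(\sigma \ne 0\). Then \(\varPsi _1^{k,\pi _1}=a\pi _1^k+\frac{1}{2}c|\pi _1^k|^2\), and

$$\begin{aligned} {\tilde{\pi }}_1^{k,*}&=-\frac{a}{c},\\ \varPsi _1^{k,{\tilde{\pi }}_1^*}-\varPsi _1^{k,\pi _1}&=-\frac{a^2}{2c}-a\pi _1^k-\frac{1}{2}c|\pi _1^k|^2=-\frac{c}{2}\Big (\pi _1^k+\frac{a}{c}\Big )^2\ge 0. \end{aligned}$$

The difference vanishes exactly when \(\pi _1^k=-a/c\), which is consistent with the uniqueness statement above.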

Step 6 We prove assertions (i)–(ii). We consider the PDEs (5.8) and (5.10) where we replace \(\pi ^{*}_0\) with \(\pi _0\in {\mathcal {B}}\). By Remark 4.1.c, the objective functions \((V^{k,\pi _0})_{k=0}^n\) satisfy the PDEs:

$$\begin{aligned}&V^{k,\pi _0}_{t}(t,x,p)+{\mathcal {L}}_{k}^{\pi _0^{k}} V^{k,\pi _0}(t,x,p)\\&\quad +\Big (V^{k-1,\pi _0}(t,x-\beta (p),p)-V^{k,\pi _0}(t,x,p)\Big )k\lambda =0,\\&\quad \ \quad \ \quad (t,x,p)\in [0,T)\times {{\mathbb {R}}}\times (0,\infty ),\\&V^{k,\pi _0}(T,x,p)=-e^{-\gamma _0(x-k\eta (p))},\quad (x,p)\in {{\mathbb {R}}}\times (0,\infty ). \end{aligned}$$

We proceed in the same way as in Steps 1–4, and we can establish the zeroth-order expansion:

$$\begin{aligned} v^{k,\pi _0}(t,x,p)=V^{k,\pi _0}(t,x,p)+{\mathcal {O}}(\epsilon ),\quad \epsilon \rightarrow 0. \end{aligned}$$

By Remark 4.1.b, the strategy \(\pi _0^*\) is the optimal strategy for the time-consistent exponential utility maximization problem. Consequently, \(V^{k,\pi _0}(t,x,p)\le V^{k,\pi _0^*}(t,x,p)\) and the equality holds only for \(\pi _0=\pi _0^*\). We can now show that

$$\begin{aligned}&{\lim _{\epsilon \rightarrow 0}\big (v^{k,\pi ^*_0}(t,x,p)-v^{k,\pi _0}(t,x,p)\big )}\\&\quad =\lim _{\epsilon \rightarrow 0}\big (V^{k,\pi ^*_0}(t,x,p)-V^{k,\pi _0}(t,x,p)+{\mathcal {O}}(\epsilon )\big )\ge \lim _{\epsilon \rightarrow 0}{\mathcal {O}}(\epsilon )=0, \end{aligned}$$

and the equality holds only for \(\pi _0=\pi _0^*\). \(\square \)