1 Introduction

In this paper we consider the question of optimal entry into a plan of irreversible investment with a cost function which is non convex with respect to the control variable. The irreversible investment problem is that of [7], in which the investor commits to delivering a unit of electricity to a consumer at a future random time \(\Theta \) and may purchase and store electricity in real time at the stochastic (and potentially negative) spot price \((X_t)_{t \ge 0}\). In the optimal entry problem considered here, the consumer is willing to offer a single fixed initial payment \(P_0\) in return for this commitment and the investor must choose a stopping time \(\tau \) at which to accept the initial premium and enter the contract. If \(\Theta \le \tau \) then the investor’s opportunity is lost and in this case no cashflows occur. If \(\tau <\Theta \) then the inventory must be full at the time \(\Theta \) of demand, any deficit being met by a less efficient charging method whose additional cost is represented by a convex factor \(\Phi \) of the undersupply. The investor seeks to minimise the total expected costs, net of the initial premium \(P_0\), by choosing \(\tau \) optimally and by optimally filling the inventory from time \(\tau \) onwards.

Economic problems of optimal entry and exit under uncertain market prices have attracted significant interest. In the simplest formulation the timing of entry and/or exit is the only decision to be made and the planning horizon is infinite: see for example [8, 19], in which the market price is a geometric Brownian motion (GBM), and related models in [9, 22]. An extension of this problem to multiple types of economic activity is considered in [4] and solved using stochastic calculus. In addition to the choice of entry/exit time, the decision problem may also depend on another control variable representing for instance investment or production capacity. For example in [10] the rate of production is modelled as a progressively measurable process whereas in [13] the production capacity is a process of bounded variation. In this case the problem is usually solved by applying the dynamic programming principle to obtain an associated Hamilton–Jacobi–Bellman (HJB) equation. If the planning horizon is finite then the optimal stopping and control strategies are time-dependent and given by suitable curves, see for example [6].

Typically, although not universally, the costs in the aforementioned problems are assumed to be convex with respect to the control variable. In addition to being reasonable in a wide range of problems, this assumption usually simplifies the mathematical analysis. In the present problem the underlying commodity is electricity, for which negative prices have been observed in several markets (see, e.g., [12, 18]). The spot price is modelled by an Ornstein–Uhlenbeck process which is mean reverting and may take negative values and, as shown in [7], this makes our control problem neither convex nor concave: to date such problems have received relatively little attention in the literature. In our setting the control variable represents the cumulative amount of electricity purchased by the investor in the spot market for storage. This control is assumed to be monotone, so that the sale of electricity back to the market is not possible, and also bounded to reflect the fact that the inventory used for storage has finite capacity. The investment problem falls into the class of singular stochastic control (SSC) problems (see [1, 15, 16], among others).

Borrowing ideas from [13], we begin by decoupling the control (investment) problem from the stopping (entry) problem. The value function of this mixed stopping-then-control problem is shown to coincide with that of an appropriate optimal stopping problem over an infinite time-horizon whose gain function is the value function of the optimal investment problem with fixed entry time equal to zero. Unlike the situation in [13], however, the gain function in the present paper is a function of two variables without an explicit representation. Indeed [7] identifies three regimes for the gain function, depending on the problem parameters, only two of which are solved rigorously: a reflecting regime, in which the control may be singularly continuous, and a repelling regime, in which the control is purely discontinuous. We therefore only address these two cases in this paper and leave the remaining open case for future work.

The optimal entry policies obtained below depend on the spot price and the inventory level and are described by suitable curves. On the one hand, for the reflecting case we prove that the optimal entry time is of a single threshold type as in [10, 13]. On the other hand, the repelling case is interesting since it gives either a single threshold strategy or, alternatively, a complex optimal entry policy such that for any fixed value of the inventory level, the continuation region may be disconnected.

The paper is organised as follows. In Sect. 2 we set up the mixed irreversible investment-optimal entry problem, whose two-step formulation is then obtained in Sect. 3. Section 4 is devoted to the analysis of the optimal entry decision problem, with the repelling case studied separately in Sect. 5. In Sect. 5.2 we provide a discussion of the complex optimal entry policy in this case, giving a possible economic interpretation.

2 Problem formulation

We begin by recalling the optimal investment problem introduced in [7]. Let \((\Omega ,\mathcal {A},\mathsf P)\) be a complete probability space, on which is defined a one-dimensional standard Brownian motion \((B_t)_{t\ge 0}\). We denote by \(\mathbb {F}:=(\mathcal{F}_t)_{t\ge 0}\) the filtration generated by \((B_t)_{t\ge 0}\) and augmented by \(\mathsf P\)-null sets. As in [7], the spot price of electricity X follows a standard time-homogeneous Ornstein–Uhlenbeck process with positive volatility \(\sigma \), positive adjustment rate \(\theta \) and positive asymptotic (or equilibrium) value \(\mu \); i.e., \(X^x\) is the unique strong solution of

$$\begin{aligned} dX^x_t= \theta (\mu -X_t^x)dt + \sigma dB_t, \quad \text {for } t>0,\,\,\,\text { with }X^x_0=x\in \mathbb {R}. \end{aligned}$$
(2.1)

Note that this model allows negative prices, which is consistent with the requirement to balance supply and demand in real time in electrical power systems and also consistent with the observed prices in several electricity spot markets (see, e.g., [12, 18]).
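Since the OU dynamics (2.1) admit an exact Gaussian transition density, sample paths of the spot price can be simulated with no discretisation bias. The sketch below is purely illustrative and not part of the paper; the parameter values \(\mu=1\), \(\theta=1\), \(\sigma=3\) are borrowed from the Fig. 1 example, and the output shows that negative prices occur readily under this model.

```python
import numpy as np

def simulate_ou(x0, mu, theta, sigma, T, n, rng):
    """Exact simulation of dX = theta*(mu - X)dt + sigma dB on [0, T]."""
    dt = T / n
    a = np.exp(-theta * dt)                           # one-step mean-reversion factor
    sd = sigma * np.sqrt((1 - a ** 2) / (2 * theta))  # exact one-step std deviation
    noise = rng.standard_normal(n)
    x = np.empty(n + 1)
    x[0] = x0
    for i in range(n):
        x[i + 1] = mu + (x[i] - mu) * a + sd * noise[i]
    return x

rng = np.random.default_rng(0)
# parameter values taken from the Fig. 1 example: mu = 1, theta = 1, sigma = 3
path = simulate_ou(x0=0.0, mu=1.0, theta=1.0, sigma=3.0, T=50.0, n=50_000, rng=rng)
```

Over a long horizon the time average of the path settles near the equilibrium value \(\mu\), while the stationary standard deviation \(\sigma/\sqrt{2\theta}\approx 2.12\) makes excursions below zero frequent.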

We denote by \(\Theta \) the random time of a consumer’s demand for electricity. This is modelled as an \(\mathcal {A}\)-measurable positive random variable independent of \(\mathbb {F}\) and distributed according to an exponential law with parameter \(\lambda >0\), so that effectively the time of demand is completely unpredictable. Note also that since \(\Theta \) is independent of \(\mathbb {F}\), the Brownian motion \((B_t)_{t\ge 0}\) remains a Brownian motion in the enlarged filtration \(\mathbb {G}:=(\mathcal {G}_t)_{t\ge 0}\), with \(\mathcal {G}_t:=\mathcal {F}_t \vee \sigma (\{\Theta \le s\}:\, s \le t)\), under which \(\Theta \) becomes a stopping time (see, e.g., Chapter 5, Section 6 of [14]).

We will denote by \(\tau \) any element of \(\mathcal {T}\), the set of all \((\mathcal {F}_t)\)-stopping times. At any \(\tau <\Theta \) the investor may enter the contract by accepting the initial premium \(P_0\) and committing to deliver a unit of electricity at the time \(\Theta \). At any time during \([\tau , \Theta )\) electricity may be purchased in the spot market and stored, thus increasing the total inventory \(C^{c,\nu } = (C^{c,\nu }_t)_{t \ge 0}\), which is defined as

$$\begin{aligned} C^{c,\nu }_t:=c + \nu _t,\quad t\ge 0. \end{aligned}$$
(2.2)

Here \(c\in [0,1]\) denotes the inventory at time zero and \(\nu _t\) is the cumulative amount of electricity purchased up to time t. We specify the (convex) set of admissible investment strategies by requiring that \(\nu \in \mathcal {S}^c_{\tau }\), where

$$\begin{aligned} \mathcal {S}^c_{\tau }:=\big \{\nu \,:\,\nu \text { is }(\mathcal {F}_t)\text {-adapted, nondecreasing and left-continuous, with }\nu _t=0\text { for }t\le \tau \text { and }c+\nu _t\le 1\text { for all }t\ge 0\big \}. \end{aligned}$$

The amount of energy in the inventory is bounded above by 1 to reflect the investor’s limited ability to store. The left continuity of \(\nu \) ensures that any electricity purchased at time \(\Theta \) is irrelevant for the optimisation. The requirement that \(\nu \) be \((\mathcal {F}_t)\)-adapted guarantees that all investment decisions are taken only on the basis of the price information available up to time t. The optimisation problem is given by

$$\begin{aligned} \inf _{\tau \ge 0,\, \nu \in \mathcal {S}^c_{\tau }}\mathsf E\Bigg [\Big (\int _{\tau }^{\Theta }{X^x_t}d\nu _t+X^{x}_\Theta \Phi (C^{c,\nu }_\Theta ) - P_0\Big )\mathbbm {1}_{\{\tau < \Theta \}}\Bigg ]. \end{aligned}$$
(2.3)

Here the first term represents expenditure in the spot market and the second is a penalty function: if the inventory is not full at time \(\Theta \) then it is filled by a less efficient method, so that the terminal spot price is weighted by a strictly convex function \(\Phi \). We make the following standing assumption:

Assumption 2.1

\(\Phi : \mathbb {R} \rightarrow \mathbb {R}_+\) lies in \(C^2(\mathbb {R})\) and is decreasing and strictly convex in [0, 1] with \(\Phi (1)=0\).
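Assumption 2.1 is satisfied, for instance, by the quadratic penalty \(\Phi (c) = 2.2(1-c) + 8(1-c)^2\) used to generate Fig. 1 below; the following quick numerical check of the required properties is an illustration only.

```python
import numpy as np

def Phi(c):
    """Candidate penalty from the Fig. 1 example: 2.2*(1-c) + 8*(1-c)**2."""
    return 2.2 * (1.0 - c) + 8.0 * (1.0 - c) ** 2

def dPhi(c):   # Phi'(c)
    return -2.2 - 16.0 * (1.0 - c)

def d2Phi(c):  # Phi''(c), constant for this quadratic choice
    return 16.0 + 0.0 * c

cs = np.linspace(0.0, 1.0, 101)
# Phi(1) = 0, Phi' < 0 and Phi'' > 0 hold on all of [0, 1]
```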

For simplicity we assume that costs are discounted at the rate \(r=0\). This involves no loss of generality since the independent random time of demand performs an effective discounting, as follows. Recalling that \(\Theta \) is independent of \(\mathbb {F}\) and distributed according to an exponential law with parameter \(\lambda >0\), Fubini’s theorem gives that (2.3) may be rewritten as

$$\begin{aligned} V(x,c):=\inf _{\tau \ge 0,\, \nu \in \mathcal {S}^c_{\tau }}\mathcal{J}_{x,c}(\tau ,\nu ) \end{aligned}$$
(2.4)

with

$$\begin{aligned} \mathcal{J}_{x,c}(\tau ,\nu ):=\mathsf E\left[ \int ^{\infty }_\tau {e^{-\lambda t}X^x_t\,d{\nu }_t}+\int _\tau ^{\infty }e^{-\lambda t}\lambda X^x_{t}\Phi (C^{c,\nu }_{t})dt-e^{-\lambda \tau }P_0\right] , \end{aligned}$$
(2.5)

setting this expectation equal to 0 on the set \(\{\tau = +\infty \}\). The discounting of costs may therefore be accomplished by appropriately increasing the exponential parameter \(\lambda \).
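The Fubini reduction above rests on the identity \(\mathsf E[\int _0^\Theta f(t)dt]=\int _0^\infty e^{-\lambda t}f(t)dt\) for \(\Theta \) exponentially distributed with parameter \(\lambda \) and independent of \(\mathbb {F}\). A quick Monte Carlo illustration (not part of the paper's argument) with \(f(t)=\cos t\), whose pathwise integral is \(\sin \Theta \):

```python
import numpy as np

lam = 1.0
rng = np.random.default_rng(1)
theta_smp = rng.exponential(scale=1.0 / lam, size=400_000)  # Theta ~ Exp(lam)

# left-hand side: E[ int_0^Theta cos(t) dt ] = E[ sin(Theta) ]
mc = float(np.sin(theta_smp).mean())

# right-hand side: int_0^infty e^{-t} cos(t) dt = 1/2 when lam = 1
exact = 0.5
```

The Monte Carlo estimate agrees with the closed form to within sampling error, which is the sense in which the exponential time of demand "performs an effective discounting".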

3 Decoupling the problem and background material

To deal with (2.4) we borrow arguments from [13] to show that the stopping (entry) problem can be split from the control (investment) problem, leading to a two-step formulation. We first briefly recall some results from [7], where the control problem has the value function

$$\begin{aligned} U(x,c):=\inf _{\nu \in \mathcal {S}^c_0}\mathcal{J}^0_{x,c}(\nu ) \end{aligned}$$
(3.1)

with

$$\begin{aligned} \mathcal{J}^0_{x,c}(\nu ):=\mathsf E\left[ \int ^{\infty }_0 {e^{-\lambda t}X^x_t\,d{\nu }_t} + \int _0^{\infty }e^{-\lambda t}\lambda X^x_{t}\Phi (C^{c,\nu }_{t})dt \right] . \end{aligned}$$
(3.2)

As was shown in [7, Sec. 2], the function

$$\begin{aligned} k(c):=\lambda +\theta +\lambda \,\Phi '(c), \qquad c\in \mathbb {R}, \end{aligned}$$
(3.3)

appears in an optimal stopping functional which may be associated with U. For convenience we let \(\hat{c}\in \mathbb {R}\) denote the unique solution of \(k(c)=0\) if it exists and write

$$\begin{aligned} \zeta (c):=\int _c^1{k(y)dy}=(\lambda + \theta )(1-c) - \lambda \Phi (c),\quad c\in [0,1]. \end{aligned}$$
(3.4)
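As a concrete check of (3.3)-(3.4), the sketch below integrates k numerically for the Fig. 1 penalty \(\Phi (c)=2.2(1-c)+8(1-c)^2\) and parameters \(\lambda =\theta =1\), confirms the closed form of \(\zeta \), and locates \(\hat{c}\); for these parameters \(\hat{c}>1\), i.e. the repelling regime illustrated in Fig. 1.

```python
import numpy as np

lam, theta = 1.0, 1.0
Phi  = lambda c: 2.2 * (1.0 - c) + 8.0 * (1.0 - c) ** 2  # Fig. 1 penalty
dPhi = lambda c: -2.2 - 16.0 * (1.0 - c)                 # Phi'
k    = lambda c: lam + theta + lam * dPhi(c)             # (3.3)

c = 0.3
ys = np.linspace(c, 1.0, 20_001)
vals = k(ys)
# trapezoid rule for int_c^1 k(y) dy (exact here since k is affine in y)
zeta_quad = float(np.sum(0.5 * (vals[1:] + vals[:-1]) * np.diff(ys)))
zeta_closed = (lam + theta) * (1.0 - c) - lam * Phi(c)   # right-hand side of (3.4)

# k is affine in c for this Phi, so its root c_hat is explicit
c_hat = (2.2 * lam + 16.0 * lam - (lam + theta)) / (16.0 * lam)  # = 1.0125
```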

We formally introduce the variational problem associated with U:

$$\begin{aligned} \max \{-\mathbb {L}_XU+ \lambda U-\lambda x \Phi (c),-U_c-x\}=0,\quad \text {on }\mathbb {R} \times (0,1), \end{aligned}$$
(3.5)

where \(\mathbb {L}_X\) is the second order differential operator associated to the infinitesimal generator of X:

$$\begin{aligned} \mathbb {L}_{X}f\,(x):=\frac{1}{2}\sigma ^2 f''(x) + \theta (\mu - x)f'(x),\quad \text {for }f\in C^2_b(\mathbb {R})\text { and }x\in \mathbb {R}. \end{aligned}$$
(3.6)

As is standard in such control problems we define the inaction set for problem (3.1) by

$$\begin{aligned} \mathcal C:=\{(x,c)\in \mathbb {R}\times [0,1]\,:\,U_c(x,c)>-x \}. \end{aligned}$$
(3.7)

The non convexity of the expectation (3.2) with respect to the control variable \(\nu _t\), which arises due to the real-valued factor \(X^x_t\), places it outside the standard literature on SSC problems. We therefore collect here the results proved in Sections 2 and 3 of [7].

Proposition 3.1

We have \(|U(x,c)|\le C(1+|x|)\) for \((x,c)\in \mathbb {R}\times [0,1]\) and a suitable constant \(C>0\). Moreover, the following holds:

  (i)

    If \(\hat{c}<0\) (i.e. \(k(\,\cdot \,)>0\) in [0, 1]), then \(U\in C^{2,1}(\mathbb {R}\times [0,1])\) and it is a classical solution of (3.5). The inaction set (3.7) is given by

    $$\begin{aligned} \mathcal C=\{(x,c)\in \mathbb {R}\times [0,1]\,:\,x>\beta _*(c) \} \end{aligned}$$
    (3.8)

    for some function \(\beta _*\in C^1([0,1])\) which is decreasing and dominated from above by \(x_0(c)\wedge \hat{x}_0(c)\), \(c\in [0,1]\), with

    $$\begin{aligned} x_0(c):=-\theta \mu \Phi '(c)/k(c)\quad \text {and}\quad \hat{x}_0(c):=\theta \mu /k(c), \end{aligned}$$
    (3.9)

    (cf. [7, Prop. 2.5 and Thm. 2.8]). For \(c\in [0,1]\) the optimal control is given by

    $$\begin{aligned} \nu _t^*=\left[ g_*\left( \inf _{0 \le s \le t} X^x_s\right) -c\right] ^+, \quad t>0, \quad \nu _0^*= 0, \end{aligned}$$
    (3.10)

    with \(g_*(x):=\beta _*^{-1}(x)\), \(x\in (\beta _*(1),\beta _*(0))\), and \(g_* \equiv 0\) on \([\beta _*(0),\infty )\), \(g_* \equiv 1\) on \((-\infty ,\beta _*(1)]\).

  (ii)

    If \(\hat{c}>1\) (i.e. \(k(\,\cdot \,)<0\) in [0, 1]), then \(U\in W^{2,1,\infty }_{loc}(\mathbb {R}\times [0,1])\) and it solves (3.5) in the a.e. sense. The inaction set (3.7) is given by

    $$\begin{aligned} \mathcal C=\{(x,c)\in \mathbb {R}\times [0,1]\,:\,x<\gamma _*(c) \} \end{aligned}$$
    (3.11)

    with suitable \(\gamma _*\in C^1([0,1])\), decreasing and bounded from below by \(\tilde{x}(c)\vee \overline{x}_0(c)\), \(c\in [0,1]\), with

    $$\begin{aligned} \overline{x}_0(c):= \theta \mu \Phi (c)/\zeta (c)\quad \text {and}\quad \tilde{x}(c):=\theta \mu (1-c)/\zeta (c), \end{aligned}$$
    (3.12)

    (cf. [7, Thm. 3.1 and Prop. 3.4]). Moreover \(U(x,c)=x(1-c)\) for \(x\ge \gamma _*(c)\), \(c\in [0,1]\), and for any \(c\in [0,1]\) the optimal control is given by (cf. [7, Thm. 3.5])

    $$\begin{aligned} \nu ^*_t:=\left\{ \begin{array}{ll} 0, &{}\quad t\le \tau _*,\\ (1-c), &{}\quad t>\tau _* \end{array} \right. \end{aligned}$$
    (3.13)

    with \(\tau _*:=\inf \big \{t\ge 0\,:\,X^x_t\ge \gamma _*(c)\big \}\).
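The two control regimes of Proposition 3.1 can be visualised along a simulated price path. The boundaries \(\beta _*\) and \(\gamma _*\) below are made up purely for illustration (computing the true ones requires solving the free-boundary problems of [7]); the point is the shape of the controls: (3.10) increases continuously as the running minimum of X falls, while (3.13) is bang-bang at the first up-crossing of \(\gamma _*\).

```python
import numpy as np

# one OU path via the exact scheme (Fig. 1 parameters)
mu, theta, sigma = 1.0, 1.0, 3.0
dt, n = 1e-3, 20_000
a = np.exp(-theta * dt)
sd = sigma * np.sqrt((1.0 - a ** 2) / (2.0 * theta))
rng = np.random.default_rng(2)
noise = rng.standard_normal(n - 1)
x = np.empty(n)
x[0] = 0.0
for i in range(n - 1):
    x[i + 1] = mu + (x[i] - mu) * a + sd * noise[i]

c = 0.2  # initial inventory

# (i) reflecting control (3.10): nu* = [g_*(running min of X) - c]^+ with a
# made-up decreasing boundary beta_*(c) = -1 - c, whose inverse g_*(v) = -1 - v
# is clipped to [0, 1] exactly as in Proposition 3.1-i)
g_star = lambda v: np.clip(-1.0 - v, 0.0, 1.0)
run_min = np.minimum.accumulate(x)
nu_reflect = np.maximum(g_star(run_min) - c, 0.0)

# (ii) repelling control (3.13): fill the inventory at the first time X
# reaches a made-up level gamma_*(c) = 2
gamma_star = 2.0
hit = np.cumsum(x >= gamma_star) > 0
nu_repel = np.where(hit, 1.0 - c, 0.0)
```

Both controls are monotone and respect the capacity constraint \(c+\nu _t\le 1\), but the first accrues mass continuously while the second takes a single jump of size \(1-c\).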

We now perform the decoupling into two sub-problems, one of control and one of stopping.

Proposition 3.2

If \(\hat{c}<0\) or \(\hat{c}>1\) then the value function V of (2.4) can be equivalently rewritten as

$$\begin{aligned} V(x,c)=\inf _{\tau \ge 0}\mathsf E\Big [e^{-\lambda \tau }\Big (U(X^x_\tau ,c)-P_0\Big )\Big ], \end{aligned}$$
(3.14)

with the convention \(e^{-\lambda \tau }(U(X^x_\tau ,c)-P_0):=\liminf _{t \uparrow \infty } e^{-\lambda t}(U(X^x_t,c)-P_0) =0\) on \(\{\tau =\infty \}\).

Proof

Let us set

$$\begin{aligned} w(x,c):=\inf _{\tau \ge 0}\mathsf E\Big [e^{-\lambda \tau }\Big (U(X^x_\tau ,c)-P_0\Big )\Big ],\qquad \text {for } (x,c)\in \mathbb {R}\times [0,1]. \end{aligned}$$
(3.15)

Thanks to the results of Proposition 3.1 we can apply Itô’s formula to U, in the classical sense in case (i) and in its generalised version (cf. [11, Ch. 8, Sec. VIII.4, Thm. 4.1]) in case (ii). In particular for an arbitrary stopping time \(\tau \), an arbitrary admissible control \(\nu \in \mathcal {S}^c_\tau \) and with \(\tau _n:=\tau \wedge n\), \(n\in \mathbb {N}\) we get

$$\begin{aligned} \mathsf E\Big [e^{-\lambda \tau _n}U(X^x_{\tau _n},C^{c,\nu }_{\tau _n})\Big ] = {}&\mathsf E\Big [ e^{-\lambda \tau }U(X^x_{\tau },c)\Big ]+\mathsf E\bigg [\int ^{\tau _n}_\tau {e^{-\lambda t}\big (\mathbb {L}_XU-\lambda U\big )(X^x_t,C^{c,\nu }_t)dt}\bigg ] \nonumber \\&+\mathsf E\bigg [\int _{\tau }^{\tau _n}{e^{-\lambda t}U_c(X^x_t,C^{c,\nu }_t)d\nu ^{cont}_t}\bigg ]\nonumber \\&+\mathsf E\bigg [\sum _{\tau \le t<\tau _n}e^{-\lambda t}\Big (U(X^x_t,C^{c,\nu }_{t+})-U(X^x_t,C^{c,\nu }_{t})\Big )\bigg ], \end{aligned}$$
(3.16)

where we have used standard localisation techniques to remove the martingale term, and decomposed the control into its continuous and jump parts, i.e. \(d\nu _t=d\nu _t^{cont}+\Delta \nu _t\), with \(\Delta \nu _t:=\nu _{t+}-\nu _t\). Since U solves the HJB equation (3.5) it is now easy to prove (cf. for instance [7, Thm. 2.8]) that, in the limit as \(n\rightarrow \infty \), one has

$$\begin{aligned} \mathsf E\Big [e^{-\lambda \tau }U(X^x_{\tau },c)\Big ]\le \mathsf E\left[ \int ^\infty _{\tau }{e^{-\lambda t}\lambda X^x_t\Phi (C^{c,\nu }_t)dt}+\int ^{\infty }_\tau { e^{-\lambda t}X^x_t d\nu _t}\right] , \end{aligned}$$
(3.17)

and therefore

$$\begin{aligned} \mathsf E\Big [e^{-\lambda \tau }\big (U(X^x_\tau ,c)-P_0\big )\Big ]\le \mathsf E\left[ \int ^\infty _{\tau }{e^{-\lambda t}\lambda X^x_t\Phi (C^{c,\nu }_t)dt}+\int ^{\infty }_\tau { e^{-\lambda t}X^x_t d\nu _t}-e^{-\lambda \tau }P_0\right] , \end{aligned}$$
(3.18)

for an arbitrary stopping time \(\tau \) and an arbitrary control \(\nu \in \mathcal {S}^c_\tau \). Hence by taking the infimum over all possible stopping times and over all \(\nu \in \mathcal {S}^c_\tau \), (2.4), (3.15) and (3.18) give \(w(x,c)\le V(x,c)\).

To prove that equality holds, let us fix an arbitrary stopping time \(\tau \). In case i) of Proposition 3.1, one can pick a control \(\nu ^\tau \in \mathcal {S}^c_\tau \) of the form

$$\begin{aligned} \nu ^\tau _t=0\text { for }t\le \tau \,\,\text { and }\,\, \nu ^\tau _t=\nu ^*_t\text { for }t>\tau \end{aligned}$$
(3.19)

with \(\nu ^*\) as in (3.10), to obtain equality in (3.17) and hence in (3.18). In case ii) instead we define \(\sigma ^*_\tau :=\inf \{t\ge \tau \,:\,X^x_t\ge \gamma _*(c)\}\) and pick \(\nu ^\tau \in \mathcal {S}^c_\tau \) of the form

$$\begin{aligned} \nu ^\tau _t=0\text { for }t\le \sigma ^*_\tau \,\,\text { and }\,\, \nu ^\tau _t=1-c\text { for }t>\sigma ^*_\tau \end{aligned}$$
(3.20)

to have again equality in (3.17) and hence in (3.18). Now taking the infimum over all \(\tau \) we find \(w(x,c)\ge V(x,c)\).

To complete the proof we need to prove the last claim; that is, \(\liminf _{t \uparrow \infty } e^{-\lambda t}(U(X^x_t,c)-P_0) =0\) a.s. It suffices to show that \(\liminf _{t \uparrow \infty } e^{-\lambda t}|U(X^x_t,c)-P_0| =0\) a.s. To this end recall that \(|U(x,c)|\le C(1+|x|)\), for \((x,c)\in \mathbb {R}\times [0,1]\) and a suitable constant \(C>0\) (cf. Proposition 3.1), and then apply Lemma 7.1 in “Appendix 1”. \(\square \)

Remark 3.3

The optimal stopping problems (3.14) depend only parametrically on the inventory level c (the case \(c=1\) is trivial as \(U(\,\cdot \,,1)=0\) on \(\mathbb {R}\) and the optimal strategy is to stop at once for all initial points \(x\in \mathbb {R}\)).

It is worth noting that knowing the structure of the optimal control for problem (3.1) allowed a very simple proof of the decoupling. In greater generality one could obtain a proof based on an application of the dynamic programming principle, although in that case it is well known that some delicate measurability issues must also be addressed (see [13], “Appendix 1”). Although each of the optimal stopping problems (3.14) is for a one-dimensional diffusion over an infinite time horizon, standard methods find only limited application since no explicit expression is available for their gain function \(U(x,c)-P_0\).

In the next section we show that the cases \(\hat{c}<0\) and \(\hat{c}>1\), which are the regimes solved rigorously in [7], have substantially different optimal entry policies. To conclude the background material we prove a useful concavity result.

Lemma 3.4

The maps \(x\mapsto U(x,c)\) and \(x\mapsto V(x,c)\) are concave for fixed \(c\in [0,1]\).

Proof

We begin by observing that \(X^{px+(1-p)y}_t=pX^x_t+(1-p)X^y_t\) for all \(t\ge 0\) and any \(p\in (0,1)\). Hence (3.2) gives

$$\begin{aligned} \mathcal{J}^0_{px+(1-p)y,c}(\nu )=p\mathcal{J}^0_{x,c}(\nu )+(1-p)\mathcal{J}^0_{y,c}(\nu )\ge p U(x,c)+(1-p)U(y,c), \qquad \forall \nu \in \mathcal {S}^c_0 \end{aligned}$$

and therefore taking the infimum over all admissible \(\nu \) we easily find \(U(px+(1-p)y,c)\ge p U(x,c)+(1-p)U(y,c)\) as claimed.

For V we argue in a similar way and use concavity of \(U(\,\cdot \,,c)\) as follows: let \(\tau \ge 0\) be an arbitrary stopping time, then

$$\begin{aligned} \mathsf E\Big [e^{-\lambda \tau }\Big (U(X^{px+(1-p)y}_\tau ,c)-P_0\Big )\Big ]&= \mathsf E\Big [e^{-\lambda \tau }\Big (U(pX^{x}_\tau +(1-p)X^y_\tau ,c)-P_0\Big )\Big ]\\&\ge \mathsf E\Big [e^{-\lambda \tau }\Big (pU(X^{x}_\tau ,c)+(1-p)U(X^{y}_\tau ,c)-P_0\Big )\Big ]\\&= p \, \mathsf E\Big [e^{-\lambda \tau }\Big (U(X^{x}_\tau ,c)-P_0\Big )\Big ] \\&\quad +(1-p) \, \mathsf E\Big [e^{-\lambda \tau }\Big (U(X^{y}_\tau ,c)-P_0\Big )\Big ]\\&\ge \,\,p\,V(x,c)+(1-p)V(y,c). \end{aligned}$$

We conclude the proof by taking the infimum over all stopping times \(\tau \ge 0\). \(\square \)

4 Timing the entry decision

We first examine the optimal entry policy via a standard argument based on exit times from small intervals of \(\mathbb {R}\). An application of Dynkin’s formula gives that the instantaneous ‘cost of continuation’ in our optimal entry problem is given by the function

$$\begin{aligned} \mathcal {L}(x,c)+\lambda P_0:=(\mathbb {L}_X-\lambda )(U-P_0)(x,c). \end{aligned}$$
(4.1)

In the case \(\hat{c} < 0\), which is covered in Sect. 4.1, the function (4.1) is monotone decreasing in x (see the proof of Proposition 4.2 in “Appendix 2”). Since problem (2.3) is one of minimisation, it is never optimal to stop at points \((x,c)\in \mathbb {R}\times [0,1]\) such that \(\mathcal {L}(x,c)+\lambda P_0<0\); an easy comparison argument then shows there is a unique lower threshold that determines the optimal stopping rule in this case.

When \(\hat{c}>1\) the picture is more complex. The function (4.1) is decreasing and continuous everywhere except at a single point where it has a positive jump (cf. Proposition 5.1 below) and so can change sign twice. The comparison argument now becomes more subtle: continuation should not be optimal when the function (4.1) is positive in a ‘large neighbourhood containing the initial value x’. Indeed it will turn out in Sect. 5 that there are multiple possible optimal stopping regimes depending on parameter values. In particular the continuation region of the optimal stopping problem may be disconnected, which is unusual in the literature on optimal entry problems. The resulting optimal entry region can have a kinked shape, as illustrated in Fig. 1. The jump in the function (4.1) arises from the ‘bang-bang’ nature of the optimal investment plan when \(\hat{c} > 1\), and so this may be understood as causing this unusual shape for the optimal entry boundary.

Fig. 1

An indicative example of an optimal entry region (shaded) when \(\hat{c} > 1\), together with the functions \(\gamma _*\) and \(x^0_1\), \(x^0_2\) (introduced in Proposition 5.1 below). The functions \(m_{1}\) and \(m_{2}\) (not drawn to scale) are important determinants for the presence of the kinked shape (see Remark 5.4 below). This plot was generated using \(\mu = 1\), \(\theta = 1\), \(\sigma = 3\), \(\lambda = 1\), \(P_{0} = 4\) and \(\Phi (c) = 2.2(1 - c) + 8(1 - c)^{2}\)

Before proceeding, we introduce two functions \(\phi _{\lambda }\) and \(\psi _\lambda \) that feature frequently below.

Definition 4.1

Let \(\phi _{\lambda } : \mathbb {R}\rightarrow \mathbb {R}^+\) and \(\psi _\lambda :\mathbb {R}\rightarrow \mathbb {R}^+\) denote respectively the decreasing and increasing fundamental solutions of the differential equation \(\mathbb {L}_Xf=\lambda f\) on \(\mathbb {R}\) (see “Appendix 1” for details).
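Definition 4.1 characterises \(\phi _\lambda \) and \(\psi _\lambda \) only through the ODE \(\mathbb {L}_Xf=\lambda f\). For the OU generator these functions admit the classical integral representation \(\psi _\lambda (x)=\int _0^\infty t^{\lambda /\theta -1}e^{tz-t^2/2}dt\) with \(z=(x-\mu )\sqrt{2\theta }/\sigma \), and \(\phi _\lambda \) obtained by replacing z with \(-z\); this representation is a standard fact assumed here rather than taken from the paper, and normalisation is immaterial. The sketch below evaluates it by quadrature and verifies the ODE by central finite differences.

```python
import numpy as np

mu, theta, sigma, lam = 1.0, 1.0, 3.0, 1.0  # Fig. 1 parameters

def _fund(x, sign):
    """Assumed integral representation of the fundamental solutions."""
    z = sign * (x - mu) * np.sqrt(2.0 * theta) / sigma
    t = np.linspace(1e-8, 40.0, 400_001)
    f = t ** (lam / theta - 1.0) * np.exp(t * z - t ** 2 / 2.0)
    return float(np.sum(0.5 * (f[1:] + f[:-1]) * np.diff(t)))  # trapezoid rule

psi = lambda x: _fund(x, +1.0)  # increasing solution
phi = lambda x: _fund(x, -1.0)  # decreasing solution

def ode_residual(f, x, h=1e-2):
    """Central finite-difference residual of (L_X - lam) f at x."""
    fp  = (f(x + h) - f(x - h)) / (2.0 * h)
    fpp = (f(x + h) - 2.0 * f(x) + f(x - h)) / h ** 2
    return 0.5 * sigma ** 2 * fpp + theta * (mu - x) * fp - lam * f(x)
```

The residual is of the order of the finite-difference error at any test point, and the monotonicity stated in Definition 4.1 is visible directly from the integrand.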

4.1 The case \(\hat{c}<0\)

Let us now assume that \(\hat{c}<0\), i.e. \(k(c)>0\) for all \(c\in [0,1]\) [cf. (3.3)]. We first recall from Section 2.2 of [7] that in this case

$$\begin{aligned} U(x,c)=x(1-c)-\int _c^1{u(x,y)dy},\qquad \text {for }(x,c)\in \mathbb {R}\times [0,1], \end{aligned}$$
(4.2)

where u is the value function of an associated optimal stopping problem with (cf. Sections 2.1 and 2.2 of [7])

$$\begin{aligned}&(i)\quad u(\,\cdot \,,c)\in W^{2,\infty }_{loc}(\mathbb {R})\text { for any }c\in [0,1] \end{aligned}$$
(4.3)
$$\begin{aligned}&(ii)\quad u(x,c)> 0\text { for }x>\beta _*(c)\text { and }u(x,c)=0\text { for }x\le \beta _*(c),\, c\in [0,1], \end{aligned}$$
(4.4)

and with \(\beta _*\) given as in Proposition 3.1-i). Moreover, defining

$$\begin{aligned} G(x,c) := \frac{\mu (k(c)-\theta )}{\lambda } + \frac{k(c)(x-\mu )}{\lambda +\theta }, \end{aligned}$$
(4.5)

and recalling \(\phi _{\lambda }\) from Definition 4.1, u is expressed analytically as

$$\begin{aligned} u(x,c)=\left\{ \begin{array}{ll} G(x,c)-\frac{G({\beta _*(c)},c)}{\phi _{\lambda }({\beta _*(c)})}\phi _{\lambda }(x), &{} \quad x > {\beta _*(c)} \\ [+4pt] 0, &{} \quad x \le {\beta _*(c)}\end{array} \right. \end{aligned}$$
(4.6)

for \(c\in [0,1]\), and it solves the variational problem

$$\begin{aligned}&\big (\mathbb L_X-\lambda \big )u(x,c)=\theta \mu -k(c)x \quad \qquad x>\beta _*(c),\,c\in [0,1]\end{aligned}$$
(4.7)
$$\begin{aligned}&\big (\mathbb L_X-\lambda \big )u(x,c)=0 \qquad \quad \quad \,\,\qquad \qquad x\le \beta _*(c),\,c\in [0,1]\end{aligned}$$
(4.8)
$$\begin{aligned}&u(\beta _*(c),c)=u_x(\beta _*(c),c)=0 \qquad \quad \,\, c\in [0,1]. \end{aligned}$$
(4.9)
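Since \(\mathbb {L}_X-\lambda \) annihilates \(\phi _\lambda \), the ODE (4.7) for u in (4.6) reduces to the identity \((\mathbb {L}_X-\lambda )G(\,\cdot \,,c)=\theta \mu -k(c)x\) for the affine function G of (4.5). As G is affine in x the diffusion term vanishes, so this is elementary algebra; a pointwise numerical confirmation (Fig. 1 parameters, illustrative only):

```python
mu, theta, lam = 1.0, 1.0, 1.0
dPhi = lambda c: -2.2 - 16.0 * (1.0 - c)      # Phi' for the Fig. 1 penalty
k    = lambda c: lam + theta + lam * dPhi(c)  # (3.3)

def G(x, c):  # (4.5)
    return mu * (k(c) - theta) / lam + k(c) * (x - mu) / (lam + theta)

def LG_minus_lamG(x, c):
    """(L_X - lam)G: G is affine in x, so the second-order term drops out."""
    Gx = k(c) / (lam + theta)
    return theta * (mu - x) * Gx - lam * G(x, c)
```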

By the regularity of u and dominated convergence we have

$$\begin{aligned} (\mathbb {L}_X-\lambda )U(x,c)=(1-c)(\theta \mu -(\lambda +\theta )x)-\int _c^1(\mathbb {L}_X-\lambda )u(x,y)dy \end{aligned}$$
(4.10)

for \((x,c)\in \mathbb {R}\times [0,1]\).

As is usual, for each \(c\in [0,1]\) we define the continuation region \(\mathcal C^c_V\) and stopping region \(\mathcal D^c_V\) for the optimal stopping problem (3.14) as

$$\begin{aligned} \mathcal C^c_V=\{x\in \mathbb {R}\,:\,V(x,c) < U(x,c)-P_0\}\,,\quad \mathcal D^c_V=\{x\in \mathbb {R}\,:\,V(x,c) = U(x,c)-P_0\}. \end{aligned}$$
(4.11)

With the aim of characterising the geometry of \(\mathcal C^c_V\) and \(\mathcal D^c_V\) we start by providing some preliminary results on \(U-P_0\) that will help to formulate an appropriate free-boundary problem for V.

Proposition 4.2

For any given \(c\in [0,1]\), there exists a unique \(x^0(c)\in \mathbb {R}\) such that

$$\begin{aligned} \big (\mathbb {L}_X-\lambda \big )\big (U(x,c)-P_0)\left\{ \begin{array}{ll} {<}0 &{} \quad \text {for }x>x^0(c)\\ {=}0 &{} \quad \text {for }x=x^0(c)\\ {>}0 &{} \quad \text {for }x<x^0(c) \end{array} \right. \end{aligned}$$
(4.12)

We refer to “Appendix 2” for the proof of the previous proposition.

As discussed at the beginning of Sect. 4, it is never optimal in problem (3.14) to stop in \((x^0(c),\infty )\), \(c\in [0,1]\), for \(x^0(c)\) as in Proposition 4.2, i.e.

$$\begin{aligned} (x^0(c),\infty )\subseteq \mathcal C^c_V\qquad \text {for }c\in [0,1], \end{aligned}$$
(4.13)

and consequently

$$\begin{aligned} \mathcal D^c_V \subset (-\infty ,x^0(c)]\qquad \text {for }c\in [0,1]. \end{aligned}$$
(4.14)

Hence we conjecture that the optimal stopping strategy should be of single threshold type. In what follows we aim to find \(\ell _*(c)\), \(c\in [0,1]\), such that \(\mathcal D^c_V=(-\infty ,\ell _*(c)]\) and

$$\begin{aligned} \tau ^*(x,c) = \inf \{ t \ge 0: X^x_t \le \ell _*(c)\} \end{aligned}$$
(4.15)

is optimal for \(V(x,c)\) in (3.14) with \((x,c)\in \mathbb {R}\times [0,1]\). The methodology adopted in [7, Sec. 2.1] does not apply directly to this problem due to the semi-explicit expression of the gain function \(U-P_0\).

4.1.1 Formulation of auxiliary optimal stopping problems

To work out the optimal boundary \(\ell _*\) we will introduce auxiliary optimal stopping problems and employ a guess-and-verify approach in two frameworks with differing technical issues. We first observe that since U is a classical solution of (3.5), an application of Dynkin’s formula to (3.14) provides a lower bound for V, that is

$$\begin{aligned} V(x,c) \ge U(x,c)-P_0 + \Gamma (x,c),\qquad (x,c)\in \mathbb {R}\times [0,1], \end{aligned}$$
(4.16)

with

$$\begin{aligned} \Gamma (x,c):=\inf _{\tau \ge 0} \mathsf E\bigg [ \int _0^{\tau }e^{-\lambda s}\big (\lambda P_0-\lambda X^x_s \Phi (c)\big )ds \bigg ]\qquad (x,c)\in \mathbb {R}\times [0,1]. \end{aligned}$$
(4.17)

On the other hand, for \((x,c)\in \mathbb {R}\times [0,1]\) fixed, set \(\sigma ^*_\beta :=\inf \{t\ge 0\,:\,X^x_t\le \beta _*(c)\}\) with \(\beta _*\) as in Proposition 3.1, then for an arbitrary stopping time \(\tau \) one also obtains

$$\begin{aligned} \mathsf E&\Big [e^{-\lambda (\tau \wedge \sigma ^*_\beta )}\Big (U(X^x_{\tau \wedge \sigma ^*_\beta },c)-P_0\Big )\Big ]\nonumber \\&=U(x,c)-P_0+\mathsf E\bigg [ \int _0^{\tau \wedge \sigma ^*_\beta }e^{-\lambda s}\big (\lambda P_0-\lambda X^x_s \Phi (c)\big )ds \bigg ] \end{aligned}$$
(4.18)

by using the fact that U solves (3.5) and Dynkin’s formula. We can now obtain an upper bound for V by setting

$$\begin{aligned} \Gamma _\beta (x,c):=\inf _{\tau \ge 0}\mathsf E\bigg [ \int _0^{\tau \wedge \sigma ^*_\beta }e^{-\lambda s}\big (\lambda P_0-\lambda X^x_s \Phi (c)\big )ds \bigg ],\quad (x,c)\in \mathbb {R}\times [0,1], \end{aligned}$$
(4.19)

so that taking the infimum over all \(\tau \) in (4.18) one obtains

$$\begin{aligned} V(x,c)&\le U(x,c)-P_0 + \Gamma _\beta (x,c)\qquad (x,c)\in \mathbb {R}\times [0,1]. \end{aligned}$$
(4.20)

It turns out that (4.16) and (4.20) allow us to find a simple characterisation of the optimal boundary \(\ell _*\) and of the function V in some cases. Let us first observe that \(0 \ge \Gamma _\beta (x,c) \ge \Gamma (x,c)\) for all \((x,c) \in \mathbb {R} \times [0,1]\). Defining for each fixed \(c \in [0,1]\) the stopping regions

$$\begin{aligned} \mathcal D^c_{\Gamma }=\{x\in \mathbb {R}\,:\, \Gamma (x,c)=0\} \qquad \text {and}\qquad \mathcal D^c_{\Gamma _\beta }=\{x\in \mathbb {R}\,:\, \Gamma _\beta (x,c)=0\} \end{aligned}$$

it is easy to see that \(\mathcal D^c_{\Gamma } \subset \mathcal D^c_{\Gamma _\beta }\). Moreover, by the monotonicity of \(x\mapsto X^x_\cdot \) it is not hard to verify that \(x \mapsto \Gamma (x,c)\) and \(x\mapsto \Gamma _\beta (x,c)\) are decreasing. Hence we again expect optimal stopping strategies of threshold type, i.e.

$$\begin{aligned} \mathcal D^c_{\Gamma }=\{x\in \mathbb {R}\,:\, x\le \alpha ^*_1(c)\} \qquad \text {and}\qquad \mathcal D^c_{\Gamma _\beta }=\{x\in \mathbb {R}\,:\, x\le \alpha ^*_2(c)\} \end{aligned}$$
(4.21)

for \(c\in [0,1]\) and for suitable functions \(\alpha ^*_i(\,\cdot \,)\), \(i=1,2\) to be determined.

Assume for now that \(\alpha ^*_1\) and \(\alpha ^*_2\) are indeed optimal; then we must have

$$\begin{aligned} \alpha ^*_1(c) \le \ell _*(c) \le \alpha _2^*(c)\qquad \text {for }c\in [0,1]. \end{aligned}$$
(4.22)

Indeed, for all \((x,c) \in \mathbb {R}\times [0,1]\) we have \(\mathcal D^c_{\Gamma } \subset \mathcal D^c_{V}\) since \(\Gamma (x,c) \le V(x,c)-U(x,c)+P_0 \le 0\), and \(\mathcal D^c_{V} \subset \mathcal D^c_{\Gamma _\beta }\) since \(V(x,c)-U(x,c)+P_0 \le \Gamma _\beta (x,c) \le 0\). Notice also that since the optimisation problem in (4.19) is the same as the one in (4.17) except that in the former the observation is stopped when X hits \(\beta _*\), we must have

$$\begin{aligned} \alpha ^*_2(c)=\beta _*(c)\vee \alpha _1^*(c) \quad \text {for } c\in [0,1]. \end{aligned}$$
(4.23)

Thus for each \(c\in [0,1]\) we can now consider two cases:

  1.

    if \(\alpha _1^*(c) > \beta _*(c)\) we have \(\Gamma (x,c)= \Gamma _\beta (x,c)=\big (V-U+P_0\big )(x,c)\) for \(x\in \mathbb {R}\) and \(\ell _*(c) = \alpha _1^*(c)\),

  2.

    if \(\alpha _1^*(c) \le \beta _*(c)\) we have \(\alpha ^*_2(c)=\beta _*(c)\), implying that \(\ell _*(c) \le \beta _*(c)\).

Both 1. and 2. above need to be studied in order to obtain a complete characterisation of \(\ell _*\); however, we note that case 1. is particularly interesting as it identifies V and \(\ell _*\) with \(\Gamma +U-P_0\) and \(\alpha ^*_1\), respectively. As we will clarify in what follows, solving problem (4.17) turns out to be theoretically simpler and computationally less demanding than dealing directly with problem (3.14).

4.1.2 Solution of the auxiliary optimal stopping problems

To make our claims rigorous we start by analysing problem (4.17). This is accomplished largely by relying on arguments already employed in [7, Sec. 2.1], and therefore we omit proofs whenever a precise reference can be provided. Moreover, the majority of the proofs of new results are deferred to “Appendix 2” to simplify the exposition.

In problem (4.17) we conjecture an optimal stopping time of the form

$$\begin{aligned} \tau _\alpha (x,c) := \inf \{ t \ge 0\,:\, X^x_t \le \alpha (c)\} \end{aligned}$$
(4.24)

for \((x,c)\in \mathbb {R}\times [0,1]\) and \(\alpha \) to be determined. Under this conjecture \(\Gamma \) should be found in the class of functions of the form

$$\begin{aligned} \Gamma ^{\alpha }(x,c)=\left\{ \begin{array}{ll} \displaystyle \mathsf E\bigg [ \int _0^{\tau _{\alpha }}e^{-\lambda s}\lambda \big (P_0- X^x_s \Phi (c)\big )ds\bigg ], &{} \quad x > \alpha (c) \\ 0, &{} \quad x \le \alpha (c)\end{array} \right. \end{aligned}$$
(4.25)

for each \(c\in [0,1]\). Now, repeating the arguments of the proof of [7, Thm. 2.1] we obtain

Lemma 4.3

One has

$$\begin{aligned} \Gamma ^{\alpha }(x,c)=\left\{ \begin{array}{ll} \big (P_0-\hat{G}(x,c)\big )-\big ( P_0-\hat{G}(\alpha (c),c) \big )\frac{\phi _{\lambda }(x)}{\phi _{\lambda }(\alpha (c))}, &{} \quad x > \alpha (c) \\ 0, &{} \quad x \le \alpha (c)\end{array} \right. \end{aligned}$$
(4.26)

for each \(c\in [0,1]\), with

$$\begin{aligned} \hat{G}(x,c):=\mu \Phi (c)+(x-\mu )\tfrac{\lambda \Phi (c)}{\lambda +\theta }\qquad (x,c)\in \mathbb {R}\times [0,1]. \end{aligned}$$
(4.27)
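
The expression (4.27) is simply the resolvent of the linear running cost. Indeed, since the drift of X is mean-reverting with rate \(\theta \) towards the level \(\mu \), so that \(\mathsf E[X^x_s]=\mu +(x-\mu )e^{-\theta s}\), a direct computation (added here as a clarifying step) gives

$$\begin{aligned} \hat{G}(x,c)=\lambda \,\mathsf E\bigg [\int _0^\infty e^{-\lambda s}X^x_s\,\Phi (c)\,ds\bigg ] =\Phi (c)\int _0^\infty \lambda e^{-\lambda s}\big (\mu +(x-\mu )e^{-\theta s}\big )ds =\mu \Phi (c)+(x-\mu )\tfrac{\lambda \Phi (c)}{\lambda +\theta }. \end{aligned}$$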

To single out the candidate optimal boundary we impose the so-called smooth fit condition, i.e. \(\tfrac{d}{dx}\Gamma ^\alpha (\alpha (c),c)=0\) for every \(c\in [0,1]\). This amounts to finding \(\alpha ^*\) such that

$$\begin{aligned} -\tfrac{\lambda \Phi (c)}{\lambda +\theta }+ \big (\hat{G}(\alpha ^*(c),c)-P_0 \big )\tfrac{\phi '_{\lambda }(\alpha ^*(c))}{\phi _{\lambda }(\alpha ^*(c))}=0\quad \text {for }c\in [0,1]. \end{aligned}$$
(4.28)
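
Equation (4.28) lends itself to standard one-dimensional root-finding. The sketch below is purely illustrative and not part of the paper: it assumes that X is the Ornstein–Uhlenbeck process \(dX_t=\theta (\mu -X_t)dt+\sigma dB_t\) (whose resolvent matches (4.27); the volatility \(\sigma \), the remaining parameter values and the stand-in for \(\Phi \) are all hypothetical) and uses the classical integral representation of the decreasing fundamental solution \(\phi _\lambda \).

```python
# Illustrative solver for the smooth-fit equation (4.28), assuming X is an
# Ornstein-Uhlenbeck process dX = theta*(mu - X) dt + sigma dB.  All parameter
# values and the choice of Phi below are hypothetical stand-ins.
import numpy as np
from scipy.integrate import quad
from scipy.optimize import brentq

theta, mu, sigma, lam = 1.0, 0.5, 1.0, 0.8   # OU parameters and discount rate (assumed)
P0 = 1.0                                     # initial premium (assumed)

def Phi(c):
    # hypothetical cost factor; the paper only assumes general properties of Phi
    return 1.0 - 0.5 * c

def phi_ratio(x):
    # phi_lambda'(x) / phi_lambda(x) for the OU process, using the representation
    # phi_lambda(x) ∝ int_0^inf t^{lam/theta - 1} exp(-t^2/2 - z*t) dt,
    # with z = (x - mu)*sqrt(2*theta)/sigma; differentiating under the integral
    # sign brings down a factor -(sqrt(2*theta)/sigma)*t.
    z = (x - mu) * np.sqrt(2.0 * theta) / sigma
    f0, _ = quad(lambda t: t**(lam/theta - 1.0) * np.exp(-0.5*t*t - z*t), 0.0, 60.0)
    f1, _ = quad(lambda t: t**(lam/theta) * np.exp(-0.5*t*t - z*t), 0.0, 60.0)
    return -np.sqrt(2.0 * theta) / sigma * f1 / f0

def G_hat(x, c):
    # cf. (4.27)
    return mu * Phi(c) + (x - mu) * lam * Phi(c) / (lam + theta)

def smooth_fit(x, c):
    # left-hand side of (4.28); alpha*(c) is its root below x_0^dagger(c)
    return -lam * Phi(c) / (lam + theta) + (G_hat(x, c) - P0) * phi_ratio(x)

def alpha_star(c):
    # bracket the root below x_0^dagger(c) [cf. (4.29) below]; the offset 10 is ad hoc
    x_dag = mu + (P0 - mu * Phi(c)) * (lam + theta) / (lam * Phi(c))
    return brentq(lambda x: smooth_fit(x, c), x_dag - 10.0, x_dag - 1e-9)
```

At \(x=x^\dagger _0(c)\) the left-hand side of (4.28) equals \(-\lambda \Phi (c)/(\lambda +\theta )<0\), while it becomes positive for sufficiently negative x, so a sign change is guaranteed inside the bracket for these parameter values.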

Proposition 4.4

For \(c\in [0,1]\) define

$$\begin{aligned} x^\dagger _0(c):=\mu +\big (P_0-\mu \Phi (c)\big )\tfrac{(\lambda +\theta )}{\lambda \Phi (c)}. \end{aligned}$$
(4.29)

For each \(c\in [0,1]\) there exists a unique solution \(\alpha ^*(c)\in (-\infty ,x^\dagger _0(c))\) of (4.28). Moreover \(\alpha ^*\in C^1([0,1))\) and it is strictly increasing with \(\lim _{c\rightarrow 1}\alpha ^*(c)=+\infty \).

For the proof of Proposition 4.4 we refer to “Appendix 2”.

To complete the characterisation of \(\alpha ^*\) and \(\Gamma ^{\alpha ^*}\) we now find an alternative upper bound for \(\alpha ^*\) that will guarantee \(\big (\mathbb L_X \Gamma ^{\alpha ^*}-\lambda \Gamma ^{\alpha ^*}\big )(x,c)\ge -\lambda (P_0-x\Phi (c))\) for \((x,c)\in \mathbb {R}\times [0,1]\). Again, the proof of the following result may be found in “Appendix 2”.

Proposition 4.5

For all \(c\in [0,1]\) we have \(\alpha ^*(c)\le P_0/\Phi (c)\) with \(\alpha ^*\) as in Proposition 4.4.

With the aim of formulating a variational problem for \(\Gamma ^{\alpha ^*}\) we observe that \(\tfrac{d^2}{dx^2}\Gamma ^{\alpha ^*}(x,c)<0\) for \(x>\alpha ^*(c)\), \(c\in [0,1]\) by (4.26), convexity of \(\phi _\lambda \) and the fact that \(\hat{G}(\alpha ^*(c),c)-P_0<0\). Hence \(\Gamma ^{\alpha ^*}\le 0\) on \(\mathbb {R}\times [0,1]\). It is not hard to verify by direct calculation from (4.26) and the above results that for all \(c\in [0,1]\) the couple \(\big (\Gamma ^{\alpha ^*}(\,\cdot \,,c),\alpha ^*(c)\big )\) solves the free-boundary problem

$$\begin{aligned}&\big (\mathbb L_X-\lambda \big )\Gamma ^{\alpha ^*}(x,c)=-\lambda (P_0-x\Phi (c)) ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~x>{\alpha ^*}(c), \end{aligned}$$
(4.30)
$$\begin{aligned}&\big (\mathbb L_X-\lambda \big )\Gamma ^{\alpha ^*}(x,c)>-\lambda (P_0-x\Phi (c))~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ x < \alpha ^*(c), \end{aligned}$$
(4.31)
$$\begin{aligned}&\Gamma ^{\alpha ^*}(x,c)\le 0,\quad \Gamma ^{\alpha ^*}(\alpha ^*(c),c)=\Gamma ^{\alpha ^*}_x(\alpha ^*(c),c)=0~~~~~~~~~~~~ x\in \mathbb {R}\end{aligned}$$
(4.32)

and \(\Gamma ^{\alpha ^*}(\,\cdot \,,c)\in W^{2,\infty }_{loc}(\mathbb {R})\). Following now the same arguments as in the proof of [7, Thm. 2.1], which is based on an application of the Itô–Tanaka formula and (4.30)–(4.32), we can verify our guess and prove the following theorem (whose details are omitted).

Theorem 4.6

The boundary \(\alpha ^*\) of Proposition 4.4 is optimal for (4.17) in the sense that \(\alpha ^*=\alpha ^*_1\) with \(\alpha ^*_1\) as in (4.21),

$$\begin{aligned} \tau ^*_\alpha =\inf \{t \ge 0\,:\,X^x_t \le {\alpha ^*(c)}\} \end{aligned}$$
(4.33)

is an optimal stopping time and \(\Gamma ^{\alpha ^*}\equiv \Gamma \) [cf. (4.17)].

4.1.3 Solution of the original optimal stopping problem (3.14)

In Theorem 4.6 we have fully characterised \(\alpha ^*_1\) and \(\Gamma \) thus also \(\alpha ^*_2\) and \(\Gamma _\beta \) (cf. (4.19), (4.21) and (4.23)). Moreover we have found that \(\alpha ^*_1(\,\cdot \,)\) is strictly increasing on [0, 1). On the other hand, \(\beta _*(\,\cdot \,)\) is a strictly decreasing function [cf. Proposition 3.1-i)], hence there exists at most one \(c_* \in (0,1)\) such that

$$\begin{aligned} \beta _*(c) > \alpha ^*_1(c)\text { for }c \in (0,c_*)\quad \text {and}\quad \beta _*(c) \le \alpha ^*_1(c)\text { for }c \in [c_*,1). \end{aligned}$$
(4.34)

As already mentioned, it may be possible to provide examples where such a value \(c_*\) does not exist in (0, 1) and \(\alpha ^*_1(c)>\beta _*(c)\) for all \(c\in [0,1]\). In those cases, as discussed in Sect. 4.1.1, one has \(\ell _*=\alpha ^*_1\) and \(V=U-P_0+\Gamma \) and problem (3.14) is fully solved. Therefore to provide a complete analysis of problem (3.14) we must consider the case when \(c_*\) exists in (0, 1). From now on we make the following assumption.

Assumption 4.7

There exists a value \(c_*\in (0,1)\) (which is therefore unique) such that (4.34) holds.

As a consequence of the analysis in Sect. 4.1.2 we have the next simple corollary.

Corollary 4.8

For all \(c \in [c_*,1)\) it holds \(V(x,c)=(\Gamma +U-P_0)(x,c)\), \(x\in \mathbb {R}\) and \(\ell _*(c)=\alpha ^*_1(c)\), with \(\Gamma \) and \(\alpha ^*_1\) as in Theorem 4.6.

It remains to characterise \(\ell _*\) on the interval \([0,c_*)\), in which we have \(\ell _*(c) \le \beta _*(c)\). This is done in Theorem 4.13, whose proof requires further technical results which are stated here and proved in the appendix. Fix \(c\in [0,c_*)\), let \(\ell (c)\in \mathbb {R}\) be a candidate boundary and define the stopping time \(\tau _\ell (x,c):=\inf \big \{t\ge 0\,:\,X^x_t\le \ell (c)\big \}\) for \(x\in \mathbb {R}\). Again to simplify notation we set \(\tau _\ell =\tau _\ell (x,c)\) when no confusion may arise. It is now natural to associate to \(\ell (c)\) a candidate value function

$$\begin{aligned} V^\ell (x,c):=\mathsf E\left[ e^{-\lambda \tau _\ell }\Big (U(X^x_{\tau _\ell },c)-P_0\Big )\right] , \end{aligned}$$
(4.35)

whose analytical expression is provided in the next lemma.

Lemma 4.9

For \(c\in [0,c_*)\) we have

$$\begin{aligned} V^\ell (x,c)=\left\{ \begin{array}{ll} (U(\ell (c),c)-P_0) \frac{\phi _{\lambda }(x)}{\phi _{\lambda }(\ell (c))}, &{} \quad x>\ell (c) \\ U(x,c)-P_0, &{} \quad x \le \ell (c) \end{array} \right. \end{aligned}$$
(4.36)
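
The representation (4.36) is an instance of the classical formula for the Laplace transform of a downward hitting time of a regular one-dimensional diffusion; we record the step here for clarity. For \(x>\ell (c)\) one has

$$\begin{aligned} \mathsf E\big [e^{-\lambda \tau _\ell }\big ]=\frac{\phi _{\lambda }(x)}{\phi _{\lambda }(\ell (c))}, \end{aligned}$$

and \(X^x_{\tau _\ell }=\ell (c)\) by continuity of paths, so that \(V^\ell (x,c)=\big (U(\ell (c),c)-P_0\big )\phi _{\lambda }(x)/\phi _{\lambda }(\ell (c))\); for \(x\le \ell (c)\) stopping is immediate and \(V^\ell (x,c)=U(x,c)-P_0\).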

The candidate boundary \(\ell _*\), whose optimality will be subsequently verified, is found by imposing the smooth fit condition, i.e.

$$\begin{aligned} (U(\ell _*(c),c)-P_0) \frac{\phi '_{\lambda }(\ell _*(c))}{\phi _{\lambda }(\ell _*(c))} =U_x(\ell _*(c),c), \qquad c\in [0,1]. \end{aligned}$$
(4.37)

Proposition 4.10

For any \(c\in [0,c_*)\) there exists at least one solution \(\ell _*(c)\in (-\infty , x^0(c))\) of (4.37) with \(x^0(c)\) as in Proposition 4.2.

Remark 4.11

A couple of remarks before we proceed.

(i) The analytical representation (4.36) in fact holds for all \(c\in [0,1]\) and it must coincide with (4.26) for \(c\in [c_*,1]\). Furthermore, the optimal boundary \(\alpha ^*_1\) found in Sect. 4.1.2 by solving (4.28) must also solve (4.37) for all \(c\in [c_*,1]\) since \(\alpha ^*_1=\ell _*\) on that set. This equivalence can be verified by comparing numerical solutions to (4.28) and (4.37). Finding a numerical solution to (4.37) for \(c\in [0,c_*)\) (if it exists) is computationally more demanding than solving (4.28), however, because of the absence of an explicit expression for the function U.

(ii) It is important to observe that the proof of Proposition 4.10 does not use that \(c\in [0,c_*)\) and in fact it holds for \(c\in [0,1]\). However, arguing as in Sect. 4.1.2 we were able to obtain further regularity properties of the optimal boundary on \([c_*,1]\), together with its uniqueness. We shall see in what follows that uniqueness can be recovered also for \(c\in [0,c_*)\), but this requires a deeper analysis.

Now that the existence of at least one candidate optimal boundary \(\ell _*\) has been established, for the purpose of performing a verification argument we would also like to establish that for arbitrary \(c\in [0,c_*)\) we have \(V^{\ell _*}(x,c)\le U(x,c)-P_0\), \(x\in \mathbb {R}\). This is verified in the following proposition (whose proof is collected in the appendix).

Proposition 4.12

For \(c\in [0,c_*)\) and for any \(\ell _*\) solving (4.37) it holds \(V^{\ell _*}(x,c)\le U(x,c)-P_0\), \(x\in \mathbb {R}\).

Finally we provide a verification theorem establishing the optimality of our candidate boundary \(\ell _*\) and, as a by-product, also implying uniqueness of the solution to (4.37).

Theorem 4.13

For each \(c\in [0,1)\) there exists a unique solution \(\ell _*(c)\) of (4.37) in \((-\infty ,x^0(c)]\). This solution is the optimal boundary of problem (3.14) in the sense that \(V^{\ell _*}=V\) on \(\mathbb {R}\times [0,1)\) [cf. (4.36)] and the stopping time

$$\begin{aligned} \tau ^*:=\tau ^*_\ell (x,c)=\inf \{t\ge 0\,:\,X^x_t\le \ell _*(c)\} \end{aligned}$$
(4.38)

is optimal in (3.14) for all \((x,c)\in \mathbb {R}\times [0,1)\).

Proof

For \(c\in [c_*,1)\) the proof was provided in Sect. 4.1.2 recalling that \(\ell _*=\alpha ^*_1\) on \([c_*,1)\) and \(V=U-P_0+\Gamma \) on \(\mathbb {R}\times [c_*,1)\) [cf. (4.17), Remark 4.11]. For \(c\in [0,c_*)\) we split the proof into two parts.

1. Optimality Fix \(\bar{c}\in [0,c_*)\). Here we prove that if \(\ell _*(\bar{c})\) is any solution of (4.37) then \(V^{\ell _*}(\,\cdot \,,\bar{c})= V(\,\cdot \,,\bar{c})\) on \(\mathbb {R}\) [cf. (3.14) and (4.36)].

First we note that \(V^{\ell _*}(\,\cdot \,,\bar{c})\ge V(\,\cdot \,,\bar{c})\) on \(\mathbb {R}\) by (3.14) and (4.35). To obtain the reverse inequality we will rely on Itô–Tanaka’s formula. Observe that \(V^{\ell _*}(\,\cdot \,,\bar{c})\in C^1(\mathbb {R})\) by (4.36) and (4.37), and \(V^{\ell _*}_{xx}(\,\cdot \,,\bar{c})\) is continuous on \(\mathbb {R}\setminus \big \{\ell _*(\bar{c})\big \}\) and bounded at the boundary \(\ell _*(\bar{c})\). Moreover from (4.36) we get

$$\begin{aligned}&\big (\mathbb {L}_X-\lambda \big )V^{\ell _*}(x,\bar{c})=0&\text {for } x>\ell _*(\bar{c}) \end{aligned}$$
(4.39)
$$\begin{aligned}&\big (\mathbb {L}_X-\lambda \big )V^{\ell _*}(x,\bar{c})=\big (\mathbb {L}_X-\lambda \big )(U-P_0)(x,\bar{c})>0&\quad \text {for }x\le \ell _*(\bar{c}) \end{aligned}$$
(4.40)

where the inequality in (4.40) holds by (4.12) since \(\ell _*(\bar{c})\le x^0(\bar{c})\) [cf. Proposition 4.10]. An application of Itô–Tanaka’s formula (see [17], Chapter 3, Problem 6.24, p. 215), (4.39), (4.40) and Proposition 4.12 give

$$\begin{aligned} V^{\ell _*}(x,\bar{c})&= \mathsf E\left[ e^{-\lambda (\tau \wedge \tau _R)}V^{\ell _*}\big (X^x_{\tau \wedge \tau _R},\bar{c}\big )- \int _0^{\tau \wedge \tau _R}{e^{-\lambda t}}\big (\mathbb {L}_X-\lambda \big )V^{\ell _*}(X^x_t,\bar{c})dt\right] \nonumber \\&\le \mathsf E\left[ e^{-\lambda (\tau \wedge \tau _R)}\Big (U\big (X^x_{\tau \wedge \tau _R},\bar{c}\big )-P_0\Big )\right] \end{aligned}$$
(4.41)

with \(\tau \) an arbitrary stopping time and \(\tau _R:=\inf \big \{t\ge 0\,:\,|X^x_t| \ge R\big \}\), \(R>0\). We now pass to the limit as \(R\rightarrow \infty \) and recall that \(|U(x,\bar{c})|\le C(1+|x|)\) [cf. Proposition 3.1] and that \(\big \{e^{-\lambda \tau _R}|X^x_{\tau _R}|\,,\,R>0\big \}\) is a uniformly integrable family (cf. Lemma 7.2 in “Appendix 1”). Then in the limit we use the dominated convergence theorem and the fact that

$$\begin{aligned} \lim _{R\rightarrow \infty }e^{-\lambda (\tau \wedge \tau _R)}X^x_{ \tau \wedge \tau _R }= e^{-\lambda \tau }X^x_{\tau },\qquad \mathsf P-a.s. \end{aligned}$$

to obtain \(V^{\ell _*}(\,\cdot \,,\bar{c})\le V(\,\cdot \,,\bar{c})\) on \(\mathbb {R}\) by the arbitrariness of \(\tau \), hence \(V^{\ell _*}(\,\cdot \,,\bar{c})=V(\,\cdot \,,\bar{c})\) on \(\mathbb {R}\) and optimality of \(\ell _*(\bar{c})\) follows.

2. Uniqueness Here we prove the uniqueness of the solution of (4.37) via probabilistic arguments similar to those employed for the first time in [20]. Let \(\bar{c}\in [0,c_*)\) be fixed and, arguing by contradiction, let us assume that there exists another solution \(\ell '(\bar{c})\ne \ell _*(\bar{c})\) of (4.37) with \(\ell '(\bar{c})\le x^0(\bar{c})\). Then by (3.14) and (4.35) it follows that

$$\begin{aligned} V^{\ell '}(\,\cdot \,,\bar{c})\ge V(\,\cdot \,,\bar{c})=V^{\ell _*}(\,\cdot \,,\bar{c})\qquad \text { on } \mathbb {R}, \end{aligned}$$
(4.42)

\(V^{\ell '}(\,\cdot \,,\bar{c})\in C^1(\mathbb {R})\) and \(V^{\ell '}_{xx}(\,\cdot \,,\bar{c})\in L^\infty _{loc}(\mathbb {R})\) by the same arguments as in 1. above. By construction \(V^{\ell '}\) solves (4.39) and (4.40) with \(\ell _*\) replaced by \(\ell '\).

Assume for example that \(\ell '(\bar{c})< \ell _*(\bar{c})\), take \(x<\ell '(\bar{c})\) and set \(\sigma ^*_{\ell }:=\inf \big \{t\ge 0\,:\,X^x_t\ge \ell _*(\bar{c})\big \}\), then an application of Itô–Tanaka’s formula gives (up to a localisation argument as in 1. above)

$$\begin{aligned} \mathsf E\left[ e^{-\lambda \sigma _{\ell }^*} V^{\ell '} \big ( X^x_{\sigma ^*_{\ell }}, \bar{c} \big ) \right]&= V^{\ell '}(x,\bar{c})+\mathsf E\left[ \int ^{\sigma ^*_{\ell }}_0{e^{-\lambda t}\big (\mathbb {L}_X-\lambda \big ) V^{\ell '}\big (X^x_t,\bar{c}\big )dt}\right] \\&= V^{\ell '}(x,\bar{c})+\mathsf E\left[ \int ^{\sigma ^*_{\ell }}_0{e^{-\lambda t}\big (\mathbb {L}_X-\lambda \big ) \big (U\big (X^x_t,\bar{c}\big )-P_0\big )\mathbbm {1}_{\{X^x_t<\ell '(\bar{c})\}}dt}\right] \nonumber \end{aligned}$$
(4.43)

and

$$\begin{aligned} \mathsf E\left[ e^{-\lambda \sigma ^*_{\ell }} V \big ( X^x_{\sigma ^*_{\ell }}, \bar{c} \big ) \right]&= V(x,\bar{c})+\mathsf E\left[ \int ^{\sigma ^*_{\ell }}_0{e^{-\lambda t}\big (\mathbb {L}_X-\lambda \big ) \big (U\big (X^x_t,\bar{c}\big )-P_0\big )dt}\right] . \end{aligned}$$
(4.44)

Recall that \(V^{\ell '}(X^x_{\sigma ^*_{\ell }},\bar{c})\ge V(X^x_{\sigma ^*_{\ell }},\bar{c})\) by (4.42) and that for \(x<\ell '(\bar{c})\le \ell _*(\bar{c})\) one has \(V(x,\bar{c})=V^{\ell '}(x,\bar{c})=U(x,\bar{c})-P_0\), hence subtracting (4.44) from (4.43) we get

$$\begin{aligned} -\mathsf E\left[ \int ^{\sigma ^*_{\ell }}_0{e^{-\lambda t}\big (\mathbb {L}_X-\lambda \big ) \big (U\big (X^x_t,\bar{c}\big )-P_0\big )\mathbbm {1}_{\{\ell '(\bar{c})<X^x_t<\ell _*(\bar{c})\}}dt}\right] \ge 0. \end{aligned}$$
(4.45)

By the continuity of paths of \(X^x\) we must have \(\sigma ^*_\ell >0\), \(\mathsf P\)-a.s. and since the law of X is absolutely continuous with respect to the Lebesgue measure we also have \(\mathsf P\big (\{\ell '(\bar{c})<X^x_t<\ell _*(\bar{c})\}\big )>0\) for all \(t > 0\). Therefore (4.45) and (4.40) lead to a contradiction and we conclude that \(\ell '(\bar{c})\ge \ell _*(\bar{c})\).

Let us now assume that \(\ell '(\bar{c})> \ell _*(\bar{c})\) and take \(x\in \big (\ell _*(\bar{c}),\ell '(\bar{c})\big )\). We recall the stopping time \(\tau ^*\) of (4.38) and again we use Itô–Tanaka’s formula to obtain

$$\begin{aligned} \mathsf E\left[ e^{-\lambda \tau ^*}V\big (X^x_{\tau ^*},\bar{c}\big )\right] =V(x,\bar{c}) \end{aligned}$$
(4.46)

and

$$\begin{aligned} \mathsf E\left[ e^{-\lambda \tau ^*} V^{\ell '} \big ( X^x_{\tau ^*}, \bar{c} \big ) \right] =V^{\ell '}(x,\bar{c})+\mathsf E\bigg [\int ^{\tau ^*}_0{e^{-\lambda t}\big (\mathbb {L}_X-\lambda \big ) \big (U\big (X^x_t,\bar{c}\big )-P_0\big )\mathbbm {1}_{\{X^x_t<\ell '(\bar{c})\}}dt}\bigg ] \end{aligned}$$
(4.47)

Now, we have \(V(x,\bar{c})\le V^{\ell '}(x,\bar{c})\) by (4.42) and \(V^{\ell '} \big ( X^x_{\tau ^*}, \bar{c} \big )=V\big (X^x_{\tau ^*},\bar{c}\big )=U(\ell _*(\bar{c}),\bar{c})-P_0\), \(\mathsf P\)-a.s. by construction, since \(\ell '(\bar{c})>\ell _*(\bar{c})\) and X is positively recurrent (cf. “Appendix 1”). Therefore subtracting (4.46) from (4.47) gives

$$\begin{aligned} \mathsf E\left[ \int ^{\tau ^*}_0{e^{-\lambda t}\big (\mathbb {L}_X-\lambda \big ) \big (U\big (X^x_t,\bar{c}\big )-P_0\big )\mathbbm {1}_{\{\ell _*(\bar{c})<X^x_t<\ell '(\bar{c})\}}dt}\right] \le 0. \end{aligned}$$
(4.48)

Arguments analogous to those following (4.45) can be applied to (4.48) to find a contradiction. Then we have \(\ell '(\bar{c})= \ell _*(\bar{c})\) and by the arbitrariness of \(\bar{c}\) the first claim of the theorem follows. \(\square \)

Remark 4.14

The arguments developed in this section hold for all \(c\in [0,1]\). The reduction of (3.14) to the auxiliary problem of Sect. 4.1.1 is not necessary to provide an algebraic equation for the optimal boundary. Nonetheless, it seems convenient to resort to the auxiliary problem whenever possible due to its analytical and computational tractability. In contrast to Sect. 4.1.2, here we cannot establish either the monotonicity or continuity of the optimal boundary \(\ell _*\).

5 The case \(\hat{c}>1\)

In what follows we assume that \(\hat{c}>1\), i.e. \(k(c)<0\) for all \(c\in [0,1]\). As pointed out in Proposition 3.1-(ii) the solution of the control problem in this setting substantially departs from the one obtained for \(\hat{c}<0\). Both the value function and the optimal control exhibit a structure that is fundamentally different, and we recall here some results from [7, Sec. 3].

The function U has the following analytical representation:

$$\begin{aligned} U(x,c)=\left\{ \begin{array}{ll} \displaystyle \tfrac{\psi _{\lambda }(x)}{\psi _{\lambda }(\gamma _*(c))}\left[ \gamma _*(c)(1-c) - \lambda \,\Phi (c) \big (\tfrac{\gamma _*(c)-\mu }{\lambda +\theta }+\tfrac{\mu }{\lambda }\big )\right] +\lambda \,\Phi (c) \left[ \tfrac{x-\mu }{\lambda +\theta }+\tfrac{\mu }{\lambda }\right] , &{} \quad \text {for }x<{\gamma _*(c)} \\ \displaystyle x(1-c), &{} \quad \text {for }x\ge \gamma _*(c) \end{array} \right. \end{aligned}$$
(5.1)

with \(\gamma _*\) as in Proposition 3.1-(ii). In this setting U is less regular than in the case \(\hat{c}<0\): here we only have \(U(\,\cdot \,,c)\in W^{2,\infty }_{loc}(\mathbb {R})\) for all \(c\in [0,1]\) [cf. Proposition 3.1-(ii)], and hence we expect \(x\mapsto \mathcal {L}(x,c)+\lambda P_0:=(\mathbb {L}_X-\lambda )(U-P_0)(x,c)\) to have a discontinuity at the optimal boundary \(\gamma _*(c)\). For \(c\in [0,1]\) we define

$$\begin{aligned} \Delta ^{\mathcal {L}}(x,c):=\mathcal {L}(x+,c)-\mathcal {L}(x-,c),\qquad x\in \mathbb {R}, \end{aligned}$$
(5.2)

where \(\mathcal {L}(x+,c)\) denotes the right limit of \(\mathcal {L}(\,\cdot \,,c)\) at x and \(\mathcal {L}(x-,c)\) its left limit.

Proposition 5.1

For each \(c\in [0,1)\) the map \(x\mapsto \mathcal {L}(x,c)+\lambda P_0\) is \(C^\infty \) and strictly decreasing on \((-\infty ,\gamma _*(c))\) and on \((\gamma _*(c),+\infty )\) whereas

$$\begin{aligned} \Delta ^{\mathcal {L}}(\gamma _*(c),c)=(1-c)\big [\theta \mu -(\lambda +\theta )\gamma _*(c)\big ]+\lambda \gamma _*(c)\Phi (c)>0. \end{aligned}$$
(5.3)

Moreover, define

$$\begin{aligned} x^0_1(c):= \frac{P_0}{\Phi (c)}\quad \text {and}\quad x^0_2(c):= \frac{\theta \mu (1-c) + \lambda P_0}{(\lambda + \theta )(1-c)},\quad c\in [0,1); \end{aligned}$$
(5.4)

then for each \(c\in [0,1)\) there are three possible settings, that is

  1.

    \(\gamma _*(c) \le x^0_1(c)\) hence \(\mathcal {L}(x,c)+\lambda P_0> 0\) if and only if \(x< x^0_2(c)\);

  2.

    \(\gamma _*(c) \ge x^0_2(c)\) hence \(\mathcal {L}(x,c)+\lambda P_0>0\) if and only if \(x< x^0_1(c)\);

  3.

    \(x^0_1(c)< \gamma _*(c) < x^0_2(c)\) hence \(\mathcal {L}(x,c)+\lambda P_0>0\) if and only if \(x\in (-\infty ,x^0_1(c))\cup (\gamma _*(c),x^0_2(c))\).
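
The three settings above depend only on the relative position of \(\gamma _*(c)\) and the two thresholds in (5.4), which makes them straightforward to check numerically. The sketch below is illustrative and not from the paper: the parameter values and the supplied value of \(\gamma _*(c)\) are hypothetical.

```python
# Illustrative check of the case distinction in Proposition 5.1: given the model
# parameters and an externally supplied value of gamma_*(c), compute x0_1(c) and
# x0_2(c) from (5.4) and report which of the three settings applies.
# All default values below are hypothetical stand-ins.
def thresholds(c, P0=0.2, theta=1.0, mu=0.5, lam=0.8, Phi=lambda c: 1.0 - 0.5 * c):
    x01 = P0 / Phi(c)                                              # x0_1(c) in (5.4)
    x02 = (theta * mu * (1 - c) + lam * P0) / ((lam + theta) * (1 - c))  # x0_2(c)
    return x01, x02

def setting(c, gamma_star, **kw):
    x01, x02 = thresholds(c, **kw)
    if x01 < gamma_star < x02:
        return 3   # L + lam*P0 > 0 iff x in (-inf, x01) or (gamma_*, x02)
    if gamma_star >= x02:
        return 2   # L + lam*P0 > 0 iff x < x01
    return 1       # gamma_* <= x01: L + lam*P0 > 0 iff x < x02
```

For instance, with the hypothetical defaults above one has \(x^0_1(0)=0.2\) and \(x^0_2(0)\approx 0.367\), so a boundary value \(\gamma _*(0)=0.3\) falls in the third setting.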

Proof

The first claim follows by (5.1) and the sign of \(\Delta ^\mathcal {L}(\gamma _*(c),c)\) may be verified by recalling that \(\gamma _*(c)\ge \tilde{x}(c)\) [cf. Proposition 3.1-(ii)]. Checking 1, 2 and 3 is a matter of simple algebra. \(\square \)

We may use Proposition 5.1 to expand the discussion in Sect. 4. In particular, from the first and second parts we see that if either \(\gamma _*(c) \ge x^0_2(c)\) or \(\gamma _*(c) \le x^0_1(c)\) then the optimal stopping strategy must be of single threshold type. On the other hand, for \(x^0_1(c)< \gamma _*(c) < x^0_2(c)\), as discussed in Sect. 4, there are two possible shapes for the continuation set. This is the setting for the preliminary discussion which follows.

If the size of the interval \((\gamma _*(c),x^0_2(c))\) is “small” and/or the absolute value of \(\mathcal {L}(x,c)+\lambda P_0\) on \((\gamma _*(c),x^0_2(c))\) is “small” compared to its absolute value on \((x^0_1(c),\gamma _*(c))\cup (x^0_2(c),+\infty )\), then although continuation incurs a positive cost while the process is in \((\gamma _*(c),x^0_2(c))\), the expected reward from subsequently entering the neighbouring intervals (where \(\mathcal {L}(x,c)+\lambda P_0<0\)) may be sufficiently large that continuation is nevertheless optimal in \((\gamma _*(c),x^0_2(c))\). In this case there is a single lower optimal stopping boundary, which lies below \(x^0_1(c)\) (see Figs. 1, 2a).

Fig. 2

The function \(x \mapsto \mathcal {L}(x,c)+\lambda P_0\) changes sign in both plots but, with the visual aid of Fig. 1, the stopping region is connected in a and is disconnected in b. (a) Illustration when \(c = 0\), (b) Illustration when \(c = 0.25\).

If the size of \((\gamma _*(c),x^0_2(c))\) is “big” and/or the absolute value of \(\mathcal {L}(x,c)+\lambda P_0\) in \((\gamma _*(c),x^0_2(c))\) is “big” compared to its absolute value in \((x^0_1(c),\gamma _*(c))\cup (x^0_2(c),+\infty )\) then we may find a portion of the stopping set below \(x^0_1(c)\) and another portion inside the interval \((\gamma _*(c),x^0_2(c))\). In this case the loss incurred by continuation inside a certain subset of \((\gamma _*(c),x^0_2(c))\) may be too great to be mitigated by the expected benefit of subsequent entry into the profitable neighbouring intervals and it becomes optimal to stop at once. In the third case of Proposition 5.1, the continuation and stopping regions may therefore be disconnected sets (see Figs. 1, 2b).

To make this discussion rigorous let us now recall \(\mathcal C_V^c\) and \(\mathcal D_V^c\) from (4.11). Note that for any fixed \(c\in [0,1)\) and arbitrary stopping time \(\tau \) the map \(x\mapsto \mathsf E[e^{-\lambda \tau }\big (U(X^x_\tau ,c)-P_0\big )]\) is continuous, hence \(x\mapsto V(x,c)\) is upper semicontinuous (being the infimum of continuous functions). Recall that X is positively recurrent and therefore it hits any point of \(\mathbb {R}\) in finite time with probability one (see “Appendix 1” for details). Hence according to standard optimal stopping theory, if \(\mathcal D_V^c\ne \emptyset \) the first entry time of X in \(\mathcal D_V^c\) is an optimal stopping time (cf. e.g. [21, Ch. 1, Sec. 2, Corollary 2.9]).

Proposition 5.2

Let \(c\in [0,1)\) be fixed. Then

  (i)

    if \(\gamma _*(c) \ge x^0_2(c)\), there exists \(\ell _*(c)\in (-\infty ,x^0_1(c))\) such that \(\mathcal D_V^c=(-\infty ,\ell _*(c)]\) and \(\tau _*=\inf \{t\ge 0\,:\,X^x_t\le \ell _*(c)\}\) is optimal in (3.14)

  (ii)

    if \(\gamma _*(c) \le x^0_1(c)\), there exists \(\ell _*(c)\in (-\infty ,x^0_2(c))\) such that \(\mathcal D_V^c=(-\infty ,\ell _*(c)]\) and \(\tau _*=\inf \{t\ge 0\,:\,X^x_t\le \ell _*(c)\}\) is optimal in (3.14)

  (iii)

    if \(x^0_1(c)< \gamma _*(c) < x^0_2(c)\), there exists \(\ell ^{(1)}_*(c)\in (-\infty ,x^0_1(c))\) such that \(\mathcal D_V^c\cap (-\infty ,\gamma _*(c)]=(-\infty ,\ell ^{(1)}_*(c)]\). Moreover, either (a): \(\mathcal D_V^c\cap [\gamma _*(c),\infty )=\emptyset \) and \(\tau _*=\inf \{t\ge 0\,:\,X^x_t\le \ell ^{(1)}_*(c)\}\) is optimal in (3.14), or (b): there exist \(\ell _*^{(2)}(c)\le \ell _*^{(3)}(c)\le x^0_{2}(c)\) such that \(\mathcal D_V^c\cap [\gamma _*(c),\infty )=[\ell _*^{(2)}(c),\ell _*^{(3)}(c)]\) (with the convention that if \(\ell _*^{(2)}(c)=\ell _*^{(3)}(c)=:\ell _*(c)\) then \(\mathcal D_V^c\cap [\gamma _*(c),\infty )=\{\ell _*(c)\}\)) and the stopping time

    $$\begin{aligned} \tau ^{(II)}_*:=\inf \{t\ge 0\,:\,X^x_t\le \ell ^{(1)}_*(c)\,\,\text {or}\,\,X^x_t\in [\ell _*^{(2)}(c),\ell _*^{(3)}(c)]\} \end{aligned}$$
    (5.5)

    is optimal in (3.14).

Proof

We provide a detailed proof only for (iii) as the other claims follow by analogous arguments. Let us fix \(c\in [0,1)\) and assume \(x^0_1(c)< \gamma _*(c) < x^0_2(c)\).

Step 1 We start by proving that \(\mathcal D_V^c\ne \emptyset \). By localisation and an application of Itô’s formula in its generalised version (cf. [11, Ch. 8]) to (3.14) and recalling Proposition 5.1 we get

$$\begin{aligned} V(x,c)=U(x,c)-P_0+\inf _\tau \mathsf E\left[ \int _0^\tau {e^{-\lambda t}\Big (\lambda P_0+\mathcal {L}(X^x_t,c)\Big )dt}\right] \quad \text {for } x\in \mathbb {R}. \end{aligned}$$
(5.6)
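
The identity (5.6) comes from applying Dynkin's formula to \(U-P_0\) (up to the localisation mentioned above) together with the definition \(\mathcal {L}(x,c)+\lambda P_0=(\mathbb {L}_X-\lambda )(U-P_0)(x,c)\): for any stopping time \(\tau \),

$$\begin{aligned} \mathsf E\Big [e^{-\lambda \tau }\big (U(X^x_\tau ,c)-P_0\big )\Big ] =U(x,c)-P_0+\mathsf E\bigg [\int _0^\tau e^{-\lambda t}\Big (\lambda P_0+\mathcal {L}(X^x_t,c)\Big )dt\bigg ], \end{aligned}$$

and (5.6) follows upon taking the infimum over \(\tau \) on both sides.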

Arguing by contradiction, we assume that \(\mathcal D_V^c=\emptyset \), so that the infimum in (5.6) is attained by formally setting \(\tau =+\infty \). Moreover, recalling that U solves (3.5), we observe that \(\mathcal {L}(X^x_t,c)\ge -\lambda X^x_t\Phi (c)\) \(\mathsf P\)-a.s. for all \(t\ge 0\), and (5.6) gives

$$\begin{aligned} V(x,c)\ge U(x,c)-P_0+R(x,c)\qquad \text {for }x\in \mathbb {R}\end{aligned}$$
(5.7)

where

$$\begin{aligned} R(x,c):=\mathsf E\bigg [\int _0^\infty {e^{-\lambda t}\lambda \Big (P_0-X^x_t\Phi (c)\Big )dt}\bigg ]\qquad \text {for } x\in \mathbb {R}. \end{aligned}$$
(5.8)

It is not hard to see from (5.8) that for sufficiently negative values of x we have \(R(x,c)>0\) and (5.7) implies that \(\mathcal D_V^c\) cannot be empty.

Step 2 Here we prove that \(\mathcal D_V^c\cap (-\infty ,\gamma _*(c)]=(-\infty ,\ell ^{(1)}_*(c)]\) for suitable \(\ell ^{(1)}_*(c)\le x^0_1(c)\). The previous step has already shown that it is optimal to stop at once for sufficiently negative values of x. It now remains to prove that if \(x\in \mathcal D_V^c\cap (-\infty ,\gamma _*(c)]\) then \(x'\in \mathcal D_V^c\cap (-\infty ,\gamma _*(c)]\) for any \(x'<x\). For this, fix \(\bar{x}\in \mathcal D_V^c\cap (-\infty ,\gamma _*(c)]\) and let \(x'<\bar{x}\). Note that the process \(X^{x'}\) cannot reach a subset of \(\mathbb {R}\) where \(\lambda P_0+\mathcal {L}(\,\cdot \,,c)<0\) [cf. Proposition 5.1-(3)] without crossing \(\bar{x}\) and hence entering \(\mathcal D_V^c\). Therefore, if \(x'\in \mathcal C_V^c\) and \(\tau _*(x')\) is the associated optimal stopping time, i.e. \(\tau _*(x'):=\inf \{t\ge 0\,:\,X^{x'}_t\in \mathcal D^c_V\}\), we must have

$$\begin{aligned} V(x',c)&= U(x',c)-P_0+\mathsf E\left[ \int _0^{\tau _*(x')} {e^{-\lambda t}\Big (\lambda P_0+\mathcal {L}(X^{x'}_t,c)\Big )dt}\right] \ge U(x',c)-P_0, \end{aligned}$$
(5.9)

giving a contradiction and implying that \(x'\in \mathcal D_V^c\).

Step 3 We now aim to prove that if \(\mathcal D_V^c\cap [\gamma _*(c),\infty )\ne \emptyset \) then \(\mathcal D_V^c\cap [\gamma _*(c),\infty )=[\ell _*^{(2)}(c),\ell _*^{(3)}(c)]\) for suitable \(\ell _*^{(2)}(c)\le \ell _*^{(3)}(c)\le x^0_{2}(c)\). The case of \(\mathcal D_V^c\cap [\gamma _*(c),\infty )\) containing a single point is self-explanatory. We then assume that there exist \(x<x'\) such that \(x,x'\in \mathcal D_V^c\cap [\gamma _*(c),\infty )\) and prove that also \([x,x']\subseteq \mathcal D_V^c\cap [\gamma _*(c),\infty )\).

Looking for a contradiction, let us assume that there exists \(y\in (x,x')\) such that \(y\in \mathcal C_V^c\). The process \(X^y\) cannot reach a subset of \(\mathbb {R}\) where \(\lambda P_0+\mathcal {L}(\,\cdot \,,c)<0\) without leaving the interval \((x,x')\) [cf. Proposition 5.1-(3)]. Then, by arguing as in (5.9), with the associated optimal stopping time \(\tau _*(y):=\inf \{t\ge 0\,:\,X^{y}_t\in \mathcal D^c_V\}\), we inevitably reach a contradiction. Hence the claim follows. \(\square \)

Before proceeding further we clarify the dichotomy in part (iii) of Proposition 5.2, as follows. Lemma 5.3 below characterises the subcases (iii)(a) and (iii)(b) via condition (5.10). Remark 5.4 then shows that this condition does nothing more than compare the minima of two convex functions.

Lemma 5.3

Fix \(c \in [0,1)\) and suppose that \(x^0_1(c)< \gamma _*(c) < x^0_2(c)\). Then \(\mathcal {D}_V^c \cap [\gamma _*(c),\infty )=\emptyset \) if and only if there exists \(\ell _*(c)\in (-\infty ,x^0_1(c))\) such that for every \(x \ge \gamma _*(c)\):

$$\begin{aligned} \frac{U(x,c)-P_0}{\phi _{\lambda }\left( x\right) } > \frac{U\left( \ell _*(c),c\right) -P_0}{\phi _{\lambda }\left( \ell _*(c)\right) }. \end{aligned}$$
(5.10)

Proof

(i) Necessity If \(\mathcal {D}_V^c \cap [\gamma _*(c),\infty )=\emptyset \), then by Proposition 5.2-(iii) there exists a point \(\ell _*(c)\in (-\infty ,x^0_1(c))\) such that \(\mathcal {D}_V^c=(-\infty ,\ell _*(c)]\). Let \(x \ge \gamma _*(c)\) be arbitrary and notice that \(V(x,c) < U(x,c)-P_0\) since the current hypothesis implies \(x \in [\gamma _*(c),\infty ) \subset \mathcal {C}^c_V\). According to Proposition 5.2-(iii), the stopping time \(\tau _*\) defined by

$$\begin{aligned} \tau _* := \inf \{t\ge 0\,:\,X^x_t\le \ell _*(c)\} \end{aligned}$$
(5.11)

is optimal in (3.14). On the other hand, since X has continuous sample paths and \(\mathsf {P}_{x}(\{\tau _* < \infty \}) = 1\) by positive recurrence of X, we can also show that

$$\begin{aligned} U(x,c)-P_0 > V(x,c)&= \mathsf {E}_{x}\bigl [e^{-\lambda \tau _*}\left( U(X_{\tau _*},c)-P_0\right) \bigr ] \nonumber \\&= \mathsf {E}_{x}\bigl [e^{-\lambda \tau _*}\left( U(\ell _*(c),c)-P_0\right) \bigr ] \nonumber \\&= \left( U(\ell _*(c),c)-P_0\right) \mathsf {E}_{x}\bigl [e^{-\lambda \tau _*}\bigr ] \nonumber \\&= \left( U(\ell _*(c),c)-P_0\right) \frac{\phi _{\lambda }\left( x\right) }{\phi _{\lambda }\left( \ell _*(c)\right) } \end{aligned}$$
(5.12)

where the last line follows from (6.5). Since \(x \ge \gamma _*(c)\) was arbitrary we have proved the necessity of the claim.

(ii) Sufficiency Suppose now that there exists a point \(\ell _*(c)\in (-\infty ,x^0_1(c))\) such that (5.10) holds for every \(x \ge \gamma _*(c)\). Using the same arguments that establish the right-hand side of (5.12), and noting that \(\tau _*\) as defined in (5.11) need no longer be optimal, for every \(x \ge \gamma _*(c)\) we have

$$\begin{aligned} V(x,c)&\le \mathsf {E}_{x}\bigl [e^{-\lambda \tau _*}\left( U(X_{\tau _*},c)-P_0\right) \bigr ]\\&= \left( U(\ell _*(c),c)-P_0\right) \frac{\phi _{\lambda }\left( x\right) }{\phi _{\lambda }\left( \ell _*(c)\right) } \\&< U(x,c)-P_0 \end{aligned}$$

which shows \(\mathcal {D}_V^c \cap [\gamma _*(c),\infty )=\emptyset \). \(\square \)

Remark 5.4

Let us fix \(c \in [0,1)\) such that \(x^0_1(c)< \gamma _*(c) < x^0_2(c)\), or equivalently part (iii) of Proposition 5.2 holds. Writing

$$\begin{aligned} \mathcal {F}(x)&:= \frac{U(x,c)-P_0}{\phi _{\lambda }\left( x\right) }, \end{aligned}$$
(5.13)
$$\begin{aligned} F(x)&:= \psi _\lambda (x)/\phi _\lambda (x), \end{aligned}$$
(5.14)
$$\begin{aligned} H(y)&:= \mathcal {F}\circ F^{-1}(y) \quad \text { for } y > 0, \end{aligned}$$
(5.15)

we will appeal to the discussion given at the start of Section 6 of [5]. Since \(\mathcal {L}(x,c)+\lambda P_0>0\) if and only if \(x\in (-\infty ,x^0_1(c))\cup (\gamma _*(c),x^0_2(c))\) (from Proposition 5.1), it follows from equation (*) in Section 6 of [5] that the function \(y \mapsto H(y)\) is strictly convex on \((0,F(x^0_1(c)))\) and on \((F(\gamma _*(c)),F(x^0_2(c)))\) and concave everywhere else on its domain. Define \(y_m^1\) and \(y_m^2\) by

$$\begin{aligned} \begin{aligned} y_m^1&:= \arg \min \{H(y): y\in (0,F(x^0_1(c)))\} \\ y_m^2&:= \arg \min \{H(y): y\in (F(\gamma _*(c)),F(x^0_2(c)))\}. \end{aligned} \end{aligned}$$
(5.16)

By Eq. (5.1) above, and the fact that F is monotone increasing, we have that \(\lim _{y \rightarrow +\infty }H(y) = \lim _{x\rightarrow +\infty }\mathcal F(x) = +\infty \) (recall that \(\phi _\lambda \) is positive and decreasing). Also

$$\begin{aligned} \begin{aligned} m_1&:= \inf _{x \le x^0_1(c)} \mathcal F(x)= \mathcal F(F^{-1}(y_m^1)) \\ m_2&:= \inf _{x \ge \gamma _*(c)} \mathcal F(x)= \mathcal F(F^{-1}(y_m^2)), \end{aligned} \end{aligned}$$
(5.17)

where the final equality in each line follows from the definition of \(y_m^1\), respectively \(y_m^2\), together with the aforementioned geometric properties of \(y \mapsto H(y)\). It is therefore clear from Lemma 5.3 that if \(m_1<m_2\) then part (iii)(a) of Proposition 5.2 holds, while if \(m_1\ge m_2\) then part (iii)(b) of Proposition 5.2 holds.
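The comparison of \(m_1\) and \(m_2\) in (5.17) is straightforward to carry out numerically once the model data are specified. The sketch below uses entirely hypothetical stand-ins (a quadratic obstacle for \(U(\cdot ,c)-P_0\), \(\phi _\lambda (x)=e^{-x}\), and made-up search windows in place of \((-\infty ,x^0_1(c)]\) and \([\gamma _*(c),x^0_2(c))\)); only the recipe, minimising the ratio (5.13) over each region and comparing the two infima, reflects the discussion above.

```python
import math
from scipy.optimize import minimize_scalar

# Hypothetical stand-ins for the model data (illustration only).
def obstacle(x):           # plays the role of U(x, c) - P_0
    return (x - 1.0) ** 2 + 0.5

def phi(x):                # plays the role of phi_lambda
    return math.exp(-x)

def calF(x):               # the ratio (5.13)
    return obstacle(x) / phi(x)

# Made-up compact windows standing in for (-inf, x1(c)] and
# [gamma_*(c), x2(c)); in the model these come from Proposition 5.1.
left = minimize_scalar(calF, bounds=(-3.0, -1.0), method="bounded")
right = minimize_scalar(calF, bounds=(0.2, 2.0), method="bounded")
m1, m2 = left.fun, right.fun

# m1 < m2 corresponds to case (iii)(a) of Proposition 5.2 (threshold
# entry); m1 >= m2 to case (iii)(b) (disconnected stopping set).
case = "(iii)(a)" if m1 < m2 else "(iii)(b)"
print(m1, m2, case)
```
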

5.1 The optimal boundaries

We will characterise the four cases (i), (ii), (iii)(a) and (iii)(b) of Proposition 5.2 through direct probabilistic analysis of the value function and subsequently derive equations for the optimal boundaries obtained in the previous section. We first address cases (i) and (ii) of Proposition 5.2.

Theorem 5.5

Let \(c\in [0,1)\) and \(\mathscr {B}\) be a subset of \(\mathbb {R}\). Consider the following problem: Find \(x \in \mathscr {B}\) such that

$$\begin{aligned} \big (U(x,c)-P_0\big )\frac{\phi '_\lambda (x)}{\phi _\lambda (x)}=U_x(x,c). \end{aligned}$$
(5.18)
  1. (i)

    If \(\gamma _*(c) \ge x^0_2(c)\), let \(\ell _*(c)\) be given as in Proposition 5.2-(i). Then \(V(x,c)=V^{\ell _*}(x,c)\) for \(x\in \mathbb {R}\) (cf. (4.36)), and \(\ell _*(c)\) is the unique solution to (5.18) in \(\mathscr {B} = (-\infty ,x^0_1(c))\).

  2. (ii)

    If \(\gamma _*(c) \le x^0_1(c)\), let \(\ell _*(c)\) be given as in Proposition 5.2-(ii). Then \(V(x,c)=V^{\ell _*}(x,c)\) for \(x\in \mathbb {R}\) (cf. (4.36)), and \(\ell _*(c)\) is the unique solution to (5.18) in \(\mathscr {B} =(-\infty ,x^0_2(c))\).

Proof

We only provide details for the proof of (i), as the second part is completely analogous.

From Proposition 5.2-(i) we know that \(\ell _*(c)\in (-\infty ,x^0_2(c))\) and that taking \(\tau _*(x):=\inf \{t\ge 0\,:\,X^x_t\le \ell _*(c)\}\) is optimal for (3.14), hence the value function V is given by (4.36) with \(\ell =\ell _*\) (the proof is the same as that of Lemma 4.9). If we can prove that smooth fit holds then \(\ell _*\) must also be a solution to (5.18). To simplify notation set \(\ell _*=\ell _*(c)\) and notice that

$$\begin{aligned} \frac{V(\ell _*+\varepsilon ,c)-V(\ell _*,c)}{\varepsilon }\le \frac{U(\ell _*+\varepsilon ,c)-U(\ell _*,c)}{\varepsilon }\,,\qquad \varepsilon >0. \end{aligned}$$
(5.19)

On the other hand, consider \(\tau _\varepsilon :=\tau _*(\ell _*+\varepsilon )=\inf \{t\ge 0\,:\,X^{\ell _*+\varepsilon }_t\le \ell _*\}\) and note that \(\tau _\varepsilon \rightarrow 0\), \(\mathsf P\)-a.s. as \(\varepsilon \rightarrow 0\) (which can be proved by standard arguments based on the law of the iterated logarithm) and therefore \(X^{\ell _*+\varepsilon }_{\tau _\varepsilon }\rightarrow \ell _*\), \(\mathsf P\)-a.s. as \(\varepsilon \rightarrow 0\) by the continuity of \((t,x)\mapsto X^x_t(\omega )\) for \(\omega \in \Omega \). Since \(\tau _\varepsilon \) is optimal in Eq. (3.14) with \(x = \ell _*+\varepsilon \) we obtain

$$\begin{aligned} \frac{V(\ell _*+\varepsilon ,c)-V(\ell _*,c)}{\varepsilon }\ge \frac{\mathsf E\Big [ e^{-\lambda \tau _\varepsilon }\big (U(X^{\ell _*+\varepsilon }_{\tau _\varepsilon },c)-U(X^{\ell _*}_{\tau _\varepsilon },c)\big )\Big ]}{\varepsilon }\,,\qquad \varepsilon >0. \end{aligned}$$
(5.20)

The mean value theorem, (6.1) in “Appendix 1” and (5.20) give

$$\begin{aligned} \frac{V(\ell _*+\varepsilon ,c)-V(\ell _*,c)}{\varepsilon }&\ge \frac{\mathsf E\Big [ e^{-\lambda \tau _\varepsilon }U_x(\xi _\varepsilon ,c)\big (X^{\ell _*+\varepsilon }_{\tau _\varepsilon }-X^{\ell _*}_{\tau _\varepsilon }\big )\Big ]}{\varepsilon }=\mathsf E\Big [ e^{-(\lambda +\theta )\tau _\varepsilon }U_x(\xi _\varepsilon ,c)\Big ], \end{aligned}$$
(5.21)

with \(\xi _\varepsilon \in [X^{\ell _*}_{\tau _\varepsilon }, X^{\ell _*+\varepsilon }_{\tau _\varepsilon }]\), \(\mathsf P\)-a.s. From (5.1) one has that \(U_x(\,\cdot \,,c)\) is bounded on \(\mathbb {R}\), hence taking limits as \(\varepsilon \rightarrow 0\) in (5.19) and (5.21) and using the dominated convergence theorem in the latter we get \(V_x(\ell _*,c)=U_x(\ell _*,c)\), and since \(V(\,\cdot \,,c)\) is concave (see Lemma 3.4) it must also be \(C^1\) across \(\ell _*\), i.e. smooth fit holds. In particular this means that differentiating (4.36) at \(\ell _*\) we observe that \(\ell _*\) solves (5.18). The uniqueness of this solution can be proved by the same arguments as those in part 2 of the proof of Theorem 4.13 and we omit them here for brevity. \(\square \)
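Numerically, (5.18) is a scalar root-finding problem on \(\mathscr {B}\). A minimal sketch with made-up ingredients (none of which come from the model): taking \(U(x,c)-P_0=x^2-1\) and \(\phi _\lambda (x)=e^{-x}\), so that \(\phi '_\lambda /\phi _\lambda \equiv -1\), equation (5.18) reduces to \(x^2+2x-1=0\), which has the root \(\sqrt{2}-1\) in the bracket chosen below.

```python
from scipy.optimize import brentq

# Made-up ingredients for illustration (not the paper's U or phi).
def obstacle(x):           # U(x, c) - P_0
    return x * x - 1.0

def obstacle_x(x):         # U_x(x, c)
    return 2.0 * x

def phi_ratio(x):          # phi_lambda'(x) / phi_lambda(x), here constant
    return -1.0

def smooth_fit(x):         # left side minus right side of (5.18)
    return obstacle(x) * phi_ratio(x) - obstacle_x(x)

# Bracket playing the role of the set B in Theorem 5.5; smooth_fit
# changes sign on [0, 1], so Brent's method finds the unique root.
ell_star = brentq(smooth_fit, 0.0, 1.0)
print(ell_star)  # sqrt(2) - 1 ~= 0.41421
```
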

Next we address cases (iii)(a) and (iii)(b) of Proposition 5.2. Let us define

$$\begin{aligned} F_1(\xi ,\zeta ):=\psi _{\lambda }(\xi ) \phi _{\lambda }(\zeta )-\psi _{\lambda }(\zeta )\phi _{\lambda }(\xi )\quad \text {and}\quad F_2(\xi ,\zeta ):=\psi _{\lambda }'(\xi )\phi _{\lambda }(\zeta )-\psi _{\lambda }(\zeta )\phi _{\lambda }'(\xi ) \end{aligned}$$
(5.22)

for \(\xi ,\zeta \in \mathbb {R}\).
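The functions in (5.22) satisfy \(F_1(\xi ,\zeta )=-F_1(\zeta ,\xi )\) and \(F_2(\xi ,\zeta )=\partial _\xi F_1(\xi ,\zeta )\), while \(F_2(\xi ,\xi )\) is the Wronskian of the pair \((\psi _\lambda ,\phi _\lambda )\); these identities underlie the smooth-fit system below. A quick numerical check, using for illustration \(\psi _\lambda (x)=e^x\) and \(\phi _\lambda (x)=e^{-x}\) (the increasing and decreasing fundamental solutions of \(\tfrac{1}{2} u''=\lambda u\) with \(\lambda =\tfrac{1}{2}\)):

```python
import math

# Illustrative fundamental solutions: psi(x) = e^x (increasing) and
# phi(x) = e^{-x} (decreasing) solve (1/2) u'' = (1/2) u.
def psi(x): return math.exp(x)
def phi(x): return math.exp(-x)
def psi_p(x): return math.exp(x)     # psi'
def phi_p(x): return -math.exp(-x)   # phi'

def F1(xi, zeta):
    return psi(xi) * phi(zeta) - psi(zeta) * phi(xi)

def F2(xi, zeta):
    return psi_p(xi) * phi(zeta) - psi(zeta) * phi_p(xi)

a, b, h = 0.3, 1.7, 1e-6

antisym = F1(a, b) + F1(b, a)                                    # ~ 0
deriv_gap = (F1(a + h, b) - F1(a - h, b)) / (2 * h) - F2(a, b)   # ~ 0
wronskian = F2(a, a)                                             # = 2 here
print(antisym, deriv_gap, wronskian)
```
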

Theorem 5.6

Let \(c\in [0,1)\) be such that \(x^0_1(c)<\gamma _*(c)<x^0_2(c)\) and consider the following problem: Find \(x<y<z\) in \(\mathbb {R}\) with \(x\in (-\infty ,x^0_1(c))\) and \(\gamma _*(c)<y<z<x^0_2(c)\) such that the triple (xyz) solves the system

$$\begin{aligned}&\quad (U(z,c)-P_0) \frac{\phi '_{\lambda }(z)}{\phi _{\lambda }(z)}=U_x(z,c) \end{aligned}$$
(5.23)
$$\begin{aligned}&\quad (U(x,c)-P_0)\frac{F_2(x,y)}{F_1(x,y)}-(U(y,c)-P_0) \frac{F_2(x,x)}{F_1(x,y)}=U_x(x,c)\end{aligned}$$
(5.24)
$$\begin{aligned}&\quad (U(x,c)-P_0)\frac{F_2(y,y)}{F_1(x,y)}-(U(y,c)-P_0)\frac{F_2(y,x)}{F_1(x,y)}=U_x(y,c) \end{aligned}$$
(5.25)
  1. (i)

    In case (iii)(b) of Proposition 5.2 the stopping set is of the form \(\mathcal D_V^c=(-\infty ,\ell _*^{(1)}(c)]\cup [\ell _*^{(2)}(c),\ell _*^{(3)}(c)]\), and \((x,y,z)=(\ell _*^{(1)}(c),\ell _*^{(2)}(c),\ell _*^{(3)}(c))\) is the unique triple solving (5.23)–(5.25). The value function is given by

    $$\begin{aligned} V(x,c)= \left\{ \begin{array}{ll} \big (U(\ell ^{(3)}_*,c)-P_0\big )\frac{\phi _\lambda (x)}{\phi _\lambda (\ell ^{(3)}_*)} &{} \quad \text {for }\,x> \ell ^{(3)}_*\\ U(x,c)-P_0 &{}\quad \text {for }\, \ell ^{(2)}_*\le x \le \ell ^{(3)}_*\\ (U(\ell ^{(1)}_*,c)-P_0)\frac{F_1(x,\ell ^{(2)}_*)}{F_1(\ell ^{(1)}_*,\ell ^{(2)}_*)}+(U(\ell ^{(2)}_*,c) -P_0)\frac{F_1(\ell ^{(1)}_*,x)}{F_1(\ell ^{(1)}_*,\ell ^{(2)}_*)} &{}\quad \text {for }\, \ell ^{(1)}_*< x < \ell ^{(2)}_*\\ U(x,c)-P_0 &{} \quad \text {for }\, x \le \ell ^{(1)}_* \end{array} \right. \end{aligned}$$
    (5.26)

    where we have set \(\ell _*^{(k)}=\ell _*^{(k)}(c)\), \(k=1,2,3\) for simplicity.

  2. (ii)

    In case (iii)(a) of Proposition 5.2 we have \(\mathcal D_V^c=(-\infty ,\ell _*^{(1)}(c)]\), moreover \(V(x,c)=V^{\ell _*^{(1)}}(x,c)\), \(x\in \mathbb {R}\) (cf. (4.36)) and \(\ell _*^{(1)}(c)\) is the unique solution to (5.18) with \(\mathscr {B}=(-\infty ,x^0_1(c))\).

Proof

Proof of (i). In the case of Proposition 5.2-(iii)(b), the stopping time \(\tau ^{(II)}_*\) defined in (5.5) is optimal for (3.14):

$$\begin{aligned} V(x,c) = \mathsf E\Big [e^{-\lambda \tau ^{(II)}_*}\big (U(X^x_{\tau ^{(II)}_*},c)-P_0\big )\Big ]. \end{aligned}$$

Equation (5.26) is therefore just the analytical representation for the value function in this case. The fact that \(\ell ^{(1)}_*\), \(\ell ^{(2)}_*\) and \(\ell ^{(3)}_*\) solve the system (5.23)–(5.25) follows from the smooth fit condition at each of the boundaries. A proof of the smooth fit condition can be carried out using probabilistic techniques as done previously for Theorem 5.5. We therefore omit its proof and only show uniqueness of the solution to (5.23)–(5.25).

Uniqueness will be addressed with techniques similar to those employed in Theorem 4.13, taking into account that the stopping region in the present setting is disconnected. We fix \(c\in [0,1)\), assume that there exists a triple \(\{\ell '_1,\ell '_2,\ell '_3\} \ne \{\ell ^{(1)}_*,\ell ^{(2)}_*,\ell ^{(3)}_*\}\) solving (5.23)–(5.25), with \(\ell '_1\in (-\infty ,x^0_1(c))\) and \(\gamma _*(c)<\ell '_2<\ell '_3<x^0_2(c)\) as in the statement of the theorem, and define a stopping time

$$\begin{aligned} \sigma ^{(II)}:=\inf \big \{t\ge 0\,:\,X^x_t\le \ell '_1\,\,\text {or}\,\,X^x_t\in [\ell '_2,\ell '_3]\big \}\,,\quad x\in \mathbb {R}. \end{aligned}$$
(5.27)

We can associate to the triple a function

$$\begin{aligned} V'(x,c):=\mathsf E\Big [e^{-\lambda \sigma ^{(II)}}\big (U(X^x_{\sigma ^{(II)}},c)-P_0\big )\Big ]\quad x\in \mathbb {R}\end{aligned}$$
(5.28)

and note that \(V'(\,\cdot \,,c)\) has the same properties as the value function \(V(\,\cdot \,,c)\) provided that we replace \(\ell _*^{(k)}\) by \(\ell '_k\) everywhere for \(k=1,2,3\). Moreover, Eq. (3.14) implies

$$\begin{aligned} V'(x,c)\ge V(x,c),\qquad x \in \mathbb {R}. \end{aligned}$$
(5.29)

Step 1 First we show that \((\ell ^{(2)}_*,\ell ^{(3)}_*)\cap (\ell '_2,\ell '_3)\ne \emptyset \). We assume that \(\ell '_2\ge \ell ^{(3)}_*\), but the same arguments would apply if we considered \(\ell ^{(2)}_*\ge \ell '_3\). Note that \(\ell '_1<\ell ^{(3)}_*\) since \(\ell '_1\in (-\infty ,x^0_1(c))\). Then fix \(x\in (\ell '_2,\ell '_3)\) and define the stopping time \(\tau _3=\inf \{t\ge 0\,:\,X^x_t\le \ell _*^{(3)}\}\). We have \(V(x,c)<U(x,c)-P_0\), while by (5.28) it follows that \(V'(x,c)=U(x,c)-P_0\). An application of the Itô–Tanaka formula then gives

$$\begin{aligned} 0<V'(x,c)-V(x,c)&= \mathsf E\Big [e^{-\lambda \tau _3}\big (V'(X^x_{\tau _3},c)-V(X^x_{\tau _3},c)\big )\Big ]\nonumber \\&\quad -\mathsf E\bigg [\int _0^{\tau _3}{e^{-\lambda t}\big (\mathbb {L}_X-\lambda \big )\big (U(X^x_t,c)-P_0\big )\mathbbm {1}_{\{X^x_t\in (\ell '_2,\ell '_3)\}}dt}\bigg ]\nonumber \\&< \mathsf E\Big [e^{-\lambda \tau _3}\big (V'(\ell _*^{(3)},c)-U(\ell _*^{(3)},c)+P_0\big )\Big ]\le 0 \end{aligned}$$
(5.30)

where we have used Proposition 5.1-(3) in the first inequality on the right-hand side and the fact that \(V'(\ell _*^{(3)},c)\le U(\ell _*^{(3)},c)-P_0 \) in the second. This is a contradiction, hence \((\ell ^{(2)}_*,\ell ^{(3)}_*)\cap (\ell '_2,\ell '_3)\ne \emptyset \).

Step 2 Notice now that if we assume \(\ell '_3<\ell ^{(3)}_*\) we also reach a contradiction with (5.29), since for any \(x\in (\ell '_3,\ell ^{(3)}_*)\) we would have \(V'(x,c)<U(x,c)-P_0=V(x,c)\). Then we must have \(\ell '_3\ge \ell ^{(3)}_*\).

Assume now that \(\ell '_3>\ell _*^{(3)}\), take \(x\in (\ell _*^{(3)},\ell '_3)\) and \(\tau _3\) as in Step 1 above. Note that \(V'(x,c)=U(x,c)-P_0>V(x,c)\), whereas \(V(\ell _*^{(3)},c)=U(\ell _*^{(3)},c)-P_0=V'(\ell _*^{(3)},c)\) by Step 1 above and (5.28). Then using the Itô–Tanaka formula again we find

$$\begin{aligned} 0<V'(x,c)-V(x,c)&= \mathsf E\Big [e^{-\lambda \tau _3}\big (V'(X^x_{\tau _3},c)-V(X^x_{\tau _3},c)\big )\Big ]\nonumber \\&\quad -\mathsf E\bigg [\int _0^{\tau _3}{e^{-\lambda t}\big (\mathbb {L}_X-\lambda \big )\big (U(X^x_t,c)-P_0\big )\mathbbm {1}_{\{X^x_t\in (\ell ^{(3)}_*,\ell '_3)\}}dt}\bigg ]\nonumber \\&< \mathsf E\Big [ e^{-\lambda \tau _3}\big (V'(\ell _*^{(3)},c)-U(\ell _*^{(3)},c)+P_0\big )\Big ]=0 \end{aligned}$$
(5.31)

hence we again reach a contradiction, and therefore \(\ell _*^{(3)}=\ell '_3\).

Step 3 If we now assume that \(\ell _*^{(2)}<\ell '_2\) we find the same contradiction with (5.29) as in Step 2: indeed, for any \(x\in (\ell ^{(2)}_*,\ell '_2)\) we would have \(V'(x,c)<U(x,c)-P_0=V(x,c)\). Similarly, if we assume that \(\ell '_1<\ell _*^{(1)}\) then for any \(x\in (\ell '_1,\ell ^{(1)}_*)\) we would have \(V'(x,c)<U(x,c)-P_0=V(x,c)\). These contradictions imply that \(\ell '_2\le \ell ^{(2)}_*\) and \(\ell '_1\ge \ell ^{(1)}_*\).

Let us assume now that \(\ell '_2<\ell ^{(2)}_*\); then, taking \(x\in (\ell '_2,\ell ^{(2)}_*)\), applying the Itô–Tanaka formula up to the first exit time from the open set \((\ell ^{(1)}_*,\ell ^{(2)}_*)\) and using arguments similar to those in Steps 1 and 2, we again reach a contradiction. Hence \(\ell '_2=\ell ^{(2)}_*\); analogous arguments establish that \(\ell '_1=\ell _*^{(1)}\).

Proof of (ii). Here we argue as in the proof of Theorem 5.5, concluding that \(V(x,c)=V^{\ell _*^{(1)}}(x,c)\) for \(x\in \mathbb {R}\) (cf. (4.36)) and that \(\ell _*^{(1)}\) solves (5.18) with \(\mathscr {B}=(-\infty ,x^0_1(c))\). \(\square \)

5.2 Discussion and economic considerations

As proved in Sect. 4.1 above, for cost functions \(\Phi \) with \(\hat{c}<0\) the control problem of purchasing in the spot market has a reflecting boundary and the optimal contract entry problem has a connected continuation region. Both structures are commonly observed in the literature on irreversible investment and on optimal stopping, respectively.

In contrast, the results of the present Sect. 5 for \(\hat{c}>1\) feature repelling control boundaries and disconnected optimal stopping regions, which have to date appeared less frequently in the literature. Here we provide further discussion of the results of this section, including simple examples and an economic interpretation of the optimal stopping rule.

If \(\hat{c}>1\) then k(c) is negative for all \(c \in (0,1]\) [see (3.3)]. It follows that the penalty function \(\Phi \) must satisfy

$$\begin{aligned} \Phi (c) > \bigl (1 + \tfrac{\theta }{\lambda }\bigr )(1-c) \quad \forall c \in [0,1). \end{aligned}$$
(5.32)

Since \(\theta /\lambda \) is positive, (5.32) establishes that the function \(c \mapsto \Phi (c)\) is bounded below on \((0,1)\) by a positive, decreasing linear function of c. Given the role of \(\Phi \) as a penalisation function, it is natural that \(\Phi \) should dominate such a linear cost. Note, however, that the slope of this linear lower bound becomes steeper as \(\lambda \downarrow 0\). Thus as the arrival rate of the demand decreases, the penalisation from \(\Phi \) must be increasingly strong in order to fall into the case \(\hat{c}>1\).

Although \(\lambda \) is the parameter in the exponential distribution of the arrival time of demand, we noted after (2.5) above that mathematically it is equivalent to a financial discount rate. Indeed this is analogous to the situation in the reduced-form methodology for credit risk modelling (see Chapter 7 of [14] for example), where a similar parameter \(\lambda \) can be interpreted as an adjustment to the discount rate due to the risk of default (in our case the ‘default’ event would correspond to the arrival of the demand, and hence the loss of the opportunity to enter the contract).

Now we give three examples of functions \(\Phi \) with \(\hat{c}>1\), drawn from functional forms commonly found in the economic literature. From now on let us fix \(\theta \) and \(\lambda \) and introduce an additional parameter a, specifying that \(a>1+\theta /\lambda \). Our examples involve respectively polynomial costs

$$\begin{aligned} \Phi (c):=\frac{1}{2}(1-c)^2+a(1-c), \end{aligned}$$
(5.33)

exponential costs

$$\begin{aligned} \Phi (c):=e^{a(1-c)}-1, \end{aligned}$$
(5.34)

and logarithmic costs

$$\begin{aligned} \Phi (c):=-\frac{1}{a}\ln c. \end{aligned}$$
(5.35)

(Formally, for the third example we note that the assumption \(\Phi \in C^2(\mathbb {R})\) made above may be relaxed to \(\Phi \in C^2((0,1])\) if we restrict our study of (2.4) to \(\mathbb {R}\times (0,1]\). Indeed, since the inventory can only be increased, the behaviour of \(\Phi \) for \(c\le 0\) (and \(c>1\)) would play no role in our analysis.)
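For the first two families the bound (5.32) is immediate: the quadratic term in (5.33) is nonnegative and \(e^u-1\ge u\), so both dominate \(a(1-c)>(1+\theta /\lambda )(1-c)\) on \([0,1)\). A spot check on a grid, with sample values of \(\theta \), \(\lambda \) and a chosen purely for illustration:

```python
import math

# Sample parameters (illustration only), with a > 1 + theta/lam.
theta, lam = 1.0, 0.5
a = 1.0 + theta / lam + 0.5    # = 3.5 > 1 + theta/lam = 3.0

def poly_cost(c):              # polynomial costs (5.33)
    return 0.5 * (1.0 - c) ** 2 + a * (1.0 - c)

def exp_cost(c):               # exponential costs (5.34)
    return math.exp(a * (1.0 - c)) - 1.0

def linear_bound(c):           # right-hand side of (5.32)
    return (1.0 + theta / lam) * (1.0 - c)

grid = [i / 100.0 for i in range(100)]   # c in [0, 1)
ok_poly = all(poly_cost(c) > linear_bound(c) for c in grid)
ok_exp = all(exp_cost(c) > linear_bound(c) for c in grid)
print(ok_poly, ok_exp)  # True True
```
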

In order to explain the complex shape of the optimal entry boundary shown in Fig. 1 (where \(\hat{c}>1\)), we first identify two extremal regimes, referred to as cases 1 and 2 below; for each we provide the mathematical justification and an economic interpretation. Throughout the rest of this section it is important to recall from the problem setup in Sect. 2 that there are two costs to consider: the shortfall penalty and the expenditure in the spot market. Moreover, once the contract has been entered, the investor's optimal purchasing strategy is to fill the inventory instantaneously: for each value of c there is a unique critical price \(\gamma _*(c)\) above which it is optimal to fill the inventory in one go. This optimal policy explains the extremal cases 1 and 2 discussed below, and thus informs the intermediate case 3.

Case 1 To the left of Fig. 1, that is when c is small, it can be observed that \(x^0_1(c)< \gamma _*(c) < x^0_2(c)\) and \(m_2(c)>m_1(c)\). From Remark 5.4 this implies that part (iii)(a) of Proposition 5.2 holds. The optimal entry policy is then of threshold type, and the continuation region has the lower boundary \(\ell ^{(1)}_*(c)\). Further, once the contract is entered, the optimal purchasing policy is to wait until the price X rises to the level \(\gamma _*(c)\), at which time the inventory is instantaneously filled.

When the inventory level c is small, let us first consider the shortfall penalty term \(X_{\Theta }\Phi (c)\) incurred upon the arrival of the demand. Since \(\Phi \) is decreasing, \(\Phi (c)\) is relatively large for small c. If the value of the spot price X is high, the investor is thus exposed to the risk of significant costs from the shortfall penalty and it is not attractive to enter the contract. Conversely, for low values of X the penalty \(X_{\Theta }\Phi (c)\) is relatively low, making the contract more attractive to enter.

Next we consider the expenditure in the spot market. As recalled above this is equal to \((1-c)\gamma _*(c)\), since the inventory is instantaneously filled when the price rises to \(\gamma _*(c)\). If the contract is entered at a price \(X<\gamma _*(c)\) then this cost is not incurred immediately, but at the later time when the price rises to \(\gamma _*(c)\). In this case the investor therefore benefits from the discounting effect of \(\lambda \) described above: the lower the price at entry, the greater is the average benefit from discounting. However a balance must be struck, since at the random time \(\Theta \) the demand arrives and the opportunity to enter the contract is lost, so there is a disadvantage to waiting for a very low entry price. This balance implies the existence of a lower optimal threshold \(\ell ^{(1)}_*(c)<\gamma _*(c)\).

Case 2 To the right of the figure, when c is close to 1, we have \(\gamma _*(c) \le x^0_1(c)\), so that part (ii) of Proposition 5.2 holds and the optimal entry policy is again of threshold type. The continuation region has the lower boundary \(\ell _*(c)=\ell ^{(3)}_*(c)\), and once the contract is entered it is optimal to fill the inventory immediately.

Because of this immediate filling of the inventory there is no possibility of a shortfall penalty and we need only consider the expenditure in the spot market, which is equal to \(X(1-c)\). Lower spot prices are preferable but, since the opportunity to enter the contract is lost at the random time \(\Theta \), again there is a disadvantage to waiting for excessively low entry prices. However for values of c sufficiently close to 1 this expenditure becomes insignificant and it may be attractive to enter the contract even at relatively high spot prices X with \(X>\gamma _*(c)\) (even though filling is then immediate, so there is no benefit from discounting as in case 1).

Case 3 In the middle part of the figure we have \(x^0_1(c)< \gamma _*(c) < x^0_2(c)\) and \(m_2(c)<m_1(c)\), so that case (iii)(b) of Proposition 5.2 applies. The continuation region is then disconnected: it is the union of a bounded interval \((\ell ^{(1)}_*(c), \ell ^{(2)}_*(c))\), whose endpoints satisfy \(\ell ^{(1)}_*(c)< \gamma _*(c) < \ell ^{(2)}_*(c)\), and the half-line \((\ell ^{(3)}_*(c),\infty )\). If the contract is entered when the spot price is \(\ell ^{(1)}_*(c)\) (or lower) then the optimal purchasing policy is as in case 1; otherwise the inventory is filled immediately upon entering the contract, as in case 2.

From the economic point of view this case is more difficult to interpret as it is a mixture of the previous two cases. However it may be noted that the bounded component \((\ell ^{(1)}_*(c), \ell ^{(2)}_*(c))\) of the continuation region contains the critical level \(\gamma _*(c)\) for the optimal purchasing policy. Thus if the problem begins when the spot price is at this critical level \(x=\gamma _*(c)\), the investor prefers to wait and learn more about the price movements, bearing in mind that the opportunity to enter the contract is lost at the random time \(\Theta \). If the spot price then falls sufficiently low, the investor enters the contract and benefits both from the lower risk of an undersupply penalty and from discounting, as described in case 1. Alternatively, if the spot price rises sufficiently far above \(\gamma _*(c)\), the investor enters the contract and immediately fills the inventory, eliminating the potential undersupply penalty at the cost of a higher (and undiscounted) expenditure in the spot market.

6 Conclusion

In this paper we have studied the problem of optimal entry into an irreversible investment plan with a cost function which is non convex with respect to the control variable. This non convexity is due to the real-valued nature of the spot price of electricity. We show that the problem can be decoupled: the investment phase can be studied independently of the entry decision, as an investment problem over an infinite time horizon. The optimal entry decision then depends heavily on the properties of the optimal investment policy.

The complete value function can be rewritten as that of an optimal stopping problem in which the cost of immediate stopping involves the value function of the infinite horizon investment problem. It has been shown in [7] that the latter problem has a rich solution structure, in which the optimal investment rule can be either singularly continuous or purely discontinuous, depending on the problem parameters. This feature in turn implies a non standard optimal entry policy: the optimal entry rule is either the first hitting time of the spot price at a single threshold, or is triggered by multiple boundaries splitting the state space into disconnected stopping and continuation regions. A possible economic interpretation of this complex structure is provided.