1 Introduction

In the past two decades there has been increasing interest in incorporating regime switching into various stochastic control problems arising in economics, finance and operations research. Most of this literature is based on extending well-known archetypal stochastic control problems such as the investment/consumption problem (e.g. Sotomayor and Cadenillas 2009; Xu et al. 2020), dividend optimization (e.g. Sotomayor and Cadenillas 2011, 2013; Jiang and Pistorius 2012; Zhu and Chen 2013) and option valuation (e.g. Buffington and Elliott 2002; Zhang and Guo 2004; Guo et al. 2005; Boyarchenko and Levendorskii 2006, 2009) into a regime switching context. Such extensions are useful for problems with a horizon which is long compared to the expected duration of the present macroeconomic regime. In particular, they can provide better and more flexible approximations than standard (i.e. non-switching) infinite horizon problems because it is not always realistic to assume that the problem’s exogenous conditions will remain unchanged in the future.

Due to their added complexity, there are as yet only a handful of cases where stochastic control problems with regime switching (RS problems for short) are known to be explicitly or semi-explicitly solvable. Most of these special cases are solved in the papers of Guo et al. (e.g. Guo 2001; Zhang and Guo 2004; Guo and Zhang 2005; Guo et al. 2005), Boyarchenko and Levendorskii (e.g. Boyarchenko and Levendorskii 2006, 2009) and Sotomayor and Cadenillas (e.g. Sotomayor and Cadenillas 2009, 2011, 2013). The problems are usually formulated in a two-regime world for tractability reasons and deal with switches in the model’s parameters, e.g. the drift and volatility of a Brownian motion (BM) or geometric Brownian motion (GBM). There are also a few papers providing existence and uniqueness results as well as detailed characterizations for the value function and the optimal control policy for more general RS problems, such as Jiang and Pistorius (2012), Zhu and Chen (2013) and Xu et al. (2020).

In the present paper we solve semi-explicitly an optimal stopping problem and an impulse control problem with two regimes and a single regime switch. Our analysis is complementary to previous literature on explicitly solvable RS problems: in our models the regime switching structure is more restricted, but our results hold simultaneously for a much wider class of regimes. In particular, we impose almost no relation between the payoffs and the diffusions of the different regimes and hence allow substantial changes in the problem’s underlying dynamics. The motivation for this class of problems is that it lets us study in detail how anticipating an individual regime switch affects a standard optimal stopping or impulse control problem. An interesting special case is an RS problem where the regime switch is non-trivial but anticipation has no effect on the optimal control policy. Therefore our results can also be seen as a benchmark for the qualitative behaviour of solutions to more complicated RS problems.

Following previous work of Alvarez (2003, 2004a) and Alvarez and Lempa (2008), our analysis relies only on the classical theory of diffusions. The novelty of this approach is that we are able to provide fairly general sufficient conditions under which the optimal control policies are of a threshold form and characterize the optimal thresholds in a computationally tractable manner.

The paper is organized as follows. In Sect. 2 we introduce our model and recall some basic results of diffusion theory that we are going to use later on. In Sect. 3 we present and discuss the problems which we then solve in Sects. 4 and 5. In Sect. 6 we present various comparison results for the optimal control policies and the corresponding value functions. In Sect. 7 we illustrate our analysis with examples. Section 8 concludes the paper.

2 Notation and preliminaries

Let \((\Omega , \mathcal {F}, \mathbb {F} = \lbrace \mathcal {F}_t \rbrace _{t \ge 0}, \mathbb {P} )\) be the usual augmentation of a filtered probability space \((\Omega , \mathcal {G}, \mathbb {G}=\lbrace \mathcal {G}_t \rbrace _{t \ge 0}, \mathbb {P} )\) where \(\mathbb {G}\) is a filtration generated by two regular, linear diffusions \(X_1, X_2\) with a common state space \((I, \mathcal {B}(I))\) and a right-continuous indicator process H defined as \(H_t = 1_{\{T \le t\}}\) where \(T \sim \textrm{Exp}(\lambda )\) and \(\lambda >0\). We assume that \(I = (a,b) \subseteq \mathbb {R}\), H is independent of \(X_i\), the \(X_i\) are either independent or stochastically equivalent (in which case we write \(X_1 = X_2\)) and the boundaries a, b are natural for \(X_i\). We assume that the continuous functions \(\mu _i:\, I \rightarrow \mathbb {R}\) and \(\sigma _i:\, I \rightarrow \mathbb {R}_+\) are such that the stochastic differential equations

$$\begin{aligned} dX_{i,t}&= \mu _i(X_{i,t})dt + \sigma _i (X_{i,t}) dW_{i,t} \qquad i=1,2 \end{aligned}$$

(where \(W_{i}\) are Brownian motions) admit unique strong solutions [see e.g. the various conditions in Borodin and Salminen (2012, pp. 46–49)]. We denote the generator of \(X_i\) with

$$\begin{aligned} \mathcal {A}_i&= \mu _i(x) \dfrac{d}{dx} + \dfrac{\sigma _i ^2(x)}{2} \dfrac{d^2}{dx^2} \end{aligned}$$

We denote by \(\psi _{i, \alpha }\) (\(\varphi _{i, \alpha }\)) the increasing (decreasing) fundamental solution to \(\mathcal {A}_i u = \alpha u\) (the minimal \(\alpha \)-excessive functions). For any \(f \in \mathcal {L}_i^1(I, \alpha )\) where

$$\begin{aligned} \mathcal {L}_i^1(I, \alpha )&= \left\{ f:\, I \rightarrow \mathbb {R}, \, f \mathrm {\, is \, measurable\,},\, \mathbb {E}_x \left[ \int _{0}^{\infty } e^{-\alpha s} |f(X_{i,s})| ds \right] < \infty \right\} \end{aligned}$$

we write the resolvent of f as

$$\begin{aligned} (R_{i, \alpha } f)(x)&= \mathbb {E}_x \left[ \int _{0}^{\infty } e^{-\alpha s} f(X_{i,s}) ds \right] \\&= B_{i, \alpha }^{-1} \left( \varphi _{i, \alpha } (x) \int _{a}^{x} \psi _{i, \alpha } (y) f(y) m_i'(y) dy + \psi _{i, \alpha } (x) \int _{x}^{b} \varphi _{i, \alpha } (y) f(y) m_i'(y) dy \right) \end{aligned}$$

where the last, particularly useful identity follows from the classical theory of diffusions [see e.g. Mandl (1968), Section 2.3]. As usual, \(m_i\) is the speed measure and \(S_i\) is the scale function with

$$\begin{aligned} m_i'(x)&=\dfrac{2}{\sigma _i(x)^2}e^{\int ^x_a 2 \mu _i (y)\sigma _i^{-2} (y)dy} ,\\ S_i'(x)&= e^{-\int ^x_a 2 \mu _i (y)\sigma _i^{-2} (y)dy} \end{aligned}$$

and \(B_{i, \alpha }\) is the (constant) Wronskian

$$\begin{aligned} B_{i, \alpha }&= \varphi _{i, \alpha }(x) \dfrac{d\psi _{i, \alpha } (x)}{dS_i(x)} - \psi _{i, \alpha } (x) \dfrac{d\varphi _{i, \alpha } (x)}{dS_i(x)} \end{aligned}$$
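As a rough numerical illustration of the resolvent identity above, the following sketch evaluates the Green-kernel formula for a geometric Brownian motion and compares it with the known closed form \(x/(\alpha - \mu )\) for \(f(x) = x\). The parameter values, the function names and the truncation of the boundaries are assumptions made only for this example. The scale density is normalised at the reference point 1 rather than at the boundary a, which rescales \(S_i'\) and \(m_i'\) by constants that cancel in \(B_{i,\alpha }^{-1}m_i'\).

```python
import math

# Hypothetical GBM parameters, chosen only for illustration (need ALPHA > MU).
MU, SIGMA, ALPHA = 0.02, 0.2, 0.05

def gbm_exponents(mu, sigma, alpha):
    """Roots of 0.5*sigma^2*b*(b-1) + mu*b - alpha = 0, returned as (b_plus, b_minus)."""
    k = 0.5 - mu / sigma**2
    root = math.sqrt(k * k + 2.0 * alpha / sigma**2)
    return k + root, k - root

def speed_density(y, mu, sigma):
    """m'(y) for GBM with the scale density normalised to S'(y) = y**(-2*mu/sigma**2)."""
    return 2.0 / (sigma**2 * y**2) * y ** (2.0 * mu / sigma**2)

def integrate_log(h, lo, hi, n=40000):
    """Trapezoidal rule for the integral of h over [lo, hi] after substituting y = exp(u)."""
    u_lo, u_hi = math.log(lo), math.log(hi)
    step = (u_hi - u_lo) / n
    total = 0.0
    for j in range(n + 1):
        y = math.exp(u_lo + j * step)
        weight = 0.5 if j in (0, n) else 1.0
        total += weight * h(y) * y          # the extra factor y is dy/du
    return total * step

def resolvent(f, x, mu=MU, sigma=SIGMA, alpha=ALPHA, span=60.0):
    """Green-kernel formula for (R_alpha f)(x); the boundaries 0 and infinity
    are truncated to x*exp(-span) and x*exp(span)."""
    b_plus, b_minus = gbm_exponents(mu, sigma, alpha)
    psi = lambda y: y**b_plus               # increasing fundamental solution
    phi = lambda y: y**b_minus              # decreasing fundamental solution
    wronskian = b_plus - b_minus            # constant under the normalisation above
    lower = integrate_log(lambda y: psi(y) * f(y) * speed_density(y, mu, sigma),
                          x * math.exp(-span), x)
    upper = integrate_log(lambda y: phi(y) * f(y) * speed_density(y, mu, sigma),
                          x, x * math.exp(span))
    return (phi(x) * lower + psi(x) * upper) / wronskian

x0 = 1.5
numeric = resolvent(lambda y: y, x0)
closed_form = x0 / (ALPHA - MU)             # E_x of the discounted integral of X_s for GBM
print(numeric, closed_form)
```

The two printed numbers should agree up to quadrature error. The log-substitution is a convenience for the power-type integrands of this example and is not needed for the identity itself.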

In general, we denote any quantity related to the diffusion \(X_i\) with the index i and if there are multiple indices, i will be the first one. For specific index combinations we suppress the notation even further to make certain calculations less repetitive and verbose. That is, we write \(R_{1,r+\lambda } = R_1\) and

$$\begin{aligned} \psi _{1,r}&= \psi _{0},\, \psi _{1,r+\lambda } = \psi _{1},\, \psi _{2,r} = \psi _{2},\\ \varphi _{1,r}&= \varphi _{0},\, \varphi _{1,r+\lambda } = \varphi _{1},\, \varphi _{2,r} = \varphi _{2},\\ B_{1,r}&= B_{0},\, B_{1,r+\lambda } = B_{1},\, B_{2,r} = B_{2} \end{aligned}$$

The random variable T is interpreted as the regime switching time. Without loss of generality we assume that the diffusion \(X_{1}\) is started at the initial time \(t=0\) and stopped at \(t=T\) after which the second diffusion \(X_{2}\) is immediately started from the state \(X_{1,T}\). In other words the underlying is of the form

$$\begin{aligned} X_t&= X_{1,t \wedge T} + H_t (X_{2,t-T} - X_{1,T}),\, X_{2,0} = X_{1,T} \end{aligned}$$
(1)

which is seen to be an r.c.l.l. strong Markov process whenever \(X_1\) and \(X_2\) are r.c.l.l. Moreover, we allow the payoff function to change during the regime switch, that is, we are working with a payoff g of the form

$$\begin{aligned} g(t, X_t)&= g_1(X_t)(1 - H_t) + g_2(X_t)H_t \end{aligned}$$
(2)

where \(g_i \in C^2(I \setminus D_i) \), \(D_i\) are finite or countably infinite, \(g_i\) are non-decreasing and

$$\begin{aligned}&- \infty< \lim _{x \downarrow a}g_i(x) \le 0 < \lim _{x \uparrow b}g_i(x) \end{aligned}$$
(3)
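To make the construction (1) and the payoff (2) concrete, the following sketch simulates one path of X on a time grid. It assumes, purely for illustration, that both regimes are geometric Brownian motions and that \(g_i(x) = x - K_i\); the function names, the parameters and the rule of switching coefficients at the first grid point at or after T are choices of this example.

```python
import math
import random

def simulate_switching_path(x0, horizon, dt, lam, mu, sigma, seed=0):
    """Simulate X from (1) on a time grid, assuming both regimes are GBMs.

    mu and sigma are pairs (regime 1, regime 2). The switch is applied at the
    first grid point at or after T, which is a discretisation assumption.
    """
    rng = random.Random(seed)
    switch_time = rng.expovariate(lam)                 # T ~ Exp(lam)
    n_steps = int(round(horizon / dt))
    times, states = [0.0], [x0]
    for k in range(n_steps):
        regime = 1 if k * dt >= switch_time else 0     # H_t selects the active regime
        drift = (mu[regime] - 0.5 * sigma[regime] ** 2) * dt
        shock = sigma[regime] * rng.gauss(0.0, math.sqrt(dt))
        # Exact lognormal step, so the path stays inside I = (0, infinity).
        states.append(states[-1] * math.exp(drift + shock))
        times.append((k + 1) * dt)
    return times, states, switch_time

def payoff(t, x, switch_time, g1, g2):
    """g(t, x) = g1(x) * (1 - H_t) + g2(x) * H_t with H_t = 1 if T <= t, cf. (2)."""
    return g2(x) if switch_time <= t else g1(x)

# Hypothetical payoffs; both are non-decreasing and satisfy the limits in (3).
g1 = lambda x: x - 1.0
g2 = lambda x: x - 2.0

times, states, T = simulate_switching_path(
    x0=1.0, horizon=5.0, dt=0.01, lam=0.5, mu=(0.02, -0.01), sigma=(0.2, 0.3), seed=42)
rewards = [payoff(t, x, T, g1, g2) for t, x in zip(times, states)]
```

The second regime restarts from the last state of the first one, so the simulated path is continuous across the switch, while the payoff may jump there because \(g_1\) and \(g_2\) differ.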

The following result will be useful later on. It is essentially Corollary 3.2 in Alvarez (2004a) formulated in the present context. The proof will be omitted as it is completely analogous to the original.

Lemma 1

Assume that \(f \in C^2(I)\), \(F_i = ( \mathcal {A}_i - \alpha )f \in \mathcal {L}^1_i(I, \alpha )\) and \(\lim _{x \downarrow a}f(x)/\varphi _{i,\alpha }(x) = \lim _{x \uparrow b}f(x)/\psi _{i,\alpha }(x) =0\). Then

$$\begin{aligned} \dfrac{f'(x)}{S_i'(x)}\psi _{i,\alpha }(x) - \dfrac{\psi _{i,\alpha }'(x)}{S_i'(x)}f(x)&= \int _{a}^x\psi _{i,\alpha }(z)F_i(z)m_i'(z)dz,\\ \dfrac{f''(x)}{S_i'(x)}\psi _{i,\alpha }'(x) - \dfrac{\psi _{i,\alpha }''(x)}{S_i'(x)}f'(x)&= \dfrac{2\alpha }{\sigma _i^2(x)} \int _{a}^x\psi _{i,\alpha }(z)(F_i(x) - F_i(z))m_i'(z)dz \end{aligned}$$

and

$$\begin{aligned} \dfrac{f'(x)}{S_i'(x)}\varphi _{i,\alpha }(x) - \dfrac{\varphi _{i,\alpha }'(x)}{S_i'(x)}f(x)&= - \int _x^{b}\varphi _{i,\alpha }(z)F_i(z)m_i'(z)dz,\\ \dfrac{f''(x)}{S_i'(x)}\varphi _{i,\alpha }'(x) - \dfrac{\varphi _{i,\alpha }''(x)}{S_i'(x)}f'(x)&= \dfrac{2 \alpha }{\sigma _i^2(x)} \int _x^{b}\varphi _{i,\alpha }(z)(F_i(z) - F_i(x))m_i'(z)dz \end{aligned}$$

3 Outline of the problems

The two-regime-single-switch structure considered in the present paper gives rise to three different classes of optimal stopping problems (OSPs) and impulse control problems (ICPs). We call these classes the pre-switch, anticipative and post-switch problems according to their temporal relation to the regime switch. Since we are working with six problems in total, we will use the words “problem” and “value function” interchangeably to streamline certain arguments and avoid confusion. For \(x \in I\), the pre-switch OSP and ICP are defined as

$$\begin{aligned} \tilde{V}_0(x)&= \sup _{\tau \in \mathcal {S}_1} \mathbb {E}_{x}\left[ e^{-r\tau } g_1(X_{1,\tau }) \right] \end{aligned}$$
(4)

and

$$\begin{aligned} V_0(x) = \sup _{\nu \in \mathcal {V}_1} \mathbb {E}_{x}\left[ \sum _{i=1}^N e^{-r \tau _i} g_1(X_{1,\tau _i }^{\nu }) \right] \end{aligned}$$
(5)

respectively. The anticipative problems are defined as

$$\begin{aligned} \tilde{V}_1(x)&= \sup _{\tau \in \mathcal {S}} \mathbb {E}_{x}\left[ e^{-r\tau } g(\tau , X_{\tau }) \right] \end{aligned}$$
(6)

and

$$\begin{aligned} V_1(x) = \sup _{\nu \in \mathcal {V}} \mathbb {E}_{x}\left[ \sum _{i=1}^N e^{-r \tau _i} g(\tau _i, X_{\tau _i }^{\nu }) \right] \end{aligned}$$
(7)

and the post-switch problems are

$$\begin{aligned} \tilde{V}_2(x)&= \sup _{\tau \in \mathcal {S}_2} \mathbb {E}_{x}\left[ e^{-r\tau } g_2(X_{2,\tau }) \right] \end{aligned}$$
(8)

and

$$\begin{aligned} V_2(x) = \sup _{\nu \in \mathcal {V}_2} \mathbb {E}_{x}\left[ \sum _{i=1}^N e^{-r \tau _i} g_2(X_{2,\tau _i }^{\nu }) \right] \end{aligned}$$
(9)

The terminology and notation used in the above definitions require some elaboration. The pre-switch (post-switch) problems are standard OSPs and ICPs depending only on the initial (final) regime. The anticipative problems are the only ones which depend on both regimes and hence on the regime switching structure. Single regime problems were analyzed in detail in Alvarez (2004a) in the present framework. Thus the main focus of our analysis is on solving the anticipative problems (6) and (7). We do not need to assume any specific conditions from Alvarez (2004a) unless explicitly stated otherwise. We simply assume that the single regime problems admit unique threshold solutions, in which case the single regime value functions are necessarily of the form described in Alvarez (2004a). By threshold solutions we mean those policies for which every associated stopping time is the hitting time (from below) of the underlying to a given unique state.

The pre-switch problems correspond to a situation where an agent is solving an OSP or an ICP, not anticipating any changes in the problems’ exogenous features. When the agent becomes aware that a given regime switch will happen at an exponentially distributed random time, they immediately abandon the pre-switch problem and move on to solve the anticipative problem. When the regime switch finally happens, the agent is left with a post-switch problem where future switches are not possible. Note that if the supremum of a pre-switch problem is attained for some admissible policy and the pre- and post-switch problems are identical, then the anticipative and pre-switch problems are identical too. In this case there is effectively only one regime as the regime switch has no effect on any of the problems. We call such regime switches trivial.

The motivation for using the three problem classes is twofold. First, they allow us to examine how anticipating future changes in a problem’s exogenous features affects current optimal policies. Second, it is necessary to solve the post-switch problem if one wishes to solve the anticipative problem in a computationally tractable manner.

Time-homogeneity of the problems (4)–(9) follows from the memoryless property of the exponential distribution and the strong Markov property of \(X_1\) and \(X_2\). The set \(\mathcal {S}\) of admissible stopping times for the anticipative OSP consists of all the stopping times of \(X_1\). \(\mathcal {V}\) is the set of anticipative admissible impulse control policies, i.e. sequences \(\nu = (\tau _j, \xi _j)_{j=1}^N\) such that \(N \in \mathbb {N} \cup \lbrace \infty \rbrace \) and for all \(j=1,\ldots , N\)

$$\begin{aligned} (i)\,&\tau _j \in \mathcal {S}\\ (ii)\,&\tau _j 1_{\{ \tau _j< \infty \}}< \tau _{j+1}1_{\{ \tau _{j+1} < \infty \}} \, \mathbb {P}- \mathrm {a.s.}\\ (iii)\,&\xi _j = X_{\tau _j} - \sum _{k=1}^{j-1}\xi _k - x_1(1 - H_t) - x_2H_t \end{aligned}$$

Above \(\tau _j\) are interpreted as the control times, \(\xi _j\) as the impulse controls and \(x_1(1 - H_t) + x_2H_t > 0\) is an exogenously determined state (\(x_i \in I\)) to which the controlled process

$$\begin{aligned} X^{\nu }_{s}&= X_{s} - \sum _{k=1}^{N}\xi _k1_{\{ \tau _k \le s \}},\, s \ge t \end{aligned}$$

is instantaneously driven at each intervention time. We allow \(\mathcal {S}\) and \(\mathcal {V}\) to contain stopping times that are infinite with a positive probability, since choosing not to stop the anticipative problem is equivalent to waiting for the regime switch to happen as long as the transversality conditions

$$\begin{aligned} {\left\{ \begin{array}{ll} \lim _{t \uparrow \infty } \mathbb {E}_{x} \left[ e^{-(r + \lambda ) t }(g_1(X_{1,t }) - \lambda (R_{1 }\tilde{V}_2)(X_{1,t })) \right] = 0\\ \lim _{t \uparrow \infty } \mathbb {E}_{x} \left[ e^{-(r + \lambda ) t }(g_1(X_{1,t }) - \lambda (R_{1 }V_2)(X_{1,t })) \right] = 0 \end{array}\right. } \end{aligned}$$
(10)

hold. We assume that (10) holds throughout the paper. We further assume that the payoffs satisfy \(g_1(x_1) \le 0, g_2(x_2) \le 0\) because otherwise we might encounter pathological ICPs where the supremum cannot be attained with an impulse control policy. \(\mathcal {S}_i\) and \(\mathcal {V}_i\) are defined analogously to \(\mathcal {S}\) and \(\mathcal {V}\), such that the corresponding stopping times are \(\mathbb {P}\)-a.s. finite stopping times of \(X_i\) and \(\xi _j\) satisfy

$$\begin{aligned} \xi _j&= X_{i,\tau _j} - \sum _{k=1}^{j-1}\xi _k - x_i \end{aligned}$$

instead of (iii).

A few words can be said about the problems (4)–(9) and their relations at this level of generality. Supremum is a subadditive mapping and the post-switch value functions \(\tilde{V}_2,V_2\) are both non-negative, so we have the following upper and lower bounds for the anticipative value functions \(\tilde{V}_1\) and \(V_1\):

$$\begin{aligned} \max \{ \lambda ( R_{1}\tilde{V}_2 ) (x), \tilde{V}_{1}^{\lambda }(x) \}&\le \tilde{V}_1(x) \le \tilde{V}_{1}^{\lambda }(x) + \lambda ( R_{1} \tilde{V}_2 ) (x),\end{aligned}$$
(11)
$$\begin{aligned} \max \{ \lambda \left( R_{1}V_2 \right) (x), V_{1}^{\lambda }(x) \}&\le V_1(x) \le V_{1}^{\lambda }(x) + \lambda \left( R_{1} V_2 \right) (x) \end{aligned}$$
(12)

where

$$\begin{aligned} \tilde{V}_{1}^{\lambda }(x)&= \sup \limits _{\tau \in \mathcal {S}_{1}} \mathbb {E}_{x}\left[ e^{-(r+\lambda )\tau } g_1(X_{1,\tau }) \right] ,\\ V_1^{\lambda }(x)&= \sup \limits _{\nu \in \mathcal {V}_{1}} \mathbb {E}_{x}\left[ \sum _{i=1}^N e^{-(r + \lambda )\tau _i} g_1(X_{1,\tau _i^-}^{\nu }) \right] \end{aligned}$$

We can derive additional meaning for the bounds (11)–(12). The lower bounds \(\tilde{V}_{1}^{\lambda }\) and \(V_{1}^{\lambda }\) correspond to anticipative problems where the regime switch kills the entire process X. This can be seen as an example of optimal stopping and impulse control under default risk. The lower bounds \(\lambda ( R_{1}\tilde{V}_2 ) \) and \(\lambda (R_{1}V_2)\) correspond to anticipative problems where an agent has to wait for a random amount of time T before solving the post-switch problems \(\tilde{V}_2\) and \(V_2\). In this problem the agent is “late for the party” in the first regime and has to wait for the regime switch before doing anything. The problem could also be seen as an example of stochastic control with random entry. The upper bounds in (11)–(12) describe pairs of independent problems which are switched at time T. That is, a decision maker solves an anticipative default risk problem \(\tilde{V}_{1}^{\lambda }\, (V_{1}^{\lambda })\) and after the default moves on to solve a post-switch problem \(\tilde{V}_2\, (V_2)\).

We also have an ordering for the continuation (and hence for stopping) regions of the anticipative and default risk problems whenever they are well-posed. If

$$\begin{aligned} \tilde{C}_{1,\lambda }&= \{ x \in I:\, \tilde{V}_{1}^{\lambda }(x)> g_1(x) \} ,&\tilde{C}_{1} = \{ x \in I:\, \tilde{V}_{1}(x)> g_1(x) \} ,\\ C_{1,\lambda }&= \{ x \in I:\, V_{1}^{\lambda }(x)> g_1(x) \} ,&C_{1} = \{ x \in I:\, V_{1}(x) > g_1(x) \} \end{aligned}$$

then \(\tilde{C}_{1,\lambda } \subseteq \tilde{C}_{1}\) and \(C_{1,\lambda } \subseteq C_{1}\).

It should be noted that the OSPs \(\tilde{V}_0, \tilde{V}_2\) can be seen as special cases of the ICPs \(V_0, V_2\). Indeed, for \(N=1\) the ICPs reduce to OSPs. \(\tilde{V}_0, \tilde{V}_2\) are also known as the associated OSPs of \(V_0, V_2\), and \(V_0, V_2\) can be approximated as sequences of OSPs of the form \(\tilde{V}_0, \tilde{V}_2\). These relations and the inequalities \(\tilde{V}_0 \le V_0, \tilde{V}_2 \le V_2\) were proved for a large class of problems admitting unique threshold solutions in Alvarez (2004a), and we extend these properties to the anticipative problems \(\tilde{V}_1, V_1\) in Sect. 6.

4 Anticipative optimal stopping

Using (6) and the strong Markov property of diffusions the anticipative OSP can be written as

$$\begin{aligned} \tilde{V}_1(x)&= \sup \limits _{\tau \in \mathcal {S}} \mathbb {E}_{x} \left[ \int _{\tau }^{\infty } e^{-r\tau }g_1(X_{1,\tau }) \lambda e^{-\lambda s} ds + \int _{0}^{\tau } \lambda e^{-(r + \lambda )s }\tilde{V}_{2}(X_{1,s}) ds \right] \\&= \lambda (R_{1 }\tilde{V}_{2})(x) + \sup \limits _{\tau \in \mathcal {S}} \mathbb {E}_{x} \left[ e^{-(r + \lambda ) \tau }(g_1(X_{1,\tau }) - \lambda (R_{1 }\tilde{V}_{2})(X_{1,\tau })) \right] \end{aligned}$$

(10) implies that if there does not exist a \(\mathbb {P}\)-a.s. finite \(\tau \in \mathcal {S}\) such that

$$\begin{aligned} \mathbb {E}_{x} \left[ e^{-(r + \lambda ) \tau }(g_1(X_{1,\tau }) - \lambda (R_{1 }\tilde{V}_{2})(X_{1,\tau })) \right]&\ge 0 \end{aligned}$$
(13)

then \(\tau = \infty \) is an optimal stopping time for the anticipative OSP and hence the optimal policy is to wait for the regime switch and then solve the post-switch OSP. For the rest of this section we assume that the contrary holds, i.e. there exists a \(\mathbb {P}\)-a.s. finite \(\tau \in \mathcal {S}\) such that (13) holds. Since the process \(X_1\) has \(\mathbb {P}\)-a.s. continuous paths and \(g_1 - \lambda (R_{1 }\tilde{V}_{2})\) is continuous and finite in I, (10) also implies the condition

$$\begin{aligned} \mathbb {E}_{x} \left[ \sup _{t \ge 0} e^{-(r + \lambda ) t }|g_1(X_{1,t }) - \lambda (R_{1 }\tilde{V}_{2})(X_{1,t })| \right]&< \infty \end{aligned}$$

It is now a trivial extension of Theorem 3.1 in Shiryaev (2007) to show that \(\tilde{V}_1\) is the smallest majorant of \(g_1\) such that \(\tilde{V}_1(x) - \lambda (R_{1}\tilde{V}_{2})(x)\) is \(r+\lambda \)-excessive w.r.t. \(X_{1}\).

We will now employ the same techniques as in Alvarez (2004a) for finding the optimal stopping policy and calculating the value function \(\tilde{V}_1\). That is, we treat the anticipative OSP as a problem of finding a state maximizing a certain function such that the corresponding hitting time coincides with the optimal stopping policy. The theoretical justification for this method is discussed e.g. in Christensen and Irle (2011). However we use a verification approach so we do not have to know a priori that the method leads to a correct solution.

Assume \(y \in I\) and define the function \(\tilde{V}_{1,y}\) by removing the supremum in \(\tilde{V}_1\) and choosing \(\tau = \tau _y\), where \(\tau _y = \inf \lbrace t \ge 0:\, X_{1,t} \ge y \rbrace \), the first hitting time (from below) of \(X_1\) to the state y. Then

$$\begin{aligned} \tilde{V}_{1,y}(x)&= {\left\{ \begin{array}{ll} g_1(x) &{} \quad , x \ge y \\ \lambda (R_{1 }\tilde{V}_{2})(x) + \psi _{1}(x)C(y) &{} \quad , x < y \end{array}\right. } \end{aligned}$$
(14)

where

$$\begin{aligned} C(y)&= \dfrac{g_1(y) - \lambda (R_{1 }\tilde{V}_{2})(y)}{\psi _{1}(y)} \end{aligned}$$
(15)
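As a hedged illustration of how a threshold can be located by maximizing (15), the sketch below treats the simplest special case in which the regime switch kills the payoff, so that \(\tilde{V}_2 \equiv 0\) and the problem reduces to the default risk problem \(\tilde{V}_1^{\lambda }\). It further assumes that \(X_1\) is a geometric Brownian motion, so \(\psi _1(x) = x^{\beta }\), and that \(g_1(x) = x - K\). All parameter values, function names and the choice of a golden-section search are assumptions of this example only.

```python
import math

# Hypothetical inputs: GBM drift/volatility, discount rate r, switching rate lam, strike K.
MU, SIGMA, R, LAM, K = 0.02, 0.2, 0.05, 0.1, 1.0

def beta_plus(mu, sigma, rho):
    """Positive root of 0.5*sigma^2*b*(b-1) + mu*b - rho = 0."""
    k = 0.5 - mu / sigma**2
    return k + math.sqrt(k * k + 2.0 * rho / sigma**2)

BETA = beta_plus(MU, SIGMA, R + LAM)      # psi_1(x) = x**BETA at discount rate r + lam

def g1(x):
    return x - K

def psi1(x):
    return x**BETA

def C(y):
    """The ratio in (15) in the special case where the post-switch value is zero."""
    return g1(y) / psi1(y)

def golden_section_max(func, lo, hi, tol=1e-9):
    """Maximise a unimodal function on [lo, hi] by golden-section search."""
    ratio = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    c, d = b - ratio * (b - a), a + ratio * (b - a)
    while b - a > tol:
        if func(c) > func(d):
            b = d                          # maximiser lies in [a, d]
        else:
            a = c                          # maximiser lies in [c, b]
        c, d = b - ratio * (b - a), a + ratio * (b - a)
    return 0.5 * (a + b)

y_star = golden_section_max(C, K, 10.0 * K)
y_closed = BETA * K / (BETA - 1.0)        # first-order condition C'(y) = 0 solved by hand

def value_candidate(x, y=y_star):
    """Piecewise candidate (14) in this special case."""
    return g1(x) if x >= y else psi1(x) * C(y)

print(y_star, y_closed)
```

In this special case the numerical maximizer can be compared with the closed form \(\beta K/(\beta - 1)\). With a non-trivial \(\tilde{V}_2\) the same search would apply once \(\lambda (R_1\tilde{V}_2)\) has been computed, although unimodality of C would then have to be checked separately.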

In the following analysis we will also make use of the operators \(\mathcal {L}_{i,f}\) defined by

$$\begin{aligned} \mathcal {L}_{i,f}g&= \dfrac{g'f - f'g}{S_i'},\qquad i=1,2 \end{aligned}$$
(16)

for all functions f, g for which the above expression is a well-defined function.

Theorems 1 and 2 give fairly general sufficient conditions for the existence and uniqueness of threshold solutions for the anticipative OSP as well as a computationally tractable representation for the value function, which may even be obtained explicitly in some cases. The proofs rely on constructing a specific integral representation for the chosen value function candidate and proving that this representation has certain properties that imply \(r+\lambda \)-excessivity. In fact, this allows us to express the value function as a resolvent of another function. The method is described in more detail and generality in e.g. Christensen and Lempa (2015) and Mordecki and Salminen (2007).

Theorem 1

Let C be as in (15) and let \(\tilde{y}_1\) be its unique global maximum in I. Let

$$\begin{aligned} A_{\psi }(x)&= - \mathcal {L}_{1,\psi _1}(g_1 - \lambda (R_{1}\tilde{V}_{2}))(x) \end{aligned}$$
(17)
$$\begin{aligned} A_{\varphi }(x)&= \mathcal {L}_{1,\varphi _1}(g_1 - \lambda (R_{1}\tilde{V}_{2}))(x) \end{aligned}$$
(18)

on \((\tilde{y}_1, b) \setminus D_1\) and suppose that \(A_{\psi }\) is non-decreasing and \(\lim _{x \uparrow b}A_{\varphi }(x) \ge 0\). Assume that \(g_1/\psi _{1}\) is non-increasing on a neighbourhood of \(\tilde{y}_1\). Then \(\tau _{\tilde{y}_1}\) is the optimal stopping time for the problem \(\tilde{V}_1\) and

$$\begin{aligned} \tilde{V}_1(x)&= {\left\{ \begin{array}{ll} g_1(x) &{} \quad x \ge \tilde{y}_1\\ \lambda (R_{1}\tilde{V}_{2})(x) + \dfrac{g_1(\tilde{y}_{1}) - \lambda (R_{1}\tilde{V}_{2})(\tilde{y}_{1})}{\psi _{1}(\tilde{y}_{1})}\psi _{1}(x) &{} \quad x < \tilde{y}_{1} \end{array}\right. } \end{aligned}$$
(19)

Proof

Let \(\hat{V}_1\) be the value function candidate given by the r.h.s. of (19). \(\tau _{\tilde{y}_1} \in \mathcal {S}\) so \(\tilde{V}_1 \ge \hat{V}_1\). The definition of \(\tilde{y}_1\) implies \(\hat{V}_1 \ge g_1\). On \(I {\setminus } (D_1 \cap [\tilde{y}_1, b))\) define the mappings

$$\begin{aligned} L_{\varphi }&= \mathcal {L}_{1,\varphi _1}(\hat{V}_1 - \lambda (R_{1}\tilde{V}_{2})) \end{aligned}$$
(20)
$$\begin{aligned} L_{\psi }&= - \mathcal {L}_{1,\psi _1}(\hat{V}_1 - \lambda (R_{1}\tilde{V}_{2})) \end{aligned}$$
(21)

\(L_{\psi }(x) = 0\) for \(x \in (a, \tilde{y}_1)\) and \(L_{\psi }(x) = A_{\psi }(x)\) for \(x \in [\tilde{y}_1, b) {\setminus } D_1\). \(\lim _{x \downarrow \tilde{y}_1}A_{\psi }(x) \ge 0\) and \(A_{\psi }\) was assumed to be non-decreasing, so \(L_{\psi }\) is non-negative on its domain. For \(x \in [\tilde{y}_1, b) \setminus D_1\) we have

$$\begin{aligned} -\dfrac{\psi _{1}(x)}{\varphi _{1}(x)}L_{\varphi }'(x)&= L_{\psi }'(x) = A_{\psi }'(x) \ge 0 \end{aligned}$$

implying that \(L_{\psi }\) is non-decreasing. Since

$$\begin{aligned} L_{\varphi }(x)&= B_1 \dfrac{g_1(\tilde{y}_{1}) - \lambda (R_{1}\tilde{V}_{2})(\tilde{y}_{1})}{\psi _{1}(\tilde{y}_{1})} \end{aligned}$$

for \(x \in (a, \tilde{y}_1)\) and \(0 \le \lim _{x \uparrow b}A_{\varphi }(x) \le \lim _{x \downarrow \tilde{y}_1} A_{\varphi }(x) \le \lim _{x \uparrow \tilde{y}_1}L_{\varphi }(x)\), \(L_{\varphi }\) is non-negative and non-increasing. These properties of \(L_{\psi }\) and \(L_{\varphi }\) guarantee the existence of a representing measure for \(\hat{V}_1\) [Salminen (1985), Section 3], proving that it is \(r+\lambda \)-excessive. \(\square \)

Theorem 2

Let \(g_1 \in C^2(I)\) and C be as in (15). Suppose that \(g_1 - \lambda (R_1\tilde{V}_2)\) satisfies the limit conditions of Lemma 1, \(\tilde{F}_1 = (\mathcal {A}_1 - r- \lambda )g_1 + \lambda \tilde{V}_{2} \in \mathcal {L}^1_1(I, r+\lambda )\) and \(\tilde{F}_1\) has a unique root \(\hat{x} >a\) so that \(\tilde{F}_1(x) > 0\) when \(x < \hat{x}\) and \(\tilde{F}_1(x) < 0\) when \(x > \hat{x}\). Then \(\tau _{\tilde{y}_1}\) is the optimal stopping time for the anticipative OSP. Here \(\tilde{y}_1 > \hat{x}\) is the unique state \(\tilde{y}_1 = \textrm{argmax}_{x \in I} \lbrace C(x) \rbrace \).

Proof

Let \(L(x) = \mathcal {L}_{1,\psi _1}(g_1 - \lambda (R_{1}\tilde{V}_{2}))(x)\). Lemma 1 implies

$$\begin{aligned} L(x)&= \int _{a}^{x} \psi _{1}(z) \tilde{F}_1(z)m_1'(z)dz \end{aligned}$$

By assumption \(\tilde{F}_1(x)>0\) on \((a,\hat{x})\) so \(L(x)>0\) on \((a, \hat{x}]\). If \(\hat{x}< \xi < x\), then the mean value theorem for integrals implies

$$\begin{aligned} L(x)&= L(\xi ) + \psi _{1}(c) \tilde{F}_1(c)m_1'(c) (x - \xi ) \end{aligned}$$

for some \(c \in (\xi , x)\). However, \(\tilde{F}_1(x)<0\) when \(x>\hat{x}\) so \(\lim _{x \uparrow b} L(x) = - \infty \). Additionally L(x) is continuous on the state space I and monotonically decreasing on \((\hat{x}, b)\), so there exists a unique \(\tilde{y}_1 > \hat{x}\), so that \(L(\tilde{y}_1)=0\) i.e. \(\tilde{y}_1 = \textrm{argmax}_{x \in I} \{ C(x) \}\). Thus \(\tilde{V}_{1,\tilde{y}_1} \ge g_1\).

Since \(\tau _{\tilde{y}_1}\) is admissible for the problem \(\tilde{V}_1\), we have \(\tilde{V}_1 \ge \tilde{V}_{1,\tilde{y}_1}\). We still need to prove that \(\tilde{V}_{1,\tilde{y}_1}(x) - \lambda (R_{1}\tilde{V}_{2})(x)\) is \(r+\lambda \)-excessive w.r.t. \(X_1\). Let \(L_{\varphi }, L_{\psi }\) be as in the proof of Theorem 1 with \(\hat{V}_1\) replaced by \(\tilde{V}_{1,\tilde{y}_1}\).

\(L_{\varphi }\) is non-negative since \(\tilde{V}_{1,\tilde{y}_1}(x) - \lambda (R_{1}\tilde{V}_{2})(x)\) is non-negative and strictly increasing on \((a, \tilde{y}_1)\) and Lemma 1 and the negativity of \(\tilde{F}_1(x)\) on \((\hat{x}, b )\) imply

$$\begin{aligned} L_{\varphi }(x) = - \int _{x}^{b} \varphi _{1}(z)\tilde{F}_1(z) m_1'(z)dz \ge 0 \end{aligned}$$

when \(x > \tilde{y}_1\). \(L_{\psi }\) is non-negative as well. To see this, note that \(L_{\psi } =0\) on \((a,\tilde{y}_1)\) and on \((\tilde{y}_1, b)\) we have

$$\begin{aligned} L_{\psi }'(x)&= - \psi _{1}(x)\tilde{F}_1(x) m_1'(x) \end{aligned}$$

Thus \(L_{\psi }'(x) = 0\) for \(x < \tilde{y}_1\) and \(L_{\psi }'(x) > 0\) for \(x > \tilde{y}_1 \), since \(\tilde{F}_1(\tilde{y}_1) < 0\). In particular, \(\lim _{x \downarrow \tilde{y}_1 \vee \tilde{y}_2}L_{\psi }'(x) > 0\) proving that \(L_{\psi }\) is non-negative and non-decreasing. \(L_{\varphi }\) is non-increasing, since

$$\begin{aligned} L_{\varphi }'(x)&= - \dfrac{\varphi _{1}(x)}{\psi _{1}(x)}L_{\psi }'(x) \le 0 \end{aligned}$$

The statement follows since \(L_{\varphi }, L_{\psi }\) satisfy the same properties as in the proof of Theorem 1. \(\square \)
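The root-finding characterization used in the proof of Theorem 2 can be sketched numerically. The example below again assumes the special case \(\tilde{V}_2 \equiv 0\), a geometric Brownian motion \(X_1\) and \(g_1(x) = x - K\), so that \(\tilde{F}_1(x) = \mu x - (r+\lambda )(x - K)\) has the single root \(\hat{x} = (r+\lambda )K/(r+\lambda -\mu )\). The parameters, the function names and the use of plain bisection are assumptions of this illustration.

```python
import math

# Hypothetical inputs: GBM drift/volatility, discount rate r, switching rate lam, strike K.
MU, SIGMA, R, LAM, K = 0.02, 0.2, 0.05, 0.1, 1.0
RHO = R + LAM
C_EXP = 2.0 * MU / SIGMA**2               # m_1'(z) = (2/sigma^2) * z**(C_EXP - 2)
_k = 0.5 - MU / SIGMA**2
BETA = _k + math.sqrt(_k * _k + 2.0 * RHO / SIGMA**2)   # psi_1(z) = z**BETA

def F1(x):
    """(A_1 - r - lam) g_1 + lam * V2 with g_1(x) = x - K and V2 assumed to be zero."""
    return MU * x - RHO * (x - K)

def L(x):
    """Integral of psi_1(z) * F1(z) * m_1'(z) over (0, x), integrated term by term."""
    p = BETA + C_EXP - 1.0                # positive, so the integral converges at 0
    return (2.0 / SIGMA**2) * (RHO * K * x**p / p
                               - (RHO - MU) * x**(p + 1.0) / (p + 1.0))

def bisect(func, lo, hi, tol=1e-12):
    """Plain bisection, assuming func(lo) > 0 > func(hi)."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if func(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

x_hat = RHO * K / (RHO - MU)              # unique root of F1
y_tilde = bisect(L, x_hat, 10.0 * K)      # L is positive at x_hat and negative far out
print(x_hat, y_tilde, BETA * K / (BETA - 1.0))
```

In this special case the root of L lies strictly above \(\hat{x}\) and coincides with the maximizer \(\beta K/(\beta -1)\) of \(g_1/\psi _1\), in line with the statement of the theorem.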

5 Anticipative impulse control

Using the same arguments as in Sect. 4 and noting that the switch may occur between any two consecutive control times or after the last control has been exercised, the anticipative ICP value function \(V_1\) may be written as

$$\begin{aligned} V_1(x)&= \sup _{\nu \in \mathcal {V}}\mathbb {E}_{x} \left[ \sum _{i=1}^{N} e^{-(r+ \lambda ) \tau _i}g_1(X_{1,\tau _{i}^-}^{\nu }) + \int _{0}^{\tau _N} \lambda e^{-(r+ \lambda )s}V_{2}(X_{1,s}^{\nu })ds \right] \\&= \lambda (R_{1}V_{2})(x) + \sup _{\nu \in \mathcal {V}}\mathbb {E}_x \left[ \sum _{i=1}^{N}e^{-(r+ \lambda )\tau _i}g_1(X_{1,\tau _{i}^-}^{\nu }) - \lambda e^{-(r+\lambda )\tau _N} (R_{1}V_{2})(X_{1,\tau _N}) \right] \end{aligned}$$

It turns out that the verification theorem for a candidate solution developed in Alvarez (2004a) can be extended to anticipative ICPs.

Theorem 3

Let \(f: I \rightarrow \mathbb {R}_+\) be a function satisfying \(f(x) \ge g_1(x) + f(x_{1})\) for every \(x \in I\) and suppose that \(f - \lambda (R_{1}V_{2})\) is \(r+\lambda \)-excessive w.r.t. \(X_{1}\). Then \(f(x) \ge V_1(x)\) and \(f(x) \ge \tilde{V}_1(x)\) for every \(x \in I\).

Proof

Let \(\nu \) be an admissible impulse control with control times \(\lbrace \tau _i \rbrace _{i=1}^{N}\) and define \(\tau _0=0\). The controlled diffusion \(X^{\nu }_{1,t}\) behaves like the diffusion \(X_{1,t}\) between consecutive control times so we have

$$\begin{aligned}&\mathbb {E}_{X^{\nu }_{1, \tau _{i}}} \left[ e^{-(r + \lambda )\tau _{i+1}} \left( f(X^{\nu }_{1, \tau _{i+1}^-}) - \lambda (R_{1}V_{2})(X^{\nu }_{1, \tau _{i+1}^-}) \right) \right] \\ \le&e^{-(r + \lambda )\tau _{i}}\left( f(X^{\nu }_{1, \tau _{i}}) - \lambda (R_{1}V_{2})(X^{\nu }_{1, \tau _{i}}) \right) \end{aligned}$$

\(\mathbb {P}\)-almost surely. Thus

$$\begin{aligned} 0 \le&\mathbb {E}_x \left[ e^{-(r+\lambda )\tau _i} f(X^{\nu }_{1,\tau _i}) - e^{-(r+\lambda )\tau _{i+1}} f(X^{\nu }_{1,\tau _{i+1}-}) \right] \\&- \lambda \mathbb {E}_x \left[ e^{-(r+ \lambda )\tau _i}(R_{1}V_{2}) (X^{\nu }_{1,\tau _i}) - e^{-(r+\lambda )\tau _{i+1}}(R_{1}V_{2})(X^{\nu }_{1,\tau _{i+1}-}) \right] \\ =&\mathbb {E}_x \left[ e^{-(r+\lambda )\tau _i} f(X^{\nu }_{1,\tau _i}) - e^{-(r+\lambda )\tau _{i+1}} f(X^{\nu }_{1,\tau _{i+1}-}) \right] \\&+ \mathbb {E}_x \left[ \int _{\tau _i}^{\tau _{i+1}-} \lambda e^{-(r+ \lambda )s}V_{2}(X_{1,s})ds \right] \end{aligned}$$

for every i and \(x \in I\). Summing over i yields

$$\begin{aligned} \mathbb {E}_x&\left[ \int _{0}^{\tau _{k+1}} \lambda e^{-(r+ \lambda )s} V_{2}(X_{1,s})ds \right. \\&+ \left. \sum _{i=0}^{k}\left( e^{-(r+\lambda )\tau _i} f(X^{\nu }_{1,\tau _i}) - e^{-(r+\lambda )\tau _{i+1}} f(X^{\nu }_{1,\tau _{i+1}-}) \right) \right] \ge 0 \\ \Leftrightarrow&f(x) \ge \mathbb {E}_x \left[ \int _{0}^{\tau _{k+1}} \lambda e^{-(r+ \lambda )s}V_{2}(X_{1,s})ds \right] \\&+ \sum _{i=1}^{k} \mathbb {E}_x \left[ e^{-(r+\lambda )\tau _i} \left( f(X^{\nu }_{1,\tau _{i}-}) - f(X^{\nu }_{1,\tau _{i}}) \right) \right] + \mathbb {E}_x \left[ e^{-(r+\lambda )\tau _{k+1}} f(X^{\nu }_{1,\tau _{k+1}-}) \right] \end{aligned}$$

where \(k \le N\). As \(k \rightarrow N\), the assumptions \(f(x) \ge g_1(x) + f(x_{1})\) and \(X_{1,\tau _i}=x_{1}\) imply

$$\begin{aligned} f(x)&\ge \mathbb {E}_x \left[ \int _{0}^{\tau _N} \lambda e^{-(r+ \lambda )s}V_{2}(X_{1,s}^{\nu })ds + \sum _{i=1}^{N}e^{-(r+ \lambda )\tau _i}g_1(X_{1,\tau _{i}-}^{\nu }) \right] \end{aligned}$$

The admissible impulse control \(\nu \) was arbitrary so the above inequality is valid for any \(\nu \in \mathcal {V}\) and it follows that \(f(x) \ge V_1(x)\) for every \(x \in I\).

The excessivity of \(f(x) - \lambda (R_{1}V_{2})(x)\) implies that for any \(\tau \in \mathcal {S}\)

$$\begin{aligned}&\mathbb {E}_x\left[ e^{-(r + \lambda )\tau }(f(X_{1,\tau }) - \lambda (R_{1,r+\lambda }\tilde{V}_{2})(X_{1,\tau })) \right] \\ =&f(x) - \lambda (R_{1}V_{2})(x) \\&+ \mathbb {E}_x\left[ e^{-r\tau } \mathbb {E}_{X_{1,\tau }} \left[ \lambda \int _0^{\infty } e^{-(r+\lambda )s}(R_{1}(V_{2} - \tilde{V}_{2}))(X_{1,s}) ds \right] \right] \\ \le&f(x) - \lambda (R_{1}\tilde{V}_{2})(x) \end{aligned}$$

Since excessive functions are non-negative, we have

$$\begin{aligned} f(x)&\ge f(x) - \lambda (R_{1}\tilde{V}_{2})(x_1) \ge g(x) + f(x_1) - \lambda (R_{1}\tilde{V}_{2})(x_1) \ge g(x) \end{aligned}$$

for all \(x \in I\). \(f \ge \tilde{V}_1\) follows by the minimality of \(\tilde{V}_1\). \(\square \)

We will now proceed as in Sect. 4 and determine the conditions under which the anticipative ICP admits a unique threshold solution with a state \(y > x_1\). Assume \(x_{1}<y\) and define

$$\begin{aligned} F_{1,y}(x)&= \mathbb {E}_x \left[ e^{-(r+\lambda )\tau _y}(g_1(y) + F_{1,y}(x_{1})) + \int _{0}^{\tau _y} \lambda e^{-(r+\lambda )s}V_{2}(X_{1,s}^{\nu })ds \right] \end{aligned}$$

in I. \(x_{1}<y\) implies

$$\begin{aligned} F_{1,y}(x_{1}) =&\dfrac{\psi _{1}(x_{1})}{\psi _{1}(y)} (g_1(y) + F_{1,y}(x_{1}) - \lambda (R_{1}V_{2})(y)) + \lambda (R_{1}V_{2})(x_{1}) \\ \Leftrightarrow F_{1,y}(x_{1})&= \dfrac{\psi _{1}(x_{1})(g_1(y) - \lambda (R_{1}V_{2})(y)) + \psi _{1}(y)\lambda (R_{1}V_{2})(x_{1})}{\psi _{1}(y) - \psi _{1}(x_{1})} \end{aligned}$$

Setting

$$\begin{aligned} u_1(y)&= \dfrac{g_1(y) - \lambda (R_{1}V_{2})(y) + \lambda (R_{1}V_{2})(x_{1})}{\psi _{1}(y) - \psi _{1}(x_{1})} \end{aligned}$$

yields

$$\begin{aligned} F_{1,y}(x)&= {\left\{ \begin{array}{ll} \lambda (R_{1}V_{2})(x_{1}) + \psi _{1}(x_{1})u_1(y) + g_1(x) &{} \quad , x \ge y \\ \lambda (R_{1}V_{2})(x) + \psi _{1}(x)u_1(y) &{} \quad , x<y \end{array}\right. } \end{aligned}$$
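The first line of the relation for \(F_{1,y}(x_{1})\) rests on a strong Markov decomposition of the discounted running reward. As a sketch, assume that \(R_{1}\) denotes the \((r+\lambda )\)-resolvent of \(X_1\), i.e. \((R_{1}V_{2})(x) = \mathbb {E}_x [ \int _0^{\infty } e^{-(r+\lambda )s}V_{2}(X_{1,s})ds ]\), and that \(\mathbb {E}_x[e^{-(r+\lambda )\tau _y}] = \psi _{1}(x)/\psi _{1}(y)\) for \(x < y\). Then, for \(x<y\),

$$\begin{aligned} \mathbb {E}_x \left[ \int _{0}^{\tau _y} \lambda e^{-(r+\lambda )s}V_{2}(X_{1,s})ds \right]&= \lambda (R_{1}V_{2})(x) - \mathbb {E}_x \left[ e^{-(r+\lambda )\tau _y} \right] \lambda (R_{1}V_{2})(y) \\&= \lambda (R_{1}V_{2})(x) - \dfrac{\psi _{1}(x)}{\psi _{1}(y)} \lambda (R_{1}V_{2})(y) \end{aligned}$$

Adding the discounted terminal term \((\psi _{1}(x)/\psi _{1}(y))(g_1(y) + F_{1,y}(x_{1}))\) and evaluating at \(x = x_{1}\) gives the relation from which \(F_{1,y}(x_{1})\) is solved.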

Lemma 2

Let \(\hat{y}_1\) be the unique local maximum of \(u_1\) in \((x_1, b)\). Define

$$\begin{aligned} A_{\psi }(x) =&- (\mathcal {L}_{1,\psi _1}(F_{1,\hat{y}_1} - \lambda (R_{1}V_{2})))(x)\\ A_{\varphi }(x) =&(\mathcal {L}_{1,\varphi _1}(F_{1,\hat{y}_1} - \lambda (R_{1}V_{2})))(x) \end{aligned}$$

on \((\hat{y}_1, b ) \setminus D_1\) and suppose that \(A_{\psi }\) is non-decreasing and \(\lim _{x \uparrow b}A_{\varphi }(x) \ge 0\). Then \(F_{1,\hat{y}_1} - \lambda (R_{1}V_{2})\) is \(r+\lambda \)-excessive with respect to \(X_1\).

Proof

The proof is analogous to that of Theorem 1. \(\square \)

Lemma 3

Let \(\hat{y}_1\) be the unique local maximum of \(u_1\) in \((x_{1}, b)\) and let \((g_1'(x) - \lambda (R_{1}V_{2})'(x))/\psi _{1}'(x)\) be non-increasing on \((a, x_1)\). Then \(F_{1,\hat{y}_1}(x) \ge F_{1,\hat{y}_1}(x_{1}) + g_1(x)\) for every \(x \in I\).

Proof

Define \(\Delta (x)= u_1(\hat{y}_1)( \psi _{1}(x) - \psi _{1}(x_{1})) - g_1(x) - \lambda (R_{1}V_{2})(x_{1}) + \lambda (R_{1}V_{2})(x) \). By the monotonicity of \((g_1'(x) - \lambda (R_{1}V_{2})'(x))/\psi _{1}'(x)\) we have for all \(x \in (a,x_{1})\)

$$\begin{aligned} \Delta '(x)&= u_1(\hat{y}_1) \psi _{1}'(x) - g_1'(x) + \lambda (R_{1}V_{2})'(x) \\&= \psi _{1}'(x) \left( \dfrac{g_1'(\hat{y}_1) - \lambda (R_{1} V_{2})'(\hat{y}_1)}{\psi _{1}'(\hat{y}_1)} - \dfrac{g_1'(x) - \lambda (R_{1}V_{2})'(x)}{\psi _{1}'(x)} \right) \\&\le 0 \end{aligned}$$

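Since \(g_1(x_{1}) \le 0\), we have \(\Delta (x_{1}) = -g_1(x_{1}) \ge 0\), and because \(\Delta \) is non-increasing on \((a,x_{1})\) it follows that \(\Delta (x) \ge 0\) for every \(x \in (a,x_{1})\). It follows that \(F_{1,\hat{y}_1}(x) \ge F_{1,\hat{y}_1}(x_{1}) + g_1(x)\) for every \(x \in I\). \(\square \)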

The next result follows immediately from Theorem 3.

Corollary 1

Let the assumptions of Lemmas 2 and 3 hold. Then \(V_1=F_{1,\hat{y}_1}\) and \(\hat{y}_1\) is the unique state \(\hat{y}_1 = \textrm{argmax}_{x > x_1}\{ u_1(x)\}\).

The following theorem is an ICP analogue of Theorem 2.

Theorem 4

Let \(g_1 \in C^2(I)\). Suppose that \(g_1 - \lambda (R_1V_2)\) satisfies the limit conditions of Lemma 1 and \(F_1 = (\mathcal {A}_1 - r - \lambda )g_1 + \lambda V_{2} \in \mathcal {L}^1_1(I,r+\lambda )\). Assume that \(F_1\) is non-increasing and \(\lim _{x \downarrow a}F_1(x)> 0 > \lim _{x \uparrow b}F_1(x)\). Then \(V_1=F_{1,\hat{y}_1}\) and \(\hat{y}_1\) is the unique state \(\hat{y}_1 = \textrm{argmax}_{ x \in (x_1, b)}\{ u_1(x) \}\).

Proof

By Theorem 2 there exists a unique \(\tilde{y}> x_1 > a\) so that

$$\begin{aligned} (g_1'(x) - \lambda (R_{1}V_{2})'(x)) \psi _{1}(x) \le (g_1(x) - \lambda (R_{1}V_{2})(x)) \psi _{1}'(x) \end{aligned}$$

for every \(x \ge \tilde{y}\). The assumptions imply

$$\begin{aligned}&\dfrac{(g_1''(x) - \lambda (R_{1}V_2)''(x)) \psi _{1}'(x) - (g_1'(x) - \lambda (R_{1}V_2)'(x)) \psi _{1}''(x)}{S'_1(x)}\\&= \dfrac{2(r+\lambda )}{\sigma ^2_1(x)} \int _{a}^{x}\psi _{1}(z)(F_1(x) - F_1(z))m_1'(z)dz \\&\le 0 \end{aligned}$$

so \((g_1'(x) - \lambda (R_{1}V_{2})'(x))/\psi _{1}'(x)\) is non-increasing.

For all \(x \in I\) define

$$\begin{aligned} v(x)=&(g_1'(x) - \lambda (R_{1}V_{2})'(x))(\psi _{1}(x) - \psi _{1}(x_{1}))\\&- (g_1(x) - \lambda (R_{1}V_{2})(x) + \lambda (R_{1}V_{2})(x_1)) \psi _{1}'(x) \end{aligned}$$

\(v(x_1) = - g_1(x_1)\psi _{1}'(x_1) \ge 0\) and for any \(x \ge \tilde{y} \vee x_1\) we have

$$\begin{aligned} v(x)&\le -(g_1'(x) - \lambda (R_{1}V_{2})'(x))\psi _{1}(x_1) - \lambda (R_{1}V_{2})(x_1) \psi _{1}'(x) \\&\le \psi _{1}'(x)\psi _{1}(x_1) \left( \dfrac{ \lambda (R_{1}V_{2})' (x)}{\psi _{1}'(x)} - \dfrac{ \lambda (R_{1}V_{2})(x_1)}{\psi _{1}(x_1)} \right) \\&< 0 \end{aligned}$$

where the last inequality follows from

$$\begin{aligned} \dfrac{\psi _{1}(x)^2}{S_1'(x)} \dfrac{d}{dx} \left( \dfrac{ \lambda (R_{1}V_{2})(x)}{\psi _{1}(x)} \right)&= -\lambda \int _{a}^x\psi _{1}(z)V_{2}(z)m_1'(z)dz < 0 \end{aligned}$$

The function \(v\) is continuous, so there is at least one \(\hat{y}_1>x_1 >a\) for which \(v(\hat{y}_1)=0\). Moreover, \((g_1'(x) - \lambda (R_{1}V_{2})'(x))/\psi _{1}'(x)\) is non-increasing, \(\textrm{sgn}(v(x)) =\textrm{sgn}(v(x)/\psi _{1}'(x))\) and

$$\begin{aligned} \dfrac{d}{dx} \left( \dfrac{v(x)}{\psi _{1}'(x)} \right)&= (\psi _{1}(x) - \psi _{1}(x_1)) \dfrac{d}{dx} \left( \dfrac{g_1'(x) - \lambda (R_{1}V_{2})'(x)}{\psi _{1}'(x)} \right) < 0 \end{aligned}$$

when \(x > x_1\), so \(\hat{y}_1 = \textrm{argmax}_{x \in (x_1, b)} \lbrace u_1(x) \rbrace \) is unique. In accordance with Lemma 3, the monotonicity of \((g_1'(x) - \lambda (R_{1}V_{2})'(x))/\psi _{1}'(x)\) implies \(F_{1,\hat{y}_1}(x) \ge F_{1,\hat{y}_1}(x_{1}) + g_1(x)\) for every \(x \in I\).
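The link between the zeros of \(v\) and the maximizer of \(u_1\) can be made explicit. As a sketch using only the definitions of \(u_1\) and \(v\) given above, differentiating the quotient defining \(u_1\) yields

$$\begin{aligned} u_1'(x)&= \dfrac{(g_1'(x) - \lambda (R_{1}V_{2})'(x))(\psi _{1}(x) - \psi _{1}(x_{1})) - (g_1(x) - \lambda (R_{1}V_{2})(x) + \lambda (R_{1}V_{2})(x_1)) \psi _{1}'(x)}{(\psi _{1}(x) - \psi _{1}(x_{1}))^2} \\&= \dfrac{v(x)}{(\psi _{1}(x) - \psi _{1}(x_{1}))^2} \end{aligned}$$

for \(x > x_1\). Hence the sign of \(u_1'\) agrees with the sign of \(v\) on \((x_1, b)\), and a single sign change of \(v\) from positive to negative corresponds to a unique interior maximum of \(u_1\).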

Clearly \(F_{1,\hat{y}_1}(x) - \lambda (R_{1}V_{2})(x) \in C^2(I {\setminus } \lbrace \hat{y}_1 \rbrace )\) and

$$\begin{aligned} \mathcal {A}_1(F_{1,\hat{y}_1}(x) - \lambda (R_{1}V_{2})(x))&= (r+\lambda )(F_{1,\hat{y}_1}(x) - \lambda (R_{1}V_{2})(x)) \end{aligned}$$

on \((a,\hat{y}_1)\). The function

$$\begin{aligned} H(x)&= (\mathcal {A}_1 - r - \lambda )(F_{1,\hat{y}_1}(x) - \lambda (R_{1}V_{2})(x)) \end{aligned}$$

defined on \((\hat{y}_1, b )\) is non-increasing by the monotonicity of \(F_1\) and

$$\begin{aligned} \lim _{x \downarrow \hat{y}_1} H(x) =&\dfrac{\sigma ^2_1(\hat{y}_1)}{2} (g_1''(\hat{y}_1) - \lambda (R_{1}V_{2})''(\hat{y}_1)) + \mu _1 (\hat{y}_1) (g_1'(\hat{y}_1) - \lambda (R_{1}V_{2})'(\hat{y}_1))\\&- (r+\lambda )(g_1(\hat{y}_1) - \lambda (R_{1}V_{2})(\hat{y}_1)) \\&- (r+\lambda )(\lambda (R_{1}V_{2})(x_{1}) + \psi _{1}(x_{1})u_1(\hat{y}_1)) \\ \le&\dfrac{\sigma ^2_1(\hat{y}_1)}{2} \psi _{1}''(\hat{y}_1)u_1 (\hat{y}_1) + \mu _1 (\hat{y}_1) \psi _{1}'(\hat{y}_1)u_1(\hat{y}_1) \\&- (r+\lambda ) \psi _{1}(\hat{y}_1) u_1(\hat{y}_1) \\ =&(\mathcal {A}_1\psi _{1} (\hat{y}_1)- (r+\lambda ) \psi _{1}(\hat{y}_1) )u_1(\hat{y}_1)\\ =&0 \end{aligned}$$

so \(H(x) \le 0\) for every \(x \ge \hat{y}_1\). Thus \(F_{1,\hat{y}_1}(x) - \lambda (R_{1}V_{2})(x)\) is \(r+\lambda \)-excessive w.r.t. \(X_{1}\). The statement follows from Theorem 3. \(\square \)

6 Comparison results

Having solved the anticipative OSP and ICP in Sects. 4 and 5, we are now ready to formulate various comparison results mapping the order relations between pre-switch and anticipative problems, anticipative and post-switch problems and their respective threshold solutions. We also study sufficient conditions for situations where the anticipative and post-switch threshold solutions are equal, even if the switch is non-trivial. In other words, once an agent receives information about a future switch, they will immediately choose a policy that corresponds to the post-switch threshold solution in the first regime. We call these regime changes neutral, as the anticipative optimal policy is independent of the time at which the switch occurs.

6.1 Order relations

The form and monotonicity assumptions on the functions \(\tilde{F}_1\) and \(F_1\) in Theorems 2 and 4 seem to suggest that the order of the pre- and post-switch value functions shares a fundamental connection with the order of the corresponding threshold solutions. This intuition turns out to be correct, as shown below. Theorems 5 and 6 state that the order of pre- and post-switch value functions determines the order of pre-switch and anticipative threshold solutions. That is, if a regime switch increases value (post-switch value dominates pre-switch value) then this switch should be anticipated by raising the control threshold in advance. The converse holds for switches that decrease value.

Theorem 5

Let \(\tilde{V}_0 \lesseqqgtr \tilde{V}_{2}\). Then \(\tilde{y}_0 \lesseqqgtr \tilde{y}_1\).

Proof

An OSP with a trivial regime switch is equivalent to and has the same value function as a corresponding problem without a switch. Thus by Theorem 2

$$\begin{aligned} \dfrac{g_1(\tilde{y}_1) - \lambda (R_{1}\tilde{V}_{2})(\tilde{y}_1)}{\psi _{1}(\tilde{y}_1)}&\ge \dfrac{g_1(\tilde{y}_0) - \lambda (R_{1}\tilde{V}_{2})(\tilde{y}_0)}{\psi _{1}(\tilde{y}_0)} \\&\ge \dfrac{g_1(\tilde{y}_1) - \lambda (R_{1}\tilde{V}_{0})(\tilde{y}_1)}{\psi _{1}(\tilde{y}_1)} - \dfrac{\lambda (R_{1}(\tilde{V}_2-\tilde{V}_0))(\tilde{y}_0)}{\psi _{1}(\tilde{y}_0)} \end{aligned}$$

implying that

$$\begin{aligned} \dfrac{\lambda (R_{1}(\tilde{V}_{2} - \tilde{V}_0))(\tilde{y}_0)}{\psi _{1}(\tilde{y}_0)}&\ge \dfrac{\lambda (R_{1}(\tilde{V}_{2} - \tilde{V}_0))(\tilde{y}_1)}{\psi _{1}(\tilde{y}_1)} \end{aligned}$$

The statement follows from the above inequality since \(\lambda (R_{1}f)/\psi _{1}\) is strictly decreasing for \(f \ge 0\). \(\square \)

Theorem 6

Let \(V_0 \lesseqqgtr V_{2}\) and suppose that \(\lambda (R_{1}(V_{2}\vee V_0 - V_{2}\wedge V_0))'/\psi _{1}'\) is non-increasing on \((x_1, b)\). Then \(\hat{y}_0 \lesseqqgtr \hat{y}_1\).

Proof

The proof is analogous to that of Theorem 5. It follows from Theorem 4 that

$$\begin{aligned}&\dfrac{g_1(\hat{y}_1) - \lambda (R_{1}V_{2})(\hat{y}_1) + \lambda (R_{1}V_{2})(x_1)}{\psi _{1}(\hat{y}_1) - \psi _{1}(x_1)} \\ \ge&\dfrac{g_1(\hat{y}_0) - \lambda (R_{1}V_{2})(\hat{y}_0) + \lambda (R_{1}V_{2})(x_1)}{\psi _{1}(\hat{y}_0) - \psi _{1}(x_1)} \\ \ge&\dfrac{g_1(\hat{y}_1) - \lambda (R_{1}V_0)(\hat{y}_1) + \lambda (R_{1}V_0)(x_1)}{\psi _{1}(\hat{y}_1) - \psi _{1}(x_1)} \\&+ \dfrac{\lambda (R_{1}V_0)(\hat{y}_0) - \lambda (R_{1}V_0)(x_1)}{\psi _{1}(\hat{y}_0) - \psi _{1}(x_1)} - \dfrac{\lambda (R_{1}V_{2})(\hat{y}_0) - \lambda (R_{1}V_{2})(x_1)}{\psi _{1}(\hat{y}_0) - \psi _{1}(x_1)} \end{aligned}$$

and consequently

$$\begin{aligned}&\dfrac{\lambda (R_{1}(V_{2} - V_0))(\hat{y}_0) - \lambda (R_{1}(V_{2} - V_0))(x_1)}{\psi _{1}(\hat{y}_0) - \psi _{1}(x_1)} \\ \ge&\dfrac{\lambda (R_{1}(V_{2} - V_0))(\hat{y}_1) - \lambda (R_{1}(V_{2} - V_0))(x_1)}{\psi _{1}(\hat{y}_1) - \psi _{1}(x_1)} \end{aligned}$$

The statement follows from the above inequality. To see this, let \(f = V_{2}\vee V_0 - V_{2}\wedge V_0\) and

$$\begin{aligned} h(x)&= \dfrac{\lambda (R_{1}f)(x) - \lambda (R_{1}f)(x_1)}{\psi _{1}(x) - \psi _{1}(x_1)} \end{aligned}$$

Here \(\lim _{x \downarrow x_1}h(x) = \lambda (R_{1}f)'(x_1)/\psi _{1}'(x_1)\), \(h'(x) < 0\) when \(h(x) > \lambda (R_{1}f)'(x)/\psi _{1}'(x)\) and \(\lambda (R_{1}f)'/\psi _{1}'\) was assumed to be non-increasing on \((x_1, b)\), so \(h\) is strictly decreasing on \((x_1, b)\). \(\square \)
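The sign condition on \(h'\) used in the proof can be read off from the quotient rule. As a sketch, assuming only that \(\psi _{1}\) is differentiable and strictly increasing, one has for \(x > x_1\)

$$\begin{aligned} h'(x)&= \dfrac{\lambda (R_{1}f)'(x)(\psi _{1}(x) - \psi _{1}(x_1)) - (\lambda (R_{1}f)(x) - \lambda (R_{1}f)(x_1))\psi _{1}'(x)}{(\psi _{1}(x) - \psi _{1}(x_1))^2} \\&= \dfrac{\psi _{1}'(x)}{\psi _{1}(x) - \psi _{1}(x_1)} \left( \dfrac{\lambda (R_{1}f)'(x)}{\psi _{1}'(x)} - h(x) \right) \end{aligned}$$

The prefactor is positive on \((x_1, b)\), so \(h'(x) < 0\) exactly when \(h(x) > \lambda (R_{1}f)'(x)/\psi _{1}'(x)\).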

Theorem 6 yields the following corollary, which is an analogue of Theorem 5.8 in Alvarez (2004a).

Corollary 2

(Risk sensitivity of the anticipative ICP threshold solutions) Suppose that the volatility of the underlying is the only changing parameter, \(\sigma _2(x) \ge \sigma _0(x)\) for all \(x >0\) and there is a state \(\tilde{z} \in [a, b]\), such that \(\mu _1(x) \le 0\) for \(x \ge \tilde{z}\) and \(\mu _1(x) - rx\) is non-increasing on \((a, \tilde{z})\). Then \(\hat{y}_1 > \hat{y}_0\). That is, if a regime switch increases volatility, the switch should be anticipated by raising the control threshold.

Proof

By Theorem 5.8 in Alvarez (2004a), we have \(V_0 \le V_{2,\iota }\). The statement follows from Theorem 6. \(\square \)

Lastly, Theorems 7 and 8 describe the order of the thresholds \(\tilde{y}_1,\hat{y}_1\) and the value functions \(\tilde{V}_1,V_1\) respectively. They are extensions of the corresponding results in Alvarez (2004a) for anticipative problems.

Theorem 7

Suppose that \(\lambda (R_1V_2)/\psi _1\) is convex. Then \(\hat{y}_1 \le \tilde{y}_1\).

Proof

Combining the inequalities

$$\begin{aligned}&\dfrac{g_1(\hat{y}_1) - \lambda (R_{1}V_{2,\iota })(\hat{y}_1) + \lambda (R_{1}V_{2,\iota })(x_1)}{\psi _{1}(\hat{y}_1) - \psi _{1}(x_1)} \\ \ge&\dfrac{g_1(\tilde{y}_1) - \lambda (R_{1}V_{2,\iota })(\tilde{y}_1) + \lambda (R_{1}V_{2,\iota })(x_1)}{\psi _{1}(\tilde{y}_1) - \psi _{1}(x_1)} \end{aligned}$$

and

$$\begin{aligned} \dfrac{g_1(\tilde{y}_1) - \lambda (R_{1}\tilde{V}_{2,\iota })(\tilde{y}_1)}{\psi _{1}(\tilde{y}_1)}&\ge \dfrac{g_1(\hat{y}_1) - \lambda (R_{1}\tilde{V}_{2,\iota })(\hat{y}_1)}{\psi _{1}(\hat{y}_1)} \end{aligned}$$

yields

$$\begin{aligned}&g_1(\hat{y}_1) \left( \dfrac{1}{\psi _{1}(\hat{y}_1) - \psi _{1}(x_1)} - \dfrac{\psi _{1}(\tilde{y}_1)}{(\psi _{1}(\tilde{y}_1) - \psi _{1}(x_1))\psi _{1}(\hat{y}_1)} \right) \\ \ge&\dfrac{\lambda (R_{1}V_{2,\iota })(\hat{y}_1) - \lambda (R_{1} V_{2,\iota })(x_1)}{\psi _{1}(\hat{y}_1) - \psi _{1}(x_1)} - \dfrac{\lambda (R_{1}V_{2,\iota })(\tilde{y}_1) - \lambda (R_{1}V_{2,\iota })(x_1)}{\psi _{1}(\tilde{y}_1) - \psi _{1}(x_1)}\\&+ \dfrac{\lambda (R_{1}\tilde{V}_{2,\iota })(\tilde{y}_1) - \lambda (R_{1}\tilde{V}_{2,\iota })(\hat{y}_1)\dfrac{\psi _{1} (\tilde{y}_1)}{\psi _{1}(\hat{y}_1)}}{\psi _{1}(\tilde{y}_1) - \psi _{1}(x_1)} \end{aligned}$$

The left hand side of the above inequality can be written as

$$\begin{aligned} g_1(\hat{y}_1) \left( \dfrac{(\psi _{1}(\tilde{y}_1) - \psi _{1}(\hat{y}_1))\psi _{1}(x_1)}{(\psi _{1}(\hat{y}_1) - \psi _{1}(x_1))(\psi _{1}(\tilde{y}_1) - \psi _{1}(x_1))\psi _{1}(\hat{y}_1)} \right) \end{aligned}$$

which is negative when \(\tilde{y}_1 < \hat{y}_1\). To prove \(\hat{y}_1 \le \tilde{y}_1\), it is now sufficient to show that the right hand side is positive if \(\tilde{y}_1 < \hat{y}_1\). This is true since \(x_1 < \tilde{y}_1\), \(\lambda (R_1\tilde{V}_2)/\psi _1\) is strictly decreasing and

$$\begin{aligned} \dfrac{\lambda (R_{1}V_{2,\iota })(\hat{y}_1) - \lambda (R_{1}V_{2,\iota })(x_1)}{\psi _{1}(\hat{y}_1) - \psi _{1}(x_1)}&> \dfrac{\lambda (R_{1}V_{2,\iota })(\tilde{y}_1) - \lambda (R_{1}V_{2,\iota })(x_1)}{\psi _{1}(\tilde{y}_1) - \psi _{1}(x_1)} \end{aligned}$$

which holds because \(\lambda (R_1V_2)/\psi _1\) is strictly decreasing and was assumed to be convex. \(\square \)

Theorem 8

Assume that the conditions of Theorem 1 and Corollary 1 or Theorems 2 and 4 hold, \(\lim _{x \downarrow a}\lambda (R_1V_2)(x) = 0\) and \(\lim _{x_2 \downarrow a}\tilde{V}_2 = V_2\). Then \(V_1 \ge \tilde{V}_1\), \(\lim _{x_1 \downarrow a, x_2 \downarrow a} \hat{y}_1 = \tilde{y}_1\) and \(\lim _{x_1 \downarrow a, x_2 \downarrow a} V_1 = \tilde{V}_1\).

Proof

The inequality \(V_1 \ge \tilde{V}_1\) follows from Theorem 3. The limit assumptions imply \(\lim _{x_1 \downarrow a, x_2 \downarrow a} u_1(x) = C(x)\), which in turn implies \(\lim _{x_1, x_2 \downarrow a} \hat{y}_1 = \tilde{y}_1\). \(\lim _{x_1, x_2 \downarrow a} V_1 = \tilde{V}_1\) follows now from the explicit forms of the value functions \(\tilde{V}_1\) and \(V_1\). \(\square \)

6.2 Neutral regime changes

As stated in the beginning of this section, it may happen that the anticipative and post-switch threshold solutions coincide. In this case it is optimal for an agent to anticipate an upcoming regime switch by immediately choosing the optimal post-switch policy, regardless of the moment when the switch actually occurs. The study of neutral anticipation for general regime switching problems seems to be quite complicated. However, if either the payoff or the underlying remains fixed during the switch, the analysis becomes somewhat more tractable. Theorem 9 gives a necessary condition for neutral anticipation for an OSP with a switching payoff and Theorem 10 gives a sufficient condition for a problem with a switching diffusion.

Theorem 9

Suppose that \(g_1,g_2 \in C^2(I)\) and the underlying does not change during the switch. If \(\tilde{y}_1 = \tilde{y}_2\), then

$$\begin{aligned} {\left\{ \begin{array}{ll} (\mathcal {L}_{\psi _{2}}g_2)(\tilde{y}_1) = 0\\ (\mathcal {L}_{\psi _{1}}g_1)(\tilde{y}_1) = (\mathcal {L}_{\psi _{1}}g_2)(\tilde{y}_1) \end{array}\right. } \end{aligned}$$
(22)

where \(\mathcal {L}_{\psi }\) are as in (16).

Proof

Combining Lemma 1 with the optimality condition of Theorem 2 yields

$$\begin{aligned} (\mathcal {L}_{\psi _2}g_2)(\tilde{y}_2)&=0 \end{aligned}$$

and

$$\begin{aligned} S'(\tilde{y}_1)(\mathcal {L}_{\psi _1}g_1)(\tilde{y}_1)&= \dfrac{g_2(\tilde{y}_2)}{\psi _2(\tilde{y}_2)} (\psi _2'(\tilde{y}_1)\psi _1(\tilde{y}_1) - \psi _1'(\tilde{y}_1)\psi _2(\tilde{y}_1)) \end{aligned}$$

for \(\tilde{y}_1 \le \tilde{y}_2\). \(\square \)

Theorem 10

Suppose that the payoff g does not change during the switch and \(x_1 = x_2\). If \(\psi _2 = \psi _0\), then \(\tilde{y}_0 = \tilde{y}_1 = \tilde{y}_2\), \(\hat{y}_0 = \hat{y}_1 = \hat{y}_2\), \(\tilde{V}_0 = \tilde{V}_1 = \tilde{V}_2\) and \(V_0 = V_1 = V_2\).

Proof

The assumptions imply \(\tilde{V}_0 = \tilde{V}_2\) and \(V_0 = V_2\) and hence \(\tilde{y}_0 = \tilde{y}_2\) and \(\hat{y}_0 = \hat{y}_2\). Theorems 5 and 6 imply \(\tilde{y}_0 = \tilde{y}_1\) and \(\hat{y}_0 = \hat{y}_1\). Since the solutions exist and are of a threshold form, we have

$$\begin{aligned} \tilde{V}_1(x)&= \mathbb {E}_x \left[ e^{-r\tau _{\tilde{y}_1}} g(X_{1,\tau _{\tilde{y}_1}})1_{\{\tau _{\tilde{y}_1} < T \}} + e^{-rT}\tilde{V}_0(X_{1,T}) 1_{\{\tau _{\tilde{y}_1} \ge T \}} \right] \\&= \mathbb {E}_x \left[ e^{-r\tau _{\tilde{y}_1}}g(X_{1,\tau _{\tilde{y}_1}}) \right] \\&= \tilde{V}_0(x) \end{aligned}$$

and

$$\begin{aligned} V_1(x)&= \sup _{\nu \in \mathcal {V}}\mathbb {E}_{x} \left[ \sum _{i=1}^{N}e^{-(r+ \lambda ) \tau _i}g_1(X_{1,\tau _{i}^-}^{\nu }) + \int _{0}^{\tau _N} \lambda e^{-(r+ \lambda )s}V_{0}(X_{1,s}^{\nu })ds \right] \\&= \sup _{\nu \in \mathcal {V}_1}\mathbb {E}_{x} \left[ \sum _{i=1}^{N} e^{-(r+ \lambda ) \tau _i}g_1(X_{1,\tau _{i}^-}^{\nu }) \right] \\&= V_0(x) \end{aligned}$$

for all \(x \in I\). \(\square \)

The conditions of Theorem 10 might seem extremely restrictive, but it turns out that there are non-trivial regime switches for which \(\psi _2 = \psi _0\) holds. For these problems, the regime switching structure has no effect on the value functions or the optimal control policies. We elaborate on this phenomenon in Sect. 7.3.

7 Examples

Having solved the anticipative OSP and ICP in Sects. 4 and 5 and studied their relationships further in Sect. 6, we will now illustrate the preceding analysis with three examples. The first two are general examples dealing with anticipated switches in a cash flow tax and the interest rate regimes, respectively. We apply various comparison results of Sect. 6 to qualitatively describe the anticipative value functions and threshold solutions and the relations between them and their pre- and post-switch counterparts. In the third example we examine conditions leading to total neutrality (as described in Theorem 10) for a problem with a switching GBM. Throughout this section we assume the conditions of the Theorems in Sects. 4, 5 and 6.1.

7.1 Switching cash flow tax

As noted in the Introduction, it is not always realistic to assume that the macroeconomic conditions affecting the OSP or ICP at hand will remain unchanged in the foreseeable future. Switches may arise e.g. due to the long run infeasibility of the current government policies (Drazen and Helpman 1990). In this example we study problems with an anticipated switch in the cash flow tax rate. We chose cash flow tax over other forms of income taxation for simplicity, since it directly scales the payoff by a constant factor between 0 and 1.

Suppose that the underlying does not change, \(x_2 = x_1\) and the payoffs are of the form \(g_i = (1-t_i)g\) where \(t_i \in (0,1)\), \(i=1,2\) and \(t_1 \ne t_2\). Then

$$\begin{aligned} \textrm{argmax}_{x \in I} \left\{ \dfrac{g(x)}{\psi _{0}(x)} \right\}&= \textrm{argmax}_{x \in I} \left\{ \dfrac{(1 - t)g(x)}{\psi _{0}(x)} \right\} ,\\ \textrm{argmax}_{x \in (x_1, b)} \left\{ \dfrac{g(x)}{\psi _{0}(x) - \psi _{0}(x_1)} \right\}&= \textrm{argmax}_{x \in (x_1, b)} \left\{ \dfrac{(1 - t)g(x)}{\psi _{0}(x)- \psi _{0}(x_1)} \right\} \end{aligned}$$

for any \(t \in (0,1)\), implying that the optimal pre- and post-switch control thresholds are equal for OSP and ICP. Still, the anticipative thresholds will differ from them in general. Indeed, for OSP Theorem 5 implies that if \(t_1 \lessgtr t_2\), then \(\tilde{V}_0 \gtrless \tilde{V}_2\) and consequently \(\tilde{y}_0 \gtrless \tilde{y}_1\). For ICP, the analogous result follows from Theorem 6. These conditions imply that there are no neutral changes, provided that the switch is non-trivial. Thus in the present framework all non-trivial changes in the cash flow tax rate create an incentive to either delay or accelerate irreversible investment. The results are in line with similar literature on the effects of uncertain tax policies on irreversible investment (Nickell 1977; Sandmo 1979).
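The scale invariance of the pre- and post-switch thresholds noted above can be checked numerically. The following is a minimal sketch, not part of the original analysis: it assumes a GBM underlying with \(\psi _0(x) = x^{\alpha }\) and the payoff \(g(x) = x - c\), and all parameter values and helper names (`alpha_plus`, `argmax_on_grid`) are hypothetical choices made for illustration.

```python
import math


def alpha_plus(mu, sigma, r):
    """Positive root of (sigma^2/2) a (a - 1) + mu a - r = 0 (GBM assumption)."""
    k = 0.5 - mu / sigma**2
    return k + math.sqrt(k * k + 2.0 * r / sigma**2)


def argmax_on_grid(func, grid):
    """Return the grid point at which func attains its largest value."""
    return max(grid, key=func)


# Hypothetical parameters, chosen only for illustration.
mu, sigma, r, c, x1 = 0.02, 0.3, 0.05, 1.0, 0.5
a = alpha_plus(mu, sigma, r)


def psi(x):
    """Increasing fundamental solution for the assumed GBM."""
    return x**a


def g(x):
    """Assumed pre-tax payoff."""
    return x - c


def osp_ratio(t):
    """After-tax ratio (1 - t) g / psi maximized by the OSP threshold."""
    return lambda x: (1.0 - t) * g(x) / psi(x)


def icp_ratio(t):
    """After-tax ratio (1 - t) g / (psi - psi(x1)) maximized by the ICP threshold."""
    return lambda x: (1.0 - t) * g(x) / (psi(x) - psi(x1))


# Grid on (x1, x1 + 20) with step 0.001.
grid = [x1 + 0.001 * k for k in range(1, 20000)]
tax_rates = (0.0, 0.2, 0.6)
osp_thresholds = [argmax_on_grid(osp_ratio(t), grid) for t in tax_rates]
icp_thresholds = [argmax_on_grid(icp_ratio(t), grid) for t in tax_rates]

print("OSP thresholds:", osp_thresholds)
print("ICP thresholds:", icp_thresholds)
```

Up to the grid resolution, the maximizers do not depend on the tax rate, which is the invariance stated above. The sketch says nothing about the anticipative thresholds, which require \(R_{1}V_{2}\) and generally do depend on the tax rates.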

A brief discussion of the practical implications of the aforementioned results is in order. Consider an agent who is solving a pre-switch problem (either OSP or ICP) and receives information about an upcoming tax regime switch. If the switch increases the tax rate, the post-switch value will be lower than the pre-switch value and consequently the conditions are more favourable before the switch than after it. Thus even though the pre- and post-switch threshold solutions are equal, rising taxes are best anticipated by momentarily lowering the control threshold so that there is a higher chance of larger after-tax nominal payoffs before the switch. Likewise, switches decreasing the tax rate are best anticipated by momentarily raising the control threshold. This kind of anticipation results in a higher chance of the switch occurring before the next control time, resulting again in larger after-tax nominal payoffs. The results emphasize that in general, the neutrality of a tax system does not guarantee neutral anticipation of a changing tax policy.

7.2 Switching interest rate

The previous example can be seen as a brief study of how anticipated switches in a fiscal policy affect irreversible investment. We now turn our attention to anticipated switches in a monetary policy. The simplest such models that can be analysed in the present framework are problems where the only regime dependent parameter is the interest rate. Interest rates are often studied together with macroeconomic variables such as the inflation and unemployment rate (see e.g. Alvarez et al. 2001; Sargent et al. 1973). Here we isolate the interest rate switches in order to apply the preceding analysis. This example is also related to the seminal paper of Ingersoll Jr and Ross (1992) since the two values of the interest rate are deterministic but the switching time is random.

We begin by assuming that the interest rate is the only regime dependent parameter and the payoff and underlying satisfy the conditions of Theorems 2 and 4. In particular \(x_1 = x_2\) for the impulse control problems. We denote the regime interest rates as \(r_1\) and \(r_2\) and assume that \(r_1 \ne r_2\) since otherwise the switch would be trivial. Thus the anticipative rate is \(r_1 + \lambda \). Defining \(F_i = (\mathcal {A} - r_i)g\) for \(i=1,2\) we see that \(F_1 \lessgtr F_2\) if and only if \(r_1 \gtrless r_2\). Choosing \(\lambda = 0\) in Theorems 2 and 4 yields that \(r_1 \gtrless r_2\) is equivalent to \(\tilde{y}_0 \lessgtr \tilde{y}_2\) and \(\hat{y}_0 \lessgtr \hat{y}_2\) respectively. Thus in the present example, the ordering of the pre- and post-switch thresholds is opposite to the ordering of the regime interest rates. Next we show that the order conditions of Theorems 5 and 6 are satisfied so that we have a similar ordering for the pre-switch and anticipative thresholds as well.

We write \(\alpha = r_1 \wedge r_2\), \(\beta = r_1 \vee r_2\), \(y_* = \tilde{y}_0 \wedge \tilde{y}_2\) and \(y^* = \tilde{y}_0 \vee \tilde{y}_2\) in order to simplify the notation. For \(x \ge y^*\) we clearly have \(\tilde{V}_0(x) = g(x) = \tilde{V}_2(x)\). Using the uniqueness of the optimal stopping thresholds it is also easy to see that \(\tilde{y}_0 \lessgtr \tilde{y}_2\) implies \(\tilde{V}_0(x) \lessgtr \tilde{V}_2(x)\) for \(x \in [ y_*, y^*)\). The case \(x < y_*\) requires some analysis. For any \(x,y \in I\) such that \(x < y\), we have

$$\begin{aligned} \frac{\psi _{\beta }(x)}{\psi _{\beta }(y)}&= \mathbb {E}_x[e^{-\beta \tau _y}] \le \mathbb {E}_x[e^{-\alpha \tau _y}] = \frac{\psi _{\alpha }(x)}{\psi _{\alpha }(y)} \end{aligned}$$
(23)

The equalities in (23) are well known results in the classical theory of diffusions (see e.g. Borodin and Salminen 2012, p. 18). Now
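Inequality (23) can be illustrated in a concrete case. The sketch below assumes a GBM underlying, for which \(\psi _{\rho }(x) = x^{\alpha _+(\rho )}\) with \(\alpha _+(\rho )\) the positive root of the characteristic quadratic at discount rate \(\rho \); the GBM assumption, the parameter values and the function names are illustrative and not taken from the general setting of this example.

```python
import math


def alpha_plus(mu, sigma, rate):
    """Positive root of (sigma^2/2) a (a - 1) + mu a - rate = 0 (GBM assumption)."""
    k = 0.5 - mu / sigma**2
    return k + math.sqrt(k * k + 2.0 * rate / sigma**2)


def hitting_time_laplace(x, y, mu, sigma, rate):
    """psi_rate(x) / psi_rate(y) = E_x[exp(-rate * tau_y)] for a GBM with x < y."""
    return (x / y) ** alpha_plus(mu, sigma, rate)


# Hypothetical parameters: alpha_rate plays the role of r1 ^ r2, beta_rate of r1 v r2.
mu, sigma = 0.02, 0.3
alpha_rate, beta_rate = 0.04, 0.08
x, y = 1.0, 2.5

low = hitting_time_laplace(x, y, mu, sigma, beta_rate)    # psi_beta(x) / psi_beta(y)
high = hitting_time_laplace(x, y, mu, sigma, alpha_rate)  # psi_alpha(x) / psi_alpha(y)
print(low, high)
```

In this special case the ordering follows because \(\alpha _+\) is increasing in the discount rate and \(x/y < 1\).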

$$\begin{aligned} \frac{g(y_*)}{\psi _{\beta }(y_*)}\psi _{\beta }(x)&< \frac{g(y_*)}{\psi _{\alpha }(y_*)}\psi _{\alpha }(x) < \frac{g(y^*)}{\psi _{\alpha }(y^*)}\psi _{\alpha }(x) \end{aligned}$$

for \(x < y_*\). Thus \(r_1 \gtrless r_2\) implies \(\tilde{V}_0 \lessgtr \tilde{V}_2\) and it follows from Theorem 5 that \(\tilde{y}_0 \lessgtr \tilde{y}_1\).
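The ordering of the value functions can also be seen numerically in a special case. As a hedged sketch, assume a GBM underlying and the payoff \(g(x) = x - c\), so that the stopping threshold is \(\alpha c/(\alpha - 1)\) with \(\alpha > 1\) the positive characteristic root, and the value is \(g(\tilde{y})(x/\tilde{y})^{\alpha }\) below the threshold. These closed forms, the parameter values and the helper names are assumptions made for this illustration only.

```python
import math


def alpha_plus(mu, sigma, rate):
    """Positive root of (sigma^2/2) a (a - 1) + mu a - rate = 0 (GBM assumption)."""
    k = 0.5 - mu / sigma**2
    return k + math.sqrt(k * k + 2.0 * rate / sigma**2)


def osp_threshold(a, c):
    """Maximizer of (x - c) / x**a; requires a > 1 (i.e. rate > mu)."""
    return a * c / (a - 1.0)


def osp_value(x, a, c):
    """Value of stopping at the threshold for the assumed payoff g(x) = x - c."""
    y = osp_threshold(a, c)
    if x >= y:
        return x - c
    return (y - c) * (x / y) ** a


# Hypothetical parameters: the two regimes differ only in the interest rate.
mu, sigma, c = 0.02, 0.3, 1.0
low_rate, high_rate = 0.04, 0.08
a_low = alpha_plus(mu, sigma, low_rate)
a_high = alpha_plus(mu, sigma, high_rate)

y_low_rate = osp_threshold(a_low, c)
y_high_rate = osp_threshold(a_high, c)

grid = [0.1 * k for k in range(1, 61)]  # states 0.1, 0.2, ..., 6.0
values_low_rate = [osp_value(x, a_low, c) for x in grid]
values_high_rate = [osp_value(x, a_high, c) for x in grid]
print(y_low_rate, y_high_rate)
```

With these inputs the lower rate produces both the higher threshold and the pointwise larger value function, matching the ordering derived above.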

For impulse control thresholds the calculations are somewhat more complicated. Let \(\alpha , \beta \) be as in the previous paragraph and let \(y_* = \hat{y}_0 \wedge \hat{y}_2, y^* = \hat{y}_0 \vee \hat{y}_2\). Using (23) we get

$$\begin{aligned} \frac{g(y_*) \frac{\psi _{\beta }(x)}{\psi _{\beta }(y_*)}}{1 - \frac{\psi _{\beta }(x_1)}{\psi _{\beta }(y_*)}}&< \frac{g(y_*) \frac{\psi _{\alpha }(x)}{\psi _{\alpha }(y_*)}}{1 - \frac{\psi _{\alpha }(x_1)}{\psi _{\alpha }(y_*)}} < \frac{g(y^*) \frac{\psi _{\alpha }(x)}{\psi _{\alpha }(y^*)}}{1 - \frac{\psi _{\alpha }(x_1)}{\psi _{\alpha }(y^*)}} \end{aligned}$$
(24)

for \(x < y_*\). Thus \(r_1 \gtrless r_2\) implies \(V_0(x) \lessgtr V_2(x)\) for \(x < y_*\) and \(x \ge y^*\). For \(x \in [y_*, y^*)\) the condition \(V_0(x) \lessgtr V_2(x)\) follows by (24) and the monotonicity of \(\psi _{\beta }\). Moreover, if the monotonicity condition of Theorem 6 is satisfied, then \(\hat{y}_0 \lessgtr \hat{y}_1\).

The condition \(r_1 \gtrless r_2\) thus implies both \(\tilde{y}_0 \lessgtr \tilde{y}_1\) and \(\hat{y}_0 \lessgtr \hat{y}_1\). This means that there are no non-trivial interest rate switches leading to neutral anticipation. In light of the above calculations we may also conclude that in our framework higher interest rates discourage irreversible investment and lower interest rates have the opposite effect. Furthermore, rising interest rates are best anticipated by delaying irreversible investment decisions and falling rates by accelerating them. The results are in line with the usual textbook view on the relationship between interest rates and investment. In particular it seems that here the interest rate structure is in some sense too deterministic to allow for the counter-intuitive positive relationship between interest rates and investment described in Ingersoll Jr and Ross (1992).

7.3 Total neutrality for a switching GBM

As stated in Theorem 10, there are non-trivial regime-switches that guarantee not only neutral anticipation, but also the identity of pre-, anticipative and post-switch optimal control policies and value functions. We call this type of neutrality total neutrality, since the regime switching structure has no effect on the solution of the problem. In this example we determine the regime switches that satisfy the conditions of Theorem 10 when the underlying is a GBM in both regimes.

Let \(g_1 = g_2 = g\). Contrary to the preceding analysis, we label quantities associated with regime 1 with the index 1 instead of 0 for the rest of this section. This is a notational convenience because in the present example we are only comparing quantities related to the individual regimes. Thus for \(i =1,2\) the diffusion generators are

$$\begin{aligned} \mathcal {A}_i&= \mu _i x \dfrac{d}{dx} + \dfrac{\sigma _i^2}{2}x^2 \dfrac{d^2}{dx^2}, \qquad i=1,2 \end{aligned}$$

and the minimal \(r\)-excessive functions are \(\psi _{i}(x) = x^{\alpha _{i,+}}\) and \(\varphi _{i}(x) = x^{\alpha _{i,-}}\), where

$$\begin{aligned} \alpha _{i, \pm }&= \dfrac{1}{2} - \dfrac{\mu _i}{\sigma _i^2} \pm \sqrt{\left( \dfrac{1}{2} - \dfrac{\mu _i}{\sigma _i^2}\right) ^2 + \dfrac{2r}{\sigma _i^2}} \end{aligned}$$
(25)

It is now evident that \(\psi _{2} = \psi _{1}\) if and only if \(\alpha _{2,+} = \alpha _{1,+}\), which is equivalent to

$$\begin{aligned} \dfrac{\sigma _1^2}{2}\alpha _{2,+}(\alpha _{2,+} - 1) + \mu _1 \alpha _{2,+} - r&= 0 \end{aligned}$$
(26)

Note that (26) implies total neutrality for both the OSP and the ICP. Economically interesting applications arise when the payoff is e.g. of the form \(g(x) = x - c,\, c>0\). The pre- and post-switch optimal stopping policies satisfy the relation

$$\begin{aligned} \alpha _{i,+}&= \dfrac{g'(\tilde{y}_i)\tilde{y}_i}{g(\tilde{y}_i)} = \dfrac{\tilde{y}_i}{\tilde{y}_i - c} \end{aligned}$$
(27)

The right hand side of (27) describes the elasticity of the payoff with respect to the state variable. If the underlying is taken to be the value of an irreversible investment and c is the fixed cost of exercising the investment opportunity, the problem turns into a regime switching extension of the investment problem studied by Dixit et al. (1999). Following their markup interpretation of optimal stopping rules, we can say that total neutrality holds if the optimal markups are identical in both regimes. Thus if both regimes induce the exact same trade-off between larger versus later net benefits, there is no incentive to ever deviate from the pre-switch policy, even if the switch is non-trivial.
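Conditions (25)-(27) lend themselves to a short numerical illustration. The sketch below is only an example under stated assumptions: the parameter values are hypothetical, the helper `neutral_drift` is introduced here for illustration, and it uses the symmetric form of (26), namely that \(\alpha _{1,+}\) must also solve the characteristic quadratic of regime 2. Given \((\mu _1, \sigma _1)\) and a new volatility \(\sigma _2\), it solves for the drift \(\mu _2\) that leaves \(\alpha _{+}\), and hence the threshold from (27) with \(g(x) = x - c\), unchanged.

```python
import math


def alpha_pm(mu, sigma, r):
    """Both roots of (sigma^2/2) a (a - 1) + mu a - r = 0, as in (25)."""
    k = 0.5 - mu / sigma**2
    root = math.sqrt(k * k + 2.0 * r / sigma**2)
    return k + root, k - root


def neutral_drift(alpha, sigma2, r):
    """Drift mu_2 solving (sigma2^2/2) alpha (alpha - 1) + mu_2 alpha - r = 0."""
    return (r - 0.5 * sigma2**2 * alpha * (alpha - 1.0)) / alpha


# Hypothetical parameters, chosen only for illustration.
r, c = 0.05, 1.0
mu1, sigma1 = 0.02, 0.2
sigma2 = 0.4  # volatility doubles in the second regime

a1, _ = alpha_pm(mu1, sigma1, r)
mu2 = neutral_drift(a1, sigma2, r)
a2, _ = alpha_pm(mu2, sigma2, r)

# Thresholds from (27) with g(x) = x - c, i.e. y = alpha * c / (alpha - 1).
threshold1 = a1 * c / (a1 - 1.0)
threshold2 = a2 * c / (a2 - 1.0)
print(mu2, a1, a2, threshold1, threshold2)
```

In this example the higher volatility of the second regime is offset by a lower drift, so the switch is non-trivial while the positive root and the stopping threshold coincide across regimes.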

In the context of impulse control problems, the chosen payoff is used e.g. in the stochastic Faustmann timber harvesting problem as discussed in Alvarez (2004b). In this case, total neutrality means that even if the environmental factors affecting the forest growth dynamics undergo a regime switch, the optimal harvesting policy and the forest stand value remain unchanged.

8 Discussion

In this paper we studied semi-explicit solutions to OSPs and ICPs with one exponentially distributed regime switch. We provided general forms of their solutions and sufficient conditions for existence and uniqueness of threshold solutions which are a class of particularly simple and attractive control policies. Various intuitive comparison results for different problems and their threshold solutions were formulated. We also considered three economically relevant examples as demonstrations of the general theory, namely anticipated changes in the cash flow tax and the interest rate and neutral anticipation for a switching GBM. In the first example, it was found that neutral anticipation is impossible for non-trivial switches in the cash flow tax rate. The result is in line with previous literature. In the second example we obtained a similar neutrality result for interest rates and recovered the mainstream result that higher interest rates discourage irreversible investment while lower rates encourage it. The third example demonstrated that there are non-trivial regime switches for which the problem exhibits a condition much stronger than neutral anticipation, which we labelled total neutrality. Total neutrality means that the pre-, anticipative and post-switch value functions (and the control thresholds) coincide and thus the underlying regime switching structure has no effect on the solution of the problem.

In the existing literature on stochastic impulse control problems with regime switching, the switches are usually modelled as finite state Markov chains in continuous time. Hence most of the results and solution techniques rely heavily on the Markovian structure of the problems. However, one could in principle consider a problem where the regime switches follow arbitrary, sufficiently smooth probability distributions. As a consequence the problems become non-Markovian and time inconsistent in the sense that initial optimal solutions may not remain optimal as time increases. In particular the classical notion of an optimal strategy arising from Hamilton-Jacobi-Bellman equations has to be abandoned and one has to look for equilibrium strategies in a game theoretic framework (e.g. Bayraktar et al. 2021, 2023; Björk et al. 2017; He and Jiang 2021; Huang and Nguyen-Huu 2018; Huang et al. 2020; Huang and Zhou 2020; Huang and Wang 2021; Huang and Zhou 2021). Such models go beyond the scope of the present work and are left for future research.