1 Introduction

Selling is a fundamental and ubiquitous economic operation. As the prices of goods fluctuate over time, ‘What is the best time to sell an asset to maximise revenue?’ qualifies as a basic question in Finance. Suppose that an asset needs to be sold before a known deterministic time \(T>0\) and that the only source of information available to the seller is the price history. A natural mathematical reformulation of the aforementioned optimal selling question is to find a selling time \(\tau ^{*} \in {\mathcal {T}}_{T}\) such that

$$\begin{aligned} {\mathbb {E}}[S_{\tau ^*}] = \sup _{\tau \in \mathcal {T}_{T}} {\mathbb {E}}[S_\tau ], \end{aligned}$$
(1.1)

where \(\{S_{t}\}_{t\ge 0}\) denotes the price process and \(\mathcal {T}_{T}\) denotes the set of stopping times with respect to the price process S.

Many popular continuous models for the price process are of the form

$$\begin{aligned} \,\mathrm {d}S_t = \alpha S_t \,\mathrm {d}t +\sigma (t) S_t \,\mathrm {d}W_t, \end{aligned}$$
(1.2)

where \(\alpha \in {\mathbb {R}}\) is called the drift, and \(\sigma \ge 0\) is known as the volatility process. Under the simplifying assumptions that the volatility is independent of W and time-homogeneous, an m-state time-homogeneous Markov chain stands out as a basic yet rather flexible stochastic volatility model (proposed in [11]), which we adopt in this article. The flexibility comes from the fact that we can choose the state space as well as the transition intensities between the states.
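To make the model concrete, the following minimal Python sketch simulates one price path of the form (1.2) with a two-state Markov chain volatility. All numerical values (drift, state space, generator) are hypothetical and chosen purely for illustration; they are not taken from the article.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical illustrative parameters (not taken from the paper):
alpha = 0.05                      # drift
sigmas = np.array([0.2, 0.4])     # volatility states sigma_1 < sigma_2
Lam = np.array([[-2.0, 2.0],
                [3.0, -3.0]])     # generator: off-diagonal intensities, rows sum to 0
T, N = 1.0, 10_000
dt = T / N

S, i = 1.0, 0                     # initial price S_0 = 1 and volatility state
for _ in range(N):
    if rng.random() < -Lam[i, i] * dt:           # regime switch w.p. ~ lambda_i * dt
        probs = np.maximum(Lam[i], 0.0)
        i = int(rng.choice(len(sigmas), p=probs / probs.sum()))
    # Euler step of dS = alpha * S dt + sigma(t) * S dW
    S *= 1.0 + alpha * dt + sigmas[i] * np.sqrt(dt) * rng.standard_normal()

print(S)
```

Between switches the path behaves like a geometric Brownian motion with the current state's volatility, which is exactly the structure exploited later in the approximation procedure.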

Though the problem (1.1) in which S follows (1.2) is well-posed mathematically, from a financial point of view, the known drift assumption is widely accepted to be unreasonable (e.g. see [32, Sect. 4.2 on p. 144]) and needs to be relaxed. Hence, using the Bayesian paradigm, we model the initial uncertainty about the drift by a probability distribution (known as the prior in Bayesian inference), which incorporates all the available information about the parameter and its uncertainty (see [15] for more on the interpretation of the prior). If the quantification of initial uncertainty is subjective, then the prior represents one’s beliefs about how likely the drift is to take different values. To be able to incorporate arbitrary prior beliefs, we set out to solve the optimal selling problem (1.1) under an arbitrary prior for the drift.

In the present paper, we analyse and solve the asset liquidation problem (1.1) in the case when S follows (1.2) with m-state time-homogeneous Markov chain volatility and unknown drift, the uncertainty of which is modelled by an arbitrary probability distribution. We show that it is optimal to stop the first time a particular four-dimensional process hits a specific boundary determining the stopping set. This stopping boundary has attractive monotonicity properties and can be found using the approximation procedure developed in the paper.

Let us elucidate our study of the optimal selling problem in more depth. Using nonlinear filtering theory, the original selling problem with parameter uncertainty is rewritten as an equivalent optimal stopping problem of a standard form (i.e. without unknown parameters). In this new optimal stopping problem, the posterior mean serves as the underlying process and acts as a stochastic creation rate; the payoff function is constant. The posterior mean is shown to solve an SDE that depends on the prior and the whole volatility history. Embedding the optimal stopping problem into a Markovian framework is non-trivial because the whole posterior distribution needs to be included as a variable. Fortunately, we show that, once the prior is fixed, the posterior is fully characterised by only two real-valued parameters: the posterior mean and what we call the effective learning time. As a result, we are able to define an associated Markovian value function with four underlying variables (time, posterior mean, effective learning time, and volatility) and study the optimal stopping problem as a four-dimensional Markovian optimal stopping problem (the volatility takes values in a finite set but, slightly abusing terminology, we still call it a dimension). Exploiting the fact that the volatility is constant between regime switches, we construct m sequences of simpler auxiliary three-dimensional Markovian optimal stopping problems whose values converge monotonically to the true value function. The main advantage of this approximating-sequence approach, compared with tackling the full variational inequality of the problem directly, is that dealing with the analytically complicated coupled system is avoided altogether. Instead, only much simpler standard uncoupled free-boundary problems need to be analysed or solved numerically to arrive at the desired result.
We show that the value function is decreasing in time and effective learning time as well as increasing and convex in posterior mean. The first hitting time of a region specified by a stopping boundary that is a function of time, effective learning time, and volatility is shown to be optimal. The stopping boundary is increasing in time and effective learning time, and is the limit of a monotonically increasing sequence of boundaries from the auxiliary problems. Moreover, the approximation procedure using the auxiliary problems yields a method to calculate the value function as well as the optimal stopping boundary numerically.

In the two-point prior case, the posterior mean fully characterises the posterior distribution, making the problem more tractable and allowing us to obtain some additional results. In particular, we prove that, under a skip-free volatility assumption, the Markovian value function is decreasing in the volatility and that the stopping boundary is increasing in the volatility.

In a broader mathematical context, the selling problem investigated appears to be the first optimal stopping problem with parameter uncertainty and stochastic volatility to be studied in the literature. Thus it is plausible that the ideas presented herein will find uses in other optimal stopping problems of the same type; for example, in classical problems of Bayesian sequential analysis (e.g. see [29, Chapter VI]) with stochastically evolving noise magnitude. It is clear to the author that, with additional effort, a number of results of the article can be refined or generalised. However, the objective chosen is to provide an intuitive understanding of the problem and the solution while still maintaining readability and clarity. This also explains why, for the most part, we focus on the two-point prior case and outline an extension to the general prior case only at the end.

1.1 Related Literature

There is a strand of research on asset liquidation problems in models with regime-switching volatility; however, these works either concern only a special class of suboptimal strategies or treat the drift as observable. In [36], a restrictive asset liquidation problem was proposed and studied; the drift as well as the volatility were treated as unobservable, and the possibility of learning about the parameters from the observations was disregarded. The subsequent papers [17, 34, 35] explored various aspects of the same formulation. An optimal selling problem with the payoff \(e^{-r\tau }(S_{\tau }-K)\) was studied in [26] for the Black–Scholes model, in [21] for a two-state regime-switching model, and in [35] for an m-state model with finite horizon. In all three cases, the drift and the volatility are assumed to be fully observable.

In another strand of research, the optimal stopping problem (1.1) has been solved and analysed in the Black–Scholes model under arbitrary uncertainty about the drift. The two-point prior case was studied in [12], while the general prior case was solved in [15] using a different approach. This article can be viewed as a generalisation of [15] to include stochastic regime-switching volatility. Related option valuation problems under incomplete information were studied in [18, 33], both in the two-point prior case, and in [10] in the n-point prior case.

The approach we take to approximate a Markovian value function by a sequence of value functions of simpler constant volatility problems was used before in [24] to investigate a finite-horizon American put problem (as well as a slight generalisation of it) in a regime-switching model with full information. Regrettably, in the case of three or more volatility states, the recursive approximation step in [24, Sect. 5] contains an error; we rectify it in Sect. 3.2 of this article. A possible alternative route to analysing and solving the optimal stopping problem is to tackle the system of variational inequalities directly using weak-solution techniques (e.g., see [6, 30]), similarly to [7] for American options with regime-switching volatility. Structural and regularity properties would then need to be established using PDE techniques. If appropriate theoretical results can be obtained, the numerical PDE schemes discussed in [22] should yield a numerical solution. However, this alternative approach requires a different toolkit and appears to be more demanding analytically, and hence it is not investigated further in the present article.

Though the current paper generalises [15] from constant volatility to the regime-switching stochastic volatility model, the extension is by no means a straightforward one. Novel statistical learning intuitions were needed, and new proofs were developed to arrive at the results of the paper. One of the main insights of the optimal liquidation problem with constant volatility in [15] was that the current time and price are sufficient statistics for the optimal selling problem. However, changing the volatility from constant to stochastic makes the posterior distribution of the drift truly dependent on the price path. This raises the questions of whether an optimal liquidation problem can be treated using mainstream finite-dimensional Markovian techniques at all, and whether any of the developments from the constant volatility case can be taken advantage of. In the two-point prior case with regime-switching volatility, the following new insight was key: despite the posterior being a path-dependent function of the stock price, we can show that the current time, the posterior mean, and the instantaneous volatility (extracted from the price process) are sufficient statistics for the optimal liquidation problem. However, for any prior with more than two points in its support, the same triplet is no longer a sufficient statistic. Fortunately, if, in addition to the time-price-volatility triplet, we introduce a further statistic, which we name the effective learning time, the resulting 4-tuple becomes a sufficient statistic for the selling problem under a general prior. Besides these insights, some new technicalities stemming from stochastic volatility (in particular, Lemma 2.3) had to be resolved to reformulate the optimal selling problem into the standard Markovian form.

In relation to [24], though we employ the same general iterative approximation idea to construct an approximating sequence for the Markovian value function, the particulars, including proofs and results, are notably distinct. Firstly, we work in a more general setting, formulating and proving more abstract as well as, in multiple instances, new types of results. For example, we work in the m-state rather than the two-state regime-switching model, which allowed us to catch and correct an erroneous construction of the approximating sequence in [24] for models with more than two volatility states. Moreover, almost all the proofs follow different arguments, either because of structural differences in the selling problem or because we prefer a more transparent and direct route to the results. Lastly, many of the results in the present paper are problem-specific and do not even depend on the iterative approximation of the value function.

The idea to iteratively construct a sequence of auxiliary value functions that converge to the true value function in the limit is generic and has been successfully applied many times to optimal stopping problems with a countable number of discrete events (e.g. jumps, discrete observations). In the setting with partial observations, an iterative approximation scheme was employed in [5] to study the Poisson disorder detection problem with unknown post-disorder intensity, then later, in [9], to analyse a combined Poisson-Wiener disorder detection problem, and, more recently, in [4], to investigate Wiener disorder detection under discrete observations. In the fully observable setting, such iterative approximations go back to at least as early as [19], which deals with a Markovian optimal stopping problem with a piecewise deterministic underlying. In Financial Mathematics, iteratively constructed approximations were used in [2, 3] to study the value functions of finite and perpetual American put options, respectively, for a jump diffusion. Besides optimal stopping, the iterative approximation technique was utilised for the singular control problem of optimal dividend policy [13].

2 Problem Set-Up

We model a financial market on a filtered probability space \((\Omega ,{\mathcal {F}}, \{\mathcal {F}_{t} \}_{t \ge 0}, {{\mathbb {P}}})\) satisfying the usual conditions. Here the measure \({{\mathbb {P}}}\) denotes the physical probability measure. The price process is modelled by

$$\begin{aligned} \,\mathrm {d}S_t = X S_t \,\mathrm {d}t +\sigma (t) S_t \,\mathrm {d}W_t, \end{aligned}$$
(2.1)

where X is a random variable having probability distribution \(\mu \), W is a standard Brownian motion, and \(\sigma \) is a time-homogeneous right-continuous m-state Markov chain with a generator \(\Lambda = (\lambda _{ij})_{1\le i,j \le m}\) and taking values \(\sigma _{m} \ge \cdots \ge \sigma _{1} >0\). Moreover, we assume that X, W, and \(\sigma \) are independent. Since the volatility can be estimated from observations of S over an arbitrarily short period of time (at least in theory), it is reasonable to assume that the volatility process \(\{ \sigma (t) \}_{t \ge 0}\) is observable. Hence the available information is modelled by the filtration \({\mathbb {F}}^{S, \sigma } = \{ {\mathcal {F}}^{S, \sigma }_t \}_{t \ge 0}\) generated by the processes S and \(\sigma \) and augmented by the null sets of \({\mathcal {F}}\). Note that the drift X and the random driver W are not directly observable.

The optimal selling problem that we are interested in is

$$\begin{aligned} V=\sup _{\tau \in \mathcal {T}_T^{S, \sigma } } {\mathbb {E}}[S_\tau ], \end{aligned}$$
(2.2)

where \(\mathcal {T}_T^{S, \sigma }\) denotes the set of \({\mathbb {F}}^{S, \sigma }\)-stopping times that are less than or equal to a prespecified time horizon \(T >0\).

Remark 2.1

It is straightforward to include a discount factor \(e^{-r\tau }\) in (2.2). In fact, it simply corresponds to a shift of the prior distribution \(\mu \) in the negative direction by r.
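For a sketch of the computation behind this remark, recall the explicit form of S used in Sect. 2.1: since \(S_{\tau } = S_{0}e^{X\tau + \int _{0}^{\tau } \sigma (s) \,\mathrm {d}W_s-\frac{1}{2} \int _{0}^{\tau }\sigma (s)^{2} \,\mathrm {d}s}\), we have

$$\begin{aligned} e^{-r\tau }S_{\tau } = S_{0}e^{(X-r)\tau + \int _{0}^{\tau } \sigma (s) \,\mathrm {d}W_s-\frac{1}{2} \int _{0}^{\tau }\sigma (s)^{2} \,\mathrm {d}s}, \end{aligned}$$

so the discounted payoff is of the same form with the drift X replaced by \(X-r\), i.e. the problem coincides with (2.2) for the prior \(\mu \) shifted to the left by r.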

Let \(l:=\inf \mathop {\mathrm {supp}}\nolimits (\mu )\) and \(h:=\sup \mathop {\mathrm {supp}}\nolimits (\mu )\). It is easy to see that if \(l \ge 0\), then it is optimal to stop at the terminal time T. Likewise, if \(h \le 0\), then stopping immediately, i.e. at time zero, is optimal. The rest of the article focuses on the remaining and most interesting case.

Assumption 2.2

\(l< 0 < h\).

2.1 Equivalent Reformulation Under a Measure Change

Let us write \({\hat{X}}_{t} := {\mathbb {E}}[X \,|\, \mathcal {F}^{S, \sigma }_{t} ]\). Then the process

$$\begin{aligned} {\hat{W}}_t := \int _0^t \frac{1}{\sigma (s)}(X - {\hat{X}}_s) \,\mathrm {d}s + W_t, \end{aligned}$$

called the innovation process, is an \({\mathbb {F}}^{S,\sigma }\)-Brownian motion (see [1, Proposition 2.30 on p. 33]).

Lemma 2.3

The volatility process \(\sigma \) and the innovation process \({\hat{W}}\) are independent.

Proof

Since X, W, and \(\sigma \) are independent, we can think of \((\Omega , {\mathcal {F}}, {{\mathbb {P}}})\) as a product space \( \left( \Omega _{X,W} \times \Omega _{\sigma }, {\mathcal {F}}_{X,W} \otimes {\mathcal {F}}_{\sigma }, {{\mathbb {P}}}_{X,W} \times {{\mathbb {P}}}_{\sigma } \right) \). Let \( A, A' \in \mathcal {B}({\mathbb {R}}^{[0,T]}) \). Then

$$\begin{aligned} {{\mathbb {P}}}\left( {\hat{W}} \in A, \, \sigma \in A' \right)= & {} \int _{\Omega _{X,W} \times \Omega _{\sigma }} \mathbb {1}_{\{ {\hat{W}}(\omega _{X,W}, \omega _{\sigma }) \in A, \, \sigma (\omega _{\sigma }) \in A' \}} \,\mathrm {d}\left( {{\mathbb {P}}}_{X,W}\times {{\mathbb {P}}}_{\sigma } \right) (\omega _{X,W}, \omega _{\sigma }) \nonumber \\= & {} \int _{\Omega _{\sigma }} \int _{\Omega _{X, W}} \mathbb {1}_{\{ {\hat{W}}(\omega _{X,W}, \omega _{\sigma }) \in A\}} \mathbb {1}_{\{\sigma (\omega _{\sigma }) \in A' \}} \,\mathrm {d}{{\mathbb {P}}}_{X,W}(\omega _{X,W}) \,\mathrm {d}{{\mathbb {P}}}_{\sigma }(\omega _{\sigma }) \nonumber \\= & {} \int _{\Omega _{\sigma }} \mathbb {1}_{\{\sigma (\omega _{\sigma }) \in A' \}} \int _{\Omega _{X, W}} \mathbb {1}_{\{ {\hat{W}}(\omega _{X,W}, \omega _{\sigma }) \in A\}} \,\mathrm {d}{{\mathbb {P}}}_{X,W}(\omega _{X,W}) \,\mathrm {d}{{\mathbb {P}}}_{\sigma }(\omega _{\sigma }) \nonumber \\= & {} \int _{\Omega _{\sigma }} \mathbb {1}_{\{ \sigma (\omega _{\sigma }) \in A' \}} {{\mathbb {P}}}_{X,W} \left( {\hat{W}}(\cdot , \omega _{\sigma }) \in A\right) \,\mathrm {d}{{\mathbb {P}}}_{\sigma }(\omega _{\sigma }) \nonumber \\= & {} {{\mathbb {P}}}\left( {\hat{W}} \in A\right) {{\mathbb {P}}}_{\sigma } \left( \sigma \in A' \right) \nonumber \\= & {} {{\mathbb {P}}}\left( {\hat{W}} \in A\right) {{\mathbb {P}}}\left( \sigma \in A' \right) , \end{aligned}$$
(2.3)

where the penultimate equality is justified by the fact that, for any fixed \(\omega _{\sigma }\), the innovation process \({\hat{W}} (\cdot , \omega _{\sigma })\) is a Brownian motion under \({{\mathbb {P}}}_{X,W}\). Hence from (2.3), the processes \({\hat{W}}\) and \(\sigma \) are independent. \(\square \)

Defining a new equivalent measure \({\tilde{{{\mathbb {P}}}}} \sim {{\mathbb {P}}}\) on \((\Omega , {\mathcal {F}}_{T})\) via the Radon-Nikodym derivative

$$\begin{aligned} \frac{\mathrm {d}{\tilde{{{\mathbb {P}}}}}}{\mathrm {d}{{\mathbb {P}}}} = e^{\int _{0}^{T} \sigma (t) \,\mathrm {d}\hat{W}_t - \frac{1}{2}\int _{0}^{T}\sigma (t)^2\,\mathrm {d}t} \end{aligned}$$

and writing

$$\begin{aligned} S_t= & {} S_0e^{Xt+ \int _{0}^{t} \sigma (s) \,\mathrm {d}W_s-\frac{1}{2} \int _{0}^{t}\sigma (s)^{2} \,\mathrm {d}s}\\= & {} S_0e^{\int _0^t{\hat{X}}_s \,\mathrm {d}s + \int _{0}^{t} \sigma (s) \,\mathrm {d}{\hat{W}}_s-\frac{1}{2}\int _{0}^{t}{\sigma (s)^2}\,\mathrm {d}s}, \end{aligned}$$

we have that, for any \(\tau \in \mathcal {T}^{S, \sigma }_{T}\),

$$\begin{aligned} {\mathbb {E}}\left[ S_\tau \right] = {\tilde{{\mathbb {E}}}} \left[ S_0e^{ \int _0^\tau {\hat{X}}_{s} \,\mathrm {d}s } \right] = S_0 {\tilde{{\mathbb {E}}}} \left[ e^{ \int _0^\tau {\hat{X}}_{s} \,\mathrm {d}s } \right] . \end{aligned}$$

Moreover, by Girsanov’s theorem, the process \(B_t:= -\int _{0}^{t} \sigma (s) \,\mathrm {d}s + \hat{W}_t\) is a \({\tilde{{{\mathbb {P}}}}}\)-Brownian motion on [0, T]. In addition, Lemma 2.3 together with [1, Proposition 3.13] tells us that the law of \(\sigma \) is the same under \({\tilde{{{\mathbb {P}}}}}\) and \({{\mathbb {P}}}\), as well as that B and \(\sigma \) are independent under \({\tilde{{{\mathbb {P}}}}}\).

Without loss of generality, we set \(S_0=1\) throughout the article, so the optimal stopping problem (2.2) can be cast as

$$\begin{aligned} V=\sup _{\tau \in \mathcal {T}^{S, \sigma }_T } {\tilde{{\mathbb {E}}}}[e^{ \int _0^\tau {\hat{X}}_{s} \,\mathrm {d}s }]. \end{aligned}$$
(2.4)

Between the volatility jumps, the stock price is a geometric Brownian motion with known constant volatility and unknown drift. Hence, by Corollary 3.4 in [15], we have that \({\mathbb {F}}^{S, \sigma } = {\mathbb {F}}^{{\hat{X}}, \sigma }\) and \(\mathcal {T}^{S, \sigma }_T=\mathcal {T}^{{\hat{X}}, \sigma }_T\), where \({\mathbb {F}}^{{\hat{X}}, \sigma }\) denotes the usual augmentation of the filtration generated by \({\hat{X}}\) and \(\sigma \), and \(\mathcal {T}^{{\hat{X}} , \sigma }_{T}\) denotes the set of \({\mathbb {F}}^{{\hat{X}}, \sigma }\)-stopping times not exceeding T. As a result, an equivalent reformulation of (2.4) is

$$\begin{aligned} V=\sup _{\tau \in \mathcal {T}^{{\hat{X}}, \sigma }_T } {\tilde{{\mathbb {E}}}}[e^{ \int _0^\tau {\hat{X}}_{s} \,\mathrm {d}s }], \end{aligned}$$
(2.5)

which we will study in the subsequent parts of the article.

2.2 Markovian Embedding

In all except the last section of this article, we will focus on the special case when X has a two-point distribution \(\mu = \pi \delta _{h} + (1-\pi ) \delta _{l}\), where \(h>l\), \(\pi \in (0,1)\) are constants, and \(\delta _{h}, \delta _{l}\) are Dirac measures at h and l, respectively. In this special case, expressions are simpler and arguments are easier to follow than in the general prior case; still, most underlying ideas of the arguments are the same. Hence, we choose to understand the two-point prior case first, after which generalising the results to the general prior case will become a rather easy task.
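For orientation, write \(\Pi _{t} := {{\mathbb {P}}}(X = h \,|\, \mathcal {F}^{S,\sigma }_{t})\), so that \({\hat{X}}_{t} = h\Pi _{t} + l(1-\Pi _{t})\). A standard likelihood-ratio (Bayes) computation along the lines of the constant volatility case in [15], included here only as an orienting sketch and not as part of the derivation below, suggests

$$\begin{aligned} \frac{\Pi _{t}}{1-\Pi _{t}} = \frac{\pi }{1-\pi } \exp \left( \int _{0}^{t} \frac{h-l}{\sigma (s)^{2}} \,\frac{\mathrm {d}S_s}{S_s} - \int _{0}^{t} \frac{h^{2}-l^{2}}{2\sigma (s)^{2}} \,\mathrm {d}s \right) , \end{aligned}$$

which makes explicit how the posterior depends on the whole volatility path and not merely on the current price.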

Since the volatility is a known constant between the jump times, using the dynamics of \({\hat{X}}\) in the constant volatility case (equation (3.9) in [15]), the process \({\hat{X}}\) is the unique strong solution of

$$\begin{aligned} \mathrm {d}{\hat{X}}_{t}= & {} \sigma (t) \phi ({\hat{X}}_{t}, \sigma (t)) \,\mathrm {d}t + \phi ({\hat{X}}_{t}, \sigma (t)) \,\mathrm {d}B_{t}, \end{aligned}$$
(2.6)

where

$$\begin{aligned} \phi (x, \sigma ):= & {} \frac{1}{\sigma }(h-x)(x-l). \end{aligned}$$

Now, we can embed the optimal stopping problem (2.4) into a Markovian framework by defining a Markovian value function

$$\begin{aligned} v(t,x, \sigma ) := \sup _{\tau \in {\mathcal {T}}_{T-t} } {\tilde{{\mathbb {E}}}}[e^{ \int _0^\tau {\hat{X}}^{t,x, \sigma }_{s} \,\mathrm {d}s }], \quad (t,x, \sigma ) \in [0,T] \times (l, h) \times \{ \sigma _{1},\ldots , \sigma _{m} \}.\nonumber \\ \end{aligned}$$
(2.7)

Here \({\hat{X}}^{t,x, \sigma }\) denotes the process \({\hat{X}}\) in (2.6) started at time t with \({\hat{X}}_{t} = x\), \(\sigma (t) = \sigma \), and \({\mathcal {T}}_{T-t}\) stands for the set of stopping times less than or equal to \(T-t\) with respect to the usual augmentation of the filtration generated by \(\{ {\hat{X}}^{t,x,\sigma }_{t+s}\}_{s \ge 0}\) and \(\{ \sigma (t+s)\}_{s \ge 0}\). The formulation (2.7) has an interpretation of an optimal stopping problem with the constant payoff 1 and the discount rate \(-{\hat{X}}_{s}\); from now on, we will study this discounted problem. The notation \(v_{i} := v( \cdot , \cdot , \sigma _{i})\) will often be used.
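As an illustration of the dynamics (2.6), the following Python sketch runs a crude Euler-Maruyama discretisation of the posterior mean with a two-state volatility chain. All numerical values are hypothetical and serve only to show the mechanics; note how the diffusion coefficient \(\phi \) vanishes at l and h, so the posterior mean stays in (l, h) and learning slows down near certainty.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical illustrative parameters (not taken from the paper):
h, l = 0.5, -0.5                  # two-point prior support
sigmas = np.array([0.2, 0.4])     # volatility states
lam = np.array([2.0, 3.0])        # total jump intensity out of each state
T, N = 1.0, 10_000
dt = T / N

x, i = 0.0, 0                     # posterior mean and volatility state
for _ in range(N):
    if rng.random() < lam[i] * dt:               # with two states, a jump flips the state
        i = 1 - i
    phi = (h - x) * (x - l) / sigmas[i]
    # Euler step of (2.6): dX = sigma(t) * phi dt + phi dB
    x += sigmas[i] * phi * dt + phi * np.sqrt(dt) * rng.standard_normal()
    x = min(max(x, l), h)                        # (l, h) is invariant for (2.6)

print(x)
```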

3 Approximation Procedure

It is not clear how to compute v in (2.7) or analyse it directly. Hence, in this section, we develop a way to approximate the value function v by a sequence of value functions corresponding to simpler constant volatility optimal stopping problems.

3.1 Operator \(J_{i}\)

For the succinctness of notation, let \(\lambda _{i}:=\sum _{j \ne i} \lambda _{ij}\) denote the total intensity with which the volatility jumps from state \(\sigma _{i}\). Also, let us define

$$\begin{aligned} \eta ^{t}_{i}:= & {} \inf \{ s > 0 \,|\, \sigma (t+s) \ne \sigma (t) = \sigma _{i}\}, \end{aligned}$$

which is an Exp(\(\lambda _{i}\))-distributed random variable representing the time until the first volatility change when starting from the volatility state \(\sigma _{i}\) at time t.

Furthermore, let us define an operator J acting on a bounded \(f : [0, T] \times (l,h) \rightarrow {\mathbb {R}}\) by

$$\begin{aligned}&(J f)(t, x, \sigma _{i})\nonumber \\&\quad := \sup _{\tau \in \mathcal {T}_{T-t}} {\tilde{{\mathbb {E}}}} \left[ e^{ \int _0^\tau {\hat{X}}^{t,x, \sigma _{i}}_{t+s} \,\mathrm {d}s } \mathbb {1}_{\{\tau < \eta ^{t}_{i} \}} + e^{ \int _0^{\eta ^{t}_{i}} {\hat{X}}^{t,x,\sigma _{i}}_{t+s} \,\mathrm {d}s }f\left( t+\eta ^{t}_{i}, {\hat{X}}^{t,x, \sigma _{i}}_{t+\eta ^{t}_{i}}\right) \mathbb {1}_{\{\tau \ge \eta ^{t}_{i}\}} \right] \nonumber \\ \end{aligned}$$
(3.1)
$$\begin{aligned}&\quad = \sup _{\tau \in \mathcal {T}_{T-t}} {\tilde{{\mathbb {E}}}} \left[ e^{ \int _0^\tau {\hat{X}}^{t,x, \sigma _{i}}_{t+s} -\lambda _{i} \,\mathrm {d}s } + \lambda _{i} \int _{0}^{\tau } e^{ \int _0^u {\hat{X}}^{t,x, \sigma _{i}}_{t+s} -\lambda _{i} \,\mathrm {d}s }f\left( t+u, {\hat{X}}^{t,x,\sigma _{i}}_{t+u}\right) \,\mathrm {d}u \right] , \nonumber \\ \end{aligned}$$
(3.2)

where \({\mathcal {T}}_{T-t}\) denotes the set of stopping times less than or equal to \(T-t\) with respect to the usual augmentation of the filtration generated by \(\{ {\hat{X}}^{t,x,\sigma _{i}}_{t+s}\}_{s \ge 0}\) and \(\{ \sigma (t+s)\}_{s \ge 0}\). To simplify notation, we also define an operator \(J_{i}\) by

$$\begin{aligned} J_{i} f := (Jf)(\cdot , \cdot , \sigma _{i}). \end{aligned}$$

Intuitively, \((J_{i} f)\) represents a Markovian value function corresponding to optimal stopping before \(t+ \eta ^{t}_{i}\), i.e. before the first volatility change after t, when, at time \(t+\eta ^{t}_{i} < T\), the payoff \(f\left( t+\eta ^{t}_{i}, {\hat{X}}^{{t,x, \sigma _{i}}}_{t+\eta ^{t}_{i}} \right) \) is received provided stopping has not occurred yet.
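Heuristically, the equality of (3.1) and (3.2) follows by integrating out \(\eta ^{t}_{i}\), which, by Lemma 2.3 (whose conclusion persists under \({\tilde{{{\mathbb {P}}}}}\), see Sect. 2.1), is Exp(\(\lambda _{i}\))-distributed and independent of \({\hat{X}}^{t,x,\sigma _{i}}\) before the first switch. For instance, for the first term,

$$\begin{aligned} {\tilde{{\mathbb {E}}}} \left[ e^{ \int _0^\tau {\hat{X}}^{t,x, \sigma _{i}}_{t+s} \,\mathrm {d}s } \mathbb {1}_{\{\tau < \eta ^{t}_{i} \}} \right] = {\tilde{{\mathbb {E}}}} \left[ e^{ \int _0^\tau {\hat{X}}^{t,x, \sigma _{i}}_{t+s} \,\mathrm {d}s }\, e^{-\lambda _{i}\tau } \right] = {\tilde{{\mathbb {E}}}} \left[ e^{ \int _0^\tau {\hat{X}}^{t,x, \sigma _{i}}_{t+s} -\lambda _{i} \,\mathrm {d}s } \right] , \end{aligned}$$

and the term involving f is handled analogously by integrating against the density \(\lambda _{i} e^{-\lambda _{i} u}\), \(u>0\), of \(\eta ^{t}_{i}\).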

Proposition 3.1

Let \(f : [0, T] \times (l,h) \rightarrow {\mathbb {R}}\) be bounded. Then

  1. (i)

    Jf is bounded;

  2. (ii)

    f increasing in the second variable x implies that Jf is increasing in the second variable x;

  3. (iii)

    f decreasing in the first variable t implies that Jf is decreasing in the first variable t;

  4. (iv)

    f increasing and convex in the second variable x implies that Jf is increasing and convex in the second variable x;

  5. (v)

    J preserves order, i.e. \(f_{1} \le f_{2}\) implies \(J f_{1} \le J f_{2}\);

  6. (vi)

    \(J f \ge 1\).

Proof

All except claim (iv) are straightforward consequences of the representation (3.2). To prove (iv), we will approximate the optimal stopping problem (3.2) by Bermudan options.

Let i and n be fixed. We will approximate the value function \(J_{i}f\) by a value function \(w^{(f)}_{i,n}\) of a corresponding Bermudan problem with stopping allowed only at times \(\left\{ \frac{kT}{2^{n}} \,:\, k\in \{0,1, \ldots , 2^{n}\}\right\} \). We define \(w^{(f)}_{i,n}\) recursively as follows. First,

$$\begin{aligned} w^{(f)}_{i,n}(T,x) := 1. \end{aligned}$$

Then, starting with \(k =2^{n}\) and continuing recursively down to \(k=1\), we define

$$\begin{aligned} w^{(f)}_{i,n}(t,x)= & {} \left\{ \begin{array}{ll} g\left( t,x, \frac{kT}{2^{n}}\right) , &{}\quad t \in \left( \frac{(k-1)T}{2^{n}}, \frac{kT}{2^{n}}\right) , \\ g\left( \frac{(k-1)T}{2^{n}},x, \frac{kT}{2^{n}}\right) \vee 1, &{}\quad t = \frac{(k-1)T}{2^{n}}, \\ \end{array} \right. \end{aligned}$$
(3.3)

where the function g is given by

$$\begin{aligned} g\left( t,x, \frac{kT}{2^{n}}\right):= & {} {\tilde{{\mathbb {E}}}} \bigg [ e^{ \int _t^{\frac{kT}{2^{n}}} {\hat{X}}^{t,x, \sigma _{i}}_{s} -\lambda _{i} \,\mathrm {d}s } w^{(f)}_{i,n}\left( \frac{kT}{2^{n}},{\hat{X}}^{t,x, \sigma _{i}}_{\frac{kT}{2^{n}}} \right) \nonumber \\&+ \int _{t}^{\frac{kT}{2^{n}}} e^{ \int _t^u {\hat{X}}^{t,x, \sigma _{i}}_{s} -\lambda _{i} \,\mathrm {d}s }f\Big (u, {\hat{X}}^{t,x,\sigma _{i}}_{u}\Big ) \,\mathrm {d}u \bigg ]. \end{aligned}$$
(3.4)

Next, we show by backward induction on k that \(w^{(f)}_{i,n}\) is increasing and convex in the second variable x. Suppose that for some \(k \in \{1, 2,\ldots , 2^{n}\}\), the function \(w^{(f)}_{i,n}\left( \frac{kT}{2^{n}}, \cdot \right) \) is increasing and convex (the assumption clearly holds for the base step \(k=2^{n}\)). Let \(t\in [ \frac{(k-1)T}{2^{n}}, \frac{kT}{2^{n}})\). Then, since f is also increasing and convex in the second variable x, we have that the function \(g(t,\cdot , \frac{kT}{2^{n}})\), and so \(w^{(f)}_{i,n}(t,\cdot )\), is convex by [14, Theorem 5.1]. Moreover, from (3.4) and [31, Theorem IX.3.7], it is clear that \(w^{(f)}_{i,n}(t,\cdot )\) is increasing. Consequently, by backward induction, we obtain that the Bermudan value function \(w^{(f)}_{i,n}\) is increasing and convex in the second variable.

Letting \(n \nearrow \infty \), the Bermudan values \(w^{(f)}_{i,n} \nearrow J_{i}f\) pointwise. As a result, \(J_{i}f\) is increasing and convex in the second argument, since convexity and monotonicity are preserved under pointwise limits. \(\square \)
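The backward recursion (3.3)-(3.4) is also directly implementable. The sketch below computes a Bermudan approximation of \(J_{i}f\) for the simplest input \(f \equiv 1\), approximating the one-step conditional expectation by an Euler step of (2.6) combined with Gauss-Hermite quadrature and linear interpolation on a grid; all parameter values are hypothetical and the discretisation is deliberately crude.

```python
import numpy as np

# Hypothetical illustrative parameters (not taken from the paper):
h, l = 0.5, -0.5                  # two-point prior support
sigma_i, lam_i = 0.3, 1.0         # volatility state and total jump intensity
T, n = 1.0, 5                     # horizon; 2**n Bermudan exercise dates

dt = T / 2 ** n
x = np.linspace(l + 1e-2, h - 1e-2, 81)          # grid for the posterior mean
phi = (h - x) * (x - l) / sigma_i                # diffusion coefficient of (2.6)
zj, wj = np.polynomial.hermite.hermgauss(9)      # Gauss-Hermite nodes/weights
zj, wj = np.sqrt(2.0) * zj, wj / np.sqrt(np.pi)  # rescaled for E[g(Z)], Z ~ N(0,1)

w = np.ones_like(x)                              # terminal condition w(T, x) = 1
for _ in range(2 ** n):
    cont = np.zeros_like(x)
    for z, weight in zip(zj, wj):
        # one Euler step of (2.6) from every grid point
        xn = np.clip(x + sigma_i * phi * dt + phi * np.sqrt(dt) * z, l, h)
        cont += weight * np.interp(xn, x, w)
    # discount at rate x - lam_i; running f-term of (3.2) with f = 1 is ~ lam_i * dt
    cont = np.exp((x - lam_i) * dt) * cont + lam_i * dt
    w = np.maximum(cont, 1.0)                    # Bermudan exercise against payoff 1

print(w.min(), w.max())
```

Consistent with Proposition 3.1, the computed value is bounded below by 1 and increasing in the posterior mean; refining the dyadic grid (larger n) would approximate \(J_{i}f\) from below.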

The sets

$$\begin{aligned} \mathcal {C}^{f}_{i}:= & {} \{ (t,x) \in [0,T)\times (l, h) \,:\, (J_{i}f) (t,x) > 1 \}, \nonumber \\ \mathcal {D}^{f}_{i}:= & {} \{ (t,x) \in [0,T]\times (l, h) \,:\, (J_{i}f) (t,x) = 1 \} = [0,T]\times (l, h) \setminus \mathcal {C}^{f}_{i}, \end{aligned}$$
(3.5)

correspond to the continuation and stopping sets for the optimal stopping problem (3.2), as the next proposition shows.

Proposition 3.2

(Optimal stopping time) The stopping time

$$\begin{aligned} \tau ^{f}_{\sigma _{i}}(t,x)= & {} \inf \{ u \in [0, T-t] \,:\, (t + u, {\hat{X}}^{t,x, \sigma _{i}}_{t+u}) \in \mathcal {D}^{f}_{i} \} \end{aligned}$$
(3.6)

is optimal for the problem (3.2).

Proof

A standard application of Theorem D.12 in [23]. \(\square \)

Proposition 3.3

If a bounded \(f :[0,T] \times (l,h) \rightarrow {\mathbb {R}}\) is decreasing in the first variable as well as increasing and convex in the second, then \(J_{i}f\) is continuous.

Proof

The argument is a straightforward extension of the proof of the third part of Theorem 3.10 in [15]; still, we include it for completeness. To simplify notation, we write \(u:= J_{i} f\).

First, fix \(r\in (l, h)\); we prove that there exists \(K>0\) such that, for every \(t\in [0,T]\), the map \(x\mapsto u(t,x)\) is K-Lipschitz continuous on (l, r]. To obtain a contradiction, assume that there is no such K. Then, by the convexity of u in the second variable, there is a sequence \(\{ t_{n}\}_{n\ge 0} \subset [0,T]\) such that the left-derivatives \(\partial ^{-}_{2} u(t_{n}, r) \nearrow \infty \). Hence, for \(r' \in (r, h)\), the sequence \(u(t_{n}, r') \rightarrow \infty \), which contradicts \(u(t_{n}, r') \le u(0, r') < \infty \) for all \(n \in {\mathbb {N}}\).

Now, it remains to show that u is continuous in time. Assume for a contradiction that the map \(t \mapsto u(t, x_{0})\) is not continuous at \(t=t_{0}\) for some \(x_{0}\). Since u is decreasing in time, \(u(\cdot , x_{0})\) has a negative jump at \(t_{0}\). Next, we will investigate the cases \(u(t_{0}-, x_{0}) > u(t_{0}, x_{0})\) and \(u(t_{0}, x_{0}) > u(t_{0}+, x_{0})\) separately.

Suppose \(u(t_{0}-, x_{0}) > u(t_{0}, x_{0})\). By Lipschitz continuity in the second variable, there exists \(\delta >0\) such that, writing \(\mathcal {R} = (t_{0}-\delta , t_{0}) \times (x_{0} - \delta , x_{0}+\delta )\),

$$\begin{aligned} \inf _{(t,x) \in \mathcal {R}} u(t, x) > u(t_{0}, x_{0} + \delta ). \end{aligned}$$
(3.7)

Thus \(\mathcal {R} \subseteq \mathcal {C}^{f}_{i}\). Let \(t \in (t_{0}-\delta , t_{0})\) and \(\tau _{\mathcal {R}} := \inf \{ s \ge 0\,:\, (t+s, {\hat{X}}^{t,x_{0},\sigma _{i}}_{t+s}) \notin \mathcal {R} \}\). Then, by the martingale property in the continuation region,

$$\begin{aligned} u(t,x_0)= & {} {\tilde{{\mathbb {E}}}} \bigg [ e^{ \int _0^{\tau _{{\mathcal {R}}}} {\hat{X}}^{t,x_0, \sigma _{i}}_{t+u} - \lambda _{i} \,\mathrm {d}u }u\Big (t+\tau _{\mathcal {R}}, {\hat{X}}^{t,x_0, \sigma _{i}}_{t+\tau _{{\mathcal {R}}}}\Big ) \\&+ \int _{0}^{\tau _{\mathcal {R}}} e^{ \int _0^u {\hat{X}}^{t,x_{0}, \sigma _{i}}_{t+s} -\lambda _{i} \,\mathrm {d}s }f\Big (t+u, {\hat{X}}^{t,x_{0},\sigma _{i}}_{t+u}\Big ) \,\mathrm {d}u \bigg ]\\\le & {} {\tilde{{\mathbb {E}}}} \bigg [ e^{ (t_0-t)(x_{0}+\delta )^{+} }u(t,x_0+\delta )\mathbb {1}_{\{t+\tau _{\mathcal {R}}<t_0\}} \\&+\,e^{ (t_0-t)(x_0+\delta )^{+}}u(t_0,x_0+\delta )\mathbb {1}_{\{t+\tau _{\mathcal {R}} = t_0\}}\\&+ \int _{0}^{t_{0}-t} e^{ \int _0^u {\hat{X}}^{t,x_{0}, \sigma _{i}}_{t+s} -\lambda _{i} \,\mathrm {d}s }\Big |f\Big (t+u, {\hat{X}}^{t,x_{0},\sigma _{i}}_{t+u}\Big )\Big | \,\mathrm {d}u \bigg ]\\\le & {} e^{(t_0-t)(x_0+\delta )^+}u(t,x_0+\delta ) {\tilde{{{\mathbb {P}}}}}(t+\tau _{\mathcal {R}} <t_0) + e^{(t_0-t)(x_0+\delta )^+}u(t_0,x_0+\delta )\\&+\int _{0}^{t_{0}-t} {\tilde{{\mathbb {E}}}} \left[ e^{ \int _0^u {\hat{X}}^{t,x_{0}, \sigma _{i}}_{t+s} -\lambda _{i} \,\mathrm {d}s } \Big |f\Big (t+u, {\hat{X}}^{t,x_{0},\sigma _{i}}_{t+u}\Big )\Big |\right] \,\mathrm {d}u \\\rightarrow & {} u(t_0,x_0+\delta ) \end{aligned}$$

as \(t \rightarrow t_{0}\), contradicting (3.7).

The other case to consider is \(u(t_{0}, x_{0}) > u(t_{0}+, x_{0})\); we first treat the situation \(u(t_{0}, x_{0})> u(t_{0}+, x_{0})>1\). The local Lipschitz continuity in the second variable and the monotone decrease in the first imply that there exist \(\epsilon >0\) and \(\delta >0\) such that, writing \(\mathcal {R} = (t_{0}, t_{0}+\epsilon ] \times [x_{0}-\delta , x_{0}+\delta ]\),

$$\begin{aligned} u(t_{0}, x_{0})> \sup _{(t,x) \in \mathcal {R}} u(t,x) \ge \inf _{(t,x) \in \mathcal {R}} u(t,x) > 1. \end{aligned}$$
(3.8)

Hence, \({\mathcal {R}} \subseteq \mathcal {C}^{f}_{i}\) and writing \(\tau _{\mathcal {R}}:=\inf \{s\ge 0: (t_{0}+s, {\hat{X}}^{t_{0},x_0, \sigma _{i}}_{t_{0}+s})\notin {\mathcal {R}}\}\) we have

$$\begin{aligned} u(t_{0},x_0)= & {} {\tilde{{\mathbb {E}}}} \bigg [ e^{ \int _{0}^{\tau _{{\mathcal {R}}}} {\hat{X}}^{t_{0},x_0 , \sigma _{i}}_{t_{0}+u} - \lambda _{i} \,\mathrm {d}u } u\Big (t_0+\tau _{\mathcal {R}}, {\hat{X}}^{t_{0},x_0, \sigma _{i}}_{t_0+\tau _{{\mathcal {R}}}}\Big ) \\&+ \int _{0}^{\tau _{\mathcal {R}}} e^{ \int _0^u {\hat{X}}^{t_0,x_0, \sigma _{i}}_{t_0+s} -\lambda _{i} \,\mathrm {d}s }f\Big (t_0+u, {\hat{X}}^{t_0,x_0,\sigma _{i}}_{t_0+u}\Big ) \,\mathrm {d}u \bigg ] \\\le & {} {\tilde{{\mathbb {E}}}} \left[ e^{\epsilon (x_0+\delta )^+}u(t_{0},x_0+\delta )\mathbb {1}_{\{\tau _{\mathcal {R}}<\epsilon \}}\right] \\&+\,\,{\tilde{{\mathbb {E}}}} \bigg [e^{\epsilon (x_0+\delta )^+}u(t_0+\epsilon ,x_0+\delta )\mathbb {1}_{\{\tau _{\mathcal {R}} = \epsilon \}} \\&+\int _{0}^{\epsilon } e^{ \int _0^u {\hat{X}}^{t_0,x_0, \sigma _{i}}_{t_0+s} -\lambda _{i} \,\mathrm {d}s }\Big |f\Big (t_0+u, {\hat{X}}^{t_0,x_0,\sigma _{i}}_{t_0+u}\Big )\Big | \,\mathrm {d}u\bigg ]\\\le & {} e^{\epsilon (x_0+\delta )^+}u(t_{0},x_0+\delta ) {\tilde{{{\mathbb {P}}}}}(\tau _{\mathcal {R}} <\epsilon ) + e^{\epsilon (x_0+\delta )^+}u(t_0+\epsilon ,x_0+\delta )\\&+\int _{0}^{\epsilon } {\tilde{{\mathbb {E}}}} \left[ e^{ \int _0^u {\hat{X}}^{t_0,x_0, \sigma _{i}}_{t_0+s} -\lambda _{i} \,\mathrm {d}s }\Big |f\Big (t_0+u, {\hat{X}}^{t_0,x_0,\sigma _{i}}_{t_0+u}\Big )\Big | \right] \,\mathrm {d}u \\\rightarrow & {} u(t_0+,x_0+\delta ) \end{aligned}$$

as \(\epsilon \searrow 0\), which contradicts (3.8).

Lastly, suppose that \(u(t_{0}, x_{0}) > u(t_{0}+, x_{0}) = 1\). By Lipschitz continuity in the second variable, there exists \(\delta >0\) such that

$$\begin{aligned} \inf _{x\in (x_{0}-\delta , x_{0})}u(t_{0},x)>u(t_0+,x_0)=1. \end{aligned}$$
(3.9)

Consequently, \((t_{0}, T]\times (x_{0}-\delta , x_{0}) \subseteq \mathcal {D}^{f}_{i}\). Hence the process \({\hat{X}}^{t_{0}, x_{0}-\delta /2, \sigma _{i}}\) hits the stopping region immediately, so that \(u(t_{0}, x_{0}-\delta /2) = 1\), i.e. \((t_{0}, x_{0}-\delta /2) \in \mathcal {D}^{f}_{i}\), which contradicts (3.9). \(\square \)

Proposition 3.4

(Optimal stopping boundary) Let \(f: [0,T] \times (l, h) \rightarrow {\mathbb {R}}\) be bounded, decreasing in the first variable as well as increasing and convex in the second variable. Then the following hold.

  1. (i)

    There exists a function \(b^{f}_{\sigma _{i}}: [0,T) \rightarrow [l,h]\) that is increasing, right-continuous with left limits, and satisfies

    $$\begin{aligned} \mathcal {C}^{f}_{i} = \{ (t,x) \in [0,T) \times (l, h) \,:\, x > b^{f}_{\sigma _{i}}(t) \}. \end{aligned}$$
    (3.10)
  2. (ii)

    The pair \((J_{i}f, b^{f}_{\sigma _{i}})\) satisfies the free-boundary problem

    $$\begin{aligned} \left\{ \begin{array}{ll} \partial _{t}u(t,x) + {\sigma _{i}} \phi (x, \sigma _{i}) \partial _{x} u(t,x) + \frac{1}{2} \phi (x, \sigma _{i})^{2} \partial _{xx} u(t,x)\\ \quad +\,(x-\lambda _{i})u(t,x)+\lambda _{i} f(t,x) = 0, &{}\quad \text { if } x > b^{f}_{\sigma _{i}}(t), \\ u(t,x) = 1, &{}\quad \text { if } x \le b^{f}_{\sigma _{i}}(t) \text { or } t=T. \end{array} \right. \end{aligned}$$
    (3.11)

Proof

  1. (i)

    By Proposition 3.1 (iv), there exists a unique function \(b^{f}_{\sigma _{i}}\) satisfying (3.10). Moreover, by Proposition 3.1 (iii), this boundary \(b^{f}_{\sigma _{i}}\) is increasing. Hence, using Proposition 3.3, we also obtain that \(b^{f}_{\sigma _{i}}\) is right-continuous with left limits.

  2. (ii)

    The proof follows a well-known standard argument (e.g. see [23, Theorem 7.7 in Chapter 2]), thus we omit it.

\(\square \)
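
Although we do not solve the free-boundary problem (3.11) analytically, its structure suggests a standard numerical treatment: step backwards in time from the terminal condition \(u(T,\cdot )=1\) and project onto the obstacle \(u \ge 1\) at every step. The sketch below is a minimal explicit finite-difference illustration with entirely hypothetical ingredients (a constant \(\phi \), a stubbed f, and a toy grid); it is not a validated solver for the model of this paper.

```python
# hypothetical ingredients: constant diffusion coefficient, stubbed source f
phi = lambda x: 0.2
sigma_i, lam_i = 0.3, 1.0
f = lambda t, x: 1.0

xl, xh, nx, T, nt = -1.0, 1.0, 41, 1.0, 400
dx, dt = (xh - xl) / (nx - 1), T / nt
xs = [xl + k * dx for k in range(nx)]

u = [1.0] * nx                      # terminal condition u(T, x) = 1
for step in range(nt):
    t = T - (step + 1) * dt
    new = u[:]
    for k in range(1, nx - 1):      # explicit Euler step on interior nodes
        ux = (u[k + 1] - u[k - 1]) / (2.0 * dx)
        uxx = (u[k + 1] - 2.0 * u[k] + u[k - 1]) / dx ** 2
        rhs = (sigma_i * phi(xs[k]) * ux + 0.5 * phi(xs[k]) ** 2 * uxx
               + (xs[k] - lam_i) * u[k] + lam_i * f(t, xs[k]))
        new[k] = max(u[k] + dt * rhs, 1.0)  # project onto the obstacle u >= 1
    u = new
```

The projection step plays the role of the free boundary: at each time level, the set of grid points where the projection binds approximates the stopping region \(\{x \le b^{f}_{\sigma _{i}}(t)\}\).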

3.2 A Sequence of Approximating Problems

Let us define a sequence of stopping times \(\{\xi ^{t}_{n}\}_{n \ge 0}\) recursively by

$$\begin{aligned} \xi ^{t}_{0}:= & {} 0, \\ \xi ^{t}_{n}:= & {} \inf \big \{ s> \xi ^{t}_{n-1}\,:\, \sigma ( t + s) \ne \sigma (t + \xi ^{t}_{n-1}) \big \}, \quad n > 0. \end{aligned}$$

Here \(\xi ^{t}_{n}\) represents the duration until the n-th volatility jump since time t. Furthermore, let us define a sequence of operators \(\{ J^{(n)} \}_{n\ge 0}\) by

$$\begin{aligned}&(J^{(n)} f)(t,x, \sigma _{i}) \nonumber \\&\quad := \sup _{\tau \in \mathcal {T}_{T-t}} {\tilde{{\mathbb {E}}}} \left[ e^{\int _0^\tau {\hat{X}}^{t,x, \sigma _{i}}_{t+s} \,\mathrm {d}s } \mathbb {1}_{\{ \tau < \xi ^{t}_{n} \}} + e^{\int _0^{\xi ^{t}_{n}} {\hat{X}}^{t,x, \sigma _{i}}_{t+s} \,\mathrm {d}s }f\Big (t+\xi ^{t}_{n}, {\hat{X}}^{t,x, \sigma _{i}}_{t+\xi ^{t}_{n}}\Big ) \mathbb {1}_{\{\tau \ge \xi ^{t}_{n}\}} \right] ,\nonumber \\ \end{aligned}$$
(3.12)

where \(f : [0, T] \times (l,h) \rightarrow {\mathbb {R}}\) is bounded. In particular, note that \(J^{(0)} f = f\) and \(J^{(1)}f=Jf\). As for the operator J, we define \(J^{(n)}_{i}\) by

$$\begin{aligned} J^{(n)}_{i} f := (J^{(n)}f)(\cdot , \cdot , \sigma _{i}). \end{aligned}$$
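
The jump times \(\xi ^{t}_{n}\) are straightforward to simulate by drawing successive exponential holding times of the volatility chain. The sketch below is schematic: the state space, holding intensities, and jump probabilities are hypothetical placeholders, not values from the paper.

```python
import random

def simulate_jump_times(i0, lam, P, T, rng):
    """Simulate the volatility jump times xi_1 < xi_2 < ... up to horizon T.

    lam[i] is the holding intensity of state i, and P[i][j] (proportional to
    lam_ij) determines which state the chain jumps to next.
    """
    t, i, times = 0.0, i0, []
    while True:
        t += rng.expovariate(lam[i])  # exponential holding time in state i
        if t > T:
            return times
        times.append(t)
        others = [j for j in range(len(lam)) if j != i]
        i = rng.choices(others, weights=[P[i][j] for j in others])[0]

# hypothetical three-state chain (placeholder intensities)
rng = random.Random(1)
lam = [1.0, 2.0, 1.5]
P = [[0.0, 1.0, 0.0], [0.5, 0.0, 0.5], [0.0, 1.0, 0.0]]
xi = simulate_jump_times(0, lam, P, T=5.0, rng=rng)
```

Each returned time is the duration from the start until a volatility jump, so the list is increasing and capped by the horizon.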

Proposition 3.5

Let \(n \ge 0\) and \(i \in \{1,\ldots , m\}\). Then

$$\begin{aligned} J^{(n+1)}_{i}= J_{i} \left( \sum _{j\ne i} \frac{\lambda _{ij}}{\lambda _{i}} J^{(n)}_{j} \right) . \end{aligned}$$
(3.13)

Proof

The proof is by induction. To present the argument while keeping the intricate notation at bay, we only prove that, for a bounded \(f:[0,T]\times (l,h) \rightarrow {\mathbb {R}}\) and \(x\in (l,h)\), the identity \((J^{(2)}_{i}f)(t,x)= (J_{i}(\sum _{j\ne i} \frac{\lambda _{ij}}{\lambda _{i}} J_{j}f))(t,x)\) holds. The induction step \(J^{(n+1)}_{i}= J_{i} \left( \sum _{j\ne i} \frac{\lambda _{ij}}{\lambda _{i}} J^{(n)}_{j} \right) \) follows by a similar argument, though with more abstract notation. Note that, without loss of generality, we may assume \(t=0\), which we do.

First, we show \((J_{i}^{(2)}f)(0,x) \le J_{i}\bigg ( \sum _{j \ne i}\frac{\lambda _{ij}}{\lambda _{i}} (J_{j}f)\bigg )(0,x)\) and then the opposite inequality. For \(j \in {\mathbb {N}}\), we write \(\xi _{j}\) instead of \(\xi ^{0}_{j}\) and set \(\eta _{j}:=\xi _{j}-\xi _{j-1}\). Let \(\tau \in \mathcal {T}_{T}\) and consider

$$\begin{aligned}&A(\tau )\nonumber \\&\quad := {\tilde{{\mathbb {E}}}} \left[ e^{\int _0^\tau {\hat{X}}^{0,x,\sigma _i}_{s} \,\mathrm {d}s } \mathbb {1}_{\{ \tau< \eta _{1} \}} + e^{\int _0^\tau {\hat{X}}^{0,x,\sigma _i}_{s} \,\mathrm {d}s } \mathbb {1}_{\{ \eta _{1} \le \tau< \xi _{2} \}} +e^{\int _0^{\xi _{2}} {\hat{X}}^{0,x,\sigma _i}_{s} \,\mathrm {d}s }f(\xi _{2}, {\hat{X}}^{0,x, \sigma _i}_{\xi _{2}}) \mathbb {1}_{\{\tau \ge \xi _{2}\}} \right] \nonumber \\&\quad = {\tilde{{\mathbb {E}}}} \bigg [ e^{\int _0^\tau {\hat{X}}^{0,x,\sigma _i}_{s} \,\mathrm {d}s } \mathbb {1}_{\{ \tau< \eta _{1} \}} + {\tilde{{\mathbb {E}}}} \Big [ e^{\int _0^\tau {\hat{X}}^{0,x,\sigma _i}_{s} \,\mathrm {d}s } \mathbb {1}_{\{ \eta _{1} \le \tau < \xi _{2} \}} \nonumber \\&\qquad +\,e^{\int _0^{\xi _{2}} {\hat{X}}^{0,x,\sigma _i}_{s} \,\mathrm {d}s }f(\xi _{2}, {\hat{X}}^{0,x, \sigma _i}_{\xi _{2}}) \mathbb {1}_{\{\tau \ge \xi _{2}\}} \, | \, {\mathcal {F}}^{{\hat{X}}^{0,x,\sigma _{i}}, N}_{\eta _{1}} \Big ] \bigg ], \end{aligned}$$
(3.14)

where \(\{N_{t}\}_{t\ge 0}\) denotes the process counting the volatility jumps. The inner conditional expectation in (3.14) satisfies

$$\begin{aligned}&{\tilde{{\mathbb {E}}}} \Big [ e^{\int _0^\tau {\hat{X}}^{0,x,\sigma _i}_{s} \,\mathrm {d}s } \mathbb {1}_{\{ \eta _{1} \le \tau< \xi _{2} \}} +e^{\int _0^{\xi _{2}} {\hat{X}}^{0,x,\sigma _i}_{s} \,\mathrm {d}s }f(\xi _{2}, {\hat{X}}^{0,x, \sigma _i}_{\xi _{2}}) \mathbb {1}_{\{\tau \ge \xi _{2}\}} \, | \, {\mathcal {F}}^{{\hat{X}}^{0,x,\sigma _{i}}, N}_{\eta _{1}} \Big ] \nonumber \\&\quad = e^{\int _0^{\eta _{1}} {\hat{X}}^{0,x,\sigma _i}_{s} \,\mathrm {d}s } \mathbb {1}_{\{ \eta _{1} \le \tau \}}{\tilde{{\mathbb {E}}}} \Big [ e^{\int _{\eta _{1}}^{\tau } {\hat{X}}^{0,x,\sigma _i}_{s} \,\mathrm {d}s } \mathbb {1}_{\{ \tau< \xi _{2} \}} \nonumber \\&\qquad +\,e^{\int _{\eta _{1}}^{\xi _{2}} {\hat{X}}^{0,x,\sigma _i}_{s} \,\mathrm {d}s }f(\xi _{2}, {\hat{X}}^{0,x, \sigma _i}_{\xi _{2}}) \mathbb {1}_{\{\tau \ge \xi _{2}\}} \, | \, {\mathcal {F}}^{{\hat{X}}^{0,x,\sigma _{i}}, N}_{\eta _{1}} \Big ] \nonumber \\&\quad = e^{\int _0^{\eta _{1}} {\hat{X}}^{0,x,\sigma _i}_{s} \,\mathrm {d}s } \mathbb {1}_{\{ \eta _{1} \le \tau \}} \sum _{j \ne i} \frac{\lambda _{ij}}{\lambda _{i}} {\tilde{{\mathbb {E}}}}^{\eta _{1},{\hat{X}}^{0,x,\sigma _i}_{\eta _{1}}, \sigma _{j}} \bigg [ e^{\int _{0}^{{\tilde{\tau }}} {\hat{X}}_{\eta _{1}+s} \,\mathrm {d}s} \mathbb {1}_{\{{\tilde{\tau }} < \eta _{2}\}}\nonumber \\&\qquad +\, e^{\int _{0}^{\eta _{2}} {\hat{X}}_{\eta _{1}+s} \,\mathrm {d}s }f(\eta _{1}+\eta _{2}, {\hat{X}}_{\eta _{1}+\eta _{2}}) \mathbb {1}_{\{{\tilde{\tau }} \ge \eta _{2}\}}\bigg ], \end{aligned}$$
(3.15)

where \({\tilde{\tau }} = \tau - \eta _{1}\) in the case \(\eta _{1} \le \tau \le T\). Therefore, substituting (3.15) into (3.14) and then taking a supremum over \({\tilde{\tau }}\), we get

$$\begin{aligned} A(\tau )\le & {} {\tilde{{\mathbb {E}}}} \bigg [ e^{\int _0^\tau {\hat{X}}^{0,x,\sigma _i}_{s} \,\mathrm {d}s } \mathbb {1}_{\{ \tau< \eta _{1} \}} \nonumber \\&+\,e^{\int _0^{\eta _{1}} {\hat{X}}^{0,x,\sigma _i}_{s} \,\mathrm {d}s } \mathbb {1}_{\{ \tau \ge \eta _{1} \}} \sum _{j \ne i} \frac{\lambda _{ij}}{\lambda _{i}} \sup _{{\tilde{\tau }} \in \mathcal {T}_{T-T\wedge \eta _{1}}} {\tilde{{\mathbb {E}}}}^{\eta _{1},{\hat{X}}^{0,x,\sigma _i}_{\eta _{1}}, \sigma _j} \Big [ e^{\int _{0}^{{\tilde{\tau }}} {\hat{X}}_{\eta _{1}+s} \,\mathrm {d}s} \mathbb {1}_{\{{\tilde{\tau }}< \eta _{2}\}} \nonumber \\&+\,e^{\int _{0}^{\eta _{2}} {\hat{X}}_{\eta _{1}+s} \,\mathrm {d}s }f(\eta _{1}+\eta _{2}, {\hat{X}}_{\eta _{1}+\eta _{2}}) \mathbb {1}_{\{{\tilde{\tau }} \ge \eta _{2}\}}\Big ]\bigg ] \nonumber \\= & {} {\tilde{{\mathbb {E}}}} \bigg [ e^{\int _0^\tau {\hat{X}}^{0,x,\sigma _i}_{s} \,\mathrm {d}s } \mathbb {1}_{\{ \tau < \eta _{1} \}} + e^{\int _0^{\eta _{1}} {\hat{X}}^{0,x,\sigma _i}_{s} \,\mathrm {d}s } \mathbb {1}_{\{ \tau \ge \eta _{1} \}} \sum _{j \ne i} \frac{\lambda _{ij}}{\lambda _{i}} (J_{j}f)(\eta _{1}, {\hat{X}}^{0,x,\sigma _i}_{\eta _{1}}) \bigg ] \nonumber \\ \end{aligned}$$
(3.16)

Taking a supremum over \(\tau \) in (3.16), we obtain

$$\begin{aligned} (J_{i}^{(2)}f)(0,x) = \sup _{\tau \in \mathcal {T}_{T}} A(\tau ) \le J_{i}\bigg ( \sum _{j \ne i}\frac{\lambda _{ij}}{\lambda _{i}} (J_{j}f)\bigg )(0,x). \end{aligned}$$
(3.17)

It remains to establish the opposite inequality. Let \(\tau \in \mathcal {T}_{T}\) and define

$$\begin{aligned} \check{\tau }:= & {} \tau \mathbb {1}_{\{ \tau \le \eta _{1} \}} + (\eta _{1} \wedge T +\tau _{\sigma (\eta _{1})}) \mathbb {1}_{\{ \tau > \eta _{1} \}}, \end{aligned}$$
(3.18)

where \(\tau _{\sigma (\eta _{1})} := \tau ^{f}_{\sigma (\eta _{1})} (\eta _{1} \wedge T, {\hat{X}}^{0,x,\sigma _{i}}_{\eta _{1} \wedge T})\). Clearly, \(\check{\tau } \in \mathcal {T}_{T}\). Then

$$\begin{aligned}&(J^{(2)}_{i}f)(0,x) \\&\quad \ge A(\check{\tau }) \\&\quad = {\tilde{{\mathbb {E}}}} \bigg [ e^{\int _0^{\tau } {\hat{X}}^{0,x,\sigma _i}_{s} \,\mathrm {d}s } \mathbb {1}_{\{ \tau< \eta _{1} \}} + e^{\int _0^{\eta _{1}} {\hat{X}}^{0,x,\sigma _i}_{s} \,\mathrm {d}s } \mathbb {1}_{\{ \tau \ge \eta _{1}\}} \sum _{j \ne i} \frac{\lambda _{ij}}{\lambda _{i}} {\tilde{{\mathbb {E}}}}^{\eta _{1},{\hat{X}}^{0,x,\sigma _i}_{\eta _{1}}, \sigma _j} \Big [ e^{\int _{0}^{\tau _{\sigma _j}} {\hat{X}}_{\eta _{1}+s} \,\mathrm {d}s} \mathbb {1}_{\{\tau _{\sigma _j}< \eta _{2}\}} \\&\qquad +\,\, e^{\int _{0}^{\eta _{2}} {\hat{X}}_{\eta _{1}+s} \,\mathrm {d}s }f(\eta _{1}+\eta _{2}, {\hat{X}}_{\eta _{1}+\eta _{2}}) \mathbb {1}_{\{ \tau _{\sigma _j} \ge \eta _{2}\}}\Big ] \bigg ] \\&\quad = {\tilde{{\mathbb {E}}}} \bigg [ e^{\int _0^{\tau } {\hat{X}}^{0,x,\sigma _i}_{s} \,\mathrm {d}s } \mathbb {1}_{\{ \tau < \eta _{1} \}} +\,e^{\int _0^{\eta _{1}} {\hat{X}}^{0,x,\sigma _i}_{s} \,\mathrm {d}s } \mathbb {1}_{\{ \tau \ge \eta _{1}\}} \sum _{j \ne i} \frac{\lambda _{ij}}{\lambda _{i}} (J_{j}f)(\eta _{1}, {\hat{X}}^{0,x,\sigma _i}_{\eta _{1}}) \bigg ], \end{aligned}$$

where Proposition 3.2 was used to obtain the last equality. Hence, taking the supremum over stopping times \(\tau \in \mathcal {T}_{T}\), we get

$$\begin{aligned} (J^{(2)}_{i}f)(0,x) \ge J_{i} \bigg ( \sum _{j \ne i}\frac{\lambda _{ij}}{\lambda _{i}} (J_{j}f) \bigg )(0,x). \end{aligned}$$
(3.19)

Finally, (3.17) and (3.19) taken together imply

$$\begin{aligned} (J^{(2)}_{i}f)(0,x) = J_{i} \bigg ( \sum _{j \ne i}\frac{\lambda _{ij}}{\lambda _{i}} (J_{j}f) \bigg )(0,x). \end{aligned}$$

\(\square \)
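
Relation (3.13) reduces the (n+1)-jump problem to a single-regime problem with a modified payoff. Treating the single-regime operators \(J_{i}\) as black boxes, the recursion can be sketched as follows; the toy operators and intensity matrix below are placeholders, not the genuine optimal stopping operators of the paper.

```python
def make_J_n(J, lam, n, i):
    """Build J^{(n)}_i as a callable on payoff functions f(t, x), via the
    recursion (3.13): J^{(n+1)}_i = J_i( sum_{j != i} (lam_ij / lam_i) J^{(n)}_j ).

    J[i] is the single-regime operator J_i (a black box mapping a function
    of (t, x) to a function of (t, x)); lam[i][j] is the intensity lam_ij.
    """
    if n == 0:
        return lambda f: f  # J^{(0)} is the identity
    lam_i = sum(lam[i][j] for j in range(len(J)) if j != i)

    def J_n_i(f):
        # apply the (n-1)-step operators regime by regime, then mix them
        inner = [make_J_n(J, lam, n - 1, j)(f) for j in range(len(J))]

        def mixture(t, x):
            return sum(lam[i][j] / lam_i * inner[j](t, x)
                       for j in range(len(J)) if j != i)

        return J[i](mixture)

    return J_n_i

# toy single-regime operators (placeholders, not the genuine J_i):
J = [lambda f: (lambda t, x: f(t, x) + 1.0),
     lambda f: (lambda t, x: f(t, x) + 2.0)]
lam = [[0.0, 1.0], [1.0, 0.0]]
value = make_J_n(J, lam, 2, 0)(lambda t, x: 1.0)(0.0, 0.0)
```

With two regimes the recursion simply alternates the two operators, which is exactly the nesting used in the proof above for \(J^{(2)}_{i}\).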

Remark 3.6

In [24], the authors use the same approximation procedure as in this article for an optimal stopping problem with regime-switching volatility. Unfortunately, equation (18) of [24] contains a mistake, which invalidates the subsequent approximation procedure when the number of volatility states is greater than 2. The identity (18) therein should be replaced by (3.13).

3.3 Convergence to the Value Function

Proposition 3.7

(Properties of the approximating sequence)

  1. (i)

    The sequence of functions \(\{ J^{(n)} 1 \}_{n \ge 0}\) is increasing, bounded from below by 1 and from above by \(e^{hT}\).

  2. (ii)

    Every \(J^{(n)} 1 \) is decreasing in the first variable t as well as increasing and convex in the second variable x.

  3. (iii)

    The sequence of functions

    $$\begin{aligned} J^{(n)} 1 \nearrow v \quad \text { pointwise as } n \nearrow \infty . \end{aligned}$$

    Moreover, the approximation error satisfies

    $$\begin{aligned} \Vert v - J^{(n)} 1 \Vert _{\infty } \le e^{hT} \lambda T \frac{(\lambda T)^{n-1}}{(n-1)!}, \quad n \ge 1, \end{aligned}$$
    (3.20)

    where \(\lambda := \max \{ \lambda _{i}\,:\, 1 \le i \le m \}\).

  4. (iv)

    For every \(n \in {\mathbb {N}}\cup \left\{ 0\right\} \),

    $$\begin{aligned} J_{m}^n 1 \le J^{(n)} 1 \le J_{1}^n 1. \end{aligned}$$
    (3.21)

Proof

  1. (i)

    The statement that \(\{ J^{(n)}_{i} 1 \}_{n \ge 0}\) is increasing, bounded from below by 1 and from above by \(e^{hT}\) is a direct consequence of the definition (3.12).

  2. (ii)

    The claim that every \(J^{(n)}_{i} 1 \) is decreasing in the first variable t as well as increasing and convex in the second variable x follows by a straightforward induction on n, using Proposition 3.1 (iii),(iv) and Proposition 3.5 at the induction step.

  3. (iii)

    First, let \(i \in \{ 1, \ldots , m\}\) and note that, for any \(n \in {\mathbb {N}}\),

    $$\begin{aligned} J^{(n)}_{i}1 \le v_{i}. \end{aligned}$$

    Here the inequality holds by suboptimality, since \(J^{(n)}_{i}1\) corresponds to an expected payoff of a particular stopping time in the problem (2.4). Next, define

    $$\begin{aligned} U^{(i)}_{n}(t,x):= & {} \sup _{\tau \in \mathcal {T}_{T-t}} {\tilde{{\mathbb {E}}}} \left[ e^{\int _0^\tau {\hat{X}}^{t,x, \sigma _{i}}_{t+s} \,\mathrm {d}s } \mathbb {1}_{\{ \tau < \xi ^{t}_{n} \}}\right] . \end{aligned}$$

    Then

    $$\begin{aligned} U^{(i)}_{n}(t,x) \le (J^{(n)}_{i}1) (t,x) \le v_{i}(t,x) \le U^{(i)}_{n}(t,x) + e^{h(T-t)} {{\mathbb {P}}}(\xi ^{t}_{n} \le T-t). \end{aligned}$$
    (3.22)

    Since the jump intensities satisfy \(\lambda _{i} \le \lambda := \max \{ \lambda _{i}\,:\, 1 \le i \le m \}\), the jump time \(\xi ^{t}_{n}\) stochastically dominates the \(n^{\text {th}}\) jump time, call it \(\zeta _{n}\), of a Poisson process with jump intensity \(\lambda \), which follows the Erlang distribution. Hence

    $$\begin{aligned} {{\mathbb {P}}}(\xi ^{t}_{n} \le T-t)\le & {} {{\mathbb {P}}}(\zeta _{n} \le T-t)\\= & {} \frac{1}{(n-1)!} \int _{0}^{\lambda (T-t)} u^{n-1} e^{-u} \,\mathrm {d}u \\\le & {} \lambda T \frac{(\lambda T)^{n-1}}{(n-1)!}. \end{aligned}$$

    Therefore, by (3.22),

    $$\begin{aligned} \Vert v - J^{(n)} 1 \Vert _{\infty } \le e^{hT} \lambda T \frac{(\lambda T)^{n-1}}{(n-1)!}, \quad n \ge 1. \end{aligned}$$
  4. (iv)

    The string of inequalities (3.21) will be proved by induction. The base step \(n=0\) is trivial. Now, suppose that (3.21) holds for some \(n \ge 0\). Then, for any \(i \in \{1, \ldots , m\}\),

    $$\begin{aligned} J^{n}_{m} 1 \le \sum _{j\ne i} \frac{\lambda _{ij}}{\lambda _{i}} J^{(n)}_{j} 1 \le J^n_1 1. \end{aligned}$$
    (3.23)

    Let us fix \(i \in \left\{ 1, \ldots , m \right\} \). By Proposition 3.1 (iv), every function in (3.23) is convex in the spatial variable x, thus [14, Theorem 6.1] yields

    $$\begin{aligned} J^{n+1}_{m} 1 \le J_i \left( \sum _{j\ne i} \frac{\lambda _{ij}}{\lambda _{i}} J^{(n)}_{j} 1 \right) \le J^{n+1}_1 1. \end{aligned}$$

    As i was arbitrary, we also have

    $$\begin{aligned} J_{m}^{n+1} 1 \le J^{(n+1)} 1 \le J_{1}^{n+1} 1. \end{aligned}$$
    (3.24)

\(\square \)
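
The a priori bound (3.20) makes it possible to choose, in advance, a number of iterations n guaranteeing a desired accuracy. A small sketch with hypothetical parameter values (the intensity, the bound h, and the horizon below are placeholders):

```python
import math

def error_bound(n, lam, h, T):
    """Right-hand side of (3.20): e^{hT} * lam*T * (lam*T)^{n-1} / (n-1)!."""
    return math.exp(h * T) * lam * T * (lam * T) ** (n - 1) / math.factorial(n - 1)

def iterations_needed(tol, lam, h, T):
    """Smallest n >= 1 whose a priori bound (3.20) is at most tol."""
    n = 1
    while error_bound(n, lam, h, T) > tol:
        n += 1
    return n

# hypothetical parameters: max intensity lam, state-space bound h, horizon T
n = iterations_needed(1e-6, lam=2.0, h=0.5, T=1.0)
```

Since the bound decays factorially in n, the required number of iterations grows only slowly as the tolerance shrinks.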

Remark 3.8

If the operators \(J^{(n)}_{i}\) are applied to the constant function \(e^{hT}\) instead of 1, then, following the same strategy as above, \(\{ J^{(n)}_{i} e^{hT} \}_{n \ge 0}\) is a decreasing sequence of functions with \( J^{(n)}_{i} e^{hT} \searrow v_{i}\) pointwise as \(n \nearrow \infty \).

Let \(\mathcal {B}_{b}([0,T]\times (l,h); {\mathbb {R}})\) denote the set of bounded functions from \([0,T]\times (l,h)\) to \({\mathbb {R}}\) and define an operator \( {\tilde{J}} : \mathcal {B}_{b}([0,T]\times (l,h); {\mathbb {R}})^{m} \rightarrow \mathcal {B}_{b}([0,T]\times (l,h); {\mathbb {R}})^{m}\) by

$$\begin{aligned} {\tilde{J}} \left( \begin{array}{c} f_{1} \\ \vdots \\ f_{m} \end{array}\right):= & {} \left( \begin{array}{c} J_{1} ( \sum _{j \ne 1} \frac{\lambda _{1j}}{\lambda _{1}} f_{j} ) \\ \vdots \\ J_{m} (\sum _{j \ne m} \frac{\lambda _{mj}}{\lambda _{m}} f_{j} )\end{array} \right) . \end{aligned}$$

Proposition 3.9

  1. (i)

    Let \(f \in \mathcal {B}_{b}([0,T]\times (l,h); {\mathbb {R}})^{m}\). Then

    $$\begin{aligned} \lim _{n\rightarrow \infty } {\tilde{J}}^{n} f= & {} \left( \begin{array}{c} v_{1}\\ \vdots \\ v_{m} \end{array} \right) . \end{aligned}$$
  2. (ii)

    The vector \((v_{1}, \ldots , v_{m})^{tr}\) of value functions is a fixed point of the operator \({\tilde{J}}\), i.e.

    $$\begin{aligned} {\tilde{J}} \left( \begin{array}{c} v_{1} \\ \vdots \\ v_{m} \end{array} \right)= & {} \left( \begin{array}{c} v_{1} \\ \vdots \\ v_{m} \end{array}\right) . \end{aligned}$$
    (3.25)

Proof

  1. (i)

    Observe that the argument in the proof of part (iii) of Proposition 3.7 also gives that \(J^{(n)}_{i} g \rightarrow v_{i}\) as \(n \rightarrow \infty \) for any bounded g. Hence to finish the proof it is enough to recall the relation (3.13) in Proposition 3.5.

  2. (ii)

    Let \(i \in \{1, \ldots , m\}\). By Proposition 3.5,

    $$\begin{aligned} J^{(n+1)}_{i}1= J_{i} \left( \sum _{j\ne i} \frac{\lambda _{ij}}{\lambda _{i}} J^{(n)}_{j} 1\right) . \end{aligned}$$
    (3.26)

    By Proposition 3.7 (iii), for every \(j \in \{1, \ldots , m\}\), the sequence \(J^{(n)}_{j} 1 \nearrow v_{j}\) as \(n \nearrow \infty \), so, letting \(n \nearrow \infty \) in (3.26), the monotone convergence theorem tells us that

    $$\begin{aligned} v_{i}= J_{i} \left( \sum _{j\ne i} \frac{\lambda _{ij}}{\lambda _{i}} v_{j} \right) . \end{aligned}$$
    (3.27)

\(\square \)

4 The Value Function and the Stopping Strategy

In this section, we show that the value function v has attractive structural properties and identify an optimal strategy for the liquidation problem (2.7). The first passage time below a boundary, which is an increasing function of time and volatility, is proved to be optimal. Moreover, we provide a method to approximate the optimal stopping boundary by demonstrating that it is a limit of an increasing sequence of stopping boundaries coming from easier auxiliary problems of Sect. 3.

Theorem 4.1

(Properties of the value function)

  1. (i)

    v is decreasing in the first variable t as well as increasing and convex in the second variable x.

  2. (ii)

    \(v_{i}\) is continuous for every \(i \in \{1, \ldots , m\}\).

  3. (iii)
    $$\begin{aligned} \check{v}_{\sigma _m} \le v \le \check{v}_{\sigma _1}, \end{aligned}$$
    (4.1)

    where \(\check{v}_{\sigma _i}:[0,T]\times (l, h) \rightarrow {\mathbb {R}}\) denotes the Markovian value function as in (2.7), but for a price process (2.1) with constant volatility \(\sigma _i\).

Proof

  1. (i)

    Since, by Proposition 3.7 (ii), every \(J^{(n)} 1\) is decreasing in the first variable t, increasing and convex in the second variable x, these properties are also preserved in the pointwise limit \(\lim _{n \rightarrow \infty } J^{(n)}1\), which is v by Proposition 3.7 (iii).

  2. (ii)

    Using part (i) above, the claim follows from Proposition 3.9 (ii), i.e. from the fact that \((v_{1}, \ldots , v_{m})^{tr}\) is a fixed point of a regularising operator \({\tilde{J}}\) in the sense of Proposition 3.3.

  3. (iii)

    Letting \(n \rightarrow \infty \) in (3.21), Proposition 3.7 (iii) gives us (4.1).

\(\square \)

For the optimal liquidation problem (2.4) with constant volatility \(\sigma \), i.e. in the case \(\sigma _{1}= \ldots =\sigma _{m} =\sigma \), it has been shown in [15] that an optimal liquidation strategy is characterised by an increasing continuous stopping boundary \(\check{b}_{\sigma } :[0,T) \rightarrow [l, 0]\) with \(\check{b}_{\sigma }(T-)=0\) such that the stopping time \(\check{\tau }_{\sigma }= \inf \{t \ge 0 \,:\, {\hat{X}}_{t} \le \check{b}_{\sigma }(t) \}\wedge T\) is optimal. As the next theorem shows, the optimal liquidation strategy in our regime-switching volatility model shares some similarities with the constant volatility case.
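
A stopping rule of this first-passage type is straightforward to evaluate along a discretised path. The sketch below uses a hypothetical boundary and a toy deterministic path; neither is taken from the paper, and a real application would plug in a simulated path of \({\hat{X}}\) and a computed boundary.

```python
def first_passage_time(ts, xs, boundary, T):
    """Return inf{ t : x(t) <= boundary(t) } ∧ T along a discretised path."""
    for t, x in zip(ts, xs):
        if x <= boundary(t):
            return t
    return T

# hypothetical increasing boundary with boundary(T) = 0, and a toy path
T = 1.0
boundary = lambda t: -0.5 * (1.0 - t)
ts = [k * 0.01 for k in range(101)]
xs = [0.2 - 0.4 * t for t in ts]       # deterministic path drifting down
tau = first_passage_time(ts, xs, boundary, T)
```

If the path never crosses the boundary, the rule simply stops at the deadline T, mirroring the cap \(\wedge\, T\) in the definition of \(\check{\tau }_{\sigma }\).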

Theorem 4.2

(Optimal liquidation strategy)

  1. (i)

    For every \(i \in \{1,\ldots , m\}\), there exists \(b_{\sigma _{i}}: [0,T) \rightarrow [l, 0]\) that is increasing, right-continuous with left limits, satisfies the equality \(b_{\sigma _{i}}(T-) = 0\) and the identity

    $$\begin{aligned} \mathcal {C}^{u_{i}}_{i} = \{ (t,x) \in [0,T) \times (l, h) \,:\, x > b_{\sigma _{i}}(t) \}, \end{aligned}$$
    (4.2)

    where \(u_{i} := { \sum _{j\ne i} \frac{\lambda _{ij}}{\lambda _{i}} v_{j}}\). Moreover,

    $$\begin{aligned} \check{b}_{\sigma _{1}} \le b_{\sigma _{i}} \le \check{b}_{\sigma _{m}} \end{aligned}$$

    for any \(i \in \left\{ 1, \ldots , m \right\} \).

  2. (ii)

    The stopping strategy

    $$\begin{aligned} \tau ^{*} := \inf \{ s \in [0,T-t) \,:\, {\hat{X}}^{t,x, \sigma }_{t+s} \le b_{\sigma (t+s)}(t+s) \} \wedge (T-t) \end{aligned}$$

    is optimal for the optimal selling problem (2.7).

  3. (iii)

    For \(i \in \{1,\ldots , m\}\), the boundaries

    $$\begin{aligned} b_{\sigma _{i}}^{g_i^{(n)}} \searrow b_{\sigma _{i}} \quad \text {pointwise as } n \nearrow \infty , \end{aligned}$$

    where \(g_i^{(n)} := \sum _{j\ne i} \frac{\lambda _{ij}}{\lambda _{i}}J^{(n)}_{j}1\).

  4. (iv)

    The pairs \((v_{1}, b_{\sigma _{1}}), (v_{2},b_{\sigma _{2}}), \ldots ,(v_{m},b_{\sigma _{m}})\) satisfy the coupled system of m free-boundary problems

    $$\begin{aligned} \left\{ \begin{array}{ll} \partial _{t}v_{i}(t,x) +\,{\sigma _{i}} \phi (x, \sigma _{i}) \partial _{x} v_{i}(t,x) + \frac{1}{2} \phi (x, \sigma _{i})^{2} \partial _{xx} v_{i}(t,x) \\ \quad +(x- \lambda _{i})v_{i}(t,x)+\sum _{j \ne i} \lambda _{ij}v_{j}(t,x) = 0, &{} \text { if } x > b_{\sigma _{i}}(t),\\ v_{i}(t,x) = 1, &{} \text { if } x \le b_{\sigma _{i}}(t) \text { or } t = T, \end{array} \right. \end{aligned}$$
    (4.3)

    where \(i \in \{1, \ldots , m\}\).

Proof

  1. (i)

    The existence of \(b_{\sigma _{i}} : [0, T) \rightarrow [l, h]\) that is increasing, right-continuous with left limits, and satisfies (4.2) follows from the fixed-point property (3.25), and Theorem 4.1 (i),(ii). Since the range of \(\check{b}_{\sigma _{1}}, \check{b}_{\sigma _{m}}\) is [l, 0] and \(\check{b}_{\sigma _{1}}(T-)= \check{b}_{\sigma _{m}}(T-)=0\), using Theorem 4.1 (iii), we also conclude that \(\check{b}_{\sigma _{1}} \le b_{\sigma _{i}} \le \check{b}_{\sigma _{m}}\) and that \(b_{\sigma _{i}}(T-) = 0\) for every i.

  2. (ii)

    Let us define \(\mathcal {D}:= \{ (t,x, \sigma ) \in [0,T]\times (l,h)\times \{\sigma _1, \ldots ,\sigma _m\} \,:\, v(t,x, \sigma ) = 1 \}\). Then \(\tau _{\mathcal {D}} := \inf \{ s \ge 0 \,:\, (t+s, {\hat{X}}^{t,x, \sigma (t)}_{t+s}, \sigma (t+s)) \in \mathcal {D}\}\) is optimal for the problem (2.7) by [29, Corollary 2.9]. Lastly, from the fixed-point property (3.25) and Proposition 3.2, we conclude that \(\tau ^*=\tau _{\mathcal {D}}\), which finishes the proof.

  3. (iii)

    Since \(J^{(n)}_{i} 1 \nearrow v_{i}\) as \(n \nearrow \infty \) and \(J^{(n)}_{i} 1 \ge 1\) for all n, we have that \(\lim _{n\nearrow \infty } b_{\sigma _{i}}^{g_i^{(n)}} \ge b_{\sigma _{i}}\). Also, if \(x < \lim _{n\nearrow \infty } b_{\sigma _{i}}^{g_i^{(n)}} (t)\), then \(J^{(n)}_{i} 1 (t,x) = 1\) for all \( n \in {\mathbb {N}}\) and so \(v_{i}(t,x)= \lim _{n \nearrow \infty } J^{(n)}_{i} 1 (t,x)=1\). Hence, \(\lim _{n\nearrow \infty } b_{\sigma _{i}}^{g_i^{(n)}} \le b_{\sigma _{i}}\). As a result, \( \lim _{n\nearrow \infty } b_{\sigma _{i}}^{g_i^{(n)}} = b_{\sigma _{i}}\).

  4. (iv)

    The free-boundary problem is a consequence of Proposition 3.4 (ii) and the fixed-point property (3.25).

\(\square \)

Remark 4.3

Establishing uniqueness of a classical solution to a time-inhomogeneous free-boundary problem is typically a technical task (see [27] for an example). As this question is not central to the aim of the paper, uniqueness of solutions to the free-boundary problems (4.3) and (3.11) has not been pursued.

Remark 4.4

(A possible alternative approach) A potential alternative approach to the study of the value function and the optimal strategy is to analyse directly the variational inequality formulation (e.g., see [30, Sect. 5.2]) arising from the optimal stopping problem (2.7). The coupled system of variational inequalities would need to be studied using weak solution techniques from PDE theory (e.g., see [6, 30]) to obtain the desired regularity and structural properties of the value function and the stopping region. Though the author is unaware of any work studying exactly this type of free-boundary problem directly in detail, theoretical results are available in [7], including existence and uniqueness of viscosity solutions and a comparison principle for the pricing of American options in regime-switching models; moreover, under some conditions, [7] shows convergence of stable, monotone, and consistent approximation schemes to the value function. Suitable numerical PDE methods for such a coupled system, with their pros and cons, are discussed in [22]. With this alternative route in mind (provided all the needed technical results can be established), our approach has clear benefits: it avoids many analytical complications that arise in the study of the full system (compare [7]) and yields a very intuitive monotone approximation scheme for the value function and the stopping boundary.

For further study of the problem in this section, we will make a structural assumption about the Markov chain modelling the volatility.

Assumption 4.5

The Markov chain \(\sigma \) is skip-free, i.e. for all \(i \in \{1, \ldots , m \}\),

$$\begin{aligned} \lambda _{ij}=0 \; \text { if } j \notin \{i-1, i, i+1 \}. \end{aligned}$$

As many popular financial stochastic volatility models have continuous trajectories, and a skip-free Markov chain is a natural discrete state-space approximation of a continuous process, Assumption 4.5 does not appear to be a severe restriction.
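
A skip-free chain has a tridiagonal intensity matrix. The sketch below assembles such a generator for hypothetical up/down rates (placeholders, not calibrated values) and makes the defining properties of Assumption 4.5 explicit.

```python
def skip_free_generator(up, down):
    """Tridiagonal intensity matrix: lam_ij = 0 unless j in {i-1, i, i+1}.

    up[i] is the intensity of i -> i+1, down[i] of i -> i-1
    (with up[-1] = down[0] = 0); the diagonal makes every row sum to zero.
    """
    m = len(up)
    Q = [[0.0] * m for _ in range(m)]
    for i in range(m):
        up_rate = up[i] if i + 1 < m else 0.0
        down_rate = down[i] if i - 1 >= 0 else 0.0
        if i + 1 < m:
            Q[i][i + 1] = up_rate
        if i - 1 >= 0:
            Q[i][i - 1] = down_rate
        Q[i][i] = -(up_rate + down_rate)
    return Q

# hypothetical three-state volatility chain (placeholder rates)
Q = skip_free_generator(up=[1.0, 0.5, 0.0], down=[0.0, 0.3, 0.8])
```

The zero entries away from the three central diagonals are exactly the condition \(\lambda _{ij}=0\) for \(j \notin \{i-1, i, i+1\}\).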

Lemma 4.6

Let \(\delta >0\), \(g:(l,h)\times [0, \infty ) \rightarrow [0,\infty ) \) be increasing and convex in the first variable as well as decreasing in the second. Then \(u : (l, h)\times \{ \sigma _{1}, \ldots , \sigma _{m}\} \rightarrow {\mathbb {R}}\) defined by

$$\begin{aligned} u(x, \sigma _{i}) := {\mathbb {E}}\left[ e^{\int _{0}^{\delta } {\hat{X}}^{x, \sigma _{i}}_{u} \,\mathrm {d}u} g({\hat{X}}^{x, \sigma _{i}}_{\delta }, \sigma (\delta )) \right] \end{aligned}$$
(4.4)

is increasing and convex in the first variable as well as decreasing in the second.

Proof

We will prove the claim using a coupling argument. Let \((\Omega ', {\mathcal {F}}', {\tilde{{{\mathbb {P}}}}}')\) be a probability space supporting a Brownian motion B and two volatility processes \(\sigma ^{1}\), \(\sigma ^{2}\) with the state space and transition intensities as in (2.1). In addition, we assume that B is independent of \((\sigma ^{1}, \sigma ^{2})\), that the starting values satisfy \(\sigma ^{1}(0) = \sigma _{i} \le \sigma _{j} = \sigma ^{2}(0)\), and that \(\sigma ^{1}(t)\le \sigma ^{2}(t)\) for all \(t \ge 0\). Also, let \({\hat{X}}^{1}\) and \({\hat{X}}^{2}\) denote the solutions to (2.6) when \(\sigma \) is replaced by \(\sigma ^{1}\) and \(\sigma ^{2}\), respectively.

Let us fix an arbitrary \(\omega _{0} \in \Omega '\). Since B is independent of \(\sigma ^{1}\),

$$\begin{aligned} {\tilde{{\mathbb {E}}}}' \left[ e^{\int _{0}^{\delta } ({\hat{X}}^{1})^{x}_{u} \,\mathrm {d}u} g(({\hat{X}}^{1})^{x}_{\delta }, \sigma ^{1}(\delta )) \,|\, {\mathcal {F}}^{\sigma ^{1}}_{\delta } \right] (\omega _{0})= & {} {\tilde{{\mathbb {E}}}}' \left[ e^{\int _{0}^{\delta } ({{\tilde{X}}}^{1})^{x}_{u} \,\mathrm {d}u} g((\tilde{X}^{1})^{x}_{\delta }, \sigma ^{1}(\delta , \omega _{0}))\right] , \end{aligned}$$
(4.5)

where \({\tilde{X}}^{1}\) denotes the process \({\hat{X}}^{1}\) with the volatility process \(\sigma ^{1}\) replaced by the deterministic function \(\sigma ^{1}(\cdot , \omega _{0})\). Furthermore, the right-hand side (and so the left-hand side) of (4.5), as a function of x, is increasing by [31, Theorem IX.3.7] and convex by [14, Theorem 5.1]. Hence

$$\begin{aligned} u(\cdot , \sigma _{i}) : x \mapsto {\tilde{{\mathbb {E}}}}' \left[ {\tilde{{\mathbb {E}}}}' \left[ e^{\int _{0}^{\delta } ({\hat{X}}^{1})^{x}_{u} \,\mathrm {d}u} g(({\hat{X}}^{1})^{x}_{\delta }, \sigma ^{1}(\delta )) \,|\, {\mathcal {F}}^{\sigma ^{1}}_{\delta } \right] \right] \end{aligned}$$

is increasing and convex. Next, we observe that

$$\begin{aligned}&{\tilde{{\mathbb {E}}}}' \left[ e^{\int _{0}^{\delta } ({\hat{X}}^{1})^{x}_{u} \,\mathrm {d}u} g(({\hat{X}}^{1})^{x}_{\delta }, \sigma ^{1}(\delta )) \,|\, {\mathcal {F}}^{\sigma ^{1}, \sigma ^{2}}_{\delta } \right] (\omega _{0})\nonumber \\&\quad \ge {\tilde{{\mathbb {E}}}}' \left[ e^{\int _{0}^{\delta } ({\hat{X}}^{2})^{x}_{u} \,\mathrm {d}u} g(({\hat{X}}^{2})^{x}_{\delta }, \sigma ^{1}(\delta )) \,|\, {\mathcal {F}}^{\sigma ^{1}, \sigma ^{2} }_{\delta } \right] (\omega _{0}) \nonumber \\&\quad \ge {\tilde{{\mathbb {E}}}}' \left[ e^{\int _{0}^{\delta } ({\hat{X}}^{2})^{x}_{u} \,\mathrm {d}u} g( ({\hat{X}}^{2})^{x}_{\delta }, \sigma ^{2}(\delta )) \,|\, {\mathcal {F}}^{\sigma ^{1}, \sigma ^{2}}_{\delta } \right] (\omega _{0}). \end{aligned}$$
(4.6)

In the above, having in mind that the conditional expectations can be rewritten as ordinary expectations as in (4.5), the first inequality follows from [14, Theorem 6.1] and the second from the fact that g is decreasing in the second variable. Integrating both sides of (4.6) over \(\omega _{0} \in \Omega '\) with respect to \(\mathrm {d}{\tilde{{{\mathbb {P}}}}}'\), we get that

$$\begin{aligned} u(x, \sigma _{1}) \ge u(x, \sigma _{2}). \end{aligned}$$

Thus we can conclude that u is increasing and convex in the first variable as well as decreasing in the second. \(\square \)

Theorem 4.7

(Ordering in volatility)

  1. (i)

    v is decreasing in the volatility variable, i.e.

    $$\begin{aligned} v_{\sigma _{1}} \ge v_{\sigma _{2}} \ge \cdots \ge v_{\sigma _{m}} . \end{aligned}$$
  2. (ii)

    The boundaries are ordered in volatility as

    $$\begin{aligned} b_{\sigma _{1}} \le b_{\sigma _{2}} \le \cdots \le b_{\sigma _{m}}. \end{aligned}$$

Proof

  1. (i)

    We will prove the claim by approximating the value function v by a sequence of value functions \(\{v_{n}\}_{n\ge 0}\) of corresponding Bermudan optimal stopping problems. Let \(v_{n}\) denote the value function as in (2.7), but when stopping is allowed only at times \(\left\{ \frac{kT}{2^{n}} \,:\, k\in \{0,1, \ldots , 2^{n}\}\right\} \). Let us fix \(n \in {\mathbb {N}}\). We will show that, for any given \(k\in \{0, \ldots ,2^n\}\) and any \(t \in [\frac{k}{2^n}T, T]\), the value function \(v_{n}(t,x, \sigma )\) is increasing and convex in x as well as decreasing in \(\sigma \) (note that here \(\sigma \) denotes the initial value of the process \(t\mapsto \sigma (t)\)). The proof is by backwards induction from \(k=2^n\) down to \(k=0\). Since \(v_{n}(T, \cdot , \cdot )=1\), the base step \(k=2^n\) holds trivially. Now, suppose that, for some given \(k \in \{0, \ldots , 2^n\}\), the value \(v_{n}(t,x, \sigma )\) is increasing and convex in x as well as decreasing in \(\sigma \) for any \(t\in [\frac{k}{2^n}T, T]\). Then, Lemma 4.6 tells us that for any fixed \(t \in [\frac{(k-1)T}{2^n}, \frac{kT}{2^n})\),

    $$\begin{aligned} f(t, x, \sigma ) := {\tilde{{\mathbb {E}}}} \left[ e^{\int _{t}^{\frac{kT}{2^n}} {\hat{X}}^{t,x, \sigma }_{u} \,\mathrm {d}u} v_{n}\left( \frac{kT}{2^n}, {\hat{X}}^{t, x, \sigma }_{\frac{kT}{2^n}}, \sigma \left( \frac{kT}{2^n} \right) \right) \right] , \end{aligned}$$

    is increasing and convex in x as well as decreasing in \(\sigma \). Consequently, since

    $$\begin{aligned} v_{n}(t,x, \sigma ) = \left\{ \begin{array}{ll} f(t,x, \sigma ), &{}\quad t \in (\frac{(k-1)T}{2^n}, \frac{kT}{2^n}), \\ f(t,x, \sigma ) \vee 1, &{}\quad t = \frac{(k-1)T}{2^n}, \end{array} \right. \end{aligned}$$
    (4.7)

    the value \(v_{n}(t,x, \sigma )\) is increasing and convex in x as well as decreasing in \(\sigma \) for any fixed \(t \in [ \frac{k-1}{2^n}T, T]\). Hence, by backwards induction, \(v_{n}\) is increasing and convex in the second argument x as well as decreasing in the third argument \(\sigma \). Finally, since \(v_{n} \rightarrow v\) pointwise as \(n \rightarrow \infty \), we can conclude that the value function v is decreasing in \(\sigma \).

  2. (ii)

    From the proof of Theorem 4.2 (ii), the claim is a direct consequence of part (i) above.

\(\square \)
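The Bermudan approximation idea used in part (i) can be illustrated numerically. The following toy sketch is not the filtering model of this paper: it prices an American put on a standard CRR binomial tree (all parameters hypothetical) and checks that the Bermudan values, with exercise allowed only on the dyadic grid \(\{kT/2^{n}\}\), increase monotonically as the grid is refined, mirroring \(v_{n} \nearrow v\).

```python
import numpy as np

# Hypothetical CRR parameters (toy example, not the model of the paper)
S0, K, rate, sig, T, N = 100.0, 100.0, 0.02, 0.3, 1.0, 256  # N = 2^8 tree steps
dt = T / N
up = np.exp(sig * np.sqrt(dt))
down = 1.0 / up
q = (np.exp(rate * dt) - down) / (up - down)   # risk-neutral up probability
disc = np.exp(-rate * dt)

def bermudan_value(n):
    """Backward induction; early exercise only at steps divisible by N // 2**n."""
    stride = N // 2**n
    j = np.arange(N + 1)
    V = np.maximum(K - S0 * up**j * down**(N - j), 0.0)      # payoff at maturity
    for k in range(N - 1, -1, -1):
        j = np.arange(k + 1)
        V = disc * (q * V[1:] + (1 - q) * V[:-1])            # continuation value
        if k % stride == 0:                                  # dyadic exercise date
            V = np.maximum(V, K - S0 * up**j * down**(k - j))
    return V[0]

# Nested exercise grids => nondecreasing values, converging to the American price
vals = [bermudan_value(n) for n in range(9)]
print(all(a <= b + 1e-12 for a, b in zip(vals, vals[1:])))
```

Since the exercise-date sets are nested as n grows, the printed check confirms the monotone convergence that drives the backward-induction proof above.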

Remark 4.8

  1. 1.

    The value function is decreasing in the initial volatility (Theorem 4.7 (i)) also when the volatility is any continuous, time-homogeneous, positive Markov process independent of the driving Brownian motion W. The assertion is justified by inspection of the proof of Lemma 4.6, in which what mattered was only the non-crossing of the volatility trajectories, not the Markov chain structure.

  2. 2.

    Though there are no grounds to believe that any of the boundaries \(b_{\sigma _{1}}, \ldots ,b_{\sigma _{m}}\) is discontinuous, proving their continuity, except for the lowest one, is beyond the power of customary techniques. Continuity of the lowest boundary can be proved similarly as in the proof of part 4 of [15, Theorem 3.10], exploiting the ordering of the boundaries. The stumbling block for proving continuity of the upper boundaries is that, at a downward volatility jump time, the value function has a positive jump whose magnitude is difficult to quantify.

5 Generalisation to an Arbitrary Prior

In this section, we generalise most results of the earlier parts to the general prior case. In what follows, the prior \(\mu \) of the drift is no longer a two-point distribution but an arbitrary probability distribution.

5.1 Two-Dimensional Characterisation of the Posterior Distribution

Let us first think a bit more abstractly to develop intuition for the arbitrary prior case. According to the Kushner–Stratonovich stochastic partial differential equation (SPDE) for the posterior distribution (see [8, Sect. 3.2]), if we take the innovation process driving the SPDE and the volatility as the available information sources, then the posterior distribution is a measure-valued Markov process. Unfortunately, no applicable general methods exist for solving optimal stopping problems for measure-valued stochastic processes. If only we were able to characterise the posterior distribution process by an \({\mathbb {R}}^n\)-valued Markov process (with respect to the filtration generated by the innovation and the volatility processes), then we could reduce our optimal stopping problem with a measure-valued underlying to one with an \({\mathbb {R}}^n\)-valued Markovian underlying. Mercifully, as we shall soon see, this wishful thinking turns out to be realisable.

Unlike in the problem with constant volatility studied in [15], when the volatility is varying, the pair consisting of the elapsed time t and the posterior mean \({\hat{X}}_{t}\) is not sufficient (with the exception of the two-point prior case studied before) to characterise the posterior distribution \(\mu _{t}\) of X given \({\mathcal {F}}^{S, \sigma }_{t}\). Hence we need some additional information to describe the posterior distribution. Quite surprisingly, all the additional information needed can be captured in a single additional observable statistic, which we will name the ‘effective learning time’. We start the development by introducing some useful notation.

Define \(Y^{(i)}_t:=Xt + \sigma _{i} W_{t}\) and let \(\mu ^{(i)}_{t,y}\) denote the posterior distribution of X at time t given \(Y^{(i)}_{t}=y\). It needs to be mentioned that, for any given prior \(\mu \), the distributions of X given \({\mathcal {F}}^{Y^{(i)}}_{t}\) and X given \(Y^{(i)}_{t}\) are equal (see Proposition 3.1 in [15]), which justifies our conditioning only on the last value \(Y^{(i)}_{t}\). Also, recall that \(l= \inf \mathop {\mathrm {supp}}\nolimits (\mu )\), \(h = \sup \mathop {\mathrm {supp}}\nolimits (\mu )\).

The next lemma provides the key insight allowing to characterise the posterior distribution by only two parameters.

Lemma 5.1

Let \(\sigma _{2} \ge \sigma _{1} > 0\). Then

$$\begin{aligned} \{\mu ^{(1)}_{t,y} \,:\, t> 0, \, y \in {\mathbb {R}}\}= \{ \mu ^{(2)}_{t,y}\,:\, t > 0, \, y \in {\mathbb {R}}\}, \end{aligned}$$

i.e. the sets of possible conditional distributions of X in both cases are the same.

Proof

Let \(t>0\), \(y \in {\mathbb {R}}\). By the standard filtering theory (a generalised Bayes’ rule),

$$\begin{aligned} \mu ^{(i)}_{t,y}(\mathrm {d}u):=\frac{e^{\frac{2uy-u^2t}{2\sigma _{i}^2}} \mu (\mathrm {d}u)}{\int _{\mathbb {R}}e^{\frac{2uy-u^2t}{2\sigma _{i}^2}} \mu (\mathrm {d}u)}. \end{aligned}$$
(5.1)

Then, taking \(r = \left( \frac{\sigma _{1}}{\sigma _{2}} \right) ^{2}t\) and \(y_{1} = \left( \frac{\sigma _{1}}{\sigma _{2}} \right) ^{2} y\), the exponent in (5.1) satisfies \(\frac{2uy-u^2t}{2\sigma _{2}^2} = \frac{2uy_{1}-u^2r}{2\sigma _{1}^2}\), so that

$$\begin{aligned} \mu ^{(2)}_{t,y}(\mathrm {d}u) = \mu ^{(1)}_{r ,y_{1}}(\mathrm {d}u). \end{aligned}$$

Since the map \((t,y)\mapsto (r,y_{1})\) is a bijection of \((0,\infty ) \times {\mathbb {R}}\) onto itself, the two sets of posterior distributions coincide.

\(\square \)
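The scaling identity behind the proof is easy to verify numerically. The sketch below uses a hypothetical discrete prior (any prior works), evaluates the posterior via the generalised Bayes rule (5.1) for two volatilities, and checks that \(\mu ^{(2)}_{t,y} = \mu ^{(1)}_{r,y_{1}}\).

```python
import numpy as np

# Hypothetical discrete prior mu on a few drift values
u = np.array([-0.5, 0.0, 0.7])      # support points of mu
w = np.array([0.3, 0.2, 0.5])       # prior weights

def posterior(t, y, sigma):
    """Posterior weights of X given Y_t = y, via the generalised Bayes rule (5.1)."""
    log_lik = (2*u*y - u**2*t) / (2*sigma**2)
    p = w * np.exp(log_lik - log_lik.max())   # stabilised exponentials
    return p / p.sum()

sigma1, sigma2 = 0.2, 0.6
t, y = 1.3, 0.4
# Rescaling of Lemma 5.1: r = (sigma1/sigma2)^2 t, y1 = (sigma1/sigma2)^2 y
r, y1 = (sigma1/sigma2)**2 * t, (sigma1/sigma2)**2 * y
print(np.allclose(posterior(t, y, sigma2), posterior(r, y1, sigma1)))  # True
```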

From Lemma 5.1 and [15, Lemma 3.3] we obtain the following important corollary, telling us that, having fixed a prior, any possible posterior distribution can be fully characterised by only two parameters.

Corollary 5.2

Let \(t >0\). Then, for any posterior distribution \(\mu _{t}(\cdot ) = {{\mathbb {P}}}(X \in \cdot \,|\, {\mathcal {F}}^{S, \sigma }_{t})(\omega )\), there exists \((r, x) \in (0,T]\times (l,h)\) such that \(\mu _{t}= \mu ^{(1)}_{r,y_{1}(r,x)}\), where \(y_{1}(r,x)\) is defined as the unique value satisfying \({\mathbb {E}}[X \,|\, Y^{(1)}_{r}=y_{1}(r,x) ] =x\). In particular, we can take \(r = \int _0^t \left( \frac{\sigma _1}{\sigma (u)(\omega )}\right) ^2 \,\mathrm {d}u\) and \(y_1(r,x) = \int _0^t \left( \frac{\sigma _1}{\sigma (u)(\omega )}\right) ^2 \,\mathrm {d}Y_u(\omega )\), where \(Y_u = \log (S_u) + \frac{1}{2}\int _0^u \sigma (b)^2 \,\mathrm {d}b\).

When the volatility varies, so does the speed of learning about the drift. The corollary tells us that we can interpret r as the effective learning time measured under the constant volatility \(\sigma _{1}\). The intuition for the name is that, even though the volatility varies over time, the same posterior distribution \(\mu _t\) can also be obtained in a constant volatility model with the constant volatility \(\sigma _1\), just at a different time r and at a different value of the price S.
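For a piecewise-constant (regime-switching) volatility path, the integral defining r in Corollary 5.2 reduces to a finite sum. A minimal sketch with hypothetical switch times and volatility states:

```python
import numpy as np

sigma1 = 0.2                              # lowest volatility state
# Hypothetical volatility path: value sigmas[k] on [times[k], times[k+1])
times  = np.array([0.0, 0.4, 1.0, 1.5])   # switch times; last entry = current time t
sigmas = np.array([0.2, 0.5, 0.3])        # volatility on each interval

# Effective learning time r = \int_0^t (sigma1 / sigma(u))^2 du  (Corollary 5.2)
durations = np.diff(times)
r = np.sum((sigma1 / sigmas)**2 * durations)

# Since sigma1 is the lowest state, r <= t: high volatility slows the learning
print(r, r <= times[-1])
```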

Remark 5.3

It is worth remarking that Corollary 5.2 also holds for any reasonable positive volatility process. Indeed, using the Kallianpur–Striebel formula with time-dependent volatility (see Theorem 2.9 on page 39 of [8]), the proof of Lemma 5.1 applies equally to an arbitrary positive time-dependent volatility and immediately yields the result of the corollary.

Next, we make a convenient technical assumption about the prior distribution \(\mu \).

Assumption 5.4

The prior distribution \(\mu \) is such that

  1. 1.

    \( \int _{\mathbb {R}}e^{a u^2}\mu (\mathrm {d}u)<\infty \) for some \(a>0\),

  2. 2.

    \(\psi (\cdot ,\cdot ) :[0,T]\times (l,h) \rightarrow {\mathbb {R}}\) defined by

    $$\begin{aligned} \psi (t,x) := \frac{1}{\sigma _{1}} \left( {\mathbb {E}}[X^{2}\,|\, Y^{(1)}_{t}=y_{1}(t,x)] - x^{2} \right) = \frac{1}{\sigma _{1}} \mathop {\mathrm {Var}}\nolimits \left( X\,|\, Y^{(1)}_{t}= y_{1}(t,x) \right) \end{aligned}$$

    is a bounded function that is Lipschitz continuous in the second variable.

In particular, all compactly supported distributions as well as the normal distribution are known to satisfy Assumption 5.4 (see [15]), so it is an inconsequential restriction for practical applications.
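For the normal prior, \(\psi \) admits a closed form that makes Assumption 5.4 transparent: for a \(N(m_{0}, v_{0})\) prior, the posterior variance given \(Y^{(1)}_{t}=y\) is \(v_{0}\sigma _{1}^{2}/(\sigma _{1}^{2}+v_{0}t)\), independent of y, so \(\psi \) is bounded and trivially Lipschitz in x. The sketch below (hypothetical parameters) verifies this numerically against the Bayes rule (5.1).

```python
import numpy as np

sigma1, m0, v0 = 0.3, 0.05, 0.04   # hypothetical volatility and N(m0, v0) prior
t, y = 2.0, 0.1

# Discretised normal prior density on a fine grid
u = np.linspace(m0 - 8*np.sqrt(v0), m0 + 8*np.sqrt(v0), 20001)
prior = np.exp(-(u - m0)**2 / (2*v0))

# Posterior of X given Y^(1)_t = y via the generalised Bayes rule (5.1)
log_lik = (2*u*y - u**2*t) / (2*sigma1**2)
post = prior * np.exp(log_lik - log_lik.max())
post /= post.sum()

mean = (u * post).sum()
var = ((u - mean)**2 * post).sum()
psi_numeric = var / sigma1

# Closed form for a normal prior: Var = v0*sigma1^2/(sigma1^2 + v0*t), indep. of y
psi_closed = v0 * sigma1 / (sigma1**2 + v0*t)
print(abs(psi_numeric - psi_closed) < 1e-6)
```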

5.2 Markovian Embedding

As in the two-point prior case, we will study the optimal stopping problem (2.5) by embedding it into a Markovian framework. With Corollary 5.2 telling us that the effective learning time r and the posterior mean x fully characterise the posterior distribution, we can now embed the optimal stopping problem (2.5) into the standard Markovian framework by defining the Markovian value function

$$\begin{aligned} v(t,x, r, \sigma ) := \sup _{\tau \in \mathcal {T}_{T-t}} {\tilde{{\mathbb {E}}}} \left[ e^{ \int _0^\tau {\hat{X}}^{t, x, r, \sigma }_{t+s} \,\mathrm {d}s }\right] , \quad (t, x, r, \sigma ) \in [0,T] \times (l, h) \times [0,T] \times \{ \sigma _{1},\ldots , \sigma _{m} \}. \end{aligned}$$
(5.2)

Here the process \({\hat{X}}= {\hat{X}}^{t,x, r, \sigma _{i}}\) evolves according to

$$\begin{aligned} \left\{ \begin{array}{ll} \mathrm {d}{\hat{X}}_{t+s} = \sigma _{1} \psi (r_{t+s}, {\hat{X}}_{t+s}) \,\mathrm {d}s + \frac{\sigma _{1}}{\sigma (t+s)}\psi (r_{t+s}, {\hat{X}}_{t+s}) \,\mathrm {d}B_{t+s}, &{} \quad s \ge 0, \\ \mathrm {d}r_{t+s} = \left( \frac{\sigma _{1}}{\sigma (t+s)}\right) ^{2} \mathrm {d}s, &{} \quad s \ge 0, \\ {\hat{X}}_t = x,\\ r_t = r, \\ \sigma (t)= \sigma _{i}; \end{array} \right. \end{aligned}$$
(5.3)

the given dynamics of \({\hat{X}}\) is a consequence of Corollary 5.2 and the evolution equation of \({\hat{X}}\) in the constant volatility case (see equation (3.9) in [15]). Also, in (5.3), the process \(B_{t}= \int _{0}^{t} \sigma (u) \,\mathrm {d}u + {\hat{W}}_{t}\) is a \({\tilde{{{\mathbb {P}}}}}\)-Brownian motion. Lastly, in (5.2), \({\mathcal {T}}_{T-t}\) denotes the set of stopping times less than or equal to \(T-t\) with respect to the usual augmentation of the filtration generated by \(\{ {\hat{X}}^{t, x, r, \sigma _{i}}_{t+s}\}_{s \ge 0}\) and \(\{ \sigma (t+s)\}_{s \ge 0}\).
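To make the dynamics (5.3) concrete, here is a minimal Euler–Maruyama sketch of the pair \(({\hat{X}}, r)\) under a two-state regime-switching volatility. It assumes a two-point prior on \(\{l, h\}\), for which \(\psi (r,x) = (x-l)(h-x)/\sigma _{1}\) in closed form; all parameters are hypothetical, and the boundary clipping is a discretisation safeguard, not part of the model.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical set-up: two-point prior on {l, h} and a two-state volatility chain
l, h = -0.3, 0.3
sigma = np.array([0.2, 0.5])             # states sigma_1 < sigma_2
lam = np.array([1.0, 2.0])               # jump intensities out of each state
T, n = 1.0, 2000
dt = T / n

def psi(r, x):
    # For a two-point prior, Var(X | posterior mean = x) = (x - l)(h - x),
    # so psi(r, x) = (x - l)(h - x)/sigma_1, independent of r.
    return (x - l) * (h - x) / sigma[0]

x, r, i = 0.0, 0.0, 0                    # posterior mean, learning time, vol state
for _ in range(n):
    dB = np.sqrt(dt) * rng.standard_normal()
    x += sigma[0]*psi(r, x)*dt + (sigma[0]/sigma[i])*psi(r, x)*dB
    x = min(max(x, l), h)                # guard against Euler overshoot
    r += (sigma[0] / sigma[i])**2 * dt   # effective learning time increment
    if rng.random() < lam[i] * dt:       # regime switch of the volatility chain
        i = 1 - i

# X-hat stays in (l, h); r grows at rate between (sigma_1/sigma_2)^2 and 1
print(l <= x <= h, (sigma[0]/sigma[1])**2 * T <= r <= T)
```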

Remark 5.5

Let us note that, in light of the observations of Sect. 5.1, if the regime-switching volatility were replaced by a different stochastic volatility process, the same Markovian embedding (5.2) could still be useful for the study of the altered problem.

5.3 Outline of the Approximation Procedure and Main Results

Under an arbitrary prior, the approximation procedure of Sect. 3 can also be applied; however, the operators J and \(J^{(n)}\) need to be redefined suitably. We redefine the operator J to act on a function \(f:[0, T] \times (l,h) \times [0,T] \rightarrow {\mathbb {R}}\) as

$$\begin{aligned}&(J f)(t, x, r, \sigma _{i}) \nonumber \\&\quad := \sup _{\tau \in \mathcal {T}_{T-t}} {\tilde{{\mathbb {E}}}} \left[ e^{ \int _0^\tau {\hat{X}}^{t, x, r, \sigma _{i}}_{t+s} \,\mathrm {d}s } \mathbb {1}_{\{\tau < \eta ^{t}_{i} \}} + e^{ \int _0^{\eta ^{t}_{i}} {\hat{X}}^{t, x, r, \sigma _{i}}_{t+s} \,\mathrm {d}s }f\left( t+\eta ^{t}_{i}, {\hat{X}}^{t, x, r, \sigma _{i}}_{t+\eta ^{t}_{i}}, r^{t,r}_{t+\eta ^{t}_{i}}\right) \mathbb {1}_{\{\tau \ge \eta ^{t}_{i}\}} \right] \nonumber \\&\quad = \sup _{\tau \in \mathcal {T}_{T-t}} {\tilde{{\mathbb {E}}}} \left[ e^{ \int _0^\tau ({\hat{X}}^{t, x, r, \sigma _{i}}_{t+s} -\lambda _{i}) \,\mathrm {d}s } + \int _{0}^{\tau } e^{ \int _0^u ({\hat{X}}^{t, x, r, \sigma _{i}}_{t+s} -\lambda _{i}) \,\mathrm {d}s }f \left( t+u, {\hat{X}}^{t, x, r, \sigma _{i}}_{t+u}, r^{t,r}_{t+u} \right) \,\mathrm {d}u \right] \nonumber \\ \end{aligned}$$
(5.4)

and then the operator \(J_{i}\) as \(J_{i} f := (Jf)(\cdot , \cdot , \sigma _{i})\). Intuitively, \((J_{i} f)\) represents a Markovian value function corresponding to optimal stopping before \(t+ \eta ^{t}_{i}\), i.e. before the first volatility change after t, when, at time \(t+\eta ^{t}_{i} < T\), the payoff \(f\left( t+\eta ^{t}_{i}, {\hat{X}}^{{t, x, r, \sigma _{i}}}_{t+\eta ^{t}_{i}}, r^{t,r}_{t+\eta ^{t}_{i}} \right) \) is received, provided stopping has not occurred yet. The underlying process in the optimal stopping problem \(J_{i} f\) is the diffusion \((t, {\hat{X}}_{t}, r_{t})\).

The majority of the results in Sects. 3 and 4 generalise nicely to the arbitrary prior case. Proposition 3.1 extends word for word; the proofs are analogous, except that the second property of \(\psi \) from [15, Proposition 3.6] needs to be used for Proposition 3.1 (iv). In addition, if f is decreasing in r, then \(J_{i}f\) is decreasing in r; this is proved by a Bermudan approximation argument as in Proposition 3.1 (iv), using the time decay of \(\psi \) from [15, Proposition 3.6]. As a result, for \(f :[0,T] \times (l,h) \times [0,T] \rightarrow {\mathbb {R}}\) that is decreasing in the first and third variables as well as increasing (though not too fast as \(x \nearrow \infty \)) and convex in the second, there exists a stopping boundary \(b^{f}_{\sigma _{i}} : [0,T)\times [0, T) \rightarrow [l,0]\), increasing in both variables, such that the continuation region \( \mathcal {C}^{f}_{i} := \{ (t, x, r) \in [0,T) \times (l,h) \times \big [0, T \big ) \,:\, (J_{i}f) (t,x, r) > 1 \}\) (optimality shown as in Proposition 3.2) satisfies

$$\begin{aligned} \mathcal {C}^{f}_{i} = \{ (t, x, r) \in [0,T) \times (l,h) \times \left[ 0, T \right) \,:\, x > b^{f}_{\sigma _{i}}(t,r) \}. \end{aligned}$$

In addition, each pair \((J_{i}f,b_{\sigma _{i}}^{f})\) solves the free-boundary problem

$$\begin{aligned} \left\{ \begin{array}{ll} \partial _{t}u(t,x,r) + \left( \frac{\sigma _{1}}{\sigma _{i}} \right) ^{2} \partial _{r} u(t,x,r) + {\sigma _{1}} \psi (r, x) \partial _{x} u(t,x,r) \\ \quad +\frac{1}{2} \left( \frac{\sigma _{1}}{\sigma _{i}}\right) ^{2}\psi (r, x)^{2} \partial _{xx} u(t,x,r) +(x-\lambda _{i})u(t,x,r)\\ \quad +\,\lambda _{i} f(t,x,r) = 0, \text { if }x > b_{\sigma _{i}}^{f}(t,r), &{}\\ u(t,x,r) = 1, \text { if } x \le b_{\sigma _{i}}^{f}(t,r) \text { or } t = T. &{} \end{array} \right. \end{aligned}$$

With the operator \(J^{(n)}\) redefined as

$$\begin{aligned} (J^{(n)} f)(t,x, r, \sigma _{i}):= & {} \sup _{\tau \in \mathcal {T}_{T-t}} {\tilde{{\mathbb {E}}}} \bigg [ e^{\int _0^\tau {\hat{X}}^{t,x, r, \sigma _i}_{t+s} \,\mathrm {d}s } \mathbb {1}_{\{ \tau < \xi ^{t}_{n} \}} \\&+\,e^{\int _0^{\xi ^{t}_{n}} {\hat{X}}^{t, x, r, \sigma _i}_{t+s} \,\mathrm {d}s }f(t+\xi ^{t}_{n}, {\hat{X}}^{t,x, r, \sigma _i}_{t+\xi ^{t}_{n}}, r^{t,r}_{t +\xi ^{t}_{n}}) \mathbb {1}_{\{\tau \ge \xi ^{t}_{n}\}} \bigg ], \end{aligned}$$

the crucial Proposition 3.5 holds word for word. Furthermore, the sequence of functions \(\{ J^{(n)} 1 \}_{n \ge 0}\) is increasing and bounded from below by 1, with each \(J^{(n)} 1 \) decreasing in the first and third variables as well as increasing and convex in the second variable x. As desired,

$$\begin{aligned} J^{(n)} 1 \nearrow v \quad \text { pointwise as } n \nearrow \infty , \end{aligned}$$

so the value function v is decreasing in the first and third variables as well as increasing and convex in the second variable; again, v is a fixed point of \({{\tilde{J}}}\). Moreover, the uniform approximation error result (3.20) also holds for compactly supported priors (with an obvious reinterpretation \(h = \sup (\mathop {\mathrm {supp}}\nolimits \mu )\)). We can also show (by a similar argument as in Theorem 4.2 (iii)) that

$$\begin{aligned} b_{\sigma _{i}}^{g_i^{(n)}} \searrow b_{\sigma _{i}} \quad \text {pointwise as } n \nearrow \infty , \end{aligned}$$

where \(g_i^{(n)} := \sum _{j\ne i} \frac{\lambda _{ij}}{\lambda _{i}}J^{(n)}_{j}1\) and the limit \(b_{\sigma _{i}}\) is a function increasing in both variables. Lastly, by similar arguments as before, the stopping time

$$\begin{aligned} \tau ^{*}= \inf \{ s \in [0, T-t) \,:\, {\hat{X}}^{t,x,r,\sigma }_{t+s} \le b_{\sigma (t+s)}(t+s, r_{t+s}) \} \wedge (T-t)\, \end{aligned}$$

is optimal for the liquidation problem (2.5).

Remark 5.6

The higher the volatility, the slower the learning about the drift, so under Assumption 4.5 it is tempting to expect that the value function v is decreasing in the volatility variable, and hence that the stopping boundaries satisfy \(b_{\sigma _{1}} \le b_{\sigma _{2}} \le \cdots \le b_{\sigma _{m}}\), also in the case of an arbitrary prior distribution \(\mu \). Regrettably, the author has not been able to prove (or disprove) such monotonicity in volatility.