1 Introduction

The question of how to optimally manage the debt-to-GDP ratio (also called the “debt ratio”) of a country has become particularly important in recent years. Indeed, concurrently with the financial crisis that started in 2007, debt-to-GDP ratios in developed countries exploded from an average of 53% to circa 80%. Clearly, the debt management policy of a government depends strongly on the underlying macroeconomic conditions; these affect, for example, the growth rate of the GDP, which in turn determines the growth rate of the debt-to-GDP ratio of a country. However, in practice, it is typically neither possible to measure the growth rate of the GDP in real time, nor can one directly observe the underlying business cycles. On August 24, 2018, during a speech at “Changing Market Structure and Implications for Monetary Policy” – a symposium sponsored by the Federal Reserve Bank of Kansas City in Jackson Hole, Wyoming – the chairman of the Federal Reserve Jerome H. Powell said:

…In conventional models of the economy, major economic quantities such as inflation, unemployment and the growth rate of the gross domestic product fluctuate around values that are considered “normal” or “natural” or “desired”. The FOMC (Federal Open Market Committee) has chosen a 2 percent inflation objective as one of these desired values. The other values are not directly observed, nor can they be chosen by anyone

Following an idea that dates back to Hamilton [38], we assume in this paper that the GDP growth rate of a country is modulated by a continuous-time Markov chain that is not directly observable. The Markov chain has \(Q\geq 2\) states modelling the different business cycles of the economy, so that a shift in the macroeconomic conditions induces a change in the value of the growth rate of the GDP. The government can observe only the current levels of the debt-to-GDP ratio and of a macroeconomic indicator. The latter might be e.g. one of the so-called “Big Four” which are usually considered proxies of the industrial production index, hence of the business conditions. These indicators constitute the Conference Board’s Index of Coincident Indicators; they are employment in non-agricultural businesses, industrial production, real personal income less transfers, and real manufacturing and trade sales. We refer to e.g. Stock and Watson [60], where the authors present a wide range of economic indicators and examine the forecasting performance of various of them in the recession of 2001.

Motivated by the aforementioned recent debt crisis, we consider a government whose priority is to return debt to less dangerous levels, to move away from the dark corners (O. Blanchard, former chief economist of the International Monetary Fund (2014)), e.g. through fiscal policies or by imposing austerity policies in the form of spending cuts. In our model, we thus preclude the possibility for the government to increase the level of the debt ratio, and we neglect any potential benefit resulting from holding debt, even if we acknowledge that a policy of debt reduction might not always be the most sensible approach, as also observed by Ostry et al. [55] (see also the discussion in Remark 2.6 below). We further assume that the loss resulting from holding debt can be measured through a convex and nondecreasing cost function, and that the debt ratio is instantaneously affected by any reduction. Such reductions need not be performed at finite rates; lump-sum actions are also allowed, and the cumulative amount of the debt ratio’s decrease is the government’s control variable. Any decrease of the debt ratio results in proportional costs, and the government aims at choosing a debt-reduction policy that minimises the total expected loss of holding debt plus the total expected costs of interventions on the debt ratio. In line with recent papers on stochastic control methods for optimal debt management (see Cadenillas and Huamán-Aguilar [8, 9], Ferrari [31] and Ferrari and Rodosthenous [32]), we model the previous problem as a singular stochastic control problem. However, differently from all previous works, our problem is formulated in a partial observation setting, thus leading to a completely different mathematical analysis. In our model, the observations consist of the debt ratio and the macroeconomic indicator. The debt ratio is a linearly controlled geometric Brownian motion whose drift is given in terms of the GDP growth rate, which is modulated by the unobservable continuous-time Markov chain \(Z\). The macroeconomic indicator is a real-valued jump-diffusion which is correlated with the debt ratio process, and whose drift, jump intensity and jump sizes depend on \(Z\).

Our contributions. Our study of the optimal debt reduction problem is performed through three main steps.

First of all, via advanced filtering techniques with mixed-type observations, we reduce the original problem to an equivalent problem under full information, the so-called separated problem. This is a classical procedure used to handle optimal stochastic control problems under partial information (see e.g. Fleming and Pardoux [33], Bensoussan [4, Chap. 7.1] and Ceci and Gerardi [12]). The filtering problem consists in characterising the conditional distribution of the unobservable Markov chain \(Z\) at any time \(t\), given observations up to time \(t\). The case of diffusion observations has been widely studied in the literature, and textbook treatments can be found in Elliott et al. [29, Chap. 8], Kallianpur [44, Chap. 8] and Liptser and Shiryaev [49, Chap. 8]. There are also known results for pure-jump observations (see e.g. Brémaud [7, Chap. IV], Ceci and Gerardi [13, 14], Kliemann et al. [47] and references therein). More recently, filtering problems with mixed-type information which involve pure-jump processes and diffusions have been studied by Ceci and Colaneri [15, 16], among others.

Notice that the economic and financial literature also features models under partial observation where a reduction to a complete information setting is performed via filtering techniques and the problem is then solved through the so-called “two-step procedure”. We refer e.g. to the literature on portfolio selection in the seminal papers by Detemple [25] and Gennotte [36] (in a continuous-time setting, with diffusive observations leading to a Gaussian filter process); to Veronesi [62] for an equilibrium model with uncertain dividend drift in the field of market over- and under-reaction to information; and to the more recent work by Luo [51], where different uncertainty models are analysed in a Gaussian setting with the aim of studying strategic consumption–portfolio rules when dealing with precautionary savings. Generally, the economic and financial literature refers to well-known results in filtering theory, either in the case of diffusive observation processes such as additive Gaussian white noise (e.g. Detemple [25], Gennotte [36] and Luo [51]) or in the case of pure-jump observations (see Bäuerle and Rieder [3] and Ceci [11], among others). A few papers consider mixed-type information (see e.g. Callegaro et al. [10] and Frey and Schmidt [35], among others). In our paper, we deal with a more general setting with a two-dimensional observation process allowing jumps, for which known results cannot be invoked.

Usually, two filtering approaches are followed: the so-called reference probability approach (see the seminal paper by Zakai [65] and the more recent papers Frey and Runggaldier [34], Ceci and Colaneri [16] and Colaneri et al. [18], among others) and the innovation approach (see e.g. Brémaud [7, Chap. IV.1], Ceci and Colaneri [15], Eksi and Ku [27] and Frey and Schmidt [35]). Due to the general structure of our observations’ dynamics, the innovation approach is more suitable to handle our filtering problem, and this leads to the so-called Kushner–Stratonovich equation. In particular, it turns out that the dynamics of our filter and of the observation process are coupled, thus making the proof of uniqueness of the solution to the Kushner–Stratonovich system more delicate. After providing such a result, we are then able to show that the original problem under partial observation and the separated problem are equivalent, that is, they share the same value and the same optimal control.

Secondly, we exploit the convex structure of the separated problem and provide a general probabilistic verification theorem. This result – which is in line with findings in Baldursson and Karatzas [2], De Angelis et al. [20] and Ferrari [31], among others – relates the optimal control process to the solution to an auxiliary optimal stopping problem. Moreover, it proves that the value function of the separated problem is the integral – with respect to the controlled state variable – of the value function of the optimal stopping problem. The stopping problem thus gives the optimal timing at which debt should be marginally reduced.

Finally, by specifying a setting in which the continuous-time Markov chain faces only two regimes (a fast growth or slow growth phase) and the macroeconomic indicator is a suitable diffusion process, we are able to characterise the optimal debt reduction policy. In this framework, the filter process is a two-dimensional process \((\pi _{t},1-\pi _{t})_{t\geq 0}\), where \(\pi _{t}\) is the conditional probability at time \(t\) that the economy enjoys the fast growth phase. We prove that the optimal control prescribes to keep at any time the debt ratio below an endogenously determined curve that is a function of the government’s belief about the current state of the economy. Such a debt ceiling is the free boundary of the fully two-dimensional optimal stopping problem that is related to the separated problem (in the sense of the previously discussed verification theorem). By using almost exclusively probabilistic arguments, we are able to show that the value function of the auxiliary optimal stopping problem is a \(C^{1}\)-function of its arguments, and thus enjoys the so-called smooth-fit property. Moreover, the free boundary is a continuous, bounded and increasing function of the filter process. This last monotonicity property has also a clear economic interpretation: the more the government believes that the economy enjoys a regime of fast growth, the less strict the optimal debt reduction policy should be.

As a remarkable byproduct of the regularity of the value function of the optimal stopping problem, we also obtain that the value function of the singular stochastic control problem is a classical solution to its associated Hamilton–Jacobi–Bellman (HJB) equation. The latter takes the form of a variational inequality involving an elliptic second-order partial differential equation (PDE). It is worth noticing that the \(C^{2}\)-regularity of the value function implies the validity of a second-order principle of smooth fit, usually observed in one-dimensional problems.

We believe that the study of the auxiliary fully two-dimensional optimal stopping problem is a valuable contribution to the literature on its own. Indeed, while the literature on one-dimensional optimal stopping problems is very rich, the problem of characterising the optimal stopping rule in multi-dimensional settings has been so far rarely explored in the literature (see the recent work by Christensen et al. [17], De Angelis et al. [20] as well as Johnson and Peskir [43] among the very few papers dealing with multi-dimensional stopping problems). This discrepancy is due to the fact that a standard guess-and-verify approach, based on the construction of an explicit solution to the variational inequality arising in the considered optimal stopping problem, is no longer applicable in multi-dimensional settings where the variational inequality involves a PDE rather than an ordinary differential equation.

Related literature. As already noticed above, our paper is placed among those recent works addressing the problem of optimal debt management via continuous-time stochastic control techniques. In particular, Cadenillas and Huamán-Aguilar [8, 9] model an optimal debt reduction problem as a one-dimensional control problem with singular and bounded-velocity controls, respectively. In the work by Ferrari and Rodosthenous [32], the government is allowed to increase and decrease the current level of the debt ratio, and the interest rate on debt is modulated by a continuous-time observable Markov chain. The mathematical formulation leads to a one-dimensional bounded-variation stochastic control problem with regime switching. In the model by Ferrari [31], when optimally reducing the debt ratio, the government takes into consideration the evolution of the inflation rate of the country. The latter evolves as an uncontrolled diffusion process and affects the growth rate of the debt ratio, which is a process of bounded variation. In this setting, the debt reduction problem is formulated as a two-dimensional singular stochastic control problem whose HJB equation involves a second-order linear parabolic partial differential equation. All the previous papers are formulated in a full information setting, while ours is under partial observation.

The literature on singular stochastic control problems under partial observation is also still quite limited. Theoretical results on the PDE characterisation of the value function of a two-dimensional optimal correction problem under partial observation are obtained by Menaldi and Robin [53], whereas a general maximum principle for a not necessarily Markovian singular stochastic control problem under partial information has more recently been derived by Øksendal and Sulem [54]. We also refer to De Angelis [19] and Décamps and Villeneuve [23], who provide a thorough study of the optimal dividend strategy in models in which the surplus process evolves as a drifted Brownian motion with unknown drift that can take only two constant values, with given probabilities.

Outline of the paper. The rest of the paper is organised as follows. In Sect. 2, we introduce the setting and formulate the problem. The reduction of the problem under partial observation to the separated problem is performed in Sect. 3; in particular, the filtering results are presented in Sect. 3.1. The probabilistic verification theorem connecting the separated problem to one of optimal stopping is then proved in Sect. 3.3. In Sect. 4, we consider a case study in which the economy faces only two regimes. Its solution, presented in Sects. 4.2 and 4.3, hinges on the study of a two-dimensional optimal stopping problem that is performed in Sect. 4.1. Finally, Appendix A collects the proofs of some technical filtering results.

2 Setting and problem formulation

2.1 The setting

Consider the complete filtered probability space \((\Omega ,\mathcal{F},\mathbb{F},\mathbb{P})\) capturing all the uncertainty of our setting. Here, \(\mathbb{F}:={(\mathcal{F}_{t})}_{t \ge 0}\) denotes the full information filtration. We suppose that it satisfies the usual hypotheses of completeness and right-continuity.

We denote by \(Z\) a continuous-time finite-state Markov chain describing the different states of the economy. For \(Q\geq 2\), let \(S := \{1, 2,\dots , Q\}\) be the state space of \(Z\) and \((\lambda _{ij})_{1 \leq i,j \leq Q}\) its generator matrix. Here \(\lambda _{ij}\), \(i \neq j\), gives the intensity of a transition from state \(i\) to state \(j\) and is such that \(\lambda _{ij} \geq 0\) for \(i \neq j\) and \(\sum _{j=1, j\ne i}^{Q} \lambda _{ij} = - \lambda _{ii}\). For any time \(t\geq 0\), \(Z_{t}\) is \(\mathcal{F}_{t}\)-measurable.

In the absence of any intervention by the government, we assume that the (uncontrolled) debt-to-GDP ratio evolves as

$$ d X^{0}_{t} = \big(r - g(Z_{t})\big) X^{0}_{t} dt + \sigma X^{0}_{t} dW_{t} , \qquad X^{0}_{0}=x \in (0,\infty ), $$
(2.1)

where \(W\) is a standard \(\mathbb{F}\)-Brownian motion on \((\Omega , \mathcal{F})\) independent of \(Z\), \(r\geq 0\) and \(\sigma >0\) are constants and \(g: S \rightarrow \mathbb{R}\). The constant \(r\) is the real interest rate on debt, \(\sigma \) is the debt’s volatility and \(g(i) \in \mathbb{R}\) is the rate of the GDP growth when the economy is in state \(i\in S\).

It is clear that (2.1) admits a unique strong solution, and when needed, we denote it by \(X^{x,0}\) for any \(x>0\). The current level of the debt-to-GDP ratio is known to the government at any time \(t\), and \(X^{x,0}\) is therefore the first component of the so-called observation process.

The government also observes a macroeconomic stochastic indicator \(\eta \), e.g. one of the so-called “Big Four”, which we interpret as a proxy of the business conditions. We assume that \(\eta \) is a jump-diffusion process solving the stochastic differential equation

$$\begin{aligned} \begin{aligned} d \eta _{t} &= b_{1}(\eta _{t}, Z_{t}) dt + \sigma _{1}(\eta _{t}) dW_{t} + \sigma _{2}(\eta _{t}) dB_{t} + c(\eta _{t-}, Z_{t-}) dN_{t}, \\ \eta _{0}&= q \in \mathcal{I}, \end{aligned} \end{aligned}$$
(2.2)

where \(\sigma _{1} >0\), \(\sigma _{2}>0\) and \(b_{1}\), \(c\) are measurable functions of their arguments and \(\mathcal{I}\subseteq \mathbb{R}\) is the state space of \(\eta \). Here, \(B\) is an \(\mathbb{F}\)-standard Brownian motion independent of \(W\) and \(Z\). Moreover, \(N\) is an \(\mathbb{F}\)-adapted point process, without common jump times with \(Z\), independent of \(W\) and \(B\). The predictable intensity of \(N\) is denoted by \((\lambda ^{N}(Z_{t-}))_{t\geq 0}\) and depends on the current state of the economy, with \(\lambda ^{N}(\,\cdot \,)>0\) being a measurable function. From now on, we make the following assumptions that ensure strong existence and uniqueness of the solution to (2.2) (by a standard localising argument, one can indeed argue as e.g. in the proof of Xi and Zhu [64, Theorem 2.1], performed in a setting more general than ours by employing Ikeda and Watanabe [40, Theorem IV.9.1]).

Assumption 2.1

The functions \(b_{1}: \mathcal{I}\times S \to \mathbb{R}\), \(\sigma _{1}: \mathcal{I}\to (0,\infty )\), \(\sigma _{2}: \mathcal{I}\to (0,\infty )\) and \(c: \mathcal{I}\times S \to \mathbb{R}\) are such that for any \(i \in S\),

(i) (continuity) \(b_{1}(\cdot , i)\), \(\sigma _{1}(\cdot )\), \(\sigma _{2}(\cdot )\) and \(c(\cdot , i)\) are continuous;

(ii) (local Lipschitz conditions) for any \(R>0\), there exists a constant \(L_{R}>0\) such that if \(|q|< R\), \(|q'|< R\), \(q, q' \in \mathcal{I}\), then

$$\begin{aligned} &|b_{1}(q, i) - b_{1}(q', i)| + |\sigma _{1}(q)- \sigma _{1}(q')| + | \sigma _{2}(q)- \sigma _{2}(q')| + |c(q, i) - c(q', i)| \\ &\leq L_{R}|q-q'|; \end{aligned}$$

(iii) (growth conditions) there exists a constant \(C>0\) such that

$$ |b_{1}(q, i)|^{2} + |\sigma _{1}(q)|^{2} + |\sigma _{2}(q)|^{2} + |c(q, i)|^{2} \leq C (1 + |q|^{2}). $$

The dynamics proposed in (2.2) is of jump-diffusive type and allows size and intensity of the jumps to be affected by the state of the economy. It is therefore flexible enough to describe a large class of stochastic factors which may exhibit jumps.
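As an illustration of this flexibility, the following Python sketch simulates one path of the observation pair \((X^{0}, \eta )\) of (2.1)–(2.2) with two regimes, a regime-dependent jump intensity \(\lambda ^{N}\) and regime-dependent jump sizes. All numerical values and functional forms below are illustrative assumptions and not part of the model specification.

```python
# Illustrative Euler-type simulation of the observation pair (X^0, eta) in
# (2.1)-(2.2) with Q = 2 regimes; all parameters and functional forms are
# assumptions made for this sketch only.
import numpy as np

rng = np.random.default_rng(1)
T, n = 10.0, 10_000
dt = T / n

r, sigma = 0.02, 0.15                  # interest rate on debt, debt volatility
g = np.array([0.04, -0.01])            # GDP growth rate per regime
lamN = np.array([1.0, 3.0])            # jump intensity of N per regime
switch_rate = np.array([0.5, 0.7])     # rate of leaving regime 1, resp. 2

def b1(q, i):                          # drift of eta (assumed mean-reverting)
    return (0.5 if i == 0 else -0.5) - q

def sigma1(q): return 0.2              # diffusion coefficients (constants here)
def sigma2(q): return 0.3

def c(q, i):                           # regime-dependent jump size of eta
    return 0.5 if i == 0 else -1.0

Z, X, eta = 0, 1.0, 0.0                # start in regime 1, X^0_0 = 1, eta_0 = 0
for _ in range(n):
    dW = np.sqrt(dt) * rng.standard_normal()
    dB = np.sqrt(dt) * rng.standard_normal()
    dN = 1 if rng.random() < lamN[Z] * dt else 0      # Bernoulli step for N
    X += (r - g[Z]) * X * dt + sigma * X * dW
    eta += b1(eta, Z) * dt + sigma1(eta) * dW + sigma2(eta) * dB + c(eta, Z) * dN
    if rng.random() < switch_rate[Z] * dt:            # switch of the hidden chain
        Z = 1 - Z

print(f"X_T = {X:.3f}, eta_T = {eta:.3f}, final regime = {Z + 1}")
```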

The observation filtration \(\mathbb{H} = {(\mathcal{H}_{t})}_{t \ge 0}\) is defined as

$$ \mathbb{H} := \mathbb{F}^{X^{0}} \vee \mathbb{F}^{\eta }, $$

where \(\mathbb{F}^{X^{0}}\) and \(\mathbb{F}^{\eta }\) denote the natural filtrations generated by \(X^{0}\) and \(\eta \), respectively, as usual augmented by ℙ-null sets. Clearly, \((X^{0}, \eta )\) is adapted to both ℍ and \(\mathbb{F}\), and

$$ \mathbb{H} \subseteq \mathbb{F}. $$

The above inclusion means that the government cannot directly observe the state \(Z\) of the economy, but that this has to be inferred through the observation of \((X^{0}, \eta )\). We are therefore working in a partial information setting.

2.2 The optimal debt reduction problem

The government can reduce the level of the debt-to-GDP ratio by intervening on the primary budget balance (i.e., the overall difference between government revenues and spending), for example through austerity policies in the form of spending cuts. When doing so, the debt ratio dynamics becomes

$$ dX^{\nu }_{t}= \big(r - g(Z_{t})\big) X^{\nu }_{t} dt + \sigma X^{\nu }_{t} dW_{t} - d \nu _{t}, \qquad X^{\nu }_{0-}=x >0. $$
(2.3)

The process \(\nu \) is the control that the government chooses based on the information at its disposal. More precisely, \(\nu _{t}\) defines the cumulative reduction of the debt-to-GDP ratio made by the government up to time \(t\), and \(\nu \) is therefore a nondecreasing process belonging to the set

$$\begin{aligned} \mathcal{M}(x, \underline{y},q):=\big\{ \nu :\Omega \times \mathbb{R}_{+} \rightarrow \mathbb{R}_{+} &: {\big(\nu _{t}(\omega ) := \nu (\omega ,t) \big)}_{t \ge 0} \ \textrm{is nondecreasing}, \\ &\phantom{:=}\text{right-continuous, $\mathbb{H}$-adapted and such that} \\ & \phantom{:=}\text{$X_{t}^{\nu } \ge 0$ for every $t \ge 0$, $X^{\nu }_{0-}=x$}, \\ &\phantom{:=}\text{$\mathbb{P}[Z_{0}=i]=y_{i}$, $i \in S$, $\eta _{0}=q$ a.s.} \big\} , \end{aligned}$$

for any given and fixed initial value \(x \in (0, \infty )\) of \(X^{\nu }\), initial value \(q \in \mathcal{I}\) of \(\eta \), and \({\underline{y}} \in \mathcal{Y}\). Here

$$ \mathcal{Y} := \bigg\{ {\underline{y}} = (y_{1}, \dots , y_{Q}): y_{i} \in [0,1], i=1, \dots , Q, \sum _{i=1}^{Q} y_{i} =1 \bigg\} $$

is the probability simplex on \(\mathbb{R}^{Q}\), representing the space of initial distributions of the process \(Z\). From now on, we set \(\nu _{0-}=0\) a.s. for any \(\nu \in \mathcal{M}(x, \underline{y},q)\).

Remark 2.2

Notice that in the definition of the set ℳ above, as well as in (2.4) and in (2.5) below, we have stressed the dependency on the initial data \((x,\underline{y},q)\) just for notational convenience, not to indicate any Markovian nature of the considered problem, which is in fact not given.

For any \((x,\underline{y},q) \in (0,\infty ) \times \mathcal{Y} \times \mathcal{I}\) and \(\nu \in \mathcal{M}(x,\underline{y},q)\), there exists a unique solution to (2.3), denoted by \(X_{t}^{x,\nu }\), that is given by

$$ X_{t}^{x,\nu } = X^{1,0}_{t}\left (x - \int _{0}^{t} \frac{d\nu _{s}}{X^{1,0}_{s}}\right ), \quad t \geq 0, \qquad X_{0-}^{x, \nu }=x, $$

where

$$X^{1,0}_{t} = \displaystyle e^{\int _{0}^{t} (r-g(Z_{s}) ) ds - { \frac{1}{2}} \sigma ^{2} t + \sigma W_{t}}, \qquad t \geq 0. $$

Here and in the rest of this paper, we use the notation \(\int _{0}^{t} (\,\cdot \,)d\nu _{s} = \int _{[0,t]} (\,\cdot \,) d \nu _{s}\) for the Lebesgue–Stieltjes integral with respect to the random measure \(d\nu _{\cdot }\) induced by the nondecreasing process \(\nu \) on \([0,\infty )\).
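As a quick check of this formula (a sketch based only on the integration by parts formula), note that \(X^{1,0}\) is continuous while the process \(t \mapsto x - \int _{0}^{t} d\nu _{s}/X^{1,0}_{s}\) has finite variation, so that their covariation vanishes and

$$ dX_{t}^{x,\nu } = \bigg(x - \int _{0}^{t-} \frac{d\nu _{s}}{X^{1,0}_{s}}\bigg) dX^{1,0}_{t} - X^{1,0}_{t}\, \frac{d\nu _{t}}{X^{1,0}_{t}} = \big(r - g(Z_{t})\big) X_{t-}^{x,\nu } dt + \sigma X_{t-}^{x,\nu } dW_{t} - d\nu _{t}, $$

which agrees with (2.3) since \(X^{x,\nu }_{t-}\) and \(X^{x,\nu }_{t}\) differ only at the (at most countably many) jump times of \(\nu \).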

Remark 2.3

The dynamics (2.3) might be justified in the following way. Suppose that the public debt (in real terms) \(D\) and the GDP \(Y\) follow the classical dynamics

$$\begin{aligned} \left \{ \textstyle\begin{array}{rlr} dD_{t} &= r D_{t} dt - d \xi _{t}, \qquad & D_{0-}=d > 0, \\ dY_{t}&= g(Z_{t}) Y_{t} dt + \sigma Y_{t} d\widetilde{W}_{t}, \qquad & Y_{0}= y>0, \\ \end{array}\displaystyle \right . \end{aligned}$$

where \(\xi _{t}\) is the cumulative real budget balance up to time \(t\) and \(\widetilde{W}\) is a Brownian motion. An easy application of Itô’s formula and a change of measure then gives that the ratio \(X:=D/Y\) evolves as in (2.3), upon setting \(\nu _{\cdot }:=\int _{0}^{\cdot } d\xi _{s}/Y_{s}\) and \(x:=d/y\).
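For completeness, here is a sketch of that computation. Since \(D\) has finite variation and \(Y\) is continuous, Itô’s formula applied to \(X = D/Y\) gives

$$ dX_{t} = \big(r - g(Z_{t}) + \sigma ^{2}\big) X_{t} dt - \sigma X_{t} d\widetilde{W}_{t} - \frac{d\xi _{t}}{Y_{t}}, $$

and passing to the equivalent probability measure under which \(W_{t} := \sigma t - \widetilde{W}_{t}\) is a Brownian motion (Girsanov’s theorem) absorbs the extra \(\sigma ^{2} X_{t} dt\) term into the stochastic integral, yielding (2.3) with \(\nu _{\cdot } := \int _{0}^{\cdot } d\xi _{s}/Y_{s}\) and \(x := d/y\).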

The government aims at reducing the level of the debt ratio. When holding a level \(X_{t}=x\) at time \(t\geq 0\) while the state of the economy is \(Z_{t}=i\), the government incurs an instantaneous cost (loss) \(h(x,i)\). This may be interpreted as an opportunity cost resulting from the crowding out of private investments, reduced room for financing public investments, and a tendency to suffer low subsequent growth (see the technical report [30] and the work by Woo and Kumar [63], among others). The cost function \(h:\mathbb{R} \times S \to \mathbb{R}_{+}\) fulfils the following requirements (see also Cadenillas and Huamán-Aguilar [8] and Ferrari [31]).

Assumption 2.4

(i) For any \(i\in S\), the mapping \(x \mapsto h(x,i)\) is strictly convex, continuously differentiable and nondecreasing on \(\mathbb{R}_{+}\). Moreover, \(h(0,i)=0\).

(ii) For any given \(x\in (0,\infty )\) and \(i\in S\), one has

$$ \mathbb{E}\bigg[\int _{0}^{\infty } e^{-\rho t} h (X_{t}^{x,0}, i ) dt \bigg] + \mathbb{E}\bigg[\int _{0}^{\infty } e^{-\rho t} X_{t}^{1,0} h_{x} (X_{t}^{x,0}, i ) dt\bigg] < \infty . $$

Remark 2.5

1) As an example, the power function given by \(h(x,i) = \vartheta _{i} x^{n_{i} + 1}\) for \((x,i) \in [0,\infty )\times S\), with \(\vartheta _{i}>0\) and \(n_{i} \geq 1\), satisfies Assumption 2.4 (for a suitable \(\rho >0\) taking care of requirement (ii) above; a sufficient condition is sketched after this remark). Inspired by the careful discussion of Cadenillas and Huamán-Aguilar in [8, Sect. 2], \(n_{i}\) is a subjective regime-dependent parameter capturing the government’s aversion/intolerance towards the debt ratio. On the other hand, the parameter \(\vartheta _{i}\) can be thought of as a measure (in monetary terms) of the importance of debt: the better the debt’s characteristics (for example, a larger portion of debt is domestic rather than external, cf. Japan), the lower the parameter \(\vartheta _{i}\) (relative to the marginal cost of intervention; see below). A power cost function like the one above is in line with the usual quadratic loss function adopted in the economic literature (see the influential paper by Tabellini [61], among many others).

2) Notice that the integrability conditions in Assumption 2.4 (ii) ensure that the expected cost and marginal cost of having debt and not intervening on it are finite for any possible regime of the economy. In particular, the finiteness of the second expectation in Assumption 2.4 (ii) guarantees that the stopping functional considered in Sect. 3.3 below is finite.
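For the power cost in part 1) of the above remark, a simple sufficient condition for requirement (ii) of Assumption 2.4 can be obtained from the explicit expression of \(X^{1,0}\) (this is only a rough bound, given here for illustration): since \(\int _{0}^{t}(r - g(Z_{s})) ds \leq t \max _{j \in S}(r - g(j))\) and \(W\) is independent of \(Z\), one has \(\mathbb{E}[(X^{1,0}_{t})^{n_{i}+1}] \leq \exp ( ((n_{i}+1) \max _{j \in S}(r-g(j)) + \frac{1}{2} n_{i}(n_{i}+1) \sigma ^{2} ) t )\), and both expectations in (ii) reduce, up to multiplicative constants, to \(\int _{0}^{\infty } e^{-\rho t}\, \mathbb{E}[(X^{1,0}_{t})^{n_{i}+1}] dt\). Hence requirement (ii) holds whenever

$$ \rho > (n_{i}+1) \max _{j \in S}\big(r - g(j)\big) + \frac{1}{2}\, n_{i} (n_{i}+1)\, \sigma ^{2} \qquad \text{for every } i \in S. $$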

Whenever the government intervenes in order to reduce the debt-to-GDP ratio, it incurs a proportional cost. This might be seen as a measure of the social and financial consequences of a debt-reduction policy, and the associated regime-dependent marginal cost \(\kappa (Z_{t})\) allows us to express it in monetary terms (we refer e.g. to Lukkezen and Suyker [50] for an empirical evaluation of those costs). We assume that \(\kappa (\,\cdot \,)>0\) is a measurable and finite function.

Given an intertemporal discount rate \(\rho >0\), for any given and fixed triple \((x,\underline{y},q) \in (0,\infty ) \times \mathcal{Y} \times \mathcal{I}\), the government thus aims to minimise the expected total cost functional

$$ \mathcal{J}_{x, \underline{y},q}(\nu ) := \mathbb{E}\bigg[\int _{0}^{ \infty } e^{-\rho t} h (X_{t}^{x,\nu }, Z_{t} ) dt + \int _{0}^{\infty } e^{- \rho t} \kappa(Z_{t})d \nu _{t}\bigg] $$
(2.4)

for \(\nu \in \mathcal{M}(x, \underline{y}, q)\). The government’s problem under partial observation can be therefore defined as

$$ V_{{\mathrm{po}}}(x,\underline{y} ,q):= \inf _{\nu \in \mathcal{M}(x, \underline{y},q)} \mathcal{J}_{x,\underline{y},q}(\nu ), \qquad (x, \underline{y},q) \in (0,\infty ) \times \mathcal{Y} \times \mathcal{I}. $$
(2.5)

Remark 2.6

1) We provide here some comments on our formulation of the optimal debt reduction problem. In line with the recent literature [8, 9, 31, 32] on stochastic control models for debt management, the cost/loss function \(h\) appearing in the government’s objective functional is nondecreasing and null when the debt level is zero. While the latter requirement can be made without loss of generality, the former implicitly means that the government believes that disadvantages arising from debt far outweigh the advantages, and therefore neglects any potential social and financial benefit arising from having debt (cf. Holmström and Tirole [39]). One could think that this assumption is more appropriate for those countries that have faced severe debt crises during the last financial crisis and whose governments believe that high government debt has a negative effect on long-term economic growth, makes the economy less resilient to macroeconomic shocks (e.g. sovereign default risks and liquidity shocks) and limits the adoption of counter-cyclical fiscal policies (see e.g. the book by Blanchard [5, Chap. 22], the technical report [30] and Woo and Kumar [63] for empirical studies).

However, it is also worth noticing that the general results of Sect. 3 of this paper still hold if we take \(x \mapsto h(x,i)\) convex and bounded from below and remove the condition of being nondecreasing on \(\mathbb{R}_{+}\) (thus allowing potential benefits arising from debt). On the other hand, the monotonicity of \(h(\cdot ,i)\) has an important role in our analysis of Sects. 3.3 and 4 (see Propositions 4.4 and 4.6).

2) In our model, we do not allow policies that might lead to an increase of the debt like e.g. investments in infrastructure, healthcare, education and research, and we neglect any possible social and financial benefit that those economic measures might induce (see Ostry et al. [55]). From a mathematical point of view, allowing policies of debt increase would lead to a singular stochastic control problem with controls of bounded variation, where the two nondecreasing processes giving the minimal decomposition of any admissible control represent the cumulative amount of the debt’s increase and decrease. In this case, one might also allow in the government’s objective functional the total expected social and financial benefits arising from a policy of debt expansion. We refer to Ferrari and Rodosthenous [32] where a similar setting has been considered in a problem of debt management under complete observation.

The function \(V_{\mathrm{po}}\) is well defined and finite. Indeed, it is nonnegative due to the nonnegativity of \(h\); moreover, since the admissible policy “instantaneously reduce the debt ratio to 0 at the initial time” is a priori suboptimal and has cost \(x \sum _{i=1}^{Q} \kappa (i) y_{i}\), we have \(V_{\mathrm{po}}(x,\underline{y},q) \leq x \sum _{i=1}^{Q} \kappa (i) y_{i} < \infty \).

We should like to stress once more that any \(\nu \in \mathcal{M}(x,\underline{y},q)\) is ℍ-adapted, and therefore (2.5) is a stochastic control problem under partial observation. In particular, it is a singular stochastic control problem under partial observation, that is, an optimal control problem in which the random measures induced by the nondecreasing control processes on \([0,\infty )\) might be singular with respect to Lebesgue measure, and in which one component \(Z\) of the state variable is not directly observable by the controller.

In its current formulation, the optimal debt reduction problem is not Markovian and the dynamic programming approach via an HJB equation is not applicable. In the next section, by using techniques from filtering theory, we introduce an equivalent problem under complete information, the so-called separated problem. This enjoys a Markovian structure, and its solution is characterised in Sect. 3.3 through a Markovian optimal stopping problem.

3 Reduction to an equivalent problem under complete information

In this section, we derive the separated problem. To this end, we first study the filtering problem arising in our model. As already discussed in the introduction, results on such a filtering problem cannot be directly obtained from existing literature due to the structure of our dynamics.

3.1 The filtering problem

The filtering problem consists in finding the best mean-squared estimate of \(f(Z_{t})\), for any \(t\) and any measurable function \(f\), on the basis of the information available up to time \(t\). In our setting, that information flow is given by the filtration ℍ. The estimate can be described through the filter process \({(\pi _{t})}_{t \ge 0}\) which provides the conditional distribution of \(Z_{t}\) given \(\mathcal{H}_{t}\) for any time \(t\) (see for instance Liptser and Shiryaev [49, Chap. 8]). For any probability measure \(\mu \) on \(S=\{1,\dots ,Q\}\) and any function \(f\) on \(S\), we write \(\mu (f):= \int _{S} f d \mu = \sum _{i=1}^{Q} f(i)\mu (\{i\})\). It is known that there exists a càdlàg (right-continuous with left limits) process taking values in the space of probability measures on \(S=\{1,\dots ,Q\}\) such that for any measurable function \(f\) on \(S\),

$$ \pi _{t}(f) = \mathbb{E}[f(Z_{t}) |\mathcal{H}_{t} ]; $$
(3.1)

see for further details Kurtz and Ocone [48, Lemma 1.1]. Moreover, since \(Z\) takes only a finite number of values, the filter is completely described by the vector

$$ \pi _{t}(f_{i}) =\mathbb{P}[Z_{t} = i | \mathcal{H}_{t}], \qquad i \in S, $$

where \(f_{i} := \mathbf{1}_{\{i\}}\), \(i \in S\). With a slight abuse of notation, we denote in the following by \(\pi (i)\) the process \(\pi (f_{i})\), so that for all measurable functions \(f\), (3.1) gives

$$ \pi _{t}(f) = \sum _{i=1}^{Q} f(i) \pi _{t}(i). $$

Setting \(\beta (Z_{t}):=r-g(Z_{t})\) and \(\beta (i):= r - g(i)\), \(i \in S\), notice that \(\beta \) is clearly a bounded function. Then we define two processes \(I\) and \(I^{1}\) such that for any \(t\geq 0\),

$$\begin{aligned} I_{t} &:= W_{t} - \int _{0}^{t} \sigma ^{-1} \big( \pi _{s}(\beta ) - \beta (Z_{s}) \big) ds, \\ I^{1}_{t} &:= B_{t} - \int _{0}^{t} \Big( \pi _{s}\big(\alpha (\eta _{s}, \,\cdot \,) \big) - \alpha (\eta _{s}, Z_{s}) \Big) ds, \end{aligned}$$
(3.2)

where

$$ \alpha (q, i):= \sigma _{2}(q)^{-1} \big( b_{1}(q, i) - \sigma ^{-1} \beta (i) \sigma _{1}(q)\big), \qquad (q,i) \in \mathcal{I}\times S. $$
(3.3)

Henceforth, we work under the following Novikov condition.

Assumption 3.1

$$ \mathbb{E}\big[e^{\frac{1}{2}\int _{0}^{t} \alpha ^{2}(\eta _{s}, Z_{s}) ds} \big]< \infty \qquad \text{for any } t\geq 0. $$

Under Assumption 3.1, by classical results from filtering theory (see e.g. [49, Chap. 7]), the innovation processes \(I\) and \(I^{1}\) are Brownian motions with respect to the filtration ℍ. Moreover, given the assumed independence of \(B\) and \(W\), they turn out to be independent.

The integer-valued random measure associated to the jumps of \(\eta \) is defined as

$$ m(dt, dq):= \sum _{s: \Delta \eta _{s} \neq 0} \delta _{( s, \Delta \eta _{s} )} (ds, dq), $$
(3.4)

where \(\delta _{(a_{1},a_{2})}\) denotes the Dirac measure at the point \((a_{1},a_{2}) \in \mathbb{R}_{+} \times \mathbb{R}\). Notice that the ℍ-adapted random measure \(m\) is such that, for any \(t \geq 0\),

$$ m\big((0,t] \times \mathbb{R}\big) = \sum _{0 < s \leq t} \mathbf{1}_{\{\Delta \eta _{s} \neq 0\}} \leq N_{t} < \infty \qquad \text{a.s.}, $$

since \(\eta \) can jump only at the jump times of \(N\).

To proceed further we need the following useful definitions.

Definition 3.2

For any filtration \(\mathbb{G}\), we denote by \({\mathcal{P}}(\mathbb{G})\) the predictable \(\sigma \)-field on the product space \((0,\infty )\times \Omega \). Moreover, let \({\mathcal{B}}(\mathbb{R})\) be the Borel \(\sigma \)-algebra on ℝ. Any mapping \(H: (0,\infty ) \times \Omega \times \mathbb{R} \to \mathbb{R}\) which is \({\mathcal{P}}(\mathbb{G}) \times {\mathcal{B}}(\mathbb{R})\)-measurable is called a \(\mathbb{G}\)-predictable process indexed by ℝ.

Letting

$$ {\mathcal{F}}_{t}^{m} := \sigma \big\{ m\big((0,s] \times A\big): 0 \leq s \leq t, A \in {\mathcal{B}}( \mathbb{R})\big\} , $$
(3.5)

we denote by \(\mathbb{F}^{m}:=(\mathcal{F}^{m}_{t})_{t\geq 0}\) the filtration which is generated by the random measure \(m(dt, dq)\). It is right-continuous by [7, Theorem T25 in Appendix A2].

Definition 3.3

Given any filtration \(\mathbb{G}\) with \(\mathbb{F}^{m} \subseteq \mathbb{G}\), the \(\mathbb{G}\)-dual predictable projection of \(m\), denoted by \(m^{p, \mathbb{G}}(dt, dq)\), is the unique positive \(\mathbb{G}\)-predictable random measure such that for any nonnegative \(\mathbb{G}\)-predictable process \(\Phi \) indexed by ℝ,

$$ \mathbb{E}\bigg[\int _{0}^{\infty }\int _{\mathbb{R}} \Phi (s,q) m(ds,dq) \bigg] = \mathbb{E}\bigg[\int _{0}^{\infty }\int _{\mathbb{R}} \Phi (s,q) m^{p, \mathbb{G}}(ds, dq)\bigg]. $$
(3.6)

To prove that a given positive \(\mathbb{G}\)-predictable random measure is the \(\mathbb{G}\)-dual predictable projection of \(m\), it suffices to verify (3.6) for any process of the form \(\Phi (t,q) = C_{t} \mathbf{1}_{A}(q)\) with \(C\) a nonnegative \(\mathbb{G}\)-predictable process and \(A \in {\mathcal{B}}(\mathbb{R})\). For further details, we refer to the books by Brémaud [7, Sect. VIII.4] and Jacod [41, Sect. III.1].

We now aim at deriving an equation for the evolution of the filter (the filtering equation). To this end, we use the so-called innovation approach (see Brémaud [7, Chap. IV.1], Liptser and Shiryaev [49, Chaps. 7.4 and 10.1.5] and Ceci and Colaneri [15], among others), which in our setting requires the introduction of the ℍ-compensated jump measure of \(\eta \),

$$ m^{\pi }(dt, dq):= m(dt, dq) - m^{p, \mathbb{H}}(dt, dq). $$
(3.7)

The triplet \((I, I^{1}, m^{\pi })\) also represents a building block for the construction of ℍ-martingales, as shown in Proposition 3.5 below. We start by determining the form of \(m^{p, \mathbb{H}}\).

Proposition 3.4

The ℍ-dual predictable projection of \(m\) is given by

$$ m^{p, \mathbb{H}}(dt, dq) = \sum _{i=1}^{Q} \pi _{t-}(i)\, \lambda ^{N}(i)\, \delta _{c(\eta _{t-}, i)}(dq)\, dt, $$
(3.8)

where \(\delta _{a}\) denotes the Dirac measure at the point \(a \in \mathbb{R}\).

Proof

1) We first prove that the \(\mathbb{F}\)-dual predictable projection of \(m\) is given by

$$ m^{p, \mathbb{F}}(dt, dq) = \lambda ^{N}(Z_{t-})\, \delta _{c(\eta _{t-}, Z_{t-})}(dq)\, dt. $$
(3.9)

Let \(A\in {\mathcal{B}}(\mathbb{R})\) and introduce

$$ \mathcal{N}_{t}(A) := m\big((0,t] \times A\big) = \sum _{0 < s \leq t} \mathbf{1}_{\{\Delta \eta _{s} \in A \setminus \{0\}\}}, \qquad t \geq 0. $$
(3.10)

Then \(\mathcal{N}(A)\) is the point process counting the number of jumps of \(\eta \) up to time \(t\) with jump size in the set \(A\). Since (2.2) implies that \(\Delta \eta _{s} = c(\eta _{s-}, Z_{s-}) \Delta N_{s}\) for \(s\geq 0\) and \(N\) is a point process with \(\mathbb{F}\)-predictable intensity given by \((\lambda ^{N}\!(Z_{t-}))_{t\geq 0}\), we obtain for each nonnegative \(\mathbb{F}\)-predictable process \(C\) that

$$ \mathbb{E}\bigg[\int _{0}^{\infty } C_{s}\, d\mathcal{N}_{s}(A) \bigg] = \mathbb{E}\bigg[\int _{0}^{\infty } C_{s}\, \mathbf{1}_{\{c(\eta _{s-}, Z_{s-}) \in A \setminus \{0\}\}}\, dN_{s}\bigg] = \mathbb{E}\bigg[\int _{0}^{\infty } C_{s}\, \mathbf{1}_{\{c(\eta _{s-}, Z_{s-}) \in A \setminus \{0\}\}}\, \lambda ^{N}(Z_{s-})\, ds\bigg]. $$

So for any \(A\in {\mathcal{B}}(\mathbb{R})\), \(\lambda ^{N}(Z_{t-}) \mathbf{1}_{\{c(\eta _{t-}, Z_{t-}) \in A \setminus \{0\}\}}\) provides the \(\mathbb{F}\)-predictable intensity of the counting process \(\mathcal{N}(A)\). Recalling (3.10) and Definition 3.3, this implies that \(m^{p, \mathbb{F}}(dt,dq)\) in (3.9) coincides with the \(\mathbb{F}\)-dual predictable projection of \(m\), since (3.6) holds with the choice \(\mathbb{G} = \mathbb{F}\) and \(\Phi (t,q) = C_{t} \mathbf{1}_{A}(q)\).

2) As in Ceci [11, Proposition 2.3], we can now derive the ℍ-dual predictable projection of \(m\) by projecting \(m^{p, \mathbb{F}}\) onto the observation flow ℍ. More precisely, the ℍ-predictable intensity of the point process \(\mathcal{N}(A)\), \(A\in {\mathcal{B}}(\mathbb{R})\), is given by

$$ \sum _{i=1}^{Q} \pi _{t-}(i)\, \lambda ^{N}(i)\, \mathbf{1}_{\{c(\eta _{t-}, i) \in A \setminus \{0\}\}}. $$

This implies that \(m^{p, \mathbb{H}}(dt, dq)\) is given by (3.8), since (3.6) is satisfied with the choice \(\mathbb{G} = \mathbb{H}\), \(\Phi (t,q) = C_{t} \mathbf{1}_{A}(q)\). □

An essential tool to prove that the original problem under partial information is equivalent to the separated one is the characterisation of the filter as the unique solution to the filtering equation (see El Karoui et al. [28], Mazliak [52] and Ceci and Gerardi [12]). In order to derive the filtering equation solved by \(\pi \), we first give a representation theorem for ℍ-martingales. The proof of the following technical result is given in Appendix A.

Proposition 3.5

Under Assumptions 2.1 and 3.1, every ℍ-local martingale \(M\) admits the decomposition

$$ M_{t} = M_{0} + \int _{0}^{t} \varphi _{s} dI_{s} + \int _{0}^{t} \psi _{s} dI^{1}_{s} + \int _{0}^{t} \int _{\mathbb{R}} w(s,q) m^{\pi }(ds, dq), $$

where \(\varphi \) and \(\psi \) are ℍ-predictable processes and \(w\) is an ℍ-predictable process indexed by ℝ such that a.s.

$$ \int _{0}^{t} \varphi ^{2}_{s} ds < \infty , \quad \int _{0}^{t} \psi ^{2}_{s} ds < \infty , \quad \int _{0}^{t} \int _{\mathbb{R}} |w(s,q)| m^{p, \mathbb{H}}(ds, dq) < \infty , \qquad t\geq 0. $$

We are now in the position to prove the following fundamental result, whose proof is postponed to Appendix A.

Theorem 3.6

Recall (3.7), let \({\underline{y}} \in \mathcal{Y} \) be the initial distribution of \(Z\) and let Assumptions 2.1 and 3.1 hold. Then the filter \(({\underline{\pi }}_{t})_{t \geq 0} := (\pi _{t}(i); i\in S)_{t \geq 0}\) solves the Kushner–Stratonovich system

$$\begin{aligned} \pi _{t}(i) =& y_{i} + \int _{0}^{t} \sum _{j=1}^{Q} \lambda _{ji} \pi _{s}(j) ds + \int _{0}^{t} \pi _{s}(i) \sigma ^{-1} \bigg( \beta (i) - \sum _{j=1}^{Q} \beta (j) \pi _{s}(j) \bigg) dI_{s} \\ & + \int _{0}^{t} \pi _{s}(i)\bigg(\alpha (\eta _{s}, i) - \sum _{j=1}^{Q} \alpha (\eta _{s}, j) \pi _{s}(j) \bigg) dI^{1}_{s} \\ & + \int _{0}^{t} \int _{\mathbb{R}} \big( w^{\pi }_{i}(s,q) - \pi _{s-}(i) \big) m^{\pi }(ds, dq) \end{aligned}$$
(3.11)

for any \(i\in S\). Here, \(\beta (i) = r -g(i)\) and

$$ w^{\pi }_{i}(s,q) := \frac{d \big( \pi _{s-}(i)\, \lambda ^{N}(i)\, \delta _{c(\eta _{s-}, i)} \big)}{d \big( \sum _{j=1}^{Q} \pi _{s-}(j)\, \lambda ^{N}(j)\, \delta _{c(\eta _{s-}, j)} \big)}(q) $$
(3.12)

denotes the Radon–Nikodým derivative of the measure \(\pi _{s-}(i) \lambda ^{N}(i) \delta _{c(\eta _{s-}, i)}(dq)\) with respect to \(\sum _{j=1}^{Q} \pi _{s-}(j) \lambda ^{N}(j) \delta _{c(\eta _{s-}, j)}(dq)\).

Let us introduce the sequence of jump times and jump sizes of the process \(\eta \), denoted by \((T_{n}, \zeta _{n} )_{n\geq 1}\) and recursively defined, with \(T_{0} := 0\), as

$$\begin{aligned} T_{n+1}& := \inf \bigg\{ t > T_{n}: \int _{T_{n}}^{t} c(\eta _{s-}, Z_{s-} )dN_{s} \neq 0 \bigg\} , \\ \zeta _{n} &:= \eta _{T_{n}} - \eta _{T_{n}-} = c(\eta _{T_{n} -}, Z_{{T_{n} -}} ), \qquad n\geq 1. \end{aligned}$$

We use the standard convention that \(\inf \emptyset = +\infty \). Then the integer-valued measure associated to the jumps of \(\eta \) (cf. (3.4)) can also be written as

$$ m(dt, dq) = \sum _{n \geq 1} \delta _{(T_{n}, \zeta _{n} )}(dt, dq)\, \mathbf{1}_{\{T_{n} < \infty \}}. $$
(3.13)

The filtering system (3.11) has a natural recursive structure in terms of the sequence \(( T_{n} )_{n\geq 1}\), as shown in the next proposition.

Proposition 3.7

Between two consecutive jump times, i.e., for \(t \in [T_{n}, T_{n+1})\), the filtering system (3.11) reads as

$$\begin{aligned} \pi _{t}(i) =& \pi _{T_{n}}(i) + \int _{T_{n}}^{t} \bigg( \sum _{j=1}^{Q} \lambda _{ji} \pi _{s}(j) - \pi _{s}(i) \Big( \lambda ^{N}(i) - \sum _{j=1}^{Q} \lambda ^{N}(j) \pi _{s}(j) \Big) \bigg) ds \\ & + \int _{T_{n}}^{t} \pi _{s}(i) \sigma ^{-1} \bigg( \beta (i) - \sum _{j=1}^{Q} \beta (j) \pi _{s}(j) \bigg) dI_{s} \\ & + \int _{T_{n}}^{t} \pi _{s}(i)\bigg(\alpha (\eta _{s}, i) - \sum _{j=1}^{Q} \alpha (\eta _{s}, j) \pi _{s}(j) \bigg) dI^{1}_{s} \end{aligned}$$
(3.14)

for any \(i\in S\). At a jump time \(T_{n}\) of \(\eta \), \(({\underline{\pi }}_{t})_{t \geq 0}= (\pi _{t}(i); i\in S)_{t \geq 0}\) jumps as well, and its value is given by

$$ \pi _{T_{n}}(i) = w^{\pi }_{i}(T_{n}, \zeta _{n}) = \frac{\pi _{T_{n}-}(i)\, \lambda ^{N}(i)\, \mathbf{1}_{\{c(\eta _{T_{n}-}, i) = \zeta _{n}\}}}{\sum _{j=1}^{Q} \pi _{T_{n}-}(j)\, \lambda ^{N}(j)\, \mathbf{1}_{\{c(\eta _{T_{n}-}, j) = \zeta _{n}\}}}, \qquad i \in S. $$
(3.15)

Proof

First, recalling that \(m^{\pi }(dt, dq)= m(dt, dq) - m^{p, \mathbb{H}}(dt, dq)\) and that by (3.8) and (3.12),

$$ \int _{\mathbb{R}} \big( w^{\pi }_{i}(s,q) - \pi _{s-}(i) \big)\, m^{p, \mathbb{H}}(ds, dq) = \pi _{s-}(i) \bigg( \lambda ^{N}(i) - \sum _{j=1}^{Q} \lambda ^{N}(j) \pi _{s-}(j) \bigg) ds, $$

we obtain that for \(t \in [T_{n}, T_{n+1})\), where \(m(ds,dq)\) carries no mass,

$$ \int _{T_{n}}^{t} \int _{\mathbb{R}} \big( w^{\pi }_{i}(s,q) - \pi _{s-}(i) \big)\, m^{\pi }(ds, dq) = - \int _{T_{n}}^{t} \pi _{s}(i) \bigg( \lambda ^{N}(i) - \sum _{j=1}^{Q} \lambda ^{N}(j) \pi _{s}(j) \bigg) ds, $$

which from (3.11) implies that \(\pi _{t}(i)\) solves (3.14) for any \(t \in [T_{n}, T_{n+1})\). Finally, (3.15) follows by (3.12) and the fact that \(m(\{T_{n}\} \times dq) = \delta _{\zeta _{n}}(dq)\). □

We want to stress that (3.15) shows that the vector \({\underline{\pi }}_{T_{n}}\) is completely determined by the observed data \(\eta \) and the knowledge of \({\underline{\pi }}_{t}\) for \(t \in [T_{n-1}, T_{n})\), since \(\pi _{{T_{n} -}}(i) := \lim _{t \uparrow T_{n}} \pi _{t}(i)\), \(i \in S\).
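As a small illustration of this recursive structure, the following Python helper (a sketch with hypothetical inputs) implements the jump-time update (3.15) for a finite state space \(S=\{1,\dots ,Q\}\). Note that when the jump sizes \(c(\eta _{T_{n}-}, i)\) differ across regimes, the observed jump size identifies the regime and the updated filter becomes a point mass; the sketch assumes that the observed jump size is consistent with at least one regime.

```python
# Illustrative implementation of the jump-time update (3.15) of the filter.
import numpy as np

def filter_jump_update(pi_minus, eta_minus, zeta, lamN, c, tol=1e-12):
    """pi_minus: left limit of the filter (length Q); lamN(i), c(q, i): model data."""
    Q = len(pi_minus)
    weights = np.array([
        pi_minus[i] * lamN(i) * (abs(c(eta_minus, i) - zeta) < tol)
        for i in range(Q)
    ])
    return weights / weights.sum()     # Bayes-type normalisation as in (3.15)

# two regimes with different jump sizes (hypothetical values): observing a jump
# of size -1.0 reveals that the chain is in regime 2
pi_new = filter_jump_update(
    pi_minus=np.array([0.6, 0.4]), eta_minus=0.0, zeta=-1.0,
    lamN=lambda i: [1.0, 3.0][i], c=lambda q, i: [0.5, -1.0][i],
)
print(pi_new)   # -> [0. 1.]
```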

Example 3.8

1) In the case \(c(q,i) \equiv c \neq 0\) for any \(i \in S\) and \(q \in \mathcal{I}\), the sequences of jump times of \(\eta \) and \(N\) coincide and the filtering system (3.11) reduces to (for \(i \in S\))

$$\begin{aligned} \pi _{t}(i) &= y_{i}+ \int _{0}^{t} \sum _{j=1}^{Q} \lambda _{ji} \pi _{s}(j) ds + \int _{0}^{t} \pi _{s}(i)\sigma ^{-1} \bigg( \beta (i) - \sum _{j=1}^{Q} \beta (j) \pi _{s}(j) \bigg)dI_{s} \\ &\phantom{=:} + \int _{0}^{t} \pi _{s}(i)\bigg( \alpha (\eta _{s}, i) - \sum _{j=1}^{Q} \alpha (\eta _{s}, j) \pi _{s}(j) \bigg) dI^{1}_{s} \\ & \phantom{=:}+ \int _{0}^{t} \bigg( \frac{\lambda ^{N}(i) \pi _{s-}(i)}{\sum _{j=1}^{Q}\pi _{s-}(j) \lambda ^{N}(j) } - \pi _{s-}(i) \bigg) \bigg( dN_{s} - \sum _{j=1}^{Q}\pi _{s-}(j) \lambda ^{N}(j) ds\bigg). \end{aligned}$$

2) In the case \(\alpha (q,i)=\alpha (i)\) and \(c(q,i) \equiv 0\) for any \(i \in S\) and \(q \in \mathcal{I}\), the filtering system (3.11) does not depend explicitly on the process \(\eta \). In particular, one has

$$\begin{aligned} \pi _{t}(i) &= y_{i} + \int _{0}^{t} \sum _{j=1}^{Q} \lambda _{ji} \pi _{s}(j) ds + \int _{0}^{t} \pi _{s}(i)\sigma ^{-1} \bigg( \beta (i) - \sum _{j=1}^{Q} \beta (j) \pi _{s}(j) \bigg) dI_{s} \\ &\phantom{=:} + \int _{0}^{t} \pi _{s}(i)\bigg( \alpha (i) - \sum _{j=1}^{Q} \alpha (j) \pi _{s}(j) \bigg) dI^{1}_{s}, \qquad i \in S, \end{aligned}$$

where we have set \(\alpha (i) := \sigma _{2}^{-1} (b_{1}(i) - \sigma ^{-1} \beta (i) \sigma _{1})\). In Sect. 4, we provide the explicit solution to the optimal debt reduction problem within this setting. With reference to (2.2) and (3.3), this setting corresponds e.g. to the purely diffusive arithmetic case \(c(q,i) = 0\), \(b_{1}(q,i) = b_{1}(i)\) and \(\sigma _{1}(q) = \sigma _{1} >0\), \(\sigma _{2}(q) = \sigma _{2} >0\) for any \(i \in S\) and \(q \in \mathcal{I}\), or to the purely diffusive geometric case \(c(q,i) = 0\), \(b_{1}(q,i) = b_{1}(i)q\) and \(\sigma _{1}(q) = \sigma _{1} q\), \(\sigma _{2}(q) = \sigma _{2} q\) for any \(i \in S\) and \(q \in \mathcal{I}\).
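To illustrate the purely diffusive setting of part 2), the following Python sketch simulates the hidden two-state chain and the observations in the arithmetic case, and propagates the filter \(\pi _{t}(1)\) by an Euler scheme driven by the innovations, which are computed from the observed increments of \(X^{0}\) and \(\eta \). All parameter values are illustrative assumptions.

```python
# Euler sketch of the two-regime diffusive filter of Example 3.8, part 2);
# parameter values are illustrative only.
import numpy as np

rng = np.random.default_rng(0)
T, n = 10.0, 10_000
dt = T / n

r, sigma = 0.02, 0.15
g = np.array([0.04, -0.01])                 # GDP growth in regimes 1 and 2
beta = r - g                                # beta(i) = r - g(i)
b1 = np.array([0.5, -0.5])                  # drift of eta per regime
sigma1, sigma2 = 0.2, 0.3
lam12, lam21 = 0.5, 0.7                     # transition intensities of Z
alpha = (b1 - beta * sigma1 / sigma) / sigma2   # alpha(i) as defined above

Z = 0                                       # hidden chain: 0 <-> regime 1
X, eta, p = 1.0, 0.0, 0.5                   # p = pi_t(1), started from y_1 = 0.5
for _ in range(n):
    dW = np.sqrt(dt) * rng.standard_normal()
    dB = np.sqrt(dt) * rng.standard_normal()
    dX = beta[Z] * X * dt + sigma * X * dW
    deta = b1[Z] * dt + sigma1 * dW + sigma2 * dB

    # innovations recovered from the observed increments
    pi_beta = p * beta[0] + (1 - p) * beta[1]
    pi_b1 = p * b1[0] + (1 - p) * b1[1]
    pi_alpha = p * alpha[0] + (1 - p) * alpha[1]
    dI = (dX / X - pi_beta * dt) / sigma
    dI1 = (deta - pi_b1 * dt - sigma1 * dI) / sigma2

    # Euler step for pi_t(1), cf. the display above
    dp = (lam21 * (1 - p) - lam12 * p) * dt \
         + p * (beta[0] - pi_beta) / sigma * dI \
         + p * (alpha[0] - pi_alpha) * dI1
    p = min(max(p + dp, 0.0), 1.0)          # keep the Euler step in [0, 1]

    X += dX
    eta += deta
    if rng.random() < (lam12 if Z == 0 else lam21) * dt:
        Z = 1 - Z                           # switch of the hidden chain

print(f"true final regime = {Z + 1}, filter pi_T(1) = {p:.3f}")
```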

3.2 The separated problem

Thanks to the introduction of the filter, (2.1)–(2.3) can now be rewritten in terms of observable processes. In particular, we have that

$$\begin{aligned} \begin{aligned} d X_{t}^{0} &= \pi _{t}(\beta ) X_{t}^{0} dt + \sigma X_{t}^{0} dI_{t} , \\ X^{0}_{0}&=x>0, \end{aligned} \end{aligned}$$
(3.16)
$$\begin{aligned} \begin{aligned} d \eta _{t}&= \pi _{t}\big(b_{1}(\eta _{t}, \cdot )\big) dt + \sigma _{1}( \eta _{t}) dI_{t} + \sigma _{2}(\eta _{t}) dI^{1}_{t} + \int _{\mathbb{R}} \zeta m(dt, d\zeta ), \\ \eta _{0}&=q \in \mathcal{I}, \end{aligned} \end{aligned}$$
(3.17)
$$\begin{aligned} \begin{aligned} dX_{t}^{\nu } &= \pi _{t}(\beta ) X_{t}^{\nu } dt + \sigma X_{t}^{\nu } dI_{t} - d \nu _{t}, \\ X^{\nu }_{0-}&=x >0. \end{aligned} \end{aligned}$$
(3.18)

Notice that for any \(\nu \in \mathcal{M}(x,\underline{y},q)\), the process \(X^{\nu }\) turns out to be ℍ-adapted and depends on the vector \(({\underline{\pi }}_{t})_{t \geq 0} =(\pi _{t}(i); i\in S)_{t \geq 0}\) with \({\underline{\pi }}_{0}= {\underline{y}} \in \mathcal{Y}\).

Definition 3.9

We say that a process \(( { \underline{\widetilde{\pi }}}_{t}, \widetilde{\eta }_{t} )_{t \geq 0}\) with values in \(\mathcal{Y} \times \mathcal{I}\) is a strong solution to (3.11) and (3.17) if it satisfies those equations pathwise. We say that strong uniqueness for the system (3.11) and (3.17) holds if for any strong solution \(( { \underline{\widetilde{\pi }}}_{t}, \widetilde{\eta }_{t} )_{t \geq 0}\) to (3.11) and (3.17), one has \({\underline{\widetilde{\pi }}}_{t} = {\underline{\pi }}_{t}\) and \(\widetilde{\eta }_{t} = \eta _{t}\) a.s. for all \(t \geq 0\).

Proposition 3.10

Let Assumptions 2.1 and 3.1 hold and suppose that \(\alpha (\cdot ,i)\) is locally Lipschitz for any \(i\in S\) and that there exists \(M>0\) such that \(|\alpha (q,i)|\leq M(1 + |q|)\) for any \(q \in \mathcal{I}\) and any \(i\in S\). Then the system (3.11) and (3.17) admits a unique strong solution.

The proof of Proposition 3.10 is postponed to Appendix A. Notice that under Assumption 2.1, the requirement on \(\alpha \) of Proposition 3.10 is verified e.g. whenever \(\sigma _{2}(q) \geq \underline{\sigma }\) for some \(\underline{\sigma }>0\) and for any \(q \in \mathcal{I}\), or if \(b_{1}/\sigma _{2}\) and \(\sigma _{1}/\sigma _{2}\) are locally Lipschitz in \(q \in \mathcal{I}\) and have sublinear growth. As a byproduct of Proposition 3.10, we also have strong uniqueness of the solution to (3.18). In the following, when there is a need to stress the dependence with respect to the initial value \(x>0\), we denote the solution to (3.16) and (3.18) by \(X^{x,0}\) and \(X^{x,\nu }\), respectively. Since

$$ \mathbb{E}\left [ \pi _{t}\big( h(X_{t}^{x,\nu }, \cdot ) \big) \right ] = \mathbb{E}\left [\mathbb{E}[ h(X_{t}^{x,\nu }, Z_{t} ) | \mathcal{H}_{t} ] \right ], $$

an application of the Fubini–Tonelli theorem allows writing

$$ \mathbb{E}\left [\int _{0}^{\infty } e^{-\rho t} h(X_{t}^{x,\nu }, Z_{t} ) dt \right ] = \mathbb{E}_{(x,\underline{y},q)}\left [\int _{0}^{ \infty } e^{-\rho t} \pi _{t}\big( h(X_{t}^{\nu }, \cdot ) \big) dt \right ], $$

where \(\mathbb{E}_{(x,\underline{y},q)}\) denotes the expectation conditioned on \(X^{\nu }_{0^{-}}=x>0\), \(\underline{\pi }_{0}=\underline{y} \in \mathcal{Y}\), and \(\eta _{0}=q \in \mathcal{I}\). Also, because \(\pi (\kappa )\) is the ℍ-optional projection of the process \(\kappa (Z)\) (cf. [48, Lemma 1.1]) and any admissible control \(\nu \) is increasing and ℍ-adapted, an application of Dellacherie and Meyer [24, Theorem VI.57, in particular (VI.57.1)] yields

$$ \mathbb{E}\left [\int _{0}^{\infty } e^{-\rho t} \kappa (Z_{t}) d \nu _{t}\right ] = \mathbb{E}_{(x,\underline{y},q)}\left [\int _{0}^{ \infty } e^{-\rho t} \pi _{t}\big(\kappa ( \cdot )\big) d \nu _{t} \right ]. $$

Hence, the cost functional of (2.4) can be rewritten in terms of observable quantities as

$$\mathcal{J}_{x,\underline{y},q}(\nu ) = \mathbb{E}_{(x,\underline{y},q)} \left [\int _{0}^{\infty } e^{-\rho t} \pi _{t}\big( h(X_{t}^{\nu }, \cdot )\big) dt + \int _{0}^{\infty } e^{-\rho t} {\pi _{t}\big( \kappa ( \cdot )\big)} d \nu _{t}\right ]. $$

Notice that the latter expression does not depend on the unobservable process \(Z\) any more, and this allows us to introduce a control problem with complete information, the separated problem, in which the new state variable is given by the triplet \((X^{\nu }, \underline{\pi }, \eta )\). For this problem, we introduce the set \(\mathcal{A}(x,{\underline{y}},q) \) of admissible controls, given in terms of the observable processes in (3.11), (3.17) and (3.18) as

$$\begin{aligned} \mathcal{A}(x,{\underline{y}},q) := \big\{ \nu :\Omega \times \mathbb{R}_{+} \rightarrow \mathbb{R}_{+} & : {\big(\nu _{t}(\omega ) := \nu (\omega ,t) \big)}_{t \ge 0}, \textrm{is nondecreasing,} \\ &\phantom{=:} \text{right-continuous, $\mathbb{H}$-adapted and such that} \\ &\phantom{=:} \text{$X_{t}^{x,\nu } \ge 0$ for every $t \ge 0$, $X_{0-}^{x,\nu } = x$}, \\ &\phantom{=:}\underline{\pi }_{0}= {\underline{y}}, \text{$\eta _{0}=q$ a.s.} \big\} \end{aligned}$$

for every initial value \(x \in (0,\infty )\) of \(X^{x,\nu }\) defined in (3.18), any initial value \({\underline{y}} \in \mathcal{Y}\) of the process \(({\underline{\pi }}_{t} )_{t \geq 0}= (\pi _{t}(i); i\in S)_{t \geq 0}\) solving (3.11) and any initial value \(q \in \mathcal{I}\) of \(\eta \). In the following, we set \(\nu _{0-}=0\) a.s. for any \(\nu \in \mathcal{A}(x,{\underline{y}},q)\).

Given \(\nu \in \mathcal{A}(x, {\underline{y}}, q)\), the triplet \((X_{t}^{x,\nu }, {\underline{\pi }}_{t}, \eta _{t}) _{t \geq 0}\) solves (3.18), (3.11) and (3.17) and the jump measure associated to \(\eta \) has ℍ-predictable dual projection given by (3.8). Hence, the process \((X_{t}^{x,\nu }, {\underline{\pi }}_{t}, \eta _{t}) _{t \geq 0}\) is an ℍ-Markov process and we therefore define the Markovian separated problem as

$$ \textstyle\begin{cases} \textstyle\begin{array}{rl} \displaystyle V(x, {\underline{y}}, q) :=& \inf _{\nu \in \mathcal{A}(x, {\underline{y}}, q)} \mathbb{E}_{(x,\underline{y}, q)}\displaystyle \bigg[ \int _{0}^{\infty } e^{-\rho t} \pi _{t}\big( h(X_{t}^{\nu }, \cdot )\big) dt \\ & \displaystyle \qquad \quad \qquad \qquad \qquad + \int _{0}^{ \infty } e^{-\rho t} \pi _{t}\big(\kappa ( \cdot )\big) d \nu _{t} \bigg] \\ dX_{t}^{x,\nu } =& \pi _{t}(\beta ) X_{t}^{x,\nu } dt + \sigma X_{t}^{x, \nu } dI_{t} - d \nu _{t}, \qquad X^{x,\nu }_{0-}=x >0, \\ &({\underline{\pi }}, \eta ) \ \text{ solution to (3.11) and (3.17).} \end{array}\displaystyle \end{cases} $$
(3.19)

This is now a singular stochastic control problem under complete information, since all the processes involved are ℍ-adapted.

The next proposition immediately follows from the construction of the separated problem and the strong uniqueness of the solutions to (3.11), (3.17) and (3.18).

Proposition 3.11

Assume strong uniqueness for the system of (3.11) and (3.17), and let \((x,\underline{y},q) \in (0,\infty ) \times \mathcal{Y} \times \mathcal{I}\) be the initial values of the process \((X,Z,\eta )\) in the problem (2.5) under partial observation. Then

$$ V_{\mathrm{po}}(x,\underline{y},q)=V(x,\underline{y},q). $$

Moreover, \(\mathcal{A}(x,\underline{y},q) = \mathcal{M}(x,\underline{y},q)\) and \(\nu ^{*}\) is an optimal control for the separated problem (3.19) if and only if it is optimal for the original problem (2.5) under partial observation.

Remark 3.12

Notice that in the setting of Example 3.8, 2), the pair \((X^{x,\nu }, {\underline{\pi }})\) solving (3.18) and (3.11) is an ℍ-Markov process for any given control \(\nu \in \mathcal{A}(x,\underline{y},q)\), \((x,\underline{y},q) \in (0,\infty ) \times \mathcal{Y} \times \mathcal{I}\). As a consequence, since the cost functional and the set of admissible controls do not depend explicitly on the process \(\eta \), the value function of the separated problem (3.19) does not depend on the variable \(q\). We consider this setting in Sect. 4.

3.3 A probabilistic verification theorem via reduction to optimal stopping

In this section, we relate the separated problem to a Markovian optimal stopping problem and show that the solution to the latter is directly related to the optimal control of the former. The following analysis is fully probabilistic and based on a change-of-variable formula for Lebesgue–Stieltjes integrals that has already been employed in singular control problems (see e.g. Baldursson and Karatzas [2] and Ferrari [31]). The result of this section is then employed in Sect. 4 where in a case study, we determine the optimal debt reduction policy by solving an auxiliary optimal stopping problem.

With regard to (3.19), notice that we can write \(\pi _{t} (\kappa (\cdot ) ) = \sum _{i=1}^{Q} \pi _{t}(i) \kappa (i)\) as well as \(\pi _{t} (h(X_{t}^{x,\nu }, \cdot ) ) = \sum _{i=1}^{Q} \pi _{t}(i) h(X_{t}^{x, \nu }, i)\) a.s. for any \(t\geq 0\). For any \((x,\underline{\pi })\) in \((0,\infty ) \times \mathcal{Y}\), set

$$ \widehat{h}(x, \underline{\pi }):=\sum _{i=1}^{Q} \pi (i) h(x,i), \qquad \widehat{\kappa }(\underline{\pi }):=\sum _{i=1}^{Q} \pi (i) \kappa (i), $$

and given \(z \in (0,\infty )\), we introduce the optimal stopping problem

$$\begin{aligned} {\widetilde{U}}_{t} (z):= \operatorname*{ess\,inf}_{\tau \geq t} \mathbb{E}\bigg[&\int _{t}^{ \tau } e^{-\rho (s-t)} X^{1,0}_{s}\, \widehat{h}_{x}(X^{z,0}_{s}, \underline{\pi }_{s}) ds \\ &+ e^{-\rho (\tau - t)} \widehat{\kappa }(\underline{\pi }_{\tau }) X^{1,0}_{\tau }\,\bigg|\,\mathcal{H}_{t}\bigg], \qquad t \geq 0, \end{aligned}$$
(3.20)

where the optimisation is taken over all ℍ-stopping times \(\tau \geq t\).

Under Assumption 2.4, the expectation in (3.20) is finite for any ℍ-stopping time \(\tau \geq t\), for any \(t\geq 0\). Observing that \(\kappa (i) < \infty \) for any \(i \in S\), in order to take care of the event \(\{\tau = \infty \}\), we use in (3.20) the convention

$$ e^{-\rho \tau } X^{1,0}_{\tau }:=\liminf _{t\uparrow \infty }e^{-\rho t} X^{1,0}_{t} \qquad \text{on } \{\tau = \infty \}. $$
(3.21)

Denote by \(U(z)\) a càdlàg modification of \(\widetilde{U}(z)\) (which under our assumptions exists due to the results in Karatzas and Shreve [46, Appendix D]), and observe that \(0 \leq U_{t}(z) \leq \widehat{\kappa }(\underline{\pi }_{t}) X^{1,0}_{t}\) for any \(t \geq 0\), a.s. Also, define the stopping time

$$ \tau _{t}^{*}(z):=\inf \{s\geq t: U_{s}(z) \geq \widehat{\kappa }( \underline{\pi }_{s}) X^{1,0}_{s} \}, \qquad z \in (0,\infty ), $$
(3.22)

with the usual convention \(\inf \emptyset = \infty \). Then by [46, Theorem D.12], \(\tau _{t}^{*}(z)\) is an optimal stopping time for (3.20). In particular, \(\tau ^{*}(z):= \tau _{0}^{*}(z)\) is optimal for the problem

$$ U_{0}(z):=\inf _{\tau \geq 0}\mathbb{E}\left [ \int _{0}^{\tau } e^{- \rho t} X^{1,0}_{t}\, \widehat{h}_{x}(X^{z,0}_{t}, \underline{\pi }_{t}) dt + e^{-\rho \tau } \widehat{\kappa }(\underline{\pi }_{\tau }) X^{1,0}_{\tau }\right ]. $$

Notice that since \(\widehat{h}_{x}(\cdot , \underline{\pi })\) is a.s. increasing, \(z \mapsto \tau ^{*}(z)\) is a.s. decreasing. This monotonicity of \(\tau ^{*}(\,\cdot \,)\) will be important in the sequel as we need to consider its generalised inverse. Moreover, since the triplet \((X^{z,0}, \underline{\pi }, \eta )\) is a homogeneous ℍ-Markov process, there exists a measurable function \(U: (0,\infty ) \times \mathcal{Y} \times \mathcal{I}\to \mathbb{R}\) such that \(U_{t}(z) = U (X^{z,0}_{t}, \underline{\pi }_{t}, \eta _{t})\) for any \(t \geq 0\), a.s. Hence \(U_{0}(z)=U(z,\underline{y},q)\), and for any \((x,\underline{y},q) \in (0,\infty ) \times \mathcal{Y} \times \mathcal{I}\), we define

$$ \widetilde{V}(x, \underline{y},q):= \int _{0}^{x} U(z, \underline{y},q)dz. $$
(3.23)

Moreover, we introduce the nondecreasing right-continuous process

$$ \overline{\nu }_{t}^{{*}} := \sup \{\alpha \in [0,x]: \tau ^{*}(x- \alpha ) \leq t\}, \qquad t \geq 0, \overline{\nu }^{{*}}_{0-} =0, $$
(3.24)

and then also the process

$$ \nu ^{*}_{t}:=\int _{0}^{t} X^{1,0}_{s} d\overline{\nu }^{{*}}_{s}, \qquad t > 0, \nu ^{*}_{0-}=0. $$

Notice that \(\overline{\nu }^{{*}}_{\cdot }\) is the right-continuous inverse of \(\tau ^{*}(\,\cdot \,)\).

Theorem 3.13

Let \(\widetilde{V}\) be as in (3.23) and \(V\) as in the definition (3.19). Then \(\widetilde{V} = V\), and \(\nu ^{*}\) is the (unique) optimal control for (3.19).

Proof

1) Let \(x > 0\), \(\underline{y} \in \mathcal{Y}\) and \(q \in \mathcal{I}\) be given and fixed. For \(\nu \in \mathcal{A}(x, \underline{y}, q)\), we introduce the process \(\overline{\nu }\) such that \(\overline{\nu }_{t}:= \int _{0}^{t} \frac{d{\nu }_{s}}{X^{1,0}_{s}}\), \(t\geq 0\), and define its inverse (see e.g. Revuz and Yor [59, Sect. 0.4]) by

$$ \tau ^{\overline{\nu }}(z):=\inf \{t\geq 0: x - \overline{\nu }_{t} < z \}, \qquad 0 < z \leq x. $$

Notice that the process \((\tau ^{\overline{\nu }}(z))_{z \leq x}\) has decreasing left-continuous sample paths, and hence it admits right limits

$$ \tau ^{\overline{\nu }}_{+}(z):=\inf \{t\geq 0 : x - \overline{\nu }_{t} \leq z\}, \qquad z \leq x. $$
(3.25)

Moreover, the set of points \(z\in \mathbb{R}\) at which \(\tau ^{\overline{\nu }}(z)(\omega ) \neq \tau ^{\overline{\nu }}_{+}(z)( \omega )\) is countable for a.e. \(\omega \in \Omega \). The random time \(\tau ^{\overline{\nu }}(z)\) is an ℍ-stopping time because it is the first entry time of the right-continuous process \(\overline{\nu }\) into an open set and ℍ is right-continuous. Moreover, since \(\tau ^{\overline{\nu }}_{+}(z)\) is the first entry time of the right-continuous process \(\overline{\nu }\) into a closed set, it is an ℍ-stopping time as well for any \(z \leq x\).

Proceeding then as in Ferrari [31, Step 1 of the proof of Theorem 3.1], by employing the change-of-variable formula in [59, Proposition 0.4.9], one finds that

$$ \widetilde{V}(x, \underline{y}, q) = \int _{0}^{x}U(z, \underline{y}, q)dz \leq \mathcal{J}_{x,\underline{y}, q}(\nu ). $$

Hence, since \(\nu \) was arbitrary, we find that

$$ \widetilde{V}(x, \underline{y}, q) \leq V(x, \underline{y}, q), \qquad (x,\underline{y},q) \in (0,\infty ) \times \mathcal{Y} \times \mathcal{I}. $$
(3.26)

2) To complete the proof, we have to show the reverse inequality. Let \(x \in (0,\infty )\), \(\underline{y} \in \mathcal{Y}\) and \(q \in \mathcal{I}\) be initial values of \(X^{x,\nu }\), \(\underline{\pi }\) and \(\eta \). We first notice that \(\nu ^{*} \!\in\! \mathcal{A}(x, \underline{y},q)\). Indeed, \(\nu ^{*}\) is nondecreasing, right-continuous and such that \(X^{x,\nu ^{*}}_{t} \!\!= X^{1,0}_{t}(x - \overline{\nu }^{{*}}_{t}) \geq 0\) a.s. for all \(t\geq 0\), since \(\overline{\nu }^{{*}}_{t} \leq x\) a.s. by definition. Moreover, for any \(0< z \leq x\), we can write by (3.24) and (3.25) that

$$ \tau ^{\overline{\nu }^{{*}}}_{+}(z) \leq t \quad \Longleftrightarrow \quad \overline{\nu }^{{*}}_{t} \geq x - z \quad \Longleftrightarrow \quad \tau ^{*}(z) \leq t. $$

Then recalling that \(\tau ^{\overline{\nu }^{{{*}}}}_{+}(z)=\tau ^{\overline{\nu }^{{{*}}}}(z)\) ℙ-a.s. for almost every \(z\le x\), we pick \(\nu =\nu ^{*}\) (equivalently, \(\overline{\nu }=\overline{\nu }^{{*}}\)) and following [31, Step 2 in the proof of Theorem 3.1], we obtain \(\widetilde{V}(x,\underline{y},q) = \mathcal{J}_{x,\underline{y},q}( \nu ^{*}) \geq V(x,\underline{y},q)\), where the last inequality is due to the admissibility of \(\nu ^{*}\). Hence, by (3.26), we have \(\widetilde{V}=V\) and \(\nu ^{*}\) is optimal. In fact, by strict convexity of \(\mathcal{J}_{x,\underline{y}, q}(\,\cdot \,)\), \(\nu ^{*}\) is the unique optimal control in the class of controls belonging to \(\mathcal{A}(x, \underline{y}, q)\) and such that \(\mathcal{J}_{x,\underline{y}, q}(\nu ) < \infty \). □

Remark 3.14

For any given \((x,\underline{y},q) \in (0,\infty ) \times \mathcal{Y} \times \mathcal{I}\), define the Markovian optimal stopping problem

$$ v(x, \underline{y}, q):= \inf _{\tau \geq 0}\mathbb{E}\bigg[\int _{0}^{ \tau } e^{-\rho t} X^{x,0}_{t} \widehat{h}_{x}({X^{x,0}_{t}}, \underline{\pi }^{\,\underline{y}}_{t}) dt + e^{-\rho \tau } \widehat{\kappa }(\underline{\pi }^{\,\underline{y}}_{\tau }) X^{x,0}_{\tau }\bigg], $$

where \(\underline{\pi }^{\,\underline{y}}\) denotes the filter process starting at time zero from \(\underline{y} \in \mathcal{Y}\). Then, since \(X^{x,0}=x X^{1,0}\) by (3.16) and \(U_{0}(z)= U(z, \underline{y}, q)\) for some measurable function \(U\), one can easily see that \(v(x, \underline{y}, q)= x U(x, \underline{y}, q)\). Moreover, the previous considerations together with (3.22) (evaluated at \(t=0\)) ensure that the stopping time

$$ \tau ^{*}(x,\underline{y},q):=\inf \{t\geq 0: v(X^{x,0}_{t}, \underline{\pi }^{\,\underline{y}}_{t} , \eta ^{q}_{t}) \geq \widehat{\kappa }(\underline{\pi }^{\,\underline{y}}_{t}) X^{x,0}_{t} \} $$

is optimal for \(v(x,\underline{y}, q)\), where \(q=\eta _{0}\).

4 The solution in a case study with \(Q=2\) economic regimes

In this section, we build on the general filtering analysis developed in the previous sections and on the result of Theorem 3.13, and provide the form of the optimal debt reduction policy in a case study defined through the following standing assumption.

Assumption 4.1

1) \(Z\) takes values in \(S=\{1, 2\}\), and with reference to (2.3), we assume \(g_{2}:=g(2)< g(1)=:g_{1}\).

2) For any \(q\in \mathcal{I}\) and \(i\in \{1, 2\}\), one has \(c(q,i) = 0\), and for \(\alpha \) as in (3.3), we assume \(\alpha (q,i)=\alpha (i)\).

3) \(h(x,i)=h(x)\) for all \((x,i) \in (0,\infty ) \times \{1,2\}\), with \(h:\mathbb{R} \to \mathbb{R}\) such that

  (i) \(x \mapsto h(x)\) is strictly convex, twice continuously differentiable and nondecreasing on \(\mathbb{R}_{+}\) with \(h(0)=0\) and \(\lim _{x \uparrow \infty }h(x)=\infty \);

  (ii) there exist \(\gamma > 1\), \(0< K_{o}< K\) and \(K_{1},K_{2}>0\) such that

$$\begin{aligned} K_{o}|x^{+}|^{\gamma } - K &\leq h(x) \leq K(1 + |x|^{\gamma }), \\ |h'(x)| &\leq K_{1}(1 + |x|^{\gamma -1}), \\ |h''(x)| &\leq K_{2}(1 + |x|^{(\gamma -2)^{+}}). \end{aligned}$$

4) \(\kappa (i)=1\) for \(i \in \{1,2\}\).

Notice that under Assumption 4.1, 2), the macroeconomic indicator \(\eta \) has a suitable diffusive dynamics whose coefficients \(b_{1}\), \(\sigma _{1}\), \(\sigma _{2}\) are such that the function \(\alpha \) is independent of \(q\). As discussed in Example 3.8, 2), this is the case of a geometric or arithmetic diffusive dynamics for \(\eta \). In this setting, the Kushner–Stratonovich system (3.11) reduces to

$$\begin{aligned} d\pi _{t}(1) &= \big(\lambda _{2} - (\lambda _{1} + \lambda _{2}) \pi _{t}(1) \big) dt \\ &\phantom{=:} + \pi _{t}(1)\big( 1- \pi _{t}(1) \big)\bigg( \frac{ \beta _{1} - \beta _{2}}{\sigma } dI_{t} + (\alpha _{1}- \alpha _{2})dI^{1}_{t} \bigg) \end{aligned}$$
(4.1)

and \(\pi _{t}(2) = 1- \pi _{t}(1)\). Here, \(\lambda _{1}:= \lambda _{12} >0\) and \(\lambda _{2} := \lambda _{21} >0\).

Setting \(\pi _{t} := \pi _{t}(1)\), \(t\geq 0\), (3.19) then reads as

$$ \textstyle\begin{cases} \textstyle\begin{array}{rl} \displaystyle V(x,y) &= \inf _{\nu \in \mathcal{A}(x,y)}\mathbb{E}_{(x,y)} \displaystyle \bigg[\int _{0}^{\infty } e^{-\rho t} h(X_{t}^{\nu }) dt + \displaystyle \int _{0}^{\infty } e^{-\rho t} d \nu _{t}\bigg], \\ \displaystyle dX_{t}^{x,y,\nu } &= \big(\beta _{2} + \pi ^{y}_{t}(g_{2} -g_{1}) \big) X_{t}^{x,y,\nu } dt + \sigma X_{t}^{x,y,\nu } dI_{t} - d \nu _{t}, \\ \displaystyle d\pi ^{y}_{t}&= \big( \lambda _{2} - (\lambda _{1}+ \lambda _{2})\pi ^{y}_{t}\big)dt \\ &\phantom{=:}+ \pi ^{y}_{t}(1-\pi ^{y}_{t})\bigg(\frac{g_{2}-g_{1}}{\sigma }dI_{t} + (\alpha _{1}-\alpha _{2})dI^{1}_{t}\bigg), \end{array}\displaystyle \end{cases} $$
(4.2)

with initial conditions \(X^{x,y,\nu }_{0-}=x >0\), \(\pi _{0} =y \in (0,1)\), and where \(g_{i}= r -\beta _{i}\) denotes the rate of economic growth in state \(i\), \(i=1,2\). Note that from now on we write subscripts \(i\) instead of arguments \((i)\).

It is worth noticing that there is no need to involve the process \(\eta \) in the Markovian formulation (4.2). Indeed, the couple \((X^{\nu },\pi )\) solving the two stochastic differential equations above is a strong Markov process, and neither the cost functional nor the set of admissible controls \(\mathcal{A}(x,y,q)\) depends explicitly on \(\eta \); hence we simply write \(\mathcal{A}(x,y)\) instead of \(\mathcal{A}(x,y,q)\) (cf. (4.2) above). For this reason, the value function of (4.2) does not depend on the initial value \(q\) of the process \(\eta \). The information carried by the macroeconomic indicator \(\eta \) nevertheless enters through the filter \(\pi \), namely via the constant term \(\alpha _{1} - \alpha _{2}\) in its dynamics.
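To make the separated dynamics in (4.2) concrete, the following Python fragment simulates one path of the uncontrolled pair \((X^{x,y,0},\pi ^{y})\) by an Euler–Maruyama scheme. It is only a minimal sketch: all numerical values (the rates \(r\), \(g_{1}\), \(g_{2}\), \(\lambda _{1}\), \(\lambda _{2}\), the volatility \(\sigma \), the coefficients \(\alpha _{1}\), \(\alpha _{2}\), the time step and the horizon) are illustrative assumptions only.

import numpy as np

rng = np.random.default_rng(0)

# illustrative parameters (assumptions, not calibrated values)
r, g1, g2 = 0.03, 0.04, -0.01              # growth rates with g2 < g1; beta_i = r - g_i
sigma, alpha1, alpha2 = 0.15, 0.1, -0.1
lam1, lam2 = 0.5, 0.5                      # transition rates lambda_12, lambda_21
beta2 = r - g2

x0, y0 = 0.8, 0.5                          # initial debt ratio and initial filter value
T, n = 10.0, 10_000
dt = T / n

X, P = [x0], [y0]
for _ in range(n):
    dI = np.sqrt(dt) * rng.standard_normal()    # increment of I
    dI1 = np.sqrt(dt) * rng.standard_normal()   # increment of I^1, independent of I
    x, p = X[-1], P[-1]
    # uncontrolled debt ratio: dX = (beta2 + (g2 - g1) p) X dt + sigma X dI
    X.append(x + (beta2 + (g2 - g1) * p) * x * dt + sigma * x * dI)
    # filter (4.1): dp = (lam2 - (lam1 + lam2) p) dt + p (1 - p)((g2 - g1)/sigma dI + (alpha1 - alpha2) dI1)
    p_new = p + (lam2 - (lam1 + lam2) * p) * dt \
            + p * (1 - p) * ((g2 - g1) / sigma * dI + (alpha1 - alpha2) * dI1)
    P.append(min(max(p_new, 1e-8), 1 - 1e-8))   # numerical safeguard for the Euler step

print(X[-1], P[-1])

The clipping of the Euler step is purely numerical; as recalled in Sect. 4.1.1 below, the exact filter never leaves \((0,1)\).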

Since (4.1) admits a unique strong solution, Proposition 3.11 implies the following result.

Proposition 4.2

Under Assumption 4.1, solving (4.2) is equivalent to solving the original problem (2.5). That is,

$$ V_{\mathrm{po}}(x,y) = V(x,y) \qquad \textit{for any given and fixed } (x,y) \in (0,\infty ) \times (0,1), $$

and a control is optimal for the separated problem (4.2) if and only if it is optimal for the original problem (2.5) under partial observation.

In the following analysis, we need (for technical reasons due to the infinite horizon of our problem) to take a sufficiently large discount factor. Namely, defining

$$\begin{aligned} \rho _{o}:= & \bigg(\beta _{2} + \frac{1}{2}\sigma ^{2}\bigg) \vee \bigg( \gamma \beta _{2} + \frac{1}{2}\sigma ^{2}\gamma (\gamma -1) \bigg) \vee (2\beta _{2} + \sigma ^{2} ) \vee \big( 24\theta ^{2} -( \lambda _{1}+\lambda _{2}) \big) \\ & \vee (4\beta _{2} + 6\sigma ^{2} ) \vee \Big(4 \beta _{2} (2\vee \gamma ) + 2\sigma ^{2}(2\vee \gamma ) \big( 4 (2\vee \gamma )-1 \big) \Big), \end{aligned}$$

with \(\theta ^{2}:=\frac{1}{2} (\frac{(g_{1}-g_{2})^{2}}{\sigma ^{2}} + ( \alpha _{1}-\alpha _{2})^{2} )\), we assume the following.

Assumption 4.3

One has \(\rho > \rho _{o}^{+}\).

In view of the growth condition on \(h\), Assumption 4.3 in particular ensures that \(\rho > \gamma \beta _{2} + \frac{1}{2}\sigma ^{2}\gamma (\gamma -1)\), so that the (trivial) admissible control \(\nu \equiv 0\) has a finite total expected cost.
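Since \(\rho _{o}\) is an explicit maximum of elementary expressions, Assumption 4.3 can be checked directly for any given parameter set. The following lines evaluate \(\theta ^{2}\) and \(\rho _{o}\) for the same hypothetical parameters used in the simulation sketch above (with \(\gamma =2\)); all values are assumptions made purely for illustration.

# hypothetical parameters, as in the simulation sketch above, and gamma from Assumption 4.1
r, g1, g2, sigma, alpha1, alpha2 = 0.03, 0.04, -0.01, 0.15, 0.1, -0.1
lam1, lam2, gamma = 0.5, 0.5, 2.0
beta2 = r - g2

theta2 = 0.5 * (((g1 - g2) / sigma) ** 2 + (alpha1 - alpha2) ** 2)
m = max(2.0, gamma)                        # the quantity 2 v gamma in the last term of rho_o
rho_o = max(beta2 + 0.5 * sigma ** 2,
            gamma * beta2 + 0.5 * sigma ** 2 * gamma * (gamma - 1),
            2 * beta2 + sigma ** 2,
            24 * theta2 - (lam1 + lam2),
            4 * beta2 + 6 * sigma ** 2,
            4 * beta2 * m + 2 * sigma ** 2 * m * (4 * m - 1))
print(theta2, rho_o)                       # Assumption 4.3 requires rho > max(rho_o, 0)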

4.1 The related optimal stopping problem

Motivated by the results of the previous sections (in particular Theorem 3.13), we now aim at solving (4.2) through the study of an auxiliary optimal stopping problem whose value function can be interpreted as the marginal value of the optimal debt reduction problem (cf. (3.23) and Theorem 3.13). Therefore, we can think informally of the solution to that optimal stopping problem as the optimal time at which the government should marginally reduce the debt ratio. The optimal stopping problem involves a two-dimensional diffusive process, and in the sequel, we provide an almost exclusively probabilistic analysis.

4.1.1 Formulation and preliminary results

Recall that \((I_{t},I^{1}_{t})_{t\geq 0}\) is a two-dimensional standard ℍ-Brownian motion, and introduce the two-dimensional diffusion process \((\widehat{X},\pi ):=(\widehat{X}_{t},\pi _{t})_{t\geq 0}\) solving the stochastic differential equations (SDEs)

$$ \left \{ \textstyle\begin{array}{rl} \displaystyle d\widehat{X}_{t}&= \widehat{X}_{t} \big(\beta _{2} +(g_{2}-g_{1}) \pi _{t} \big) dt + \sigma \widehat{X}_{t} dI_{t}, \\ d\pi _{t}&=\big(\lambda _{2} - (\lambda _{1}+\lambda _{2})\pi _{t} \big)dt + \pi _{t}(1-\pi _{t})\displaystyle \bigg( \frac{g_{2}-g_{1}}{\sigma }dI_{t} + (\alpha _{1}-\alpha _{2})dI^{1}_{t} \bigg) \end{array}\displaystyle \right . $$
(4.3)

with initial conditions \(\widehat{X}_{0}=x\), \(\pi _{0}=y\) for any \((x,y ) \in \mathcal{O} :=(0,\infty ) \times (0,1)\). Recall that \(\beta _{2}= r - g_{2}\).

Since the process \(\pi \) is bounded, classical results on SDEs ensure that (4.3) admits a unique strong solution that, when needed, we denote by \((\widehat{X}^{x,y},\pi ^{y})\) to stress its dependence on the initial datum \((x,y) \in \mathcal{O}\). In particular, one easily obtains

$$ \widehat{X}^{x,y}_{t} = x e^{(\beta _{2} - \frac{1}{2}\sigma ^{2})t + \sigma I_{t} + (g_{2}-g_{1})\int _{0}^{t} \pi ^{y}_{s} ds}, \qquad t \geq 0. $$
(4.4)

Moreover, it can be shown that Feller’s test of explosion (see e.g. Karatzas and Shreve [45, Chap. 5.5]) gives \(1= \mathbb{P}[\pi ^{y}_{t} \in (0,1), \forall t\geq 0]\) for all \(y \in (0,1)\). In fact, the boundary points 0 and 1 are classified as “entrance-not-exit”, hence are unattainable for the process \(\pi \). In other words, the diffusion \(\pi \) can start from 0 and 1, but it cannot reach any of those two points when starting from \(y \in (0,1)\) (we refer to Borodin and Salminen [6, Sect. II.6] for further details on boundary classification).

With regard to Remark 3.14, we study here the fully two-dimensional Markovian optimal stopping problem with value function

$$\begin{aligned} v(x,y) & := \inf _{\tau \geq 0}\mathbb{E}_{(x,y)}\bigg[\int _{0}^{ \tau } e^{- \rho t} \widehat{X}_{t} h'(\widehat{X}_{t}) dt + e^{- \rho \tau } \widehat{X}_{\tau }\bigg] \\ & \phantom{:}=: \inf _{\tau \geq 0} \widehat{\mathcal{J}}_{(x,y)}(\tau ), \qquad (x,y) \in \mathcal{O}. \end{aligned}$$
(4.5)

In (4.5), the optimisation is taken over all ℍ-stopping times, and \(\mathbb{E}_{(x,y)}\) denotes the expectation under the probability measure \(\mathbb{P}_{(x,y)}[\,\cdot \,]:=\mathbb{P}[\,\cdot \,|\widehat{X}_{0}=x, \pi _{0}=y]\).

Because \(\pi \) is positive, \(g_{2} - g_{1} <0\) and \(\rho >\beta _{2}\) by Assumption 4.3, (4.4) gives

$$ \liminf _{t \uparrow \infty } e^{-\rho t} \widehat{X}_{t} =0 \qquad \text{$\mathbb{P}_{(x,y)}$-a.s.,} $$

which implies via the convention (3.21) that \(e^{-\rho \tau } \widehat{X}_{\tau }=0\) on \(\{\tau =\infty \}\) for any ℍ-stopping time \(\tau \).

Clearly, \(v\geq 0\) since \(\widehat{X}\) is positive and \(h\) is increasing on \(\mathbb{R}_{+}\). Also, \(v \leq x\) on \(\mathcal{O}\), and we can therefore define the continuation region and the stopping region as

$$ \mathcal{C}:=\{(x,y) \in \mathcal{O}: v(x,y) < x\}, \qquad \mathcal{S}:=\{(x,y) \in \mathcal{O}: v(x,y) = x\}. $$
(4.6)

Notice that integrating by parts the term \(e^{-\rho \tau } \widehat{X}_{\tau }\), taking expectations and exploiting that \(\mathbb{E}[\int _{0}^{\tau } e^{-\rho s} \widehat{X}_{s} dI_{s}]=0\) for any ℍ-stopping time \(\tau \) (because \(\rho >\beta _{2} + \frac{1}{2}\sigma ^{2}\) by Assumption 4.3), we can equivalently rewrite (4.5) as

$$ v(x,y) := x + \inf _{\tau \geq 0} \mathbb{E}_{(x,y)}\bigg[\int _{0}^{ \tau } e^{-\rho t } \widehat{X}_{t} \Big(h'(\widehat{X}_{t})- \big( \rho - \beta _{2} -(g_{2}-g_{1})\pi _{t}\big)\Big) dt\bigg] $$
(4.7)

for any \((x,y) \in \mathcal{O}\). From (4.7), it is readily seen that

$$ \big\{ (x,y)\in \mathcal{O}: h'(x) - \big(\rho - \beta _{2} -(g_{2}-g_{1})y \big) < 0 \big\} \subseteq \mathcal{C}, $$

which implies

$$ \mathcal{S} \subseteq \big\{ (x,y)\in \mathcal{O}: h'(x) - \big(\rho - \beta _{2} -(g_{2}-g_{1})y\big) \geq 0 \big\} . $$
(4.8)
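For instance, for the quadratic cost \(h(x)=x^{2}\) (an illustrative choice satisfying Assumption 4.1, 3) with \(\gamma =2\)), one has \(h'(x)=2x\) and the two inclusions above become explicit:

$$ \Big\{ (x,y)\in \mathcal{O}: x < \tfrac{1}{2}\big(\rho - \beta _{2}+(g_{1}-g_{2})y\big)\Big\} \subseteq \mathcal{C}, \qquad \mathcal{S} \subseteq \Big\{ (x,y)\in \mathcal{O}: x \geq \tfrac{1}{2}\big(\rho - \beta _{2}+(g_{1}-g_{2})y\big)\Big\} . $$

In this example, the region where stopping can be optimal is thus bounded from below by a threshold that increases linearly in the filtered probability \(y\) of the good regime (recall that \(g_{1}>g_{2}\)).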

Moreover, since \(\rho \) satisfies Assumption 4.3 and \(0 \leq \pi _{t} \leq 1\) for all \(t\geq 0\), one has for any \((x,y) \in \mathcal{O}\) that

$$ \mathbb{E}_{(x,y)}\bigg[\int _{0}^{\infty } e^{- \rho t} \widehat{X}_{t} \big(h' (\widehat{X}_{t} ) + \rho + |\beta _{2}| + |g_{2}-g_{1}| \big) dt\bigg] < \infty , $$
(4.9)

and the family of random variables

$$ \bigg\{ \text{$\int _{0}^{\tau } e^{-\rho t } \widehat{X}_{t} \Big(h'( \widehat{X}_{t})- \big(\rho - \beta _{2} -(g_{2}-g_{1})\pi _{t}\big) \Big) dt: \tau $ is an $\mathbb{H}$-stopping time}\bigg\} $$

is therefore ℍ-uniformly integrable under \(\mathbb{P}_{(x,y)}\).

Preliminary properties of \(v\) are given in the next proposition.

Proposition 4.4

The following hold:

(i) \(x \mapsto v(x,y)\) is increasing for any \(y \in (0,1)\).

(ii) \(y \mapsto v(x,y)\) is decreasing for any \(x \in (0,\infty )\).

(iii) \((x,y) \mapsto v(x,y)\) is continuous in \(\mathcal{O}\).

Proof

(i) Recall (4.5). By the strict convexity and monotonicity of \(h\) and (4.4), it follows that \(x \mapsto \widehat{\mathcal{J}}_{(x,y)}(\tau )\) is increasing for any ℍ-stopping time \(\tau \) and any \(y \in (0,1)\). Hence the claim is proved.

(ii) This is due to the fact that \(y \mapsto \widehat{\mathcal{J}}_{(x,y)}(\tau )\) is decreasing for any stopping time \(\tau \) and \(x \in (0,\infty )\). Indeed, the mapping \(y \mapsto \widehat{X}^{x,y}_{t}\) is a.s. decreasing for any \(t\geq 0\) because \(y \mapsto \pi ^{y}_{t}\) is a.s. increasing by the comparison theorem of Yamada and Watanabe (see e.g. Karatzas and Shreve [45, Proposition 5.2.18]) and \(g_{2}-g_{1}<0\), and \(x \mapsto xh'(x)\) is increasing.

(iii) Since \((x,y) \mapsto (\widehat{X}^{x,y}_{t}, \pi ^{y}_{t})\) is a.s. continuous for any \(t\geq 0\), it is not hard to verify that \((x,y) \mapsto \widehat{\mathcal{J}}_{(x,y)}(\tau )\) is continuous for any given \(\tau \geq 0\). Hence \(v\) is upper semicontinuous. We now show that it is also lower semicontinuous.

Let \((x,y) \in \mathcal{O}\) and let \((x_{n},y_{n})_{n \in \mathbb{N}} \subseteq \mathcal{O}\) be any sequence converging to \((x,y)\). Without loss of generality, we may take \((x_{n},y_{n}) \in (x-\delta ,x+\delta ) \times (y-\delta ,y+\delta )\) for a suitable \(\delta >0\). Letting \(\tau ^{n}_{\varepsilon }:=\tau ^{n}_{\varepsilon }(x_{n},y_{n})\) be \(\varepsilon \)-optimal for \(v(x_{n},y_{n})\), but suboptimal for \(v(x,y)\), we can then write

$$\begin{aligned} v(x,y) - v(x_{n},y_{n}) &\leq \mathbb{E}\bigg[\int _{0}^{\tau ^{n}_{ \varepsilon }} e^{-\rho t}\big( \widehat{X}^{x,y}_{t} h' ( \widehat{X}^{x,y}_{t} ) - \widehat{X}^{x_{n},y_{n}}_{t} h' ( \widehat{X}^{x_{n},y_{n}}_{t} )\big) dt \bigg] \\ & \phantom{=:}+ \mathbb{E}\big[e^{-\rho \tau ^{n}_{\varepsilon }} \big(\widehat{X}^{x,y}_{ \tau ^{n}_{\varepsilon }} - \widehat{X}^{x_{n},y_{n}}_{\tau ^{n}_{ \varepsilon }} \big) \big] + \varepsilon . \end{aligned}$$

Notice now that a.s.

$$\begin{aligned} & \int _{0}^{\tau ^{n}_{\varepsilon }} e^{-\rho t} \big| \widehat{X}^{x,y}_{t} h' (\widehat{X}^{x,y}_{t} ) - \widehat{X}^{x_{n},y_{n}}_{t} h' ( \widehat{X}^{x_{n},y_{n}}_{t} ) \big| dt \\ & \leq \int _{0}^{\infty } e^{-\rho t}\big(\widehat{X}^{x,y}_{t} h' ( \widehat{X}^{x,y}_{t} ) + \widehat{X}^{x+\delta ,y-\delta }_{t} h' ( \widehat{X}^{x+\delta ,y-\delta }_{t} )\big) dt, \end{aligned}$$

where we have used that \(x \mapsto \widehat{X}^{x,y}\) is increasing, \(y \mapsto \widehat{X}^{x,y}\) is decreasing and \(x \mapsto xh'(x)\) is positive and increasing. The random variable on the right-hand side above is independent of \(n\) and integrable due to (4.9). Also, using integration by parts and performing standard estimates, we can write that a.s.

$$\begin{aligned} & e^{-\rho \tau ^{n}_{\varepsilon }} \big(\widehat{X}^{x,y}_{\tau ^{n}_{ \varepsilon }} - \widehat{X}^{x_{n},y_{n}}_{\tau ^{n}_{\varepsilon }} \big) \\ &\leq |x-x_{n}| + \int _{0}^{\infty }e^{-\rho s} (\rho + |\beta _{2}| + |g_{2}-g_{1}| ) (\widehat{X}^{x,y}_{s} + \widehat{X}^{x+\delta ,y- \delta }_{s} ) ds, \end{aligned}$$

and the last integral above is independent of \(n\) and has finite expectation due to (4.9). Then taking limits as \(n\uparrow \infty \), invoking the dominated convergence theorem thanks to the previous estimates and using that \((x,y) \mapsto (\widehat{X}^{x,y}_{t}, \pi ^{y}_{t})\) is a.s. continuous for any \(t\geq 0\), we find (after rearranging terms) that

$$ \liminf _{n\uparrow \infty }v(x_{n},y_{n}) \geq v(x,y) - \varepsilon . $$

We thus conclude that \(v\) is lower semicontinuous at \((x,y)\) by arbitrariness of \(\varepsilon \). Since \((x,y) \in \mathcal{O}\) was arbitrary as well, \(v\) is lower semicontinuous on \(\mathcal{O}\). □

Due to Proposition 4.4 (iii), the stopping region is closed whereas the continuation region is open. Moreover, thanks to (4.9) and the \(\mathbb{P}_{(x,y)}\)-a.s. continuity of the paths of the process \(( \int _{0}^{t} e^{- \rho s} \widehat{X}_{s} (h'(\widehat{X}_{s})-( \rho - \beta _{2} - (g_{2}-g_{1})\pi _{s} )) ds )_{t\geq 0}\), we can apply Karatzas and Shreve [46, Theorem D.12] to obtain that the first entry time of \((\widehat{X},\pi )\) into \(\mathcal{S}\) is optimal for (4.5), that is,

$$ \tau ^{\star }(x,y):=\inf \{t\geq 0: (\widehat{X}_{t}, \pi _{t}) \in \mathcal{S} \} \qquad \text{$\mathbb{P}_{(x,y)}$-a.s.}, (x,y) \in \mathcal{O}, $$
(4.10)

attains the infimum in (4.5) (with the usual convention \(\inf \emptyset = \infty \)). Also, standard arguments based on the strong Markov property of \((\widehat{X},\pi )\) (see e.g. Peskir and Shiryaev [57, Theorem I.2.4]) allow one to show that \(\mathbb{P}_{(x,y)}\)-a.s., the process \(S:= (S_{t} )_{t\geq 0}\) with

$$\begin{aligned} S_{t}:=e^{-\rho t}v(\widehat{X}_{t},\pi _{t}) + \int _{0}^{t} e^{- \rho s} \widehat{X}_{s} h' (\widehat{X}_{s} ) ds \end{aligned}$$

is an ℍ-submartingale, and the stopped process \((S_{t\wedge \tau ^{\star }} )_{t\geq 0}\) is an ℍ-martingale. The latter two conditions are usually referred to as the subharmonic characterisation of the value function \(v\).

We now rule out the possibility of an empty stopping region.

Lemma 4.5

The stopping region of (4.6) is not empty.

Proof

We argue by contradiction and suppose that \(\mathcal{S} = \emptyset \). Hence for any \((x,y) \in \mathcal{O}\), we can write

$$\begin{aligned} x > v(x,y) =& \mathbb{E}_{(x,y)}\bigg[\int _{0}^{\infty } e^{-\rho t} \widehat{X}_{t} h'(\widehat{X}_{t}) dt\bigg] \\ \geq &K_{o} x^{\gamma } \mathbb{E}_{(1,y)}\bigg[\int _{0}^{\infty } e^{- \rho t} \widehat{X}_{t}^{\gamma } dt\bigg] - \frac{K}{\rho }, \end{aligned}$$

where the inequality \(xh'(x) \geq h(x)\), due to convexity of \(h\), and the growth condition assumed on \(h\) (cf. Assumption 4.1) have been used. Now by taking \(x\) sufficiently large, we reach a contradiction since \(\gamma >1\) by assumption. Hence \(\mathcal{S} \neq \emptyset \). □

Proposition 4.6

For any \(y \in (0,1)\), let

$$ \overline{x}(y):=\inf \{x>0: v(x,y) \geq x\} $$
(4.11)

with the convention \(\inf \emptyset = \infty \). Then:

(i) We have

$$ \mathcal{C}=\{(x,y) \in \mathcal{O}: x < \overline{x}(y)\}, \qquad \mathcal{S}=\{(x,y) \in \mathcal{O}: x \geq \overline{x}(y)\}. $$
(4.12)

(ii) \(y \mapsto \overline{x}(y)\) is increasing and left-continuous.

(iii) There exist \(0 < x_{\star } < x^{\star } < \infty \) such that for any \(y \in [0,1]\),

$$ (h')^{-1} (\rho - \beta _{2}) \vee x_{\star } \leq \overline{x}(y) \leq x^{\star }. $$

Proof

(i) To show (4.12), it suffices to show that if \((x_{1},y) \in \mathcal{S}\), then \((x_{2},y) \in \mathcal{S}\) for any \(x_{2} \geq x_{1}\). Let \(\tau ^{\varepsilon }:= \tau ^{\varepsilon }(x_{2},y)\) be an \(\varepsilon \)-optimal stopping time for \(v(x_{2},y)\). Then exploiting \(\widehat{X}^{x_{2},y}_{t} = \frac{x_{2}}{x_{1}}\widehat{X}^{x_{1},y}_{t} \geq \widehat{X}^{x_{1},y}_{t}\) a.s. and monotonicity of \(h'\), (4.7) yields

$$\begin{aligned} 0 \geq & v(x_{2},y) - x_{2} \\ \geq &\mathbb{E}\bigg[\int _{0}^{\tau ^{\varepsilon }} e^{- \rho t} \widehat{X}^{x_{2},y}_{t}\Big(h' (\widehat{X}^{x_{2},y}_{t} ) - \big(\rho - \beta _{2} - (g_{2}-g_{1})\pi ^{y}_{t}\big)\Big) dt\bigg] - \varepsilon \\ \geq & \frac{x_{2}}{x_{1}}\,\mathbb{E}\bigg[\int _{0}^{\tau ^{ \varepsilon }} e^{- \rho t} \widehat{X}^{x_{1},y}_{t}\Big(h' ( \widehat{X}^{x_{1},y}_{t} ) - \big(\rho - \beta _{2} - (g_{2}-g_{1}) \pi ^{y}_{t}\big)\Big) dt\bigg] - \varepsilon \\ \geq & \frac{x_{2}}{x_{1}}\,\big( v(x_{1},y) - x_{1}\big) - \varepsilon = - \varepsilon . \end{aligned}$$

Therefore, by arbitrariness of \(\varepsilon \), we conclude that \((x_{2},y) \in \mathcal{S}\) as well, and therefore that \(\overline{x}\) in (4.11) splits \(\mathcal{C}\) and \(\mathcal{S}\) as in (4.12).

(ii) Let \((x,y_{1})\in \mathcal{C}\). Since \(y \mapsto v(x,y)\) is decreasing by Proposition 4.4 (ii), it follows that \((x,y_{2})\in \mathcal{C}\) for any \(y_{2} \geq y_{1}\). This in turn implies that \(y \mapsto \overline{x}(y)\) is increasing. The monotonicity of \(y \mapsto \overline{x}(y)\) together with the fact that \(\mathcal{S}\) is closed then gives the claimed left-continuity by standard arguments.

(iii) Let \(\Theta ^{x}_{t}:= x \exp ((\beta _{2} - \frac{1}{2}\sigma ^{2} + (g_{2}-g_{1}))t + \sigma I_{t})\) and introduce the one-dimensional optimal stopping problem

$$\begin{aligned} & v^{\star }(x):= \inf _{\tau \geq 0} \mathbb{E}\bigg[\int _{0}^{\tau } e^{- \rho t} \Theta ^{x}_{t} h'(\Theta ^{x}_{t}) dt + e^{-\rho \tau } \Theta ^{x}_{\tau }\bigg], \qquad x>0. \end{aligned}$$

Because \(g_{2} - g_{1} <0\), \(h'\) is increasing and \(\pi ^{y}_{t} \leq 1\) a.s. for all \(t\geq 0\) and \(y\in (0,1)\), it is not hard to see that \(v(x,y) \geq v^{\star }(x)\) for any \((x,y) \in \mathcal{O}\).

By arguments similar to those employed to prove (i), one can show that there exists \(x^{\star }\) such that \(\{x \in (0,\infty ): v^{\star }(x) \geq x\} = \{x \in (0,\infty ): x \geq x^{\star }\}\). In fact, by arguing as in the proof of Lemma 4.5, the latter set is not empty. Then we have the inclusions

$$\begin{aligned} & \{(x,y) \in \mathcal{O}: x \geq x^{\star }\} \subseteq \{(x,y) \in \mathcal{O}: v(x,y) \geq x\} = \{(x,y) \in \mathcal{O}: x \geq \overline{x}(y)\}, \end{aligned}$$

which in turn show that \(\overline{x}(y) \leq x^{\star }\) for all \(y \in (0,1)\). Hence also \(\overline{x}(y)\leq x^{\star }\) for all \(y \in [0,1]\), setting \(\overline{x}(0+) :=\lim _{y\downarrow 0}\overline{x}(y)\) by monotonicity and \(\overline{x}(1):=\lim _{y\uparrow 1}\overline{x}(y)\) by left-continuity. As for the lower bound of \(\overline{x}\), notice that (4.8) implies

$$ \overline{x}(y) \geq (h')^{-1}\big(\rho - \beta _{2} - (g_{2}-g_{1})y \big)=:\zeta (y), \qquad y \in (0,1), $$
(4.13)

where \((h')^{-1}(\,\cdot \,)\) is the inverse of the strictly increasing function \(h': [0,\infty ) \to (0,\infty )\) (notice that \(\rho - \beta _{2} - (g_{2}-g_{1})y \geq 0 \) since \(\rho >\beta _{2}\), \(g_{2}-g_{1}<0\) and \(y >0\)). Since \((h')^{-1}\) is strictly increasing and \(-(g_{2}-g_{1})y \geq 0 \), we can conclude from (4.13) that \(\overline{x}(y) \geq (h')^{-1} (\rho - \beta _{2})\) for every \(y\in [0,1]\). Moreover, setting

$$ \Psi ^{x}_{t}:= x \exp \bigg(\Big(\beta _{2} - \frac{1}{2}\sigma ^{2}\Big)t + \sigma I_{t}\bigg) $$

and introducing the one-dimensional optimal stopping problem

$$\begin{aligned} & v_{\star }(x):= \inf _{\tau \geq 0} \mathbb{E}\bigg[\int _{0}^{\tau } e^{- \rho t} \Psi ^{x}_{t} h'(\Psi ^{x}_{t}) dt + e^{-\rho \tau } \Psi ^{x}_{\tau }\bigg], \qquad x>0, \end{aligned}$$

one has \(v(x,y) \leq v_{\star }(x)\) for any \((x,y) \in \mathcal{O}\). Following arguments similar to those employed above and defining \(x_{\star }:=\inf \{x>0: v_{\star }(x) \geq x\} \in (0,\infty )\), the last inequality implies that \(\overline{x}(y) \geq x_{\star }\) for all \(y \in [0,1]\). □

4.1.2 Smooth-fit property and continuity of the free boundary

We now aim at proving further regularity of \(v\) and the free boundary \(\overline{x}\).

The second-order linear elliptic differential operator

$$\begin{aligned} \mathbb{L} &:= \big(\beta _{2} + (g_{2}-g_{1})y\big)x \frac{\partial }{\partial x} + \frac{1}{2}\sigma ^{2} x^{2} \frac{\partial ^{2}}{\partial x^{2}} + \big(\lambda _{2} - (\lambda _{1} + \lambda _{2})y\big) \frac{\partial }{\partial y} \\ & \phantom{=::}+ \frac{1}{2} \bigg((\alpha _{1}-\alpha _{2})^{2} + \frac{(g_{2}-g_{1})^{2}}{\sigma ^{2}}\bigg)y^{2}(1-y)^{2} \frac{\partial ^{2}}{\partial y^{2}} + (g_{2}-g_{1})\, x\, y(1-y) \frac{\partial ^{2}}{\partial x \partial y}, \end{aligned}$$
(4.14)

acting on functions \(f \in C^{2}(\mathcal{O})\), is the infinitesimal generator of the process \((\widehat{X},\pi )\). The nondegeneracy of the process \((\widehat{X},\pi )\) and the smoothness of the coefficients in (4.14), together with the subharmonic characterisation of \(v\) and classical regularity results for elliptic partial differential equations (see e.g. Gilbarg and Trudinger [37, Sect. 6.6.3]), allow one to prove the following result by standard arguments (see e.g. [57, Sect. 3.7.1]).

Lemma 4.7

The value function \(v\) of (4.5) belongs to \(C^{2}\) separately in the interior of \(\mathcal{C}\) and in the interior of \(\mathcal{S}\) (i.e., away from the boundary \(\partial \mathcal{C}\) of \(\mathcal{C}\)). Moreover, in the interior of \(\mathcal{C}\), it satisfies

$$ (\mathbb{L} - \rho )v(x,y) = - x h'(x), $$

with \(\mathbb{L}\) as in (4.14).

We continue our analysis by proving that the value function of (4.5) belongs to the class \(C^{1}((0,\infty ) \times (0,1))\). This will be obtained through probabilistic methods that rely on the regularity (in the sense of diffusions) of the stopping set \(\mathcal{S}\) for the process \((\widehat{X},\pi )\) (see De Angelis and Peskir [22] where this methodology has recently been developed in a general context; for other examples, refer to De Angelis et al. [21] as well as to Johnson and Peskir [43]). Recall that the boundary points are regular for \(\mathcal{S}\) with respect to \((\widehat{X},\pi )\) if (cf. Karatzas and Shreve [45, Definition 4.2.9])

$$ \widehat{\tau }(x_{o},y_{o}):=\inf \{t>0: (\widehat{X}^{x_{o},y_{o}}_{t}, \pi ^{y_{o}}_{t}) \in \mathcal{S}\} = 0 \quad \text{a.s.}, \qquad \forall (x_{o},y_{o}) \in \partial \mathcal{C}. $$
(4.15)

The time \(\widehat{\tau }(x_{o},y_{o})\) is the first hitting time of \((\widehat{X}^{x_{o},y_{o}},\pi ^{y_{o}})\) to \(\mathcal{S}\).

Notice that defining \(U_{t}:=\ln \widehat{X}_{t}\), one has

$$ dU_{t}=\bigg(\beta _{2} + (g_{2}-g_{1})\pi _{t} - \frac{1}{2}\sigma ^{2} \bigg)dt + \sigma dI_{t} $$

as well as \(\mathbb{E}_{(x,y)} [f(\widehat{X}_{t},\pi _{t}) ]=\mathbb{E}_{(u,y)} [f(e^{U_{t}},\pi _{t}) ]\) for every bounded Borel function \(f: \mathbb{R}^{2} \to \mathbb{R}\), where \(u:=\ln x\). Due to the nondegeneracy of the process \((U,\pi )\) and the smoothness and boundedness of its coefficients, the pair \((U,\pi )\) has a continuous transition density \(\widehat{p}(\cdot ,\cdot ,\cdot ;u,y)\), \((u,y) \in \mathbb{R} \times (0,1)\), such that for any \((u',y') \in \mathbb{R} \times (0,1)\) and \(t> 0\) (see e.g. Aronson [1]),

$$\begin{aligned} \frac{M}{t}\exp \bigg(-\lambda \frac{ (u-u')^{2} + (y-y')^{2} }{t} \bigg) \geq & \widehat{p}(t,u',y';u,y) \\ \geq & \frac{m}{t}\exp \bigg(-\Lambda \frac{ (u-u')^{2} + (y-y')^{2} }{t}\bigg), \end{aligned}$$
(4.16)

for some constants \(M>m>0\) and \(\Lambda > \lambda >0\). It thus follows that the mapping \((u,y) \mapsto \mathbb{E}_{(u,y)} [f(e^{U_{t}}, \pi _{t}) ]\) is continuous, so that \((U,\pi )\) is a strong Feller process. Hence \((\widehat{X},\pi )\) is strong Feller as well, and we can therefore conclude that (4.15) holds if and only if (see Dynkin [26, Chap. 13.1-2])

$$ \tau ^{\star }(x_{n},y_{n}) \rightarrow 0 \quad \text{a.s.} \qquad \text{whenever $ \mathcal{C} \supseteq (x_{n},y_{n})_{n \in \mathbb{N}} \rightarrow (x_{o},y_{o}) \in \partial \mathcal{C}$,} $$

where \(\tau ^{\star }\) is as in (4.10).

The next proposition shows the validity of (4.15).

Proposition 4.8

The boundary points in \(\partial \mathcal{C}\) are regular for \(\mathcal{S}\) with respect to \((\widehat{X},\pi )\), that is, (4.15) holds.

Proof

Let \((x_{o},y_{o})\in \partial \mathcal{C}\) and set \(u_{o}:=\ln x_{o}\). We set \(\widehat{\sigma }(u_{o},y_{o}):=\widehat{\tau }(e^{u_{o}},y_{o})\) for any given \((u_{o},y_{o}) \in \mathbb{R} \times (0,1)\) and equivalently rewrite (4.15) in terms of the process \((U,\pi )\), with \(U\) as defined above, as

$$\begin{aligned} \widehat{\sigma }(u_{o},y_{o})&=\inf \{t>0: U^{u_{o},y_{o}}_{t} \geq \ln \overline{x}(\pi ^{y_{o}}_{t})\} \\ &= 0 \quad \mbox{a.s.}, \quad \mbox{for all $ (u_{o},y_{o})$ such that $u_{o}=\ln \overline{x}(y_{o})$.} \end{aligned}$$

Given that \(y \mapsto \ln \overline{x}(y)\) is increasing (as is \(y \mapsto \overline{x}(y)\)), the region

$$ \widehat{\mathcal{S}}:=\{(u,y) \in \mathbb{R} \times (0,1): u \geq \ln \overline{x}(y)\} $$

enjoys the so-called cone property (see Karatzas and Shreve [45, Definition 4.2.18]). In particular, we can always construct a cone \(C_{o}\) with vertex at \((u_{o},y_{o})\) and aperture \(0 < \phi \leq \pi /2\) such that \(C_{o} \cap (\mathbb{R} \times (0,1)) \subseteq \widehat{\mathcal{S}}\) and for any \(t_{o}> 0\), we have

$$ \mathbb{P}[\widehat{\sigma }(u_{o},y_{o}) \leq t_{o}] \geq \mathbb{P}[(U^{u_{o},y_{o}}_{t_{o}}, \pi ^{y_{o}}_{t_{o}} ) \in C_{o}]. $$
(4.17)

Then using (4.16), one has

$$\begin{aligned} \mathbb{P}[ (U^{u_{o},y_{o}}_{t_{o}}, \pi ^{y_{o}}_{t_{o}}) \in C_{o}] &= \int _{C_{o}} \widehat{p}(t_{o},u_{o},y_{o};u,y) du \ dy \\ &\geq \int _{C_{o}} \frac{m}{t_{o}}e^{-\Lambda \frac{(u-u_{o})^{2} + (y-y_{o})^{2}}{t_{o}}} du \ dy \\ & = m \int _{C_{o}} e^{-\Lambda ((u')^{2} + (y')^{2} )} du' dy' =: \ell > 0, \end{aligned}$$
(4.18)

using that the change of variables \(u':= (u-u_{o})/\sqrt{t_{o}}\) and \(y':= (y-y_{o})/\sqrt{t_{o}}\) maps the cone \(C_{o}\) into itself. The number \(\ell \) above depends on \(u_{o}\), \(y_{o}\), but is independent of \(t_{o}\). From (4.17) and (4.18), we thus have \(\mathbb{P}[\widehat{\sigma }(u_{o},y_{o}) \leq t_{o}] \geq \ell \), and letting \(t_{o} \downarrow 0\) yields \(\mathbb{P}[\widehat{\sigma }(u_{o},y_{o}) = 0] \geq \ell > 0\). However, \(\{\widehat{\sigma }(u_{o},y_{o}) = 0\} \in \mathcal{H}_{0}\), and by the Blumenthal 0–1 law, we obtain \(\mathbb{P}[\widehat{\sigma }(u_{o},y_{o}) = 0] =1\), which completes the proof. □

Theorem 4.9

One has that \(v \in C^{1}(\mathcal{O})\).

Proof

The value function belongs to \(C^{2}\) in the interior of the continuation region due to Lemma 4.7, and it is \(C^{\infty }\) in the interior of the stopping region where \(v(x,y)=x\). It thus only remains to prove that \(v\) is continuously differentiable across \(\partial \mathcal{C}\). In the sequel, we prove that (i) the function \(\overline{w}( x,y) :=\frac{1}{x}(v(x,y) - x)\) has a continuous derivative with respect to \(x\) across \(\partial \mathcal{C}\) (and this clearly implies the continuity of \(v_{x}\) across \(\partial \mathcal{C}\)); (ii) the function \(v_{y}\) is continuous across \(\partial \mathcal{C}\).

(i) Continuity of \(v_{x}\) across \(\partial \mathcal{C}\): For the subsequent arguments, it is useful to notice that the function \(\overline{w}\) admits the representation (recall (4.7))

$$ \overline{w}(x,y)=\inf _{\tau \geq 0}\mathbb{E}\bigg[\int _{0}^{\tau } e^{-\rho s} \widehat{X}^{1,y}_{s}\Big(h' (\widehat{X}^{x,y}_{s} ) - \big(\rho - \beta _{2} -(g_{2}-g_{1})\pi ^{y}_{s}\big)\Big) ds\bigg] $$
(4.19)

and to bear in mind that the optimal stopping time \(\tau ^{\star }\) for \(v\) in (4.10) is also optimal for \(\overline{w}\) since \(v \geq x\) if and only if \(\overline{w}\geq 0\). We now prove that \(\overline{w}_{x}\) is continuous across \(\partial \mathcal{C}\), thus implying continuity of \(v_{x}\) across \(\partial \mathcal{C}\).

Take \((x,y) \in \mathcal{C}\) and let \(\varepsilon >0\) be such that \(x-\varepsilon >0\). Since \(x \mapsto \overline{w}(x,y)\) is increasing due to the monotonicity of \(h'\), it is clear that \((x-\varepsilon ,y) \in \mathcal{C}\) as well. Denote by \(\tau ^{\star }_{\varepsilon }(x,y):=\tau ^{\star }(x-\varepsilon ,y)\) the optimal stopping time for \(\overline{w}(x-\varepsilon ,y)\) and notice that \(\tau ^{\star }_{\varepsilon }(x,y)\) is suboptimal for \(\overline{w}(x,y)\) and \(\tau ^{\star }_{\varepsilon }(x,y)\rightarrow \tau ^{\star }(x,y)\) a.s. as \(\varepsilon \downarrow 0\). To simplify the exposition, we write \(\tau ^{\star }_{\varepsilon }:=\tau ^{\star }_{\varepsilon }(x,y)\) and \(\tau ^{\star }:=\tau ^{\star }(x,y)\) in the sequel. We then have from (4.19) that

$$\begin{aligned} 0 \leq \frac{\overline{w}(x,y) - \overline{w}(x-\varepsilon ,y)}{\varepsilon }& \leq \frac{1}{\varepsilon }\mathbb{E}\bigg[\int _{0}^{\tau ^{\star }_{ \varepsilon }} e^{-\rho t} \widehat{X}^{1,y}_{t}\big(h' (\widehat{X}^{x,y}_{t} ) - h' (\widehat{X}^{x-\varepsilon ,y}_{t} )\big) dt \bigg] \\ & = \mathbb{E}\bigg[\int _{0}^{\tau ^{\star }_{\varepsilon }} e^{- \rho t} (\widehat{X}^{1,y}_{t})^{2} h'' (\widehat{X}^{\xi _{ \varepsilon },y}_{t} ) dt \bigg] \end{aligned}$$
(4.20)

for some \(\xi _{\varepsilon } \in (x-\varepsilon ,x)\), where we have used in the last step the mean value theorem and the fact that \(\widehat{X}^{x,y}_{t} - \widehat{X}^{x-\varepsilon ,y}_{t} = \varepsilon \widehat{X}^{1,y}_{t}\). Letting \(\varepsilon \downarrow 0\), invoking dominated convergence (thanks to the fact that \(\rho > (\gamma \beta _{2} + \frac{1}{2}\sigma ^{2} \gamma (\gamma -1) ) \vee (2\beta _{2} + \sigma ^{2} )\) by Assumption 4.3) and using that \(\overline{w} \in C^{1}(\mathcal{C})\) (since \(v \in C^{1}(\mathcal{C})\)), we then find from (4.20) that

$$ 0 \leq \overline{w}_{x}(x,y) \leq \mathbb{E}\bigg[\int _{0}^{\tau ^{ \star }} e^{-\rho t} (\widehat{X}^{1,y}_{t})^{2} h'' (\widehat{X}^{x,y}_{t} ) dt \bigg]. $$
(4.21)

Now let \((x_{o},y_{o})\) be an arbitrary point belonging to \(\partial \mathcal{C}\). Taking limits \((x,y) \rightarrow (x_{o},y_{o})\) in (4.21), using dominated convergence and Proposition 4.8, we obtain

$$ 0 \leq \liminf _{(x,y) \rightarrow (x_{o},y_{o}) \in \partial \mathcal{C}}\overline{w}_{x}(x,y) \leq \limsup _{(x,y) \rightarrow (x_{o},y_{o}) \in \partial \mathcal{C}}\overline{w}_{x}(x,y) \leq 0, $$

thus proving that \(\overline{w}_{x}\) is continuous across \(\partial \mathcal{C}\). This immediately implies the continuity of \(v_{x}\) across \(\partial \mathcal{C}\), upon recalling that \(v(x,y)=x(\overline{w}(x,y)+1)\).

(ii) Continuity of \(v_{y}\) across \(\partial \mathcal{C}\): Take again \((x,y) \in \mathcal{C}\) and \(\varepsilon >0\) such that \(y+\varepsilon <1\). Since \(y \mapsto v(x,y)\) is decreasing by Proposition 4.4 (ii), it is clear that \((x,y+\varepsilon ) \in \mathcal{C}\) as well. Denote by \(\tau ^{\star }_{\varepsilon }(x,y):= \tau ^{\star } (x, y+\varepsilon )\) the optimal stopping time for \(v(x, y +\varepsilon )\) and notice that \(\tau ^{\star }_{\varepsilon }(x,y)\) is suboptimal for \(v(x,y)\) and \(\tau ^{\star } (x, y+\varepsilon ) \rightarrow \tau ^{\star } (x,y) \) a.s. as \(\varepsilon \downarrow 0\). To simplify the notation, we write \(\tau ^{\star }_{\varepsilon }\) instead of \(\tau ^{\star }_{\varepsilon }(x,y)\) in the sequel. From Proposition 4.4 (ii) and (4.7), we then have

$$\begin{aligned} 0 &\ge \frac{v(x,y+\varepsilon ) - v(x,y)}{\varepsilon } \\ & \ge \frac{1}{\varepsilon } \ \mathbb{E}\bigg[\int _{0}^{\tau ^{ \star }_{\varepsilon }} e^{- \rho t} \widehat{X}^{x,y+\varepsilon }_{t} \Big( h' ( \widehat{X}^{x,y+\varepsilon }_{t} ) - \big( \rho - \beta _{2} - \pi ^{y+\varepsilon }_{t} (g_{2} - g_{1}) \big) \Big) dt \bigg] \\ & \phantom{=:}- \frac{1}{\varepsilon } \ \mathbb{E}\bigg[\int _{0}^{\tau ^{\star }_{ \varepsilon }} e^{- \rho t} \widehat{X}^{x,y}_{t} \Big( h' ( \widehat{X}^{x,y}_{t} ) - \big( \rho - \beta _{2} - \pi ^{y}_{t} (g_{2} - g_{1}) \big) \Big) dt \bigg] \\ &= \frac{1}{\varepsilon } \mathbb{E}\bigg[ \int _{0}^{\tau ^{\star }_{ \varepsilon }} e^{- \rho t} \Big( \big(\widehat{X}^{x,y+\varepsilon }_{t} h' ( \widehat{X}^{x,y+\varepsilon }_{t} ) - \widehat{X}^{x,y}_{t} h' ( \widehat{X}^{x,y}_{t} ) \big) \\ &\phantom{=:}\qquad \qquad \qquad \,- (\rho - \beta _{2}) ( \widehat{X}^{x,y+ \varepsilon }_{t} - \widehat{X}^{x,y}_{t} ) \Big) dt \bigg] \\ & \phantom{=:}+ \frac{1}{\varepsilon } \ \mathbb{E}\bigg[\int _{0}^{\tau ^{\star }_{ \varepsilon }} e^{- \rho t} (g_{2} - g_{1}) ( \widehat{X}^{x,y+ \varepsilon }_{t} \pi ^{y+\varepsilon }_{t} - \widehat{X}^{x,y}_{t} \pi ^{y}_{t} ) dt \bigg]. \end{aligned}$$

Now add and subtract on the right-hand side both \(\mathbb{E}[\int _{0}^{\tau ^{\star }_{\varepsilon }} e^{- \rho t} \widehat{X}^{x,y+\varepsilon }_{t} h'( \widehat{X}^{x,y}_{t}) dt]\) and \((g_{2}-g_{1})\mathbb{E}[\int _{0}^{\tau ^{\star }_{\varepsilon }} e^{- \rho t} \widehat{X}^{x,y+\varepsilon }_{t} \pi ^{y}_{t} dt]\) and recall that \(g_{2} - g_{1} <0\), \(\widehat{X}^{x,y}_{t} \ge 0\) a.s. and \(\pi ^{y+\varepsilon }_{t} - \pi ^{y}_{t} \ge 0\) a.s., for every \(t \ge 0\). Then after rearranging terms and using the integral mean value theorem for some \(L_{t}^{\varepsilon } \in (\widehat{X}^{x,y+\varepsilon }_{t}, \widehat{X}^{x,y}_{t})\) a.s., we obtain that

$$\begin{aligned} 0 &\ge \frac{v(x,y+\varepsilon ) - v(x,y)}{\varepsilon } \\ & \geq \frac{1}{\varepsilon } \ \mathbb{E}\bigg[ \int _{0}^{\tau ^{ \star }_{\varepsilon }} e^{- \rho t} \widehat{X}^{x,y+\varepsilon }_{t} \big( h' ( \widehat{X}^{x,y+\varepsilon }_{t} ) - h' ( \widehat{X}^{x,y}_{t} ) \big) dt \bigg] \\ & \phantom{=:}+ \frac{1}{\varepsilon } \ \mathbb{E}\bigg[\int _{0}^{\tau ^{\star }_{ \varepsilon }} e^{- \rho t} ( \widehat{X}^{x,y+\varepsilon }_{t} - \widehat{X}^{x,y}_{t} ) \Big( h' ( \widehat{X}^{x,y}_{t} ) - \big( \rho - \beta _{2} - \pi ^{y}_{t} (g_{2} - g_{1}) \big) \Big) dt \bigg] \\ & \phantom{=:}- \frac{1}{\varepsilon } \ | g_{2} - g_{1}| \ \mathbb{E}\bigg[\int _{0}^{ \tau ^{\star }_{\varepsilon }} e^{- \rho t} \widehat{X}^{x,y+ \varepsilon }_{t} ( \pi ^{y+\varepsilon }_{t} - \pi ^{y}_{t} ) dt \bigg] \\ &\geq \frac{1}{\varepsilon } \ \mathbb{E}\bigg[\int _{0}^{\tau ^{ \star }_{\varepsilon }} e^{- \rho t} ( \widehat{X}^{x,y+\varepsilon }_{t} - \widehat{X}^{x,y}_{t} ) \big(\widehat{X}^{x,y+\varepsilon }_{t} h'' ( L_{t}^{\varepsilon } ) + h' ( \widehat{X}^{x,y}_{t} ) \big) dt \bigg] \\ &\phantom{=:} - \frac{1}{\varepsilon } \ | g_{2} - g_{1}| \ \mathbb{E}\bigg[ \int _{0}^{ \tau ^{\star }_{\varepsilon }} e^{- \rho t} \widehat{X}^{x,y+ \varepsilon }_{t} ( \pi ^{y+\varepsilon }_{t} - \pi ^{y}_{t} ) dt \bigg]. \end{aligned}$$
(4.22)

In the last inequality, we have used that \(\rho - \beta _{2} - \pi ^{y}_{t} (g_{2} - g_{1}) \geq 0\) since \(\rho >\beta _{2}\) by Assumption 4.3, that \(g_{2}-g_{1}<0\) and that \(\widehat{X}^{x,y+\varepsilon }_{t} \leq \widehat{X}^{x,y}_{t}\).

Define now \(\Delta \pi ^{y}_{t}:=\frac{1}{\varepsilon }(\pi ^{y+\varepsilon }_{t} - \pi ^{y}_{t})\), \(t\geq 0\), and notice that by using the second equation in (4.3), we can write for any \(t\geq 0\) that

$$\begin{aligned} \Delta \pi ^{y}_{t} &= 1 - \int _{0}^{t} (\lambda _{1} + \lambda _{2}) \Delta \pi ^{y}_{s} ds \\ &\phantom{=:}+ \int _{0}^{t} \Delta \pi ^{y}_{s} (1- \pi ^{y+\varepsilon }_{s} - \pi ^{y}_{s} )\bigg(\frac{g_{2}-g_{1}}{\sigma }dI_{s} + (\alpha _{1}- \alpha _{2})dI^{1}_{s}\bigg). \end{aligned}$$

With the help of Itô’s formula, it can easily be shown that

$$\begin{aligned} \Delta \pi ^{y}_{t} = & \exp \bigg(-(\lambda _{1} + \lambda _{2})t - \theta ^{2}\int _{0}^{t} (1- \pi ^{y+\varepsilon }_{s} - \pi ^{y}_{s} )^{2}ds \bigg) \\ & \times \exp \bigg( \int _{0}^{t} (1- \pi ^{y+\varepsilon }_{s} - \pi ^{y}_{s} )\Big(\frac{g_{2}-g_{1}}{\sigma }dI_{s} + (\alpha _{1}- \alpha _{2})dI^{1}_{s}\Big)\bigg) \end{aligned}$$
(4.23)

with \(\theta ^{2}:=\frac{1}{2} (\frac{(g_{2}-g_{1})^{2}}{\sigma ^{2}} + ( \alpha _{1}-\alpha _{2})^{2} )\). Also, by (4.4) and simple algebra,

$$ \frac{1}{\varepsilon } ( \widehat{X}^{x,y+\varepsilon }_{t} - \widehat{X}^{x,y}_{t} ) = \widehat{X}^{x,y}_{t} \frac{ e^{\varepsilon (g_{2}-g_{1}) \int _{0}^{t} \Delta \pi ^{y}_{s} ds} - 1 }{\varepsilon } . $$
(4.24)

Using the definition of \(\Delta \pi ^{y}_{t}\) and (4.24) in (4.22) and \(\widehat{X}^{x,y+\varepsilon }_{t} \leq \widehat{X}^{x,y}_{t}\), one finds

$$\begin{aligned} 0 &\ge \frac{v(x,y+\varepsilon ) - v(x,y)}{\varepsilon } \\ &\geq \mathbb{E}\bigg[\int _{0}^{\tau ^{\star }_{\varepsilon }} e^{- \rho t} \widehat{X}^{x,y}_{t} \frac{ e^{\varepsilon (g_{2}-g_{1}) \int _{0}^{t} \Delta \pi ^{y}_{s} ds} - 1 }{\varepsilon } \big( \widehat{X}^{x,y}_{t} h'' ( L_{t}^{\varepsilon } ) + h' ( \widehat{X}^{x,y}_{t} )\big) dt \bigg] \\ & \phantom{=:} - | g_{2} - g_{1}| \ \mathbb{E}\bigg[\int _{0}^{\tau ^{\star }_{ \varepsilon }} e^{- \rho t} \widehat{X}^{x,y}_{t} \Delta \pi ^{y}_{t} dt \bigg]. \end{aligned}$$
(4.25)

We now aim at taking limits as \(\varepsilon \downarrow 0\) in (4.25). To this end, notice that \(\Delta \pi ^{y}_{t} \rightarrow Z^{y}_{t}\) a.s. for all \(t\geq 0\) as \(\varepsilon \downarrow 0\), where by Protter [58, Theorem V.7.39], \((Z^{y}_{t})_{t\geq 0}\) is the unique strong solution to

$$ d Z_{t}^{y} = - (\lambda _{1}+\lambda _{2}) Z_{t}^{y} dt + Z_{t}^{y} (1 - 2 \pi _{t}^{y} ) \bigg(\frac{g_{2}-g_{1}}{\sigma }dI_{t} + (\alpha _{1}- \alpha _{2})dI^{1}_{t}\bigg), \qquad t>0, $$

with \(Z_{0}^{y} = 1\). Then, if we are allowed to invoke the dominated convergence theorem when taking limits as \(\varepsilon \downarrow 0\) in (4.25), we obtain that

$$\begin{aligned} 0 &\ge v_{y}(x,y) \\ & \geq (g_{2}-g_{1})\mathbb{E}\bigg[\int _{0}^{\tau ^{\star }} e^{- \rho t} \widehat{X}^{x,y}_{t} \bigg(\int _{0}^{t} Z^{y}_{s} ds\bigg) \big( \widehat{X}^{x,y}_{t} h'' (\widehat{X}^{x,y}_{t} ) + h' ( \widehat{X}^{x,y}_{t} ) \big) dt \bigg] \\ &\phantom{=:} - | g_{2} - g_{1}| \ \mathbb{E}\bigg[\int _{0}^{\tau ^{\star }} e^{- \rho t} \widehat{X}^{x,y}_{t} Z^{y}_{t} dt\bigg] \end{aligned}$$
(4.26)

upon recalling that \(v\in C^{2}(\mathcal{C})\). Therefore, letting \((x_{o},y_{o})\) be an arbitrary point belonging to \(\partial \mathcal{C}\), by taking limits in (4.26) as \((x,y) \rightarrow (x_{o},y_{o})\) and using dominated convergence and Proposition 4.8, we obtain that

$$ 0 \geq \limsup _{(x,y) \rightarrow (x_{o},y_{o}) \in \partial \mathcal{C}}v_{y}(x,y) \geq \liminf _{(x,y) \rightarrow (x_{o},y_{o}) \in \partial \mathcal{C}}v_{y}(x,y) \geq 0, $$

thus proving that \(v_{y}\) is continuous across \(\partial \mathcal{C}\).

To complete the proof, it only remains to show that the dominated convergence theorem can be applied when taking limits as \(\varepsilon \downarrow 0\) in (4.25). We show this in the two following technical steps.

1) To prove that the dominated convergence theorem can be invoked when taking \(\varepsilon \downarrow 0\) in the first expectation on the right-hand side of (4.25), we set

$$ \Lambda _{\varepsilon }:= \int _{0}^{\tau ^{\star }_{\varepsilon }} e^{- \rho t} \widehat{X}^{x,y}_{t} \frac{ e^{\varepsilon (g_{2}-g_{1}) \int _{0}^{t} \Delta \pi ^{y}_{s} ds} - 1 }{\varepsilon } \big( \widehat{X}^{x,y}_{t} h'' ( L_{t}^{\varepsilon } ) + h' ( \widehat{X}^{x,y}_{t} )\big) dt $$

and show that the family of random variables \(\{\Lambda _{\varepsilon } : \varepsilon \in (0,1-y)\}\) is bounded in \(L^{2}(\Omega ,\mathcal{F},\mathbb{P})\), hence uniformly integrable.

Notice that by Assumption 4.1, 3) (ii) and the fact that \(L_{t}^{\varepsilon } \leq \widehat{X}^{x,y}_{t}\) a.s., one has a.s. for any \(t\geq 0\) that

$$ \widehat{X}^{x,y}_{t}\big(\widehat{X}^{x,y}_{t} h'' (L_{t}^{ \varepsilon } ) + h' (\widehat{X}^{x,y}_{t} ) \big) \leq \widehat{K} \big(1 + (\widehat{X}^{x,y}_{t} )^{\gamma \vee 2}\big) $$

for some constant \(\widehat{K}>0\) independent of \(\varepsilon \) so that by Jensen’s inequality,

$$\begin{aligned} & |\Lambda _{\varepsilon } |^{2} \leq \frac{2 \widehat{K}^{2}}{\rho ^{2}} \int _{0}^{\infty } \rho e^{- \rho t} \bigg( \frac{1 - e^{\varepsilon (g_{2}-g_{1}) \int _{0}^{t} \Delta \pi ^{y}_{s} ds}}{\varepsilon } \bigg)^{2}\big(1 + (\widehat{X}^{x,y}_{t} )^{2\gamma \vee 4}\big) dt. \end{aligned}$$

Taking expectations and applying Hölder’s inequality gives

$$\begin{aligned} \mathbb{E}[ |\Lambda _{\varepsilon } |^{2} ]^{\frac{1}{2}} &\leq K' \mathbb{E}\bigg[\int _{0}^{\infty } e^{-\rho t} \bigg( \frac{1- e^{\varepsilon (g_{2}-g_{1}) \int _{0}^{t} \Delta \pi ^{y}_{s} ds}}{\varepsilon } \bigg)^{4} dt\bigg]^{\frac{1}{4}} \\ &\phantom{=:}\times \mathbb{E}\bigg[\int _{0}^{\infty } e^{- \rho t} \big(1 + ( \widehat{X}^{x,y}_{t} )^{4\gamma \vee 8}\big) dt\bigg]^{\frac{1}{4}} \end{aligned}$$
(4.27)

for some other constant \(K'>0\) independent of \(\varepsilon \) that in the sequel will vary from line to line. The standard inequality \(1- e^{-u} \leq u\) with \(u=\varepsilon (g_{1}-g_{2}) \int _{0}^{t} \Delta \pi ^{y}_{s} ds \geq 0\) allows us to continue from (4.27) and write

$$\begin{aligned} \mathbb{E}[ |\Lambda _{\varepsilon } |^{2} ]^{\frac{1}{2}}& \leq K' \mathbb{E}\bigg[\int _{0}^{\infty } e^{-\rho t} \bigg(\int _{0}^{t} \Delta \pi ^{y}_{s} ds\bigg)^{4} dt\bigg]^{\frac{1}{4}} \\ & \phantom{=:}\times \mathbb{E}\bigg[\int _{0}^{\infty } e^{- \rho t} \big(1 + ( \widehat{X}^{x,y}_{t} )^{4(\gamma \vee 2)}\big) dt\bigg]^{ \frac{1}{4}}. \end{aligned}$$
(4.28)

We now treat the two expectations in (4.28) separately. First, by Jensen’s inequality,

$$ \bigg(\int _{0}^{t} \Delta \pi ^{y}_{s} ds\bigg)^{4} = \bigg( \frac{1}{t}\int _{0}^{t} t \,\Delta \pi ^{y}_{s} ds\bigg)^{4}\leq t^{3} \int _{0}^{t} (\Delta \pi ^{y}_{s} )^{4} ds. $$
(4.29)

Second, thanks to the nonnegativity of \((\Delta \pi ^{y})^{4}\), we can invoke the Fubini–Tonelli theorem and using also (4.29), we obtain

$$\begin{aligned} & \mathbb{E}\bigg[\int _{0}^{\infty } e^{-\rho t} \bigg(\int _{0}^{t} \Delta \pi ^{y}_{s} ds\bigg)^{4} dt\bigg] \\ &\leq \mathbb{E}\bigg[\int _{0}^{\infty } e^{-\rho t} t^{3} \int _{0}^{t} (\Delta \pi ^{y}_{s} )^{4} ds dt\bigg] \\ & = \frac{1}{\rho ^{4}}\int _{0}^{\infty } e^{-\rho s} (\rho ^{3}s^{3}+3 \rho ^{2} s^{2}+ 6\rho s + 6 ) \mathbb{E}[ (\Delta \pi ^{y}_{s} )^{4} ] ds. \end{aligned}$$
(4.30)

To evaluate the expectation in the last integral above, notice that applying Itô’s formula to the process \(\xi ^{y}_{t}:=(\Delta \pi ^{y}_{t})^{4}\) and using (4.23) gives for any \(t>0\) that

$$\begin{aligned} d\xi ^{y}_{t} &= \xi ^{y}_{t}\big(-4(\lambda _{1}+\lambda _{2}) + 12 \theta ^{2}(1-\pi ^{y+\varepsilon }_{t} - \pi ^{y}_{t})^{2}\big)dt \\ &\phantom{=:}+ 4\xi ^{y}_{t}(1-\pi ^{y+\varepsilon }_{t} - \pi ^{y}_{t})\bigg( \frac{g_{2}-g_{1}}{\sigma }dI_{t} + (\alpha _{1}-\alpha _{2})dI^{1}_{t} \bigg) \end{aligned}$$

with \(\xi ^{y}_{0}=1\) and \(\theta ^{2}=\frac{1}{2} (\frac{(g_{2}-g_{1})^{2}}{\sigma ^{2}} + ( \alpha _{1}-\alpha _{2})^{2} )\). Because \((1-\pi ^{y+\varepsilon }_{t} - \pi ^{y}_{t})^{2} \leq 2\) a.s. for all \(t\geq 0\) and

$$ \xi ^{y}_{t} = e^{-4(\lambda _{1} + \lambda _{2})t + 12\theta ^{2} \int _{0}^{t}(1-\pi ^{y+\varepsilon }_{s} - \pi ^{y}_{s})^{2} ds}M^{y}_{t}, $$

where \((M^{y}_{t})_{t\geq 0}\) is an exponential martingale and \(\lambda _{1}+\lambda _{2}>0\), it is easy to see that

$$ \mathbb{E}[(\Delta \pi ^{y}_{t})^{4} ] \leq e^{-(\lambda _{1} + \lambda _{2})t + 24\theta ^{2} t}, \qquad t\geq 0. $$
(4.31)

Using the latter estimate in (4.30) together with Assumption 4.3, we deduce that

$$ \sup _{\varepsilon \in (0,1-y)}\mathbb{E}\bigg[\int _{0}^{\infty } e^{- \rho t} \bigg(\int _{0}^{t} \Delta \pi ^{y}_{s} ds\bigg)^{4} dt\bigg] < \infty . $$
(4.32)

For the second expectation in (4.28), Assumption 4.3 and standard estimates using (4.4) plus the fact that \((g_{2}-g_{1})\int _{0}^{t} \pi ^{y}_{s} ds <0\) guarantee that it is finite. Moreover, it is independent of \(\varepsilon \). Combining this with (4.32), we thus find from (4.28) that

$$ \sup _{\varepsilon \in (0,1-y)}\mathbb{E}[ |\Lambda _{\varepsilon } |^{2} ]^{\frac{1}{2}} < \infty . $$

This implies that the family of random variables \(\{\Lambda _{\varepsilon } : \varepsilon \in (0,1-y)\}\) is bounded in \(L^{2}(\Omega ,\mathcal{F},\mathbb{P})\), hence uniformly integrable.

2) Consider the second expectation on the right-hand side of (4.25) and set

$$ \Xi _{\varepsilon }:=\int _{0}^{\tau ^{\star }_{\varepsilon }} e^{- \rho t} \widehat{X}^{x,y}_{t} \Delta \pi ^{y}_{t} dt. $$

We aim at proving that the family of random variables \(\{\Xi _{\varepsilon }: \varepsilon \in (0,1-y)\}\) is bounded in \(L^{2}(\Omega ,\mathcal{F},\mathbb{P})\), hence uniformly integrable.

By Jensen’s inequality and then Hölder’s inequality, one finds that

$$ \mathbb{E}[ |\Xi _{\varepsilon }|^{2} ]^{\frac{1}{2}} \leq \widehat{K}\mathbb{E}\bigg[\int _{0}^{\infty }e^{-\rho t} ( \widehat{X}^{x,y}_{t} )^{4} dt\bigg]^{\frac{1}{4}} \mathbb{E}\bigg[ \int _{0}^{\infty }e^{-\rho t} (\Delta \pi ^{y}_{t} )^{4} dt\bigg]^{ \frac{1}{4}} $$
(4.33)

for some \(\widehat{K}>0\) independent of \(\varepsilon \). The first expectation on the right-hand side of (4.33) is finite thanks to Assumption 4.3 and standard estimates using (4.4) plus the fact that \((g_{2}-g_{1})\int _{0}^{t} \pi ^{y}_{s} ds <0\). Moreover, it is independent of \(\varepsilon \). For the second one, interchanging expectation and \(dt\)-integral by the Fubini–Tonelli theorem and using (4.31), we obtain

$$ \mathbb{E}\bigg[\int _{0}^{\infty }e^{-\rho t} (\Delta \pi ^{y}_{t} )^{4} dt\bigg]^{\frac{1}{4}} \leq \frac{1}{(\rho + \lambda _{1} + \lambda _{2} - 24\theta ^{2})^{\frac{1}{4}}} $$

by Assumption 4.3. We thus conclude by (4.33) that \(\sup _{\varepsilon \in (0,1-y)}\mathbb{E}[ |\Xi _{\varepsilon }|^{2} ]^{\frac{1}{2}} < \infty \), which completes the proof. □

The previous theorem in particular implies the so-called smooth-fit property, a well-known optimality principle in optimal stopping theory. Moreover, by standard arguments based on the strong Markov property of \((\widehat{X},\pi )\) (see Peskir and Shiryaev [57, Chap. III]), it follows from the results collected so far that the couple \((v,\overline{x})\) solves the free-boundary problem

$$ \left \{ \textstyle\begin{array}{ll} (\mathbb{L} - \rho )v(x,y) = - xh'(x) & \qquad \text{on } \mathcal{C}, \\ v(x,y) = x &\qquad \text{on } \mathcal{S}, \\ v_{x}(x,y) = 1 & \qquad \text{at } x= \overline{x}(y), y \in (0,1), \\ v_{y}(x,y) = 0 &\qquad \text{at } x= \overline{x}(y), y \in (0,1), \end{array}\displaystyle \right . $$

with \(v \in C^{2}(\mathcal{C})\).
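The free-boundary problem above also lends itself to a simple numerical exploration. Writing it (formally) as the variational inequality \(\min \{(\rho - \mathbb{L})v - xh'(x),\, x - v\}=0\) on \(\mathcal{O}\), one can iterate an explicit finite-difference step in pseudo-time and project onto the obstacle \(v \leq x\) after every iteration; an estimate of \(\overline{x}(y)\) is then read off as the smallest grid point at which \(v=x\). The Python sketch below does this for the quadratic cost \(h(x)=x^{2}\) and the hypothetical parameters of the earlier snippets; the truncation of the \(x\)-domain, the grid, the pseudo-time step and the crude treatment of the edges of the rectangle are ad hoc choices, and no convergence claim is made.

import numpy as np

# hypothetical parameters (as before) and quadratic cost h(x) = x^2, so x h'(x) = 2 x^2
r, g1, g2, sigma, alpha1, alpha2, lam1, lam2 = 0.03, 0.04, -0.01, 0.15, 0.1, -0.1, 0.5, 0.5
beta2, rho = r - g2, 1.0       # for these numbers rho exceeds rho_o (cf. the snippet after Assumption 4.3)

nx, ny = 200, 50
xg = np.linspace(0.01, 2.0, nx)           # truncated x-domain (ad hoc)
yg = np.linspace(0.0, 1.0, ny)
dx, dy = xg[1] - xg[0], yg[1] - yg[0]
X, Y = np.meshgrid(xg, yg, indexing="ij")

obstacle = X.copy()                       # stopping payoff x
f = 2.0 * X**2                            # running cost x h'(x) for h(x) = x^2
v = obstacle.copy()

a = 0.5 * sigma**2 * X**2                                                     # x-diffusion
b = (beta2 + (g2 - g1) * Y) * X                                               # x-drift
c = 0.5 * ((alpha1 - alpha2)**2 + ((g2 - g1) / sigma)**2) * (Y * (1 - Y))**2  # y-diffusion
d = lam2 - (lam1 + lam2) * Y                                                  # y-drift
e = (g2 - g1) * X * Y * (1 - Y)                                               # mixed second-order term

dt = 2e-4                                 # explicit scheme: keep dt small relative to dx^2
for _ in range(20_000):
    vxx = np.zeros_like(v); vyy = np.zeros_like(v); vxy = np.zeros_like(v)
    vx = np.zeros_like(v); vy = np.zeros_like(v)
    vxx[1:-1, :] = (v[2:, :] - 2 * v[1:-1, :] + v[:-2, :]) / dx**2
    vyy[:, 1:-1] = (v[:, 2:] - 2 * v[:, 1:-1] + v[:, :-2]) / dy**2
    vx[1:-1, :] = (v[2:, :] - v[:-2, :]) / (2 * dx)
    vy[:, 1:-1] = (v[:, 2:] - v[:, :-2]) / (2 * dy)
    vxy[1:-1, 1:-1] = (v[2:, 2:] - v[2:, :-2] - v[:-2, 2:] + v[:-2, :-2]) / (4 * dx * dy)
    Lv = a * vxx + b * vx + c * vyy + d * vy + e * vxy
    v = np.minimum(obstacle, v + dt * (Lv - rho * v + f))   # explicit step, then projection on v <= x

# estimated free boundary: for each y, the smallest grid point where v equals the obstacle
idx = [np.flatnonzero(v[:, j] >= obstacle[:, j] - 1e-10) for j in range(ny)]
boundary = np.array([xg[i[0]] if i.size else np.nan for i in idx])
print(boundary)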

An important consequence of Theorem 4.9 is the following result.

Proposition 4.10

The mapping \(y \mapsto \overline{x}(y)\) is continuous on \([0,1]\).

Proof

Let \(T>0\) and define the probability measure ℚ on \((\Omega , \mathcal{H}_{T})\) by

$$ \frac{d\mathbb{Q}}{d\mathbb{P}}\bigg|_{\mathcal{H}_{t}} = e^{- \frac{1}{2}\sigma ^{2} t + \sigma I_{t}}, \qquad t \in [0,T]. $$

Under the new measure ℚ, the process \(\widehat{I}_{t}:=I_{t} - \sigma t\), \(t \in [0,T]\), is a standard Brownian motion, and the dynamics of \((\widehat{X},\pi )\) read

$$ \left \{ \textstyle\begin{array}{rl} \displaystyle d\widehat{X}_{t}&= \widehat{X}_{t} \big( \beta _{2} + \sigma ^{2} +(g_{2}-g_{1})\pi _{t} \big) dt + \sigma \widehat{X}_{t} d \widehat{I}_{t}, \\ \displaystyle d\pi _{t}&= \big(\lambda _{2} - (\lambda _{1}+\lambda _{2}) \pi _{t} + (g_{2}-g_{1})\pi _{t}(1-\pi _{t}) \big)dt \\ \displaystyle &\phantom{=:}+ \pi _{t}(1-\pi _{t})\displaystyle \bigg( \frac{g_{2}-g_{1}}{\sigma }d\widehat{I}_{t} + (\alpha _{1}-\alpha _{2})dI^{1}_{t} \bigg), \end{array}\displaystyle \right . $$
(4.34)

with initial conditions \(\widehat{X}_{0}=x\), \(\pi _{0}=y\), \((x,y)\in \mathcal{O}\). Now for any \(\tau \) and \((x,y)\), we have

$$\begin{aligned} & \mathbb{E}_{(x,y)}\bigg[\int _{0}^{\tau \wedge T} e^{-\rho t } \widehat{X}_{t} \Big(\big(\rho - \beta _{2} -(g_{2}-g_{1})\pi _{t}\big)- h'(\widehat{X}_{t})\Big) dt\bigg] \\ & = x\, \mathbb{E}^{\mathbb{Q}}_{(x,y)}\bigg[\int _{0}^{\tau \wedge T} e^{-( \rho - \beta _{2})t + (g_{2}-g_{1})\int _{0}^{t} \pi _{s} ds } \widehat{H}(\widehat{X}_{t},\pi _{t}) dt\bigg] \end{aligned}$$
(4.35)

with \(\widehat{H}(x,y):= (\rho - \beta _{2} -(g_{2}-g_{1})y - h'(x) )\). We cannot directly take the limit \(T \uparrow \infty \) in (4.35) since the measure change ℚ depends on \(T\). However, we notice that the right-hand side of (4.35) only depends on the law of \((\widehat{X},\pi )\) under ℚ. Therefore, we can define a new probability space \((\overline{\Omega },\overline{\mathcal{F}},\overline{\mathbb{P}})\) equipped with a two-dimensional Brownian motion \((\overline{B}^{1},\overline{B}^{2})\) and a filtration \(\overline{\mathbb{F}}:=(\overline{\mathcal{F}}_{t})_{t\geq 0}\) and let \((\overline{X},\overline{\pi })\) be the unique strong solution to (4.34), driven by \((\overline{B}^{1},\overline{B}^{2})\). In this setting, we can then define the stopping problems

$$\begin{aligned} \overline{V}(x,y;T)&:= \sup _{\tau \geq 0} \overline{\mathbb{E}}_{(x,y)} \bigg[\int _{0}^{\tau \wedge T} e^{-(\rho - \beta _{2})t + (g_{2}-g_{1}) \int _{0}^{t} \overline{\pi }_{s} ds } \widehat{H}(\overline{X}_{t}, \overline{\pi }_{t}) dt\bigg], \\ \overline{V}(x,y)&:= \sup _{\tau \geq 0} \overline{\mathbb{E}}_{(x,y)} \bigg[\int _{0}^{\tau } e^{-(\rho - \beta _{2})t + (g_{2}-g_{1})\int _{0}^{t} \overline{\pi }_{s} ds } \widehat{H}(\overline{X}_{t},\overline{\pi }_{t}) dt\bigg], \end{aligned}$$
(4.36)

where \(\overline{\mathbb{E}}_{(x,y)}\) is the expectation under \(\overline{\mathbb{P}}\) conditionally on \((\overline{X}_{0},\overline{\pi }_{0})=(x,y)\). By arguing as in the proof of De Angelis [19, Proposition 4.2], one can show that

$$ \lim _{T\uparrow \infty }\overline{V}(x,y;T) = \overline{V}(x,y) \qquad \text{and} \qquad \lim _{T\uparrow \infty }\widehat{V}(x,y;T) = \widehat{V}(x,y), $$

where we have set

$$\begin{aligned} \widehat{V}(x,y;T) &:= \sup _{\tau \geq 0} \mathbb{E}_{(x,y)}\bigg[ \int _{0}^{\tau \wedge T} e^{-\rho t } \widehat{X}_{t} \Big(\big(\rho - \beta _{2} -(g_{2}-g_{1})\pi _{t}\big) - h'(\widehat{X}_{t})\Big) dt\bigg], \\ \widehat{V}(x,y)&:= \sup _{\tau \geq 0} \mathbb{E}_{(x,y)}\bigg[ \int _{0}^{\tau } e^{-\rho t } \widehat{X}_{t} \Big(\big(\rho - \beta _{2} -(g_{2}-g_{1})\pi _{t}\big) - h'(\widehat{X}_{t})\Big) dt\bigg]. \end{aligned}$$

Since now

$$\begin{aligned} & \sup _{\tau \geq 0} \mathbb{E}^{\mathbb{Q}}_{(x,y)}\bigg[\int _{0}^{ \tau \wedge T} e^{-(\rho - \beta _{2})t + (g_{2}-g_{1})\int _{0}^{t} \pi _{s} ds } \widehat{H}(\widehat{X}_{t},\pi _{t}) dt\bigg] \\ & = \sup _{\tau \geq 0} \overline{\mathbb{E}}_{(x,y)}\bigg[\int _{0}^{ \tau \wedge T} e^{-(\rho - \beta _{2})t + (g_{2}-g_{1})\int _{0}^{t} \overline{\pi }_{s} ds } \widehat{H}(\overline{X}_{t},\overline{\pi }_{t}) dt\bigg], \end{aligned}$$

which, together with (4.35), allows us to write

$$\begin{aligned} \overline{V}(x,y) = \lim _{T\uparrow \infty }\overline{V}(x,y;T) = \lim _{T\uparrow \infty }\widehat{V}(x,y;T) = \widehat{V}(x,y). \end{aligned}$$

In light of this last equality, it is then not difficult to see that \(v\) as in (4.7) satisfies \(v(x,y) = x - \overline{V}(x,y)\) for any \((x,y) \in \mathcal{O}\). Since

$$ \{(x,y) \in \mathcal{O}: v(x,y) \geq x\} = \{(x,y) \in \mathcal{O}: \overline{V}(x,y) \leq 0\}, $$

\(\overline{x}(\,\cdot \,)\) is the optimal stopping boundary for the problem with value \(\overline{V}\) as well.

To prove the continuity of \(\overline{x}(\,\cdot \,)\), we now aim at applying Peskir [56, Theorem 10] to (4.36). Notice that \(\overline{V}_{x} \leq 0\) on \(\mathcal{O}\) since \(x \mapsto h(x)\) is strictly convex. Moreover, recalling \(\theta ^{2}=\frac{1}{2}((\alpha _{1}-\alpha _{2})^{2} + \frac{(g_{2}-g_{1})^{2}}{\sigma ^{2}})\), we have \(\partial _{x} \big( \frac{\widehat{H}(x,y)}{\theta ^{2} y^{2}(1-y)^{2}} \big) <0\) on \(\mathcal{O}\), again thanks to the strict convexity of \(h\). Also, \(\overline{V}_{y}\) is continuous across the boundary due to the \(C^{1}\)-property shown in Theorem 4.9 for \(v=x-\widehat{V}\); hence, the horizontal smooth-fit property holds. We can therefore apply [56, Theorem 10] (upon noticing that in [56], \(x\) is the horizontal axis and \(y\) the vertical one, while in our paper, \(x\) is the vertical axis and \(y\) the horizontal one) and conclude that \(\overline{x}\) cannot have discontinuities of the first kind at any point \(y\in [0,1)\). Finally, \(\overline{x}\) is also continuous at \(y=1\) since it is left-continuous by Proposition 4.6 (ii). □

4.2 The optimal control for the problem (4.2)

In this section, we provide the form of the optimal debt reduction policy, which is given in terms of the free boundary studied in the previous section.

For \(\overline{x}\) as in (4.11), introduce under \(\mathbb{P}_{(x,y)}\) the nondecreasing process

$$ \displaystyle \overline{\nu }^{{{*}}}_{t} = \Big( x - \inf _{0\leq s \leq t}\big(\overline{x} (\pi _{s} )e^{-(\beta _{2} - \frac{1}{2} \sigma ^{2})s - \sigma I_{s} - (g_{2}-g_{1})\int _{0}^{s} \pi _{u} du} \big)\Big) \vee 0, \qquad t \geq 0, $$
(4.37)

with \(\overline{\nu }^{{{*}}}_{0-}=0\), and then define the process

$$ \nu ^{{*}}_{t}:=\int _{0}^{t} e^{-(\beta _{2}-\frac{1}{2}\sigma ^{2})s - \sigma I_{s} - (g_{2}-g_{1})\int _{0}^{s} \pi _{u} du} d \overline{\nu }^{{{*}}}_{s}, \qquad t \geq 0, \nu ^{{*}}_{0-}=0. $$
(4.38)

Notice that since \(\overline{\nu }^{{{*}}}_{t} \leq x\) a.s. for all \(t\geq 0\) and \(t \mapsto \overline{\nu }^{{{*}}}_{t}\) is nondecreasing, it follows from (4.38) that \(\nu ^{{*}}\) is admissible. Moreover, \(t \mapsto \overline{\nu }^{{{*}}}_{t}\) is continuous (with the exception of a possible initial jump at time 0), due to the continuity of \(y \mapsto \overline{x}(y)\), \(t \mapsto I_{t}\), \(t \mapsto \pi _{t}\) and \(t \mapsto \int _{0}^{t} \pi _{s} ds\).
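The construction in (4.37) and (4.38) is straightforward to implement on a discretised path. The following minimal sketch is only indicative: the boundary \(\overline{x}(\,\cdot \,)\) is not available in closed form, so the callable x_bar below, as well as the paths and parameters passed in, are hypothetical inputs.

```python
import numpy as np

# Sketch of the optimal reduction policy: nu_bar via (4.37) and nu via (4.38),
# computed on a time grid from given paths of pi and I and a hypothetical
# boundary function x_bar; all inputs are placeholder assumptions.
def debt_reduction_policy(x, t_grid, pi_path, I_path, x_bar,
                          beta2, sigma, g1, g2):
    dt = np.diff(t_grid, prepend=0.0)
    int_pi = np.cumsum(pi_path * dt)                  # approximates int_0^t pi_u du
    disc = np.exp(-(beta2 - 0.5 * sigma**2) * t_grid
                  - sigma * I_path - (g2 - g1) * int_pi)
    running_inf = np.minimum.accumulate(x_bar(pi_path) * disc)
    nu_bar = np.maximum(x - running_inf, 0.0)         # formula (4.37)
    d_nu_bar = np.diff(nu_bar, prepend=0.0)           # includes a possible jump at time 0
    nu = np.cumsum(disc * d_nu_bar)                   # Stieltjes sum approximating (4.38)
    return nu_bar, nu
```

In this discretisation, the possible initial lump sum \(\overline{\nu }^{{*}}_{0} = (x-\overline{x}(y))^{+}\) is produced automatically by the first entry of d_nu_bar.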

Theorem 4.11

Let \(\widetilde{V}(x,y):= \int _{0}^{x}\frac{1}{z}v(z,y)dz\), \((x,y)\in [0,\infty ) \times [0,1]\). Then one has \(\widetilde{V} = V\) on \([0,\infty ) \times [0,1]\), and \(\nu ^{{*}}\) as in (4.38) is optimal for (4.2).

Proof

Recall \(U=U_{0}\) as in (3.20) and notice that in our Markovian setting, one actually has \(\frac{1}{z}v(z,y) = U(z)\). By the proof of Theorem 3.13, it suffices to show that the right-continuous inverse of the stopping time \(\tau ^{\star }(z,y)= \ \inf \{t\geq 0 : \widehat{X}^{z,y}_{t} \geq \overline{x}(\pi ^{y}_{t}) \}\) (which is optimal for \(v(z,y)\), cf. (4.10)) coincides (up to a null set) with \(\overline{\nu }^{{{*}}}\). For that, recall (3.25) from the proof of Theorem 3.13, fix \((x,y) \in (0,\infty ) \times (0,1)\), take \(t\geq 0\) arbitrary and notice that by (4.10), we have \(\mathbb{P}_{(z,y)}\)-a.s. the equivalences

$$\begin{aligned} \tau ^{\star }(z,y) \leq t &\Longleftrightarrow \widehat{X}_{\theta } \geq \overline{x}(\pi _{\theta }) \mbox{ for some } \theta \in [0,t] \\ & \Longleftrightarrow z \geq e^{-(\beta _{2} - \frac{1}{2}\sigma ^{2}) \theta - \sigma I_{\theta } - (g_{2}-g_{1})\int _{0}^{\theta }\pi _{u} du} \overline{x}(\pi _{\theta }) \mbox{ for some } \theta \in [0,t] \\ &\Longleftrightarrow \Big(x - \inf _{0\leq s \leq t}\big( \overline{x}(\pi _{s})e^{-(\beta _{2} - \frac{1}{2}\sigma ^{2})s - (g_{2}-g_{1}) \int _{0}^{s}\pi _{u} du - \sigma I_{s}}\big)\Big) \vee 0 \geq x - z \\ & \Longleftrightarrow \overline{\nu }^{{{*}}}_{t} \geq x - z \\ &\Longleftrightarrow \tau ^{\overline{\nu }^{{{*}}}}_{+}(z) \leq t. \end{aligned}$$

Hence \(\tau ^{\overline{\nu }^{{{*}}}}_{+}(z)=\tau ^{\star }(z,y)\) a.s. and \(\overline{\nu }^{{{*}}}_{\cdot }\) is the right-continuous inverse of \(\tau ^{\star }(\cdot ,y)\). As \(\overline{\nu }^{{{*}}}\) is admissible, the claim follows by arguing as in the proof of Theorem 3.13, part 2). □

Notice that (4.38) and the equation for \(X^{x,y,\nu }\) in the formulation of (4.2) yield

$$ X^{x,y,\nu ^{{*}}}_{t} = e^{(\beta _{2}-\frac{1}{2}\sigma ^{2})t + (g_{2}-g_{1}) \int _{0}^{t} \pi ^{y}_{s} ds + \sigma I_{t}} (x - \overline{\nu }^{{{*}}}_{t} ), $$

which, in view of (4.37), shows that

$$ 0 \leq X^{x,y,\nu ^{{*}}}_{t} \leq \overline{x}(\pi ^{y}_{t}), \qquad \text{$t \geq 0, \mathbb{P}$-a.s.} $$

Moreover, it is easy to see that we can express \(\overline{\nu }^{{{*}}}\) of (4.37) as

$$ \overline{\nu }^{{{*}}}_{t} = \sup _{0\leq s \leq t} \bigg( \frac{X^{x,y,0}_{s} - \overline{x}(\pi ^{y}_{s})}{X^{1,y,0}_{s}} \bigg) \vee 0, \qquad \overline{\nu }^{{{*}}}_{0-}=0. $$

These equations allow us to make some remarks about the optimal debt management policy of our problem.

(i) If at the initial time 0, the level \(x\) of the debt ratio is above \(\overline{x}(y)\), then an immediate lump sum reduction of \(x-\overline{x}(y)\) is optimal.

(ii) At any time \(t\geq 0\), it is optimal to keep the debt ratio level below the belief-dependent ceiling \(\overline{x}\).

(iii) If the level of the debt ratio at time \(t\) is strictly below \(\overline{x}(\pi _{t})\), there is no need for interventions. The government should intervene to reduce its debt only at those (random) times \(t\) at which the debt ratio attempts to rise above \(\overline{x}(\pi _{t})\). These interventions are then minimal, in the sense that \((X^{x,y,\nu ^{{*}}},\pi ^{y},\nu ^{{*}})\) solves a Skorokhod reflection problem at the free boundary \(\overline{x}\); a numerical sketch illustrating this reflection property is given after this list.

(iv) Recall that the debt ceiling \(\overline{x}\) is an increasing function of the government’s belief that the economy is enjoying a phase of fast growth. Then, with regard to the previous description of the optimal debt reduction rule, we have that the more the government believes that the economy is in good shape, the less strict the optimal debt reduction policy should be.
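As announced in remark (iii), the reflection at the boundary can be checked numerically along a discretised path. Reusing the hypothetical inputs and the output nu_bar of the sketch following (4.38), the displayed formula for \(X^{x,y,\nu ^{{*}}}\) keeps the controlled ratio between \(0\) and \(\overline{x}(\pi ^{y}_{t})\) up to discretisation error.

```python
import numpy as np

# Sketch: verify 0 <= X^{x,y,nu*}_t <= x_bar(pi_t) on a grid, reusing nu_bar
# from the previous sketch; x_bar and all other inputs remain hypothetical.
def check_reflection(x, t_grid, pi_path, I_path, nu_bar, x_bar,
                     beta2, sigma, g1, g2, tol=1e-8):
    dt = np.diff(t_grid, prepend=0.0)
    int_pi = np.cumsum(pi_path * dt)
    growth = np.exp((beta2 - 0.5 * sigma**2) * t_grid
                    + (g2 - g1) * int_pi + sigma * I_path)
    X_ctrl = growth * (x - nu_bar)                  # controlled debt ratio X^{x,y,nu*}
    assert np.all(X_ctrl >= -tol)
    assert np.all(X_ctrl <= x_bar(pi_path) + tol)   # never above the belief-dependent ceiling
    return X_ctrl
```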

4.3 Regularity of the value function of (4.2) and related HJB equation

Combining the results collected so far, we are now able to prove that the value function \(V\) of the control problem (4.2) is a twice continuously differentiable function. As a byproduct, \(V\) is a classical solution to the corresponding Hamilton–Jacobi–Bellman (HJB) equation.

From Theorem 4.11, we know that \(V(x,y)= \int _{0}^{x}\frac{1}{z}v(z,y)dz\) for all \((x,y)\in \overline{\mathcal{O}}:=[0,\infty ) \times [0,1]\). Hence, thanks to Theorem 4.9 and the dominated convergence theorem, we immediately obtain the following result.

Lemma 4.12

One has that \(V \in C^{1}(\mathcal{O}) \cap C(\overline{\mathcal{O}})\). Moreover, \(V_{xx}\in C(\mathcal{O})\) as well as \(V_{xy}\in C(\mathcal{O})\).
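For numerical purposes, the representation \(V(x,y)= \int _{0}^{x}\frac{1}{z}v(z,y)dz\) recalled above can be evaluated by quadrature once an approximation of \(v\) is available. The sketch below is only indicative: the callable v is a hypothetical input, and the cut-off eps truncates the origin, where the integral converges since it equals the (finite) value function by Theorem 4.11.

```python
import numpy as np

# Sketch: V(x, y) = int_0^x v(z, y)/z dz by trapezoidal quadrature; the
# callable v is a hypothetical input and eps truncates the (integrable) origin.
def value_from_v(v, x, y, eps=1e-6, n_points=10_000):
    z = np.linspace(eps, x, n_points)
    return np.trapz(v(z, y) / z, z)
```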

To take care of the second derivative \(V_{yy}\), we follow ideas used in De Angelis [19]. In particular, we determine the second weak derivative of \(V\) with respect to \(y\) (recall that \(V_{y}\) is continuous by Theorem 4.9) and then show that it is a continuous function. This is accomplished in the next proposition.

Proposition 4.13

Let \(\theta ^{2}:=\frac{1}{2} ((\alpha _{1}-\alpha _{2})^{2} + \frac{(g_{2}-g_{1})^{2}}{\sigma ^{2}} )\). We have \(V_{yy} \in C(\mathcal{O})\) with

$$\begin{aligned} &V_{yy}(x,y) \\ & = - \frac{1}{\theta ^{2} y^{2}(1-y)^{2}}\bigg(\Big(\beta _{2} + (g_{2}-g_{1})y - \frac{1}{2}\sigma ^{2}\Big)\Big(v\big(x\wedge \overline{x}(y),y \big)-v(0+,y)\Big) \\ &\phantom{=:} \qquad \quad\ \qquad \qquad + h\big(x\wedge \overline{x}(y) \big) + \frac{1}{2}\sigma ^{2}\big(x\wedge \overline{x}(y)\big) v_{x} \big(x\wedge \overline{x}(y),y \big)\bigg) \\ &\phantom{=:} + \frac{ \lambda _{2} - (\lambda _{1}+\lambda _{2})y }{\theta ^{2} y^{2}(1-y)^{2}} \int _{0}^{x \wedge \overline{x}(y)} \frac{1}{z} v_{y}(z,y) dz \\ &\phantom{=:}- \frac{\rho }{\theta ^{2} y^{2}(1-y)^{2}}\int _{0}^{x \wedge \overline{x}(y)} \frac{1}{z} v(z,y) dz. \end{aligned}$$
(4.39)

Proof

Notice that \(V_{y}(x,y)=\int _{0}^{x}\frac{1}{z}v_{y}(z,y)dz\) and therefore \(V_{y}(x,\cdot )\) is a continuous function for all \(x>0\) by Theorem 4.9 (indeed, by the bounds in (4.26) and the multiplicative dependence of \(\widehat{X}^{z,y}\) on \(z\), the map \(z \mapsto \frac{1}{z}v_{y}(z,y)\) is integrable at zero). Hence its weak derivative with respect to \(y\) is a function \(g \in L^{1}_{{\mathrm{{loc}}}}( \mathcal{O})\) such that for any test function \(\varphi \in C^{\infty }_{c}((0,1))\), one has

$$ \int _{0}^{1} V_{y}(x,y)\varphi '(y)dy = - \int _{0}^{1} g(x,y) \varphi (y)dy. $$

We now want to evaluate \(g\) and show that it coincides with the right-hand side of (4.39).

Denote by \(m(x)\) for \(x>0\) the generalised right-continuous inverse of \(\overline{x}(y)\) for \(y\in [0,1]\), that is, \(m(x):= \inf \{y\in [0,1]: \overline{x}(y) \geq x\}\). Then noticing that \(v_{y}=0\) on the set \(\{(x,y) \in \mathcal{O}: x>\overline{x}(y)\}\) and using Fubini’s theorem, we can write

$$\begin{aligned} \int _{0}^{1} V_{y}(x,y)\varphi '(y)dy &= \int _{0}^{1} \int _{0}^{x \wedge \overline{x}(y)} \frac{1}{z}v_{y}(z,y) dz \varphi '(y)dy \\ & = \int _{0}^{x} \frac{1}{z} \int _{m(z)}^{1} v_{y}(z,y) \varphi '(y)dy dz \\ & = \int _{0}^{x} \frac{1}{z} \bigg(v_{y}(z,1)\varphi (1) - v_{y} \big(z,m(z)\big)\varphi \big(m(z)\big) \\ &\phantom{=:}\qquad \quad - \int _{m(z)}^{1} v_{yy}(z,y) \varphi (y)dy \bigg) dz \\ &= - \int _{0}^{x} \frac{1}{z} \int _{m(z)}^{1} v_{yy}(z,y) \varphi (y)dy dz, \end{aligned}$$
(4.40)

where we have used \(v_{y}(z,m(z))=0\) for all \(z\in (0,x)\) and \(x>0\) as well as \(\varphi (1)=0\). By Lemma 4.7 (cf. also (4.14)), for any \(y>m(z)\) with \(z\in (0,x)\) and \(x>0\), we have

$$\begin{aligned} v_{yy}(z,y)&= \frac{1}{\theta ^{2} y^{2}(1-y)^{2}}\bigg(\rho v(z,y) - \big(\lambda _{2} - (\lambda _{1}+\lambda _{2})y\big) v_{y}(z,y) - zh'(z) \\ & \phantom{=:}\qquad\; \qquad \qquad - \frac{1}{2}\sigma ^{2} z^{2} v_{xx}(z,y) - \big(\beta _{2} + (g_{2}-g_{1})y\big)z v_{x}(z,y) \bigg). \end{aligned}$$

Inserting this into the last integral term on the right-hand side of (4.40), using again Fubini’s theorem and then integrating in \(z\) the terms involving the derivatives with respect to \(x\) (by parts for the second-order term), we find

$$\begin{aligned} & \int _{0}^{1} V_{y}(x,y)\varphi '(y)dy \\ &= - \int _{0}^{x} \frac{1}{z} \int _{m(z)}^{1} v_{yy}(z,y) \varphi (y)dy dz \\ & = \int _{0}^{1} \frac{ \lambda _{2} - (\lambda _{1}+\lambda _{2})y }{\theta ^{2} y^{2}(1-y)^{2}} \int _{0}^{x \wedge \overline{x}(y)} \frac{1}{z} v_{y}(z,y) dz \varphi (y) dy \\ &\phantom{=:}- \int _{0}^{1} \frac{\rho }{\theta ^{2} y^{2}(1-y)^{2}}\int _{0}^{x \wedge \overline{x}(y)} \frac{1}{z} v(z,y) dz\varphi (y) dy \\ & \phantom{=:}+ \int _{0}^{1} \bigg(h\big(x\wedge \overline{x}(y)\big) + \big( \beta _{2} + (g_{2}-g_{1})y\big)\Big(v\big(x\wedge \overline{x}(y),y \big)-v(0+,y)\Big) \\ & \phantom{=:}\qquad\, \quad + \frac{1}{2}\sigma ^{2}\big(x\wedge \overline{x}(y) \big) v_{x}\big(x\wedge \overline{x}(y),y\big) \\ &\phantom{=:}\qquad\, \quad - \frac{1}{2}\sigma ^{2} \Big(v\big(x\wedge \overline{x}(y),y\big)-v(0+,y)\Big)\bigg) \frac{\varphi (y)}{\theta ^{2} y^{2}(1-y)^{2}} dy, \end{aligned}$$
(4.41)

where we have also used that \(h(0)=0\). Finally, setting

$$\begin{aligned} &g(x,y) \\ &:= - \frac{1}{\theta ^{2} y^{2}(1-y)^{2}}\bigg(h\big(x\wedge \overline{x}(y)\big) \\ &\phantom{=:} \qquad \qquad \qquad \quad \,\,+ \Big(\beta _{2} + (g_{2}-g_{1})y - \frac{1}{2}\sigma ^{2}\Big)\Big(v\big(x\wedge \overline{x}(y),y \big)-v(0+,y)\Big) \\ & \phantom{=:}\qquad \qquad \qquad \quad \,\, + \frac{1}{2}\sigma ^{2}\big(x \wedge \overline{x}(y)\big) v_{x}\big(x\wedge \overline{x}(y),y\big) \bigg) \\ &\phantom{=::}+ \frac{ \lambda _{2} - (\lambda _{1}+\lambda _{2})y }{\theta ^{2} y^{2}(1-y)^{2}} \int _{0}^{x \wedge \overline{x}(y)} \frac{1}{z} v_{y}(z,y) dz \\ & \phantom{=::}- \frac{\rho }{\theta ^{2} y^{2}(1-y)^{2}}\int _{0}^{x \wedge \overline{x}(y)} \frac{1}{z} v(z,y) dz, \end{aligned}$$

we see that (4.41) reads \(\int _{0}^{1} V_{y}(x,y)\varphi '(y)dy = - \int _{0}^{1} g(x,y) \varphi (y)dy\) so that \(g\) can be identified with the second weak derivative of \(V\) with respect to \(y\). Notice that \(g\) is continuous by the continuity of \(\overline{x}\), \(v\), \(v_{x}\), \(h\) and the fact that \(\int _{0}^{x\wedge \overline{x}(y)} \frac{1}{z}v(z,y)dz\) and \(\int _{0}^{x\wedge \overline{x}(y)} \frac{1}{z}v_{y}(z,y)dz\) are finite due to (4.4), (4.5) and (4.26). The proof is complete. □

Thanks to Lemma 4.12 and Proposition 4.13, we have that \(V \in C^{2}(\mathcal{O}) \cap C(\overline{\mathcal{O}})\). As a byproduct of this, by the dynamic programming principle and standard methods based on an application of Dynkin’s formula, we obtain the next result.

Proposition 4.14

Recall the second-order differential operator \(\mathbb{L}\) defined in (4.14). The value function \(V\) of (4.2) is a classical solution to the HJB equation

$$ \min \{ (\mathbb{L}-\rho )V(x,y) + h(x), 1 - V_{x}(x,y) \}=0, \qquad (x,y) \in \mathcal{O}, $$

with boundary condition \(V(0,y)=0\) for any \(y\in [0,1]\).
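A rough numerical sanity check of this variational inequality can be carried out with finite differences. We do not restate \(\mathbb{L}\) from (4.14) here; the sketch below assumes the form of \(\mathbb{L}\) that can be read off from the expression for \(v_{yy}\) in the proof of Proposition 4.13, namely \(\mathbb{L} = \frac{1}{2}\sigma ^{2}x^{2}\partial _{xx} + (\beta _{2}+(g_{2}-g_{1})y)x\partial _{x} + \theta ^{2}y^{2}(1-y)^{2}\partial _{yy} + (\lambda _{2}-(\lambda _{1}+\lambda _{2})y)\partial _{y}\); the grid, the callables V and h, and all parameters are hypothetical inputs.

```python
import numpy as np

# Rough finite-difference evaluation of min{(L - rho)V + h, 1 - V_x} on an
# interior grid; the form of L is an assumption read off from the proof of
# Proposition 4.13, and V, h and all parameters are hypothetical inputs.
def hjb_residual(V, h, xs, ys, rho, sigma, beta2, g1, g2, lam1, lam2, theta2):
    dx, dy = xs[1] - xs[0], ys[1] - ys[0]
    X, Y = np.meshgrid(xs, ys, indexing="ij")
    Vg = V(X, Y)
    Vx = np.gradient(Vg, dx, axis=0)
    Vy = np.gradient(Vg, dy, axis=1)
    Vxx = np.gradient(Vx, dx, axis=0)
    Vyy = np.gradient(Vy, dy, axis=1)
    LV = (0.5 * sigma**2 * X**2 * Vxx
          + (beta2 + (g2 - g1) * Y) * X * Vx
          + theta2 * Y**2 * (1 - Y)**2 * Vyy
          + (lam2 - (lam1 + lam2) * Y) * Vy)
    return np.minimum(LV - rho * Vg + h(X), 1.0 - Vx)   # should be close to zero on O
```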