Optimal dividends with partial information and stopping of a degenerate reflecting diffusion

We study the optimal dividend problem for a firm's manager who has partial information on the profitability of the firm. The problem is formulated as one of singular stochastic control with partial information on the drift of the underlying process and with absorption. In the Markovian formulation, we have a 2-dimensional degenerate diffusion whose first component is singularly controlled and absorbed when it hits zero. The free boundary problem (FBP) associated with the value function of the control problem is analytically challenging due to the interplay of degeneracy and absorption. We find a probabilistic way to show that the value function of the dividend problem is a smooth solution of the FBP and to construct an optimal dividend strategy. Our approach establishes a new link between multidimensional singular stochastic control problems with absorption and problems of optimal stopping with `creation'. One key feature of the stopping problem is that creation occurs at a state-dependent rate, proportional to the `local time' of an auxiliary 2-dimensional reflecting diffusion.


Introduction
We study a singular stochastic control problem on a linearly controlled, 1-dimensional Brownian motion X with (random) drift µ. The problem is motivated by the dividend problem, where X denotes the revenues of a firm and the firm's manager needs to distribute dividends to the shareholders in an optimal way while being mindful of the risk of default. As in the existing literature, we account for the risk of default by letting the process X be absorbed upon reaching zero.
As one may expect, the optimal distribution of dividends is very sensitive to the profitability of the firm, which is encoded in the drift µ of the process X. A positive drift reflects a company in good health and, as a rule of thumb, dividends are paid when revenues are sufficiently high (which is expected to occur rather often) so as to keep a low risk of default. On the contrary, a negative drift indicates a firm that operates at a loss and therefore should be wound up as soon as possible by paying out all dividends.
Estimating profitability is a challenging task in many real-world situations and has already received attention in the mathematical economics literature; see, e.g., [20] for investment timing, [22] for contract theory and [14] for asset trading. In order to capture this feature in a non-trivial but tractable way, we assume partial information on the drift of the process X. This is a novelty compared to existing models of dividend distribution.
We remark that statistical estimation of the drift of a drifting Brownian motion from observation of the process is a much less efficient procedure than estimation of its volatility. Indeed, over a given period of time [0, T], the variance of the classical estimator for the volatility can be reduced by increasing the number of observations, whereas this is not the case for the variance of the estimator for the drift µ. The latter is of order 1/T (see [25, Example 2.1] for a simple example), hence an accurate estimate of µ requires a long period of observation under the exact same market conditions, which in reality is not feasible.
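This inefficiency is easy to see numerically. The following sketch (not from the paper; all parameter values are illustrative) compares the natural drift estimator (X_T − X_0)/T over a short and a long observation window: refining the time grid does not help, while the variance decays like σ²/T.

```python
import numpy as np

rng = np.random.default_rng(0)
mu, sigma = 0.5, 1.0  # illustrative values, not from the paper

def drift_estimate(T, n_steps):
    """Natural drift estimator from one path observed on [0, T]: (X_T - X_0) / T."""
    dt = T / n_steps
    increments = mu * dt + sigma * np.sqrt(dt) * rng.standard_normal(n_steps)
    return increments.sum() / T

# The variance of the estimator is sigma^2 / T: refining the grid (n_steps)
# does not reduce it; only a longer window T does.
for T in (1.0, 100.0):
    estimates = [drift_estimate(T, 1_000) for _ in range(2_000)]
    print(f"T = {T:6.1f}: sample var = {np.var(estimates):.4f}, theory = {sigma**2 / T:.4f}")
```

The same experiment with a finer grid (larger `n_steps`) leaves the sample variances essentially unchanged, in contrast to the quadratic-variation estimator of σ².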
Our study shows how the flow of information affects the firm manager's optimal dividend strategy. As in the informal discussion above, dividends are paid only when revenues exceed a critical value d*; in contrast to the existing literature, however, this critical value changes dynamically according to the manager's current belief in the profitability of the firm. As we explain in more detail below, such belief is described by a state variable π ∈ (0, 1), where a value of π close to 1 indicates a strong belief in a positive drift and a value close to 0 a strong belief in a negative drift. We observe that the critical value d* of the revenues increases (but stays bounded) as π increases. This is in line with the intuition that a firm with high profitability expects good performance and chooses to pay dividends when large revenues are realised, so that the risk of default is kept low and the business can be sustained over longer times. On the contrary, if there is a weak belief in the profitability of the firm, then dividends are paid also at lower levels of the revenues, as there is no expectation that these will increase in the future. The partially informed manager of our firm learns about the true value of the profitability by observing the stream of revenues X and adjusts her strategy accordingly, so that dividends are paid dynamically at different levels of revenue depending on the learning process.
The observation of X will in the end reveal the true drift µ, so that the belief of the firm's manager will eventually converge to either π = 0 or π = 1. Her dividend strategy will then converge to the corresponding strategy for the problem with full information (see Proposition 5.14). This shows that our model complements and extends the existing literature, which will be reviewed in the next section, by displaying a richer structure of the optimal solution and by effectively adding a new dimension (the belief) to the classical problem. For a broader discussion of the economic foundations and implications of a dividend problem with partial information, we also refer the reader to the introduction of the preprint [21], where a special case of our problem is studied with different methods (a detailed comparison is given in the final three paragraphs of the next section).
1.1. Mathematical background and overview of main results. Our specific mathematical interest is in the explicit characterisation of the optimal control in terms of an optimal boundary arising from an associated free boundary problem. To the best of our knowledge, the study of free boundaries for singular stochastic control problems associated to diffusions with absorption and partial information has never been addressed in the literature. Recently, Øksendal and Sulem [39] studied general maximum principles for singular control problems with partial information. Their approach relies mostly on backward stochastic differential equations (BSDEs), and they provide general abstract results linking the value of the singular control problem to the solutions of suitable BSDEs. Here, instead, we focus on a specific problem with the aim of a more detailed study of the optimal control. It is worth noticing that [39] does not consider the case of absorbed diffusions, which is a source of interesting mathematical facts in our paper, as we will discuss below.
For the sake of tractability we choose a model in which µ is a random variable that can take only two real values, i.e. µ ∈ {µ_0, µ_1}, with µ_0 < µ_1. The company's revenue, net of dividend payments, at time t reads

X^D_t = x + µ t + σ B_t − D_t,   (1.1)

where x > 0 is the initial revenue, B is a Brownian motion, σ > 0, and D_t denotes the total amount of dividends paid up to time t (notice that D is a non-decreasing process and we choose it to be right-continuous). As in the most canonical formulation of the dividend problem, the firm's manager wants to maximise the discounted flow of dividends until the firm goes bankrupt. Moreover, the manager can infer the true value of µ by observing the evolution of X.
Using filtering techniques, the problem can be written in a Markovian framework by considering simultaneously the dynamics of X^D and of the belief process π_t := P(µ = µ_1 | F^X_t), where F^X_t = σ(X_s, s ≤ t). This approach has a long and venerable history in optimal stopping theory, with early contributions dating back to work of Shiryaev in the 1960s in the context of quickest detection (see [46] for a survey and [34] for some recent developments and further references). However, it seems that such a model has never been adopted in the context of singular control.
One difficulty arising from the reduction to a Markovian framework is that the dynamics of the state process are two-dimensional and diffusive. This leads to a variational formulation of the stochastic control problem in terms of PDEs and, therefore, explicit solutions cannot be provided in general.
The literature on the optimal dividend problem is very rich, with seminal mathematical contributions by Jeanblanc and Shiryaev [32] and Radner and Shepp [43]. More recent contributions include, among many others (see, e.g., the survey [2]): [1] and [23], who consider random interest rates; [3], who allow for jumps in the dynamics of X; [33], who consider regime-switching dynamics for the coefficients in (1.1); [4], who consider jumps in the dynamics of X and fixed transaction costs for lump dividend payments. However, research so far has largely focused on explicitly solvable examples. This means that, in the vast majority of papers, the underlying stochastic dynamics are either one-dimensional, or two-dimensional but with one of the state processes driven by a Markov chain. Moreover, the time horizon γ_D of the optimisation is usually assumed to be the first time X^D falls below some level a ≥ 0. Alternatively, capital injection is allowed and the optimisation continues indefinitely, i.e. γ_D = +∞. These choices of γ_D make the problem time-homogeneous and easier to deal with. In the absence of capital injection, even just assuming a finite time horizon for the dividend problem, i.e. taking γ_D ∧ T for some deterministic T > 0, introduces major technical difficulties. These were addressed first in [29] and [30] with PDE methods, and then in [16] with probabilistic methods. Interestingly, the finite time horizon is more easily tractable in the presence of capital injection, as shown in [26] using ideas originally contained in [24].
Here we take the approach suggested in [16] but, as we will explain below, we substantially expand the results therein. First, we link our dividend problem to a suitable optimal stopping one. Then we solve the optimal stopping problem (OSP) by characterising its optimal stopping rule in terms of a free boundary π → d(π). Finally, we deduce from properties of the value function U of the OSP that the value function V of the dividend problem is a strong solution of an associated variational inequality on R_+ × [0, 1] with gradient constraint. Moreover, using the boundary d(·), we express the optimal dividend strategy as an explicit process depending on t → d(π_t). It is worth noticing that we can prove V ∈ C^1(R_+ × (0, 1)), with V_xx and V_xπ belonging to C(R_+ × (0, 1)) and V_ππ ∈ L^∞(R_+ × (0, 1)). This type of global regularity cannot be easily obtained with PDE methods due to the degeneracy of the underlying diffusion. Here we obtain these results via a careful probabilistic study of the value function U. In particular, the argument used to prove V_ππ ∈ L^∞(R_+ × (0, 1)) in Proposition 6.2 seems completely new in the related literature.
As in [16], the presence of an absorbing point for the process X^D 'destroys' the standard link between optimal stopping and singular control. Such a link has been studied by many authors: Bather and Chernoff [6] and Benes, Shepp and Witsenhausen [7] were the first to observe it, and Taksar [48] provided an early connection to Dynkin games. Extensions and refinements of the initial results were obtained in a long series of subsequent papers using different methodologies. Just to mention a few, we recall [10], [24] and [35], which address the problem with probabilistic methods, [8], which uses viscosity theory, and [31], which links singular control problems to switching problems.
Departing from the literature mentioned above, here we prove that V_x = U, where now U is the value function of an OSP whose underlying process is a 2-dimensional, uncontrolled, degenerate diffusion ( X̃, π̃ ), which lives in R_+ × [0, 1] and is reflected at {0} × (0, 1), towards the interior of the domain, along the direction of a state-dependent vector v(π̃) (see Section 4.1). Moreover, upon each reflection, the gain process underlying the OSP increases exponentially at a rate that depends on the 'intensity' of the reflection and on the value of the process π̃_t. We call this behaviour of the gain process 'state-dependent creation' of the process ( X̃, π̃ ) at {0} × (0, 1) (cf. [40]). Indeed, it is interesting that the 'creation' feature of our reflected process links our paper to work by Stroock and Williams [47] and Peskir [40] concerning a type of non-Feller boundary behaviour of 1-dimensional Brownian motion with drift. Notice, however, that in those papers the creation rate is constant and the problem is set on the real line, so that the direction of reflection is fixed. Here, instead, we deal with a non-trivial, two-dimensional extension of the problem studied in [47] and [40].
A striking difference from the problem studied in [16] lies in the much more involved dynamics underlying the OSP and in the behaviour of the gain process. In [16] the state dynamics in the control problem is of the form (t, X^D_t), with X^D as in (1.1) but with a deterministic constant drift. This leads to an optimal stopping problem involving a 1-dimensional Brownian motion with drift which is reflected at zero and which is created (in the same sense as above) at a constant rate. The state variable 'time' is unaffected by the link between the dividend problem and the stopping one. Here, instead, the correlation in the dynamics of X^D_t and π_t in the control problem induces two main effects: (i) the reflection of the process ( X̃, π̃ ) occurs along the stochastic vector process t → v(π̃_t) (see (4.5)-(4.6)); (ii) the creation rate is non-constant and depends on the process π̃ (see (4.8)).
The reflection of ( X̃, π̃ ) at {0} × (0, 1) is realised by an increasing process (A_t)_{t≥0}, which we can write down explicitly (see (4.14)) and which we will informally refer to as the 'local time' of ( X̃, π̃ ) at {0} × (0, 1). Beyond its use in solving the dividend problem, the OSP that we derive is interesting in its own right and belongs to a class of problems that, to the best of our knowledge, has never been studied before. In particular, it is an optimal stopping problem for a multi-dimensional diffusion, reflected in a domain O, with a gain process that increases exponentially at a non-constant rate proportional to the local time spent by the process on some portions of ∂O.
In conclusion, we believe that the main mathematical contributions of our work are the following: (i) for the first time we characterise the free boundary associated with a singular stochastic control problem with partial information on the drift of the process and absorption; (ii) we obtain rather strong regularity results for the value V of the control problem, despite the degeneracy of the associated HJB operator; (iii) we find a non-trivial connection between singular control for multi-dimensional diffusions with absorption and optimal stopping of reflected diffusions with 'state-dependent creation'; (iv) we solve an example of a new class of optimal stopping problems, whose popularity we hope will grow as their role in the dividend problem becomes better understood.
After completing this work we learned about the preprint [21], where the same problem is addressed in the special case µ_1 = −µ_0. In that setting the problem's dimension can be reduced by a transformation that makes one of the two state processes purely controlled (a closer inspection reveals that this is in line with the case of a null drift in our (4.45)). The problem in [21] can be solved by 'guess-and-verify' via a parameter-dependent family of ODEs with suitable boundary conditions. The methods of [21] cannot be used for generic µ_0 and µ_1, because the dimension reduction is impossible and the ODE becomes a 2-dimensional free boundary problem involving partial derivatives.
Besides the methodological differences between the two papers, the optimal strategy obtained in [21] shares similarities with ours, but it also features a remarkable difference. Due to the fact that one of the state variables is purely controlled, in [21] the level of future revenues at which dividends will be paid can only increase after each dividend payment. As stated in [21], this can be understood as the firm's manager 'becoming more confident about the relevance of their project'. When revenues reach a new maximum, this suggests to the manager that the drift is positive; however, the symmetric structure µ_1 = −µ_0 is such that she does not subsequently change her view, even if revenues start fluctuating downwards. This stands in sharp contrast with our solution, which instead allows the manager to increase or decrease her revenue target level depending on the new information acquired.
The rest of the paper is organised as follows. In Section 2 we cast the problem and provide its Markovian formulation. Section 3 introduces the verification theorem, which we aim to prove probabilistically in the subsequent sections. The main technical contribution of the paper is contained in Sections 4, 5 and 6. In the first part of Section 4 we introduce the stopping problem for a 2-dimensional degenerate diffusion with state-dependent reflection. Then, in the rest of Section 4 and in Section 5, we study properties of the associated value function and obtain geometric properties of the optimal stopping set. In Section 6 we prove that the value function and the optimal control of the dividend problem can be constructed from the value function of the optimal stopping problem and its optimal stopping region. A short appendix contains a rather standard proof of the verification theorem stated in Section 3.

Setting
We consider a complete probability space (Ω, F, P) equipped with a 1-dimensional Brownian motion (B_t)_{t≥0} and its natural filtration (F^B_t)_{t≥0}, completed with the P-null sets. On the same probability space we also have a random variable µ which is independent of B and takes two possible real values µ_0 < µ_1, with probability P(µ = µ_1) = π ∈ [0, 1]. Further, given x > 0 and σ > 0, we model the firm's revenue in the absence of dividend payments by the process (X_t)_{t≥0} defined as

X_t := x + µ t + σ B_t,  t ≥ 0.   (2.1)

We denote by (F^X_t)_{t≥0} the filtration generated by X, and we say that a dividend strategy is an (F^X_t)_{t≥0}-adapted, increasing, right-continuous process (D_t)_{t≥0} with D_{0−} = 0. In particular, D_t represents the cumulative amount of dividends paid by the firm up to time t, and we say that the firm's profit, under the dividend strategy D, is

X^D_t := X_t − D_t,  t ≥ 0.   (2.2)

Notice that for D ≡ 0 we formally have X^0 = X. As is customary in the dividend problem, we define a default time, at which the firm stops paying dividends, by γ_D := inf{t ≥ 0 : X^D_t ≤ 0}. Equipped with this simple model for the firm's profitability, the manager of the firm wants to maximise the expected flow of discounted dividends until the default time, where discounting occurs at a constant rate ρ > 0, i.e.:

sup_{D ∈ A} E[ ∫_{[0, γ_D]} e^{−ρt} dD_t ],   (2.3)

where A denotes the set of admissible dividend strategies. It is important to notice that the drift of X^D is not affected by the choice of D, so that X = X^D + D. Moreover, the control process D is chosen by the firm's manager based on their observation of the process X, and it is therefore natural that D_t should be F^X_t-measurable. It is well known that the dynamics (2.2) may be rewritten in a more tractable Markovian form thanks to standard filtering methods (see, e.g., [45, Sec. 4.2]).
In particular, denoting π_t := P(µ = µ_1 | F^X_t), one can construct an ((F^X_t)_{t≥0}, P)-Brownian motion (W_t)_{t≥0} and write the dynamics of the couple (X^D_t, π_t)_{t≥0}, for all t > 0, in the form

dX^D_t = (µ_0 + µ̄ π_t) dt + σ dW_t − dD_t,   (2.4)
dπ_t = θ π_t (1 − π_t) dW_t,   (2.5)

under the measure P, with µ̄ := µ_1 − µ_0 and θ := µ̄/σ. We notice that (2.4) can be obtained from (2.2) by formally replacing µ with E[µ | F^X_t]. Moreover, (π_t)_{t≥0} in (2.5) is a bounded martingale, hence it is a martingale on [0, ∞] and, in particular, π_∞ ∈ {0, 1}, since all information is revealed at time t = ∞.
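For readers who wish to experiment, here is a minimal Euler-scheme sketch (not from the paper; all parameter values are illustrative) of the uncontrolled filtered pair, assuming the standard filtering dynamics dX_t = (µ_0 + µ̄π_t) dt + σ dW_t and dπ_t = θπ_t(1 − π_t) dW_t. It checks numerically that π is a martingale and stays in [0, 1].

```python
import numpy as np

rng = np.random.default_rng(1)
mu0, mu1, sigma = -0.5, 0.5, 1.0   # illustrative values
mu_bar = mu1 - mu0                  # notation from the filtering step
theta = mu_bar / sigma              # signal-to-noise ratio

def simulate_pair(x0, pi0, T=5.0, n=2_000):
    """Euler scheme for the uncontrolled filtered pair (X, pi):
       dX  = (mu0 + mu_bar * pi) dt + sigma dW,
       dpi = theta * pi * (1 - pi) dW   (no drift: pi is a martingale)."""
    dt = T / n
    x, pi = x0, pi0
    for _ in range(n):
        dW = np.sqrt(dt) * rng.standard_normal()
        x += (mu0 + mu_bar * pi) * dt + sigma * dW
        pi += theta * pi * (1.0 - pi) * dW   # learning is fastest near pi = 1/2
        pi = min(max(pi, 0.0), 1.0)          # guard against Euler overshoot
    return x, pi

terminal_beliefs = [simulate_pair(1.0, 0.5)[1] for _ in range(200)]
print("mean of pi_T:", np.mean(terminal_beliefs))  # close to pi_0 = 0.5 (martingale)
```

Over a long horizon the simulated beliefs cluster near 0 and 1, in line with π_∞ ∈ {0, 1}, while their average stays near the prior.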
Intuitively, we can say that at any given time t ≥ 0 the amount of new information which becomes available to the firm's manager is measured by the absolute value of the increment ∆π t . Then, the learning rate depends on the so-called signal-to-noise ratio θ and on the current belief π t , which appear in the diffusion coefficient in (2.5). Given an increment ∆W t of the Brownian motion, the value of |∆π t | is increasing in the signal-to-noise ratio, as expected. Further, the maximum of the diffusion coefficient (hence the maximum learning rate) occurs when π t = 1/2, which corresponds to the most uncertain situation.
Since (X^D_t, D_t, π_t, W_t)_{t≥0} is (F^X_t)_{t≥0}-adapted and we do not need to consider any other filtration, from now on we denote F_t = F^X_t to simplify the notation. In the new Markovian framework our problem (2.3) reads

V(x, π) := sup_{D ∈ A} E_{x,π}[ ∫_{[0, γ_D]} e^{−ρt} dD_t ].   (2.6)

The formulation (2.6) of the optimal dividend problem with partial information corresponds to a singular stochastic control problem involving a 2-dimensional degenerate diffusion which is killed upon leaving the set R_+ × (0, 1) (recall that if π_0 ∈ (0, 1) then π_t ∈ (0, 1) for all t ∈ (0, +∞), whereas if π_0 ∈ {0, 1} then π_t = π_0 for all t > 0). In the economic literature, the value function V of (2.6) is traditionally regarded as the value of the firm itself.
Remark 2.1. The case of full information corresponds to π ∈ {0, 1}. In this case it is known that if the drift satisfies µ ≤ 0, then it is optimal to pay all dividends immediately and liquidate the firm. On the contrary, if µ > 0, then dividends should be paid gradually, according to a strategy characterised by a Skorokhod reflection of the process X^D against a positive (moving) boundary (see [32] for the stationary case and [16] for the non-stationary one).
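A quick way to build intuition for the reflection strategy in the full-information case is Monte Carlo. The sketch below (not from the paper) discretises a constant-barrier strategy: any excess of X^D above a hypothetical barrier b is paid out immediately as dividends and the game stops at ruin. The barrier and all parameters are illustrative, not the optimal ones from [32].

```python
import numpy as np

rng = np.random.default_rng(2)
mu, sigma, rho = 0.3, 1.0, 0.05   # illustrative values

def barrier_strategy_payoff(x0, b, T=50.0, n=50_000):
    """One path of the constant-barrier strategy: any excess of X^D above b is
    paid out immediately as dividends; the game stops at ruin (X^D <= 0).
    Returns the realised discounted dividend stream."""
    dt = T / n
    factor = np.exp(-rho * dt)
    x = min(x0, b)
    value = max(x0 - b, 0.0)   # lump payment if we start above the barrier
    disc = 1.0
    for _ in range(n):
        x += mu * dt + sigma * np.sqrt(dt) * rng.standard_normal()
        if x > b:                      # Skorokhod reflection at b, discretised
            value += disc * (x - b)
            x = b
        if x <= 0.0:                   # absorption: default, no more dividends
            break
        disc *= factor
    return value

# Hypothetical barrier b = 1.5 (not the optimal one); average a few paths.
payoffs = [barrier_strategy_payoff(1.0, 1.5) for _ in range(10)]
print("Monte Carlo value estimate:", np.mean(payoffs))
```

Varying b in such an experiment illustrates the trade-off in Remark 2.1: a low barrier pays early but raises the default risk, a high barrier delays payments.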
In our setting with partial information, it is clear that µ_0 < µ_1 ≤ 0 would lead to an immediate liquidation of the firm. The cases µ_1 > µ_0 ≥ 0 and µ_0 < 0 < µ_1 instead need to be studied separately, as they present subtle technical differences which would make a unified exposition rather lengthy. In this paper we start with the case µ_0 < 0 < µ_1, which seems economically the most interesting, as it represents the uncertainty of a firm that cannot predict exactly whether its line of business follows an increasing or a decreasing future trend.
Motivated by the remark above, we make the following standing assumption throughout the paper:

µ_0 < 0 < µ_1.

We close this section by introducing the infinitesimal generator L_{X,π} associated with the uncontrolled process (X_t, π_t)_{t≥0}. For functions f ∈ C^2(R_+ × [0, 1]) we have

(L_{X,π} f)(x, π) = (µ_0 + µ̄π) f_x(x, π) + (σ²/2) f_xx(x, π) + σθπ(1 − π) f_xπ(x, π) + (θ²/2) π²(1 − π)² f_ππ(x, π),

for (x, π) ∈ R_+ × [0, 1], where f_xx, f_xπ, f_ππ are second derivatives and f_x a first derivative. For simplicity, in the rest of the paper we also define O := (0, +∞) × (0, 1).
Moreover, given a set A we denote by Ā its closure. Following the approach introduced in [16], in the next section we will start our analysis by providing a verification theorem for V. Then we will use the latter to conjecture an optimal stopping problem that should be associated with V_x. It will soon become clear that the construction of [16] is substantially easier than the one needed here. Our new construction also leads to a much more involved optimal stopping problem.

A verification theorem
A familiar heuristic use of the dynamic programming principle suggests that, for any admissible control D, the process

e^{−ρ(t ∧ γ_D)} V(X^D_{t ∧ γ_D}, π_{t ∧ γ_D}) + ∫_{[0, t ∧ γ_D]} e^{−ρs} dD_s,  t ≥ 0,   (3.1)

should be a super-martingale and, if D = D* is an optimal control, then (3.1) should be a martingale. Moreover, given a starting point (x, π), one strategy could be to pay immediately a small amount δ of dividends, hence shifting the dynamics to the point (x − δ, π), and then continue optimally. Since this would in general be sub-optimal, one has V(x, π) ≥ V(x − δ, π) + δ, which yields V_x(x, π) ≥ 1 in the limit as δ → 0. If the inequality is strict, then the suggested strategy is strictly sub-optimal. Hence, the firm should pay dividends when V_x = 1 and do nothing when V_x > 1. It is also clear from (2.6) that V(0, π) = 0 for all π ∈ [0, 1].
Based on this heuristic we can formulate the following verification theorem. Its proof is rather standard (see, e.g., [27,Thm. 4.1, Ch. VIII]) and we give it in appendix for completeness.
Theorem 3.1. Assume that v ∈ C^2(O) satisfies 0 ≤ v(x, π) ≤ c x, for all (x, π) ∈ O and some c > 0, and that it solves

max{ (L_{X,π} − ρ)v , 1 − v_x } ≤ 0 on O, with v(0, π) = 0 for all π ∈ (0, 1).   (3.2)

Then v ≥ V on O.
Let us denote I_v := {(x, π) ∈ Ō : v_x(x, π) > 1}. In addition to the above, assume that v ∈ C^2(I_v ∩ O) and that there exists D^v ∈ A such that, P_{x,π}-almost surely for all t ∈ [0, γ_{D^v}],

(X^{D^v}_t, π_t) ∈ Ī_v and ∫_{[0,t]} 1_{{v_x(X^{D^v}_s, π_s) > 1}} dD^v_s = 0.

Then V = v on O and D* := D^v is an optimal dividend strategy.
From now on we will denote the inaction set for problem (2.6) by I and, if V ∈ C^1(O), this corresponds to the set

I = {(x, π) : V_x(x, π) > 1}.   (3.7)

For future reference we also recall that if V ∈ C^2(O) solves (3.2), then in particular we have

(L_{X,π} − ρ)V = 0 on I.   (3.8)

Stopping a 2-dimensional diffusion with reflection and creation
In this section we construct an optimal stopping problem (OSP) involving a 2-dimensional degenerate diffusion. Such diffusion is kept inside Ō by reflection at {0} × (0, 1), and it is also 'created' upon each new reflection, in a sense that will be clarified mathematically below. Here we also begin a detailed study of the optimal stopping region and of the value function of this OSP, which will be instrumental in solving problem (2.6).

4.1. Construction of the stopping problem. Let us assume for a moment that V ∈ C^2(O), so that the boundary condition V(0, π) = 0 would also imply V_π(0, π) = V_ππ(0, π) = 0. Then, for all π ∈ (0, 1) for which (0, π) ∈ I (see (3.7)), we get from (3.8)

(µ_0 + µ̄π) V_x(0, π) + (σ²/2) V_xx(0, π) + σθπ(1 − π) V_xπ(0, π) = 0.   (4.1)

Moreover, formally differentiating (3.8) and using (4.1), we obtain that u := V_x solves the variational problem (4.2)-(4.3)-(4.4). We claim that this variational problem should be connected to the optimal stopping problem (4.8) given below. First we state the problem, then we give a heuristic justification of our claim and, finally, we prove in several steps that our conjecture is indeed correct. Let ( X̃, π̃ ) be the solution of the system (4.5)-(4.6), for t > 0, where (A_t)_{t≥0} is an increasing continuous process, started at time zero from A_0 = 0 and such that, P-a.s.,

∫_0^∞ 1_{{ X̃_t > 0 }} dA_t = 0,   (4.7)

i.e. A increases only when X̃ is at zero. Notably, the process ( X̃, π̃ ) is a 2-dimensional degenerate diffusion which is reflected at {0} × (0, 1). Although the existence of such a reflected process may be deduced from standard theory (see, e.g., [5] for a general exposition and further references), we will not dwell on this issue here. In fact, in the next section the reflected SDE (4.5)-(4.6) is reduced to an equivalent but simpler one (see (4.12)-(4.13) below), for which a solution can be computed explicitly, hence implying that (4.5)-(4.6) admits a solution as well.
For (x, π) ∈ O, let us now consider the problem where the supremum is taken over all P x,π -a.s. finite stopping times.
Associated with the above problem we also introduce the so-called continuation and stopping sets, denoted by C and S, respectively. These are defined as usual, and it is immediate to observe that, if U = V_x, then C = I (recall (3.7)).
The heuristic that associates (4.8) with (4.2)-(4.4) goes as follows: suppose u ∈ C^2(O) is a solution of (4.2)-(4.4). Then (L_{X,π} − ρ)u ≤ 0 on O and an application of Dynkin's formula, combined with the use of (4.4) and the inequality u ≥ 1, gives an upper bound on the expected payoff for any stopping time τ; hence u ≥ U. Moreover, the inequality becomes an equality if we choose τ as the first exit time from I, and this concludes the heuristic.
The rest of this section is devoted to the analysis of problem (4.8) in order to show that indeed U = V x and that U solves (4.2)-(4.4).

4.2. A Girsanov transformation.
It turns out that the problem may be more conveniently addressed under a different probability measure. As is customary in problems involving the belief process (see, e.g., [25], [38] or [34]), we introduce the analogue for π̃_t of the so-called likelihood ratio process, i.e.

Φ_t := π̃_t / (1 − π̃_t),  t ≥ 0.

By direct computation it is not hard to derive the dynamics of Φ, for t > 0, in the form

dΦ_t = θ Φ_t ( dW_t + θ π̃_t dt + σ^{−1} dA_t ).

With the aim of turning W_t + θ ∫_0^t π̃_s ds into a Brownian motion, we follow the same steps as in [25] and introduce a new probability measure Q via its Radon-Nikodym derivative

dQ/dP |_{F_T} = exp( −θ ∫_0^T π̃_s dW_s − (θ²/2) ∫_0^T π̃_s² ds ),   (4.11)

for some T > 0. Under the new measure Q, the process

W^Q_t := W_t + θ ∫_0^t π̃_s ds,  t ∈ [0, T],

is a Brownian motion and the dynamics of ( X̃, Φ) read

dX̃_t = µ_0 dt + σ dW^Q_t + dA_t,   (4.12)
dΦ_t = θ Φ_t ( dW^Q_t + σ^{−1} dA_t ).   (4.13)

One advantage of this formulation is that the process X̃ is decoupled from the process Φ and, thanks to (4.7), it is just a Brownian motion with drift µ_0 reflected at zero. In particular, this allows us to compute a simple expression for A. Indeed, Q_{x,ϕ}-a.s. on [0, T] we have (see [36, Lemma 3.6.14])

A_t = max( 0, sup_{0 ≤ s ≤ t} ( −(x + µ_0 s + σ W^Q_s) ) ),   (4.14)

and we can express the dynamics of Φ explicitly as

Φ_t = ϕ exp( θ W^Q_t − (θ²/2) t + (θ/σ) A_t ),

where the dependence on x is given explicitly by (4.14). Sometimes we will also use the notation ( X̃^x, A^x, Φ^{x,ϕ} ) to express the dependence of ( X̃, A, Φ) on the initial point (x, ϕ).
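The simple expression for A cited above is the classical Skorokhod construction for Brownian motion with drift reflected at zero. The following numerical sketch (not from the paper; illustrative parameters, with W^Q simulated as a standard Brownian motion) builds the reflected path X̃ = x + µ_0 t + σW + A, with A_t the running maximum of the negative part of the free path, and checks that X̃ stays non-negative and that A increases only while X̃ sits at zero.

```python
import numpy as np

rng = np.random.default_rng(3)
mu0, sigma = -1.0, 1.0   # illustrative values (mu0 < 0)

def reflected_path(x0, T=20.0, n=200_000):
    """Skorokhod construction of Brownian motion with drift mu0, reflected at 0:
       free_t = x0 + mu0*t + sigma*W_t,
       A_t    = sup_{s<=t} max(0, -free_s),
       X_t    = free_t + A_t  (non-negative; A grows only while X is at 0)."""
    t = np.linspace(0.0, T, n + 1)
    W = np.concatenate(([0.0], np.cumsum(np.sqrt(T / n) * rng.standard_normal(n))))
    free = x0 + mu0 * t + sigma * W
    A = np.maximum.accumulate(np.maximum(0.0, -free))
    return free + A, A

X, A = reflected_path(x0=1.0)
dA = np.diff(A)
print("min of X:", X.min())                              # >= 0 by construction
print("A_T:", A[-1])                                     # > 0 once the path has hit zero
print("X at increase times of A:", X[1:][dA > 0].max())  # exactly 0
```

The same running-maximum construction underlies the explicit solvability of the reduced SDE in this section: the reflection process depends on the path of the driving Brownian motion alone, not on Φ.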
In order to rewrite problem (4.8) in the new variables, we introduce the process Z and notice that P_{x,ϕ}(Z_0 = 1) = 1; under the measure P_{x,ϕ}, Z satisfies the SDE (4.15). Recalling (4.11) and rewriting this SDE in terms of an exponential gives (4.16), with the same T > 0 as in (4.11). Now, for any stopping time τ and any (x, ϕ), we immediately see that (4.16) implies (4.17). We would like to extend this equality to the case T = +∞, which requires a short digression, as Girsanov's theorem does not apply directly.
Since we are interested in properties of the value functions, we can define a new probability space (Ω′, F′, P) equipped with a Brownian motion W and a filtration (F′_t)_{t≥0}, and let ( X̃′, Φ′) be the unique strong solution of the SDE (4.12)-(4.13) driven by W (instead of W^Q), with a corresponding process A′ as in (4.14). In this setting we can define the corresponding stopping problems, where E denotes the expectation under P. Now, U^Q(x, ϕ; T) = U(x, ϕ; T) by the equivalence in law of the processes ( X̃, Φ, A, W^Q) under Q and ( X̃′, Φ′, A′, W) under P; combining these facts with (4.17), we obtain (4.18). The proof of (4.18) is the same as that of (4.27) below, and we omit it here for brevity.
Finally, with a slight abuse of notation, we relabel ( X̃′, Φ′, A′, W) as ( X̃, Φ, A, W). Problem (4.20) is somewhat easier to analyse than the original (4.8), because the dynamics (4.12)-(4.13) of ( X̃, Φ), driven by W under P_{x,ϕ}, are more explicit than those of ( X̃, π̃), driven by W under P_{x,π} (see (4.5)-(4.6)).
For α > 0, setting β = α + σ²ρ/(2α), the use of (4.21) gives the bound (4.22) for any stopping time τ. A great deal of standard results in optimal stopping theory relies on the assumption (4.23). In particular, (4.23) would normally be used to show that τ*, defined in (4.24), is the minimal optimal stopping time for problem (4.20) whenever P_{x,ϕ}(τ* < +∞) = 1, and otherwise the minimal optimal Markov time (see [45]); notice also that for problem (4.8) we rewrite (4.24) in terms of ( X̃, π̃). Moreover, (4.23) would also guarantee the (super-)martingale property of the discounted value process (N_t)_{t≥0}, as in (4.25)-(4.26). Condition (4.23) may be fulfilled in our setting by choosing ρ sufficiently large in comparison with the coefficients (µ_0, µ_1, σ); in fact, we notice that the relevant process is not uniformly integrable in general. As it turns out, by following a slightly different approach we can still achieve (4.24)-(4.26) with no restriction on ρ other than ρ > 0.
For n ≥ 1, let us denote ζ_n := inf{t ≥ 0 : Φ_t ≥ n} and consider the sequence of problems with value functions U_n as in (4.27). It is clear that such truncated problems fulfil condition (4.23), since the process ( X̃, Φ) is stopped at ζ_n. Hence the stopping time in (4.28) is optimal for problem (4.27). Moreover, the process (N^n_t)_{t≥0} defined in (4.29) satisfies the analogue of conditions (4.25)-(4.26), and we obtain the following useful results: (4.30) holds and, moreover, there exists a universal constant c_1 > 0 such that (4.31) holds.

Proof. Clearly U_n ≤ U for all n, and the sequence is increasing because the set of admissible stopping times is increasing. For any P_{x,ϕ}-a.s. finite stopping time τ, Fatou's lemma gives a lower bound which implies U(x, ϕ) ≤ lim inf_{n→∞} U_n(x, ϕ), and therefore (4.30).
Let us now analyse (4.31). For any stopping time τ, using (4.15), we can split the relevant expectation into two terms and study them separately. For the first one, given that µ_0 < 0, the expectation is trivially bounded above by one.
Hence, U n fulfils (4.31) for all n ≥ 1 and then (4.30) implies that the bound holds for U as well.
It is also useful to state a continuity result for U n .
Thanks to (4.15) we know that the map (4.37) is P-a.s. linear in ϕ for any stopping time τ . Using this fact and a standard inequality for suprema, we obtain an estimate valid for α ∈ (0, 1), ϕ 1 , ϕ 2 ∈ R + and each given x ∈ R + . Since the map (4.37) is monotonically increasing, it also follows that ϕ → U (x, ϕ) is increasing, as claimed. (The latter could also have been deduced from the monotonicity of ϕ → U n (x, ϕ).) Next, we observe that (iii) follows immediately from (4.31), upon noticing also that U (x, ϕ) ≥ 1 + ϕ. It only remains to prove (iv). From (4.31), using (4.15), where we recall that W θ is a P θ -Brownian motion, and then (4.22), we can find a universal constant c ′ 1 > 0 such that the required estimate holds. Then (4.36) follows by taking t → ∞.
There are several conclusions that one can draw from Proposition 4.4. First, we notice that (U − U n ) n≥1 is a decreasing sequence of continuous functions that converges to zero; therefore, Dini's theorem implies that the convergence is uniform on any compact K ⊂ [0, +∞) × (0, +∞). Now we can use this fact and an argument inspired by [12, Lem. 4.17] and [11, Lem. 6.2] to prove the next lemma, with τ * as in (4.24).
The above lemma implies optimality of τ * , as explained in the next proposition. Proof. We start by showing (4.25)-(4.26). Recall the process (N n t ) t≥0 defined in (4.29) and notice that (4.25)-(4.26) hold for such a process. Then, for any s ≥ t we have, P x,ϕ -a.s.,
In order to prove (4.42), we notice that (4.26) implies, for any t ≥ 0 where we have used continuity of U in the second equality. Letting t → ∞, the transversality condition (4.36) gives (4.42).
Before closing this section we illustrate consequences of Proposition 4.4 for the shape of the continuation and stopping sets C and S. These are summarised in the next corollary.
Corollary 4.7. The continuation set C is open and the stopping set S is closed. The continuation set is connected in the ϕ variable, i.e., for all ϕ ′ > ϕ we have (x, ϕ) ∈ C =⇒ (x, ϕ ′ ) ∈ C. Proof. The first statement is trivial due to (i) in Proposition 4.4. The second statement follows from the fact that ϕ → U (x, ϕ) − (1 + ϕ) is convex (by (ii) in Proposition 4.4), non-negative, and satisfies (iii) in Proposition 4.4.
For future frequent use we define for any x ∈ [0, +∞), with the convention that sup ∅ = 0. Clearly C and ψ are related by (see also Remark 4.1).
Next we will infer monotonicity of ψ(·) and therefore the existence of a generalised inverse c(·), which is more convenient for a fuller geometric characterisation of C. This will be done in the next sections.

4.4. A parabolic formulation.
Since both components of ( X, Φ) are driven by the same Brownian motion, we can equivalently consider a 2-dimensional state dynamics in which only one component has a diffusive part. This is done following a method similar to the one used in several papers addressing partial information, including [18,34].
Let us define a new process ( Y t ) t≥0 by setting, P x,ϕ -a.s. for all t ≥ 0, Y t := (σ/θ) ln(Φ t ) − X t . Then, letting y := (σ/θ) ln(ϕ) − x, it is easy to verify that the couple ( X, Y ) evolves under P x,y according to (4.45). In order to rewrite our problem (4.20) in terms of the new dynamics we set g accordingly and from (4.20) we obtain the formulation (4.46). Another formulation of the problem, which will be useful below, may be obtained by an application of Dynkin's formula (up to standard localisation arguments). Indeed, we can write the equivalent formulation (4.49), where we have also used that dA t = 1 { X t =0} dA t (cf. (4.7)). Recalling from Proposition 4.4 that ϕ → U (x, ϕ) − (1 + ϕ) is convex and non-negative with U (x, 0+) = 1, it follows that the corresponding mapping is also non-decreasing. For frequent future use we introduce the second-order operator L X,Y associated to ( X, Y ): for f ∈ C 1,2 ([0, +∞) × R) and (x, y) ∈ [0, +∞) × R we set it in the usual way. Hence U is C 1,2 in C ∩ ((0, +∞) × R). Now we turn to the analysis of the geometry of C. First we show that C ≠ ∅. Proposition 4.9. We have C ≠ ∅ and, in particular, {0} × (y ℓ , +∞) ⊂ C with y ℓ := (σ/θ) ln(−µ 0 /µ 1 ). Proof. Fix ε > 0 and take y > y ℓ + ε. Notice that there exist c 1,ε > 0 and c 2,ε > 0 such that, P 0,y -a.s.,
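The absence of a diffusive part in Y can be verified by a short Itô computation. The following sketch assumes (consistently with the filtering literature, though not all of it is stated explicitly in this section) that θ = (µ1 − µ0)/σ and that, during excursions of X away from zero, d X t = µ0 dt + σ dW t and dΦ t = θ Φ t dW t under P x,ϕ :

```latex
dY_t \;=\; \frac{\sigma}{\theta}\,d\ln\Phi_t \;-\; d\widetilde X_t
      \;=\; \frac{\sigma}{\theta}\Big(\theta\,dW_t-\frac{\theta^{2}}{2}\,dt\Big)
            \;-\;\big(\mu_0\,dt+\sigma\,dW_t\big)
      \;=\; -\Big(\frac{\sigma\theta}{2}+\mu_0\Big)dt
      \;=\; -\,\frac{\mu_0+\mu_1}{2}\,dt .
```

The Brownian terms cancel, so Y is of bounded variation and, during such excursions, moves at the constant rate −(µ0 + µ1)/2; this is consistent with the monotonicity of Y used in the proofs below (Y non-increasing when µ0 + µ1 ≥ 0, increasing otherwise).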
By (4.45) we notice that, while X is away from 0, the process Y has either a positive or a negative drift. Interestingly, this dichotomy also produces substantially different technical difficulties. Recalling (4.43), we first rewrite it in an equivalent form. Before going further it is convenient to introduce C y := {x ∈ R + : (x, y) ∈ C} and S y := {x ∈ R + : (x, y) ∈ S} for any y ∈ R. The geometry of C in the coordinates (x, y) is explained in Propositions 4.10 and 4.12 below. Proof. First we show that (x, y) ∈ S =⇒ (x ′ , y) ∈ S for all x ′ ≥ x. Fix (x, y) ∈ S and x ′ > x; then we know from (4.56) that {x} × (−∞, y] ⊂ S. Due to (4.45), Y is non-increasing during excursions of X away from zero. This implies that the process ( X x ′ , Y x ′ ,y ) cannot reach x = 0 before hitting the half-line {x} × (−∞, y], thus implying P x ′ ,y (τ * < τ 0 ) = 1 for τ 0 := inf{t ≥ 0 : X t = 0}. Hence (4.49) gives u(x ′ , y) ≤ 0 for all x ′ ≥ x, as claimed. Now, for each y ∈ R we can define b(y) := inf{x ∈ [0, +∞) : (x, y) ∈ S}, so that S y = [b(y), +∞). Combining the latter with (4.56) gives that y → b(y) is non-decreasing.
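For illustration only: given a discretised indicator of a stopping set with the monotone structure S y = [b(y), +∞) established above, the boundary b can be read off row by row. This is a minimal numerical sketch; the grids and the synthetic stopping set in the test below are hypothetical choices, not part of the paper.

```python
import numpy as np

def boundary_from_stopping_set(S_grid, xs):
    """b(y) = inf{x : (x, y) in S} for each row of a boolean grid.

    S_grid[j, i] indicates whether (xs[i], ys[j]) lies in the stopping set;
    rows where the stopping set is empty get b(y) = +inf.
    """
    n_y = S_grid.shape[0]
    b = np.full(n_y, np.inf)
    for j in range(n_y):
        hits = np.flatnonzero(S_grid[j])   # indices of grid points in S
        if hits.size > 0:
            b[j] = xs[hits[0]]             # leftmost point of S_y
    return b
```

When the stopping set is monotone in the sense of Proposition 4.10, the resulting array b is automatically non-decreasing.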
Next we want to show that a result similar to Proposition 4.10 also holds for µ 0 + µ 1 < 0, under a mild additional condition. However, in this case we first need to compute an expression for the derivative U y .
Proof. The claim is trivial if (x, y) ∈ S \ ∂C, since P x,y (τ * = 0) = 1 therein. Take (x, y) ∈ C and let τ := τ * (x, y) be optimal for U (x, y). Then for ε > 0, using (4.25) and (4.26), we have the following. Recall (4.46), (4.36) and (4.22). Then letting t → ∞ and using dominated convergence gives the first expression; the same argument may be applied to obtain the second. We divide both expressions by ε and let ε → 0. Then, recalling that U ∈ C 1,2 in C (Lemma 4.8), noticing that ∂ y Y y t = 1 for all t ≥ 0 and that τ was chosen independently of ε, we obtain (4.57). Proof. First notice that if S y = [b(y), +∞) for all y ∈ R, then b is non-decreasing due to (4.56). It then remains to prove the existence of b.
In case (i) there exists a unique point b(y) ∈ [0, +∞] such that S y = [b(y), +∞). In case (ii) we argue in two steps: first we show that (ii) implies [x 0 , +∞) × {y} ⊂ C, and then we show that [x 0 , +∞) × {y} ⊂ C leads to a contradiction. Hence only (i) is possible, for all y ∈ R.
Combining the above Propositions 4.10 and 4.12 with (4.56) gives the next corollary.
We can say that χ is the (generalised) inverse of b in a sense that will be clarified later in Section 5.2.

Fine properties of the value function and of the boundary
In this section we continue our study of the optimal stopping problem by proving that its value function is C 1 and by illustrating properties of the optimal boundary in the different coordinate systems (i.e. (x, π), (x, ϕ) and (x, y)).
For future use, let us introduce the processes (X • , Y • ). Fix t 0 > 0 and define y 1 := y 0 + (1/2)(µ 0 + µ 1 )t 0 . Then by assumption it must be that P x,y 1 (τ * ≥ t 0 ) = 1 for all x ≥ 0. For τ 0 := inf{t ≥ 0 : X t = 0}, using the strong Markov property and (4.49) we obtain the bound (5.3), where we use that for t ≤ τ 0 we have ( X t , Y t ) = (X • t , Y • t ), P x,y 1 -a.s. From (4.31) we deduce that for some c y 1 > 0, depending only on y 1 , we have an estimate where in the last inequality we have also used µ 0 + µ 1 ≥ 0. Plugging the latter bound into (5.3) and using that τ * ≥ t 0 we get (5.4). Taking x → ∞, the first term on the right-hand side of (5.4) goes to zero whereas the second one diverges to +∞, because lim x→∞ P x,y 1 (τ 0 ≥ t 0 ) = 1 and x → g(x, y) is increasing. Hence we have a contradiction.
Step 2. (Left-continuity.) Using that b(·) is non-decreasing and that S is closed we obtain that for any y 0 ∈ R and any increasing sequence y n ↑ y 0 as n → ∞ it must be that lim n→∞ (b(y n ), y n ) = (b(y 0 −), y 0 ) ∈ S, where b(y 0 −) is the left limit of b at y 0 . Then b(y 0 −) ≥ b(y 0 ) by (5.1), and since b(y n ) ≤ b(y 0 ) for all n ≥ 1 then b must be left-continuous, hence lower semi-continuous.
For y > y 0 , from (5.6) we get (5.7). Defining F φ (y) := ∫_{x 1}^{x 2} u xy (x, y)φ(x)dx and using integration by parts, (5.7) may be rewritten as
Monotonicity of b is the key to the regularity of the value function in this context. In fact we will use it to show that the first hitting time to S coincides with the first hitting time to the interior of S. The latter, along with regularity (in the sense of diffusions) of ∂S, will be sufficient to prove that U ∈ C 1 ((0, +∞) × R), or equivalently U ∈ C 1 ((0, +∞) 2 ).
If µ 0 + µ 1 < 0 the process Y increases. Moreover, during excursions of X away from x = 0, the rate of increase is constant. Recalling (5.10), we can therefore use [13, Cor. 8] to conclude that (5.9) indeed holds (see also a self-contained proof, in a setting similar to ours, in Appendix B of [18]).
In case µ 0 + µ 1 ≥ 0, during excursions of X away from zero the process Y is non-increasing. So the couple ( X, Y ) moves towards the left of the (x, y)-plane during such excursions (or Y is just constant if µ 0 + µ 1 = 0). If ( X 0 , Y 0 ) = (x 0 , y 0 ) ∈ ∂C with x 0 > 0, recalling that b(·) is non-decreasing, the law of the iterated logarithm implies that P x 0 ,y 0 (σ * > 0) = 0. So we can claim: To treat the regularity of ∂C in the remaining case µ 0 + µ 1 < 0 we need to take a longer route, because ( X, Y ) is now moving towards the right of the (x, y)-plane and, in principle, when started from ∂C, it may 'escape' from the stopping set. We shall prove below that this is not the case. For that, we first need to show that the smooth fit holds at the boundary. Notice that this is the classical concept of smooth fit, i.e. continuity of z → U x (z, y). Smooth fit in this sense does not imply that (x, y) → U x (x, y) is continuous across the boundary, which instead we will prove in Proposition 5.10.
Proof. From (σ²/2) u xx (x, y) = ρ g(x, y) + ρ u(x, y) − µ 0 u x (x, y) + (1/2)(µ 0 + µ 1 ) u y (x, y) for (x, y) ∈ C, and using (4.35) (which clearly implies Lipschitz continuity of U as well), we see that for any bounded set B it must be that u xx is bounded on the closure of B ∩ C. (5.12) This fact will be used later to justify the use of the Itô-Tanaka formula in (5.13).
We establish the smooth fit with an argument by contradiction. The first step is to recall that u x ≤ 0 in C, as verified in the proof of Proposition 4.12. Second, notice that any (x 0 , y 0 ) ∈ ∂C must be of the form (b(y 0 ), y 0 ), due to the continuity of y → b(y) (Proposition 5.2). Next, assume that for some y 0 and x 0 = b(y 0 ) > 0 we have u x (x 0 −, y 0 ) < −δ 0 for some δ 0 > 0, where u x (x 0 −, y 0 ) exists due to (5.12). Take a bounded rectangular neighbourhood B of (x 0 , y 0 ) such that B ∩ ({0} × R) = ∅ and let τ B := inf{t ≥ 0 : ( X t , Y t ) ∉ B}. Then from the super-martingale property (4.25) of U , using that A τ B ∧t = 0 for all t ≥ 0 and recalling (5.2), we obtain an inequality. Now we notice that t → Y • τ B ∧t is increasing. Moreover, recalling (4.50), we have u y ≥ 0 in C. This implies u(X • τ B ∧t , Y • τ B ∧t ) ≥ u(X • τ B ∧t , y 0 ), P x 0 ,y 0 -a.s. Finally, observing that g is bounded on B, we obtain (5.13) for some c = c(B) > 0 that depends on the set B and will vary from line to line below.
As anticipated, we can now use the Itô-Tanaka formula in (5.13) thanks to (5.12). We let L X := (σ²/2)∂ xx + µ 0 ∂ x , denote by L x 0 the local time of X • at x 0 , and notice also that u xx ( · , y 0 ) = 0 for x > x 0 . Then (5.14) follows, where in the final inequality we used that (L X − ρ) u is bounded on B. Letting t → 0, the inequality in (5.14) leads to a contradiction because E x 0 ,y 0 [L x 0 τ B ∧t ] ≈ √t whereas E x 0 ,y 0 [τ B ∧ t] ≈ t (the argument is similar to the one used to prove Proposition 4.9; see also, e.g., [18, Lem. 6.5] or [41, Lem. 13]).
Hence the claim is proved.
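The √t-versus-t comparison invoked in the proof rests on a standard estimate for Brownian local time: for a standard Brownian motion B started at x 0 with local time L^{x 0} at x 0 , Tanaka's formula gives the identity below (the drift µ 0 of X • only contributes terms of order t, which do not affect the rate as t → 0):

```latex
\mathbb{E}\big[L^{x_0}_{t}\big] \;=\; \mathbb{E}\,\big|B_t - x_0\big|
\;=\; \sqrt{\frac{2t}{\pi}} \;\asymp\; \sqrt{t}\,,
\qquad \text{while} \qquad \mathbb{E}\big[\tau_B\wedge t\big]\;\le\; t ,
```

so for small t the local-time term dominates the dt-terms in (5.14), which forces the contradiction.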
Next we establish regularity of ∂C in the sense of diffusions, when µ 0 + µ 1 < 0.
As a corollary to Lemma 5.3 and Propositions 5.4 and 5.6, we have Corollary 5.7. Under Assumption 5.1, for all (x, y) ∈ ([0, +∞) × R) \ {(0, y * 0 )} we have P x,y (τ * = σ * = σ • * ) = 1. This corollary is important to determine continuity of the stopping times with respect to the initial position of the process, at all points of the state space.
Thanks to continuity of the optimal stopping times and to the probabilistic representations of U x and U y we can state our next result (see also [19] for general results in this direction). Proof. Trivially U ∈ C 1 in S • and moreover U ∈ C 1 in C \ ({0} × R), due to Lemma 4.8. It only remains to prove that ∇ x,y U is continuous across the boundary ∂C. Let us consider the case of U x , as the proof for U y follows the same arguments.
Let t > 0 be given and notice that on {τ n > t} one has ( X t , Y t ) ∈ C, P xn,yn -a.s., so that U x ( X xn t , Y xn,yn t ) may be represented by using (5.29). Hence, the tower property of conditional expectation and the Markov property allow us to rewrite (5.29) as (5.31). Now we want to take limits as n → ∞ and use that τ n → 0 in (5.31) to show that U x (x n , y n ) → g x (x 0 , y 0 ). For that, first notice that x → 1 {x≤0} and x → 1 {x≥0} are continuous on (−∞, 0) and in particular at −x 0 . Using also that lim n→∞ 1 {τ n >t} = 0, the desired limit follows. By arbitrariness of (x 0 , y 0 ) and of the sequence (x n , y n ), we conclude that U x is continuous across ∂C \ {(0, y * 0 )}. Similar arguments, applied to (4.57), allow us to show that U y is continuous across ∂C \ {(0, y * 0 )} as well.
Our proposition above has a simple corollary. Recall that C̄ denotes the closure of C.
This implies that u and U belong to C ∞ in C \ ({0} × R) as well.
Proof. We will prove (5.35). Fix y ∈ R with (0, y) ∈ C and take a sequence x n ↓ 0 as n → ∞. Notice that X xn is decreasing in n whereas Y xn,y is increasing in n thanks to (5.27) and (5.28). Then, by Proposition 5.8 and the geometry of S we have τ * (x n , y) ↑ τ * (0, y), P-a.s. For simplicity we denote τ n = τ * (x n , y) and τ ∞ = τ * (0, y).
Similarly, for U y we get Combining the two expressions above we find where the last equality uses (4.42).
With the aim of eventually going back to our original problem (4.8) in the (x, π) coordinates, we now need to consider the inverse of b(·). In particular, recalling the non-decreasing map x → χ(x) from (4.55), and noticing that we conclude that χ is the right-continuous inverse of b, i.e.
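In a discrete setting the relation between b and χ can be made concrete. The sketch below computes the right-continuous generalised inverse χ(x) = inf{y : b(y) > x} of a non-decreasing function b on a grid; the grid and the convention for an empty set are hypothetical choices for illustration, not taken from the paper.

```python
def right_continuous_inverse(b, ys, x):
    """chi(x) = inf{y in ys : b(y) > x} for a non-decreasing function b.

    ys is an increasing grid; if b never exceeds x on the grid we return
    None (in the paper chi would then take a boundary value).
    """
    for y in ys:
        if b(y) > x:
            return y
    return None
```

For a non-decreasing b this produces a non-decreasing, right-continuous map of x, which is the sense in which χ inverts b in Section 5.2.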
Before closing this section we determine the limiting behaviour of the boundary d(π) as π → 0 and π → 1. Let us recall the measure P θ introduced in the proof of Proposition 4.2 and the associated Brownian motion W θ . Moreover, let us also consider the stopping problem (5.40), which corresponds to problem (4.8) with π = 1 (notice that indeed X has drift µ 1 under P θ ). It was shown in [16, Sec. 8.3] that (5.40) is the optimal stopping problem associated to the dividend problem with full information and drift of X D equal to µ 1 . It then follows from [16] that there is an optimal stopping boundary a * > 0 that fully characterises the solution of (5.40), and the stopping set is [a * , +∞) (an expression for a * can be found in [44, Thm. 2.53, Ch. 2] with the notation m = µ 1 and δ = ρ). We now notice that, using the Girsanov theorem and (4.30), from (4.19) we obtain the following. Then letting π → 1 (or equivalently ϕ → ∞) we obtain from the last expression above the corresponding limit. We also need to state two simple facts, which can be obtained from (4.19) and straightforward calculations; they hold for all (x, π) ∈ O. Thanks to (4.31) and (4.35), the above and (4.19) imply that there is a constant c > 0 such that the estimate holds. We can now state our next result, where a * is the optimal boundary for (5.40). Proof.
Since v π (x, · ) is a continuous function for all x > 0, as usual we say that its weak derivative with respect to π is a function f ∈ L 1 loc (O) such that for any φ ∈ C ∞ c (0, 1) it holds that ∫_0^1 v π (x, z)φ ′ (z)dz = − ∫_0^1 f (x, z)φ(z)dz.
Since N has zero Lebesgue measure in O, we conclude that (6.2) holds.
In the remainder of the paper we will always consider the representative of v ππ given by the expression in (6.2). From (6.2) and U ∈ C 1 (O) we derive the next result. Proof. It is sufficient to notice that for any (x, π) ∈ C ∩ O we have x ≤ d + (π). Hence v ππ (x, π) = 2 [ ρ ∫_0^x U (ζ, π)dζ − (σ²/2) U x (x, π) − μπ(1 − π)U π (x, π) − (µ 0 + μπ)U (x, π) ] [θπ(1 − π)]^{-2} for all (x, π) ∈ C ∩ O. Continuity of v ππ now follows from U ∈ C 1 (O). Now that we have a candidate solution for the variational problem in Theorem 3.1, we would like also to construct a candidate optimal control. Recalling I v as in (3.3) and noticing that v x = U , we immediately see that I v = C. Then, given (x, π) ∈ O, we define P x,π -a.s. the process D t := sup 0≤s≤t [X s − d(π s )] + , (6.5) where we recall that X is the uncontrolled dynamics X t = x + ∫_0^t (µ 0 + μπ s )ds + σW t , P x,π -a.s. We also recall the notation γ D := inf{t ≥ 0 : X D t ≤ 0}. Some of the arguments in the proof of the next lemma are borrowed from [17, Sec. 5].
Lemma 6.4. Let Assumption 5.1 hold. The process D in (6.5) belongs to A (i.e. it is admissible). The triple (X D t , D t , π t ) t≥0 solves the Skorokhod reflection problem in C, that is, P x,π -a.s. for all 0 ≤ t ≤ γ D we have (X D t , π t ) ∈ C̄ (cf. (6.6)) and condition (6.7) holds. Proof. It is immediate to see that D is increasing and adapted to (F t ) t≥0 . Since it is increasing, it also admits left limits at all points. In order to prove right-continuity of its paths, we observe that d(·) is non-decreasing and left-continuous, hence lower semi-continuous. It then follows that the mapping t → X t − d(π t ) is P x,π -a.s. upper semi-continuous. Now, obviously lim ε→0 D t+ε ≥ D t , and the reverse inequality follows from the upper semi-continuity of t → X t − d(π t ). Hence D ∈ A.
Let us turn to the study of the Skorokhod reflection problem. Notice that, since π is unaffected by D, we have d(π t ) − X D t = d(π t ) − X t + D t ≥ 0 for all t ≥ 0, P x,π -a.s., where the final inequality follows from (6.5). Recalling that x < d(π) ⇐⇒ (x, π) ∈ C, we deduce that (6.6) holds. It remains to prove (6.7).
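To make the construction (6.5) concrete, here is a minimal Euler-type sketch: it simulates the uncontrolled X, builds D as the running supremum of (X s − d(π s )) + , and returns the controlled path X^D = X − D, which then stays below the boundary. All parameters and the boundary d(·) are hypothetical placeholders, π is frozen for simplicity, and absorption at zero is ignored.

```python
import numpy as np

def dividend_strategy(x0, mu, sigma, d_of_pi, pi_path, dW, dt):
    """Simulate X, the dividend process D of (6.5), and X^D = X - D."""
    n = len(dW)
    X = np.empty(n + 1)
    X[0] = x0
    for i in range(n):                      # Euler scheme for the uncontrolled X
        X[i + 1] = X[i] + mu * dt + sigma * dW[i]
    gap = X - d_of_pi(pi_path)              # X_s - d(pi_s)
    D = np.maximum.accumulate(np.clip(gap, 0.0, None))   # running sup of (.)^+
    return X, D, X - D
```

By construction D is non-decreasing and X^D_t ≤ d(π t ) at every step, mirroring (6.6); checking where D actually increases illustrates the flat-off condition (6.7).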
Since D is arbitrary, this inequality also implies v ≥ V .
In order to show that v ≤ V , it is enough to observe that for D = D * all the inequalities above become equalities. In particular, when taking limits in (A.5), we now use that v ∈ C 2 (I v ∩ O) implies lim k→∞ sup (x,π)∈I v ∩K n (L X,π − ρ)(v k − v)(x, π) = 0.
Also, we use that (X D *