Optimal dividend distribution under Markov-regime switching

We investigate the problem of optimal dividend distribution for a company in the presence of regime shifts. We consider a company whose cumulative net revenues evolve as a Brownian motion with positive drift that is modulated by a finite state Markov chain, and model the discount rate as a deterministic function of the current state of the chain. In this setting the objective of the company is to maximize the expected cumulative discounted dividend payments until the moment of bankruptcy, which is taken to be the first time that the cash reserves (the cumulative net revenues minus cumulative dividend payments) are zero. We show that, if the drift is positive in each state, it is optimal to adopt a barrier strategy at certain positive regime-dependent levels, and provide an explicit characterization of the value function as the fixed point of a contraction. In the case that the drift is small and negative in one state, the optimal strategy takes a different form, which we explicitly identify if there are two regimes. We also provide a numerical illustration of the sensitivities of the optimal barriers and the influence of regime-switching.


Introduction
A classical topic in finance and actuarial science is that of optimal dividend distribution for a company, which can be phrased as the problem of determining the optimal timing and sizes of dividend payments in the presence of bankruptcy risk, where the usual objective is to maximize the expected value of the cumulative discounted dividend payments until bankruptcy. The earliest work in this setting can be traced back to De Finetti [9], who studied the dividend problem for an insurance company under the binomial model. In continuous time the problem was posed and solved in a Brownian motion model for the cash reserves by Jeanblanc-Piqué and Shiryaev [21], and Asmussen and Taksar [2], using optimal control theory. Since then an extensive literature has appeared on the dividend problem and its extensions, including reinsurance (e.g. [27]), optimal investment of the reserves (e.g. [19]), tax and proportional cost (e.g. [7,23]), and growth options ([8]).
In general, the form of the optimal dividend policy has been found to depend on the expected growth rate and variability of future revenues, and the discount rate. These quantities will evolve in time reflecting changing market and economic conditions, and those changes may happen gradually or occur abruptly and be more substantial. Here we will focus on the changes of the latter type (also called regime shifts or switches) and model the cumulative net revenues of the company as a Brownian motion with the drift and volatility modulated by a finite state Markov chain, and the discount rate as a deterministic function of the chain. Since Hamilton [17,18], a substantial econometric literature has appeared that supports the use of Markov regime-switching models to describe business cycles, term structure of interest rates and other macroeconomic quantities. Such models have been shown to be capable of capturing occasional simultaneous and substantial changes of the parameters. Regime-switching models also have the advantage of retaining a degree of analytical tractability, and models from this class can in principle approximate a given diffusion arbitrarily closely by taking the state space large enough and specifying the generator matrix appropriately. In the mathematical finance literature regime-switching models have become more popular, and have found applications in stock price models, interest rate models and the real option literature. See e.g. Boyarchenko and Levendorskiǐ [5], Buffington and Elliott [6], Driffill et al. [10], Duan et al. [11], Elliott et al. [12], Guo and Zhang [15], Jiang and Pistorius [22], Naik [25] for derivative pricing, Elliott and van der Hoek [13] and Guidolin and Timmermann [14] for asset allocation, Bäuerle [3], Li and Lu [24], Zhu and Yang [29] and Asmussen [1] for ruin and risk theory, and Guo et al. [16] for irreversible investment.
In this regime-switching setting we will consider the problem of the management of the company to find a dividend distribution policy that maximizes expected discounted dividend payments until bankruptcy, which is defined to occur at the first moment when the level of the cash reserves hits zero. We will restrict ourselves to the case that the management can only control timing and size of the dividend payments. In the case that the drift is positive in every regime, we will show that it is optimal to adopt a barrier-type strategy at certain positive levels that depend on the current regime, that is, it is optimal to make the minimal payments needed to keep the cash reserves below these barrier levels. When a regime-switch occurs, dividend payments are to be postponed or brought forward in time, according to whether the barrier jumps up or down, and in the latter case a lump sum should be paid if the reserves were above the new barrier at the moment of the switch. In the case of a single regime this strategy reduces to the classical constant barrier strategy that was found before by Asmussen and Taksar [2].
After an adverse economic regime-switch it could happen that the expected net revenue of the company becomes negative, in which case the optimal strategy takes a different form. Intuitively, it is clear that, if the drift is negative and the reserves are sufficiently small, it will be optimal to liquidate the company by paying out the reserves as a lump-sum. In the absence of regime-switching, this optimality actually holds irrespective of the size of the reserves. In the presence of regime-switching, however, we find that it is optimal to continue the business if the drift is small and negative and the reserves are not too small: the prospect of switching to a better regime with suitable positive drift outweighs the risk of ruin. In this case the value function is not concave, which differs from what is usually found in singular control problems. An explicit solution is derived in Section 5 in the case of two regimes.
The dividend optimization problem gives rise to a singular control problem, whose HJB equation takes the form of a coupled system of variational inequalities, due to the fact that the problem is driven by a two-dimensional Markov process. A commonly used direct approach for explicitly solving optimal control problems proceeds by guessing a candidate optimal solution, constructing a corresponding value function, assuming smoothness if necessary, and subsequently verifying its optimality by employing a verification result. Here we shall follow a different approach to construct the candidate value function, by directly employing a dynamic programming equation. We will prove that the value function is the fixed point of a certain contraction operator, which is given explicitly in terms of the initial data, and derive an explicit iterative algorithm to calculate the value function, which 'decouples' the different regimes such that at any stage one-dimensional control problems are solved. This construction yields in particular that the value function is $C^2$, which implies that the value function is a classical solution of the HJB equation. At this point it is worth mentioning that, although it is possible to follow the direct approach, this seems to become intractable if the number of states is large, as it leads to a large collection of systems of coupled non-linear equations (corresponding to different orderings of the dividend levels).
After the first version of this paper was written, we discovered a related work on optimal dividend problems by Sotomayor and Cadenillas [28]. In a setting that is a particular case of ours, with two regimes and constant rate of discounting, they solve three dividend distribution problems with bounded and unbounded dividend rates, and in the presence of fixed cost, respectively, under the assumption of existence of a solution to the smooth fit equation.
The remainder of the paper is organized as follows. In Section 2 we give a statement of the problem, and present a dynamic programming equation and related theorem. In Sections 3 and 4 we present the optimal solution and give a proof by constructing an iterative algorithm to calculate the value function V . Section 5 is devoted to a case study of the setting of two regimes, with a numerical illustration of the sensitivities of the optimal barrier levels to the different parameters. Section 6 concludes. Some proofs are presented in the Appendix.

Problem formulation
Let $\{W_t : t \ge 0\}$ be a Wiener process and let $\{Z_t : t \ge 0\}$ be a continuous-time Markov chain with finite state space $E$ and generator matrix $Q = (q_{ij})_{i,j \in E}$, independent of $W$. Assume that the cash reserves $X = \{X_t, t \ge 0\}$ evolve, in the absence of dividend payments, as a regime-switching linear Brownian motion, that is, $X$ satisfies the SDE $dX_t = \mu(Z_t)\,dt + \sigma(Z_t)\,dW_t$, where $Z$ represents the state of the economy. For every state $i$ in $E$, both the drift parameter $\mu(i)$ and the volatility parameter $\sigma(i) > 0$ are assumed to be known constants. Where no notational confusion is possible, we will write $\mu_i$ and $\sigma_i$ for $\mu(i)$ and $\sigma(i)$ respectively. The processes $X$ and $Z$ are defined on some filtered probability space $(\Omega, \mathcal{F}, \mathbf{F}, \mathbb{P})$ where $\mathbf{F} = \{\mathcal{F}_t, t \ge 0\}$ denotes the right-continuous completed filtration jointly generated by $X$ and $Z$. We denote by $\mathbb{P}_{x,i}$ and $\mathbb{P}_x$ the measure $\mathbb{P}$ conditioned on $\{X_0 = x, Z_0 = i\}$ and $\{X_0 = x\}$, respectively, and write $\mathbb{E}_{x,i}$ and $\mathbb{E}_x$ for the corresponding expectations. We assume that the processes $X$ and $Z$ are both fully observable to the shareholders, and that these decide on the dividend strategies on the basis of the available information. A dividend strategy $D$ is a non-decreasing and right-continuous stochastic process $D = \{D_t : t \ge 0\}$ with $D_{0-} = 0$. Here $D_t$ represents the cumulative amount of dividends that has been paid out until time $t$. We will assume that, apart from reducing the reserves, dividend payments have no effect on the business and that there are no transaction costs associated with the payment or receipt of dividends. The dynamics of the risk reserve process $U = \{U_t : t \ge 0\}$ in the presence of dividend payments are then given by $dU_t = dX_t - dD_t$ for all $t$ until the time $\tau$ of bankruptcy and $dU_t = 0$ for $t$ after $\tau$, where $\tau = \inf\{t \ge 0 : U_t = 0\}$ is the first time that $U$ hits zero.
To avoid degeneracies only those dividend strategies will be considered that have no lump sum dividend payments larger than the current level of the reserves: $D_t - D_{t-} \le U_{t-}$ for all $t \ge 0$. Denoting by $\mathcal{D}$ the set of admissible dividend strategies, the objective function of the shareholders is given by $V(x,i) = \sup_{D \in \mathcal{D}} V_D(x,i)$ (2.3), where $V_D(x,i) = \mathbb{E}_{x,i}\bigl[\int_0^{\tau} e^{-\int_0^t r(Z_s)\,ds}\,dD_t\bigr]$ denotes the expected value of the discounted dividends until the time of ruin $\tau$ under the dividend strategy $D$, with $r : E \to (0, \infty)$ the Markov-modulated rate of discounting. The problem for the shareholders is to identify a dividend strategy $D^* \in \mathcal{D}$ that attains the supremum in (2.3), that is, $V \equiv V_{D^*}$.

A priori bounds
Assume for the moment that there is only a single regime, E = {i}. Then we are back in the classical linear Brownian motion setting that was investigated in Asmussen and Taksar [2]. They showed that, if μ i > 0, the optimal strategy is a constant barrier strategy at the level According to this strategy, the overflow of the reserves above the level a * i is immediately paid out as dividends. The corresponding value function is given by The eqs. (2.4)-(2.7) show that the value function and optimal level are both functions of the drift and of the rate of discounting per unit of squared volatility. This observation leads one to expect that V (x, i) is bounded above and below by the values V + (x) and V − (x) of firms operating in a more or less favourable environment, with volatility constant equal to one drift and discounting equal to ( μ + ), respectively. The following result confirms that these explicit bounds indeed hold true: The bounds in (2.8) will be employed in the construction of the optimal value function in Section 4.
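The single-regime solution referred to above can be illustrated numerically. The following is a minimal sketch, assuming the standard form of the Asmussen-Taksar solution: with $\lambda^- < 0 < \lambda^+$ the roots of $\tfrac{1}{2}\sigma^2\lambda^2 + \mu\lambda - r = 0$ and $W(x) = e^{\lambda^+ x} - e^{\lambda^- x}$, the barrier is $a^* = \log\bigl((\lambda^-/\lambda^+)^2\bigr)/(\lambda^+ - \lambda^-)$ and the value is $W(x)/W'(a^*)$ below the barrier and linear with unit slope above it. The function names and the normalisation of $W$ are our own choices and need not match those used in (2.4)-(2.7).

```python
import math
from typing import Tuple


def single_regime_barrier(mu: float, sigma: float, r: float) -> Tuple[float, float, float]:
    """Return (lam_minus, lam_plus, a_star) for one regime with drift mu > 0.

    lam_minus < 0 < lam_plus are the roots of 0.5*sigma^2*l^2 + mu*l - r = 0
    (assumed form of the characteristic equation).
    """
    if mu <= 0 or sigma <= 0 or r <= 0:
        raise ValueError("this sketch assumes mu > 0, sigma > 0 and r > 0")
    disc = math.sqrt(mu * mu + 2.0 * r * sigma * sigma)
    lam_minus = (-mu - disc) / sigma ** 2
    lam_plus = (-mu + disc) / sigma ** 2
    # a_star is the point where W''(a) = 0, i.e. where W' attains its minimum.
    a_star = math.log((lam_minus / lam_plus) ** 2) / (lam_plus - lam_minus)
    return lam_minus, lam_plus, a_star


def value_single_regime(x: float, mu: float, sigma: float, r: float) -> float:
    """Value of the constant barrier strategy at a_star, started from reserves x >= 0."""
    lam_minus, lam_plus, a = single_regime_barrier(mu, sigma, r)

    def w(y: float) -> float:
        return math.exp(lam_plus * y) - math.exp(lam_minus * y)

    def dw(y: float) -> float:
        return lam_plus * math.exp(lam_plus * y) - lam_minus * math.exp(lam_minus * y)

    if x <= a:
        return w(x) / dw(a)
    # Above the barrier the overflow x - a is paid out immediately as a lump sum.
    return (x - a) + w(a) / dw(a)
```

A useful consistency check is that the value at the barrier equals $\mu/r$: since $W''(a^*) = 0$, the equation $\tfrac{1}{2}\sigma^2 W'' + \mu W' - rW = 0$ gives $W(a^*)/W'(a^*) = \mu/r$.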

Dynamic programming equation and comparison result
The following dynamic programming equation for the value function of the singular control problem (2.3) will form the basis for its solution: where $\zeta$ denotes the epoch of the first regime-switch and $\Lambda_t = \int_0^t r(Z_s)\,ds$. The proof of Proposition 2.2 is given in the Appendix. This dynamic programming equation is associated with the following Hamilton-Jacobi-Bellman equation for the value function: where $'$ denotes the partial derivative with respect to $x$ and $\mathcal{G}$ denotes the infinitesimal generator of $(X, Z)$ which acts on functions w : It holds that any sufficiently regular super-solution of the HJB equation (2.10) dominates the value function:
Theorem 2.1 Assume that there exists a function $w = (w(\cdot, i), i \in E)$, with $w(\cdot, i)$, $i \in E$, $C^1$ functions on $(0, \infty)$ that are piecewise $C^2$ and satisfy for x > 0 (ii) If, in addition, $w = V_D$ for some $D \in \mathcal{D}$, then $D$ is an optimal strategy and $V \equiv w$.
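For orientation, a variational inequality of the type described above can be written out as follows. This is a sketch under the assumption that $\mathcal{G}$ is the usual generator of a Markov-modulated linear Brownian motion and that the HJB equation has the standard singular-control form; the precise statement and normalisation of (2.10) may differ.

```latex
% Sketch (assumed standard form): generator of (X, Z) and HJB variational inequality.
\begin{align*}
\mathcal{G} w(x,i) &= \tfrac{1}{2}\sigma_i^2\, w''(x,i) + \mu_i\, w'(x,i)
   + \sum_{j \in E} q_{ij}\,\bigl(w(x,j) - w(x,i)\bigr), \\
0 &= \max\bigl\{\mathcal{G} w(x,i) - r_i\, w(x,i),\; 1 - w'(x,i)\bigr\},
   \qquad x > 0,\ i \in E, \\
w(0,i) &= 0, \qquad i \in E .
\end{align*}
```

In this form the first term in the maximum vanishes where it is optimal to retain the reserves, and the second vanishes where it is optimal to pay dividends.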
Proof of Theorem 2.1. (i) Fix an arbitrary $D \in \mathcal{D}$ and let $U$ be the corresponding risk process. The statement will follow once we have shown that w( Here the last integration is over the set $[0, t] \times E$ and $\tilde{\pi} = \pi - \nu$ is a compensated random measure, where $\pi(dt, dj) = \sum_{s \ge 0} 1_{\{\Delta Z_s(\omega) \ne 0\}}\, \delta_{(s, Z_s(\omega))}(dt, dj)$, with $\delta_{(s,z)}$ denoting the Dirac measure at the point $(s, z)$, and the compensator $\nu$ is given by Notice from (2.12) that, as $M$ is bounded below and $M_0 = 0$, $M$ is a super-martingale with $\mathbb{E}[M_{T \wedge \tau}] \le 0$. In view of the HJB equation (2.10), the right-hand side of (2.12) is non-positive, so that taking expectations yields that By letting $T \to \infty$ and invoking the monotone convergence theorem and the fact that $w$ is non-negative, we obtain that w(

The optimal dividend strategy
Following the classical approach to solving optimal control problems we next construct a candidate optimal solution. In view of the fact that $(U, Z)$ is a Markov process we consider strategies that pay out the overflow of the cash reserves above a regime-dependent level: According to this strategy, dividends are only paid out when $U^b$ is at the barrier $b$, which implies that the process $D^b$ is a local time. It is straightforward to verify that $D^b$ can be explicitly expressed in terms of a running supremum as follows:

Figure: The barrier levels are represented by horizontal lines. In this case the barrier jumps down at the moment of the regime-switch and a lump sum payment is made.
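The mechanics of a modulated barrier strategy can be made concrete with a small simulation. The sketch below is an Euler discretisation, not the construction used in the paper: states are labelled 0, ..., n-1, the function name `simulate_barrier_strategy` and all parameter values in the usage note are hypothetical, and the barrier levels `b` are arbitrary inputs rather than the optimal levels. At each step the overflow above the current barrier is paid out, which also produces the lump-sum payment after a downward jump of the barrier.

```python
import math
import random
from typing import List, Optional, Tuple


def simulate_barrier_strategy(
    x0: float,
    z0: int,
    mu: List[float],
    sigma: List[float],
    r: List[float],
    Q: List[List[float]],
    b: List[float],
    horizon: float = 10.0,
    dt: float = 1e-3,
    rng: Optional[random.Random] = None,
) -> Tuple[float, List[Tuple[float, int, float]]]:
    """Simulate one path; return (discounted dividends, [(reserves, regime, cumulative dividends)])."""
    rng = rng or random.Random(0)
    n = len(mu)
    u, z = x0, z0
    discount, total, cum = 1.0, 0.0, 0.0
    path: List[Tuple[float, int, float]] = []
    for _ in range(int(horizon / dt)):
        # Pay out the overflow above the barrier of the current regime.
        overflow = max(u - b[z], 0.0)
        u -= overflow
        cum += overflow
        total += discount * overflow
        path.append((u, z, cum))
        # Regime switch with probability approximately -q_zz * dt.
        rate = -Q[z][z]
        if rng.random() < rate * dt:
            weights = [Q[z][j] if j != z else 0.0 for j in range(n)]
            z = rng.choices(range(n), weights=weights)[0]
        # Euler step for the reserves and update of the discount factor.
        discount *= math.exp(-r[z] * dt)
        u += mu[z] * dt + sigma[z] * math.sqrt(dt) * rng.gauss(0.0, 1.0)
        if u <= 0.0:
            break  # ruin: no further dividends are paid
    return total, path
```

Averaging the first return value over many independent paths (with different seeds) gives a crude Monte Carlo estimate of the value of the barrier strategy at `b`, subject to discretisation and truncation error.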
Employing the heuristic 'principle of $C^2$ fit' of singular control allows us to define candidate optimal levels as the solution of the system of equations if such a solution exists. In fact, (3.2) follows from Lemma 4.1 and Proposition 4.1, as will become clear in Section 4. If the drift is positive in all regimes, this candidate solution is indeed optimal: i < ∞ and the following holds true: (i) The optimal value function $V$ is a classical solution of the HJB equation (2.10). In particular, $V$ is equal to the unique solution (ii) The modulated barrier strategy at $b^*$ is an optimal policy in (2.3).
If the drift condition is not satisfied, it is not necessarily optimal to adopt a modulated barrier strategy. Indeed, in Section 5 we show that in the case of two regimes with a small and negative drift in one state and a positive in the other, the optimal dividend barrier depends on the regime as well as on the level of the reserves. In the following section we will give a proof of Theorem 3.1 by presenting an iterative construction of the optimal value function.
Algorithm to compute the value function V
Throughout this section we will assume that $\mu_i > 0$ for all $i \in E$. We start by observing that the value function $V_b$ of a modulated barrier strategy at level $b = (b_i, i \in E)$ solves the following fixed point equation in terms of the function $W^{(q)}_i$: where, for any f : The previous result can be utilized to calculate the value function $V_b$ of the barrier strategy at $b$ by iterating the map
Corollary 4.1 The map $T_b$ is a contraction on $B$ with respect to the norm $\|\cdot\|$. In particular, where the convergence is in be the ruin times of $U^b$ and $U^i$, and denote by $\zeta$ the epoch of the first regime-switch and by $\eta(a)$ an independent exponential random time with mean $1/a$. Then it holds that the ensemble ). Thus, the value $z_1(x, i)$ of the discounted dividends received before $\zeta$ is given by where $\theta_i = r_i - q_{ii}$ and in the last line we used (2.5). Similarly, the value $z_2(x, i)$ of the discounted dividends received after $\zeta$ satisfies, in view of the Markov property, Employing the identity (see e.g. [26, Thm. 1]) , we find the result as stated.
Proof of Corollary 4.1 Note that B endowed with the norm • is a complete metric space and that T maps B to itself, by definition of T and the fact that W ]. Thus, it follows that T is a contraction on B, which implies the convergence in (4.3).
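The way in which conditioning on the first regime-switch produces a contraction can be seen in a simpler toy problem, which is not the operator $T_b$ itself. Consider $v(i) = \mathbb{E}_i\bigl[\int_0^\infty e^{-\Lambda_t} c(Z_t)\,dt\bigr]$ for a hypothetical running reward $c$. Splitting at the first switch gives $v(i) = \bigl(c_i + \sum_{j \ne i} q_{ij} v(j)\bigr)/\theta_i$ with $\theta_i = r_i - q_{ii}$, and the right-hand side is a contraction in the supremum norm with modulus $\max_i (-q_{ii})/\theta_i < 1$. The sketch below iterates this map; the function name and parameters are our own.

```python
from typing import List


def discounted_reward_fixed_point(
    c: List[float],
    r: List[float],
    Q: List[List[float]],
    tol: float = 1e-12,
    max_iter: int = 10_000,
) -> List[float]:
    """Iterate v_i <- (c_i + sum_{j != i} q_ij v_j) / theta_i until successive iterates agree."""
    n = len(c)
    theta = [r[i] - Q[i][i] for i in range(n)]  # theta_i = r_i - q_ii > 0
    v = [0.0] * n
    for _ in range(max_iter):
        new = [
            (c[i] + sum(Q[i][j] * v[j] for j in range(n) if j != i)) / theta[i]
            for i in range(n)
        ]
        err = max(abs(a - b) for a, b in zip(new, v))
        v = new
        if err < tol:
            break
    return v
```

With $c = r$ the integrand is $-\frac{d}{dt} e^{-\Lambda_t}$, so the exact answer is $v \equiv 1$, which provides a simple check of the iteration.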

Iteration
As next step we consider the auxiliary control problem with a prescribed pay-off function v to be received at the epoch of the first regime-switch ζ: This singular control problem can be solved explicitly if v lies in the set of smooth concave pay-off functions C = {v ∈ B : v i is increasing and concave, i ∈ E}: for i ∈ E and the optimal strategy in (4.5) is given by a regime-switching barrier strategy at the levels with A v given in (4.2).
Supposing that the map U : v → U v preserves concavity and smoothness, this Proposition can be applied iteratively, as follows: Initialise by setting n = 0 and v = v 0 for some v 0 ∈ B and then (2) Set v ← T b v (v), n ← n + 1, and v n ← v, and return to step (1).
The following result shows that the sequence (v n ) generated in this way converges to the value function V as n → ∞: where the convergence is with respect to the norm • . In particular, V is concave.
In fact, we shall show below that U is a contraction on C. Notice that Theorem 3.1(i) is now a direct consequence of these results. Indeed, by combining Proposition 4.2 and the dynamic programming equation (2.9) we see that the optimal strategy in (2.3) is given by a modulated barrier strategy at some positive finite levels. Explicit examples of initial functions v ± 0 are the V ± given in Proposition 2.1.

Proofs
This subsection is devoted to the proofs of Propositions 4.2 and 4.3, which we split into a number of steps. The first step is to verify that the $b^v_i$ as defined above are positive and finite, which is a matter of straightforward calculations using the explicit expression (2.6): The proof of Lemma 4.1 is given in the Appendix. The key step is to verify next that the value function of a barrier strategy at level $b^v$ with a concave payoff function $v(\cdot, i)$ is itself concave:
Proof of Lemma 4.2 We first assume that $v \in C \cap C^2[0, \infty)$, and write $b$ instead of $b^v$ to simplify the notation. In view of the smoothness of $v$ and the definition of $w_i(x) := (T_{b^v} v)(x, i)$, we can obtain from (2.6) and (4.1) that for $x \in (0, b_i)$, From these expressions, equation (2.6) and the fact that $v \in C^2[0, \infty)$, we have that $w_i|_{(0,b_i)} \in C^4(0, b_i)$. In addition, we have $w_i'(b_i) = 1$ from the above expressions and equation (4.1), and $w_i''(b_i) = 0$ by Lemma 4.1. As a result, $w_i$ is $C^2[0, \infty)$. An application of Itô's lemma shows that $w_i$ satisfies the ode with boundary conditions $w_i(0) = 0$, $w_i'(b_i) = 1$. Since $w_i(x) \ge 0$ for $x > 0$ and $w_i(0) = 0$, we deduce that $w_i'(0+) \ge 0$. Furthermore, the continuity of $w_i$ and the fact that $w_i(0) = 0$ and $v_i(0) = 0$ imply that $\sigma_i^2 w_i''(0+) + 2\mu_i w_i'(0+) = 0$, so that $w_i''(0+) < 0$, as $\mu_i > 0$ by assumption.
Write now $\xi_i(x) = w_i''(x)$ for $x > 0$, and denote $\xi_i(0) = w_i''(0+)$. By twice differentiating the first equation of the original system (3.3), which is justified since $w_i \in C^4(0, b_i)$ as a consequence of the assumptions, we find that $\xi_i(x)$ satisfies the ode Another application of Itô's lemma then shows that the following representation holds true for $\xi$: where $\theta_i = r_i - q_{ii}$ and $T_i = \inf\{t \ge 0 : X^i_t \notin (0, b_i)\}$. Thus, since $\xi_i(X_{T_i}) \le 0$ and $v_j''(x) \le 0$, it follows that $\xi_i(x)$ is non-positive for all $x \in (0, b_i)$ and $i \in E$. In particular, we deduce that $x \mapsto w_i(x)$ is concave and increasing on $[0, \infty)$.
Suppose now that v ∈ C and let v n ∈ C ∩ C 2 [0, ∞) be a sequence that pointwise increases to v.
, and the concavity of T b v (v) directly follows from the fact that the pointwise limit of concave functions is concave.
We next verify that the modulated barrier strategy at b v is optimal for the problem (4.5):
, by continuity). Since w i are C 2 and concave and satisfy (4.8), the assertion of the Lemma follows by an argument similar to the one used in the proof of Theorem 2.1. Fix an arbitrary D ∈ D and let U be the corresponding risk process. Applying a generalised form of Itô's lemma to the process {e −Λ T ∧τ w(U T ∧τ , Z T ∧τ ), T ≥ 0}, taking expectations and using that f v i (x) ≤ 0 as in the proof of Theorem 2.1, we find that By letting T → ∞ and invoking the monotone convergence theorem and the fact that w and v are non-negative and that The convergence of the iteration procedure is an immediate consequence of the following contraction property of U v:  From the definition of U and the dynamic programming equation we directly see that In particular, taking v = v − 0 and w = v + 0 and repeatedly applying the former inequality yields that v − n ≤ V ≤ v + n . It follows from Lemma 4.4 that v + n and v − n converge to the unique fixed point of U , which is therefore equal to V . Next note that, in view of Lemma 4.2, v ± n are concave (as we took v ± 0 ∈ C), so that V , a pointwise limit of concave functions, is also concave. This completes the proof of Proposition 4.3.

Positive drifts
From now on we restrict ourselves to the case of two regimes, E = {0, 1}. For the setting of positive drifts, μ 0 , μ 1 > 0, we will derive a system of two non-linear equations for the optimal dividend barriers. We will denote by F 0 and F 1 the quadratic polynomials given by with two different real roots λ k 1 and λ k 2 . Consider the fourth order polynomial The equation F k (λ) = 0 has two different roots λ k − < λ k + given in (2.7) and the equation F 0,1 (λ) = 0 has four real roots satisfying λ 1 < λ 2 < 0 < λ 3 < λ 4 .
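The root structure described above can be checked numerically. The sketch below assumes, as is standard for the two-regime system, that $F_k(\lambda) = \tfrac{1}{2}\sigma_k^2\lambda^2 + \mu_k\lambda - (r_k - q_{kk})$ and $F_{0,1}(\lambda) = F_0(\lambda)F_1(\lambda) - q_{00}q_{11}$; these forms, the function names and the parameter values in the usage note are assumptions and may differ from the definitions used in the text. Here `q = (q0, q1)` denotes the positive switching rates $(-q_{00}, -q_{11})$. The roots are bracketed using the sign of $F_{0,1}$ at $0$ and at the roots of $F_0$ and $F_1$, and then located by bisection.

```python
import math
from typing import List, Sequence


def f01(lam: float, mu: Sequence[float], sigma: Sequence[float],
        r: Sequence[float], q: Sequence[float]) -> float:
    """Assumed quartic F_0(lam) * F_1(lam) - q0 * q1, with q_k = -q_kk > 0."""
    f = [0.5 * sigma[k] ** 2 * lam ** 2 + mu[k] * lam - (r[k] + q[k]) for k in (0, 1)]
    return f[0] * f[1] - q[0] * q[1]


def quartic_roots(mu: Sequence[float], sigma: Sequence[float],
                  r: Sequence[float], q: Sequence[float]) -> List[float]:
    """Return the four real roots of f01 in increasing order."""
    neg, pos = [], []
    for k in (0, 1):
        disc = math.sqrt(mu[k] ** 2 + 2.0 * (r[k] + q[k]) * sigma[k] ** 2)
        neg.append((-mu[k] - disc) / sigma[k] ** 2)
        pos.append((-mu[k] + disc) / sigma[k] ** 2)

    def g(x: float) -> float:
        return f01(x, mu, sigma, r, q)

    def outward(start: float, step: float) -> float:
        # Walk away from the origin until the quartic becomes positive.
        x = start + step
        while g(x) <= 0.0:
            step *= 2.0
            x += step
        return x

    def bisect(lo: float, hi: float) -> float:
        flo = g(lo)
        for _ in range(200):
            mid = 0.5 * (lo + hi)
            fmid = g(mid)
            if (fmid > 0.0) == (flo > 0.0):
                lo, flo = mid, fmid
            else:
                hi = mid
        return 0.5 * (lo + hi)

    # g < 0 at the roots of F_0 and F_1, while g(0) > 0 and g -> +inf at +-inf.
    return [
        bisect(outward(min(neg), -1.0), min(neg)),
        bisect(max(neg), 0.0),
        bisect(0.0, min(pos)),
        bisect(max(pos), outward(max(pos), 1.0)),
    ]
```

For instance, `quartic_roots((1.0, 0.5), (1.0, 1.0), (0.1, 0.1), (0.5, 1.0))` returns two negative and two positive roots, in line with the ordering $\lambda_1 < \lambda_2 < 0 < \lambda_3 < \lambda_4$.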
Solving the systems of differential equation in Theorem 3.1 leads to the following result: where d = (d 1 , . . . , d 4 ) solves the linear system Ad = h, where h = (0, 0, 1, 0) and The proof of Proposition 5.1 is given in the Appendix.

Sensitivities of the optimal barriers
To illustrate the effects of regime-switching and the sensitivities of the optimal barrier levels we numerically solved the system of non-linear equations in Proposition 5.1 for different parameter values, and compared the results with the explicit solutions (2.4) and (2.5) corresponding to the absence of regime-switching. The non-linear equations were solved using a Maple routine based on the standard quasi-Newton method. We chose the parameters as in Table 1 and varied $\mu_0$, $\sigma_0$, $q_{00}$ and $r_0$ individually whilst keeping the other parameters fixed; the results are given in Table 2.

Table 2: The optimal barriers for drifts $\mu_0$, volatilities $\sigma_0$, transition rates $-q_{00}$ and discounting rates $r_0$.

We see that when the drift parameter $\mu_0$ is increased then initially $b^*_0$ and $b^*_1$ increase, while they decrease when the drift $\mu_0$ becomes very large. Apparently, for relatively low drift it is optimal to reduce the probability of ruin, while for large drift the effect of discounting takes priority. Table 2 also shows that the barriers rise when $\sigma_0$ increases. A larger volatility leads to a higher probability of ruin, requiring the company to raise the level of the barrier in order to protect its future operations. We can also observe the effect of the transition rates of the underlying Markov chain. For example, if the rate is $-q_{00} = 0.01$, the chain spends a large part of the time in state 0 (in equilibrium, $3/3.01 \approx 99.7\%$ of the time), which is reflected in the fact that $b^*_0 = 1.014$ is very close to $a^*_0 = 1.013$, whereas if $-q_{00}$ and $-q_{11}$ are of similar size, the chain spends on average similar amounts of time in both states and the level $b^*_0$ differs substantially from $a^*_0$. Finally, we note that both $b^*_0$ and $b^*_1$ decrease when the rate of discounting $r_0$ is increased: if the rate of discounting is higher it is optimal to increase the dividend payments by lowering the dividend barriers.
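The equilibrium fraction quoted above follows from the stationary distribution of a two-state chain, $\pi_0 = q_1/(q_0 + q_1)$ with $q_k = -q_{kk}$. The short sketch below reproduces the arithmetic; the value $-q_{11} = 3$ is inferred from the ratio $3/3.01$ in the text and is an assumption, as is the function name.

```python
from typing import Tuple


def stationary_two_state(q0: float, q1: float) -> Tuple[float, float]:
    """Stationary distribution (pi_0, pi_1) of a two-state chain with exit rates q0, q1 > 0."""
    if q0 <= 0.0 or q1 <= 0.0:
        raise ValueError("exit rates must be positive")
    total = q0 + q1
    # Balance equation pi_0 * q0 = pi_1 * q1 with pi_0 + pi_1 = 1.
    return q1 / total, q0 / total


# Exit rate 0.01 from state 0 and (assumed) exit rate 3 from state 1.
pi0, pi1 = stationary_two_state(0.01, 3.0)
print(f"long-run fraction of time in state 0: {pi0:.4f}")
```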

Adverse regime-shifts: negative drift
We next consider the case that the drift is positive in one state and negative in the other. Intuitively it is clear that for sufficiently small reserves a quick bankruptcy of the company is quite likely if the drift is negative, so that it is optimal to liquidate the company by paying out the entire reserves as a lump sum. If, however, the negative drift is moderate and the reserves are not too small, the expected future gains from a regime switch to a 'good' state may outweigh the effect of the negative drift and it may be optimal to continue the business. In that case a sensible strategy could be to liquidate the company for small initial reserves but to pay out dividends according to a modulated barrier strategy for larger levels of reserves, which we formalize as follows:

Definition 5.1 A modulated liquidation and dividend barrier strategy at levels
Condition (iii) states that all the reserves are paid out as dividends once the risk reserves fall below the level $d(Z_t)$. Define next the critical levels is negative for all x small enough, which implies that $\Delta_i \in (0, \infty]$. If $\Delta_i = +\infty$, which is the case if $\mu_i < 0$ and $|\mu_i|$ is sufficiently large, it is optimal in state $i$ to liquidate the company for any level of the reserves, by immediately paying out all the reserves as dividends; this can be directly checked from Theorem 2.1. In the case that $\mu_0 < 0 < \mu_1$ and $\Delta_0 < \infty$ (the case $\mu_1 < 0 < \mu_0$ follows by relabelling the states), it turns out that it is optimal to continue paying dividends if the reserves are large enough, where the 'liquidation' level $d^*_0 > 0$ solves the smooth fit The solution is explicitly given as follows:
Proposition 5.2 Suppose that $\mu_0 < 0 < \mu_1$ and $\Delta_0 < \infty$. (i) The optimal strategy in (2.3) is given by the modulated liquidation and dividend barrier strategy at levels $d^* = (d^*_0, 0)$ and

where h = (d 0 , 1, q 00 , 0) and The value functions are given by Its proof is given in the Appendix. Observe that the value function V 0 is not concave, as there are two disjoint intervals where it has unit slope.
As an illustration, we next provide a numerical example of a case where a modulated liquidation-dividend strategy is optimal.

Conclusion
In this paper we have shown that, in the presence of regime-shifts, the optimal dividend policy is given by a threshold strategy set at a level that is a function of the current regime. That is to say, the policy that maximises the expectation of the net present value of the paid dividends until the moment of default consists of paying out as dividends the overflow of the cash reserves above a certain optimal threshold, where this threshold jumps up or down exactly at the moment when the regime shifts. Hence, at the moment of a regime shift, when the key parameters such as drift, volatility and discounting may change, it may be optimal to make a lump-sum dividend payment, namely when the threshold level jumps below the current level of the cash reserves. We presented a contraction algorithm for the computation of the optimal threshold levels. As a case study we numerically investigated the parameter sensitivities of the levels in the case of two regimes. It would be desirable to systematically explore the dependence of the optimal threshold levels on key parameters, and its financial significance, which could be achieved by an analytical investigation of its form in specific parametric models; this is a topic left for future research.

A Proofs
A.1 Proof of the bounds (Proposition 2.1) To prove the upper and lower bounds in (2.8) we consider two auxiliary optimal switching problems where not only the dividend payout but also the regime is a control variable. An admissible switching strategy σ = {Z σ t , t ≥ 0} is an F-adapted E-value process that indicates the current regime. The two control problems are then given by where D − denotes the constant barrier strategy at level b − (where b − denote the optimal barriers corresponding to V − ), S and D are the sets of all admissible switching and dividend strategies, Λ σ s = s 0 c(Z σ u )du and τ σ is the corresponding ruin time. As the regime-switching process Z is one particular admissible switching strategy, the upper and lower bounds in (2.8) will follow once we have shown that v + ≤ V + and v − ≥ V − .
In the proof we will use the following sub-and super-harmonicity properties: where G i is the infinitesimal generator of X i t = x + μ i t + σ i W t . Proof Since V + is the value function corresponding to the optimal dividend problem without regime-switching, and with volatility, drift and discounting 1, max i∈E , it solves a corresponding HJB equation. In particular, V + and V + are both positive, so that, in view of the form of the drift and discounting it follows that (A.2) holds true. By a similar argument it can be verified that (A.3) holds true. Proof of Proposition 2.1. Fixing an arbitrary admissible switching and dividend strategies σ and D and denoting by U σ,D the corresponding risk process, an application of Itô's lemma shows that where M σ is some local martingale which is a supermartingale as it is bounded below. Taking note of Lemma A.1 and the facts that V + ≥ 1 and V + (0) = 0, it follows by rearranging and taking expectations that Subsequently taking the supremum in (A.4) over all σ ∈ S and D ∈ D shows that V + (x) ≥ v + (x). By a similar line of reasoning it can be verified that is valid for all x ≥ 0, and, since V χ ≤ V , the proof of (2.8) is complete.

A.2 The dynamic programming equation (Proposition 2.2)
The proof is an adaptation of a classical line of reasoning to a regime-switching setting. We start with the following two lemmas: where θ i = c i − q ii In particular, it follows that V (•, i) is Lipschitz continuous.
Proof Let $\epsilon > 0$ and let $D(u, i)$ be an $\epsilon$-optimal strategy for $U_0 = u$, $Z_0 = i$, and consider the strategies $D_t(u, y) = (u - y)1_{\{t=0\}} + D_t(y, i)1_{\{t>0\}}$ ("pay a lump sum $u - y$ and follow then the strategy $D(y, i)$") and $\tilde{D}_t(u, x) = 1_{\{t > \tau(x),\, Z_{\tau(x)} = i\}} D(x, i)$ for $x \ge u \ge y$ ("wait until the first time $\tau(x)$ that the reserves reach the level $x$; if no regime-switch has occurred by then, follow the strategy $D(x, i)$, otherwise don't pay any dividends"). Then it follows that . Letting $\epsilon \to 0$ the bounds follow.
Let $D^{i,j}$ be $\epsilon$-optimal strategies corresponding to $U_0 = x^{(j)}$ and $Z_0 = i$, that is, $V(x^{(j)}, i) - V_{D^{i,j}}(x^{(j)}, i) < \epsilon$, and define the strategy $D$ depending on $U_0 = x$ and $Z_0 = i$ as 'pay a lump sum $(x - x^{(j^*)})$ and follow then the strategy $D^{i,j^*}$' where $j^* = \max\{j : x^{(j)} \le x\}$: Then it follows that As this estimate holds for arbitrary $x \ge 0$ and $i \in E$, the proof is complete.
Proof of Proposition 2.2: Denote by $w$ the right-hand side of (2.9) and by $D \in \mathcal{D}$ and $U$ an arbitrary admissible strategy and the corresponding cash reserves. To show that $V \le w$, we verify that $V_D \le w$: To prove the opposite bound $w \le V$ we will show that, for given $\epsilon > 0$ and $D \in \mathcal{D}$, there exists a strategy $D(\epsilon) \in \mathcal{D}$ such that $w_D \le V_{D(\epsilon)} + \text{const} \cdot \epsilon$, where $w_D$ denotes the expectation in (2.9). Fixing $M > 0$ such that $\mathbb{P}_{x,i}(X_\zeta > M) < \epsilon$ for all $i \in E$, we denote by $D \in \mathcal{D}$ a dividend strategy that pays out $x - M$, if $U_0 = x > M$, and that is $\epsilon$-optimal, uniformly over starting values (i,

in terms of these coefficients we arrive at the matrix equation $Ad = h$. The equations for the optimal levels follow since V (x, ) is $C^2$ at b ı (noting that any two of the three equations imply the third one). The system in Proposition 5.1 can be solved explicitly, as shown in the following result. To that end we introduce the functions $\tilde{g}_{ı,k}$ and $g_{ı,k}$, $k = 1, 2$, as follows: $\tilde{g}_{ı,k}(x) = e^{\lambda_k x} + C_k e^{\lambda_3 x} - (C_k + 1)e^{\lambda_4 x}$ and $g_{ı,k}(x) = F_ı(\lambda_k)e^{\lambda_k x} + C_k F_ı(\lambda_3)e^{\lambda_3 x} - (C_k + 1)F_ı(\lambda_4)e^{\lambda_4 x}$, where $C_k = \frac{F_ı(\lambda_k) - F_ı(\lambda_4)}{F_ı(\lambda_4) - F_ı(\lambda_3)}$, and we write $G = \tilde{g}_{ı,1}(b_ı)\, g_{ı,2}(b_ı) - \tilde{g}_{ı,2}(b_ı)\, g_{ı,1}(b_ı)$. The last two equations of $Ad = h$ then can be rewritten as with a unique solution $(d_1, d_2) = G^{-1}(g_{ı,2}(b_ı), -g_{ı,1}(b_ı))$ (A.14) according to Cramer's rule (as $G \ne 0$).
Proof of Proposition 5.2. The structure of the proof is analogous to that of Theorem 3.1. As the value function $V_0$ will not be concave, some parts of the proof have to be modified. The steps are outlined as follows: