Numerical Approximation of a System of Hamilton–Jacobi–Bellman Equations Arising in Innovation Dynamics

We consider a system of fully nonlinear partial differential equations that corresponds to the Hamilton–Jacobi–Bellman equations for the value functions of an optimal innovation investment problem of a monopoly firm facing bankruptcy risk. We compare several algorithms for the numerical solution of the considered problem: the collocation method, the finite difference method, WENO method and the adaptive finite element method. We discuss implementation issues for the considered schemes and perform numerical studies for different model parameters to assess their performance.


Introduction
We study the system of Hamilton-Jacobi-Bellman (HJB) equations x ∈ R, (1.1) where r > 0 is a constant (the so-called discount rate) and the operators L 1 , L 2 are defined as (1.3) B L'ubomír Baňas banas@math.uni-bielefeld.de L 2 (V 2 , x, y) := 1 2 σ 2 (y)∂ 2 yy V 2 (x, y) + b 2 (x, y)∂ x V 2 (x, y) + μ(y)∂ y V 2 (x, y) − p 0 (x)V 2 (x, y). (1.4) The terms f 1 and f 2 in (1.1)-(1.2) as well as b 1 , b 2 , p 0 , p 2 , σ , and μ in (1.3)-(1.4) are real-valued continuous functions specified in Table 2 below. The HJB system (1.1)-(1.2), which can be classified as a fully-nonlinear degenerate second-order partial differential equation (PDE) in non-divergence form, was derived in [9] to study optimal innovation investment strategies of a monopoly firm under financial constraints, i.e., that is facing a bankruptcy risk. The solutions V 1 and V 2 to (1.1)-(1.2) correspond to the value functions of different investment modes of the monopoly firm considered on an infinite time horizon and the control variable I in (1.1) is the innovation investment (i.e. the amount of innovation activity the firm performs). Given the liquidity x ≡ x(t) and the innovation investment I ≡ I (t) at time t ≥ 0, the monopoly firm may enter three different modes of production m = 0, 1, 2 with certain probability rates that are characterized by a Markov process m := {m t ∈ {0, 1, 2} : t ≥ 0} with m 0 = 1, and for every t ≥ 0 for any other (i, j), i = j, (1.5) where p 0 ≥ 0 is the bankruptcy rate and p 2 ≥ 0 is the innovation rate. For every mode i ∈ {0, 1, 2}, it holds that 2 j=0 P[m t+dt = j|m t = i] = 1. The possible scenarios given by the above Markov process can be summarized as follows. The mode m = 1 corresponds to the pre-innovation mode where the firm is selling an old product on an established market and is developing a new product with a suitable amount of innovation investment I . The latter decreases firm's liquidity, but allows it to enter the new market and to move from selling only the old product (m = 1) to selling both the old and the new product on the market (m = 2) with a probability rate p 2 ≡ p 2 (I ). Finally, depending on its liquidity the firm may go bankrupt with a probability rate p 0 ≡ p 0 (x) and exit the market, that is the firm leaves the respective mode m = 1 or m = 2 and enters the mode m = 0, where the value V 0 = 0 is constant. The objective of the firm is to maximize the accumulated dividend f m , m = 1, 2 over time, which yields the system (1.1)-(1.2) for the modes m = 1, m = 2, respectively. For details on the derivation of the HJB Eqs. (1.1-(1.2) we refer to [9].
In general, the solutions to (1.1)-(1.2) do not exist in the classical sense but can be defined in the sense of viscosity solutions, cf., [14,28,33]. The existence of the unique viscosity solution of a general HJB equation written in the form is guaranteed provided that the functional F is semicontinuous and satisfies a monotonicity condition, i.e., it holds for A, whenever A − A is negative semidefinite and q ≥ q , see [18] and [33,Chapter 5]. Equation (1.2) corresponds to (1.6) with d = 2 and F given by where for two d ×d matrices A := {a i, j } i, j , B := {b i, j } i, j we denote A : B := d i, j a i, j b i, j . The above F is monotone since σ 2 ≥ 0 and ( p 0 +r ) > 0. The continuity of F is a consequence of the continuity of the functions σ , μ, and f m , b m , m = 1, 2. Hence, we deduce that (1.2) admits a unique viscosity solution.
Equation (1.1) corresponds to (1.6) with d = 1 and F given by which is monotone since r > 0 and p 0 , p 2 ≥ 0. Formally, one can equivalently formulate the Hamiltonian using I opt := arg max For the considered model parameters (cf., Table 2) F is continuous, except for p = 0; the singular case p = 0 can be dealt with by imposing an additional condition I (x) ≤ I max , x ∈ R for some 0 < I max < ∞. Apart from theoretical considerations, the monotonicity condition (1.7) is also required for the convergence of monotone and consistent numerical schemes, cf., [1] and the recent overview paper [27]. The finite difference schemes are among the most established schemes for the approximation of HJB equations. Monotone finite difference schemes can be constructed by appropriately choosing the finite difference stencil, for instance by the upwind finite difference approximation [32,34] or by adopting a semi-Lagrangian approach [7,10,11]. Much fewer contributions are available on monotone finite element (FEM) based discretization methods including [15,17,29], where in particular [17] is the first to show convergence of FEM discretizations of HJB equations of the type (1.1). A popular choice in applications is the collocation method [9,35] because of its capability to solve HJB equations in higher dimensions, cf., [6]. As far as we are aware, very few results exist on adaptive numerical approximation of HJB equations. We mention [35] which employs an adaptive collocation approach; and [15,21] utilize adaptive finite element methods to linear second-order problems that satisfy the so-called Cordes condition see [27,Definition 2.46]. We note that the Cordes condition is not satisfied for the model (1.1)-(1.2). A widely used class of stable higher order discretization schemes for the time-dependent HJB equations, so-called WENO (weighted essentially non-oscillatory) schemes, have been introduced in [19,23]. The WENO schemes have been used to solve HJB equations in conjunction with the semi-Lagrangian approach, see for instance [3,8,12]. An alternative approach to achieve higher-order accuracy in space is to use the discontinuous Galerkin approximation, see for instance [22] and the references therein. Finally, we also mention that the considered HJB model bears some similarities with the problems of utility maximization under regime switching, that leads to a system of coupled HJB equations. The individual equations in the HJB system can be decoupled via a fixed-point type iteration; for further details see [25] (as well as the references therein), where a monotone finite difference method for the solution of the HJB system is analyzed using the Barles-Souganidis framework. In contrast to the regime switching problem the HJB system (1.1) is decoupled but the respective equations are posed on different state spaces. Furthermore, whereas the generator of the regime switching process in [25] is constant over time, in our setting the switching rate between modes is a function of the state and of the control, in general.
In the present work we compare different discretization methods for the approximation of the HJB Eqs. (1.1)-(1.2). We note that the Eq. (1.2) can be solved independently of (1.1). Hence we first determine the numerical approximation of the value function V 2 in mode m = 2. This numerical approximation is then used to compute the approximation of the value function V 1 in mode m = 1. We consider a stabilized finite element method (FEM) for the approximation of the linear degenerate Eq. (1.2). Furthermore, we propose two adaptive finite element discretization strategies for the solution of the (nonlinear) HJB Eq. (1.1); the first approach employs a stabilized finite element approximation and the second one is based on an adaptive least-squares finite element method (LSFEM). The nonlinear algebraic system that results from the discretization of (1.1) is solved iteratively using the well-known policy iteration algorithm, see [16,Chapter 4] and [5]. We compare the proposed finite element methods with alternative discretization approaches: the Chebyshev collocation method and the finite difference method. We discuss relevant implementation details of the considered numerical schemes. In particular, the convergence of the policy iteration for the discrete approximation of (1.1) proves to be problematic in certain scenarios; we discuss suitable modification of the considered numerical algorithms which improve the convergence behavior of the policy iteration.
In addition to the above schemes that approximate the problem (1.1) directly, we propose a numerical approach where (1.1) is interpreted as a steady state of the corresponding time-dependent problem. The time-dependent problem is solved using the policy iteration where the subproblems are discretized in time using approach that combines a total-variation diminishing time discretization with a WENO spatial-discretization of the time dependent problem. The spatial discretisation is based on the upwind WENO5 scheme proposed in [23], [19]. The resulting finite-dimensional system of ordinary differential equations is approximated by a strong stability preserving time-discretization scheme, cf. [30], which is used to compute the steady state of the discrete problem.
The paper is organized as follows. In Sect. 2, we give details on the notation and the considered parameters in (1.1)-(1.2), and discuss analytical properties which are useful to deduce boundary conditions that are required in the considered numerical approximations. In Sect. 3, we discuss the discretization of (1.1)-(1.2) by the collocation method, Sect. 4 is dedicated to the finite difference approximation. The WENO scheme is introduced in Sect. 5. In Sect. 6, we present the stabilized FEM method and the LSFEM method as well as the corresponding adaptive mesh refinement algorithms. We discuss the performance of the considered numerical schemes in Sect. 7.

Model Parameters
The functions in (1.1)-(1.4) are defined in Table 2 and the values of the constants ν 1 , ν 2 , α n , α n , η, δ, σ , r , γ I , ξ in these expressions are fixed throughout the paper, see Table 1. We compute the numerical solution for different values of the parameters α o , γ B ; we consider the following three Scenarios: Scenario 1 with (α o = 0.8, γ B = 0.05), Scenario 2 with (α o = 0.8, γ B = 0.005), and Scenario 3 with (α o = 1.0, γ B = 0.05); the parameters in the considered Scenarios can be interpreted as follows: α o is the established market size and γ B is the bankruptcy parameter. Furthermore, the state variable y ≡ y(t) is interpreted as the evolving size of the market demand for the new product. Pre-innovation profit Demand trend for the new market σ (y) σ y Volatility of the new market

Boundary Conditions
All numerical schemes considered in this paper can only be applied on bounded domains. Hence, we restrict the computations on sufficiently large finite domains and deduce suitable boundary conditions from the considerations below. Regarding (1.5), the probability of bankruptcy, i.e., V 1 = V 0 or V 2 = V 0 is increasing when x → −∞, thus at the left end of the domain, we impose a zero Dirichlet boundary condition to V 1 while at the right end, the solution can be characterized by the following result from [9,Proposition 1].

Proposition 2.1
Assume that r > ν 1 . For every x, y ≥ 0 the value in mode m = 2 is given by Furthermore, it holds for every x ≥x := max 2ξ |I nc | 2 − α 2 o 4(r − ν 1 ) , 0 with I nc := The optimal control is constant over time, that is I = I nc ; (ii) The value in mode m = 1 is given by

Discretization by Collocation Method
In this section, we describe the Chebyshev collocation method to compute an approximate solution for (1.1) and (1.2). The method approximates V 1 (x) and V 2 (x, y) by finite series of Chebyshev polynomials with unknown coefficients. The finite series are then substituted into (1.1) and (1.2) and the coefficients are determined by requiring that the differential equations are satisfied at some finite number of collocations nodes. The collocation method can only be applied on a finite domain and thus, for convenience, the scaling functions (3.1)-(3.2) are used.

Rescaling Transformation
A modification of the differential Eqs. (1.1)-(1.2) on an infinite domain results into equivalent differential equations on finite domains. In the manner of [9], we introduce the function which transforms R into (0, 1). The inverse transformation of (3.1) reads Thanks to the transformation (3.2), we can work on the bounded interval (0, 1). However, the transformation (3.2) is still undefined at z = 0 and z = 1. Thus, we make an approximation of (0, 1) by cutting off the left and right ends of the interval and work in [z,z]. Analogously, we perform the computations for y ∈ [y,ȳ] instead of y ≥ 0. We set U (z) := V 1 (x(z)) and V (z, y) := V 2 (x(z), y). The inverse transformation (3.2) implies that ∂ x V 1 = 1 2 z(1 − z)∂ z U and as a result the differential equations in (1.1)-(1.2) are transformed into (3.6)

Discretization of the Value in Post-Innovation Mode, m = 2
In a given state space [z,z] × [y,ȳ], we first construct a grid of collocation nodes N : ..,n z and N y := {y j } j=1,...,n y , z i and y j are defined as We construct a set of basis functions {φ i, j (z, y)} {i=1,...,n z }×{ j=1,...,n y } corresponding to our Chebyshev grid such that where T k : t → T k (t) is the Chebyshev polynomial of degree k defined for any t ∈ [0, 1]. For all z ∈ [0.5, 1) and y ≥ 0, the value function in mode m = 2 is given by (2.1). Our calculation is carried out only in the space [z, 0.5] × [0,ȳ]. In order to make sure the calculated value function is continuous at z = 0.5, that corresponds to x = 0, we specify further that (3.10) The expansion of V 2 as a Chebyshev series is given by The approximation of V 2 is therefore defined by where x : z → x(z) is given by (3.2) and V 2 : (x, y) → V 2 (x, y) is given by the formula (2.1). Furthermore, we impose V 2 (0.5, y j ) = V 2 (x(0.5), y j ) to enforce the continuity of (3.12) at z = 0.5.
In total there are n z n y number of nodes, implying n z n y number of equations. Furthermore, for i ∈ {1, . . . , n z − 1} and j ∈ {1, . . . , n y } we introduce four (n z − 1)n y × n z n y matrices B, B y , B yy , and B z with entries and s = (i − 1)n z + j ∈ {1, . . . , (n z − 1)n y }. For the s nodes, the resulting linear system of equations reads For the other n y nodes, it reads v · φ(0.5, y j ) = V 2 (0, y j ), j ∈ {1, . . . , n y }.

Discretization of the Value in Pre-Innovation Mode, m = 1
In contrast to mode m = 2, the value function V 1 in mode m = 1 is a one-dimensional problem defined on R. The optimal control I * = arg max I L 1 (V 1 , V 2 , x, I ) can be expressed explicitly as (3.14) Similarly as for the mode m = 2, we employ the transformation (3.2). After the transformation, the optimal control in (3.14) can be rewritten as for all z where ∂ z V 1 (x(z)) exists. We proceed with the same collocation method as in mode m = 2, but on the one dimensional state space of z ∈ [z,z]. However, in mode m = 1 for some values of the model paramaters there potentially exists a kink in V 1 (x(z)) at z = 0.5 and consequently the control functionĨ * (z) is discontinuous at z = 0.5 (x = 0). In [9,Proposition 2 and Corollary 3] an analytical characterization of scenarios where such a jump in the optimal control function occurs is provided. In particular, it is shown that the following three cases can occur.
In the case (i) the discontinuity of the control at x = 0 is guaranteed, whereas in cases (ii) and (iii) the theoretical results do not allow for a clear conclusion with respect to the continuity of optimal investment at x = 0. Due to the polynomial approximation of the value function in our collocation approach, by definition the approximate value function is smooth on the entire considered state space and a value function with a kink cannot be generated. Hence, based on the considerations above, we apply the collocation scheme separately on the intervals (−∞, 0] and [0, +∞). We define the approximation of V 1 as follows where V 1 and q V 1 are respectively the approximation of V 1 by Chebyshev expansion in [z, 0.5] and [0.5,z], that is for x ≤ 0 and x ≥ 0. Depending on which of the three listed cases applies boundary conditions for one or both of the collocation schemes are inserted. In particular, in the case (i) the value at z = 0.5 is calculated as and in both collocations, similar to the procedure in mode m 2 , one node is set at z = 0.5 and the condition is included. In case (ii) the collocation is first applied to [0.5,z] without any boundary condition, i.e. all n z nodes in the interior of the interval and all conditions determined by the HJB equation at the node. This yields q V 1 (z). In a second step the collocation is applied to [z, 0.5] with the boundary condition V 1 (0.5) = q V 1 (0.5). Analogous for case (iii). We illustrate the application of the collocation method for scenario (iii), the other scenarios are analogous. In the case (iii), the interval [z, 0.5] is invariant under the dynamics and the expansion of V 1 as a Chebyshev series on this interval is given by To solve the resulting nonlinear problem we use a policy iteration procedure, see for instance [2,16]. In particular, we consider a sequence of vectorsū (n−1) , with n ∈ N as the iteration counter. By (3.15), we derive the discrete optimal control I (n−1) := {I (n−1) i : i = 1, . . . , n z } at the stage n ∈ N of the iteration by where the column vectorsv ·,1 : . . , n z } and the matrices B and B z are such that Overall, the numerical algorithm that couples the Chebyshev collocation method and the policy iteration can be summarized as follows: Algorithm 3.1 (Collocation method) Start with an initial guess forū (0) (see below) and calculate the control I (0) according to (3.19). Then, for every n ∈ N, we proceed along the following two steps iteration: Policy improvement: Update the optimal control by (cf. (3.19)) Stopping criterion: Repeat the loop until In order to determine the initial guess forū (0) we employ a continuation approach. For γ I = 0 the value function is linear and can be determined in closed form (see [9]). The corresponding coefficient vectorū is found by interpolation of this function at the nodes {z i : i = 1, . . . , n z }. The parameter γ I is then stepwise increased till its target value is reached, where in each step the coefficient vector found in the previous step is used as the initial guessū (0) . The numerical calculation of q V 1 (z) with z ∈ (0.5,z] is similar to the calculation of V 1 (z) except we further enforce q V 1 (0.5) = V 1 (0.5) so that the continuity of (3.16) is preserved at z = 0.5.

Numerical Results
To illustrate the results obtained using the collocation method, we show at the bottom part of Fig. 1 the value and feedback functions obtained for the Scenarios 1-3. The number of collocation nodes for these calculations are n z = 40 for the collocation on the domain z ∈ [0.5,z] (i.e. x ≥ 0) and n z = 25 on the domain z ∈ [z, 0.5] (i.e. x ≤ 0). Note that in light of the considered parameter values we have γ I ∈ (γ I ,γ I ) in Scenario 1, whereas γ I >γ I in Scenario 2 and γ I < γ I in Scenario 3. Hence, in Scenario 1 x = 0 is a stable fixed point of the state dynamics under optimal investment and we impose boundary is given by (3.17), for both the collocations run on z ∈ [z, 0.5] and z ∈ [0.5,z]. In the other two scenarios no boundary conditions were imposed on the domain z ∈ [z, 0.5] (Scenario 2) respectively on the domain z ∈ [0.5,z] (Scenario 3) and, following our description above, the obtained values of V 1 (0.5) respectively q V 1 (0.5) were used as the boundary conditions on the second domain. For Scenario 2 we have calculated the value and control functions not only for our standard value z = z(−10), but also for z = z (−20). The reason for also considering a smaller lower bound of the domain is that V 1 (z(−10)) still has a substantial positive value, whereas it is close to zero at z (−20). Comparing the dashed green lines (z = z(−10)) and the solid green lines (z = z(−20)) however shows that the value functions calculated under these two domain specifications are indistinguishable, and also the optimal control determined under z = z(−10) is very close to that obtained for z = z (−20). Although hard to discern from the figure in the right lower panel, we like to point out that not only for Scenario 1, where according to our theoretical results we know that a jump in the control occurs at x = 0, but also for Scenario 3 the calculated optimal investment function exhibits a jump at x = 0.
Considering still the lower panels of Fig. 1, we observe that the optimal investment function in Scenario 2 with z = z(−20) crosses the line I = 0 at approximately x = −14. This indicates that for the calculated value functions V 2 (z, 0) − V 1 (z) < 0 holds for z close to the lower bound z. Since the value after successful innovation cannot be lower than that in mode m 1 before the innovation, this is a numerical artifact suggesting that the value of V 1 is overestimated for z close to z. To address this issue, we show in the upper part of  particular for z = z(−10) yield unsatisfactory results. This can be attributed to the fact that the distribution of the collocation points is getting coarser away from x(z = 0.5) = 0 due to the transformation (3.1), (3.2) which results in low accuracy of the scheme for large domains. Nevertheless, without the transformation the policy iteration does not converge, unless a good initial guess for the policy iteration is used (see next paragraph below). Furthermore, we observe in Fig. 1 that the results get worse if a homogeneous Dirichlet boundary condition is imposed since in Scenario 2 the actual value of V 1 (x) at x = −10 and x = −15 is still substantially larger than 0.
In order to improve the performance of the collocation method in Scenario 2, we have computed the solution with the collocation method for this case without applying the state space transformation

Discretization by the Finite Difference Method
This section is devoted to the description of the finite difference approximation of (1.1). For the sake of brevity, we do not consider the finite difference approximation of V 2 since the results of the finite difference approximation are very similar to those obtained by the finite element method in Sect. 6.1 below.
The finite difference method requires the spatial domain to be of finite size. Thus, we consider (1.1) for x ∈ (x,x) with x < 0 andx ≥x, see Proposition 2.1. We impose the following Dirichlet boundary conditions We introduce an equidistant grid h , where

Upwind Finite Difference Approximation
We approximate ∂ x V 1 by backward, central, or forward differences, which are denoted by D − x , D 0 x , and D + x , respectively, and defined by We also introduce the upwind finite difference approximation D ± Once a numerical approximation of the value function V 2 is computed we denote its values on the grid points at ). To obtain the finite difference approximation of V 1 , we replace the derivative in (1.1) by the upwind finite difference (4.4). We set ) and by (4.1) impose To obtain the finite difference approximation of the optimal control I * , we replace the derivative in (3.14) by the central finite difference D 0 Thus, to (1.1), we associate the following discrete system for all nodes i = 1, . . . , M − 2 In the discrete system (4.6), the optimal control I given by the formula (4.5) depends implicitly on V 1 and the central difference of V 1 . Analogically to the collocation method, we use the policy iteration procedure to break the implicit character of the system (4.6). Altogether, the upwind scheme and the policy iteration give the following algorithm: Stopping criterion: Repeat the loop until

Numerical Experiments
To obtain the numerical approximation of V 1 we use the values of the numerical approximation of V 2 at y = 0 computed by the finite element method with mesh size h = 2 −11 (see Sect. 6.1 below) in all numerical experiments in this section.
Effect of V 2 on the solution As mentioned above, we use a value function V 2 computed on a (fixed) coarse mesh, i.e. the approximation of V 2 is given on M = 2 11 uniformly distributed grid-points. To compute a value function V 1 on a finer mesh, we have to interpolate the available data for V 2 over the (finer) mesh on which V 1 is computed. Below we examine the effect of different interpolation methods on the numerical solution in Scenario 2. The value function V 1 is computed for x ∈ (−20, 10) using Algorithm 4.1.
We compare the effect of linear interpolation and cubic interpolation (see [20]) of V 2 on the control. We observe in Fig. 2 (bottom) that algorithm with linearly interpolated V 2 produces a control with irregular oscillations while control obtained with cubic interpolation is smooth; note that in Fig. 2 (top) we display the central difference approximation of the gradient of V 2 . Although, the different interpolation methods for V 2 do not influence the overall convergence of the policy algorithm, it is preferable to avoid the oscillations in the control caused by the linear interpolation of V 2 . Therefore, we employ cubic interpolation of

Effect of the Discretization of the Control
We examine the effect of different approximations of the optimal control (3.14) on the convergence behavior of the policy iteration algorithm. The computations below were performed on a fixed grid with M = 2 15 grid-points. In addition to the central difference approximation (4.5), we consider the backward difference, forward difference and upwind approximation, i.e, the central difference D 0 x in the denominator of (4.5) is replaced by D − x , the forward difference D + x , and the upwind difference D ± x , respectively (with the corresponding modification of the formula (4.9) in the policy improvement step of Algorithm 4.1).
We also consider the following alternative to (4.5) which is an average of the respective formulas obtained by the forward and backward differences. The averageing has a regularizing effect in the case of discontinuous control (cf. Fig. 4) and improves the convergence behavior even when the control is not Lipschitz continuous. The convergence plots of the policy iteration with the considered finite difference approximations of the control for Scenarios 1-3 are displayed in Fig. 5. The results can be summarized as follows: • The upwind finite difference approximation (the results are not displayed in Fig. 5): this discretization yields control that is optimal on the discrete level; nevertheless, we observe that the policy iteration diverges with the upwind discretization. • The forward finite difference: the algorithm converges in Scenario 1 and 2 but diverges in Scenario 3. • The backward finite difference: the algorithm converges in all scenarios, nevertheless one may expect similar problems with convergence as for the forward difference approximation in general (i.e., for different model parameters). • The central and "averaged" finite differences: in both cases the algorithm converges in every scenario, the "averaged" difference approximation exhibits slightly better convergence behavior overall.
Finally, it is apparent from Fig. 5 that in Scenario 2 (i.e., when there is no discontinuity in the control) the approximation of the control has only minor influence on the convergence behavior of the policy iteration algorithm.

Experimental Rate of Convergence
Next, we study the rate of convergence of the finite difference numerical approximation; the numerical solution is computed on 10 in Scenarios 1,3 and on 20 in Scenario 2. Since no analytical solution is known for the given problem, we determine the rate of convergence by using a reference solution computed on a fine grid.
We compute a sequence of approximations for κ = 0, . . . , 5 on uniformly spaced grids with M(κ) = 2 10+κ grid-points denoted as The rate of convergence in the L ∞ -norm (analogically in the L 2 -norm) is determined as The results are displayed in Table 3; we observe that the experimental convergence rate is of order about 1 for the variables V 1 and I . The convergence rate in the L ∞ -norm is reduced to about 1/2 in Scenarios 1 and 3, which reflects the fact that the control is discontinuous. In addition, in Table 4 we display the experimental convergence rates for the "averaged" discretization of the control (4.10) (which has the advantage of a slightly faster convergence of the policy iteration algorithm); the observed convergence rates are similar to those from Table 3, the accuracy of the approximation is slightly worse in Scenario 2.

Weighted Essentially Non-oscillatory (WENO) Finite-Difference Approximation
In this section we propose an algorithm for the computation of the value function that employs higher-order finite difference discretization in space. The scheme is similar to the policy iteration Algorithm 4.1 with the exception that the value function in the policy evaluation step is obtain as a steady state of a time dependent version of (1.1), i.e., the solution of (1.1) is obtained as Given the control I we consider the time-dependent counterpart of (1.1) as Spatial semi-discretization We employ the notation from the previous section, in particular we denote x i = ih, where h is the mesh size. We perform the spatial semi-discretization of (5.1) on the grid (4.2). We use a WENO5 interpolation procedure that employs a stencil with 6 nodes [31], depending on the direction of the velocity, see (5.5). We assign the values to grid points near the boundary of r ; via the boundary condition given by . , x i+3 } and is constructed analogously, cf., [31].
We denote the approximation at the nose x i as u i (t) ≈ u(t, x i ) and define the following interpolated values We define the approximation of ∂ x u(t, x i ) by where the weights ω i are given by The parameter c e in (5.6) is introduced to avoid a denominator with zero value, in the numerical experiments below we set c e = 10 −6 . The parameters β i in (5.6), the so-called smoothness indicators, are defined as Hence, the spatial semi-discretization based on the upwind WENO5 interpolation procedure leads to the following system of ordinary differential equations

Time-discretization
Given a time step t and a sequence of discrete time levels t = t we denote the fully discrete approximation at (t , x i ) as u i ≈ u i (t )). The time-discretization of (5.7) is based on the Heun method, which is an explicit one-step 2-stage Runge-Kutta method of order 2 (RK22). The choice of the time-discretization is motivated by the stability properties of the RK22 scheme [30]. Since the scheme is explicit we set t = h/2 to preserve the associated CFL condition.
The fully discrete scheme for the approximation of (5.1) then reads as follows: given an initial guess {u 0 i } M−2 i=1 compute for = 1, . . .
We then obtain the following modification of the policy iteration algorithm: Numerical results We discuss the results for Scenario 1, the behavior of Algorithm 5.1 in the remaining scenarios was similar. We observe very good agreement between the solutions computed with the upwind finite difference scheme and the WENO scheme, see Fig. 6. In Fig. 7 we display the error between the solution of the WENO scheme for decreasing mesh size and the solution of the upwind finite difference scheme on a very fine mesh with M = 2 20 grid points; we observe a linear convergence for the approximation of V 1 and convergence of order 1 2 for the control in the maximum norm. Algorithm 5.1 requires about 30 − 40 time more policy iterations than Algorithm 4.1, in Scenario 1 443 policy iterations were required to achieve the convergence criterion. As expected, the number of time-steps required to reach the steady-state in each policy iteration, decreases in later states of the policy iteration algorithm, only two time-steps per policy iteration are required after the 10th policy iteration, see Tables 3 and 4. The advantage of the time-stepping approach in Algorithm 5.1 is that it does not require the solution of a linear system of equations, nevertheless in the present setting Algorithm 4.1 was considerably faster in terms of CPU time due to the lower policy iteration count.
To reduce the number of policy iterations for the WENO scheme we employ a preconditioning approach: we use the control computed by Algorithm 4.1 as an initial guess for Algorithm 5.1. We obtained a reduction in the number of policy iterations as well as in the total number of time-steps, see Table 6.

Discretization by the Stabilized Finite Element Method
In this section we introduce stabilized finite element (FEM) schemes for the approximation of (1.1)-(1.2). We note that, in contrast to the collocation method, we do not employ the transformation Throughout the section we use the following notation. By L 2 ( ) we denote the standard Lebesgue space of square integrable functions on . For u, v ∈ L 2 ( ) we denote the L 2 -inner product by (u, v) := u(x)v(x) dx with the corresponding L 2 -norm u L 2 ( ) := (u, u) 1/2 .

Discretization of the Value in Mode m = 2
We note that for x ≥ 0 by Proposition 2.1 the solution of the HJB Eq. (1.2) is given explicitly by (2.1). Hence, similarly as in Sect. 3, we only consider numerical approximation of V 2 for x < 0 and impose a Dirichlet boundary condition at x = 0 via (2.1). In addition, in order to solve V 1 we only need to obtain the solution V 2 at y = 0. Therefore, to obtain a finite element approximation of V 2 we solve the HJB Eq.
are the standard nodal basis functions, s.t. ϕ (x m ) = δ ml for all x m ∈ N h . We also introduce a finite element space of functions that satisfy the homogeneous Dirichlet boundary condition on D as The standard nodal interpolation operator is denoted as and similarly we define δ 2 yy,h ≈ ∂ 2 yy . Consequently the discrete Laplace operator h := By the definition of the discrete inner product (6.1) and the fact that φ (x m ) = δ m for x m ∈ N h , we deduce from (6.2) that cf. [13,17,27]. The finite element approximation of the continuous problem is then constructed as follows.
is given by (2.1)) and solves the discrete system On setting L 2,h (V 2,h ) := 1 2 σ 2 (y)δ 2 yy,h V 2,h − b · ∇V 2,h − p 0 V 2,h we observe that (6.4) is equivalent to which is a discrete counterpart of (1.4) with an additional artificial diffusion term h stab h V 2,h .
The artificial diffusion term guarantees the monotonicity and convergence of the finite element approximation, cf. [17] and [27,Sect. 3.5]. It is well known that the nodal basis for weakly acute triangulation T h . Since we employ uniform right-angled triangulations (consequently T h are weakly acute) it can be shown that the artificial diffusion term −h stab h V 2,h ensures the monotonicity of the finite element scheme (6.4), [27,Lemma 3.42]. The monotonicity of the finite element approximation for adaptive meshes can be ensured by a suitable choice of the stabilization parameter h stab , for more details see [17,Sect. 8] and [27]. By the monotonicity of the numerical approximation standard arguments imply that the finite element solution V 2,h of (6.4) converges to the (unique) viscosity solution of (1.2) (considered on − ), cf. [17]. Furthermore, on noting the representation V 2,h ≡ V 2,h (x, y) = L =1 v 2, φ (x, y) we observe that (6.4) is equivalent to a system of linear equations for the coefficients {v 2, } L =1 . Hence, owing to the monotonicity of the numerical approximation the associated matrix is a M-matrix, which implies that the numerical approximation (6.4) satisfies a discrete maximum principle, see for instance [27,Corollary 3.43].
We observed that, in the present setting, it is also possible to use the numerical scheme (6.4) without the stabilization (i.e., with h stab ≡ 0) and the resulting numerical solution (not displayed) is very similar to those obtained by the stabilized scheme. In general the use of suitable numerical stabilization is advisable to ensure the reliability of numerical approximation.
The graph of the finite element solution V 2,h for Scenarios 1-3 (see Sect. 2) computed on a mesh with mesh size h = 20 × 2 −11 is displayed in Fig. 8.

Discretization of the Value in Mode m = 1
In this section we propose two adaptive finite element algorithms for the numerical solution of (1.1). The first algorithm uses a stabilized finite element method along with a heuristic mesh refinement strategy, the second adaptive finite element algorithm is based on a least-squares approach.
Since the finite element method requires the spatial domain to be of finite size we consider (1.1) for x ∈ = (−10, 10). If not mentioned otherwise we impose the following Dirichlet boundary condition V 1 (−10) = 0 and V 1 (10) = 10 + c − ξ γ 1 I nc ; (6.6) the first condition is motivated by heuristic observations and the second condition is derived from (2.2). We partition into possibly non-equidistant subintervals, i.e., we denote T = (x −1 , x ), = 1, . . . , L and T h = L =1 T . The partition T h is determined by an adaptive refinement strategy, that is, the computation starts with a coarse initial grid T h,0 and then performs local mesh refinements and coarsening using suitable criteria which will be discussed below. The space of piecewise linear globally continuous functions on T h is denoted The value function V 1 is then approximated by a continuous and piecewise linear function We propose the following adaptive numerical algorithm which combines the policy iteration algorithm (cf., Algorithm 4.1) with iterative mesh refinement. The algorithm requires the values of V 2 (x, 0), x ∈ ; we construct an approximation V 2,h ≈ V 2 (·, 0) as follows. For x ≤ 0 we obtain the (piecewise linear) approximation V 2,h ∈ Vh ≡ Vh(Th) using the stabilized finite element method from Sect. 6.1 on a uniform mesh with mesh sizeh = 20 × 2 −11 , see Fig. 8; for x ≥ 0 the approximation V 2,h is obtained from formula (2.1) by piecewise linear interpolation over a uniform mesh with the same mesh size.
1B) Policy improvement: For P h : L 2 ( ) → V h (cf., (6.11) below) update the control as 1C) Stopping criterion: Set n → n + 1 and proceed with step 1A) unless the relative change of the control I (n) h is below the prescribed tolerance (see, (6.9), (6.12), respectively). We note that the use of a suitable operator P h : L 2 ( ) → V h in step 1B) of the above algorithm is necessary because for V 1,h ∈ V h the derivative ∂ x V 1,h is a piecewise constant function, i.e., ∂ x V 1,h / ∈ V h . Possible choices of the operator are given implicitly by (6.11) below.

Stabilized Finite Element Approximation with Heuristic Mesh Refinement
In this section we introduce a stabilized finite element method with heuristic mesh refinement strategy for the (linear) subproblem (6.7). The method employs the following stopping criterion in step (1C) of Algorithm 6.1: In all considered cases the prescribed tolerance = 10 −6 was typically reached after 5 to 15 iterations; the iteration count is higher in the initial iterations and decreases once a good approximation of the control is available.
To simplify the notation we denote throughout this section . We note that the discrete inner product for ⊆ R 1 takes the form We employ a 1d counterpart of the stabilized finite element method from Sect. 6.1, i.e., given the control I h the finite element approximation V 1,h ∈ V h that satisfies the Dirichlet boundary condition (6.6) and solves where the discrete second-order term δ 2 x x,h V 1,h is a 1d analogue of (6.2). The artificial diffusion parameter in the above scheme is defined as with supp(φ ) = {x ∈ , φ (x) = 0}, and α stab = 1, unless mentioned otherwise. We note that, in general, the above choice of the stabilization parameter h stab does not guarantee the M-matrix property of the corresponding linear system of equations (i.e., the monotonicity of the numerical approximation). However, in the current setting the M-matrix property was only violated for very few rows of the matrix (e.g. rows that correspond to the elements with large gradient in the discrete control) and this effect had negligible influence on the numerical approximation. The monotonicity of the numerical approximation can be obtained by choosing a large value of the stablization parameter α stab , more sophisticated monotonicity preserving stabilization methods are also available, see for instance in [17].
The discrete control in (6.8) of Algorithm 6.1 reads . (6.11) An alternative choice of the approximation of the optimal control, which is an analogue of the "averaged" finite difference approximation from Sect. 4.1.2, is given as Similarly as in the case of the finite difference approximation, we observe that the above approximation requires similar iteration counts for the convergence as (6.11). Furthermore, we note that the policy iteration does not converge without any "averaging" of the (piecewise constant) derivative ∂ x V 1,h , e.g., for or for the choice I ≡Ĩ , withĨ defined below.
For the adaptive mesh refinement we employ the following local error indicator , with a scaling factor α η > 0; the error indicator involves a modified controlĨ h = L =1Ĩ φ (x) wherẽ .
On meshes with uniform mesh size h ≡ 1 2 |supp(φ )| the denominator in the above expression becomes 1 2 Hence, the error indicator η h measures the difference between discrete controls obtained by two alternative discrete formulas which both approximate the continuous expression of the optimal control (3.14). Due to the sensitivity of the control with respect to the discretization, the above heuristic mesh refinement strategy produces meshes with good approximation properties, see Table 7 below. We observe that the choice of the scaling factor α η = 2.5 improves the efficiency of the mesh refinement strategy for the considered simulation parameters.
In Fig. 9 we display the results for Scenario 1 computed on an adaptive mesh with 10945 degrees of freedom. The displayed results indicate growth of the gradient of the control at x = 0, which indicates a discontinuity that is also captured by the mesh refinement algorithm (the tolerance for the refinement was η h = T η T ≤ 2 × 10 −5 ). Results for Scenarios 2-3 are not displayed, we observed no significant differences to the results obtained by the finite difference method.
We remark that the scheme (6.10) can be modified by replacing the L 2 -inner product with the discrete inner product Sect. 6.10; on a equidistant mesh with mesh size h it holds that supp(φ ) = 2h and the resulting finite element scheme is equivalent to the upwind finite difference approximation (4.6). Nevertheless, this generalization of the finite difference scheme turned out to be unsuitable for the use in conjunction with adaptive mesh refinement as the policy iteration in Algorithm 6.1 did not converge.

Adaptive Least-Squares Finite Element Approximation
In this section we discuss the least-squares finite element method (LSFEM) for the approximation of (6.7). We employ the following mesh dependent stopping criterion in step (1C) of Algorithm 6.1: As in the previous section we describe the application of the method in the n-th iteration of Algorithm 6.1 and denote The LSFEM method consists of finding the approximation V 1,h ≈ V Notice that we do not impose any boundary conditions, since they are in general unknown (in particular on the negative axis). The above least squares functional provides the following straightforward way for the construction of mesh refinement indicators. The minimization in (6.13) is equivalent to the minimization of the error To minimize the error V 1,h − V (n) 1 a,I h , we want to refine the mesh in regions where this error is large. This motivates the evaluation of the residual on each T ∈ T h , leading to the refinement indicators We use this built-in error control to drive adaptive mesh refinements with Dörfler marking strategy (cf. [26]) and bulk parameter = 0.3.
A downside of the LSFEM is that it is not a monotone scheme, in general. Thus the solution V 1,h might oscillate, which results in difficulties in the approximation of the control I h . To stabilize the policy improvement in Algorithm 6.1, our numerical experiments suggested a damping, that is, given some damping parameter ρ ∈ (0, 1) and the operator P h in (6.11), we replace the policy improvement (6.8) by the damped version Throughout all numerical experiments, we use the damping ρ = 1/2. Nevertheless, there are still tiny oscillations of the control near the origin x = 0, causing difficulties in attaining the stopping criterion (6.12). To circumvent this difficulty, we had to replace the L ∞ norm by the L 2 norm.
We terminate our computations if the number k of policy iterations in Algorithm 6.1 exceeds fifteen or the number of degrees of freedom ndof = dim V h exceeds half a million. All experiments utilize the parameters in Tables 1, 2 and the approximations of V 2 already used in Sects. 4 and 6.2.1. The initial grid T 0 is a uniform partition of the domain into 10 subintervals. We refine the mesh either adaptively or uniformly. The computations utilize the open source tool for solving partial differential equations FEniCS [24].

Numerical Experiments
In this section we examine the performance of the respective adaptive finite element schemes proposed in Sect. 6.2.1, 6.2.2 for the computation of V 1 and the control I . The approximation of V 2 is constructed using the (uniform) finite element method from Sect. 6.1 as discussed in Sect. 6.2.1. We compare our results with the finite difference scheme introduced in Sect. 4 which also provides the reference solution computed on a grid with M = 2 20 ≈ 10 6 gridpoints. Table 7  It turns out that the adaptive schemes allow for significantly better approximations compared to the (uniformly refined) finite differences scheme, in particular on the coarse grid. This observation underlines the need of adaptive schemes in higher dimensions, where we have to solve such models on coarse grids due to limitations in our computational resources. We do not experience difficulties with the adaptive stabilized FEM in the sense that the policy iterations converge after a finite number of iterations.
The adaptive LSFEM performs well in Scenario 2 and 3, but struggles in Scenario 1: the policy iterates I (n) h seem to oscillate between several states (cf. Fig. 10) and so do not converge after ndof exceeds 10 4 . As discussed previously, the failure of convergence of the algorithm is related to the discontinuity in the exact control I (cf. Fig. 9) at the origin x = 0, which induces oscillation in the computed approximation V 1,h and and in the control I h . We overcome the failure of the adaptive LSFEM in Scenario 1 by a domain decomposition approach similar to the one used in the collocation method. More precisely, we use the scheme to solve the problem on the domains l = (−10, 0) and r = (0, 10) separately with given boundary data for V 1,h in the origin x = 0. With this splitting, the results obtained by adaptive LSFEM are comparable to those obtained by the FDM and stabilized FEM in Scenario 1, cf. the results in Table 7.
In Fig. 11 we display the mesh size distribution for adaptive meshes with ndof ≈ 1000 for the considered adaptive finite element methods. We observe that, up to the boundary, the meshes exhibit similar structure; in particular both algorithms exhibit similar behavior near x = 0. The smaller mesh size near the boundary for the stabilized FEM method is due to the boundary effects and has minor effect on the overall accuracy. The efficiency of the scheme can be increased by a suitable modification of the refinement indicator near the boundary, but we do not pursue this approach for simplicity.

Conclusion
In general, the computed numerical approximations were similar for all considered numerical schemes in all the considered scenarios, at least for certain choices of simulation parameters.

Table 7
Errors of the stabilized adaptive finite element scheme, the finite differences scheme, and the adaptive least-squares scheme Nevertheless, we made some interesting observations (in particular in the solution for m = 1) that are summarized below. We believe that these observations can be used as a benchmark for the evaluation of the robustness of numerical schemes for Hamilton-Jacobi-Bellman equations.

Convergence of the policy iteration
The convergence of the policy iteration proved to be the most challenging aspect of numerical approximation of (1.1). The non-monotone numerical schemes, i.e., the collocation and LSFEM methods, required a splitting/domain decomposition approach in order to obtain the convergence of the policy iteration algorithm in Scenario 1, where the control is discontinuous at x = 0. In addition, without a good initial guess, the collocation method required the use of the transformation (3.1), (3.2) to ensure convergence. The solution computed with the state state transformation can be used as a preconditioner to obtain an accurate solution on larger domains without transformation. The divergence can be explained by the fact that the policy iteration can be interpreted as a nonlinear fixed-point iteration (cf. [5]) and, in the case of discontinuous control, the corresponding nonlinearity is not Lipschitz continuous. Nevertheless, a rigorous convergence analysis of the policy iteration in the present setting is still open. We note that the application of the splitting/domain decomposition approach may be difficult for more general problems (e.g., in higher dimensions) where the location of the discontinuity is not known a priori. The monotone schemes proved to be more robust in this respect, in particular when a suitable discrete approximation of the optimal control is employed. For instance in the case of the finite element approximation in Scenario 1, a suitable nodal interpolation of the (piecewise constant) gradient of the numerical solution in the formula for the discrete optimal control was essential in order to ensure robust convergence of the nonlinear solver. We observed that the number of policy iterations is independent of the mesh size, nevertheless the convergence of the policy iteration can be affected by "too harsh" mesh refinement, i.e., on meshes with too sharp differences in the sizes of the neighboring elements. On the other hand, suitable mesh refinement may improve the convergence of the policy iteration, in particular in the case of LSFEM method. We note that an additional benefit of the mesh refinement algorithm is that the iterations counts decrease in the final stages of the mesh refinement loop, i.e., once the mesh is close to the optimal one. This can be attributed to the fact that the final refinement steps typically only introduce minor changes in the control. In the case of the WENO scheme, the convergence of the policy iteration combined with the time-stepping algorithm was robust, but the algorithm required a high number of iterations to reach the prescribed tolerances.

WENO scheme
The algorithm produced very similar results as the monotone finite difference and finite element schemes. The disadvantage of the method is the high iteration count of the policy iteration, nevertheless compared to the LSFEM and the collocation method, the convergence of the policy iteration was robust. Suitable preconditioning strategies can improve the efficiency of the algorithm.
Boundary/transformation effects Except for the collocation method, we observed no major differences between the computed approximation of V 1 by the respective numerical schemes. On larger domains the collocation method becomes inaccurate due to the transformation (3.1), (3.2) which is required for the convergence of the policy iteration. The transformation causes the collocation points to be concentrated near x = 0 and the approximation becomes increasingly inaccurate on large domains; a more advanced adaptive collocation strategies could improve the accuracy of the scheme, cf. [4,35]. Without the transformation the collocation method produces accurate results on large domains as well, but it requires preconditioning to ensure the convergence of the policy iteration algorithm. Compared to other methods, the adaptive LSFEM methods exhibited slightly faster convergence behavior w.r.t. the mesh size, nevertheless due to the fact that the value function V 1 is smooth (i.e., the solution does not exhibit any shocks) the differences were not significant. For x close to the origin, the computed optimal controls also exhibited good agreement for all methods, in particular when adaptivity is employed. Nevertheless, near the left boundary the three methods produce very different numerical results. This can be explained by the different approximation of the infinite spatial domain as well as the different choice of the (artificial) boundary conditions. With the exception of the collocation method, all considered numerical schemes exhibit similar behavior for increasing sizes of the (bounded) spatial domain, i.e., the effects of the artificial boundary conditions can be mitigated by the choice of a sufficiently large spatial domain for numerical computations.
Approximation of V 2 The numerical solution of V 2 enters as a parameter in the numerical approximation of V 1 and hence the quality of the approximation of V 2 has some influence on the numerical approximation of V 1 . The numerical approximation V 2 computed by collocation method contained minor oscillations, this effect can be attributed to the lack of numerical stabilization in the collocation scheme as well as the aforementioned domain transformation.