High-order combined Multi-step Scheme for solving forward Backward Stochastic Differential Equations

In this work, in order to obtain higher-order schemes for solving forward backward stochastic differential equations, we adopt the high-order multi-step method in [W. Zhao, Y. Fu and T. Zhou, SIAM J. Sci. Comput., 36(4) (2014), pp.A1731-A1751] by combining multi-steps. Two reference ordinary differential equations containing the conditional expectations and their derivatives are derived from the backward component. These derivatives are approximated by finite difference methods with multi-step combinations. The resulting scheme is a semi-discretization in the time direction involving conditional expectations, which are solved by using the Gaussian quadrature rules and polynomial interpolations on the spatial grids. Our new proposed multi-step scheme allows for higher convergence rate up to ninth order, and are more efficient. Finally, we provide a numerical illustration of the convergence of the proposed method.


Introduction
Recently, the forward-backward stochastic differential equation (FBSDE) becomes an important tool for formulating many problems in various areas including physics and financial mathematics. We are interested in the numerical approximation of the general FBSDEs on a filtered complete probability space (Ω, F, P ) with the natural filtration (F t ) 0≤t≤T , where a : [0, T ] × R n × R m × R m×d → R n and b : [0, T ] × R n × R m × R m×d → R n×d , are drift and diffusion coefficients in the forward component, respectively; W t = (W 1 t , · · · , W d t ) T is a d-dimensional Brownian motion (all Brownian motions are independent with each other); f (t, X t , Y t , Z t ) : [0, T ] × R n × R m × R m×d → R m is the driver function and ξ is the squareintegrable terminal condition. We see that the terminal condition Y T depends on final value of the forward component. Note that a, b and f are all F t -adapted, and a triple (X t , Y t , Z t ) is called In the next section, we start with preliminaries on FBSDEs and derive in Section 3 the approximations of derivatives using finite difference method with combined multi-steps. In Section 4, we derive the reference ODEs, based on which the semi-discrete higher-order multi-step schemes are introduced for solving decoupled FBSDEs. Section 5 is devoted to the fully discrete higher-order schemes. In Section 6, these methods are extended to solve a coupled FBSDE. In Section 7, several numerical experiments on the decoupled and coupled FBSDEs including two-dimensional applications are provided to show the higher efficiency and accuracy. Finally, Section 8 concludes this work.

Preliminaries
As mentioned before, throughout the paper we assume that (Ω, F, P ) is a complete, filtered probability space. A standard d-dimensional Brownian motion W t with a finite terminal time T is defined, and the first component, X t generates the filtration F t = σ{X s , 0 ≤ s ≤ t}. And the usual hypotheses should be satisfied. We denote the set of all F t -adapted and square integrable processes in R d with L 2 = L 2 (0, T ; R d ), and list following notation to be used: • | · | : the Euclidean norm in R, R n and R n×d ; • F s,x t : σ-algebra generated by the diffusion process {X r , s ≤ r ≤ t, X s = x}; • E s,x t [·] : conditional expectation under F s,x t , i.e., E s,x t [·|F s,x t ]; • C k b : the set of continuous functions with uniformly bounded derivatives up to order k; • C k 1 ,k 2 : the set of functions with continuous partial derivatives ∂ ∂t and ∂ ∂x up to k 1 and k 2 , respectively; • C L : the set of uniformly Lipschitz continuous function with respect to the spatial variables; • C 1 2 L : the subset of C L such that its element is Hölder-1 2 continuous with respect to time, with uniformly bounded Lipschitz and Hölder constants.
Let X t be a diffusion process starting at (t 0 , x 0 ) and t ∈ [t 0 , T ], which has a unique solution. Note that E x s [X t ] := E s,x s [X t ] is equal to E[X t |X s = x] for all s ≤ t with the Markov property of the diffusion process. Given a measurable function g : [0, T ] × R n → R, E x s [g(t, X t )] is a function of (t, s, x) whose partial derivative with respect to t reads provided that the limit exists and is finite.
Definition 2.1 (Generator). The generator A x t of X t satisfying (3) on a measurable function g : [0, T ] × R n → R is defined by Theorem 2.1. Let X t be the diffusion process defined by (3), then it holds The proof can be simply completed by using the Itô's lemma and the dominated convergence theorem.
Remark 2.1. From (4) one can straightforwardly deduce that which is a stochastic process.
By using the Itô's lemma and Theorem 2.1 we calculate from which we deduce Theorem (2.2) as follows.
Furthermore, one has the following identity whereX t is an approximating diffusion process defined bỹ It has been noted in [Zhao et al., 2014a] that the different approximations of (5) can be obtained by choosing differentã t 's andb t 's. One can simply e.g., chooseã(s,X s ; t 0 , x 0 ) = a(t 0 , x 0 ) and b(s,X s ; t 0 , For existence, regularity and representation of solutions of decoupled FBSDEs we refer to [Ma and Zhang, 2005, Peng, 1991, Zhang, 2001. In the following of this section we will present some of those. We denote the forward stochastic differential equation (SDE) starting from (s, x) with X s,x t and consider the decoupled FBSDEs where t ∈ [s, T ], and the superscript s,x will be omitted when the context is clear.
Throughout the paper, we shall often make use of the following standing assumptions: 1. The functions a, b ∈ C 1 b , and assume sup 0≤t≤T where the common constant L > 0 denotes all the Lipschitz constants.
2. n = d and we assume that b satisfies 3. a, b, f, g ∈ C L , and assume that where L denotes all the Lipschitz constants.
Under the above conditions, it is clear that (6) is well-posed; the resulting integrands by taking conditional expectation on both side of the backward component is continuous with respect to time; the nonlinear Feynman-Kac formula Zhang, 2005, Peng, 1991] can be given as follows.
is the unique solution to (6).

Calculation of the weights in the FDM for approximating derivative
In this section we calculate the weights in the FDM for approximating the function derivatives, e.g., du(t) dt . Let u(t) ∈ C k+1 b , k is a positive integer, and t i = i∆t, i.e., t 0 < t 1 < · · · < t k .

Combination of two time points
We consider the Taylor's expansions of u(t i ) and u(t i+1 ), where α k,i , i = 0, 1, · · · , k are real numbers. Clearly, we obtain and thus du dt by choosing Due to t i = i∆t and t i+1 = (i+ 1)∆t the conditions in (8) are equivalent to the following system: which can be solved for α k,i ∆t, i = 0, · · · , k. We refer to the algorithm proposed in [Fornberg, 1988] for those solutions. We report α k,i ∆t for k = 1, 2, · · · , 7 in Table 1, since the related multi-step schemes proposed in this paper is unstable from k = 8, which will be explained below. The multi-step schemes (combining two time points) can be constructed by  approximating the reference ODEs (see Sec. 4.1) using (7). Therefore, we consider the following ODE with the known terminal condition Y (T ) for studying stability, see also [Zhao et al., 2014a]. Applying (7) to (9) one obtain the multi-step scheme as under the uniform time partition 0 = t 0 < t 1 < · · · < t N = T. (10) is stable if the roots {λ k,j } k j=1 of the characteristic equation satisfies the following root conditions [Butcher, 2008] • |λ k,j | ≤ 1, • P ′ (λ k,j ) = 0 if |λ k,j | = 1 (simple roots).
With the α k,j in Tabel 1, 1 is the simple root of the latter characteristic function for each k, except which we list the maximum absolute values of the roots for k = 2, · · · , 8 in Table 2, from which we see that the multi-step scheme (10) is unstable for k ≥ 8. However, compared to the k 2 3 4 5 6 7 8 max (|λ k,j |) 0.5000 0.5424 0.6344 0.7438 0.8636 0.9915 1.1264 Table 2: The maximum absolute root of (11) except the simple roots multi-step scheme proposed in [Zhao et al., 2014a](unstable ≥ 7), stability for k = 7 has been achieved, i.e., 1-order higher convergence rate is obtained. Combination of more time points can be done similarly, and provide other multi-step schemes, which have different instabilities. In our investigation we find that the multi-step scheme resulted by combining four time points are stable for k ≤ 9, which is the best. Thus, we show its detailed derivation in next subsection and will consider it in the numerical experiments.

Combination of four time points
Similarly but slightly different to the multi-step scheme in Section 3.1, we need to consider the Taylor's expansions of u(t i ), u(t i+1 ), u(t i+2 ) and u(t i+3 ), i = 0, · · · , k, where α k,i , i = 0, 1, · · · , k are real numbers as well. Straightforwardly, we obtain and thus du dt which are equivalent to the following system: In Table 3 we report solutions of the latter system for k = 1, · · · , 9. Applying (13) to (9) one  obtain the multi-step scheme as whose characteristic equation reads With the α k,j in Table 3, 1 is the simple root of the latter characteristic function for each k. The maximum absolute values of the roots for k = 2, · · · , 10 expect the simple roots are listed in Table 4, also the multi-step scheme (14) is stable for k ≤ 9. We remark that the stability cannot  Table 4: The maximum absolute root of (15) except the simple roots be guaranteed for k > 9 by combining more time points, e.g., the multi-step scheme constructed by combining five time points is stable for k ≤ 8.

The semi-discrete multi-step scheme for decoupled FBSDEs
Following the idea in [Zhao et al., 2014a] we derive the semi-discrete scheme for (1) in the decoupled case. We consider the time interval [0, T ] with the following partition We denote t n+k − t n by ∆t n,k and W t n+k − W tn by ∆W n,k , i.e., ∆t tn,t = t − t n and ∆W tn,t = W t − W tn for t ≥ t n .

Two reference ODEs
Let (X t , Y t , Z t ) be the solution of the decoupled FBSDEs (1). By taking conditional expectation E x tn [·] on both sides of the backward component in (1) one obtains the integral equation As explained in Sec. 2, the integrand in the latter integral equation is continuous with respect to the time. By taking the derivative with respect to t on both sides one thus obtain the first reference ODE: Furthermore, we have By multiplying both sides of the latter equation by ∆W ⊤ tn,t and again taking the conditional expectation E x tn [·] on its both sides we obtain Similarly, we obtain the second reference ODE: by taking the derivative with respect to t ∈ [t n , T ].

The semi-discrete scheme
x), and thus define the diffusion process t=tn by Theorem 2.2. Then, we apply (13) to terms on the right hand side of both the latter equations to obtain and where α k,i are given in Table 3,R k y,n andR k z,n are truncation errors. We insert respectively (19) and (20) into (16) and (17), and obtain and with R k y,n = −R k y,n and R k z,n = −R k z,n . We denote the numerical approximations of Y t and Z t at t n by Y n and Z n , respectively. Furthermore, forā andb in (18) we chooseā(t,X tn,x t ) = a(t n , x) andb(t,X tn,x t ) = b(t n , x) for t ∈ [t n , T ]. Finally, from (21) and (22), the semi-discrete scheme can be obtained as Scheme 1. Assume that Y N T −i and Z N T −i are known for i = 0, 1, · · · , k + 2. For n = N T − k − 3, · · · , 0, X n,j , Y n = Y n (X n ) and Z n = Z n (X n ) can be solved by X n,j = X n + a(t n , X n )∆t n,j + b(t n , X n )∆W n,j , j = 1, · · · , k + 3, Remark 4.1.
1.Ȳ n+j is the value of Y n+j at the space point X n,j for j = 1, · · · , k + 3.
2. The latter implicit equation can be solved by using iterative methods, e.g., Newton's method or Picard scheme. (12) it holds [Butcher, 2008] R k y,n = O(∆t) k andR k z,n = O(∆t) k provided that L k+4 t,x u(t, x) is bounded, whereR k y,n andR k z,n are defined in (19) and (20), respectively.

By Theorem 2.2 and
4. Similar to the scheme proposed in [Zhao et al., 2014a], one can obtain high-order accurate numerical solutions for (24) and (25), although the Euler scheme is used for (23). This is the main advantages because the usage of the Euler scheme reduces dramatically the total computational complexity, and one is only interested in the solution of (24) and (25) in many applications.

The fully discrete multi-step scheme for decoupled FBSDEs
To solve (X n , Y n , Z n ) numerically, next we consider the space discretization. We define firstly the partition of the real space as where dist(x, R n h ) is the distance from x to R x h . Furthermore, for each x we define the neighbor grid set (local subset) R n h,x satisfying 2. the number of elements in R n h,x is finite and uniformly bounded.
Based on the space discretization, we can solve Y n (x) and Z n (x) for each grid point x ∈ R n h , n = N t − k − 3, · · · , 0, by Note thatȲ n+j is the value of Y n+j at the space point X n,j generated by X n,j = x + a(t n , x)∆t n,j + b(t n , x)∆W n,j , j = 1, · · · , k + 3.
However, X n,j does not belong to R n+j h . This is to say that the value of Y n+j at X n,j needs to be approximated based on the values of Y n+j on R n+j h , this can be done using a local interpolation. By LI n h,X F we denote the interpolated value of the function F at space point X ∈ R n by using the values of F only in the neighbor grid set, namely R n h,X . Including the interpolations, (26) and (27) become Furthermore, to approximate the conditional expectations in (28) and (29) we employ the Gauss-Hermite quadrature rule which is an extension of the Gaussian quadrature method for approximating the value of integrals of the form x 2 j , L is the number of used sample points, j = (j 1 , j 2 , · · · , j n ) , , a j = (a j 1 , · · · , a jn ) and roots of the Hermite polynomial H L (x) of degree L and {ω j i } L j i =1 are corresponding weights [Abramowitz and Stegun, 1972]. For a standard n-dimensional standard normal distributed random variable X we know that where R GH L is the truncation error of the Gauss-Hermite quadrature rule for g.
Now we consider the conditional expectations of the form E x tn LI n+j h,X n,j Y n+j ∆W ⊤ n,j and E x tn LI n+j h,X n,j Y n+j in (28) and (29). We know that LI n+j h,X n,j Y n+j is the interpolated value of Y n+j , which is a function of X n,j and can be represented by (Theorem 2.3) with ∆W n,j ∼ ∆t n,j N (0, I n ). Straightforwardly, we can approximate those conditional expectations as and where E x,h tn [·] denotes the approximation of E x tn [·] . Finally, by inserting these approximations into (28) and (29) we obtain Remark 5.1. 1. The estimate of R k,E y,n or R k,E z,n reads [Abramowitz and Stegun, 1972, Shen et al., 2011, Zhao et al., 2014b O L! 2 L (2L)! . when using r-degree polynomial interpolation in k-step scheme, and provided that a, b, f and g are sufficiently smooth such that L k+4 t,x u(t, x) is bounded and u(t, ·) ∈ C r+1 b , see [Abramowitz and Stegun, 1972, Burden and Faires, 2001, Butcher, 2008, Zhao et al., 2014b. 3. To balance the time discretization error R k y,n = O(∆t) k and R k z,n = O(∆t) k , one needs to control well both the interpolation and integration error mentioned in last two points. 4. For a k-step scheme we need to know the support values of Y N T −i and Z N T −i , i = 0, · · · , k +2. One can use the following three ways to deal with this problem: before running the multistep scheme, we choose a quite smaller ∆t and run one-step scheme until N T − k − 2; Alternatively, one can prepare these initial values "iteratively", namely we compute Y N T −1 and Z N T −1 based on Y N T and Z N T with k = 1, and the compute Y N T −2 and Z N T −2 based on Y N T , Y N T −1 , Z N T , Z N T −1 with k = 2 and so on; Finally, one can use the Runge-Kutta scheme proposed in [Crisan and Chassagneux, 2014] with small ∆t to initialize our proposed multi-step scheme.

For the local interpolation errors R k,LI
By removing all the error terms, from (31) and (32) we obtain our fully discrete scheme as follows.
Scheme 2. Assume that Y N T −i and Z N T −i on R N T −i h are known for i = 0, 1, · · · , k + 2. For n = N T − k − 3, · · · , 0 and x ∈ R n h , Y n = Y n (x) and Z n = Z n (x) can be solved by X n,j = x + a(t n , x)∆t n,j + b(t n , x)∆W n,j , j = 1, · · · , k + 3,

Numerical schemes for coupled FBSDEs
The authors in [Zhao et al., 2014b] extended their scheme proposed for solving decoupled FBS-DEs to the one which can solve fully coupled FBSDEs. Similarly, our Scheme 2 can be extended to solve (1) in a fully coupled case.
Remark 6.1. 1. Scheme 3 coincides with Scheme 2 if a and b do not depend on Y and Z.
2. We only assume that the coupled FBSDEs are uniquely solvable, the lacking analysis will be the task of future work.

Numerical experiments
In this section we use some numerical examples to show that our Schemes 2 and 3 can reach ninth-order convergence rate for solving FBSDEs. The uniform partitions in both time and space will be used, that is, the time interval [0, T ] will be uniformly divided into N T parts with ∆t = T N T such that t n = n∆t, n = 0, 1, · · · , N T ; the space partition is R n h = R h for all n with , · · · , ±∞ , j = 1, 2, · · · , n.
In our numerical experiments we choose the local Lagrange interpolation for LI n h,x based on the set of some neighbor grids near x, i.e., R h,x ⊂ R h such that (33) holds. Following [Zhao et al., 2014a], we set sufficiently many Gauss-Hermite quadrature points such that the quadrature error could be negligible. Note that the truncation error is defined in (12), in order to thus balance the time and space truncation error in our numerical examples, we force h r+1 = (∆t) k+1 , where r is the degree of the Lagrangian interpolation polynomials. For example, one can firstly specify a value of r, and then adjust the value of h such that h = ∆t k+1 r+1 . For the numerical results in this paper, r is set to be a value from the set {10, 11, · · · , 21} to control the errors. Furthermore, we will consider k from 3 such that at least one combination of four α k,i s is included, but until k = 9 due to the stability condition, see Scheme 2 and 3. Finally, CR and RT are used to denote the convergence rate and the running time in second, respectively. For the comparison purpose, we will directly take examples considered in [Zhao et al., 2014a]. Numerical experiment were performed in MATLAB with an Intel(R) Core(TM) i5-8350 CPU @ 1.70 GHz and 15 G RAM.
has the analytic solution Y t = sin(t + X t ), Z t = cos 2 (t + X t ).
In this coupled FBSDE, the diffusion coefficient b depends on X, Y and Z, i.e., quite general. Due to the same reasons as those explained for Example 1, we set N T = {13, 15, 17, 19, 21} in order to show the convergence rate up to ninth order. From the results listed in Table 6 one can  Table 6: Errors, running time and convergence rates for Example 2, T = 1 clearly draw same conclusions as those having been for Example 1.
Finally, we illustrate the accuracy of the proposed scheme for a two-dimensional example, which is also taken from [Zhao et al., 2014a]

Conclusion
In this work, by combining the multi-steps we have adopted the high-order multi-step method in [W. Zhao, Y. Fu and T. Zhou, SIAM J. Sci. Comput., 36(4) (2014), pp.A1731-A1751] for solving FBSDEs. First of all, our new schemes allow for higher convergence rate up to ninth order, and are more efficient. Secondly, they keep the key feature of the method in [W. Zhao, Y. Fu and T. Zhou, SIAM J. Sci. Comput. 36(4), pp.A1731-A1751], that is the numerical solution of backward component maintains the higher-order accuracy by using the Euler method to the forward component. This feature makes our schemes be promising in solving problems in practice. The effectiveness and higher-order accuracy have been confirmed by the numerical experiments. A rigorous stability analysis for the proposed schemes is the task of future work.