Effective numerical methods for simulating diffusion on a spherical surface in three dimensions

In order to construct an algorithm for homogeneous diffusive motion that lives on a sphere, we consider the equivalent process of a randomly rotating spin vector of constant length. By introducing appropriate sets of random variables based on cross products, we construct families of methods with increasing efficacy that exactly preserve the spin modulus for every realisation. This is done by exponentiating an antisymmetric matrix whose entries are these random variables that are Gaussian in the simplest case.


Introduction and background
We take a non-standard approach to diffusion on the surface of a sphere, starting with an equation for a three-component spin vector written in Langevin form: where η is a vector of independent white noises with magnitude σ . Here × denotes the cross product-see Definition 1.
Equation (1) can then be written as where W (t) = (W 1 (t), W 2 (t), W 3 (t)) T is a vector of independent Wiener processes. Given the definition of the cross product ×, this can be written as In matrix form, we can write this as the linear Itô SDE where I is the 3×3 identity matrix and Another convenient representation of (3) is dS = −σ 2 Sdt + G(S)dW (t) (5) where G(S) is the antisymmetric matrix Based on (5), we can prove the following theorem. Proof The proof is by Itô's Lemma-see for example Kloeden and Platen [1]. Consider the Itô SDE where f and G are arbitrary functions satisfying appropriate integrability conditions-see [1] for details. Suppose u = h(X) ∈ R, where h has continuous first-and second-order partial derivatives. Then Itô's Lemma states du = (∇h(X)) f (X)+ 1 2 Tr(G(X)G (X)∇[∇h(X)] dt+(∇h(X)) G(X)dW (t), where ∇[∇h(X)] is the matrix of second-order spatial derivatives of h. Now when and G(S) is given by (6).
As a consequence of Theorem 1, the vector S(t) lives on the unit sphere of radius 1 for all time.
In this paper, we will construct different classes of numerical methods that preserve ||S(t)|| 2 2 . The starting point is the Stratonovich form of (3) namely where the A i are as in (4). This equation is linear, but non-commutative, and we can write the solution as a Magnus expansion [2]: in terms of iterated commutators of the A i and stochastic Stratonovich integrals with respect to multiple Wiener processes. Section 2 reviews the Magnus expansion in the general setting, but we also show that for (8) (t) can be represented as an antisymmetric matrix where the ξ i (σ t) are continuous random variables that are to be constructed. Given (9) then by (8) , and so this construction is norm-preserving. We also show in Section 3 that a stepwise implementation by, for example, the Euler-Maruyama method is not norm-preserving.
In Sections 3 and 4, we show how to construct the ξ i (σ t) based on an expansion of a weighted sum of increasing numbers of appropriate cross products. In Section 5, we estimate these weights based on the following idea: as a particle wanders randomly on the unit sphere, the steady-state distribution at z = cos θ is uniform as the curvature near the pole balances the girth near the equator. We can therefore write down an Itô SDE (2) for z(t) (= S 3 (t)), namely This satisfies We will use these weak forms to compare with S 3 (t) derived from (8) and (9). In Section 6, we give some results and discussions, and in Section 7 give conclusions on the novelty of this work. Finally, we note that the problem of a particle diffusing on a sphere has been studied in a number of settings. Yosida [3] in 1949 considered motion on a threedimensional sphere by solving a certain parabolic partial differential equation in which the generating function of the right-hand side operator can be determined explicitly and is the Laplacian operator in polar co-ordinates. Brillinger [4] looked at this problem in terms of expected travel time to a cap. In a slightly different setting, a number of variants of walks on N-spheres have been constructed for solving the N-dimensional Dirichlet problem. Muller [5] constructed N-dimensional spherical processes through an iterative process extending the ideas of Kakutani [6] who used the exit locations of Brownian motion. Other approaches were introduced in [7,8]. More recently, Yang et al. [9] showed how a constant-potential, time-independent Schrödinger equation can be solved by a classical walk-on-spheres approach.

The Magnus method
The form of the Magnus expansion of the solution for arbitrary matrices A 1 , A 2 , and A 3 was given in [2], as in Lemma 1.

Lemma 1
with Stratonovich integrals In fact for any positive integer p, the σ p term in the expansion will include iterated commutators of order p that are summed over p summations and multiplied by complicated expressions involving Stratonovich integrals over p Wiener processes.

Theorem 2 With the A i as in (4), then (t) is the anti-symmetric matrix
Proof Given (4), then This means that all high-order commutators of any order p will collapse down to one of A 1 , A 2 , or A 3 . To illustrate this up to σ 3 , we apply (13) to the expansion in Lemma 1. This gives Here, we have dropped the dependence on t for ease of notation. Clearly the form for (t) is as in (12).

Remarks
• The ξ i (σ t) are complicated expansions in σ of high-order Stratonovich integrals. However, these are extremely computationally intensive to simulate [1]. Instead, we will approximate them as continuous stochastic processes in some weak sense-see (26). • Clearly, the simplest approximation to the ξ(σ t) is to take where W (t) is a three-vector of independent Wiener processes. This idea will be the basis of our first algorithm presented in Section 3.

Stepwise implementations
Before presenting our first method, we show that the Euler-Maruyama method is an inappropriate method in that it does not preserve ||S(t)|| 2 2 , the spin norm. In fact, the mean drifts and the distribution of values grow rapidly wider with increasing time. An improved algorithm (without Itô correction) can narrow the distribution of values of the norm but will still have a mean that drifts. To see the behaviour of the EM method applied to (3), we have where, as before, Note N 1k , N 2k , N 3k , k = 1, · · · , m are independent Normal random variables with mean 0 and variance 1. Now Thus and so Similarly, Thus, the spin modulus is not conserved and the mean error grows linearly with t. More importantly, the variance can be very large so that if the procedure described above is repeated numerous times, the standard deviation of the ensemble of values of ||S(T )|| 2 obtained is proportional to √ T . In fact, the probability density function of log(||S(t)|| 2 ) is Gaussian. This means that, while more than half of the values of ||S(t)|| 2 obtained will be less than 1, rare large values of ||S(t)|| 2 dominate the statistics.
In order to construct a simple method that preserves the spin norm, it will be based on (8), (9), and (14), which leads to where in the first instance we take Our construction is based on Rodrigues' formula [10]. Let Then Hence from (16) and (17) Now let T = mh; then we can writê This allows us to write a step-by-step method where N k is given in (15), and hence a step-by-step method is, from (18), Note that this step-by-step method will only be strong order 0.5.

A class of Magnus-type methods
A stepwise approach, as constructed previously, will not yield a method that has more than strong order 0.5 and weak order 1 so we will attempt to approximate the ξ i (σ t) to obtain a better weak order approximation. We will first consider the behaviour of the composition of the Magnus operator over two half steps and require this to be the same as the Magnus approximation over a full step up to some power of the stepsize h. This will give us a clue as to how to choose the ξ j (t). In order to simplify the discussion, we will, wolog, take σ = 1. LetĀ ξ denote the matrixĀ Suppose on the two half steps, we assume that the random variables behave aŝ and on the full step , where N 1 , N 2 , N and P 1 , P 2 , P are 3 vectors of independent random variables that are to be determined in some manner. Furthermore, the matrices generated by these vectors through (19) will be denoted byN 1 ,N 2 ,P 1 ,P 2 .
So from (12) and (16) and setting the composition over two half steps to be equal to the Magnus operator up to the h term implies andP . Hence from (20) and after some simple algebrā Now withN 1 andN 2 generated by the vectors N 1 and N 2 , via (19) it is easy to show that [N 2 ,N 1 ] generates a matrix of the form (19) in which the corresponding vector ξ that generatesĀ ξ is N 1 × N 2 , where the cross product is given through the following definition.

Definition 1 Given vectors
Consequences of Definition 1 are the following well-known results: Lemma 2 Given two three-vectors B and D, the following results on cross products hold.
Proof Trivial use of Definition 1.

Thus, in the vector setting, (20) and (21) and Lemma 2 give
Equation (22) suggests that we take N 1 and N 2 to be independent N(0, 1) 3-vectors, so that N is also a 3-vector with independent N(0, 1) components.
Furthermore, if we let u 1 , u 2 , v 1 and v 2 be independent N(0, 1) 3-vectors and we take then from (23) and Lemma 2 we have Hence N, N 1 , and N 2 have the same distributions as do P , P 1 , and P 2 , respectively.
Continuing this line of thought, this suggests that we base our choice of the ξ(t) on a cross product formulation. Thus, we will take for ξ(t) the expansion where the d j are chosen appropriately and We can choose any positive integer value for r in (24). But we will see in Section 5 when we attempt to estimate the d j that they become overly sensitive for values of r > 5, and so we will take a specific value of r, namely r = 5.
This will lead to methods that we will denote by M(1, d 2 , d 3 , d 4 , d 5 ). For clarity, we give the form of the ξ(t): We will show in Section 5 how to calibrate the parameters d 2 , d 3 , d 4 , d 5 appropriately, in order to get good performance. Note if we wish to simulate ξ(t) at some time point t = T , then we generate an equidistant time mesh with stepsize h = T m . We then simulate two sequences of vectors of length m consisting of independent N(0, 1)-3 vectors: G 1i , G 2i , i = 1, · · · , m. We then approximate and generate ξ(T ) by using (26) and the definition of the cross product and related results in Lemma 1.

Model calibration
As a particle wanders randomly on a sphere, the steady-state distribution at latitude, z = cos θ, is uniform as the curvature near the poles balances the greater girth near the equator (see also [4])-by symmetry, the same is true of x and y; see Fig. 1.
A plot of this error in (33) is given in Fig. 2. We see that (32) is only accurate for modest values of time. So this is a word of caution in using a truncated error estimate for too large a value of t.
We will now consider the behaviour of the general class of methods given by M (1, d 2 , d 3 , d 4 , d 5 ) in terms of (30) where ξ(t) is given in (26). It will prove too difficult to get analytical results for the error in (33) so we will have to use a truncated error estimate. First, we will expandz(t) in (30) up to and including the t 4 term. It can be shown with some simple expansions that In order to calculate these expectations, we note the following Lemma, where the product of vectors is considered component-wise.
Proof Without loss of generality we will assume p ≥ q = p − r and consider two cases: r = 2k + 1 and r = 2k, (k = 0, 1, 2, · · · ). Let us consider the odd case first. Now

It is easy to show by induction that
Now, by definition of the cross product, the i th component of J 2 × A p−(2k+1) does not have a corresponding component from A p−(2k+1) and since the powers of J 2 appearing in (34) are odd, then Now let us consider the even case.
Similarly to the odd case, then by induction, for k = 1, 2, · · · Clearly, in each of the 3 components of the vectors on the right-hand side, there will be terms that have even powers in J 2 and A p−2j and the power of t will behave as k + p − 2k = p − k. Furthermore, each of the 3 components will have the same form. Hence E(A p · A p−2k ) = C p,k t p−k e, as required.
Some algebra and calculations of moments allow us to writē where c 2 , c 3 , c 4 can be considered to be the error terms when comparingz(t) to e −t . It can be shown that These results hold true for any component of ξ , i = 1, 2, or 3. Some of the expectations in (36) have already been calculated, but we now show the analysis in Lemma 4 for some of the more complicated terms in (36).

Lemma 4 For any
Proof We will drop the dependence on t for ease of notation.
(i) As a consequence of Lemma 2 and (34), Take any component of the vectors, say the first component, then Using results on expectation of normals (ii) From (34) and Lemma 2 stays on the surface of the sphere. However, this approach says nothing about the accuracy of the trajectories on the surface. The novelty of this work is that we construct the continuous random variables ξ 1 (t), ξ 2 (t), and ξ 3 (t) that guarantee that the trajectories lie on the surface but also give good accuracy in a weak sense. This is done by considering a one-dimensional model (27) in which we note that the steady-state distribution of the third variable at z = cos θ is uniform as the curvature near the pole balances the girth near the equator. From (27), we can get exact formulations for the first and second moments.
The additional novelty is that we now construct the ξ j (t) in terms of a linear combination of iterated cross products (see (26)). We then find the weights d j by comparing the Magnus solution with the above moments. This results in a family of methods with very small weak errors. We describe these methods to be effective in the sense of the above characterisation. This is important for making sure that the paths on the surface of the sphere are highly accurate. It turns out that method M 3 is the simplest and the most robust of the methods constructed. The final aspect of innovation is that these ideas can be extended to diffusion on higher dimensional spherical surfaces [11] and we hope to do this in a following paper.

Author contribution All the authors contributed equally to this work
Funding Open Access funding enabled and organized by CAUL and its Member Institutions. The firstnamed author received funding from the Australian Centre of Excellence (ACEMS) through grant number CE-140100049.
Data availability Data sharing not applicable to this article as no datasets were generated or analyzed during the current study.

Conflict of interest The authors declare no competing interests.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.