1 Introduction

In recent decades, more and more interrelations in mechanics and finance have been modeled by stochastic differential equations (SDEs) on Lie groups. A clear trend is that kinematic models, which were previously expressed by ordinary differential equations (ODEs), are now extended by terms involving stochastic processes in order to account for random perturbations. Examples can be found in the modeling of rigid bodies such as satellites, vehicles and robots [5, 6, 16, 31]. Furthermore, SDEs on Lie groups are also considered in the estimation of object motion from a sequence of projections [28] and in the representation of the precessional motion of magnetization in a solid [1].

In financial mathematics, the consideration of stochastic processes is essential, and SDEs have been solved numerically for many years, but usually not on Lie groups. However, using Lie groups to solve existing financial models or to create new ones could be of central importance for dealing with geometric constraints. Such constraints arise, e.g., in the form of a positivity constraint on interest rates [14, 25, 30] or a symmetry and positivity constraint on covariance and correlation matrices [22], which are important in, for example, risk management and portfolio optimization.

Despite these diverse applications, the available literature on analysis and numerical methods for SDEs on Lie groups is limited, in contrast to the extensive literature on ODEs on Lie groups (for example [4, 7, 9, 11, 23, 24]). Furthermore, the available literature on Lie group SDEs mainly concerns Stratonovich SDEs [1, 3, 8, 16, 21, 30]. Readers interested in Itô SDEs on Lie groups will only find the geometric Euler–Maruyama scheme of strong order \(\gamma =1\) in [18, 19, 26] and, more recently, the existence and convergence proof of the stochastic Magnus expansion in [12]. However, the consideration of Itô SDEs is crucial for applications in finance.

Our contribution to this field of research is a general procedure for setting up structure-preserving schemes of higher strong order for Itô SDEs on matrix Lie groups. Based on the Magnus expansion, we apply Itô–Taylor schemes or stochastic Runge–Kutta (SRK) schemes to solve a corresponding SDE in the Lie algebra. Using an SRK method can be interpreted as a stochastic version of the Runge–Kutta–Munthe–Kaas (RKMK) methods; if our considered Itô SDEs were transformed to Stratonovich SDEs, this approach would be equivalent to the stochastic Munthe–Kaas schemes presented in [16]. Nevertheless, a proof of convergence for this method has been missing so far. To close this gap, we derive a condition under which the stochastic RKMK scheme inherits the strong convergence order \(\gamma \) of the SRK method applied in the Lie algebra.

The remainder of the paper is organized as follows. In Sect. 2 we introduce matrix Lie groups, their corresponding Lie algebras and the linear Itô matrix SDE which we consider in this geometric setting. In Sect. 3 we take a closer look at how SDEs on Lie groups can be solved numerically and present our higher strong order methods. Then we provide some numerical and application examples in Sect. 4. A conclusion of our results is given in Sect. 5.

2 SDEs on matrix Lie groups

A Lie group G is a differentiable manifold that is also a group, with a differentiable product \(G\times G\rightarrow G\). Matrix Lie groups are Lie groups that are also subgroups of GL(n) for \(n\in {\mathbb {N}}\). The tangent space at the identity of a matrix Lie group G is called its Lie algebra \({\mathfrak {g}}\). The Lie algebra is closed under the Lie bracket \([\cdot ,\cdot ]\) (also called the commutator) of its elements. For further details on Lie groups and Lie algebras we refer the interested reader to [10].

The matrix exponential, \(\exp (\varOmega )=\sum _{k=0}^{\infty }\frac{1}{k!}\varOmega ^k\), serves as a map from the Lie algebra to the corresponding Lie group, \(\exp :{\mathfrak {g}}\rightarrow G\), and is a local diffeomorphism near \(\varOmega =0_{n\times n}\). Its directional derivative in the direction of an arbitrary matrix \(H\in {\mathfrak {g}}\) is given by

$$\begin{aligned} \left( \frac{d}{d\varOmega } \exp (\varOmega )\right) H = \bigl (d\exp _{\varOmega }(H)\bigr )\exp (\varOmega ) =\exp (\varOmega )\left( d\exp _{-\varOmega }(H)\right) \end{aligned}$$

where

$$\begin{aligned} d\exp _{-\varOmega }(H) = \sum _{k=0}^{\infty }\frac{1}{(k+1)!}\mathrm{ad}_{-\varOmega }^k(H), \end{aligned}$$
(2.1)

(see [9, p. 83]). By \(\mathrm{ad}_{\varOmega }:{\mathfrak {g}}\rightarrow {\mathfrak {g}}\), \(\mathrm{ad}_{\varOmega }(H)=[\varOmega ,H]= \varOmega H - H\varOmega \) we denote the adjoint operator, which is applied iteratively:

$$\begin{aligned} \mathrm{ad}_{\varOmega }^0(H) =H,\quad \mathrm{ad}_{\varOmega }^k(H)=\bigl [\varOmega , \mathrm{ad}_{\varOmega }^{k-1}(H)\bigr ]=\mathrm{ad}_{\varOmega }\bigl (\mathrm{ad}_{\varOmega }^{k-1}(H)\bigr ), \quad k\ge 1. \end{aligned}$$

The inverse of \(d\exp _{-\varOmega }\) is given in the following Lemma [9, p. 84].

Lemma 2.1

(Baker, 1905) If the eigenvalues of the linear operator \(\mathrm{ad}_{\varOmega }\) are different from \(2\ell \pi i\) with \(\ell \in \{\pm 1,\pm 2,\ldots \}\), then \(d\exp _{-\varOmega }\) is invertible. Furthermore, we have for \(\Vert \varOmega \Vert <\pi \) that

$$\begin{aligned} d\exp _{-\varOmega }^{-1}(H) = \sum _{k=0}^{\infty } \frac{B_k}{k!}\mathrm{ad}_{-\varOmega }^k(H), \end{aligned}$$
(2.2)

where \(B_k\) are the Bernoulli numbers, defined by \(\sum _{k=0}^{\infty }(B_k/k!)x^k=x/(e^x-1)\).

We recall that the first three Bernoulli numbers are given by \(B_0=1\), \(B_1=-\frac{1}{2}\), \(B_2=\frac{1}{6}\) and that \(B_{2m+1}=0\) holds for \(m\in {\mathbb {N}}\).
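Both the truncated series (2.2) and the Bernoulli numbers are straightforward to implement. The following Python sketch (our own illustration, not part of the original development; the function names are ours) computes \(B_0,\ldots ,B_q\) by the standard recurrence and evaluates \(\sum _{k=0}^{q}(B_k/k!)\,\mathrm{ad}_{-\varOmega }^k(H)\); it will be reused when we discuss the truncation index in Sect. 3.

```python
from fractions import Fraction
from math import comb

def bernoulli_numbers(q):
    """B_0, ..., B_q with the convention B_1 = -1/2, from the
    recurrence sum_{k=0}^{m} C(m+1, k) B_k = 0 for m >= 1."""
    B = [Fraction(1)]
    for m in range(1, q + 1):
        B.append(-sum(Fraction(comb(m + 1, k)) * B[k] for k in range(m)) / (m + 1))
    return [float(b) for b in B]

def ad(Omega, H):
    """Adjoint operator ad_Omega(H) = [Omega, H] = Omega H - H Omega."""
    return Omega @ H - H @ Omega

def dexpinv(Omega, H, q):
    """Truncation of the series (2.2) after index q (numpy arrays expected)."""
    B = bernoulli_numbers(q)
    term, fact = H.copy(), 1.0
    result = B[0] * term                  # k = 0 contributes H itself
    for k in range(1, q + 1):
        term = ad(-Omega, term)           # ad_{-Omega}^k(H)
        fact *= k
        result = result + (B[k] / fact) * term
    return result
```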

Let \((\varTheta ,{\mathscr {F}},{\mathbb {P}})\) be a complete probability space and let \(W_t=(W_t^1,W_t^2,\ldots ,W_t^d)\) be a d-dimensional standard Brownian motion w.r.t. a filtration \({\mathscr {F}}_t\) for \(t\ge 0\) which satisfies the usual conditions. On a matrix Lie group G we consider the linear matrix-valued Itô SDE

$$\begin{aligned} dQ_t = Q_t K_t\,dt + Q_t \sum _{i=1}^d V_{i,t}\,dW_{t}^{i}, \quad Q_0=I_{n\times n}, \end{aligned}$$
(2.3)

where \(K_t, V_{1,t},\ldots , V_{d,t} \in {\mathbb {R}}^{n \times n}\) are given coefficient matrices and \(I_{n\times n}\) is the n-dimensional identity matrix. In general, there exists no closed form solution to (2.3). However, a solution can be defined via a Magnus expansion \(Q_t=Q_0\exp (\varOmega _t)\) (see [15, 19, 24]), where \(\varOmega _t\in {\mathfrak {g}}\) obeys the following matrix SDE

$$\begin{aligned} d\varOmega _t = A(\varOmega _t)\, dt + \sum _{i=1}^d\varGamma _i(\varOmega _t) \,dW_t^{i}, \quad \varOmega _0 = 0_{n\times n}. \end{aligned}$$
(2.4)

The drift and diffusion coefficients, \(A,\varGamma _i:{\mathfrak {g}}\rightarrow {\mathfrak {g}}\), are given by

$$\begin{aligned} A(\varOmega _t) = d\exp _{-\varOmega _t}^{-1} \left( K_t - \frac{1}{2}\sum _{i=1}^d\left( V_{i,t}^2 + C_i(\varOmega _t)\right) \right) ,\quad \varGamma _i(\varOmega _t) = d\exp _{-\varOmega _t}^{-1}(V_{i,t}) \end{aligned}$$
(2.5)

with \(C_i:{\mathfrak {g}}\rightarrow {\mathfrak {g}}\),

$$\begin{aligned} C_i(\varOmega _t) = \left( \frac{d}{d\varOmega _t}d\exp _{-\varOmega _t}\bigl (\varGamma _i(\varOmega _t)\bigr )\right) \,\varGamma _i(\varOmega _t), \end{aligned}$$
(2.6)

which can be specified by

$$\begin{aligned} C_i(\varOmega _t) = \sum _{p=0}^{\infty } \sum _{q=0}^{\infty } \frac{1}{(p+q+2)}\frac{(-1)^p}{p!(q+1)!} \mathrm{ad}_{\varOmega _t}^p\Bigl (\mathrm{ad}_{\varGamma _i(\varOmega _t)}\bigl (\mathrm{ad}_{\varOmega _t}^q\bigl (\varGamma _i(\varOmega _t)\bigr )\bigr )\Bigr ), \end{aligned}$$
(2.7)

for \(i=1,\ldots ,d\). We refer to [19] for the proof.

2.1 The Cayley map as local parameterization

Given all these infinite series related to the matrix exponential, the question arises whether there is another mapping \(\psi :{\mathfrak {g}} \rightarrow G\) that is also a local diffeomorphism near 0 but does not require the evaluation of an infinite number of summands. In the case of a quadratic Lie group the answer is yes: there is such a mapping, namely the Cayley transformation

$$\begin{aligned} \mathrm{cay}(\varOmega ) = (I - \varOmega )^{-1}(I + \varOmega ). \end{aligned}$$

A quadratic Lie group G is a set of matrices Q that fulfill the equation \(Q^{\top }PQ = P\) for a given constant matrix P. For the derivative of \(\mathrm{cay}(\varOmega )\) we have

$$\begin{aligned} \left( \frac{d}{d\varOmega } \mathrm{cay}(\varOmega )\right) H = \bigl (d\mathrm{cay}_{\varOmega }(H)\bigr )\mathrm{cay}(\varOmega ) =\mathrm{cay}(\varOmega )\left( d\mathrm{cay}_{-\varOmega }(H)\right) . \end{aligned}$$
(2.8)

The expression analogous to (2.1) is

$$\begin{aligned} d\mathrm{cay}_{-\varOmega }(H) = 2(I + \varOmega )^{-1} H (I-\varOmega )^{-1} \end{aligned}$$

with the inverse given by

$$\begin{aligned} d\mathrm{cay}_{-\varOmega }^{-1}(H) = \frac{1}{2}(I + \varOmega ) H (I-\varOmega ), \end{aligned}$$
(2.9)

see [9, p. 128].
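In code, the Cayley map and the inverse derivative (2.9) reduce to one linear solve and two matrix products. A minimal sketch (our naming):

```python
import numpy as np

def cay(Omega):
    """Cayley map (I - Omega)^{-1} (I + Omega)."""
    I = np.eye(Omega.shape[0])
    return np.linalg.solve(I - Omega, I + Omega)

def dcayinv(Omega, H):
    """Inverse derivative map of eq. (2.9): 1/2 (I + Omega) H (I - Omega)."""
    I = np.eye(Omega.shape[0])
    return 0.5 * (I + Omega) @ H @ (I - Omega)
```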

Using the Cayley map instead of the matrix exponential as a local parameterization, the coefficients of (2.4) are

$$\begin{aligned} A(\varOmega _t) = d\mathrm{cay}_{-\varOmega _t}^{-1} \left( K_t - \frac{1}{2}\sum _{i=1}^d\left( V_{i,t}^2 + C_i(\varOmega _t)\right) \right) ,\quad \varGamma _i(\varOmega _t) = d\mathrm{cay}_{-\varOmega _t}^{-1}(V_{i,t}), \end{aligned}$$
$$\begin{aligned} C_i(\varOmega _t) = \left( \frac{d}{d\varOmega _t}d\mathrm{cay}_{-\varOmega _t}\bigl (\varGamma _i(\varOmega _t)\bigr )\right) \,\varGamma _i(\varOmega _t) = V_{i,t}\varOmega _t V_{i,t} \end{aligned}$$
(2.10)

for \(i=1,\ldots ,d\) (see Appendix for proof).

2.2 Example: SDEs on SO(n)

As an example of a matrix Lie group, we take a closer look at the special orthogonal group

$$\begin{aligned} \mathrm {SO}(n)=\{X \in \mathrm {GL}(n) : X^{\top }X=I, \, \det (X)=1\} \end{aligned}$$

which is a quadratic Lie group, such that the Cayley map is also applicable as a local parameterization. The corresponding Lie algebra consists of skew-symmetric matrices:

$$\begin{aligned} \mathfrak {so}(n) = \{Y \in {\mathbb {R}}^{n\times n} : Y + Y^\top = 0\}. \end{aligned}$$

Since we are interested in structure preservation we need conditions that tell us when the solution of an SDE on SO(n) is kept on the manifold.

Theorem 2.1

Let \(Q_t\) be the solution of (2.3). Then \(Q_t \in \) SO(n) for all \(t\ge 0\) if and only if the coefficient matrices satisfy \(V_{i,t} \in \mathfrak {so}(n)\) for \(i=1,\ldots ,d\) and \(K_t+K_t^{\top }=\sum _{i=1}^d V_{i,t}^2\).

For the proof of this theorem we refer to [19].

3 Numerical methods for SDEs on Lie groups

Applying standard numerical methods for SDEs directly to the linear matrix-valued Itô SDE (2.3) results in a drift-off, i.e. the numerical approximations do not stay on the manifold. Consequently, one needs special numerical methods that preserve the geometric properties of the Lie group G.

Since the Lie algebra \({\mathfrak {g}}\) is a linear space, it is natural to compute numerical approximations of the matrix SDE (2.4) in \({\mathfrak {g}}\) and to map the result back onto the Lie group G.

A simple scheme based on the Runge–Kutta–Munthe–Kaas schemes for ODEs [23] that puts the described approach into practice can be found in [18] and is presented in the following algorithm.

Algorithm 3.1

Divide the time interval \([0,T]\) uniformly into J subintervals \([t_j, t_{j+1} ]\), \(j=0,1,\ldots ,J-1\) and define the time step \(\varDelta = t_{j+1} - t_j\). Let \(Q_t=Q_0\psi (\varOmega _t)\) with \(\psi :{\mathfrak {g}}\rightarrow G\) be a local parameterization of the Lie group G. Starting with \(t_0=0\), \(Q_0=I_{n\times n}\) and \(\varOmega _0=0_{n\times n}\) the following steps are repeated over successive intervals \([t_j, t_{j+1}]\) until \(t_{j+1}=T\).

  1. Initialization step: Let \(Q_j\) be the approximation of \(Q_t\) at time \(t=t_j\). Similarly, let \(K_j\) and \(V_{i,j}\) be the coefficient matrices \(K_t\) and \(V_{i,t}\) evaluated at \(t=t_j\) for \(i=1,\ldots ,d\).

  2. Numerical method step: Let \(\varOmega _\varDelta \) be the exact solution of (2.4) after one time step, i.e. at \(t=t_1=\varDelta \). Compute an approximation \(\varOmega _1\approx \varOmega _{\varDelta }\) by applying a stochastic Itô–Taylor or stochastic Runge–Kutta method to the matrix SDE (2.4).

  3. Projection step: Set \(Q_{j+1}=Q_j\, \psi (\varOmega _1)\).

Stochastic Runge–Kutta schemes consist of linear operations which would lead to approximations drifting off the Lie group G if they were applied directly to (2.3). In Algorithm 3.1 these linear operations are all carried out in the vector space \({\mathfrak {g}}\) such that the approximation \(\varOmega _1\) of (2.4) is also an element of the Lie algebra \({\mathfrak {g}}\). This approximation is then mapped to the Lie group G by \(\varOmega _1\mapsto Q_{j+1}=Q_j\psi (\varOmega _1)\), i.e. an approximation of (2.3) is obtained. Thus, all schemes obtained by following Algorithm 3.1 produce approximations in G and the structure of the Lie group is preserved.
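A minimal Python sketch of Algorithm 3.1 for \(d=1\) may clarify this division of labour (the signature of the `step` callable is our own choice, not part of the algorithm as stated):

```python
import numpy as np

def lie_group_integrate(K, V, T, J, step, psi, rng):
    """Algorithm 3.1 for d = 1: K(t) and V(t) return the coefficient matrices,
    `step` approximates Omega_Delta in the Lie algebra over one interval and
    `psi` maps the Lie algebra to the group (e.g. scipy.linalg.expm or cay)."""
    Delta = T / J
    Q = np.eye(K(0.0).shape[0])
    for j in range(J):
        t_j = j * Delta
        dW = rng.normal(0.0, np.sqrt(Delta))       # Wiener increment on [t_j, t_{j+1}]
        Omega1 = step(K(t_j), V(t_j), Delta, dW)   # numerical method step in the algebra
        Q = Q @ psi(Omega1)                        # projection step: Q_{j+1} = Q_j psi(Omega_1)
    return Q
```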

The order of convergence of these Lie group structure-preserving schemes clearly depends on the numerical method used in the second step of the algorithm since \(\varOmega _1\mapsto Q_{j+1}=Q_j\psi (\varOmega _1)\) is a smooth mapping, i.e. the order on G is as high as the order of the scheme used for the corresponding equation in \({\mathfrak {g}}\). In order to analyze the accuracy of our geometric numerical methods we recall that an approximating process \(X_t^{\varDelta }\) is said to converge in a strong sense with order \(\gamma > 0\) to the Itô process \(X_t\) if there exists a finite constant K and a \(\varDelta '>0\) such that

$$\begin{aligned} {\mathbb {E}}[|X_T - X_T^{\varDelta }|] \le K \varDelta ^{\gamma } \end{aligned}$$
(3.1)

for any time discretization with maximum step size \(\varDelta \in (0,\varDelta ')\) [13].

3.1 Geometric schemes of strong order 1

Using the Euler–Maruyama scheme in the numerical method step of Algorithm 3.1 results in

$$\begin{aligned} \varOmega _{1}&= \varOmega _0 + A(\varOmega _0) \varDelta + \sum _{i=1}^d\varGamma _i(\varOmega _0) \varDelta W^{i} \nonumber \\&= d\psi _{-\varOmega _0}^{-1}\left( K_j - \frac{1}{2}\sum _{i=1}^dV_{i,j}^2 \right) \varDelta + \sum _{i=1}^d d\psi _{-\varOmega _0}^{-1}(V_{i,j}) \varDelta W^{i}, \nonumber \\ {}&Q_{j+1} = Q_j \psi (\varOmega _1), \end{aligned}$$
(3.2)

where \(\varDelta W^{i} \sim {\mathscr {N}}(0,\varDelta )\), \(i=1,\ldots ,d\), are independent. Note that \(C_i(\varOmega _0)=0_{n\times n}\), \(i=1,\ldots ,d\) for both mappings \(\psi =\exp \) (see (2.7)) and \(\psi =\mathrm{cay}\) (see (2.10)), which is why we neglect this coefficient from here on.

Since this scheme (3.2) preserves the geometry of the Lie group G, it was called the geometric Euler–Maruyama scheme [19]. It can be made explicit according to the chosen mapping.

For \(\psi =\exp \), we get

$$\begin{aligned} \varOmega _{1}&= d\exp _{-\varOmega _0}^{-1}\left( K_j - \frac{1}{2}\sum _{i=1}^d V_{i,j}^2 \right) \varDelta + \sum _{i=1}^d d\exp _{-\varOmega _0}^{-1}(V_{i,j}) \varDelta W^{i} \nonumber \\&= \left( K_j - \frac{1}{2}\sum _{i=1}^d V_{i,j}^2 \right) \varDelta + \sum _{i=1}^d V_{i,j}\varDelta W^{i}, \nonumber \\ {}&Q_{j+1} = Q_j\,\exp (\varOmega _1), \end{aligned}$$
(3.3)

where inserting \(\varOmega _0=0_{n \times n}\) is equivalent to truncating the infinite series (2.2) after the first summand, right before any dependence on \(\varOmega \) appears.

Using \(\psi =\mathrm{cay}\) instead, we obtain

$$\begin{aligned} \begin{aligned} \varOmega _1&= d\mathrm{cay}_{-\varOmega _0}^{-1}\left( K_j-\frac{1}{2} \sum _{i=1}^d V_{i,j}^2\right) \varDelta + \sum _{i=1}^d d\mathrm{cay}_{-\varOmega _0}^{-1}(V_{i,j})\,\varDelta W^{i} \\&= \frac{1}{2}\left( K_j-\frac{1}{2}\sum _{i=1}^d V_{i,j}^2\right) \varDelta + \frac{1}{2}\sum _{i=1}^d V_{i,j} \varDelta W^{i}, \\ Q_{j+1}&= Q_j\,\mathrm{cay}(\varOmega _1) = Q_j(I - \varOmega _1)^{-1}(I+ \varOmega _1). \end{aligned} \end{aligned}$$

In both cases the diffusion term depends only on time and not on the solution itself. This is called additive noise [13], and it is the reason why these schemes have strong order \(\gamma =1\) instead of the order \(\gamma =0.5\) expected for the traditional Euler–Maruyama method. A general proof that the geometric Euler–Maruyama method converges with strong order \(\gamma =1\) can be found in [26].
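Both variants fit the `step` interface of the sketch of Algorithm 3.1 above; for \(d=1\) a minimal implementation could read:

```python
def gEM_exp_step(K_j, V_j, Delta, dW):
    """Scheme (3.3): at Omega_0 = 0 the operator dexpinv acts as the identity."""
    return (K_j - 0.5 * V_j @ V_j) * Delta + V_j * dW

def gEM_cay_step(K_j, V_j, Delta, dW):
    """Cayley variant: dcayinv at Omega_0 = 0 is multiplication by 1/2."""
    return 0.5 * ((K_j - 0.5 * V_j @ V_j) * Delta + V_j * dW)
```

These steps are combined with \(\psi =\) `scipy.linalg.expm` and \(\psi =\) `cay`, respectively.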

3.2 Geometric schemes of higher order

A higher strong order than \(\gamma = 1\) can be achieved by applying the strong Itô–Taylor approximation of order \(\gamma =1.5\) (see [13, p. 351]) in the second step of Algorithm 3.1. By doing so, we obtain for \(d=1\)

$$\begin{aligned} \varOmega _1&= A(\varOmega _0) \varDelta + \varGamma (\varOmega _0) \,\varDelta W + \frac{1}{2}\varGamma '\varGamma (\varOmega _0)\bigl ( (\varDelta W)^2 - \varDelta \bigr )\nonumber \\ {}&\quad + A'\varGamma (\varOmega _0)\varDelta Z \nonumber \\ {}&\quad + \frac{1}{2}\left( A'A(\varOmega _0) + \frac{1}{2}A''\varGamma ^2(\varOmega _0)\right) \varDelta ^2 \nonumber \\ {}&\quad + \left( \varGamma 'A(\varOmega _0) + \frac{1}{2}\varGamma ''\varGamma ^2(\varOmega _0)\right) (\varDelta W \varDelta - \varDelta Z) \nonumber \\ {}&\quad + \frac{1}{2}\left( \varGamma '\varGamma (\varOmega _0)\right) '\varGamma (\varOmega _0)\; \Bigl (\frac{1}{3}(\varDelta W)^2 - \varDelta \Bigr )\,\varDelta W,\nonumber \\&Q_{j+1} = Q_j \,\psi (\varOmega _1). \end{aligned}$$
(3.4)

As in the previous section, \(\varOmega _1\) depends on \(t_j\), since

$$\begin{aligned} A(\varOmega _0) = d\psi _{-\varOmega _0}^{-1}\Bigl (K_j - \frac{1}{2}V_{j}^2 \Bigr ), \quad \varGamma (\varOmega _0)=d\psi _{-\varOmega _0}^{-1}\Bigl (V_{j} \Bigr ). \end{aligned}$$

Representing the double integral \(\int _{t_j}^{t_{j+1}} \int _{t_j}^{s_2} dW_{s_1}ds_2\), the random variable \(\varDelta Z\) is normally distributed with mean \({\mathbb {E}}[\varDelta Z]=0\), variance \({\mathbb {E}}\bigl [(\varDelta Z)^2\bigr ]=\frac{1}{3}\varDelta ^3\) and covariance \({\mathbb {E}}[\varDelta Z \varDelta W]=\frac{1}{2}\varDelta ^2\). We consider the matrix derivatives as directional derivatives, e.g.

$$\begin{aligned} A'H = \left( \frac{d}{d\varOmega }A(\varOmega )\right) H = \frac{d}{d\epsilon }\left. A(\varOmega +\epsilon H)\right| _{\epsilon =0} \end{aligned}$$

which we then evaluate at \(\varOmega _0\). The computation of the needed matrix derivatives for \(\psi =\exp \) and \(\psi =\mathrm{cay}\) is provided in the Appendix.
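The pair \((\varDelta W, \varDelta Z)\) with the stated moments can be generated from two independent standard normal variables; this is also the construction used in Sect. 4. A sketch:

```python
import numpy as np

def sample_dW_dZ(Delta, rng):
    """Draw (Delta W, Delta Z) with Var(dZ) = Delta^3/3 and
    Cov(dZ, dW) = Delta^2/2 from two independent N(0,1) samples."""
    U1, U2 = rng.normal(size=2)
    dW = U1 * np.sqrt(Delta)
    dZ = 0.5 * Delta * (dW + U2 * np.sqrt(Delta / 3.0))
    return dW, dZ
```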

In the case \(d>1\) one could apply the corresponding multi-dimensional Itô–Taylor scheme (see [13, p. 353]) to (2.4), followed by a projection step.

A strong order of \(\gamma =1.5\) can also be achieved by applying a stochastic Runge–Kutta method of that order to the SDE (2.4). By using the stochastic Runge–Kutta scheme of order \(\gamma =1.5\) of Rößler [27], we can avoid computing the derivatives in (3.4) and we obtain for \(d=1\)

$$\begin{aligned} \varOmega _{1}&= \left( \frac{1}{3}A(H_1) + \frac{2}{3}A(H_2)\right) \varDelta \nonumber \\&\quad + \left( \frac{13}{4}\varGamma (\tilde{H}_1) - \frac{9}{4}\varGamma (\tilde{H}_2) - \frac{9}{4}\varGamma (\tilde{H}_3) + \frac{9}{4}\varGamma (\tilde{H}_4)\right) \varDelta W \nonumber \\&\quad + \left( -\frac{15}{4}\varGamma (\tilde{H}_1) + \frac{15}{4}\varGamma (\tilde{H}_2) + \frac{3}{4}\varGamma (\tilde{H}_3) - \frac{3}{4}\varGamma (\tilde{H}_4)\right) \frac{1}{2\sqrt{\varDelta }}\bigl ((\varDelta W)^2-\varDelta \bigr )\nonumber \\&\quad + \left( -\frac{9}{4}\varGamma (\tilde{H}_1) + \frac{9}{4}\varGamma (\tilde{H}_2) + \frac{9}{4}\varGamma (\tilde{H}_3) - \frac{9}{4}\varGamma (\tilde{H}_4)\right) \frac{\varDelta Z}{\varDelta } \nonumber \\&\quad + \left( 6\varGamma (\tilde{H}_1) - 9\varGamma (\tilde{H}_2) + 3\varGamma (\tilde{H}_4)\right) \frac{1}{3!\varDelta }\left( (\varDelta W)^2-3\varDelta \right) \varDelta W, \nonumber \\ Q_{j+1}&= Q_j \,\psi (\varOmega _1), \end{aligned}$$
(3.5)

with the stage values

$$\begin{aligned} H_1&= H_3 = \tilde{H}_1 = \varOmega _0, \quad H_2 = \frac{3}{4}A(H_1)\varDelta + \frac{3}{2}\varGamma (\tilde{H}_1)\frac{\varDelta Z}{\varDelta }, \\ \tilde{H}_2&= \frac{1}{9}A(H_1)\varDelta + \frac{1}{3}\varGamma (\tilde{H}_1)\sqrt{\varDelta }, \\ \tilde{H}_3&= -\frac{5}{9}A(H_1)\varDelta + \frac{1}{3}A(H_2)\varDelta - \frac{1}{3}\varGamma (\tilde{H}_1)\sqrt{\varDelta } + \varGamma (\tilde{H}_2)\sqrt{\varDelta }, \\ \tilde{H}_4&= -A(H_1)\varDelta + \frac{1}{3}A(H_2)\varDelta + A(H_3)\varDelta \\&\quad + \varGamma (\tilde{H}_1)\sqrt{\varDelta } - \varGamma (\tilde{H}_2)\sqrt{\varDelta } + \varGamma (\tilde{H}_3)\sqrt{\varDelta }. \end{aligned}$$
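As a sketch, the scheme (3.5) with its stage values translates directly into code. Here `A` and `G` denote the drift and diffusion maps of (2.4) at the current time \(t_j\) (evaluated, e.g., with the truncated `dexpinv` or with `dcayinv` from above), and \((\varDelta W, \varDelta Z)\) are drawn as described in the previous subsection:

```python
import numpy as np

def gSRK15_step(A, G, Delta, dW, dZ, n):
    """Roessler's strong order 1.5 SRK scheme (3.5) in the Lie algebra, d = 1."""
    sq = np.sqrt(Delta)
    H1 = np.zeros((n, n))                       # H_1 = H_3 = H~_1 = Omega_0 = 0
    A1, G1 = A(H1), G(H1)
    H2 = 0.75 * A1 * Delta + 1.5 * G1 * dZ / Delta
    A2, A3 = A(H2), A1                          # H_3 = H_1, so A(H_3) = A(H_1)
    Ht2 = A1 * Delta / 9 + G1 * sq / 3
    G2 = G(Ht2)
    Ht3 = -5 * A1 * Delta / 9 + A2 * Delta / 3 - G1 * sq / 3 + G2 * sq
    G3 = G(Ht3)
    Ht4 = -A1 * Delta + A2 * Delta / 3 + A3 * Delta + (G1 - G2 + G3) * sq
    G4 = G(Ht4)
    Omega1 = (A1 / 3 + 2 * A2 / 3) * Delta
    Omega1 += (13 * G1 - 9 * G2 - 9 * G3 + 9 * G4) / 4 * dW
    Omega1 += (-15 * G1 + 15 * G2 + 3 * G3 - 3 * G4) / 4 * (dW**2 - Delta) / (2 * sq)
    Omega1 += (-9 * G1 + 9 * G2 + 9 * G3 - 9 * G4) / 4 * dZ / Delta
    Omega1 += (6 * G1 - 9 * G2 + 3 * G4) * (dW**2 - 3 * Delta) * dW / (6 * Delta)
    return Omega1
```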

The use of stochastic Runge–Kutta methods gives us the benefit of a derivative-free scheme. However, using the mapping \(\psi =\exp \) raises the question of how large the truncation index q must be in the truncated approximation of (2.2),

$$\begin{aligned} \sum _{k=0}^{q}\frac{B_k}{k!}\mathrm{ad}_{-\varOmega }^k(H) = H - \frac{1}{2} [-\varOmega , H ]+ \frac{1}{12} \bigl [- \varOmega , [-\varOmega ,H ]\bigr ]+ \cdots , \end{aligned}$$
(3.6)

in order to maintain a strong order of \(\gamma =1.5\). More generally, a condition is needed that connects the truncation index q with the desired strong convergence order \(\gamma \).

Inspired by [9, Theorem IV.8.5] for Runge–Kutta–Munthe–Kaas methods for deterministic matrix ODEs, we formulate the following theorem.

Theorem 3.2

Consider Algorithm 3.1 with \(\psi =\exp \). Let the applied stochastic Runge–Kutta method in the second step of Algorithm 3.1 be of strong order \(\gamma \). If the truncation index q in (3.6) satisfies \(q\ge 2\gamma -2\), then the method of Algorithm 3.1 is of strong order \(\gamma \).

Proof

According to the definition of strong convergence (3.1) we have to show that

$$\begin{aligned} {\mathbb {E}}[\Vert \varOmega _{\varDelta }-\varOmega _1\Vert ] \le K\varDelta ^{\gamma } \end{aligned}$$

where \(\varOmega _{\varDelta }\) is the exact solution of (2.4) with \(\psi =\exp \) at \(t=\varDelta \), \(\varOmega _1\) is the numerical approximation obtained in the second step of Algorithm 3.1 and K is a finite constant. Since \(q\ge 2\gamma -2\) implies \(\varDelta ^{(q+2)/2}\le \varDelta ^{\gamma }\) for \(\varDelta \le 1\), it suffices to bound the modelling error introduced below by a constant times \(\varDelta ^{(q+2)/2}\).

Let \(\varOmega _{\varDelta }^q\) be the exact solution of the truncated version of (2.4) with \(\psi =\exp \) at \(t=\varDelta \), namely

$$\begin{aligned} \varOmega _{\varDelta }^q = \int _0^{\varDelta } \sum _{k=0}^q \frac{B_k}{k!}\mathrm{ad}_{-\varOmega _s}^k\left( K_s-\frac{1}{2}\sum _{i=1}^d V_{i,s}^2\right) ds + \sum _{i=1}^d \int _0^{\varDelta } \sum _{k=0}^q \frac{B_k}{k!}\mathrm{ad}_{-\varOmega _s}^k(V_{i,s})\, dW_s^{i}. \end{aligned}$$

Our proof is divided into six steps.

Step 1: Numerical error

We consider the Frobenius norm of matrices in the Lie algebra \({\mathfrak {g}}\) and estimate the error in the \(L^1\)-norm by the \(L^2\)-norm. Then, we use the Minkowski inequality by introducing \(\varOmega _{\varDelta }^q\),

$$\begin{aligned} {\mathbb {E}}[\Vert \varOmega _{\varDelta }-\varOmega _1\Vert ]&\le \left( {\mathbb {E}}\left[ \Vert \varOmega _{\varDelta }-\varOmega _1\Vert ^2\right] \right) ^{1/2} \\&\le \left( {\mathbb {E}}\left[ \Vert \varOmega _{\varDelta }-\varOmega _{\varDelta }^q\Vert ^2\right] \right) ^{1/2} + \left( {\mathbb {E}}\left[ \Vert \varOmega _{\varDelta }^q-\varOmega _1\Vert ^2\right] \right) ^{1/2}. \end{aligned}$$

We are left with the modelling error, which corresponds to the first summand, and the numerical error, the second summand. The numerical error can be estimated by

$$\begin{aligned} \left( {\mathbb {E}}\left[ \Vert \varOmega _{\varDelta }^q-\varOmega _1\Vert ^2\right] \right) ^{1/2} \le \tilde{K}\varDelta ^{\gamma } \quad \text {for} \; \tilde{K}<\infty , \end{aligned}$$

as we assume that we are applying a SRK method of strong order \(\gamma \).

In other words, it remains to be shown that

$$\begin{aligned} \left( {\mathbb {E}}\left[ \Vert \varOmega _{\varDelta }-\varOmega _{\varDelta }^q\Vert ^2\right] \right) ^{1/2} \le K\varDelta ^{(q+2)/2} \end{aligned}$$

holds for the modelling error.

Step 2: Itô isometry

Inserting the integral equation of (2.4) and its truncated version, we get

$$\begin{aligned}&\left( {\mathbb {E}}\left[ \Vert \varOmega _{\varDelta }-\varOmega _{\varDelta }^q\Vert ^2\right] \right) ^{1/2} \\&\quad = \left( {\mathbb {E}}\left[ \Big \Vert \int _0^{\varDelta }\sum _{k=q+1}^{\infty }\frac{B_k}{k!}\mathrm{ad}_{-\varOmega _s}^k\left( K_s-\frac{1}{2}\sum _{i=1}^d V_{i,s}^2\right) ds\right. \right. \\&\left. \left. \qquad + \sum _{i=1}^d\int _0^{\varDelta }\sum _{k=q+1}^{\infty } \frac{B_k}{k!}\mathrm{ad}_{-\varOmega _s}^k(V_{i,s})dW_s^{i}\Big \Vert ^2\right] \right) ^{1/2} \\&\quad \le \left( {\mathbb {E}}\left[ \Big \Vert \int _0^{\varDelta }\sum _{k=q+1}^{\infty }\frac{B_k}{k!}\mathrm{ad}_{-\varOmega _s}^k\left( K_s-\frac{1}{2}\sum _{i=1}^d V_{i,s}^2\right) ds\Big \Vert ^2\right] \right) ^{1/2}\\&\qquad + \sum _{i=1}^d\left( {\mathbb {E}}\left[ \Big \Vert \int _0^{\varDelta }\sum _{k=q+1}^{\infty } \frac{B_k}{k!}\mathrm{ad}_{-\varOmega _s}^k(V_{i,s})dW_s^{i}\Big \Vert ^2\right] \right) ^{1/2} \\&\quad \le \left( \int _0^{\varDelta }{\mathbb {E}}\left[ \Big \Vert \sum _{k=q+1}^{\infty }\frac{B_k}{k!}\mathrm{ad}_{-\varOmega _s}^k\left( K_s-\frac{1}{2}\sum _{i=1}^dV_{i,s}^2\right) \Big \Vert ^2\right] ds\right) ^{1/2}\\&\qquad + \sum _{i=1}^d\left( \int _0^{\varDelta }{\mathbb {E}}\left[ \Big \Vert \sum _{k=q+1}^{\infty } \frac{B_k}{k!}\mathrm{ad}_{-\varOmega _s}^k(V_{i,s})\Big \Vert ^2\right] ds\right) ^{1/2} \\&\quad \le \left( \int _0^{\varDelta }{\mathbb {E}}\left[ \left( \sum _{k=q+1}^{\infty }\frac{|B_k|}{k!}\big \Vert \mathrm{ad}_{-\varOmega _s}^k\left( K_s-\frac{1}{2}\sum _{i=1}^dV_{i,s}^2\right) \big \Vert \right) ^2\right] ds\right) ^{1/2}\\&\qquad + \sum _{i=1}^d\left( \int _0^{\varDelta }{\mathbb {E}}\left[ \left( \sum _{k=q+1}^{\infty } \frac{|B_k|}{k!}\big \Vert \mathrm{ad}_{-\varOmega _s}^k(V_{i,s})\big \Vert \right) ^2\right] ds\right) ^{1/2}, \end{aligned}$$

where we also used the Minkowski inequality, the Itô isometry and the properties of a matrix norm. Now, the summands in the last line differ only in the input matrix of the adjoint operator.

Step 3: Adjoint operator

We estimate the Frobenius norm of the adjoint operator of \(V_{i,s}\) for a fixed \(s\in [0,\varDelta ]\) and keep in mind that analogous estimates hold for the adjoint operator of \(K_s-\frac{1}{2}\sum _{i=1}^d V_{i,s}^2\). Since the Frobenius norm is submultiplicative, we have

$$\begin{aligned} \Vert \mathrm{ad}_{-\varOmega _s}(V_{i,s})\Vert = \Vert [-\varOmega _s,V_{i,s}]\Vert \le \Vert \varOmega _sV_{i,s}\Vert +\Vert V_{i,s}\varOmega _s\Vert \le 2\Vert \varOmega _s\Vert \Vert V_{i,s}\Vert . \end{aligned}$$

As a direct consequence, it holds

$$\begin{aligned} \Vert \mathrm{ad}_{-\varOmega _s}^k(V_{i,s})\Vert \le 2^k\Vert \varOmega _s\Vert ^k\Vert V_{i,s}\Vert , \end{aligned}$$

which can also be shown via induction. Inserting this result in the expected value considered in the last line of the previous step, we get

$$\begin{aligned} {\mathbb {E}}\left[ \left( \sum _{k=q+1}^{\infty }\frac{|B_k|}{k!}\big \Vert \mathrm{ad}_{-\varOmega _s}^k(V_{i,s})\big \Vert \right) ^2\right]&\le {\mathbb {E}}\left[ \left( \sum _{k=q+1}^{\infty }\frac{|B_k|}{k!}2^k\Vert \varOmega _s\Vert ^k\Vert V_{i,s}\Vert \right) ^2\right] \\&= \Vert V_{i,s}\Vert ^2\,{\mathbb {E}}\left[ \left( \sum _{k=q+1}^{\infty }\frac{|B_k|}{k!}2^k\Vert \varOmega _s\Vert ^k\right) ^2\right] . \end{aligned}$$

Step 4: Estimate for the remainder

It is known that the Bernoulli numbers are implicitly defined by \(\sum _{k=0}^{\infty }(B_k/k!)x^k=x/(e^x-1)\). Replacing the Bernoulli numbers by their absolute values yields

$$\begin{aligned} \sum _{k=0}^{\infty }\frac{|B_k|}{k!}x^k = \frac{x}{2}\left( 1-\cot \left( \frac{x}{2}\right) \right) +2 \end{aligned}$$

(see [11, p. 48]). Let \(f:I\rightarrow {\mathbb {R}}\), \(x \mapsto \frac{x}{2}\left( 1-\cot \left( \frac{x}{2}\right) \right) +2\) with \(I=\{x \in {\mathbb {R}}: \frac{x}{2\pi }\not \in {\mathbb {Z}}{\setminus }\{0\}\}\), where the singularity of the cotangent at \(x=0\) is removable with \(f(0)=1\). Applying Taylor's theorem to the function f at the point 0 gives

$$\begin{aligned} f(x)= \sum _{k=0}^{q}\frac{f^{(k)}(0)}{k!}x^k + R_q(x), \quad R_q(x)=\frac{f^{(q+1)}(\xi )}{(q+1)!}x^{q+1}, \end{aligned}$$

where we consider the Lagrange form of the remainder for some real number \(\xi \) between 0 and x.

Setting \(x=2\Vert \varOmega _s\Vert \) and recalling that the expression (2.2) only converges for \(\Vert \varOmega \Vert <\pi \), we now consider \(\left. f\right| _{\tilde{I}}:\tilde{I}\rightarrow {\mathbb {R}}\), \(x \mapsto \frac{x}{2}\left( 1-\cot \left( \frac{x}{2}\right) \right) +2\) with \(\tilde{I}=\{x \in {\mathbb {R}}: |x|<2\pi \}\). The restriction of f to \(\tilde{I}\) is smooth and its derivatives are bounded on compact subsets of \(\tilde{I}\); in particular, there exists an upper bound \(M_q\) such that \(|\left. f\right| _{\tilde{I}}^{(q+1)}(\xi )|\le M_q\) for all \(\xi \) between 0 and x. Moreover, the following estimate for the remainder holds

$$\begin{aligned} |R_q(x)|=\left| \frac{\left. f\right| _{\tilde{I}}^{(q+1)}(\xi )}{(q+1)!}x^{q+1}\right| \le \frac{M_q}{(q+1)!}|x|^{q+1} \le \frac{M_q}{(q+1)!}(2\Vert \varOmega _s\Vert )^{q+1}. \end{aligned}$$

Using this estimate in the expected value of the last line of the previous step results in

$$\begin{aligned} {\mathbb {E}}\left[ \left( \sum _{k=q+1}^{\infty }\frac{|B_k|}{k!}(2\Vert \varOmega _s\Vert )^k\right) ^2\right]&\le {\mathbb {E}}\left[ \left( \frac{M_q}{(q+1)!}(2\Vert \varOmega _s\Vert )^{q+1}\right) ^2\right] \\&= \left( \frac{2^{q+1}M_q}{(q+1)!}\right) ^2 {\mathbb {E}}\left[ \Vert \varOmega _s\Vert ^{2q+2}\right] . \end{aligned}$$

Step 5: Itô–Taylor expansion

The goal of this step is to find an estimate for \({\mathbb {E}}\left[ \Vert \varOmega _s\Vert ^{2q+2}\right] \). For this purpose, we examine the following Itô–Taylor expansion

$$\begin{aligned} \varOmega _s = \varOmega _0 + R_s = R_s, \quad {\mathbb {E}}\left[ \Vert R_s\Vert ^{2(q+1)}\right] \le C_1s^{q+1}, \end{aligned}$$

where \(C_1\) is a finite constant; this moment estimate for the remainder follows from [13, Proposition 5.9.1]. This leads to the estimate

$$\begin{aligned} {\mathbb {E}}\left[ \Vert \varOmega _s\Vert ^{2(q+1)}\right] = {\mathbb {E}}\left[ \Vert R_s\Vert ^{2(q+1)}\right] \le C_1s^{q+1}. \end{aligned}$$

Step 6: Overall estimate

Gathering the results of the previous steps and using a Taylor expansion of \(V_{i,s}\) around \(s=0\) with a finite constant \(C_2\) gives

$$\begin{aligned}&{\mathbb {E}}\left[ \left( \sum _{k=q+1}^{\infty } \frac{|B_k|}{k!}\big \Vert \mathrm{ad}_{-\varOmega _s}^k(V_{i,s})\big \Vert \right) ^2\right] \\&\quad \le \left( \Vert V_{i,0}\Vert + C_2s\right) ^2\left( \frac{2^{q+1}M_q}{(q+1)!}\right) ^2C_1s^{q+1} ={\mathscr {O}}(s^{q+1}). \end{aligned}$$

Thus,

$$\begin{aligned} \left( \int _0^{\varDelta }{\mathbb {E}}\left[ \left( \sum _{k=q+1}^{\infty } \frac{|B_k|}{k!}\big \Vert \mathrm{ad}_{-\varOmega _s}^k(V_{i,s})\big \Vert \right) ^2\right] ds\right) ^{1/2} ={\mathscr {O}}(\varDelta ^{(q+2)/2}). \end{aligned}$$

Analogously, one can show that

$$\begin{aligned} \left( \int _0^{\varDelta }{\mathbb {E}}\left[ \left( \sum _{k=q+1}^{\infty } \frac{|B_k|}{k!}\big \Vert \mathrm{ad}_{-\varOmega _s}^k\Big (K_s-\frac{1}{2}\sum _{i=1}^d V_{i,s}^2\Big )\big \Vert \right) ^2\right] ds\right) ^{1/2} ={\mathscr {O}}(\varDelta ^{(q+2)/2}), \end{aligned}$$

which concludes the proof.\(\square \)

Remark 3.1

  1. Assuming that multiple Wiener integrals can be simulated efficiently, a general procedure for designing methods of strong order \(\gamma \) for linear Itô SDEs on matrix Lie groups is obtained by applying an SRK scheme of order \(\gamma \) in the second step of Algorithm 3.1 while evaluating the sum (3.6) involved in the coefficients of equation (2.4) only up to index \(2\gamma -2\). We refer the reader to [17] and the references therein for more details on the approximation of iterated stochastic integrals.

  2. Theorem 3.2 is in accordance with our results of Sect. 3.1, where the geometric Euler–Maruyama scheme (3.3) can be interpreted as a stochastic RKMK method with \(\gamma =1\) and \(q=0\).

  3. Since the Cayley map is defined by a finite product of matrices, there is no modelling error and therefore no such theorem is needed if \(\psi =\mathrm{cay}\) is chosen as the local parameterization in Algorithm 3.1.

4 Numerical examples

In the following we provide numerical examples that illustrate the effectiveness of the proposed geometric methods, first by simulating the strong convergence order of the proposed schemes and second by demonstrating the Lie group structure preservation of our methods.

For checking the convergence order, we set \(G=\) SO(3) and \({\mathfrak {g}}=\mathfrak {so}(3)\). In order to satisfy the conditions of Theorem 2.1 we used the setup of the matrices \(K_t\) and \(V_t\) proposed by Muniz et al. [22] for \(d=1\). Specifically, we chose the time-dependent functions

$$\begin{aligned} f_1(t) = \cos (t), \quad f_2(t) = \sin (t), \quad f_3(t) = 1 + t + t^2 + t^3, \end{aligned}$$

to compute a skew-symmetric matrix \(V_t\) as a linear combination,

$$\begin{aligned} V_t = f_1(t) G_1 + f_2(t)G_2 + f_3(t)G_3, \end{aligned}$$

where \(G_i\), \(i=1,2,3\) are the following generators of the Lie algebra \(\mathfrak {so}(3)\),

$$\begin{aligned} G_1 = \begin{pmatrix} 0 &{} -1 &{} 0 \\ 1 &{} 0 &{} 0 \\ 0 &{} 0 &{} 0 \end{pmatrix}, \quad G_2 = \begin{pmatrix} 0 &{} 0 &{} -1 \\ 0 &{} 0 &{} 0 \\ 1 &{} 0 &{} 0 \end{pmatrix}, \quad G_3 = \begin{pmatrix} 0 &{} 0 &{} 0 \\ 0 &{} 0 &{} -1 \\ 0 &{} 1 &{} 0 \end{pmatrix}. \end{aligned}$$

Note that the functions \(f_i\), \(i=1,2,3\) can be chosen arbitrarily. We then set \(K_t\) to the strictly lower triangular part of \(V_t^2\) plus 0.5 times the diagonal of \(V_t^2\), such that \(K_t + K_t^{\top } = V_t^2\).
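In code, this test setup could read as follows (a sketch; `np.tril(·, -1)` extracts the strictly lower triangle):

```python
import numpy as np

G1 = np.array([[0., -1., 0.], [1., 0., 0.], [0., 0., 0.]])
G2 = np.array([[0., 0., -1.], [0., 0., 0.], [1., 0., 0.]])
G3 = np.array([[0., 0., 0.], [0., 0., -1.], [0., 1., 0.]])

def V(t):
    """Skew-symmetric diffusion coefficient V_t in so(3)."""
    return np.cos(t) * G1 + np.sin(t) * G2 + (1 + t + t**2 + t**3) * G3

def K(t):
    """Drift coefficient with K_t + K_t^T = V_t^2 (Theorem 2.1)."""
    V2 = V(t) @ V(t)                  # symmetric, since V_t is skew-symmetric
    return np.tril(V2, -1) + 0.5 * np.diag(np.diag(V2))
```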

We simulated \(M=1000\) different paths; in each step two independent realizations of a standard normally distributed random variable, \(U_1, U_2 \sim {\mathscr {N}}(0,1)\), were drawn. Then, the random variables used in the numerical method step in Algorithm 3.1 were simulated as \(\widehat{\varDelta W} = U_1\sqrt{\varDelta }\) and \(\widehat{\varDelta Z} = \frac{1}{2}\varDelta (\widehat{\varDelta W} + U_2\sqrt{\frac{\varDelta }{3}})\). The absolute error as defined in (3.1) was estimated by using the Frobenius norm at \(t_j=T\), i.e. by

$$\begin{aligned} \frac{1}{M}\sum _{i=1}^{M} \Bigl (\bigl \Vert Q_{T,i}^{\text {ref}}-Q_{T,i}^{\varDelta }\bigr \Vert _F\Bigr ) \end{aligned}$$

where the approximations \(Q_{T,i}^{\varDelta }\) were obtained by using Algorithm 3.1 with step sizes \(\varDelta = 2^{-14}, 2^{-13}, 2^{-12}, 2^{-11}, 2^{-10}, 2^{-9}\), and for the reference solution \(Q_{T,i}^{\text {ref}}\) we used the same method with \(\psi =\mathrm{cay}\) and step size \(\varDelta =2^{-16}\).

A log–log plot of the estimation of the absolute error against the step sizes can be viewed in Fig. 1. It indicates the strong order of convergence claimed in the sections above for the geometric Euler–Maruyama scheme (3.2), the geometric version of the Itô–Taylor scheme (3.4) and the geometric stochastic Runge–Kutta scheme (3.5).

Fig. 1

Simulation of the strong convergence order for \(M=1000\) paths. Left: Geometric Euler–Maruyama scheme (gEM). Center: Geometric version of the Itô–Taylor scheme of strong order 1.5 (gIT). Right: Geometric version of Rößler’s stochastic Runge–Kutta scheme of strong order 1.5 (gSRK)

Examples from financial mathematics and multibody system dynamics demonstrate that the structure-preserving methods derived above can be applied in practice.

In the first example we apply our methods of strong order \(\gamma =1.5\) to an SDE on SO(2) in the context of stochastic correlation modelling. The second example shows how our methods can be used in the modeling of rigid bodies, such as satellites. Although we have restricted our research in this paper to linear SDEs on Lie groups, the second example shows that our methods can also be applied to nonlinear SDEs, e.g. on SO(3).

4.1 A stochastic correlation model

Let us assume that a risk manager retrieves from the middle office's reporting system an initial value of the correlation between two assets and a density function of the considered correlation. Moreover, we assume that the risk manager is tasked with generating correlation matrices that not only approximate the given density function but also respect the stochastic behaviour of correlations.

This problem can be solved by the stochastic correlation model presented in [22]. The main ideas of the approach are outlined below.

For this example we consider historical prices of the S&P 500 index and the Euro/US-Dollar exchange rate and compute moving correlations with a window size of 30 days to obtain correlations from January 03, 2005 to January 06, 2006 (see Fig. 2).

Fig. 2

The 30-day historical correlations between S&P 500 and Euro/US-Dollar exchange rate, source of data: www.yahoo.com

The corresponding initial correlation matrix calculated from this data and provided to the risk manager is

$$\begin{aligned} R_0^{\mathrm{hist}}= \begin{pmatrix} 1 &{} -0.0159 \\ -0.0159 &{} 1 \end{pmatrix}. \end{aligned}$$

Furthermore, we estimate a density function from the historical data using kernel smoothing functions, which is also plotted in Fig. 3. For more details on the density estimation see [2].

Fig. 3

Empirical density function of the historical correlation between S&P 500 and Euro/US-Dollar exchange rate, computed with the MATLAB function ksdensity

As a first step, we focus on covariance matrices \(P_t\), \(t\ge 0\). The authors of [29] utilised the principal axis theorem and defined the covariance flow

$$\begin{aligned} P_t = Q_t^{\top }P_0Q_t, \quad t\ge 0, \end{aligned}$$
(4.1)

where \(P_0\) is the initial covariance matrix computed based on \(R_0^{\mathrm{hist}}\) and \(Q_t\) is an orthogonal matrix which, without loss of generality, can be assumed to have determinant +1, i.e. \(Q_t \in \mathrm {SO}(2)\). Following the approach in [22], the matrix \(Q_t\) is now assumed to be driven by the SDE (2.3), which can be solved by using Algorithm 3.1. With the resulting matrices, approximations of \(P_t\) can be computed with (4.1), which can then be transformed to corresponding correlation matrices

$$\begin{aligned} R_t=\varSigma _t^{-1} P_t \varSigma _t^{-1} \end{aligned}$$

with \(\varSigma _t=\bigl (\mathrm{diag}(P_t)\bigr )^{\frac{1}{2}}\).
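The covariance-to-correlation transformation is a small helper; a sketch:

```python
import numpy as np

def correlation_from_covariance(P):
    """R_t = Sigma^{-1} P_t Sigma^{-1} with Sigma = diag(P_t)^{1/2},
    computed entrywise as P_ij / (s_i s_j)."""
    s = np.sqrt(np.diag(P))
    return P / np.outer(s, s)
```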

Finally, a density function is estimated from this correlation flow and the free parameters involved are calibrated such that the density function matches the density function from the historical data, see [22] for details.

We executed this procedure using the geometric Itô–Taylor scheme (3.4) with \(\psi =\mathrm{cay}\) (gIT) and the geometric Rößler scheme (3.5) with \(\psi =\exp \) and truncation index \(q=1\) (gSRK) in the second step of Algorithm 3.1, respectively. The results are plotted in Fig. 4, which shows that both density functions approximate the density function of the historical data quite well.

Fig. 4

Empirical density function of the historical correlation and the correlation flow between S&P 500 and Euro/US-Dollar exchange rate

4.2 The stochastic rigid body problem

Consider a free rigid body, whose centre of mass is at the origin. Let the vector \(y=(y_1,y_2,y_3)^{\top }\) represent the angular momentum in the body frame and \(I_1\), \(I_2\) and \(I_3\) be the principal moments of inertia [20]. Then the motion of this free rigid body is described by the Euler equations

$$\begin{aligned} \dot{y} = V(y)y, \quad V(y) = \begin{pmatrix} 0 &{} y_3/I_3 &{} -y_2/I_2 \\ -y_3/I_3 &{} 0 &{} y_1/I_1 \\ y_2/I_2 &{} -y_1/I_1 &{} 0 \end{pmatrix}. \end{aligned}$$

We suppose that the rigid body is perturbed by a Wiener process \(W_t\) and compute a matrix K(y) such that the dynamics are kept on the manifold, i.e. we determine K(y) from the condition \(K(y) + K^{\top }(y) = V^2(y)\). Consequently, we consider the Itô SDE

$$\begin{aligned} dy = K(y)y\, dt + V(y)y\, dW_t, \end{aligned}$$
(4.2)

where the solution evolves on the unit sphere if the initial value \(y_0\) satisfies \(|y_0|=1\). Note that stochastic versions of the rigid body problem have already been considered in [16] and [30] but as Stratonovich SDEs.
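The coefficient matrices of (4.2) can be sketched as follows; choosing K(y) as the strictly lower triangular part of \(V^2(y)\) plus half its diagonal mirrors the SO(3) example above and is our choice of one particular solution of \(K(y) + K^{\top }(y) = V^2(y)\):

```python
import numpy as np

I1, I2, I3 = 2.0, 1.0, 2.0 / 3.0     # principal moments of inertia used below

def V_rb(y):
    """Skew-symmetric matrix V(y) of the Euler equations."""
    return np.array([[0.0,        y[2] / I3, -y[1] / I2],
                     [-y[2] / I3, 0.0,        y[0] / I1],
                     [ y[1] / I2, -y[0] / I1, 0.0]])

def K_rb(y):
    """One drift matrix satisfying K(y) + K(y)^T = V(y)^2."""
    V2 = V_rb(y) @ V_rb(y)
    return np.tril(V2, -1) + 0.5 * np.diag(np.diag(V2))
```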

Since the solution of (4.2) can also be written as \(y=Qy_0\) where \(Q\in \mathrm {SO}(3)\), we focus on the nonlinear matrix SDE

$$\begin{aligned} dQ = K(Q)Q\, dt + V(Q)Q\, dW_t, \quad Q_0 = I_{3\times 3}. \end{aligned}$$
(4.3)

The coefficients of the corresponding SDE in the Lie algebra (2.4) are

$$\begin{aligned} A(\varOmega )= & {} d\psi ^{-1}_{\varOmega }\Bigl (K\bigl (\psi (\varOmega )Q_0\bigr ) -\frac{1}{2}V^2\bigl (\psi (\varOmega )Q_0\bigr )\Bigr ), \quad \nonumber \\ \varGamma (\varOmega )= & {} d\psi ^{-1}_{\varOmega }\Bigl (V\bigl (\psi (\varOmega )Q_0\bigr )\Bigr ). \end{aligned}$$
(4.4)

Now, SDE (4.2) can be solved by applying Algorithm 3.1 to the SDE (4.3). Note that in (4.3) the coefficient matrices multiply the solution Q from the left, whereas in (2.3) they multiply it from the right. As a consequence, the sign of the index of the operator \(d\psi ^{-1}\) changes in (4.4) and the projection step in Algorithm 3.1 becomes \(Q_{j+1}=\psi (\varOmega _{1})Q_j\). We refer to [23] for more details on this matter.
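Putting the pieces together, a sketch of the structure-preserving simulation of (4.2) with the geometric Euler–Maruyama scheme and \(\psi =\mathrm{cay}\) (reusing `cay`, `V_rb` and `K_rb` from above) could read:

```python
def rigid_body_path(y0, Delta, steps, rng):
    """Geometric EM for (4.3) with psi = cay; note the reversed order
    Q_{j+1} = cay(Omega_1) Q_j and the reconstruction y_j = Q_j y_0."""
    Q, path = np.eye(3), [y0]
    for _ in range(steps):
        y = Q @ y0
        Kj, Vj = K_rb(y), V_rb(y)
        dW = rng.normal(0.0, np.sqrt(Delta))
        Omega1 = 0.5 * ((Kj - 0.5 * Vj @ Vj) * Delta + Vj * dW)  # gEM step, psi = cay
        Q = cay(Omega1) @ Q
        path.append(Q @ y0)
    return np.array(path)

# e.g. rigid_body_path(np.array([np.sin(1.1), 0.0, np.cos(1.1)]), 0.03, 200,
#                      np.random.default_rng(0))
```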

In Fig. 5 we simulated 200 steps of the trajectory of (4.2) with a step size of \(\varDelta =0.03\) by using Algorithm 3.1 with the initial value \(y_0=(\sin (1.1),0,\cos (1.1))^{\top }\) and the moments of inertia \(I_1=2\), \(I_2=1\) and \(I_3=2/3\). For the numerical method step of Algorithm 3.1 we used the Euler–Maruyama scheme with \(\psi =\mathrm{cay}\). Emphasizing the structure-preserving character of Algorithm 3.1, we also plotted a sample path of the traditional Euler–Maruyama scheme applied directly to (4.2), whose trajectory clearly fails to stay on the manifold. This phenomenon can also be seen in Fig. 6, where we visualize the distance of the approximate solutions from the manifold.

Fig. 5

Sample paths of the geometric Euler–Maruyama (blue) and the traditional Euler–Maruyama scheme (red) applied to (4.2)

Fig. 6

Log-distance of the numerical solutions to the unit sphere

5 Conclusion

We have presented stochastic Lie group methods for linear Itô SDEs on matrix Lie groups that have a higher strong convergence order than the known geometric Euler–Maruyama scheme. Based on RKMK methods for ODEs on Lie groups, we have proven a condition on the truncation index of the inverse of \(d\exp _{-\varOmega }(H)\) such that the stochastic RKMK method inherits the convergence order of the underlying SRK method. This allows the construction of schemes of even higher order than the methods of strong order \(\gamma =1.5\) presented here, provided that multiple Wiener integrals can be approximated efficiently. Additionally, we have shown examples for the application of our methods in mechanical engineering and in financial mathematics.

Our methods require further investigations for the application to nonlinear Itô SDEs on matrix Lie groups, which we consider as future work. Moreover, we have restricted our research for this paper to the strong convergence order. In future research, an investigation on the weak convergence order of stochastic Lie group methods will also be conducted.