1 Introduction

The main aim of modelling under location uncertainty (LU) is to simulate, on coarse meshes, an enriched system that mimics a high-resolution deterministic chaotic dynamics. Such LU models allow one to recover phenomena such as backscattering, dissipation and reorganisation on very coarse meshes. Furthermore, they provide a natural framework for uncertainty quantification analysis [14]. The LU framework, first introduced in [11], is based on the decomposition of the Lagrangian velocity into two components: a large-scale smooth component and a small-scale, fast-oscillating one. This decomposition leads to a stochastic transport operator, from which one can, in turn, derive stochastic versions of the classical fluid-dynamics systems obtained from the Navier–Stokes equations. The surface quasi-geostrophic (SQG) model, in particular, consists of one stochastic partial differential equation (SPDE), which models the stochastic transport of the buoyancy, and a linear operator relating the velocity and the buoyancy:

$$\displaystyle \begin{aligned} \begin{cases} \mathrm{d} b_t = \frac{1}{2} \boldsymbol{\nabla} \boldsymbol{\cdot}(\boldsymbol{a}\boldsymbol{\nabla} b_t) \mathrm{d}t - \boldsymbol{v}^*\boldsymbol{\cdot}\boldsymbol{\nabla} b_t \mathrm{d}t - \boldsymbol{\nabla} b_t \boldsymbol{\cdot} \boldsymbol{\sigma} \mathrm{ d} {\mathbf{B}}_t, \\ b_t = N(-\varDelta)^{1/2} \psi,\\ \mathbf{u} = \boldsymbol{\nabla}^\bot\psi, \end{cases} {}\end{aligned} $$
(1)

where \(b_t\) is the buoyancy at time t, u the large-scale smooth velocity, N a constant depending on the vertical oscillation frequency of the buoyancy and a Coriolis parameter, B a Wiener process, ψ the stream function and \(\boldsymbol {v}^* = \boldsymbol {u} - \frac {1}{2}\boldsymbol {\nabla } \boldsymbol {\cdot } \boldsymbol {a} + \boldsymbol {\sigma } \boldsymbol {\nabla } \boldsymbol {\cdot } \boldsymbol {\sigma }\) a corrected velocity associated with the effect of the noise inhomogeneity on the advected variables. The spatial correlations of the noise are given through an integral kernel operator σ (here assumed deterministic and symmetric for the sake of simplicity), and the variance matrix a, given by the matrix kernel of the operator σσ, provides a local measure of the noise strength. For more details on the derivation of this system, see [10, 13]. In the rest of this work we mainly focus on the first equation, and the last two are condensed into \(\mathbf {u} = \boldsymbol {\mathcal {H}}(b)\). Concerning the modelling of the noise, we use the equivalent and convenient spectral definition:

$$\displaystyle \begin{aligned} \boldsymbol{\sigma} \text{d}{\mathbf{B}}_t = \sum_{m} \boldsymbol{\varphi}^m \text{d}\beta_t^m,\end{aligned} $$

where \(\beta^m = \beta^m(t)\) are independent one-dimensional standard Brownian motions and \(\boldsymbol {\varphi }^m = [\varphi _x^m, \varphi _y^m]^T \, (\mathbf {x})\) are basis functions. The number of terms in the sum is in theory infinite, but in numerical applications the sum is truncated. In the definition of the numerical schemes we will thus assume that it is finite. For the computation of the basis functions, two strategies are possible: an offline strategy, where they are defined from the eigenfunctions of an empirical covariance tensor built from high-resolution data as described in [10, 13]; or an online strategy, where the functions are updated during the simulation, in which case they are a function of the buoyancy b. With this representation, the variance tensor reads:

$$\displaystyle \begin{aligned} \boldsymbol{a} = \sum_m \boldsymbol{\varphi}^m (\boldsymbol{\varphi}^m)^T. \end{aligned}$$
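As an illustration of the truncated spectral representation, the following minimal sketch builds the variance tensor and draws one noise increment on a grid. The array layout `(M, 2, ny, nx)` and the function names `variance_tensor` and `sample_noise` are assumptions made for this example; they are not taken from the references.

```python
import numpy as np

def variance_tensor(phi):
    """Pointwise variance tensor a = sum_m phi^m (phi^m)^T.

    phi has shape (M, 2, ny, nx): M basis functions, 2 velocity components.
    The result has shape (2, 2, ny, nx).
    """
    return np.einsum("mi...,mj...->ij...", phi, phi)

def sample_noise(phi, dt, rng):
    """Draw sigma dB = sum_m phi^m dbeta^m over one time step of length dt."""
    # Independent Brownian increments, each with variance dt.
    dbeta = rng.normal(0.0, np.sqrt(dt), size=phi.shape[0])
    sigma_dB = np.tensordot(dbeta, phi, axes=1)  # shape (2, ny, nx)
    return sigma_dB, dbeta
```

With a single constant mode \(\boldsymbol{\varphi}^1 = [1, 2]^T\), for instance, `variance_tensor` returns the matrix with rows (1, 2) and (2, 4) at every grid point.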

2 Numerical Schemes

In this section we derive a two-step numerical scheme in time for the SQG system under LU (SQG-LU). We compare this scheme to other multi-step schemes for the SPDE, in particular the ones developed in [5] and [4], and show how our scheme improves the precision. Concerning discretisation in space, standard spectral methods are used: the linear terms are treated in the Fourier space, whilst the nonlinear terms are discretised in the physical space.
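To make the spectral treatment of the linear part concrete, here is a minimal sketch of the operator \(\boldsymbol{\mathcal{H}}\) of system (1) on a doubly periodic domain. It assumes the convention \(\boldsymbol{\nabla}^\bot = (-\partial_y, \partial_x)^T\), sets the mean of ψ to zero, and uses a hypothetical function name `sqg_velocity`; none of these choices is prescribed by the text above.

```python
import numpy as np

def sqg_velocity(b, N, Lx, Ly):
    """Invert b = N (-Laplacian)^{1/2} psi in Fourier space, return u = grad^perp psi.

    b has shape (ny, nx) on a periodic domain of size Lx by Ly.
    """
    ny, nx = b.shape
    kx = 2.0 * np.pi * np.fft.fftfreq(nx, d=Lx / nx)
    ky = 2.0 * np.pi * np.fft.fftfreq(ny, d=Ly / ny)
    KX, KY = np.meshgrid(kx, ky)          # both of shape (ny, nx)
    K = np.sqrt(KX**2 + KY**2)            # symbol of (-Laplacian)^{1/2}

    b_hat = np.fft.fft2(b)
    psi_hat = np.zeros_like(b_hat)
    nonzero = K > 0                       # the mean mode of psi is set to zero
    psi_hat[nonzero] = b_hat[nonzero] / (N * K[nonzero])

    u_x = np.fft.ifft2(-1j * KY * psi_hat).real   # -d(psi)/dy
    u_y = np.fft.ifft2(1j * KX * psi_hat).real    #  d(psi)/dx
    return u_x, u_y
```

For \(b = N \sin x\) on \([0, 2\pi)^2\), the sketch returns ψ = sin x and hence the velocity (0, cos x).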

The derivation of the time scheme consists of two steps: first, we derive a class of Milstein schemes for SQG-LU and we empirically verify their convergence, then a two-step scheme is proposed.

2.1 Derivation of a Milstein Scheme

To design the Milstein schemes, we consider the integral form of the SPDE in (1), namely

$$\displaystyle \begin{aligned} b_t = b_{t_0} + \int_{t_0}^t \left( \frac{1}{2} \boldsymbol{\nabla} \boldsymbol{\cdot}(\boldsymbol{a}\boldsymbol{\nabla} b_s) - \boldsymbol{v}^*\boldsymbol{\cdot}\boldsymbol{\nabla} b_s \right) \mathrm{d}s - \int_{t_0}^t \displaystyle \sum_m \boldsymbol{\nabla} b_s \boldsymbol{\cdot} \boldsymbol{\varphi}^m \mathrm{d} \beta_s^m, {} \end{aligned} $$
(2)

and we can define the following functions:

$$\displaystyle \begin{aligned} f(b_t,t) = \frac{1}{2} \boldsymbol{\nabla} \boldsymbol{\cdot}(\boldsymbol{a}\boldsymbol{\nabla} b_t) - \boldsymbol{v}^*\boldsymbol{\cdot}\boldsymbol{\nabla} b_t \qquad \text{and}\qquad g^m(b_t,t) = - \boldsymbol{\nabla} b_t \boldsymbol{\cdot} \boldsymbol{\varphi}^m. {} \end{aligned} $$
(3)
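The two functions in (3) can be evaluated with spectral derivatives, as in the following minimal sketch. It assumes a doubly periodic grid and that the fields a, \(\boldsymbol{v}^*\) and \(\boldsymbol{\varphi}^m\) are already available as arrays; the helper names (`wavenumbers`, `grad`, `div`, `drift_f`, `diffusion_g`) are illustrative only.

```python
import numpy as np

def wavenumbers(ny, nx, Ly, Lx):
    """Angular wavenumber grids KX, KY of shape (ny, nx)."""
    kx = 2.0 * np.pi * np.fft.fftfreq(nx, d=Lx / nx)
    ky = 2.0 * np.pi * np.fft.fftfreq(ny, d=Ly / ny)
    KX, KY = np.meshgrid(kx, ky)
    return KX, KY

def grad(b, KX, KY):
    """Spectral gradient of a scalar field, shape (2, ny, nx)."""
    b_hat = np.fft.fft2(b)
    return np.stack([np.fft.ifft2(1j * KX * b_hat).real,
                     np.fft.ifft2(1j * KY * b_hat).real])

def div(w, KX, KY):
    """Spectral divergence of a vector field w of shape (2, ny, nx)."""
    w_hat = 1j * KX * np.fft.fft2(w[0]) + 1j * KY * np.fft.fft2(w[1])
    return np.fft.ifft2(w_hat).real

def diffusion_g(b, phi_m, KX, KY):
    """g^m(b) = - grad(b) . phi^m, products taken in physical space."""
    return -np.sum(grad(b, KX, KY) * phi_m, axis=0)

def drift_f(b, a, v_star, KX, KY):
    """f(b) = 0.5 div(a grad(b)) - v* . grad(b)."""
    gb = grad(b, KX, KY)
    flux = np.einsum("ij...,j...->i...", a, gb)   # a grad(b)
    return 0.5 * div(flux, KX, KY) - np.sum(v_star * gb, axis=0)
```

Derivatives are taken in Fourier space and products in physical space, in line with the spatial discretisation described above; no dealiasing is applied in this sketch.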

We can now use the functional extension of the Itô formula [3] for both f and g to write their differential forms:

$$\displaystyle \begin{aligned} f(b_t,t) = f(b_{t_0}, t_0) + \int_{t_0}^t \frac{\partial f}{\partial s}(b_s,s) {\mathrm d} s + \int_{t_0}^t \frac{\partial f}{\partial b}(b_s,s) {\mathrm d} b_s \\ + \frac{1}{2}\int_{t_0}^t \frac{\partial^2 f}{\partial b^2}(b_s,s) {\mathrm d}\langle b, b \rangle_s {} \end{aligned} $$
(4)
$$\displaystyle \begin{aligned} g^m(b_t,t) = g^m(b_{t_0}, t_0) + \int_{t_0}^t \frac{\partial g}{\partial s}^m(b_s,s) {\mathrm d} s + \int_{t_0}^t \frac{\partial g}{\partial b}^m(b_s,s) {\mathrm d} b_s \\ + \frac{1}{2}\int_{t_0}^t \frac{\partial^2 g}{\partial b^2}^m(b_s,s) \mathrm{d}\langle b, b \rangle_s \end{aligned} $$
(5)

We remark that, since the basis \(\boldsymbol{\varphi}^m\) is constant in time, so is a, and the functions f and \(g^m\) do not depend explicitly on time; therefore \(\partial f/\partial t = \partial g^m/\partial t = 0\).

Concerning the first derivatives with respect to b, they have to be interpreted as Fréchet derivatives. The Fréchet derivative of an operator F at \(\overline {x}\) is the bounded linear operator \(DF(\overline {x})\) which satisfies the following relation:

$$\displaystyle \begin{aligned} \lim_{\|h\|\rightarrow 0} \frac{\| F(\overline{x}+h) - F(\overline{x}) -DF(\overline{x})h \|}{\|h\|} = 0, \end{aligned} $$
(6)

which implies that, for a linear operator, \(DF(\overline {x})h = F(h)\). We start with \(g^m\) and use the fact that the gradient is a linear operator:

$$\displaystyle \begin{aligned} \frac{\partial g}{\partial b}(\overline{b})b = - \boldsymbol{\nabla} b \boldsymbol{\cdot} \boldsymbol{\varphi}^m - \boldsymbol{\nabla} b \boldsymbol{\cdot} \frac{\partial \boldsymbol{\varphi}}{\partial b}^m. {} \end{aligned} $$
(7)

If the basis is computed offline, \(\boldsymbol{\varphi}^m\) does not depend on b and therefore the second term in (7) is zero. If the basis is computed online and \(\boldsymbol{\varphi}^m\) does depend on b, we can rewrite the second term of the sum by components and, using the chain rule, one has:

$$\displaystyle \begin{aligned} \boldsymbol{\nabla} b \boldsymbol{\cdot} \frac{\partial \boldsymbol{\varphi}}{\partial b}^m = \frac{\partial b}{\partial x} \frac{\partial \varphi_x^m}{\partial b} + \frac{\partial b}{\partial y} \frac{\partial \varphi_y^m}{\partial b} = \boldsymbol{\nabla} \boldsymbol{\cdot} \boldsymbol{\varphi}^m . {} \end{aligned} $$
(8)

For the second term of f, i.e. \(\boldsymbol{v}^* \boldsymbol{\cdot} \boldsymbol{\nabla} b\), the same considerations are valid. To compute the derivative of the first term of f, we remark that it is a composition and product of three operators, two of which are linear. We can define:

$$\displaystyle \begin{aligned} F_1 (\boldsymbol{h}) = \frac{1}{2} \boldsymbol{\nabla} \boldsymbol{\cdot} \boldsymbol{h}, \quad F_2(b) = \boldsymbol{a} (b), \quad F_3(b) = \boldsymbol{\nabla} b. \end{aligned} $$
(9)

Using the chain rule and the linearity of \(F_1\) and \(F_3\), one has:

$$\displaystyle \begin{aligned} D\Big( F_1\big(F_2(b)F_3(b)\big) \Big) b &= DF_1\big(F_2(b)F_3(b)\big)\big(DF_2(b)F_3(b)+F_2(b)DF_3(b)\big)b \\ & = F_1\big(F_3(b)DF_2(b)b+F_2(b)F_3(b)\big)\\ & = \frac{1}{2} \boldsymbol{\nabla} \boldsymbol{\cdot} \left( \frac{\partial \boldsymbol{a}}{\partial b} \boldsymbol{\nabla} b + \boldsymbol{a} \boldsymbol{\nabla} b \right). \end{aligned} $$
(10)

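Finally, with the same considerations used above, we remark that we can write \((\partial \boldsymbol{a}/\partial b)\, \boldsymbol{\nabla} b = \boldsymbol{\nabla} \boldsymbol{\cdot} \boldsymbol{a}\). Therefore: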

$$\displaystyle \begin{aligned} \frac{\partial f}{\partial b}(\overline{b})b = f(b) + \frac{1}{2} \boldsymbol{\nabla} \boldsymbol{\cdot} \boldsymbol{\nabla} \boldsymbol{\cdot} \boldsymbol{a} - \boldsymbol{\nabla} \boldsymbol{\cdot} \boldsymbol{v}^*, \quad \frac{\partial g}{\partial b}^m(\overline{b})b = g^m(b) - \boldsymbol{\nabla} \boldsymbol{\cdot} \boldsymbol{\varphi}^m. \end{aligned} $$
(11)

As for the Itô covariation bracket, the independence of the Brownian motions \(\beta^m\) gives:

$$\displaystyle \begin{aligned} \langle b, b \rangle_t = \big\langle \int_{t_0}^{\boldsymbol{\cdot}} \sum_m g^m (b_s, s) \mathrm{d} \beta_s^m , \int_{t_0}^{\boldsymbol{\cdot}} \sum_k g^k (b_\tau, \tau) \mathrm{d} \beta_\tau^k \big\rangle_t = \int_{t_0}^t \sum_m \big( g^m(b_s,s) \big)^2 \mathrm{d} s. \end{aligned}$$

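We now assume that one of the following two cases holds:

  • the basis functions \(\boldsymbol{\varphi}^m\) (and therefore a) do not depend on b and \(\boldsymbol{\nabla} \boldsymbol{\cdot} \boldsymbol{v}^* = 0\),

  • the basis functions \(\boldsymbol{\varphi}^m\) depend on b but are such that \(\boldsymbol{\nabla} \boldsymbol{\cdot} \boldsymbol{v}^* = \boldsymbol{\nabla} \boldsymbol{\cdot} \boldsymbol{\nabla} \boldsymbol{\cdot} \boldsymbol{a} = \boldsymbol{\nabla} \boldsymbol{\cdot} \boldsymbol{\sigma} = \boldsymbol{\nabla} \boldsymbol{\cdot} \boldsymbol{\varphi}^m = 0\).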

Note that the first case corresponds to a noise defined from external high-resolution data (which thus does not depend on the solution), while the second case amounts to imposing an incompressibility constraint on the large-scale component, ∇⋅u = 0, which is indeed often considered in practice with a particular scaling of the noise [1, 2]. With these assumptions, we then have:

$$\displaystyle \begin{aligned} \frac{\partial f}{\partial b} = \frac{\partial^2 f}{\partial b^2} = f, \quad \frac{\partial g}{\partial b}^m = \frac{\partial^2 g^m}{\partial b^2} = g^m. \end{aligned} $$
(12)

We can now substitute all these expressions into (4) and (5), and then (4) and (5) into (2). Keeping only the terms of order one or lower, we obtain:

$$\displaystyle \begin{aligned} b_{t} = b_{t_0} + f(b_{t_0})\varDelta t + \sum_m g^m(b_{t_0})\varDelta \beta^m + \int_{t_0}^t \int_{t_0}^s \sum_{m,k} g^m(g^k(b_\tau )) \text{d} \beta_\tau^k \text{d} \beta_s^m, {} \end{aligned} $$
(13)

where \(\varDelta t = t - t_0\) and \(\varDelta \beta ^m = \beta ^m_t - \beta ^m_{t_0}\). We define the following quantities:

$$\displaystyle \begin{aligned} G^{m,k} := g^m(g^k(b_{t_0})), \quad I^{m,k} := \int_{t_0}^t \int_{t_0}^s \text{d} \beta_\tau^k \text{d} \beta_s^m, \end{aligned}$$

then the double iterated Itô integral in (13) can be approximated by \(\sum_{m,k} G^{m,k} I^{m,k}\), which we decompose as follows:

$$\displaystyle \begin{aligned} \sum_{m,k} G^{m,k} I^{m,k} = \sum_{m,k} \left( G^{m,k} \frac{I^{m,k} + I^{k,m}}{2} + G^{m,k} \frac{I^{m,k} - I^{k,m}}{2} \right). \end{aligned}$$

The first, symmetric term can be computed analytically from the Itô integration-by-parts formula, \(I^{m,k} + I^{k,m} = \varDelta \beta^m \varDelta \beta^k - \delta_{m,k} \varDelta t\); the second, antisymmetric term \((I^{m,k}-I^{k,m})/2 =: A_{t_0, t}^{m,k}\), however, cannot. It is known as the Lévy area.
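The closed form of the symmetric part can be checked numerically. The sketch below approximates \(I^{m,k}\) by left-point (Itô) sums on a fine partition of the time step and compares its symmetric part with \((\varDelta\beta^m \varDelta\beta^k - \delta_{m,k}\varDelta t)/2\). The function names and the choice of partition are assumptions made for this illustration.

```python
import numpy as np

def iterated_integrals(dB):
    """Left-point sums approximating I^{m,k} = int int dbeta^k dbeta^m.

    dB has shape (n_sub, M): Brownian increments on a fine partition.
    Returns I of shape (M, M), with I[m, k] ~ I^{m,k}.
    """
    B_left = np.cumsum(dB, axis=0) - dB   # beta_{t_j} - beta_{t_0} at left end-points
    return dB.T @ B_left

def symmetric_part(dbeta, dt):
    """Closed form (dbeta^m dbeta^k - delta_{mk} dt) / 2."""
    return 0.5 * (np.outer(dbeta, dbeta) - dt * np.eye(len(dbeta)))
```

For the discrete sums, `I + I.T` equals `np.outer(dbeta, dbeta) - dB.T @ dB` exactly, where `dbeta = dB.sum(axis=0)`; the matrix `dB.T @ dB` converges to \(\delta_{m,k}\varDelta t\) as the partition is refined, which recovers the formula above.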

2.1.1 Lévy Area Simulation

In this subsection, we briefly introduce the methods we used to simulate the Lévy area. More details can be found in [6, 8], where these methods were proposed. The first method to simulate the Lévy area will be referred to as the weak approximation in the rest of this work: in this method, we simulate a random variable that has the same moments as the Lévy area. The second method, which will be referred to as the conditional method, is a recursive method: the time interval \((t_0, t)\) is recursively split into two subintervals of the same length, and the two following relations are used:

$$\displaystyle \begin{aligned} A_{t_0, t}^{m,k} = A_{t_0, u}^{m,k} + A_{u, t}^{m,k} + \frac{1}{2} \Big((\beta_u^k - \beta_{t_0}^k)(\beta_t^m - \beta_u^m) - (\beta_u^m - \beta_{t_0}^m)(\beta_t^k - \beta_u^k) \Big) {} \end{aligned} $$
(14)
$$\displaystyle \begin{aligned} \mathbb{E}[A_{t_0,t} | {\mathbf{B}}_t - {\mathbf{B}}_{t_0}] = 0. \end{aligned}$$

For more details on these two methods, see [7]. Finally, we consider a third approach, in which the Lévy area is neglected. We remark that this approach is exact if \(G^{m,k} = G^{k,m}\), which is not the case here.
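One possible reading of the conditional method is sketched below: the step is split recursively, the cross terms of the splitting relation are accumulated, and the Lévy area of each finest subinterval is replaced by its conditional mean, zero. The index convention is the one used in this section, \(A^{m,k} = (I^{m,k} - I^{k,m})/2\) with the inner integral of \(I^{m,k}\) taken with respect to \(\beta^k\); exchanging the roles of m and k flips the sign of the cross term. The function name and the input layout are assumptions; this is not the implementation of [6, 8].

```python
import numpy as np

def levy_area_conditional(dB):
    """Recursive approximation of the Levy area matrix over one time step.

    dB has shape (2**n, M): Brownian increments on the 2**n subintervals
    produced by n recursive splittings. Returns A of shape (M, M).
    """
    n_sub, M = dB.shape
    if n_sub == 1:
        # Finest level: use E[A | increment] = 0.
        return np.zeros((M, M))
    half = n_sub // 2
    left, right = dB[:half], dB[half:]
    d_left = left.sum(axis=0)     # beta_u - beta_{t_0}
    d_right = right.sum(axis=0)   # beta_t - beta_u
    # cross[m, k] = 0.5 * (d_left[k] * d_right[m] - d_left[m] * d_right[k])
    cross = 0.5 * (np.outer(d_right, d_left) - np.outer(d_left, d_right))
    return levy_area_conditional(left) + levy_area_conditional(right) + cross
```

Here the fine increments are drawn first and summed to obtain Δβ over the whole step. The result coincides with the antisymmetric part of the left-point sums on the same partition, so the cost and the accuracy both grow with the number of splittings.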

2.2 Multi-Step Schemes

We next propose a two-step scheme in which the Milstein method is used as the prediction step and the Euler method as the correction step. It reads:

$$\displaystyle \begin{aligned} \begin{cases} b_{t}^* = b_{t_0} + f(b_{t_0}, \boldsymbol{u}_{t_0})\varDelta t + \sum_{m} g^m(b_{t_0})\varDelta \beta^m + \sum_{m,k} G^{m,k} \Big( S_{t_0, t}^{m, k} + \tilde{A}_{t_0, t}^{m, k} \Big) \\ \boldsymbol{u}^*_t = \boldsymbol{\mathcal{H}}(b^*_t) \\ b_t = \frac{1}{2} b_{t_0} + \frac{1}{2} \Big( b_{t}^* + f(b_{t}^*, \boldsymbol{u}^*_t)\varDelta t+\displaystyle \sum_{m} g^m(b_{t}^*)\varDelta \beta^m \Big) \\ \end{cases} {} \end{aligned} $$
(15)

where \(S_{t_0, t}^{m, k} := (\varDelta \beta ^m \varDelta \beta ^k - \delta _{m,k}\varDelta t)/2\) and \(\tilde {A}_{t_0, t}^{m, k}\) is one of the approximations of the Lévy area described in the previous subsection. This scheme will be referred to as SRK2-EM (EM stands for Euler-Milstein not for Euler-Maruyama) in the rest of the paper.
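A single step of scheme (15) can be sketched as follows for generic callables. The signature is an assumption made for this illustration: `f(b, u)` is the drift, `g(m, b)` applies \(g^m\), `velocity(b)` stands for \(\boldsymbol{\mathcal{H}}\), and `levy_area` is an optional M × M array holding \(\tilde{A}^{m,k}\) (taken as zero when omitted).

```python
import numpy as np

def srk2_em_step(b, f, g, velocity, dt, dbeta, levy_area=None):
    """One step of the two-step scheme: Milstein predictor, Euler corrector."""
    M = len(dbeta)
    A = np.zeros((M, M)) if levy_area is None else levy_area

    # Prediction: Milstein step from b_{t_0}.
    u = velocity(b)
    g_b = [g(m, b) for m in range(M)]
    b_star = b + f(b, u) * dt + sum(g_b[m] * dbeta[m] for m in range(M))
    for m in range(M):
        for k in range(M):
            S = 0.5 * (dbeta[m] * dbeta[k] - (m == k) * dt)
            b_star = b_star + g(m, g_b[k]) * (S + A[m, k])   # G^{m,k} (S + A)

    # Correction: Euler step from b*, averaged with b_{t_0}.
    u_star = velocity(b_star)
    euler = (b_star + f(b_star, u_star) * dt
             + sum(g(m, b_star) * dbeta[m] for m in range(M)))
    return 0.5 * b + 0.5 * euler
```

Written this way, the predictor applies the operators \(g^m\) about M² times per step to form the \(G^{m,k}\), which is worth keeping in mind when M is large. Without noise and with f(b) = −b, the step reduces to the factor 1 − Δt + Δt²/2 of a second-order Runge-Kutta method.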

In the next section, we first analyse the results of the Milstein schemes with the different Lévy area approximations in order to select the best one. Then, we compare our two-step scheme to two other multi-step schemes developed in [5] and [4], which we briefly recall here. The first one, based on a third-order Runge-Kutta scheme (SSPRK3) [5], reads:

$$\displaystyle \begin{aligned} \begin{cases} b^{(1)} = b_{t_0} + f_s(b_{t_0}, \boldsymbol{u}_{t_0})\varDelta t + \sum_{m} g^m(b_{t_0})\varDelta \beta^m \\ \boldsymbol{u}^{(1)} = \boldsymbol{\mathcal{H}}(b^{(1)}) \\ b^{(2)} = \frac{3}{4} b_{t_0} + \frac{1}{4}\left( b^{(1)} + f_s(b^{(1)}, \boldsymbol{u}^{(1)})\varDelta t + \sum_{m} g^m( b^{(1)})\varDelta \beta^m \right) \\ \boldsymbol{u}^{(2)} = \boldsymbol{\mathcal{H}}(b^{(2)}) \\ b_t = \frac{1}{3} b_{t_0} + \frac{2}{3}\left( b^{(2)} + f_s(b^{(2)}, \boldsymbol{u}^{(2)})\varDelta t + \sum_{m} g^m( b^{(2)})\varDelta \beta^m \right) \\ \end{cases} {} \end{aligned} $$
(16)

where \(f_s = f - \boldsymbol{\nabla} \boldsymbol{\cdot} (\boldsymbol{a} \boldsymbol{\nabla} b)/2\) denotes the modified drift under the Stratonovich integral. The second one, which relies on the Euler-Heun method [4], also for the Stratonovich integral, reads:

$$\displaystyle \begin{aligned} \begin{cases} b^{(1)} = b_{t_0} + f_s(b_{t_0}, \boldsymbol{u}_{t_0})\varDelta t + \sum_{m} g^m(b_{t_0})\varDelta \beta^m \\ \boldsymbol{u}^{(1)} = \boldsymbol{\mathcal{H}}(b^{(1)}) \\ b_t = \frac{1}{2} b_{t_0} + \frac{1}{2}\left( b^{(1)} + f_s(b^{(1)}, \boldsymbol{u}^{(1)})\varDelta t + \sum_{m} g^m( b^{(1)})\varDelta \beta^m \right) \\ \end{cases} {} \end{aligned} $$
(17)
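Schemes (16) and (17) share the same Euler-type stage, which the following minimal sketch makes explicit. As before, the callables `f_s(b, u)`, `g(m, b)` and `velocity(b)` and the function names are assumptions made for illustration; the same Brownian increments `dbeta` are reused in every stage, as in (16) and (17).

```python
import numpy as np

def euler_stage(b, f_s, g, velocity, dt, dbeta):
    """b + f_s(b, H(b)) dt + sum_m g^m(b) dbeta^m."""
    u = velocity(b)
    noise = sum(g(m, b) * dbeta[m] for m in range(len(dbeta)))
    return b + f_s(b, u) * dt + noise

def ssprk3_step(b, f_s, g, velocity, dt, dbeta):
    """Three-stage scheme (16)."""
    b1 = euler_stage(b, f_s, g, velocity, dt, dbeta)
    b2 = 0.75 * b + 0.25 * euler_stage(b1, f_s, g, velocity, dt, dbeta)
    return b / 3.0 + 2.0 / 3.0 * euler_stage(b2, f_s, g, velocity, dt, dbeta)

def heun_step(b, f_s, g, velocity, dt, dbeta):
    """Two-stage scheme (17)."""
    b1 = euler_stage(b, f_s, g, velocity, dt, dbeta)
    return 0.5 * b + 0.5 * euler_stage(b1, f_s, g, velocity, dt, dbeta)
```

In the noise-free case with \(f_s(b) = -b\), the two steps reduce to the amplification factors 1 − Δt + Δt²/2 − Δt³/6 and 1 − Δt + Δt²/2, respectively.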

3 Numerical Results

In this section we show some numerical results. First, the effect of the different approximations of the Lévy area is studied on the Milstein scheme. Then, the multi-step scheme is assessed and compared to the ones already proposed in the literature. We focus on two variations of one specific test case plotted in Fig. 1: the initial condition (left) consists of two warm elliptical anticyclones at the bottom of the domain and two cold elliptical cyclones at the top. After one day under moderate noise (centre), the four structures have rotated by approximately 45°. After one day under strong noise (right), the nonlinearity of the dynamics is more noticeable. The configuration details used for the moderate-noise simulations can be found in Chapter 6 of [10]. For the strong noise, all the basis functions \(\boldsymbol{\varphi}^m\) are multiplied by a factor of 10.

Fig. 1 Euler-Maruyama simulation of system (1) on a 128 × 128 spatial grid

We will use the following abbreviations for the different numerical schemes:

  • Euler: Euler-Maruyama scheme.

  • Milstein-0: Milstein scheme without the Lévy area.

  • Milstein-weak: Milstein scheme with the weak approximation of the Lévy area.

  • Milstein-cond-n: Milstein scheme with the conditional approximation of the Lévy area. Here n stands for the number of times the interval is recursively split (cf. (14)).

  • SRK2-EM: scheme (15) with \(\tilde {A}^{m,k}_{t_0,t} = 0\).

  • SSPRK3: scheme (16).

  • Heun: scheme (17).

In Figs. 2 and 3 one can see the differences between the Euler-Maruyama scheme and all the Milstein schemes proposed. In Fig. 2 we plot, for each scheme and over a period of 30 days, the root mean squared error (RMSE), defined as:

$$\displaystyle \begin{aligned} \text{RMSE} = \frac{1}{|\varOmega|} \mathbb{E} \Big[ \big\|b_h - b \big\|{}_{L^2(\varOmega)}^2 \Big]^{1/2}, \end{aligned} $$
(18)

where Ω denotes the spatial domain, \(b_h\) is the numerical solution of the stochastic system (1), and b stands for the reference solution downsampled from a high-resolution deterministic simulation (recall that the aim of the stochastic setting is to reproduce high-resolution deterministic simulations on a coarse grid). The downsampling procedure consists of a low-pass filtering performed in the Fourier domain followed by a subsampling operation. The expectations are estimated from 30 realisations. These results are obtained with a time step Δt for the Euler scheme that is half that of the other schemes. One can observe that Milstein-0 performs slightly better than the other Milstein schemes.
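The downsampling procedure and the error measure (18) can be sketched as follows. The sharp spectral cut-off at the coarse-grid Nyquist wavenumber, the uniform grid spacing and the function names are assumptions made for this example; the filter actually used for the reference solution may differ.

```python
import numpy as np

def downsample(b_fine, factor):
    """Low-pass filter in Fourier space, then keep every `factor`-th point."""
    ny, nx = b_fine.shape
    ky = np.fft.fftfreq(ny) * ny          # integer wavenumbers
    kx = np.fft.fftfreq(nx) * nx
    # Keep only modes strictly below the Nyquist wavenumber of the coarse grid.
    keep_y = np.abs(ky) < ny // (2 * factor)
    keep_x = np.abs(kx) < nx // (2 * factor)
    mask = np.outer(keep_y, keep_x)
    filtered = np.fft.ifft2(np.fft.fft2(b_fine) * mask).real
    return filtered[::factor, ::factor]

def rmse(ensemble, reference, dx, dy):
    """RMSE as written in (18): E[||b_h - b||_{L2}^2]^{1/2} / |Omega|.

    ensemble has shape (R, ny, nx): R realisations on the coarse grid.
    """
    ny, nx = reference.shape
    l2_sq = np.sum((ensemble - reference) ** 2, axis=(1, 2)) * dx * dy
    area = nx * dx * ny * dy
    return np.sqrt(l2_sq.mean()) / area
```

The expectation is replaced by the empirical mean over the realisations, which is how it would be estimated from an ensemble of 30 runs.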

Fig. 2 RMSE (normalised by the amplitude of buoyancy \(B_0 = 10^{-3}\) m/s²) of different schemes during 30 days of simulation under moderate noise

Fig. 3 Convergence of different schemes under weak and strong noise. Order 1 in dotted black, order 0.5 in dashed black

In Fig. 3, we show the rate of strong convergence γ of all the schemes discussed, under weak and strong noise. Since the exact solution is unknown, we use the following method [15] to estimate γ, for a sufficiently small Δt:

$$\displaystyle \begin{aligned} \gamma \simeq \log_2\left(\frac{e_1}{e_2}\right), \text{with } e_i := \mathbb{E}\left[\left\| b_h\left(T, \frac{\varDelta t}{2^{i-1}}\right) - b_h\left(T, \frac{\varDelta t}{2^i}\right) \right\|{}_{L^2(\varOmega)}^2 \right]^{1/2}, \end{aligned}$$

where \(b_h(T, \varDelta t)\) is the numerical solution at the final time T obtained with a time step Δt. It is important to underline that, for this method to work, the Brownian trajectories must be fixed. We applied this method for the time steps Δt = 30, 60, 120 and 240, hence obtaining two estimates for γ. It is important to remark that the time steps are given in seconds, while the time scale of the studied phenomenon is of the order of one day. For reference, the CFL condition for this problem at the initial time would give a time step of around 300 s. The smallest time step we considered to obtain this estimate is ten times smaller than this. As one can see from Fig. 3, under weak noise all the one-step schemes provide almost identical results and all the multi-step schemes are very similar, so that it is hard to distinguish among the different numerical schemes proposed. In particular, for the considered span of time steps, the error of the Euler scheme under moderate noise displays a linear trend, and the prevailing convergence order in this case is one. The reason for this is explained in the Appendix.
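The estimator of γ and the requirement of fixed Brownian trajectories can be sketched as follows. Brownian increments are drawn once on the finest time grid and summed pairwise to drive the runs with larger time steps. The function names and the array layout (realisations along the first axis) are assumptions made for this illustration.

```python
import numpy as np

def coarsen_increments(dB):
    """Sum consecutive pairs of increments: same Brownian path, time step doubled.

    dB has shape (n_steps, M) with n_steps even.
    """
    return dB[0::2] + dB[1::2]

def l2_error(sol_a, sol_b, cell_area=1.0):
    """E[||sol_a - sol_b||_{L2}^2]^{1/2}, expectation over the first axis."""
    diff_sq = (sol_a - sol_b) ** 2
    l2_sq = diff_sq.reshape(diff_sq.shape[0], -1).sum(axis=1) * cell_area
    return np.sqrt(l2_sq.mean())

def convergence_order(sol_dt, sol_dt2, sol_dt4, cell_area=1.0):
    """gamma ~ log2(e1 / e2) from runs with time steps dt, dt/2 and dt/4."""
    e1 = l2_error(sol_dt, sol_dt2, cell_area)
    e2 = l2_error(sol_dt2, sol_dt4, cell_area)
    return np.log2(e1 / e2)
```

With the four time steps used here, applying `convergence_order` to the triples (240, 120, 60) and (120, 60, 30) gives the two estimates of γ.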

Under strong noise, it is easier to see the differences among the schemes. Milstein-weak is a slight improvement on Euler-Maruyama, but its rate of convergence is far from 1. Milstein-0 has the highest rate of convergence among all the schemes.

In conclusion, Milstein-0 seems to perform better than the other Milstein schemes. Furthermore, it is less computationally demanding. For these reasons, we built our two-step scheme on Milstein-0.

In Fig. 3 we also compare the multi-step schemes mentioned above: they all have a similar behaviour, with a rate of convergence 0.5 ≤ γ ≤ 1, but a much smaller error than the one-step schemes. In particular, the two-step scheme proposed in this work (SRK2-EM in the figures) yields the smallest error of all for this test case. The SRK2-EM scheme also yields the smallest RMSE (cf. Fig. 2).

4 Conclusion and Perspectives

The Milstein schemes analysed in this work improve the numerical results, in particular when used in a multi-step framework. The Lévy area does not seem to play a key role in these test cases, which allows us to drastically reduce the computational costs. It must be pointed out that under weak noise, all the schemes tested provide very similar results. Ongoing and future work includes understanding the (non-)importance of the Lévy area and whether it is related to the test case, the equations, or other factors.