1 Introduction

The efficiency of numerical methods is of central importance to practitioners, and it has lately seen a surge of interest in the field of uncertainty quantification (UQ) [40,41,42]. UQ seeks to combine statistical and probabilistic techniques with traditional numerical schemes to improve the modeling and the accuracy of estimates. Examples of applications include climate modeling, subsurface flow, medical imaging and deep learning [1, 12, 37]. Particular attention has been given to the class of numerical methods known as Monte Carlo (MC) methods, which are used to solve problems incorporating elements of randomness or uncertainty [35, 39, 42], i.e., in stochastic computations. One methodology that has exhibited improved efficiency and a high level of applicability is multilevel Monte Carlo (MLMC).

MLMC is a numerical technique aimed at reducing the computational cost of the Monte Carlo method. The methodology was first introduced by Heinrich [22] and extended and popularized by various works on diffusion processes by Giles [14, 15]. The methodology of MLMC can be viewed as a variance-reduction technique. Since these works, MLMC has been applied in numerous areas, including stochastic filtering [6, 13, 24, 26, 27], Markov chain Monte Carlo (MCMC) [7, 10] and partial differential equations with random input arising in UQ [2, 7, 19]. MLMC applies to problems that require a discretization, such as estimating an expectation at some terminal time w.r.t. the law of a diffusion process. For instance, in the diffusion case, this can be a time-discretization based on the Euler method. One then decomposes an expectation w.r.t. a law associated to a very precise discretization into a telescoping sum of differences of expectations associated to laws of increasingly coarse discretizations. The objective is then to sample from coupled probability distributions associated to consecutive discretized laws and to apply Monte Carlo at each summand of the telescoping sum to achieve a variance reduction, relative to using Monte Carlo at the finest discretization. The word "level" in the acronym MLMC refers to the resolution of the discretization.

Despite the substantial advancements made with MLMC, the number of applications and research papers on applying the methodology to stochastic partial differential equations (SPDE) [29] is relatively small. Such examples include finite-difference solvers with applications in mathematical finance [16], and finite element methods for parabolic SPDE [3, 5]. There are open questions on the efficiency and scope of MLMC for SPDE which we use as motivation for this work: is it possible to improve the efficiency of MLMC through strong pairwise coupling of numerical solutions of SPDE, and can that widen the scope of MLMC on SPDE to problems in higher dimensions and with lower-regularity driving noise?

Our objective in this manuscript is to present a complexity study of an alternative way to apply MLMC for SPDE, which can demonstrate computational gains. This approach is based on the exponential Euler method [11, 23, 30, 36] and strong pairwise coupling of solution realizations on different levels. The strong pairwise coupling approach was introduced and studied experimentally in the work on finite-dimensional Langevin SDE by Müller et al. [38] and extended to filtering methods for (infinite-dimensional) SPDE by Chernov et al. [8]. The coupling idea is based on the exponential Euler integrator [11, 23, 33, 36] for time-discretization of reaction-diffusion type SDE/SPDE. For the finite-dimensional SDE in [38], strong coupling is shown to produce constant-factor efficiency gains in numerical experiments, whereas for the herein considered class of SPDE, we show that strong coupling reduces the asymptotic rate of growth in the computational cost. This indicates that strong coupling for MLMC can lead to more substantial asymptotic efficiency gains for infinite-dimensional problems than for finite-dimensional ones.

The main contribution of this work is to demonstrate the improvements of the discussed coupling approach, for numerically solving SPDE. This is presented in the standard-format cost-versus-error result for exponential Euler MLMC in Theorem 2. Specifically, our findings suggest that in order to achieve \(\mathcal {O}(\epsilon ^{2})\) mean squared error (MSE) in a standard setting, we have to pay \(\mathcal {O}(\epsilon ^{-2})\) in computational cost. This is a reduction in cost compared to other existing methods, such as the Milstein MLMC method, for which the cost is \(\mathcal {O}(\epsilon ^{-3})\), cf. Theorem 3 and [5], and it is, to the best of our knowledge, the first theoretical result on the performance of the exponential Euler MLMC for any nonlinear stochastic PDE. We also verify these gains numerically on two SPDE, one with a linear reaction term and one with a nonlinear one.

The outline of this paper is as follows. In Sect. 2 we describe our model problem, which is a semilinear SPDE, and review fundamental properties of the MLMC method. Section 3 describes our proposed coupling method for MLMC and two alternative methods. We also summarize the theoretical properties of our main MLMC method in Theorem 2. Numerical experiments on various SPDE are conducted in Sect. 4 to demonstrate the improvement with the proposed coupling. Finally, we conclude our findings, and provide future areas of research, in Sect. 5. Required model assumptions are provided in the Appendix.

2 Background Material

In this section we present and review the MLMC method applied to numerical discretizations of SPDEs. We first introduce the SPDE under consideration, and then review the approximation methods: a spectral Galerkin spatial discretization combined with either exponential Euler or Milstein discretization in time.

2.1 Notation

Let \(T>0\) and let \((\varOmega , \mathcal {F}, \mathbb {P})\) be a complete probability space equipped with a filtration \((\mathcal {F}_t)_{t\in [0,T]}\). H denotes a non-empty separable Hilbert space with inner product \(\langle \cdot , \cdot \rangle \), norm \(\Vert \cdot \Vert _{ H} = \sqrt{\langle \cdot , \cdot \rangle }\) and orthogonal basis \((e_n)_{n=1}^{\infty }\). \(L^2(\varOmega ,H)\) denotes the associated Bochner–Hilbert space, consisting of the set of strongly measurable maps \(f:\varOmega \rightarrow H\) such that

$$\begin{aligned} \Vert f \Vert _{ L^2(\varOmega ,H)}^2:= \int _{\varOmega } \Vert f(\omega )\Vert _{ H}^2 \, \textrm{d}\mathbb {P}(\omega ) < \infty \end{aligned}$$

Let \(\mathbb {N}:= \{1,2,\ldots \}\), and for every \(N\in \mathbb {N}\), we introduce the finite-dimensional subspace \(H^N:= \text {span} \{ e_n\mid n=1,\dots ,N\}\subset H\) and the associated orthogonal projection operator \(P_{ N} v:= \sum ^{ N}_{n=1} \langle v,e_n \rangle _{}\, e_n\) for \(v\in H\). For a (normally implicitly given) set B and mappings \(f,g:B \rightarrow [0,\infty )\), the notation \(f \lesssim g\) implies there exists a \(C>0\) such that \(f(x) \le C g(x)\) for all \(x\in B\), and the notation \(f\eqsim g\) means that both \(f \lesssim g\) and \(g\lesssim f\) hold. For multivariate positive-valued functions \(f(x,y)\) and \(g(x,y)\) for which it holds for some \(C>0\) that \(f(x,y) \le C g(x,y)\) for all \((x,y) \in \text {Domain}(f) = \text {Domain}(g)\), we write \(f \lesssim _{(x,y)} g\) if confusion is possible. Similarly, \(f \eqsim _{(x,y)} g\) means that \(f \lesssim _{(x,y)} g\) and \(g \lesssim _{(x,y)} f\). For \(m,n \in \mathbb {Z}\) with \(m\le n\), we introduce the integer interval \(\llbracket m,n\rrbracket := [m,n]\cap \mathbb {Z}\), and for \(x\in \mathbb {R}\) we define \(\lceil x \rceil :=\min \{n \in \mathbb {Z}\mid n\ge x\}\).

2.2 Problem Setup

We consider a semilinear stochastic partial differential equation [9] of the form

$$\begin{aligned} \begin{aligned} dU_t=&\,\left( AU_t + f(U_t)\right) \,dt+dW_t \quad \text {for}\quad t \in [0,T],\\ U_0=&\,u_0, \end{aligned} \end{aligned}$$
(2.1)

where \(A:D(A) \rightarrow H \) is a linear operator, \(u_0\in H\) is a random-valued initial condition, \(f:H \rightarrow H\) is a reaction term that in general is nonlinear, and \(W_t\) is a Q-Wiener process, cf. (A.2). A number of further assumptions are imposed for the problem, which we have deferred to Appendix  1. Suffice it to say here that we do assume that the linear operator is negative-definite and spectrally decomposable in the considered basis:

$$\begin{aligned} A v = - \sum _{k=1}^{\infty } \lambda _k \langle e_k, v\rangle e_k \end{aligned}$$
(2.2)

and that the Q-Wiener process takes the form

$$\begin{aligned} W(t,x) = \sum ^{\infty }_{n=1}\sqrt{q_n}e_n w^n_t, \end{aligned}$$

where \((w^n_t)_{n=1}^{\infty }\) is a sequence of independent scalar-valued Wiener processes. We note that the eigenbasis of the operator A, \((e_n)_{n=1}^{\infty }\), also appears in the representation of the Q-Wiener process. The strictly positive sequence \((\lambda _n)_{n=1}^\infty \) and the non-negative sequence \((q_n )_{n=1}^\infty \) are further described in Appendix  1.

The mild solution to Eq. (2.1) is an H-valued predictable process \((U_t)_{t\in [0,T]}\) satisfying

$$\begin{aligned} {\mathbb P}\left( \omega \in \varOmega \, \Big | \, \,U_t = e^{A t}u_0 + \int ^t_0 e^{A(t-s)}f(U_s)ds + \int ^t_0e^{A(t-s)}dW_s\, \quad \forall t \in [0,T] \right) =1. \end{aligned}$$
(2.3)

The general form of (2.1) encapsulates numerous SPDE in practice. We will introduce and numerically study some of these in Sect. 4.

2.3 Numerical Methods

Numerical approximations of SPDE have traditionally been computed through the use of finite difference methods and finite element methods (FEM) [29, 35, 43]. In this work, we instead use an alternative class of Galerkin-based solvers, namely spectral Galerkin methods, which we review below.

2.3.1 Continuous-time Spectral Galerkin Methods

For \(N\in {\mathbb N}\), consider the Galerkin problem of solving the SPDE (2.1) on the subspace \(H^N\):

$$\begin{aligned} \begin{aligned} dU^{ N}_t=&\, \big (A_{ N}U^{ N}_t+f_{ N}(U^{ N}_t) \big ) \, dt +dW^{ N}_t,\\ U^{ N}_0=&\,u_0^{ N}:=P_{ N}u_0,\\ \end{aligned} \end{aligned}$$
(2.4)

where \(A_{ N}:=P_{ N}A\), \(f_{ N}(v):=P_{ N}(f(v))\) and \(W^{ N}_t:= P_{ N} W_t=\sum ^{ N}_{n=1}\sqrt{q_n}\,e_n\, w^n_t.\) It is well-known [32] that (2.4) has a unique mild solution given by

$$\begin{aligned} U^{ N}_t = e^{A_Nt} u^{ N}_0 + \int ^t_0e^{A_N(t-s)}f_{ N}(U^{ N}_s)\;ds +\int ^t_0 e^{A_N(t-s)}\;dW^{ N}_s. \end{aligned}$$

We next discuss two time-discretizations of spectral Galerkin methods.

2.3.2 The Exponential Euler Method

For a given \(J\in {\mathbb N}\), let \(\Delta t =\frac{T}{J}\) and let \((t_j)_{j=0}^{ J}\) be the nodes of a uniformly spaced mesh of [0, T], so that \(t_j=j\,\Delta t\) for \(j\in \llbracket 0,J\rrbracket \). Then, for given \(N\in {\mathbb N}\), the exponential Euler approximations \((V^{ N,J}_j)_{j=0}^{ J}\subset H^N\) of \((U^{ N}_{t_j})_{j=0}^J\) are defined by

$$\begin{aligned} \begin{aligned} V^{ N,J}_0&:=\,u_0^{ N},\\ V^{ N,J}_{j+1}&=\,e^{A_N\Delta t} V^{ N,J}_j +A_{ N}^{-1}(e^{A_N\Delta t}-I)f_{ N}(V^{ N,J}_j)\\&\quad +\int ^{t_{j+1}}_{t_j}e^{A_N(t_{j+1}-s)}\,dW^{ N}_s \qquad \forall j\in \llbracket 0,J-1\rrbracket , \end{aligned} \end{aligned}$$
(2.5)

where \(A_{ N}^{-1}:H^N\rightarrow H^N\) denotes the inverse operator of \(A_{ N}\). Defining for \(n \in \llbracket 1,N\rrbracket \) the components of \(V^{ N,J}_j\) and \(f_{ N}(\cdot )\) by \(V^{ N,J}_{j,n}:=\langle V^{ N,J}_j,e_n \rangle \) and \(f_{{ N},n}(\cdot ):=\langle f(\cdot ),e_n\rangle \), respectively, and recalling the spectral decomposition of the operator A, we arrive at the recursive relation

$$\begin{aligned} V^{ N,J}_{j+1,n} = e^{-\lambda _n\Delta t}\,V_{j,n}^{ N,J} +\frac{1-e^{-\lambda _n \Delta t}}{\lambda _n} \,f_{{ N},n}(V^{ N,J}_j) + R_{j,n}, \end{aligned}$$
(2.6)

where

$$\begin{aligned} R_{j,n} := \sqrt{q_n} \int _{t_j}^{t_{j+1}} e^{-\lambda _n (t_{j+1} -s)} dw_s^{n} \sim \mathcal {N}\left( 0,\frac{q_n\,(1 -e^{-2\lambda _n\Delta t}) }{2\lambda _n}\right) . \end{aligned}$$

We recall from (2.2) that \(-\lambda _n\) denotes the n-th eigenvalue of the operator A, see also Assumption 2 in Appendix 1 for further details. Convergence properties of the exponential Euler scheme have been studied in [30], where the authors demonstrate strong convergence and highlight an improvement in the order of convergence in time over traditional numerical schemes:

Proposition 1

(Jentzen and Kloeden [30]) Let all assumptions in Appendix  1 hold for some \(\phi \in (0,1)\) relating to the regularity of the Q-Wiener process. Then

$$\begin{aligned} \max _{j\in \llbracket 0,J\rrbracket } \mathbb {E}\left[ \Vert U_{t_j}-V^{ N,J}_j\Vert ^2_{ H}\right] \lesssim _{(N,J)} \lambda ^{-2\phi }_{ N} + \left( \frac{\log _2(J)}{J} \right) ^{2}, \end{aligned}$$
(2.7)

where U is the mild solution (2.3) of (2.1), and \((V^{ N,J}_j)_{j=0}^{ J}\) denotes the exponential Euler approximation of the mild solution, cf. (2.5).

We note that the first term on the RHS of (2.7) is related to the discretization in space and the second term is related to the discretization in time.
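To make the component recursion (2.6) concrete, the following is a minimal Python sketch of the exponential Euler iteration. It assumes that the coefficients \(u_{0,n}\), the eigenvalues \(\lambda _n\) and the noise weights \(q_n\) are supplied as NumPy arrays, and that the projected reaction term is available as a function acting on coefficient vectors; the names `exponential_euler` and `f_N` are hypothetical and chosen only for illustration.

```python
import numpy as np

def exponential_euler(u0, lam, q, f_N, T, J, rng):
    """Iterate the component recursion (2.6); return the final-time coefficients.

    u0, lam, q: length-N arrays with u_{0,n}, lambda_n > 0 and q_n >= 0.
    f_N: maps a coefficient vector to the coefficients of f_N (assumed exact).
    """
    dt = T / J
    decay = np.exp(-lam * dt)                    # e^{-lambda_n dt}
    drift_weight = (1.0 - decay) / lam           # (1 - e^{-lambda_n dt}) / lambda_n
    # Standard deviation of R_{j,n}: sqrt(q_n (1 - e^{-2 lambda_n dt}) / (2 lambda_n))
    noise_std = np.sqrt(q * (1.0 - decay**2) / (2.0 * lam))
    v = np.array(u0, dtype=float)
    for _ in range(J):
        r = noise_std * rng.standard_normal(v.shape)
        v = decay * v + drift_weight * f_N(v) + r
    return v
```

With \(f=0\) and \(q_n=0\) the sketch reproduces \(e^{-\lambda _n T}u_{0,n}\) up to rounding, since the linear part is integrated exactly.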

The performance of a numerical method will be measured by the computational cost required to reach a mean squared error (MSE) \(\mathcal {O}(\epsilon ^2)\). Computational cost refers to the number of computational operations, where we count each addition, subtraction, multiplication, division, and each draw of a Gaussian random variable as one computational operation. It follows from this definition that if \(f =0\) in the SPDE (2.1), then no evaluation of the reaction term is needed and the computational cost of computing the final-time solution \(V_J^{N,J}\) is \(\mathcal {O}(JN)\), as each time iteration of (2.5) consists of \(\mathcal {O}(N)\) computational operations. When \(f\ne 0\), however, the cost becomes a more complicated expression in general, and we make the following assumption to simplify matters:

Assumption 1

(Cost of evaluating \(f_N\)) For any \(N \in \mathbb {N}\) and \(V^N \in H^N\), the cost of evaluating \(f_N(V^N)\) is \(\mathcal {O}(N\log _2(N))\).

When Assumption 1 holds, each evaluation of \(f_N(V^{N,J}_j)\) costs \(\mathcal {O}(N \log _2(N))\), and this accumulates to

$$\begin{aligned} \textrm{Cost}(V_{J}^{N,J}) = \mathcal {O}\big (J \, \big (N +\textrm{Cost}(f_N) \big )\, \big ) = \mathcal {O}(J N \log _2(N)) \end{aligned}$$
(2.8)

for the final-time solution.

Remark 1

In the numerical Scheme (2.5) used in Proposition 1 and in Assumption 1 it is tacitly assumed that the nonlinear reaction term \(f_{N}(V^{N,J}_j)\) can be evaluated exactly for any \(N \in \mathbb {N}\) and \(V^{N,J}_j \in H^N\). For nonlinear reaction terms f, this may however not be possible in practice. In computations, we will employ the fast Fourier transform (FFT) to approximate \(f_N(V^{N,J}_j)\) on a uniform mesh with N degrees of freedom in space for each iteration of (2.5), where we refer to [8, Section 6.3.1] and [35] for further details on this procedure. The approximation of \(f_N\) by FFT may introduce so-called aliasing errors in the numerical solution, cf. [31, page 334]. Aliasing errors are not covered in the mathematical analysis of this paper, but we will include the cost of using FFT in the computational cost of all numerical methods studied.

Remark 2

Disregarding aliasing errors, Assumption 1 holds when computing \(f_N(V^N)\) by FFT for Nemytskii-operators \(f(U)(x) = g(U(x))\) where the mapping \(g: \mathbb {R}\rightarrow \mathbb {R}\) additionally satisfies that one evaluation costs \(\mathcal {O}(1)\). Then

$$\begin{aligned} f_N(V^{N}) = \textrm{FFT}\big ( g(V^N(x=0)), g(V^N(x=1/N)), \ldots , g(V^N(x=(N-1)/N)) \big ) \end{aligned}$$

where the right-hand side costs \(\mathcal {O}(N \log _2(N))\) to evaluate.
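The FFT-based evaluation in Remark 2 can be sketched as follows. The sketch assumes a periodic domain \([0,1)\) with the complex Fourier basis and coefficients stored in NumPy's FFT ordering; this choice of basis and normalization is an assumption made for illustration and need not coincide with the basis \((e_n)\) used in the paper. The function name `reaction_coefficients` is hypothetical.

```python
import numpy as np

def reaction_coefficients(v_hat, g):
    """Approximate the coefficients of f_N(V) for f(U)(x) = g(U(x)).

    v_hat: N Fourier coefficients of a real-valued V in numpy FFT ordering,
           normalised so that V(m/N) = sum_k v_hat[k] * exp(2j*pi*k*m/N).
    g:     vectorised scalar function, O(1) cost per evaluation.
    """
    n = v_hat.size
    grid_values = np.fft.ifft(v_hat) * n   # V on the uniform mesh x_m = m/N
    g_values = g(grid_values.real)         # pointwise evaluation, O(N)
    return np.fft.fft(g_values) / n        # back to coefficients, O(N log N)
```

Both transforms cost \(\mathcal {O}(N \log _2(N))\), which matches Assumption 1. For nonlinear g the result is exact only if the product modes fit on the mesh; otherwise aliasing errors of the kind mentioned in Remark 1 appear.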

2.3.3 The Milstein Method

The Milstein method has been extended from SDE to different forms of parabolic SPDE with multiplicative noise in [4, 31]. We will here consider the version developed in [31], since its scheme is easy to express in our problem setting, and it is also easy to extend to an MLMC method. Using the previously introduced discretization parameters in space and time and recalling that the operators A and Q share the same eigenspace, the Milstein Scheme [31, equation (28)] takes the form

$$\begin{aligned} \begin{aligned} V_0^{N,J}&:= u_0^N\\ V^{ N,J}_{j+1}&= e^{A_N \Delta t} \left( V^{N,J}_j + \Delta t\,f_N(V^{ N,J}_j) + W^{N}(t_{j+1})-W^{N}(t_{j}) \right) \end{aligned} \end{aligned}$$

for \(j \in \llbracket 0,J-1\rrbracket \). On the component level, the scheme is given by

$$\begin{aligned} V^{ N,J}_{j+1,n}= e^{-\lambda _n \Delta t} \left( V_{j,n}^{N,J} + \Delta t f_{{ N},n}(V^{ N,J}_j) + \sqrt{q_n}\big (w^{n}(t_{j+1})-w^{n}(t_{j})\big ) \right) \end{aligned}$$
(2.9)

for \(n \in \llbracket 1,N\rrbracket \) and \(j \in \llbracket 0,J-1\rrbracket \).
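For comparison with exponential Euler, the component form (2.9) can be sketched in Python as below. The array inputs and the function names `milstein` and `f_N` are hypothetical, and the reaction term is again assumed to be available on the coefficient level.

```python
import numpy as np

def milstein(u0, lam, q, f_N, T, J, rng):
    """Iterate the component scheme (2.9); return the final-time coefficients.

    u0, lam, q: length-N arrays with u_{0,n}, lambda_n > 0 and q_n >= 0.
    f_N: maps a coefficient vector to the coefficients of f_N.
    """
    dt = T / J
    decay = np.exp(-lam * dt)                 # e^{-lambda_n dt}
    sqrt_q = np.sqrt(q)
    v = np.array(u0, dtype=float)
    for _ in range(J):
        # Wiener increments w^n(t_{j+1}) - w^n(t_j) ~ N(0, dt)
        dw = np.sqrt(dt) * rng.standard_normal(v.shape)
        v = decay * (v + dt * f_N(v) + sqrt_q * dw)
    return v
```

In contrast to (2.6), the noise increment here is damped by the full factor \(e^{-\lambda _n \Delta t}\) rather than integrated against the semigroup.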

We next present strong convergence rates for the Milstein scheme restricted to the additive-noise setting. For extensions to various multiplicative-noise settings, see [4, 31].

Proposition 2

(Jentzen and Röckner [31]) Let Assumption 6 in Appendix 1 be fulfilled for some values of \(\phi \in (1/2,1)\), \(\kappa \in [0,\phi )\) and \(\theta \in [\max (\kappa ,\phi -1/2), \phi )\), where \(\phi \) is the noise parameter introduced in Assumption 3. Then it holds that

$$\begin{aligned} \mathbb {E}\left[ \left\| U_{T}-V^{ N,J}_{J}\right\| ^2_{H}\right] \lesssim _{(N,J)} \lambda ^{-2\theta }_{N} + J^{-2\min (2(\theta - \kappa ), \theta )}, \end{aligned}$$

where \(U_t\) denotes the mild solution to the SPDE (2.1) and \((V^{ N,J}_j)_{j=0}^{ J}\) denotes the Milstein approximation to the mild solution, cf. (2.9).

Proof

(Connecting the result to the literature) Remark 7 in Appendix 1 associates our parameters \((\phi ,\kappa ,\theta )\) with corresponding ones in [31, Assumptions 1-4]. In our additive-noise setting with the operators A and Q having the same eigenspace, Proposition 2 follows from [31, Theorem 1].

Even when disregarding the differences in the regularity assumptions, a comparison of the convergence rates for the exponential Euler and Milstein methods is not straightforward, since the rates for exponential Euler only depend on the single parameter \(\phi \), while the rates of Milstein depend on two additional parameters, \(\kappa \) and \(\theta \). To simplify the comparison we impose additional constraints on the relationship between the parameters \(\phi \), \(\kappa \) and \(\theta \):

Corollary 1

For some value of \(\phi \in (1/2,1)\), let Assumption 6 in Appendix 1 be fulfilled for some \(\kappa \in [0,\phi /2)\) and all \(\theta \in [\max (\kappa ,\phi -1/2), \phi )\). Then for any sufficiently small fixed \(\delta >0\), it holds that

$$\begin{aligned} \mathbb {E}\left[ \left\| U_{T}-V^{ N,J}_{J}\right\| ^2_{H}\right] \lesssim _{(N,J)} \lambda ^{-2\phi +\delta }_{N} + J^{-2\phi + \delta }. \end{aligned}$$

Proof

For any sufficiently small \(\delta >0\), Assumption 6 holds for some \(\kappa < \phi /2 -\delta /4\) and \(\theta _\delta :=\phi -\delta /2\). Noting that

$$\begin{aligned} \min (2(\theta _\delta - \kappa ), \theta _\delta ) = \theta _\delta = \phi - \delta /2, \end{aligned}$$

the result follows from Proposition 2.

For a fixed value of \(\phi \in (1/2,1)\), the additional constraints imposed on \(\kappa \) and \(\theta \) in Corollary 1 are likely to present the Milstein method in a good light, as they produce the highest possible convergence rates attainable from Proposition 2. Comparing the convergence of exponential Euler in Proposition 1 with Milstein in Corollary 1, the methods have essentially the same rate in space, but exponential Euler has a higher rate in time. Note further that the rates only apply to Milstein when \(\phi >1/2\), while they apply to the exponential Euler method for any \(\phi \in (0,1)\). One should also keep in mind, however, that the Milstein method applies to a wider range of reaction terms f than exponential Euler, since Assumption 6 is more relaxed than Assumption 4. When comparable, the lower convergence rate for Milstein leads to a poorer performance for the Milstein MLMC method than the exponential Euler MLMC method in low-regularity settings, when \(\phi <3/4\), cf. Theorems 2 and 3. See also Sect. 4 for numerical evidence that exponential Euler outperforms Milstein when \(\phi \approx 1/2\).

2.4 The Multilevel Monte Carlo Method

The expectation of an H-valued random variable U is often approximated by the standard Monte Carlo estimator

$$\begin{aligned} E_{ M}\left[ U\right] := \frac{1}{M}\,\sum ^{ M}_{m=1}U^{(m)}, \end{aligned}$$

where the samples \(U^{(1)}, U^{(2)},\ldots , U^{(M)} \sim \mathbb {P}_U\) are independently drawn random variables and \(E_M[U]\) consequently denotes the sample average estimator using M i.i.d. draws of U. When it is computationally costly to draw samples of U, variance-reduction techniques may improve the efficiency through reducing the statistical error of the estimator. The multilevel Monte Carlo (MLMC) method is an extension of standard Monte Carlo that draws pairwisely coupled random variables \(\{(U^{\ell -1,C}, U^{\ell ,F})\}_{\ell =0}^L\), where \(U^{\ell -1,C}\) denotes the coarse random variable on resolution level \(\ell \), and \(U^{\ell ,F}\) the fine random variable on level \(\ell \). Pairwise coupling of \((U^{\ell -1,C}, U^{\ell ,F})(\omega )\) means that \(U^{\ell -1,C}(\omega )\) and \(U^{\ell ,F}(\omega )\) are generated using the same driving noise \(W_t(\omega )\) (to be elaborated on in the next section). We further impose that

$$\begin{aligned} U^{-1,C} := 0 \in H \qquad \text {and} \qquad \mathbb {E}\left[ U^{\ell ,C}\right] = \mathbb {E}\left[ U^{\ell ,F}\right] \quad \forall \ell \in \mathbb {N}_0, \end{aligned}$$
(2.10)

so that the weak approximation on resolution level \(L \in \mathbb {N}\) can be represented as a telescoping sum of expectations:

$$\begin{aligned} \mathbb {E}\left[ U\right] \approx \mathbb {E}\left[ U^{L,F}\right] {\mathop {=}\limits ^{(2.10)}} \sum _{\ell =0}^{L} \mathbb {E}\left[ U^{\ell ,F}- U^{\ell -1,C}\right] . \end{aligned}$$
(2.11)

By approximating each of the \(L+1\) expectations in the telescoping sum by a sample average, we obtain the MLMC estimator:

$$\begin{aligned} \begin{aligned} E_{ \text {M\!L}}\left[ U\right] :=&\,\sum _{\ell =0}^{ L} E_{ M_\ell } \left[ U^{\ell ,F}- U^{\ell -1,C}\right] \\ =&\,\sum _{m=1}^{ M_0}\frac{U^{0,F,(m)}}{M_{0}} +\sum ^{ L}_{{\ell =1}} \sum ^{ M_{\ell }}_{m=1} \frac{U^{\ell ,F,(m)} - U^{\ell -1,C,(m)}}{M_{\ell }}. \end{aligned} \end{aligned}$$
(2.12)

Here, \((U^{\ell -1,C,(m)}, U^{\ell ,F,(m)})\) denotes the \(\mathbb {P}_{(U^{\ell -1,C},U^{\ell ,F})}\)-distributed m-th sample on level \(\ell \), and all samples on all resolution levels are independent, meaning that all random variables in the sequence \(\{(U^{\ell -1,C,(m)}, U^{\ell ,F,(m)})\}_{\ell ,m}\) are independent. A near-optimal calibration of the parameters \(L \in \mathbb {N}\) and \((M_{\ell })_{\ell =0}^{ L}\subset \mathbb {N}\) is obtained through minimizing the mean squared error for a given computational cost, cf. [14] and Theorem 1. The MLMC estimator achieves variance reduction over standard Monte Carlo when the coupled random variables \(U^{\ell -1,C}\) and \(U^{\ell ,F}\) are sufficiently correlated, cf. Condition (ii) in Theorem 1 below.
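The estimator (2.12) can be sketched in Python as follows. The sketch assumes a user-supplied function `sample_pair(level, rng)` (a hypothetical name) that returns one coupled draw \((U^{\ell -1,C}, U^{\ell ,F})\) as coefficient arrays zero-padded to a common length, with the coarse part equal to zero on level 0, cf. (2.10).

```python
import numpy as np

def mlmc_estimate(sample_pair, num_samples, rng):
    """Evaluate the MLMC estimator (2.12).

    sample_pair(level, rng): returns (coarse, fine), one coupled draw on the
        given level, as arrays of equal length (zero-padded coefficients).
    num_samples: sequence (M_0, ..., M_L) of sample sizes per level.
    """
    estimate = 0.0
    for level, m_level in enumerate(num_samples):
        level_sum = 0.0
        for _ in range(m_level):
            coarse, fine = sample_pair(level, rng)   # independent across draws
            level_sum = level_sum + (fine - coarse)
        estimate = estimate + level_sum / m_level    # sample average on this level
    return estimate
```

Independence across levels and samples is obtained here by drawing every pair from the same generator `rng` without reuse; how the pair itself is coupled is left entirely to `sample_pair`.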

One way to assess the performance of Monte Carlo methods is through the MSE. The following theorem describes the cost versus error of the MLMC methodology for H-valued random variables:

Theorem 1

Assume that the telescoping-sum properties (2.10) hold and that there exist positive constants \(\alpha ,\beta ,\gamma \) such that \(\alpha \ge \frac{\min (\beta ,\gamma )}{2}\) and

  (i) \(\left\| \mathbb {E}\left[ U^{\ell ,F}-U\right] \right\| _{ H} \lesssim \,2^{-\alpha \,\ell }\),

  (ii) \(V_\ell :=\mathbb {E}\left[ \left\| U^{\ell ,F}-U^{\ell -1,C} \right\| ^2_{ H}\right] \lesssim \,2^{-\beta \,\ell }\),

  (iii) \(C_\ell := \textrm{Cost}(U^{\ell -1,C}, U^{\ell ,F}) \lesssim \, 2^{\gamma \,\ell }\).

Then for any \(\epsilon \in (0,1)\) and \(L:= \lceil \log _2(1/\epsilon )/\alpha \rceil \), there exists a sequence \((M_{\ell })_{\ell =0}^{ L}\subset \mathbb {N}\) such that

$$\begin{aligned} \textrm{MSE}={\mathbb E}\left[ \big \Vert E_{ \mathrm {M\!L}}\left[ U\right] -{\mathbb E}\left[ U\right] \big \Vert ^2_{ H} \right] \lesssim \,\epsilon ^2, \end{aligned}$$

and

$$\begin{aligned} \mathrm {Cost(MLMC)}:= \sum _{\ell =0}^L M_{\ell }\, C_{\ell } \lesssim {\left\{ \begin{array}{ll} \epsilon ^{-2}, \quad &{}\textrm{if} \ \beta >\gamma ,\\ \epsilon ^{-2}( \log \epsilon )^2, \quad &{}\textrm{if} \ \beta =\gamma ,\\ \epsilon ^{-2-\frac{(\gamma -\beta )}{\alpha }}, \quad &{}\textrm{if} \ \beta <\gamma . \end{array}\right. } \end{aligned}$$
(2.13)

The proof of this result is a straightforward extension of the original theorem presented by Giles [14] for weak approximations of stochastic differential equations.

Proof

Let

$$\begin{aligned} M_{\ell }:=\left\lceil \epsilon ^{-2} \sqrt{\frac{V_\ell }{C_\ell }} \sum _{j=0}^L\sqrt{V_j C_j} \right\rceil \qquad \ell \in \llbracket 0,L\rrbracket , \end{aligned}$$
(2.14)

where \(V_0:= \mathbb {E}\left[ \Vert U^{0,F}\Vert _{ H}^2\right] \), in agreement with (ii) since \(U^{-1,C}=0\). By the telescoping-sum property

$$\begin{aligned} \mathbb {E}[E_{ \mathrm {M\!L}}[U]]= \sum _{\ell =0}^L \mathbb {E}\left[ U^{\ell ,F} - U^{\ell -1,C}\right] {\mathop {=}\limits ^{(2.10)}} \mathbb {E}\left[ U^{L,F}\right] , \end{aligned}$$

the representation (2.12) and the independence of the samples \(\{(U^{\ell -1,C,(m)}, U^{\ell ,F,(m)})\}_{\ell ,m}\), we obtain that

$$\begin{aligned} \begin{aligned} \mathbb {E}\Big [\big \Vert E_{ \mathrm {M\!L}}[U]&-\mathbb {E}[U]\big \Vert ^2_{ H}\Big ]\\&= \Vert \mathbb {E}\left[ U\right] - \mathbb {E}\left[ U^{ L}\right] \Vert _{H}^2 + \mathbb {E}\left[ \left\| E_{ \mathrm {M\!L}}[U] -{\mathbb E}[U^{ L}]\right\| ^2_{ H}\right] \\&\lesssim 2^{-2\alpha L} + \mathbb {E}\left[ \left\| \sum _{m=1}^{ M_0}\frac{U^{0,F,(m)} - \mathbb {E}[U^0]}{M_{0}} \right\| ^2 \right] \\&\quad +\mathbb {E}\left[ \left\| \sum ^{L}_{\ell =1}\sum ^{ M_{\ell }}_{m=1} \frac{U^{\ell ,F,(m)} - U^{\ell -1,C,(m)} - \mathbb {E}[U^{\ell ,F} - U^{\ell -1,C}]}{M_{\ell }} \right\| ^2\right] \\&\lesssim \,\sum _{\ell =0}^{ L} \frac{V_\ell }{M_\ell } +2^{-2\alpha L} \lesssim \,\epsilon ^2. \end{aligned} \end{aligned}$$

By assumptions (ii) and (iii), we obtain that

$$\begin{aligned} \begin{aligned}&\sum _{\ell =0}^LC_{\ell }\, M_{\ell }\ {\mathop {\le }\limits ^{(2.14)}} \sum _{\ell =0}^LC_{\ell } \left( \epsilon ^{-2} \sqrt{\frac{V_\ell }{C_\ell }} \sum _{j=0}^L\sqrt{V_j C_j} +1 \right) \\&\quad \lesssim \epsilon ^{-2} \left( \sum _{j=0}^L \sqrt{V_j C_j}\right) ^2 + \underbrace{C_L}_{\lesssim 2^{\gamma L} } \\&\quad \lesssim \epsilon ^{-2} \left( \sum _{j=0}^L \sqrt{V_j C_j}\right) ^2 + \epsilon ^{- \gamma /\alpha }\\&\quad \lesssim {\left\{ \begin{array}{ll} \epsilon ^{-2} &{} \text {if} \quad \beta > \gamma \\ L^2 \epsilon ^{-2} + \epsilon ^{-2}&{} \text {if} \quad \beta = \gamma \\ \epsilon ^{-2-\frac{(\gamma -\beta )}{\alpha }} + \epsilon ^{-\gamma /\alpha }&{} \text {if} \quad \beta < \gamma . \end{array}\right. } \end{aligned} \end{aligned}$$

For the last inequality, the assumption \(\alpha \ge \min (\beta , \gamma )/2\) implies that \(\gamma /\alpha \le 2\) when \(\beta \ge \gamma \) and \(\beta /\alpha \le 2\) when \(\beta \le \gamma \) (so that \(2 + (\gamma -\beta )/\alpha \ge \gamma /\alpha \)), and inequality (2.13) follows.
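The choice of L in Theorem 1 and of the sample sizes (2.14) can be illustrated by a short Python sketch. It uses the model values \(V_\ell = 2^{-\beta \ell }\) and \(C_\ell = 2^{\gamma \ell }\), i.e., all hidden constants are set to one; this is an assumption made purely for illustration, and in practice \(V_\ell \) and \(C_\ell \) would have to be estimated. The function name `mlmc_parameters` is hypothetical.

```python
import math

def mlmc_parameters(eps, alpha, beta, gamma):
    """Return (L, [M_0, ..., M_L], total cost) following Theorem 1 and (2.14).

    Model assumption: V_l = 2^(-beta*l) and C_l = 2^(gamma*l) with unit constants.
    """
    L = math.ceil(math.log2(1.0 / eps) / alpha)
    V = [2.0 ** (-beta * l) for l in range(L + 1)]
    C = [2.0 ** (gamma * l) for l in range(L + 1)]
    total = sum(math.sqrt(v * c) for v, c in zip(V, C))   # sum_j sqrt(V_j C_j)
    M = [math.ceil(eps ** -2 * math.sqrt(v / c) * total) for v, c in zip(V, C)]
    cost = sum(m * c for m, c in zip(M, C))               # sum_l M_l C_l
    return L, M, cost
```

By construction, \(\sum _{\ell } V_\ell /M_\ell \le \epsilon ^2\) for these sample sizes, which is the statistical-error bound used in the proof.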

Remark 3

The theorem also applies in settings where one replaces \(V_\ell \) in Theorem 1 (ii) by \({\widetilde{V}}_\ell :={\mathbb E}\left[ \left\| U^{\ell ,F}-U^{\ell -1,C} -\mathbb {E}\left[ U^{\ell ,F}-U^{\ell -1,C}\right] \right\| ^2_{ H}\right] \), and for some problems this may improve the rate \(\beta >0\). Practically, however, there may be little to gain by replacing \(V_\ell \) by \(\widetilde{V}_\ell \) as weak approximations of \(U^{\ell ,F} - U^{\ell -1,C}\) can be much more intractable than strong approximations, cf. [34].

3 Multilevel Monte Carlo Methods for SPDE

In this section we describe two MLMC methods that are based on extending the two numerical schemes in Sect. 2.3 to the MLMC setting. To better illustrate the importance of strong coupling and the loss of accuracy due to damping, we also propose a third MLMC method which is an extension of a modified form of the exponential Euler method that is exponential only in the drift term. We will employ the following notation for the multilevel hierarchy of discretized solutions: On level \(\ell \ge 0\), let \(N_{\ell }\eqsim N_0\,2^{\nu \ell }\) for given \(N_0\in \mathbb {N}\) and \(\nu >0\) denote a sequence of spatial resolutions, and let \(J_\ell :=J_0\,2^\ell \) for a given \(J_0 \in \mathbb {N}\) denote a sequence of time resolutions. In a notation that suppresses details on the pairwise coupling, we let \(U^{\ell ,F}_j:= V^{ N_\ell , J_\ell }_j\) denote the fine numerical solution of a given spectral Galerkin method on level \(\ell \) at time \(t_j^{\ell }:=j\,\Delta t_\ell \) for \(j\in \llbracket 0,J_\ell \rrbracket \), computed on the subspace \(H^{N_\ell }\) using the time step \(\Delta t_{\ell }:=\frac{T}{J_\ell }\). Similarly, \(U^{\ell -1,C}_j:= V^{ N_{\ell -1}, J_{\ell -1}}_j\) denotes the coupled coarse numerical solution on level \(\ell \) at time \(t_j^{\ell -1}:=j\,\Delta t_{\ell -1}\) for \(j\in \llbracket 0,J_{\ell -1}\rrbracket \) computed on the subspace \(H^{N_{\ell -1}}\) with time step \(\Delta t_{\ell -1}:=\frac{T}{J_{\ell -1}}\).

To discuss the quality of a pairwise coupling, let us first introduce some terminology. When a coupling satisfies

$$\begin{aligned} \mathbb {E}\left[ U^{\ell ,F}_{j}\right] = \mathbb {E}\left[ U^{\ell , C}_{j}\right] \quad \forall j \in \llbracket 0,J_{\ell }\rrbracket \end{aligned}$$

for all \(\ell \ge 0\), we say that the coupling is weakly correct, and when it additionally satisfies

$$\begin{aligned} U^{\ell ,F}_{j}(\omega ) = U^{\ell ,C}_{j}(\omega ) \quad \forall (\omega , j) \in \varOmega \times \llbracket 0,J_\ell \rrbracket \end{aligned}$$

for all \(\ell \ge 0\), we say that it is a pathwise correct coupling. From the construction of the multilevel estimator in Sect. 2, we see that weakly correct coupling is needed to obtain the crucial telescoping sum in the MLMC estimator, cf. (2.10) and (2.11), and that weakly correct coupling thus ensures consistency for the MLMC estimator. Pathwise correct coupling is on the other hand not necessary to obtain consistency, and there are many examples of performant MLMC methods that only are weakly correct, cf. [17, 25]. Pathwise correct coupling is however often an easy way to ensure the needed weakly correct coupling.

To achieve high performance, the pairwise coupling must be weakly correct and produce a high convergence rate \(\beta \) for the strong error, cf. Theorem 1. We will refer to a coupling that achieves a high rate \(\beta \) in comparison to alternative approaches as a strong coupling. To be more precise for the particular SPDE considered in this work, we introduce the notion of strong diffusion coupling:

Definition 1

(Strong diffusion coupling (SDC)) Consider a weakly correct coupling sequence of spectral-Galerkin numerical solutions \(\{(U^{\ell -1,C}, U^{\ell ,F})\}_{\ell \ge 0}\) of the SPDE (2.1) with no reaction term, \(f=0\) (the stochastic heat equation). Recall further that a coupled pair of solutions is defined on time meshes of different resolutions:

$$\begin{aligned} U^{\ell -1,C}_j = U^{\ell -1,C}(j \Delta t_{\ell -1}) \in H^{N_{\ell -1}} \quad \text {for} \quad j \in \llbracket 0, J_{\ell -1}\rrbracket \end{aligned}$$

and

$$\begin{aligned} U^{\ell ,F}_j = U^{\ell ,F}(j \Delta t_{\ell }) \in H^{N_{\ell }} \quad \text {for} \quad j \in \llbracket 0, J_{\ell }\rrbracket , \end{aligned}$$

with \(\Delta t_{\ell -1} = 2 \Delta t_\ell \). We say that the coupling is a strong diffusion coupling if it holds for all \(\ell \ge 0\) that

$$\begin{aligned} P_{N_{\ell -1} } U^{\ell ,F}_{2j} = U^{\ell -1,C}_{j} \quad \forall j \in \llbracket 0, J_{\ell -1}\rrbracket . \end{aligned}$$

For the stochastic heat equation, an SDC is thus an exact coupling of \(U^{\ell -1,C}\) to \(U^{\ell ,F}\) on the subspace \(H^{N_{\ell -1}}\). This is of course the strongest possible coupling one can achieve (for the given problem), and we will see later that the exponential Euler MLMC method is indeed the only one among the three we consider whose coupling is SDC. Although our theory and numerical experiments both indicate a connection between SDC and strong couplings more generally when f is non-zero-valued, it is not clear how far this extends. To the best of our knowledge, it is an open problem to describe coupling strategies for H-valued stochastic processes that are weakly correct and maximize the convergence rate of the strong error \(\beta \).

We next extend the exponential Euler method and the Milstein method to the MLMC setting.

3.1 Exponential Euler MLMC Method

This MLMC method was first introduced and analyzed for the linear reaction-term setting in [8, Section 5.4.1]. Since then the method has been applied to the SPDE (2.1) with linear reaction term for problems arising in Bayesian computation. These include stochastic filtering [28] and Markov chain Monte Carlo [27], with an extension to multi-index Monte Carlo.

We consider the pairwisely coupled solutions \((U^{\ell -1,C},U^{\ell ,F})\) that both are solved by the numerical Scheme (2.6) with the respective initial conditions

$$\begin{aligned} U^{\ell -1,C}_0 = P_{N_{\ell -1}}u_0 \quad \text {and} \quad U^{\ell ,F}_0 = P_{N_\ell }u_0. \end{aligned}$$

For the fine solution, the n-th component of two iterations of the Scheme (2.6) at time \(t^{\ell }_{2j} = 2j \Delta t_{\ell }\) takes the form

$$\begin{aligned} U_{2j+1,n}^{\ell ,F} = e^{-\lambda _n \Delta t_{\ell }} U_{2j,n}^{\ell ,F} + \frac{1- e^{- \lambda _n \Delta t_{\ell } }}{\lambda _n} f_{{ N_{\ell }},n}(U_{2j}^{\ell ,F} ) + R_{2j,n}^{\ell ,F} , \end{aligned}$$
(3.1)

and

$$\begin{aligned} U_{2j+2,n}^{\ell ,F} = e^{- \lambda _n \Delta t_{\ell } } U_{2j+1,n}^{\ell ,F} +\frac{1- e^{-\lambda _n \Delta t_{\ell } }}{ \lambda _n} f_{{ N_{\ell }},n}(U_{2j+1}^{\ell ,F} ) + R_{2j+1,n}^{\ell ,F}, \end{aligned}$$
(3.2)

for \((j,n) \in \llbracket 0, J_{\ell -1}-1\rrbracket \times \llbracket 1,N_{\ell }\rrbracket \) and with

$$\begin{aligned} R_{k,n}^{\ell ,F} =\sqrt{q_n}\, \int _{t^{\ell }_{k}}^{t^{\ell }_{k+1}} e^{-\lambda _n (t^{\ell }_{k+1}-s)}\;dw^{n}_s {\mathop {=}\limits ^{\texttt{d}}} N\left( 0, q_n \frac{1- e^{- 2\lambda _n \Delta t_{\ell } }}{2 \lambda _n}\right) \end{aligned}$$
(3.3)

for \((k,n) \in \llbracket 0,J_{\ell }-1\rrbracket \times \llbracket 1,N_\ell \rrbracket \).

The coupled coarse solution uses the time step \(\Delta t_{\ell -1} = 2\,\Delta t_{\ell }\), and one iteration at time \(t_{j}^{\ell -1} = j\Delta t_{\ell -1} = 2j\Delta t_{\ell } = t_{2j}^{\ell }\) takes the form

$$\begin{aligned} U_{j+1,n}^{\ell -1,C}= e^{- \lambda _n \Delta t_{\ell -1} } U_{j,n}^{\ell -1,C} +\frac{1-e^{-\lambda _n\,\Delta t_{\ell -1}}}{ \lambda _n} f_{{ N_{\ell -1}},n}(U^{\ell -1,C}_{j,n})+ R_{j,n}^{\ell -1,C}, \end{aligned}$$
(3.4)

where

$$\begin{aligned} R_{j,n}^{\ell -1,C} = \sqrt{q_n} \int _{t_j^{\ell -1}}^{t_{j+1}^{\ell -1} } e^{-\lambda _n (t_{j+1}^{\ell -1} -s)} dw^{n}_s. \end{aligned}$$

The pairwise coupling \(U^{\ell -1,C}_{j+1,n} \leftrightarrow U^{\ell ,F}_{2j+2,n}\) is obtained through coupling the driving noise \(R^{\ell -1,C}_{j,n} \leftrightarrow (R_{2j,n}^{\ell ,F},R_{2j+1,n}^{\ell ,F})\). By (3.3), we have that

$$\begin{aligned} \begin{aligned} R_{j,n}^{\ell -1,C}=&\,\sqrt{q_n}\, \int _{t_j^{\ell -1}}^{t_{j+1}^{\ell -1} } e^{-\lambda _n(t^{\ell -1}_{j+1} - s)} dw^{n}_s\\ =&\,e^{-\lambda _n \Delta t_{\ell }}\,\sqrt{q_n} \,\int _{t^{\ell }_{2j}}^{t^{\ell }_{2j +1} } e^{-\lambda _n (t^{\ell }_{2j+1} -s)} dw^{n}_s +\sqrt{q_n}\,\int _{t^{\ell }_{2j+1}}^{t^{\ell }_{2j+2}} e^{-\lambda _n (t^{\ell }_{2j+2} -s)} dw^{n}_s,\\ \end{aligned} \end{aligned}$$

which yields

$$\begin{aligned} R_{j,n}^{\ell -1,C} = e^{-\lambda _n \Delta t_{\ell }} R_{2j,n}^{\ell ,F} + R_{2j+1,n}^{\ell ,F} \qquad \forall \,(j,n) \in \llbracket 0, J_{\ell -1}-1\rrbracket \times \llbracket 1, N_{\ell -1}\rrbracket . \end{aligned}$$
(3.5)

To summarize, given the coupling \(U^{\ell -1,C}_{j,n} \leftrightarrow U^{\ell ,F}_{2j,n}\) at some time \(t_j^{\ell -1}\), the coupling at the next time is obtained by generating the fine-solution noise \((R_{2j,n}^{\ell ,F},R_{2j+1,n}^{\ell ,F})\) and coupling it to the coarse-solution noise by formula (3.5). The next-time solution \(U^{\ell -1,C}_{j+1,n}\) is computed by (3.4) with \(R_{j,n}^{\ell -1,C}\) as input, and \(U^{\ell ,F}_{2j+2,n}\) is computed by (3.1) and (3.2) with \((R_{2j,n}^{\ell ,F},R_{2j+1,n}^{\ell ,F})\) as input.
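The noise coupling summarized above can be sketched for a single Fourier mode as follows. This is a minimal illustration under stated assumptions, not the implementation used in this work: the function names `noise_variance`, `fine_noise_pair` and `coarse_noise`, and the scalar values chosen for `lam`, `q` and `dt_fine`, are hypothetical. The two fine-level increments are drawn independently from the Gaussian law in (3.3) and combined by (3.5).

```python
import math
import random


def noise_variance(lam, q, dt):
    """Variance of the stochastic convolution over one step of length dt, cf. (3.3)."""
    return q * (1.0 - math.exp(-2.0 * lam * dt)) / (2.0 * lam)


def fine_noise_pair(lam, q, dt_fine, rng):
    """Draw the two independent fine-level increments (R_{2j}, R_{2j+1})."""
    std = math.sqrt(noise_variance(lam, q, dt_fine))
    return rng.gauss(0.0, std), rng.gauss(0.0, std)


def coarse_noise(lam, dt_fine, r_even, r_odd):
    """Combine the fine increments into the coarse increment by (3.5)."""
    return math.exp(-lam * dt_fine) * r_even + r_odd


# Hypothetical parameter values for one mode (illustration only).
rng = random.Random(0)
lam, q, dt_fine = 4.0, 0.25, 0.01

r_even, r_odd = fine_noise_pair(lam, q, dt_fine, rng)
r_coarse = coarse_noise(lam, dt_fine, r_even, r_odd)

# Consistency check: the variance implied by (3.5) coincides with the
# variance in (3.3) evaluated for the coarse step 2 * dt_fine.
v_fine = noise_variance(lam, q, dt_fine)
v_implied = math.exp(-2.0 * lam * dt_fine) * v_fine + v_fine
v_coarse = noise_variance(lam, q, 2.0 * dt_fine)
```

The variance check reflects that the coarse increment built from (3.5) has the same law as a directly sampled coarse increment, which is what makes the coupling weakly correct at the level of the driving noise.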

Remark 4

We note from the above that

$$\begin{aligned} R^{\ell -1,C}_{j,n}(\omega ) = \sqrt{q_n} \int _{t_j^{\ell -1}}^{t_{j+1}^{\ell -1} } e^{-\lambda _n (t_{j+1}^{\ell -1} -s)} dw^{n}_s(\omega ) = R^{\ell -1,F}_{j,n}(\omega ), \end{aligned}$$

and since \(U^{\ell -1,C}\) and \(U^{\ell -1,F}\) are solved using the same numerical scheme, the coupling is pathwise correct. Let us further note that if \(f=0\), then the linearity of the problem and (3.5) imply that the coupling is an SDC:

$$\begin{aligned} U^{\ell ,F}_{2j, n}(\omega ) = U^{\ell -1,C}_{j, n}(\omega ) \quad \forall (j,n) \in \llbracket 0, J_{\ell -1}\rrbracket \times \llbracket 1,N_{\ell -1}\rrbracket . \end{aligned}$$
(3.6)

This can be verified by induction: assume (3.6) holds for some \(j\in \llbracket 0,J_{\ell -1}-1\rrbracket \) (it holds for \(j=0\) by definition). Then, using the numerical schemes for the respective methods with \(f=0\), we obtain that

$$\begin{aligned} \begin{aligned} U^{\ell ,F}_{2j+2, n}&{\mathop {=}\limits ^{(3.2)}} e^{- \lambda _n \Delta t_{\ell } } U_{2j+1,n}^{\ell ,F}+ R_{2j+1,n}^{\ell ,F} \\&{\mathop {=}\limits ^{(3.1)}} e^{- \lambda _n 2\Delta t_{\ell } } U_{2j,n}^{\ell ,F} + e^{- \lambda _n \Delta t_{\ell } } R_{2j,n}^{\ell ,F} + R_{2j+1,n}^{\ell ,F}\\&{\mathop {=}\limits ^{(3.5)}} e^{- \lambda _n \Delta t_{\ell -1} } U_{j,n}^{\ell -1,C} + R_{j,n}^{\ell -1,C}\\&{\mathop {=}\limits ^{(3.4) }} U_{j+1,n}^{\ell -1,C}. \end{aligned} \end{aligned}$$

Since the exponential Euler MLMC method is SDC, we expect it to perform very efficiently when \(f=0\), and Theorem 2 shows that the coupling is strong also for more general reaction terms.
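The induction argument in Remark 4 can also be checked numerically. The following sketch simulates one Fourier mode of the coupled pair with \(f=0\) and records the largest gap between the fine solution at even time indices and the coarse solution. The function name `simulate_coupled_mode` and all parameter values are assumptions made for illustration.

```python
import math
import random


def simulate_coupled_mode(lam, q, u0, dt_fine, n_coarse_steps, seed=0):
    """Coupled exponential Euler paths of one mode with f = 0.

    Returns the fine path sampled at even indices and the coarse path.
    """
    rng = random.Random(seed)
    decay = math.exp(-lam * dt_fine)  # e^{-lambda * dt_fine}
    std = math.sqrt(q * (1.0 - decay ** 2) / (2.0 * lam))  # cf. (3.3)
    u_fine, u_coarse = u0, u0
    fine_path, coarse_path = [u0], [u0]
    for _ in range(n_coarse_steps):
        r_even, r_odd = rng.gauss(0.0, std), rng.gauss(0.0, std)
        # Two fine steps, cf. (3.1) and (3.2) with f = 0.
        u_fine = decay * u_fine + r_even
        u_fine = decay * u_fine + r_odd
        # One coarse step, cf. (3.4), driven by the coupled noise (3.5).
        r_coarse = decay * r_even + r_odd
        u_coarse = decay ** 2 * u_coarse + r_coarse
        fine_path.append(u_fine)
        coarse_path.append(u_coarse)
    return fine_path, coarse_path


fine_path, coarse_path = simulate_coupled_mode(
    lam=4.0, q=0.25, u0=1.0, dt_fine=0.01, n_coarse_steps=50
)
max_gap = max(abs(a - b) for a, b in zip(fine_path, coarse_path))
```

Under these assumptions, `max_gap` is expected to be zero up to floating-point rounding, in agreement with (3.6).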

For showcasing the importance of strong pairwise coupling, and as a transition between exponential Euler MLMC and Milstein MLMC, we next consider a slightly altered form of the exponential Euler method with explicit integration of the Itô integral.

3.2 Drift-exponential Euler MLMC Method

We consider the drift-exponential Euler scheme

$$\begin{aligned} \begin{aligned} V^{ N,J}_0:=&\,u_0^{ N},\\ V^{ N,J}_{j+1}=&\, e^{A_N\Delta t} V^{ N,J}_j + A_{ N}^{-1}(e^{A_N\Delta t}-I)f_{ N}(V^{ N,J}_j)\\&+ e^{A_N\Delta t} \big ( W^N(t_{j+1}) - W^N(t_j) \big ) \qquad \forall j\in \llbracket 0,J-1\rrbracket . \end{aligned} \end{aligned}$$

This is a mix of exponential Euler and Milstein, as the approximation of the drift term agrees with the exponential Euler scheme and the approximation of the Itô integral agrees with the Milstein scheme.

When extending this scheme to an MLMC method, a similar argument as in Sect. 3.1 yields that two iterations of the fine solution in the pairwise couple \((U^{\ell -1,C}, U^{\ell ,F})\) take the form

$$\begin{aligned} U_{2j+1,n}^{\ell ,F} = e^{-\lambda _n \Delta t_{\ell }} U_{2j,n}^{\ell ,F} + \frac{1- e^{- \lambda _n \Delta t_{\ell } }}{\lambda _n} f_{{ N_{\ell }},n}(U_{2j}^{\ell ,F} ) + \widetilde{R}_{2j,n}^{\ell ,F} , \end{aligned}$$
(3.7)

and

$$\begin{aligned} U_{2j+2,n}^{\ell ,F} = e^{- \lambda _n \Delta t_{\ell } } U_{2j+1,n}^{\ell ,F} +\frac{1- e^{-\lambda _n \Delta t_{\ell } }}{ \lambda _n} f_{{ N_{\ell }},n}(U_{2j+1}^{\ell ,F} ) + \widetilde{R}_{2j+1,n}^{\ell ,F}, \end{aligned}$$
(3.8)

for \((j,n) \in \llbracket 0, J_{\ell -1}-1\rrbracket \times \llbracket 1,N_{\ell }\rrbracket \) and with

$$\begin{aligned} \widetilde{R}_{k,n}^{\ell ,F} = \sqrt{q_n} \, e^{-\lambda _n \Delta t_{\ell }} \Big (w^{n}(t_{k+1}^\ell )- w^{n}(t_k^\ell )\Big ) \end{aligned}$$
(3.9)

for \((k,n) \in \llbracket 0,J_{\ell }-1\rrbracket \times \llbracket 1,N_\ell \rrbracket \).

The coupled coarse solution takes the form

$$\begin{aligned} U_{j+1,n}^{\ell -1,C}= e^{- \lambda _n \Delta t_{\ell -1} } U_{j,n}^{\ell -1,C} +\frac{1-e^{-\lambda _n\,\Delta t_{\ell -1}}}{ \lambda _n} f_{{ N_{\ell -1}},n}(U^{\ell -1,C}_{j,n})+ \widetilde{R}_{j,n}^{\ell -1,C}, \end{aligned}$$
(3.10)

where

$$\begin{aligned} \widetilde{R}_{j,n}^{\ell -1,C} = \sqrt{q_n} e^{-\lambda _n \Delta t_{\ell -1}} \big (w^{n}(t_{j+1}^{\ell -1}) - w^{n}(t_j^{\ell -1}) \big ). \end{aligned}$$

Recalling that \(t_{j}^{\ell -1} = j \Delta t_{\ell -1} = 2j \Delta t_\ell = t^{\ell }_{2j}\), we obtain the pairwise coupling of \(U^{\ell -1,C}_{j+1,n} \leftrightarrow U^{\ell ,F}_{2j+2,n}\) through coupling the driving noise:

$$\begin{aligned} \begin{aligned} \widetilde{R}_{j,n}^{\ell -1,C}&= \sqrt{q_n} e^{-2\lambda _n \Delta t_{\ell }} \big (w^{n}(t_{2j+2}^{\ell }) - w^{n}(t_{2j}^{\ell }) \big )= e^{-\lambda _n \Delta t_{\ell }}\big ( \widetilde{R}_{2j,n}^{\ell ,F} + \widetilde{R}_{2j+1,n}^{\ell ,F}\big ). \end{aligned} \end{aligned}$$

Since \(\widetilde{R}_{j,n}^{\ell -1,C}(\omega ) = \widetilde{R}_{j,n}^{\ell -1,F}(\omega )\) and \(U^{\ell -1,C}\) and \(U^{\ell -1,F}\) are solved using the same numerical method, it follows that the drift-exponential Euler MLMC method is also pathwise correctly coupled. However, it is not SDC, since when \(f=0\) we obtain by (3.7) and (3.8) that for \(n \in \llbracket 1,N_{\ell -1}\rrbracket \),

$$\begin{aligned} \begin{aligned} U^{\ell ,F}_{2,n}&= e^{- \lambda _n 2\Delta t_{\ell } } U^{\ell ,F}_{0,n} + \widetilde{R}_{1,n}^{\ell ,F} + e^{- \lambda _n \Delta t_{\ell }} \widetilde{R}_{0,n}^{\ell ,F} \\&= \underbrace{e^{- \lambda _n \Delta t_{\ell -1} } U^{\ell -1,C}_{0,n} + \widetilde{R}_{0,n}^{\ell -1,C}}_{= U^{\ell -1,C}_{1,n}} + (1 - e^{- \lambda _n \Delta t_{\ell }} )\widetilde{R}_{1,n}^{\ell ,F} \ne U^{\ell -1,C}_{1,n}. \end{aligned} \end{aligned}$$

The term \( (1 - e^{- \lambda _n \Delta t_{\ell }} )\widetilde{R}_{1,n}^{\ell ,F}\) is an error in the coupling that is introduced by explicit integration of the Itô integral. This leads to an artificial smoothing of the numerical solution, as is illustrated by the numerical examples in Sect. 4.
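The size of this coupling error can be illustrated for a single mode with \(f=0\). The sketch below performs two fine steps (3.7) and (3.8) and one coarse step (3.10) from a common initial value, using the same Brownian increments, and compares the resulting gap with the predicted term \((1 - e^{-\lambda _n \Delta t_{\ell }})\widetilde{R}_{1,n}^{\ell ,F}\). All variable names and parameter values are hypothetical and chosen only for illustration.

```python
import math
import random

rng = random.Random(1)
lam, q, dt_fine, u0 = 4.0, 0.25, 0.01, 1.0  # hypothetical single-mode values
decay = math.exp(-lam * dt_fine)

# Brownian increments over the two fine time steps.
dw0 = rng.gauss(0.0, math.sqrt(dt_fine))
dw1 = rng.gauss(0.0, math.sqrt(dt_fine))

# Fine-level noise terms, cf. (3.9), and two fine steps with f = 0.
r_tilde_0 = math.sqrt(q) * decay * dw0
r_tilde_1 = math.sqrt(q) * decay * dw1
u_fine = decay * (decay * u0 + r_tilde_0) + r_tilde_1

# Coarse-level noise over the step 2 * dt_fine and one coarse step with f = 0.
r_tilde_coarse = math.sqrt(q) * decay ** 2 * (dw0 + dw1)
u_coarse = decay ** 2 * u0 + r_tilde_coarse

gap = u_fine - u_coarse
predicted_gap = (1.0 - decay) * r_tilde_1
```

In this sketch, `gap` and `predicted_gap` agree up to rounding and are non-zero, in contrast to the exponential Euler coupling, where the corresponding gap vanishes.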

3.3 Milstein MLMC Method

We consider the pairwise coupling of the coarse and fine Milstein solutions on level \(\ell \) with respective initial conditions

$$\begin{aligned} U^{\ell -1,C}_0 = P_{N_{\ell -1}}u_0 \quad \text {and} \quad U^{\ell ,F}_0 = P_{N_\ell }u_0. \end{aligned}$$

Two iterations of the fine solution take the form

$$\begin{aligned} \begin{aligned} U^{\ell ,F}_{2j+1,n}&= e^{-\lambda _n \Delta t_\ell } \left( U^{\ell ,F}_{2j,n} + \Delta t_\ell f_{N_{\ell },n}(U^{\ell ,F}_{2j,n}) + \sqrt{q_n}\big (w^{n}(t_{2j+1}^{\ell })-w^{n}(t_{2j}^{\ell }) \big ) \right) \\&= e^{-\lambda _n \Delta t_\ell } \left( U^{\ell ,F}_{2j,n} + \Delta t_\ell f_{N_{\ell },n}(U^{\ell ,F}_{2j,n}) \right) + \widehat{R}_{2j,n}^{\ell ,F} \end{aligned} \end{aligned}$$

and

$$\begin{aligned} \begin{aligned} U^{\ell ,F}_{2j+2,n}&= e^{-\lambda _n \Delta t_\ell } \left( U^{\ell ,F}_{2j+1,n} + \Delta t_{\ell } f_{N_{\ell },n}(U^{\ell ,F}_{2j+1,n}) + \sqrt{q_n}\big (w^{n}(t_{2j+2}^{\ell })-w^{n}(t_{2j+1}^{\ell }) \big ) \right) \\&= e^{-\lambda _n \Delta t_\ell } \left( U^{\ell ,F}_{2j+1,n} + \Delta t_{\ell } f_{N_{\ell },n}(U^{\ell ,F}_{2j+1,n}) \right) + \widehat{R}_{2j+1,n}^{\ell ,F} \end{aligned} \end{aligned}$$

for \((j,n) \in \llbracket 0,J_{\ell -1} -1\rrbracket \times \llbracket 1,N_\ell \rrbracket \) with

$$\begin{aligned} \widehat{R}_{k,n}^{\ell ,F}:= \sqrt{q_n} e^{-\lambda _n \Delta t_\ell } \left( w^{n}(t_{k+1}^{\ell })-w^{n}(t_{k}^{\ell })\right) , \; \quad (k,n) \in \llbracket 0,J_\ell -1\rrbracket \times \llbracket 1, N_\ell \rrbracket . \end{aligned}$$

One iteration of the coarse solution takes the form

$$\begin{aligned} \begin{aligned} U^{\ell -1,C}_{j+1,n}&= e^{-\lambda _n \Delta t_{\ell -1}} \left( U^{\ell -1,C}_{j,n} + \Delta t_{\ell -1} f_{N_{\ell -1},n}(U^{\ell -1,C}_{j,n}) \right) \\&+ \underbrace{\sqrt{q_n} e^{-\lambda _n \Delta t_{\ell -1}} \left( w^{n}(t_{j+1}^{\ell -1})-w^{n}(t_{j}^{\ell -1}) \right) }_{=: \widehat{R}^{\ell -1,C}_{j,n}} \end{aligned} \end{aligned}$$

for \((j,n) \in \llbracket 0,J_{\ell -1} -1\rrbracket \times \llbracket 1,N_{\ell -1}\rrbracket \).

We thus obtain the same coupling as for the drift-exponential Euler method:

$$\begin{aligned} \widehat{R}^{\ell -1,C}_{j,n} = \sqrt{q_n} e^{-\lambda _n 2\Delta t_{\ell }} \left( w^{n}(t_{2j+2}^{\ell })-w^{n}(t_{2j}^{\ell }) \right) = e^{-\lambda _n \Delta t_{\ell }}\left( \widehat{R}^{\ell ,F}_{2j,n} + \widehat{R}^{\ell ,F}_{2j+1,n} \right) . \end{aligned}$$

By a similar argument as for the previous MLMC method, this is a pathwise correct coupling, but it is not SDC.

In summary, we have presented three different MLMC methods, of which only the exponential Euler MLMC method has a coupling that is SDC. In particular, this means that the exponential Euler MLMC method outperforms the other methods when \(f=0\), and later comparisons of the strong convergence rate \(\beta \) in Theorems 2 and 3 and in the numerical experiments show that the outperformance is broader.

3.4 MLMC for SPDE

In this section, we present cost versus error results for the exponential Euler- and Milstein MLMC methods.

We recall that the computational cost of one simulation of a numerical method is defined by the computational effort needed, cf. (2.8), and that under Assumption 1, it holds for all three spectral Galerkin methods we consider that

$$\begin{aligned} \text {Cost}(U^{\ell ,F}(T)) = \text {Cost}(U^{\ell ,F}_{J_\ell }) = \text {Cost}(V^{N_\ell ,J_\ell }_{J_\ell }) \eqsim J_\ell N_\ell \log _2(N_\ell ), \end{aligned}$$

and

$$\begin{aligned} \text {Cost}( U^{\ell -1,C}(T), U^{\ell ,F}(T) ) \eqsim \text {Cost}( U^{\ell -1,C}_{J_{\ell -1}}) + \text {Cost}(U^{\ell ,F}_{J_\ell } ) \eqsim J_\ell N_\ell \log _2(N_\ell ). \end{aligned}$$
(3.11)

We will consider weak approximations of Banach-space-valued quantities of interest (QoI) of the following form:

Definition 2

(Admissible QoI) Let K be a Banach space equipped with the norm \(\Vert \cdot \Vert _K\) and let \(\varphi : H \rightarrow K\) be a strongly measurable and uniformly Lipschitz continuous QoI. We say that such a QoI is admissible if the cost of evaluating the mapping satisfies that

$$\begin{aligned} \sup _{v \in H^N} \text {Cost}(\varphi (v)) \lesssim N. \end{aligned}$$

We are ready to state the main result of this work.

Theorem 2

(Exponential Euler MLMC) Consider the SPDE (2.1) for a linear operator A with \(\lambda _n \eqsim n^2\) and let \(\varphi :H \rightarrow K\) be an admissible QoI, in the sense of Definition 2. If all assumptions in Appendix 1 hold for some \(\phi \in (0,1)\) and Assumption 1 holds, then the pathwise correctly coupled exponential Euler MLMC method with

$$\begin{aligned} J_\ell = 2^\ell J_0 \quad \text { and } \quad N_\ell \eqsim N_0 2^{\ell /(2\phi )} \end{aligned}$$
(3.12)

satisfies

  1. (i)

    \(\big \Vert \mathbb {E}\big [ {\varphi }(U^{\ell ,F}(T,\cdot )) - {\varphi }(U(T, \cdot )) \big ]\big \Vert _{{K}} \lesssim (\ell +1) 2^{-\ell }\).

  2. (ii)

    \(V_\ell := \mathbb {E}\Big [\Vert {\varphi }(U^{\ell ,F}(T,\cdot )) - {\varphi }( U^{\ell -1,C}(T, \cdot ))\Vert ^2_{{K}} \Big ] \lesssim (\ell +1)^2 \; 2^{-2\ell }\).

  3. (iii)

    \(C_\ell := \textrm{Cost}\big ({\varphi }(U^{\ell -1,C}(T)), {\varphi }(U^{\ell ,F}(T)) \big ) {\lesssim } (\ell +1) 2^{(1+1/(2\phi ))\ell }\).

And for any sufficiently small \(\epsilon >0\) and \(L:= \lceil \log _2\big ( \log _2(1/\epsilon )/\epsilon \big ) \rceil \), there exists a sequence \(\{M_{\ell }(\epsilon )\}_{\ell =0}^L \subset \mathbb {N}\) such that

$$\begin{aligned} \textrm{MSE} = \mathbb {E}\big [\big \Vert E_{ \mathrm {M\!L}} [{\varphi }(U(T,\cdot ))]-\mathbb {E}[ {\varphi }(U(T, \cdot )) ] \big \Vert ^2_{{K}} \big ] \lesssim \epsilon ^{2}, \end{aligned}$$
(3.13)

and

$$\begin{aligned} \begin{aligned} \mathrm {Cost(MLMC)}&:= \sum _{\ell =0}^L M_\ell C_\ell \\&\lesssim {\left\{ \begin{array}{ll} \epsilon ^{-2} &{} \text {if} \quad \phi \in (1/2,1) \\ \epsilon ^{-2} \big (\log _2(1/\epsilon )\big )^{5} &{} \text {if} \quad \phi = 1/2\\ \epsilon ^{-1 - 1/(2\phi )} \big (\log _2(1/\epsilon )\big )^{2+1/(2\phi )} &{} \text {if} \quad \phi \in (0,1/2). \end{array}\right. } \end{aligned} \end{aligned}$$
(3.14)

Proof

Let us show that the K-valued random variables \(\varphi (U^{\ell ,F}(T,\cdot ))\) and \(\varphi (U^{\ell ,C}(T,\cdot ))\) are well-defined. The Lipschitz continuity of the mapping \(\varphi \) implies that

$$\begin{aligned} \begin{aligned} \Vert \varphi (U^{\ell ,F}(T,\cdot ))\Vert _K&\le \Vert \varphi (U^{\ell ,F}(T,\cdot )) - \varphi ( 0)\Vert _K + \Vert \varphi ( 0)\Vert _K\\&\le C_{\varphi } \Vert U^{\ell ,F}(T,\cdot )\Vert _H + \Vert \varphi (0)\Vert _{K}, \end{aligned} \end{aligned}$$

where \(C_{\varphi }>0\) denotes the Lipschitz constant for \(\varphi \). It follows that \(\varphi (U^{\ell ,F}(T,\cdot )) \in L^2(\varOmega , K)\) for all \(\ell \ge 0\), and we similarly also have that \(\varphi (U^{\ell ,C}(T,\cdot )) \in L^2(\varOmega , K)\).

We note that the numerical resolution sequences are set according to (3.12) to balance the error from space- and time-discretization in Proposition 1. A pair of correctly coupled solutions \(U^{\ell ,F, (m)}\) and \(U^{\ell -1,C,(m)}\) can be viewed as exponential Euler solutions using the same driving \(Q-\)Wiener process \(W^{(m)}\) on levels \(\ell \) and \(\ell -1\), respectively. Consequently,

$$\begin{aligned} \begin{aligned} \mathbb {E}&\left[ \Vert {\varphi }(U^{\ell ,F}(T,\cdot ))-{\varphi }(U^{\ell -1,C}(T,\cdot ))\Vert ^2_{{K}}\right] \\&\le {C_{\varphi }^2} \mathbb {E}\left[ \Vert U^{\ell ,F}(T,\cdot )-U^{\ell -1,C}(T,\cdot )\Vert ^2_{{{H}}}\right] \\&\le \,2 {C_{\varphi }^2}\mathbb {E}\left[ \Vert U^{\ell ,F}(T,\cdot )-U(T,\cdot )\Vert _{ H}^2 + \Vert U(T,\cdot ) - U^{\ell -1,C}(T, \cdot )\Vert ^2_{ H} \right] \\&= 2 {C_\varphi ^2} \mathbb {E}\left[ \Vert V^{N_\ell ,J_\ell }_{J_\ell }-U(T,\cdot )\Vert _{ H}^2 + \Vert U(T,\cdot ) - V^{N_{\ell -1}, J_{\ell -1}}_{J_{\ell -1}}\Vert ^2_{ H} \right] \\&{\mathop {\lesssim }\limits ^{ (2.7)}}\, \lambda _{N_{\ell }}^{-2\phi } + \left( \frac{\log _2(J_\ell )}{J_\ell }\right) ^2 \\&\eqsim \, N_\ell ^{-2\phi } + \Big (\ell +1\Big )^2 2^{-2\ell } \\&\eqsim \, (\ell +1)^2 \, 2^{-2\ell }, \end{aligned} \end{aligned}$$

where we used Lipschitz continuity for the first inequality. This verifies rate (ii). Since \(\{\varphi (U^{\ell ,F}(T,\cdot ))\}_\ell \) is a Cauchy sequence in \(L^2(\varOmega ,K)\) with limit \(\varphi (U(T,\cdot ))\), we have that

$$\begin{aligned} \begin{aligned} \mathbb {E}\left[ {\varphi }(U(T, \cdot )) -{\varphi }(U^{\ell ,F}(T,\cdot ))\right]&= \mathbb {E}\left[ \sum _{j = \ell }^\infty {\varphi }(U^{j+1, F}(T, \cdot )) -{\varphi }(U^{j,F}(T,\cdot ))\right] \\&= \mathbb {E}\left[ \sum _{j = \ell }^\infty {\varphi }(U^{j+1, F}(T, \cdot )) -{\varphi }(U^{j,C}(T,\cdot ))\right] . \end{aligned} \end{aligned}$$

In the last equality we used that the coupling is pathwise correct: \(U^{j,F}(T,\cdot ) = U^{j,C}(T,\cdot )\), cf. Remark 4. Rate (i) follows from

$$\begin{aligned} \begin{aligned} \big \Vert \mathbb {E}\big [ {\varphi }(U(T, \cdot )) -{\varphi }(U^{\ell ,F}(T,\cdot )) \big ]\big \Vert _{{K}}&\le \sum _{j=\ell }^\infty \mathbb {E}\big [ \Vert {\varphi }(U^{j+1, F}(T, \cdot )) -{\varphi }(U^{j,C}(T,\cdot )) \Vert _K \big ] \\&{\mathop {\lesssim }\limits ^{(ii)}} \sum _{j=\ell +1}^{\infty } (j+1) 2^{-j}\\&\lesssim (\ell +1)2^{-\ell }. \end{aligned} \end{aligned}$$

Rate (iii) follows by (3.11) and Definition 2. Introducing the following number-of-samples-per-level sequence

$$\begin{aligned} M_\ell = \left\lceil \epsilon ^{-2} \sqrt{\frac{V_\ell }{C_\ell }} \sum _{j=0}^{ L}\sqrt{V_j C_j} \right\rceil \quad \ell \in \llbracket 0,L\rrbracket , \end{aligned}$$
(3.15)

and noting that \(L = \lceil \log _2(\log _2(1/\epsilon )/\epsilon ) \rceil \eqsim \log _2(1/\epsilon )\), we obtain (3.13) by a similar argument as in the proof of Theorem 1:

$$\begin{aligned} \begin{aligned} \mathbb {E}\big [\big \Vert E_{ \mathrm {M\!L}} [{\varphi }(U(T,\cdot ))] -\mathbb {E}[ {\varphi }&(U(T, \cdot )) ] \big \Vert ^2_{{K}} \big ] \\&= \mathbb {E}\Big [\left\| E_{ \mathrm {M\!L}}[ {\varphi }(U(T,\cdot ))] -{\mathbb E}[\varphi (U^{L,F}(T,\cdot ))]\right\| ^2_{{K}}\Big ]\\&\quad + \Vert \mathbb {E}[{\varphi }(U(T,\cdot )) - {\varphi }(U^{L,F}(T,\cdot ))]\Vert _{{K}}^2\\&\lesssim \sum _{\ell =0}^{ L} \frac{V_\ell }{M_\ell } + L^2 2^{-2L}\\&\lesssim \epsilon ^{2} \frac{\max ( \sum _{\ell =0}^L \sqrt{V_\ell C_\ell }, \, 1 )}{\max (\sum _{j=0}^L \sqrt{V_j C_j}, \, 1)} +L^2 \frac{\epsilon ^2}{(\log (\epsilon ))^2} \\&\eqsim \,\epsilon ^2. \end{aligned} \end{aligned}$$

For the computational cost, we have that

$$\begin{aligned} \begin{aligned}&\sum _{\ell =0}^LC_{\ell }\, M_{\ell }\lesssim \epsilon ^{-2} \left( \sum _{j=0}^L \sqrt{V_j C_j}\right) ^2 + C_L \\&\quad \lesssim \epsilon ^{-2} \left( \sum _{j=0}^L \sqrt{V_j C_j}\right) ^2 + \left( \frac{\epsilon }{ \log _2(1/\epsilon )} \right) ^{-( 1+ 1/(2\phi ) )}, \end{aligned} \end{aligned}$$

and (3.14) follows from using \(V_jC_j \lesssim (j+1)^3\, 2^{\,j\, (1/(2\phi )-1)}\) when bounding the squared sum from above.
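The sample allocation (3.15) and the resulting statistical error can be sketched as follows. This is an illustration under explicit assumptions: the bounds (ii) and (iii) are used as model values for \(V_\ell \) and \(C_\ell \) with all hidden constants set to 1, and the names `level_bounds` and `samples_per_level` as well as the values of `eps` and `phi` are hypothetical.

```python
import math


def level_bounds(level, phi):
    """Model values for V_l and C_l from rates (ii) and (iii), constants set to 1."""
    v = (level + 1) ** 2 * 2.0 ** (-2 * level)
    c = (level + 1) * 2.0 ** ((1.0 + 1.0 / (2.0 * phi)) * level)
    return v, c


def samples_per_level(eps, phi):
    """Number of levels L from Theorem 2 and sample sizes M_l from (3.15)."""
    L = math.ceil(math.log2(math.log2(1.0 / eps) / eps))
    bounds = [level_bounds(level, phi) for level in range(L + 1)]
    total = sum(math.sqrt(v * c) for v, c in bounds)
    M = [math.ceil(eps ** -2 * math.sqrt(v / c) * total) for v, c in bounds]
    return L, M, bounds


eps, phi = 0.05, 0.75  # hypothetical tolerance and regularity parameter
L, M, bounds = samples_per_level(eps, phi)

# Statistical error sum_l V_l / M_l and total cost sum_l M_l * C_l.
statistical_error = sum(v / m for (v, _), m in zip(bounds, M))
total_cost = sum(m * c for (_, c), m in zip(bounds, M))
```

Since \(M_\ell \ge \epsilon ^{-2}\sqrt{V_\ell /C_\ell }\sum _{j}\sqrt{V_jC_j}\), each summand satisfies \(V_\ell /M_\ell \le \epsilon ^2 \sqrt{V_\ell C_\ell }/\sum _j \sqrt{V_jC_j}\), so that `statistical_error` is bounded by \(\epsilon ^2\) in this model.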

Remark 5

A general framework for MLMC methods for reaction-diffusion type SPDE in the setting of \(\phi \ge 1/2\) and for numerical methods with a strong convergence rate 1/2 was first developed in [5]. When \(\phi =1/2\) and \(\gamma = 2\), the MSE \(\mathcal {O}(\epsilon ^{2-\delta })\) was achieved at the computational cost \(\mathcal {O}(\epsilon ^{-3})\) for that method in [5, Theorem 4.4] compared to a cost \(\mathcal {O}(\epsilon ^{-2})\) for our exponential Euler MLMC method. This is however not a fair performance comparison, since [5] was developed for more general SPDE with multiplicative noise and for which the operators A and Q need not share an eigenbasis, while our method is tailored to the additive-noise setting with A and Q sharing an eigenbasis, cf. Appendix 1.

We state a similar cost-versus-error result for the Milstein MLMC method with pathwise correct pairwise coupling.

Theorem 3

(Milstein MLMC) Consider the SPDE (2.1) for a linear operator A with \(\lambda _n \eqsim n^2\), let the assumptions in Corollary 1 hold for some \(\phi \in (1/2,1)\) and let Assumption 1 hold. Let \(\varphi :H \rightarrow K\) be an admissible QoI, in the sense of Definition 2. Then the pathwise correctly coupled Milstein MLMC method with

$$\begin{aligned} J_\ell = 2^\ell J_0 \quad \text { and } \quad N_\ell \eqsim N_0 2^{ \ell /2}, \end{aligned}$$
(3.16)

satisfies for any fixed \(\delta >0\) that

  1. (i)

    \(\big \Vert \mathbb {E}\big [ {\varphi }(U^{\ell ,F}(T,\cdot )) - {\varphi }(U(T, \cdot )) \big ]\big \Vert _K \lesssim 2^{-(\phi -\delta /2) \ell }\).

  2. (ii)

    \(V_\ell := \mathbb {E}\Big [\Vert {\varphi }(U^{\ell ,F}(T,\cdot )) - {\varphi }(U^{\ell -1,C}(T, \cdot ))\Vert ^2_K \Big ] \lesssim 2^{-(2\phi -\delta ) \ell }\).

  3. (iii)

    \(C_\ell := \textrm{Cost}\big ( {\varphi }(U^{\ell -1,C}(T)), {\varphi }(U^{\ell ,F}(T)) \big ) \lesssim (\ell +1) 2^{3\ell /2}\).

And for any sufficiently small fixed \(\delta >0\) and any sufficiently small \(\epsilon >0\), there exist an \(L(\epsilon ) \in \mathbb {N}\) and a sequence \(\{M_{\ell }(\epsilon )\}_{\ell =0}^L \subset \mathbb {N}\) such that

$$\begin{aligned} \textrm{MSE} = \mathbb {E}\big [\big \Vert E_{ \mathrm {M\!L}} [ {\varphi }(U(T,\cdot ))]-\mathbb {E}[ {\varphi }(U(T, \cdot )) ] \big \Vert ^2_K \big ] \lesssim \epsilon ^{2}, \end{aligned}$$
(3.17)

at the cost

$$\begin{aligned} \begin{aligned} \mathrm {Cost(MLMC)}&:= \sum _{\ell =0}^L M_\ell C_\ell \lesssim {\left\{ \begin{array}{ll} \epsilon ^{-2} &{} \text {if} \quad \phi \in (3/4,1) \\ \epsilon ^{-2(1+\delta )} &{} \text {if} \quad \phi = 3/4\\ \epsilon ^{-3(1 +\delta )/(2\phi )} &{} \text {if} \quad \phi \in (1/2,3/4). \end{array}\right. } \end{aligned} \end{aligned}$$
(3.18)

Proof

We set the numerical resolution sequences by (3.16) to balance the error from space- and time-discretization in Corollary 1, and, since the Milstein MLMC method is pathwise correctly coupled, the rates (i), (ii) and (iii) can be verified as in the proof of Theorem 2.

To prove the error and cost results, we relate the rates in (i), (ii) and (iii) to those in Theorem 1: for any \(\delta >0\), it holds that

$$\begin{aligned} \alpha =\phi -\delta /2, \quad \beta =2\phi -\delta \quad \text {and} \quad \gamma = 3/2+\delta \end{aligned}$$
(3.19)

For the case \(\phi \in (3/4,1)\) it holds for sufficiently small \(\delta >0\) that \(\beta > \gamma \), and the results (3.17) and (3.18) follow from Theorem 1.

For the case \(\phi \in (1/2,3/4]\), we again apply Theorem 1 to our rates \((\alpha ,\beta ,\gamma )\) in (3.19) to conclude that (3.17) is fulfilled at the cost

$$\begin{aligned} \mathrm {Cost(MLMC)} \lesssim \epsilon ^{-2-\frac{(\gamma -\beta )}{\alpha }} = \epsilon ^{-\frac{3/2 +\delta }{\phi -\delta /2}}, \end{aligned}$$

and taking \(\delta >0\) sufficiently small, it holds that

$$\begin{aligned} \frac{3/2 +\delta }{\phi -\delta /2} \le \frac{3(1 + \delta )}{2\phi }. \end{aligned}$$

Comparing Theorem 2 with Theorem 3, we expect exponential Euler MLMC to asymptotically outperform Milstein MLMC when the colored noise has low regularity, meaning when \(\phi < 3/4\).
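The comparison of the two cost bounds can be made concrete by tabulating the exponent \(r\) in \(\mathrm {Cost(MLMC)} \lesssim \epsilon ^{-r}\). The sketch below is a simplification rather than a restatement of the theorems: logarithmic factors in (3.14) are ignored, the limit \(\delta \rightarrow 0\) is taken in (3.18), and the function names are hypothetical.

```python
def cost_exponent_exp_euler(phi):
    """Exponent r with Cost ~ eps**(-r) from (3.14), ignoring logarithmic factors."""
    if not 0.0 < phi < 1.0:
        raise ValueError("Theorem 2 is stated for phi in (0, 1)")
    return 2.0 if phi >= 0.5 else 1.0 + 1.0 / (2.0 * phi)


def cost_exponent_milstein(phi):
    """Exponent r with Cost ~ eps**(-r) from (3.18) in the limit delta -> 0."""
    if not 0.5 < phi < 1.0:
        raise ValueError("Theorem 3 is stated for phi in (1/2, 1)")
    return 2.0 if phi >= 0.75 else 3.0 / (2.0 * phi)


# Side-by-side exponents for a few regularity values where both theorems apply.
for phi in (0.55, 0.65, 0.75, 0.9):
    print(phi, cost_exponent_exp_euler(phi), cost_exponent_milstein(phi))
```

For \(\phi \in (1/2, 3/4)\), the Milstein exponent \(3/(2\phi )\) exceeds 2, whereas the exponential Euler exponent equals 2 over the whole range \(\phi \in (1/2, 1)\) in this simplified reading.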

4 Numerical Examples

In this section, we numerically test the exponential Euler MLMC method against the drift-exponential- and Milstein MLMC methods. We study two reaction-diffusion SPDE, one with a linear reaction term and one with a trigonometric one. To showcase the superior performance of exponential Euler in settings with low-regularity colored noise, we consider one setting with \(\phi \approx 1/2\) (this is a low-regularity setting for the Milstein method) and we numerically confirm the theoretical result that the exponential Euler MLMC and the Milstein MLMC perform similarly when \(\phi \approx 3/4\), cf. Theorems 2 and 3.

For our numerical experiments we consider the general form of semilinear SPDE

$$\begin{aligned} dU_t =\,\left( AU_t + f(U_t)\right) \;dt+dW_t, \quad t \in [0,T], \end{aligned}$$

with the triangular-wave initial condition

$$\begin{aligned} u_0(x) = {\left\{ \begin{array}{ll} 2x, &{} x \in \left[ 0,\frac{1}{2}\right] , \\ 2(1-x), &{} x \in \left( \frac{1}{2},1\right] . \end{array}\right. } \end{aligned}$$

Furthermore we specify our space \(H = L^2(0,1)\) with Fourier basis functions \(e_n(x) = \exp (i 2n\pi x)\) for \(n \in \mathbb {Z}\), and the final time is set to \(T=1/2\). We consider a linear operator \(A:D(A) \rightarrow H\) defined as

$$\begin{aligned} A =-\sum _{n \in \mathbb {Z}} \lambda _n\, \langle \cdot , e_n\rangle \,e_n, \end{aligned}$$

with eigenvalues \((\lambda _n)_{n \in \mathbb {Z}}\), given as

$$\begin{aligned} \lambda _n = {\left\{ \begin{array}{ll} 1 &{} \textrm{if} \quad n = 0,\\ \frac{(2 n \pi )^2}{5} &{} \textrm{if} \quad n \in \mathbb {Z}\setminus \{0\}. \end{array}\right. } \end{aligned}$$

We note that the triangular-wave initial condition satisfies the following regularity condition: \(u_0 \in H_{3/4-\delta }\) for any \(\delta >0\).

For \(f:H \rightarrow H\), we consider the two different reaction terms which are presented in Table 1. Both belong to the class of Nemytskii operators, cf. [35].

Table 1 Different reaction terms \(f(U):H \rightarrow H\) tested in the numerical experiments

The driving noise dW is a \(Q-\)Wiener process (A.2) with

$$\begin{aligned} q_n:= \frac{1}{4}\lambda _n^{-2b} \qquad \text {for} \quad n\in \mathbb {Z}, \end{aligned}$$

for two different values of b: the low-regularity setting \(b=1/4\), and the smoother setting \(b=1/2\). In connection with Assumption 3, we note that

$$\begin{aligned} \sum _{n \in \mathbb {Z}} (\lambda _n)^{2\phi -1} q_n = \frac{1}{4} + \frac{2\pi ^2}{5} \sum _{n =1}^\infty n^{4(\phi -b) -2}< \infty \iff \phi < 1/4 +b. \end{aligned}$$

Consequently, we may take \(\phi =(1/4+b)-\delta \) for any \(\delta >0\), and for simplicity, we will refer to the parameter values for \(\phi \) as \(\phi (b=1/4) = 1/2-\) and \(\phi (b=1/2) =3/4-\), respectively. When Theorems 2 and 3 apply, we expect exponential Euler MLMC to outperform Milstein MLMC when \(\phi <3/4\), and that the methods perform similarly when \(\phi > 3/4\).
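The relation between \(b\) and the admissible values of \(\phi \) can be recorded in a few lines. The following sketch encodes the eigenvalues \(\lambda _n\), the noise coefficients \(q_n\) and the convergence threshold of the series above; the function names are hypothetical.

```python
import math


def eigenvalue(n):
    """Eigenvalue lambda_n of -A for the Fourier index n."""
    return 1.0 if n == 0 else (2.0 * n * math.pi) ** 2 / 5.0


def noise_coefficient(n, b):
    """Noise coefficient q_n = lambda_n^(-2b) / 4."""
    return 0.25 * eigenvalue(n) ** (-2.0 * b)


def summand_decay_exponent(phi, b):
    """Power p such that lambda_n^(2 phi - 1) * q_n is proportional to |n|^p for n != 0."""
    return 4.0 * (phi - b) - 2.0


def phi_supremum(b):
    """The series converges iff p < -1, that is, iff phi < 1/4 + b."""
    return 0.25 + b


thresholds = {b: phi_supremum(b) for b in (0.25, 0.5)}
```

For the two settings considered here, the thresholds are \(1/2\) for \(b=1/4\) and \(3/4\) for \(b=1/2\), and the series diverges at the threshold itself since the summand then decays like \(n^{-1}\).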

Note however that some of our numerical studies are purely experimental, as neither of the theorems apply to all problem settings we consider. Theorem 2 only applies to the linear reaction term, because the trigonometric reaction term has no Fréchet derivative that belongs to L(H), and this violates Assumption 4. We do however believe the regularity assumptions in Proposition 1 can be relaxed so that it also applies to the trigonometric reaction term, but, to the best of our knowledge, it is an open problem to prove this.

For the Milstein method, on the other hand, Assumption 6 does hold whenever \(\phi > 1/2\) and \(\kappa > 1/4\), with Fréchet derivatives \(f'(\cdot ) = 4\pi (\cos (2\pi \cdot ) -\sin (2\pi \cdot ))\) and \(f''(\cdot ) = -8\pi ^2f(\cdot )\). (This can be verified using the definition of Fréchet derivatives and that \(L^\infty (0,1) \subset H_{\kappa }\).) But Theorem 3 only applies when \(\phi >1/2\).

Fig. 1 RMSE in time and space for the SPDE with the linear reaction term. The top row provides rates for the low-regularity setting with \(b=1/4\) and the bottom row provides rates for \(b=1/2\)

4.1 Numerical Estimates of the Convergence Rate \(\beta \)

Numerical estimates of the root mean squared error (RMSE) convergence rates in time and space for all three methods are presented in Figs. 1 and 2. The RMSE in time is approximated by

$$\begin{aligned} \sqrt{E_{M}[ \Vert V^{N_*, 2J}(T, \cdot ) - V^{N_*, J}(T, \cdot ) \Vert _H^2 ]}, \end{aligned}$$

where J is varied and \(N_* = 1024\) is fixed, and using \(M=10000\) independent samples of the random variable in the Monte Carlo estimator. For the exponential Euler method we observe the rate 1 and for the other methods, we observe the rate \(\phi (b) = 1/4+b\).

The RMSE in space is approximated by

$$\begin{aligned} \sqrt{E_{M}[ \Vert V^{2N, J_*}(T, \cdot ) - V^{N, J_*}(T, \cdot ) \Vert _H^2 ] }, \end{aligned}$$

where N is varied and \(J_* = 2^{18}\) is fixed, and using \(M=250\) independent samples. This error describes the RMSE convergence rate in N, which we observe to be \(2\phi = 1/2+2b\) for all methods.
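Given such successive-difference RMSE values, a convergence rate can be read off as the slope in a log-log plot. The sketch below fits this slope by least squares. It uses synthetic error data, which is an assumption for illustration and not data from the experiments reported here, and the function name `estimate_rate` is hypothetical.

```python
import math


def estimate_rate(resolutions, errors):
    """Least-squares slope of log2(error) against log2(resolution), sign-flipped."""
    xs = [math.log2(r) for r in resolutions]
    ys = [math.log2(e) for e in errors]
    x_mean = sum(xs) / len(xs)
    y_mean = sum(ys) / len(ys)
    covariance = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    variance = sum((x - x_mean) ** 2 for x in xs)
    return -covariance / variance


# Synthetic data (not results from the paper): error = 3 / J, i.e. rate 1 in J.
resolutions = [2 ** k for k in range(4, 10)]
errors = [3.0 / J for J in resolutions]
rate = estimate_rate(resolutions, errors)
```

With Monte Carlo estimates in place of the synthetic errors, the fitted slope carries sampling noise, so the number of samples \(M\) limits how precisely a rate can be identified.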

Fig. 2 RMSE in time and space for the SPDE with the trigonometric reaction term. The top row provides the rates for the low-regularity setting with \(b=1/4\) and the bottom row provides the rates for \(b=1/2\)

Since the \(\beta \) in Theorem 1 represents the MSE, the numerical experiments indicate that \(\beta = 2\min (1, 2\phi ) = 2\) for exponential Euler MLMC and \(\beta = 2\phi (b) = 1/2 + 2b\) for the other two methods. We will further set \(\alpha = \beta /2\) as the weak rate when implementing all MLMC methods.

4.2 Method Parameters

All three methods are implemented using Theorem 1 with the numerical estimates of the rates \(\alpha \) and \(\beta \), rather than using the slightly more conservative rate for \(\beta \) in Theorem 2 (ii).

4.2.1 Exponential Euler MLMC Method

We use the estimated rates \(\alpha =1\) and \(\beta =2\) and balance error contributions in time and space by setting

$$\begin{aligned} J_\ell = 2^{\ell +2} \quad \text {and} \quad N_\ell = 2 \times \lceil 2^{\ell /(2 \phi ) +1} \rceil , \end{aligned}$$

and \(L = \lceil \log _2( 1/\epsilon )/\alpha \rceil = \lceil \log _2( 1/\epsilon )\rceil \). We set \(C_\ell := (\ell +1) 2^{\gamma \ell }\) with \(\gamma = 1 + 1/(2\phi )\), which one may verify is consistent with \(C_\ell \eqsim J_\ell N_\ell \log _2(N_\ell )\), and we set \(V_\ell := 2^{-2\ell }\) to determine the sequence \(\{M_\ell \}_\ell \), in compliance with formula (3.15), by

$$\begin{aligned} M_\ell (\epsilon ) = {\left\{ \begin{array}{ll} 20 \left\lceil 2\epsilon ^{-2} \sqrt{\frac{V_\ell }{C_\ell }} \sum _{j=0}^L\sqrt{V_j C_j} \right\rceil &{} \text {if} \quad \ell = 0\\ \; \;5 \left\lceil 2\epsilon ^{-2} \sqrt{\frac{V_\ell }{C_\ell }} \sum _{j=0}^L\sqrt{V_j C_j} \right\rceil &{} \text {otherwise.} \end{array}\right. } \end{aligned}$$
(4.1)
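To make formula (4.1) concrete, the following minimal sketch evaluates the sample counts \(M_\ell (\epsilon )\) under the choices of \(L\), \(C_\ell \) and \(V_\ell \) stated above for the exponential Euler MLMC method. The function name `exp_euler_samples` is hypothetical and the code is an illustration under these stated choices, not the authors' implementation.

```python
import math

def exp_euler_samples(eps, phi):
    """Sample counts M_0, ..., M_L from formula (4.1), Sect. 4.2.1 choices."""
    L = math.ceil(math.log2(1.0 / eps))          # alpha = 1
    gamma = 1.0 + 1.0 / (2.0 * phi)
    C = [(l + 1) * 2.0 ** (gamma * l) for l in range(L + 1)]  # cost model
    V = [2.0 ** (-2 * l) for l in range(L + 1)]               # variance model
    total = sum(math.sqrt(v * c) for v, c in zip(V, C))
    M = []
    for l in range(L + 1):
        base = math.ceil(2.0 * eps ** -2 * math.sqrt(V[l] / C[l]) * total)
        M.append((20 if l == 0 else 5) * base)   # prefactor 20 on level 0
    return M

M = exp_euler_samples(eps=2.0 ** -11, phi=0.5)
print(M[0], M[1], M[-1])
```

With \(\phi = 1/2\) and \(\epsilon = 2^{-11}\), this evaluation should agree with the sample counts per level quoted for the pseudo-reference solution in Sect. 4.4.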

4.2.2 Drift-Exponential Euler MLMC and Milstein MLMC

The numerically observed convergence rates for both of these methods are \(\beta = 2\phi \) and \(\alpha = \phi \). For both methods, we set

$$\begin{aligned} J_\ell = 2^{\ell +2}, \quad N_\ell = 2\times \lceil 2^{\ell /2 +1}\rceil , \quad \text {and} \quad L = \left\lceil \frac{\log _2(1/\epsilon )}{\phi } \right\rceil -2, \end{aligned}$$

\(C_\ell := (\ell +1) 2^{3\ell /2}\) and \(V_\ell := 2^{- 2\phi \ell }\), and we determine the sequence \(\{M_\ell \}_{\ell }\) by formula (4.1).
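The level hierarchy for these two methods can be tabulated as in the following minimal sketch. The function name `level_parameters` is hypothetical, and the code assumes that \(\epsilon \) is small enough for \(L \ge 0\).

```python
import math

def level_parameters(eps, phi):
    """Resolutions and cost/variance models for the Sect. 4.2.2 choices."""
    L = math.ceil(math.log2(1.0 / eps) / phi) - 2
    levels = range(L + 1)
    J = [2 ** (l + 2) for l in levels]                        # time steps
    N = [2 * math.ceil(2.0 ** (l / 2 + 1)) for l in levels]   # spatial modes
    C = [(l + 1) * 2.0 ** (1.5 * l) for l in levels]          # cost model
    V = [2.0 ** (-2.0 * phi * l) for l in levels]             # variance model
    return L, J, N, C, V

L, J, N, C, V = level_parameters(eps=2.0 ** -4, phi=0.5)
for l in range(L + 1):
    print(l, J[l], N[l], C[l], V[l])
```

Note that \(N_\ell \) here does not depend on \(\phi \), in contrast to the exponential Euler MLMC setting, while \(V_\ell \) decays more slowly when \(\phi < 1\).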

4.3 Linear Reaction Term

Fig. 3

Comparison of the couplings at the finest level (blue) and the coarsest level (red) for the SPDE with linear reaction term. The spatial resolution is fixed to \(N=2^8\) in all simulations and the resolution in time \((J, J/2)\) is given by \(J=2^6, 2^8,\) and \(2^{10}\) from top to bottom row (Color figure online)

We first consider the SPDE with \(f(U) = U\). Figure 3 presents pairwise coupled realizations for progressively finer resolution in time for the settings \(b=1/4\) and \(b=1/2\). In the low-regularity setting \(b=1/4\), we clearly observe that the exponential Euler method has far less smoothing of the solutions and achieves a stronger coupling than the other methods. The difference between the methods becomes less visible in the smoother setting \(b=1/2\).

Figure 4 provides the approximation of \(\mathbb {E}[U(T,\cdot )]\) by one simulation of each of the MLMC methods for the input tolerances \(\epsilon = 2^{-\ell }\), \(\ell = 4,5,\ldots ,9\). We observe that all methods converge to the mean at approximately the same rate for both values of b.

Fig. 4

Approximation of \(E_{\textrm{ML}}[U(T,x)]\) for the SPDE with linear reaction term for different values of \(\epsilon = 2^{-4}, 2^{-5}, \ldots , 2^{-9}\) in full colored lines (blue, orange, green, red, purple, brown) and the pseudo-reference solution \(\mathbb {E}[U(T,x)]\) (dashed line). From left to right, exponential Euler, drift-exponential Euler, and Milstein. Top row is for the low-regularity setting \(b=1/4\) and bottom row is for \(b=1/2\) (Color figure online)

Fig. 5

Top row: convergence and computational cost plots for the SPDE with \(f(U)=U\) and \(b=1/4\). Bottom row: similar plots for the setting with \(b=1/2\) (Color figure online)

Figure 5 presents the MLMC approximation error versus tolerance and the computational cost versus tolerance for different input tolerances \(\epsilon \) for one simulation. The approximation error

$$\begin{aligned} \Vert E_\mathrm{{ML}}[U(T,\cdot )](\omega ; \epsilon ) - \mathbb {E}\left[ U(T, \cdot )\right] \Vert _H^2 \end{aligned}$$

is computed for one simulation of the MLMC estimator for each input of \(\epsilon \), where the pseudo-reference solution \(\mathbb {E}\left[ U(T, \cdot )\right] \) is obtained by solving the PDE

$$\begin{aligned} dU_t^N = (A_N U_t^N + f_N(U^N)) dt, \qquad U_0^N = P_N u_0 \end{aligned}$$

with the exponential Euler method using the resolutions \(N=2^{13}\) and \(J=2^{18}\). Let us also recall that the computational cost of the MLMC methods is defined by \(\sum _{\ell =0}^L C_\ell M_\ell \).
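The two quantities plotted against the tolerance can be evaluated as in the following minimal sketch. The function names `mlmc_cost` and `squared_error` are hypothetical, the squared \(H\)-norm is assumed to be approximated by a mean over a uniform grid on \((0,1)\), and the numbers in the usage lines are made up for illustration.

```python
import numpy as np

def mlmc_cost(C, M):
    """Computational cost sum_l C_l * M_l of one MLMC run."""
    return float(np.dot(np.asarray(C, dtype=float), np.asarray(M, dtype=float)))

def squared_error(estimate, reference):
    """Grid approximation of the squared H-norm of estimate - reference."""
    diff = np.asarray(estimate, dtype=float) - np.asarray(reference, dtype=float)
    return float(np.mean(diff ** 2))

# Illustrative (made-up) inputs: two levels and a four-point grid.
cost = mlmc_cost(C=[1.0, 4.0], M=[10, 2])
err = squared_error([1.0, 1.0, 1.0, 1.0], [0.0, 0.0, 0.0, 0.0])
print(cost, err)
```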

For \(\phi = 1/2-\), we observe that the exponential Euler MLMC method achieves the error \(\mathcal {O}(\epsilon ^2)\) at the cost \(\mathcal {O}((\log _2(\epsilon ))^2 \epsilon ^{-2})\), while the other methods achieve similar accuracy at considerably higher cost. For \(\phi = 3/4-\), all three methods achieve an error \(\mathcal {O}(\epsilon ^2)\) at a comparable computational cost. The observations are consistent with theory.

Fig. 6

Approximation of \(E_{\textrm{ML}}[U(T,x)]\) for the SPDE with trigonometric reaction term for different values of \(\epsilon = 2^{-4}, 2^{-5}, \ldots , 2^{-9}\) in full colored lines (blue, orange, green, red, purple, brown) and the pseudo-reference solution \(\mathbb {E}[U(T,x)]\) (dashed line). From left to right, exponential Euler, drift-exponential Euler, and Milstein. Top row is for the low-regularity setting \(b=1/4\) and bottom row is for \(b=1/2\) (Color figure online)

Fig. 7

Top row: convergence and computational cost plots for the SPDE with \(f(U)= 2(\sin (2\pi U) + \cos (2\pi U)) \) and \(b=1/4\). Bottom row: similar plots for the setting with \(b=1/2\)

4.4 Trigonometric Reaction Term

We next consider the SPDE with

$$\begin{aligned} f(U)(x) = 2\big (\sin (2 \pi U(x)) + \cos (2 \pi U(x)) \big ). \end{aligned}$$

The approximation of \(\mathbb {E}[U(T,\cdot )]\) by the MLMC methods for different inputs \(\epsilon = 2^{-\ell }\), \(\ell = 4,5,\dots ,9\), is presented in Fig. 6, and Fig. 7 shows the MLMC approximation error versus computational cost and the computational cost versus tolerance for different input tolerances \(\epsilon \). For each value of b, the pseudo-reference solution used for evaluating the approximation error is computed by the exponential Euler MLMC method \(E_{\textrm{ML}}[ U(T,\cdot )](\omega , \epsilon ) \approx \mathbb {E}[U(T,\cdot )]\) with the overkill parameter value \(\epsilon = 2^{-11}\). This is an expensive computation using the following number of samples per level when \(b=1/4\):

$$\begin{aligned} (M_0, M_1, M_2, \ldots , M_{10}, M_{11}) = (4907168680, 216868270, 44268050, \ldots , 355, 85), \end{aligned}$$

with \(N_{\ell } = J_{\ell } = 2^{\ell +2}\). We observe once again that exponential Euler MLMC outperforms the other methods in the low-regularity setting \(b=1/4\) and that all methods perform similarly when \(b=1/2\).

5 Conclusion

Our objective in this work was to show both theoretically and experimentally that coupling approaches that exploit more information than only the driving noise \(W_t\), such as the exponential Euler MLMC method, can result in strong coupling and improve the efficiency of weak approximations for SPDE. Our motivation was the lack of literature on strong coupling for MLMC methods solving SPDE. In particular, we have derived explicit convergence rates, related to the decay of the mean squared error-to-cost rate, for the exponential Euler MLMC method and the Milstein MLMC method, cf. Theorems 2 and 3. The convergence rate for the exponential Euler MLMC method is an improvement over existing MLMC methods for reaction-diffusion SPDE with additive noise. We also presented numerical experiments, on SPDE with linear and nonlinear reaction terms, that illustrate the derived rates and demonstrate the efficiency gains of the exponential Euler MLMC method over the alternatives.

There are many possible extensions of this work. It would be interesting to understand whether strong couplings can also improve the efficiency of MLMC for other numerical solvers for SPDE, such as finite difference methods and FEM [2, 5]. This is a challenging problem, due to the seemingly limitless possibilities of couplings for infinite-dimensional problems. Another direction is to develop a multi-index Monte Carlo method [18, 28] based on the pathwise correctly coupled exponential Euler method. This has the potential to further improve tractability in higher-dimensional physical space and in low-regularity settings.