Abstract
Multilevel Monte Carlo (MLMC) has become an important methodology in applied mathematics for reducing the computational cost of weak approximations. For many problems, it is well-known that strong pairwise coupling of numerical solutions in the multilevel hierarchy is needed to obtain efficiency gains. In this work, we show that strong pairwise coupling indeed is also important when MLMC is applied to stochastic partial differential equations (SPDE) of reaction-diffusion type, as it can improve the rate of convergence and thus improve tractability. For the MLMC method with strong pairwise coupling that was developed and studied numerically on filtering problems in (Chernov in Num Math 147:71-125, 2021), we prove that the rate of computational efficiency is higher than for existing methods. We also provide numerical comparisons with alternative coupling ideas on linear and nonlinear SPDE to illustrate the importance of this feature.
1 Introduction
The efficiency of numerical methods is a very important topic for practitioners that has lately seen a surge of interest in the field of uncertainty quantification (UQ) [40,41,42]. UQ seeks to combine statistical and probabilistic techniques with traditional numerical schemes to improve the modeling and the accuracy of estimates. Examples of applications include climate modeling, subsurface flow, medical imaging and deep learning [1, 12, 37]. A particular focus has been placed on the class of numerical methods known as Monte Carlo (MC) methods, which are used to solve problems incorporating elements of randomness or uncertainty [35, 39, 42], i.e., in stochastic computations. One methodology that has exhibited improved efficiency and a high level of applicability is multilevel Monte Carlo (MLMC).
MLMC is a numerical technique aimed at reducing the computational cost of the Monte Carlo method. The methodology was first introduced by Heinrich [22] and extended and popularized by various works on diffusion processes by Giles [14, 15]. The methodology of MLMC can be viewed as a variance-reduction technique. Since these works, MLMC has been applied in numerous areas, including stochastic filtering [6, 13, 24, 26, 27], Markov chain Monte Carlo (MCMC) [7, 10] and partial differential equations with random input arising in UQ [2, 7, 19]. MLMC starts from a given problem that requires a discretization, such as estimating an expectation at some terminal time w.r.t. the law of a diffusion process. For instance, in the diffusion case, this can be a time-discretization based on the Euler method. One then decomposes an expectation w.r.t. a law associated to a very precise discretization into a telescoping sum of differences of expectations associated to laws of increasingly coarse discretizations. The objective is then to sample from coupled probability distributions associated to consecutive discretized laws and to apply Monte Carlo at each summand of the telescoping sum to achieve a variance reduction, relative to using Monte Carlo at the finest discretization. The word level in the acronym MLMC refers to the resolution of the discretization.
Despite the substantial advancements made with MLMC, the number of applications and research papers on applying the methodology to stochastic partial differential equations (SPDE) [29] is relatively small. Such examples include finite-difference solvers with applications in mathematical finance [16], and finite element methods for parabolic SPDE [3, 5]. There are open questions on the efficiency and scope of MLMC for SPDE which we use as motivation for this work: is it possible to improve the efficiency of MLMC through strong pairwise coupling of numerical solutions of SPDE, and can that widen the scope of MLMC on SPDE to problems in higher dimensions and with lower-regularity driving noise?
Our objective in this manuscript is to present a complexity study of an alternative way to apply MLMC for SPDE, which can demonstrate computational gains. This approach is based on the exponential Euler method [11, 23, 30, 36] and strong pairwise coupling of solution realizations on different levels. The strong pairwise coupling approach was introduced and studied experimentally in the work on finite-dimensional Langevin SDE by Müller et al. [38] and extended to filtering methods for (infinite-dimensional) SPDE by Chernov et al. [8]. The coupling idea is based on the exponential Euler integrator [11, 23, 33, 36] for time-discretization of reaction-diffusion type SDE/SPDE. For the finite-dimensional SDE in [38], strong coupling is shown to produce constant-factor efficiency gains in numerical experiments, whereas for the herein considered class of SPDE, we show that strong coupling reduces the asymptotic rate of growth in the computational cost. This indicates that strong coupling for MLMC can lead to more substantial asymptotic efficiency gains for infinite-dimensional problems than for finite-dimensional ones.
The main contribution of this work is to demonstrate the improvements of the discussed coupling approach, for numerically solving SPDE. This is presented in the standard-format cost-versus-error result for exponential Euler MLMC in Theorem 2. Specifically, our findings suggest that in order to achieve \(\mathcal {O}(\epsilon ^{2})\) mean squared error (MSE) in a standard setting, we have to pay \(\mathcal {O}(\epsilon ^{-2})\) in computational cost. This is a reduction in cost compared to other existing methods, such as the Milstein MLMC method, for which the cost is \(\mathcal {O}(\epsilon ^{-3})\), cf. Theorem 3 and [5], and it is, to the best of our knowledge, the first theoretical result on the performance of the exponential Euler MLMC for any nonlinear stochastic PDE. We also verify these gains numerically on two SPDE, one with a linear reaction term and one with a nonlinear one.
The outline of this paper is as follows. In Sect. 2 we describe our model problem, which is a semilinear SPDE, and review fundamental properties of the MLMC method. Section 3 describes our proposed coupling method for MLMC and two alternative methods. We also summarize the theoretical properties of our main MLMC method in Theorem 2. Numerical experiments on various SPDE are conducted in Sect. 4 to demonstrate the improvement with the proposed coupling. Finally, we conclude our findings, and provide future areas of research, in Sect. 5. Required model assumptions are provided in the Appendix.
2 Background Material
In this section we present and review the MLMC method applied to numerical discretizations of SPDEs. We first introduce the SPDE under consideration, and then review the approximation methods: a spectral Galerkin spatial discretization combined with either exponential Euler or Milstein discretization in time.
2.1 Notation
Let \(T>0\) and let \((\varOmega , \mathcal {F}, \mathbb {P})\) be a complete probability space equipped with a filtration \(\mathcal {F}_{t\in [0,T]}\). H denotes a non-empty separable Hilbert space with inner product \(\langle \cdot , \cdot \rangle \), norm \(\Vert \cdot \Vert _{ H} = \sqrt{\langle \cdot , \cdot \rangle }\) and orthogonal basis \((e_n)_{n=1}^{\infty }\). \(L^2(\varOmega ,H)\) denotes the associated Bochner–Hilbert space, consisting of the set of strongly measurable maps \(f:\varOmega \rightarrow H\) such that
\[ \Vert f\Vert _{L^2(\varOmega ,H)} := \sqrt{\mathbb {E}\big [ \Vert f\Vert _{H}^2 \big ]} < \infty . \]
Let \(\mathbb {N}:= \{1,2,\ldots \}\), and for every \(N\in \mathbb {N}\), we introduce the finite-dimensional subspace \(H^N:= \text {span} \{ e_n\mid n=1,\dots ,N\}\subset H\) and the associated orthogonal projection operator \(P_{ N} v:= \sum ^{ N}_{n=1} \langle v,e_n \rangle \, e_n\) for \(v\in H\). For a (normally implicitly given) set B and mappings \(f,g:B \rightarrow [0,\infty )\), the notation \(f \lesssim g\) means that there exists a \(C>0\) such that \(f(x) \le C g(x)\) for all \(x\in B\), and the notation \(f\eqsim g\) means that both \(f \lesssim g\) and \(g\lesssim f\) hold. For multivariate positive-valued functions f(x, y) and g(x, y) for which it holds for some \(C>0\) that \(f(x,y) \le C g(x,y)\) for all \((x,y) \in \text {Domain}(f) = \text {Domain}(g)\), we write \(f \lesssim _{(x,y)} g\) if confusion is possible. Similarly, \(f \eqsim _{(x,y)} g\) means that \(f \lesssim _{(x,y)} g\) and \(g \lesssim _{(x,y)} f\). For \(m,n \in \mathbb {Z}\) with \(m\le n\), we introduce the integer interval \(\llbracket m,n\rrbracket := [m,n]\cap \mathbb {Z}\), and for \(x\in \mathbb {R}\) we define \(\lceil x \rceil :=\min \{n \in \mathbb {Z}\mid n\ge x\}\).
2.2 Problem Setup
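We consider a semilinear stochastic partial differential equation [9] of the form
\[ dU_t = \big ( A U_t + f(U_t) \big )\,dt + dW_t, \quad t\in (0,T], \qquad U_0 = u_0, \tag{2.1} \]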
where \(A:D(A) \rightarrow H \) is a linear operator, \(u_0\in H\) is a random-valued initial condition, \(f:H \rightarrow H\) is a reaction term that in general is nonlinear, and \(W_t\) is a Q-Wiener process, cf. (A.2). A number of further assumptions are imposed for the problem, which we have deferred to Appendix 1. Suffice it to say here that we do assume that the linear operator is negative-definite and spectrally decomposable in the considered basis:
\[ A v = \sum _{n=1}^{\infty } -\lambda _n \langle v, e_n \rangle \, e_n \quad \text {for all } v \in D(A), \tag{2.2} \]
and that the Q-Wiener process takes the form
\[ W_t = \sum _{n=1}^{\infty } \sqrt{q_n}\, e_n\, w^n_t, \]
where \((w^n_t)_{n=1}^{\infty }\) is a sequence of independent scalar-valued Wiener processes. We note that the eigenbasis of the operator A, \((e_n)_{n=1}^{\infty }\), also appears in the representation of the Q-Wiener process. The strictly positive sequence \((\lambda _n)_{n=1}^\infty \) and the non-negative sequence \((q_n )_{n=1}^\infty \) are further described in Appendix 1.
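The mild solution to Eq. (2.1) is an H-valued predictable process \((U_t)_{t\in [0,T]}\) satisfying
\[ U_t = e^{At}u_0 + \int _0^t e^{A(t-s)} f(U_s)\,ds + \int _0^t e^{A(t-s)}\,dW_s \quad \text {for all } t\in [0,T], \tag{2.3} \]
\(\mathbb {P}\)-almost surely.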
The general form of (2.1) encapsulates numerous SPDE in practice. We will introduce and numerically study some of these in Sect. 4.
2.3 Numerical Methods
Numerical approximations of SPDE have traditionally been computed through the use of finite difference methods and finite element methods (FEM) [29, 35, 43]. For the purposes of this work, we instead use an alternative class of Galerkin-based solvers, which we review below.
2.3.1 Continuous-time Spectral Galerkin Methods
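For \(N\in {\mathbb N}\), consider the Galerkin problem of solving the SPDE (2.1) on the subspace \(H^N\):
\[ dU^{N}_t = \big ( A_{N} U^{N}_t + f_{N}(U^{N}_t) \big )\,dt + dW^{N}_t, \qquad U^{N}_0 = P_{N} u_0, \tag{2.4} \]
where \(A_{ N}:=P_{ N}A\), \(f_{ N}(v):=P_{ N}(f(v))\) and \(W^{ N}_t:= P_{ N} W_t=\sum ^{ N}_{n=1}\sqrt{q_n}\,e_n\, w^n_t.\) It is well-known [32] that (2.4) has a unique mild solution given by
\[ U^{N}_t = e^{A_{N}t}P_{N}u_0 + \int _0^t e^{A_{N}(t-s)} f_{N}(U^{N}_s)\,ds + \int _0^t e^{A_{N}(t-s)}\,dW^{N}_s. \]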
We next discuss two time-discretizations of spectral Galerkin methods.
2.3.2 The Exponential Euler Method
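For a given \(J\in {\mathbb N}\), let \(\Delta t =\frac{T}{J}\) and let \((t_j)_{j=0}^{ J}\) be the nodes of a uniformly spaced mesh of [0, T], so that \(t_j=j\,\Delta t\) for \(j\in \llbracket 0,J\rrbracket \). Then, for given \(N\in {\mathbb N}\), the exponential Euler approximations \((V^{ N,J}_j)_{j=0}^{ J}\subset H^N\) of \((U^{ N}_{t_j})_{j=0}^J\) are defined by
\[ V^{N,J}_{j+1} = e^{A_{N}\Delta t}\, V^{N,J}_j + A_{N}^{-1}\big ( e^{A_{N}\Delta t} - I \big ) f_{N}(V^{N,J}_j) + \int _{t_j}^{t_{j+1}} e^{A_{N}(t_{j+1}-s)}\,dW^{N}_s, \quad j\in \llbracket 0,J-1\rrbracket , \tag{2.5} \]
with \(V^{N,J}_0 = P_{N}u_0\), where \(A_{ N}^{-1}:H^N\rightarrow H^N\) denotes the inverse operator of \(A_{ N}\). Defining for \(n \in \llbracket 1,N\rrbracket \) the components of \(V^{ N,J}_j\) and \(f_{ N}(\cdot )\) by \(V^{ N,J}_{j,n}:=\langle V^{ N,J}_j,e_n \rangle \) and \(f_{{ N},n}(\cdot ):=\langle f(\cdot ),e_n\rangle \), respectively, and recalling the spectral decomposition of the operator A, we arrive at the recursive relation
\[ V^{N,J}_{j+1,n} = e^{-\lambda _n \Delta t}\, V^{N,J}_{j,n} + \frac{1-e^{-\lambda _n \Delta t}}{\lambda _n}\, f_{N,n}(V^{N,J}_j) + R_{j,n}, \tag{2.6} \]
where
\[ R_{j,n} := \sqrt{q_n} \int _{t_j}^{t_{j+1}} e^{-\lambda _n (t_{j+1}-s)}\,dw^n_s \sim N\Big ( 0, \frac{q_n \big (1-e^{-2\lambda _n \Delta t}\big )}{2\lambda _n} \Big ). \]
We recall from (2.2) that \(-\lambda _n\) denotes the n-th eigenvalue of the operator A, see also Assumption 2 in Appendix 1 for further details. Convergence properties of the exponential Euler scheme have been studied in [30], which demonstrates strong convergence and highlights an improvement in the order of convergence in time over traditional numerical schemes: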
Proposition 1
(Jentzen and Kloeden [30]) Let all assumptions in Appendix 1 hold for some \(\phi \in (0,1)\) relating to the regularity of the Q-Wiener process. Then
where U is the mild solution (2.3) of (2.1), and \((V^{ N,J}_j)_{j=0}^{ J}\) denotes the exponential Euler approximation of the mild solution, cf. (2.5).
We note that the first term on the RHS of (2.7) is related to the discretization in space and the second term is related to the discretization in time.
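To make the component-wise recursion concrete, the following minimal Python sketch advances the spectral coefficients by exponential Euler steps of the form \(V_{j+1,n} = e^{-\lambda _n \Delta t} V_{j,n} + \lambda _n^{-1}(1-e^{-\lambda _n \Delta t}) f_{N,n}(V_j) + R_{j,n}\), with \(R_{j,n}\) sampled exactly as a centered Gaussian with variance \(q_n(1-e^{-2\lambda _n\Delta t})/(2\lambda _n)\). The function name `exponential_euler_path`, the callable `f_hat` (returning the coefficients of the reaction term), and the parameter values in the usage lines are illustrative assumptions and are not taken from the paper.

```python
import numpy as np


def exponential_euler_path(v0, lam, q, f_hat, T, J, rng):
    """Return the final-time spectral coefficients after J exponential Euler steps.

    v0    : initial coefficients, shape (N,)
    lam   : positive eigenvalues lambda_n of -A, shape (N,)
    q     : non-negative noise weights q_n, shape (N,)
    f_hat : callable mapping coefficients of v to coefficients of f(v) (assumed given)
    """
    dt = T / J
    decay = np.exp(-lam * dt)                      # e^{-lambda_n dt}
    drift_weight = (1.0 - decay) / lam             # (1 - e^{-lambda_n dt}) / lambda_n
    # Exact standard deviation of the stochastic convolution over one step.
    noise_std = np.sqrt(q * (1.0 - decay**2) / (2.0 * lam))
    v = np.array(v0, dtype=float)
    for _ in range(J):
        r = noise_std * rng.standard_normal(v.shape)
        v = decay * v + drift_weight * f_hat(v) + r
    return v


# Illustrative usage with hypothetical parameters: lambda_n = (pi n)^2, q_n = n^{-1.1}.
n = np.arange(1, 17)
v_final = exponential_euler_path(
    v0=np.zeros(n.size), lam=(np.pi * n) ** 2, q=n ** -1.1,
    f_hat=lambda v: -v, T=1.0, J=32, rng=np.random.default_rng(0),
)
```

Each step costs \(\mathcal {O}(N)\) operations plus one evaluation of `f_hat`, in line with the cost accounting used in the text.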
The performance of a numerical method will be measured by the computational cost required to reach a mean squared error (MSE) \(\mathcal {O}(\epsilon ^2)\). Computational cost refers to the number of computational operations, where we count each addition, subtraction, multiplication, division, and each draw of a Gaussian random variable as one computational operation. It follows from this definition that if \(f =0\) in the SPDE (2.1), then no evaluation of the reaction term is needed and the computational cost of computing the final-time solution \(V_J^{N,J}\) is \(\mathcal {O}(JN)\), as each time iteration of (2.5) consists of \(\mathcal {O}(N)\) computational operations. When \(f\ne 0\), however, the cost becomes a more complicated expression in general, and we make the following assumption to simplify matters:
Assumption 1
(Cost of evaluating \(f_N\)) For any \(N \in \mathbb {N}\) and \(V^N \in H^N\), the cost of evaluating \(f_N(V^N)\) is \(\mathcal {O}(N\log _2(N))\).
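When Assumption 1 holds, each evaluation of \(f_N(V^{N,J}_j)\) costs \(\mathcal {O}(N \log _2(N))\), and over the J time steps this accumulates to
\[ \textrm{Cost}\big (V^{N,J}_J\big ) = \mathcal {O}\big (J N \log _2(N)\big ) \tag{2.8} \]
for the final-time solution.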
Remark 1
In the numerical Scheme (2.5) used in Proposition 1 and in Assumption 1 it is tacitly assumed that the nonlinear reaction term \(f_{N}(V^{N,J}_j)\) can be evaluated exactly for any \(N \in \mathbb {N}\) and \(V^{N,J} \in H^N\). For nonlinear reaction terms f, this may however not be possible in practice. In computations, we will employ the fast Fourier transform (FFT) to approximate \(f_N(V^{N,J}_j)\) on a uniform mesh with N degrees of freedom in space for each iteration of (2.5), where we refer to [8, Section 6.3.1] and [35] for further details on this procedure. The approximation of \(f_N\) by FFT may introduce so-called aliasing errors in the numerical solution, cf. [31, page 334]. Aliasing errors are not covered in the mathematical analysis of this paper, but we will include the cost of using FFT in the computational cost of all numerical methods studied.
Remark 2
Disregarding aliasing errors, Assumption 1 holds when computing \(f_N(V^N)\) by FFT for Nemytskii-operators \(f(U)(x) = g(U(x))\) where the mapping \(g: \mathbb {R}\rightarrow \mathbb {R}\) additionally satisfies that one evaluation costs \(\mathcal {O}(1)\). Then
where the right-hand side costs \(\mathcal {O}(N \log _2(N))\) to evaluate.
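The FFT-based evaluation described in Remarks 1 and 2 can be sketched as follows. This is a minimal illustration under assumptions that are not taken from the paper: a periodic domain with the complex Fourier basis (the paper's basis \((e_n)\) may differ, e.g. a sine basis), coefficients stored in NumPy's FFT ordering, and a real-valued field. The helper name `nemytskii_coefficients` is hypothetical. The coefficients are transformed to grid values, \(g\) is applied pointwise, and the result is transformed back, at a total cost of \(\mathcal {O}(N\log _2(N))\).

```python
import numpy as np


def nemytskii_coefficients(coeffs, g):
    """Approximate the Fourier coefficients of g(u) from those of u.

    coeffs : complex Fourier coefficients c_k of u in NumPy FFT ordering,
             so that u(x_m) = sum_k c_k exp(2 pi i k m / n) on the uniform grid.
    g      : vectorized real function applied pointwise to the grid values.
    """
    n = coeffs.size
    u_grid = np.fft.ifft(coeffs) * n     # undo NumPy's 1/n normalization of ifft
    g_grid = g(u_grid.real)              # assumes u is real-valued on the grid
    return np.fft.fft(g_grid) / n        # back to coefficient space


# Example: u(x) = cos(2 pi x), g(u) = u^2 = 1/2 + cos(4 pi x)/2.
c = np.zeros(8, dtype=complex)
c[1] = c[-1] = 0.5
c_sq = nemytskii_coefficients(c, lambda u: u ** 2)
```

In this example all modes of \(g(u)\) are resolved on the grid, so no aliasing occurs; for higher-degree nonlinearities or coarser grids, unresolved modes would fold back onto resolved ones, which is the aliasing error mentioned in Remark 1.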
2.3.3 The Milstein Method
The Milstein method has been extended from SDE to different forms of parabolic SPDE with multiplicative noise in [4, 31]. We will here consider the version developed in [31], since its scheme is easy to express in our problem setting, and it is also easy to extend to an MLMC method. Using the previously introduced discretization parameters in space and time and recalling that the operators A and Q share the same eigenspace, the Milstein Scheme [31, equation (28)] takes the form
for \(j \in \llbracket 0,J-1\rrbracket \). On the component level, the scheme is given by
for \(n \in \llbracket 1,N\rrbracket \) and \(j \in \llbracket 0,J-1\rrbracket \).
We next present strong convergence rates for the Milstein scheme restricted to the additive-noise setting. For extensions to various multiplicative-noise settings, see [4, 31].
Proposition 2
(Jentzen and Röckner [31]) Let Assumption 6 in Appendix 1 be fulfilled for some values of \(\phi \in (1/2,1)\), \(\kappa \in [0,\phi )\) and \(\theta \in [\max (\kappa ,\phi -1/2), \phi )\), where \(\phi \) is the noise parameter introduced in Assumption 3. Then it holds that
where \(U_t\) denotes the mild solution to the SPDE (2.1) and \((V^{ N,J}_j)_{j=0}^{ J}\) denotes the Milstein approximation to the mild solution, cf. (2.9).
Proof
(Connecting the result to the literature) Remark 7 in Appendix 1 associates our parameters \((\phi ,\kappa ,\theta )\) with corresponding ones in [31, Assumptions 1-4]. In our additive-noise setting with the operators A and Q having the same eigenspace, Proposition 2 follows from [31, Theorem 1].
Even when disregarding the differences in the regularity assumptions, a comparison of the convergence rates for the exponential Euler and Milstein method is not straightforward since the rates for exponential Euler only depend on the single parameter \(\phi \), while the rates of Milstein depend on two additional parameters, \(\kappa \) and \(\theta \). To simplify the comparison we impose additional constraints on the relationship between the parameters \(\phi \), \(\kappa \) and \(\theta \):
Corollary 1
For some value of \(\phi \in (1/2,1)\), let Assumption 6 in Appendix 1 be fulfilled for some \(\kappa \in [0,\phi /2)\) and all \(\theta \in [\max (\kappa ,\phi -1/2), \phi )\). Then for any sufficiently small fixed \(\delta >0\), it holds that
Proof
For any sufficiently small \(\delta >0\), Assumption 6 holds for some \(\kappa < \phi /2 -\delta /4\) and \(\theta _\delta :=\phi -\delta /2\). Noting that
the result follows from Proposition 2.
For a fixed value of \(\phi \in (1/2,1)\), the additional constraints imposed on \(\kappa \) and \(\theta \) in Corollary 1 are likely to present the Milstein method in a good light, as they produce the highest possible convergence rates attainable from Proposition 2. Comparing the convergence of exponential Euler in Proposition 1 with Milstein in Corollary 1, the methods have essentially the same rate in space, but exponential Euler has a higher rate in time. Note further that the rates only apply to Milstein when \(\phi >1/2\), while they apply to the exponential Euler method for any \(\phi \in (0,1)\). But one should also keep in mind that the Milstein method applies to a wider range of reaction terms f than exponential Euler, since Assumption 6 is more relaxed than Assumption 4. When comparable, the lower convergence rate for Milstein leads to a poorer performance for the Milstein MLMC method than the exponential Euler MLMC method in low-regularity settings, when \(\phi <3/4\), cf. Theorems 2 and 3. See also Sect. 4 for numerical evidence that exponential Euler outperforms Milstein when \(\phi \approx 1/2\).
2.4 The Multilevel Monte Carlo Method
The expectation of an H-valued random variable U is often approximated by the standard Monte Carlo estimator
\[ E_M[U] := \frac{1}{M} \sum _{m=1}^{M} U^{(m)}, \]
where the samples \(U^{(1)}, U^{(2)},\ldots , U^{(M)} \sim \mathbb {P}_U\) are independently drawn random variables and \(E_M[U]\) consequently denotes the sample average estimator using M i.i.d. draws of U. When it is computationally costly to draw samples of U, variance-reduction techniques may improve the efficiency through reducing the statistical error of the estimator. The multilevel Monte Carlo (MLMC) method is an extension of standard Monte Carlo that draws pairwise coupled random variables \(\{(U^{\ell -1,C}, U^{\ell ,F})\}_{\ell =0}^L\), where \(U^{\ell -1,C}\) denotes the coarse random variable on resolution level \(\ell \), and \(U^{\ell ,F}\) the fine random variable on level \(\ell \). Pairwise coupling of \((U^{\ell -1,C}, U^{\ell ,F})(\omega )\) means that \(U^{\ell -1,C}(\omega )\) and \(U^{\ell ,F}(\omega )\) are generated using the same driving noise \(W_t(\omega )\) (to be elaborated on in the next section). We further impose that
\[ U^{-1,C} := 0 \quad \text {and} \quad \mathbb {E}\big [ U^{\ell -1,C} \big ] = \mathbb {E}\big [ U^{\ell -1,F} \big ] \quad \text {for } \ell = 1,\ldots ,L, \tag{2.10} \]
so that the weak approximation on resolution level \(L \in \mathbb {N}\) can be represented as a telescoping sum of expectations:
\[ \mathbb {E}\big [ U^{L,F} \big ] = \sum _{\ell =0}^{L} \mathbb {E}\big [ U^{\ell ,F} - U^{\ell -1,C} \big ]. \tag{2.11} \]
By approximating each of the \(L+1\) expectations in the telescoping sum by a sample average, we obtain the MLMC estimator:
\[ \sum _{\ell =0}^{L} E_{M_\ell }\big [ U^{\ell ,F} - U^{\ell -1,C} \big ] = \sum _{\ell =0}^{L} \frac{1}{M_\ell } \sum _{m=1}^{M_\ell } \Big ( U^{\ell ,F,(m)} - U^{\ell -1,C,(m)} \Big ). \tag{2.12} \]
Here, \((U^{\ell -1,C,(m)}, U^{\ell ,F,(m)})\) denotes the \(\mathbb {P}_{(U^{\ell -1,C},U^{\ell ,F})}\)-distributed m-th sample on level \(\ell \), and all samples on all resolution levels are independent, meaning that all random variables in the sequence \(\{(U^{\ell -1,C,(m)}, U^{\ell ,F,(m)})\}_{\ell ,m}\) are independent. A near-optimal calibration of the parameters \(L \in \mathbb {N}\) and \((M_{\ell })_{\ell =0}^{ L}\subset \mathbb {N}\) is obtained through minimizing the mean squared error for a given computational cost, cf. [14] and Theorem 1. The MLMC estimator achieves variance reduction over standard Monte Carlo when the coupled random variables \(U^{\ell -1,C}\) and \(U^{\ell ,F}\) are sufficiently correlated, cf. Condition (ii) in Theorem 1 below.
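As a minimal sketch of how the MLMC estimator is assembled, the Python function below sums, over levels, the sample averages of the coupled differences \(U^{\ell ,F,(m)} - U^{\ell -1,C,(m)}\). The interface is an assumption made for illustration: `sample_pair(level, rng)` is a hypothetical user-supplied sampler returning one coupled pair (coarse, fine) as arrays of a common shape (for instance values of a quantity of interest, or zero-padded coefficient vectors), with the coarse member equal to zero on level 0.

```python
import numpy as np


def mlmc_estimate(sample_pair, num_samples, rng):
    """Multilevel Monte Carlo estimate from independent coupled samples.

    sample_pair : callable (level, rng) -> (coarse, fine), arrays of equal shape;
                  the coarse array must be zero on level 0 (assumed convention).
    num_samples : sequence (M_0, ..., M_L) of sample sizes per level.
    """
    total = None
    for level, m_level in enumerate(num_samples):
        level_sum = None
        for _ in range(m_level):
            coarse, fine = sample_pair(level, rng)
            diff = np.asarray(fine, dtype=float) - np.asarray(coarse, dtype=float)
            level_sum = diff if level_sum is None else level_sum + diff
        level_mean = level_sum / m_level          # sample average on this level
        total = level_mean if total is None else total + level_mean
    return total


# Toy coupled sampler (hypothetical): both members share the same noise draw z,
# so their difference has small variance on fine levels.
def toy_pair(level, rng):
    z = rng.standard_normal(1)
    fine = (1.0 - 2.0 ** -(level + 1)) + 2.0 ** -level * z
    coarse = np.zeros(1) if level == 0 else (1.0 - 2.0 ** -level) + 2.0 ** -(level - 1) * z
    return coarse, fine


estimate = mlmc_estimate(toy_pair, num_samples=[400, 100, 25], rng=np.random.default_rng(1))
```

Samples on different levels are drawn independently here, matching the independence requirement stated above; only the two members of each pair share randomness.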
One way to assess the performance of Monte Carlo methods is through the MSE. The following theorem describes the cost versus error of the MLMC methodology for H-valued random variables:
Theorem 1
Assume that the telescoping-sum properties (2.10) hold and that there exist positive constants \(\alpha ,\beta ,\gamma \) such that \(\alpha \ge \frac{\min (\beta ,\gamma )}{2}\) and
(i) \(\left\| \mathbb {E}\left[ U^{\ell ,F}-U\right] \right\| _{ H} \lesssim \,2^{-\alpha \,\ell }\),
(ii) \(V_\ell :=\mathbb {E}\left[ \left\| U^{\ell ,F}-U^{\ell -1,C} \right\| ^2_{ H}\right] \lesssim \,2^{-\beta \,\ell }\),
(iii) \(C_\ell := \textrm{Cost}(U^{\ell -1,C}, U^{\ell ,F}) \lesssim \, 2^{\gamma \,\ell }\).
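Then for any \(\epsilon \in (0,1)\) and \(L:= \lceil \log _2(1/\epsilon )/\alpha \rceil \), there exists a sequence \((M_{\ell })_{\ell =0}^{ L}\subset \mathbb {N}\) such that
\[ \mathbb {E}\bigg [ \Big \Vert \sum _{\ell =0}^{L} E_{M_\ell }\big [ U^{\ell ,F} - U^{\ell -1,C} \big ] - \mathbb {E}[U] \Big \Vert _{H}^2 \bigg ] \lesssim \epsilon ^2 \]
and
\[ \sum _{\ell =0}^{L} M_\ell \, C_\ell \lesssim \begin{cases} \epsilon ^{-2}, & \text {if } \beta > \gamma , \\ \epsilon ^{-2} \big ( \log _2(1/\epsilon ) \big )^2, & \text {if } \beta = \gamma , \\ \epsilon ^{-2 - (\gamma -\beta )/\alpha }, & \text {if } \beta < \gamma . \end{cases} \tag{2.13} \]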
The proof of this result is a straightforward extension of the original theorem presented by Giles [14] for weak approximations of stochastic differential equations.
Proof
Let
where \(V_0:= \mathbb {E}\left[ \Vert U^{0}\Vert _{ H}^2\right] \). By the telescoping-sum property
the representation (2.12) and the independence of the samples \(\{(U^{\ell -1,C,(m)}, U^{\ell ,F,(m)})\}_{\ell ,m}\), we obtain that
By assumptions (ii) and (iii), we obtain that
For the last inequality, the assumption \(\alpha \ge \min (\beta , \gamma )/2\) implies that \(\gamma /\alpha \le 2\) when \(\beta \ge \gamma \) and \(\beta /\alpha \le 2\) when \(\beta \le \gamma \) (so that \(2 + (\gamma -\beta )/\alpha \ge \gamma /\alpha \)), and inequality (2.13) follows.
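For orientation, the calibration of \(L\) and \((M_\ell )\) can be sketched in a few lines of Python. The number of levels follows the choice \(L = \lceil \log _2(1/\epsilon )/\alpha \rceil \) from Theorem 1. For the sample sizes we use the allocation \(M_\ell = \lceil \epsilon ^{-2} \sqrt{V_\ell /C_\ell } \sum _{k} \sqrt{V_k C_k} \rceil \), which is the standard cost-minimizing choice in the MLMC literature going back to [14]; we assume, but cannot confirm from the text above, that the proof uses this choice or a close variant. The function names are hypothetical.

```python
import math


def mlmc_num_levels(eps, alpha):
    """Finest level L = ceil(log2(1/eps) / alpha), as in Theorem 1."""
    return math.ceil(math.log2(1.0 / eps) / alpha)


def mlmc_sample_sizes(variances, costs, eps):
    """Standard allocation M_l = ceil(eps^-2 * sqrt(V_l / C_l) * sum_k sqrt(V_k * C_k)).

    variances : per-level values V_0, ..., V_L (positive)
    costs     : per-level values C_0, ..., C_L (positive)
    """
    weight = sum(math.sqrt(v * c) for v, c in zip(variances, costs))
    return [
        max(1, math.ceil(eps ** -2 * math.sqrt(v / c) * weight))
        for v, c in zip(variances, costs)
    ]


# Illustrative inputs: V_l = 2^{-2l} (beta = 2) and C_l = 2^{l} (gamma = 1).
eps = 0.1
L = mlmc_num_levels(eps, alpha=1.0)
V = [2.0 ** (-2 * l) for l in range(L + 1)]
C = [2.0 ** l for l in range(L + 1)]
M = mlmc_sample_sizes(V, C, eps)
statistical_error = sum(v / m for v, m in zip(V, M))   # bounded by eps^2 by construction
total_cost = sum(m * c for m, c in zip(M, C))
```

With this allocation the statistical error \(\sum _\ell V_\ell /M_\ell \) is at most \(\epsilon ^2\), while the total cost \(\sum _\ell M_\ell C_\ell \) is governed by \(\big (\sum _\ell \sqrt{V_\ell C_\ell }\big )^2\), which is what produces the three regimes \(\beta >\gamma \), \(\beta =\gamma \) and \(\beta <\gamma \).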
Remark 3
The theorem also applies in settings where one replaces \(V_\ell \) in Theorem 1 (ii) by \({\widetilde{V}}_\ell :={\mathbb E}\left[ \left\| U^{\ell ,F}-U^{\ell -1,C} -\mathbb {E}\left[ U^{\ell ,F}-U^{\ell -1,C}\right] \right\| ^2_{ H}\right] \), and for some problems this may improve the rate \(\beta >0\). Practically, however, there may be little to gain by replacing \(V_\ell \) by \(\widetilde{V}_\ell \) as weak approximations of \(U^{\ell ,F} - U^{\ell -1,C}\) can be much more intractable than strong approximations, cf. [34].
3 Multilevel Monte Carlo Methods for SPDE
In this section we describe two MLMC methods that are based on extending the two numerical schemes in Sect. 2.3 to the MLMC setting. To better illustrate the importance of strong coupling and the loss of accuracy due to damping, we also propose a third MLMC method which is an extension of a modified form of the exponential Euler method that only is exponential in the drift-term. We will employ the following notation for the multilevel hierarchy of discretized solutions: On level \(\ell \ge 0\), let \(N_{\ell }\eqsim N_0\,2^{\nu \ell }\) for given \(N_0\in \mathbb {N}\) and \(\nu >0\) denote a sequence of spatial resolutions, and let \(J_\ell :=J_0\,2^\ell \) for a given \(J_0 \in \mathbb {N}\) denote a sequence of time resolutions. In a notation that suppresses details on the pairwise coupling, we let \(U^{\ell ,F}_j:= V^{ N_\ell , J_\ell }_j\) denote the fine numerical solution of a given spectral Galerkin method on level \(\ell \) at time \(t_j^{\ell }:=j\,\Delta t_\ell \) for \(j\in \llbracket 0,J_\ell \rrbracket \), computed on the subspace \(H^{N_\ell }\) using the time step \(\Delta t_{\ell }:=\frac{T}{J_\ell }\). And \(U^{\ell -1,C}_j:= V^{ N_{\ell -1}, J_{\ell -1}}_j\) denotes the coupled coarse numerical solution on level \(\ell \) at time \(t_j^{\ell -1}:=j\,\Delta t_{\ell -1}\) for \(j\in \llbracket 0,J_{\ell -1}\rrbracket \) computed on the subspace \(H^{N_{\ell -1}}\) with time step \(\Delta t_{\ell -1}:=\frac{T}{J_{\ell -1}}\).
To discuss the quality of a pairwise coupling, let us first introduce some terminology. When a coupling satisfies
for all \(\ell \ge 0\), we say that the coupling is weakly correct, and when it additionally satisfies
for all \(\ell \ge 0\), we say that it is a pathwise correct coupling. From the construction of the multilevel estimator in Sect. 2, we see that weakly correct coupling is needed to obtain the crucial telescoping sum in the MLMC estimator, cf. (2.10) and (2.11), and that weakly correct coupling thus ensures consistency for the MLMC estimator. Pathwise correct coupling is on the other hand not necessary to obtain consistency, and there are many examples of performant MLMC methods that only are weakly correct, cf. [17, 25]. Pathwise correct coupling is however often an easy way to ensure the needed weakly correct coupling.
To achieve high performance, the pairwise coupling must be weakly correct and produce a high convergence rate \(\beta \) for the strong error, cf. Theorem 1. We will refer to a coupling that achieves a high rate \(\beta \) in comparison to alternative approaches as a strong coupling. To be more precise for the particular SPDE considered in this work, we introduce the notion of strong diffusion coupling:
Definition 1
(Strong diffusion coupling (SDC)) Consider a weakly correct coupling sequence \(\{(U^{\ell -1,C}, U^{\ell ,F})\}_{\ell \ge 0}\) of spectral-Galerkin numerical solutions of the SPDE (2.1) with no reaction term, \(f=0\) (the stochastic heat equation). Recall further that a coupled pair of solutions is defined on time meshes of different resolutions:
and
with \(\Delta t_{\ell -1} = 2 \Delta t_\ell \). We say that the coupling is a strong diffusion coupling if it holds for all \(\ell \ge 0\) that
For the stochastic heat equation, an SDC is thus an exact coupling of \(U^{\ell -1,C}\) to \(U^{\ell ,F}\) on the subspace \(H^{N_{\ell -1}}\). This is of course the strongest possible coupling one can achieve (for the given problem), and we will see later that the exponential Euler MLMC method indeed is the only among the three we consider whose coupling is SDC. Although our theory and numerical experiments both indicate a connection between SDC and strong couplings more generally when f is non-zero-valued, it is not clear how far this extends. To the best of our knowledge, it is an open problem to describe coupling strategies for H-valued stochastic processes that are weakly correct and maximize the convergence rate of the strong error \(\beta \).
We next extend the exponential Euler method and the Milstein method to the MLMC setting.
3.1 Exponential Euler MLMC Method
This MLMC method was first introduced and analyzed for the linear reaction-term setting in [8, Section 5.4.1]. Since then the method has been applied to the SPDE (2.1) with linear reaction term for problems arising in Bayesian computation. These include stochastic filtering [28] and Markov chain Monte Carlo [27], with an extension to multi-index Monte Carlo.
We consider the pairwisely coupled solutions \((U^{\ell -1,C},U^{\ell ,F})\) that both are solved by the numerical Scheme (2.6) with the respective initial conditions
For the fine solution, the n-th component of two iterations of the Scheme (2.6) at time \(t^{\ell }_{2j} = 2j \Delta t_{\ell }\) takes the form
and
for \((j,n) \in \llbracket 0, J_{\ell -1}-1\rrbracket \times \llbracket 1,N_{\ell }\rrbracket \) and with
for \((k,n) \in \llbracket 0,J_{\ell }-1\rrbracket \times \llbracket 1,N_\ell \rrbracket \).
The coupled coarse solution uses the time step \(\Delta t_{\ell -1} = 2\,\Delta t_{\ell }\), and one iteration at time \(t_{j}^{\ell -1} = j\Delta t_{\ell -1} = 2j\Delta t_{\ell } = t_{2j}^{\ell }\) takes the form
where
The pairwise coupling \(U^{\ell -1,C}_{j+1,n} \leftrightarrow U^{\ell ,F}_{2j+2,n}\) is obtained through coupling the driving noise \(R^{\ell -1,C}_{j,n} \leftrightarrow (R_{2j,n}^{\ell ,F},R_{2j+1,n}^{\ell ,F})\). By (3.3), we have that
which yields
To summarize, given the coupling \(U^{\ell -1,C}_{j,n} \leftrightarrow U^{\ell ,F}_{2j,n}\) at some time \(t_j^{\ell -1}\), the coupling at the next time is obtained by generating the fine-solution noise \((R_{2j,n}^{\ell ,F},R_{2j+1,n}^{\ell ,F})\) and coupling it to the coarse-solution noise by formula (3.5). The next-time solution \(U^{\ell -1,C}_{j+1,n}\) is computed by (3.4) with \(R_{j,n}^{\ell -1,C}\) as input, and \(U^{\ell ,F}_{2j+2,n}\) is computed by (3.1) and (3.2) with \((R_{2j,n}^{\ell ,F},R_{2j+1,n}^{\ell ,F})\) as input.
Remark 4
We note from the above that
and since \(U^{\ell -1,C}\) and \(U^{\ell -1,F}\) are solved using the same numerical scheme, the coupling is pathwise correct. Let us further note that if \(f=0\), then the linearity of the problem and (3.5) imply that the coupling is an SDC:
This can be verified by induction: assume (3.6) holds for some \(j\in \llbracket 0,J_{\ell -1}-1\rrbracket \) (it holds for \(j=0\) by definition). And using the numerical schemes for the respective methods with \(f=0\), we obtain that
Since the exponential Euler MLMC method is SDC, we expect it to perform very efficiently when \(f=0\), and Theorem 2 shows that the coupling is strong also for more general reaction terms.
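The SDC property for \(f=0\) can be checked numerically with a short Python sketch. It relies on the identity \(R^{\ell -1,C}_{j,n} = e^{-\lambda _n \Delta t_\ell } R^{\ell ,F}_{2j,n} + R^{\ell ,F}_{2j+1,n}\), which is our reading of the noise coupling (3.5) and follows from splitting the stochastic convolution over \([t^{\ell }_{2j}, t^{\ell }_{2j+2}]\) at the midpoint \(t^{\ell }_{2j+1}\). The function name, the choice of eigenvalues and noise weights, and the initial data in the usage lines are illustrative assumptions.

```python
import numpy as np


def coupled_exp_euler_heat(u0, lam, q, n_coarse, T, J_coarse, rng):
    """Coupled (coarse, fine) exponential Euler solutions of the case f = 0.

    u0       : initial coefficients on the fine subspace, shape (N_fine,)
    lam, q   : eigenvalues of -A and noise weights on the fine subspace
    n_coarse : dimension of the coarse subspace (n_coarse <= N_fine)
    J_coarse : number of coarse time steps; the fine solution takes 2 * J_coarse
    """
    u0 = np.asarray(u0, dtype=float)
    dt_f = T / (2 * J_coarse)
    decay_f = np.exp(-lam * dt_f)                              # e^{-lambda_n dt_fine}
    std_f = np.sqrt(q * (1.0 - decay_f ** 2) / (2.0 * lam))    # exact fine-noise std
    decay_c = decay_f[:n_coarse] ** 2                          # e^{-lambda_n dt_coarse}
    u_f = u0.copy()
    u_c = u0[:n_coarse].copy()
    for _ in range(J_coarse):
        r0 = std_f * rng.standard_normal(lam.size)             # fine noise, first half step
        r1 = std_f * rng.standard_normal(lam.size)             # fine noise, second half step
        u_f = decay_f * (decay_f * u_f + r0) + r1              # two fine steps
        r_c = decay_f[:n_coarse] * r0[:n_coarse] + r1[:n_coarse]  # coupled coarse noise
        u_c = decay_c * u_c + r_c                              # one coarse step
    return u_c, u_f


n = np.arange(1, 9)
u_c, u_f = coupled_exp_euler_heat(
    u0=1.0 / n, lam=(np.pi * n) ** 2, q=n ** -2.0,
    n_coarse=4, T=1.0, J_coarse=4, rng=np.random.default_rng(2),
)
max_gap = np.max(np.abs(u_c - u_f[:4]))   # agreement up to floating-point rounding
```

The coupled coarse noise has variance \(q_n(1-e^{-2\lambda _n\Delta t_\ell })(1+e^{-2\lambda _n\Delta t_\ell })/(2\lambda _n) = q_n(1-e^{-2\lambda _n\Delta t_{\ell -1}})/(2\lambda _n)\), so the coarse solution has the same law as an independently simulated one, while coinciding pathwise with the fine solution on the first \(N_{\ell -1}\) modes.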
For showcasing the importance of strong pairwise coupling, and as a transition between exponential Euler MLMC and Milstein MLMC, we next consider a slightly altered form of the exponential Euler method with explicit integration of the Itô integral.
3.2 Drift-exponential Euler MLMC Method
We consider the drift-exponential Euler scheme
This is a mix of exponential Euler and Milstein, as the approximation of the drift terms agree with the exponential Euler scheme and the approximation of the Itô integral agrees with the Milstein scheme.
When extending this scheme to an MLMC method, a similar argument as in Sect. 3.1 yields that two iterations of the fine solution in the coupled pair \((U^{\ell -1,C}, U^{\ell ,F})\) take the form
and
for \((j,n) \in \llbracket 0, J_{\ell -1}-1\rrbracket \times \llbracket 1,N_{\ell }\rrbracket \) and with
for \((k,n) \in \llbracket 0,J_{\ell }-1\rrbracket \times \llbracket 1,N_\ell \rrbracket \).
The coupled coarse solution takes the form
where
Recalling that \(t_{j}^{\ell -1} = j \Delta t_{\ell -1} = 2j \Delta t_\ell = t^{\ell }_{2j}\), we obtain the pairwise coupling of \(U^{\ell -1,C}_{j+1,n} \leftrightarrow U^{\ell ,F}_{2j+2,n}\) through coupling the driving noise:
Since \(\widetilde{R}_{j,n}^{\ell -1,C}(\omega ) = \widetilde{R}_{j,n}^{\ell -1,F}(\omega )\) and \(U^{\ell -1,C}\) and \(U^{\ell -1,F}\) are solved using the same numerical method, it follows that the coupling of the drift-exponential Euler MLMC method is also pathwise correct. However, it is not SDC, since when \(f=0\) we obtain by (3.7) and (3.8) that for \(n \in \llbracket 1,N_{\ell -1}\rrbracket \),
The term \( (1 - e^{- \lambda _n \Delta t_{\ell }} )\widetilde{R}_{1,n}^{\ell ,F}\) is an error in the coupling that is introduced by explicit integration of the Itô integral. This leads to an artificial smoothing of the numerical solution, as is illustrated by the numerical examples in Sect. 4.
3.3 Milstein MLMC Method
We consider the pairwise coupling of the coarse and fine Milstein solutions on level \(\ell \) with respective initial conditions
Two iterations of the fine solution take the form
and
for \((j,n) \in \llbracket 0,J_{\ell -1} -1\rrbracket \times \llbracket 1,N_\ell \rrbracket \) with
One iteration of the coarse solution takes the form
for \((j,n) \in \llbracket 0,J_{\ell -1} -1\rrbracket \times \llbracket 1,N_{\ell -1}\rrbracket \).
And we obtain the same coupling as for the drift-exponential Euler method:
By a similar argument as for the previous MLMC method, this is a pathwise correct coupling, but it is not SDC.
In summary, we have presented three different MLMC methods where only the coupling for the exponential Euler MLMC method is SDC. This particularly means that the exponential Euler MLMC method outperforms the other methods when \(f=0\), and later comparisons of the strong convergence rate \(\beta \) for the two methods in Theorems 2 and 3 and in the numerical experiments show that the outperformance is broader.
3.4 MLMC for SPDE
In this section, we present cost versus error results for the exponential Euler- and Milstein MLMC methods.
We recall that the computational cost of one simulation of a numerical method is defined by the computational effort needed, cf. (2.8), and that under Assumption 1, it holds for all three spectral Galerkin methods we consider that
and
We will consider weak approximations of Banach-space-valued quantities of interest (QoI) of the following form:
Definition 2
(Admissible QoI) Let K be a Banach space equipped with the norm \(\Vert \cdot \Vert _K\) and let \(\varphi : H \rightarrow K\) be a strongly measurable and uniformly Lipschitz continuous QoI. We say that such a QoI is admissible if the cost of evaluating the mapping satisfies that
We are ready to state the main result of this work.
Theorem 2
(Exponential Euler MLMC) Consider the SPDE (2.1) for a linear operator A with \(\lambda _n \eqsim n^2\) and let \(\varphi :H \rightarrow K\) be an admissible QoI, in the sense of Definition 2. If all assumptions in Appendix 1 hold for some \(\phi \in (0,1)\) and Assumption 1 holds, then the pathwise correctly coupled exponential Euler MLMC method with
satisfies
(i) \(\big \Vert \mathbb {E}\big [ {\varphi }(U^{\ell ,F}(T,\cdot )) - {\varphi }(U(T, \cdot )) \big ]\big \Vert _{{K}} \lesssim (\ell +1) 2^{-\ell }\).
(ii) \(V_\ell := \mathbb {E}\Big [\Vert {\varphi }(U^{\ell ,F}(T,\cdot )) - {\varphi }( U^{\ell -1,C}(T, \cdot ))\Vert ^2_{{K}} \Big ] \lesssim (\ell +1)^2 \; 2^{-2\ell }\).
(iii) \(C_\ell := \textrm{Cost}\big ({\varphi }(U^{\ell -1,C}(T)), {\varphi }(U^{\ell ,F}(T)) \big ) {\lesssim } (\ell +1) 2^{(1+1/(2\phi ))\ell }\).
And for any sufficiently small \(\epsilon >0\) and \(L:= \lceil \log _2\big ( \log _2(1/\epsilon )/\epsilon \big ) \rceil \), there exists a sequence \(\{M_{\ell }(\epsilon )\}_{\ell =0}^L \subset \mathbb {N}\) such that
and
Proof
Let us show that the K-valued random variables \(\varphi (U^{\ell ,F}(T,\cdot ))\) and \(\varphi (U^{\ell ,C}(T,\cdot ))\) are well-defined. The Lipschitz continuity of the mapping \(\varphi \) implies that
where \(C_{\varphi }>0\) denotes the Lipschitz constant for \(\varphi \). It follows that \(\varphi (U^{\ell ,F}(T,\cdot )) \in L^2(\varOmega , K)\) for all \(\ell \ge 0\), and we similarly also have that \(\varphi (U^{\ell ,C}(T,\cdot )) \in L^2(\varOmega , K)\).
We note that the numerical resolution sequences are set according to (3.12) to balance the error from space- and time-discretization in Proposition 1. A pair of correctly coupled solutions \(U^{\ell ,F, (m)}\) and \(U^{\ell -1,C,(m)}\) can be viewed as exponential Euler solutions using the same driving \(Q\)-Wiener process \(W^{(m)}\) on levels \(\ell \) and \(\ell -1\), respectively. Consequently,
where we used Lipschitz continuity for the first inequality. This verifies rate (ii). Since \(\{\varphi (U^{\ell ,F}(T,\cdot ))\}_\ell \) is a Cauchy sequence in \(L^2(\varOmega ,K)\) with limit \(\varphi (U(T,\cdot ))\), we have that
In the last equality we used that the coupling is pathwise correct: \(U^{j,F}(T,\cdot ) = U^{j,C}(T,\cdot )\), cf. Remark 4. Rate (i) follows from
Rate (iii) follows by (3.11) and Definition 2. Introducing the following number-of-samples-per-level sequence
and noting that \(L = \lceil \log _2(\log _2(1/\epsilon )/\epsilon ) \rceil \eqsim \log _2(1/\epsilon )\), we obtain (3.13) by a similar argument as in the proof of Theorem 1:
For the computational cost, we have that
and (3.14) follows from using \(V_jC_j \lesssim (j+1)^3\, 2^{\,j\, (1/(2\phi )-1)}\) when bounding the squared sum from above.
Remark 5
A general framework for MLMC methods for reaction-diffusion type SPDE in the setting of \(\phi \ge 1/2\) and for numerical methods with a strong convergence rate 1/2 was first developed in [5]. When \(\phi =1/2\) and \(\gamma = 2\), the MSE \(\mathcal {O}(\epsilon ^{2-\delta })\) was achieved at the computational cost \(\mathcal {O}(\epsilon ^{-3})\) for that method in [5, Theorem 4.4], compared to a cost \(\mathcal {O}(\epsilon ^{-2})\) for our exponential Euler MLMC method. This is however not a fair performance comparison, since [5] was developed for more general SPDE with multiplicative noise and for which the operators A and Q need not share an eigenbasis, while our method is tailored to the additive-noise setting with A and Q sharing an eigenbasis, cf. Appendix 1.
We state a similar cost-versus-error result for the Milstein MLMC method with pathwise correct pairwise coupling.
Theorem 3
(Milstein MLMC) Consider the SPDE (2.1) for a linear operator A with \(\lambda _n \eqsim n^2\), let the assumptions in Corollary 1 hold for some \(\phi \in (1/2,1)\) and let Assumption 1 hold. Let \(\varphi :H \rightarrow K\) be an admissible QoI, in the sense of Definition 2. Then the pathwise correctly coupled Milstein MLMC method with
satisfies for any fixed \(\delta >0\) that
(i) \(\big \Vert \mathbb {E}\big [ {\varphi }(U^{\ell ,F}(T,\cdot )) - {\varphi }(U(T, \cdot )) \big ]\big \Vert _K \lesssim 2^{-(\phi -\delta /2) \ell }\).
(ii) \(V_\ell := \mathbb {E}\Big [\Vert {\varphi }(U^{\ell ,F}(T,\cdot )) - {\varphi }(U^{\ell -1,C}(T, \cdot ))\Vert ^2_K \Big ] \lesssim 2^{-(2\phi -\delta ) \ell }\).
(iii) \(C_\ell := \textrm{Cost}\big ( {\varphi }(U^{\ell -1,C}(T)), {\varphi }(U^{\ell ,F}(T)) \big ) \lesssim (\ell +1) 2^{3\ell /2}\).
And for any sufficiently small fixed \(\delta >0\) and any sufficiently small \(\epsilon >0\), there exist an \(L(\epsilon ) \in \mathbb {N}\) and a sequence \(\{M_{\ell }(\epsilon )\}_{\ell =0}^L \subset \mathbb {N}\) such that
at the cost
Proof
We set the numerical resolution sequences by (3.16) to balance the error from space- and time-discretization in Corollary 1, and, since the Milstein MLMC method is pathwise correctly coupled, the rates (i), (ii) and (iii) can be verified as in the proof of Theorem 2.
To prove the error and cost results, we relate the rates in (i), (ii) and (iii) to those in Theorem 1: for any \(\delta >0\), it holds that
For the case \(\phi \in (3/4,1)\) it holds for sufficiently small \(\delta >0\) that \(\beta > \gamma \), and the results (3.17) and (3.18) follow from Theorem 1.
For the case \(\phi \in (1/2,3/4]\), we again apply Theorem 1 to our rates \((\alpha ,\beta ,\gamma )\) in (3.19) to conclude that (3.17) is fulfilled at the cost
and taking \(\delta >0\) sufficiently small, it holds that
Comparing Theorems 2 and 3, we expect exponential Euler MLMC to asymptotically outperform Milstein MLMC when the colored noise has low regularity, that is, when \(\phi < 3/4\).
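The threshold \(\phi = 3/4\) can be read off from the rates with a short heuristic calculation. The comparison below assumes that Theorem 1 has the standard form of the MLMC complexity theorem, in which the cost is governed by the sign of \(\beta - \gamma \), and it ignores logarithmic factors and the small parameter \(\delta \). It is a reading aid for the rates in Theorems 2 and 3, not a restatement of the cost bounds (3.14) and (3.18).

```latex
\begin{align*}
\text{Exponential Euler MLMC:}\quad
  & \alpha = 1,\quad \beta = 2,\quad \gamma = 1 + \tfrac{1}{2\phi},
  && \beta \ge \gamma \iff \phi \ge \tfrac{1}{2}, \\
\text{Milstein MLMC:}\quad
  & \alpha = \phi,\quad \beta = 2\phi,\quad \gamma = \tfrac{3}{2},
  && \beta > \gamma \iff \phi > \tfrac{3}{4}.
\end{align*}
% When beta < gamma, the standard theorem gives cost of order eps^{-2-(gamma-beta)/alpha}.
% For Milstein MLMC with phi in (1/2, 3/4] this heuristic exponent is
\[
  2 + \frac{\gamma - \beta}{\alpha}
  = 2 + \frac{3/2 - 2\phi}{\phi}
  = \frac{3}{2\phi} \in [2, 3),
\]
% which exceeds the exponent 2 of exponential Euler MLMC whenever phi < 3/4.
```

Under these assumptions, the two methods have the same cost exponent for \(\phi \ge 3/4\), while for \(\phi \) close to 1/2 the Milstein exponent approaches 3.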
4 Numerical Examples
In this section, we numerically test the exponential Euler MLMC method against the drift-exponential- and Milstein MLMC methods. We study two reaction-diffusion SPDE, one with a linear reaction term and one with a trigonometric one. To showcase the superior performance of exponential Euler in settings with low-regularity colored noise, we consider one setting with \(\phi \approx 1/2\) (this is a low-regularity setting for the Milstein method) and we numerically confirm the theoretical result that the exponential Euler MLMC and the Milstein MLMC perform similarly when \(\phi \approx 3/4\), cf. Theorems 2 and 3.
For our numerical experiments we consider the general form of semilinear SPDE
with triangular-wave initial condition
Furthermore we specify our space \(H = L^2(0,1)\) with Fourier basis functions \(e_n(x) = \exp (i 2n\pi x)\) for \(n \in \mathbb {Z}\), and the final time is set to \(T=1/2\). We consider a linear operator \(A:D(A) \rightarrow H\) defined as
with eigenvalues \((\lambda _n)_{n \in \mathbb {Z}}\), given as
We note that the triangular-wave initial condition satisfies the following regularity condition: \(u_0 \in H_{3/4-\delta }\) for any \(\delta >0\).
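The regularity index 3/4 can be motivated by a short calculation. The sketch below assumes that the \(H_r\)-norm is given by \(\Vert (-A)^r v\Vert _H\) expressed in the eigenbasis, that \(\lambda _n \eqsim n^2\) as in Theorems 2 and 3, and that the nonvanishing Fourier coefficients \(\hat{u}_n\) of a triangular wave decay like \(|n|^{-2}\). These are standard facts but are stated here as assumptions, since the displayed definitions are not reproduced in this section.

```latex
\[
  \|u_0\|_{H_r}^2
  = \sum_{n} \lambda_n^{2r}\, |\hat{u}_n|^2
  \eqsim \sum_{n:\ \hat{u}_n \neq 0} |n|^{4r}\, |n|^{-4}
  = \sum_{n:\ \hat{u}_n \neq 0} |n|^{4r-4},
\]
\[
  \sum_{n \neq 0} |n|^{4r-4} < \infty
  \iff 4r - 4 < -1
  \iff r < \tfrac{3}{4}.
\]
```

Hence, under these assumptions, \(u_0 \in H_r\) exactly when \(r < 3/4\), which agrees with the statement \(u_0 \in H_{3/4-\delta }\) for any \(\delta >0\).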
For \(f:H \rightarrow H\), we consider the two different reaction terms which are presented in Table 1. Both belong to the class of Nemytskii operators, cf. [35].
The driving noise dW is a \(Q\)-Wiener process (A.2) with
for two different values of b: the low-regularity setting \(b=1/4\), and the smoother setting \(b=1/2\). In connection with Assumption 3, we note that
It consequently holds that \(\phi =(1/4+b)-\delta \) for any \(\delta >0\), and for simplicity, we will refer to the parameter values for \(\phi \) as \(\phi (b=1/4) = 1/2-\) and \(\phi (b=1/2) =3/4-\), respectively. When Theorems 2 and 3 apply, we expect exponential Euler MLMC to outperform Milstein MLMC when \(\phi <3/4\), and the methods to perform similarly when \(\phi > 3/4\).
Note however that some of our numerical studies are purely experimental, as neither of the theorems applies to all problem settings we consider. Theorem 2 only applies to the linear reaction term, because the trigonometric reaction term has no Fréchet derivative that belongs to L(H), and this violates Assumption 4. We do however believe the regularity assumptions in Proposition 1 can be relaxed so that it also applies to the trigonometric reaction term, but, to the best of our knowledge, it is an open problem to prove this.
For the Milstein method, on the other hand, Assumption 6 does hold whenever \(\phi > 1/2\) and \(\kappa > 1/4\), with Fréchet derivatives \(f'(\cdot ) = 4\pi (\cos (2\pi \cdot ) -\sin (2\pi \cdot ))\) and \(f''(\cdot ) = -8\pi ^2f(\cdot )\). (This can be verified using the definition of Fréchet derivatives and that \(L^\infty (0,1) \subset H_{\kappa }\).) But Theorem 3 only applies when \(\phi >1/2\).
4.1 Numerical Estimates of the Convergence Rate \(\beta \)
Numerical estimates of the root mean squared error (RMSE) convergence rates in time and space for all three methods are presented in Figs. 1 and 2. The RMSE in time is approximated by
where J is varied and \(N_* = 1024\) is fixed, and using \(M=10000\) independent samples of the random variable in the Monte Carlo estimator. For the exponential Euler method we observe the rate 1 and for the other methods, we observe the rate \(\phi (b) = 1/4+b\).
The RMSE in space is approximated by
where N is varied and \(J_* = 2^{18}\) is fixed, and using \(M=250\) independent samples. This error describes the RMSE convergence rate in N, which we observe to be \(2\phi = 1/2+2b\) for all methods.
Since the \(\beta \) in Theorem 1 represents the MSE, the numerical experiments indicate that \(\beta = 2\min (1, 2\phi ) = 2\) for exponential Euler MLMC and \(\beta = 2\phi (b) = 1/2 + 2b\) for the other two methods. We will further set \(\alpha = \beta /2\) as the weak rate when implementing all MLMC methods.
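Observed rates of this kind are typically read off as the slope of the error against the resolution on a log-log scale. The Python sketch below shows one way to do this with a least-squares fit. The function name and the synthetic error data are hypothetical and are not the values behind Figs. 1 and 2.

```python
import numpy as np

def estimate_rate(resolutions, errors):
    """Least-squares estimate of r in errors ~ const * resolutions**(-r)."""
    slope, _intercept = np.polyfit(np.log2(resolutions), np.log2(errors), 1)
    return -slope

if __name__ == "__main__":
    # Synthetic RMSE values decaying like 1/J (illustrative data only).
    J = 2.0 ** np.arange(4, 11)
    rmse = 3.0 / J
    rmse_rate = estimate_rate(J, rmse)
    # The MSE rate beta is twice the RMSE rate.
    print("RMSE rate:", rmse_rate, " implied beta:", 2.0 * rmse_rate)
```

In practice the fit would be restricted to the asymptotic regime, since pre-asymptotic points and Monte Carlo noise at the finest resolutions can bias the slope.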
4.2 Method Parameters
All three methods are implemented using Theorem 1 with the numerical estimates of the rates \(\alpha \) and \(\beta \), rather than using the slightly more conservative rate for \(\beta \) in Theorem 2 (ii).
4.2.1 Exponential Euler MLMC Method
We use the estimated rates \(\alpha =1\) and \(\beta =2\) and balance error contributions in time and space by setting
and \(L = \lceil \log _2( 1/\epsilon )/\alpha \rceil = \lceil \log _2( 1/\epsilon )\rceil \). We set \(C_\ell := (\ell +1) 2^{\gamma \ell }\) with \(\gamma = 1 + 1/(2\phi )\), which one may verify is consistent with \(C_\ell \eqsim J_\ell N_\ell \log _2(N_\ell )\), and we set \(V_\ell := 2^{-2\ell }\) to determine the sequence \(\{M_\ell \}_\ell \), in compliance with formula (3.15), by
4.2.2 Drift-Exponential Euler MLMC and Milstein MLMC
The numerically observed convergence rates for both of these methods are \(\beta = 2\phi \) and \(\alpha = \phi \). For both methods, we set
\(C_\ell := (\ell +1) 2^{3\ell /2}\) and \(V_\ell := 2^{- 2\phi \ell }\), and we determine the sequence \(\{M_\ell \}_{\ell }\) by formula (4.1).
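The parameter choices of Sects. 4.2.1 and 4.2.2 can be turned into a sample plan as follows. This Python sketch assumes the standard MLMC allocation \(M_\ell = \lceil \epsilon ^{-2} \sqrt{V_\ell /C_\ell } \sum _j \sqrt{V_j C_j} \rceil \) in place of formula (4.1), and it assumes \(L = \lceil \log _2(1/\epsilon )/\alpha \rceil \) for all methods. The function name `mlmc_plan` and the value \(\phi = 1/2\) are illustrative choices.

```python
import math

def mlmc_plan(eps, alpha, beta, gamma):
    """Return (L, M, cost) for the models V_l = 2^(-beta*l), C_l = (l+1)*2^(gamma*l).

    Uses the standard allocation M_l ~ eps^-2 * sqrt(V_l/C_l) * sum_j sqrt(V_j*C_j),
    which is an assumption standing in for formula (4.1).
    """
    L = math.ceil(math.log2(1.0 / eps) / alpha)
    levels = range(L + 1)
    V = [2.0 ** (-beta * l) for l in levels]
    C = [(l + 1) * 2.0 ** (gamma * l) for l in levels]
    s = sum(math.sqrt(v * c) for v, c in zip(V, C))
    M = [math.ceil(eps ** -2 * math.sqrt(v / c) * s) for v, c in zip(V, C)]
    cost = sum(c * m for c, m in zip(C, M))   # total cost sum_l C_l * M_l
    return L, M, cost

if __name__ == "__main__":
    phi, eps = 0.5, 2.0 ** -8
    # Exponential Euler MLMC: alpha = 1, beta = 2, gamma = 1 + 1/(2*phi).
    L_ee, M_ee, cost_ee = mlmc_plan(eps, 1.0, 2.0, 1.0 + 1.0 / (2.0 * phi))
    # Drift-exponential Euler and Milstein MLMC: alpha = phi, beta = 2*phi, gamma = 3/2.
    L_mi, M_mi, cost_mi = mlmc_plan(eps, phi, 2.0 * phi, 1.5)
    print("exponential Euler: L =", L_ee, " cost =", cost_ee)
    print("other methods:     L =", L_mi, " cost =", cost_mi)
```

With these model rates, the plan for the low-regularity value \(\phi = 1/2\) has a markedly smaller modelled cost for exponential Euler MLMC, because the slower weak rate of the other methods requires more levels and their variance decays more slowly than their cost grows.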
4.3 Linear Reaction Term
We first consider the SPDE with \(f(U) = U\). Figure 3 presents pairwise coupled realizations for progressively finer resolution in time for the settings \(b=1/4\) and \(b=1/2\). In the low-regularity setting \(b=1/4\), we clearly observe that the exponential Euler method has far less smoothing of the solutions and achieves a stronger coupling than the other methods. The difference between the methods becomes less visible in the smoother setting \(b=1/2\).
Figure 4 provides the approximation of \(\mathbb {E}[U(T,\cdot )]\) by one simulation of each of the MLMC methods for different input \(\epsilon = 2^{-\ell }\) for \(\ell = 4,5,\ldots ,9\). We observe that all methods converge to the mean with approximately the same rate for both values of b.
Figure 5 presents the MLMC approximation error versus tolerance and the computational cost versus tolerance for different input tolerances \(\epsilon \) for one simulation. The approximation error
is computed for one simulation of the MLMC estimator for each input of \(\epsilon \), where the pseudo-reference solution \(\mathbb {E}\left[ U(T, \cdot )\right] \) is obtained by solving the PDE
with the exponential Euler method using the resolutions \(N=2^{13}\) and \(J=2^{18}\). Let us also recall that the computational cost of the MLMC methods is defined by \(\sum _{\ell =0}^L C_\ell M_\ell \).
For \(\phi = 1/2-\), we observe that the exponential Euler MLMC method achieves the error \(\mathcal {O}(\epsilon ^2)\) at the cost \(\mathcal {O}((\log _2(\epsilon ))^2 \epsilon ^{-2})\) while the other methods achieve similar accuracy at considerably higher cost. For \(\phi = 3/4-\) all three methods achieve an error \(\mathcal {O}(\epsilon ^2)\) at a comparable computational cost. The observations are consistent with theory.
4.4 Trigonometric Reaction Term
We next consider the SPDE with
The approximation of \(\mathbb {E}[U(T,\cdot )]\) by the MLMC methods for different inputs \(\epsilon = 2^{-\ell }\) for \(\ell = 4,5,\dots ,9\) is presented in Fig. 6, and Fig. 7 shows the MLMC approximation error versus computational cost and computational cost versus tolerance for different input tolerances \(\epsilon \). For each value of b, the pseudo-reference solution used for evaluating the approximation error is computed by the exponential Euler MLMC method \(E_{\textrm{ML}}[ U(T,\cdot )](\omega , \epsilon ) \approx \mathbb {E}[U(T,\cdot )]\) with the overkilled parameter value \(\epsilon = 2^{-11}\). This is an expensive computation using the following number of samples per level when \(b=1/4\):
with \(N_{\ell } = J_{\ell } = 2^{\ell +2}\). We observe once again that exponential Euler MLMC outperforms the other methods in the low-regularity setting \(b=1/4\) and that all methods perform similarly when \(b=1/2\).
5 Conclusion
Our objective in this work was to show both theoretically and experimentally that coupling approaches that exploit more information than only the driving noise \(W_t\), such as the exponential Euler MLMC method, can result in strong coupling and improve the efficiency of weak approximations for SPDE. Our motivation in doing so was the lack of literature on strong coupling for MLMC methods solving SPDE. In particular, we have derived explicit convergence rates, related to the decay of the mean squared error-to-cost rate, for the exponential Euler MLMC method and the Milstein MLMC method, cf. Theorems 2 and 3. The convergence rate for the exponential Euler MLMC method is an improvement over existing MLMC methods for reaction-diffusion SPDE with additive noise. We also presented numerical experiments highlighting our derived rates and demonstrating the efficiency gains of the exponential Euler MLMC method over alternative ones. This was tested numerically on SPDE with linear and nonlinear reaction terms.
There are many possible extensions of this work. It would be interesting to understand whether strong couplings also can improve the efficiency of MLMC for other numerical solvers for SPDE, such as finite difference methods and FEM [2, 5]. This indeed is a challenging problem, due to the seemingly limitless possibilities of couplings for infinite-dimensional problems. Another direction is to develop a multi-index Monte Carlo method [18, 28] based on the pathwise correctly coupled exponential Euler method. This has the potential of further improving tractability in higher-dimensional physical space and low-regularity settings.
Data Availability
The computer code used for generating the numerical tests, written in the Julia programming language, is available from the corresponding author on request.
References
Abdar, M., Pourpanah, F., Hussain, S., Rezazadegan, D., et al.: A review of uncertainty quantification in deep learning: Techniques, applications and challenges. Information Fusion 71, 243–297 (2021)
Abdulle, A., Barth, A., Schwab, C.: Multilevel Monte Carlo methods for stochastic elliptic multiscale PDEs. Multiscale Model. Simul. 11(4), 1033–1070 (2013)
Barth, A., Lang, A.: Multilevel Monte Carlo method with applications to stochastic partial differential equations. Int. J. Comput. Math. 89(18), 2479–2498 (2012)
Barth, A., Lang, A.: Milstein approximation for advection-diffusion equations driven by multiplicative noncontinuous martingale noises. Appl. Math. Optim. 66(3), 387–413 (2012)
Barth, A., Lang, A., Schwab, C.: Multilevel Monte Carlo method for parabolic stochastic partial differential equations. BIT Num. Math. 53(1), 3–27 (2013)
Chada, N. K., Jasra, A., Yu, F.: Multilevel ensemble Kalman–Bucy filters. arXiv preprint arXiv:2011.04342, (2020)
Charrier, J., Scheichl, R., Teckentrup, A.L.: Finite element error analysis of elliptic PDEs with random coefficients and its application to multilevel Monte Carlo methods. SIAM J. Num. Anal. 51, 322–352 (2013)
Chernov, A., Hoel, H., Law, K.J.H., Nobile, F., Tempone, R.: Multilevel ensemble Kalman filtering for spatio-temporal processes. Num. Math. 147, 71–125 (2021)
Da Prato, G., Zabczyk, J.: Stochastic equations in infinite dimensions. Cambridge University Press, Cambridge, UK (1992)
Dodwell, T.J., Ketelsen, C., Scheichl, R., Teckentrup, A.L.: Multilevel Markov chain Monte Carlo. SIAM Review 61(3), 509–545 (2019)
Erdoğan, U., Lord, G.J.: A new class of exponential integrators for SDEs with multiplicative noise. IMA J. Num. Anal. 39(2), 820–846 (2019)
Freeman, T. G.: The mathematics of medical imaging: a beginner’s guide. Springer Undergraduate Texts, (2015)
Fossum, K., Mannseth, T., Stordal, A.S.: Assessment of multilevel ensemble-based data assimilation for reservoir history matching. Comput. Geosci. 24, 217–239 (2020)
Giles, M.B.: Multilevel Monte Carlo path simulation. Op. Res. 56, 607–617 (2008)
Giles, M.B.: Multilevel Monte Carlo methods. Acta Numerica 24, 259–328 (2015)
Giles, M.B., Reisinger, C.: Stochastic finite differences and multilevel Monte Carlo for a class of SPDEs in finance. SIAM J. Fin. Math. 3(1), 572–592 (2012)
Giles, M.B., Szpruch, L.: Antithetic multilevel Monte Carlo estimation for multi-dimensional SDEs without Lévy area simulation. Ann. Appl. Prob. 24(4), 1585–1620 (2014)
Haji-Ali, A., Nobile, F., Tempone, R.: Multi-index Monte Carlo: when sparsity meets sampling. Numer. Math. 132, 767–806 (2016)
Harbrecht, H., Peters, M., Siebenmorgen, M.: On multilevel quadrature for elliptic stochastic partial differential equations. In: Garcke, J., Griebel, M. (eds.) Sparse grids and applications. Lecture Notes in Computational Science and Engineering, 161–179, vol. 88. Springer, Berlin-Heidelberg (2013)
Hausenblas, E.: Numerical analysis of semilinear stochastic evolution equations in Banach spaces. J. Comput. Appl. Math. 147, 485–516 (2002)
Hausenblas, E.: Approximation for Semilinear stochastic evolution equations. Potential Analysis 18, 141–186 (2003)
Heinrich, S.: Multilevel Monte Carlo methods. In Large-Scale Scientific Computing, (Eds. S. Margenov, J. Wasniewski & P. Yalamov), Springer: Berlin, (2011)
Hochbruck, M., Ostermann, A.: Exponential integrators. Acta Numerica 19, 209–286 (2010)
Hoel, H., Law, K.J.H., Tempone, R.: Multilevel ensemble Kalman filtering. SIAM J. Numer. Anal. 54(3), 1813–1839 (2016)
Hoel, H., Shaimerdenova, G., Tempone, R.: Multilevel ensemble Kalman filtering based on a sample average of independent ENKF estimators. Found. Data Sci. 2(4), 351–390 (2020)
Jasra, A., Kamatani, K., Law, K.J.H., Zhou, Y.: Multilevel particle filters. SIAM J. Numer. Anal. 55(6), 3068–3096 (2017)
Jasra, A., Kamatani, K., Law, K. J. H., Zhou, Y.: A multi-index Markov chain Monte Carlo method. Int’l J. Uncer. Quant., 8(1), (2018)
Jasra, A., Law, K.J.H., Xu, Y.: Multi-Index sequential Monte Carlo methods for partially observed stochastic partial differential equations. Int’l J. Uncer. Quant. 11, 1–25 (2021)
Jentzen, A.: Stochastic partial differential equations: analysis and numerical approximations. ETH Zurich Lecture Notes, (2016)
Jentzen, A., Kloeden, P.E.: Overcoming the order barrier in the numerical approximation of stochastic partial differential equations with additive space-time noise. Proc. R. Soc. A Math. Phys. Eng. Sci 465, 649–667 (2009)
Jentzen, A., Röckner, M.: A Milstein scheme for SPDEs. Found. Comput. Math. 15(2), 313–362 (2015)
Kloeden, P.E., Platen, E.: Numerical solution of stochastic differential equations. Applied Mathematical Sciences, Springer, New York, Berlin (1992)
Kloeden, P.E., Lord, G.J., Neuenkirch, A., Shardlow, T.: The exponential integrator scheme for stochastic partial differential equations: pathwise error bounds. J. Comput. Appl. Math. 235(5), 1245–1260 (2011)
Lang, A., Petersson, A.: Monte Carlo versus multilevel Monte Carlo in weak error simulations of SPDE approximations. Math. Comput. Simul. 143, 99–113 (2018)
Lord, G. J., Powell, C.E., Shardlow, T.: An introduction to computational stochastic PDEs, Cambridge Texts in Applied Mathematics, (2014)
Lord, G.J., Tambue, A.: Stochastic exponential integrators for the finite element discretization of SPDEs for multiplicative and additive noise. IMA J. Num. Anal. 33(2), 515–543 (2013)
Majda, A., Wang, X.: Non-linear dynamics and statistical theories for basic geophysical flows. Cambridge University Press, UK (2006)
Müller, E.H., Scheichl, R., Shardlow, T.: Improving multilevel Monte Carlo for stochastic differential equations with application to the Langevin equation. Royal Society Proceedings A, (2015)
Robert, C., Casella, G.: Monte Carlo statistical methods. Springer Science & Business Media, UK (2013)
Sullivan, T. J.: Introduction to uncertainty quantification. Texts in Applied Mathematics 63, Springer, (2014)
Smith, R. C.: Uncertainty quantification: theory, implementation, and applications. SIAM textbooks, (2013)
Xiu, D.: Numerical methods for stochastic computations: a spectral method approach. Princeton University Press, Princeton, NJ (2010)
Zhang, Z., Karniadakis, G.E.: Numerical methods for stochastic partial differential equations with white noise. Applied Mathematical Sciences, Springer, USA (2017)
Funding
Open access funding provided by University of Oslo (incl Oslo University Hospital). Research reported in this publication received support from the Alexander von Humboldt Foundation. NKC and AJ are sponsored by KAUST baseline funding and HH acknowledges support by University of Oslo and RWTH Aachen University.
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
Appendices
A Model Assumptions for the Exponential Euler Method
Our assumptions will be similar to those in the seminal work [30] on exponential Euler integrators.
Assumption 2
There exists a strictly increasing sequence \((\lambda _n)_{n=1}^{\infty }\) of positive real numbers such that \(Ae_n=-\lambda _n\,e_n\) for \(n\in {\mathbb N}\) and the linear operator \(A:D(A) \rightarrow H\) is given as
where
We define the family of interpolation spaces of the operator A for \(r \ge 0\) as follows
The Q-Wiener process is defined by
where \((w^n_t)_{n=1}^{\infty }\) is a sequence of independent scalar-valued Wiener processes, and the non-negative sequence \((q_n )_n \subset [0,\infty )\) satisfies the following:
Assumption 3
There exists a constant \(\phi \in (0,1)\) such that
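For simulation purposes, the expansion of the Q-Wiener process is truncated to finitely many modes and sampled through independent Gaussian increments. The Python sketch below illustrates this, together with the summation of fine increments into coarse ones that underlies coupling through the driving noise. The function names, the truncation level, and the sequence \(q_n = n^{-2}\) are hypothetical choices made for illustration and are not the ones used in Sect. 4.

```python
import numpy as np

def q_wiener_increments(q, dt, n_steps, rng):
    """Spectral coefficients of W(t_{j+1}) - W(t_j) for a truncated Q-Wiener process.

    Row j holds sqrt(q_n) * (w^n_{t_{j+1}} - w^n_{t_j}) for each retained mode n.
    """
    q = np.asarray(q, dtype=float)
    return np.sqrt(q * dt) * rng.standard_normal((n_steps, q.size))

def coarsen(fine_increments):
    """Sum consecutive pairs of fine increments (requires an even number of steps)."""
    n_steps, n_modes = fine_increments.shape
    return fine_increments.reshape(n_steps // 2, 2, n_modes).sum(axis=1)

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    n = np.arange(1, 65)
    q = n ** -2.0                      # hypothetical decay of the sequence (q_n)
    fine = q_wiener_increments(q, dt=2.0 ** -6, n_steps=32, rng=rng)
    coarse = coarsen(fine)
    print(fine.shape, coarse.shape)
```

Because the coarse increments are sums of the fine ones, both discretizations see the same realization of the truncated process, which is the minimal requirement for a pairwise coupling.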
Let \(L(H_{r_1},H_{r_2})\) denote the set of bounded linear operators mapping from \(H_{r_1}\) to \(H_{r_2}\), and let \(L(H_r):= L(H_r,H_r)\).
Assumption 4
The reaction term \(f:H \rightarrow H\) is twice continuously Fréchet differentiable, where its derivatives satisfy the following
for all \(x,y \in H, v \in D((-A)^{\phi })\), and \(r \in \{0,1/2,1\}\), and
for all \(v,w \in H\), where \(C>0\) is a positive constant.
Assumption 5
The initial value \(u_0\) is a \(D((-A)^{\phi })\)-valued random variable that satisfies
for the constant \(\phi >0\) in Assumption 3.
B Model Assumptions for the Milstein Method
In this section, we present the assumptions for the Milstein method [31] in the setting that is relevant for this paper: when the operators A and Q share eigenspace and for reaction-diffusion SPDE (2.1) with additive noise.
Assumption 6
(Drift coefficient and noise assumption) Let Assumption 2 hold and let Assumption 3 hold for some \(\phi \in (1/2,1)\). Let \(\kappa \in [0, \phi )\) and let \(f:H_{\kappa } \rightarrow H\) be a twice continuously Fréchet differentiable mapping with
And for a value \(\theta \in [\max (\kappa ,\phi -1/2), \phi )\), it holds that
Remark 6
Since \(H_\kappa \) is dense in H, the operator \(f'(x) \in L(H_\kappa , H)\) has a unique extension \(\tilde{f}'(x) \in L(H,H)\) and one should interpret the operator norm on the extended domain as follows:
Remark 7
The cryptic parameter \(\theta \) is an adaptation of [31, Assumption 3] to the additive-noise setting with \(B(u) = I\) and \(U_0 = Q^{1/2}(H)\). And, working with Hilbert–Schmidt operator norms, [31, equation (21)] is then fulfilled by
What we represent by \(\kappa \), \(\phi -1/2\) and \(\theta \) is respectively denoted by \(\beta \), \(\delta \) and \(\gamma \) in [31]. [31, equation (22)] is trivially fulfilled since \(B'(u) =0\) and choosing, in the paper’s notation, \(\alpha =0\) and \(\vartheta = \max (1/2 - \phi , 1/4)\), it follows that [31, equation (23)] holds for any \(\theta \in [ \max (\kappa ,\phi -1/2), \phi )\), since
[31, equation (23)] does indeed not depend on the value \(\theta \) in the additive-noise setting, but \(\theta \) does enter as a constraint on the regularity of the initial data in [31, Assumption 4]. Our lower bound \(\phi > 1/2\) is due to the constraint \(\delta >0\) in [31, Assumption 3] and our upper bound \(\kappa < \phi \) is due to the constraint \(\beta < \delta +1/2\) in [31, Assumption 3].
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Chada, N.K., Hoel, H., Jasra, A. et al. Improved Efficiency of Multilevel Monte Carlo for Stochastic PDE through Strong Pairwise Coupling. J Sci Comput 93, 62 (2022). https://doi.org/10.1007/s10915-022-02031-2
DOI: https://doi.org/10.1007/s10915-022-02031-2
Keywords
- Multilevel Monte Carlo method
- Stochastic partial differential equations
- Exponential Euler method
- Weak approximations