Abstract
The numerical approximation of the solution to a stochastic partial differential equation with additive spatial white noise on a bounded domain is considered. The differential operator is assumed to be a fractional power of an integer order elliptic differential operator. The solution is approximated by means of a finite element discretization in space and a quadrature approximation of an integral representation of the fractional inverse from the Dunford–Taylor calculus. For the resulting approximation, a concise analysis of the weak error is performed. Specifically, for the class of twice continuously Fréchet differentiable functionals with second derivatives of polynomial growth, an explicit rate of weak convergence is derived, and it is shown that the component of the convergence rate stemming from the stochasticity is doubled compared to the corresponding strong rate. Numerical experiments for different functionals validate the theoretical results.
1 Introduction
The representation of Gaussian random fields as solutions to stochastic partial differential equations (SPDEs) has become a popular approach in spatial statistics in recent years. It was observed already in [21] and [22] that a Gaussian random field u on \(\mathbb {R}^d\) with a covariance function of Matérn type [13] solves an SPDE of the form \((\kappa ^2 - \varDelta )^\beta u = {\mathscr {W}}\). Here, \({\mathscr {W}}\) is Gaussian white noise, \(\kappa >0\) is a parameter determining the practical correlation range of the field, and \(\beta >d/4\) controls the smoothness parameter \(\nu \) of the Gaussian Matérn field via the equality \(\nu = 2\beta - d/2\).
Later, this relation was the incentive to consider the SPDE
$$\begin{aligned} (\kappa ^2 - \varDelta )^\beta u = {\mathscr {W}} \quad \text {in } {\mathscr {D}} \end{aligned} \tag{1.1}$$
for Gaussian random field approximations of Matérn fields on bounded domains \({\mathscr {D}}\subsetneq \mathbb {R}^d\). On the boundary \(\partial {\mathscr {D}}\), the operator \(\kappa ^2-\varDelta \) is augmented with, e.g., homogeneous Dirichlet or Neumann boundary conditions. In [12] it was shown that by restricting the value of \(\beta \) to \(2\beta \in {\mathbb {N}}\) and by solving the stochastic problem (1.1) by means of a finite element method, the computational costs of many operations, which are needed for statistical inference, such as sampling and likelihood evaluations can be significantly reduced. This decrease in computing time is one of the main reasons for the popularity of the SPDE approach in spatial statistics. In addition, it facilitates various extensions of the Matérn model which are difficult to formulate using a covariance-based approach, see, for instance [2, 5, 10, 12, 20].
However, the constraint \(2\beta \in {\mathbb {N}}\) imposed by [12] restricts the value of the smoothness parameter \(\nu \), which is the most important parameter when the model is used for prediction [17]. In [4] we showed that this restriction can be avoided by combining a finite element discretization in space with a quadrature approximation based on an integral representation of the inverse fractional power operator from the Dunford–Taylor calculus. We furthermore derived an explicit rate of convergence for the strong mean-square error of the proposed approximation for a class of fractional elliptic stochastic equations including (1.1).
In practice, it is often not only necessary to sample from the solution u to (1.1), but also to estimate the expected value \(\mathbb {E}[\varphi (u)]\) of a certain real-valued quantity of interest \(\varphi (u)\). The aim of this work is to provide a concise analysis of the weak error \(|\mathbb {E}[\varphi (u)] - \mathbb {E}[\varphi (u_{h,k}^Q)]|\) for the approximation \(u_{h,k}^Q\) proposed in [4]. This analysis includes the derivation of an explicit weak convergence rate for twice continuously Fréchet differentiable real-valued functions \(\varphi \), whose second derivatives are of polynomial growth. Functions of this form occur in many applications, e.g., when integral means of the solution with respect to a certain subdomain of \({\mathscr {D}}\) are of interest, or when a transformation of the model is used as a component in a hierarchical model. An example of the latter situation is to consider logit or probit transformed Gaussian random fields for binary regression models, see, e.g., [16, §4.3.3].
We prove that, compared to the convergence rate of the strong error formulated in [4], the component of the weak convergence rate stemming from the stochasticity of the problem is doubled. To this end, two time-dependent stochastic processes are introduced, which at time \(t=1\) have the same probability distribution as the exact solution u and the approximation \(u_{h,k}^Q\), respectively. The weak error is then bounded by introducing an associated Kolmogorov backward equation on the interval [0, 1] and applying Itô calculus.
The structure of this article is as follows: in Sect. 2 we formulate the equation of interest in a Hilbert space setting similarly to [4] and state our main result on weak convergence of the approximation in Theorem 2.1. A detailed proof of Theorem 2.1 is given in Sect. 3. For validating the theoretical result in practice, we describe the outcomes of several numerical experiments in Sect. 4. Finally, Sect. 5 concludes the article with a discussion.
2 Weak approximations
The subject of our investigations is the fractional order equation considered in [4],
$$\begin{aligned} L^\beta u = g + {\mathscr {W}}, \end{aligned} \tag{2.1}$$
for \(\beta \in (0,1)\), where \({\mathscr {W}}\) denotes Gaussian white noise defined on a complete probability space \((\Omega , {\mathscr {A}}, \mathbb {P})\) with values in a separable Hilbert space H. Here and below, (in-)equalities involving random terms are meant to hold \(\mathbb {P}\)-almost surely, if not specified otherwise. Furthermore, we use the notation \(X\overset{d}{=}Y\) to indicate that two random variables X and Y have the same probability distribution.
Similarly to [4], we make the following assumptions: \(L:{\mathscr {D}}(L) \subset H \rightarrow H\) is a densely defined, self-adjoint, positive definite operator and has a compact inverse \(L^{-1}:H \rightarrow H\). In this case, \(-L\) generates an analytic strongly continuous semigroup \((S(t))_{t\ge 0}\) on H. The H-orthonormal eigenvectors of L are denoted by \(\{e_j\}_{j\in \mathbb {N}}\) and the corresponding eigenvalues by \(\{\lambda _j\}_{j\in \mathbb {N}}\). These values are listed in nondecreasing order and we assume that there exist constants \(\alpha , c_\lambda , C_\lambda > 0\) such that
$$\begin{aligned} c_\lambda \, j^\alpha \le \lambda _j \le C_\lambda \, j^\alpha \quad \forall j \in \mathbb {N}. \end{aligned} \tag{2.2}$$
The action of the fractional power operator \(L^\beta \) in (2.1) is well-defined on
which is itself a Hilbert space with inner product \( ( \phi , \psi )_{ 2\beta } := ( L^{\beta } \phi , L^{\beta } \psi )_{ H }\). Furthermore, there exists a unique continuous extension of \(L^\beta \) to an isometric isomorphism \(L^\beta :{\dot{H}}^{r} \rightarrow {\dot{H}}^{r-2\beta }\) for all \(r\in \mathbb {R}\), see [4, Lem. 2.1]. Here, for \(s > 0\), the negative-indexed space \({\dot{H}}^{-s}\) is defined as the dual space of \({\dot{H}}^{s}\). After identifying the dual space \(H^*\) of \({\dot{H}}^{0} := H\) via the Riesz map, we obtain the Gelfand triple \({\dot{H}}^{s} \hookrightarrow H \cong H^* \hookrightarrow {\dot{H}}^{-s}\) with continuous and dense embeddings. The norm on the dual space \({\dot{H}}^{-s}\) can be expressed by
where \( \langle \,\cdot \,,\,\cdot \, \rangle _{ }\) denotes the duality pairing between \({\dot{H}}^{-s}\) and \({\dot{H}}^{s}\), [19, Proof of Lem. 5.1]. With this representation of the dual norm and the growth (2.2) of the eigenvalues \(\lambda _j\) at hand, it is an immediate consequence of a Karhunen–Loève expansion of the white noise \({\mathscr {W}}\) with respect to the H-orthonormal eigenvectors \(\{e_j\}_{j\in \mathbb {N}}\) that \({\mathscr {W}}\) has mean-square regularity in \({\dot{H}}^{-s}\) for every \(s > \alpha ^{-1}\), see [4, Prop. 2.3]. Consequently, (2.1) has a solution \(u \in L_2(\Omega ; {\dot{H}}^{2\beta - s})\) for \(s > \alpha ^{-1}\) if \(g\in {\dot{H}}^{-s}\).
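The regularity claim for the white noise can be illustrated by a short calculation. The sketch below assumes the formal expansion \({\mathscr {W}} = \sum _{j\in \mathbb {N}} \xi _j e_j\) with independent \(\xi _j \sim {\mathscr {N}}(0,1)\), the spectral form \(\Vert g \Vert _{-s}^2 = \sum _{j\in \mathbb {N}} \lambda _j^{-s} \langle g, e_j \rangle ^2\) of the dual norm, and the two-sided growth bound \(c_\lambda j^\alpha \le \lambda _j \le C_\lambda j^\alpha \) for (2.2). These are the standard forms and are stated here as assumptions; the precise statements are those of [4].
$$\begin{aligned} \mathbb {E}\bigl [ \Vert {\mathscr {W}} \Vert _{-s}^2 \bigr ] = \sum _{j\in \mathbb {N}} \lambda _j^{-s} \, \mathbb {E}[ \xi _j^2 ] = \sum _{j\in \mathbb {N}} \lambda _j^{-s} \le c_\lambda ^{-s} \sum _{j\in \mathbb {N}} j^{-\alpha s}. \end{aligned}$$
The series on the right converges if \(\alpha s > 1\), and the matching lower bound \(\lambda _j^{-s} \ge C_\lambda ^{-s} j^{-\alpha s}\) shows that the second moment is infinite otherwise. This is the condition \(s > \alpha ^{-1}\) quoted above.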
2.1 The Galerkin approximation
In the following, let \((V_h)_{h\in (0,1)}\) be a family of subspaces of \({\dot{H}}^{1}={\mathscr {D}}(L^{1/2})\) with finite dimensions \(N_h := \dim (V_h)\) and let \(\Pi _h:H \rightarrow V_h\) be the H-orthogonal projection onto \(V_h\). For \(g\in H\), we define the finite element approximation of \(v = L^{-1}g\) by \(v_h = L_h^{-1} \Pi _hg\), where \(L_h\) denotes the Galerkin discretization of the operator L with respect to \(V_h\), i.e.,
We then consider the following numerical approximation of the solution u to (2.1)
$$\begin{aligned} u_{h,k}^Q := Q_{h,k}^\beta \bigl ( \Pi _h g + {\mathscr {W}}_h^\Phi \bigr ), \end{aligned} \tag{2.3}$$
proposed in [4, Eq. (2.18)]. It is based on the following two components:
(a) The operator \(Q_{h,k}^\beta \) is the quadrature approximation for \(L_h^{-\beta }\) of [6]:
$$\begin{aligned} Q^\beta _{h,k} := \frac{2 k \sin (\pi \beta )}{\pi } \sum _{\ell =-K^{-}}^{K^{+}} e^{2\beta y_\ell } \left( \mathrm {Id}_{V_h} + e^{2 y_\ell } L_h \right) ^{-1}. \end{aligned} \tag{2.4}$$
The quadrature nodes \(\{y_\ell = \ell k : \ell \in \mathbb {Z}, -K^{-} \le \ell \le K^+\}\) are equidistant with distance \(k>0\) and we set \(K^- := \bigl \lceil \tfrac{\pi ^2}{4\beta k^2} \bigr \rceil \) and \(K^+ := \bigl \lceil \frac{\pi ^2}{4(1-\beta )k^2} \bigr \rceil \).
(b) The white noise \({\mathscr {W}}\) in H is approximated by the square-integrable \(V_h\)-valued random variable \({\mathscr {W}}_h^\Phi \) given by \({\mathscr {W}}_{h}^{\Phi } := \sum _{j=1}^{N_h} \xi _j \, \phi _{j,h}\), where \(\Phi :=\{\phi _{j,h}\}_{j=1}^{N_h}\) is any basis of the finite element space \(V_h\). The vector \(\varvec{\xi }= (\xi _1,\ldots , \xi _{N_h})^T\) is multivariate Gaussian distributed with mean zero and covariance matrix \({\mathbf {M}}^{-1}\), where \({\mathbf {M}}\) denotes the mass matrix with respect to the basis \(\Phi \), i.e., \(M_{ij} = ( \phi _{i,h}, \phi _{j,h} )_{ H }\).
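The behaviour of the quadrature in (2.4) can be checked on a single eigenvalue. The following is a minimal sketch, not the implementation of [4] or [6]: it replaces \(L_h\) by a scalar \(\lambda > 0\), so that the sum should approximate \(\lambda ^{-\beta }\). The function name `sinc_quadrature_inverse_power` is hypothetical, and the rescaling of the summand for positive nodes is an implementation choice made here to avoid floating-point overflow.

```python
import math


def sinc_quadrature_inverse_power(lam, beta, k):
    """Scalar analogue of (2.4): approximate lam**(-beta) for lam > 0, beta in (0, 1)."""
    if not 0.0 < beta < 1.0:
        raise ValueError("beta must lie in (0, 1)")
    if lam <= 0.0 or k <= 0.0:
        raise ValueError("lam and k must be positive")
    # Numbers of quadrature nodes to the left and right of zero, as in the text.
    k_minus = math.ceil(math.pi**2 / (4.0 * beta * k**2))
    k_plus = math.ceil(math.pi**2 / (4.0 * (1.0 - beta) * k**2))
    total = 0.0
    for ell in range(-k_minus, k_plus + 1):
        y = ell * k
        if y <= 0.0:
            # Direct form e^{2 beta y} / (1 + e^{2 y} lam).
            total += math.exp(2.0 * beta * y) / (1.0 + math.exp(2.0 * y) * lam)
        else:
            # Same summand, multiplied and divided by e^{-2 y} to avoid overflow.
            total += math.exp((2.0 * beta - 2.0) * y) / (math.exp(-2.0 * y) + lam)
    return 2.0 * k * math.sin(math.pi * beta) / math.pi * total
```

For instance, `sinc_quadrature_inverse_power(10.0, 0.5, 0.3)` should agree with `10.0 ** -0.5` to several digits, and decreasing `k` reduces the error further at the cost of more nodes.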
The main outcome of [4] is strong convergence of the approximation \(u_{h,k}^{Q}\) in (2.3) to the solution u of (2.1) at an explicit rate. Subsequently, this work focusses on weak approximations based on \(u_{h,k}^{Q}\), i.e., we investigate the error
$$\begin{aligned} \bigl | \mathbb {E}[ \varphi (u) ] - \mathbb {E}[ \varphi (u_{h,k}^Q) ] \bigr | \end{aligned} \tag{2.5}$$
for continuous functions \(\varphi :H \rightarrow \mathbb {R}\).
Remark 2.1
In practice, the expected value \(\mathbb {E}[ \varphi (u_{h,k}^Q) ]\) is approximated, e.g., by a Monte Carlo method. For this, usually a large number of realizations of \(\varphi (u_{h,k}^Q)\) and, thus, of the approximation \(u_{h,k}^Q\) in (2.3) is needed. Each of them requires a sample of the load vector \({\mathbf {b}}\) with entries \(b_j := ( \Pi _hg + {\mathscr {W}}_h^\Phi , \phi _{j,h} )_{ H }\). As pointed out in [4, Rem. 2.9], this is computationally feasible if the mass matrix \({\mathbf {M}}\) with respect to the finite element basis \(\Phi \) is sparse, since the distribution of \(\varvec{\xi }\sim {\mathscr {N}}({\mathbf {0}},{\mathbf {M}}^{-1})\) implies that
$$\begin{aligned} {\mathbf {b}} \sim {\mathscr {N}}({\mathbf {g}}, {\mathbf {M}}), \quad \text {i.e.,} \quad {\mathbf {b}} \overset{d}{=} {\mathbf {g}} + {\mathbf {G}} {\mathbf {z}}, \end{aligned}$$
where \({\mathbf {z}}\sim {\mathscr {N}}({\mathbf {0}},{\mathbf {I}})\), \({\mathbf {G}}\) is the Cholesky factor of \({\mathbf {M}}= {\mathbf {G}}{\mathbf {G}}^T\), and the vector \({\mathbf {g}}\) has entries \(g_j := ( g, \phi _{j,h} )_{ H }\).
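The sampling step of Remark 2.1 and the matrix form of (2.4) can be sketched as follows. Everything specific here is an assumption made for illustration: the one-dimensional domain \((0,1)\), continuous piecewise linear elements on a uniform mesh with homogeneous Dirichlet conditions, the operator \(L = \kappa ^2 - \mathrm {d}^2/\mathrm {d}x^2\), and the hypothetical function names. Dense matrices and dense solves are used for brevity; the remark above relies on sparse factorizations in practice. In coefficients with respect to \(\Phi \), applying \((\mathrm {Id}_{V_h} + e^{2y} L_h)^{-1}\) to a function with load vector \({\mathbf {b}}\) amounts to solving with the matrix \({\mathbf {M}} + e^{2y} {\mathbf {L}}\), where \({\mathbf {L}}\) is the matrix of the bilinear form of L.

```python
import numpy as np


def p1_matrices(n, kappa=1.0):
    """Mass matrix and matrix of kappa^2 - d^2/dx^2 for P1 elements on (0, 1).

    Uniform mesh with n interior nodes, homogeneous Dirichlet conditions.
    """
    h = 1.0 / (n + 1)
    off = np.ones(n - 1)
    mass = h / 6.0 * (4.0 * np.eye(n) + np.diag(off, 1) + np.diag(off, -1))
    stiff = (2.0 * np.eye(n) - np.diag(off, 1) - np.diag(off, -1)) / h
    return mass, stiff + kappa**2 * mass


def sample_load_vector(mass, g_vec, z):
    """Return b = g + G z with mass = G G^T, so b ~ N(g, mass) if z ~ N(0, I)."""
    chol = np.linalg.cholesky(mass)  # lower-triangular Cholesky factor G
    return g_vec + chol @ z


def apply_quadrature(mass, op, b, beta, k):
    """Coefficient vector of Q_{h,k}^beta applied to the function with load vector b."""
    k_minus = int(np.ceil(np.pi**2 / (4.0 * beta * k**2)))
    k_plus = int(np.ceil(np.pi**2 / (4.0 * (1.0 - beta) * k**2)))
    u = np.zeros(len(b))
    for ell in range(-k_minus, k_plus + 1):
        y = ell * k
        if y <= 0.0:
            u += np.exp(2.0 * beta * y) * np.linalg.solve(mass + np.exp(2.0 * y) * op, b)
        else:
            # Same term, rescaled by e^{-2 y} to keep the matrix entries bounded.
            u += np.exp((2.0 * beta - 2.0) * y) * np.linalg.solve(
                np.exp(-2.0 * y) * mass + op, b
            )
    return 2.0 * k * np.sin(np.pi * beta) / np.pi * u
```

A realization of the approximation (for \(g = 0\)) would then be obtained with `z = np.random.default_rng(0).standard_normal(n)`, `b = sample_load_vector(mass, np.zeros(n), z)` and `apply_quadrature(mass, op, b, beta, k)`.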
2.2 Weak convergence
For bounding the error in (2.5), we start by introducing some more notation and assumptions. Let \({\mathscr {E}}:= \{e_{j,h}\}_{j=1}^{N_h} \subset V_h\) be the H-orthonormal eigenvectors of the discrete operator \(L_h\) with corresponding eigenvalues \(\{\lambda _{j,h}\}_{j=1}^{N_h}\) listed in nondecreasing order. In addition, the strongly continuous semigroup on \(V_h\) generated by \(-L_h\) is denoted by \((S_h(t))_{t\ge 0}\).
We define the space \(C^2(H;\mathbb {R})\) of twice continuously Fréchet differentiable functions \(\varphi :H \rightarrow \mathbb {R}\), i.e., \(\varphi \in C^2(H;\mathbb {R})\) if and only if
Here and below, using the Riesz representation theorem, we identify the first two Fréchet derivatives \(D\varphi \) and \(D^2 \varphi \) of \(\varphi \) with functions taking values in H and in \({\mathscr {L}}(H)\), respectively. Furthermore, we say that the second derivative has polynomial growth of degree \(p\in \mathbb {N}\), if there exists a constant \(K>0\) such that
$$\begin{aligned} \Vert D^2 \varphi (x) \Vert _{ {\mathscr {L}}(H) } \le K \bigl ( 1 + \Vert x \Vert _{ H }^p \bigr ) \quad \forall x \in H. \end{aligned} \tag{2.6}$$
All the properties of the finite element discretization, of the operator L, and of the function \(\varphi \), which are of importance for our analysis of the weak error (2.5), are summarized in the assumption below.
Assumption 2.1
The finite element spaces \((V_h)_{h\in (0,1)} \subset {\dot{H}}^{1}\), the operator L in (2.1), and the function \(\varphi :H \rightarrow \mathbb {R}\) in (2.5) satisfy the following:
(i) there exists \(d\in \mathbb {N}\) such that \(N_h = \dim (V_h) \propto h^{-d}\) for all \(h > 0\);
(ii) there exist constants \(C_1, C_2 > 0\), \(h_0\in (0,1)\), as well as exponents \(r,s > 0\) and \(q > 1\) such that
$$\begin{aligned} \lambda _j \le \lambda _{j,h}&\le \lambda _j + C_1 h^r \lambda _j^q, \\ \Vert e_j - e_{j,h} \Vert _{ H }^2&\le C_2 h^{2s} \lambda _j^q, \end{aligned}$$
for all \(h\in (0,h_0)\) and \(j\in \{1,\ldots ,N_h\}\);
(iii) the eigenvalues of L satisfy (2.2) for an exponent \(\alpha \) with
$$\begin{aligned} \tfrac{1}{2\beta } < \alpha \le \min \left\{ \tfrac{r}{(q-1)d}, \tfrac{2s}{q d} \right\} , \end{aligned}$$
where the values of \(d\in \mathbb {N}\), \(r,s>0\), and \(q>1\) are the same as in (i)–(ii);
(iv) \(s>2\beta \) and for \(0\le \theta \le \sigma \le s\) there exists a constant \(C_3 > 0\) such that
$$\begin{aligned} \Vert (S(t)-S_h(t)\Pi _h)g \Vert _{ H } \le C_3 h^\sigma t^{\frac{\theta -\sigma }{2}} \Vert g \Vert _{ \theta } \quad \forall t>0, \end{aligned}$$
for every \(g\in {\dot{H}}^{\theta }\) and \(h\in (0,h_0)\). Here, \(h_0\) and s are as in (ii);
(v) \(\varphi \in C^2(H;\mathbb {R})\) and \(D^2 \varphi \) has polynomial growth (2.6) of degree \(p\ge 2\).
The following example shows that Assumptions 2.1(i)–(iv) are satisfied, e.g., for the motivating problem (1.1) related to approximations of Matérn fields, if \(\beta > d/4\), when using continuous piecewise linear finite element bases.
Example 2.1
For \(\kappa \ge 0\) and a bounded, convex, polygonal domain \({\mathscr {D}}\subset \mathbb {R}^d\), consider the stochastic model problem (1.1), i.e., the fractional order equation (2.1) for \(g=0\) and \(L = \kappa ^2 - \varDelta \) on \(H=L_2({\mathscr {D}})\). Furthermore, we assume that the differential operator L is augmented with homogeneous Dirichlet boundary conditions on \(\partial {\mathscr {D}}\). In this case, the eigenvalues \(\{\lambda _j\}_{j\in \mathbb {N}}\) of L satisfy (2.2) for \(\alpha = 2/d\) (see [8, Ch. VI.4] for \({\mathscr {D}}=(0,1)^d\), the result for more general domains as above follows from the min–max principle). Consequently, the first inequality of Assumption 2.1(iii) holds if \(\beta > d/4\).
In addition, if \((V_h)_{h\in (0,1)} \subset {\dot{H}}^{1} = H_0^1({\mathscr {D}})\) are finite element spaces with continuous piecewise linear basis functions defined with respect to a quasi-uniform family of triangulations, Assumption 2.1(i) holds and Assumptions 2.1(ii), (iv) are satisfied for \(r=s=q=2\), see [18, Thm. 6.1, Thm. 6.2] and [19, Thm. 3.5]. Thus,
and Assumptions 2.1(i)–(iv) hold for all \(\beta \in (d/4,1)\).
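The parameter check in Example 2.1 is plain arithmetic and can be reproduced in a few lines. The sketch below only evaluates the two inequalities of Assumption 2.1(iii) for the values \(\alpha = 2/d\) and \(r=s=q=2\) quoted in the example; the function names are hypothetical.

```python
def assumption_iii_holds(alpha, beta, d, r, s, q):
    """Check 1/(2 beta) < alpha <= min{ r/((q-1) d), 2 s/(q d) }."""
    upper = min(r / ((q - 1.0) * d), 2.0 * s / (q * d))
    return 1.0 / (2.0 * beta) < alpha <= upper


def example_2_1_holds(beta, d):
    """Example 2.1: alpha = 2/d and r = s = q = 2 (piecewise linear elements)."""
    return assumption_iii_holds(2.0 / d, beta, d, r=2.0, s=2.0, q=2.0)
```

With these values the upper bound equals \(2/d\), so the second inequality holds with equality, and the first one reduces to \(\beta > d/4\).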
We remark that Assumptions 2.1(i)–(iii) coincide with those of [4]. The strong \(L_2(\Omega ;H)\)-convergence rate
$$\begin{aligned} \min \bigl \{ d ( \alpha \beta - 1/2 ), \, r, \, s \bigr \} \end{aligned} \tag{2.7}$$
was derived in [4, Thm. 2.10] for the approximation \(u_{h,k}^{Q}\) in (2.3) under a suitable calibration of the distance of the quadrature nodes k with the finite element mesh size h. Furthermore, a bound for the weak-type error
was provided, showing convergence to zero with the rate \(\min \{d(2\alpha \beta - 1),r,s\}\), see [4, Cor. 3.4]. In particular, the term \(d(2\alpha \beta - 1)\) stemming from the stochasticity is doubled compared to the strong rate in (2.7).
In the following, we generalize this result to weak errors of the form (2.5) for functions \(\varphi :H \rightarrow \mathbb {R}\), which are twice continuously Fréchet differentiable and have a second derivative of polynomial growth. The bound of the weak error in Theorem 2.1 is our main result.
Theorem 2.1
Let Assumption 2.1 be satisfied. Let \(\theta > \min \{d(2\alpha \beta -1),s\} - 2\beta \), if \(d(2\alpha \beta -1) \ge 2\beta \), and set \(\theta = 0\) otherwise. Then, for \(g\in {\dot{H}}^{\theta }\) and for sufficiently small \(h\in (0,h_0)\) and \(k\in (0,k_0)\), the weak error in (2.5) admits the bound
Here, we set \(f_{\alpha ,\beta }(h) := h^{d(\alpha \beta - 1)}\), if \(\alpha \beta \ne 1\), and \(f_{\alpha ,\beta }(h) := |\ln (h)|\), if \(\alpha \beta = 1\). The constant \(C>0\) is independent of h and k and the values of \(\alpha ,r,s > 0\), \(d\in \mathbb {N}\), and \(p\in \{2,3,\ldots \}\) are those of Assumption 2.1.
Remark 2.2
In the derivation of the strong convergence rate (2.7), we balanced the error terms caused by the quadrature and by the finite element method by choosing the quadrature step size k sufficiently small with respect to the finite element mesh width h, namely \(e^{-\pi ^2/(2k)} \propto h^{d\alpha \beta }\), see [4, Table 1].
For calibrating the terms in the weak error estimate (2.8), we distinguish the cases \(\alpha \beta < 1\), \(\alpha \beta = 1\), and \(\alpha \beta > 1\). If \(\alpha \beta < 1\), then \(d\alpha \beta > d(2\alpha \beta -1)\) and we let \(k>0\) be such that \(e^{-\pi ^2/(2k)} \propto h^{d\alpha \beta }\). With this choice, the error estimate (2.8) simplifies to
For \(\alpha \beta > 1\) (\(\alpha \beta =1\)) we achieve the same bound if k and h are calibrated such that \(e^{-\pi ^2/(2k)} \propto h^{d(2\alpha \beta -1)}\) (\(e^{-\pi ^2/(2k)} \max \{1,|\ln (h)|\} \propto h^d\)). Note that the calibration for \(\alpha \beta < 1\) coincides with the one for the strong error and that the term \(d(2\alpha \beta -1)\) in the derived weak convergence rate \(\min \{d(2\alpha \beta -1),r,s\}\) is doubled compared to the first term of the strong convergence rate (2.7).
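The calibration rules of Remark 2.2 can be turned into a small helper. This is a sketch under two assumptions that are not fixed by the text: the proportionality constant in each relation is set to one, and the strong rate is taken to be \(\min \{d(\alpha \beta -1/2),r,s\}\), as implied by the statement that the term \(d(2\alpha \beta -1)\) is doubled. The function names are hypothetical.

```python
import math


def strong_rate(alpha, beta, d, r, s):
    """Strong rate as implied by the text: min{ d(alpha beta - 1/2), r, s }."""
    return min(d * (alpha * beta - 0.5), r, s)


def weak_rate(alpha, beta, d, r, s):
    """Weak rate from Remark 2.2: min{ d(2 alpha beta - 1), r, s }."""
    return min(d * (2.0 * alpha * beta - 1.0), r, s)


def calibrated_step(h, alpha, beta, d):
    """Quadrature step k for mesh width h in (0, 1), proportionality constant 1."""
    if not 0.0 < h < 1.0:
        raise ValueError("h must lie in (0, 1)")
    ab = alpha * beta
    log_h = abs(math.log(h))
    if math.isclose(ab, 1.0):
        # e^{-pi^2/(2k)} * max{1, |ln h|} = h^d
        exponent = d * log_h + math.log(max(1.0, log_h))
    elif ab < 1.0:
        # e^{-pi^2/(2k)} = h^{d alpha beta}
        exponent = d * ab * log_h
    else:
        # e^{-pi^2/(2k)} = h^{d (2 alpha beta - 1)}
        exponent = d * (2.0 * ab - 1.0) * log_h
    return math.pi**2 / (2.0 * exponent)
```

For \(d=2\), \(\alpha =1\), and \(\beta =0.75\), the helper gives a weak rate of 1 and a strong rate of 0.5, and the step returned for \(h=0.01\) satisfies \(e^{-\pi ^2/(2k)} = h^{1.5}\).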
Remark 2.3
We emphasize that (under the same assumptions) both the strong and weak convergence rates remain the same when approximating the solution u to
by \(u_{h,k}^Q := \sigma \, Q_{h,k}^\beta (\Pi _h g + {\mathscr {W}}_h^\Phi )\), where \(\sigma > 0\) is a constant parameter which scales the variance of u. This can be seen from the equality \(\sigma ^{-1} L^\beta = L_\sigma ^{\beta }\) for \(L_\sigma := \sigma ^{-1/\beta } L\), combined with the fact that the eigenvalues of the operator \(L_\sigma \) satisfy the growth assumption (2.2) with the same exponent \(\alpha > 0\) as the eigenvalues of L.
However, the constants \(c_\lambda , C_\lambda > 0\) in (2.2) and the constants in the error estimates change. For instance, if \(\varphi (u) := \Vert u \Vert _{ H }^{p_{*}}\) for \(p_{*}\in \mathbb {N}\), then the constant \(C>0\) in (2.8) will depend linearly on \(\sigma ^{p_{*}}\).
Note that one has to consider a problem of the form
when approximating a Matérn field with variance \(\sigma _{*}^2\). Here and in what follows, \(\Gamma (\,\cdot \,)\) denotes the Gamma function.
Remark 2.4
We also comment on how the error bound in (2.8) changes if instead of the family \((Q_{h,k}^\beta )_{k>0}\) a different sequence of approximations \(\{R_{h,n}^\beta \}_{n\in \mathbb {N}}\) of \(L_h^{-\beta }\) is used. If there exists a function \(E:\mathbb {N}\rightarrow \mathbb {R}_{\ge 0}\) such that \(\lim _{n\rightarrow \infty } E(n) = 0\) as well as a constant \(C>0\), independent of h and n, such that
it is an immediate consequence of the arguments in our proof that a bound of the weak error for the approximation \(u^R_{h,n} := R_{h,n}^\beta (\Pi _hg + {\mathscr {W}}_h^\Phi )\) is given by
An example of such a family \(\{R_{h,n}^\beta \}_{n\in \mathbb {N}}\) are the approximations of \(L_h^{-\beta }\) proposed in [3], which are based on rational approximations of the function \(x^{-\beta }\) of different degrees \(n\in \mathbb {N}\).
3 The derivation of Theorem 2.1
The main idea in our derivation of the weak error estimate (2.8) is to introduce two time-dependent stochastic processes with the property that their (random) values at time \(t=1\) have the same distribution as the solution u to (2.1) and its approximation \(u_{h,k}^Q\) in (2.3), respectively. We then use an associated Kolmogorov backward equation and Itô calculus to estimate the difference between these values.
3.1 The extension to time-dependent processes
Recall the eigenvalue-eigenvector pairs \(\{ (\lambda _j, e_j) \}_{j\in \mathbb {N}}\) of L as well as the parameter \(\alpha >0\) determining the growth of the eigenvalues via (2.2). In what follows, we assume that \(g\in H\) and \(2\alpha \beta > 1\) so that the solution u to (2.1) satisfies \(u\in L_2(\Omega ; H)\). With the aim of introducing the time-dependent processes mentioned above, we start by defining
where \(\{B_j\}_{j \in \mathbb {N}}\) is a sequence of independent real-valued Brownian motions adapted to a filtration \({\mathscr {F}}:= ({\mathscr {F}}_t, \ t\ge 0)\). Owing to this construction, \((W^{\beta }(t), \ t\ge 0)\) is an \({\mathscr {F}}\)-adapted H-valued Wiener process with covariance operator \(L^{-2\beta }\), which is of trace-class if \(2\alpha \beta > 1\). Since the random variables \(\{B_j(1)\}_{j \in \mathbb {N}}\) are independent and identically \({\mathscr {N}}(0,1)\)-distributed, the spatial white noise \({\mathscr {W}}\) satisfies
The stochastic process \(Y := (Y(t), \ t \in [0,1])\) defined as the (strong) solution to the stochastic partial differential equation
therefore takes the following random value in H at time \(t=1\),
Its Gaussian distribution implies the existence of all moments, as shown in the following lemma.
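The trace-class condition \(2\alpha \beta > 1\) can be made concrete for a simple operator. The sketch below assumes, purely for illustration, \(L = \kappa ^2 - \mathrm {d}^2/\mathrm {d}x^2\) on \((0,1)\) with homogeneous Dirichlet conditions, whose eigenvalues are \(\lambda _j = \kappa ^2 + (j\pi )^2\), so that \(\alpha = 2\). It evaluates a truncation of \({\text {tr}}(L^{-2\beta }) = \sum _j \lambda _j^{-2\beta }\), which is the variance \(\mathbb {E}[ \Vert W^\beta (1) \Vert _{ H }^2 ]\) of the Wiener process at time one. The function names are hypothetical.

```python
import math


def dirichlet_eigenvalue(j, kappa=0.0):
    """j-th eigenvalue of kappa^2 - d^2/dx^2 on (0, 1) with Dirichlet conditions."""
    return kappa**2 + (j * math.pi) ** 2


def truncated_trace(beta, n_terms, kappa=0.0):
    """Partial sum of tr(L^{-2 beta}) over the first n_terms eigenvalues."""
    return sum(
        dirichlet_eigenvalue(j, kappa) ** (-2.0 * beta) for j in range(1, n_terms + 1)
    )
```

For \(\kappa = 0\) and \(\beta = 1/2\) the series is \(\sum _j (j\pi )^{-2} = 1/6\), which the partial sums approach from below. For \(\beta \le 1/4\) the partial sums grow without bound, in line with the requirement \(2\alpha \beta > 1\).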
Lemma 3.1
Let \(p\in \mathbb {N}\), \(t\in [0,1]\), and Y be the strong solution of (3.1). Then the p-th moment of Y(t) exists and, for \(p \ge 2\), it admits the following bound:
Here, \(\mu _p := \mathbb {E}[ |Z|^p ] = \sqrt{\frac{2^p}{\pi }} \, \Gamma \left( \frac{p+1}{2}\right) \) is the p-th absolute moment of \(Z\sim {\mathscr {N}}(0,1)\).
Proof
For \(p=2\), the bound in (3.3) follows from the Itô isometry [15, Thm. 8.7(i)]:
If \(p\ge 3\), we estimate \(\mathbb {E}[ \Vert Y(t) \Vert _{ H }^p ] \le 2^{p-1} ( \Vert L^{-\beta }g \Vert _{ H }^p + \mathbb {E}[ \Vert W^\beta (t) \Vert _{ H }^p ])\). By Jensen’s inequality we have
Thus, the distribution of \(\{B_j(t)\}_{j\in \mathbb {N}}\) implies that \(\mathbb {E}[ \Vert W^\beta (t) \Vert _{ H }^p ] \le {t^{p/2} \mu _p} {\text {tr}}( L^{-2\beta } )^{p/2}\), and assertion (3.3) follows. \(\square \)
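The constants in this proof are easy to evaluate. The following sketch computes the absolute moment \(\mu _p\) from the formula in Lemma 3.1 and the bound \(t^{p/2} \mu _p \, {\text {tr}}(L^{-2\beta })^{p/2}\) for \(\mathbb {E}[ \Vert W^\beta (t) \Vert _{ H }^p ]\) stated in the last step of the proof. The function names are hypothetical, and the trace is passed in as a number.

```python
import math


def abs_moment(p):
    """p-th absolute moment of a standard normal: sqrt(2^p / pi) * Gamma((p + 1) / 2)."""
    return math.sqrt(2.0**p / math.pi) * math.gamma((p + 1) / 2.0)


def wiener_moment_bound(t, p, trace):
    """Bound t^{p/2} * mu_p * trace^{p/2} on the p-th moment of ||W^beta(t)||_H."""
    return t ** (p / 2.0) * abs_moment(p) * trace ** (p / 2.0)
```

As a sanity check, the formula reproduces the familiar values \(\mu _2 = 1\) and \(\mu _4 = 3\).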
In order to define another stochastic process \({\widetilde{Y}} := ({\widetilde{Y}}(t), \ t \in [0,1])\) with the property \({\widetilde{Y}}(1) \overset{d}{=} u_{h,k}^Q\) in H, we recall the orthonormal eigenbasis \({\mathscr {E}}= \{e_{j,h}\}_{j=1}^{N_h} \subset V_h\) of \(L_h\) and define \(P_h^\beta :H \rightarrow V_h\) for \(\beta \in (0,1)\) by
Since \(V_h\) is finite-dimensional, the operator \(Q_{h,k}^\beta :V_h \rightarrow V_h\) in (2.4) is bounded, \(Q_{h,k}^\beta \in {\mathscr {L}}(V_h)\) for short, with norm
We now consider the following stochastic partial differential equation
Note that the reproducing kernel Hilbert space of \(W^\beta \) is \({\dot{H}}^{2\beta }\). The finite rank of the operator \(Q_{h,k}^{\beta } P_h^\beta :H \rightarrow V_h\) implies that it is a Hilbert–Schmidt operator from \({\dot{H}}^{2\beta }\) to H. For this reason, existence and uniqueness of a (strong) solution \({\widetilde{Y}}\) to (3.5) is evident. Furthermore, the solution process \({\widetilde{Y}}\) satisfies
where \({\mathscr {W}}_h^{\mathscr {E}}:= \sum _{j=1}^{N_h} B_j(1) \, e_{j,h}\). To see that also \({\widetilde{Y}}(1) \overset{d}{=} u_{h,k}^Q\) holds in H, define the deterministic matrix \({\mathbf {R}}\) and the random vector \({\mathbf {B}}_1\) by
i.e., \({\mathbf {B}}_1\) is the vector of the first \(N_h\) Brownian motions at time \(t=1\). Due to
the vector \(\varvec{\xi }:= {\mathbf {R}}^{-1} {\mathbf {B}}_1\) is \({\mathscr {N}}({\mathbf {0}},{\mathbf {M}}^{-1})\)-distributed. In addition, by [4, Lem. 2.8] the \(V_h\)-valued random variables
are equal in \(L_2(\Omega ; H)\). In particular, their first and second moments coincide. Since \({\mathscr {W}}_h^{\mathscr {E}}\) and \({\mathscr {W}}_h^\Phi \) are Gaussian random variables, their distributions are uniquely characterized by their first two moments and we conclude that
3.2 The Kolmogorov backward equation and partition of the error
With the aim of bounding the weak error in (2.5) by means of Itô calculus, we introduce the following Kolmogorov backward equation associated with the stochastic partial differential equation (3.1) for Y and the function \(\varphi \) by
Here, \(w_x := D_x w\) and \(w_{xx} := D^2_x w\) denote the first and second order Fréchet derivative of w with respect to \(x \in H\). It is well-known [9, Rem. 3.2.1, Thm. 3.2.3] that the solution \(w :[0,1] \times H \rightarrow \mathbb {R}\) to (3.7) is given in terms of the stochastic process Y in (3.1) by the following expectation
Since \(\varphi :H \rightarrow \mathbb {R}\) is twice continuously Fréchet differentiable, we can furthermore express the first two derivatives of w with respect to x in terms of \(\varphi \) and Y by
Let \({\widetilde{Y}}\) be the solution to (3.5). The application of Itô’s lemma [7] to the stochastic process \((w(t,{\widetilde{Y}}(t)), \ t\in [0,1])\) yields
where, for \(T\in {\mathscr {L}}(H)\), the H-adjoint operator is denoted by \(T^*\). To simplify the second term in (3.11), we define the operator \({\widetilde{\Pi }}_h:H \rightarrow V_h\) by
Note that in contrast to the H-orthogonal projection \(\Pi _h\), the operator \({\widetilde{\Pi }}_h\) is neither self-adjoint (\({\widetilde{\Pi }}_h^* \ne {\widetilde{\Pi }}_h\)) nor a projection (\({\widetilde{\Pi }}_h^2 \ne {\widetilde{\Pi }}_h\)). We then use the following relation between \({\widetilde{\Pi }}_h\) and \(P_h^\beta \) from (3.4),
and express (3.11) as an integral equation for \(t=1\). Taking the expectation on both sides of this equation yields
since \({\widetilde{Y}}(0) = Q_{h,k}^\beta \Pi _hg\) by (3.5) and \(w_t( t,{\widetilde{Y}}(t) ) = - \tfrac{1}{2} {\text {tr}}\bigl ( w_{xx}( t, {\widetilde{Y}}(t) ) L^{-2\beta } \bigr )\) by (3.7).
As a final step in this subsection, we relate the quantity of interest \(\mathbb {E}[\varphi (u)]\) to the expected value of w(1, Y(1)), and similarly the approximation \(\mathbb {E}[\varphi (u_{h,k}^Q)]\) to the expected value of \(w(1,{\widetilde{Y}}(1))\). For this purpose, we extend the equalities in (3.8)–(3.10) to the case that \(x=\xi \) is an H-valued random variable in the following lemma.
Lemma 3.2
Let Assumption 2.1 (v) be satisfied. Then, for every \(t\in [0,1]\) and any \({\mathscr {F}}_t\)-measurable random variable \(\xi \in L_{p+2}(\Omega ; H)\), it holds
Proof
For \(k=0\), this identity follows from [11, Lem. 4.1] with \(N=p+2\), \(\xi _1 = \xi \) and \(\xi _2 = Y(1) - Y(t)\), since \(Y(t) \in L_{p+2}(\Omega ; H)\) for all \(t\in [0,1]\) by Lemma 3.1 and \(|\varphi (x)| \lesssim 1 + \Vert x \Vert _{ H }^{p+2}\) as a consequence of (2.6).
Furthermore, for \(y,\,z\in H\), we define \(\varphi _{y}, \varphi _{y,\,z} :H \rightarrow \mathbb {R}\) by
Since the inner product is bilinear and continuous with respect to both components, we find with (3.9)–(3.10) that
Thus, again applying [11, Lem. 4.1] for \(\xi _1 = \xi \) and \(\xi _2 = Y(1) - Y(t)\) as well as \(N=p+1\) and \(N=p\), respectively, yields
by bilinearity and continuity of the inner product. The separability of H and the arbitrary choice of \(y,\,z\in H\) complete the proof of the assertion for \(k\in \{1,2\}\). \(\square \)
Owing to Lemma 3.2 and the tower property for conditional expectations, the stochastic process \((w(t,Y(t)), \ t\in [0,1])\) has no drift, i.e.,
Furthermore, it follows with (3.2) and (3.6) that
Summing up the observations in (3.13)–(3.16), we find that the difference between the quantity of interest \(\mathbb {E}[\varphi (u)]\) and the expected value of the approximation \(\varphi (u_{h,k}^Q)\) can be expressed by
This equality implies that the weak error (2.5) admits the following upper bound
where we set \({\widetilde{Q}}_{h,k}^\beta := Q_{h,k}^{\beta } {\widetilde{\Pi }}_h\) and \({\widetilde{L}}_{h}^{-\beta }:= L_h^{-\beta } {\widetilde{\Pi }}_h\).
The following subsections are structured as follows: In Sect. 3.3 we bound the deterministic error \( \Vert (L^{-\beta } - L_h^{-\beta } \Pi _h)g \Vert _{ H }\) caused by the finite element discretization. This result is essential for estimating the first error term (I) in (3.17). Secondly, we investigate the terms (II) and (III) stemming from applying the quadrature operator \(Q_{h,k}^\beta \) instead of the discrete fractional inverse \(L_h^{-\beta }\) in Sect. 3.4. Finally, in Sect. 3.5 we estimate the trace in (IV) and combine all our results to prove Theorem 2.1.
3.3 The deterministic finite element error
In this subsection we focus on the deterministic error \( \Vert (L^{-\beta } - L_h^{-\beta } \Pi _h)g \Vert _{ H }\) caused by the inhomogeneity g. More precisely, we derive an explicit rate of convergence depending on the \({\dot{H}}^{\theta }\)-regularity of g in Lemma 3.3 below. Subsequently, in Lemma 3.4, we apply this result to bound the first term of (3.17).
Lemma 3.3
Suppose Assumption 2.1(iv) is satisfied. Set \(\theta _{*} := d(2\alpha \beta - 1) - 2\beta \) and let \(\theta > \min \{ \theta _{*}, s - 2\beta \}\) if \(\theta _{*} \ge 0\), and set \(\theta = 0\) otherwise. Then there exists a constant \(C>0\), independent of h, such that
for all \(g\in {\dot{H}}^{\theta }\) and sufficiently small \(h\in (0,h_0)\).
Proof
By applying [14, Ch. 2, Eq. (6.9)] to the negative fractional powers of L and \(L_h\), we find
Thus, Assumption 2.1(iv) yields for \(0 \le \theta _j \le \sigma _j \le s\) (\(j=1,2\)) the estimate
If \(\theta _{*} \ge 0\), we let \(\varepsilon >0\) be such that \(\theta = \min \{\theta _{*}, s-2\beta \} + \varepsilon \) and we choose \(\sigma _1 := \min \{d(2\alpha \beta -1),s\}\), \(\sigma _2 := s\), \(\theta _1 := \min \{ \theta , \sigma _1 \}\), and \(\theta _2 := 0\). We then obtain \(\theta _1 - \sigma _1 = \min \{-2\beta + \varepsilon , 0\}\) and
For \(\theta _{*} < 0\), we instead set \(\sigma _1 := d(2\alpha \beta -1)\), \(\sigma _2 := s\), \(\theta _1 := 0\), \(\theta _2 := 0\), and we conclude in a similar way that
Since in both cases \(\max \{ \Vert g \Vert _{ \theta _1 }, \Vert g \Vert _{ \theta _2 } \} \le \Vert g \Vert _{ \theta }\) with \(\theta \) defined as in the statement of the lemma, the bound (3.18) follows. \(\square \)
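The case distinction for the regularity of g in Lemma 3.3 can be summarized in a small helper. This is a sketch with hypothetical names; it returns the threshold for \(\theta \), a flag indicating whether the threshold has to be exceeded strictly, and the rate \(\min \{d(2\alpha \beta -1), s\}\) that results from the choices of \(\sigma _1\) made in the proof.

```python
def lemma_3_3_regularity(alpha, beta, d, s):
    """Return (threshold, strict, rate) for the regularity index theta of g.

    If strict is True, theta must be strictly larger than threshold;
    otherwise theta = threshold = 0 is admissible.
    """
    stochastic_rate = d * (2.0 * alpha * beta - 1.0)
    theta_star = stochastic_rate - 2.0 * beta
    rate = min(stochastic_rate, s)
    if theta_star >= 0.0:
        return min(theta_star, s - 2.0 * beta), True, rate
    return 0.0, False, rate
```

For the setting of Example 2.1 with \(d=2\) and \(\beta = 0.75\), the index \(\theta _{*}\) is negative, so \(g \in H\) suffices for the rate 1. For \(d=1\) and \(\beta = 0.75\), the rate 2 requires \(g \in {\dot{H}}^{\theta }\) with \(\theta > 0.5\).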
Remark 3.1
We note that by letting \(\sigma _1 = \sigma _2 := s\), \(\theta _1 := s-2\beta +\varepsilon \), and \(\theta _2 := 0\) in the proof of Lemma 3.3 the optimal convergence rate for the deterministic error,
can be derived. The error estimate (3.18) is formulated in such a way that the smoothness \(\theta \ge 0\) of \(g\in {\dot{H}}^{\theta }\) is minimal for convergence with the rate \(\min \{d(2\alpha \beta -1), s\}\), which will dominate the overall weak error, stemming from the term (IV) in the partition (3.17), see Lemma 3.8.
We furthermore remark that the convergence result of Lemma 3.3 is in accordance with the result of [6, Thm. 4.3]. There the self-adjoint positive definite operator L is induced by an \(H_0^1({\mathscr {D}})\)-coercive, symmetric bilinear form A:
where \(0 < a_0 \le a({\mathbf {x}}) \le a_1\), \(H := L_2({\mathscr {D}})\), \({\dot{H}}^{1} := H^1_0({\mathscr {D}})\) and \(\mathscr {D}\) is a bounded polygonal domain in \(\mathbb {R}^d\), \(d\in \{1,2,3\}\), with Lipschitz boundary. The discrete spaces \((V_h)_h\) considered in [6] are the finite element spaces with continuous piecewise linear basis functions defined with respect to a quasi-uniform family of triangulations. The convergence rate for the error \( \Vert (L^{-\beta } - L_h^{-\beta }\Pi _h)g \Vert _{ H }\) derived in [6, Thm. 4.3] is \(2 \tau \), if \(g\in {\dot{H}}^{\theta }\) for \(\theta > 2 (\tau - \beta )\), if \(\tau \ge \beta \), and \(\theta = 0\) otherwise. Here, \(\tau \in (0,1]\) is such that the operators
are bounded with respect to the intermediate Sobolev spaces
where \(H^{-1}({\mathscr {D}}) = {\dot{H}}^{-1}\) is the dual space of \(H^1_0({\mathscr {D}}) = {\dot{H}}^{1}\) and \([\cdot , \cdot ]_{\varrho ,q}\) denotes the real K-interpolation method.
According to this result of [6], the convergence rate \(2\min \{d(\alpha \beta -1/2), 1\}\) can be achieved if g is \({\dot{H}}^{\theta }\)-regular, where \(\theta > \theta _*\) if \(\theta _* := 2 (\min \{d(\alpha \beta -1/2),1\} - \beta ) \ge 0\), and \(\theta = 0\) if \(\theta _* < 0\). A comparison with (3.18) in Lemma 3.3 shows that the error estimates and regularity assumptions coincide for this particular case, since \(s=2\) for the choice of finite-dimensional subspaces \((V_h)_h\) in [6] specified above.
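As a worked instance of this comparison, one may substitute the values of the model problem of Sect. 4, where \(\alpha = 2/d\) and \(s=2\). The following computation only inserts these values into the expressions above; the sample value \(\beta = 3/4\) is chosen here purely for illustration.

```latex
\begin{align*}
2\min\{d(\alpha\beta - 1/2),\, 1\}
  &= \min\{d(2\alpha\beta - 1),\, 2\}
   = \min\{4\beta - d,\, 2\}
   && \text{for } \alpha = 2/d, \\
\theta_* &= \min\{4\beta - d,\, 2\} - 2\beta .
\end{align*}
% Illustration with assumed sample values:
%   d = 1, \beta = 3/4:  rate = \min\{2, 2\} = 2,  \theta_* = 2 - 3/2 = 1/2 \ge 0,
%                        so g \in \dot{H}^{\theta} with \theta > 1/2 is needed.
%   d = 2, \beta = 3/4:  rate = \min\{1, 2\} = 1,  \theta_* = 1 - 3/2 < 0,
%                        so \theta = 0, i.e., g \in H suffices.
```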
Having bounded the error between \(L^{-\beta }g\) and \(L_h^{-\beta }\Pi _hg\), an estimate of the first error term (I) in (3.17) is an immediate consequence of the fundamental theorem of calculus and the chain rule for Fréchet derivatives. This bound is formulated in the next lemma.
Lemma 3.4
Let Assumptions 2.1 (iv)–(v) be satisfied and \(2\alpha \beta >1\). Define \(\theta \ge 0\) as in Lemma 3.3. Then there exists a constant \(C>0\), independent of h, such that
for all \(g\in {\dot{H}}^{\theta }\) and sufficiently small \(h\in (0,h_0)\).
Proof
Since the mapping \(x \mapsto w(0,x)\) is Fréchet differentiable, we obtain by the fundamental theorem of calculus and the Cauchy–Schwarz inequality
A bound for the first term is given by (3.18) in Lemma 3.3. For the second term, we use (3.9), \(Y(0) = L^{-\beta }g\), and the polynomial growth (2.6) of \(D^2\varphi \) to estimate
for all \(t\in [0,1]\). The boundedness (3.3) of the \((p+1)\)-th moment of Y(1) completes the proof, since the trace of \(L^{-2\beta }\) is finite if \(2\alpha \beta > 1\). \(\square \)
3.4 The quadrature approximation
In this subsection we address the error terms (II) and (III) in (3.17), which are induced by the quadrature approximation \(Q_{h,k}^\beta \) of \(L_h^{-\beta }\). To this end, we start by stating the following result of [6, Lem. 3.4, Thm. 3.5] that bounds the error between the two operators on \(V_h\).
Lemma 3.5
The approximation \(Q_{h,k}^\beta :V_h \rightarrow V_h\) of \(L_h^{-\beta }\) in (2.4) admits the bound
and it is bounded, \( \Vert Q_{h,k}^\beta \Vert _{ {\mathscr {L}}(V_h) } \le C'\), for sufficiently small \(h\in (0,h_0)\) and \(k\in (0,k_0)\), where the constants \(C,C'>0\) depend only on \(\beta \) and the smallest eigenvalue of L.
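The definition (2.4) of \(Q_{h,k}^\beta \) is not reproduced in this section. As a minimal sketch, assuming that it is the sinc quadrature of [6] applied to the integral representation \(\lambda ^{-\beta } = \frac{2\sin (\pi \beta )}{\pi } \int _{-\infty }^{\infty } e^{2\beta y} (1 + e^{2y}\lambda )^{-1} \, \mathrm {d}y\), with nodes \(y_\ell = \ell k\) and truncation indices \(K^- = \lceil \pi ^2/(4\beta k^2) \rceil \), \(K^+ = \lceil \pi ^2/(4(1-\beta ) k^2) \rceil \), the scalar analogue (a single eigenvalue \(\lambda \) in place of \(L_h\)) can be checked numerically. The function name and the sample values are illustrative choices, not taken from the text.

```python
import math


def sinc_quadrature_inverse_power(lam, beta, k):
    """Approximate lam**(-beta), 0 < beta < 1, by a truncated trapezoidal sum.

    Assumed form (scalar analogue of the operator quadrature):
        (2 k sin(pi beta) / pi) * sum_l exp(2 beta y_l) / (1 + exp(2 y_l) lam),
    with y_l = l * k and l = -K_minus, ..., K_plus.
    """
    k_minus = math.ceil(math.pi**2 / (4.0 * beta * k**2))
    k_plus = math.ceil(math.pi**2 / (4.0 * (1.0 - beta) * k**2))
    total = 0.0
    for ell in range(-k_minus, k_plus + 1):
        y = ell * k
        total += math.exp(2.0 * beta * y) / (1.0 + math.exp(2.0 * y) * lam)
    return 2.0 * k * math.sin(math.pi * beta) / math.pi * total


if __name__ == "__main__":
    lam, beta = 2.0, 0.7  # illustrative values
    for k in (1.0, 0.5, 0.25):
        err = abs(sinc_quadrature_inverse_power(lam, beta, k) - lam**(-beta))
        print(f"k = {k:5.2f}   error = {err:.3e}")
```

Halving k should reduce the error by several orders of magnitude, in line with the exponential factor \(e^{-\pi ^2/(2k)}\) that appears in the proof of Lemma 3.7.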
In the following, we use this error estimate of the quadrature approximation \(Q_{h,k}^\beta \) for bounding the second term of (3.17) in Lemma 3.6 as well as the trace occurring in the third term of (3.17) in Lemma 3.7.
Lemma 3.6
Suppose that Assumption 2.1(v) is satisfied and that \(2\alpha \beta > 1\). Then there exists a constant \(C>0\), independent of h and k, such that
for all \(g\in H\) and sufficiently small \(h\in (0,h_0)\) and \(k\in (0,k_0)\).
Proof
As in the proof of Lemma 3.4, we apply the fundamental theorem of calculus and the chain rule for Fréchet derivatives. By (3.9) and Lemma 3.5 we then find
Again, the proof is completed by (3.3) and the fact that \({\text {tr}}(L^{-2\beta }) < \infty \). \(\square \)
Lemma 3.7
Let Assumptions 2.1(i)–(iii) be satisfied. Then there exists a constant \(C>0\), independent of h and k, such that
for every self-adjoint \(T \in {\mathscr {L}}(H)\) and sufficiently small \(h\in (0,h_0)\) and \(k\in (0,k_0)\). Here, the function \(f_{\alpha ,\beta }\) is defined as in Theorem 2.1.
Proof
By the definition of \({\widetilde{\Pi }}_h\) in (3.12) we have
Therefore, the trace of interest simplifies to a finite sum,
where the second equality follows from the self-adjointness of \(T\in {\mathscr {L}}(H)\).
Applying the Cauchy–Schwarz inequality and Lemma 3.5 to the first sum yields the following upper bound
By Assumption 2.1(i) we thus have \(| S_1 |\lesssim e^{-\frac{\pi ^2}{k}} h^{-d} \Vert T \Vert _{ {\mathscr {L}}(H) }\).
The second sum can be bounded by
Finally, due to the approximation property of the discrete eigenvalues \(\lambda _{j,h}\) in Assumption 2.1(ii) as well as the growth (2.2) of the exact eigenvalues \(\lambda _j\) we obtain \(\lambda _{j,h}^{-\beta } \le \lambda _{j}^{-\beta } \le c_{\lambda }^{-\beta } j^{-\alpha \beta }\) and, for \(\alpha \beta \ne 1\), we find
where we have used Lemma 3.5 and Assumption 2.1(i). If \(\alpha \beta =1\), we instead estimate \(| S_2 | \lesssim e^{- \pi ^2 / (2k)} (1 + |\ln (h)| ) \, \Vert T \Vert _{ {\mathscr {L}}(H) }\). This completes the proof. \(\square \)
3.5 Proof of Theorem 2.1
After having bounded the terms (I), (II), and (III) in the partition (3.17) of the weak error in the previous subsections, we now turn to estimating the final error term (IV). Furthermore, we bound the p-th moment of \({\widetilde{Y}}(t)\), where \({\widetilde{Y}}\) is the solution process of (3.5). We then combine all our results and prove Theorem 2.1.
Lemma 3.8
Let Assumptions 2.1(i)–(iii) be satisfied. Then there exists a constant \(C>0\), independent of h, such that
for every self-adjoint \(T \in {\mathscr {L}}(H)\) and sufficiently small \(h\in (0,h_0)\).
Proof
Similarly to (3.20) we use the self-adjointness of T and rewrite the trace as \({\text {tr}}(T({\widetilde{L}}_{h}^{-\beta }{\widetilde{L}}_{h}^{-\beta *}- L^{-2\beta })) = S_1 + S_2\), where
In order to estimate the terms \(S_1\) and \(S_2\), we note that for \(j\in \{1,\ldots ,N_h\}\)
By the mean value theorem, the existence of \({\widetilde{\lambda }}_j\in (\lambda _j,\lambda _{j,h})\) satisfying \(\lambda _{j}^{-\beta } - \lambda _{j,h}^{-\beta } = \beta {\widetilde{\lambda }}_j^{-\beta -1} (\lambda _{j,h} - \lambda _j)\) is ensured. By Assumption 2.1(ii) we thus have
Owing to (3.19) the series \(S_1\) simplifies to the finite sum
Using (3.21) as well as Assumptions 2.1(i)–(iii), this sum can be bounded by
since \(d \alpha (q-1) \le r\) and \(d \alpha q/2 \le s\) by Assumption 2.1(iii).
For the second term we find
since \({\widetilde{L}}_{h}^{-\beta }e_j = 0\) for \(j>N_h\) by (3.19). Therefore, the application of (3.21) yields
and \(|S_2| \lesssim h^{\min \{d(2\alpha \beta -1),r,s\}} \Vert T \Vert _{ {\mathscr {L}}(H) }\) follows from Assumptions 2.1(i), (iii). \(\square \)
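The mean value argument in the proof above can be illustrated for continuous piecewise linear elements on a uniform mesh of \((0,1)\), where the discrete eigenvalues of \(\kappa ^2 - \varDelta \) with homogeneous Dirichlet conditions are known in closed form. The sketch below assumes that Assumption 2.1(ii), whose statement is not reproduced in this section, has the form \(0 \le \lambda _{j,h} - \lambda _j \le C h^r \lambda _j^q\); the constant 0.1 and all function names are illustrative choices.

```python
import math


def exact_eigenvalue(j, kappa):
    """j-th eigenvalue of kappa^2 - d^2/dx^2 on (0,1) with Dirichlet conditions."""
    return kappa**2 + (math.pi * j)**2


def p1_eigenvalue(j, n_interior, kappa):
    """j-th generalized eigenvalue of (kappa^2 M + S, M) for P1 elements
    on a uniform mesh with n_interior interior nodes (mesh size h)."""
    h = 1.0 / (n_interior + 1)
    theta = j * math.pi * h
    return kappa**2 + 6.0 * (1.0 - math.cos(theta)) / (h**2 * (2.0 + math.cos(theta)))


def mean_value_check(j, n_interior, kappa, beta):
    """Return (gap, bound), where gap = lam^-beta - lam_h^-beta and
    bound = beta * lam^(-beta-1) * (lam_h - lam) is the mean value estimate."""
    lam = exact_eigenvalue(j, kappa)
    lam_h = p1_eigenvalue(j, n_interior, kappa)
    gap = lam**(-beta) - lam_h**(-beta)
    bound = beta * lam**(-beta - 1.0) * (lam_h - lam)
    return gap, bound


if __name__ == "__main__":
    n, kappa, beta = 31, 0.5, 0.75  # illustrative values
    h = 1.0 / (n + 1)
    for j in (1, 8, 16, 31):
        lam, lam_h = exact_eigenvalue(j, kappa), p1_eigenvalue(j, n, kappa)
        gap, bound = mean_value_check(j, n, kappa, beta)
        # Ratio (lam_h - lam) / (h^2 lam^2) stays bounded, i.e., r = q = 2 here.
        print(j, lam_h - lam, (lam_h - lam) / (h**2 * lam**2), gap, bound)
```

For this discretization the printed ratio stays below 0.1 for all modes, which matches the choice \(r = q = 2\) used in Sect. 4.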
Lemma 3.9
Suppose that Assumptions 2.1(i)–(iii) are satisfied. Let \(p\in \mathbb {N}\), \(t\in [0,1]\), and \({\widetilde{Y}}\) be the strong solution of (3.5). Then the p-th moment of \({\widetilde{Y}}(t)\) exists and, for \(p \ge 2\), it admits the following bound:
where the constant \(C>0\) is independent of h and k.
Proof
Since \(P_h^\beta W^\beta (t) = \sum _{j=1}^{N_h} B_j(t) \, e_{j,h}\), we obtain by Lemma 3.5, for any \(p\ge 2\), that
where, again, \(\mu _p := \mathbb {E}[ |Z|^p ]\) denotes the p-th absolute moment of \(Z\sim {\mathscr {N}}(0,1)\) and the constant \(C>0\) is independent of h, k, and p. Furthermore, using \(0 < \lambda _{j} \le \lambda _{j,h}\) of Assumption 2.1(ii) and applying the Hölder inequality gives
where \({\text {tr}}(L^{-2\beta }) < \infty \) by Assumption 2.1(iii). Thus, we obtain for the solution \({\widetilde{Y}}\) of (3.5) that for any \(t\in [0,1]\) the bound
holds. Finally, the assertion follows by the boundedness of \(Q_{h,k}^\beta \) which is uniform in h and k, the finiteness of \({\text {tr}}(L^{-2\beta })\), and Assumption 2.1(i). \(\square \)
Proof (of Theorem 2.1)
Owing to the partition (3.17) and the estimates of the error terms (I)–(IV) in Lemmata 3.4 and 3.6–3.8 we can bound the weak error as follows
since \(w_{xx}(t,x)\in {\mathscr {L}}(H)\) is self-adjoint for every \(t\in [0,1]\) and \(x\in H\). Applying Lemma 3.2 and the tower property for conditional expectations yields
By the polynomial growth (2.6) of \(D^2 \varphi \) and the boundedness of the p-th moments of Y(t) and \({\widetilde{Y}}(t)\) in Lemmata 3.1 and 3.9, respectively, we obtain that
since \({\text {tr}}(L^{-2\beta }) < \infty \). This completes the proof of the weak error estimate in (2.8). \(\square \)
Remark 3.2
Note that, if the first and second Fréchet derivatives of \(\varphi \) are bounded, the estimates of Lemmata 3.1 and 3.9 are not needed and the weak error estimate in (2.8) simplifies to
The calibration of the discretization parameters k and h remains as described in Remark 2.2.
4 An application and numerical experiments
In this section we validate the theoretical results of the previous sections within the scope of a simulation study based on the model for Matérn approximations in (1.1) on the domain \({\mathscr {D}}= (0,1)^d\) for \(d=1,2\), \(\kappa = 0.5\), and \(u=0\) on \(\partial {\mathscr {D}}\), i.e., \(L=\kappa ^2 - \varDelta \) with homogeneous Dirichlet boundary conditions. In this case, the operator L has the following eigenvalue-eigenvector pairs [8, Ch. VI.4]:
where \({\mathbf {j}}=(j_1,\ldots ,j_d)\in \mathbb {N}^d\) is a d-dimensional multi-index. As already mentioned in Example 2.1, these eigenvalues satisfy (2.2) for \(\alpha = 2/d\).
Note that, for every \({\mathbf {x}}\in {\mathscr {D}}\), the solution u satisfies \(u({\mathbf {x}}) \sim {\mathscr {N}}(0,\sigma ({\mathbf {x}})^2)\). Following a Karhunen–Loève expansion of u with respect to the eigenfunctions \(\{e_{{\mathbf {j}}}\}_{{\mathbf {j}}\in \mathbb {N}^d}\) in (4.1), the variance \(\sigma ({\mathbf {x}})^2\) can be expressed explicitly in terms of the eigenvalues and eigenfunctions in (4.1) by
where \(\bigl \{{\widetilde{\xi }}_{{\mathbf {j}}}\bigr \}_{{\mathbf {j}}\in \mathbb {N}^d}\) are independent \({\mathscr {N}}(0,1)\)-distributed random variables.
Considering continuous evaluation functions \(\varphi :L_2({\mathscr {D}}) \rightarrow \mathbb {R}\) of the form
allows us to perform the simulation study without Monte Carlo sampling, since
and the value of \(\mathbb {E}[f(u({\mathbf {x}}))]\) can be derived analytically from \(u({\mathbf {x}}) \sim {\mathscr {N}}(0,\sigma ({\mathbf {x}})^2)\). More precisely, we choose \(f(u) = |u|^p\), \(p=2,3,4\), and \(f(u) = \Phi (20(u-0.5))\), where \(\Phi (\cdot )\) denotes the cumulative distribution function of the standard normal distribution. The latter function is motivated by the probit transform, which is often used to approximate step functions (see, e.g., [1]), in this case \(\mathbb {1}(u>0.5)\). These four functions satisfy Assumption 2.1(v) and we obtain for the quantity of interest,
if \(f(u) = |u|^p\), and
if \(f(u) = \Phi (c(u-a))\) for \(a\in \mathbb {R}\) and \(c>0\).
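The displays (4.3) and (4.4) are not reproduced here. The pointwise expectations they rely on follow from standard Gaussian identities: for \(X \sim {\mathscr {N}}(0,\sigma ^2)\) one has \(\mathbb {E}[|X|^p] = \sigma ^p \, 2^{p/2} \, \Gamma ((p+1)/2) / \sqrt{\pi }\) and \(\mathbb {E}[\Phi (c(X-a))] = \Phi \bigl (-ca/\sqrt{1+c^2\sigma ^2}\bigr )\). The following sketch evaluates these closed forms and compares them with a plain trapezoidal approximation; the function names and sample values are illustrative.

```python
import math


def std_normal_cdf(z):
    """Cumulative distribution function of N(0, 1)."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def abs_moment(sigma, p):
    """E[|X|^p] for X ~ N(0, sigma^2)."""
    return sigma**p * 2.0**(p / 2.0) * math.gamma((p + 1.0) / 2.0) / math.sqrt(math.pi)


def probit_mean(sigma, c, a):
    """E[Phi(c (X - a))] for X ~ N(0, sigma^2)."""
    return std_normal_cdf(-c * a / math.sqrt(1.0 + (c * sigma)**2))


def gauss_expectation(f, sigma, n=20001, width=10.0):
    """Trapezoidal approximation of E[f(X)], X ~ N(0, sigma^2),
    on the interval [-width * sigma, width * sigma] with n nodes."""
    lo = -width * sigma
    step = 2.0 * width * sigma / (n - 1)
    total = 0.0
    for i in range(n):
        x = lo + i * step
        weight = 0.5 if i in (0, n - 1) else 1.0
        total += weight * f(x) * math.exp(-0.5 * (x / sigma)**2)
    return total * step / (sigma * math.sqrt(2.0 * math.pi))


if __name__ == "__main__":
    sigma = 0.3  # illustrative standard deviation
    for p in (2, 3, 4):
        print(p, abs_moment(sigma, p), gauss_expectation(lambda x: abs(x)**p, sigma))
    print(probit_mean(sigma, 20.0, 0.5),
          gauss_expectation(lambda x: std_normal_cdf(20.0 * (x - 0.5)), sigma))
```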
We truncate the series in (4.2) in order to approximate the variance \(\sigma ({\mathbf {x}})^2\),
Here, we choose \(N_{\mathrm {ok}}= 1 + 2^{18}\) for \(d=1\) and \(N_{\mathrm {ok}}= 1 + 2^{11}\) for \(d=2\) so that, in both cases, \(N_{\mathrm {ok}}^d \gg N_h\) for all considered finite element spaces with \(N_h\) basis functions. This estimate of \(\sigma ({\mathbf {x}})\) is used at \(N_{\mathrm {ok}}^d\) equally spaced locations \({\mathbf {x}}\in {\mathscr {D}}\), and the reference solution \(\mathbb {E}[\varphi (u)]\) is then approximated by applying the trapezoidal rule in order to evaluate the integrals in (4.3) and (4.4) numerically.
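The displays (4.1), (4.2), and (4.5) are missing from this text. For \(d=1\), the eigenpairs of \(\kappa ^2 - \varDelta \) on \((0,1)\) with homogeneous Dirichlet conditions are the standard ones, \(\lambda _j = \kappa ^2 + \pi ^2 j^2\) and \(e_j(x) = \sqrt{2}\sin (\pi j x)\). Assuming that the variance series has the Karhunen–Loève form \(\sigma (x)^2 = \sum _j \lambda _j^{-2\beta } e_j(x)^2\), its truncation can be sketched as follows; the function name and the number of terms are illustrative and much smaller than the values used in the study.

```python
import math


def truncated_variance_1d(x, beta, kappa, n_terms):
    """Truncated series sum_{j=1}^{n_terms} lam_j^(-2 beta) e_j(x)^2 on (0,1),
    with lam_j = kappa^2 + (pi j)^2 and e_j(x) = sqrt(2) sin(pi j x)."""
    total = 0.0
    for j in range(1, n_terms + 1):
        lam = kappa**2 + (math.pi * j)**2
        e_j = math.sqrt(2.0) * math.sin(math.pi * j * x)
        total += lam**(-2.0 * beta) * e_j**2
    return total


if __name__ == "__main__":
    beta, kappa = 0.75, 0.5  # illustrative values
    for n_terms in (10, 100, 1000):
        print(n_terms, truncated_variance_1d(0.5, beta, kappa, n_terms))
```

Since the terms decay like \(j^{-4\beta }\), the truncation error of the series is of order \(N^{1-4\beta }\) for N retained terms in this one-dimensional setting.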
We consider (1.1) for \(\beta = 0.6,0.7,0.8,0.9\) and use a finite element discretization based on continuous piecewise linear basis functions with respect to uniform meshes on \({\bar{{\mathscr {D}}}} = [0,1]^d\). We use four different mesh sizes h in each dimension \(d=1,2\), and calibrate the quadrature step size k with h for each value of \(\beta \) by \(k = -1/(\beta \ln h)\). This results in the numbers of basis functions and quadrature nodes shown in Table 1. As already pointed out in Example 2.1, the growth exponent of the eigenvalues is in this case \(\alpha =2/d\), and Assumption 2.1 is satisfied for \(r=s=q=2\). This gives the theoretical value \(\min \{4\beta -d,2\}\) for the weak convergence rate.
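The calibration and the resulting theoretical rate can be tabulated with a few lines of code. The sketch below uses only the relations stated above, \(k = -1/(\beta \ln h)\) and \(\min \{4\beta -d,2\}\); the function names and the sample mesh size are illustrative. With this calibration the exponential quadrature factor becomes a power of h, namely \(e^{-\pi ^2/(2k)} = h^{\pi ^2\beta /2}\).

```python
import math


def quadrature_step(h, beta):
    """Quadrature step size k = -1 / (beta * ln h), for mesh size 0 < h < 1."""
    return -1.0 / (beta * math.log(h))


def theoretical_weak_rate(beta, d):
    """Theoretical weak rate min{4 beta - d, 2} for alpha = 2/d and r = s = q = 2."""
    return min(4.0 * beta - d, 2.0)


def quadrature_error_factor(h, beta):
    """The factor exp(-pi^2 / (2k)) evaluated at the calibrated step size k."""
    return math.exp(-math.pi**2 / (2.0 * quadrature_step(h, beta)))


if __name__ == "__main__":
    h = 2.0**-6  # illustrative mesh size
    for beta in (0.6, 0.7, 0.8, 0.9):
        print(beta, quadrature_step(h, beta),
              theoretical_weak_rate(beta, 1), theoretical_weak_rate(beta, 2),
              quadrature_error_factor(h, beta))
```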
For the computation of \(\mathbb {E}[\varphi (u_{h,k}^Q)]\) we can use the same procedure as for the reference solution in order to avoid Monte Carlo simulations. For this purpose, we have to replace \(\sigma ({\mathbf {x}})^2\) in (4.3) and (4.4) by the variance of the finite element solution, \(\sigma _{h}({\mathbf {x}})^2 = {\text {Var}}(u_{h,k}^Q({\mathbf {x}}))\). To this end, we first assemble the matrix
where \(y_\ell := \ell k\) and \({\mathbf {M}}, {\mathbf {S}}\in \mathbb {R}^{N_h\times N_h}\) are the mass matrix and the stiffness matrix with respect to the finite element basis \(\{\phi _{j,h}\}_{j=1}^{N_h}\) with entries
If we let \(\varvec{\phi }_h({\mathbf {x}}) := (\phi _{1,h}({\mathbf {x}}),\ldots , \phi _{N_h,h}({\mathbf {x}}))^{T}\) denote the vector of the finite element basis functions evaluated at \({\mathbf {x}}\in {\mathscr {D}}\) and \({\mathbf {b}}:= ( ( {\mathscr {W}}_h^\Phi , \phi _{j,h} )_{ L_2({\mathscr {D}}) })_{j=1}^{N_h} \sim {\mathscr {N}}({\mathbf {0}},{\mathbf {M}})\), the variance \(\sigma _h({\mathbf {x}})^2\) is given by
The computation of \(\sigma _h({\mathbf {x}})^2\) at the same \(N_{\mathrm {ok}}^d\) locations as for the reference solution again enables a numerical evaluation of the integrals in (4.3) and (4.4) via the trapezoidal rule for approximating \(\mathbb {E}[\varphi (u_{h,k}^Q)]\).
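The matrix in (4.6) and the variance formula in (4.7) are not reproduced in this text. The following is a minimal sketch for \(d=1\) under two assumptions: that the matrix has the form \({\mathbf {Q}}_{h,k}^{\beta } = \frac{2k\sin (\pi \beta )}{\pi } \sum _{\ell =-K^-}^{K^+} e^{2\beta y_\ell } ({\mathbf {M}} + e^{2y_\ell }(\kappa ^2{\mathbf {M}} + {\mathbf {S}}))^{-1}\) with \({\mathbf {S}}\) the stiffness matrix of the gradient term and \(K^\pm \) as in [6], and that \(u_{h,k}^Q({\mathbf {x}}) = \varvec{\phi }_h({\mathbf {x}})^{T} {\mathbf {Q}}_{h,k}^{\beta } {\mathbf {b}}\), so that \({\mathbf {b}} \sim {\mathscr {N}}({\mathbf {0}},{\mathbf {M}})\) gives \(\sigma _h({\mathbf {x}})^2 = \varvec{\phi }_h({\mathbf {x}})^{T} {\mathbf {Q}}_{h,k}^{\beta } {\mathbf {M}} ({\mathbf {Q}}_{h,k}^{\beta })^{T} \varvec{\phi }_h({\mathbf {x}})\). The code uses small dense matrices in pure Python and is not intended for realistic mesh sizes; all names are illustrative.

```python
import math


def solve(A, B):
    """Solve A X = B for small dense matrices (Gauss-Jordan, partial pivoting)."""
    n = len(A)
    aug = [list(A[i]) + list(B[i]) for i in range(n)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[piv] = aug[piv], aug[col]
        pivot = aug[col][col]
        aug[col] = [v / pivot for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [vr - f * vc for vr, vc in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def fe_nodal_variances(n, beta, kappa, k):
    """Nodal variances diag(Q M Q^T) for P1 elements with n interior nodes on (0,1)."""
    h = 1.0 / (n + 1)
    M = [[0.0] * n for _ in range(n)]  # mass matrix
    S = [[0.0] * n for _ in range(n)]  # stiffness matrix of the gradient term
    for i in range(n):
        M[i][i], S[i][i] = 2.0 * h / 3.0, 2.0 / h
        if i + 1 < n:
            M[i][i + 1] = M[i + 1][i] = h / 6.0
            S[i][i + 1] = S[i + 1][i] = -1.0 / h
    eye = [[float(i == j) for j in range(n)] for i in range(n)]
    k_minus = math.ceil(math.pi**2 / (4.0 * beta * k**2))
    k_plus = math.ceil(math.pi**2 / (4.0 * (1.0 - beta) * k**2))
    Q = [[0.0] * n for _ in range(n)]
    for ell in range(-k_minus, k_plus + 1):
        y = ell * k
        A = [[M[i][j] + math.exp(2.0 * y) * (kappa**2 * M[i][j] + S[i][j])
              for j in range(n)] for i in range(n)]
        A_inv = solve(A, eye)
        w = math.exp(2.0 * beta * y)
        for i in range(n):
            for j in range(n):
                Q[i][j] += w * A_inv[i][j]
    scale = 2.0 * k * math.sin(math.pi * beta) / math.pi
    Q = [[scale * q for q in row] for row in Q]
    QM = [[sum(Q[i][m] * M[m][j] for m in range(n)) for j in range(n)] for i in range(n)]
    # At a mesh node x_i the basis vector phi_h(x_i) is the i-th unit vector.
    return [sum(QM[i][j] * Q[i][j] for j in range(n)) for i in range(n)]


if __name__ == "__main__":
    n, beta, kappa = 15, 0.75, 0.5  # illustrative values
    k = -1.0 / (beta * math.log(1.0 / (n + 1)))
    print(fe_nodal_variances(n, beta, kappa, k)[n // 2])  # variance at x = 0.5
```

The nodal value at \(x = 0.5\) can be compared with the truncated series for \(\sigma (x)^2\); for this coarse mesh the two should agree to within a few percent.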
The resulting observed weak errors \(\mathrm {err}_{\ell } := |\mathbb {E}[\varphi (u)] - \mathbb {E}[\varphi (u_{h_{\ell },k}^Q)]|\) are shown in Fig. 1. For each function \(\varphi \) and for each value of \(\beta \), we compute the empirical convergence rate \(\mathrm {r}\) by a least-squares fit of a line \(\mathrm {c} + \mathrm {r} \ln h\) to the data set \(\{h_{\ell }, \mathrm {err}_{\ell }\}\). The results are shown in Table 2 and can be seen to validate the theoretical rates given in Theorem 2.1 for \(d=1\). For \(d=2\), the observed rates deviate slightly from the theoretical rates for \(\beta =0.9\), which is caused by the fact that we had to use coarser finite element meshes for \(d=2\) than for \(d=1\) in order to be able to assemble the dense matrices \({\mathbf {Q}}_{h,k}^{\beta } \in \mathbb {R}^{N_h\times N_h}\) for performing the simulation study without Monte Carlo simulations.
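The rate fit can be sketched as follows, under the assumption that the line \(\mathrm {c} + \mathrm {r} \ln h\) is fitted to the logarithms \(\ln \mathrm {err}_{\ell }\) of the observed errors. The data in the usage example are synthetic and serve only to check the routine; they are not the errors reported in Fig. 1 or Table 2.

```python
import math


def empirical_rate(hs, errs):
    """Least-squares fit of ln(err) = c + r * ln(h); returns (c, r)."""
    xs = [math.log(h) for h in hs]
    ys = [math.log(e) for e in errs]
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxx = sum((x - x_mean)**2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    r = sxy / sxx
    c = y_mean - r * x_mean
    return c, r


if __name__ == "__main__":
    hs = [2.0**-5, 2.0**-6, 2.0**-7, 2.0**-8]  # synthetic mesh sizes
    errs = [3.0 * h**1.4 for h in hs]          # synthetic errors with rate 1.4
    print(empirical_rate(hs, errs))
```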
5 Conclusion
Gaussian random fields are of great importance as models in spatial statistics. A popular method for reducing the computational cost of the operations needed for statistical inference is to represent the Gaussian field as a solution to an SPDE. In this work, we have investigated a recent extension of this approach to Gaussian random fields with general smoothness proposed in [4]. The method considers the fractional order equation (2.1) and is based on combining a finite element discretization in space with the quadrature approximation (2.4) of the inverse fractional power operator. This yields an approximate solution \(u_{h,k}^Q\) of the SPDE, which in [4] was shown to converge to the solution u of (2.1) in the strong mean-square sense with rate (2.7).
In many applications one is mostly interested in a quantity of the random field u that can be expressed as \(\varphi (u)\) for some real-valued function \(\varphi \). For this reason, the focus of the present work has been the weak error \(|\mathbb {E}[\varphi (u)] - \mathbb {E}[\varphi (u_{h,k}^Q)] |\). The main outcome of this article, Theorem 2.1, shows convergence of this type of error to zero at an explicit rate for twice continuously Fréchet differentiable functions \(\varphi \) whose second derivative has polynomial growth. Notably, the component of the convergence rate stemming from the stochasticity of the problem is doubled compared to the strong convergence rate (2.7) derived in [4]. For proving this result, we have performed a rigorous error analysis in Sect. 3, which is based on an extension of Eq. (2.1) to a time-dependent problem as well as an associated Kolmogorov backward equation and Itô calculus.
In order to validate the theoretical findings, we have performed a simulation study for the stochastic model problem (1.1) on the domain \({\mathscr {D}}= (0,1)^d\) for \(d=1,2\) in Sect. 4. This model is highly relevant for applications in spatial statistics, since it is often used to approximate Gaussian Matérn fields. We have considered four different functions \(\varphi \) and the fractional orders \(\beta = 0.6, 0.7, 0.8, 0.9\). The observed empirical weak convergence rates agree with the theoretical results. One of the considered functions \(\varphi \) is based on a transformation of the random field by a Gaussian cumulative distribution function. Quantities of this form are particularly important for applications to porous materials, as they are used to model the pore volume fraction of the material, see, e.g., [1]. Thus, we see ample possibilities for applying the outcomes of this work to problems in spatial statistics and related disciplines.
References
[1] Barman, S., Bolin, D.: A three-dimensional statistical model for imaged microstructures of porous polymer films. J. Microsc. 269, 247–258 (2018)
[2] Bolin, D.: Spatial Matérn fields driven by non-Gaussian noise. Scand. J. Stat. 41, 557–579 (2014)
[3] Bolin, D., Kirchner, K.: The rational SPDE approach for Gaussian random fields with general smoothness. arXiv preprint, arXiv:1711.04333v2 (2018)
[4] Bolin, D., Kirchner, K., Kovács, M.: Numerical solution of fractional elliptic stochastic PDEs with spatial white noise. arXiv preprint, arXiv:1705.06565v2 (2018)
[5] Bolin, D., Lindgren, F.: Spatial models generated by nested stochastic partial differential equations, with an application to global ozone mapping. Ann. Appl. Stat. 5, 523–550 (2011)
[6] Bonito, A., Pasciak, J.E.: Numerical approximation of fractional powers of elliptic operators. Math. Comput. 84, 2083–2110 (2015)
[7] Brzeźniak, Z.: Some remarks on Itô and Stratonovich integration in 2-smooth Banach spaces. In: Probabilistic Methods in Fluids, pp. 48–69. World Scientific Publishing, River Edge (2003)
[8] Courant, R., Hilbert, D.: Methods of Mathematical Physics: Vol. I. Interscience Publishers, Inc, New York (1953)
[9] Da Prato, G., Zabczyk, J.: Second Order Partial Differential Equations in Hilbert Spaces. London Mathematical Society Lecture Note Series. Cambridge University Press, Cambridge (2002)
[10] Fuglstad, G.-A., Lindgren, F., Simpson, D., Rue, H.: Exploring a new class of non-stationary spatial Gaussian random fields with varying local anisotropy. Stat. Sin. 25, 115–133 (2015)
[11] Kovács, M., Printems, J.: Weak convergence of a fully discrete approximation of a linear stochastic evolution equation with a positive-type memory term. J. Math. Anal. Appl. 413, 939–952 (2014)
[12] Lindgren, F., Rue, H., Lindström, J.: An explicit link between Gaussian fields and Gaussian Markov random fields: the stochastic partial differential equation approach (with discussion). J. R. Stat. Soc. Ser. B Stat. Methodol. 73, 423–498 (2011)
[13] Matérn, B.: Spatial variation, Meddelanden från statens skogsforskningsinstitut, vol. 49 (1960)
[14] Pazy, A.: Semigroups of Linear Operators and Applications to Partial Differential Equations. Applied Mathematical Sciences. Springer, Berlin (1983)
[15] Peszat, S., Zabczyk, J.: Stochastic Partial Differential Equations with Lévy Noise. Encyclopedia of Mathematics and its Applications. Cambridge University Press, Cambridge (2007)
[16] Rue, H., Held, L.: Gaussian Markov Random Fields: Theory and Applications. Chapman & Hall/CRC Monographs on Statistics and Applied Probability. CRC Press, Boca Raton (2005)
[17] Stein, M.L.: Interpolation of Spatial Data: Some Theory for Kriging. Springer Series in Statistics. Springer, New York (1999)
[18] Strang, G., Fix, G.: An Analysis of the Finite Element Method. Wellesley-Cambridge Press, Wellesley (2008)
[19] Thomée, V.: Galerkin Finite Element Methods for Parabolic Problems. Springer Series in Computational Mathematics. Springer, Berlin (2006)
[20] Wallin, J., Bolin, D.: Geostatistical modelling using non-Gaussian Matérn fields. Scand. J. Stat. 42, 872–890 (2015)
[21] Whittle, P.: On stationary processes in the plane. Biometrika 41, 434–449 (1954)
[22] Whittle, P.: Stochastic processes in several dimensions. Bull. Int. Stat. Inst. 40, 974–994 (1963)
Acknowledgements
The authors thank Stig Larsson for valuable comments on the manuscript and an anonymous referee who helped to improve the presentation of the results.
Additional information
Communicated by Ragnar Winther.
This work was supported in part by the Swedish Research Council (Grant Nos. 2016-04187, 2017-04274), and the Knut and Alice Wallenberg Foundation (KAW 20012.0067).
Rights and permissions
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.
About this article
Cite this article
Bolin, D., Kirchner, K. & Kovács, M. Weak convergence of Galerkin approximations for fractional elliptic stochastic PDEs with spatial white noise. Bit Numer Math 58, 881–906 (2018). https://doi.org/10.1007/s10543-018-0719-8
DOI: https://doi.org/10.1007/s10543-018-0719-8
Keywords
- Stochastic partial differential equations
- Weak convergence
- Gaussian white noise
- Fractional operators
- Finite element methods
- Galerkin methods
- Matérn covariances
- Spatial statistics