Fast approximation by periodic kernel-based lattice-point interpolation with application in uncertainty quantification

This paper deals with the kernel-based approximation of a multivariate periodic function by interpolation at the points of an integration lattice -- a setting that, as pointed out by Zeng, Leung, Hickernell (MCQMC2004, 2006) and Zeng, Kritzer, Hickernell (Constr. Approx., 2009), allows fast evaluation by fast Fourier transform, so avoiding the need for a linear solver. The main contribution of the paper is the application to the approximation problem for uncertainty quantification of elliptic partial differential equations, with the diffusion coefficient given by a random field that is periodic in the stochastic variables, in the model proposed recently by Kaarnioja, Kuo, Sloan (SIAM J. Numer. Anal., 2020). The paper gives a full error analysis, and full details of the construction of lattices needed to ensure a good (but inevitably not optimal) rate of convergence and an error bound independent of dimension. Numerical experiments support the theory.


Introduction
We consider a kernel-based approximation for a multivariate periodic function by interpolation at a quasi-Monte Carlo lattice point set. Kernel-based interpolation methods are by now well established (see, e.g., [26] and more discussion below). It is the unique combination of a periodic kernel with a lattice point set that delivers the significant advantage in computational efficiency here. As already advocated by Hickernell and colleagues in [28,29], the combination of a periodic reproducing kernel with the group structure of lattice points means that the linear system for constructing the kernel interpolant involves a circulant matrix, and thus can be solved very efficiently using the fast Fourier transform. Our kernel method can therefore be fast even when the dimensionality is high.
As also advocated in [28,29], a kernel interpolant is in many settings optimal among all approximation algorithms that use the same function values (see also known results on optimal recovery, e.g., [20,21]). We can therefore analyze the worst case approximation error of our kernel method by using, as an upper bound, the worst case error of an auxiliary algorithm based on a Fourier series truncated at a hyperbolic cross index set. Using recent works [3,4], we construct a lattice generating vector with a guaranteed good error bound for our kernel interpolant. Note, importantly, that neither the construction of our lattice generating vector nor the implementation of our kernel method requires explicit knowledge or evaluation of the auxiliary hyperbolic cross index set. In short, we know how to find a good lattice point set so that our kernel method has a small error in addition to being of low cost.
The main contribution of this paper is to apply and analyze this periodic-kernel-plus-lattice method for the uncertainty quantification of elliptic partial differential equations (PDEs), where the diffusion coefficient is given by a random field that is periodic in the stochastic variables, as in the model proposed recently by Kaarnioja, Kuo, and Sloan [12]. We tailor our lattice generating vector to the regularity of the PDE solution with respect to the stochastic variables. Our numerical results outperform the theoretical predictions, indicating that the theory based on worst case analysis may not be sharp.
The kernel approximation developed here may have a role as a surrogate model for complicated forward problems. One popular use for surrogate models is to allow efficient sampling of the original system. If solving some particularly difficult PDE problem to high accuracy takes a week for a given parameter choice y, then having a kernel interpolant that can be evaluated in hours or minutes could be very useful. A second possible use for the kernel interpolant is in the easy generation of derivatives, needed for example in gradient-based optimization algorithms. The surrogate might be even more useful for Bayesian inverse problems.
We now elaborate key points.
Periodic-kernel-plus-lattice method. Let f(y) = f(y_1, ..., y_s) be a real-valued function on the s-dimensional unit cube [0,1]^s, with a somewhat smooth 1-periodic extension to R^s. Our main interest is in problems where the dimension s is large. Following [25], we assume that f has an absolutely convergent Fourier series and belongs to a weighted mixed Sobolev space H := H_{s,α,γ}, which is characterized by a smoothness parameter α > 1 and a family of positive numbers γ = (γ_u)_{u ⊂ N} called weights; the details are given in Section 2.1.
Our ultimate application, to be analyzed in Section 4, concerns a class of elliptic PDEs parameterized by a very high (or possibly countably infinite) number of stochastic parameters, for which the solution, as a function of the parameters, is periodic and belongs to the weighted space H for a suitable choice of α and γ.
The important feature of the space H is that it is a reproducing kernel Hilbert space (RKHS), with a simple reproducing kernel K(y, y′). This opens the way to the use of kernel methods to approximate functions in H from given point values. In particular, in this paper we focus on kernel interpolation: given f ∈ H and a suitable set of points t_1, ..., t_n ∈ [0,1]^s, we seek an approximation f_n ∈ H of the form

  f_n(y) = Σ_{k=1}^n a_k K(t_k, y),   (1)

which satisfies the interpolation condition f_n(t_k) = f(t_k), k = 1, ..., n. We refer to f_n as the kernel interpolant of f. We will interpolate the function at a set of n lattice points specified by a generating vector z ∈ Z^s. The points are then given by the formula t_k = (kz mod n)/n, k = 1, ..., n, with t_n = t_0 = 0. A lattice point set has an additive group structure, implying that the difference of two lattice points is another lattice point (after taking periodicity into account).
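As a small self-contained illustration of these lattice-point formulas, the following sketch (with hypothetical choices of n and z, not values prescribed here) generates a rank-1 lattice and checks its group structure numerically:

```python
from fractions import Fraction

# Rank-1 lattice points t_k = (k*z mod n)/n; n and z are illustrative
# choices, not values prescribed by the paper.
n = 7                      # number of lattice points
z = (1, 3)                 # generating vector in {1, ..., n-1}^s, here s = 2

def lattice_point(k):
    """Return t_k componentwise as exact fractions."""
    return tuple(Fraction(k * zj % n, n) for zj in z)

points = [lattice_point(k) for k in range(n)]   # k = 0 plays the role of t_n = t_0

# Group structure: the componentwise difference mod 1 of two lattice points
# is again a lattice point, namely t_{(k - k') mod n}.
for k in range(n):
    for kp in range(n):
        diff = tuple((a - b) % 1 for a, b in zip(points[k], points[kp]))
        assert diff == lattice_point((k - kp) % n)
```

Exact rational arithmetic makes the group-structure check free of rounding issues.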
A key property of our reproducing kernel is that it depends only on the difference of its two arguments, thus K(y, y′) = K(y − y′, 0), and K(·, 0) is a periodic function with an easily computable expression when α is an even integer. Combining this with the group structure of lattice points means that the matrix [K(t_k − t_{k′}, 0)]_{k,k′=1,...,n} contains only n distinct values and is indeed a circulant matrix. Therefore the linear system arising from collocating (1) at the points t_{k′}, k′ = 1, ..., n, can be solved using the fast Fourier transform at a cost of O(n log n).
Once we have the coefficients a_k, we can use (1) to evaluate the interpolant f_n at L arbitrary points y_ℓ, ℓ = 1, ..., L, at a cost of O(Ln). Remarkably, at almost the same cost we can evaluate f_n at all Ln points of the union of shifted lattices y_ℓ + t_{k′}, ℓ = 1, ..., L, k′ = 1, ..., n. Indeed, since K(t_k, y_ℓ + t_{k′}) = K(t_k − t_{k′}, y_ℓ) and the matrix [K(t_k − t_{k′}, y_ℓ)]_{k,k′=1,...,n} is circulant, we have

  f_n(y_ℓ + t_{k′}) = Σ_{k=1}^n a_k K(t_k − t_{k′}, y_ℓ),

which can be evaluated for each y_ℓ for all t_{k′} together by fast Fourier transform at a cost of O(n log n), leading to a total cost of O(Ln log n). A comprehensive cost analysis, taking into account also the evaluations of f and K, is given in Section 5.
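To make the cost claim concrete, here is a runnable sketch under assumptions not fixed by the text above (product weights, α = 2 with the standard closed form η_2(y) = 2π²({y}² − {y} + 1/6), and hypothetical n, z, γ): it evaluates the interpolant at all n shifted points y + t_k′ with a single FFT and checks against direct O(n²) summation.

```python
import numpy as np

# Hypothetical parameters for illustration; product weights are assumed so
# that the kernel factorizes as K(y, 0) = prod_j (1 + gamma_j * eta2(y_j)).
def eta2(y):
    u = np.mod(y, 1.0)                      # fractional part {y}
    return 2*np.pi**2 * (u**2 - u + 1.0/6.0)

def K0(y, gamma):
    return np.prod(1.0 + gamma * eta2(y), axis=-1)

n, z = 13, np.array([1, 5])
gamma = np.array([1.0, 0.3])
t = np.outer(np.arange(n), z) % n / n       # lattice points t_0, ..., t_{n-1}

rng = np.random.default_rng(1)
a = rng.standard_normal(n)                  # some interpolant coefficients a_k
y = np.array([0.37, 0.81])                  # an arbitrary evaluation point

# f_n(y + t_k') = sum_k a_k K(t_k - t_k', y) = sum_k a_k d[(k - k') mod n]
# with d[m] := K(t_m - y, 0): a circular correlation, hence one FFT suffices.
d = K0(t - y, gamma)
fast = np.real(np.fft.ifft(np.conj(np.fft.fft(d)) * np.fft.fft(a)))

# Direct O(n^2) evaluation for comparison.
direct = np.array([sum(a[k] * K0(t[k] - t[kp] - y, gamma) for k in range(n))
                   for kp in range(n)])
assert np.allclose(fast, direct)
```

The correlation is realized via conjugated FFT factors; for each new point y only the vector d and one FFT pair need to be recomputed.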
Brief survey of kernel methods in high dimensions. Griebel and Rieger [10] considered a (non-interpolatory) kernel approximation, based on a regularized reconstruction technique from machine learning, for a class of parameterized elliptic PDEs similar to the one considered in this work, yet with non-periodic dependence on the parameters. They used an anisotropic kernel, behaving differently in different variables, to address the high dimensionality of the problem. However, their error estimate was in terms of the mesh norm or fill distance of the point set, which is the radius of the largest Euclidean ball that contains no points in its interior. Since the fill distance behaves at best like n^{−1/s}, where n is the number of sampling points, their estimates inevitably suffer from the curse of dimensionality.
Kempf et al. [15] considered the same PDE problem and anisotropic kernel as [10]. However, they considered a penalized least-squares approach for kernel approximation and an isotropic sparse grid as point set, which allowed them to obtain error estimates with a mitigated (but still present) curse of dimensionality.
As noted above, lattice points have already been used in a kernel interpolation method. Zeng et al. [28] seem to be the first to work in this direction; however, the question of dependence on dimension was not considered in their analysis. Zeng et al. [29] established dimension-independent error estimates in weighted spaces in the case of product weights (i.e., weights of the form γ_u = ∏_{j∈u} γ_j). We note, however, that the assumption of product weights is rather limiting. For instance, for integration problems involving parameterized PDEs, the best convergence rates known up to now are obtained by considering weighted spaces for the parameter-to-solution map with (S)POD weights [9,12,17], whereas weighted spaces with product weights lead to the best known rates only for special models [7,11,14]. In this paper we extend these results to the case of kernel approximation (as opposed to integration) of the parameter-to-solution map, and we are able to show dimension-independent convergence rates using (S)POD weights in the general case. To the best of our knowledge, this is the first paper to use non-product weights for approximation in parameterized PDE problems.
PDEs with periodic dependence on random variables. Our motivating application is a class of parameterized elliptic PDEs with periodic dependence on the parameters, for which we will establish dimension-independent error estimates for the kernel interpolant, by deriving suitable choices of the smoothness parameter and weights for the problem at hand. To the best of our knowledge, this is the first paper presenting dimension-independent kernel approximation methods using lattice points for this class of problems.
We consider uncertainty quantification for an elliptic PDE (see details in Section 4) on a physical domain D ⊂ R^d, d = 1, 2 or 3, in a probability space (Ω, A, P), with an input random field of the form

  a(x, ω) = a_0(x) + Σ_{j≥1} Θ_j(ω) ψ_j(x),  x ∈ D,

where a_0 and the ψ_j are uniformly bounded in D, and the Θ_j(ω) are i.i.d. random variables following a prescribed distribution. In the popular affine model, the Θ_j are i.i.d. random variables uniformly distributed on [−1/2, 1/2]. In the periodic model [12], the Θ_j are i.i.d. random variables distributed according to the arcsine distribution and can be parameterized as

  Θ_j(ω) = (1/√6) sin(2π y_j(ω)),

with y_j uniformly distributed on [−1/2, 1/2]. The mean of the random field is a_0, and the scaling 1/√6 is chosen here so that the covariance of the random field is exactly the same as in the affine case. Higher moments are of course somewhat different, but as argued in [12], there seems to be no clear reason for preferring one model over the other.
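That sin(2πY), with Y uniform on [−1/2, 1/2], follows the arcsine law can be checked with a quick seeded Monte Carlo experiment; the CDF 1/2 + arcsin(z)/π used below is the standard arcsine CDF on [−1, 1]:

```python
import numpy as np

# Seeded Monte Carlo check that Z = sin(2*pi*Y), with Y uniform on
# [-1/2, 1/2], has the arcsine distribution on [-1, 1], whose CDF is
# F(z) = 1/2 + arcsin(z)/pi (density 1/(pi*sqrt(1 - z^2))).
rng = np.random.default_rng(42)
Y = rng.uniform(-0.5, 0.5, size=200_000)
Z = np.sin(2*np.pi*Y)

for zval in (-0.9, -0.5, 0.0, 0.5, 0.9):
    empirical = np.mean(Z <= zval)   # empirical CDF at zval
    exact = 0.5 + np.arcsin(zval)/np.pi
    assert abs(empirical - exact) < 5e-3
```

With 200,000 samples the empirical CDF deviates from the exact one by roughly 10⁻³, well within the tolerance used.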
Due to periodicity, it is equivalent to work with y_j uniformly distributed on the interval [0, 1] instead of [−1/2, 1/2]; thus from now on we consider the parameter space U_s := [0, 1]^s. In the earlier paper [12] the aim was to develop and analyze a method for computing the expected value of a given quantity of interest, expressed as a linear functional of the PDE solution, hence facing a high dimensional integration problem. Here, in contrast, the aim is to develop and analyze a fast method for approximating the solution u(x, y), or some quantity of interest Q(y) derived from u(x, y), as an explicit function of y. To that end we will develop a kernel-based approximation, using the kernel of a reproducing kernel Hilbert space of periodic functions, and interpolation at a lattice point set.
Structure of the paper. In Section 2 we define the function space setting and the kernel interpolant, and establish its principal properties, while giving a simple proof of a known optimality result, namely that in the sense of worst case error the kernel interpolant is an optimal L_p approximation among all approximations that use the same information about the target function f ∈ H. Then in Section 3 we establish upper and lower bounds on the error. For the upper bound we use the optimality result together with the error analysis for a trigonometric polynomial method established by two of the current authors together with Cools and Nuyens [3,4]. For the lower bound we provide another proof of a recent result by Byrenheid et al. [1], namely that a method that draws information only from function values at lattice points inevitably has a rate of convergence that is at best only half of the best possible rate, thereby obtaining matching upper and lower bounds up to logarithmic factors. In Section 4 we apply the error analysis developed in Section 3 to a parameterized PDE problem, thereby obtaining rigorous upper error bounds that are independent of dimension and have explicit rates of convergence. Section 5 is concerned with the cost analysis of our proposed method. In Section 6 we give the results of some numerical experiments.

The kernel interpolant

The function space setting
Let f(y) = f(y_1, ..., y_s) be a real-valued function on [0,1]^s with a somewhat smooth 1-periodic extension to R^s with respect to each variable y_j. Our main interest is in problems where the dimension s is large. Following [25], we assume that f has an absolutely convergent Fourier series (and so is continuous),

  f(y) = Σ_{h ∈ Z^s} f̂(h) e^{2πi h·y},  f̂(h) := ∫_{[0,1]^s} f(y) e^{−2πi h·y} dy,

with supp(h) := {j ∈ {1:s} : h_j ≠ 0} and {1:s} := {1, 2, ..., s}. The weighted space H_{s,α,γ} is characterized by the smoothness parameter α > 1 and a family of positive numbers γ = (γ_u)_{u ⊂ N} called weights, where a positive weight γ_u is associated with each subset u ⊆ {1:s}; its norm is

  ||f||_{s,α,γ} := ( Σ_{h ∈ Z^s} γ_{supp(h)}^{−1} ( ∏_{j ∈ supp(h)} |h_j|^α ) |f̂(h)|² )^{1/2},

with the h = 0 term in the sum to be interpreted as |f̂(0)|². We fix the scaling of the weights by setting γ_∅ := 1, so that the norm of a constant function in H matches its L_2 norm.
It can easily be verified that if α is an even integer then this norm can be rewritten as the norm in an "unanchored" weighted Sobolev space of dominating mixed smoothness of order α/2, where y_u denotes the components of y with indices that belong to the subset u, y_{−u} denotes the components with indices not in u, and |u| denotes the cardinality of u.
The important feature of the space H is that it is an RKHS, with an explicitly known and analytically simple reproducing kernel, namely

  K(y, y′) = Σ_{u ⊆ {1:s}} γ_u ∏_{j ∈ u} η_α(y_j − y′_j),

where

  η_α(y) := Σ_{h ∈ Z \ {0}} e^{2πi h y} / |h|^α.

Note that the reproducing property,

  f(y) = ⟨f, K(·, y)⟩_H  for all f ∈ H and y ∈ [0,1]^s,   (3)

is easily verified.
Of special interest are even integer values of α because, when α is even, η_α can be expressed in the especially simple closed form

  η_α(y − y′) = ((−1)^{α/2+1} (2π)^α / α!) B_α({y − y′}),

where the braces indicate that y − y′ is to be replaced by its fractional part in [0, 1), and B_α(y) is the Bernoulli polynomial of degree α. For example, for α = 2 and α = 4 we have

  η_2(y) = 2π² B_2({y}) = 2π² ({y}² − {y} + 1/6),
  η_4(y) = −(2π⁴/3) B_4({y}) = −(2π⁴/3) ({y}⁴ − 2{y}³ + {y}² − 1/30).
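As a quick numerical sanity check (the α = 2 identity η_2(y) = 2π²B_2({y}) used below is the standard Bernoulli-polynomial one), the defining Fourier series can be compared with the polynomial closed form:

```python
import math

# Sanity check of the alpha = 2 closed form (the standard identity
# eta_2(y) = 2*pi^2*B_2({y}) with B_2(y) = y^2 - y + 1/6) against the
# defining Fourier series eta_2(y) = 2*sum_{h>=1} cos(2*pi*h*y)/h^2.
def eta2_series(y, terms=20000):
    return 2.0 * sum(math.cos(2*math.pi*h*y) / (h*h) for h in range(1, terms + 1))

def eta2_closed(y):
    u = y % 1.0                        # fractional part {y}
    return 2*math.pi**2 * (u*u - u + 1.0/6.0)

for y in (0.1, 0.25, 0.5, 0.9):
    assert abs(eta2_series(y) - eta2_closed(y)) < 1e-3
```

The truncated series of 20,000 terms has a tail of size at most 2/20000 = 10⁻⁴, so agreement to the tolerance 10⁻³ is guaranteed.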

The kernel interpolant
We are interested in approximating a given function f ∈ H by an approximation of the form

  f_a(y) = Σ_{k=1}^n a_k K(t_k, y),   (4)

with interpolation points taken to be the lattice points

  t_k := {k z / n},  k = 1, ..., n,   (5)

where z ∈ {1, ..., n−1}^s, and the braces around the vector of length s indicate that each component of the vector is to be replaced by its fractional part. The points (5) are the points of a lattice cubature rule of rank 1, see [24]. In what follows, we omit these braces because the functions we consider are, unless otherwise stated, periodic.
In particular, we define f_n ∈ H to be the function of the form (4) that interpolates f at the lattice points, and refer to f_n as the kernel interpolant of f. The coefficients a_k in (4) are determined by the interpolation conditions

  Σ_{k=1}^n a_k K(t_k, t_{k′}) = f(t_{k′}),  k′ = 1, ..., n,   (6)

a linear system with matrix K := [K(t_k, t_{k′})]_{k,k′=1,...,n}. Note that the matrix elements can be expressed, using periodicity, as K(t_k, t_{k′}) = K(t_k − t_{k′}, 0), where 0 is the s-vector of all zeros. It follows that the n × n matrix K is a circulant matrix, which contains only n distinct elements and can be diagonalized in a time of order n log n by the fast Fourier transform. This is a major motivation for using lattice points.
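The circulant structure and the resulting FFT solve are easy to verify numerically. The sketch below uses hypothetical parameters, product weights, and the even-α closed form 2π²({y}² − {y} + 1/6) for the kernel factor; all of these are illustrative assumptions rather than the exact setup above.

```python
import numpy as np

def eta2(y):
    u = np.mod(y, 1.0)
    return 2*np.pi**2 * (u**2 - u + 1.0/6.0)

n, z = 17, np.array([1, 5, 7])              # hypothetical lattice parameters
gamma = np.array([0.9, 0.5, 0.2])           # hypothetical product weights
t = np.outer(np.arange(n), z) % n / n       # lattice points t_0, ..., t_{n-1}

def K0(y):
    """K(y, 0) = prod_j (1 + gamma_j * eta2(y_j)) under product weights."""
    return np.prod(1.0 + gamma * eta2(y), axis=-1)

# Full kernel matrix K[k', k] = K(t_k - t_k', 0).
Kmat = np.array([[K0(t[k] - t[kp]) for k in range(n)] for kp in range(n)])

# Each row is a cyclic shift of the first row: the matrix is circulant.
for kp in range(n):
    assert np.allclose(Kmat[kp], np.roll(Kmat[0], kp))

# Solving K a = f by diagonalizing with the FFT matches a dense solve.
rng = np.random.default_rng(0)
fvals = rng.standard_normal(n)
a_fft = np.real(np.fft.ifft(np.fft.fft(fvals) / np.fft.fft(Kmat[0])))
a_dense = np.linalg.solve(Kmat, fvals)
assert np.allclose(a_fft, a_dense)
```

The eigenvalues of the circulant matrix are the FFT of its first column; they are real and positive here because the kernel has positive Fourier coefficients, so the division is well defined.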

The kernel interpolant is the minimal norm interpolant
The following property is a well known result for interpolation in a reproducing kernel Hilbert space; for completeness we give a proof.
Proof. Denoting by P_n := span{K(t_k, ·) : k = 1, ..., n} the linear span of the kernels with one leg at t_k, k = 1, ..., n, we observe the well known fact (see, e.g., [5,8]) that f_n is the orthogonal projection of f onto P_n with respect to the inner product ⟨·, ·⟩_H, since from the reproducing property (3) and the interpolation property (6) we have

  ⟨f − f_n, K(t_k, ·)⟩_H = (f − f_n)(t_k) = 0,  k = 1, ..., n.

In turn, there follows the Pythagoras theorem

  ||f||²_H = ||f_n||²_H + ||f − f_n||²_H,   (8)

and the minimal norm property of f_n, since if g is any other interpolant of f at the lattice points then g − f_n vanishes at the lattice points, hence is orthogonal to P_n, and

  ||g||²_H = ||f_n||²_H + ||g − f_n||²_H ≥ ||f_n||²_H,

from which the uniqueness of the minimal norm interpolant also follows. ✷

The kernel interpolant is optimal for given function values
In this subsection we show that the kernel interpolant f_n defined by (4), (5) and (6) is optimal among all approximations that use only the same function values of f, in the sense of giving the least possible worst case error measured in any given norm ||·||_W with H ⊂ W. This is a special case of a general result for optimal recovery problems in Hilbert spaces (see for example [20, Example 1.1] and [21, Section 3]), but for completeness we give a short proof here. Our proof follows the exposition of [26, Proof of Theorem 13.5], suitably adapted to our setting. Let A_n : H → H be an algorithm (linear or non-linear) that uses as information about the argument only its values at the points (5), i.e., it is a mapping of the form A_n(f) = I_n(f(t_1), ..., f(t_n)) for some mapping I_n : R^n → H. The worst case W-error for this algorithm is defined by

  e^wor(A_n; W) := sup_{f ∈ H, ||f||_H ≤ 1} ||f − A_n(f)||_W.

Theorem 2. Let A_n : H → H be an algorithm (linear or non-linear) such that A_n(f) uses as information about f only its values f(t_1), ..., f(t_n) at the points (5). For f ∈ H, let A*_n(f) := f_n be the kernel interpolant defined by (4), (5) and (6). Then, for any normed space W ⊃ H we have e^wor(A*_n; W) ≤ e^wor(A_n; W).
Proof. Define C := {g ∈ H : ||g||_H ≤ 1 and g(t_k) = 0 for all k = 1, ..., n}. For any g ∈ C we have

  e^wor(A_n; W) ≥ max( ||g − A_n(g)||_W, ||−g − A_n(−g)||_W ) ≥ (1/2)( ||g − A_n(0)||_W + ||g + A_n(0)||_W ) ≥ ||g||_W,   (9)

where in the penultimate step we used g(t_k) = 0 for all k = 1, ..., n, from which it follows that A_n(0) = A_n(g) = A_n(−g). For any f ∈ H such that ||f||_H ≤ 1, since f_n is interpolatory, the Pythagoras theorem (8) implies ||f − f_n||_H ≤ 1, and hence f − f_n ∈ C. Thus it follows from (9) that ||f − A*_n(f)||_W = ||f − f_n||_W ≤ e^wor(A_n; W). The theorem now follows. ✷

In the above result we may, for example, take W = L_p for any 1 ≤ p ≤ ∞.
Lower and upper error bounds

Lower bound on the worst case L_p error (1 ≤ p ≤ ∞)

A recent paper [1] showed (with a different definition of the parameter α) that the worst case L_2 error for an approximation that uses the points of a rank-1 lattice cannot have an order of convergence better than n^{−α/4} (with our definition of α). Bearing in mind that H is a (Hilbert) space of functions of dominating mixed smoothness of order α/2, this is just half the rate n^{−α/2} of the best approximation. Since the function space setting in that paper is rather different from ours (here we use a Fourier description and a so-called unanchored space, and have introduced weights), we briefly reprove the main result here, obtaining a sharp lower bound expressed in terms of the weights. Furthermore, in our setting we make the result stronger by showing that the same lower bound holds for the worst case L_1 error.
Theorem 3. Let s ≥ 2. Assume that the weights for the subsets of {1:s} containing a single element satisfy γ_{{j}} > 0 for all j ∈ {1:s}, and that z ∈ {0, ..., n−1}^s is given. Let A_n : H → L_p be an algorithm (linear or non-linear) that uses information only at the lattice points (5) and satisfies A_n(0) = 0. Then for 1 ≤ p ≤ ∞ the worst case L_p error for the algorithm A_n satisfies

  e^wor(A_n; L_p) ≥ (2 / √(1/γ_{{j}} + 1/γ_{{k}})) n^{−α/4}

for any j, k ∈ {1:s} with j ≠ k.
Proof. Without loss of generality we may take j = 1 and k = 2. The heart of the matter is that there exists a non-zero integer vector h* of length s, supported on the first two coordinates and lying in a 2-dimensional set D_n, satisfying

  h* · z ≡ 0 (mod n).   (10)

(In the language of dual lattices, see [24], there exists a point of the dual lattice in D_n \ {0}.) To prove this fact, we define D_n^+, the positive quadrant of D_n. Since |D_n^+| > n, it follows from the pigeonhole principle that two distinct elements of D_n^+, say h and h′, yield the same element of E_n(z); from this it follows that h* := h − h′ satisfies (10). A "fooling function" q is then defined in terms of h*_1 e_1 and h*_2 e_2, where e_1 and e_2 are the unit vectors corresponding to variables 1 and 2. By construction, q vanishes at all the lattice points (5). For this function, since the two terms in q are orthogonal with respect to the inner product in H = H_{s,α,γ}, the squared H norm ||q||²_H can be computed explicitly. On the other hand, the L_p norm is bounded from below by the L_1 norm. The integrand |q| is even with respect to h*_1 and h*_2 separately, so both h*_1 and h*_2 can be taken to be non-negative. First assume that both h*_1 and h*_2 are positive, and partition the square into boxes of size 1/h*_1 × 1/h*_2. It is easy to see that each box gives the same contribution to the integral, and hence ||q||_{L_1} = 4/π. (For the last step it may be useful to note that the integrand in the inner integral is 1-periodic, making the inner integral independent of z_2.) If we have h*_1 > 0 and h*_2 = 0 or vice versa, we again have ||q||_{L_1} = 4/π. Since h* is non-zero, we obtain a lower bound on the ratio ||q||_{L_1}/||q||_H. If we now define g := q/||q||_H, then g belongs to the unit ball in H and vanishes at all the points of the lattice (5), and ||g||_{L_p} is bounded below by the right-hand side of (11). Since A_n(g) depends on g only through its values at the lattice points, and g vanishes at all those points, it follows that A_n(g) = A_n(0) = 0, with the last step following from the assumption on A_n. From the definition of the worst case error we conclude that e^wor(A_n; L_p) ≥ ||g − A_n(g)||_{L_p} = ||g||_{L_p}, which is bounded below by the right-hand side of (11), completing the proof. ✷
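The pigeonhole step is constructive enough to demonstrate in code. The sketch below searches a simple square box of side about √n (a sufficient stand-in for the exact set D_n of the proof, which is not spelled out here) for a nonzero dual-lattice point supported on two coordinates:

```python
import math

def small_dual_point(n, z1, z2):
    """Find nonzero (h1, h2), |h_i| <= isqrt(n), with h1*z1 + h2*z2 = 0 (mod n)."""
    m = math.isqrt(n) + 1          # m*m > n, so a collision below is guaranteed
    seen = {}
    for h1 in range(m):
        for h2 in range(m):
            r = (h1*z1 + h2*z2) % n
            if r in seen:
                p1, p2 = seen[r]
                return (h1 - p1, h2 - p2)   # difference of two colliding pairs
            seen[r] = (h1, h2)
    return None                    # unreachable, by the pigeonhole principle

h = small_dual_point(101, 1, 40)
assert h is not None and h != (0, 0)
assert (h[0]*1 + h[1]*40) % 101 == 0
```

Since the box contains more than n pairs, two of them must share the same residue modulo n; their difference is the desired nonzero dual-lattice point with both components of size at most ⌊√n⌋.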

Upper bound on the worst case L_2 error
In this section we obtain explicit L_2 error bounds for the kernel interpolant by using Theorem 2 combined with the error bounds given for an explicit trigonometric polynomial approximation in [3,4], which extends the construction from [18,19] to general weights. (An alternative approach to obtain an upper bound would be to use a "reconstruction lattice", see, e.g., [1,13,16].) The lattice algorithm A†_{n,M} applied to a target function f ∈ H is obtained by applying a lattice integration rule to the Fourier coefficients in the orthogonal projection onto a finite index set A_s(M), defined for some parameter M > 0. The error for this algorithm consists of the error from truncation to the index set A_s(M) together with the quadrature error from approximating those Fourier coefficients with indices h ∈ A_s(M), leading to a worst case L_2 approximation error bound (14) involving a quantity S_s(z). The quantity S_s(z) (see [3] for details) can be used as a search criterion in a component-by-component (CBC) construction for finding suitable lattice generating vectors z, and has the key advantage that it does not depend on the index set A_s(M). The analysis in [3], together with the optimality of the kernel interpolant (see Theorem 2), leads to the following theorem.
Theorem 4. Given s ≥ 1, α > 1, weights (γ_u)_{u ⊂ N} with γ_∅ := 1, and prime n, the worst case L_2 approximation error of the kernel interpolant A*_n(f) = f_n defined by (4), (5) and (6), using the generating vector z obtained from the CBC construction with search criterion S_s(z) in [3,4], satisfies, for all λ ∈ (1/α, 1], the bound (15), where the implied constant depends on δ but is independent of s provided that the weights satisfy the corresponding summability condition.

Proof. The optimality of the kernel interpolant established in Theorem 2 means that e^wor(A*_n; L_2) ≤ e^wor(A†_{n,M}; L_2) for all M, and therefore the upper bound in (14) also serves as an upper bound for the kernel interpolant. It is easy to verify that the bound in (14) can be minimized by setting M = 1/S_s(z), leading to (15). The subsequent bound follows from [3, Theorem 3.5]. The big-O bound is then obtained by taking λ = 1/(α − 4δ). ✷

From this result (which by Theorem 3 is almost best possible with respect to the order of convergence) we immediately obtain an error bound for the kernel interpolant.
Theorem 5. Under the conditions of Theorem 4, and with the lattice generating vector z obtained by the CBC construction in [3,4], for any f ∈ H we have for the kernel interpolant f_n defined by (4), (5) and (6)

  ||f − f_n||_{L_2} ≤ e^wor(A*_n; L_2) ||f||_H,

with e^wor(A*_n; L_2) bounded as in Theorem 4.

We stress again that the CBC construction in [3,4] does not require the explicit construction of the index set A_s(M) in order to determine an appropriate generating vector z. However, the expression S_s(z) (see [3] for details) used as the search criterion does depend in a complicated way on the weights γ_u, and therefore the target dimension s needs to be fixed at the start of the CBC construction (except in the case of product weights). For weights with no special structure, the computational cost will be exponentially large in s. We consider some special forms of weights:

• Product weights: γ_u = ∏_{j∈u} γ_j, specified by a single sequence (γ_j)_{j≥1}.
• SPOD weights (smoothness-driven product and order dependent) with degree σ ≥ 1.

Our PDE application concerns the elliptic problem (16)–(17) with a random diffusion coefficient of the form

  a(x, ω) = a_0(x) + (1/√6) Σ_{j≥1} sin(2π Y_j(ω)) ψ_j(x),  x ∈ D,   (18)

for almost all events ω ∈ Ω in the probability space (Ω, A, P), where a_0 ∈ L_∞(D) and ψ_j ∈ L_∞(D) for all j ≥ 1 are such that Σ_{j≥1} |ψ_j(x)| < ∞ for every x ∈ D, and Y_1, Y_2, ... are i.i.d. random variables uniformly distributed on [−1/2, 1/2]. This type of random field is not new in the context of uncertainty quantification. Indeed, the random variable sin(2π Y_j(ω)) induces the arcsine measure as its distribution: if Y(ω) is uniformly distributed on [−1/2, 1/2], then Z(ω) := sin(2π Y(ω)) has the probability density z ↦ 1/(π √(1 − z²)) on (−1, 1). Thus a is identical in law to the random field

  a(x, ω) = a_0(x) + (1/√6) Σ_{j≥1} Z_j(ω) ψ_j(x),   (19)

with Z_j i.i.d. random variables with the arcsine distribution on [−1, 1]. Expression (19) would be the starting point for deriving a polynomial chaos approximation [27] of the solution in terms of Chebyshev polynomials of the first kind [22]. In this paper, however, we want to exploit periodicity, hence we consider rather the formulation (18) and a different approximation method based on kernel interpolation. Since the expression (18) is periodic in the random variable Y_j, we can shift those random variables so that their range is [0, 1] instead of [−1/2, 1/2], i.e., we consider the equivalent parametric space U := [0, 1]^N. Let B(U) be the Borel σ-algebra corresponding to the product topology on U = [0, 1]^N, and equip (U, B(U)) with the product uniform measure; see, for example, [23] for details. The weak formulation of (16)–(17) can then be stated parametrically as: for given y ∈ U, find u(·, y) ∈ H¹₀(D) such that

  ∫_D a(x, y) ∇u(x, y) · ∇v(x) dx = ⟨q, v⟩_{H⁻¹(D), H¹₀(D)}  for all v ∈ H¹₀(D),   (20)

where the datum q ∈ H⁻¹(D) is fixed and the diffusion coefficient is given by

  a(x, y) = a_0(x) + (1/√6) Σ_{j≥1} sin(2π y_j) ψ_j(x).   (21)

Here H¹₀(D) denotes the subspace of the L_2-Sobolev space H¹(D) with vanishing trace on ∂D, H⁻¹(D) denotes the topological dual of H¹₀(D), and ⟨·, ·⟩_{H⁻¹(D), H¹₀(D)} denotes the duality pairing between H⁻¹(D) and H¹₀(D). We endow the Sobolev space H¹₀(D) with the norm ||v||_{H¹₀(D)} := ||∇v||_{L_2(D)}. Since we now have two sets of variables x ∈ D and y ∈ U, from here on we will make the domains D and U explicit in our notation. We state the following assumptions and refer to them as they become needed:

(A1) a_0 ∈ L_∞(D) and ψ_j ∈ L_∞(D) for all j ≥ 1, and Σ_{j≥1} ||ψ_j||_{L_∞(D)} < ∞;

(A2) there exist positive constants a_min and a_max such that 0 < a_min ≤ a(x, y) ≤ a_max < ∞ for all x ∈ D and y ∈ U;

(A6) the physical domain D ⊂ R^d, d ∈ {1, 2, 3}, is a convex and bounded polyhedron with plane faces.
Let assumptions (A1) and (A2) be in effect. Then the Lax–Milgram lemma [2] implies unique solvability of the problem (20) for all y ∈ U, with the solution satisfying the a priori bound

  ||u(·, y)||_{H¹₀(D)} ≤ ||q||_{H⁻¹(D)} / a_min  for all y ∈ U.   (22)
Moreover, from the recent paper [12, Theorem 2.3] we know, after differentiating the PDE (20), that the mixed derivatives of the PDE solution are 1-periodic and satisfy the bound (23) for all y ∈ U and all multi-indices ν ∈ N₀^∞ with finite order |ν| := Σ_{j≥1} ν_j < ∞, where we define

  b_j := ||ψ_j||_{L_∞(D)} / (√6 a_min),  j ≥ 1.   (24)

Furthermore, S(σ, m) denotes the Stirling number of the second kind for integers σ ≥ m ≥ 0, with the convention that S(σ, 0) = δ_{σ,0}. In [12] we considered a function space with respect to y with a supremum norm rather than an L_2-based norm, so here we need to write down the relevant L_2-based norm bound instead. Moreover, we want to approximate the solution u directly, rather than a bounded linear functional G(u) of the PDE solution.
For our proposed approximation scheme, we require the target function to be pointwise well-defined with respect to both the physical variable and the parametric variable. In terms of our PDE application, this can be achieved either by assuming additional regularity of both the diffusion coefficient a and the source term q, or, alternatively, by analyzing instead the construction of the kernel interpolant for the finite element approximation of u (which is naturally pointwise well-defined everywhere). Here we focus on the latter case, in which the kernel interpolant is crafted for the finite element approximation of u. This is also the setting that arises in practical computations, where one only ever has access to a numerical approximation of the solution to (20), with the diffusion coefficient (21) truncated to a finite number of terms. To this end, we split our analysis into three parts: dimension truncation error, finite element error, and kernel interpolation error.
For an R-valued function F on U that is Lebesgue integrable with respect to the uniform measure on B(U), we use the notation ∫_U F(y) dy for the integral of F over U. Similarly, for an integrable function F on U_{>s}, we denote the integral over U_{>s} with respect to the uniform measure by ∫_{U_{>s}} F(y_{>s}) dy_{>s}.
Dimension truncation error

Arguing as in [17, Theorem 5.1], it is not difficult to see that a uniform (in y) dimension truncation bound holds under assumptions (A1)–(A3) and (A5). In what follows, we consider the dimension truncation error in the L_2-norm in the stochastic parameter, and establish the rate O(s^{−1/p+1/2}), which is one half order better. This case does not appear to have been considered in the existing literature. Notably, this rate is only half of the rate proved in [12] for the integration problem with respect to y. We will establish a dimension truncation error bound for a general class of parameterized random fields that includes (21), without the periodicity assumption. Our proof adapts the argument of Gantner [6] to the L_2(U; H¹₀(D))-norm estimate.

Theorem 6. Suppose that (A1), (A3) and (A5) hold. Let ξ : [0, 1] → R be essentially bounded with ∫₀¹ ξ(y) dy = 0, and suppose that the function

  a(x, y) := a_0(x) + Σ_{j≥1} ξ(y_j) ψ_j(x)   (26)

satisfies (A2). Then for any s ∈ N there exists a constant C > 0 such that

  ( ∫_U ||u(·, y) − u_s(·, y)||²_{L_2(D)} dy )^{1/2} ≤ C c_D ||q||_{H⁻¹(D)} s^{−1/p+1/2},

where u(·, y) ∈ H¹₀(D) denotes the solution of the equation (20) but with a(x, y) given by (26), u_s(·, y) ∈ H¹₀(D) denotes the corresponding dimensionally truncated solution, c_D > 0 is the Poincaré constant of the embedding H¹₀(D) ↪ L_2(D), and the constant C > 0 is independent of s and q.
Proof. We begin by introducing some helpful notation. For y ∈ U, define the operators B(y), B_s(y) : H¹₀(D) → H⁻¹(D) associated with the full and the dimensionally truncated bilinear forms, together with the operators B_j(y) : H¹₀(D) → H⁻¹(D) corresponding to the individual terms ξ(y_j) ψ_j. This allows the equation (20) with the coefficient a given by (26) to be written as Bu = q. It is easy to see that the assumptions (A1) and (A2) ensure that both B(y) and B_s(y) are boundedly invertible linear maps for all y ∈ U, with the norms of B and B_s both bounded by a_max, and the norms of both B⁻¹ and (B_s)⁻¹ bounded by a_min⁻¹. Thus we can write u := u(y) := B⁻¹ q and u_s := u_s(y) := (B_s)⁻¹ q for all y ∈ U.
Only in this proof, we redefine (24) by b_j := ||ξ||_∞ ||ψ_j||_{L_∞(D)} / a_min, with ||ξ||_∞ := ||ξ||_{L_∞([0,1])}. Notice that with ξ = (1/√6) sin(2π·) we recover (24). Let s′ ∈ Z₊ be such that Σ_{j≥s′+1} b_j < 1/2. Without loss of generality, we can assume that s ≥ s′, since the assertion of the theorem can subsequently be extended to all values of s by a simple adjustment of the constant C > 0 (see the end of the proof). Then for all j ≥ s′ + 1 and all s ≥ s′ the operators (B_s)⁻¹ B_j are small in norm. The resulting bound (29) permits the use of a Neumann series expansion (30), where it is assumed that the product symbol respects the non-commutative nature of the operators (B_s)⁻¹ B_j, j ≥ 1.
Our strategy is to estimate first the truncation error pointwise in y. Let B̃_s : H¹₀(D) → H¹₀(D) be defined in terms of the truncated coefficient, and observe that B̃_s is self-adjoint with respect to the inner product

  ⟨v, w⟩_{y_{≤s}} := ∫_D a(x, (y_{≤s}, 0, 0, ...)) ∇v(x) · ∇w(x) dx;

indeed, this is easily checked for any v, w ∈ H¹₀(D). Hence, from (30) we obtain an expansion of the truncation error, where we use the notation Σ_{η ∈ {s+1:∞}^m} := lim_{S→∞} Σ_{η ∈ {s+1:S}^m}, and the latter products are assumed to respect the non-commutative nature of the operators. Introducing ν(η) := (ν_i(η))_{i≥1} := (#{j = 1, ..., m : η_j = i})_{i≥1}, we define coefficients c_ν and note from (25) that c_ν = 0 if ν_i = 1 for some i ∈ supp(ν). Then, using (31), the error admits an upper bound which can be further simplified, since the sum over η collapses to a sum over the multi-indices ν(η). The dimension truncation error is then estimated by splitting this upper bound into two parts. Let m* ≥ 3 be an as yet undetermined index. The sum in the first term of (32) can be estimated in terms of b^ν := ∏_{i ∈ supp(ν)} b_i^{ν_i}. Furthermore, we obtain a bound on the first part using (28), (29), and the inequality e^x ≤ 1 + (e − 1)x for all x ∈ [0, 1]. Recalling (27), we find an upper bound for the sum in the second term of (32), where we use the estimates m − 1 ≤ 2^m and 2 Σ_{j=s+1}^∞ b_j < 1. Balancing the two parts by a suitable choice of m* yields the claimed estimate, with a constant C > 0 independent of s and q. This proves the theorem for s ≥ s′. The result can be extended to all s ≥ 1 by noting that, for 1 ≤ s < s′, the truncation error can be bounded directly by applying to both u and u_s the a priori bound identical to (22), at the cost of enlarging the constant. ✷

Remark 7. Theorem 6 can be generalised further to include a more complex model where the function ξ in (26) is replaced by an L_∞([0,1]) function ξ_j depending on j. Then, assuming that ∫₀¹ ξ_j(y) dy = 0 for j ≥ 1, that b̃_j := ||ξ_j||_∞ ||ψ_j||_{L_∞(D)} / a_min is nonincreasing in j, and moreover that ã satisfies (A2), the same argument as above establishes the same estimate as in Theorem 6.

Finite element error
Let assumption (A6) be in effect. Let {V_h}_h be a family of conforming finite element subspaces V_h ⊂ H^1_0(D), parameterized by the one-dimensional mesh size h > 0 and spanned by continuous, piecewise linear finite element basis functions. It is assumed that the triangulation corresponding to each V_h is obtained from an initial, regular triangulation of D by recursive, uniform partition of simplices.
For each y ∈ U, we denote by u_h(·, y) ∈ V_h the finite element solution to the system where q ∈ H^{−1}(D) and a is defined by (21). Under assumptions (A1)–(A2), this system is uniquely solvable and the finite element solution u_h satisfies both the a priori bound (22) and the partial derivative bounds (23). In analogy with the previous subsection, we also define the dimensionally truncated finite element solution by setting u_{s,h}(·, y) := u_{s,h}(·, (y_1, …, y_s)) := u_h(·, (y_1, …, y_s, 0, 0, …)), y ∈ U, where u_h(·, y) ∈ V_h is the solution of (35) for y ∈ U.
Proof. Let y ∈ U. From [17, Theorem 7.2], under the assumptions (A1), (A2), (A4), and (A6), we have for every g ∈ L^2(D) the following asymptotic convergence estimate as h → 0 where the constant C > 0 is independent of h and y. Therefore and this concludes the proof. ✷

Kernel interpolation error
We focus on approximating the finite element solution of the problem (20) in the following discussion, since it is essential for our approximation scheme that the function being approximated is pointwise well-defined in the physical domain D.
Let H(U_s) = H denote the RKHS of functions with respect to the stochastic parameter y ∈ U_s, defined in Section 2.1. For every x ∈ D, let u_{s,h,n}(x, ·) ∈ H(U_s) be the kernel interpolant of the dimensionally truncated finite element solution (36) at x as a function of y. We measure the L^2 approximation error ‖u_{s,h}(x, ·) − u_{s,h,n}(x, ·)‖_{L^2(U_s)} in y and then take the L^2 norm over x, arriving at the error criterion where, observing that u_{s,h} − u_{s,h,n} is jointly measurable, we interchanged the order of integration by appeal to Fubini's theorem.
Theorem 9. Under the assumptions (A1), (A2) and (A6), let u_{s,h}(·, y) ∈ H^1_0(D) denote the dimensionally truncated finite element solution of (35) for y ∈ U_s and let q ∈ H^{−1}(D) be the corresponding source term. Moreover, for every x ∈ D let u_{s,h,n}(x, ·) := A*_n(u_{s,h}(x, ·)) be the kernel interpolant at x based on a lattice rule satisfying the assumptions of Theorem 4. Suppose that α ∈ 2N and σ := α/2. Then we have for all λ ∈ (1/α, 1] that where c_D > 0 is the Poincaré constant of the embedding H^1_0(D) ↪ L^2(D), κ > 0 is the constant defined in Theorem 4, and

Proof. We can express the squared L^2 error as The first factor is the squared worst case L^2 approximation error, which can be bounded using Theorem 4. The second factor can be estimated using (2) by where we used the Cauchy–Schwarz inequality, Fubini's theorem, the Poincaré constant c_D > 0 of the embedding H^1_0(D) ↪ L^2(D), together with the PDE derivative bound (23) applied with ν = (σ, …, σ) = (α/2, …, α/2). The theorem is proved by combining the above expressions with Theorem 4. ✷

Next, we proceed to choose the weights γ_u and the parameters λ and α so that the constant C_s(λ) can be bounded independently of s, with λ as small as possible to yield the best possible convergence rate.

Choosing SPOD weights
One way to choose the weights is to equate the terms inside the two sums over u in the formula (38) for C_s(λ). (The value of C_s(λ) so obtained minimizes (38) with respect to γ_u for u ⊆ {1 : s}.) It will be shown that this yields the convergence rate O(n^{−(1/(2p) − 1/4)}) with an implied constant independent of the dimension s. This is precisely the rate of convergence that we expect to get. However, this choice of weights is too complicated to allow an efficient CBC construction of the lattice generating vector, so in the theorem below we propose a choice of SPOD weights that achieves the same error bound.
Proof. We proceed to justify the two choices of weights (39) and (40), and to show that in both cases the term C_s(λ) appearing in Theorem 9 can be bounded independently of s, by specifying λ and α as in the theorem. The first choice of weights (39) is obtained by equating the terms inside the two sums over u in the formula (38). Substituting (39) into (38) yields b_j for all j ≥ 1, while applying Jensen's inequality with 0 < 2λ/(1 + λ) ≤ 1.

Choosing POD weights
In the next theorem we prove that if the assumption (A3) holds for some p ∈ ⋃_{k=1}^∞ (2/(2k+1), 1/k], then it is possible to use POD weights to obtain the same rate of convergence as in Theorem 10. For this and the next subsection we need the sequence of Bell polynomials (more precisely, Touchard polynomials), which we denote by T_σ(x) := ∑_{m=0}^{σ} S(σ, m) x^m, where S(σ, m) denotes the Stirling number of the second kind as before.
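For readers who wish to check the combinatorics, a short Python sketch (purely illustrative, not part of the analysis; the function names are our own) evaluates the Touchard polynomial directly from the Stirling-number recurrence:

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def stirling2(n, k):
    # Stirling number of the second kind via the standard recurrence
    # S(n, k) = k * S(n-1, k) + S(n-1, k-1), with S(0, 0) = 1
    if n == k == 0:
        return 1
    if n == 0 or k == 0:
        return 0
    return k * stirling2(n - 1, k) + stirling2(n - 1, k - 1)

def touchard(sigma, x):
    # Touchard (exponential Bell) polynomial T_sigma(x) = sum_m S(sigma, m) x^m
    return sum(stirling2(sigma, m) * x**m for m in range(sigma + 1))
```

At x = 1 the Touchard polynomial reduces to the Bell number B_σ (1, 1, 2, 5, 15, 52, …), which gives a quick sanity check.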
In consequence, we have We can use the ratio test to determine sufficient conditions for the convergence of the infinite sum over ℓ. For ℓ > 0, we find that In conclusion, by choosing 2λ/(1 + λ) = p ⟺ λ = p/(2 − p), it follows from Theorem 9 that the convergence is independent of s with rate O(n^{−(1/(2p) − 1/4)}). Unfortunately this condition cannot be fulfilled for all values of p, since α = 2σ needs to be an even integer. Indeed, the condition is equivalent to The Lebesgue measure of the set of admissible values for p is precisely If p falls outside this set, one can instead take a smaller value of k and a correspondingly larger value of λ. The theorem then holds but with some loss in the rate of convergence.

Choosing product weights
In the next theorem we enlarge our error bounds in order to obtain product weights, which have the benefit of a lower computational cost (see Section 5), but with the disadvantage of a compromised theoretical convergence rate.
Theorem 12. Assume that (A1)–(A3), (A5) and (A6) hold, and further assume that p < 1/2. We define product weights with γ_∅ := 1. Then the kernel interpolant of the PDE solution in Theorem 9 satisfies where the constant C > 0 is independent of the truncation dimension s. =: V_ℓ, with Now one can easily check using the ratio test that the term C_s(λ) can be bounded independently of s as long as the series ∑_{j=1}^∞ (j^σ b_j)^{2λ/(1+λ)} is convergent. From the monotonicity of (b_j)_{j≥1} in assumption (A5) it follows that b_j ≤ C j^{−1/p}. Taking into account also the requirement that 1/α < λ ≤ 1 and that α = 2σ be an even integer, we have the constraint We consider two scenarios below depending on the value of the maximum.

Scenario A. If 2σ ≤ 2/p − 1 − 2σ then p ≤ 2/(4σ + 1) and σ ≤ 1/(2p) − 1/4, while the condition (48) simplifies to 1/(2σ) < λ ≤ 1. Since σ must be an integer and at least 1, this scenario applies only when p ∈ (0, 2/5]. In this case the best convergence rate is obtained by taking λ as close to 1/(2σ) as possible and σ as large as possible. Hence we take σ := ⌊1/(2p) − 1/4⌋ and λ := 1/(2σ − 4δ) for arbitrary δ ∈ (0, σ/2 − 1/4). By Theorem 9 this yields a convergence rate with the implied constant independent of the dimension s, but approaching ∞ as δ → 0.
If p ∈ (2/5, 1/2) then only Scenario B applies. If p ∈ [1/3, 2/5] then only Scenario A applies. If p ∈ (0, 1/3) then both scenarios apply, and it remains to resolve which scenario to use in order to obtain the better convergence rate. For convenience we abbreviate x := 1/(2p) − 1/4 and m := ⌊1/(2p) − 1/4⌋, noting that m ≥ 1 since p < 1/3. Scenario B has a better convergence rate than Scenario A if and only if ⌊x⌋/2 < x − ⌈x⌉/2. The latter condition is not satisfied if x ∈ Z, while for x ∉ Z the condition is equivalent to ⌊x⌋ + 1/2 < x < ⌈x⌉. Hence the condition is equivalent to We conclude that for the case p <

Combined approximation error
The combined approximation error for the PDE problem (20) can be decomposed as where the first term is the dimension truncation error, the second term is the finite element error, and the final term is the kernel interpolation error. Combining the results developed in Sections 4.1–4.3, we arrive at the following result.
Theorem 13. Assume that (A1)–(A6) hold. For any y ∈ U, let u(·, y) ∈ H^1_0(D) denote the solution to (20) with the source term q ∈ H^{−1+t}(D) for some 0 ≤ t ≤ 1. Let u_{s,h}(·, y) ∈ V_h be the corresponding dimensionally truncated finite element solution and let u_{s,h,n}(x, ·) = A*_n(u_{s,h}(x, ·)) be its kernel interpolant constructed using the weights described in Theorems 10, 11, or 12. Then we have the combined error estimate where 0 ≤ t ≤ 1, h denotes the mesh size of the piecewise linear finite element mesh, C > 0 is a constant independent of s, h, n and q, and with weights (39) or SPOD weights (40), and δ > 0 is sufficiently small in each case.
Cost analysis

What is the point set at which values are wanted?
In this section we consider the cost of evaluating the kernel interpolant as an approximation to the periodic function f, with lattice points t_k = {kz/n}, k = 1, …, n, and t_n = t_0 = 0. Recall that all our functions, including the kernel, are 1-periodic with respect to y. For the linear system (7), as observed already, the matrix K = [K(t_k − t_{k′}, 0)]_{k,k′=1,…,n} is circulant, thus we need to compute only its first column (see the cost for evaluating the kernel in the next subsection) and then solve for the coefficients a_k with a cost of O(n log(n)).
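For readers who wish to experiment, a lattice point set of this form is generated in a few lines; the Python sketch below uses an arbitrary illustrative generating vector, not one obtained from the CBC construction discussed in the paper.

```python
import numpy as np

def lattice_points(z, n):
    """Rank-1 lattice points t_k = {k z / n}, k = 1, ..., n, where {.} is the
    componentwise fractional part (so that t_n = t_0 = 0)."""
    z = np.asarray(z, dtype=np.int64)
    k = np.arange(1, n + 1)[:, None]   # column of indices k = 1, ..., n
    return (k * z % n) / n             # fractional part of k*z/n

# e.g. an (arbitrary, illustrative) generating vector z = (1, 3), s = 2, n = 8
pts = lattice_points([1, 3], 8)
```

Each row of `pts` is one lattice point in [0, 1)^s, and the last row is the origin, matching the convention t_n = t_0 = 0.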
First, however, it turns out to be useful to ask: what is the set of points, say {y_1, y_2, …}, at which the values of the interpolant are desired? If L such points y_ℓ, ℓ = 1, …, L, are chosen arbitrarily, then the cost, naturally, is L times the cost of a single evaluation. On the other hand, for a set of Ln points formed by the union of shifted lattices y_ℓ + t_{k′}, ℓ = 1, …, L, k′ = 1, …, n, it turns out that the cost for Ln evaluations is little more than the cost of the L evaluations at arbitrary points.
The reason for the low cost lies in the shift invariance of the kernel and the group nature of the lattice. For a single given y the principal costs for evaluating the kernel interpolant come from evaluating K(t_k, 0) and f(t_k) at the n lattice points; from solving the circulant linear system (7) for the n values of a_k; from evaluating K(t_k, y) at the n lattice points; and finally from assembling f_n(y) with a cost of O(n). (The precise cost breakdown is given in Table 1 below, after we discuss the cost for evaluating the kernel in the next subsection.) For evaluation at the shifted points y + t_{k′}, on the other hand, the right-hand side takes the form of a circulant n×n matrix multiplying a vector of length n, so the n values f_n(y + t_{k′}) for k′ = 1, …, n can be assembled with a cost of O(n log(n)), compared with the O(n) cost of assembling f_n at a single value of y.
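The FFT-based solve and the shifted-lattice evaluation just described can be sketched as follows in Python; the one-dimensional setting and the specific Korobov-type kernel are illustrative assumptions, not the construction used in the paper.

```python
import numpy as np

def kernel(x, y):
    # An illustrative 1-periodic, shift-invariant kernel (Korobov type,
    # alpha = 2); t^2 - t + 1/6 is the degree-2 Bernoulli polynomial
    t = (x - y) % 1.0
    return 1.0 + 2.0 * np.pi**2 * (t**2 - t + 1.0 / 6.0)

n = 16
t = np.arange(n) / n                   # 1-D lattice points, t_0 = 0
f = lambda x: np.sin(2 * np.pi * x)    # a smooth 1-periodic target

# The interpolation matrix [kernel(t_k, t_k')] is circulant, so the linear
# system K a = f(t) is solved with FFTs in O(n log n):
c = kernel(t, 0.0)                     # first column of K
a = np.fft.ifft(np.fft.fft(f(t)) / np.fft.fft(c)).real

# Evaluating the interpolant on a whole shifted lattice y0 + t_k' is again a
# circulant matrix-vector product, i.e. one FFT-based correlation:
y0 = 0.3
w = kernel(t, y0)                      # w_k = kernel(t_k, y0)
g = np.fft.ifft(np.fft.fft(a) * np.conj(np.fft.fft(w))).real
# g[kp] is the interpolant value f_n(y0 + t[kp])
```

The entire vector g of n values thus costs O(n log n), in line with the cost discussion above.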
Table 2: Cost breakdown for the kernel interpolant u_{s,h,n} based on n lattice points t_k in s dimensions, evaluated at M finite element nodes x_i and L arbitrary points y_ℓ. Here M^a for some positive a is the cost for one finite element solve with M nodes.

Operation \ Weights | Product | POD | SPOD

The first rows of the table give the construction cost, while the last three rows are the running cost for sampling. The cost for the fast CBC construction based on the criterion S_s(z) with different weight parameters is analyzed in [4].
For the PDE application, our kernel method is where {x_i : i = 1, …, M} ⊂ D is the set of finite element nodes in the physical domain, and a_k(x_i) for k = 1, …, n is the solution for fixed x_i of the linear system Let M^a for some a ≥ 1 denote the cost of the finite element solve at all nodes x_i for one y. The cost breakdown for obtaining the kernel interpolant at all M nodes for all L samples is shown in Table 2. Note in this case that the coefficients a_k(x_i) need to be computed for every finite element node x_i, hence the scaling of the cost in line 4 of Table 2 by M. If the quantity of interest is a linear functional of the PDE finite element solution (so there is no need for the solution at every node), then the cost is reduced to be as in Table 1 with X = M^a.
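Since the circulant matrix in this linear system is the same for every finite element node x_i, the M coefficient vectors a_k(x_i) can be obtained from a single batched FFT along the lattice axis. The following Python sketch illustrates this; the Korobov-type kernel and the random right-hand sides are synthetic placeholders, not the paper's actual PDE data.

```python
import numpy as np

M, n = 5, 16                      # finite element nodes, lattice points
t = np.arange(n) / n
# First column of the (shared) circulant kernel matrix; here an
# illustrative Korobov-type kernel with alpha = 2
c = 1.0 + 2.0 * np.pi**2 * (t**2 - t + 1.0 / 6.0)

rng = np.random.default_rng(0)
U = rng.standard_normal((M, n))   # placeholder for the values u_{s,h}(x_i, t_k)

# All M circulant systems share the same eigenvalues fft(c), so one batched
# FFT along the lattice axis yields every coefficient vector a_k(x_i):
A = np.fft.ifft(np.fft.fft(U, axis=1) / np.fft.fft(c), axis=1).real
```

Row i of `A` holds the coefficients for node x_i, so the M solves together cost O(M n log n) rather than M separate dense solves.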

Numerical experiments
We consider the parametric PDE problem (16)–(17) in the physical domain D = (0, 1)^2 with the source term q(x) = x_2 and the diffusion coefficient, periodic in the parameters y, given by (21).
For each fixed y ∈ U_s (i.e., with the sum in (21) truncated to s terms), we solve the PDE using a piecewise linear finite element method with mesh size h = 2^{−5}. As the stochastic fluctuations, we consider the functions where c > 0 is a constant, θ > 1 is the decay rate of the stochastic fluctuations, and s ∈ N is the truncation dimension. Following (24), the sequence (b_j)_{j≥1} is taken to be normalized by ζ(θ), ensuring that the assumption (A2) is satisfied. We approximate the dimensionally truncated finite element solution u_{s,h} of the PDE (16)–(17) by constructing a kernel interpolant u_{s,h,n}(x, y) := A*_n(u_{s,h}(x, y)), x ∈ D and y ∈ U_s, using SPOD weights, POD weights, and product weights chosen according to Theorem 10, Theorem 11, and Theorem 12, respectively. The same weights appear both in the formula for the kernel and in the search criterion for finding good lattice generating vectors. The kernel interpolant is constructed over a lattice point set t_k := {kz/n}, k ∈ {1, …, n}, where the generating vector z ∈ {1, …, n − 1}^s has been obtained separately for each weight type using the fast CBC algorithm detailed in [4]. We assess the kernel interpolation error by computing error = where y_ℓ for ℓ = 1, …, L is a sequence of Sobol′ nodes in [0, 1]^s, with L = 100, and we recall that all our functions, including u_{s,h}(x, y) and u_{s,h,n}(x, y), are 1-periodic with respect to y. The kernel interpolant in the formula above can be evaluated efficiently over the union of shifted lattices y_ℓ + t_k, ℓ = 1, …, L, k = 1, …, n, by making use of formula (49) in conjunction with the fast Fourier transform, requiring only the evaluation of the values K(t_k, y_ℓ). We compute the approximation error for θ ∈ {1.2, 2.4, 3.6}, choosing p ∈ {1/1.1, 1/2.2, 1/3.3}, respectively, which are all p values ensuring that (A3) is satisfied. We also use several values of the parameter c ∈ {0.2, 0.4, 1.5} to control the difficulty of the problem. We set δ = 0.1 in the product weights (46). The numerical experiments have been carried out using both s = 10 and s = 100 as the truncation dimensions.

Figure 2: The kernel interpolation errors for the PDE problem (16)–(17) with θ = 2.4, p = 1/2.2, c ∈ {0.2, 1.5}, and s ∈ {10, 100}. Results are displayed for kernel interpolants constructed using product (PROD), POD, and SPOD weights.

Figure 3: The kernel interpolation errors for the PDE problem (16)–(17) with θ = 3.6, p = 1/3.3, c ∈ {0.2, 1.5}, and s ∈ {10, 100}. Results are displayed for kernel interpolants constructed using product (PROD), POD, and SPOD weights.

Selected results are displayed in Figures 1–3, where the corresponding values of a_min and a_max are listed to give insight into the difficulty of the problem in each case, as well as the parameter σ which shows the "order" of the lattice rule. Note that as s increases the problem does not change, but the computation becomes harder because the diffusion coefficient takes a wider range of values, with small values of a(x, y) being especially challenging.
The empirically obtained convergence rates appear to exceed the theoretically expected rates once the kernel interpolant enters the asymptotic regime of convergence. The convergence behavior of the kernel interpolant with SPOD weights is good across all experiments, except for the most difficult PDE problem of the lot, corresponding to the parameters θ = 1.2 and c = 0.4, illustrated in the bottom row of Figure 1. On the other hand, the POD weights and, to a lesser extent, the product weights appear to be somewhat sensitive to the effective dimension of the PDE problem, either leading to a longer pre-asymptotic regime compared to SPOD weights (see "PROD" in the bottom row of Figure 2) or to no apparent convergence (see "POD" in the bottom rows of Figures 2 and 3).
In the top graph of Figure 4 we compare the results from Figures 1–3 for SPOD weights with truncation dimension s = 100, the same damping parameter c = 0.2, and different θ ∈ {1.2, 2.4, 3.6}, listing the estimated convergence rate in each case. In the middle graph of Figure 4 we show the results of an additional experiment: for s = 100 we fix the decay rate θ = 3.6 of the stochastic fluctuations and solve the parametric PDE problem using different σ ∈ {1, 2, 3} in the formula for the SPOD weights, corresponding to p ∈ {1/1.1, 1/2.2, 1/3.3}; see Theorem 10. Finally, in the bottom graph of Figure 4, we return to the experimental setup illustrated in Figure 1, except that this time we carry out the experiment using the truncation dimensions s ∈ {10, 20, 40, 80, 160}.
In all cases displayed in Figure 4, the observed error decays faster than the rate implied by Theorem 10. We also see that increasing σ improves the error and mildly improves the rate of convergence. Moreover, we observe from the middle graph that the parameter θ, which governs the decay of ‖ψ_j‖_{L^∞(D)}, is more important in determining the rate than the choice of σ. This observation suggests that kernel interpolation with rank-1 lattice points is robust with respect to σ. Notice that σ appears in the definition (40) of the SPOD weights, and that the weights are an input of the CBC construction and are used to define the kernel K(·, ·). These observed error decay rates and the robustness are encouraging, but also suggest that the worst-case error estimates may be pessimistic in practical situations. The bottom graph in Figure 4 illustrates the effect that the dimension has on the obtained convergence rates: we see that the observed convergence rate remains reasonable even when s = 160.

Finally, we present numerical experiments that assess the dimension truncation error rate given in Theorem 6. We consider the same PDE and stochastic fluctuations (ψ_j)_{j≥1} stated at the beginning of this section. We choose the parameters c = 0.4 with θ ∈ {1.2, 2.4, 3.6} and c = 1.5 with θ ∈ {2.4, 3.6}. The PDE is discretized using a piecewise linear finite element method with mesh size h = 2^{−6}, and the integral over the computational domain D is computed exactly for the finite element solutions. As the reference solution, we use the finite element solution with truncation dimension s′ = 2^{11}. The dimension truncation error is then estimated by computing ∫_{U_{s′}} ∫_D (u_{s′,h}(x, y) − u_{s,h}(x, y))^2 dx dy for s = 2^k, k = 2, …, 10, where the value of the parametric integral is computed approximately by means of a rank-1 lattice rule based on the off-the-shelf generating vector lattice-39101-1024-1048576.3600, downloaded from https://web.maths.unsw.edu.au/~fkuo/lattice/ with n = 2^{17} nodes. The results are displayed in Figure 5. The theoretically expected rate, which is essentially O(s^{−θ+1/2}), is clearly observed in all cases.
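The empirical rates quoted in this section can be read off from a least-squares fit on a log-log scale. A minimal sketch of that fitting step follows, using synthetic error data purely for illustration (the actual errors come from the experiments above):

```python
import numpy as np

# Fit error ~ C * s**(-r) by linear least squares in log-log coordinates
s = 2.0 ** np.arange(2, 11)      # truncation dimensions s = 4, 8, ..., 1024
err = 3.0 * s ** (-1.9)          # synthetic stand-in for observed errors

slope, intercept = np.polyfit(np.log(s), np.log(err), 1)
rate = -slope                    # estimated convergence rate r
const = np.exp(intercept)        # estimated constant C
```

On data that follow a clean power law, as here, the fitted rate matches the true exponent; on real experimental data the fit is typically restricted to the asymptotic regime.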

Conclusions
In this paper we have developed an approximation scheme for periodic multivariate functions, based on kernel interpolation at lattice points, in a setting of weighted Hilbert spaces of dominating mixed smoothness. We have developed L^2 error estimates that are independent of dimension for three classes of weights: product weights, POD (product and order dependent) weights, and SPOD (smoothness-driven product and order dependent) weights. Numerical experiments in 10 and 100 dimensions give results that (with the possible exception of POD weights) are generally satisfactory, and that exhibit better than predicted rates of convergence.
Nevertheless, there is room for future improvement. First, the error analysis is based on the principle that the L^2 error is bounded above by the worst-case L^2 error multiplied by the norm of the function being approximated; yet it is known (see Section 3.1) that the worst-case error has a poor rate of convergence. It may be possible to obtain improved error rates by making better use of the special properties of the minimum-norm interpolant in conjunction with the analytic parameter dependence of the PDE solution of (16)–(17).
dy, and then deduce by the Poincaré inequality ‖u‖_{L^2(D)} ≤ c_D ‖u‖_{H^1_0(D)}, with c_D > 0 depending only on the domain D, together with uniform coercivity, that ∫_U ∫_D (u − u_s)^2 dx dy ≤ c_D^2 ∫_U ∫_D |∇(u − u_s)|^2 dx dy ≤ (c_D^2 / a_min) S.

and (with b_j replaced by b_j^2 and p replaced by p/2) by [17, Theorem 5.1] it holds that the terms (33)–(34) can be balanced by choosing m* = ⌈(2 − p)/(1 − p)⌉. One arrives at the dimension truncation bound ∫_U ∫_D (u − u_s)^2 dx dy for ∅ ≠ u ⊂ N, |u| < ∞, with γ_∅ := 1. Then the kernel interpolant of the finite element solution in Theorem 9 satisfies
As an application, we apply our kernel interpolation scheme to a forward uncertainty quantification problem, namely a PDE problem with an uncertain, periodically parameterized diffusion coefficient, fitting the theoretical framework considered in the preceding sections. The kernel interpolant can be postprocessed with low computational cost to obtain statistics of the PDE solution itself, or of functionals of the solution, for uncertainty quantification. Letting D ⊂ R^d, d ∈ {1, 2, 3}, be a bounded domain with Lipschitz boundary, we consider the problem of finding u specified by the sequences (Γ_ℓ)_{ℓ≥0} and (γ_{j,ν})_{j≥1} for each ν = 1, …, σ, where |ν_u| := ∑_{j∈u} ν_j.

Table 1:
Cost breakdown for the kernel interpolant f_n based on n lattice points t_k in s dimensions, evaluated at L arbitrary points y_ℓ. Here X is the cost for one evaluation of f.