Adaptive Quasi-Monte Carlo Finite Element Methods for Parametric Elliptic PDEs

We introduce novel adaptive methods to approximate moments of solutions of partial differential Equations (PDEs) with uncertain parametric inputs. A typical problem in Uncertainty Quantification is the approximation of the expected values of quantities of interest of the solution, which requires the efficient numerical approximation of high-dimensional integrals. We perform this task by a class of deterministic quasi-Monte Carlo integration rules derived from Polynomial lattices, that allows to control a-posteriori the integration error without querying the governing PDE and does not incur the curse of dimensionality. Based on an abstract formulation of adaptive finite element methods (AFEM) for deterministic problems, we infer convergence of the combined adaptive algorithms in the parameter and physical space. We propose a selection of examples of PDEs admissible for these algorithms. Finally, we present numerical evidence of convergence for a model diffusion PDE.


Introduction
The study of problems governed by parametric Partial Differential Equations (PDEs) has seen a steady development in recent years with an eye to applications to computational sciences and engineering. The general common methodology is to treat the parametric equation as a family of equations with given data and to query a possibly large number of them, by well-known solvers.
In the spirit of Uncertainty Quantification (UQ), we aim at the approximation of low-order moments of a linear goal functional G ∈ V * (also called quantity of interest or observable) of the solution u : U → V of a parametric PDE. We are particularly interested in the case of a large number s of parameters, all independent uniformly distributed on the interval [− 1 2 , 1 2 ]. Moments of u are then expressed as high-dimensional integrals with respect to the Lebesgue measure μ on U := [− 1 2 , 1 2 ] s , which is a probability measure. In particular, we consider problems of the form: given s ∈ N and k = 1, 2, 3, . . . find where, for all y ∈ U , u( y) ∈ V solves a linear variational problem a y (u( y), w) = l y (w) ∀w ∈ W , with smooth dependence on the parameters y = (y 1 , . . . , y s ).
As an example, our general framework includes a diffusion equation modeling the Darcy's flow in uncertain porous media − div(a(x, y)∇u(x, y)) = f (x) x ∈ D, u(·, y) on a bounded domain D. The PDE (3) has been used in several works as a model problem to develop deterministic UQ techniques [3,8,14,16,18,31]. A more computationally challenging (and less studied) parametric PDE arises from linear elasticity, where the Young modulus E(x, y) is uncertain − div E(x, y) 2(1+ν) [∇u(x, y) + (∇u(x, y)) ] + ∇ p(x, y) = f (x), div(u(x, y)) + c −1 p(x, y)/E(x, y) = 0, with suitable boundary conditions, see (42) and the constant c := ν (1+ν) (1−2ν) is only dependent on the Poisson ratio ν. For nearly incompressible materials (i.e. ν ≈ 1 2 ), a suitable mixed formulation inspired from [29] will be used to avoid the so-called locking effect, while keeping a smooth parametric dependence. We address these PDEs more in detail in Sect. 3.6 below, in the case of affine dependence on y of the data a(·, y) or E(·, y).
For the computation of (1), deterministic quasi-Monte Carlo (QMC) integration is proven to outperform standard Monte Carlo sampling: suitable assumptions on the regularity of the parameter to solution map u : U → V are known to grant dimension independent and higher order decay of the quadrature error, for deterministic QMC rules derived from Polynomial lattices, comprising Interlaced Polynomial lattices [12,23,24] and Extrapolated Polynomial lattices (EPL) [10,11].
Moreover, EPL rules allow for an easily computable a-posteriori error estimator, that is known to be asymptotically exact and free of the curse of dimensionality [10]. We remark that other a-posteriori estimation techniques were developed for Sobol' points and Rank-1 Lattices in [27,28], but the analysis there provides no dimension robust asymptotic exactness. For deterministic PDEs, quasi-optimality of Adaptive Finite Element Methods (AFEM) has been extensively studied, we refer to [6] for classical results on elliptic diffusion PDEs and to [5,17,21] and the references therein for more recent developments towards an abstract analysis. When including uncertainty in the underlying PDE, in order to maintain the computational cost to a minimum, it is crucial to estimate the error for the parametric solution and in particular to determine adaptively a finite sampling set P contained in U and a suitable Finite Element space of the PDE, such that a given error tolerance is met a-posteriori. Existing approaches involve adaptive stochastic Galerkin, studied in [2,15,16] and more recently adaptive collocation methods on sparse grids [14,18]. We extend these results to sampling based on QMC rules, while leveraging the aforementioned abstract AFEM framework of the Axioms of adaptivity [5].
The purpose of this work is to introduce a family of adaptive algorithms, to approximate solutions of many-query problems, based on deterministic QMC sampling on the parameter box U . Our contribution is to provide convergence results of these algorithms, without incurring in the curse of dimensionality, in a generic framework comprising several common PDE problems, where the parametric error estimator is independent of the underlying PDE. We employ parametric error estimators that only depend on the computed discrete solution u T : U → V T (where V T is a finite dimensional space), while its computation is independent of a) the specific discretization space V T , b) the Eq. (2) satisfied by u and c) the PDE solver used. Moreover, we pay particular attention to modularity of the algorithm, i.e. we break the overall computation into smaller parts, each with its requirements, in order to be able to reuse existing implementations. In fact, any other adaptive and reliable discretization can be used in place of AFEM in Algorithms 2 and 3 below.
To summarize, we leverage recent progress in QMC -in particular EPL rules -and established AFEM results to obtain adaptive, deterministic, reliable and non-intrusive computational strategies for UQ, that are free of the curse of dimensionality.
The structure of the paper is as follows: in Sect. 2 we introduce the problem and we summarize the relevant notation and results from quasi-Monte Carlo integration with Polynomial lattice rules and convergence of AFEM. Sect. 3 is devoted to the description and proof of convergence of 3 different adaptive procedures; for each of them, we show that it is possible to include goal oriented adaptivity as described in [3,17]. Additionally, we indicate a few examples of problems that can be solved with our method. In Sect. 4 we present numerical experiments for a model PDE with random diffusion. Additional material, including a brief introduction to Polynomial lattices and a theoretical analysis the computational cost, is given in the Appendix.

Preliminaries
In this section we formulate the problem and illustrate our working assumptions. Let V , W be reflexive Banach spaces of functions defined on a Lipschitz domain D ⊂ R d , d ∈ {2, 3}. For y ∈ U , let a y : V × W → R be a bilinear form and l y ∈ W * , W * denoting the topological dual of W .
In order to ensure that the linear PDE (2) is well-posed we shall impose the following, [4].

Assumption 2.1
The data a y , l y satisfy uniform, with respect to y ∈ U , inf-sup conditions and continuity for some 0 < λ < < ∞ independent of y. Moreover, we assume, for some 0 < C l < ∞ sup y∈U l y W * ≤ C l .
Under Assumption 2.1, we have the following a-priori estimate Let G ∈ V * be the sought Quantity of Interest. Then, given a small tolerance ε > 0, we want to compute a Q ∈ R such that It is clear that we have multiple sources of error to take into consideration. First, we have the quadrature error in approximating the expectation by sampling with quasi-Monte Carlo rules. Second, we include the discretization error as the solution u( y) comes from a PDE problem and we cannot expect in general to recover it exactly.
Additionally, one could consider dimension truncation error, that arises in the treatment of countably many parameters by means of a quadrature rule over a finite dimensional set U , [22]. We exclude this error from the analysis and we assume that the dimension s ∈ N is finite throughout the rest of the discussion.

Quasi-Monte Carlo a Posteriori Error Estimation
In order to determine a stopping criterion for the QMC-AFEM algorithms below, we use the asymptotically exact a-posteriori estimator from [10, Section 4] derived from the socalled Polynomial lattices P m of cardinality 2 m , m ∈ N. In Appendix A, we recall briefly the construction of P m used here. QMC integration rules are sample averages that employ deterministic sampling sets, in our case P m The fundamental result from QMC theory that we will need to overcome the curse of dimensionality is the next Proposition 2.1, first proved in [10,11]. In particular, we consider infinitely differentiable integrands F := G k (u), satisfying certain bounds on the derivatives uniformly in y ∈ U and in the parametric dimension s ∈ N, as we specify next. To this end, we fix some notation: consider a multiindex ν = (ν j ) j∈N ∈ N N 0 with finite support supp(ν) := j : ν j > 0 < ∞. We write ν! := j∈supp(ν) ν j !, and denote the partial derivatives ∂ ν y := ∂ ν 1 y 1 ∂ ν 2 y 2 · · · . We also write, for a real valued sequence β, β ν := j∈supp(ν) β ν j j . Then, we require derivative bounds of the form for some κ ≥ 0, C > 0, β independent of ν, s and β ∈ p (N) for some p ∈ (0, 1 2+κ ). Alternatively, we can assume bounds of the form where β ∈ p (N), p ∈ (0, 1).

Proposition 2.1
Assume that (10) is satisfied for some β ∈ p (N) for all p > 1 2 , or that (9) holds with β ∈ p (N) for some 0 < p < 1 2+κ . Then, for F := G k (u) a sequence of Polynomial lattice rules (Q 2 m ) m∈N can be constructed so that for a constant C independent of m, s. Moreover, for all δ > 0, with the hidden constant in O(·) independent of s, m but dependent on δ.
Proof In the case k = 1, (11) follows combining the derivative bounds (9), (10)  We denote the QMC a-posteriori error estimator by For completeness, we mention a criterion to verify (9) in Appendix B, for the special case of bilinear forms a y with affine dependence on the parameters. However, the parametric regularity bound (9) can be verified with alternative methods, also for non-affine parametric operators, based on holomorphic extensions of a y for complex parameters y ∈Ũ ⊆ C s , U ⊆Ũ . For more details we refer to [13]. On the other hand, (10) can also be verified in some situations [23]. In what follows, we will assume that Assumption 2.1 and either (9) or (10) are available for the parametric solution map u : U → V .

Modules of AFEM
We mentioned that discretization error occurs in the solution of (2), for any instance of y ∈ U . In this section we precise our discretization method of choice.
We restrict ourselves to polyhedral Lipschitz domains D ⊂ R d , d ∈ {2, 3}. A mesh T on D is defined as a finite collection of compact sets T ∈ T , |T | > 0 such that T ∈T T = D and |T ∩ T | = 0, for all T , T ∈ T with T = T . We assume availability of finite-dimensional spaces V T ⊂ V , W T ⊂ W linked to a mesh T on D with dim(V T ) = dim(W T ) and such that the following stable discrete inf-sup condition hold: forλ > 0 independent of T ∈ T and y ∈ U , Then, u T ( y) ∈ V T denotes the unique solution of the problem corresponding to (2). We will often use the shorthand notation T ≤ T , meaning that the mesh T can be obtained from another mesh T by possibly multiple applications of the module REFINE, as described in Assumption 2.2 below. Further, we fix an initial mesh T 0 of D and we denote by T := {T : T 0 ≤ T } the set of admissible refinements of the initial mesh T 0 .

Algorithm 1 AFEM
Input: a, l, tol, T 0 Output: u T , T 1: T ← T 0 2: while True do 3: T ← REFINE(T , M) 10: end while The well-established Adaptive FEM algorithm, see Algorithm 1, is composed of the four modules SOLVE, ESTIMATE, MARK and REFINE, plus a stopping criterion determined by a given tolerance tol.
The parameter dependent error indicators {η y,T (T )} T ∈T , are computable values that approximate the local FEM error u( y) T

− u T ( y)
T V corresponding to each cell T ∈ T : these are used to determine which cells to refine, to drive the global error to 0. Following the description in [21], we state the abstract assumptions for Algorithm 1 to ensure error convergence of AFEM, pointwise for all y ∈ U .

Assumption 2.2 AFEM modules for parametric problems:
• For given y ∈ U and u T ( y) ∈ V T , ESTIMATE computes positive real numbers η y,T (T ) T ∈T , called indicators. We assume that the indicators satisfy, for all T , T with T 0 ≤ T ≤ T the stability over non-refined elements (16) and reduction over refined elements where q red ∈ (0, 1) and the functions S, R : [0, ∞) → [0, ∞) are continuous at 0 with S(0) = R(0) = 0 and monotocally increasing. We assume that S(·), R(·), q red are independent of y ∈ U . Furthermore, we assume reliability: there exists a constant c * > 0 such that • The marking procedure MARK selects, based on a set of indicators η y,T (T ) computed in the previous step, a subset M ⊂ T of cells that will be refined. We assume that there exists a function M : • The REFINE module, for a given mesh T and a set of marked elements M ⊆ T , produces a new mesh T such that T ∩ M = ∅. We assume that parents are union of their children, that is T = T ∈ T : T ⊆ T for all T ∈ T . We stress that M ⊆ T \T , that is REFINE can in principle refine more than the marked set. To simplify the presentation, we further assume conformity • For the module SOLVE, we assume that the Galerkin solution u T ( y) of (15) can be recovered exactly for every y ∈ U , which entails exact integration and linear algebra.
We stress that the availability of c * (18) depends implicitly on the set T, and hence on the REFINE module. In practice, usually c * depends on λ, from (5),(6) and on the shape regularity of a mesh T , and hence it is often required that REFINE does not generate strongly anisotropic meshes, i.e. T is uniformly shape-regular. Typical MARK strategies, as the Dörfler criterion, are known to satisfy (19), see e.g. [21].
Let (u T ( y) ( y)) ∈N , T := T ( y) be the sequence of approximations produced by the AFEM loop with T +1 ( y) = REFINE(T ( y), M ( y)), M ( y) = MARK( η y,T ( y) (T ) ) ⊆ T ( y) for all ∈ N; then, as a corollary of [21, Theorem 3.1] we get the following pointwise convergence result. (2) satisfying Assumption 2.1. Let AFEM satisfy (14) and Assumption 2.2. Then, for all y ∈ U and all initial meshes T ∈ T, it returns in finite time T ( y) and u T ( y) ( y) such that

Lemma 2.2 Consider a problem of the form
for a constant c * > 0 independent of tol, y and T ∈ T.
Proof From [32], for all y there exists u ∞ ( y) ∈ V such that Hence, the result follows from [21, Theorem 3.1] and reliability (18).

A First Convergence Result
In this section we present a first combined QMC-AFEM algorithm, that outputs an approximation of I (G k (u)) for a given tolerance ε. For simplicity, we consider the case k = 1 in (1) and postpone the case k > 1. Algorithm 2 is in fact not efficient for implementation, but it illustrates effectively the key ideas. First of all, we observe that E 2 m can be fully evaluated by means of quantities Q 2 m , Q 2 m−1 that have already been computed, when we loop over m. In other words, when adding more QMC points we reuse part of the work done previously so that the cost to compute E 2 m is negligible. A second crucial observation is that each call of the Algorithm 1 results (in principle) in a different mesh T ( y) ≥ T 0 , starting from a common coarse mesh T 0 . In particular, G(u T ( y) ( y)) may not be even continuous with respect to y ∈ U , and hence in general Proposition 2.1 is not applicable for G(u T (·) (·)) regardless of the discretization scheme.
Proposition 3.1 Let T 0 ∈ T. Assume that G ∈ V * , that u satisfies either (9) or (10) for a sequence β as in Proposition 2.1 and that ∀ y ∈ U and any tolerance tol > 0, AFEM returns Generate lattice P m 4: for y ∈ P m do 5: (20) holds for a constant c * independent of tol. Then, for any ε > 0, there exist choices ε Q and ε F := tol, with ε −1 ε F , ε −1 ε Q independent of ε, such that

Algorithm 2 stops in finite time and 2. it produces an approximation of I
Proof Let ε F be the tolerance for AFEM. To prove the first item it is sufficient to show that, By linearity of G, Proposition 2.1 also implies that |E 2 m (G(u))| → 0 as m → ∞ and hence the claim. Now we show the second item: we separate the error due to the Finite Element discretization from the QMC integration error as follows For the FEM error we have For the QMC error we apply Proposition 2.1 to get for all δ > 0 Hence, for given ε we can choose ε F := ε 6 G V * and ε Q := 2 5 c * ε > 2c * G V * ε F and obtain We remark that a sharp value for the reliability constant c * is usually not known but (potentially pessimistic) upper bounds exist. The size of c * can be controlled for structured meshes and refinement by bisection in spatial dimension d = 2. Algorithm 2 entails a decoupling of the QMC sampling with a AFEM solver. In practice, this implies that an adaptive software can be integrated into such algorithm in a non-intrusive manner, provided that the reliability (20) is satisfied for some variational space V and G ∈ V * . This feature can be advantageous in many situations, especially when a solver is complex to implement. However, it presents two main computational difficulties: • Algorithm 2 recomputes a mesh T ≥ T 0 for the domain D, for each QMC sample y ∈ P m as well as for all iterations over m, which for complex geometries is an expensive step. • Imposing the same AFEM threshold (20) for all QMC points can be unnecessary since we are primarily interested in the average over the parameter space.
In what follows we propose two alternative algorithms that improve upon Algorithm 2 under these aspects. The first is a modification of Algorithm 2, that recycles part of the computation from previous iterations over m.
Algorithm 3 is motivated by the following heuristics. If there exists a metric d β : U ×U → [0, ∞) and a Lipschitz constant L > 0 satisfying In particular, for a small distance of the parameters we have a good chance to meet the AFEM tolerance by just one call of the SOLVE module, starting from the mesh T ( y ).

Algorithm 3 QMC-AFEM (v2)
for y ∈ P m do 5: if m > 1 then 6: Generate lattice P m 16: end while Following verbatim the proof of Proposition 3.1, we get convergence Algorithm 3. The parameter q ∈ N in line 6 regulates how much information from previous iterations we use.
A discussion of possible choices q = q(ε) depending on the tolerance is given in Appendix C below.
Proposition 3.2 Let T 0 ∈ T. Assume that G ∈ V * , that u satisfies either (9) or (10) for a sequence β as in Proposition 2.1 and that ∀ y ∈ U , ∀ T ∈ T and any tolerance tol, AFEM, starting from the initial mesh T , returns in finite time a mesh T ( y) ≥ T and u T ( y) ( y) such that (20) holds for a constant c * independent of tol and T . Then, for any ε, there exist choices ε Q and ε F := tol, with ε −1 ε F , ε −1 ε Q independent of ε such that 1. Algorithm 3 stops in finite time and 2. it produces an approximation of I (G(u)) within tolerance c * ε + O 2 −2m+δ , with constant hidden in O(·) independent of s.

Goal Oriented AFEM -Part 1
For the convergence of Algorithms 2 and 3, we assumed (20). This assumption alone does not yield optimal convergence rate of the AFEM module; as a consequence, the Finite Element error is overestimated and the spatial domain D could be overrefined in the algorithms. Nevertheless, we only require a reliable upper bound for the difference |G(u( y)) − G(u T ( y))|, that in many situations converges to 0 faster than u( y) − u T ( y) V as we refine T , by an Aubin-Nitsche duality argument. Let T ∈ T, y ∈ U , then we define z( y) ∈ W as the unique solution of the dual problem Then, for all w T ∈ W T , |G(u( y)) − G(u T ( y))| = a y (u( y), z( y)) − a y (u T ( y), z( y)) When the goal functional G ∈ V * has additional regularity, i.e. it belongs to a suitable subspace H ⊆ V * , then for h T : However, in general AFEM produces non quasi-uniform meshes. Hence we can exploit regularity of G as follows: we pick w T := z T ( y) ∈ W T the FE solution of and define reliable indicators ζ y,T (T ) T ∈T such that, for a constant c * * > 0 independent of y, T ∈ T, Similarly to η y,T (T ), each indicator ζ y,T (T ) has the purpose of estimating the local error of the dual FEM problem z( y) Combining (26) and (29) we can use the following a-posteriori estimator as termination criterion for AFEM, in Algorithms 2 and 3 Furthermore, as shown in [17], suitable marking strategies driven by both indicators η y,T , ζ y,T yield optimal convergence of the resulting goal oriented AFEM (or goAFEM) algorithm, provided that the axioms of adaptivity (A1-A4 in [5]) hold for the indicators η y,T (T ), ζ y,T (T ). If in Algorithms 2 and 3, we replace AFEM by goAFEM, then the results of Propositions 3.1 and 3.2 remain valid. As a side advantage, we do not need to include G V * to the FEM tolerance ε F .

Remark 3.3
Note that to use (30) we must solve numerically (28) for each sample y ∈ P m , m = 1, 2, . . ., until the tolerance is met. However, the stiffness matrix of the dual problem coincides with the transpose of the stiffness matrix of the primal, thus the additional work for the solution of (28) includes only the construction of the load vector corresponding to G (independent of y) and one linear solver per sample -in particular it is independent of the parametric dimension s ∈ N.

Indicator Averaging
Next, we design an iterative algorithm that refines the mesh or increases the number of samples at each step. Conversely to the previous algorithms, at any given time we employ only one mesh of the domain D for all y ∈ U . In this case, we will assume a-priori uniform convergence, slightly stronger than the a-priori convergence in (21). T ( y)) ∈N 0 the sequence of approximations produced by Algorithm 4 with T +1 = REFINE(T , M ), M = MARK( η T (T ) ) ⊆ T for all ∈ N 0 . We assume that there exists u ∞ ∈ C 0 (U , V ) such that

Assumption 3.1 Denote by (u
Theorem 3.5 Let T 0 ∈ T. Assume that G ∈ V * , and that ∀T ∈ T u, u T satisfy either (9) or (10) for a sequence β as in Proposition 2.1. Impose that the AFEM modules satisfy Assumption 2.2 and Assumption 3.1. Then, for all ε > 0 there exist ε F , ε Q , with ε −1 ε F , ε −1 ε Q independent of ε, such that

Algorithm 4 AQMC-FEM
Input: [ y → a y ], [ y → l y ], G, ε, T 0 Output: Approximation of I (G(u)) within tolerance ∝ ε 1: m ← 2; generate P 1 , P 2 2: T ← T 0 3: while True do 4: for y ∈ P m do 5: u T ( y) ← SOLVE(a y , l y , T ) 6: η y,T (T ) ← ESTIMATE(u T ( y)) 7: Evaluate G(u T ( y)) 8: end for 9: T ← REFINE(T , MARK( η T (T ) )) 12: else 13: if Q 2 m−1 (G(u T )) was not computed for the current T then 14: for y ∈ P m−1 do 15: u T ( y) ← SOLVE(a y , l y , T ) 16: Evaluate G(u T ( y)) 17: end for 18: Generate lattice P m 24: end if 25: end while as R is increasing and u T , u T are continuous. Hence, also η T (T ) has the reduction property (17), but with respect to the L ∞ (U , V )-norm. Similarly, from (16), monotonicity of S and Jensen inequality Taking square roots on both sides, we obtain that η T (T ) has the stability property (16) with respect to the L ∞ (U , V )-norm. Note that the modules MARK, REFINE are independent of y ∈ U and we assumed a-priori convergence in (31) in the same norm; therefore, from the proof [21, Theorem 3.1], for the sequence of meshes (T ) ∈N 0 constructed by we obtain, Thus for all m ∈ N, any FEM tolerance ε F is met in finite time. Since u T satisfies (9) or (10) for all T ∈ T (conversely to Algorithms 2 and 3, here there is only one mesh T at a time, used for all points y ∈ P m ∪ P m−1 ), Proposition 2.1 implies that E 2 m (G(u T )) → 0 as m → ∞, showing that Algorithm 4 stops in finite time. The error bound follows as in Proposition 3.1: denote by T (m) := T (m) , m ∈ N the mesh that meets the FEM error tolerance for the lattice P m , i.e.

Goal Oriented AFEM -Part 2
We now include a goal oriented adaptivity approach in

Proposition 3.6 Let {T } ∈N 0 be a sequence of meshes produced with the indicators ρ T (T ) by a marking and refinement strategy as in Assumption 2.2.
Let K 0 := max y∈U (η 2 y,T 0 +ζ 2 y,T 0 ) < ∞. Assume that both estimators η y,T , ζ y,T for the primal and dual problems satisfy reliability (18), and (29) and the properties (16), (17) for T ∈ T. Then Proof Due to (14), we have quasi-optimality of the primal and dual problems for all T ∈ T. Hence, we get from [5, Lemma 3.6], quasi-monotonicity of the estimators: The axioms (16) and (17) for the indicators ρ T (T ) are verified as in Theorem 3.5, and using that K 0 < ∞. Therefore we conclude with [21, Theorem 3.1] the claim, T ∈T ρ 2 As termination criterion for the spatial refinement we impose Convergence of a goal oriented adaptive QMC-FEM Algorithm follows replacing (33) with the latter equation.

Higher Moments and Lipschitz Goal Functionals
Although we confined the analysis to linear G ∈ V * , inspection of the proofs reveals that the results of the previous sections carry over to sufficiently smooth functionalsĜ : V → R. In particular, instead of the expectation of G(u) we can obtain higher moments settingĜ = G k for some k ∈ N, G ∈ V * . The additional steps required in the proofs read as follows. First, from (8) and a local Lipschitz estimate for G k we can bound the FEM error in (22), (33) as Note that for k = 1 the last inequality was sufficient. Moreover, the first inequality also allows to recover the goal oriented stopping criteria (30) and (36). Second, parametric regularity required in Proposition 2.1 follows from the multivariate product rule: for k = 2 and the assumption (9) which is again of the form (9). Here we used the bound μ≤ν ν μ |μ|!|ν − μ|! ≤ 2 |ν| |ν|!. Note that β ∈ p (N) ⇐⇒ 2 1+κ β ∈ p (N); therefore Proposition 2.1 applies for the same choice of p, which in turn does not change depending on k. The case of assumption (10) follows the same steps, while regularity for higher k > 2 can be treated iterating the product rule.
Hence, the computation of higher moments is covered and the error is only changed by a constant dependent on k, G V * , C l and λ.

Examples
In the present section we illustrate the framework in a selection of model problems.
The first two terms cancel since f is independent of y ∈ U . Furthermore, Therefore defining d β ( y, y ) := j≥1 y j − y j β j and L :=  y) and . For all y ∈ U , let u(·, y) ∈ H 1 0 (D( y)) solve the Poisson equation, given a source f ∈ C ∞ (D) analytic (as in [26,Lemma 5] This problem can be recast by a change of variables to the reference domain: for V = W = H 1 0 (D) and for any y ∈ U , we seekû(·, y) := u(·, y) • ∈ V such that (2) holds with , y)) dx, where A(x, y) := (J (x, y)J (x, y)) −1 det (J (x, y)) and J (x, y) := ∇ x (x, y) is the Jacobian matrix of . In [26,Theorem 5], the authors provided a derivative bound in the form (9), κ = 0, forû. The AFEM modules are analogous to the previous example (but here with parametric matrix-valued diffusion coefficient); the applicability of Algorithms 2 and 4 is straightforward.
Linear elasticity of nearly incompressible materials. Robust approximation of linear elasticity in the incompressible limit, (that is Poisson ratio ν ∈ (0, 1 2 ) approaching 1 2 ), was studied in [29,30] by the following three-field-formulation. Let E(x, y) = e 0 (x) + s j=1 y j e j (x) ∈ L ∞ (D) be the (affine-parametric) Young modulus, with 0 < e 0,min < e 0 (x) < e 0,max a.e., for constants e 0,min , e 0,max . Define ε(v) := 1 2 [∇v + (∇v) ] the strain tensor (for a vector field v : D → R d ) and f ∈ L 2 (D) d . Assume that the boundary conditions are of mixed type Dirichlet-Neumann given respectively on D , N , both of positive length, with where p(x, y) is the (parameter-dependent) Herrmann pressure, we can write the linear elasticity equations as (x, y)p(x, y) for the absolute constant c = c(ν) := ν (1+ν) (1−2ν) only dependent on ν ∈ (0, 1 2 ). The weak formulation is: for all y ∈ U , find (u (·, y), p(·, y),p(·, y) (2) holds for the bilinear form a y ((v, g,g), (w, q,q) and l y ((w, q,q)) = D f w. We equip V with the norm (related to [30,Equation (2.21)], but without integrating out the parameter space) The main motivation to introduce the three-field formulation is that E only appears in the numerator, and it is in particular affine-parametric. We thus verify the criteria of Theorem B.
that is A j ≤ e j L ∞ (D) . Therefore, to obtain (9) we assume β 1 (N) < 2, β ∈ p (N) With these choices, this formulation fits Assumption 2.1 by [30,Lemma 2.3] and Theorem B.1; hence the problem is well-posed and (9) holds with p < 1 2 . Lipschitz continuity (24) follows as in the first example. Any converging AFEM solver (not necessarily conforming) for (42) ensures that Algorithms 2, 3 are applicable. In particular, the reliability and efficiency of [29,Theorem 5.1] (applied with = {0}, in the notation there, i.e. for a deterministic equation) and suitable inf-sup stable discretization spaces as V T := (Q 2 (T )) d × Q 1 (T ) × Q 1 (T ) satisfying in (14) give, for all y ∈ U , an AFEM algorithm based on hierarchical spatial refinement. Here, Q 2 (T ) denotes the space of continuous piecewise biquadratic functions on T and Q 1 (T ) the continuous piecewise bilinear functions, on quadrilatelar meshes.

Numerical Experiments
We consider the model problem (3) on a polygon D ⊆ R 2 , for the solution of (15) we employ Lagrangian P 1 -FEM on regular triangulations T ∈ T of D. AFEM is driven by the residual indicators from, e.g. [35,Section 1.4] and the Dörfler MARK strategy with marking parameter θ ∈ (0, 1), where larger θ corresponds to more aggressive refinement. In all the computations, we select θ = 0.25. The REFINE module is the Newest Vertex Bisection as from the MATLAB implementation in [19], that gives uniform shape regularity of T. We run on a machine equipped with Intel(R) Core(TM) i7-10510U CPU @ 1.80GHz (OctaCore) and MATLAB R2019a.

Convex Domain
As a first example we select an affine-parametric diffusion, (3), (37) with ψ 0 ≡ 1 and for j ≥ 1, where the pairs (k j,1 , k j,2 ) ∈ N 2 are defined by the ordering of N 2 such that for j ∈ N, 2 , and the ordering is arbitrary when equality holds. With this In this case we expect the mesh to be approximately uniformly refined by AFEM, starting from a structured mesh T 0 with 128 elements, since u(·, y) ∈ H 2 (D) for all y ∈ U , due to convexity of the domain and smoothness of the data. We also use the stopping criteria (30), (36) for the FEM error exploiting symmetry of the stiffness matrix -see Remark 3.3 -thus avoiding excessive spatial refinement. We compare the algorithms in Fig. 1, for various tolerances ε F = ε Q . For convenience of the reader, we compute a reference value with |T | ≈ 10 5 many P 2 (i.e. quadratic) elements obtained by uniform refinement of T 0 and |P m | = 2 m , m = 8 samples, obtaining I (G(u)) ≈ 0.024411631814585. Since we observe that the cost of computing d β ( y, y ) := j≥1 y j − y j β j is negligible, we formally set q = ∞ in Algorithm 3. As predicted, they all produce outputs well within the tolerance ε = 2ε F . We also observe that, for the finest tolerance (ε F = 10 −5 ), all 3 algorithms produce meshes with ≈ 2 · 10 5 degrees of freedom and they stop at m = 7, that is 128 samples are sufficient to meet the tolerance. In terms of computing time, Algorithm 2 lags behind the other 2 algorithms, which in turn offer comparable performance. The rates in Fig. 1 (right) are estimated excluding the coarsest tolerance (ε F = 10 −3 ).

L-shape Domain
We again pick the affine parametric diffusion in sin expansion of (47). Let   no QMC estimator is computed until the FEM tolerance is reached for m = 2. We measure the computational effort of each iteration of the algorithm AQMC-FEM (indexed by a pair (m, ) ∈ N 2 , corresponding to QMC and FEM refinement level, respectively) by This is proportional to the cost of the iteration (m, ) of the SOLVE module, assuming that s is fixed and that a FEM solver that performs linearly with respect to |T | is available. We also show the mesh generated by AQMC-FEM for ε F = ε Q = 10 −3 ; as expected, it is strongly graded near the source and towards the corner of the domain, where a singularity of the solution occurs.
free of the curse of dimensionality, allowing for arbitrary s ∈ N, also in practical examples, under the assumption of quantified decay of the derivatives (10) or (9). Moreover, we stress that the parametric error is estimated without resorting to the specific problem formulation. These are the main features that improve upon existing methods based on stochastic Galerkin or sparse grids [2,14,18,29]. Thus, we expect our algorithms to be applicable in a wide range of problems, including, but not restricted to, those in the framework of Sect. 2, provided that a converging AFEM algorithm is available for the corresponding non-parametric equation. In particular, we mention parabolic equations (cp. a posteriori indicators in [35,Chapter 6]) and certain non-linear PDEs meeting the criteria exposed in [7], stationary Stokes (cp. [5, Section 6.2-6.3]) and Navier-Stokes (cp. [35,Chapter 5]) equations on uncertain domains [9] and elliptic eigenvalue problems [25], [5,Section 10.3].
i.e. the cardinality |P m (q(x), p(x))| = 2 m . For fixed m ∈ N, it remains to construct suitable polynomials p(x), q 1 (x), . . . , q s (x) that satisfy Proposition 2.1. If u satisfies either (10) or (9), the sequence β is given as input to a Component-By-Component (CBC) construction of the generating vector q(x) = (q 1 (x), . . . , q s (x)) of a Polynomial lattice rule. Specifically, we operate inductively (over the parametric dimensions = 1, . . . , s) a minimization of (a computable bound for) the worst case quadrature error. Furthermore, the computation of the minimum can be accelerated employing Fast Fourier transform, resulting in a computational cost O(sm2 m ) for product weights, (that is, if (10) holds) and O 2 m (s 2 + sm) for SPOD weights (in case (9) holds). We refer to [11] for a detailed analysis of the (Fast) CBC algorithm for product weights, and to [10] for the case of weights of SPOD type used here. To simplify the notation, we omit the explicit dependence of P m (q(x), p(x)) on q(x), p(x) and write P m , to denote Polynomial lattices constructed by a CBC algorithm for given weights.

B Regularity of Affine-Parametric Operators
We give a criterion to verify (9) for linear operators, with affine-parametric coefficients. Here L(V , W ) denotes the space of linear operators A : V → W equipped with the usual norm.
Then, A y is boundedly invertible for all y ∈ U and (9) holds with κ = 0, C = and β := β l + C β with C = 2 2− j≥1β j . If l = l y is independent of y, the same estimate holds with β := C β .
Proof The proof follows the same arguments of [8,Theorem 4.3], see also [34].
For affine parametric bilinear forms a y , we can define A y ∈ L(V , W * ) by A y v := a y (v, ·) ∈ W * for all v ∈ V . Hence, it is sufficient to verify (52) and (53) to obtain our working Assumptions 2.1 and (9).

C Computational Cost
We analyze the computation cost under the assumption that we have available SOLVE and ESTIMATE modules for the Galerkin formulation (15) that run in O(s|T |) operations. On the other hand, we assume that MARK and REFINE have cost O(|T |).
Fix a tolerance ε > 0 and consider Algorithm 2. Let N := max y∈P m |T ( y)|, where T ( y) is the mesh obtained after iterations of AFEM for a sample y. From Proposition 3.1, we have ε ∼ ε F independent of m, that justifies imposing N independent of m: then the number of operations required is Here, M(ε), (ε) := (m, ε) ∈ N are the maximum number of iterations in the (outer) QMC and (inner) AFEM loop, respectively. In the last step we used M(ε) m=1 m2 m ∼ M(ε)2 M(ε)+1 . For spaces V T of piecewise polynomials, we have N (ε) ∼ ε −d/t for some t > 0, which is typically determined by the polynomial degree and the spatial regularity of the data (see e.g. [6]). Moreover, due to Proposition 2. If we set q > M(ε), the asymptotic cost increases quadratically with respect to 2 M(ε) , and the argmin computation dominates the cost. Therefore, such choice is only possible when the metric d β is cheap to compute. Another possibility is imposing q such that 2 q ∼ N (ε) , so that the dominating contribution to the computational cost is due to the AFEM solver. We finally turn to Algorithm 4. The mesh that meets the FEM tolerance for P m is denoted by T (m) := T (m,ε) (cf. (32)). Hence we get  =0 N ∼ N (ε) , i.e. the cost of the adaptive loop is dominated by the last iteration (cf. [20]), all three algorithms require (asymptotically) the same computational effort O s 2 ε −1 + sε −1−d/t .