1 Introduction

The study of problems governed by parametric Partial Differential Equations (PDEs) has seen steady development in recent years, with an eye to applications in computational science and engineering. The common methodology is to treat the parametric equation as a family of equations with given data and to query a possibly large number of them with well-known solvers.

In the spirit of Uncertainty Quantification (UQ), we aim at the approximation of low-order moments of a linear goal functional \( G \in V^{*} \) (also called quantity of interest or observable) of the solution \( u:U \rightarrow V \) of a parametric PDE. We are particularly interested in the case of a large number s of parameters, all independent and uniformly distributed on the interval \( [-\frac{1}{2},\frac{1}{2}] \). Moments of u are then expressed as high-dimensional integrals with respect to the Lebesgue measure \( \mu \) on \( U := [-\frac{1}{2},\frac{1}{2}]^s \), which is a probability measure. In particular, we consider problems of the form: given \( s \in {\mathbb {N}}\) and \( k = 1,2,3,\ldots \), find

$$\begin{aligned} I(G^k(u)) = \int _U G^k(u({{\varvec{y}}})) {\,\mathrm {d}}\mu ({{\varvec{y}}}), \end{aligned}$$
(1)

where, for all \( {{\varvec{y}}} \in U \), \( u({{\varvec{y}}}) \in V \) solves a linear variational problem

$$\begin{aligned} \mathfrak {a}_{{{\varvec{y}}}}(u({{\varvec{y}}}),w) = \mathfrak {l}_{{{\varvec{y}}}}(w) \quad \forall w \in W, \end{aligned}$$
(2)

with smooth dependence on the parameters \( {{\varvec{y}}} = (y_1,\ldots ,y_s) \).

As an example, our general framework includes a diffusion equation modeling Darcy flow in an uncertain porous medium

$$\begin{aligned} {\left\{ \begin{array}{ll} -{\text {div}}\left( a(x,{{\varvec{y}}}) \nabla u(x,{{\varvec{y}}}) \right) = f(x) &{} x\in D, \\ u(x,{{\varvec{y}}}) = 0 &{} x\in \partial D, \end{array}\right. } \end{aligned}$$
(3)

on a bounded domain D. The PDE (3) has been used in several works as a model problem to develop deterministic UQ techniques [3, 8, 14, 16, 18, 31]. A more computationally challenging (and less studied) parametric PDE arises from linear elasticity, where the Young modulus \( E(x,{{\varvec{y}}}) \) is uncertain

$$\begin{aligned} {\left\{ \begin{array}{ll} -{\text {div}}\left( \frac{E(x,{{\varvec{y}}})}{2(1+\nu )}[\nabla u(x,{{\varvec{y}}}) + (\nabla u(x,{{\varvec{y}}})) ^{\top }] \right) + \nabla p(x,{{\varvec{y}}}) = f(x), \\ {\text {div}}(u(x,{{\varvec{y}}})) + c^{-1} p(x,{{\varvec{y}}})/E(x,{{\varvec{y}}}) = 0, \end{array}\right. }\quad x\in D \end{aligned}$$
(4)

with suitable boundary conditions, see (42); the constant \( c := \frac{\nu }{(1+\nu )(1-2\nu )} \) depends only on the Poisson ratio \( \nu \). For nearly incompressible materials (i.e. \( \nu \approx \tfrac{1}{2} \)), a suitable mixed formulation inspired by [29] will be used to avoid the so-called locking effect, while keeping a smooth parametric dependence. We address these PDEs in more detail in Sect. 3.6 below, in the case of affine dependence of the data \( a(\cdot ,{{\varvec{y}}}) \) or \( E(\cdot ,{{\varvec{y}}}) \) on \( {{\varvec{y}}} \).

For the computation of (1), deterministic quasi-Monte Carlo (QMC) integration is proven to outperform standard Monte Carlo sampling: suitable assumptions on the regularity of the parameter-to-solution map \( u:U\rightarrow V \) are known to grant dimension-independent and higher-order decay of the quadrature error for deterministic QMC rules derived from Polynomial lattices, comprising Interlaced Polynomial lattices [12, 23, 24] and Extrapolated Polynomial lattices (EPL) [10, 11].

Moreover, EPL rules allow for an easily computable a-posteriori error estimator, which is known to be asymptotically exact and free of the curse of dimensionality [10]. We remark that other a-posteriori estimation techniques were developed for Sobol’ points and Rank-1 Lattices in [27, 28], but the analysis there provides no dimension-robust asymptotic exactness.

For deterministic PDEs, quasi-optimality of Adaptive Finite Element Methods (AFEM) has been studied extensively; we refer to [6] for classical results on elliptic diffusion PDEs and to [5, 17, 21] and the references therein for more recent developments towards an abstract analysis. When including uncertainty in the underlying PDE, in order to keep the computational cost to a minimum, it is crucial to estimate the error of the parametric solution and, in particular, to determine adaptively a finite sampling set \( P \subset U \) and a suitable Finite Element space for the PDE, such that a given error tolerance is met a-posteriori. Existing approaches include adaptive stochastic Galerkin methods, studied in [2, 15, 16], and, more recently, adaptive collocation methods on sparse grids [14, 18]. We extend these results to sampling based on QMC rules, while leveraging the aforementioned abstract AFEM framework of the Axioms of adaptivity [5].

The purpose of this work is to introduce a family of adaptive algorithms to approximate solutions of many-query problems, based on deterministic QMC sampling on the parameter box U. Our contribution is to provide convergence results for these algorithms, without incurring the curse of dimensionality, in a generic framework comprising several common PDE problems, where the parametric error estimator is independent of the underlying PDE. We employ parametric error estimators that only depend on the computed discrete solution \( u_{{\mathcal {T}}}:U \rightarrow V_{{\mathcal {T}}} \) (where \( V_{{\mathcal {T}}}\) is a finite dimensional space), while their computation is independent of a) the specific discretization space \( V_{{\mathcal {T}}} \), b) the equation (2) satisfied by u and c) the PDE solver used. Moreover, we pay particular attention to modularity of the algorithm, i.e. we break the overall computation into smaller parts, each with its own requirements, in order to be able to reuse existing implementations. In fact, any other adaptive and reliable discretization can be used in place of AFEM in Algorithms 2 and 3 below.

To summarize, we leverage recent progress in QMC – in particular EPL rules – and established AFEM results to obtain adaptive, deterministic, reliable and non-intrusive computational strategies for UQ that are free of the curse of dimensionality.

The structure of the paper is as follows: in Sect. 2 we introduce the problem and summarize the relevant notation and results from quasi-Monte Carlo integration with Polynomial lattice rules and from the convergence theory of AFEM. Sect. 3 is devoted to the description and proof of convergence of three different adaptive procedures; for each of them, we show that it is possible to include goal oriented adaptivity as described in [3, 17]. Additionally, we indicate a few examples of problems that can be solved with our method. In Sect. 4 we present numerical experiments for a model PDE with random diffusion. Additional material, including a brief introduction to Polynomial lattices and a theoretical analysis of the computational cost, is given in the Appendix.

2 Preliminaries

In this section we formulate the problem and illustrate our working assumptions. Let V, W be reflexive Banach spaces of functions defined on a Lipschitz domain \( D \subset {\mathbb {R}}^{d} \), \( d \in \{2,3\} \). For \( {{\varvec{y}}} \in U \), let \( \mathfrak {a}_{{{\varvec{y}}}}:V \times W \rightarrow {\mathbb {R}}\) be a bilinear form and \( \mathfrak {l}_{{{\varvec{y}}}} \in W^{*} \), with \( W^{*} \) denoting the topological dual of W.

In order to ensure that the linear PDE (2) is well-posed, we impose the following assumption, cf. [4].

Assumption 2.1

The data \( \mathfrak {a}_{{{\varvec{y}}}},\mathfrak {l}_{{{\varvec{y}}}} \) satisfy, uniformly with respect to \({{\varvec{y}}} \in U\), the inf-sup conditions

$$\begin{aligned} \begin{aligned} \inf _{0\ne v \in V}&\sup _{0 \ne w \in W}\frac{\mathfrak {a}_{{{\varvec{y}}}}(v,w)}{\left\| v \right\| _{V}\left\| w \right\| _{W}} \ge \lambda> 0 \\ \inf _{0\ne w \in W}&\sup _{0 \ne v \in V}\frac{\mathfrak {a}_{{{\varvec{y}}}}(v,w)}{\left\| v \right\| _{V}\left\| w \right\| _{W}} \ge \lambda > 0 \end{aligned} \end{aligned}$$
(5)

and continuity

$$\begin{aligned} \mathfrak {a}_{{{\varvec{y}}}}(v,w)&\le \Lambda \left\| v \right\| _{V} \left\| w \right\| _{W} , \qquad \forall v\in V, w \in W, \end{aligned}$$
(6)

for some \( 0< \lambda< \Lambda < \infty \) independent of \( {{\varvec{y}}} \). Moreover, we assume, for some \( 0< C_{\mathfrak {l}} <\infty \)

$$\begin{aligned} \sup _{{{\varvec{y}}} \in U}\left\| \mathfrak {l}_{{{\varvec{y}}}} \right\| _{W^{*}} \le C_{\mathfrak {l}}. \end{aligned}$$
(7)

Under Assumption 2.1, we have the following a-priori estimate

$$\begin{aligned} \sup _{{{\varvec{y}}} \in U}\left\| u({{\varvec{y}}}) \right\| _{V} \le \frac{C_{\mathfrak {l}}}{\lambda }. \end{aligned}$$
(8)

Let \( G \in V^{*} \) be the sought Quantity of Interest. Then, given a small tolerance \( {\varepsilon }> 0 \), we want to compute a \( Q \in {\mathbb {R}}\) such that

$$\begin{aligned} {\left| I(G^k(u)) - Q \right| } \approx {\varepsilon }. \end{aligned}$$

It is clear that we have multiple sources of error to take into consideration. First, there is the quadrature error from approximating the expectation by sampling with quasi-Monte Carlo rules. Second, there is the discretization error: the solution \( u({{\varvec{y}}}) \) comes from a PDE problem and we cannot, in general, expect to recover it exactly.

Additionally, one could consider the dimension truncation error, which arises in the treatment of countably many parameters by means of a quadrature rule over a finite dimensional set U [22]. We exclude this error from the analysis and assume that the dimension \( s \in {\mathbb {N}}\) is finite throughout the rest of the discussion.

2.1 Quasi-Monte Carlo a Posteriori Error Estimation

In order to determine a stopping criterion for the QMC–AFEM algorithms below, we use the asymptotically exact a-posteriori estimator from [10, Section 4], derived from the so-called Polynomial lattices \( P_m \) of cardinality \( 2^m \), \( m \in {\mathbb {N}}\). In Appendix A we briefly recall the construction of \( P_m \) used here. QMC integration rules are sample averages that employ deterministic sampling sets, in our case \( P_m \):

$$\begin{aligned} Q_{2^m}(F) := \frac{1}{2^m} \sum _{{{\varvec{y}}} \in P_m} F({{\varvec{y}}}). \end{aligned}$$

The fundamental result from QMC theory that we will need to overcome the curse of dimensionality is the next Proposition 2.1, first proved in [10, 11]. In particular, we consider infinitely differentiable integrands \( F := G^k(u) \), satisfying certain bounds on the derivatives uniformly in \( {{\varvec{y}}} \in U \) and in the parametric dimension \( s \in {\mathbb {N}}\), as we specify next. To this end, we fix some notation: consider a multi-index \( {\varvec{\nu }} = (\nu _j)_{j\in {\mathbb {N}}} \in {\mathbb {N}}_{0}^{{\mathbb {N}}} \) with finite support, where \( {\text {supp}}({\varvec{\nu }}) := {\left\{ j : \nu _j > 0 \right\} } \) and \( {\left| {\text {supp}}({\varvec{\nu }}) \right| } <\infty \). We write \( {\varvec{\nu }}! := \prod _{j\in {\text {supp}}({\varvec{\nu }})} \nu _j! \), and denote the partial derivatives \( \partial _{{{\varvec{y}}}}^{{\varvec{\nu }}} := \partial _{y_1}^{\nu _1} \partial _{y_2}^{\nu _2} \cdots \). We also write, for a real-valued sequence \( {\varvec{\beta }}\), \( {\varvec{\beta }}^{{\varvec{\nu }}} := \prod _{j\in {\text {supp}}({\varvec{\nu }})}\beta _{j}^{\nu _j} \).

Then, we require derivative bounds of the form

$$\begin{aligned} \sup _{{{\varvec{y}}} \in U}\left\| \partial _{{{\varvec{y}}}}^{{\varvec{\nu }}} u({{\varvec{y}}}) \right\| _{V} \le C (|{\varvec{\nu }}| !)^{1+\kappa } {\varvec{\beta }}^{{\varvec{\nu }}} \qquad \forall {\varvec{\nu }} \in {\mathbb {N}}_{0}^{{\mathbb {N}}}, \ {\left| {\text {supp}}({\varvec{\nu }}) \right| } < \infty , \end{aligned}$$
(9)

for some \( \kappa \ge 0,C>0, {\varvec{\beta }}\) independent of \( {\varvec{\nu }},s \) and \( {\varvec{\beta }}\in \ell ^p({\mathbb {N}}) \) for some \( p\in (0,\frac{1}{2 + \kappa }) \).

Alternatively, we can assume bounds of the form

$$\begin{aligned} \sup _{{{\varvec{y}}} \in U}\left\| \partial _{{{\varvec{y}}}}^{{\varvec{\nu }}} u({{\varvec{y}}}) \right\| _{V} \le C {\varvec{\nu }} ! {\varvec{\beta }}^{{\varvec{\nu }}} \qquad \forall {\varvec{\nu }} \in {\mathbb {N}}_{0}^{{\mathbb {N}}}, \ {\left| {\text {supp}}({\varvec{\nu }}) \right| } < \infty , \end{aligned}$$
(10)

where \( {\varvec{\beta }}\in \ell ^p({\mathbb {N}}) \), \( p \in (0,1) \).

Proposition 2.1

Assume that (10) is satisfied with a sequence \( {\varvec{\beta }}\) belonging to \( \ell ^{p}({\mathbb {N}}) \) for all \( p >\frac{1}{2} \), or that (9) holds with \( {\varvec{\beta }}\in \ell ^{p}({\mathbb {N}}) \) for some \( 0<p<\frac{1}{2+\kappa } \). Then, for \( F := G^k(u) \), a sequence of Polynomial lattice rules \( (Q_{2^m})_{m \in {\mathbb {N}}} \) can be constructed such that

$$\begin{aligned} {\left| I(F) - Q_{2^m}(F) \right| } \le C 2^{-m} \end{aligned}$$
(11)

for a constant C independent of m, s. Moreover,

$$\begin{aligned} I(F) - Q_{2^m}(F) = Q_{2^m}(F) - Q_{2^{m-1}}(F) + \mathcal {O}\!\left( 2^{-2m+\delta }\right) \qquad \text{ as } m \rightarrow \infty \end{aligned}$$
(12)

for all \( \delta > 0 \), with the hidden constant in \( \mathcal {O}\!\left( \cdot \right) \) independent of s, m but dependent on \( \delta \).

Proof

In the case \( k=1 \), (11) follows by combining the derivative bounds (9), (10) with the quadrature error estimate [11, Equation (3.1)], while (12) is [10, Theorem 4.1]. The case \( k > 1 \) is shown analogously, using the derivative bounds in Sect. 3.5 below. \(\square \)

We denote the QMC a-posteriori error estimator by

$$\begin{aligned} E_{2^m}(F) := Q_{2^m}(F) - Q_{2^{m-1}}(F). \end{aligned}$$
(13)
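To make the use of (13) concrete, the following Python sketch (not part of the original presentation) accumulates the averages \( Q_{2^m}(F) \) over a sequence of point sets and returns the estimator \( E_{2^m}(F) \) essentially for free by storing the previous average; the routine `sample_points` is a placeholder standing in for the Polynomial lattice \( P_m \) of Appendix A.

```python
import numpy as np

def qmc_levels(F, sample_points, m_max):
    """Evaluate Q_{2^m}(F) for m = 1..m_max and the estimator
    E_{2^m}(F) = Q_{2^m}(F) - Q_{2^{m-1}}(F) from (13).

    `sample_points(m)` is a stand-in for the polynomial lattice P_m and is
    assumed to return an array of shape (2^m, s); storing the previous
    average makes E_{2^m} essentially free."""
    Q_prev = None
    out = []
    for m in range(1, m_max + 1):
        P_m = sample_points(m)
        Q = np.mean([F(y) for y in P_m])              # Q_{2^m}(F)
        E = None if Q_prev is None else Q - Q_prev    # estimator (13)
        out.append((m, Q, E))
        Q_prev = Q
    return out

# toy usage: prefixes of a uniform random point set stand in for P_m
if __name__ == "__main__":
    rng = np.random.default_rng(0)
    s = 8
    pts = rng.uniform(-0.5, 0.5, size=(2 ** 10, s))
    F = lambda y: np.exp(0.1 * np.sum(y))
    for m, Q, E in qmc_levels(F, lambda m: pts[: 2 ** m], 10):
        print(f"m={m:2d}  Q={Q:.6f}  E={E}")
```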

For completeness, we mention a criterion to verify (9) in Appendix B, for the special case of bilinear forms \( \mathfrak {a}_{{{\varvec{y}}}} \) with affine dependence on the parameters. However, the parametric regularity bound (9) can also be verified by alternative methods, for instance for non-affine parametric operators, based on holomorphic extensions of \( \mathfrak {a}_{{{\varvec{y}}}} \) to complex parameters \( {{\varvec{y}}} \in {\tilde{U}} \subseteq {\mathbb {C}}^s \), \( U \subseteq {\tilde{U}} \). For more details we refer to [13]. On the other hand, (10) can also be verified in some situations [23]. In what follows, we will assume that Assumption 2.1 holds and that either (9) or (10) is available for the parametric solution map \( u:U \rightarrow V \).

2.2 Modules of AFEM

We mentioned that a discretization error occurs in the solution of (2), for any instance of \( {{\varvec{y}}} \in U \). In this section we specify our discretization method of choice.

We restrict ourselves to polyhedral Lipschitz domains \( D \subset {\mathbb {R}}^d \), \( d \in {\left\{ 2,3 \right\} } \). A mesh \( {\mathcal {T}} \) on D is defined as a finite collection of compact sets \( T \in {\mathcal {T}} \), \( |T| > 0 \), such that \( \bigcup _{T \in {\mathcal {T}}} T = {\overline{D}} \) and \( |T \cap T'| = 0 \) for all \( T,T' \in {\mathcal {T}} \) with \( T \ne T' \). We assume the availability of finite-dimensional spaces \( V_{{\mathcal {T}}} \subset V, W_{{\mathcal {T}}} \subset W \), linked to a mesh \( {\mathcal {T}} \) on D, with \( \dim (V_{{\mathcal {T}}}) = \dim (W_{{\mathcal {T}}}) \) and such that the following uniform discrete inf-sup conditions hold: for \( {\tilde{\lambda }}>0 \) independent of \( {\mathcal {T}} \in \mathbb {T}\) and \( {{\varvec{y}}} \in U \),

$$\begin{aligned} \begin{aligned} \inf _{0\ne v \in V_{{\mathcal {T}}}}&\sup _{0 \ne w \in W_{{\mathcal {T}}}}\frac{\mathfrak {a}_{{{\varvec{y}}}}(v,w)}{\left\| v \right\| _{V}\left\| w \right\| _{W}} \ge {\tilde{\lambda }}> 0, \\ \inf _{0\ne w \in W_{{\mathcal {T}}}}&\sup _{0 \ne v \in V_{{\mathcal {T}}}}\frac{\mathfrak {a}_{{{\varvec{y}}}}(v,w)}{\left\| v \right\| _{V}\left\| w \right\| _{W}} \ge {\tilde{\lambda }} > 0. \end{aligned} \end{aligned}$$
(14)

Then, \( u_{{\mathcal {T}}}({{\varvec{y}}}) \in V_{{\mathcal {T}}} \) denotes the unique solution of the problem

$$\begin{aligned} \mathfrak {a}_{{{\varvec{y}}}} (u_{{\mathcal {T}}}({{\varvec{y}}}),w) = \mathfrak {l}_{{{\varvec{y}}}}(w) \quad \forall w \in W_{{\mathcal {T}}}, \end{aligned}$$
(15)

corresponding to (2). We will often use the shorthand notation \( {\mathcal {T}} \le {\mathcal {T}}' \), meaning that the mesh \( {\mathcal {T}}' \) can be obtained from another mesh \( {\mathcal {T}} \) by possibly multiple applications of the module \( {\text {REFINE}}\), as described in Assumption 2.2 below. Further, we fix an initial mesh \( {\mathcal {T}}_{0} \) of D and we denote by \( \mathbb {T}:= {\left\{ {\mathcal {T}} \, : \, {\mathcal {T}}_{0} \le {\mathcal {T}} \right\} } \) the set of admissible refinements of the initial mesh \( {\mathcal {T}}_{0} \).

The well-established Adaptive FEM algorithm, see Algorithm 1, is composed of the four modules \( {\text {SOLVE}}\), \( {\text {ESTIMATE}}\), \( {\text {MARK}}\) and \( {\text {REFINE}}\), plus a stopping criterion determined by a given tolerance tol.

[Algorithm 1: the \( {\text {AFEM}}\) loop, iterating \( {\text {SOLVE}}\), \( {\text {ESTIMATE}}\), \( {\text {MARK}}\), \( {\text {REFINE}}\) until the tolerance tol is met; listing not reproduced here.]
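Since the listing of Algorithm 1 is rendered as a figure in the original and not reproduced here, the following Python pseudocode sketches the loop it describes; the routines `solve`, `estimate`, `mark` and `refine` are placeholders for the four modules specified in Assumption 2.2 below and must be supplied by the user.

```python
def afem(y, mesh0, tol, solve, estimate, mark, refine):
    """Abstract AFEM loop (Algorithm 1): iterate SOLVE -> ESTIMATE -> MARK -> REFINE
    for the parameter y until the global estimator falls below tol.

    solve(mesh, y)       -> discrete solution u_T(y) of (15)
    estimate(mesh, u, y) -> dict {T: eta_{y,T}(T)} of local indicators
    mark(eta)            -> subset of cells to refine (e.g. Doerfler marking)
    refine(mesh, marked) -> refined mesh T' >= T
    """
    mesh = mesh0
    while True:
        u = solve(mesh, y)
        eta = estimate(mesh, u, y)
        global_est = sum(v ** 2 for v in eta.values()) ** 0.5
        if global_est <= tol:
            return mesh, u        # reliability (18) then gives (20)
        marked = mark(eta)
        mesh = refine(mesh, marked)
```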

The parameter-dependent error indicators \( \{\eta _{{{\varvec{y}}},{\mathcal {T}}}(T)\}_{T\in {\mathcal {T}}} \) are computable values that approximate the local FEM error on each cell \( T\in {\mathcal {T}} \): they are used to determine which cells to refine, so as to drive the global error to 0. Following the description in [21], we state the abstract assumptions on the modules of Algorithm 1 that ensure convergence of the AFEM error, pointwise for all \( {{\varvec{y}}} \in U \).

Assumption 2.2

\( {\text {AFEM}}\) modules for parametric problems:

  • For given \( {{\varvec{y}}} \in U \) and \( u_{{\mathcal {T}}}({{\varvec{y}}}) \in V_{{\mathcal {T}}} \), \( {\text {ESTIMATE}}\) computes positive real numbers \( {\left\{ \eta _{{{\varvec{y}}},{\mathcal {T}}}(T) \right\} }_{T\in {\mathcal {T}}} \), called indicators. We assume that the indicators satisfy, for all \( {\mathcal {T}},{\mathcal {T}}'\) with \( {\mathcal {T}}_{0} \le {\mathcal {T}} \le {\mathcal {T}}' \) the stability over non-refined elements

    $$\begin{aligned} \left( \sum _{T \in {\mathcal {T}}\cap {\mathcal {T}}' }\eta _{{{\varvec{y}}},{\mathcal {T}}'}^2(T) \right) ^{\frac{1}{2}} \le \left( \sum _{T \in {\mathcal {T}}\cap {\mathcal {T}}' }\eta _{{{\varvec{y}}},{\mathcal {T}}}^2(T) \right) ^{\frac{1}{2}} + S(\left\| u_{{\mathcal {T}}'}({{\varvec{y}}}) - u_{{\mathcal {T}}}({{\varvec{y}}}) \right\| _{V}), \end{aligned}$$
    (16)

    and reduction over refined elements

    $$\begin{aligned} \sum _{T \in {\mathcal {T}}'{\setminus } {\mathcal {T}}} \eta _{{{\varvec{y}}},{\mathcal {T}}'}^2(T) \le q_{red} \sum _{T \in {\mathcal {T}}{\setminus } {\mathcal {T}}' } \eta _{{{\varvec{y}}},{\mathcal {T}}}^2(T) + R(\left\| u_{{\mathcal {T}}'}({{\varvec{y}}}) - u_{{\mathcal {T}}}({{\varvec{y}}}) \right\| _{V}), \end{aligned}$$
    (17)

    where \( q_{red} \in (0,1) \) and the functions \( S,R :[0,\infty ) \rightarrow [0,\infty ) \) are continuous at 0, monotonically increasing and satisfy \( S(0) = R(0) = 0 \). We assume that \( S(\cdot ), R(\cdot ),q_{red} \) are independent of \( {{\varvec{y}}} \in U \). Furthermore, we assume reliability: there exists a constant \( c^{*}>0 \) such that \( \forall \, {\mathcal {T}} \in \mathbb {T}, \forall \, {{\varvec{y}}} \in U \)

    $$\begin{aligned} \left\| u({{\varvec{y}}}) - u_{{\mathcal {T}}}({{\varvec{y}}}) \right\| _{V} \le c^{*}\left( \sum _{T \in {\mathcal {T}} } \eta _{{{\varvec{y}}},{\mathcal {T}}}^2(T)\right) ^{\frac{1}{2}}. \end{aligned}$$
    (18)
  • The marking procedure \( {\text {MARK}}\) selects, based on a set of indicators \( {\left\{ \eta _{{{\varvec{y}}},{\mathcal {T}}}(T) \right\} } \) computed in the previous step, a subset \( {\mathcal {M}} \subset {\mathcal {T}} \) of cells that will be refined. We assume that there exists a function \( M:[0,\infty ) \rightarrow [0,\infty ) \) continuous at 0 with \( M(0) = 0 \) such that

    $$\begin{aligned} \max _{T \in {\mathcal {T}} {\setminus } {\mathcal {M}} } \eta _{{{\varvec{y}}},{\mathcal {T}}}(T) \le M\left( \left( \sum _{T \in {\mathcal {M}} }\eta _{{{\varvec{y}}},{\mathcal {T}}}^2(T) \right) ^{\frac{1}{2}}\right) . \end{aligned}$$
    (19)
  • The \( {\text {REFINE}}\) module, for a given mesh \( {\mathcal {T}} \) and a set of marked elements \( {\mathcal {M}} \subseteq {\mathcal {T}} \), produces a new mesh \( {\mathcal {T}}' \) such that \( {\mathcal {T}}' \cap {\mathcal {M}} = \emptyset \). We assume that parents are the union of their children, that is \( T = \bigcup {\left\{ T' \in {\mathcal {T}}' : T' \subseteq T \right\} } \) for all \( T \in {\mathcal {T}} \). We stress that \( {\mathcal {M}} \subseteq {\mathcal {T}} {\setminus } {\mathcal {T}}' \), that is, \( {\text {REFINE}}\) can in principle refine more than the marked set. To simplify the presentation, we further assume conformity \( V_{{\mathcal {T}}} \subseteq V_{{\mathcal {T}}'} \subset V \) for all \( {\mathcal {T}} \le {\mathcal {T}}',\, {\mathcal {T}}, {\mathcal {T}}' \in \mathbb {T}\).

  • For the module \( {\text {SOLVE}}\), we assume that the Galerkin solution \( u_{{\mathcal {T}}}({{\varvec{y}}}) \) of (15) can be recovered exactly for every \( {{\varvec{y}}} \in U \), which entails exact integration and linear algebra.

We stress that the availability of \( c^{*} \) in (18) depends implicitly on the set \( \mathbb {T}\), and hence on the \( {\text {REFINE}}\) module. In practice, \( c^{*} \) usually depends on \( \lambda ,\Lambda \) from (5), (6) and on the shape regularity of a mesh \( {\mathcal {T}} \); hence it is often required that \( {\text {REFINE}}\) does not generate strongly anisotropic meshes, i.e. \( \mathbb {T}\) is uniformly shape-regular. Typical \( {\text {MARK}}\) strategies, such as the Dörfler criterion, are known to satisfy (19), see e.g. [21]; a minimal sketch is given below.
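As an illustration (a minimal sketch, not the implementation used in the paper), a greedy Dörfler marking with bulk parameter \( \theta \in (0,1) \), chosen here for definiteness, can be realized as follows.

```python
def doerfler_mark(eta, theta=0.5):
    """Return a (quasi-)minimal set M of cells with
       sum_{T in M} eta(T)^2 >= theta * sum_T eta(T)^2,
    obtained greedily by sorting the indicators in decreasing order."""
    total = sum(v ** 2 for v in eta.values())
    marked, acc = [], 0.0
    for cell, val in sorted(eta.items(), key=lambda kv: kv[1], reverse=True):
        marked.append(cell)
        acc += val ** 2
        if acc >= theta * total:
            break
    return marked
```

Every unmarked indicator is then dominated by the smallest marked one, so (19) holds with \( M(x) = x \).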

Let \( (u_{{\mathcal {T}}_\ell ({{\varvec{y}}})}({{\varvec{y}}}))_{\ell \in {\mathbb {N}}} \), \( {\mathcal {T}}_\ell :={\mathcal {T}}_\ell ({{\varvec{y}}}) \) be the sequence of approximations produced by the \( {\text {AFEM}}\) loop with \( {\mathcal {T}}_{\ell +1}({{\varvec{y}}}) = {\text {REFINE}}({\mathcal {T}}_\ell ({{\varvec{y}}}), {\mathcal {M}}_{\ell }({{\varvec{y}}})) \), \( {\mathcal {M}}_\ell ({{\varvec{y}}}) = {\text {MARK}}({\left\{ \eta _{{{\varvec{y}}},{\mathcal {T}}_\ell ({{\varvec{y}}})}(T) \right\} } ) \subseteq {\mathcal {T}}_{\ell }({{\varvec{y}}}) \) for all \( \ell \in {\mathbb {N}}\); then, as a corollary of [21, Theorem 3.1] we get the following pointwise convergence result.

Lemma 2.2

Consider a problem of the form (2) satisfying Assumption 2.1. Let \( {\text {AFEM}}\) satisfy (14) and Assumption 2.2. Then, for all \( {{\varvec{y}}} \in U \) and all initial meshes \( {\mathcal {T}} \in \mathbb {T}\), it returns in finite time \( {\mathcal {T}}({{\varvec{y}}}) \) and \( u_{{\mathcal {T}}({{\varvec{y}}})}({{\varvec{y}}}) \) such that

$$\begin{aligned} \left\| u({{\varvec{y}}}) - u_{{\mathcal {T}}({{\varvec{y}}})}({{\varvec{y}}}) \right\| _{V} \le c^{*} tol, \end{aligned}$$
(20)

for a constant \( c^{*} > 0 \) independent of \( tol, {{\varvec{y}}} \) and \({\mathcal {T}} \in \mathbb {T}\).

Proof

From [32], for all \( {{\varvec{y}}} \) there exists \( u_{\infty }({{\varvec{y}}}) \in V \) such that

$$\begin{aligned} \left\| u_{\infty }({{\varvec{y}}}) - u_{{\mathcal {T}}_\ell ({{\varvec{y}}})}({{\varvec{y}}}) \right\| _{V} \rightarrow 0 \quad \text{ as } \ell \rightarrow \infty . \end{aligned}$$
(21)

Hence, the result follows from [21, Theorem 3.1] and reliability (18). \(\square \)

3 QMC–AFEM Algorithms

3.1 A First Convergence Result

In this section we present a first combined QMC–AFEM algorithm, which outputs an approximation of \( I(G^k(u)) \) for a given tolerance \( {\varepsilon }\). For simplicity, we consider the case \( k=1 \) in (1) and postpone the case \( k>1 \) to Sect. 3.5. Algorithm 2 is in fact not efficient for implementation, but it effectively illustrates the key ideas.

First of all, we observe that \( E_{2^m} \) can be fully evaluated by means of the quantities \( Q_{2^{m}}, Q_{2^{m-1}} \) that have already been computed when we loop over m. In other words, when adding more QMC points we reuse part of the work done previously, so that the cost to compute \( E_{2^m} \) is negligible. A second crucial observation is that each call of Algorithm 1 results (in principle) in a different mesh \( {\mathcal {T}}({{\varvec{y}}}) \ge {\mathcal {T}}_{0}\), starting from a common coarse mesh \( {\mathcal {T}}_{0} \). In particular, \( G(u_{{\mathcal {T}}({{\varvec{y}}})}({{\varvec{y}}})) \) may not even be continuous with respect to \( {{\varvec{y}}} \in U \), and hence in general Proposition 2.1 is not applicable to \( G(u_{{\mathcal {T}}(\cdot )}(\cdot )) \), regardless of the discretization scheme.

[Algorithm 2: QMC–AFEM with independent AFEM runs from the common coarse mesh \( {\mathcal {T}}_{0} \) for every sample; listing not reproduced here.]
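The listing of Algorithm 2 is rendered as a figure in the original; the following sketch reflects the verbal description above: for each level m, AFEM is restarted from the common coarse mesh \( {\mathcal {T}}_0 \) at every QMC node, and the loop terminates once the estimator (13) drops below \( {\varepsilon }_{Q} \). The routine `afem_solve` is an assumption standing in for Algorithm 1 and is taken to return a discrete solution satisfying (20).

```python
def qmc_afem(G, lattice_points, afem_solve, mesh0, eps_F, eps_Q, m_max=30):
    """Sketch of Algorithm 2: for each m, run AFEM from the coarse mesh mesh0
    with tolerance eps_F at every QMC node, form Q_{2^m}(G(u_T)) and stop
    once the estimator (13) satisfies |E_{2^m}| <= eps_Q.

    afem_solve(y, mesh0, eps_F) is assumed to return u_{T(y)}(y) with (20)."""
    Q_prev = None
    for m in range(1, m_max + 1):
        P_m = lattice_points(m)
        Q = sum(G(afem_solve(y, mesh0, eps_F)) for y in P_m) / len(P_m)
        if Q_prev is not None and abs(Q - Q_prev) <= eps_Q:
            return Q, m                    # approximation of I(G(u))
        Q_prev = Q
    raise RuntimeError("m_max reached before the QMC estimator met eps_Q")
```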

Proposition 3.1

Let \( {\mathcal {T}}_{0} \in \mathbb {T}\). Assume that \( G \in V^* \), that u satisfies either (9) or (10) for a sequence \( {\varvec{\beta }}\) as in Proposition 2.1 and that \( \forall \, {{\varvec{y}}} \in U \) and any tolerance \( tol > 0 \), \( {\text {AFEM}}\) returns in finite time \( u_{{\mathcal {T}}({{\varvec{y}}})}({{\varvec{y}}}) \) such that (20) holds for a constant \( c^{*} \) independent of tol. Then, for any \( {\varepsilon }> 0\), there exist choices \( {\varepsilon }_{Q} \) and \( {\varepsilon }_{F}:=tol \), with \( {\varepsilon }^{-1}{\varepsilon }_{F},{\varepsilon }^{-1}{\varepsilon }_{Q} \) independent of \( {\varepsilon }\), such that

  1.

    Algorithm 2 stops in finite time and

  2.

    it produces an approximation of I(G(u)) within tolerance \( c^{*}{\varepsilon }+ \mathcal {O}\!\left( 2^{-2m + \delta }\right) \), with constant hidden in \( \mathcal {O}\!\left( \cdot \right) \) independent of s.

Proof

Let \( {\varepsilon }_{F} \) be the tolerance for AFEM. To prove the first item it is sufficient to show that, for any \( {\varepsilon }_{Q} > 2c^{*} \left\| G \right\| _{V^{*}} {\varepsilon }_{F} \) there exists m sufficiently large such that \( {\left| E_{2^{m}}( G(u_{{\mathcal {T}}})) \right| } \le {\varepsilon }_{Q} \). By linearity of G,

$$\begin{aligned} {\left| E_{2^{m}}( G(u_{{\mathcal {T}}})) \right| }&\le {\left| E_{2^{m}}( G(u - u_{{\mathcal {T}}})) \right| } + {\left| E_{2^{m}}( G(u)) \right| } \\&\le 2 \max _{{{\varvec{y}}} \in P_{m-1} \cup P_m } G(u - u_{{\mathcal {T}}({{\varvec{y}}})})({{\varvec{y}}}) + {\left| E_{2^{m}}( G(u)) \right| } \\&\le 2c^{*}\left\| G \right\| _{V^{*}}{\varepsilon }_{F} + {\left| E_{2^{m}}( G(u)) \right| }. \end{aligned}$$

Proposition 2.1 also implies that \( {\left| E_{2^{m}}( G(u)) \right| } \rightarrow 0 \) as \( m \rightarrow \infty \) and hence the claim.

Now we show the second item: we separate the error due to the Finite Element discretization from the QMC integration error as follows

$$\begin{aligned} {\left| I(G(u)) - Q_{2^{m}}(G(u_{{\mathcal {T}}})) \right| }&\le {\left| Q_{2^{m}}(G(u-u_{{\mathcal {T}}})) \right| } + {\left| I(G(u))-Q_{2^{m}}(G(u)) \right| }. \end{aligned}$$

For the FEM error we have

$$\begin{aligned} {\left| Q_{2^m} (G(u-u_{{\mathcal {T}}})) \right| } \le \max _{{{\varvec{y}}}\in P_{m}} {\left| G(u({{\varvec{y}}})-u_{{\mathcal {T}}}({{\varvec{y}}})) \right| } \le c^{*} \left\| G \right\| _{V^{*}} {\varepsilon }_{F}. \end{aligned}$$
(22)

For the QMC error we apply Proposition 2.1 to get for all \( \delta > 0 \)

$$\begin{aligned} {\left| I(G(u))-Q_{2^{m}}(G(u)) \right| }&\le \big ({\left| E_{2^m}(G(u_{{\mathcal {T}}})) \right| } + 2c^{*}\left\| G \right\| _{V^{*}} {\varepsilon }_{F} \big ) + \mathcal {O}\!\left( 2^{-2m +\delta }\right) \\&\le \big ({\varepsilon }_{Q} + 2c^{*}\left\| G \right\| _{V^{*}} {\varepsilon }_{F} \big ) + \mathcal {O}\!\left( 2^{-2m +\delta }\right) . \end{aligned}$$

Hence, for given \( {\varepsilon }\) we can choose \( {\varepsilon }_{F} := \frac{{\varepsilon }}{6\left\| G \right\| _{V^{*}}}\) and \( {\varepsilon }_{Q} := \frac{2}{5} c^{*}{\varepsilon }> 2c^{*}\left\| G \right\| _{V^{*}} {\varepsilon }_{F} \) and obtain

$$\begin{aligned} \left| I(G(u)) - Q_{2^{m}}(G(u_{{\mathcal {T}}})) \right|&\le \left( {\varepsilon }_{Q} + 3 c^{*} \left\| G \right\| _{V^{*}} {\varepsilon }_{F} \right) + \mathcal {O}\left( 2^{-2m + \delta }\right) \nonumber \\&\le c^{*}{\varepsilon }+ \mathcal {O}\!\left( 2^{-2m + \delta }\right) . \end{aligned}$$
(23)

\(\square \)

We remark that a sharp value for the reliability constant \( c^{*} \) is usually not known but (potentially pessimistic) upper bounds exist. The size of \( c^{*} \) can be controlled for structured meshes and refinement by bisection in spatial dimension \( d=2 \).

Algorithm 2 entails a decoupling of the QMC sampling from the AFEM solver. In practice, this implies that an adaptive software can be integrated into such an algorithm in a non-intrusive manner, provided that the reliability (20) is satisfied for some variational space V and \( G \in V^{*} \). This feature can be advantageous in many situations, especially when a solver is complex to implement. However, it presents two main computational difficulties:

  • Algorithm 2 recomputes a mesh \( {\mathcal {T}} \ge {\mathcal {T}}_{0} \) for the domain D, for each QMC sample \( {{\varvec{y}}} \in P_m \) as well as for all iterations over m, which for complex geometries is an expensive step.

  • Imposing the same AFEM threshold (20) for all QMC points can be unnecessarily strict, since we are primarily interested in the average over the parameter space.

In what follows we propose two alternative algorithms that improve upon Algorithm 2 in these respects. The first is a modification of Algorithm 2 that recycles part of the computation from previous iterations over m.

Algorithm 3 is motivated by the following heuristic. If there exists a metric \( d_{{\varvec{\beta }}}:U \times U \rightarrow [0,\infty ) \) and a Lipschitz constant \( L > 0 \) satisfying

$$\begin{aligned} \max (\left\| u_{{\mathcal {T}}}({{\varvec{y}}}) - u_{{\mathcal {T}}}({{\varvec{y}}}') \right\| _{V}, \left\| u({{\varvec{y}}}) - u({{\varvec{y}}}') \right\| _{V} )\le & {} L d_{{\varvec{\beta }}}({{\varvec{y}}}, {{\varvec{y}}}') \nonumber \\&\qquad \forall \, {{\varvec{y}}}, {{\varvec{y}}}' \in U, \, \forall \, {\mathcal {T}} \in \mathbb {T}, \end{aligned}$$
(24)

then

$$\begin{aligned} \left\| u({{\varvec{y}}}) - u_{{\mathcal {T}}({{\varvec{y}}}')}({{\varvec{y}}}) \right\| _{V}&\le 2Ld_{{\varvec{\beta }}}({{\varvec{y}}}, {{\varvec{y}}}') + \left\| u({{\varvec{y}}}') - u_{{\mathcal {T}}({{\varvec{y}}}')}({{\varvec{y}}}') \right\| _{V} \\&\le 2Ld_{{\varvec{\beta }}}({{\varvec{y}}}, {{\varvec{y}}}') + c^{*} {\varepsilon }_{F}. \end{aligned}$$

In particular, for a small distance between the parameters we have a good chance to meet the \( {\text {AFEM}}\) tolerance with just one call of the \( {\text {SOLVE}}\) module, starting from the mesh \( {\mathcal {T}}({{\varvec{y}}}') \).

[Algorithm 3: QMC–AFEM with mesh recycling between nearby samples; listing not reproduced here.]
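Algorithm 3 is likewise rendered as a figure in the original; the sketch below follows the recycling heuristic just described, with the metric \( d_{{\varvec{\beta }}} \), the memory depth q and the routine `afem_solve` (which here also returns the final mesh, so that it can be reused) taken as assumptions from the surrounding discussion.

```python
def qmc_afem_recycle(G, lattice_points, afem_solve, mesh0, d_beta,
                     eps_F, eps_Q, q=1, m_max=30):
    """Sketch of Algorithm 3: as Algorithm 2, but each sample y starts AFEM
    from the mesh of the closest previously visited parameter (w.r.t. the
    metric d_beta) among the last q levels, instead of the coarse mesh mesh0.

    afem_solve(y, start_mesh, eps_F) is assumed to return the pair
    (T(y), u_{T(y)}(y)) with the reliability property (20)."""
    history = {}                           # level -> list of (y', T(y'))
    Q_prev = None
    for m in range(1, m_max + 1):
        P_m = lattice_points(m)
        recent = [c for lvl in range(max(1, m - q), m) for c in history[lvl]]
        level, total = [], 0.0
        for y in P_m:
            start = mesh0
            if recent:                     # recycle the nearest cached mesh
                start = min(recent, key=lambda c: d_beta(y, c[0]))[1]
            mesh_y, u_y = afem_solve(y, start, eps_F)
            level.append((y, mesh_y))
            total += G(u_y)
        history[m] = level
        Q = total / len(P_m)               # Q_{2^m}(G(u_T))
        if Q_prev is not None and abs(Q - Q_prev) <= eps_Q:
            return Q, m
        Q_prev = Q
    raise RuntimeError("m_max reached before the QMC estimator met eps_Q")
```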

Following verbatim the proof of Proposition 3.1, we obtain convergence of Algorithm 3. The parameter \( q \in {\mathbb {N}}\) in line 6 regulates how much information from previous iterations we use. A discussion of possible choices \( q = q({\varepsilon }) \) depending on the tolerance is given in Appendix C below.

Proposition 3.2

Let \( {\mathcal {T}}_{0} \in \mathbb {T}\). Assume that \( G \in V^* \), that u satisfies either (9) or (10) for a sequence \( {\varvec{\beta }}\) as in Proposition 2.1 and that \( \forall \, {{\varvec{y}}} \in U \), \( \forall \, {\mathcal {T}} \in \mathbb {T}\) and any tolerance tol, AFEM, starting from the initial mesh \( {\mathcal {T}} \), returns in finite time a mesh \( {\mathcal {T}}({{\varvec{y}}}) \ge {\mathcal {T}} \) and \( u_{{\mathcal {T}}({{\varvec{y}}})}({{\varvec{y}}}) \) such that (20) holds for a constant \( c^{*} \) independent of tol and \( {\mathcal {T}} \). Then, for any \( {\varepsilon }\), there exist choices \( {\varepsilon }_{Q} \) and \( {\varepsilon }_{F}:=tol \), with \( {\varepsilon }^{-1}{\varepsilon }_{F},{\varepsilon }^{-1}{\varepsilon }_{Q} \) independent of \( {\varepsilon }\) such that

  1.

    Algorithm 3 stops in finite time and

  2.

    it produces an approximation of I(G(u)) within tolerance \( c^{*}{\varepsilon }+ \mathcal {O}\!\left( 2^{-2m + \delta }\right) \), with constant hidden in \( \mathcal {O}\!\left( \cdot \right) \) independent of s.

3.2 Goal Oriented AFEM – Part 1

For the convergence of Algorithms 2 and 3, we assumed (20). This assumption alone does not yield an optimal convergence rate of the AFEM module; as a consequence, the Finite Element error is overestimated and the spatial domain D could be over-refined in the algorithms. Nevertheless, we only require a reliable upper bound for the difference \( {\left| G(u({{\varvec{y}}})) - G(u_{\mathcal {T}}({{\varvec{y}}})) \right| } \), which in many situations converges to 0 faster than \( \left\| u({{\varvec{y}}}) - u_{{\mathcal {T}}}({{\varvec{y}}}) \right\| _{V} \) as we refine \( {\mathcal {T}} \), by an Aubin-Nitsche duality argument.

Let \( {\mathcal {T}} \in \mathbb {T}\) and \( {{\varvec{y}}}\in U \); we define \( z({{\varvec{y}}}) \in W \) as the unique solution of the dual problem

$$\begin{aligned} \mathfrak {a}_{{{\varvec{y}}}}(v,z({{\varvec{y}}})) = G(v) \quad \forall v \in V. \end{aligned}$$
(25)

Then, for all \( w_{{\mathcal {T}}} \in W_{{\mathcal {T}}} \),

$$\begin{aligned} {\left| G(u({{\varvec{y}}})) - G(u_{\mathcal {T}}({{\varvec{y}}})) \right| }&= {\left| \mathfrak {a}_{{{\varvec{y}}}}(u({{\varvec{y}}}),z({{\varvec{y}}})) - \mathfrak {a}_{{{\varvec{y}}}}(u_{{\mathcal {T}}}({{\varvec{y}}}),z({{\varvec{y}}})) \right| } \nonumber \\&= {\left| \mathfrak {a}_{{{\varvec{y}}}}(u({{\varvec{y}}}) - u_{{\mathcal {T}}}({{\varvec{y}}}),z({{\varvec{y}}}) - w_{{\mathcal {T}}}) \right| } \nonumber \\&\le \Lambda \left\| u({{\varvec{y}}}) - u_{{\mathcal {T}}}({{\varvec{y}}}) \right\| _{V}\left\| z({{\varvec{y}}}) - w_{{\mathcal {T}}} \right\| _{W}. \end{aligned}$$
(26)

When the goal functional \( G \in V^{*} \) has additional regularity, i.e. it belongs to a suitable subspace \( H \subseteq V^{*} \), then for \( h_{{\mathcal {T}}}:=\max _{T\in {\mathcal {T}}}{{\text {diam}}(T)} \)

$$\begin{aligned} \lim _{h_{{\mathcal {T}}} \rightarrow 0}\inf _{w_{\mathcal {T}} \in W_{{\mathcal {T}}}} \left\| z({{\varvec{y}}}) - w_{{\mathcal {T}}} \right\| _{W} = 0. \end{aligned}$$
(27)

However, in general AFEM produces meshes that are not quasi-uniform. We can instead exploit the regularity of G as follows: we pick \( w_{{\mathcal {T}}} := z_{{\mathcal {T}}}({{\varvec{y}}}) \in W_{{\mathcal {T}}} \), the FE solution of

$$\begin{aligned} \mathfrak {a}_{{{\varvec{y}}}}(v,z_{{\mathcal {T}}}({{\varvec{y}}})) = G(v) \quad \forall v \in V_{{\mathcal {T}}} \end{aligned}$$
(28)

and define reliable indicators \( {\left\{ \zeta _{{{\varvec{y}}},{\mathcal {T}}}(T) \right\} }_{T\in {\mathcal {T}}} \) such that, for a constant \( c^{**} > 0 \) independent of \( {{\varvec{y}}}, {\mathcal {T}} \in \mathbb {T}\),

$$\begin{aligned} \left\| z({{\varvec{y}}}) - z_{{\mathcal {T}}}({{\varvec{y}}}) \right\| _{W} \le c^{**} \left( \sum _{T\in {\mathcal {T}}} \zeta _{{{\varvec{y}}},{\mathcal {T}}}^2(T)\right) ^{\frac{1}{2}}. \end{aligned}$$
(29)

Similarly to \( \eta _{{{\varvec{y}}},{\mathcal {T}}}(T) \), each indicator \( \zeta _{{{\varvec{y}}},{\mathcal {T}}}(T) \) has the purpose of estimating the local error of the dual FEM problem (28). Combining (26) and (29), we can use the following a-posteriori estimator as termination criterion for \( {\text {AFEM}}\) in Algorithms 2 and 3

$$\begin{aligned} {\left| G(u({{\varvec{y}}})) - G(u_{\mathcal {T}}({{\varvec{y}}})) \right| } \lesssim \left( \sum _{T\in {\mathcal {T}}} \eta _{{{\varvec{y}}},{\mathcal {T}}}^2(T) \sum _{T\in {\mathcal {T}}} \zeta _{{{\varvec{y}}},{\mathcal {T}}}^2(T)\right) ^{\frac{1}{2}} \le {\varepsilon }_{F}. \end{aligned}$$
(30)

Furthermore, as shown in [17], suitable marking strategies driven by both indicators \( \eta _{{{\varvec{y}}},{\mathcal {T}}},\zeta _{{{\varvec{y}}},{\mathcal {T}}} \) yield optimal convergence of the resulting goal oriented AFEM (or goAFEM) algorithm, provided that the axioms of adaptivity (A1–A4 in [5]) hold for the indicators \( \eta _{{{\varvec{y}}},{\mathcal {T}}}(T), \zeta _{{{\varvec{y}}},{\mathcal {T}}}(T) \). If, in Algorithms 2 and 3, we replace \( {\text {AFEM}}\) by \( {\text {goAFEM}} \), then the results of Propositions 3.1 and 3.2 remain valid. As a side advantage, we do not need to include \( \left\| G \right\| _{V^{*}} \) in the FEM tolerance \( {\varepsilon }_{F} \).

Remark 3.3

Note that to use (30) we must solve (28) numerically for each sample \( {{\varvec{y}}} \in P_m, m = 1,2,\ldots \), until the tolerance is met. However, the stiffness matrix of the dual problem coincides with the transpose of the stiffness matrix of the primal one; thus the additional work for the solution of (28) consists only of the construction of the load vector corresponding to G (independent of \( {{\varvec{y}}} \)) and one linear solve per sample – in particular, it is independent of the parametric dimension \( s \in {\mathbb {N}}\).
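A minimal sketch of this remark (the sparse LU factorization and the toy matrix below are illustrative choices, not taken from the paper): one factorization of the stiffness matrix serves both the primal system (15) and the transposed dual system (28).

```python
import numpy as np
import scipy.sparse as sp
import scipy.sparse.linalg as spla

def primal_and_dual_solve(A, f_vec, g_vec):
    """Given the stiffness matrix A of a_y on (V_T, W_T), the load vector of
    l_y and the load vector of the goal functional G, solve the primal
    problem (15) and the dual problem (28), which uses A^T, reusing one LU
    factorization of A."""
    lu = spla.splu(sp.csc_matrix(A))
    u = lu.solve(f_vec)               # primal: A u = f
    z = lu.solve(g_vec, trans='T')    # dual:   A^T z = g
    return u, z

# tiny illustration with a random nonsymmetric, diagonally dominant matrix
if __name__ == "__main__":
    n = 50
    rng = np.random.default_rng(1)
    A = sp.random(n, n, density=0.1, random_state=1) + 10 * sp.eye(n)
    f_vec, g_vec = rng.standard_normal(n), rng.standard_normal(n)
    u, z = primal_and_dual_solve(A, f_vec, g_vec)
    print(np.linalg.norm(A @ u - f_vec), np.linalg.norm(A.T @ z - g_vec))
```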

Remark 3.4

The axioms of adaptivity [5, (A1)–(A2)] are analogous to (16) and (17), while [5, (A4)] is a discrete version of (18). Quasi-orthogonality [5, (A3)] holds trivially for symmetric bilinear forms \( \mathfrak {a}_{{{\varvec{y}}}} \), \( {{\varvec{y}}}\in U \); here, however, we do not assume symmetry, so it must be verified on a case-by-case basis in order to obtain optimal convergence of \( {\text {goAFEM}} \).

3.3 Indicator Averaging

Next, we design an iterative algorithm that refines the mesh or increases the number of samples at each step. In contrast to the previous algorithms, at any given time we employ only one mesh of the domain D for all \( {{\varvec{y}}} \in U \). In this case, we will assume a-priori uniform convergence, slightly stronger than the a-priori convergence in (21).

Assumption 3.1

Denote by \( (u_{{\mathcal {T}}_\ell }({{\varvec{y}}}))_{\ell \in {\mathbb {N}_{0}}} \) the sequence of approximations produced by Algorithm 4 with \( {\mathcal {T}}_{\ell +1} = {\text {REFINE}}({\mathcal {T}}_{\ell }, {\mathcal {M}}_{\ell }) \), \( {\mathcal {M}}_\ell = {\text {MARK}}({\left\{ \eta _{{\mathcal {T}}_{\ell }}(T) \right\} } ) \subseteq {\mathcal {T}}_{\ell } \) for all \( \ell \in {\mathbb {N}_{0}}\). We assume that there exists \( u_{\infty } \in C^{0}(U,V) \) such that

$$\begin{aligned} \left\| u_{\infty } - u_{{\mathcal {T}}_\ell } \right\| _{L^{\infty }(U,V)} \rightarrow 0 \quad \text{ as } \ell \rightarrow \infty . \end{aligned}$$
(31)
[Algorithm 4: QMC–AFEM with indicator averaging on a single common mesh; listing not reproduced here.]
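Algorithm 4 is rendered as a figure in the original; the sketch below is consistent with Assumption 3.1 and the proof of Theorem 3.5: a single mesh is kept for all samples, refinement is driven by the averaged indicators, and m is increased only after the averaged FEM criterion (32) is met. The module routines are the same placeholders as in the sketch of Algorithm 1.

```python
def qmc_afem_averaged(G, lattice_points, solve, estimate, mark, refine,
                      mesh0, eps_F, eps_Q, m_max=30):
    """Sketch of Algorithm 4: one common mesh for all QMC nodes.  For fixed m,
    the mesh is refined using the quadratic mean over P_m of the local
    indicators until the averaged criterion (32) is met; then the estimator
    (13) is checked and, if necessary, m is increased (the mesh is kept)."""
    mesh, Q_prev = mesh0, None
    for m in range(1, m_max + 1):
        P_m = lattice_points(m)
        while True:
            sols = {tuple(y): solve(mesh, y) for y in P_m}
            # averaged squared indicators: eta_T(T)^2 = 2^{-m} sum_y eta_{y,T}(T)^2
            eta_sq = {}
            for y in P_m:
                for cell, val in estimate(mesh, sols[tuple(y)], y).items():
                    eta_sq[cell] = eta_sq.get(cell, 0.0) + val ** 2 / len(P_m)
            if sum(eta_sq.values()) ** 0.5 <= eps_F:
                break                                 # criterion (32)
            mesh = refine(mesh, mark({c: v ** 0.5 for c, v in eta_sq.items()}))
        Q = sum(G(u) for u in sols.values()) / len(P_m)   # Q_{2^m}(G(u_T))
        if Q_prev is not None and abs(Q - Q_prev) <= eps_Q:
            return Q, m
        Q_prev = Q
    raise RuntimeError("m_max reached before the QMC estimator met eps_Q")
```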

Theorem 3.5

Let \( {\mathcal {T}}_{0} \in \mathbb {T}\). Assume that \( G \in V^{*}\), and that \( \forall {\mathcal {T}} \in \mathbb {T}\) \( u,u_{{\mathcal {T}}}\) satisfy either (9) or (10) for a sequence \( {\varvec{\beta }}\) as in Proposition 2.1. Impose that the AFEM modules satisfy Assumption 2.2 and Assumption 3.1. Then, for all \( {\varepsilon }> 0 \) there exist \( {\varepsilon }_{F},{\varepsilon }_{Q} \), with \( {\varepsilon }^{-1}{\varepsilon }_{F},{\varepsilon }^{-1}{\varepsilon }_{Q} \) independent of \( {\varepsilon }\), such that

  1.

    Algorithm 4 stops in finite time and

  2.

    it produces an approximation of I(G(u)) within tolerance \( c^{*}{\varepsilon }+ \mathcal {O}\!\left( 2^{-2m + \delta }\right) \), with constant hidden in \( \mathcal {O}\!\left( \cdot \right) \) independent of s.

Proof

Fix \( m \in {\mathbb {N}}\). For a mesh \( {\mathcal {T}} \) on D define \( \eta _{{\mathcal {T}}}(T):= \left( \frac{1}{2^m}\sum _{{{\varvec{y}}} \in P_m} \eta _{{{\varvec{y}}},{\mathcal {T}}}^2(T)\right) ^{\frac{1}{2}} \) the quadratic mean over \( {{\varvec{y}}} \in P_m \) of the local indicators. Since \( {\left\{ \eta _{{{\varvec{y}}},{\mathcal {T}}}(T) \right\} }_T \) satisfy (17) for all \( {{\varvec{y}}} \in P_m \), and all \( {\mathcal {T}}_{0}\le {\mathcal {T}}\le {\mathcal {T}}' \), we get

$$\begin{aligned} \sum _{T \in {\mathcal {T}}'{\setminus } {\mathcal {T}}} \eta _{{\mathcal {T}}'}^2(T)&\le q_{red} \sum _{T \in {\mathcal {T}}{\setminus } {\mathcal {T}}'} \eta _{{\mathcal {T}}}^2(T) + \frac{1}{2^m} \sum _{{{\varvec{y}}} \in P_m}R(\left\| u_{{\mathcal {T}}'}({{\varvec{y}}}) - u_{{\mathcal {T}}}({{\varvec{y}}}) \right\| _{V}) \\&\le q_{red} \sum _{T \in {\mathcal {T}}{\setminus } {\mathcal {T}}'} \eta _{{\mathcal {T}}}^2(T) + R(\left\| u_{{\mathcal {T}}'} - u_{{\mathcal {T}}} \right\| _{L^{\infty }(U,V)}) \end{aligned}$$

as R is increasing and \( u_{{\mathcal {T}}'}, u_{{\mathcal {T}}} \) are continuous. Hence, also \( \eta _{{\mathcal {T}}}(T) \) has the reduction property (17), but with respect to the \( L^{\infty }(U,V) \)-norm. Similarly, from (16), monotonicity of S and Jensen inequality

$$\begin{aligned} \sum _{T \in {\mathcal {T}}\cap {\mathcal {T}}' }\eta _{{\mathcal {T}}'}^2(T) \le&\sum _{T \in {\mathcal {T}}\cap {\mathcal {T}}' }\eta _{{\mathcal {T}}}^2(T) + \frac{1}{2^m} \sum _{{{\varvec{y}}} \in P_m} S(\left\| u_{{\mathcal {T}}'}({{\varvec{y}}}) - u_{{\mathcal {T}}}({{\varvec{y}}}) \right\| _{V}) ^2 \\&+ \frac{1}{2^m} \sum _{{{\varvec{y}}} \in P_m} 2 S(\left\| u_{{\mathcal {T}}'}({{\varvec{y}}}) - u_{{\mathcal {T}}}({{\varvec{y}}}) \right\| _{V}) \left( \sum _{T \in {\mathcal {T}}\cap {\mathcal {T}}' }\eta _{{{\varvec{y}}},{\mathcal {T}}}^2(T) \right) ^{\frac{1}{2}} \\ \le&\sum _{T \in {\mathcal {T}}\cap {\mathcal {T}}' }\eta _{{\mathcal {T}}}^2(T) + S(\left\| u_{{\mathcal {T}}'} - u_{{\mathcal {T}}} \right\| _{L^{\infty }(U,V)}) ^2 \\&+ 2 S(\left\| u_{{\mathcal {T}}'} - u_{{\mathcal {T}}} \right\| _{L^{\infty }(U,V)}) \left( \sum _{T \in {\mathcal {T}}\cap {\mathcal {T}}' }\eta _{{\mathcal {T}}}^2(T) \right) ^{\frac{1}{2}} . \end{aligned}$$

Taking square roots on both sides, we obtain that \( \eta _{{\mathcal {T}}}(T) \) has the stability property (16) with respect to the \( L^{\infty }(U,V) \)-norm. Note that the modules \( {\text {MARK}},{\text {REFINE}}\) are independent of \( {{\varvec{y}}} \in U \) and we assumed a-priori convergence (31) in the same norm; therefore, from the proof of [21, Theorem 3.1], for the sequence of meshes \( ({\mathcal {T}}_{\ell })_{\ell \in {\mathbb {N}_{0}}} \) constructed by

$$\begin{aligned} {\mathcal {T}}_{\ell +1} = {\text {REFINE}}({\mathcal {T}}_{\ell },{\text {MARK}}( \{ \eta _{{\mathcal {T}}_{\ell }}(T) \}) ) \end{aligned}$$

we obtain,

$$\begin{aligned} \sum _{T \in {\mathcal {T}}_\ell }\eta _{{\mathcal {T}}_{\ell }}^2(T) \rightarrow 0 \quad \text{ as } \ell \rightarrow \infty . \end{aligned}$$

Thus, for all \( m \in {\mathbb {N}}\), any FEM tolerance \( {\varepsilon }_{F} \) is met in finite time. Since \( u_{{\mathcal {T}}} \) satisfies (9) or (10) for all \( {\mathcal {T}} \in \mathbb {T}\) (in contrast to Algorithms 2 and 3, here there is only one mesh \( {\mathcal {T}} \) at a time, used for all points \( {{\varvec{y}}} \in P_m \cup P_{m-1} \)), Proposition 2.1 implies that \( E_{2^m}(G(u_{\mathcal {T}})) \rightarrow 0 \) as \( m \rightarrow \infty \), showing that Algorithm 4 stops in finite time. The error bound follows as in Proposition 3.1: denote by \( {\mathcal {T}}^{(m)}:= {\mathcal {T}}_{\ell (m)}, m\in {\mathbb {N}}\), the mesh that meets the FEM error tolerance for the lattice \( P_m \), i.e.

$$\begin{aligned} \frac{1}{2^m}\sum _{T \in {\mathcal {T}}^{(m)} }\sum _{{{\varvec{y}}} \in P_m}\eta _{{{\varvec{y}}},{\mathcal {T}}^{(m)}}^2(T) \le {\varepsilon }_{F}^2. \end{aligned}$$
(32)

For all \( \delta > 0 \),

$$\begin{aligned} {\left| I(G(u)) - Q_{2^m}(G(u_{{\mathcal {T}}^{(m)}})) \right| } \le&{\left| Q_{2^m}(G(u - u_{{\mathcal {T}}^{(m)}})) \right| }+ {\left| E_{2^m}(G(u - u_{{\mathcal {T}}^{(m)}})) \right| } \nonumber \\&+ {\left| E_{2^m}(G(u_{{\mathcal {T}}^{(m)}})) \right| } + \mathcal {O}\!\left( 2^{-2m+\delta }\right) . \end{aligned}$$

Jensen inequality and (18) give

$$\begin{aligned} {\left| Q_{2^m}(G(u - u_{{\mathcal {T}}^{(m)}})) \right| }&\le \left\| G \right\| _{V^{*}} \frac{1}{2^m}\sum _{{{\varvec{y}}} \in P_m} \left\| u({{\varvec{y}}}) - u_{{\mathcal {T}}^{(m)}}({{\varvec{y}}}) \right\| _{V} \le c^*\left\| G \right\| _{V^{*}} {\varepsilon }_{F}. \end{aligned}$$
(33)

Note that, since \( {\mathcal {T}}^{(m-1)} \le {\mathcal {T}}^{(m)} \) as we never coarsen meshes, Galerkin orthogonality implies, for \( C({\tilde{\lambda }},\Lambda ) = 1+ \frac{\Lambda }{{\tilde{\lambda }}} \)

$$\begin{aligned} \frac{1}{2^{m-1}}\sum _{{{\varvec{y}}} \in P_{m-1}} \left\| u({{\varvec{y}}}) - u_{{\mathcal {T}}^{(m)}}({{\varvec{y}}}) \right\| _{V}&\le \frac{C({\tilde{\lambda }},\Lambda )}{ 2^{m-1}}\sum _{{{\varvec{y}}} \in P_{m-1}} \left\| u({{\varvec{y}}}) - u_{{\mathcal {T}}^{(m-1)}}({{\varvec{y}}}) \right\| _{V} \\&\le C({\tilde{\lambda }},\Lambda )c^* {\varepsilon }_{F}, \end{aligned}$$

whence

$$\begin{aligned} {\left| E_{2^m}(G(u - u_{{\mathcal {T}}^{(m)}})) \right| }\le \left( 1+C({\tilde{\lambda }},\Lambda ) \right) c^*\left\| G \right\| _{V^{*}} {\varepsilon }_{F}. \end{aligned}$$
(34)

The stopping criterion gives \( {\left| E_{2^m}(G(u_{{\mathcal {T}}^{(m)}})) \right| } \le {\varepsilon }_{Q} \) so that

$$\begin{aligned} {\left| I(G(u)) - Q_{2^m}(G(u_{{\mathcal {T}}^{(m)}})) \right| } \le \left( 2+C({\tilde{\lambda }},\Lambda ) \right) c^*\left\| G \right\| _{V^{*}} {\varepsilon }_{F} + {\varepsilon }_{Q} + \mathcal {O}\!\left( 2^{-2m+\delta }\right) \end{aligned}$$

and it is sufficient to pick \( {\varepsilon }_{F}:= \frac{{\varepsilon }}{2(2+C({\tilde{\lambda }},\Lambda ))\left\| G \right\| _{V^{*}}} \), \( {\varepsilon }_{Q}:=\frac{c^{*}{\varepsilon }}{2} \) to get the claim. \(\square \)

3.4 Goal Oriented AFEM – Part 2

We now include a goal oriented adaptivity approach in Algorithm 4. Given the estimator average \( \varphi _{{\mathcal {T}}}(T):= \left( \frac{1}{2^m}\sum _{{{\varvec{y}}} \in P_m} \varphi _{{{\varvec{y}}},{\mathcal {T}}}^2(T)\right) ^{\frac{1}{2}} \), \( \varphi \in \{\eta , \zeta \} \), we define the following indicators from [1]

$$\begin{aligned} \rho _{{\mathcal {T}}}^2(T) := \eta _{{\mathcal {T}}}^2(T)\sum _{T' \in {\mathcal {T}}}\zeta _{{\mathcal {T}}}^2(T') + \zeta _{{\mathcal {T}}}^2(T)\sum _{T' \in {\mathcal {T}}}\eta _{{\mathcal {T}}}^2(T'). \end{aligned}$$
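In array form (a small illustration under the assumption that the averaged indicators are stored as numpy vectors indexed by the cells of \( {\mathcal {T}} \)), the combined indicators read:

```python
import numpy as np

def combined_indicators(eta, zeta):
    """Goal-oriented indicators rho_T(T)^2 = eta_T(T)^2 * sum_T' zeta_T(T')^2
    + zeta_T(T)^2 * sum_T' eta_T(T')^2, built from the averaged primal and
    dual estimators."""
    return np.sqrt(eta ** 2 * np.sum(zeta ** 2) + zeta ** 2 * np.sum(eta ** 2))
```

Note that \( \sum _{T\in {\mathcal {T}}}\rho _{{\mathcal {T}}}^2(T) = 2\,\eta _{{\mathcal {T}}}^2\zeta _{{\mathcal {T}}}^2 \), which is exactly the quantity driven to zero in Proposition 3.6 below.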

Proposition 3.6

Let \( \{{\mathcal {T}}_{\ell }\}_{\ell \in {\mathbb {N}_{0}}} \) be a sequence of meshes produced with the indicators \( \rho _{{\mathcal {T}}_{\ell }}(T) \) by a marking and refinement strategy as in Assumption 2.2. Let \( K_0 := \max _{{{\varvec{y}}}\in U} (\eta _{{{\varvec{y}}}, {\mathcal {T}}_0}^2 + \zeta _{{{\varvec{y}}}, {\mathcal {T}}_0}^2) < \infty \). Assume that the estimators \(\eta _{{{\varvec{y}}},{\mathcal {T}}}, \zeta _{{{\varvec{y}}},{\mathcal {T}}} \) for the primal and dual problems satisfy the reliability bounds (18) and (29), respectively, as well as the properties (16), (17) for all \({\mathcal {T}} \in \mathbb {T}\). Then

$$\begin{aligned} \eta _{{\mathcal {T}}_{\ell }} \zeta _{{\mathcal {T}}_{\ell }} = \left( \sum _{T\in {\mathcal {T}}_{\ell }} \zeta _{{\mathcal {T}}_{\ell }}^2(T) \right) ^{\frac{1}{2}} \left( \sum _{T\in {\mathcal {T}}_{\ell }} \eta _{{\mathcal {T}}_{\ell }}^2(T) \right) ^{\frac{1}{2}} \rightarrow 0, \quad \text { as } \ell \rightarrow \infty . \end{aligned}$$

Proof

Due to (14), we have quasi-optimality of the primal and dual problems

$$\begin{aligned} \left\| u({{\varvec{y}}}) - u_{{\mathcal {T}}}({{\varvec{y}}}) \right\| _{V}&\le C({\tilde{\lambda }},\Lambda ) \inf _{v_{\mathcal {T}} \in V_{{\mathcal {T}}}} \left\| u({{\varvec{y}}}) - v_{{\mathcal {T}}} \right\| _{V} \\ \left\| z({{\varvec{y}}}) - z_{{\mathcal {T}}}({{\varvec{y}}}) \right\| _{W}&\le C({\tilde{\lambda }},\Lambda ) \inf _{w_{\mathcal {T}} \in W_{{\mathcal {T}}}} \left\| z({{\varvec{y}}}) - w_{{\mathcal {T}}} \right\| _{W}, \end{aligned}$$

for all \( {\mathcal {T}} \in \mathbb {T}\). Hence, from [5, Lemma 3.6] we get quasi-monotonicity of the estimators: there exists \( C > 0 \), independent of \( {{\varvec{y}}} \in U, {\mathcal {T}} \in \mathbb {T}\), such that

$$\begin{aligned} \sum _{T \in {\mathcal {T}}}\varphi _{{{\varvec{y}}},{\mathcal {T}}}^2(T) \le C \sum _{T \in {\mathcal {T}}}\varphi _{{{\varvec{y}}}, {\mathcal {T}}_0}^2(T) < CK_0 \qquad \text { with } \varphi \in \{\eta , \zeta \}. \end{aligned}$$
(35)

The axioms (16) and (17) for the indicators \( \rho _{{\mathcal {T}}}(T) \) are verified as in Theorem 3.5, using additionally that \( K_0 < \infty \). Therefore, the claim follows from [21, Theorem 3.1]: \( \sum _{T\in {\mathcal {T}}_{\ell }}\rho _{{\mathcal {T}}_{\ell }}^2(T) = 2 \eta _{{\mathcal {T}}_{\ell }}^2\zeta _{{\mathcal {T}}_{\ell }}^2 \rightarrow 0\) as \( \ell \rightarrow \infty \). \(\square \)

As termination criterion for the spatial refinement we impose

$$\begin{aligned} \frac{1}{2^m} \sum _{{{\varvec{y}}} \in P_m} {\left| G(u({{\varvec{y}}})) - G(u_{\mathcal {T}}({{\varvec{y}}})) \right| } \lesssim \left( \sum _{T\in {\mathcal {T}}} \zeta _{{\mathcal {T}}}^2(T) \right) ^{\frac{1}{2}} \left( \sum _{T\in {\mathcal {T}}} \eta _{{\mathcal {T}}}^2(T) \right) ^{\frac{1}{2}} \le {\varepsilon }_{F}. \end{aligned}$$
(36)

Convergence of a goal oriented adaptive QMC–FEM algorithm follows by replacing (33) with the latter bound.

3.5 Higher Moments and Lipschitz Goal Functionals

Although we confined the analysis to linear \( G \in V^* \), inspection of the proofs reveals that the results of the previous sections carry over to sufficiently smooth functionals \( {\hat{G}} :V \rightarrow {\mathbb {R}}\). In particular, instead of the expectation of G(u) we can obtain higher moments by setting \( {\hat{G}} = G^k \) for some \( k \in {\mathbb {N}}\), \( G \in V^* \). The additional steps required in the proofs read as follows. First, from (8) and a local Lipschitz estimate for \( G^k \) we can bound the FEM error in (22), (33) as

$$\begin{aligned} {\left| G^k(u({{\varvec{y}}})) - G^k(u_{{\mathcal {T}}}({{\varvec{y}}})) \right| }&= {\left| G(u({{\varvec{y}}})) - G(u_{{\mathcal {T}}}({{\varvec{y}}})) \right| } {\left| \sum _{i = 0}^{k-1} G^{k-i-1}(u({{\varvec{y}}}))G^{i}(u_{{\mathcal {T}}}({{\varvec{y}}})) \right| } \\&\lesssim {\left| G(u({{\varvec{y}}})) - G(u_{{\mathcal {T}}}({{\varvec{y}}})) \right| } \\&\le \left\| G \right\| _{V^*} \left\| u({{\varvec{y}}}) - u_{{\mathcal {T}}}({{\varvec{y}}}) \right\| _{V}. \end{aligned}$$

Note that for \( k=1 \) the last inequality was sufficient. Moreover, the first inequality also allows us to recover the goal oriented stopping criteria (30) and (36). Second, the parametric regularity required in Proposition 2.1 follows from the multivariate product rule: for \( k = 2 \) and under assumption (9),

$$\begin{aligned} {\left| \partial _{{{\varvec{y}}}}^{{\varvec{\nu }}}G^2(u({{\varvec{y}}})) \right| }&= {\left| \sum _{{\varvec{\mu }} \le {\varvec{\nu }}} {{\varvec{\nu }} \atopwithdelims (){\varvec{\mu }}}\partial _{{{\varvec{y}}}}^{{\varvec{\mu }}}G(u({{\varvec{y}}})) \partial _{{{\varvec{y}}}}^{{\varvec{\nu }}-{\varvec{\mu }}}G(u({{\varvec{y}}})) \right| } \\&\le \left\| G \right\| _{V^*}^2 (|{\varvec{\nu }}|!)^{1+\kappa } (2^{1+\kappa }{\varvec{\beta }})^{{\varvec{\nu }}}, \end{aligned}$$

which is again of the form (9). Here we used the bound \( \sum _{{\varvec{\mu }} \le {\varvec{\nu }}} {{\varvec{\nu }} \atopwithdelims (){\varvec{\mu }}} |{\varvec{\mu }}|!|{\varvec{\nu }}-{\varvec{\mu }}|! \le 2^{|{\varvec{\nu }}|} |{\varvec{\nu }}|!\). Note that \( {\varvec{\beta }}\in \ell ^p({\mathbb {N}}) \iff 2^{1+\kappa }{\varvec{\beta }}\in \ell ^p({\mathbb {N}}) \); therefore Proposition 2.1 applies with the same choice of p, which in turn does not change with k. The case of assumption (10) follows the same steps, while regularity for higher moments \( k > 2 \) can be treated by iterating the product rule.

Hence, the computation of higher moments is covered, and the error bound changes only by a constant depending on \( k,\left\| G \right\| _{V^*}, C_{\mathfrak {l}} \) and \( \lambda \).

3.6 Examples

In the present section we illustrate the framework in a selection of model problems.

Parametric diffusion We consider a parametric stationary diffusion equation: given \( {{\varvec{y}}} \in U \), find \( u(\cdot ,{{\varvec{y}}}) \) such that (3) holds, where \( a(\cdot ,{{\varvec{y}}}) \in W^{1,\infty }(D) \) and \( f \in L^2(D) \). We select an affine-parametric diffusion coefficient: for \( {\left\{ \psi _j \right\} }_{j\in {\mathbb {N}}_0} \subset W^{1,\infty }(D) \),

$$\begin{aligned} a(x,{{\varvec{y}}}) = \psi _0(x) + \sum _{j=1}^{s}y_j \psi _j(x). \end{aligned}$$
(37)

Assume that \( \psi _0 > \psi _{0,\min } \text{ a.e. } \text{ in } D\) for a constant \( \psi _{0,\min } > 0 \), and that the sequence \( {\varvec{\beta }}\) given by \( \beta _j = \frac{\left\| \psi _j \right\| _{L^\infty (D)}}{\psi _{0,\min }} \), \( j\ge 1 \), satisfies \( \left\| {\varvec{\beta }} \right\| _{\ell ^1({\mathbb {N}})} :=\sum _{j\ge 1} \beta _{j} < 2 \) and \( {\varvec{\beta }}\in \ell ^{p}({\mathbb {N}}) \) for some \( p\in (0,\frac{1}{2}) \). The weak formulation of Eq. (3) reads: for all \( {{\varvec{y}}}\in U \), find \( u(\cdot ,{{\varvec{y}}}) \in V:= H_0^1(D) \) such that

$$\begin{aligned} \mathfrak {a}_{{{\varvec{y}}}}(u(\cdot ,{{\varvec{y}}}),v) := \int _{D} a(\cdot ,{{\varvec{y}}})\nabla u(\cdot ,{{\varvec{y}}}) \cdot \nabla v = \int _{D}fv =: \mathfrak {l}_{{{\varvec{y}}}}(v) \quad \forall v\in V. \end{aligned}$$
(38)

This model problem satisfies Assumption 2.1, and the derivative bound (9) with \( \kappa =0 \) follows from [8] or Theorem B.1. \( {\text {AFEM}}\) can be performed (with quasi-optimal convergence), for example, with first order Lagrangian elements, standard residual indicators, Dörfler marking and refinement by newest vertex bisection, as derived in [6].
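For instance (a toy construction: the specific \( \psi _j \) below are ours and serve only to illustrate (37) and the summability conditions on \( {\varvec{\beta }}\)), the affine coefficient can be evaluated as follows.

```python
import numpy as np

def affine_coefficient(x, y, psi0, psis):
    """Evaluate the affine-parametric coefficient (37),
       a(x, y) = psi_0(x) + sum_j y_j * psi_j(x), at the points x."""
    return psi0(x) + sum(y_j * psi(x) for y_j, psi in zip(y, psis))

# toy choice on D = (0,1): psi_0 = 2, psi_j(x) = j^{-3} sin(j*pi*x), so that
# beta_j = j^{-3}/2 gives ||beta||_1 < 2 and beta in l^p for some p < 1/2
if __name__ == "__main__":
    s = 16
    psi0 = lambda x: 2.0 * np.ones_like(x)
    psis = [lambda x, j=j: j ** (-3.0) * np.sin(j * np.pi * x) for j in range(1, s + 1)]
    y = np.full(s, 0.25)                 # one parameter in U = [-1/2, 1/2]^s
    print(affine_coefficient(np.linspace(0.0, 1.0, 5), y, psi0, psis))
```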

For completeness, we verify the Lipschitz continuity (24) for this model problem: denote \( u({{\varvec{y}}}) = u(\cdot ,{{\varvec{y}}}) \in V \); then the affine parametric structure of \( a(\cdot ,{{\varvec{y}}}) \) gives

$$\begin{aligned} \psi _{0,\min } \left( 1-\frac{\left\| {\varvec{\beta }} \right\| _{\ell ^1({\mathbb {N}})}}{2} \right) \left\| u({{\varvec{y}}}) - u({{\varvec{y}}}') \right\| _{V}^2&\le \mathfrak {a}_{{{\varvec{y}}}}(u({{\varvec{y}}}) - u({{\varvec{y}}}'),u({{\varvec{y}}}) - u({{\varvec{y}}}')) \\&= \left\langle f,u({{\varvec{y}}}) - u({{\varvec{y}}}') \right\rangle - \mathfrak {a}_{{{\varvec{y}}}}( u({{\varvec{y}}}'),u({{\varvec{y}}}) - u({{\varvec{y}}}')) \\&= \left\langle f,u({{\varvec{y}}}) - u({{\varvec{y}}}') \right\rangle - \mathfrak {a}_{{{\varvec{y}}}'}( u({{\varvec{y}}}'),u({{\varvec{y}}}) - u({{\varvec{y}}}')) \\&+ \sum _{j\ge 1} (y'_j - y_j) \int _{D} \psi _j \nabla u({{\varvec{y}}}') \nabla ( u({{\varvec{y}}}) \!-\! u({{\varvec{y}}}')). \end{aligned}$$

The first two terms on the right-hand side cancel, since \( u({{\varvec{y}}}') \) solves (38) for the parameter \( {{\varvec{y}}}' \) and f is independent of \( {{\varvec{y}}} \in U \). Furthermore,

$$\begin{aligned}&{\left| \sum _{j\ge 1} (y'_j - y_j) \int _{D} \psi _j \nabla u({{\varvec{y}}}') \nabla ( u({{\varvec{y}}}) - u({{\varvec{y}}}')) \right| } \\&\quad \le \psi _{0,\min } \left\| u({{\varvec{y}}}) - u({{\varvec{y}}}') \right\| _{V} \left\| u({{\varvec{y}}}') \right\| _{V}\sum _{j\ge 1} {\left| y'_j - y_j \right| } \beta _j. \end{aligned}$$

Therefore defining \(d_{{\varvec{\beta }}}({{\varvec{y}}}, {{\varvec{y}}}') := \sum _{j\ge 1} {\left| y'_j - y_j \right| } \beta _j \) and \( L := \frac{4}{\psi _{0,\min }\left( 2-\left\| {\varvec{\beta }} \right\| _{\ell ^1({\mathbb {N}})} \right) ^2} \left\| f \right\| _{V^*} \) we have the claim. The same steps hold for a FE solution \( u_{{\mathcal {T}}}({{\varvec{y}}}) \), for any \( {\mathcal {T}} \in \mathbb {T}\). This also implies Assumption 3.1 due to compactness of U.

Shape Uncertainty Quantification for the Poisson equation. Consider the following domain uncertainty problem from [26]: define a family of domains \( {\left\{ D({{\varvec{y}}}) : \, {{\varvec{y}}} \in U \right\} } \) contained in a hold-all domain \( {\mathcal {D}} := \bigcup D({{\varvec{y}}}) \). Given a reference Lipschitz polyhedron \( {\hat{D}} \subset {\mathbb {R}}^d \), \( d\in \{2,3\} \), we assume that the family is parametrized by a \( C^2({\hat{D}}) \) diffeomorphism \( \Psi :{\hat{D}}\times U \rightarrow {\mathcal {D}} \) through the relations \( D({{\varvec{y}}}) := \Psi ({\hat{D}},{{\varvec{y}}}) \) and

$$\begin{aligned} \Psi (x,{{\varvec{y}}}) = x + \sum _{j= 1}^{s} y_j \psi _j(x), \quad x \in {\hat{D}},{{\varvec{y}}}\in U \end{aligned}$$
(39)

for functions \( {\left\{ \psi _j \right\} }_{j\in {\mathbb {N}}} \subset W^{1,\infty }({\hat{D}})\) satisfying \( {\varvec{\beta }}\in \ell ^p({\mathbb {N}}) \), \( p\in (0,\frac{1}{2}) \), with \( \beta _{j} := \left\| \psi _j \right\| _{W^{1,\infty }({\hat{D}})} \). For all \( {{\varvec{y}}} \in U \), let \( u(\cdot ,{{\varvec{y}}}) \in H_0^1(D({{\varvec{y}}})) \) solve the Poisson equation, given an analytic source \( f\in C^{\infty }({\mathcal {D}}) \) (as in [26, Lemma 5]),

$$\begin{aligned} {\left\{ \begin{array}{ll} -\Delta u(x,{{\varvec{y}}}) = f(x) &{} x\in D({{\varvec{y}}}), \\ u(x,{{\varvec{y}}}) = 0 &{} x\in \partial D({{\varvec{y}}}), \end{array}\right. } \qquad {{\varvec{y}}}\in U. \end{aligned}$$
(40)

This problem can be recast by a change of variables to the reference domain: for \( V = W = H_0^1({\hat{D}}) \) and for any \( {{\varvec{y}}} \in U \), we seek \( {\hat{u}}(\cdot ,{{\varvec{y}}}) := u(\cdot ,{{\varvec{y}}}) \circ \Psi (\cdot ,{{\varvec{y}}}) \in V \) such that (2) holds with

$$\begin{aligned} \begin{aligned} \mathfrak {l}_{{{\varvec{y}}}}({\hat{w}})&:= \int _{{\hat{D}}} f\circ \Psi (x,{{\varvec{y}}}) {\hat{w}}(x) \det (J(x,{{\varvec{y}}})) {\,\mathrm {d}}x, \\ \mathfrak {a}_{{{\varvec{y}}}}({\hat{v}},{\hat{w}})&:= \int _{{\hat{D}}} A(x,{{\varvec{y}}}) \nabla {\hat{v}}(x)\cdot \nabla {\hat{w}}(x) {\,\mathrm {d}}x, \end{aligned} \end{aligned}$$
(41)

where \( A(x,{{\varvec{y}}}):= (J^{\top }(x,{{\varvec{y}}})J(x,{{\varvec{y}}}))^{-1}\det (J(x,{{\varvec{y}}})) \) and \( J(x,{{\varvec{y}}}):= \nabla _x \Psi (x,{{\varvec{y}}}) \) is the Jacobian matrix of \( \Psi \). In [26,  Theorem 5], the authors provide a derivative bound of the form (9), with \( \kappa =0 \), for \( {\hat{u}} \). The \( {\text {AFEM}}\) modules are analogous to those of the previous example (here with a parametric, matrix-valued diffusion coefficient); the applicability of Algorithms 2 and 4 is straightforward.
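
To make the change of variables concrete, a small Python sketch is given below: it approximates \( J(x,{{\varvec{y}}}) \) by central finite differences for a user-supplied map \( \Psi \) and returns \( A(x,{{\varvec{y}}}) \) as in (41); the displacement fields used in \( \Psi \) are placeholders of our own choosing, not those of [26].

```python
import numpy as np

def transported_diffusion(Psi, x, y, h=1e-6):
    """Approximate J(x,y) = grad_x Psi(x,y) by central differences and
    return the matrix A(x,y) = (J^T J)^{-1} det(J) of (41)."""
    d = x.size
    J = np.empty((d, d))
    for k in range(d):
        e = np.zeros(d); e[k] = h
        J[:, k] = (Psi(x + e, y) - Psi(x - e, y)) / (2.0 * h)
    return np.linalg.inv(J.T @ J) * np.linalg.det(J)

# placeholder parametrization Psi(x,y) = x + sum_j y_j psi_j(x), cf. (39)
def Psi(x, y):
    psi = [np.array([np.sin(np.pi * x[0]) * x[1], 0.0]) * (j + 1) ** -2.0
           for j in range(len(y))]
    return x + sum(yj * p for yj, p in zip(y, psi))

A = transported_diffusion(Psi, x=np.array([0.3, 0.7]), y=np.array([0.1, -0.2]))
```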

Linear elasticity of nearly incompressible materials. Robust approximation of linear elasticity in the incompressible limit (that is, Poisson ratio \( \nu \in (0,\frac{1}{2}) \) approaching \( \frac{1}{2} \)) was studied in [29, 30] by the following three-field formulation. Let \( E(x,{{\varvec{y}}}) = e_0(x) + \sum _{j = 1}^{s} y_j e_j(x) \in L^{\infty }(D)\) be the (affine-parametric) Young modulus, with \( 0< e_{0,\min }< e_0(x) < e_{0,\max } \) a.e. for constants \( e_{0,\min },e_{0,\max } \). Define the strain tensor \( {\varepsilon }(v) :=\frac{1}{2}[\nabla v + (\nabla v) ^{\top }] \) of a vector field \( v:D \rightarrow {\mathbb {R}}^d \) and let \( f\in L^2(D)^d \). Assume mixed Dirichlet–Neumann boundary conditions, prescribed on \( \Gamma _D \) and \( \Gamma _N \) respectively, both of positive length, with \( \Gamma _D\cup \Gamma _N = \partial D \), \( \Gamma _D\cap \Gamma _N = \emptyset \). Introducing the additional variable \( {\tilde{p}}(x,{{\varvec{y}}}) = p(x,{{\varvec{y}}})/E(x,{{\varvec{y}}}) \), where \( p(x,{{\varvec{y}}}) \) is the (parameter-dependent) Herrmann pressure, we can write the linear elasticity equations as

$$\begin{aligned} {\left\{ \begin{array}{ll} -{\text {div}}\left( \frac{E(x,{{\varvec{y}}})}{(1+\nu )}{\varepsilon }(u(x,{{\varvec{y}}})) \right) + \nabla p(x,{{\varvec{y}}}) = f(x) &{} x\in D, {{\varvec{y}}}\in U\\ {\text {div}}(u(x,{{\varvec{y}}})) + c^{-1} {\tilde{p}}(x,{{\varvec{y}}}) = 0 &{} x \in D,{{\varvec{y}}}\in U \\ c^{-1} p(x,{{\varvec{y}}}) - c^{-1} E(x,{{\varvec{y}}}) {\tilde{p}}(x,{{\varvec{y}}}) = 0 &{} x \in D,{{\varvec{y}}}\in U \\ u(x,{{\varvec{y}}}) = 0 &{} x\in \Gamma _D,{{\varvec{y}}}\in U \\ \left( \frac{E(x,{{\varvec{y}}})}{1+\nu } {\varepsilon }(u(x,{{\varvec{y}}})) -p(x,{{\varvec{y}}})I\right) {{\varvec{n}}}(x) = 0 &{} x\in \Gamma _N ,{{\varvec{y}}}\in U \end{array}\right. } \end{aligned}$$
(42)

for the constant \( c = c(\nu ) := \frac{\nu }{(1+\nu )(1-2\nu )} \), which depends only on \( \nu \in (0,\frac{1}{2}) \). The weak formulation reads: for all \( {{\varvec{y}}} \in U \), find \( (u(\cdot ,{{\varvec{y}}}),p(\cdot ,{{\varvec{y}}}),{\tilde{p}}(\cdot ,{{\varvec{y}}})) \in V := W := H_{\Gamma _D }^1(D)^d \times L^2(D) \times L^2(D) \) such that (2) holds for the bilinear form

$$\begin{aligned}&\mathfrak {a}_{{{\varvec{y}}}}((v,g,{\tilde{g}}),(w,q,{\tilde{q}})) = \int _D \frac{E(\cdot ,{{\varvec{y}}})}{1+\nu }{\varepsilon }(v):{\varepsilon }(w) -\int _{D}g{\text{ div }}w - \int _{D}q{\text{ div }}v - c^{-1}\int _{D} {\tilde{g}}\, q \nonumber \\&\quad - c^{-1} \int _{D} g \, {\tilde{q}} + c^{-1} \int _{D} E(\cdot ,{{\varvec{y}}}){\tilde{g}}\, {\tilde{q}} \qquad \forall (v,g,{\tilde{g}}),(w,q,{\tilde{q}}) \in V \end{aligned}$$
(43)

and \( \mathfrak {l}_{{{\varvec{y}}}}((w,q,{\tilde{q}})) = \int _{D}f\cdot w \). We equip V with the norm (related to [30,  Equation (2.21)], but without integration over the parameter space)

$$\begin{aligned} \left\| (w,q,{\tilde{q}}) \right\| _{V}^{2} := \frac{1}{1+\nu } \left\| \nabla w \right\| _{L^2(D)}^2 + (1+\nu + c^{-1}) \left\| q \right\| _{L^2(D)}^2 + c^{-1} \left\| {\tilde{q}} \right\| _{L^2(D)}^2. \end{aligned}$$
(44)
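
For reference, the norm (44) is straightforward to evaluate from the \( L^2 \) norms of the components; a short Python sketch (with variable names of our own) reads:

```python
import numpy as np

def c_of_nu(nu):
    """Constant c(nu) = nu / ((1 + nu)(1 - 2 nu)), cf. (42)."""
    return nu / ((1.0 + nu) * (1.0 - 2.0 * nu))

def triple_norm(grad_w_L2, q_L2, qt_L2, nu):
    """Norm (44) of a triple (w, q, q~), given ||grad w||_{L^2}, ||q||_{L^2}, ||q~||_{L^2}."""
    c = c_of_nu(nu)
    return np.sqrt(grad_w_L2 ** 2 / (1.0 + nu)
                   + (1.0 + nu + 1.0 / c) * q_L2 ** 2
                   + qt_L2 ** 2 / c)

# example for a nearly incompressible material
print(triple_norm(grad_w_L2=1.0, q_L2=0.2, qt_L2=0.1, nu=0.499))
```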

The main motivation for introducing the three-field formulation is that E appears only in the numerator, so that the bilinear form is, in particular, affine-parametric. We thus verify the criteria of Theorem B.1. The nominal operator \( A_0:V\rightarrow V^* \), induced by \( \mathfrak {a}_{{\varvec{0}}} \), is linear and boundedly invertible by [30,  Theorem 2.4], with norm \( \left\| A_0^{-1} \right\| _{} \le \frac{K_0(1+\nu )^{1/2}}{e_{0,\min }} \), for a constant \( K_0 > 0 \) depending on D and \(\left\| e_0 \right\| _{L^{\infty }(D)} \). Moreover, the fluctuations \( {\left\{ A_j \right\} }_j \), in the notation of Theorem B.1, satisfy, for all triples \( (v,g,{\tilde{g}}),(w,q,{\tilde{q}}) \in V \),

$$\begin{aligned}&\left\langle A_j(v,g,{\tilde{g}}),(w,q,{\tilde{q}}) \right\rangle = \frac{1}{1+\nu }\int _D e_j {\varepsilon }(v):{\varepsilon }(w) + c^{-1}\int _D e_j {\tilde{g}}{\tilde{q}} \nonumber \\&\quad \le \left\| e_j \right\| _{L^{\infty }(D)} \left\| (v,g,{\tilde{g}}) \right\| _{V}\left\| (w,q,{\tilde{q}}) \right\| _{V}, \end{aligned}$$
(45)

that is, \( \left\| A_j \right\| _{} \le \left\| e_j \right\| _{L^{\infty }(D)} \). Therefore, to obtain (9) we assume \( \left\| {\varvec{\beta }} \right\| _{\ell ^1({\mathbb {N}})} < 2 \) and \( {\varvec{\beta }}\in \ell ^p({\mathbb {N}}) \) for some \( p\in (0,\frac{1}{2}) \), where

$$\begin{aligned} \beta _{j} := \frac{K_0(1+\nu )^{1/2}}{e_{0,\min }} \left\| e_j \right\| _{L^{\infty }(D)}. \end{aligned}$$
(46)

With these choices, the formulation fits Assumption 2.1 by [30,  Lemma 2.3] and Theorem B.1; hence the problem is well-posed and (9) holds with \( p<\frac{1}{2} \). Lipschitz continuity (24) follows as in the first example. Any converging AFEM solver (not necessarily conforming) for (42) ensures that Algorithms 2 and 3 are applicable. In particular, the reliability and efficiency of [29,  Theorem 5.1] (applied with \( \Gamma = {\left\{ 0 \right\} } \) in the notation there, i.e. for a deterministic equation) together with suitable inf-sup stable discretization spaces such as \( V_{{\mathcal {T}}}:= (\mathbb {Q}_2({\mathcal {T}}))^d \times \mathbb {Q}_1({\mathcal {T}}) \times \mathbb {Q}_1({\mathcal {T}}) \) satisfying (14) give, for all \( {{\varvec{y}}}\in U \), an AFEM algorithm based on hierarchical spatial refinement. Here, \( \mathbb {Q}_2({\mathcal {T}}) \) denotes the space of continuous piecewise biquadratic functions and \( \mathbb {Q}_1({\mathcal {T}}) \) the space of continuous piecewise bilinear functions on quadrilateral meshes \( {\mathcal {T}} \).
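
A quick way to check the sufficient conditions above in a concrete instance is sketched below (Python); the fluctuation norms \( \left\| e_j \right\| _{L^{\infty }(D)} \) and the constant \( K_0 \) are illustrative placeholders.

```python
import numpy as np

def elasticity_beta(e_sup_norms, K0, nu, e0_min):
    """Weights beta_j = K0 sqrt(1 + nu) / e_{0,min} * ||e_j||_{L^inf(D)}, cf. (46)."""
    return K0 * np.sqrt(1.0 + nu) / e0_min * np.asarray(e_sup_norms, dtype=float)

# illustrative fluctuation norms ||e_j||_inf = j^{-3}, j = 1..32, and nu close to 1/2
e_sup_norms = np.arange(1, 33, dtype=float) ** -3.0
beta = elasticity_beta(e_sup_norms, K0=1.0, nu=0.49, e0_min=1.0)

# sufficient conditions used above: ||beta||_{l^1} < 2 and beta in l^p for some p < 1/2
# (the latter holds here since beta_j decays like j^{-3})
assert beta.sum() < 2.0
```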

4 Numerical Experiments

We consider the model problem (3) on a polygon \( D\subseteq {\mathbb {R}}^2 \); for the solution of (15) we employ Lagrangian \( \mathbb {P}_1 \)-FEM on regular triangulations \( {\mathcal {T}} \in \mathbb {T}\) of D. \( {\text {AFEM}}\) is driven by the residual indicators from, e.g., [35,  Section 1.4] and by the Dörfler \( {\text {MARK}}\) strategy with marking parameter \( \theta \in (0,1) \), where larger \( \theta \) corresponds to more aggressive refinement. In all computations we select \(\theta =0.25\). The \( {\text {REFINE}}\) module is newest vertex bisection, as in the MATLAB implementation of [19], which guarantees uniform shape regularity of \( \mathbb {T}\). All experiments were run on a machine equipped with an Intel(R) Core(TM) i7-10510U CPU @ 1.80GHz (OctaCore) and MATLAB R2019a.

4.1 Convex Domain

As a first example we select the affine-parametric diffusion (3), (37) with \( \psi _0 \equiv 1 \) and, for \( j\ge 1 \),

$$\begin{aligned} \psi _j(x) := \frac{1}{(k_{j,1}^2 + k_{j,2}^2)^{\xi }}\sin (k_{j,1} \pi x_1 )\sin (k_{j,2} \pi x_2), \end{aligned}$$
(47)

where the pairs \( (k_{j,1},k_{j,2}) \in {\mathbb {N}}^2 \) are defined by an ordering of \( {\mathbb {N}}^2 \) such that \( k_{j,1}^2 + k_{j,2}^2 \le k_{j+1,1}^2 + k_{j+1,2}^2 \) for all \( j\in {\mathbb {N}}\), ties being broken arbitrarily. With this choice \( \beta _j \sim j^{-\xi } \), that is, \({\varvec{\beta }}\in \ell ^p({\mathbb {N}}) \) for \( p > \frac{1}{\xi } \). Let \( D = (0,1)^2 \), \( f(x) = e^{-|x|^2} \), \( s=32\), \(\xi =2.1 \), and consider the goal functional \( G(v) = 4\int _{(0,\frac{1}{2})^2}v \). In this case we expect the mesh to be approximately uniformly refined by \( {\text {AFEM}}\), starting from a structured mesh \( {\mathcal {T}}_{0} \) with 128 elements, since \( u(\cdot ,{{\varvec{y}}}) \in H^2(D) \) for all \( {{\varvec{y}}} \in U \), due to the convexity of the domain and the smoothness of the data. We also use the stopping criteria (30), (36) for the FEM error, exploiting the symmetry of the stiffness matrix (see Remark 3.3), thus avoiding excessive spatial refinement. We compare the algorithms in Fig. 1 for various tolerances \( {\varepsilon }_{F} = {\varepsilon }_{Q} \). For the convenience of the reader, we compute a reference value with \( |{\mathcal {T}}|\approx 10^5 \) quadratic (\( \mathbb {P}_2 \)) elements, obtained by uniform refinement of \( {\mathcal {T}}_{0} \), and \( |P_m| = 2^m \) samples with \( m = 8 \), obtaining \( I(G(u)) \approx 0.024411631814585 \). Since we observe that the cost of computing \( d_{{\varvec{\beta }}}({{\varvec{y}}}, {{\varvec{y}}}') := \sum _{j\ge 1} {\left| y'_j - y_j \right| } \beta _j \) is negligible, we formally set \( q = \infty \) in Algorithm 3. As predicted, all algorithms produce outputs well within the tolerance \( {\varepsilon }= 2{\varepsilon }_F \). We also observe that, for the finest tolerance (\( {\varepsilon }_{F}=10^{-5} \)), all three algorithms produce meshes with \( \approx 2\cdot 10^5 \) degrees of freedom and stop at \( m = 7 \), that is, 128 samples suffice to meet the tolerance. In terms of computing time, Algorithm 2 lags behind the other two algorithms, which offer comparable performance. The rates in Fig. 1 (right) are estimated excluding the coarsest tolerance (\( {\varepsilon }_{F} = 10^{-3} \)).
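
For reproducibility, a Python sketch of the ordering of the wavenumber pairs and of the resulting coefficients (47) is given below; the tie-breaking rule is ours, and any choice is admissible.

```python
import numpy as np

def wavenumber_pairs(s, kmax=64):
    """First s pairs (k_1, k_2) in N^2 ordered by increasing k_1^2 + k_2^2;
    ties are broken lexicographically (any tie-breaking is admissible)."""
    pairs = [(k1, k2) for k1 in range(1, kmax + 1) for k2 in range(1, kmax + 1)]
    pairs.sort(key=lambda k: (k[0] ** 2 + k[1] ** 2, k))
    return pairs[:s]

def psi_j(x, k, xi):
    """psi_j(x) = (k_1^2 + k_2^2)^{-xi} sin(k_1 pi x_1) sin(k_2 pi x_2), cf. (47)."""
    k1, k2 = k
    return ((k1 ** 2 + k2 ** 2) ** -xi
            * np.sin(k1 * np.pi * x[..., 0]) * np.sin(k2 * np.pi * x[..., 1]))

s, xi = 32, 2.1
pairs = wavenumber_pairs(s)
beta = np.array([(k1 ** 2 + k2 ** 2) ** -xi for k1, k2 in pairs])  # beta_j ~ j^{-xi}
```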

Fig. 1

Errors committed by the 3 different algorithms for varying tolerances (left). Runtimes (in seconds) averaged over 2 runs and estimated rate (right)

4.2 L-shape Domain

We again select the affine-parametric diffusion with the sine expansion (47). Let \( D = (-1,1)^2{\setminus }[0,1)\times (-1,0] \) and let \( f(x) = e^{-2|x+(1,0)|^2} \) be a source localized at \( (-1,0) \). We assume homogeneous Neumann boundary conditions on \( \Gamma _N = {\left\{ 1 \right\} }\times (0,1) \cup (0,1]\times {\left\{ 0 \right\} } \) and homogeneous Dirichlet conditions on \( \Gamma _D = \partial D {\setminus } \Gamma _N \). As goal functional we pick \( G(v) = \int _{D\cap B_{1/2}} v \), where \( B_{r} \) denotes the ball of radius \(r>0\) centered at the origin. Again we choose \( s=32\), \(\xi =2.1 \). We start from a uniform mesh \( {\mathcal {T}}_{0} \) with \( |{\mathcal {T}}_{0}| = 192 \). The evolution of the QMC and FEM error estimators produced by AQMC-FEM for \( {\varepsilon }_{F}={\varepsilon }_{Q}=5\cdot 10^{-6} \) is displayed in Fig. 2. Note that no QMC estimator is computed until the FEM tolerance is reached for \( m = 2 \). We measure the computational effort of each iteration of the algorithm AQMC-FEM (indexed by a pair \( (m,\ell ) \in {\mathbb {N}}^2 \) corresponding to the QMC and FEM refinement levels, respectively) by

$$\begin{aligned} W(m,\ell )= |P_m||{\mathcal {T}}_{\ell }|. \end{aligned}$$
(48)

This is proportional to the cost of iteration \( (m,\ell ) \) of the \( {\text {SOLVE}}\) module, assuming that s is fixed and that a FEM solver with cost linear in \( |{\mathcal {T}}_{\ell }| \) is available. We also show the mesh generated by AQMC-FEM for \( {\varepsilon }_{F}={\varepsilon }_{Q}=10^{-3} \); as expected, it is strongly graded near the source and towards the re-entrant corner of the domain, where a singularity of the solution occurs.
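
For illustration, a minimal Python sketch of the cost bookkeeping (48) follows; the visited \( (m,\ell ) \) pairs and the mesh sizes are made-up placeholders, while \( |{\mathcal {T}}_{0}|=192 \) and \( |P_m|=2^m \) are taken from the text above.

```python
# per-iteration work W(m, l) = |P_m| * |T_l|, cf. (48); the iteration history and the
# refined mesh sizes below are placeholders, not the actual AQMC-FEM output
history = [(2, 0), (2, 1), (3, 1), (3, 2), (4, 2)]    # visited (QMC level m, FEM level l)
n_points = {m: 2 ** m for m in range(9)}              # |P_m| = 2^m lattice points
n_elements = {0: 192, 1: 700, 2: 2500}                # |T_l|: placeholder mesh sizes

work = [n_points[m] * n_elements[l] for m, l in history]
cumulative_work = [sum(work[:i + 1]) for i in range(len(work))]
print(work, cumulative_work)
```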

Fig. 2

Mesh produced by AQMC-FEM for \( {\varepsilon }_{F}={\varepsilon }_{Q} = 10^{-3} \) (left) and decay of FEM and QMC estimators (asterisk and square, respectively) against \( W(m,\ell ) \), for \( {\varepsilon }_{F} = {\varepsilon }_{Q} = \text {TOL} := 5\cdot 10^{-6} \) (right)

5 Conclusions

We have presented a family of adaptive discretization methods that combine FEM error estimation in the spatial domain with deterministic Polynomial lattice rules in the parameter box \( U = [-\frac{1}{2},\frac{1}{2}]^s \). We recalled possible criteria to verify convergence of the AFEM iteration and to enable QMC a-posteriori estimation. The convergence of the parametric estimator is free of the curse of dimensionality, allowing for arbitrary \( s \in {\mathbb {N}}\) also in practical examples, under the assumption of quantified decay of the parametric derivatives (10) or (9). Moreover, we stress that the parametric error is estimated without resorting to the specific problem formulation. These are the main features that improve upon existing methods based on stochastic Galerkin or sparse grids [2, 14, 18, 29].

Thus, we expect our algorithms to be applicable to a wide range of problems, including, but not restricted to, those in the framework of Sect. 2, provided that a converging AFEM algorithm is available for the corresponding non-parametric equation. In particular, we mention parabolic equations (cp. the a posteriori indicators in [35,  Chapter 6]), certain non-linear PDEs meeting the criteria set out in [7], stationary Stokes (cp. [5,  Sections 6.2–6.3]) and Navier–Stokes (cp. [35,  Chapter 5]) equations on uncertain domains [9], and elliptic eigenvalue problems [25; 5,  Section 10.3].