3.1 Confidence and Trust in Model-Based Engineering Assisted by AI

Consider first data-based machine learning techniques. They rely on large sets of examples provided during the training stage and do not learn from equations. Dealing with a situation that does not belong to the training-set variability, namely an out-of-distribution sample, can be very challenging for these techniques. Trusting them would imply being able to guarantee that the training set covers the operational domain of the system to be trained. Besides, data-based AI can lack robustness: examples have been given of adversarial attacks in which a classifier was tricked into inferring a wrong class merely by changing a very small percentage of the pixels of the input image. These models often also lack explainability: it is hard to understand what exactly is learned and what phenomenon occurs through the layers of a neural network. In some cases, information on the background of a picture is used by the network to predict the class of an object, or biases present in the training data are learned by the AI model, like gender bias in recruitment processes. Addressing these limitations is the subject of the program Confiance.ai, which brings together French academic and industrial partners from the defense, transport, manufacturing, and energy sectors.

Model-based engineering enjoys better explainability (since the reference-model equations are known and used) and better robustness (when the reference model is well-posed). Concerning trust, in our projection-based reduced-order modeling context, the prediction is in general deterministic, and strict error bounds can be derived in particular cases. This is a major difference with AI-based models, which use notions like confidence intervals and predictive uncertainties, expressed as probabilistic results. In the remainder of this chapter, we present error bounds and indicators for projection-based reduced-order modeling, depending on the complexity of the underlying equations.

3.2 In Linear Elasticity and for Linear Problems

Parts of this section are inspired by the authors' previous works [7] and [9].

We suppose that the problem of interest has the following discrete variational form, depending on a parameter \(\mu \) in a parameter set \(\mathcal {P}\): for a finite-dimensional space \(\mathcal {V}\) of dimension N (with \(N\gg 1\) resulting, e.g., from discretization), find \(u_\mu \in \mathcal {V}\) such that

$$\begin{aligned} E_\mu :a_\mu (u_\mu ,v)=b(v),\qquad \forall v\in \mathcal {V}, \end{aligned}$$
(3.1)

where \(a_\mu \) is an inf-sup stable bounded sesquilinear form on \(\mathcal {V}\times \mathcal {V}\) and b is a continuous linear form on \(\mathcal {V}\). We define the Riesz isomorphism J from \(\mathcal {V}'\) to \(\mathcal {V}\) such that for all \(l\in \mathcal {V}'\) and all \(u\in \mathcal {V}\), \(\left( Jl,u\right) _{\mathcal {V}}=l(u)\), where \((\cdot ,\cdot )_{\mathcal {V}}\) denotes the inner product of \(\mathcal {V}\) with associated norm \(\Vert \cdot \Vert _{\mathcal {V}}\) and \(\mathcal {V}'\) the dual of \(\mathcal {V}\). We denote by \(\displaystyle {\beta _\mu }:=\underset{u\in \mathcal {V}}{\inf }\underset{v\in \mathcal {V}}{\sup }\frac{|a_{\mu }(u,v)|}{\Vert u\Vert _\mathcal {V}\Vert v\Vert _\mathcal {V}}>0\) the inf-sup constant of \(a_\mu \) and by \(\tilde{\beta }_\mu \) a computable positive lower bound of \({\beta _\mu }\). For simplicity, we consider that the linear form b is independent of the parameter \(\mu \); the extension to \(\mu \)-dependent b is straightforward.

Applying the Galerkin method on the linear problem (3.1), using a ROB \((\psi _i)_{1\le i\le n}\in \mathbb {R}^{n\times N}\) as done in Sect. 2.3.2 leads to finding \(\hat{u}_\mu \in \mathcal {V}_n\) such that

$$\begin{aligned} \hat{E}_\mu :a_\mu (\hat{u}_\mu ,\psi _j)=b(\psi _j), \qquad \forall j\in \{1,\ldots , n\}. \end{aligned}$$
(3.2)

The approximate solution on the ROB is written as (2.16):

$$\begin{aligned} \hat{u}_\mu =\sum _{i=1}^n\gamma _i^\mu \psi _i. \end{aligned}$$
(3.3)

We assume that the sesquilinear form \(a_\mu \) depends on \(\mu \) in an affine way, namely there exist d functions \(\alpha _k:\mathcal {P}\rightarrow \mathbb {C}\), whose values are denoted \(\alpha _k^\mu \), and d \(\mu \)-independent sesquilinear forms \(a_k\) bounded on \(\mathcal {V}\times \mathcal {V}\) such that

$$\begin{aligned} a_\mu (u,v)=\sum _{k=1}^d{\alpha _k^\mu a_k(u,v)}, \qquad \forall u,v \in \mathcal {V}, \end{aligned}$$
(3.4)

which enables the ROM to be online-efficient.
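To illustrate this online efficiency, here is a minimal sketch in Python with illustrative random data (the names A_terms, Psi, solve_rom are ours, not those of an actual library): the reduced operators are projected once offline, and the online assembly and solve cost is independent of N.

```python
import numpy as np

rng = np.random.default_rng(0)
N, n, d = 2000, 10, 3                       # HF dimension, ROB size, affine terms

# Illustrative data: d mu-independent operators A_k, a load vector b,
# and an orthonormal ROB Psi whose columns play the role of the psi_i.
A_terms = [rng.standard_normal((N, N)) for _ in range(d)]
b = rng.standard_normal(N)
Psi, _ = np.linalg.qr(rng.standard_normal((N, n)))

# Offline (cost depends on N): project each operator once onto the ROB.
A_hat = [Psi.T @ A @ Psi for A in A_terms]  # d reduced matrices, each n x n
b_hat = Psi.T @ b

# Online (cost depends only on d and n): assemble and solve for any new mu.
def solve_rom(alpha_mu):
    """alpha_mu: the d coefficients alpha_k(mu) of the affine decomposition (3.4)."""
    A_mu = sum(a * Ak for a, Ak in zip(alpha_mu, A_hat))
    return np.linalg.solve(A_mu, b_hat)     # gamma^mu, the coefficients in (3.3)

gamma_mu = solve_rom([1.0, 0.5, -0.2])
u_hat = Psi @ gamma_mu                      # reduced solution (3.3) on the HF space
```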

Under the current assumptions, the following error bound holds (see [16, Sect. 4.3.2]): for all \(\mu \in \mathcal {P}\),

$$\begin{aligned} \begin{aligned} \Vert u_\mu -\hat{u}_\mu \Vert _{\mathcal {V}}\le \mathcal {E}_1(\mu )&:={\tilde{\beta }_\mu }^{-1}\Vert G_\mu \hat{u}_\mu \Vert _{\mathcal {V}},\\ &={\tilde{\beta }_\mu }^{-1} \left\| G_{00}+\sum _{i=1}^{n}\sum _{k=1}^{d}{\alpha _k^\mu \gamma _i^\mu }G_k \psi _i\right\| _{\mathcal {V}}, \end{aligned} \end{aligned}$$
(3.5)

with \(G_\mu \) the linear map from \(\mathcal {V}\) to \(\mathcal {V}\) such that \(\mathcal {V}\ni u\mapsto G_\mu u:=J\left( a_\mu (u,\cdot )-b\right) \in \mathcal {V}\); \(G_\mu \) inherits the affine dependence of \(a_\mu \) on \(\mu \) since, for all \(u\in \mathcal {V}\),

$$\begin{aligned} G_\mu u=-Jb+\sum _{k=1}^d\alpha _k^\mu Ja_k(u,\cdot )=G_{00}+\sum _{k=1}^d\alpha _k^\mu G_k u, \end{aligned}$$
(3.6)

where \(G_{00}:=-Jb\in \mathcal {V}\) and \(G_k u:=Ja_k(u,\cdot )\in \mathcal {V}\) for all \(k\in \{1,\ldots ,d\}\).
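For illustration, \(\mathcal {E}_1(\mu )\) can be evaluated from high-fidelity quantities as follows (a minimal matrix-form sketch in a real-valued setting; the names and the function are ours): the Riesz representative of the residual is obtained by solving a linear system with the Gram matrix of the \(\mathcal {V}\)-inner product, which is why this formula is accurate but not online-efficient.

```python
import numpy as np

def error_bound_E1(beta_lb, A_mu, b, M, u_hat):
    """Accurate (but N-dependent, hence not online-efficient) evaluation of (3.5).

    beta_lb -- computable lower bound of the inf-sup constant beta_mu
    A_mu    -- (N, N) high-fidelity matrix of a_mu;  b -- (N,) load vector
    M       -- (N, N) Gram matrix of the V-inner product
    u_hat   -- (N,) reduced solution expanded on the HF space, Psi @ gamma
    """
    r = A_mu @ u_hat - b                  # vector of the residual form a_mu(u_hat, .) - b
    g = np.linalg.solve(M, r)             # Riesz representative: M g = r
    return np.sqrt(abs(g @ r)) / beta_lb  # ||G_mu u_hat||_V / beta_lb, since ||Jl||_V^2 = g^T r
```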

The error bound (3.5) can be rewritten equivalently as

$$\begin{aligned} \begin{aligned} \displaystyle {\mathcal {E}_2(\mu )} ={}& {\tilde{\beta }_\mu }^{-1}\left( {(G_{00},G_{00})_{\mathcal {V}}} \displaystyle +2\textrm{Re}\sum _{i=1}^{n}\sum _{k=1}^{d}{\gamma _i^\mu \alpha _k^\mu {(G_k \psi _i,G_{00})_{\mathcal {V}}}}\right. \\ \displaystyle &\left. +\sum _{i,j=1}^{n}\sum _{k,l=1}^{d}{\gamma _i^\mu \alpha _k^\mu {\gamma _j^*}^\mu {\alpha _l^*}^\mu {(G_k \psi _i,G_l \psi _j)_{\mathcal {V}}}}\right) ^{\frac{1}{2}}\\ &={\tilde{\beta }_\mu }^{-1}\left( \delta ^2 + 2\textrm{Re} (s^t \hat{x}_\mu ) + {\hat{x}_\mu }^{*t} S\hat{x}_\mu \right) ^{\frac{1}{2}}, \end{aligned} \end{aligned}$$
(3.7)

where \(\delta :=\Vert G_{00}\Vert _{\mathcal {V}}\), s and \(\hat{x}_\mu \) are vectors in \(\mathbb {C}^{dn}\) with components \(s_I:=(G_k \psi _i,G_{00})_{\mathcal {V}}\) and \((\hat{x}_{\mu })_I:= \alpha _k^\mu \gamma _i^\mu \), and S is a matrix in \(\mathbb {C}^{dn\times dn}\) with coefficients \(S_{I,J}:=(G_k \psi _i,G_l \psi _j)_{\mathcal {V}}\) (with I and J re-indexing respectively (k, i) and (l, j), for all \(1\le k,l\le d\) and all \(1\le i,j\le n\)). The t superscript denotes transposition. The vector s and the matrix S depend on the ROB \((\psi _i)_{1\le i\le n}\) but are independent of \(\mu \), hence can be precomputed, while the vector \(\hat{x}_\mu \) depends on the reduced basis approximation \(\hat{u}_\mu \) via the coefficients \(\gamma _i^\mu \). A lower bound \({\tilde{\beta }_\mu }\) of the stability constant of \(a_\mu \) can also be computed with a complexity independent of N (for example, by the Successive Constraint Method, see [10, 13]).

We would like to stress that \(\mathcal {E}_1(\mu )=\mathcal {E}_2(\mu )\) (in infinite-precision arithmetic): the indices 1 and 2 denote two different ways of computing the same quantity. Note, however, that \(\mathcal {E}_1(\mu )\) is not online-efficient, while \(\mathcal {E}_2(\mu )\) is.
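A minimal sketch of the online evaluation of \(\mathcal {E}_2(\mu )\), assuming \(\delta \), s, and S have been precomputed offline (names and interfaces are illustrative):

```python
import numpy as np

def error_bound_E2(beta_lb, delta, s, S, x_mu):
    """Online evaluation of the bound (3.7), at a cost independent of N.

    beta_lb -- computable lower bound of the inf-sup constant beta_mu
    delta   -- ||G_00||_V, precomputed offline
    s, S    -- precomputed vector (dn,) and matrix (dn, dn) of inner products
    x_mu    -- vector of components alpha_k^mu * gamma_i^mu, assembled online
    """
    val = delta**2 + 2.0 * np.real(s @ x_mu) + np.real(np.conj(x_mu) @ S @ x_mu)
    # In exact arithmetic val >= 0, but round-off can make it slightly
    # negative (the loss of significance discussed in Sect. 3.5).
    return np.sqrt(max(val, 0.0)) / beta_lb
```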

3.3 In Nonlinear Mechanics of Materials

The remainder of this section is inspired by the authors' work [8].

We look here for an efficient error indicator in the context of general nonlinearities and nonparametrized variabilities in nonlinear structural mechanics. The generality of the assumptions and the complexity of the model lead us to search for quantities correlated to the error made by the ROM with respect to the HFM, instead of rigorous error bounds as considered in the previous section.

The problem of interest is the same as described in Sect. 2.2, and we suppose that we have constructed a ROM following the methods described in Sects. 2.3 and 2.3.5, namely using POD for data compression and ECM for operator compression.

The quantity of interest is the accumulated plastic strain over the complete structure. Since this is a dual quantity, the ROM does not directly provide a prediction over the structure, but only at the reduced quadrature points selected by ECM, see Sect. 2.3.5. The Gappy-POD can be used to recover the dual quantity of interest over the rest of the structure, see [8, Algorithms 3 and 4] for a presentation in the present context, and [11] for the seminal paper on Gappy-POD.
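The reconstruction and its residual can be sketched as follows (a minimal NumPy illustration, not the implementation of [8]; the matrix of dual modes restricted to the reduced quadrature points plays the role of the matrix B formally defined before Proposition 3.1 below):

```python
import numpy as np

def gappy_pod_reconstruct(Psi_p, mask, p_hat):
    """Gappy-POD reconstruction of a dual field (minimal sketch).

    Psi_p -- (N_q, n_p) dual modes psi_i^p evaluated at all N_q quadrature points
    mask  -- indices of the m_p reduced quadrature points selected by ECM
    p_hat -- (m_p,) dual values computed online by the constitutive law solver
    """
    B = Psi_p[mask, :]                             # B_{k,i} = psi_i^p(x_hat_k)
    z, *_ = np.linalg.lstsq(B, p_hat, rcond=None)  # overdetermined least squares
    p_tilde = Psi_p @ z                            # reconstruction on the whole structure
    residual = np.linalg.norm(B @ z - p_hat)       # the term ||B z - p_hat||_2 of (3.10)
    return p_tilde, residual
```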

A quantification for the prediction relative error of the accumulated plastic strain is defined as

$$\begin{aligned} E^p_{\mu }:= \left\{ \begin{array}{ll} \frac{\Vert p_{\mu }-\tilde{p}_{\mu }\Vert _{L^2(\Omega )}}{\Vert p_{\mu }\Vert _{L^2(\Omega )}} &{} \text{ if } \Vert p_{\mu }\Vert _{L^2(\Omega )}\ne 0,\\ \frac{\Vert p_{\mu }-\tilde{p}_{\mu }\Vert _{L^2(\Omega )}}{\underset{\mu \in \mathcal {P}_\mathrm{off.}}{\max }\Vert p_{\mu }\Vert _{L^2(\Omega )}} &{} \text{ otherwise, }\\ \end{array}\right. \end{aligned}$$
(3.8)

where \(p_{\mu }\) and \(\tilde{p}_{\mu }\) are respectively the high-fidelity and reduced predictions for the accumulated plasticity field at the variability \(\mu \), and \(\mathcal {P}_\mathrm{off.}\) is the set of variabilities encountered during the offline stage. We underline the fact that \(\tilde{p}_{\mu }\) is the reduced prediction over the complete structure, hence after applying the Gappy-POD reconstruction.

Define the ROM-Gappy-POD residual as

$$\begin{aligned} \mathcal {E}^p_{\mu }:= \left\{ \begin{array}{ll} \frac{\Vert \tilde{\boldsymbol{p}}_{\mu }-\hat{\boldsymbol{p}}_{\mu }\Vert _2}{\Vert \hat{\boldsymbol{p}}_{\mu }\Vert _2} &{} \text{ if } \Vert \hat{\boldsymbol{p}}_{\mu }\Vert _2\ne 0,\\ \frac{\Vert \tilde{\boldsymbol{p}}_{\mu }-\hat{\boldsymbol{p}}_{\mu }\Vert _2}{\underset{\mu \in \mathcal {P}_\mathrm{off.}}{\max }\Vert \hat{\boldsymbol{p}}_{\mu }\Vert _2} &{} \text{ otherwise, }\\ \end{array}\right. \end{aligned}$$
(3.9)

where \(\tilde{\boldsymbol{p}}_{\mu }\) is the reduced prediction (after applying the Gappy-POD) taken at the reduced quadrature points (\(\tilde{p}_{\mu ,k}=\tilde{p}_{\mu }(\hat{x}_{k})\), \(1\le k\le m^p\)), \(\hat{\boldsymbol{p}}_{\mu }\) is the vector of the accumulated plastic strain as computed by the constitutive law solver at the reduced quadrature points during the online stage, and \(\Vert \cdot \Vert _2\) denotes the Euclidean norm. Notice that in the general case, \(\tilde{\boldsymbol{p}}_\mu \ne \hat{\boldsymbol{p}}_\mu \): this discrepancy underlies our proposed error indicator.

Notice that the relative error \(E^p_{\mu }\) involves fields and \(L^2\)-norms, whereas the ROM-Gappy-POD residual \(\mathcal {E}^p_{\mu }\) involves vectors of dual quantities at the reduced integration points and Euclidean norms. In (3.9), \(\Vert \tilde{\boldsymbol{p}}_{\mu }-\hat{\boldsymbol{p}}_{\mu }\Vert _2\) is the error between \(\hat{\boldsymbol{p}}_{\mu }\), the online evaluation of the accumulated plastic strain by the behavior law solver, and \(\tilde{\boldsymbol{p}}_{\mu }\), the reconstructed prediction at the reduced integration points \(\hat{x}_{k}\), \(1\le k\le m^p\). It is explained in [8, Sect. 4.1] that \(\Vert \tilde{\boldsymbol{p}}_{\mu }-\hat{\boldsymbol{p}}_{\mu }\Vert _2\) is also the residual of the least-squares optimization involved in the online stage of the Gappy-POD.

Let \(B\in \mathbb {R}^{m^p\times n^p}\) be such that \(B_{k,i}=\psi ^{p}_i(\hat{x}_{k})\), \(1\le k\le m^p\), \(1\le i\le n^p\), \(K:=\{p_\mu ,\mathrm{~for~all~possible~variabilities~}\mu \}\) and \(d(K,W)_{L^2(\Omega )}:=\underset{v\in K}{\sup }~\underset{w\in W}{\inf }~\Vert v-w\Vert _{L^2(\Omega )}\), with W a finite-dimensional subspace of \(L^2(\Omega )\). The following propositions and corollary are proven in [8, Sect. 4.1].

Proposition 3.1

There exist two positive constants \(C_1\) and \(C_2\) independent of \(\mu \) (but dependent on \(n^p\)) such that

$$\begin{aligned} \left\| p_{\mu }-\tilde{p}_\mu \right\| ^2_{L^2(\Omega )}\le C_1\Vert B\boldsymbol{z}_\mu -\hat{\boldsymbol{p}}_{\mu }\Vert _2^2 + C_1 \Vert \boldsymbol{p}_{\mu }-\hat{\boldsymbol{p}}_{\mu }\Vert _2^2 + C_2 d(K, \textrm{Span}\{\psi _i^p\}_{1\le i\le n^p})_{L^2(\Omega )}^2. \end{aligned}$$
(3.10)

Proposition 3.2

There exist two positive constants \(K_1\) and \(K_2\) independent of \(\mu \) such that

$$\begin{aligned} \Vert \tilde{\boldsymbol{p}}_{\mu }-\hat{\boldsymbol{p}}_{\mu }\Vert _2^2\le K_1 \left\| p_{\mu }-\tilde{p}_\mu \right\| ^2_{L^2(\Omega )} + K_2 \Vert \boldsymbol{p}_{\mu }-\hat{\boldsymbol{p}}_{\mu }\Vert _2^2 . \end{aligned}$$
(3.11)

Corollary 3.3.1

Suppose that the reduced solution is exact up to the considered time step at the online variability \(\mu \): \(p_{\mu }=\tilde{p}_\mu \) in \(L^2(\Omega )\). In particular, the behavior law solver has been evaluated with the exact strain tensor and state variables at the reduced integration points \(\hat{x}_{k}\), leading to \(\hat{p}_{\mu }(\hat{x}_{k})=p_{\mu }(\hat{x}_{k})\), \(1\le k\le m^p\). From Proposition 3.2, \(\Vert \tilde{\boldsymbol{p}}_{\mu }-\hat{\boldsymbol{p}}_{\mu }\Vert _2=0\), and \(\mathcal {E}_{\mu }^p=0\).

We observe that in practice, the evaluations of the ROM-Gappy-POD residual \(\mathcal {E}^p_{\mu }\) (3.9) and of the error \(E^p_{\mu }\) (3.8) are strongly correlated in our numerical simulations. The idea is to exploit this correlation by training a Gaussian process regressor for the function \(\mathcal {E}^p_{\mu }\mapsto {E}^p_{\mu }\). At the end of the offline stage, we propose to compute reduced predictions at variability values \(\{\mu _i\}_{1\le i\le N_c}\) encountered during the data generation step, and the corresponding pairs \(\left( E^p_{\mu _i}, \mathcal {E}^p_{\mu _i}\right) \), \(1\le i\le N_c\). A Gaussian process regressor is trained on these values, and we define an approximation function

$$\begin{aligned} \mathcal {E}^p_{\mu }\mapsto \textrm{Gpr}^p(\mathcal {E}^p_{\mu }), \end{aligned}$$
(3.12)

for the error \({E}^p_{\mu }\) at variability \(\mu \), defined as the mean plus 3 times the standard deviation of the predictive distribution at the query point \(\mathcal {E}^p_{\mu }\): this is our proposed error indicator. If the dispersion of the learning data is small for certain values of \(\mathcal {E}^p_{\mu }\), then adding 3 times the standard deviation will not change the prediction very much, whereas for values with large dispersion of the learning data, this correction aims to provide an error indicator larger than the error. We use the GaussianProcessRegressor python class from scikit-learn [17]. Notice that although some operations with computational complexity depending on N are carried out, we are still in the offline stage, and they are much faster than the resolution of the large systems of nonlinear equations (2.8). If the offline stage is correctly carried out and since \(\mathcal {E}^p_{\mu }\) is strongly correlated with the error, only small values of \(\mathcal {E}^p_{\mu }\) are expected to be computed. Hence, in order to train the Gaussian process regressor correctly for larger values of the error, the reduced Newton algorithm (2.13) is solved with a large tolerance \(\epsilon ^\textrm{ROM}_\textrm{Newton}=0.1\). We call these operations “calibration of the error indicator”; see Algorithm 3 for a description and Fig. 3.1 for a presentation of the workflow featuring this calibration step.
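A minimal sketch of this calibration, using the scikit-learn class mentioned above (the kernel choice and the synthetic calibration data are illustrative assumptions, not the settings of [8]):

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel

# Illustrative calibration data: N_c pairs (ROM-Gappy-POD residual, error),
# as collected at the end of the offline stage (synthetic values here).
rng = np.random.default_rng(0)
res = np.sort(rng.uniform(1e-4, 1e-1, 30))               # residuals (3.9)
err = 2.0 * res * (1.0 + 0.1 * rng.standard_normal(30))  # correlated errors (3.8)

gpr = GaussianProcessRegressor(kernel=RBF() + WhiteKernel(), normalize_y=True)
gpr.fit(res.reshape(-1, 1), err)

def error_indicator(res_mu):
    """Error indicator (3.12): mean plus 3 standard deviations of the
    predictive distribution at the query point."""
    mean, std = gpr.predict(np.array([[res_mu]]), return_std=True)
    return float(mean[0] + 3.0 * std[0])

print(error_indicator(5e-2))
```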

Fig. 3.1

Workflow for the offline stage with error indicator calibration [8]: the outputs of the data generation, data compression, and operator compression steps are the inputs of a short calibration algorithm, which computes reduced solutions at the offline variabilities, evaluates the ROM-Gappy-POD residual and error pairs, and trains the Gaussian process regression returning the error indicator function

We recall that in model order reduction, the original hypothesis is the existence of a low-dimensional vector space in which an acceptable approximation of the high-fidelity solution lies. This hypothesis is formalized as a rate of decrease of the Kolmogorov n-width with respect to the dimension of this vector space. The same hypothesis is made when using the Gappy-POD to reconstruct the dual quantities, which are expressed as a linear combination of constructed modes. For both the primal and dual quantities, the modes are computed by searching for some low-rank structure in the high-fidelity data. The coefficients of the linear combination reconstructing the primal quantities are given by the solution of the reduced Newton algorithm (2.13). After convergence, this residual is small even in cases where the reduced order model exhibits large errors with respect to the high-fidelity reference: the residual gives no information on the distance between the reduced solution and the high-fidelity finite element solution.

However, in the online phase of the ROM-Gappy-POD reconstruction (see [8, Algorithm 4]), the coefficients \(\hat{p}_{\mu ,k}\) (the accumulated plastic strain computed by the constitutive law solver during the online stage) contain information from the high-fidelity behavior law solver. Moreover, an overdetermined least-squares problem is solved, which can produce a nonzero residual that implicitly contains this information, namely the distance between the prediction from the behavior law and the vector space spanned by the Gappy-POD modes (restricted to the reduced integration points): this is the term \(\Vert B\boldsymbol{z}_\mu -\hat{\boldsymbol{p}}_{\mu }\Vert _2\) in (3.10). Hence, the ability of the online variability to be expressed on the Gappy-POD modes is monitored through the behavior law solver at the reduced integration points. When the ROM is solved for an online variability not included in the offline variabilities, the new physical solution cannot be correctly interpolated using the POD and Gappy-POD modes, and the ROM-Gappy-POD residual becomes large.

From Proposition 3.2, if \(\Vert B\boldsymbol{z}_\mu -\hat{\boldsymbol{p}}_{\mu }\Vert _2=\Vert \tilde{\boldsymbol{p}}_{\mu }-\hat{\boldsymbol{p}}_{\mu }\Vert _2\) is large, then the global error \(\left\| p_{\mu }-\tilde{p}_\mu \right\| _{L^2(\Omega )}\) and/or the error at the reduced integration points \(\hat{x}_k\) is large, which makes \(\Vert B\boldsymbol{z}_\mu -\hat{\boldsymbol{p}}_{\mu }\Vert _2\) a good candidate for an enrichment criterion for the ROM. A limitation of the error indicator can occur if the online variability activates strong nonlinearities in areas containing no point of the reduced integration scheme, namely through the term \(C_2 d(K, \textrm{Span}\{\psi _i^p\}_{1\le i\le n^p})_{L^2(\Omega )}^2\) in (3.10).

We recall that the error indicator (3.12) is a regression of the function \(\mathcal {E}^p_{\mu }\mapsto {E}^p_{\mu }\). In the online phase, we only need to evaluate \(\mathcal {E}^p_{\mu }\) and do not require any estimation for the other terms and constants appearing in Propositions 3.1 and 3.2.

Equipped with an efficient error indicator, we are now able to assess the quality of the approximation made by the reduced order model in the online phase. If the error indicator is too large, an enrichment step occurs: the high-fidelity model is used to compute a new high-fidelity snapshot, which is used to update the POD and Gappy-POD bases, as well as the reduced integration scheme. Notice that for the enrichment steps to be computed, the displacement field and all the state variables at the previous time step need to be reconstructed on the complete mesh \(\Omega \) to provide the high-fidelity solver with the correct material state. The workflow for the online stage with enrichment is presented in Fig. 3.2.
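In pseudocode form, the online stage with enrichment can be sketched as follows (all object interfaces are hypothetical placeholders for the blocks of Fig. 3.2, and the function error_indicator is the one sketched above):

```python
def online_stage(mu, time_steps, rom, hf_model, tol):
    """Online stage with enrichment (a sketch; all interfaces are hypothetical)."""
    for t in time_steps:
        rom.solve_step(mu, t)                  # reduced Newton solve (2.13)
        res = rom.gappy_residual()             # ROM-Gappy-POD residual (3.9)
        if error_indicator(res) > tol:         # indicator (3.12) too large
            # Reconstruct displacement and state variables of the previous
            # time step on the complete mesh Omega, so that the HF solver
            # restarts from the correct material state.
            state = rom.reconstruct_full_state()
            snapshot = hf_model.solve_step(mu, t, state)  # new HF snapshot
            rom.enrich(snapshot)               # update POD/Gappy-POD bases and ECM
            rom.solve_step(mu, t)              # resume with the enriched ROM
    return rom
```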

Fig. 3.2

Workflow for the online stage with enrichment [8]: the reduced solver computes a reduced solution at the online variability and the error indicator is evaluated; if it is too large, the fields are reconstructed, a high-fidelity solution is computed at the online variability and combined with the precomputed high-fidelity solutions at the offline variabilities to update the modes and the reduced integration scheme

We refer to [8] for more details on this subject, and detailed numerical applications in nonlinear structural mechanics for this error indicator and its ability to enrich a ROM in the online stage.

Notice that another (noncertified) indicator in nonlinear solid mechanics with internal variables has been proposed in [1], aiming to approximate the dual norm of residuals in the same fashion as in the linear case described in Sect. 3.2. For such a nonlinear case, rigorous error bounds are not obtained: a Gappy-POD-based approximate representation of the stress tensor is used, and the evaluation of the inf-sup constant is replaced by a normalization of the residual using the norm of the Riesz elements for the external loading.

3.4 In Computational Fluid Dynamics

In this section, we present a priori error estimates for the POD-Galerkin approximation applied to fluid dynamics equations in particular and to nonlinear parabolic PDEs in general. It is a theoretical result on the convergence of the POD-Galerkin reduced order model towards the high-fidelity equations semi-discretized with respect to the spatial variable. The solution of these semi-discretized equations is denoted by \(\widetilde{u}^{h}\) over a time interval [0, T], with \(h=\frac{1}{M}\), where M is the cardinality of a Hilbert basis capable of generating the high-fidelity semi-discretized solution at M specified time instants. This orthonormal basis can be obtained in an a posteriori fashion thanks to the snapshots POD method. It is denoted by \(\left( \psi _i\right) _{i=1,...,M}\).

The following convergence result also includes a discussion of the stability of the Galerkin projection technique applied to parabolic PDEs. This result has been developed and published in [2,3,4,5].

In the literature, many works address the problem of deriving convergent a priori upper bounds for POD-Galerkin reduced models of parabolic equations. It is a subject of great interest to quantify efficiently the error of a reduced solution \(\widehat{u}\) obtained with an approximation technique with respect to the corresponding high-fidelity semi-discretized solution \(\widetilde{u}^{h}\). This problem can be seen as a theoretical confidence interval around a training point of an approximation model. Let us denote by \(\Omega \) the open and bounded domain of the spatial variable, such that \(\Omega \subset \mathbb {R}^{d}\), where \(d=1\) or 2. Let us consider parabolic PDEs whose weak formulation, in the space of the solution \(u^{h}\) spanned by the POD basis of cardinality M, involves the following data:

  • a trilinear form b defined over \([H^{1}(\Omega )]^d\times [H^{1}(\Omega )]^d\times [H^{1}(\Omega )]^d\),

  • a dissipative term defined as a bilinear and coercive form a over \([H^{1}(\Omega )]^d\times [H^{1}(\Omega )]^d\),

  • a linear form \(F_t\) defined over \([L^{2}(\Omega )]^d\),

  • \(\beta \), the coercivity constant of the bilinear form a,

  • \(C_a\) and \(C_b\), respectively the norms of a and b in the space \([L^{2}(\Omega )]^{d}\),

  • \(K=\left\| u^{h}\right\| _{L^{\infty }(0,T;[L^{2}(\Omega )]^{d})}\).

The following result is proved:

Theorem 3.1

If \(n\ll M\) is the dimension of the truncated POD-Galerkin reduced solution \(\widehat{u}\) that represents the training point (solution) \(u^{h}\), then we can derive the following a priori upper bound on the \([L^{2}(\Omega )]^{d}\)-error between \(\widehat{u}(t)\) and \(\widetilde{u}^h(t)\), for all \(t\in [0,T]\):

$$\begin{aligned} \left\| (\widetilde{u}^{h}-\widehat{u})(t)\right\| ^{2}_{[L^{2}(\Omega )]^{d}}\le f_1(n), \end{aligned}$$
(3.13)

where \(f_1(n)\) is the remainder of the sum of a convergent series; \(f_1(n)\) converges to 0 when n converges to M. More precisely, \(f_1(n)\) is a function of the remainder of the sum of the POD eigenvalues \(\left( \lambda _k\right) _{k=1,...,M}\) obtained from the snapshots POD applied to M temporal snapshots of \(u^h\). It is expressed in two different fashions: either

$$\begin{aligned} f_1(n) = \left( 1+2C_bK+\frac{C^2_{a}}{\epsilon }\right) T\displaystyle \sum ^{M}_{k=n+1}\lambda _k, \end{aligned}$$
(3.14)

if \(\left( \epsilon -2\beta +6C_bK\right) \le 0\), for a strictly positive real number \(\epsilon \), or

$$\begin{aligned} f_1(n)=\left( T+\left( 2C_bK+\frac{C^2_{a}}{\epsilon }\right) T\exp (T[2C_aK+K^2(1+C_a^{2})])\right) \displaystyle \sum ^{M}_{k=n+1}\lambda _k, \end{aligned}$$
(3.15)

if \(\left( \epsilon -2\beta +6C_bK\right) > 0\), for all strictly positive real numbers \(\epsilon .\)

The mathematical proof of Theorem 3.1 is based on the properties of the forms a and b, the application of the Young inequality and the Gronwall lemma. For the details of the proof, refer to [4].
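For illustration, the bounds (3.14) and (3.15) can be evaluated in a few lines from the POD eigenvalues, assuming the constants T, \(C_a\), \(C_b\), K, \(\beta \), and \(\epsilon \) are available (a minimal sketch; the function name is ours):

```python
import numpy as np

def f1(lams, n, T, C_a, C_b, K, beta, eps):
    """Evaluate the bound f_1(n) of Theorem 3.1 from the POD eigenvalues lams."""
    tail = float(np.sum(lams[n:]))            # remainder sum_{k=n+1}^M lambda_k
    if eps - 2.0 * beta + 6.0 * C_b * K <= 0.0:
        return (1.0 + 2.0 * C_b * K + C_a**2 / eps) * T * tail       # case (3.14)
    growth = np.exp(T * (2.0 * C_a * K + K**2 * (1.0 + C_a**2)))
    return (T + (2.0 * C_b * K + C_a**2 / eps) * T * growth) * tail  # case (3.15)
```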

Remark 3.1

The result of Theorem 3.1 is applicable in particular for the 1D Burgers equation and the 2D unsteady and incompressible Navier-Stokes equations, by remarking the following:

  • When \(d=1,\) \(H^1(\Omega )\subset L^\infty (\Omega )\): there exists \(C^1_\infty \in \mathbb {R}^{+*}\) such that \(\forall \) \(v\in H^1(\Omega ),\) \(\left\| v\right\| _{L^\infty (\Omega )}\le C^1_\infty \left\| \nabla v\right\| _{L^2(\Omega )}\).

  • When \(d=2\) or \(d=3\), \(H^1(\Omega )\subset L^4(\Omega )\): there exists \(C^1_4\in \mathbb {R}^{+*}\) such that \(\forall \) \(v\in H^1(\Omega ),\) \(\left\| v\right\| _{L^4(\Omega )}\le C^1_4\left\| \nabla v\right\| _{L^2(\Omega )}\).

  • When \(d=2\): there exists \(C\in \mathbb {R}^{+*}\) such that:

    $$\begin{aligned}\left\| v\right\| _{L^4(\Omega )}\le C\left\| v\right\| ^{1/2}_{L^2(\Omega )}\left\| \nabla v\right\| ^{1/2}_{L^2(\Omega )}.\end{aligned}$$
  • If we denote by S the square matrix of dimension M defined by \(S_{i,j}=\left\langle \psi _i,\psi _j\right\rangle _{[H^1(\Omega )]^{d}}\), then for all v in \(\mathcal {V}^{h}\), the solution space spanned by the complete POD basis \(\psi \), we have the following inequality: \(\left\| v\right\| _{[H^1(\Omega )]^{d}}\le \sqrt{\left\| S\right\| }\left\| v\right\| _{[L^2(\Omega )]^{d}}\), where \(\left\| S\right\| \) is the spectral norm of the matrix S. For details, refer to [14].

Remark 3.2

A result of stability with respect to time of the POD-Galerkin reduced model for nonlinear and dissipative parabolic PDEs can be derived from Theorem 3.1. If \(\mu \) denotes the viscosity constant in this particular case (without any loss of generality), then the POD-Galerkin reduced model is stable with respect to time when \(\mu \) satisfies the following inequality:

$$\begin{aligned}\mu \ge \frac{6C_{b}K+\epsilon }{2\beta },\end{aligned}$$

where \(\epsilon \) is a strictly positive real number.

Remark 3.3

More particularly, in the case of linear and dissipative parabolic PDEs, the stability condition of the POD-Galerkin reduced model becomes equivalent to \(\mu \ge \frac{\epsilon }{2c_p}.\) The error of the POD-Galerkin reduced model with respect to the high-fidelity training point can then be estimated exactly as in the Céa lemma for elliptic PDEs with sesquilinear forms: if \(\mu \ge 1\), then

$$\begin{aligned}\left\| (u^{h}-\widehat{u})(t)\right\| ^{2}_{[L^{2}(\Omega )]^{d}}\le \left( 1+\frac{C^2_a}{2\beta }\right) T\displaystyle \sum ^M_{k=n+1}\lambda _k.\end{aligned}$$

Based on the same methodology, we propose in what follows an a priori error estimate and a convergence result for the POD-Galerkin reduced model with respect to the parameters. In other words, we show that when the parameters change with respect to the training ones, a confidence interval is obtained around the new test solution. The width of this confidence interval converges to the truncation error of the ROM at the training parameters. This parametric convergence result is formulated as follows:

Theorem 3.2

Let us denote by \(u^h_{\mu _0}\) a training solution associated with a training parameter \(\mu _0\) (we thus suppose, without any loss of generality, that we have only one training point for the POD-Galerkin reduced model). If \(\psi ^{\mu _0}\) denotes the Hilbert basis obtained from the POD applied to the high-fidelity training solution and \(\widehat{u}_{\mu ,\mu _0}\) denotes the truncated POD-Galerkin reduced solution that approximates the test point (solution) \(u^{h}_{\mu }\) in the reduced POD space of dimension n spanned by \(\psi ^{\mu _0}\), then:

$$\begin{aligned} \left\| \left( u^h_{\mu }-\widehat{u}_{\mu ,\mu _0}\right) (t)\right\| ^{2}\le f^{\mu _0}_1(n)\left( 1+ \left\| \mu -\mu _0\right\| ^{2\alpha }\right) +f^{\mu _0}_2(n)\left\| \mu -\mu _0\right\| ^{\alpha }, \end{aligned}$$
(3.16)

where \(f^{\mu _0}_1(n)\) is the function of Theorem 3.1 associated with the training parameter \(\mu _0\), and \(f^{\mu _0}_2(n)\) is the remainder of the sum of a convergent series; \(f^{\mu _0}_2(n)\) converges to 0 when n converges to M. More precisely, \(f^{\mu _0}_2(n)\) is a function of the remainder of the sum of the orthogonal projection coefficients of the characteristic function \(1_{\Omega _{x}}\), where \(\Omega _{x}\) tends to \(\Omega \) when x tends to \(\partial \Omega \). It is expressed as follows:

$$\begin{aligned} {}{ f^{\mu _0}_2(n)= \left( \left\| (u^h_{\mu }-u^{h}_{\mu _0})(0)\right\| ^{-4}+\frac{B}{A}\left( \exp (-2At)-1\right) \right) ^{-1/2} C^2_a\left\| \nabla u^{h}_{\mu _0}\right\| ^2\displaystyle \sum ^{M}_{k=n+1}\langle 1_{\Omega _{x}},\psi ^{\mu _0}_k \rangle ^2, } \end{aligned}$$
(3.17)

where A and B are strictly positive constants of which the detailed expressions are given in [2].

The proof of the above theorem is published in [2]. It is based on the properties of the two forms a and b, the application of the Young inequality, and the resolution of a nonlinear ordinary differential inequality of Riccati type.

3.5 A Note on Accuracy of a Posteriori Error Bounds and Round-Off Errors

In this section, we explain why the online-efficient error bound (3.7) may be sensitive to round-off errors.

In computers, real numbers are represented with a finite number of bits, using the floating-point representation. Current architectures are optimized for the IEEE 754 double-precision binary floating-point format. Let x and y be real numbers. When computing the operation \(x+y\), the result returned by the computer can differ from its theoretical value. Whenever the difference is substantial, a loss of significance occurs. A well-known case of loss of significance is when x and y are almost opposite numbers. Suppose that \(x=-y\). We denote by \(\textrm{maxfl}(x+y)\) the result that the computer returns when the maximal accumulation of round-off errors occurs in the summation. There holds

$$\begin{aligned} |\textrm{maxfl}(x+y)|\approx 2\epsilon |x|, \end{aligned}$$
(3.18)

where \(\epsilon \) is called the machine precision. In double precision, \(\epsilon =5\times 10^{-16}\) (see [12, Sect. 1.2]).

When implementing an algorithm, one should ensure that each step is free of such losses of significance. In some cases, simply changing the order of the operations can prevent these situations. As an illustration, consider \(x=1\), \(y=1+10^{-7}\), and the operation \(x^2-2xy+y^2\). This is a sum of terms whose first intermediate result is 14 orders of magnitude larger than the final result; therefore, a loss of significance is expected. The relative error of this computation is about \(8\times 10^{-4}\). Computing \((x-y)^2\), which is the factorization of the considered operation, leads to a relative error of about \(10^{-9}\): the terms of the sum are now only 7 orders of magnitude larger than the result, leading to a less catastrophic loss of significance. In this specific case, the remedy consists in carrying out the subtraction before the multiplication.
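This behavior can be checked in a few lines of Python, using exact rational arithmetic as a reference (a minimal sketch; the observed relative errors may vary slightly with the rounding of the inputs):

```python
from fractions import Fraction

x, y = 1.0, 1.0 + 1e-7
exact = (Fraction(x) - Fraction(y)) ** 2       # exact rational reference

developed = x * x - 2.0 * x * y + y * y        # developed form: cancellation
factored = (x - y) ** 2                        # factored form: accurate

rel_err = lambda v: float(abs(Fraction(v) - exact) / exact)
print(f"developed form: {rel_err(developed):.1e}")  # roughly 1e-3
print(f"factored form:  {rel_err(factored):.1e}")   # roughly 1e-9
```

In our projection-based ROM context, the evaluation of the formula \(\mathcal {E}_2\) suffers from such a loss of significance, as we now explain.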

We investigate the influence of round-off errors when computing the error bounds (3.5) and (3.7), respectively \(\mathcal {E}_1(\mu )\) and \(\mathcal {E}_2(\mu )\). As observed in the previous paragraph, the computation of a polynomial using a factorized form is more accurate than using the developed form, in particular at points close to its roots. Here, \(\left( {{\tilde{\beta }_\mu }}\mathcal {E}_2(\mu )\right) ^2\) is a multivariate polynomial of degree 2 in \({\hat{x}_\mu }\) computed in developed form, whereas the scalar product \((G_\mu \hat{u}_\mu ,G_\mu \hat{u}_\mu )_{\mathcal {V}}\) used in the computation of \(\mathcal {E}_1(\mu )\) is not developed. The following holds (see [9, Proposition 2.2.1] for the proof):

Proposition 3.3

Let \(\mu \in \mathcal {P}\) and let \(\textrm{maxfl}({\tilde{\beta }_{\mu }} \mathcal {E}_k(\mu ))\), \(k=1,2\), denote the evaluation of \({\tilde{\beta }_\mu } \mathcal {E}_k(\mu )\) when the maximum accumulation of round-off errors occurs. There holds

$$\begin{aligned} \begin{aligned} &\textrm{maxfl}({\tilde{\beta }_{\mu }} \mathcal {E}_1(\mu ))\ge 2\delta \epsilon ,\\ &\textrm{maxfl}({\tilde{\beta }_{\mu }}\mathcal {E}_2(\mu ))\ge 2\delta \sqrt{\epsilon }, \end{aligned} \end{aligned}$$
(3.19)

where \(\delta =\Vert G_{00}\Vert _{\mathcal {V}}\) and \(\epsilon \) is the machine precision.

From this proposition, we notice that the online-efficient formula \(\mathcal {E}_2(\mu )\) suffers from a severe loss of significance.

We present below an error estimator proposed in [9] that enjoys both accuracy and online-efficiency. Let \(\sigma :=1+2dn+(dn)^2\). For a given \(\mu \in \mathcal {P}_\textrm{trial}\) and the resulting \(\hat{u}_\mu \in \text {Span}\{\psi _1, ..., \psi _n\}\) solving the reduced problem (3.2), we define \(\hat{X}(\mu )\in \mathbb {C}^{\sigma }\) as the vector with components \((1,(\hat{x}_{\mu })_I,(\hat{x}_{\mu })_I^*,(\hat{x}_{\mu })_I^* (\hat{x}_{\mu })_J)\), where \((\hat{x}_{\mu })_I=\alpha _k^\mu \gamma _i^\mu \) (we recall that \(\gamma _i^\mu \) are the coefficients of the reduced solution in the reduced basis, see (3.3), and \(\alpha _k^\mu \) the coefficients of the affine decomposition of \(a_\mu \) in (3.4)), with \(1\le I,J \le dn\) (with \(I=i+n(k-1)\) such that \(1\le i\le n\), \(1\le k\le d\), and \(J=j+n(l-1)\) such that \(1\le j\le n\), \(1\le l\le d\)). We can write the right-hand side of (3.7) as a linear form in \(\hat{X}(\mu )\) as follows:

$$\begin{aligned} \displaystyle \delta ^2 + 2\textrm{Re}(s^t \hat{x}_\mu ) + {\hat{x}_\mu }^{*t} S\hat{x}_\mu = \sum _{p=1}^{\sigma }{t_p \hat{X}_p(\mu )}, \end{aligned}$$
(3.20)

where \(t_p\) is independent of \(\mu \) (as \(\delta \), s, and S are independent of \(\mu \)) and \(\hat{X}_p(\mu )\) is the p-th component of \(\hat{X}(\mu )\).

Consider the function of two variables \((p,\mu )\mapsto \hat{X}_p(\mu )\), for all \(p\in \{1,...,\sigma \}\) and all \(\mu \in \mathcal {P}\). We look for an approximation of this function in the form

$$\begin{aligned} \forall \mu \in \mathcal {P},\forall p\in \{1, ..., \sigma \},~\hat{X}_p(\mu ) \approx \sum _{r=1}^{\hat{\sigma }}{\lambda _r^{\hat{\sigma }}(\mu ) \hat{X}_p(\mu _r)}, \end{aligned}$$
(3.21)

for a certain parameter \(\hat{\sigma }\le \sigma \). The Empirical Interpolation Method (EIM) provides a numerical procedure to construct this approximation and to choose the interpolation points (see [6, 15]), which leads to the following formula for computing the error bound

$$\begin{aligned} \mathcal {E}_3(\mu ):={\tilde{\beta }_{\mu }}^{-1}\left( \sum _{r=1}^{\hat{\sigma }}{\lambda ^{\hat{\sigma }}_r(\mu ) V_r}\right) ^{\frac{1}{2}}, \end{aligned}$$
(3.22)

where \(V_r=\left\| G_{\mu _r}\hat{u}_{\mu _r}\right\| _{\mathcal {V}}^2\), and where \(\lambda ^{\hat{\sigma }}(\mu )\) and \(\mu _r\) are provided by EIM, see [9, Sect. 3.2] for all the details of this derivation. There holds (see [9, Proposition 3.2.1]):

Proposition 3.4

The computation of the formula \(\mathcal {E}_3\) is well defined, and this formula is online-efficient.

Besides, since \(\mathcal {E}_3\) involves a linear combination of accurately computed scalar products (see (3.5)), it is not subject to the loss of significance encountered in \(\mathcal {E}_2\).
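A minimal sketch of the online evaluation of \(\mathcal {E}_3(\mu )\), assuming the EIM coefficients \(\lambda ^{\hat{\sigma }}(\mu )\) and the precomputed values \(V_r\) are available (names are illustrative):

```python
import numpy as np

def error_bound_E3(beta_lb, lam_mu, V):
    """Online evaluation of (3.22); cost depends on sigma_hat only.

    beta_lb -- lower bound of the inf-sup constant
    lam_mu  -- (sigma_hat,) EIM coefficients lambda_r(mu) of (3.21)
    V       -- (sigma_hat,) precomputed values V_r = ||G_{mu_r} u_hat_{mu_r}||_V^2
    """
    # Each V_r is an accurately computed squared norm (no developed form),
    # so this linear combination avoids the loss of significance of E_2.
    return np.sqrt(max(float(np.real(lam_mu @ V)), 0.0)) / beta_lb
```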

We refer to [9] for more details on the notion of validity of a formula to compute an error bound, additional variants for accurate and efficient error bounds (including one featuring a stabilized EIM), as well as numerical illustrations for a one-dimensional linear diffusion problem and a three-dimensional acoustic scattering problem. The error bound \(\mathcal {E}_3\) can be of particular interest in the following situations: (i) when the stability constant of the original problem is very small (this is the case in many practical problems), (ii) when very accurate solutions are needed, (iii) when considering a nonlinear problem (for which, in some cases, no error bound is possible until a very tight tolerance is reached, see [18]).