1 Introduction

The problem of estimating model parameters of static and dynamical systems is encountered in many applications, from earth sciences to engineering. In this work we focus on the parameter estimation of dynamical systems described by parameterized parabolic partial differential equations (pPDEs). Here, we assume that only limited knowledge of the solution is available, in the form of noisy local measurements taken at multiple time instances.

For solving this kind of inverse problem, numerous deterministic and stochastic methods have been proposed. Among them, a widely used technique is the so-called ensemble Kalman filter (EnKF) (Evensen 2003), a recursive filter employing a series of measurements to obtain improved estimates of the variables involved in the process. The idea of using the EnKF for reconstructing the parameters of dynamical systems traces back to Anderson (2001) and Lorentzen et al. (2001), in which a trivial artificial dynamics for the parameters was assumed to make the estimation possible. This was naturally accompanied by efforts to improve the performance of the method in terms of stability, by introducing covariance inflation (Hamill et al. 2001; Anderson and Anderson 1999) and localization (Hamill et al. 2001; Houtekamer and Mitchell 2001), and in terms of computational cost. Relevant to the latter have been the development of multi-level (Hoel et al. 2016) and multi-fidelity (Popov et al. 2021; Donoghue and Yano 2022) methods, the use of model order reduction (MOR) techniques with offline (Pagani et al. 2017; da Silva and Colonius 2018) and with on-the-fly (Donoghue and Yano 2022) training, as well as the introduction of further surrogate modeling techniques (Popov and Sandu 2022). The use of approximated models inevitably led to the study of the impact of model error on the EnKF (Mitchell et al. 2002; Mitchell and Carrassi 2015) and on other data assimilation methods (Calvetti et al. 2018; Huttunen and Kaipio 2007).

Although ensemble Kalman methods were originally meant for sequential data assimilation, i.e., for real-time applications, they have also proven reliable for asynchronous data assimilation (Sakov et al. 2010). The first paper proposing to adapt the EnKF to retrospective data analysis was Skjervheim et al. (2007). In this setting, the data are assimilated all at once at the end of an assimilation window, a feature shared with a number of methods, e.g., variational methods (Li and Navon 2001) such as 4D-VAR (Thepaut and Courtier 1991) and other smoothers (Anderson and Moore 1979). Compared to those approaches, the EnKF is particularly appealing since it does not require the computation of Fréchet derivatives, a major complication for data assimilation algorithms.

Iglesias et al. (2013) introduced what they called the ensemble Kalman method (EnKM), an EnKF-based asynchronous data assimilation algorithm. Depending on the design of the algorithm, this method has connections to Bayesian data assimilation (Schillings and Stuart 2018) and to maximum likelihood estimation (Chen and Oliver 2012). In particular, in the latter case, the method constitutes an ensemble-based implementation of so-called iterative regularization methods (Kaltenbacher et al. 2008). In the case of perfect models, the EnKM has already been analyzed in depth in Schillings and Stuart (2018) and Evensen (2018), and convergence and identifiability enhancements have been proposed in Wu et al. (2019) and Iglesias (2016). Due to the iterative nature of the EnKM, dealing with high-dimensional parametric problems is often computationally challenging. In Gao and Wang (2021), a multi-level strategy was proposed to improve the computational performance of the method.

In this work we propose an algorithm, called reduced basis ensemble Kalman method (RB-EnKM), that leverages the computational efficiency of surrogate models obtained with MOR techniques to solve asynchronous data assimilation problems via ensemble Kalman methods. The use of the EnKM allows us to avoid adjoint problems that are often difficult to reduce and intrinsically depend on the choice of measurement positions. Model order reduction, already employed in other data assimilation problems (Gong et al. 2019; Nadal et al. 2015), is used as a key tool for accelerating the method. However, the use of approximate models within the EnKM introduces a model error that could hinder the convergence of the method. In this work, we propose to deal with this error by including a prior estimation of the bias in the data. Specifically, we incorporate empirical estimates of the mean and covariance of the bias in the Kalman gain. In some instances, those quantities can be computed at a negligible cost by employing the same training set used for the construction of the reduced model.

The paper is structured as follows: in Sect. 2 we introduce the asynchronous data assimilation problem together with the standard ensemble Kalman method (Algorithm 1). Subsequently, in Sect. 3, we present an overview of reduced basis (RB) methods (Sect. 3.1) and describe how to use them in combination with the ensemble Kalman method to derive the RB-EnKM (Sect. 3.2, Algorithm 2). In Sect. 4, we test the new method on two numerical examples. In the first example, we estimate the diffusivity in a linear advection-dispersion problem in 2D (Sect. 4.1), while in the second, we estimate the hydraulic log-conductivity in a non-linear hydrological problem (Sect. 4.2). In both cases, we compare the behavior of the full order and reduced order models in different conditions. Section 5 provides conclusions and considerations on the proposed method and on its numerical performance.

2 Problem formulation

Let \((\mathcal {U},\mathcal {H})\) be a suitable pair of function spaces and let \(\mathcal {P}\subset {\mathbb {R}^{N_p}}\), with \({N_p}\in {\mathbb {N}^+}\), be a set of model parameters. We consider the pPDE: for any parameter \(\varvec{\mu }\in \mathcal {P}\), find \(u(\varvec{\mu }) \in \mathcal {U}\) such that \(\partial _t u(\varvec{\mu }) = \mathcal {F}_{\varvec{\mu }} u(\varvec{\mu })\), \(u(0,\varvec{\mu })=u_0(\varvec{\mu })\). Here \(\mathcal {F}_{\varvec{\mu }}\) is a generic parameterized differential operator, \(\partial _t\) is the first order partial time derivative and \(u_0(\varvec{\mu }) \in \mathcal {H}\) is a parameterized initial condition. This pPDE provides the constraint to the inverse problem of estimating the unknown parameter \({\varvec{\mu }^\star }\in \mathcal {P}\) from data or observations given by

$$\begin{aligned} \begin{aligned}&{\textbf{y}({\varvec{\mu }^\star }, \varvec{\eta }) = \mathcal {L}u({\varvec{\mu }^\star }) + \varvec{\eta }\quad s.t.}\\&{\partial _t u({\varvec{\mu }^\star }) = \mathcal {F}_{{\varvec{\mu }^\star }} u({\varvec{\mu }^\star }), \,u(0,{\varvec{\mu }^\star })=u_0({\varvec{\mu }^\star }). } \end{aligned} \end{aligned}$$
(1)

Here, \(\mathcal {L}: \mathcal {U}\rightarrow {\mathbb {R}^{N_m}}\), with \({N_m}\in {\mathbb {N}^+}\), maps the space of the solutions to the space of the measurements, simulating the observation process, and \(\varvec{\eta }\) is an unknown realization of a Gaussian random variable with zero mean and given covariance, \(\varvec{\Sigma }\in {\mathbb {R}^{{N_m}\times {N_m}}}\). Note that both the observed data \(\textbf{y}\) and the additive experimental noise \(\varvec{\eta }\) are \({N_m}\)-dimensional vector-valued quantities and that \(\varvec{\Sigma }\) is a symmetric positive-definite matrix defining the weighted norm \({\Vert \cdot \Vert _{{\varvec{\Sigma }}^{-1}}} :={\Vert \varvec{\Sigma }^{-1/2} \cdot \Vert _2}\) on \(\mathbb {R}^{N_m}\), where \({\Vert \cdot \Vert _2}\) is the Euclidean norm.

To solve this inverse problem, we must explicitly solve the pPDE (1). This is done using a suitable discretization, in space and time, of the differential operators \(\mathcal {F}_{\varvec{\mu }}\) and \(\partial _t\). To this end, we introduce the approximation spaces \( \mathcal {V}_h \subset \mathcal {U}\) and \( \mathcal {H}_h \subset \mathcal {H}\) and the discretized initial condition \(u_{h,0}(\varvec{\mu })\in \mathcal {H}_h\) so that the approximate problem reads:

$$\begin{aligned} {\text {find} \,\, u_h(\varvec{\mu }) \in \mathcal {V}_h \quad \text {s.t.} \quad \partial _t u_h({\varvec{\mu }}) = \mathcal {F}^h_{\varvec{\mu }}u_h({\varvec{\mu }}), \, u_h({0, \varvec{\mu }})=u_{h,0}(\varvec{\mu }).} \end{aligned}$$
(2)

The discretization of the pPDE can be chosen according to the specific problem of interest. In all numerical examples proposed in this work, we employ a space-time Petrov–Galerkin discretization of (1) with piecewise polynomial trial and test spaces, as described in Sect. 4, and we assume (2) to be sufficiently accurate such that we can take \(\textbf{y}({\varvec{\mu }^\star }, \varvec{\eta }) ={\mathcal {L}u_h({{\varvec{\mu }^\star }})} + \varvec{\eta }\).

To characterize the observation of the solution, we introduce the forward response map \(\mathcal {G}: \mathcal {P}\rightarrow {\mathbb {R}^{N_m}}\) defined as \({\mathcal {G}(\varvec{\mu })} :={\mathcal {L}u_h({\varvec{\mu }})} \) for any solution of the pPDE (2). Although the use of the map \(\mathcal {G}\) results in a more compact notation, omitting its dependence on the solution of the pPDE conceals a key aspect of the method, i.e., the mapping from the parameter vector to the corresponding space-time pPDE solution. For this reason, and because it makes it harder to introduce the problem discretization, it will be used with caution.

2.1 The ensemble Kalman method

The data assimilation problem presented above can be recast as a minimization problem for the cost functional, \({\Phi (\varvec{\mu } \,\vert \, \textbf{y}) } :={\Vert \textbf{y}({\varvec{\mu }^\star }, \varvec{\eta }) - \mathcal {L} u_h({\varvec{\mu }}) \Vert ^2_{{\varvec{\Sigma }}^{-1}}}\), representing the misfit between the experimental data, \(\textbf{y}({\varvec{\mu }^\star }, \varvec{\eta })\), and the forward response. The optimal parameter estimate \({\varvec{\mu }_{\text {opt}}}(\textbf{y})\) is thus given by

$$\begin{aligned} \begin{aligned} {\varvec{\mu }_{\text {opt}}}(\textbf{y})&= {\text {arg}\min _{\varvec{\mu }\in \mathcal {P}}\,} {\Phi (\varvec{\mu } \,\vert \, \textbf{y}) } \quad s.t.\\ {\quad \partial _t u_h({\varvec{\mu }})}&= {\mathcal {F}^h_{\varvec{\mu }} u_h({\varvec{\mu }}), \, u_h({0, \varvec{\mu }})=u_{h,0}(\varvec{\mu }).} \end{aligned} \end{aligned}$$
(3)

This is equivalent to a maximum likelihood estimation, given the likelihood function, \(l(\varvec{\mu }\,\vert \, \textbf{y}) = \exp \{-\frac{1}{2} {\Phi (\varvec{\mu } \,\vert \, \textbf{y}) } \}\), associated with the probability density function of the data, \(\textbf{y}\vert \varvec{\mu }\), i.e., the probability of observing \(\textbf{y}\) if \(\varvec{\mu }\) is the parametric state. The shape of the function follows from the probability density function of the Gaussian noise realization.
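Explicitly, since \(\varvec{\eta }\sim \mathcal {N}(\textbf{0},\varvec{\Sigma })\), the density of the data conditioned on the parameter reads

$$\begin{aligned} p(\textbf{y}\,\vert \,\varvec{\mu }) = \frac{1}{(2\pi )^{N_m/2}\,\det (\varvec{\Sigma })^{1/2}} \exp \Big \{ -\tfrac{1}{2} {\Vert \textbf{y}- \mathcal {L} u_h({\varvec{\mu }}) \Vert ^2_{{\varvec{\Sigma }}^{-1}}} \Big \} \propto l(\varvec{\mu }\,\vert \, \textbf{y}), \end{aligned}$$

so that maximizing the likelihood is indeed equivalent to minimizing \({\Phi (\varvec{\mu } \,\vert \, \textbf{y}) }\).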

Among the various methods proposed to solve this optimization problem, the EnKM relies on a sequence of parameter ensembles \({\mathcal {E}_{n}}\), with \(n \in {\mathbb {N}^+}\), to estimate the minimum of the cost functional. Each ensemble consists of a collection \({\{{\varvec{\mu }_{n}^{{}_{\left( j\right) }}}\}_{{}^{{j}=1}}^{{}_{J}}}\) of \({J}\in {\mathbb {N}^+}\) parameter vectors \({\varvec{\mu }_{n}^{{}_{\left( j\right) }}}\), hereafter named ensemble members or particles, whose interaction, guided by the experimental measurements, causes them to cluster around the solution of the problem as iterations proceed. At the beginning of each iteration, the solution of the pPDE and its observations are computed for each \({j}\in \{1,\ldots ,{J}\}\). Subsequently, the ensemble is updated based on the empirical correlation among parameters and between parameters and measurements, as well as on the misfits between the experimental measurements \(\textbf{y}({\varvec{\mu }^\star }, \varvec{\eta })\) and the particle measurements \(\mathcal {L} u_h({{\varvec{\mu }_{n}^{{}_{\left( j\right) }}}})\). A single iteration, equivalent to the one in Iglesias et al. (2013), is formalized in the following pseudo-algorithm:

Algorithm 1

Iterative ensemble method for inverse problems.

Input. Let \({\mathcal {E}_{0}}\) be the initial ensemble with elements \(\{{\varvec{\mu }_{0}^{{}_{\left( j\right) }}}\}_{{}^{j=1}}^{{}_J}\) sampled from a given distribution \({\Pi _{0}} ( \varvec{\mu })\). Let \(\varvec{\Sigma }\) be the a priori known noise covariance and \(\textbf{y}\) the vector of noisy measurements collected from the physical system. Let \(\tau \ll 1\) be the termination tolerance.

For \(n = 0,1,\ldots \)

  1. (i)

    Prediction step. Compute the synthetic measurements of the solution over a time interval \(\mathcal {I}\) for each particle in the last updated ensemble:

    $$\begin{aligned} \begin{aligned} {{\mathcal {G}({\varvec{\mu }_{n}^{{}_{\left( j\right) }}})}}&= {\mathcal {L} u_h({{\varvec{\mu }_{n}^{{}_{\left( j\right) }}}}) \quad \text {for all } {j}\in \{1,\ldots ,{J}\} \quad s.t.}\\ {\partial _t u_h({{\varvec{\mu }_{n}^{{}_{\left( j\right) }}}})}&= {\mathcal {F}^h_{{\varvec{\mu }_{n}^{{}_{\left( j\right) }}}} u_h({{\varvec{\mu }_{n}^{{}_{\left( j\right) }}}}), \,\, u_h({0, {\varvec{\mu }_{n}^{{}_{\left( j\right) }}}})=u_{h,0}({\varvec{\mu }_{n}^{{}_{\left( j\right) }}}).} \end{aligned} \end{aligned}$$
    (4)
  2. (ii)

    Intermediate step. From the last updated ensemble measurements and parameters, compute the sample means and covariances:

    $$\begin{aligned} \textbf{P}_n&= \frac{1}{{J}} \sum _{{j}=1}^{J}{\mathcal {G}({\varvec{\mu }_{n}^{{}_{\left( j\right) }}})} {\mathcal {G}({\varvec{\mu }_{n}^{{}_{\left( j\right) }}})}^\top - \,\overline{\mathcal {G}}_{n} \overline{\mathcal {G}}_{n}^\top{} & {} \quad \text {with} \quad \overline{\mathcal {G}}_{n} = \frac{1}{{J}} \sum _{{j}=1}^{J}{\mathcal {G}( {\varvec{\mu }_{n}^{{}_{\left( j\right) }}} )} \end{aligned}$$
    (5)
    $$\begin{aligned} \textbf{Q}_n&= \frac{1}{{J}} \sum _{{j}=1}^{J}{\varvec{\mu }_{n}^{{}_{\left( j\right) }}} {\mathcal {G}({\varvec{\mu }_{n}^{{}_{\left( j\right) }}})}^\top - \,{\overline{\varvec{\mu }}_{n}} \overline{\mathcal {G}}_{n}^\top{} & {} \quad \text {with} \quad {\overline{\varvec{\mu }}_{n}} = \frac{1}{{J}} \sum _{{j}=1}^{J}{\varvec{\mu }_{n}^{{}_{\left( j\right) }}}. \end{aligned}$$
    (6)
  3. (iii)

    Analysis step. Update each particle in the ensemble: \(\text {for all } {j}\in \{1,\ldots ,{J}\}\)

    $$\begin{aligned} { \varvec{\gamma }_n^{{}_{(j)}}}&{\sim \mathcal {N} (0, \varvec{\Sigma }),} \end{aligned}$$
    (7)
    $$\begin{aligned} {{\varvec{\mu }_{n+1}^{{}_{\left( j\right) }}}}&{= {\varvec{\mu }_{n}^{{}_{\left( j\right) }}} + \textbf{Q}_n ( \textbf{P}_n + \varvec{\Sigma })^{-1} \left( \textbf{y}- {\mathcal {G}( {\varvec{\mu }_{n}^{{}_{\left( j\right) }}} )} - \varvec{\gamma }_n^{{}_{(j)}} \right) }. \end{aligned}$$
    (8)
  4. (iv)

    Termination step. Stop the algorithm when the termination criterion is satisfied. Here, we terminate when the relative change in the mean parameter is less than the tolerance:

    $$\begin{aligned} { {\Vert {\overline{\varvec{\mu }}_{n+1}}-{\overline{\varvec{\mu }}_{n}} \Vert _2} \le \tau {\Vert {\overline{\varvec{\mu }}_{n+1}} \Vert _2} \quad \text {with} \quad {\overline{\varvec{\mu }}_{n+1}} = \frac{1}{{J}} \sum _{{j}=1}^{J}{\varvec{\mu }_{n+1}^{{}_{\left( j\right) }}}. } \end{aligned}$$
    (9)

In the analysis step of the algorithm, the sample covariance matrices \({\textbf{P}_{n}}\) and \({\textbf{Q}_{n}}\) are used to compute the Kalman gain \({\textbf{K}_{n}} :={\textbf{Q}_{n}} {{ ({\textbf{P}_{n}} + \varvec{\Sigma }) }^{-1}}\). This modulates the extent of the correction: a low gain corresponds to a conservative behavior, i.e., small changes in the particle positions, while a high gain yields a larger correction. Note that the experimental data are perturbed with artificial noise sampled from the same distribution assumed for the experimental noise \(\varvec{\eta }\). This leads to an improved estimate over the unperturbed case.
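To make the prediction, intermediate, and analysis steps concrete, the following minimal Python sketch implements one iteration of Algorithm 1 for a generic forward map. The function forward_map, standing in for \(\varvec{\mu }\mapsto \mathcal {G}(\varvec{\mu })=\mathcal {L}u_h(\varvec{\mu })\), and all variable names are illustrative placeholders, not part of the reference implementation.

```python
import numpy as np

def enkm_iteration(M, Y, y, Sigma, rng):
    """One EnKM iteration: M is (J, N_p) with one parameter vector per row,
    Y is (J, N_m) with the corresponding synthetic measurements, y is (N_m,)."""
    J = M.shape[0]
    G_bar = Y.mean(axis=0)                       # ensemble mean of measurements
    mu_bar = M.mean(axis=0)                      # ensemble mean of parameters
    P = Y.T @ Y / J - np.outer(G_bar, G_bar)     # sample covariance, eq. (5)
    Q = M.T @ Y / J - np.outer(mu_bar, G_bar)    # cross covariance, eq. (6)
    K = Q @ np.linalg.inv(P + Sigma)             # Kalman gain
    gamma = rng.multivariate_normal(np.zeros(y.size), Sigma, size=J)  # eq. (7)
    return M + (y - Y - gamma) @ K.T             # analysis step, eq. (8)

def enkm(forward_map, M0, y, Sigma, tau=1e-3, max_iter=50, seed=0):
    """Run Algorithm 1 until the relative change of the ensemble mean, eq. (9),
    drops below tau or max_iter iterations are reached."""
    rng = np.random.default_rng(seed)
    M = M0
    for _ in range(max_iter):
        Y = np.array([forward_map(mu) for mu in M])   # prediction step, eq. (4)
        M_new = enkm_iteration(M, Y, y, Sigma, rng)
        if np.linalg.norm(M_new.mean(0) - M.mean(0)) <= tau * np.linalg.norm(M_new.mean(0)):
            return M_new
        M = M_new
    return M
```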

A termination criterion for the algorithm is essential for the proper implementation of the method. The one presented in Iglesias et al. (2013) is based on the discrepancy principle and consists of stopping the algorithm when the misfit between the experimental data and the synthetic measurements is comparable to the experimental noise, that is, when \({\Vert \textbf{y}-{\mathcal {G}({\overline{\varvec{\mu }}_{n}})} \Vert ^2_{{\varvec{\Sigma }}^{-1}}} \le \sigma {\Vert \varvec{\eta } \Vert ^2_{{\varvec{\Sigma }}^{-1}}}\) for some \(\sigma \ge 1\). An alternative approach is to set a threshold for the norm of the parameter update, i.e., to terminate the algorithm when \({\Vert {\overline{\varvec{\mu }}_{n+1}}-{\overline{\varvec{\mu }}_{n}} \Vert _2} \le \tau {\Vert {\overline{\varvec{\mu }}_{n+1}} \Vert _2}\) for some \(\tau \ll 1\). The latter criterion is more robust to model errors and is therefore used in our numerical experiments.

Equally important for the method is the choice of the distribution \({\Pi _{0}}\) from which the initial ensemble (or first guess) \({\mathcal {E}_{0}}\) is sampled. In most cases, including those considered in our numerical experiments, the distribution \(\Pi _0\) encodes a priori knowledge of the range of admissible parameters. In other scenarios, e.g., when the parameters live in an infinite-dimensional space, it may be necessary to define additional criteria on how to treat the parameter space. The initial ensemble plays a fundamental role in stabilizing the inverse problem. Indeed, it has been shown in Iglesias et al. (2013) that all the ensembles generated by Algorithm 1 are contained in the space spanned by the initial ensemble, that is

$$\begin{aligned} {\mathcal {E}_{n}} \subset \mathcal {A} :=\text {span}\, {\{{\varvec{\mu }_{0}^{{}_{\left( j\right) }}}\}_{{}^{{j}=1}}^{{}_{J}}} \quad \text {for all } \, n \in {\mathbb {N}^+}. \end{aligned}$$
(10)

Furthermore, in the mean-field limit, i.e., in the case of infinitely many particles, and assuming an affine relationship between parameters and synthetic measurements, the distribution \({\Pi _{0}}\) plays the same role as Tikhonov regularization in variational data assimilation, see Asch et al. (2016). In particular, the stabilization term is given by \(-\log {\Pi _{0}}(\varvec{\mu })\).

The main sources of error of the EnKM are associated with the ensemble size and with the evaluation of \(\mathcal {G}({\varvec{\mu }_{n}^{{}_{\left( j\right) }}}) = \mathcal {L}u_h({{\varvec{\mu }_{n}^{{}_{\left( j\right) }}}})\). Indeed, while the observation of the solution is accurate and computationally cheap to evaluate, owing to the linearity of the observation operator, the accuracy of the computed pPDE solution intrinsically depends on the quality of the numerical discretization. Highly accurate numerical discretizations might require prohibitively large computational costs, especially if the pPDE (2) is solved for many values of the parameter and over long temporal intervals.

The other steps of Algorithm 1 involve the following operations: (i) the assembly of \(\textbf{P}_n\) and \(\textbf{Q}_n\) in (5)-(6), with computational complexity of order \(\mathcal {O}(J N_m^2)\), and (ii) the inversion of the matrix \(\textbf{P}_n+\varvec{\Sigma }\) in the analysis step (8) with complexity \(\mathcal {O}(N_m^3)\). The solution of the pPDE (2), \(\text {for all } {j}\in \{1,\ldots ,{J}\}\), in the prediction step of Algorithm 1 is thus the computational bottleneck of the EnKM algorithm.

3 Surrogate models

3.1 Reduced basis methods

Given the need to solve the pPDE (2) for several instances of the parameter, the use of MOR techniques appears to be an ideal choice. Model order reduction has enabled substantial computational speed-ups in settings that require repeated model evaluations, such as multi-query simulations. In MOR, the high-dimensional problem is replaced with a surrogate model of reduced dimensionality that still possesses optimal or near-optimal approximation properties but can be solved at a considerably reduced computational cost. In this work, we focus on a particular class of MOR techniques, known as reduced basis methods (Prud’homme et al. 2002).

The reduced basis method typically consists of two phases: an offline phase and an online phase. In the computationally expensive offline phase a low-dimensional approximation of the solution space, namely the reduced space, is constructed and a surrogate model is derived via projection of the full order model onto the reduced space. Then, the resulting low-dimensional reduced model can be solved in the online phase for many instances of the parameter at a computational cost independent of the size of the full order model.

To be more precise, let \(\mathcal {M}:= \{ {u_h(t, \varvec{\mu }) \in \mathcal {V}_h \,\vert \, \partial _t u_h({\varvec{\mu }}) = \mathcal {F}^h_{\varvec{\mu }} u_h({\varvec{\mu }}),} \, u_h({0, \varvec{\mu }})=u_{h,0}(\varvec{\mu }) \, \text{ for } \text{ all }\, \varvec{\mu }\in \mathcal {P},\,t\in \mathcal {I} \}\) be the solution set which collects the solution of the discretized pPDE (2) evaluated at times \(t\in \mathcal {I}:=(0,T]\), with \(T\in {\mathbb {R}^+}\), for a set of parameters \(\varvec{\mu }\in \mathcal {P}\). The parametric problem (2) is said to be reducible if the solution set \(\mathcal {M}\) can be well approximated by a low-dimensional linear subspace. In this case, such a subspace is obtained as the span of a problem-dependent basis derived from a collection of full order solutions or snapshots, \(\{ u_h(t_n, \varvec{\mu }_s) \}_{s,n=1}^{S,R}\), with \(S,R \in {\mathbb {N}^+}\), at sampled values, \(\{ \varvec{\mu }_s \}_{s=1}^{S}\), of the parameter, and at discrete times, \(\{t_n\}_{n=1}^R\). The set \(\mathcal {P}_\text {TRAIN}:=\{ \varvec{\mu }_s \}_{s=1}^{S}\subset \mathcal {P}\) of training parameters is a sufficiently rich subset of the parameter space that can be obtained by drawing random samples from a uniform distribution in \(\mathcal {P}\) or with other sampling techniques, such as statistical methods and sparse grids, see (Quarteroni et al. 2015, Chapter 6) and references therein. The extraction of the basis functions from the snapshots is usually performed using SVD-type algorithms such as the proper orthogonal decomposition (POD) (Berkooz et al. 1993) or greedy algorithms. For problems that depend on both time and parameters, the so-called POD-Greedy method (Grepl and Patera 2005; Haasdonk and Ohlberger 2008) combines a greedy algorithm in parameter space with the proper orthogonal decomposition in time at a given parameter. In the numerical tests of this work, we rely on the Weak-POD-Greedy algorithm, which is the preferred method whenever a rigorous error bound can be derived, while the POD and the Strong-POD-Greedy are often used when a bound is unavailable, i.e., for most non-linear problems. Note that, under the same choice of training parameters, the latter are more accurate but computationally less efficient.
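As a minimal illustration of the compression step, the sketch below extracts a reduced basis from a snapshot matrix via an SVD-based POD with respect to the Euclidean inner product. This is only the simplest representative of the family of methods mentioned above; the numerical examples in this work use a Weak-POD-Greedy strategy and problem-dependent inner products, and the snapshot assembly and energy tolerance here are placeholders.

```python
import numpy as np

def pod_basis(snapshots, tol=1e-4):
    """POD of a snapshot matrix of shape (N_h, S*R), whose columns are the
    full order solutions u_h(t_n, mu_s); Euclidean inner product assumed."""
    U, s, _ = np.linalg.svd(snapshots, full_matrices=False)
    # keep the smallest number of modes such that the retained singular values
    # carry at least a fraction (1 - tol) of the total squared energy
    energy = np.cumsum(s**2) / np.sum(s**2)
    N_eps = int(np.searchsorted(energy, 1.0 - tol)) + 1
    return U[:, :N_eps]   # columns span the reduced (spatial) space
```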

Once an \(N_\varepsilon \)-dimensional set of spatial reduced basis functions \(\{ \psi _{i}\}_{i=1}^{N_\varepsilon }\) is obtained and a set of time basis functions \(\{ \upsilon _n \}_{n=1}^{N_t}\) is selected, the reduced spaces \(\mathcal {H}_{\varepsilon } = \text {span} \{ \psi _{i}\}_{i=1}^{N_\varepsilon } \subset \mathcal {H}_h\) and \(\mathcal {V}_{\varepsilon } = \text {span} \{ \upsilon _n {\otimes } \psi _{i}\}_{i,n=1}^{N_\varepsilon , N_t} \subset \mathcal {V}_h\) are constructed. The full model solution \(u_h(\varvec{\mu })\) and the initial condition \(u_{h,0}(\varvec{\mu })\), for a given \(\varvec{\mu }\), are approximated by the functions \(u_{\varepsilon }(\varvec{\mu })\) in \(\mathcal {V}_{\varepsilon }\) and \(u_{\varepsilon ,0}(\varvec{\mu })\) in \(\mathcal {H}_{\varepsilon }\),

$$\begin{aligned} { u_{\varepsilon }(\varvec{\mu }) = \sum _{i,n=1}^{N_\varepsilon , N_t} u_{i,n}(\varvec{\mu })\,\upsilon _n \, \psi _i, \quad u_{\varepsilon ,0}(\varvec{\mu }) = \sum _{i=1}^{N_\varepsilon } u_{i,0}(\varvec{\mu })\,\psi _i, \quad \varvec{\mu }\in \mathcal {P},} \end{aligned}$$

where \((u_{1,1}(\varvec{\mu }),\ldots ,u_{N_\varepsilon ,N_t}(\varvec{\mu }))^\top \in \mathbb {R}^{N_\varepsilon N_t}\) and \((u_{1,0}(\varvec{\mu }),\ldots ,u_{N_\varepsilon ,0}(\varvec{\mu }))^\top \in \mathbb {R}^{N_\varepsilon }\) denote the vectors of expansion coefficients in the reduced basis. The reduced model thus reads:

$$\begin{aligned} { \text {find} \,\, u_{\varepsilon }(\varvec{\mu }) \in \mathcal {V}_{\varepsilon } \quad \text {s.t.} \quad \partial _t u_{\varepsilon }({\varvec{\mu }}) = \mathcal {F}^{\varepsilon }_{\varvec{\mu }}u_{\varepsilon }({\varvec{\mu }}), \quad u_{\varepsilon }({0, \varvec{\mu }})=u_{\varepsilon ,0}(\varvec{\mu }),} \end{aligned}$$
(11)

where the operator \(\mathcal {F}^{\varepsilon }_{\varvec{\mu }}\) is obtained by projecting the full order operator \(\mathcal {F}^{h}_{\varvec{\mu }}\) onto the reduced space \(\mathcal {V}_{\varepsilon }\). Note that we set \(N_t = R\) in the sequel since we do not consider a temporal compression. However, choosing \(R < N_t\) is also possible.

The computational gain derived from solving problem (11) instead of the full order model (2) hinges on the feasibility of a complete decoupling of the offline and online phases. A computational complexity of the online phase independent of the size of the full order problem can be achieved under the assumption of linearity and parameter-separability of the operator \(\mathcal {F}^{h}_{\varvec{\mu }}\). To deal with general non-linear operators, hyper-reduction techniques are required. These include methods for approximating the high-dimensional non-linear term \(\mathcal {F}^{h}_{\varvec{\mu }}\) with an empirical affine decomposition, such as the empirical interpolation method (EIM) (Barrault et al. 2004), and methods for reducing the cost of evaluating the non-linear term, such as linear program empirical quadrature (Yano and Patera 2019) and empirical cubature (Hernández et al. 2017).
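For intuition only, the sketch below shows the greedy selection of interpolation indices that lies at the core of EIM and its discrete variant DEIM; it assumes that a basis of the non-linear term (e.g., POD modes of non-linear snapshots) is already available, and it is not the specific hyper-reduction procedure used in this work.

```python
import numpy as np

def greedy_interpolation_indices(W):
    """DEIM-style index selection for a basis W of shape (N_h, m) whose
    columns approximate the non-linear term."""
    m = W.shape[1]
    idx = [int(np.argmax(np.abs(W[:, 0])))]
    for l in range(1, m):
        # interpolate the l-th mode at the current indices and pick the
        # location where the interpolation residual is largest
        c = np.linalg.solve(W[np.ix_(idx, range(l))], W[idx, l])
        r = W[:, l] - W[:, :l] @ c
        idx.append(int(np.argmax(np.abs(r))))
    return np.array(idx)
```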

3.2 A reduced basis ensemble Kalman method

In this section, we discuss the implications of replacing the high-fidelity model in the prediction step of the EnKM by a surrogate model derived via model order reduction, as described in Sect. 3.1. The use of MOR for particle-based methods is particularly desirable in multi-query contexts since it allows us to significantly reduce the computational cost of solving the inverse problem. However, the approximation introduced by the model order reduction inevitably produces (small) deviations of the reduced solution from the full order one. This constitutes a problem for data assimilation algorithms, as already documented and investigated in Calvetti et al. (2018) and in other works. Indeed, the error in the solution results in discrepancies between approximated and exact measurements. Although we can expect the mismatch \(\varvec{\delta }_\varepsilon (\varvec{\mu }) :=\mathcal {L}u_h(\varvec{\mu }) - \mathcal {L}u_\varepsilon (\varvec{\mu })\) to decrease with the approximation error of \(u_\varepsilon (\varvec{\mu })\), this bias will inevitably entail a distortion of the loss functional obtained by simple model substitution, i.e.,

$$\begin{aligned} { \widetilde{\Phi } (\varvec{\mu }\vert \,\textbf{y}) :=\Vert \textbf{y}({\varvec{\mu }^\star }, \varvec{\eta }) - \mathcal {L}u_{\varepsilon }({\varvec{\mu }})\Vert ^2_{\varvec{\Sigma }^{-1}}.} \end{aligned}$$
(12)

Note that this cost function does not vanish at the parameter \({\varvec{\mu }^\star }\) we are trying to estimate, not even in noise-free conditions. This systematic error, independent of the magnitude of the experimental noise, can be mitigated by modifying the cost function and consequently the EnKM. A modified algorithm, which we refer to as the adjusted RB-EnKM, is presented in the following sections. This algorithm is in contrast to what we refer to as the biased RB-EnKM, i.e., the algorithm obtained by the simple substitution of the full order model with the reduced order model in Algorithm 1, corresponding to the cost function (12).

The modification of the algorithm can proceed in two ways. One possibility is to rewrite the exact cost function in terms of the surrogate model and the measurement bias, namely substituting \(\mathcal {L}u_h({\varvec{\mu }}) = \mathcal {L}u_{\varepsilon }({\varvec{\mu }}) + \varvec{\delta }_\varepsilon (\varvec{\mu })\) in the minimization problem (3) to obtain

$$\begin{aligned} {\Phi _1 (\varvec{\mu }\vert \,\textbf{y})}:= & {} {\, \Vert \textbf{y}({\varvec{\mu }^\star }, \varvec{\eta }) - \mathcal {L}u_{\varepsilon }({\varvec{\mu }}) - \varvec{\delta }_\varepsilon (\varvec{\mu }) \Vert ^2_{\varvec{\Sigma }^{-1}}} \nonumber \\&{=}&{\, \Vert \mathcal {L}u_h({{\varvec{\mu }^\star }}) - \mathcal {L}u_h({\varvec{\mu }}) + \varvec{\eta }\, \Vert ^2_{\varvec{\Sigma }^{-1}} \equiv \Phi (\varvec{\mu }\vert \,\textbf{y}).} \end{aligned}$$
(13)

A second option is to correct the experimental data involved in the biased cost function (12) so that, at least in noise free conditions, its minimum coincides with the minimum of the exact cost function. This means subtracting \(\varvec{\delta }_\varepsilon ({\varvec{\mu }^\star })\) instead of \(\varvec{\delta }_\varepsilon (\varvec{\mu })\), and results in the new cost function

$$\begin{aligned} {\Phi _2 (\varvec{\mu }\vert \,\textbf{y})}:= & {} {\, \Vert \textbf{y}({\varvec{\mu }^\star }, \varvec{\eta }) - \mathcal {L}u_{\varepsilon }({\varvec{\mu }}) - \varvec{\delta }_\varepsilon ({\varvec{\mu }^\star }) \Vert ^2_{\varvec{\Sigma }^{-1}}} \nonumber \\&{=}&{\, \Vert \mathcal {L}u_{\varepsilon }({{\varvec{\mu }^\star }}) - \mathcal {L}u_{\varepsilon }({\varvec{\mu }}) + \varvec{\eta }\, \Vert ^2_{\varvec{\Sigma }^{-1}} \not \equiv \Phi (\varvec{\mu }\vert \,\textbf{y}).} \end{aligned}$$
(14)

In noise-free conditions, i.e., if \(\varvec{\eta }= \textbf{0}\), both cost functions vanish at the exact value \({\varvec{\mu }^\star }\). Since the cost functions \(\Phi _1\) and \(\Phi _2\) are non-negative, the minimum attained at \({\varvec{\mu }^\star }\) is necessarily also a global minimum.

In the following, we focus on the second approach. The reason is that the first approach requires the evaluation of the bias at all parameter values \(\varvec{\mu }\in \mathcal {P}\), which is too expensive to perform. Furthermore, at the algorithmic level, the substitution of the true model with the sum of the surrogate model and its bias would significantly change the computation of \(\textbf{P}_n\) and \(\textbf{Q}_n\), and thus the algorithm structure. By contrast, the second approach corrects the experimental data so that they become consistent with the surrogate model. This implies that \(\varvec{\delta }_\varepsilon ({\varvec{\mu }^\star })\) is the only bias involved, and its evaluation would require just a single full order solve. However, since the argument \({\varvec{\mu }^\star }\) is unknown, this is clearly not possible, and we must instead exploit the prior epistemic uncertainty on \({\varvec{\mu }^\star }\), encoded in \(\Pi _0(\varvec{\mu })\), to modify the cost function.

If \({\varvec{\mu }^\star }\) is treated as a random variable with probability measure \(\Pi _0\), then the data bias \(\varvec{\delta }^\star _\varepsilon = \varvec{\delta }_\varepsilon ({\varvec{\mu }^\star })\) is in turn a random variable with probability measure \(\Pi _0 \circ \varvec{\delta }_\varepsilon ^{-1}\). The first two moments of this distribution, henceforth denoted by \(\overline{\varvec{\delta }}_\varepsilon \) and \(\varvec{\Gamma }_\varepsilon \), can be empirically estimated via pointwise evaluations of the bias without further assumptions on the nature of the distribution itself. However, the assumption of Gaussianity, although it implicitly presumes the linearity of \(\varvec{\delta }_\varepsilon : \mathcal {P} \rightarrow \mathbb {R}^{N_m}\), is consistent with the other Gaussianity and linearity assumptions required for the derivation of the EnKF (Evensen 2003). Furthermore, it allows us to obtain closed-form results, as shown in the next paragraphs.

Since \(\varvec{\delta }^\star _\varepsilon \) is now treated as a random variable, we rewrite (14) to make the dependence of the cost function \(\Phi _2\) on \(\varvec{\delta }^\star _\varepsilon \) explicit, i.e.,

$$\begin{aligned} { \Phi _\varepsilon (\varvec{\mu }\, \vert \, \textbf{y}, \varvec{\delta }^\star _\varepsilon ) :=\, \Vert \textbf{y}({\varvec{\mu }^\star }, \varvec{\eta }) - \mathcal {L}u_{\varepsilon }({\varvec{\mu }}) - \varvec{\delta }^\star _\varepsilon \Vert ^2_{\varvec{\Sigma }^{-1}}.} \end{aligned}$$
(15)

In order to make the estimate of \(\varvec{\mu }\) depend only on the experimental data, we must remove the conditioning on \(\varvec{\delta }^\star _\varepsilon \), i.e., marginalize out this random variable. The easiest way to do so is to employ a Bayesian argument and recover the marginal distribution of \(\textbf{y}\vert \varvec{\mu }\) mentioned at the beginning of Sect. 2.1. To this end, we consider the likelihood function \( l (\varvec{\mu }\, \vert \, \textbf{y}, \varvec{\delta }^\star _\varepsilon ) :=\exp \{ - \frac{1}{2} \Phi _\varepsilon (\varvec{\mu }\, \vert \, \textbf{y}, \varvec{\delta }^\star _\varepsilon ) \}\), proportional to the density of \((\textbf{y}\,\vert \, \varvec{\mu }, \varvec{\delta }^\star _\varepsilon ) \sim \mathcal {N}( \varvec{\delta }^\star _\varepsilon + \mathcal {L}u_{\varepsilon }({\varvec{\mu }}), \varvec{\Sigma })\). Employing (Särkkä 2013, Lemma 1.A), concerning the mean and covariance of the joint distribution of Gaussian variables, it can easily be shown that, if \(\varvec{\delta }^\star _\varepsilon \sim \mathcal {N}(\overline{\varvec{\delta }}_\varepsilon , \Gamma _\varepsilon )\), then \(\textbf{y}\,\vert \, \varvec{\mu }\sim \mathcal {N}( \overline{\varvec{\delta }}_\varepsilon + \mathcal {L}u_{\varepsilon }({\varvec{\mu }}), \varvec{\Sigma }+ \Gamma _\varepsilon )\), and consequently we obtain the marginalized cost functional

$$\begin{aligned} { \Phi _\varepsilon (\varvec{\mu }\, \vert \, \textbf{y}) :=\, \Vert \textbf{y}- \mathcal {L}u_{\varepsilon }({\varvec{\mu }}) - \overline{\varvec{\delta }}_\varepsilon \Vert ^2_{(\varvec{\Sigma }+\Gamma _\varepsilon )^{-1}}.} \end{aligned}$$
(16)
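The marginalization underlying (16) can be made explicit: since the noise and, by assumption, the bias are independent Gaussian variables, integrating the bias out of the joint density amounts to a convolution of two Gaussians,

$$\begin{aligned} p(\textbf{y}\,\vert \,\varvec{\mu }) = \int _{\mathbb {R}^{N_m}} \mathcal {N}\big (\textbf{y};\, \varvec{\delta }+ \mathcal {L}u_{\varepsilon }({\varvec{\mu }}), \varvec{\Sigma }\big ) \, \mathcal {N}\big (\varvec{\delta };\, \overline{\varvec{\delta }}_\varepsilon , \Gamma _\varepsilon \big ) \, d\varvec{\delta } = \mathcal {N}\big (\textbf{y};\, \overline{\varvec{\delta }}_\varepsilon + \mathcal {L}u_{\varepsilon }({\varvec{\mu }}), \varvec{\Sigma }+ \Gamma _\varepsilon \big ), \end{aligned}$$

which is exactly the density underlying the cost functional (16).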

Hence, by analogy with Sect. 2.1, we can adapt the EnKM to optimize the new cost function under the surrogate model constraint (11). The resulting adjusted RB-EnKM is summarized in Algorithm 2. Unlike the reference EnKM, we distinguish between an offline and an online phase. In the offline phase, the training set of full order solutions is generated and used both to construct the surrogate model and to estimate the moments of \(\varvec{\delta }^\star _\varepsilon \). In the online phase, the actual optimization is performed.

Algorithm 2

Iterative ensemble method with reduced basis surrogate models, accounting for the associated measurement bias.

Offline:

Input. Let \(\mathcal {P}_{\text {TRAIN}}\) be a set of S parameters \(\{\varvec{\mu }_{0}^{{}_{(s)}}\}_{{}^{s=1}}^{{}_S}\) sampled from a given probability distribution \({\Pi _{0}} ( \varvec{\mu })\) and let \(\{ u_h(\varvec{\mu }_s) \}_{s=1}^{S}\) be the associated training set of full order solutions. Let \(\varepsilon \in {\mathbb {R}^+}\) be a prescribed tolerance.

  1. (i)

    Model order reduction. Relying on the training set \(\{ u_h(\varvec{\mu }_s) \}_{s=1}^{S}\), construct a surrogate model of accuracy \(\varepsilon \) as explained in Sect. 3.1 and compute the set of reduced basis solutions \(\{ u_\varepsilon (\varvec{\mu }_s) \}_{s=1}^{S}\).

  2. (ii)

    Data bias estimation. Define the training biases as

    $$\begin{aligned} \varvec{\delta }_{\varepsilon }(\varvec{\mu }^{{}_{(s)}}) = \mathcal {L} u_h({\varvec{\mu }}^{{}_{(s)}}) - \mathcal {L} u_{\varepsilon }({\varvec{\mu }}^{{}_{(s)}}) \quad \text {for all } s \in \{1,...,S\} \end{aligned}$$
    (17)

    and the associated empirical moments

    $$\begin{aligned} \varvec{\Gamma }_{\varepsilon } = \frac{1}{S} \sum _{s=1}^S \varvec{\delta }_{\varepsilon }(\varvec{\mu }^{{}_{(s)}}) \varvec{\delta }_{\varepsilon }(\varvec{\mu }^{{}_{(s)}})^\top - \,\overline{\varvec{\delta }}_\varepsilon \overline{\varvec{\delta }}_\varepsilon ^\top \quad \text {with} \quad \,\,\overline{\varvec{\delta }}_\varepsilon = \frac{1}{S} \sum _{s=1}^S \varvec{\delta }_{\varepsilon }(\varvec{\mu }^{{}_{(s)}}). \end{aligned}$$
    (18)

Online:

Input. Let \({\mathcal {E}_{0}}\) be the initial ensemble with elements \(\{{\varvec{\mu }_{0}^{{}_{\left( j\right) }}}\}_{{}^{j=1}}^{{}_J}\) sampled from a given distribution \({\Pi _{0}} ( \varvec{\mu })\). Let \(\varvec{\Sigma }\) be the a priori known noise covariance and \(\textbf{y}\) the vector of noisy measurements collected from the physical system. Let \(\tau \ll 1\) be the termination parameter.

For \(n = 0,1,\ldots \)

  1. (i)

    Prediction step. Compute the biased measurements of the approximated solution over a time interval \(\mathcal {I}\) for each particle in the last updated ensemble:

    $$\begin{aligned} \begin{aligned} {{\mathcal {G}_\varepsilon ({\varvec{\mu }_{n}^{{}_{\left( j\right) }}})}}&\,{= {\mathcal {L}u_{\varepsilon }({{\varvec{\mu }_{n}^{{}_{\left( j\right) }}}})} \quad \text {for all } {j}\in \{1,\ldots ,{J}\} \quad s.t.} \\ {\partial _t u_{\varepsilon }({{\varvec{\mu }_{n}^{{}_{\left( j\right) }}}})}&\,{= \mathcal {F}^\varepsilon _{{\varvec{\mu }_{n}^{{}_{\left( j\right) }}}} u_{\varepsilon }({{\varvec{\mu }_{n}^{{}_{\left( j\right) }}}}), \,\, u_{\varepsilon }({0, {\varvec{\mu }_{n}^{{}_{\left( j\right) }}}})=u_{\varepsilon ,0}({\varvec{\mu }_{n}^{{}_{\left( j\right) }}}).} \end{aligned} \end{aligned}$$
    (19)
  2. (ii)

    Intermediate step. From the last updated ensemble measurements and parameters, define the sample means and covariances:

    $$\begin{aligned} \textbf{P}_{n,\varepsilon }&= \frac{1}{{J}} \sum _{{j}=1}^{J}{\mathcal {G}_\varepsilon ({\varvec{\mu }_{n}^{{}_{\left( j\right) }}})}\, {\mathcal {G}_\varepsilon ({\varvec{\mu }_{n}^{{}_{\left( j\right) }}})}^\top - \,\overline{\mathcal {G}}_{{n},\varepsilon } \, \overline{\mathcal {G}}_{{n},\varepsilon }^\top{} & {} \, \text {with} \,\,\, \overline{\mathcal {G}}_{{n},\varepsilon } = \frac{1}{{J}} \sum _{{j}=1}^{J}{\mathcal {G}_\varepsilon ( {\varvec{\mu }_{n}^{{}_{\left( j\right) }}} )}, \end{aligned}$$
    (20)
    $$\begin{aligned} \textbf{Q}_{n,\varepsilon }&= \frac{1}{{J}} \sum _{{j}=1}^{J}{\varvec{\mu }_{n}^{{}_{\left( j\right) }}} {\mathcal {G}_\varepsilon ({\varvec{\mu }_{n}^{{}_{\left( j\right) }}})}^\top - \,{\overline{\varvec{\mu }}_{n}} \overline{\mathcal {G}}_{{n},\varepsilon }^\top{} & {} \, \text {with} \,\,\,\,\,{\overline{\varvec{\mu }}_{n}} = \frac{1}{{J}} \sum _{{j}=1}^{J}{\varvec{\mu }_{n}^{{}_{\left( j\right) }}}. \end{aligned}$$
    (21)
  3. (iii)

    Analysis step. Update each particle in the ensemble: \(\text {for all } {j}\in \{1,\ldots ,{J}\}\)

    $$\begin{aligned} {\varvec{\gamma }_n^{{}_{(j)}}}&{\sim \mathcal {N} ( \overline{\varvec{\delta }}_\varepsilon , \varvec{\Sigma }+ \varvec{\Gamma }_\varepsilon ),} \end{aligned}$$
    (22)
    $$\begin{aligned} {{\varvec{\mu }_{n+1}^{{}_{\left( j\right) }}}}&{= {\varvec{\mu }_{n}^{{}_{\left( j\right) }}} + \textbf{Q}_{n,\varepsilon } \left( \textbf{P}_{n,\varepsilon } + \varvec{\Gamma }_\varepsilon + \varvec{\Sigma } \right) ^{-1} \left( \textbf{y}- {\mathcal {G}_\varepsilon ( {\varvec{\mu }_{n}^{{}_{\left( j\right) }}} )} - \varvec{\gamma }_n^{{}_{(j)}} \right) .} \end{aligned}$$
    (23)
  4. (iv)

    Termination step. Stop the algorithm when the termination criterion is satisfied:

    $$\begin{aligned} {{\Vert {\overline{\varvec{\mu }}_{n+1}}-{\overline{\varvec{\mu }}_{n}} \Vert _2} \le \tau {\Vert {\overline{\varvec{\mu }}_{n+1}} \Vert _2} \quad \text {with} \quad {\overline{\varvec{\mu }}_{n+1}} = \frac{1}{{J}} \sum _{{j}=1}^{J}{\varvec{\mu }_{n+1}^{{}_{\left( j\right) }}}. } \end{aligned}$$
    (24)

By employing the same training set for constructing the surrogate model and for evaluating \(\overline{\varvec{\delta }}_\varepsilon \) and \(\varvec{\Gamma }_\varepsilon \), we provide the largest possible training set to the model order reduction algorithm for a fixed value of S. However, we also introduce a bias in the estimation of the moments of \(\varvec{\delta }_\varepsilon ({\varvec{\mu }^\star })\), since the values of \(\varvec{\delta }_{\varepsilon }(\varvec{\mu }^{{}_{(s)}})\) are underestimated on the training set. This bias could be removed, e.g., by partitioning the training set \(\{\varvec{\mu }_{0}^{{}_{(s)}}\}_{{}^{s=1}}^{{}_S}\) into two sub-sets (or by introducing two independent sets with cardinality S/2), one for the construction of the surrogate model and one for the independent estimation of \(\overline{\varvec{\delta }}_\varepsilon \) and \(\varvec{\Gamma }_\varepsilon \). The disadvantage of this approach (for fixed S) would be a smaller training set for the surrogate model construction, and poorer, yet unbiased, statistics for the estimation of the moments.
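As an illustration of the offline data bias estimation in (17)–(18), a minimal sketch is given below. The callables measure_full and measure_rb stand in for \(\varvec{\mu }\mapsto \mathcal {L}u_h(\varvec{\mu })\) and \(\varvec{\mu }\mapsto \mathcal {L}u_\varepsilon (\varvec{\mu })\) and, like all names here, are illustrative placeholders.

```python
import numpy as np

def bias_moments(train_params, measure_full, measure_rb):
    """Empirical mean and covariance of the measurement bias, eqs. (17)-(18),
    evaluated on the training parameters used to build the reduced basis."""
    D = np.array([measure_full(mu) - measure_rb(mu) for mu in train_params])  # (S, N_m)
    S = D.shape[0]
    delta_bar = D.mean(axis=0)
    Gamma = D.T @ D / S - np.outer(delta_bar, delta_bar)   # 1/S estimator, as in (18)
    return delta_bar, Gamma
```

In the adjusted analysis step (22)–(23), these two quantities simply enter as the mean of the artificial perturbations and as an inflation of the measurement covariance, i.e., \(\varvec{\Sigma }+\varvec{\Gamma }_\varepsilon \) replaces \(\varvec{\Sigma }\).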

We note from the data bias estimation in the offline part of Algorithm 2 that the RB approximation needs to approximate the full order model uniformly well over the entire parameter domain. We can thus not replace the classical offline-online decomposition by an on-the-fly adaptation of the reduced basis (Donoghue and Yano 2022), which would be beneficial if only local approximations, e.g., along the optimization path, were required.

In Algorithm 2, the prior probability \(\Pi _0 (\varvec{\mu })\) used for the estimation of the moments of \(\varvec{\delta }^\star _\varepsilon \) could be substituted at every iteration by an updated probability measure of \(\varvec{\mu }\). However, the computation of the updated probability measure might compromise the computational gain obtained with the use of reduced models. One possibility to address this shortcoming is to use a Gaussian process regression of the initial ensemble biases to estimate the moments of \(\varvec{\delta }^\star _\varepsilon \) with respect to the new probability measure of \(\varvec{\mu }\). The development and study of this strategy, together with its effect on the accuracy and performance of the RB-EnKM, will be investigated in future work.

4 Numerical experiments

In this section, we consider two data assimilation problems for the estimation of model parameters in pPDEs. The first problem involves a linear advection-dispersion equation with unknown Péclet number. The corresponding model is linear in the observed state \(c(\mu )\), but it is non-linear in the parameter to be estimated. The second problem concerns the transport of a contaminant in an unconfined aquifer with unknown hydraulic conductivity. It involves two coupled PDEs: a stationary non-linear equation which describes the pressure field induced by an external pumping force and a time-dependent linear equation describing the advection-dispersion of the contaminant in a medium whose properties depend non-linearly on the pressure field.

Both models describe 2D systems, and each exhibits ideal characteristics to test the proposed algorithms. The first, while leading to a non-linear inverse problem, is sufficiently simple to allow for a comparison between the adjusted and biased RB-EnKM and the reference full order EnKM. Moreover, its affine dependence on the parameter enables the use of error bounds for the efficient construction of the reduced space. The second problem, which is non-linear and non-affine in the six-dimensional parameter vector, is complex enough to serve as a non-trivial challenge for the proposed RB-EnKM algorithm, while the reference EnKM cannot even be tested due to the computational cost. From an a priori estimate, performing full order tests with the same statistical relevance as the reduced basis ones would have taken up to 20 days on our machine.

The two problems are presented in Sects. 4.1 and 4.2. We first introduce the pPDE, then present the full order discretization followed by the reduced basis approximation. The measurement operator is then introduced, and a first analysis of the inversion method is carried out. Finally, we study the impact of the ensemble size, of the experimental noise magnitude, and of the error of the reduced model on the reconstruction error of the EnKM. All the computations are performed using Python on a computer with a 2.20 GHz Intel Core i7-8750H processor and 32 GB of RAM.

4.1 Taylor–Green vortex problem

Let us consider the dispersion of a contaminant modeled by the 2D advection–diffusion equation with a Taylor–Green vortex velocity field (Kärcher et al. 2018). We introduce the spatial domain \({\Omega }= (-1, 1)^2\) with Dirichlet boundary \({ \Gamma _D :=(-1,1) \times \{ -1 \} }\) and Neumann boundary \({ \Gamma _N :={\partial {\Omega }}{\setminus } \Gamma _D }\), and the time domain \(\mathcal {I}:=\left( 0,T\right] \) with \(T=2.5\). We consider the problem of estimating the inverse of the Péclet number \(\mu = 1/\textrm{Pe}\) in the interval \(\mathcal {P}:=[1/50, 1/10]\). The governing pPDE is given by: find \( c(\mu ): {\Omega }\times \left( 0,T\right] \rightarrow \mathbb {R} \) such that

$$\begin{aligned} \left\{ \begin{aligned}&\partial _t c - \mu \Delta c + \varvec{\beta } \cdot \nabla c = 0, \qquad{} & {} \text{ in } \, {\Omega }&\times \, {\mathcal {I}},\\&\nabla c(\textbf{x},t;\mu )\cdot \textbf{n} = 0,{} & {} \text{ on } \, \Gamma _N&\times \, {\mathcal {I}},\\&c(\textbf{x},t;\mu )=0,{} & {} \text{ on } \, \Gamma _D&\times \, {\mathcal {I}},\\&c(\textbf{x},0;\mu )=c_0(\textbf{x};\mu ),{} & {} \text{ in } \, {\Omega }.&\end{aligned} \right. \end{aligned}$$
(25)

Here, the velocity field \(\varvec{\beta } :=(\sin (\pi x_1) \cos (\pi x_2), -\cos (\pi x_1) \sin (\pi x_2))^\top \), \(\textbf{x} = (x_1, x_2)\), is a solenoidal field, and the initial condition \(c_0(\mu ): {\Omega }\rightarrow \mathbb {R}\) is given by the sum of three Wendland functions \(\psi _{2,1}\) (Wendland 1995) of radius 0.4 and centers located at \((-0.6, -0.6)\), (0, 0), and (0.6, 0.6). The velocity field and the initial condition are shown in Fig. 1.

Fig. 1 Spatial domain of the Taylor–Green problem. Left: the initial condition \(c_0\) in blue, the sensor shape functions \(\eta _i\), and the Neumann and Dirichlet boundaries \(\Gamma _N\), \(\Gamma _D\). Right: the velocity field \(\varvec{\beta }\) with four Taylor–Green vortices.

The full order model is obtained by a nodal finite element discretization of (25) using continuous piecewise polynomial functions, \(\zeta _i: {\Omega }\rightarrow \mathbb {R}\), \(i = 1, \ldots , N_h\), of degree 2 over a uniform Cartesian grid of width \(h=0.04\), for a total of \(N_h = 10,100\) degrees of freedom. The resulting system of ordinary differential equations is integrated over time using a Crank–Nicolson scheme with uniform time step \(\Delta t = 0.01\). As shown in (Thomée 2006, Chapter 12), this is equivalent to performing a Petrov–Galerkin projection of (25) with trial and test spaces defined as follows: we consider the partition of the temporal interval \(\mathcal {I}\) into the union of equispaced subintervals, \(\mathcal {I}_n :=\left( t_{n-1}, t_n \right] \), of length \(\Delta t\) with \(n=1,\ldots , N_t\) and \(N_t :=T / \Delta t\). Let \(\omega _n:\mathcal {I} \rightarrow \mathbb {R}\) be a piecewise constant function with support in \(\mathcal {I}_n\), and let \(\upsilon _n:\mathcal {I} \rightarrow \mathbb {R}\) be a hat function with support in \(\mathcal {I}_{n} \cup \mathcal {I}_{n+1}\). We define the trial space \(\mathcal {V}_h :=\text {span} \{ \upsilon _n \cdot \zeta _i \}_{ {}^{i,n=1} }^{ {}_{N_h, N_t} }\) and the test space \(\mathcal {W}_h :=\text {span} \{ \omega _n \cdot \zeta _i \}_{ {}^{i,n=1} }^{ {}_{N_h, N_t} }\).

To solve the spatial problems arising at each time step, we use the sparse \(\texttt {splu}\) function implemented in the scipy.sparse.linalg package. The computational time to obtain a single full order solution is on average 0.56 s. Snapshots of the solution at times \(t \in \{0.2, 0.8, 1.4, 2.0\}\) and for the three parameter values \(\mu \in \{1/10, 1/30, 1/50 \}\) are shown in Fig. 2.

Fig. 2 Solution of the advection–diffusion equation for three increasing values of \(\textrm{Pe}\) at four time instances t. Snapshots are normalized to unit \(L^\infty ({\Omega })\) norm.

The high-fidelity model is used in combination with the time-gradient error bound \(\Delta ^{pr}_\text {R}(\mu )\) introduced in Aretz (2021) to implement a Weak-POD-Greedy algorithm for the selection of the reduced basis functions. To this end, we consider the training set \(\Xi _{\text {TRAIN}}^\mu \) with parameters \(\mu ^{{}_{(s)}}= 1/(9.5 + 0.5 s)\) for all \(s\in \mathbb {N}\cap [1,S]\) of size \(S=81\). We prescribe a target accuracy of \(10^{-2}\) for the maximum time-gradient relative error bound and we obtain an RB space of size 42. We can construct surrogate models of different accuracy by selecting \(N_\varepsilon \in \mathbb {N}\) basis functions \(\psi _i: {\Omega }\rightarrow \mathbb {R}\), for \(i=1,\ldots ,N_{\varepsilon }\), out of these 42. Each choice corresponds to a relative error for the model given by

$$\begin{aligned} \varepsilon _c :=\sup _{\mu \in \mathcal {P}} \frac{\Vert c_h(\mu ) - c_\varepsilon (\mu ) \Vert _{L^2(\mathcal {I},H^{1}({\Omega }))} }{ \Vert c_h(\mu ) \Vert _{L^2(\mathcal {I},H^{1}({\Omega }))} }. \end{aligned}$$
(26)

Once the reduced basis has been computed, we construct a reduced model via a Petrov–Galerkin projection of (25) in the same way as we did for the full order model. For this purpose, we define the trial space \(\mathcal {V}_\varepsilon :=\text {span} \{ \upsilon _n {\otimes } \psi _i \}_{ {}^{i,n=1} }^{ {}_{N_\varepsilon , N_t} }\) and the test space \(\mathcal {W}_\varepsilon :=\text {span} \{ \omega _n {\otimes } \psi _i \}_{ {}^{i,n=1} }^{ {}_{N_\varepsilon , N_t} }\).

We then look for a reduced solution of the form

$$\begin{aligned} { c_{\varepsilon }(\mu )= \sum _{i=1}^{N_{\varepsilon }}\sum _{n=1}^{N_t} c_{n,i}(\mu ) \, \upsilon _n \, \psi _i,} \end{aligned}$$
(27)

where the expansion coefficients \(c_{0,i}\), for \(i=1,\ldots ,N_\varepsilon \), result from the projection of the initial condition onto \(\mathcal {V}_\varepsilon \), while the remaining coefficients \(c_{n,i}\), with \(i=1,\ldots ,N_{\varepsilon }\) and \(n=1,\ldots ,N_t\), satisfy the equation

$$\begin{aligned} \sum _{j=1}^{ N_\varepsilon } \left( \textbf{M}_{ij} + \frac{\Delta t}{2} ( \textbf{A}_{ij} + \mu \textbf{K}_{ij}) \right) {c_{n,j}} = \sum _{j=1}^{ N_\varepsilon } \left( \textbf{M}_{ij} - \frac{\Delta t}{2} ( \textbf{A}_{ij} + \mu \textbf{K}_{ij}) \right) {c_{n-1,j}}. \end{aligned}$$
(28)

Here the matrices \(\textbf{M}, \textbf{K}, \textbf{A} \in \mathbb {R}^{\scriptscriptstyle N_\varepsilon \times N_\varepsilon }\) denote the mass, stiffness, and advection matrix, respectively, and are given by

$$\begin{aligned} \textbf{M}_{ij} :=\int _\Omega \psi _j \psi _i \, d\Omega ,\,\,\, \textbf{K}_{ij} :=\int _\Omega \nabla \psi _j \cdot \nabla \psi _i \, d\Omega ,\,\,\, \textbf{A}_{ij} :=\int _\Omega (\varvec{\beta } \cdot \nabla \psi _j) \psi _i \, d\Omega . \end{aligned}$$
(29)

The solution of the system of equations (28), equivalent to a Crank–Nicolson scheme, can be obtained by iteratively solving \(N_t\) linear systems of size \(N_\varepsilon \), for an online complexity of \(\mathcal {O}(N_\varepsilon ^3 + N_t N_\varepsilon ^2)\). This complexity can be achieved, since the system matrix is time-independent, by performing the LU factorization of the left-hand side once before entering the time integration loop. Employing all \(N_\varepsilon = 42\) basis functions, the computational time for a reduced basis solution (online cost) is on average 5.4 ms, significantly less than the approximately 0.56 s required for a full order solution. The achieved speed-up exceeds a factor of 100, which justifies the 47 s necessary for the construction of the RB model (offline cost), considering that the online phase requires computing up to 150 reduced basis solutions per iteration. Let us remark that such a cheap training phase is due to the low dimensionality of the parameter space \(\mathcal {P}\) and the availability of a tight error bound for this class of linear problems.
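To illustrate the online cost, the following minimal sketch integrates the reduced system (28) in time; the reduced matrices and the initial coefficient vector are assumed to be precomputed in the offline phase, and SciPy's dense LU factorization is used as a stand-in for the factorization discussed above. All names are illustrative.

```python
import numpy as np
from scipy.linalg import lu_factor, lu_solve

def solve_reduced(M_r, K_r, A_r, c0, mu, dt, n_steps):
    """Crank-Nicolson integration of the reduced system (28).
    M_r, K_r, A_r: (N_eps, N_eps) reduced mass/stiffness/advection matrices;
    c0: (N_eps,) reduced coefficients of the initial condition."""
    lhs = M_r + 0.5 * dt * (A_r + mu * K_r)
    rhs = M_r - 0.5 * dt * (A_r + mu * K_r)
    lu, piv = lu_factor(lhs)                 # factor once, O(N_eps^3)
    C = np.empty((n_steps + 1, c0.size))
    C[0] = c0
    for n in range(n_steps):                 # each step costs O(N_eps^2)
        C[n + 1] = lu_solve((lu, piv), rhs @ C[n])
    return C                                 # row n holds the coefficients c_{n,i}
```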

Note that both the online computational cost and the accuracy of the solution depend on \(\Delta t\) and on \(N_\varepsilon \). The former is kept fixed, \(\Delta t = 0.01\), while the latter varies in some of the experiments. In order to keep track of the error associated with different choices of \(N_\varepsilon \), we characterize the error between the surrogate model solution \(c_\varepsilon (\mu )\) and the full model solution \(c_h (\mu )\) as a function of \(N_\varepsilon \). This analysis is provided in Fig. 3, depicting the maximum relative errors in \(L^2(\mathcal {I}, H^1(\Omega ))\), \(L^\infty (\mathcal {I}, L^\infty (\Omega ))\) and the time-gradient norm versus the reduced basis size. It shows a nearly exponential error decay as \(N_\varepsilon \) increases. The maxima are computed on an independent test set, \(\Xi _\text {TEST}^\mu :=\{ 1/(9.75 + 0.5 s)\), for all \( s \in \mathbb {N} \cap [1,80] \}\). Furthermore, the reduced solutions do not appear to deviate significantly from the projection of their full order counterparts onto the associated RB space, and the error bound employed demonstrates a good effectivity.

Fig. 3 Left: maximum relative time-gradient error and error bound of the advection–diffusion solution versus \(N_\varepsilon \). Center: maximum \(L^2(\mathcal {I}, H^1(\Omega ))\) relative error of the projection and of the solution versus \(N_\varepsilon \). Right: maximum \(L^\infty (\mathcal {I}, L^\infty (\Omega ))\) relative error of the projection and of the solution versus \(N_\varepsilon \). Projections are based on the \(L^2(\Omega )\) inner product of the gradients.

For the implementation of the EnKM as presented in Sect. 2.1, it is necessary to provide a mathematical model for the measurement process. We take 40 measurements in time at the three sensor locations, \(\eta _i\), \(i \in \{1,2,3\}\), shown in Fig. 1. For this purpose, we introduce the measurement operator \(\mathcal {L}: L^2(\mathcal {I}, L^2(\Omega )) \rightarrow \mathbb {R}^{120}\), which can be seen as a vector of linear functionals \(\ell _k: L^2(\mathcal {I}, L^2(\Omega )) \rightarrow \mathbb {R}\) for all \(k \in \mathbb {N}\cap [1, 120]\). Each of those linear functionals has a unique Riesz representer \(\rho _k: \mathcal {I} \times \Omega \rightarrow \mathbb {R}\), with respect to the \(L^2(\mathcal {I}, L^2(\Omega ))\) norm, that can be written as

$$\begin{aligned} \rho _k = \nu _j \cdot \eta _i\quad \text{ with }\quad k=3(j-1)+i\quad \text{ for } \text{ all }\, j\in \mathbb {N}\cap [1,40], \, i \in \mathbb {N}\cap [1,3], \end{aligned}$$

where the spatial fields \(\eta _i: {\Omega }\rightarrow \mathbb {R}\) are Wendland functions \(\psi _{2,1}\) of radius 0.1 and center coordinates \((x_i,y_i) \in \{(0.1,0.7), (-0.1,-0.5), (0.5,0.1)\}\) (see Fig. 1), while, for each \(j\in \mathbb {N} \cap [1, 40]\), \(\nu _j: \mathcal {I} \rightarrow \mathbb {R}\) is a piecewise linear function supported over the interval \(\mathcal {I}_j :=[t_j-2\Delta t, t_j+2\Delta t]\), where \(t_j :=\Delta t (33+5j)\); \(\nu _j\) is assumed to be symmetric with respect to \(t_j\) and constant between \(t_j-\Delta t\) and \(t_j+\Delta t\).

Given this description of the observation process and the surrogate model, we next test the data assimilation scheme. We start with the estimation of the unknown parameter \(\mu ^\star = 0.04\) given the experimental measurements \(\textbf{y}(\mu ^\star , \varvec{\eta }) \in \mathbb {R}^{120}\), with noise \(\varvec{\eta } \sim \mathcal {N}(\textbf{0},\,\Sigma )\). We compare the performance of the EnKM employing a full order model and a surrogate model of accuracy \(\varepsilon _c = 10^{-3}\) with \(N_{\varepsilon }=42\). In order to obtain reliable statistics, we consider 25 ensembles \(\mathcal {E}_0\) of size \(J=150\) with particles sampled from the uniform prior distribution, \(\Pi _0(\mu ) = U(0.02, 0.10)\). The results obtained for a fixed value of \(\sigma ^2 = 10^{-6}\), at different iterations of the algorithm, are shown in Table 1. We observe a quick stabilization of the error means \(H_h\), \(H_\varepsilon \) and \(H_\varepsilon ^*\), and of the error standard deviations, \(S_h\), \(S_\varepsilon \) and \(S_\varepsilon ^*\), after just a few steps. The full order algorithm performs significantly better than the biased reduced basis algorithm, while the adjusted version of the algorithm exhibits an excellent performance, very close to the full order one.

The comparison of the ensemble standard deviation, reported in Table 2, with the average error, reported in Table 1, shows a positive correlation between the two quantities in the reference and the adjusted case. In contrast, the two quantities clearly decorrelate in the biased case as the iteration index increases. From this observation, we infer that, at least in this case, the ensemble covariance can be used as an error indicator when the reference or the adjusted algorithm is employed.

Table 1 Comparison of reference FE \((\,\cdot _h)\)—biased RB \((\,\cdot _\varepsilon )\)—adjusted RB \((\,\cdot _\varepsilon ^*)\) EnKM in low-noise conditions \(\sigma ^2=10^{-6}\). The test was performed by averaging 25 estimations obtained employing ensembles of 150 particles and using reduced basis models of size \(N_\varepsilon = 42\) (\(\varepsilon _c \approx 0.001\)). H refers to the mean of the estimation error, while S denotes the standard deviation of the estimation error. t.c. and o.c. indicate the total and online cost of one parameter estimation, respectively
Table 2 Same experimental conditions as in Table 1. E refers to the ensemble mean, \(\Sigma \) to the ensemble standard deviation. Both quantities are computed as the average over 25 ensembles

We next investigate the sensitivity of the algorithm with respect to the accuracy of the reduced model, the ensemble size, and the noise magnitude. First, we repeat the estimation of the reference parameter \(\mu ^\star = 0.04\) for different values of the ensemble size \(J = 4 k\), with \(k \in \mathbb {N}\cap [1, 10]\). In this experiment, we employ the same surrogate model used before and consider the relative noise magnitude \(\sigma / \Vert \mathcal {G}({\varvec{\mu }^\star })\Vert _\infty = 10^{-3}\). The results, shown in Fig. 4, indicate a larger sensitivity to J for the full order algorithm than for the other two: it requires a larger number of particles before settling on a large-ensemble asymptotic (mean-field) behavior, while the reduced basis algorithms converge much faster, possibly as a consequence of a lower-dimensional state space. Among the three iterations considered, the first appears to be the most affected, while, as the algorithm converges, the ensemble size seems to become less relevant.

Fig. 4

Relative error in the parameter estimation versus ensemble size J for fixed noise magnitude, \(\sigma = 10^{-3} \Vert \mathcal {G} ({\varvec{\mu }^\star })\Vert _\infty \). The standard full order EnKM is shown on the left, the biased RB-EnKM in the center, and the adjusted RB-EnKM on the right. The solid lines represent the average error over 64 ensembles, while the dashed lines correspond to the 10th and 90th percentiles

In a second experiment, we consider the same parameter estimation, but we let the relative noise \(\sigma /\Vert \mathcal {G}({\varvec{\mu }^\star })\Vert _\infty \) take values \(10^{-i}\) for \(i\in \mathbb {N}\cap [2,6]\). Moreover, we employ \(J=40\) particles per ensemble and the same reduced basis model as before. Each estimation is replicated 64 times for different noise realizations. The results are shown in Fig. 5: for the full order EnKM we observe a linear dependence of the reconstruction error on the experimental noise, while the results for the biased RB-EnKM show that an untreated model bias introduces a systematic error independent of the noise magnitude. The most important result concerns the adjusted RB-EnKM: its error behavior is comparable to the one obtained using a full order model. This demonstrates the effectiveness of the proposed method in compensating for the bias introduced by the reduced basis model, at least in this case study.

Fig. 5

Relative error in the parameter estimation versus relative noise magnitude \(\sigma /\Vert \mathcal {G} ({\varvec{\mu }^\star })\Vert _\infty \) for fixed ensemble size \(J = 40\). The standard full order EnKM is shown on the left, the biased RB-EnKM in the center, and the adjusted RB-EnKM on the right. The solid lines represent the average error over 64 ensembles, while the dashed lines correspond to the 10th and 90th percentiles

This conclusion is further confirmed by the last experiment, in which the performance of the biased and adjusted RB-EnKM is tested for all the parameters in the test set \(\Xi _{\text {TEST}}^{\mu }\) already employed to test the reduced basis model. Each parameter in the set is estimated using surrogate models of increasing size. Each estimation is performed 64 times in very low-noise conditions, that is \(\sigma /\Vert \mathcal {G}({\varvec{\mu }^\star })\Vert _\infty = 10^{-5}\), employing \(J=40\) particles per ensemble. For each surrogate model employed, the results from the 64 ensembles are averaged and the maximum over the test set is computed. The results, shown in Fig. 6, demonstrate that the correction compensates very well for the model bias. As a consequence, the worst-case reconstruction error for the adjusted RB-EnKM barely depends on the reduced model size and remains significantly lower than that of its biased counterpart. These results confirm the good performance of the adjusted RB-EnKM and its superiority over the biased RB-EnKM.

Fig. 6

Parameter error versus reduced basis size for the biased and the adjusted RB-EnKM

4.2 Tracer transport problem

We now consider the tracer transport problem from Conrad et al. (2018), describing the non-homogeneous and non-isotropic transport of a non-reactive tracer in an unconfined aquifer. We introduce the spatial domain \({\Omega }:=(0, 1)^2\) divided into six sub-regions \({\Omega }= \bigcup _{r=1}^6 {\Omega }_r\) illustrated in Fig. 7 and defined as follows: \((x,y) \in {\Omega }\) is in \({\Omega }_r\) if the subscript r is the smallest integer for which \(x_0^r< x < x_1^r\) and \(y_0^r< y < y_1^r\) where the points \(\{(x_0^r,y_0^r)\}_{r=1}^6\) and \(\{(x_1^r,y_1^r)\}_{r=1}^6\) are defined in Table 3. We denote by \(\partial {\Omega }\) the outer boundary of the domain and define the parallel walls \(\Gamma _D :=(0,1)\times \{0,1\}\) and \(\Gamma _N :=\partial {\Omega }{\setminus } \Gamma _D\). Based on this partition, we define the conductivity field as the piecewise constant function \(k(\varvec{\mu }): {\Omega }\rightarrow \mathbb {R}\) over the six sub-regions \({\Omega }_r\). The conductivity can be affinely decomposed employing the coefficient vector \(\varvec{\mu }\in \mathbb {R}^6\), with components \(\mu _r\), and the indicator functions \(\eta _r: {\Omega }\rightarrow \mathbb {R}\)

$$\begin{aligned} \begin{aligned} k(\textbf{x};\varvec{\mu }) = \sum _{r=1}^6 e^{\mu _r} \eta _r(\textbf{x}) \qquad \text {with} \,\, \eta _r(\textbf{x}) = {\left\{ \begin{array}{ll} 1 \,\, \text{ if } \quad \textbf{x}\in \Omega _r, \\ 0 \,\, \text{ if } \quad \textbf{x}\in \Omega {\setminus }\Omega _r. \end{array}\right. } \end{aligned} \end{aligned}$$
(30)
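As an illustration, the region lookup and the evaluation of (30) can be sketched as follows; the corner coordinates are passed in as arguments and stand in for the values reported in Table 3.

```python
import numpy as np

def region(x, y, corners):
    """Return the smallest r (1-based) whose open box contains (x, y).
    `corners` is a list of six tuples (x0, x1, y0, y1) as in Table 3."""
    for r, (x0, x1, y0, y1) in enumerate(corners, start=1):
        if x0 < x < x1 and y0 < y < y1:
            return r
    raise ValueError("point outside all sub-regions")

def conductivity(x, y, mu, corners):
    """Piecewise constant conductivity k(x; mu) = sum_r exp(mu_r) eta_r(x),
    cf. (30), with eta_r the indicator function of Omega_r."""
    return float(np.exp(mu[region(x, y, corners) - 1]))
```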

We can now estimate the hydraulic log-conductivity \(\varvec{\mu }\), restricted to the orthotope \(\mathcal {D}\subset \mathbb {R}^6\) whose bounds are reported in Table 3, relying on measurements of the tracer concentration \(c(\varvec{\mu })\) collected over the time interval \(\mathcal {I}:=\left( 0,T\right] \), with \(T=0.5\). This field satisfies the pPDE: find \(c(\varvec{\mu }): {\Omega }\times \mathcal {I} \rightarrow \mathbb {R}\) such that

$$\begin{aligned} \left\{ \begin{aligned}&\partial _t c - \nabla \cdot ( ( d_m \textbf{I} + d_l \varvec{\beta } \varvec{\beta }^\top (\varvec{\mu }) ) \nabla c ) + \varvec{\beta }(\varvec{\mu }) \cdot \nabla c = f_c, \qquad{} & {} \text{ in } \, {\Omega }\times \, \mathcal {I},\\&\nabla c(\textbf{x},t;\varvec{\mu })\cdot \textbf{n} = 0,{} & {} \text{ on } \, \partial {\Omega }\times \, \mathcal {I} ,\\&c(\textbf{x},0;\varvec{\mu }) = 0,{} & {} \text{ in } \, {\Omega }. \end{aligned} \right. \end{aligned}$$
(31)

In this equation, the dispersion coefficients \(d_l=d_m=2.5\cdot 10^{-3}\) correspond to the flow-dependent component of the dispersion tensor and to its residual component, respectively. The forcing term \(f_c\) is assumed to be of the form \(f_c :=\sum _{i=1}^{4} f_{c,i}\) and models the injection of different amounts of tracer in four wells located at \((a_i, b_i)\in \{ 0.15, 0.85 \}^2\); each \(f_{c,i}\) is a Gaussian function centered at \((a_i,b_i)\), with covariance \(\Gamma _c = 0.005\) and multiplicative coefficient \(p_i\), where \((p_1,p_2,p_3,p_4)=(10, 5, 10, 5)\). The velocity field \(\varvec{\beta } (\varvec{\mu }): {\Omega }\rightarrow \mathbb {R}^2\) depends linearly on the hydraulic head \(u(\varvec{\mu }): {\Omega }\rightarrow \mathbb {R}\) through the relation \(\varvec{\beta }(\varvec{\mu }) = -k(\varvec{\mu }) \nabla u\). The latter field must satisfy the second constraint of the inverse problem, i.e., under the Dupuit–Forchheimer approximation (Delleur 2016) it solves the non-linear elliptic pPDE: find \(u(\varvec{\mu }): {\Omega }\rightarrow \mathbb {R}\) such that

$$\begin{aligned} \left\{ \begin{aligned}&\nabla \cdot (k(\varvec{\mu }) u \nabla u) + f_u = 0, \qquad{} & {} \text{ in } \, {\Omega },\\&\nabla u(\textbf{x};\varvec{\mu }) \cdot \textbf{n} = 0,{} & {} \text{ on } \, \Gamma _N ,\\&u(\textbf{x};\varvec{\mu })=0,{} & {} \text{ on } \, \Gamma _D . \end{aligned} \right. \end{aligned}$$
(32)

Here, the forcing term \(f_u :=\sum _{i=1}^{4} f_{u,i}\) models active pumping at the four wells; each \(f_{u,i}\) is a Gaussian function centered at \((a_i, b_i)\), with covariance \(\Gamma _u = 0.02\) and coefficient \(q_i\), where \((q_1,q_2,q_3,q_4)=(10, 50, 150, 50)\). Due to the combination of the quadratic dependence on u and the zero boundary conditions, the equation always admits pairs of opposite solutions \(u^{+}, u^{-}\). However, in our study, we are only interested in the positive solution \(u^{+}(\varvec{\mu }): {\Omega }\rightarrow {\mathbb {R}^+}\).
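A short sketch of the two forcing terms may help fix notation; since the text does not specify the normalization of the Gaussian bumps or the ordering of the wells, the unnormalized form \(p_i \exp (-r^2/2\Gamma )\) and the ordering of the centers below are assumptions made only for illustration.

```python
import numpy as np

WELLS = [(0.15, 0.15), (0.15, 0.85), (0.85, 0.15), (0.85, 0.85)]  # ordering assumed
P = [10.0, 5.0, 10.0, 5.0]      # tracer injection coefficients p_i
Q = [10.0, 50.0, 150.0, 50.0]   # pumping coefficients q_i

def gaussian_source(x, y, coeffs, var):
    """Sum of Gaussian bumps centered at the wells, with assumed form
    coeff * exp(-((x - a)^2 + (y - b)^2) / (2 * var))."""
    return sum(c * np.exp(-((x - a) ** 2 + (y - b) ** 2) / (2.0 * var))
               for (a, b), c in zip(WELLS, coeffs))

def f_c(x, y):
    """Tracer injection term, with Gamma_c = 0.005."""
    return gaussian_source(x, y, P, 0.005)

def f_u(x, y):
    """Pumping term of the hydraulic head equation, with Gamma_u = 0.02."""
    return gaussian_source(x, y, Q, 0.02)
```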

Fig. 7

Domain of the tracer transport problem and injection wells

Fig. 8

Reference solutions at log-conductivity \({{\varvec{\mu }^\star }}:= [-0.75, -0.25, -0.5, \,1, -0.25, \,3]\). On the left: hydraulic head \(u_h({\varvec{\mu }^\star })\) and corresponding velocity field \(\varvec{\beta }_h:=- k({\varvec{\mu }^\star }) \nabla u_h\) (in red). On the right: tracer concentration \(c_h({\varvec{\mu }^\star })\), at time \(t=0.4\), and measurement wells (in red)

Table 3 On the left: coordinates of the corners of the sub-regions \(\Omega _r\). On the right: true values of the parameters \(\mu _r\) and boundaries of the uniform prior \(\Pi _0\)

Full order solutions are obtained via a finite element approximation, employing piecewise linear functions, \(\zeta _i: {\Omega }\rightarrow \mathbb {R}\), for \(i = 1, \ldots , N_h\), with \(N_{h} = 44,972\) degrees of freedom (mesh size \(h \approx 0.01\)). The discretization of the elliptic equation (32) results in a discrete non-linear problem, which is solved iteratively with a Newton scheme with tolerance \(10^{-6}\). The approximate solution, \(u_{h}\), is used to compute the velocity field, \(\varvec{\beta }_h(\varvec{\mu }):=- k(\varvec{\mu }) \nabla u_{h}\), which is piecewise constant with \(N_{h} - 1\) degrees of freedom. This is needed for the solution of the parabolic equation (31), whose discretization leads to a system of ordinary differential equations integrated over the time interval \(\mathcal {I}\) using the Crank–Nicolson scheme with uniform time step \(\Delta t = 0.01\). This is equivalent to performing a Petrov–Galerkin projection of Equation (31), analogously to what has been shown for Equation (25) in Sect. 4.1.

Each full order simulation is obtained employing a FreeFEM++ solver (Hecht 2012) and takes roughly 2 min to compute. Figure 8 shows the hydraulic head \(u_h({\varvec{\mu }^\star })\) and the corresponding velocity field \(\varvec{\beta }_h({\varvec{\mu }^\star })\) (on the left) and the tracer concentration field \(c_h(0.4;{\varvec{\mu }^\star })\) (on the right), both associated with the reference log-conductivity

$$\begin{aligned} {\varvec{\mu }^\star }= [-0.75, -0.25, -0.50, 1.00, -0.25, 3.00]^\top . \end{aligned}$$
(33)

The same reference log-conductivity is used as the true parameter for the data assimilation problem. Pointwise observations are collected at five successive times \(t_m \in \{0.1, 0.2, 0.3, 0.4, 0.5\}\), in 25 spatial locations \(\textbf{x}_{ij} = (x_i, y_j)\) such that \(x_i=0.1+0.2 i\) and \(y_j=0.1+0.2 j\) for \(i,j \in \{0, \ldots ,4\}\). This operation is encoded in the measurement operator \(\mathcal {L}: H^1({\Omega }) \rightarrow \mathbb {R}^{125}\). Each noise-free measurement is polluted with i.i.d. Gaussian noise with mean zero and standard deviation \(\sigma \), resulting in a noise covariance matrix \(\Sigma = \sigma ^2 \textbf{I}\).

In order to solve the inverse problem with surrogate models of different accuracy, various approximations of (32) and (31) must be produced. This requires the introduction of spatial basis functions \(\psi _i, \varphi _j: {\Omega }\rightarrow \mathbb {R}\), \(i \in \mathbb {N}\cap [1, N_\varepsilon ]\), \(j \in \mathbb {N}\cap [1, M_\varepsilon ]\), selected by applying the method of snapshots (POD) to the two sets of full order solutions,

$$\begin{aligned}\Theta ^u_\text {TRAIN} :=\{ u_h (\varvec{\mu }^{(s)}) \}_{s=1}^{S}\quad \text {and}\quad \Theta _\text {TRAIN}^{c} :=\{ c_h(t^{(z)};\varvec{\mu }^{(s)}) \}_{z,s=1}^{Z,S},\end{aligned}$$

with snapshot parameters \(\varvec{\mu }^{(s)}\), for all \(s \in \mathbb {N}\cap [1, S]\), and sampling times \(t^{(z)} = 0.01 z\) for all \(z \in \mathbb {N}\cap [1, Z]\), where \(S=2,000\) and \(Z=50\). The numbers of basis functions considered, \(N_\varepsilon , M_\varepsilon \in \mathbb {N}\), are those required to approximate the hydraulic head and the tracer concentration with relative accuracy \(\varepsilon _u, \varepsilon _c \in {\mathbb {R}^+}\), where

$$\begin{aligned} \varepsilon _u&:=\sup _{\varvec{\mu }\in \mathcal {D}} \frac{\Vert u_h(\varvec{\mu }) - u_\varepsilon (\varvec{\mu }) \Vert _{H^1({\Omega })}}{\Vert u_h(\varvec{\mu })\Vert _{H^1({\Omega })}}, \end{aligned}$$
(34)
$$\begin{aligned} \varepsilon _c&:=\sup _{\varvec{\mu }\in \mathcal {D}} \frac{\Vert c_h(\varvec{\mu }) - c_\varepsilon (\varvec{\mu }) \Vert _{L^2(\mathcal {I}, H^1({\Omega }))}}{\Vert c_h(\varvec{\mu })\Vert _{L^2(\mathcal {I}, H^1({\Omega }))}}. \end{aligned}$$
(35)
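In practice, the basis sizes can be read off the singular value decay of the snapshot matrices. A minimal sketch follows, using a Euclidean energy criterion as a simple proxy for the sup-type tolerances (34)–(35); the actual construction may employ problem-specific inner products (e.g., \(H^1(\Omega )\)), so this is an assumption made for illustration.

```python
import numpy as np

def pod_basis(snapshots, tol):
    """Method of snapshots via a thin SVD.

    snapshots : (n_dof, n_snap) matrix whose columns are full order solutions
    tol       : truncation tolerance; a relative energy criterion is used here
                as a simple proxy for the sup-type tolerances (34)-(35)
    Returns the first N left singular vectors, with N the smallest basis size
    meeting the criterion."""
    U, s, _ = np.linalg.svd(snapshots, full_matrices=False)
    energy = np.cumsum(s**2) / np.sum(s**2)
    N = int(np.searchsorted(energy, 1.0 - tol**2)) + 1
    return U[:, :N]
```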

Based on the first set of basis functions, the approximation space for the Galerkin projection of (32) is defined as \(\mathcal {U}_\varepsilon :=\text {span} \{ \psi _i \}_{i=1}^{N_\varepsilon }\). From the second set of basis functions, instead, the RB test space \(\mathcal {W}_\varepsilon :=\text {span} \{ \omega _n {\otimes } \varphi _i \}_{i,n=1}^{M_\varepsilon , N_t}\) and the RB trial space \(\mathcal {V}_\varepsilon :=\text {span} \{ \upsilon _n {\otimes } \varphi _i \}_{i,n=1}^{M_\varepsilon , N_t}\) are defined for the Petrov–Galerkin projection of (31). We seek reduced solutions of the form

$$\begin{aligned} {u_\varepsilon (\varvec{\mu })}&{= \sum _{i=1}^{N_\varepsilon } u_{i}(\varvec{\mu })\psi _i} \end{aligned}$$
(36)
$$\begin{aligned} {c_\varepsilon (\varvec{\mu })}&{= \sum _{j=1}^{M_\varepsilon } \sum _{n=1}^{N_t} c_{n,j}(\varvec{\mu }) \, \upsilon _n \, \varphi _j,} \end{aligned}$$
(37)

where the expansion coefficients \(c_{n,j}\) and \(u_i\), with \(i\in \mathbb {N}\cap [1, N_\varepsilon ]\), \(n \in \mathbb {N}\cap [1,N_t]\) and \(j \in \mathbb {N}\cap [1,M_\varepsilon ]\), satisfy the systems of algebraic equations

$$\begin{aligned} \sum _{p,q=1}^{N_\varepsilon , N_\varepsilon } \textbf{N}_{ipq} (\varvec{\mu }) u_p u_q&= f_i, \end{aligned}$$
(38)
$$\begin{aligned} \sum _{k=1}^{ M_\varepsilon } \left( \textbf{M}_{jk} + \frac{\Delta t}{2} \textbf{D}_{jk} (\textbf{u},\varvec{\mu }) \right) c_{n+1,k}&= \left( \textbf{M}_{jk} - \frac{\Delta t}{2} \textbf{D}_{jk} (\textbf{u},\varvec{\mu }) \right) c_{n,k} + g_j, \end{aligned}$$
(39)

given the initial conditions \(c_{0,j}=0\) for all \(j \in \mathbb {N}\cap [1, M_\varepsilon ]\). The scalar forcing terms \(f_i\), \(g_j\) are obtained by integrating their full order counterparts against the basis functions \(\psi _i\) and \(\varphi _j\), for all \(i \in \mathbb {N}\cap [1, N_\varepsilon ]\), \(j \in \mathbb {N}\cap [1, M_\varepsilon ]\)

$$\begin{aligned} f_i :=\int _\Omega f_u \psi _i d \Omega , \qquad g_j :=\Delta t \int _\Omega f_c \varphi _j d \Omega . \end{aligned}$$
(40)

The mass and stiffness matrices \(\textbf{M}, \textbf{K} \in \mathbb {R}^{\scriptscriptstyle M_\varepsilon \times M_\varepsilon }\) are defined as in (29), while the parameter dependent tensors \(\textbf{D}(\textbf{u},\varvec{\mu }) \in \mathbb {R}^{\scriptscriptstyle M_\varepsilon \times M_\varepsilon }\) and \(\textbf{N}(\varvec{\mu }) \in \mathbb {R}^{\scriptscriptstyle N_\varepsilon \times N_\varepsilon \times N_\varepsilon }\) depend affinely on the multidimensional arrays \(\textbf{A}\in \mathbb {R}^{\scriptscriptstyle 6 \times N_\varepsilon ^3}\), \(\textbf{B}\in \mathbb {R}^{\scriptscriptstyle 6 \times N_\varepsilon ^2 \times M_\varepsilon ^2}\), and \(\textbf{C}\in \mathbb {R}^{\scriptscriptstyle 6 \times N_\varepsilon \times M_\varepsilon ^2}\) defined as

$$\begin{aligned} \textbf{A}_{ipqr}&:=\int _\Omega \frac{\eta _r}{2} \left( \psi _p (\nabla \psi _q \cdot \nabla \psi _i) + \psi _q (\nabla \psi _p \cdot \nabla \psi _i) \right) d\Omega , \end{aligned}$$
(41)
$$\begin{aligned} \textbf{B}_{jkpqr}&:=\int _\Omega \eta _r (\nabla \varphi _j \cdot \nabla \psi _p)(\nabla \varphi _k \cdot \nabla \psi _q) d\Omega , \end{aligned}$$
(42)
$$\begin{aligned} \textbf{C}_{jksr}&:=\int _\Omega \eta _r (\nabla \varphi _j \cdot \nabla \psi _s) \varphi _k d\Omega . \end{aligned}$$
(43)

For a fixed value of the log-conductivity, \(\varvec{\mu }\), the tensors \(\textbf{N} (\varvec{\mu }) \) and \(\textbf{D} (\textbf{u}, \varvec{\mu }) \) can be assembled. The latter, however, requires the evaluation of the discrete hydraulic head \(\textbf{u}\). They are respectively defined as

$$\begin{aligned} \textbf{N}_{ipq} (\varvec{\mu })&:=\sum _{r=1}^{6} e^{\mu _r} \textbf{A}_{ipqr} \,, \end{aligned}$$
(44)
$$\begin{aligned} \textbf{D}_{jk} (\textbf{u}, \varvec{\mu })&:=d_m \textbf{K}_{jk} + d_l \sum _{p, q, r=1}^{N_\varepsilon , N_\varepsilon , 6} e^{2\mu _r} \textbf{B}_{jkpqr} u_p u_q + \sum _{s,r=1}^{N_\varepsilon , 6} e^{\mu _r} \textbf{C}_{jksr} u_s \,. \end{aligned}$$
(45)
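To illustrate how the online phase uses these quantities, the sketch below assembles \(\textbf{N}(\varvec{\mu })\) and \(\textbf{D}(\textbf{u},\varvec{\mu })\) from precomputed tensors, solves (38) with a Newton iteration, and marches (39) with Crank–Nicolson. The array layouts (indices ordered as in the subscripts of (41)–(43)), the initial guess, and the stopping criterion are illustrative assumptions.

```python
import numpy as np

def assemble_N(A, mu):
    """Reduced tensor N(mu)_{ipq} = sum_r exp(mu_r) A_{ipqr}, cf. (44)."""
    return np.einsum('ipqr,r->ipq', A, np.exp(mu))

def solve_head(A, f, mu, tol=1e-6, max_it=50):
    """Newton iteration for the reduced nonlinear system (38),
    sum_{p,q} N_{ipq}(mu) u_p u_q = f_i.  Since A is symmetrized in (p, q),
    cf. (41), the Jacobian is J_{ip} = 2 * sum_q N_{ipq} u_q."""
    N = assemble_N(A, mu)
    u = np.ones_like(f)                       # illustrative initial guess
    for _ in range(max_it):
        res = np.einsum('ipq,p,q->i', N, u, u) - f
        if np.linalg.norm(res) < tol:
            break
        J = 2.0 * np.einsum('ipq,q->ip', N, u)
        u = u - np.linalg.solve(J, res)
    return u

def assemble_D(K, B, C, u, mu, d_m, d_l):
    """Reduced operator D(u, mu), cf. (45)."""
    e, e2 = np.exp(mu), np.exp(2.0 * mu)
    return (d_m * K
            + d_l * np.einsum('jkpqr,p,q,r->jk', B, u, u, e2)
            + np.einsum('jksr,s,r->jk', C, u, e))

def solve_tracer(M, D, g, n_steps, dt):
    """Crank-Nicolson march for (39), starting from c_0 = 0; the reduced
    forcing g already contains the factor Delta t, cf. (40)."""
    lhs = M + 0.5 * dt * D
    rhs = M - 0.5 * dt * D
    c = np.zeros(M.shape[0])
    history = [c]
    for _ in range(n_steps):
        c = np.linalg.solve(lhs, rhs @ c + g)
        history.append(c)
    return np.array(history)
```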

We emphasize that the accuracy of the solutions of (38) and (39), with the latter equivalent to a Crank–Nicolson discretization, depends on the number of basis functions and on the time step \(\Delta t\). In Fig. 9, we show, on the left and on the right, the maximum relative errors of the surrogate model (\({\varepsilon }_c\) and \({\varepsilon }_u\), respectively) as functions of \(M_\varepsilon \) and \(N_\varepsilon \). In the center, we show the \(L^\infty ( \mathcal {I}; L^\infty ( {\Omega }))\) relative error of the tracer concentration, which bounds from above the error on the synthetic measurements. We compute these maximum relative errors on a set of parameters \( \Xi _\text {TEST}^{\varvec{\mu }} :=\{ \varvec{\mu }^{(s)} \sim \Pi _0(\varvec{\mu }) \}_{s=1}^{500}\) independent of the ones used for the model training. It can be observed that, for small values of \(N_{\varepsilon }\), the error in the concentration stagnates beyond a certain value of \(M_{\varepsilon }\), suggesting that, in this regime, the error is dominated by the approximation of the hydraulic head. For \(N_\varepsilon =40\), however, this effect is no longer present, at least for the values of \(M_\varepsilon \) considered, and the tracer error depends only on \(M_{\varepsilon }\). This allows us to control the accuracy of the surrogate model by varying \(M_\varepsilon \) alone.

Fig. 9

Left and center: maximum relative error of the solution and of the projection of the tracer concentration versus \(M_\varepsilon \) for different values of \(N_\varepsilon \); projection—in space—performed with respect to the \(H^1(\Omega )\) inner product. Right: maximum relative error of the projection and of the solution of the hydraulic head versus \(N_\varepsilon \). Error norm shown above each plot

The construction of the reduced model has an offline cost of about 75 h. This includes the time required for the construction of a training set of 2,000 full order solutions \((61\text {h}\, 16'\, 40'')\), the time for the computation of the POD basis functions \((23' \, 40'')\), and the time for assembling the RB model tensors \((13\text {h}\, 32'\, 47'')\). This cost corresponds roughly to the computational cost of 2,500 finite element solutions, each of which takes approximately 110 s. The surrogate model obtained employing \(N_\varepsilon = 40\), \(M_\varepsilon = 320\) basis functions produces a solution in only 1.25 s (online cost), about 1/90 of the cost of its full order equivalent. The same training set used for the POD is employed to estimate, at negligible cost, the empirical moments of \(\varvec{\delta }^\star _\varepsilon \), i.e., \(\overline{\varvec{\delta }}_\varepsilon \) and \(\varvec{\Gamma }_\varepsilon \).
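A possible implementation of this last step, assuming the measurement operator has already been applied to both the full order and the reduced training solutions (the variable names are illustrative), is the following:

```python
import numpy as np

def bias_moments(meas_full, meas_rb):
    """Empirical mean and covariance of the RB measurement discrepancy
    delta = L(c_h) - L(c_eps) over the training set.

    meas_full, meas_rb : (n_train, n_meas) arrays of measured full order and
    reduced order outputs for the same training parameters."""
    delta = meas_full - meas_rb
    delta_bar = delta.mean(axis=0)            # empirical mean of delta
    Gamma = np.cov(delta, rowvar=False)       # empirical covariance of delta
    return delta_bar, Gamma
```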

Note that we admittedly used a “brute force” POD approach to generate the basis for this nonlinear problem. A more offline-efficient method, for example using a POD-Greedy algorithm, could have been used. However, this would have been more complex in terms of both theory and implementation of the reduced model, and is beyond the scope of this paper. Our focus here is on the EnKM and its modification in settings where surrogate models are used.

We now turn our attention to the inverse problem, as discussed in Sect. 2.1. We start by considering the estimation of the reference parameter \({\varvec{\mu }^\star }\) given the measurements \(\textbf{y}({\varvec{\mu }^\star }, \varvec{\eta }) \in \mathbb {R}^{125}\), polluted by experimental noise of magnitude \(\sigma \). To obtain reliable statistics, we consider 32 independent initial ensembles \(\mathcal {E}_{0}\) of variable size, sampled from the same distribution \(\Pi _0\).

As a first experiment, we compare the performance of the two RB-EnKM variants employing \(J=160\) particles and a surrogate model with error tolerance \(\varepsilon _c \approx 0.02\) (obtained with \(N_\varepsilon = 40\) and \(M_\varepsilon = 320\)). The first test relies on the biased version of the RB-EnKM, as presented in Sect. 2.1, while the second test corresponds to the adjusted algorithm. For both simulations, we consider low-amplitude experimental noise, i.e., negligible compared to the model error, \(\sup _{\varvec{\mu }} \Vert \mathcal {L} (c_h(\varvec{\mu }) - c_\varepsilon (\varvec{\mu })) \Vert _\infty \approx 10^{-2} > 10^{-3} = \sigma \), and we pollute the measurements independently for each ensemble. In Table 4, we report the average properties of the ensembles after 4 iterations: columns \(E_\varepsilon \) and \(E_\varepsilon ^*\) contain the mean parameter estimation, i.e., the particle mean, averaged over the 32 ensembles, while columns \(\Sigma _\varepsilon \) and \(\Sigma _\varepsilon ^*\) contain the average standard deviation of the ensembles. We observe that the correction term significantly lowers the reconstruction error, from \(\Vert E_\varepsilon - {\varvec{\mu }^\star }\Vert _\infty = 6.437\)e-3 to \(\Vert E_\varepsilon ^* - {\varvec{\mu }^\star }\Vert _\infty = 7.870\)e-4. We also notice that the variability of the estimate increases, consistent with the presence of an additional term in the Kalman gain. At least in this test, and contrary to the previous numerical experiment, the standard deviation of the ensemble shows a good correlation with the reconstruction error for both the biased and the adjusted algorithm.

We note, for this example, that the total cost of estimating a single parameter with the biased or the adjusted RB-EnKM is higher than with the FO-EnKM (approx. \(75\text {h}\, 28'\) for the former and \(19\text {h}\, 37'\) for the latter). This is due to the six-dimensional parameter space and the fact that the parameters are not highly correlated. We thus require a fairly large training set of size \({2,\!000}\) to obtain a sufficiently accurate reduced order model over the whole parameter space. In combination with the “brute force” POD approach mentioned above, the offline cost is thus considerable. However, if one is interested in multiple parameter estimations, e.g., due to new data becoming available, the RB-EnKM algorithm significantly outperforms the FO-EnKM in terms of computational runtime, since the online phase requires only \(13'\, 32''\). For example, repeating the parameter estimation 32 times in order to obtain a better statistical characterization of the method takes about \(82\text {h}\) using the reduced basis method, but it would require more than \(627\text {h}\) using the FO-EnKM.

Table 4 Comparison of biased RB \((\,\cdot _\varepsilon )\)—adjusted RB \((\,\cdot _\varepsilon ^*)\) EnKM in low-noise conditions \(\sigma = 10^{-6}\). The test was performed by averaging 32 estimations obtained employing ensembles of 160 particles and 4 iterations, and using reduced basis models of size \(N_\varepsilon =40\), \(M_\varepsilon =320\) (\(\varepsilon _c \approx 0.02\)). E refers to the average parameter estimation, \(\Sigma \) denotes the average ensemble standard deviation, and H the average estimation error. t.c. and o.c. indicate the total and online cost of one parameter estimation, respectively

As an extension of the previous experiment, we estimate the reference parameter \({\varvec{\mu }^\star }\) employing the same surrogate model, noise magnitude and number of ensembles as before, but using ensembles of variable size \(J=20k\), with \(k \in \mathbb {N}\cap [2,16]\). This allows us to study the effect of the ensemble size on the parameter estimation obtained with the biased and adjusted RB-EnKM algorithms. The results shown in Fig. 10 indicate that, for both algorithms, very small ensembles lead to large relative errors and entail a large variability among the different samples. This behavior is relevant only for ensembles with fewer than 40 particles when the biased RB-EnKM is employed, and fewer than 80 particles when the adjusted version is used. Larger ensembles do not exhibit relevant fluctuations; we can therefore assume an ensemble of size \(J=160\) to be large enough for the results of the upcoming tests to be independent of this quantity.

Fig. 10

Biased and adjusted RB-EnKM parameter estimation relative error versus ensemble size, for fixed noise magnitude \(\sigma = 10^{-3}\). The solid lines represent the average error over 32 ensembles at different algorithm iterations. The dashed lines represent the 10th and the 90th percentiles

A key quantity determining the performance of the method is the noise magnitude. Its effect on the two reduced basis algorithms is investigated by studying how the relative estimation error of the reference parameter \({\varvec{\mu }^\star }\) changes with the noise magnitude. To this end, we consider seven noise values, \(\sigma ^2 = 10^{-m}\) with \(m \in \mathbb {N}\cap [1, 7]\). We employ the same RB-EnKM used before, with a fixed ensemble size \(J=160\), and we average the results over 32 independent ensembles. The results, shown in Fig. 11, reiterate the inadequacy of the biased method in dealing with the systematic bias introduced in the measurements by the surrogate model: the corresponding plot shows error stagnation at low noise. On the contrary, the plot corresponding to the adjusted method shows a mitigation of this effect, with an estimation error that keeps decreasing in low-noise conditions, although at a lower rate than in high-noise conditions.

Fig. 11

Biased and adjusted RB-EnKM parameter estimation relative error versus absolute noise magnitude, for fixed ensemble size \(J=160\). The solid lines represent the average error over 32 ensembles at different algorithm iterations. The dashed lines represent the 10th and the 90th percentiles

In our last experiment, we test the performance of the biased and adjusted RB-EnKM by employing surrogate models of increasing accuracy. We fix the size of the reduced space \(\mathcal {U}_\varepsilon \) to a sufficiently large value, \(N_\varepsilon = 40\), and we vary the size of the approximation space associated with the concentration: \(M_\varepsilon =10k\), with \(k \in \mathbb {N}\cap [2, 32]\). Employing the resulting approximate models, we estimate the reference parameter \({\varvec{\mu }^\star }\) in low-noise conditions, \(\sigma = 10^{-3}\), averaging the results obtained over 16 ensembles of 160 particles each. In Fig. 12, we show the final relative error (after three algorithm iterations) as \(M_\varepsilon \) and \(\varepsilon _c\) change, both for the biased and the adjusted RB-EnKM. For both, we observe that the relative estimation error decreases, almost linearly, with the error of the surrogate model. Moreover, we observe that, with few exceptions, the error of the adjusted algorithm is smaller than the error of the biased algorithm. The few points where the two errors are very close can be explained by a strongly unbalanced distribution of the measurement bias in a region away from the reference parameter. Future developments that adapt the bias correction to the current parameter estimate during the execution of the algorithm should dampen this effect.

Fig. 12

Parameter error versus reduced basis size and maximum relative error of the solution. The solid lines represent the average error over 16 ensembles at different algorithm iterations. The dashed lines represent the 10th and the 90th percentiles

5 Conclusions

We proposed an efficient, gradient-free iterative solution method for inverse problems that combines model order reduction techniques, via the reduced basis method, with the ensemble Kalman method introduced in Iglesias et al. (2013). The use of surrogate models allows a significant reduction of the online computational cost, but it distorts the cost function optimized in the inverse problem. This in turn introduces a systematic error in the approximate solution of the inverse problem. To overcome this limitation, we have proposed the adjusted RB-EnKM, which corrects for this bias by systematically adjusting the cost function and thus recovers good convergence.

Using a linear Taylor–Green vortex problem, the performance of the method was compared with that of the full order model as well as with the biased RB-EnKM, in which no adjustment is made. The numerical results show that the biased method fails to achieve the same accuracy as the full order method. In contrast, the adjusted RB-EnKM attains the same accuracy as its full order counterpart for a large range of noise magnitudes at a significantly lower computational cost, and even approaches the mean-field limit faster as the ensemble size is increased. Furthermore, the dependence of the reconstruction error on the model accuracy is essentially removed over the range of model accuracy considered.

The method was then applied to a non-linear tracer transport problem. The results for this example show that, despite a decrease in the order of convergence at low noise, the stagnation of the reconstruction error observed in the biased RB-EnKM can be removed by adjusting the algorithm. Regarding the model accuracy, a substantial improvement of the adjusted EnKM with respect to the biased EnKM was observed, although less pronounced than in the linear problem.

Overall, our numerical tests show that the proposed method allows for the use of inexpensive surrogate models while empirically ensuring that the predicted result of the inversion remains accurate with respect to the full order inversion.

Although the online computational cost is significantly lower than for the reference full order method, we note that, depending on the problem at hand and the implementation, the offline cost can be considerable. As a result, the overall cost (offline plus online parameter inversion) for solving a single inversion problem may not be competitive with a plain full order inversion, as observed in the second case study. However, if we consider the solution of multiple inverse problems, either due to new data being analyzed or in order to obtain better statistics, the method becomes competitive also for the second numerical experiment considered.