1 Introduction

Numerical flow and transport models require knowledge of parameters such as hydraulic transmissivity and porosity. Because these parameters describe natural conditions in the subsurface, they are subject to inherent uncertainty, which in turn leads to uncertain flow and transport predictions. Conditioning to measurement data by inverse modeling techniques is therefore frequently applied to reduce these uncertainties.

In general, inverse modeling refers to the process of using the actual results of some measurements to infer the values of the parameters that characterize the system of interest (Tarantola 2005). A large number of different methods, ranging from manual calibration to sophisticated numerical procedures, are available in the literature. A review of several methods is given in Zhou et al. (2014). The complexity of inverse problems seldom allows an exact solution. Instead, solutions that are close to the observations are sought. For that reason, inverse problems are usually formulated as optimization problems (Hu 2000; Caers 2003; Gómez-Hernández et al. 1997). The corresponding objective functions are related to the observations as well as to the spatial structure of the required aquifer properties. As the unknown field usually contains a large number of points, a straightforward optimization is computationally infeasible. Hence, a reduction in the dimensions of the optimization problem is required.

This paper presents a novel methodology for inverse modeling of groundwater flow and transport problems. The goal of the methodology is to produce realizations that represent: the observed spatial variability of the field, the observed hydraulic transmissivity values, the observed hydraulic head values, as well as the observed contaminant concentration values. It is based on random mixing of spatial random fields and represents an extension of the gradual deformation approach described in Hu (2000). The spatial field of interest is derived from a linear combination of independent random fields, where the corresponding weights have to be selected such that certain linear constraints are fulfilled. If those linear constraints are satisfied, a vector space including an infinite number of solutions can be defined. Nonlinear constraints can then be incorporated via optimization inside the vector space. Furthermore, this new technique generates multiple solutions to the inverse problem; hence, it provides a reasonable representation of the uncertainty of the unknown fields.

This paper is divided into seven sections. After the introduction, the basic methodology is presented in Sect. 2. Section 3 describes possible extensions to the basic approach, i.e., the use of spatial copulas and the combination with multiple point statistics (Strebelle 2002). In Sect. 4, the numerical methodology is described. In Sect. 5, the presented approach is illustrated using two artificial examples, and conclusions are drawn in Sect. 6. Section 7 gives an outlook on further extensions.

2 Theory

In general, the goals of inverse modeling for groundwater flow and transport problems are:

  1. To find a field W(x) for \(x\in D\), with x denoting a point in the domain of interest D, that reflects the observed spatial variability.

  2. To honor all observed hydraulic transmissivities in the field W(x) such that \(W(x_{\kappa })=w_{\kappa }\) for \(\kappa =1,\ldots , K \).

  3. To obtain a time-dependent head field \(H_W(x,t)\), calculated from the corresponding differential equation for the field W(x), that fulfills the observations \(H_W(x_{\eta },t) = h_{\eta ,t}\) for \(\eta =1,\ldots ,H\) and \(t=1, \ldots , T\).

  4. To obtain a time-dependent concentration field \(C_W(x,t)\), calculated from the corresponding differential equation for the field W(x), that fulfills the observations \(C_W(x_{\rho },t) = c_{\rho ,t}\) for \(\rho =1, \ldots , P\) and \(t=1, \ldots , T\).

Inverse modeling is usually an ill-posed problem: Either there is no solution (contradicting constraints) or there are infinitely many solutions (Tarantola 2005). Finding solutions that fulfill the first two conditions is not extremely difficult. The simultaneous fulfillment of three or all four conditions, however, requires specific effort. Additionally, specific properties of the distribution, such as nonnegativity of the values, cause further problems (Michalak 2008).

In groundwater modeling, it is often assumed that the transmissivities follow a lognormal spatial distribution. Let W(x) be the unknown hydraulic transmissivity field with x being the location. Introducing:

$$\begin{aligned} Z(x)=\frac{\log W(x) - E[\log W(x)]}{D[\log W(x)]} \end{aligned}$$
(1)

if the expectation E and the standard deviation D of the log-transmissivity fields are known, the task is to identify the normal field Z(x). It is generally assumed that this field is stationary with a known covariance matrix \(\varGamma \). By definition, \(E[Z(x)]=0\) and \(\mathrm{Var}[Z(x)]=1\).
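The transformation in Eq. (1) and its inverse can be sketched in a few lines. This is only an illustration: the function names, the sample values, and the use of the natural logarithm are assumptions made here, not part of the study.

```python
import math

def standardize_log_transmissivity(w, mean_log, sd_log):
    """Eq. (1): map transmissivities W(x) to standardized values Z(x)."""
    return [(math.log(v) - mean_log) / sd_log for v in w]

def back_transform(z, mean_log, sd_log):
    """Inverse of Eq. (1): recover W(x) from Z(x)."""
    return [math.exp(mean_log + sd_log * v) for v in z]

# Hypothetical transmissivity values and assumed moments of log W
w_values = [10.0, 45.0, 120.0]
mean_log, sd_log = math.log(45.0), 1.0

z_values = standardize_log_transmissivity(w_values, mean_log, sd_log)
w_back = back_transform(z_values, mean_log, sd_log)
```

Any field Z(x) produced by the conditioning procedure can be mapped back to a transmissivity field with the inverse function.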

The observations restrict the possible field Z(x). Note that these conditions are partly location specific (transmissivity observations) and partly dependent on the partial differential equations relating the transmissivity to the heads and concentrations. The conditions are partly linear and partly nonlinear. Their treatment is described in the next sections.

2.1 Linear Conditions

The presented methodology can be regarded as a stepwise procedure. The first step is to honor the first two goals, i.e., to identify a spatial field reflecting the prescribed spatial dependence structure (expressed by the covariance matrix \(\varGamma \)) with the prescribed values at the hydraulic transmissivity observation locations \(x_{\kappa }\). Note, however, that the presented approach is very general and can also be applied to model variables other than those discussed in this paper.

Following the idea presented in Hu (2000), the spatial random field of interest Z(x) is expressed as a linear combination of n independent random fields \(Y_i(x)\):

$$\begin{aligned} Z(x) = \sum _{i=1}^n \alpha _i Y_i(x) \end{aligned}$$
(2)

with:

$$\begin{aligned} E[Z(x)]= & {} E[Y_i(x)]=0 \end{aligned}$$
(3)
$$\begin{aligned} \mathrm{Var}[Z(x)]= & {} \mathrm{Var}[Y_i(x)]=1 \end{aligned}$$
(4)

and:

$$\begin{aligned} \varGamma _{Z(x)} = \varGamma _{Y_i(x)} \end{aligned}$$
(5)

Such independent random fields \(Y_i(x)\) can be simulated using different methods such as fast Fourier transformation for regular grids (Wood and Chan 1994; Wood 1995; Le Ravalec et al. 2000), turning band simulation (Journel 1974), or the Cholesky transformation of the covariance matrix.

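The covariance of a Gaussian random field fully describes its spatial variability. Because the fields \(Y_i(x)\) are independent and have zero mean, all cross terms \(E[Y_i(x_j) Y_l(x_k)]\) with \(i \ne l\) vanish, and the covariance of the linear combination can be calculated as: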

$$\begin{aligned} \text{ Cov } [Z(x_j), Z(x_k) ]= & {} \text{ Cov } \left[ \sum _{i=1}^n \alpha _i Y_i(x_j),\sum _{i=1}^n\alpha _i Y_i(x_k) \right] \nonumber \\= & {} E \left[ \left( \sum _{i=1}^n \alpha _i Y_i(x_j) \right) \left( \sum _{i=1}^n \alpha _i Y_i(x_k) \right) \right] \nonumber \\= & {} E \left[ \sum _{i=1}^n \alpha ^2_i Y_i(x_j) Y_i(x_k) \right] = \sum _{i=1}^n \alpha ^2_i \text{ Cov } [ Y_i(x_j), Y_i(x_k) ] \end{aligned}$$
(6)

with \(x_j,x_k \in D\). If all \(Y_i(x)\) share the same covariance matrix \(\varGamma _{Y_i(x)}\) and if:

$$\begin{aligned} \sum _{i=1}^n \alpha _i^2 =1 \end{aligned}$$
(7)

then Eq. (5) is fulfilled, i.e., Z(x) exhibits the same covariance structure as all the fields \(Y_i(x)\).
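The effect of the normalization in Eq. (7) can be checked numerically. The sketch below is a simplified illustration: it uses uncorrelated standard normal values as stand-ins for the fields \(Y_i(x)\) (so no spatial structure is simulated), and the raw weights, field count, and cell count are arbitrary assumptions.

```python
import math
import random

def normalize_weights(raw):
    """Scale raw weights so that sum(alpha_i^2) = 1, as required by Eq. (7)."""
    norm = math.sqrt(sum(a * a for a in raw))
    return [a / norm for a in raw]

def mix_fields(alphas, fields):
    """Eq. (2): Z(x) = sum_i alpha_i * Y_i(x), evaluated cell by cell."""
    n_cells = len(fields[0])
    return [sum(a * y[c] for a, y in zip(alphas, fields)) for c in range(n_cells)]

rng = random.Random(42)
n_fields, n_cells = 5, 20000
# Stand-in fields: independent N(0, 1) values without spatial correlation
fields = [[rng.gauss(0.0, 1.0) for _ in range(n_cells)] for _ in range(n_fields)]

alphas = normalize_weights([0.3, -1.2, 0.8, 2.0, -0.5])
z = mix_fields(alphas, fields)

# Empirical mean and variance of the mixed field
mean_z = sum(z) / n_cells
var_z = sum((v - mean_z) ** 2 for v in z) / n_cells
```

With unit-norm weights the empirical variance of the mixed values stays close to 1; without the normalization it would scale with the sum of the squared weights.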

The conditional field Z(x) should honor all observed values at the observation locations \(x_{\kappa }\):

$$\begin{aligned} Z(x_{\kappa }) = z_{\kappa } \quad \kappa = 1, \ldots , K \end{aligned}$$
(8)

In Hu (2000) and Hu et al. (2001), linear data are incorporated using conditioning kriging; in Hu (2002), the methodology is extended to combine dependent conditional realizations. The methodology proposed here incorporates any linear constraint directly. The weights \(\alpha _i\) for the n independent realizations of unconditional fields \(Y_i(x)\) have to be selected so that:

$$\begin{aligned} \sum _{i=1}^n \alpha _i Y_i(x_{\kappa }) = z_{\kappa } \quad \kappa = 1, \ldots , K \end{aligned}$$
(9)

For a sufficiently large number n of fields \(Y_i(x)\), there are weights \(\alpha _i\) that fulfill Eq. (9). However, these weights do not necessarily fulfill Eq. (7).

Hence, the next step is to identify weights that fulfill both Eqs. (9) and (7). If the dimension n is large enough, Eq. (9) has an infinite number of solutions. These solutions form a hypersurface in the n-dimensional space of the weights \((\alpha _1,\ldots ,\alpha _n)\). As a first step, weights \((\alpha _1,\ldots ,\alpha _n)\) fulfilling Eq. (9) and:

$$\begin{aligned} \sum _{i=1}^n \alpha _i^2 \ll 1 \end{aligned}$$
(10)

are identified. In order to find such a set of weights, the problem is reformulated as an optimization problem:

$$\begin{aligned} A=\sum _{i=1}^n \alpha _i^2 \rightarrow \min \end{aligned}$$
(11)

subject to the constraints defined in Eq. (9). This is a quadratic optimization problem and can, for example, be solved using quadratic programming (Boyd and Vandenberghe 2004). If the minimum is larger than 1, then by taking an additional field \(Y_i(x)\) (increasing the number of unconditional fields n) the squared sum in Eq. (11) can be reduced until it is below 1. The resulting field:

$$\begin{aligned} Z^*(x) = \sum _{i=1}^n \alpha _i Y_i(x) \end{aligned}$$
(12)

can be considered as a quasi interpolation as it has a much lower variance than the target field, i.e., it is smoother than Z(x) as \(\sum \nolimits _{i=1}^n \alpha _i^2 \ll 1\). However, in order to preserve the desired spatial structure Eq. (7) has to be fulfilled. Thus, a second component has to be added to \(Z^*(x)\) to obtain an appropriate simulated field Z(x). If fields \(U_m(x)\) fulfill:

$$\begin{aligned} U_m(x_{\kappa }) = 0 \quad \kappa = 1, \ldots , K \quad m = 1, \ldots , L-K \end{aligned}$$
(13)

then any linear combination of these fields \(U_m(x)\) also fulfills Eq. (13). The fields \(U_m(x)\) thus form a vector space with an infinite number of solutions. Any field of this vector space can be added to \(Z^*(x)\), and the sum will fulfill the linear conditions defined in Eq. (8). The fields \(U_m(x)\) are also formed as linear combinations of L (\(L > K\)) independent random fields:

$$\begin{aligned} U_m(x) = \sum _{l=1}^L \beta _{l,m} V_l(x) \quad m = 1, \ldots , L-K \end{aligned}$$
(14)

with \(\beta _{l,m}\) denoting the weights and \(V_l(x)\) denoting independent random fields sharing the same spatial properties as the \(Y_i(x)\)-s. The weights \(\beta _{l,m}\) can be obtained sequentially by solving the equations:

$$\begin{aligned} \sum _{l=1}^K \beta _{l,m} V_l(x_{\kappa }) = V_{K+m}(x_{\kappa }) \quad \kappa = 1, \ldots , K \quad m = 1, \ldots , L-K \end{aligned}$$
(15)

and setting the weights obtained from Eq. (15) for \(l \le K\) and:

$$\begin{aligned} \beta _{l,m} = \left\{ \begin{array}{ll} -1 &{}\quad \text{ if }\;\; l=K+m \\ 0 &{} \quad \text{ if }\;\; l>K\quad \text{ and } \;\; l \ne K+m \\ \end{array} \right. \end{aligned}$$
(16)
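The two weight-finding steps above, the minimum-norm problem of Eq. (11) and the homogeneous weights of Eqs. (15) and (16), only involve field values at the observation locations. The following sketch illustrates both with small dense linear solves. It is an assumption-laden illustration: the closed form \(\alpha = Y^T (Y Y^T)^{-1} z\) from the Lagrange conditions is used here instead of a quadratic programming solver, the helper names are hypothetical, and the field values are random stand-ins.

```python
import random

def solve_linear(a, b):
    """Solve the square system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented copy
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x

def min_norm_weights(y_obs, z_obs):
    """Minimize sum(alpha_i^2) subject to Eq. (9); y_obs[k][i] holds Y_i(x_k)."""
    k_obs, n = len(y_obs), len(y_obs[0])
    gram = [[sum(y_obs[p][i] * y_obs[q][i] for i in range(n)) for q in range(k_obs)]
            for p in range(k_obs)]
    mu = solve_linear(gram, z_obs)  # Lagrange multipliers
    return [sum(y_obs[k][i] * mu[k] for k in range(k_obs)) for i in range(n)]

def homogeneous_weights(v_obs, k_obs):
    """Weights beta[l][m] from Eqs. (15) and (16); v_obs[l][k] holds V_l(x_k)."""
    n_fields = len(v_obs)
    beta = [[0.0] * (n_fields - k_obs) for _ in range(n_fields)]
    matrix = [[v_obs[l][k] for l in range(k_obs)] for k in range(k_obs)]
    for m in range(n_fields - k_obs):
        rhs = [v_obs[k_obs + m][k] for k in range(k_obs)]
        coeffs = solve_linear(matrix, rhs)      # Eq. (15)
        for l in range(k_obs):
            beta[l][m] = coeffs[l]
        beta[k_obs + m][m] = -1.0               # Eq. (16)
    return beta

rng = random.Random(0)
K, n, L = 3, 40, 6                 # hypothetical problem sizes
z_obs = [0.5, -1.0, 1.5]           # hypothetical conditioning values z_kappa
y_obs = [[rng.gauss(0.0, 1.0) for _ in range(n)] for _ in range(K)]
v_obs = [[rng.gauss(0.0, 1.0) for _ in range(K)] for _ in range(L)]

alphas = min_norm_weights(y_obs, z_obs)
A = sum(a * a for a in alphas)     # objective value of Eq. (11)
z_star_obs = [sum(y_obs[k][i] * alphas[i] for i in range(n)) for k in range(K)]

beta = homogeneous_weights(v_obs, K)
u_obs = [[sum(beta[l][m] * v_obs[l][k] for l in range(L)) for k in range(K)]
         for m in range(L - K)]    # U_m at the observation locations
```

In this sketch `z_star_obs` reproduces the conditioning values and every entry of `u_obs` is zero up to rounding, so adding any combination of the homogeneous fields leaves the linear conditions untouched. In practice a linear algebra library would replace the hand-written solver.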

Thus, all fields of the form:

$$\begin{aligned} Z(x)&= Z^*(x) + k(\lambda ) \sum _{m=1}^{L-K} \lambda _m U_m(x)\nonumber \\&= Z^*(x) + k(\lambda ) \sum _{l=1}^{L} \left( \sum _{m=1}^{L-K} \beta _{l,m} \lambda _m \right) V_l(x) \end{aligned}$$
(17)

fulfill the linear conditions defined in Eq. (8). \(\lambda _m\) denotes arbitrary weights and \(k(\lambda )\) denotes a normalizing constant that is a function of the weights \(\lambda _m\):

$$\begin{aligned} k(\lambda ) = \pm \sqrt{\frac{1 - \sum _{i=1}^n \alpha _i^2}{\sum _{l=1}^{L} \left( \sum _{m=1}^{L-K} \beta _{l,m} \lambda _m \right) ^2}} \end{aligned}$$
(18)

in order to fulfill Eq. (7), i.e., to obtain the desired spatial covariance structure. Compared with other linear conditioning methods, this construction has the specific advantage that it can generate an infinite number of conditional fields, one for each choice of the weights \(\lambda _m\). This property is what makes the presented methodology applicable to nonlinear conditioning constraints and thus to inverse problems.
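The role of the normalizing constant can be illustrated numerically. The homogeneous part is a combination of the independent fields \(V_l(x)\) with combined weights \(\sum _m \beta _{l,m} \lambda _m\), so \(k(\lambda )\) is chosen such that the squared weights of all independent fields sum to 1. In the sketch below the values of the weights are hypothetical and only the positive root is taken.

```python
import math

def normalizing_constant(alphas, beta, lambdas):
    """Return k(lambda) (positive root) and the combined weight of each V_l."""
    coeffs = [sum(row[m] * lambdas[m] for m in range(len(lambdas))) for row in beta]
    remaining = 1.0 - sum(a * a for a in alphas)   # variance left for the homogeneous part
    if remaining < 0.0:
        raise ValueError("sum of squared alpha weights exceeds 1")
    k = math.sqrt(remaining / sum(c * c for c in coeffs))
    return k, coeffs

# Hypothetical weights: three alpha values, L = 4 fields V_l, K = 2 observations
alphas = [0.3, -0.2, 0.4]
beta = [[0.5, -1.0],
        [1.5, 0.25],
        [-1.0, 0.0],
        [0.0, -1.0]]
lambdas = [2.0, -1.0]

k, coeffs = normalizing_constant(alphas, beta, lambdas)
total = sum(a * a for a in alphas) + k * k * sum(c * c for c in coeffs)

# Rescaling all lambda weights only rescales k; the resulting field is unchanged
k_scaled, _ = normalizing_constant(alphas, beta, [2.0 * v for v in lambdas])
```

Because a common factor in the weights \(\lambda _m\) is absorbed by \(k(\lambda )\), only the direction of the weight vector matters for the resulting field.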

2.2 Nonlinear Conditions: Head and Concentration Observations

For each conditional hydraulic transmissivity field Z(x) obtained by the previously described method, the solution of the groundwater flow and transport equations provides calculated hydraulic head values \(H_W(x_{\eta },t)\) as well as concentration values \(C_W(x_{\rho },t)\) at the observation locations. These values are usually different from the observed values \(h_{\eta ,t}\) and \(c_{\rho ,t}\). Thus, as a next step, hydraulic transmissivity fields which also honor these conditions (third and fourth goal) have to be found. However, the relation between hydraulic transmissivity and hydraulic head and concentration is nonlinear, i.e., such conditions cannot be treated directly as described above. Furthermore, they have to be incorporated in addition to the already fulfilled linear constraints.

In general, nonlinear constraints are of the form:

$$\begin{aligned} \varPsi _{\xi }(Z) = \psi _{\xi } \quad \xi = 1, \ldots , \varXi \end{aligned}$$
(19)

with \(\varPsi _{\xi }(Z)\) denoting a nonlinear function of the field Z(x). As Z(x) can be expressed in the form of a sum of a smooth field \(Z^*(x)\) and \(L-K\) (theoretically infinitely many) fields \(U_m(x)\) fulfilling the homogeneous conditions defined in Eq. (13), a solution of the nonlinear constraints is to be found by modifying the homogeneous component. Consider the \(L-K\) homogeneous fields \(U_m(x)\). As stated above, the corresponding weights \((\lambda _1, \ldots , \lambda _{L-K})\) can be selected arbitrarily. Changing \((\lambda _1, \ldots , \lambda _{L-K})\) leads to a deformation of the field Z(x) without affecting the linear constraints defined in Eq. (8). As the normalizing constant \(k(\lambda )\) is a function of the weights, Eq. (7) is also not affected, i.e., the spatial structure is preserved. Thus, the constraints defined in Eq. (19) can be fulfilled by varying the weights \((\lambda _1, \ldots , \lambda _{L-K})\) via minimization of a certain objective function, for example:

$$\begin{aligned} f_{\mathrm{obj}} = \sum \left( \psi _{\xi }^{\mathrm{obs}} - \psi _{\xi }^{\mathrm{sim}}\right) ^2 \rightarrow \min \quad \xi = 1, \ldots , \varXi \end{aligned}$$
(20)

with \(\psi _{\xi }^{\mathrm{obs}}\) denoting the observed values and \(\psi _{\xi }^{\mathrm{sim}}\) denoting the simulated values.

The advantages of this procedure are:

  1. The optimization is continuous with respect to the unknown weights \(\lambda _m\).

  2. The optimization is unconstrained: any \(\lambda _m\) weights can be considered.

  3. The number of constraints is reduced to the number of nonlinear constraints (hydraulic head and concentration observations); all considered fields fulfill the linear conditions and have the prescribed spatial dependence.

  4. The extension of the vector space of the weights through the addition of a new field (increasing L) is very simple, and the optimal solution obtained in the lower-dimensional space remains a solution in the higher-dimensional case too. Thus, the previous optimum can be used as a starting point for the next optimization.

Other objective functions such as consideration of the covariances of the head observations, the minimization of the maximal deviation, or weighted combinations can be considered. Note that for each random choice of the fields \(Y_i(x)\), a different solution of the problem can be obtained. Thus, this procedure can be used to produce an arbitrary number of random solutions of the inverse problem. The question whether the obtained field is representative can be treated using an additional Markov chain Monte Carlo procedure (Hastings 1970).

3 Extensions

The next sections present possible extensions of the suggested methodology. An approach to handle arbitrary marginal distributions as well as a combination with multiple point statistics is described.

3.1 Non-lognormal Marginals

In groundwater modeling, it is often assumed that the hydraulic transmissivities follow a lognormal distribution (Freeze 1975). However, this assumption is not always valid. This case can be treated by relaxing the assumption and allowing an arbitrary marginal distribution for the transmissivities. Furthermore, it is assumed that the spatial dependence of the hydraulic transmissivities can be described using a Gaussian copula. In general, copulas are multivariate distributions defined on the n-dimensional hypercube:

$$\begin{aligned} C\,{:}\,[0,1]^n \rightarrow [0,1] \end{aligned}$$
(21)

that have uniform univariate marginals. Thus, using copulas one can describe the dependence structure independently of the marginal distributions. For further information, the reader is referred to Nelsen (1999). The applicability of copulas in spatial problems is extensively described in Bárdossy and Li (2008), Haslauer et al. (2012), and Guthke (2013).

Using a Gaussian copula, the transmissivity field W(x) has to be transformed to a multinormal field Z(x). If the marginal distribution function of the target field is F(w), then the new variable Z(x) defined as:

$$\begin{aligned} Z(x)= \varPhi ^{-1} (F(W(x))) \end{aligned}$$
(22)

where \(\varPhi ^{-1}\) denotes the inverse standard normal distribution, follows a multivariate normal distribution and the same procedure as described in Sect. 2.1 can be applied. Thus, instead of the transformation described in Eq. (1) that only works in the lognormal case, Eq. (22) can be applied for arbitrary marginal distributions with Gaussian copula dependence structures.
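A short sketch of Eq. (22) follows. The exponential marginal and its rate are purely hypothetical choices made to show a non-lognormal case; any continuous distribution function F could be passed in instead.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal: mean 0, standard deviation 1

def exponential_cdf(w, rate):
    """Hypothetical marginal distribution F(w) of the transmissivities."""
    return 1.0 - math.exp(-rate * w)

def to_gaussian(w, marginal_cdf):
    """Eq. (22): Z = Phi^{-1}(F(W))."""
    return STD_NORMAL.inv_cdf(marginal_cdf(w))

rate = 0.02                                  # assumed rate parameter

def marginal(w):
    return exponential_cdf(w, rate)

median_w = math.log(2.0) / rate              # median of the assumed marginal
z_median = to_gaussian(median_w, marginal)   # maps to approximately 0
z_low = to_gaussian(5.0, marginal)
z_high = to_gaussian(150.0, marginal)
```

The transformation is monotone, so the ranks of the transmissivity values are preserved while the marginal becomes standard normal.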

3.2 Combination with Geological Structure Information: Multiple Point Geostatistics

It is often assumed that the hydraulic transmissivity field is structured in a specific way according to the geological processes leading to these variables. For example, fluvial deposits could result from geological sedimentation processes. Such structures are obtained by combining observations and training images. In Li et al. (2012), a Kalman filter-based method is suggested for inverse modeling for this case. Ronayne et al. (2008) coupled training images with a dynamic flow model in a simulation inverse framework to obtain discrete geological structures. Here a different approach is suggested.

Structural information obtained from training images can be combined with the random mixing methodology to solve the inverse problem. Assume that a conditional categorical map has been obtained using a multipoint geostatistics approach, e.g., direct sampling (Mariethoz et al. 2010). With that a random field B(x) with possible values \(B(x)=1,\ldots ,B\) is obtained for conditioning. For each of the possible classes b, there can be a different marginal distribution of the hydraulic transmissivity values:

$$\begin{aligned} F_b(w) = P (W(x)<w | B(x)=b) \end{aligned}$$
(23)

Thus, an inverse solution for the problem with the above additional condition can be obtained by applying the concept described in Sect. 3.1. The new field is then defined as:

$$\begin{aligned} Z(x) = \varPhi ^{-1} (F_{B(x)} (W(x))) \end{aligned}$$
(24)

where \(\varPhi ^{-1}\) denotes the inverse standard normal distribution. This Z(x) can then be treated the same way as described in Sect. 2.1. Note that here a kind of spatial continuity within the different units b is assumed, as the spatial variability within the units is the same in the rank sense. However, this assumption could be weakened, by allowing an individual field \(Z_b(x)\) for each geological unit b, with a specific description of the spatial variability. In this case, the individual fields are mixed simultaneously, the same way as described above.
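The facies-dependent transformation of Eq. (24) can be sketched as follows. The marginals are taken to be lognormal with the \(\mathrm{log}_{10}T\) means and variance used in Example 2; the facies labels, sample values, and function names are illustrative assumptions.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()
SIGMA = math.sqrt(0.189)  # standard deviation of log10 T, shared by both facies

# Distribution of log10 T for each facies b (labels 1 and 2 are illustrative)
FACIES_LOG10_T = {1: NormalDist(2.05, SIGMA), 2: NormalDist(0.62, SIGMA)}

def to_gaussian(w, facies):
    """Eq. (24): Z(x) = Phi^{-1}(F_{B(x)}(W(x)))."""
    p = FACIES_LOG10_T[facies].cdf(math.log10(w))
    return STD_NORMAL.inv_cdf(p)

def from_gaussian(z, facies):
    """Inverse mapping: W(x) = F_{B(x)}^{-1}(Phi(Z(x)))."""
    p = STD_NORMAL.cdf(z)
    return 10.0 ** FACIES_LOG10_T[facies].inv_cdf(p)

# Hypothetical transmissivities and categorical map B(x) on four cells
w_values = [150.0, 3.0, 60.0, 8.0]
facies_map = [1, 2, 1, 2]
z_values = [to_gaussian(w, b) for w, b in zip(w_values, facies_map)]
w_back = [from_gaussian(z, b) for z, b in zip(z_values, facies_map)]
```

The same transmissivity value maps to different values of Z(x) depending on the facies, which is how the categorical map enters the mixing procedure.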

4 Numerical Methodology

As described in Sect. 2, a marginal transformed conditional realization of the hydraulic transmissivities is determined by a linear combination of unconditional Gaussian random fields. Thus, a large number of Gaussian random fields could be required. In order to reduce the computational burden, simulation on a regular grid using fast Fourier transformation (Wood and Chan 1994; Wood 1995; Le Ravalec et al. 2000) is adopted for all examples. This method allows very fast simulation of unconditional Gaussian random fields.

According to Sect. 2.2, the inverse problem is transformed to a continuous optimization problem. Hence, different continuous nonlinear optimization techniques can be applied. Throughout this paper, the COBYLA algorithm (constrained optimization by linear approximation) is adopted. This algorithm is based on linear approximations of the objective function and each constraint. For further details, the reader is referred to Powell (1998). The optimization is carried out in order to achieve a reasonable reproduction of the nonlinear constraints. The optimization process is terminated if the objective function falls below a user-defined threshold \(\delta _{\mathrm{obj}}\) or if the number of forward model runs \(n_{\mathrm{iter}}\) exceeds 500. During the optimization process, the weight vector \(\lambda \) (Eq. 17) is modified. The accuracy of the optimized solution depends on the dimensionality of this vector and on the number of forward model runs. In order to achieve a reasonable balance between computational costs and accuracy, the dimensionality of \(\lambda \) is restricted to 26 dimensions in this work.
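The two stopping rules can be illustrated with a small derivative-free search. This is only a sketch under strong assumptions: a simple random local search is swapped in for COBYLA, a hypothetical two-parameter function stands in for the numerical flow and transport model, and the threshold value is arbitrary.

```python
import math
import random

def forward_model(lambdas):
    """Hypothetical nonlinear stand-in for the flow and transport simulation."""
    a, b = lambdas
    return [math.tanh(a + 0.5 * b), a * b, math.exp(-0.1 * (a - b) ** 2)]

def objective(observed, simulated):
    """Sum of squared residuals, as in Eq. (20)."""
    return sum((o - s) ** 2 for o, s in zip(observed, simulated))

def optimize(observed, start, delta_obj=1e-6, max_runs=500, seed=0):
    """Random local search; stops below delta_obj or after max_runs forward runs."""
    rng = random.Random(seed)
    best = list(start)
    best_f = objective(observed, forward_model(best))
    runs, step = 1, 0.5
    while best_f >= delta_obj and runs < max_runs:
        trial = [v + rng.gauss(0.0, step) for v in best]
        trial_f = objective(observed, forward_model(trial))
        runs += 1
        if trial_f < best_f:
            best, best_f = trial, trial_f      # accept the improvement
        else:
            step = max(step * 0.95, 1e-4)      # shrink the search radius
    return best, best_f, runs

# Synthetic observations generated from known weights
observed = forward_model([0.8, -0.4])
start = [0.0, 0.0]
start_f = objective(observed, forward_model(start))
best, best_f, runs = optimize(observed, start)
```

Counting forward model runs rather than iterations matters because each run of the real simulation dominates the computational cost.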

The numerical flow and transport model applied is HydroGeoSphere (Therrien and Sudicky 1996). HydroGeoSphere is a three-dimensional numerical model describing fully integrated subsurface and surface flow and solute transport. A finite element scheme is adopted throughout this paper.

5 Examples

The high flexibility of the presented methodology allows a wide range of possible applications. In this paper, the approach is demonstrated using two artificial examples.

The general flow setup of both examples is shown in Fig. 1. The domain length is 50 cm in the x direction as well as in the y direction, discretized into \(50 \times 50\) regular grid cells. Steady-state groundwater flow is simulated. The northern and southern boundaries share no-flow conditions, while the western and eastern boundaries have prescribed hydraulic heads of 20.0 and 2.0 cm, respectively. These prescribed heads force a flow from west to east.

A line contamination source introduces a specified mass flux of 1.0 kg/day to the system starting at the first time step for a duration of 12 h. The contaminant is represented by a conservative tracer; thus, it does not show retardation or any chemical reaction, but it is subject to hydrodynamic dispersion. The longitudinal dispersivity is 0.625 cm, and the transversal dispersivity is 0.0625 cm. The transport simulation is run until a simulation time of 2 days is reached, and the concentrations are sampled at five locations at eleven time steps.

According to Franssen et al. (2009), the performance of the method is evaluated using two statistics:

  1. Average absolute error:

    $$\begin{aligned} \mathrm{AAE}(X) = \frac{1}{N} \sum _{i=1}^N | \bar{X_i} - X_{\mathrm{ref},i}| \end{aligned}$$
    (25)

  2. Average ensemble standard deviation:

    $$\begin{aligned} \mathrm{AESD}(X) = \frac{1}{N} \sum _{i=1}^N \sigma _{X_i} \end{aligned}$$
    (26)

    with N denoting the number of elements, X denoting the variable of interest, \(\bar{X_i}\) denoting the ensemble mean of X at element i, \(X_{\mathrm{ref},i}\) denoting the reference value at element i, and \(\sigma _{X_i}\) denoting the ensemble standard deviation of variable X at element i.
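Both statistics can be computed directly from an ensemble of realizations. In the sketch below the ensemble is stored as a list of realizations, each a list of element values; the use of the population standard deviation and the tiny data set are assumptions made for illustration.

```python
from statistics import fmean, pstdev

def aae(ensemble, reference):
    """Eq. (25): mean absolute difference between ensemble mean and reference."""
    n = len(reference)
    means = [fmean(real[i] for real in ensemble) for i in range(n)]
    return sum(abs(m - r) for m, r in zip(means, reference)) / n

def aesd(ensemble):
    """Eq. (26): ensemble standard deviation averaged over all elements."""
    n = len(ensemble[0])
    return sum(pstdev([real[i] for real in ensemble]) for i in range(n)) / n

# Hypothetical ensemble of two realizations on three elements
ensemble = [[1.0, 2.0, 3.0],
            [3.0, 2.0, 5.0]]
reference = [2.5, 2.0, 3.0]

aae_value = aae(ensemble, reference)   # ensemble means are [2, 2, 4]
aesd_value = aesd(ensemble)            # standard deviations are [1, 0, 1]
```

Dividing each value by the corresponding value of an unconditioned ensemble gives relative statistics that can be compared across scenarios.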

Fig. 1 General flow setup

5.1 Example 1: Basic Case

The first example illustrates the basic methodology according to the traditional assumption that the hydraulic transmissivities follow a lognormal distribution. The reference hydraulic transmissivity field has an average \(\mathrm{log}_{10}T\) of 1.65 cm\(^2\)/day and a \(\mathrm{log}_{10}T\) variance of 0.189 (cm\(^2\)/day)\(^2\). As the spatial model, an exponential covariogram without nugget effect and with an effective range of 4 cm is assumed. The reference transmissivity field as well as the reference hydraulic head field are sampled at 16 locations. According to Eq. (9), the sampled transmissivities are considered as linear equality constraints. The sampled heads are considered as nonlinear constraints according to Sect. 2.2. Tracer concentrations, sampled at five locations at eleven time steps, are also considered as nonlinear constraints according to Sect. 2.2. The sampled data are not corrupted, i.e., no measurement uncertainties are assumed. Figure 2 shows the reference transmissivity field and the reference head field; Fig. 3 shows the reference tracer concentration fields at ten selected time steps.

Fig. 2 Reference \(\mathrm{log}_{10}T~(\mathrm{cm}^2/\mathrm{day})\) field (left) and corresponding hydraulic head (cm) field (right) according to example 1. ‘\(\times \)’ marks the observation locations

Fig. 3 Reference tracer concentration for time steps \([0.1,0.2,0.3,0.4,0.5,0.6,0.7,0.8,0.9,1.0~(\mathrm{days})]\) starting in the upper left according to example 1

In total, 100 realizations are generated as described in Sect. 2, and for comparison, three different scenarios are distinguished:

  1. The first scenario is only conditioned on the spatial model of the hydraulic transmissivities.

  2. The second scenario is conditioned on the spatial model of the hydraulic transmissivities, on the observed hydraulic transmissivity values, and on the observed hydraulic head values. The corresponding objective function is:

    $$\begin{aligned} f_{\mathrm{obj}} = \sum _{\eta = 1}^H \left( h_{\eta }^{\mathrm{obs}} - h_{\eta }^{\mathrm{sim}} \right) ^2 \end{aligned}$$
    (27)

    with \(h_{\eta }^{\mathrm{obs}}\) denoting the observed hydraulic head and \(h_{\eta }^{\mathrm{sim}}\) the simulated hydraulic head value.

  3. The third scenario is conditioned on the spatial model of the hydraulic transmissivities, on the observed hydraulic transmissivity values, on the observed hydraulic head values, and on the observed tracer concentration values. Here the corresponding objective function is a weighted combination:

    $$\begin{aligned} f_{\mathrm{obj}} = \sum _{\eta = 1}^H \left( h_{\eta }^{\mathrm{obs}} - h_{\eta }^{\mathrm{sim}} \right) ^2 + 100 \cdot \sum _{\rho =1}^P \sum _{t=1}^{T} \left| c_{\rho ,t}^{\mathrm{obs}} - c_{\rho ,t}^{\mathrm{sim}} \right| \end{aligned}$$
    (28)

    where \(c_{\rho ,t}^{\mathrm{obs}}\) denotes the observed concentration and \(c_{\rho ,t}^{\mathrm{sim}}\) the simulated concentration.
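The objective functions of scenarios 2 and 3 translate directly into code. The sketch below uses hypothetical observed and simulated values; concentrations are stored as one list of time steps per observation location, which is an assumption about the data layout.

```python
def head_objective(h_obs, h_sim):
    """Eq. (27): sum of squared head residuals."""
    return sum((o - s) ** 2 for o, s in zip(h_obs, h_sim))

def combined_objective(h_obs, h_sim, c_obs, c_sim, weight=100.0):
    """Eq. (28): squared head residuals plus weighted absolute concentration residuals."""
    conc_term = sum(abs(o - s)
                    for row_obs, row_sim in zip(c_obs, c_sim)
                    for o, s in zip(row_obs, row_sim))
    return head_objective(h_obs, h_sim) + weight * conc_term

# Hypothetical values: two head observations, two locations with two time steps
h_obs, h_sim = [10.0, 5.0], [9.5, 5.5]
c_obs = [[0.0, 0.02], [0.01, 0.0]]
c_sim = [[0.0, 0.01], [0.02, 0.0]]

f_scenario_2 = head_objective(h_obs, h_sim)
f_scenario_3 = combined_objective(h_obs, h_sim, c_obs, c_sim)
```

The weight of 100 brings the concentration term to a magnitude comparable to the head term in this toy data set; other weights can be passed through the `weight` argument.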

Fig. 4 Possible \(\mathrm{log}_{10}T\) fields (upper) with corresponding hydraulic head fields (middle) and corresponding tracer concentration (time step: 0.8 day) fields (lower) according to scenario 1 (left), scenario 2 (middle), and scenario 3 (right) for example 1

Fig. 5 Ensemble averages of \(\mathrm{log}_{10}T\) (upper), hydraulic heads (middle), and tracer concentrations (time step: 0.8 day) (lower) according to scenario 1 (left), scenario 2 (middle), and scenario 3 (right) for example 1

Figure 4 shows a possible realization for each scenario with the corresponding hydraulic head field and the corresponding tracer concentration field at time step 0.8 day. Figure 5 shows the ensemble mean fields according to the three scenarios. The ensemble mean fields according to scenarios 2 and 3 reproduce zones of high and low transmissivities, hydraulic heads, and tracer concentrations reasonably well. Hence, conditioning on data leads to an improved characterization of the three attributes, although in different ways.

Table 1 shows the AAE and AESD of the three attributes for all three scenarios. The values are normalized such that AAE and AESD are equal to 1 for scenario 1; all other results are relative to the results of scenario 1. For all scenarios including conditioning data, the AAE for hydraulic transmissivity, hydraulic head, and concentration is below 1, again indicating an improved characterization of the fields. As also found in other inverse modeling studies (Franssen et al. 2003, 2009), conditioning to data is more advantageous to the characterization of the hydraulic head fields than to the characterization of the transmissivity fields. For scenario 2, \(\mathrm{AAE}(Y)\) is reduced by 16.7 %, \(\mathrm{AAE}(h)\) by 68.1 %, and \(\mathrm{AAE}(c)\) by 41.6 %. When concentration data are additionally used for conditioning, the characterization of the transmissivity and concentration fields improves, while \(\mathrm{AAE}(h)\) gets slightly worse. For scenario 3, the reductions are 17.6 % for \(\mathrm{AAE}(Y)\), 63.7 % for \(\mathrm{AAE}(h)\), and 43 % for \(\mathrm{AAE}(c)\). As concentration is only sampled at five locations, a denser observation network or even a different weighting of the objective function could lead to further improvements. This, however, is subject to future research.

Table 1 Normed average absolute error (AAE) and normed average ensemble standard deviation (AESD) for the log transmissivities, hydraulic heads, and tracer concentrations (averaged over all 11 time steps) fields according to example 1

The uncertainty, measured by AESD, is also below 1 for all scenarios including conditioning data, indicating that conditioning to transmissivity and hydraulic head data helps to reduce the uncertainty of the transmissivity field. For scenario 2, there is an \(\mathrm{AESD}(Y)\) reduction of 16.4 %, an \(\mathrm{AESD}(h)\) reduction of 67.4 %, and an \(\mathrm{AESD}(c)\) reduction of 21.4 %. Additionally taking concentration data into account reduces the uncertainty of the concentration field; however, it does not help to reduce the uncertainty of the transmissivities any further and even slightly increases the uncertainty of the hydraulic heads. Again, a similar behavior has also been observed in Franssen et al. (2003). For scenario 3, the reduction in \(\mathrm{AESD}(Y)\) is again 16.4 %, the reduction in \(\mathrm{AESD}(h)\) is 53.6 %, and the reduction in \(\mathrm{AESD}(c)\) is 34.3 %.

5.2 Example 2: Including Geological Information

The second example focuses on the importance of the interplay of macrostructure and microstructure of the hydraulic transmissivities. As described in Sect. 3.2, it is often assumed that the transmissivity field is structured in a specific way according to the geological processes leading to these variables. For example, fluvial deposits, i.e., contrasting facies of highly different hydraulic transmissivities, could result from geological sedimentation processes. Such structures, their connectedness, and geometry have a great influence on groundwater flow and transport processes. For example, connected features of high transmissivity result in preferential flow paths, while zones of low transmissivity act like a flow barrier.

In Ronayne et al. (2008), the authors combine multiple point geostatistics with a dynamic flow model to achieve specific channel structures in an inverse modeling framework. They assume discrete structures, i.e., homogeneous distributions for each structure. However, this assumption does not represent nature, and small-scale heterogeneity can have serious influence on transport predictions. Thus, this example aims to show the importance of the interplay of larger-scale and small-scale heterogeneity.

Fig. 6
figure 6

Reference \(\mathrm{log}_{10}T~(\mathrm{cm}^2/\mathrm{day})\) field (left) and corresponding hydraulic head (cm) field (right) according to example 2. ‘\(\times \)’ marks the observation locations

Fig. 7 Reference tracer concentration for time steps \([0.1,0.2,0.3,0.4,0.5,0.6,0.7,0.8,0.9,1.0~(\mathrm{days})]\) starting in the upper left according to example 2

The general flow setup is the same as described in Sect. 5.1, but a two-facies geological formation is considered. Each facies has its own marginal distribution: the first facies (representing the connected flow paths) exhibits a lognormal distribution with a mean \(\mathrm{log}_{10}T\) of 2.05 cm\(^2\)/day and a \(\mathrm{log}_{10}T\) variance of 0.189 (cm\(^2\)/day)\(^2\), while the second facies exhibits a lognormal distribution sharing the same variance but an average \(\mathrm{log}_{10}T\) of 0.62 cm\(^2\)/day. The spatial model is again exponential with a spatial range of 4 cm and no nugget effect.
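The two-facies setup described above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a small hypothetical grid with unit cell spacing, the parameterization \(C(h)=\sigma^2\exp(-h/a)\) for the exponential model, a single Gaussian residual field shared by both facies, and a plain Cholesky factorization for the simulation. The names `facies_map`, `facies_means`, and `two_facies_log10T` are illustrative only.

```python
import math
import random


def exponential_cov(h, variance, corr_range):
    # Assumed parameterization: C(h) = variance * exp(-h / corr_range), no nugget
    return variance * math.exp(-h / corr_range)


def cholesky(a):
    """Lower-triangular Cholesky factor of a symmetric positive definite matrix."""
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                low[i][j] = math.sqrt(a[i][i] - s)
            else:
                low[i][j] = (a[i][j] - s) / low[j][j]
    return low


def two_facies_log10T(facies, means, variance, corr_range, rng):
    """Facies-specific mean plus one correlated Gaussian residual field."""
    ny, nx = len(facies), len(facies[0])
    cells = [(i, j) for i in range(ny) for j in range(nx)]
    cov = [[exponential_cov(math.dist(p, q), variance, corr_range)
            for q in cells] for p in cells]
    low = cholesky(cov)
    white = [rng.gauss(0.0, 1.0) for _ in cells]
    # Correlated residual = L * white noise
    resid = [sum(low[r][k] * white[k] for k in range(r + 1))
             for r in range(len(cells))]
    field = [[0.0] * nx for _ in range(ny)]
    for (i, j), e in zip(cells, resid):
        field[i][j] = means[facies[i][j]] + e
    return field


# Hypothetical 6 x 8 facies map: 1 = connected channel, 0 = background
facies_map = [[1 if i in (2, 3) else 0 for _ in range(8)] for i in range(6)]
# Mean log10 T per facies, variance, and range as stated in the text
facies_means = {1: 2.05, 0: 0.62}
log10T = two_facies_log10T(facies_map, facies_means, 0.189, 4.0,
                           random.Random(42))
```

Because the facies map is fixed, repeated calls with different random seeds change only the small-scale variability inside each facies, which mirrors the assumption that the microstructure is the only unknown.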

It is assumed that the spatial distribution of the two facies, i.e., the categorical map described in Sect. 3.2, is fully known. Thus, the macrostructure (large-scale heterogeneity) of the hydraulic transmissivity field is the same for each realization, and the microstructure (small-scale heterogeneity) is assumed to be the only unknown. Consequently, the preferential flow paths are predefined by the macrostructure, as solute transport is dominated by zones of high transmissivities. However, as stated above, the microstructure inside the respective facies has a great influence on solute transport as well. Figure 6 shows the reference \(\mathrm{log}_{10}T\) field as well as the reference hydraulic head field. Figure 7 shows the reference tracer concentration fields. Again, the reference transmissivity and head fields are sampled at 16 locations, resulting in 16 linear transmissivity constraints as well as 16 nonlinear head constraints. The tracer fields are again sampled at five locations at 11 time steps. As in the first example, the sampled data are not corrupted, i.e., no measurement uncertainties are assumed. A total of 100 solutions to the inverse problem are generated for each of the three scenarios defined in the first example.

Figure 8 shows a possible realization for each scenario with corresponding hydraulic head fields and corresponding tracer concentration fields at time step 0.8 day. Figure 9 shows the ensemble mean fields for all three scenarios. As in the first example, the ensemble mean fields of scenarios 2 and 3 reproduce the zones of high and low transmissivities, hydraulic heads, and tracer concentrations reasonably well. This again indicates that conditioning on data leads to an improved characterization of all three attributes. In contrast, even though the macrostructure is prescribed, scenario 1 is not able to reproduce the reference fields reasonably. This shows the influence of the microstructure on the flow and transport behavior. Moreover, the microstructure influences the different attributes differently: while the structure inside the flow channels has more effect on the transport behavior, the structure in the remaining field has more effect on the overall flow behavior.

Table 2 shows the AAE and AESD for all three scenarios according to the three attributes. Again, the values are normed such that AAE and AESD are equal to 1 for scenario 1. All other results are relative to the results of scenario 1. As in the first example, \(\mathrm{AAE}(Y), \mathrm{AAE}(h)\), and \(\mathrm{AAE}(c)\) are below 1 for all scenarios including conditioning data. For scenario 2, \(\mathrm{AAE}(Y)\) is reduced by 18.1 %, \(\mathrm{AAE}(h)\) is reduced by 66.5 %, and \(\mathrm{AAE}(c)\) is reduced by 19.8 %. Additional conditioning on concentration data leads to further improvements in all attributes, indicating an improved characterization of all fields. For scenario 3, an \(\mathrm{AAE}(Y)\) decrease of 19.7 %, an \(\mathrm{AAE}(h)\) decrease of 66.8 %, and an \(\mathrm{AAE}(c)\) decrease of 30.8 % are observed. The uncertainty is also reduced for all scenarios including conditioning data. For scenario 2, an \(\mathrm{AESD}(Y)\) reduction of 13.8 % is observed, \(\mathrm{AESD}(h)\) exhibits a reduction of 61.7 %, and \(\mathrm{AESD}(c)\) shows a reduction of 19.8 %. Conditioning on concentration data reduces the uncertainty of the transmissivities as well as the uncertainty of the concentration field. However, as in the first example, it does not help to reduce the uncertainty of the hydraulic heads. For scenario 3, the uncertainty of the hydraulic transmissivity is reduced by 16.1 %, the uncertainty of the hydraulic head field is reduced by 55.6 %, and the uncertainty of the concentration fields is reduced by 30.8 %.
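The relation between the normed measures and the quoted percentage reductions can be made concrete with a short sketch. The exact definitions of AAE and AESD are given elsewhere in the paper; the versions below are assumptions made for illustration only: AAE is taken as the cell-averaged absolute difference between the ensemble mean and the reference, and AESD as the cell-averaged ensemble standard deviation. All function names are hypothetical.

```python
from statistics import mean, pstdev


def aae(ensemble, reference):
    """Assumed definition: mean over cells of |ensemble mean - reference|."""
    n_cells = len(reference)
    ens_mean = [mean(real[i] for real in ensemble) for i in range(n_cells)]
    return mean(abs(m - ref) for m, ref in zip(ens_mean, reference))


def aesd(ensemble):
    """Assumed definition: mean over cells of the ensemble standard deviation."""
    n_cells = len(ensemble[0])
    return mean(pstdev([real[i] for real in ensemble]) for i in range(n_cells))


def normed(value, baseline):
    """Norm a measure by its scenario 1 value, so that scenario 1 equals 1."""
    return value / baseline


def reduction_percent(normed_value):
    """Percentage reduction relative to scenario 1."""
    return 100.0 * (1.0 - normed_value)


# Tiny made-up ensemble: two realizations on two cells
reference = [1.0, 2.0]
ensemble = [[0.0, 2.0], [2.0, 4.0]]
example_aae = aae(ensemble, reference)
example_aesd = aesd(ensemble)
```

Under this convention, a reported reduction of 18.1 % corresponds to a normed value of 0.819, i.e., `reduction_percent(0.819)` returns approximately 18.1.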

Fig. 8 Possible \(\mathrm{log}_{10}T\) fields (upper) with corresponding hydraulic head fields (middle) and corresponding tracer concentration (time step: 0.8 day) fields (lower) according to scenario 1 (left), scenario 2 (middle), and scenario 3 (right) for example 2

Fig. 9 Ensemble averages of \(\mathrm{log}_{10}T\) (upper), hydraulic heads (middle), and tracer concentrations (time step: 0.8 day) (lower) according to scenario 1 (left), scenario 2 (middle), and scenario 3 (right) for example 2

6 Conclusions

This paper presents a new methodology to generate solutions to the inverse groundwater flow and transport problem. The methodology uses linear combinations of unconditional random fields to obtain constrained realizations of the required hydraulic transmissivities. It is situated in a Monte Carlo framework, i.e., multiple solutions to the inverse problem are generated. Thus, the associated uncertainty can be quantified reasonably. The main advantages of the presented approach are:

  1. The flexible description of the spatial dependence structure using spatial copulas. Thus, arbitrary marginals can be considered.

  2. The continuous and unconstrained formulation of the nonlinear constraints, which relate the hydraulic transmissivity fields to both the observed hydraulic heads and the observed contaminant concentrations.

  3. The ability to integrate structural information, for example via multiple point geostatistics.
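The linear-combination idea that underlies the methodology, i.e., weighting unconditional random fields such that linear constraints at the observation points are fulfilled, can be sketched as follows. This is an illustration under simplifying assumptions, not the procedure used in the paper: the fields are spatially uncorrelated standard normal noise, the weights are the minimum-norm solution of the constraint equations, and any further requirement on the weights (for instance a normalization that preserves the variance of the mixture) as well as the nonlinear head and concentration constraints are left out. The names `min_norm_weights` and `mix` are hypothetical.

```python
import random


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def min_norm_weights(fields, obs_idx, obs_val):
    """Minimum-norm weights w with sum_i w_i * fields[i][j] = obs_val at each j."""
    # a[k][i] is the value of field i at the k-th observation cell
    a = [[f[j] for f in fields] for j in obs_idx]
    gram = [[sum(p * q for p, q in zip(ra, rb)) for rb in a] for ra in a]
    y = solve(gram, obs_val)
    # w = A^T (A A^T)^{-1} b
    return [sum(a[k][i] * y[k] for k in range(len(obs_idx)))
            for i in range(len(fields))]


def mix(fields, weights):
    """Weighted sum of the fields, cell by cell."""
    n_cells = len(fields[0])
    return [sum(w * f[c] for w, f in zip(weights, fields))
            for c in range(n_cells)]


rng = random.Random(1)
# 50 unconditional fields on 30 cells (uncorrelated here, for brevity only)
fields = [[rng.gauss(0.0, 1.0) for _ in range(30)] for _ in range(50)]
obs_idx = [3, 12, 27]          # hypothetical observation cells
obs_val = [0.5, -1.0, 0.2]     # hypothetical observed values
weights = min_norm_weights(fields, obs_idx, obs_val)
mixed = mix(fields, weights)
```

Since there are far more fields than constraints, many other weight vectors honor the same observations; this remaining freedom is what leaves room for fitting nonlinear constraints by optimization.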

Table 2 Normed average absolute error (AAE) and normed average ensemble standard deviation (AESD) for the log transmissivities, hydraulic heads, and tracer concentrations (averaged over all 11 time steps) fields according to example 2

The general applicability of the suggested methodology is demonstrated using two artificial examples. The first example presents the basic approach. It was shown that the suggested approach leads to an improved characterization as well as to a reduced uncertainty of the hydraulic transmissivities, hydraulic heads, and concentration fields. Conditioning on the observed hydraulic heads improves the performance measures for all three attributes, while additional conditioning on the observed concentrations leads to further improvements in the transmissivities and the concentrations. However, it did not improve the characterization of the hydraulic head field. A similar behavior has also been observed in other studies, and further research is needed to investigate those findings properly.

The second example demonstrated how the suggested approach can be coupled to multiple point statistics. Thus, the spatial structure of the hydraulic transmissivity field can be separated into a macrostructure and a microstructure. The macrostructure represents the facies distribution, while the microstructure represents the small-scale variability inside the corresponding facies. However, in the presented example, the macrostructure, i.e., the facies distribution of the transmissivities, was assumed to be known. Thus, only the microstructure inside the different facies was considered uncertain. The results show the importance of the interplay of the macro- and microstructure: even though the large-scale flow paths are known, the microstructure still has a significant influence on flow and transport behavior. As in the first example, the suggested inverse approach leads to an improved characterization and to a reduced uncertainty of all three attributes. Again, conditioning on observed concentrations was beneficial to the transmissivities and the concentrations.

In many cases, the spatial distribution of facies is also unknown. This case can also be treated using the presented methodology. Nevertheless, this coupled approach needs to be improved, as only one facies distribution is considered for each inverse solution. This means that a possible macrostructure is simulated for each inverse solution beforehand and is not changed during the optimization process. Thus, a large number of realizations are needed to represent the macrostructure properly. A joint inversion, i.e., changing both structures during the optimization process, would probably lead to better results.

It has to be stated that the mean \(\mathrm{RMSE}(h)\) (of the 16 sampled hydraulic head values) over all realizations for both examples is 0.3 m, with single values ranging from 0.09 to 0.4 m. Thus, some values are rather high. However, by increasing the number of iterations and the dimensionality of the weight vector \(\lambda \), those values can be improved significantly. As the numerical flow and transport model applied is quite complex, i.e., computationally expensive, those parameters were restricted in this work.

Finally, it should be stated that the presented synthetic test cases only consider the hydraulic transmissivities as a source of uncertainty. In reality, different sources of uncertainty should be considered. Among others, these sources could include porosity or boundary conditions. However, as the presented methodology gives promising results and exhibits great potential due to the above-mentioned advantages, it could be applied to more complicated synthetic test cases as well as real-world problems. Such real-world cases are usually large-scale three-dimensional problems. The suggested approach is able to handle such tasks without any modification. However, as is common to all Monte Carlo-based inversion approaches, the limiting factor is the complexity of the numerical model.

7 Perspective

It is shown that the suggested methodology can efficiently handle point constraints corresponding to hydraulic transmissivity, hydraulic head, and concentration observations. However, there are different spatial scales at which information is available in inverse groundwater modeling. For example, measurements from bore cores exhibit a smaller spatial support, while pumping tests result in values that are averages over a certain region. Taking information on such different spatial scales into account is still a frequent and challenging task in inverse groundwater modeling (Zhou et al. 2014). The presented approach can be extended to cope with observations on different spatial scales. This, however, goes beyond the scope of this paper.

As described in Sect. 3.1, the presented approach uses Gaussian copulas to describe the spatial dependence structure. However, the spatial dependence of the hydraulic transmissivity field W(x) might differ from Gaussian. For example, Zinn and Harvey (2003) showed how different spatial patterns of connectivity lead to different flow and transport behavior, even though the fields applied share the same basic statistics, i.e., the same first and second moments. In Haslauer et al. (2012), it was shown that the spatial dependence of transmissivities cannot be described adequately using a Gaussian copula. The dependence structure of the transmissivities is often asymmetrical, i.e., high values are clustered differently than low values. This asymmetry has a significant influence on flow and transport behavior. An alternative to the Gaussian copula can be obtained from multivariate distributions that result from non-monotonic transformations of the multivariate Gaussian distribution. The suggested methodology can be extended to some of these non-Gaussian cases, but a description of the details is beyond the scope of this paper.