Mathematical Geosciences, Volume 50, Issue 8, pp 867–893

Data Assimilation in Truncated Plurigaussian Models: Impact of the Truncation Map

  • Dean S. Oliver
  • Yan Chen


Abstract

Assimilation of production data into reservoir models for which the distribution of porosity and permeability is largely controlled by facies has become increasingly common. When the locations of the facies bodies must be conditioned to observations, the truncated plurigaussian model has often been shown to be a useful modeling method, as it allows gaussian variables to be updated instead of facies types. Previous experience has also shown that ensemble Kalman filter-like methods are particularly effective for assimilation of data into truncated plurigaussian models. In this paper, some limitations of ensemble-based and gradient-based methods are demonstrated when they are applied to truncated plurigaussian models of a type that is likely to occur when modeling channel facies. It is also shown that it is possible to improve the data match and increase the ensemble spread by modifying the updating step using an approximate derivative of the truncation map.


Keywords: Inverse problem · Ensemble Kalman filter · Categorical variables · Data assimilation · Truncated plurigaussian model

1 Introduction

The truncated plurigaussian (TPG) model has become an increasingly popular method for modeling the spatial distribution of categorical variables such as facies or rock types in subsurface reservoirs. Examples of facies environments that have been modeled range from algal mounds (Galli et al. 2006) and reef reservoirs (Grötsch and Mercadier 1999) to turbidites (Albertão et al. 2005) and a tidal flat environment with tidal channels and a carbonate reef (Biver et al. 2015). In most practical situations, facies can only be directly observed at well locations. In the region between wells, the locations of facies boundaries are imperfectly known, and other data may be needed to reduce the uncertainty.

Ensemble-based data assimilation methods such as the ensemble Kalman filter or the iterative ensemble smoother have been shown to be effective at assimilating data into reservoir models, but the methods implicitly rely on the initial distribution of model variables being approximately multivariate gaussian (Evensen 1994). Most mathematical models of geological facies are, therefore, not well suited to ensemble-based methods without first applying variable transformations. The truncated gaussian and truncated plurigaussian methods do, however, meet this criterion if the latent gaussian random variables are the variables that are updated. Previous experience has in fact shown that reservoir production data can be matched very well using ensemble-based data assimilation methods when the TPG method is used to model facies (Liu and Oliver 2005b; Agbalaka and Oliver 2011; Sebacher et al. 2013; Astrakova and Oliver 2015). As a result, practitioners have tended to be confident that the nonlinearity in the transformation from gaussian variables to categorical (facies) variables and ultimately to petrophysical properties will be accommodated by iteration.

Despite the generally good results, in at least one synthetic example for which the TPG model was used to represent the probability of channel facies (Zhao et al. 2008), it was not possible to obtain a good match to production data. A similar difficulty was observed when a TPG model was applied to a real field, but the difficulty disappeared when the truncation map was changed (Chen 2015). In previously published examples (Chen 2015; Zhao et al. 2008), it appears that the difficulty of obtaining a good match to production data was the result of the non-monotonic relationship of data to the latent gaussian variables of the TPG. In those cases, when one computes a direction for updating the gaussian variables from their covariance with data, the resulting direction can be a very poor approximation of the correct local update direction. In particular, when the truncation map is non-monotonic, at the same gridblock, an increase of permeability is achieved by decreasing the gaussian variable in some realizations, while for other realizations an increase of permeability is achieved by increasing the gaussian variable. Since in most applications of ensemble methods the same Kalman gain matrix is used to update all ensemble members, it will be impossible to move all realizations in the correct direction in this case.

If truncation maps used in TPG models always resulted in monotonic relationships between petrophysical properties and the latent gaussian variables, minimization of data misfit would not be difficult. As the use of the TPG for modeling increases, however, the probability of encountering truncation maps that cause difficulties with data assimilation will also inevitably increase. Note for example that both truncation maps in some complex examples (Biver et al. 2015) are symmetric, and hence non-monotonic. Similarly, symmetric or non-monotonic truncation maps are shown in other publications (Albertão et al. 2005; Mariethoz et al. 2009; Beucher and Renard 2016; D’Or et al. 2017). One purpose of this paper is to help identify the source of the problem for data assimilation. It is also shown, however, that the probability of obtaining a good history match can be increased by analytical computation of some derivatives.

2 Motivation

A common approach to assimilation of data, \(\mathbf {d}^{\mathrm {o}}\) of dimension \(N_d\), into a mathematical model of the physical system with variables \(\mathbf {m}\) of dimension \(N_m\), is to simultaneously minimize the misfit of perturbed predicted data, \(\mathbf {g}(\mathbf {m}) + \varvec{\epsilon }\), to actual data, and the misfit of model parameters to a sample from the prior distribution, by adjusting the values of the model variables. In a TPG model, the model variables are the gaussian variables that, when truncated, define the facies type.

If the covariance of the additive gaussian noise in the data is \(\mathbf {C}_D\) and the prior covariance of gaussian model variables is \(\mathbf {C}_{M}\), then the objective function to be minimized for computation of approximate samples from the posterior distribution is (Kitanidis 1995; Oliver et al. 1996; Oliver 2017)
$$\begin{aligned} S_i(\mathbf {m})= & {} \frac{1}{2} \left[ (\mathbf {g}(\mathbf {m}) + \varvec{\epsilon }_i -\mathbf {d}^{\mathrm {o}}) ^{\text {T}} \mathbf {C}_D^{-1} (\mathbf {g}(\mathbf {m}) + \varvec{\epsilon }_i -\mathbf {d}^{\mathrm {o}}) \right. \nonumber \\&\left. + (\mathbf {m} - \mathbf {m}_i^*)^{\text {T}} \mathbf {C}_{M}^{-1} (\mathbf {m} - \mathbf {m}_i^*) \right] , \end{aligned}$$
where \(\mathbf {m}_i^*\) is the \(i\hbox {th}\) sample from the prior probability distribution for the parameters, and \(\varvec{\epsilon }_i\) is the \(i\hbox {th}\) sample of observation errors. Note that if \(\mathbf {m}\) is a spatially distributed random field, its covariance is represented by the covariance matrix \(\mathbf {C}_M\). In subsurface applications, the data (e.g. production rates and well pressures) are indirectly functions of the petrophysical properties (e.g. permeability and porosity), \(\mathbf {f}\), so one might, with some abuse of notation, instead write the objective function as
$$\begin{aligned} S_i(\mathbf {m})= & {} \frac{1}{2} \left[ (\mathbf {g}(\mathbf {f}(\mathbf {m}))+ \varvec{\epsilon }_i -\mathbf {d}^{\mathrm {o}}) ^{\text {T}} \mathbf {C}_D^{-1} (\mathbf {g}(\mathbf {f}(\mathbf {m}))+ \varvec{\epsilon }_i -\mathbf {d}^{\mathrm {o}}) \right. \nonumber \\&\left. + (\mathbf {m} - \mathbf {m}_i^*)^{\text {T}} \mathbf {C}_{M}^{-1} (\mathbf {m} - \mathbf {m}_i^*) \right] . \end{aligned}$$
For efficient assimilation of data into large models, one would generally try to use a greedy minimization algorithm, which would require evaluation of the gradient of the objective function with respect to the model variables. If the objective function is differentiable with respect to the model variables, then at the minimum,
$$\begin{aligned} \nabla _m S_i= (\nabla _m \mathbf {f}^{\text {T}} \cdot \nabla _f \mathbf {g}^{\text {T}} ) \mathbf {C}_D^{-1} (\mathbf {g}(\mathbf {f}(\mathbf {m}))+ \varvec{\epsilon }_i-\mathbf {d}^{\mathrm {o}}) + \mathbf {C}_{M}^{-1} (\mathbf {m} - \mathbf {m}_i^*) = 0. \end{aligned}$$
In many data assimilation problems with spatially distributed variables, the petrophysical properties, \(\mathbf {f}\), are modeled as gaussian (perhaps after a simple transformation, e.g. log-transformation for permeability), so it is possible to use \(\mathbf {f}(\mathbf {m}) = \mathbf {m}\), and when one computes the sensitivity of data to model variables, it is only necessary to compute \(\mathbf {G}_f^{\text {T}} = \nabla _f \mathbf {g}^{\text {T}} \). When the TPG model is used, however, one also needs the sensitivity of petrophysical properties to the gaussian model variables, \(\nabla _m \mathbf {f}^{\text {T}} \). If \(\mathbf {f}\) is constant within each domain on the truncation map, then the gradient \(\nabla _m \mathbf {f}^{\text {T}} \) will be zero everywhere except on the boundaries, where it is not defined. Straightforward application of gradient-based minimization methods is not useful in this case. Liu and Oliver (2004) introduced small transition regions near the domain boundaries on the truncation maps to allow the use of Newton-like minimization methods for history matching of TPG models. These transition regions made the problem differentiable, but convergence to the minimum of the objective function was slow.
Fig. 1

Best fit linear relationships (blue lines) for two truncation maps (dashed red lines) from 200 samples (blue dots). Both truncation maps have threshold values at \(m=-0.6\) and \(m=0.6\). a Monotonic truncation map, b non-monotonic truncation map

A standard ensemble-based solution to this problem is to ignore the derivative of the truncation map (\(\nabla _m \mathbf {f}^{\text {T}} \)) and instead compute the gradient based on covariances between production data and the latent gaussian variables (Liu and Oliver 2005b; Agbalaka and Oliver 2008; Sebacher et al. 2013; Astrakova and Oliver 2015). This approach works well when the relationship between petrophysical properties and gaussian variables is monotonic as in Fig. 1a. In that case, the ensemble approximation of the sensitivity provides useful information for updating the ensemble when data are assimilated. On the other hand, when the truncation map is non-monotonic, the correlation between the variables computed from the ensemble can result in completely wrong update directions (Fig. 1b). As an example, if the measured porosity at a given location is 0.1 (corresponding to \(-0.6< m < 0.6\)), and the current value in the model at that location is 0.3 (corresponding to \(m \le -0.6\)), then the slope of the line in Fig. 1b specifies that the gaussian variable in the model should be made more negative to match the observation while, in fact, it is necessary to make the gaussian variable more positive to match the model to the observation.
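The wrong-direction effect can be reproduced with a few lines of code. In the sketch below, the facies porosity values, sample size, and seed are illustrative assumptions, not the paper's exact configuration; the truncation thresholds are \(\pm 0.6\) as in Fig. 1. For the monotonic map the least-squares slope of porosity against the gaussian variable is clearly positive, while for the non-monotonic map the slope is negative, even though moving from the high-porosity left facies toward the middle facies requires increasing the gaussian variable.

```python
import random

random.seed(0)
ms = [random.gauss(0.0, 1.0) for _ in range(200)]

def truncate(m, values, thresholds=(-0.6, 0.6)):
    """Map a gaussian value to a facies property via a 1D truncation map."""
    if m <= thresholds[0]:
        return values[0]
    elif m < thresholds[1]:
        return values[1]
    return values[2]

def ls_slope(values):
    """Least-squares slope of property vs. gaussian variable from the sample."""
    fs = [truncate(m, values) for m in ms]
    mbar, fbar = sum(ms) / len(ms), sum(fs) / len(fs)
    cov = sum((m - mbar) * (f - fbar) for m, f in zip(ms, fs))
    var = sum((m - mbar) ** 2 for m in ms)
    return cov / var

slope_mono = ls_slope((0.1, 0.2, 0.3))  # porosity increases with m
slope_nonm = ls_slope((0.3, 0.1, 0.2))  # high porosity in the left tail

assert slope_mono > 0.0  # covariance gives a useful update direction
assert slope_nonm < 0.0  # update direction is wrong for the left facies
print(round(slope_mono, 3), round(slope_nonm, 3))
```

The negative slope for the second map would push the gaussian variable of a left-facies realization further left, exactly the failure described above.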

To summarize, although the TPG model has been successfully used with the ensemble Kalman filter (EnKF) for history matching and assimilation of observed values of petrophysical properties, there are two potential problems with the use of the TPG model for minimization-based methods of sampling. The first is that the truncation map is discontinuous and non-differentiable. If one wants to use gradient-based methods for data assimilation, then something must be done to approximate the derivatives of the truncation map in a useful way. The second problem is that ensemble-based methods are based on the covariance between model variables and data. The update directions for these methods are not useful if the covariance is not a good approximation of the relationship.

Note that other methods of sampling from the conditional distribution would be more appropriate for most of the examples shown in this manuscript. In particular, a Gibbs sampler would provide rigorous sampling in the TPG model for observations of facies types (Le Loc’h and Galli 1997; Emery 2007; Armstrong et al. 2011). The target applications are for cases in which the relationship between the observations and the property field is nonlinear and nonlocal. In those cases, the Gibbs sampler or MCMC may not be useful, and an approximate sampling method may be necessary.

3 Modified Gradient

Let \(\mathbf {m}\) denote the vector of gaussian model variables that are used to determine facies type and let \(\mathbf {f}\) denote the vector of petrophysical properties (e.g. gridblock permeability) that are determined by facies type. In a truncated gaussian model there would be one latent variable per cell in the model. If there is only one petrophysical property (e.g. porosity) per cell, then the dimension of \(\mathbf {f}\) is the same as the dimension of \(\mathbf {m}\). In a TPG model with two latent variables per cell and one petrophysical property per cell, there would be twice as many latent variables as petrophysical variables, so the dimension of \(\mathbf {m}\) would be twice the dimension of \(\mathbf {f}\). If both porosity and permeability are included, then the number of petrophysical variables and the number of latent variables are again the same.

In real cases, the facies type would only determine the distribution of petrophysical properties. In practice, we use a hierarchical model for ensemble-based history matching, in which the mean and the covariance of the petrophysical properties are determined by the facies type (Agbalaka and Oliver 2011; Astrakova and Oliver 2015), but for simplicity, the petrophysical properties are assumed here to be completely determined by facies type. In ensemble Kalman filter-like assimilation methods, the sensitivities, \(\mathbf {G}_f\), of predicted data \(\mathbf {d}\) with respect to property fields \(\mathbf {f}\) are estimated from the cross-covariance between the property fields and the simulated data realizations. We make the same assumption here, but note that the sensitivity with respect to the gaussian model variables, \(\mathbf {G}_m\), is required in order to update the gaussian fields when assimilating data into a truncated plurigaussian model.

Let M be the number of gaussian model variables and let N be the number of data. If the number of petrophysical properties is the same as the number of gaussian variables, the sensitivity of production data to model variables can be decomposed as follows
$$\begin{aligned} \mathbf {G}_m^{\text {T}} &= \nabla _m \mathbf {f}^{\text {T}} \cdot \nabla _f \mathbf {g}^{\text {T}} \\ &= \begin{bmatrix} \frac{\partial f_1}{\partial m_1} & \cdots & \frac{\partial f_{M}}{\partial m_1} \\ \vdots & & \vdots \\ \frac{\partial f_1}{\partial m_{M}} & \cdots & \frac{\partial f_{M}}{\partial m_{M}} \end{bmatrix} \begin{bmatrix} \frac{\partial g_1}{\partial f_1} & \cdots & \frac{\partial g_N}{\partial f_1} \\ \vdots & & \vdots \\ \frac{\partial g_1}{\partial f_{M}} & \cdots & \frac{\partial g_N}{\partial f_{M}} \end{bmatrix} \\ &= \begin{bmatrix} \frac{\partial f_1}{\partial m_1} & & 0 \\ & \ddots & \\ 0 & & \frac{\partial f_{M}}{\partial m_{M}} \end{bmatrix} \begin{bmatrix} \frac{\partial g_1}{\partial f_1} & \cdots & \frac{\partial g_N}{\partial f_1} \\ \vdots & & \vdots \\ \frac{\partial g_1}{\partial f_{M}} & \cdots & \frac{\partial g_N}{\partial f_{M}} \end{bmatrix} . \end{aligned}$$
Note that although the dimension of the matrix \(\nabla _m \mathbf {f}^{\text {T}} \) will be quite large in practical problems, it will be diagonal or block diagonal, so that it is only necessary to compute and operate with the elements on the diagonal.
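Because \(\nabla _m \mathbf {f}^{\text {T}} \) is diagonal, forming \(\mathbf {G}_m^{\text {T}} \) reduces to scaling each row of \(\mathbf {G}_f^{\text {T}} \) by the corresponding diagonal entry. A minimal numpy sketch of this equivalence, with toy sizes and random values chosen for illustration:

```python
import numpy as np

rng = np.random.default_rng(1)
M, N = 5, 3                      # toy numbers of model variables and data
Gf_T = rng.normal(size=(M, N))   # (grad_f g)^T: sensitivity of data to properties
diag_F = rng.normal(size=M)      # diagonal of (grad_m f)^T, one entry per cell

# Explicit product with a dense diagonal matrix ...
Gm_T_full = np.diag(diag_F) @ Gf_T
# ... equals elementwise row scaling, which never forms the M x M matrix.
Gm_T_scaled = diag_F[:, None] * Gf_T

assert np.allclose(Gm_T_full, Gm_T_scaled)
```

For large models, only the length-\(M\) vector of diagonal entries needs to be stored and applied.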

Since the rock-type rule (truncation map) assigns a constant facies type for values of \(\mathbf {m}\) within intervals, the function \(\mathbf {f}(\mathbf {m})\) is discontinuous. The derivatives \(\partial f_j/ \partial m_j\) are zero almost everywhere and are not defined on domain boundaries where the facies type changes. In Sect. 5, we will discuss a method for approximating \(\nabla _m \mathbf {f}^{\text {T}} \), but for now, we assume that such an approximation exists and discuss how it can be used for data assimilation in an iterative ensemble smoother.

Because of the nonlinearity in the relationship between petrophysical properties and the gaussian variables, it is necessary to use an iterative data assimilation method to update the model variables, even if the observation operator is linear in the petrophysical properties. We investigate two possible methods: one in which \(\mathbf {G}_m\) is computed using Eq. (4) with an analytical approximation of \(\nabla _m \mathbf {f}^{\text {T}} \), and the standard method, in which the sensitivity \(\mathbf {G}_m\) is computed directly from the ensemble (Liu and Oliver 2005b; Agbalaka and Oliver 2008; Sebacher et al. 2013; Astrakova and Oliver 2015).

4 Data Assimilation

4.1 Iterative Ensemble Smoother

Because the problem of updating a facies model is highly nonlinear, we use the Levenberg–Marquardt form of the ensemble randomized maximum likelihood method (LM-EnRML; Chen and Oliver 2013), which for simplicity we will refer to by the generic term iterative ensemble smoother (IES). The update equation for the Levenberg–Marquardt approach takes the form (Chen and Oliver 2013, Eq. 10)
$$\begin{aligned} \delta \mathbf {m}_{i}= & {} - \left[ (1 + \lambda ) \mathbf {P}^{-1} + \mathbf {G}_m^{\text {T}} \mathbf {C}_D^{-1} \mathbf {G}_m \right] ^{-1} \mathbf {C}_M^{-1} (\mathbf {m}_{i} - \mathbf {m}^*_i) \nonumber \\&- \mathbf {P} \mathbf {G}_m^{\text {T}} \left[ (1 + \lambda ) \mathbf {C}_D + \mathbf {G}_m \mathbf {P} \mathbf {G}_m^{\text {T}} \right] ^{-1} (\mathbf {g}(\mathbf {m}_{i}) - \mathbf {d}^{\mathrm {o}}) , \end{aligned}$$
where \(\lambda \) is the Levenberg–Marquardt damping factor and \(\mathbf {P}\) is an approximation of the model parameter covariance matrix. In the \(\ell \)th iteration of the IES, we approximate \(\mathbf {P}\) from the ensemble of parameter perturbations, \(\Delta \mathbf {m}_\ell \), using \(\mathbf {P}_\ell = \Delta \mathbf {m}_\ell \Delta \mathbf {m}_\ell ^{\text {T}} /(N_e -1)\), in which case the update for the ith ensemble member is
$$\begin{aligned} \delta \mathbf {m}_{i \ell }= & {} - (N_e - 1)\Delta \mathbf {m}_\ell \left[ \alpha _\ell \mathbf {I} + \Delta \mathbf {d}_\ell ^{\text {T}} \mathbf {C}_D^{-1} \Delta \mathbf {d}_\ell \right] ^{-1} \Delta \mathbf {m}_\ell ^{\text {T}} (\Delta \mathbf {m}_0 \Delta \mathbf {m}_0^{\text {T}} )^+ (\mathbf {m}_{i \ell } - \mathbf {m}^*_i) \nonumber \\&- \Delta \mathbf {m}_\ell \Delta \mathbf {d}_\ell ^{\text {T}} \left[ \alpha _\ell \mathbf {C}_D + \Delta \mathbf {d}_\ell \Delta \mathbf {d}_\ell ^{\text {T}} \right] ^{-1} (\mathbf {g}(\mathbf {m}_{i \ell }) + \varvec{\epsilon }_i - \mathbf {d}^{\mathrm {o}}) \end{aligned}$$
for \(\alpha _\ell = (1 + \lambda _\ell )(N_e -1)\). Here and in subsequent equations, the superscript “\(+\)” refers to the Moore–Penrose pseudo inverse. Equation (6) is a standard update formula for an iterative ensemble smoother with regularization of the step (Chen and Oliver 2013, Eq. 15), except that scalings of model parameters and data have been neglected for clarity.
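As a concrete sketch, the ensemble form of the update can be written in a few lines of numpy. The function name and argument layout below are our own, scalings of parameters and data are neglected as in the text, and the matrices are assumed to be the perturbation matrices \(\Delta \mathbf {m}\) (\(M \times N_e\)) and \(\Delta \mathbf {d}\) (\(N_d \times N_e\)):

```python
import numpy as np

def ies_update(dm0, dm, dd, CD, lam, m_i, m_prior_i, resid_i):
    """One Levenberg-Marquardt IES step for ensemble member i (sketch of Eq. (6)).

    dm0, dm : M x Ne prior / current parameter perturbation matrices
    dd      : Nd x Ne predicted-data perturbation matrix
    CD      : Nd x Nd observation-error covariance
    resid_i : g(m_i) + eps_i - d_obs for this member
    """
    Ne = dm.shape[1]
    alpha = (1.0 + lam) * (Ne - 1)

    # Prior-mismatch term:
    # (Ne-1) dm [a I + dd^T CD^-1 dd]^-1 dm^T (dm0 dm0^T)^+ (m - m*)
    A = alpha * np.eye(Ne) + dd.T @ np.linalg.solve(CD, dd)
    v = np.linalg.pinv(dm0 @ dm0.T) @ (m_i - m_prior_i)
    prior_term = (Ne - 1) * dm @ np.linalg.solve(A, dm.T @ v)

    # Data-mismatch term: dm dd^T [a CD + dd dd^T]^-1 resid
    B = alpha * CD + dd @ dd.T
    data_term = dm @ (dd.T @ np.linalg.solve(B, resid_i))

    return -prior_term - data_term
```

Both inner solves involve only \(N_e \times N_e\) or \(N_d \times N_d\) systems, which is what makes the ensemble form practical when \(M\) is large.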

4.2 Hybrid Derivatives

This hybrid method starts with the same basic updating formula (Eq. (5)), but decomposes the sensitivity of data to the latent gaussian variables, which we write as
$$\begin{aligned} \mathbf {G}_m^{\text {T}} = \nabla _m \mathbf {f}^{\text {T}} \cdot \nabla _f \mathbf {g}^{\text {T}} \equiv \mathbf {F} \, \mathbf {G}_f^{\text {T}} . \end{aligned}$$
In this case, we use \(\mathbf {P}= \mathbf {C}_M \approx (\Delta \mathbf {m}_0 \Delta \mathbf {m}_0^{\text {T}} )/(N_e -1)\) based on the initial ensemble of model parameters, instead of \(\mathbf {P}= \Delta \mathbf {m}_\ell \Delta \mathbf {m}_\ell ^{\text {T}} /(N_e -1)\), as there appears to be no advantage to updating \(\mathbf {P}\) at each iteration in this case. A formula corresponding to Eq. (5) for hybrid updating of model realizations in a Levenberg–Marquardt algorithm would be
$$\begin{aligned} \delta \mathbf {m}= & {} - \left[ (1 + \lambda ) \mathbf {C}_M^{-1} + \mathbf {F} \, \mathbf {G}_f^{\text {T}} \mathbf {C}_D^{-1} \mathbf {G}_f \, \mathbf {F}^{\text {T}} \right] ^{-1} \mathbf {C}_M^{-1} (\mathbf {m} - \mathbf {m}^*_i) \nonumber \\&- \mathbf {C}_M \mathbf {F} \, \mathbf {G}_f^{\text {T}} \left[ (1 + \lambda ) \mathbf {C}_D + \mathbf {G}_f \, \mathbf {F}^{\text {T}} \mathbf {C}_M \mathbf {F} \, \mathbf {G}_f^{\text {T}} \right] ^{-1} (\mathbf {g}(\mathbf {m}_\ell ) + \varvec{\epsilon }_i - \mathbf {d}^{\mathrm {o}}) \, . \end{aligned}$$
In an ensemble-based approach, we approximate the prior covariance matrix by its sample covariance, \( \mathbf {C}_M \approx \Delta \mathbf {m}_0 \Delta \mathbf {m}_0^{\text {T}} /(N_e - 1)\). Substituting the expression for the sample covariance, the update formula in an iterative ensemble smoother approach is then written as
$$\begin{aligned} \delta \mathbf {m}= & {} - \left[ (1 + \lambda ) \left( \frac{\Delta \mathbf {m}_0 \Delta \mathbf {m}_0^{\text {T}} }{N_e - 1}\right) ^+ + \mathbf {F} \, \mathbf {G}_f^{\text {T}} \mathbf {C}_D^{-1} \mathbf {G}_f \, \mathbf {F}^{\text {T}} \right] ^{-1} \nonumber \\&\times \left( \frac{\Delta \mathbf {m}_0 \Delta \mathbf {m}_0^{\text {T}} }{N_e - 1}\right) ^+ (\mathbf {m} - \mathbf {m}^*_i) - \left( \frac{\Delta \mathbf {m}_0 \Delta \mathbf {m}_0^{\text {T}} }{N_e - 1}\right) \mathbf {F} \, \mathbf {G}_f^{\text {T}} \nonumber \\&\times \left[ (1 + \lambda ) \mathbf {C}_D + \mathbf {G}_f \, \mathbf {F}^{\text {T}} \left( \frac{\Delta \mathbf {m}_0 \Delta \mathbf {m}_0^{\text {T}} }{N_e - 1}\right) \mathbf {F} \, \mathbf {G}_f^{\text {T}} \right] ^+ (\mathbf {g}(\mathbf {m}_\ell ) + \varvec{\epsilon }_i - \mathbf {d}^{\mathrm {o}}) \, . \end{aligned}$$
We make use of the matrix identity
$$\begin{aligned} \begin{aligned} ((\mathbf {A}\mathbf {A}^{\text {T}} )^+ + \mathbf {B})^+ (\mathbf {A}\mathbf {A}^{\text {T}} )^+&= \mathbf {A} (\mathbf {I} + \mathbf {A}^{\text {T}} \mathbf {B} \mathbf {A} )^+ (\mathbf {A}\mathbf {A}^{\text {T}} )^+ \\&= \mathbf {A} (\mathbf {I} + \mathbf {A}^{\text {T}} \mathbf {B} \mathbf {A} )^+ \mathbf {A}^+ \end{aligned} \end{aligned}$$
for positive definite matrices \(\mathbf {B}\). After some expansion and simplification, this can be written in terms of ensemble quantities,
$$\begin{aligned} \begin{aligned} \delta \mathbf {m} =&- (N_e - 1) \Delta \mathbf {m}_0 \left[ \alpha \mathbf {I} + \mathbf {B}_i \mathbf {C}_D^{-1} \mathbf {B}_i^{\text {T}} \right] ^{-1} (\Delta \mathbf {m}_0)^+ (\mathbf {m} - \mathbf {m}^*_i) \\&- \Delta \mathbf {m}_0 \, \mathbf {B}_i \left[ \alpha \mathbf {C}_D + \mathbf {B}_i^{\text {T}} \, \mathbf {B}_i \right] ^{-1} (\mathbf {g}(\mathbf {m}_\ell ) + \varvec{\epsilon }_i - \mathbf {d}^{\mathrm {o}}), \end{aligned} \end{aligned}$$
where \(\alpha = (1 + \lambda )(N_e-1)\), \(\mathbf {B}_i = (\Delta \mathbf {m}_0)^{\text {T}} \mathbf {F}_i \mathbf {G}_f^{\text {T}} ,\) and the sensitivity of data to petrophysical properties is computed from the ensemble,
$$\begin{aligned} \mathbf {G}_f = \Delta \mathbf {d} \, (\Delta \mathbf {f})^+ . \end{aligned}$$
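For a noise-free linear forward model, the ensemble estimate \(\mathbf {G}_f = \Delta \mathbf {d} \, (\Delta \mathbf {f})^+\) recovers the operator exactly whenever the property perturbations have full row rank. A small numpy check with a hypothetical linear operator and toy sizes:

```python
import numpy as np

rng = np.random.default_rng(2)
Nf, Nd, Ne = 4, 3, 50
G_true = rng.normal(size=(Nd, Nf))      # hypothetical linear forward operator

f = rng.normal(size=(Nf, Ne))           # ensemble of property fields
d = G_true @ f                          # predicted data for each member

df = f - f.mean(axis=1, keepdims=True)  # property perturbations, Delta f
dd = d - d.mean(axis=1, keepdims=True)  # data perturbations, Delta d

G_est = dd @ np.linalg.pinv(df)         # G_f = Delta d (Delta f)^+
assert np.allclose(G_est, G_true)
```

With nonlinear forward models or \(N_e\) smaller than the rank of the property field, the estimate is only a low-rank, noisy approximation, which is the regime discussed in the text.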
In some cases, we use the approximate form of the LM-EnRML in which the prior model mismatch term is dropped from the update equations,
$$\begin{aligned} \delta \mathbf {m} = - \Delta \mathbf {m}_0 \, \mathbf {B}_i \left[ \alpha \mathbf {C}_D + \mathbf {B}_i^{\text {T}} \, \mathbf {B}_i \right] ^{-1} (\mathbf {g}(\mathbf {m}_\ell ) + \varvec{\epsilon }_i - \mathbf {d}^{\mathrm {o}}) . \end{aligned}$$
Note that the matrix \(\mathbf {B}_i\) is different for each realization since the \(\mathbf {F}_i\) are computed locally instead of from the ensemble.

5 Approximation of \(\mathbf {F} = \nabla _m \mathbf {f}^{\text {T}} \)

The truncation function \(\mathbf {f}(\mathbf {m})\) in the truncated plurigaussian method is neither differentiable nor continuous, hence direct application of gradient-based methods for minimization of the objective function is not appropriate. In order to make use of gradient-based methods for data assimilation with a truncated plurigaussian model, it has been necessary to define a derivative based on an approximation to \(\mathbf {f}(\mathbf {m})\) in which the discontinuities were replaced with transition regions (Liu and Oliver 2004). Although the function \(\mathbf {f}(\mathbf {m})\) itself was not altered, the use of the approximation to a derivative improved convergence. Surprisingly, when the adjoint method was compared with the EnKF, it was found that updating truncated plurigaussian models using the EnKF was faster and the data match was better than results obtained using the adjoint and transition regions (Liu and Oliver 2005a). The efficiency of the ensemble-based method seemed to be a result of the ensemble approximation of the gradient being better for minimization than the direct computation of an approximation with a transition region. The usefulness of the EnKF for minimizing discontinuous objective functions has been shown by Chen and Oliver (2012).

From the example illustrated in Fig. 1, it is clear that a single (global) gradient from an ensemble-based approach will not be useful when the relationship between petrophysical properties and the gaussian variable is non-monotonic, but the success of ensemble-based approaches indicates that large-scale trends are sometimes better than locally accurate derivatives when the relationships between model parameters and observations are discontinuous. Based on previous experience and the analysis of Zupanski et al. (2008), we compute derivatives from a piecewise linear approximation to the truncation map with nodes at the “center of probability mass” for each domain of the truncation map. If, for example, one domain is defined by the interval \(-\infty < m \le s_1\), and the prior distribution is standard normal, then the location of the center of probability mass for that domain is
$$\begin{aligned} \mu _1 = \int _{-\infty }^{s_1} m \exp (-m^2/2) \, \mathrm{d}m \left( \int _{-\infty }^{s_1} \exp (-m^2/2) \, \mathrm{d}m \right) ^{-1} , \end{aligned}$$
which evaluates to \(-1.215\) for the case in which \(s_1 = -0.60\). For a three-facies truncation map with centroids at \(\mu _1\), \(\mu _2\), and \(\mu _3\), and corresponding values of f equal to \(v_1\), \(v_2\), \(v_3\), the derivative of the piecewise linear approximation to the truncation map is
$$\begin{aligned} \frac{\mathrm{d}f}{\mathrm{d}m} = {\left\{ \begin{array}{ll} \frac{v_1 - v_2}{\mu _1 - \mu _2 } &{} m \le \mu _2 \\ \frac{v_2 - v_3}{\mu _2 - \mu _3 } &{} m > \mu _2 . \end{array}\right. } \end{aligned}$$
Figure 2a–c show the location of centroids and the corresponding piecewise linear approximations to the truncation maps for the truncated gaussian test cases in Sect. 6.1. Note that the piecewise linear approximation is quite similar to the ensemble approximation when the truncation map is monotonic (compare Fig. 2a to Fig. 1a), but that the piecewise linear approximation captures the large-scale relationship much better when the truncation map is non-monotonic (compare Fig. 2b to Fig. 1b).
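The centroid calculation has a closed form: for a standard normal, \(\int _a^b m\,\varphi (m)\,\mathrm{d}m = \varphi (a) - \varphi (b)\), so the centroid of a domain \((a,b)\) is \((\varphi (a) - \varphi (b))/(\varPhi (b) - \varPhi (a))\). A short sketch for thresholds at \(\pm 0.6\), with illustrative (hypothetical) facies values for the piecewise constant derivative:

```python
import math

def phi(x):
    """Standard normal pdf."""
    return math.exp(-x * x / 2.0) / math.sqrt(2.0 * math.pi)

def Phi(x):
    """Standard normal cdf."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def centroid(a, b):
    """Center of probability mass of N(0,1) on (a, b)."""
    return (phi(a) - phi(b)) / (Phi(b) - Phi(a))

s1, s2 = -0.6, 0.6
mu1 = centroid(-math.inf, s1)   # left domain
mu2 = centroid(s1, s2)          # middle domain (0 by symmetry)
mu3 = centroid(s2, math.inf)    # right domain
print(round(mu1, 3))            # -1.215, as quoted in the text

# Piecewise constant derivative of the interpolated map for
# hypothetical facies values v1, v2, v3.
v1, v2, v3 = 0.1, 0.2, 0.3
dfdm_left = (v1 - v2) / (mu1 - mu2)   # for m <= mu2
dfdm_right = (v2 - v3) / (mu2 - mu3)  # for m > mu2
```
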

6 Test Cases

6.1 One-Dimensional Truncated Gaussian: Linear, Local Observations

The ability to assimilate data into truncated plurigaussian models using ensemble-based methods appears to depend strongly on the characteristics of the truncation map (TM). In this section we investigate the behavior of ensemble-based data assimilation on three simple truncated gaussian examples with local observations that are linear in the petrophysical properties. The truncation maps for the three examples are (1) monotonic, (2) non-monotonic, but asymmetric, and (3) non-monotonic and symmetric. For all three examples, we compare results from the Levenberg–Marquardt form of EnRML (Eq. (6)) to a similar approach that uses analytical approximations of the derivative of the relationship of petrophysical properties to the gaussian variables (Eq. (10)).

Figure 2a–c show the three truncation maps with piecewise linear interpolations of the mapping for computation of an approximate derivative. The truncation levels in each truncation map are set at \(-0.6\) and 0.6, so the prior probabilities for the three facies are 0.274, 0.452, and 0.274, respectively. The gaussian random field has a prior distribution with mean 0, variance 1, and a gaussian covariance with practical range of 15. Figure 2d–f show realizations of the property field from each of the truncation maps. Each of the property fields in Fig. 2 was generated from the same gaussian random field (Fig. 3). Note that property fields equivalent to the type in Fig. 2f could be generated using a monotonic truncation map with a single threshold and only two facies. That would not be the case for 2D property fields, however, because the continuity of extreme values is much different from the continuity of mean values (Adler et al. 2014).
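The quoted prior facies probabilities follow directly from the standard normal cdf evaluated at the truncation levels; a quick check:

```python
import math

def Phi(x):
    """Standard normal cdf."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

t = 0.6  # truncation levels at -0.6 and +0.6
p_low = Phi(-t)           # facies 1: m <= -0.6
p_mid = Phi(t) - Phi(-t)  # facies 2: -0.6 < m < 0.6
p_high = 1.0 - Phi(t)     # facies 3: m >= 0.6

# Approximately 0.274, 0.452, 0.274, as quoted in the text (to rounding).
print(p_low, p_mid, p_high)
```
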
Fig. 2

Three truncation maps (TM) and piecewise linear approximation (top row). Realizations of porosity field from each TM. The black dots in (a) to (c) are centroids for interpolating the truncation map. The black dots in the second row are observed porosity at five locations. a Monotonic TM, b non-monotonic TM, c symmetric TM, d porosity realization and data (monotonic), e porosity realization and data (non-monotonic), f porosity realization and data (symmetric)

Fig. 3

The gaussian field used to generate the true facies fields and true porosity observations for test cases in Sect. 6.1. The resulting porosity fields and corresponding porosity observations are shown in the second row of Fig. 2 for the three truncation maps considered

The data in this test case are observations of the porosity fields at five locations (black dots in Fig. 2d–f). Measurement errors are assumed to be additive, uncorrelated, and gaussian with mean 0 and standard deviation 0.01. Two data assimilation methods were applied to observations generated from each truncation map: (1) the standard full-form LM-EnRML (Eq. (6)) and (2) the hybrid method, which computes derivatives of the truncation map from its piecewise linear approximation (Eq. (10)). The methods were applied without localization with ensemble sizes of 60 and 200. There were no significant differences in the results for the two ensemble sizes, so we show results for ensemble size 200.

Figure 4 shows the final ensemble estimate of porosity for a single ensemble of size 200, for all three truncation maps and for both data assimilation methods. When the truncation map is monotonic (left column of Fig. 4), the posterior ensemble mean and the posterior ensemble spread for both methods are very similar. In both cases, the mean is quite close to the observations at locations where measurements were made. The spread is small at measurement locations, but increases fairly rapidly with distance from observations, as should be expected.
Fig. 4

Ensemble estimation of porosity at the final iteration for the standard and the hybrid methods. The solid orange lines show the true porosity. The black dots are porosity observations. The error bars represent the ensemble estimate with mean shown as dots and standard deviation shown as whiskers. The results are from one single ensemble of size 200. Three truncation maps (monotonic, non-monotonic and symmetric) shown in the top row of Fig. 2 are considered. The data consist of five linear local observations

When the truncation map is such that the mapping of gaussian variables to porosity is non-monotonic, as in Fig. 2b, the covariance of the data (porosity) to the gaussian variables is not a good representation of the relationship, so the directions for updating the gaussian variables are poor when a standard iterative ensemble smoother is used. For the example shown in the center column, top row of Fig. 4, the ensemble mean is far from the observations at observation locations, and the spread is not substantially reduced except for the observation located at \(x=30\). When a piecewise constant approximation to the derivative of the truncation map is used to generate the update directions (center column, bottom row of Fig. 4), the results are clearly much better, but the spread is still too large at several observation locations. The inability to match all data for the non-monotonic truncation map is due to the limitations of descent methods of minimization, as some descent directions lead to the wrong local minimum.

Finally, when the truncation map is symmetric, the posterior ensemble mean and spread obtained from the hybrid method using analytical derivatives seem nearly perfect, while the posterior ensemble mean and spread obtained from the standard ensemble-based method are almost unchanged from the initial ensemble (right column of Fig. 4). The good results for the hybrid method are partially a result of the fact that both local minima in the posterior pdf are equivalent. The ensemble-based method fails for the symmetric truncation map because the covariance provides no useful information on the relationship between the property values and the gaussian variables.
Fig. 5

Distribution of ensemble mean data mismatch \(\overline{O}_\mathrm{d}\) for final ensembles after data assimilation. Results are computed from 20 independent ensembles of size 200

To ensure that conclusions were not influenced strongly by the composition of the initial ensemble, each method was applied to 20 independent sets of noisy observations and 20 independent ensembles of initial model realizations. Figure 5 shows the distribution of the ensemble mean data mismatch (\(\overline{O}_\mathrm{d}\)) for the final ensembles after data assimilation. The data mismatch for each ensemble member is computed as
$$\begin{aligned} O_\mathrm{d}=\frac{1}{2}\sum _{j=1}^{N}\left( \frac{d^{\mathrm {sim}}_j-d^{\mathrm {o}}_j}{\sigma _j} \right) ^2 , \end{aligned}$$
where N is the total number of data, \(\sigma _j\) is the standard deviation of the data noise, and \(d^{\mathrm {sim}}_j\) and \(d^{\mathrm {o}}_j\) are components of the simulated and observed data in the vectors \(\mathbf {g}(\mathbf {m})\) and \(\mathbf {d}^{\mathrm {o}}\), respectively. Results from the multiple ensemble runs are consistent with those from the single ensemble. For the monotonic truncation map, there is no significant difference between the two methods. For the non-monotonic truncation map and for the symmetric truncation map, results from the hybrid method with analytical computation of the derivatives show a much better match to the observations.
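The mismatch defined above is a direct transcription into code; the function name is ours.

```python
import numpy as np

def data_mismatch(d_sim, d_obs, sigma):
    """Squared data mismatch O_d = 0.5 * sum(((d_sim - d_obs)/sigma)**2)
    for one ensemble member; the ensemble mean mismatch is the average
    of this quantity over all members."""
    r = (np.asarray(d_sim) - np.asarray(d_obs)) / np.asarray(sigma)
    return 0.5 * float(np.dot(r, r))
```

The ensemble mean \(\overline{O}_\mathrm{d}\) reported in the figures is then simply `np.mean([data_mismatch(d_sim[m], d_obs, sigma) for m in range(Ne)])` over the `Ne` members.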

6.2 1D Truncated Gaussian: Nonlinear, Nonlocal Observations

For most real data assimilation problems related to subsurface flow, the observations are nonlinear weighted averages of petrophysical properties. In these cases, the nonlinearity occurs both in the truncation map, which relates gaussian variables to petrophysical properties, and in the observation operator, which relates data to petrophysical properties. Because the hybrid method requires estimation of the sensitivity of data to petrophysical properties from a relatively small ensemble, we investigate the robustness of the methods to nonlinearity and nonlocality of observations.

For these tests, each of the five observations is the average of the squares of the property values over an interval of length ten, with the standard deviation of the data noise \(\sigma = 0.005\):
$$\begin{aligned} \mathbf {g}(\mathbf {m}) = 0.1 \begin{pmatrix} \sum _{i=6}^{15} m_i^2 \\ \sum _{i=26}^{35} m_i^2 \\ \sum _{i=46}^{55} m_i^2 \\ \sum _{i=66}^{75} m_i^2 \\ \sum _{i=86}^{95} m_i^2 \end{pmatrix} . \end{aligned}$$
The truncation maps are the same as those used in the previous section. We use ensemble sizes of 200 and 60. Because the amount of data being assimilated is quite small, localization is not used in any of the data assimilation examples. Each experiment was repeated 20 times.
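The observation operator above can be written down directly; note the shift to 0-based indexing, so the 1-based cells 6–15 become the slice `m[5:15]`.

```python
import numpy as np

def g(m):
    """Nonlinear, nonlocal observation operator of this section: each of
    the five data is 0.1 times the sum of squared property values over a
    ten-cell window (1-based cells 6-15, 26-35, 46-55, 66-75, 86-95)."""
    m = np.asarray(m)
    starts = [5, 25, 45, 65, 85]            # 0-based window starts
    return np.array([0.1 * np.sum(m[s:s + 10] ** 2) for s in starts])
```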
Fig. 6

Distribution of ensemble mean data mismatch \(\overline{O}_\mathrm{d}\) for final ensembles after data assimilation. Results are computed from twenty independent ensembles of size 60 (a) and of size 200 (b). The key for the statistical boxes is the same as in Fig. 5. a Ensemble size 60, b ensemble size 200

Figure 6 compares the ensemble mean data mismatch after 20 iterations for both methods on each of the truncation maps. Results for nonlinear, nonlocal observations are quite similar to results for linear, local observations shown in Sect. 6.1: both methods give equivalent results for the monotonic truncation map. The hybrid method gives better results for both the non-monotonic and the symmetric truncation maps. Results are quite similar for ensemble size 60 and for ensemble size 200.

6.3 Symmetric TPG, 2D Field

In this section, we use a 2D truncated plurigaussian example with a symmetric truncation map to illustrate additional features of the minimization problem, and again compare the behavior of the standard ensemble-based approach with the hybrid approach using analytically approximated derivatives. Here, the facies type is determined by two gaussian random fields. The first gaussian random variable (GRV), \(y_1\), has zero mean and a gaussian covariance function with practical ranges of 20 and 10 in the two principal directions (oriented at \(2\pi /5\) and \(-\pi /10\), respectively). The second GRV, \(y_2\), has zero mean and a gaussian covariance function with practical ranges of 36 and 6 in the two principal directions (oriented at \(\pi /4\) and \(-\pi /4\), respectively). The true facies field is shown in Fig. 7a, and the truncation map is shown in Fig. 7b. The black dots in the truncation map show the nodes for interpolation of the truncation map. The data are permeability values at the nine locations shown in Fig. 7a.
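A gaussian covariance with a stated practical range is commonly parameterized as \(\exp(-3(h/a)^2)\), so that the correlation drops to about 0.05 at lag \(a\). A sketch with the ranges and orientation quoted for \(y_1\) follows; the factor-3 convention is an assumption on our part, since the text does not define the practical range explicitly.

```python
import numpy as np

def gaussian_cov(h1, h2, ranges=(20.0, 10.0), angle=2 * np.pi / 5):
    """Anisotropic gaussian covariance (unit variance) for lag (h1, h2),
    with practical ranges along rotated principal directions. Parameters
    default to the values quoted for y1; the exp(-3 (h/a)^2) form is the
    usual 'practical range' convention, assumed here."""
    c, s = np.cos(angle), np.sin(angle)
    u = c * h1 + s * h2       # lag component along the major direction
    v = -s * h1 + c * h2      # lag component along the minor direction
    return np.exp(-3.0 * ((u / ranges[0]) ** 2 + (v / ranges[1]) ** 2))
```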
Fig. 7

The truncation map (b) and the true facies field (a). The permeability values for the three facies types are 50 mD (dark green), 100 mD (green) and 150 mD (yellow), respectively. a True facies field, b truncation map mapping gaussian variables \(y_1\) and \(y_2\) to facies

Fig. 8

Bilinear interpolation of the truncation map (a) and the approximate derivatives of permeability with respect to \(y_1\) (b) and \(y_2\) (c). a Bilinear interpolation of truncation map, b approximate \(\mathrm{d}f/\mathrm{d}y_1\), c approximate \(\mathrm{d}f/\mathrm{d}y_2\)

Approximate derivatives for this case are computed by differentiating the bilinear interpolation surface connecting the nodes (Fig. 8a). Note that the gradients of the interpolation surface with respect to both gaussian random variables are discontinuous along the lines of symmetry (Fig. 8b, c). Data assimilation is done using the standard approximate form of LM-EnRML (dropping the first term of Eq. (6)) and the approximate form of LM-EnRML with the analytical approximations of the gradient of the TM (Eq. (12)). In both cases, we used a single large ensemble of size 500 to reduce the effect of sampling error. The standard ensemble method (Eq. (6)) was unable to reduce the data mismatch substantially because the covariance of permeability values with the gaussian random variables does not provide a useful representation of the relationship between the variables. In contrast, the hybrid method achieves a much better data match after just three iterations (Fig. 9).
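The derivative used by the hybrid method can be sketched as the exact gradient of a bilinear interpolant of nodal permeability values on the \((y_1, y_2)\) plane. The node positions and values below are illustrative, not the paper's actual map.

```python
import numpy as np

# Illustrative node grid and nodal permeabilities (NOT the paper's map).
y1_nodes = np.array([-3.0, -0.8, 0.8, 3.0])
y2_nodes = np.array([-3.0, -0.8, 0.8, 3.0])
perm_nodes = np.random.default_rng(0).uniform(50.0, 150.0, (4, 4))

def bilinear_and_grad(y1, y2):
    """Bilinear interpolation of nodal permeability and its analytical
    gradient (df/dy1, df/dy2). The gradient is piecewise constant in each
    direction within a cell and discontinuous across cell edges, matching
    the behavior described for Fig. 8."""
    i = int(np.clip(np.searchsorted(y1_nodes, y1) - 1, 0, len(y1_nodes) - 2))
    j = int(np.clip(np.searchsorted(y2_nodes, y2) - 1, 0, len(y2_nodes) - 2))
    dx = y1_nodes[i + 1] - y1_nodes[i]
    dy = y2_nodes[j + 1] - y2_nodes[j]
    t = (y1 - y1_nodes[i]) / dx
    u = (y2 - y2_nodes[j]) / dy
    f00, f10 = perm_nodes[i, j], perm_nodes[i + 1, j]
    f01, f11 = perm_nodes[i, j + 1], perm_nodes[i + 1, j + 1]
    f = (1 - t) * (1 - u) * f00 + t * (1 - u) * f10 \
        + (1 - t) * u * f01 + t * u * f11
    dfdy1 = ((1 - u) * (f10 - f00) + u * (f11 - f01)) / dx
    dfdy2 = ((1 - t) * (f01 - f00) + t * (f11 - f10)) / dy
    return f, dfdy1, dfdy2
```

Because the gradient depends on which cell \((y_1, y_2)\) falls in, each ensemble member gets its own locally evaluated sensitivity, which is what makes the per-member "Kalman gain" of the hybrid method possible.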

Neither method was able to match all nine observations in every ensemble member. After iteratively assimilating the observations, the number of realizations with an exact match at all nine data locations is 56 for the hybrid method and 1 for the standard method. For the standard ensemble-based method, the locations with low permeability measurements were matched best, as shown by the low permeability values in the ensemble mean at the four data locations and the low standard deviation of the estimate in Fig. 10. The data match was poor for observations of intermediate permeability (100 mD) or high permeability (150 mD). This is a result of the location of these two facies types on the truncation map, not a consequence of the permeability values. The hybrid method matched both high and low values of permeability well, but was frequently unable to match the observed intermediate values of permeability (Fig. 11a, d).
Fig. 9

Iterative reduction in data mismatch \(O_\mathrm{d}\) for an ensemble of size 500. a Standard method, b hybrid method

Fig. 10

Mean and standard deviation of permeability field at final (10th) iteration for the standard method. Ensemble size is 500. a Mean permeability, b Std of permeability

Fig. 11

Mean and standard deviation of permeability field and the two GRVs at final (10th) iteration for the hybrid method. Ensemble size is 500. a Mean permeability, b mean first GRV \(y_1\), c mean second GRV \(y_2\), d Std of permeability, e Std of first GRV \(y_1\), f Std of second GRV \(y_2\)

Fig. 12

Two realizations of the facies field, before and after assimilation of nine observations using the hybrid method. Ensemble size is 500. a Realization 4, initial (left) and final (right), b realization 28, initial (left) and final (right)

Fig. 13

Scatter plot of the initial ensemble of \(y_1\) (horizontal axis) and \(y_2\) (vertical axis) in blue and the final ensemble of \(y_1\) and \(y_2\) in red. Black lines show the boundaries of the facies regions on the truncation map. The truncation map is shown in Fig. 7b

Fig. 14

A simplified version of a TPG model of a tidal flat environment (modified from Biver et al. 2015)

Somewhat surprisingly, although the match to data using the hybrid method is quite good and the realizations after updating appear quite plausible (Fig. 12), the mean of the final gaussian random fields (Fig. 11b, e) was changed very little from the mean of the initial gaussian random fields. The posterior standard deviation, on the other hand, remained quite large at some data locations (approximately 1.5 at the observation locations where low permeability was observed, i.e. dark green on the truncation map). In contrast, the standard deviation of the final gaussian random field using the hybrid method was reduced at locations where high permeability was observed, as should be expected. Because the true posterior pdf for the latent gaussian variables is not gaussian, it is helpful to look at scatterplots of the pairs (\(y_1,y_2\)) for several observation locations. Figure 13 shows scatterplots of the initial ensemble (blue) and the final ensemble (red) at locations with low, intermediate, and high permeability observations.

To match observations of low permeability, the variable \(y_1\) must either be less than \(-0.8\) or greater than 0.8. Figure 13d shows that the hybrid method accomplishes this fairly well by placing points in regions on both the left and right sides of the TM. Only a few points (off the scale), with low (less than \(-3\)) or high (greater than 3) values of \(y_2\) and with \(y_1\) taking values in the center region, were not assigned the correct permeability. The standard ensemble-based methods use the cross-covariance between observations (permeability) and the model variables. Because the truncation map is symmetric, and the prior distributions for \(y_1\) and \(y_2\) are symmetric, the cross-covariance is zero and updating is not possible. For a finite ensemble size, however, a small non-zero correlation between permeability and \(y_1\) will always be present in the initial ensemble, allowing the values of \(y_1\) at the observation location to be slowly driven to larger or smaller values during iteration (Fig. 13a). This shift of the ensemble allows a data match, but distorts the posterior distribution of \(y_1\) so that the continuity of facies might be altered.

To match high permeability observations, the gaussian variables must move to the region of the truncation map near the origin (\(-0.8< y_1 < 0.8\) and \(-0.8< y_2 < 0.8\)). Figure 13f shows that the hybrid method does this perfectly, while the standard ensemble-based method is unable to improve the distribution of points (Fig. 13c). Neither method matches the intermediate permeability values well (Fig. 13b, e). The standard ensemble-based method does not change the initial distribution significantly, while the hybrid derivative method tries to locate the 100 mD facies type between the 50 mD and 150 mD facies types, which is incorrect based on the truncation map.
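The facies rule implied by the discussion above (50 mD whenever \(|y_1|>0.8\), 150 mD in the central square, 100 mD otherwise) can be stated compactly; the function name is ours, and the region assignment is inferred from the text rather than quoted from it.

```python
def permeability(y1, y2, t=0.8):
    """Facies rule implied by the symmetric truncation map of this
    section (thresholds at +/-t on both axes): 50 mD whenever |y1| > t,
    150 mD in the central square, and 100 mD otherwise (|y1| < t but
    |y2| > t). Region assignment inferred from the text."""
    if abs(y1) > t:
        return 50.0
    if abs(y2) > t:
        return 100.0
    return 150.0
```

The rule makes the symmetry problem explicit: flipping the sign of \(y_1\) or \(y_2\) leaves the permeability unchanged, so the cross-covariance between permeability and either gaussian variable vanishes in the limit of a large ensemble.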

6.4 Symmetric TPG, 2D Field Based on Model of Tidal Flats

The TPG model of a tidal flat environment (Biver et al. 2015) is relatively complex, with spatial variability of the covariance of the latent gaussian random variables and a symmetric truncation map. We have simplified Biver et al.'s model somewhat to focus on the effect of the symmetries of the truncation map on the ability to assimilate observations. Figure 14 shows the two "true" gaussian random fields, the truncation map, and the "true" permeability field that was used to generate observations at the nine locations shown as black dots on the true field.

Both the hybrid method and the standard IES were used with an ensemble size of 100 to assimilate the observations of "permeability" at the nine observation locations. The approximate gradient of permeability with respect to the gaussian random fields was computed from a bilinear interpolation of the 23 nodes shown in the truncation map (Fig. 14). The observation error was assumed to be normally distributed with mean 0 and variance 1. For the hybrid method, the reduction in the data mismatch (\(O_\mathrm{d}\)) was fairly steady until the sixth iteration, after which the updating was quite minor (Fig. 15a). Figure 15b shows the distribution of data mismatch at the beginning and at the end of iteration. Note that none of the initial realizations matched all observations, but after 10 iterations, 23 of the 100 realizations matched all observations. When the standard IES method was used, there was no reduction in the data mismatch, so we have not included a plot of the reduction for that case.
Fig. 15

Reduction of data mismatch \(O_\mathrm{d}\) with iteration for the tidal flats environment model using the hybrid method. a Data mismatch \(O_\mathrm{d}\) versus iteration, b histograms of initial data mismatch (yellow) and final mismatch (blue) for the hybrid method

Fig. 16

Three realizations from the initial ensemble and the corresponding updated realizations from the hybrid method

Figure 16 shows three realizations from the initial ensemble with the corresponding realizations after updating using the hybrid method. The realizations shown were selected from the set of realizations with a perfect match to the observations. We note that the updated realizations display two useful properties. Firstly, the updated realizations exhibit the same continuity characteristics as the initial realizations, indicating that the updating was not harmful to connectivity. Secondly, the updated realizations are highly diverse, i.e. there appears to be little tendency towards ensemble collapse. This is partly a result of the small number of observations, but also of the fact that a different "Kalman gain matrix" is used to update each ensemble member.

6.5 History Matching Example

In this section, we show a history matching example using the same true permeability field and truncation map layout as in Sect. 6.3. The true permeability field and the locations of four producers and one injector are shown in Fig. 17a. Only two phases, oil and water, are present in the reservoir. All wells are constrained by bottom hole pressure. Data for history matching include the oil rate and water rate at all four producers and the water injection rate at the injector at 19 different times in a 9-year period, so the total number of data is \(9 \times 19 = 171\). The initial facies realizations are conditioned to facies type at the five well locations through spatially varying facies proportions. The proportions of the three facies in each gridblock (e.g. Fig. 17b–d) are computed from 1000 conditional facies realizations obtained through rejection sampling. At well locations where the facies type has been observed, the proportion is equal to one for the type of facies that is measured, and zero for all other facies types. The influence of the facies measurement on proportions in non-observed cells is somewhat local. In regions far from wells, the facies proportions approach the global prior facies proportions of 0.33, 0.25, and 0.42 (the result of using \(-0.8\) and 0.8 as thresholds for both \(y_1\) and \(y_2\)).
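The quoted global proportions follow from the standard normal CDF under independent \(y_1\) and \(y_2\); a short check, with the facies-to-region assignment inferred from Sect. 6.3 (central square for the highest permeability, \(|y_1|>0.8\) for the lowest):

```python
from math import erf, sqrt

def Phi(x):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

# Prior facies proportions implied by thresholds of +/-0.8 on independent
# standard-normal y1 and y2 (region assignment inferred from Sect. 6.3).
p_in = 2.0 * Phi(0.8) - 1.0      # P(-0.8 < y < 0.8) for one variable
p_150 = p_in * p_in              # both y1 and y2 in the central band
p_50 = 1.0 - p_in                # |y1| > 0.8
p_100 = p_in * (1.0 - p_in)      # |y1| < 0.8 but |y2| > 0.8
# Gives approximately 0.33, 0.24, 0.42 for p_150, p_100, p_50, consistent
# with the quoted proportions (0.33, 0.25, 0.42) up to rounding.
```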
Fig. 17

The true permeability field with the locations of four producers (circles) and one injector (triangle) is shown in (a). The proportions of the facies with permeability equal to 150, 100, and 50 mD are shown in (b), (c), and (d), respectively. The truncation maps at four different gridblocks are shown in (e) to (h); these four gridblocks are marked by "\(\times \)" in (a). The white dots in (e) to (h) are centroids for interpolating the truncation map. The producers (P1, P2, P3, P4) and the injector (I1) are located at gridblocks (6, 6), (6, 42), (42, 42), (42, 6), and (25, 25), respectively

When the number of conditioning data is large, the cost of rejection sampling would be too high for the method to be feasible. In this case, a more practical but less correct solution is to krige the proportions with an assumed variogram (Sebacher et al. 2013). Alternatively, conditioning to facies type at the well locations can be obtained by sampling the underlying gaussian random fields with the facies observations as constraints using the Gibbs sampler (Armstrong et al. 2011). Because the layout of the truncation map is very simple, it is easy to compute the locations of the threshold lines in the truncation map given the proportions of the three facies. Four examples of the truncation map with various facies proportions are shown in the second row of Fig. 17. The gridblocks associated with these truncation maps are marked by "\(\times \)" in Fig. 17a. They were selected to be next to the wells, where the proportions vary the most from the global facies proportions. As in Sect. 6.3, the white dots in Fig. 17e–h are centroids for interpolating the truncation map for updating using the hybrid method. The initial realizations for history matching are then generated using unconditional underlying gaussian random fields (\(y_1\) and \(y_2\)) with the spatially varying thresholds in the truncation map.
Table 1

The top panel shows the iteration at which LM-EnRML stopped because it met one of the stopping criteria. The bottom panel shows the number of realizations with data mismatch less than 2500 at the last iteration for both methods. The ensemble size is 100. Columns correspond to Runs 1–5; rows give the iterations required and the number of realizations with \(O_{\mathrm {d}} <2500\)
The full form of LM-EnRML is used for both the standard method (Eq. (6)) and the hybrid method (Eq. (10)). Five data assimilation experiments with independent ensembles of model realizations are performed to check the consistency of the results. The ensemble size is 100 for all runs, and localization is not used because the localization region extends through the entire domain for this single-pattern example (Chen and Oliver 2010). Table 1 summarizes the results from data assimilation. The top panel shows the iteration at which LM-EnRML stopped because it met one of the stopping criteria. The bottom panel shows the number of realizations with data mismatch \(O_{\mathrm {d}}\) (Eq. (13)) less than 2500 at the last iteration for both methods. An iterative data assimilation run using LM-EnRML was stopped when one of the following three criteria was met: (1) exceeding the maximum number of iterations, equal to 20, (2) exceeding the maximum number of inner iterations (tuning of \(\lambda \)), equal to three, or (3) the reduction of the data mismatch between two consecutive iterations is less than 0.1%. For the standard method, three of the five runs stopped after a single iteration because they were not able to reduce the data mismatch after tuning \(\lambda \) three times. The hybrid method was able to obtain a reasonable data match for all runs, with most runs terminating due to exceeding the maximum number of inner iterations.
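The three stopping criteria can be paraphrased as a simple predicate; the names and call pattern below are ours, not from an actual LM-EnRML implementation.

```python
def should_stop(iteration, inner_iterations, od_prev, od_curr,
                max_iter=20, max_inner=3, rel_tol=1e-3):
    """Stopping rule paraphrasing the three LM-EnRML criteria in the text:
    (1) iteration cap of 20, (2) cap of three inner lambda-tuning steps,
    (3) relative reduction of the data mismatch below 0.1%."""
    if iteration >= max_iter:
        return True
    if inner_iterations >= max_inner:
        return True
    if od_prev > 0 and (od_prev - od_curr) / od_prev < rel_tol:
        return True
    return False
```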

Figure 18 shows the data match obtained by the hybrid method (Run 1) for the different data types at each well. The gray curves are simulated data from the initial ensemble; the blue curves are simulated data from the final ensemble (iteration 12). The red dots show historical data, with error bars indicating the standard deviation of the data noise. The data matches obtained by the standard method for Runs 1 and 3 are better than those shown in Fig. 18, as suggested by the low values of data mismatch (shown in Fig. 19), so those plots are not included in the paper.

Results from the ensemble methods were compared with results obtained using the method of randomized maximum likelihood (RML), which is a minimization-based method for sampling (Oliver et al. 1996; Oliver 2014). Although RML sampling requires weighting to obtain a correct estimate of the posterior pdf for nonlinear problems (Oliver 2017), the distribution of unweighted samples is often quite close to that obtained from more expensive methods such as MCMC (Emerick and Reynolds 2013). For the RML method, we again used Levenberg–Marquardt, but the sensitivity of production data to permeability was computed using the adjoint option of the ECLIPSE simulator. Because the truncation map is not differentiable, it is still necessary to introduce a smooth approximation of the derivative of permeability with respect to the gaussian latent variables. An iteration in the Levenberg–Marquardt method is nearly identical to an iteration in the hybrid method (Eq. (10)), except that \(\mathbf {G}_f\) is the adjoint sensitivity instead of the sensitivity approximated using the ensemble. The RML method was run using the same 100 starting models that were used as the initial realizations for Run 1 of the ensemble-based method. Of these 100 runs, 96 obtained data mismatch less than 2500 at the final iteration. The same stopping criteria used for LM-EnRML were used for RML.
Fig. 18

Data match (blue curves) at the final iteration (iteration 12) for Run 1 of the hybrid method. The unit for rate is stb/day. The gray curves show simulated data from the initial realizations. The red dots show historical data with error bars indicating the standard deviation of data noise

Fig. 19

Squared data mismatch (\(O_{\mathrm {d}}\) in Eq. (13)) for the initial ensemble and the final ensemble (at the last iteration) for the five independent runs. Four boxes are shown for each run: the initial distribution (blue), final distributions using standard method (green), hybrid method (black), and RML using adjoint sensitivity (red)

Fig. 20

Mean and standard deviation of the permeability realizations at the final iteration of the hybrid method using adjoint sensitivity

Figure 19 shows the distribution of data mismatch for the different ensembles of 100 realizations. The vertical dashed lines separate the five independent ensemble runs. Four boxes are shown for each run: the left box shows the initial distribution of data mismatch, and the following three show the distributions of final data mismatch for the three methods. For the runs in which the standard method failed to reduce the data mismatch at the first iteration, the initial and final boxes are identical. The five runs of the hybrid method obtained similar levels of data match at the final iteration, with mean around 1700. For the two runs in which the standard method was able to reduce the data mismatch, the final data mismatch is much lower than for the hybrid method, with mean close to 230. In all runs, the level of final data match is similar between the hybrid method using ensemble sensitivity and using adjoint sensitivity.

Figure 20 shows mean and standard deviation of the initial 100 permeability realizations and the final 96 permeability realizations (with \(O_{\mathrm {d}} < 2500\)) from the RML method. The initial mean and standard deviation reflect the conditioning at the well locations. The mean of the final permeability realizations clearly identifies the high permeability connection between the injector and well P2 (Fig. 17a), which is necessary for matching production data at P2. The standard deviation of permeability is reduced after history matching, but the final uncertainty of permeability remains high because the resolution of production data from a small number of wells is typically not sufficient to uniquely identify permeability values at each gridblock.
Fig. 21

Mean (top row) and standard deviation of the permeability realizations at the final iteration of the hybrid method. Results from five independent ensembles are shown

Fig. 22

Mean (top row) and standard deviation (STD) of the permeability realizations at the final iteration (left two columns) and at iteration 6 (right two columns) of the standard method. Results from the first and the third runs are shown

Figure 21 shows the mean and standard deviation of permeability for the final ensemble (at the last iteration) of the hybrid method. The final mean and standard deviation are remarkably consistent among the five independent runs, and are similar in appearance to the results obtained by the RML method (Fig. 20). The magnitude of the standard deviation for the five final ensembles from the hybrid method is generally slightly lower than the standard deviation of the RML realizations. This underestimation of uncertainty is often observed in ensemble-based methods when the ensemble size is relatively small.
Fig. 23

Realizations of permeability obtained by the three different methods. The corresponding initial realizations are shown in the top row. The color scale is the same as in Fig. 17a

The mean and standard deviation of the final permeability ensembles from the standard method are shown in Fig. 22. Only the two runs for which the standard method was able to reduce the data mismatch are shown (see run summary in Table 1). The mean and standard deviation of the two final ensembles (at iteration 20) show clear signs of ensemble collapse, with strong features in the mean field and very low standard deviation in a large region of the field. The mean and standard deviation are also shown at an intermediate iteration (iteration six), at which the data mismatch was similar to that obtained by the hybrid method, i.e. with a mean data mismatch around 1700. At iteration six, there is already excessive reduction in ensemble variability, although not as severe as at iteration 20.

Individual realizations of permeability for all three methods are shown in Fig. 23. Four initial realizations are shown in the top row. The remaining rows of Fig. 23 show corresponding final realizations from RML (hybrid method with adjoint sensitivity), from the hybrid method and from the standard method. It is relatively easy to see the resemblance between the initial realizations and the realizations updated by RML (hybrid with adjoint sensitivity). For example, initial realization 3 is modified slightly to obtain the connection between the injector and P2, and realization 6 is modified slightly so that the direct connection between injector and P1 in the initial realization is removed. Similarities between the initial realizations and the final realizations from the hybrid method are not as obvious. This implies that the update is not optimal for each individual realization when a single ensemble sensitivity (\(\mathbf {G}_f\) in Eq. (11)) is used for the entire ensemble. The final realizations from the standard method are all very similar to each other despite the clear variability among the initial realizations. This lack of variability in the final ensemble of permeability realizations is reflected in the small standard deviation (see standard deviation of Run 1 in Fig. 22).

Despite the generally good results from the hybrid method, we note that the final realizations obtained by the hybrid method seem to have too small a proportion of the facies type with the highest permeability (red in Fig. 23) compared to the initial realizations and the true permeability field shown in Fig. 17a. This is partially a result of the bilinear interpolation of the truncation map. The gradient of the truncation map, \(\mathrm{d}f/\mathrm{d}y_2\) (Fig. 8c), remains nonzero as \(|y_2|\) increases in the center region of the truncation map (the red and green regions in Fig. 17e). This, combined with a noisy estimation of \(\mathbf {G}_f\) from the ensemble, results in erroneous updates that push \(y_2\) to extreme values (more than two standard deviations from the mean), producing an artificial reduction in the proportion of the facies located at the center of the truncation map. This artificial reduction in the proportion of the red facies also appears when adjoint sensitivity is used with the hybrid method, but is less obvious.

7 Summary

When the truncated plurigaussian (TPG) model is used to represent the distribution of facies or rock types within a reservoir model, the model variables for data assimilation are the latent gaussian variables of the TPG model. The relationship between the latent gaussian model variables and observed quantities such as porosity from a well log or water cut at a producing well is, in this case, non-differentiable, hence gradient-based methods with exact derivatives cannot be used to reduce the data misfit function. Ensemble Kalman filter-like methods have, however, been used successfully to assimilate data into TPG models. In this paper, we showed that although iterative ensemble smoothers can be used successfully to update TPG models when the truncation map between latent gaussian variables and petrophysical properties is monotonic, they can fail badly when the relationship is non-monotonic.

The ability to assimilate data into truncated plurigaussian models using iterative ensemble smoothers can be greatly improved through the use of a hybrid method in which the covariance between observations and the petrophysical properties is estimated from the ensemble of realizations, but the gradient of the petrophysical properties with respect to the latent gaussian variables is computed analytically from a piecewise bilinear approximation of the truncation map. Because the mapping of the gaussian variables to the petrophysical properties is highly nonlinear, the derivative is computed locally, and each ensemble member is assigned a unique Kalman gain matrix for updating. In numerical examples, we showed that although not all data in all realizations are matched using the hybrid method, the data mismatch was almost always reduced, the spread was increased, and the dependence on the initial ensemble was reduced when analytical approximations of the derivative were used in the iterative update.



Primary support for Oliver has been provided by the CIPR/IRIS cooperative research project “4D Seismic History Matching” which is funded by industry partners Eni Norge, Petrobras, and Total, as well as the Research Council of Norway (PETROMAKS2 program). The second author thanks Total for permission to publish this work.


References

  1. Adler RJ, Moldavskaya E, Samorodnitsky G (2014) On the existence of paths between points in high level excursion sets of Gaussian random fields. Ann Probab 42(3):1020–1053
  2. Agbalaka CC, Oliver DS (2008) Application of the EnKF and localization to automatic history matching of facies distribution and production data. Math Geosci 40(4):353–374
  3. Agbalaka CC, Oliver DS (2011) Joint updating of petrophysical properties and discrete facies variables from assimilating production data using the EnKF. SPE J 16(2):318–330
  4. Albertão GA, Grell AP, Badolato D, dos Santos LR (2005) 3D geological modeling in a turbidite system with complex stratigraphic-structural framework—an example from Campos Basin, Brazil. In: SPE annual technical conference and exhibition, Dallas, Texas, 9–12 October. Society of Petroleum Engineers
  5. Armstrong M, Galli A, Beucher H, Le Loc’h G, Renard D, Doligez B, Eschard R, Geffroy F (2011) Plurigaussian simulations in geosciences, 2nd edn. Springer, Berlin
  6. Astrakova A, Oliver DS (2015) Conditioning truncated pluri-Gaussian models to facies observations in ensemble-Kalman-based data assimilation. Math Geosci 47(3):345–367
  7. Beucher H, Renard D (2016) Truncated Gaussian and derived methods. CR Geosci 348(7):510–519
  8. Biver PYA, Allard D, Pivot F, Ruelland P (2015) Recent advances for facies modelling in pluri-Gaussian formalism. In: Petroleum geostatistics, Biarritz, France, 7–11 September. EAGE
  9. Chen Y (2015) Geologically consistent history matching using the ensemble based methods. In: Petroleum geostatistics, Biarritz, France, 7–11 September. EAGE
  10. Chen Y, Oliver DS (2010) Cross-covariances and localization for EnKF in multiphase flow data assimilation. Comput Geosci 14:579–601
  11. Chen Y, Oliver DS (2012) Ensemble randomized maximum likelihood method as an iterative ensemble smoother. Math Geosci 44(1):1–26
  12. Chen Y, Oliver DS (2013) Levenberg–Marquardt forms of the iterative ensemble smoother for efficient history matching and uncertainty quantification. Comput Geosci 17(4):689–703
  13. D’Or D, David E, Walgenwitz A, Pluyaud P, Allard D (2017) Non-stationary plurigaussian simulations with auto-adaptative truncation diagrams using the CART algorithm. In: 79th EAGE conference and exhibition, Paris, France, 12–15 June
  14. Emerick AA, Reynolds AC (2013) Investigation of the sampling performance of ensemble-based methods with a simple reservoir model. Comput Geosci 17(2):325–350
  15. Emery X (2007) Using the Gibbs sampler for conditional simulation of Gaussian-based random fields. Comput Geosci 33(4):522–537
  16. Evensen G (1994) Sequential data assimilation with a nonlinear quasi-geostrophic model using Monte Carlo methods to forecast error statistics. J Geophys Res 99(C5):10143–10162
  17. Galli A, Le Loc’h G, Geffroy F, Eschard R (2006) An application of the truncated pluri-Gaussian method for modeling geology. In: Coburn TC, Yarus JM, Chambers RI (eds) Stochastic modeling and geostatistics: principles, methods, and case studies, volume II. AAPG computer applications in geology, AAPG special volumes, pp 109–122
  18. Grötsch J, Mercadier C (1999) Integrated 3-D reservoir modeling based on 3-D seismic: the tertiary Malampaya and Camago buildups, offshore Palawan, Philippines. AAPG Bull 83(11):1703–1728
  19. Kitanidis PK (1995) Quasi-linear geostatistical theory for inversing. Water Resour Res 31(10):2411–2419
  20. Le Loc’h G, Galli A (1997) Truncated plurigaussian method: theoretical and practical points of view. In: Baafi EY, Schofield NA (eds) Geostatistics Wollongong ’96, vol 1. Kluwer Academic, Dordrecht, pp 211–222
  21. Liu N, Oliver DS (2004) Automatic history matching of geologic facies. SPE J 9(4):188–195
  22. Liu N, Oliver DS (2005a) Critical evaluation of the ensemble Kalman filter on history matching of geologic facies. SPE Reserv Eval Eng 8(6):470–477
  23. Liu N, Oliver DS (2005b) Ensemble Kalman filter for automatic history matching of geologic facies. J Petrol Sci Eng 47(3–4):147–161
  24. Mariethoz G, Renard P, Cornaton F, Jaquet O (2009) Truncated plurigaussian simulations to characterize aquifer heterogeneity. Ground Water 47(1):13–24
  25. Oliver DS (2014) Minimization for conditional simulation: relationship to optimal transport. J Comput Phys 265:1–15
  26. Oliver DS (2017) Metropolized randomized maximum likelihood for improved sampling from multimodal distributions. SIAM/ASA J Uncertain Quantif 5(1):259–277
  27. Oliver DS, He N, Reynolds AC (1996) Conditioning permeability fields to pressure data. In: Proceedings of the European conference on the mathematics of oil recovery, V, pp 1–11
  28. Sebacher B, Hanea R, Heemink A (2013) A probabilistic parametrization for geological uncertainty estimation using the ensemble Kalman filter (EnKF). Comput Geosci 17(5):813–832
  29. Zhao Y, Reynolds AC, Li G (2008) Generating facies maps by assimilating production data and seismic data with the ensemble Kalman filter, SPE-113990. In: Proceedings of SPE IOR symposium, Tulsa, OK, 21–23 April
  30. Zupanski M, Navon IM, Zupanski D (2008) The maximum likelihood ensemble filter as a non-differentiable minimization algorithm. Q J R Meteorol Soc 134(633):1039–1050

Copyright information

© International Association for Mathematical Geosciences 2018

Authors and Affiliations

  1. Uni Research CIPR, Bergen, Norway
  2. Geoscience Research Centre, Total E&P UK, Westhill, UK