1 Introduction

Porosity is a significant criterion in characterizing a reservoir and in determining flow patterns in order to optimize the production of a hydrocarbon field. Reliable estimation of porosity is also critical for evaluating hydrocarbon accumulations in a basin and for mapping potential pressure seals in order to reduce drilling risk in wildcat wells. Porosity is mostly measured in the laboratory on cores recovered from the reservoir, or it can be determined from well-test data. Because well testing and coring are expensive and time consuming, all wells in a typical oil or gas field are logged using various tools to measure petrophysical parameters such as porosity and density. However, the spatial distribution of porosity between wells remains a very important concern in the oil industry (Bhatt and Helle 2002; Tiab and Donaldson 2004). Seismic measurements are often used to delineate the structure of reservoir bodies, but are not often used to estimate the spatial distribution of reservoir and rock properties. In other words, it is very difficult to estimate porosity directly from seismic data. Inversion has been used to improve the prediction of reservoir properties from 3D seismic data, and these predictions should become more accurate as wells are added. Past studies have shown that inversion of seismic data into acoustic impedance (AI) is widely used in hydrocarbon exploration to estimate petrophysical properties. Acoustic impedance is commonly used for porosity estimation, mostly based on an empirical relationship between acoustic impedance and porosity. However, the relationship differs from area to area because the compaction model varies both laterally and vertically. Thus, in many cases, in a large area, porosity cannot be estimated directly from the acoustic impedance using a single transform function (Anderson 1996). For this reason, Schultz et al. (1994) proposed the idea of using multiple seismic attributes to estimate log properties away from well control. 
After that, various data integration techniques such as neural networks were used to derive petrophysical properties directly from seismic attributes. The use of artificial neural networks (ANN) in geophysical inverse problems is a relatively recent development and offers many advantages when dealing with the nonlinearity inherent in such applications (Baddari et al. 2009). ANN has been used to predict core properties from well logs (Lim 2005), well log to well log transformations, and seismic properties have been used to predict lithology (Singh et al. 2007; Walls et al. 2000; Calderon and Castagna 2007; Joel et al. 2002), sonic logs and shale content (Liu and Liu 1998), shale stringers in a heavy oil reservoir (Tonn 2002), spontaneous potential (Banchs and Michelena 2002), permeability (Lim 2005; Helle et al. 2001), and porosity (Leite and Vidal 2011; Artun and Mohaghegh 2011; Singh et al. 2007; Calderon and Castagna 2007; Joel et al. 2002; Pramanik et al. 2004; Daniel et al. 2001; Kevin and Curtis 2004; Leiphart and Hart 2001; Russell et al. 1997). Multivariate linear regression (MLR), another technique, is a simple extension of the well-known univariate case. In these circumstances, log properties are estimated from a linearly weighted sum of a number of seismic attributes. This was first demonstrated to yield accurate results by Russell et al. (1997). Although all of these works may show significant advantages compared to impedance-based methods, they have not presented a specific mathematical equation to describe the relationship between attributes and petrophysical properties. To solve this issue, this research attempts to propose a nonlinear mathematical equation to describe the relationship between a seismic attribute (similarity) and the porosity value in a sandstone reservoir. 
There are several advantages of this mathematical model over the conventional inversion methods: it predicts porosity log rather than acoustic impedance; it uses seismic attribute (similarity) rather than the conventional post-stack volume. It relies on a simple forward model and knowledge of the seismic wavelet is not required that may enhance resolution. In fact, this model which transforms the similarity attribute of a sandstone reservoir to a porosity value is called the pseudo-forward equation (PFE) in this paper. The structure of PFE is implemented based on the dataset of the gas reservoir of the F3 block in the North Sea. This reservoir consists of sand and shale layers, in which shale units are sandwiched between the sand layers. Therefore, the role of PFE in both rock types will be investigated. The initial parameters of PFE are unknown and should be derived from data. This study will use the algebra technique to solve the nonlinear model and finally the quality of the implemented model will be studied. A typical feature of inverse problems is that they are ill-posed and a unique solution may not exist and small errors in the data may cause prohibitively large variations in the estimations of the quantity sought. To overcome these difficulties one has to regularize the original problem, that is, the original problem has to be replaced by a nearby well-posed problem in order to obtain a stable solution. One of the best known and most used regularization methods is Tikhonov regularization. This work will illustrate how Tikhonov regularization could optimize the PFE acceptably in the North Sea reservoir and ultimately the optimized PFE will be employed in the F3 reservoir to estimate the porosity distribution of the various seismic sections and finally the quality of the implemented model will be compared with results of a conventional method. As mentioned above, most previous studies have used AI to predict porosity. 
Therefore, cross-plot analysis between AI and porosity derived from density logs is performed to find a regression fit between two datasets. Given a linear relationship provided by regression fit, spatial distribution of porosity is estimated in the F3 block. Thus, the comparison between developed models provides a simple means of testing whether the model is implemented correctly.

2 Geological setting

This research is facilitated by having F3 block data from dGB Earth Sciences. The F3 block is located in the northeastern part of the Dutch sector of the North Sea. During the Cenozoic era, much of this region was a thermally subsiding epicontinental basin, most of which was confined by landmasses (Sørensen et al. 1997). During the Neogene, sedimentation rates exceeded the subsidence rate, and consequently shallowing of the basin occurred. A large fluvio-deltaic system dominated the basin, draining the Fennoscandian High and the Baltic Shield. The Cenozoic succession could be subdivided into two main packages, separated by the Mid-Miocene Unconformity (Fig. 1).

Fig. 1 Sketch of the Neogene fluvio-deltaic system in the south of the North Sea (modified after Steeghs et al. 2000)

The lower package consists mainly of relatively fine-grained gradational Paleogene sediments (Steeghs et al. 2000), whereas the package above consists of coarser grained Neogene sediments with much more complex geometries. Most of the above package is a progradational deltaic sequence that could be subdivided into three units, corresponding to three phases of delta evolution (Fig. 1). The dominant direction of progradation is toward the west-southwest and is expressed as sigmoid lineaments (clinoforms) in the dip section (Tigrek 1988). Unit 2, containing a conspicuous clinoform package, was chosen as the target zone for gas accumulation, and forms the delta fore set with a coarsening upward sequence. Its age is estimated as Early Pliocene. The coarse sediments are attributed to a regression caused by the Neogene uplift of Scandinavia in the Pliocene (Gregersen 1997).

3 Dataset

A 3D seismic survey in the F3 block covering an area of approximately 16 × 23 km² has become publicly available and is provided by the monograph of Aminzadeh and Groot (2006). The data volume consists of 646 inlines and 947 cross-lines. The line spacing is 25 m for both inlines and cross-lines, and the sample rate is 1 ms. A standard seismic data processing sequence was applied to the data. Data from four wells in the area are available, in particular well logs in true vertical depth, including sonic and gamma ray logs. Density logs were reconstructed from the sonic logs using neural network techniques by dGB Earth Sciences. The density logs were also used to calculate porosity logs for all wells. Figure 2 is a seismic cross section of the study area that shows the existing well locations (F06-1, F02-1, F03-2, F03-4).

Fig. 2 Seismic section derived from the original seismic data (inline 425), showing the well locations and the gamma ray log at each well

In Fig. 2 gamma ray logs are displayed in every well and one could separate the various shale and sand layers in F3 reservoir. The study area in this paper is the upper package where coarser grained Neogene sediments with much more complex geometries are located (450–1200 ms). In this zone, the presence of a laminated shale and sand sequence is proved especially in well No. F03-4. On the opposite side, the sand layers are the main lithology present in well F03-2. These sand and shale sequences constitute commercial gas-bearing reservoirs and exhibit an approximate time range of 700–1000 ms. A basic rule for gamma ray log interpretation is that lower values correlate with sandy layers and higher values correlate with the shale-rich layers (Luthi 2001). According to Fig. 2, there are two types of sediments that could be clearly distinguished from the plots: shale-rich sediments with generally higher gamma ray values (>70 API) that mostly belong to the upper and lower target zone (units 1 and 3) and sand-rich sediments with generally low gamma ray values (<70 API) that mostly belong to the middle part of target zone (unit 2). To get better results in the target zone, this research divides the dataset of F3 reservoir into two parts: shale-rich sediments data with gamma ray values more than 70 API and sand-rich sediments data with gamma ray values less than 70 API.

4 Pre-processing

In this work, our objective is to find an operator, possibly nonlinear, which could predict porosity from seismic data. In fact, this paper chooses to apply not the seismic data itself, but attributes of the seismic data. The reason for this choice is that many of attributes are nonlinear; thus, the predictive power of the method is increased. The next reason is that there is often advantage in breaking down the input data into component parts. It means that when the raw seismic trace is divided into several mathematical functions (attributes), one could study the behavior of every attribute in contrast to the petrophysical properties and if there is a well-determined linear correlation between seismic attributes and reservoir properties, they will be considered in analysis to predict the unknown properties. Hundreds of seismic attributes have been developed but only some of them are well enough understood to be quantitative, and many are redundant. Also, a seismic trace is the result of complicated interrelationships between bed thickness, porosity, fluid saturation, lithological boundaries, and other rock properties (Kevin and Curtis 2004; Satinder and Kurt 2008). In the present research, the authors have considered statistically the behavior of more than 15 attributes in four wells and they have found that similarity is the fundamental attribute which shows more correlation than other attributes. A detailed list of the attributes used and their statistical parameters besides the correlation coefficients of the extracted attributes are given in Table 1.

Table 1 The list of used attributes and their correlation coefficients with porosity values

According to Table 1, the similarity attribute is considered to be the optimal one to predict porosity as the output in linearity and nonlinearity mode. In practice, it is not too frequent to have greater correlation than 50 %–60 % between seismic attributes and well log data, so this work is satisfied with F3 data to find linear or nonlinear relationships between two sets of input and output data. Acoustic impedance is another seismic attribute that is widely used to estimate porosity distribution of reservoir rocks. Given the acoustic impedance attribute in the inversion procedure, the simplest method to derive the appropriate relationship between porosity and AI is to cross plot between two datasets. Assuming a linear relationship between porosity and AI, a straight line may be fitted by regression in Fig. 3. In Fig. 3a, the target log property (porosity) is plotted against AI attribute using OpendTect software. The cross correlation and the mean-squared prediction error are 47 % and 6.35e−6, respectively. From the linear regression fit, the distribution of porosity in the reservoir in inline-228 is estimated (Fig. 3b). In this research, the candidate attributes such as energy, envelope, spectral decomposition, and similarity are used to predict spatial distribution of porosity in the North Sea reservoir using an experimental model which is constructed using an algebraic technique, and then the results are compared with conventional method of cross-plot analysis (Fig. 3a).
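The conventional cross-plot approach described above amounts to fitting a straight line to (AI, porosity) pairs and reporting the correlation. The following is a minimal sketch of that step, not the OpendTect workflow used in this study; the function names and the sample values are hypothetical and chosen only for illustration (they are not F3 data).

```python
import math


def linear_fit(x, y):
    """Ordinary least-squares straight line: y ~ slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def pearson_r(x, y):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return sxy / math.sqrt(sxx * syy)


# Synthetic, perfectly linear illustration values (assumed, not measured).
ai = [4.0e6, 4.5e6, 5.0e6, 5.5e6, 6.0e6]
porosity = [0.34, 0.32, 0.30, 0.28, 0.26]

slope, intercept = linear_fit(ai, porosity)
r = pearson_r(ai, porosity)

# Apply the fitted transform to an impedance value away from the "wells".
estimated_porosity = slope * 5.2e6 + intercept
```

With real data the points scatter around the line, so the correlation is far from ±1 (47 % is reported for the F3 cross plot).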

Fig. 3 a Cross plot between the target log (porosity) and the seismic attribute (AI); b Distribution of porosity estimated using the relationship established between porosity and AI

5 Pseudo-forward modeling

The development of a mathematical model that is able to predict petrophysical properties should be performed based on the physical concepts. These equations (the so-called forward model) are often formulated using fundamental seismic factors of the Earth such as wave velocity, density, etc. Unlike the conventional procedure, this work intends to extract an empirical model that is weakly supported by experimental data. Although it is possible to introduce a model from seismic data, the terms in the equation are empirical and any functional connection to physical concepts is not entirely justified. Therefore, because of empirical nature of the proposed model, the developed model is called pseudo-forward equation (PFE), in this work. As described before, some seismic attributes were chosen for prediction of spatial distribution of porosity. This paper has designed various mathematical structures based on the aforementioned attributes, but they have shown different degrees of accuracy. Implicitly, it has been assumed that multi-attribute functions are more valid than single-attribute ones over the target zone. Improvements have focused on accuracy enhancement, shorter equations, and improved representation of the sand and shale regions. Finally, in all these situations, various mixture models were developed that exhibit varied behavior in contrast to sand and shale layers. The ultimate empirical function is a single-attribute equation based on the similarity attribute. Similarity is a form of “coherency” that expresses how much two or more trace segments look alike. The coherency attribute is a measure of lateral changes in acoustic impedance caused by variations in structure, stratigraphy, lithology, porosity, and fluid content. The first coherency algorithm based on correlation was proposed by Bahorich and Farmer (1995) and then it was completed by Marfurt et al. (1998). 
For attributes expressing coherence, it is generally true that they are suitable for the indication of sudden changes between neighboring channel sections. They are extremely good for the detection of faults, fractured zones, or boundaries related to lithological changes. There are several types of coherence attributes. The best known one is the so-called coherency cube, while recently developed methods are semblance-type procedures, eigen structure or variance-based coherence, and coherence based on the calculation of least squares (Eichkitz et al. 2012). The so-called similarity attribute characteristic for the coherence, used by this research, is a simple one and can be calculated quickly. Its value between two channel sections could be given by (OpendTect dGB Plugins User Documentation 2012)

$${\text{sim}}(x,y) = 1 - \frac{{\sqrt {\sum\limits_{i = 1}^{N} {\left( {x_{i} - y_{i} } \right)^{2} } } }}{{\sqrt {\sum\limits_{i = 1}^{N} {x_{i}^{2} } } + \sqrt {\sum\limits_{i = 1}^{N} {y_{i}^{2} } } }},$$
(1)

where \({\text{sim}}(x,y)\) is the value of similarity between \(x\) and \(y\) vectors containing \(N\) number of data. \(N\) could be defined by a time gate. The numerator is the Euclidean distance in the \(N\) dimension of vectors \(x\) and \(y\), and the denominator is the sum of the vectors’ lengths. In various research (Santosh et al. 2013), the similarity attribute map is applied to enhance the fault structures and clear salt edges. To implement the PFE on F3 block, the similarity attribute of three wells (F06-1, F02-1, F03-2) besides the porosity values from density logs are used to construct the structure of PFE and well F03-4 is selected to evaluate the performance of PFE. The PFE model is introduced as

$$Y = a + bs\ln (s) + \frac{c}{\ln (s)},$$
(2)

where Y is denoted as porosity and \(s\) is the similarity attribute. The constants of \(a\), \(b\), and \(c\) are fundamental parameters that depend on behavior of porosity in the reservoir. Equation 2 is proposed as an empirical model which could fit approximately to the dataset in the F3 block. Note that a forward model has a physical concept to analyze in inversion modeling but because of empirical nature of the proposed model, in this work, Eq. (2) is called the pseudo-forward equation (PFE). According to the above, this study should solve the pseudo-forward equation and estimate the optimized constants for a reliable prediction and finally verify the fit between predicted and observed data. To solve the pseudo-forward equation, a linear algebraic approach is developed to invert the pseudo-forward equation.
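As an illustration of Eq. (1), the similarity of two trace segments can be computed directly from its definition. This is a minimal sketch, not the OpendTect implementation; the function name and the short sample segments are assumptions made for the example.

```python
import math


def similarity(x, y):
    """Similarity of two equally long trace segments, following Eq. (1):
    one minus the Euclidean distance divided by the sum of vector lengths."""
    if len(x) != len(y):
        raise ValueError("segments must contain the same number of samples")
    distance = math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))
    length_sum = math.sqrt(sum(a * a for a in x)) + math.sqrt(sum(b * b for b in y))
    if length_sum == 0.0:
        raise ValueError("similarity is undefined for two all-zero segments")
    return 1.0 - distance / length_sum


# Hypothetical trace segments within one time gate (N = 5 samples).
segment_a = [0.0, 1.0, 2.0, 1.0, 0.0]
segment_b = [0.0, 1.0, 2.0, 1.0, 0.0]   # identical to segment_a
segment_c = [-v for v in segment_a]     # polarity-reversed copy

sim_identical = similarity(segment_a, segment_b)
sim_reversed = similarity(segment_a, segment_c)
```

Identical segments give a similarity of 1 and polarity-reversed segments give 0. Because Eq. (2) contains \(\ln (s)\) and \(1/\ln (s)\), the PFE can only be evaluated for \(0 < s < 1\).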

6 Inverse modeling

This research is faced with the situation that a quantity (similarity) is measured at the surface of the Earth and the aim is to know the porosity of the rocks beneath the place where the measurements were made. For each set of measurements (similarity), a PFE is presented which approximately relates it to the porosity. The PFE is a nonlinear function of similarity which needs optimal constants to predict the porosity distribution in a reservoir. These constants originate from the nature of the reservoir. Inverse theory is a method to infer the unknown physical property (porosity) from measurements (similarity). To solve the PFE, it is first represented in the form of an operator equation:

$${\mathbf{d = Gm}},$$
(3)

where \({\mathbf{d}}\) is the vector of predictions (porosity), \({\mathbf{m}}\) is the vector of the unknown parameters of the model (\(a,b,c\)), and \({\mathbf{G}}\) is the theoretical function or the linear operator which makes it possible to calculate \({\mathbf{d}}\) (porosity) from an earth model defined by the \({\mathbf{m}}\) parameters (\(a,b,c\)). \({\mathbf{G}}\) is the theory that predicts the porosity distribution in a reservoir from the model parameters \({\mathbf{m}}\). This theory is based on seismic attributes. Mathematically, \({\mathbf{Gm}}\) is a functional, a rule that unambiguously assigns a single real number to an element of a vector space. Now let us introduce the nomenclature of Eq. (3) more accurately. In these notes, vectors will be denoted by bold lowercase letters, and matrices will be denoted by bold uppercase letters. Suppose there exist N measurements (similarity) in a field then there are N values for the corresponding porosity data and we are trying to determine the values of three model parameters (\(a,b,c\)). Our nomenclature for data and model parameters will be

$$\begin{aligned} &{\text{Data: }}{\mathbf{d}} = [d_{1} ,d_{2} ,d_{3} , \ldots ,d_{N} ]^{{\text{T}}} {\text{;}}\quad\, \quad d_{i} = ({\text{Porosity value)}}_{i}\quad i = 1,2,3, \ldots ,N \hfill \\ &{\text{Model parameters: }}{\mathbf{m}} = [a,b,c]^{{\text{T}}} {\text{;}}\quad (a,b,c){\text{ = constants of PFE}}, \hfill \\ \end{aligned}$$
(4)

where \({\mathbf{d}}\) and \({\mathbf{m}}\) are N and three-dimensional column vectors, respectively, and T denotes transpose.

The model, or relationship between \({\mathbf{d}}\) and \({\mathbf{m}}\), can be encoded in the elements of the matrix \({\mathbf{G}}\). The PFE can then be written for each measurement as

$$\begin{aligned} &d_{1} = a + bs_{1} \ln (s_{1} ) + \frac{c}{{{\text{ln(}}s_{1} )}} \hfill \\ &d_{2} = a + bs_{2} \ln (s_{2} ) + \frac{c}{{{\text{ln(}}s_{2} )}} \hfill \\ &\qquad\qquad\qquad \vdots \hfill \\ &d_{N} = a + bs_{N} \ln (s_{N} ) + \frac{c}{{\ln (s_{N} )}} \hfill \\ \end{aligned}$$
(5)

Getting the PFE equation set up in matrix notation is essential before we can invert the system. Hence, the above statements are written as

$$\left[ \begin{aligned} d_{1} \hfill \\ d_{2} \hfill \\ \vdots \hfill \\ d_{N} \hfill \\ \end{aligned} \right] = \left[ {\begin{array}{*{20}c} \begin{aligned} 1 \hfill \\ 1 \hfill \\ \vdots \hfill \\ 1 \hfill \\ \end{aligned} & \begin{aligned} s_{1} \ln (s_{1} ) \hfill \\ s_{2} \ln (s_{2} ) \hfill \\ \vdots \hfill \\ s_{N} \ln (s_{N} ) \hfill \\ \end{aligned} & \begin{aligned} 1/\ln (s_{1} ) \hfill \\ 1/\ln (s_{2} ) \hfill \\ \vdots \hfill \\ 1/\ln (s_{N} ) \hfill \\ \end{aligned} \\ \end{array} } \right]\left[ {\begin{array}{*{20}c} a \\ b \\ c \\ \end{array} } \right].$$
(6)

Then \({\mathbf{d}}\) and \({\mathbf{m}}\) are \(N \times 1\) and \(3 \times 1\) column vectors, respectively, and \({\mathbf{G}}\) is an \(N \times 3\) matrix with constant coefficients. The logical next step is to invert Eq. (6) for an estimate of the model parameters \({\mathbf{m}}^{\text{est}}\) as

$${\mathbf{m}}^{\text{est}} = ({\mathbf{G}})^{ - 1} {\mathbf{d}}.$$
(7)

This inverse problem reverses the process of predicting the values of porosities. It tries to invert the operator \({\mathbf{G}}\) to get an estimate of the model. A vector that plays an essential role in the development of inverse methods is the data error or misfit vector (Menke 1989). The predicted data \({\mathbf{d}}^{\text{pre}}\) are calculated by

$${\mathbf{d}}^{\text{pre}} = {\mathbf{G}} \, {\mathbf{m}}^{\text{est}}.$$
(8)

The misfit vector (data error vector) will be provided by

$${\text{Data error vector: }}{\mathbf{e}} = {\mathbf{d}}^{\text{obs}} - {\mathbf{d}}^{\text{pre}}.$$
(9)

The dimension of the error vector \({\mathbf{e}}\) is \(N \times 1\). The total misfit \(E\) between observed (\({\mathbf{d}}^{\text{obs}}\)) and predicted data (\({\mathbf{d}}^{\text{pre}}\)) is considered as

$$E = {\mathbf{e}}^{\text{T}} {\mathbf{e}} = \left[ {\begin{array}{*{20}c} {e_{1} } & {e_{2} } & \ldots & {e_{N} } \\ \end{array} } \right]\left[ {\begin{array}{*{20}c} {e_{1} } \\ {e_{2} } \\ \vdots \\ {e_{N} } \\ \end{array} } \right] = \sum\limits_{i = 1}^{N} {e_{i}^{2} }.$$
(10)

The term \(E\) quantifies the misfit between predicted and observed data. Solutions that minimize this misfit are known as least squares solutions (Menke 1989). In the next stage, the least squares procedure is used to find the best-fit PFE model for the F3 block dataset; the corresponding codes were written in the Matlab environment.
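The quantities in Eqs. (6) and (8)–(10) can be sketched in a few lines. The code below is an illustrative Python sketch rather than the Matlab code used in this work; the function names, the three similarity values, and the observed porosities are assumptions, while the trial parameters are the sand coefficients reported in Eq. (13).

```python
import math


def design_row(s):
    """One row of G in Eq. (6): [1, s*ln(s), 1/ln(s)], valid for 0 < s < 1."""
    log_s = math.log(s)
    return [1.0, s * log_s, 1.0 / log_s]


def build_G(similarities):
    """Stack one design row per similarity sample (an N x 3 matrix)."""
    return [design_row(s) for s in similarities]


def predict(G, m):
    """Predicted data d_pre = G m, Eq. (8)."""
    return [sum(g * p for g, p in zip(row, m)) for row in G]


def total_misfit(d_obs, d_pre):
    """Total misfit E = e^T e with e = d_obs - d_pre, Eqs. (9) and (10)."""
    return sum((o - p) ** 2 for o, p in zip(d_obs, d_pre))


# Hypothetical similarity samples and observed porosities (illustration only).
s_samples = [0.7957, 0.8045, 0.9058]
d_obs = [0.3003, 0.3008, 0.3070]

m_trial = [0.4253, 0.4780, 0.0080]   # [a, b, c], sand values from Eq. (13)

G = build_G(s_samples)
d_pre = predict(G, m_trial)
E = total_misfit(d_obs, d_pre)
```

Although the PFE is nonlinear in \(s\), it is linear in \(a\), \(b\), and \(c\), which is why it can be written as \({\mathbf{d = Gm}}\) with a matrix \({\mathbf{G}}\) that depends only on the measured similarities.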

6.1 Least square solution

The solution of an inverse problem consists of giving the best solution for the model from the inversion of Eq. (5). This relation is valid when the number of the equations is equal to the number of parameters of the model. In this case, the \({\mathbf{G}}\) matrix will be a square matrix which could be invertible if the determinant of the matrix is different from zero. In order to implement the PFE model with sufficient generality, the available data are divided into three subsets. The first subset is the training set derived from three wells (F02-1, F06-1, F03-2), which is used to estimate the model coefficients. The second subset is the validation set derived from the same wells. This set of data which is not applied during the development of the PFE model is used to validate the model. The third subset is the testing set derived from testing well (F03-4); this well which is not used during the development of the PFE model, is applied to obtain the overall accuracy of the PFE model. To get better model parameters for PFE, this work divided the training and validation sets into two groups: sand dataset (gamma ray <70 API) and shale dataset (gamma ray >70 API); hence, the PFE is solved in the form of two different datasets:

$$\begin{aligned} &\begin{array}{ccccl} {\mathbf{d}} & = & {\mathbf{G}} & {\mathbf{m}}; & \quad {\text{sand dataset}} \\ (268 \times 1) & & (268 \times 3) & (3 \times 1) & \\ \end{array} \\ &\begin{array}{ccccl} {\mathbf{d}} & = & {\mathbf{G}} & {\mathbf{m}}; & \quad {\text{shale dataset}} \\ (215 \times 1) & & (215 \times 3) & (3 \times 1) & \\ \end{array} \end{aligned}$$
(11)

The least square procedure is to take the partial derivative of \(E\) with respect to each element in \({\mathbf{m}}\) and set the resulting equations to zero. This will produce a system of three equations that can be manipulated in such a way that, in general, leads to a solution for the three elements of \({\mathbf{m}}\).

In summary, the least squares solution for \({\mathbf{m}}\) is given by

$${\mathbf{m}}_{\text{LS}} = [{\mathbf{G}}^{\text{T}} {\mathbf{G}}]^{ - 1} {\mathbf{G}}^{\text{T}} {\mathbf{d}}.$$
(12)

The \({\mathbf{m}}_{\text{LS}}\) above is the solution that minimizes \(E\), the total misfit. Note that \({\mathbf{m}}_{\text{LS}}\) exists only when the matrix \({\mathbf{G}}^{\text{T}} {\mathbf{G}}\) has a mathematical inverse (Menke 1989), that is, when its determinant is different from zero; this condition holds for both datasets (sand and shale). Equation (12) is evaluated for both datasets as follows:

$$\begin{array}{ccl} {\mathbf{m}}_{\text{LS,SAND}} = \left[ {\begin{array}{lll} {0.5435} & {2.3385} & {0.0276} \\ {2.3385} & {10.4406} & {0.1158} \\ {0.0276} & {0.1158} & {0.0014} \\ \end{array} } \right] & \left[ {\begin{array}{llll} 1 & 1 & \cdots & 1 \\ { - 0.1819} & { - 0.1750} & \cdots & { - 0.0897} \\ { - 4.3743} & { - 4.5963} & \cdots & { - 10.1013} \\ \end{array} } \right] & \left[ {\begin{array}{c} {0.3003} \\ {0.3008} \\ \vdots \\ {0.3070} \\ \end{array} } \right] = \left[ {\begin{array}{l} {0.4253} \\ {0.4780} \\ {0.0080} \\ \end{array} } \right] \\ (3 \times 3) & (3 \times 268) & (268 \times 1) \\ {\mathbf{m}}_{\text{LS,SHALE}} = \left[ {\begin{array}{lll} {2.1986} & {8.8751} & {0.1386} \\ {8.8751} & {36.3829} & {0.5509} \\ {0.1386} & {0.5509} & {0.0089} \\ \end{array} } \right] & \left[ {\begin{array}{llll} 1 & 1 & \cdots & 1 \\ { - 0.0981} & { - 0.1028} & \cdots & { - 0.1964} \\ { - 9.1346} & { - 8.6643} & \cdots & { - 3.9539} \\ \end{array} } \right] & \left[ {\begin{array}{c} {0.3468} \\ {0.3514} \\ \vdots \\ {0.2996} \\ \end{array} } \right] = \left[ {\begin{array}{l} {0.5211} \\ {0.7619} \\ {0.0115} \\ \end{array} } \right] \\ (3 \times 3) & (3 \times 215) & (215 \times 1) \\ \end{array}$$
(13)

We note that the least square solutions \(\left[ {\begin{array}{lll} {0.4253} & {0.4780} & {0.0080} \\ \end{array} } \right]_{\text{SAND}}^{\text{T}}\) and \(\left[ {\begin{array}{lll} {0.5211} & {0.7619} & {0.0115} \\ \end{array} } \right]_{\text{SHALE}}^{\text{T}}\) do not fit the data exactly and are ones that minimize the misfit vector. Now one can calculate the minimized \(E\) as follows:

$$\begin{aligned} E_{\text{SAND}} = {\mathbf{e}}^{\text{T}} {\mathbf{e}} = 0.0091 \hfill \\ \hfill \\ E_{\text{SHALE}} = {\mathbf{e}}^{\text{T}} {\mathbf{e}} = 0.0616. \hfill \\ \end{aligned}$$
(14)
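The normal-equation solution of Eq. (12) can be sketched as follows. This is an illustrative Python version, not the Matlab code used in this work: the helper names are hypothetical, the data are synthetic (generated from assumed coefficients so that the recovered values can be checked), and a small Gaussian elimination replaces the explicit inverse of \({\mathbf{G}}^{\text{T}} {\mathbf{G}}\).

```python
import math


def design_row(s):
    """Row of G for one similarity sample: [1, s*ln(s), 1/ln(s)]."""
    log_s = math.log(s)
    return [1.0, s * log_s, 1.0 / log_s]


def normal_equations(G, d):
    """Return G^T G (3 x 3) and G^T d (3 x 1) for the system d = G m."""
    n_par = len(G[0])
    GtG = [[sum(row[i] * row[j] for row in G) for j in range(n_par)]
           for i in range(n_par)]
    Gtd = [sum(row[i] * di for row, di in zip(G, d)) for i in range(n_par)]
    return GtG, Gtd


def solve_linear(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [list(A[i]) + [b[i]] for i in range(n)]        # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-300:
            raise ValueError("G^T G is singular; no least squares solution")
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):                     # back substitution
        tail = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x


def least_squares(similarities, porosities):
    """Least squares estimate of [a, b, c], equivalent to Eq. (12)."""
    G = [design_row(s) for s in similarities]
    GtG, Gtd = normal_equations(G, porosities)
    return solve_linear(GtG, Gtd)


# Synthetic, noise-free test data from assumed coefficients (not F3 values).
true_m = [0.40, 0.50, 0.01]
s_values = [0.70 + 0.02 * k for k in range(13)]        # 0.70 ... 0.94
d_values = [sum(g * p for g, p in zip(design_row(s), true_m)) for s in s_values]

m_ls = least_squares(s_values, d_values)
```

With noise-free synthetic data the assumed coefficients are recovered to within rounding error; with measured data the residual \(E\) stays positive, as in Eq. (14).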

In the next section, the accuracy and qualification of the responses obtained from the least square method are addressed; however, Eq. (14) has provided a measurement for validating the results. Before considering the next section, Eq. (14) shows that the least square solution of PFE for the sand dataset has more validity than for the shale dataset. Nevertheless, this viewpoint may tend to obscure an important aspect of the inverse problems. Namely, the nature of the problem depends more on the relationship between the data and model parameters than on the data or model parameters themselves. Therefore, it is essential that the qualification of the PFE itself is investigated in the following.

7 Assessment of the quality of PFE model

The PFE model is solved by a least square technique for two different layers:

$$\begin{array}{ll} {\text{Porosity}}_{\text{SAND}} = 0.4253 + 0.4780s\ln (s) + \frac{0.0080}{\ln (s )} \hfill \\ {\text{Porosity}}_{\text{SHALE}} = 0.5211 + 0.7619s\ln (s) + \frac{0.0115}{\ln (s )}. \hfill \\ \end{array}$$
(15)

Equation (15) is expected to estimate approximately the porosity of sand and shale layers of a reservoir using the similarity attribute. But the prediction power of these proposed equations should be discussed. The predictive performance of solved PFEs on validation set is illustrated in Fig. 4.
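For reference, Eq. (15) can be evaluated as in the sketch below. The coefficients are those of Eq. (15); the function names, the sample inputs, and the handling of a gamma ray reading of exactly 70 API (assigned to sand here) are assumptions made for illustration.

```python
import math

# Coefficients (a, b, c) of Eq. (15).
PFE_COEFFICIENTS = {
    "sand": (0.4253, 0.4780, 0.0080),
    "shale": (0.5211, 0.7619, 0.0115),
}
GAMMA_RAY_CUTOFF_API = 70.0


def lithology_from_gamma_ray(gamma_ray_api):
    """Shale above the 70 API cutoff, sand otherwise.
    Treating exactly 70 API as sand is an assumption of this sketch."""
    return "shale" if gamma_ray_api > GAMMA_RAY_CUTOFF_API else "sand"


def pfe_porosity(s, lithology):
    """Porosity from similarity s using Eq. (15); requires 0 < s < 1."""
    if not 0.0 < s < 1.0:
        raise ValueError("ln(s) and 1/ln(s) require 0 < s < 1")
    a, b, c = PFE_COEFFICIENTS[lithology]
    log_s = math.log(s)
    return a + b * s * log_s + c / log_s


# Hypothetical sample: similarity 0.80 at a depth with a 55 API gamma ray reading.
lithology = lithology_from_gamma_ray(55.0)
phi = pfe_porosity(0.80, lithology)
```

The \(c/\ln (s)\) term grows without bound as \(s\) approaches 1, so highly coherent samples are the most sensitive to errors in \(c\).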

Fig. 4 a Cross plot between real and estimated porosity of validation set by PFE-sand model on sand dataset; b Cross plot between real and estimated porosity of validation set by PFE-shale model on shale dataset

Figure 4 indicates obviously that PFE fitted the sand dataset with acceptable accuracy but shale data could not satisfy the PFE to estimate porosity values. The correlation coefficient reflects a model’s ability to predict the output. In statistics, it indicates how well data points fit a statistical model—sometimes simply a line or curve. It is a statistic used in the context of statistical models whose main purpose is either the prediction of future outcomes or the testing of hypotheses, on the basis of other related information. It provides a measure of how well observed outcomes are replicated by the model, as the proportion of total variation of outcomes explained by the model. A correlation coefficient of 1 indicates that the regression line perfectly fits the data (Steel and Torrie 1960). The correlation coefficient of PFE-SAND validates well the predictive power of this experimental equation (0.938); therefore, based on this evidence it should estimate the sand data of well F03-4 (test well) acceptably. According to the result of Fig. 4b, the structure of PFE cannot reliably estimate porosity of shale sediments; therefore, this paper just studies the PFE-SAND equation, and the purpose of PFE is the PFE-SAND in the following. To evaluate the solved PFE, 250 points of well F03-4 in reservoir with gamma ray lower than 70 API are extracted. This well has not been applied in implementation of the structure of PFE. Figure 5 illustrates the response of PFE to the test dataset.

Fig. 5 Cross plot between real and estimated porosity of testing set by PFE on sand dataset

According to Fig. 5, there is a main issue in the solved PFE and it is obvious that this experimental equation could not be successful in the prediction of the porosity values in a sandstone zone. The authors believe that the main reason of this issue is related to the nature of the inversion process; it means that the inverse problem does not have a unique response and it is known as an ill-posed problem. This is because the noise in the measured data affects the quality of the PFE. Any errors (noise) in the data will be mapped into errors in the estimates of the model parameters. For this reason, a model covariance matrix \([\text{cov}\,m]\) needs to be defined by assuming that \([\text{cov}\,d] = I_{N}\), that is, all the data variances are equal to 1 and the covariances are all 0 (uncorrelated data errors) (Menke 1989).

If

$${\mathbf{G}}_{g}^{ - 1} = [{\mathbf{G}}^{\text{T}} {\mathbf{G}}]^{ - 1} {\mathbf{G}}^{\text{T}},$$
(16)

then

$$\begin{aligned} &[{\text{cov}}\,{\mathbf{m}}] = {\mathbf{G}}_{g}^{{ - 1}} [{\text{cov}}\,{\mathbf{d}}][{\mathbf{G}}_{g}^{{ - 1}} ]^{{\text{T}}} \hfill \\ &\qquad\quad= {\mathbf{G}}_{g}^{{ - 1}} [{\mathbf{G}}_{g}^{{ - 1}} ]^{{\text{T}}} \hfill \\ \end{aligned}.$$
(17)
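Equations (16) and (17) can be sketched numerically as follows. This is only an illustrative sketch: the design matrix is assumed to have the columns \(1\), \(s\ln (s)\), and \(1/\ln (s)\), which is the three-term form visible in Eq. (28), and the similarity samples `s_demo` are hypothetical values, not the F3 data. The helper names `pfe_design_matrix` and `unit_covariance` are introduced here for illustration.

```python
import numpy as np

def pfe_design_matrix(s):
    """Columns [1, s*ln(s), 1/ln(s)] for similarity values 0 < s < 1 (assumed form)."""
    s = np.asarray(s, dtype=float)
    log_s = np.log(s)
    return np.column_stack([np.ones_like(s), s * log_s, 1.0 / log_s])

def unit_covariance(G):
    """Eq. (17) with [cov d] = I: returns G_g^{-1} [G_g^{-1}]^T."""
    # Eq. (16): G_g^{-1} = [G^T G]^{-1} G^T, computed without an explicit inverse.
    G_g_inv = np.linalg.solve(G.T @ G, G.T)
    return G_g_inv @ G_g_inv.T

# Hypothetical similarity samples, for illustration only (not the F3 data).
s_demo = np.linspace(0.05, 0.95, 50)
G_demo = pfe_design_matrix(s_demo)
cov_m_demo = unit_covariance(G_demo)
print(np.round(cov_m_demo, 4))
```

The diagonal of the printed matrix gives the variances of the three constants under unit data variance; its values depend entirely on the assumed samples and are not expected to reproduce the numbers reported for the real dataset.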

Equation (17) is very helpful for getting a sense of the basic stability of the PFE, where stability refers to the level of noise expected in the solution. In fact, \([\text{cov}\,{\mathbf{m}}]\) is a function of the forward problem as expressed in \({\mathbf{G}}\), and not a function of the actual data. Therefore, it shows how well the PFE equation tolerates noise in the data, and for this purpose it is not necessary to know \([\text{cov}\,{\mathbf{d}}]\) (Menke 1989). Because this research aims only at a quick assessment of stability, it is assumed that

$$[\text{cov}\,{\mathbf{d}}] = {\mathbf{I}}.$$
(18)

The diagonal terms of \([\text{cov}\,{\mathbf{m}}]\) are the variances of the model parameters and the off-diagonal terms are the covariances. The \([1,1]\) entry in \([\text{cov}\,{\mathbf{m}}]\) is \(\sigma_{a}^{2}\), the variance for a. Correspondingly, the standard deviation of the error for a is \(\sigma_{a}\). Therefore, a perfect model would have a zero diagonal in the \([\text{cov}\,{\mathbf{m}}]\) matrix. Conversely, the solution of the PFE is essentially meaningless if the diagonal entries of the corresponding covariance matrix are large compared with the estimated parameters. To see this, consider the covariance matrix \([\text{cov}\,{\mathbf{m}}]\) for PFE:

$$[\text{cov}\,{\mathbf{m}}] = {\mathbf{G}}_{g}^{ - 1} [{\mathbf{G}}_{g}^{ - 1} ]^{\text{T}} = [{\mathbf{G}}^{\text{T}} {\mathbf{G}}]^{ - 1} {\mathbf{G}}^{\text{T}} [[{\mathbf{G}}^{\text{T}} {\mathbf{G}}]^{ - 1} {\mathbf{G}}^{\text{T}} ]^{\text{T}} = \left[ {\begin{array}{*{20}c} {0.5435} & {2.3385} & {0.0276} \\ {2.3385} & {10.4406} & {0.1158} \\ {0.0276} & {0.1158} & {0.0014} \\ \end{array} } \right].$$
(19)

The above covariance matrix measures how uncorrelated noise with unit variance in the data is mapped into uncertainties in the estimated model parameters. In the inverse problem of the PFE, the least-squares solution can therefore be expressed as

$$\begin{aligned} &a = 0.4253 ;\quad \sigma_{a}^{2} = 0.5435 ;\quad \sigma_{a} = 0.7372\quad \to a = 0.4253 \pm 0.7372 \hfill \\ &b = 0.4780 ;\quad \sigma_{b}^{2} = 10.4406;\,\,\,\sigma_{b} = 3.2312\quad \to b = 0.4780 \pm 3.2312 \hfill \\ &c = 0.0080 ;\quad \sigma_{c}^{2} = 0.0014 ;\quad \sigma_{c} = 0.0374\quad \to c = 0.0080 \pm 0.0374. \hfill \\ \end{aligned}$$
(20)

According to Eq. (20), the variances of \(a, \, b,{\text{ and }}c\) are very large, which indicates that the least-squares solution, although it fits the data, is very unstable, i.e., sensitive to noise in the data. Therefore, the solved PFE for the sand dataset [Eq. (15)] cannot be relied upon for predicting the porosity distribution, as observed in Fig. 5. The next section attempts to improve the stability of the PFE using the Tikhonov approach.
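The step from Eq. (19) to Eq. (20) is a square root of the diagonal. The short sketch below reproduces it from the printed values; the ratio of each standard deviation to its estimate is an extra diagnostic added here for illustration and is not part of the original analysis.

```python
import numpy as np

# Model covariance matrix as printed in Eq. (19).
cov_m = np.array([
    [0.5435, 2.3385, 0.0276],
    [2.3385, 10.4406, 0.1158],
    [0.0276, 0.1158, 0.0014],
])
# Least-squares constants a, b, c as printed in Eq. (20).
m_ls = np.array([0.4253, 0.4780, 0.0080])

sigma = np.sqrt(np.diag(cov_m))   # standard deviations of a, b, c
relative = sigma / np.abs(m_ls)   # uncertainty relative to each estimate (illustrative)

for name, value, sd, rel in zip("abc", m_ls, sigma, relative):
    print(f"{name} = {value:.4f} +/- {sd:.4f}  (sigma / |estimate| = {rel:.2f})")
```

Every ratio exceeds 1, i.e., each standard deviation is larger than the constant it belongs to, which is the instability described in the text.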

8 Improving the stability of PFE model using the Tikhonov approach

In 1902, Jacques Hadamard introduced the notion of a well-posed problem. A well-posed problem in the sense of Hadamard is a problem that fulfills the following three conditions:

  (1) The solution exists.

  (2) The solution is unique.

  (3) The solution depends continuously on the problem data.

If any of these conditions is not satisfied, the problem is ill-posed. The first and second conditions deal with the feasibility of the problem, whereas the last condition relates to the possibility of implementing a stable numerical procedure for its solution. The solution of a problem is always based on some data, typically obtained from experiments. If the solution does not depend “smoothly” on the problem data, small variations in the data can create huge variations in the solution, resulting in an unacceptable instability. Solving ill-posed problems therefore calls for regularization, which replaces an ill-posed problem with a well-posed one. Historically, Tikhonov regularization is one of the oldest and best-known techniques for stabilization (Wiener 1942). To apply Tikhonov regularization to the PFE problem, the following minimization problem is considered (Tikhonov and Arsenin 1977):

$$\hbox{min} \, \left\| {{\mathbf{d}} - {\mathbf{Gm}}} \right\|^{2} + \varepsilon^{2} \left\| {\mathbf{m}} \right\|^{2} ,$$
(21)

where \(\varepsilon^{2}\) is a parameter that controls the influence of the regularization term. Minimizing Eq. (21) with respect to \({\mathbf{m}}\) yields the Tikhonov solution:

$${\mathbf{m}}_{\text{Tikh}} = \left[ {{\mathbf{G}}^{\text{T}} {\mathbf{G}} + \varepsilon^{2} {\mathbf{I}}} \right]^{ - 1} {\mathbf{G}}^{\text{T}} {\mathbf{d}}.$$
(22)
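A minimal numerical sketch of Eq. (22) is given below. The function name `tikhonov_solve` and the nearly collinear toy problem are assumptions introduced for illustration; they are not the PFE dataset.

```python
import numpy as np

def tikhonov_solve(G, d, eps2):
    """Solve min ||d - G m||^2 + eps2 * ||m||^2 through the closed form of Eq. (22)."""
    n_params = G.shape[1]
    return np.linalg.solve(G.T @ G + eps2 * np.eye(n_params), G.T @ d)

# Hypothetical, nearly collinear toy problem (illustration only).
rng = np.random.default_rng(0)
x = np.linspace(0.0, 1.0, 40)
G_toy = np.column_stack([np.ones_like(x), x, x + 1e-3 * rng.standard_normal(x.size)])
d_toy = 0.3 + 0.1 * x + 0.01 * rng.standard_normal(x.size)

m_ls = tikhonov_solve(G_toy, d_toy, 0.0)   # eps2 = 0 reduces to ordinary least squares
m_reg = tikhonov_solve(G_toy, d_toy, 0.5)  # damped solution
print("least squares:", m_ls)
print("Tikhonov     :", m_reg)
```

Setting `eps2 = 0` recovers the least-squares solution, while a positive `eps2` shrinks the norm of the estimated constants, which is the damping effect the regularization term is meant to provide.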

Care must be taken because the solution depends on the parameter \(\varepsilon^{2}\); the choice of this parameter strongly influences the estimated \({\mathbf{m}}\). In practice, \(\varepsilon^{2}\) is determined by trial and error, with the attendant trade-off between resolution and stability. There are several heuristic ways to select \(\varepsilon^{2}\) (Wabha 1990; Hansen 1992; Hilgendorf 1997); the criterion described below is based on a balance between the total variance of the PFE and the model resolution. This criterion provides a convenient graphical tool for displaying the trade-off between the size of a regularized solution and its fit to the given data, as the regularization parameter varies. In this research, to obtain a better solution for the PFE model, the total variances obtained for different values of \(\varepsilon^{2}\) are plotted in a graph. The total variance is defined as the trace of the model covariance matrix, given by

$${\text{Total variance}} = {\text{trace [cov }}\,{\mathbf{m}}] = \sigma^{2} \left\{ {{\text{trace}}\left[ {{\mathbf{G}}^{\text{T}} {\mathbf{G}} + \varepsilon^{2} {\mathbf{I}}} \right]^{ - 1} } \right\}.$$
(23)

In the next step, the total variance from Eq. (23) was plotted as a function of \(\varepsilon^{2}\) (Fig. 6).
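The sweep over \(\varepsilon^{2}\) can be sketched as follows. The design matrix columns \(1\), \(s\ln (s)\), and \(1/\ln (s)\) follow the three-term form visible in Eq. (28); the similarity samples, the grid of \(\varepsilon^{2}\) values, and the default `sigma2 = 1.0` (unit data variance) are assumptions made here for illustration.

```python
import numpy as np

def total_variance(G, eps2, sigma2=1.0):
    """Eq. (23): sigma^2 * trace([G^T G + eps2 I]^{-1}); sigma2 = 1 is an assumption."""
    n_params = G.shape[1]
    return sigma2 * np.trace(np.linalg.inv(G.T @ G + eps2 * np.eye(n_params)))

# Hypothetical similarity samples (not the F3 data) and the assumed PFE columns.
s = np.linspace(0.05, 0.95, 50)
G = np.column_stack([np.ones_like(s), s * np.log(s), 1.0 / np.log(s)])

eps2_grid = np.linspace(0.0, 2.0, 21)
variances = np.array([total_variance(G, e) for e in eps2_grid])
for e, v in zip(eps2_grid[::5], variances[::5]):
    print(f"eps2 = {e:.1f}  total variance = {v:.4f}")
```

Because each eigenvalue of \({\mathbf{G}}^{\text{T}} {\mathbf{G}}\) is shifted upward by \(\varepsilon^{2}\) before inversion, the computed total variance falls monotonically along the grid, whatever samples are used.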

Fig. 6 Plot of total variance versus Tikhonov regularization parameter

As expected, the total variance decreases as the regularization parameter increases. For \(\varepsilon^{2} = 0,\) which corresponds to the least-squares solution discussed above, the total variance of the PFE is at its maximum and the PFE is not sufficiently stable. Judging by Fig. 6 alone, the best regularization parameter would be the one giving the minimum total variance. However, this graph by itself is not enough to choose the most appropriate value of \(\varepsilon^{2}\), because the regularization parameter must be tuned carefully to achieve an acceptable balance between the stability and the accuracy of the solution. For this reason, the plots of total variance and of trace (R), the trace of the model resolution matrix, are considered together. The model resolution matrix is given by

$$R = \left[ {{\mathbf{G}}^{\text{T}} {\mathbf{G}} + \varepsilon^{2} {\mathbf{I}}} \right]^{ - 1} {\mathbf{G}}^{\text{T}} {\mathbf{G}}.$$
(24)

The model resolution matrix \(R\) measures the ability of the inverse operator to uniquely determine the estimated model parameters.
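Equation (24) can be evaluated with a few lines of code. In the sketch below, the function name `model_resolution`, the similarity samples, and the design matrix columns (taken from the three-term form visible in Eq. (28)) are assumptions for illustration only.

```python
import numpy as np

def model_resolution(G, eps2):
    """Eq. (24): R = [G^T G + eps2 I]^{-1} G^T G."""
    GtG = G.T @ G
    return np.linalg.solve(GtG + eps2 * np.eye(GtG.shape[0]), GtG)

# Hypothetical similarity samples (not the F3 data) and the assumed PFE columns.
s = np.linspace(0.05, 0.95, 50)
G = np.column_stack([np.ones_like(s), s * np.log(s), 1.0 / np.log(s)])

R0 = model_resolution(G, 0.0)
R_half = model_resolution(G, 0.5)
print("trace(R) at eps2 = 0.0:", np.trace(R0))
print("trace(R) at eps2 = 0.5:", np.trace(R_half))
```

For \(\varepsilon^{2} = 0\) the matrix reduces to the identity, so its trace equals the number of model parameters (three here); any positive \(\varepsilon^{2}\) lowers the trace, which is the loss of resolution that must be weighed against the gain in stability.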

Fig. 7 Plot of trace (R) versus Tikhonov regularization parameter; the determined regularization parameter is the best balance between stability and resolution

Figure 7 indicates that for \(\varepsilon^{2} = 0,\) the PFE constants are perfectly resolved. Comparing the plots of total variance and the trace of the model resolution matrix shows that as \(\varepsilon^{2}\) increases, stability improves (total variance decreases) while resolution degrades. This is an inevitable trade-off. The most suitable value of the regularization parameter \(\varepsilon^{2}\) is taken at an intermediate point at the corner of the trace (R) and total variance plots (\(\varepsilon^{2} = 0.5\)). This point, marked with a rectangle in Fig. 7, is expected to provide the most balanced regularized solution of the inverse problem in terms of accuracy and regularity. Following Eq. (22), the optimized solution of the PFE inverse problem is calculated as

$${\mathbf{m}}_{\text{Tikh}} = \left[ {{\mathbf{G}}^{\text{T}} {\mathbf{G}} + 0.5 \, {\mathbf{I}}} \right]^{ - 1} {\mathbf{G}}^{\text{T}} {\mathbf{d}} = \left[ {\begin{array}{*{20}c} {0.3075} \\ { - 0.0211} \\ {0.0020} \\ \end{array} } \right]$$
(25)

$$[\text{cov}\,{\mathbf{m}}]^{\text{Tikh}} = [{\mathbf{G}}^{\text{T}} {\mathbf{G}} + 0.5 \, {\mathbf{I}}]^{ - 1} {\mathbf{G}}^{\text{T}} [[{\mathbf{G}}^{\text{T}} {\mathbf{G}} + 0.5 \, {\mathbf{I}}]^{ - 1} {\mathbf{G}}^{\text{T}} ]^{\text{T}} = \left[ {\begin{array}{*{20}c} {0.0301} & {0.0518} & {0.0021} \\ {0.0518} & {0.2493} & {0.0024} \\ {0.0021} & {0.0024} & {0.0002} \\ \end{array} } \right]$$
(26)

$$\begin{aligned} &a_{\text{Tikh}} = 0.3075 ;\quad \sigma_{a}^{2} = 0.0301 ;\quad \sigma_{a} = 0.1734\quad \to \quad a_{\text{Tikh}} = 0.3075 \pm 0.1734 \\ &b_{\text{Tikh}} = - 0.0211 ;\quad \sigma_{b}^{2} = 0.2493 ;\quad \sigma_{b} = 0.4992\quad \to \quad b_{\text{Tikh}} = - 0.0211 \pm 0.4992 \\ &c_{\text{Tikh}} = 0.0020 ;\quad \sigma_{c}^{2} = 0.0002 ;\quad \sigma_{c} = 0.0141\quad \to \quad c_{\text{Tikh}} = 0.0020 \pm 0.0141 \end{aligned}$$
(27)

$${\text{Porosity}}^{\text{Tikh}} = 0.3075 - 0.0211\,s\ln (s) + \frac{0.0020}{\ln (s)}.$$
(28)
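Equation (28) can be applied to similarity values with the sketch below. The function name `pfe_porosity` and the handling of invalid inputs are assumptions added for illustration: the expression is undefined for \(s \le 0\) and singular at \(s = 1\), where \(\ln (s) = 0\), so those samples are returned as NaN rather than evaluated.

```python
import numpy as np

# Constants as printed in Eq. (25).
A_TIKH, B_TIKH, C_TIKH = 0.3075, -0.0211, 0.0020

def pfe_porosity(similarity):
    """Evaluate Eq. (28) element-wise; NaN where ln(s) is undefined or zero."""
    s = np.atleast_1d(np.asarray(similarity, dtype=float))
    porosity = np.full(s.shape, np.nan)
    valid = (s > 0.0) & (s != 1.0)  # ln(s) must exist and be non-zero
    log_s = np.log(s[valid])
    porosity[valid] = A_TIKH + B_TIKH * s[valid] * log_s + C_TIKH / log_s
    return porosity

print(pfe_porosity([0.2, 0.5, 0.9, 1.0]))
```

How samples with similarity exactly equal to 1 should be treated in practice is not stated in the text; masking them is only one possible choice.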

Equation (27) presents the optimized constants and their variances. Compared with Eq. (20), the variances are reduced considerably by Tikhonov regularization. To assess the predictive power of the optimized PFE, the testing set is applied again to evaluate the new PFE [Eq. (28)].

A comparison of Figs. 5 and 8a indicates that Tikhonov regularization has increased the predictive performance of the PFE, and the optimized PFE suggests a considerable nonlinear relation between porosity and the similarity attribute. For exactly the same testing set, the correlation between the actual porosity derived from the density log and the estimated porosity is around 47 % for the linear regression model (Fig. 8b), whereas it increases to 72 % for the PFE model (Fig. 8a). Based on these cross-validation results, the developed PFE model appears to estimate the porosity distribution of the sand units of a reservoir with acceptable quality. In the shale units of the reservoir, however, the results of the PFE model are not significant compared with the regression analysis.
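The comparison above rests on the correlation coefficient between measured and estimated porosity. A minimal sketch of that calculation is shown below, assuming the Pearson coefficient is meant; the arrays are synthetic stand-ins generated for illustration and are not the F03-4 test set, so the printed values do not correspond to the 47 % and 72 % reported here.

```python
import numpy as np

def correlation_coefficient(actual, estimated):
    """Pearson correlation between measured and estimated porosity."""
    return float(np.corrcoef(actual, estimated)[0, 1])

# Synthetic stand-in for a 250-point test set (NOT the F03-4 data).
rng = np.random.default_rng(42)
actual = rng.uniform(0.25, 0.35, size=250)
good_estimate = actual + 0.02 * rng.standard_normal(250)  # smaller estimation error
poor_estimate = actual + 0.06 * rng.standard_normal(250)  # larger estimation error

r_good = correlation_coefficient(actual, good_estimate)
r_poor = correlation_coefficient(actual, poor_estimate)
print(f"r (smaller error) = {r_good:.2f}, r (larger error) = {r_poor:.2f}")
```

An estimator with smaller scatter around the measured values gives a coefficient closer to 1, which is the sense in which the two models are ranked.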

Fig. 8 a Cross plot between real and estimated porosity of the testing set by PFE optimized by the Tikhonov method in the sand unit; b cross plot between real and estimated porosity of the testing set by the regression equation in the sand unit

According to the results in Fig. 9, the developed PFE model shows an inverse correlation in the shale units, whereas the regression model adapts better to shale sediments than to sand ones. Although the PFE does not perform well in the shale units, it can be considered for sandstone reservoirs because it provides initial information about them. Further studies are clearly needed to explore other aspects of this experimental equation. For example, this research assumed \([\text{cov}\,{\mathbf{d}}] = {\mathbf{I}}\) to obtain the model covariance matrix, and the authors did not consider prior information in solving the PFE. Therefore, more research, especially using a probabilistic approach, should be carried out to develop this model.

Fig. 9 a Cross plot between real and estimated porosity of the testing set by PFE optimized by the Tikhonov method in shale units; b cross plot between real and estimated porosity of the testing set by the regression equation in shale units

9 Reservoir characterization using PFE and regression analysis

In this section, the optimized PFE is applied as an estimator to various cross sections of the F3 block, and its outputs are compared with those of the cross-plot analysis between AI and porosity.

Two cross sections, along inlines 228 and 339 of the F3 block, show the porosity distribution of the reservoir obtained with the optimized PFE model (Fig. 10a, c) and with the regression equation (Fig. 10b, d). These outputs cover the time range of 400–1150 ms and show the porosity distribution of only the upper package described in the geological setting section. In this package, three sedimentary units (units 1, 2, and 3) are identified. The boundaries between the units are plotted in Figs. 11 and 12, which show the PFE results along inlines 244 and 442, respectively.

Fig. 10 a Distribution of porosity estimated by PFE in inline 228; b distribution of porosity estimated by linear regression in inline 228; c distribution of porosity estimated by PFE in inline 339; d distribution of porosity estimated by linear regression in inline 339

Fig. 11 Distribution of porosity estimated by PFE in inline 244

Fig. 12 Distribution of porosity estimated by PFE in inline 442

10 Discussion

In the previous sections, the PFE was introduced as a nonlinear mathematical model for estimating porosity. The model depends nonlinearly on the similarity attribute. It was first fitted to the implementing set using a least-squares solution and evaluated with a different well. The initial results showed that the PFE needed to be regularized. Tikhonov regularization was then employed, and the optimized PFE gave relatively good results in sand units, whereas the conventional cross-plotting method between AI and porosity was less accurate in these units. In addition to the cross correlation of the developed models, the spatial distribution of porosity predicted by both models is interpreted in the target zone. To better distinguish between the PFE and cross-plot analysis results, the outputs of both models are presented for the same sections in Fig. 10. In Fig. 10c, d, it is evident that unit 2 has a higher porosity than the other units (1 and 3) and does not exhibit any significant variations except close to a vertical discontinuity, which is known as a gas chimney anomaly. Gas chimneys have been interpreted as hydrocarbon leakage pathways, and mapping of such chimneys by neural network techniques has been established as an exploration tool. Wells drilled inside gas chimneys typically have higher pore fluid pressure, higher mud gas readings, higher mud gas wetness, more hydrocarbon shows, lower velocities, and higher temperatures than wells drilled outside gas chimneys (Løseth et al. 2008). Gas chimney and fault volumes extracted from 3-D seismic data are rapidly becoming valuable tools for exploration and field development. In Fig. 10a, b, the PFE model detects this anomaly, which marks the transition between the salt dome located in unit 1 (Zechstein) and the near-surface gas pockets in Fig. 12.
The red polygon in Fig. 11 shows the possible areas of this occurrence. The optimized PFE behaves this way because its intrinsic properties originate from the nature of similarity: the similarity attribute enhances fault structures and salt edges. Gas chimneys are a kind of fault structure, and similarity and amplitude attributes are usually used to detect them. In gas reservoirs, they identify the pathway of hydrocarbon migration, and when the aim of a study is to determine optimum drilling points, these structures can indicate the probable location of hydrocarbon accumulation. Also, in Figs. 10a and 12, a salt structure, the Zechstein salt dome, is identified by the PFE in unit 1; studying such structures is very important because they act as traps for hydrocarbon accumulation. In addition to the patterns already described in the PFE map, another anomaly can be found at about 530 ms. In fact, the F3 block contains a bright spot at about 530 ms, possibly due to the presence of a gas pocket. Chopra and Marfurt (2007) demonstrated that reflections from gas-charged reservoir rocks show much larger amplitudes than reflections from adjacent oil- or water-saturated zones; these are often known as bright spots. In the output of the PFE model (Fig. 10a), the bright spot is marked with a black arrow. In the PFE results, multiple layers of shale and sand sediments are observed; however, it was shown that the PFE could not match the shale dataset (Figs. 11, 12). Because the PFE model was developed using only the dataset of sand sediments, it is probably better tuned to the sand units than to the shale layers. Nevertheless, despite the inaccuracy of the PFE model in the shale units, the results provide sufficient evidence of its superiority, and the observations suggest that the PFE model has performed well within the gas-bearing sand reservoir of the F3 block.
Various seismic anomalies such as chimneys, faults, fractures, salt, bright spots, and sand bodies can be highlighted using the PFE technique, which analyzes the data through the similarity attribute. The PFE can therefore provide initial information about the reservoir, which is important for determining optimum points for drilling operations.

11 Conclusions

This work provides a comparative analysis between a developed empirical model and conventional cross plotting to characterize a North Sea reservoir in terms of porosity. The nonlinear empirical model has three unknown constants, which were optimized using Tikhonov regularization on a dataset taken from sand units. This method differs from the cross-plotting method in that it predicts porosity from the similarity attribute rather than from acoustic impedance. The PFE could not be matched to the shale dataset; therefore, this research concentrated on the behavior of the PFE on the sand dataset. Tikhonov regularization improved the predictive power of the PFE, which was validated with a correlation coefficient of 0.72 for the testing set, whereas the same coefficient was only 0.47 for the cross-plot analysis. However, the developed models show a paradoxical behavior in shale units, and it is evident that a single transform function such as the PFE cannot be applied to estimate petrophysical properties in various lithologies. A significant feature of the seismic sections obtained by the PFE is their capability to enhance gas chimneys. The reason for this behavior of the PFE is that its intrinsic properties originate from the nature of similarity: the similarity attribute enhances fault structures and salt edges. Unit 2, which is known as one of the main gas reservoirs of the F3 block, shows higher porosity than units 1 and 3 in the PFE results. According to the PFE outputs, the ability to detect geological structures such as faults (gas chimney), folds (salt dome), and bright spots, in addition to estimating the porosity of sandstone reservoirs, could serve as a guideline for selecting drilling points. The Tikhonov regularization approach showed that the bias represents a potentially significant component of the uncertainty in the solution of the PFE inverse problem.
Since the bias depends on quantities that are unknown to researchers, a priori information will be necessary to estimate it. In this research, \([\text{cov}\,{\mathbf{d}}]\) is an important piece of a priori information that was assumed equal to the identity matrix, although it could be specified more accurately. In future work, the determination of \([\text{cov}\,{\mathbf{d}}]\) should be investigated, and a probabilistic technique is proposed for developing a PFE model with greater predictive power. In addition, the physical basis of the PFE should be studied; if the physical relationship between porosity and similarity is investigated, the structure of the PFE could probably be optimized.