Use of maximum likelihood sparse spike inversion and probabilistic neural network for reservoir characterization: a study from F-3 block, the Netherlands

Maximum likelihood sparse spike inversion (MLSSI) method is commonly used in the seismic industry to estimate petrophysical parameters in inter-well region. In present study, maximum likelihood sparse spike inversion technique is applied to the processed 3D post-stack seismic data from the F-3 block, the Netherlands, for estimation of acoustic impedance in the region between the wells. The analysis shows that the impedance varies from 2500 to 6200 m/s/*g/cc in the region which is relatively low and indicates the presence of loose formation in the area. The correlation between synthetic seismic trace and original seismic trace is found to be 0.93 and the synthetic relative error as 0.369, which indicate good performance of the algorithm. The analysis also shows low-impedance anomaly in between 600 and 700 ms time interval which may be due to the presence of sand formation. Thereafter, the probabilistic neural network analysis is performed to predict porosity along with multi-attribute transform analysis to estimate P-wave velocity and porosity in inter-well region. These parameters strengthen the seismic data interpretation which is very crucial step of any exploration and production project. The method is first applied to the composite traces near to well locations, and results are compared with well log data. After getting reasonable results, the whole seismic section is inverted for the P-wave velocity and porosity volume. The analysis shows anomaly in between 600 and 700 ms time interval which corroborates well with the low-impedance zone which may correspond to the reservoir. This is preliminarily interpretation; however to confirm a reservoir, there is need for more petrophysical parameters to be studied.


Introduction
The process of combining seismic and well log data has been a routine goal of researchers from last few years because of the movement from exploration to growth of existing fields containing several wells in the field. There are two types of integration, the forward modeling and inverse synthetic seismic data modeling which is known as seismic inversion and has been explained by several authors (Lindseth 1979;Oldenburg et al. 1983;Chi et al. 1984;Cooke and Schneider 1983;). This study is related to the seismic inversion methods which are classified as pre-stack and post-stack inversion methods. This study focuses post-stack seismic inversion methods. The post-stack inversion methods were developed with the design of wavelet amplitude and phase spectra extraction algorithms (Lindseth 1979;Yilmaz 2001;Maurya and Singh 2015). These methods produced extraordinary resolution images of the subsurface, which enhanced the seismic interpretation and has significantly reduced the drilling risk. The P-impedance and the density parameters are derived from these inversions.
P-impedance provides information about the properties of rock constituting the reservoir (Buxton et al. 2000;Maurya and Singh 2017;Anderson 1996). Though, there are several post-stack inversion methods like model-based inversion (MBI), band-limited inversion (BLI), linear programming sparse spike inversion (LPSSI), colored inversion (CI) and maximum likelihood sparse spike inversion (MLSSI) available, but in this study we have adopted maximum likelihood sparse spike inversion method because this is fast and gives better results than other inversion methods (Russell 1988;Russell and Hampson1991;Helgesen et al. 2000;Wang et al. 2006;Dossal and Mallat 2005;Maurya and Sarkar 2016;. The maximum likelihood sparse spike post-stack inversion method is dependent on the model and is based on the assumption that the reflectivity is made up of a sequence of major spikes merged on the background of minor spikes (Wang et al. 2006). The inversion method simulates a synthetic seismic trace from the simplest possible reflectivity model that matches with the input seismic trace.
The sparse spike inversion methods use simple subsurface reflectivity model derived from the well log data, and convolution theory to generate synthetic traces. The error between synthetic trace and seismic trace is minimized by introducing more and more spikes in the reflectivity series (Debeye and Riel 1990). Depending on minimization of error, the sparse spike inversion methods are divided into two groups. The first method is called linear programming sparse spike inversion (LPSSI) methods that use l 1 norm solution for its implementation, and the second is called maximum likelihood sparse spike inversion (MLSSI) methods that use l 2 norm solution for its implementation (Russell 1988;Sacchi and Ulrich 1995;Zhang and Castagna 2011). The aim of the sparse spike inversions is to estimate acoustic impedance in inter-well region by integrating well log and seismic data. The inverted impedance reveals a time-domain blocky subsurface structure which is directly related to the subsurface lithology (Goutsias and Mendel 1986;Wang 2010).
Further, the present study utilizes geostatistical methods to predict various petrophysical parameters (P-wave velocity and porosity) in inter-well region. The geostatistical methods use inverted impedance as input and predict petrophysical parameters away from the boreholes. These geostatistical approaches are further divided into four parts as single-attribute analysis, multi-attribute regression, probabilistic neural network (PNN) and multilayer feed forward neural network (MLFN). Of all these methods, multi-attribute regression and probabilistic neural network method are most appropriate and fast to perform (Russell et al. 1997;Maurya et al. 2019). The multi-attribute regression analyses more than one attributes at a time and develop a liner statistical relationship between well log property and attributes. These statistical relationships are then used to predict petrophysical parameters. On the other hand, the probabilistic neural network derives nonlinear relationship rather liner relationship as in multi-attribute case and predict petrophysical parameters. The probabilistic neural network (PNN) is a system of mathematical interpolation used to implement structure of a neural network (Soubotcheva and Stewart 2004;Hampson et al. 2001;Pramanik et al. 2004). The flowchart of the study is presented in Fig. 1. The present study is performed using Hampson Russell software (HRS) (ver 10.2), a CGG veritas software suit.

The study area
The present study makes use of offshore 3D seismic data from the F3 block, North Sea, the Netherlands, which is a gas-producing country. The survey was carried out in 1987 in an area of 384 square km. The initial F3 dataset was noisy, and therefore, to eliminate the noises, a dip-steered median filter with a limit of two traces was applied to the data. There are four vertical wells within the survey area and all the wells had sonic and gamma ray logs. The depth of well log was about 1700 ms (Aminzadeh and De Groot 2006). The well location of F02-1 is 362 inline and 336 crossline, F03-2 is 722 inline and 848 crossline, F03-4 is 442 inline and 1007 crossline and F06-1 is 244 inline and 387 crossline. The seismic survey area ranges from inline 100 to 750 and crossline 300 to 1200. The essential data needed for the inversion procedure are P-wave sonic, density, S-wave sonic and check shot logs. Both seismic and well data are provided by an open source seismic repository portal (dGB Earth Sciences).
This survey was carried out for exploring oil and gas in the layer of Upper-Jurassic to Lower Cretaceous generally found below the certain interval. The upper part up to 1200 ms is made up of reflectors belonging to the Miocene, Pliocene, and Pleistocene (De Bruin and Bouanga 2007;Wolak et al. 2013). The deltaic bundle is comprised of sand and shale, which has very high porosity (20-42%).

Maximum likelihood inversion
The maximum likelihood inversion is executed in two steps. In the first step, maximum likelihood deconvolution is applied to estimate the seismic reflectivity series. Then in the second step, reflectivity series is transformed into the acoustic impedance which is more important to infer data about the subsurface layer (Hampson and Russell 1985).
The maximum likelihood deconvolution is based on the notion that the Earth's sequence of reflectivity consists of significant spikes surrounded in the background by minor spikes. The technique also assumes that only large spikes are essential because they show a deposition gap in the subsurface, and hence the goal of the maximum likelihood deconvolution (MLD) is to identify these large spikes from the seismic data. The seismic trace is expressed as follows.
where S(t) is time-dependent seismic trace, r(t) is reflectivity of earth dependent on time, W(t) is source wavelet dependent on time and n(t) is time-dependent noise component. Equation (1) can be written as follows: Using the subsurface model assumptions, one can reduce the target function for the best solution, i.e., earth's reflectivity series. The target function E can be written as follows: where r(j) is reflection coefficient at jth sample, R is square root of reflectivity variance, m is number of reflections, t is total number of samples, N is square root of noise variance, n(j) is noise at jth sample, and λ is the likelihood that a given sample has a reflection (Zhang and Castagna 2011;Russell 1988). By minimizing error as given by Eq. (3), one can obtain the earth's seismic data reflectivity series. Thereafter, this reflectivity series is converted into acoustic impedance. If the reflectivity is given by r(i), then the resulting impedance Z(i) can be given as follows.
Inappropriately, the use of Eq. (4) to estimate the reflectivity from MLD produces unacceptable results, specifically in cases of more noise. Although the MLD algorithm extrapolates outside the wavelet band width to provide a broadband reflectivity estimate, at the bottom of the spectrum the reliability is degraded by noise. Therefore, the impedance's brief wavelength characteristics can be reassembled properly, but the general trend is poorly resolved. This is similar to the fact that the spike's times on the reflectivity approximation are well determined than their amplitudes. The independent understanding of the impedance trend can be used as a constraint to stabilize the reflectivity estimate. Since r(i) < 1 , the equation between acoustic impedance and reflectivity provided by r(i) < 1 can be derived as follows: where Z(i) is the recognized trend in impedance, n(i) is an input trend error, and H(i) can be written as follows: The maximum likelihood sparse spike inversion is a method that extracts broadband seismic reflectivity approximation and enables us to invert into an acoustic impedance segment by introducing linear limitations that maintains the main geological characteristics of borehole log information.

Multi-attribute linear regression
Using seismic and well log information, the multi-attribute linear regression is used to predict the well log property in inter-well region. The method estimates and utilizes a range of seismic attributes from the seismic information to assess the connection between the attributes and characteristics of the well log. Then, in the inter-well region, this connection is used to assess petrophysical parameters. It is well known that the crossplot between the two is the easiest way to derive the required connection between target information and seismic attribute for a particular attribute of the seismic data. Assuming that the attribute has a real connection with the target log, a straight line can be fitted as follows: In this equation, the coefficients a and b are obtained by reducing the error of mean-square prediction as follows: The calculated prediction error (E) is a metric of fitness for the regression line defined by Eq. (6). Further, the normalized correlation coefficient can be defined as follows: It should be noted that the linear requirement can be relaxed to some extent by implementing a nonlinear transformation into either target data or attribute data or both.
The discussion till this point of time is linear where single attribute is utilized at a time. For multiple attributes to be used at the moment of the relationship estimation between target log and seismic attributes, the analysis would be as follows:  . 2 The Gamma log, density log, P-wave (blue) and S-wave (red) logs and comparison between original porosity (blue) and predicted porosity (red) of well F06-1 Assume that there are three attributes A 1 , A 2 and A 3 . The object log is modeled on the linear equation at each sample.
where in Eq. (14), the weight scan be derived by reducing the predicted mean-square error given by The solution generates the standard normal equations for the four weights: The mean-square error [Eq. (15)] is calculated using derived weights, which is a goodness-of-fit metric for the transformation as does the standardized correlation described by Eq. (8), in the case of a single attribute, where the predicted log value is the x-coordinate and the actual log value is the y-coordinate.

Probabilistic neural network
The probabilistic neural network (PNN) is a method of mathematical interpolation that makes use of architecture of the neural network. The information used by PNN is a sequence of training data for every seismic sample in the examination windows for all the wells.
D (x, x i ) reflects the distance between the point of entry and each point of training, x i , in multi-dimensional space covered by the attributes. The value D (x, x i ) is differentiated and scaled by an amount that can vary for each attribute.
Equations (17) and (18) define the use of the PNN network. The network training consists of the ideal set of parameters for smoothing j . The standard for these parameters is that the calculated network should have the lowest validation error.
The authentication outcome for the mth target example is given by This is the mth target example estimated value when the example is excluded from the training data. As this example's value is known, the prediction error can be calculated for that example. By repeating this process for each of the  The error of prediction is based on selecting the parameters, j . This value is minimized using a nonlinear conjugate gradient algorithm described in Masters (1994Masters ( , 1995. The network calculated is the minimum validation error. The greatest issue with PNN is that it carries all its training information and compares each output sample with each training sample (Specht 1990(Specht , 1991. In short, PNN is a feed forward neural network with a complex structure. It consists of an input layer, a layer of pattern, a layer of summation, and an output layer. PNN has only one training parameter despite its complexity. This is a smoothing parameter of probability density functions (PDFs) used in the pattern layer to activate the neurons. Therefore, the PNN training process needs only one input-output signal transfer to determine the response of the network. Nevertheless, in terms of generalization efficiency, only the optimal value of the smoothing parameter gives the possibility of the correctness of the model's response. The value of j has to be estimated depending on the performance of the PNN, which is normally done iteratively. Two issues need to be addressed in the smoothing parameter estimation process. The first relates to the PNN pattern layer neurons choice of j in probability density function (PDF). There are four possible approaches, i.e., one parameter for the entire model, one parameter for each class (Adeli and Panakkat 2009), one parameter for each data attribute, and one parameter for each attribute and one class. The second issue related to the estimation of the smoothing parameter for PNN is the calculation of the j value foe which different procedures are available. For the probability density functions in the pattern layer, the authors use the particle swarm optimization algorithm to estimate the matrix of smoothing parameters (Zhong et al. 2007) where the smoothing parameter adaptation gap-based approach is used. Based on the gap calculated between the two nearest points of the data set, the authors have provide the formula for j . The choice of smoothing parameter plays a crucial role in the probabilistic neural network training process. This fact is particularly important when PNN has different parameters for each class, each attribute, each class and each attribute. The job of choosing smoothing parameters can then be considered as a problem of high-dimensional function optimization. The reinforcement learning (RL) algorithm is an effective method to solve these problems.

Training and validation of PNN
In this section, we examine the problem of how to determine the correct number of attributes to be used. One can show that a multi-attribute transform with N + 1 attributes must always have a training error less than or equal to the transform with N attributes. As more attributes are added, one can expect an asymptotically decreasing prediction error. Of course, while the added attributes continuously improve the fit to the training data, they may be unusable or worse when applied to new data not in the training set. This is occasionally called overtraining and is described by Kalkomey   Fig. 6 P-impedance (original log) versus P-impedance (inverted log) crossplot Fig. 7 Training data for both wells. The curve on the left is the target P-wave log from the well. The center curve is the composite seismic trace from the 3-D volume at the well location. The right curve is the composite acoustic impedance from the seismic inversion  (1997). Efficiently, using higher numbers of attributes is analogous to fitting a crossplot with progressively higherorder polynomials. A number of statistical methods have been derived to measure the reliability of the higher-order attribute fits (e.g., Draper and Smith 1966). Inappropriately, most of these methods apply to linear regression and are not immediately applicable to nonlinear prediction using neural networks. For this reason, we have selected crossvalidation, which can be applied to any type of prediction. Cross-validation consists of dividing the full training data set into two subsets: the training data set and the validation data set. The training data set is used to derive the transform, while the validation data set is used to quantity its final prediction error. The hypothesis is that overtraining on the training data set will result in a poorer fit to the validation data set. In our analysis, the natural subdivision of data is by well. In other words, the training data set comprises training samples from all wells, except some specified hidden well. The validation data set comprises samples from that hidden well. In the procedure of cross-validation, the analysis is repeated as many times as there are wells, each time leaving out a different well. The total validation error is the RMS average of the individual errors: where E V is the total validation error, e Vi is ith well validation error, and N is the number of wells in the analysis.
In this study, different types of errors are calculated and they are as follows: Error Error is a statistical value associated with the target attribute pair. The pair with the best correlation (and lowest error) is listed at the top, followed by pairs with decreasing correlation (and increasing error). It is associated with the only single-attribute analysis as shown in Tables 1 and 3.
Training error Training error is the prediction error obtained by applying the model to the same data from which it is trained. In other words, training error is the error that one gets when he runs the trained model back on the training data. Training error is always less than the validation error (Tables 2 and 4).

Validation error
The average predictive error of all the hidden wells is referred to as validation error. The validation error is the error associated with applying the PNN to the entire seismic data volume. The total validation error is the RMS average of individual errors. If validation error is low, slightly higher than the training error, then it shows good fit. All these errors are mean-square error.

Results and discussion
The study is performed in four steps: First, porosity log is predicted at one well location by using other well logs of the region. Thereafter, in the second step, ML sparse spike inversion is performed to estimate P-impedance. Further, multi-attribute regression is adopted to predict P-wave velocity and porosity away from the boreholes. In the last step, PNN is applied to the data and P-wave velocity and porosity is estimated. The output of each steps is explained in the following sections. Fig. 9 Probabilistic neural network analysis for P-wave prediction with a good correlation 0.93 1 3

Predicting porosity log from other porosity logs
A porosity log is predicted at a particular well location using other well log information in the area. This is performed to monitor capacity of some of the geostatistical methods. For this purpose, porosity log is predicted at well F06-1 location using information of wells F02-1, F03-2, and F03-4. This prediction is performed using single-attribute analysis and multi-attribute regression technique. Further, the predicted well log curve is plotted with actual well log porosity from well F06-1 for comparison between them and displayed in Fig. 2. The gamma log, density log, and P-wave and S-wave velocity logs are also plotted in Fig. 2. The crossplot between original porosity versus predicted porosity is shown in Fig. 3 which shows a very good correlation of 0.99.

Maximum likelihood sparse spike inversion
In the inversion process, we have used only 200-400 inline and 300-399 crossline in which only two well F02-1 and F06-1 are present. The reason behind this is to reduce computation time and cost. The maximum likelihood sparse spike inversion is applied to the post-stack seismic data from the F-3 block, the Netherlands. The analysis shows that the area impedance varies from 2500 to 6200 m/s*g/cc. The impedance is relatively low which indicates the presence of loose formation in the region. The correlation coefficient is estimated to be 0.93, and error is 0.370. The comparison of single trace inversion near to well location is presented in Fig. 4. The cross section of inverted impedance is shown in Fig. 5. The top figure shows post-stack seismic section at inline 362, and their inverted section is presented at the bottom of Fig. 5. A low-P-impedance zone near to 680 ms time highlighted by arrow can be seen from the inverted impedance section. This low-impedance zone may be due to the presence of reservoir zone, but this is initial interpretation and one cannot be sure about the reservoir as more parameters are needed to characterize it. Figure 6 shows crossplot between P-impedance (original log) versus P-impedance (inverted log). The r-square value in the crossplot is 0.84 which indicates good inversion result. The r square indicates the proportion of the variance in the dependent variable that is predictable from the independent variable. The distribution of the scatter points in the crossplot indicates good performance of the ML inversion algorithm. Now, this inverted impedance is used as input to predict P-wave velocity and porosity away from the boreholes.

Prediction of P-wave velocity with the help of inverted P-impedance
P-wave velocity and porosity is predicted using geostatistical methods. These parameters help to get more confidence about the interpretation of subsurface features. The prediction is performed using inverted P-impedance as one attributes. First, P-wave velocity is predicted and then porosity prediction is performed. For P-wave prediction, we have used single-attribute analysis and multi-attribute regression method simultaneously. Firstly, we start training by inputting P-wave log of F02-1 and F06-1 wells and post-stack seismic data as an internal attribute and inverted P-impedance as an external attribute. The training results are presented in Fig. 7. For the analysis purpose, a list of single-attribute analyses is generated and displayed in Table 1.
Thereafter, the multi-attribute regression analysis is applied to the data in which we have taken 8 attributes and 8 operator lengths as threshold to predict P-wave velocity. The average error plot of all the multi-attributes is shown in Fig. 8. The list of all attributes with training error and validation error is shown in Table 2. We can take only two attributes from top of the multi-attribute list for the better result because we use only those attributes which follow a decreasing trend in the validation error curve (Fig. 8). Figure 9 shows a very good correlation (0.93) after applying probabilistic neural network.
The crossplot obtained from PNN between actual P-wave and predicted P-wave is shown in Fig. 10. The scatter points show valuable prediction results. The cross section of predicted P-wave velocity at inline 362 is shown in Fig. 11. The analysis shows that the area has P-wave velocity variation ranging from 1500 to 2600 m/s. This is again relatively low and indicates the presence of loose formation. A low-velocity zone is observed and marked at 680 ms time which may be due to the presence of reservoir in this zone. The P-wave velocity from well F02-1 is also plotted over the predicted section, and a good match is observed between them.

Porosity prediction with the help of inverted P-impedance
Porosity prediction is also performed using probabilistic neural networks (PNN) method. The entire process is same as discussed earlier in section for the P-wave velocity prediction. The list of single attributes with error and correlation is given in Table 3.
Further, the multi-attribute regression analysis is applied with 8 attributes and 8 operator lengths. The list of all attributes is shown in Table 4. Similar to P-wave prediction, we use 8 attributes for training the data and 2 attributes for validation because only 2 attributes follow a decreasing trend in the validation error curve.
Then probabilistic neural network is built for predicting the porosity by selecting two attributes together from top of multi-attribute list results because the validation curve is always decreasing with the well error curve. Comparison of Fig. 12 Correlation is 0.91 obtained from PNN analysis for predicting porosity predicted porosity with actual porosity is given in Fig. 12. The correlation coefficient is estimated to be 0.91 which shows a good correlation. The crossplot between actual porosity versus predicted porosity is shown in Fig. 13. The crossplot shows that the predicted porosity is following the trend of original porosity, sample by sample. This indicates good prediction results, and now, one can move to the prediction of porosity volume. The cross section of predicted porosity is displayed in Fig. 14. The analysis shows that that area has porosity variation ranging from 20 to 42% which is comparatively large. This again shows the presence of loose formation. A high-porosity zone is observed and marked at 680 ms time which corroborated well with low-impedance and low-P-wave-velocity zone. This anomalous zone may indicate the presence of reservoir, but for confirmation, one needs more petrophysical parameters to be studied.

Conclusions
In the present study, a variety of petrophysical parameters, i.e., impedance, porosity, velocity, are estimated from inversion of processed 3D post-stack seismic data of F-3 block, the Netherlands, using maximum likelihood sparse spike inversion which is relatively fast and gives better results than other inversion methods. The analysis shows a correlation coefficient of 0.93 which indicates good performance of the algorithm. Further, the inversion of entire seismic section for impedance shows a relatively low impedance varying from 2500 to 6200 m/s*g/cc in the region which indicates the presence of loose formation in the area. The analysis suggests a low-P-impedance zone at 680 ms time which may be due to the presence of a reservoir. Fig. 13 Crossplot between actual porosity versus predicted porosity obtained from PNN Thereafter, P-wave velocity and porosity prediction are performed using probabilistic neural network technique. The analysis shows a correlation coefficient of 0.89 and 0.93 for P-wave velocity and porosity, respectively. The results show that the area has P-wave velocity varying from 1500 to 2600 m/s and the porosity varying from 20 to 42%. The relatively low P-wave velocity and high porosity in the area indicate the presence of loose formation which is well correlated with low-impedance regime in the area. Further, the PNN analysis also shows an anomalous zone near to 680 ms time which may be due to the presence of reservoir zone and confirms the finding of ML inversion results. This is prelim interpretation based on a small volume of the data, so it cannot be denied that the other part of the region has some other characteristics.