Regularized ridge regression models to estimate static elastic moduli from wireline measurements: case study from Southern Iraq

Elastic moduli such as Young’s modulus (E), Poisson’s ratio (ν), and bulk modulus (K) are vital to creating geomechanical models for wellbore stability, hydraulic fracturing, sand production, etc. Because obtaining core samples and performing rock tests is difficult, wireline measurements can be used instead to estimate dynamic moduli. However, dynamic moduli differ significantly from static moduli due to many factors. In this paper, correlations for three zones (Nahr Umr shale, Zubair shale, and Zubair sandstone) located in southern Iraq were created to estimate static E, K, and ν from dynamic data. Core plugs from these three zones were acquired alongside wireline measurements for the same sections. Single-stage triaxial (SST) tests with CT scans were executed for the core plugs. The data were separated into two parts, training (70%) and testing (30%), to ensure the models can be generalized to new data. Regularized ridge regression models were created to estimate static E, K, and ν from dynamic data (wireline measurements). The shrinkage parameter (α) was selected for each model through an iterative process with the goal of minimizing the error. The results showed that all models had a testing R² between 0.92 and 0.997, consistent with the training results. All models of E, K, and ν were linear except the ν models for the Zubair shale and Zubair sandstone, which were second-degree polynomials. Furthermore, root mean squared error (RMSE) and mean absolute error (MAE) were utilized to assess the error of the models. Both RMSE and MAE were consistently low in training and testing without a large discrepancy. Thus, given the regularization of ridge regression and the consistently low error during training and testing, it can be concluded that the proposed models can be generalized to new data and that no overfitting was observed.
The proposed models for Nahr Umr shale, Zubair shale, and Zubair sandstone can be utilized to estimate E, K, and ν based on readily available dynamic data which can contribute to creating robust geomechanical models for hydraulic fracturing, sand production, wellbore stability, etc.


Introduction
A precise estimate of mechanical rock properties is essential for all geomechanical applications (Davarpanah et al. 2020), including but not limited to in-situ stress estimation, wellbore stability analysis, sand control, and hydraulic fracturing design (Kidambi and Kumar 2016; Sulaimon and Teng 2020; Wang and Sharma 2017; Zeynali 2012; Zoback et al. 2003). Mechanical rock properties can be divided into elastic parameters (Young's modulus, shear modulus, bulk modulus, and Poisson's ratio) and rock strength parameters (unconfined compressive strength (UCS), cohesive strength, and internal friction angle). Ideally, these parameters are measured in the laboratory using core samples. Although expensive, laboratory tests are the most accurate way to determine these parameters (Fjaer et al. 2008). An alternative is to estimate mechanical rock properties from well logs (dynamic methods). It is essential to distinguish between dynamic and static methods: dynamic rock properties refer to the elastic stiffness estimated from well logs, whereas static rock properties refer to the elastic stiffness measured on core samples.
The dynamic stiffness is usually larger than the static stiffness in rocks (Fjaer 1999; Jizba and Nur 1990; King 1969; Martin and Haupt 1994; Olsen et al. 2008; Simmons and Brace 1965; Yale and Jamieson 1994). In contrast, for a homogeneous and isotropic material like steel, static and dynamic moduli are equal (Ledbetter 1993). There are many reasons why the dynamic stiffness is usually higher than the static stiffness, such as strain rate, drainage conditions, heterogeneity, anisotropy, and strain amplitude (Fjaer 2019).
Starting with the strain rate, the difference between the dynamic and static moduli can be explained by the difference in the rate of rock deformation: static deformation results from mechanical loading, whereas dynamic deformation is induced by ultrasonic waves. Laboratory measurements usually correlate better with the seismic frequency band. Therefore, dynamic moduli derived from seismic waves are closer to the laboratory tests than dynamic moduli derived from ultrasonic waves (Fjaer 2019; Fjaer et al. 2013).
Drainage conditions are another factor that may lead to the discrepancy between dynamic and static moduli. According to poroelastic theory, a large difference may be observed depending on whether the rock is drained or undrained (Biot 1956). Static rock deformation is frequently drained, while dynamically induced deformation is often undrained, which contributes to the discrepancy between dynamic and static moduli (Fjaer 2019).
It is well known that heterogeneity exists in rocks at various scales. When comparing static and dynamic moduli, it is important to keep in mind that the probed volume of rock is different in static and dynamic measurements. Especially for strongly heterogeneous rocks, heterogeneity can play a vital role in the discrepancy between the static and dynamic measurements (Fjaer 2019).
Anisotropy is another factor. For orthorhombic symmetry, for example, the stiffness tensor has nine independent components (Ahmed and Meehan 2016). Therefore, it is important to account for anisotropy when comparing the static and dynamic moduli. Because of data scarcity, dynamic moduli are usually used to estimate and compare with static moduli. However, the dynamic moduli often introduce a large margin of error due to the assumption of isotropy (Fjaer 2019).
Strain amplitude is another factor that contributes to the discrepancy between static and dynamic moduli. Static moduli may decrease as a result of non-elastic loading, while dynamic moduli will not be affected. For example, uncemented grain contacts or closed fractures may stay immobilized under elastic waves because of static friction, while they may be mobilized under static loading (Walsh 1965). This can be one of the reasons for the sensitivity of static and dynamic relationships to stress history and stress path (Fjaer 2019).
Finding an empirical relation between static and dynamic properties is vital for continuous and robust estimation of mechanical rock properties (Chang et al. 2006). Many correlations in the literature relate dynamic and static rock properties (Ameen et al. 2009; Asef and Farrokhrouz 2010; Brotons et al. 2014, 2016; Christaras et al. 1994; Davarpanah et al. 2020; Eissa and Kazi 1988; Horsrud 2001; King 1983; Kılıç and Teymen 2008; Lacy 1997; Lashkaripour 2002; Najibi et al. 2015; Ohen 2003). Table 1 presents a summary of some empirical correlations between static and dynamic properties from the literature.
The objective of this work is to present relationships between static and dynamic Young's modulus (E), Poisson's ratio (ν), and bulk modulus (K) for three zones (Nahr Umr shale, Zubair shale, and Zubair sandstone) located in southern Iraq. Once robust relationships between static and dynamic E, K, and ν are available, they can be used for many geomechanical applications such as in-situ stress estimation, wellbore stability, sand control, hydraulic fracturing design, etc.

Geological setting
In this section, the geological setting of the study area will be explained. Figure 1 shows the location of the study area. This work will focus on the southern part of Iraq, where most oil fields are located. This work uses rock samples from Zubair and Nahr Umr formations. These formations are Lower Cretaceous in age (Jassim and Goff 2006).

Zubair formation
Being the richest petroleum reservoir in southern Iraq, the Zubair formation comprises 380-400 m of sandstone, siltstone, and shale. It is divided into five shale and sandstone units used for reservoir description. The Zubair formation consists only of sandstone in the Salman Zone, and the shale portion decreases rapidly toward the southwest. It is underlain by interbeds of limestone and shale (Ratawi formation) and is overlain by a limestone formation (Shuaiba formation). The porosity of the Zubair formation ranges from 15 to 30%, while the permeability ranges between 20 and 1800 md (Jassim and Goff 2006).

Nahr Umr formation
The Nahr Umr formation is shale-dominated in the eastern part of Iraq and sand-dominated in the western and southwestern parts. It is composed of black shales interbedded with fine- to medium-grained sandstones. It is underlain by the Shuaiba formation and is overlain by the Mauddud formation (limestone). The porosity of the Nahr Umr formation ranges between 16 and 23.3% (Jassim and Goff 2006). Several wellbore instability issues have been experienced while drilling the Nahr Umr formation, including but not limited to caving, stuck pipes, and tight holes (Mohammed et al. 2018).

Core samples
Core samples were taken from three zones in southern Iraq (Nahr Umr shale, Zubair shale, and Zubair sandstone). Plugs were cut from the core samples of each zone and prepared for the geomechanical tests. Figure 2 shows the rock plugs before and after the SST tests for Nahr Umr shale, Zubair shale, and Zubair sandstone.

Preparation of core plugs
For the SST tests, each core plug was prepared with a length of at least two times its diameter and a diameter of at least ten times the size of the largest rock grain. The cylinder ends of each sample were surface ground to produce a flat-ended right cylinder and an accurate match with the end caps when mounted into the instrumentation stack. In addition, precise measurements of each tested sample were recorded, including its diameter, length, and weight.

Single-stage triaxial (SST) tests
Single-stage triaxial (SST) tests are among the most common geomechanical tests used to obtain mechanical rock properties. The limitation of SST tests is that at least three core plugs are required to obtain rock strength parameters such as uniaxial compressive strength (UCS), cohesion (S_o), and internal friction angle (ɸ) (Ameen et al. 2009; Fjaer et al. 2008; Zoback 2007). To overcome that limitation and to save time, material, and money, multistage triaxial (MST) tests are usually used to obtain rock strength parameters from only one core plug. However, for elastic properties (e.g., E, K, and ν), SST tests are usually recommended (Zoback 2007). In this work, SST tests were executed on core samples taken from three zones: Nahr Umr shale, Zubair shale, and Zubair sandstone. Confining pressures, which represent the horizontal stresses (σ_2 = σ_3 in SST), were kept constant for the three formations. Confining pressures of 5016 psi, 5581 psi, and 4629 psi were used for Nahr Umr shale, Zubair shale, and Zubair sandstone, respectively. For each sample, the axial load was increased at a rate of 1 μstrain/s while the changes in axial and radial deformations were monitored, until the failure point was reached (i.e., the maximum compressive strength (MCS)). A digital computerized data acquisition system was utilized to monitor confining pressure, axial load, and axial and radial deformations. Figure 3 shows the rock mechanics system used to conduct the SST tests.
Table 1 (excerpt): rock types listed include multiple rock types (Christaras et al. 1994), calcarenite (Brotons et al. 2014), and multiple rock types (Brotons et al. 2016); one listed correlation reads log10(E_s) = 0.96 log10(E_d) − 3.306.

Calculations of static elastic moduli
Static E, K, and ν were calculated from the SST test data for Nahr Umr shale, Zubair shale, and Zubair sandstone. Several methods can be used to estimate static E, such as the initial modulus, secant modulus, tangent modulus, and average modulus. In this work, static E was estimated from the slope of the stress-strain curve between 1/3 and 2/3 of the maximum compressive strength (MCS, the peak stress). Static E is the ratio between the axial stress (σ_a) and axial strain (ε_a) increments, static ν is the ratio between the radial strain (ε_r) and ε_a, and static K is the ratio between the confining pressure (σ_p) and the volumetric strain (ε_v), as shown in the following equations (Fjaer et al. 2008), where the minus sign applies when compressive strain is taken as positive:

$$E_s = \frac{\Delta\sigma_a}{\Delta\varepsilon_a}, \qquad \nu_s = -\frac{\varepsilon_r}{\varepsilon_a}, \qquad K_s = \frac{\sigma_p}{\varepsilon_v}$$
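As an illustration of the calculation described in this subsection, the following Python sketch estimates static E and ν from one stress-strain record by fitting a straight line to the points between 1/3 and 2/3 of the peak stress. The function names, the list-based input layout, the least-squares slope, and the compression-positive sign convention are assumptions made for this example; they are not details taken from the study.

```python
def slope(xs, ys):
    """Least-squares slope of ys against xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


def static_moduli(axial_stress, axial_strain, radial_strain):
    """Return (E, nu) from the 1/3 to 2/3 peak-stress window of one test.

    Assumed convention: compression is positive, so radial expansion
    appears as negative radial strain. E has the unit of axial_stress.
    """
    peak = max(axial_stress)
    peak_idx = axial_stress.index(peak)
    # Keep only pre-peak points whose stress lies in [peak/3, 2*peak/3].
    idx = [i for i in range(peak_idx + 1)
           if peak / 3 <= axial_stress[i] <= 2 * peak / 3]
    if len(idx) < 2:
        raise ValueError("need at least two points in the stress window")
    eps_a = [axial_strain[i] for i in idx]
    eps_r = [radial_strain[i] for i in idx]
    sig_a = [axial_stress[i] for i in idx]
    e_static = slope(eps_a, sig_a)      # d(sigma_a) / d(eps_a)
    nu_static = -slope(eps_a, eps_r)    # -d(eps_r) / d(eps_a)
    return e_static, nu_static


def static_bulk_modulus(delta_pressure, delta_volumetric_strain):
    """Return K as the ratio of pressure change to volumetric strain change."""
    return delta_pressure / delta_volumetric_strain
```

With stress in MPa and strain as a fraction, `static_moduli` returns E in MPa. Fitting a line over the window, rather than using only its two end points, is a design choice that reduces the influence of noise in individual readings.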

Calculations of dynamic elastic moduli
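Well logging measurements (sonic and density logs) taken over the same sections as the core samples were utilized to estimate dynamic E, K, and ν using the following equations (Zoback 2007):

$$E_d = \frac{\rho V_s^2 \left(3V_p^2 - 4V_s^2\right)}{V_p^2 - V_s^2}, \qquad \nu_d = \frac{V_p^2 - 2V_s^2}{2\left(V_p^2 - V_s^2\right)}, \qquad K_d = \rho\left(V_p^2 - \frac{4}{3}V_s^2\right)$$

where V_p and V_s are the P-wave and S-wave velocities, respectively, and ρ is the bulk density.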
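The dynamic moduli follow from the standard isotropic elasticity relations between wave velocities and density. The sketch below is a minimal Python version of that calculation; the function names and the SI units are assumptions for illustration, as is the helper that converts sonic slowness in μs/ft to velocity, since the text does not state the units of the log data.

```python
def slowness_to_velocity(dt_us_per_ft):
    """Convert sonic slowness (microseconds per foot) to velocity (m/s)."""
    # 1 ft = 0.3048 m and 1 us = 1e-6 s, so v = 0.3048 / (dt * 1e-6).
    return 304800.0 / dt_us_per_ft


def dynamic_moduli(vp, vs, rho):
    """Return (E, nu, K) for an isotropic elastic medium.

    vp, vs: P- and S-wave velocities in m/s; rho: bulk density in kg/m^3.
    E and K are returned in Pa, nu is dimensionless.
    """
    if not (vp > vs > 0 and rho > 0):
        raise ValueError("expected vp > vs > 0 and rho > 0")
    shear = rho * vs ** 2                                   # shear modulus G
    bulk = rho * (vp ** 2 - 4.0 / 3.0 * vs ** 2)            # bulk modulus K
    nu = (vp ** 2 - 2.0 * vs ** 2) / (2.0 * (vp ** 2 - vs ** 2))
    young = 2.0 * shear * (1.0 + nu)                        # E = 2G(1 + nu)
    return young, nu, bulk
```

For example, `dynamic_moduli(4000.0, 2000.0, 2500.0)` corresponds to a rock with V_p/V_s = 2, for which Poisson's ratio is 1/3.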

CT scanning
Computed tomography (CT) scanning is a method used to image internal structures using X-rays. CT is a non-destructive visualization technique. Although CT scanners are best known for medical applications, they have been utilized in many geoscience applications (Siddiqui and Khamees 2004). Typically, a sample is placed between the X-ray source and the X-ray detector, and the sample is rotated relative to the source-detector system. CT images record the degree of X-ray attenuation, which reflects changes in the density and atomic composition of the material. CT images can be used to assess porosity, density, fractures, and heterogeneity (Choo et al. 2014; Cnudde et al. 2006; Siddiqui and Khamees 2004). For each core plug, two axial and two vertical (longitudinal) images were obtained at 0 and 90 degrees.

Development of the regression models
The following subsections explain in detail the development of the regression models for E, K, and ν in the three zones (Zubair shale, Zubair sandstone, and Nahr Umr shale). Because the process was the same for every model and every zone, the subsections describe the creation of a single model.

Data processing
Outliers were removed from the static and dynamic E, K, and ν data of the three zones (Zubair shale, Zubair sandstone, and Nahr Umr shale). To ensure the model can work on data that were not used in creating it, the data were randomly separated into two parts: training (70%) and testing (30%). Summary statistics of the data are shown in Tables 2, 3 and 4 for Nahr Umr shale, Zubair shale, and Zubair sandstone, respectively.
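A random 70/30 split of paired dynamic and static values can be sketched as follows. This is a dependency-free Python illustration; the function name, the fixed seed, and the rounding rule for the test-set size are assumptions of this example. The outlier-removal step is not shown because the text does not specify the criterion used.

```python
import random


def split_train_test(dynamic, static, test_fraction=0.3, seed=0):
    """Randomly split paired (dynamic, static) values into train and test.

    Returns two lists of (dynamic, static) tuples. Each pair is kept
    together so that a static value never loses its dynamic counterpart.
    """
    if len(dynamic) != len(static):
        raise ValueError("dynamic and static must have the same length")
    pairs = list(zip(dynamic, static))
    rng = random.Random(seed)          # fixed seed for a repeatable split
    rng.shuffle(pairs)
    n_test = round(len(pairs) * test_fraction)
    test = pairs[:n_test]
    train = pairs[n_test:]
    return train, test
```

With ten data pairs and the default `test_fraction`, the function returns seven training pairs and three testing pairs.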

Ridge regression
Regularization is an important concept in machine learning. Its purpose is to avoid overfitting and to ensure that a model generalizes to new data. Several methods serve this purpose, including lasso and ridge regression, also known as L1 and L2 regularization, respectively. In this paper, ridge regression was utilized to create the models for the three zones to ensure the models are not overfitted and can generalize to new data (Pedregosa et al. 2011).
In ordinary least squares regression, the goal is to minimize the residual sum of squares between the predicted and actual targets. Mathematically, the goal is to minimize the following:

$$\min_{w,\,w_0} \sum_{i=1}^{n} \left(Y_i - w_0 - w^{\top}X_i\right)^2$$

where w is the coefficient vector, w_0 is the intercept, and X_i and Y_i are the input features and the actual target of the i-th data point. In ridge regression, however, there is a penalty on the size of the coefficients, and the goal is to minimize the following:

$$\min_{w,\,w_0} \sum_{i=1}^{n} \left(Y_i - w_0 - w^{\top}X_i\right)^2 + \alpha \lVert w \rVert_2^2$$

The parameter α controls the shrinkage: the higher the α, the higher the penalty on the coefficients. This helps prevent overfitting so that the model can generalize to new data. When α is zero, ridge regression reduces to ordinary least squares regression. Finding the "best" value of α is not straightforward: there is a trade-off between a large value of α (underfitting) and a small value of α (overfitting). In this work, α was selected for each model through an iterative process whose goal was the best fit (R²) on the testing data (Pedregosa et al. 2011).
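For a single input feature, the ridge objective with an unpenalized intercept has a closed-form solution, which makes the effect of α easy to see. The following Python sketch fits such a model and sweeps a list of candidate α values, keeping the one with the highest testing R². The function names and the candidate list are assumptions of this example; the study cites scikit-learn (Pedregosa et al. 2011), and this standalone code only mirrors the same objective rather than reproducing the authors' implementation.

```python
def fit_ridge_1d(x, y, alpha):
    """Fit y ~ w0 + w * x by ridge regression; the intercept is not penalized.

    Minimizes sum((y_i - w0 - w * x_i)^2) + alpha * w^2.
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    w = sxy / (sxx + alpha)        # alpha shrinks the slope toward zero
    w0 = mean_y - w * mean_x
    return w, w0


def r2(y_true, y_pred):
    """Coefficient of determination."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((yt - yp) ** 2 for yt, yp in zip(y_true, y_pred))
    ss_tot = sum((yt - mean_y) ** 2 for yt in y_true)
    return 1.0 - ss_res / ss_tot


def select_alpha(x_train, y_train, x_test, y_test, alphas):
    """Return (alpha, test_r2, w, w0) for the candidate with the best test R2."""
    best = None
    for alpha in alphas:
        w, w0 = fit_ridge_1d(x_train, y_train, alpha)
        pred = [w0 + w * xi for xi in x_test]
        score = r2(y_test, pred)
        if best is None or score > best[1]:
            best = (alpha, score, w, w0)
    return best
```

Setting `alpha=0` recovers the ordinary least squares line. Note that choosing α by the testing score, as described in the text, means the testing set also influences the model; a separate validation set or cross-validation is a common alternative when enough data are available.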
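To assess the error, the root mean squared error (RMSE) and the mean absolute error (MAE) were used, while R² was used to assess the goodness of fit. RMSE, MAE, and R² were calculated using the following equations:

$$\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(Y_i - \hat{Y}_i\right)^2}, \qquad \mathrm{MAE} = \frac{1}{n}\sum_{i=1}^{n}\left|Y_i - \hat{Y}_i\right|, \qquad R^2 = 1 - \frac{\sum_{i=1}^{n}\left(Y_i - \hat{Y}_i\right)^2}{\sum_{i=1}^{n}\left(Y_i - \bar{Y}\right)^2}$$

where Y_i is the actual data point, Ŷ_i is the predicted data point, Ȳ is the mean of the actual data points, and n is the number of data points.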
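The three metrics can be computed directly from their standard definitions. The Python sketch below uses only the standard library; the function names are chosen for this example.

```python
import math


def rmse(y_true, y_pred):
    """Root mean squared error."""
    n = len(y_true)
    return math.sqrt(sum((yt - yp) ** 2 for yt, yp in zip(y_true, y_pred)) / n)


def mae(y_true, y_pred):
    """Mean absolute error."""
    n = len(y_true)
    return sum(abs(yt - yp) for yt, yp in zip(y_true, y_pred)) / n


def r_squared(y_true, y_pred):
    """Coefficient of determination, 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((yt - yp) ** 2 for yt, yp in zip(y_true, y_pred))
    ss_tot = sum((yt - mean_y) ** 2 for yt in y_true)
    return 1.0 - ss_res / ss_tot
```

Computing each metric separately on the training and testing sets, and comparing the two, is the check the text uses to judge whether a model overfits.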

Results and discussion
Nahr Umr shale
Figure 4 shows the results of the CT scan for the Nahr Umr core samples used for SST. Some fractures can be observed in the sample. Three ridge regression models were created for Nahr Umr shale to predict static E, ν, and K based on dynamic data (sonic log data). Figure 5 shows the static and dynamic E model. Figure 5a shows the relationship between static and dynamic E, for which a linear ridge regression was the best fit. The predicted and actual static E are shown in Fig. 5b; the R² shown in Fig. 5b is for testing (the same is true for all other R² values in the figures). Figure 6 shows the static and dynamic ν model, which is also linear, as shown in Fig. 6a; the actual and predicted ν are shown in Fig. 6b. Similarly, Fig. 7a shows a linear fit of the static and dynamic K model, while Fig. 7b shows the actual and predicted K. Table 5 summarizes the three models. Both the E and K models had an α of 0.5, while the ν model had an α of 0.001. Both RMSE and MAE were low during training and testing, with no significant difference between the two. This indicates that the models can generalize to new data, since the testing set (new data) was not used during training. Furthermore, R², a measure of goodness of fit, showed promising and consistent values for training and testing. The testing R² values of the E, ν, and K models were 0.984, 0.959, and 0.984, respectively. The results indicate that the created models are robust and can be utilized to predict static E, ν, and K from dynamic data for use in geomechanical models for sand production, wellbore stability, hydraulic fracturing, etc.
The following equations can be used to predict static E, ν, and K based on dynamic E, ν, and K for Nahr Umr shale, respectively:

Zubair shale
Figure 8 shows the results of the CT scan for the Zubair shale core samples used for SST. Some fractures can be observed in the sample. Three ridge regression models were created for the Zubair shale to predict static E, ν, and K based on dynamic E, ν, and K, as shown in Figs. 9, 10 and 11, respectively. Both E and K had a linear fit, with testing R² of 0.997 and 0.949, respectively. The best fit for ν, on the other hand, was a second-degree polynomial with a testing R² of 0.943. Table 6 shows a summary of the Zubair shale models. For all models, both RMSE and MAE were low for training and testing, and there was no large discrepancy between training and testing. The α values of the E and K models were 0.4 and 0.5, respectively, while the α for the ν model was 0.000001.
The following equations can be used to predict static E, ν, and K based on dynamic E, ν, and K for Zubair shale, respectively:

Zubair sandstone
Figure 12 shows the results of the CT scan for the Zubair sandstone core samples used for SST. Fewer fractures can be observed in the Zubair sandstone sample than in the two shale samples. Ridge regression models were created for static E, ν, and K based on dynamic E, ν, and K for the Zubair sandstone. Linear ridge regression models were the best fit for static E and K, with testing R² of 0.937 and 0.986, respectively, while the ν data were best fitted by a second-degree polynomial with an R² of 0.92, as shown in Figs. 13, 14 and 15.
As shown in Table 7, MAE and RMSE for all three models were low during training and testing. The α values for the E and K models were 0.6 and 0.7, respectively, while the ν model had an α of 0.0001. Furthermore, the R² values for training and testing were consistent, without a large discrepancy, for all three models.
The following equations can be used to predict static E, ν, and K based on dynamic E, ν, and K for Zubair sandstone, respectively:

Conclusion
Static measurements of elastic moduli are vital for accurate geomechanical models that can be used for applications such as wellbore stability, sand production, hydraulic fracturing, etc. Nevertheless, core samples are not easy to acquire because of the limited material and the cost associated with extracting the cores. Thus, dynamic moduli from wireline measurements can be used instead. However, dynamic moduli are significantly different from static moduli and can lead to inaccurate geomechanical models. In this work, core samples from Nahr Umr shale, Zubair shale, and Zubair sandstone were acquired together with wireline measurements of the same sections to create correlations between static and dynamic E, ν, and K. SST tests and CT scans were executed for the core plugs. The results showed that ridge regression can be a good tool to limit overfitting, as it did for all created models. Testing R² for all models was between 0.92 and 0.997, with consistency between the training and testing results. The errors quantified by RMSE and MAE were also low for both training and testing. Furthermore, all models of E, K, and ν were linear except the ν models for the Zubair shale and Zubair sandstone, which were second-degree polynomials. In short, the proposed ridge regression models for estimating static E, ν, and K from readily available dynamic data can be utilized to save material, money, and time when core samples are limited. Thus, robust geomechanical models can be created from wireline measurements with a reasonable margin of error for Nahr Umr shale, Zubair shale, and Zubair sandstone.
Acknowledgements
The authors would like to thank Basra Oil Company from Iraq for providing the core samples and for the permission to publish this work. The authors would also like to thank MetaRock Laboratories for their unequivocal help and support in conducting the laboratory experiments.