Introduction

The expansion of the world's population, national economies, and quality of life has led the oil and gas industry to prioritize the exploration and utilization of unconventional hydrocarbon reservoirs due to the depletion or inaccessibility of conventional sources for multinational companies. Nevertheless, tapping into organic-rich reservoirs typically involves increased uncertainties and financial risks compared to conventional sources. Consequently, the evaluation and development of unconventional resources require a comprehensive, interdisciplinary approach that incorporates various datasets including geological, petrophysical, geophysical, and geomechanical variables. This holistic evaluation facilitates enhanced reservoir characterization, essential for effective development.

In the development of unconventional resources, factors like reservoir quality, completion quality, and completion effort are multidimensional constructs comprising various physical elements crucial for their influence on production (Zee Ma 2016). These factors are economically significant as they help evaluate whether the reservoir can yield commercially viable amounts of hydrocarbons.

Reservoir quality plays a key role during exploration as not all source rocks possess reservoir quality characteristics. This aspect delineates the hydrocarbon potential, accessible hydrocarbons, and the capability of the rock formation to deliver them (Zee Ma 2016). Various factors contribute to assessing reservoir quality, including the overall organic carbon concentration, level of thermal maturity, organic matter volume, mineral content, rock type, effective porosity, fluid presence, permeability, and formation pressure (Passey et al. 2010).

In identifying organic-rich shale gas deposits, total organic carbon (TOC) stands out as a crucial indicator of hydrocarbon potential. TOC's significance lies in its direct link to both the overall porosity and gas content of the rock (Passey et al. 2010). This connection suggests that higher TOC levels correspond to increased porosity and gas content, thus elevating the potential of the shale deposit as a hydrocarbon source. However, it is important to clarify that while organic matter, represented by TOC or kerogen volume, serves as the material for hydrocarbon generation, it does not directly indicate extractable resources. The conversion of this organic matter into hydrocarbons involves a prolonged geological process called thermal maturation, occurring across extensive periods under high pressure and temperature conditions (Zee Ma 2016).

While a high concentration of kerogen in a source rock might indicate the potential for hydrocarbon extraction, it is not solely indicative. Thermal maturation is a prerequisite for these organic materials to reach the specific conditions, known as the oil or gas window, where oil or gas formation occurs. This means that even with high TOC content, significant thermal maturation is necessary before hydrocarbons can be produced. Therefore, while TOC offers initial insights into hydrocarbon potential, it is just one aspect of a larger process. The transition from organic-rich source rock to a viable hydrocarbon reservoir is intricate and involves both the initial presence of organic material (as indicated by TOC) and its subsequent transformation through thermal maturation.

Various well log analysis techniques have been proposed for determining TOC (Passey et al. 1990; Schmoker 1979; Zhao et al. 2016). However, the assessment of the level of maturity (LOM) has primarily relied on vitrinite reflectance laboratory tests (Law 1999). In formation evaluation, LOM remains a constant value chosen somewhat arbitrarily, posing a risk in accurately and reliably estimating TOC.

Although the level of maturity (LOM) serves as a crucial input for basin and petroleum system modeling, its influence extends to various other factors such as porosity, pore size, type, and volume, which exhibit significant changes with increasing thermal maturity (Dong and Harris 2020; Pommer 2014; Zargari et al. 2015; Zhang et al. 2020). Moreover, thermal maturity offers a basis for predicting porosity on a regional scale, establishing an empirical link between the level of organic matter transformation and porosity alterations. Consequently, basin models can simulate porosity changes by accounting for lithology and thermal maturity influences (Osborne and Volk 2020; Schmoker 1984).

In this study, we present a straightforward approach that does not require laboratory measurement of LOM on core samples (e.g., vitrinite reflectance). We introduce an inversion methodology, called interval inversion, that derives LOM from the TOC estimated by well log inversion. In addition, we predict the LOM using the global optimization method of simulated annealing, with an energy function based on the well-known mathematical expression of Passey et al. (1990) that relates LOM and TOC. Both methods give a continuous in situ estimate of LOM along a borehole for establishing more accurate and reliable reservoir identification and hydrocarbon reserve estimation.

Methods

TOC estimation using improved Passey method

TOC stands as a pivotal indicator in assessing hydrocarbon potential, particularly in the evaluation of organic shale gas reservoirs (Zee Ma 2016). It quantifies the organic carbon content within a formation and demonstrates a direct correlation with both porosity and gas saturation (Passey et al. 2010). A widely adopted shale formation evaluation method, introduced by Passey et al. (1990), integrates porosity-indicative logs (e.g., sonic, density, and/or neutron porosity) with resistivity data, while also considering the level of maturation.

Passey et al. (1990) described the direct correlation between TOC and the separation distance observed in P-wave sonic travel time (Δt) and true formation resistivity (R) logs when overlaid, notably evident within intervals rich in organic content. This separation, termed delta-log R distance (Δ log R), relies on the distinction between resistivity logs, primarily indicating fluids, and porosity logs (e.g., sonic, density, or neutron), which respond to a mix of kerogen/matrix and fluids. In rocks with lower organic content (lean source rocks), the resistivity and porosity curves align because both fluids and the rock matrix, including kerogen, jointly influence these measurements.

In intervals abundant in organic material, these curves diverge due to the specific influence of organic material, particularly kerogen, within the rock. Kerogen affects the sonic log by lengthening the interval transit time, which lowers the calculated sonic velocity. Meanwhile, the resistivity log registers an increase in measured resistivity as a response to the presence of kerogen. Passey et al. (1990) suggest a mathematical expression to estimate TOC as a function of the Δ log R distance and of LOM:

$$TOC=\left(\Delta \log R\right)\times {10}^{\left(2.297-0.1688\times LOM\right)}.$$
(1)

The \(\Delta \log R\) is defined by the following equation:

$$\Delta logR = log_{{10}} \left( {R/R_{{baseline}} } \right) + 0.02 \cdot \left( {\Delta t - \Delta t_{{baseline}} } \right),$$
(2)

where \({R}_{baseline}\) is the resistivity corresponding to the \(\Delta {t}_{baseline}\) value when the curves are baselined in fine grained non-source, clay-rich rocks. Various porosity curves, including bulk density and neutron porosity, have proven effective in estimating TOC content alongside the sonic curve. These curves serve as valuable tools in corroborating the ΔlogR separation observed with the sonic/resistivity pair. However, the correlation between resistivity, sonic transit time, and TOC is not consistently strong across all formations, complicating the accuracy of TOC predictions. Furthermore, changes in formation mineral composition can significantly impact TOC calculations. Variations in minerals, such as increased pyrite content, can alter resistivity log responses, leading to inaccurate TOC estimates. Moreover, the influence of borehole enlargement and other geological factors exacerbates these inaccuracies, necessitating the development of improved methodologies for TOC estimation (Passey et al. 1990).
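As a minimal sketch of Eqs. (1) and (2), the following Python functions compute the ΔlogR separation and the resulting TOC. The function names, argument names, and sample readings are illustrative assumptions rather than part of the original method; resistivity is assumed to be in Ω-m, transit time in µs/ft, and LOM is passed as a known constant.

```python
import numpy as np


def delta_log_r(res, dt, res_baseline, dt_baseline):
    """Eq. (2): separation between the resistivity and sonic curves."""
    res = np.asarray(res, dtype=float)
    dt = np.asarray(dt, dtype=float)
    return np.log10(res / res_baseline) + 0.02 * (dt - dt_baseline)


def toc_passey(dlogr, lom):
    """Eq. (1): TOC from the delta-log-R separation at a given LOM."""
    return np.asarray(dlogr, dtype=float) * 10.0 ** (2.297 - 0.1688 * lom)


# Hypothetical readings: the first sample sits on the baseline,
# the second shows higher resistivity and a longer transit time.
dlr = delta_log_r(res=[2.0, 20.0], dt=[80.0, 100.0],
                  res_baseline=2.0, dt_baseline=80.0)
toc = toc_passey(dlr, lom=8.0)
print(dlr, toc)
```

A sample lying on the baseline gives zero separation and therefore zero TOC, which is the behaviour expected in lean, non-source intervals.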

To address the limitations of traditional TOC calculation methods, more innovative approaches have been proposed, such as the dual-difference ΔlogR (DDΔlogR) method introduced by Zhu et al. (2019). This method involves three key improvements over traditional ΔlogR techniques: 1) deriving a dynamic theoretical relation curve between resistivity and sonic logs to account for variable mineralogy; 2) calculating two ΔlogR values – one using the measured sonic log with TOC influences, and another theoretical ΔlogR without TOC effects – and taking their difference to isolate the organic matter signal; and 3) utilizing natural gamma-ray logs, which are less sensitive to radioactive mineral components, to represent the radioactivity contributions from organic matter. Likewise, Zhu et al. (2020) proposed an innovative dual model- and data-driven approach to improve the accuracy of TOC predictions that combine traditional physical response models with modern deep learning techniques. This approach utilizes intelligent curve overlapping to enhance the geological significance of the data, allowing for more precise TOC predictions even in formations with variable mineral compositions.

Level of organic metamorphism

Following the accumulation and preservation of organic-rich sediments, the organic matter undergoes diagenesis, catagenesis, and metagenesis processes spanning millions of years. During burial, these processes subject the organic matter to thermal degradation and escalating temperatures, leading to the generation of petroleum molecules. The "Level of Organic Metamorphism" (LOM) describes the extent of thermal metamorphism experienced by the deposited organic matter throughout subsurface burial (Hood et al., 1975).

Core measurement methods for estimating the level of thermal maturity involve the analysis of organic matter and its transformation during the process of burial and heating within a sedimentary basin. These methods are essential for understanding the generation and preservation of hydrocarbons, as well as the thermal history of a basin. Among the most widely used techniques are (Tissot and Welte 1984):

Vitrinite Reflectance (Ro or %Ro): This method measures the percentage of light reflected from vitrinite, a maceral type derived from woody plant material. As organic matter is subjected to increasing temperatures during burial and maturation, its molecular structure changes, causing an increase in the reflectance of vitrinite particles. Vitrinite reflectance is one of the most reliable and widely used indicators of thermal maturity, providing an estimate of the maximum temperature experienced by the sedimentary rock. A well-established relationship exists between LOM and Ro, making vitrinite reflectance a key parameter for estimating the thermal maturity level of organic matter (Passey et al. 2010).

Rock–Eval Pyrolysis: This technique involves heating a rock sample in an inert atmosphere and measuring the amount of hydrocarbons released. The Rock–Eval analysis provides several parameters, including the total organic carbon (TOC) content, the hydrogen index (HI), and the Tmax, which is the temperature at which the maximum release of hydrocarbons occurs during pyrolysis. These parameters can be used to assess the type and maturity of the organic matter present in the sample.

Biomarker Analysis: This method involves the analysis of specific organic compounds, known as biomarkers, which are derived from the remains of once-living organisms. Certain biomarker ratios, such as the sterane and terpane ratios, can be used to estimate the thermal maturity of the organic matter, as they undergo predictable changes in response to increasing temperature.

The LOM scale, ranging from 1 to 15, with higher values indicating greater thermal alteration, is a widely used tool for assessing thermal maturity of sedimentary rocks based on the optical properties of dispersed organic matter. This scale is particularly useful for samples lacking vitrinite or that have experienced higher temperatures, where vitrinite reflectance may be unreliable. Precise determination of LOM is significant for indicating the present and past maturity levels of a formation and as a crucial tool in estimating the generation windows (depths) at which oil and gas originate from potential petroleum source rocks. Thermal maturity plays an important role in identifying promising shale oil and gas accumulations in the early phases of exploration. Additionally, accurate assessment of LOM is imperative for estimating TOC using the ΔlogR method, as incorrect maturity estimations could lead to erroneous absolute TOC values, although the vertical variability in TOC would still be accurately represented (Passey et al. 1990).

Interval inversion method

The interval inversion method consolidates all data gathered within a selected logging interval and inverts them jointly (Dobróka and Szabó, 2001). This differs from the conventional depth-point-by-point inversion, which estimates petrophysical parameters separately at each depth point using only the data measured at that depth. By discretizing model parameters via a series expansion technique and estimating fewer expansion coefficients than the available data, the inverse problem becomes highly overdetermined, resulting in a substantial data-to-unknowns ratio. Consequently, this approach reduces the sensitivity of the interpretation process to disturbances in the data. The efficacy of interval inversion in enhancing estimation accuracy has been evidenced in previous studies focusing on conventional hydrocarbon deposits (Abordán and Szabó, 2020; Dobróka et al. 2016). Additionally, interval inversion has shown promise in evaluating various formations such as shaly sand sequences, carbonate and metamorphic reservoirs, and shale gas formations. Studies conducted by Dobróka et al. (2012), Szabó and Dobróka (2020), and Szabó et al. (2022) highlight the applicability of interval inversion in accurately assessing multi-mineral structures and unconventional formations.

Petrophysical parameters are considered depth-dependent, addressing the challenge of slight overdetermination in local inverse procedures, which can result in relatively noise-sensitive inversion outcomes. To counter this, modifications are made to the tool response functions used in solving the forward problem, ensuring their adaptability to changes in depth.

$${d}_{s}^{(c)}={g}_{s}\left({m}_{1}\left(z\right), \dots , {m}_{i}\left(z\right), \dots , {m}_{M}\left(z\right)\right),$$
(3)

where \({g}_{s}\) represents the response function of the s-th logging tool (s = 1, 2, …, S, where S represents the number of logging instruments utilized), while \({m}_{i}\) signifies the i-th petrophysical property, considering there are M petrophysical parameters in total. In numerical computations, the model parameters within the tool response functions are continuous functions of depth, which necessitates appropriate discretization. The interval inversion method discretizes the model parameters through series expansion, as illustrated below:

$${m}_{i}\left(z\right)=\sum_{q=1}^{{Q}_{i}}{B}_{q}^{i}{\Psi }_{q}(z),$$
(4)

in this context, \({B}_{q}^{i}\) is the q-th series expansion coefficient of the i-th model parameter, and \({\Psi }_{q}\) denotes the q-th basis function. \({Q}_{i}\) represents the number of expansion coefficients that describe the i-th discretized model parameter. Various basis functions can be chosen; for instance, when characterizing a layer-wise homogeneous model, a combination of unit step functions, like the Heaviside function, can be utilized (Dobróka et al. 2012). Conversely, for formations demonstrating vertical variation in petrophysical parameters within a layer, an orthogonal set of polynomials can serve as basis functions for inhomogeneous models (Dobróka et al. 2016). To illustrate, a layer-wise homogeneous model may be characterized with the fewest unknown parameters using a combination of unit step functions:

$${\Psi }_{q}\left(z\right)=u\left(z-{z}_{q-1}\right)-u\left(z-{z}_{q}\right),$$
(5)
$$\Psi _{q} \left( z \right) = \left\{ {\begin{array}{*{20}c} 0 & {if} & {z < z_{q-1} } \\ 1 & {if} & {z_{q-1}\le z \le z_{q} ,} \\ 0 & {if} & {z > z_{q} } \\ \end{array} } \right.$$
(6)

where \({z}_{q-1}\) and \({z}_{q}\) are the depth coordinates, in meters, of the upper and lower boundaries of the q-th layer, respectively. Quantity \({B}_{q}^{i}\) in Eq. (4) corresponds to the i-th petrophysical parameter of the q-th bed since the q-th basis function is always zero except in the q-th layer, that is \({m}_{i}\left({z}_{q-1}\le z\le {z}_{q}\right)={B}_{q}^{(i)}.\)
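The layer-wise discretization of Eqs. (4) to (6) can be sketched in a few lines of Python. This is only an illustration: the function names, layer boundaries, and coefficient values are hypothetical, and layers are treated as half-open intervals so that a sample lying exactly on a boundary is assigned to a single layer (an assumption, since Eq. (6) writes both ends as inclusive).

```python
import numpy as np


def boxcar_basis(z, boundaries):
    """Return a (Q, len(z)) array; row q is 1 inside layer q and 0 elsewhere."""
    z = np.asarray(z, dtype=float)
    b = np.asarray(boundaries, dtype=float)
    upper, lower = b[:-1, None], b[1:, None]
    # Difference of two unit steps, Eq. (5), on half-open layers [z_{q-1}, z_q).
    return ((z >= upper) & (z < lower)).astype(float)


def model_parameter(z, boundaries, coeffs):
    """Eq. (4): m_i(z) = sum over q of B_q * Psi_q(z)."""
    return np.asarray(coeffs, dtype=float) @ boxcar_basis(z, boundaries)


# Three hypothetical layers with one porosity value (coefficient) each.
z = np.array([1.0, 2.5, 4.0, 7.0])
porosity = model_parameter(z, boundaries=[0.0, 3.0, 5.0, 10.0],
                           coeffs=[0.10, 0.05, 0.20])
print(porosity)
```

Each depth sample simply picks up the coefficient of the layer it falls in, so the number of unknowns per parameter equals the number of layers rather than the number of depth points.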

With this choice of basis functions, each petrophysical parameter is described by a single series expansion coefficient in each layer. The inversion method is employed to calculate the series expansion coefficients, which compose the model vector of the interval inversion problem.

$${d}_{s}^{(c)}\left(z\right)={g}_{s}\left(z,{B}_{1}^{\left(1\right)},\dots ,{B}_{{Q}_{1}}^{\left(1\right)},\dots ,{B}_{1}^{\left(M\right)},\dots ,{B}_{{Q}_{M}}^{\left(M\right)}\right).$$
(7)

The interval-wise homogeneous model has the advantage that the inversion process needs to determine significantly fewer unknowns than the available data points. This highly overdetermined inverse problem allows for an accurate and reliable estimation across the entire processed depth interval. The objective function fits the theoretical data to the observed data by minimizing the relative data distance with respect to the expansion coefficients.

$$E=\sum_{v=1}^{V}\sum_{j=1}^{N}{\left(\frac{{d}_{vj}^{m}-{d}_{vj}^{c}}{{d}_{vj}^{m}}\right)}^{2}=min,$$
(8)

where \({d}_{vj}^{m}\) and \({d}_{vj}^{c}\) indicate the j-th measured and calculated data in the v-th depth, respectively, V is the total number of measurement points in the processed depth interval, and N is the number of data recorded at each depth. The calculated data are computed using Eq. (7).

To solve the nonlinear inverse problem, it is linearized and reduced to a sequence of linear problems. The model correction vector is computed using the Marquardt algorithm (Marquardt 1963).

$$B = \left( {G^{T} G + \varepsilon ^{2} I} \right)^{{ - 1}} G^{T} d^{{(m)}} ,$$
(9)

where G is a linear operator called Jacobi or sensitivity matrix, I is the identity matrix, and \(\epsilon\) is a properly chosen positive constant to numerically stabilize the inversion procedure.
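A damped least-squares solve with the structure of Eq. (9) can be sketched as follows. The function name, the small sensitivity matrix, and the data vector are hypothetical; the sketch only illustrates how the damping term \(\varepsilon^{2}I\) stabilizes the solve and makes no claim about the authors' implementation.

```python
import numpy as np


def damped_least_squares(G, d, eps):
    """Solve (G^T G + eps^2 I) x = G^T d, the form of Eq. (9)."""
    G = np.asarray(G, dtype=float)
    d = np.asarray(d, dtype=float)
    A = G.T @ G + eps ** 2 * np.eye(G.shape[1])
    # Solving the linear system avoids forming the explicit inverse.
    return np.linalg.solve(A, G.T @ d)


# Hypothetical overdetermined problem: three data, two unknowns.
G = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
x_true = np.array([2.0, -1.0])
d = G @ x_true

x_undamped = damped_least_squares(G, d, eps=0.0)
x_damped = damped_least_squares(G, d, eps=10.0)
print(x_undamped, x_damped)
```

With no damping the noise-free data are reproduced exactly, while a large damping factor shrinks the solution toward zero. Starting with strong damping and reducing it over the iterations trades early stability for later accuracy.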

When there is minimal change observed in the difference between the measured and computed data, calculated as:

$${D}_{d}=\sqrt{\frac{1}{VN}\sum_{v=1}^{V}\sum_{j=1}^{N}{\left(\frac{{d}_{vj}^{m}-{d}_{vj}^{c}}{{d}_{vj}^{m}}\right)}^{2}}\times 100\text{ \%},$$
(10)

it can be assumed that the newly estimated expansion coefficients are a solution and that the depth variation of petrophysical characteristics can be easily deduced by substituting them into Eq. (4).
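The relative data distance of Eq. (10) is straightforward to evaluate. The sketch below assumes the measured and calculated data are stored as arrays of shape (V, N) with non-zero measured values; the function name and the sample numbers are illustrative.

```python
import numpy as np


def relative_data_distance(d_meas, d_calc):
    """Eq. (10): RMS of the relative misfit over all depths and logs, in percent."""
    d_meas = np.asarray(d_meas, dtype=float)
    d_calc = np.asarray(d_calc, dtype=float)
    rel = (d_meas - d_calc) / d_meas
    return 100.0 * np.sqrt(np.mean(rel ** 2))


# Two depths, two logs; every calculated value is 1% above the measured one.
d_meas = np.array([[100.0, 2.5], [80.0, 2.0]])
d_calc = 1.01 * d_meas
dd = relative_data_distance(d_meas, d_calc)
print(dd)
```

Because each residual is normalized by its measured value, logs with very different magnitudes (for example gamma ray and density) contribute on an equal footing.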

Simulated annealing method

The simulated annealing (SA) method, first proposed by Metropolis et al. (1953), emulates the thermal equilibrium state found in solids. The SA method draws inspiration from the physical process of annealing metals, wherein a crystalline solid is heated and slowly cooled, ultimately attaining its minimum lattice energy state, resulting in a more ordered crystal lattice configuration. If the cooling process is slow enough, it results in improved structural integrity. This thermodynamic behavior is related to the search for global minima in discrete optimization problems through the simulated annealing method.

The SA algorithm generates and assesses two solutions: typically an initial solution and a newly selected one, using an objective function (often termed the energy function) at each iteration. Enhanced solutions are invariably accepted, while a subset of lesser solutions is considered to escape local optima and pursue global optima. An algorithmic temperature parameter, decreasing over iterations, governs the probability of adopting a non-improved solution (Henderson et al. 2003).

The Metropolis SA algorithm adjusts the components of the relevant model parameter vector as follows:

$${m}_{j}^{\left(new\right)}={m}_{j}^{\left(old\right)}+b,$$
(11)

where b specifies a perturbation term, which is a random value drawn from the interval \(\left[-{b}_{max},{b}_{max}\right]\), where \({b}_{max}\) is often reduced by:

$${b}_{max}^{\left(new\right)}={b}_{max}^{\left(old\right)}\bullet \tau,$$
(12)

where \(\tau\) is known as the decrement factor \(\left(0\le \tau \le 1\right)\).

At each iteration of the random walk across the parameter space, the energy function (E) of the pertinent model is computed and compared to its previous state, \(\Delta E=E\left({\overrightarrow{m}}^{new}\right)-E\left({\overrightarrow{m}}^{old}\right).\)

The new model's acceptance probability (\(P\)) is determined by the Metropolis criterion.

$$P\left( {\Delta E,T} \right) = \left\{ {\begin{array}{*{20}c} 1 & {if\;\;\;\Delta E \le 0} \\ {\exp \left( { - \frac{{\Delta E}}{T}} \right),} & {if\;\;\;\Delta E > 0} \\ \end{array} } \right.,$$
(13)

where T represents the temperature that controls the acceptance of models with higher energy. If the energy of the new model is lower than that of the previous one, the new model is always accepted. However, if the energy of the new model increases, it may still be accepted with a probability that depends on \(\Delta E\) and T, which allows the search to escape from local minima. The acceptance criterion, \(P(\Delta E) > \alpha\), determines whether the new model is accepted or rejected (where \(\alpha\) is a random number with uniform probability between 0 and 1). The rate of cooling, essential for convergence, should strike a balance: neither so slow that the search becomes impractically long, nor so fast that the process risks being trapped in a local minimum (Geman and Geman 1984).
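The perturbation, acceptance, and cooling steps of Eqs. (11) to (13) can be put together in a short Python sketch for a single scalar parameter. The function names, the quadratic test energy, the seed, and all numerical settings are assumptions chosen for illustration. The perturbation is drawn symmetrically around zero, and the logarithmic schedule \(T_{0}/\text{lg}(q)\) is the one quoted in the case study of this paper.

```python
import math
import random


def acceptance_probability(delta_e, temperature):
    """Metropolis criterion of Eq. (13)."""
    if delta_e <= 0:
        return 1.0
    return math.exp(-delta_e / temperature)


def anneal(energy, m_start, b_max, t0, n_iter, tau=0.999, seed=0):
    """Minimize energy(m) for one scalar m; return the best (m, E) visited."""
    rng = random.Random(seed)
    m, e = m_start, energy(m_start)
    best_m, best_e = m, e
    for q in range(2, n_iter + 2):            # start at q = 2 because lg(1) = 0
        temperature = t0 / math.log10(q)      # logarithmic cooling
        candidate = m + rng.uniform(-b_max, b_max)   # Eq. (11)
        e_candidate = energy(candidate)
        # Accept when P(dE, T) exceeds a uniform random number in [0, 1).
        if acceptance_probability(e_candidate - e, temperature) > rng.random():
            m, e = candidate, e_candidate
            if e < best_e:
                best_m, best_e = m, e
        b_max *= tau                          # Eq. (12): shrink the step range
    return best_m, best_e


# Hypothetical energy with a single minimum at m = 3.
best_m, best_e = anneal(lambda m: (m - 3.0) ** 2, m_start=0.0,
                        b_max=1.0, t0=0.15, n_iter=5000)
print(best_m, best_e)
```

Keeping track of the best model visited, rather than only the last one, is a common convenience and not something the text prescribes.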

LOM-SA estimation method

Traditionally, LOM is estimated using the TOC-ΔlogR diagram (Fig. 1) introduced by Passey et al. (1990), where the best-fitting straight line to the data is determined as \(TOC=\alpha *\left(\Delta \mathit{log}R\right)+\beta\).

Fig. 1

ΔlogR diagram relating the ΔlogR to TOC at different maturity levels

Passey's empirical equation (Eq. 1) for calculating TOC from ΔlogR omits the intercept term \(\beta\) of the fitting model, setting it to zero. This assumption implies a perfect alignment between the transit time and resistivity curves in non-source intervals. Equating the fitting model with Eq. (1) and solving for LOM gives \(LOM=\left(2.297-{\text{log}}_{10}\alpha \right)/0.1688\).
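The slope-to-LOM conversion can be checked numerically. In the sketch below the function names and the synthetic calibration points are hypothetical; the slope is fitted through the origin because the intercept \(\beta\) is set to zero.

```python
import numpy as np


def slope_through_origin(dlogr, toc):
    """Least-squares slope alpha of TOC = alpha * dlogR (intercept fixed at zero)."""
    dlogr = np.asarray(dlogr, dtype=float)
    toc = np.asarray(toc, dtype=float)
    return float(dlogr @ toc / (dlogr @ dlogr))


def lom_from_slope(alpha):
    """LOM = (2.297 - log10(alpha)) / 0.1688, obtained by inverting Eq. (1)."""
    return (2.297 - np.log10(alpha)) / 0.1688


# Synthetic calibration points generated with Eq. (1) at LOM = 9.
dlogr = np.array([0.2, 0.5, 0.9, 1.3])
toc = dlogr * 10.0 ** (2.297 - 0.1688 * 9.0)

alpha = slope_through_origin(dlogr, toc)
lom = float(lom_from_slope(alpha))
print(alpha, lom)
```

On noise-free synthetic points the fitted slope returns the LOM used to generate them, which confirms that the two expressions are consistent.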

However, the diagram technique can only be applied if a fair number of TOC values from geochemical data are available to calibrate the linear fit. Since geochemical data are scarce and expensive to acquire in sufficient amounts, we propose combining the interval inversion technique and the SA method to solve the problem. Our inversion workflow is presented in Fig. 2.

Fig. 2

Flowchart of the inversion-based level of maturity estimation method (LOM-SA)

Through interval inversion, we first estimate the petrophysical parameters of the studied formations. The TOC is then derived from the kerogen volume (Vk), the kerogen density (\({\rho }_{k}\)), and the bulk density (\({\rho }_{b}\)) (Herron and Letendre 1990).

$$TOC=\frac{{\rho }_{k}}{C{\rho }_{b}}{V}_{k},$$
(14)

where \({\rho }_{k}\) is the kerogen density (\(\frac{g}{c{m}^{3}}\)), \(C\) is a constant varying from 1.18 to 1.48, and \({\rho }_{b}\) is the bulk density measured by the density (gamma-gamma) log. After obtaining a predicted TOC log curve, we minimize the objective (energy) function defined as:

$$E={\left(\frac{1}{NS}{\sum }_{i=1}^{NS}{\left(TO{C}_{i}^{(m)}-TO{C}_{i}^{(c)}\right)}^{2}\right)}^\frac{1}{2},$$
(15)

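where the measured TOC (\(TO{C}_{i}^{(m)}\)) is taken from Eq. (14), the calculated TOC (\(TO{C}_{i}^{(c)}\)) is evaluated according to Eq. (1), and NS is the number of samples. Since \(\Delta \log R\) is known from the logs, LOM is the only unknown of the energy function.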
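The energy function of Eq. (15) can be sketched on synthetic data as follows. All names and numerical values are illustrative assumptions, and a coarse grid scan stands in for the simulated annealing search purely to show where the energy minimum lies; it is not the optimization used in this study.

```python
import numpy as np


def toc_from_kerogen(v_k, rho_b, rho_k=1.0, c=1.2):
    """Eq. (14); TOC is returned in the units of v_k (e.g. percent)."""
    return rho_k / (c * np.asarray(rho_b, dtype=float)) * np.asarray(v_k, dtype=float)


def toc_passey(dlogr, lom):
    """Eq. (1)."""
    return np.asarray(dlogr, dtype=float) * 10.0 ** (2.297 - 0.1688 * lom)


def energy(lom, dlogr, toc_inverted):
    """Eq. (15): RMS misfit between inversion-derived and Passey TOC."""
    return float(np.sqrt(np.mean((toc_inverted - toc_passey(dlogr, lom)) ** 2)))


# Synthetic, noise-free logs built with a known LOM (illustrative values only).
true_lom = 8.2
dlogr = np.linspace(0.1, 1.5, 50)
rho_b = np.full(50, 2.4)
v_k = toc_passey(dlogr, true_lom) * 1.2 * rho_b / 1.0   # Eq. (14) solved for V_k
toc_inverted = toc_from_kerogen(v_k, rho_b)

# Grid scan over LOM, standing in for the simulated annealing search.
lom_grid = np.linspace(6.0, 12.0, 601)
energies = np.array([energy(lom, dlogr, toc_inverted) for lom in lom_grid])
lom_best = float(lom_grid[np.argmin(energies)])
print(lom_best)
```

Because LOM is a single scalar, the energy curve has one well-defined minimum in this synthetic case; with real, noisy logs the minimum is broader, which is where a stochastic search with repeated runs becomes informative.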

Case studies

Norway

Geology of the study area

The well-logging dataset was collected from the Norwegian Continental Shelf in the North Sea, recognized as one of the world's most productive hydrocarbon provinces (Brennand et al. 1998). Geologically, the North Sea is classified as an intracratonic basin, situated on continental crust. Over time, this basin has undergone periods of stretching, thinning, and subsidence during the late Carboniferous, Permian-Early Triassic, and Late Jurassic eras. Notably, the upper Jurassic marine shales are identified as the most significant source rocks within the North Sea province. These formations mark the transition from the Triassic to the Jurassic period, signifying a shift from continental to shallow-marine depositional settings due to an Early Jurassic transgression. This geological shift facilitated the accumulation of black shales across extensive areas of North-west Europe. These shales exhibit favorable attributes such as high organic richness, quality, and maturation characteristics, rendering them excellent source rocks for oil and gas reserves, particularly in the southern regions of the North Sea.

We assessed a wireline logging dataset from a well drilled in the Central Graben area, encountering three distinct lithological units. The lithological details are drawn from the Norwegian Petroleum Directorate’s reports (Norwegian Petroleum Directorate 2023). The uppermost unit is identified as a marlstone containing elements of glauconite and pyrite, with occasional occurrences of sandstones and siltstones. This unit is believed to have been deposited in a well-oxygenated shallow-marine setting, characterized by minimal clastic input and hosting a diverse faunal assemblage. The subsequent geological unit comprises a carbonaceous claystone, moderately to minimally calcareous, characterized by high organic carbon content. It contains intermittent thin bands of carbonate rocks and, in certain regions, sandstone deposits. This formation is presumed to have been laid down in a confined, low-energy, oxygen-depleted (anoxic) marine environment, and distanced from substantial terrestrial influence. As for the lowermost layer, it primarily consists of compact, fine to medium-grained sandstone, accompanied by a thin layer of siltstone. These sandstones exhibit arkosic to subarkosic composition, characterized by the presence of glauconite and mica. Additionally, there are common occurrences of slender, nodular calcite-cemented bands within this layer.

Interval inversion results

For the inversion procedure, five borehole logs were inverted: gamma ray (GR), density (RHOB), neutron porosity (NEU), sonic (\(\Delta \text{t}\)), and deep resistivity (RES). The interval was divided into four layers based on changes in gamma-ray intensity and deep resistivity along the borehole, with the layer boundaries chosen at 328, 439, and 564 m depth. Twenty iterations were performed over the four-layer interval with a damping factor (\(\epsilon\)) of 1000 and a decrement of 0.30 in each iteration. The initial model (\(m_{o}\)) was set to 50% water saturation, 10% porosity, 10% clay, 10% quartz, 10% carbonate, and 1% each of kerogen and pyrite. Seven model parameters were therefore estimated jointly in the inversion procedure. The estimated model parameters for each layer are shown in Table 1. The data distance calculated using Eq. (10) at the last iteration for the four layers is 0.44, 1.62, 0.87, and 1.6%, respectively. The logs calculated from the resulting model closely approximate the measured logs, as can be seen in Fig. 3.

Fig. 3

Borehole logs from the North Sea, Norway. Measured (blue line) and interval-inversion-estimated (red dashed line) logs are displayed in tracks 1 to 5. The overlaid acoustic and resistivity logs are displayed in track 6, where the separation of the curves (green area) indicates the source rock zone. Track 7 presents the TOC logs: the inverted TOC log in red, the TOC obtained by the Passey method in blue, and the core TOC values as black dots [Colored figure]

Simulated annealing results

The \(\Delta \log R\) distance was previously calculated by Valadez-Vergara (2020), who established the baseline interval around 480 m depth (Fig. 3) from the overlap between the resistivity and sonic logs, with baseline values of 1.99 Ω-m and 81.1 µs/ft, respectively. TOC content was determined using Eq. (14) from the Vk obtained by the interval inversion procedure, taking \({\rho }_{k}=1.0\) \(\frac{g}{c{m}^{3}}\), \(C=1.2\), and \({\rho }_{b}\) from the density log (Fig. 3). In Fig. 3, the TOC log from the inversion procedure approximates the other estimates (i.e., core samples) well; the root-mean-square deviation of the inverted TOC is 0.91, while that of the Passey (\(\Delta \log R\) distance) method is reported as 1.36.

After obtaining the input logs (Fig. 2) for the simulated annealing process, 30 runs were performed with 100,000 iterations each (Fig. 4). The temperature in the q-th iteration was set to \({T}^{(new)}={T}_{0}/\text{lg}(q),\) where the initial temperature of the artificial system \({T}_{0}\) was experimentally set to \(1.5\times {10}^{-1}\). Minimizing the energy function of Eq. (15) gave an average LOM of 8.2334 (Ro of approximately 0.6%).
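To give a feel for how slowly this schedule cools, the sketch below evaluates \(T_{0}/\text{lg}(q)\) at a few iteration counts. It assumes lg denotes the base-10 logarithm and that the schedule is applied from q = 2 onward, since lg(1) = 0; the function name is illustrative.

```python
import math

T0 = 0.15  # initial temperature reported for the case study


def temperature(q, t0=T0):
    """Logarithmic cooling T = T0 / lg(q); valid for q >= 2 because lg(1) = 0."""
    if q < 2:
        raise ValueError("q must be at least 2")
    return t0 / math.log10(q)


schedule = {q: temperature(q) for q in (10, 100, 1000, 100000)}
for q, t in schedule.items():
    print(q, round(t, 4))
```

The temperature equals \(T_{0}\) at q = 10 and has only dropped to one fifth of \(T_{0}\) after 100,000 iterations, which is why uphill moves remain possible late in each run.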

Fig. 4

Average value of LOM resulting from the 30 LOM-SA runs (blue dots). The red line marks the average value of the LOM estimation results [Colored figure]

The measures of central tendency of the estimated LOM are a mean of 8.2334, a most frequent value of 8.2053, and a median of 8.2324; the mean exceeding the median indicates a positively skewed distribution. The skewness is 0.2024, which is close to the zero skewness of a Gaussian distribution, and its positive sign confirms a right-skewed distribution with a longer right tail than left tail. The kurtosis is 2.5697, close to the value of 3 that characterizes a normal distribution (Fig. 5).

Fig. 5

Histogram of the LOM sample values after 30 iteration runs of LOM-SA method

The 95% confidence interval of the calculated LOM values around the mean is LOM = 8.2334 ± 0.0047. The standard deviation calculated from the 30 runs is 0.0131, and the variance is \(1.718\times {10}^{-4}\). The minimum LOM value is 8.2053 and the maximum is 8.2589, giving a range of 0.0536. The 25th and 75th percentile values are 8.2262 and 8.2381, respectively, and the 50th percentile (median) is 8.2324. Outliers are typically identified as data points lying more than 1.5 times the interquartile range above the upper quartile (75th percentile) or below the lower quartile (25th percentile); this rule gives a total of three outliers (Fig. 6). However, Grubbs' method (Grubbs 1969) identifies no outliers. This method takes the difference between each value and the mean, divides it by the standard deviation, and defines the value to be an outlier when that ratio is too large. It is more appropriate here since it applies to normally distributed data, as is the case.
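The reported confidence half-width and the interquartile-range fences can be reproduced from the summary statistics alone. The short check below uses only the values quoted in the text; the normal-approximation factor of 1.96 is an assumption about how the interval was computed, and the variable names are illustrative.

```python
import math

# Summary statistics of the 30 LOM-SA runs, as reported in the text.
n, mean, std = 30, 8.2334, 0.0131
q1, q3 = 8.2262, 8.2381
lom_min, lom_max = 8.2053, 8.2589

# 95% confidence half-width of the mean (normal approximation, z = 1.96).
half_width = 1.96 * std / math.sqrt(n)

# Fences at 1.5 times the interquartile range beyond the quartiles.
iqr = q3 - q1
lower_fence, upper_fence = q1 - 1.5 * iqr, q3 + 1.5 * iqr

# Grubbs-type ratios (distance from the mean in standard deviations)
# for the two extreme values.
g_min = (mean - lom_min) / std
g_max = (lom_max - mean) / std

print(f"half-width = {half_width:.4f}")
print(f"fences = [{lower_fence:.4f}, {upper_fence:.4f}]")
print(f"G(min) = {g_min:.2f}, G(max) = {g_max:.2f}")
```

With these values both the minimum and the maximum fall outside the fences, whereas the Grubbs ratios still have to be compared with the tabulated critical value for n = 30 before any point is declared an outlier.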

Fig. 6

Upper panel: Box plot showing the first (8.2262) and third quartile (8.2381). The median is 8.2324. Bottom panel: QQ and probability plot showing normal distribution of LOM-SA [Colored figure]

A quantile–quantile plot (QQ-plot) was employed to assess how closely the distribution of the dataset resembles a standard normal distribution. It confirms that the data tend toward a normal distribution, although they are slightly skewed and show heavy-tailed behavior (Fig. 6). The parameters of the normal distribution, that is, the mean (µ) and standard deviation (σ), were estimated together with their 95% confidence intervals (Fig. 6), giving µ = 8.2334 with [8.22851, 8.2383] and σ = 0.0131 with [0.0104386, 0.0176201].
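The reported intervals are consistent with the standard formulas for a normal sample, as the sketch below illustrates. It is an illustration under stated assumptions, not the authors' code: the function names are hypothetical, and the Student t and chi-square quantiles for 29 degrees of freedom are hard-coded from standard statistical tables rather than taken from the source.

```python
import math
from statistics import NormalDist

N_RUNS = 30
MEAN, STD = 8.2334, 0.0131

# Quantiles (assumptions taken from standard tables, 29 degrees of freedom).
Z_975 = NormalDist().inv_cdf(0.975)   # about 1.96
T_975_DF29 = 2.0452                   # Student t, 97.5th percentile
CHI2_025_DF29 = 16.047                # chi-square, 2.5th percentile
CHI2_975_DF29 = 45.722                # chi-square, 97.5th percentile


def mean_half_width(std, n, quantile):
    """Half-width of the confidence interval of the mean."""
    return quantile * std / math.sqrt(n)


def sigma_interval(std, n, chi2_low, chi2_high):
    """Confidence interval of the standard deviation of a normal sample."""
    dof = n - 1
    return std * math.sqrt(dof / chi2_high), std * math.sqrt(dof / chi2_low)


half_z = mean_half_width(STD, N_RUNS, Z_975)        # normal approximation
half_t = mean_half_width(STD, N_RUNS, T_975_DF29)   # Student t version
mu_interval = (MEAN - half_t, MEAN + half_t)
sd_interval = sigma_interval(STD, N_RUNS, CHI2_025_DF29, CHI2_975_DF29)
```

The normal-approximation half-width reproduces the ± 0.0047 quoted earlier for the mean, whereas the slightly wider Student t interval matches the bounds listed for µ. Small differences in the last digits of the σ bounds are expected because the standard deviation is rounded to four decimals here.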

The average data distance over the 30 runs of 100,000 iterations each is 30.68% (Fig. 7). The root-mean-square deviation between the sample values predicted by the model and the observed values is 0.7916.

Fig. 7

Data distance curves calculated during the LOM-SA procedures over 100,000 iterations. The thick red dashed line is the average data distance of the 30 runs [Colored figure]

The residuals (Fig. 8) have a mean of 0.3621, a standard deviation of 0.8069, a variance of 0.6511, a mode of 0.4862, and a median of 0.2699.

Fig. 8

The blue curve shows the residual calculated for each data point of the Norway well. The residuals are the differences between the TOC estimated from LOM-SA and the TOC from interval inversion. The red line marks zero deviation, that is, a perfect estimation. A good agreement between the values is reached [Colored figure]

The 25th and 75th percentiles are −0.0057 and 0.5787, respectively, and the 50th percentile (median) is 0.2699. The residuals have a kurtosis of 27.8602 and a skewness of 4.2256; their normality was assessed with the quantile–quantile plot (QQ-plot) and the normal probability plot. The residual sum of squares is 517.1630, the residual standard error is 0.8845, and the total sum of squares is \(3.1725 \times 10^{3}\), giving a coefficient of determination of 0.8370 and a Pearson's correlation coefficient of 0.9149.
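The goodness-of-fit figures above follow directly from the two sums of squares. As a small worked check (the function name `r_squared` is hypothetical), the coefficient of determination is one minus the ratio of the residual to the total sum of squares. The reported Pearson coefficient coincides with its square root, as would be the case for a least-squares linear fit; that interpretation is an assumption, not something the text states.

```python
import math


def r_squared(rss, tss):
    """Coefficient of determination from residual and total sums of squares."""
    return 1.0 - rss / tss


rss = 517.1630      # residual sum of squares reported for the Norway well
tss = 3.1725e3      # total sum of squares reported for the Norway well

r2 = r_squared(rss, tss)
# Assumption: Pearson's r is taken as the positive square root of R^2.
pearson_r = math.sqrt(r2)
```

Rounded to four decimals, `r2` and `pearson_r` agree with the reported 0.8370 and 0.9149.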

Hungary

The geological formation under scrutiny resides in the Derecske basin, positioned in East Hungary near the Romanian border. In Fig. 9, the well log suite utilized for estimating LOM is illustrated. Within this basin lies a notable gas accumulation zone comprising a middle Miocene tight reservoir series (Kiss et al. 2012). These tight reservoirs predominantly consist of sandstones and siltstones characterized by minimal porosity and permeability. Their permeability typically does not exceed 0.1 millidarcies (mD) (Szabó et al. 2022). The middle Miocene marine source rocks extend over a broader expanse encompassing the southern Zala, Dráva, Somogy, Kiskunhalas, Makó, Békés, and Derecske basins.

Fig. 9

Well logs from borehole Be-4 at Derecske basin, Hungary. Gamma ray (GR) is displayed at the first track. Density and neutron porosity logs are shown at the second track. Spectral gamma-ray (U-Th-K) logs are displayed in the third track. Deep resistivity (RD) is displayed on the fourth track. On the fifth track: Th/K, U/K, and Th/U log ratios are presented. The \(\Delta \log R\) log is displayed on the sixth track, and on the last track, TOC logs obtained by interval inversion (blue dashed line) and LOM-SA (red dashed line) are exhibited. The zone excluded from the analysis is marked by the area above the black dash-dotted line on tracks 5, 6, and 7 [Colored figure]

In the Miocene interval of the Derecske basin, distinctive lithological units are observable. Beginning from the Miocene formations' base, there is a breccia layer atop the basement, succeeded by a conglomerate layer exhibiting compact properties due to tuff content. Above the conglomerate, three discernible tuff zones or layers are present within the predominantly siliciclastic sequence. Notably, the third zone represents a gas reservoir with 8% porosity. The Miocene sequence also encompasses carbonate formations of relatively lesser thickness. Moreover, this sequence displays various sandier or silty "cycles" that include layers of sandstone, siltstone, clay, clay marl, and tuff-tuffite. The occurrence of tight gas correlates closely with these specific lithological units (Szabó et al. 2022).

The sedimentation environment within the siliciclastic units of the Derecske basin suggests a reduced-salinity setting, with traces of past freshwater and swampy conditions. The kerogen found in the clays is classified as type III, which is capable of generating hydrocarbon gas. The relatively young age of the source rock, combined with rapid subsidence and significant sediment accumulation within the hydrocarbon play region, has facilitated its maturation, making it favorable for hydrocarbon generation.

LOM-SA results

The interval inversion method was employed to evaluate the organic matter (kerogen) volume (Vk) from the wireline logs obtained at well Be-4 (Fig. 9). The TOC content was then determined using Eq. (14) from the Vk obtained by the inversion procedure, considering \({\rho }_{k}=1.5\) \(\frac{g}{c{m}^{3}}\), \(k=1.2\), and \({\rho }_{b}\) from the density log (Fig. 9). The \(\Delta \log R\) distance was calculated based on a lithology analysis (Fig. 9). The baseline values for the resistivity and neutron logs were set where both curves overlie each other according to the density-neutron log plot, since this emulates the non-source-rock shale interval; the baseline interval was set at 3595–3600 m depth. The baseline values of Rt and neutron porosity were taken as the averages over that interval, that is, 5.5204 ohm-m and 0.1954 v/v (Fig. 9). Furthermore, when following the procedure suggested by Passey et al. (1990) to estimate the \(\Delta \log R\) distance, the interval between 3570 and 3581 m was excluded from the analysis, since the baseline for the underlying geological succession would not represent the same type of sedimentary conditions (Fig. 9).
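The two quantities prepared in this step can be sketched as follows. This is an illustration under explicit assumptions, not a reproduction of the paper's equations. The kerogen-to-TOC conversion assumes the commonly used mass-balance form TOC (wt%) = 100 · Vk · ρk / (k · ρb), which may differ in form or units from Eq. (14). The \(\Delta \log R\) function uses the resistivity-neutron overlay published by Passey et al. (1990), with the baseline values quoted above. All function names are hypothetical.

```python
import math


def toc_from_kerogen_volume(v_k, rho_b, rho_k=1.5, k=1.2):
    """TOC in wt% from kerogen volume fraction (assumed conversion).

    v_k   : kerogen volume (v/v) from interval inversion
    rho_b : bulk density from the density log (g/cm3)
    rho_k : kerogen density (g/cm3)
    k     : kerogen-to-carbon conversion factor
    """
    return 100.0 * v_k * rho_k / (k * rho_b)


def delta_log_r_neutron(rt, phi_n, rt_base=5.5204, phi_n_base=0.1954):
    """Resistivity-neutron overlay separation after Passey et al. (1990).

    rt    : deep resistivity (ohm-m)
    phi_n : neutron porosity (v/v)
    """
    return math.log10(rt / rt_base) + 4.0 * (phi_n - phi_n_base)


# Hypothetical sample: 10% kerogen by volume in a rock of 2.5 g/cm3.
toc_example = toc_from_kerogen_volume(0.10, 2.5)
# At the baseline readings the separation is zero by construction.
separation_at_baseline = delta_log_r_neutron(5.5204, 0.1954)
```

By construction, the separation vanishes in the non-source baseline interval and becomes positive where resistivity and neutron porosity rise together.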

After running the simulated annealing algorithm, minimizing the energy function gave a mean LOM of 10.6927 (~ 0.93 Ro) (Fig. 10). The most frequent value and the median of the estimated LOM are 10.6394 and 10.6922, respectively, which indicates a positively skewed distribution. The skewness is 0.1011, which indicates that the data are close to normally distributed. The kurtosis is 2.4071, indicating a highly peaked distribution compared to the normal distribution. However, the magnitude of the kurtosis is not extremely high, so the departure from normality might not be very pronounced.
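To make the optimization step concrete, the following minimal sketch estimates a single LOM value by simulated annealing. It is not the authors' implementation. The energy function of Eq. (15) is not reproduced here, so a root-mean-square misfit between the reference TOC and the TOC predicted by the Passey et al. (1990) relation, TOC = \(\Delta \log R\) · 10^(2.297 − 0.1688 · LOM), is assumed instead. The search bounds, cooling schedule, step size, and all function names are hypothetical choices.

```python
import math
import random


def passey_toc(delta_log_r, lom):
    """TOC (wt%) from the Passey et al. (1990) relation."""
    return delta_log_r * 10.0 ** (2.297 - 0.1688 * lom)


def rms_misfit(lom, delta_log_r, toc_reference):
    """Assumed energy: RMS difference between Passey TOC and reference TOC."""
    squared = [(passey_toc(d, lom) - t) ** 2
               for d, t in zip(delta_log_r, toc_reference)]
    return math.sqrt(sum(squared) / len(squared))


def anneal_lom(delta_log_r, toc_reference, bounds=(6.0, 14.0),
               iterations=5000, t0=1.0, cooling=0.999, step=0.5, seed=0):
    """Estimate one LOM value for the whole interval by simulated annealing."""
    rng = random.Random(seed)
    low, high = bounds
    current = rng.uniform(low, high)
    e_current = rms_misfit(current, delta_log_r, toc_reference)
    best, e_best = current, e_current
    temperature = t0
    for _ in range(iterations):
        # Propose a random perturbation, clipped to the search bounds.
        candidate = min(high, max(low, current + rng.uniform(-step, step)))
        e_candidate = rms_misfit(candidate, delta_log_r, toc_reference)
        delta = e_candidate - e_current
        # Metropolis rule: always accept improvements, sometimes accept worse.
        if delta <= 0 or rng.random() < math.exp(-delta / temperature):
            current, e_current = candidate, e_candidate
            if e_current < e_best:
                best, e_best = current, e_current
        temperature *= cooling  # geometric cooling schedule
    return best, e_best


# Synthetic check with hypothetical values: TOC generated with LOM = 10.5.
dlr = [0.2 + 0.05 * i for i in range(20)]
toc = [passey_toc(d, 10.5) for d in dlr]
lom_estimate, misfit = anneal_lom(dlr, toc)
```

Repeating the call with different `seed` values gives a sample of LOM estimates whose mean, spread, and confidence interval can be summarized in the same way as the 30 runs reported here.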

Fig. 10

LOM values estimated in the 30 runs (blue). The red line marks the average value of the samples [Colored figure]

The standard deviation of the 30 runs is 0.0290, with a variance of \(8.3904 \times 10^{-4}\). The small standard deviation and variance imply a high level of precision in the LOM calculations. The range of the estimated LOM is 0.1107, with minimum and maximum values of 10.6394 and 10.7501. The 25th and 75th percentiles are 10.6746 and 10.7141, respectively, and the 50th percentile (median) is 10.6922. If outliers are defined as values lying more than 1.5 times the interquartile range above the upper quartile (75th percentile) or below the lower quartile (25th percentile), there are no outliers. The 95% confidence interval of the estimated LOM is 10.6927 ± 0.0108.

The mean data distance over the 30 runs of 100,000 iterations each is 18.34% (Fig. 11). The root-mean-square deviation between the TOC values predicted by LOM-SA and the values estimated by inversion is 0.3068. The residuals have a mean of 0.1332, a standard deviation of 0.2767, a variance of 0.0765, a mode of −0.4111, and a median of 0.0671. The variance indicates the spread of the residuals around their mean. The mode marks the value at which the residuals occur most frequently, and the median indicates that the middle value of the residuals is close to zero. The relatively low root-mean-square deviation suggests that the model has a good level of accuracy.
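The residual statistics above are mutually consistent, which can be checked with the identity RMSD² = (mean residual)² + variance. The identity is exact for the population variance and approximate for the sample variance when the number of samples is large. The sketch below is only such a consistency check; the function name is hypothetical.

```python
import math


def rmsd_from_moments(mean_residual, variance_residual):
    """Root-mean-square deviation from the bias and spread of the residuals.

    Exact when the population variance is used; a close approximation for
    the sample variance when the number of samples is large.
    """
    return math.sqrt(mean_residual ** 2 + variance_residual)


# Residual mean and variance reported for the Be-4 well.
rmsd_check = rmsd_from_moments(0.1332, 0.0765)
```

The value obtained is about 0.307, in line with the reported root-mean-square deviation of 0.3068.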

Fig. 11

Data distance curves obtained during the LOM-SA procedures over 100,000 iterations. The thick red line is the average data distance of the 30 runs [Colored figure]

Alaska

The Kingak Formation, found in Alaska’s North Slope region, constitutes one of three primary oil and gas source rock systems in the area. This formation, part of the Jurassic to Lower Cretaceous Kingak sequence, lies buried at depths exceeding 2700 m. Its basal sequence, known as the K1 sequence, ranges between 300 and 380 m in thickness. This basal sequence is composed of marine and terrigenous materials, gathering organic matter from both origins within a marine siliciclastic environment during the initial formation of the Canada Basin (Houseknecht and Bird 2004; Rouse and Houseknecht 2016). Wireline logs throughout the K1 sequence exhibit an exceptionally high gamma-ray response, indicating that a "hot shale" zone is consistently present within a thin layer of silty mudstone, as observed in wells such as North Inigok and Inigok (Houseknecht and Bird 2004).

The Kingak Formation consists of dark gray to dark-olive-gray shale, siltstone, claystone, and clay ironstone (Detterman et al. 1975). Its upper segment comprises clay shale, silty shale, and siltstone, featuring red ironstone layers that weather to a rusty hue. The lower section, referred to as K1, is characterized by dark gray to black fissile paper shale, dark gray clay shale, and minor claystone, and contains beds and nodules of red-weathering ironstone (Reiser et al. 1980). The most extensive well penetrations of the K1 sequence are located near, or sometimes even beyond, the clinoform toes within the eastern NPRA (such as the Inigok and North Inigok wells). These wells exhibit a strong gamma-ray response, indicating a "hot shale" within a narrow layer of silty mudstone. This occurrence is interpreted as a condensed section within the basin and is deemed a significant petroleum source rock (Houseknecht and Bird 2004). Typically, the lower limit of the K1 sequence marks the transition between the Kingak Shale and the underlying Triassic formations, specifically the Sag River Sandstone or Shublik Formation.

The LOM-SA approach's feasibility was demonstrated through the analysis of well logs extracted from the North Inigok 1 well, drilled into the Kingak Formation. In this study, a 90-m interval was scrutinized, previously earmarked as a potential source rock using Passey’s method (Detterman et al. 1975). The study outcomes were validated using core data from the same section. Notably, the Kingak Shale in the North Inigok-1 well exhibited thermal maturity reaching a postmature stage. This aligns with the reduction of the Hydrogen Index (HIo) in the basal organic-rich shale, which shows measured values within the range of about 20 to 30 mg HC/g TOC. Additionally, at the North Inigok-1 well, the basinal organic-rich shale indicated TOC values of approximately 4 wt.% (Peters et al. 2006).

LOM-SA results

Real data measured at the North Inigok 1 well in the K1 organic-rich shale reservoir of the Kingak Formation were inverted using the interval inversion procedure (Fig. 12). The TOC results showed a good agreement between the estimated and measured data, indicating that this method can be a valuable tool for evaluating organic-rich shale formations (Fig. 12). Therefore, the estimated TOC log was used as input for the SA procedure. To estimate the \(\Delta \log R\) distance (Fig. 12), the baseline values of RT and sonic travel time were chosen as 2.46686 ohm-m and 112.407 µsec/ft, respectively, from the average measurements over the interval between 2960 and 2970 m depth (Fig. 12).
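For this well the overlay is built from the resistivity and sonic logs. A minimal sketch of the separation, using the sonic form published by Passey et al. (1990) and the baseline values quoted above, could look as follows. The function name and the sample readings are hypothetical, and the sonic travel time is assumed to be in µsec/ft.

```python
import math


def delta_log_r_sonic(rt, dt, rt_base=2.46686, dt_base=112.407):
    """Resistivity-sonic overlay separation after Passey et al. (1990).

    rt : deep resistivity (ohm-m)
    dt : sonic travel time (microseconds per foot)
    """
    return math.log10(rt / rt_base) + 0.02 * (dt - dt_base)


# Hypothetical readings (resistivity in ohm-m, travel time in usec/ft).
samples = [(2.46686, 112.407), (6.0, 118.0), (12.0, 125.0)]
separation = [delta_log_r_sonic(rt, dt) for rt, dt in samples]
```

The first sample reproduces the baseline and therefore gives zero separation, whereas readings with higher resistivity and longer travel time give a positive \(\Delta \log R\).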

Fig. 12

Borehole logs from the North Inigok 1 well, Alaska. Gamma ray (GR) is displayed on the first track. Density and neutron porosity logs are presented on the second track. The sonic or acoustic (Δt) log is displayed on the third track. Deep resistivity (RES) is shown on the fourth track. The \(\Delta \log R\) log is displayed on the fifth track. Finally, on the sixth track, the TOC content log (blue dashed line) from interval inversion is presented. The core TOC values are shown as red dots [Colored figure]

As in the cases analyzed before, the SA was run multiple times to allow a statistical analysis. In this case, however, the number of runs was doubled (60 runs) to increase the significance of the results. After running the simulated annealing algorithm, minimizing the energy function gave an average LOM value of 36.80 ± 2.1655 (~ 3.58 ± 0.07 Ro). The results range from 21.6183 to 55.7545 LOM, with a median of 37.4593 LOM and a mode of 21.6183.

The normality of the results was analyzed using the Anderson–Darling and Jarque–Bera tests at a 5% significance level; in both cases the null hypothesis is not rejected. For the Jarque–Bera test, this means that the data do not deviate significantly from a normal distribution in terms of skewness and kurtosis; for the Anderson–Darling test, it suggests that the data closely follow the expected (normal) distribution.
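As an illustration of the first of these tests, the Jarque–Bera statistic combines the sample skewness S and kurtosis K as JB = n/6 · (S² + (K − 3)²/4). The sketch below assumes the asymptotic chi-square distribution with two degrees of freedom for the critical value, whereas statistical packages may use tabulated small-sample values instead. The function names are hypothetical, and the Anderson–Darling test is not sketched here.

```python
import math


def jarque_bera(values):
    """Jarque-Bera statistic from biased sample skewness and kurtosis."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    skewness = m3 / m2 ** 1.5
    kurtosis = m4 / m2 ** 2
    return n / 6.0 * (skewness ** 2 + (kurtosis - 3.0) ** 2 / 4.0)


def jb_critical_value(alpha=0.05):
    """Asymptotic critical value: chi-square quantile with 2 degrees of freedom.

    For 2 degrees of freedom the quantile has the closed form -2 ln(alpha).
    """
    return -2.0 * math.log(alpha)


def normality_not_rejected(values, alpha=0.05):
    """True when the Jarque-Bera statistic stays below the critical value."""
    return jarque_bera(values) < jb_critical_value(alpha)


# Hypothetical input; in practice this would be the 60 LOM estimates.
decision = normality_not_rejected([1.0, 2.0, 3.0, 4.0, 5.0])
```

At the 5% level the asymptotic critical value is about 5.99, so normality is not rejected whenever the statistic computed from the runs stays below it.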

Discussion

The organic richness (TOC content) and the current and past maturity level of the formation are two crucial factors that determine whether a particular rock will be a good source rock, and they are needed to assess the capacity of a source rock to produce hydrocarbons (Passey et al. 2010). However, borehole geophysical instruments do not measure these petrophysical variables directly, nor other volumetric quantities (e.g., porosity, shale volume, water saturation).

There have been many attempts to overcome the lack of direct measurements of petrophysical parameters and other relevant quantities, such as textural properties and zone parameters (e.g., LOM), and innovative techniques have been developed (Szabó and Dobróka 2020; Szabó et al. 2021, 2022) that have proven to be useful tools for detecting and characterizing unconventional reservoirs. The preceding results demonstrate that the interval inversion method enables an accurate estimation not only of TOC but also of the fractional volumes of pore space, water, hydrocarbon, and the mineral components of unconventional formations. Although core data on the fractional volumes of the mineral composition, porosity, and water saturation are lacking, the results obtained correlate quite well with those found in the literature on the area. For instance, Storebø (2021) reported that water saturations in the North Sea Lower Cretaceous clay-rich carbonate reservoirs range from 60 to 80%, with porosities around 25%, while Worden et al. (2020) reported porosity values around 20%, which could reach up to 30% in some areas. The mineral composition of the formations in that area is dominated by calcium minerals and clays (Worden et al. 2020), with a significant amount of pyrite (~ 8%) reaching up to 17% in some samples (Storebø 2021).

In practice, LOM is calculated from core samples in the laboratory (vitrinite reflectance) or by using the well-known relationship of the Passey method, which requires an empirical relationship that relies on TOC from core data. Our proposed method for predicting LOM is based on the estimation of TOC through interval inversion and on approximating the global optimum of the energy function proposed in Eq. (15). The suggested methodology provides a straightforward estimation of LOM that can be independent of the TOC from cores, since the TOC from the inversion procedure has proven to be a reliable tool for its estimation and for characterizing shale gas formations. The result of the SA procedure for this geological region is confirmed by previous studies in the area, showing close agreement with the vitrinite reflectance gradients for the Norwegian sectors of the Central Graben in the North Sea (Petersen et al. 2013).

As for the Hungarian case study, the analysis was considerably more complex since, unlike shale hydrocarbon accumulations, tight reservoirs do not function as source rocks. However, they possess a low hydrocarbon migration potential, implying that source rocks should be located nearby as interbedded layers. Furthermore, these source rocks, in the Middle Miocene formation of the Derecske basin, exhibit significant vertical and lateral variations, making the interpretation a complex task (Badics et al. 2023).

In the Middle Miocene formation of the Derecske basin, the increase in gamma-ray readings and uranium content at depths of 3583 to 3592 m suggests an anomalous enrichment of organic matter. At the top of the analyzed section, the high gamma-ray intensity values may indicate the presence of another source rock (Fig. 9). However, the interpretation of the spectral gamma-ray log may indicate a clear boundary between two different facies, distinguished by an abrupt change in the otherwise constant values of the Th/U, U/K, and Th/K ratios (Fig. 9). It has been observed that such ratios are helpful for interpreting the environment of deposition. Furthermore, this change is followed by a major increase in the U value and a particularly low Th/U ratio, which may also be associated with a marine condensed sequence (Wignall and Myers 1988). This idea is supported by the observation that peaks in Th and K (and therefore presumably high Th/U) were associated with major transgressive surfaces (Ehrenberg 2001).

Therefore, when following the procedure suggested by Passey et al. (1990), such intervals were excluded from the analysis, since the baseline for the underlying geological succession would not represent the same type of sedimentary conditions. Additionally, the presence of K and Th is commonly associated with clay content; however, the relatively low U response rules out the possibility of it being a source rock (Glover 2000). The results of the SA procedure in this region are consistent with the values reported for the zone of interest: vitrinite reflectance values range from 0.91%Ro to 0.96%Ro according to Szabó et al. (2022), and Badics et al. (2023) report an average value of 0.91.

The results for the Alaska dataset agree with the data reported for the lower section of the Kingak Formation, K1. This section is an overmature formation that has been indicated as the source of the heavy crude oil in the Alpine field (Bird et al. 1998; Houseknecht and Bird 2004; Peters et al. 2006). Also, Peters et al. (2006) mentioned that in the North Inigok-1 well, at the depth of the K1 section, the reduction of the HIo in the basal organic-rich shale to the measured values is consistent with a postmature or thermally spent unit. Furthermore, according to the USGS report on vitrinite reflectance data, the Kingak Shale shows a diverse distribution of vitrinite reflectance values, ranging from 2.01 (14.21 LOM) to 4.26 (63.49 LOM).

It is worth mentioning that the Passey method was calibrated and tested on lithologies worldwide; however, it has been suggested that this technique should be primarily applied to formations that span the range between the onset of oil generation and overmaturity, that is, LOM 7 to LOM 12. Even though it can be applied beyond these limits, the results should be treated with caution, since the method might not be as accurate or applicable for source rocks with significantly lower or higher maturity levels. This holds mostly for TOC estimation, since the method under-predicts TOC values in overmature shale gas systems (Passey et al. 2010; Sondergeld et al. 2010).

Conclusion

A new alternative geophysical approach for the automated estimation of the level of maturity is presented here. It is intended to improve the evaluation of unconventional hydrocarbon reservoirs in the prospecting stage, including the assessment of TOC and reservoir quality parameters. The method is simple and straightforward, and it overcomes the dependence on laboratory measurements of LOM from core samples (e.g., vitrinite reflectance) by relying on the modern tool of interval inversion. The methodology also incorporates the metaheuristic global optimization method of simulated annealing by establishing a cost function based on the well-known mathematical expression that Passey suggested for approximating the TOC content, which otherwise relies on precise LOM knowledge from laboratory data.

Moving forward, it is essential to consider future directions for research in this field. One promising avenue is the further development and refinement of thermal maturity estimation, building upon the interval inversion technique presented in this study, as the method permits the assessment of lateral variations in layer thicknesses and the concurrent analysis of petrophysical parameter changes along a 2D cross-section formed by multiple boreholes (Abdellatif and Szabó, 2022). Therefore, there is a possibility to assess pseudothermal maturity profiles not only for one section of a borehole but in a complete hydrocarbon field considering adjacent wells.

Furthermore, our work extends beyond the scope of a single reservoir type or geographic location. The methods and insights presented herein hold potential applications in a broader spectrum of reservoir types and geological settings (such as shale gas and tight gas formations). Future studies should explore the adaptability and reliability of our method in diverse geological contexts, fostering a deeper understanding of its capabilities and limitations.

Additionally, the method proposed here has practical implications for the exploration and evaluation of unconventional hydrocarbon reservoirs. By reducing the dependence on time-consuming and costly laboratory measurements of LOM from core samples, our method streamlines the prospecting phase. This not only improves the efficiency of reservoir assessment but also could potentially reduce operational expenses.

This study sets the groundwork for well-logging thermal maturity estimation techniques and enhances our ability to evaluate hydrocarbon reservoirs. By addressing the future directions of research, potential applications to various reservoir types, and the promising developments in interval inversion methods, it will be possible to contribute to the ongoing progress in this field for more efficient and accurate reservoir characterization in the future.