Introduction

The precise calculation and evaluation of the in-situ rock stress tensor is a crucial factor in addressing the major challenges related to the design, construction, and operation of subsurface engineering, such as mining, civil, and petroleum projects1,2,3,4,5,6. Meanwhile, accurate in-situ stress data is also indispensable for earth science research, such as plate driving mechanism, tectonic evolution, earthquake mechanism and prediction, and fault activity7,8,9,10,11,12,13,14. It is well known that the genesis of in-situ stress is extremely complicated due to the long-term countless tectonic movements experienced by the earth as well as various geological structures with different scales and directions widely developed within the earth, and the crustal stress field exhibits remarkable temporal and spatial characteristics. As a consequence, the complexity and variability of the crustal stress state are caused, and its performance varies in regions with different geological tectonical backgrounds.

Compared with other rock mass properties, rock stress is a difficult quantity to measure. As Leeman15 pointed out, “It is impossible to measure stress directly since, in fact, it is a fictitious quantity. It is only possible to deduce the stresses in a solid body from the results of measurements using some indirect method”. At present, reliable information about the magnitude and direction of in-situ stress can only be obtained by in-situ measurement technique. Over the past decades, various techniques for determining in-situ stresses, such as hydraulic methods, overcoring methods, jacking methods, strain recovery methods, and borehole breakout methods, have been developed and improved. Generally, all in-situ stress measurement techniques involve disrupting the rock. The response related to the disturbance is measured (in the form of strain, displacement, or hydraulic pressure record) and analyzed by making several assumptions about the constitutive behavior of rocks, and the process of disturbance itself is usually accounted for in the analysis16. The identified in-situ stress is a point tensor quantity and the defined stress values have uncertainties associated with site geology and environmental factors (including topography, rock type, geological structures, and anisotropy and heterogeneity of rocks), the accuracy of various measurement methods, and data analysis and interpretation approaches17. Hence, understanding and controlling all known sources of error is crucial. According to Amadei and Stephansson18, 10–20% scatter in stress magnitude and ± 20° scatter in stress directions have to be accepted when conducting a series of stress measurements in typical rock conditions. Against this background, considerable efforts have been devoted to improving in-situ stress testing equipment and enhancing measurement accuracy worldwide recently.

In general, the measurement of in-situ stress is to determine the undisturbed three-dimensional stress state existing in the stratum, which is usually achieved through point-by-point measurements. To accurately determine the in-situ stress in a certain area, a large number of “point” measurements are necessary. However, due to the limitation of various factors such as manpower, material resources, site, and funds, it is impossible to measure the stress at all points in the target area, but only at a few points, so the measured stress data often has strong randomness, large dispersion, and small sample size. When using traditional probability statistics methods to simulate and fit the stress field of the entire target area and establish a fine stress field distribution model based on the limited measured data, the calculation error will be significant. If the non-linear fitting approach is applied to the limited measured data, the convergence and accuracy of the fitting results cannot be guaranteed. Moreover, although some other popular prediction methods19,20,21,22,23,24,25 such as multivariate regression, machine learning, deep learning, numerical simulation, neural networks, support vector machines, genetic algorithms, and random forest models have been developed and have emerged as key players in various industries, these methods are mainly devoted to mining the implicit rules of the dataset from a large amount of data, predicting or classifying expected results, and have problems such as too many iterations, slow convergence speed, and is prone to get stuck in local minima. Furthermore, the current research on in-situ stress prediction methods primarily focuses on theoretical models (such as rock physical equivalent model)26 and numerical simulation methods (such as boundary displacement adjustment method and displacement function method)27, while there are few studies on the prediction of in-situ stress by using new mathematical methods, such as coupling of two single mathematical methods. Thus, to overcome these shortcomings, it is particularly necessary to develop effective and accurate approaches for predicting in-situ stress.

To that end, in the present study, taking the most commonly used overcoring method as an example, some innovative techniques, equipment, and measures developed to improve the accuracy of in-situ stress measurement are systematically introduced. Moreover, the improved overcoring method has been used to measure the in-situ stress in many mining areas in China in recent years, and a large number of valuable and high-quality stress data have been obtained. These measurement practices have proven that the measurement accuracy and reliability of in-situ stress have been significantly improved. In addition, to address the shortcomings of the current stress prediction approaches, an improved grey neural network combination model is proposed based on the grey system theory and neural network principle, and the model algorithm is synchronously optimized. All the equations are fully developed, and the model is put forward in its entirety. Furthermore, taking the compiled measured stress data as sample data, the successful application of this model is also presented, and the predicted results are compared with the in-situ stress value determined by a single grey model and a single neural network model. This combination model only requires a small sample size, can weaken the randomness of the original data, and can be continuously modified to converge to satisfactory accuracy. The achieved results provide a scientific basis and method for greatly improving the accuracy of in-situ stress measurement and prediction.

Overcoring stress measurements

Improved overcoring technique for accurate measurement

The overcoring (OC) method is an indirect in-situ stress measurement technique that can accurately and quantitatively determine the original rock stress vector. It is currently the only method that can obtain the magnitude and direction of the complete three-dimensional stress from a single OC test28,29. The OC method is based on measuring the response strain when the rock core is released from the stress field of the surrounding rock mass, and the original rock stress vector can be estimated by analyzing the measured strains with the knowledge of the elastic properties of rocks6. Normally, the induced strains are measured by 12 strain gauges with different orientations before, during, and after the stresses on the borehole wall are released through OC (Fig. 1)17. The strain difference is used to back-calculate the stresses acting on the rock cylinder prior to OC assuming continuous, homogeneous, isotropic, and linear-elastic rock behavior. In the early 1950s, Hast30 first applied the OC method to measure in-situ stresses in Scandinavia. After decades of theoretical and technical development, this method has become one of the recommended methods for determining rock stress recommended by the Test Methods Committee of the International Society for Rock Mechanics (ISRM) in 200331. Among the various OC devices developed, the OC technique with a hollow inclusion strain gauge is the most representative one. Since the CSIRO-HI type strain gauge was invented in the 1970s32, this technique has been extensively adopted in practice worldwide. According to statistics, among various stress indicators, the in-situ stress data measured by the OC method accounts for approximately 80%.

Figure 1
figure 1

Schematic diagram of the stress state at the vicinity of the borehole at different phases during OC (after Hakala et al.17): in-situ state (a); after borehole drilling (b); and after OC (c).

The nature of rock (discontinuity, heterogeneity, and anisotropy), changes in environmental temperature at measurement points, and the selection method of rock elastic parameters used in stress calculation are the three main factors that affect the accuracy of in-situ stress measurement. The conventional OC method for solving the original rock stress is based on the basic assumption of elastic theory33, that is, the rock is homogeneous and isotropic. However, most rock masses have anisotropic properties. Anisotropy leads to the measured stress value being larger than the actual value in some directions and smaller than the actual value in other directions. Thus, the influence of rock anisotropy must be considered when calculating in-situ stress based on the measured strain value of stress relief. For rock masses with simple anisotropic properties such as transverse anisotropy and orthogonal anisotropy, anisotropic theoretical analytical solutions or numerical simulations can be used for the analysis and calculation of measurement results. However, for most rock masses, the combination of anisotropy, discontinuity, and heterogeneity is extremely complex and cannot be described using a simple model. For this purpose, the effect of anisotropy can be preliminarily corrected by using the results of the confining pressure test of the borehole core. In addition, the outstanding advantage of the hollow inclusion strain gauge is that the strain gauge and the borehole wall are bonded together in a considerable area, thus ensuring good bonding quality. Moreover, the cementing agent can also be injected into the cracks and defects in the surrounding rock around the strain gauge to complete the rock mass, making it easier to obtain a complete borehole rock core. Thus, the hollow inclusion strain gauge can be used in moderately discontinuous and uneven rock masses and has good waterproof performance. This has been confirmed in previous stress measurement practices34.

Hollow inclusion strain gauges, like many other strain measuring instruments, use resistance strain gauges as measuring elements and convert strain changes into resistance changes and then voltage changes based on the Wheatstone bridge principle for measurement and recording. The resistance strain gauge is quite sensitive to temperature, and its resistance values will change accordingly when the temperature changes, generating corresponding output voltage in the bridge. As a result, the false additional strain values will be calculated. To eliminate this additional temperature strain value, corresponding compensation measures must be taken35. For this reason, Cai et al.36,37,38 invented a complete temperature compensation technique, which includes the following points: (1) in the Wheatstone bridge used to measure and record the strain value of each strain gauge, except for the working strain gauge, the three bridge arms are all ultra-low temperature coefficient resistors, and their resistance values basically do not change with temperature, so they will not cause output voltage in the bridge when the temperature changes; (2) monitoring the temperature changes of the strain gauge during the stress relief; (3) after the stress relief is completed, the drilled core containing a hollow inclusion strain gauge is placed in an incubator with adjustable temperature for temperature calibration test to determine the temperature strain rate (i.e., the strain value generated by a temperature change of 1 °C) for each measuring point and strain gauge; and (4) based on the calibrated temperature strain rate and the measured temperature change at the measuring point during the stress relief, the false additional strain value of each strain gauge caused by temperature changes during the stress relief can be calculated, and this additional strain value is removed from the measured total strain value to obtain the true strain of each strain gauge caused by the stress relief. In addition, the wires of the resistance strain gauge can also cause significant additional strain when the temperature changes, so this part of the additional strain should be compensated correctly.

To use the complete temperature compensation technique, significant improvements have been made to the structure of the traditional hollow inclusion strain gauge39,40, as illustrated in Fig. 2. The related improvements mainly involve embedding a thermistor at the point D between two sets of strain gauges for continuous measurement of temperature changes of the strain gauge during OC41. Furthermore, at point E, two wires that are the same as the strain gauge wires are welded together, and are led out of the cable along with other wires and connected to a bridge arm adjacent to the working strain gauge. As such, the additional strain of the strain gauge wires due to temperature change during stress relief can be offset based on the principle of the Wheatstone bridge. Moreover, the traditional common compensation patch is removed in the improved hollow hollow strain gauge, as this type of compensation patch is not suitable for the temperature compensation of cemented strain gauges. In addition, since the ordinary resistance strain gauge cannot be used for the aforementioned complete temperature compensation technique, a dedicated strain-resistance–voltage conversion device is designed and developed based on the principle of the Wheatstone bridge. This device contains 12 bridges, and each bridge is used for data measurement of a strain gauge. The voltage output from the 12 bridges is automatically recorded by the data collector, which can simultaneously collect the data from 48 channels in one-thousandth of a second, eliminating the influence of the contact resistance of the ordinary resistance strain gauge conversion switches on the measurement results.

Figure 2
figure 2

Schematic diagram of the structure of the improved hollow inclusion strain gauge43.

In addition, the measurement accuracy of the elastic modulus and Poisson’s ratio, especially the elastic modulus, of rocks, has a great influence on the accuracy of the in-situ stress calculation results, so it is of great significance to improve the measurement accuracy of the elastic parameters to improve the accuracy of the stress measurement results42,43. In the existing in-situ stress measurements, it is assumed that the rock is linearly elastic, and the influence of stress levels on the elastic modulus and Poisson’s ratio is ignored when calculating the elastic modulus and Poisson’s ratio of rocks through uniaxial compression tests or biaxial confining pressure tests. In fact, most rock masses are not completely linear elastic, and the elastic modulus of rocks varies with the stress level. Generally, the elastic modulus at a high-stress level is much larger than that at a low-stress level. Hence, calculating low-stress values using elastic modulus values under high-stress levels or calculating high-stress values using elastic modulus values at low-stress levels will cause considerable errors in the calculation results. Meanwhile, for nonlinear rock masses, the loading and unloading paths are different, and the calculation of the elastic modulus from the loading path can also cause certain errors because stress relief is an unloading process. Consequently, to consider the nonlinear elasticity of rocks, the elastic modulus value used must be in accordance with the calculated stress level.

Theoretical calculation

For the improved OC technique, after the temperature calibration, the final strain values measured by the hollow inclusion strain gauges during the stress relief process can be used to calculate the in-situ stress values according to the following formulas:

$$\varepsilon_{\theta } = \frac{1}{E}\left\{ {\left( {\sigma_{x} + \sigma_{y} } \right)K_{1} + 2\left( {1 - \upsilon^{2} } \right)\left[ {\left( {\sigma_{y} - \sigma_{x} } \right)\cos 2\theta - 2\tau_{xy} \sin 2\theta } \right]K_{2} - \upsilon \sigma_{z} K_{4} } \right\}$$
(1)
$$\varepsilon_{z} = \frac{1}{E}\left[ {\sigma_{z} - \upsilon \left( {\sigma_{x} + \sigma_{y} } \right)} \right]$$
(2)
$$\gamma_{\theta z} = \frac{4}{E}\left( {1 + \upsilon } \right)\left( {\tau_{yz} \cos \theta - \tau_{zx} \sin \theta } \right)K_{3}$$
(3)
$$\varepsilon_{ \pm 45^\circ } = \frac{1}{2}\left( {\varepsilon_{\theta } { + }\varepsilon_{z} \pm \gamma_{\theta z} } \right)$$
(4)

where εθ, εz, γθz, and ε±45° represent the circumferential strain, axial strain, shear strain, and the strain at 45° to the axis, respectively; σx, σy, σz, τxy, τyz, and τzx are the six components of the original rock stresses; E and υ are the elastic modulus and Poisson’s ratio, respectively; θ is the separation angle between the strain gauge and the X-axis; and K1, K2, K3, and K4 stand for the four correction coefficients, which are collectively referred to as K coefficients.

In the hollow inclusion strain gauge, three groups of strain gauges are embedded in the center of an epoxy cylinder wall instead of directly sticking to the surface of the strain gauge, and the measured strain value is different from that of the strain gauges directly sticking to the borehole wall. The four correction coefficients are used to correct this difference, and the values of the four correction coefficients are related to the elastic modulus and Poisson’s ratio of rocks and hollow inclusion materials, the inner and outer diameters of hollow inclusions, and the radial position of strain gauges in hollow inclusions. Only for different rocks, the values of these correction coefficients can differ by 50%. Therefore, for each OC measuring point in a borehole, the values of the four correction coefficients must be calculated using the formulas provided by Duncan Fama and Pender44.

The elastic modulus and Poisson’s ratio of the rock are necessary when calculating the in-situ stress from the strain value measured by the hollow inclusion strain gauge during the OC process. The elastic modulus and Poisson’s ratio of the cored rocks at each measuring point are obtained through confining pressure calibration experiments of the drilled rock cores, which are retrieved from the site after each stress relief test, thus ensuring that the elastic modulus and Poisson’s used for stress calculation truly correspond to the rocks at the measuring points. The specific method and process of the confining pressure calibration experiment can be found in the detailed introduction by Cai et al.37. Moreover, considering the influence of colloid deformation parameters and bonding gap, the calculation formula of elastic modulus is revised:

$$E = K_{{1}} \left( {\frac{{P_{{\text{c}}} }}{{\varepsilon_{\theta } }}} \right)\frac{{2R^{2} }}{{\left( {R^{2} - r^{2} } \right)}}$$
(5)
$$\upsilon { = }\frac{{\varepsilon_{\theta } }}{{\varepsilon_{z} }}$$
(6)

where Pc is the confining pressure; and r and R are the inner and outer radii of the core, respectively. Note that in the previous calculation of elastic modulus, K1 was assumed to be a fixed constant of 1.1243. In some cases, K1 may be equal to or close to this value, but for many rocks, especially soft rocks, K1 is not equal to 1.12, and its value is between 0.80 and 1.20. Replacing K1 with 1.12 will cause a great error in the calculation results of rock elastic modulus, with an error as high as 40%.

In summary, in the OC method, when calculating the rock elastic modulus, the coefficient K1 is required, and when calculating the coefficient K1, the rock elastic modulus also needs to be known. Thus, the iterative method is needed to solve. At the same time, due to the nonlinearity of rocks, their elastic modulus and Poisson’s ratio are related to stress levels, but the in-situ stress value is unknown, so the iterative method is also required to solve the in-situ stress value to use the elastic modulus of rocks that is equivalent to the in-situ stress level. Hence, the double iteration method should be used to solve the rock elastic modulus value, Poisson’s ratio, correction coefficients, and in-situ stress value. This calculation method can ensure the accuracy of stress measurement results. The application of these technologies has significantly improved the accuracy of in-situ stress measurements, and the measurement accuracy in fractured and porous rock masses has increased by more than 20%.

Measured stress data

To obtain accurate and reliable stress measurement data, the OC equipment, setup, and testing procedures adopted at the measuring points strictly follow the recommendations provided by the ISRM31. The accuracy of the OC stress measurements is not only influenced by the instrument itself and the measurement operations but also largely constrained by objective factors such as engineering geological environment and rock conditions. Therefore, the selection of in-situ stress measurement points should follow the following principles proposed: (1) the measurement points should avoid stress variation zones and be arranged in positions that are not affected by engineering excavation disturbances, namely, the original rock stress area, which requires that the depth of the borehole should reach at least 3–5 times the span of a tunnel; (2) the measuring points should be arranged in continuous and complete rock masses as far as possible, generally far away from faults, and away from rock fracture zones and fault development zones unless the measurements are intentionally carried out, to ensure the bonding quality between the hollow inclusion strain gauge and the borehole and obtain complete overcored rock cores, which is very necessary for the subsequent experiments and analysis of stress relief measurement results; (3) the measuring points shall be far away from or as far away as possible from larger excavation bodies, such as large goafs, large chambers, etc.; (4) to investigate the variation law of in-situ stress state with depth, the measurements should be carried out at least at three depth levels; (5) equipment and personnel needs have to be assessed; and (6) avoiding other potential interference sources that may affect stress measurements.

After the stress relief is completed, the strain data stored in the data collector are printed out by the computer, and the stress relief curve (i.e., the curve of the strain value of each strain gauge changing with the depth of borehole relief) is drawn accordingly. The stress relief curve is very helpful to check the working state of each strain gauge and determine whether the measured strain is affected by other factors besides stress relief, such as rock conditions. If each strain gauge is in normal working condition and the rock condition is good, the stress relief curve should show a regular and predictable shape (Fig. 3). In the rock core temperature calibration test, the additional strain value of each strain gauge caused by the temperature change of the measuring point in the stress relief process can be obtained by multiplying the temperature change value of the strain gauge during stress relief by the temperature strain rate. By removing this additional strain value from the final stable strain value measured during stress relief, the strain value really caused by stress relief can be derived.

Figure 3
figure 3

Typical stress relief curves45.

Based on the stress relief strain values after temperature calibration and the results of confining pressure calibration tests, the K coefficients, rock elastic modulus, Poisson’s ratio, and in-situ stress values for each measurement point can be calculated using a self-developed double iteration calculation program, ensuring the correctness of the calculation process and results. In recent years, the improved OC technique has been employed by our team to conduct in-situ stress measurement campaigns in 14 mines (including eight metal mines and four coal mines) in China, and a total of 123 groups of OC stress data were obtained. Detailed information on the measured stress data is presented in Table 1. These stress data are distributed in eight provinces of China with different geological tectonic settings, and their measurement depth ranges from 56 to 1123 m, which has good representativeness. The absolute value of stress indicates that the present-day crustal stress state in these mining areas is compressed. The “direction” of stress is defined as 0° at due north, rotating clockwise, and 360° when turning to due north again. The positive or negative of the “dip” of stress is defined according to the horizontal plane, and it is positive when the horizontal plane angle is upward, and vice versa. Note that since the stress is a pair of collinear vectors, if a principal stress at a site lies in the NW–SE direction, its dip angle is positive from the fourth quadrant and negative from the second quadrant. To validate the quality of the obtained stress data and ensure the comparability of the data, the stress data are checked by using the globally accepted World Stress Map (WSM) quality ranking system46 regarding the OC method, and these data can be ranked as categories A, B, C, and D, revealing reliable and significant information on the in-situ stress tensor.

Table 1 Summary of the OC stress measurements in some mines in China.

As shown in Table 1, although these stress data are distributed in different regions with distinct tectonic settings, the measurement results indicate that the maximum principal stress at each measuring point is close to the horizontal direction (called σH). For the remaining two principal stresses, i.e., intermediate and minimum principal stresses, one is nearly horizontal (called σh) and the other remains basically vertical (called σv). This reflects that horizontal tectonic stress plays a dominant role in the stress field of those mining areas, which matches more or less with the first-order pattern of the crustal stress in the lithosphere47. Hence, the derived OC stress data agree with the hypothesis that horizontal tectonic stress normally regulates the modern crustal stress field of the upper crust. Moreover, the vertical principal stress magnitude is basically equal to or slightly less than the weight of overlying strata, and the magnitude of the maximum horizontal principal stress is around twice that of the vertical principal stress on average. In addition, the three principal stress magnitudes versus depth are plotted in Fig. 4, showing the correlation between the stress values and depth obtained through linear fitting analysis. The results indicate that the three principal stresses exhibit small dispersion and increase approximately linearly with depth, which is completely in accord with the current universal cognition of in-situ stress. On the other hand, the dominant direction of the maximum principal stress that we identified in each mine is in good agreement with the directions of various stress indicators previously derived from different measurement methods (such as OC, hydraulic fracturing, borehole slotter, and geological indicators) in and around the mining area in the recently updated WSM48 and the tectonic stress map of North China49. The above results suggest that the improved OC technique has excellent testing performance and reliable measurement results. The stress data has provided the necessary basis for the scientific design and construction of these mines, especially for the selection of appropriate roadway and stope location and direction, the optimal shape and size of underground openings, effective and safe excavation sequence, and reliable support and reinforcement of mining structures.

Figure 4
figure 4

Constrained principal stress profile with depth.

In-situ stress prediction approach

Grey model

The grey system theory was first proposed by Deng50 in the late 1970s, which mainly aimed at uncertainty problems with little experience and data. The information in the grey system includes both known and unknown, and there are uncertain relationships between various factors within the system (Fig. 5). Grey model (GM) is a prediction approach to establish a mathematical model and make a forecast through a small amount of incomplete information. In recent years, the grey prediction theory has been widely used in many fields. It can effectively deal with systems with significant uncertainties and limited data samples and exhibits noticeable advantages over traditional prediction approaches, so it is increasingly being applied. The essence of the grey theory is to weaken the randomness of the original random sequence by the method of generating information so that the original data sequence can be transformed into a new sequence that is easy to model. The model of the original grey sequence can be obtained by performing the corresponding inverse generation on the model built according to the new sequence.

Figure 5
figure 5

Concept diagram of the grey system.

Preprocessing the original data is required when establishing a grey model. In this study, the accumulating generation operator (AGO) is used for preprocessing, which sequentially accumulates the data in the original sequence to yield a new sequence. Its mathematical form is denoted as

$$X^{(1)} = AGO \bullet X^{(0)}$$
(7)

in which,

$$X^{(1)} = (x^{(1)} (1),x^{(1)} (2), \cdots ,x^{(1)} (n))$$
(8)
$$x^{(1)} (k) = \sum\limits_{m = 1}^{k} {x^{(0)} (m)} { (}k = 1,2, \cdots ,n)$$
(9)

By constructing data matrixs and vectors, the model parameters can be solved using the least square method. Assuming \(x_{1}^{(0)} = (x_{1}^{(0)} (1),x_{1}^{(0)} (2), \cdots ,x_{1}^{(0)} (n))\) is a system characteristic data sequence, and

$$\left\{ \begin{gathered} x_{2}^{(0)} = (x_{2}^{(0)} (1),x_{2}^{(0)} (2), \cdots ,x_{2}^{(0)} (n)) \hfill \\ x_{3}^{(0)} = (x_{3}^{(0)} (1),x_{3}^{(0)} (2), \cdots ,x_{3}^{(0)} (n)) \hfill \\ \cdots \hfill \\ x_{N}^{(0)} = (x_{N}^{(0)} (1),x_{N}^{(0)} (2), \cdots ,x_{N}^{(0)} (n)) \hfill \\ \end{gathered} \right.$$
(10)

is the correlation factor sequence, and \(x_{i}^{(1)}\) (\(i = 1,2, \cdots ,N\)) is the 1-AGO sequence of \(x_{i}^{(0)}\), thus

$$x_{1}^{(1)} (k) = a + b_{2} x_{2}^{(1)} (k) + b_{3} x_{3}^{(1)} (k) + \cdots + b_{N} x_{N}^{(1)} (k)$$
(11)

is called the GM(0, N) model.

The GM(0, N) model does not contain derivatives, so it is a static model. It looks like a multivariate linear regression model, but it is essentially different from the general multivariate linear regression model. The general modeling of multiple linear regression is based on the original data sequence, while the modeling basis of GM(0, N) is the 1-AGO sequence of the original data.

Assuming that \(x_{i}^{(0)}\) and \(x_{i}^{(1)}\) meet the above definitions, let the parameter columns be

$$u = [a,b_{2} ,b_{3} , \cdots ,b_{N} ]^{T}$$
(12)
$$B = \left[ {\begin{array}{*{20}c} {x_{2}^{(1)} (2)} & {x_{3}^{(1)} (2)} & \cdots & {x_{N}^{(1)} (2)} \\ {x_{2}^{(1)} (3)} & {x_{3}^{(1)} (3)} & \cdots & {x_{N}^{(1)} (3)} \\ \vdots & \vdots & \ddots & \vdots \\ {x_{2}^{(1)} (n)} & {x_{3}^{(1)} (n)} & \cdots & {x_{N}^{(1)} (n)} \\ \end{array} } \right]$$
(13)
$$Y = \left[ {\begin{array}{*{20}c} {x_{1}^{(1)} (2)} \\ {x_{1}^{(1)} (3)} \\ {} \\ {x_{1}^{(1)} (n)} \\ \end{array} } \right]$$
(14)

The residual vector is

$$E = Y - Bu$$
(15)

When \(E^{T} E \to \min\), the model can fit the AGO sequence well. Define the objective function as

$$J = E^{T} E = (Y - Bu)^{T} (Y - Bu)$$
(16)

Expanding Eq. (16) yields:

$$J = Y^{T} Y - 2(Bu)^{T} Y + u^{T} B^{T} Bu$$
(17)

The minimum value of the objective function is at the point where the derivative of the parameter column u is zero, so there is:

$$\nabla J = 2B^{T} Bu - 2B^{T} Y = 0$$
(18)

According to the least square technique, it can be estimated that:

$$u = (B^{T} B)^{ - 1} B^{T} Y$$
(19)

Hence, the GM(0, N) model of the AGO sequence is obtained. By performing a reverse generation on this model, the model of the original sequence can be derived. If the regularity of the sequence \(x^{(1)}\) is not strong, a 2-AGO sequence \(x^{(2)}\) can be generated until a \(\xi\)-AGO sequence \(x^{(\xi )}\). In this case, the established model can be inversely generated \(\xi\) times to determine the original sequence model.

Based on the measured in-situ stress data of each measuring point, the σH, σh, and σv sequences and the depth z sequences of each measuring point are respectively combined as known grey sequences to establish the GM(0, 1) model:

$$\sigma_{i} = a + bz$$
(20)

where σi represents σH, σh, or σv; and a is the undetermined constant of the model.

The accuracy of the model is a true reflection of the correctness and practicability of the model prediction. The relative error size test method is employed in this study to check whether the model is reasonable. This method is an intuitive and point-by-point comparison arithmetic test method, which compares the predicted data with the actual data to observe whether the relative error meets the practical requirements.

Assuming the prediction result of the GM(0, 1) model is \(\hat{x}_{1}^{(0)} = (\hat{x}_{1}^{(0)} (1),\hat{x}_{1}^{(0)} (2), \cdots ,\hat{x}_{1}^{(0)} (n))\), the residual can be calculated to obtain:

$$E = (e(1),e(2), \cdots ,e(n)) = x_{1}^{(0)} - \hat{x}_{1}^{(0)}$$
(21)

where \(e(k) = x_{1}^{(0)} (k) - \hat{x}_{1}^{(0)} (k) \, (k = 1,2, \cdots ,n)\).

The relative error is:

$$rel(k) = \frac{e(k)}{{x_{1}^{(0)} (k)}} \times 100\% ,k = 1,2, \cdots ,n$$
(22)

The average relative error is:

$$rel = \frac{1}{n}\sum\limits_{k = 1}^{n} {\left| {rel(k)} \right|}$$
(23)

The accuracy of the GM(0, 1) model is given by

$$p^{o} = (1 - rel) \times 100\%$$
(24)

Successful modeling generally requires \(p^{o} > 80\%\), preferably \(p^{o} > 90\%\).

Based on the measured in-situ stress data (Table 1), seven mines, namely Sanshandao gold mine, Xincheng gold mine, Linglong gold mine, Pingdingshan No. 1 mine, Pingdingshan No. 10 mine, Meishan iron mine, and Beiminghe iron mine, have a large amount of stress data, which are selected as training samples. Note that for the stress data at the same depth in a mine, one set of data is chosen as the representative data. Grey prediction model GM(0, 1) is used to predict the magnitudes of σH, σh, and σv at each measuring point in these seven mines, and the results are listed in Table 2 and the intuitive comparison between the predicted values and measured values is plotted in Fig. 6. In the seven mines, the mean relative errors of the predicted σH values are 9.4446%, 13.6101%, 6.6579%, 11.3630%, 10.6638%, 8.4339%, and 8.9955%, respectively; the mean relative errors of the predicted σh values are 13.5686%, 29.7263%, 6.9395%, 12.4404%, 14.2287%, 13.1935%, and 11.7173%, respectively; and the mean relative errors of the predicted σv values are 6.1479%, 12.1361%, 10.7058%, 13.3870%, 21.9825%, 10.8826%, and 11.3876%, respectively. Obviously, the mean relative errors of the prediction results of GM(0, 1) for the three principal stresses all reach 6–30%, and the accuracy of the model fails to meet the requirements, which means that the reliability of the prediction results is not high. Thus, the BPNN with strong prediction and fitting ability is considered to achieve higher prediction accuracy.

Table 2 Predicted stress values and mean relative errors using GM(0, 1) in the seven mines.
Figure 6
figure 6

Comparison between the predicted values using the GM model and measured values of the three principal stresses in the Sanshandao gold mine (a), Xincheng gold mine (b), Linglong gold mine (c), Pingdingshan No. 1 mine (d), Pingdingshan No. 10 mine (e), Meishan iron mine (f), and Beiminghe iron mine (g).

Back propagation neural network

Back propagation neural network (BPNN), designed by Rumelhart et al.51 in 1986, is a multi-layer feedforward neural network based on error back propagation. It is the most popular type of neural network model and can achieve arbitrary nonlinear mapping between the input and output, which makes it widely used in fields such as function recognition and pattern recognition. The BPNN is a typical multi-layer network, which is divided into an input layer, hidden layer, and output layer. The connection mode between layers is that the input of the layer i is only connected with the output of layer i-1, and each neuron accepts the input of the previous layer and outputs it to the next layer without feedback. The input layer of the neural network takes values corresponding to the attributes of each training sample and assigns them to the input layer units. The outputs of these units are combined with corresponding weights and transmitted to the hidden layer units simultaneously. The weighted output of the hidden layer is then passed as input to the next hidden layer, and the weighted output of the last hidden layer node is passed to the output layer unit, which ultimately gives the predicted output of the corresponding sample. As long as there are enough hidden layers in the middle, the nonlinear threshold function in the multilayer forward network can fully approximate any function.

The BPNN algorithm is the training algorithm for acyclic multilevel networks. Its basic idea is a gradient descent method, which uses the gradient search technique to minimize the mean square error between the actual output value and the expected output value of the network. The network learning process actually includes two stages: forward propagation and backward propagation, as illustrated in Fig. 7. In the forward propagation process, the input information is processed layer by layer from the input layer through the hidden layer and transmitted to the output layer, and the state of each layer of neurons only affects the state of the next layer of neurons. If the expected output cannot be obtained at the output layer, the error signal will be returned according to the original connection path, and the goal of minimizing the error signal will be achieved by modifying the weights of neurons at each layer.

Figure 7
figure 7

BPNN infrastructure.

Here, the implementation steps of a three-layer BPNN learning algorithm are provided, and so on for multi-layer situations.

  1. (1)

    Initialization. Assign random numbers to all neurons in the network in the interval of (− 1, 1).

  2. (2)

    Given training data samples: input vector \(X = (x_{1} ,x_{2} , \cdots ,x_{n} )\) and expected output vector \(Y = (y_{1} ,y_{2} , \cdots ,y_{m} )\).

  3. (3)

    Calculate the output layer by layer starting from the input layer.

All neurons in the input layer: the input is \(x_{i}\), the output is \(o_{i} = x_{i} \, (i = 1,2, \cdots ,n)\), where n is the number of neurons in the input layer.

All neurons in the hidden layer: the input is \(x_{k} = \sum\nolimits_{i = 1}^{l} {w_{ik} o_{i} + } b_{k}\), the output is \(o_{k} = f(x_{k} ) \, (k = 1,2, \cdots ,l)\), where l is the number of neurons in the hidden layer, \(w_{ik}\) represents the weight of the ith neuron in the input layer to the kth neuron in the hidden layer, and \(b_{k}\) denotes the offset of the kth neuron in the hidden layer.

All neurons in the output layer: the input is \(x_{j} = \sum\nolimits_{k = 1}^{l} {w_{kj} o_{k} + } b_{j}\), the output is \(o_{j} = g(x_{j} ) \, (j = 1,2, \cdots ,m)\), where m is the number of neurons in the output layer, \(w_{kj}\) represents the weight from the kth neuron in the hidden layer to the jth neuron in the output layer, and \(b_{j}\) represents the offset of the jth neuron in the output layer.

The functions \(f( \cdot )\) and \(g( \cdot )\) are usually called the excitation function and are typically the following Sigmoid function:

$$f(x) = \frac{1}{{1 + e^{ - x} }}$$
(25)
  1. (4)

    Error back propagation, adjusting weights layer by layer from the output node.

Define the network error function:

$$E = \frac{1}{2}\sum\limits_{j = 1}^{m} {(o_{j} - y_{j} )^{2} }$$
(26)

Weight adjustment using gradient algorithm:

$$w(t + 1) = w(t) - \eta \nabla E(t)$$
(27)

where \(- \eta \nabla E(t)\) is the opposite direction of the gradient change of the error function during t training.

The adjustment amount of the weight \(w_{kj}\) from the output layer to the hidden layer is:

$$w_{kj} (t + 1) = w_{kj} (t) + \eta (y_{j} - o_{j} )o_{j} (1 - o_{j} )o_{k}$$
(28)

where \(\eta\) is the learning rate, which is a positive constant, and \(y_{j} { (}j = 1,2, \cdots ,m{)}\) is the expected output.

The adjustment amount of the weight \(w_{ik}\) from the hidden layer to the input layer is:

$$w_{ik} (t + 1) = w_{ik} (t) + \eta o_{k} (1 - o_{k} )o_{i} \sum\limits_{j = 1}^{m} {\delta_{j} w_{kj} }$$
(29)

where \(\delta_{j} = (y_{j} - o_{j} )o_{j} (1 - o_{j} )\), which represents the error of the jth neuron in the output layer.

  1. (5)

    After the weight adjustment is completed, return to step (3) and recalculate until the error meets the requirements or the termination condition is reached.

    The BPNN can realize highly nonlinear mapping from N-dimensional space to M-dimensional space, which can better fit data sequences, deal with complex nonlinear problems, and have certain generalization abilities. However, the BPNN has many problems to be solved urgently, such as excessive iterations, slow convergence speed, and easy falling into local minimum. Given these problems, combined with the improved methods already proposed, the following new improvements have been made in this study:

  2. (6)

    Adjust the activation function. The advantage of choosing the ReLU (Eq. (30)) function as the activation function is that neurons only need to judge whether the input is greater than 0, which is more efficient and faster in calculation. Compared to the saturation at both ends of the Sigmoid function, the ReLU function is a left saturation function, and its derivative is 1 when \(x > 0\), which alleviates the problem of gradient elimination of neural networks to some extent and accelerates the convergence speed of gradient descent.

    $${\text{ReLU(}}x) = \left\{ {\begin{array}{*{20}c} {x{\text{ if }}x > 0} \\ {0{\text{ if }}x \le 0} \\ \end{array} } \right. = \max (0,x)$$
    (30)
  3. (7)

    Adaptive moment estimation (Adam) algorithm is selected to optimize and adjust the weights between networks. The gradient descent algorithm always maintains a single learning rate to update all weights, and the learning rate will not change with the progress of training. The Adam algorithm designs independent adaptive learning rates for different parameters by calculating the first-order moment estimation and the second-order moment estimation of gradients, thereby better balancing the convergence speed and stability. Meanwhile, the adaptive learning rate mechanism and second-order moment estimation of the Adam algorithm are helpful in preventing gradient explosion and vanishing problems. The main calculation process of this algorithm is as follows:

  4. (8)

    Initialization: Given the learning rate \(\eta\), smoothing constants \(\beta_{1}\) and \(\beta_{2}\), and initial network weights, i.e., \(m_{0} = 0\) and \(v_{0} = 0\).

  5. (9)

    Calculating gradient \(g_{t}\): the first-order moment estimation of the gradient \(m_{t} = \beta_{1} m_{t - 1} + (1 - \beta_{1} )g_{t}\) and the second-order moment estimation of the gradient \(v_{t} = \beta_{2} v_{t - 1} + (1 - \beta_{2} )g_{t}\), and correct deviations \(\hat{m}_{t} = \frac{{m_{t} }}{{1 - \beta_{1}^{t} }}\) and \(\hat{v}_{t} = \frac{{v_{t} }}{{1 - \beta_{2}^{t} }}\).

  6. (10)

    Updating parameters: \(\theta_{t} = \theta_{t - 1} - \eta \frac{{\hat{m}_{t} }}{{\sqrt {\hat{v}_{t} } + \varepsilon }}\).

According to the measured in-situ stress data in the seven mines, a single-input single-output BPNN model is adopted. That is, the number of neurons in the input layer is 1, the number of neurons in both hidden layers is 10, and the number of neurons in the output layer is 1, and the learning rate is 0.02. After the training, the predicted values and average relative errors of the three principle tresses using the model are presented in Table 3 and their comparison is shown in Fig. 8. In the seven mines, the mean relative errors of the predicted σH values are 2.9863%, 1.8531%, 0.6831%, 5.2761%, 0.4801%, 0.9130%, and 0.0169%, respectively; the mean relative errors of the predicted σh values are 5.5816%, 1.1250%, 1.1399%, 2.5183%, 0.3028%, 7.2294%, and 0.1274%, respectively; and the mean relative errors of the predicted σv values are 3.6198%, 1.0315%, 3.2713%, 3.2308%, 1.6501%, 0.5832%, and 0.0125%, respectively. It can be observed that the average relative errors of the three principal stresses are all less than 10% (0.0125–7.2294%), and the model accuracy meets the requirements and has sufficient credibility.

Table 3 Predicted stress values and mean relative errors based on the BPNN in the seven mines.
Figure 8
figure 8

Comparison between the predicted values using the BPNN model and measured values of the three principal stresses in the Sanshandao gold mine (a), Xincheng gold mine (b), Linglong gold mine (c), Pingdingshan No. 1 mine (d), Pingdingshan No. 10 mine (e), Meishan iron mine (f), and Beiminghe iron mine (g).

Embedded grey neural network combination model

A single prediction model exhibits some limitations to some extent. For the grey model, the larger the data discretization program (i.e., the data gray level), the worse the prediction accuracy. Moreover, the fitting sequence of the model is a non-homogeneous exponential sequence, and it is necessary for the original data to have a clear exponential law to predict the results accurately enough. In addition, the BPNN algorithm is essentially a gradient algorithm for nonlinear optimization problems, which inevitably has local minima and converges slowly in the flat area of the error surface. Furthermore, excessive training times will lead to a decrease in learning efficiency and slow convergence speed of learning algorithms, and the convergence speed is closely related to the selection of initial weights. The combined prediction model can avoid the problem that a single model makes it easy to lose information, reduce randomness, reflect the changing law of the system more comprehensively, and improve prediction accuracy. Based on the idea of combinatorial modeling, a new grey neural network prediction model is established by integrating the GM model with the BPNN model to better solve the model prediction problem.

The sequence generated by the accumulation in the grey prediction method grows monotonically, which weakens the randomness and fluctuation of the original data sequence. At the same time, the Sigmoid activation function commonly used in the BPNN also increases monotonically. Because of the monotonic increasing trend of accumulated data, taking the accumulated data sequence as the training sample of the BPNN, it is convenient for the BPNN to approximate the function, accelerate the convergence speed, and thus improve the prediction accuracy. Accordingly, an embedded grey neural network (GM–BPNN) model (Fig. 9) is adopted in this paper, that is, on the basis of the general BPNN, a grey layer is added to the front end to gray the input data and a whitening layer is added to the back end to restore the output information of the network, so as to get a definite output result.

Figure 9
figure 9

Schematic diagram of the embedded GM–BPNN model.

The grey layer generally accumulates the original data once or multiple times to generate new data, which weakens the randomness of the original data and then uses it as the training sample of the neural network. The whitening layer corresponds to the ashing layer, and the final prediction result can be obtained by performing one or more inverse generation processes.

Based on the proposed embedded GM–BPNN, 1-AGO is selected as the grey layer, the BPNN introduced earlier continues to be used, and the final prediction result can be generated by the operation of accumulation and subtraction. The predicted values and average relative errors of three principal stresses in the seven mines are shown in Table 4, and the comparison between the predicted values and measured values is illustrated in Fig. 10. In the seven mines, the mean relative errors of the predicted σH values are 2.0923%, 0.1959%, 0.0174%, 0.1045%, 0.2933%, 0.1239%, and 0.0003%, respectively; the mean relative errors of the predicted σh values are 4.8338%, 0.0477%, 0.9461%, 0.0023%, 0.2707%, 0.0217%, and 0.0001%, respectively; and the mean relative errors of the predicted σv values are 3.2194%, 0.0874%, 0.1986%, 2.7460%, 0.0197%, 0.1034%, and 0.0022%, respectively. The results indicate that the predicted values of this model are very close to the measured values.

Table 4 Predicted stress values and mean relative errors based on the embedded GM–BPNN model in the seven mines.
Figure 10
figure 10

Comparison between the predicted values using the embedded GM–BPNN model and measured values of the three principal stresses in the Sanshandao gold mine (a), Xincheng gold mine (b), Linglong gold mine (c), Pingdingshan No. 1 mine (d), Pingdingshan No. 10 mine (e), Meishan iron mine (f), and Beiminghe iron mine (g).

In addition, the mean relative error, a statistical indicator used to measure the error between predicted and actual values, is calculated. Its value is between 0 and 1, and the smaller the value, the higher the accuracy of the prediction. The obtained mean relative errors of the σH, σh, and σv predicted by the GM, BPNN, and embedded GM–BPNN models in the seven mines are plotted in Fig. 11. Apparently, the accuracy of the embedded GM–BPNN model in predicting the principal stress values is much better than that of a single BPNN model and a single GM, with mean relative errors of 0.0003–2.0923%, 0.0001–4.8338%, and 0.0022–3.2194%, respectively. Moreover, another commonly used indicator for evaluating predictive performance, namely, root mean square error, is employed to further measure the root mean square difference between the predicted value and the true value, representing the average degree of deviation between the predicted and true values. The smaller the root mean square error is, the more accurate the prediction result is. The calculated root mean square errors of the σH (a), σh (b), and σv (c) predicted by the GM, BPNN, and embedded GM–BPNN models in the seven mines are drawn in Fig. 12. It can be observed that compared with the other two models, the root mean square errors of the embedded GM–BPNN model in predicting the three principal stresses are the smallest, which are 0–0.9241 MPa, 0–0.8747 MPa, and 0–0.5390 MPa, respectively. The above findings imply that the embedded GM–BPNN model proposed in this study is correct in predicting the in-situ stress magnitudes, and the prediction results have relatively high accuracy, verifying the reliability and applicability of this model.

Figure 11
figure 11

Comparison of the mean relative errors of the σH (a), σh (b), and σv (c) predicted by the GM, BPNN, and embedded GM–BPNN models in the seven mines (SSDGM = Sanshandao gold mine, XCGG = Xincheng gold mine, LLGM = Linglong gold mine, PDS1M = Pingdingshan No. 1 mine, PDS10M = Pingdingshan No. 10 mine, MSIM = Meishan iron mine, and BMHIM = Beiminghe iron mine).

Figure 12
figure 12

Comparison of root mean square errors of the σH (a), σh (b), and σv (c) predicted by the GM, BPNN, and embedded GM–BPNN models in the seven mines (SSDGM = Sanshandao gold mine, XCGG = Xincheng gold mine, LLGM = Linglong gold mine, PDS1M = Pingdingshan No. 1 mine, PDS10M = Pingdingshan No. 10 mine, MSIM = Meishan iron mine, and BMHIM = Beiminghe iron mine).

Discussion

To significantly improve the design level and economic benefits and ensure the safety of the project, improving the accuracy of in-situ stress measurement is an important issue that must be studied at present. Although there have been dozens of measurement methods and instruments developed based on the principle of OC over the past decades, and some of them have been reported to have achieved complete success in actual measurements, it is impossible to compare the measured values with unknown quantities to determine the accuracy of the measurement results because the actual stress values are unknown16,36,52. There is no other choice but to evaluate the reliability and accuracy of the measurement results based solely on personal experience and subjective judgment. In fact, there are few objective evaluations of the working performance and measurement accuracy of various methods and instruments under actual rock conditions before. This evaluation can only be completed through laboratory experiments under the condition that the stress value is known. To achieve this, a series of OC tests can be conducted in the laboratory by simulating the actual in-situ stress and rock conditions. By comparing the stress value calculated based on the measured strain or deformation with the actually applied stress value, the working performance and measurement accuracy of the OC instrument under this test condition can be quantitatively analyzed and evaluated.

Accurate in-situ stress measurement results are the prerequisite for establishing an accurate in-situ stress prediction model. As mentioned earlier, in response to the problems existing in the conventional OC strain gauges, our team has proposed the concept of accurate in-situ stress measurement, made technical improvements in the measurement circuit and temperature compensation, and invented an improved complete temperature compensation hollow inclusion strain gauge, which greatly reduced the influence of temperature change on the measurement accuracy. Moreover, Li et al.53 conducted a comprehensive comparison and analysis of the magnitude and causes of errors in measurement circuits and temperature compensation between two types of hollow inclusion strain gauge probes currently used in China and pointed out that the 13-wire probe produces smaller errors in the circuit, while the 15-wire probe performs better in temperature compensation, and they provided a complete form of the correction formula for confining pressure calibration. The in-situ digital hollow inclusion strain gauge produced by Environmental Systems & Services Pty Ltd in Australia uses an in-situ AD conversion to eliminate the influence of attenuation in signal transmission54, but the data line needs to pass through the drill pipe water hole one by one, which interferes with the drilling work during measurement. Additionally, Bai et al.55 developed a deep borehole hollow inclusion strain gauge in-situ stress measuring instrument, which installs a cable-free micro-probe at a predetermined position inside the hole for measurement. However, the conventional temperature compensation method can not reduce the measurement error caused by large temperature changes in deep strata. With the development of deep exploration and excavation in the earth, the temperature disturbance during OC and the nonlinear behavior of rocks under high-stress levels increase the error of in-situ stress measurement, which further aggravates the demand for the high accuracy of in-situ stress measurement56. Consequently, the OC technique with hollow inclusion strain gauge needs to be designed from the aspects of strain gauge structure, strain gauge arrangement, the circuit board size, circuit wiring, and component selection and layout to improve the performance of the strain gauge in measurement accuracy, stability, and long-term, develop a new data acquisition system, and thus realize the front-end digital data acquisition of the hollow inclusion strain gauge.

To meet the requirements of accurate in-situ stress measurement, based on the OC technique with hollow inclusion strain gauge, Li et al.45 systematically studied the stability of the acquisition circuit, digitalization of complete temperature compensation technique, and adaptability of measuring equipment in a complex environment, and developed a front-end digital acquisition system (Fig. 13) with instantaneous acquisition and power interruption continuous acquisition functions. Moreover, based on the temperature self-compensation technique and double temperature compensation design, the application of the complete temperature compensation technique in front-end digital hollow inclusion strain gauge in-situ stress measurement has been achieved. The double temperature compensation algorithm combined with temperature calibration experiments realizes double temperature compensation for both the measurement circuit and the acquisition circuit. The temperature deviation in the sensor system and the data recording system was calculated by the double temperature compensation method. The indoor calibration experiments and field test data indicated that the correction of the original temperature compensation algorithm data by the double temperature compensation technique reached 15%. In addition, they used the confining pressure calibration method and a true three-dimensional in-situ stress test bench to conduct indoor tests to simulate field conditions, and the results showed that the data truly reflected the spatial stress changes. The digital hollow inclusion strain gauge OC technique with double temperature compensation algorithm exhibits good test accuracy and stability of acquisition system in both indoor and field tests, which represents the development direction of in-situ stress measurement technique in the future.

Figure 13
figure 13

Structure of the new improved front-end digital hollow inclusion strain gauge45.

In addition, GM, BPNN, and embedded GM–BPNN models are constructed separately to predict the magnitude of in-situ stress. According to the results obtained in the study, compared with the GM and BPNN models, the embedded GM–BPNN model produces the best results, that is, less sample size requirements and the best performance index (Table 4). When the embedded GM–BPNN model is used, more than 90% of the prediction results are considered to be close to the truth. The embedded GM–BPNN model significantly improves the accuracy of the GM(0, 1) model by correcting the error of the predicted value, and has the characteristics of high calculation accuracy and fast operation speed.

There are some similarities between grey theory and BPNN in information representation. The output of BPNN can approach a fixed value with a certain precision, but because of the error, the output will fluctuate around a certain value. According to the definition of grey number in grey theory, it can be considered that the output of BPNN is actually a grey number57. Thus, the BPNN itself contains the content of grey theory. Furthermore, there are also some differences and complementarities between them. The BPNN model can approximate a nonlinear function with arbitrary accuracy, while the GM is not suitable for approximating complex nonlinear functions, but it can better predict the overall trend of parameter changes. The cumulative generation in the GM is that the sequence shows a monotonic growth trend, which is more suitable for the BPNN model approximation. As a consequence, by combining the GM and BPNN models and learning from each other’s strong points, the predictive performance of the embedded GM–BPNN model is satisfactory. This has also been confirmed in this study.

It should be noted that the modeling mechanism of each prediction method is different. If a prediction model is considered completely unsuitable for prediction because of its large prediction error, some useful information may be lost. In fact, even a prediction model with a large prediction error, if it contains some independent information about the system when it is combined with a prediction model with a smaller prediction error, it is entirely possible to increase the prediction accuracy of the system. In this study, although the prediction error of the GM is relatively large, its small sample requirement and ability to process poor information make the GM have a broad application prospect, especially for the case of less in-situ stress measurement data. Consequently, in addition to combining GM with the BPNN model, it is also possible to try combining GM with other prediction techniques, such as support vector machines, prior knowledge, evidence theory, and chaos theory, which helps promote the development of the grey theory. This is also the research work that we will carry out in the near future. In addition, the embedded GM–BPNN model proposed in this study is only used to predict the OC stress data measured by our team. To further validate the wide applicability and practicability of this model, the in-situ stress data obtained by other measurement techniques (such as hydraulic fracturing, anelastic strain recovery, and acoustic emission) will be predicted and analyzed in the future. Based on this, the model will be continuously revised and optimized to improve the prediction accuracy.

Consent to participate

All authors have confirmed their participation.

Summary and conclusions

  1. (1)

    The concept of accurate in-situ stress measurement is proposed, the measurement circuit, temperature compensation, and calculation method are improved, and an improved complete temperature compensation hollow inclusion strain gauge is invented, which reduces the influence of temperature change on measurement accuracy. The adoption of the complete temperature compensation technology and the reasonable determination of rock elastic parameters that are suitable for the measured stress level based on the nonlinearity of the rock have significantly improved the accuracy of stress measurement. This measurement technique has been applied to in-situ stress measurements in many mines, and a large amount of valuable data has been determined. The measurement results indicate that the horizontal tectonic stress plays a dominant role in the stress field of those mining areas, the vertical principal stress magnitude is basically equal to or slightly less than the weight of overlying strata, and the magnitude of the maximum horizontal principal stress is around twice that of the vertical principal stress on average, and the three principal stresses increase approximately linearly with depth. The dominant direction of the maximum principal stress identified in each mine is in good agreement with the directions of various stress indicators previously derived from other different measurement methods in and around the mining area.

  2. (2)

    The mean relative errors of the prediction results of GM(0, 1) for the three principal stresses all reach 6–30%, and the accuracy of the model fails to meet the requirements. The average relative errors of the prediction results of the BPNN model are all less than 10% (0.0125–7.2294%), and the model accuracy meets the requirements and has sufficient credibility. Compared with the GM and BPNN models, the embedded GM–BPNN model produces the best prediction results. The accuracy of the embedded GM–BPNN model in predicting the principal stress values is much better than that of a single BPNN model and a single GM, with mean relative errors of 0.0003–2.0923%, 0.0001–4.8338%, and 0.0022–3.2194%, respectively. Moreover, compared with the other two models, the root mean square errors of the embedded GM–BPNN model in predicting the three principal stresses are the smallest, which are 0–0.9241 MPa, 0–0.8747 MPa, and 0–0.5390 MPa, respectively. Hence, the predicted values of the embedded GM–BPNN model are very close to the measured values, verifying the reliability and applicability of this model.

  3. (3)

    The embedded GM–BPNN model for predicting in-situ stress values established fully utilizes the characteristics of grey theory and BP neural network, which require a small sample size, weaken the randomness of the original data, and gradually approach the accuracy of the model, making it particularly suitable for situations with limited stress measurement data. To further validate the wide applicability and practicability of this model, future prediction and analysis of the stress data obtained by other measurement techniques (such as hydraulic fracturing, anelastic strain recovery, and acoustic emission) will be carried out, which will be beneficial for further revising and optimizing the model to improve the prediction accuracy. In addition, only the magnitude of stress is predicted and analyzed in this study, and another important component of stress tensor, i.e., stress direction, will be predicted in the future, striving to achieve complete stress tensor prediction. With the increase of stress data volume and the complexity of stress data in the future, the accuracy and calculation speed of the embedded GM–BPNN model to predict different types of stress data need to be further improved to obtain higher reliability and credibility of prediction results.