Introduction

Knowledge of the present-day crustal stress field is essential for the understanding of geodynamic processes as well as planning and managing the usage of the subsurface, such as geothermal energy extraction, stimulation of enhanced geothermal systems or fluid (re-)injection (Fuchs and Müller 2001; Gaucher et al. 2015; Zoback 2007). The contemporary 3-D stress state also provides the basis to assess the impact of induced stress changes in the subsurface which can lead to the reactivation of pre-existing faults (Altmann et al. 2014; Hakimhashemi et al. 2014b, a; Kwiatek et al. 2018; Müller et al. 2018; Segall and Fitzgerald 1998; Walsh and Zoback 2016), the generation of new fractures (Cornet 1986; Haimson and Cornet 2003) and subsidence due to long-term depletion (Denlinger and Bufe 1982; Mossop and Segall 1997; Segall et al. 1994; Segall and Fitzgerald 1998; van Wees et al. 2017).

In some cases, the occurrence of induced seismicity resulted in a decline in public acceptance and eventually in the termination of geothermal exploitation or other conventional subsurface applications. Prominent examples of failed geothermal projects due to induced seismicity include Basel 2006 (Deichmann and Giardini 2009) and Pohang 2017 (Grigoli et al. 2018), but induced seismicity also occurred at ongoing projects such as Unterhaching (Megies and Wassermann 2014), Landau (Grünthal 2014) and Poing (Seithel et al. 2019) in Germany, The Geysers in California (Majer and Peterson 2007), and the Cooper Basin in Australia (Baisch et al. 2006). Other examples of induced seismicity that affected commercial activities are the stoppage of gas production in the Groningen field (NOS: Nederlandse Omroep Stichting 2019) and repeated stoppages of hydraulic fracture stimulation conducted for shale gas production near Blackpool in the United Kingdom (Clarke et al. 2014; Hicks et al. 2019). To mitigate these kinds of induced hazards, knowledge of the 3-D stress state is required to estimate how far it is from a given failure criterion (Blöcher et al. 2018; Morris et al. 1996). The distance between the stress state and failure indicates how much stress change is permitted due to induced or natural processes before reactivation of pre-existing faults or creation of new fractures occurs (Morris et al. 1996; Schoenball et al. 2018; Walsh and Zoback 2016). Generally, various geomechanical parameters such as slip tendency, dilation tendency, fault reactivation potential and distance to failure, which all depend on the knowledge of the 3-D stress state, are being used to quantify seismic hazard on short and long temporal scales and its changes over time (Altmann et al. 2010; Fischer and Henk 2013; Henk 2009; Morris et al. 1996; Müller et al. 2018; Schoenball et al. 2010).

In most of the previous studies, the orientation of the stress tensor by means of the maximum horizontal stress \(S_\text {Hmax}\) has received extensive attention (Bell 1996b; Barton and Moos 2010; Tingay et al. 2005b; Zoback 2007) as it is a key parameter for fluid flow in fractured reservoirs (Barton et al. 1995; Finkbeiner et al. 1997; Sibson 1996), borehole stability (Hillis and Williams 1993; Moos et al. 2003; Zoback 2007) and hydraulic fracture stimulation (Bell 1996b; Seidle 2011). So far, only the orientation of \(S_\text {Hmax}\), and, where possible, the stress regime has been systematically compiled by the World Stress Map (WSM) project (Heidbach et al. 2018; Zoback 1992) and provided in the form of a public-domain database. The current information in the WSM database is a critical element in geodynamics, geo-engineering and petroleum geomechanics and is used for various applications related to fluid flow within the subsurface. It can also be used as a first qualitative indicator of the probability of reactivating faults. However, stress magnitude information is required when investigating questions related to stability and hazard mitigation strategies of induced seismicity (Gaucher et al. 2015; Schoenball et al. 2018; Shen et al. 2019a; Morris et al. 1996), but to the best of our knowledge, no comprehensive and open-access stress magnitude database has been published yet. Although there are some stress magnitude compilations on global and regional scales, they do not supply single stress magnitude values. Instead, stress gradients with depth are common, which can be deceptive as they depend on several assumptions such as the elastic properties of the lithology in which the measurements were conducted (e.g. Gunzburger and Cornet 2007; Warpinski and Teufel 1987, 1991; Warpinski 1989). Furthermore, the early stress magnitude compilations were often geo-engineering driven. In this application-oriented context, not only gradients but also conflations of principal stresses such as sums, means and ratios were used and therefore published.

First publications addressing wide-scale compilations of stress magnitudes include Hast (1967; 1969; 1973), starting out from Fennoscandia and later including data from Iceland and the Mont Blanc. Stephansson (1989) collected data from Fennoscandia in the Fennoscandian Rock Stress Data Base (FRSDB). Herget (1974) analysed stress data from Canada, Worotnicki and Denham (1976) data from Australia, Fellgett et al. (2018) from the UK, and Brown and Hoek (1978) and Breckels and Van Eekelen (1982) compared data from different parts of the world. Another global stress magnitude database was published by Ranalli and Chandler (1975), who also gathered some specified values of principal horizontal stress magnitudes. McGarr and Gay (1978) also reviewed specified stress magnitude data from southern Africa, North America, Australia and Iceland. Stacey and Wesseloo (1998a, 1998b) presented a collection of stress data from mining and civil engineering projects in southern Africa and even developed a group measurement grading. The database itself, however, is not publicly accessible. Bell et al. (1994) provided a then-current overview of stress information in the Western Canada Sedimentary Basin, mainly gained from fluid injection tests. More recently, Haug and Bell (2016) published an open-access Compilation of In Situ Stress Data from Alberta and Northeastern British Columbia, but the stress magnitude information is again only available as gradients, determined from the ratio of the stress magnitude to the depth. Another attempt to start a global compilation of discrete magnitude information was made by Zang et al. (2012), but without granting open access.

As a subset of the WSM database, Reiter et al. (2016) provided a stress map for Germany and adjacent regions with 753 data records for the \(S_\text {Hmax}\) orientation and very limited stress magnitude data. However, a systematic and public stress magnitude database does not exist for this region.

This paper presents the first comprehensive stress magnitude database for Germany and adjacent regions, consisting of 568 data records. We also introduce a quality ranking scheme for stress magnitude data to provide a framework for reliability assessment that can be used for practical applications such as the calibration of geomechanical-numerical models (e.g. Hergert et al. 2015; Reiter and Heidbach 2014). Our ambition is to establish a public database collecting and presenting stress magnitude data in an objective manner without the attempt of a quantitative interpretation. In our understanding, the overview of available data, provision of easy access and a data quality assessment according to defined criteria provide the basis for any further application and interpretation, which themselves are not part of the database.

In the following sections, we present the theoretical basics of crustal stress and its main stress magnitude indicators as a precondition for our database. Following this, we introduce the technical framework of the stress magnitude database and its quality ranking scheme. In the results, we present details of the German stress magnitude database. Finally, we discuss the potentials and limits of our presented database concept. The sustainable accessibility of the database is ensured through the repository of the GFZ Data Services under https://doi.org/10.5880/wsm.2020.004.

Stress state in the Earth’s crust

Basics of the stress tensor

Fig. 1
Explanation of the stress tensor. a The nine components of the stress tensor describe the stress state at an arbitrary point and enable calculation of the stress vector on any surface through that point. To describe the stress tensor components, an infinitely small cube with unit surfaces is used. b Due to the conservation of momentum (no rotation), the stress tensor is symmetric, and thus a coordinate system exists where shear stresses vanish along the faces of the cube. In this principal axis system, the remaining three stresses are the principal stresses. c Assuming that the vertical stress in the Earth’s crust \(S_\text {V} = \rho \cdot g \cdot z\) (g is gravitational acceleration, \(\rho\) is the rock density, z is the depth below surface) is a principal stress, \(S_\text {hmin}\) and \(S_\text {Hmax}\) are also principal stresses. This so-called reduced stress tensor is fully determined by four components: the \(S_\text {Hmax}\) orientation and the magnitudes of \(S_\text {V}\), \(S_\text {Hmax}\), and \(S_\text {hmin}\)

A stress tensor is a second rank tensor field defined at any point within a rock mass and can be described by a square matrix (Fig. 1a). The SI unit of stress is pascal (1 Pa = 1 N/\(\hbox {m}^2\)), although within industry reports the unit pounds per square inch (psi) is also used. In the following chapters, we use the general terms stress and stress state for the undisturbed stress state at a point. In text books, this is referred to as in situ stress state and less often also as virgin or natural stress state. When we use the term in situ, we refer to the original location of the rock as it was found in the subsurface.

Due to the symmetry of the stress tensor, only six out of nine tensor components are independent from each other (e.g. Jaeger et al. 2009; Schmitt et al. 2012). The coordinate system in which the off-diagonal components that represent the shear stresses vanish is called the principal axis system. The remaining three components are the principal stresses \(\sigma _1\), \(\sigma _2\) and \(\sigma _3\), where \(\sigma _1\) is the largest and \(\sigma _3\) is the smallest. Their orientations and magnitudes describe the stress state completely (Fig. 1b). Assuming an Andersonian state of stress (Anderson 1905), meaning that the vertical stress \(S_\text {V}\) is one of the three principal stresses (Fig. 1c), the orientation of this so-called reduced stress tensor is uniquely determined by the orientation of \(S_\text {Hmax}\). As \(S_\text {V}\) can usually be estimated from the thickness and bulk density of the overburden, the remaining unknowns are the magnitudes of the maximum (\(S_\text {Hmax}\)) and minimum (\(S_\text {hmin}\)) horizontal stresses. The relative magnitudes of the three principal stresses can be expressed by the stress regime, which does not necessarily coincide with the kinematically observed style of faulting, since re-activated pre-existing faults are not necessarily mechanically optimally oriented (Célérier 1995). Following Anderson (1905), the three stress regimes are normal faulting (\(S_\text {V}> S_\text {Hmax} > S_\text {hmin}\)), thrust faulting (\(S_\text {Hmax}> S_\text {hmin} > S_\text {V}\)) and strike–slip (\(S_\text {Hmax}> S_\text {V} > S_\text {hmin}\)).
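The three inequalities above can be written as a small classification routine. The following Python sketch is an illustration only: the function name and the handling of boundary cases (equal magnitudes are assigned to strike-slip here) are assumptions made for this example and are not part of the database.

```python
def stress_regime(s_v, s_hmax, s_hmin):
    """Classify the Andersonian stress regime from three stress magnitudes.

    All inputs must share one unit (e.g. MPa). Assumes that S_V is a
    principal stress, as in the reduced stress tensor.
    """
    if s_hmax < s_hmin:
        raise ValueError("S_Hmax must not be smaller than S_hmin")
    if s_v > s_hmax:
        return "normal faulting"   # S_V > S_Hmax > S_hmin
    if s_v < s_hmin:
        return "thrust faulting"   # S_Hmax > S_hmin > S_V
    # Remaining case S_Hmax >= S_V >= S_hmin; ties land here by choice.
    return "strike-slip"


# Hypothetical magnitudes in MPa, for illustration only.
print(stress_regime(s_v=50.0, s_hmax=40.0, s_hmin=30.0))
print(stress_regime(s_v=40.0, s_hmax=50.0, s_hmin=30.0))
print(stress_regime(s_v=30.0, s_hmax=50.0, s_hmin=40.0))
```

Note that the result describes the relative stress magnitudes only; as stated above, it need not coincide with the kinematically observed style of faulting.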

The concept of stress is only applicable at a scale in which the continuum mechanical framework is valid. This is when the volume under investigation is at least two orders of magnitude larger than the representative elementary volume (REV). The REV represents the minimum volume for which an equivalent continuum can be defined for the volume of the physical point (Zang and Stephansson 2010). It is not always possible to define an appropriate REV for the given problem: in a heterogeneous and/or fractured rock mass, for example, continuum mechanics and therefore also the concept of stress may be inappropriate. In practice, stress estimation should involve volumes larger than the REV (Hudson et al. 2003).

Perturbations of the stress field result from contrasts in density, stiffness and rock strength (Heidbach et al. 2018; Zoback 1992). These contrasts can result from both geological structures (folds and faults) and artificial interventions in the subsurface. The spatial scale these influences act on depends on the degree of mechanical contrast, the size of the structure and the orientation of the structure relative to the far-field stresses (Tingay et al. 2006; Rajabi et al. 2017a). In addition, fault activity can also be associated with stress changes. Fault slip releases elastic energy that can temporally affect the state of stress. The spatial scale of this perturbation depends on the magnitude of the earthquake (King et al. 1994; Tingay et al. 2006).

Strategies of stress magnitude estimation

Stress cannot be measured directly. Instead, components of the reduced stress tensor can be inferred from measurements of other quantities that are physically linked to stress. As the possibilities to infer stress magnitudes differ between the stress tensor components, a brief overview of the respective approaches is given below.

Estimating the vertical stress magnitude

As mentioned in the previous section, \(S_\text {V}\) can be estimated from the thickness z and bulk density \(\rho\) of the overburden combined with the gravitational acceleration g (e.g. Amadei and Stephansson 1997):

$$\begin{aligned} S_\text {V}(z) = g \int _{0}^{z} \rho (z')dz'. \end{aligned}$$

For borehole measurements, bulk density logs can be used to integrate the density with depth. In cases without logging data, bulk density values are usually assumed based on stratigraphic information and rock sample measurements (Tingay et al. 2003). Only if the topography or lateral density contrasts are very pronounced can the uneven loading cause a deviation from this conventional assumption. The load below a valley is then increased by parts from the surrounding higher density. Therefore, the validity of the assumption expressed in the above formula should be critically questioned in individual cases, especially in the case of shallow measurements in mountainous regions (Evans et al. 1989a; Figueiredo et al. 2014; Savage et al. 1985; Warpinski and Teufel 1991). The non-vertical stress components might be influenced as well, but if they are derived from correct measurements, the results are at most of reduced significance for certain applications. However, in cases of shallow depth and notable topography the assumption of an Andersonian state of stress is in general not valid.
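The integration of a bulk density log can be sketched as follows. This is a minimal Python example under stated assumptions: the log starts at the surface (depth 0), g is taken as a constant 9.81 m/s², the integral is approximated with the trapezoidal rule, and the function name and sample values are hypothetical.

```python
G = 9.81  # gravitational acceleration in m/s^2, assumed constant with depth


def vertical_stress(depths_m, densities_kg_m3, g=G):
    """Approximate S_V(z) = g * integral of rho(z') dz' with the trapezoidal rule.

    depths_m must start at 0 and increase strictly; returns S_V in Pa
    at every sample depth.
    """
    if len(depths_m) != len(densities_kg_m3):
        raise ValueError("depth and density samples must have equal length")
    if not depths_m or depths_m[0] != 0:
        raise ValueError("the log is assumed to start at the surface (z = 0)")
    s_v = [0.0]
    for i in range(1, len(depths_m)):
        dz = depths_m[i] - depths_m[i - 1]
        if dz <= 0:
            raise ValueError("depths must increase strictly")
        mean_rho = 0.5 * (densities_kg_m3[i] + densities_kg_m3[i - 1])
        s_v.append(s_v[-1] + g * mean_rho * dz)
    return s_v


# Hypothetical three-point density profile, for illustration only.
profile = vertical_stress([0.0, 1000.0, 2000.0], [2000.0, 2400.0, 2600.0])
print([round(value / 1e6, 3) for value in profile])  # S_V in MPa
```

Where no log reaches the surface, the shallow interval has to be filled with assumed densities, which adds to the uncertainty of the result.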

Estimating the horizontal stress magnitudes

In the reduced stress tensor concept, \(S_\text {hmin}\) is assumed to equal \(\sigma _3\) in case of normal fault and strike–slip regime. In a homogeneous rock volume, a new tensile fracture is generated orthogonal to the least principal stress \(\sigma _3\). The pressure needed to open such a fracture corresponds to the \(\sigma _3\) magnitude (Hubbert and Willis 1957). Thus, in a normal fault or strike–slip regime it is possible to estimate the magnitude of \(S_\text {hmin}\) by means of loading methods such as hydraulic fracturing, leak-off and mini-frac tests (Addis et al. 1998; Bell 1996a; Lee et al. 2004; Schmitt and Haimson 2017; White et al. 2002). Under thrust fault regime conditions, this concept does not apply, as \(\sigma _3\) is oriented vertically resulting in \(S_\text {V}\) instead of \(S_\text {hmin}\) being measured (Hubbert and Willis 1957).

The \(S_\text {Hmax}\) magnitude is most commonly derived from hydraulic fracturing using additional information on the fluid pressure. The estimation of the \(S_\text {Hmax}\) magnitude from loading tests includes many assumptions often involving large uncertainties (Vernik and Zoback 1992). Besides, the interpretation of borehole failure observations can also be used to constrain the \(S_\text {Hmax}\) magnitude, although requiring additional assumptions as well (Valley and Evans 2019; Vernik and Zoback 1992). A special case among the loading methods is the hydraulic testing of pre-existing fractures (HTPF; Cornet 1986), from which the complete stress tensor can be derived by inversion.

Another approach to derive the magnitudes of the principal stresses is to relieve specimens of rock from the in situ stress and observe the elastic reactions (e.g. Sjöberg et al. 2003). These relief methods in general also yield the complete stress tensor, from which the \(S_\text {Hmax}\) magnitude can be inferred. In the subsequent chapter, the different stress magnitude indicators are explained in more detail.

One approach that we will not consider in this paper or in the database is to derive the horizontal stresses from the overburden in conjunction with Poisson’s ratio (e.g. Avasthi et al. 2000). For this, it is assumed that the horizontal stresses are only determined by the elastic behaviour of the rock and the influence of the vertical load. This approximation is not valid because it completely disregards the influence of external stresses, not to mention the lack of knowledge of the actual Poisson’s ratio in the subsurface.
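For orientation, the approach referred to here is usually the uniaxial-strain (laterally confined) solution of linear elasticity. The form below is the textbook relation and is not taken from the cited source; \(\nu\) denotes Poisson’s ratio, and pore pressure as well as any tectonic contribution are neglected:

$$\begin{aligned} S_\text {Hmax} = S_\text {hmin} = \frac{\nu }{1-\nu }\, S_\text {V}. \end{aligned}$$

For \(\nu = 0.25\), both horizontal stresses would be predicted as one third of \(S_\text {V}\). Since \(\nu /(1-\nu ) \le 1\) for \(\nu \le 0.5\), the relation yields equal horizontal stresses that never exceed \(S_\text {V}\) and can therefore reproduce neither a strike–slip nor a thrust faulting regime.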

Overview of methods of stress magnitude estimation

In this chapter, we outline the methods of stress estimation relevant to our database, either because data records of this kind are already part of the database or because it seems likely that such records will be included in a global version of the stress magnitude database that is in preparation. Considering the indirectness of stress estimation, we avoid using terms like stress measurement technique but rather employ the term stress (magnitude) indicator referring to stress estimation methods. A graphical overview of the explained indicators is presented in Fig. 2. For further technical and physical details, extensive review literature and textbooks exist, such as Amadei and Stephansson (1997), Zoback (2007), Zang and Stephansson (2010), Schmitt et al. (2012) and Cornet (2015).

Fig. 2
Stress magnitude indicators. Overview of indicators relevant to the stress magnitude database of Germany and adjacent regions. HF hydraulic fracturing, HTPF hydraulic testing of pre-existing fractures, (X)LOT (extended) leak-off test, FIT formation integrity test, BO borehole breakout, \(\sigma _n\) normal stress, \(\sigma _3\) minimum principal stress

Loading methods

Loading methods such as hydraulic fracturing pressurize boreholes by fluid injection, building up a pressure against the stress in the rock surrounding the borehole. In this way, it is possible to infer the minimum principal stress magnitude \(\sigma _3\) (cf. Fig. 2). With the assumption of \(S_\text {V}\) being a principal stress but not being the minimum principal stress (as for strike–slip and normal faulting stress regimes), \(\sigma _3\) equals \(S_\text {hmin}\). Often, pressure test procedures are conducted for purposes of drilling safety, drilling process optimization and maintenance of borehole stability rather than for reliably deriving stress magnitude data. Sometimes expert elicitation is used to interpret attributes of the drilling process in order to infer stress magnitude information in a non-standardized manner, which is referred to as implicit drilling fluid pressure indicators within this publication. The volume for which the test provides valid data is directly linked to the test duration and the injected volume.

In general, classical interpretations of loading tests rely on the borehole axis being parallel to one of the principal stresses. Excessive deviation invalidates the classical method of interpretation of test results (Haimson and Cornet 2003; Schmitt and Haimson 2017). Therefore, the boreholes used for classical approaches of stress determination through loading tests should be vertical or at least subvertical, or more generally, aligned with a principal stress axis.

Leak-off tests, extended leak-off tests and formation integrity tests

Leak-off tests (LOTs) as well as formation integrity tests (FITs) are common practice in the hydrocarbon industry to estimate the upper limit of the mud weight that can be used during drilling without fracturing the wellbore wall. As illustrated in Fig. 3b, the tests are executed in an open hole of several metres length beneath the casing shoe (Addis et al. 1998; White et al. 2002).

Fig. 3
Hydraulic pressure curves. a Schematic pressure curve of fluid injection tests (after White et al. 2002). FIT marks the stage in which the pumping is ceased in case of formation integrity tests, LOT marks the corresponding stage for leak-off tests. The continued curve is valid for hydraulic fracturing, mini-fracs and extended leak-off tests (XLOT). The pressure values to be picked are abbreviated as follows: LOP leak-off pressure, FBP/\(P_\text {b}\) formation breakdown pressure, FPP formation propagation pressure, ISIP/\(P_\text {si}\) instantaneous shut-in pressure, FCP fracture closure pressure, \(P_\text {r}\) reopening pressure. \(P_0\) is the pore pressure prior to pumping. b Schematic setup of (X)LOTs and FITs at the uncased bottom section of a borehole. c Schematic setup of hydraulic fracturing and mini-frac procedures in a packed borehole section

In the course of an LOT, the mud pressure is increased until the pressure build-up deviates from a linear trend, indicating fluid leakage into the rock. This is interpreted to result from the creation of a small tensile fracture (Bell 1996a). However, observations indicate that shear failure could also be initiated by increasing the wellbore pressure, particularly in active thrust belts (Chan et al. 2014; Couzens-Schultz and Chan 2010; Zhang et al. 2011). For our compilation we keep the conventional assumption of tensile failure creation. The pressure when leak-off occurs (leak-off pressure, LOP) is identified based on the shape of the pressure curve (White et al. 2002).

XLOTs are more comprehensive and longer (extended) leak-off tests in which pumping is continued beyond leak-off, and which are primarily conducted to obtain a fracture closure pressure. An XLOT consists of at least one complete cycle of leak-off, formation breakdown, fracture propagation, shut-in and fracture closure (Addis et al. 1998; Bell 1996a; Kunze and Steiger 1991; Li et al. 2009; White et al. 2002). Figure 3a gives a schematic illustration of an XLOT pressure curve, marking also the stage at which simple LOTs are stopped. Depending on the type of the test (LOT or XLOT), leak-off pressure (LOP), fracture propagation pressure (FPP), instantaneous shut-in pressure (ISIP) or fracture closure pressure (FCP) are used to estimate the magnitude of \(S_\text {hmin}\) (see Fig. 3a). In general, the LOP is least reliable for estimating \(S_\text {hmin}\), and tends to slightly over-estimate \(S_\text {hmin}\) (Bell 1996a; Breckels and Van Eekelen 1982). However, when XLOT data is available, both FCP and ISIP can be used to provide more reliable information for the calculation of \(S_\text {hmin}\) (Addis et al. 1998; Enever et al. 1996; White et al. 2002). In particular, the FCP values from subsequent cycles are more reliable than that of the first cycle, because in the second or third cycle the effect of tensile rock strength has been removed; they hence provide a more accurate estimate of the \(S_\text {hmin}\) magnitude (Bell 1996a). Traditional approaches to determining the FCP are the double tangent method (e.g. Enever and Chopra 1986) and square root time analysis (Guo et al. 1993b). It has become more common to use a G-function analysis (Castillo 1987) or some other time-based derivation method. Which strategy is appropriate depends largely on the permeability of the tested formation (Schmitt and Haimson 2017, who also provide a more comprehensive overview of strategies used to extract \(\sigma _3\) from pressure curves). Different methods can generate different values from the same initial data. However, White et al. (2002) and Zoback et al. (2003) stated that there is little difference between LOP, FPP and ISIP and that all can serve as an approximate value for \(\sigma _3\) or \(S_\text {hmin}\). Raaen et al. (2006) contradict these assertions, at least if high precision is required. It is common practice to consider the LOP as an upper bound of \(S_\text {hmin}\). As \(\sigma _3\) corresponds to the pressure needed to reopen a pre-existing fracture, LOTs can only yield a raw estimate which is not adjusted regarding surpassed rock strength. A thorough comparison of LOTs and XLOTs is provided by Addis et al. (1998). They conclude that an XLOT provides far superior data compared to that obtained from an LOT, and recommend XLOTs for stress magnitude estimation.

If the fluid injection is stopped prior to the LOP being reached (cf. Fig. 3a), it is called an FIT or limit test (Zoback 2007). The original purpose of FITs is to test whether the wellbore can sustain the stresses expected during drilling and production. FIT magnitudes yield most likely only lower bounds of \(\sigma _3\), except for very high tensile strengths, since the test is halted before a fracture is initiated. As the uncertainties in the derived stress determination are large, FITs should only be used for a lower boundary of the \(\sigma _3\) magnitude when no other information is available.

Note that FIT is different from DFIT (diagnostic fracture injection test), which we will discuss in the section on Mini-frac Tests.

Hydraulic fracturing tests

Hydraulic fracturing (HF) tests involve the sealing of a borehole section by packers (Haimson and Cornet 2003; Schmitt and Haimson 2017), as illustrated in Fig. 3c. The fluid pressure is increased until leakage occurs, and the rock stresses are derived from pressure curves, from which characteristic values have to be picked (Fig. 3a). These values are namely the breakdown pressure \(P_\text {b}\), the reopening pressure \(P_\text {r}\) and the shut-in pressure \(P_\text {si}\). \(P_\text {b}\) is defined analogously to FBP and \(P_\text {si}\) analogously to ISIP; only the prevalent nomenclature varies in the literature. For an estimate of the \(\sigma _3\) magnitude under the assumption of initially intact rock, \(P_\text {si}\) as well as FCP are used. \(P_\text {si}\) is the pressure immediately after shut-in and higher than the FCP. It is considered as an upper bound for the \(\sigma _3\) magnitude (English et al. 2017). Although it is common practice to directly equate \(P_\text {si}\) with \(\sigma _3\) (e.g. Haimson and Cornet 2003), it is recommended to use FCP as an estimate of \(\sigma _3\) because FCP is the pressure counteracting fracture closure and thus equal to or slightly lower than \(\sigma _3\) (Schmitt and Haimson 2017). In case of a thrust faulting regime, \(\sigma _3\) corresponds to \(S_\text {V}\), so that the test only measures the overburden, which can be estimated in a simpler and cheaper way by means of density integration over depth.

The determination of the \(S_\text {Hmax}\) magnitude is even more debated. Following the pioneering work by Hubbert and Willis (1957), several authors such as Scheidegger (1962), Kehle (1964), Haimson and Fairhurst (1967) and Fairhurst (1964) further developed the concept of hydraulic fracturing to infer the \(S_\text {hmin}\) and \(S_\text {Hmax}\) magnitudes from \(P_\text {si}\), \(P_\text {r}\), \(P_\text {b}\), \(P_\text {0}\) and the tensile strength under the assumption of intact, homogeneous and elastic rock. Since then, the concept has been further developed, questioned or expanded by various authors. The aspects considered include the influence of fracture fluid viscosity and injection rate (Guo et al. 1993a), the identification and interpretation of \(P_\text {r}\) (Bredehoeft et al. 1976; Ratigan 1992; Rutqvist et al. 2000), differences between succeeding injection cycles (Bredehoeft et al. 1976; Hickman and Zoback 1983; Rutqvist et al. 2000), integration of Biot’s poroelastic theory (Haimson 1968; Schmitt and Zoback 1989), and fracture mechanics (Abou-Sayed et al. 1978; Rummel 1987).
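To make the role of the picked pressures concrete, the following Python sketch evaluates the classical textbook relations for a vertical borehole in intact, homogeneous, elastic and impermeable rock. The relations are given here in their commonly quoted form, \(P_\text {b} = 3 S_\text {hmin} - S_\text {Hmax} - P_0 + T\) and, for the reopening variant, \(T = P_\text {b} - P_\text {r}\); they are not spelled out in this paper, the function names are hypothetical, and the refinements listed above show that the result should be read as a first-order estimate only.

```python
def shmax_from_breakdown(p_si, p_b, p_0, tensile_strength):
    """Classical estimate: S_Hmax = 3*S_hmin - P_b - P_0 + T, with S_hmin = P_si.

    All pressures and the tensile strength T share one unit (e.g. MPa).
    T must come from an independent source such as laboratory testing.
    """
    s_hmin = p_si
    return 3.0 * s_hmin - p_b - p_0 + tensile_strength


def shmax_from_reopening(p_si, p_r, p_0):
    """Reopening variant: S_Hmax = 3*S_hmin - P_r - P_0 (fracture has no strength)."""
    s_hmin = p_si
    return 3.0 * s_hmin - p_r - p_0


# Hypothetical picks in MPa, for illustration only.
p_si, p_b, p_r, p_0 = 20.0, 28.0, 23.0, 10.0
tensile_strength = p_b - p_r  # in situ estimate of T from two cycles
print(shmax_from_breakdown(p_si, p_b, p_0, tensile_strength))
print(shmax_from_reopening(p_si, p_r, p_0))
```

Both routes give the same value here only because the tensile strength was set to \(P_\text {b} - P_\text {r}\); with a laboratory-derived strength the two estimates generally differ, which is one expression of the uncertainty discussed in the text.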

Mini-frac tests

In contrast to HF tests, mini-fracs are only short-duration fracturing operations which are performed to propagate small fractures in reservoirs, e.g. as a pre-treatment for chemical-enriched massive fracs. Since only a small volume of water is injected, it has to be taken into account that only small rock volumes are involved. Besides, mini-fracs are often run in long-accessed reservoirs with accordingly lowered fluid pressures, and thus may not reflect the undisturbed stress state (Bell 2006). However, they explicitly serve stress magnitude estimation and are nowadays typically performed with extremely precise downhole gauges.

In the petroleum industry, the acronym DFIT (diagnostic fracture injection test) has evolved to refer to virtually any test performed in which stresses are estimated regardless of procedure or geometry (Schmitt and Haimson 2017). Sometimes this term is also used synonymously to mini-frac test, or referred to as mini fall-off, injection fall-off test or fracture calibration test (Wang and Sharma 2017). They are especially designed for unconventional hydrocarbon exploration and include extensive and densely time-sampled pressure monitoring after shut-in. However, DFITs usually use only a single pressurization cycle and are often carried out through perforated casing. In our compilation no DFIT record is included at the time of this publication, but for future expansions of the database, one has to note carefully how an individual test was actually carried out to conclude which quality shall be assigned to a data record in question. If a DFIT is not further defined in terms of procedure and geometry, it may be regarded analogously to unspecified drilling fluid pressure indicators (cf. Table 2).

Hydraulic testing of pre-existing fractures

An alternative approach to breaking intact rock for stress magnitude estimation is the HTPF method (Cornet 1986). To solve the given inverse problem, fractures with various orientations are specifically opened in several tests. As a single test reflects the normal stress on the investigated fracture, a best fit solution of the 3-D stress tensor can be inferred from at least six tests on different, non-parallel fractures. Additional tests are recommended to better address uncertainties. The method is applicable to all borehole orientations, and it is also independent of pore pressure effects and material property determination. A clear advantage compared to classic HF is that the full 3-D stress tensor can be determined. Beyond that, it is possible to combine HF and HTPF when the borehole is vertical. In such cases, the \(S_\text {hmin}\) magnitude can be obtained from the HF test, while three to four HTPF tests are sufficient to constrain the magnitudes of \(S_\text {Hmax}\) and \(S_\text {V}\), without any consideration of either pore pressure or tensile strength (Haimson and Cornet 2003).
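The inverse problem can be illustrated with a strongly simplified sketch. Each test constrains the normal stress on a fracture with unit normal \(n\), \(\sigma _n = n^\text {T} \sigma \, n\), which is linear in the six independent tensor components. The Python example below assumes a spatially uniform stress tensor, error-free picks and a plain least-squares fit via the normal equations; published HTPF inversions are more elaborate (for example regarding depth dependence and uncertainties). All names and the synthetic data are hypothetical.

```python
import math


def unit(vector):
    """Scale a 3-component vector to unit length."""
    length = math.sqrt(sum(c * c for c in vector))
    if length == 0.0:
        raise ValueError("fracture normal must be non-zero")
    return [c / length for c in vector]


def design_row(normal):
    """Coefficients of (s11, s22, s33, s12, s13, s23) in sigma_n = n^T S n."""
    n1, n2, n3 = unit(normal)
    return [n1 * n1, n2 * n2, n3 * n3, 2 * n1 * n2, 2 * n1 * n3, 2 * n2 * n3]


def solve_linear(matrix, rhs):
    """Solve a square system by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    m = [row[:] + [value] for row, value in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("fracture orientations do not constrain the tensor")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        known = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - known) / m[i][i]
    return x


def invert_htpf(normals, normal_stresses):
    """Least-squares stress tensor (s11, s22, s33, s12, s13, s23) from >= 6 tests."""
    if len(normals) != len(normal_stresses) or len(normals) < 6:
        raise ValueError("at least six tests with matching data are required")
    rows = [design_row(n) for n in normals]
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(6)] for i in range(6)]
    atb = [sum(r[i] * s for r, s in zip(rows, normal_stresses)) for i in range(6)]
    return solve_linear(ata, atb)


# Synthetic check: forward-model seven tests from a known tensor (MPa).
normals = [(1, 0, 0), (0, 1, 0), (0, 0, 1), (1, 1, 0), (1, 0, 1), (0, 1, 1), (1, 1, 1)]
true_tensor = [30.0, 20.0, 25.0, 2.0, 0.0, -1.0]
observed = [sum(c * s for c, s in zip(design_row(n), true_tensor)) for n in normals]
estimate = invert_htpf(normals, observed)
print([round(value, 6) for value in estimate])
```

The seventh test makes the system overdetermined, which mirrors the recommendation above to run additional tests; with noisy data the residuals of the fit give a first indication of the uncertainty.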

As a recent development, Ask et al. (2017) presented a wire-line logging tool for hydraulic rock stress estimation in slim boreholes which integrates the HTPF method with HF testing and sleeve fracturing to gain the 3-D stress tensor. The system is supposed to provide reliable results with low measurement-related uncertainties and was successfully tested (Ask et al. 2018).

Aspects of uncertainty in loading methods

Some simple physical considerations illustrate the complexity of deriving stress magnitudes from loading methods. Because the borehole wall is not explored beforehand, the (X)LOT procedure cannot ensure that a new fracture is induced rather than a pre-existing fissure being opened that is not oriented normal to \(\sigma _3\). The opening pressure might therefore over-estimate \(\sigma _3\) and thus \(S_\text {hmin}\). Moreover, the geometry of the borehole bottom may influence the fracture initiation process in so far as a horizontal fracture may be initiated (Haimson and Fairhurst 1969) before the fracture turns in the vertical direction according to the \(S_\text {Hmax}\) orientation prevailing in the reservoir (Li et al. 2009). Also, if shearing occurs, the LOP will underestimate \(\sigma _3\) (Couzens-Schultz and Chan 2010).

Cornet and Valette (1984) pointed out that even in borehole sections beyond the bottom of the borehole, induced fractures do not always grow perpendicular to \(\sigma _3\), but may be influenced by pre-existing weakness planes such as natural fissures, especially for low injection rates. Therefore, if classic hydraulic fracturing is to be performed, preceding borehole imaging is indispensable to verify the validity of the investigated fracture. Still, induced fractures may also twist and curve (tortuosity), especially if \(S_\text {hmin}\) and \(S_\text {V}\) are close, so the stress state changes significantly with distance from the wellbore wall. On the other hand, this means that although the stress state near the borehole may at first affect fracture initiation, the undisturbed stresses reassert themselves once the fracture has propagated away from the borehole and then control the orientation of the fracture (Warren and Smith 1985), indicating that the pressure results are generally valid for \(\sigma _3\) magnitude inference. However, if it is uncertain whether a thrust fault regime prevails, e.g. at shallow depths, but also in generally unexplored stress settings, one should always check whether the \(\sigma _3\) value derived from the pressure recordings corresponds approximately to the calculated overburden. In such cases, \(\sigma _3\) should not be equated with \(S_\text {hmin}\), as it might actually be \(S_\text {V}\) which has been determined.

Liu et al. (2018) showed that the FBP decreases with increasing stress ratio \(S_\text {Hmax}/S_\text {hmin}\). They also investigated the implications of oriented perforations deviating from the \(S_\text {Hmax}\) orientation, suggesting that the FBP increases with larger deviations. Concerning fluid injection methods in general, procedures with several succeeding cycles are preferable to one-cycle tests, as the interpretation of the pressure curve regarding both the effectiveness of fracture initiation and the quantification of characteristic values is more reliable. Haimson and Cornet (2003) recommend performing at least three pressurization cycles using the same flow rate. In addition, downhole pressure measurements are preferable to measuring the pressure at the surface and adding the hydrostatic pressure of the wellbore fluid (Zoback 2007), provided that the sampling is adequate, although for tests up to 500 m depth in hard rock of low permeability surface recording is sufficient (Schmitt and Haimson 2017). Li et al. (2009) noted that the interpretation of pressure test results is complicated by the use of non-Newtonian drilling fluid, which is common for XLOTs. Wellbore deviation and azimuth also influence the LOT value. Therefore, single LOTs are rather unreliable for stress magnitude determination in inclined wells. However, if several LOTs within the same stress setting are available, inversion methods may improve the deduction of \(\sigma _3\) magnitude (Aadnoy 1990). Furthermore, which formula and which strategy are appropriate in a certain case depends on the geologic setting of the study area. Hence, the analytical correlation between pressure values and stress magnitudes remains difficult to validate given the absence of a universal solution, especially concerning the estimation of the magnitude of \(S_\text {Hmax}\).
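The surface-based alternative mentioned above, adding the hydrostatic pressure of the wellbore fluid to the pressure recorded at the wellhead, can be sketched as follows. The sketch assumes a static fluid column of constant density and neglects friction losses and fluid compressibility, which is one reason why downhole gauges are preferable. The function name and the example values are hypothetical.

```python
G = 9.81  # gravitational acceleration in m/s^2

def downhole_pressure_mpa(surface_pressure_mpa, fluid_density_kg_m3, tvd_m):
    """Surface pressure plus hydrostatic head of a static fluid column (MPa)."""
    hydrostatic_pa = fluid_density_kg_m3 * G * tvd_m
    return surface_pressure_mpa + hydrostatic_pa / 1e6

# Example: 10 MPa at the wellhead, water-filled hole, test interval at 2000 m.
p_down = downhole_pressure_mpa(10.0, 1000.0, 2000.0)
```

For the example values the hydrostatic contribution is 19.62 MPa, i.e. almost twice the surface reading, so an error in the assumed fluid density propagates directly into the inferred \(\sigma _3\).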

Finally, there is the general limitation that the actual test values or charts are often not archived, let alone published. Often, all that is available is a note in the drilling reports that a leak-off or loss of fluid occurred at a certain value, which makes a comprehensive quality assessment impossible.

Regarding the overburden, determination of \(S_\text {V}\) magnitude is also prone to errors and uncertainty, especially due to density log data usually not commencing until well below the surface. Moreover, in many cases the overburden is estimated only from approximated density values assumed for large depth sections, inevitably leading to inaccuracy. Beyond that, in cases of pronounced topography or significant lateral density contrast, the calculated overburden might not be correct due to the uneven loading, and the Andersonian state of stress (Anderson 1905) might not be valid, at least near the surface (Evans et al. 1989a; Savage et al. 1985; Warpinski and Teufel 1991).
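How strongly the unlogged shallow section affects the overburden can be illustrated by integrating a density profile, \(S_\text {V}=\int \rho (z)\,g\,dz\). The following minimal sketch uses the trapezoidal rule for the logged interval and an assumed constant density above the first log depth; the profile, the two shallow densities and the function name are made up for illustration.

```python
G = 9.81  # gravitational acceleration in m/s^2

def vertical_stress_mpa(depths_m, densities_kg_m3, shallow_density_kg_m3):
    """Overburden at the last log depth in MPa.

    depths_m must increase; depths_m[0] is the top of the density log.
    The section between surface and depths_m[0] gets an assumed density.
    """
    sv_pa = shallow_density_kg_m3 * G * depths_m[0]
    for i in range(1, len(depths_m)):
        dz = depths_m[i] - depths_m[i - 1]
        mean_rho = 0.5 * (densities_kg_m3[i] + densities_kg_m3[i - 1])
        sv_pa += mean_rho * G * dz
    return sv_pa / 1e6

depths = [500.0, 1000.0, 2000.0]       # m, log starts at 500 m
densities = [2300.0, 2400.0, 2600.0]   # kg/m^3
sv_low = vertical_stress_mpa(depths, densities, 2000.0)
sv_high = vertical_stress_mpa(depths, densities, 2200.0)
```

In this example, changing the assumed shallow density from 2000 to 2200 kg/m^3 shifts \(S_\text {V}\) at 2000 m depth by about 1 MPa, which is a constant offset that does not diminish with depth.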

Problems of estimating the maximum horizontal stress magnitude

Following the conventional approach for the calculation of the \(S_\text {Hmax}\) magnitude after Bredehoeft et al. (1976), the tensile strength of the rock is determined as the difference between breakdown pressure \(P_b\) and reopening pressure \(P_r\). This is to avoid uncertainties arising from laboratory tests that determine the tensile strength. Ito et al. (1999) named two sources of error linked to the method proposed by Bredehoeft et al. (1976). First, pressure penetration into the crack prior to reopening is not considered. To address this, Ito et al. (1999) propose a modified equation. Second, the true reopening pressure is systematically overestimated from the borehole pressure records. This discrepancy increases with larger hydraulic compliance of the test equipment, which is why Ito et al. (1999) suggest reducing the flow rate by several orders of magnitude compared to conventional hydraulic fracturing systems when the aim is to also determine the \(S_\text {Hmax}\) magnitude. For tests at shallow depth, this requires only minor system modifications, but in deep boreholes, it is recommended to conduct flow measurements directly at the downhole packers (Ito et al. 1999). Evans et al. (1989a) further referred to the difficulties of determining \(P_r\) when \(S_\text {Hmax}>2S_\text {hmin}-P_0\), where \(P_0\) is the ambient pore pressure prior to pumping.
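The conventional calculation can be sketched in a few lines. The sketch assumes the classical relation commonly associated with this approach, \(S_\text {Hmax}=3S_\text {hmin}-P_r-P_0\), which holds for a vertical borehole in impermeable, linear-elastic rock with the fracture opening along the \(S_\text {Hmax}\) direction. The modified equation of Ito et al. (1999) is not reproduced here, and the function names and pressure values are hypothetical.

```python
def tensile_strength(p_b, p_r):
    """In situ tensile strength as breakdown minus reopening pressure."""
    return p_b - p_r

def shmax_conventional(s_hmin, p_r, p_0):
    """Classical estimate S_Hmax = 3*S_hmin - P_r - P_0 (same units as input)."""
    return 3.0 * s_hmin - p_r - p_0

def reopening_pick_is_problematic(s_hmax, s_hmin, p_0):
    """Condition S_Hmax > 2*S_hmin - P_0 noted by Evans et al. (1989a)."""
    return s_hmax > 2.0 * s_hmin - p_0

# Made-up example values in MPa.
s_hmin, p_b, p_r, p_0 = 20.0, 32.0, 26.0, 10.0
t0 = tensile_strength(p_b, p_r)
s_hmax = shmax_conventional(s_hmin, p_r, p_0)
flag = reopening_pick_is_problematic(s_hmax, s_hmin, p_0)
```

Under the classical relation, the condition of Evans et al. (1989a) is equivalent to \(P_r<S_\text {hmin}\): the predicted reopening pressure then lies below the pressure that keeps the fracture closed away from the borehole, which is why \(P_r\) becomes hard to identify. Since three picked or assumed quantities enter the estimate, their individual errors add up in \(S_\text {Hmax}\), with the error in \(S_\text {hmin}\) weighted threefold.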

A brief summary regarding the calculation of \(S_\text {Hmax}\) magnitude can be found in the ISRM Suggested Methods by Haimson and Cornet (2003). Further discussion on aspects and restrictions of \(S_\text {Hmax}\) magnitude determination can be found in Zoback (2007). For instance, tests under application of perforated borehole casings invalidate the physical conditions to derive \(S_\text {Hmax}\) magnitude from the pressure curves because fracture initiation is not governed by the stress concentration around the well (Zoback 2007).

Thus, the calculation of \(S_\text {Hmax}\) magnitude includes geomechanical assumptions as well as several picked pressure values as interim results, each of which is subject to uncertainty. Consequently, the result is associated with a large overall uncertainty. Beyond that, there are potentially further error sources due to insufficiently considered pore pressure effects (Haimson and Cornet 2003). In addition, data publications are often incomplete regarding physical and geological assumptions, pressure values, and quantitative uncertainties of the measurements. Thus, in practice it remains difficult to quantify the overall uncertainties in a comprehensive manner. Overall, the evaluation of \(S_\text {Hmax}\) magnitude generally involves larger uncertainties than that of \(S_\text {hmin}\) magnitude.

Relief methods

The basic principle of so-called relief methods is to relieve a rock sample by removing the contiguous volume and examine its deformation response (Ljunggren et al. 2003). These methods include overcoring (Hast 1958; Leeman 1964, 1968), borehole slotting (Becker and Werner 1994; Bock and Foruria 1983) and tests on core samples in laboratory (e.g. Strickland and Ren 1980; Teufel and Warpinski 1984; Yamamoto et al. 1990).

Given the expected scatter of smaller scale methods such as relief methods, it is reasonable and even desirable to use several tests to infer the stress state at a certain location. However, it has to be noted that for the pooling of test results the validity of the continuity hypothesis has to be ensured, which requires detailed knowledge of the local geology (Ask 2017).

Overcoring

In the overcoring method, strain sensors are attached prior to drilling around the in situ sample to measure the strain resulting from the mechanical decoupling (Hast 1958; Leeman 1964, 1968). The complete 3-D stress tensor can be calculated from the strains of a single set of measurements, provided that the elastic rock properties are known. Several different gauges are used to implement overcoring tests, but the underlying physical principle and therefore the general procedure is the same: After drilling a pilot hole, strain gauges are bonded to the still unreleased rock. In the next step, the measurement cell is overcored using a larger coring bit, which effectively relieves the stress acting on the rock. Strains are measured before, during, and after overcoring. The in situ stress state is calculated from the strains assuming continuous, homogeneous, isotropic, and linear-elastic rock behaviour. The required elastic rock properties, namely Young’s modulus and Poisson’s ratio, are commonly obtained on-site using biaxial testing. During a measurement campaign, several measurements are usually taken at close spacing, typically 0.5–1.0 m, to obtain more meaningful mean values. The resulting oriented stress tensor can be transformed to any preferable coordinate system (Sjöberg et al. 2003). Figure 4 shows the procedure of a single measurement using the Borre probe (Sjöberg et al. 2003; Sjöberg and Klasson 2003).

Fig. 4

Implementation of the overcoring method on the example of the Borre probe (Sjöberg and Klasson 2003); the principles can be applied to any overcoring method (Sjöberg et al. 2003). Figure and description were adapted from Sjöberg et al. (2003). (1) Drill main borehole (76 mm diameter) to measurement depth. Grind borehole bottom using a planing tool. (2) Drill pilot hole (36 mm diameter) and recover the core for appraisal. Flush the borehole to remove drill cuttings. (3) Apply glue to strain gauges. Insert the probe with installation tool into hole. (4) Let the probe tip with strain gauges enter the pilot hole. Release the probe from the installation tool. A compass released at the same time records the installed probe orientation. Gauges are bonded to pilot hole wall. (5) Pull out installation tool and retrieve to surface. The probe is bonded in place. (6) Allow glue to harden overnight. Then overcore the probe and record strain data using the built-in data logger. Break the core and recover it to the surface.
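The transformation of an oriented stress tensor to another coordinate system, as mentioned for overcoring results, follows \(S' = R\,S\,R^{T}\) with a rotation matrix \(R\). A minimal sketch for a rotation of the coordinate axes about the vertical axis is given below; the function name, the axis convention and the tensor values are assumptions for illustration.

```python
import math

def rotate_about_vertical(tensor, angle_deg):
    """Return R S R^T for coordinate axes rotated by angle_deg about z.

    tensor is a symmetric 3x3 nested list; the rotation takes the x axis
    towards the y axis for positive angles.
    """
    a = math.radians(angle_deg)
    c, s = math.cos(a), math.sin(a)
    r = [[c, s, 0.0], [-s, c, 0.0], [0.0, 0.0, 1.0]]
    rs = [[sum(r[i][k] * tensor[k][j] for k in range(3)) for j in range(3)]
          for i in range(3)]
    return [[sum(rs[i][k] * r[j][k] for k in range(3)) for j in range(3)]
            for i in range(3)]

# Made-up tensor in MPa with principal axes along x, y, z.
stress = [[30.0, 0.0, 0.0], [0.0, 20.0, 0.0], [0.0, 0.0, 25.0]]
rotated_90 = rotate_about_vertical(stress, 90.0)
rotated_45 = rotate_about_vertical(stress, 45.0)
```

A rotation by 90° exchanges the two horizontal normal components, while a rotation by 45° produces equal horizontal normal components and a non-zero horizontal shear component; the trace of the tensor is unchanged in both cases.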

Borehole slotting

The borehole slotter is a device providing in situ strain relief without overcoring (Bock and Foruria 1983). It involves a strain sensor taking measurements while and after a blade cuts a slot into the borehole wall (Becker and Werner 1994). This method is limited to dry shallow boreholes and it also requires independently obtained information regarding the elastic properties of the rock (Bock 1993). Although it is not widely used anymore, it is mentioned here since there were several applications in Germany and Switzerland that contributed data to the database presented with this paper.

Tests on core samples in laboratory

Tests on core samples deduce the in situ stress tensor components from the deformation behaviour of rock specimens through laboratory testing, with or without reloading.

One method without reloading is called anelastic strain recovery (ASR) and measures the strains on oriented cores to calculate the horizontal stress magnitudes from the principal strain magnitudes, a previously determined overburden corresponding to one of the principal stresses, and Poisson’s ratio. Unlike with overcoring, the gauges are installed subsequent to stress relief (Teufel and Warpinski 1984; Teufel 1983). Hereby, a partial component of the anelastic strain recovery can still be determined, which is sufficient for the stress magnitude determination if (1) the rock is homogeneous and linearly viscoelastic, (2) the viscoelasticity of the rock can be characterized by one viscoelastic parameter, (3) Poisson’s ratio is not time-dependent, (4) the in situ stresses are removed instantaneously. In case of transversely isotropic cores, at least one additional viscoelastic parameter is required (Blanton 1983).

Core-sample based reloading approaches are based on the creation of micro-cracks resulting from the stress release. The assumption is that aligned micro-crack densities are proportional to the relieved stress magnitudes of corresponding directions. It is analysed how those cracks close under varying pressure application. Basic principles are provided by Strickland and Ren (1980).

In the case of wave velocity analysis (WVA), the anisotropic wave velocities are measured within oriented samples (Braun et al. 1998; Ren and Hudson 1985). Directional ultrasonic waves are induced to measure the wave travel time across the oriented rock sample under increased isotropic loading. The anisotropy in wave velocity refers to the orientation of tensile micro-cracks which are in turn correlated to the in situ stress state the sample was released from. The minimum wave velocity points along the orientation of \(S_\text {Hmax}\) (Fleckenstein et al. 2004).

Yamamoto et al. (1990) developed the deformation rate analysis (DRA) method, which uses uni-axial compression cycles for the definition of a strain difference function. This function of axial stress is obtained by subtracting the axial strains observed from different loading cycles. The in situ stresses are estimated from gradient changes of the strain difference functions.

For the differential strain analysis (DSA) method, released rock samples are compressed isotropically. The pressure is increased in steps. To infer the in situ stress magnitudes, the overburden corresponding to the vertical in situ stress, the in situ pore pressure, and Poisson’s ratio are required (Widarsono et al. 1998).

A relatively new approach without reloading yielding the differential stress (\(S_\text {Hmax}-S_\text {hmin}\)) is the diametrical core deformation analysis (DCDA; Funato and Ito 2017). The strains are determined by means of an optical micrometer. Although this method does not yield isolated information about the principal stress magnitudes, \(S_\text {Hmax}\) magnitude can be determined if the magnitude of \(S_\text {hmin}\) is known from, e.g. a hydraulic fracturing test nearby the initial location of the core sample, combining that information with the differential stress from DCDA. As for all strain analysis approaches, Young’s modulus and Poisson’s ratio have to be determined and the rock must meet the requirements of homogeneity, isotropy and linear elasticity (Funato and Ito 2017).
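The combination of DCDA with an independent \(S_\text {hmin}\) value can be sketched as follows. For a homogeneous, isotropic, linear-elastic core, Hooke's law implies that the difference of the diametral strains in the two horizontal principal directions equals \((1+\nu )(S_\text {Hmax}-S_\text {hmin})/E\), so that the differential stress is \(E/(1+\nu )\cdot (d_\text {max}-d_\text {min})/d_0\). The sketch assumes this relation; the exact formulation should be taken from Funato and Ito (2017). The function names, the reference diameter \(d_0\) and all numerical values are hypothetical.

```python
def dcda_differential_stress_mpa(d_max_mm, d_min_mm, d0_mm, young_gpa, poisson):
    """Differential horizontal stress from core diameters (elastic assumption)."""
    strain_difference = (d_max_mm - d_min_mm) / d0_mm
    return young_gpa * 1e3 / (1.0 + poisson) * strain_difference

def shmax_from_dcda(s_hmin_mpa, differential_stress_mpa):
    """S_Hmax as the independently known S_hmin plus the differential stress."""
    return s_hmin_mpa + differential_stress_mpa

# Made-up example: 10 micrometres diameter difference on a 61 mm core.
diff = dcda_differential_stress_mpa(61.010, 61.000, 61.0, 50.0, 0.25)
s_hmax = shmax_from_dcda(20.0, diff)
```

In this example a diameter difference of only 10 micrometres corresponds to a differential stress of about 6.6 MPa, which illustrates why the precision of the diameter measurement is decisive for the method.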

Aspects of uncertainty in relief methods

Care must be taken if measurements are realized for the purposes of engineering projects such as tunnel excavations. Because free surfaces of constructional interventions in the subsurface disturb the stress state, measurements directly behind a tunnel wall are not necessarily reliable indicators of the undisturbed stress state. Brady and Brown (2004) suggest a zone of influence of 5 times the radius of the excavation regarding a circular shape.

Besides, as the assumptions of ideal rock behaviour (continuous, homogeneous, isotropic, and linear-elastic or linear-viscoelastic behaviour) are seldom met completely, epistemic errors are inevitably introduced. A review by Amadei and Stephansson (1997) found that the expected imprecision of overcoring results is at least 10–20 %, even under nearly ideal rock conditions. Leijon (1989) showed that the absolute scatter in overcoring data from hard rock amounts to \(\pm 2\) MPa, which means that for shallow depths, when stress magnitudes are low, the overcoring results are relatively more uncertain. Furthermore, Irvin et al. (1987) draw attention to the possible occurrence of boundary yield, meaning mechanical yielding at the borehole–cell interface. This can result in a significant increase in the stress magnitude aligned parallel to the borehole. They therefore recommend carrying out measurements in two orthogonal boreholes at a site.

Relief methods infer stresses from small-scale strains and therefore their results are generally highly dependent on the precision of the corresponding measurements (Bertilsson 2007; Hakala et al. 2003; Hakala 2007). Referring to overcoring data, Ask (2003) mentioned bonding between sensors and rock specimen, temperature effects and the identification of elastic parameters as measurement-related uncertainties. Ask (2003) specifically criticizes improper handling and the quality of the glue by which the sensors are attached to the rock. Temperature control is a critical measure to avoid severe inaccuracies, and this is not only valid for overcoring. Also, poorly conducted biaxial tests can lead to distorted results when the stress calculation depends on the determined rock properties (Ask 2003). If there are inaccuracies or ambiguities regarding the fulfilment of the theoretical assumptions or if there are contradictory results, Ask (2017) recommends taking other stress indicators into account for comparison in order to decide which data are to be trusted.

Widarsono et al. (1998) pointed out that the DSA method was only to be used under the assumption that all micro-cracks existing within a tested sample originated from stress relief, or at least that all pre-existing micro-cracks are not affecting the measured deformation significantly. They emphasize that grain size heterogeneities have a strong influence on the deviation of stress relief micro-cracks. This remark applies to all methods based on the investigation of stress relief micro-cracks.

Other methods

Upper limits of stress magnitudes derived from the frictional limit

Assuming that the Earth’s crust contains pre-existing faults that are optimally oriented in the prevailing stress field and furthermore assuming that these faults are at their frictional limit, the ratio of the maximum and minimum principal stress is given by \(\sigma _1/\sigma _3=(\sqrt{(\mu ^2+1)}+\mu )^2\), where \(\mu\) is the coefficient of friction. For \(\mu =0.6\) this ratio would be 3.1 (Jaeger et al. 2009; Sibson 1974). When the faulting regime is known and assuming that the vertical stress is a principal stress, upper bounds for \(S_\text {hmin}\) in a thrust faulting and for \(S_\text {Hmax}\) in normal faulting regime, respectively, can be estimated. In a strike–slip faulting regime where \(S_\text {V}\) is the intermediate principal stress, the ratio is determined by \(S_\text {Hmax}\)/\(S_\text {hmin}\) and further assumptions or information are needed. Therefore, the stress state can be narrowed down with a so-called stress polygon (Schoenball et al. 2018; Zoback 2007; Zoback et al. 2003). Here additional information, e.g. from FITs or LOTs, can be introduced to further constrain the upper and lower boundaries of the horizontal principal stresses.
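The limiting stress ratio and the bounds derived from it can be evaluated with a short sketch. The ratio function implements the expression given above. The optional pore pressure argument is an assumption that goes beyond the text: it applies the ratio to effective stresses, and its default of zero reproduces the total-stress form. Function names and example values are hypothetical.

```python
import math

def frictional_limit_ratio(mu):
    """Limiting ratio sigma_1/sigma_3 = (sqrt(mu^2 + 1) + mu)^2."""
    return (math.sqrt(mu ** 2 + 1.0) + mu) ** 2

def sigma1_max(sigma3, mu, pore_pressure=0.0):
    """Largest sigma_1 permitted by the frictional limit for a given sigma_3."""
    ratio = frictional_limit_ratio(mu)
    return ratio * (sigma3 - pore_pressure) + pore_pressure

def sigma3_min(sigma1, mu, pore_pressure=0.0):
    """Smallest sigma_3 permitted by the frictional limit for a given sigma_1."""
    ratio = frictional_limit_ratio(mu)
    return (sigma1 - pore_pressure) / ratio + pore_pressure

ratio_06 = frictional_limit_ratio(0.6)
# Thrust faulting with S_V = sigma_3 = 25 MPa (made-up value).
upper_limit = sigma1_max(25.0, 0.6)
# Normal faulting with S_V = sigma_1 = 50 MPa (made-up value).
lower_limit = sigma3_min(50.0, 0.6)
```

Because the ratio depends strongly on \(\mu\) (about 2.6 for \(\mu =0.5\) and about 4.3 for \(\mu =0.8\)), the resulting bounds are only as reliable as the assumed friction coefficient.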

Since the frictional limit approach is based on a number of simplifying assumptions and the knowledge of the friction coefficient \(\mu\), the reliability is limited compared to stress magnitudes derived from the indicators described in the previous sections. Furthermore, it only delivers upper bounds and thus will be ranked lower in quality (see the chapter "Quality ranking scheme for stress magnitude data" and Table 2 for further details).

An implementation of the frictional limit approach supported by empirical information is possible by using borehole failure observations (borehole breakouts, BOs; drilling-induced tensile fractures, DIFs; see next subsection) in combination with rock strength. These additional data can also be integrated in a stress polygon and make it possible to constrain \(S_\text {Hmax}\) magnitude also in deviated wells if \(S_\text {hmin}\) is already known (Moos and Zoback 1990; Peška and Zoback 1995a, b; Schoenball and Davatzes 2017; Valley and Evans 2007).

Borehole breakouts and drilling-induced tensile fractures

Whereas the orientation of BOs indicates the orientation of \(S_\text {hmin}\), the BO width might be analysed in order to infer stress magnitude ratios. Thus, if the magnitude of \(S_\text {hmin}\) is already known, the magnitude of \(S_\text {Hmax}\) can potentially be estimated if borehole failure is observed within the same lithological layer (Barton et al. 1988; Lee and Haimson 1993; Shen 2008; Vernik and Zoback 1992). Shen (2008) used numerical modelling to establish a quantitative relation between BO dimensions and stresses. However, this applies only under isotropic rock conditions, with assumptions of compressive rock strength, elastic parameters and friction coefficient. In addition, the analytical result relies heavily on the failure criterion used (Valley and Evans 2019). Furthermore, this method assumes that the precise mud weight conditions at BO initiation are known, whereas, in practice, the exact time at which BOs are generated, and thus the exact downhole mud weight, are rarely known. The BO width method also assumes no chemical or thermal effects on the near wellbore stresses or wellbore strengths, and thus is potentially significantly erroneous and likely to markedly over-estimate maximum horizontal stress in wells drilled with water-based mud or in high temperature wells. Finally, the BO width method assumes that BOs initiate and then are completely undisturbed by the drilling bottom hole assembly (BHA) or other downhole tools during drilling, reaming, or running-in. Hence, it should be noted that the BO width method, whilst supported by numerical and lab models under highly controlled conditions, is, in practice, at risk of suffering from numerous error sources or invalid assumptions. This leads to high uncertainties and therefore the majority of industry practitioners do not use it. They have also found from practical experience that results from BO width interpretation tend to yield an \(S_\text {Hmax}\) estimate that is significantly higher than those from other methods. Instead of using the BO width to determine a precise value, it is therefore seen as rather more reliable to use the occurrence or absence of BOs in a well to place constraints on \(S_\text {Hmax}\) magnitude. This is done by calculating the approximate minimum \(S_\text {Hmax}\) magnitude value which is needed for any BO to be generated. Hence, the occurrence of BOs indicates that this threshold value corresponds to a lower bound for \(S_\text {Hmax}\) magnitude, whereas the absence of BOs indicates that this threshold value is an upper bound. An analogous procedure is used for DIFs. As such, borehole failure observations can serve as a supplement to, e.g. the frictional limit considerations explained in the previous subsection, in the absence of more explicit data.
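The threshold procedure can be sketched under strongly simplifying assumptions: a vertical borehole in isotropic, linear-elastic rock, the Kirsch solution for the maximum effective hoop stress at the borehole wall, \(3S_\text {Hmax}-S_\text {hmin}-P_w-P_0\) with mud pressure \(P_w\), and failure as soon as this stress reaches the uniaxial compressive strength. Thermal and chemical effects are ignored. This choice of failure criterion and all names and values are assumptions for illustration, not a procedure prescribed by the cited studies.

```python
def shmax_breakout_threshold(s_hmin, ucs, pore_pressure, mud_pressure):
    """Minimum S_Hmax for breakout initiation in a vertical well (MPa).

    Solves 3*S_Hmax - S_hmin - P_w - P_0 = UCS for S_Hmax.
    """
    return (ucs + s_hmin + pore_pressure + mud_pressure) / 3.0

def constrain_shmax(threshold, breakouts_observed):
    """Interpret the threshold as a lower or an upper bound for S_Hmax."""
    kind = "lower bound" if breakouts_observed else "upper bound"
    return kind, threshold

# Made-up example values in MPa.
threshold = shmax_breakout_threshold(40.0, 100.0, 20.0, 22.0)
with_breakouts = constrain_shmax(threshold, True)
without_breakouts = constrain_shmax(threshold, False)
```

The bound inherits the uncertainty of the assumed rock strength: in this sketch, an error of 30 MPa in the compressive strength shifts the threshold by 10 MPa.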

The stress magnitude database

The content of the stress magnitude database is drawn from published sources only. For each data record, at least one reference has to be supplied in the corresponding fields. The references have so far been coded as labels, which can be resolved by means of a supplementary table (Additional file 3). Additionally, the references can be made identifiable by their digital object identifier (DOI), if one exists, or written-out publication information. Table 1 gives an overview of the fields (columns) included in the stress magnitude database. The location in terms of latitude, longitude and true vertical depth below surface must be given to ensure the usefulness of a data record in principle. UTM coordinates are an alternative system which might be used, but then the additional entry of the lat–lon values is requested to simplify map display. In addition, the type of stress magnitude indicator as the basis for the quality assignment (see next chapter) and at least one stress magnitude are required. In general, however, only a subset of the fields offered is filled within an individual data record. This can be due to the fact that not all fields are applicable to all kinds of indicator or due to a lack of accessible information in the referenced data source.
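The minimum requirements for a data record listed above can be expressed as a simple completeness check. The following sketch is an illustration only: the field names are hypothetical and do not correspond to the actual column names of the database (see Table 1 and Additional file 1).

```python
# Hypothetical field names; the real column names are given in Table 1.
LOCATION_FIELDS = ("lat", "lon", "tvd_m")
MAGNITUDE_FIELDS = ("s_hmin", "s_hmax", "s_v", "sigma1", "sigma2", "sigma3")

def missing_requirements(record):
    """Return the list of unmet minimum requirements for a record (dict)."""
    problems = [f for f in LOCATION_FIELDS if record.get(f) is None]
    if not record.get("indicator"):
        problems.append("indicator")
    if not record.get("references"):
        problems.append("references")
    if all(record.get(f) is None for f in MAGNITUDE_FIELDS):
        problems.append("at least one stress magnitude")
    return problems

complete = {"lat": 48.1, "lon": 11.6, "tvd_m": 3000.0, "indicator": "LOT",
            "references": ["label_001"], "sigma3": 55.0}
incomplete = {"lat": 48.1, "lon": 11.6, "indicator": "HF"}
```

Optional fields such as elastic properties or pore pressure are deliberately not checked, since only a subset of the fields is expected to be filled in an individual record.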

Although the basic approach of collecting stress magnitude data resembles that of the established stress orientation data compilation of the WSM project, there are notable differences between the two databases. This issue will be addressed in the Discussion chapter of this paper.

As mentioned above, Table 1 shows a general overview of the database fields. Supplementary to this, Additional file 1 includes a more detailed list, further itemizing the available database columns including required units. The core elements of the database are the quantified stress magnitudes or at least upper or lower boundaries of the stress magnitudes. Depending on the referenced data source, these can be given as principal stress values (\(\sigma _1\), \(\sigma _2\), \(\sigma _3\)) or in form of the horizontal and vertical stress magnitudes (\(S_\text {hmin}\), \(S_\text {Hmax}\), \(S_\text {V}\)). In the former case, at least stress regime information is needed to conclude the tensor orientation. Depending on the stress magnitude indicator, only the minimum principal stress \(\sigma _3\) might be given. The vertical stress information mostly originates from integrating estimated density profiles, although the implementation of a density log ensures more evidence-based data. For each type of magnitude information, compressional stress is by definition indicated by positive values. If the depth information indicated in the data source is not referring to ground level but to a deviating reference height, e.g. the kelly bushing or the rotary table, this may be specified in the corresponding fields that allow the derivation of the requested depth below surface.

The elastic properties of the investigated rock are of particular importance for contextual interpretation and also for the calibration of geomechanical-numerical models. However, such information is mostly not available. Even data gained from sonic logs do only provide dynamic elastic properties, and not true static ones. Measurements of static elastic properties, as they are gained from core sample testing, are rarely available at hydraulic test depths. Still, details on rock type and lithology can yield indications of rough estimates. Furthermore, to estimate the effective stresses \(\sigma _{eff}\) after Terzaghi (1936) pore pressure (\(P_0\)) information is required. Terzaghi defined \(\sigma _{eff}\) as the difference between the total stress \(\sigma\) and the pore pressure \(P_0\), as \(P_0\) acts against the external stresses affecting the rock, which is why \(\sigma _{eff}\) is in fact the critical variable regarding stability issues. However, the calculation of \(\sigma _{eff}\) is mostly not possible due to missing information on \(P_0\). If the data source directly specifies effective stresses, these are to be entered in the designated columns.

The comment field is intended to include additional information, which might be crucial for the interpretation of the dataset. This includes geological background information, structural geology on borehole scale, topographic features, origin of material parameter information, decisive assumptions, scale of measurement, problems that occurred in the technical implementation of the measurements, origin of the pore pressure knowledge, and discrepancy in data interpretation. Other points of interest are whether the tests were open-hole or through perforated or otherwise permeable casings, length of the test zone, hole and shoe depth, well deviations, pressure gauge location, number of cycles or repeats, course of injection rate, returned fluid volume, static mud weight and mudline depth, and method of pressure value picking. If the referenced data source specifies pressure values from multiple cycles, these are to be listed in the comment field, while mean values are entered in the corresponding pressure fields. If there is any field entry marked as explicitly questionable in the references, for instance due to problems or inconsistencies in the test sequence, it is registered as such by naming it in an extra field complementary to the comments.

If a data record from the WSM \(S_\text {Hmax}\) orientation database is connected to a data record in the stress magnitude database, the \(S_\text {Hmax}\) azimuth information as well as its quality referring to the WSM quality ranking scheme for the \(S_\text {Hmax}\) orientation data records are quoted. From the various stress magnitude indicators, the loading methods are most important in regard to reliability, significance and number of data records. To ensure replicability, the specification of pressure values of hydraulic measurements is required. Consequently, they are always left empty in case of indicators other than loading methods. These quantities might be used to apply alternative stress magnitude calculation approaches.

Table 1 Information included in the stress magnitude database

Quality ranking scheme for stress magnitude data

Since the majority of the new data come from loading methods, they provide reliable information only on the \(S_\text {hmin}\) (or more generally \(\sigma _3\)) magnitude. Thus, the quality ranking presented in this paper refers only to \(S_\text {hmin}\) or \(\sigma _3\) magnitudes, although for indicators yielding the whole stress tensor in one step (e.g. overcoring, HTPF), the assigned quality practically refers to other stress tensor components as well. However, in contrast to the WSM quality ranking for \(S_\text {Hmax}\) orientation data records, estimates of stress magnitudes cannot be averaged over large rock volumes or depth ranges. Instead each pointwise information has to be considered separately. Thus, we developed a different approach for the quality ranking scheme of \(S_\text {hmin}\) magnitude data records which is based on two general criteria, each also having a subsection in this chapter:

  1. The reliability of the individual stress magnitude indicator.

  2. The degree of information integrity available for a given data record.

Let Record-1 be an \(S_\text {hmin}\) magnitude data record which is provided with comprehensive information, but is obtained from a stress magnitude indicator that has a limited range of achievable quality ranks. And let Record-2 be a poorly documented data record from a stress magnitude indicator that would have the potential to yield a data record with a high quality rank. Then the two criteria named above imply that Record-1 is possibly more reliable in the individual case than Record-2. Hence, no method of stress magnitude indication can be defined as superior to another per se, when it comes to actual data. Accordingly, the proposed quality ranking scheme presented in Table 2 incorporates both the type of stress magnitude indicator and the degree of information that is available. Detailed aspects considered thereby are explained in the subsequent sections.

The different stress magnitude indicators used in this paper are listed in the left column of Table 2. The quality ranks range from A (best) to E (poorest), following the concept of the WSM quality ranking scheme for the \(S_\text {Hmax}\) orientation data records. The highest overall quality a data record can achieve is limited by the stress magnitude indicator (criterion 1). The completeness of information for a data record determines further loss of quality level (criterion 2). Unlike the WSM quality ranking scheme for stress orientation data, no standard deviation for stress magnitudes is derived from the assigned quality. Nevertheless, the number and designation of the quality ranks are largely adopted from the WSM as they reflect the broad range of data presented in the various sources and take into account the diversity of the considered indicators. Only the X-category has been added to the previous scheme (A–E) to indicate data for which the sources are currently not accessible or undisclosed due to confidentiality issues, but may become accessible in the future. Thus, X serves as a placeholder for a currently unrated quality. It therefore needs no extra column in Table 2. Apart from that, the chosen designation with capital letters is not decisive, but it seems reasonable to make use of the recognition value associated with the long-established WSM system despite the different implications of the quality classification.

After all, our motivation behind the quality ranking scheme is to provide a basis of assessment that works independently from specific data or specific areas. Although the criteria have been developed also by looking at the available data, the intention is that the scheme is applied to the assessed data, and not that the assessment requirements are fitted to the data, as this would inevitably mean a loss of universal applicability. At the same time, the actual availability of information provided on the data needs to be reflected in the quality ranking scheme to allow for a corresponding range of different qualities.

Quality aspects of stress magnitude indicators

Hydraulic fracturing (HF) and hydraulic testing of pre-existing fractures (HTPF) can achieve A-quality, assuming that the large volume of injected fluid along with the extent of the pressure curves and their elaborate interpretation ensure both a high degree of reliability and validity for a large rock volume if they are executed in an isolated open-hole interval and properly documented (Schmitt and Haimson 2017). Mini-fracs and XLOTs can also achieve A-quality if certain conditions are met (see Table 2). They are generally more reliable than simple LOTs as they are executed with repeated cycles and extended monitoring (Addis et al. 1998). Although no results from XLOTs are available for the German stress magnitude database, we integrated this stress magnitude indicator in the quality ranking scheme due to its general importance. LOTs are ranked as not more than C-quality due to the short pressure record duration and the therefore limited options for evaluation (Addis et al. 1998).

Relief methods infer stresses from small-scale strains and thus their results are generally highly sensitive to conditions disturbing the strain measurements (Bertilsson 2007; Hakala et al. 2003; Hakala 2007). This is in contrast to loading methods which infer stresses from fluid pressure measurements. To reduce the probability of significant systematic errors due to, e.g. temperature effects, several repetitive measurements from similar depth are required to achieve a quality better than C. The definition of the minimum number of single measurements and the minimum depth follows the quality ranking scheme of the WSM stress orientation database (cf. Heidbach et al. 2010). Furthermore, it is difficult to assess how representative a core sample measurement is for a larger volume for two reasons. First, samples are often taken near open surfaces in the underground and thus probably do not measure the undisturbed stress field before excavation. Second, the elastic properties as part of the stress deduction are gained from laboratory measurements and do not necessarily represent the in situ conditions (Brace 1981; Pratt et al. 1972). This also applies to the method of borehole slotting, which, even though not using isolated rock samples for strain measurements, strongly depends on the assumed elastic parameters, specifically Young’s modulus (Becker and Werner 1994).

In some publications, results of fluid injection methods without further specification are gathered without providing any technical details. Others include implicit drilling fluid pressure indicators originally not recorded with the objective of determining stress tensor quantities. These measurements originate mainly from industry treatments. As replicability and accuracy of these indicators are very limited, they are ranked at best as C-quality and most often as D-quality.

FITs are used to infer lower bounds of the \(\sigma _3\) magnitude or more generally as a rough \(\sigma _3\) magnitude estimation. As this kind of information is valuable if no other information is available but not to the same extent as absolute stress magnitudes (e.g. Drews et al. 2019), it is rated at best as D-quality. The purely assumption-based approach based on frictional limit considerations is also ranked as D-quality at best, due to the lack of data basis and the unverifiable premise that the faults are optimally oriented to the stress field.
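The interplay of the two criteria can be sketched in a few lines of Python. This is a minimal illustration, not a reproduction of Table 2: it encodes only the upper bounds stated in the text above, the indicator labels and the function name `assign_rank` are assumptions, and the integer `downgrade_steps` is a hypothetical stand-in for the loss of quality caused by incomplete information. Relief methods are left out because their upper bound is not stated in this section.

```python
RANKS = ["A", "B", "C", "D", "E"]  # ordered from best to poorest

# Criterion 1: best rank each indicator can reach, as stated in the text.
# The keys are illustrative labels, not field values of the database.
BEST_RANK = {
    "HF": "A",
    "HTPF": "A",
    "mini-frac": "A",
    "XLOT": "A",
    "LOT": "C",
    "unspecified fluid injection": "C",
    "implicit drilling fluid pressure": "C",
    "FIT": "D",
    "frictional limit": "D",
}


def assign_rank(indicator, downgrade_steps=0, source_accessible=True):
    """Combine the indicator cap (criterion 1) with a quality loss (criterion 2).

    downgrade_steps is a hypothetical simplification; the actual conditions
    for each rank are defined in Table 2.
    """
    if not source_accessible:
        return "X"  # placeholder for a currently unrated record
    if downgrade_steps < 0:
        raise ValueError("downgrade_steps must not be negative")
    start = RANKS.index(BEST_RANK[indicator])
    # A record can never be rated poorer than E.
    return RANKS[min(start + downgrade_steps, len(RANKS) - 1)]
```

For example, `assign_rank("LOT", 1)` yields D, while a well-documented HF record keeps the A that its indicator permits.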

Data records lacking depth or stress magnitude indicator information are generally rated as the lowest quality rank E. Beyond that, data records from relief methods are considered as E if they were obtained at very shallow depth (<10 m). In addition to the direct influence of the near surface, a further restriction applies to data from loading methods, since these depend on \(\sigma _3\) corresponding to \(S_\text {hmin}\) and not \(S_\text {V}\). Therefore, a depth threshold of 100 m is used for HF, mini-frac, LOT, XLOT and unspecified or implicit drilling fluid pressure indicators. Data records from HTPF and FITs form exceptions in this context for different reasons: HTPF can be employed independently of the prevailing stress regime. As FITs yield only a lower bound of \(S_\text {hmin}\) anyway, in case of thrust faulting conditions this is only rougher, but not wrong.
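The depth-related rules of this paragraph can be written as a small screening function. The sketch below assumes that overcoring and borehole slotting are the relief methods concerned (they are the ones named in this paper); the indicator labels and the function name `fails_depth_screening` are illustrative. A record that passes this screening can still be rated E for other reasons.

```python
# Relief methods named in this paper (assumed label spelling).
RELIEF_METHODS = {"overcoring", "borehole slotting"}

# Loading methods to which the 100 m threshold applies.
# HTPF and FIT are deliberately absent: the text exempts them.
LOADING_WITH_THRESHOLD = {
    "HF",
    "mini-frac",
    "LOT",
    "XLOT",
    "unspecified fluid injection",
    "implicit drilling fluid pressure",
}


def fails_depth_screening(indicator, depth_m):
    """Return True if the record is rated E by the rules of this paragraph."""
    # Missing depth or indicator information always leads to E-quality.
    if indicator is None or depth_m is None:
        return True
    # Relief methods: very shallow records (<10 m) are E-quality.
    if indicator in RELIEF_METHODS and depth_m < 10.0:
        return True
    # Loading methods that rely on sigma_3 being S_hmin: 100 m threshold.
    if indicator in LOADING_WITH_THRESHOLD and depth_m < 100.0:
        return True
    return False
```

With these rules, an HF record from 80 m depth is flagged, whereas an HTPF record from the same depth is not.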

Quality aspects of stress magnitude data sources

Since for most data records an uncertainty in terms of standard deviation is not available, the integrity of data sources is evaluated to supplement the presented quality ranking scheme. Criteria considered during reference evaluation are, for instance, scientific considerations explained in the publication, specification of used formulae, and for fluid injection methods, display of pressure curves and statement of pressure values as interim results. If no access has been gained to the data source, the ranking scheme of Table 2 is not applicable and the quality rank is set to X. Unlike data of E-quality, an X-dataset can achieve a better quality once the source becomes accessible.

Table 2 Quality ranking scheme for stress magnitude data referring to the \(S_\text {hmin}\) (or \(\sigma _3\)) magnitude of the data records

Results: stress magnitude data in Germany and adjacent regions

The open-access database presented in this paper compiles stress magnitude information from various sources. It currently contains 568 data records in the area of Germany and adjacent regions (latitude: \(47^\circ -55.5^\circ\) N; longitude: \(5.8^\circ -15.1^\circ\) E). 12 data records are assigned to A-quality, 60 to B-quality and 42 to C-quality; the remaining 454 data records are D- (n = 266), E- (n = 141) or X-quality (n = 47).
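The quality counts given above can be checked and converted into proportions with a short script. The counts are taken directly from the text; the variable names are illustrative.

```python
# Number of data records per quality rank, as stated in the text.
counts = {"A": 12, "B": 60, "C": 42, "D": 266, "E": 141, "X": 47}

total = sum(counts.values())  # all records in the database
shares = {rank: 100.0 * n / total for rank, n in counts.items()}

# Records of A- to C-quality taken together.
a_to_c = counts["A"] + counts["B"] + counts["C"]

for rank, pct in shares.items():
    print(f"{rank}: {counts[rank]:3d} records ({pct:.1f} %)")
print(f"A-C: {a_to_c} of {total} records ({100.0 * a_to_c / total:.1f} %)")
```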

Fig. 5

Locations, stress magnitude indicators and number of data records. Examples discussed in more detail are marked by dashed boxes. Plotted are all data records for which depth information is available (n = 564). a Assigned indicator types illustrated by colour. Very shallow records (<10 m below surface) are additionally marked with a cross. The colour code corresponds to that used in Figs. 6 and 7. b Number of data records per site illustrated by circle diameters. The term site refers to the same latitude–longitude location

Fig. 6

Depth distribution of data records. The bars of the histograms are colour coded according to the type of stress magnitude indicator. The inset plots show the upper 500 m and 50 m below surface, respectively, in more detail and clarify the clustered appearance of the borehole slotting and overcoring measurements in shallow depths. The colour code corresponds to that used in Figs. 5a and 7. Percentages reflect the overall proportions of the different stress magnitude indicators. Considered are all data records for which depth information is available (n = 564). 273 data records are included in the larger inset histogram, 116 data records are included in the smaller inset histogram

Figure 5 shows the 2-D spatial distribution of the stress information included in the database. The colours used to signify the different stress magnitude indicators in Fig. 5a are the same as in Figs. 6 and 7. As the number of data points distributed along depth at one location on the map differs widely and might be decisive for the utility of the data records, Fig. 5b shows the number of data records per location. A larger diameter signifies the availability of several data points, whereas small dots mean that only a single record exists. The distribution of depth regardless of the geographical clustering is shown in Fig. 6, which demonstrates that the majority of \(S_\text {hmin}\) data are from shallower parts of the Earth’s crust (<250 m) corresponding to the general depth range for mining. Especially in the upper 10 m, overcoring and borehole slotting are the prevailing indicators.

Fig. 7

Depth distribution of stress magnitude data. Minimum principal stress (\(\sigma _3\)) magnitude and horizontal stress (\(S_\text {hmin}\)) magnitude, depending on how it is indicated in the reference, plotted with record depth below surface. Plotted are all data records for which the one and/or the other information is available, resulting in 530 data points (\(\hbox {n}_{{Shmin}}\) = 379; \(\hbox {n}_{\sigma 3 }\) = 151). 30 data records provide both \(S_\text {hmin}\) and \(\sigma _3\) magnitude, therefore 500 distinct records remain. The difference to the total number of data records in the database (n = 564) results from the fact that some records provide only other types of stress information or lack the required depth information. The colour code in the plot corresponds to that used in Figs. 5a and 6

Fig. 8

Map of assigned data record qualities. Best quality available at each dataset location. Plotted are all data records in the database for which depth information is available (n = 564). Pie chart reflects proportions of assigned qualities of all data records (n = 568)

The overall prevalence of HF and HTPF data compared to data from other stress magnitude indicators is clearly shown by the percentage given in Fig. 6 (37 %). Although these indicators have the potential to be ranked as A-quality (see Table 2), not every record of this type possesses high reliability. Only 12 out of 210 HF/HTPF data records are actually ranked as A-quality due to the thoroughness of their referenced sources. These 12 data points are clustered in such a way that only two actual sites with A-quality remain, namely Soultz-sous-Forêts and the German Continental Deep Drilling Program (see also details in the corresponding subsections and the map display in Fig. 8). In order to give a quantitative impression of the data, Fig. 7 shows the \(S_\text {hmin}\) and \(\sigma _3\) magnitudes of the database records with depth, without attempting any further interpretation. In particular, we do not provide a gradient as this is clearly not appropriate given the diversity of geologic and tectonic settings from which the data originate and the general problem that gradients are inappropriate when geomechanical layering exists (Fleckenstein et al. 2004; Roth and Fleckenstein 2001; Warpinski and Teufel 1987). Figure 8 illustrates the distribution of all sites with data records and their assigned qualities accompanied by a pie chart illustrating the proportions of the quality ranks in the database regardless of spatial distribution.

Examples

In the following, we select and present a number of prominent or representative examples. The corresponding sites are marked in Fig. 5a and b.

German continental deep drilling program

The German Continental Deep Drilling Program (in German: Kontinentales Tiefbohrprogramm der Bundesrepublik Deutschland, KTB) is situated in NE Bavaria, at about 49.82\(^\circ\) N, 12.12\(^\circ\) E. The assessment of stress data from the project results in 13 records in the presented database, covering a depth range from 0.8 km to 9 km below surface. Used indicators include hydraulic fracturing and the integration of borehole failure. Five references are associated with the data records (Baumgärtner et al. 1990; Brudy 1995; Brudy et al. 1997; Röckel and Natau 1993; Zoback and Harjes 1997).

North German Basin

The cluster of 36 data records in NW Germany, at about 53.1\(^\circ\) N, 9.4\(^\circ\) E, is mainly situated in the Dyas-Perm lithology and originates from exploration campaigns for natural gas. The stress magnitudes were inferred from mini-frac treatments and wave velocity anisotropy measurements in the laboratory. The area investigated in the referenced report is characterized by halokinetic structures and associated geomechanical decoupling (Fleckenstein et al. 2004).

Soultz-sous-Forêts

One example for a site beyond the German border but near enough to be relevant for the estimation of the stress state of Germany is Soultz-sous-Forêts, France, 48.93\(^\circ\) N, 7.88\(^\circ\) E. This borehole site was created in the context of an enhanced geothermal system. The stress dataset includes 16 data records from 5 wellbores belonging to the same site. The depth ranges from 1.5 km to 4.5 km below surface. Four references are associated with the data records (Cornet et al. 2007; Klee and Rummel 1993; Rummel and Baumgärtner 1992; Valley and Evans 2007). The used indicators include HF and HTPF as well as the interpretation of stimulation injections similar to LOT procedures.

Spessart

Another considerable dataset is located on the northern boundary of the Spessart mountains at about 50.03\(^\circ\) N, 9.67\(^\circ\) E and 50.2\(^\circ\) N, 9.51\(^\circ\) E (well identifiable in Fig. 5b). It includes 26 HF data records published by Rummel et al. (1983) along with a detailed explanation of testing and calculation approaches. The measurements were carried out in three boreholes up to a depth of nearly 450 m below surface. Ten of the data records are rated as E-quality due to their depth being shallower than 100 m below surface. There is also one overcoring record from Rummel and Baumgärtner (1982) included in the database, although the referenced primary data source is currently not available.

Aachen

The distinct circle at 50.78\(^\circ\) N, 6.08\(^\circ\) E (Fig. 5b) originates from an unsuccessful geothermal project in Aachen (Trautwein-Bruns et al. 2010). It marks an application of the borehole failure approach after Zoback (2007), where a linear behaviour of stress magnitudes with depth is assumed. Therefore, the 25 data records do not each contain independent information and the impression of the large circle diameter in Fig. 5b might be deceptive.

Dittingen/Laufen

On Swiss territory, two large circles stand out in Fig. 5b at about 47.43\(^\circ\) N, 7.5\(^\circ\) E at the site Dittingen/Laufen. These datasets lie close together, which makes them stand out even more on the map. The stress magnitude data are gained through 14 overcorings and 35 borehole slotter measurements (Becker and Werner 1994). The large number of data records is due to repeated measurements performed in close succession with depth. However, most of the corresponding data records are rated as E-quality due to the shallow depth (<10 m). Only five borehole slotter data records were taken in 30 m depth below surface, which were registered separately without taking mean values to enable consideration of the increasing distance to the quarry wall, which was noted in the comments. They were barely rated as D-quality.

Discussion

The first public stress magnitude database for Germany and adjacent areas currently contains 568 data records. The large amount of data is encouraging as it shows the general availability of published stress magnitude data. The stress magnitude database, combined with the stress orientation database of the WSM project, significantly improves our ability to address geoscientific questions in general and practical geotechnical applications that require knowledge of the 3-D stress state in particular. However, most of the reliable data is on the \(S_\text {hmin}\) magnitude. Yet, \(S_\text {Hmax}\) magnitudes are also essential to estimate the differential stress in strike–slip or thrust faulting stress regime where \(S_\text {Hmax}\) is the largest principal stress. Furthermore, even though the stress magnitude database with its quality ranking is a major step forward, the distribution of quality of the data records in Fig. 8 shows that only about one-fourth of the data records are considered reliable, i.e. of A- to C-quality.
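Why the \(S_\text {Hmax}\) magnitude matters for the differential stress can be illustrated with a short Python sketch. It assumes an Andersonian stress state, i.e. that \(S_\text {V}\), \(S_\text {Hmax}\) and \(S_\text {hmin}\) are the three principal stresses; the function names are illustrative and all values must share one unit (e.g. MPa).

```python
def stress_regime(s_v, s_hmax, s_hmin):
    """Classify the stress regime, assuming S_V is a principal stress."""
    if s_hmax < s_hmin:
        raise ValueError("S_Hmax must not be smaller than S_hmin")
    if s_v >= s_hmax:
        return "normal faulting"  # S_V is the largest principal stress
    if s_v >= s_hmin:
        return "strike-slip"      # S_Hmax largest, S_hmin smallest
    return "thrust faulting"      # S_Hmax largest, S_V smallest


def differential_stress(s_v, s_hmax, s_hmin):
    """Return sigma_1 - sigma_3 for the three principal stresses."""
    principal = (s_v, s_hmax, s_hmin)
    return max(principal) - min(principal)
```

Only in a normal faulting regime can the differential stress be obtained from \(S_\text {V}\) and \(S_\text {hmin}\) alone; in the other two regimes the result depends on \(S_\text {Hmax}\).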

The quality ranking scheme for stress magnitude data

The suggested quality ranking scheme for the \(S_\text {hmin}\) magnitude data is not based on uncertainties, e.g. by means of a standard deviation. These cannot be estimated for most data records as they are in most cases single measurements. Also, systematic errors (epistemic uncertainties) for individual stress indicators are difficult to assign as they strongly depend on the equipment, the technical handling during the measurement, the information that is recorded and provided, and last but not least the local geological setting in which the measurement was made. Another issue is the question of how representative the point data are for a volume that is several orders of magnitude larger than the probed rock volume. Thus, the quality ranking is a first guidance for the reliability of the data records. In principle, the quality ranking scheme shall provide a universal assessment basis that works independently from specific data or specific areas. Nevertheless, it is essential to study the individual setting and information for each data record in the volume of interest as given in the paper or report and to bring this into the context of the question to be addressed. For example, Seithel (2019) performed a plausibility check of stress magnitude information derived from FITs and LOTs to study the reliability of the individual data records from the same lithology in the Bavarian Molasse. He identified that some data records are of poorer quality due to epistemic uncertainties, which would not have been detected if only our proposed quality ranking scheme had been applied. However, this plausibility approach is only possible when a sufficiently large dataset in the region of interest exists and thus it cannot be integrated in a quality ranking scheme which considers data records individually. Furthermore, the plausibility check is made under the assumption of a somewhat homogeneous material distribution within the lithology. This cannot be guaranteed to be the case.

It remains to be noted that a quality ranking is always based on expert elicitation, which involves, at least to some extent, a subjective choice of the boundaries between the qualities. However, our proposed scheme provides a sound approach to comparing data records from a wide range of very different stress magnitude indicators. In this sense, our compilation strategy and quality assessment are consistent with the WSM database for \(S_\text {Hmax}\) orientation. It is purely data-driven, does not follow any hypothesis, nor does it provide any interpretation. Interpretation is the responsibility of the users, who must make sure that the information used is appropriate for their purpose.

Possible applications of stress magnitude data

A direct application of the stress magnitude database with its assigned qualities is its usage in the course of forward modelling of the initial stress state (Fischer and Henk 2013; Henk 2009; Lecampion et al. 2018). Given that the stress information is sparse, unevenly distributed and incomplete, the only way to achieve a continuous description of the 3-D stress tensor in the area of interest is by means of geomechanical-numerical modelling. The reliability of the model prediction depends mainly on the quality of the underlying static 3-D geological models, rock properties and stress magnitude data that is used for model calibration (Hergert et al. 2015; Rajabi et al. 2017b; Reiter and Heidbach 2014; Ziegler and Heidbach 2020; Ziegler et al. 2016). Here the German stress magnitude database provides a public compilation for the first time and the assigned qualities of the individual data records can be used as weights during the model calibration procedure. Ziegler (2018) provides the tool FAST Calibration v1.0, which speeds up the model calibration and allows weights to be used for individual stress magnitude data.
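How quality ranks could enter a calibration as weights is sketched below with a weighted root-mean-square misfit between observed and modelled \(S_\text {hmin}\) magnitudes. This is a generic illustration and not the interface of FAST Calibration; the weight values in `QUALITY_WEIGHTS` are hypothetical, since the paper does not prescribe numerical weights, and the function name is an assumption.

```python
import math

# Hypothetical weights per quality rank (not prescribed by the paper).
QUALITY_WEIGHTS = {"A": 1.0, "B": 0.75, "C": 0.5, "D": 0.25, "E": 0.0, "X": 0.0}


def weighted_rms_misfit(observed, modelled, qualities, weights=QUALITY_WEIGHTS):
    """Weighted RMS difference between modelled and observed magnitudes."""
    if not (len(observed) == len(modelled) == len(qualities)):
        raise ValueError("all input sequences must have the same length")
    w = [weights[q] for q in qualities]
    w_sum = sum(w)
    if w_sum == 0:
        raise ValueError("at least one record needs a non-zero weight")
    # Each squared residual contributes in proportion to its record weight.
    squared = sum(wi * (m - o) ** 2 for wi, o, m in zip(w, observed, modelled))
    return math.sqrt(squared / w_sum)
```

With this choice, records of E- or X-quality do not influence the misfit at all, while an A-quality record counts four times as much as a D-quality record.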

Differences to the stress tensor orientation WSM database

The WSM project has its focus on the systematic compilation of the reduced stress tensor orientation by means of the \(S_\text {Hmax}\) orientation. Thus, the stress orientation map for Germany and adjacent regions from Reiter et al. (2016), as a part of the WSM project, provides only information on the \(S_\text {Hmax}\) orientation in these areas.

The backbone of the WSM stress orientation compilation is its quality ranking scheme, which makes it possible to compare stress orientation information from different stress indicators that represent very different rock volumes (Heidbach et al. 2010; Ljunggren et al. 2003; Sperner et al. 2003; Zoback and Zoback 1991; Zoback 1992; Zoback and Zoback 1989). Herein, we extend the data compilation by stress magnitudes and present a quality ranking scheme for stress magnitude data.

The WSM quality ranking scheme for \(S_\text {Hmax}\) orientations is to a large extent based on standard deviations and often defaults to a mean \(S_\text {Hmax}\) orientation averaged over a larger volume (e.g. from earthquake focal mechanisms) or along a depth profile (e.g. the orientation of the BOs and DIFs is a mean orientation of the borehole sections along which they are observed). Considering the mean value over depth is reasonable since the \(S_\text {Hmax}\) orientation in most cases shows variation only within the uncertainty of the observations except for areas where mechanical decoupling, significant lateral density and stiffness contrast, fracture systems or faults are present (Heidbach et al. 2007; Pierdominici and Heidbach 2012; Roth and Fleckenstein 2001; Rajabi et al. 2017b; Tingay et al. 2005a, 2009; Yale 2003).

In contrast to the WSM stress orientation database, the compilation of stress magnitude data has two major differences that have to be noted:

First, stress magnitudes change not only with depth, but also with lithology (Warpinski 1993, 1989; Warpinski and Teufel 1991). Although Evans et al. (1999) stated the representativeness of linear functions of magnitudes with depth, no gradients at one site should be assumed as standard practice because stress magnitudes depend on the elastic properties of the encountered rock (Evans et al. 1989a, b; Gunzburger and Cornet 2007; Gunzburger and Magnenet 2014; Hergert et al. 2015; Meixner et al. 2014; Nelson et al. 2006; Warpinski and Teufel 1987; Wileveau et al. 2007). If a measurement campaign has an economic background such as the exploration of hydrocarbon resources, the available stress information is typically limited to the lithology of the operator’s interest and does not represent the variety of lithologies at a given site. Depending on the geologic history, lithologies might be heterogeneously distributed with depth even within one formation. Of course, the database can be used to create stress gradients for different applications since the depths of the pointwise magnitude information are part of the records. However, we recommend not using gradients, or using them with caution and only for analysis at appropriate scales (see e.g. Shen et al. 2019b). For the same reason, quantifying the goodness of linear fit is not necessarily appropriate. We therefore compile only pointwise information rather than gradients or mean values over depth ranges. Thus, a standard deviation is only available in very few locations, e.g. where a hydraulic fracturing measurement was repeated in the same or very similar depth and lithology or measurements of the same pressure value are available for several cycles. Since such information is very rare, the standard deviation cannot be used for a quality ranking scheme of stress magnitude data. Conversely, it is not appropriate to derive a standard deviation from the assigned quality. Similarly, mean values of measurements from different depths are not reasonable either.

The second major difference compared to the WSM stress orientation database is that the variety of informative value for each data record is much larger. Some indicators such as hydraulic fracturing provide actual stress magnitudes while others provide only upper or lower bounds of the stress magnitudes, partly based on certain simplifying assumptions (see e.g. FITs in the chapter Overview of Methods of Stress Magnitude Estimation).
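Regarding the first difference, the rare cases in which a standard deviation is meaningful can be identified by grouping records that share site, lithology and nearly the same depth. The following sketch is an illustration only: the record layout (tuples of site, lithology, depth in metres and magnitude), the function name and the depth tolerance of 1 m are assumptions, and the simple depth binning can separate two measurements that lie just on either side of a bin edge.

```python
from collections import defaultdict
from statistics import stdev


def repeat_standard_deviations(records, depth_tolerance_m=1.0):
    """Sample standard deviation for groups of repeated measurements.

    records: iterable of (site, lithology, depth_m, magnitude) tuples.
    Only groups with at least two measurements yield a standard deviation.
    """
    groups = defaultdict(list)
    for site, lithology, depth_m, magnitude in records:
        # Records in the same depth bin count as "very similar depth".
        depth_bin = round(depth_m / depth_tolerance_m)
        groups[(site, lithology, depth_bin)].append(magnitude)
    return {
        key: stdev(values)
        for key, values in groups.items()
        if len(values) >= 2  # single measurements have no standard deviation
    }
```

Single measurements, which make up most of the database, simply do not appear in the result, which mirrors the reason why no standard deviation enters the quality ranking.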

Outlook

From a technical perspective, the stress magnitude database for Germany and adjacent regions is in an initial phase. Currently the database consists of an ASCII table as provided in the supplementary material (Additional file 1). Indeed, the presented compilation will benefit from the ongoing development of a PostgreSQL-based implementation of the WSM stress orientation database. This new technical framework for the WSM database also includes an extension towards stress magnitude data and will allow users to access and query the database (stress orientations and stress magnitudes) through a browser-based user interface. Accordingly, the German stress database along with its proposed quality ranking scheme for stress magnitude data will serve as a blueprint for a global compilation of stress magnitude data. The transfer into a global compilation will benefit from the integration of other types of stress magnitude indicators and further development or refinement of the quality ranking scheme. Still, the basis of that global concept is already exemplified with the presented German stress magnitude database and potential contributors are asked to provide not only stress magnitude data for open access, but also all relevant information associated with their acquisition.