1 Introduction

Earthquakes induced by subsurface fluid injection have become a significant issue for some industrial activities, including wastewater disposal (WWD), hydraulic fracturing (HF), engineered geothermal stimulation (EGS), natural gas storage (NGS) and carbon capture and storage (CCS). As the use of these technologies has increased in scale and become more widespread, more cases of injection-induced seismicity have been identified. Given the potential for damage from larger induced earthquakes and the public concern that arises even when relatively small events are felt at the surface (e.g. Cremen and Werner 2020), there has been a growing need for decision-makers to develop strategies and regulations to limit or prevent the occurrence of induced seismicity.

The primary method used to manage induced seismicity has been traffic light schemes (TLSs). This concept was first developed by Bommer et al. (2006) to be applied to an enhanced geothermal project in El Salvador. In general, TLSs are created by defining “amber” and “red” thresholds. If the levels of seismicity are below the amber threshold then the industrial activity continues as planned. If the amber threshold is exceeded then the operator takes steps to mitigate the levels of seismicity (for example by reducing injection rates or pressures). If the red threshold is exceeded then operations are stopped.

TLSs are conceptually simple: decisions to reduce or stop injection are taken based on the magnitude of the largest event that has occurred (or the ground motions that this event has caused). TLSs are easy to explain to the general public, they are immune to model-based assumptions or parameters, and they do not require expert judgement or interpretation once the TLS thresholds have been set.

Rational management of induced seismicity requires the identification of an appropriate risk threshold—a level of seismicity that should be minimised or prevented. This threshold can be developed by considering expected ground motions from shallow earthquakes and the exposure and vulnerability of nearby populations, buildings and infrastructure (e.g. Langenbruch et al. 2020; Schultz et al. 2020, 2021). This threshold may be very different in different situations (e.g. Kendall et al. 2019).

TLSs are retroactive: decisions are made after an event has occurred. This means that TLS thresholds must be designed by considering the potential for “magnitude jumps” and “trailing seismicity” to occur, as shown in Fig. 1. A magnitude jump describes the situation where an event occurs that is significantly larger than any previous induced seismicity during a given injection operation. Trailing seismicity describes the situation where earthquakes continue to occur and magnitudes continue to increase after injection has stopped (e.g. Kettlety et al. 2021). The occurrence of large jumps in magnitude means that seismicity could significantly exceed the red-light threshold before any mitigating action is taken. The occurrence of trailing seismicity means than magnitudes may continue to grow even after injection is stopped, which again means that seismicity could significantly exceed the red-light threshold at which injection was stopped.

Fig. 1
figure 1

Schematic depiction of TLS operation and the potential issues caused by magnitude jumps and trailing events. In a, magnitudes evolve gradually through the amber and then red-light thresholds. When injection is stopped after the red-light event, magnitudes drop rapidly back to “green” levels. This behaviour is implicitly required for the safe and effective functioning of a TLS. In such situations, the red-light threshold could be set at a level that is close to the risk-based target which the TLS seeks to mitigate against. In b, event magnitudes are initially low, but a sudden jump in magnitudes produces an event that exceeds the red-light threshold. Although injection is stopped, trailing events continue to increase in magnitude for a period after injection, before eventually dropping away. In such situations, the largest event that occurs is significantly larger than the red-light threshold and a large gap between the red-light threshold and the risk-based target may therefore be required

As a result the red-light threshold should be set below the magnitude that the TLS is designed to avoid. The appropriate gap between the red-light level and the risk-based threshold will be governed by the extent to which trailing seismicity and magnitude jumps occur (Verdon and Bommer 2021a, for brevity, VB21 hereafter).

VB21 investigated these effects for HF-induced seismicity, finding that magnitude jumps were typically below 2 magnitude units and that trailing event magnitude increases occurred in approximately one quarter of cases, with the largest observed trailing event increase of 1.5 magnitude units. As such, they recommended that TLS red-light thresholds be set 2 magnitude units below the risk-based target that the TLS seeks to avoid.

In this study, we reprise the analysis presented by VB21 for cases of WWD-induced seismicity, and other cases where seismicity has been generated by low-pressure, long-term (LPLT) injection of large fluid volumes, such as NGS and CCS. To date, the WWD industry has not typically been regulated using TLSs, unlike HF and EGS. Nevertheless, many countries are planning a growth in the use of technologies such as NGS, CCS (e.g. BEIS 2018) and even hydrogen storage (e.g. BEIS 2021) in order to meet emissions targets and improve their energy security, and we anticipate that regulators may implement TLSs to manage the induced seismicity hazard posed by such developments.

WWD, CCS and NGS represent a long-term subsurface perturbation, where large volumes of fluid are injected (large WWD projects may inject millions of m3), but at a comparatively slow rate, with low injection pressures. WWD, CCS and NGS sites typically operate for years or even decades. This contrasts with HF, which represents a very short-term perturbation, where a relatively small volume of fluid is injected rapidly, at high pressure. Typically, HF is completed in a few days, and large-scale HF may inject a few tens of thousands m3 per well.

We might therefore expect the temporal evolution of event magnitudes induced by LPLT fluid injection to behave in a very different manner to HF-induced seismicity. It might alternatively be argued that induced earthquake magnitudes are controlled by the properties of the pre-existing tectonic faults on which they occur (e.g. van der Elst et al. 2016), and so the temporal evolution of magnitudes could be relatively independent of the triggering process (whether short-term high-pressure HF, or LPLT injection).

2 Methods and analysis

We compile datasets where seismicity has been induced by WWD, NGS and CCS. Our primary objective is to assess the extent to which magnitude jumps and trailing event magnitude increases have occurred. Magnitude jumps, ΔM, represent the gap, in magnitude units, between the largest event in the sequence at a given time, and the magnitudes of any events preceding that event.

$$\Delta M={M}_{i}-\mathrm{max}({M}_{\left[1,i-1\right]})$$
(1)

where Mi is the magnitude of the ith event, and M[1,i−1] are the magnitudes of all preceding events, and ΔM is only computed when Mi > max(M[1,i−1]). Our focus is on the largest magnitude jump, ΔMMAX = max(ΔM). Magnitude jumps can occur at any point within a sequence and ΔMMAX may not be the jump that produces the largest event.

Detection thresholds vary significantly between our case studies, and in some cases an apparent magnitude jump may in fact represent a constraint imposed by a lack of event detection. Limited detection thresholds are often apparent at the start of event sequences, whereas dedicated local monitoring arrays may be deployed, either by operators, regulators or academics, once an event sequence reaches larger magnitudes. As such, we only consider magnitude jumps that occur above a magnitude of M 2.5.

We define trailing events as increases in magnitude that occur after injection has ceased, and the maximum trailing event increase as:

$$\Delta {M}_{T}=\mathrm{max}\left({M}_{T>{t}_{E}}\right)-\mathrm{max}\left({M}_{T<{t}_{E}}\right)$$
(2)

where MT>tE are events occurring after the end of injection, and MT<tE are events occurring before the end of injection, and only positive ΔMT values are considered. Schultz et al. (2022) provide observational evidence that trailing seismicity follows Båth’s law for earthquake aftershocks, with ΔMT being proportional to the relative numbers of earthquakes during and after the injection phase.

For areas with a large number of injection wells, it can be difficult to identify specific causative wells, and therefore to establish when the influence of injection has stopped (at which point any further seismicity can be considered to be “trailing”), especially when some wells may inject for long periods before and after the sequence in question. This situation is more complex than hydraulic fracturing cases, where bursts of seismicity are more easily linked to specific wells that have brief, short-lived injection periods. Details of the wells used in our analysis with respect to injection start and stop times for each sequence are provided in the supplementary materials , but in some cases the decision to include or exclude a particular well as a contributor to a given seismic sequence is somewhat arbitrary.

The Gutenberg and Richter (1944) b-value (GR hereafter) is also a significant factor when assessing the hazard posed by induced seismicity with low b-values indicating an increased likelihood of larger magnitude events. For tectonic seismicity, b-values are commonly observed to be close to 1.0 (Frohlich and Davis 1993). However, it has been suggested that seismicity driven directly by fluid processes may produce higher b-values, implying a higher number of small events relative to the number of large events (e.g. Wyss et al. 1997; Kettlety et al. 2019). Therefore, we compute b-values for all of our datasets using the Aki (1965) maximum likelihood method and a Kolmogorov–Smirnov test to estimate the minimum magnitude of completeness, MMIN (Verdon 2013).

We also analyse the time lags between the start of injection and the onset of seismicity. The observed lags prior to induced seismicity onset can inform monitoring strategies at future sites. Again, the lag between injection and seismicity may be constrained by the detection thresholds of monitoring arrays (since small, early events may be missed) and by appropriate identification of which wells were responsible for the seismicity (and so when the start of injection took place).

In their compilation of induced seismicity sequences, van der Elst et al. (2016) argued that induced seismicity is driven primarily by regional tectonic conditions, and hence the largest event could occur at any time within a sequence. This concept poses a potential challenge for the successful use of TLSs, since TLSs implicitly depend on the assumption that magnitudes increase progressively as injection continues. The concept proposed by van der Elst et al. (2016) implies that the largest events could occur relatively early within a sequence before any mitigation actions could be taken. Van der Elst et al. (2016) tested this assertion by examining the position of the largest event within each sequence as measured by the ratio PMmax = Nprior/N, where Nprior is the number of events before the largest event, and N is the total number of events larger than the magnitude of completeness. They found no evidence to indicate that the largest event was not positioned at random within each sequence. However, subsequently both Skoumal et al. (2018) and VB21 compiled HF-induced seismicity cases and found statistically significant evidence that the largest events were positioned later within their respective sequences. The positioning of larger events later within sequences is consistent with rates and magnitudes of induced seismicity being correlated with injection volumes (e.g. Hallo et al. 2014) and, if so, larger events become more likely as the injected volume increases.

Using geomechanical simulations, Kroll and Cochran (2021) argued that the occurrence time of larger events will be determined by the stress conditions. Where stresses are further from the failure criteria, faults will reactivate slowly and larger-magnitude events will occur towards the end of the sequence, whereas when stresses are close to failure, runaway rupture will occur quickly, with large events occurring at the start of the sequence.

Here, we compile PMmax values for each LPLT injection-induced sequence. To mitigate the effects of varying detection thresholds in time for some sequences, we only evaluate the numbers of events above an MMIN threshold that is appropriate for the entire timespan of the available catalogue. We use a χ2 test to evaluate whether the observed PMmax values are consistent with the random positioning of the largest event within each sequence.

Finally, various studies have recognised the potential to use seismicity forecasting models to guide decision-making at industrial sites (e.g. Kwiatek et al. 2019; Clarke et al. 2019; Cao et al. 2020; Mancini et al. 2021). The most common forecasting approach is to adopt some sort of scaling between injection rates and seismicity rates that can be extrapolated to forecast what may happen as injection continues. However, this approach is challenging as injection rate data is not readily available for all of our case study sites. To circumvent the necessity for injection rate data, VB21 used the next record breaking event (NRBE) approach developed by Cao et al. (2020) to evaluate whether magnitude jumps, and thereby the largest magnitude events, could be forecasted. The NRBE approach is based on record-breaking statistics theory (Tata 1969; Mendecki 2016), computed from the observed seismicity alone, and so does not require injection data. For each sequence, we compute the temporal evolution of modelled maximum magnitude, MMMAX, and compare the values of MMMAX to the actual largest event, MMAX, in order to assess the degree to which the NRBE approach could provide useful magnitude forecasting.

3 Induced seismicity case histories

The first case of injection-induced seismicity to have been identified and studied was generated by WWD at the Rocky Mountain Arsenal, Colorado, in the mid-1960s (Evans 1966; Healy et al. 1968). Further cases of WWD-induced seismicity were identified in the USA over the following three decades, including at Perry, Ohio (Nicholson et al. 1988); Ashtabula, Ohio (Seeber et al. 2004); the Raton Basin, Colorado (Meremonte et al. 2002) and Paradox Valley, Colorado (Ake et al. 2005). It is important to note that the overwhelming majority of WWD wells have not caused induced seismicity (Verdon 2014; Walsh and Zoback 2015).

In the past two decades, the volumes of wastewater being re-injected into the subsurface in the mid-continental USA have increased substantially (Keranen et al. 2014). This increase has been driven primarily by changes in hydrocarbon production methods with a move towards reservoirs with higher water fractions and the need to dispose of flow-back fluids from hydraulic fracturing (Rubenstein and Mahani 2015). The increase in WWD in the mid-continental USA has driven significant amounts of induced seismicity in Oklahoma (Weingarten et al. 2015), Kansas (Schoenball and Ellsworth 2017), East Texas (e.g. Frohlich 2012), the Permian Basin in west Texas (e.g. Skoumal et al. 2020a, b) and Arkansas (e.g. Horton 2012). In Oklahoma, the largest events have exceeded M 5.5, and this region has been the focus of much WWD-induced seismicity research. Cases of WWD-induced seismicity have also been observed in the Appalachian Basin (e.g. Kim 2013; Brudzinski and Kozłowska 2019) and the Western Canada Sedimentary Basin (WCSB) (e.g. Schultz et al. 2014).

In some cases, especially where both HF and WWD takes place in close proximity, it can be challenging to disentangle the causes of induced events (e.g. Yoon et al. 2017; Skoumal et al. 2018), since a reliance on simple spatial and temporal coincidence may lead to erroneous associations (e.g. Verdon and Bommer 2021b), and more detailed study of specific sequences may be required (e.g. Verdon et al. 2019; Grigoratos et al. 2022).

WWD and NGS-induced seismicity has been less common on other continents. In Europe, NGS at the Castor site, off the coast of Valencia, Spain, caused three M > 4.0 events in 2013, which led to the closure of the project (Cesca et al. 2021). Moderate magnitude induced seismicity (M < 3.0) was also generated by a high-rate WWD well in the Val d’Agri oilfield in southern Italy (Improta et al. 2015). In South America, WWD-related induced seismicity occurred in the Quifa and Rubiales oilfields in the Llanos Basin, Colombia (Molina et al. 2020), where the largest events have exceeded M > 4.0. In the Sichuan Basin, China, induced seismicity has primarily been associated with hydraulic fracturing (e.g. Lei et al. 2019). However, cases of WWD-induced seismicity have also been identified, including the Kongtan M 5.4 sequence (Lei et al. 2020) and sequences in the Rongchang (Wang et al. 2020) and Huangjiachang (Lei et al. 2013) gas fields. Induced seismicity has also been linked to the Hutubi NGS facility in Xinjiang (Tang et al. 2018).

Table 1 lists the WWD and NGS case studies compiled in our analysis. These sites were selected based on several criteria. Firstly, that magnitudes of the induced events exceeded M 3.0. Secondly, that earthquake catalogues were publicly available and contained a sufficient number of events (N > 30) to compute a valid b-value and MMIN. We sourced event catalogues either from published studies or via public networks such as TexNet (Savvaidis et al. 2019), the Oklahoma Geological Survey (Walter et al. 2020) and the USGS Comprehensive Earthquake Catalogue (ComCat). Thirdly, we sought sequences of seismicity where the onset (of both injection and seismicity) can be clearly identified. This can be challenging for WWD-induced seismicity, since high volume wells can have an influence across a wide area (Goebel et al. 2017). When considering the interacting effects between different wells in areas such as Oklahoma or west Texas, it can be difficult to isolate a particular “sequence” with a clear onset (from which subsequent magnitude jumps and delay times can be assessed), unless the entirety of WWD-induced seismicity in a particular basin were to be considered as a single sequence. Finally, in order to examine seismicity behaviour relative to the start and end of injection, we require information about, at the very least, when nearby injection wells were active. Again, given the potential areas of influence for WWD, this may require information about a significant number of wells in proximity to a given sequence. A more complete description of each case study, including the source of each event catalogue and injection start/end times, is given in the Supplementary Material. Digital event catalogues are also provided in the Supplementary Material.

Table 1 List of case studies used in our analysis, including the year that seismicity began, the largest event size, the catalogue size (N), b-values, magnitude jumps, trailing events, PMmax, the time of onset relative to the start of injection and the time of the largest event relative to the start of injection

4 Results

Table 1 lists the observations for each identified induced seismicity sequence. Figures 2, 3, 4, 5 and 6 summarise our observations across these parameters.

Fig. 2
figure 2

Histogram (bars) and cumulative distribution (circles) of the observed largest magnitude jumps for each of the case studies listed in Table 2

Table 2 Best-fit and 95% confidence intervals for the log-normal distributions fitted to the LPLT, HF and synthetic ΔMMAX distributions shown in Fig. 7
Fig. 3
figure 3

Histograms (bars) and cumulative distributions (circles) of the time between the onset of injection and a the first observed event and b the largest observed event

Fig. 4
figure 4

Histogram of b-values (a) and a cross-plot of b-values against the number of events within each event population (b)

Fig. 5
figure 5

Histogram of PMmax (a) and a cross-plot of PMmax against the number of events within each event population (b)

Fig. 6
figure 6

Cross-plot of observed and modelled maximum magnitudes (a), coloured by the number of events in the population, and the different between observed and modelled magnitudes as a function of the number of events (b)

4.1 Magnitude jumps

Figure 2 shows the distribution of ΔMMAX for our case studies. We find that 75% of our cases have magnitude jumps of less than 1 magnitude unit, while only one sequence, in Pawnee, Oklahoma (Walter et al. 2017) had ΔMMAX > 2.0 (ΔMMAX = 2.1). Broadly speaking, the distribution of ΔMMAX for LPLT injection is similar to that observed for HF-induced seismicity, where ΔMMAX < 1 in 60% of cases, and ΔMMAX > 2 in less than 10% of cases (VB21). However, the distribution of ΔMMAX for HF-induced seismicity is shifted to slightly higher values in comparison to LPLT injection.

4.2 Trailing events

Only three sequences (11% of cases)—Castor, Youngstown and Dallas-Forth Worth—have produced trailing event magnitude increases. This is significantly fewer than for HF-induced seismicity, where 25% of cases produced trailing event magnitude increases. It is worth mentioning, however, that for the HF-induced sequences, either injection was stopped in direct response to large magnitude events or was stopped soon after large events as the stimulation operations had been completed. In contrast, for most of the LPLT injection cases injection was not terminated in response to large-magnitude events, so there is no possibility for trailing events since we define these as increases in magnitude occurring after injection is stopped. We cannot rule out the possibility that for some case studies, had injection been stopped seismicity would have continued increasing in magnitude for a period of time, generating trailing event magnitude increases. At the Castor and Youngstown sites, injection was terminated immediately in response to moderate-magnitude earthquakes and there were significant trailing event magnitude increases. The difference in observed trailing event increases between HF and LPLT injection-induced seismicity simply reflects the definition for ΔMT that we have used and the different timescales for HF versus LPLT injection.

VB21 observed that for HF-induced seismicity trailing events typically continued to increase in magnitude for < 1 week, with the longest example being 3 weeks. For the two LPLT injection cases with significant trailing event magnitude increases, the largest events occurred within 1 day (Youngstown) and 2 weeks (Castor) post injection, indicating that similar timescales for event magnitudes to subside might be expected for LPLT injection-induced seismicity. However, further cases where LPLT injection was immediately stopped in response to a large-magnitude event would be required to properly test this inference. As in VB21, we did not find any cases which experienced both a significant magnitude jump during injection, followed by significant trailing event increases after injection.

4.3 Timescales for event initiation

Figure 3 shows histograms and cumulative distributions of time lags from the onset of injection to the first observed event and to the largest event. There is a broad range in time lags with some sites generating seismicity within a few days of injection, while other sites had time lags of up to 15 years between injection and the first observed earthquake, and some with time lags of over 25 years between the onset of injection and the largest event.

The variability in onset time likely reflects the different pore pressure and geomechanical conditions (i.e. locations of faults, stress conditions) at specific sites. In general, pore pressures will increase continually as cumulative injection volumes increase (until a steady-state is reached), and the lateral extent of the pore pressure pulse will widen with time. Hence, a critically stressed fault located very close to an injection well could be reactivated very soon after the onset of injection, whereas it will take longer for a perturbation to reach faults that are located further from the injection well. Likewise, some faults will be further from their failure state, requiring a larger pore pressure change and hence a longer time before reactivation can take place.

These results show that, while most sites that experience seismicity begin to do so within 2–3 years of injection, longer time lags can and do occur. This situation contrasts with the timescales of seismicity induced by hydraulic fracturing, which tend to appear within a few days of injection. This is because for hydraulic fracturing, fluids are injected at pressures above the tensile failure threshold for intact rock and so are, by definition, well above the pressure needed to generate shear slip on well-orientated pre-existing faults. Hence, for hydraulic fracturing well-orientated faults can reach their failure criteria quickly after the onset of injection.

4.4 b-values

Figure 4 shows a histogram of b-values and the b-values as a function of the number of events in each case study population. The observed b-values range from 0.6 ≤ b ≤ 1.2, with a mean and median slightly lower than 1.0, which is the typical value observed for tectonic seismicity (Frohlich and Davis 1993). However, in Fig. 4b we see that many of the lower b-values we obtained were from catalogues with a relatively small number of events. The precision of b-value estimates is affected by the number of events used in the calculation (e.g. Roberts et al. 2015), with b-value estimates for low numbers of events having higher uncertainties. As the numbers of events in the catalogues used to compute b-values exceeds 1000, the observed b-values fall within the range 0.8 ≤ b ≤ 1.2.

It is interesting that the b-values we obtain for small catalogues are systematically lower than expected, rather than having any cases with elevated b-values. One possibility is that many of the catalogues used in this study have been obtained by matched filter earthquake detection (e.g. Frohlich et al. 2014; Skoumal et al. 2014). Matched filters function by identifying earthquakes with similar waveforms to reference events, which are typically selected from the largest events in a sequence. Hence, matched filters may fail to detect small events that have different locations or source mechanisms than the reference events, leading to a systematic loss of smaller events that would produce an apparent decrease in b-values.

For larger datasets, our b-values trend towards 1.0, as is the case for tectonic earthquakes. Rodríguez-Pradilla et al. (2022) found similar observations when they examined HF-induced seismicity in the WCSB. Our interpretation is that, whereas microseismicity driven by injection-induced fracturing may often have elevated b-values, once pre-existing tectonic faults begin to be reactivated, leading to larger, felt seismic events, the b-values also return to similar values to those observed for tectonic earthquakes. Detailed microseismic observations at hydraulic fracturing sites have revealed such b-value variations in practice (Kettlety et al. 2019, 2021). Hence, it is reasonable to assume b-values of 1.0 when assessing the potential hazard posed by induced seismicity.

4.5 Random occurrence of M MAX events?

We assess whether the largest event within each sequence occurs at a random using the approach of van der Elst et al. (2016), computing PMmax for each case and then examining whether there are any systematic trends. Figure 5 shows a histogram of PMmax for each case study and the values of PMmax as a function of N. We find that there is no correlation between PMmax and N, indicating that our results are not biased by the relative lack of detection in some datasets. A failure to detect small events early within a sequence could otherwise impose a bias to low PMmax values.

The histogram shown in Fig. 5a shows that the majority of sequences have PMmax < 50%, implying that the largest events occur within the first half of the sequence. This contrasts with van der Elst et al. (2016), who found that PMmax was distributed randomly, and to VB21, who found that MMAX was systematically shifted to the latter half of sequences for HF-induced seismicity. The distribution of PMmax observed here has a statistical significance of p = 17% (using χ2) and so we cannot rule out the null hypothesis that PMmax is distributed randomly. However, the trend for larger events to occur relatively early in the sequence is intriguing. From the observed magnitude jumps (Fig. 2) and times to seismicity onset (Fig. 3), this does not represent large events occurring without precursory activity. Instead, what we observe in many sequences is a build up to the largest event, but then magnitudes stabilising, or even reducing, as injection continues. In some cases, this may be driven by a reduction in injection rates or pressures implemented by an operator in response to induced seismicity, or even a change in injection depth to avoid near-basement layers (e.g. McNamara et al. 2015).

However, another possibility is that as continued injection takes place the available budget of tectonic strain is depleted, reducing the magnitudes of the subsequent events. Large magnitude induced events occur on pre-existing faults and release accumulated tectonic strain energy. For regions with relatively moderate background deformation rates, a handful of induced M > 3.0 events may represent the tectonic strain accumulated over hundreds of thousands of years (e.g. Kao et al. 2018). On geological timescales, the perturbations created by industrial activities are essentially instantaneous—there is no opportunity for tectonic stresses to re-load a fault. As such, once the induced seismicity has used up a significant portion of the accumulated tectonic strain budget, we might expect seismicity rates to reduce as injection continues.

Rodríguez-Pradilla et al. (2022) examined this effect for the WCSB using the seismic efficiency parameter, SEFF, which relates the cumulative seismic moment released to the cumulative injected volume. They found that in areas prone to HF-induced seismicity, rates of seismicity that were initially elevated began to decrease as further injection continued, which they interpreted as showing that the available tectonic strain budget had been depleted. It is possible that the trend we observe here for the PMmax to fall in the earlier portion of event sequences is driven by a similar effect. However, further detailed study of seismicity rates as a function of injection volumes is required in order to assess this hypothesis.

Finally, we note that if the PMmax values shown here for LPLT injection cases are combined with the PMmax values observed by VB21 for HF cases, then the resulting distribution has nearly equal proportions of PMmax < 50% (24 cases) and PMmax > 50% (27 cases). Hence, the apparent randomness of PMmax observed by van der Elst et al. (2016) could be a result of the fact that their study combined HF and LPLT injection cases into a single analysis, whereas the combined observations presented here and by VB21 indicate that these different activities produce different PMmax distributions.

4.6 Forecasting magnitude jumps

We use the NRBE approach (Cao et al. 2020) to evaluate whether magnitude jumps, and thereby the largest magnitude events, can be forecasted. The maximum expected magnitude jump, ΔMMAX, is determined from the observed event population with:

$$\Delta {M}_{MAX}^{M}=2\Delta {M}_{MAX}^{O}-\sum\nolimits_{i=1}^{{n}_{j}-1}\left[{\left(1-\frac{i}{{n}_{j}}\right)}^{{n}_{j}}-{\left(1-\frac{i+1}{{n}_{j}}\right)}^{{n}_{j}}\right]\Delta {M}_{i}^{O}$$
(3)

where ΔMMMAX is the modelled magnitude jump, which can be added to the largest observed event to compute the modelled magnitude of the next record breaking event, MMMAX. The observed population of magnitude jumps is given by ΔMOi, where the subscript i denotes the ith value ordered by size, and ΔMOMAX is the largest observed magnitude jump. In this application, we compute ΔMO values between all earthquakes ordered from smallest to largest, irrespective of occurrence time, meaning that the number of magnitude jumps used in our calculation is nj = N − 1.

Figure 6a compares the largest observed magnitude, MMAX, with the NRBE forecast MMMAX prior to the occurrence of the MMAX event. We note that the majority of cases fall close to or above the dashed line representing MMMAX = MMAX, implying a successful forecast. However, there were 3 cases (Cordel, Milan and Timpson) where the largest event occurred before a sufficient number of events were available to generate a forecast magnitude. Likewise, two of the four cases where MMAX significantly exceeded the forecast magnitude were produced by cases with a relatively small number of total events (Youngstown, N = 282), or a small number of events before the largest event (Prague, N = 136 events before the largest event). Figure 6b shows that as the numbers of events in the catalogue increase, the forecasts generally become more accurate (MMMAX ≥ MMAX). This demonstrates the importance of dedicated local monitoring networks, producing high-resolution catalogues with low detection thresholds, to guide decision-making to mitigate induced seismicity (e.g. Kwiatek et al. 2019; Clarke et al. 2019). However, there are still cases such as the Pawnee sequence where the forecast is inaccurate despite the large number of observed events (N > 1500). Overall, these results follow a similar pattern to that found by VB21—in general, the NRBE method produces accurate magnitude jump forecasts, indicating that the method has utility. However, occasional cases significantly exceed the modelled magnitudes implying that the approach, while useful as a guide to mitigating induced seismicity, cannot be used as an absolute guarantee that larger events will not occur.

5 Discussion

In Fig. 2, we examined the observed distribution of magnitude jumps (ΔMMAX). The null hypothesis posed by van der Elst et al. (2016) is that induced seismicity is driven solely by tectonic factors, and therefore that induced event magnitudes can be treated as if they are generated at random from an underlying GR distribution. Even if magnitudes were generated at random from a GR distribution, we would still expect to see magnitude jumps leading up to the largest event, except in the extreme case where the largest event is the first to occur (which has a likelihood of 1/N). It is therefore worth examining whether our observed ΔMMAX distribution is consistent with what we might expect under the condition that event magnitudes are generated at random.

To do this, we simulate synthetic event sequences where each event magnitude is selected at random from an underlying GR distribution. We generate 1000 synthetic sequences, an example of which is shown in Fig. 7a and b. For each synthetic sequence, we set the b-value at random within a uniform range of 0.7 < b < 1.2, matching the distribution of b-values from our case studies, and we set MMIN = 1.5. For each sequence, we sequentially draw N = 1000 events (close to the mean value of N for our observed cases) from the GR distribution and compute the resulting values of ΔM and thereby ΔMMAX. Figure 7c and d show the resulting distribution of ΔMMAX values produced by random draws from the GR distribution, providing a comparison with the observed ΔMMAX values for this study and for the HF-induced seismicity cases in VB21. We find that ΔMMAX values for the synthetic sequences are systematically larger than those from the HF and LPLT observations.

Fig. 7
figure 7

Comparison of observed and synthetic ΔMMAX distributions. In a, we show a representative sequence of events with magnitudes drawn at random from an underlying GR distribution given in b. The black line in a shows the magnitude jumps generated in the synthetic sequence. In c, we show the distributions and d cumulative distributions of ΔMMAX for our observed LPLT cases (red circles), the HF cases compiled by VB21 (green triangles), and the synthetic populations (grey squares), and the best-fit log-normal distributions (red, green and black lines)

The synthetic and observed populations can be reasonably modelled with log-normal distributions, as shown in Fig. 7c and d. Table 2 lists the best-fit log-normal means and standard deviations for each of the LPLT, HF and synthetic ΔMMAX distributions, and their 95% confidence intervals. The differences in the log-normal means between the observed and the synthetic cases is significant: the ΔMMAX values we have observed for both HF and LPLT injection are lower than expected under the hypothesis that event magnitudes are generated at random from an underlying GR distribution. Instead, these results are consistent with the concept that event magnitudes are increasing through time as the subsurface perturbation increases in size and scale, resulting in smaller magnitude jumps as the sequence evolves.

The ΔMMAX distributions for LPLT injection- and HF-induced seismicity are similar, with HF-induced seismicity having a slightly higher mean and standard deviation, thereby producing more cases at the high ΔMMAX end of the distribution. However, there is significant overlap in the confidence intervals for the HF and LPLT cases. The similarities in ΔMMAX distributions for LPLT injection and HF-induced seismicity suggest that the tectonic conditions and frictional properties of the faults themselves have a significant control on the growth and escalation of event magnitudes. VB21 produced circumstantial evidence that large magnitude jumps might be more common on faults with high shear stresses, and Kettlety et al. (2021) have shown that, for similar faults subjected to similar perturbations, the escalation of event magnitudes is strongly dependent on the orientation of the faults with respect to the in situ stress field. However, the slightly higher ΔMMAX values observed for HF-induced seismicity may indicate that high-pressure, short duration injection-induced perturbations may produce slightly larger magnitude jumps than low-pressure long-term injection.

As an additional comparison between our observed cases and the synthetic models, we examine the respective distributions of ΔM values (Fig. 8). These distributions are compiled from all ΔM values (i.e. not just the largest ΔMMAX value) for each case study and from each synthetic modelled sequence. Where event magnitudes are independently drawn from a GR distribution, as per our synthetic models, we would expect to observe a similar power-law distribution of ΔM values, with the same exponent, b', as the magnitude distribution (e.g. van der Elst 2021). We find that the exponent for the synthetic ΔM distribution behaves as expected, with a power law fit across 5 orders of magnitude with b' = 0.82, which is within the range of b-values used to generate the synthetic event populations. In contrast, the b' value for the compilation of observed cases is b' = 1.32. This is significantly higher than any of the b-values observed for the individual case studies (Fig. 4).

Fig. 8
figure 8

Cumulative distribution of all magnitude jumps across all of the case study sites (red circles) and those generated by our synthetic modelling (i.e. events drawn at random from a GR distribution), with power-law fits to each distribution (dashed lines)

The elevated b' value of the observed ΔM distribution implies a relative absence of higher ΔM values relative to that expected for events drawn independently from a GR distribution. Again, this is consistent with the concept that event magnitudes increase sequentially as the subsurface perturbation increases in size and scale, resulting in smaller magnitude jumps that would be the case if events were drawn independently from an underlying GR distribution.

5.1 Implications for TLSs for LPLT injection sites

Our results indicate that induced seismic event magnitudes do tend to increase systematically as the subsurface perturbations generated by low-pressure, long-term injection develop. This implies that TLSs could be used as a tool for mitigation of seismicity from such sites. Based on our observed ΔMMAX values, we suggest that for TLSs applied to LPLT injection-induced seismicity, a reasonable red-light threshold should be set between 1.5 and 2 magnitude units below the risk-based threshold that is to be avoided. This is slightly lower than the 2–2.5 unit gap for HF-induced seismicity suggested by VB21, reflecting the slightly lower distribution of ΔMMAX observed for LPLT injection-induced seismicity. These values are similar to the proposed TLS thresholds developed by Schultz et al. (2020, 2021).

6 Conclusions

We anticipate that TLSs will be used to mitigate induced seismicity from future LPLT injection projects such as CCS sites and NGS facilities. Since TLSs are retroactive, an appropriate gap should be set between the red-light threshold and the magnitude of seismicity that the TLS is designed to mitigate. This gap should be based on observations of magnitude jumps and trailing events produced by operations at analogue sites. In this study, we compile datasets of LPLT injection-induced seismicity sequences, comprising primarily of WWD, with some NGS cases. We investigate the occurrence of magnitude jumps and trailing events within these datasets, finding that magnitude jumps are typically below 2.0 magnitude units. Few cases have trailing events, primarily because there are few cases where LPLT injection has been stopped in the immediate aftermath of large-magnitude seismicity. In many of our cases, injection has continued until the present day, by definition precluding entirely the possibility of having trailing events. The distribution of ΔMMAX values is significantly different to what would be generated by random population of event magnitudes from an underlying GR distribution, indicating that event magnitudes do escalate during injection as injected volumes and event numbers increase. However, we also found that the largest events were systematically distributed within the earlier part of event sequences. This finding reflects the fact that for many of our cases, after initial escalation of magnitudes, levels of seismicity have since stabilised at lower magnitudes with continued injection. We hypothesise that this could be a result of initial release of tectonic stresses that abates as the accumulated tectonic energy is used up, but further investigation is required to assess this hypothesis in detail. We also examine the timescales over which induced seismicity begins and escalates, finding a wide range in time from the start of injection to the onset of seismicity, likely resulting from variable in situ geological conditions. Finally, we investigated the use of the NRBE approach as an alternative means to guide proactive decision-making to mitigate induced seismicity. We found that this method broadly performs well in forecasting expected event magnitudes, but for some cases the observed events significantly exceeded the model forecasts.