Using ground motion prediction equations to monitor variations in quality factor due to induced seismicity: a feasibility study

Sub-surface operations for energy production such as gas storage, fluid reinjection or hydraulic fracking may modify the physical properties of the rocks, in particular the seismic velocity and the anelastic attenuation. The aim of the present study is to investigate, through a synthetic test, the possibility of using empirical ground-motion prediction equations (GMPEs) to observe the variations in the reservoir. In the synthetic test, we reproduce the expected seismic activity (in terms of rate, focal mechanisms, stress drop and the b value of the Gutenberg-Richter) and the variation of medium properties in terms of the quality factor Q induced by a fluid injection experiment. In practice, peak-ground velocity data of the simulated earthquakes during the field operations are used to update the coefficients of a reference GMPE in order to test whether the coefficients are able to capture the medium properties variation. The results of the test show that the coefficients of the GMPE vary during the simulated field operations revealing their sensitivity to the variation of the anelastic attenuation. The proposed approach is suggested as a promising tool that, if confirmed by real data analysis, could be used for monitoring and interpreting induced seismicity in addition to more conventional techniques.


Introduction
The increasing demand of energy is pushing exploration of new sources involving the exploitation of sub-surface resources. In highly populated regions, it is of primary importance that underground industrial operations are made in the safest way to minimize seismic hazard and avoid public concern (e.g., Giardini 2009;Convertito et al. 2012;Grigoli et al. 2017;López-Comino et al. 2018).
It is thus worthwhile that seismicity and medium properties variation induced by field operations have to be monitored. Indeed, it is known from the physics of the rocks that the presence of fluids, in particular when injected at high pressure, in addition to enhance the probability of earthquake occurrence (e.g., due to a reduction in fault strength see Guha 2000), can also modify rock properties and in particular seismic velocity and anelastic attenuation. This is confirmed by both laboratory measurements and data analysis recorded during field operations. For example, Toksoz et al. (1979) and Johnston et al. (1979) conclude that, for sandstone samples, Q can change from 100 to 10 based on the saturation and differential pressure. Similar results have been obtained by Hutching et al. (2019) by analyzing earthquakes recorded during field operations at The Geysers geothermal field and by Wcisło et al. (2017) who found high attenuation values (Q P ~ 48 and Q P /Q S < 1) analyzing seismicity induced by the injection of wastewater in the Costa Molina 2 well in southern Italy. Wandycz et al. (2019) measured the variation of the anelastic attenuation based on the microseismicity recordings in Northern Poland during hydraulic stimulation of the two unconventional gas reservoirs. They found that anelastic attenuation varies from 60 to 90 for the Lubocino dataset and has average values of 100 and 124, for P-and S-wave, respectively, in the case of Wysin.
As for the medium properties, the monitoring is currently done using approaches such as 4D seismic velocity, anelastic attenuation or seismic noise tomography (e.g., Calò and Dorbath 2013;Gritto and Jarper 2014;Zang et al. 2014), which require both onerous data processing and are time-consuming.
In the present study, we develop a new technique that could be used together with the previous ones to observe variations in the medium properties. Specifically, we test the sensitivity of the ground-motion prediction equations (GMPEs), which are generally used to estimate ground motion at a specific site or ground-shaking maps (e.g., Convertito et al. 2010), to the variations of the propagation medium properties and, in particular, those related to seismic velocity and anelastic attenuation during subsoil exploitation procedures.
Indeed, GMPEs are equations that correlate a strong ground motion parameter (response variable), such as peakground motion acceleration (PGA), peak-ground motion velocity (PGV) or spectral ordinates (Sa) at different structural periods, to predictor variables, such as magnitude and source-to-site distance, through coefficients that must be inferred from the analysis of the available waveforms. Predictor variables have to account for most of the source and propagation effects that can modify the expected value of the selected response variable.
In this study, we simulate different datasets of peakground velocities for different structural models with the aim of reproducing a real case where field operations can alter the status of the reservoir. We select a reference GMPE model and next we re-estimate its coefficients to test whether and how they are sensitive to the variations induced in the medium.
The idea of the present analysis is supported by empirical analyses that recognize the need of regionalizing the GMPE in terms of anelastic attenuation in addition to the geometrical attenuation to avoid, for example, an overestimation of the residual standard deviation (Atkinson and Morrison 2009;Kale et al. 2015;Kotha et al. 2016) and by theoretical studies that demonstrate how peak-ground motion parameters and peak frequency parameters are related to earthquake stress-drop, geometrical spreading and anelastic attenuation (Baltay et al. 2013;Lior and Ziv 2018;Wandycz et al. 2019). It is thus possible to think of an inverse procedure that allows to infer stress-drop or anelastic attenuation from peak-ground motion parameters (e.g., Lior and Ziv2018).
We notice that the proposed approach is intended as work in progress that can be used together with well-established techniques for induced seismicity monitoring.

Method
We perform a synthetic test that reproduces the expected seismic activity (seismicity rate, minimum and maximum magnitude, etc.,…) induced by a fluid injection experiment and recorded at a local seismic network. To simulate the variation of the propagation medium properties during the field operations, for each earthquake the waveforms are computed by modifying physical parameters of the propagation medium and in particular the quality factor Q.
The first analysis is devoted to test the difference between a constant Q and a frequency dependent Q. To this aim, we compute synthetic three-component waveforms at 16 stations deployed on an area of 4 × 4 km 2 ( Fig. 1), for 10 hypocentral depths (1.0, 1.25, 1.5, 1.75, 2.0, 2.25, 2.5, 2.75, 3.0, and 3.25) km, a range of hypocentral distances (1, 6.2) km, a range of magnitude (0.0, 3.0) and a homogeneous crustal model (V P = 4500 m/s, V S = 2300 m/s, density ρ = 2400 kg/ m 3 ). Earthquakes' depth and stations' location are selected to approximately reproduce a real case of reservoir stimulation where medium properties variation is limited to a relatively For the first test, the activity rate is assumed to be constant (N events = 800) while individual magnitude values are randomly selected from the Gutenberg-Richter relation assuming a constant b value equal to 1.0.
In order to simulate full-waveforms, we compute the Green's functions by using the code AXITRA (Cotton and Coutant 1997), based on the discrete wavenumber method for a 1D-layered model and a triangle source-time function to represent the earthquake source. In order to set the proper duration of the triangle, we use the Brune's source model (Brune 1970) and, for each magnitude value, we assign a stress-drop value by using an empirical relationship based on the results obtained by Lengliné et al. (2014) for Soultz-Sous-Forêts.
To further increase data heterogeneity, in addition to magnitude, stress-drop and anelastic attenuation, we simulate waveforms of earthquakes taking also into account the different focal mechanisms. In particular, focal mechanisms are selected in the range: strike (140°, 180°), dip (40°, 70°) and rake (− 100°, − 80°) in accordance with an assumed regional stress-field at Soultz-Sous-Forêts (Valley and Evans 2007).
We compute synthetic waveforms for three different decreasing constant Q values (Q = 160, 80 and 50) and the final PGV corresponds to the maximum velocity measured on the vector composition of the three-component seismograms. These values are compatible with Q quality factors observed in sedimentary rock reservoirs (Abercrombie 1998;Ripperger et al. 2009;Bethman et al. 2012;Hutchings et al. 2019 and references therein), and with laboratory measurements (e.g., Toksoz et al. 1979;Johnston et al. 1979). However, in the most general formulation of the anelastic attenuation model, Q is frequency dependent and should be written as Q(f) = Q o (f/f o ) n where f o is a reference frequency generally assumed to be 1 Hz (e.g., Morozov 2008) and Q o is the quality factor Q value at f o . Thus, to study the effect of the Q(f) models, the results obtained by using the Q-constant values are compared with those obtained for three frequency dependent Q models. For the latter, we use the model Q(f) = (Vs/32)f 0.57 = Q 0 f 0.57 presented by Satoh (2004). Specifically, to introduce a frequency-dependent anelastic attenuation, we convolved the Fourier amplitude spectra of the waveforms obtained by using the previous crustal model (V P = 4500 m/s, V S = 2300 m/s, density ρ = 2400 kg/m3) without anelastic attenuation, with the A(R, f) = A o e −πfR/VsQ(f) function. In the previous equation, A o is the amplitude at the source, f is the frequency, R is the hypocentral distance, V S is the shear-wave velocity.
We select three different models, Q(f) = 72f 0.57 , Q(f) = 20f 0.57 and Q(f) = 10f 0.57 . The selection of the specific Q(f) models is done to make them compatible with the constant Q-values in the frequency band (8, 20) Hz, which is the range of interest commonly used to study events (natural and induced) in the magnitude range adopted in the present study (e.g., Emolo et al. 2011;Douglas et al. 2013;Sharma et al. 2013).
Moreover, given the range of distances (1, 6.2) km the peak-ground velocities are measured on S-waves; it is thus reasonable to assume that PGVs are affected by Q S . Furthermore, since it is expected that Q P is larger than Q S (e.g., Lay and Wallace 1995), for simplicity in the waveforms simulation we assume that Q P =Q S for each model.
The subsequent analysis is aimed at testing the proposed procedure on a layered crustal model. Given the same network geometry used in the previous analysis, we compute synthetic waveforms and relative PGV values for the three crustal models described below, that mimic the effects of increasing field operations. As for the selection of the layers thickness, velocity and density values we refer to the model proposed by Cuenot et al. (2008) and Valley and Evans (2007) for Soultz-sout-Forêts. We assume that the field operations affect all the layers but that the highest attenuation is attained in the first two layers that could represent the sedimentary sequence. In order to imitate real-data acquisition as much as possible, for each model, we generate a new earthquake catalog assuming a specific b-value. Indeed, it is expected that the b-value changes in space and time during fluid injection operations, deviating from the usually observed b = 1 value (Henderson et al. 1999;Cuenot et al. 2008;Convertito et al. 2012;Bachmann et al. 2012), although the variation is not strictly correlated with the injection rate and distance from the well (e.g., Cuenot et al. 2008;Bachmann et al. 2012). For the analysis of the layered crustal models, we assume that the b-value increases as function of the injection rate and moves from a starting value 1 to 1.2 and 1.5 in the last stage of the operations as observed during the hydraulic stimulation at Soultz-sout-Forêts (Cuenot et al. 2008) or during gas depletion in the Netherlands (Van Wees et al. 2014). Similarly, we consider that also the seismicity rate varies during the operations and, in particular, we select 400 events for the first model, 800 events for the second model, and 1200 events for the third model.
For both the two above-mentioned analyses, the PGVs (corresponding to the maximum velocity measured on the vector composition of the three-component seismograms) are measured on the synthetic waveforms for inferring the coefficients of the GMPE using the Levenberg-Marquardt least squares algorithm (Marquardt 1963) for curve fitting.
As a first task for a homogeneous, isotropic medium, we have to select a GMPE, which is assumed as the reference model. For the analyses presented in this study, we select a GMPE with a formulation as simple as possible: where Y is the PGV, R is the hypocentral distance and M is the earthquake magnitude. In Eq. (1), the coefficients a and b account for the effect of the earthquake size on Y, while c accounts for the geometrical spreading, and d accounts for the anelastic attenuation, which represents the fractional loss of energy for cycle of oscillation during wave propagation (e.g., Knopoff 1964;Del Pezzo and Bianco 2013). As reported above, the intrinsic anelastic attenuation is inversely related to the quality factor Q (Knopoff 1964) and can be modeled as a filter of the form A(R,f) = A o e −πfR/VsQ , which is thus convolved with the source spectrum (Babaie Mahani and Atkinson 2012). We notice that, as reported by Cotton et al. (2008), the interpretation of the term dR as the anelastic attenuation effect in the GMPE is only strictly correct when considering Fourier amplitudes for a particular frequency. However, since PGVs are measured at the amplitude spectral peak and this latter has a bell-shape (if a ω −2 displacement spectral model is assumed) modified by the anelastic effect, which always shifts the corner frequency toward lower values, we can assume that PGVs are associated to a limited frequency range. The GMPE is also associated with an error σ logY corresponding to the total standard error, describing uncertainty in the value of Y given the predictive relationship.
Equation (1) represents a simple model compared to many formulations with higher degree of complexity (e.g., Douglas 2003;Douglas et al. 2013) that could better account for the physical processes affecting the recorded ground motion parameter. However, additional terms accounting for nonlinearity in magnitude scaling and magnitude-dependent geometrical spreading are effective and can be resolved for distances and magnitudes larger than those considered in this study (Bommer et al. 2007;Cotton et al. 2008;Baltay and Hanks 2014).
To infer the reference model, we simulated 800 events using the stations' configuration and earthquakes depths' distribution described above. As for the initial Q we refer to Charléty et al. (2007) and Beauce et al. (1992) who suggest that Q S is about 500 in the granite and a minimum value of 50 in the upper sedimentary layers.
The approach proposed in this study is to use the earthquakes recorded during the field operations to infer and/ or update the coefficients a, b, c and d (De Matteis and (1) log Y = a + bM + c log R + dR Convertito 2015) and test which of them have to be analyzed to monitor the status of the reservoir. In particular, we test two cases. In the first case, all the earthquakes recorded during a fixed period are used to infer the new coefficients. A similar approach has been used by Chiou and Youngs (2008) to study the dependence of the anelastic attenuation term on the magnitude. In the second case, data of synthetic earthquakes are individually used to infer the new coefficients. This analysis was designed to be applied, for example, during those stages of field operations when earthquakes do not occur continuously over time and are not enough numerous to update all the coefficients of a reference model. However, we verified that the analysis of single events could still provide information on any critical change in the medium.

Results and conclusion
The dataset used to infer the reference model (MODREF) reported in Eq.
(1) is shown in Fig. 2 and the obtained coefficients are listed in Table 1. In addition to the normality plot shown in Fig. 2d, we performed the analysis of variance (ANOVA) (Fisher 1990), which allows to test the null hypothesis that the coefficients in regression model are zero against the alternative hypothesis that the coefficients are different from zero. It can be shown that the test statistic has an F(p, n − p − 1) distribution where p is the number of parameters to be inferred and n is the number of data (Fisher 1990). We note that the F test does not indicate which of the parameters is not equal to zero, but only that at least one of them is linearly related to the response variable. The result of the ANOVA for MODREF is reported in Table 2 and, based on the computed p value (< 0.0001) we cannot accept the null hypothesis that coefficients are zero at 95% level of confidence. Moreover, the obtained R 2 = 0.897 indicates that about 90% of the variation of the response variable is explained by the predictive variables. The initial analysis is devoted to study the effect of considering Q constant quality factor vs Q frequency-dependent quality factor. To this aim, we compute the updated coefficients of Eq. (1) using PGVs simulated for models with Q = 160, 80 and 50 that are compared with Q(f) = 72f 0.57 , 20f 0.57 and 10f 0.57 , respectively. The results are depicted in Fig. 3 while the coefficients are listed in Table 3. It is noticeable how the GMPE coefficients show a similar variation regardless of using a frequency dependent or a constant attenuation model and, in particular, the d coefficient is sensitive to the variations of Q, decreasing with the increase in the anelastic attenuation. Thus, for the following analysis we can adopt the Q constant models. Next, to simulate a more reasonable propagation medium with respect to a homogeneous medium, we analyze PGVs data simulated by using the three-layered models reported in Table 4 (MOD1, MOD2 and MOD3). Incidentally, we notice that the Q value in the first layer in MOD3 is a value generally measured in the first 100 m (Abercrombie, 1998) that we arbitrarily extrapolated up to 800 m only to test an extreme case. The simulated datasets for MOD1, MOD2 and MOD3 are shown in Figs. 4, 5 and 6, respectively, while the inferred coefficients are listed in Table 1. For each regression, we performed the ANOVA whose results are listed in Tables 5, 6 and 7. Also for the layered models, the ANOVA indicates that the hypothesis of zero value coefficients cannot be accepted and that R 2 is larger than 88%. Fig. 2 a Magnitude distribution for the catalogue used to infer the reference model (MODREF). b Sample distribution of the focal mechanism angles (strike, dip and rake) for the same catalogue. c PGVs as function of distance used to infer MODREF. Circles are color coded according to the magnitude of the earthquake. The black lines correspond to the inferred GMPE computed for magnitude values M 0, 1 and 2. d Normality plot for the residuals. The dimension of the circles is proportional to the event's magnitude From the analysis of Table 1, we note that the inferred d coefficient decreases with the decreasing of the quality factor, that is, when the attenuation is higher. The decrease in the standard deviation from MOD1 to MOD3 is likely due to the increase in the b-value of the Gutenberg-Richter relation used in our simulations. In particular, higher b values correspond to a higher percentage of smaller magnitude events, which are similarly affected by the anelastic attenuation compared to the case in which a wider range of magnitude (i.e., lower b value) is considered. In order to analyze the differences between the inferred d coefficients, we use a Student's t test at 95% level of significance. The results indicate that the difference between the d values is statistically significant (see Table 8). In addition, for each investigated model we report in Fig. 7 the distribution of residuals (logPGV obs − logPGV pred ) as function of distance and magnitude. A regression analysis indicates that for all the considered GMPEs there is no trend in the residuals as function of distance. This means that all the models properly account for the geometrical and anelastic attenuation.
In order to test the resolution of the proposed technique, we modified MOD2 by decreasing the Q values by 30%   Table 4 Structural models used to simulate waveforms and the relative PGVs shown in Figs. 4, 5 and 6. Z is the depth of the layers, V P and V S refers to P-and S-wave velocity, ρ is the density and Q P and Q S the P and S quality factors. The last three columns report the three investigated anelastic models: MOD1, MOD2 and MOD3 0  1850  860  2000  50  20  10  800  2870  1340  2200  100  50  25  1600  5800  3310  2600  500  100  50  2600  5820  3320  2600  500  100  50  3600  5850  3340  2600  500  100  50  4600  5870  3350  2600  500  100  50  5600  5900  3370  2600  500  100  50  6600  5920  3380  2650  500  100  50  7600  5950  3400  2650  500 100 50 (MOD2P30) and 20% (MOD2P20). We note that MOD3 is obtained by reducing Q of MOD2 by 50% and that for MOD3 the t test reported in Table 8 has shown a statistical difference. The coefficients obtained for the modified models are listed in Table 9 while Table 10 reports the result of the t test aimed at verifying the statistical difference between the inferred d values. The test indicates that the d values are statistically different and thus the approach presented in this study can resolve variation of Q in the order of 20%. Lower variations seem to be unfeasible because they would be lower than the uncertainty generally associated with the estimation of Q for real data (White 1992). The last application concerns the case when data of each earthquake are separately inverted. We note that, when analyzing a single earthquake there is an intrinsic difficulty to infer all the parameters of the GMPE due to the limited number of data.
Preliminary tests have shown that both b and c parameters must be set equal to the reference values ( while inverting a and d. A similar result is obtained when the number of earthquakes is not enough to build a homogenous PGVs dataset. This assumption means that operations do not strongly affect the geometrical spreading of the S-waves travelling from the source to the receivers and that the same operations only partially affect the earthquakes stress-drop. Concerning the assumption about the geometrical spreading we note that, while it is a reasonable assumption for conventional filed operation, it could not be fully valuable for fracking operations (e.g., Adachi et al. 2007;Belyadi et al. 2019). As for the stress drop, while the variation generally observed for natural moderate-to-large earthquakes spans over at least 4 order of magnitude (e.g., Allman and Shearer 2009), the one observed for induced seismicity falls within a tighter range of values (e.g., Goertz-Allmann et al. 2011;Lengliné et al. 2014;Martínez-Garzón et al. 2014;Staszek et al. 2017). The results of the analysis are shown in Fig. 8 and indicate a clear variation of the coefficient d when moving from MOD1 to MOD3, that is, when the anelastic attenuation is increased. This finding still holds when a random noise (in the range ± 40% of PGV at each station) is added to the simulated PGVs.
Moreover, for each crustal model there is an evident variation of d when data from low magnitude events are inverted. This can be explained by considering that the is the source spectrum, where Ω o is the low-frequency spectral level from which seismic moment can be computed and f c is the corner frequency. The term A(R,f) = A o e −πfR/VsQ is the attenuation spectrum model. If one assumes that PGV is measured at the maximum of the amplitude spectrum it can be shown that, at a given distance from the source, the effect of the anelastic attenuation is larger for smaller magnitude events that have higher corner frequencies compared to larger magnitude events. The conclusions of the present study can be summarized as follows: • GMPEs can be used to observe the variations of Q during field operations. When a large dataset is available, all the coefficients of the GMPE can be inverted and compared with those of a reference model to check if significant variations (from a statistical point of view) of the coefficients are occurring. However, preliminary tests have shown that, when the dataset is not homogeneous in terms of magnitude and distances, as for the case of the initial phases of a project (e.g., first stimulation of reser-voir, acidifications) some of the coefficients must be set to a-priori selected values. • As a consequence of the above conclusion, the proposed approach cannot be applied when field operations strongly affect the geometrical spreading, as in the case of fracking operations, or produce a variation in the stress-drop larger than 3 to 4 order of magnitude. In fact, these large variations cause that the coefficient b or c cannot be set to a-priori constant values. • Aside from the possibility of inverting all the coefficients of the GMPE (as in the case when all the events are inverted together) or the necessity of fixing some of them, as in the case of the analysis of single earthquakes, the synthetic tests presented in this study have shown that the coefficient d is statistically sensitive to the anelastic attenuation variations. • The tests indicate that the variation of d due to anelastic attenuation dominates the variations possibly due to focal mechanisms, stress-drop and magnitude. Indeed, since in each simulation we have imposed the same variation of the focal parameters, if the focal mechanisms would have dominated the PGV amplitudes variation, the d coefficients would not have changed. • The analysis of single earthquakes suggests that, for each used crustal model, coefficients' variation is more evident when the analysis is done for single, small magnitude earthquakes.
• The proposed technique has the advantage that it requires only the location of the event and the measure of the PGVs that, compared to other measures such as P-and S-wave picking (necessary to imple-    1 3 ment tomographic approaches) or P-wave pulse width measure (e.g., Zollo and de Lorenzo 2001) or the peak frequency (e.g., Wandycz et al. 2019) are more readily measured after the earthquake occurrence.

Data and resources
Figures have been generated with the Generic Mapping Tools (GMT; Wessel and Smith 1991). S4CE (Science for Clean Energy) supporting project details can be found at www.scien ce4cl eanen ergy.eu.