Failure models driven by a self-correcting point process in earthquake occurrence modeling
Abstract
The long-term recurrence of strong earthquakes is often modeled according to stationary Poisson processes for the sake of simplicity. However, renewal and self-correcting point processes (with non-decreasing hazard functions) are more appropriate. Short-term models mainly fit earthquake clusters due to the tendency of an earthquake to trigger other earthquakes. In this case, self-exciting point processes with non-increasing hazard are especially suitable. To provide a unified framework for analysis of earthquake catalogs, Schoenberg and Bolt proposed the short-term exciting long-term correcting model in 2000, and in 2005, Varini used a state-space model to estimate the different phases of a seismic cycle. Both of these analyses are combinations of long-term and short-term models, and the results are not completely satisfactory, due to the different scales at which these models appear to operate. In this study, we propose alternative modeling. First, we split a seismic sequence into two groups: the leader events, non-secondary events the magnitudes of which exceed a fixed threshold; and the remaining events, which are considered as subordinate. The leader events are assumed to follow the well-known self-correcting point process known as the stress-release model. In the interval between two subsequent leader events, subordinate events are expected to cluster at the beginning (aftershocks) and at the end (foreshocks) of that interval; hence, they are modeled by a failure process that allows bathtub-shaped hazard functions. In particular, we examined generalized Weibull distributions, as a large family that contains distributions with different bathtub-shaped hazards, as well as the standard Weibull distribution. The model is fit to a dataset of Italian historical earthquakes, and the results of Bayesian inference based on the Metropolis–Hastings algorithm are shown.
Keywords
Bathtub-shaped hazard function; Bayesian inference; Generalized Weibull distributions; Point processes; Stress-release model
Mathematics Subject Classification
60Gxx 62F15 62P12
1 Introduction
Earthquakes are expressions of complex systems in which many components interact with each other. They are natural phenomena that affect multiple time–space scales, and of which we only have indirect measurements that are strongly affected by uncertainty, despite modern seismographic systems (such as site effects in the propagation of waves). These phenomena can be investigated on different, coherent time–space–magnitude scales, where they show different critical aspects and are related to different goals; e.g., focusing on a single event, the aim is to determine its parameters through the recordings provided by various stations. Moving up to the next level, i.e., expansion of the time–space scale, there is the problem of the modeling of the sequence of secondary events that follow a strong earthquake, and the estimation of its length and width. Climbing up to the next hierarchical level, we deal with the occurrence of destructive earthquakes that are generated by faults or systems of indistinguishable, interacting faults, embedded in the reciprocal sliding of plates. Ideally, through combination of the equations that govern the physical processes of the generation of a fault rupture with those of the fault interactions, we can model the whole phenomenon. Clearly we are far from this level of knowledge, and many of the studies performed at various levels do not benefit from the results obtained for the other levels.
Starting from the goal we want to pursue, we choose the resolution level at which to observe the phenomenon, the time–space–magnitude scales for its description, which variables are relevant, and which can be treated as fluctuations. Moving to a given level means adopting the interaction models that capture the emerging system behavior, also using information or models from lower levels. In this study, our objective is to consider the time evolution of disastrous earthquakes, which means working on medium-large scales, and to model their occurrence from a forecasting perspective by combining knowledge gained at different levels.
In the past, phenomenological analyses of seismicity led mainly to two patterns: time-independent point processes on regional and long-term scales; and self-exciting models to describe typically the increase in seismic activity over short space–time scales immediately after large earthquakes. Later, it was noted that even catalogs without secondary shocks show clusters. These observations, and the need to consider the previous patterns jointly to provide a better description of the phenomenon, have led to hybrid models that rely extensively on measurements of geodetic and geological quantities. The global earthquake activity rate (GEAR) model (Kagan 2017) provides long-term forecasting with no explicit time dependence, based on the linear combination and log-linear mixing of two ‘parent’ forecasts: smoothed-seismicity forecasts and tectonic forecasts based on a strain-rate map that is converted first into a long-term seismic moment rate, and then into earthquake rates. The most recent version of the Uniform California Earthquake Rupture Forecast (UCERF3) (Field et al. 2017) is a hierarchical model that covers both long-term (decades to centuries) probabilities of fault rupture and short-term (hours to years) probabilities of clustered seismicity. The hierarchical structure is composed of three levels, each of which is conditional on the previous level: a time-independent model is at the first level, then there is a renewal model, and finally an epidemic-type aftershock sequence (ETAS) model (Ogata and Zhuang 2006; Ogata 2011), to represent spatio-temporal clustering. Also, the hybrid time-dependent probabilistic seismic hazard model (Gerstenberger et al. 2016) developed in New Zealand after the Canterbury earthquake sequence (September 4, 2010, \(M_w \ 7.1\)) is a combination of three types of models: smoothed seismicity background models as long-term models; EEPAS (‘Every Earthquake a Precursor According to Scale’) models (Rhoades and Evison 2004), to take into account the precursory scale increase phenomenon on the medium-term scale; and two aftershock models in the short-term clustering class (Ogata 1998; Gerstenberger et al. 2004).
Models of the self-correcting class are the only ones, on large space–time scales, that attempt to incorporate physical conjecture into the probabilistic framework. They are inspired by the elastic rebound theory by Reid (1911), which was transposed into the framework of stochastic point processes by Vere-Jones (1978), through the first version of the stress-release model. Subsequent versions of this model express the presence of clusters of even large earthquakes, in terms of possible interactions among neighboring fault segments (Bebbington and Harte 2003).
In summary, most probability models used in earthquake forecasting belong to the two classes of self-exciting and self-correcting models, which conflict from the point of view of the hazard function: a piecewise decreasing function for self-exciting models, and a piecewise increasing function for self-correcting models. To reconcile this conflict, Schoenberg and Bolt (2000) proposed the Short-term Exciting Long-term Correcting (SELC) models by simply putting together the conditional intensity functions of two point processes, one from each class, and fitting the models to two datasets of micro and moderate earthquakes that occurred in California, over periods of 7 years and 30 years. With the same goal, Varini (2005, 2008) used a state-space model to estimate the different phases of a seismic cycle, in which the state process is a homogeneous pure jump Markov process with three possible states associated with the Poisson, ETAS, and stress-release models. In both proposals, the results were not completely satisfactory: for the SELC models, because of the different scales at which the triggering and strain-release mechanisms appear to operate; and for the state-space model, because of the difficulty in fitting the sudden changes of state. Again, from the perspective that the physical system passes through different states during the earthquake generation process, Votsi et al. (2014) considered a discrete-time hidden semi-Markov model, the states of which are associated with different levels of the stress field.
In this study, we propose a new stochastic model for earthquake occurrences, hereinafter denoted as the compound model, which takes into account the following points: (a) the benefit of exploiting a stochastic model inspired by elastic rebound theory; (b) the need to consider jointly the opposite trends that characterize self-exciting and self-correcting models; and (c) the idea to superimpose behaviors characteristic of different timescales in a single hierarchical model.
Let us consider all of the earthquakes that are associated with a seismogenic source, and select \(m_0\) as the magnitude threshold so that the time period in which the dataset can be considered as complete includes a sufficient number of strong earthquakes according to the seismicity of the region under examination; e.g., in Italy, an earthquake of \(M_w \ge 5.3\) can already be considered as strong. These damaging events are responsible for most of the release of seismic energy. We put them in the first level of our model, and assume that these leader events follow the stress-release model (Rotondi and Varini 2007). At the second level, there are the subordinate events; i.e., those that occur between two consecutive leaders and tend to cluster close to them. We consider the occurrence times of these events as ordered failure times in the time interval bounded by the two leaders, and we model them through distributions belonging to the family of the generalized Weibull distributions with a bathtub-shaped hazard function, so as to match the clustering trend close to the extremes of the interval. We examine the model from the Bayesian perspective; hence, in addition to the elements of the model, we assign prior distributions to the model parameters. For the previously studied components of the model, these priors include the available information on the phenomenon drawn from the literature. For the new components, i.e., the generalized Weibull distributions, an objective Bayesian perspective is followed in assigning the prior distributions of their parameters, through combination of the empirical Bayes method and use of vague-proper prior distributions.
The proposed model is applied to the sequence of earthquakes associated with one of the most active composite seismogenic sources (CSS) of the Italian Database of Individual Seismogenic Sources (DISS, version 3.0.2). This is located in the central Apennines, and includes the L’Aquila earthquake, one of the most recent destructive earthquakes in Italy. Then we report the parameter estimates and the performance of the model in terms of the marginal likelihood. Moreover, we compare our model with the stress-release and ETAS models on the basis of two validation criteria: the Bayes factor and the information criterion by Ando and Tsay.
2 Superimposed point processes: failure process and self-correcting model
2.1 The two conflicting model classes
2.2 A proposal of conciliation
Stochastic modeling of magnitude Let G(m) denote the probability distribution function of the magnitude, and g(m) its density function on the domain \(m \ge m_0\). In particular, it turns out that the distribution of the leader magnitude is truncated on \([m_{th}, +\infty )\), whereas the distribution of the subordinate magnitude is truncated on \([m_0, +\infty )\).
Stochastic modeling of subordinate events The subset of subordinate events consists of earthquakes of magnitude lower than the magnitude threshold \(m_{th}\) or of secondary events triggered by leader events, possibly also of magnitude higher than \(m_{th}\). The identification of seismicity patterns in a region might be a controversial issue, but there is generally wide agreement on some features, for example, on clustering after mainshocks and on the sometimes observed increase in activity before a strong shock. Accordingly, it should be expected that the occurrence times of the subordinate events are concentrated in the neighborhood of the main events, corresponding to what are called aftershocks and foreshocks. This means that it is reasonable to expect a bathtub-shaped hazard function for the occurrence time of subordinate events between two consecutive leaders; i.e., a function that decreases just after a leader event has happened, then possibly remains constant, and finally increases with the approach of another leader event.
Table 1 Properties of the modified (MW) and additive (AW) Weibull distributions that were applied to subordinate events: \(f^*(x)\) is the density function, \(F^*(x)\) is the distribution function, \(S^*(x)\) is the survival function, and \(h^*(x)\) is the hazard function
MW  \(f^*(x) = \displaystyle \frac{a_1 + c_1 \, x}{b_1^{a_1}} \, x^{a_1-1} \, e^{c_1 \, x} \, \exp \left\{ - {\left( \frac{x}{b_1} \right) }^{a_1} \, e^{c_1 x} \right\} \qquad a_1, \, b_1, \, c_1 >0\)
\(F^*(x) = 1 - \exp \displaystyle \left\{ - {\left( \frac{x}{b_1} \right) }^{a_1} \, e^{c_1 x} \right\}\)
\(S^*(x) = \exp \displaystyle \left\{ - {\left( \frac{x}{b_1} \right) }^{a_1} \, e^{c_1 x} \right\}\)
\(h^*(x) = \displaystyle \frac{a_1 + c_1 \, x}{b_1^{a_1}} \, x^{a_1-1} \, e^{c_1 \, x}\)
AW  \(f^*(x) = \left[ \displaystyle \frac{a_1}{b_1} \left( \displaystyle \frac{x}{b_1}\right) ^{a_1-1} + \displaystyle \frac{a_2}{b_2} \left( \displaystyle \frac{x}{b_2}\right) ^{a_2-1} \right] \ \exp \left\{ - \left( \displaystyle \frac{x}{b_1}\right) ^{a_1} - \left( \displaystyle \frac{x}{b_2}\right) ^{a_2}\right\} \quad a_1, \, a_2, \, b_1, \, b_2>0\)
\(F^*(x) = 1 - \exp \left\{ - \left( \displaystyle \frac{x}{b_1}\right) ^{a_1} - \left( \displaystyle \frac{x}{b_2}\right) ^{a_2}\right\}\)
\(S^*(x) = \exp \left\{ - \left( \displaystyle \frac{x}{b_1}\right) ^{a_1} - \left( \displaystyle \frac{x}{b_2}\right) ^{a_2}\right\}\)
\(h^*(x) = \displaystyle \frac{a_1}{b_1} \left( \displaystyle \frac{x}{b_1}\right) ^{a_1-1} + \displaystyle \frac{a_2}{b_2} \left( \displaystyle \frac{x}{b_2}\right) ^{a_2-1}\)
The hazard functions in Table 1 can take a variety of shapes according to the variation of the model parameters. For the MW model, \(h^*(x)\) is always increasing if its shape parameter is \(a_1 >1\), and it has a bathtub shape otherwise. In the latter case, the turning point of \(h^*(x)\) is given by \(x^* = (\sqrt{a_1} - a_1)/c_1\). The AW model is a twofold competitive risks model where the hazard function is the sum of the hazard functions of two Weibull distributions. Therefore, \(h^*(x)\) is increasing if both shape parameters \(a_1\) and \(a_2\) are larger than 1, decreasing if \(a_1<1\) and \(a_2<1\), and bathtub shaped if \(a_1<1\) and \(a_2>1\), or vice versa; in the last case the turning point is given by \(x^* = \displaystyle {{\left[ - \frac{a_2 (a_2-1) \, b_1^{a_1}}{a_1(a_1-1) \, b_2^{a_2}}\right] }^{1/(a_1-a_2)}}\).
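The shapes described above can be checked numerically. The following is a minimal Python sketch, not the authors' code: the function names are hypothetical, the MW values are the posterior means reported later for the compound model, and the AW values are purely illustrative.

```python
import math

def mw_hazard(x, a1, b1, c1):
    """Modified Weibull hazard: (a1 + c1*x) / b1**a1 * x**(a1-1) * exp(c1*x)."""
    return (a1 + c1 * x) / b1 ** a1 * x ** (a1 - 1.0) * math.exp(c1 * x)

def mw_turning_point(a1, c1):
    """Location of the minimum of the MW hazard (bathtub case, 0 < a1 < 1)."""
    return (math.sqrt(a1) - a1) / c1

def aw_hazard(x, a1, b1, a2, b2):
    """Additive Weibull hazard: sum of two Weibull hazards."""
    return (a1 / b1) * (x / b1) ** (a1 - 1.0) + (a2 / b2) * (x / b2) ** (a2 - 1.0)

def aw_turning_point(a1, b1, a2, b2):
    """Location of the minimum of the AW hazard when one shape is < 1 and the other > 1."""
    ratio = -(a2 * (a2 - 1.0) * b1 ** a1) / (a1 * (a1 - 1.0) * b2 ** a2)
    return ratio ** (1.0 / (a1 - a2))

# MW example: shape below 1, so the hazard is bathtub shaped.
a1, b1, c1 = 0.21, 0.61, 0.45
x_mw = mw_turning_point(a1, c1)

# AW example with illustrative (assumed) values: a1 < 1 and a2 > 1.
x_aw = aw_turning_point(0.5, 1.0, 2.0, 1.0)
```

Evaluating `mw_hazard` slightly to the left and right of `x_mw` gives larger values than at `x_mw` itself, which is the bathtub minimum the text describes.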

cumulative distribution function \(F(s)=\displaystyle \frac{F^*(s)}{F^*(1)}\),

density function \(f(s)= \displaystyle \frac{f^*(s)}{F^*(1)}\),

survival function \(S(s)=\displaystyle \frac{F^*(1) - F^*(s)}{F^*(1)}\),

hazard function \(h(s)=\displaystyle \frac{f^*(s)}{F^*(1) - F^*(s)}\).
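As a hedged illustration of these four functions, the sketch below evaluates them for the MW distribution, assuming (as the normalization by \(F^*(1)\) suggests) that the rescaled time \(s\) lies in (0, 1). The function names are hypothetical and the parameter values are only illustrative.

```python
import math

def mw_cdf(x, a1, b1, c1):
    """Untruncated modified Weibull distribution function F*(x)."""
    return 1.0 - math.exp(-((x / b1) ** a1) * math.exp(c1 * x))

def mw_pdf(x, a1, b1, c1):
    """Untruncated modified Weibull density f*(x) = h*(x) * S*(x), for x > 0."""
    hazard = (a1 + c1 * x) / b1 ** a1 * x ** (a1 - 1.0) * math.exp(c1 * x)
    return hazard * (1.0 - mw_cdf(x, a1, b1, c1))

def truncated_functions(s, a1, b1, c1):
    """Return (F, f, S, h) at 0 < s < 1 for the MW law normalized by F*(1)."""
    norm = mw_cdf(1.0, a1, b1, c1)   # F*(1)
    cdf_s = mw_cdf(s, a1, b1, c1)    # F*(s)
    pdf_s = mw_pdf(s, a1, b1, c1)    # f*(s)
    F = cdf_s / norm
    f = pdf_s / norm
    S = (norm - cdf_s) / norm
    h = pdf_s / (norm - cdf_s)
    return F, f, S, h

# Illustrative parameter values (assumed for the example).
F, f, S, h = truncated_functions(0.3, a1=0.21, b1=0.61, c1=0.45)
```

By construction \(F(s) + S(s) = 1\) and \(h(s) = f(s)/S(s)\). Because the denominator \(F^*(1) - F^*(s)\) vanishes as \(s\) approaches 1, the normalized hazard grows without bound near the end of the interval, which is why the sketch excludes \(s = 1\).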
A further level of dependence of the subordinate events on the leaders can be introduced by assuming that the shape parameters depend on the current hazard level of the stress-release model. That means that the hazard function of subordinate events can take different shapes in the intervals between consecutive leaders; e.g., in the MW model, one of the parameters in the turning point can be made time-dependent as follows: \(a_1 (t) = a_1 / \lambda (t\mid {\mathcal{H}}_t)\); and in the AW model, it is possible to set \(a_1(t) = a_1\ \lambda (t\mid {\mathcal{H}}_t)\) and \(a_2(t) = a_2 / \lambda (t\mid {\mathcal{H}}_t)\). We plan to develop these ideas in future investigations.
3 Bayesian inference and model comparisons
In this section, we deal with the problem of estimation of the model parameters following the Bayesian paradigm and applying Markov chain Monte Carlo (McMC) methods for sampling from the posterior probability distributions of the parameters. In this way, we obtain not only the parameter estimates, typically as their posterior means, but also a measure of their uncertainty, as expressed through the simulated posterior distribution of each parameter.

\(\pi (b_{\ell } \mid m_i, i=1,..,n) \propto \displaystyle \prod _{i=1}^n \frac{g_{\ell }(m_i \mid b_{\ell })}{1 - G_{\ell }(m_{th} \mid b_{\ell })} \; \pi _0 (b_{\ell })\)

\(\pi (b_s \mid (m_{ij}, j=1,.., n_i), i=1,..,n) \propto \displaystyle \prod _{i=1}^{n-1} \prod _{j =1}^{n_i} \frac{g_s(m_{ij} \mid b_s)}{1 - G_s(m_0 \mid b_s)} \; \pi _0 (b_s)\)

\(\pi (\gamma \mid (n_{ij}, j=1,..,n_i), t_i, i=1,..,n ) \propto \displaystyle \prod _{i=1}^{n-1} e^{\displaystyle - \gamma \, (t_{i+1} - t_i)} \, \frac{{[\gamma \, (t_{i+1} - t_i)]}^{n_i}}{n_i!} \; \pi _0(\gamma )\)

\(\pi (\varvec{\theta }_{\ell } \mid {\mathcal{H}}_T) \propto \displaystyle \prod _{i=1}^n \lambda (t_i \mid {\mathcal{H}}_{t_i}, \varvec{\theta }_{\ell }) \ \exp \left\{ - \displaystyle \int _{t_1}^{t_n} \lambda (u \mid {\mathcal{H}}_u, \varvec{\theta }_{\ell }) \, du\right\} \; \pi _0(\varvec{\theta }_{\ell })\)

\(\pi (\varvec{\theta }_s \mid (s_{ij}, j=1,..,n_i), n_i, t_i, i=1,..,n) \propto \displaystyle \prod _{i=1}^{n-1} \, \left[ n_i! \prod _{j =1}^{n_i} f(s_{ij} \mid \varvec{\theta }_s ) \right] \; \pi _0(\varvec{\theta }_s)\).
Metropolis–Hastings algorithm
step 1 : select \(\theta _0\) from \(\pi _0(\theta )\) and set \(i=1\),
step 2 : draw a candidate \({{\tilde{\theta }}}\) from the proposal distribution \(q (\theta \mid \theta _{i-1})\),
step 3 : compute the acceptance probability
\(\alpha ({\tilde{\theta }} \mid \theta _{i-1})= \min \left( 1, \ \displaystyle \frac{\pi _0({\tilde{\theta }}) \, {\mathcal{L}}(D \mid {\tilde{\theta }}) \, q (\theta _{i-1} \mid {\tilde{\theta }})}{\pi _0(\theta _{i-1}) \, {\mathcal{L}}(D \mid \theta _{i-1}) \, q ({\tilde{\theta }} \mid \theta _{i-1})}\right) \, ,\)
step 4 : accept \({\tilde{\theta }}\) as \(\theta _i\) with probability \(\alpha ({\tilde{\theta }} \mid \theta _{i-1})\), set \(\theta _i = \theta _{i-1}\) otherwise,
step 5 : increment \(i\) and repeat steps 2–4 a number R of times to get R draws from the posterior distribution, with optional burn-in and/or thinning.
The initial value \(\theta _0\) of the parameter is generated from the prior distribution \(\pi _0(\theta )\). Then, given \(\theta _{i-1}\), for each iteration \(i=1,2,\ldots\), a candidate \({\tilde{\theta }}\) is drawn from a proposal distribution \(q (\theta \mid \theta _{i-1})\), and accepted with probability \(\alpha ({\tilde{\theta }} \mid \theta _{i-1})\). Due to the Markovian nature of the simulation, the first values of the chain are highly dependent on the starting value and are usually removed from the sample as burn-in. The proposal distribution is chosen such that it is easy to sample from it, and it covers the support of the posterior distribution. Some practical rules are proposed in the literature to adjust the proposal. One of these is to modify the variance \(\sigma ^2\) of the proposal so as to optimize the acceptance rate, i.e., the fraction of the proposed samples that is accepted in a window of the last N samples, with N sufficiently large. Indeed, if \(\sigma ^2\) is too small, the acceptance rate will be high, but the chain will mix and converge slowly. On the other hand, if \(\sigma ^2\) is too large, the acceptance rate will be very low and again the chain will converge slowly. It is generally accepted (Gamerman and Lopes 2006) that a reasonable acceptance rate is about 20% to 50%. Another critical point is the correlation between successive states of the chain, which reduces the amount of information contained in a given number of draws from the posterior distribution. A simple, yet efficient, method that deals with this issue is to keep only every d-th draw from the posterior, and to discard the rest. This is known as thinning the chain. Although the asymptotic convergence of the chain to the equilibrium distribution is theoretically proven, we make inferences based on a finite Markov chain, and hence the main problem is to establish how long we must run our chain until convergence.
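The sampler described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the proposal is a log-normal random walk whose median (rather than mean) equals the current value, the target is a toy Gamma-prior, Poisson-likelihood posterior for a positive rate, and all data values and names (`counts`, `lengths`, `metropolis_hastings`) are hypothetical.

```python
import math
import random

def metropolis_hastings(log_post, theta0, n_iter, sigma, rng):
    """Random-walk MH for a positive parameter with a log-normal proposal."""
    theta, lp = theta0, log_post(theta0)
    chain, accepted = [], 0
    for _ in range(n_iter):
        # Candidate: log-normal centered (in median) on the current value.
        cand = theta * math.exp(sigma * rng.gauss(0.0, 1.0))
        lp_cand = log_post(cand)
        # Posterior ratio times proposal ratio q(theta | cand) / q(cand | theta),
        # which equals cand / theta for this log-normal random walk.
        log_alpha = (lp_cand - lp) + (math.log(cand) - math.log(theta))
        if rng.random() < math.exp(min(0.0, log_alpha)):
            theta, lp = cand, lp_cand
            accepted += 1
        chain.append(theta)
    return chain, accepted / n_iter

# Toy data (assumed): event counts observed over intervals of given length.
counts = [3, 5, 2, 4]
lengths = [10.0, 12.0, 8.0, 10.0]
prior_shape, prior_rate = 2.0, 2.0  # Gamma prior on the rate (assumed)

def log_post(rate):
    """Unnormalized log posterior: Gamma prior plus Poisson log-likelihood."""
    log_prior = (prior_shape - 1.0) * math.log(rate) - prior_rate * rate
    log_lik = sum(n * math.log(rate * dt) - rate * dt
                  for n, dt in zip(counts, lengths))
    return log_prior + log_lik

rng = random.Random(0)
# Start at the prior mean instead of a prior draw, for reproducibility.
chain, acc_rate = metropolis_hastings(log_post, 1.0, 20000, 0.5, rng)
kept = chain[2000::5]            # burn-in of 2000 draws, then keep every 5th
post_mean = sum(kept) / len(kept)
```

For this toy target the posterior is available in closed form (a Gamma distribution with shape 16 and rate 42), so `post_mean` can be compared with the exact mean 16/42 as a sanity check on the sampler. In practice `sigma` would be tuned until `acc_rate` falls in the desired range.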
The R package BOA (Smith 2007) provides functions for summarizing and plotting the output from McMC simulations, as well as diagnostic tests of convergence. It must be stressed, however, that no method can truly prove convergence; diagnostics can only detect failure to converge.
Model comparison We now compare the new proposed model with the representative ones of the two classes of self-exciting and self-correcting models. For the former, the ETAS model was chosen, and for the latter, the stress-release model. Among the Bayesian-oriented criteria for model selection, we adopt the classical Bayes factor and the Ando and Tsay criterion. The Bayes factor aims at model comparison by looking for the model that best fits the data, whereas the Ando and Tsay criterion chooses which model gives the best predictions of future observations generated by the same process as the original data.
4 Application
4.1 Dataset construction
List of the earthquakes associated with the Italian composite seismogenic source 25 (database DISS v. 3.0.2) with magnitude of at least 4.45
n  year/mo/da  ho:mi:se  \(M_w\)  Lat  Lon 

1  1873/07/12  06:06:–  5.38  41.686  13.778 
2  1873/12/13  –:–:–  4.47  41.405  13.983 
3  1874/12/06  15:50:–  5.48  41.655  13.827 
4  1877/08/24  02:45:–  5.21  41.710  13.351 
5  1879/02/23  18:30:–  5.59  42.766  13.043 
6  1885/04/10  01:44:–  4.57  41.820  13.104 
7  1898/06/27  23:38:–  5.50  42.414  12.903 
8  1901/04/24  14:20:–  5.25  42.100  12.736 
9  1901/07/31  10:38:30  5.16  41.719  13.750 
10  1902/10/23  08:51:–  4.74  42.357  12.839 
11  1903/11/02  21:52:–  4.81  42.794  13.074 
12  1904/02/24  15:53:26  5.68  42.097  13.319 
13  1904/02/25  00:25:–  4.56  42.076  13.319 
14  1913/01/03  13:39:25  4.53  41.868  13.657 
15  1914/06/12  06:42:–  4.66  41.477  13.879 
16  1915/01/13  06:52:43  7.08  42.014  13.530 
17  1915/01/13  16:44:–  4.79  41.983  13.600 
18  1915/01/13  20:19:–  4.74  41.983  13.600 
19  1915/01/14  01:50:–  4.64  41.983  13.600 
20  1915/01/14  07:17:–  4.88  41.855  13.018 
21  1915/01/14  16:55:22  4.60  41.765  13.024 
22  1915/01/18  20:08:–  4.98  41.983  13.600 
23  1915/01/18  23:31:–  5.02  41.983  13.600 
24  1915/01/21  12:29:28  4.83  41.926  13.231 
25  1915/02/27  23:23:05  4.77  41.680  13.543 
26  1915/04/05  06:18:58  4.80  42.050  12.906 
27  1915/09/23  18:07:–  5.07  42.415  13.076 
28  1915/12/04  01:02:–  4.47  41.754  13.548 
29  1916/01/26  12:22:–  4.72  41.638  13.610 
30  1916/04/22  04:33:–  5.09  42.292  13.397 
31  1917/07/08  02:–:–  4.68  42.082  13.087 
32  1920/06/21  07:22:–  4.62  41.621  13.748 
33  1922/12/29  12:22:06  5.24  41.793  13.632 
34  1927/10/11  14:45:08  5.20  41.841  13.466 
35  1957/04/11  16:19:–  4.94  42.256  13.079 
36  1960/03/14  04:44:–  4.72  42.037  13.267 
37  1961/04/06  11:34:42  4.55  41.986  13.378 
38  1961/04/10  06:56:00  4.55  42.020  13.037 
39  1961/10/31  13:37:–  5.09  42.407  13.064 
40  1964/08/02  10:40:–  4.53  42.835  13.036 
41  1969/04/17  09:12:–  4.59  41.550  13.789 
42  1979/09/19  21:35:37  5.83  42.730  12.956 
43  1979/09/19  21:52:50  4.46  42.812  13.012 
44  1980/02/28  21:04:40  4.97  42.800  12.967 
45  1980/05/24  20:16:04  4.48  43.087  13.190 
46  1980/06/14  20:56:50  4.96  41.905  13.696 
47  1983/08/12  19:36:30  4.76  41.655  14.045 
48  1984/05/07  17:50:–  5.86  41.667  14.057 
49  1984/05/07  18:07:15  4.47  41.644  13.863 
50  1984/05/11  10:41:49  5.47  41.651  13.843 
51  1984/05/11  10:50:07  4.79  41.756  13.903 
52  1984/05/11  11:26:15  4.49  41.706  13.876 
53  1984/05/11  13:14:55  4.80  41.732  13.901 
54  1984/05/11  13:39:01  4.50  41.732  13.898 
55  1984/05/11  16:39:18  4.62  41.648  13.870 
56  1984/06/24  22:02:44  4.57  41.761  13.828 
57  1984/07/01  07:47:12  4.63  41.704  13.915 
58  2009/04/06  01:32:40  6.29  42.309  13.510 
4.2 Graphical method and statistical tests for identification of failure models

successive TTT-statistics \(T_k = \sum _{j=1}^k X_{(j)} + (n_s - k) X_{(k)}\)

scaled TTT-statistics \(T_k^* = \displaystyle \frac{T_k}{T_{n_s}}\)
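A minimal Python sketch of these two statistics follows, where \(X_{(j)}\) is read as the j-th order statistic of a sample of size \(n_s\). The function name `scaled_ttt` and the sample values are hypothetical.

```python
def scaled_ttt(sample):
    """Return the scaled TTT-statistics T_k / T_n for k = 1, ..., n."""
    xs = sorted(sample)            # order statistics X_(1) <= ... <= X_(n)
    n = len(xs)
    totals = []
    running = 0.0
    for k, x in enumerate(xs, start=1):
        running += x               # sum of the k smallest observations
        totals.append(running + (n - k) * x)
    return [t / totals[-1] for t in totals]

# Illustrative sample (assumed values).
ttt = scaled_ttt([3.0, 1.0, 2.0])
```

Plotting the points \((k/n_s, T_k^*)\) gives the TTT plot. As a general rule of thumb, a curve that lies first below and then above the diagonal is compatible with a bathtub-shaped hazard.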
4.3 Results
Before illustrating the results achieved by fitting the three models (the stress-release, ETAS, and compound models) separately to the selected data, we want to make a clarification regarding the dataset. The CPTI15 parametric catalog from which these data were extracted is the richest collection of both historical and instrumental information on Italian seismicity. However, for this very reason, it has characteristics that both favor and penalize the classical models for hazard assessment. It cannot include all of the secondary events of past strong earthquakes, as required by the ETAS model, nor does it include only earthquakes that release most of the energy stored in a fault structure, as implied by the stress-release model. For this reason we do not expect the best performance from these two models, which also explains the need for a composite model.
Prior distributions of the parameters of the three models (compound, stress-release, and ETAS). Each prior is specified by its mean and variance. The modified Weibull distribution with parameters \(a_1, \, b_1, \, c_1\) models the failure times of the subordinate events in the compound model. Empty cells are marked with –
Compound model  Prior  Stress-release model  Prior  ETAS model  Prior

\(\alpha\)  N(3, 7.3)  \(\alpha\)  N(\(-\)0.5, 0.2)  –  –
\(\beta\)  \({\varGamma }\)(15, 180)  \(\beta\)  \({\varGamma }\)(0.05, 0.002)  –  –
\(\rho\)  \({\varGamma }\)(0.02, 0.00032)  \(\rho\)  \({\varGamma }\)(0.5, 0.2)  –  –
–  –  –  –  \(\mu\)  \({\varGamma }\)(0.5, 0.2)
\(a_1\)  \({\varGamma }\)(0.5, 0.2)  –  –  K  \({\varGamma }\)(1, 0.81)
\(b_1\)  \({\varGamma }\)(1, 0.81)  –  –  a  \({\varGamma }\)(1, 0.81)
\(c_1\)  \({\varGamma }\)(1, 0.81)  –  –  c  \({\varGamma }\)(0.2, 0.032)
–  –  –  –  p  \({\varGamma }\)(1, 0.81)
\(b_{\ell }\)  \({\varGamma }\)(1.5, 1.82)  b  \({\varGamma }\)(1.5, 1.82)  b  \({\varGamma }\)(2.5, 5)
\(b_s\)  \({\varGamma }\)(1.5, 1.82)  –  –  –  –
\(\gamma\)  \({\varGamma }\)(1, 0.81)  –  –  –  –
The marginal likelihoods of the three models, on a \(\log _{10}\) scale, are: \(\log _{10} {\mathcal{L}}_{SR}(data) = -54.850\) for the stress-release model; \(\log _{10} {\mathcal{L}}_{ETAS}(data) = -8.551\) for the ETAS model; and \(\log _{10} {\mathcal{L}}_{{\mathcal{C}}_{MW}}(data) = -2.745\) and \(\log _{10} {\mathcal{L}}_{{\mathcal{C}}_{AW}}(data) = -2.847\) for the compound model with the MW and AW distributions, respectively. In light of these results, we evaluate the Bayes factor of the compound model \({\mathcal{C}}_{MW}\) (\({\mathcal{M}}_1\)) with respect to the stress-release model and, in turn, the ETAS model (each taken as \({\mathcal{M}}_2\)). Here, we have \(\log _{10} BF_{12} = 52.105\) and \(\log _{10} BF_{12} = 5.806\), respectively. By analogy, using the model \({\mathcal{C}}_{AW}\), we have \(\log _{10} BF_{12} = 52.003\) and \(\log _{10} BF_{12} = 5.704\), respectively. Comparing the Bayes factors with Jeffreys’ scale (Kass and Raftery 1995), it can be seen that the evidence in favor of the compound model is decisive. As observed at the beginning of this subsection, such evidence is not surprising given the type of catalog analyzed; nevertheless, it is so strong that we are confident about the possibility of obtaining good performance in other applications as well. Details of the computational aspects relating to the evaluation of the Bayes factor can be found in Rotondi and Varini (2007).
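The Bayes factors quoted above follow from differences of the log marginal likelihoods. The short Python sketch below reproduces that arithmetic. It assumes the four \(\log _{10}\) marginal likelihoods are negative numbers, which is the only reading consistent with the reported positive Bayes factors. The dictionary keys are hypothetical labels.

```python
# log10 marginal likelihoods (signs assumed negative, see lead-in).
log10_ml = {
    "SR": -54.850,     # stress-release model
    "ETAS": -8.551,    # ETAS model
    "C_MW": -2.745,    # compound model, modified Weibull
    "C_AW": -2.847,    # compound model, additive Weibull
}

def log10_bayes_factor(model_1, model_2):
    """log10 BF_12 = log10 L(data | M1) - log10 L(data | M2)."""
    return log10_ml[model_1] - log10_ml[model_2]

bf_mw_sr = log10_bayes_factor("C_MW", "SR")
bf_mw_etas = log10_bayes_factor("C_MW", "ETAS")
bf_aw_sr = log10_bayes_factor("C_AW", "SR")
bf_aw_etas = log10_bayes_factor("C_AW", "ETAS")
```

All four differences exceed 2 on the \(\log _{10}\) scale, the conventional threshold for decisive evidence.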
Through the MH algorithm (Sect. 3) we have generated sequences of \(R_{tot} = 5{,}500{,}000\) samples from the posterior distributions of the parameters, of which the first 500,000 are discarded as burn-in. To reduce autocorrelation, these chains are then thinned by discarding all but every 125th value. Lognormal distributions are used as proposal distributions, with the mean equal to the current value of the Markov chain and the variance such that the resulting acceptance rate is in the range (25%, 40%). Finally, some diagnostics, implemented in the BOA R package, are applied to check the convergence of each Markov chain to its target distribution. The posterior mean and standard deviation of each model parameter are reported in Table 4. As an example of the posterior distributions produced by the MH algorithm, Fig. 4 shows the prior and posterior densities of the parameters \(\gamma\), \(\rho\), and a of the compound, stress-release, and ETAS models, respectively.
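As a quick check of what this burn-in and thinning leave for inference, the following Python sketch counts the retained draws. It assumes thinning starts from the first post-burn-in draw, a detail the text does not specify. The variable names are hypothetical.

```python
R_TOT = 5_500_000    # total number of MH iterations
BURN_IN = 500_000    # initial draws discarded
THIN = 125           # keep every 125th draw after burn-in

# Indices of the retained draws, without materializing the chain itself.
kept_indices = range(BURN_IN, R_TOT, THIN)
n_kept = len(kept_indices)
```

Under this assumption each chain contributes 40,000 draws to the posterior summaries.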
As for the ETAS model, we note that the parameter estimates satisfy the conditions that make the process non-explosive, that is, \({\hat{p}}>1\), \({\hat{b}} > {\hat{a}}\), and \({\hat{K}} < ({\hat{b}} - {\hat{a}})/{\hat{b}}\). The estimated loading rate, \({\hat{\rho }}=0.016\), of the stress-release component of the compound model is consistent with the value obtained in Varini et al. (2016) by taking into account the events exceeding the magnitude threshold \(M_w \, 5.3\), associated with the Italian composite seismogenic sources that share the same tectonic characteristics. The same parameter in the stress-release model has a larger value, \({\hat{\rho }}=0.20\), because it must compensate for larger stress release due to the analysis of the entire data set.
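The three non-explosion conditions can be verified directly from the reported posterior means of the ETAS parameters. The Python sketch below does so, reading the third condition as \(K < (b - a)/b\). The function name is hypothetical.

```python
def etas_nonexplosive(K, a, b, p):
    """Check the conditions p > 1, b > a, and K < (b - a) / b."""
    return p > 1.0 and b > a and K < (b - a) / b

# Posterior means reported for the ETAS model.
estimates = {"K": 0.01, "a": 1.54, "b": 2.03, "p": 1.09}
is_stable = etas_nonexplosive(**estimates)
bound_on_K = (estimates["b"] - estimates["a"]) / estimates["b"]
```

With these values the bound on K is about 0.24, well above the estimate of 0.01.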
Table 4 Posterior means and standard deviations (in brackets) of the parameters of the compound, stress-release, and ETAS models. The modified Weibull distribution with parameters \(a_1, \, b_1, \, c_1\) models the failure times of the subordinate events in the compound model. Empty cells are marked with –
Compound model  Estimate  Stress-release model  Estimate  ETAS model  Estimate

\({\hat{\alpha }}\)  2.75 (1.07)  \({\hat{\alpha }}\)  \(-\)0.70 (0.20)  –  –
\({\hat{\beta }}\)  12.84 (3.40)  \({\hat{\beta }}\)  0.02 (0.16)  –  –
\({\hat{\rho }}\)  0.016 (0.0008)  \({\hat{\rho }}\)  0.20 (0.16)  –  –
–  –  –  –  \({\hat{\mu }}\)  0.17 (0.05)
\({\hat{a}}_1\)  0.21 (0.04)  –  –  \({\hat{K}}\)  0.01 (0.007)
\({\hat{b}}_1\)  0.61 (0.33)  –  –  \({\hat{a}}\)  1.54 (0.28)
\({\hat{c}}_1\)  0.45 (0.22)  –  –  \({\hat{c}}\)  0.0003 (0.0004)
–  –  –  –  \({\hat{p}}\)  1.09 (0.09)
\({\hat{b}}_{\ell }\)  1.66 (0.52)  \({\hat{b}}\)  1.99 (0.26)  \({\hat{b}}\)  2.03 (0.26)
\({\hat{b}}_s\)  2.99 (0.42)  –  –  –  –
\({\hat{\gamma }}\)  0.39 (0.12)  –  –  –  –
5 Final remarks
In this study we propose a new compound model for earthquake occurrences that captures contrasting features of seismic activity related to the clustering and elastic rebound processes. Other models in the literature share this aim, but they originated in contexts that differ both in the availability of geological and geodetic data and in the wealth of historical information. Indeed, some seismically active regions are characterized by mainly linear faults, such as the San Andreas Fault that extends for roughly 1300 km through California. When an earthquake takes place along that fault, it means that some section of that system has ruptured, which reduces the probability of that subsection participating in a future earthquake. By contrast, in Italy we have complex, fragmented fault systems that extend 40–50 km in southern Italy and just 10–20 km in central Italy, the components of which are small faults that are related to each other in a dynamic way.
A weak point of the proposed model is a consequence of its multilevel structure: because the model is hierarchical, the uncertainty in forecasting a single variable increases. For example, by the law of total variance, the error \(var(N)\) in the number of subordinate events N is the sum of the expected variance of N, averaged over all values of the length \({\varDelta }t\) of the interval between two consecutive leader events, \(E[var(N \mid {\varDelta }t)]\), and the variability of \(E(N \mid {\varDelta }t)\) itself, \(var[E(N \mid {\varDelta }t)]\).
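As a hedged worked case, assume for illustration that, given the interval length, the number of subordinate events is Poisson with mean \(\gamma \, {\varDelta }t\) (an assumption suggested by the Poisson form of the posterior for \(\gamma\) in Sect. 3). Then \(E(N \mid {\varDelta }t) = var(N \mid {\varDelta }t) = \gamma \, {\varDelta }t\), and the decomposition becomes

\(var(N) = E[var(N \mid {\varDelta }t)] + var[E(N \mid {\varDelta }t)] = \gamma \, E({\varDelta }t) + \gamma ^2 \, var({\varDelta }t)\)

The first term is the variance that would remain if the interval length were known exactly. The second term is the extra uncertainty inherited from the leader process, and it grows with the variability of the inter-leader times.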
As indicated in the text, more general versions of the model can be investigated; these would allow a time-dependent parametrization of the generalized Weibull distribution (see Sect. 2.2) and could be applied to more general data sets (see Sect. 3). Moreover, we plan to examine which zones are most suitable for applying the new model: seismogenic sources, when the events associated with their fault structures are known, or larger areal sources characterized by their tectonic properties.
Notes
Acknowledgements
We are grateful to the Editor and two reviewers for their helpful comments. The authors also thank Roberto Basili for providing the earthquake association with the fault source. This work was partly financed by the Italian Ministry of Education, University and Research (MIUR) in the framework of the PRIN2015 project ‘Complex space–time modeling and functional analysis for probabilistic forecast of seismic events’.
References
 Aarset MV (1985) The null distribution for a test of constant versus bathtub failure rate. Scand J Statist 12:55–61Google Scholar
 Ando T, Tsay R (2010) Predictive likelihood for Bayesian model selection and averaging. Int J Forecast 26:744–763CrossRefGoogle Scholar
 Barlow RE, Campo R (1975) Total time on test processes and applications to failure data analysis. In: Barlow RE, Fussel HB, Singpurwalla ND (eds) Reliability and fault tree analysis: theoretical and applied aspects of system reliability and safety assessment. SIAM, Philadelphia, pp 451–481Google Scholar
 Basili R, Valensise G, Vannoli P, Burrato P, Fracassi U, Mariano S, Tiberti MM, Boschi E (2008) The database of individual seismogenic sources (DISS), version 3: summarizing 20 years of research on Italy’s earthquake geology. Tectonophysics 453(1–4):20–43. https://doi.org/10.1016/j.tecto.2007.04.014 CrossRefGoogle Scholar
 Bebbington M, Harte DS (2003) The linked stress release model for spatiotemporal seimicity: formulations, procedures and applications. Geophys J Int 154(3):925–946CrossRefGoogle Scholar
 Bebbington M, Lai CD, Zitikis R (2007) A flexible Weibull extension. Reliab Eng Syst Saf 92:719–726CrossRefGoogle Scholar
 Berger J (2006) The case for objective Bayesian analysis. Bayesian Anal 1(3):385–402CrossRefGoogle Scholar
 Bergman B (1979) On age replacement and the total time on test concept. Scand J Stat 6:161–168Google Scholar
 Daley DJ, VereJones D (2003) An introduction to the theory of point processes, vol I. Springer, New YorkGoogle Scholar
 DISS Working Group (2007) Database of Individual Seismogenic Sources (DISS), Version 3.0.2: A compilation of potential sources for earthquakes larger than M 5.5 in Italy and surrounding areas, http://diss.rm.ingv.it/diss/, \(\copyright\) INGV (2007) Istituto Nazionale di Geofisica e Vulcanologia. Rome, Italy, https://doi.org/10.6092/INGV.ITDISS3.0.2
 Field EH, Jordan TH, Page MT, Milner KR, Shaw BE, Dawson TE, Biasi GP, Parsons T, Hardebeck JL, Michael AJ, Weldon RJ II, Powers PM, Johnson KM, Zeng Y, Felzer KR, van der Elst N, Madden C, Aeeowsmith R, Werner MJ, Thatcher WR (2017) A synoptic view of the third uniform California earthquake rupture forecast (UCERF3). Seismol Res Lett 88(5):1259–1267. https://doi.org/10.1785/0220170045 CrossRefGoogle Scholar
 Gamerman D, Lopes HF (2006) Markov chain Monte Carlo: stochastic simulation for Bayesian inference, 2nd edn. CRC Press, LondonGoogle Scholar
 Gerstenberger MC, Wiemer S, Jones L (2004) Realtime forecast of tomorrow’s earthquakes in California: a new mapping tool. Technical Report OpenFile Report 20041390, U.S. Geological SurveyGoogle Scholar
 Gerstenberger MC, Rhoades DA, McVerry GH (2016) A hybrid timedependent probabilistic seismichazard model for Canterbury, New Zealand. Seismol Res Lett 87(6):1311–1318. https://doi.org/10.1785/0220160084 CrossRefGoogle Scholar
 Gruppo di lavoro CPTI (2004) Catalogo Parametrico dei Terremoti Italiani, versione 2004 (CPTI04), INGV, Bologna. https://doi.org/10.6092/INGV.ITCPTI04
 Kagan YY (2017) Worldwide earthquake forecasts. Stoch Environ Res Risk Assess 31:1273–1290. https://doi.org/10.1007/s0047701612689 CrossRefGoogle Scholar
 Kagan YY, Schoenberg F (2001) Estimation of the upper cutoff parameter for the tapered Pareto distribution. J Appl Probab 38A:901–918Google Scholar
 Kanamori H, Brodsky EE (2004) The physics of earthquakes. Rep Prog Phys 67:1429–1496CrossRefGoogle Scholar
 Kass RE, Raftery AE (1995) Bayes factor. J Am Stat Assoc 90(430):773–795CrossRefGoogle Scholar
 Lai CD (2014) Generalized Weibull distributions, vol 118. Springer Briefs in Statistics. https://doi.org/10.1007/9783642391064_1
 Meletti C, Marzocchi W, Albarello D, D’Amico V, Luzi L, Martinelli F, Pace B, Pignone M, Rovida A, Visini F and the MPS16 Working Group (2017) The 2016 Italian seismic hazard model. In: Paper \(\text{N}^\circ\) 747, 16th world conference on earthquake, 16WCEE 2017, Santiago Chile, January 9–13, 2017, p 12Google Scholar
 Ogata Y (1998) Space–time point process models for earthquake occurrences. Ann Inst Stat Math 50:379–402CrossRefGoogle Scholar
 Ogata Y (2011) Significant improvements of the space–time ETAS model for forecasting of accurate baseline seismicity. Earth Planets Space 53(6):217–229CrossRefGoogle Scholar
 Ogata Y, Zhuang J (2006) Spacetime ETAS models and an improved extension. Tectonophysics 413:13–23CrossRefGoogle Scholar
 Reid HF (1911) The elasticrebound theory of earthquakes. Univ Calif Pub Bull Dept Geol Sci 6:413–444Google Scholar
 Rhoades DA, Evison FF (2004) Longrange earthquake forecasting with every earthquake a precursor according to scale. Pure Appl Geophys 161(1):47–72CrossRefGoogle Scholar
 Rotondi R, Garavaglia E (2002) Statistical analysis of the completeness of a seismic catalogue. Nat Hazards 25(3):245–258. https://doi.org/10.1023/A:1014855822358 CrossRefGoogle Scholar
 Rotondi R, Varini E (2007) Bayesian inference of stress release models applied to some Italian seismogenic zones. Geophys J Int 169(1):301–314CrossRefGoogle Scholar
 Rovida A, Locati M, Camassi R, Lolli B, Gasperini P (eds) (2016) CPTI15, the 2015 version of the Parametric Catalogue of Italian Earthquakes. Istituto Nazionale di Geofisica e Vulcanologia. https://doi.org/10.6092/INGV.ITCPTI15
 Schoenberg F, Bolt B (2000) Shortterm exciting, longterm correcting models for earthquake catalogs. Bull Seismol Soc Am 90(4):849–858. https://doi.org/10.1785/0119990090 CrossRefGoogle Scholar
 Schoenberg FP, Patel RD (2012) Comparison of Pareto and tapered Pareto distributions for environmental phenomena. Eur Phys J Spec Top 205(1):159–166. https://doi.org/10.1140/epjst/e2012015684 CrossRefGoogle Scholar
 Senatorski P (2007) Apparent stress scaling and statistical trends. Phys Earth Planet Inter 160:230–244CrossRefGoogle Scholar
 Smith BJ (2007) boa: An R package for MCMC output convergence assessment and posterior inference. J Stat Softw 21:11. https://doi.org/10.18637/jss.v021.i11 CrossRefGoogle Scholar
 Varini E (2005) Sequential estimation methods in continuoustime statespace models. Ph.D. Thesis, Institute of Quantitative Methods, Bocconi University, Milano, ItalyGoogle Scholar
 Varini E (2008) A Monte Carlo method for filtering a marked doubly stocastic Poisson process. Stat Methods Appl 17:183–193CrossRefGoogle Scholar
 Varini E, Rotondi R (2015) Probability distribution of the waiting time in the stress release model: the Gompertz distribution. Environ Ecol Stat 22:493511. https://doi.org/10.1007/s1065101403072 CrossRefGoogle Scholar
 Varini E, Rotondi R, Basili R, Barba S (2016) Stress release model and proxy measures of earthquake size. Application to Italian seismogenic sources. Tectonophysics 682:147–168. https://doi.org/10.1016/j.tecto.2016.05.017 CrossRefGoogle Scholar
 VereJones D (1978) Earthquake prediction—a statistician’s view. J Phys Earth 26:129–146CrossRefGoogle Scholar
 VereJones D, Ozaki T (1982) Some examples of statistical estimation applied to earthquake data. Ann Inst Stat Math 34(Part B):189–207CrossRefGoogle Scholar
 Votsi I, Limnios N, Tsaklidis G, Papadimitriou E (2014) Hidden semiMarkov modeling for the estimation of earthquake occurrence rates. Commun Stat Theory Methods 43:1484–1502CrossRefGoogle Scholar
 Watanabe S (2010) Asymptotic equivalence of Bayes cross validation and widely applicable information criterion in singular learning theory. J Mach Learn Res 11:3571–3594Google Scholar
 Wells DL, Coppersmith KL (1994) New relationships among magnitude, rupture length, rupture width, rupture area, and surface displacement. Bull Seismol Soc Am 84(4):974–1002Google Scholar
Copyright information
Open Access. This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.