A new approach to model the counts of earthquakes: INARPQX(1) process

This paper introduces a first-order integer-valued autoregressive process with a new innovation distribution, shortly INARPQX(1) process. A new innovation distribution is obtained by mixing Poisson distribution with quasi-xgamma distribution. The statistical properties and estimation procedure of a new distribution are studied in detail. The parameter estimation of INARPQX(1) process is discussed with two estimation methods: conditional maximum likelihood and Yule-Walker. The proposed INARPQX(1) process is applied to time series of the monthly counts of earthquakes. The empirical results show that INARPQX(1) process is an important process to model over-dispersed time series of counts and can be used to predict the number of earthquakes with a magnitude greater than four.


Introduction
Destructive earthquakes are one of the biggest problems of all humanity. Earthquakes are not only natural disasters that threaten the people's lives, but also affect the economies of the countries negatively. Therefore, statistical modeling of the counts of earthquakes and taking measures according to the model results are of great importance. Many different statistical models have been used in the literature to model earthquake events. Turkan and Ozel [26] used the linear regression, beta regression, and semi-parametric additive regression to model the casualty rate of earthquakes and related covariates. The relation between the magnitude of an earthquake and the number of deaths was investigated in several piece of researches such as Samardjieva and Badal [22], Gutiérrez et al. [10], Wyss [30], Koshimura et al. [15], Zhao et al. [31], and Wu et al. [29]. Besides these regression models, stochastic models are also powerful tools to model earthquakes. Stochastic point process models for earthquakes were discussed by Brillinger [7], Vere-Jones [27], Ogata [19,20] and Zhuang et al. [32,33]. Gospodinov and Rotondi [9] and Aktas et al. [1] used the compound Poisson process to model the cumulative energy release of mainshocks and the expected value of the earthquake, respectively. Ozel [21] introduced a bivariate compound Poisson model to predict the number of foreshocks and aftershocks.
The first-order integer-valued autoregressive process, shortly INAR (1), is an alternative model to predict the number of the time of series of counts such as monthly deaths from corona-virus, monthly counts of passengers for a specific airline company or monthly counts of destructive earthquakes for a specific region or a country. The INAR(1) process with Poisson innovations, known as INARP (1), was developed by McKenzie [17,18] and Al-Osh and Alzaid [2] and increased its popularity in the last decade. The time series of counts generally display overdispersion. The INARP(1) process can be used to model equi-dispersed time series of counts which means that the ratio of sample variance to sample mean should be one. The over-dispersion occurs when the sample variance is greater than the sample mean, the opposite indicates the under-dispersion. After the important researches of McKenzie [17,18] and Al-Osh and Alzaid [2], the researches have focussed on the distribution of an innovation process of INAR(1) to develop new models for over-dispersed or under-dispersed time series of counts. In what follows, we list some recent contributions on overdispersed INAR(1) processes: INAR(1) process with geometric innovations (INARG(1)) by Jazi et al. [13], INAR(1) with Poisson-Lindley innovations (INARPL(1)) by Lívio et al. [16], compound Poisson INAR(1) by Schweer and Weiß [23], processes INAR(1) process with Katz family innovations by Kim and Lee [14], INAR(1) process with generalized Poisson and double Poisson innovations by Bourguignon et al. [6] , INAR(1) process with geometric marginals by Borges et al. [5] and INAR(1) process with Skellam innovations by Andersson and Karlis [4], INAR(1) process with a new Poisson-weighted exponential innovations by Altun [3].
In this study, we first introduce a new two-parameter discrete distribution as an alternative distribution for the case of over-dispersion. The proposed distribution is obtained by mixing the Poisson distribution with a quasi-xgamma distribution of Sen and Chandra [24]. The resulting distribution is called as Poisson-quasi-xgamma (PQX) distribution. The statistical properties of the PQX distribution are studied in detail as well as its parameter estimation with different methods of estimations. Then, we introduce a new INAR(1) model by using the PQX distribution as an innovation process and called this process as a INARPQX (1). The goal of the presented study is to open a new opportunity to model over-dispersed time series of counts with a more flexible innovation distribution than existing ones. To demonstrate the effectiveness of the proposed process, we use the earthquake data set (magnitude 4 and above) of Turkey. The earthquake data set is regularized as a monthly basis to predict the monthly counts of the earthquakes with magnitude 4 and above occurred in Turkey. We model the monthly counts of earthquakes by INAR(1) processes with several innovation distributions as well as PQX distribution.
The ongoing sections of the study are organized as follows. In Sect. 2, the PQX distribution is presented with its statistical properties. In Sect. 3, we discuss the parameter estimation procedure of the PQX distribution with maximum likelihood and method of moments estimation methods. In Sect. 4, Expectation-Maximization (EM) algorithm for the PQX distribution is given. Section 5 deals with the simulation study to compare the finite sample performance of the estimation methods. In Sect. 6, we introduce a new INAR(1) process with PQX innovations. Section 7 contains an application to the earthquake data of Turkey. We summarize the findings of the study in Sect. 8.

Poisson-quasi-xgamma distribution
Let the random variable (rv) X follows a Poisson distribution. The probability mass function (pmf ) of X is where > 0 . The Poisson distribution is an attractive distribution and is widely used in many fields because of its tractable properties and software support. However, many count data sets exhibit over-dispersion or underdispersion in which case the Poisson distribution does not work well. When the empirical variance is greater than the empirical mean, the over-dispersion occurs, and opposite case represents the under-dispersion. The negative-binomial (NB) distribution is a common choice for over-dispersed count data sets. However, in the last decade, researchers have introduced more sophisticated discrete probability distributions to open a new opportunity to model fat-tailed over-dispersed count data sets. Here, we introduce an alternative distribution to model over-dispersed time series of counts. To do this, the quasixgamma distribution, introduced by Sen and Chandra [24], is used as a mixing distribution for Poisson parameter . The probability density function (pdf ) of quasi-xgamma distribution is where > 0 and > 0 . The quasi-xgamma distribution is a special mixture of exponential and gamma densities and contains Gamma( , 3) (for = 0 ) and xgamma( ) (for = ), introduced by Sen et al. [25], distributions as its submodels. Now, we introduce a new two-parameter (2) f (x; , ) = 1 + + 1. The PQX distribution reduces to Poisson-xgamma distribution for = , 2. The PQX distribution reduces to NB(3, ∕( + 1)) for = 0, 3. The PQX distribution reduces to Geometric( ∕( + 1)) for → ∞.
The corresponding cumulative distribution function (cdf ) to (3) is The rest of the section is devoted to the inference of the statistical properties of PQX distribution and its parameter estimation.

Statistical properties
Proposition 2 Let the rv X follow a PQX distribution. The factorial moments of X are given by Proof The factorial moments of X can be obtained by using mixing formula, as follows The proof is completed. ◻ Using (5), the mean and variance of PQX distribution are given, respectively, by The dispersion index (Var(X)/E(X)) of PQX is given by As seen from (8), since the parameter and are greater than zero, the dispersion index of PQX distribution is always greater than 1 which ensures that the PQX distribution is over-dispersed.

Proposition 3
The probability generating function (pgf) of PQX distribution is given by . Proof The pgf of X can be written in the form Using (10), the pgf of X can be obtained as follows

◻
The moment generating function of PQX distribution can be easily obtained by substituting s with exp (t) in (9). Then, we have The quasi-xgamma distribution is a special mixture of exponential( ) and gamma(3, ) distributions with mixing proportion, p = ∕( + 1) . This property of the quasixgamma distribution can be used to generate rvs from PQX distribution. The below algorithm is given based on the mixture property of the quasi-xgamma.
The skewness and kurtosis plots of the PQX distribution against the mean and dispersion values are displayed in Fig. 1. The parameter is taken 1 and parameter is increased in the interval (0. 5,15). From these figures, we conclude that the skewness and kurtosis decrease when the mean and dispersion increase. The pmf of the NB distribution is (14) p(y; , p) = Γ(y + ) The skewness and kurtosis measures of PQX distribution can be easily obtained based on the first four raw moments by using (12) S = 3 2 + 3 3 + 2 3 + 5 2 2 + 27 2 +30 2 + 7 2 + 33 + 18 + 3 2 + 9 + 6 8 + 3 + 4 The skewness and kurtosis plots of the NB distribution against the mean and dispersion values are displayed in Fig. 2. The parameter is taken 0.5, and parameter p is increased in the interval (0.1, 0.9). From these figures, we conclude that the skewness and kurtosis decrease when the mean and dispersion increase. So, the NB and PQX distributions have the similar behaviors. However, to understand the differences between these distributions, | https://doi.org/10.1007/s42452-020-04109-8 Research Article  (14), the meanparametrized NB distribution is given by (3), we have the pmf of the meanparametrized PQX, given by where > 0 , > 0 , 1 < < 3 and E(X) = . Figure 3 displays the skewness and kurtosis values of the mean-parametrized NB and PQX distributions against their dispersion values. The mean parameter is determined as 1 for both distributions. The parameter and are increased in the interval (1.5, 2.5) since we have condition on the parameter space of the mean-parametrized PQX distributions, such as 1 < < 3 . As seen from these figures, when the dispersion increases, the skewness and kurtosis increase for both distributions. However, under these re-parametrization, the PQX distribution is able to model wider range of skewness and kurtosis than the NB distribution.
We also compare the right-tail probabilities of the mean-parametrized NB and PQX distributions. Table 1 lists the tail probabilities of the NB and PQX distributions for some values of the mean parameter and . The results given in Table 1 indicate that the PQX has fatter right-tail than the NB distribution.

Shape of PQX(˛, ) distribution
Using (3), the ratio of successive probabilities P(X = x + 1)∕P(X = x) for PQX( , ) is given as Further, the ratio P(X =x+1) P(X =x) < (>)1 implies that the pmf is decreasing (increasing). Hence, solving the equation P(X =x+1) P(X =x) = 1 for non-integer x (say), the roots are given as (17) Fig. 3 The skewness and kurtosis comparison of the mean-parametrized NB and PQX distributions Table 1 The comparison of tail probabilities of the NB and PQX distributions Tail probabilities Hence, it can be easily verified as that , the ratio is less than 1; hence, the PQX( , ) is unimodal with mode at 0.
is bimodal with the mode at 0 and ⌈x * * 0 ⌉.
The results given above are graphically summarized in Fig. 4 which displays the modality regions of the PQX   distribution with respect to the values of the parameters and .
In Fig. 5, we depict different shapes of PQX( , ) distribution for different parameters values. As seen from these figures, PQX distribution has bimodal and unimodal shapes with right skewness and symmetric. Moreover, for large value for and the PQX distribution can be used to zero inflated count data sets.

Maximum likelihood
Let x 1 , x 2 , … , x n be a random observations from the PQX distribution. The log-likelihood function is The partial derivatives of (19) with respect to and , the following score vectors are obtained.
The simultaneous solution of (20) and (21) gives the maximum likelihood estimates (MLEs) of and . However, since the likelihood equations contain nonlinear functions, there is no explicit form for the MLEs of the parameters of PQX distribution. Therefore, the log-likelihood function in (19) should be maximized by using statistical software such as R, S-PLUS, Mathematica or MATLAB. Here, nlm function of R software is used to minimize the minus of log-likelihood function which is equivalent to maximization of log-likelihood. The inverse of observed information matrix evaluated at MLEs of the parameters is used to obtain corresponding standard errors. To do this, fdHess function of R software is used. The asymptotic confidence intervals of the parameters are obtained by where z p∕2 is the upper p/2 quantile of the standard normal distribution.

Method of moments
The method of moments estimators of the parameters and can be obtained by equating theoretical moments of PQX distribution to sample moments, given as follows where m 1 and m 2 are the first and second sample moments, respectively. Simultaneous solution of (22) and (23) Taking the second partial derivative of g(t), we have which ensures that the g(t) is strictly convex. Using the Jensen's inequality, So, since g E X = g( ) = g 2( + 3)∕( (2 + 2)) = , we obtain E ̂M M > . ◻

Proposition 5 For fixed , the MM estimator ̂ of is consistent and asymptotically normal
where Proof Since g � ( ) exists and is nonzero valued, by the delta method, we have where g(x) =̂ , g( ) = and The proof is completed. ◻

EM algorithm for PQX(˛, ) distribution
In this section, we deals in the estimation of parameter and of PQX distribution by another estimation method known as Expectation-Maximization (EM) (see Dempster et al., [8]). The EM algorithm consists of two steps: the E-step and the M-step. E-Step computes the expectation of the unobservable part given the current values of the parameters and M-step maximizes the complete data likelihood and updates the parameters using the conditional expectations obtained in E-step. This procedure can be useful when there are no closed-form expressions for estimating the parameters and the derivatives of the likelihood are complicated.
To start with, a hypothetical complete-data distribution is defined with joint probability function It is straightforward to verify that the E-step of an EM cycle requires the computation of the conditional expectations o f is the current value of ( , ) . Using, .
The EM cycle completes with the M-step, involving complete data maximum likelihood over ( , ) , with the missing 's replaced by their conditional expectations

Simulation
The finite sample performance of MM and MLE methods is compared via simulation study. The below simulation procedure is implemented for this purpose.
1. Determine the sample size n and the parameter values of PQX distribution, = ( , ), 2. Generate random observations from PQX distribution for given n and parameter vector, 3. Using the random sample in step 2, estimate with MLE and MM methods, 4. Repeat the steps 2 and 3 based on the replication number, N, 5. Using the estimated parameter vector, ̂ , and true parameter vector, , calculate the biases, mean relative estimates (MREs) and mean square errors (MSEs) by using, The simulation is carried out with statistical software R. We generate n = 50, 55, 60, … , 300 sample of size from PQX distribution. The simulation replication number is N = 1000 . The true parameter vector is = ( = 0.5, = 1.5) . Figure 6 displays the simulation results. We expect that the estimated biases and MSEs should be near the zero for large n values. The results verify the expectation. The biases and MSEs approach the their nominal value, zero, for n → ∞ . Additionally, the MREs are near the one. The MLE and MM methods behave very similar for the parameter . However, the MLE method approaches the nominal values of biases, MSEs, and MREs more faster than MM method for the parameter . Therefore, we suggest practitioners to use MLE method for estimating the parameters of PQX distribution. Vol:.(1234567890) Research Article SN Applied Sciences (2021) 3:274 | https://doi.org/10.1007/s42452-020-04109-8

INAR(1) model with PQX innovations
Let the innovations t ℕ be an independent and identically distributed (iid) process with E t = and Var t = 2 . The process, X t ℕ , is called as INAR(1) if it follows a recursion where 0 ≤ p < 1 . The symbol, • , represents a binomial thinning operator and is defined as where Z j j≥1 is referred to as a counting series and be a sequence of iid Bernoulli rvs with Pr Z j = 1 = p . The INAR(1) process is stationary for 0 ≤ p < 1 . The process is non-stationary for the case p = 1 . The INAR(1) process is a homogeneous Markov chain with the one-step transition probabilities given by McKenzie [17], Al-Osh and Alzaid [2]) The mean, variance, and dispersion index of INAR(1) process are given, respectively, by Weiß [28] (36)

Estimation
The estimation procedure of INAR(1) process is discussed with three estimation methods. These are conditional maximum likelihood (CML), Yule-Walker (YL), and conditional least squares (CLS). The finite sample performance of these estimation methods is compared via extensive simulation study. The rest of this section is devoted to theoretical background of the used estimation methods.

Conditional maximum likelihood
Let X 1 , X 2 , ..., X T be a random sample from the stationary process, INARPQX(1). The conditional log-likelihood function of INARPQX (1) is The CML estimators of (p, , ) can be obtained by direct maximization of log-likelihood function given in (46). The nlm function of R software is used to minimize the minus of log-likelihood function. The inverse of observed information matrix is used to obtain corresponding standard errors of the CML estimation of the INARPQX(1) process. The observed information matrix is obtained by fdHess function of R software.

Yule-Walker
The YW estimators of INARPQX(1) process are obtained by equating the sample moments to the theoretical moments of the process. Since the autocorrelation function (ACF) of INAR(1) process at lag h is x (h) = p h , the YW estimator of p is given by Research Article SN Applied Sciences (2021) 3:274 | https://doi.org/10.1007/s42452-020-04109-8 The YW estimators of and are obtained by equating the sample mean and sample dispersion to theoretical mean and theoretical dispersion of the process. The YW estimator of is given by Substituting with (48) in (43) and equating (43) to sample dispersion, D I X , we have

Model accuracy
The standardized Pearson residual is used to check the fitted model accuracy. The standardized Pearson residuals are given by where E X t | | x t−1 and Var X t | | x t−1 are given in (44) anf (45), respectively. When the model is adequate for fitted data set, the standardized Pearson residuals should be uncorrelated with zero mean and unit variance. If the variance of standardized Pearson residuals is higher/lower than one, it , t = 2, 3, ..., T , shows that there is more or less dispersion considered by fitted model (see, Harvey and Fernandes, [11]).

Simulation
We compare the finite sample performance of CML and YW estimators of the parameters of INARPQX(1) process with a brief simulation study. We generate n = 100, 300 and 500 sample of sizes from INAR (1) Table 2.
Based on the results given in this

Empirical study
In this section, we illustrate the importance of the INAR-PQX(1) by an application on the earthquake data of Turkey. Firstly, we describe the created earthquake catalog and its properties. In the second step, INAR(1) processes defined under different innovation distributions are used to model the monthly counts of the earthquake.

Study area and data
The earthquake data of the Turkey is obtained from the Disaster & Emergency Management Authority Presidential of Earthquake Department, Turkey. The data are available from https ://depre m.afad.gov.tr/?lang=en. The used data contain the earthquakes with magnitude 4 and above occurred in Turkey between the dates 6 January 2012 and 14 October 2018. Firstly, the earthquake catalog of the Turkey is created by using the ETAS package of the R software. The detail on this package can be found in Jalilian [12]. The earthquake catalog is displayed in Fig. 7.
The top-left figure shows the spatial distribution of the earthquakes under the study area. The three figures in the right part of Fig. 7 show the changes of the latitude, longitude, and magnitude of the earthquakes over the time.
The two figures in the bottom-right part of Fig. 7 show the completeness and time stationary of the earthquake catalog. Here, N m represents the number of earthquakes with magnitude ≥ m . If the plot of log(N m ) versus m shows linear trend, it represents the completeness of the earthquake catalog. Besides, the stationary of the earthquake catalog is evaluated based on the plot of N t versus t. Here, N t represents the number of earthquakes up to time t. If the plotted points of N t versus t have a functional form such as N t = 0 t where > 0 , it is an evidence that the time series of the earthquake is stationary. Therefore, we conclude that the used earthquake catalog is stationary and complete. To analyze the reported data set in the previous section, the earthquakes are grouped monthly to predict the monthly number of the earthquakes with magnitude ≥ 4 . We use the − , Akaike Information Criteria (AIC), and Bayesian Information Criteria (BIC) statistics to select the most appropriate model for the used data set. The lowest values of these criteria show the best fitted model. The used data sets consist of 82 monthly counts of earthquakes with magnitude greater than 4 between the date of January 2012 and March 2018. Firstly, the possible overdispersion in the data set should be explored. For this aim, we use the hypothesis test for over-dispersion proposed by Schweer and Weiß [23]. The test statistic value of the over-dispersion hypothesis is 24.187, and its corresponding p value is < 0.001 which indicates that the used data set displays over-dispersion. Therefore, the innovation distribution of the INAR(1) process should be able to capture the over-dispersion in the data set. The some useful plots of the used data set displayed in Fig. 8. As seen from autocorrelation function (ACF) and partial ACF (PACF) plots, there is a clear cut of at first lag which ensures that the AR(1) process can be used to model this data set.  The estimated parameters of the fitted INAR(1) processes and corresponding AIC, BIC values are listed in Table 3. As seen from the results reported in Table 3, the proposed INARPQX(1) process has the lowest values of AIC and BIC values which indicates the proposed model performs better than INARP(1), INARG(1), INARPL(1), and IARNB(1) models. Moreover, we analyze the departure from error distribution by means of residual analysis. For this aim, we use the Pearson residuals. The Pearson residuals can be calculated by using where E X t | | X t−1 and Var X t | | X t−1 are given in (44) and (45), respectively. When the fitted model is correct, the mean and variance of the Pearson residuals should be near the zero and one, respectively. Also, there should be no autocorrelation problem for the estimated Pearson residuals. The mean and variance of the Pearson residuals are calculated 0.0009 and 0.9754, respectively, which are very near the desired values. Additionally, the ACF plot of the Pearson residuals is displayed in right side of Fig. 9 which indicates that there is no autocorrelation problem for the Pearson residuals. The fitted INARPTE(1) model is given by where t ∼ PQX(94.964, 4.342) . The predicted values of criminal mischief in 14th police car beat can be obtained by using (56).
The predicted values of monthly counts of earthquakes are displayed in the left side of Fig. 9.

Conclusion and future work
This study introduces a new two-parameter discrete distribution, shortly PQX, for modeling the over-dispersed counts. The statistical properties of the PQX distribution are derived and estimation of the unknown parameters of the proposed distribution is discussed in detail. INAR-PQX(1) processes are introduced based on the PQX distribution to predict the monthly counts of the earthquakes with magnitude 4 and above in Turkey. Empirical results show that the INARPQX(1) processes perform better than INARP(1), INARP(1), INARG(1), and INARNB(1) processes for the data used. As a future work of this study, we will (55) try to extend the INARPQX(1) to multivariate case for joint modeling of the more than one region in predicting the monthly counts of earthquakes. We believe that the PQX distribution will increase its popularity find a wider application area in different sciences such medical, finance, and actuarial.

Compliance with ethical standards
Conflict of interest The authors declare that they have no conflict of interest.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creat iveco mmons .org/licen ses/by/4.0/.