Abstract
The paper focuses on the estimation of the extreme value index in terms of k-records, based on a maximum likelihood approach suggested recently by Louzaoui and El Arrouchi (J Probab Stat, 2020). The asymptotic normality of this estimator is investigated in order to propose a bias correction, ensuring that the new estimator is asymptotically unbiased and still asymptotically normal. Some numerical studies are also provided to show how the proposed estimators behave in practice.
1 Introduction
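For an n-sample \(X_1,X_2,\ldots ,X_n\) from a continuous distribution function F, let \(X_{1,n} \le \cdots \le X_{n,n}\) be the corresponding order statistics. Recall that the k-record process is defined in terms of the kth largest observations, see Dziubdziela and Kopocinski [10]. For any integer \(k\ge 1\), let
$$\begin{aligned} \nu ^{(k)}_{1}=k,\qquad \nu ^{(k)}_{i+1}=\min \left\{ j>\nu ^{(k)}_{i}:\ X_{j}>X_{\nu ^{(k)}_{i}-k+1,\nu ^{(k)}_{i}}\right\} ,\quad i\ge 1, \end{aligned}$$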
and then, the k-record values are defined by \(R^{(k)}_{i}=X_{\nu ^{(k)}_{i}-k+1,\nu ^{(k)}_{i}},\ \ i\ge 1\). Next, suppose that F belongs to the max-domain of attraction of an extreme value distribution \(G_{\gamma }\) (\(F \in D(G_{\gamma })\)), where \(\gamma \in {\mathbb {R}}\) is the extreme value index. That is, there exist sequences \(a_n > 0\) and \(b_n \in {\mathbb {R}}\) such that
$$\begin{aligned} \lim _{n\rightarrow \infty }F^{n}(a_n x+b_n)=G_{\gamma }(x):=\exp \left( -(1+\gamma x)^{-1/\gamma }\right) , \end{aligned}\tag{1}$$
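where \(1 + \gamma x > 0\). Let \(U(y) =\inf \{z: 1-F(z) \le 1/y\}\), for \(y \ge 1\). The first order condition (1), in terms of U, is equivalent, for all \(x>0\), to
$$\begin{aligned} \lim _{t\rightarrow \infty }\frac{U(tx)-U(t)}{a(t)}=\frac{x^{\gamma }-1}{\gamma }, \end{aligned}\tag{2}$$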
where \(a(\cdot )\) is a positive auxiliary function. It can be proved that (1) or (2) is equivalent to
where \(1 + \gamma x > 0\), \(\sigma (\cdot )>0\) is a suitable function and \(t_{*} = \sup \{y: F(y)< 1\}\le \infty\) is the right endpoint of F. The function \(D_{\gamma }\) is known as the Generalized Pareto Distribution. See [6] for more theoretical discussion of the max-domain of attraction.
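As an illustration of the k-record values \(R^{(k)}_{i}\) recalled above, the following minimal sketch extracts them from a finite sequence in one pass. The function name `k_record_values` and the toy input are ours and purely illustrative; the sketch assumes no ties, as is the case almost surely for a continuous F.

```python
import heapq

def k_record_values(xs, k):
    """Return the k-record values R_1^{(k)}, R_2^{(k)}, ... of the sequence xs.

    The current k-record value is the k-th largest observation seen so far;
    a new k-record occurs each time an observation exceeds it.
    """
    if k < 1 or len(xs) < k:
        return []
    heap = list(xs[:k])          # min-heap holding the k largest values so far
    heapq.heapify(heap)
    records = [heap[0]]          # first k-record: smallest of the first k values
    for x in xs[k:]:
        if x > heap[0]:          # x exceeds the current k-th largest value
            heapq.heapreplace(heap, x)
            records.append(heap[0])
    return records

print(k_record_values([3, 1, 4, 1, 5, 9, 2, 6], k=1))  # [3, 4, 5, 9]
print(k_record_values([3, 1, 4, 1, 5, 9, 2, 6], k=2))  # [1, 3, 4, 5, 6]
```

For \(k=1\) the output reduces to the ordinary (upper) record values of the sequence.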
The problem of estimating the extreme value index \(\gamma\) on the basis of the largest observations of the sample \((X_1,\ldots , X_n)\) has received considerable attention in classical extreme value theory. Many statistics based on upper order statistics have been proposed to estimate \(\gamma\), such as Hill’s estimator [14], Pickands’s estimator [16], the moment estimator derived by Dekkers et al. [8], and the maximum likelihood (ML) estimator suggested by Drees et al. [9]. For further information, [6] and [3] give good introductions that are rich in applications and provide more theoretical and practical details on the estimation of the extreme value index. On the other hand, Resnick’s duality theorem [2, Theorem 2.3.3] and the characterization of tail distributions [11] show that extreme value theory is closely linked to the theory of record values. Recent developments in record theory can be found in [13] and [1]. When only k-records are observed, the conventional estimators of extreme value statistics cannot be applied, and therefore the construction of estimators based on record values is essential [12]. This problem has not been sufficiently studied in the literature; it was revisited recently by Louzaoui and El Arrouchi [15], who used a maximum likelihood (ML) approach based on the top \(k+1\) highest k-records.
More precisely, for \(k=k_m\) an intermediate sequence of integers satisfying \(k_m \rightarrow \infty\), \(k_m/m\rightarrow 0\) as \(m\rightarrow \infty\), where m is the number of k-records observed, they showed that the conditional joint distribution of \(\left( R_{m-k+1}^{(k)}-R_{m-k}^{(k)},\ldots ,R_{m}^{(k)}-R_{m-k}^{(k)}\right)\) given \(R_{m-k}^{(k)}= y\) is the same as the unconditional joint distribution of the k-record values \(\left( Z^{(k)}_{1}, \ldots ,Z^{(k)}_{k}\right)\) from independent and identically distributed random variables \(Z_1,Z_2,\ldots\) having the distribution \(F_y(z)= (F(z)-F(y))/(1-F(y))\ \ (z>y)\) (left-truncated distribution), which can be approximated, by (3), by \(D_{\gamma }(\cdot /\sigma )\) (Generalized Pareto Distribution). This result can be used to construct a pseudo maximum likelihood estimator \(({{\hat{\gamma }}},{{\hat{\sigma }}})\) of the unknown parameters \((\gamma ,\sigma )\); that is, based on the sample of k-record values \(\left( Z^{(k)}_{1}, \ldots ,Z^{(k)}_{k}\right)\), we can maximize the likelihood function
with \(Y_{i} = R^{(k)}_{m-i+1}- R^{(k)}_{m-k}\), \(1 \le i\le k\) and \(d_{\gamma ,\sigma }(y)= \partial D_{\gamma }(y/\sigma )/\partial y\). Consequently, \({\hat{\gamma }}:\equiv {\hat{\gamma }}_{m}(k)\) and \({\hat{\sigma }}:\equiv {\hat{\sigma }}_{m}(k)\) are obtained by solving the likelihood equations
with \(Y_{i} = R^{(k)}_{m-i+1}- R^{(k)}_{m-k}\). For \(\gamma = 0\), the equations are obtained by continuity. Put \(h_m(t):=q_m(t)\, g_m(t)-1\) where \(g_m(t):= \frac{1}{k}\sum \nolimits _{i=1}^{k}{\frac{1}{1+tY_i}}\), \(q_m(t):= \left( 1+\frac{1}{tY_1}\right) f_m(t)\) and \(f_m(t):= \log \left( 1+tY_1\right)\). Any solution \(({\hat{\gamma }},{\hat{\sigma }})\) of (4) satisfies \(h_m({\hat{\gamma }}/{\hat{\sigma }})=0\). Conversely, \(({\hat{\gamma }},{\hat{\sigma }})=(f_m(t^*),f_m(t^*)/t^*)\) is a solution of (4) for any non-zero solution \(t^*\) of \(h_m(t)=0\). It is easily seen that \(t=0\) always solves \(h_m(t)=0\) (with \(h_m(0)\) defined by continuity); this trivial solution must be discarded even if \(\gamma =0\).
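The reduction to the one-dimensional equation \(h_m(t)=0\) suggests a simple numerical procedure, sketched below for the case \(\gamma >0\). This is only an illustration under our own assumptions, not the implementation used in this paper: the names `h` and `pseudo_mle`, the geometric search grid and the noise-free toy excesses (which mimic k-records from a Generalized Pareto Distribution with \(\gamma =1\), \(\sigma =2\)) are all hypothetical choices.

```python
import math

def h(t, y):
    """h_m(t) = q_m(t) * g_m(t) - 1, where y[0] = Y_1 is the largest excess."""
    g = sum(1.0 / (1.0 + t * yi) for yi in y) / len(y)
    f = math.log1p(t * y[0])
    q = (1.0 + 1.0 / (t * y[0])) * f
    return q * g - 1.0

def pseudo_mle(excesses, steps=200):
    """Return (gamma_hat, sigma_hat) built from a positive root t* of h_m.

    Illustrative only: t is scanned on a geometric grid (an arbitrary choice)
    for a sign change from + to -, then the root is refined by bisection.
    """
    y = sorted(excesses, reverse=True)            # y[0] plays the role of Y_1
    grid = [10.0 ** (-4 + 10 * j / steps) / y[0] for j in range(steps + 1)]
    for lo, hi in zip(grid, grid[1:]):
        if h(lo, y) > 0.0 >= h(hi, y):
            break
    else:
        raise ValueError("no positive root of h_m found on the grid")
    for _ in range(100):                          # plain bisection on [lo, hi]
        mid = 0.5 * (lo + hi)
        if h(mid, y) > 0.0:
            lo = mid
        else:
            hi = mid
    t_star = 0.5 * (lo + hi)
    gamma_hat = math.log1p(t_star * y[0])         # f_m(t*)
    return gamma_hat, gamma_hat / t_star          # sigma_hat = f_m(t*) / t*

# Noise-free toy excesses (assumption): sigma * (exp(gamma * i / k) - 1) / gamma
# with gamma = 1 and sigma = 2.
k = 1000
toy = [2.0 * (math.exp(i / k) - 1.0) for i in range(1, k + 1)]
print(pseudo_mle(toy))
```

On this toy input the returned pair should be close to \((1,2)\); the trivial root \(t=0\) is avoided because the search grid starts at a strictly positive value.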
Under the first order condition (2), Louzaoui and El Arrouchi [15] have shown the existence of a random integer N such that the likelihood equations have a consistent solution \(({\hat{\gamma }}_{m},{\hat{\sigma }}_{m})\) for all \(m\ge N\). Here, we study the asymptotic normality of these estimators under the so-called second order condition and derive another estimator of the extreme value index which is asymptotically unbiased and normal. The remainder of this paper is organized as follows. In Sect. 2, we establish the asymptotic normality of the ML estimators for \(\gamma \ne 0\) and then propose a bias correction. Section 3 is devoted to some numerical studies, with discussion, which lend further support to our theoretical results. Finally, in Sect. 4, a real data set is analyzed using the suggested methods.
2 Main results
The study of the asymptotic normality of ML estimators requires a second order condition which is a refinement of (2), see [6]. For \(\gamma >0\), there exist an auxiliary function A(t) (with constant sign and \(A(t) \rightarrow 0\) as \(t\rightarrow \infty\)) and a real index \(\rho \le 0\) such that, \(\forall x>0\),
$$\begin{aligned} \lim _{t\rightarrow \infty }\frac{\log U(tx)-\log U(t)-\gamma \log x}{A(t)}=\frac{x^{\rho }-1}{\rho }, \end{aligned}\tag{5}$$
The parameter \(\rho\) governs the rate of convergence in (2). It can be shown that necessarily \(|A| \in RV_{\rho }\). The parameter \(\rho\) is of primary importance in the adaptive choice of the threshold used in the estimation of the extreme value index [6, 12]. For \(\gamma <0\), this condition becomes, \(\forall x>0\),
We now state our main result, which establishes the asymptotic normality of the ML estimators.
Theorem 2.1
Suppose that the second order condition (5) or (6) holds and suppose \(k=k_m\rightarrow \infty\), \(k/m\rightarrow 0\), \(\left( m\log \log m\right) ^{1/2}/k\rightarrow 0\), as \(m\rightarrow \infty\).
(1) If \(\lim \nolimits _{m\rightarrow \infty }\sqrt{k}A(e^{{m/k}})=\lambda \in {\mathbb {R}}\), then as \(m\rightarrow \infty\)
(i) \(\sqrt{k}({\hat{\gamma }}_m(k)-\gamma ) {\mathop {\longrightarrow }\limits ^{d}} {\mathcal {N}}\left( \lambda \mu (\rho ),\gamma ^{2}\right)\), with \(\mu (\rho )=\frac{1-e^{-\rho }}{\rho }\).
(ii) When (\(-\rho <\gamma\)) or (\(-\rho>\gamma >0,\ \lim _{t\rightarrow \infty }a(t)-\gamma U(t)=0\)) or (\(\gamma <0\)), we have as \(m\rightarrow \infty\)
$$\begin{aligned} \sqrt{k}\left( {\hat{\gamma }}_m(k)-\gamma ,\sqrt{\frac{k}{m-k}}\left( \frac{{\hat{\sigma _m}}(k)}{a\left( e^{(m/k)-1}\right) }-1\right) \right) {\mathop {\rightarrow }\limits ^{d}} {\mathcal {N}}\left( \lambda b_{\gamma ,\rho },\Sigma \right) , \end{aligned}$$with \({{\mathcal {N}}}\) normal distribution, \(b_{\gamma ,\rho }=(\mu (\rho ),0)\) and the covariance matrix \(\Sigma\) is given by \(\left( \begin{array}{cc} \gamma ^2&{} 0 \\ 0 &{} \gamma ^2 \\ \end{array} \right)\).
(2) If \(\lim \nolimits _{m\rightarrow \infty }\sqrt{k}|A(e^{{m/k}})|=+\infty\), then as \(m\rightarrow \infty\)
$$\begin{aligned} \left( A(e^{{m/k}})\right) ^{-1}({\hat{\gamma }}_m(k)-\gamma ){\mathop {\longrightarrow }\limits ^{p}}\mu (\rho ). \end{aligned}$$
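To make part (1)(i) of Theorem 2.1 concrete, the short sketch below evaluates \(\mu (\rho )\) and the finite-k normal approximation \({\mathcal {N}}\left( \gamma +\lambda \mu (\rho )/\sqrt{k},\ \gamma ^{2}/k\right)\) suggested by the limit. The function names and the numerical values of \(\gamma\), k, \(\lambda\) and \(\rho\) are illustrative assumptions; in practice \(\lambda\) and \(\rho\) are unknown.

```python
import math

def mu(rho):
    """Bias factor mu(rho) = (1 - exp(-rho)) / rho, with mu(0) = 1 by continuity."""
    if rho == 0.0:
        return 1.0
    return -math.expm1(-rho) / rho

def normal_limit(gamma, k, lam=0.0, rho=-1.0):
    """Approximate mean and standard deviation of gamma_hat for finite k,
    read off from the limit N(lam * mu(rho), gamma^2) of sqrt(k) * (gamma_hat - gamma)."""
    mean = gamma + lam * mu(rho) / math.sqrt(k)
    sd = abs(gamma) / math.sqrt(k)
    return mean, sd

# Illustrative values only (assumptions, not taken from the paper).
mean, sd = normal_limit(gamma=0.5, k=100, lam=1.0, rho=-1.0)
print(mean, sd)
```

When \(\lambda \ne 0\) the approximate mean differs from \(\gamma\) by \(\lambda \mu (\rho )/\sqrt{k}\), which is the term a bias correction has to remove.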
Proof
Let \(\{E_i,i\ge 1\}\) be a sequence of independent and identically distributed standard exponential random variables and \(S_j = E_1 + \cdots + E_j,\ j \ge 1\). Denote the hazard function of F by \(H(x)=-\log (1-F(x))\). It is easily seen that, for \(x\ge 1\), \(U(x)=H^{\leftarrow }(\log (x))\) and \(H^{\leftarrow }\) is a strictly increasing function, since F is continuous. From this and Relation (4.7) in [17], we get the following representation
$$\begin{aligned} \left\{ R_{j}^{(k)},\ j\ge 1\right\} {\mathop {=}\limits ^{d}}\left\{ U\left( e^{S_{j}/k}\right) ,\ j\ge 1\right\} . \end{aligned}$$
Without loss of generality, we can assume that \(R_{m-j}^{(k)}=U\left( e^{\frac{S_{m-j}}{k}}\right)\), where \(0\le j\le k\) and \(m\ge 1\).
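The representation \(R_{j}^{(k)}=U\left( e^{S_{j}/k}\right)\) also gives a direct way to simulate k-record values without generating the underlying sample. The following is a minimal sketch; the function name `simulate_k_records` and the choice \(U(y)=y^{\gamma }\) with \(\gamma =0.5\) (a strict Pareto tail) are assumptions made only for illustration.

```python
import math
import random

def simulate_k_records(m, k, U, rng=random):
    """Draw R_1^{(k)}, ..., R_m^{(k)} as U(exp(S_j / k)), S_j = E_1 + ... + E_j."""
    records, s = [], 0.0
    for _ in range(m):
        s += rng.expovariate(1.0)        # standard exponential increment E_j
        records.append(U(math.exp(s / k)))
    return records

# Assumed example: strict Pareto tail, U(y) = y**gamma with gamma = 0.5.
random.seed(1)
recs = simulate_k_records(m=50, k=5, U=lambda y: y ** 0.5)
print(recs[:3])
```

Since U is nondecreasing and the partial sums \(S_j\) increase, the simulated sequence is nondecreasing, as a sequence of k-record values must be.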
For the case \(\gamma >0\), Louzaoui and El Arrouchi [15] have given bounds for the solution \(t^*\) of \(h_m(t)=0\). More precisely, under the first order condition (2) and when \(\delta _m\rightarrow 0,\ k\rightarrow \infty ,\ k/m\rightarrow 0\) and \(k/\log m\rightarrow \infty\) as \(m\rightarrow \infty\), they proved the existence of a random integer N such that, almost surely, \(h_m(T_m^{(\delta _m)})<0\) and \(h_m(T_m^{(-\delta _m)})>0\) for any \(m\ge N\), where \(T_m^{(\delta _m)}:=(1+\delta _m)/R^{(k)}_{m-k}\) (see [15, Lemma 3]). Since \(h_m\) is continuous, the intermediate value theorem then ensures the existence of a random variable \(T_m^*\in [T_m^{(-\delta _m)},T_m^{(\delta _m)}]\) such that, almost surely, \(h_m(T_m^*)=0\). Notice that the condition \(\left( m\log \log m\right) ^{1/2}/k\rightarrow 0\) implies \(k/\log m\rightarrow \infty\).
Let \(W_{m}=\frac{R_{m}^{(k)}}{R_{m-k}^{(k)}}\). We have, as \(m\rightarrow \infty\),
From (5) and by Theorem B.2.18 in [6], there exists, for each \(\epsilon >0\), a \(t_{0}=t_{0}(\epsilon )\) such that for \(x\ge 1\) and \(t>t_{0}\),
Take \(t=e^{\frac{S_{m-k}}{k}}\), \(x=e^{\frac{S_{m}-S_{m-k}}{k}}\) and observe that as \(m\rightarrow \infty\), \(t\rightarrow \infty\), \(x\rightarrow e\) and \(\frac{x^{\rho }-1}{\rho }\pm \epsilon x^{\rho +\epsilon }\rightarrow e^{\rho }\mu (\rho )\pm \epsilon e^{\rho +\epsilon }\) almost surely, see Lemma 1 in [15]. We get, for each \(\epsilon >0\), almost surely
and so
Similarly, \(\liminf\limits_m\frac{\log U(tx)-\log U(t)-\gamma \log x}{A(t)}\ge e^{\rho }\mu (\rho )\) almost surely. Thus, as \(m\rightarrow \infty\), almost surely
Hence, as \(m\rightarrow \infty\)
Notice that the central limit theorem implies, as \(m\rightarrow \infty\)
where \(N_{1}\) is a random variable having a standard normal distribution.
On the other hand, by \(\left( m\log \log m\right) ^{1/2}/k\rightarrow 0\) and using the law of the iterated logarithm, we have as \(m\rightarrow \infty\)
and by the fact that \(A \in RV_{\rho }\), we get as \(m\rightarrow \infty\)
Choosing \(\delta _{m}\) such that \(\sqrt{k}\delta _{m}\rightarrow 0\) and combining (7), (8), (10) with \(\sqrt{k}A(e^{m/k})\longrightarrow \lambda\), we find that \(\sqrt{k}\left( f_m(T_m^{(\delta _m)})-\gamma \right)\) is asymptotically normal with mean \(\lambda \mu (\rho )\) and variance \(\gamma ^{2}\). The same arguments show that \(\sqrt{k}\left( f_m(T_m^{(-\delta _m)})-\gamma \right)\) is asymptotically normal with mean \(\lambda \mu (\rho )\) and variance \(\gamma ^{2}\). Since \(f_{m}\) is an increasing function, we have for sufficiently large m
which gives the result (i).
To prove the asymptotic normality of \({\hat{\sigma _m}}\), we use the following expansion
First consider \(T_1\). For sufficiently large m, we have almost surely
From Lemma 1.2.9 in [6], we have \(a(t)\sim \gamma U(t)\) as \(t\rightarrow \infty\). Then from (11) we get that \(\frac{\gamma }{t^*a\left( e^{S_{m-k}/k}\right) }\overset{p}{\rightarrow }1\) as \(m\rightarrow \infty\). Since \(a\in RV_{\gamma }\), (9) implies \(\frac{a\left( e^{S_{m-k}/k}\right) }{a\left( e^{(m-k)/k}\right) }{\mathop {\rightarrow }\limits ^{p}} 1\) as \(m\rightarrow \infty\). Hence, as \(m\rightarrow \infty\)
Next consider \(T_2\). We have by Theorem 2.3.3 in [6]
and consequently, for each \(\epsilon >0\), there exists a \(t_{0}=t_{0}(\epsilon )\) such that for \(t>t_{0}\), \(x\ge 1\),
Take \(t=e^{(m-k)/k}\) and \(x=e^{(S_{m-k}-(m-k))/k}\) and observe again that as \(m\rightarrow \infty\), \(t\rightarrow \infty\), \(x\rightarrow 1\) and \(x^{\gamma } \frac{x^{\rho }-1}{\rho }\pm \epsilon x^{\gamma +\rho +\epsilon }\rightarrow \pm \epsilon\) almost surely. We get
and so, by the central limit theorem, as \(m\rightarrow \infty\)
Notice that, without loss of generality, we can take \(N_1\) and \(N_2\) to be independent random variables.
Thirdly, consider \(T_3\). From (11), we have for sufficiently large m,
Next, adapting Lemma 4.5.4 in [6] to the case where \(\gamma\) is positive, we get that, if (\(\gamma >-\rho\)) or (\(0<\gamma <-\rho\), \(\lim _{t\rightarrow \infty }a(t)-\gamma U(t)=0\)),
Combining this with \(\sqrt{k}\delta _{m}\rightarrow 0\), we have as \(m\rightarrow \infty\)
Thus, as \(m\rightarrow \infty\)
Finally, the combination of the three parts proves (ii).
The proof for \(\gamma <0\) is the same as before with slight modifications. It was proved by Louzaoui and El Arrouchi [15] that if \(k\rightarrow \infty ,\ k/m\rightarrow 0\) and \(k/\log m\rightarrow \infty\) as \(m\rightarrow \infty\), the first order condition (2) ensures the existence of a solution \(t^{*}\) of \(h_m(t)=0\) such that, almost surely, \(T_m^{(-\delta _m)}<t^{*}<T_m^{(\delta _m)}\) for some small \(\delta _m>0\), where \(T_m^{(\delta _m)}:=-\frac{1+\delta _m}{U(\infty )-R^{(k)}_{m-k}}\). Similarly to (7), we have as \(m\rightarrow \infty\)
The rest is similar, except that relation (12) becomes, for \(\gamma <0\),
provided the second order condition holds. Finally, statement (2) follows directly from (7) and (10). \(\square\)
Remark 2.2
Notice that, for \(0<\gamma <-\rho\), the relation (12) is a special case of the general relation
which unfortunately, for \(\displaystyle \lim _{x\rightarrow \infty }(a(x)/\gamma - U(x))\ne 0\), does not ensure the desired approximation of \(T_3\). A similar remark can be made for \(\gamma =-\rho\).
In order to obtain an asymptotically unbiased estimator of \(\gamma\), it can be seen from the asymptotic expansion (7) that it is necessary to eliminate the term \(A(e^{m/k})\) and to replace \(\rho\) by a consistent estimator. Define, for integers \(n\ge k\ge 1\),
and
where \(N^{(k)}(n)\) denotes the number of k-record values in the sequence \(X_1,\ldots ,X_n\) and [x] is the largest integer less than or equal to x. Then we have the following theorem.
Theorem 2.3
Assume (5) holds for \(\rho <0\). Assume \(k=k_n\rightarrow \infty\), \(k/n\rightarrow 0\), \(\log (n/k)\log \log n=o(k)\) and \(k/\log n\rightarrow \infty\) as \(n\rightarrow \infty\).
(i) If \(\lim _{n\rightarrow \infty }\sqrt{k}A(n/k)=\lambda \in {\mathbb {R}}\), then as \(n\rightarrow \infty\)
$$\begin{aligned}\sqrt{k}({{\widetilde{\gamma }}}_n(k)-\gamma ){\mathop {\rightarrow }\limits ^{d}} {\mathcal {N}}\left( \lambda \mu (\rho ),\gamma ^{2}\right),\end{aligned}$$and
$$\begin{aligned} \sqrt{k}({{\bar{\gamma }}}_n(k)-\gamma ){\mathop {\rightarrow }\limits ^{d}} \mathcal N\left( 0,{{\tilde{\sigma }}}^{2}\right) , \end{aligned}$$with \(\displaystyle {{\tilde{\sigma }}}^{2}=\frac{(1-4^{\rho })^2+4^{2\rho }}{2(1-4^{\rho })^2}\gamma ^2\).
(ii) If \(\lim _{n\rightarrow \infty }\sqrt{k}|A(n/k)|=\infty\), then as \(n\rightarrow \infty\)
$$\begin{aligned} (A(n/k))^{-1}({{\widetilde{\gamma }}}_n(k)-\gamma ){\mathop {\longrightarrow }\limits ^{p}}\mu (\rho ). \end{aligned}$$
Remark 2.4
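Notice that if \(\rho \ge -1/2\), then \({{\tilde{\sigma }}}^{2} \ge \gamma ^2\) and, if \(\rho \le -1/2\), then \({{\tilde{\sigma }}}^{2} \le \gamma ^2\). Indeed, \({{\tilde{\sigma }}}^{2}/\gamma ^2=\frac{1}{2}+\frac{4^{2\rho }}{2(1-4^{\rho })^2}\ge 1\) if and only if \(4^{\rho }\ge 1/2\).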
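The comparison between the asymptotic variance \({{\tilde{\sigma }}}^{2}\) of the bias-corrected estimator and the variance \(\gamma ^2\) of the uncorrected one can be checked numerically from the formula of Theorem 2.3. The sketch below is illustrative; the function name `variance_ratio` and the chosen values of \(\rho\) are ours.

```python
def variance_ratio(rho):
    """sigma_tilde^2 / gamma^2 from Theorem 2.3 (which assumes rho < 0)."""
    if rho >= 0:
        raise ValueError("Theorem 2.3 assumes rho < 0")
    w = 4.0 ** rho
    return ((1.0 - w) ** 2 + w ** 2) / (2.0 * (1.0 - w) ** 2)

for rho in (-0.25, -0.5, -1.0, -2.0):
    print(rho, variance_ratio(rho))
```

The ratio equals 1 at \(\rho =-1/2\), exceeds 1 for \(-1/2<\rho <0\) and decreases towards 1/2 as \(\rho \rightarrow -\infty\), so the bias correction costs variance only when \(\rho\) is close to 0.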
Proof
We use the same arguments as above. We can write, for \(\delta _m=o(1/\sqrt{k})\) as \(m\rightarrow \infty\),
By Proposition 2.2 in [7], we have as \(n\rightarrow \infty\), almost surely
where \(\{B(t), t\ge 0\}\) is a standard Brownian motion, and by Theorem 1.3.1 in [5], we get as \(n\rightarrow \infty\), almost surely
Furthermore, since \(\log \log (k \log (n/k)) < \log \log n\), we have as \(n\rightarrow \infty\), almost surely
Combining this with the fact \(A\in RV_\rho\), we get from (13)
and so, for \(0 < s\le 1\)
which, by Donsker’s invariance principle, gives for \(0<s\le 1\) and \(n\rightarrow \infty\),
This gives the first part of (i) and (ii).
Next, observe that if \(k\in \{n/\log n,n/(2\log n),n/(4\log n)\}\), then all conditions on the sequence k are fulfilled with \(\lim _{n\rightarrow \infty }\sqrt{k}|A(n/k)|=\infty\). Since, as \(n\rightarrow \infty\), \(A(2 \log n) \sim 2^{\rho } A(\log n)\) and \(A(4 \log n) \sim 4^{\rho } A(\log n)\), we have from the statement (ii)
From (14) we deduce that if \(\displaystyle \lim _{n\rightarrow \infty }\sqrt{k}A(n/k)=\lambda\),
and so, as \(n\rightarrow \infty\),
which gives the second part of (i). \(\square\)
Corollary 2.5
Assume that the conditions of Theorem 2.3 hold. Let \(MSE({{\widetilde{\gamma }}}_n(k))\) and \(MSE({{\bar{\gamma }}}_n(k))\) be the mean square errors of \({{\widetilde{\gamma }}}_n(k)\) and \({{\bar{\gamma }}}_n(k)\), respectively. If \(\lim _{n\rightarrow \infty }\sqrt{k}A(n/k)=\lambda \in {\mathbb {R}}\), then
Proof
It follows easily from Theorem 2.3. \(\square\)
Remark 2.6
1. In the same way, we can obtain results similar to those of Theorem 2.3 and Corollary 2.5 in the case where \(\gamma <0\).
2. The method used here does not work for \(\gamma = 0\) because the bounds found in [15] are almost surely constant, and therefore they are not asymptotically normal.
3. This method is not applicable to the bounds proposed by Zhou [18] since they are not symmetric.
3 Simulation results
We now present some numerical results for the proposed bias correction. We consider the Generalized Pareto distribution with \(F(x)=1-(1+\frac{\gamma }{\sigma } x)^{-1/\gamma }\), for \(x\ge 0\) and \(\gamma , \ \sigma >0\), the Burr IV distribution with \(F(x)= \{(\alpha /x-1)^{1/\alpha }+1\}^{-\beta }\), \(0<x<\alpha\), \(\beta >0\), and the standard Cauchy distribution with \(F(x)=\frac{1}{2}+\frac{1}{\pi } \arctan (x)\), for \(x\in {\mathbb {R}}\). For each of these distributions, we generate a random sample of size n; the k-record values are then extracted from the sample and the corresponding estimates are computed. We report the simulation results in Tables 1, 2, 3 and 4. Here \(\overline{{\tilde{\gamma }}}\) and \(\overline{{\bar{\gamma }}}\) are the averages of 10,000 estimates of \({\tilde{\gamma }}\) and \({\bar{\gamma }}\), and \(MSE({\tilde{\gamma }})\) and \(MSE({\bar{\gamma }})\) denote their respective mean square errors. The simulated values are calculated for three sample sizes n against k, with a reasonable number of records \(N^{(k)}(n)\) (using the approximation \(N^{(k)}(n)\sim k\log (n/k)\)). Because the mean square errors are rounded to the fourth decimal place, some values are repeated. We observe that the simulated values of \({\tilde{\gamma }}\) and \({\bar{\gamma }}\) are close to the theoretical value of \(\gamma\), and that \({\bar{\gamma }}\) is frequently closer to the theoretical \(\gamma\) than \({\tilde{\gamma }}\). Unfortunately, the comparison of the MSEs is inconclusive, and we think this is due to the variation in the number of records in the original sample.
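Samples from the three distributions above can be generated by inverse-transform sampling, since each distribution function can be inverted in closed form. The sketch below is one possible way to do this and is not necessarily how the reported simulations were run; the function names, the seed and the parameter values are illustrative assumptions.

```python
import math
import random

def gpd_quantile(u, gamma, sigma):
    """Inverse of F(x) = 1 - (1 + gamma*x/sigma)**(-1/gamma), gamma > 0."""
    return sigma * ((1.0 - u) ** (-gamma) - 1.0) / gamma

def burr4_quantile(u, alpha, beta):
    """Inverse of F(x) = ((alpha/x - 1)**(1/alpha) + 1)**(-beta), 0 < u < 1."""
    return alpha / (1.0 + (u ** (-1.0 / beta) - 1.0) ** alpha)

def cauchy_quantile(u):
    """Inverse of F(x) = 1/2 + arctan(x)/pi, 0 < u < 1."""
    return math.tan(math.pi * (u - 0.5))

def draw(quantile, n, rng, **params):
    """Inverse-transform sampling: apply the quantile function to uniforms."""
    out = []
    while len(out) < n:
        u = rng.random()
        if u > 0.0:                      # keep u strictly inside (0, 1)
            out.append(quantile(u, **params))
    return out

rng = random.Random(2023)                # arbitrary seed, for reproducibility
sample = draw(gpd_quantile, 1000, rng, gamma=0.5, sigma=1.0)
print(min(sample) >= 0.0, len(sample))
```

The k-record values of such a sample are then obtained by a single pass over the observations, keeping track of the kth largest value seen so far.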
4 Real data
In this section, we apply our estimation method to rainfall data collected monthly from 1975 to 2007 at the Melk Zhar station in the Souss Massa region of Morocco (Fig. 1). The resulting estimates are compared with those of the most widely used methods: block maxima (GEV from (1)) and POT (GPD from (3)). Table 5 shows the estimated parameters of the GEV distribution. The estimated shape parameter is positive, but the 95% confidence interval also extends below zero, which indicates that \(\gamma\) is not significantly different from 0 at the 5% significance level. Considering only block maxima when just a few years of observations are available can waste a great deal of data, since there may be more than one extreme measurement in a single block. In general, the peaks over threshold approach can be used to improve accuracy. In particular, if the block maxima can be fitted by a GEV distribution, then the excesses over a high threshold t can be fitted by a GPD. Two techniques are used for threshold selection, namely the mean residual life plot and the stability of the parameter estimates. The linearity of the mean residual life and the stability of the GPD parameters are both reached at \(t = 20\), for which the excesses comprise 89 observations. From Table 6, we have \(\hat{\gamma }=2.5\times 10^{-8}\), which is very close to 0. Next, using the equations in (4), Table 7 summarizes our estimates for some selected values of k, which again confirm the closeness of \(\gamma\) to 0. Consequently, the Gumbel model (GEV with \(\gamma =0\)) is a suitable model for our data. This is supported by the diagnostic plots in Fig. 2. Under the Gumbel model, the return level \(z_{\alpha }\) associated with the return period \(1/{\alpha }\) is \(z_{\alpha } =\mu - \sigma \log (-\log (1-\alpha ))\), where the estimates of \(\mu\) and \(\sigma\) are given in Table 8. This means that, on average, \(z_{\alpha }\) is exceeded once every \(1/{\alpha }\) years. The return level estimates are given in Table 9.
Hence, the return level estimates suggest that the maximum value 144.6 (maximum total monthly rainfall recorded in Melk Zhar, see Fig. 1) is not expected to be exceeded within the next 20 years, but is expected to be exceeded within the next 50 years.
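The Gumbel return level formula \(z_{\alpha } =\mu - \sigma \log (-\log (1-\alpha ))\) is straightforward to evaluate. The sketch below uses hypothetical parameter values chosen only for illustration; they are not the fitted values of Table 8, and the function names are ours.

```python
import math

def gumbel_return_level(mu, sigma, alpha):
    """z_alpha = mu - sigma * log(-log(1 - alpha)), for return period 1/alpha."""
    return mu - sigma * math.log(-math.log(1.0 - alpha))

def gumbel_cdf(z, mu, sigma):
    """Gumbel distribution function exp(-exp(-(z - mu) / sigma))."""
    return math.exp(-math.exp(-(z - mu) / sigma))

# Hypothetical parameters (assumption), NOT the estimates reported in Table 8.
mu_hyp, sigma_hyp = 50.0, 20.0
for period in (10, 20, 50, 100):
    z = gumbel_return_level(mu_hyp, sigma_hyp, 1.0 / period)
    print(period, round(z, 1))
```

By construction, the Gumbel distribution function evaluated at \(z_{\alpha }\) equals \(1-\alpha\), and the return level increases with the return period.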
Data availability statement
The datasets generated during and/or analysed during the current study are available from the corresponding author on reasonable request.
References
1. Alawady, M.A., Barakat, H.M., Mansour, G.M., Husseiny, I.A.: Information measures and concomitants of \(k\)-record values based on Sarmanov family of bivariate distributions. Bull. Malays. Math. Sci. Soc. 46, 9 (2023). https://doi.org/10.1007/s40840-022-01396-9
2. Arnold, B.C., Balakrishnan, N., Nagaraja, H.N.: Records. Wiley, New York (1998)
3. Barakat, H.M., Nigm, E.M., Khaled, O.M.: Statistical Techniques for Modelling Extreme Value Data and Related Applications. Cambridge Scholars Publishing, Newcastle upon Tyne (2019)
4. Chandler, K.N.: The distribution and frequency of record values. J. R. Stat. Soc. Ser. B 14, 220–228 (1952)
5. Csörgő, M., Révész, P.: Strong Approximation in Probability and Statistics. Academic Press, New York (1981)
6. De Haan, L., Ferreira, A.: Extreme Value Theory. An Introduction. Springer, Berlin (2006)
7. Deheuvels, P., Nevzorov, V.B.: Limit laws for \(k\)-record times. J. Stat. Plan. Inference 38, 279–307 (1994)
8. Dekkers, A.L.M., Einmahl, J.H.J., de Haan, L.: A moment estimator for the index of an extreme-value distribution. Ann. Stat. 17, 1833–1855 (1989)
9. Drees, H., Ferreira, A., de Haan, L.: On maximum likelihood estimation of the extreme value index. Ann. Appl. Probab. 14, 1179–1201 (2004)
10. Dziubdziela, W., Kopocinski, B.: Limiting properties of the kth record values. Appl. Math. 15, 187–190 (1976)
11. El Arrouchi, M.: Characterization of tail distributions based on record values by using the Beurling’s Tauberian theorem. Extremes 20(1), 111–120 (2017)
12. El Arrouchi, M., Imlahi, A.: Optimal choice of \(k_n\)-records in the extreme value index estimation. Stat. Decis. 23(2), 101–115 (2005). https://doi.org/10.1524/stnd.2005.23.2.101
13. Elgawad, M.A.A., Barakat, H.M., Yan, T.: Bivariate limit theorems for record values based on random sample sizes. Sankhya A 82, 50–67 (2020). https://doi.org/10.1007/s13171-019-00167-2
14. Hill, B.M.: A simple general approach to inference about the tail of a distribution. Ann. Stat. 3, 1163–1174 (1975)
15. Louzaoui, A., El Arrouchi, M.: On the maximum likelihood estimation of extreme value index based on \(k\)-record values. J. Probab. Stat. (2020). https://doi.org/10.1155/2020/5497413
16. Pickands, J., III.: Statistical inference using extreme order statistics. Ann. Stat. 3, 119–131 (1975)
17. Resnick, S.I.: Extreme Values, Regular Variation and Point Processes. Springer, New York (1987)
18. Zhou, C.: Existence and consistency of the maximum likelihood estimator for the extreme value index. J. Multivar. Anal. 100(4), 794–815 (2009)
Acknowledgements
The authors would like to thank the anonymous reviewers and the editors for their comments and suggestions to improve this paper.
Funding
The authors state that there was no funding source for this paper.
Contributions
The authors contributed equally. All authors read and approved the final manuscript.
Ethics declarations
Conflict of interest
The authors declare that they have no competing interests.
Ethics approval and consent to participate
Not applicable.
Consent for publication
Not applicable.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Louzaoui, A., El Arrouchi, M. Improving the bias of a pseudo-maximum likelihood estimate of the extreme value index by k-records. J Stat Theory Appl 22, 54–69 (2023). https://doi.org/10.1007/s44199-023-00055-7
DOI: https://doi.org/10.1007/s44199-023-00055-7