Regression Estimator for the Tail Index

Estimating the tail index parameter is one of the primary objectives in extreme value theory. For heavy-tailed distributions the Hill estimator is the most popular way to estimate this parameter. Several recent publications have aimed to improve the Hill estimator using different methods, for example the bootstrap or the Kolmogorov–Smirnov metric. These methods are asymptotically consistent, but for tail index ξ > 0.5 the estimates fail to approach the theoretical value for realistic sample sizes. In this paper, we introduce new empirical methods which combine the advantages of the Kolmogorov–Smirnov approach and the bootstrap. We demonstrate that our estimators are able to estimate large tail index parameters well and might also be useful for relatively small sample sizes. As an application, we consider the classic Danish fire data set and the most destructive natural disasters in Europe.


Introduction
In many applications of probability theory and statistics, one of the most important research problems is to estimate the high quantiles of a distribution, for example in solvency margin calculations for risky investments or when estimating the loss caused by a natural disaster which may be observed once within a given time period (e.g. 100 years). These types of problems can be solved using tools provided by extreme value theory, summarized by [10,16].
In the 1920s, [17] described the limit behaviour of the maximum of i.i.d. samples, which is the basis of many applications of extreme value theory. In real-life applications, a three-parameter family (including location and scale parameters) is used, called the generalized extreme value (GEV) distribution. For every GEV distribution G there exist real numbers a > 0 and b such that G*(x) = G(ax + b) for every x. The tail index parameter ξ is invariant under such linear transformations.
Another approach to the investigation of the tail behaviour is the peaks over threshold (POT) model of [1,28], where the extremal model is based on the values over a threshold u. Let x_F be the right endpoint of the distribution F (finite or infinite). If the distribution of the standardized excesses over the threshold has a limit as u → x_F, it must be the generalized Pareto distribution (GPD)

H_{ξ,β}(x) = 1 − (1 + ξx/β)^(−1/ξ) for ξ ≠ 0, and H_{0,β}(x) = 1 − e^(−x/β),

where β > 0 is a scale parameter. The two models result in the same tail index parameter ξ in Eqs. (1) and (2) for any given initial distribution F for which the limit exists.
A function ℓ(x) is called slowly varying if lim_{t→∞} ℓ(tx)/ℓ(t) = 1 for every x > 0. A sufficient condition for the limit theorems above, in case of a positive tail index parameter ξ (which implies x_F = ∞), is that the tail can be written as 1 − F(x) = x^(−1/ξ) ℓ(x) with a slowly varying ℓ. Finding a suitable function ℓ(x) and calculating the parameters of the limiting distribution of extremes is only feasible for special known distributions. For real-life problems finding ℓ(x) is unrealistic, therefore estimating the parameter ξ is the main focus.

Methods for Defining the Threshold in the Hill Estimator
For a tail index ξ > 0, Hill [24] proposed the following estimator: let X_1, X_2, …, X_n be a sample from a distribution function F and X*_1 ≤ X*_2 ≤ ⋯ ≤ X*_n the order statistics. The Hill estimator for the tail index is

ξ̂ = (1/k) Σ_{i=1}^{k} (log X*_{n−i+1} − log X*_{n−k}).

Similarly to the POT model, the Hill estimator uses only the largest values of the sample. The threshold is defined as the (k + 1)-th largest observation.
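The Hill estimator above can be sketched in a few lines; this is a minimal illustration (the function name is ours, not the paper's):

```python
import math

def hill_estimator(sample, k):
    """Hill estimate of the tail index from the k largest observations:
    xi_hat = (1/k) * sum_{i=1}^{k} (log X*_{n-i+1} - log X*_{n-k}),
    where X*_1 <= ... <= X*_n is the ordered sample and X*_{n-k},
    the (k+1)-th largest observation, acts as the threshold."""
    xs = sorted(sample)
    n = len(xs)
    if not 1 <= k < n:
        raise ValueError("k must satisfy 1 <= k < n")
    log_threshold = math.log(xs[n - k - 1])      # X*_{n-k}
    return sum(math.log(xs[n - i]) - log_threshold for i in range(1, k + 1)) / k
```

For an exact Pareto sample the log-excesses over the threshold are exponential with mean ξ, so the estimate is close to the true index for any reasonable k.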
The Hill estimator strongly depends on the choice of k. It is important to mention that by the work of [26], ξ̂ is a consistent estimator for the tail index only if k → ∞ and k/n → 0 as n → ∞. Using too small a k leads to large variance, while for too large a k the estimator is likely to be biased. Therefore, proposing a method for choosing the optimal k for the Hill estimator has been in the focus of research since its publication; see, e.g. [6,7,19,20,22] and others.

Double Bootstrap
One of the most accurate estimators, the double bootstrap method, was introduced by [6] and improved by [29]. A proposed k is found by minimizing the asymptotic mean squared error of the Hill estimator. Define

M^(i)(n, k) = (1/k) Σ_{j=1}^{k} (log X*_{n−j+1} − log X*_{n−k})^i.

For i = 1 one can observe that ξ̂ = M^(1)(n, k) is the Hill estimate. Let us define M(n, k) = M^(2)(n, k) − 2(M^(1)(n, k))^2 and k_2 as the optimal threshold for M(n, k). As proved by [6], k_1 and k_2 are connected through a regularity parameter ρ (which can be estimated in a consistent way), so it is possible to estimate k_2 and ρ instead of k_1. The double bootstrap method consists of the following steps:
• Choose ε ∈ (0, 1/2), and set m_1 = [n^(1−ε)] to ensure consistency as n → ∞. Estimate E(M(m_1, r)^2 | X_1, X_2, …, X_n) by drawing m_1-size bootstrap samples from the empirical distribution function F_n and minimize it in r. Denote the minimizer by r_1.
• Set m_2 = [m_1^2/n] and minimize E(M(m_2, r)^2 | X_1, X_2, …, X_n) the same way as in the first step; denote the minimizer by r_2.
• Estimate the regularity parameter ρ, which is needed for the further calculations, by ρ̂ = log(r_1)/(−2 log(m_1) + 2 log(r_1)).
• Now one can estimate the optimal k using the approximation (3) by k̂_opt = (r_1^2/r_2) · ((log r_1)^2/(2 log m_1 − log r_1)^2)^((log m_1 − log r_1)/log m_1).
The Hill estimator based on the double bootstrap method provides an appropriate tail index estimate for ξ > 0, but usually results in a long computation time because of the high number of bootstrap simulations needed. For smaller sample sizes the acceptable range is limited to ξ > 0.5, as one can see from the poor simulation results of [7] for lower tail indices.
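The statistic M(n, k) and the inner bootstrap minimization can be sketched as follows (a simplified illustration: the function names, the full k-grid, and the number of bootstrap replications n_boot are our choices, not the paper's):

```python
import math
import random

def M_stat(sample, k, i):
    """M^(i)(n, k) = (1/k) * sum_{j=1}^{k} (log X*_{n-j+1} - log X*_{n-k})^i."""
    xs = sorted(sample)
    n = len(xs)
    t = math.log(xs[n - k - 1])
    return sum((math.log(xs[n - j]) - t) ** i for j in range(1, k + 1)) / k

def M_comb(sample, k):
    """M(n, k) = M^(2)(n, k) - 2 * (M^(1)(n, k))^2."""
    return M_stat(sample, k, 2) - 2.0 * M_stat(sample, k, 1) ** 2

def bootstrap_r(sample, m, n_boot=100, rng=random):
    """Approximate r = argmin_k E(M(m, k)^2 | X_1, ..., X_n) by drawing
    m-size bootstrap samples from the empirical distribution."""
    best_k, best_val = 2, float("inf")
    for k in range(2, m - 1):
        val = sum(M_comb([rng.choice(sample) for _ in range(m)], k) ** 2
                  for _ in range(n_boot))
        if val < best_val:
            best_k, best_val = k, val
    return best_k
```

Running `bootstrap_r` once with subsample size m_1 and once with m_2 = [m_1^2/n] yields the minimizers r_1 and r_2 used in the steps above.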

Hall's Methodology for Optimal k-Selection
In recent years many methods have arisen from the influence of [23] which use bootstrap samples to estimate the tail index parameter or to improve the results of previous estimators. One of these methods appeared first in [18] and was concretized by [19], which paper also corrected the k-selection, as follows. Let X_1, X_2, …, X_n be the sample. First, one must fix an initial threshold k_aux (usually k_aux = 2√n), an estimator γ^(1)(k) for 1/ξ (based on the Hill estimator with k as the threshold), and subsample sizes n_1 = [n^ε], n_2 = [n_1^2/n], where n is the sample size and ε ∈ (0.5, 1) is arbitrary (usually ε = 0.85 is used). The steps of the algorithm are the following:
1. Simulate bootstrap samples of size n_1 and n_2, B times each.
2. For every k, calculate the bootstrap mean squared deviation of γ^(1)_{n_i,l}(k) from the reference value γ^(1)(k_aux), where γ^(1)_{n_i,l}(k) is the estimate obtained from the l-th n_i-element (i = 1, 2) bootstrap sample, l = 1, 2, …, B, with k as the threshold.
3. Let k_0(n_i) be the minimizer of this criterion over k ∈ {1, 2, …, n_i − 1}.
4. As a corrected threshold one may use k_0 = (k_0(n_1))^2 / k_0(n_2) if k_0 ∈ (1, n − 1).
5. This way the corrected estimate for the tail index is 1/γ^(1)(k_0).
This algorithm can be really effective; however, it has the same disadvantage as the double bootstrap method: finding k_0(n_i) requires the calculations to be repeated for every k, which takes approximately as long as calculating the optimal r_1, r_2 in the double bootstrap method.
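The corrected-threshold procedure above might be sketched as follows (a rough illustration only: the function names, the bootstrap criterion written around the reference estimate, the grids and B are our assumptions, and the Hill estimate is used directly in place of γ^(1)):

```python
import math
import random

def hill(xs_sorted, k):
    # Hill estimate on an ascending-sorted sample, threshold X*_{n-k}
    n = len(xs_sorted)
    t = math.log(xs_sorted[n - k - 1])
    return sum(math.log(xs_sorted[n - j]) - t for j in range(1, k + 1)) / k

def hall_estimate(sample, eps=0.85, B=50, rng=random):
    """Corrected Hill estimate via Hall-type subsample bootstrap.

    Reference value from k_aux = 2*sqrt(n); for subsample sizes
    n1 = [n^eps] and n2 = [n1^2/n], k_0(n_i) minimizes the bootstrap
    mean squared deviation from the reference, and the corrected
    threshold is k_0 = k_0(n1)^2 / k_0(n2), clamped to (1, n - 1)."""
    n = len(sample)
    xs = sorted(sample)
    ref = hill(xs, int(2 * math.sqrt(n)))
    n1 = int(n ** eps)
    n2 = max(4, int(n1 ** 2 / n))
    k0 = {}
    for m in (n1, n2):
        best_k, best = 2, float("inf")
        for k in range(2, m - 1):
            mse = sum((hill(sorted(rng.choice(sample) for _ in range(m)), k) - ref) ** 2
                      for _ in range(B))
            if mse < best:
                best_k, best = k, mse
        k0[m] = best_k
    k_corr = int(k0[n1] ** 2 / k0[n2])
    return hill(xs, min(max(k_corr, 2), n - 2))
```

The double loop over k and the B bootstrap replications makes the cost visible; this is exactly the computational burden the acceleration algorithm of the next section addresses.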

Acceleration Algorithm
For both the double bootstrap and Hall's method, the computational time could be a crucial problem. Minimizing among a large interval of possible k or t n i values needs a lot of calculations and especially the bootstrap simulations increase the needed time. These problems could lead us to use possibly less simulations, or execute the minimization over a smaller set.
We observed that the absolute error of the estimate is high for small and for large k values, while it is smaller in the neighbourhood of the optimum. The error as a function of k was usually smooth and showed a "U-shaped" curve (a similar property was already mentioned by [19]). This motivates an approximation algorithm to find the minimal error value instead of scanning all of the values k = 1, …, n, which takes unnecessarily long simulation time. We propose the following step-by-step approximation algorithm:
1. Calculate the mean squared error only for k = c·√n (c ∈ ℤ+) and minimize among these values. Denote the minimizer by m_a.
2. Focus on m_a and its 2·√n-wide surroundings and fix an integer b. Now minimize among the numbers m_a ± c·b (c ∈ ℤ+) in the examined region. Denote the minimizer by m_b.
3. Continue the calculation in the 2·b-wide neighbourhood of m_b and minimize among all these values.
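The three steps can be written as a generic coarse-to-fine minimizer (a sketch; the function name, the default stride b = 5 and the error function passed in are our illustrative choices):

```python
import math

def accelerated_argmin(err, n, b=5):
    """Coarse-to-fine minimization of err(k) over k = 1, ..., n - 1.

    Step 1 scans only k = c * sqrt(n); step 2 scans the 2*sqrt(n)-wide
    window around that minimum with stride b; step 3 scans the 2*b-wide
    window around the refined minimum exhaustively.  Finds the global
    optimum when the error curve is U-shaped (a local one otherwise)."""
    step = max(1, int(math.sqrt(n)))
    m_a = min(range(step, n, step), key=err)          # coarse grid
    lo, hi = max(1, m_a - step), min(n - 1, m_a + step)
    m_b = min(range(lo, hi + 1, b), key=err)          # medium grid
    lo, hi = max(1, m_b - b), min(n - 1, m_b + b)
    return min(range(lo, hi + 1), key=err)            # full local scan
```

Instead of n error evaluations, this needs roughly √n + 2√n/b + 2b evaluations, which is where the speed-up of Tables 1 comes from when each evaluation involves bootstrap simulations.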
This way, one can find the global optimum for k_0, r_1, r_2 in the double bootstrap and Hall's method in most of the cases. However, sometimes the algorithm stops in a local optimum. Therefore, using the acceleration algorithm we trade off some accuracy of estimation for faster computation. In fact, as n → ∞, the error rate decreases, since finding the exactly optimal k is not crucial for the estimator: an adequate k results in a similar estimate. In addition, the steps can be modified to focus better on the potentially optimal k values, making the calculation even faster.
In Table 1 one can see the comparison of Hall's method and the accelerated Hall method. The mean absolute error is similar for samples of size 200 to 1000, while the computation time is strongly in favour of the accelerated method.
The acceleration algorithm can even provide better results in some cases (e.g. Table 1, n = 200); however, this is caused by the uncertainty of the bootstrap simulations: the original method may stop at a wrong value, while the accelerated one finds the right one. Using more bootstrap simulations this paradox diminishes.

Kolmogorov-Smirnov Distance Metric
Another approach for tail index estimation was suggested by [7], which minimizes the distance between the tail of the empirical distribution function and the fitted Pareto distribution with the estimated tail index parameter. For the minimization the Kolmogorov–Smirnov distance of the quantiles can be used. Assume that the distribution of the sample is in the maximum domain of attraction of a GEV distribution, which implies that the tail of the distribution function can be approximated as

P(X > x) = 1 − F(x) ≈ C x^(−1/ξ).

Rearranging this equation, the value of x can be approximated by

x ≈ (C / P(X > x))^ξ.

One can construct a quantile function by using this approximation. The probability P(X > x) can be replaced by the observed relative frequency j/n, ξ may be estimated using the Hill estimator for a given k, and C can be approximated by (k/n)(X*_{n−k+1})^(1/ξ), as the highest observations follow a Pareto distribution in the limit. These substitutions result in an estimator for the quantiles as a function of j and k:

q_n(j, k) = X*_{n−k+1} (k/j)^ξ̂.
This quantile estimator was first proposed by [31] as a maximum likelihood estimator using only the k upper ordered statistics.
In this procedure, the optimal k for the Hill estimator is chosen as the k which minimizes the Kolmogorov–Smirnov distance between the empirical and the calculated quantiles,

k_opt = arg min_k max_{1≤j≤T} |X*_{n−j+1} − q_n(j, k)|,

where T sets the fitting threshold (we call it the KS threshold). Choosing T is almost arbitrary, because for heavy-tailed distributions the largest differences always appear at the highest quantiles. We applied the rule T = max(50, √n). The advantages of the Kolmogorov–Smirnov method are that it is easy to programme and its computation time is short. As [7] mentioned, it is the best performing known method if 0 < ξ < 0.5, and by our experience it also works well for small sample sizes. However, for distributions with tail index ξ > 0.5 this technique results in highly biased estimates.
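The k-selection described above can be sketched as follows (an illustration with our own function name; the distance is taken directly on the top-T quantiles, and the Hill estimate is recomputed for each candidate k):

```python
import math

def ks_select_k(sample, T=None):
    """Choose k for the Hill estimator by minimizing the Kolmogorov-Smirnov
    distance between the empirical quantiles X*_{n-j+1}, j = 1, ..., T,
    and the fitted quantiles q_n(j, k) = X*_{n-k+1} * (k/j)^xi_hat(k),
    with T = max(50, sqrt(n)) as the KS threshold."""
    xs = sorted(sample)
    n = len(xs)
    if T is None:
        T = int(max(50, math.sqrt(n)))
    T = min(T, n - 1)
    best_k, best_d = 2, float("inf")
    for k in range(2, n - 1):
        t = math.log(xs[n - k - 1])
        xi = sum(math.log(xs[n - j]) - t for j in range(1, k + 1)) / k   # Hill
        d = max(abs(xs[n - j] - xs[n - k] * (k / j) ** xi)
                for j in range(1, T + 1))
        if d < best_d:
            best_k, best_d = k, d
    return best_k
```

The single pass over k with no bootstrap resampling is why this method is the fastest of those compared in Sect. 3.3.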

Asymptotics of Kolmogorov-Smirnov Method
Although the Kolmogorov–Smirnov method has a straightforward algorithm, its theoretical background and asymptotic behaviour have not been presented yet. In Theorem 1 we state that under some conditions the Kolmogorov–Smirnov technique results in underestimates of the tail index. Moreover, the simulations show that the estimate approaches the real tail index parameter as the number of observations increases. For the proof of the properties of the estimator we need the following lemma. Proof For k → ∞, using the Stirling formula we have

distribution is in the maximum domain of attraction of an extreme value distribution). The Hill estimator using the Kolmogorov-Smirnov k-selection technique has a negative bias as
Proof Let X_1, X_2, …, X_n be our sample, and let X*_1, X*_2, …, X*_n be the ordered sample. Let q_n(j, k) be a quantile estimator for X*_{n−j+1}, similarly to [7]. It is asymptotically unbiased, as it is a maximum likelihood estimator for the given quantile by [31].
Under the given conditions X*_n, X*_{n−1}, …, X*_{n−k} can be considered as a (k + 1)-size ordered sample from a Pareto distribution with unknown parameters (see [25]). Therefore, a Glivenko–Cantelli type theorem for quantiles provides that q_n(j, k) is a consistent estimator for the (k + 1 − j)/(k + 1) quantile of the unknown Pareto distribution, which can also be approximated by X*_{n−j+1}. Namely, for every ε > 0 there is an M such that if k > M (and n ≫ M), then the Kolmogorov–Smirnov distance between q_n(j, k) and X*_{n−j+1} will be less than ε. Choose k according to the Kolmogorov–Smirnov method (2.3) to ensure the approximation of the quantile estimation, and let ξ̂ be the Hill estimate using this k, where n > k > M.

Conjecture 1 Using the conditions of Theorem 1 the Hill estimator using the Kolmogorov-Smirnov k-selection technique shows asymptotically unbiased behaviour.
The consistency of the Hill estimator is well-known as n → ∞, k → ∞ and k/n → 0 (see, e.g. [16]). The estimator is expected to behave regularly, implying L_1 convergence. As an empirical observation from Monte Carlo simulations from the standard generalized Pareto distribution (with ξ = 1), we could observe that for an increasing sequence of k the ratio between the two sides of Jensen's inequality in the proof of Theorem 1 tends to 1 (see Table 2). Similarly, in the approximation by quantile estimation the ratio of the two sides tends to 1 when k → ∞ and n → ∞ hold (see Table 3). These observations suggest that the Kolmogorov–Smirnov method results in an asymptotically unbiased estimate of the tail index under the conditions k → ∞, k/n → 0 and n → ∞.
However, k → ∞ is not an evident condition, and even if it is realized, we cannot say anything about the speed of convergence. This could lead to a biased tail index estimate using the Kolmogorov–Smirnov method, especially if ξ is large. For moderate-size samples this bias is estimated in Table 9, and it is always negative, in accordance with Theorem 1. As [7] presented, the Kolmogorov–Smirnov method provides acceptable estimates for moderate-size samples if ξ < 0.5, but for tail index ξ ≥ 0.5 samples of size more than 10,000 might be needed for proper estimates. In most real-life applications such a large sample size is not available, therefore in cases of distributions with infinite variance the estimator may have a significant bias.
In the previous paragraphs we dealt with two important properties of the Kolmogorov–Smirnov method: consistency (originating from the Hill estimator's consistency) and negative bias for finite samples. These properties motivated us to perform a simulation study in order to estimate the bias of the Hill estimator based on the Kolmogorov–Smirnov method using numerous initial distributions, sample sizes and theoretical tail indices. This will lead us to our new tail index estimator.

Finite Sample Properties of Kolmogorov-Smirnov Method
In this section, we show that the Kolmogorov–Smirnov method has the following properties:
• The distribution of ξ̂ is similar for each parent distribution.
• It is independent of the sample size in the interval n ∈ (200, 10,000).
• The bias is linear in ξ.
First, we simulated 500-size i.i.d. samples from Fréchet, GPD, stable and Student distributions with fixed tail index parameter ξ. The estimates were calculated 30,000 times for each ξ. We experienced that the empirical distribution of ξ̂ is similar, GEV-like, for each parent distribution, as one can see in Fig. 1. It would be beneficial to characterize the empirical distribution more precisely, since the best fitting GEV distribution is still rejected by the goodness-of-fit tests (chi-square test after discretization, p < 10^−16, which may be caused by the large sample size). Nevertheless, the empirical distribution and the best fitted GEV distribution are closely related, thus the parameters of the GEV distribution (estimated by maximum likelihood) might characterize the empirical distribution well. Second, we simulated samples of sizes from 200 to 10,000. We fitted a GEV distribution to the tail index estimates of 1000 Monte Carlo simulations. We experienced that the GEV parameters and the expectation of the tail index were similar regardless of the sample size, see Table 4. We do not have results for larger samples, since the double bootstrap and Hall method are much more effective in that case, due to their asymptotic properties. We can state that for ξ ∈ (0.5, 4) the distribution of the Kolmogorov–Smirnov estimator does not depend on the sample size if 200 < n < 10,000.
Finally, we wanted to determine the bias of the Kolmogorov–Smirnov method for ξ ∈ (0.5, 4). Therefore, we simulated 500-size samples from the Fréchet distribution with various parameters. We calculated the average of the estimated ξ and the location parameter of the best fitted GEV distribution based on 30,000 simulations. As one can see in Fig. 2, the average and the GEV parameters are in linear relation with the true parameter ξ. Therefore, by correcting the estimator with a linear transformation we can improve the Kolmogorov–Smirnov method. When ξ > 4, we cannot obtain reasonable results, so the correction cannot help (but these distributions do not have any practical relevance).

The Proposed New Estimator
Based on the finite sample properties of the Kolmogorov–Smirnov method, we constructed a new, empirical estimator that corrects the potential bias. As we experienced in Sect. 2.3.2, the Kolmogorov–Smirnov method depends neither on the type of the sampling distribution nor on the size of the sample for the investigated cases, therefore a linear transformation can correct the bias. Based on the Monte Carlo simulations we fitted a linear regression model to the average of the estimated ξ, the GEV parameters and the sample size against the theoretical ξ. We experienced that both the mean of the estimates and the location parameter of the fitted GEV were in strong correlation with ξ, thus two regression models are feasible for bias correction. One of the potential models is ξ̂ = −0.119 + 1.603·f, where f is the location parameter of the best fitted GEV model to the KS estimates; the other uses the mean of the estimates analogously. However, for applying the correction models, we need to know the distribution of the Hill estimates on the sample. As [5] pointed out, m out of n bootstrap is a feasible way to estimate the distribution of a point estimator in the extreme-value setup. Therefore, the new bias corrected algorithm consists of a bootstrap simulation and a transformation based on linear regression. It is important to mention that the method works best for ξ ∈ (0.5, 4) and for sample sizes between 200 and 10,000. Outside this region the statements above might not hold, thus the regression coefficients may be different. Now we present a step-by-step algorithm for the bias correcting method; we call this model the regression estimator:

Algorithm
Let X = X 1 , X 2 , … , X n be independent and identically distributed observations.
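A rough sketch of the bias-correcting procedure (m-out-of-n bootstrap of a KS-based estimate, then the linear correction a + b·f) might look as follows. The function names are ours; a = −0.119, b = 1.603 are the coefficients of the GEV-location model quoted in Sect. 3, and, as a simplifying assumption of this sketch, the bootstrap mean is used as the location statistic f instead of a GEV maximum-likelihood fit:

```python
import random

def regression_estimate(sample, base_estimator, m=None, n_boot=1000,
                        a=-0.119, b=1.603, rng=random):
    """Bias-corrected tail index via m-out-of-n bootstrap plus the linear
    regression correction a + b * f.

    `base_estimator` is the KS-based tail index estimator applied to each
    bootstrap subsample.  The bootstrap mean serves here as the location
    statistic f (the mean-based variant); fitting a GEV by maximum
    likelihood to the bootstrap estimates and taking its location
    parameter would correspond to the other variant."""
    n = len(sample)
    if m is None:
        m = int(n ** 0.85)             # subsample size used in the paper
    ests = [base_estimator([rng.choice(sample) for _ in range(m)])
            for _ in range(n_boot)]
    f = sum(ests) / len(ests)          # location statistic of the bootstrap estimates
    return a + b * f
```

Any pointwise estimator can be plugged in as `base_estimator`; in the paper's setting it is the Hill estimator with the Kolmogorov–Smirnov k-selection.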

Simulations
The properties of the regression estimators were tested on simulated data from Pareto, Fréchet, Student and symmetric stable parent distributions. We also tested the estimator on a mixed distribution: the lower 80% consists of an exponential core, while the upper 20% is a Pareto tail. We set the tail index parameter to the same ξ ∈ (0.2, 4) value in each distribution for samples of size 200, 1000 and 4000. For the simulation study we used the built-in rgpd (package: fExtremes), rfrechet (package: VGAM), rt and rstable (package: stabledist) functions of the R programming language. We compared the averages of the tail index values estimated by FRE. Subsequently we calculated the average absolute error from the theoretical ξ. Using m = n^0.85, the subsample sizes were 90, 355 and 1153. We set the KS threshold (the maximal number of order statistics for the KS method) to min(50, √m). The results of the simulation can be seen in Table 8. The most important observations:
• For ξ < 0.5 the error rate is higher than optimal, while for ξ > 2 one can see a slight underestimation of the real tail index (except for the mixed distribution).
• For ξ ∈ (0.5, 2) the estimator seems to work properly.
• Generally, a higher sample size results in a more accurate estimate for every distribution and tail index.
• The symmetric stable distribution with ξ = 0.5 is a special case: it is not in the domain of attraction of the Fréchet distribution, so the extreme value model cannot work on it. However, it was an interesting question whether the regression estimator works outside the domain of attraction or not; the answer is no.
Generally, we can say that FRE works well for ξ ∈ (0.2, 4), but its best region is ξ ∈ (0.5, 2) for every type of parent distribution with proper extremal behaviour. We experienced similar properties for MRE too (Figs. 2, 3 and 4; Table 9).

Comparison of the Methods
In this section, we compare the effectiveness of the Kolmogorov–Smirnov (2.3), double bootstrap (2.1) and Hall's method (2.2) with the accelerated algorithm to the two types of regression estimators. We try to decide which one is the most efficient in terms of the average of the estimates, the absolute error and the computational time. We include different parent distributions (namely Fréchet, symmetric stable, generalized Pareto and Student) with varying tail indices and sample sizes.
First, we present the average absolute error of the different methods using 200-size samples from the Student distribution with ξ ∈ (0.1, 1.5). The results in Fig. 3 imply that the Kolmogorov–Smirnov method has the smallest error for ξ < 0.5, while in the ξ ∈ (0.5, 1.5) region FRE and MRE seem to be the best. For larger ξ the double bootstrap starts to outperform the regression estimators. For 500-size samples the optimal range for MRE and FRE lies around ξ = 0.5, while smaller tail indices can be estimated well with the Kolmogorov–Smirnov method. For larger ξ values Hall's and the double bootstrap method are better options for the estimation. Second, we analysed the effect of the sample size on the estimates. We fixed ξ = 0.7 as a tail index value where the regression estimation works well. We present the average absolute error using different sample sizes for the methods in Fig. 4. The four parent distributions were analysed separately. We may conclude that the regression estimators seem to be the best for n < 200. In case of the Student distribution FRE has the smallest error for n < 500, while for the GPD the regression estimators have no competitors if n < 800. The double bootstrap and Hall method start to behave better for larger sizes. Summarizing the figure, for high tail indices the best method depends on the sample size and the distribution; FRE and MRE are the best for smaller samples.
Finally, for ξ ∈ (0.333, 2) the mean of the estimates, the absolute error and the computational time are presented in Table 9 for 200-, 500- and 1000-size samples using the different methods. The parent distribution is Fréchet, and the averages were calculated over 500 simulations. We can say that for smaller sample sizes usually FRE or MRE gives the best estimate and the smallest error. For n = 1000 the double bootstrap starts to be better and Hall's method also outperforms the regression estimator. The Kolmogorov–Smirnov method is the fastest in every case, but works properly only for ξ < 0.5. The computational times of the FRE and MRE methods are also competitive with the other accelerated algorithms.
We can conclude that our new estimation method provides the best estimates among the examined options if the tail index is between 0.5 and 1.5. For small samples (fewer than 200 observations) it always seems to be the best, while for moderate sizes (200–500) it is preferred in case of Student or GPD-like parent distributions. However, for larger samples and other types of parent distributions the optimal region may be wider.
Some remarks:
• The effectiveness of FRE and MRE comes from the property that these methods are not sensitive to the sample size. Therefore, their advantage can disappear if n > 1000, due to the consistency of the other methods.
• Unlike for the other examined methods, the distribution of FRE and MRE can be approximated by a normal distribution. One can see a comparison with the double bootstrap in Fig. 5. Therefore, confidence interval calculations are easier.
• If ξ ∈ (0.5, 4), the errors of the FRE and MRE estimators are not sensitive to the underlying distribution, unlike those of the double bootstrap or Hall method. This property results in different optimality regions, thus for different distributions the double bootstrap and Hall method need different sample sizes to outperform the regression estimators.

Model Selecting Scheme
The results of Sect. 3.3 suggest that different methods are optimal for different types of samples. Therefore, a model selecting scheme is beneficial for real-life applications. To get preliminary information about the sample, we need to fit different distributions and calculate ξ in a fast, but perhaps not reliable way. This is formulated in the model selection algorithm as follows: 1. Estimate the tail index using the Kolmogorov–Smirnov method (2.3). If the estimated ξ̂ is smaller than 0.45, then the true tail index is likely smaller than 0.5, therefore usually the Kolmogorov–Smirnov method works best (according to [7] and our simulations).

Danish Fire Losses
The Danish fire losses is a well-known open dataset, which is available in an R package [27]. Previous discussions were published, e.g. by [12,30]. The dataset contains 2167 fire losses which occurred between 1980 and 1990. The true tail index parameter is between 0.5 and 1 according to multiple estimators, therefore the large sample size implies point no. 5 of the model selection scheme. It is not clear which method results in the best estimate; however, the large sample size strongly supports the asymptotic double bootstrap and Hall method. We estimated the tail index parameter using the presented models, namely the double bootstrap, Kolmogorov–Smirnov and Hall's method. Additionally, we fitted a Pareto distribution to the data. As one can see in Table 5, all of the estimators result in 0.5 < ξ̂ < 0.75 values, therefore the model selection might not have a big significance. It is important to mention that MRE and FRE result in the smallest estimates. This might be caused by the fact that we used an n^0.85 = 685 subsample size, which is too large for such a big sample. Although we used the same subsample selection method for the regression estimator as for the other estimators, it might not be optimal for the m out of n bootstrap [4].
To check whether we can improve the regression estimates, we considered smaller bootstrap subsample sizes, namely ones from the interval (50, 300). We calculated the tail index of the dataset with 10,000 bootstrap samples for different subsample sizes. The results can be seen in Table 6. Using a subsample size of 50 to 100, we obtain the values of the most accurate methods. It might be a future task to optimize m based on the sample size, thus making the regression estimator more accurate for larger samples.

European Natural Disaster Damages
As a second example, we analysed the damages of the most destructive storms and temperature-related natural disasters of the past 50 years in Europe. The data originate from the [21] database. First, we needed to prepare the data for the tail index analysis: we summed up the damages occurring in different countries and on different days for each disaster. With these modifications the dataset contains 403 observations. In the analysis we applied the presented tail index estimators. One can see in Table 7 that the estimates differ over a wide range, which can be explained by the small sample size and the unusual distribution: the extremes are quite large, and several of them occurred in the observation period. The KS estimate is larger than 0.45, which suggests that the true tail index value is likely to be larger than 0.5. Since the sample size is also in the optimal range of the regression estimator, MRE or FRE can result in the best estimate for the data. One can see the histogram of the data and the best fitted extreme value distribution using the estimated tail index in Fig. 6. The tail behaviour is best captured by MRE, therefore the regression estimates could result in the best high quantile estimates.
For further analysis, one might look at the time-dependent structure of the data. Climate change could have an effect on the severity of storms and heat waves, which can show up as a positive trend among the observations. Therefore, using time-dependent models might improve the effectiveness of the estimators.

Conclusion
Our new regression method (Sect. 3) provides an alternative way to estimate the tail index of heavy-tailed distributions. We have shown its merits in the parameter estimation of known distributions, and presented that our method is also useful for real-life data. The computation time is similar to that of the double bootstrap method. However, as our algorithm also applies bootstrap techniques, one can use fewer bootstrap samples to lower the computation time further if needed, at the expense of the estimates' accuracy, or conversely. Our simulations showed, in agreement with [7], that the best estimate is the Hill estimate based on the Kolmogorov–Smirnov method (Sect. 2.3) if the parameter ξ is less than 0.5. If the sample size is small (n < 200), FRE and MRE have the best properties in most cases. For moderate (200 < n < 500) sample sizes and 0.5 < ξ < 1.5, the best working algorithm depends on the distribution of the sample: if it is GPD or Student-like, then usually FRE and MRE give the best estimates; otherwise, using the double bootstrap (2.1) or Hall method is suggested. If the size is more than 500, then in general the double bootstrap or Hall method can give a more accurate estimate. This model selection algorithm could be extended by comparing more methods or more types of initial distributions.
Our regression estimator has two types (FRE and MRE), using the mean of the bootstrap estimates or the location parameter of the best fitted GEV distribution. Our experiments did not indicate which one is more precise, therefore we suggest using both in a real-life analysis. Further research could answer this question.