1 Introduction

Rainfall is vital for life on Earth [1], but high-magnitude events can cause damage and losses, such as flooding, destruction of buildings and crops, soil erosion, and breaches of dikes and dams, among others [2, 3]. Damage in cities tends to be more severe because of rapid urbanization and the installation of complex infrastructure [4]. In addition, the frequency of extreme weather events has shown an increasing trend in several regions of the planet [6,7,8], and the southern region of Brazil has suffered from the occurrence of these events [2, 5].

To minimize negative impacts or avoid economic, social and environmental losses, it is necessary to plan activities and constructions based on the probabilistic forecast of the occurrence of maximum precipitation in a given location [9]. The forecasting process relies on fitting statistical models to the data. These models can address different aspects of the phenomenon, such as the occurrence of extreme values, temporal distribution, spatial distribution, and intensity, among others [10,11,12].

Statistical approaches based on the analysis of extreme values have shown promising results in forecasting these events in several areas of science [13,14,15,16]. One of the models extensively employed for this purpose in fields such as insurance, finance, meteorology, and environmental science is the Generalized Pareto Distribution (GPD) [17, 18].

Given the use of probabilistic models, assessing their goodness of fit is an equally important task. In the analysis of extreme events, this stage is often neglected, even though the methodology is well established. Goodness-of-fit tests such as Kolmogorov-Smirnov, chi-squared, and likelihood ratio are widely used [17, 19, 20]. However, as noted in [21], testing the fit with parameters estimated from the same data can lead to type II errors; to circumvent this problem, [21] proposes a simulation study. In general, these simulation studies are based on Monte Carlo procedures [22, 23].

Hence, the present work aims to fit the Generalized Pareto Distribution to the maximum monthly rainfall in the city of Uruguaiana, Rio Grande do Sul state, Brazil, to calculate the probability of some extreme events occurring, and to calculate return levels of extreme rainfall events and their confidence intervals for return periods of 2, 5, 10, 30, 50 and 100 years.

2 Methodology

The data set was obtained from the meteorological database for teaching and research (BDMEP), made available by the National Institute of Meteorology (INMET), and covers January 1961 to April 2019 as recorded at the Uruguaiana weather station, Rio Grande do Sul state. The data are grouped by month, and the threshold method is applied to each month. Thus, rainfall values above a sufficiently high threshold were selected according to the POT (peaks over threshold) methodology and modeled by the Generalized Pareto Distribution.

According to Coles [24], just as the Generalized Extreme Value (henceforth GEV) distribution is the limit distribution of block maxima, the GPD arises as the parametric form of the limit distribution of threshold excesses. Its probability density function is given by

$$\begin{aligned} f\left( {x\left| {\xi ,\sigma ,u} \right. } \right) = \left\{ \begin{array}{l} \frac{1}{\sigma }\left[ {1 + \xi \left( {\frac{{x - u}}{\sigma }} \right) } \right] ^{ - \left( {1 + \frac{1}{\xi }} \right) } ,\,x \ge u,\,1 + \xi \left( {\frac{{x - u}}{\sigma }} \right) > 0,\,{\mathrm{{if}}}\,\xi \ne 0 \\ \frac{1}{\sigma }\exp \left( { - \frac{{x - u}}{\sigma }} \right) ,\,x \ge u,\,{\mathrm{{if}}}\,\xi \rightarrow 0 \\ \end{array} \right. \end{aligned}$$
(1)

The distribution function is given by

$$\begin{aligned} F\left( {x\left| {\xi ,\sigma ,u} \right. } \right) = \left\{ \begin{array}{l} 1 - \left[ {1 + \xi \left( {\frac{{x - u}}{\sigma }} \right) } \right] ^{ - \frac{1}{\xi }} ,\,\,\xi \ne 0 \\ 1 - \exp \left( { - \frac{{x - u}}{\sigma }} \right) ,\,\xi \rightarrow 0 \\ \end{array} \right. \end{aligned}$$
(2)

where u is the threshold, \(\sigma \) is the scale parameter and \(\xi \) is the shape parameter. The threshold must be fixed a priori; its selection is described in Sect. 2.1. The parameters \(\sigma \) and \(\xi \) must be estimated from the data, as described in Sect. 2.2. Three classes of standard distributions can be obtained from the GPD: Type I: Exponential (\(\mathop {\lim }\limits _{\xi \rightarrow 0} F\left( {x\left| {\xi ,\sigma ,u} \right. } \right) \)), Type II: Pareto (\(\xi >0\)) and Type III: Beta or ordinary Pareto (\(\xi <0\)).
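As an illustration of Eqs. 1 and 2, the following is a minimal Python sketch of the GPD density and distribution function. The function names and the tolerance `tol` used to switch to the exponential limit are assumptions made for this example; they are not part of the original analysis, which was carried out in R.

```python
import math

def gpd_cdf(x, u, sigma, xi, tol=1e-9):
    """Distribution function of Eq. (2) for a value x and threshold u."""
    if x <= u:
        return 0.0
    z = (x - u) / sigma
    if abs(xi) < tol:              # exponential limit, xi -> 0
        return 1.0 - math.exp(-z)
    base = 1.0 + xi * z
    if base <= 0.0:                # above the upper endpoint, only possible when xi < 0
        return 1.0
    return 1.0 - base ** (-1.0 / xi)

def gpd_pdf(x, u, sigma, xi, tol=1e-9):
    """Density of Eq. (1); zero outside the support."""
    if x < u:
        return 0.0
    z = (x - u) / sigma
    if abs(xi) < tol:              # exponential limit, xi -> 0
        return math.exp(-z) / sigma
    base = 1.0 + xi * z
    if base <= 0.0:
        return 0.0
    return base ** (-(1.0 + 1.0 / xi)) / sigma
```

For example, with the hypothetical values u = 50, \(\sigma = 10\) and \(\xi = 0.5\), `gpd_cdf(70.0, 50.0, 10.0, 0.5)` evaluates \(1 - 2^{-2} = 0.75\).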

2.1 Threshold selection

To choose the appropriate threshold value, an exploratory graphical tool based on the linearity of the mean excess function was used. This plot shows the mean excess over each of several thresholds against the threshold itself (Fig. 1). It is also known as the mean residual life plot [25].

Fig. 1 Mean residual life plot for choosing a threshold

Fig. 2 Threshold choice plot for scale and shape estimated parameters

However, the mean residual life plot can be difficult to interpret as a threshold selection method. A complementary technique, based on fitting the GPD at a range of thresholds and examining the stability of the parameter estimates, was therefore also employed [24]. This plot is known as the threshold choice plot (Fig. 2).

Choosing a very high threshold may leave a small number of observations, which inflates the variance of the estimators. However, a threshold that is too low to satisfy the theoretical assumptions may result in biased estimates. Thus, one should choose the threshold above which the mean residual life plot and the functions of the parameters \(\sigma \) and \(\xi \) are approximately linear [26].
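The mean excess function behind the mean residual life plot can be sketched as follows. This is only an illustrative Python sketch: the function names are hypothetical, and the plotting step and the confidence bands are omitted.

```python
def mean_excess(data, u):
    """Mean of the excesses (x - u) over the observations that exceed u."""
    excesses = [x - u for x in data if x > u]
    if not excesses:
        raise ValueError("no observations exceed the threshold")
    return sum(excesses) / len(excesses)

def mean_residual_life(data, thresholds):
    """Pairs (u, mean excess over u), i.e. the points of the mean residual life plot."""
    return [(u, mean_excess(data, u)) for u in thresholds]
```

Plotting the second element of each pair against the first gives the mean residual life plot. A candidate threshold is one above which the points follow an approximately straight line.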

2.2 Parameter estimation

After selection of the threshold, the GPD parameters were estimated by the maximum likelihood method. The maximum likelihood estimators maximize the log-likelihood function. Suppose \(x_1,\ldots ,x_k\) are the k excesses of a threshold u [24]. For \(\xi \ne 0\)

$$\begin{aligned} l\left( {\sigma ,\xi } \right) = - k\log \left( \sigma \right) - \left( {1 + \frac{1}{\xi }} \right) \sum \limits _{i = 1}^k {\log \left( {1 + \xi \frac{{x_i }}{\sigma }} \right) }, \end{aligned}$$
(3)

where \(({1 + \sigma ^{ - 1} \xi x_i }) > 0\,{\mathrm{{for}}}\,i = 1,\ldots ,k\); otherwise, \(l( {\sigma ,\xi }) = - \infty \). In the \(\xi \rightarrow 0\) case, the log-likelihood function is given by

$$\begin{aligned} l\left( \sigma \right) = - k\log \left( \sigma \right) - \frac{1}{\sigma }\sum \limits _{i = 1}^k {x_i }. \end{aligned}$$
(4)

The maximum likelihood estimators of the parameters \(\sigma \) and \(\xi \) are obtained by solving the system of equations that results from setting the partial derivatives of the log-likelihood with respect to each parameter equal to zero. This system has no closed-form solution, so the estimation of \(\sigma \) and \(\xi \) requires numerical maximization, for which methods such as Newton–Raphson, simulated annealing, Fisher’s scoring or their variations can be used [27].
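The log-likelihood of Eqs. 3 and 4 and its numerical maximization can be sketched in Python as follows. The optimizer shown is a simple derivative-free compass search over \((\log \sigma ,\xi )\), chosen here only to keep the example self-contained; it is not one of the methods cited above, and the function names, starting point, step sizes and simulated sample are all assumptions of this sketch.

```python
import math
import random

def gpd_loglik(excesses, sigma, xi, tol=1e-9):
    """Log-likelihood of Eqs. (3)-(4) for excesses x_i over the threshold."""
    if sigma <= 0.0:
        return -math.inf
    k = len(excesses)
    if abs(xi) < tol:                      # exponential limit, Eq. (4)
        return -k * math.log(sigma) - sum(excesses) / sigma
    total = 0.0
    for x in excesses:
        term = 1.0 + xi * x / sigma
        if term <= 0.0:                    # outside the support: l = -infinity
            return -math.inf
        total += math.log(term)
    return -k * math.log(sigma) - (1.0 + 1.0 / xi) * total

def fit_gpd(excesses, step=0.5, min_step=1e-6, max_iter=100000):
    """Compass search over (log sigma, xi), started at the exponential fit."""
    point = [math.log(sum(excesses) / len(excesses)), 0.0]
    best = gpd_loglik(excesses, math.exp(point[0]), point[1])
    for _ in range(max_iter):
        if step < min_step:
            break
        improved = False
        for i in (0, 1):
            for delta in (step, -step):
                trial = list(point)
                trial[i] += delta
                value = gpd_loglik(excesses, math.exp(trial[0]), trial[1])
                if value > best:           # accept only strict improvements
                    point, best, improved = trial, value, True
        if not improved:
            step /= 2.0                    # refine the search around the current point
    return math.exp(point[0]), point[1], best

# Simulated exponential excesses with scale 10 (hypothetical data, not the rainfall series)
rng = random.Random(1)
sample = [rng.expovariate(1.0 / 10.0) for _ in range(400)]
sigma_hat, xi_hat, loglik = fit_gpd(sample)
```

Because the search starts at the exponential maximum likelihood estimate (the sample mean of the excesses, with \(\xi = 0\)), the returned log-likelihood is never lower than that of the exponential fit, which is convenient for the likelihood ratio test of Sect. 2.3.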

2.3 Hypothesis testing

With the parameters estimated, goodness-of-fit criteria of the GPD model were evaluated. The Kolmogorov-Smirnov (KS) test was used to compare the theoretical cumulative distribution and the empirical cumulative distribution [28]. The Ljung-Box (LB) test was used to assess independence; its statistic is compared with the upper \(\alpha \)-th quantile of the chi-squared distribution with one degree of freedom. The Mann-Kendall test was used to determine whether the series has a statistically significant time trend [29]. Very small p-values indicate evidence in favor of the alternative hypothesis, that is, that the behavior of the analyzed series tends to change over time.
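As an illustration of the comparison made by the KS test, the following Python sketch computes the KS statistic, that is, the largest distance between the empirical and the theoretical cumulative distributions. The function name is hypothetical and the p-value is not computed here. Note also that when the parameters of the theoretical distribution are estimated from the same sample, the standard critical values are only approximate, which is one motivation for the simulation study of Sect. 2.5.

```python
def ks_statistic(sample, cdf):
    """Largest distance between the empirical CDF of `sample` and `cdf`."""
    xs = sorted(sample)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs, start=1):
        f = cdf(x)
        # The empirical CDF jumps from (i - 1)/n to i/n at x, so check both sides
        d = max(d, i / n - f, f - (i - 1) / n)
    return d
```

Here `cdf` is any function returning the fitted cumulative probability at a value, for instance the distribution function of Eq. 2 with the estimated parameters plugged in.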

For the maximum likelihood estimates, one can test whether \(\xi \) is statistically null. To test the null hypothesis that the distribution of the excesses is exponential, we use the likelihood ratio test (LT), whose test statistic is

$$\begin{aligned} \Lambda = 2\left[ {l\left( {{\hat{\sigma }} ,{\hat{\xi }} } \right) - l\left( {{\hat{\sigma }} } \right) } \right] , \end{aligned}$$
(5)

where \({l\left( {{\hat{\sigma }} } \right) }\) and \({l\left( {{\hat{\sigma }} ,{\hat{\xi }} } \right) }\) represent the maximized log-likelihoods of the Exponential and GPD densities, respectively, each evaluated at its own maximum likelihood estimates [26]. Thus, the null hypothesis that \(\xi = 0 \) is rejected if \(\Lambda \) is greater than the upper \(\alpha \)-th quantile of the chi-squared distribution with 1 degree of freedom. Equivalently, if the p-value of the test is less than the significance level, the null hypothesis is rejected. For all tests we adopt \(1\%\) as the significance level.
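The decision rule of the likelihood ratio test can be sketched in Python as follows. The sketch takes the two maximized log-likelihoods as inputs and uses the identity \(\Pr [\chi ^2_1 > x] = \mathrm{erfc}(\sqrt{x/2})\) for the chi-squared distribution with one degree of freedom. The function name and the truncation of slightly negative statistics to zero are assumptions of this example.

```python
import math

def likelihood_ratio_test(loglik_gpd, loglik_exp, alpha=0.01):
    """Statistic of Eq. (5), its chi-squared(1) p-value and the decision."""
    stat = 2.0 * (loglik_gpd - loglik_exp)
    stat = max(stat, 0.0)                  # guard against small negative values from the optimizer
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value, p_value < alpha  # True means H0: xi = 0 is rejected
```

For instance, with hypothetical log-likelihoods of -100 for the GPD and -101 for the Exponential distribution, the statistic is 2 and the p-value is about 0.157, so the exponential hypothesis is not rejected at the \(1\%\) level.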

2.4 Probability of excesses and return levels

From Eq. 2, in the \(\xi \ne 0\) case, the probability that precipitation exceeds a value x, given that it exceeds the threshold u, is

$$\begin{aligned} \Pr \left[ {X> x\left| {X > u} \right. } \right] = \left[ {1 + \xi \left( {\frac{{x - u}}{\sigma }} \right) } \right] ^{ - \frac{1}{\xi }}. \end{aligned}$$
(6)

However, equation 6 gives a probability that is conditional on the threshold being exceeded. The quantity of interest is the unconditional probability that precipitation exceeds a given value \(x > u\). Multiplying equation 6 by the probability of exceeding the threshold gives

$$\begin{aligned} \Pr \left[ {X > x} \right] = \lambda \left[ {1 + \xi \left( {\frac{{x - u}}{\sigma }} \right) } \right] ^{ - \frac{1}{\xi }}, \end{aligned}$$
(7)

where \(\lambda = \Pr \left[ {X > u} \right] \). Hence, the level \(x_m\) that is exceeded on average once every m observations is the solution of

$$\begin{aligned} \lambda \left[ {1 + \xi \left( {\frac{{x_m - u}}{\sigma }} \right) } \right] ^{ - \frac{1}{\xi }} = \frac{1}{{m}}. \end{aligned}$$
(8)

Therefore, equation 8 leads to the m-observation return level. For representation, it is often more convenient to give return levels on an annual scale, so that the N-year return level is the level expected to be exceeded once every N years. If there are \(n_x\) observations per year, this corresponds to the m-observation return level, where \(m = N \times n_x\) [24]. Hence, the N-year return level is defined by

$$\begin{aligned} {\widehat{z}}_N = {\widehat{u}} + \frac{{{\widehat{\sigma }} }}{{{\widehat{\xi }} }}\left[ {\left( {Nn_x {\hat{\lambda }} } \right) ^{{\widehat{\xi }} } - 1} \right] \end{aligned}$$
(9)

where \(n_x\) is the number of days to be analyzed. We analyzed monthly rainfall data, so \(n_x = 31, 30, 28\) days, according to month. If \(\xi \rightarrow 0\), the return level is defined by

$$\begin{aligned} {\widehat{z}}_N = {\widehat{u}} + {\widehat{\sigma }} \,\log \left( {Nn_x {\hat{\lambda }} } \right) . \end{aligned}$$
(10)
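Equations 7, 9 and 10 can be sketched in Python as follows. The function names, the tolerance used for the exponential limit and the numerical values in the usage note are hypothetical and serve only to illustrate the formulas.

```python
import math

def exceedance_probability(x, u, sigma, xi, lam, tol=1e-9):
    """Eq. (7): unconditional Pr[X > x] for x >= u, with lam = Pr[X > u]."""
    z = (x - u) / sigma
    if abs(xi) < tol:                      # exponential limit, xi -> 0
        return lam * math.exp(-z)
    base = 1.0 + xi * z
    if base <= 0.0:                        # above the upper endpoint when xi < 0
        return 0.0
    return lam * base ** (-1.0 / xi)

def return_level(n_years, n_x, u, sigma, xi, lam, tol=1e-9):
    """Eqs. (9)-(10): level exceeded on average once every n_years * n_x observations."""
    m_lam = n_years * n_x * lam
    if abs(xi) < tol:                      # Eq. (10)
        return u + sigma * math.log(m_lam)
    return u + sigma / xi * (m_lam ** xi - 1.0)   # Eq. (9)
```

A useful consistency check is Eq. 8: for a level `z = return_level(N, n_x, u, sigma, xi, lam)`, the call `exceedance_probability(z, u, sigma, xi, lam)` should return \(1/(N n_x)\). With the hypothetical values N = 50, \(n_x = 30\), u = 40, \(\sigma = 12\), \(\xi = 0.1\) and \(\lambda = 0.05\), this probability is 1/1500.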

Return level estimates require estimates of the GPD parameters. The maximum likelihood estimates described in the previous sections are used for both the probabilities and the return levels. An estimate of \(\lambda \) is also required, for which the natural estimator is

$$\begin{aligned} {\hat{\lambda }} = \frac{k}{n} \end{aligned}$$
(11)

corresponding to the proportion of the sample points exceeding u. In addition to the return level estimates, the confidence intervals with confidence coefficient \((1-\alpha )\times 100\%\), associated with the return periods of 2, 5, 10, 30, 50 and 100 years, were constructed using the delta method, as described in Coles [24]. Since the number of excesses of u follows a binomial distribution, \({\hat{\lambda }}\) is also the maximum likelihood estimate of \(\lambda \). The confidence intervals for \({\widehat{z}}_N\) can be obtained by the delta method, but the uncertainty in the estimate of \({\hat{\lambda }}\) should also be included in the calculation. From the standard properties of the binomial distribution, \(Var\left( {\hat{\lambda }} \right) \approx {\hat{\lambda }} {{\left( {1 - {\hat{\lambda }} } \right) } / n}\), then the complete variance-covariance matrix is approximately

$$\begin{aligned} V = \left[ {\begin{array}{*{20}c} {{\hat{\lambda }} {{\left( {1 - {\hat{\lambda }} } \right) } / n}} &{} 0 &{} 0 \\ 0 &{} {v_{1,1} } &{} {v_{1,2} } \\ 0 &{} {v_{2,1} } &{} {v_{2,2} } \\ \end{array}} \right] \end{aligned}$$
(12)

where \(v_{i,j}\) represents the (i, j) term of the variance-covariance matrix of \({\widehat{\sigma }}\) and \({\widehat{\xi }}\). Thus, by the delta method,

$$\begin{aligned} Var\left( {{\widehat{z}}_N } \right) \approx \nabla z_N^T \,V\,\nabla z_N \end{aligned}$$
(13)

where

$$\begin{aligned} \nabla z_N^T = \left[ {\frac{{\partial z_N }}{{\partial \lambda }},\frac{{\partial z_N }}{{\partial \sigma }},\frac{{\partial z_N }}{{\partial \xi }}} \right] \end{aligned}$$
(14)

evaluated at \(\left( {{\widehat{\lambda }} ,{\widehat{\sigma }} ,{\widehat{\xi }} } \right) \). Therefore, the \((1 - \alpha )\times 100\%\) confidence interval for \({{\widehat{z}}_N }\) is given by

$$\begin{aligned} CI_{(1 - \alpha )\times 100\% }\left( {{\widehat{z}}_N }\right) = {\widehat{z}}_N \pm z_{\frac{\alpha }{2}} \sqrt{Var\left( {\widehat{z}_N } \right) }, \end{aligned}$$
(15)

where \(z_{\frac{\alpha }{2}}\) is the upper \(\frac{\alpha }{2}\)-th quantile of the standard normal distribution.
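The delta method calculation of Eqs. 13 to 15 can be sketched in Python as follows, for the \(\xi \ne 0\) case. The gradient is obtained by differentiating Eq. 9 with respect to \(\lambda \), \(\sigma \) and \(\xi \), with \(m = N n_x\). The function names are hypothetical, and the matrix `v` is assumed to be ordered as in Eq. 12.

```python
import math
from statistics import NormalDist

def return_level_gradient(m, sigma, xi, lam):
    """Gradient of Eq. (9) with respect to (lambda, sigma, xi), for xi != 0."""
    a = (m * lam) ** xi
    d_lam = sigma * m ** xi * lam ** (xi - 1.0)
    d_sigma = (a - 1.0) / xi
    d_xi = -sigma * (a - 1.0) / xi ** 2 + sigma * a * math.log(m * lam) / xi
    return [d_lam, d_sigma, d_xi]

def delta_method_ci(z_hat, grad, v, alpha=0.05):
    """Eqs. (13) and (15): interval for the return level from the 3 x 3 matrix v."""
    var = sum(grad[i] * v[i][j] * grad[j] for i in range(3) for j in range(3))
    half_width = NormalDist().inv_cdf(1.0 - alpha / 2.0) * math.sqrt(var)
    return z_hat - half_width, z_hat + half_width
```

In practice, the entries \(v_{i,j}\) would come from the inverse of the observed information matrix of the fitted model, and the first diagonal entry from the binomial variance of \({\hat{\lambda }}\).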

2.5 Simulation study to evaluate goodness of fit for extreme values distributions

A computational simulation study was conducted to evaluate the performance of the distributions in each month. For this, the Monte Carlo simulation method was used, which consists of generating several realizations of a phenomenon according to pre-established parameters. At the end of these simulations, we can calculate the mean and standard deviation of the simulated quantities, which represent measures of accuracy and precision, respectively [30, 31]. For each month, the data series was divided into a training series, comprising 30 years (1961–1991), and a test series, comprising 29 years (1992–2019). Two scenarios are considered: (1) the first scenario generates samples from the Exponential distribution with the estimated parameters, and (2) the second scenario generates samples from the GPD with the estimated parameters.

Each scenario \([k = (1),(2)]\) is repeated 10000 times, according to the Monte Carlo simulation procedure, following the steps described below:

  (i) With the training sample, generate a sample of the same size (n) according to the probability distribution of scenario k;

  (ii) Estimate the parameters of the Exponential and GPD distributions using the maximum likelihood method, described in Sect. 2.2;

  (iii) Perform the likelihood ratio test using the estimates from step (ii);

  (iv) For the return periods of 2, 5, 10, 15, 20, 25, 28 years, calculate the respective return levels with the probability distributions and their respective parameters estimated in step (ii);

  (v) With the test sample, obtain the observed return levels for the return periods of 2, 5, 10, 15, 20, 25, 28 years. Calculate the Root Mean Squared Error (RMSE) and the Mean Absolute Percentage Error (MAPE), given by equations 16 and 17, respectively.

    $$\begin{aligned} RMSE= & {} \sqrt{\frac{{\sum \nolimits _{i = 1}^{n_z } {\left( {z_{N_i } - {\hat{z}}_{N_i } } \right) ^2 } }}{{n_z }}} \end{aligned}$$
    (16)
    $$\begin{aligned} MAPE= & {} \frac{1}{{n_z }}\sum \limits _{i = 1}^{n_z } {\left| {\frac{{z_{N_i } - {\hat{z}}_{N_i } }}{{z_{N_i } }}} \right| } \times 100 \end{aligned}$$
    (17)

Steps (i) to (v) are repeated 10000 times. After that, we obtain the Monte Carlo averages of MAPE and RMSE. In addition, the following were calculated: the proportion of replicates in which the LT, in step (iii), resulted in a p-value higher than the significance level of \(1\%\), denoted by \({\hat{p}}_{LT}\); the proportion of replicates in which the MAPE of the GPD is greater than the MAPE of the Exponential distribution, denoted by \({\hat{p}}_{MAPE}\); and the proportion of replicates in which the RMSE of the GPD is greater than the RMSE of the Exponential distribution, denoted by \({\hat{p}}_{RMSE}\). It should be noted that the adopted return periods, 2, 5, 10, 15, 20, 25, 28 years (\(n_z = 7\)), lie within the length of the test series.
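The error measures of step (v) and the proportions computed over the replicates can be sketched in Python as follows. Only Eqs. 16 and 17 and the comparison between replicates are shown; the sample generation and model fitting of steps (i) to (iv) are omitted, and the function names are hypothetical.

```python
import math

def rmse(observed, estimated):
    """Eq. (16): root mean squared error between observed and estimated return levels."""
    n_z = len(observed)
    return math.sqrt(sum((z - zh) ** 2 for z, zh in zip(observed, estimated)) / n_z)

def mape(observed, estimated):
    """Eq. (17): mean absolute percentage error, in percent."""
    n_z = len(observed)
    return 100.0 / n_z * sum(abs((z - zh) / z) for z, zh in zip(observed, estimated))

def proportion_greater(values_a, values_b):
    """Share of replicates in which the measure in values_a exceeds the one in values_b."""
    pairs = list(zip(values_a, values_b))
    return sum(a > b for a, b in pairs) / len(pairs)
```

For example, if `mape_gpd` and `mape_exp` are lists holding the MAPE of each distribution in every replicate, `proportion_greater(mape_gpd, mape_exp)` corresponds to \({\hat{p}}_{MAPE}\). The same call with the RMSE lists gives \({\hat{p}}_{RMSE}\).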

All analyses were carried out in the R software [32] with the evd package [33].

3 Discussion and results

Table 1 shows that in all months the exponential distribution (\(\xi \rightarrow 0\)) is favored by the likelihood ratio test. The Mann-Kendall test indicated no trend in any month of the year, since all p-values were higher than 0.01. That is, there is no statistical evidence that any series of monthly rainfall maxima has a trend over the years. Furthermore, the series of monthly maxima are independent at the \(1\%\) level of significance. We highlight that these tests were used to verify the assumptions of the Extreme Value Theory models, but they could serve other purposes, such as the trend analysis of hydro-climatic series [2, 29, 34]. In addition, the Kolmogorov-Smirnov test indicates that both distributions fit adequately in all months, and the Q-Q plots corroborate the results (Fig. 3). A satisfactory fit of the GPD was also found by Lazoglou [35], Salleh and Hassan [36], Wan et al. [37] and Zahid et al. [38].

Table 1 Threshold (\({\hat{u}}\)) selected by the procedure described in Sect. 2.1, parameter estimates and hypothesis tests (p-value) of the Generalized Pareto (GPD) and Exponential distributions for monthly maximum rainfall data of the city of Uruguaiana, RS, Brazil

Fig. 3 Q-Q plots of the best-fitting distribution indicated by the comparison of Tables 4 and 5 for monthly maximum rainfall data of the city of Uruguaiana, RS, Brazil. Dashed lines represent the \(95\%\) confidence interval, points represent the empirical return levels against those estimated by the fitted model, and the solid line represents the 1:1 relation between return levels

Table 2 Probability (\(\%\)) of rainfall occurrence by the probability distributions for monthly maximum rainfall data of the city of Uruguaiana, RS, Brazil

From the fit of the exponential distribution, Table 2 shows that from October to February and from April to May, rainfall above 50 mm occurs with a probability greater than 60%. The table also shows that the probability of rainfall above 150 mm is higher in April and May than in other months of the year.

Rain volumes between 100 mm and 180 mm in a few hours can lead to landslides and flooding. One example occurred in the city of Rolante, in the metropolitan region of Porto Alegre, which had an average accumulated rainfall of 180 mm. Landslides caused by a flood affected an area of 230 hectares and more than 6,600 inhabitants, and mud was carried by the river, cutting off the water supply in eight municipalities of the region [39].

Herrmann [40] reported that in November 1991 accumulated precipitation exceeded 400 mm in only two days in São José / SC. There were numerous landslides and deaths in the eastern mountain range of Santa Catarina, as houses collapsed and several sections of the highway BR 101 were blocked by collapsed embankments. In December 1995, heavy rainfall resulted in 29 deaths and led 29 municipalities in the mesoregion of southern Santa Catarina to declare a state of calamity.

Table 3 Return Levels estimates (mm) by the probability distributions for monthly maximum rainfall data of the city of Uruguaiana, RS, Brazil

Table 3 presents estimates of maximum rainfall return levels for periods of 2 to 100 years for each month. For both the GPD and the exponential fits, the precipitation estimates increase as the return period increases. This is expected and agrees with Zahid et al. [38].

From September to May, rainfall above 50 mm is recorded, which, depending on its hourly intensity, may cause soil erosion and thereby contribute to the removal of nutrients essential for crop development [25].

In March, the maximum rainfall return level of 154.01 mm is expected to be exceeded once in 50 years according to the Exponential distribution. Medeiros et al. [41] found for the same month a return level of 124.33 mm by the Gumbel distribution in the municipality of Jataí-Goiás. They report that high daily precipitation levels can cause intense rainfall, and that precipitation estimates for different return periods can assist professionals involved in the planning and execution of hydraulic structure projects when making decisions about flood control.

Zahid et al. [38] conducted a study on temperature return levels and concluded that extreme temperatures can affect yields. Crops are very sensitive to temperature variations in the order of 1 \(^\circ \)C, according to Hatfield & Prueger [42]. Every crop has a certain temperature tolerance limit. When the temperature exceeds this limit, the yield is drastically reduced. The same goes for extreme rainfall.

The results indicate that April presented the highest rainfall return levels, with an expected level of 156.96 mm for an average period of 50 years. As a way of providing greater precision in the results, Beijo et al. [43] calculated the maximum rainfall return levels in Lavras, Minas Gerais state, by the type I extreme value distribution (Gumbel), and found that for an average period of 50 years the expected level is 148 mm, with a \(95\%\) confidence interval between 131 mm and 164 mm. These authors also recommend that, in the analysis of maximum precipitation, if the interest is in the maximum extreme event, the upper limit of the interval be used as a reference value. In this sense, Fig. 4 shows the behavior of the return levels and their 95 \(\%\) confidence intervals.

Rainfall events are considered individual and erosive when they total 10 mm or more, or 6.0 mm or more within at most 15 minutes, and are separated from each other by a period of at least six hours with rainfall of 1.0 mm or less [44].

Fig. 4 Return level plot (in years) and confidence intervals for monthly maximum rainfall data of the city of Uruguaiana, RS, Brazil. Dashed lines represent the \(95\%\) confidence interval and the solid line represents the return level estimated by the best distribution indicated by the comparison of Tables 4 and 5

As seen in Table 1, the likelihood ratio test attests that the Exponential distribution is sufficient to model the rainfall data, although in a few months the comparison of Kolmogorov-Smirnov p-values indicated that the GPD is more appropriate. If two probability distributions from the same family fit a set of data, the one with fewer parameters is preferable [45]. This is important when there are problems in estimating the parameters of models, which can occur in likelihood-based methods [13, 46, 47]. No such problems occurred in our study, which allowed us to conduct the simulation study described in Sect. 2.5. We conclude that there are months in which the Exponential distribution is more adequate, namely January, March, April and August, since most of the comparison criteria used favor this distribution. In September and November, most criteria indicated that the GPD is more appropriate (Tables 4 and 5).

Table 4 Results of scenario 1 for the Monte Carlo simulation in 10000 replicates for each month of the year for the Exponential and GPD distributions of monthly maximum rainfall data in Uruguaiana-RS
Table 5 Results of scenario 2 for the Monte Carlo simulation in 10000 replicates for each month of the year for the Exponential and GPD distributions of monthly maximum rainfall data in Uruguaiana-RS

In February, May, June, July, October and December, the result was inconclusive, as neither distribution was unanimously favored in the two scenarios evaluated (Tables 4 and 5). In that case, either distribution can be used. We emphasize that the Exponential distribution is expected to perform better in the first scenario and the GPD in the second scenario. When this does not occur, there is a strong indication that the true distribution in that month is the one unanimously selected by the adopted criteria.

Regarding simulation studies involving distributions of extreme values, Xavier et al. [48] reported, in simulation studies involving the generalized extreme value distribution with covariates to model trend or temporal effects, that the more parsimonious model is preferable and that, depending on the subject of study, the method used to select models is an important issue. In the same sense, Kim et al. [49] showed by Monte Carlo simulation that model comparison methods behave differently in the evaluation of stationary and nonstationary GEV models. For the nonstationary case, the Akaike information criterion showed better results, and in the stationary case the likelihood ratio test was superior in detecting the most appropriate model. Our study used the stationary GPD, and we showed by Monte Carlo simulation that there are months in which the most adequate distribution differs from that chosen in Table 1. We intend to extend this study to other probability distributions.

Beijo et al. [50] stress the importance of obtaining accurate estimates for rainfall. From a practical point of view, accuracy is important in terms of safety and economy, because rainfall greater than expected within a shorter period can cause serious damage. A contour line built from underestimated values, for example, would not support the volume of water and, consequently, would lead to soil erosion and the burial of plantations, causing serious damage to the environment and to the owners. Thus, and in accordance with the results of Tables 4 and 5, we provide the Q-Q plots and confidence intervals for return levels according to the most accurate probability distribution.

4 Conclusions

The Generalized Pareto distribution was satisfactorily fitted in all months and can be used to estimate extreme levels of maximum rainfall. No positive trend or temporal dependence of monthly maximum rainfall was found.

The rainfall estimates from January to December were calculated for the return periods of 2, 5, 10, 30, 50 and 100 years. The highest estimate was observed in April (with rainfall above 170 mm every 100 years and a 95\(\%\) confidence interval of approximately 140 mm to 220 mm) and the lowest return level was in July (with rainfall near 90 mm every 100 years).

By comparing the distributions through computer simulation, it was possible to identify the most suitable probability distribution for the excesses over a threshold. We chose three measures of fit quality to make the comparisons, and the measures \({\hat{p}}_{MAPE}\) and \({\hat{p}}_{RMSE}\) are obtained as a result. The proposed algorithm could be adapted to other measures of fit quality, such as the Akaike information criterion (AIC), its corrected version (AICc), or the Bayesian information criterion (BIC), among others. The length of the training and test series is another issue that can be discussed. The original series should be as long as possible, and not shorter than 30 years. It is essential to balance the sizes of the training and test series: a long training series helps the fitted model generalize well, whereas a long test series may leave too small a sample to fit a model that reproduces the test series. In our work, for the simulation, we divided the series into 30 years to fit the model and 29 years to calculate the quality measures, totaling 59 years of time series. A longer series allows greater flexibility in splitting the training and test series, and care has to be taken with short series, usually of less than 30 years.

The results have practical implications for assessing the risk of extreme rain events in Uruguaiana, Brazil. The plots are intended to guide the local administration in supporting adaptations, such as the preparation of baseline contingency plans to deal with maximum rainfall based on the current climatology. Studies like this are not yet available for this municipality. Our results will contribute to regional planning and may also be useful for ongoing economic and environmental projects in southern Brazil, as well as for a better understanding of the Pampa biome.