Applied Water Science, Volume 3, Issue 1, pp 13–18

Moment estimators of the GEV distribution for the minima

Open Access
Original Article

Abstract

The moment (MOM) estimators for the parameters, quantiles and confidence limits of the general extreme value distribution for the minima are presented for application in low flow frequency analysis. The procedures to compute the parameters, the design events (quantiles) for several return periods and their confidence limits are shown in the paper. Two goodness of fit measures are included to compare the proposed methodology with competing models. A full example of application is presented to show how easy it is to apply the proposed methodology.

Keywords

Low flow · Frequency analysis · Parameter estimation · Confidence limits · Method of moments

Introduction

Low flow frequency analysis is a subject of paramount interest in the planning and design of water works. Because design values are linked to a return period or to an exceedance probability, mathematical models known as probability distribution functions are required. Among the probability distribution functions most widely used in hydrology for low flow frequency analysis are (Kite 1988; Salas and Smith 1980; Rao and Hamed 2000; Raynal-Villasenor 2010):
  1. Three-parameter log-normal (LN3)
  2. Pearson type III (PIII)
  3. Extreme value type III (EVIII)
  4. General extreme value for the minima (GEVM)

The first three probability distribution functions were applied to low flow frequency analysis by Matalas (1963). Gumbel (1958) developed the theoretical grounds and hydrological applications of the extreme value type III distribution for the minima (EVIIIM), the well-known Weibull distribution. This distribution has been applied since the first third of the twentieth century to the analysis of the dynamic breaking strength of materials (Weibull 1939a, b). Kite (1988) provided a computer program to estimate the parameters of the EVIIIM distribution using the methods of moments (MOM) and maximum likelihood (ML). More recently, Lee and Kim (2008) used the two-parameter Weibull distribution with Bayesian Markov chain Monte Carlo and maximum likelihood estimates to assess the uncertainty of low flow frequency analysis. The ML estimation of the parameters of the EVIIIM distribution presents some difficulties when the Newton–Raphson method is used, as pointed out by Offinger (1996). Durrans and Tomic (2001) compared five methods of parameter estimation for the log-normal distribution in fitting its lower tail. Smakhtin (2001) reviewed 20 years of research results in low flow hydrology. Yue and Wang (2004) studied the scaling of Canadian low flows in order to regionalize them. Taha et al. (2008) presented a brief review of the statistical models commonly used to estimate low flows, both at sites with a reliable stream flow record and at sites remote from data sources. Hao and Singh (2009) applied the maximum entropy method to the Burr III distribution and compared the results with those of MOM, ML and probability-weighted moments (PWM); they found no differences in the quantiles for small return periods, but the differences increased for large return periods. Iacobellis (2008) studied the evaluation of a flow duration curve with an assigned T-year return period using beta and complementary beta distributions.

The use of the general extreme value distribution for the minima (GEVM) with moment estimators for the parameters, quantiles and confidence limits is proposed in this paper. A complete example of application of the proposed methodology is included, using the common spreadsheet framework provided by Excel® (Excel is a registered trademark of Microsoft Corporation, Inc.). The results are compared with those of the other three distribution functions mentioned above.

Probability distribution and density functions of the GEVM

The probability distribution function of the GEVM distribution is (Raynal-Villasenor and Douriet-Cardenas 1994):
$$ \Uppi (x) = \exp \left\{ { - [1 - \beta (\omega - x)/\alpha ]^{1/\beta } } \right\} $$
(1)
where ω, α and β are the location, scale and shape parameters, respectively, and Π(x) is the probability distribution function of the random variable x, which in low flow frequency analysis equals the exceedance probability, Pr(X > x). The scale parameter must satisfy α > 0. The domain of the variable x in the GEVM distribution is as follows:
  1. For β < 0:
$$ - \infty < x \le \omega - \alpha /\beta $$
(2)
  2. For β > 0:
$$ \omega - \alpha /\beta \le x < \infty $$
(3)
The probability density function for the GEVM distribution is (Raynal-Villasenor and Douriet-Cardenas 1994):
$$ \pi (x) = \frac{1}{\alpha }\exp \left\{ { - [1 - \beta (\omega - x)/\alpha ]^{1/\beta } } \right\}[1 - \beta (\omega - x)/\alpha ]^{1/\beta - 1} $$
(4)
where π(x) is the probability density function of the random variable x.
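As an illustration, Eqs. (1) and (4) can be evaluated with a few lines of Python. This is only a minimal sketch: the function names `gevm_exceedance` and `gevm_density` are hypothetical, β ≠ 0 and α > 0 are assumed, and values of x outside the domain are handled by returning the limiting probability at the bound ω − α/β.

```python
import math


def gevm_exceedance(x, omega, alpha, beta):
    """Exceedance probability Pr(X > x) of Eq. (1); assumes alpha > 0, beta != 0."""
    u = 1.0 - beta * (omega - x) / alpha
    if u <= 0.0:
        # x lies beyond the bound omega - alpha/beta:
        # beta < 0 -> x is above the upper bound, so Pr(X > x) = 0
        # beta > 0 -> x is below the lower bound, so Pr(X > x) = 1
        return 0.0 if beta < 0 else 1.0
    return math.exp(-u ** (1.0 / beta))


def gevm_density(x, omega, alpha, beta):
    """Probability density of Eq. (4); zero outside the domain."""
    u = 1.0 - beta * (omega - x) / alpha
    if u <= 0.0:
        return 0.0
    return math.exp(-u ** (1.0 / beta)) * u ** (1.0 / beta - 1.0) / alpha
```

Because Π(x) is an exceedance probability, the density returned by `gevm_density` is the negative of the derivative of `gevm_exceedance` with respect to x, which gives a simple numerical consistency check.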

Moment estimators for the parameters of the GEVM distribution

The moment estimators for the parameters of the GEVM distribution have the following expressions:
  1. Location parameter:
$$ \hat{\omega } = \hat{A} + \frac{{\hat{\alpha }}}{{\hat{\beta }}} $$
(5)
$$ \hat{A} = \hat{\mu } - \frac{{\hat{\alpha }}}{{\hat{\beta }}}\Upgamma (1 + \hat{\beta }) $$
(6)
  2. Scale parameter:
$$ \hat{\alpha } = |\hat{\beta }|\,\hat{B} $$
(7)
$$ \hat{B} = \left( {\frac{{\hat{\sigma }^{2} }}{{\sigma_{z}^{2} }}} \right)^{1/2} = \frac{{\hat{\sigma }}}{{\sigma_{z} }} $$
(8)
$$ \sigma_{z}^{2} = \Upgamma (1 + 2\beta ) - \Upgamma^{2} (1 + \beta ) $$
(9)
  3. Shape parameter:
For β < 0 and \( - 19.0 < \hat{\gamma } \le - 1.1396 \):
$$ \hat{\beta } = 0.24662 + 0.286678\hat{\gamma } + 0.072454\hat{\gamma }^{2} + 0.010176\hat{\gamma }^{3} + 0.000816\hat{\gamma }^{4} + 0.000037\hat{\gamma }^{5} $$
(10)
For β > 0 and \( - 1.1396 \le \hat{\gamma } < 11.35 \):
$$ \hat{\beta } = 0.279434 + 0.333535\hat{\gamma } + 0.048305\hat{\gamma }^{2} - 0.024414\hat{\gamma }^{3} + 0.003765\hat{\gamma }^{4} + 0.000263\hat{\gamma }^{5} $$
(11)
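where \( \hat{\omega } \), \( \hat{\alpha } \) and \( \hat{\beta } \) are the moment estimators of the location, scale and shape parameters of the GEVM distribution, \( \hat{\mu } \), \( \hat{\sigma } \) and \( \hat{\gamma } \) are the sample mean, standard deviation and skewness coefficient, and Γ(·) is the complete gamma function.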
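The estimation steps can be sketched in Python as follows. This is a hedged illustration, not the spreadsheet used in the paper. Two assumptions are made: the shape parameter is obtained by bisection on the theoretical skewness of the GEVM variate (a swapped-in technique, used here in place of the polynomial fits of Eqs. (10) and (11)), and the scale is taken as |β̂|·σ̂/σ_z, which follows from the variance of the quantile form of Eq. (1). All function names are hypothetical.

```python
import math


def gevm_skewness(beta):
    """Theoretical skewness of the GEVM variate for a given shape (beta > -1/3)."""
    if abs(beta) < 1e-3:
        return -1.1396  # limiting value as beta -> 0 (boundary of Eqs. 10-11)
    g1, g2, g3 = (math.gamma(1.0 + r * beta) for r in (1, 2, 3))
    third = g3 - 3.0 * g1 * g2 + 2.0 * g1 ** 3
    return math.copysign(1.0, beta) * third / (g2 - g1 ** 2) ** 1.5


def solve_shape(skew, lo=-0.30, hi=5.0, tol=1e-10):
    """Bisection for beta such that gevm_skewness(beta) equals the sample skewness."""
    if not gevm_skewness(lo) <= skew <= gevm_skewness(hi):
        raise ValueError("skewness outside the bracketed range")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if gevm_skewness(mid) < skew:  # skewness increases with beta
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def gevm_moment_estimates(mean, std, skew):
    """Return (omega, alpha, beta) from the sample mean, std and skewness."""
    beta = solve_shape(skew)
    g1 = math.gamma(1.0 + beta)
    sigma_z = math.sqrt(math.gamma(1.0 + 2.0 * beta) - g1 ** 2)  # Eq. (9)
    b_hat = std / sigma_z                                         # Eq. (8)
    alpha = abs(beta) * b_hat          # assumed scale relation, keeps alpha > 0
    a_hat = mean - (alpha / beta) * g1                            # Eq. (6)
    omega = a_hat + alpha / beta                                  # Eq. (5)
    return omega, alpha, beta
```

For a sample skewness of exactly 2 the shape is β = 1, so `gevm_moment_estimates(10.0, 2.0, 2.0)` returns values close to (10, 2, 1). The sketch is not meant for skewness values at the boundary −1.1396, where β tends to zero and the ratio α/β is undefined.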

Design values for the GEVM distribution

The design values (quantiles) for the GEVM distribution can be obtained by inverting the GEVM distribution function:
$$ Q_{T} = \omega + \frac{\alpha }{\beta }\left\{ {\left[ { - {\text{Ln}}\left( {1 - \frac{1}{{T_{\text{r}} }}} \right)} \right]^{\beta } - 1} \right\} $$
(12)
where \( Q_{T} \) is the design value and \( T_{\text{r}} \) is the return period associated with it.
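A minimal Python sketch of Eq. (12) is given below. The function name `gevm_quantile` is hypothetical, and the sketch assumes a return period greater than 1 and β ≠ 0.

```python
import math


def gevm_quantile(t_r, omega, alpha, beta):
    """Design value Q_T of Eq. (12) for a return period t_r > 1."""
    # For minima the non-exceedance probability is 1/t_r,
    # so the exceedance probability in Eq. (1) is 1 - 1/t_r.
    y = -math.log(1.0 - 1.0 / t_r)
    return omega + (alpha / beta) * (y ** beta - 1.0)
```

Substituting the returned value back into Eq. (1) recovers the exceedance probability 1 − 1/T_r, and for fixed parameters the design value decreases as the return period grows, as expected for low flows.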

Confidence limits for the design values of the GEVM distribution

The moment confidence limits for the GEVM distribution are computed through the following formula:
$$ x_{\rm l} = Q_{T} \pm z_{\alpha } S_{T} $$
(13)
where \( x_{\text{l}} \) is the confidence limit (lower or upper), \( Q_{T} \) is the design value, \( z_{\alpha } \) is the standard normal value corresponding to a confidence level of α, and \( S_{T} \) is the standard deviation of the estimates. The square of this standard deviation has the form:
$$ S_{T}^{2} = \frac{{\mu_{2} }}{N}\left\{ {1 + K_{T} \hat{\gamma } + \frac{{K_{T}^{2} }}{4}(\hat{\kappa } - 1) + \frac{{\partial K_{T} }}{{\partial \hat{\gamma }}}\left[ {2\hat{\kappa } - 3\hat{\gamma }^{2} - 6 + K_{T} \left( {\hat{\lambda }_{1} - \frac{{6\hat{\gamma }\,\hat{\kappa }}}{4} - \frac{{10\hat{\gamma }}}{4}} \right)} \right]} \right. $$
(14)
where \( S_{T}^{2} \) is the variance of the estimates, \( \mu_{2} \) is the sample variance, N is the sample size, \( K_{T} \) is the frequency factor, \( \hat{\gamma } \) is the skewness coefficient, \( \hat{\kappa } \) is the kurtosis coefficient and \( \hat{\lambda }_{1} \) is a function of moments. The frequency factor is:
$$ K_{T} \, = B_{K} \,\left[ {\left( { - {\text{Ln}}\left( {1 - \frac{1}{{T_{r} }}} \right)} \right)^{\beta } - A_{K} } \right] $$
(15)
$$ A_{K} = \Upgamma (1 + \beta ) $$
(16)
$$ B_{K} = \frac{1}{{\left[ {\Upgamma (1 + 2\beta ) - \Upgamma^{2} (1 + \beta )} \right]^{1/2} }} $$
(17)
$$ \frac{{\partial K_{T} }}{\partial \gamma } = \left( {\frac{{\partial K_{T} }}{\partial \beta }} \right)\left( {\frac{\partial \beta }{\partial \gamma }} \right) $$
(18)
$$ \frac{{\partial K_{T} }}{\partial \beta } = \left\{ {\frac{{[y^{\beta } {\text{Ln}}(y) - G_{1} P_{1} ] - \frac{1}{2}[y^{\beta } - G_{1} ]\left[ {G_{2} - G_{1}^{2} } \right]^{ - 1} \left[ {G_{2} P_{2} - 2G_{1}^{2} P_{1} } \right]}}{{\left[ {G_{2} - G_{1}^{2} } \right]^{1/2} }}} \right\} $$
(19)
$$ \frac{\partial \gamma }{\partial \beta } = \left\{ {\frac{{\left[ {G_{3} P_{3} + 3G_{1} \left( {P_{1} \left( {G_{1}^{2} - G_{2} } \right) - G_{2} P_{2} } \right)} \right] - \frac{3}{2}\left[ {G_{2} - G_{1}^{2} } \right]^{ - 1} \left[ {G_{3} - 3G_{1} G_{2} + 2G_{1}^{3} } \right]\left[ {G_{2} P_{2} - 2G_{1}^{2} P_{1} } \right]}}{{\left[ {G_{2} - G_{1}^{2} } \right]^{3/2} }}} \right\} $$
(20)
$$ y = - {\text{Ln}}\left( {1 - \frac{1}{{T_{r} }}} \right) $$
(21)
$$ G_{r} = \Upgamma (1 + r\beta ) $$
(22)
$$ P_{r} = \psi (1 + r\beta ) $$
(23)
where ψ(·) is the digamma function. Substituting Eqs. (19) and (20) into Eq. (18) gives:
$$ \frac{{\partial K_{T} }}{\partial \gamma } = \frac{{\left[ {G_{2} - G_{1}^{2} } \right]\left\{ {[y^{\beta } {\text{Ln}}(y) - G_{1} P_{1} ]} \right\} - \frac{1}{2}\left[ {y^{\beta } - G_{1} } \right]\left[ {G_{2} P_{2} - 2G_{1}^{2} P_{1} } \right]}}{{\left\{ {\left[ {G_{3} P_{3} + 3G_{1} \left( {P_{1} \left( {G_{1}^{2} - G_{2} } \right) - G_{2} P_{2} } \right)} \right] - \frac{3}{2}\left[ {G_{2} - G_{1}^{2} } \right]^{ - 1} \left[ {G_{3} - 3G_{1} G_{2} + 2G_{1}^{3} } \right]\left[ {G_{2} P_{2} - 2G_{1}^{2} P_{1} } \right]} \right\}}} $$
(24)
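The frequency factor of Eqs. (15) to (17) and the limits of Eq. (13) can be sketched in Python as shown below. The function names are hypothetical, the standard error S_T is taken as an input that must be computed separately from Eq. (14), and the confidence level is interpreted as two-sided, which is an assumption since the text does not state it.

```python
import math
from statistics import NormalDist


def frequency_factor(t_r, beta):
    """Frequency factor K_T of Eqs. (15)-(17) as printed."""
    y = -math.log(1.0 - 1.0 / t_r)
    a_k = math.gamma(1.0 + beta)                                     # Eq. (16)
    b_k = 1.0 / math.sqrt(math.gamma(1.0 + 2.0 * beta) - a_k ** 2)   # Eq. (17)
    return b_k * (y ** beta - a_k)                                   # Eq. (15)


def confidence_limits(q_t, s_t, level=0.95):
    """Lower and upper limits of Eq. (13), assuming a two-sided level."""
    z = NormalDist().inv_cdf(0.5 + level / 2.0)  # about 1.96 for level = 0.95
    return q_t - z * s_t, q_t + z * s_t
```

For β > 0 the design value of Eq. (12) can also be written as the sample mean plus K_T times the sample standard deviation, which offers a quick check on a spreadsheet implementation.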

Goodness of fit tests for the parameters of the GEVM distribution

The two goodness of fit tests considered in this paper are:
  1. Standard error of fit, SEF (Kite 1988):
    $$ {\text{SEF}} = \left[ {\frac{{\sum\nolimits_{i = 1}^{N} {(x_{i} - y_{i} )^{2} } }}{{(N - n_{\text{p}} )}}} \right]^{1/2} $$
    (25)
    where \( x_{i} \) are the historical values of the sample in descending order, \( y_{i} \) are the values produced by the distribution function for the same return periods as the historical values, N is the sample size, and \( n_{\text{p}} \) is the number of parameters of the distribution function; in this case, \( n_{\text{p}} = 3 \).
  2. Mean absolute relative deviation, MARD (Jain and Singh 1987):
    $$ {\text{MARD}} = \frac{100}{N}\sum\limits_{i = 1}^{N} {\left| {\frac{{(x_{i} - y_{i} )}}{{x_{i} }}} \right|} $$
    (26)
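Both measures are simple to compute once the observed values and the fitted values for the same return periods are available. The following Python sketch of Eqs. (25) and (26) uses hypothetical function names and assumes that the two sequences are already paired in the same order and that no observed value is zero.

```python
import math


def standard_error_of_fit(observed, fitted, n_params=3):
    """SEF of Eq. (25); n_params = 3 for the GEVM distribution."""
    n = len(observed)
    squared = sum((x - y) ** 2 for x, y in zip(observed, fitted))
    return math.sqrt(squared / (n - n_params))


def mean_abs_relative_deviation(observed, fitted):
    """MARD of Eq. (26), expressed in percent."""
    n = len(observed)
    return 100.0 / n * sum(abs((x - y) / x) for x, y in zip(observed, fitted))
```

Smaller values of either measure indicate a closer fit, so competing distributions fitted to the same sample can be ranked by them.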
     

Numerical example

Gauging station Villalba, located on the San Pedro River in northwestern Mexico, was selected to analyze its sample of annual one-day low flows using the GEVM distribution with MOM estimation of its parameters, design values and confidence limits.

The geographical location of gauging station Villalba, Mexico is shown in Fig. 1.
Fig. 1

Location of gauging station Villalba, Mexico

The first step in the computations is to obtain the basic statistics of the one-day low flow sample. These statistics were obtained with the common spreadsheet framework provided by Excel® and are shown in Fig. 2.
Fig. 2

Data statistics for gauging station Villalba, Mexico
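The basic statistics needed by the moment estimators are the sample mean, standard deviation and skewness coefficient. A minimal Python sketch is given below. It assumes the (n − 1) sample standard deviation and the bias-adjusted skewness formula implemented by the Excel functions STDEV and SKEW; the paper does not state which estimators were used, and the function name `sample_statistics` is hypothetical.

```python
import math


def sample_statistics(flows):
    """Return (mean, std, skew) of a low flow sample with at least 3 values."""
    n = len(flows)
    mean = sum(flows) / n
    # Sample standard deviation with the n - 1 divisor (assumed convention)
    std = math.sqrt(sum((q - mean) ** 2 for q in flows) / (n - 1))
    # Bias-adjusted skewness coefficient (assumed convention)
    skew = n / ((n - 1) * (n - 2)) * sum(((q - mean) / std) ** 3 for q in flows)
    return mean, std, skew
```

The three returned values correspond to the sample mean, standard deviation and skewness coefficient used as inputs to the moment estimators.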

The parameters, the goodness of fit measures, and the design values and their confidence limits, also obtained with the Excel® spreadsheet, are shown in Fig. 3.
Fig. 3

Estimation of parameters (GEVM-MOM) and goodness of fit measures for gauging station Villalba, Mexico

The comparison between the histogram of the low flow data and the theoretical probability density function is shown in Fig. 4. Figure 5 shows the empirical and theoretical frequency curves obtained by fitting the GEVM, PIII, EVIIIM and LN3 distributions by MOM to the one-day low flow sample of gauging station Villalba, Mexico. Figure 6 shows the design values estimated by MOM together with their confidence limits. All of these figures were produced with the Excel® spreadsheet.
Fig. 4

Histogram and theoretical probability density function for gauging station Villalba, Mexico

Fig. 5

Empirical and theoretical frequency curves for several models applied to 1-day low flow data at gauging station Villalba, Mexico

Fig. 6

Empirical and theoretical frequency curves and their confidence limits for one-day low flow data at gauging station Villalba, Mexico

Discussion of results

The ease of use of the proposed methodology has been shown through the numerical example. With the common spreadsheet framework provided by Excel®, the user has the formulas and results in sight at all times, so a possible error can be spotted easily.

The tables shown in Fig. 3 contain all the results required for a low flow frequency analysis of a particular set of low flow data: the values of the parameters, the goodness of fit measures, the design values for several return periods and their confidence limits. Two different measures of goodness of fit are provided to choose among competing models.

The graphs produced with the Excel® spreadsheet show how well a particular probability distribution function fits a particular set of data. They comprise the graph of the low flow data and the fitted model (Fig. 5), the graph of the theoretical probability density function and the histogram of the low flow data (Fig. 4), and the graph of the confidence limits, the fitted model and the low flow data (Fig. 6).

Conclusions

A methodology has been presented for low flow frequency analysis using the GEVM distribution coupled with the MOM method. The common spreadsheet framework provided by Excel® is particularly useful in education and training. The proposed methodology compares well with the existing probability distribution functions when the MOM method is applied. Its straightforward application to real data, as shown in the example contained in the paper, makes it a versatile tool for training students or technical personnel in the field with a personal computer and a printer.

References

  1. Durrans SR, Tomic S (2001) Comparison of parametric tail estimators for low flow frequency analysis. J Am Wat Resour Assoc 37(5):1203–1214. doi:10.1111/j.1752-1688.2001.tb03632.x
  2. Gumbel EJ (1958) Statistics of extremes. Columbia University Press, New York
  3. Hao Z, Singh VP (2009) Entropy-based parameter estimation for extended three-parameter Burr III distribution for low flow frequency analysis. Trans ASABE 52(4):1193–1202
  4. Iacobellis V (2008) Probabilistic model for the estimation of the T-year flow duration curves. Wat Resour Res 44(2), article number W02413. doi:10.1029/2006WR005400
  5. Jain D, Singh VP (1987) Estimating parameters of EV1 distribution for flood frequency analysis. Wat Resour Bull 23(1):59–71
  6. Kite GW (1988) Flood and risk analyses in hydrology. Water Resources Publications, Littleton
  7. Lee KS, Kim SU (2008) Identification of uncertainty in low flow frequency analysis using Bayesian MCMC method. Hydrol Proc 22(12):1949–1964. doi:10.1002/hyp.6778
  8. Matalas NC (1963) Probability distribution of low flows: USGS professional paper no 434-A. US Printing Office, Washington
  9. Offinger R (1996) Shätzer in drei Parametrigen Weibull-Modelen und Untersuchung iher Eigenshaften mittels Simulation. Diplomarbeit Institut für Mathematik de Universität Ausburg, Ausburg
  10. Rao R, Hamed KH (2000) Flood frequency analysis. CRC Press, Boca Raton
  11. Raynal-Villasenor JA (2010) Frequency analysis of hydrologic extremes. http://www.lulu.com/spotlight/flodro4dot0atgmaildotcom. Accessed 10 Sept 2011
  12. Raynal-Villasenor JA, Douriet-Cardenas JC (1994) Moment parameter estimators for the general extreme value distribution for the minima. Hydrol Sci Tech J 10(1–4):118–125
  13. Salas JD, Smith R (1980) Computer programs of distribution functions in hydrology. Colorado State University, Fort Collins
  14. Smakhtin V (2001) Low flow hydrology: a review. J Hydrol 240(3–4):147–186. doi:10.1016/S0022-1694(00)00340-1
  15. Taha BMJO, Charron C, St-Hilaire A (2008) Statistical model and the estimation of low flows. Can Wat Resour J 33(2):195–205
  16. Weibull W (1939a) A statistical theory of strength of materials: Ingeniörs Vetenskaps Akademien Handlingar, No. 151. Generalstabens Litografiska Anstalts Förlang, Stockholm
  17. Weibull W (1939b) The phenomenon of rupture of solids: Ingeniörs Vetenskaps Akademien Handlingar, No. 153. Generalstabens Litografiska Anstalts Förlang, Stockholm
  18. Yue S, Wang CY (2004) Scaling of Canadian low flows. Stoch Environ Res Risk Assess 18(5):291–305. doi:10.1007/s00477-004-0176-6

Copyright information

© The Author(s) 2012

This article is published under license to BioMed Central Ltd. Open Access This article is distributed under the terms of the Creative Commons Attribution License which permits any use, distribution, and reproduction in any medium, provided the original author(s) and the source are credited.

Authors and Affiliations

  1. Universidad de las Americas Puebla, San Andres Cholula, Mexico
