Parameter Estimation for Univariate Hydrological Distribution Using Improved Bootstrap with Small Samples

Water Resources Management

Abstract

Estimating the parameters of a hydrological distribution for hydrological frequency analysis is crucial yet challenging when only small samples are available. This paper proposes an improved Bootstrap and combines it with three commonly used parameter estimation methods, giving the improved Bootstrap with method of moments (IBMOM), with maximum likelihood estimation (IBMLE), and with the maximum entropy principle (IBMEP). A series of numerical experiments is conducted with small samples of size 10, 20, and 30 generated from three commonly used probability distributions (Pearson Type III, Weibull, and Beta) to evaluate the three proposed methods against the conventional Bootstrap and the case without Bootstrap. The proposed methods are then applied to estimate the distribution parameters of the average annual precipitation of 8 counties in Qingyang City, China, assuming a Pearson Type III distribution. The absolute deviation (AD) box plots, root mean square error (RMSE), and bias from both the numerical experiments and the case study show that, for all three distributions, the parameters estimated with the improved Bootstrap methods deviate less and are more accurate than those obtained with the conventional Bootstrap or without Bootstrap. Notably, the relative improvement from the improved Bootstrap grows as the sample size decreases. The improved Bootstrap thus offers a way to reduce the need for large samples in reliable hydrological frequency analysis.
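For orientation, the conventional Bootstrap baseline that the abstract compares against can be sketched as follows. This is a minimal illustration of plain resampling with replacement, not the improved Bootstrap proposed in the paper, whose details are not given in this section. The function names, the choice of the sample mean as the estimator, and the example values are all assumptions made for illustration.

```python
import math
import random
import statistics


def bootstrap_estimates(sample, estimator, n_boot=1000, seed=0):
    """Conventional bootstrap: re-estimate on resamples drawn with replacement."""
    rng = random.Random(seed)
    n = len(sample)
    return [estimator(rng.choices(sample, k=n)) for _ in range(n_boot)]


def rmse(estimates, true_value):
    """Root mean square error of the estimates around a reference value."""
    return math.sqrt(sum((e - true_value) ** 2 for e in estimates) / len(estimates))


def bias(estimates, true_value):
    """Mean of the estimates minus the reference value."""
    return statistics.fmean(estimates) - true_value


# Hypothetical small sample (n = 10); the values are illustrative only.
sample = [412.0, 530.5, 487.2, 601.3, 455.8, 390.1, 520.0, 478.6, 565.4, 498.7]
boot = bootstrap_estimates(sample, statistics.fmean)
```

In a numerical experiment the reference value passed to `rmse` and `bias` would be the known parameter of the generating distribution; any other estimator (for example a moment or likelihood fit) can be passed in place of `statistics.fmean`.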

Availability of Data and Materials

Available from the corresponding author on request.

References

  • Aghakouchak A (2014) Entropy–copula in hydrology and climatology. J Hydrometeorol 15(6):2176–2189

  • Aissia MAB, Chebana F, Ouarda TB, Roy L, Desrochers G, Chartier I, Robichaud É (2012) Multivariate analysis of flood characteristics in a climate change context of the watershed of the Baskatong reservoir, Province of Québec, Canada. Hydrol Process 26(1):130–142

  • Benchohra M, Lazreg JE (2015) On stability for nonlinear implicit fractional differential equations. Matematiche (Catania) 70(2):49–61

  • Bozorg M, Bracale A, Caramia P, Carpinelli G, Carpita M, De Falco P (2020) Bayesian bootstrap quantile regression for probabilistic photovoltaic power forecasting. J Protect Control Modern Power Syst 5(1):1–12

  • Bracken C, Rajagopalan B, Cheng L, Kleiber W, Gangopadhyay S (2016) Spatial Bayesian hierarchical modeling of precipitation extremes over a large domain. Water Resour Res 52(8):6643–6655

  • Candelario G, Cordero A, Torregrosa JR, Vassileva MP (2022) An optimal and low computational cost fractional Newton-type method for solving nonlinear equations. Appl Math Lett 124(1):107650

  • Chen G (2004) Stability of nonlinear systems. Encyc RF Microw Eng 4881–4896

  • De Michele C, Salvadori G (2005) Some hydrological applications of small sample estimators of Generalized Pareto and Extreme Value distributions. J Hydrol 301(1–4):37–53

  • Dosne AGL, Bergstrand M, Harling K, Karlsson MO (2016) Improving the estimation of parameter uncertainty distributions in nonlinear mixed effects models using sampling importance resampling. J Pharmacokinet Pharmacodyn 43(6):583–596

  • Dwivedi AK, Mallawaarachchi I, Alvarado LA (2017) Analysis of small sample size studies using nonparametric bootstrap test with pooled resampling method. Stat Med 36(14):2187–2205

  • Hussain Z, Ahmad I (2021) Effects of L-moments, maximum likelihood and maximum product of spacing estimation methods in using Pearson type-3 distribution for modeling extreme values. Water Resour Manag 35(5):1415–1431

  • Jackson EK, Roberts W, Nelsen B, Williams GP, Nelson EJ, Ames DP (2019) Introductory overview: Error metrics for hydrologic modelling–A review of common practices and an open source library to facilitate use and adoption. Environ Model Softw 119:32–48

  • Jia ZQ, Cai JY, Liang YY (2009) Real-time performance reliability evaluation method of small-sample based on improved Bootstrap and Bayesian Bootstrap. Appl Res Comput 26(8):2851–2854

  • Kong X, Hao Z, Zhu Y (2020) Entropy theory and Pearson type-3 distribution for rainfall frequency analysis in semi-arid region. IOP Conf Ser Earth Environ Sci 495(1):012042

  • Krit M, Gaudoin O, Remy E (2021) Goodness-of-fit tests for the Weibull and extreme value distributions: A review and comparative study. J Commun Stat-Simul Comput 50(7):1888–1911

  • Lei G-J, Wang W-C, Yin J-X, Wang H, Xu D-M, Tian J (2019) Improved fuzzy weighted optimum curve-fitting method for estimating the parameters of a Pearson Type-III distribution. Hydrol Sci J 64(16):2115–2128

  • Lei G-J, Yin J-X, Wang W-C, Wang H (2018) The analysis and improvement of the fuzzy weighted optimum curve-fitting method of Pearson–type III distribution. Water Resour Manag 32(14):4511–4526

  • Liu Y, Brown J, Demargne J, Seo DJ (2011) A wavelet-based approach to assessing timing errors in hydrologic predictions. J Hydrol 397(3–4):210–224

  • Liu D, Wang D, Wang Y, Wu J, Singh VP, Zeng X, Wang L, Chen Y, Chen X, Zhang L (2016a) Entropy of hydrological systems under small samples: Uncertainty and variability. J Hydrol 532:163–176

  • Liu Z, Törnros T, Menzel L (2016b) A probabilistic prediction network for hydrological drought identification and environmental flow assessment. Water Resour Res 52(8):6243–6262

  • Ma M, Song S, Ren L, Jiang S, Song J (2013) Multivariate drought characteristics using trivariate Gaussian and Student t copulas. Hydrol Process 27(8):1175–1190

  • Mindham DA, Tych W, Chappell NA (2018) Extended state dependent parameter modelling with a data-based mechanistic approach to nonlinear model structure identification. Environ Model Softw 104:81–93

  • Qian L, Wang H, Dang S, Wang C, Jiao Z, Zhao Y (2018) Modelling bivariate extreme precipitation distribution for data-scarce regions using Gumbel-Hougaard copula with maximum entropy estimation. Hydrol Process 32(2):212–227

  • Qian L, Zhao Y, Yang J, Li H, Wang H, Bai C (2022) A new estimation method for copula parameters for multivariate hydrological frequency analysis with small sample sizes. Water Resour Manag 36(4):1141–1157

  • Rahmani MA, Zarghami M (2015) The use of statistical weather generator, hybrid data driven and system dynamics models for water resources management under climate change. J Environ Inf 25(1):23–35

  • Rasheed M, Shihab S, Rashid T, Enneffati M (2021) Some step iterative method for finding roots of a nonlinear equation. J Al-Qadisiyah Comput Sci Math 13(1):95–102

  • Razmi A, Mardani-Fard HA, Golian S, Zahmatkesh Z (2022) Time-varying univariate and bivariate frequency analysis of nonstationary extreme sea level for New York City. Environ Process 9(1):1–27

  • Ryu D, Famiglietti JS (2005) Characterization of footprint-scale surface soil moisture variability using Gaussian and Beta distribution functions during the Southern Great Plains 1997 (SGP97) hydrology experiment. Water Resour Res 41(12):4203–4206

  • Shao Y, Lu P, Wang B, Xiang Q (2019) Fatigue reliability assessment of small sample excavator working devices based on Bootstrap method. Frattura Ed Integrità Strutturale 13(48):757–767

  • Singh VP (1998) Entropy-based parameter estimation in hydrology. Springer, Dordrecht

  • Singh VP (2011) Hydrologic synthesis using entropy theory: Review. J Hydrol Eng 16(5):421–433

  • Singh VP, Sivakumar B, Cui H (2017) Tsallis entropy theory for modeling in water engineering: A review. Entropy 19(12):641

  • Song S, Kang Y, Song X, Singh VP (2021) MLE-based parameter estimation for four-parameter exponential gamma distribution and asymptotic variance of its quantiles. Water Resour Manag 13(15):2092

  • Sun P, Wen Q, Zhang Q, Singh VP, Sun Y, Li J (2018) Nonstationarity-based evaluation of flood frequency and flood risk in the Huai River basin, China. J Hydrol 567:393–404

  • Wang C, Chang NB, Yeh GT (2009) Copula-based flood frequency (COFF) analysis at the confluences of river systems. Hydrol Process Int J 23(10):1471–1486

  • Westhoff MC, Zehe E, Schymanski SJ (2014) Importance of temporal variability for hydrological predictions based on the maximum entropy production principle. Geophys Res Lett 41(1):67–73

  • Xia J, Wang G, Tan G, Ye A, Huang G (2005) Development of distributed time-variant gain model for nonlinear hydrological systems. Sci China Ser D Earth Sci 48(6):713–723

  • Yang X, Li Y, Liu Y, Gao P (2020) A MCMC-based maximum entropy copula method for bivariate drought risk analysis of the Amu Darya River Basin. J Hydrol 590:125502

  • Zhang J, Lin G, Li W, Wu L, Zeng L (2018) An iterative local updating ensemble smoother for estimation and uncertainty assessment of hydrologic model parameters with multimodal distributions. Water Resour Res 54(3):1716–1733

  • Zhang M, Liu X, Wang Y, Wang X (2019) Parameter distribution characteristics of material fatigue life using improved bootstrap method. Int J Damage Mech 28(5):772–793

Funding

The study was supported by the National Key Research and Development Program of China (2021YFC3201100), the Belt and Road Special Foundation of the State Key Laboratory of Hydrology-Water Resources and Hydraulic Engineering (Grant No. 2020nkms03), the National Natural Science Foundation of China (Grant No. 41875061), and NUPTSF (Grant Nos. NY219161 and NY220035).

Author information

Contributions

Hanlin Li contributed to project design, model construction, and simulation experiments. Longxia Qian contributed to project design and result analysis. Jianhong Yang carried out the algorithmic programming. Suzhen Dang performed model validation. Mei Hong performed model analysis.

Corresponding author

Correspondence to Longxia Qian.

Ethics declarations

Ethics Approval

The authors confirm that this article is original research and has not been published or presented previously in any journal or conference in any language (in whole or in part).

Consent to Participate and Consent to Publish

The authors declare that they consent to participate and consent to publish.

Competing Interests

The authors have no conflict of interest and are completely satisfied with the publication of their article in the journal Water Resources Management.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendices

Appendix A: Distribution Functions

1.1 Pearson Type III Distribution

The Pearson Type III distribution has three parameters \(\alpha ,\beta ,a_{0}\) and the following probability density function:

$$f(x) = \frac{{\beta^{\alpha } }}{\Gamma (\alpha )}(x - a_{0} )^{\alpha - 1} e^{{ - \beta (x - a_{0} )}}$$
(8)

where \(\alpha > 0\) is the shape parameter, \(\beta > 0\) is the scale parameter (it enters Eq. (8) as a rate, i.e., the reciprocal of a scale), and \(a_{0}\) is the location parameter; \(x > a_{0}\), and \(\Gamma (\alpha ) = \int_{0}^{\infty } {t^{\alpha - 1} e^{ - t} dt}\) is the gamma function.

The distribution function of the Pearson Type III distribution has the following form:

$$F(x) = \int_{{a_{0} }}^{x} {\frac{{\beta^{\alpha } }}{\Gamma (\alpha )}(t - a_{0} )^{\alpha - 1} e^{{ - \beta (t - a_{0} )}} } dt$$
(9)
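As a minimal sketch, Eqs. (8) and (9) can be evaluated numerically as below. The function names and the midpoint-rule integration are assumptions chosen for illustration, not the authors' implementation, and the midpoint rule is only adequate when \(\alpha \ge 1\), where the density stays bounded at \(a_{0}\).

```python
import math


def p3_pdf(x, alpha, beta, a0):
    """Density of Eq. (8); zero at or below the location a0."""
    if x <= a0:
        return 0.0
    z = x - a0
    # Work in log space so that a large alpha does not overflow Gamma(alpha).
    log_f = (alpha * math.log(beta) - math.lgamma(alpha)
             + (alpha - 1.0) * math.log(z) - beta * z)
    return math.exp(log_f)


def p3_cdf(x, alpha, beta, a0, steps=20000):
    """Eq. (9) by the midpoint rule on [a0, x] (assumes alpha >= 1)."""
    if x <= a0:
        return 0.0
    h = (x - a0) / steps
    return h * sum(p3_pdf(a0 + (i + 0.5) * h, alpha, beta, a0) for i in range(steps))
```

For \(\alpha = 1\) the density reduces to a shifted exponential, so `p3_cdf(x, 1.0, beta, a0)` should agree with \(1 - e^{-\beta (x - a_{0})}\), which gives a quick sanity check.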

1.2 Weibull Distribution

This paper considers the two-parameter Weibull distribution, with the following probability density function:

$$f(x) = \frac{a}{b}(\frac{x}{b})^{a - 1} e^{{ - (\frac{x}{b})^{a} }}$$
(10)

where \(a\) is the shape parameter and \(b\) is the scale parameter; \(a > 0\), \(b > 0\).

The distribution function of the Weibull distribution has the following form:

$$F(x) = 1 - e^{{ - (\frac{x}{b})^{a} }}$$
(11)

The Weibull is a two-parameter distribution that can be considered as an inverse generalized extreme value distribution.
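The numerical experiments draw small samples from known distributions. One common way to do this for the Weibull case is inverse-transform sampling, sketched below under the assumption of the standard cumulative distribution function \(F(x) = 1 - e^{-(x/b)^{a}}\). The function names and the seed are illustrative assumptions; the paper does not state how its samples were generated.

```python
import math
import random


def weibull_cdf(x, a, b):
    """Standard two-parameter Weibull CDF, F(x) = 1 - exp(-(x/b)^a) for x > 0."""
    if x <= 0:
        return 0.0
    return 1.0 - math.exp(-((x / b) ** a))


def weibull_quantile(u, a, b):
    """Inverse of the CDF: the x with F(x) = u, for 0 <= u < 1."""
    return b * (-math.log(1.0 - u)) ** (1.0 / a)


def weibull_sample(n, a, b, seed=0):
    """Draw n values by inverse-transform sampling from uniform variates."""
    rng = random.Random(seed)
    return [weibull_quantile(rng.random(), a, b) for _ in range(n)]
```

For example, `weibull_sample(10, 2.0, 3.0)` returns a sample of size 10 with shape 2 and scale 3, matching the smallest sample size used in the experiments.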

1.3 Beta Distribution

The Beta distribution has two parameters \(a,b\) and the following probability density function:

$$f(x) = \frac{\Gamma (a + b)}{{\Gamma (a)\Gamma (b)}}x^{a - 1} (1 - x)^{b - 1}$$
(12)

where \(a > 0\) and \(b > 0\) are the shape parameters and \(0 < x < 1\).

The distribution function of the Beta distribution has the following form:

$$F(x) = \frac{\Gamma (a + b)}{{\Gamma (a)\Gamma (b)}}\int_{0}^{x} {t^{a - 1} (1 - t)^{b - 1} } dt$$
(13)

Appendix B: Parameter Estimation Methods

2.1 Method of Moments (MOM)

  1. MOM for Pearson Type III distribution is as follows:

    $$\left\{ \begin{array}{l} \overline{x} = \alpha \beta + a_{0} \\ \sigma^{2} (x) = \alpha^{2} \beta \\ C_{s} = \frac{2}{\sqrt \beta } \\ \end{array} \right.$$
    (14)
  2. MOM for Weibull distribution is as follows:

    $$\left\{ \begin{array}{l} \overline{x} = b\Gamma (1 + \frac{1}{a}) \\ \sigma^{2} (x) = b^{2} \left[ \Gamma (1 + \frac{2}{a}) - \Gamma^{2} (1 + \frac{1}{a}) \right] \\ \end{array} \right.$$
    (15)
  3. MOM for Beta distribution is as follows:

    $$\left\{ \begin{array}{l} \overline{x} = \frac{a}{a + b} \\ \sigma^{2} (x) = \frac{ab}{(a + b)^{2} (a + b + 1)} \\ \end{array} \right.$$
    (16)

    where \(x\) is the original sample sequence and \(\overline{x},\sigma^{2} (x),C_{s}\) are the sample mean, variance, and skewness coefficient, i.e., the first three sample moments, which give a system of three equations for the three Pearson Type III parameters \(\alpha ,\beta ,a_{0}\). Note that Eq. (14) is written in the parameterization in which \(\alpha\) is the scale, \(\beta\) the shape, and \(a_{0}\) the location parameter, so that \(\alpha\) here corresponds to \(1/\beta\) in Eq. (8) and \(\beta\) here corresponds to \(\alpha\) in Eq. (8); \(a,b\) are the two parameters to be estimated for the Weibull and Beta distributions.
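Eq. (14) can be inverted in closed form, which the following sketch illustrates. The function names are hypothetical, and the use of population (divide-by-\(n\)) moments is an assumption, since the text does not say which sample-moment convention is applied.

```python
import math


def sample_moments(x):
    """Mean, population variance and skewness coefficient of a sample."""
    n = len(x)
    mean = sum(x) / n
    var = sum((v - mean) ** 2 for v in x) / n
    skew = sum((v - mean) ** 3 for v in x) / n / var ** 1.5
    return mean, var, skew


def p3_mom(mean, var, cs):
    """Invert Eq. (14) for (alpha, beta, a0); requires positive skewness cs."""
    if cs <= 0:
        raise ValueError("Eq. (14) needs a positive skewness coefficient")
    beta = 4.0 / cs ** 2           # from Cs = 2 / sqrt(beta)
    alpha = math.sqrt(var / beta)  # from sigma^2 = alpha^2 * beta
    a0 = mean - alpha * beta       # from mean = alpha * beta + a0
    return alpha, beta, a0
```

A typical call is `p3_mom(*sample_moments(data))`. With very small samples the sample skewness is noisy and can even be negative, in which case this inversion is not applicable.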

2.2 Maximum Likelihood Estimation (MLE)

  1. MLE for Pearson Type III distribution is as follows:

    $$\begin{aligned} \ln L(x;\alpha ,\beta ,a_{0} ) & = - n\ln \alpha - n\ln \Gamma (\beta ) + (\beta - 1)\sum\limits_{i = 1}^{n} {\ln (x_{i} - a_{0} )} \\ & \quad - n(\beta - 1)\ln \alpha - \frac{1}{\alpha }\sum\limits_{i = 1}^{n} {(x_{i} - a_{0} )} \\ \end{aligned}$$
    (17)
    $$\left\{ \begin{gathered} \frac{n\beta }{\alpha } = \frac{1}{{\alpha^{2} }}\sum\limits_{i = 1}^{n} {(x_{i} - a_{0} )} \\ n\psi (\beta ) + n\ln \alpha = \sum\limits_{i = 1}^{n} {\ln (x_{i} - a_{0} )} \\ (\beta - 1)\sum\limits_{i = 1}^{n} {(\frac{1}{{x_{i} - a_{0} }})} = \frac{n}{\alpha } \\ \end{gathered} \right.$$
    (18)
  2. MLE for Weibull distribution is as follows:

    $$\ln L(x;a,b) = n\ln (\frac{a}{b}) + (a - 1)\sum\limits_{i = 1}^{n} {\ln (\frac{{x_{i} }}{b})} - \sum\limits_{i = 1}^{n} {(\frac{{x_{i} }}{b})^{a} }$$
    (19)
    $$\left\{ \begin{array}{l} \frac{1}{a} = \frac{{\sum\limits_{i = 1}^{n} {x_{i}^{a} \ln x_{i} } }}{{\sum\limits_{i = 1}^{n} {x_{i}^{a} } }} - \frac{1}{n}\sum\limits_{i = 1}^{n} {\ln x_{i} } \hfill \\ b^{a} = \frac{1}{n}\sum\limits_{i = 1}^{n} {x_{i}^{a} } \hfill \\ \end{array} \right.$$
    (20)
  3. MLE for Beta distribution is as follows:

    $$\ln L(x;a,b) = n\ln \Gamma (a + b) - n\ln \Gamma (a) - n\ln \Gamma (b) + \sum\limits_{i = 1}^{n} {\ln (x_{i}^{a - 1} (1 - x_{i} )^{b - 1} )}$$
    (21)
    $$\left\{ \begin{array}{l} \frac{\partial [\ln \Gamma (a)]}{{\partial (a)}}{ - }\frac{\partial [\ln \Gamma (a + b)]}{{\partial (a)}} = \frac{1}{n}\sum\limits_{i = 1}^{n} {\ln x_{i} } \hfill \\ \frac{\partial [\ln \Gamma (b)]}{{\partial (b)}}{ - }\frac{\partial [\ln \Gamma (a + b)]}{{\partial (b)}} = \frac{1}{n}\sum\limits_{i = 1}^{n} {\ln (1 - x_{i} )} \hfill \\ \end{array} \right.$$
    (22)

    where \(\Gamma ( \cdot )\) and \(\psi ( \cdot )\) are the gamma function and the digamma (psi) function, respectively, which are related by

    $$\begin{aligned} \psi (x) & { = }\int_{0}^{\infty } {[\frac{{e^{ - t} }}{t} - \frac{{e^{ - xt} }}{{1 - e^{ - t} }}]} dt \\ & = \frac{d}{dx}\ln \Gamma (x) = \frac{{\Gamma^{\prime}(x)}}{\Gamma (x)} \\ \end{aligned}$$
    (23)

    where \(x\) must satisfy \({\text{Re}} (x) > 0\), i.e., the real part of the argument of \(\psi\) is greater than 0.
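The Weibull likelihood equations in Eq. (20) have no closed-form solution for \(a\), so they are solved numerically. The sketch below uses bisection on the first line of Eq. (20) and then the second line for \(b\). The function name, the bracketing interval, and the synthetic data are assumptions for illustration; the paper does not specify its solver.

```python
import math
import random


def weibull_mle(x, lo=1e-3, hi=100.0, tol=1e-10):
    """Solve Eq. (20): bisection on the shape a, then closed form for the scale b."""
    n = len(x)
    logs = [math.log(v) for v in x]
    mean_log = sum(logs) / n
    max_log = max(logs)

    def g(a):
        # Weights x_i^a rescaled by max(x)^a to avoid overflow; the ratio is unchanged.
        w = [math.exp(a * (lv - max_log)) for lv in logs]
        return sum(wi * lv for wi, lv in zip(w, logs)) / sum(w) - mean_log - 1.0 / a

    # g is increasing in a, negative near 0 and positive for large a.
    if g(lo) > 0 or g(hi) < 0:
        raise ValueError("shape parameter not bracketed by [lo, hi]")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if g(mid) < 0:
            lo = mid
        else:
            hi = mid
    a = 0.5 * (lo + hi)
    # Second line of Eq. (20): b^a = (1/n) * sum(x_i^a), computed in log space.
    mean_w = sum(math.exp(a * (lv - max_log)) for lv in logs) / n
    b = math.exp(max_log + math.log(mean_w) / a)
    return a, b


# Synthetic check with known parameters (a = 2, b = 3); values are illustrative.
rng = random.Random(1)
data = [rng.weibullvariate(3.0, 2.0) for _ in range(5000)]  # (scale, shape) order
a_hat, b_hat = weibull_mle(data)
```

Bisection is chosen here only because it is simple and cannot diverge; a Newton iteration would converge faster at the cost of needing a derivative and a good starting value.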

2.3 Maximum Entropy Principle (MEP)

Following Singh (1998), Entropy-Based Parameter Estimation in Hydrology, the maximum entropy estimating equations for the three distributions considered in this paper are given below.

  1. MEP for Pearson Type III distribution is as follows:

    $$\left\{ \begin{array}{l} \overline{x} = \alpha \beta + a_{0} \\ \sigma^{2} (x) = \alpha^{2} \beta \\ \frac{1}{n}\sum\limits_{i = 1}^{n} {\ln (x_{i} - a_{0} )} = \psi (\beta ) + \ln \alpha \\ \end{array} \right.$$
    (24)
  2. MEP for Weibull distribution is as follows:

    $$\left\{ \begin{array}{l} b^{a} = \frac{1}{n}\sum\limits_{i = 1}^{n} {x_{i}^{a} } \\ \ln b + \frac{\psi (1)}{a} = \frac{1}{n}\sum\limits_{i = 1}^{n} {\ln x_{i} } \\ \end{array} \right.$$
    (25)
  3. MEP for Beta distribution is as follows:

    $$\left\{ \begin{array}{l} \frac{1}{n}\sum\limits_{i = 1}^{n} {\ln x_{i} } = \psi (a) - \psi (a + b) \hfill \\ \frac{1}{n}\sum\limits_{i = 1}^{n} {\ln (1 - x_{i} )} = \psi (b) - \psi (a + b) \hfill \\ \end{array} \right.$$
    (26)

The parameters in Eqs. (24) to (26) have the same meaning as in the previous two subsections.
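The right-hand sides of Eq. (26) are the theoretical expectations of \(\ln x\) and \(\ln (1 - x)\) under the Beta density of Eq. (12). The sketch below illustrates this for the first line: it implements \(\psi\) with the standard recurrence and asymptotic series, then compares \(\psi (a) - \psi (a + b)\) with a numerically integrated \(E[\ln X]\). The function names, the switch-over point of 6, and the midpoint rule are assumptions for illustration and are not taken from the paper.

```python
import math


def digamma(x):
    """psi(x) for real x > 0: recurrence up to x >= 6, then the asymptotic series."""
    if x <= 0:
        raise ValueError("this sketch handles x > 0 only")
    result = 0.0
    while x < 6.0:
        result -= 1.0 / x  # psi(x) = psi(x + 1) - 1/x
        x += 1.0
    inv2 = 1.0 / (x * x)
    # psi(x) ~ ln x - 1/(2x) - 1/(12x^2) + 1/(120x^4) - 1/(252x^6)
    series = inv2 * (1.0 / 12.0 - inv2 * (1.0 / 120.0 - inv2 / 252.0))
    return result + math.log(x) - 0.5 / x - series


def beta_mean_log(a, b, steps=20000):
    """Midpoint-rule estimate of E[ln X] under the Beta density of Eq. (12)."""
    log_norm = math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
    h = 1.0 / steps
    total = 0.0
    for i in range(steps):
        x = (i + 0.5) * h
        log_pdf = log_norm + (a - 1.0) * math.log(x) + (b - 1.0) * math.log(1.0 - x)
        total += math.log(x) * math.exp(log_pdf)
    return total * h
```

For \(a = 2\), \(b = 3\), both `digamma(2.0) - digamma(5.0)` and `beta_mean_log(2.0, 3.0)` should be close to \(-13/12\), since \(\psi (5) - \psi (2) = 1/2 + 1/3 + 1/4\).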

Appendix C: Empirical Analysis

Fig. 9 P-P plots of the EFC and TFC using unexpanded and expanded data in Huanxian County

Fig. 10 P-P plots of the EFC and TFC using unexpanded and expanded data in Qingcheng County

Fig. 11 P-P plots of the EFC and TFC using unexpanded and expanded data in Zhenyuan County

Fig. 12 P-P plots of the EFC and TFC using unexpanded and expanded data in Huachi County

Fig. 13 P-P plots of the EFC and TFC using unexpanded and expanded data in Nianxian County

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Li, H., Qian, L., Yang, J. et al. Parameter Estimation for Univariate Hydrological Distribution Using Improved Bootstrap with Small Samples. Water Resour Manage 37, 1055–1082 (2023). https://doi.org/10.1007/s11269-022-03410-y

  • DOI: https://doi.org/10.1007/s11269-022-03410-y
