Using maximum likelihood to derive various distance-based goodness-of-fit indicators for hydrologic modeling assessment

  • Original Paper
  • Stochastic Environmental Research and Risk Assessment

Abstract

Currently used goodness-of-fit (GOF) indicators (i.e. efficiency criteria) are largely empirical, and different GOF indicators emphasize different aspects of model performance; a thorough assessment of model skill may therefore require robust skill metrics. In this study, a statistical measure termed the BC-GED error model is proposed on the basis of the maximum likelihood method. It first uses the Box–Cox (BC) transformation to remove the heteroscedasticity of model residuals, and then fits the distribution of the transformed residuals with the zero-mean generalized error distribution (GED). Various distance-based GOF indicators can be expressed explicitly by the BC-GED error model for different values of the BC transformation parameter λ and the GED kurtosis coefficient β. Our study proves that (1) the shape of the error distribution implied by a GOF indicator affects model performance on high or low flows, because a large error power (β) implies a low probability of large residuals, whereas a small β implies a high probability of zero residuals; and (2) the mean absolute error can balance the consideration of low and high flows, because its assumed error distribution (the Laplace distribution, β = 1) is the turning point of the GED derivative at zero. A case study in the Baocun watershed, comparing SWAT model calibration results for six distance-based GOF indicators, shows that even though the formal BC-GED model is theoretically reasonable, the calibrated model parameters do not always yield high-performing simulations, because the hydrologic model is imperfect. Nevertheless, the distance-based GOF indicators derived by the maximum likelihood method offer an easy way to choose GOF indicators for different study purposes and to develop multi-objective calibration strategies.

References

  • Andréassian V, Le Moine N, Perrin C, Ramos MH, Oudin L, Mathevet T, Lerat J, Berthet L (2012) All that glitters is not gold: the case of calibrating hydrological models. Hydrol Process 26:2206–2210

  • Baratti R, Cannas B, Fanni A, Pintus M, Sechi GM, Toreno N (2003) River flow forecast for reservoir management through neural networks. Neurocomputing 55:421–437

  • Bates BC, Campbell EP (2001) A Markov Chain Monte Carlo scheme for parameter estimation and inference in conceptual rainfall-runoff modeling. Water Resour Res 37(4):937–947

  • Bennett ND, Croke BFW, Guariso G, Guillaume JHA, Hamilton SH, Jakeman AJ, Marsili-Libelli S, Newham LTH, Norton JP, Perrin C, Pierce SA, Robson B, Seppelt R, Voinov AA, Fath BD, Andreassian V (2013) Characterising performance of environmental models. Environ Model Softw 40:1–20

  • Beven K, Binley A (1992) The future of distributed models: model calibration and uncertainty prediction. Hydrol Process 6:279–298

  • Beven K, Binley A (2013) GLUE: 20 years on. Hydrol Process. https://doi.org/10.1002/hyp.10082

  • Beven K, Smith P, Westerberg I, Freer J (2012) Comment on: “Pursuing the method of multiple working hypotheses for hydrological modeling” by P. Clark et al. Water Resour Res 48(11):W11801

  • Box GEP, Cox DR (1964) An analysis of transformations. J R Stat Soc B 26:211–252

  • Chen X, Cheng Q-B, Chen YD, Smettem K, Xu C-Y (2010) Simulating the integrated effects of topography and soil properties on runoff generation in hilly forested catchments, South China. Hydrol Process 24:714–725

  • Cheng Q-B, Chen X, Xu C-Y, Reinhardt-Imjela C, Schulte A (2014) Improvement and comparison of likelihood functions for model calibration and parameter uncertainty analysis within a Markov chain Monte Carlo scheme. J Hydrol 519(27):2202–2214

  • Cheng Q-B, Reinhardt-Imjela C, Chen X, Schulte A, Ji X, Li F-L (2016) Improvement and comparison of the rainfall-runoff methods in SWAT at the monsoonal watershed of Baocun, Eastern China. Hydrol Sci J 61(8):1460–1476

  • Clarke RT (1973) A review of some mathematical models used in hydrology, with observations on their calibration and use. J Hydrol 19:1–20

  • Dawson CW, Abrahart RJ, See LM (2007) HydroTest: a web-based toolbox of evaluation metrics for the standardised assessment of hydrological forecasts. Environ Model Softw 22:1034–1052

  • Engeland K, Renard B, Steinsland I, Kolberg S (2010) Evaluation of statistical models for forecast errors from the HBV model. J Hydrol 384(1–2):142–155

  • Evin G, Kavetski D, Thyer M, Kuczera G (2013) Pitfalls and improvements in the joint inference of heteroscedasticity and autocorrelation in hydrological model calibration. Water Resour Res 49:1–7

  • FAO/IIASA/ISRIC/ISSCAS/JRC (2009) Harmonized world soil database (version 1.1). FAO, IIASA, Rome, Laxenburg

  • Feyen L, Vrugt JA, Nuallain BO, van der Knijff J, Roo AD (2007) Parameter optimization and uncertainty assessment for large-scale streamflow simulation with the LISFLOOD model. J Hydrol 332:276–289

  • Freni G, Mannina G (2012) Uncertainty estimation of a complex water quality model: the influence of Box–Cox transformation on Bayesian approaches and comparison with a non-Bayesian method. Phys Chem Earth 42–44:31–41

  • Green IRA, Stephenson D (1986) Criteria for comparison of single event models. Hydrol Sci J 31(3):395–411

  • Gupta HV, Sorooshian S, Yapo PO (1998) Toward improved calibration of hydrologic models: multiple and noncommensurable measures of information. Water Resour Res 34:751–763. https://doi.org/10.1029/97WR03495

  • Gupta HV, Kling H, Yilmaz KK, Martinez GF (2009) Decomposition of the mean squared error and NSE performance criteria: implications for improving hydrological modeling. J Hydrol 377:80–91

  • Jain SK, Sudheer KP (2008) Fitting of hydrologic models: a close look at the Nash-Sutcliffe Index. J Hydrol Eng 13:981–986

  • Krause P, Boyle DP, Bäse F (2005) Comparison of different efficiency criteria for hydrological model assessment. Adv Geosci 5:89–97

  • Laloy E, Fasbender D, Bielders CL (2010) Parameter optimization and uncertainty analysis for plot-scale continuous modeling of runoff using a formal Bayesian approach. J Hydrol 380(1–2):82–93

  • Legates DR, McCabe GJ (1999) Evaluating the use of “goodness-of-fit” measures in hydrologic and hydroclimatic model validation. Water Resour Res 35(1):233–241

  • Lehmann EL, Casella G (1998) Theory of point estimation, 2nd edn. Springer, Berlin

  • Li L, Xu C-Y, Xia J, Engeland K, Reggiani P (2011) Uncertainty estimates by Bayesian method with likelihood of AR (1) and normal model and AR (1) and multi-normal model in different time-scales hydrological models. J Hydrol 406:54–65

  • McMillan H, Clark M (2009) Rainfall-runoff model calibration using informal likelihood measures within a Markov chain Monte Carlo sampling scheme. Water Resour Res 45:W04418

  • Muleta MK (2012) Model performance sensitivity to objective function during automated calibrations. J Hydrol Eng 17(6):756–767

  • Nash JE, Sutcliffe JV (1970) River flow forecasting through conceptual models 1: a discussion of principles. J Hydrol 10(3):282–290

  • Neitsch SL, Arnold JG, Kiniry JR, Williams JR (2005) Soil and water assessment tool theoretical documentation and user’s manual, version 2005. GSWR Agricultural Research Service & Texas Agricultural Experiment Station, Temple Texas

  • Nott DJ, Marshall L, Brown J (2012) Generalized likelihood uncertainty estimation (GLUE) and approximate Bayesian computation: what’s the connection? Water Resour Res 48:W12602

  • Oudin L, Andréassian V, Mathevet T, Perrin C (2006) Dynamic averaging of rainfall-runoff model simulations from complementary model parameterizations. Water Resour Res 42(7):W07410

  • Pianosi F, Raso L (2012) Dynamic modeling of predictive uncertainty by regression on absolute errors. Water Resour Res 48:W03516

  • Powell LA (2007) Approximating variance of demographic parameters using the delta method: a reference for avian biologists. Condor 109(4):949–954

  • Pushpalatha R, Perrin C, Le Moine N, Andréassian V (2012) A review of efficiency criteria suitable for evaluating low-flow simulations. J Hydrol 420–421:171–182

  • Qiao L, Herrmann RB, Pan Z-T (2013) Parameter uncertainty reduction for SWAT using GRACE, streamflow, and groundwater table data for Lower Missouri River Basin. JAWRA 49(2):343–358

  • Reichert P, Mieleitner J (2009) Analyzing input and structural uncertainty of nonlinear dynamic models with stochastic, time-dependent parameters. Water Resour Res 45:W10402

  • Reichert P, Schuwirth N (2012) Linking statistical bias description to multiobjective model calibration. Water Resour Res 48:W09543

  • Renard B, Kavetski D, Kuczera G, Thyer M, Franks SW (2010) Understanding predictive uncertainty in hydrologic modeling: the challenge of identifying input and structural errors. Water Resour Res 46:W05521

  • Sadegh M, Vrugt JA (2013) Bridging the gap between GLUE and formal statistical approaches: approximate Bayesian computation. Hydrol Earth Syst Sci 17:4831–4850

  • Sakia RM (1992) The Box–Cox transformation technique: a review. The Statistician 41:169–178

  • Schaefli B, Gupta HV (2007) Do Nash values have value? Hydrol Process 21:2075–2080

  • Schoups G, Vrugt JA (2010) A formal likelihood function for parameter and predictive inference of hydrologic models with correlated, heteroscedastic, and non-Gaussian errors. Water Resour Res 46:W10531

  • Seibert J (2001) On the need for benchmarks in hydrological modeling. Hydrol Process 15:1063–1064

  • Smith T, Sharma A, Marshall L, Mehrotra R, Sisson S (2010) Development of a formal likelihood function for improved Bayesian inference of ephemeral catchments. Water Resour Res 46:W12551

  • Sorooshian S, Gupta VK (1995) Model Calibration. In: Singh VP (ed) Computer models of watershed hydrology. Water Resources Publications, Highlands Ranch, pp 23–67

  • Thiemann M, Trosset M, Gupta H, Sorooshian S (2001) Bayesian recursive parameter estimation for hydrologic models. Water Resour Res 37(10):2521–2535

  • Tolson BA, Shoemaker CA (2007) Dynamically dimensioned search algorithm for computationally efficient watershed model calibration. Water Resour Res 43:W01413

  • Vandewiele GL, Xu C-Y, Ni-Lar-Win (1992) Methodology and comparative study of monthly water balance models in Belgium, China and Burma. J Hydrol 134:315–347

  • Vazquez-Amábile GG, Engel BA (2005) Use of SWAT to compute groundwater table depth and streamflow in the Muscatatuck River watershed. Trans ASABE 48(3):991–1003

  • Vrugt JA, Ter Braak CJF, Gupta HV, Robinson BA (2009a) Equifinality of formal (DREAM) and informal (GLUE) Bayesian approaches in hydrologic modeling? Stoch Environ Res Risk Assess 23(7):1011–1026

  • Vrugt JA, Ter Braak CJF, Diks CGH, Robinson BA, Hyman JM, Higdon D (2009b) Accelerating Markov Chain Monte Carlo simulation by differential evolution with self-adaptive randomized subspace sampling. Int J Nonlinear Sci Numer Simul 10(3):273–290

  • White ED, Easton ZM, Fuka DR, Collick AS, Adgo E, McCartney M, Awulachew SB, Selassie YG, Steenhuis TS (2011) Development and application of a physically based landscape water balance in the SWAT model. Hydrol Process 25:915–925

  • Willems P (2009) A time series tool to support the multi-criteria performance evaluation of rainfall-runoff models. Environ Model Softw 24:311–321

  • Wu Y-P, Liu SG (2014) A suggestion for computing objective function in model calibration. Ecol Inform 24:107–111

  • Xu C-Y (2001) Statistical analysis of parameters and residuals of a conceptual water balance model—methodology and case study. Water Resour Manag 15:75–92

  • Xu C-Y, Seibert J, Halldin S (1996) Regional water balance modelling in the NOPEX area: development and application of monthly water balance models. J Hydrol 180:211–236

  • Yang J, Reichert P, Abbaspour KC (2007a) Bayesian uncertainty analysis in distributed hydrological modelling: a case study in the Thur River basin (Switzerland). Water Resour Res 43:W10401

  • Yang J, Reichert P, Abbaspour KC, Yang H (2007b) Hydrological modelling of the Chaohe Basin in China: statistical model formulation and Bayesian inference. J Hydrol 340:167–182

  • Yustres Á, Asensio L, Alonso J, Navarro V (2012) A review of Markov Chain Monte Carlo and information theory tools for inverse problems in subsurface flow. Comput Geosci 16:1–20


Acknowledgements

This research was supported by the National Natural Science Foundation of China (Nos. 41601013, 51190091, 51079038, 41571130071), the National Key R&D Program of China (No. 20165051922) and the Natural Science Foundation of Jiangsu Province, China (No. BK20150809). The authors would like to thank the Weihai Hydrology and Water Resource Survey Bureau for providing hydrologic data, the National Meteorological Information Center, China Meteorological Administration (CMA) for providing climate data, and the computing facilities of Freie Universität Berlin (ZEDAT) for computer time. We are also grateful to two anonymous reviewers and the associate editor for their constructive comments, which led to significant improvements of the paper.

Author information

Corresponding author

Correspondence to Qin-Bo Cheng.

Appendices

Appendix 1: Maximum likelihood of BC-GED model

The derivative of the log-likelihood function must equal zero at its maximum, so from the likelihood function in Eq. (6) we can derive:

$$ \frac{\partial \, l\left( \theta \left| obs \right. \right)}{\partial \sigma } = - \frac{n}{\sigma } + \frac{c(\beta )^{\beta } \beta }{\sigma^{\beta + 1} }\sum\limits_{i = 1}^{n} \left| e_{i} \right|^{\beta } = 0 $$
(16)

By solving Eq. (16), we get:

$$ \sigma = c(\beta )\sqrt[\beta ]{\beta }\sqrt[\beta ]{{\frac{{\sum\nolimits_{1}^{n} {\left| {e_{i} } \right|^{\beta } } }}{n}}} = \sqrt[\beta ]{\beta }\sqrt {\frac{\Gamma [3/\beta ]}{\Gamma [1/\beta ]}} \sqrt[\beta ]{{\frac{{\sum\nolimits_{1}^{n} {\left| {e_{i} } \right|^{\beta } } }}{n}}} $$
(17)
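
As a minimal sketch of how Eq. (17) could be evaluated numerically, the Python snippet below computes σ from a list of residuals. The function names `ged_c` and `sigma_mle` and the residual values are illustrative assumptions, not part of the original study; the residuals are taken to be the (already BC-transformed) model errors.

```python
import math

def ged_c(beta: float) -> float:
    """c(beta) = sqrt(Gamma(3/beta) / Gamma(1/beta)), as used in Eq. (17)."""
    return math.sqrt(math.gamma(3.0 / beta) / math.gamma(1.0 / beta))

def sigma_mle(residuals, beta: float) -> float:
    """Eq. (17): sigma = c(beta) * (beta * mean(|e_i|^beta))^(1/beta)."""
    n = len(residuals)
    mean_abs_pow = sum(abs(e) ** beta for e in residuals) / n
    return ged_c(beta) * (beta * mean_abs_pow) ** (1.0 / beta)

# Made-up residuals, for illustration only.
residuals = [0.5, -1.2, 0.3, 2.0, -0.7]
rms = math.sqrt(sum(e * e for e in residuals) / len(residuals))
mae = sum(abs(e) for e in residuals) / len(residuals)

# beta = 2 (Gaussian): Eq. (17) reduces to the root-mean-square error.
print(sigma_mle(residuals, 2.0), rms)
# beta = 1 (Laplace): Eq. (17) reduces to sqrt(2) times the mean absolute error.
print(sigma_mle(residuals, 1.0), math.sqrt(2.0) * mae)
```

The two printed pairs should agree, which illustrates how the familiar squared-error and absolute-error measures arise as special cases of the same estimator.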

Substituting Eq. (17) into Eq. (6) gives:

$$ l\left( {\theta \left| {obs} \right.} \right)_{max} = - n\ln \frac{{\sigma \sqrt[\beta ]{\varepsilon }}}{\omega (\beta )} = - n\ln \left[ {\frac{2\Gamma [1/\beta ]}{\beta }\sqrt[\beta ]{{\varepsilon \beta \frac{{\sum\nolimits_{1}^{n} {\left| {e_{i} } \right|^{\beta } } }}{n}}}} \right] $$
(18)

where ε is the base of the natural logarithm (ε ≈ 2.718).
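
The closed form in Eq. (18) can be checked numerically. The sketch below assumes that the log-likelihood of Eq. (6) is the sum of the GED log-densities of the residuals, and that ω(β) is the normalising constant β c(β) / (2 Γ[1/β]), which is the form implied by Eqs. (17) and (18); Eqs. (5) and (6) are not reproduced in this appendix, so both are assumptions. All function names and the residual values are hypothetical.

```python
import math

def ged_c(beta):
    return math.sqrt(math.gamma(3.0 / beta) / math.gamma(1.0 / beta))

def ged_omega(beta):
    # Assumed normalising constant of the GED: beta * c(beta) / (2 * Gamma(1/beta)).
    return beta * ged_c(beta) / (2.0 * math.gamma(1.0 / beta))

def log_likelihood(residuals, beta, sigma):
    """Sum of GED log-densities of the residuals for a given sigma (assumed Eq. (6))."""
    n = len(residuals)
    s = sum(abs(e) ** beta for e in residuals)
    return n * math.log(ged_omega(beta) / sigma) - ged_c(beta) ** beta * s / sigma ** beta

def max_log_likelihood(residuals, beta):
    """Closed form of Eq. (18), with math.e as the base of the natural logarithm."""
    n = len(residuals)
    m = sum(abs(e) ** beta for e in residuals) / n
    root = (math.e * beta * m) ** (1.0 / beta)
    return -n * math.log(2.0 * math.gamma(1.0 / beta) / beta * root)

residuals = [0.5, -1.2, 0.3, 2.0, -0.7]  # made-up values
beta = 1.5
m = sum(abs(e) ** beta for e in residuals) / len(residuals)
sigma_hat = ged_c(beta) * (beta * m) ** (1.0 / beta)  # Eq. (17)

direct = log_likelihood(residuals, beta, sigma_hat)
closed = max_log_likelihood(residuals, beta)
print(direct, closed)  # the two values should agree
```

For a fixed β, Eq. (18) depends on the residuals only through the mean of |e_i|^β, so maximizing it is equivalent to minimizing the mean squared error when β = 2 and the mean absolute error when β = 1.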

Equation (17) gives an estimator of the standard deviation (σ) in Eq. (5). In the following, we examine in turn the unbiasedness, consistency and efficiency of this estimator (Lehmann and Casella 1998).

(1) Unbiasedness

Taking the expectation inside the β-th root, which neglects terms of order 1/n, the mathematical expectation of Eq. (17) is:

$$ E(\sigma ) = \sqrt[\beta ]{\beta }\sqrt {\frac{\Gamma [3/\beta ]}{\Gamma [1/\beta ]}} \sqrt[\beta ]{{E\left( {\frac{{\sum\nolimits_{1}^{n} {\left| {e_{i} } \right|^{\beta } } }}{n}} \right)}} = \sqrt[\beta ]{\beta }\sqrt {\frac{\Gamma [3/\beta ]}{\Gamma [1/\beta ]}} \sqrt[\beta ]{{E\left( {\left| e \right|^{\beta } } \right)}} $$
(19)

If \( e_{i} \) follows the zero-mean generalized error distribution (GED), Eq. (19) can be rewritten as:

$$ \begin{aligned} E(\sigma ) & = \sqrt[\beta ]{\beta }\sqrt {\frac{\Gamma [3/\beta ]}{\Gamma [1/\beta ]}} \sqrt[\beta ]{{\int_{ - \infty }^{ + \infty } {\left| x \right|^{\beta } } \frac{\omega (\beta )}{\sigma }exp\left( { - c(\beta )^{\beta } \frac{{\left| x \right|^{\beta } }}{{\sigma^{\beta } }}} \right)dx}} \\ & = \sqrt[\beta ]{\beta }\sqrt {\frac{\Gamma [3/\beta ]}{\Gamma [1/\beta ]}} \sqrt[\beta ]{{\frac{1}{\beta }\left( {\sigma \sqrt {\frac{\Gamma [1/\beta ]}{\Gamma [3/\beta ]}} } \right)^{\beta } }} \\ & = \sigma \\ \end{aligned} $$
(20)

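So, Eq. (17) is an unbiased estimator of σ in Eq. (5), up to the terms of order 1/n that are neglected when the expectation is taken inside the β-th root in Eq. (19).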

(2) Consistency

Assuming independent residuals, the variance of \( \sigma^{\beta } \) estimated by Eq. (17) is:

$$ \begin{aligned} {\text{Var}}\left( \sigma^{\beta } \right) & = {\text{Var}}\left( \left( \sqrt {\frac{\Gamma [3/\beta ]}{\Gamma [1/\beta ]}} \right)^{\beta } \beta \frac{\sum\nolimits_{i = 1}^{n} \left| e_{i} \right|^{\beta } }{n} \right) \\ & = \left( \frac{\Gamma [3/\beta ]}{\Gamma [1/\beta ]} \right)^{\beta } \frac{\beta^{2} }{n}{\text{Var}}\left( \left| e \right|^{\beta } \right) \\ & = \left( \frac{\Gamma [3/\beta ]}{\Gamma [1/\beta ]} \right)^{\beta } \frac{\beta^{2} }{n}\left( E\left( \left| e \right|^{2\beta } \right) - E\left( \left| e \right|^{\beta } \right)^{2} \right) \\ & = \frac{\beta }{n}\sigma^{2\beta } \\ \end{aligned} $$
(21)
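
The last step of Eq. (21) rests on the absolute moments of the zero-mean GED. A short worked derivation may help; it writes \( a = \sigma /c(\beta ) \) for the scale (a symbol introduced here only for brevity) and assumes the normalising constant \( \omega (\beta ) = \beta c(\beta )/(2\Gamma [1/\beta ]) \), which is the form implied by Eqs. (17) and (18):

$$ \begin{aligned} E\left( \left| e \right|^{r} \right) & = 2\int_{0}^{ + \infty } x^{r} \frac{\omega (\beta )}{\sigma }\exp \left( - \left( \frac{x}{a} \right)^{\beta } \right)dx = a^{r} \frac{\Gamma [(r + 1)/\beta ]}{\Gamma [1/\beta ]} \\ E\left( \left| e \right|^{\beta } \right) & = a^{\beta } \frac{\Gamma [1 + 1/\beta ]}{\Gamma [1/\beta ]} = \frac{a^{\beta } }{\beta } \\ E\left( \left| e \right|^{2\beta } \right) & = a^{2\beta } \frac{\Gamma [2 + 1/\beta ]}{\Gamma [1/\beta ]} = a^{2\beta } \frac{1 + \beta }{\beta^{2} } \\ {\text{Var}}\left( \left| e \right|^{\beta } \right) & = a^{2\beta } \left( \frac{1 + \beta }{\beta^{2} } - \frac{1}{\beta^{2} } \right) = \frac{a^{2\beta } }{\beta } \\ \end{aligned} $$

Multiplying \( {\text{Var}}( \left| e \right|^{\beta } ) \) by \( c(\beta )^{2\beta } \beta^{2} /n \) and using \( c(\beta )^{2\beta } a^{2\beta } = \sigma^{2\beta } \) gives \( \beta \sigma^{2\beta } /n \).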

According to the delta method, which uses a first-order Taylor expansion to approximate the variance of a function of a random variable (Powell 2007), we get the approximate variance of Eq. (17):

$$ {\text{Var}}(\sigma ) = {\text{Var}}\left( {\sqrt[\beta ]{{\sigma^{\beta } }}} \right) \approx \left( {\frac{1}{\beta }\left( {\sigma^{\beta } } \right)^{1/\beta - 1} } \right)^{2} {\text{Var}}\left( {\sigma^{\beta } } \right) = \frac{1}{{\beta^{2} }}\sigma^{2 - 2\beta } \frac{\beta }{n}\sigma^{2\beta } = \frac{{\sigma^{2} }}{\beta n} $$
(22)

Equation (22) shows that the variance of Eq. (17) approaches zero as the number of samples (n) approaches infinity. So, Eq. (17) is a consistent estimator of σ in Eq. (5).
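
The unbiasedness and consistency results can be illustrated with a small Monte Carlo experiment. The sketch below is not part of the original study: it draws synthetic zero-mean GED samples using the fact that \( (\left| x \right|/a)^{\beta } \) follows a Gamma(1/β, 1) distribution when \( a = \sigma /c(\beta ) \), then compares the mean and variance of the estimates from Eq. (17) with σ and with Eq. (22). The values of β, σ, n and the number of replicates are arbitrary choices, and all names are hypothetical.

```python
import math
import random
import statistics

def ged_c(beta):
    return math.sqrt(math.gamma(3.0 / beta) / math.gamma(1.0 / beta))

def sample_ged(rng, beta, sigma, n):
    """Draw n zero-mean GED variates with standard deviation sigma."""
    a = sigma / ged_c(beta)  # scale parameter
    return [
        rng.choice((-1.0, 1.0)) * a * rng.gammavariate(1.0 / beta, 1.0) ** (1.0 / beta)
        for _ in range(n)
    ]

def sigma_mle(residuals, beta):
    """Eq. (17)."""
    m = sum(abs(e) ** beta for e in residuals) / len(residuals)
    return ged_c(beta) * (beta * m) ** (1.0 / beta)

rng = random.Random(0)  # fixed seed for reproducibility
beta, sigma, n, replicates = 1.5, 2.0, 1000, 300
estimates = [sigma_mle(sample_ged(rng, beta, sigma, n), beta) for _ in range(replicates)]

mean_est = statistics.fmean(estimates)    # compare with sigma, Eq. (20)
var_est = statistics.variance(estimates)  # compare with Eq. (22)
var_delta = sigma ** 2 / (beta * n)
print(f"mean of estimates: {mean_est:.4f} (true sigma = {sigma})")
print(f"variance of estimates: {var_est:.6f} (delta method: {var_delta:.6f})")
```

Increasing n should shrink the spread of the estimates roughly in proportion to 1/n, in line with Eq. (22).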

(3) Efficiency

By definition, the standard deviation of the zero-mean model residuals is:

$$ \sigma = \sqrt {\frac{{\sum\nolimits_{1}^{n} {e_{i}^{2} } }}{n}} $$
(23)

The variance of \( \sigma^{2} \) estimated by Eq. (23) is:

$$ \begin{aligned} {\text{Var}}(\sigma^{2} ) & = \frac{1}{n}\left( {E\left( {e^{4} } \right) - E\left( {e^{2} } \right)^{2} } \right) \\ & = \frac{1}{n}\frac{{\Gamma [1 /\beta ]\Gamma [5 /\beta ] -\Gamma [3 /\beta ]^{2} }}{{\Gamma [3 /\beta ]^{2} }}\sigma^{4} \\ \end{aligned} $$
(24)

According to the delta method (Powell 2007), we get the approximate variance of Eq. (23):

$$ \begin{aligned} {\text{Var}}(\sigma ) & = {\text{Var}}\left( {\sqrt[2]{{\sigma^{2} }}} \right) \approx \left( {\frac{1}{2}\left( {\sigma^{2} } \right)^{ - 1/2} } \right)^{2} {\text{Var}}\left( {\sigma^{2} } \right) \\ & = \frac{1}{4}\left( {\frac{\Gamma [1/\beta ]\Gamma [5/\beta ]}{{\Gamma [3/\beta ]^{2} }} - 1} \right)\frac{{\sigma^{2} }}{n} \\ \end{aligned} $$
(25)

Comparing Eq. (25) with Eq. (22), we get:

$$ \frac{1}{4}\left( {\frac{\Gamma [1/\beta ]\Gamma [5/\beta ]}{{\Gamma [3/\beta ]^{2} }} - 1} \right) \ge \frac{1}{\beta } $$
(26)

where the equality holds if and only if β = 2 (Gaussian distribution).

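Equation (26) demonstrates that the variance of the estimator in Eq. (17) is no greater than that of the estimator in Eq. (23). So, Eq. (17) is at least as efficient (precise) as Eq. (23) for estimating the value of σ in Eq. (5), and strictly more efficient whenever β ≠ 2.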
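
The inequality in Eq. (26) can be tabulated for a few values of β. The following sketch is illustrative only (the function names and the chosen β values are assumptions); it evaluates both sides of Eq. (26), i.e. the factors multiplying σ²/n in Eqs. (25) and (22).

```python
import math

def var_factor_rms(beta):
    """Left-hand side of Eq. (26): n * Var(sigma) / sigma^2 from Eq. (25)."""
    g1, g3, g5 = (math.gamma(k / beta) for k in (1, 3, 5))
    return 0.25 * (g1 * g5 / g3 ** 2 - 1.0)

def var_factor_mle(beta):
    """Right-hand side of Eq. (26): n * Var(sigma) / sigma^2 from Eq. (22)."""
    return 1.0 / beta

betas = (0.5, 1.0, 1.5, 2.0, 3.0, 5.0)
for beta in betas:
    lhs, rhs = var_factor_rms(beta), var_factor_mle(beta)
    print(f"beta = {beta:3.1f}  Eq. (25) factor = {lhs:7.4f}  "
          f"Eq. (22) factor = {rhs:7.4f}  ratio = {lhs / rhs:6.3f}")
```

The ratio in the last column is 1 at β = 2 and larger than 1 elsewhere, so the further the error distribution departs from the Gaussian, the more precision is lost by using the root-mean-square estimator of Eq. (23).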

Appendix 2: Diagnosis of the autocorrelation of model residuals

See Fig. 10.

Fig. 10 Diagnosis of the autocorrelation of model residuals for six goodness-of-fit indicators, where the dashed lines mark the statistically zero band (\( \pm 1.96/\sqrt n \))
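
A diagnosis of this kind can be sketched as follows. The code is a generic illustration rather than the authors' implementation: it computes the sample autocorrelation coefficients of a residual series and compares them with the \( \pm 1.96/\sqrt n \) band named in the caption. The residual series is made up, and the function names are hypothetical.

```python
import math

def autocorrelation(residuals, max_lag):
    """Sample autocorrelation coefficients for lags 1..max_lag."""
    n = len(residuals)
    mean = sum(residuals) / n
    dev = [e - mean for e in residuals]
    denom = sum(d * d for d in dev)
    return [
        sum(dev[t] * dev[t + k] for t in range(n - k)) / denom
        for k in range(1, max_lag + 1)
    ]

def zero_band(n):
    """Half-width of the approximate 95% band for white noise: 1.96 / sqrt(n)."""
    return 1.96 / math.sqrt(n)

# Made-up residual series, for illustration only.
residuals = [1.0, 0.8, 0.9, 0.4, 0.1, -0.3, -0.6, -0.9, -0.7, -0.2, 0.3, 0.6]
band = zero_band(len(residuals))
for lag, r in enumerate(autocorrelation(residuals, 4), start=1):
    flag = "outside band" if abs(r) > band else "within band"
    print(f"lag {lag}: r = {r:+.3f} ({flag}, band = +/-{band:.3f})")
```

Coefficients that fall within the band are statistically indistinguishable from zero at roughly the 5% level, which is the criterion the dashed lines in Fig. 10 represent.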

Cite this article

Cheng, QB., Chen, X., Xu, CY. et al. Using maximum likelihood to derive various distance-based goodness-of-fit indicators for hydrologic modeling assessment. Stoch Environ Res Risk Assess 32, 949–966 (2018). https://doi.org/10.1007/s00477-017-1507-8

  • DOI: https://doi.org/10.1007/s00477-017-1507-8
