Abstract
Probabilistic forecasting aims at producing a predictive distribution of the quantity of interest instead of a single best-guess point estimate. With regard to water flow forecasts, the two main sources of uncertainty stem from unknown future rainfall and temperature (input error, i.e., meteorological uncertainty) and from the inadequacy of the deterministic simulator mimicking the rainfall–runoff (RR) transformation (hydrological uncertainty or RR error). These two sources of uncertainty can be dealt with separately, and only the latter is considered here. Only hydrological uncertainty is at stake when recorded meteorological data (instead of meteorological forecasts) are used as inputs to feed the RR simulator (RRS) for probabilistic predictions. The predictive performance of the RRS may strongly depend on the hydrological regime: rapid flood variations induce large anticipation errors, whereas a series of dry events translates into a much smoother sequence of river levels because the emptying of the soil reservoir is easy to predict. Consequently, a model with several regimes adapted to different error structures appears as a solution to cope with the issue of nonstationary predictive variance. The river regime is modeled as a latent variable, the distribution of which is based on additional outputs of the RRS to be selected. Inference is performed by the EM algorithm, with both steps leading to explicit analytic expressions. Asymptotic confidence regions for the estimates are provided within the same EM framework. Model selection is also performed, including the length of the model memory as well as the choice of explanatory variables for the latent regimes. The model is applied to a series of water flow forecasts routinely issued by two hydroelectricity producers in France and in Québec and compared with their present operational forecasting methods.
References
Ailliot, P. and Monbet, V. (2012). Markov-switching autoregressive models for wind time series. Environmental Modelling & Software, 30:92–101.
Albert, J. H. and Chib, S. (1993). Bayesian analysis of binary and polychotomous response data. Journal of the American Statistical Association, 88(422):669–679.
Andreassian, V., Bergstrom, S., Chahinian, N., Duan, Q., Gusev, Y., Littlewood, I., Mathevet, T., Michel, C., Montanari, A., Moretti, G., et al. (2006). Catalogue of the models used in MOPEX 2004/2005. IAHS publication, 307:41.
Bates, B. C. and Campbell, E. P. (2001). A Markov chain Monte Carlo scheme for parameter estimation and inference in conceptual rainfall-runoff modeling. Water Resources Research, 37(4):937–947.
Box, G. and Jenkins, G. (1970). Time Series Analysis: Forecasting and Control. Holden–Day, San Francisco, Ca.
Box, G. E. and Cox, D. R. (1964). An analysis of transformations. Journal of the Royal Statistical Society. Series B (Methodological), pages 211–252.
Chib, S. (1996). Calculating posterior distributions and modal estimates in Markov mixture models. Journal of Econometrics, 75(1):79–97.
Collet, J., Épiard, X., and Coudray, P. (2009). Simulating hydraulic inflows using PCA and ARMAX. The European Physical Journal-Special Topics, 174(1):125–134.
Dempster, A. P., Laird, N. M., and Rubin, D. B. (1977). Maximum likelihood from incomplete data via the EM algorithm. Journal of the Royal Statistical Society. Series B (Methodological), pages 1–38.
Engeland, K., Renard, B., Steinsland, I., and Kolberg, S. (2010). Evaluation of statistical models for forecast errors from the HBV model. Journal of Hydrology, 384(1):142–155.
Evin, G., Kavetski, D., Thyer, M., and Kuczera, G. (2013). Pitfalls and improvements in the joint inference of heteroscedasticity and autocorrelation in hydrological model calibration. Water Resources Research, 49(7):4518–4524.
Evin, G., Thyer, M., Kavetski, D., McInerney, D., and Kuczera, G. (2014). Comparison of joint versus postprocessor approaches for hydrological uncertainty estimation accounting for error autocorrelation and heteroscedasticity. Water Resources Research, 50(3):2350–2375.
Fortin, V. (2000). Le modèle météo-apport HSAMI: historique, théorie et application. Institut de recherche d’Hydro-Québec, Varennes.
Furrer, E. M., Jacques, C., and Favre, A.-C. (2006). Short term discharge prediction using a Markovian regime switching model. Technical report, INRS-ETE.
Gailhard, J. (2014). Algorithme de recalage associé à MORDOR diagnostic et proposition d’améliorations. Note Technique Interne H-44200965-2014-000075, EDF-DTG.
Gelfand, A. E. and Smith, A. F. (1990). Sampling-based approaches to calculating marginal densities. Journal of the American Statistical Association, 85(410):398–409.
Gneiting, T. and Raftery, A. E. (2007). Strictly proper scoring rules, prediction, and estimation. Journal of the American Statistical Association, 102(477):359–378.
Gneiting, T., Raftery, A. E., Westveld, A. H., and Goldman, T. (2005). Calibrated probabilistic forecasting using ensemble model output statistics and minimum CRPS estimation. Monthly Weather Review, 133(5):1098–1118.
Hemri, S., Fundel, F., and Zappa, M. (2013). Simultaneous calibration of ensemble river flow predictions over an entire range of lead times. Water Resources Research, 49(10):6744–6755.
Hemri, S., Lisniak, D., and Klein, B. (2015). Multivariate postprocessing techniques for probabilistic hydrological forecasting. Water Resources Research, 51(9):7436–7451.
Hersbach, H. (2000). Decomposition of the continuous ranked probability score for ensemble prediction systems. Weather and Forecasting, 15(5):559–570.
Johnson, N. L., Kotz, S., and Balakrishnan, N. (1994). Continuous Univariate Distributions, vol. 1–2. New York: John Wiley & Sons.
Krzysztofowicz, R. (2002). Bayesian system for probabilistic river stage forecasting. Journal of Hydrology, 268(1):16–40.
Kuczera, G. (1983). Improved parameter inference in catchment models: 1. Evaluating parameter uncertainty. Water Resources Research, 19(5):1151–1162.
Li, M., Wang, Q., Bennett, J., and Robertson, D. (2015). A strategy to overcome adverse effects of autoregressive updating of streamflow forecasts. Hydrology and Earth System Sciences, 19(1):1–15.
Louis, T. A. (1982). Finding the observed information matrix when using the EM algorithm. Journal of the Royal Statistical Society. Series B, 44(2):226–233.
Lu, Z.-Q. and Berliner, L. M. (1999). Markov switching time series models with application to a daily runoff series. Water Resources Research, 35(2):523–534.
Matheson, J. E. and Winkler, R. L. (1976). Scoring rules for continuous probability distributions. Management Science, 22(10):1087–1096.
Mathevet, T. (2010). Erreur empirique de modèle. Note Technique Interne D4165/NT/2010-00395-A, EDF-DTG.
Morawietz, M., Xu, C.-Y., Gottschalk, L., and Tallaksen, L. M. (2011). Systematic evaluation of autoregressive error models as post-processors for a probabilistic streamflow forecast system. Journal of Hydrology, 407(1):58–72.
Perreault, L., Garçon, R., and Gaudet, J. (2007). Analyse de séquences de variables aléatoires hydrologiques à l’aide de modèles de changement de régime exploitant des variables atmosphériques. La Houille Blanche (6):111–123.
Pianosi, F. and Raso, L. (2012). Dynamic modeling of predictive uncertainty by regression on absolute errors. Water Resources Research, 48(3).
Raftery, A. E., Gneiting, T., Balabdaoui, F., and Polakowski, M. (2005). Using Bayesian model averaging to calibrate forecast ensembles. Monthly Weather Review, 133(5).
Schaefli, B., Talamba, D. B., and Musy, A. (2007). Quantifying hydrological modeling errors through a mixture of normal distributions. Journal of Hydrology, 332(3):303–315.
Schoups, G. and Vrugt, J. A. (2010). A formal likelihood function for parameter and predictive inference of hydrologic models with correlated, heteroscedastic, and non-Gaussian errors. Water Resources Research, 46(10).
Schwarz, G. (1978). Estimating the Dimension of a Model. The Annals of Statistics, 6(2):461–464.
Sorooshian, S. and Dracup, J. A. (1980). Stochastic parameter estimation procedures for hydrologic rainfall-runoff models: Correlated and heteroscedastic error cases. Water Resources Research, 16(2):430–442.
Thyer, M., Kuczera, G., and Wang, Q. (2002). Quantifying parameter uncertainty in stochastic models using the Box-Cox transformation. Journal of Hydrology, 265(1):246–257.
Todini, E. (2008). A model conditional processor to assess predictive uncertainty in flood forecasting. International Journal of River Basin Management, 6(2):123–137.
Vrugt, J. A. and Robinson, B. A. (2007). Treatment of uncertainty using ensemble methods: Comparison of sequential data assimilation and Bayesian model averaging. Water Resources Research, 43(1).
Wang, Q., Shrestha, D. L., Robertson, D., and Pokhrel, P. (2012). A log-sinh transformation for data normalization and variance stabilization. Water Resources Research, 48(5).
Acknowledgements
This work was supported by Électricité de France and by Hydro-Québec [research Grant Number 694R] through the thesis of M. Courbariaux. We would like to thank Anne-Catherine Favre, Joël Gailhard and Luc Perreault for their unfailing help and constructive comments on earlier drafts of the article. The forecasting and development teams at EDF-DTG and Hydro-Québec have provided the necessary material and case studies as well as much valuable advice; we thank in particular Catherine Guay, Isabelle Chartier and Marie Minville from IREQ, and Rémy Garçon, Matthieu Le-Lay and Federico Garavaglia from EDF-DTG. We also thank Joan Sobota for English proofreading. We finally thank the Associate Editor and the two reviewers for their comments and questions, which helped us to improve the paper.
Appendices
Appendix 1: Operational predictive method
EDF’s operational predictive method consists of 3 independent modules: a deterministic model, an error model and an empirical copula.
Deterministic model (Gailhard 2014) The deterministic model in use at EDF is an autoregressive model combined with exponential smoothing. The strength of the autocorrelation is assumed to increase with the rate of water flow coming from the deep reservoirs of the watershed.
Error model (Mathevet 2010) The error model is a heteroscedastic conditional normal model derived for each forecasting lead time h (after normalization):
where \(b_{h}\) and \(\sigma _{h}\) are tabulated functions of x.
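To illustrate how such a tabulated, heteroscedastic normal model can be sampled, here is a minimal Python sketch. It rests on assumptions that are not taken from the article: the normal distribution is assumed to be centered on \(x + b_{h}(x)\) with standard deviation \(\sigma _{h}(x)\), the tables are linearly interpolated between grid points, and the function name, grid and table values are hypothetical.

```python
import numpy as np

def sample_error_model(x, x_grid, b_table, sigma_table, n_samples, rng):
    """Draw normalized forecasts for one lead time h, given simulator output x.

    ASSUMPTION: forecast ~ Normal(x + b_h(x), sigma_h(x)^2), where b_h and
    sigma_h are read from tables by linear interpolation.
    """
    b = np.interp(x, x_grid, b_table)          # tabulated bias b_h(x)
    sigma = np.interp(x, x_grid, sigma_table)  # tabulated spread sigma_h(x)
    return rng.normal(loc=x + b, scale=sigma, size=n_samples)

# Hypothetical tables for a single lead time (illustrative values only).
x_grid = np.array([-2.0, 0.0, 2.0])
b_table = np.array([0.1, 0.0, -0.2])
sigma_table = np.array([0.2, 0.5, 1.0])

rng = np.random.default_rng(0)
draws = sample_error_model(1.0, x_grid, b_table, sigma_table, 5000, rng)
```

With these illustrative tables, a simulator output of 1.0 falls halfway between two grid points, so the draws are centered near 0.9 with a spread near 0.75; the spread grows with x, which is what makes the model heteroscedastic.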
Empirical copula One finally resorts to an empirical copula to get samples of a space and time multivariate distribution from samples from the marginal (lead time by lead time) distributions.
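The idea of building joint samples from marginal ones can be sketched with a generic rank-reordering scheme. This is an illustration under assumptions, not EDF's operational code: the dependence template (for instance a set of past trajectories), the function name and the toy numbers are all hypothetical.

```python
import numpy as np

def empirical_copula_reorder(marginal_samples, template):
    """Reorder marginal samples so that their ranks match those of a template.

    Both arrays have shape (n_members, n_lead_times). Each column of the
    result contains exactly the values of the corresponding input column,
    rearranged to follow the rank structure of the template.
    """
    marginal_samples = np.asarray(marginal_samples, dtype=float)
    template = np.asarray(template, dtype=float)
    if marginal_samples.shape != template.shape:
        raise ValueError("samples and template must have the same shape")
    # Rank of each template entry within its column (0 = smallest).
    ranks = template.argsort(axis=0).argsort(axis=0)
    # Sort each marginal sample, then place values according to those ranks.
    sorted_samples = np.sort(marginal_samples, axis=0)
    return np.take_along_axis(sorted_samples, ranks, axis=0)

# Toy example: 3 members, 2 lead times.
template = np.array([[1.0, 10.0], [3.0, 30.0], [2.0, 20.0]])
samples = np.array([[0.5, 7.0], [0.1, 9.0], [0.9, 8.0]])
joint = empirical_copula_reorder(samples, template)
```

Each column of `joint` keeps the marginal distribution of `samples`, while the ordering across lead times is inherited from `template`: the member that is lowest at the first lead time in the template is also given the lowest sampled value there.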
Appendix 2: Fisher information matrix
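As a reminder, the decomposition of Louis (1982) used throughout this appendix can be stated generically as follows. The notation is an assumption made for this reminder only: \(\ell _{c}\) denotes the complete-data log-likelihood (observations together with the latent variables), \(\varvec{\psi }\) stands for the full parameter vector, here \((\varvec{\theta },\mathbf {B})\), and \(\mathbf {I}_{obs}\) is the observed information matrix.

\[
\mathbf {I}_{obs}(\varvec{\psi }) = \mathbb {E}\left( -\frac{\partial ^{2}\ell _{c}}{\partial \varvec{\psi }\,\partial \varvec{\psi }^{\top }}\,\Big |\,\mathbf {Y};\varvec{\psi }\right) - \mathrm {Var}\left( \frac{\partial \ell _{c}}{\partial \varvec{\psi }}\,\Big |\,\mathbf {Y};\varvec{\psi }\right)
\]

The first term is the conditional expectation of the complete-data information, and the second term is the conditional covariance matrix of the complete-data score. Both are taken with respect to the distribution of the latent variables given the observations \(\mathbf {Y}\).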
For any k, \(k'\not =k\),
Then, the first term in the Louis decomposition is computed since \(\mathbb {E}(\mathbb {I}_{\{S_{t}=k\}}|\mathbf {Y};(\varvec{\theta },\mathbf {B}))=\tau _{kt}\) is computed above.
For the second term in the Louis decomposition, we notice that:
We also need to compute:
where
Again, we rely on the independence between the \(Z_t\)s.
The remaining terms are easily evaluated:
Cite this article
Courbariaux, M., Barbillon, P. & Parent, É. Water flow probabilistic predictions based on a rainfall–runoff simulator: a two-regime model with variable selection. JABES 22, 194–219 (2017). https://doi.org/10.1007/s13253-017-0278-5
Keywords
- EM algorithm
- Probit model
- Model uncertainty
- Probabilistic forecasts
- Hydrology
- Rainfall–runoff model