Abstract
This paper explores the practical benefits of Bayesian model averaging for a problem with limited data, namely predicting the future flow of five intermittent rivers. This problem is a useful proxy for many others, as the limited amount of data only allows tuning of small, simple models. Bayesian model averaging is theoretically a good way to cope with these difficulties, but it has not been widely used on this and similar problems. This paper uses real-world data to illustrate why. Bayesian model averaging can indeed give a better prediction, but only if the amount of data is small: if the data is so limited that it agrees with a wide range of different models (instead of being consistent with only a few near-identical models), then the weighted votes of those diverse models in Bayesian model averaging will (on average) give a better prediction than the single best model. In contrast, plenty of data can fit only one or a few very similar models; since they will vote the same way, Bayesian model averaging will give no practical improvement. Even with limited data that agrees with a range of models, the improvement is not very large, but it is the direction of the improvement that stands out as a help for forecasting. Working around these caveats lets us better predict river floods, and similar problems with limited data.
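The contrast between limited and plentiful data can be illustrated with a minimal sketch. It is not the paper's implementation: the function names `bma_weights` and `bma_predict`, the three candidate predictions, and the log-evidence values are all hypothetical, chosen only to show how posterior weights behave when the evidence is nearly flat across models versus sharply peaked on one model.

```python
import numpy as np


def bma_weights(log_evidences, log_priors=None):
    """Posterior model probabilities from log marginal likelihoods.

    Hypothetical helper: assumes each model's log evidence is already known.
    """
    log_ev = np.asarray(log_evidences, dtype=float)
    if log_priors is None:
        log_priors = np.zeros_like(log_ev)  # assumption: uniform model prior
    log_post = log_ev + np.asarray(log_priors, dtype=float)
    log_post = log_post - log_post.max()  # avoid underflow in exp
    weights = np.exp(log_post)
    return weights / weights.sum()


def bma_predict(predictions, weights):
    """Weighted vote: posterior-weighted mean of the models' predictions."""
    return float(np.dot(weights, predictions))


# Illustrative numbers only (not from the paper): three models' forecasts.
preds = np.array([10.0, 14.0, 18.0])

# Limited data: the evidence barely separates the models, so all three vote.
few_data = bma_weights([-5.0, -5.2, -5.4])

# Plenty of data: one model dominates, so the average collapses onto it.
much_data = bma_weights([-50.0, -70.0, -90.0])

best_single = preds[np.argmax(few_data)]
print("single best model:", best_single)
print("BMA, limited data:", bma_predict(preds, few_data))
print("BMA, plenty of data:", bma_predict(preds, much_data))
```

With the nearly flat evidence, the averaged forecast moves away from the single best model's forecast; with the sharply peaked evidence, the two are practically identical, which is the situation the abstract describes as giving no practical improvement.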
Notes
This paper uses weekly data which is available from November 1981 onwards, from http://ioc-goos-oopc.org/state_of_the_ocean/sur/pac. For a quick introduction to the Nino3 and Nino4 sea surface temperature numbers, please see https://climatedataguide.ucar.edu/climate-data/.
See http://ioc-goos-oopc.org/state_of_the_ocean/sur/ind for weekly data on the Indian Ocean sea surface temperature indices.
The exact count varies from 95,000 to 107,000 models, because we keep lowering the posterior cut-off and enumerate all models whose posterior exceeds it.
Acknowledgements
This research was supported by James Cook University’s Brisbane campus. The author would like to thank Matthew Fuller for technical support.
Cite this article
Darwen, P.J. Bayesian model averaging for river flow prediction. Appl Intell 49, 103–111 (2019). https://doi.org/10.1007/s10489-018-1232-0
DOI: https://doi.org/10.1007/s10489-018-1232-0