Abstract
Accurate and reliable predictions of debris-flow volume are a necessary prerequisite for hazard delineation and risk assessment of debris flows. Researchers have proposed various theoretical, empirical, and machine learning methods to estimate debris-flow volume. However, current methods generally provide deterministic point predictions and are limited in their ability to assess the predictive uncertainties associated with the observation data, model parameters, and model structures. This paper proposes a data-driven ensemble model that probabilistically forecasts debris-flow volume using multiple deterministic machine learning methods and Bayesian model averaging (BMA). Rainfall-induced debris flows in Taiwan are used as an illustrative example to evaluate the feasibility of the proposed approach. First, the debris-flow datasets are preprocessed by principal component analysis (PCA) to select input variables. Then, four data-driven models are applied to provide deterministic estimates for the ensemble forecasts. Finally, BMA combines the deterministic predictions of the data-driven models to generate probabilistic forecasts. The performance of the individual data-driven models and of the BMA ensemble forecast is evaluated and compared. Results show that the proposed BMA ensemble model predicts debris-flow volume more effectively and more robustly than the single models. Well-performing ensembles combine the strengths of different models to improve prediction accuracy. Weighting only the good members may not achieve the best performance in both the calibration and validation periods. The performance of different combinations of data-driven models is closely related to the observation data and to the prediction accuracy of each member model.
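The BMA step summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes Gaussian member densities with a single shared variance and estimates the weights with the expectation-maximization scheme commonly used for forecast ensembles, whereas the paper may rely on a different estimator (for example, MCMC sampling). The function names `fit_bma` and `bma_predict`, the synthetic data, and the three hypothetical members are all assumptions introduced for illustration.

```python
import numpy as np


def fit_bma(preds, obs, n_iter=500, tol=1e-10):
    """Estimate BMA weights and a shared Gaussian variance by EM.

    preds: (n_samples, n_models) deterministic member predictions.
    obs:   (n_samples,) observed values (e.g. log-transformed volumes).
    Assumption: each member's conditional density is Gaussian, centred
    on its prediction, with one variance shared by all members.
    """
    preds = np.asarray(preds, dtype=float)
    obs = np.asarray(obs, dtype=float)
    n, k = preds.shape
    weights = np.full(k, 1.0 / k)            # start from equal weights
    sq_err = (obs[:, None] - preds) ** 2
    var = max(sq_err.mean(), 1e-12)
    prev_ll = -np.inf
    tiny = np.finfo(float).tiny
    for _ in range(n_iter):
        # E-step: responsibility of each member for each observation
        dens = np.exp(-0.5 * sq_err / var) / np.sqrt(2.0 * np.pi * var)
        mix = dens * weights
        total = mix.sum(axis=1, keepdims=True) + tiny
        resp = mix / total
        # M-step: update the weights and the shared variance
        weights = resp.mean(axis=0)
        weights = weights / weights.sum()
        var = max((resp * sq_err).sum() / n, 1e-12)
        # Stop when the log-likelihood no longer improves
        ll = np.log(total).sum()
        if abs(ll - prev_ll) < tol:
            break
        prev_ll = ll
    return weights, var


def bma_predict(preds, weights, var, z=1.96):
    """Return the BMA mean and an approximate interval.

    The interval treats the mixture as Gaussian with the mixture's
    total variance (between-member spread plus within-member variance),
    which is only an approximation to the true mixture quantiles.
    """
    preds = np.asarray(preds, dtype=float)
    mean = preds @ weights
    between = ((preds - mean[:, None]) ** 2) @ weights
    sd = np.sqrt(between + var)
    return mean, mean - z * sd, mean + z * sd


# Synthetic demonstration with three hypothetical members of
# decreasing accuracy; these are not the paper's data or models.
rng = np.random.default_rng(0)
truth = rng.normal(10.0, 2.0, size=200)
members = np.column_stack(
    [truth + rng.normal(0.0, s, size=200) for s in (0.3, 1.0, 3.0)]
)
weights, var = fit_bma(members, truth)
mean, lower, upper = bma_predict(members, weights, var)
print("weights:", np.round(weights, 3))
print("shared variance:", round(float(var), 4))
```

In this sketch the most accurate member receives the largest weight, and the predictive spread widens wherever the members disagree, which is how an ensemble of deterministic models yields a probabilistic forecast.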
Data availability
All data in this study are available from the corresponding author upon reasonable request.
Acknowledgements
The authors are grateful to the anonymous reviewers for their helpful comments and advice.
Funding
This work was supported by the National Natural Science Foundation of China (Project No. 52009037), the Natural Science Foundation of Hubei Province of China (Project No. 2020CFB291), the Outstanding Young and Middle-Aged Science and Technology Innovation Team Project of Colleges and Universities of Hubei Province (Project No. T2022010), and the Wuhan Knowledge Innovation Special Project (Project No. 2022020801020268).
Ethics declarations
Conflict of interest
The authors declare no competing interests.
About this article
Cite this article
Tian, M., Fan, H., Xiong, Z. et al. Data-driven ensemble model for probabilistic prediction of debris-flow volume using Bayesian model averaging. Bull Eng Geol Environ 82, 34 (2023). https://doi.org/10.1007/s10064-022-03050-x
DOI: https://doi.org/10.1007/s10064-022-03050-x