Abstract
Artificial intelligence (AI) is increasingly used to model landslide movement processes because it generalizes well and can accurately describe complex, nonlinear behavior. Identifying the key input variables is crucial to the robustness and accuracy of AI models, yet this step has received little attention thus far. In the present study, mutual information (MI)-based measures are proposed for input variable selection (IVS) and incorporated into optimized support vector regression (SVR) for the displacement prediction of seepage-driven landslides. The performance of optimized SVR models with ten MI-based IVS strategies is compared on a typical seepage-driven landslide. The experimental results indicate that IVS-based optimized SVR can significantly improve predictions. When the reduced input set was supplied to the artificial bee colony (ABC)-optimized SVR model, the mean normalized root mean square error (NRMSE) decreased by as much as 71.6% and the mean Kling-Gupta efficiency (KGE) increased by as much as 95.2% relative to the base model with all candidate inputs. Furthermore, the joint mutual information (JMI) and double input symmetrical relevance (DISR) criteria are recommended for IVS for seepage-driven landslides because they achieve the best tradeoff between accuracy and stability.
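To make the JMI criterion concrete, the following is a minimal sketch of greedy forward selection in which each candidate is scored by the sum of its joint mutual information with every already-selected input, following the general form given by Brown et al. (2012). It is not the authors' implementation: the function names (`discretize`, `jmi_select`), the equal-width binning, and the choice of 8 bins are assumptions made purely for illustration.

```python
import numpy as np

def discretize(x, bins=8):
    """Equal-width binning into integer labels 0..bins-1 (an assumed estimator choice)."""
    edges = np.linspace(x.min(), x.max(), bins + 1)[1:-1]
    return np.digitize(x, edges)

def entropy(labels):
    """Shannon entropy (in nats) of a 1-D array of integer labels."""
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return float(-np.sum(p * np.log(p)))

def joint(a, b):
    """Encode the pair (a, b) as a single integer label."""
    return a * (b.max() + 1) + b

def mutual_info(a, b):
    """I(A; B) = H(A) + H(B) - H(A, B) for discrete labels."""
    return entropy(a) + entropy(b) - entropy(joint(a, b))

def jmi_select(X, y, k, bins=8):
    """Greedy forward selection of k columns of X using the JMI score."""
    n_features = X.shape[1]
    Xd = np.column_stack([discretize(X[:, j], bins) for j in range(n_features)])
    yd = discretize(y, bins)

    # Start with the single most relevant input, argmax_j I(X_j; Y).
    relevance = [mutual_info(Xd[:, j], yd) for j in range(n_features)]
    selected = [int(np.argmax(relevance))]

    while len(selected) < k:
        best, best_score = None, -np.inf
        for j in range(n_features):
            if j in selected:
                continue
            # JMI score: sum over selected s of I((X_j, X_s); Y).
            score = sum(mutual_info(joint(Xd[:, j], Xd[:, s]), yd) for s in selected)
            if score > best_score:
                best, best_score = j, score
        selected.append(best)
    return selected
```

The selected column indices could then be used to subset the candidate inputs before fitting a regression model such as SVR, with hyperparameters tuned by an optimizer of choice. Histogram-based MI estimates are biased for small samples, so the bin count would need to be checked against the length of the monitoring record.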
References
Back AD, Trappenberg TP (2001) Selecting inputs for modeling using normalized higher order statistics and independent component analysis. IEEE Trans Neural Networks 12:612–617. https://doi.org/10.1109/72.925564
Brown G, Pocock A, Zhao M-J, Luján M (2012) Conditional likelihood maximisation: a unifying framework for information theoretic feature selection. J Mach Learn Res 13:27–66
Cao Y, Yin KL, Alexander DE, Zhou C (2016) Using an extreme learning machine to predict the displacement of step-like landslides in relation to controlling factors. Landslides 13:725–736. https://doi.org/10.1007/s10346-015-0596-z
Chen Y, Yuan H (2020) Evaluation of nine sub-daily soil moisture model products over China using high-resolution in situ observations. J Hydrol 588:125054. https://doi.org/10.1016/j.jhydrol.2020.125054
Darudi A, Rezaeifar S, Mohammd Hossein Javidi Dasht B (2013) Partial mutual information based algorithm for input variable selection for time series forecasting. 2013 13th International conference on environment and electrical engineering (EEEIC), 313–318
Deng ML, Zhou J, Yi QL, Zhang FL, Han B, Li ZJ (2020) Characteristics and mechanism of deformation of chair-shaped soil landslides in Three Gorges Reservoir area. Chinese J Geot Eng 42:1296–1303. https://doi.org/10.11779/cjge202007013
Du J, Yin KL, Lacasse S (2013) Displacement prediction in colluvial landslides, Three Gorges Reservoir, China. Landslides 10:203–218. https://doi.org/10.1007/s10346-012-0326-8
Gupta HV, Kling H, Yilmaz KK, Martinez GF (2009) Decomposition of the mean squared error and NSE performance criteria: implications for improving hydrological modelling. J Hydrol 377:80–91. https://doi.org/10.1016/j.jhydrol.2009.08.003
Huang FM, Huang JS, Jiang SH, Zhou CB (2017) Landslide displacement prediction based on multivariate chaotic model and extreme learning machine. Eng Geol 218:173–186. https://doi.org/10.1016/j.enggeo.2017.01.016
Huang D, Luo SL, Zhong Z, Gu DM, Song YX, Tomás R (2020) Analysis and modeling of the combined effects of hydrological factors on a reservoir bank slope in the Three Gorges Reservoir area, China. Eng Geol 279:105858. https://doi.org/10.1016/j.enggeo.2020.105858
Jiang P, Li R, Liu N, Gao Y (2020) A novel composite electricity demand forecasting framework by data processing and optimized support vector machine. Appl Energy 260:114243. https://doi.org/10.1016/j.apenergy.2019.114243
Karaboga D, Basturk B (2008) On the performance of artificial bee colony (ABC) algorithm. Appl Soft Comput 8:687–697. https://doi.org/10.1016/j.asoc.2007.05.007
Kennedy J, Eberhart R (1995) Particle swarm optimization. Proceedings of ICNN'95-international conference on neural networks, 1942–1948, vol.1944
Krkač M, Bernat Gazibara S, Arbanas Ž, Sečanj M, Mihalić Arbanas S (2020) A comparative study of random forests and multiple linear regression in the prediction of landslide velocity. Landslides. https://doi.org/10.1007/s10346-020-01476-6
Legates DR, McCabe GJ Jr (1999) Evaluating the use of “goodness-of-fit” measures in hydrologic and hydroclimatic model validation. Water Resour Res 35:233–241. https://doi.org/10.1029/1998WR900018
Li XZ, Kong JM (2014) Application of GA–SVM method with parameter optimization for landslide development prediction. Nat Hazards Earth Syst Sci 14:525–533. https://doi.org/10.5194/nhess-14-525-2014
Li LX, Dennis Cook R, Nachtsheim CJ (2005) Model-free variable selection. J R Stat Soc Series B Stat Methodol 67:285–299. https://doi.org/10.1111/j.1467-9868.2005.00502.x
Li HJ, Xu Q, He YS, Fan XM, Li SL (2019) Modeling and predicting reservoir landslide displacement with deep belief network and EWMA control charts: a case study in Three Gorges Reservoir. Landslides. https://doi.org/10.1007/s10346-019-01312-6
Li LW, Wu YP, Miao FS, Xue Y, Huang YP (2020a) A hybrid interval displacement forecasting model for reservoir colluvial landslides with step-like deformation characteristics considering dynamic switching of deformation states. Stochastic Environ Res Risk Assess. https://doi.org/10.1007/s00477-020-01914-w
Li WJ, Fang HY, Qin GX, Tan XQ, Huang ZW, Zeng FT, Du HW, Li SP (2020b) Concentration estimation of dissolved oxygen in Pearl River Basin using input variable selection and machine learning techniques. Sci Total Environ 731:139099. https://doi.org/10.1016/j.scitotenv.2020.139099
Li CD, Criss RE, Fu ZY, Long JJ, Tan QW (2021) Evolution characteristics and displacement forecasting model of landslides with stair-step sliding surface along the Xiangxi River, Three Gorges Reservoir region, China. Eng Geol 283:105961. https://doi.org/10.1016/j.enggeo.2020.105961
Liao K, Wu YP, Li LW, Miao FS, Xue Y (2019) Displacement prediction model of landslide based on time series and GWO-ELM. J Cent South Univ 50:619–626. https://doi.org/10.11817/j.issn.1672-7207.2019.03.015
Lu XS, Miao FS, Xie XX, Li DY, Xie YH (2021) A new method for displacement prediction of “step-like” landslides based on VMD-FOA-SVR model. Environ Earth Sci 80:542. https://doi.org/10.1007/s12665-021-09825-x
Ma JW, Tang HM, Hu XL, Bobet A, Zhang M, Zhu TW, Song YJ, Ez Eldin MAM (2017) Identification of causal factors for the Majiagou landslide using modern data mining methods. Landslides 14:311–322. https://doi.org/10.1007/s10346-016-0693-7
Ma JW, Tang HM, Liu X, Wen T, Zhang JR, Tan QW, Fan ZQ (2018) Probabilistic forecasting of landslide displacement accounting for epistemic uncertainty: a case study in the Three Gorges Reservoir area, China. Landslides 15:1145–1153. https://doi.org/10.1007/s10346-017-0941-5
Ma JW, Liu X, Niu XX, Wang YK, Wen T, Zhang JR, Zou ZX (2020a) Forecasting of landslide displacement using a probability-scheme combination ensemble prediction technique. Int J Environ Res Public Health 17:4788. https://doi.org/10.3390/ijerph17134788
Ma JW, Niu XX, Tang HM, Wang YK, Wen T, Zhang JR (2020b) Displacement prediction of a complex landslide in the Three Gorges Reservoir area (China) using a hybrid computational intelligence approach. Complexity 2020:2624547. https://doi.org/10.1155/2020/2624547
Malik A, Kumar A, Singh RP (2019) Application of heuristic approaches for prediction of hydrological drought using multi-scalar streamflow drought index. Water Resour Manage 33:3985–4006. https://doi.org/10.1007/s11269-019-02350-4
May RJ, Maier HR, Dandy GC, Fernando TMKG (2008) Non-linear variable selection for artificial neural networks using partial mutual information. Environ Modell Software 23:1312–1326. https://doi.org/10.1016/j.envsoft.2008.03.007
Miao FS, Wu YP, Xie YH, Li YN (2017) Prediction of landslide displacement with step-like behavior based on multialgorithm optimization and a support vector regression model. Landslides 15:475–488. https://doi.org/10.1007/s10346-017-0883-y
Miao FS, Wu YP, Li LW, Liao K, Xue Y (2020) Triggering factors and threshold analysis of Baishuihe landslide based on the data mining methods. Nat Hazards. https://doi.org/10.1007/s11069-020-04419-5
Niu DX, Wang KK, Sun LJ, Wu J, Xu XM (2020) Short-term photovoltaic power generation forecasting based on random forest feature selection and CEEMD: a case study. Appl Soft Comput 93:106389. https://doi.org/10.1016/j.asoc.2020.106389
Niu XX, Ma JW, Wang YK, Zhang JR, Chen HJ, Tang HM (2021) A novel decomposition-ensemble learning model based on ensemble empirical mode decomposition and recurrent neural network for landslide displacement prediction. Appl Sci 11:4684. https://doi.org/10.3390/app11104684
Pham LT, Luo L, Finley AO (2020) Evaluation of Random Forest for short-term daily streamflow forecast in rainfall and snowmelt driven watersheds. Hydrol Earth Syst Sci Discuss 2020:1–33. https://doi.org/10.5194/hess-2020-305
Pool S, Vis M, Seibert J (2018) Evaluating model performance: towards a non-parametric variant of the Kling-Gupta efficiency. Hydrol Sci J 63:1941–1953. https://doi.org/10.1080/02626667.2018.1552002
Qian CD, Rong XS, Tao MC, Lin SG (2014) Analysis on deformation influencing factors of Woshaxi landslide in the Three Gorges Reservoir Area. J China Three Gorges University (Natural Sciences) 36:66–70
Ren F, Wu XL, Zhang KX, Niu RQ (2014) Application of wavelet analysis and a particle swarm-optimized support vector machine to predict the displacement of the Shuping landslide in the Three Gorges, China. Environ Earth Sci 73:4791–4804. https://doi.org/10.1007/s12665-014-3764-x
Sassa K, Picarelli L, Yin YP (2009) Monitoring, prediction and early warning. In: Sassa K, Canuti P (eds) Landslides-Disaster Risk Reduction. Springer, Berlin, Heidelberg, pp 351–375
Senders JT, Arnaout O, Karhade AV, Dasenbrock HH, Gormley WB, Broekman ML, Smith TR (2018) Natural and artificial intelligence in neurosurgery: a systematic review. Neurosurgery 83:181–192. https://doi.org/10.1093/neuros/nyx384
Shamaei E, Kaedi M (2016) Suspended sediment concentration estimation by stacking the genetic programming and neuro-fuzzy predictions. Appl Soft Comput. https://doi.org/10.1016/j.asoc.2016.03.009
Snieder E, Shakir R, Khan UT (2020) A comprehensive comparison of four input variable selection methods for artificial neural network flow forecasting models. J Hydrol 583:124299. https://doi.org/10.1016/j.jhydrol.2019.124299
Solomatine DP, Shrestha DL (2009) A novel method to estimate model uncertainty using machine learning techniques. Water Resour Res. https://doi.org/10.1029/2008WR006839
Tang HM, Wasowski J, Juang CH (2019) Geohazards in the Three Gorges Reservoir area, China – Lessons learned from decades of research. Eng Geol 261:105267. https://doi.org/10.1016/j.enggeo.2019.105267
Tang GQ, Clark MP, Papalexiou SM, Ma ZQ, Hong Y (2020) Have satellite precipitation products improved over last two decades? A comprehensive comparison of GPM IMERG with nine satellite and reanalysis datasets. Remote Sens Environ 240:111697. https://doi.org/10.1016/j.rse.2020.111697
Vasu NN, Lee S-R (2016) A hybrid feature selection algorithm integrating an extreme learning machine for landslide susceptibility modeling of Mt. Woomyeon, South Korea. Geomorphology. https://doi.org/10.1016/j.geomorph.2016.03.023
Wang L (2014) Research of recurrence mechanism and prediction of Shuping landslide under water level variation and rainfall in Three Gorges Reservoir area, China Three Gorges University
Wang FF (2021) Research on the model and application progress based on grey relational analysis theory. Adv Edu Tech Psychol 5:30–35. https://doi.org/10.23977/aetp.2021.52006
Wang YK, Tang HM, Wen T, Ma JW (2019) A hybrid intelligent approach for constructing landslide displacement prediction intervals. Appl Soft Comput 81:105506. https://doi.org/10.1016/j.asoc.2019.105506
Wang JE, Schweizer D, Liu QB, Su AJ, Hu X, Blum P (2021) Three-dimensional landslide evolution model at the Yangtze River. Eng Geol 292:106275. https://doi.org/10.1016/j.enggeo.2021.106275
Wang YK, Tang HM, Huang JS, Wen T, Ma JW, Zhang JR (2022) A comparative study of different machine learning methods for reservoir landslide displacement prediction. Eng Geol 298:106544. https://doi.org/10.1016/j.enggeo.2022.106544
Wen T, Tang HM, Wang YK, Lin CY, Xiong CR (2017) Landslide displacement prediction using the GA-LSSVM model and time series analysis: a case study of Three Gorges Reservoir, China. Nat Hazards Earth Syst Sci. https://doi.org/10.5194/nhess-17-2181-2017
Xing Y, Yue J, Chen C, Qin Y, Hu J (2020) A hybrid prediction model of landslide displacement with risk-averse adaptation. Comput Geosci 141:104527. https://doi.org/10.1016/j.cageo.2020.104527
Xu Q, Tang MG (2019) Data for activity law and hydraulics mechanism of landslides with different sliding surface and permeability in the Three Gorges Reservoir Area, China. Mendeley Data. https://doi.org/10.17632/rc7wj3cmd3.1
Xue Y, Wu YP, Miao FS, Li LW, Liao K, Ou GZ (2020a) Effect of spatially variable saturated hydraulic conductivity with non-stationary characteristics on the stability of reservoir landslides. Stochastic Environ Res Risk Assess 34:311–329. https://doi.org/10.1007/s00477-020-01777-1
Xue ZW, Zhang Y, Cheng C, Ma GJ (2020b) Remaining useful life prediction of lithium-ion batteries with adaptive unscented Kalman filter and optimized support vector regression. Neurocomputing 376:95–102. https://doi.org/10.1016/j.neucom.2019.09.074
Zhang S, Ma JW, Tang HM (2020a) Estimation of risk thresholds for a landslide in the Three Gorges reservoir based on a KDE-Copula-VaR approach. Geofluids 2020:8030264. https://doi.org/10.1155/2020/8030264
Zhang YA, Yan BB, Aasma M (2020b) A novel deep learning framework: prediction and analysis of financial time series using CEEMD and LSTM. Expert Syst Appl 159:113609. https://doi.org/10.1016/j.eswa.2020.113609
Zhang JR, Tang HM, Tannant DD, Lin CY, Xia D, Liu X, Zhang YQ, Ma JW (2021) Combined forecasting model with CEEMD-LCSS reconstruction and the ABC-SVR method for landslide displacement prediction. J Cleaner Prod. https://doi.org/10.1016/j.jclepro.2021.126205
Zhou XW (2007) The study on the grey relational degree and its application, Jilin University
Zhou C, Yin KL, Cao Y, Ahmed B (2016) Application of time series analysis and PSO-SVM model in predicting the Bazimen landslide in the Three Gorges Reservoir, China. Eng Geol 204:108–120. https://doi.org/10.1016/j.enggeo.2016.02.009
Acknowledgements
This research was funded by the Major Program of the National Natural Science Foundation of China (Grant No. 42090055), the National Natural Science Foundation of China (Grant Nos. 42177147 and 41702328), the Science and Technology Project of the Huaneng Lancang River Hydropower Co., Ltd. (Grant No. HNKJ18-H24), Open Fund of Key Laboratory of Exploration Technologies for Oil and Gas Resources (Yangtze University), Ministry of Education (No. K2021-11), and the Chongqing Geo-disaster Prevention and Control Center (Grant No. 20C0023). We thank the National Cryosphere Desert Data Center (http://www.ncdc.ac.cn) for providing landslide displacement, rainfall, and reservoir water level data. All support is gratefully acknowledged.
Ethics declarations
Conflict of interest
We have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Appendices
Appendix 1
Selected publications that use IVS with AI methods for landslide displacement prediction.
Reference | Algorithm | Basis of IVS | Selected input variables | Evaluation criteria |
---|---|---|---|---|
Du et al. (2013) | Back-propagation neural network (BP) | Empirical expert knowledge | Accumulated precipitation in the previous month, accumulated precipitation in the past two months, variations in the reservoir level during the previous month, accumulated displacement increment for 1 year, and average reservoir level in the current month | Average RE: 1.3/3.04% |
Li and Kong (2014) | Genetic algorithm-optimized support vector machine (GA–SVM) | Empirical expert knowledge | Average displacement rate, average reservoir level, accumulated precipitation, and average monthly groundwater flow in the current month | RMSE: 0.0009 mm; R: 0.9992 |
Ren et al. (2014) | Particle swarm-optimized support vector machine (PSO-SVM) | Empirical expert knowledge | Accumulated precipitation in the previous month, accumulated precipitation in the past two months, maximum continuous decrement in reservoir level during the current month, and variation in the reservoir water level during the current month | MSE: 2.45/10.64 mm; R²: 0.945/0.981 |
Cao et al. (2016) | Extreme learning machine (ELM) | Empirical expert knowledge and grey correlation coefficient | Accumulated precipitation in the past 1 and 2 months, average reservoir level in the current month, variation in the reservoir level during the previous month, groundwater level in the current month, and displacement in the past 1, 2 and 3 months | RE: 0.69–5.94%; R: 0.984 |
Zhou et al. (2016) | Particle swarm-optimized support vector machine (PSO-SVM) | Empirical expert knowledge | Accumulated precipitation in the current month and over the past 2 months, average reservoir level in the current month, variation in the reservoir level in the current month, and displacement in the past 1, 2 and 3 months | MAPE: 13.28%; RMSE: 25.95 mm |
Huang et al. (2017) | Multivariate chaotic model and extreme learning machine (Multivariate chaotic ELM) | Empirical expert knowledge | Periodic displacement in the past 1, 2, 3, and 4 months, accumulated precipitation in the past 1 and 2 months, and variation in the reservoir level in the past 1 and 2 months | RMSE: 23.71 mm; R²: 0.908 |
Wen et al. (2017) | Particle swarm-optimized support vector machine (PSO-SVM) | Empirical expert knowledge and grey correlation coefficient | Cumulative precipitation in the current and past 2 months, reservoir water level, variation in the reservoir level in the current and past 2 months, and groundwater depth | RMSE: 49.0485 mm; MAE: 48.5392 mm; MAPE: 3.131% |
Ma et al. (2018) | Bootstrap, extreme learning machine, and artificial neural network (Bootstrap-ELM-ANN) | Empirical expert knowledge | Accumulated precipitation in the past 1 and 2 months, average reservoir level in the current month, variation in the reservoir level in the current month, and displacement in the past 1, 2, and 3 months | PICP: 97.5% |
Li et al. (2019) | Continuous wavelet analysis and deep belief network (CWT-DBN) | p values less than 0.05 | First two monthly precipitation lags, eighth to tenth monthly lags in reservoir water level fluctuations, second and fourth monthly lags for precipitation, and first, seventh, and eighth lags in reservoir water level fluctuations, and historic lags in past-month displacement | MAE: 6.89 mm; MAPE: 14.59%; RMSE: 24.63 mm |
Wang et al. (2019) | Lower and upper bound estimation, double exponential smoothing, and particle swarm-optimized extreme learning machine (LUBE-DES-PSO-ELM) | Partial autocorrelation function and maximum information coefficient | Displacement in the past 1 and 2 months, average reservoir level during the current month versus that in the previous one, two and three months, and variation in the reservoir level in the current and past 2 months | AIC: −76.77 |
Xing et al. (2020) | Double moving average, support vector machine, and long short-term memory (DES-SVR-LSTM) | Grey correlation coefficient | Maximum daily rainfall in the current month, accumulated rainfall in the current month and over the past 1 and 2 months, reservoir level in the current month, variation in the reservoir level in the current and past 2 months, displacement in the past 1, 2, and 3 months | RMSE: 6.18 mm; MAE: 4.97 mm |
Li et al. (2021) | Greedy algorithm, complementary ensemble empirical mode decomposition, and random forest (GA-CEEMD-RF) | Empirical expert knowledge and grey correlation coefficient | Accumulated precipitation in the past 1 and 2 months, average precipitation in the first three months, average reservoir level in the current month, maximum variation in reservoir level in the current month and the first two months, and the high-frequency components of accumulated precipitation in the past month and reservoir level in the current month | R²: 0.98 |
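Several of the point-forecast criteria listed in the table (RMSE, MAE, MAPE, and R), together with the NRMSE and KGE used in this study, can be computed as in the sketch below. The function names are illustrative, and the NRMSE normalization by the range of the observations is an assumption, since other conventions (mean or standard deviation) are also common. The KGE follows the decomposition of Gupta et al. (2009).

```python
import numpy as np

def rmse(obs, sim):
    """Root mean square error, in the units of the data."""
    obs, sim = np.asarray(obs, float), np.asarray(sim, float)
    return float(np.sqrt(np.mean((sim - obs) ** 2)))

def mae(obs, sim):
    """Mean absolute error, in the units of the data."""
    obs, sim = np.asarray(obs, float), np.asarray(sim, float)
    return float(np.mean(np.abs(sim - obs)))

def mape(obs, sim):
    """Mean absolute percentage error (%); undefined if any observation is zero."""
    obs, sim = np.asarray(obs, float), np.asarray(sim, float)
    return float(100.0 * np.mean(np.abs((sim - obs) / obs)))

def pearson_r(obs, sim):
    """Pearson correlation coefficient R."""
    return float(np.corrcoef(obs, sim)[0, 1])

def nrmse(obs, sim):
    """RMSE divided by the observed range (assumed normalization)."""
    obs = np.asarray(obs, float)
    return rmse(obs, sim) / float(obs.max() - obs.min())

def kge(obs, sim):
    """Kling-Gupta efficiency: 1 - sqrt((r-1)^2 + (alpha-1)^2 + (beta-1)^2)."""
    obs, sim = np.asarray(obs, float), np.asarray(sim, float)
    r = np.corrcoef(obs, sim)[0, 1]
    alpha = sim.std() / obs.std()    # variability ratio
    beta = sim.mean() / obs.mean()   # bias ratio
    return float(1.0 - np.sqrt((r - 1) ** 2 + (alpha - 1) ** 2 + (beta - 1) ** 2))
```

Lower RMSE, MAE, MAPE, and NRMSE indicate better agreement, whereas R and KGE approach 1 for a perfect forecast.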
Appendix 2
Typical seepage-driven landslides in the TGRA.
Landslide | Location | Features | Landslide mass materials and permeability coefficient | Description of landslide movement |
---|---|---|---|---|
Baishuihe landslide | Zigui County, right bank of the Yangtze River 110°36′35″, 31°01′24″ | Length: 700 m; Width: 600 m; Average thickness: 30 m; Volume: 1260 × 10⁴ m³; The elevations of the landslide toe and crown are 70 and 410 m, respectively | Silty clay with gravels with a permeability coefficient of 0.55 m/d (Xue et al. 2020a) | Ancient landslide reactivated in June 2003 and continuously deformed; on June 30, 2007, an approximately 100,000 m³ collapse occurred in the warning zone; on July 13, 2007, the horizontal displacement reached 1760.7 mm (XD-03), with velocities of 26.2–50.85 mm/d |
Woshaxi landslide | Zigui County, right bank of the Yangtze River 110°35′42.4″, 30°58′03.9″ | Length: 400 m; Width: 700 m; Average thickness: 15 m; Volume: 420 × 10⁴ m³; The elevations of the landslide toe and crown are 140 and 405 m, respectively | Silty clay with gravels with a permeability coefficient of 0.06–0.43 m/d (Qian et al. 2014) | On February 24, 2007, a secondary landslide in the middle of the landslide front edge occurred; in May 2008, cracks with a length of 500 m and a vertical offset of 0.30 m appeared at the trailing edge; in May 2009, three new cracks with vertical offsets of 0.30–0.50 m appeared at the boundary of the landslide; on May 26, 2009, landslide displacement reached 3.4 m, with a velocity of 1.96 m/d |
Shuping landslide | Zigui County, right bank of the Yangtze River 110°36′46″, 30°59′38″ | Length: 800 m; Width: 700 m; Thickness: 30–70 m; Volume: 2700 × 10⁴ m³; The elevations of the landslide toe and crown are 65 and 400 m, respectively | Silty clay with gravels with a permeability coefficient of 0.78 m/d (Wang 2014) | Ancient landslide reactivated in June 2003 and continuously deformed; seven collapses and eleven cracks appeared at the landslide surface; in June 2012, the landslide velocity reached 15 mm/d; cracks at the trailing edge have a maximum crack aperture of 0.2 m and a vertical offset of 0.18 m; en echelon cracks have appeared at the eastern boundary |
Baijiabao landslide | Zigui County, right bank of Xiangxi River 110°45′29″, 30°59′00″ | Length: 550 m; Width: 400 m; Average thickness: 45 m; Volume: 990 × 10⁴ m³; The elevations of the landslide toe and crown are 125 and 272 m, respectively | Silty clay with gravels with a permeability coefficient of 0.0273–0.0455 m/d (Deng et al. 2020) | The Baijiabao landslide only displayed low creep rates prior to the impoundment of the TGRA, but significant movements occurred shortly afterward. In June 2007, surface cracks first appeared at the road on the right side of the landslide, reaching approximately 160 m in length. In July 2008, cracks continued to appear in the middle of the road and at the boundary of the landslide. In June 2009, the cracks on each side began to connect at the trailing edge. In June 2010, new tension cracks occurred on each side. In July 2011, cracks appeared in the middle part and at the boundary on each side |
Cite this article
Ma, J., Wang, Y., Niu, X. et al. A comparative study of mutual information-based input variable selection strategies for the displacement prediction of seepage-driven landslides using optimized support vector regression. Stoch Environ Res Risk Assess 36, 3109–3129 (2022). https://doi.org/10.1007/s00477-022-02183-5
DOI: https://doi.org/10.1007/s00477-022-02183-5