A comparative study of mutual information-based input variable selection strategies for the displacement prediction of seepage-driven landslides using optimized support vector regression

  • Original Paper
Stochastic Environmental Research and Risk Assessment

Abstract

Artificial intelligence (AI) is increasingly used to model landslide movement because it generalizes well and can describe complex, nonlinear behavior. Identifying the key input variables is a crucial step in ensuring robust and accurate AI models, yet this topic has received little attention thus far. In the present study, mutual information (MI)-based measures are proposed for input variable selection (IVS) and incorporated into optimized support vector regression (SVR) for the displacement prediction of seepage-driven landslides. The performance of optimized SVR models with ten MI-based IVS strategies is compared on a typical seepage-driven landslide. The experimental results indicate that IVS-based optimized SVR can significantly improve predictions. When the reduced set of input variables was fed into the optimized artificial bee colony (ABC)-SVR model, the mean normalized root mean square error (NRMSE) decreased by as much as 71.6% and the mean Kling-Gupta efficiency (KGE) increased by as much as 95.2%, relative to the base model with all candidate variables. Furthermore, the joint mutual information (JMI) and double input symmetrical relevance (DISR) criteria are recommended for IVS for seepage-driven landslides because they achieve the best tradeoff between accuracy and stability.
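To make the JMI criterion named in the abstract more concrete, the following Python sketch shows one common way greedy forward selection with a JMI score can be written. It is an illustration under stated assumptions, not the authors' implementation: the equal-width binning, the plug-in MI estimator, and the function names `discretize`, `mutual_info`, and `jmi_select` are all hypothetical choices made here.

```python
import numpy as np

def discretize(x, bins=8):
    """Map a continuous series to integer labels 0..bins-1 (equal-width bins, an assumption)."""
    edges = np.linspace(x.min(), x.max(), bins + 1)[1:-1]
    return np.digitize(x, edges)

def mutual_info(a, b):
    """Plug-in estimate of I(a; b) in nats for two integer-labelled arrays."""
    joint = np.zeros((a.max() + 1, b.max() + 1))
    np.add.at(joint, (a, b), 1)          # count co-occurrences
    joint /= joint.sum()                 # joint probabilities
    pa = joint.sum(axis=1, keepdims=True)
    pb = joint.sum(axis=0, keepdims=True)
    nz = joint > 0                       # skip empty cells to avoid log(0)
    return float(np.sum(joint[nz] * np.log(joint[nz] / (pa @ pb)[nz])))

def jmi_select(X, y, k, bins=8):
    """Greedy forward selection; each candidate c is scored by sum_j I((x_c, x_j); y)."""
    n_features = X.shape[1]
    Xd = np.column_stack([discretize(X[:, i], bins) for i in range(n_features)])
    yd = discretize(y, bins)
    # Start from the candidate with the highest marginal relevance I(x; y).
    relevance = [mutual_info(Xd[:, i], yd) for i in range(n_features)]
    selected = [int(np.argmax(relevance))]
    while len(selected) < k:
        best, best_score = None, -np.inf
        for c in range(n_features):
            if c in selected:
                continue
            # Encode each pair (x_c, x_j) as a single joint label before measuring MI with y.
            score = sum(mutual_info(Xd[:, c] * bins + Xd[:, j], yd) for j in selected)
            if score > best_score:
                best, best_score = c, score
        selected.append(best)
    return selected
```

Calling `jmi_select(X, y, k)` on a candidate matrix `X` (one column per lagged rainfall, reservoir level, or displacement variable) returns the column indices in the order they were picked. DISR follows the same loop but divides each pairwise term by the joint entropy of the pair and the target, which this sketch does not implement.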

Acknowledgements

This research was funded by the Major Program of the National Natural Science Foundation of China (Grant No. 42090055), the National Natural Science Foundation of China (Grant Nos. 42177147 and 41702328), the Science and Technology Project of the Huaneng Lancang River Hydropower Co., Ltd. (Grant No. HNKJ18-H24), the Open Fund of the Key Laboratory of Exploration Technologies for Oil and Gas Resources (Yangtze University), Ministry of Education (No. K2021-11), and the Chongqing Geo-disaster Prevention and Control Center (Grant No. 20C0023). We thank the National Cryosphere Desert Data Center (http://www.ncdc.ac.cn) for providing landslide displacement, rainfall, and reservoir water level data. All support is gratefully acknowledged.

Author information

Corresponding author

Correspondence to Junwei Ma.

Ethics declarations

Conflict of interest

We have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendices

Appendix 1

Selected publications that use IVS with AI methods for landslide displacement prediction.

| Reference | Algorithm | Basis of IVS | Selected input variables | Evaluation criteria |
|---|---|---|---|---|
| Du et al. (2013) | Back-propagation neural network (BP) | Empirical expert knowledge | Accumulated precipitation in the previous month, accumulated precipitation in the past two months, variations in the reservoir level during the previous month, accumulated displacement increment for 1 year, and average reservoir level in the current month | Average RE: 1.3/3.04% |
| Li and Kong (2014) | Genetic algorithm-optimized support vector machine (GA–SVM) | Empirical expert knowledge | Average displacement rate, average reservoir level, accumulated precipitation, and average monthly groundwater flow in the current month | RMSE: 0.0009 mm; R: 0.9992 |
| Ren et al. (2014) | Particle swarm-optimized support vector machine (PSO-SVM) | Empirical expert knowledge | Accumulated precipitation in the previous month, accumulated precipitation in the past two months, maximum continuous decrement in reservoir level during the current month, and variation in the reservoir water level during the current month | MSE: 2.45/10.64 mm; R²: 0.945/0.981 |
| Cao et al. (2016) | Extreme learning machine (ELM) | Empirical expert knowledge and grey correlation coefficient | Accumulated precipitation in the past 1 and 2 months, average reservoir level in the current month, variation in the reservoir level during the previous 1 month, groundwater level in the current month, and displacement in the past 1, 2 and 3 months | RE: 0.69–5.94%; R: 0.984 |
| Zhou et al. (2016) | Particle swarm-optimized support vector machine (PSO-SVM) | Empirical expert knowledge | Accumulated precipitation in the current month and over the past 2 months, average reservoir level in the current month, variation in the reservoir level in the current month, and displacement in the past 1, 2 and 3 months | MAPE: 13.28%; RMSE: 25.95 mm |
| Huang et al. (2017) | Multivariate chaotic model and extreme learning machine (Multivariate chaotic ELM) | Empirical expert knowledge | Periodic displacement in the past 1, 2, 3, and 4 months, accumulated precipitation in the past 1 and 2 months, and variation in the reservoir level in the past 1 and 2 months | RMSE: 23.71 mm; R²: 0.908 |
| Wen et al. (2017) | Particle swarm-optimized support vector machine (PSO-SVM) | Empirical expert knowledge and grey correlation coefficient | Cumulative precipitation in the current and past 2 months, reservoir water level, variation in the reservoir level in the current and past 2 months, and groundwater depth | RMSE: 49.0485 mm; MAE: 48.5392 mm; MAPE: 3.131% |
| Ma et al. (2018) | Bootstrap, extreme learning machine, and artificial neural network (Bootstrap-ELM-ANN) | Empirical expert knowledge | Accumulated precipitation in the past 1 and 2 months, average reservoir level in the current month, variation in the reservoir level in the current month, and displacement in the past 1, 2, and months | PICP: 97.5% |
| Li et al. (2019) | Continuous wavelet analysis and deep belief network (CWT-DBN) | p values less than 0.05 | First two monthly precipitation lags, eighth to tenth monthly lags in reservoir water level fluctuations, second and fourth monthly lags for precipitation, and first, seventh, and eighth lags in reservoir water level fluctuations, and historic lags in past-month displacement | MAE: 6.89 mm; MAPE: 14.59%; RMSE: 24.63 mm |
| Wang et al. (2019) | Lower and upper bound estimation, double exponential smoothing, and particle swarm-optimized extreme learning machine (LUBE-DES-PSO-ELM) | Partial autocorrelation function and maximum information coefficient | Displacement in the past 1 and 2 months, average reservoir level during the current month versus that in the previous one, two and three months, and variation in the reservoir level in the current and past 2 months | AIC: −76.77 |
| Xing et al. (2020) | Double moving average, support vector machine, and long short-term memory (DES-SVR-LSTM) | Grey correlation coefficient | Maximum daily rainfall in the current month, accumulated rainfall in the current month and over the past 1 and 2 months, reservoir level in current month, variation in the reservoir level in the current and past 2 months, displacement in the past 1, 2, and 3 months | RMSE: 6.18 mm; MAE: 4.97 mm |
| Li et al. (2021) | Greedy algorithm, complementary ensemble empirical mode decomposition, and random forest (GA-CEEMD-RF) | Empirical expert knowledge and grey correlation coefficient | Accumulated precipitation in the past 1 and 2 months, average precipitation in the first three months, average reservoir level in the current month, maximum variation in reservoir level in the current month and the first two months, and the high-frequency components of accumulated precipitation in the past one month and reservoir level in the current month | R²: 0.98 |

AIC: Average interval score; MAE: Mean absolute error; MAPE: Mean absolute percentage error; MSE: Mean squared error; PICP: Prediction interval coverage probability; R: Correlation coefficient; R²: Coefficient of determination; RE: Relative error; RMSE: Root mean square error.
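Several of the evaluation criteria abbreviated above, along with the NRMSE and KGE reported in the abstract, have standard textbook definitions. The Python sketch below writes a few of them out for reference. It is not taken from any of the cited studies: the function names are illustrative, NRMSE is normalized by the observed range as an assumption (normalizing by the mean or standard deviation is also common), R² is taken as 1 minus the ratio of residual to total sum of squares, and KGE uses the commonly cited three-component form built from correlation, variability ratio, and bias ratio.

```python
import numpy as np

def rmse(obs, sim):
    """Root mean square error."""
    obs, sim = np.asarray(obs, dtype=float), np.asarray(sim, dtype=float)
    return float(np.sqrt(np.mean((sim - obs) ** 2)))

def nrmse(obs, sim):
    """RMSE divided by the observed range (assumed convention)."""
    obs = np.asarray(obs, dtype=float)
    return rmse(obs, sim) / float(obs.max() - obs.min())

def mae(obs, sim):
    """Mean absolute error."""
    obs, sim = np.asarray(obs, dtype=float), np.asarray(sim, dtype=float)
    return float(np.mean(np.abs(sim - obs)))

def mape(obs, sim):
    """Mean absolute percentage error, in percent; undefined if any observation is zero."""
    obs, sim = np.asarray(obs, dtype=float), np.asarray(sim, dtype=float)
    return float(100.0 * np.mean(np.abs((sim - obs) / obs)))

def r2(obs, sim):
    """Coefficient of determination as 1 - SS_res / SS_tot."""
    obs, sim = np.asarray(obs, dtype=float), np.asarray(sim, dtype=float)
    ss_res = np.sum((obs - sim) ** 2)
    ss_tot = np.sum((obs - obs.mean()) ** 2)
    return float(1.0 - ss_res / ss_tot)

def kge(obs, sim):
    """Kling-Gupta efficiency: 1 - sqrt((r-1)^2 + (alpha-1)^2 + (beta-1)^2)."""
    obs, sim = np.asarray(obs, dtype=float), np.asarray(sim, dtype=float)
    r = np.corrcoef(obs, sim)[0, 1]      # linear correlation
    alpha = sim.std() / obs.std()        # variability ratio
    beta = sim.mean() / obs.mean()       # bias ratio
    return float(1.0 - np.sqrt((r - 1) ** 2 + (alpha - 1) ** 2 + (beta - 1) ** 2))

def picp(obs, lower, upper):
    """Share of observations inside [lower, upper], in percent."""
    obs = np.asarray(obs, dtype=float)
    inside = (obs >= np.asarray(lower)) & (obs <= np.asarray(upper))
    return float(100.0 * np.mean(inside))
```

Lower values are better for RMSE, NRMSE, MAE, and MAPE, while R² and KGE both equal 1 for a perfect prediction. Values reported by different studies are only comparable when they share the same convention and units.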

Appendix 2

Typical seepage-driven landslides in the Three Gorges Reservoir area (TGRA).

| Landslide | Location | Features | Landslide mass materials and permeability coefficient | Description of landslide movement |
|---|---|---|---|---|
| Baishuihe landslide | Zigui County, right bank of the Yangtze River; 110°36′35″, 31°01′24″ | Length: 700 m; Width: 600 m; Average thickness: 30 m; Volume: 1260 × 10⁴ m³; The elevations of the landslide toe and crown are 70 and 410 m, respectively | Silty clay with gravels with a permeability coefficient of 0.55 m/d (Xue et al. 2020a) | Ancient landslide reactivated in June 2003 and continuously deformed; on June 30, 2007, an approximately 100,000 m³ collapse occurred in the warning zone; on July 13, 2007, the horizontal displacement reached 1760.7 mm (XD-03), with velocities of 26.2–50.85 mm/d |
| Woshaixi landslide | Zigui County, right bank of the Yangtze River; 110°35′42.4″, 30°58′03.9″ | Length: 400 m; Width: 700 m; Average thickness: 15 m; Volume: 420 × 10⁴ m³; The elevations of the landslide toe and crown are 140 and 405 m, respectively | Silty clay with gravels with a permeability coefficient of 0.06–0.43 m/d (Qian et al. 2014) | On February 24, 2007, a secondary landslide in the middle of the landslide front edge occurred; in May 2008, cracks with a length of 500 m and a vertical offset of 0.30 m appeared at the trailing edge; in May 2009, three new cracks with vertical offsets of 0.30–0.50 m appeared at the boundary of the landslide; on May 26, 2009, landslide displacement reached 3.4 m, with velocity of 1.96 m/d |
| Shuping landslide | Zigui County, right bank of the Yangtze River; 110°36′46″, 30°59′38″ | Length: 800 m; Width: 700 m; Thickness: 30–70 m; Volume: 2700 × 10⁴ m³; The elevations of the landslide toe and crown are 65 and 400 m, respectively | Silty clay with gravels with a permeability coefficient of 0.78 m/d (Wang 2014) | Ancient landslide reactivated in June 2003 and continuously deformed; seven collapses and eleven cracks appeared at the landslide surface; in June 2012, the landslide velocity reached 15 mm/d; cracks at the trailing edge have a maximum crack aperture of 0.2 m and a vertical offset of 0.18 m; en echelon cracks have appeared at the eastern boundary |
| Baijiabao landslide | Zigui County, right bank of Xiangxi River; 110°45′29″, 30°59′00″ | Length: 550 m; Width: 400 m; Average thickness: 45 m; Volume: 990 × 10⁴ m³; The elevations of the landslide toe and crown are 125 and 272 m, respectively | Silty clay with gravels with a permeability coefficient of 0.0273–0.0455 m/d (Deng et al. 2020) | The Baijiabao landslide only displayed low creep rates prior to the impoundment of the TGRA, but significant movements occurred shortly afterward. In June 2007, surface cracks first appeared at the road on the right side of the landslide, reaching approximately 160 m in length. In July 2008, cracks continued to appear in the middle of the road and at the boundary of the landslide. In June 2009, the cracks on each side began to connect at the trailing edge. In June 2010, new tension cracks occurred on each side. In July 2011, cracks appeared in the middle part and at the boundary on each side |

About this article

Cite this article

Ma, J., Wang, Y., Niu, X. et al. A comparative study of mutual information-based input variable selection strategies for the displacement prediction of seepage-driven landslides using optimized support vector regression. Stoch Environ Res Risk Assess 36, 3109–3129 (2022). https://doi.org/10.1007/s00477-022-02183-5

  • DOI: https://doi.org/10.1007/s00477-022-02183-5
