Application of classification coupled with PCA and SMOTE, for obtaining safety factor of landslide based on HRA

  • Original Paper
Bulletin of Engineering Geology and the Environment

Abstract

Machine learning algorithms have recently been applied to build landslide susceptibility maps. The objective of this study is to determine whether machine learning classification algorithms are suitable for obtaining the safety factor from a high-risk-area (HRA) model composed of eight geotechnical properties. Each property value serves as an input to the machine learning model, and the safety factor is the output. Because the data have a discontinuous pattern, they are preprocessed with label encoding and transformed into continuous data. The DT, KNN, LR, RF, and SVM algorithms are selected to perform the classification with a training-to-validation ratio of 7:3. To improve the reliability of the results, the classification is also performed after applying principal component analysis (PCA), which reduces the eight dimensions to two principal components. In addition, the classes are oversampled to equal size using the synthetic minority oversampling technique (SMOTE) to address the class imbalance, and the classification results are compared. PCA shows a limited ability to reflect the characteristics of the original data, whereas the data oversampled with SMOTE provide high reliability. The results show that RF is suitable for classifying safety factors in the range of 1.2–2.4 with high accuracy. This study demonstrates that even discontinuous data can be classified after appropriate preprocessing, and that SMOTE can improve the accuracy of landslide risk mapping.
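The two preprocessing steps named in the abstract, label encoding and SMOTE oversampling, can be illustrated with a minimal sketch. This is not the authors' implementation, which the abstract does not show. The function names `label_encode` and `smote`, the neighbour count `k=3`, and the toy soil data are all assumptions made for illustration, and the sketch uses only the Python standard library instead of a dedicated machine learning package.

```python
import random
from collections import Counter, defaultdict


def label_encode(values):
    """Map each distinct category to an integer code (sorted order)."""
    codes = {v: i for i, v in enumerate(sorted(set(values)))}
    return [codes[v] for v in values], codes


def smote(X, y, k=3, seed=0):
    """Oversample every class to the size of the largest class.

    Each synthetic point lies on the segment between a randomly chosen
    sample and one of its k nearest neighbours from the same class.
    """
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for row, label in zip(X, y):
        by_class[label].append(row)
    target = max(len(rows) for rows in by_class.values())

    X_out, y_out = list(X), list(y)
    for label, rows in by_class.items():
        if len(rows) < 2:
            continue  # no same-class neighbour to interpolate toward
        for _ in range(target - len(rows)):
            base = rng.choice(rows)
            # Rank same-class neighbours by squared Euclidean distance.
            others = [r for r in rows if r is not base]
            others.sort(key=lambda r: sum((a - b) ** 2 for a, b in zip(base, r)))
            neighbour = rng.choice(others[:k])
            gap = rng.random()  # position along the segment, in [0, 1)
            X_out.append([a + gap * (b - a) for a, b in zip(base, neighbour)])
            y_out.append(label)
    return X_out, y_out


# Hypothetical toy data: one categorical and one numeric property.
soil = ["clay", "sand", "sand", "silt", "clay", "sand", "silt", "sand"]
depth = [1.0, 2.0, 2.5, 1.5, 1.2, 3.0, 1.8, 2.2]
soil_codes, mapping = label_encode(soil)
X = [[float(c), d] for c, d in zip(soil_codes, depth)]
# Hypothetical safety-factor classes; "low" is the minority class.
y = ["low", "high", "high", "low", "low", "high", "high", "high"]

X_bal, y_bal = smote(X, y)
print(Counter(y), "->", Counter(y_bal))
```

In this toy set the minority class grows from three to five samples, so both classes end up the same size. Interpolating label-encoded columns yields non-integer codes, which is only meaningful if the encoded values are treated as continuous, as the abstract describes.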

Data availability

All data, models, or code generated or used during the study are available from the corresponding author by request.

Funding

This research was supported by the Daejeon University Research Grants (2023).

Author information

Contributions

Sewon Kim: methodology, software, formal analysis. Hyung-Koo Yoon: conceptualization, data processing, manuscript writing.

Corresponding author

Correspondence to Hyung-Koo Yoon.

Ethics declarations

Competing interests

The authors declare no competing interests.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Kim, S., Yoon, HK. Application of classification coupled with PCA and SMOTE, for obtaining safety factor of landslide based on HRA. Bull Eng Geol Environ 82, 381 (2023). https://doi.org/10.1007/s10064-023-03403-0

  • DOI: https://doi.org/10.1007/s10064-023-03403-0
