Spatial prediction of soil contamination based on machine learning: a review

  • Review Article
  • Frontiers of Environmental Science & Engineering

Abstract

Soil pollution levels can be quantified through sampling and experimental analysis; however, because funding and personnel are limited, samples are collected at discrete, widely spaced points, which is insufficient to characterize an entire study area. Spatial prediction is therefore required to comprehensively investigate potentially contaminated areas. Machine learning models, which can capture complex nonlinear relationships between a variety of environmental conditions and soil contamination, have recently become popular tools for predicting soil pollution. This study reviews the characteristics, advantages, and applications of machine learning models used to predict soil pollution. Satisfactory model performance generally requires the following: 1) selection of the most appropriate model with the required structure; 2) selection of appropriate independent variables related to pollutant sources and pathways to improve model interpretability; 3) improvement of model reliability through comprehensive model evaluation; and 4) integration of geostatistics with the machine learning model. As environmental data become richer and algorithms continue to develop, machine learning is expected to become a powerful tool for predicting the spatial distribution and identifying the sources of soil contamination.
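
As a rough illustration of points 3) and 4) above, the following sketch combines a covariate-based regression with spatial interpolation of its residuals and compares both on held-out sample points. It is not a method taken from the review: the data are synthetic, ordinary least squares stands in for a machine learning model, inverse-distance weighting (IDW) stands in for kriging, and all names (`fit_line`, `idw`, `rmse`, the covariate, and the coefficients) are hypothetical.

```python
import math
import random


def fit_line(x, y):
    """Ordinary least squares for y = a + b*x (simple stand-in for an ML regressor)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    return my - b * mx, b


def idw(coords, values, target, k=8, power=2.0):
    """Inverse-distance-weighted mean of the k nearest known values (stand-in for kriging)."""
    nearest = sorted(
        (math.hypot(cx - target[0], cy - target[1]), v)
        for (cx, cy), v in zip(coords, values)
    )[:k]
    num = den = 0.0
    for d, v in nearest:
        if d == 0.0:
            return v  # target coincides with a sampled point
        w = d ** -power
        num += w * v
        den += w
    return num / den


def rmse(pred, obs):
    """Root mean squared error between predictions and observations."""
    return math.sqrt(sum((p - o) ** 2 for p, o in zip(pred, obs)) / len(obs))


# Synthetic sampling campaign (assumption): concentration depends on one
# environmental covariate plus a smooth spatial pattern the covariate misses.
rng = random.Random(0)
points = []
for _ in range(200):
    px, py = rng.uniform(0, 10), rng.uniform(0, 10)
    covariate = rng.uniform(0, 5)
    spatial = 3.0 * math.sin(px / 2.0) * math.cos(py / 2.0)
    conc = 10.0 + 2.0 * covariate + spatial + rng.gauss(0, 0.2)
    points.append(((px, py), covariate, conc))

train, test = points[:150], points[150:]  # hold out 50 points for evaluation

# Step 1: fit the covariate model on the training points.
a, b = fit_line([p[1] for p in train], [p[2] for p in train])

# Step 2: compute training residuals, which carry the unexplained spatial signal.
train_coords = [p[0] for p in train]
train_resid = [p[2] - (a + b * p[1]) for p in train]

# Step 3: predict held-out points with and without the interpolated residual.
obs = [p[2] for p in test]
pred_reg = [a + b * p[1] for p in test]
pred_hybrid = [a + b * p[1] + idw(train_coords, train_resid, p[0]) for p in test]

rmse_reg = rmse(pred_reg, obs)
rmse_hybrid = rmse(pred_hybrid, obs)
print(f"regression only RMSE: {rmse_reg:.3f}")
print(f"regression + residual IDW RMSE: {rmse_hybrid:.3f}")
```

The hybrid prediction helps here only because the synthetic residuals were built to be spatially autocorrelated; on real data, whether the residual step is worthwhile would need to be checked with the same kind of held-out evaluation.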

Acknowledgements

This research was supported by the National Key Research and Development Program of China (No. 2018YFC1800100) and the National Natural Science Foundation of China (No. 42277475).

Author information

Corresponding author

Correspondence to Mei Lei.

Additional information

Highlights

• A review of machine learning (ML) for spatial prediction of soil contamination.

• ML has achieved significant breakthroughs in soil contamination prediction.

• A structured guideline for using ML in soil contamination is proposed.

• The guideline includes variable selection, model evaluation, and interpretation.

Special Issue—Artificial Intelligence/Machine Learning on Environmental Science & Engineering (Responsible Editors: Yongsheng Chen, Xiaonan Wang, Joe F. Bozeman III & Shouliang Yi)

Cite this article

Zhang, Y., Lei, M., Li, K. et al. Spatial prediction of soil contamination based on machine learning: a review. Front. Environ. Sci. Eng. 17, 93 (2023). https://doi.org/10.1007/s11783-023-1693-1

  • DOI: https://doi.org/10.1007/s11783-023-1693-1
