Application of VNIR and machine learning technologies to predict heavy metals in soil and pollution indices in mining areas

  • Soils, Sec 5 • Soil and Landscape Ecology • Research Article
  • Journal of Soils and Sediments



Soil pollution indices are an effective tool for quantifying metal contamination in soil. They are used to monitor soil quality and to help ensure the future sustainability of agricultural systems. However, calculating a soil pollution index requires laboratory measurements of multiple soil heavy metals, which increases the cost and complexity of evaluating soil heavy metal pollution. Visible and near-infrared spectroscopy (VNIR, 350–2500 nm) has been widely used to predict soil properties because it is rapid, non-destructive, and low-cost.
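
To illustrate why an index needs several laboratory measurements at once, the sketch below computes two widely used indices, the Nemerow integrated pollution index and Hakanson's potential ecological risk index, from five metal concentrations. The toxic-response factors are the standard Hakanson (1980) values. The reference concentrations and the sample are hypothetical placeholders and are not taken from the article, which may use different reference values.

```python
from math import sqrt

# Hakanson (1980) toxic-response factors for the five metals considered here.
TOXIC_RESPONSE = {"As": 10, "Cu": 5, "Pb": 5, "Zn": 1, "Cr": 2}

# HYPOTHETICAL reference (background/standard) concentrations in mg/kg.
# Real studies use regional background values or regulatory standards.
REFERENCE = {"As": 10.0, "Cu": 25.0, "Pb": 25.0, "Zn": 70.0, "Cr": 60.0}


def single_pollution_indices(conc, reference=REFERENCE):
    """Single-factor index PI_i = C_i / S_i for each metal."""
    return {metal: conc[metal] / reference[metal] for metal in conc}


def nemerow_index(conc, reference=REFERENCE):
    """Nemerow index: sqrt((mean(PI)^2 + max(PI)^2) / 2)."""
    pi = list(single_pollution_indices(conc, reference).values())
    mean_pi = sum(pi) / len(pi)
    return sqrt((mean_pi ** 2 + max(pi) ** 2) / 2)


def ecological_risk_index(conc, reference=REFERENCE, toxic=TOXIC_RESPONSE):
    """Potential ecological risk index: RI = sum_i T_i * C_i / B_i."""
    return sum(toxic[metal] * conc[metal] / reference[metal] for metal in conc)


# HYPOTHETICAL soil sample (mg/kg), chosen only to make the arithmetic easy.
sample = {"As": 20.0, "Cu": 50.0, "Pb": 25.0, "Zn": 140.0, "Cr": 60.0}

nipi = nemerow_index(sample)
ri = ecological_risk_index(sample)
print(f"NIPI = {nipi:.3f}, RI = {ri:.1f}")
```

Both indices need every metal concentration for the same sample, which is why a method that estimates the index directly from a spectrum could reduce laboratory effort.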


In this study, we evaluated the ability of VNIR spectroscopy to predict soil heavy metals (As, Cu, Pb, Zn, and Cr) and two commonly used soil pollution indices: the Nemerow integrated pollution index (NIPI) and the potential ecological risk index (RI). Three nonlinear machine learning techniques, namely the Cubist regression tree (Cubist), Gaussian process regression (GPR), and support vector machine (SVM), were compared with partial least squares regression (PLSR) to determine the most suitable model for predicting the soil heavy metals and pollution indices.


The results showed that the nonlinear machine learning models performed significantly better than the PLSR model in most cases. Overall, the SVM model showed higher prediction accuracy and stronger generalization for Zn (R²_V = 0.95, RMSE_V = 6.75 mg kg⁻¹), Cu (R²_V = 0.95, RMSE_V = 8.04 mg kg⁻¹), Cr (R²_V = 0.90, RMSE_V = 6.57 mg kg⁻¹), Pb (R²_V = 0.86, RMSE_V = 4.14 mg kg⁻¹), NIPI (R²_V = 0.93, RMSE_V = 0.31), and RI (R²_V = 0.90, RMSE_V = 3.88). In addition, the results indicated that the high prediction accuracy for Cu, Pb, and Zn, together with their significant positive correlations with the soil pollution indices, accounted for the accurate prediction of NIPI and RI.
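
For readers unfamiliar with the two accuracy measures quoted above, the following sketch shows how R² and RMSE are commonly computed on a validation set. It assumes the usual coefficient-of-determination definition of R², which may differ from the exact definition used in the article, and the observed and predicted values are made up for the example.

```python
from math import sqrt


def rmse(observed, predicted):
    """Root mean square error, in the same units as the observations."""
    n = len(observed)
    return sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1 - ss_res / ss_tot


# MADE-UP validation values (mg/kg), for illustration only.
observed = [10.0, 20.0, 30.0, 40.0]
predicted = [12.0, 18.0, 33.0, 39.0]

print(f"R2 = {r_squared(observed, predicted):.3f}")   # 1 - 18/500
print(f"RMSE = {rmse(observed, predicted):.3f}")      # sqrt(18/4)
```

Because RMSE carries the units of the target, values for different metals or for the unitless indices are not directly comparable, whereas R² is.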


Using VNIR to obtain soil pollution indices quickly and accurately is of great significance for the comprehensive evaluation, prevention, and control of soil heavy metal pollution.



We would like to thank Dr. Xianzhang Pan and Dr. Changkun Wang of ISSAS for their help in VNIR spectroscopy analysis. Many thanks also go to the anonymous reviewers whose constructive comments helped improve this paper.


This work was supported by the Strategic Priority Research Program of the Chinese Academy of Sciences (XDA28100500) and the National Natural Science Foundation of China (41771253).

Author information

Corresponding author

Correspondence to Shengxiang Xu.

Ethics declarations

Consent to participate

This manuscript is new and not being considered elsewhere. All authors have approved the submission of this manuscript.

Conflict of interest

The authors declare no competing interests.

Additional information

Responsible editor: Jun Zhou

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Below is the link to the electronic supplementary material.

Supplementary file1 (PDF 736 kb)

Cite this article

Wang, Y., Zhao, Y. & Xu, S. Application of VNIR and machine learning technologies to predict heavy metals in soil and pollution indices in mining areas. J Soils Sediments 22, 2777–2791 (2022).
