Abstract
Purpose
Soil organic matter (SOM) is an important indicator of soil fertility in karst areas. Visible-near infrared-shortwave infrared (VNIR-SWIR) reflectance spectroscopy is a more effective alternative to conventional soil analysis methods because it can determine SOM rapidly, non-destructively, and accurately. However, the procedures used to model and predict SOM vary among studies. Well-defined steps are therefore required for SOM prediction; such steps have great potential to improve model accuracy across different study areas.
Methods
This study explored a wide range of methods to develop a systematic methodology for SOM modeling in karst areas of Southwest China. Three machine learning models were combined with seven spectral preprocessing techniques, two feature selection algorithms, and five spectral index approaches, and their prediction accuracies were compared.
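To make the spectral index step concrete, here is a minimal sketch of one common index construction. The abstract does not name the five spectral index approaches, so the normalized difference index (NDI) used here is an assumption, as are the function names, the toy reflectance values, and the choice of Pearson correlation as the ranking criterion. The sketch computes the NDI for every pair of bands and keeps the pair whose index correlates most strongly with measured SOM.

```python
import math
from itertools import combinations


def pearson(x, y):
    """Pearson correlation coefficient; returns 0.0 if either input is constant."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def ndi(r_i, r_j):
    """Normalized difference index of two reflectance values (assumed index form)."""
    return (r_i - r_j) / (r_i + r_j)


def best_ndi_pair(spectra, target):
    """Search all band pairs and return (i, j, r) with the largest |r| against target."""
    n_bands = len(spectra[0])
    best = None
    for i, j in combinations(range(n_bands), 2):
        index_values = [ndi(sample[i], sample[j]) for sample in spectra]
        r = pearson(index_values, target)
        if best is None or abs(r) > abs(best[2]):
            best = (i, j, r)
    return best


# Hypothetical toy data: 4 soil samples x 3 bands (reflectance), not from the study.
spectra = [
    [0.55, 0.50, 0.45],
    [0.60, 0.42, 0.40],
    [0.65, 0.58, 0.35],
    [0.70, 0.47, 0.30],
]
som = [1.0, 2.0, 3.0, 4.0]  # hypothetical SOM values

i, j, r = best_ndi_pair(spectra, som)
print(f"best band pair: ({i}, {j}), r = {r:.3f}")
```

An exhaustive pair search grows quadratically with the number of bands, which is one reason a study might pair index methods with feature selection on full-resolution spectra.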
Results
The results show that the random forest (RF) model, trained on feature bands selected from the original spectra by competitive adaptive reweighted sampling (CARS), gave the best fit and the highest accuracy for SOM estimation (R² = 0.85, ratio of prediction to deviation (RPD) = 2.66, root-mean-square error (RMSE) = 3.39).
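The three accuracy metrics reported above can be computed as in the following sketch. It uses the conventional definitions (R² as one minus the ratio of residual to total sum of squares, RPD as the sample standard deviation of the observed values divided by RMSE); whether the study used exactly these variants is an assumption, and the observed and predicted values are hypothetical.

```python
import math


def rmse(observed, predicted):
    """Root-mean-square error between observed and predicted values."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def rpd(observed, predicted):
    """RPD: sample standard deviation (n - 1) of observations divided by RMSE."""
    n = len(observed)
    mean_obs = sum(observed) / n
    sd = math.sqrt(sum((o - mean_obs) ** 2 for o in observed) / (n - 1))
    return sd / rmse(observed, predicted)


# Hypothetical validation data, not taken from the study.
observed = [10.0, 20.0, 30.0, 40.0, 50.0]
predicted = [12.0, 18.0, 33.0, 39.0, 48.0]

print(f"R2   = {r_squared(observed, predicted):.3f}")
print(f"RMSE = {rmse(observed, predicted):.3f}")
print(f"RPD  = {rpd(observed, predicted):.3f}")
```

Under these definitions, and when both metrics are computed on the same samples, RPD² = n / ((n − 1)(1 − R²)), so for large n an R² of 0.85 corresponds to an RPD of roughly 2.58. That is close to the reported 2.66; a small gap is to be expected if the study's definitions differ slightly from the ones assumed here.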
Conclusions
By examining the influence of various preprocessing techniques, feature selection approaches, spectral index methods, and machine learning models, this study validates the reliability of SOM prediction. Furthermore, the modeling strategy developed here can guide future research in choosing the most appropriate methodology for other regions.
![](http://media.springernature.com/m312/springer-static/image/art%3A10.1007%2Fs11368-023-03691-9/MediaObjects/11368_2023_3691_Fig1_HTML.png)
![](http://media.springernature.com/m312/springer-static/image/art%3A10.1007%2Fs11368-023-03691-9/MediaObjects/11368_2023_3691_Fig2_HTML.png)
![](http://media.springernature.com/m312/springer-static/image/art%3A10.1007%2Fs11368-023-03691-9/MediaObjects/11368_2023_3691_Fig3_HTML.png)
![](http://media.springernature.com/m312/springer-static/image/art%3A10.1007%2Fs11368-023-03691-9/MediaObjects/11368_2023_3691_Fig4_HTML.png)
![](http://media.springernature.com/m312/springer-static/image/art%3A10.1007%2Fs11368-023-03691-9/MediaObjects/11368_2023_3691_Fig5_HTML.png)
![](http://media.springernature.com/m312/springer-static/image/art%3A10.1007%2Fs11368-023-03691-9/MediaObjects/11368_2023_3691_Fig6_HTML.png)
![](http://media.springernature.com/m312/springer-static/image/art%3A10.1007%2Fs11368-023-03691-9/MediaObjects/11368_2023_3691_Fig7_HTML.png)
Ethics declarations
Conflict of interest
The authors declare no competing interests.
Additional information
Responsible editor: Jun Zhou
Publisher’s Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Supplementary Information
ESM 1
(XLSX 1846 kb)
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Wu, M., Huang, Y., Zhao, X. et al. Effects of different spectral processing methods on soil organic matter prediction based on VNIR-SWIR spectroscopy in karst areas, Southwest China. J Soils Sediments 24, 914–927 (2024). https://doi.org/10.1007/s11368-023-03691-9
DOI: https://doi.org/10.1007/s11368-023-03691-9