Introduction

Heavy metals in agricultural soil are highly persistent: they do not biodegrade and they readily accumulate to toxic levels1. In general, heavy metals can migrate from polluted soil and/or irrigation water into vegetables and crops, leading, after chronic consumption, to food safety and health problems2. Therefore, heavy metal pollution of soil, and of agricultural soil in particular, cannot be ignored.

Rice is the most widely consumed cereal grain on earth. Global rice production was over 740 million tonnes in 2014, with Asian countries, including China, Thailand, Japan, and Indonesia, dominating production3. Rice cultivated in polluted paddy soil can detrimentally affect human health4. It has been reported that soils polluted by heavy metals account for almost one-sixth of the total cultivated land in China and that these polluted soils are mainly distributed in intensively cultivated areas5; consequently, a large amount of rice is still grown on extensive areas of slightly to moderately contaminated soil6.

Given the concern over heavy metals in farmland, numerous studies have assessed the total amount of heavy metals in farmland soil7,8. However, researchers have since recognized that the total metal content in the solid phase often does not predict toxic effects in soil-dwelling organisms and plants well9,10,11. Instead, organisms respond only to the fraction that is biologically available to them11. In the last few decades, researchers have used different extraction techniques to estimate the fractionation of metals in soils and sediments12,13,14. A 0.01 M CaCl2 solution is a commonly used selective chemical extractant15,16,17 because it resembles the soil solution with respect to pH, concentration and composition18. Novozamsky et al. (1993) reported a close relationship between the cadmium (Cd) concentration in vegetables and its concentration in the CaCl2 extract19. Anjos et al. (2012) used five extractant solutions to evaluate the available fraction of aluminium (Al), lead (Pb), manganese (Mn) and zinc (Zn) at a Pb mine and found that CaCl2 appeared to be a good extractant medium20.

However, traditional CaCl2 extraction methods are time-consuming and expensive21, and it is highly challenging to use field sampling and wet chemistry for regular monitoring of heavy metal uptake at large scales. Compared with most chemical analyses, remote sensing is simple, time-saving and labor-saving for soil monitoring22. In particular, the emergence of hyperspectral remote sensing has made it possible to monitor soil minerals, water, nutrients, salinity and other constituents. With continuous sampling and high spectral resolution (<5 nm), hyperspectral sensors can discriminate critical spectral differences in detail23. Some researchers have applied hyperspectral reflectance to detect heavy metals in soil8,24,25. However, some heavy metals are spectrally featureless in the visible and near-infrared parts of the electromagnetic spectrum26.

Compared with deriving heavy metal concentrations directly from soil spectra, estimating them indirectly through plants is more practical. When plants are stressed, the biochemical contents (e.g., chlorophyll) of their leaves may change, and the spectral properties (reflectance and transmittance) at specific wavelengths (e.g., red, green, blue and red-edge bands) change with them27. Therefore, plants can be used as a bridge to detect elements in the soil using hyperspectral remote sensing. Hyperspectral remote sensing has been used to detect plant stress before visible symptoms appear28,29,30, including stress caused by water deficiency31, metal accumulation32, diseases33 and salt34. Compared with monitoring plant stress, however, using plants as indicators to estimate CaCl2-extractable concentrations of heavy metals in agricultural soil by remote or proximal sensing has been less studied and, to our knowledge, is rarely reported in the literature.

The main objective of this study was to evaluate the effectiveness of using leaf-scale spectral reflectance to quantify heavy metal concentrations in agricultural soil in Zhangjiagang, Suzhou, China. The specific aims were: (1) to analyze the relationship between heavy metal concentrations in rice leaves and CaCl2-extractable heavy metal (E-HM) concentrations in soil; (2) to determine the variables most sensitive to E-HM concentrations; and (3) to establish a partial least-squares regression (PLSR) model for estimating E-HM concentrations in agricultural soil from the most sensitive variables in hyperspectral data of rice leaves.

Materials and Methods

Description of study area

Located in the eastern Yangtze River Delta (31°43′–32°02′N, 120°21′–120°52′E), Zhangjiagang city covers approximately 999 km2, of which 799 km2 is land (Fig. 1). The average annual temperature is 17.3 °C and the average annual precipitation is 1556.5 mm35. The main soil types are fluvo-aquic soil and paddy soil36. Owing to its well-developed chemical, metallurgical, electroplating, printing and dyeing, and papermaking industries, among others, Zhangjiagang is one of the fastest growing cities in the Yangtze River Delta.

Figure 1

Location of the Zhangjiagang city and field sample sites.

Field sampling and hyperspectral measurement

A total of 21 sampling sites (Fig. 1) were established in agricultural areas in September 2017. At each sampling site, the hyperspectral reflectance of rice leaves was measured, and samples of rice and of the soil around their roots (0–20 cm depth) were collected. The rice and soil samples were packed into polyethylene bags. Five random samples were taken at each site and bulked together as one composite sample. The location of each sampling site was recorded using a Global Positioning System receiver (GPS, UniStrong G120) with an accuracy of about 3 m.

The hyperspectral reflectance of the rice leaves was obtained using a field-portable spectrometer (UniSpec, PP Systems, Haverhill, MA, USA). The spectral range and spectral resolution of the sensor were 310–1100 nm and 3.3 nm, respectively. A bifurcated fiber optic cable and a leaf clip (models UNI410 and UNI501, PP Systems, Haverhill, MA, USA) were used to measure the leaf reflectance of rice. The leaf clip held the fiber at a 60° angle to the leaf surface. Leaf illumination was provided by a halogen lamp in the spectrometer through one branch of the bifurcated fiber. To minimize measurement noise in the reflectance spectra, three spectral measurements were made on the fully expanded leaves near the top of each bundle, and the resulting 15 spectra were averaged into one spectrum per sampling site. A barium sulfate panel was used as a white reference standard to calibrate and optimize the spectrometer before each measurement.

Laboratory analysis

Soil heavy metals concentrations measurements

Soil samples were air-dried at room temperature (26–28 °C) and then sieved through a 2-mm nylon mesh to remove stones and other debris. Total concentrations of Cd and Pb in the soil were determined as follows: 0.2 g of soil was digested with 10 ml of a mixed solution of HNO3, HClO4 and HF (1:1:2, v/v/v) in a polytetrafluoroethylene digestion vessel by microwave digestion for 15 min (the acid proportions and digestion time can be adjusted to suit sample conditions). The final solution was diluted to 50 ml with deionized distilled water and analyzed by inductively coupled plasma mass spectrometry (ICP-MS, X2, Thermo Electron Corporation)37,38.

CaCl2-extractable concentrations of Cd and Pb were determined as follows: a 25 ml aliquot of 0.01 M CaCl2 solution was added to 5 g of soil (<2 mm) in a 100 ml conical flask, and the suspension was shaken at 250 rpm at 25 °C. After 12 h of shaking, the supernatant was separated from the solid phase by centrifugation at 3000 rpm for 20 min. The concentrations of Pb and Cd in the supernatant were determined by ICP-MS18.
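The conversion from a measured extract concentration to a soil-mass basis follows from the 25 ml to 5 g solution-to-soil ratio stated above. The following is a minimal sketch, assuming that the ICP-MS reading is reported in µg L−1 and that no further dilution is applied; both are assumptions, and the function name is hypothetical.

```python
def extractable_mg_per_kg(extract_ug_per_l, volume_ml=25.0, soil_mass_g=5.0):
    """Convert an extract concentration (ug/L) to a soil basis (mg/kg).

    Assumes no dilution between extraction and measurement (an assumption,
    not a detail given in the text).
    """
    # ug/L multiplied by the extract volume in L gives ug of metal extracted
    metal_ug = extract_ug_per_l * (volume_ml / 1000.0)
    # ug of metal per g of soil is numerically equal to mg per kg of soil
    return metal_ug / soil_mass_g


# Hypothetical reading of 10 ug/L in the CaCl2 extract
e_cd = extractable_mg_per_kg(10.0)
print(e_cd)
```

With a 5:1 (ml:g) ratio, each µg L−1 in the extract corresponds to 0.005 mg kg−1 in the soil.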

Rice samples were thoroughly washed in deionized water and oven-dried at 70 °C to constant weight. To determine Cd and Pb concentrations in rice leaves, 0.2 g of sample was mixed with 5 ml of a 5:2 (v/v) HNO3:H2O2 solution in a centrifuge tube at room temperature. The mixture was then heated in a microwave-accelerated reaction system (Anton-Paar PE Multiwave 3000) for 20 min. The digest was diluted with 43 ml of deionized water and analyzed for total Cd and Pb by ICP-MS.

Hyperspectral data pretreatment

The raw hyperspectral signal is susceptible to environmental interference, so the original spectra were preprocessed to enhance spectral features and to extract more information about heavy metals in the soil. Wavelengths shorter than 420 nm and longer than 980 nm were excluded because of excessive noise39. The spectra were interpolated from the 3.3 nm sampling interval to 1 nm23, giving a total of 560 wavelengths of raw spectral reflectance. This processing was done in Excel 2007 (Microsoft Inc.).
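The trimming and resampling step can be sketched as follows. This is an illustrative linear interpolation, not the procedure actually run in Excel; the function name is hypothetical, and the half-open grid (420 to 979 nm) is an assumption chosen so that the grid has the 560 wavelengths reported above. The input wavelengths are assumed to be sorted and to cover the whole output range.

```python
def interpolate_to_1nm(wavelengths, reflectance, start=420, stop=980):
    """Linearly interpolate a spectrum onto a 1 nm grid over [start, stop)."""
    grid = list(range(start, stop))
    out = []
    j = 0
    for w in grid:
        # Move to the input segment [wavelengths[j], wavelengths[j + 1]] containing w
        while j < len(wavelengths) - 2 and wavelengths[j + 1] < w:
            j += 1
        w0, w1 = wavelengths[j], wavelengths[j + 1]
        r0, r1 = reflectance[j], reflectance[j + 1]
        out.append(r0 + (r1 - r0) * (w - w0) / (w1 - w0))
    return grid, out


# Synthetic spectrum sampled every 3.3 nm from 310 nm (illustrative values only)
raw_wl = [310 + 3.3 * k for k in range(240)]
raw_refl = [0.001 * w for w in raw_wl]
grid, refl_1nm = interpolate_to_1nm(raw_wl, raw_refl)
print(len(grid), grid[0], grid[-1])
```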

Derivative transformation can remove background interference, resolve overlapping spectral features and minimize baseline drift in raw spectra caused by differences in grinding and optical setups40. First- and second-derivative transformations were performed in OriginPro 8.
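To illustrate why differentiation suppresses baseline drift, the sketch below computes first and second derivatives on a 1 nm grid with simple finite differences. It is not the OriginPro algorithm, whose smoothing and end-point handling may differ; the function names and test spectrum are hypothetical.

```python
def first_derivative(values, step=1.0):
    """Central differences in the interior, one-sided differences at the ends."""
    n = len(values)
    d = [0.0] * n
    d[0] = (values[1] - values[0]) / step
    d[-1] = (values[-1] - values[-2]) / step
    for i in range(1, n - 1):
        d[i] = (values[i + 1] - values[i - 1]) / (2.0 * step)
    return d


def second_derivative(values, step=1.0):
    """Second derivative obtained by differentiating twice."""
    return first_derivative(first_derivative(values, step), step)


# Hypothetical spectrum (a parabola) and the same spectrum with a constant offset
spectrum = [float(w * w) for w in range(10)]
shifted = [v + 5.0 for v in spectrum]

r1 = first_derivative(spectrum)
r2 = second_derivative(spectrum)
# A constant baseline offset disappears after the first derivative
same_after_offset = first_derivative(shifted) == r1
```

Because a constant offset has zero slope, `same_after_offset` is true: the derivative spectra of the original and the shifted spectrum coincide.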

Calculation of spectral indices

Ten commonly used spectral indices were calculated (Table 1). As shown in Table 1, all of the spectral indices except the water index (WI) are related to chlorophyll or other pigments.

Table 1 Spectral indices used in this study.
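As an illustration of how indices of this kind are computed from a leaf spectrum, the following sketch evaluates NDVI and WI. The band positions (670, 800, 900 and 970 nm) are commonly used definitions and are assumptions here, since the definitions in Table 1 may differ; the function names and reflectance values are hypothetical.

```python
def ndvi(spectrum, red=670, nir=800):
    """Normalized difference of near-infrared and red reflectance."""
    r, n = spectrum[red], spectrum[nir]
    return (n - r) / (n + r)


def water_index(spectrum, ref=900, water=970):
    """Ratio of a reference band to a water absorption band."""
    return spectrum[ref] / spectrum[water]


# Hypothetical leaf reflectance keyed by wavelength in nm
leaf = {670: 0.05, 800: 0.45, 900: 0.46, 970: 0.44}
leaf_ndvi = ndvi(leaf)
leaf_wi = water_index(leaf)
print(round(leaf_ndvi, 3), round(leaf_wi, 3))
```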

Variable selection and partial least-squares regression modelling

Correlations between E-HM concentrations and the raw spectral reflectance (R), the first derivative of R (R′), the second derivative of R (R′′) and the spectral indices were analyzed in SPSS (IBM SPSS Statistics 22) using bivariate correlation analysis. Variables with a significant correlation (P < 0.05) were selected for use in the model.
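The screening step can be sketched as below. This is not the SPSS procedure itself: it computes Pearson's r for each variable and applies a two-tailed t-test, with the critical value 2.093 hard-coded for 19 degrees of freedom (n = 21, P < 0.05). The function names and data layout are hypothetical, and constant-valued variables are assumed to be absent.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def select_variables(variables, target, t_crit=2.093):
    """Return {name: r} for variables whose |t| exceeds t_crit.

    t_crit = 2.093 is the two-tailed 5% critical value for df = 19 only;
    it must be changed for other sample sizes.
    """
    n = len(target)
    selected = {}
    for name, values in variables.items():
        r = pearson_r(values, target)
        if abs(r) < 1.0:
            t = abs(r) * math.sqrt((n - 2) / (1.0 - r * r))
        else:
            t = math.inf
        if t > t_crit:
            selected[name] = r
    return selected


# Hypothetical data: 21 samples, one strongly related band and one unrelated band
target = [float(i) for i in range(21)]
variables = {
    "band_A": [1.0 - 2.0 * v for v in target],
    "band_B": [1.0 if i % 2 == 0 else -1.0 for i in range(21)],
}
selected = select_variables(variables, target)
```

For n = 21, this threshold is equivalent to |r| > 0.433.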

Partial least-squares regression (PLSR) with leave-one-out cross-validation was used to predict E-HM concentrations in farmland soil from the selected variables and spectral indices. PLSR is one of the most frequently used methods for estimating soil heavy metal concentrations with visible and near-infrared reflectance spectroscopy (VNIRS)40,41,42. It can process data with strong collinearity and noise, and it is well suited to situations where the number of variables considerably exceeds the number of available samples43,44. PLSR and cross-validation were performed in TQ Analyst (8.3.125, Thermo Fisher Scientific Inc.). Model performance was assessed with two evaluation parameters computed from the measured and predicted values: the coefficient of determination (R2) and the root mean square error (RMSE). R2 and RMSE are commonly calculated using the following formulas45:

$$R^{2}=\frac{\left[\sum_{i=1}^{n}\left(x_{\mathrm{p},i}-\overline{x_{\mathrm{p}}}\right)\left(x_{\mathrm{m},i}-\overline{x_{\mathrm{m}}}\right)\right]^{2}}{\sum_{i=1}^{n}\left(x_{\mathrm{p},i}-\overline{x_{\mathrm{p}}}\right)^{2}\sum_{i=1}^{n}\left(x_{\mathrm{m},i}-\overline{x_{\mathrm{m}}}\right)^{2}}$$
(1)
$$\mathrm{RMSE}=\sqrt{\frac{\sum_{i=1}^{n}\left(x_{\mathrm{p},i}-x_{\mathrm{m},i}\right)^{2}}{n}}$$
(2)

where \(x_{\mathrm{p},i}\) and \(\overline{x_{\mathrm{p}}}\) are the predicted value for sample i and the average predicted value of E-HM concentrations, \(x_{\mathrm{m},i}\) and \(\overline{x_{\mathrm{m}}}\) are the measured value for sample i and the average measured value of E-HM concentrations, and n is the number of samples.
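The two evaluation parameters can be computed as in the sketch below, in which R2 is taken as the squared Pearson correlation between predicted and measured values, with the cross-products summed sample by sample. The function names and example values are hypothetical.

```python
import math


def r_squared(predicted, measured):
    """Squared Pearson correlation between predicted and measured values."""
    n = len(predicted)
    mean_p = sum(predicted) / n
    mean_m = sum(measured) / n
    cross = sum((p - mean_p) * (m - mean_m) for p, m in zip(predicted, measured))
    ss_p = sum((p - mean_p) ** 2 for p in predicted)
    ss_m = sum((m - mean_m) ** 2 for m in measured)
    return cross ** 2 / (ss_p * ss_m)


def rmse(predicted, measured):
    """Root mean square error between predicted and measured values."""
    n = len(predicted)
    return math.sqrt(sum((p - m) ** 2 for p, m in zip(predicted, measured)) / n)


# Hypothetical values: predictions are perfectly correlated with the
# measurements but biased (twice as large)
measured = [1.0, 2.0, 3.0, 4.0]
predicted = [2.0, 4.0, 6.0, 8.0]
r2_value = r_squared(predicted, measured)
rmse_value = rmse(predicted, measured)
```

In this example R2 equals 1 although every prediction is biased, whereas the RMSE is clearly non-zero, which is why the two parameters are reported together.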

Results and Discussion

Heavy metal concentrations in agricultural soil

Descriptive statistics of the Cd and Pb concentrations are reported in Table 2. The average concentration of Pb (29.193 mg kg−1) was below the limit (80 mg kg−1) set by the Ministry of Ecology and Environment of the People’s Republic of China (MEEPRC)46, whereas the average concentration of Cd (0.301 mg kg−1), which may be affected by human activities, was slightly higher than the MEEPRC limit (0.3 mg kg−1). In addition, the Cd concentration exceeded the MEEPRC limit in 4 of the 21 samples. The mean concentration of Pb was higher than that of Cd, but the order was reversed for the mean concentrations of E-Pb and E-Cd, because Cd is more available than Pb in soil47,48.

Table 2 Heavy metal concentrations (mg kg−1) of agricultural soil (n = 21) in Zhangjiagang city.

The mean, standard deviation (SD) and coefficient of variation (CV) of the E-Cd and E-Pb concentrations are also shown in Table 2. The CVs of the E-Cd and E-Pb concentrations differed from those of the total Cd and Pb concentrations. Furthermore, soils with high concentrations of Cd and Pb did not necessarily have high concentrations of E-Cd and E-Pb. This may be because E-HM concentrations in natural soils depend on the soil environment, such as pH and the contents of clay, sand and organic matter9.

Relationship between heavy metals concentrations in soil and those in rice leaves

The Pearson’s correlation coefficients between heavy metal concentrations in soil and in rice leaves are shown in Table 3. Only the correlation between E-Cd in soil and Cd in rice leaves (r = 0.649) was significant at the 0.05 level; the coefficients between Pb in rice leaves and Pb in soil were 0.340 for E-Pb and 0.222 for total Pb. Thus, in our study, E-Cd was more strongly correlated with Cd in rice leaves than E-Pb was with Pb in rice leaves. Earlier studies found that, in the common soil pH range, Cd is less stable than Pb49,50. Moreover, rice tends to accumulate Cd, and Cd accumulation in rice is often controlled to a greater extent by its bioavailability than by its total content in the soil3. This explains why Cd concentrations in rice leaves were more strongly correlated with E-Cd.

Table 3 Pearson’s correlation coefficients between heavy metal concentrations in the soil and in rice leaves.

Relationship of E-HM concentrations against hyperspectral data

The Pearson’s correlation coefficients between the E-HM concentrations and the processed reflectance (R, R′ and R′′) are shown in Fig. 2 and Table 4, and those between the E-HM concentrations and the spectral indices are summarized in Table 5. Wavelengths with correlations significant at P < 0.05 are regarded as bands sensitive to E-HM.

Figure 2

Correlations between processed reflectance spectra (R, raw reflectance; R′, first-derivative spectra; R′′, second-derivative spectra) and E-Cd (a) and E-Pb (b) concentrations in soil from Zhangjiagang city.

Table 4 Correlation analysis between E-HM concentrations and transformations of spectra.
Table 5 The Pearson’s correlation coefficients between the E-HM concentrations and spectral indices.

Table 4 shows that the wavelengths of the strongest positive and strongest negative correlations differed between E-Cd and E-Pb. As shown in Fig. 2a, the number of bands associated with E-Cd decreased with each processing step: 277 bands of R (in the range of 420–696 nm), 68 bands of R′ and 37 bands of R′′ were significantly related (P < 0.05) to E-Cd concentrations in soil. The correlated bands of R were contiguous, whereas those of R′ and R′′ were dispersed. Similar relationships between heavy metals and spectral data have been reported in the literature51,52. This indicates that heavy metal stress produces a spectral response but that closely spaced spectral bands contain redundant information53,54. Pre-processing can remove redundant information and reveal subtle spectral features, thereby improving the subsequent multivariate regression55.

Fig. 2b shows that the relationships between the spectra and E-Pb concentrations followed a trend similar to that for E-Cd, but in none of the R, R′ or R′′ correlograms did the correlation coefficients reach the 0.01 significance level (Table 4). This may be due to the low concentrations of E-Pb in the agricultural soil, which did not cause obvious stress in rice and therefore had no obvious effect on the leaf spectra.

As shown in Table 5, the spectral indices showed a wide range of correlations with the concentrations of E-Cd (−0.705 to 0.222) and E-Pb (−0.35 to 0.259). All spectral indices were negatively related to E-Cd concentrations, and four of them (NDVI, CRI, PRI2 and NPCI) were significantly correlated (P < 0.05) with E-Cd content, whereas no index was significantly correlated with E-Pb concentrations. The four spectral indices associated with E-Cd concentrations were leaf pigment-related indices. The index related to leaf water content (WI) was not significantly correlated with E-Cd concentrations. Because Cd can damage the structure of chloroplasts, as manifested by distorted shape and dilation of the thylakoid membranes56, the indices associated with leaf pigments were more susceptible to Cd. However, rice water content is resistant to Cd when the mass fraction of Cd is 2.0–3.0 mg kg−1 in farmland soil57. According to Table 2, the mean Cd content of the agricultural soil was 0.3 mg kg−1, which is below this level. Therefore, the spectral index related to water content was not significantly associated with E-Cd concentrations.

Model development and validation

We selected 386 and 209 variables for the E-Cd and E-Pb models, respectively, whereas only 21 samples were available; moreover, most of the selected variables were strongly collinear. PLSR was therefore well suited to this study.
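The situation described here, with far more collinear variables than samples, can be reproduced with a small sketch of single-response PLSR (NIPALS) combined with leave-one-out cross-validation. This is not the TQ Analyst implementation; it uses NumPy, synthetic data in place of the measured spectra, hypothetical function names, and an assumed number of components (two), since the number used in the study is not stated.

```python
import numpy as np


def pls1_fit(X, y, n_components):
    """Fit single-response PLS by NIPALS; return (x_mean, y_mean, coef)."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xk, yk = X - x_mean, y - y_mean
    W, P, q = [], [], []
    for _ in range(n_components):
        w = Xk.T @ yk                  # weights: covariance of each variable with y
        norm = np.linalg.norm(w)
        if norm < 1e-12:               # nothing left to explain
            break
        w = w / norm
        t = Xk @ w                     # scores
        tt = t @ t
        p = Xk.T @ t / tt              # X loadings
        c = (yk @ t) / tt              # y loading
        Xk = Xk - np.outer(t, p)       # deflate X
        yk = yk - c * t                # deflate y
        W.append(w)
        P.append(p)
        q.append(c)
    if not W:
        return x_mean, y_mean, np.zeros(X.shape[1])
    W, P, q = np.array(W).T, np.array(P).T, np.array(q)
    coef = W @ np.linalg.solve(P.T @ W, q)
    return x_mean, y_mean, coef


def pls1_predict(model, X):
    x_mean, y_mean, coef = model
    return y_mean + (X - x_mean) @ coef


def loocv_predictions(X, y, n_components):
    """Predict each sample from a model fitted on all other samples."""
    n = len(y)
    preds = np.empty(n)
    for i in range(n):
        keep = np.arange(n) != i
        model = pls1_fit(X[keep], y[keep], n_components)
        preds[i] = pls1_predict(model, X[i:i + 1])[0]
    return preds


# Synthetic stand-in data: 21 samples and 386 collinear variables
# generated from two latent factors (not the measured spectra)
rng = np.random.default_rng(0)
T = rng.normal(size=(21, 2))
X = T @ rng.normal(size=(2, 386))
y = T @ np.array([0.5, -0.3])
preds = loocv_predictions(X, y, n_components=2)
```

Ordinary least squares has no unique solution with 386 variables and 21 samples, whereas PLSR compresses the collinear variables into a few latent components before regressing on them. The cross-validated predictions can then be compared with the measured values using R2 and RMSE.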

The relationships between measured and predicted E-HM concentrations are presented in Fig. 3. A good model should have a low RMSE and an R2 close to unity7. The PLSR model was able to predict E-Cd content, with a comparatively high coefficient of determination (R2 = 0.592) and a low RMSE (0.046) (Fig. 3a). In contrast, the PLSR prediction of E-Pb concentrations was poor: the RMSE was 0.019, but R2 reached only 0.013 (Fig. 3b). It is known from the literature that Cd is one of the best-known toxic heavy metals and is taken up by the calcium uptake system in plants58, whereas soil has a higher binding capacity for Pb than for Cd59, making Pb less bioavailable. Table 2 also shows that the ratio of E-Cd to total Cd (E-Cdmean/Cdmean = 0.051/0.301 = 0.17) was much higher than the ratio of E-Pb to total Pb (E-Pbmean/Pbmean = 0.01/29.193 ≈ 0.0003), so the rice may have been stressed by Cd but not by Pb. The accuracy of the PLSR model for E-Cd was not very high, possibly because only 21 sampling points were available for model development and validation, which limits the robustness of the models.

Figure 3

The relationship between measured and predicted E-Cd (a) and E-Pb (b) concentration in soil based on PLSR models.

In summary, the results indicate that, if rice is sensitive to E-HM or is stressed by a certain concentration of E-HM, a PLSR model based on pre-processed hyperspectral reflectance of rice leaves can estimate E-HM concentrations.

Conclusions

In the present study, the Cd concentration exceeded the MEEPRC limit at 19.05% of the sampling points in the agricultural soil of Zhangjiagang city, and the concentration of E-Cd in soil was significantly correlated with the concentration of Cd in rice leaves. However, owing to the low concentration and low bioavailability of Pb, the concentration of E-Pb in soil was not significantly correlated with the concentration of Pb in rice leaves.

The raw reflectance contained redundant information, which pre-processing removed while revealing subtle spectral features. Accordingly, the number of bands associated with E-HM decreased with each processing step (R > R′ > R′′). More bands were associated with E-Cd than with E-Pb, and the correlation between E-Cd concentrations and spectral data was stronger than that for E-Pb. In addition, four indices related to chlorophyll or pigment (NDVI, CRI, PRI2 and NPCI) were significantly correlated with E-Cd concentrations, whereas no index was significantly correlated with E-Pb and the water index was unrelated to E-Cd, consistent with the low E-Pb concentration and the resistance of rice water content to Cd.

The PLSR model was able to estimate E-Cd concentrations in agricultural soil but not E-Pb concentrations, because of the low concentration of E-Pb. Thus, if a crop is sensitive to or stressed by E-HM, the PLSR model can estimate E-HM concentrations in soil.

Evaluating E-HM content in agricultural soil from hyperspectral data of rice leaves is not affected by soil chemical properties (such as soil pH, organic matter content and soil texture). It can directly reflect the toxicity of heavy metals in soil, has a wider range of applications and gives a more accurate result than assessment methods based on total heavy metal concentrations. This method may provide new insight into monitoring the E-HM content of agricultural soil. However, the number of samples was too small to allow external validation, so more samples will be collected in the future to improve predictive performance, and more heavy metals will be estimated to test the robustness of the model.