1 Introduction

Soil is an essential component of the Earth’s system, linked directly with the hydrological cycle, sedimentological, geochemical, biological, and ecological processes, as well as representing a major source of goods, services, and resources for humanity (Brevik et al. 2015; Decock et al. 2015). However, the current production model to meet the increasing demand for food, fibre, and fuel from the world’s growing population has accelerated land degradation (Tesfaye et al. 2015; Ollobarren et al. 2016).

The pursuit of higher crop productivity and the expansion of cultivation into new areas increase the pressure on fragile lands and ecosystems, impacting hydrological processes and favouring water erosion (Wohl et al. 2012; Santos et al. 2016). Soil erosion by water may occur at different intensities depending on the characteristics of rainfall, as well as terrain, soil type, and land cover and use. Soils in the Brazilian semiarid region are usually shallow with low organic matter content and low water retention capacity. These characteristics, in association with the high-intensity rainfall events typical of this region and the absence of vegetation cover resulting from inadequate land use, can intensify the erosive process (Calixto Júnior and Drumond 2014).

The material produced by erosion is exported to the river systems, and a significant portion may be deposited in reservoirs used for water supply. In the semiarid region of Brazil, sediment deposition causes a reduction of approximately 1.6% of the storage capacity of surface reservoirs per decade (de Araújo et al. 2006). Sediment deposition also changes reservoir geometry, making the reservoirs shallower and more susceptible to evaporation (de Araújo et al. 2006). In addition, soil erosion contributes to reservoir eutrophication, since the nutrient-enriched sediments increase ecosystem productivity and reduce the dissolved oxygen level (Coelho et al. 2017; Moura et al. 2020).

In theory, soil conservation practices are effective in reducing sediment yield and reservoir siltation; however, the human and financial resources needed for land use control and monitoring may hamper their implementation over large areas. From the reservoir perspective, removal and reuse of the deposited sediment have the potential to simultaneously reduce the nutrient content in the lake and restore the storage capacity lost to siltation (Lira et al. 2020). Sediment reuse has been proposed as a practice contributing to the circular economy concept, considering sediment as a resource rather than waste (Brils et al. 2014), and many studies emphasize its positive effects (e.g., Fonseca et al. 1998; Sigua 2009; Junakova and Balintova 2012; Mattei et al. 2017; Braga et al. 2019). Capra et al. (2015) reported on the reuse of dredged sediment for the replacement of soil degraded by erosion, concluding that the addition of sediment had beneficial effects on the physicochemical properties of the soil and resulted in higher total dry matter production in plants. Likewise, Sigua et al. (2004) observed increases in biomass production when they reused dredged sediment, and Braga et al. (2017) found that the addition of sediment in sunflower cultivation improved the relative chlorophyll content and total dry mass when compared to plants growing on substrate containing commercial fertilizers.

Although the economic feasibility of sediment reuse for soil fertilization has already been demonstrated for specific conditions (e.g., Braga et al. 2019), the agricultural sector has not yet adopted this practice to replace traditional fertilization, which, via soil erosion, may further increase reservoir eutrophication. Therefore, sediment characterization is essential to provide information about its suitability as fertilizer and to promote the idea of reuse, consolidating this practice in the agricultural production system as a measure of soil and water conservation, as well as financial benefit.

Such sediment characterization can be achieved by several means, such as time-consuming and costly physicochemical laboratory analyses. Currently, visible near-infrared and short-wave infrared (VNIR-SWIR) spectroscopy has proven to be an alternative for indirect analyses of soil or sediment attributes, as it (i) is cheaper and faster than the traditional laboratory procedures and thus (ii) favours repeatability and reproducibility at different temporal and spatial scales. Recent research has successfully established correlations between VNIR-SWIR spectroscopy and sediment or soil attributes, for instance Viscarra Rossel et al. (2006a), Vågen et al. (2006), Morgan et al. (2009), Kuang and Mouazen (2011), Nawar et al. (2017), Morellos et al. (2016), Cozzolino et al. (2016), Hu (2013), Wang et al. (2015), and Demattê et al. (2019a).

Sediment characterization is especially viable in semiarid regions due to the flood-drought dynamics: in Northeast Brazil, where this study was conducted, small and medium-sized reservoirs often dry out during the intra-annual dry season, exposing the deposited sediment and making it easily accessible. This increases the chance that the sediment can be periodically sampled and analysed, e.g., by VNIR-SWIR spectroscopy and, consequently, be easily excavated and reused when proven efficient for soil fertilization.

To assess the potential of VNIR-SWIR spectroscopy for the characterization of reservoir sediments, we collected sediment samples, performed physicochemical and spectral analyses, and generated regression models. Our goals were to (i) characterize the sediment deposited in reservoirs in the semiarid Northeast of Brazil; (ii) assess the effect of spatial scale on the accuracy of the prediction models; and (iii) assess the influence of spectral resolution on model performance, in view of the spaceborne imaging spectroscopy data that is currently becoming more available.

2 Study area

This study was conducted in the semiarid region of Brazil, Federal State of Ceará, encompassing ten surface reservoirs of various sizes (flooded areas varying from 0.02 to 37 km2) distributed in three catchments (Benguê, Fogareiro, and Pentecoste, with approximately 1000, 5100, and 3200 km2, respectively), as illustrated in Fig. 1.

Fig. 1 Study area in the semiarid northeast of Brazil

Climate in the region is semiarid, with annual potential evaporation ranging from approximately 1800 mm by the coast to up to 3000 mm in the upstream hinterland. Average annual precipitation presents an inverse gradient pattern, ranging from roughly 1100 to 550 mm, concentrated in a well-defined rainy season, which generates 6 to 9 months per year of atmospheric water deficit, on average (INMET 2018).

Additionally, shallow soils (usually < 1 m depth) on top of a crystalline basement make the rivers intermittent and promote low hydrogeological potential in the region, which led to the construction of dams for water supply. The reservoir network, with an average density of the order of one reservoir per 8 km2, impacts the overall water resources (de Araújo and Medeiros 2013), and is particularly important on sediment retention (Medeiros et al. 2014; Mamede et al. 2018). The accumulation of sediment and the adsorbed nutrients in the reservoirs results in a water quality effect, as described by Medeiros and Sivapalan (2020), negatively affecting water availability in terms of quantity (reservoir siltation) and quality (eutrophication). In this context, sediment reuse has been proposed for soil fertilization (Braga et al. 2019), also contributing to recovery of water quality (Lira et al. 2020).

The wide range of reservoirs and catchment areas assessed in this study enabled an analysis of the potential of sediment characterization by VNIR-SWIR spectroscopy at the reservoir (< 40 km2), catchment (900–6000 km2) and regional (82,500 km2) spatial scales. Analysis at the reservoirs Araras, Açude III, Boqueirão, Benguê, and Escola was not possible due to the limited number of sediment samples in each of those reservoirs, but such samples were included in the analysis at the catchment and regional scales. Table 1 presents the reservoirs and catchments adopted in each of the spatial scales, and a description of each area is presented below.

Table 1 Reservoirs and catchments assessed in this study

The studied reservoirs and catchments were selected based on the following criteria:

  • Variability in terms of soil type and hydrological regime, which control the sediment characteristics and flux;

  • Existence of strategic reservoirs at the catchments’ outlets, monitored by the Water Resources Management Company of Ceará (COGERH), which provides secondary data on hydrological variability;

  • Scientific research conducted by the authors in the regions during the last decades, which contributes to prior knowledge.

2.1 Benguê catchment

The Benguê catchment drains an area of roughly 1000 km2 in the headwaters of the Jaguaribe Basin, with 560 mm average annual rainfall producing 47 mm of runoff (8% runoff coefficient) (Ceará 2009). Predominant soil types are luvisols and neosols, though deep latosols prevail in the border regions (EMBRAPA 2011). The catchment is controlled by the Benguê reservoir at its outlet, with a storage capacity of 19.6 hm3 and a flooded area of 3.5 km2. In addition to the Benguê reservoir, three small reservoirs with flooded areas ranging from 0.02 to 0.03 km2 were studied: Boqueirão (de Figueiredo et al. 2016), Araras, and Açude III. Figure 1 of the supplementary material (Fig. S1) presents the location of the studied reservoirs and the respective sediment sampling points in the Benguê catchment.

2.2 Fogareiro catchment

Like the Benguê catchment, the Fogareiro catchment is located in the Jaguaribe Basin. It covers an area of approximately 5100 km2 with predominance of neosols, but also luvisols and argisols (EMBRAPA 2011). Average annual rainfall is 680 mm and runoff 58 mm, which corresponds to an 8% runoff coefficient. Four reservoirs were studied in the catchment: Fogareiro, Marengo, São Joaquim, and São Nicolau, with storage capacities of 118, 15.3, 5.0, and 0.9 hm3, and flooded areas of 20.5, 3.4, 1.2, and 0.4 km2, respectively. Studies have been conducted in the Fogareiro catchment on the characterization of surface reservoirs (Zhang et al. 2016, 2018) as well as on the feasibility of the sediment reuse practice (Braga et al. 2019). Fig. S2 (supplementary material) presents the location of the studied reservoirs and the respective sediment sampling points in the Fogareiro catchment.

2.3 Pentecoste catchment

The Pentecoste catchment is located within the Curu Basin and extends over an area of approximately 3200 km2. Average annual rainfall and runoff are 750 and 126 mm, respectively, corresponding to a runoff coefficient of 17% (Ceará 2009). Most of the area is on luvisols, though small areas of planosols are also indicated (EMBRAPA 2011). In this catchment, two reservoirs were selected for sediment sampling: the Pentecoste reservoir, located at the catchment outlet, with 360 hm3 storage capacity and a flooded area of 37 km2, and the Escola reservoir, located in the Vale do Curu Experimental Farm (FEVC), with a storage capacity of approximately 0.05 hm3 and a flooded area of 0.03 km2, monitored since 2015 (Silveira and Mamede 2021) (Fig. S3, supplementary material).

3 Material and methods

The study comprises four steps: (1) sediment sampling in the studied reservoirs; (2) physicochemical laboratory analyses; (3) spectral analyses; (4) correlation of physicochemical and spectral properties to elaborate models of reservoir sediment characterization by diffuse reflectance spectroscopy at different spectral resolutions and spatial scales.

3.1 Sediment sampling

Sediment sampling was performed in empty reservoirs in the period of November 2016 to February 2017. Small reservoirs dry out periodically in the study region due to the high evaporation rates, whereas medium and large size reservoirs retain water for longer periods. However, due to a long-lasting drought (2012 to 2017), it was possible to obtain sediment samples also from larger strategic reservoirs. The sediment sampling from the reservoirs’ beds was preceded by removal of litter (Fig. 2). At each sampling point, an area of roughly 0.5 m diameter was delimited and approximately 2 kg of sediment from the top layer (~ 2 cm depth) was collected at 3 to 5 points, forming a composite sample. The number of composite samples varied according to the size of the reservoir, with twenty samples being the maximum for the largest reservoirs (see Table 1), totalling 138 in the 10 studied reservoirs.

Fig. 2 Study area and field work: A Fogareiro reservoir with low water level, B bed of an empty reservoir, C litter removal at a sampling point, D sediment sampling

The sediment samples were air-dried, disaggregated, homogenized, and sieved to 2 mm, and then sent to physicochemical and spectral laboratories for the respective analyses, as described below.

3.2 Physicochemical analyses of sediment

The sediment physicochemical analyses were performed in the Soil and Water Laboratory of the Federal University of Ceará (UFC). The attributes nitrogen (N), phosphorus (P), potassium (K), soil organic carbon (SOC), electrical conductivity (EC), and granulometry (for clay content) were analysed according to the methods recommended in the Manual of Soil Analysis Methods of the Brazilian Agricultural Research Corporation (EMBRAPA 2017):

  • N (g kg−1): Kjeldahl method, in which N is converted to ammonium sulphate through oxidation, and the released ammonia is determined by acidimetry;

  • P (mg kg−1): formation of blue phosphorus-molybdic complex after reduction of molybdate with ascorbic acid, and determination of the assimilable phosphorus by molecular absorption spectrophotometry;

  • K (cmolc kg−1): extraction with dilute hydrochloric acid solution and subsequent determination of the exchangeable potassium by flame spectrophotometry;

  • SOC (g kg−1): oxidation of organic matter via a wet process with potassium dichromate in a sulfuric medium. The excess dichromate after oxidation is titrated with a standard solution of ferrous ammonium sulphate;

  • EC (dS m−1): preparation of a saturation paste by addition of water to the sediment sample until saturation, and direct reading with a conductivity meter;

  • Clay fraction (g kg−1): pipette method, with agitation and suspension of the silt and clay fractions in dispersing solution, and quantification of the suspended fraction after sedimentation.

During control procedures and uncertainty assessments, the laboratory performs triplicate analyses and usually observes < 2% differences, with a 5% difference being admitted as the upper limit for reanalyses. Although no triplicate analyses were performed in this study, we assume the laboratory error to be < 5%.

3.3 Spectral analyses of sediment

For the spectral analyses, sediment samples were air-dried and placed in black cylindrical plastic containers with 6 cm diameter and 4 cm depth, totalling a volume of 113.1 cm3. The readings were taken at the Agricultural and Electronics Laboratory (LEMA) of UFC in a dark room with no reflective surfaces. A spectroradiometer covering the spectral range between 350 and 2500 nm (ASD FieldSpec®3 Hi-Res) was used with a single artificial light source (halogen lamp) oriented to the sample with a 45° zenith angle and 71 cm distance. The distance between sensor head and sample was one-third of the container diameter, to avoid influence of the edges on the readings. Each spectrum was obtained by automatic averaging of 30 measurements, and three spectra were collected from each sample with 120° rotation between readings. During the analysis, three optimizations and white reference measures were performed, the first prior to the spectral readings and the last two when the equipment indicated saturation.

The spectral readings cover the range between 350 and 2500 nm with 1 nm spectral resolution. However, due to noise observed at the edges of the measured spectral range, only the region between 400 and 2400 nm was considered for subsequent analyses.

3.4 Models of sediment characterization from spectroscopy

Establishing relationships between soil/sediment physicochemical properties and reflectance data is challenging due to the large number of possible combinations. Currently, partial least squares regression (PLSR; see Wold et al. 2001) is a widely used and successful technique for estimating target characteristics from spectral data (e.g., Viscarra Rossel et al. 2006a, 2008; Gomez et al. 2008; Lu et al. 2013; Ludwig et al. 2017). The PLSR algorithm selects orthogonal factors that maximize the covariance between the predictor variables X (spectral data) and the response variable Y (sediment attribute, in this case). It decomposes both X and Y to find new, mutually orthogonal components (scores), called latent variables. Regressions are then calculated between these new components of X and Y (Moreira et al. 2015).
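As an illustration of the decomposition described above, the following is a minimal pure-Python sketch of PLS regression for a single response variable (PLS1 with NIPALS-style deflation). It is not the ParLeS implementation used in this study; the function names `pls1_fit` and `pls1_predict` are hypothetical and chosen only for this example.

```python
def pls1_fit(X, y, n_components):
    """Fit a PLS1 model (single response) by successive deflation.

    X: list of spectra (rows = samples, columns = bands); y: list of attribute values.
    All names here are illustrative, not part of any published software.
    """
    n, m = len(X), len(X[0])
    x_mean = [sum(row[j] for row in X) / n for j in range(m)]
    y_mean = sum(y) / n
    E = [[row[j] - x_mean[j] for j in range(m)] for row in X]  # mean-centred X
    f = [v - y_mean for v in y]                                 # mean-centred y
    components, scores = [], []
    for _ in range(n_components):
        # Weight vector: direction in X with maximal covariance with y
        w = [sum(E[i][j] * f[i] for i in range(n)) for j in range(m)]
        norm = sum(v * v for v in w) ** 0.5
        if norm < 1e-12:  # nothing left to explain
            break
        w = [v / norm for v in w]
        # Scores (latent variable), X-loadings p and y-loading q
        t = [sum(E[i][j] * w[j] for j in range(m)) for i in range(n)]
        tt = sum(v * v for v in t)
        p = [sum(E[i][j] * t[i] for i in range(n)) / tt for j in range(m)]
        q = sum(f[i] * t[i] for i in range(n)) / tt
        # Deflate X and y so that the next scores are orthogonal to these
        E = [[E[i][j] - t[i] * p[j] for j in range(m)] for i in range(n)]
        f = [f[i] - q * t[i] for i in range(n)]
        components.append((w, p, q))
        scores.append(t)
    return {"x_mean": x_mean, "y_mean": y_mean,
            "components": components, "scores": scores}


def pls1_predict(model, x):
    """Predict the response for one spectrum x by replaying the deflation."""
    e = [x[j] - model["x_mean"][j] for j in range(len(x))]
    y_hat = model["y_mean"]
    for w, p, q in model["components"]:
        t = sum(e[j] * w[j] for j in range(len(e)))
        y_hat += q * t
        e = [e[j] - t * p[j] for j in range(len(e))]
    return y_hat
```

In this sketch, the number of components plays the role of the number of factors allowed in the model; with real spectra it would be chosen by cross-validation rather than fixed in advance.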

In our study, we used the ParLeS version 3.1 software provided by Viscarra Rossel (2008) to estimate the contents of N, P, K, SOC, EC, and clay from spectral data via PLSR modelling. Prior to PLSR modelling, pre-processing techniques were applied to improve the robustness of the models. First, detector jumps, present on a few occasions, were corrected routinely using in-house scripts. Then, ParLeS “spectral manipulation” options were applied, namely, (1) an SNV transformation (Barnes et al. 1989) for spectral normalization to remove interference due to light scattering, (2) a Savitzky-Golay filter (Savitzky and Golay 1964) for spectral smoothing, and (3) mean centring of the data. In our case, this selection was found to outperform other common pre-processing techniques such as spectral derivatives.
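The three pre-processing steps can be sketched in plain Python as follows. This is an illustration only and not the ParLeS code: the function names are hypothetical, the SNV uses the sample standard deviation, and the Savitzky-Golay window (5 points, quadratic polynomial) is an assumption, since the filter settings are not stated here.

```python
from statistics import mean, stdev


def snv(spectrum):
    """Standard normal variate: centre and scale each spectrum individually."""
    m = mean(spectrum)
    s = stdev(spectrum)  # sample standard deviation (assumed convention)
    return [(v - m) / s for v in spectrum]


def savgol5(spectrum):
    """Savitzky-Golay smoothing, 5-point window, quadratic polynomial.

    Window size and order are assumptions for illustration.
    The two bands at each edge are left unchanged.
    """
    coeffs = (-3, 12, 17, 12, -3)  # classical 5-point quadratic weights, sum = 35
    out = list(spectrum)
    for i in range(2, len(spectrum) - 2):
        out[i] = sum(c * spectrum[i - 2 + k] for k, c in enumerate(coeffs)) / 35
    return out


def mean_centre(spectra):
    """Subtract the per-band mean (over all samples) from every spectrum."""
    n = len(spectra)
    band_means = [sum(col) / n for col in zip(*spectra)]
    return [[v - bm for v, bm in zip(row, band_means)] for row in spectra]


def preprocess(spectra):
    """Apply SNV, smoothing and mean centring in the order listed in the text."""
    return mean_centre([savgol5(snv(s)) for s in spectra])
```

Because SNV rescales each spectrum by its own mean and standard deviation, two spectra that differ only by a multiplicative and an additive factor, as is typical of scattering effects, map to the same transformed curve.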

The regression models were developed individually for each of the reservoirs and catchments presented in Table 1, except those with fewer than 20 sediment samples, resulting in five reservoir models, three catchment models, and one basin model per sediment property (totalling 54 calibrated models). Due to the low number of samples in individual reservoirs (n ≤ 20) and for reasons of comparability between model performances, we did not split the datasets into calibration and validation subsets, but performed leave-one-out cross-validation, allowing a maximum of 12 factors. Further, we provide regression coefficients (intercept and slope) and bias of the calibrated models. Mean, standard deviation, and range of observed versus predicted sediment attributes are presented in the supplementary material.
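The leave-one-out procedure can be illustrated with a short sketch. To keep the example self-contained, a simple one-predictor linear regression stands in for the PLSR model; this substitution, like the function names, is an assumption made for illustration only.

```python
def fit_line(x, y):
    """Ordinary least squares for y = intercept + slope * x (stand-in model)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx


def leave_one_out(x, y):
    """Predict each sample with a model calibrated on all other samples."""
    predictions = []
    for i in range(len(x)):
        x_train = x[:i] + x[i + 1:]   # drop sample i from calibration
        y_train = y[:i] + y[i + 1:]
        slope, intercept = fit_line(x_train, y_train)
        predictions.append(intercept + slope * x[i])
    return predictions


# Number of calibrated models: (5 reservoirs + 3 catchments + 1 region) x 6 attributes
n_models = (5 + 3 + 1) * 6
```

Each sample is thus predicted exactly once by a model that never saw it, so all n samples contribute to both calibration and validation, which is the main reason for preferring this scheme when n is small.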

Performance of the regression models was assessed as the best combination of high coefficient of determination (R2) and low root-mean-square error (RMSE), while aiming for a ratio of performance to deviation (RPD) > 1.4. Usual ranges of RPD are taken as excellent (RPD > 2.50), very good (2.00 < RPD ≤ 2.50), good (1.80 < RPD ≤ 2.00), moderate (1.40 < RPD ≤ 1.80), weak (1.00 < RPD ≤ 1.40), and very poor (RPD ≤ 1.00) (Viscarra Rossel et al. 2006b). In addition, the Nash–Sutcliffe efficiency (NSE) coefficient was calculated as a measure of model performance, according to which the model can be considered very good (0.75 < NSE ≤ 1.00), good (0.65 < NSE ≤ 0.75), satisfactory (0.50 < NSE ≤ 0.65), or unsatisfactory (NSE ≤ 0.50) (Moriasi et al. 2007).
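The performance measures and class limits above can be expressed compactly as follows. This is a sketch under stated assumptions: RPD is taken as the standard deviation of the observed values divided by the RMSE, using the sample standard deviation, and the function names are hypothetical.

```python
from statistics import mean, stdev


def rmse(obs, pred):
    """Root-mean-square error between observed and predicted values."""
    return (sum((o - p) ** 2 for o, p in zip(obs, pred)) / len(obs)) ** 0.5


def rpd(obs, pred):
    """Ratio of performance to deviation: SD of observations / RMSE (assumed form)."""
    return stdev(obs) / rmse(obs, pred)


def nse(obs, pred):
    """Nash-Sutcliffe efficiency: 1 - residual variance / variance of observations."""
    m = mean(obs)
    residual = sum((o - p) ** 2 for o, p in zip(obs, pred))
    total = sum((o - m) ** 2 for o in obs)
    return 1 - residual / total


def classify_rpd(value):
    """RPD classes as listed in the text (Viscarra Rossel et al. 2006b)."""
    if value > 2.5:
        return "excellent"
    if value > 2.0:
        return "very good"
    if value > 1.8:
        return "good"
    if value > 1.4:
        return "moderate"
    if value > 1.0:
        return "weak"
    return "very poor"


def classify_nse(value):
    """NSE classes as listed in the text (Moriasi et al. 2007)."""
    if value > 0.75:
        return "very good"
    if value > 0.65:
        return "good"
    if value > 0.50:
        return "satisfactory"
    return "unsatisfactory"
```

A useful reference point is that predicting the mean of the observations for every sample yields NSE = 0, so any model with a negative NSE performs worse than that trivial estimate.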

To analyse the influence of spectral resolution on the accuracy of sediment attribute estimation from spectroscopic data, the original spectral curves of 1 nm resolution (2001 spectral bands) as provided by the instrument were resampled to 5 and 10 nm resolution, resulting in 400 and 200 bands, respectively. For each of these spectral resolutions, the model presenting the highest correlation with each sediment attribute, as determined by the physicochemical analyses, was selected. Again, the respective coefficients of determination (R2) were used as a measure of the goodness of fit.
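One simple way to obtain the coarser resolutions is to average adjacent 1 nm bands in non-overlapping windows. The resampling method is not specified in the text, so block averaging is an assumption of this sketch, as is the function name `resample`.

```python
def resample(spectrum, width):
    """Average consecutive bands in non-overlapping windows of `width` bands.

    Trailing bands that do not fill a complete window are discarded.
    Block averaging is an assumed method, used here for illustration.
    """
    n_out = len(spectrum) // width
    return [sum(spectrum[k * width:(k + 1) * width]) / width
            for k in range(n_out)]


# A flat dummy spectrum with one value per nm between 400 and 2400 nm
bands_1nm = [0.3] * 2001
bands_5nm = resample(bands_1nm, 5)
bands_10nm = resample(bands_1nm, 10)
```

With 2001 input bands, windows of 5 and 10 bands give 400 and 200 output bands, matching the band counts stated above; the single leftover band at the end of the range is dropped.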

To assess the influence of spatial scale on the estimations, the physicochemical data of the sediment sampling points were grouped at spatial scales varying from reservoir (< 100 km2) to regional (> 10,000 km2), according to Table 1. R2 values were calculated for all combinations of sediment attributes and spatial scales, making it possible to interpret how the correlation evolves with scale.

4 Results and discussion

4.1 Physicochemical and spectral characterization of the sediments

Sediment properties in the 10 studied reservoirs are summarized in Table 2, from which the variation within and between reservoirs can be observed. The largest variation among the reservoirs was observed in P content, with a minimum of 2 mg kg−1 in a sample of the Benguê reservoir and a maximum of 289 mg kg−1 in São Nicolau. Mean P values in those same reservoirs were 7 mg kg−1 and 82 mg kg−1, respectively.

Table 2 Statistics of sediment attributes for the three study catchments

Nitrogen, K, and SOC tend to be less variable among the studied reservoirs, with mean values ranging from 1.3 to 1.9 g kg−1, 0.8 to 2.0 cmolc kg−1, and 13.4 to 19.0 g kg−1, respectively. As with P, clay contents showed a large variation, with the highest value observed in the Pentecoste reservoir (maximum of 744 g kg−1), where the mean was 528 g kg−1, and the lowest in the Escola reservoir (minimum and mean of 37 and 249 g kg−1, respectively). Mean values of EC ranged from a minimum of 1.5 to a maximum of 6.7 dS m−1 in the Benguê and Escola reservoirs, respectively. Within each reservoir, the standard deviation ranged from 0.6 to 5.6 dS m−1.

The spectra of the sediment at all sampling points are shown in Fig. 3 (Benguê catchment), Fig. 4 (Fogareiro catchment), and Fig. 5 (Pentecoste catchment). There is little visually apparent contrast among the different catchments. Some variability can be observed in the VNIR between 500 and 900 nm (likely linked to carbon and sediment colour), in the slope between 1500 and 1800 nm (carbon, grain size), and in the water absorption features around 1400 and 1900 nm; all sediment spectra seem to contain a more or less distinct clay feature around 2200 nm. The overall reflectance (brightness) varies within each reservoir, in a narrow range for the Boqueirão reservoir (Fig. 3) and a wide range, e.g., for the Marengo reservoir (Fig. 4).

Fig. 3 Spectra of the sediment samples in the Benguê catchment, before pre-processing

Fig. 4 Spectra of the sediment samples in the Fogareiro catchment, before pre-processing

Fig. 5 Spectra of the sediment samples in the Pentecoste catchment, before pre-processing

4.2 Performance of models for sediment characterization from spectroscopy

4.2.1 Reservoir-scale

Performance of the models for characterization of sediments from spectroscopy at the reservoir-scale (< 40 km2) using partial least squares regressions is presented in Fig. 6 and Table 3, which show the cross-validation results for N, P, K, SOC, clay, and electrical conductivity. Note that the models were established for each of the five reservoirs individually, although results are presented together in Fig. 6A–F for comparison.

Fig. 6 Results of cross-validation (leave-one-out) of models for individual reservoirs (< 40 km2): A nitrogen, B phosphorus, C potassium, D soil organic carbon, E clay, and F electrical conductivity

Table 3 Calibrated models and their performance for characterization of sediments from spectroscopy at the reservoir-scale

In general, regression models to estimate clay, EC, and SOC presented the highest correlations between spectra and sediment properties (with R2 in the range of 0.49 to 0.85) and best performances, varying from satisfactory to very good according to the NSE coefficient (values in the range of 0.48 to 0.84) and from moderate to excellent according to RPD (most models falling in the range of 1.40 < RPD < 2.00, classified as moderate to good). Models for N presented moderate correlations as well (R2 in the range of 0.52 to 0.69), with performances classified as satisfactory to good according to NSE (in the range of 0.47 to 0.66) and moderate according to RPD (minimum of 1.42 and maximum of 1.75). However, the models for P and K performed unsatisfactorily to satisfactorily according to the NSE coefficient (0.22 ≤ NSE ≤ 0.51 and 0.05 ≤ NSE ≤ 0.62, respectively) and from weak to moderate according to RPD ranges (1.16 ≤ RPD ≤ 1.47 and 1.05 ≤ RPD ≤ 1.67, respectively). Among the reservoirs, models for Marengo and Fogareiro performed better (R2 in the ranges of 0.45 to 0.85 and 0.49 to 0.75, respectively; NSE from 0.44 to 0.84 and 0.46 to 0.75, respectively; RPD in the ranges of 1.37 to 2.59 and 1.40 to 2.06, respectively), whereas the results for Pentecoste presented high variation (NSE, R2, and RPD varying from 0.05 to 0.83, 0.08 to 0.83, and 1.05 to 2.47, respectively).

4.2.2 Catchment-scale

Results of models for sediment characterization at the catchment-scale (900–6000 km2) are presented in Fig. 7, whereas performance of the models is presented in Table 4. Again, note that the models were established for each catchment individually, although they are presented together in the graphs of Fig. 7A–F for comparison.

Fig. 7 Results of cross-validation (leave-one-out) of models for catchments (900–6000 km2): A nitrogen, B phosphorus, C potassium, D soil organic carbon, E clay, and F electrical conductivity

Table 4 Calibrated models and their performance for characterization of sediments from spectroscopy at the catchment-scale

As at the reservoir scale, regression models to estimate clay and EC usually presented coefficients of determination in the range of 0.4 to 0.8 and performances classified from satisfactory to very good according to the NSE coefficient and from moderate to very good according to RPD. However, low correlation was observed between spectra and EC at the Pentecoste catchment (R2 = 0.12), where the model performed unsatisfactorily (with NSE = 0.06 and RPD = 1.05), strongly influenced by an extreme value equivalent to almost five times the average. It is important to note that removal of this extreme value did not considerably improve the modelling results; therefore, we decided to keep all values.

In general, the SOC, N, P, and K models performed unsatisfactorily based on NSE (NSE < 0.5) and weakly based on RPD (RPD < 1.4), with some specific ones presenting satisfactory/moderate results: SOC and N at the Fogareiro catchment and P and K at the Benguê catchment. Correlations were also lower than those obtained for clay and EC, with R2 usually below 0.6 and as low as 0.01. It is important to note the very poor performance of SOC and N models at the Benguê catchment, with NSE < 0 and RPD ≤ 1.0. Negative NSE indicates that taking the average value of the samples is a better estimate than adopting the tested model.

4.2.3 Regional-scale

Figure 8 illustrates the results of sediment characterization from spectroscopy at the regional-scale, which comprises the sediment samples from all reservoirs in the Jaguaribe and Curu basins (total area of 82,500 km2). Unlike Figs. 6 and 7, these graphs show only one model each. The models’ performance is shown in Table 5.

Fig. 8 Results of cross-validation (leave-one-out) of models at the regional scale (82,500 km2): A nitrogen, B phosphorus, C potassium, D soil organic carbon, E clay, and F electrical conductivity

Table 5 Calibrated models and their performance for characterization of sediments from spectroscopy at the regional-scale

At the regional scale, the models for clay and EC presented good and moderate to good performance according to NSE (values of 0.65 and 0.69, respectively) and RPD (values of 1.70 and 1.81, respectively). The coefficients of determination were 0.65 and 0.69, indicating some level of correlation. The models for SOC, N, P, and K presented NSE values ranging from 0.20 to 0.39 (classified as unsatisfactory) and RPD values ranging from 1.12 to 1.29 (classified as weak), at this scale. Lower correlations were obtained between spectra and those properties, with 0.24 ≤ R2 ≤ 0.42.

4.2.4 Discussion

In this study, models’ performances expressed by NSE and RPD tend to decrease in models with low slope, i.e., when the dependent variable is less sensitive to the variation of the input data. For instance, slopes of the calibrated models at the reservoir scale are mostly in the range of 0.5 to 0.9. Some exceptions are K in the São Nicolau and Pentecoste reservoirs, and also P for the latter, in which the lowest NSE (0.23, 0.05, and 0.22, respectively) and RPD (1.17, 1.05, and 1.16, respectively) are observed.

In summary, the models for clay and EC performed in the range of good to very good at the spatial scales ranging from reservoirs (< 40 km2) up to regional (82,500 km2), with some specific ones rated as satisfactory. SOC models performed satisfactorily to very well at the reservoir scale, but degraded to unsatisfactory at the other scales. Similar results were found for N, whose models were satisfactory at the reservoir scale but tended to be unsatisfactory at the catchment and regional scales. The models for P and K presented high variation, from unsatisfactory to satisfactory at all investigated scales.

The overall performance at the reservoir scale benefits from the relatively small areas of the reservoirs, which tend to present more homogeneous factors controlling the sediment characteristics and transport conditions: soil type, land use, and hydrological regime. Similar findings have been reported by other authors investigating soil attributes with spectral data. For instance, Morellos et al. (2016) found good N and SOC estimations (R2 > 0.70) for small areas (~ 0.30 km2), and Franceschini et al. (2015) also found good model performance to estimate organic matter, K, and clay in small areas (~ 0.02 km2) using PLSR.

At the catchment and regional scales, slight overestimation of lower values and underestimation of higher values are observed for SOC. In fact, Ward et al. (2019) reported a similar effect, and explained that it is caused by the skewness of the SOC distribution and changes in the relationship between SOC and spectra for higher values. In our study, out of the 60 data samples (combination of six sediment properties in 10 reservoirs), 45% are highly skewed (|skewness| ≥ 1), 33% are moderately skewed (0.5 ≤ |skewness| < 1), and only 22% are approximately symmetric (|skewness| < 0.5).
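The skewness classes used above can be reproduced with a few lines of Python. The estimator is not stated in the text; the sketch below assumes the moment-based coefficient (third central moment divided by the 1.5 power of the second), and the function names are hypothetical.

```python
def skewness(values):
    """Moment-based skewness g1 = m3 / m2**1.5 (assumed estimator)."""
    n = len(values)
    m = sum(values) / n
    m2 = sum((v - m) ** 2 for v in values) / n  # second central moment
    m3 = sum((v - m) ** 3 for v in values) / n  # third central moment
    return m3 / m2 ** 1.5


def classify_skewness(g1):
    """Classes by absolute skewness, with the limits given in the text."""
    a = abs(g1)
    if a >= 1:
        return "highly skewed"
    if a >= 0.5:
        return "moderately skewed"
    return "approximately symmetric"
```

Applied to the values of one sediment property in one reservoir, `classify_skewness(skewness(values))` returns the class of that data sample; a long right tail, such as a few very high P values among many low ones, gives a large positive coefficient.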

Whereas the models for estimating sediment attributes from spectral data seem reliable and potentially replicable in areas up to 40 km2 at the reservoir scale in the study region, application to larger extents on a regional basis is challenging. At the catchment scale (Benguê, Pentecoste, and Fogareiro catchments, with areas ranging from approximately 1000 to 5000 km2) and regional scale (which embraces two large basins: Jaguaribe, with 74,000 km2 and Curu, with 8500 km2), performance of the models degraded. This finding indicates that estimating sediment attributes from spectroscopy seems not to be feasible at very large spatial scales: heterogeneity of sediment sources and processes contributing to sediment transport and deposition in reservoirs might not be captured by regression models.

Xu et al. (2018) reported that the performance of models to estimate soil or sediment attributes depends largely on the variability of the dataset. Due to differences in the source materials, sediments may have different physical and chemical characteristics, affecting the spectral response. In the future, methods to separate the dataset according to physicochemical and/or spectral similarity rather than spatial origin, such as cluster analysis (Ward et al. 2019), may help improve model robustness.

Still, the indirect characterization of sediment from reservoirs’ beds supports the practice of sediment reuse as fertilizer, by reducing the need for costly and time-consuming laboratory analyses. In this context, the overall good performance of the models to predict clay content is crucial, as clay is a major physical characteristic controlling water flux and retention in the soil, impacting crop growth and productivity, especially in dry environments. Also, good model performance for EC at all the investigated scales is useful as a first assessment of the suitability of sediment for reuse, by indicating reservoirs with high salinity that are unsuitable as nutrient sources for soil fertilization.

On the other hand, the low performance of most models for N, P, and K, especially at larger spatial scales, prevents accurate estimations of such fundamental nutrients for crop production and, therefore, indicates a limitation of the approach adopted in this study. Such results suggest that spectroscopy should not be the sole method to estimate nutrient content in sediments, but rather that it should complement laboratory analyses. Improvements might be achieved by revising the approach, e.g., by applying prior data transformations and/or clustering analyses as proposed by Ward et al. (2019), or by adapting the PLSR model.

4.3 Influence of the spatial scale on sediment attribute estimation

Spatial scale is an important feature to consider when proposing regression models for sediment characterization, since higher heterogeneity of the factors controlling sediment composition (soil types, land use, sediment transport conditions) is expected as the spatial scale increases. Figure 9 presents the coefficients of determination (R2) of the regression models, according to the spatial scales as defined in Table 1.

Fig. 9 Influence of the spatial scale on the accuracy of regression models for estimating sediment attributes

The graph shows that the correlations worsen both as the spatial scale of the reservoirs increases and as data from several reservoirs are aggregated, indicating that catchment- and regional-scale models tend to be less accurate. This is particularly notable for P (for which R2 drops from roughly 0.5 at the reservoir scale to 0.25 at the regional scale) and the other nutrients, whereas for clay and EC the models tend to be more stable, with R2 around 0.7 at all scales.

Three ranges of R2 can be identified in Fig. 9 according to the scale: for the reservoir scale (< 40 km2), R2 tends to be higher and less variable (generally in the range of 0.5 < R2 < 0.8); for the catchment scale (900–6000 km2), a decrease in R2 and higher variability can be observed (0.1 < R2 < 0.7); for the regional scale (82,500 km2), there is a strong tendency of degradation of the regression models’ accuracy, with R2 usually below 0.4.

This result shows the role of spatial scale in estimating sediment attributes: sediment generated in larger areas tends to present higher heterogeneity (Silva et al. 2018), hampering its characterization from VNIR-SWIR spectroscopy and reducing the performance of prediction models. This finding suggests that homogeneous areas should be established, within which the models could be transferred from one reservoir to another without (re)calibration requiring physicochemical laboratory data. For instance, Ward et al. (2019) demonstrated that spectral clustering can improve soil organic carbon model performance compared to a reference model that was calibrated on the whole database without clustering. Furthermore, the establishment of soil and sediment libraries, such as the Brazilian Soil Spectral Library (BSSL; Demattê et al. 2019b), can potentially improve the capacity to predict the attributes at various spatial scales by identifying patterns of spectral signatures.

4.4 Influence of the spectral resolution on sediment attribute estimation

By decreasing the spectral resolution down to 10 nm intervals, it was possible to assess its influence on the estimation of sediment attributes. Figure 10 presents the coefficients of determination (R2) of the models, at the reservoir (Fig. 10A) and catchment/regional scales (Fig. 10B).

Fig. 10 Influence of the spectral resolution on the accuracy of regression models for estimating sediment attributes at the reservoir (A) and catchment/regional (B) scales

Generally, very little impact is observed on the correlations between physicochemical and spectral characteristics when the spectral resolution is coarsened from 1 to 5 nm intervals: on average, the coefficient of determination is reduced by 1% at the reservoir scale, with only two models (out of 30 at this scale) showing an R2 decrease of more than 10%. At the catchment/regional scale, the average reduction of R2 was 6%, and 4 models (out of 24) presented an R2 reduction of more than 10%.

When the spectral resolution is decreased further to 10 nm intervals, the correlations are maintained at the same level, with little influence of the spectral resolution on sediment attribute estimation. At the reservoir scale, the average reduction of R2 (from 1 to 10 nm resolution) is 2%, with a reduction of more than 10% in 5 models. At the catchment/regional scale, the average reduction of R2 is 5%, and for 3 models the reduction is higher than 10%. This is particularly important when analysing data that are originally captured at lower spectral resolutions, such as by hyperspectral satellite sensors, since coarsening of the spectral resolution does not produce much degradation of the models’ performance.
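
The coarsening experiment can be sketched as follows. Two assumptions are made for illustration: spectra are coarsened by averaging consecutive 1 nm bands (the resampling method actually applied is not restated here, and real sensors have non-rectangular band responses), and the reported percentages are read as relative reductions of R2. The function names are hypothetical.

```python
def coarsen_spectrum(reflectance, factor):
    """Average consecutive groups of `factor` bands.

    With 1 nm input bands, factor=5 gives 5 nm bands and factor=10
    gives 10 nm bands. Trailing bands that do not fill a complete
    group are dropped.
    """
    if factor < 1:
        raise ValueError("factor must be a positive integer")
    n_groups = len(reflectance) // factor
    return [
        sum(reflectance[i * factor:(i + 1) * factor]) / factor
        for i in range(n_groups)
    ]


def r2_reduction_percent(r2_fine, r2_coarse):
    """Relative reduction of R2 (in %) with respect to the fine resolution."""
    if r2_fine == 0:
        raise ValueError("R2 at fine resolution must be non-zero")
    return 100.0 * (r2_fine - r2_coarse) / r2_fine
```

A model would be recalibrated on the coarsened spectra, and `r2_reduction_percent` would then compare its R2 against that of the 1 nm model; a model counts as degraded by more than 10% when the returned value exceeds 10.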

Similar results were found by Knadel et al. (2013), who tested the influence of different spectral resolutions up to 8 nm to estimate clay and SOC and found that lower spectral resolution did not affect model performance. Adeline et al. (2017) reported that a reduction of spectral resolution caused only a slight decrease in model prediction performance when estimating soil attributes. Peng et al. (2014) performed spectral resampling between 2 and 10 nm on VNIR spectral data to estimate SOC, and found that, in general, there was little variation in the accuracy of the models. Yang et al. (2012) observed that calibration models for estimating soil N and C are insensitive to reduction of spectral resolution, and models with coarser resolution (e.g., 100 nm) presented very similar accuracy to those with 1 nm resolution.

Modelling based on spectroscopy, such as in this study, helps to build the knowledge necessary for the use of hyperspectral satellite imagery for soil and sediment characterization (Viscarra Rossel et al. 2006a). For instance, we demonstrate that model predictability is only slightly impacted by coarsening the spectral resolution up to 10 nm, which is in the range of the spectral sampling distance of the PRISMA hyperspectral satellite (lower than 11 nm in the VNIR and SWIR) (Cogliati et al. 2021) and the upcoming EnMAP hyperspectral satellite (6.5 nm in the VNIR and 10 nm in the SWIR) (Guanter et al. 2015). As argued by Braga et al. (2019), satellite-based remote sensing approaches may help not only to identify sediment properties, but also areas with nutrient deficiency in soils, enabling an optimization of the nutrient balance in catchments. However, the large pixel size of current and planned hyperspectral satellite missions, the limitation to the topsoil layer, and the cover of soil and sediment, e.g., by litter, can be challenging.

5 Conclusions and outlook

The use of VNIR-SWIR spectroscopy to estimate physicochemical sediment attributes in the semiarid region of Brazil has proven to be a promising approach for the characterization of sediments silted in surface water reservoirs.

Partial least squares regression models performed in the range of good to very good in the prediction of clay and EC at spatial scales ranging from reservoirs (< 100 km2) up to regional (> 10,000 km2). The models for other sediment properties performed worse: for instance, SOC and N models were satisfactory to very good at the reservoir scale, but degraded to unsatisfactory at the other spatial scales, whereas the models for P and K presented high variation, from unsatisfactory to satisfactory at all investigated scales. Such findings indicate that model performance is affected by the heterogeneity of the factors controlling the sediment characteristics and transport conditions (soil type, land use, and hydrological regime), which tends to be higher in larger extents. Therefore, models tend to be more accurate when applied at small scales.

Coarsening of the spectral resolution in the range of 1 to 10 nm reduced the models’ performance only slightly, not impacting the overall capacity of estimating the sediment attributes. This conclusion highlights the importance of field-based studies to estimate sediment attributes from VNIR-SWIR spectroscopy, as a step towards the application of hyperspectral satellite imagery to characterize the sediment from surface reservoirs’ beds. The spectral resolutions of satellites like PRISMA, EnMAP, and others are in the range tested in this study, but some obstacles remain, such as the need for sediment exposure during the satellite passage, which implies empty reservoirs with no litter covering the bed. Despite the increasing availability of spaceborne imaging spectroscopy data, the methodology is not yet fully operational in terms of regular Earth observation, and the temporal stability of sediment characteristics (in this region) needs to be assessed.

In general, spectroscopy-based indirect characterization of sediment supports the practice of sediment reuse as fertilizer, by reducing the need for costly and time-consuming laboratory analyses. The good performance in the estimation of clay is crucial, as clay content is a key feature controlling water flux and retention in the soil. Furthermore, reliable estimations of EC help to indicate the feasibility of sediment reuse, by identifying reservoirs with high salinity that are, therefore, unsuitable as nutrient sources for soil fertilization.

However, the lower performance for N, P, and K suggests that the proposed method needs improvement and should be applied as a complement to laboratory analyses, as these are fundamental nutrients for soil fertilization.