1 Introduction

Remote sensing of vegetation builds on the spectral properties of vegetation: in general, vegetation absorbs strongly in the red band and reflects strongly in the near-infrared band. Vegetation indices are used as proxies for understanding temporal and spatial variations of vegetation structure and density.

Although these proxies are widely used, they have several problems and limitations. Saturation is one of the most prominent: it limits the usefulness of vegetation indices and leads, for example, to underestimation of ecosystem productivity in highly vegetated or densely forested areas. NDVI, the most widely used vegetation index, yields poor estimates of vegetation productivity in areas with 100% vegetation cover, and during the peak of the season it is inadequate for showing important vegetation properties [1,2,3,4,5]. The most plausible explanation is that the red band (680 nm) strongly absorbs electromagnetic energy, a behaviour represented by an exponential function [1]; when canopy cover reaches 100%, the amount of red light that leaves can absorb reaches a peak [1, 5], whereas near-infrared (NIR) reflectance continues to increase because additional leaves cause multiple scattering [6]. In crop fields, for instance, red reflectance shows a nearly flat response once the leaf area index (LAI) exceeds 2, while NIR reflectance continues to respond significantly to changes in moderate-to-high vegetation density (LAI from 2 to 6). The imbalance between the slight decrease in red reflectance and the high NIR reflectance produces only a slight change in the NDVI ratio and hence a poor relationship with biomass. To double the ratio, the NIR reflectance would have to almost double to compensate for the slight change in red reflectance, which is not always the case [6].

In response, various attempts have been made to address the saturation problem. They fall into three groups: mathematical improvement or modification of the indices; comparison, selection and combined use of less sensitive indices; and the use of hyperspectral or red-edge bands. The Wide Dynamic Range Vegetation Index (WDRVI = (a * ρNIR – ρred)/(a * ρNIR + ρred)), a simple modification of NDVI, introduces a weighting coefficient a (with values between 0.1 and 0.2) that increases the correlation with vegetation fraction by linearizing the relationship for three crop canopies: wheat, soybean, and maize. The approach gave a sensitivity to moderate-to-high LAI (between 2 and 6) at least three times greater than that of NDVI, which enables more robust characterization of crop physiological and phenological characteristics [7].
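The effect of the weighting coefficient can be illustrated numerically. The following minimal Python sketch evaluates NDVI and WDRVI for two canopies; the function names and the reflectance values are hypothetical and chosen only for illustration (red nearly flat, NIR still rising), and a = 0.2 is taken from the range quoted above.

```python
def ndvi(nir: float, red: float) -> float:
    """Normalized difference of NIR and red reflectance."""
    return (nir - red) / (nir + red)


def wdrvi(nir: float, red: float, a: float = 0.2) -> float:
    """WDRVI = (a*NIR - red) / (a*NIR + red); a = 1 reduces to NDVI."""
    return (a * nir - red) / (a * nir + red)


# Hypothetical canopy reflectances (illustrative only, not measured data):
# red is almost flat at high LAI while NIR keeps increasing.
moderate = {"nir": 0.40, "red": 0.03}
dense = {"nir": 0.55, "red": 0.025}

for name, index in (("NDVI", ndvi), ("WDRVI", wdrvi)):
    change = index(**dense) - index(**moderate)
    print(f"{name}: change from moderate to dense canopy = {change:.3f}")
```

With these illustrative values, WDRVI changes more than NDVI between the two canopies, which is the behaviour the weighting coefficient is intended to produce.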

Vegetation indices have also been combined, taking into account how differently the various indices respond to vegetation density and which bands a sensor provides. For sensors with red and near-infrared bands, NDVI combined with the Simple Ratio (RVI) was the best combination for maize, although such combinations are species-specific. Red-edge NDVI and the Chlorophyll Red-edge index (CIred-edge) are recommended for sensors with red-edge and near-infrared bands [8]. Using RapidEye images, three improved indices that combine red and red-edge bands were developed, namely NDVIred&RE (red and red-edge NDVI), MSRred&RE (red and red-edge MSR index), and CIred&RE (red and red-edge CI); compared with existing indices, they proved to be powerful alternatives for LAI estimation of crops with a wide chlorophyll range [9]. The combined use of NDVI with RVI was also developed after the relationship between NDVI and the ratio vegetation index (RVI) at high NDVI values was assessed. It produced an empirical equation for estimating saturation-adjusted NDVI from RVI in cornfields, which enables better prediction of actual vegetation conditions [10].

Unlike multispectral sensors, which have few and often broad bands, hyperspectral sensors make it possible to derive vegetation indices from narrow bands across the 350–2500 nm range of the electromagnetic spectrum. Many studies have acknowledged the importance of narrow-band vegetation indices for estimating forest canopy LAI [11,12,13,14,15]. Narrow bands placed in the red-edge region (680–750 nm) are influenced by plant properties such as canopy biomass and leaf chlorophyll content. The point of maximum slope on the red–infrared curve, referred to as the red edge position, has been found to be sensitive to biomass variations in green, as opposed to senescing, vegetation and hence less susceptible to the saturation problem [3, 16]. In another study, three methods (NN, PLS and VI) were applied to identify the most informative spectral bands for LAI estimation from hyperspectral reflectance. The study used 8 years of observations, and the red-edge and NIR bands were found to be the most informative [17].

In a study using the narrow bands of Hyperion imagery, three major spectral regions showed the strongest relationship with LAI: far-red and red edge (680–740 nm), NIR (885–1134 nm) and SWIR (1639–1790 nm) [18]. With a linear model, the best narrowband combinations gave an R2 > 0.65 in the boreal forest area and a much higher R2 > 0.85 for a coniferous forest. Regarding band placement, narrow spectral bands close to the water absorption regions are uniquely useful for estimating LAI. The usefulness of the red edge position (REP) for LAI estimation stems from its sensitivity to leaf and chlorophyll content: an increase in the amount of chlorophyll or in LAI shifts the REP to longer wavelengths because the chlorophyll absorption region at approximately 680 nm widens [19, 20]. In contrast to the SWIR bands, however, the far-red and red-edge bands are sensitive to species composition and show poor relationships over mixed vegetation. Overall, although repeated studies assert the importance of the red-edge and SWIR spectral regions, the specific bands providing the strongest relationship with LAI vary across studies [18]. Narrow multispectral bands (400–800 nm, 10 cm resolution) from an Unmanned Aerial Vehicle (UAV) successfully estimated LAI in a wheat crop using a modified triangular vegetation index (MTVI2), with calibration and validation R2 of 0.79 and 0.8, respectively [21].

Most existing studies on the saturation of vegetation indices have focused on crops, and other vegetation types such as forest and grassland, which have unique productivity characteristics, presumably remain unaddressed [8]. Meanwhile, the recently available Sentinel-2 sensor, designed with much promise for vegetation studies, provides three red-edge bands, one red band, two near-infrared bands, and two shortwave infrared bands. Very recent studies have estimated biophysical parameters using simulated and real Sentinel-2 data. For instance, simulated Sentinel-2 data were applied to estimate canopy chlorophyll content (CCC), the fraction of photosynthetically active radiation absorbed by the vegetation canopy (FAPAR), and the fraction of photosynthetically active radiation absorbed only by its photosynthesizing components (GFAPAR) in a dynamic subalpine grassland ecosystem [22]. Sentinel-2A was combined with Sentinel-1A to obtain decametric LAI estimates with high spatial and temporal resolution. For rice fields, LAI estimates were obtained by inverting the PROSAIL radiative transfer model with Gaussian regression [23]. A very recent study [24] produced a novel index called SeLI for estimating gLAI from Sentinel-2 bands, implemented using a multi-crop dataset. Using the weighted difference vegetation index, LAI could be estimated with an R2 of 0.809 from band 4 at the 10 m spatial resolution of Sentinel-2, which implies that the red-edge bands at 20 m spatial resolution can be avoided [25]; however, that study did not evaluate the red-edge bands.

Nonetheless, apart from these few studies, the saturation properties of Sentinel-2 bands remain an open area, because the data have become available only recently; much progress is expected before the real capability of the sensor is fully appreciated and practical methods for estimating biophysical parameters from real Sentinel-2 bands are implemented. This study is therefore primarily motivated by the need to understand the sensitivity of the narrow bands of Sentinel-2 to saturation in a mixed crop-forest ecosystem. It is also original in examining the inter-band variation of the multispectral bands of Sentinel-2 in order to explain saturation properties explicitly. The mixed ecosystem was selected purposively: the maximum green leaf area index is often found in highly forested areas, while crop areas represent low values, so placing the study in a mixed agricultural-forest ecosystem makes it possible to understand the full saturation characteristics of the bands. Accordingly, this paper evaluates the saturation of three widely applied vegetation indices calculated from Sentinel-2 bands.

2 Materials and methods

2.1 Study area

The study site comprises two adjacent crop and forest ecosystems in southeastern Ethiopia. The forest covers an area of 2671 km2 at an average elevation of 2101 masl and lies within the Harenna Forest of the Bale Mountains National Park. The cropland, located in weredas of the Arsi-Bale districts, is widely known as part of the highly productive wheat belt of Ethiopia. It covers an area of 590 km2 at an average altitude of 2448 masl (Fig. 1).

Fig. 1 Map of the study area

2.2 Data

This study used Sentinel-2 imagery and a Leaf Area Index (LAI) product; to allow intercomparison, equivalent products from the two sensors were collected on the same or close dates. Images and products were acquired for a specific geographic area and for four time periods (Table 1) so that the comparison accounts for temporal variation.

Table 1 Dates of the imagery used for the study

The LAI product used in this study is the Copernicus Global Land Service (CGLS) Leaf Area Index (LAI) Version 1 product, derived from PROBA-V daily data at 300 m spatial resolution. It is a 30-day composite, updated every 10 days using a moving window. The PROBA-V LAI Version 1 is derived from the SPOT/VGT-like Top of Atmosphere (TOA) PROBA-V reflectance in three VNIR spectral bands (B0 (blue), B2 (red), B3 (NIR)) generated by the PROBA2VGT module. The retrieval methodology is described in [26].

The product has some reported limitations; notably, the Collection 300 m Version 1 product is not accurate over wetlands and can give unrealistic values over water. This is because the MODIS and CYCLOPES products used to train the neural networks do not perform well over such surfaces. Nonetheless, accuracy assessments of the PROBA-V LAI product at 300 m spatial resolution over Europe have shown good overall quality with good spatial and temporal consistency. No major issues were detected, except for the low precision of the NRT guess [27].

2.3 Image preprocessing and processing

Six Sentinel-2 bands were used for stacking: Band 4 (central wavelength (CW) of 0.665 μm, bandwidth (BW) of 30 nm), Band 5 (CW = 0.705 μm, BW = 15 nm), Band 6 (CW = 0.740 μm, BW = 15 nm), Band 7 (CW = 0.783 μm, BW = 20 nm), Band 8 (CW = 0.842 μm, BW = 115 nm) and Band 8A (CW = 0.865 μm, BW = 20 nm). Because the study uses images from different platforms and involves cross-comparison, image-to-image registration between the sensors was carried out to ensure geometric accuracy. The images from all sensors were then subset.

Three vegetation indices were derived: the Normalized Difference Vegetation Index (NDVI), or generic Normalized Difference Index (NDI), the Simple Ratio (SRI) and the Transformed Vegetation Index (TVI). Because the original NDVI depends on specific bandwidths, NDI is the more representative name; nonetheless, given the popularity of NDVI, that name is used here. These indices, whose formulae are presented below, were selected mainly because they employ red and near-infrared bands and are widely used for estimating biomass in agricultural and ecological applications [5].

2.3.1 Normalized difference vegetation index (NDVI)

Probably the most widely used index is the Normalized Difference Vegetation Index (NDVI), originally proposed by [28]:

$${\text{NDVI}} = \frac{NIR800 - RED670}{NIR800 + RED670}$$
(1)

The generic Normalized Difference Index (NDI) is defined as

$${\text{NDI}} = \frac{Rb - Ra}{Rb + Ra}$$
(2)

where Ra and Rb are the reflectances at wavelengths a and b, which are chosen to provide the best correlation with LAI (Leaf Area Index), Ch (chlorophyll) and other biophysical parameters obtained from experimental data.

2.3.2 Simple ratio vegetation index (SRI) [29]

$${\text{SRI}} = {\text{NIR}}/{\text{RED}}$$
(3)

2.3.3 Transformed vegetation index (TVI)

$${\text{TVI}} = \sqrt[2]{{\frac{{\left( {NIR - RED} \right)}}{{\left( {NIR + RED} \right)}} + 0.5}}$$
(4)
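Eqs. (1) to (4) can be sketched in a few lines of Python. The example below is a minimal illustration rather than the processing chain used in the study: the function names and the reflectance values are hypothetical, and the loop covers only the pairings of bands 4 and 5 with bands 8 and 8A.

```python
import math


def ndvi(nir: float, red: float) -> float:
    """Eqs. (1)/(2): normalized difference of two band reflectances."""
    return (nir - red) / (nir + red)


def sri(nir: float, red: float) -> float:
    """Eq. (3): simple ratio of NIR to red reflectance."""
    return nir / red


def tvi(nir: float, red: float) -> float:
    """Eq. (4): square root of (NDVI + 0.5).

    Only defined when NDVI >= -0.5; math.sqrt raises ValueError otherwise.
    """
    return math.sqrt(ndvi(nir, red) + 0.5)


# Hypothetical surface reflectances for one pixel (illustrative values only),
# keyed by Sentinel-2 band name.
pixel = {"B4": 0.04, "B5": 0.08, "B8": 0.45, "B8A": 0.47}

for red_band in ("B4", "B5"):
    for nir_band in ("B8", "B8A"):
        red, nir = pixel[red_band], pixel[nir_band]
        print(f"{nir_band}/{red_band}: NDVI={ndvi(nir, red):.3f} "
              f"SRI={sri(nir, red):.2f} TVI={tvi(nir, red):.3f}")
```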

2.4 Saturation property and sensitivity analysis

Two major analysis methods, correlation and best-curve fitting, were used to evaluate the saturation behaviour of the three indices versus gLAI (Table 2). First, a correlation coefficient was computed between each index and gLAI to detect any valid association. Where the result indicated a strong association, the best-fit function describing the relationship was determined by comparing candidate mathematical functions using the coefficient of determination and RMSE. Then, a detailed analysis of the saturation properties was carried out graphically. Furthermore, to understand the saturation properties across the full range, region-wise sensitivity was assessed by computing the noise equivalent of the green leaf area index as a function of the green leaf area index (NEΔgLAI vs. gLAI) [30], which indicates the accuracy of the gLAI estimation:

$${\text{NE}}\Delta {\text{gLAI}} = \frac{{{\text{RMSE}}\left( {{\text{VI vs. gLAI}}} \right)}}{{d\left( {{\text{VI}}} \right)/d\left( {{\text{gLAI}}} \right)}}$$
(5)

where NEΔgLAI is the noise equivalent (NE) of gLAI, RMSE(VI vs. gLAI) is the RMSE of the VI versus gLAI relationship, and d(VI)/d(gLAI) is the first derivative of the VI with respect to gLAI.

Table 2 Combinations of Sentinel-2 bands used

The NEΔgLAI takes into account not only the RMSE of gLAI estimation but also accounts for the sensitivity of the VI to gLAI, thus providing a metric accounting for both the scattering of the points from the best-fit function and the slope of the best-fit function [8]. The NEΔgLAI provides a measure of how well the VI responds to gLAI across its entire range of variation.
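Eq. (5) can be illustrated with a short Python sketch. It assumes that the best-fit function is a polynomial stored as a list of coefficients; the coefficients, the RMSE value and the function names below are hypothetical and serve only to show how NEΔgLAI grows as the fitted curve flattens.

```python
def poly_eval(coeffs: list[float], x: float) -> float:
    """Evaluate c0 + c1*x + c2*x**2 + ... at x (Horner's scheme)."""
    result = 0.0
    for c in reversed(coeffs):
        result = result * x + c
    return result


def poly_derivative(coeffs: list[float]) -> list[float]:
    """Coefficient list of the first derivative of the polynomial."""
    return [k * c for k, c in enumerate(coeffs)][1:]


def noise_equivalent(coeffs: list[float], rmse: float, glai: float) -> float:
    """Eq. (5): RMSE of the VI vs. gLAI fit divided by d(VI)/d(gLAI)."""
    slope = poly_eval(poly_derivative(coeffs), glai)
    return rmse / slope


# Hypothetical third-order fit VI = c0 + c1*g + c2*g**2 + c3*g**3 and RMSE
# (illustrative numbers, not results from the study).
fit = [0.2, 0.36, -0.06, 0.004]
rmse = 0.03

for g in (1.0, 3.0, 4.5):
    print(f"gLAI={g}: NE = {noise_equivalent(fit, rmse, g):.3f}")
```

A flatter fitted curve (smaller slope) gives a larger noise equivalent for the same RMSE, which is how the metric flags the saturated part of the gLAI range.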

3 Results and discussion

3.1 Correlation matrix of vegetation indices

Table 3 presents the correlation matrix among the NDVI, SRI and TVI values computed from different combinations of Sentinel-2 bands, as well as their correlation with green LAI. For all three vegetation indices, values computed from bands 4 and 5 coupled with bands 8 and 8a correlate strongly with green LAI, consistently across the four time periods. In addition, the inter-band associations between b4/b5 and b8/b8a are strong. Conversely, NDVI values computed from bands 6 and 7 coupled with bands 8 and 8a correlate weakly with LAI, again consistently across the four time periods, except for b6 with b8a. This result implies that the linear relationship is strong and potentially useful for the groups with a strong correlation coefficient, whereas the linear relationship for bands 6/7 combined with bands 8/8a is not important.

Table 3 Correlation matrix outputs among band combinations of sentinel-2 and with gLAI

3.2 Best fit functions and sensitivity analysis

Section 3.1 revealed important linear associations and identified the bands involved. To understand the saturation behaviour of these associations, three candidate functions were fitted: logarithmic, linear and polynomial. The best curves for NDVI versus gLAI, for bands 4/5 combined with bands 8/8a, were obtained with a third-order polynomial; Fig. 2 presents the case of bands 4 and 8. The polynomial function gave a coefficient of determination of 0.98, compared with 0.96 and 0.95 for the logarithmic and linear functions, respectively.
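The comparison of candidate functions by coefficient of determination and RMSE can be sketched as follows. This is a simplified illustration, not the fitting procedure used in the study: it covers only the linear and logarithmic candidates (a polynomial fit would need a general least-squares routine such as numpy.polyfit), and the gLAI and VI samples and helper names are hypothetical.

```python
import math


def fit_line(x, y):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def r2_and_rmse(y, predicted):
    """Coefficient of determination and root mean square error."""
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - pi) ** 2 for yi, pi in zip(y, predicted))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot, math.sqrt(ss_res / len(y))


# Hypothetical samples following a saturating curve (illustrative only).
glai = [0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 5.0]
vi = [0.30, 0.48, 0.60, 0.68, 0.78, 0.83, 0.86]

# Each candidate is fitted as a straight line in a transformed gLAI.
candidates = {"linear": lambda g: g, "logarithmic": math.log}
results = {}
for name, transform in candidates.items():
    t = [transform(g) for g in glai]
    intercept, slope = fit_line(t, vi)
    predicted = [intercept + slope * ti for ti in t]
    results[name] = r2_and_rmse(vi, predicted)
    print(f"{name}: R2={results[name][0]:.3f} RMSE={results[name][1]:.3f}")
```

For these saturating samples the logarithmic candidate fits better than the straight line; the function with the highest R2 and lowest RMSE would be retained as the best fit.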

Fig. 2 NDVI (band 4 and 8) plotted versus gLAI for three equations

For inter-band comparison, the third-order polynomial function, selected as the best one, was fitted to four combinations of NDVI using four bands; the curves are presented in Fig. 3.

Fig. 3 Best fit function for NDVI (bands 4, 5 with bands 8 and 8a) plotted versus gLAI

Among the NDVI curves, the NDVI values computed from b4 and b8 and from b4 and b8a increase over a wider range and reach a value of approximately 0.9. In contrast, the NDVI values computed from b5 and b8 and from b5 and b8a cover a narrow range and reach a maximum of 0.7. For all four curves, the slope is initially steep, the rate of increase then declines, and the curve becomes almost horizontal beyond a gLAI of 3.

This observation can be explained using Fig. 4, in which NEΔgLAI increases sharply from a gLAI of 3 and peaks at a gLAI of around 4.5; this clearly indicates that the saturation is associated with these highest values of NEΔgLAI. Fig. 4 also indicates that, among the four curves, NDVI computed from bands 8a and 4 has the lowest NEΔgLAI. Band 4 combined with bands 8 and 8a is thus less saturated than band 5 combined with bands 8 and 8a, and it is therefore band 4 that plays the key role in producing the lowest saturation.

Fig. 4 Noise equivalent of the green leaf area index plotted versus green leaf area index for NDVI for four bands

Similarly, for SRI, the exponential growth function was the best fit among the five functions tested, with the highest coefficient of determination, 0.94. The power, polynomial, linear and logarithmic functions gave coefficients of determination of 0.92, 0.85, 0.84 and 0.78, respectively (Fig. 5 shows the case of band 4 coupled with band 8).

Fig. 5 SRI (band 4 and 8) plotted versus gLAI for five functions

To evaluate the inter-band variation, plots were prepared using the exponential growth function for bands 4, 5, 8 and 8a (Fig. 6). Among the four plots, SRI computed from band 4 coupled with bands 8 and 8a shows the steepest exponential growth. In contrast, SRI computed from band 5 coupled with bands 8 and 8a shows a relatively slow exponential increase.

Fig. 6 Exponential growth function of SRI (bands 4, 5 with bands 8 and 8a) plotted versus gLAI

The slowly increasing slopes of the SRI curves up to a gLAI of 3, especially for the band combinations 8a and 4 and 8 and 4, are consistent with the highest NEΔgLAI values for these bands, as depicted in Fig. 7.

Fig. 7 Noise equivalent of the green leaf area index plotted versus green leaf area index for SRI using four bands

Nonetheless, at higher gLAI values all band combinations have smaller NEΔgLAI values, which implies that SRI performs well at higher gLAI compared to NDVI (see Fig. 4 for comparison). Overall, SRI computed using band 4 is less saturated than SRI computed using band 5.

For TVI, on the other hand, the polynomial function produced the best fit for the association between TVI and gLAI among the five candidate functions, with the highest R2 values of 0.98 and 0.96 for bands 4 and 8 (Fig. 8) and for bands 5 and 8, respectively.

Fig. 8 TVI (band 4 and 8) plotted versus gLAI

For inter-band comparison using the polynomial function, which is the best-fit function, plots of TVI versus gLAI are presented in Fig. 9. Two groups of curves can be identified: band 4 coupled with bands 8 and 8a on the one hand, and band 5 coupled with bands 8 and 8a on the other. The two curves in the first group cover a wider range of TVI values, reaching up to 1.2, whereas the two curves in the second group cover a narrow range with a maximum below 1.1. Thus, the two TVIs in the first group, b8_b4 and b8a_b4, are less saturated than b8_b5 and b8a_b5.

Fig. 9 Best fit function for TVI (bands 4, 5 with bands 8 and 8a) plotted versus gLAI

As shown in Fig. 10, the NEΔgLAI versus gLAI curves lead to a similar result; in particular, TVI from bands 8a and 4 shows values that are not only smaller but also consistent across the full range of gLAI. TVI calculated from bands 8a and 5 has the second-lowest NEΔgLAI: at a gLAI of 4.5, its NEΔgLAI is smaller than the values for the remaining two band combinations, indicating that lower NEΔgLAI values are associated with band 8a.

Fig. 10 Noise equivalent of the green leaf area index plotted versus green leaf area index for TVI using four bands

Previous research in agricultural fields using simulated Sentinel-2 data produced comparable findings. Using a normalized difference index (NDI), LAI could be derived with 674 nm and 712 nm as the best-performing bands, which are positioned close to Sentinel-2 B4 (665 nm) and the new red-edge band B5 (705 nm) [23]. Similarly, for potato fields, the 10 m red band, b4, enabled LAI to be estimated using the weighted difference vegetation index (WDVI) with an R2 of 0.809, and it was stated that the red-edge bands, available at 20 m spatial resolution, could be avoided [25]. In a boreal forest study using simulated Sentinel-2 bands, SRI and NDVI gave the highest R2 values compared with other types of vegetation indices; for effective LAI estimation, band 4 combined with band 8 was the best combination, with an R2 of 0.93 [31].

A very recent study [24] using real Sentinel-2 images produced a new green LAI index called SeLI, a normalized index based on band 5 and band 8a, which was the best-performing index among those tested on a multi-crop dataset. With linear fitting, R2 values of 0.708 and 0.732 were obtained for the calibration and validation datasets, respectively. Thus, despite the difference between the present study area (mixed agricultural-forest) and the agricultural areas of that study, the results are consistent.

On the other hand, for all three indices computed from bands 6 and 7 coupled with bands 8 and 8a, the correlation coefficients between the VIs and gLAI are very small, revealing a weak association (Table 3). These observations are supported by small coefficients of determination: none of the candidate functions, linear or non-linear (logarithmic, polynomial, exponential and power), produced a strong relationship. For instance, the logarithmic and polynomial functions both yield a coefficient of determination of 0.15, and the linear function gives a very close result of 0.14. Consequently, these bands are not useful for discriminating biomass parameters such as gLAI.

3.3 Sensitivity analysis of inter-vegetation indices

The preceding discussion presented the saturation characteristics of the bands for the three vegetation indices. It is also important to understand how the three vegetation indices differ in estimating the green leaf area index. This is done by analyzing plots of the noise equivalent of the green leaf area index versus the green leaf area index. In the previous sections, bands 4 and 5 were identified as the best-performing bands, so the analysis in this part considers only these bands.

As shown in Fig. 11a, using bands 8 and 4, NDVI has the lowest NEΔgLAI compared with SRI and TVI for low to medium gLAI values (< 4.2). For medium to high gLAI values, on the other hand, all three vegetation indices have similar NEΔgLAI values, although TVI has the smallest.

Fig. 11 Noise equivalent of the green leaf area index plotted versus gLAI for three vegetation indices. a Bands 8 and 4, b Bands 8 and 5, c Bands 8a and 4, d Bands 8a and 5

The analyses for the band combinations 8 and 5 and 8a and 5 (see Fig. 11b and d) gave very similar results. SRI has the largest NEΔgLAI for gLAI values below 2, a range in which both NDVI and TVI have similarly small NEΔgLAI values. Beyond a gLAI of 2, SRI has by far the lowest NEΔgLAI. Therefore, either NDVI or TVI is preferred for determining gLAI in the former range, whereas SRI outperforms the other two indices in the latter range.

On the other hand, among the three vegetation indices compared using bands 8a and 4 (see Fig. 11c), TVI has the lowest NEΔgLAI values across the full range of gLAI, which implies that TVI performs best for estimating gLAI. This is an interesting result in that a single index with significantly lower NEΔgLAI values and a strong association with gLAI is obtained, which could be very useful for quantifying gLAI in mixed agricultural and forest ecosystems. For the remaining band and vegetation index combinations, combining indices by setting thresholds could be one possible solution [8].

The outputs of the present study should be considered within the context of the research. The study employed the VI approach for estimating LAI, involving two bands at a time; other methods that use a larger number of bands for LAI estimation, for instance lookup tables and artificial neural networks, could offer further ways to exploit the different bands of Sentinel-2 [32].

4 Conclusion and recommendation

The three vegetation indices NDVI, SRI and TVI, computed from Sentinel-2 band 4 (red) and band 5 (red edge) coupled with bands 8 and 8a, showed some level of saturation. Across the three vegetation indices, the combination of band 4 with band 8a is the least saturated of all the evaluated combinations. Vegetation indices calculated from band 4 with b8 and b8a are less saturated than those computed from band 5 with b8 and b8a. In particular, TVI computed from bands 8a and 4 is the best-performing combination as a tool for discriminating green leaf area index in a mixed agriculture-forest ecosystem. Under a linear model, NDVI derived from band 4 or band 5, the former being slightly more important, could be reliably used to estimate a biophysical parameter such as green leaf area index, explaining 95% of the variation. Conversely, the other red-edge bands, 6 and 7, combined with bands 8 and 8a are less useful with the three vegetation indices for discriminating biomass parameters, notably green leaf area index, in the current forest ecosystem.

In the present study, although the overall conclusion drawn is convincing and agrees well with previous related research, the green leaf area index was derived from the PROBA-V sensor and has potential limitations associated with error propagation. Future research might therefore determine the green leaf area index with a more accurate approach, for example in more controlled experimental plots, to produce better-validated and more conclusive outputs.