Background

Above-ground biomass in forest ecosystems plays an important role in the global carbon cycle and in climate change mitigation by reducing atmospheric CO2 concentration (Alkama and Cescatti 2016; Georgia et al. 2017). Because tropical forests hold 40% of the global terrestrial carbon, their sustainable management is crucial for mitigating climate change and conserving biodiversity (Canadell and Raupach 2008; Mauya et al. 2015; Schuit et al. 2021). Data on forest productivity, total biomass production, growth prediction and ecosystem services valuation are essential for forest management planning and utilization (Zianis and Mencuccini 2004; Soenen et al. 2010). However, data accuracy and collection methods remain serious methodological challenges (Powell et al. 2010). Accurate data on forest biomass are needed for sound management decisions and monitoring, and they are a key requirement of forest carbon accounting for the successful implementation of carbon market mechanisms such as REDD+ (Herold et al. 2011). Techniques that facilitate rapid and accurate forest biomass estimation across spatial and temporal scales help reduce uncertainty in carbon stock assessments and inform strategic forest management plans (Soenen et al. 2010; Mascaro et al. 2011; Pan et al. 2011; Dou and Yang 2018).

Above-ground biomass and carbon stock estimation methods in forest ecosystems have evolved from destructive direct measurements to non-destructive indirect measurements using empirical equations and remotely sensed vegetation attributes (Table 1) (Brown 1993; Vashum and Jayakumar 2012). Each method has its own merits and demerits. The direct harvesting method measures biomass from the oven-dry weight of tree/shrub components (stems, branches, leaves, twigs) in the forest (Brown et al. 1989; Brown 1993; Hughes et al. 1999; MoA 2000). Although such data are the most accurate, the method is laborious, time consuming, expensive and not feasible for large areas. For large and protected forests, allometric equations are suitable and widely applied (Brown 1997; Segura and Kanninen 2005; Navar 2009; Pearson et al. 2005). The allometric method is non-destructive, but it has limited accuracy and is often designed for specific site conditions or species types (Navar 2009; Pearson et al. 2005).

Table 1 Description of merits and demerits of methods for above-ground biomass estimation in forest ecosystems

The other non-destructive and reliable method is application of remote sensing. Since the launch of resource scanning satellites, remote sensing has been increasingly used for land use land cover mapping (Forkuor et al. 2017) and for forest biomass estimation (Steininger 2000; Lu et al. 2014). Satellite sensors measure vegetation parameters that are correlated with biomass such as height, crown size, density, volume, leaf area index and other attributes (Isbaex and Coelho 2020). By combining remote sensing data with field sample measurements and based on the strength of the relationship (Pertille et al. 2019), spatially explicit estimates of forest biomass can be generated for a large area through modeling (McRoberts et al. 2013; Castillo et al. 2017; Chen et al. 2018; Pandit et al. 2018). The coupled method establishes predictive models by selecting best predictor variables that can be applied for mapping and monitoring of forest biomass and carbon at multiple scales (Castillo et al. 2017). Such models are popularly used in forest vegetation studies (Dou and Yang 2018; Chen et al. 2018) and can also be applied for estimating nutrients in herbaceous biomass in rangelands (Ramoelo et al. 2015).

Various remote sensing products from optical sensors and from radio and light detection platforms are used for biomass estimation. However, widespread application of these products is limited by factors such as low accessibility or high cost, low resolution (spatial, spectral and temporal), limited cloud and canopy penetration capacity, and data saturation problems (Lu 2006; Timothy et al. 2016; Chen et al. 2018). For instance, Landsat images are freely accessible and widely used for vegetation classification and biomass estimation (Lyon et al. 1998; Timothy et al. 2015; Georgia et al. 2017), but the data saturation problem causes under-estimation of forest biomass from such images (Lu et al. 2014; Pandit et al. 2018). In multispectral images, vegetation indices can be derived from the reflectance information in the visible, near infrared and shortwave infrared bands (Isbaex and Coelho 2020). Images with broad band widths and low spectral resolution are insensitive to differences in plant characteristics and are less reliable for above-ground biomass estimation in very diverse sub-tropical forests (Mutanga and Skidmore 2004; Pandit et al. 2018). Hence, high resolution data in narrow band widths are very useful for overcoming data saturation and improving reliability and accuracy. The Sentinel-2 platform carries a multispectral instrument (MSI) sensor that yields images with better spectral coverage (e.g., red-edge and shortwave infrared bands), high spatial resolution (10 m, 20 m and 60 m) (Shoko and Mutanga 2017), and increased temporal frequency compared to the Landsat series (Gómez 2017; Pandit et al. 2018; Sun et al. 2019; Isbaex and Coelho 2020).

Sentinel-2 imagery is freely accessible from the European Space Agency (ESA) hub, and it has improved the coupled modeling of biomass from field measured data with vegetation indices, spectral bands and biophysical variables (Zhang et al. 2017; Castillo et al. 2017). The red-edge band in Sentinel-2 images is most suitable for assessing and mapping vegetation characteristics (Ramoelo et al. 2015; Shoko and Mutanga 2017; Pertille et al. 2019). One advantage of Sentinel-2 imagery is its high spatial resolution (up to 10 m), which can be approximated or resampled to the size of the plots in field measured inventory data and thereby improves the accuracy of the model predictions (Isbaex and Coelho 2020). A study by Chrysafis et al. (2017) on the relationship between growing stock volume and Sentinel-2 indices in a Mediterranean forest reported a prediction model with R2 = 0.63 and a root mean square error (RMSE) of 63.11 m3 ha−1. Inclusion of Sentinel-2 texture metrics in the estimation of above-ground biomass in a sub-tropical forest of Nepal yielded a very high model performance with R2 = 0.99 and RMSE of 4.51 Mg ha−1 (Pandit et al. 2018). The general literature on coupled modeling of field measured data with Sentinel-2 indices shows improved and robust accuracy of the estimated biomass, with high goodness of fit and low RMSE (Castillo et al. 2017; Chen et al. 2019).

In Ethiopia, forest biomass has largely been quantified by direct measurement of the oven-dry weight of tree components (MoA 2000). This has gradually developed into plot-based measurement of tree parameters and application of general allometric equations (Yohannes et al. 2015; Siraj 2019). For the lowland woodlands, species-specific equations were developed for some species during the national biomass inventory project (MoA 2000) and serve to quantify above-ground biomass in lowland vegetation. However, above-ground biomass estimation in the dry and moist montane forest ecosystems is done using general allometric equations developed for sites with similar rainfall regimes and forest vegetation types (Pearson et al. 2005; Melese et al. 2014; Yohannes et al. 2015; Dibaba et al. 2019). The allometric methods have limitations in data accuracy and in applicability to inaccessible terrain, and they raise questions about how well they represent the forest ecosystems (Zianis and Mencuccini 2004; Shrestha 2011; Vashum and Jayakumar 2012). The protected forests, biosphere reserves and the last remaining intact Afro-montane forests in the country are located in dissected and inaccessible mountainous terrain, where physical access is limited (Kebede et al. 2013). Proper accounting and reporting of the forest biomass and carbon sequestration in those forests is essential to meet the Nationally Determined Contribution on emission reduction and for the successful implementation of the Reducing Emissions from Deforestation and Forest Degradation (REDD+) program in the country (MEFCC 2016).
Therefore, this study has the following objectives: (i) to investigate the relationship between field measured biomass data and vegetation indices, biophysical variables and spectral bands derived from a Sentinel-2 multispectral image, (ii) to identify the best predictor variables through correlation and regression analysis, (iii) to develop an above-ground biomass prediction model using the best estimator vegetation variables, and (iv) to produce a forest carbon stock map using the developed model. The novelty of this work is the application of Sentinel-2 imagery for estimating biomass in a tropical montane forest in Ethiopia, which adds to knowledge on the methods of forest biomass estimation.

Materials and methods

Description of the study area

The Yayu afro-montane forest is found in the Illubabor Zone, in the southwest of the country, about 550 km from the capital, Addis Ababa. It lies between 8° 4′ 56.05″–8° 24′ 40.46″ N latitude and 35° 44′ 53.85″–36° 5′ 12.23″ E longitude (Fig. 1). A large part of the Yayu afro-montane forest is protected as a Forest Biosphere Reserve. The forest is part of the last remaining intact patches of natural forest in the southwest region. It has multiple economic, social and environmental benefits. It provides non-timber forest products, mainly spices, honey and herbal medicine, to rural communities for their livelihoods. The forest contains one of the largest forest biomass stocks in the country and hence contributes significantly to climate change mitigation. Besides, the Yayu forest is one of the last remaining montane rainforests containing wild Coffea arabica gene pool populations in Ethiopia, and it effectively serves as an in situ conservation site for that gene pool (Gole et al. 2008; Schuit et al. 2021). Coffee provides the largest share of the livelihood of the local communities. The climate is hot and humid tropical, with a mean annual temperature of 25 °C, varying between 12.7 and 26.1 °C. The region receives a high mean annual rainfall of about 2100 mm, with high annual variability ranging from 1400 to 3000 mm (Gole et al. 2008).

Fig. 1 Location map of the study area, Yayu forest, in the South Western region of Ethiopia

The topography is complex, with undulating hills and valleys dissected by several small streams draining into the Geba and Dogi Rivers. The elevation ranges from 1217 m.a.s.l at the valley bottom to 2583 m.a.s.l at the highest point in the watershed (Fig. 2). The valley gorges and the mountains have very steep slopes and are not accessible. The dense and large patches of forest are found in the valleys and on the mountains, which makes it difficult to conduct a ground inventory of the forests.

Fig. 2 Digital elevation model (DEM) map of the Yayu forest showing the highest and lowest elevation points in the study forest

Land use land cover classification

Before conducting the field sample measurement, the land use land cover of the study area was classified using Landsat-8 dry season imagery acquired in February 2018, which was downloaded from the open access Global Land Cover Facility (GLCF) (GIS Resources 2013). Forest land covered about 62% of the total area, the largest share in the landscape, followed by cultivated agricultural land at about 30% (Fig. 3). The rest of the landscape is covered with shrub lands (3%), settlements (2.7%) and wetlands (2.3%) (Fig. 3). Although the forest area is designated as a National Forest Priority Area and the Yayu Biosphere reserve is established within the forest landscape, the local communities are highly dependent on the forest, mainly for harvesting natural coffee, spices and honey. Thus, the Yayu biosphere reserve forest has three functional zones: farmers may harvest non-timber forest products in the transition and buffer zones, while the core zone, which is primarily located in the valleys and mountains, is an access-restricted conservation zone. As shown in Fig. 3, the dark green areas are the dense forests designated as core zones in the inaccessible high altitude steep mountains and in the low altitude river valleys of the Yayu forest. The landscapes in the middle altitude areas are the buffer and transition zones, in which agricultural cultivation is practiced with strict management actions (Gole et al. 2008).

Fig. 3 Land use land cover map of the Yayu forest in the study area, showing the forests distributed in the high altitude parts of the mountain and in the valley gorges

Field sampling and measurement of tree parameters

The forest map of the study area was extracted from the land use land cover map produced using the Landsat-8 image. A total of 20 field sampling plots were randomly drawn from the forest map in the ArcGIS 10.2 platform. The coordinates of the random plots were used as references to locate the plots on the ground within the transitional, buffer and core zones of the Biosphere reserve forest using a hand-held Garmin III GPS. The size of each plot was 20 m × 20 m (400 m2) and the boundaries were delineated using a measuring tape. In each plot, all trees with a diameter of ≥ 5 cm and a height of > 1.3 m were identified, recorded and measured for diameter at breast height (DBH) and total height (H). The DBH was measured using a diameter tape, while height was measured using a Suunto clinometer.

Above-ground biomass and carbon stock estimation from field measured data

There is a well-established relationship between measurable tree parameters and forest stand parameters that are difficult to measure directly (e.g., volume, biomass and density) for natural vegetation types in the tropics (Husch et al. 2003). Based on intensive sampling and analysis, allometric equations are often developed and globally used to estimate forest biomass from tree parameter data (Pearson et al. 2005). The above-ground biomass and carbon stock in the Yayu forest were quantified by selecting a suitable allometric equation that is applicable to the Yayu tropical afro-montane forest. The equation was established for tropical forests and has been widely applied (Chave et al. 2004) to quantify biomass in tropical dry afro-montane forests that have climatic conditions similar to those of the Yayu forest. The equation uses DBH and height measured in the sampling plots, as well as the specific wood density of each species, to convert the wood volume into biomass. The specific wood density, which is the dry mass of a unit volume of fresh wood of a tree, was obtained from a tropical species wood density database (Gisel et al. 1992).

$$\text{AGB} = 0.0673 \times \left( \rho D^{2} H \right)^{0.976}$$
(1)

where AGB is above-ground biomass (g), ρ is specific wood density (g cm−3), D is diameter at breast height (DBH) (cm), and H is total height of a tree (m).
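As an illustration of how Eq. 1 can be applied to plot data, the following minimal sketch evaluates the equation tree by tree and sums the result per plot. The function names and the example tree values are hypothetical and are not taken from the field data of this study; the output is in whatever unit Eq. 1 yields for the stated inputs.

```python
def tree_agb(wood_density, dbh_cm, height_m):
    """Eq. 1 for a single tree: AGB = 0.0673 * (rho * D^2 * H) ** 0.976."""
    if min(wood_density, dbh_cm, height_m) <= 0:
        raise ValueError("wood density, DBH and height must be positive")
    return 0.0673 * (wood_density * dbh_cm ** 2 * height_m) ** 0.976


def plot_agb(trees):
    """Sum Eq. 1 over all (wood_density, dbh_cm, height_m) tuples of one plot."""
    return sum(tree_agb(rho, dbh, height) for rho, dbh, height in trees)


# Hypothetical trees, for illustration only (not measurements from this study).
example_plot = [(0.60, 25.0, 18.0), (0.55, 12.0, 9.0), (0.70, 40.0, 27.0)]
example_total = plot_agb(example_plot)
```

Because the exponent is close to 1, the estimate scales almost linearly with ρD²H, so large-diameter canopy trees dominate the plot total.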

The above-ground biomass was converted into carbon equivalent using the biomass conversion factor or carbon fraction value of 0.47 (IPCC 2006).

$${\text{C}} = {\text{AGB}} \times {\text{CF}}$$
(2)

where, C is carbon stock (g), and CF is carbon fraction of above-ground biomass.
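Eq. 2 is a single multiplication; a minimal sketch is given here only to make the default carbon fraction explicit. The function name is hypothetical, and the only value taken from the text is CF = 0.47.

```python
CARBON_FRACTION = 0.47  # default carbon fraction cited in the text (IPCC 2006)


def carbon_stock(agb, carbon_fraction=CARBON_FRACTION):
    """Eq. 2: C = AGB * CF, returned in the same mass unit as the AGB input."""
    if agb < 0:
        raise ValueError("AGB must be non-negative")
    if not 0 < carbon_fraction <= 1:
        raise ValueError("carbon fraction must lie in (0, 1]")
    return agb * carbon_fraction
```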

Sentinel-2 image pre-processing

The Sentinel-2 satellite imagery, taken in the dry season of February 2018, was downloaded from the open access European Space Agency hub (ESA 2019). The raw image was pre-processed using the Sentinel Application Platform (SNAP) and Quantum GIS (QGIS) software. The transformation from radiance to surface reflectance was done by applying the Dark Object Subtraction (DOS) method using the semi-automatic classification plugin in QGIS. The DOS method subtracts from each band the value of its darkest pixel, which is assumed to result from atmospheric scattering (Chavez 1988). A radiometric correction was done to reduce atmospheric and sun angle effects (Baillarin et al. 2012) and to produce a Level-2A product with bottom-of-atmosphere reflectance. The Sentinel-2 Multispectral Instrument (MSI) image, with a swath width of 290 km, was orthorectified to the UTM Zone 37N projection and combined with the Shuttle Radar Topography Mission (SRTM) 3 arc-second digital elevation model to improve the accuracy. The high resolution 10 m bands (blue, green, red and near infrared) were then resampled to a 20 m pixel size using ArcGIS software, so as to correspond with the 20 m × 20 m vegetation sampling plots of the field measurement. The pre-processed and resampled Sentinel-2 image was then used for deriving vegetation indices (VIs) and biophysical variables (BPVs) from combinations of the multi-spectral bands (Fig. 4).
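The two array operations described above can be sketched in a few lines. This is a simplified illustration, not the procedure implemented in the QGIS plugin or ArcGIS: the dark-object step here only subtracts the band minimum, and the resampling is assumed to be a 2 × 2 block average, since the text does not state which resampling method was used. Bands are represented as plain nested lists to keep the sketch dependency-free.

```python
def dark_object_subtraction(band):
    """Subtract the darkest pixel value of a band from every pixel (simplified DOS)."""
    dark_value = min(min(row) for row in band)
    return [[value - dark_value for value in row] for row in band]


def aggregate_2x2(band):
    """Resample a 10 m grid to 20 m by averaging non-overlapping 2 x 2 blocks."""
    rows, cols = len(band), len(band[0])
    if rows % 2 or cols % 2:
        raise ValueError("grid dimensions must be even for 2 x 2 aggregation")
    return [
        [
            (band[r][c] + band[r][c + 1] + band[r + 1][c] + band[r + 1][c + 1]) / 4
            for c in range(0, cols, 2)
        ]
        for r in range(0, rows, 2)
    ]


# Hypothetical 4 x 4 band at 10 m resolution, for illustration only.
band_10m = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]]
band_20m = aggregate_2x2(dark_object_subtraction(band_10m))
```

Block averaging keeps each 20 m pixel aligned with four whole 10 m pixels, which is one reasonable way to match the 20 m × 20 m plot footprint.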

Fig. 4 Methodological flow chart showing the data sources, analytical procedures and final outputs

Vegetation indices (VI) extraction and mapping

In remotely sensed data, a vegetation index is a spectral transformation of two or more bands designed to enhance the contribution of vegetation properties and allow reliable spatial and temporal inter-comparisons of terrestrial photosynthetic activity and canopy structural variations (Huete et al. 2000; Castillo et al. 2017; Taddesse et al. 2020). Vegetation indices extracted from satellite data have emerged as important tools in monitoring, mapping and managing terrestrial vegetation, as the indices provide radiometric measurements of the quantity, structure and condition of vegetation and serve as useful indicators of seasonal and inter-annual variations (Isbaex and Coelho 2020).

There are many VIs with similar functionality and most of them use the inverse relationship between red and near-infrared reflectance associated with healthy green vegetation (Pertille et al. 2019). According to Bannari et al. (1995), VIs are normally classified based on a range of attributes such as the number of spectral bands (2 or greater than 2); the method of calculations (ratio or orthogonal), depending on the required objective; and the historical development (as first generation VIs or second generation VIs). In order to compare the effectiveness of different VIs, Lyon et al. (1998) classified seven types of VIs based on their computational methods (Subtraction, Division or Rational Transform). The selection of pertinent VIs improved with the advancement in hyper-spectral remote sensing technology and currently high resolution reflectance spectrums are available for multispectral VIs analysis (Isbaex and Coelho 2020).

The five vegetation indices in Table 2 were selected based on their performance in biomass estimation in previous studies using Sentinel-2 images (Castillo et al. 2017; Pandit et al. 2018; Chen et al. 2018). The VIs were computed and mapped from the resampled multispectral bands and their combinations in ArcGIS and QGIS platforms based on the spectral values (Fig. 5). Bands 1, 9 and 10, with a 60 m pixel size, were excluded from the computation because of their low resolution.
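For readers who want to reproduce index maps of this kind, the sketch below computes three of the indices named in this study (NDVI, NDVI45 and IRECI) from per-pixel reflectance. The band assignments follow the commonly used Sentinel-2 definitions (B4 red, B5 to B7 red-edge, B8 near infrared) and are an assumption here, since the exact formulas used by the authors are given in Table 2 rather than in the text.

```python
NAN = float("nan")


def _normalized_difference(a, b):
    """(a - b) / (a + b), or NaN when the denominator is zero."""
    return (a - b) / (a + b) if (a + b) != 0 else NAN


def ndvi(b8, b4):
    """Normalized Difference Vegetation Index from NIR (B8) and red (B4)."""
    return _normalized_difference(b8, b4)


def ndvi45(b5, b4):
    """NDVI variant using the first red-edge band (B5) and red (B4)."""
    return _normalized_difference(b5, b4)


def ireci(b4, b5, b6, b7):
    """Inverted Red-Edge Chlorophyll Index: (B7 - B4) / (B5 / B6)."""
    if b5 == 0 or b6 == 0:
        return NAN
    return (b7 - b4) / (b5 / b6)


# Hypothetical reflectance values for one pixel, for illustration only.
pixel = {"b4": 0.05, "b5": 0.10, "b6": 0.30, "b7": 0.40, "b8": 0.45}
indices = {
    "NDVI": ndvi(pixel["b8"], pixel["b4"]),
    "NDVI45": ndvi45(pixel["b5"], pixel["b4"]),
    "IRECI": ireci(pixel["b4"], pixel["b5"], pixel["b6"], pixel["b7"]),
}
```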

Table 2 List of selected multispectral bands, derived VIs and BPVs from Sentinel-2 image
Fig. 5 Maps of the vegetation indices showing the highest and lowest spectral values computed from combination of bands indicating distribution of green vegetation

Biophysical variables (BPVs) extraction and mapping

Vegetation surface biophysical or canopy properties provide an understanding of the physics of the interactions between solar radiation and vegetation elements (Asrar et al. 1989; Dou and Yang 2018). Vegetation surface parameters relate vegetation attributes to their spectral signature or reflectance value, thereby providing reasonable estimates of vegetation properties across spectral, spatial and temporal scales (Asrar et al. 1989). The vegetation attributes include leaf area index (LAI), green leaf area index (GLAI), percent or fractional green cover, chlorophyll content, green biomass and fraction of absorbed photosynthetically active radiation (FAPAR). According to Widlowski et al. (2004), biophysical variables describe the spatial distribution of vegetation state and dynamics and are thus useful for biomass estimation. The four biophysical variables (Table 2) were selected based on their performance for biomass estimation in other studies (Castillo et al. 2017; Chen et al. 2018). The biophysical variables were computed in the SNAP toolbox using the Biophysical variable processor (SNAP 2016). The attribute maps were produced in ArcGIS and QGIS software based on the surface reflectance values of each attribute (Fig. 6).

Fig. 6 Maps of the vegetation biophysical variables showing the highest and lowest surface reflectance values of the vegetation canopy and its spatial distribution

Extraction of the pixel values of predictor variables

The pixel values for each variable derived from the Sentinel-2 image were extracted using zonal statistics in ArcGIS. The geographical locations (latitude and longitude) of the field plots were used as references to match the pixels, as shown in Fig. 7. The extracted pixel values for each predictor variable are shown in Tables 3 and 4. The values were exported in CSV (comma-separated values) format for correlation and regression analysis in SPSS software.
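The matching of plot locations to pixels can be sketched as a coordinate-to-index lookup. This is a simplified point-sampling illustration under stated assumptions, not the ArcGIS zonal statistics tool: the raster is assumed to be north-up, in projected coordinates, with a known upper-left corner, and all names and coordinates are hypothetical.

```python
def pixel_index(x, y, x_origin, y_origin, pixel_size):
    """Map a projected coordinate to (row, col) in a north-up raster.

    x_origin and y_origin are the coordinates of the upper-left raster corner.
    """
    col = int((x - x_origin) // pixel_size)
    row = int((y_origin - y) // pixel_size)
    return row, col


def sample_raster(raster, points, x_origin, y_origin, pixel_size=20.0):
    """Return the raster value under each (x, y) point, or None if outside."""
    values = []
    for x, y in points:
        row, col = pixel_index(x, y, x_origin, y_origin, pixel_size)
        if 0 <= row < len(raster) and 0 <= col < len(raster[0]):
            values.append(raster[row][col])
        else:
            values.append(None)
    return values


# Hypothetical 3 x 3 raster of 20 m pixels and plot centres, for illustration only.
raster = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
plots = [(1010.0, 2050.0), (1045.0, 2005.0), (990.0, 2050.0)]
sampled = sample_raster(raster, plots, x_origin=1000.0, y_origin=2060.0)
```

Running the same lookup over every predictor raster yields one row of predictor values per field plot, which is the layout needed for the subsequent correlation and regression analysis.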

Fig. 7 Pixel values extraction by overlaying field samples plots on Sentinel-2 image pixels in ArcGIS platform

Table 3 Pixel values extracted from selected bands of Sentinel-2 image and AGB measured from plots
Table 4 Pixel values of vegetation indices extracted from Sentinel-2 image and AGB measured from plots

Data analysis

The forest biomass data measured from the field plots and the extracted values for the predictor variables from the Sentinel-2 images were organized into a spreadsheet with a CSV format. A pairwise Pearson’s product-moment correlation test was conducted to determine the correlation of the observed above-ground biomass and the predictor variables from the Sentinel-2 image using SPSS v.20 software. The predictor variables that showed significant correlation with the measured biomass data were selected for regression analysis. A regression test was iteratively performed between the measured biomass and the significantly correlated predictor variables in order to develop a regression model for biomass prediction.
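The screening step described above can be sketched as follows. This is an illustrative re-implementation in plain Python rather than the SPSS procedure used in the study; the function names are hypothetical, and the significance decision is left to the reader, who would compare the t statistic with the critical value for n − 2 degrees of freedom.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("inputs must have the same length of at least 2")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / sqrt(sxx * syy)


def t_statistic(r, n):
    """t statistic for testing r against zero with n - 2 degrees of freedom."""
    return r * sqrt((n - 2) / (1 - r ** 2))


def rank_predictors(agb, predictors):
    """Return (name, r) pairs sorted by decreasing absolute correlation with AGB."""
    scores = {name: pearson_r(values, agb) for name, values in predictors.items()}
    return sorted(scores.items(), key=lambda item: abs(item[1]), reverse=True)
```

For example, `rank_predictors(agb, {"LAI": lai_values, "NDVI": ndvi_values})` would list the predictors from the most to the least strongly correlated, where `lai_values` and `ndvi_values` stand for the per-plot columns of the exported CSV.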

The prediction model was then evaluated based on the magnitude of the Root Mean Square Error (RMSE) and the value of the coefficient of determination (r2). The r2 was preferred because it is a standardized measure ranging from 0 to 1 and shows the proportion of the variability explained by the model (Husch et al. 2003). This helps to understand the relationship between the independent variables (indices) and the dependent variable (biomass) (Peters 2007). The best prediction model was developed by selecting the variables with high r2 and low RMSE values. The equation developed from the regression model was then applied to estimate the AGB from the vegetation indices. The significance of the model was assessed from the P-value at α = 0.05.

$$\mathrm{RMSE}=\sqrt{\frac{\sum_{i=1}^{n}\left(AGBo_{i}-AGBp_{i}\right)^{2}}{n}}$$
(3)

where AGBo is observed AGB value, AGBp is predicted AGB value, and n is number of samples
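A minimal sketch of the two evaluation metrics is given below, using the standard squared-error definition of RMSE. The coefficient of determination is computed as 1 − SS_res/SS_tot, which for an ordinary least-squares model with an intercept, evaluated on its own fitting data, equals the squared correlation between observed and predicted values; the function names are hypothetical.

```python
from math import sqrt


def rmse(observed, predicted):
    """Root mean square error between observed and predicted AGB values."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return sqrt(sum(squared) / len(squared))


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    if ss_tot == 0:
        raise ValueError("r2 is undefined when all observed values are equal")
    return 1 - ss_res / ss_tot
```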

Results

Above-ground biomass from field measurements

The highest amount of AGB was recorded from the sample plots located in the core zone of the Yayu forest biosphere reserve (Table 5). The core zone is access-restricted and far from the sources of disturbances. Large sized canopy trees are present in this part of the forest compared to the buffer and transition zones (Annex 1). The field measured tree data from the respective plots shows that the number and size category of trees is relatively high compared to those recorded from the buffer and transitions zones (Gole et al. 2008; Schuit et al. 2021). The least amount of the AGB was recorded in the plots located in the disturbed and semi-disturbed forest, which is the buffer and transition zones of the biosphere reserve (Table 5) (Gole et al. 2008). This is perhaps directly linked to the degree of human impact on the forest since the core zone is protected while the buffer and transition zones are open for community access that might lead to selective removal of mature trees from the forest (Schuit et al. 2021).

Table 5 Forest above-ground biomass estimated from field sample plot measurements

Correlation between AGB and the predictor variables from the Sentinel-2 image

The correlation analysis between the measured above-ground biomass and the predictor variables extracted from the Sentinel-2 image showed positive correlations between the observed AGB and most of the vegetation variables, with correlation coefficient (r) values ranging from 0.36 to 0.74. Among the predictor variables, NDVI (r = 0.36), IRECI (r = 0.5), NDVI45 (r = 0.40), LAI (r = 0.74), FAPAR (r = 0.7), FCOVER (r = 0.64) and Cab (r = 0.69) were correlated with the AGB (Table 6). IRECI among the vegetation indices and LAI among the biophysical variables were best correlated with the observed AGB. Overall, the biophysical variables were the most strongly correlated with the above-ground biomass (r = 0.65–0.74).

Table 6 Correlation between field measured above-ground biomass and selected vegetation index predictor variables

Relationship between measured above-ground biomass and derived indices

The linear regression analysis revealed a positive linear relationship between forest above-ground biomass and the spectral vegetation indices extracted from the Sentinel-2 satellite image (Fig. 8; r2 = 0.017–0.27). Similarly, the forest above-ground biomass showed a strong linear relationship with the biophysical variables derived from the satellite image (Fig. 9; r2 = 0.42–0.54). Among the MSI bands, Band 4 performed better than the other Sentinel-2 bands (r = −0.44 and r2 = 0.2) and was therefore selected for developing the AGB prediction regression model. The best predictor variables for the biomass prediction model were selected based on the strength of the relationship between the indices and the measured above-ground biomass.

Fig. 8 Linear relationship between observed AGB and spectral vegetation indices extracted from Sentinel-2 MSI

Fig. 9 Linear relationship between observed AGB and vegetation biophysical variables extracted from Sentinel-2 MSI

Modeling AGB prediction from vegetation indices

From the regression analysis, the variables with high values of coefficient of determination were selected for the above-ground biomass prediction. The variables with low values of coefficient of determination and those showing multi-collinearity were excluded from the model. Only five variables were selected to develop the model and the remaining were excluded because of very low values of coefficient of determination and presence of multi-collinearity (Table 7). Predictor variables with multi-collinearity can cause high variance in regression analysis and should be excluded from the modeling (Chen et al. 2018). As a result, LAI, FCOVER and FAPAR from the biophysical variables, IRECI from the vegetation indices and Band 4 from the MSI bands were selected for the model development (Table 7). The results show that the biophysical variables are better suited for developing forest biomass prediction model compared to other types of vegetation indices (Table 7).

Table 7 Linear regression result of the observed AGB with selected predictor variables

Based on the above regression results, the biomass prediction equation (Eq. 4) was developed to estimate the above-ground biomass of the forest vegetation using variables extracted from the Sentinel-2 image.

$$\mathrm{AGB} = \left( 20.176 \times \mathrm{B4} \right) + \left( 6.633 \times \mathrm{FCOVER} \right) - \left( 6.180 \times \mathrm{FAPAR} \right) + \left( 13.452 \times \mathrm{LAI} \right) - \left( 6.307 \times \mathrm{IRECI} \right) - 2.282$$
(4)

where B4 is Band 4, LAI is Leaf area index, IRECI is Inverted Red-Edge Chlorophyll Index, FCOVER is Fraction of vegetation cover, FAPAR is Fraction of Absorbed Photo-synthetically Active Radiation. These indices can be derived from any Sentinel-2 image and can be used to predict forest above-ground biomass using the prediction equation in a tropical afro-montane forest. The prediction model was validated using the measured or observed values of above-ground biomass from the field (Table 8). The measure of the goodness of fit between the observed and predicted values showed a strong linear relationship with a coefficient of determination of r2 = 0.73 (Fig. 10).
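Applying Eq. 4 to a pixel amounts to a weighted sum of the five predictors plus the intercept. The sketch below encodes the coefficients exactly as printed in Eq. 4; the function name and the dictionary layout are hypothetical, and the inputs are assumed to be scaled in the same way as the extracted pixel values in Tables 3 and 4, so the output unit follows the field measured AGB used to fit the model.

```python
# Coefficients as printed in Eq. 4 (signs included).
COEFFICIENTS = {
    "B4": 20.176,
    "FCOVER": 6.633,
    "FAPAR": -6.180,
    "LAI": 13.452,
    "IRECI": -6.307,
}
INTERCEPT = -2.282


def predict_agb(pixel):
    """Evaluate Eq. 4 for one pixel given a mapping of predictor name to value."""
    missing = sorted(set(COEFFICIENTS) - set(pixel))
    if missing:
        raise KeyError(f"missing predictors: {missing}")
    return INTERCEPT + sum(coef * pixel[name] for name, coef in COEFFICIENTS.items())
```

The same expression can be entered in a raster calculator, with each predictor name replaced by the corresponding raster layer.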

Table 8 Values of the observed and predicted above-ground biomass
Fig. 10 Scatter plot showing goodness of fit between the observed and predicted values of the above-ground biomass

Discussion

The significance of the coupled modeling approach for biomass estimation relates to the field measured data from representative samples (Zhao et al. 2021). Although the number of sample plots was small, they represented the different layers of vegetation in the Yayu biosphere reserve (Annex 1). The plot measurement results corresponded with the forest biosphere strata of the Yayu forest, depicting the forest canopy structure in the different management zones of the forest. The core zones have high canopy trees with relatively dense standing stock (Gole et al. 2008). The biomass measured in the plots located in the buffer zone was lower than that measured in the plots located in the core zone. Unlike the buffer and transition zones, the core zone is protected for biodiversity reserve and conservation (Gole et al. 2008; Schuit et al. 2021). The core zone is inaccessible, and the size and density of trees are relatively high compared to the transition zones. In contrast, the buffer and transition zones are freely accessible for agricultural production and are managed for coffee cultivation (Schuit et al. 2021). Trees are sparse there and hence biomass density is low. The structural stratification and management zonation of the vegetation corresponded with the surface reflectance characterization of the forest in the Sentinel-2 image, which is well illustrated in the vegetation indices maps (Figs. 5 and 6). This has positively contributed to signature sampling and classification accuracy.

The average measured biomass per unit area agrees with results reported in earlier studies for similar afro-montane forest types in the Central highlands of Ethiopia (Yohannes et al. 2015; Dibaba et al. 2019; Eshetu and Hailu 2020). These studies used similar coupled methods of sample plot measurements and allometric equations for the biomass estimation. Among the different vegetation indices derived from the Sentinel-2 image, only IRECI showed a strong correlation with the measured biomass values. This is because IRECI uses the red-edge bands (Band 6 and Band 7) that are well related to healthy vegetation biomass (Castillo et al. 2017; Isbaex and Coelho 2020). In contrast, most of the vegetation biophysical variables were strongly correlated with the measured biomass. This suggests that vegetation biophysical variables or surface reflectance attributes are better predictor variables for above-ground biomass estimation than the other vegetation indices and multispectral bands (Taddesse et al. 2020). A study by Chen et al. (2018) revealed that the vegetation biophysical variables of Sentinel-2 were the most relevant and important predictors for explaining the observed variability of AGB. This is possibly because the Sentinel-2 product has comparatively large spatial coverage and high resolution, which allow it to estimate biomass more efficiently than other open-source sensor data products (Juniansah et al. 2018). LAI was found to be the single most important predictor, showing a strong correlation with measured biomass. A similar observation has been reported by Castillo et al. (2017). LAI is known to be a very good indicator of vegetation density and can accurately describe the state of vegetation compared to other canopy variables (Dusseux et al. 2015; Castillo et al. 2017).
Variables with a high coefficient of determination (i.e., most of the vegetation biophysical indices) and without multi-collinearity (e.g., LAI, FCOVER, FAPAR, IRECI and Band 4) were found to be suitable predictors of biomass in the regression model. This is consistent with a study by Pandit et al. (2018), who reported that Sentinel-2 images yielded reliable estimates of forest biomass and carbon stock using prediction algorithms, after observing a strong relationship between selected vegetation biophysical variables and measured forest biomass with a high goodness of fit (R2 = 0.81). This also agrees with the goodness of fit between the predicted and observed values in this study (Fig. 10).
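
The screening of predictors described above can be illustrated with a minimal sketch. It assumes that plot-level predictor values and measured biomass are available as NumPy arrays, and it uses the variance inflation factor (VIF) as the multi-collinearity diagnostic. The study does not state which diagnostic was applied, so the choice of VIF and the function names below are assumptions made for illustration only.

```python
import numpy as np


def r_squared(x, y):
    """Coefficient of determination of a simple linear fit y ~ a + b*x."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    slope, intercept = np.polyfit(x, y, 1)
    ss_res = np.sum((y - (slope * x + intercept)) ** 2)
    ss_tot = np.sum((y - y.mean()) ** 2)
    if ss_tot == 0.0:
        raise ValueError("y has no variance; R^2 is undefined")
    return 1.0 - ss_res / ss_tot


def variance_inflation_factors(X):
    """VIF of each column of X (rows = plots, columns = predictors).

    Each predictor is regressed on all the others (plus an intercept);
    VIF_j = 1 / (1 - R_j^2). Values near 1 indicate little collinearity.
    """
    X = np.asarray(X, dtype=float)
    n_rows, n_cols = X.shape
    vifs = []
    for j in range(n_cols):
        target = X[:, j]
        others = np.delete(X, j, axis=1)
        design = np.column_stack([np.ones(n_rows), others])
        coef, *_ = np.linalg.lstsq(design, target, rcond=None)
        ss_res = np.sum((target - design @ coef) ** 2)
        ss_tot = np.sum((target - target.mean()) ** 2)
        r2 = 1.0 - ss_res / ss_tot
        vifs.append(np.inf if r2 >= 1.0 else 1.0 / (1.0 - r2))
    return np.array(vifs)
```

In such a workflow, a candidate variable (for example LAI) would be retained when `r_squared` against measured biomass is high and its VIF stays below a chosen threshold. The threshold itself is a modeling decision that is not specified in this study.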

Using the raster calculator of ArcGIS and the forest layer thematic map, the above-ground biomass was mapped by applying the prediction model (Fig. 11). The result corresponded very well with the structural zones of the biosphere reserve (Gole et al. 2008). High AGB values, in the range of 6 to 10 t per pixel (150 to 250 t ha−1), occur in areas closer to the core conservation zone of the forest biosphere reserve. In the transition and buffer zones of the forest, where access roads are available and agricultural activities are permitted, biomass is below 6 t per pixel (Fig. 11). Within the core zone, predicted biomass above 10 t per pixel (250 t ha−1) was recorded in scattered pockets of the forest reserve. These spots are located in the steepest and most inaccessible parts of the forest, where anthropogenic activities are restricted and minimal.
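
The per-pixel and per-hectare figures above are related by the pixel area. The sketch below shows the conversion, assuming a 20 m × 20 m pixel (400 m², i.e. 25 pixels per hectare). This pixel size is inferred from the reported ratio of 6 t per pixel to 150 t ha−1 rather than stated in this section, and the function name is hypothetical.

```python
SQUARE_METRES_PER_HECTARE = 10_000.0


def per_pixel_to_per_hectare(value_per_pixel, pixel_size_m=20.0):
    """Convert a per-pixel quantity (e.g. t per pixel) to a per-hectare one.

    pixel_size_m is the side length of a square pixel in metres; the
    20 m default is an assumption inferred from the reported figures.
    """
    if pixel_size_m <= 0:
        raise ValueError("pixel_size_m must be positive")
    pixels_per_hectare = SQUARE_METRES_PER_HECTARE / pixel_size_m ** 2
    return value_per_pixel * pixels_per_hectare


print(per_pixel_to_per_hectare(6.0))   # 150.0
print(per_pixel_to_per_hectare(10.0))  # 250.0
```

The function also works element-wise on a NumPy array holding a whole biomass raster, which mirrors what a raster calculator expression would do.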

Fig. 11 Map of the predicted values of the above-ground biomass of Yayu forest biosphere reserve

The above-ground forest biomass was converted to its carbon equivalent using a carbon conversion factor (default value CF = 0.47) and mapped using the raster calculator in ArcGIS. The carbon stock map resembles the biomass distribution map: the inaccessible areas had the highest carbon stock, 7.05 t per pixel (176.25 t ha−1), whereas the lowest carbon stock, 2.82 t per pixel (70.5 t ha−1), was recorded in the accessible transition zone of the biosphere reserve forest (Fig. 12).
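
The biomass-to-carbon conversion is a single multiplication, sketched below with the CF = 0.47 default given above. The function name is hypothetical. The biomass inputs of 15 and 6 t per pixel are back-calculated from the reported carbon values (7.05 / 0.47 and 2.82 / 0.47); they are not figures quoted directly in the text.

```python
import math

CARBON_FRACTION = 0.47  # default conversion factor used in this study


def biomass_to_carbon(agb, carbon_fraction=CARBON_FRACTION):
    """Convert above-ground biomass to carbon stock in the same units."""
    if not 0.0 < carbon_fraction <= 1.0:
        raise ValueError("carbon_fraction must be in (0, 1]")
    return agb * carbon_fraction


# Back-calculated biomass values (t per pixel) reproduce the reported
# carbon stocks of 7.05 and 2.82 t per pixel up to floating-point rounding.
assert math.isclose(biomass_to_carbon(15.0), 7.05)
assert math.isclose(biomass_to_carbon(6.0), 2.82)
```

Applied to the per-hectare biomass equivalents (375 and 150 t ha−1 under a 25-pixels-per-hectare assumption), the same factor yields the reported 176.25 and 70.5 t ha−1.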

Fig. 12 Map of the predicted carbon stock in the biosphere reserve of Yayu forest

An important finding of this study is that the biophysical variables derived from the Sentinel-2 images were the most important predictors of forest biomass. These variables are directly related to the photosynthetic activity of the plants and to canopy structure, which determine the primary productivity of forest ecosystems (Taddesse et al. 2020). The predicted carbon stock maps corresponded well with the values recorded from the field measurements. The Sentinel-2 image, with its high spectral and spatial resolution, was found to be very suitable for biomass estimation in tropical high forest areas.

Conclusion

Forest biomass and carbon stock estimation using remotely sensed data is becoming more reliable due to improvements in the spectral and spatial resolution of products from different sensors. Sentinel-2 optical data are increasingly applied for estimating above-ground biomass and other vegetation attributes. This study tested the potential of Sentinel-2 derived vegetation indices for modeling above-ground biomass in a tropical afro-montane forest. The results showed that biophysical variables (canopy surface reflectance values) are more strongly correlated with forest above-ground biomass than the multispectral bands and band-combination vegetation indices. Multispectral Band 4 (red band), the biophysical variables LAI, FAPAR and FCOVER, and the vegetation index IRECI showed a high coefficient of determination with a low margin of error. These variables are good predictors for modeling forest above-ground biomass using Sentinel-2 imagery. The results demonstrated that Sentinel-2 imagery is highly suitable for above-ground biomass estimation. From the low margin of error, we conclude that Sentinel-2 imagery improves the reliability of biomass estimation from remotely sensed data. Since the results in this study were obtained from a limited number of field sampling plots, their accuracy will likely improve with a larger number of field sample measurements.