Introduction

In most African nations, including Ethiopia, traditional yield estimation methods are expensive, time-consuming, and prone to significant errors (FAO 2016; Burke and Lobell 2017). Additionally, national crop production forecasts become available only months after harvest, making it challenging to decide quickly whether to import or export in order to ensure food security and economic growth. As a result, it is difficult to identify food shortages in time, and the government’s expected crop yields differ from actual production (Haile 2005).

Therefore, accurate information regarding Ethiopia’s primary crops, such as maize, teff, and finger millet, is essential for crop husbandry and yield prediction (Sakamoto et al. 2013). It is also essential for identifying factors that limit agricultural productivity and creating location-specific management methods, such as raising fertilizer rates, increasing irrigation, and scheduling harvests, which will boost crop production (Chivasa et al. 2021; Kayad et al. 2021). Accurate and timely information regarding the kinds of crops growing, their area coverage, and their production is particularly important for Ethiopia, whose economy is based primarily on agriculture. Furthermore, early yield predictions are required by decision-makers, scientific groups, and crop insurance agencies (Jaafar and Mourad 2021).

Remote sensors (for example, Sentinel and crop canopy analyzers) have been widely used for frequent long-term crop growth monitoring (Flynn et al. 2020) and estimating crop yield (Hunt et al. 2019). Compared to conventional survey methods, these time- and resource-efficient techniques are more accurate and support decisions for agricultural output (Katsoulas et al. 2016). Furthermore, for an agriculture-based economy like Ethiopia, regular and accurate information about crops planted, area coverage, and productivity is essential. In order to ensure that the crop is harvested at the proper time, to identify any biotic and abiotic stresses in the crop, and to establish soil and agricultural management zones across the nation, crop yield prediction using remote sensing techniques is crucial (Gianquinto et al. 2011). As a result, it is crucial to use remote sensing data to estimate the area suitable for growing crops as well as to collect primary data to determine yield (Doraiswamy et al. 2004).

Furthermore, remote sensors, including satellites and canopy analyzers, can provide essential information about crop growth and yield at low cost (Burke and Lobell 2017). Crop production parameters are also important indicators for determining seasonal ecosystem and environmental impacts (Marshall and Thenkabail 2015). Unfortunately, a lack of high-resolution data is a recurrent issue for Earth observation research that attempts to map yield. The Sentinel-2 satellite and crop canopy analyzer provide data that could support the prediction of crop leaf area index (LAI), leaf water content, biomass, and chlorophyll content (Chemura et al. 2018). Researchers use a range of satellite- and crop canopy analyzer-based vegetation indices (VIs) and LAI to calculate these and other crop parameters, which have become effective tools for mapping crops and estimating crop biomass and grain yield at low cost (Jin et al. 2019; Wolanin et al. 2019).

The European Space Agency offers Sentinel-2 data with a spatial resolution of 10 to 60 m and a temporal resolution of 5 days (Copernicus Open Access Hub 2020). Sentinel-2 data provide an opportunity for frequent long-term crop growth monitoring (Wolanin et al. 2019; Flynn et al. 2020). The satellite played a key role in classifying land cover (Belgiu and Csillik 2018; Gašparović and Jogun 2018), estimating crop yield (Gómez et al. 2019; Hunt et al. 2019), and predicting biomass (Campos et al. 2019). Crop monitoring using Sentinel-2 and crop canopy-derived data showed that wheat growth is strongly associated with grain yield (Du and Noguchi 2017).

The development of high-resolution, readily accessible Sentinel-2 data has not yet improved yield calculation methodologies in Ethiopia, where food production still only reaches subsistence farming levels. According to Lebourgeois et al. (2017), the unique spectral bands in Sentinel-2 imagery can capture differences in crop growth phases, crop patterns, and crop types. To the best of the authors’ knowledge, agricultural production estimates in Ethiopia have not been made using high-resolution remote sensing data such as Sentinel-2 and crop canopy analyzer data.

Furthermore, there has not been much investigation into the dynamics of vegetation growth and crop productivity in the nation. It is also essential to evaluate yield models before directly applying them to all of Ethiopia’s different topographies (Tsegaye and Abeje 2018). Little attention has been paid to the effect of variability in important phenological stages on agricultural productivity (Gummadi et al. 2018). In addition, there is little information on crop yield estimation under various soil management practices and slope gradients in the studied sites.

For several crops, empirical linear regression models have been employed to produce trustworthy yield estimates (Jin et al. 2017; Nazir et al. 2021; Abebe et al. 2022; Elders et al. 2022; Eze et al. 2022; Yunus and Polat 2023). Additionally, employing VIs produced from multiple data sources rather than a single image source was found to increase the accuracy of crop monitoring and yield estimation (Skakun et al. 2021). In the present study, catchment-scale predictions of agricultural yields were made from VIs using linear regression models.

We developed a model based on Sentinel-2 and crop canopy analyzer-derived data to evaluate the impacts of bund construction on biomass and grain yield of the major crops (teff and finger millet) using data from 2020 and 2021 in the Aba Gerima catchment. We hypothesized that of the many VIs and LAI available from the satellite dataset and crop canopy analyzer, one index or a combination of indices would provide acceptably accurate estimates of crop growth and yield. It would also be desirable to replace conventional crop surveys with more precise and affordable techniques, such as those based on remote sensing. To date, within-field crop yield estimates have not been made using LAI obtained from radiative transfer models, and the possibility of calculating crop output variations within a field has not yet been adequately investigated. LAI has demonstrated improvements in grain yield estimation as compared to VIs (Gilardelli et al. 2019). The effects of soil bunds on the spatial variability of teff and finger millet yields were investigated using Sentinel-2 and crop canopy analyzer LAI-derived data in the Aba Gerima catchment.

Materials and methods

Description of the research area

The part of the Aba Gerima catchment that we studied is located between 11° 39′ 0″ N–11° 40′ 30″ N and 37° 29′ 30″ E–37° 31′ 30″ E in Ethiopia’s Amhara Region, a tropical highland in the Blue Nile Basin (Fig. 1; WGS 84 / UTM zone 37 N). Figure 1 also shows the topography (elevation) and distribution of the sample sites. By the Köppen–Geiger classification, the site lies under the humid tropical climatic region (Peel et al. 2007), with altitudes ranging from 1914 to 2121 m above sea level. The altitude map was developed from a high-resolution (0.5 × 0.5 m) digital elevation model (DEM) using ArcGIS software version 10.5 (Fig. 1b). The DEM was purchased from the Advanced Land Observing Satellite-2 (ALOS-2) operated by the Japan Aerospace Exploration Agency.

Fig. 1

Location of study catchment and crop sampling plots: (a) the Blue Nile Basin and its location within Ethiopia, (b) Aba Gerima topography, (c) Aba Gerima slope, (d) locations and land use for the sample sites. (“Cloud” represents pixels for which data were unavailable due to cloud contamination)

According to records from 1994 to 2021 at nearby meteorological stations, the study area receives a mean annual rainfall varying from 1076 to 1953 mm, with a mean monthly maximum temperature of 27.0 °C and a mean monthly minimum temperature of 12.6 °C (Fig. S1). The main rainfall occurs from June to September; otherwise, it is dry (NMSA 2004). Acrisols, Luvisols, Vertisols, and Leptosols are the common soils in the catchment (FAO 2006). Major crops include teff (Eragrostis tef), finger millet (Eleusine coracana), and maize (Zea mays L.).

Justification of the study area selection

We selected the Aba Gerima catchment in northwest Ethiopia, a tropical agroecological zone where soil erosion and fertility loss have been important contributors to poor agricultural yields. Extension programs have incorporated bund construction in the catchment (Haregeweyn et al. 2017). However, due to the expense and time involved, which could be avoided by adopting a remote sensing strategy, their effects on crop productivity have not been examined.

Additionally, efficient soil bunding can help improve deteriorated soil quality and functions, guarantee sustainable output, and improve the environment. Low agricultural yields have been a serious problem in the Aba Gerima catchment due to soil erosion and diminished soil productivity. Further, it would be beneficial to replace conventional crop surveys for estimating crop yield with more accurate and low-cost methods, such as satellite and crop canopy analyzer-derived approaches.

Methods of data collection

Experimental setup and crop sampling

We defined three slope categories for cropland: gently sloping (2 to 5%), moderately sloping (5 to 10%), and strongly sloping (10 to 15%) (FAO 2006). We mapped their locations within the Aba Gerima catchment using the digital elevation model in ArcGIS v. 10.5.1 (www.esri.com). We then considered two land management scenarios: 24 plots without bunding (WB) and 24 plots with soil bunds (SB) reinforced with grass and stone. The bunds are 5 years old and have a bottom width of 0.8 m and a height of 0.5 m. Forty-eight representative soil-sampling plots were identified, with a minimum size of 40 m × 40 m (1600 m2) each. All plots were intentionally distributed and arranged in a randomized complete block design. Each plot was geo-coded with a hand-held GPS device (GPSMAP64, Garmin, Olathe, KS, USA) and recorded in ArcGIS software (Fig. 1d).

The soil in each sample plot was cultivated to a depth of 20 cm by an ox-drawn plow. Teff and finger millet seeds were each broadcast at 25 kg/ha. Weeds were controlled by hand throughout the growing season. No agricultural pesticides were applied to control crop pests. We used urea and di-ammonium phosphate fertilizer (at 100 kg/ha).

Sentinel-2 and crop canopy analyzer-based data collection

The open-source tools and data used in this study were chosen so that the work can be easily replicated by non-profit organizations and other stakeholders. Sentinel-2 is a twin-satellite, multispectral, high-resolution, wide-swath system in a polar, sun-synchronous orbit. The satellite images were acquired by Sentinel-2, which is operated by the European Space Agency. Because the two satellites are phased 180° apart, the combined revisit interval is 5 days at the equator and 2 to 3 days at mid-latitudes. Each satellite carries a Multi-Spectral Instrument, which collects data in 13 spectral bands: four at a spatial resolution of 10 m, six at 20 m, and three at 60 m. Each satellite has a swath width of 290 km.

We downloaded cloudless Sentinel-2 images covering the Aba Gerima catchment during the tillering growth stage (1 to 31 October 2020 and 2021) from the Copernicus Open Access Hub (https://sentinels.copernicus.eu/). Sentinel-2 images acquired during peak growth stages are more useful than those acquired during low growth periods. We predicted crop yield at the maximum physiological maturity stage (tillering) of the test crops. We used the 2020 data for model development and the 2021 data for model validation. After selecting the images, we generated composite images. We used QGIS v. 3.10.1 (https://qgis.org/en/site/forusers/download.html) for atmospheric correction of the images. We used about 16 ground-truthing training sites for supervised classification of the satellite images and mapped the cover of forest, shrub or bushland, maize, teff, and finger millet. The same dataset was used for calibrating and validating the LAI-based model. Figure 2 shows the overall workflow for estimating grain yield (GY) and aboveground biomass (AGB) of teff and finger millet. The VIs (Table S1) were derived for each pixel with the raster calculator in ArcGIS.
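The per-pixel index calculation can be illustrated with a short sketch. Table S1 is not reproduced here, so the formulas below are the commonly published definitions of NDVI, EVI, SAVI, GNDVI, and GCVI and are an assumption about what the table contains; the function name, the soil adjustment factor of 0.5, and the example reflectances are likewise hypothetical. For Sentinel-2, the blue, green, red, and NIR inputs would typically come from the 10 m bands B2, B3, B4, and B8.

```python
import numpy as np

def vegetation_indices(blue, green, red, nir, soil_factor=0.5):
    """Compute common vegetation indices from surface reflectance (0-1 scale).

    Inputs may be scalars or arrays of equal shape (one value per pixel).
    The formulas are the widely used textbook definitions, assumed here
    because Table S1 of the study is not available in this text.
    """
    blue, green, red, nir = (
        np.asarray(band, dtype=float) for band in (blue, green, red, nir)
    )
    ndvi = (nir - red) / (nir + red)
    evi = 2.5 * (nir - red) / (nir + 6.0 * red - 7.5 * blue + 1.0)
    savi = (1.0 + soil_factor) * (nir - red) / (nir + red + soil_factor)
    gndvi = (nir - green) / (nir + green)
    gcvi = nir / green - 1.0
    return {"NDVI": ndvi, "EVI": evi, "SAVI": savi, "GNDVI": gndvi, "GCVI": gcvi}

# One hypothetical, well-vegetated pixel (reflectances are made up)
vi = vegetation_indices(blue=0.04, green=0.08, red=0.05, nir=0.40)
for name, value in vi.items():
    print(f"{name}: {float(value):.3f}")
```

The sketch does not mask cloud or no-data pixels; in practice such pixels would need to be excluded before any division.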

Fig. 2

Workflow for estimating crop yield and aboveground biomass using Sentinel-2 satellite and crop canopy analyzer-derived data

Field data collection

Leaf area index (LAI) was measured non-destructively (Marshall and Thenkabail 2015) with a plant canopy analyzer. The LAI of the two test crops (teff and finger millet) was measured before 10:00 a.m. on the Sentinel-2 image acquisition dates using an LAI-2200C Plant Canopy Analyzer (LAI-2000, LI-COR Inc., Lincoln, NE, USA; Fig. 3) fitted with a 270° view cap. Radiation above and below the canopy was measured in all teff and finger millet plots. Seven independent measurements were made for each quadrant to reduce measurement errors, and all were incorporated into the analysis.

Fig. 3

Leaf area index measurement of teff (a) and finger millet (b) with plant canopy analyzer in Aba Gerima catchment

As a destructive approach, all teff plants were harvested on 1 November, and all finger millet plants were harvested on 25 November 2020 and 2021 and air-dried for 1 week. AGB per plot (ton/ha) was recorded by weighing the whole plant on an electronic balance with a precision of 10 mg. We measured the thousand-seed weight for both crops using the same balance. Seed moisture content was measured with a hand-held grain moisture tester (model AG-12, A-Grain, India). GY (kg/ha), adjusted to 12% grain moisture content, was calculated as:

$$\textrm{GY}\left(\frac{\textrm{kg}}{\textrm{ha}}\right)=\frac{\textrm{GYPP}\times 10\;000\times \left(100-\textrm{GM}\right)}{\textrm{HA}\times 88}$$
(1)

where GYPP = grain yield per plot (kg), GM = grain moisture content at harvest (%), and HA = harvested area (m2).
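As a worked illustration of Eq. 1, the following minimal sketch scales a plot yield to one hectare (10 000 m2) and adjusts it to the 12% standard moisture content; the constant 88 in Eq. 1 is simply 100 − 12. The function name and the example plot values are hypothetical.

```python
def grain_yield_kg_per_ha(gypp_kg, grain_moisture_pct, harvested_area_m2,
                          standard_moisture_pct=12.0):
    """Grain yield (kg/ha) adjusted to a standard moisture content (Eq. 1).

    gypp_kg            -- grain yield per plot (kg)
    grain_moisture_pct -- grain moisture content at harvest (%)
    harvested_area_m2  -- harvested area (m2)
    """
    # Ratio of dry matter at harvest to dry matter at standard moisture
    moisture_adjustment = (100.0 - grain_moisture_pct) / (100.0 - standard_moisture_pct)
    # Scale the plot yield to one hectare (10 000 m2)
    return gypp_kg * 10_000.0 / harvested_area_m2 * moisture_adjustment

# Hypothetical plot: 160 kg of grain from 1600 m2 at 15% moisture
example_yield = grain_yield_kg_per_ha(160.0, 15.0, 1600.0)
print(round(example_yield, 1))
```

Grain harvested wetter than 12% is adjusted downward, and grain harvested drier than 12% is adjusted upward.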

Data analysis

We employed variance and regression analyses to identify the VIs that contributed significantly to the prediction of GY and AGB in SAS v. 9.4 (https://www.sas.com/) and Stata version 14.0. The coefficient of determination (R2) and root-mean-square error (RMSE) were computed. High R2 and low RMSE between crop yields and the proposed VIs were used to select the best-fit models. We used ArcGIS to generate yield maps of the study area for teff and finger millet using the best-fit regression models.

$${R}^2=\frac{\sum_{i=1}^N{\left({\hat{y}}_i-\bar{y}\right)}^2}{\sum_{i=1}^N{\left({y}_i-\bar{y}\right)}^2}$$
(2)
$$\textrm{RMSE}=\sqrt{\frac{1}{N}\sum_{i=1}^N{\left({\hat{y}}_i-{y}_i\right)}^2}$$
(3)

where ŷ = predicted value; ȳ = mean measured value; y = measured value; and N = number of observations with i = 1, 2, …, N.
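The two goodness-of-fit statistics can be sketched as follows. R2 is written as the ratio of explained to total sum of squares, as in Eq. 2, and RMSE uses the conventional root-mean-square definition. The function names and the example values are hypothetical.

```python
import numpy as np

def r_squared(measured, predicted):
    """Coefficient of determination as explained / total sum of squares (Eq. 2)."""
    y = np.asarray(measured, dtype=float)
    y_hat = np.asarray(predicted, dtype=float)
    y_bar = y.mean()  # mean of the measured values
    return float(np.sum((y_hat - y_bar) ** 2) / np.sum((y - y_bar) ** 2))

def rmse(measured, predicted):
    """Root-mean-square error between predicted and measured values (Eq. 3)."""
    y = np.asarray(measured, dtype=float)
    y_hat = np.asarray(predicted, dtype=float)
    return float(np.sqrt(np.mean((y_hat - y) ** 2)))

# Hypothetical measured and predicted grain yields (ton/ha)
measured_gy = [1.0, 1.5, 2.0]
predicted_gy = [1.1, 1.5, 1.9]
print(r_squared(measured_gy, predicted_gy), rmse(measured_gy, predicted_gy))
```

Note that the explained-variance form of R2 equals the more familiar 1 − SSE/SST only for least-squares fits that include an intercept; for other predictions the two forms can differ.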

Results and discussion

Slope aspect and topographic wetness index (TWI) computation

The slope aspect map for this investigation was obtained using the DEM (Fig. 4a). For this work, a raster DEM with a 0.5 by 0.5 m grid size was employed. Temperature is influenced by slope; for example, eastern and western slopes receive more direct solar radiation and become warmer than southern and northern slopes.

Fig. 4

Slope aspect (a), upslope flow accumulation (b), radian slope (c), and topographic wetness index (d) of Aba Gerima catchment

TWI is used to quantify topographic controls on hydrological processes, vegetation patterns, and annual net primary production (White and Running 1994; Zinko et al. 2005; Sorenson et al. 2006). The index was introduced by Beven and Kirkby (1979) and is nowadays commonly derived from a DEM (Kopecký et al. 2021). Even though the current study did not incorporate TWI for predicting crop yield, we calculated TWI following Beven and Kirkby (1979) (Eq. 4).

$$\textrm{TWI}=\ln \frac{b}{\tan c}$$
(4)

where b is the upslope contributing area per unit contour length (Fig. 4b), and c is the slope (m m−1). The input rasters were prepared as follows: radian slope (Fig. 4c) = (“slope.tif” × 1.570796) / 90; tan slope = Con(“radian slope.tif” > 0, Tan(“radian slope.tif”), 0.001); rescaled flow accumulation = (“flow accumullation.tif” + 1) × cell size. The final TWI was computed as the natural logarithm of the rescaled flow accumulation divided by the tan slope (Fig. 4d). According to the TWI color pattern, areas in red indicate high TWI and wetness, whereas areas in blue indicate low TWI and wetness (Fig. 4d).
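The raster-calculator steps described above can be sketched in array form. This is a minimal illustration under stated assumptions, not the ArcGIS workflow itself: the function name is hypothetical, the slope raster is assumed to be in degrees, the flow accumulation raster is assumed to hold upslope cell counts, and the example values are made up.

```python
import numpy as np

def topographic_wetness_index(slope_deg, flow_accumulation, cell_size_m):
    """TWI = ln(b / tan c), following the raster steps described in the text."""
    # Degrees to radians; equivalent to slope * 1.570796 / 90
    slope_rad = np.deg2rad(np.asarray(slope_deg, dtype=float))
    # Replace tan(0) on flat cells with a small constant to avoid division by zero
    tan_slope = np.where(slope_rad > 0, np.tan(slope_rad), 0.001)
    # Add 1 so that cells with no upslope contribution still have a positive area
    rescaled_accumulation = (np.asarray(flow_accumulation, dtype=float) + 1.0) * cell_size_m
    return np.log(rescaled_accumulation / tan_slope)

# Hypothetical 2 x 2 rasters on a 0.5 m grid
slope = np.array([[45.0, 10.0], [0.0, 5.0]])
accumulation = np.array([[9.0, 100.0], [0.0, 2500.0]])
twi = topographic_wetness_index(slope, accumulation, cell_size_m=0.5)
print(np.round(twi, 2))
```

Flat cells and cells with large upslope contributing areas receive the highest values, which is why TWI highlights valley bottoms and convergent terrain.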

Effect of a soil bund on the vegetation indices, leaf area index, and crop yields

GY and AGB for both crops were substantially improved (p < 0.001) by the presence of a soil bund and were greater (but not always significantly) in the lower slope classes (Tables 1 and 2). However, WDRVI and ARVI did not differ significantly among plot types for either crop, and neither did LAI for finger millet, so they are not included in the paper. This could be due to variations in biotic and abiotic factors (Barnes et al. 2000) and to saturation of the indices (Vallentin et al. 2020; Skakun et al. 2021). It might also be brought on by the plants’ increased red light absorption and NIR light reflection (Rouse et al. 1974). Similar findings were obtained for wheat by Jat et al. (2008) and Bandyopadhyay et al. (2014). Crop canopies absorb photosynthetically active radiation in a linear relationship with VIs (Gamon et al. 1995; Silva et al. 2020).

Table 1 Mean values of the vegetation indices for teff in 2020
Table 2 Analysis of the vegetation indices for finger millet in 2020

The higher VI, LAI, GY, and AGB values from most bunded plots, such as S1SB, can be attributed to improved water and nutrient contents, increased LAI, and plant density. They could also result from greater absorption of red light and reflection of NIR light by the plants (Rouse et al. 1974). Jat et al. (2008) and Bandyopadhyay et al. (2014) reported comparable results in wheat. VIs are linearly associated with the absorption of photosynthetically active radiation by crop canopies (Weigand and Richardson 1990; Gamon et al. 1995). Bala and Islam (2009) compared NDVI, LAI, and fraction of photosynthetically active radiation in estimating the yield of potatoes. Studies have found that NDVI saturates asymptotically at high LAI (Huete et al. 2002; Gitelson et al. 2003).

Yield prediction models

Yield estimation and identification of yield-limiting factors are performed using multiple linear regression (Kitchen et al. 2003) and machine learning (Kaul et al. 2005). Machine learning has improved the capability to precisely anticipate agricultural yields (Archontoulis et al. 2020; Bogard et al. 2020; Washburn et al. 2020). Moreover, such algorithms are simple and have high data storage capacity (Shahhosseini et al. 2019). Several machine learning algorithms such as linear regression, decision tree, and random forest have been employed for the prediction of agronomic variables (Haque et al. 2020; Khaki et al. 2020). In this study, however, we regressed each VI against GY and combined those VIs having higher R2 (greater than 0.50) and lower RMSE values in a multiple linear regression model (Taghizadeh et al. 2020). A single VI such as NDVI may not be the most accurate and reliable option for crop monitoring (Dhillon et al. 2023). Thus, we analyzed the potential of each VI, LAI, and their best combinations to predict teff and finger millet yields (Table 3). The spatial variability of the indices is depicted in Figs. S2 and S3. For teff, the equation for GY with the highest R2 (0.83) and close to the lowest RMSE (0.14 ton/ha) was:

$$GY=5.57\ast EVI+3.19\ast NDVI+1.15\ast SAVI-0.35\ast GNDVI+0.09\ast GCVI-0.34\ast RNDVI+0.04\ast LAI-3.71$$
Table 3 Models developed to predict aboveground biomass (AGB) and grain yield (GY) of teff and finger millet in 2020

For finger millet GY, the best fit (R2 = 0.63, RMSE = 0.48 ton/ha) was:

$$GY=4.41\ast EVI+1.82\ast NDVI-7.16\ast SAVI+1.04\ast GNDVI+1.41\ast GCVI-0.72\ast RNDVI+1.15$$
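To show how the two fitted equations are applied, the sketch below stores their coefficients in dictionaries and evaluates them for one pixel. The coefficients are copied from the equations above; the output is assumed to be in ton/ha because the RMSE values are reported in that unit, and the term 1.41 GCVI is read as a product. The function name and the pixel values are hypothetical.

```python
# Coefficients copied from the teff and finger millet GY equations
TEFF_GY = {
    "intercept": -3.71, "EVI": 5.57, "NDVI": 3.19, "SAVI": 1.15,
    "GNDVI": -0.35, "GCVI": 0.09, "RNDVI": -0.34, "LAI": 0.04,
}
FINGER_MILLET_GY = {
    "intercept": 1.15, "EVI": 4.41, "NDVI": 1.82, "SAVI": -7.16,
    "GNDVI": 1.04, "GCVI": 1.41, "RNDVI": -0.72,
}

def predict_yield(coefficients, predictors):
    """Evaluate a linear model: intercept + sum(coefficient * predictor)."""
    return coefficients["intercept"] + sum(
        coefficient * predictors[name]
        for name, coefficient in coefficients.items()
        if name != "intercept"
    )

# Hypothetical index values for a single pixel (not taken from the study)
pixel = {"EVI": 0.5, "NDVI": 0.65, "SAVI": 0.5, "GNDVI": 0.6,
         "GCVI": 3.0, "RNDVI": 0.4, "LAI": 2.0}
teff_gy = predict_yield(TEFF_GY, pixel)
finger_millet_gy = predict_yield(FINGER_MILLET_GY, pixel)
print(round(teff_gy, 2), round(finger_millet_gy, 2))
```

Predictors that a model does not use (LAI for finger millet) are simply ignored, so the same pixel dictionary can be passed to both models. Because the coefficients were fitted to the catchment data, inputs outside the observed index ranges can produce implausible yields.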

Using linear regression models, the major contributors were EVI and NDVI for teff AGB (R2 = 0.57, RMSE = 0.46 ton/ha) and NDVI for teff GY (R2 = 0.70; RMSE = 0.10 ton/ha); and NDVI for finger millet AGB (R2 = 0.73; RMSE = 1.72 ton/ha) and GY (R2 = 0.69; RMSE = 0.41 ton/ha) in 2020 (Table 3). Other studies have reported greater accuracy in crop yield prediction using random forest models (R2 = 0.86, RMSE = 683 kg/ha, Zhao et al. 2022; R2 = 0.963, RMSE = 0.035, Jhajharia et al. 2023). Random forest also performed better for rice yield estimation, with R2 = 0.85 and 0.81 for calibration and validation, respectively (Satpathi et al. 2023), and was the best model for potato and maize, with RMSE of 510.8 and 129.9 and R2 of 0.88 and 0.82, respectively (Kuradusenge et al. 2023). In the present study, a predictive model integrating numerous VIs was more accurate than models based on a single VI for predicting finger millet GY at the catchment scale. We therefore selected the multi-VI model to assess and forecast the geographic variability in finger millet output in the Aba Gerima catchment. Thus, we can conclude that adding canopy development measurements increased the precision of crop yield forecasts. This result is consistent with reports made by Zhao et al. (2020). In Spain, the best correlation with rice yield was R2 = 0.72 to 0.76 (Franch et al. 2021).

Consistent with the findings of the current study, substantial correlations have been noted between the green difference vegetation index and millet GY (Bartholome 1988; Leroux et al. 2020) and between NDVI and pearl millet yield (Gerard and Buerkert 2001). High LAI values cause NDVI to saturate asymptotically (Huete et al. 2002; Gitelson et al. 2003). According to Yadav et al. (2021), NDVI has a strong correlation with wheat’s chlorophyll content, but it loses sensitivity once plants reach a certain saturation level. Nevertheless, a significant correlation between NDVI and LAI has been reported (Yadav et al. 2020a, 2020b).

A strong association between VIs, LAI, and AGB is expected when crops reach maximum growth (Lambert et al. 2018). Increased accuracy of NDVI for forecasting finger millet yield was reported in Iowa (Prasad et al. 2006) and Kenya (Jin et al. 2017a). Yield underestimation by models with high RMSE could be due to pest infestations or diseases, insufficient water, and the effects of temperature and soil nutrient stresses (Barnes et al. 2000), as well as by saturation of the VI values (Robson and Wright 2013; Robson et al. 2016). Combining metrics related to canopy development (Zhao et al. 2020) with VIs and LAI enhanced the accuracy of crop yield prediction.

We plotted the observed yield (ton/ha) against the estimated yield (ton/ha) at the catchment scale to validate the model (Fig. 5). The strong fits (R2 > 0.83, RMSE < 0.98 ton/ha) suggest that a predictive model that combined multiple VIs was more accurate than models based on a single VI at predicting teff and finger millet GY and AGB at a catchment scale. We also used the best-fit models from Table 3 to predict the GY and AGB of the two crops throughout the study area, and Fig. 6 shows their spatial distribution.
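The calibrate-then-validate design (fit on one season, test on the next) can be sketched with ordinary least squares. The data below are synthetic and only stand in for the 2020 and 2021 plot observations; the coefficients, noise level, sample sizes, and function names are all hypothetical, and the study itself used SAS and Stata rather than this code.

```python
import numpy as np

def fit_linear_model(predictors, observed_yield):
    """Ordinary least squares with an intercept; returns [intercept, slopes...]."""
    design = np.column_stack([np.ones(len(predictors)), predictors])
    coefficients, *_ = np.linalg.lstsq(design, observed_yield, rcond=None)
    return coefficients

def predict(coefficients, predictors):
    """Apply fitted coefficients to a (plots x indices) predictor matrix."""
    return coefficients[0] + np.asarray(predictors, dtype=float) @ coefficients[1:]

# Synthetic stand-in data: 24 plots per season, two indices (e.g. EVI and NDVI)
rng = np.random.default_rng(0)
true_coefficients = np.array([-1.0, 2.0, 1.5])  # hypothetical values
vi_2020 = rng.uniform(0.2, 0.8, size=(24, 2))
vi_2021 = rng.uniform(0.2, 0.8, size=(24, 2))
gy_2020 = predict(true_coefficients, vi_2020) + rng.normal(0.0, 0.05, 24)
gy_2021 = predict(true_coefficients, vi_2021) + rng.normal(0.0, 0.05, 24)

# Calibrate on the first season, validate on the second
model = fit_linear_model(vi_2020, gy_2020)
validation_error = predict(model, vi_2021) - gy_2021
validation_rmse = float(np.sqrt(np.mean(validation_error ** 2)))
print(model, validation_rmse)
```

Validating on a season that was not used for fitting gives a more honest error estimate than reusing the calibration data, since weather and management differ between years.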

Fig. 5

Scatterplots of the model validation results for grain yield (GY) and aboveground biomass (AGB) of teff and finger millet. Models were developed using data from 2020 and validated using data from 2021. R2, coefficient of determination; RMSE, root-mean-square error

Fig. 6

Spatial distribution of the predicted teff and finger millet yields in Aba Gerima catchment. GY grain yield, AGB aboveground biomass

These variations could be due to factors such as growth stage, nutrient availability, and water status that affect leaf area (Chang et al. 2005), plant anatomy (Lillesand et al. 2004), crop row orientation (Maire et al. 2004; Johannsen and Daughtry 2009), and plant canopies (Pimstein et al. 2009). Strong linear connections were found between VIs generated from Sentinel-2 imagery and agricultural yield despite the influence of these factors on yield (Manivasagam et al. 2021; Perros et al. 2021; Vallentin et al. 2021). Furthermore, Sentinel-2 preprocessing with radiometric and atmospheric adjustment and elimination of noisy spectral areas increased the model’s robustness, accuracy, repeatability, and reproducibility (Gholizadeh et al. 2013). These results suggest that the models for forecasting crop yields at broad spatial scales from Sentinel-2 data are accurate and might be reproducible in the study area conditions. Before application at very large scales, validation is required.

NDVI and EVI gave the best predictions of teff and finger millet yields. They also explained the largest proportion of the variation of GY and AGB; thus, the models showed superior performance at predicting crop yields at the tillering stage. This may be because EVI is based on the red, blue, and NIR regions of the spectrum. It could also be because EVI is an optimized VI with high sensitivity in dense vegetation (Rouse et al. 1974; Tucker 1979; Huete et al. 2002). Consistent with this, EVI has been shown to have a stronger linear association than NDVI with LAI in field crops (Boegh et al. 2002). Furthermore, NDVI became ineffective at LAIs ranging from 5 to 6 m2 m−2 (Yadav et al. 2020a) and from 3 to 6 m2 m−2 (Carlson and Ripley 1997).

NDVI was employed to monitor vegetation growth and crop yield in previous studies (Sultana et al. 2014; Fieuzal and Baup 2016; Moussa Kourouma et al. 2021). However, NDVI becomes saturated at higher biomass density (Santin-Janin et al. 2009; Zheng and Moskal 2009). In previous research, NDVI data accurately explained the variation of maize GY (de Oliveira et al. 2019; Maresma et al. 2020) and teff GY (Jean et al. 2021) in other countries.

Zhao et al. (2007) reported that EVI and NDVI at flowering were moderately to strongly correlated (R2 from 0.56 to 0.89) with irrigated cotton yield. NDVI and EVI explained soybean yield (R2 = 0.69 and 0.70, respectively; Bolton and Friedl 2013). Previous research on sorghum yield (Bartholome 1988; Mandal et al. 2007; Shafian et al. 2018; Habyarimana and Baloch 2021), maize GY (Maresma et al. 2020; Vozhehova et al. 2020), and teff GY (Jean et al. 2021) revealed that NDVI data effectively explained the variation in these yields. However, NDVI became ineffective at LAI of 3 to 6 (Carlson and Ripley 1997). In other studies, EVI and NDVI best predict crop grain and biomass from reflectance values (Le Page and Zribi 2019; Wang et al. 2019; Liu et al. 2020). This could be due to their derivation from the red and NIR spectral bands, which are influenced strongly by chlorophyll absorption and reflecting light (Huete et al. 1997, 2002). However, models with a large RMSE may understate production as a result of pest or disease infestations, drought, and the impacts of temperature and soil nutrient stressors, as well as saturation of the VI values (Barnes et al. 2000; Robson et al. 2016).

Spatial crop yield mapping

We used the four best models in Table 3 to map crop yield throughout the Aba Gerima catchment. GY of teff ranged from 0.30 to 1.81 ton/ha and AGB from 1.1 to 4.53 ton/ha, and GY of finger millet varied from 0.5 to 4.36 ton/ha and AGB from 1.95 to 14.92 ton/ha (Fig. 6). These variations could be due to spatial variation in soil fertility, water availability, crop cultivars, crop management practices, sowing date, and environmental conditions (Lambert et al. 2018; Jin et al. 2019; Tiruneh et al. 2021a, b; Tiruneh et al. 2023), and to slope class and the presence or absence of a soil bund. Despite the influence of these factors on yield, strong linear relationships between VIs generated from Sentinel-2 imagery and crop yield were found (Franch et al. 2021; Perros et al. 2021; Vallentin et al. 2021). The information on yield variability will be valuable to agriculture managers in helping them to define different land management zones with specific yield-limiting factors that should be monitored (Diker et al. 2004).

The best-fit models performed well at estimating the yield of finger millet and teff at the tillering stage from VIs and LAI extracted from Sentinel-2 and crop canopy analyzer-derived data. However, caution will be necessary before using the models in other regions and seasons because differences between regions in their characteristics, including land management practices (Delmotte et al. 2011; Knox et al. 2012), would influence the accuracy of a yield estimation model (Prasad et al. 2006; Hunt et al. 2019). Thus, the models must be redeveloped or (if the basic model form is valid) parameterized for each region before they can be used to predict crop yield. Our study confirmed the capacity of Sentinel-2 images and crop canopy analyzer-derived LAI and the VIs developed from the data to predict yield within the Aba Gerima catchment. As a result, it will provide important support for improved agricultural management that will both increase food security and improve farmer income.

Conclusions

This study’s primary goal was to develop a yield estimation model utilizing Sentinel-2 vegetation indices and a leaf area index (LAI) obtained from a crop canopy analyzer. Several linear regression models were created between the vegetation indices, LAI, and actual yield values at the tillering stage of teff and finger millet. For both teff and finger millet, a soil bund boosted grain production. To ensure long-term production and environmental protection, good soil bunding can help conserve and restore the quality and functions of soil.

We confirmed our research hypothesis that the available VIs and LAI could provide acceptably accurate estimates of crop growth and yield. Such decision-support tools can provide important guidance for developing agricultural policies designed to improve food security and farmer income. We found that Sentinel-2 satellite imagery and LAI derived from a crop canopy analyzer during the tillering stage could accurately predict the yield of teff and finger millet in Ethiopia’s Aba Gerima catchment. Most VIs and LAI calculated from the Sentinel-2 images and crop canopy analyzer were strongly influenced by soil bund construction, which improved yield, and by slope variation, which decreased yield on steeper slopes.

EVI and NDVI were the two most significant yield predictors considered. The suggested methodology could be seen as a promising addition to the survey-based yield assessments already used in Ethiopia since it enables the delivery of early teff and finger millet crop projections quickly and inexpensively. The same methodology was successfully used to estimate the yield of both teff and finger millet, two significant cereals in Ethiopia.

Our results reveal the strong potential for using Sentinel-2 and crop canopy analyzer-derived LAI data to accurately predict the crop yield of teff and finger millet and its spatial variability at the catchment level. Developing a timely and accurate yield estimation model could help decision-makers allocate resources and improve food security. This will benefit farmers by suggesting optimal site-specific interventions to sustainably increase yield and income. It can help farmers and other land users select what to cultivate and when to grow it. The same approach should be explored to test the ability to predict yields of other crops and in other regions of Ethiopia; this will be particularly important for crops such as maize, which uses C4 metabolism and has different spectral characteristics. Although our models can be applied to other regions, they must be tested to ensure that they and their parameterization are appropriate for those regions before they are used to support agricultural management there. Most likely, the technique can be applied to other nations with comparable weather characteristics. Understanding local agricultural productivity via affordable and timely remote sensing-based approaches is crucial for providing appropriate humanitarian actions in light of rising food costs globally. The development of a larger variety of yield estimation methods, such as machine learning algorithms applied to satellite images with improved spatial and temporal resolution, is proposed for future research. Additional research will be necessary to improve the accuracy of the models.