Wheat yield gaps across smallholder farming systems in Ethiopia

Wheat yields in Ethiopia need to increase considerably to reduce import dependency and keep up with the expected increase in population and dietary changes. Despite the yield progress observed in recent years, wheat yield gaps remain large. Here, we decompose wheat yield gaps in Ethiopia into efficiency, resource, and technology yield gaps and relate those yield gaps to broader farm(ing) systems aspects. To do so, stochastic frontier analysis was applied to a nationally representative panel dataset covering the Meher seasons of 2009 and 2013 and crop modelling was used to simulate the water-limited yield (Yw) in the same years. Farming systems analysis was conducted to describe crop area shares and the availability of land, labour, and capital in contrasting administrative zones. Wheat yield in farmers’ fields averaged 1.9 t ha−1, corresponding to ca. 20% of Yw. Most of the yield gap was attributed to the technology yield gap (> 50% of Yw) but narrowing efficiency (ca. 10% of Yw) and resource yield gaps (ca. 15% of Yw) with current technologies can nearly double actual yields and contribute to achieving wheat self-sufficiency in Ethiopia. There were small differences in the relative contribution of the intermediate yield gaps to the overall yield gap across agro-ecological zones, administrative zones, and farming systems. At farm level, oxen ownership was positively associated with the wheat cultivated area in zones with relatively large cultivated areas per household (West Arsi and North Showa) while no relationship was found between oxen ownership and the amount of inputs used per hectare of wheat in the zones studied. This is the first thorough yield gap decomposition for wheat in Ethiopia and our results suggest government policies aiming to increase wheat production should prioritise accessibility and affordability of inputs and dissemination of technologies that allow for precise use of these inputs.


Introduction
Ethiopia is the largest wheat producer in sub-Saharan Africa with a record harvest of 4.6 million metric tons registered in 2017 (CSA 2019). However, during that same year, the country imported 1.5 million tons of wheat, corresponding to a value of around US$600 million (CSA 2019). Further increases in demand for wheat (and other cereals), due to population growth and dietary changes (van Ittersum et al. 2016), are projected to put additional pressure on the national treasury, making the national economy vulnerable to cereal price volatility in world markets. These drivers have put wheat self-sufficiency high on the agenda, with a new initiative of the Ethiopian government targeting self-sufficiency in wheat in the coming few years (https://www.press.et/english/?p=815).
Increasing wheat yield in Ethiopia, through narrowing yield gaps, is important to reduce the import dependency for this crop while avoiding area expansion. This needs to occur in a smallholder agriculture setting as wheat is cultivated by approximately 4.2 million smallholders on ca. 1.7 million ha (Fig. 1, CSA 2019). Currently, wheat is produced mostly under rainfed conditions and with relatively low inputs (Anteneh and Asrat 2020). Despite the yield growth observed during the past 15 years (ca. 63 kg ha−1 year−1), with wheat yield doubling to values reaching ca. 2.7 t ha−1 (CSA 2018), the current wheat yield is only ca. 20% of its water-limited potential (Silva et al. 2019a; van Ittersum et al. 2016). Understanding the key drivers behind this large yield gap is thus important to help prioritise policies and interventions towards wheat self-sufficiency in Ethiopia.
Yield gap analysis at regional level is useful to investigate the key factors limiting and reducing crop production in farmers' fields (Hochman and Horan 2018; Rattalino-Edreira et al. 2017; van Ittersum et al. 2013a). Previous research concluded wheat yield gaps in West Arsi, one of the wheat belts in Ethiopia, can be largely attributed to technology yield gaps (Silva et al. 2019a). This means that technologies currently used by farmers do not reach agronomic best practices and that considerably more and better use of inputs is needed if wheat yield gaps are to be narrowed (Habte et al. 2014; Tanner et al. 1993). Competition for labour during sowing, weeding, and harvesting was also observed in this administrative zone, as labour peaks for other cereal and legume crops overlap with labour peaks for wheat. This results in potential trade-offs in resource allocation at farm level and reflects the importance of contextualizing yield gaps within broader farming system aspects (Silva and Ramisch 2019b).
The objective of this manuscript is twofold: (1) to decompose wheat yield gaps across smallholder farming systems in Ethiopia in order to identify relevant management, technological, and policy interventions corresponding to efficiency, resource, and technology yield gaps, and (2) to relate those yield gaps to broader farm(ing) systems contexts. For this purpose, we applied a framework combining frontier analysis and crop modelling to a large and nationally representative database of individual farm field data. This is the first agronomic assessment of the causes of wheat yield gaps across the main wheat producing administrative zones, agro-ecological zones, and farming systems of Ethiopia and we suggest our findings are useful for policy makers aiming to improve food security and to ensure wheat self-sufficiency in the country.

Concepts and definitions
The yield gap analysis presented here builds upon the frameworks of Silva et al. (2017) and van Dijk et al. (2017), which consider six yield levels to decompose the yield gap (Table 1). The water-limited yield (Yw) refers to the maximum yield that can be obtained under rainfed conditions in a well-defined biophysical environment (van Ittersum et al. 2013a). Yw can be simulated with crop growth models or derived from field trials with high levels of nutrients applied and pests, diseases, and weeds fully controlled. The highest farmers' yields (Y_HF) refer to the maximum yields (e.g. average above the 90th percentile of actual farmers' yields) observed in a sample of farmers sharing similar biophysical conditions (weather and soils) and technologies adopted (e.g. varieties). In contrast to Y_HF, van Dijk et al. (2017) consider economic yields (Ye) and feasible yields (Yf): the former refers to the yield level at which marginal costs are equal to marginal revenue and the latter refers to the maximum yield that can be reached with available technology and best-practice management but without economic constraints. The technically efficient yields (Y_TEx) comprise the maximum yield that can be achieved for a given input level and can be computed using methods of frontier analysis in combination with concepts of production ecology (Silva et al. 2017). The latter provide a framework to assess the relative importance of different growth factors and inputs to actual yields and resource-use efficiencies (van Ittersum and Rabbinge 1997). Finally, the actual yield (Ya) refers to the yield in farmers' fields as recorded in farm surveys. Six yield gaps can be distinguished based on these yield levels (Table 1). The total yield gap is defined as the difference between Yw and Ya. We note these yield levels and yield gaps are only meaningful, and therefore also defined, for a given genotype × environment combination.
The efficiency yield gap is defined as the difference between Y_TEx and Ya and it is explained by crop management imperfections related to time, form, and/or space of the inputs applied. The resource yield gap is defined as the difference between Y_HF and Y_TEx and captures the yield penalty due to a sub-optimal amount of inputs applied. According to van Dijk et al. (2017), the resource yield gap can be further decomposed into an allocative yield gap (Ye − Y_TEx) and into an economic yield gap (Yf − Ye), which allows exploring how profit-maximisation affects the amount of inputs used. These allocative and economic yield gaps are particularly important in developing countries due to market failures and lower amounts of inputs used in highest-yielding fields than those required to reach Yw (van Ittersum et al. 2013a). The technology yield gap is defined as the difference between Yw and Y_HF (Silva et al. 2017) or between Yw and Yf (van Dijk et al. 2017), which can be caused by resource yield gaps of specific inputs and/or the use of technologies in farmers' fields where Yw is not achieved.
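The decomposition defined above can be sketched in a few lines of Python. This is a minimal illustration only, assuming the four-level variant of Silva et al. (2017); the function name `decompose_yield_gap` and the example yields are hypothetical and are not taken from the survey.

```python
def decompose_yield_gap(yw, y_hf, y_tex, ya):
    """Split the total yield gap (Yw - Ya) into three intermediate gaps.

    All arguments are yields in t/ha and are expected to satisfy
    Yw >= Y_HF >= Y_TEx >= Ya > 0.
    """
    if not (yw >= y_hf >= y_tex >= ya > 0):
        raise ValueError("expected Yw >= Y_HF >= Y_TEx >= Ya > 0")
    gaps = {
        "efficiency": y_tex - ya,  # Y_TEx - Ya
        "resource": y_hf - y_tex,  # Y_HF - Y_TEx
        "technology": yw - y_hf,   # Yw - Y_HF
        "total": yw - ya,          # Yw - Ya
    }
    # Express each gap as a percentage of Yw
    shares = {name: 100 * gap / yw for name, gap in gaps.items()}
    return gaps, shares


# Hypothetical yields (t/ha), for illustration only
gaps, shares = decompose_yield_gap(yw=9.0, y_hf=4.0, y_tex=2.8, ya=1.9)
for name in ("efficiency", "resource", "technology", "total"):
    print(f"{name}: {gaps[name]:.2f} t/ha ({shares[name]:.1f}% of Yw)")
```

By construction, the three intermediate gaps add up to the total yield gap, which is why they can be reported as stacked shares of Yw.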

Farm household survey
The Wheat Adoption and Impact Survey (WAIS) was conducted by the International Maize and Wheat Improvement Center (CIMMYT), in collaboration with the Ethiopian Institute of Agricultural Research (EIAR), for the purpose of tracking varietal change and assessing the impact of genetic improvement of wheat in Ethiopia (Jaleta et al. 2019; Tolemariam et al. 2018). The survey is a panel at household level covering the growing seasons of 2009 and 2013 and it is representative of Ethiopian wheat growing areas. The sampling frame comprised the selection of 148 major wheat growing districts, followed by a random selection of 120 farmers' associations (communities) within these districts and by a random selection of 15 to 18 households within each farmers' association (Tolemariam et al. 2018; Abro et al. 2017). This resulted in a sample of 2096 representative farmers (Fig. 2).
The survey included a wide range of farm and household characteristics as well as detailed information on the types and quantities of inputs used and crop yields obtained in all fields of each farm (Figs. S1 and S2), which makes it suitable for yield gap analysis (Beza et al. 2017). Descriptive statistics of the data are provided in Table 2 for the administrative zones with more than 100 wheat plots in both survey rounds. The large sample size and national coverage make this survey suitable for analysis at national level and allow for more focused analyses across different administrative zones. The latter was done for West Arsi, North Showa, East Gojam, and South Wollo administrative zones given the large sample size for each zone and the similar agro-ecological conditions and farming systems across zones (Table 2). Yet, these zones differ in a number of socio-economic conditions such as other crops cultivated and land availability. All data in the WAIS were self-reported by the farmer at the end of the growing seasons of 2009 and 2013. Actual yields were calculated from farmers' reports of wheat production after threshing and plot area. Wheat production after threshing was assumed to have a dry-matter content of 86.5%. The amount of urea applied was not properly recorded in Arsi, West Arsi, and Jimma administrative zones. For this reason, these data were imputed based on the amount of di-ammonium phosphate (DAP) applied, for which reliable data were available, while assuming a linear relationship between both. We also note that plots with 0 kg N ha−1 applied (n = 1043) were not considered in the yield gap analysis as we were not sure whether this value refers to no application or to missing data. We excluded wheat plots belonging to households with more than 20 pairs of oxen (n = 2), plots with wheat yields greater than 10 t DM ha−1 (n = 11), and plots with unspecified soil fertility status (n = 1).
This resulted in 3818 (bread) wheat plots during the Meher growing seasons of 2009 (n = 1751) and 2013 (n = 2067).
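The yield calculation and exclusion rules described above could be expressed roughly as follows. This is a sketch under stated assumptions: the record keys (`n_applied_kg_ha`, `oxen_pairs`, `yield_t_ha`, `soil_fertility`) are hypothetical names, not the WAIS variable names, and no moisture adjustment is applied because the text does not specify how the 86.5% dry-matter assumption entered the calculation.

```python
def actual_yield(production_kg, area_ha):
    """Actual yield (t/ha) from reported production after threshing and plot area."""
    if area_ha <= 0:
        raise ValueError("plot area must be positive")
    return production_kg / 1000 / area_ha


def keep_plot(plot):
    """Return True if a plot record passes the exclusion rules from the text."""
    if plot["n_applied_kg_ha"] == 0:    # ambiguous: no application or missing data
        return False
    if plot["oxen_pairs"] > 20:         # household with more than 20 pairs of oxen
        return False
    if plot["yield_t_ha"] > 10:         # implausibly high reported yield
        return False
    if plot["soil_fertility"] is None:  # unspecified soil fertility status
        return False
    return True


# Hypothetical records, for illustration only
plots = [
    {"n_applied_kg_ha": 46, "oxen_pairs": 1, "yield_t_ha": actual_yield(950, 0.5), "soil_fertility": "medium"},
    {"n_applied_kg_ha": 0, "oxen_pairs": 1, "yield_t_ha": 1.2, "soil_fertility": "rich"},
    {"n_applied_kg_ha": 60, "oxen_pairs": 2, "yield_t_ha": 12.0, "soil_fertility": "poor"},
]
retained = [p for p in plots if keep_plot(p)]
print(len(retained), "of", len(plots), "plots retained")
```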
Wheat yields and yield gaps were averaged per administrative zone, agro-ecological zone, or farming system (as per Table 2). The administrative zones were retrieved from the household survey while the other classifications were obtained from secondary sources based on the GPS coordinates of the individual households.
The agro-ecological classification combines temperature, rainfall, elevation, and the length of the growing season for the main crops and was retrieved from the Ministry of Agriculture of Ethiopia (MoA 1998). The farming system classification combines agro-ecological information with expert knowledge of the main farming systems in Ethiopia as documented by Amede et al. (2017).

Stochastic frontier analysis
Stochastic frontier analysis was used to estimate the production frontier, Y_TEx, and the efficiency yield gap for wheat production in Ethiopia. The estimated models assumed a Cobb-Douglas functional form (i.e. only first-order terms included) to describe the relationship between wheat yield and a vector of biophysical and agronomic variables. Models with a translog functional form were also fitted to assess interactions between variables but these are presented as Supplementary Material only (Table S1). We note parameter estimates were rather similar between both functional forms and the Cobb-Douglas has a considerably lower number of parameters and is easier to interpret. The formulation of the stochastic frontier model with a Cobb-Douglas functional form and the calculation of Y_TEx and the efficiency yield gap (Eff. Yg) followed Silva et al. (2017) and Battese and Coelli (1995). In this model, y_it represents the wheat dry-matter (DM) grain yield reported in plot i and in year t, x_kit is a vector of agronomic inputs k used on plot i and year t, and α_0 and β_k are parameters to be estimated. The stochastic frontier accounts for two random errors, v_it (random noise) and u_it (technical inefficiency), which are assumed to be independently distributed from each other and to follow a normal distribution (Eq. 2) and a half-normal distribution truncated at 0 (Eq. 3), respectively (Battese and Coelli 1995). The Cobb-Douglas model fitted to the pooled data was used to estimate the feasible yield (Yf) for different input levels as explained in Section 2.3.3.
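For reference, a stochastic frontier model of this kind is conventionally written as below. This is a sketch in standard notation using the symbols defined in the text, with equation numbers chosen to match the references made in this section. The variance parameters \sigma_v^2 and \sigma_u^2 and the exact expressions for Eff. Yg and Y_TEx are assumptions, based on the usual definition of technical efficiency, TE = exp(−u_it), and on the definition of the efficiency yield gap as the difference between Y_TEx and Ya.

```latex
\begin{align}
\ln y_{it} &= \alpha_0 + \sum_{k} \beta_k \ln x_{kit} + v_{it} - u_{it} \tag{1}\\
v_{it} &\sim N\left(0, \sigma_v^2\right) \tag{2}\\
u_{it} &\sim N^{+}\left(0, \sigma_u^2\right) \tag{3}\\
\text{Eff. Yg}_{it} &= Y_{TEx,it} - y_{it} \tag{4}\\
Y_{TEx,it} &= y_{it} \exp\left(u_{it}\right) \tag{5}
\end{align}
```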

Table 2
Number of households per administrative zone, agro-ecological zone, … Data are presented for administrative zones with more than 100 observations in both survey rounds.

The vector of inputs x_kit was defined according to principles of production ecology (van Ittersum and Rabbinge 1997). The growth-defining factors included in the analysis were temperature seasonality and growing degree days (both obtained from the climate zonation of Van Wart et al. 2013b), year of the survey (Meher seasons of 2009 and 2013), seed rate, and type of variety (improved or unknown landrace). Temperature seasonality refers to the standard deviation of monthly average temperatures and growing degree days consider a base temperature of 0 °C. The growth-limiting factors related to water included in the analysis were aridity index (i.e. the ratio between annual total precipitation and annual total potential evapotranspiration, also from Van Wart et al. 2013b), soil available water (obtained from the Africa Soil Information Service, AfSIS), and farmer-reported information on soil depth (deep, medium, shallow), occurrence of water logging (yes/no), occurrence of drought (yes/no), use of water conservation techniques (yes/no), and ploughing frequency (less than three times, three times, four times, and five times or more). The growth-limiting factors related to nutrients included in the analysis were the farmer-reported soil fertility status of the plot (rich, medium, poor), the use of manure (yes/no), incorporation of crop residues (yes/no), previous crop type (cereal, legume, other), and N applied (kg N ha−1). P applied was not included due to strong collinearity with N applied. Finally, herbicide use (L ha−1), hand-weeding (person-day ha−1), a dummy variable to distinguish weeded from non-weeded plots, pesticide use (yes/no), and occurrence of pests or diseases (yes/no) were included to capture or control for growth-reducing factors.
Missing data on seed rate (n = 828 plots) were filled with the mean value of the pooled sample and fields with no N applied were excluded from the analysis. All continuous input-output variables were ln-transformed prior to the analysis, so that parameter estimates can be interpreted as elasticities.
The stochastic frontier model (1)-(3) was fitted to the pooled sample (national analysis) and to subsets of the data for selected administrative zones (namely West Arsi, North Showa, East Gojam, and South Wollo) using maximum likelihood, as implemented in the sfa() function of the R package frontier (Coelli and Henningsen 2013). Efficiency yield gaps (4) and Y_TEx (5) were derived from the stochastic frontier model fitted to the pooled sample. We tried to assess the determinants of the efficiency yield gap with a second-stage regression (Battese and Coelli 1995) but refrain from showing these because most models did not converge and results were inconclusive. Data were used as a cross-section rather than as a panel of households in all models estimated, meaning that the effects of technological change and time-(in)variant technical inefficiencies were not tested.

Input use across actual yield percentiles
Farmers' fields within a unique year × climate zone × soil fertility combination were categorised into highest-, average-, and lowest-yielding based on their actual yields.
Year, climate zone, and soil fertility were considered in this analysis to ensure yield differences between the three field types were only explained by differences in crop management and not in biophysical conditions. Year refers to the Meher seasons of 2009 and 2013, the climate zones were obtained from the Global Yield Gap Atlas (Van Wart et al. 2013b), and the soil fertility was based on farmers' own assessment. Varieties were not considered because we found no significant yield differences between variety types (Table 3).
Highest-yielding fields were identified as the observations above the 90th percentile of Ya and the highest farmers' yields (Y_HF) were computed as the mean Ya for these fields. Similarly, the lowest-yielding fields were identified as the observations below the 10th percentile of Ya (Y_LF), and the average-yielding fields as the observations between the 10th and the 90th percentile of Ya (Y_AF). Significant differences across the different field categories were tested with analysis of variance (ANOVA) followed by a Tukey HSD post hoc test (considering a 5% significance level) for wheat yield, seed rate, N applied, total labour use (for land preparation, sowing, hand-weeding, and harvesting), labour use for hand-weeding, and herbicide application. This was implemented for selected administrative zones using the scipy and statsmodels libraries in Python (Virtanen et al. 2020; Seabold and Perktold 2010).
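The percentile-based classification described above can be sketched with the Python standard library alone. This is an illustration, not the code used for the analysis: the record keys, the function name `mean_yield_by_class`, the `min_plots` threshold for skipping small groups, and the choice of the "inclusive" percentile method are all assumptions.

```python
from collections import defaultdict
from statistics import mean, quantiles


def mean_yield_by_class(plots, min_plots=10):
    """Mean Ya of highest-, average-, and lowest-yielding fields per group.

    `plots` is a list of dicts with the hypothetical keys "year",
    "climate_zone", "soil_fertility", and "ya" (t/ha). Groups with fewer
    than `min_plots` plots are skipped (an assumption of this sketch).
    """
    groups = defaultdict(list)
    for plot in plots:
        key = (plot["year"], plot["climate_zone"], plot["soil_fertility"])
        groups[key].append(plot["ya"])

    result = {}
    for key, yields in groups.items():
        if len(yields) < min_plots:
            continue
        # Nine cut points; the first is the 10th and the last the 90th percentile
        deciles = quantiles(yields, n=10, method="inclusive")
        p10, p90 = deciles[0], deciles[-1]
        classes = {
            "highest": [y for y in yields if y > p90],         # Y_HF
            "average": [y for y in yields if p10 <= y <= p90],  # Y_AF
            "lowest": [y for y in yields if y < p10],          # Y_LF
        }
        result[key] = {
            name: (mean(values) if values else None)
            for name, values in classes.items()
        }
    return result
```

Grouping before computing percentiles is what keeps the comparison within similar biophysical conditions, so that differences between classes reflect management rather than climate or soil.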

Crop modelling and variety trials
Water-limited yields of wheat across Ethiopia were simulated with the WOFOST crop model (Boogaard et al. 2014) for the Meher seasons of 2009 and 2013 following the protocols of the Global Yield Gap Atlas (GYGA; Grassini et al. 2015; van Bussel et al. 2015). These provide a bottom-up approach to estimate Yw (with a dry-matter content of 86.5%) within a spatial framework using climate zones based on local weather, soil, and agronomic data. Daily weather data on minimum and maximum temperatures and precipitation for 12 weather stations across the country were acquired from the National Meteorology Agency of Ethiopia. Gridded soil data on rootable depth and soil water availability were obtained from AfSIS, and crop management information was obtained through expert knowledge and literature review. Further details about model calibration and validation can be found in Tesfaye (2016). The simulated Yw was linked to the household survey based on the GPS coordinates of each farm.
The technology yield gap was calculated as the difference between Yw and Y_HF for unique year × climate zone × soil fertility combinations (Section 2.3.2). The simulated Yw was further compared with wheat yields observed in variety trials conducted in 2016 and 2017 in Debre Zeit, Kulumsa, Bekoji, and Dawa Busa (Bezabih et al. 2018). This comparison was done for Arsi (Kulumsa) only, with the purpose of cross-validating the simulated Yw and assessing the contribution of varietal differences in yield potential to the technology yield gap. Moreover, the feasible yield was estimated with the Cobb-Douglas stochastic frontier model fitted to the pooled data (1)-(3) for non-limiting amounts of inputs applied as proposed by van Dijk et al. (2017). In other words, the parameter estimates of the fitted stochastic frontier model were used to predict wheat yields at high input levels. This was done for three different levels of N applied (150, 250, and 350 kg N ha−1) in combination with a seed rate of 200 kg ha−1, 1.5 L ha−1 of herbicide use, and 25 person-day ha−1 for hand-weeding labour. These seed and N rates are in line with the amounts needed to reach Yw (www.yieldgap.org) while weeding requirements reflect current management in highest-yielding fields of West Arsi. The contribution of sub-optimal amounts of inputs to the technology yield gap was further assessed by estimating an additional resource yield gap, i.e. the difference between the estimated feasible yields and Y_HF. These resource yield gaps are thus part of the technology yield gap defined as the difference between Yw and Y_HF.
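The feasible-yield prediction described above amounts to evaluating the fitted Cobb-Douglas frontier at high input levels. The sketch below shows the mechanics only. The intercept and the herbicide and hand-weeding coefficients are hypothetical placeholders, the seed and N elasticities are set to the approximate values reported in the Results (0.10 and 0.27), and the dummy and biophysical terms of the full model are ignored, so the absolute yields it returns carry no meaning.

```python
import math


def predict_frontier_yield(alpha0, betas, inputs):
    """Cobb-Douglas frontier prediction: ln y = alpha0 + sum_k beta_k * ln x_k."""
    ln_y = alpha0 + sum(beta * math.log(inputs[name]) for name, beta in betas.items())
    return math.exp(ln_y)


ALPHA0 = -1.0  # hypothetical intercept, not an estimate from Table 3
BETAS = {
    "seed_kg_ha": 0.10,      # approximate seed rate elasticity reported in the text
    "n_kg_ha": 0.27,         # approximate N elasticity reported in the text
    "herbicide_l_ha": 0.01,  # hypothetical placeholder
    "weeding_pd_ha": 0.01,   # hypothetical placeholder
}

# Input levels used for the feasible yield, as stated in the text
for n_rate in (150, 250, 350):
    inputs = {
        "seed_kg_ha": 200,
        "n_kg_ha": n_rate,
        "herbicide_l_ha": 1.5,
        "weeding_pd_ha": 25,
    }
    y_f = predict_frontier_yield(ALPHA0, BETAS, inputs)
    print(f"N = {n_rate} kg/ha -> predicted frontier yield {y_f:.2f} (arbitrary scale)")
```

Because only the N rate changes between the three scenarios, the ratio between any two predictions depends solely on the N elasticity, which is the part of this sketch that is grounded in the reported estimates.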

Resource allocation at farm level
Wheat cultivation by smallholders in Ethiopia occurs alongside the cultivation of other crops. This has important implications for the allocation of resources at the farm level and may lead to trade-offs depending on the level of resource constraints. Crop area shares of wheat, other cereals (e.g. barley and tef), pulses (e.g. faba bean, field peas and chickpeas), oilcrops, and vegetables were computed to assess the level of specialisation in wheat production of individual households. Resource allocation at farm level was further studied by comparing the number of ploughing days, labour use for weeding, and total labour use (incl. land preparation, sowing, hand-weeding, and harvesting) for wheat and for other crops within each household. We found no evident substitution or competition effects and results are thus presented as Supplementary Material (Fig. S6). It was not possible to relate the amount of labour used with the timing of the different management operations as the dataset lacked information on the latter.
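Crop area shares per household are a simple ratio of each crop group's area to the total cultivated area. A minimal sketch, assuming a hypothetical mapping from crop group to cultivated area in hectares:

```python
def crop_area_shares(areas_ha):
    """Return the share of total cultivated area taken by each crop group."""
    total = sum(areas_ha.values())
    if total <= 0:
        raise ValueError("total cultivated area must be positive")
    return {crop: area / total for crop, area in areas_ha.items()}


# Hypothetical household, for illustration only
household = {"wheat": 1.0, "other cereals": 0.5, "pulses": 0.5}
shares = crop_area_shares(household)
print({crop: round(share, 2) for crop, share in shares.items()})
```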
The pairs of oxen owned by each household were used to investigate whether the farming system was limited by land (intensification pathway) or by labour (extensification pathway) in relative terms (Silva et al. 2019a). Four different groups were identified based on this information: households with no oxen, households with one pair of oxen, households with two pairs of oxen, and households with three or more pairs of oxen. Differences in wheat yield, resource availability, and input use between households owning different pairs of oxen were tested for significance in selected administrative zones with ANOVA followed by a Tukey HSD post hoc test, as explained in Section 2.3.2. Other resource variables analysed included wheat cultivated area, total labour use for land preparation, sowing, hand-weeding, and harvesting, farm assets as reported by each household (excluding livestock), seed rate, N fertiliser rate, herbicide use, and labour use for hand-weeding (all referring to wheat production).
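The grouping by oxen ownership and the first step of the significance test can be sketched as follows. This is a standard-library illustration with hypothetical names and data: it computes only the one-way ANOVA F statistic by hand and omits the p-value and the Tukey HSD step, for which `scipy.stats.f_oneway` and `statsmodels.stats.multicomp.pairwise_tukeyhsd` would be the usual tools.

```python
from collections import defaultdict
from statistics import mean


def oxen_class(pairs):
    """Map the number of oxen pairs owned to one of the four groups in the text."""
    if pairs <= 0:
        return "0"
    if pairs == 1:
        return "1"
    if pairs == 2:
        return "2"
    return "3+"


def anova_f(groups):
    """One-way ANOVA F statistic for a dict of {group name: list of values}."""
    values = [v for g in groups.values() for v in g]
    grand_mean = mean(values)
    k, n = len(groups), len(values)
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups.values())
    ss_within = sum((v - mean(g)) ** 2 for g in groups.values() for v in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))


# Hypothetical households: (pairs of oxen, wheat area in ha)
households = [(0, 0.2), (0, 0.3), (0, 0.4), (1, 0.4), (1, 0.5), (1, 0.6),
              (2, 0.8), (2, 0.9), (4, 1.0), (3, 1.2)]
by_class = defaultdict(list)
for pairs, wheat_area in households:
    by_class[oxen_class(pairs)].append(wheat_area)
print(f"F = {anova_f(by_class):.2f}")
```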

Magnitude of wheat yield gaps in Ethiopia
Wheat yields in farmers' fields were on average 1.9 t ha−1 for the pooled sample, which corresponds to a yield gap closure of 21% of Yw (Fig. 3). The lowest Ya (less than 1.7 t ha−1) was recorded in the moist and sub-moist agro-ecological zones (M2, M3, and SM3) while the highest Ya (1.8-2.2 t ha−1) was recorded in the humid (H2 and H3) and sub-humid agro-ecological zones (SH1 and SH2; Fig. 3A and D). The latter are indeed amongst the most suitable agro-ecological zones for wheat production in Eastern Africa (Negassa et al. 2013; Hodson and White 2007). Ya also varied across administrative zones with a minimum of 1.2 t ha−1 in South Wollo and North Gonder and a maximum of 2.9 t ha−1 in West Arsi (Fig. 3B and E). No major differences in Ya were observed between highland mixed and highland perennial farming systems (Fig. 3C and F) while wheat yields were considerably greater in maize mixed farming systems, ca. 3.2 t ha−1, than in the other farming systems. However, we note the bulk of the sample is classified as highland mixed farming system (Table 2). Finally, Yw varied between 8.3 t ha−1 in South Gonder (moist sub-afroalpine areas) and in East Showa (sub-moist and sub-humid highlands) up to 10.5 t ha−1 in West Arsi (humid and sub-humid highlands) and Gurage (sub-humid highlands; Fig. 3A and B).
The actual yield reported in the household survey at national level was 2.1 and 1.7 t ha−1 in 2009 and 2013, respectively. We acknowledge yield progress may have occurred since then but our analysis is still relevant as it focuses on comparisons between farms and regions. Wheat yield gaps were mostly attributed to the technology yield gap (> 50% of Yw) but narrowing efficiency and resource yield gaps can still double Ya (Fig. 3). This was true for most agro-ecological zones, administrative zones, or farming systems. The efficiency yield gap was on average 10% of Yw and did not differ much between agro-ecological zones (7.8-10.7% of Yw, Fig. 3D), administrative zones (7.5-11.4% of Yw, Fig. 3E), or farming systems (10.1-10.5% of Yw, Fig. 3F). The resource yield gap was on average 15% of Yw and was smallest in the highland agro-ecological zones (SH2, H2, and SM2, 9.3-13.1% of Yw) and greatest in the SM3, M2, M3, and SH1 agro-ecological zones (> 15% of Yw, Fig. 3D). In terms of administrative zones, the resource yield gap was smaller than 10% of Yw in West Arsi, North Gonder, and Gurage and above 20% of Yw in West Showa, South Wollo, and North Wollo (Fig. 3E). The resource yield gap was negligible for maize mixed farming systems and ca. 15% of Yw for highland mixed and highland perennial farming systems (Fig. 3F). These efficiency and resource yield gaps seem to be small when expressed in relation to Yw but we note they are far from insignificant when compared to Ya (Fig. 3). Finally, high seed rates and weed control together with N application rates of 150, 250, and 350 kg N ha−1 resulted in an average yield gap closure to 50, 60, and 70% of Yw, respectively. For instance, an application of 350 kg N ha−1 increased wheat yields to ca. 80% of Yw in sub-afroalpine agro-ecological zones (H3 and SM3; Fig. 3D) and to 60-75% of Yw in a number of administrative zones (Fig. 3E) and highland mixed farming systems (Fig. 3F).
In summary, fine-tuning current crop management practices can deliver the additional production needed to reach wheat self-sufficiency without expanding wheat area. However, further narrowing yield gaps towards Yw requires inputs and technologies currently lacking in highest-yielding fields. Technologies currently not used by many farmers include, for instance, mechanisation of land preparation, planting, and harvesting operations, effective control of pests, diseases, and weeds, and the application of nutrients other than N and P. Efficiency and resource yield gaps as large as current Ya and technology yield gaps as large as 50% of Yw have also been reported in other studies on wheat yield gaps in Ethiopia (Silva et al. 2019a) and Rwanda (Baudron et al. 2019), and on maize yield gaps in Ethiopia (Assefa et al. 2020) and Tanzania (van Dijk et al. 2017).
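The statement that closing the efficiency and resource yield gaps can roughly double Ya follows from the rounded national averages reported above, all expressed as a percentage of Yw. The arithmetic below is a back-of-the-envelope check under that rounding, not an additional result:

```latex
\[
\frac{Y_a + \text{Eff. Yg} + \text{Res. Yg}}{Y_a}
\approx \frac{21\% + 10\% + 15\%}{21\%}
= \frac{46}{21}
\approx 2.2
\]
```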

Production frontier and yield variability
The magnitude, sign, and significance of the first-order terms of growth-defining, -limiting, and -reducing factors on wheat yields were consistent between the Cobb-Douglas and translog stochastic models fitted to the pooled sample (Table S1). Regarding the second-order terms, the translog model revealed positive quadratic effects of seed rate, N rate and herbicide use, a negative interaction between seed and N rates, and positive interactions between temperature seasonality and seed rate and hand-weeding, and between available water and herbicide use (Table S1).
Wheat yields decreased with increased growing degree days and temperature seasonality, after controlling for other factors, and there were no significant differences across varieties and years (Table 3). There was a negative effect of aridity index on wheat yields, a result also found for maize yields in Ethiopia (Assefa et al. 2020). Seed rates had a significant positive effect on wheat yields and increasing the former by 1% resulted in ca. 0.10% increase of the latter. This positive association between plant population and wheat yields was also documented for wheat in Rwanda (Baudron et al. 2019). Crop establishment remains a challenge in smallholder conditions due to manual sowing, which leads to large variation in sowing depths and heterogeneous plant populations across the field. Plots where water logging or drought were reported by the farmer yielded 35-45% less than plots where these were not reported, and plots with deeper soils yielded ca. 8% more than plots with medium or shallow soil depths. Frequent ploughing was found to increase wheat yields using the same household survey data (Abro et al. 2018) but in our analysis, this did not translate into significantly greater wheat yields (Table 3). We note the analysis of Abro et al. (2018) focused exclusively on investigating the effect of ploughing frequency on wheat yields while our analysis investigates a broader range of biophysical and management drivers, which may capture yield variation otherwise attributed to ploughing frequency.
There was a clear yield response to N across models estimated for the pooled sample and for specific administrative regions: on average, wheat yields increased by ca. 0.27% with 1% increase in N applied (Table 3). Earlier studies also identified N fertilisation as a key determinant of wheat yields in Ethiopia (Habte et al. 2014;Tanner et al. 1993) but further research should investigate whether increasing N rates is economically viable for smallholders (cf. van Dijk et al. 2017). High fertiliser prices were identified as an important constraint to increase fertiliser access and use by wheat smallholders in Ethiopia (Anteneh and Asrat 2020). In addition to profitability, smallholders' decisions to apply fertiliser also depend on the area share of each crop, household wealth, access to rental land, and the level of land fragmentation (Yu and Nin-Pratt 2014). At regional level, the distance from the input distribution to the farmer was found to increase the price of mineral fertilisers as a result of greater transaction and transportation costs (Minten et al. 2013). Overcoming these constraints at farm and regional levels thus remains important to increase access to and use of fertilisers in the country.
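Because the model is log-linear, the N coefficient can be read directly as an elasticity. As a short worked illustration, assuming the average elasticity of ca. 0.27 and holding all other inputs constant, the yield ratio between two N rates N_1 and N_2 is given below; the second line applies this to a doubling of the N rate, which is an extrapolation that assumes the elasticity stays constant over that range.

```latex
\begin{align*}
\frac{y_2}{y_1} &= \left(\frac{N_2}{N_1}\right)^{\beta_N},
\qquad \beta_N \approx 0.27
\quad\Rightarrow\quad 1.01^{0.27} \approx 1.0027 \;\; (+0.27\%)\\
\frac{y_2}{y_1} &= 2^{0.27} \approx 1.21 \;\; (\text{ca. } +21\% \text{ for a doubling of the N rate})
\end{align*}
```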
Fertile plots yielded 6% and 16% more than medium and poor fertile plots, respectively. No significant yield differences were observed between plots with and without legumes as preceding crop, which is not in agreement with earlier empirical findings (Taa et al. 2004), nor between plots with and without manure application. This may be due to the heterogeneity in manure management and legume productivity and residue management between farms and to the relatively low number of fields with manure use reported (n = 534) and legumes recorded as previous crop (n = 840). Finally, herbicide use was positively associated with wheat yields (but the effects were small), pesticide use translated into 12% greater wheat yields, and disease occurrence reduced wheat yields by ca. 30%. There were no positive significant effects of hand-weeding on wheat yields as this operation might be done after the crop suffers from severe competition from weed as a result of labour shortages or inconvenient working days during more critical periods of the growing season.
The stochastic frontier model with a Cobb-Douglas functional form was fitted to a subset of the data for the administrative zones West Arsi, North Showa, East Gojam, and South Wollo (Table 3). The results obtained for these administrative zones were largely consistent with the results of the national analysis reported above, particularly for seed rate (only non-significant in West Arsi), N application rate (strongly positive in all zones), occurrence of drought (strongly negative in all zones), and occurrence of diseases (strongly negative in all zones). The most notable difference between the national and regional analyses was that the significance of biophysical variables (e.g. growing degree days, temperature seasonality, and aridity index) observed in the former tended to disappear in the latter and wheat yield responses to N were largest in North Showa. This is expected as the pooled sample used in the national analysis exhibits greater variation in biophysical conditions between households compared to the subset used for the regional analyses.

Table 3 Parameter estimates of the stochastic frontier model estimated for wheat-based farming systems in Ethiopia (Meher seasons of 2009 and 2013; Eq. 1). The same model was fitted to the pooled sample (Ethiopia) and to selected administrative zones (West Arsi, North Showa, East Gojam, and South Wollo). Significance is indicated by the codes: '***' 0.1%, '**' 1%, '*' 5%, '#' 10%. n.a. not applicable
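As a reading aid for the percentage effects reported above, the sketch below shows how coefficients of a log-linear (Cobb-Douglas) model are commonly converted into percentage yield changes. It assumes the dependent variable is ln(yield), that binary variables such as pesticide use enter as 0/1 dummies, and that continuous inputs enter in logs. The function names are hypothetical, the coefficients are back-calculated from the percentages quoted in the text rather than read from Table 3, and the N elasticity of 0.2 is purely illustrative.

```python
import math


def dummy_effect_pct(coef):
    """Percentage yield change implied by a 0/1 variable's coefficient
    when the dependent variable is ln(yield)."""
    return (math.exp(coef) - 1.0) * 100.0


def coef_for_effect(pct):
    """Inverse of dummy_effect_pct: coefficient implied by a percentage effect."""
    return math.log(1.0 + pct / 100.0)


def input_effect_pct(elasticity, input_change_pct):
    """Percentage yield change for a percentage change in a continuous input,
    given its Cobb-Douglas elasticity (the coefficient on ln(input))."""
    return ((1.0 + input_change_pct / 100.0) ** elasticity - 1.0) * 100.0


# ASSUMPTION: back-calculated from the percentages in the text, not from Table 3.
pesticide_coef = coef_for_effect(12.0)   # pesticide use: 12% greater yields
disease_coef = coef_for_effect(-30.0)    # disease occurrence: ca. 30% lower yields

# ASSUMPTION: hypothetical N elasticity of 0.2, for illustration only.
n_effect = input_effect_pct(elasticity=0.2, input_change_pct=10.0)

print(f"implied pesticide coefficient: {pesticide_coef:.3f}")
print(f"implied disease coefficient: {disease_coef:.3f}")
print(f"yield change for 10% more N at elasticity 0.2: {n_effect:.2f}%")
```

The conversion matters mostly for large effects: for a coefficient near zero the percentage effect is close to 100 times the coefficient, whereas a 30% yield reduction corresponds to a coefficient that is noticeably larger in absolute value than 0.30.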

Resource yield gap and yield response to inputs
Y HF were 3.4 t ha−1 in South Wollo, 3.1 t ha−1 in East Gojam, 4.2 t ha−1 in North Showa, and 4.5 t ha−1 in West Arsi (Fig. 4A and Table 2). Y AF and Y LF were higher in West Arsi (2. (1.6 and 0.8 t ha−1), and North Showa (1.5 and 0.4 t ha−1) and lower in South Wollo (1.2 and 0.4 t ha−1). Y HF were associated with significantly greater seed and N application rates compared to Y AF and/or Y LF across all four administrative zones (Fig. 4B and C). Seed rates in highest-yielding fields were ca. 250 kg ha−1 in North Showa, East Gojam, and South Wollo, which was significantly greater than the average 180 kg ha−1 used in average- and lowest-yielding fields. The variation in seed rates between field classes was smaller (and not significant) in West Arsi compared to other administrative zones: ca. 220 and 190 kg ha−1 in highest- and lowest-yielding fields, respectively. N application rates in highest-yielding fields were ca. 90 kg N ha−1 in North Showa, East Gojam, and South Wollo, which was significantly greater than the ca. 60 kg N ha−1 used in average-yielding fields in North Showa and East Gojam, the ca. 45 kg N ha−1 used in South Wollo, and the ca. 30 kg N ha−1 (60 kg N ha−1) observed across the lowest-yielding fields in North Showa and South Wollo (East Gojam).
Labour use for land preparation, sowing, hand-weeding and harvesting was significantly greater for Y HF than for Y AF and Y LF in all administrative zones except West Arsi (Fig. 4D and E). Significant differences in herbicide use across groups were only observed in West Arsi and North Showa (Fig. 4F). As an example, highest-yielding fields were associated with a total labour use of ca. 140, 120, and 100 person-day ha−1 in South Wollo, North Showa, and East Gojam, respectively, while labour use ranged between 60 and 80 person-day ha−1 in the lowest-yielding fields of these zones. Considerably more labour was used in North Showa, East Gojam, and South Wollo than in West Arsi and there was an inverse relationship between labour use for hand-weeding and herbicide use (Fig. 4E and F). This is best seen in West Arsi where herbicide use was greatest (ca. 0.8 L ha−1) and labour for hand-weeding was lowest (ca. 12 person-day ha−1), which is consistent with data from an independent household survey conducted in 2012 in the same region and analysed by Silva et al. (2019a).
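The comparison of input use across field classes can be illustrated with a minimal sketch of how highest-, average-, and lowest-yielding fields might be derived from field-level data. It assumes that the highest- and lowest-yielding classes are the top and bottom 10% of fields ranked by yield; that threshold is not stated in this section. The function names, the yields, and the N rates are all hypothetical.

```python
from statistics import mean


def classify_fields(fields):
    """Split fields into lowest- (LF), average- (AF) and highest-yielding (HF) classes.

    ASSUMPTION: LF and HF are the bottom and top 10% of fields ranked by yield.
    """
    if len(fields) < 3:
        raise ValueError("need at least three fields to form three classes")
    ordered = sorted(fields, key=lambda f: f["yield"])
    k = max(1, len(ordered) // 10)  # number of fields in each tail
    return {
        "LF": ordered[:k],
        "AF": ordered[k:len(ordered) - k],
        "HF": ordered[len(ordered) - k:],
    }


def class_means(fields, key):
    """Mean of one variable (e.g. yield or N rate) per field class."""
    classes = classify_fields(fields)
    return {name: mean(f[key] for f in members) for name, members in classes.items()}


# Hypothetical yields (t/ha) for 20 fields in one zone; not data from the study.
yields = [0.4, 0.6, 0.9, 1.1, 1.2, 1.3, 1.4, 1.5, 1.6, 1.7,
          1.8, 1.9, 2.0, 2.1, 2.3, 2.5, 2.8, 3.1, 4.0, 4.4]
# Hypothetical N rates (kg N/ha), constructed only so that the sketch has an input to compare.
fields = [{"yield": y, "n_rate": 20.0 + 15.0 * y} for y in yields]

yield_means = class_means(fields, "yield")
n_means = class_means(fields, "n_rate")
for name in ("HF", "AF", "LF"):
    print(f"{name}: yield {yield_means[name]:.2f} t/ha, N rate {n_means[name]:.1f} kg N/ha")
```

Ranking by yield first and then averaging inputs within each class mirrors the structure of the comparison reported above, but significance testing between classes is outside the scope of this sketch.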

Technology yield gap and increased amounts of inputs
The simulated Yw in Arsi administrative zone was 8.6 t ha−1 in 2009 and 9.7 t ha−1 in 2013, which was considerably greater than the values observed for Y HF during the same years (Figs. S5A and S5B). The variety trials described by Bezabih et al. (2018) were conducted at Kulumsa Agricultural Research Center (KARC), Arsi administrative zone, in 2016 and 2017 and simulated Yw was 9.3 and 9.1 t ha−1 in these years, respectively. The yields observed in these trials ranged between 4.9 and 7.7 t ha−1 in 2016 and between 4.4 and 7.0 t ha−1 in 2017 (Figs. S5C and S5D). Despite differences in varieties used in the highest-yielding fields and the variety trials (data not shown), most of the varieties cultivated in highest-yielding fields were improved genotypes bred at KARC with parental material from CIMMYT (Fig. S4). This means farmers use improved varieties that can reach up to ca. 80% of Yw on-station and, hence, that the technology yield gap is likely caused by other factors than lack of improved varieties.
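The "ca. 80% of Yw" figure can be checked with simple arithmetic on the values quoted above. The sketch below divides the highest trial yield in each year by the simulated Yw for that year; the upper value for 2017 is read as 7.0 t ha−1, and the function and variable names are hypothetical.

```python
def share_of_yw(yield_t_ha, yw_t_ha):
    """Yield expressed as a percentage of the water-limited yield (Yw)."""
    return 100.0 * yield_t_ha / yw_t_ha


# Values quoted in the text: simulated Yw and the highest trial yield per year (t/ha).
trials = {
    2016: {"yw": 9.3, "best_trial_yield": 7.7},
    2017: {"yw": 9.1, "best_trial_yield": 7.0},
}

shares = {
    year: share_of_yw(values["best_trial_yield"], values["yw"])
    for year, values in trials.items()
}
for year, share in shares.items():
    print(f"{year}: best trial yield reached {share:.0f}% of Yw")
```

Both years fall within a few percentage points of 80%, which is the basis for the statement that improved varieties can reach up to ca. 80% of Yw on-station.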
The low amount of inputs (particularly seeds and fertilisers) used in highest-yielding fields compared to what is needed to reach Yw (Fig. 4C and Section 2.3.3) and the lack of certain inputs and technologies in these fields are the most likely drivers of the technology yield gap of wheat across Ethiopia. The former is reflected by the difference between the feasible yield (Yf) and Y HF, and it is depicted in Fig. 3 as the additional resource yield gap for increasing amounts of N applied (150, 250, and 350 kg N ha−1). For instance, combining high seed rates with intensive weeding practices and 150 kg N ha−1 can increase wheat yields up to an average Yf of 4.6 t ha−1. This corresponds to a technology yield gap of ca. 47% of Yw. Applying N rates of 250 and 350 kg N ha−1 results in average Yf of 5.3 and 5.8 t ha−1, which reduces the technology yield gap to about 39% and 33% of Yw, respectively. We note the aforementioned N application rates are well above those currently observed in highest-yielding fields (Fig. 4C) but these high N rates could be reduced if efficient N management practices are adopted (cf. Assefa et al. 2020; ten Berge et al. 2019). Another point of concern is that most wheat in Ethiopia is currently cultivated on acid soils, where yield responses to N are not always clear and where increasing the N rates further aggravates soil acidity and lowers wheat yield (Regassa and Agegnehu 2011). This clearly indicates that the scope to increase fertiliser rates is context specific and needs to be integrated with other soil management practices and excellent agronomy. Other factors explaining this yield gap may include poor crop establishment and poor weed control, which currently rely heavily on draught power and manual labour, and poor pest and disease control (as partly shown in Table 3).
It is also important to consider that row planting improves radiation interception under high seed rates (high plant populations) as compared to the current farmer practice of broadcasting (Alemu et al. 2014).
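The relation between the feasible yields and the technology yield gap shares quoted above can be reproduced with a short calculation. The average Yw underlying those shares is not stated in this section, so the sketch assumes a value of 8.7 t ha−1, back-calculated from the statement that a Yf of 4.6 t ha−1 leaves a gap of ca. 47% of Yw. The constant and function names are hypothetical.

```python
def technology_gap_pct(yw_t_ha, yf_t_ha):
    """Remaining gap between Yw and the feasible yield Yf, as a percentage of Yw."""
    return 100.0 * (yw_t_ha - yf_t_ha) / yw_t_ha


# ASSUMPTION: average Yw (t/ha) back-calculated from the text, not reported in this section.
ASSUMED_YW = 8.7

# Average feasible yields (t/ha) quoted in the text for each N rate (kg N/ha).
feasible_yields = {150: 4.6, 250: 5.3, 350: 5.8}

gaps = {
    n_rate: technology_gap_pct(ASSUMED_YW, yf)
    for n_rate, yf in feasible_yields.items()
}
for n_rate, gap in gaps.items():
    print(f"{n_rate} kg N/ha: Yf = {feasible_yields[n_rate]} t/ha, gap = {gap:.0f}% of Yw")
```

Under this assumed Yw the three N rates give gaps of roughly 47%, 39%, and 33% of Yw, matching the shares in the text. The diminishing return is also visible: the first additional 100 kg N ha−1 raises Yf by 0.7 t ha−1 and the second by 0.5 t ha−1.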

Crop diversity at farm level
The total cultivated land area per farm was on average 2 ha in West Arsi and North Showa and 1.6 and 1.4 ha in East Gojam and South Wollo, respectively (Figs. 5 and S7). The allocation of this land among crops differed between administrative zones. The share of wheat in the total cultivated land was high in South Wollo and West Arsi, on average ca. 45%, and low in North Showa (35%) and East Gojam (25%). In West Arsi, households allocated 47% and 6% of their cultivated land to other cereals (mostly barley) and to legumes (mostly faba bean), respectively (Fig. 5A). In North Showa, the share of other cereals (mostly barley and red tef) and legumes (mostly faba bean) of the total cultivated land was ca. 37% and 27%, respectively (Fig. 5B). In East Gojam, around 60% of the cultivated land was allocated to other cereals (mostly red and white tef) and only ca. 10% was cultivated with legumes (Fig. 5C). In South Wollo, both other cereals and legumes were cultivated on ca. 25% of the total cultivated land (Fig. 5D). This indicates farms in North Showa and East Gojam are more diverse regarding the crop types cultivated than farms in West Arsi and South Wollo (Fig. 4D and E). The different crop types are known to compete for labour during key periods of the growing season (Silva et al. 2019a), but we were not able to find clear substitution or competition for land and labour between wheat and other crops, possibly due to a lack of information on the timing of different operations (Fig. S6). Further research is thus needed to clarify the importance of wheat as a source of income, and priority for investment, in the more diversified farming systems.

Availability of land, labour, and capital
Oxen ownership was associated with slightly greater wheat yields in West Arsi, North Showa, and East Gojam but the effects were only significant in East Gojam (Fig. 6A). In addition, households with more oxen pairs tended to cultivate larger wheat areas than households with few oxen pairs (Fig. 6B). This was particularly true in West Arsi and North Showa, where land is more 'abundant' (Fig. 5A and B), and less so, and not significantly, in East Gojam and South Wollo, where land is constrained (Fig. 5C and D). No significant differences in total labour use for wheat were observed for different levels of oxen ownership in any zone (Fig. 6C); hence, oxen ownership did not translate into labour savings per unit land or into substitution of manual labour by draught power. The economic value of farm assets increased on average with increasing oxen ownership, which was particularly clear in West Arsi and North Showa (Fig. 6D). No major significant differences in input use for wheat were observed across different levels of oxen ownership in any zone (Fig. S8). In summary, oxen ownership was a proxy for draught power and capital availability and was associated with larger wheat area, particularly in the administrative zones with largest cultivated area per household (i.e. West Arsi and North Showa; Silva et al. 2019a). Hence, access to draught power and capital translates into increases in wheat production through expansion of cultivated land and not so much through intensification of wheat production via yield gap closure (Fig. 6A and B).

Comparisons across administrative zones
The four administrative zones analysed in greater depth in this study capture differences in the level of intensification of wheat production (Fig. 4) and in farming systems regarding the crop area shares and oxen ownership (Fig. S1). Wheat yields were greatest in West Arsi, intermediate in North Showa and East Gojam, and smallest in South Wollo, while the opposite was true for labour use for wheat (both total and hand-weeding; Fig. 4). West Arsi is distinct from the other zones mostly because herbicides are widely used, substituting labour for hand-weeding and possibly other inputs (e.g. N) in the short term (Fig. 4C and E). We also note that the cultivated area per farm is greatest in West Arsi and smallest in South Wollo, which may explain the use of herbicides in the former and the heavy reliance on human labour in the latter (Fig. 5). Finally, there was a positive relationship between the number of pairs of oxen (a proxy for capital availability) and the wheat area cultivated per farm in West Arsi and North Showa, the zones where cultivated land per farm was greatest, while no relationship was observed between the number of pairs of oxen and the input use for wheat (Fig. 6). This means that increases in wheat production are mostly obtained through increases in cultivated areas rather than through yield gap closure and that households with more capital do not necessarily use more inputs for wheat. These results suggest that smallholders do not have proper access to inputs because these are either too expensive or unavailable when needed, which can be identified as the main challenge for intensification of wheat production and the achievement of wheat self-sufficiency in Ethiopia without expansion of wheat area.

Conclusion
Wheat yields across farmers' fields in Ethiopia were only up to ca. 20% of the water-limited yield potential, the benchmark for what can be achieved with best agronomic practices under rainfed conditions. Most of the yield gap was attributed to the technology yield gap, meaning that certain inputs and technologies are entirely lacking in highest-yielding fields (such as technologies for optimal crop establishment and for control of pests, diseases, and weeds) and that the current input levels are not high enough to reach the water-limited yield. Despite their small share in explaining the wheat yield gap, narrowing the efficiency and resource yield gaps can nearly double actual yields and contribute to realising the yield progress needed to achieve wheat self-sufficiency in Ethiopia without having to expand the wheat area. However, achieving this requires increases in input use to the levels observed in highest-yielding fields and fine-tuning current crop management practices in relation to the time, space, and form of the inputs used. Wheat is cultivated in Ethiopia alongside other cereal and legume crops, particularly in North Showa and East Gojam and to a lesser extent in West Arsi and South Wollo. This diversity is important for food and nutrition security at household level. We found no clear evidence of substitution of or competition for land and labour between wheat and other crops. Yet, this finding merits further research because smallholders are known to operate under resource constraints and the different crops compete for labour in key periods of the growing season. Our results also indicate that households with more access to draught power and capital are not investing more in inputs per hectare of wheat but rather in allocating more land to wheat. This was particularly true in zones where the total cultivated land per household is relatively high, such as West Arsi and North Showa.
Fig. 6 Relationship between number of pairs of oxen owned by households and (A) wheat yields, (B) wheat cultivated area, (C) labour use for land preparation, sowing, hand-weeding, and harvesting of wheat, and (D) farm assets owned by households for selected administrative zones in Ethiopia (West Arsi, North Showa, East Gojam, and South Wollo). For each zone, lower-case letters depict significant differences between groups at 5% significance level
Here we show for the first time that it is possible to achieve wheat self-sufficiency in Ethiopia with current technologies (e.g. varieties) but that greater amounts, and more efficient use, of inputs are needed to do so. Narrowing technology yield gaps is also essential given the rapidly increasing demand for cereals due to population growth and dietary change (van Ittersum et al. 2016). Government policies aiming to increase wheat production should focus on fostering the accessibility and affordability of inputs, particularly fertilisers, and on promoting technologies that allow for a more precise management of these inputs (e.g. mechanisation and herbicides). This will also be essential to avoid the environmental externalities of intensification. Such policies also need to consider that wheat is one of the many crops cultivated by farmers, whose livelihoods should not be forgotten. It is thus important to understand whether or not narrowing yield gaps towards 80% of Yw is desirable from, e.g. an economic, environmental, or labour productivity perspective under prevailing conditions.