1 Introduction

Livelihoods of smallholder farmers in sub-Saharan Africa (SSA) are under pressure. Many are caught in a poverty trap, a vicious cycle of low productivity and lack of opportunities and incentives to invest in agricultural inputs (Tittonell and Giller 2013; Koning 2017). Additionally, constraints such as small farm sizes, limited market access, and a changing climate require considerable changes in current farming systems (Giller 2020). Sustainable intensification of farming is seen as a key strategy to enhance rural livelihoods in SSA (Vanlauwe et al. 2014; Jayne and Sanchez 2021). Sustainable intensification aims to enhance production per unit land, nutrient, and labor input, while reducing environmental damage, building resilience and natural capital, and securing environmental services (e.g., Pretty et al. 2011; The Montpellier Panel 2013). Struik and Kuyper (2017) argue that the concept of sustainable intensification can be used as a “process of inquiry and analysis” and discuss how the social and economic dimensions of sustainability can be included. Such a broad view enables identification of trade-offs that arise when agricultural systems intensify. Using a diverse set of indicators to describe these trade-offs can inform decision-making by society and policy makers (Struik et al. 2014; Struik and Kuyper 2017).

However, increasing yields through sustainable intensification is challenging in SSA (Schut and Giller 2020) and alternative pathways are often more apparent. For instance, extensification is currently more common than intensification in many regions of SSA (Baudron et al. 2012; Ollenburger et al. 2016). Continued extensification is associated with soil nutrient mining, and this trend could be reversed by strongly increasing nutrient inputs (Giller et al. 2021). However, this is constrained by widespread poverty traps (Tittonell and Giller 2013; Koning 2017) and the relatively low economic benefits of staple crop intensification in practice (Bonilla-Cedrez et al. 2021). Indeed, current trends show an increase in the area under maize cultivation in SSA (van Loon et al. 2019; Santpoort 2020), which historically has been linked to an increasing population, increasing food requirements and urbanization (Smale and Jayne 2003), and hence increasing land pressure (Crowley and Carter 2000). Although specialization towards maize favors the production of sufficient energy, diversified cropping systems would be more sustainable in terms of income, nutrition, crop yields, and risk spreading (Vanlauwe et al. 2019). Hence, identification of constraints and opportunities is essential to support desired pathways such as diversification and intensification.

Setting sustainable intensification as an overall goal for smallholder farming systems results in multiple subsidiary goals, e.g., increased yields, desired N use efficiencies, and food self-sufficiency at household and national level. Attaining all goals simultaneously is virtually impossible as trade-offs exist (Klapwijk et al. 2014; Vanlauwe and Dobermann 2020). Moreover, farmers follow their own objectives and prioritize some goals over others. Some goals also require time before they can be attained (Vanlauwe et al. 2010) and outcomes may differ between seasons, requiring assessment over multiple seasons, which is rarely done (Smith et al. 2017). Measuring progress towards the multiple goals of sustainable intensification requires a multi-criteria assessment of indicators associated with the principles of sustainability. Using a framework of principles and criteria warrants transparency and a justified selection of indicators (Florin et al. 2012). According to Florin et al. (2012, p.109), “Principles are the overarching (‘universal’) attributes of a system. Criteria are the rules that govern judgement on outcomes from the system and indicators are variables that assess or measure compliance with criteria.” Criteria can also help to decide upon benchmarks to judge whether a goal is reached (Schut et al. 2014). Within sustainable intensification of smallholder farming systems, criteria, indicators, and benchmarks need to address the field, farm, and household level. At national level, increasing yields to a certain threshold is required to attain food self-sufficiency, while at farm level cereal self-sufficiency is an important indicator that fits with farmers’ objectives.

Yield-increasing inputs required for sustainable intensification are beyond the reach of most smallholder farmers (Vanlauwe et al. 2010) and need incentives such as input subsidies. In the past 15 years, several fertilizer and seed subsidy programs were (re-)initiated by African governments (Jayne and Rashid 2013; Jayne et al. 2018), after their virtual absence during the 1990s and early 2000s (Martin and Anderson 2008). In addition, social enterprises, such as One Acre Fund (www.oneacrefund.org), provide inputs through credit schemes to smallholder farmers. Increased input use, however, also requires new knowledge (Jayne et al. 2019; Jayne and Sanchez 2021). In a large-scale subsidy scheme in Malawi, the limited extension provided by the government was seen as a possible cause for N use efficiencies to remain low (Dorward et al. 2008). In addition, fertilizers can be scarce and farmers may mistrust their quality (Michelson et al. 2021). Co-learning, an iterative learning framework involving farmers and researchers or extension workers, has proven to be successful in developing contextualized knowledge (Descheemaeker et al. 2019). We developed an integrated co-learning approach (Marinus et al. 2021), which aimed to sustainably increase farm level production by fostering increased input use through the provision of a voucher, in combination with knowledge co-creation (Fig. 1). In this paper, we apply a multi-criteria assessment over five seasons to analyze the outcomes of a co-learning program in relation to different agricultural development pathways.

Fig. 1

A farmer who took part in the integrated co-learning approach explains how she has used a new type of maize spacing to ensure increased light availability for the intercropped groundnut. Maize grew more vigorously due to increased fertilizer use as part of her intensification strategy. Moreover, by learning about maize-legume spacing options and new groundnut varieties she was able to increase the area of groundnut on her farm and thereby to also diversify her cropping system. Photographed by Wytze Marinus.

Our overarching aim was to improve the understanding of farmer responses to input subsidies and new knowledge, in order to better support desired agricultural development pathways in smallholder farming. This materialized in the following objectives: to (1) assess the effect of co-learning supported by a voucher for inputs on farmers’ decisions and management outcomes, by comparing it with a voucher-only approach; (2) analyze the above effects in terms of criteria and indicators that relate to agricultural development pathways; and (3) reflect on the pathways of intensification, extensification, specialization, and/or diversification resulting from the co-learning and voucher program.

2 Methodology

2.1 The integrated co-learning approach

We applied an integrated co-learning approach from August 2016 until July 2018, as described in detail by Marinus et al. (2021). The approach combined four complementary elements: input vouchers, an iterative learning process, common grounds for communication, and complementary knowledge. An input voucher of US$ 100, which aimed to alleviate resource constraints and increase input use, was provided each season to 47 farming households. Inputs for maize, groundnut, soybean, common bean, and sorghum production and for dairy were made available. Most inputs were offered from the first season on, while groundnut and (short duration) common bean seed and Imazapyr-treated maize seed against striga were added later during the program in response to feedback from the co-learning farmers. The feedback was central to an iterative learning process in which a co-learning workshop prior to each cropping season played a pivotal role. The focus of the workshops evolved over time, based on questions and feedback from farmers during the season as well as on topics identified by the researchers. Discussion topics during the workshops included the judicious use of mineral fertilizers and the cultivation of alternative crops such as legumes. Researchers monitored the farmers’ responses through a mid-season field survey, yield data collection, and an individual evaluation interview at the end of each season (see Marinus et al. 2021 for further details).

2.2 Research setup

The integrated co-learning approach was applied in two locations, Vihiga and Busia County in western Kenya. Vihiga is one of the most densely populated rural areas in SSA with 1050 people km−2, with small farm sizes of <0.5 ha. Busia is less densely populated with 530 people km−2, and somewhat larger farms of about 1.0 ha (Jaetzold et al. 2005; KNBS 2019). Both locations receive a rainfall of 1800–2000 mm year−1 and have a bi-modal rainfall pattern (Jaetzold et al. 2005), with the long-rain (LR) cropping season from March until June and the short-rain (SR) cropping season from September until November. Activities started in the SR season of 2016 and continued for five seasons until the SR season of 2018. Vihiga was selected as a location for its high population density, which commonly occurs in highland areas of East Africa. Busia was selected for its larger farm sizes compared with Vihiga, which could lead to more opportunities for increasing household income from farming.

In each county, Vihiga and Busia, two sub-locations were selected and in each of these locations 11–12 farmers were chosen. Farmers in one sub-location formed the co-learning group while a comparison group was formed in the other sub-location. The sub-locations were selected to have similar farming systems, yet be sufficiently far apart to avoid spillover effects. All farmers in the co-learning group received a voucher and took part in the co-learning activities. Those in the comparison group received only the input voucher. When inputs were added to the voucher based on feedback from the co-learning groups, these were added for the comparison group as well. A mid-season field monitoring survey included a visit by researchers to each field, including fields that were newly added during the program, to record the crops cultivated and the percentage intercropping. The farmer was asked about input use, planting dates, and other crop management practices. Field sizes were measured using a hand-held GPS before the start of the program in June 2016. Small fields with sides less than 20 m were measured by hand. Yield measurements were done in two 4 × 4 m (16 m2) quadrats in all fields containing maize, groundnut, soybean, and/or common bean. These crops together made up about 60–70% of the total cultivated area per farm. Fresh cob (maize) and pod (legumes) yields were measured in the field, with one sub-sample per quadrat taken to determine dry weight by oven drying. Dry weights were calculated back to a standardized moisture content of 14%, and the grain yield (kg ha−1) per field was calculated based on the average of the two quadrats. The detailed monitoring and measurement campaign during five seasons ensured a comprehensive assessment of changes in farm management over time. However, the limited number of farmers per sub-location precluded a formal statistical analysis.

Additionally, we compared the situation during the program with a baseline study from the two seasons before the program. The baseline study was held in the dry season, June 2016, before the start of the program. It used the detailed farm characterization survey methodology (Giller et al. 2011) to ask many questions relating to the household characteristics and the production system, including estimates of crop yields and input use in the previous two seasons. Field sizes for all fields in the farm were measured and farmer-reported data was used to derive crop production and input use. During the program, however, crop yields were measured, and farmer-reported input use was triangulated by comparing field and farm level application. Hence, the accuracy of the baseline study and the detailed monitoring during the program differs and this needs to be considered in the comparison.
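
The quadrat-based yield calculation in this section can be sketched as follows. This is an illustrative sketch under stated assumptions, not the procedure used in the study: the function names are hypothetical, the 14% moisture content is taken on a wet basis, and the fresh weight passed in is assumed to be grain already (the conversion from fresh cob or pod weight to grain is not described here and is left out).

```python
QUADRAT_AREA_M2 = 16.0      # one 4 x 4 m quadrat, from the text
STANDARD_MOISTURE = 0.14    # fraction of grain weight (wet basis, assumed)


def quadrat_yield_kg_ha(fresh_grain_kg, subsample_fresh_g, subsample_dry_g):
    """Grain yield of one quadrat in kg ha-1 at 14% moisture."""
    # Dry matter fraction from the oven-dried sub-sample
    dry_fraction = subsample_dry_g / subsample_fresh_g
    dry_grain_kg = fresh_grain_kg * dry_fraction
    # Recalculate dry weight to the standardized moisture content
    grain_at_standard_kg = dry_grain_kg / (1.0 - STANDARD_MOISTURE)
    # Scale from 16 m2 to one hectare (10,000 m2)
    return grain_at_standard_kg / QUADRAT_AREA_M2 * 10_000.0


def field_yield_kg_ha(quadrats):
    """Field yield as the mean of its quadrats (two per field in the text)."""
    yields = [quadrat_yield_kg_ha(*q) for q in quadrats]
    return sum(yields) / len(yields)


# Hypothetical field: (fresh grain kg, sub-sample fresh g, sub-sample dry g)
example_field = [(4.0, 200.0, 172.0), (6.0, 200.0, 172.0)]
example_yield = field_yield_kg_ha(example_field)
```

With these made-up numbers the sub-sample dry matter fraction happens to be 86%, so the fresh weights are already at the standard moisture content and the two quadrats correspond to 2500 and 3750 kg ha−1.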

2.3 The indicator framework: principles, criteria, and indicators

We used a multi-criteria assessment to analyze farmers’ decisions and management outcomes of the integrated co-learning program. Indicators were selected using principles and criteria (Table 1). We identified four principles of sustainable intensification of smallholder systems: productivity, food self-sufficiency, environmental protection, and economic viability. For each principle, one to four criteria and indicators were identified. The yield-related indicators and food self-sufficiency focused on maize, which was the most important crop in terms of food and sale with nearly all households cultivating maize every season.

Table 1 Indicators for agricultural development, organized according to principles (in italics) and criteria. The third column identifies other principles under which an indicator may also fit. Yw water-limited yield potential, AE adult equivalent; improved maize variety: all varieties that were not “Local OPVs” (hybrid varieties, improved varieties, and improved open-pollinated varieties); PPP purchasing power parity.

We present the indicator values at the start and the end of the program in a spider web diagram, to assess possible pathways related to agricultural development. Indicators that were identified for intensification and extensification and for diversification and specialization, indicated with a * in Table 1, were included in the spider web diagram. In Sections 2.3.1–2.3.4, we describe the link for each of the indicators with their respective pathway. Those indicators were scaled using a 0 to 10 score based on specific benchmarks (described in Sections 2.3.1–2.3.4), with a larger score indicating a more sustainable situation. Linear interpolation was applied to the indicator values to score them between 0 and 10.
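
The 0 to 10 scoring by linear interpolation can be sketched as follows. This is a minimal illustration rather than the authors' implementation: the function name `linear_score` is hypothetical, and clipping values that fall outside the benchmark range is an assumption suggested by benchmarks phrased as "120 kg N ha−1 or more" and "25% or less" in the subsections that follow.

```python
def linear_score(value, zero_at, ten_at):
    """Map an indicator value to a 0-10 score by linear interpolation.

    zero_at: indicator value that scores 0
    ten_at:  indicator value that scores 10; it may be smaller than zero_at
             for indicators where less is better (e.g. maize area share)
    Values beyond either benchmark are clipped to 0 or 10 (assumption).
    """
    if zero_at == ten_at:
        raise ValueError("the two benchmarks must differ")
    fraction = (value - zero_at) / (ten_at - zero_at)
    return 10.0 * min(1.0, max(0.0, fraction))


# Benchmarks as given in the indicator descriptions; input values are made up
n_rate_score = linear_score(60, zero_at=0, ten_at=120)      # kg N ha-1
maize_area_score = linear_score(50, zero_at=75, ten_at=25)  # % of farm area
legume_area_score = linear_score(45, zero_at=0, ten_at=30)  # % of farm area
```

Passing the benchmarks as arguments keeps one function usable for indicators where a larger value is better and for those, such as the maize share, where a smaller value is better.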

2.3.1 Productivity

Reducing yield gaps

Maize grain yield (kg ha−1) was measured in all maize fields, both monocropped and intercropped. A farm-level, weighted average maize grain yield was calculated based on the area of each maize field. The yield benchmark (score 10) was 50% of the season-specific, water-limited yield potential in western Kenya, a yield target required to attain national or regional food self-sufficiency (van Ittersum et al. 2016). The average water-limited yield potentials were calculated with a crop growth simulation model (hybrid-maize) using long-term weather data. They were 12.5 Mg ha−1 and 8.0 Mg ha−1 for the long- and the short-rain cropping seasons respectively (GYGA 2020). The score was set to zero at a maize yield of 0 kg ha−1. In addition, 80% of the water-limited yield potential was used as a benchmark for the maximum attainable yield and 15% as the low baseline found for current yields in SSA (van Ittersum et al. 2016). Using these seasonal average yield potentials is a simplification of what is possible in the region, on average, as the water-limited yield potential varies from season to season and from farm to farm. This should be considered when evaluating the results against the benchmarks.
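
As a worked illustration of the yield benchmarks and the area-weighted farm yield, consider the sketch below. The function names and the example fields are hypothetical; only the yield potentials and the 15%, 50%, and 80% levels come from the text.

```python
# Seasonal water-limited yield potential (Yw) in kg ha-1, from the text
YW_KG_HA = {"LR": 12_500.0, "SR": 8_000.0}


def yield_benchmarks(season):
    """Return the 15%, 50% and 80% Yw benchmarks for a season."""
    yw = YW_KG_HA[season]
    return {"low": 0.15 * yw, "target": 0.50 * yw, "attainable": 0.80 * yw}


def farm_yield(fields):
    """Area-weighted mean yield; fields is a list of (area_ha, yield_kg_ha)."""
    total_area = sum(area for area, _ in fields)
    if total_area <= 0:
        raise ValueError("total maize area must be positive")
    return sum(area * y for area, y in fields) / total_area


def yield_score(farm_yield_kg_ha, season):
    """Score 0 at 0 kg ha-1 and 10 at 50% of Yw; clipped at 10 (assumption)."""
    target = yield_benchmarks(season)["target"]
    return 10.0 * min(1.0, max(0.0, farm_yield_kg_ha / target))


# Hypothetical farm with two maize fields in a short-rain season
fields = [(0.10, 3000.0), (0.30, 1000.0)]
average = farm_yield(fields)          # 600 kg from 0.4 ha
score = yield_score(average, "SR")    # relative to the 4000 kg ha-1 target
```

For the long rains the unrounded benchmarks are 1875, 6250, and 10,000 kg ha−1.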

Yield can be increased by using improved varieties. All varieties that were not local open-pollinated varieties (OPVs) were classified as “improved” varieties. These include hybrid varieties and improved OPVs. The benchmark score was 0 at no use of improved varieties and 10 if 100% of the maize area was sown with improved varieties. Mineral N application rates on maize were scored at 0 if no N fertilizer was applied and 10 if the mineral N application rate on maize was 120 kg N ha−1 or more. The above three indicators, associated with reducing the yield gap, were used as indicators for the pathway of intensification.

Food production

Maize is representative of the food produced at farm level and in principle available for home consumption. The total maize production at farm level (kg) was calculated from maize yield and maize area for each season.

2.3.2 Food self-sufficiency

Maize self-sufficiency

Maize self-sufficiency was considered an indicator for food production, as maize self-sufficiency was reported to be an important production objective by participating farmers (Marinus et al. 2021). Maize self-sufficiency may also be a prerequisite before farmers start to consider other changes in their farm towards sustainable intensification, e.g., diversification into legumes. Maize self-sufficiency at household level (−) was calculated as the total maize production at farm level per season (kg) divided by the maize requirements per household per season (kg). The seasonal maize requirement was calculated as the annual requirement multiplied by the proportional contribution of seasonal maize production to the annual production. The annual household requirements were calculated from the number of adult male equivalents (AMEs) per household and the energy requirements of an active male, 2500 kcal day−1 (FAO/WHO/UNU 2001). The number of AMEs per household was based on the family composition during the 2018SR, whereby a female was equivalent to 0.82 AME and children (0–18 years) 0.75 AME (FAO/WHO/UNU 2001). The maize requirements were 260 kg AME−1 year−1, based on an energy content of maize grain of 3500 kcal kg−1 (Lukmanji et al. 2008).
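
The self-sufficiency calculation can be followed step by step in the sketch below. The function names and the example household are hypothetical. The energy values and AME weights are those given above, and an adult male is taken as 1.0 AME by definition.

```python
KCAL_PER_AME_DAY = 2500.0     # energy requirement of an active adult male
KCAL_PER_KG_MAIZE = 3500.0    # energy content of maize grain
AME_WEIGHTS = {"male": 1.0, "female": 0.82, "child": 0.75}


def annual_requirement_kg_per_ame():
    """Maize needed per AME per year if all energy came from maize."""
    return KCAL_PER_AME_DAY * 365 / KCAL_PER_KG_MAIZE


def household_ame(males, females, children):
    """Number of adult male equivalents in a household."""
    return (males * AME_WEIGHTS["male"]
            + females * AME_WEIGHTS["female"]
            + children * AME_WEIGHTS["child"])


def seasonal_self_sufficiency(production_kg, season_share, ame):
    """Seasonal production divided by the seasonal requirement.

    season_share is the fraction of annual maize production that comes
    from this season; it scales the annual requirement to the season.
    """
    requirement_kg = annual_requirement_kg_per_ame() * ame * season_share
    return production_kg / requirement_kg


# Hypothetical household: 1 man, 1 woman, 3 children; 900 kg of maize in a
# season that contributes 60% of the household's annual maize production
ame = household_ame(males=1, females=1, children=3)
ratio = seasonal_self_sufficiency(900.0, season_share=0.6, ame=ame)
```

The annual requirement computed this way is about 260.7 kg per AME, which matches the rounded 260 kg AME−1 year−1 in the text. A ratio above one means the household produced more maize than it required in that season.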

2.3.3 Environmental protection

Nitrogen use efficiency and N surplus

Nitrogen (N) use efficiency of maize was calculated per season: the total N outputs in maize grain (kg N ha−1) divided by the N inputs on all fields with maize (kg N ha−1). N output was calculated using the farm-level weighted average maize grain yield and a fixed N content in maize grain of 1.54% (Njoroge 2019). A farm level weighted average for N inputs was calculated based on the mineral fertilizer used per field, as reported in the monitoring survey. N use efficiency was analyzed using the framework developed by the EU Nitrogen Expert Panel (2015), with a minimum and a maximum N use efficiency of 50% and 90% respectively and a maximum N surplus of 80 kg N ha−1. A N use efficiency below 50% or a N surplus above 80 kg N ha−1 indicated a high risk of N losses to the environment, while N use efficiencies above 90% indicated a high risk of soil mining. The framework also includes a general benchmark for a desired output of 80 kg N ha−1. We adjusted this benchmark to the N output at 50% of the water-limited yield potential, equivalent to 83 and 53 kg N ha−1 for the long-rain and the short-rain cropping seasons.
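
A sketch of how a farm could be placed within this framework is given below. It is an illustration under assumptions, not the authors' code. The target outputs of 83 and 53 kg N ha−1 are reproduced when the 1.54% N content is applied to grain dry matter (86% of grain at 14% moisture); that basis is inferred from those numbers rather than stated in the text. The order in which the conditions are checked and the function names are also assumptions.

```python
GRAIN_N_FRACTION = 0.0154      # N content of maize grain, from the text
DRY_MATTER_FRACTION = 0.86     # assumed: yields are reported at 14% moisture
TARGET_N_OUTPUT = {"LR": 83.0, "SR": 53.0}   # kg N ha-1 at 50% of Yw


def n_output_kg_ha(yield_kg_ha):
    """N removed in maize grain (kg N ha-1)."""
    return yield_kg_ha * DRY_MATTER_FRACTION * GRAIN_N_FRACTION


def classify(n_input_kg_ha, n_out_kg_ha, target_output_kg_ha):
    """Place one farm-season in the N use efficiency framework."""
    if n_input_kg_ha <= 0:
        return "no mineral N input, N use efficiency undefined"
    nue = n_out_kg_ha / n_input_kg_ha
    surplus = n_input_kg_ha - n_out_kg_ha
    if nue > 0.90:
        return "risk of soil N mining"
    if nue < 0.50 or surplus > 80.0:
        return "risk of N losses"
    if n_out_kg_ha < target_output_kg_ha:
        return "output below target"
    return "desired range"


# Hypothetical short-rain farm: 80 kg N ha-1 applied, 4500 kg ha-1 harvested
status = classify(80.0, n_output_kg_ha(4500.0), TARGET_N_OUTPUT["SR"])
```

In this example the N output is about 60 kg N ha−1, giving an N use efficiency of about 75% and a surplus of about 20 kg N ha−1, so the farm falls in the desired range.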

Crop area of maize and legumes

Assessing area per crop in smallholder farming is not straightforward as crops are commonly intercropped: e.g., maize is often intercropped with legumes such as common bean or soybean. Cultivated area per crop (ha) was calculated as the sum of the areas of all fields containing that crop and was used to calculate yields. The percentage farm area per crop (%) was calculated using the estimated percentage intercropping and the field area when comparing percentage areas of different crops. When analyzing maize alone, the percentage intercropping was not considered as, in most common maize-legume intercropping systems used by farmers in western Kenya, intercropping does not influence maize yield (Ojiem et al. 2014). The percentage farm area covered by maize was an indicator of specialization and by legumes of diversification. If the percentage maize was above 75% of the farm area, the score was 0 and if it was 25% or less it was 10. For legumes, the score was 0 if no legumes were present and 10 if they occupied more than 30% of the farm area.

2.3.4 Economic viability

Value of produce

Value of produce per crop was calculated for maize, common bean, groundnut, and soybean based on the total production per crop per season and the median crop price for 2018. Median prices were obtained through a weekly market survey after pooling the data from both locations as there were limited differences. Value of produce was expressed per adult equivalent per day based on the household composition in 2018 and season length. Input costs were not considered as these were largely covered by the voucher. The value of produce therefore paints a relatively optimistic picture and does not reflect farm profitability. In addition, seasonal and within-season price fluctuations were not considered, as this was not feasible for all crops and inputs. We used the poverty line for Kenya (World Bank 2015) and the living income for rural Kenya (Anker and Anker 2017) as benchmarks. Both were corrected for inflation, using 2018 as reference year, which was the same year as for the crop prices. Both the poverty line and the living income were expressed in $ purchasing power parity ($PPP) per adult equivalent per day, following OECD (2011) and Van de Ven et al. (2020). The value of produce of all crops combined was also expressed per hectare of farm land for each season. It was scored at 0 if the value of produce was 0 $PPP ha−1. The score of 10 was assigned to the 75th percentile of the value of produce obtained by all farmers in the short- and the long-rain cropping seasons, so it was a relative score based on the current production values. Value of produce was considered an indicator for intensification.

Risk spreading

Economic viability is improved if risk is spread by growing a variety of crops and not focusing solely on maize. We calculated the relative contribution of legumes (common bean, groundnut, and soybean) to the combined value of produce at farm level as an indicator for risk spreading. It was scored 0 if legumes did not contribute to the value of produce and 10 if legumes contributed 50% or more to the value of produce. The degree of risk spreading was considered an indicator of diversification.
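
The value of produce and the risk-spreading score can be illustrated with the sketch below. The prices and production figures are invented for the example and are not the survey medians. The function names are hypothetical, and scoring proportionally between 0% and 50% follows the linear interpolation described in Section 2.3.

```python
LEGUMES = {"common bean", "groundnut", "soybean"}


def value_of_produce(production_kg, price_per_kg):
    """Value per crop: seasonal production times the crop price."""
    return {crop: kg * price_per_kg[crop] for crop, kg in production_kg.items()}


def legume_share(values):
    """Fraction of the combined value of produce contributed by legumes."""
    total = sum(values.values())
    if total == 0:
        return 0.0
    return sum(v for crop, v in values.items() if crop in LEGUMES) / total


def risk_spreading_score(share):
    """Score 0 at no legume contribution, 10 at a share of 50% or more."""
    return 10.0 * min(1.0, max(0.0, share / 0.50))


# Hypothetical farm and prices (currency units per kg are placeholders)
production = {"maize": 1000.0, "groundnut": 100.0}
prices = {"maize": 0.30, "groundnut": 1.50}

values = value_of_produce(production, prices)
share = legume_share(values)
score = risk_spreading_score(share)
```

Here groundnut contributes one-third of the combined value, which gives a score of about 6.7.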

Farm area

Farm area often limits the income that can be attained from farming (Marinus et al. 2022). We assessed the total farm area per farm based on measured field sizes of all fields in the farm and monitored this over time during the seasonal monitoring survey. Farm area was scored 0 if the farm area was 0 ha. The score of 10 was assigned to the 75th percentile of the farm areas observed for all farmers, so it was a relative score based on the current farm areas. An increase in farm area was considered an indicator for extensification.
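
A relative score of this kind, used for both farm area and value of produce per hectare, can be sketched as follows. The farm areas are invented, the function name is hypothetical, and the percentile method (the default "exclusive" method of Python's `statistics.quantiles`) is an assumption, since the text does not state how the percentile was computed.

```python
import statistics


def relative_score(value, observed_values):
    """Score 0 at zero and 10 at the 75th percentile of the observed values.

    Values above the 75th percentile are clipped to 10 (assumption).
    """
    quartiles = statistics.quantiles(observed_values, n=4)
    p75 = quartiles[2]          # third cut point = 75th percentile
    if p75 <= 0:
        raise ValueError("75th percentile must be positive")
    return 10.0 * min(1.0, max(0.0, value / p75))


# Hypothetical farm areas (ha) for seven farms
farm_areas = [0.2, 0.3, 0.4, 0.5, 0.8, 1.0, 1.2]
score_small_farm = relative_score(0.5, farm_areas)
score_large_farm = relative_score(1.2, farm_areas)
```

Because the benchmark is derived from the sample itself, scores are comparable only within the same group of farms.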

3 Results

There were few differences between the two groups of farmers, the co-learning and the comparison group, except for the expansion of legumes. Therefore, in the results section, no distinction is made between the two groups of farmers, except where relevant differences arose. We first assess the indicators of Table 1 and subsequently analyze the different pathways for sustainable intensification.

3.1 Maize yield and production

Median yields were about 15% of the seasonal-average water-limited yield potential before the program (Table 2) and strongly increased to almost 50% of the seasonal-average water-limited yield potential for most households from the first season of the program onwards. Some farms even reached 80% of the seasonal-average water-limited yield potential in some seasons. Those good yields were maintained during all five seasons of the program (Fig. 2). During the program, farmers planted nearly all of their maize area with improved varieties (96%) in both locations, while before the program this was only 46% in Vihiga and 63% in Busia.

Table 2 Average household level indicators per location, before (averaged over two seasons), and during the program (averaged over the five seasons). Indicators are grouped according to the pathways of intensification/extensification and/or diversification/specialization. The crop area in % was corrected for intercropping. Crop production and input use before the program were farmer estimates while field sizes were measured. Yields during the program were measured.

Fig. 2

Total maize production per household in relation to the maize cultivated area per household during the program for Vihiga (A) and Busia (B). The dotted line indicates a maize grain yield of 50% of the seasonal-average water-limited yield potential, 4000 kg ha−1 for the SR (short rains) and 6300 kg ha−1 for the LR (long rains) cropping season. The short-dashed line indicates a maize grain yield of 80% of the water-limited yield potential, 6400 kg ha−1 for the SR and 10000 kg ha−1 for the LR cropping season. The long-dashed line indicates a maize grain yield of 15% of the water-limited yield potential, 1200 kg ha−1 for the SR and 1900 kg ha−1 for the LR cropping season.

The maize production per household before the program was about 15% of that during the program, due to both a yield increase and the increase in maize area (Table 2). During the program, the maize area remained relatively large and some farmers even increased it over time (Supplementary materials 2, Fig. 2). This trend was observed irrespective of the initial cultivated area of maize (Supplementary materials 2).

3.2 Maize self-sufficiency and maize area

Maize self-sufficiency before the program in Vihiga was on average one-third of the required amount of maize per household and in Busia this was half. During the program most households became maize self-sufficient. On average, in Vihiga, households were producing 1.62 times what they needed and in Busia 3.28 times (Fig. 3). Increases in maize area from the second season onwards resulted in an improvement in maize self-sufficiency for those households in Vihiga which were not yet maize self-sufficient in the first season. In Busia, larger maize self-sufficiency was associated with a smaller fraction of the farm area dedicated to maize (Fig. 3). These relatively larger farms cultivated a larger absolute area with maize than smaller farms of less than 0.5 ha, which tended to have maize in most of their fields (Fig. 4). This critical area of 0.5 ha was roughly what was needed to produce twice the amount of maize required by typical households, indicating farmers’ priority to attain food self-sufficiency. During the evaluation interviews, farmers named maize self-sufficiency and the good market for maize, albeit at a low price, as reasons to grow maize.

Fig. 3

Fraction of farm area under maize in relation to maize self-sufficiency per season for Vihiga (A) and Busia (B). A maize self-sufficiency ratio of one (dashed line) means that a household is maize self-sufficient. The fraction of farm area under maize is not corrected for intercropping. SR stands for short-rain cropping season and LR for long-rain cropping season.

Fig. 4

Maize-cultivated area per farm in relation to farm area for Vihiga (A) and Busia (B). The dotted line is a 1:1 line, indicating that all fields of the farm contain maize. The maize area is not corrected for intercropping. The dashed line indicates 0.5 ha of maize, above which no farms cultivate only maize. SR stands for short-rain cropping season and LR for long-rain cropping season.

3.3 Nitrogen application and nitrogen use efficiency

Before the program, farmers in Vihiga applied a similar rate of mineral N fertilizer on maize as during the program (Table 2). The total amount of N applied on maize however nearly doubled, but due to the increase in maize area, the rate remained similar. The N application rate in Busia increased by nearly 50% during the program as compared to before the program. P application rates increased in both sites during the program as compared to before the program (Table 2).

There was a clear negative relationship between N application rate and maize area in both Vihiga and Busia during the program (Fig. 5). High N application rates (> 120 kg N ha−1) were applied on farms with a small maize area (<0.2 ha) and the rates were largest in the first season (2016SR). Farmers in Vihiga in particular applied high rates, which was attributed to their extremely small cultivated areas. With an increased maize area from the second season onwards, the N application rates decreased. The other seasons showed a pattern similar to that of 2017LR. Farmers with a large maize area tended to distribute the fertilizers over the whole area, resulting in lower application rates per hectare (40–50 kg N ha−1). This relation between N application rate and farm area seemed partly related to the size of the input voucher, which limited total N use per farm. A common choice was to use 60% of the voucher to buy a 50 kg bag of DAP (di-ammonium phosphate) and a 50 kg bag of CAN (calcium ammonium nitrate), adding up to 23 kg of N, which was the common maximum N use per farm across the maize fields (Supplementary materials 3). Some farmers with a larger maize area, mainly in Busia, bought small amounts of additional mineral fertilizer with their own money, resulting in moderate fertilizer N application rates of around 50 kg N ha−1.
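
The arithmetic behind this relationship can be made explicit with a short sketch. It assumes that the 23 kg of N from the two fertilizer bags is spread evenly over the whole maize area; the function name and the example areas are hypothetical.

```python
VOUCHER_N_KG = 23.0            # common maximum N use per farm, from the text
ADVISED_RATE_KG_HA = 120.0     # advised application rate shown in Fig. 5


def n_rate_kg_ha(maize_area_ha, total_n_kg=VOUCHER_N_KG):
    """N application rate when a fixed N amount is spread over the maize area."""
    if maize_area_ha <= 0:
        raise ValueError("maize area must be positive")
    return total_n_kg / maize_area_ha


# Maize area below which the fixed N amount exceeds the advised rate
threshold_area_ha = VOUCHER_N_KG / ADVISED_RATE_KG_HA

# Rates for three hypothetical maize areas (ha)
rates = {area: n_rate_kg_ha(area) for area in (0.1, 0.2, 0.5)}
```

Under this assumption the rate exceeds 120 kg N ha−1 only on maize areas below about 0.19 ha, and falls to 46 kg N ha−1 at 0.5 ha, which is consistent with the pattern described above.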

Fig. 5

Mineral N rate applied to maize fields in relation to the area cropped with maize per farm in 2016SR and 2017LR cropping seasons for Vihiga (A) and Busia (B). The dotted line indicates an application rate of 50 kg N ha−1 (common) and the dashed line 120 kg N ha−1 (advised). SR stands for short-rain cropping season and LR for long-rain cropping season.

Only a few farms across sites and seasons were within the desired range of N use efficiency (white area in Fig. 6). Too high N use efficiencies (>90%), indicating soil mining, were found for many of the farms in Busia, during all five seasons, and for about half of the farms in Vihiga from the second season onwards. Too low N use efficiencies (<50%) and too large N surpluses (>80 kg N ha−1) were mainly found in Vihiga (Fig. 6), especially in the first season, where large amounts of N-based fertilizers were applied on small maize areas (<0.2 ha). This problem diminished from the second season onwards when the cultivated area of maize increased (Fig. 5).

Fig. 6

Farm level N outputs in maize grain in relation to mineral N inputs on maize, all in kg N ha−1, for Vihiga (A) and Busia (B). The figure is based on the EU Nitrogen Expert Panel (2015) analysis method. The upper and the lower diagonal lines with a y-intercept of zero indicate a N use efficiency of 90% and 50% respectively. An N use efficiency above 90% indicates a risk of soil N mining (deep yellow color), while an N use efficiency below 50% indicates a risk of N losses to the environment (orange color). The area between these two lines is further narrowed by (1) a dotted diagonal line indicating a N surplus of 80 kg N ha−1, which, if exceeded, indicates a risk of N losses to the environment (light yellow color); (2) a horizontal dashed line indicating a N output that is equivalent to 50% of the water-limited yield potential per season, 83 kg N ha−1 for the long rains and 53 kg N ha−1 for the short rains. Below this output, the maize grain yield is lower than targeted (pink color). The remaining white area indicates the desired range of N efficiencies and output. SR stands for short-rain cropping season and LR for long-rain cropping season.

3.4 Relative cropping area for maize and legumes

Before the program, the relative crop area for both maize and legumes was smaller than during the program (Table 2). The share of maize increased by 10 to 25% and the share of legumes doubled. However, the area in common bean decreased, the area in groundnut increased, and soybean was newly introduced to 6% of the farm area (Table 2). The fraction of the farm area cropped with maize increased in the first two seasons, whereas that with legumes increased in later seasons. Co-learning farmers planted a larger fraction of their farm area with groundnut and soybean in the last two seasons (2018LR and 2018SR) than the comparison farmers (Fig. 7), although this seemed to be at the cost of common bean. Groundnut and soybean were two focus crops of the co-learning program, for rotational benefits and high value of produce per hectare, with specific attention to intercropping arrangements. The difference between comparison and co-learning groups was larger during the long-rain cropping season (Supplementary materials 4), which is locally seen as the main season for maize. Some households cultivated legumes mainly during the long rains and others mainly during the short rains. Small farms tended to grow a larger fraction of the farm area with legumes than larger farms, but mostly in intercropping with maize. In evaluation interviews, farmers with larger farms noted labor constraints for cultivating legumes as their main reason for dedicating only a limited area to legumes. In Vihiga, legumes were mainly intercropped with maize.

Fig. 7

Average percentage of farm area cultivated with legume crops before and during the program for the comparison (A) and co-learning (B) farmers in Vihiga and for the comparison (C) and co-learning (D) farmers in Busia. The dashed line indicates the start of the program. Percentage areas per crop are corrected for intercropping. SR stands for short-rain cropping season and LR for long-rain cropping season.

After increasing in the first seasons, the fraction of farm area with maize decreased in the last season on the larger farms (2018SR, Supplementary materials 5). The initial increases were realized both by replacing other crops (cassava, sorghum) and by using additional land, e.g., by renting in land and using land that was previously fallow (not shown). Most farmers who decreased their maize area had a relatively large maize area. They reported ample maize self-sufficiency and low maize prices as main reasons for the decrease. Maize was replaced by groundnut and by leaving land fallow.

3.5 Value of crop produce

The value of combined crop produce per hectare more than doubled during the program when compared to before (Table 2). This was the result of yield increases of most crops. Only yields of mostly intercropped common bean decreased during the program, because of the prolific maize growth.

Maize contributed most to the total value of produce for most households (Fig. 8), because of the large fraction of farm area on which it was grown. The contribution of maize to the total value of produce was more or less the same before and during the program and increased only slightly. As a consequence, the contribution of legumes slightly decreased. However, the share of common beans strongly decreased (low yields, smaller fraction of farm area) and groundnut and soybean took over (Table 2). For some individual households, legumes contributed two to three times more to the total value of produce than maize, because of their larger legume area fraction combined with relatively good legume yields (not shown). The expanding area of groundnut (Fig. 7) also explains why the value of produce of legumes became more important for co-learning farmers than comparison farmers in the last two seasons (Fig. 8). In particular, groundnut became important, contributing 14% and 8% to total value of produce for co-learning farmers in Vihiga and Busia, respectively, in 2018LR. For comparison farmers, the value of produce of legumes was 1% in Vihiga and 0% in Busia in 2018LR due to low yields and small areas with soybean. Soybean was mainly valued as an option to reduce striga infestation and less important for its selling value.

Fig. 8

Value of produce for soybean, groundnut, common bean, and maize in $PPP per adult equivalent per day for each household for the comparison (A) and co-learning (B) farmers in Vihiga and for the comparison (C) and co-learning (D) farmers in Busia. Households were ordered each season per location for their value of produce of maize. Household IDs were assigned per location. SR stands for short-rain cropping season and LR for long-rain cropping season.

Only one household in Vihiga obtained a value of produce that was equivalent to the living income in two of the seasons (Fig. 8). In Busia, slightly more households in both groups obtained a living income, which was mainly related to the larger farm area compared with Vihiga. The total value of produce was equivalent to the poverty line for a few households per group in Vihiga and for about one-third of the households in Busia.

3.6 Indications of different agricultural development pathways

Farm area appeared to be an important characteristic for explaining the indicator values, especially in Busia. Based on Fig. 4, a cutoff point of 0.5 ha was determined to group farmers with a smaller farm area (<0.5 ha), denoted “small farms”, and farmers with a larger farm area (>0.5 ha), denoted “larger farms”, even though these farms are still very small. Above an area of 0.5 ha, no farmers cultivated maize on all of their land, with one exception in Busia. In Vihiga, very few farms were larger than 0.5 ha, too few to consider as a separate category, so we excluded these from the analysis.
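
As a minimal sketch, the grouping rule described above can be written as follows. The function name is hypothetical, and the treatment of a farm of exactly 0.5 ha (assigned here to the larger group) is an assumption, since the text only defines the groups as below and above the cutoff.

```python
# Illustrative sketch of the farm-area grouping; names are hypothetical.
CUTOFF_HA = 0.5  # cutoff between "small" and "larger" farms, from the text


def farm_size_group(farm_area_ha, site):
    """Return 'small', 'larger', or None when the farm is excluded."""
    if farm_area_ha < CUTOFF_HA:
        return "small"
    if site == "Vihiga":
        # Too few farms above the cutoff in Vihiga to form a category.
        return None
    # Assumption: a farm of exactly 0.5 ha is treated as "larger".
    return "larger"


if __name__ == "__main__":
    # Hypothetical farm areas, not data from the study.
    for area, site in [(0.2, "Vihiga"), (0.8, "Vihiga"), (0.3, "Busia"), (1.1, "Busia")]:
        print(site, area, farm_size_group(area, site))
```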

In Vihiga, most intensification happened at the beginning of the program, while hardly any further intensification was observed in subsequent years (Fig. 9). This was the case for all indicators related to intensification: value of produce per hectare of all crops, maize yield, and the use of improved varieties remained the same. N application rate even slightly decreased. The relative maize area showed a slight specialization towards maize over time during the program, but at the same time the trends in relative legume area and the legume contribution to the value of produce pointed at diversification and spreading of risk. Farm area slightly increased over time, pointing towards extensification.

In Busia the small farms showed a similar pattern: a large positive change in intensification only at the start of the program and a decreasing N application rate during the program due to an increase in maize and total farm area (Fig. 9). The specialization in maize (low score for maize area) was even more pronounced than in Vihiga and coincided with a slight decrease in the relative legume area. However, the contribution of legumes to the value of produce slightly increased, pointing at risk spreading through diversification. Similar to the small farms, the larger farms in Busia showed most intensification at the start of the program and hardly any further intensification, except for a slight increase in maize yield. The other indicators for intensification remained the same during the program. The farms diversified as shown by both the relative maize area and the relative legume area and a large increase in the contribution of legumes to the value of produce, leading to spreading of risk. Farm area for both groups in Busia slightly increased over time, pointing towards extensification.

Comparing the larger and the small farms in Busia showed a slightly higher degree of intensification on the larger farms, reflected in a larger value of produce per hectare and a higher maize yield during the program (Fig. 9). However, the differences were small. During the program, small farms were more diversified than larger farms in terms of legume area and legume contribution to value of produce. At the end of the program, however, the contribution of legumes to the total value of produce was larger for larger farms due to higher yields of legumes, contributing to diversification for risk spreading.

Fig. 9

Spider web diagrams with average indicator scores per indicator for farms with a relatively small (<0.5 ha) and a larger farm area (>0.5 ha) (data from 2015SR) in Vihiga (A) and Busia (B). For Vihiga, larger farms were left out, as there were too few. A larger score indicates a more sustainable situation. Dotted lines represent the short-rain (SR) cropping season before the start of the program, 2015SR. Dashed lines represent the first season of the program, 2016SR, while solid lines represent the last season of the program, 2018SR. The indicators maize yield, N application rate, and improved maize variety refer to maize crop level. The other indicators are at farm level. (I) An intensification indicator, (E) extensification indicator, (D) diversity indicator, (S) specialization indicator. The “–” sign after maize area indicates a negative relation for this specific indicator: maize area receives a higher score if the cultivated area with maize is smaller.
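
The sign convention in this caption, where a larger score always means a more sustainable situation and maize area is scored inversely, can be illustrated with a simple linear rescaling. This is a sketch under stated assumptions: the linear 0 to 1 scale, the function name, and the benchmark values in the example are hypothetical and are not the scoring procedure or benchmarks used for the figure.

```python
# Illustrative sketch: rescale an indicator so that a larger score is better.
# The linear scale and all benchmark values below are assumptions.

def indicator_score(value, worst, best):
    """Linear score from 0 (at `worst`) to 1 (at `best`), clipped to [0, 1].

    Passing worst > best inverts the indicator, as for maize area,
    where a smaller value should receive a higher score.
    """
    if worst == best:
        raise ValueError("worst and best benchmarks must differ")
    score = (value - worst) / (best - worst)
    return max(0.0, min(1.0, score))


if __name__ == "__main__":
    # Hypothetical example: maize yield in t/ha, higher is better.
    print(indicator_score(3.0, worst=0.0, best=6.0))
    # Hypothetical example: fraction of farm area in maize, lower is better.
    print(indicator_score(0.4, worst=1.0, best=0.0))
```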

4 Discussion

In this study, we used a diverse set of indicators to analyze five seasons of detailed farm level data, which was gathered as part of a co-learning program with 47 farmers in western Kenya. We also compared the outcomes during the program (measured) with farmer-reported data, collected during a baseline study held before the program. We compared the integrated co-learning approach (Marinus et al. 2021), which included an input voucher, with a voucher-only approach. We assessed whether the integrated co-learning approach and/or the input voucher-only would lead to pathways of intensification or extensification and pathways of diversification or specialization. We did not observe a difference between farmers only receiving a voucher and those also taking part in the co-learning program, so we analyzed them as one group. The only exception was the adoption of legumes, which were included more substantially by the co-learning farmers. Soybean was newly introduced and groundnut substantially expanded, which led to a more diversified cropping system. All farmers in our sample increased maize yields (intensification) compared to the situation before the program, although an increase in farm and maize areas in combination with relatively low N application rates (risk of soil N mining) also pointed to extensification and specialization. The value of produce remained below a living income for most households in our sample due to the small farm areas. This was more prominent in Vihiga than in Busia. The larger farms in our sample scored better in terms of diversification than the small farms, especially related to fraction of the area in maize and contribution of legumes to the value of produce. Our results are in line with the well-described difficulty of enabling an increase in yields and agricultural production, while at the same time fulfilling other environmental and economic goals that are important for sustainable intensification of smallholder agriculture.

4.1 Farmers’ response to the voucher and integrated co-learning

The voucher seems to have resulted in changes in input use, yields, maize area, and farm area, independent of the co-learning workshops (Table 2). Although maize yields and input use prior to the program were based on farmer-reported data, they were in line with current yields (MoALF 2015) and input use (Sheahan et al. 2013; Valbuena et al. 2015) reported in the literature for western Kenya. The measured maize yields and subsequent increased farm level production allowed most households to achieve maize self-sufficiency during the program. This is most likely due to the provision of the US$ 100 input voucher, as most farms in western Kenya only produce enough maize to feed the household for half of the year (Valbuena et al. 2015). Although the voucher alleviated capital constraints for agriculture at household level, co-learning helped to facilitate more complex changes such as diversification into (new) legumes such as soybean and groundnut (Fig. 7). Although taking time, the iterative learning process facilitated learning on new intercropping arrangements of maize and legumes and identified specific objectives for soybean (e.g., reducing striga incidence) and groundnut (e.g., high value of produce ha−1), as described in more detail in Marinus et al. (2021). Co-learning can thus be used to contextualize knowledge for the breadth of options that is needed for sustainable intensification (Descheemaeker et al. 2019; Ronner et al. 2021).

Initially, all households in our sample, irrespective of farm area, specialized in maize both in Vihiga and in Busia. Larger farms, however, reduced their maize area again in the last season (2018SR), while small farms maintained their increased maize area. Similar increases in maize area after the introduction of an input voucher or subsidy have been described before for western Kenya (Sanchez et al. 2007) and Malawi (Holden and Lunduka 2010; Chibwana et al. 2012), based on farmer-reported maize areas. In these studies, however, an increase in maize area often resulted in a decrease in legume area (Holden and Lunduka 2010; Chibwana et al. 2012). For small farms, maintaining the large maize area was associated with farmers’ objectives to be maize self-sufficient (Marinus et al. 2021).

The financial benefits of diversification into groundnut and soybean that we observed were in line with the findings of Franke et al. (2014), who simulated benefits of diversification with legumes for different farm types in Malawi. Diversification is important for spreading risks (crop failure, low prices), and for nutritional and rotational benefits (Vanlauwe et al. 2019). On the small farms, legumes were mainly intercropped with maize resulting in limited benefits due to land constraints. However, on larger farms, labor constraints were limiting the expansion of legumes, similar to the findings of Franke et al. (2014). This would imply that developing and promoting legume-specific, small-scale mechanization, such as groundnut diggers for harvesting and shellers (Tsusaka et al. 2017), may be required to enable diversification for households with a larger farm area.

4.2 Concurrent pathways of intensification and extensification

The maize yields obtained during the program point at the pathway of intensification as yields were two to three times greater than the yields reported by participating farmers before the program, and close to the seasonal-average benchmark of 50% of the water-limited yield potential. However, as the corresponding N application rates were both above and below the desirable range, it appeared to be difficult to enhance N use efficiency, which is a typical challenge pertaining to sustainable intensification (Zhang et al. 2015).

Intensified mineral fertilizer use resulted in extremely high N application rates in Vihiga (>200 kg N ha−1 in the first season, ~100 kg N ha−1 in later seasons) due to the small farm areas in our sample there, resulting in N use efficiencies below 50%. These farms of less than 0.2 ha were not able to allocate all inputs from the voucher in a useful manner, even with an increased farm or maize area in later seasons. Maize yields were not negatively related to farm area, and in some seasons were even positively related to it. Although based on a small sample, this seems to go against the inverse farm size-productivity relationship (Larson et al. 2014). Our finding, however, is in line with Desiere and Jolliffe (2018) and Gourlay et al. (2017), who also found no negative relation between farm area and yield. Notwithstanding higher N application rates, smaller farms did not seem to produce better yields than larger farms, which may be explained by reliance on off-farm work requiring farmers’ attention (Leonardo et al. 2015) and the presence of poorer soils (Franke et al. 2019), requiring longer term investments in soil fertility (Vanlauwe et al. 2010).

Extensification was observed on larger farms, which increased their farm and/or maize areas and hence distributed N over larger areas. This was most notable in Busia, as also discussed in Marinus et al. (2023), where population pressure is lower and fallow land is available. We hypothesize that the expansion in farm area was enabled by the voucher (Marinus et al. 2023). The preference of the farmers in our sample for extensification over intensification goes against one of the key objectives of sustainable intensification, namely to increase agricultural production on existing farmland (Cassman et al. 2003; Struik and Kuyper 2017). The preference for land expansion among African smallholders to increase production, however, seems to be a general trend for crop area (Baudron et al. 2012; Ollenburger et al. 2016; Jayne and Sanchez 2021; Giller et al. 2021). At farm level, extensification may be less expensive than increasing input rates with the associated larger risks of financial losses (Tittonell et al. 2007; Burke et al. 2019; Jindo et al. 2020), which can help to explain the observed farmers’ preference. The additional fields that farmers rented in were either previously fallow, or already in active use for agriculture. On the one hand, expanding into fallow land or nature areas can raise environmental concerns, as it can jeopardize current ecosystem services such as providing natural habitats or erosion control. On the other hand, using fallow land more frequently can also be seen as intensification and therefore desirable. When some farmers increased their farm area by renting in land that was already in active use, farm area decreased for other farmers. If this happened on a larger scale, increasing farm areas could push others out of agriculture, requiring alternative employment for those leaving agriculture (Giller 2020).

Except for the first season, the N application rates were remarkably similar for the small and larger farms in our sample at about 50 kg N ha−1. This was partly limited by the fixed voucher size of US$ 100. Farmers with a relatively large maize area who bought additional fertilizers still applied N at a maximum of 50 kg N ha−1, despite the advice in the co-learning workshops to apply more. This may be partly due to the active presence of One Acre Fund in the area, which, as a credit provider, advises farmers to use this conservative rate of 50 kg N ha−1. One Acre Fund was already present in the program locations before the start of the program and did not change the intensity of its activities during the program. The relatively good yields and low N fertilizer application rates are probably not sustainable as they will likely result in soil N mining. Soil N mining is common in SSA, but usually at lower yields and lower input levels than in our study (Sheahan and Barrett 2017). We diagnosed negative N balances over multiple seasons, which suggest that soil mining will occur in the long term. This may have been enabled by the increased application of P through the mineral fertilizers. In the P-fixing soils of the study area, P limits mineralization and strong yield responses to P can be found (Kihara and Njoroge 2013). Another reason may be that we did not account for N inputs from manure and N2-fixation, although these were small (<14 kg N ha−1 for manure on average, with large variation in rates due to likely recall error). When good yields in combination with soil N mining are continued, N and other nutrients (e.g., K) may become limiting (Njoroge 2019) and fertilizer rates will need to be adjusted.

4.3 Development pathways evaluated by multi-criteria analysis

We combined indicators that farmers indicated to be important from their perspective (e.g., maize self-sufficiency, value of produce) with indicators that are important for local or national food self-sufficiency (e.g., yield and production) and environmental protection (e.g., N use efficiency, N surplus) in an integrated assessment. This analysis, in combination with the discussions with the co-learning farmers, identified potential constraints and trade-offs at farm level. Achieving and even surpassing maize self-sufficiency was a first priority for farmers, because of the importance of having surplus food as a buffer for later seasons and the reliable market for maize (Marinus et al. 2021). This priority could stimulate specialization. The limited observed diversification goes against a common assumption in modeling studies that farmers are likely to diversify into other crops once they are maize self-sufficient (e.g., Hengsdijk et al. 2014; Leonardo et al. 2018). Increasing the value of produce obtained from farming was a second objective for farmers. However, despite the good yields, farm area and, in Busia, also labor availability seemed to be overriding constraints for reaching the income benchmarks. At best, one-third of the households in our sample obtained a living income and half of the households reached the poverty line in Busia. In Vihiga, only one out of twenty-three households obtained a living income in two seasons, while at best one-fourth of the sample reached the poverty line. Increasing farm area per household or extensification may thus be needed to increase income from farming to a living income. New employment opportunities will then be needed for those who choose to leave farming (Giller 2020), if no additional land is available or if an increase of agricultural land is not desired (e.g., Godfray et al. 2010; The Montpellier Panel 2013).

Following area expansion, in Busia, mechanization could alleviate the labor constraints for cultivation of profitable crops such as legumes, of which the further expansion during the program seemed to be limited by labor constraints. Mechanization could thereby facilitate further diversification into more profitable crops for economic viability. Apart from changing to more profitable crops and increasing farm area, selling products at times of high prices can also be a way to increase income. This strategy is not within reach of all farmers, as it depends on their short-term needs for cash. We did not consider seasonal price fluctuations, although maize prices can differ by more than a factor of two between the lean season, when maize is scarce, and just after harvest, when maize is abundant (Burke et al. 2017). A more extensive analysis of each individual household and price fluctuations would be required to assess this.

Disaggregating the analysis per household showed that farm area limited outcomes for both small and larger farms in specific ways. For example, N use efficiencies were below (for about one-third of farms in Vihiga) or above the desired range (for about half of the farms in both sites), respectively, while outputs (yield) were in the desired range. Another methodological lesson learned is that assessing adoption of new crops or varieties in programs on diversification needs multi-season studies (Glover et al. 2019) as our results showed that the legume area per farm differed per season and not necessarily according to the season when legumes were known to be most commonly cultivated. Finally, the principles, criteria, and indicator framework, following Florin et al. (2012), was useful in being explicit on the underlying assumptions, i.e., criteria, on when an indicator contributes to sustainability. Some of these assumptions, e.g., on crop area, can be subjective and thereby require transparency on why they were chosen and which benchmarks were used (Marinus et al. 2018).

5 Conclusions

We analyzed whether farmer responses to a voucher and co-learning were indicative of different pathways for agricultural development over a period of five seasons by applying an indicator framework. Our overarching aim was to improve the understanding of farmer responses to input subsidies and new knowledge, in order to better support desired agricultural development pathways in smallholder farming. Although we focused on a limited number of farmers, 47 in total, we believe that based on the detailed data collection over multiple seasons, some conclusions can be drawn. The novel integrated co-learning approach which we developed facilitated more complex changes in farm management, such as diversification through an increase in legume area and legume contribution to the value of produce. Other responses were mainly related to the input voucher itself.

Increased input use through the voucher seemed to increase yields and production, indicating a pathway of intensification that allowed households to achieve maize self-sufficiency. As a result of increases in maize and farm area on larger farms, N application rates remained constant, despite larger inputs. Accompanied by too low N use efficiencies, this pointed at extensification and a risk of not reaching environmental protection objectives. Most small farms were only just maize self-sufficient and their value of produce remained below the poverty line. Obtaining a living income was only possible on large farm areas. It should be noted, however, that we based this on the product prices of 2018. Different prices would give a somewhat different picture, but it is clear that prices would have to increase several-fold to lift the majority of the farmers out of poverty. Our multi-criteria analysis highlighted the difficulty of supporting diversification as a pathway towards sustainable intensification. Improving livelihoods requires changes that go far beyond the farm level. Smallholder farmers in western Kenya and in many rural areas of sub-Saharan Africa are essentially part-time farmers who depend on many sources of income. To increase income from farming, farm areas need to increase, which requires off-farm employment opportunities for those who choose to leave farming. Whether sustainable intensification of smallholder agriculture will actually happen may therefore depend on how changes in farm structure—that is, capital, land and labor—are facilitated at farm level and as part of the wider socio-economic developments within a country.