1 Introduction

In recognition of the myriad benefits of grain legume production, the Food and Agriculture Organization (FAO) of the United Nations named 2016 the International Year of PulsesFootnote 1 under the banner of “nutritious seeds for a sustainable future”. The production and consumption of grain legumes have been shown to convey many environmental, economic, and nutritional benefits. For example, because legumes naturally fix nitrogen (N) in the soil, they reduce the need for inorganic fertilizer and can improve the environmental sustainability of cropping systems; in addition, the residual N in the soil can enhance long-term soil fertility and crop productivity (Bationo et al. 2011; Bohlool et al. 1992; Dakora and Keya 1997; Thierfelder et al. 2012). When grown as intercrops or in rotations with cereals, legumes also help reduce cereal crop diseases and pests by serving as hosts for beneficial microbial life, creating an inhospitable soil environment for many soil-borne diseases, and encouraging beneficial insect predators; this in turn reduces the need for pesticides (Bohlool et al. 1992; Howieson et al. 2000). In addition to the above mentioned environmental benefits, legumes carry many potential economic and nutritional benefits for smallholder farm households. For example, farmers can store these crops for long periods of time without loss of nutritional value, which grants them the choice to consume or sell the legumes between harvests (FAO 2016). Moreover, parts of the legume plant itself (e.g., the leaves of cowpea and common bean plants) can be consumed during the growing season, offering smallholders some insurance against food insecurity (Barrett 1990; Chivenge et al. 2015; Ojiewo et al. 2015). Also, as a result of their high protein, mineral, and fiber content, legumes serve as a valuable complement to a primarily carbohydrate-based diet (Ojiewo et al. 2015; Tharanathan and Mahadevamma 2003).

Despite the many possible benefits that legumes may grant to smallholder farm households, the relationship between legume cultivation and household food security has not been rigorously analyzed in the literature. This paper attempts to help fill that gap using nationally-representative panel survey data from small-scale farming households in Zambia. Specifically, we examined the impacts of the various ways that households incorporate legumes into their cropping activities on several indicators of household food production, availability, and access. These indicators are the gross value of crops harvested (which is further split into the value of crops retained by the household for household consumption versus the value sold), per capita calorie and protein production, months of adequate household food provisions (MAHFP), and household dietary diversity score (HDDS). We were particularly interested in the effects of cereal-legume rotations and cereal-legume intercropping but also analyzed the effects on these outcomes of other forms of legume production as a group (namely legume monocropping and intercropping or rotating grain legumes with non-cereal crops, including other legumes).Footnote 2 All of these indicators could potentially influence household food security and nutrition (see, for example, FAO (2009) and Mubanga and Ferguson (2017)).Footnote 3 We therefore explored the role of legume-based cropping practices in the broad agriculture-food security-nutrition nexus but with a focus on the food availability and access dimensions of food security. We did not explicitly analyze the effects of legume-based cropping practices on two other important dimensions of food security: utilization and stability.

Zambia is an appropriate case study for several reasons. First, grain legume cultivation is quite common but far from universal among smallholder farmers in the country. For example, groundnut is the second most commonly cultivated crop after maize (86.0% of smallholder households grow maize whereas 39.2% grow groundnut) (Tembo and Sitko 2013). Mixed beans, grown by 15.7% of smallholder households, is the fifth most commonly produced crop (Tembo and Sitko 2013). These crops are vital components of the daily diets of most Zambians. Second, maize-legume rotations have been heavily promoted in the country as a key component of conservation agriculture (Haggblade and Tembo 2003; Zulu-Mbata et al. 2016). And third, food and nutrition insecurity remain major challenges in Zambia. For example, approximately 46% of Zambians are undernourished and 40% of Zambian children under age five are stunted (FAO, International Fund for Agricultural Development, United Nations Children’s Fund, World Food Programme, and World Health Organization 2017). Based on the 2017 Global Hunger Index, Zambia had the fifth highest level of hunger among the 119 countries included in the rankings (International Food Policy Research Institute, Concern Worldwide, and Welthungerhilfe 2017).

2 Conceptual framework

The literature uses different approaches to conceptualize causal pathways from agriculture to food security, nutrition, and health (see Webb 2013 for a review of this literature). We based our conceptual framework on one such approach, that of Herforth and Harris (2014). They developed a framework that highlights three main pathways through which agriculture, food access, and nutrition may be linked: food production, agricultural income, and women’s empowerment (Fig. 1). The food production pathway considers how increases in food production resulting from an agricultural sector intervention or adoption of an agricultural technology or management practice can affect a household’s food access and nutrition through the type (including diversity), quantity, and seasonality of food available for consumption (Chung 2012; Herforth and Harris 2014; Kumar et al. 2015). The meso-level food market environment also influences a household’s food production and consumption decisions (Fig. 1). If a preferred food is not available or affordable in the local market, a household may instead choose to grow that product itself (Herforth and Harris 2014). The second pathway considers increases in agricultural income resulting from an agricultural sector intervention or technology, including income from crops produced and sold, which could allow greater household expenditures on food and could result in greater overall food consumption and an improvement in household food and nutrition security (Pandey et al. 2016; Pauw et al. 2015). Higher agricultural income might also result in higher non-food expenditure, including spending on health care, which could improve a household’s nutrition outcomes.Footnote 4 Women’s empowerment, the third pathway in this framework, emphasizes women’s combined roles in agriculture, dietary choices, and healthcare, and how they influence the nutritional and food security outcomes of children, mothers, and families (Malapit and Quisumbing 2015; Sraboni et al. 2014). The women’s empowerment pathway is particularly important when considering agricultural sector interventions that are designed to empower women or that may have unintended consequences that affect women’s empowerment.

Fig. 1 Agriculture-food access-nutrition linkages. Source: Herforth and Harris (2014)

Legume-based cropping practices serve as a useful conduit to better understand the linkages between agriculture and food and nutrition security across the food production and agricultural income pathways.Footnote 5 A production system that includes a greater variety of foods grants the household a greater diversity of food for own consumption (Jones et al. 2014; Kumar et al. 2015; Sibhatu et al. 2015; Venkatesh et al. 2016). For example, Jones et al. (2014) indicate that a more diverse cropping system is positively and significantly correlated with dietary diversity indices and with the number and frequency of legumes, fruits, and vegetables consumed. Accordingly, under the food production pathway, we expected to find that households that integrate legumes into their production systems will have greater and more diverse availability of food (Kassie et al. 2015; Manda et al. 2016a; Nyanga 2012).

Moreover, much research suggests a positive relationship between grain legume intercropping or rotation and crop yields. Legumes have a unique role in sustaining soil fertility through symbiotic biological N fixation. Extensive experimental evidence shows that integrating grain legumes in the cropping system significantly increases the yields of the subsequent crops in the rotation (Chauhan et al. 2012; Jeranyama et al. 2007; Kamanga et al. 2010; Lunze and Ngongo 2012; Odhiambo 2011; Thierfelder et al. 2012; Waddington et al. 2007a) and to a lesser extent through intercropping (e.g. see Waddington et al. 2007b). Impact studies based on observational data also support this linkage between legume intercropping and/or rotation and cereal productivity. For example, Arslan et al. (2015) showed that legume intercropping significantly increased maize yields among smallholder farm households in Zambia; however, the effect of crop rotation on maize yields was negative.Footnote 6 A similar study by Kassie et al. (2015) examined the effects of maize-legume intercropping and rotation on maize productivity in Malawi. Their results suggest that these technologies had a positive and significant impact on maize yields. Manda et al. (2016a) also found a positive effect of maize-legume rotation on maize yields in Zambia.

The increase in crop productivity induced by the presence of legumes on farm is expected, in turn, to increase the availability of food for both sale and home consumption, thus potentially influencing both the production and income pathways.

In this paper, we explored the role of legume-based cropping practices in influencing intermediate indicators along these pathways linking agriculture to food security and nutrition outcomes. Specifically, we explored whether and to what extent the adoption of these practices by a cereal-growing small-scale farm household affects household-level indicators representing the four nodes in Fig. 1 that are outlined in red boxes: food production and availability (measured by calories and protein produced), income (measured by gross value of crops produced and gross value of crops sold), food access (measured by MAHFP and HDDS), and diet or food quality (measured by HDDS). We tested the hypotheses that, all else equal, cereal-growing small-scale farm households that integrate legumes into their production system have: (1) greater food availability as measured by total production of (a) calories and (b) protein (food production pathway); (2) more income from crop production or sales (income pathway); (3) more months with adequate food access; and (4) greater household dietary diversity. The latter two outcomes may occur as a result of the food production and/or income pathways.

3 Materials and methods

3.1 Data

The data for this study were from the Rural Agricultural Livelihoods Survey (RALS), a two-wave, nationally representative panel survey of Zambian smallholder and small-scale farm households conducted in June–July 2012 and 2015 by the Indaba Agricultural Policy Research Institute (IAPRI). The Zambia Ministry of Agriculture and Central Statistical Office define smallholder households as those cultivating less than 20 hectares (ha) of land, and small-scale as those cultivating less than 5 ha of land. The vast majority (more than 90%) of Zambian smallholder households are ‘small-scale’. For details on the RALS sample design, see IAPRI (2012, 2015).

The 2012 survey covered the 2010/11 agricultural year (October 2010–September 2011) and the associated crop marketing year (May 2011–April 2012). The 2015 survey covered the 2013/14 agricultural year and the 2014/15 crop marketing year. The RALS data included detailed information on household demographics, crop production, crop sales, asset holdings, and access and distances to agricultural extension, inter alia.Footnote 7 From these data, we computed the gross value of crop production by multiplying the kilograms (kg) harvested of each crop produced by the household by the crop price per kg, and then summed up these values by household. The RALS data also capture, for each crop produced, the quantity that was sold versus retained for home consumption during the subsequent marketing year. The price received for crops sold was also captured in the data. We used this information to compute the gross value of crops sold and the gross value of crops retained. We used the actual sale price for the former and the median district or provincial crop price for the latter. We also used the data on the quantities of each crop harvested by the household and data on household size to compute the calories produced per capita per day and the protein produced per capita per day by the household. The crop quantities were converted to calories and grams of protein using conversion factors from the FAO (1968). In addition to these outcome variables, we also analyzed MAHFP and HDDS. Both waves of the RALS included a module on MAHFP but the HDDS module was only introduced in the 2015 wave of the survey. In total, we analyzed seven household-level outcome variables: overall gross value of crop production, gross value of crops retained, gross value of crops sold, calories produced/capita/day, protein produced/capita/day, MAHFP, and HDDS. More details on the MAHFP and HDDS outcome variables are provided in the next sub-section.
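The value and nutrient calculations described above can be sketched as follows. This is a minimal illustration rather than the authors' code: the crop prices and the per-kg calorie and protein factors are hypothetical placeholders (the FAO (1968) conversion factors are not reproduced here), and dividing the annual total by 365 days is an assumption about how the harvest is expressed per day.

```python
# Hypothetical inputs: every number below is a placeholder for illustration.
PRICE_PER_KG = {"maize": 1.5, "groundnut": 4.0}          # assumed prices per kg
KCAL_PER_KG = {"maize": 3500.0, "groundnut": 5500.0}     # assumed, not FAO (1968)
PROTEIN_G_PER_KG = {"maize": 90.0, "groundnut": 250.0}   # assumed, not FAO (1968)
DAYS_PER_YEAR = 365  # assumption: annual harvest spread evenly over a year


def gross_value(harvest_kg):
    """Sum of kg harvested times price per kg across a household's crops."""
    return sum(kg * PRICE_PER_KG[crop] for crop, kg in harvest_kg.items())


def per_capita_per_day(harvest_kg, factor_per_kg, household_size):
    """Convert kg harvested to a nutrient total, then to a per-person daily amount."""
    total = sum(kg * factor_per_kg[crop] for crop, kg in harvest_kg.items())
    return total / household_size / DAYS_PER_YEAR


household = {"maize": 1000.0, "groundnut": 200.0}  # kg harvested (hypothetical)
value = gross_value(household)
kcal = per_capita_per_day(household, KCAL_PER_KG, household_size=5)
protein = per_capita_per_day(household, PROTEIN_G_PER_KG, household_size=5)
```

The same function could be applied to the quantities sold and retained separately, using the actual sale price for the former and a median district or provincial price for the latter, as described in the text.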

A total of 8839 households were interviewed in the 2012 RALS. Of these, 7254 (82.1%) were successfully re-interviewed in 2015. Given this non-trivial rate of attrition, we tested for attrition bias using the regression-based test recommended by Wooldridge (2010). Based on the test, we failed to reject the null hypothesis of no attrition bias for the MAHFP, calories produced/capita/day, and protein produced/capita/day outcome variables (p > 0.10). For gross value of production, gross value retained, and gross value sold we rejected the null hypothesis of no attrition bias (p = 0.029, p = 0.097, and p = 0.087, respectively). Given the marginal statistical significance for gross value retained and gross value sold, this suggests attrition bias was not a major concern in this study. We could not test for attrition bias in the HDDS regressions because HDDS was only captured in the second survey wave. Finally, because we are interested in how incorporating legumes into cropping activities affects small-scale cereal- (i.e., maize-, sorghum-, or millets-)Footnote 8 growing farm households in Zambia, our analytical sample consisted of all panel households that grew any of these cereal crops in both waves of the RALS and who cultivated fewer than 5 ha of land in the first wave of the RALS (N = 5381).

3.2 MAHFP and HDDS

MAHFP and HDDS are both household-level indicators related to food access, an important dimension of food security (Bilinsky and Swindale 2010; Swindale and Bilinsky 2006; Jones et al. 2013). Household food access is defined as “the ability to acquire sufficient quality and quantity of food to meet all household members’ nutritional requirements for productive lives” (Swindale and Bilinsky 2006, p. 1). HDDS measures household dietary diversity and food quality, whereas MAHFP measures the duration of an adequate quantity of food accessed by the household (Jones et al. 2013).

The MAHFP module in the 2012 and 2015 RALS asked the respondent household in which months, if any, it did not have enough food to meet its needs during the most recent crop marketing year (May–April). The resultant MAHFP outcome variable is an integer between 0 and 12, with a higher value indicating more months with adequate household food provisions and thus better food access (Bilinsky and Swindale 2010). Leah et al. (2012) note that MAHFP is a particularly useful indicator in the context of agricultural populations because it captures a household’s ability to meet its food needs over the course of a year.
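As a rough sketch of how such a variable can be built, assume the raw responses list the months in which the household lacked enough food. The function and variable names are hypothetical and are not taken from the RALS instrument.

```python
# Months of the crop marketing year (May through April).
MARKETING_YEAR = ["May", "Jun", "Jul", "Aug", "Sep", "Oct",
                  "Nov", "Dec", "Jan", "Feb", "Mar", "Apr"]


def mahfp(months_without_enough_food):
    """Months of adequate provisions: 12 minus the number of shortfall months."""
    shortfall = set(months_without_enough_food)
    unknown = shortfall - set(MARKETING_YEAR)
    if unknown:
        raise ValueError(f"unrecognized months: {sorted(unknown)}")
    return len(MARKETING_YEAR) - len(shortfall)


score = mahfp(["Jan", "Feb"])  # household short of food in January and February
```

A household reporting no shortfall months scores 12, and one reporting a shortfall in every month scores 0.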

The HDDS variable was constructed using data from a dietary diversity module included in the 2015 RALS. Survey respondents were asked if any individual in the household consumed anything out of 16 different food groups (such as cereals, dark green leafy vegetables, or meat) in the previous 24 h. Some of these categories were then combined for a total of 12 food categories as in the standard HDDS tool (e.g., Swindale and Bilinsky 2006). The HDDS outcome variable is then an integer ranging from 0 to 12 that reflects a count of how many food groups were consumed by the household in the past day, with a higher number indicating greater dietary diversity, food quality, and food access. Hoddinott and Yohannes (2002) found that dietary diversity is positively associated with per capita consumption and per capita caloric availability from both staple foods and non-staples, which suggests that HDDS is a useful indicator of overall household food access. Although the HDDS provides a good measure of the breadth of food groups consumed by the household, it does not measure the quantity consumed or the intra-household food distribution, and it does not indicate a household’s habitual dietary pattern (Kennedy et al. 2013).
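The collapsing of 16 questionnaire groups into 12 HDDS groups can be sketched as below. The group labels and the specific mapping are assumptions based on the commonly used dietary diversity questionnaire layout; the exact labels in the RALS module are not given in the text.

```python
# Assumed mapping from 16 questionnaire groups to 12 HDDS groups.
GROUP_MAP = {
    "cereals": "cereals",
    "white_roots_tubers": "roots_tubers",
    "vitamin_a_vegetables_tubers": "vegetables",
    "dark_green_leafy_vegetables": "vegetables",
    "other_vegetables": "vegetables",
    "vitamin_a_fruits": "fruits",
    "other_fruits": "fruits",
    "organ_meat": "meat",
    "flesh_meat": "meat",
    "eggs": "eggs",
    "fish_seafood": "fish",
    "legumes_nuts_seeds": "legumes_nuts_seeds",
    "milk_products": "milk",
    "oils_fats": "oils_fats",
    "sweets": "sweets",
    "spices_condiments_beverages": "misc",
}


def hdds(consumed_groups):
    """Count distinct HDDS groups consumed in the past 24 hours (0 to 12)."""
    return len({GROUP_MAP[g] for g in consumed_groups})


# Two vegetable sub-groups count once, so this household scores 3, not 4.
score = hdds(["cereals", "dark_green_leafy_vegetables",
              "other_vegetables", "legumes_nuts_seeds"])
```

Because sub-groups are merged before counting, the score reflects breadth across food groups rather than the number of distinct items eaten.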

Before elaborating our empirical strategy, it is important to highlight the timing of farmers’ use of the legume-based cropping practices that were captured in the RALS data vis-à-vis the reference periods for the various outcome variables described above. Use of the legume-based cropping practices was captured for the agricultural year (October–September), with crop choice, planting, intercropping, and crop rotation decisions typically made between November and January (Fig. 2). The main harvest period is May–June, and the gross value of crop production and calories and protein produced outcome variables capture the quantities harvested of all crops planted (and affected by the agricultural technologies and management practices employed) during that agricultural year. We therefore captured the current-year effects of the legume-based practices on these outcome variables. The legume-based cropping practices may also have lagged effects on these outcome variables – e.g., through effects on soil fertility in the next agricultural year – but the RALS data did not allow us to capture these effects. The MAHFP variable reflects the status of household food provisions from the beginning of the main harvest period (May) through the following April (Fig. 2); the breakdown of gross value of crop production into value sold vs. retained was also for this period. Finally, the reference period for the HDDS variable is the 24 h prior to the time of interview, which was approximately one year after the main harvest from the agricultural year during which the legume-based cropping practices were used (Fig. 2). Given this timing, if we were to find significant effects of the legume-based practices on HDDS, they would reflect more enduring impacts. With the RALS data we were not able to analyze the effects of legume-based cropping practices on HDDS at or shortly after the harvest associated with the use of these practices.

Fig. 2 Timing of legume-based cropping practice use vis-à-vis outcome variables

3.3 Empirical strategy

3.3.1 Estimating the effects of legume-based cropping practices on household welfare

It is notoriously difficult to rigorously assess the impacts of technology adoption, including adoption of legume-based cropping practices. Adoption of such practices may be endogenous to the household food production, availability, and access indicators analyzed here (which we subsequently refer to as ‘household welfare’ or ‘welfare indicators’ for the sake of brevity). A household usually voluntarily adopts a new technology, and the decision to adopt may be correlated with unobserved factors that also affect household welfare. This complicates the estimation of the causal effects of adoption of these technologies along the impact pathways depicted in Fig. 1. An often cited example is that more motivated households or those with better management ability are more likely to adopt improved technologies. If this were the case for legume-based cropping practices and motivation or management ability were unobservable and also positively correlated with gross value of crop production, for example, then ordinary least squares (OLS) estimates of the effects of the adoption of a given practice on gross value of crop production would be biased upward.

It is difficult, if not impossible, to randomly assign technology adoption, although it may be possible to, for example, randomly assign exposure to or additional training on a given technology. However, in this study, we relied on observational data on the adoption of legume-based cropping practices and household welfare, and as such we must employ quasi-experimental techniques to identify the welfare effects of the practices. More specifically, we used panel data methods (e.g., the fixed effects estimator and the Mundlak-Chamberlain correlated random effects approach (Chamberlain 1984; Mundlak 1978)) or two-stage least squares (2SLS) to correct for different sources of endogeneity.

For all outcome variables except for HDDS, which is only observed in the 2015 RALS, we estimated household fixed effects (FE) models of the welfare indicators regressed on measures of the household’s adoption of the various legume-based cropping practices (cereal-legume intercropping, cereal-legume rotation, and other legume production) and a vector of control variables that are described in the next sub-section. Adoption of the various legume-based cropping practices was measured as either: (i) a binary ‘treatment’ variable equal to one if the household used the practice on at least one plot, and equal to zero otherwise; or (ii) a continuous ‘treatment’ variable equal to the household’s total land area under the practice.Footnote 9 Under the key assumption of strict exogeneity of the observed covariates conditional on the unobserved time-constant household-level heterogeneity, FE estimates of the welfare effects of legume-based cropping practice adoption are unbiased and consistent. If, for example, a household’s motivation and management ability did not vary between the 2012 and 2015 waves of the RALS, then the FE approach may largely solve the endogeneity problem.
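To illustrate why the FE approach removes time-constant heterogeneity, the sketch below applies the within (household-demeaning) transformation to a toy two-wave panel with a single treatment variable. The data are made up and the model omits the control variables and dummies used in the paper, so this shows the technique only, not the authors' specification.

```python
from collections import defaultdict


def within_estimator(rows):
    """Fixed-effects (within) slope for one regressor.

    rows: list of (household_id, x, y). Demeaning x and y by household
    removes any time-constant household effect; the slope is OLS on the
    demeaned data.
    """
    by_hh = defaultdict(list)
    for hh, x, y in rows:
        by_hh[hh].append((x, y))
    sxy = sxx = 0.0
    for obs in by_hh.values():
        x_bar = sum(x for x, _ in obs) / len(obs)
        y_bar = sum(y for _, y in obs) / len(obs)
        for x, y in obs:
            sxy += (x - x_bar) * (y - y_bar)
            sxx += (x - x_bar) ** 2
    if sxx == 0:
        raise ValueError("no within-household variation in the treatment")
    return sxy / sxx


def pooled_ols(rows):
    """Pooled OLS slope that ignores household effects, for comparison."""
    xs = [x for _, x, _ in rows]
    ys = [y for _, _, y in rows]
    x_bar, y_bar = sum(xs) / len(xs), sum(ys) / len(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    return sxy / sxx


# Toy data: y = household effect + 2 * x, where households with larger
# (unobserved) effects adopt more often. All values are invented.
rows = [("hh1", 0, 10.0), ("hh1", 1, 12.0),
        ("hh2", 1, 22.0), ("hh2", 1, 22.0),
        ("hh3", 0, 5.0), ("hh3", 0, 5.0),
        ("hh4", 1, 14.0), ("hh4", 0, 12.0)]
fe_slope = within_estimator(rows)
ols_slope = pooled_ols(rows)
```

In this toy panel the within estimator recovers the built-in effect of 2, whereas pooled OLS is pulled upward because adoption is correlated with the household effect. Only households whose adoption status changes between waves contribute to the FE estimate.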

Because the HDDS outcome variable is only observed in the 2015 wave of the RALS, we could not estimate household FE models. However, because we observed all explanatory variables in both waves of the RALS, we could take a Mundlak-Chamberlain correlated random effects (CRE)-like approach to somewhat control for time invariant unobserved heterogeneity in the HDDS models (Chamberlain 1984; Mundlak 1978; Wooldridge 2010). In particular, we estimated linear CRE models in which the RALS 2015 HDDS was regressed on the RALS 2015 levels of the covariates as well as the RALS 2012 and 2015 household time averages of the covariates. Two key assumptions for the CRE estimates to be unbiased and consistent are: (i) strict exogeneity; and (ii) that the time-constant unobserved household-level heterogeneity be a linear function of the household time averages of the observed covariates, such that including these time averages as additional covariates in the regression effectively controls for the unobserved heterogeneity (Chamberlain 1984; Mundlak 1978; Wooldridge 2010).
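The data preparation behind this CRE-like setup can be sketched as follows: each household's 2015 covariate levels are kept, and the 2012 and 2015 household averages are appended as extra regressors. The covariate names and the nested-dictionary layout are hypothetical, and the regression step itself is left out.

```python
def add_time_averages(panel, covariates):
    """Return 2015 rows augmented with each covariate's 2012-2015 household mean.

    panel: dict mapping household_id -> {year: {covariate: value}}.
    The '<name>_bar' entries are the Mundlak-Chamberlain terms intended to
    proxy for time-constant unobserved household heterogeneity.
    """
    rows = []
    for hh, waves in panel.items():
        row = {"household_id": hh}
        for name in covariates:
            row[name] = waves[2015][name]
            row[name + "_bar"] = (waves[2012][name] + waves[2015][name]) / 2
        rows.append(row)
    return rows


# Invented example values for two households.
panel = {
    "hh1": {2012: {"rotation_ha": 0.0, "hh_size": 5},
            2015: {"rotation_ha": 1.0, "hh_size": 6}},
    "hh2": {2012: {"rotation_ha": 0.5, "hh_size": 4},
            2015: {"rotation_ha": 0.5, "hh_size": 4}},
}
design = add_time_averages(panel, ["rotation_ha", "hh_size"])
```

The 2015 HDDS would then be regressed on both the level and the `_bar` columns, with the coefficients on the levels interpreted as the effects of interest.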

For all outcome variables we also estimated 2SLS regressions in which we instrumented for the three legume-based cropping practice variables, which we suspected may be endogenous to household welfare.Footnote 10 To do this, we needed at least three instrumental variables (IVs). These variables must be strongly correlated with the suspected endogenous variables after controlling for the other exogenous covariates and must be uncorrelated with the idiosyncratic error term in the welfare indicator equations. We used as IVs the following three variables: (i) a dummy variable equal to one if any member of the household received advice on rotating cereals with legumes during or prior to the agricultural year in question (i.e., 2010/11 and 2013/14 for RALS 2012 and 2015, respectively), and equal to zero otherwise; (ii) a similar dummy variable for whether any member of the household received advice on intercropping cereals with legumes; and (iii) a variable that captures the prevalence of legume cultivation in the household’s community and is defined as the percentage of other households in the standard enumeration area (SEA) that grew legumes (excluding the household itself).Footnote 11 For the first two instruments, extension advice dummies have been previously used to instrument for technology adoption decisions; see, for example, Di Falco et al. (2011) and Manda et al. (2016b). However, we acknowledge the possibility that some households may self-select into accessing extension advice, which would violate the exclusion restrictions needed for these IVs to be valid; we therefore put more emphasis on the FE results than the 2SLS results.
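The third instrument is a leave-one-out community share, which can be sketched as below. The tuple layout and the choice to return None for a household with no observed neighbours are assumptions made for illustration; the text does not say how such cases were handled.

```python
from collections import defaultdict


def legume_prevalence_excluding_self(households):
    """Percentage of OTHER households in the same SEA that grew legumes.

    households: list of (household_id, sea_id, grew_legumes) tuples.
    Returns {household_id: percentage}, or None when the household is the
    only one observed in its SEA (an assumed convention).
    """
    growers = defaultdict(int)
    counts = defaultdict(int)
    for _, sea, grew in households:
        counts[sea] += 1
        growers[sea] += int(grew)

    result = {}
    for hh, sea, grew in households:
        others = counts[sea] - 1
        if others == 0:
            result[hh] = None
        else:
            # Subtract the household's own status so it cannot drive its own IV.
            result[hh] = 100.0 * (growers[sea] - int(grew)) / others
    return result


sample = [("a", "sea1", True), ("b", "sea1", False),
          ("c", "sea1", True), ("d", "sea2", True)]
prevalence = legume_prevalence_excluding_self(sample)
```

Excluding the household's own behaviour avoids a mechanical correlation between the instrument and the household's adoption decision.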

First stage regression results of the legume-based cropping practice variables on the three IVs and the exogenous covariates suggest that the cereal-legume intercropping and rotation advice dummies and the legume prevalence variable were quite strongly correlated with the use of these practices (see Tables S1 and S2 in the Supplemental Online Appendix material). As expected, receipt of advice on intercropping (or rotating) cereals with legumes was positively and statistically significantly associated with households adopting cereal-legume intercropping (or rotation). Also as expected, an increase in the prevalence of legume production among other households in a community was positively and significantly associated with a given household’s adoption of other legume-based practices. The partial F statistics for the excluded IVs exceeded 10 in all but one of the six models in which we used the 2012 and 2015 RALS panel data (see the bottom of Tables S1 and S2 for details). This suggests that the IVs are quite strong when both waves of the data are used. However, when we used only the 2015 RALS cross-section, the partial F statistics exceeded 10 in three out of the six models but fell below 10 for the remaining three models. Note that these weaker IVs affect only the HDDS 2SLS regressions. Overall, based on the Staiger and Stock (1997) rule of thumb of partial F > 10, the first stage results suggest that the candidate IVs were sufficiently strong to be used in the 2SLS regressions with the 2012 and 2015 RALS panel data but weak IVs are a concern for the HDDS 2SLS regressions; thus, the latter results must be interpreted with caution.

3.3.2 Estimating the effects of the gross value, calorie, and protein variables on food access

In addition to estimating the direct effects of cereal-legume intercropping, rotation, and other legume production on household welfare, we explored the impacts of the gross revenue and calorie and protein production variables on HDDS and MAHFP. The gross value of production and the calories/protein production variables are all computed using the kg of each crop harvested, which is then multiplied by different factors (prices or calorie/protein conversion factors) to get the final outcome variable. Because of this, in order to avoid concerns with multicollinearity, we used four different models: one in which the overall gross value of crop production is the key explanatory variable of interest; one model in which the gross value of crop production is disaggregated into the gross value retained for home consumption versus sold as the two key explanatory variables; and two models in which calorie or protein production, respectively, is the key explanatory variable. In doing so, we aimed to test the hypotheses that increased crop income and higher levels of crop production are the pathways through which legume-based cropping practices affect household food access (Fig. 1).

The estimators we used to analyze this link in the agriculture-food access chain were CRE for the HDDS models and FE for the MAHFP models. We explored using 2SLS but were unable to identify sufficiently strong IVs.

3.4 Control variables

A non-separable agricultural household model motivates the choice of control variables included in our main empirical models (Singh et al. 1986). In such models, consumption and production decisions are made jointly, and so we included consumer demand determinants and producer supply determinants in the regressions for both the consumption-related outcome variables (i.e., HDDS and MAHFP) and the production-related outcome variables (i.e., gross value of production and calorie and protein production). The consumer demand determinants included in the models were household demographic variables such as the age, gender, and education level of the household head and the number of members in the household. We also controlled for the retail price of maize meal (maize flour, an important food staple) during the hungry season.Footnote 12 The producer supply determinants included were the household’s agricultural assets (per capita landholding size, number of fields operated, average plot size, livestock owned, and farm equipment owned); proxies for access to agricultural information and markets (i.e., whether the household owns a radio or cell phone and the distance to the nearest agricultural extension office, district town, Food Reserve Agency (FRA) depot, marketplace to buy and sell agricultural goods, and paved road); the producer prices of maize, groundnut, beans, and soybean; and fertilizer prices.Footnote 13 In addition, we included a year dummy equal to one for the 2015 RALS to control for unobserved changes between the two survey rounds that affect all households, and a dummy equal to one if the household resides in a rural SEA. District dummies and district-by-year dummy interaction terms were also used in the panel models to control for location and time-varying unobservables and variables for which we do not have data (e.g., rainfall, soil quality, other prices). See Table 1 for detailed variable descriptions and summary statistics.

Table 1 Summary statistics for Zambia (2013/14 agricultural year values)

The regressions of HDDS and MAHFP on the gross value and calorie/protein production variables contain many of the same explanatory variables as the main models. The full list of controls used in these models, excluding the district and year dummies, is given in Table S3 in the Supplemental Online Appendix.

4 Results

Figure 3 provides information on the adoption of the various legume-based cropping practices by cereal-growing small-scale farm households in Zambia during the 2010/11 and 2013/14 agricultural years. Tables 2, 3, and 4 summarize the key findings from the regression analysis – i.e., the estimated effects of cereal-legume intercropping, cereal-legume rotation, and other legume production, respectively – on the seven key outcome variables discussed above (full regression results are available from the authors upon request). We begin with a brief descriptive analysis and then discuss the effects of each legume-based cropping practice in turn. Lastly, we discuss the effects of the gross value, calorie, and protein production variables on HDDS and MAHFP to understand which pathway (production and/or income) is contributing to these effects, if present.

Fig. 3 Importance of various grain legume-based cropping practices in Zambia: Comparison of the 2010/11 and 2013/14 agricultural years. Note: Reference population is panel households who grew a cereal crop (maize, sorghum, or millet) in both agricultural years and who cultivated fewer than five ha in the 2010/11 agricultural year (N = 5381). Percentages are weighted using sampling weights.

Table 2 Summary of main regression results for the effects of cereal-legume intercropping on household welfare
Table 3 Summary of main regression results for the effects of cereal-legume rotation on household welfare
Table 4 Summary of main regression results for the effects of other legume production on household welfare

4.1 Importance of legumes in cereal-based cropping systems in Zambia

Legume cultivation is fairly common among cereal-growing small-scale farm households in Zambia. Approximately 60–64% of such households grow grain legumes in some way (Fig. 3). The most common way that these households incorporate legumes into their farms is via rotation with cereals – approximately 40–43% of households did this each year. In contrast, cereal-legume intercropping is practiced by fewer than 5% of households each year (Fig. 3). Approximately 22–23% of households produce legumes in other ways, e.g., via legume monocropping or rotating/intercropping legumes with non-cereal crops. Legume monocropping constitutes about 91% of other legume production. Among grain legume crops, groundnut is the most popular (53% of cereal-growing small-scale farm households grew groundnut in the 2013/14 agricultural year), followed by mixed beans (about 17% of households), soybean (7% of households), and bambara nut and cowpea (3% of households each).

4.2 Effects of cereal-legume rotation

For the much more common practice of cereal-legume rotation, we found more evidence of statistically significant effects on household welfare relative to cereal-legume intercropping. For cereal-legume rotation, we found a positive association with MAHFP but, as with the cereal-legume intercropping effects on HDDS, only in the 2SLS models (Table 3). The magnitudes of the cereal-legume rotation effects on MAHFP are much more plausible, however, than the magnitudes of the 2SLS cereal-legume intercropping effects on HDDS (Table 2). Recall that the IVs were generally strong in models where the full panel data were used, including the MAHFP models. In this case, adoption of cereal-legume rotation is estimated to increase MAHFP by 2.3 months, on average and with other factors constant, or by 2.9 months given a 1-ha increase in a household’s area under cereal-legume rotation. This is against a sample mean MAHFP of 10.4 months in 2013/14. We found no evidence of statistically significant cereal-legume rotation effects on the other indicator of food access, HDDS.

The results also suggest that a 1-ha increase in a household’s area under cereal-legume rotation increased its gross value of crop production overall and gross value of crop sales, although the results were mixed for the gross value of crops retained. The most robust results in Table 3 relate to the positive effects of cereal-legume rotation on calorie and protein production. For example, the FE results suggest that a 1-ha increase in a household’s area under cereal-legume rotation was associated with increases in production of 1342 cal and 45 g of protein per capita per day. These are substantial increases vis-à-vis the sample means of 5338 cal and 142 g of protein per capita per day.
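
To make "substantial" concrete, the FE estimates can be expressed as shares of the sample means. The short calculation below uses only the four figures quoted in the paragraph above; the dictionary keys and the function name are illustrative.

```python
# FE point estimates and sample means as quoted in the text
# (per capita per day).
effects = {"calories": 1342, "protein_g": 45}
sample_means = {"calories": 5338, "protein_g": 142}


def relative_effect(effect, mean):
    """Return the effect as a percentage of the sample mean."""
    return 100.0 * effect / mean


relative = {
    name: relative_effect(effects[name], sample_means[name])
    for name in effects
}

for name, pct in relative.items():
    print(f"{name}: {pct:.1f}% of the sample mean")
```

On these figures, a 1-ha increase in area under cereal-legume rotation corresponds to roughly a quarter of mean calorie production and just under a third of mean protein production.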

Overall, the evidence suggests that cereal-legume rotation has a significantly positive effect on food availability as measured through calorie and protein production, and on the gross value of crop sales. It may also improve MAHFP.

4.3 Effects of other legume production practices

Similar to the results for cereal-legume intercropping, we found relatively little evidence that production of other legumes affects household welfare (Table 4). Recall that other legume production refers to forms of legume production other than cereal-legume intercropping and cereal-legume rotation (mainly legume monocropping but also legume intercropping or rotation with non-cereals). The 2SLS models suggest negative effects of other legume production on HDDS, but this result does not persist in the FE models and the magnitudes of the 2SLS effects are implausibly large. For these and the other reasons highlighted for the 2SLS HDDS results for cereal-legume intercropping, we place very little emphasis on this result. Based on the weight of the evidence in Table 4, our conclusion is that other legume production has no robust positive or negative effects on household calorie/protein production, gross value of crops produced/sold/retained, or MAHFP. Unlike for the cereal-legume intercropping results, where we were concerned that the lack of statistically significant effects might be driven by low statistical power, low power is not a concern for the other legume production effects (or lack thereof) because more than 20% of Zambian small-scale cereal-growing households practice such legume production.

4.4 Testing the pathways of impact of cereal-legume rotation on food access

Based on the results above, we found some evidence that adoption of cereal-legume rotation may increase food access as measured by MAHFP. The results in Table 3 also suggest that this practice raises crop income (proxied by the gross value of crops sold) and food production (measured by calories and protein produced per capita per day). In our final set of regressions, we sought to understand if increases in crop income and food production are indeed associated with improvements in MAHFP. These are separate regressions of MAHFP on a vector of control variables and either: (1) gross value of crop production sold vs. retained, (2) overall gross value of crop production, (3) calorie production, or (4) protein production. We also estimated similar models with HDDS as the dependent variable. These regressions suggest that all of these measures of food production and crop income were positively and significantly associated with MAHFP (Table 5). These results, coupled with those in Table 3, suggest that the potential positive effects of cereal-legume rotation on MAHFP likely occur through both the food production and crop income pathways described in Fig. 1. In contrast, none of the food production and crop income variables had a statistically significant effect on HDDS. A potential explanation for this latter finding is that there is a substantial time lag between the reference period for the crop income and production variables and when the HDDS was measured (Fig. 2). By the time the HDDS was measured, any effects of food production or crop income from the previous year may have dissipated.
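
The two-step logic of this pathway test can be written schematically as follows. The notation is illustrative only and is not taken from the model section: $R_{it}$ stands for a household's cereal-legume rotation measure, $Z_{it}$ for one of the intermediate outcomes, and $\mathbf{x}_{it}$ for the controls. In specification (1) the sold and retained values enter together rather than as a single $Z_{it}$.

```latex
\begin{align}
Z_{it} &= \mu + \beta_Z\, R_{it} + \mathbf{x}_{it}'\boldsymbol{\pi} + v_{it}
  && \text{(rotation to intermediate outcome; Table 3)} \\
\text{MAHFP}_{it} &= \alpha + \theta\, Z_{it} + \mathbf{x}_{it}'\boldsymbol{\gamma} + \varepsilon_{it}
  && \text{(intermediate outcome to food access; Table 5)} \\
Z_{it} &\in \{\text{gross value of crops},\ \text{calorie production},\ \text{protein production}\}
\end{align}
```

Under this reading, a pathway is supported when both $\beta_Z > 0$ and $\theta > 0$ for the same $Z_{it}$. This is a consistency check across two sets of regressions rather than a formal mediation analysis.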

Table 5 Summary of main regression results for the effects of gross value of crop production and calorie and protein production on HDDS and MAHFP

4.5 Effects of cereal-legume intercropping

From the econometric results in Table 2, we found some evidence of positive cereal-legume intercropping effects on HDDS. However, given that: (i) this effect is only statistically significant in the 2SLS models; (ii) the IVs are weak in the first stage in this (HDDS) case; and (iii) the 2SLS estimates are implausibly large (suggesting increases of 10 and 24 units in HDDS, which itself only ranges from 0 to 12 and had a sample mean of 5.7 in 2013/14), we take this as very weak evidence, at best, of a positive cereal-legume intercropping effect on HDDS. We found no evidence of statistically significant effects of this practice on the other indicator of food access, MAHFP. Moreover, six of the eight parameter estimates for the calorie and protein production per capita per day models were not statistically different from zero. We similarly found no evidence of statistically significant cereal-legume intercropping effects on households’ gross value of crop production. However, when we decomposed the gross value of crop production into the gross value of crops retained for home consumption versus the gross value of crops sold, there was fairly robust evidence of cereal-legume intercropping effects. In particular, the adoption of cereal-legume intercropping was associated with an increase in the gross value of crops retained but a decrease in the gross value of crops sold.
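
The plausibility argument in point (iii) amounts to a bounds check: an estimated effect added to the sample mean should not push the score outside its possible range. The sketch below applies that check using only the figures quoted above; the function and constant names are illustrative, and evaluating the effect at the sample mean is a simplifying assumption.

```python
HDDS_MIN, HDDS_MAX = 0, 12   # range of the score, as stated in the text
HDDS_MEAN = 5.7              # 2013/14 sample mean, as stated in the text
ESTIMATES = [10, 24]         # 2SLS point estimates quoted in the text


def max_feasible_gain(mean, upper):
    """Largest increase that keeps an average household within the range."""
    return upper - mean


def is_plausible(estimate, mean=HDDS_MEAN, lower=HDDS_MIN, upper=HDDS_MAX):
    """True if mean + estimate stays inside the score's possible range."""
    return lower <= mean + estimate <= upper


headroom = max_feasible_gain(HDDS_MEAN, HDDS_MAX)
checks = {estimate: is_plausible(estimate) for estimate in ESTIMATES}

print(f"maximum feasible gain at the mean: {headroom:.1f}")
for estimate, ok in checks.items():
    print(f"estimate {estimate}: {'plausible' if ok else 'outside range'}")
```

A household at the sample mean can gain at most about 6.3 points, so both quoted estimates fall outside the feasible range.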

Overall, although adoption of cereal-legume intercropping appears to be associated with some shifts in the gross value of crop production that is retained versus sold, the results do not point to robust effects of the practice on household food access (HDDS and MAHFP) or food production and availability. It is important to keep in mind, however, that overall very few small-scale farm households in Zambia practice cereal-legume intercropping (fewer than 5% in both agricultural years in our study, although it was more common in some parts of the country). Thus, we likely had low statistical power for this practice, meaning we would only be able to detect large cereal-legume intercropping effects on household welfare.
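
The link between low adoption and low power can be illustrated with a back-of-the-envelope minimum detectable effect (MDE) calculation. The sketch below is a deliberate simplification: it treats the comparison as a two-group difference in means with equal variances, ignores the survey design, and does not reflect the FE or 2SLS estimators used in the analysis. The sample size is the reference population from the note to Fig. 3; the 5% and 40% adoption shares are rounded figures from the text (5% is an upper bound for intercropping), and the 5% significance level and 80% power are conventional assumptions.

```python
from math import sqrt
from statistics import NormalDist


def minimum_detectable_effect(n, adoption_share, sd=1.0, alpha=0.05, power=0.80):
    """MDE for a two-sided, two-group difference in means.

    With n * p adopters and n * (1 - p) non-adopters, the standard error
    of the difference is sd * sqrt(1 / (n * p * (1 - p))). The result is
    in units of the outcome's standard deviation when sd = 1.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_power = NormalDist().inv_cdf(power)
    standard_error = sd * sqrt(1.0 / (n * adoption_share * (1 - adoption_share)))
    return (z_alpha + z_power) * standard_error


N = 5381  # reference population reported in the note to Fig. 3

mde_intercrop = minimum_detectable_effect(N, 0.05)  # intercropping, upper bound
mde_rotation = minimum_detectable_effect(N, 0.40)   # rotation, approximate share
ratio = mde_intercrop / mde_rotation

print(f"MDE at 5% adoption:  {mde_intercrop:.3f} SD")
print(f"MDE at 40% adoption: {mde_rotation:.3f} SD")
print(f"ratio: {ratio:.2f}")
```

Under these assumptions, the smallest effect detectable for a practice adopted by 5% of households is a little more than twice the smallest effect detectable for one adopted by 40%, which is consistent with the caution expressed above about interpreting the null results for intercropping.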

5 Conclusions and policy implications

The value of grain legumes and the multiple roles they play in agricultural, environmental, food, and feed systems around the world are well recognized in the literature (e.g., Barrett 1990; Bationo et al. 2011; Ojiewo et al. 2015; Thierfelder et al. 2012). Legumes thus feature prominently in development strategies and discussions among researchers and practitioners on leveraging agriculture to achieve better food security and nutritional outcomes. Despite the strategic importance of legumes, few studies have rigorously examined the causal effects of legume-based cropping practices on household crop income, production, and food security outcomes and the specific pathways through which these effects occur. To begin to fill this gap, this paper set out to analyze the effects of three specific grain legume-based cropping practices (namely, cereal-legume rotations, cereal-legume intercropping, and other forms of legume production such as legume monocropping or intercropping or rotating legumes with non-cereal crops) on measures of food availability and access.

Overall, the results suggest that, at least in the context of small-scale cereal-growing households in Zambia, integrating grain legumes into production systems has varying effects across practices. Our results suggest that integrating legumes as an intercrop with cereals has little or no statistically significant effect on household welfare as measured by the indicators used here. This may partially explain the low adoption of this practice by farmers in Zambia (fewer than 5% of small-scale farmers practice cereal-legume intercropping). However, the low adoption of cereal-legume intercropping itself also implies low statistical power to detect the effects of this practice, and thus our finding of no statistically significant effects does not necessarily mean that cereal-legume intercropping in fact has no effect on household welfare in Zambia. Such effects may indeed exist but they may be too small for us to detect. We also only measured the current-year effects of cereal-legume intercropping; it is quite possible that, through its positive effects on soil fertility, cereal-legume intercropping positively affects food production, availability, or access in future years. We also found little evidence that legume production via legume monocropping or legume rotations or intercropping with non-cereals affects household welfare. More than 20% of small-scale farm households in Zambia produce legumes in one of these ways, so these findings are less likely to be driven by low statistical power.

In contrast, cereal-legume rotation was strongly and positively associated with crop income and food production, and there is some evidence that it improves households’ food access through both the food production and crop income impact pathways. Households that rotate cereals with legumes reap the benefits of having more revenue from crop sales, and more calories and protein to eat; they may also have sufficient food in more months of the year than households that do not practice cereal-legume rotation. Approximately 40% of small-scale farm households in Zambia practice cereal-legume rotation each year, so the practice is fairly common but far from universal. This implies there is scope for increases in adoption, and our results suggest that promoting wider adoption of cereal-legume rotations could contribute to improved household food availability and possibly food access in Zambia. From a policy perspective, these findings on cereal-legume rotations give credence to recent development efforts that promote this practice as part of a strategy of leveraging agriculture to achieve multiple development goals. Our results do not suggest which methods of promotion would be most cost-effective but this is an important area for future research. One possibility is that the Zambian government through its extension service as well as NGOs and private sector actors working in the agricultural sector could share information about the benefits of cereal-legume rotations more widely so that farmers may take up this practice where it is feasible for them to do so. Moreover, researchers at the Zambia Agriculture Research Institute together with social scientists could investigate the specific types and lengths of cereal-legume rotations that are the most welfare-enhancing for Zambian smallholders.

Further research is also needed to understand the low prevalence of cereal-legume intercropping among small-scale farmers in Zambia. The country is more land-abundant than some other countries in the region where cereal-legume intercropping is common (e.g., Malawi and Kenya). Having relatively larger land sizes than their Malawian or Kenyan counterparts may afford Zambian farmers the luxury of being able to rotate their cereals with legumes rather than having to rely on intercropping to incorporate legumes into their production systems. Explicit messages from government extension in Zambia encouraging rotation over intercropping may also explain the low prevalence of cereal-legume intercropping (MAL 2012). Future efforts could seek to identify and promote specific cereal-legume intercrops that meet the needs of Zambian farmers. Finally, to guide future efforts in the promotion of these technologies among Zambian farmers, further research is needed to assess whether there are complementarities or substitution effects between different ways in which legumes can be integrated in smallholder cropping systems.