1 Introduction

East African agricultural soils suffer from degradation that includes erosion, loss of organic carbon, low nutrient status, poor structure, and reduced water holding and transmission capacity (Tully et al. 2015). This degradation is widely acknowledged as the major biophysical limitation to agricultural production on the African continent (Vanlauwe and Giller 2006), contributing to declining per capita food production on smallholder farms and low climate change resilience (Sennhenn et al. 2017). Scientists and farmers have considered many different strategies for reversing soil degradation including integrated soil fertility management (Chianu et al. 2012), agroforestry (Mokgolodi et al. 2011), Conservation Agriculture (Thierfelder et al. 2015), and others (Taylor et al. 2021).

Conservation Agriculture (CA) is based on the principles of reduced soil disturbance (no-till or minimal till), permanent soil cover (with mulch and/or living plants), and increased crop diversity (via intercropping, cover cropping, and/or crop rotation) (Thierfelder et al. 2015). Scientists from the CGIAR and other research institutions have conducted field research on CA since the 1980s (Thierfelder et al. 2009). Results demonstrate that CA is especially positive for yield under drier growing conditions (Steward et al. 2018), where soil surface mulches (“permanent soil cover”) increase water infiltration (Stroosnijder 2009) and reduce evaporative water losses (Mupangwa et al. 2007). Residue retention can also increase the level of microbial biomass carbon (Limon-Ortega et al. 2006) and promote beneficial soil microflora (Govaerts et al. 2008). Increased soil carbon with CA has been observed in several countries including Ethiopia (Lanckriet et al. 2012; Liben et al. 2018) and Malawi (Mloza-Banda et al. 2016), while improved water infiltration reducing ponding and surface crusting was observed in Malawi (Ngwira et al. 2012) and Zimbabwe (Thierfelder et al. 2013).

Farmers have a strong awareness of their land and a deep understanding of their challenges, and they are important evaluators of new methods through an adaptive learning process (Sumberg et al. 2003). Tully et al. (2015) state: “Clearly the goal is to reverse degradation, and therefore farmer perceptions must not be overlooked, as they are a primary actor on agricultural landscapes. Farmers provide invaluable information on the location and type of degradation they observe on their lands as well as describe solutions.” This view is strongly supported by Corbeels et al. (2014), though they caution that CA techniques should be tailored to local contexts rather than applied as a set of prescriptive steps. Many authors also point out that farmers and scientists measure soil degradation differently, with farmers relying more on visual assessments of crop performance and yield (qualitative measures) and scientists on chemical analyses (quantitative measures). Studies in Ethiopia (Karltun et al. 2011; Tesfahunegn et al. 2011), Botswana (Stringer and Reed 2007), and Malawi (Hermans et al. 2021) have shown a strong correlation between qualitative assessments by farmers (maize yield or weeds) and quantitative assessments by scientists and technicians (scientific measures of soil organic matter, SOM).

Farmer participation in agricultural research is a form of citizen science that allows for synergies between academic and local knowledge, provides the opportunity for farmers’ ideas and approaches to be tested (Salomons et al. 2018), and increases the likelihood of research results being of practical use to farmers (Beza et al. 2017). In the Central Kenyan Highlands, farmers used qualitative observations such as crop growth, “ease of tillage” (analogous to the hoe test in the present study), moisture retention, and soil colour as diagnostic indicators of soil productivity (Murage et al. 2000). Karltun et al. (2011) observed a strong correlation between soil organic matter content and farmers’ ranking of soil fertility based on colour and softness of soil samples in Ethiopia. Adeyolanu and Ogunkunle (2016) reported similar correlations between soil health from farmer qualitative ratings and descriptions and quantitative (lab) soil quality assessments in Nigeria. Based on their research with 69 farm households in Eastern Kenya, where farmers’ knowledge constituted a consistent and logical classification of soil quality, Wawire et al. (2021) concluded that local soil knowledge should be developed in conjunction with scientific soil methodologies to benefit resource-poor smallholder farmers. Tesfahunegn et al. (2011) concluded that local farmers not only correctly identified soil degradation in northern Ethiopia but also demonstrated their capability to suggest appropriate land management solutions for specific problems. Hermans et al. (2021) developed a framework for integrating local and scientific knowledge and stressed the importance of understanding whether farmers experience improvements and how they view CA benefits in terms of overall household demands and other social circumstances.

While citizen science can increase the sample size of research, as described above, there are drawbacks. Using the example from bird surveying, Dickinson et al. (2010) point out that trained volunteers were not as good as professionals at detecting low densities of specific bird and insect species. In an example of citizens monitoring crabs, Delaney (2009) found that older volunteers with at least 2 years of university education were better able to correctly identify both the species and age of crabs. These concerns may not apply to farmers who have worked their own land for many years; they would be considered the experts. Fitzpatrick et al. (2009) found that volunteers performed better when accompanied by professionals, suggesting that ongoing, personalized training is important. However, the presence of such “accompanying professionals” may introduce bias by influencing the farmers’ perceptions.

This study used a participatory model that assumed equal importance of local indigenous and scientific knowledge. The first objective was to determine the effect of on-farm CA on soil health. We hypothesized that farmers’ descriptions and assessments would be able to detect soil changes due to CA adoption. The second objective was to compare farmers’ soil health descriptions based on qualitative indicators with quantitative scientific soil health measures (pH and respiration). Based on previous research that supported the value of farmer indigenous knowledge to identify healthy soils (Hermans et al. 2021), we hypothesized that both qualitative and quantitative soil health indicators would show improvement under CA management. The study focussed mainly on soil parameters, though a subset of data was available to measure the effects of on-farm CA on crop yields.

2 Methodology

2.1 Site description/background

The East African highlands represent an important agricultural region with about 80% of the population living in rural areas and dependent on farming, fishing, and aquaculture for their livelihoods. In this context, we considered three areas for our study: the Wolaita Zone of south-central Ethiopia (elevation 1,600 to 2,100 m); Muranga County in central Kenya (elevation 1,610 m); and the Dodoma district of Tanzania (elevation 1,610 m). Soil health assessments were conducted on smallholder farms that had been practicing CA for up to 8 years, with assessments between 2017 and 2019 in Kenya and Ethiopia, and between 2018 and 2020 in Tanzania. Average annual temperatures are 19.3 °C in Wolaita Zone of Ethiopia, 20.7 °C in Muranga County, Kenya, and 22.6 °C in the Dodoma district of Tanzania. The Wolaita region has an average annual rainfall of about 1,500 mm/year, which falls mostly in a bimodal distribution between March and October. Wolaita is prone to drought, however, and the soils in the region (mostly Nitisols) are highly degraded and eroded (Bekele et al. 2018). Muranga County has a bimodal rainfall pattern, with an average of 983 mm/year falling in two rainy seasons (mid-March to May and mid-October to December) (Kibunja et al. 2012). Soils in the study area are also Nitisols and tend to be clayey and of moderate fertility. The Dodoma region is semi-arid with sandy, infertile soils and a unimodal rainy season with an average of 600 mm of rain falling mostly between December and April (Shemsanga et al. 2018). Soil texture was assessed at each field site using the Canadian Foodgrains Bank soil testing guidelines for hand-texturing (Soil testing guidance 2021). The most common soil textures on the farms in Ethiopia were silty clay, silty clay loam, loamy sand, and clay loam. In Kenya, most fields had sandy loam, sandy clay loam, or loam textures. In Tanzania, all fields had a high sand content (sandy or loamy sand texture). Thus, the study was conducted across a range of soil textures from clay and loam soils in Kenya and Ethiopia to sandy soils in Tanzania.

2.2 Farmer selection and crop management

Farmers were selected for inclusion in the study based on their participation in the Canadian Foodgrains Bank’s “Scaling up CA” program. This program provided technical advice, but no financial or other (e.g., seed, fertilizers) support. In addition, farmers had to have both CA and conventionally tilled (CT) land under their management to allow direct comparison of their plots.

Soil health assessments were carried out on 23 farms in Ethiopia, 23 farms in central Kenya, and 13 farms in central Tanzania. On each farm, CA fields were paired with conventionally tilled (CT) fields in close proximity (within metres) and with identical soil texture. To qualify as a CA field, at least two CA principles (minimal soil disturbance, soil cover, and crop diversity [intercropping or cover cropping]) had to be practiced on a given field. CT fields could use at most one CA principle. Additional recommended agricultural practices, like increased fertility inputs and improved plant spacing, were promoted alongside CA, sometimes called “CA-Plus.” These practices may have contributed to the positive effects of CA, although, at least in Ethiopia, the improved practices were used on both CA and CT plots. Fields had been under CA management for 1–6 years in the first year, 2–7 years in the second year, and 1–8 years in the third year of the study. This variation is because some farms had already been practicing CA prior to the start of the study. Repeated sampling was conducted for 3 consecutive years on all but one farm in each of Tanzania and Ethiopia. In Kenya, repeated sampling for 1, 2, and 3 consecutive years was conducted on 23, 16, and 13 farms, respectively. Of the 59 farms, only 3 did not participate for the duration of the study; these 3 farms discontinued CA practices during the study period and were replaced with 3 other CA farms. GPS locations, elevation, soil texture, and management practices for each farm were recorded.

Maize (Zea mays L.) was the main crop grown on both CA and CT fields in Ethiopia and Kenya, and pearl millet (Pennisetum glaucum (L.) R.Br.) was the main crop in Tanzania. The most common intercrops were lablab (Lablab purpureus (L.) Sweet), pigeon pea (Cajanus cajan (L.) Walp.), cowpea (Vigna unguiculata L.), beans (Phaseolus vulgaris L.), and jack bean (Canavalia ensiformis L.). Occasionally, alternatives to maize or millet were grown as the main crops, such as cassava (Manihot esculenta Crantz), groundnut (Arachis hypogaea L.), sorghum (Sorghum bicolor (L.) Moench), or phaseolus beans. All CA plots used limited soil disturbance, usually planting stations, with occasional use of ox-drawn rippers, whereas CT fields were ploughed or hand hoed before seeding. The vast majority of CA fields had mulch applied. In Tanzania, almost all CA plots were intercropped with green manure cover crops (often lablab) and 9 of the 13 farms had manure and wood ash applied. In Kenya, about one-third to three-quarters of farmers intercropped depending on the year. In Ethiopia, over half of CA plots had intercrops in the second 2 years of the study. The plots in Ethiopia often had compost added and occasionally chemical fertilizer. Some CT fields in Kenya and Tanzania also had intercrops. In Kenya, there were generally two crop cycles per year, and thus those fields had been under twice as many CA crops as fields in the other countries. This variation in crops and CA practices gives a realistic sample of how CA is being practiced by smallholder farmers in these regions. While the variation in farming practices may introduce some confounding factors to our analysis, we believe it is outweighed by the benefit of evaluating CA as actually practiced by farmers outside of a controlled research setting.

Sampling occurred in late April and early May 2017–2019 in Ethiopia; May in 2017 and 2018 and late May to mid-June in 2019 in Kenya; late May in 2018, and in March in 2019 and 2020 in Tanzania.

2.3 Soil health survey

Our soil health assessment followed the Cornell protocol (Moebius-Clune et al. 2016). However, it focused on practical hands-on assessment methods that could be done in the field with input from farmers, and with minimal specialized equipment. This interaction between field staff and farmers enhances soil quality measurements with on-farm knowledge (Adeyolanu and Ogunkunle 2016) and includes farmers’ observations of their soil and crops, which they routinely monitor (Murage et al. 2000). Farmers commonly use soil indicators that they can directly sense (e.g., see, feel, or smell) including crop characteristics and soil colour, and these can be accurate indicators of soil health (Mairura et al. 2007). A review of the literature found that the most common soil recognition identified by farmers was related to topsoil characteristics, especially soil colour and texture (Huynh et al. 2020).

Most of the soil health indicators used in the survey were qualitative and evaluated based on ratings from one to five, with five being the most desirable and one being the lowest (worst) level of soil health. The protocol was developed with input from the Conservation Agriculture Technical Staff of the Canadian Foodgrains Bank to ensure it was practical and appropriate for the local conditions and farming community. Training in the soil assessment method was provided to local partners and shared with the farmers during the interactive soil health surveys. Special emphasis was placed on honest reporting so as not to bias towards the CA practice. Farmers received no remuneration for either the CA project or conducting field assessments.

2.3.1 Qualitative assessment

Qualitative soil health assessments were done by farmers working together with Foodgrains Bank field staff and partner organization staff in the field. Four assessments were done for soil physical characteristics (Table 1). The “hoe test” describes the ease of digging in the soil and was conducted by hoeing ten planting stations in each plot. “Soil structure” was assessed by digging up a spade of soil and observing the presence and size of aggregates and clods. “Soil compaction” was assessed by digging and consulting with the farmer about the presence of hard pans and evidence of restricted root growth. Ten sampling points per field were used. The level of “soil crusting” was determined via farmer consultation, as well as inspection when possible: farmers were asked whether the soil tended to seal and form a crust in a dry period after a heavy rain, and how this affected seedling emergence. The data was recorded by project staff directly in the field.

Table 1 Qualitative soil health assessment ratings guide. Assessment scores range from one, which indicates the poorest level of soil health, up to five, which indicates the most desirable conditions for soil health.

Closely related to the soil’s physical characteristics, soil water movement and holding capacity were also assessed on a 1–5 rating scale. “Water movement” was rated by observation and asking farmers to describe how water moved over the land after rainfall; for example, if there was runoff or ponding. “Water holding capacity” was evaluated by considering the depth and texture of the topsoil and consulting with the farmer to learn how long the soil would hold water during a dry spell following a rain event. “Soil smell” was determined by taking a handful of soil, smelling it, and characterizing the smell as a fresh earthy smell (rating of 3) to no smell (1). Soil smell was rated out of 3 instead of 5.

Crop health is an important indicator of soil health, and as such both above-ground and below-ground plant growth were assessed. “Growth and colour” of crops were rated based on a visual assessment of plants, looking for stand uniformity and signs of stress and nutrient deficiency. “Root systems” were evaluated by digging around several plants to observe the roots, particularly looking for fine roots and evidence of restricted growth. As these crops were important to the farmers’ livelihoods, plants could not be pulled up and care was taken not to destroy them. If a direct assessment of the crop was not possible at the time of sampling, either because the crops had just been seeded or were dry and ready for harvest, crop colour and growth ratings were determined by consultation with the farmer.

2.3.2 Quantitative (validation) assessment

Quantitative measurements of pH and microbial respiration were taken at the same time as the qualitative assessment as a way to validate farmer information with scientific data. Five to ten cores were taken from each CA and CT field by Foodgrains Bank field staff in a random, zig-zag pattern and then bulked for pH and microbial respiration analysis. Soil pH was determined using a calibrated Spectrum SoilStik (https://www.specmeters.com/nutrient-management/ph-and-ec-meters/ph/) at both 0–15 cm and 15–30 cm depths, with well-wetted soil. Microbial respiration in the top 0–15 cm of soil was measured using Solvita Field Test Kits (https://solvita.com/fieldtest/). Subsamples were wetted and incubated with a Solvita gel paddle for 24 h and then read with a Solvita gel digital colour reader.

Soil colour was also determined from the damp soil samples (0–15 cm) using the Munsell Soil Colour Chart (Munsell Color (Firm) 2010). The single colour attribute “value” from the Munsell colour system was used for statistical analysis. The value ranges from dark to light (0 being black and 10 being white), with lower-value (darker) soils tending to have higher organic matter content (Schulze et al. 1993; Spielvogel et al. 2004). The colour charts were created by converting Munsell colour system colours to RGB (red, green, blue) colour codes and then selecting these colours in Microsoft PowerPoint. For these charts, only the data from the most recent soil assessment were used, which was 2019 for Ethiopia and Kenya, and 2020 for Tanzania. These charts give a visual sense of colour changes and complement the statistical analysis.

2.4 Data analysis

Not all soil health measurements were conducted on each farm each year for various reasons, including equipment malfunction and stage of plant growth (too young or too mature for assessment). In these cases, the farm was removed from the analysis of that metric for that year. If only one field in a pair of CA and CT fields had an analysis performed, both fields from the pair were removed for that metric that year.
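The paired-removal rule above can be illustrated with a minimal sketch. The record layout, farm names, and values are hypothetical and serve only to show the logic; this is not the procedure actually used to prepare the data for SAS.

```python
from typing import Optional

# Hypothetical record layout: (farm, year, treatment, value).
# A value of None marks a measurement that could not be taken.
Record = tuple[str, int, str, Optional[float]]


def complete_pairs(records: list[Record]) -> list[Record]:
    """Keep only farm-year pairs in which both CA and CT were measured."""
    observed: dict[tuple[str, int], set[str]] = {}
    for farm, year, treatment, value in records:
        if value is not None:
            observed.setdefault((farm, year), set()).add(treatment)
    return [
        r for r in records
        if r[3] is not None and observed.get((r[0], r[1]), set()) >= {"CA", "CT"}
    ]


# Illustrative pH readings for one metric in one year (made-up numbers).
records = [
    ("farm_a", 2018, "CA", 6.1),
    ("farm_a", 2018, "CT", 5.8),
    ("farm_b", 2018, "CA", 6.4),
    ("farm_b", 2018, "CT", None),  # e.g. equipment malfunction on the CT field
]
kept = complete_pairs(records)  # only the farm_a pair survives
```

Dropping the whole pair keeps the comparison balanced, so that every CA observation in the analysis has a CT counterpart from the same farm and year.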

2.4.1 Qualitative data

The data from the qualitative soil health indicators is on a rating scale; specifically, it is a “fully anchored rating scale” (Harpe 2015). When working with rating scales, Harpe (2015) recommends using non-parametric tests when the rating data consist of individual rating items, have numerical response formats, and have four or fewer categories. Although most rating scales in the present study had 5 categories, often not all 5 categories would be selected for a given soil measurement/year/place combination. An approach based on the logarithm of the odds ratio was selected because it allows for the inclusion of random effects when modeling categorical data. This means that the farms can be treated as a block and accounted for as a random effect in the model. The test gives an odds ratio estimate, which shows the relative difference between the two treatments: it expresses the odds that one treatment will rank better on the rating scale than the other. This type of data and analysis is more common in academic disciplines that regularly use surveys, such as medicine (e.g., pain ratings) and psychology.

Rating data were analyzed using PROC GLIMMIX in SAS version 9.4 (SAS Institute Inc. 2016) using a multinomial response model with random effects (Kiernan 2018). A cumulative logit proportional-odds model was fit to the data, to assess treatment effects in terms of cumulative odds ratios. Treatment (CA or CT) was the fixed effect, and the farm was treated as a block, which was the random effect. The multinomial response model of the rating data only included categories (ratings 1–5) that had scores in the given soil health measure. If one or more rating scores did not occur in either treatment in a given soil health test, it/they would be excluded from the model.
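A minimal sketch of a cumulative logit proportional-odds model with a random farm effect may help make this structure concrete. The notation is illustrative and assumed, not taken from the SAS output: Y_ij is the rating on farm j under treatment i, x_ij equals 1 for CA and 0 for CT, θ_k are the category cut-points, β is the treatment effect, u_j is the random farm (block) effect, and K is the number of rating categories observed.

$$ \log\frac{P(Y_{ij}\le k)}{P(Y_{ij}>k)}=\theta_k+\beta x_{ij}+u_j,\qquad k=1,\dots,K-1,\qquad u_j\sim N(0,\sigma_u^2) $$

$$ \text{Odds ratio (CA vs. CT)}=e^{\beta} $$

Under the proportional-odds assumption, the same β applies at every cut-point k, so a single odds ratio summarizes the treatment effect. Whether this odds ratio refers to the odds of lower or higher categories depends on the category ordering and the sign convention of the software, so the direction should be checked against the coded rating scale.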

Countries were analyzed separately because the purpose of the study was to determine whether CA improved soil health specifically in each of the three regions. The purpose was not to see if CA improved soil health generally on a larger scale. For example, weather conditions (like drought) vary between years and we wanted to know each year if there was an effect of CA. The effect of CA relative to CT is known to be affected by temperature and precipitation events (Steward et al. 2018). A preliminary analysis combining years and including years as a factor in the model showed that years had a significant effect on some of the soil health tests in some regions, which further supported the analysis of years individually.

2.4.2 Quantitative (validation) data

Data from quantitative soil health tests were analyzed using PROC MIXED in SAS version 9.4 (SAS Institute Inc. 2016), with treatment (CA or CT) as a fixed effect and farm as a random factor. The repeated / group= statement was used to account for heterogeneity of variance, and the model with the lowest Akaike information criterion value was selected. If residuals in the microbial respiration or colour data were not normally distributed (according to the Shapiro-Wilk test in the univariate procedure), the data were log10 transformed. In four cases the residuals for the colour value data remained non-normally distributed after transformation, and this is noted in the results. In two cases the residuals for the pH data were not normally distributed (Shapiro-Wilk W statistics of 0.84 and 0.87), and the data were not transformed because pH is already on a log scale.

Linear regression (Data Analysis tool in Microsoft Excel 365) was used to examine how microbial respiration and pH (0–15 cm depth only) related to the number of years in CA. The number of years in CA was the predictor (x) variable, and the differences in microbial respiration and pH between CA and CT pairs (CA – CT) were the response (y) variables.
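The regression set-up described above can be sketched as follows. This is an ordinary least-squares fit written out by hand, not the Excel tool used in the study, and the years and pH values are made-up numbers chosen only to show how the paired difference (CA – CT) serves as the response.

```python
def ols_fit(x: list[float], y: list[float]) -> tuple[float, float, float]:
    """Simple linear regression of y on x; returns (slope, intercept, r_squared)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = sxy ** 2 / (sxx * syy)
    return slope, intercept, r_squared


# Hypothetical paired observations, one entry per farm.
years_in_ca = [1, 2, 3, 4, 6]
ph_ca = [5.9, 6.0, 6.2, 6.3, 6.6]
ph_ct = [5.8, 5.8, 5.9, 5.9, 6.0]

# Response variable: within-farm difference between the paired fields.
ph_difference = [ca - ct for ca, ct in zip(ph_ca, ph_ct)]
slope, intercept, r_squared = ols_fit(years_in_ca, ph_difference)
```

A positive slope in such a fit would suggest that the gap between CA and CT fields widens with additional years under CA; working with within-farm differences removes farm-to-farm variation in baseline pH or respiration.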

2.4.3 Radar charts

Radar charts were used to display results from all the tests that comprise the soil health assessment. Values shown are all relative to CT (=100%). The median was used for qualitative soil tests and the mean for the quantitative tests. The colour value scale was reversed to make 10 the darkest and 0 the lightest to fit with the other variables, where higher numbers indicated better soil health. Several of the tests use different scales, so individual tests need to be considered when observing the relative differences in the diagrams. For example, microbial respiration, measured in kg CO₂ ha⁻¹, is on a much larger scale with large differences, whereas pH changes, although very slight, could still be important.
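A minimal sketch of how such relative values might be computed is given below. It assumes that each axis is the CA summary statistic divided by the CT summary statistic; the function names and the ratings are hypothetical.

```python
from statistics import mean, median


def reverse_colour_value(value: float) -> float:
    """Flip the Munsell value scale (0-10) so that darker soil scores higher."""
    return 10 - value


def relative_to_ct(ca: list[float], ct: list[float], qualitative: bool) -> float:
    """Express the CA summary as a percentage of the CT summary (CT = 100%).

    The median is used for qualitative ratings and the mean for
    quantitative measurements.
    """
    summarise = median if qualitative else mean
    return 100.0 * summarise(ca) / summarise(ct)


# Made-up hoe-test ratings (1-5 scale, 5 = best).
hoe_test = relative_to_ct([4, 5, 4], [2, 3, 2], qualitative=True)

# Made-up Munsell values, reversed before averaging.
colour = relative_to_ct(
    [reverse_colour_value(v) for v in [3, 3, 4]],
    [reverse_colour_value(v) for v in [4, 5, 4]],
    qualitative=False,
)
```

Because each axis is scaled to its own CT baseline, a given percentage change does not carry the same practical weight on every axis, which is why the individual tests need to be read separately.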

The qualitative soil assessments were done initially with the ratings reversed (i.e., 1 being the best and 5 being the least desirable conditions for soil health) as this is how the raw data were reported. For the figures and text of this report, the rating scale has been reversed, because it seems more intuitive to have the highest number as the best rating.
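The reversal amounts to a simple mirror of the scale. The sketch below assumes the mapping is scale maximum plus one, minus the raw score; the function name is hypothetical and the text above does not state the formula explicitly.

```python
def reverse_rating(raw: int, scale_max: int = 5) -> int:
    """Mirror a rating so that the highest number becomes the best score."""
    if not 1 <= raw <= scale_max:
        raise ValueError(f"rating {raw} is outside 1..{scale_max}")
    return scale_max + 1 - raw


# Raw field scores (1 = best) converted to reported scores (5 = best).
raw_scores = [1, 2, 3, 4, 5]
reported_scores = [reverse_rating(r) for r in raw_scores]
```

The middle category is unchanged by the reversal, and applying the function twice returns the original score.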

2.5 Yield survey

Yield was not systematically measured in the soil health survey. However, yield can be considered an indication of overall soil health and, in addition, is extremely important to farmer perceptions of CA’s effectiveness. For this reason, we include a subset of data collected by the Canadian Foodgrains Bank in a separate yield survey of CA farmers which included the three regions in the soil health survey. There was some, but not complete, overlap between farmers participating in the yield survey and farmers participating in the soil health survey. Unfortunately, due to issues with data collection in Ethiopia, only data from the Muranga region in Kenya and the Dodoma region in Tanzania were available. Farmers with a minimum of 0.25 ha of land in both CA and CT participated in the yield survey. Yields were measured from the 0.25 ha plots using 100 kg harvest bags. Crops grown, and whether crops were intercropped or monocropped, varied between farmers and between regions. In Muranga, 3 farmers monocropped maize in both CA and CT; 8 farmers intercropped maize and beans in both CA and CT; 1 farmer intercropped maize and cowpea in both CA and CT; and 1 farmer intercropped maize and Irish potato in both CA and CT (total 14 farms). In Dodoma, 9 farmers intercropped pearl millet and cowpea in CA and monocropped millet in non-CA; 8 farmers intercropped sorghum and lablab in CA and monocropped sorghum in non-CA (total 17 farms). District average yield data were not available for lablab in Dodoma, so the two groups were analysed separately. Millet-cowpea CA and millet non-CA yields are compared using the “composite yield” method, while sorghum-lablab yields are summed and compared directly with sorghum CT yields.

Data from Muranga and Dodoma were analysed separately. In order to compare different crops grown within and between fields, a “composite yield” approach was taken where yields of each crop were normalized to county average yields for that crop (Eq. 1), such that a “composite yield” of 1 indicates that the field is roughly as productive as the county average. In the Dodoma district, average yield data were not available for lablab, so intercrop yields including lablab were simply summed and compared in kg ha⁻¹. Statistical analysis was performed using PROC GLIMMIX in SAS 9.4, with farmer included as a random variable and statistical significance determined at p < 0.05.

$$ \text{Composite yield}=\frac{\text{intercrop yield of crop 1}}{\text{county average yield of crop 1}}+\frac{\text{intercrop yield of crop 2}}{\text{county average yield of crop 2}} $$
(1)
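As a worked illustration of Eq. 1, the sketch below converts harvest bags from a 0.25 ha plot to kg per hectare and then computes a composite yield. The bag counts and county averages are invented for the example, and the conversion assumes every bag is filled to its nominal 100 kg.

```python
BAG_KG = 100.0   # nominal harvest bag weight
PLOT_HA = 0.25   # plot area used in the yield survey


def bags_to_kg_per_ha(bags: float, bag_kg: float = BAG_KG, plot_ha: float = PLOT_HA) -> float:
    """Convert a bag count from one plot to a yield in kg per hectare."""
    return bags * bag_kg / plot_ha


def composite_yield(crop_yields: dict[str, float], county_averages: dict[str, float]) -> float:
    """Sum each crop's yield normalized to the county average for that crop (Eq. 1)."""
    return sum(crop_yields[crop] / county_averages[crop] for crop in crop_yields)


# Hypothetical maize-bean intercrop: 3 bags of maize and 1 bag of beans.
field = {
    "maize": bags_to_kg_per_ha(3),  # 1200 kg per ha
    "beans": bags_to_kg_per_ha(1),  # 400 kg per ha
}
# Hypothetical county averages in kg per ha.
county = {"maize": 1500.0, "beans": 500.0}

score = composite_yield(field, county)
```

For a monocrop the sum has a single term, so the same function covers both cases; a score above 1 indicates a field more productive than the county average for its crop mix.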

3 Results and discussion

3.1 Qualitative data

Data for each qualitative parameter in each country and year showed that in most cases, CA fields received a higher rating than CT fields (data not shown). In order to assess the differences statistically, the odds ratio was calculated for each parameter in each year (Table 2).

Table 2 Odds ratio estimates and p values for soil health tests over 3 years at farms in Wolaita Zone, Ethiopia; Muranga County, Kenya; and Dodoma District, Tanzania. The odds ratio estimate (exponentiated estimate) indicates the relative difference between the Conservation Agriculture (CA) and conventionally tilled (CT) treatment groups. It shows the odds of the CA treatment group being rated higher (i.e., better) in the soil health categories relative to the CT treatment. The p values show the significance of the logarithm of the odds ratio obtained from comparing the CA and CT treatment groups. p values lower than 0.05 are italicized.

In Wolaita Zone, Ethiopia, the hoe test, soil structure, soil compaction, and water movement had significant log odds ratios for all 3 years. This indicates that fields under CA had a higher probability of rating better (higher on the 5-point scale) than CT fields in all of these categories each year. The exponentiated estimate or odds ratio shows the relative difference between the two treatments (Table 2). For example, the odds of CA being in the higher (better) soil structure category compared to CT fields in 2018 were 41.6:1. The odds ratios show that, with CA practices, the odds of improvement in these soil health measures ranged from 18.6:1 to 157:1 relative to CT. Soil crusting, water holding, and crop growth and colour had a higher probability of rating better under CA management in Ethiopia in 2018 and 2019 but not in 2017. This may suggest improvement in soil health indicators related to the number of years under CA. Root system health had a significant odds ratio in 2018 only, and soil smell in 2019 only. In summary, each soil health measure was improved with CA in at least 1 year in Ethiopia, indicating that the soil health measures can all be responsive to CA practices.
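To give a sense of what an odds ratio of this size means in terms of probabilities, the sketch below applies the 41.6:1 odds ratio for soil structure to a baseline probability. The baseline of 0.2 (a CT field having a 20% chance of falling in the higher rating categories) is a hypothetical value chosen for illustration; it is not an estimate from the study.

```python
def shift_probability(baseline_p: float, odds_ratio: float) -> float:
    """Apply an odds ratio to a baseline probability and return the new probability."""
    if not 0.0 < baseline_p < 1.0:
        raise ValueError("baseline probability must lie strictly between 0 and 1")
    baseline_odds = baseline_p / (1.0 - baseline_p)
    shifted_odds = baseline_odds * odds_ratio
    return shifted_odds / (1.0 + shifted_odds)


# Hypothetical baseline for CT fields combined with the reported odds ratio.
ct_probability = 0.2
ca_probability = shift_probability(ct_probability, odds_ratio=41.6)
```

With these inputs the CT odds of 0.25 become CA odds of about 10.4, which corresponds to a probability of roughly 0.91. The same odds ratio produces a different change in probability for a different baseline, which is why odds ratios and probabilities should not be used interchangeably.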

In Muranga County, Kenya, the four soil physical characteristics and the two crop characteristics (crop growth and colour, and root systems) all had significant odds ratios in 2018 and 2019 but not in 2017 (Table 2). The odds of improvement in these soil health measures with CA practices relative to CT ranged from 35:1 to 6,894:1. Water movement had a significant odds ratio in 2018, and soil smell in 2017 and 2018. Water holding was the only soil health test that did not show a difference between CA and CT fields in any year. The year 2018 showed improvement with CA in the largest number of soil health measures (7 out of 9 indicators). Soil smell was the only indicator with a significant odds ratio in 2017. Therefore, we detected an improvement in soil health in CA relative to CT from 2017 to 2018, though the annual difference was not tested statistically.

Fewer benefits of CA on qualitative soil health measures were observed in Dodoma District, Tanzania, than in the other two countries (Table 2). The odds ratios show that there were improvements in soil crusting and water movement with CA relative to CT in 2018, and in soil structure and soil crusting in 2019. In 2020, there was less difference between CA and CT fields, with no significant odds ratios. p values between 0.05 and 0.06 were observed for crop colour and growth and water holding in 2019 and 2020, and for soil smell in 2020, with CA scoring higher than CT.

These qualitative data indicate improved soil physical and water properties with CA as perceived by participating farmers, and they appear to be supported by the literature. For example, increased water infiltration rates along with reduced evaporation from mulch/residue retention have been found to be important benefits of CA, which can increase soil moisture and thus increase the resilience of the system to dry spells and heat stress (Thierfelder et al. 2017).

The health of crops is particularly important to farmers, as well as a sensitive indicator and integrator of soil quality. An improvement in crop health in CA relative to CT was found in Kenya and Ethiopia in the second 2 years of the study. This may be related to improved soil structure and consequent water movement as well as moisture retention and improved nutrient cycling from mulching. It could also be impacted by the CA-Plus technology of precision planting and manure addition. The effect of CA on crop growth has been shown to be variable, with CA having relatively better yields than CT in conditions of lower moisture (Nyamangara et al. 2014) and lower fertility (Thierfelder and Wall 2012) and in drought and heat stress (Steward et al. 2018). Where CA tends to perform poorly relative to CT in terms of yield is with high precipitation and poorly draining soils (clays with poor aggregation or sands with an impermeable layer not far from the surface), leading to waterlogging (Thierfelder and Wall 2012; Nyamangara et al. 2014). We did not correlate our results with prevailing environmental conditions.

Soil smell has not been previously used in farmer assessment of CA-managed soils. A stronger earthy smell in CA relative to CT was found in Ethiopia and Kenya for at least 1 year of the study (Table 2). The characteristic “earthy” smell of soil is attributed to two chemicals, geosmin and 2-methylisoborneol, produced by cyanobacteria and actinomycetes in the soil (Wang and Cane 2008). The smell indicates an active microbial community, which is often considered a sign of soil health because of its important role in nutrient cycling and organic matter turnover. Soil microbial activity is strongly related to organic matter inputs; for example, microbial activity has been found to generally be higher in sites under native forest compared to reforested and agricultural areas, with fallow fields having the least microbial activity (Nogueira et al. 2006).

3.2 Quantitative (validation) data

3.2.1 pH

The soils from Wolaita zone Ethiopia demonstrated the greatest effect of CA management on pH, with CA practices increasing pH in 2017 and 2018 at both 0–15 cm and 15–30 cm depths, and in 2019 at 0–15 cm depth (Table 3). The higher pH represents an improvement, as it brings the pH closer to 7, although the soil pH of farms in Ethiopia was generally within the optimal range for most crops (5.5–7.5). No effect of CA on pH was observed for Muranga county Kenya, though there appeared to be a small trend towards higher pH under CA (Table 3). In Dodoma district Tanzania, CA practices increased pH at both depths in 2019 but not in 2020 (Table 3). The increase in pH with CA in 2019 could be greatly beneficial, as the Dodoma soils are moderately to very strongly acidic, with pH values ranging from 3.6 to 5.8 at the surface and 3.5 to 6.0 in the subsurface. The pH meter malfunctioned in 2018, so there are no pH data from that year.

Table 3 Quantitative soil health measurement results. Soil pH at 0–15 and 15–30 cm depth, microbial respiration (kg CO₂ ha⁻¹) at 0–15 cm depth, and soil colour value in fields under Conservation Agriculture (CA) and conventional tillage (CT) management over 3 years in Dodoma District, Tanzania, Muranga County, Kenya, and Wolaita Zone, Ethiopia. Mean and standard error values from the raw data are displayed, with different letters indicating a significant difference (alpha ≤ 0.05) between CA and CT within 1 year and region. Microbial respiration was measured using Solvita Field Test kits. The single attribute value from the Munsell colour system was used to analyze colour. Value is measured on a scale of 0–10, with 0 being black and 10 white. *Residuals very close to normal (Shapiro-Wilk W statistics of 0.85 to 0.89). **Residuals not normally distributed after log transformation.

One explanation for the greater buffering (i.e., increased pH) in CA may be mulch additions, which can in turn increase soil organic matter (SOM). SOM additions can increase or decrease soil pH depending on the initial pH level, quantity and quality of plant residues, and the rate of decomposition (McCauley et al. 2009). Others have found higher pH for soils within basin planting stations compared with soil between basins in CA plots (Belder et al. 2007). Mloza-Banda et al. (2016) observed higher pH in CA compared to ridge-till fields after 2 years but not after 5 years.

3.2.2 Microbial respiration

Microbial respiration was increased with CA in the last 2 years of the study for each of the three countries (Table 3). In Wolaita zone Ethiopia, CA fields had 51 and 21% higher microbial respiration than CT fields in 2018 and 2019 respectively. In Muranga county Kenya, microbial respiration also increased in CA compared to CT plots in 2018 and 2019, by 95 and 247% respectively. In Dodoma district Tanzania, soil microbial respiration was higher with CA practices in 2019 and 2020, with increases of 202 and 36% respectively. No significant differences in microbial respiration were observed between CA and CT fields in 2017 in Ethiopia and Kenya, and in 2018 in Tanzania, though mean respiration for CA was numerically higher in both cases.
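
The percentage differences reported above are relative changes of the CA mean over the CT mean. The following is a minimal sketch of that arithmetic, not the authors' analysis script: the function name `percent_increase` and the example means are hypothetical placeholders, and the actual means are those reported in Table 3.

```python
def percent_increase(ca_mean: float, ct_mean: float) -> float:
    """Relative change of the CA mean over the CT mean, in percent."""
    if ct_mean <= 0:
        raise ValueError("CT mean must be positive")
    return 100.0 * (ca_mean - ct_mean) / ct_mean


# Hypothetical means in kg CO2 per ha (placeholders, not study data).
example_ca = 30.2
example_ct = 20.0
print(f"{percent_increase(example_ca, example_ct):.0f}% higher under CA")
```

With these placeholder values, a CA mean of 30.2 against a CT mean of 20.0 corresponds to a 51% increase. Note that a large percentage can arise from a small absolute difference when the CT mean is low, so percentages are best read alongside the means themselves.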

Microbial respiration is directly related to the amount of organic matter/residue in the soil, as this stimulates microbial activity (McDaniel et al. 2014). We speculate that the practice of mulching in the CA fields provided a substrate for the microbes, whereas the CT fields did not have these organic matter additions. In summary, microbial respiration was very consistent throughout the study, with higher levels found under CA in all three regions in the last 2 years of the study.

3.2.3 Colour

In Ethiopia, soil colour was darker in CA than in CT in 2019 (Table 3). In 2018, a similar trend was seen, but the data only marginally conformed to the assumptions of the ANOVA, so this result is less certain. Colour was not affected by CA in Kenya or Tanzania in 2018 or 2020. The 2019 data for Kenya and Tanzania did not conform to the assumptions of ANOVA, so the difference cannot be assessed with confidence. The colour charts in Fig. 1 show the full Munsell colour of each soil sampled in the final year of the study. Soils under CA management generally appear darker than their CT counterparts, but there are also cases where the CT fields appear darker and where the two are similar (Fig. 2).

Fig. 1

Munsell colours of soils from the 0 to 15 cm soil depth in the three study regions. In each column, the colour of Conservation Agriculture (CA) fields is on the left and matched to paired conventional tillage (CT) fields from the same farm on the right. Colours shown are from the last year of the study, which was 2020 in Dodoma District (column A), and 2019 in Muranga County (column B) and Wolaita Zone (column C).

Fig. 2

Conservation Agriculture managed soil is in the top hand in the photo and conventional tillage soil in the bottom hand. Fields were side-by-side, separated by a narrow walking path. Soils were picked up by reaching into the field while standing on the path (approximately 2 m from each other). Photo credit: Marla Riekman, 2018, Muranga county, Kenya.

3.2.4 Yield

In Muranga county Kenya, “composite yield” significantly increased from 0.77 under CT management (indicating lower productivity than county average) to 1.8 (greater productivity than county average) under CA management (Fig. 3). The composite yield of millet-cowpea in Dodoma district Tanzania significantly increased from 0.23 under CT to 1.1 under CA management. The mean total yield from CT sorghum plots in Tanzania was 380 kg ha⁻¹, while intercropped sorghum-lablab CA plots produced 2040 kg ha⁻¹. These yield increases coincide with increases in both quantitative and qualitative soil health metrics and support the role of CA, as it was practiced on the farms, in improving both soil health and yield outcomes.

Fig. 3

Conservation Agriculture field on right compared with conventional tillage field on left. Photo credit: Jess Nicksy, 2020, Muranga county, Kenya.

3.3 Synthesis of qualitative and quantitative results

Radar charts were used to display results from all 12 soil health measurements together. Results show a trend of improvement for the majority of the measurements with CA relative to CT. Overall, our study is strongly supported by Hermans et al. (2021) who identified that key farmers’ indicators of soil health in Malawi included crop performance, colour, and structure, and that these local indicators were consistent with conventional soil health indicators.

In Ethiopia and Kenya (Fig. 4), most of the soil health measurements were improved with CA, and the effect of CA appears to be stronger in the last 2 study years. Some improvement in soil parameters with CA was also observed in Tanzania, though fewer significant effects were recorded, and year-to-year variation in CA effects was also observed. For example, the radar charts for Tanzania show soil health measures on CA fields improving with respect to CT for several measures in 2019; however, results were inconsistent between years (Fig. 4). In Tanzania in 2020, only microbial respiration differed between CA and CT. Interestingly, microbial respiration was very consistent throughout the study, with higher levels observed for CA in all three regions in the last 2 years of the study.

Fig. 4

Radar charts showing all 12 soil health indicators over 3 years in each region, with chart A showing soil health results from Dodoma District, Tanzania, B from Muranga County, Kenya, and C from Wolaita Zone, Ethiopia. The coloured lines indicate soil properties for Conservation Agriculture (CA) fields relative to paired conventional tillage (CT) fields from the same year. The data were converted to relative values, so CA is relative to CT in each year (CT = 100%, indicated by the black line). The rating data were on a scale of one to five or one to three, with higher values indicating better soil health. The colour value scale was reversed to make 10 the darkest and 0 the lightest colour so that it would align with the rest of the indicators, which indicate better soil health with higher values. Median values were used for the ratings (qualitative) indicators and means for the quantitative indicators. Soil pH is for the 0–15 cm depth. Microbial respiration is in kg CO₂ ha⁻¹. *Indicates that the odds ratio estimate (qualitative indicators) was significant or the means (quantitative indicators) were significantly different between CA and CT fields at alpha = 0.05 in 1 year, ** indicates a significant difference in 2 years, and *** in 3 years.
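
The conversion to relative values described in this caption can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' script: the function names and all example values are hypothetical, and the colour reversal is assumed to be a simple "10 minus value" flip of the Munsell value scale. The caption itself specifies only that CT is set to 100%, that medians are used for ratings and means for quantitative indicators, and that the colour value scale was reversed.

```python
from statistics import mean, median


def reverse_colour_value(value: float) -> float:
    """Flip the Munsell value scale (0 = black, 10 = white) so darker soil scores higher."""
    return 10.0 - value


def relative_to_ct(ca, ct, qualitative=False):
    """Express the CA summary statistic as a percentage of CT (CT = 100%).

    Medians are used for rating (qualitative) indicators and means for
    quantitative indicators, following the caption.
    """
    summarise = median if qualitative else mean
    ct_summary = summarise(ct)
    if ct_summary == 0:
        raise ValueError("CT summary is zero; relative value is undefined")
    return 100.0 * summarise(ca) / ct_summary


# Hypothetical example data (placeholders, not from the study).
structure_ca = [4, 4, 5, 3, 4]   # farmer ratings on a 1-5 scale
structure_ct = [2, 3, 2, 2, 3]
colour_ca = [3.0, 4.0, 3.0]      # Munsell value, lower = darker
colour_ct = [4.0, 5.0, 4.0]

structure_rel = relative_to_ct(structure_ca, structure_ct, qualitative=True)
colour_rel = relative_to_ct(
    [reverse_colour_value(v) for v in colour_ca],
    [reverse_colour_value(v) for v in colour_ct],
)
print(f"structure: {structure_rel:.0f}% of CT, colour: {colour_rel:.0f}% of CT")
```

Values above 100% plot outside the black CT line on the radar chart. Reversing the colour scale before taking the ratio keeps "outside the line" meaning "better under CA" for every indicator.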

The low response of soils to CA in Tanzania may be related to the texture of the soils; over half of the sites were classified as sand or loamy sand. It is more difficult to build SOM and improve soil structure in sandy soils, due to the low potential for aggregation (Carter and Gregorich 2010). Chivenge et al. (2007) found that within CA systems, different management strategies were necessary to increase and maintain SOM in sandy as opposed to clay soils. In clay soils, the most important practice was reducing soil disturbance (i.e., tillage) in order to reduce the disruption of soil aggregates, which contain much of the SOM in fine-textured soils. In sandy soils, which have minimal aggregation and less fine-fraction SOM, maintaining consistent inputs of organic residues was key to retaining SOM (Chivenge et al. 2007; Mhlanga et al. 2021). Thus, regular mulch additions and intercropping cereals with legumes may be key to maintaining SOM levels in the Tanzania soils. The Dodoma district of Tanzania is much drier than the other two study regions (Kenya and Ethiopia), which may have contributed to reduced residue inputs and crop biomass. Future on-farm studies should document mulch amounts being added to the CA fields.

Some studies report that it can take several years of practicing CA principles to detect improvements in soil properties (Thierfelder et al. 2013, 2017). Steward et al. (2018), on the other hand, found no correlation between years in CA and improved CA yield when systems experienced heat and drought stress. The farms in the present study had been in CA for up to 8 years by the last year of the study, and CA practices improved more soil health indicators relative to CT in the last 2 years of the study in Ethiopia and Kenya. Our limited dataset does not allow us to determine why multiple years of CA can result in soil improvements. However, improved crop growth in CA systems, as indicated by the farmer observations, suggests that more organic matter was added to soils through shoot and root growth, and this additional organic matter may have a cumulative effect over time.

3.4 Benefits and drawbacks of on-farm participatory methodology

Our work built on the participatory action research model but stressed indigenous farmer knowledge in the assessment of CA practices. Consulting with farmers during the soil assessment makes them active participants in the research and provides a link between scientific and on-farm knowledge (Adeyolanu and Ogunkunle 2016). However, engaging farmers as research partners requires certain considerations. One involves the validation of farmer knowledge. The effectiveness with which both local and scientific knowledge were able to document soil health improvements with CA in our study provides such validation. Knowing that farmers are effective observers of soil health should provide confidence to scientists, government agencies, and others (e.g., NGOs) involved in agronomy for sustainable development. During the life of our project, the validation of farmer knowledge in the Wolaita region led the Ethiopian government to take notice and, in 2018, to begin training its extension workers in CA principles.

A potential benefit of on-farm participatory knowledge generation is the spread of knowledge to non-practicing farmers. There is a growing body of literature that demonstrates the potential of farmer-to-farmer knowledge exchange to improve the uptake and maintenance of soil-improving technologies (Kansanga et al. 2021). When farmers can assess the effectiveness of soil-improving technologies such as CA on their own, they can be effective and credible ambassadors of soil health knowledge. This is important in the context of limited agricultural extension in sub-Saharan Africa. The on-farm data may also reduce the monitoring and evaluation (“M and E”) budget requirements of NGO and government programs. On-farm monitoring research could become even more efficient if some farmers within communities were trained to use scientific tools such as pH meters and respiration soil test kits.

Drawbacks of our approach involve subjectivity and bias. In our case, both technicians and farmers could be influenced by their interest in promoting CA. Some of this bias can be overcome by comparing farmer perceptions with scientific measurements. In our case, the subjective measurements were verified by concurrent improvements in pH, soil respiration, and yield, which are quantitative parameters less subject to (though not entirely free from) bias. Similar results from parallel qualitative (using soil health cards) and quantitative (lab) assessments were documented by Islam et al. (2017), while Kelly and Anderson (2016) found a strong mismatch between farmer perceptions and lab assessments of soil quality. As the use of on-farm knowledge generation increases, a clearer methodology may be required. Hermans et al. (2021) provide an integrated approach that improves understanding of farmers’ land management decision-making and the role of soil health in the process.

The variation in the ways that CA and CT were practiced in this study, as well as the use of CA Plus technologies, makes it hard to determine which particular practices might have caused the observed changes. However, one benefit of this dataset is that it represents how CA is being practiced by smallholder farmers, which is immensely valuable, particularly since these same regions and types of farms are being targeted for additional uptake of soil health practices. By contrast, studies on research farms are valuable for contributing data on specific aspects of CA in tightly controlled conditions, but may not accurately reflect the ways that farmers practice CA in their own fields. Our results support the important role of on-farm, participatory research which includes farmer knowledge and perceptions, not as a replacement for controlled experiments, but as a complement to them. On-farm results have the potential to accurately reflect what the impact of future uptake of CA on soil quality could be in these regions.

4 Conclusions

This study used a participatory research approach to test the effects of CA on soil health, monitoring 59 farms in three distinct regions of Eastern Africa for up to 3 years. Four of the 12 soil health indicators (soil crusting, water movement, soil structure, and microbial respiration) stood out as being significantly higher in CA compared to business-as-usual CT fields in at least 1 year in all three study districts. Three of these four were farmer indicators, demonstrating that farmer soil health knowledge appeared to be as consistent as one of our scientific measures (microbial respiration) in identifying soil health improvement. This work adds to the growing body of literature showing that CA has the potential to improve soil health (e.g., Thierfelder et al. 2009) and crop production (Salomons et al. 2018), and importantly, supports our hypothesis that smallholder farmers can identify soil health benefits arising from improved soil management practices. We therefore conclude that future soil improvement efforts should emphasize a participatory approach that includes experienced farmers working as citizen scientists alongside other scientists and technicians. Hermans et al. (2021), working in Malawi, reached a similar conclusion, though they also stressed the value of the scientific measures to guide and grow farmers’ understanding of soil processes. The implications of our collective conclusions for future agronomic research are profound. More effort should be invested in local initiatives where farmers operate as citizen scientists to document the progress of their site-specific soil improvement efforts, and where scientists help verify farmer observations and provide deeper insights into the processes (e.g., mineralization, pH modification) that farmers are managing.

Soil health assessments often seek to establish a minimum dataset of indicators. From a farmer’s perspective, the most relevant indicators may vary with site-specific conditions. For example, under the hotter, drier, and lower pH conditions in Tanzania, indicators related to water conservation (infiltration, water movement, crusting) and pH modification (use of ash) may be deemed more important by farmers than in hilly land where erosion protection may be seen as more relevant. Indeed, it was the water-related indicators that scored highest for CA in Tanzania. The lesson for future research may be to work with farmers in identifying the most relevant soil health indicators for their biophysical (and perhaps also social) circumstances. The role of scientists in these situations can be to unravel the processes operating within the farmer’s practice more thoroughly. The scientific forms of knowledge can add to overall understanding to explain “why does it work here?” (Hermans et al. 2021). Furthermore, this feedback from scientists can inform how soil-improving techniques can evolve to be better tailored to local contexts (Corbeels et al. 2014).

One site-specific condition that varied among study districts was soil texture. Sandy soils in Tanzania did not appear to improve with CA as dramatically as clay and loam soils in Ethiopia and Kenya, suggesting different CA strategies may be required. Given the importance of the organic matter in soil health, the emerging new knowledge of carbon fixation in soils (Cotrufo et al. 2015), and possible differences in carbon fixation in sandy vs. clay soils (Carter and Gregorich 2010), future on-farm soil improvement research should be especially attuned to soil textural class.

In conclusion, African farmers were shown to be effective “citizen scientists” as they were able to discern soil health benefits from CA practices on their farms. These citizen scientists can be especially effective in the extension of CA and other soil-improving technologies. Farmer participation and on-farm assessments merit incorporation into evaluation schemes aimed at soil restoration agronomy in East Africa.