1 Introduction

Policies and programmes that improve the livelihoods and resilience of rural people are critical for the achievement of the Sustainable Development Goals (SDG) targets, particularly the targets on agriculture and food security (FAO, 2017). Smallholder farming communities in many parts of the developing world need support to cope with rapidly changing circumstances of rural population growth, climate change, land degradation, loss of natural resources, and food insecurity (Vermeulen et al., 2012). These changes require appropriate adaptations in land use, farming practices, and entrepreneurial activities in accordance with the opportunities and limitations posed by the existing conditions; hence, farmers must be able to strengthen their capacity for adaptation (Röling & Wagemakers, 2000; Darnhofer et al., 2010).

The farmer field school (FFS) has been promoted by the Food and Agriculture Organization of the United Nations (FAO) and various other organizations as an approach for educating farmers to make adaptive farming decisions based on understanding of agroecological principles obtained through systematic observations and simple experiments in the field, typically in weekly meetings during a season from planting to harvest (FAO, 2016). The FFS has been adopted in over 90 countries world-wide, for use in many crops, for livestock and fisheries (Waddington et al., 2014). Studies in DR Congo, Malawi and Tanzania have shown that FFS participants experienced improvements in household food security and dietary diversity through diversification of their agricultural production (Doocy et al., 2017, 2018; Larsen & Lilleør, 2014; Weinhardt et al., 2017). Also, the FFS has shown positive effects on farmers’ abilities to cope with the consequences of climate change (Tomlinson & Rhiney, 2018; Chandra et al., 2017; Osumba et al., 2021). However, many FFS studies have used a quasi-experimental design and may thus have been subject to selection bias (Waddington et al., 2014).

The educational foundations of the FFS support a process of continued learning and action and enable the empowerment of its participants (Pontius et al., 2002). Consequently, the FFS can be expected to produce a broad spectrum of effects – not restricted to agricultural productivity – which should be captured by monitoring and evaluation (Bakker et al., 2022). However, the popularity of the FFS has also made it vulnerable to being used in ways that may compromise its educational foundations, while many FFS programmes have struggled to monitor and evaluate relevant targets or indicators to assure quality or to make improvements in their interventions (Bakker et al., 2020; van den Berg et al., 2020c).

Previous work on developing a framework for the evaluation of FFS identified impact pathways in the human, social, natural and financial capital domains (Douthwaite et al., 2007), which borrowed from the sustainable rural livelihoods approach (Scoones, 1998). The effects of the FFS have recently been reviewed using an analytical framework that identified outputs, outcomes and impacts of the FFS in the human, social, natural and financial domains (van den Berg et al., 2020b). That review concluded that the FFS has prospects to enhance the four capital domains of rural livelihoods but identified the need for quality assurance and well-planned evaluation studies of the FFS. Following up on this work, FAO developed a generic framework and guidance on monitoring, evaluation and learning (MEL) for FFS programmes (van den Berg et al., 2023). Whilst the experiential learning cycle and adaptation are central to the FFS approach, MEL essentially extends the learning cycle to the programme or institutional level to facilitate learning and adaptation, to improve the quality of interventions, and to support locally-led adaptation (Stone-Jovicich et al., 2019; Coger et al., 2021). The Malawi Office of FAO, which has been at the forefront of recent developments on MEL of FFS programmes, was the first entity to decide to test the generic MEL framework.

Malawi is a predominantly agricultural country with a growing rural population and a high level of poverty (World Bank, 2023). Maize is grown as a staple food, but this crop has high requirements for nitrogen fertilizer and water, which makes its production vulnerable to regulatory, environmental, and climatic shocks. Despite the promotion of improved varieties and a fertilizer subsidy programme, maize yields have been low on average, owing to a dependence on rain-fed agriculture, the effects of climate change, and degraded soils (White, 2019; Nyirenda et al., 2021). Agricultural diversification has remained poor (Kerr, 2014), and deteriorated at national level between 2004/05 and 2010/11 (Kankwamba et al., 2018). One-third of Malawians face moderate or severe food insecurity (IPC, 2022), while the country has a high prevalence of chronic malnutrition and low dietary diversity among children (National Statistical Office, 2021; Gelli et al., 2022). Malawi’s Department of Agricultural Extension Services (DAES) of the Ministry of Agriculture has a comprehensive strategy for decentralized agricultural extension to improve agricultural productivity, but the implementation of extension services has been hampered by an inadequate number of extension officers and inadequate resources (Ministry of Agriculture, 2020). The National Agriculture Policy identified the FFS as one of the extension delivery approaches for attaining sustainable agricultural production and productivity, to complement the extension services by introducing participatory methods and education of farmers (Government of Malawi, 2016). The FFS has been used in Malawi since the mid-1990s in initiatives to improve pest management, food security, climate change adaptation and market-oriented farming (van den Berg et al., 2020a).

Through the US$36m EU-funded project KULIMA (‘Revitalising Agricultural Clusters and Ulimi wa Mndandanda through Farmer Field Schools in Malawi’), the DAES and FAO have been promoting sustainable agricultural growth and incomes to enhance food and nutrition security in Malawi within the context of a changing climate (FAO, 2021). The project selected the FFS as its approach for community outreach to develop the skills of highly diverse smallholder farmers for adaptation to a changing context, which is relevant to the Malawian situation. The project used an extended period for educating farmers through three consecutive agricultural seasons, which differs from the typical ‘one-season FFS’. A recent external project review reported major achievements in capacity building for the FFS but identified shortcomings in the monitoring and evaluation of the FFS activities, and recommended that MEL be developed and field-tested (FAO, 2023). Through the KULIMA project, the DAES intended to strengthen quality assurance and harmonization in ongoing and future projects in Malawi that use the FFS approach. To achieve this, the project embarked on establishing a framework for MEL to improve the quality and effects of FFS activities at field level.

The two objectives of our study were (i) to test the utility of a project-specific MEL framework for FFSs at the district level, and (ii) to explore the effects of the FFS as obtained through MEL. The goal was to develop feasible, acceptable and effective MEL methods for operational use in FFS programmes in Malawi. As FFS projects elsewhere have faced challenges in quality assurance and improvement of interventions, it is hoped that the methods and results presented here will assist FFS initiatives in other countries in the development of their MEL framework.

2 Materials and methods

2.1 Conceptual framework

Based on previous work on developing a framework to evaluate farmer field schools (Douthwaite et al., 2007; van den Berg et al., 2020a, 2020b), a project-specific framework was prepared with the participation of the FAO-Malawi team and DAES. It was anticipated that project activities would cause a process of change in terms of outputs, outcomes and impacts, called the ‘results chain’, which corresponds with the ‘impact pathway’ in earlier work (Douthwaite et al., 2007). Targets should be set for what a project ultimately wants to achieve at the impact level; hence, outcomes and outputs become the milestones towards attaining the impact targets (van den Berg et al., 2023). Consequently, a reverse order was used, starting from the impact level to discuss the outcomes that can lead to that impact, and then the outputs that can lead to those outcomes. The process of change is not limited to agricultural production but can take place in the human, social, natural and financial domains. In each domain, targets were set for the outputs, outcomes and impacts; a target here means an output, outcome or impact that the project or initiative is expected to achieve, in accordance with pre-set objectives. A two-step procedure was used in the setting of targets. First, targets were listed that could be relevant for selection by any FFS project within Malawi. Second, to increase practical feasibility for operational use, the number of targets was reduced to 1–3 per domain per step of the results chain, by selecting only those targets that were most relevant to the objectives of the KULIMA project (Table 1). The conceptual framework was used to select suitable tools for conducting MEL and to develop the methods and questions for each selected tool.

Table 1 Conceptual framework for monitoring, evaluation and learning (MEL)

2.2 Tools

A variety of tools are available for the monitoring, evaluation and impact assessment of FFS programmes, including interviews, surveys, and readily available records kept by farmers or FFS facilitators (van den Berg et al., 2023). Considering that FFS programmes have struggled with the selection of evaluation indicators, data management, data analysis and utilization of results (Bakker et al., 2020; van den Berg et al., 2020c), tools were selected based on the criteria of (i) ease of use in data collection, (ii) the likelihood of producing valuable data or information on outputs, outcomes and impacts, (iii) ease of entry, management, analysis and presentation of data or information by MEL teams at district level, and (iv) the likely ability of the combination of tools to provide complementary results and to offer cross-verification of results. On this basis, three tools were selected: (i) spider diagramming, (ii) focus group discussion, and (iii) direct observation, the latter as a supplementary tool.

Spider diagramming is a simple tool for visualizing the findings or perceptions on virtually any type of target or indicator (Mancini et al., 2007; FAO, 2015). A spider diagram is composed of several axes stemming from a central point, with concentric circles showing the scale of those axes. Targets are assigned to each axis, scores are given, and the dots are connected to result in a diagram resembling a spider web. By using subjective scores given by its participants, spider diagramming can efficiently generate quantitative information about outputs, outcomes or impacts. The participants provide scores both for the current situation and, retrospectively, for the situation before the FFS; the difference between them provides an estimate of ‘progress’. Spider diagramming is suited for participatory evaluation and offers a learning opportunity for its participants, with the potential for locally led follow-up activities, for example, to address specific gaps in the results of their evaluation. A list of 24 questions was developed (see S1 Table in the Supplementary Material) to capture all the targets presented in the conceptual framework.

A spider diagram with six axes – each axis linked to one specific question, and each axis with a scale from one to five – was prepared for each of the capital domains, with a total of four spider diagrams covering the 24 questions. The legend for the scale was provided as: 5, ‘very good/very true’; 4, ‘good/true’; 3, ‘average/somehow’; 2, ‘only slightly’; or 1, ‘not at all’.
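The scoring scheme can be illustrated with a minimal sketch. The six-axis layout, the 1 to 5 legend and the definition of ‘progress’ as the current score minus the retrospective score follow the text; the function name `progress` and the example scores are hypothetical and are not taken from the study data.

```python
from statistics import mean

# Scale legend as given in the text (5 = best, 1 = worst).
LEGEND = {
    5: "very good/very true",
    4: "good/true",
    3: "average/somehow",
    2: "only slightly",
    1: "not at all",
}

AXES_PER_DIAGRAM = 6  # one axis per question; four diagrams cover 24 questions


def progress(current, before):
    """Return per-axis progress (T1 - T0) for one spider diagram."""
    if len(current) != AXES_PER_DIAGRAM or len(before) != AXES_PER_DIAGRAM:
        raise ValueError("each diagram needs exactly six scores")
    for score in list(current) + list(before):
        if score not in LEGEND:
            raise ValueError(f"score {score!r} is outside the 1-5 scale")
    return [t1 - t0 for t1, t0 in zip(current, before)]


# Hypothetical consensus scores of one FFS group for one domain.
current_scores = [4, 5, 4, 3, 4, 4]  # T1: situation at the time of the exercise
before_scores = [2, 2, 3, 1, 2, 2]   # T0: recalled situation before the FFS

axis_progress = progress(current_scores, before_scores)
domain_progress = mean(axis_progress)  # mean progress for this domain
print(axis_progress, domain_progress)
```

Storing the two scores per axis, rather than only their difference, keeps the retrospective baseline available for later analysis.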

A focus group discussion is a discussion in a small group guided by a moderator to gain understanding about different perceptions, opinions and experiences among the participants. A list of questions was developed for the focus group discussions (see S2 Table in the Supplementary Material); the questions captured all the targets of the conceptual framework. The list was developed so that some of the questions matched those in the spider diagramming.

Direct observation was used as a supplementary MEL tool to provide independent information to verify and complement the other tools. Direct observation included the observation of groups of participants and observations of fields, livestock and farm records, where available (see S3 Table in the Supplementary Material). Specifically, direct observation was used to verify the quality of the agroecosystem analysis exercise and farmer field experiments which are core activities of the FFS (FAO, 2016). Moreover, the MEL teams verified the cohesiveness of FFS groups and gender-based participation, and checked whether the financial issues related to costs, benefits and markets were discussed in the FFS group.

2.3 Site selection and sample size

A database of FFS groups had been maintained by the KULIMA project, with information including the location, GPS coordinates, FFS facilitator, and number and gender of FFS members. Only FFS groups under the KULIMA project were eligible for selection; these amounted to 5491 FFS groups with start-years from 2017 to 2021. In this project, the FFS featured weekly meetings over three consecutive seasons, covering various crops and livestock, and topics such as crop production, food and nutrition security, soil and water conservation, and income-generating activities (see S4 Table in the Supplementary Material). As part of the FFS, groups learned to carry out field studies or experiments. The FFS facilitators were lead farmers who had received season-long training to become community-based facilitators under the mentorship of a master trainer; some FFSs were facilitated by the master trainers themselves. At the beginning of the FFS, a group was engaged in curriculum development by identifying the local problems and proposing possible solutions for inclusion or testing in the FFS.

The start-year represented a time factor which was expected to affect the manifestation of outputs, outcomes and impacts in the results chain. FFS groups with start-year 2017 were omitted because their number was small, while those with start-year 2021 were omitted because data entry for that year was incomplete at the time of site selection. Eligible FFS groups were 1236 units with start-year 2018, 1654 units with start-year 2019, and 2084 units with start-year 2020; altogether 4974 FFS groups. The locations of these FFS groups were scattered over eleven project districts; for operational reasons, one particularly large district, Mzimba, had been divided into two ‘project districts’, Mzimba North and Mzimba South. The eleven project districts from North to South were Chitipa, Karonga, Mzimba North, Mzimba South, Nkhata Bay, Kasungu, Salima, Nkhotakota, Chiradzulu, Thyolo and Mulanje. The number of eligible FFS groups varied by district, from 251 in Nkhata Bay to 761 in Mulanje. However, because our objective was to test MEL at the district level, an equal number of FFS groups was selected per district, rather than a sample representative of the geographic distribution of the FFSs over the eleven project districts. The sample size of the study was based largely on the feasibility of data collection by each district within the timeframe available for the study. Three FFS groups were selected from each of the eleven project districts, one with start-year 2018, one with start-year 2019 and one with start-year 2020, yielding a total of 33 FFS groups (Fig. 1). The samples were selected randomly, using the MS Excel formula RANDBETWEEN(1,y), where y is the number of eligible FFS groups per district per start-year.
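The selection procedure can be sketched in Python as an analogue of the spreadsheet formula; this is not the tool used in the study. `random.Random.randint(1, y)` is inclusive at both ends, like RANDBETWEEN(1,y). The per-stratum counts, the seed and the function name `select_sample` are hypothetical, because the text reports eligible groups only per district and per start-year, not per district per start-year.

```python
import random

# Project districts and start-years as listed in the text.
DISTRICTS = [
    "Chitipa", "Karonga", "Mzimba North", "Mzimba South", "Nkhata Bay",
    "Kasungu", "Salima", "Nkhotakota", "Chiradzulu", "Thyolo", "Mulanje",
]
START_YEARS = [2018, 2019, 2020]


def select_sample(eligible_counts, rng):
    """Pick one FFS group per (district, start-year) stratum.

    eligible_counts maps (district, start_year) to y, the number of
    eligible groups in that stratum. The value returned for each stratum
    is a 1-based row index into the list of eligible groups.
    """
    sample = {}
    for stratum, y in eligible_counts.items():
        if y < 1:
            raise ValueError(f"no eligible FFS groups in stratum {stratum}")
        sample[stratum] = rng.randint(1, y)  # inclusive, like RANDBETWEEN(1,y)
    return sample


# Hypothetical counts: 50 eligible groups in every stratum.
counts = {(d, yr): 50 for d in DISTRICTS for yr in START_YEARS}
rng = random.Random(2022)  # fixed seed so the draw can be reproduced
sample = select_sample(counts, rng)
print(len(sample))  # 33 strata, one selected group each
```

Fixing the seed is a design choice for auditability; a spreadsheet formula recalculates on every edit, so a drawn sample would need to be pasted as values to remain stable.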

Fig. 1 Map of Malawi with district boundaries, showing the GPS coordinates of the 33 randomly selected farmer field school (FFS) groups from 11 districts used in the study. Map created in PaintMaps.com (https://paintmaps.com/map-charts/146/Malawi-map-chart)

2.4 Data collection

A three-day planning workshop was conducted in April 2022 to train district staff on the data collection methods for the pilot testing of the MEL framework in their respective districts; a second three-day workshop was conducted after the pilot testing to evaluate the methods and results. Participants were those with responsibility for monitoring and evaluation in the project, namely planning officers from agricultural development offices in regions covering the project districts, and Agricultural Extension Development Officers, also serving as ‘master trainers’ for the FFS, from each project district. All participants were familiar with the FFS approach. In advance of the workshop, a manual was prepared describing the MEL framework and the tools, and presenting standard forms with the questions for pilot testing. In anticipation of future operational use at district level, the manual contained methods on random sample selection for use by district staff. The training included practical sessions on how to moderate the spider diagramming. From April to May 2022, the participants formed MEL teams of two to three persons per district to implement the pilot testing in their respective districts at the selected FFS sites, at a time coinciding or overlapping with an FFS group meeting. To avoid bias, master trainers did not participate in data collection if the selected FFS was in their own working area.

For the implementation of spider diagramming, the moderator prepared the frames and legend on large newsprint paper and explained to the FFS group the purpose and methods of the spider diagramming, and the use of the scores and the legend. The moderator introduced each question, which was subsequently discussed by the FFS group members present to reach consensus, first, on a score for the current situation and then, on a score for the retrospective situation before the FFS. This process continued until the dots could be connected, and the spider diagram for the next domain could be started. The outputs of the spider diagramming were recorded onto electronic tablets.

For implementation of the focus group discussions, the moderator asked the local FFS facilitator to select five farmers to represent the ‘FFS group’. These focus group participants had both genders represented and were chosen so that one or two persons had ‘quite dominant’ participation, one or two persons had ‘average participation’, and one or two persons had ‘relatively low participation’ in the FFS group, based on the judgment of the local FFS facilitator. Insofar as possible, a quiet place was selected for the discussions to avoid interference from others who were not part of the focus group. During the focus group discussions, the moderator solicited narrative explanatory responses from the participants, rather than short yes/no answers. The rapporteur documented the responses to questions directly onto paper forms. During their field visit, the MEL team made direct observations regarding the FFS activities and field situation. The itemized observations were documented on paper forms. Within a few weeks after the field visit, the MEL team in each district transferred the written outputs from the focus group discussions and direct observations to an electronic spreadsheet.

2.5 Ethical considerations

Prior to the spider diagramming and focus group discussions, the moderator explained the purpose of the exercise, explaining that participation was voluntary, specifying that the collected information would be used to evaluate the effects of the FFS, and indicating that the results would be reported. The focus group participants were assured that their feedback would remain anonymous. Verbal informed consent was obtained prior to the focus group discussions. The data from FFS groups and their exact locations were anonymized in the presentation of results. The pilot testing was an integral part of the developmental activities of the project, not a separate research activity.

2.6 Data processing and analysis

The district-level MEL teams analysed their data for the purpose of presentation at the second workshop. The comprehensive data from the eleven project districts were transmitted to the national level where data were compiled in a spreadsheet for centralized analysis. For each question of the spider diagramming, the mean score was determined across all FFS groups, on a scale from 1 to 5, for the contemporary situation (T1), and for the retrospective situation (T0); the mean score and standard deviation were determined for the ‘progress’ (T1-T0); the statistical significance of ‘progress’ was tested with a t-test. To test the effect of start-year on ‘progress’ (T1-T0), the mean score of T1-T0 was determined across the questions pertaining to each domain and across all domains per FFS group; the effect of start-year was tested using one-way ANOVA (32 df). To identify the relationship between the situation before the FFS (T0) and ‘progress’ (T1-T0), the mean scores of T0 and T1-T0 were determined across all 24 questions per FFS group, and the data pairs (T0, T1-T0) per FFS group (n = 33) were plotted in a scatter plot and tested with Pearson’s correlation coefficient.
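The three statistics named above can be sketched with the Python standard library. Several points are assumptions rather than details from the text: the t-test is treated here as a test on the paired differences T1-T0, since the text does not specify the variant; the function names and all example data are hypothetical; and only test statistics and degrees of freedom are computed, since p-values would require a statistical package.

```python
from math import sqrt
from statistics import mean, stdev


def paired_t(t1, t0):
    """t statistic and degrees of freedom for the mean of T1 - T0 (assumed paired)."""
    diffs = [a - b for a, b in zip(t1, t0)]
    n = len(diffs)
    return mean(diffs) / (stdev(diffs) / sqrt(n)), n - 1


def one_way_anova_f(groups):
    """F statistic with between-group and within-group degrees of freedom."""
    values = [v for g in groups for v in g]
    grand = mean(values)
    ss_between = sum(len(g) * (mean(g) - grand) ** 2 for g in groups)
    ss_within = sum((v - mean(g)) ** 2 for g in groups for v in g)
    df_between = len(groups) - 1
    df_within = len(values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Hypothetical scores for one question from six FFS groups.
t0 = [2, 3, 2, 1, 3, 2]
t1 = [4, 4, 3, 3, 4, 4]
t_stat, t_df = paired_t(t1, t0)

# Hypothetical mean progress per FFS group, grouped by start-year.
by_start_year = [[1.0, 2.0], [2.0, 3.0], [3.0, 4.0]]
f_stat, df_b, df_w = one_way_anova_f(by_start_year)

# Hypothetical (T0, T1 - T0) pairs per FFS group.
r = pearson_r([1.0, 2.0, 3.0], [3.0, 2.0, 1.0])
print(round(t_stat, 2), t_df, f_stat, (df_b, df_w), r)
```

With 33 groups in three start-year strata, `one_way_anova_f` would return 2 and 30 degrees of freedom, which sum to 32; the ‘32 df’ in the text presumably refers to this total.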

The responses from the focus group discussions were analysed by categorizing the information as ‘yes or present’ (score 1) or ‘no or absent’ (score 0), where applicable, or, for specific questions, by categorizing the types of farming practices or income-generating activities. For direct observations, the narrative responses were categorized as ‘adequate’ (i.e., no weakness reported), ‘inadequate’ (weakness reported), or ‘absent’. All categorizations were made by the first author.
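The binary and categorical coding described above can be summarized as in the following minimal sketch. The function name `percent_yes` and both example data sets are hypothetical and serve only to show how coded responses translate into the percentages and counts reported in the results.

```python
from collections import Counter


def percent_yes(scores):
    """Share of focus groups coded 1 ('yes or present'), as a rounded percentage."""
    return round(100 * sum(scores) / len(scores))


# Hypothetical coding of one question across 33 focus groups.
scores = [1] * 27 + [0] * 6
print(percent_yes(scores))  # 82

# Hypothetical direct-observation categories for one observed item.
observations = ["adequate"] * 16 + ["inadequate"] * 7 + ["absent"] * 10
tally = Counter(observations)
print(tally["adequate"], tally["inadequate"], tally["absent"])  # 16 7 10
```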

The narrative responses to questions were used to describe or substantiate the categorized results. The results of the two tools, spider diagramming and focus group discussions, were compared for the matching questions. For all matching questions (n = 23 questions), the mean ‘progress’ (T1-T0) in spider diagramming across FFS groups was plotted against the mean response in the focus group discussions across FFS groups. The linear relationship between the two tools was determined to establish whether the tools produce roughly comparable results.
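One way to quantify such a linear relationship is an ordinary least-squares line, sketched below. The text does not state which fitting method was used, so least squares is an assumption here; the function name `fit_line` and the three data points are hypothetical. Following the text, the focus group response is placed on the x-axis and the spider diagram progress on the y-axis.

```python
from statistics import mean


def fit_line(x, y):
    """Ordinary least-squares slope and intercept for y = slope * x + intercept."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sxy / sxx
    return slope, my - slope * mx


# Hypothetical per-question means across FFS groups.
focus_group_mean = [0.6, 0.8, 1.0]   # mean of the 0/1 coded responses
spider_progress = [1.0, 1.5, 2.0]    # mean T1 - T0 on the 1-5 scale

slope, intercept = fit_line(focus_group_mean, spider_progress)
print(round(slope, 3), round(intercept, 3))
```

A positive slope would indicate that questions with more affirmative focus group responses also show larger progress scores in the spider diagrams.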

2.7 Review of MEL framework

The MEL learning cycle includes a process of review and adaptation. A three-day evaluation workshop was conducted in August 2022, with the same participants as in the planning workshop, to evaluate the MEL process and to review the results obtained. The participants discussed the strengths and weaknesses of implementing the pilot testing, and identified modifications to be made to the questions, tools and framework, and proposed specific improvements in the project’s interventions.

3 Results

All 33 selected FFS groups were successfully sampled, including those in an unknown number of hard-to-reach locations. Women made up 68% of members in the selected FFS groups, compared with 72% in all eligible FFS groups; 32 of the 33 selected FFS groups were mixed gender, while one FFS group was women-only. Youth, defined in the Malawian context as persons younger than 35 years (Benson et al., 2021), made up 42% of members in the selected FFS groups, compared with 37% in all eligible FFS groups.

The spider diagramming revealed clear differences between the contemporary and retrospective scores, which indicated major progress made on most targets in the human, social and natural domains, but progress in some elements of the financial domain, such as marketing skills, understanding the market and engagement in marketing, was modest (Fig. 2). For all 24 questions, the contemporary score was significantly higher than the retrospective score, indicating that a broad spectrum of outputs, outcomes and impacts were ascribed to the FFS by the participating farmers (Table 2).

Fig. 2 Spider diagrams for the four domains (a-d), showing the average scores from 33 FFS groups for each topic, numbered corresponding to the questions in Table 2

Table 2 Results of the spider diagramming tool for monitoring, evaluation and learning (MEL)a

3.1 Human domain

In the spider diagramming, farmers reported that their options in farming had improved, they had gained control over income and expenditures, and acquired a better position to improve their farming and living situation (Table 2). Similarly, all focus groups confirmed that the FFS gave them improved options in farming, with various examples of new crop varieties, farming practices, or techniques which they selected or implemented, and that they were in a better position as compared to before the FFS to improve their farming and living situation (Table 3). The majority (82%) of focus groups indicated that their control over income and expenditures had increased since the FFS, explaining that they were now able to save money and make home budgets, and with men and women sharing control; but in the remaining (18%) focus groups, men continued to control the finances at household level. Moreover, 91% of focus groups stated that the FFS increased their confidence and motivation in farming; typical responses were “yes, because we are able to choose and make decisions on our own”, “we are able to mobilize resources and plan for next season”, “we have learnt new farming techniques”, “we have improved our soils”, or “we have better yields”. However, one focus group responded that there is little motivation because other factors like marketing have not been addressed, and another focus group stated that some members still are not self-confident to make their own [farming] decisions. Almost all focus groups stated that the FFS changed their attitude or mindset towards farming, with many groups mentioning that they now farm partly for business purposes as opposed to only for household consumption, and they consider the balance in food groups for nutrition security.

Table 3 Results of focus group discussion tool for monitoring, evaluation and learning (MEL)a

These results suggest that the farmers have been empowered to improve their own situation. Farmers stated that they increased their confidence and motivation in farming since they joined the FFS, and they improved their attitude or mindset towards farming. Experimentation with new combinations of techniques or practices was almost absent before the FFS but was widespread after the FFS.

The focus groups reported their involvement in field studies or experiments, which included studies on agronomic practices, pest management, soil fertility management, varieties, intercropping, and water conservation. Only one out of 33 focus groups mentioned that no field study had been conducted. On average, 1.7 studies were conducted per group with start-year 2020, 2.1 studies per group with start-year 2019, and 2.6 studies per group with start-year 2018 (P = 0.07; one-way ANOVA, 32 df); these results suggest that the FFS groups carried on conducting field studies in the years following their training, but additional data are needed to substantiate this finding. Direct observations of farmers’ field experiments showed that in 7 out of 23 groups where experiments were present at the time of the visit there were signs of inadequate quality, in terms of the availability of demarcated field plots, treatments, or data records.

In response to the question of which farming techniques or study results had been adopted, all focus groups (n = 33) mentioned the adoption of one or more techniques or practices. Of the focus groups, 55% mentioned the use of soil fertility management practices or the use of manure, 48% mentioned the use of ridge alignment or box ridging for water conservation, 36% mentioned crop diversification, and another 36% mentioned early planting. Farming techniques that were each adopted by approximately one-fourth of the groups were plant spacing, use of botanicals for pest control, early maturing crops, conservation agriculture, adoption of varieties, pit planting, and irrigation. The extent of adoption of these techniques by the members of the FFS groups and neighbouring farmers requires further study. Moreover, the focus groups described an average of 1.3 types of income-generating activities developed since the FFS, which included the production of horticultural crops, banana, sweet potato, cassava, potato, pig production, goat production, seed multiplication, fishponds, and bee keeping; six out of 33 focus groups (18%) reported no income-generating activities.

3.2 Social domain

In the spider diagramming, farmers reported progress in planning and goal setting by the FFS groups; increased sharing of gender roles at household level; and an increased engagement of FFS groups in collective actions. The FFS groups expressed that friendship and respect among group members had been strengthened, and that farmers felt more included and were better able to express themselves in the group, as compared to the situation before the FFS (Table 2). The direct observations furthermore indicated that group cohesion was adequate in 28 out of 30 observed FFS groups (93%).

All focus groups stated that they had elected office bearers to establish an organizational structure in the FFS group (Table 3), which included roles as chair, secretary, and treasurer in 100%, 82%, and 73% of groups, respectively. The majority (82%) of focus groups indicated that the FFS group had plans and goals, for instance to buy an oil extraction machine, or to raise pigs as a business, while 30% of focus groups also specified that they had funds for implementing their plans. A change in gender roles at household level was evident; most focus groups (91%) mentioned that men and women increased their sharing of roles, for example, in cooking and cleaning, taking household decisions, and engagement in farming activities as compared to the situation before the FFS. Typical responses were “men are cooking now unlike in the past” and “men began to realize how burdened women and girls are”. Conversely, one focus group explained that selling of produce is still done by the men, and another focus group reported no change in gender roles, because “culture plays a role in determining roles for a man and a woman.”

12% of the focus groups stated that they were members of a larger FFS network or association; one focus group explained that they were members of a network of nearby FFS groups, and another that a few FFS farmers were members of a cooperative (Table 3). 64% of groups had engaged in collective actions, examples of which were the sale of produce as a group, collectively deciding on prices of produce for marketing, buying goats or pigs as a group, conducting income-generating activities together, harvesting together, and constructing houses together. Some of the reported results of the collective actions were increased group unity, increased recognition in the community, improved decisions on what to grow or what studies to conduct, improved negotiation for prices, finding ways to transport the produce, and increased profits. All focus groups stated that there was friendship and respect among FFS group members, apart from the occasional misunderstanding mentioned by some focus groups. Three out of the 33 focus groups mentioned that there was some level of mistrust or rivalry. All but one focus group mentioned that no one felt excluded, and that everyone participated in the FFS. Moreover, all focus groups expressed that their communication skills had improved since participating in the FFS. These findings suggest that the FFS led to improvements in cooperation and collaboration at household and group level.

3.3 Natural domain

In the spider diagramming, farmers stated that they had adequate food throughout the year, while they reported ‘only slightly’ adequate food for the situation before the FFS. Also, a ‘good number’ of food sources was indicated in the contemporary household situation, while farmers mentioned a less adequate number of food sources before the FFS (Table 2). Similarly, all focus groups stated that the FFS had contributed to an increase in the number of meals and food sources at the household level (Table 3). Typical responses were “we now take three meals per day” instead of one or two meals before the FFS, “with snacks in between”, and “we also consider all […] food groups” unlike before joining the FFS. These findings suggest that the FFS had a positive impact on food and nutrition security.

All but one FFS group indicated in their spider diagrams an improvement in crop yields and livestock production, although figures on actual yields were not given (Table 2). The average number of crops grown by farmers doubled from 2.2 before the FFS to 4.4 at the time of reporting. Moreover, farmers reported positive FFS outputs in terms of the use of improved agricultural practices and farmers’ ability to make balanced agricultural decisions. In focus group discussions, all groups reported that the FFS contributed to an increase in crop yields and livestock production (Table 3); 26 focus groups estimated the increase in crop yield, which was 78% on average, and six focus groups specified an increase in livestock production, which was 23% on average; however, these figures could not be verified. Some 91% of the focus groups noted an increase in the number of crops grown or the number of types of livestock kept (Table 3), in many cases from growing one crop (usually maize) before the FFS to growing two or more crops, including with intercropping, while several focus groups mentioned that they started producing poultry, pigs, ruminants, or fish after joining the FFS. In the direct observations, the observed condition of crops was found to be adequate in 23 out of 28 FFS groups (82%).

Advice from climate forecasting services was reportedly sought by 45% of the focus groups (Table 3); the advice was provided mostly through telephone messages, radio messages, extension officers, or, in two groups, directly from the meteorological department; the remaining 55% of the groups, however, did not seek advice or mentioned they did not know where to obtain it. Most focus groups noted that the agroecosystem analysis exercise helped improve their farming decisions, whilst four focus groups mentioned that they had no knowledge about agroecosystem analysis. Moreover, direct observations of the process of agroecosystem analysis showed that the field observations, data analysis and interpretation, and discussion of results were inadequate in 7 out of 20 observed FFS groups (35%).

3.4 Financial domain

In the spider diagramming, farmers stated that their financial situation had improved since the start of the FFS (Table 2). Substantial improvements were noted by farmers in their acquisition of savings and in their access to credit or savings services. Similarly, 88% of focus groups described that their financial situation had changed for the better since they joined the FFS (Table 3). Several explanations were given: an increase in production allowed farmers to sell the surplus; they grew more types of crops and livestock; they embarked on income-generating activities; and they became involved in village savings and loans schemes. The extent of financial gains could not be verified. Four out of 33 focus groups reported no change in their financial situation. Some 88% of focus groups mentioned an increase in their savings since joining the FFS, which they used to pay school fees, procure agricultural inputs, or pay into their village savings and loans schemes. Also, 88% of focus groups reported having access to credit or savings services, which was mostly through a savings and loans scheme in their village; these schemes were developed as an FFS programme activity. These results indicate a positive impact of the FFS on the financial security of farming households.

Conversely, only minor progress was reported in the engagement of farmer groups in marketing, farmers’ understanding of the market, and farmers’ marketing skills (Table 2). Only one-third of the focus groups had been engaged in group-based marketing of their produce, and only three out of 33 focus groups confirmed their engagement in value addition or value chains (Table 3). Market research to understand the demand, supply and prices on the market had been conducted to some extent by 24% of FFS groups, while other focus groups indicated that their FFS group did not conduct market research or lacked the skills to do so. In 13% of the FFS groups, a collective market hub had been established, for example, as a collection point where buyers come to buy and collect. Some 9% of FFS groups had reportedly developed a linkage with a cooperative, and where such linkage had been developed, this involved only one or a few members of the FFS group. Records of farm input costs and produce sales were being kept by 75% of groups, but several focus groups explained that these records were incomplete or lacked detail. Only one-third of focus groups reported the ability to calculate the break-even price, a necessity for marketing purposes. Moreover, only 21% of focus groups mentioned that they had acquired adequate bargaining skills for trade deals. In the direct observations, financial issues including costs, benefits and markets were not, or inadequately, discussed in the sessions of 19 out of 30 FFS groups (63%). Hence, weaknesses in aspects of marketing in the FFS were evident in the KULIMA project.

The data from spider diagramming were submitted to additional analysis on the effect of start-year. Table 4 shows the mean progress made by FFS groups in each domain by start-year; there is a significant effect of start-year in the human domain but not in the social, natural and financial domains. When the scores from all domains are pooled together, there is a marginally significant effect of start-year, with progress highest for groups that started in 2018 and lowest for those that started in 2020. This result suggests that those FFS groups trained longest ago experienced most progress; more data are needed to substantiate this finding.

Table 4 Mean progress (T1-T0) in spider diagram scores by start-year in the four domains
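The start-year comparison can be illustrated with a small sketch. The statistical test behind Table 4 is not named in this section, so a one-way analysis of variance is assumed here purely for illustration; the records are placeholder values, not study data, and `group_by_year` and `f_statistic` are hypothetical helper names.

```python
from collections import defaultdict
from statistics import mean

# Placeholder records, NOT study data: (start_year, progress score T1-T0).
records = [
    (2018, 2.1), (2018, 1.9), (2018, 2.3),
    (2019, 1.7), (2019, 1.8), (2019, 1.5),
    (2020, 1.2), (2020, 1.4), (2020, 1.1),
]


def group_by_year(rows):
    """Collect progress scores per start-year."""
    groups = defaultdict(list)
    for year, score in rows:
        groups[year].append(score)
    return dict(groups)


def f_statistic(groups):
    """One-way ANOVA F: between-group over within-group mean square."""
    values = [v for scores in groups.values() for v in scores]
    grand_mean = mean(values)
    k, n = len(groups), len(values)
    ss_between = sum(len(s) * (mean(s) - grand_mean) ** 2
                     for s in groups.values())
    ss_within = sum((v - mean(s)) ** 2
                    for s in groups.values() for v in s)
    return (ss_between / (k - 1)) / (ss_within / (n - k))


groups = group_by_year(records)
for year in sorted(groups):
    print(year, "mean progress:", round(mean(groups[year]), 2))
print("F =", round(f_statistic(groups), 2),
      "with", len(groups) - 1, "and", len(records) - len(groups), "df")
```

A large F relative to its reference distribution would indicate that mean progress differs between start-years; with only a few groups per year, such a result would remain tentative.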

Moreover, the spider diagram results from individual FFS groups showed a negative correlation between the situation before the FFS and the level of progress made (r = -0.50; 32 df; P < 0.01) (Fig. 3). This pattern suggests that, in general, those FFS groups that were self-reported to be in the poorest situation before the FFS gained most from their FFS participation.

Fig. 3 Scatter plot of the relationship between the situation before the FFS (T0) and ‘progress’ (T1-T0) as measured in the spider diagramming. Results indicate the mean score across all 24 questions for individual FFS groups (n = 33 groups). Dotted line shows the linear trend line of the regression (r = -0.50; 32 df; P < 0.01)
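As a rough check on the reported significance, the t statistic implied by a Pearson correlation can be recomputed from r and the degrees of freedom. The sketch below is illustrative only and uses the Python standard library; the helper names are hypothetical, and the p-value comes from a simple numerical integration rather than a statistics package. It is run with 32 df as reported and with 31 df (n − 2 for 33 groups, the usual convention for this test and an assumption here).

```python
import math


def t_statistic(r: float, df: int) -> float:
    """t statistic for testing a Pearson correlation r against zero."""
    return r * math.sqrt(df / (1.0 - r * r))


def t_pdf(x: float, df: int) -> float:
    """Probability density of Student's t distribution."""
    log_norm = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_norm - (df + 1) / 2 * math.log1p(x * x / df))


def two_sided_p(t: float, df: int, steps: int = 20000) -> float:
    """Two-sided p-value via Simpson's rule on the density from 0 to |t|."""
    upper = abs(t)
    h = upper / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central_mass = total * h / 3  # probability between 0 and |t|
    return max(0.0, 1.0 - 2.0 * central_mass)


r = -0.50
for df in (32, 31):  # 32 as reported; 31 = n - 2 for n = 33 groups
    t = t_statistic(r, df)
    print(f"df={df}: t={t:.2f}, p={two_sided_p(t, df):.4f}")
```

Under either choice of degrees of freedom, the computed p-value falls below 0.01, in line with the reported result.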

3.5 Review of MEL framework

Participants of the evaluation workshop reviewed the process and results of the pilot testing in their respective districts. The framework and three tools were considered effective in generating useful results, and their use was considered feasible and acceptable by those conducting MEL and the farmer respondents. Some modifications to the MEL framework were proposed for operational use, namely improving the questions for each tool and reducing the number of questions. Also, the ‘financial domain’ was replaced with a ‘financial/physical domain’, thus allowing the inclusion of assets acquired by farmers or farmer groups. The spider diagramming was considered suitable for participatory evaluation by FFS groups because it provided a comprehensible overview of a wide range of relevant targets.

The comparison of the results from matching questions in the spider diagramming and the focus group discussions, which have been specified in Tables 2 and 3, showed a positive linear relationship between the two tools (r = 0.73; 22 df; P < 0.001) (Fig. 4). Those questions with high scores in the spider diagramming generally gave high response rates in the focus group discussions. Four questions with low scores in the spider diagramming also gave low response rates in the focus group discussions. Consequently, the outcomes of the two tools were considered comparable, implying that the tools could be used side-by-side to provide cross-verification of results.

Fig. 4 Scatter plot of the mean ‘progress’ (T1-T0) reported in spider diagrams versus the mean response in focus group discussions in matching questions. Dots signify individual questions (n = 23 questions). Dotted line shows the linear trend line of the regression (r = 0.73; 22 df; P < 0.001)
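The cross-tool comparison can be sketched as pairing, for each matching question, the mean spider-diagram progress with the focus-group response rate and computing the Pearson correlation over those pairs. The values and question labels below are placeholders, not study data, and `pearson_r` is a hypothetical helper written out for transparency.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Placeholder values, NOT study data. One entry per matching question:
# (mean spider-diagram progress T1-T0, share of focus groups responding
# positively).
matching_questions = {
    "question_a": (2.0, 0.91),
    "question_b": (1.6, 0.88),
    "question_c": (1.1, 0.64),
    "question_d": (0.5, 0.33),
    "question_e": (0.4, 0.09),
}

progress = [p for p, _ in matching_questions.values()]
response = [s for _, s in matching_questions.values()]
print(f"r = {pearson_r(progress, response):.2f} "
      f"over {len(progress)} questions")
```

A high positive r over the matching questions is what would support using the two tools side by side for cross-verification.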

4 Discussion

4.1 Utility of MEL

The three data-collection tools selected for our study – spider diagramming, focus group discussions, and direct observations – demonstrated their feasibility and acceptability for use by practitioners at the district level in Malawi. The tools were relatively easy to use by the MEL teams and produced valuable information in line with the targets of the project-specific framework. Entry and management of data of the spider diagramming by MEL teams was particularly straightforward because of the numeric responses. Spider diagramming was also found to be suitable for participatory evaluation by FFS groups, for MEL at the farmer group level, because the tool visualized the effects of the FFS and provided quick feedback which was easy to interpret or summarize by farmers (Mancini et al., 2007). Focus group discussions were more time-intensive in use and analysis, but our intention was to reduce the number of questions for operational use after the pilot testing phase. The narrative responses of the focus group discussions provided a means of verifying the numeric results obtained through spider diagramming. Moreover, the direct observations offered an independent view on certain activities and outputs that were visible at the time of visit, as a reality check of the FFS. The direct observations revealed weaknesses in two core elements of the FFS, the agroecosystem analysis and field studies or experiments (FAO, 2016), which suggests that quality assurance of these elements could enhance the future outcomes and impacts of the FFS.

A strength of spider diagramming and focus group discussions was that these tools collected views on wide-ranging topics from farmers as ‘end-users’ of the FFS. In this context, the targets in the four domains provided relevant information that was directly or indirectly related to food security and climate change adaptation, including on improved adaptive capacity, experimentation, increased farming options, collective actions, improvements in food sources, meals, yields, crop diversification, improved crop cultivation practices, and improvements in the financial situation. These subjective measures for routine use of MEL by operational programmes offered a simple and efficient alternative or supplement to the scientific use of predefined indicators or indices (e.g., on food security or climate change adaptation) in impact assessment (Coger et al., 2021). It is expected that these MEL methods can be adapted for use in other FFS programmes and other contexts through modification of the framework’s targets or the questions regarding each target.

A limitation in pilot testing was that the responses in spider diagramming and focus group discussions may have been over- or under-stated. Resource-poor farmers might give positively biased responses due to factors such as vested interests, peer pressure, fear of being stigmatised, or to compensate for weaknesses in other areas. The presence of members of the MEL team may also have influenced the responses. The MEL teams included master trainers, whose expertise in the FFS benefited the pilot testing phase but who could also have had a vested interest in reporting on the FFS, although the master trainers did not participate in data collection within their own working area. Another limitation is that the retrospective reporting in the spider diagramming may have been subject to recall bias, whilst other contemporary changes (e.g., other programmes, fluctuating prices) could have influenced the effect or effect size attributed to the FFS. To address the limitations and to substantiate and quantify the impacts of the FFS, it has been recommended that programmes supplement their routine MEL activities with an independent impact assessment to measure specific indicators or indices (van den Berg et al., 2023), for instance on food or nutrition security, preferably by including a control group (i.e., farmers who have not been exposed to the FFS).

The MEL concept implies that monitoring and evaluation is used on a routine basis to learn from the results and improve the quality and effects of the activities or interventions (Stone-Jovicich et al., 2019). In the FFS context, MEL demands a mechanism with recurrent learning cycles (e.g., annually) for collecting and interpreting data, for critical review, and for the development and implementation of corrective action through modified project activities. For example, the review of the results presented in our study should be followed up by curriculum development, refresher training of facilitators, and farmer training on aspects of marketing and climate forecasting to introduce or strengthen these elements through the training-of-trainers courses into FFS groups. MEL activities within FFS programmes can be conducted at the central level, at decentralized levels (e.g., district level), or at a combination of these, depending on the level at which the capacity is established for data collection, data interpretation, regular review, corrective action, oversight, training and guidance. A potential challenge in plans for operationalizing MEL in Malawi is to establish capacity for MEL at the district level. To address this challenge, the programme will be providing tailor-made training for district officers, as the district MEL team, on sample selection, data management, analysis and interpretation, and on the process of annual review and adaptation of interventions.

4.2 Effects of the FFS

Our pilot study demonstrated a range of effects attributed to the FFS. By using four domains of sustainable rural livelihoods, the MEL framework broadened the scope of evaluation of the FFS from what has been commonly practiced (Bakker et al., 2022; van den Berg et al., 2020b). The effects were evident in each of the four livelihood domains. Regarding the interactions between the capital domains, it is plausible that the combination of effects in the four domains – effects such as innovation, cooperation, agricultural diversification, and marketing – is needed to enable farmers to improve their situation of household food security and to cope with stress and shocks such as those imposed by climate change (Connolly-Boutin & Smit, 2016). Another FFS study using spider diagramming showed an interdependency between the human and social capitals and between the financial and physical capitals, suggesting that a change in one capital could trigger a change in the other (Mancini et al., 2007).

FFS members gained confidence and a positive attitude towards farming, they experimented in their fields, and they had improved their options in farming and their control over their living situation; these results suggest that a process of empowerment took place (Friis-Hansen & Duveskog, 2012). Cohesive FFS groups were formed that established their own organizational structures, with plans and goals, and with collective actions reported by most FFS groups. There was evidence that men and women started sharing roles; previous data from Kenya showed that mixed-gender FFSs contributed to role sharing and breaking of social customs regarding gender roles (Friis-Hansen et al., 2012). These social impacts of the FFS could strengthen safety nets and increase the prospects of farmers to embark on economic and entrepreneurial activities. Other effects of the FFS were an increase in the number of meals and food sources, an increase in crop yields, and agricultural diversification in terms of crops grown and livestock ownership. These outcomes can positively affect household food security and climate adaptive capacity as indicated in previous studies (Mango et al., 2018; Maganga et al., 2021). Moreover, the results indicated that the financial situation of farmers improved after joining the FFS, with increased savings and improved access to credit or savings services. However, weaknesses were demonstrated regarding marketing and marketing skills. Farmers mentioned that they had received no or insufficient training on the analysis of costs and benefits and on marketing skills. This is an area that has received inadequate attention in the KULIMA project at field level.

The FFS groups that were trained longest ago appeared to have made most progress. A likely explanation of this result is that the process of learning, development, and experimentation continued in the years after their FFS training, apparently without further project intervention, which supports the sustainability of effects. This result could possibly be traced back to the fact that FFS groups participated actively in the development of their FFS curricula, and that the farmers learnt to experiment and adapt during their FFS, which are factors that have been associated with a process of change and adaptation (Bakker et al., 2021; Douthwaite et al., 2007). Follow-up work is needed to verify these findings; for example, training quality and participation may not have been constant every year but could have been negatively influenced by the COVID-19 pandemic in the year 2020.

Another finding of our study was that those FFS groups that were in the poorest situation before the FFS, in terms of targets across the four domains, appeared to gain most from the intervention. This finding may have implications for the targeting of FFS interventions. Targeting farmers who have a poor livelihood situation, with low education level, limited social capital, food and nutrition insecurity, or a lack of savings, could thus optimize the impacts of the FFS in the Malawian context. The targeting of poorer farmers is especially relevant because those with lower education or income or with less land are generally the most vulnerable to the effects of climatic change (Wilts et al., 2021; Striessnig et al., 2013). This finding contrasts with the results of a meta-analysis across FFS programmes which indicated that FFS programmes that targeted better-educated farmers tended to be most effective in reducing pesticide use and increasing yields or incomes for farmers (Phillips et al., 2014). However, it is postulated that the Malawian model of a three-season FFS, as opposed to the regular one-season FFS in most other countries (Waddington et al., 2014), may have enabled farmers with a poorer background or less education to catch up with farmers who were better-off prior to the FFS.