1 Introduction

Agricultural research and extension activities that rely on experimentation have been embedded in various ways in physical and institutional spaces (including laboratories, research stations, and farms). They have been associated with various methodological developments or research fronts in disciplines such as agronomy, genetics, crop physiology, livestock science, and ecology (Sumberg et al. 2003; Maat 2011; Parolini 2015). Recently, an explicit interest in experiments that take place on farms in collaboration with farmers has led to a renewed and specific definition proposal for “on-farm experimentation” (OFE) (Lacoste et al. 2021).

The respective roles and legitimacies of experimental research stations and farmers’ fields as appropriate spaces to produce knowledge and innovations and their evolution, as changes in agronomy-centered research occurred, have been partially documented. Historical studies have already shown how experimentation practices are influenced by developments in methodologies and basic disciplines. For example, in the mid-nineteenth century, the rise of chemistry and inferential statistics was decisive in the development of agronomic research stations (Jas 2000). While research stations were preferred spaces for the development of generalizable knowledge and rules that were to be disseminated to farmers, the rise of more systemic and context-dependent approaches (for instance the Farming System Research movement) has led to a greater focus on farmers’ practices and has given some legitimacy back to experiments in less-controlled environments such as that of commercial farms (Lockeretz 1987; Jouve 2007). These types of experiment were, for instance, proposed for more adequately taking into consideration particular local circumstances in the implementation of techniques or with regard to specific farmers’ objectives, or in making necessary final adaptations of new technologies (Sumberg et al. 2003). In fact, agronomists’ ways of designing new practices and systems, and of disseminating them and the associated knowledge, entailed a new relationship to farms as places where experiments are carried out and to farmers’ practices in general (Salembier et al. 2018). Interactions between on-station and on-farm experiments have consequently been examined from different perspectives. Some researchers focus on the balance between the scientific legitimacy of knowledge production and the relevance of the proposed techniques in real farming conditions (e.g., Henke 2000). 
Others underline the reciprocal inspiration among farmers’ own experiments, observations made at stations, and on-farm experimentation in which farmers and researchers collaborate (Maat and Glover 2012; Périnelle et al. 2021). Along the same lines, some authors directly associate the saliency of OFE with the research questions to be answered, namely, questions concerning agricultural techniques that are particularly sensitive to management skills and conditions of implementation in particular contexts (Lockeretz 1987). These various approaches to OFE in research activities exhibit a diversity of applied practices and tools, regarding what the actual form of intervention on the farm is, who the involved actors are as well as their roles and legitimacy, and what observations and instruments applied for interpretation are (Fig. 1). We argue that there is still a lack of studies providing descriptions and distinctions within this diversity to inform agricultural research communities.

Fig. 1

a Two treatments are applied on a 0.25-ha farm to compare different ways of growing camelina (monocropping and intercropping with barley). b Yields of an on-farm trial being measured with harvest equipment rather than with microplot sampling (credit: Margot Leclère).

These descriptions of the realities of OFE practices are all the more salient now as the literature shows a recent revival of interest in OFE (Lacoste et al. 2021) at the intersection between several agricultural research fronts. First, enhancing agroecological transitions calls for “option by context” approaches (Sinclair and Coe 2019), taking into account the particularities of farming situations and the complexities and uncertainties in their evolution as practices change (Meynard et al. 2012; Tittonell et al. 2020). Bringing experimental processes into these diverse situations is one of the means explored to allow the complexity, uncertainty, diversity, and variability linked with these realities to be addressed (Lacoste et al. 2021). Second, the development of digital tools to acquire, store, analyze, and share a wide variety and large quantities of data and knowledge has increased the interest in on-farm experimentation (Bullock et al. 2019; Lacoste et al. 2021). Digitalization is often related to the development and improvement of precision agriculture methods for monitoring and managing variability (Kyveryga 2019). However, digitalization also supports the development of tools and platforms for sharing qualitative knowledge, situated experiences, and action strategies in more contextualized and illustrated ways (Compagnone et al. 2018; Girard and Magda 2020; Salembier et al. 2020). Third, the increasing development of open innovation theories and approaches within the agricultural sector also lays the foundations for OFE practices (Cook et al. 2021). In fact, OFEs are part of many coordinated activities comprising innovation processes in which their functions and outputs depend on the actual experimental practices applied. 
For example, new innovative processes and arrangements (e.g., living labs and transition experiments) link OFE practices to the recognition of the value of combining diverse interests (public and private) and to the fostering of trusting and productive relationships among actors (Schäpke et al. 2018). The renewal of OFE favors “joint exploration whereby researchers and others engage closely with farming realities to align with the ways farmers learn” (Lacoste et al. 2021). Over the past few decades, various on-farm experimentation practices have thus emerged in accordance with and as part of distinct approaches to innovation. These range from technology transfer (where farmers are considered as adopters and scientists innovate from the knowledge they produce) to the farming system research approach (where farmers provide scientists with knowledge and information about particular contexts and needs) and, recently, the agricultural innovation system (where scientists and farmers are partners in co-innovation processes along with other actors in various political, agro-climatic, economic, and institutional contexts) (Hall 2007; Klerkx et al. 2012). These three archetypal movements, coexisting but anchored in different innovation paradigms, are inviting agricultural researchers to build reflexivity about what the various characteristics of OFE practices are and how they contribute to different innovation dynamics.

In this paper, our objective is to facilitate this reflexivity and transparency on how OFE is conducted by benchmarking existing practices. Through a literature review, we highlight the diversity of ways of defining and performing OFE with a focus on the characteristics of experimental practices. Regarding this diversity, we also aim to identify and discuss the various forms of digital development that support these experimental practices or that could do so. More specifically, we answer the following questions: can we identify diverse types of experimental practices within the OFE-related literature? If so, how do these different forms of OFE rely on digital devices and what would be the various roles that digital technologies could acquire in these OFE practices? We first present our two-step method, based first on the scientometric analysis of a large corpus of publications, and, second, on a more in-depth and qualitative analysis of subsets of publications spread across the topics identified. This led us to identify seven types of OFE practices based on a specifically proposed analytical framework. After describing these types of OFE practices, we discuss in the final section how this typology of approaches can support the deeper analysis of their intertwining with the development of digital technologies and, more generally, different OFE development perspectives.

2 Material and methods

Our research process was two-fold. First, we built a corpus of literature addressing theoretical aspects of OFE or presenting research based on OFE practices. We applied scientometric analysis based on methods developed for mapping networks and characterizing the socio-semantic dynamics from publication corpuses (Cointet 2009). This type of analysis combines: i) the construction of a specific corpus, ii) lexical extractions that make it possible to build lists of terms characterizing diverse practices, iii) indexation of the corpus with these custom lists of terms, and iv) analysis of the frequencies and co-occurrences of these terms within the corpus. In the second step, we used the resulting structure of the scientific corpus to build an analytical framework. This framework served to describe experimental practices and to steer a more in-depth qualitative review of a sample of representative articles.

2.1 Construction of the corpus and semantic analysis

We performed a query on the SCOPUS database, combining variants of terms referring to the location of farm experiments or farmer-centric approaches with terms related to experimentation activities. Terms related to farmer participation in research activities were considered in constructing the query (e.g., “participatory research,” “participatory experiment,” and “collaborative research”), but they returned corpuses that went well beyond the literature allowing us to describe experimentation practices specifically and their relations to farms. Furthermore, we ensured that a query without these terms would still return articles related to participatory research by checking that specific articles we knew should have been captured were indeed present in the corpus. We focused the query on capturing mainly English-language academic literature. This made the textual analysis possible on the whole resulting corpus (since the tools applied for the scientometric analysis cannot handle several languages simultaneously). This was in alignment with our objective to explore the mainstream practices in the leading internationally renowned literature. Finally, the following query, which combines the targeted terms at a maximum distance of 5 terms, was applied to titles, abstracts, and keywords: TITLE-ABS-KEY ([“on farm” OR “on-farm” OR “farmer-centric” OR “farmer? led” OR “farmer? field” OR “farmer? managed”] W/5 [experiment* OR trial OR demonstration OR test OR survey OR research OR evaluation]).
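The proximity logic of this query can be illustrated with a minimal Python sketch. It is not the SCOPUS implementation: the regular expressions, the function names, and the reading of W/5 as "at most five intervening words, in either order" are assumptions made for illustration, and plural forms are handled only crudely.

```python
import bisect
import re

# Assumed approximations of the two groups of query terms.
LOCATION = re.compile(
    r"\bon[- ]farm\b|\bfarmer-centric\b|\bfarmer\w?['’]?[- ](?:led|fields?|managed)\b",
    re.IGNORECASE,
)
ACTIVITY = re.compile(
    r"\b(?:experiment\w*|(?:trial|demonstration|test|survey|research|evaluation)s?)\b",
    re.IGNORECASE,
)
WORD = re.compile(r"[A-Za-z0-9]+")


def word_index(starts, char_pos):
    """Index of the word that begins at or before a character position."""
    return bisect.bisect_right(starts, char_pos) - 1


def matches_query(text, max_gap=5):
    """True if a location term and an activity term lie within max_gap words."""
    starts = [m.start() for m in WORD.finditer(text)]
    location_spans = [
        (word_index(starts, m.start()), word_index(starts, m.end() - 1))
        for m in LOCATION.finditer(text)
    ]
    activity_positions = [word_index(starts, m.start()) for m in ACTIVITY.finditer(text)]
    for first, last in location_spans:
        for pos in activity_positions:
            # Number of words strictly between the two matches.
            gap = pos - last - 1 if pos > last else first - pos - 1
            if 0 <= gap <= max_gap:
                return True
    return False


if __name__ == "__main__":
    print(matches_query("Results of an on-farm trial in Kenya"))
```

Applied to a title or abstract, such a filter would only approximate which records the database returns, since the database applies its own tokenization and stemming rules.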

This query returned 3955 publications (on August 16, 2021).

The 2000 most frequent terms (including monograms) were extracted from titles and abstracts with the CorText Manager Platform (http://manager.cortext.net/). After removing duplicates and grouping together closely related terms, we refined the list to retain only terms related to experimentation (for instance, we removed all terms describing agronomic objects such as species names but kept terms like control treatment, crop response, and cropping systems). We thus obtained a list of 926 terms. The corpus was indexed with this custom list of terms based on the title, abstract, and keywords fields. Co-occurrence counts and network mapping were then performed with the new indexation of the corpus.
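The indexation and co-occurrence counting described above can be sketched as follows. This is an illustration under simplifying assumptions, not the CorText Manager procedure: terms are matched as plain lowercase substrings, and the function names and the three toy texts are hypothetical.

```python
from collections import Counter
from itertools import combinations


def index_document(text, term_list):
    """Return the set of list terms found in a text (plain substring match)."""
    lowered = text.lower()
    return {term for term in term_list if term in lowered}


def cooccurrence_counts(documents, term_list):
    """Count, over all documents, term frequencies and pairwise co-occurrences."""
    term_freq = Counter()
    pair_freq = Counter()
    for text in documents:
        terms = sorted(index_document(text, term_list))
        term_freq.update(terms)
        # Each unordered pair of terms present in the same document counts once.
        pair_freq.update(combinations(terms, 2))
    return term_freq, pair_freq


if __name__ == "__main__":
    terms = ["control treatment", "crop response", "cropping systems"]
    toy_documents = [  # invented texts, for illustration only
        "Crop response to the control treatment in cropping systems",
        "Control treatment and crop response on farms",
        "Cropping systems of smallholders",
    ]
    term_freq, pair_freq = cooccurrence_counts(toy_documents, terms)
    for pair, count in pair_freq.most_common():
        print(pair, count)
```

Counting each pair once per document, rather than once per occurrence, keeps long abstracts from dominating the counts. Whether the platform used here makes the same choice is not stated in the text.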

The network mapping relied on the calculation of distances between terms, for which we applied the distributional measure as it is commonly used for homogeneous networks (i.e., for calculating distances between terms from the same fields of publication descriptions) (Weeds and Weir 2005). Closest subgroups of terms were then clustered based on the Louvain algorithm, limiting the number of closest neighbor terms to seven to improve the readability of the generated map.
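The neighbor-filtering step can be illustrated with a short Python sketch. As a loudly stated assumption, cosine similarity between co-occurrence profiles stands in for the distributional measure of Weeds and Weir (2005), which is not reproduced here. The function names and the default threshold are hypothetical.

```python
import math


def cosine(profile_a, profile_b):
    """Cosine similarity between two sparse co-occurrence profiles (dicts)."""
    shared = set(profile_a) & set(profile_b)
    dot = sum(profile_a[t] * profile_b[t] for t in shared)
    norm_a = math.sqrt(sum(v * v for v in profile_a.values()))
    norm_b = math.sqrt(sum(v * v for v in profile_b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def nearest_neighbor_edges(profiles, k=7, threshold=0.0):
    """Keep, for each term, only its k most similar terms above a threshold.

    Cosine similarity is a stand-in for the distributional measure.
    Returns a dict mapping an alphabetically ordered term pair to its score.
    """
    edges = {}
    for term, profile in profiles.items():
        scored = [
            (cosine(profile, other), name)
            for name, other in profiles.items()
            if name != term
        ]
        scored = [(score, name) for score, name in scored if score > threshold]
        # Highest similarity first; ties broken alphabetically for determinism.
        scored.sort(key=lambda item: (-item[0], item[1]))
        for score, name in scored[:k]:
            edges[tuple(sorted((term, name)))] = score
    return edges
```

The resulting edge list could then be passed to any implementation of the Louvain community detection algorithm to obtain clusters of terms. Limiting each term to its seven closest neighbors thins the graph before clustering, which is what makes the final map legible.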

The network of co-occurrences of terms was first analyzed to identify the types of methodologies applied in OFE practices and the scientific communities to which these uses refer (namely, by identifying journals related to each cluster). For instance, we analyzed the terms used in connection with “on-farm research,” “on-farm experiment,” and “on-farm trials,” as these three expressions are grouped in different clusters. This first analysis was based on publication metadata (namely authorship, publication journals, disciplinary areas, keywords, and titles) and the network of terms, without including entire publications. Reading entire subgroups of publications and analyzing them in-depth served as the second step of our analysis. The mapping of term co-occurrences made it possible to identify clusters and associate the publications of the corpus with each cluster. Between 5 and 10 of the most cited publications from each cluster were selected for the second step of our analysis.
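The selection of the most cited publications per cluster can be sketched as below. The record fields (`cluster`, `citations`, `in_scope`) and the function name are hypothetical, and the scope judgment itself was made by reading, which code cannot reproduce.

```python
def most_cited_per_cluster(publications, n_max=10):
    """Group publication records by cluster and keep the n_max most cited.

    Each record is a dict with hypothetical keys "cluster" and "citations";
    an optional "in_scope" flag marks records excluded after reading.
    """
    by_cluster = {}
    for pub in publications:
        if not pub.get("in_scope", True):
            continue  # skip publications judged outside the scope of the inquiry
        by_cluster.setdefault(pub["cluster"], []).append(pub)
    return {
        cluster: sorted(pubs, key=lambda p: p["citations"], reverse=True)[:n_max]
        for cluster, pubs in by_cluster.items()
    }
```

With `n_max=10`, each cluster contributes at most ten publications. The lower bound of five per cluster would need a separate check.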

2.2 Framework to analyze the specificities of experimental practices

To decipher practices associated with the various dominant terms in the corpus and to better understand their diversity as reflected by the clusters obtained, we applied a specific analytical framework.

Our analysis of experimental practices described in the selected articles was based on relevant analytical categories inspired by preceding works on experimentation either in urban and public spaces (e.g., Laurent and Tironi 2015) or in rural contexts (e.g., Lovell et al. 2018), and on studies on experimentation in sustainable transition contexts (e.g., Caniglia et al. 2017).

We connected four analytical categories that were appropriate for further understanding the experimental logic behind the analyzed cases of OFE practices in each cluster:

i) Construction of a space for experimentation: this refers to aspects that are considered in order to characterize a particular experimentation site. As Engels et al. (2019) point out, “what counts as real-world conditions for testing are never just ‘out there,’ but always subject to interpretation and occasionally highly contested.” In fact, we focus specifically on two aspects: a) the delineation of what is considered as part of the experiment and what is not (e.g., to what extent are complete plots or cropping systems included in the analysis? Are the adaptations applied by the farmers considered? To what extent are the existing farmers’ practices intertwined with the applied treatments and analyzed?) and b) the dimensions of on-farm contexts that are included in the characterization, selection or comparison of experimental sites (e.g., are socioeconomic contexts considered as part of what characterizes experimental situations?).

ii) Specificities of interventions: what are the practices effectively implemented on experimental sites? For how long? Who acts and how? In which parts of the experimental process? Is there an iterative adjustment of protocols during the course of the experiment and across various sites? How are practices monitored?

iii) Observations and measurements: for the same intervention, the ways in which experimenters capture the outcomes of actions (around which the uncertainty justifies the experimental process) can take multiple forms. What types of data are obtained? What instrumentation is deployed? Which observations and analyses are included in the valorization of the experiment? How are unplanned observations or events handled when these are mentioned?

iv) Mention of digital technologies: mention of tools or of the opportunities to develop them even when these are not explicitly related to a digitalization process.

This framework made it possible to identify that each cluster derived from the scientometric analysis sometimes covers widely differing on-farm experimental practices. We thus progressively defined different types of on-farm experimentation by iteratively allocating to each one the most cited publications of each cluster. The qualitative identification of these types of OFE practices resulted from common features, based on the four analytical categories described above, encountered in the publications across the different clusters. A minimum of five of the most cited articles associated with each cluster of terms (excluding publications finally considered to be outside the scope of the inquiry) were included in this second part of the analysis. We selected the most cited articles rather than a random sample, despite the risk of bias related to publication year, in order to favor influential publications reflecting mainstream research practices and thoughts on the topic.

3 Results

The query returned 3955 documents published from 1940 onward, with a steep and steady increase in the number of publications from 1980. The five most frequent journals were Field Crops Research (157 articles), Experimental Agriculture (118), Agronomy Journal (102), Agricultural Systems (88), and Acta Horticulturae (85).

3.1 Analysis of the structure of the corpus: Seven thematic clusters related to OFE

The co-occurrences of the most frequent terms appearing in titles, abstracts, and keywords were mapped according to their frequencies and proximity (Fig. 2). Six clusters of terms were clearly identified, as well as a peripheral one consisting of only three terms (in red in Fig. 2), across a total of 3955 publications. These clusters were named with the two terms presenting the most numerous connections with other terms within the same cluster.
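The naming rule can be sketched as a within-cluster degree count. The function name, the alphabetical tie-breaking, and the toy edge list in the usage note are assumptions for illustration and do not reproduce the actual network.

```python
from collections import Counter


def name_cluster(cluster_terms, edges):
    """Name a cluster after its two terms with the most within-cluster links.

    edges is an iterable of (term_a, term_b) pairs; links to terms outside
    the cluster are ignored. Ties are broken alphabetically (an assumption).
    """
    members = set(cluster_terms)
    degree = Counter({term: 0 for term in members})
    for a, b in edges:
        if a in members and b in members:
            degree[a] += 1
            degree[b] += 1
    ranked = sorted(members, key=lambda term: (-degree[term], term))
    return " and ".join(ranked[:2])
```

For example, with the invented links `[("model", "error"), ("model", "sensor"), ("model", "precision"), ("error", "sensor")]` among the terms model, error, sensor, and precision, the function returns "model and error".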

Fig. 2

Co-occurrences of terms extracted from titles, abstracts, and keywords. The 200 most frequent co-occurring terms are shown (distributional proximity, proximity threshold 0.43, filtering of the seven closest neighbor terms).

The largest cluster concerned “grain yields and fertilizer management” (light green, 1206 publications associated, from journals such as Field Crops Research and Indian Journal of Agronomy) corresponding to terms related to yield (most frequent associated terms: increased yields and yield gap) and fertilization management within cropping systems (NPK fertilizer, fertilizer application, fertilizer management, nutrient management, cropping systems, and crop production). This cluster corresponded to studies generally focused on single species (e.g., rice, wheat, and maize), associated with research-based fertilization strategies to be tested or compared to traditional ones (Cui et al. 2008; Peng et al. 2008). This research on fertilization strategies includes specific types of OFE practices, as we will describe below.

An opposite cluster, “knowledge and innovations” (dark green, 755 publications), comprised terms related more to farmers’ participation and knowledge (work, farmer participation, collaboration, perceptions, knowledge, role, and decision) and to farmers’ perceptions of innovations and implementation (demonstration, adoption, perceptions, and innovations). Contrary to the preceding cluster, the term on-farm research was more common in the associated publications than on-farm trials or experiments. Associated publications dealt with farmers’ learning, risk perceptions (e.g., Ghadim et al. 2005), and complete farming systems or consistent combinations of practices (e.g., conservation agriculture, agroforestry, and system of rice intensification), as well as participatory research (e.g., Chambers and Ghildyal 1985; Carberry et al. 2002).

A smaller cluster, both in number of terms and associated articles, concerned “model and error” (orange, 216 publications). This cluster was connected to precision agriculture and sensor technologies (site specific, sensor, variability, precision, estimate, and model). The most frequent journals were Precision Agriculture and Computers and Electronics in Agriculture. Articles from this cluster focused on the characterization of on-farm variability to refine new technologies or models (e.g., hyperspectral canopy sensing of paddy, Gnyp et al. 2014), or proposed plans for on-farm precision experiments (Alesso et al. 2021).

Two related clusters were articulated around the term on-farm evaluation. One of these clusters, “laboratory and samples” (blue, 816 publications), corresponded to evaluations based on collection of information and observations on farms (assessment, farm surveys, measures, and value), as in the case of unpredictable events that occur on farms and are difficult to reproduce experimentally (e.g., a case control study on on-farm risk factors for tail biting in pigs, Moinard et al. 2003). The other cluster, named “station and weight” (yellow, 606 publications), concerned relations between experimental stations and farms for evaluation of varieties and crop traits and preferentially applied the term on-farm trial (varieties, environment, performance, replication, and on-station). For instance, Casler et al. (1998) evaluated perennial forage grass varieties for management-intensive grazing (MIG) systems by replicating trials on three dairy farms in southern Wisconsin.

The sixth and last cluster of interest, “smallholder and food security” (light orange, 306 publications), concerned farming system analyses and impacts of policies and climate change. Different approaches were used in this variegated cluster (e.g., questionnaires and surveys, on-farm experiments, and action research), but it specifically comprised works that address households with regard to labor, income, and livelihoods. These dimensions of analysis led to particular methods for characterizing contexts in which experiments were to take place (e.g., surveys on labor and perceived constraints by farmers, and socioeconomic diagnoses).

The described clusters showed distinct orientations in the use of farms as spaces where experiments are performed. Some objects appeared to be studied with specific OFE approaches. “On-farm evaluation,” for example, seemed to be related to the measurement of traits, performances, and dry matter weights, applied notably to the study of varieties and in connection with research stations (“station and weight” cluster, Fig. 2). However, these objects are more closely associated with works in a distinct subfield of agricultural research (that which is dedicated to biological measurements that happen to be taken on farms) than with works that define how the corresponding methodologies apply (or use) experimentation on farms. This distinction can, for instance, be illustrated by the following: how the relation between on-farm evaluation and on-station trials with varieties is constructed, how various environments are characterized, and who acts in these experimental processes. The second step of analysis, presented in the next section, developed this more in-depth qualitative description of OFE practices.

3.2 Analysis of variety in experimentation practices associated with OFE

The identification of clusters made it possible to select publications for each one that would help refine the understanding of the associated research through in-depth reading. We applied the analytical framework to subgroups of the 5 to 10 most cited publications from each cluster in order to analyze the construction of the experimentation space, interventions, measures and observations, and the role and importance of digital devices. These four analytical categories brought out important variations in practices between clusters and within clusters that we inductively classified into 7 distinct “types of OFE practice.” These types and their main distinctive characteristics are summed up in Table 1. We describe each type in the following sections.

Table 1 Summary of the distinct types of OFE practices according to our analytical framework, with potential digital uses

3.2.1 Type 1: Exploring and explaining a phenomenon through the diversity of farmers’ circumstances and practices

Type 1 corresponded to experimentation on a large number of farms in which biophysical and farming situations were carefully assessed to further the understanding of their variability and how it affected the biophysical processes investigated. Assessments of diverse situations regarding the biophysical processes impacting a phenomenon under study were used to analyze variations and better understand the phenomenon itself. This was, for instance, typically the case in the study “The effect of shade structure on coffee grain yields” (Soto-Pinto et al. 2000) and in the research by Rockström et al. (2007), who applied measurements and assessed variations in widely contrasting situations to demonstrate that “crop transpiration and yield show nonlinearity under on-farm and low yield conditions.” The research methods often combined surveys (for instance to identify various constraints that farmers face, Fermont et al. 2009) and diagnoses with treatment comparisons on farms. Sometimes, various experimental treatments were applied on farms, with usual farmers’ practices as one of them. Mostly, however, the treatments applied corresponded to slight modifications of one aspect of farmers’ practices (e.g., fertilization practices and cultivar choice) in order to explore the variability of impacts of actions and of the targeted phenomenon in diverse situations. In fact, farm plots were sometimes associated with the term “environmental sampling” (Meynard et al. 1981).

In this type of OFE practice, digital tools were very seldom mentioned or applied.

3.2.2 Type 2: Validating models or technologies in a large range of biophysical contexts through standardized protocols

In type 2, the experimental sites on farms were selected to obtain the greatest and most representative variety of biophysical and agronomic conditions for assessing the robustness of technologies or technical models when applied in those diverse conditions. These experiments were not designed to answer new research questions but to validate previous research-based developments (models and technologies). For instance, Kanampiu et al. (2003) verified “that the herbicide seed coating technology is successful in multi-site on-station studies and especially in farmers’ fields in different conditions and environments.” Similar treatments were usually applied during a single study both at experimental stations and on farms, often on randomized block or split-plot designs, proposed and mostly managed by agronomists. This is consistent with interventions on farms to test the robustness of a defined technology. For instance, Khan et al. (2008), in their on-farm evaluation of the “push-pull” technology for the control of stemborers and striga weed on maize in western Kenya, explained that “farmers are guided by the Ministry of Agriculture and ICIPE [International Centre of Insect Physiology and Ecology] field staff [in order] to ensure that the ‘push–pull’ plots are properly laid out and companion plots properly established and managed since the effectiveness of the technology is dependent on these two.”

Digital tools were seldom mentioned unless the tested technology corresponded to a set of practices resulting from a model (Chen et al. 2011) or when on-farm data contributed to the calibration of models based on specific sensors (e.g., hyperspectral canopy sensors, Gnyp et al. 2014).

3.2.3 Type 3: Comparing new strategies and combinations of techniques with farmers’ practices

In this type of OFE practice, the trials implemented on farms were intended to promote the adoption and adaptation of proven techniques, namely, by testing them in diverse contexts under farmers’ own constraints. On-farm trials ranged from researcher-led/researcher-managed to farmer-led/farmer-managed interventions. The object of experiments on stations and farms was more often a set of combined techniques forming a strategy (e.g., site-specific nutrient management for rice fertilization, Dobermann et al. 2002) rather than isolated technologies. These OFE practices were sometimes related explicitly to an “on-farm evaluation” (Cui et al. 2008) or a “farm-scale evaluation” (Perry et al. 2003). Thus, in contrast with the preceding type of OFE, they usually included particular attention to the profitability of tested strategies or techniques in farmers’ contexts and under specific constraints. Unlike type 1, the situations in which experiments took place were not characterized further than with biophysical aspects related to the strategies tested and with the most dominant farming practices. Most often, several new strategies or specific sets of practices were applied and compared (both at stations and on farms) using the equivalent of the predominant or representative farmers’ practices as the control treatment. This was typically the case for several studies comparing N fertilization strategies proposed by researchers (namely, “real-time N management” and “fixed-time adjustable-dose N management”, both based on researchers’ monitoring with chlorophyll meters) with those of farmers in specific regions and for particular crops (Peng et al. 2006). These objectives of comparison and demonstration of clearly identified treatments were consistent with the experimental settings, which mainly corresponded to a randomized complete block design with replicates. Mentions of farmers appeared in most studies. 
They were reported as users of “traditional practices” as well as potential adopters of the new strategies.

The digital tools mentioned in relation to these experimental processes were mainly models for calculation of fertilization strategies and sensors either for monitoring the implementation of strategies (e.g., chlorophyll meters) or for measuring processes of interest for evaluating strategies (e.g., oxygen/carbon dioxide analyzer in an experiment on maize storage bags, Ng’ang’a et al. 2016).

3.2.4 Type 4: Demonstrating or testing new technologies on farm fields to convince future adopters

As with the previous type, the objective of demonstrating the value of a technique to promote farmer adoption steered the experimental practices. However, the main difference here was that the promotion goal was so emphasized that farmers’ practices or constraints were little investigated. Instead, researchers insisted on the idea of testing technologies on demonstration farms, in “real-world situations,” or on “real farms” to ensure a robust assessment but without scrutinizing particular interactions with existing practices or building persuasiveness from acute comparisons with farmers’ usual practices. Measurements were thus usually restricted to the technology under assessment, such as NH3 emissions for “air scrubbing techniques for ammonia and odor reduction in livestock operations” (Melse and Ogink 2005), and the saliency of their results was mostly argued in relation to the fact that they were obtained on real farms. Researchers and advisors usually designed and managed most experimental settings; as Frank et al. (2018) commented on in the case of a demonstration farm approach for pastoral livestock production systems, “farmers who own the fields often only participate passively.” Much like the preceding type of OFE practice, farmers were mostly considered as adopters of demonstrated technologies.

3.2.5 Type 5: Considering farm fields as the locus of experiments without mentioning farmers

The experimentation took place partly on farmers’ fields, but neither farmers’ practices nor “real world situations” motivated the choice of this type of OFE. Rather, the aim was to access conditions that were difficult to reproduce or obtain with reliability and relevant diversity at research stations. This was the case for disease conditions explored by Larkin et al. (2007), who chose potato farms with a history of soil-borne disease problems to experiment with brassica species as green manure. The space for experimentation was thus, above all, a relevant biophysical space for the phenomenon studied, where experimentation was feasible. Randomized complete block designs were applied in most cases. Measures and observations, as in the previous type, considered only the agronomic process under study, while farmers’ constraints and difficulties in implementing the experimental treatments or technologies were never mentioned.

3.2.6 Type 6: Developing on-farm research based on multi-year trials and surveys

In these cases, the experiments described were more explicitly embedded within long-term interventions combining different means of knowledge production. These combined, for instance, model development and implementation as learning and decision-support tools with research and development on experimental sites, interactions with farmers’ collectives and advisors, and surveys. Categories of situations were often constructed to capture diversity and structure corresponding interventions. These categorizations relied mostly on surveys and analyses of main farm characteristics (e.g., size, main practices, and mean yields, Cooper et al. 1987) or of biophysical contexts (e.g., rainfall and erosivity, Herweg and Ludi 1999). The selection of experimental treatments and the design of on-farm trials were typically negotiated with local farmer groups and their private consultants or with local public extension officers, as in the case of the Farmscape program in Australia (Carberry et al. 2002). Contrary to types 3 and 4, farmers were thus much more engaged in the experimental processes, contributing either to the choice of treatments to be applied or to the implementation and observations at various sites. The trials usually lasted several years, and farmer groups engaged in monitoring of, for example, soil parameters under different treatments (e.g., different grazing systems, Drewry et al. 2006; soil and water conservation techniques, Herweg et al. 1999). This monitoring often made it possible to consider emergent issues or characteristics of particular situations during the multi-year experimental process. For instance, during the Farmscape program, the identification of “deep N bulge” through on-farm trial monitoring led to the reconfiguration of the purposes of these trials and of the use of simulations. Researchers consistently paid attention to the farmers’ and other actors’ learning throughout the experimental process (Carberry et al. 2002). To this end, they relied on informal interviews and on a combination of quantitative measures with “qualitative observations and statements of farmers from within and around the research sites” (Herweg et al. 1999).

Mentions of digital tools thus mainly concerned simulation models, both those dedicated to the agronomic processes of interest and those used to foster co-learning and shared interpretations of on-farm trials as a complement to the qualitative observations.

3.2.7 Type 7: Adapting participatory and farmer-managed trials to individual farms

This type differs from the previous one regarding the extent to which farmers were co-designers of the experimental choices and settings and the attention paid to the particular adaptations required in various situations for the investigated practices to be satisfactory. The balance between treatments and replicates was secondary compared to the emphasis on adaptive and collaborative design of relevant treatments, as in the case of Rockström et al. (2009), where “each combination of tillage, timing, weeding, fertilization, and crop choice was agreed in farmer groups, as was the set of comparative treatments.” Blocks were still randomized but with farms considered as replicates. As adaptation to a particular situation was part of the object under investigation, experimental situations were characterized extensively. This included the analysis of biophysical, economic, and social conditions, as well as existing farmers’ practices (e.g., Ouédraogo 2001; Rockström et al. 2009). As in type 6 and unlike the other types, adjustments of the experiments were recognizable and often made explicit. These included both adjustments to particular farming situations and adaptations of the applied protocols to take first observations or outcomes on board. New knowledge per se was not always underlined as an explicit goal of the process, but evaluations of results, adaptations in protocols, and orientations for new trials were often described as being the output of joint workshops (i.e., with farmers, advisors, and researchers).

Digital tools were seldom mentioned and, when cited, mainly corresponded to crop models supporting the exploration and interpretation of practices (e.g., Stoop et al. 2002).

4 Discussion

4.1 A novel framework to analyze the diversity of scientific OFE practices

Studies on experimentation in the agricultural sector have often treated the experimental practices of researchers and those of farmers separately, mostly in a comparative way (e.g., Catalogna et al. 2018; Hansson 2019). The descriptive tools applied in these cases are mostly well established in experimental agronomy, such as the presence or absence of “controls”, the possibility to isolate the effects of different variables, and the degree of randomization and replication. The differences between these aspects of experimentation are then often linked to distinct intents: “epistemic experiments,” which are intended to produce knowledge and further the understanding of specific processes, are opposed to “direct action-guiding experiments,” which inform the effectiveness of certain actions (Hansson 2019). In contrast, we propose here to analyze and distinguish between experiments according to the very processes of their implementation, which include the activities they require or combine from the various actors involved. Our analysis shows that OFE practices, as reported in the academic literature, are widely diverse even when similar terms such as “on-farm trials” or “on-farm research” are used to define them. This was particularly apparent from the fact that we could identify very distinct “types of OFE practice” within each sub-corpus of publications associated with the clusters established during our first phase of analysis (Table 1, column “Examples of publications”). For instance, publications associated with the cluster Grain yields and fertilizer management included many examples of work corresponding to type 2 (Validate models in diverse contexts, e.g., Chen et al. 2011; Van Ittersum et al. 2013), as well as other works corresponding more closely to the contrasting type 7 (Participatory and farmer-managed trials, e.g., Ouédraogo et al. 2001; Stoop et al. 2002). 
The analytical framework we applied to publication samples across these clusters enabled the identification of major distinctive characteristics beyond these common terms. Similarly, Salembier et al. (2018) have shown that agronomists relied on farmers’ actual farming practice situations in very different ways over the course of time, as agronomy itself evolved as a scientific discipline. The types of OFE practice that we have described show that the same diversity of approaches appears behind the OFE umbrella. Furthermore, publications associated with the different clusters of terms were concomitant (median year of publication ranged between 2009 and 2014), which suggests that these differing approaches have continued to coexist in agricultural research.

Combining the analysis of the experimental space, the interventions (in terms of instruments, actions, and actors involved), observations, and measurements made it possible to identify coherence in the features of the experiments themselves. Such coherence is usually built between one aspect of the experimental process and its outcome, or based on the general aims of the experimenters. Lockeretz (1987) linked the relevance of OFE with certain objectives or requirements met by agronomists, such as to cover a range of particular soil types or other physical conditions that are not available at the experiment station, to analyze systems that involve interactions among several individual enterprises or that intrinsically are of a whole-farm nature, to evaluate production techniques that are particularly sensitive to management skills, or to analyze a production method or management system that is already practiced by some farmers but has not received attention from researchers. Sumberg et al. (2003) more particularly associated the legitimacy and utility of farmers’ participation in an experimental process with the type of technology targeted as its outcome: either commercial high tech, where farmers may have a limited role in problem identification, or “systems technologies” where farmers have important roles in problem identification and assessment of technologies early in the process in accordance with their needs and situations. Our redefined description of on-farm experimentation reverses the viewpoint, as it relates the possible outcomes of the experimentation process to the type of practice it relies on.

The intention supporting the definition of types is to better understand the diversity of logic supporting the experimental process rather than to provide criteria to judge what may (or may not) correspond to good OFE practices. The different types of OFE practice we have described show how the same kinds of devices, tools, and methods are, in fact, applied in very different ways with different objectives. However, as Lacoste et al. (2021) commented, “theoreticians and practitioners need to align their work conceptually, methodologically, and empirically to provide a solid and unified foundation for future efforts.” We argue that clarifying the various practices and approaches within the OFE community is an important step in that direction.

An example of such clarification that could be grounded using the types of OFE that we propose concerns farming situations and their characterization as “contexts of on-farm experimentations.” We have described various ways in which these farming contexts are assessed, either with a focus on restricted biophysical conditions of interest (types 2 and 3) or through extended diagnoses of socioeconomic aspects and practices (types 1, 6, and 7). The latter type of description supports the understanding of various approaches to the variability of situations. Within experimental stations, this variability is most often handled as a bias reduced through replicates, whereas the spreading of experimental interventions over multiple farms transforms the uses of induced variability in multiple ways. In type 1 (explore and explain phenomena), such variability is an asset for better understanding the functioning of a phenomenon occurring in each situation. Thus, Meynard et al. (1981) argued, “the study of farming situations is a central part of the scientific field of agronomy; it extends and enriches the development of theoretical models used in this discipline.” In types 3 and 4, the variability within the same farming context is often what a strategy or technology is supposed to adapt to (e.g., a fertilization strategy to adapt to particular soil fertility states and their dynamics), based on the given formalism and model of the involved biophysical processes. This resonates with the “option by context” approach proposed by Sinclair and Coe (2019). In type 2, the variability is maximized as a support for testing the robustness of a technology or practice without the need for characterizing situations individually or linking specific results of trials with specific situations (unless it is to identify unexpected data points). Although adaptation of technologies to local conditions is a principle directly associated with agroecological approaches (Bell et al. 2008; Tittonell et al. 2020), the concrete and illustrated ways to handle it are too seldom discussed in the literature (Nelson et al. 2019; Sinclair and Coe 2019; Salembier et al. 2021). However, the various types of OFE practice require such discussion.

Finally, the types of OFE practice can help researchers analyze and support farmers’ own experiments. As Kummer et al. (2017) commented, farmers’ experiments have received little attention from agronomists, and mostly in countries of the Global South. Researchers may describe and formalize farmers’ experimental practices so as to stimulate experimentation in other farmers’ activities (Catalogna et al. 2018). Types 6 and 7 also show how more attention can be paid to these experiments in combination with a researcher-initiated experimental process, either by including socioeconomic dimensions and practices in initial diagnoses preceding OFE (Cooper et al. 1987) or by including independent farmers’ evaluations in the assessments of experiments (Rockström et al. 2009) (Table 1).

4.2 Diverse forms of digitalization suggested by the various types of OFE practice

Digital technologies sometimes associated with the term “Agriculture 4.0” include many different kinds of devices such as drones, the internet of things (IoT), robotics, and sensors connected to precision farming technology, artificial intelligence, machine learning, and blockchains (Klerkx and Rose 2020). The application of such tools in the seven types of OFE practice was most often implicit, except in types 4 and 5 where sensors and precision farming technologies were tested and developed further. The development of simulations and models was closely related to various OFE practices but with distinct objectives: either to improve robustness through application in a wide range of environments (e.g., type 2 or 4) or to support learning (e.g., type 6). The relatively weak resonance of digital transformations in agriculture in our corpus is probably to be attributed to the time period and selection criteria for in-depth analysis, which excluded the most recent publications (only three publications from 2015 or later). There is no doubt, however, that the digitalization of agriculture is closely connected to developments in on-farm experimentation (Piepho et al. 2011; Laurent et al. 2020; Lacoste et al. 2021). Digitalization still refers mostly to big data technologies and precision agriculture (Rotz et al. 2019; Ingram et al. 2022). In fact, the development of tools derived from information and communication technologies (e.g., virtual spaces for information exchanges and media for recording observations) has long been associated with experiments on farms (Wolfert et al. 2011). 
Smart farming has renewed the potential for a range of tools currently in use, such as smart sensing and monitoring (i.e., acquiring more numerous and accurate data points on farms for better decision making), smart analysis and planning (i.e., management and decision tools that ground calculations on more interconnected and enriched information on the farm’s biophysical and economic data), and smart control (i.e., precision farming) (Wolfert et al. 2017).

In contrast, the diverse types of OFE practice we identified invite us to focus awareness on two major issues. First, while the main developments of digital tools for OFE are based on the assumption that all variables of interest to be monitored by digital tools should be known, along with the best data to inform them, this does not fit OFE practices where some of the variables to be explored emerge during the experimental process (namely in types 6 and 7). During the multi-year experiments that support a step-by-step redesign of cropping systems, for instance, the most useful observations for interpreting the effects of actions often emerge from the first outcomes of new practices and are re-assessed after connecting several observations (e.g., the vegetation architecture of peas in the flowering stage is only interpreted after having progressively established several relationships with sowing density, fertility, and physical states of soils with different preceding crops or yields finally reached) (Toffolini et al. 2015). This may occur in relation to the exploration of a phenomenon in contrasting situations (type 1, Soto-Pinto et al. 2000) or in relation to the adaptation of protocols and their adjustments to situations with farmers (types 6 and 7, e.g., Carberry et al. 2002). More generally, it points to the risk associated with the paradoxically reduced exploration of reality through experimentation: what is not included in digitally targeted data falls out of the scope of emerging sources of knowledge.

Second, very few studies and reviews on the development of digital tools in the agricultural sector highlight the possibility for new digital tools to support social interactions and learning among the diverse actors involved in on-farm experimentation processes (Leveau et al. 2019). For instance, digital tools could be tailored to store and provide access to the serendipity of collective activities (e.g., analyzing and visualizing social and interpretative interactions during a workshop), visualize qualitative data and situated interpretations (e.g., concerning the sharing of individual experiments on farms), and connect existing information resources based on a query by farmers rather than fine-tuning the individualized advice. Some examples have appeared recently, such as digital platforms for sharing maps and descriptions of on-farm innovations regarding equipment or buildings (Chance and Meyer 2017), or for sharing techniques and experiences related to the valorization of natural vegetation in production (Girard and Magda 2020).

These digital tools need to integrate diverse dimensions to support meaningful comparisons and analogies across farming situations if they are to derive generic knowledge from individual and anecdotal situations. This calls for specific research on their design (Quinio et al. 2022). For instance, the tools should offer support for more heterogeneous databases (including qualitative observations). They could also offer media that enhance the exchanges and collective interpretations of situations experienced, with a view to supporting innovation in other contexts (Elzen et al. 2017). More specifically, digital tools that support farmers’ interactions and exchange of observations made through OFE are part of these potential developments and could draw on recent works on farmers’ use of online communities and social media (Prost et al. 2017).

4.3 Alignments between OFE practices and agricultural innovation approaches

We observed some alignment between the types of OFE practice identified through the present analysis and various approaches to agricultural innovation that emerged over time. On the one hand, types 3 and 4 (Comparing new strategies and combinations of techniques with farmers’ practices, and Demonstrating or testing new technologies in farm fields to convince future adopters) could be related to a diffusion model or technology transfer approach (Hall 2007; Klerkx et al. 2012). On the other hand, the emphasis on collecting farm data including socio-technical information in types 1 and 6 (Exploring and explaining a phenomenon through a diagnosis of diverse farmers’ practices, and Developing on-farm research based on multi-year trials and surveys) illustrates the Farming Systems Research stream, which purposively placed farms and farmers’ groups within their direct biophysical and socioeconomic contexts in order to develop social learning. Such approaches draw on the idea that to achieve agroecological transitions, the technologies designed must fit specific farming situations. Questions arise regarding means for sharing situated knowledge and experiences and associated data privacy and intellectual property issues. These questions are all the more acute as open innovation approaches develop in the agricultural sector and within the OFE research community (Berthet et al. 2018; Salembier et al. 2020; Lacoste et al. 2021).

Finally, OFE practices that correspond to farmer participatory research with agronomic and socioeconomic diagnoses (type 7) may be related to an agricultural innovation system approach that is more oriented toward the development of capacities for innovation (ibid.).

The analysis performed here is, however, not sufficient to fully relate the seven types of OFE practice to innovation system theories and approaches and would require a wider analysis of OFE practices in their institutional contexts. First, OFE practices could be more widely situated with larger corpuses of literature, including more research using terms related to participation rather than being limited to the on-farm locus of experimental interventions. This enlargement could also target less academic literature, in languages other than English, and include more development practices. Second, a more institution-focused analysis of mobilization and application of OFE concepts would require the collection of more and different information than that provided by the reviewed articles, for instance, information on institutional arrangements and actors’ interventions in the experiments and around their implementation and use, with a view to better understanding the various contributions of experiments to innovation, as proposed by Salembier et al. (2021). This is a sound research perspective for further mapping the realities of OFE practices in various innovation settings. Deciphering OFE practices, focusing on their pragmatic realities and on literature expressly referring to the on-farm location for experimental processes, is a first step. It calls for broadening the inquiry and discussion on how OFE is institutionalized and for refining or renewing the research practices and innovation policies that contribute to shaping innovation processes.

5 Conclusion

Our aim was first to characterize the wide variety of practices gathered under the banner of on-farm experimentation. The literature review process and analytical framework presented here provide a synthetic understanding of a wide range of practices and how these are organized. The two-step methodology, joining a scientometric approach with a qualitative analysis of the literature, provided a comprehensive and original deciphering of seven types of on-farm experimentation practices based on the treatments applied, ways to consider farmers’ existing practices and socioeconomic contexts, distribution of responsibilities among the actors involved, and resulting learning, whether targeted or not. It appeared that digital technologies other than those related to precision agriculture and simulation models were not often discussed or envisioned, whereas these could support participatory and long-term on-farm experimentation practices (e.g., knowledge exchange support tools, repertories of experiences, and designs applied on farms and their situated evaluations). Further refinements in the description of OFE practices are needed to inform a collective reflection within emerging research communities on the appropriate positioning and types of digital technologies to support them and, especially, to engage in collective inquiry into these issues with broader communities of stakeholders and citizens. Specifying what we designate, and how, behind the keywords associated with OFE should help to keep the wide variety of approaches in the debate, maintaining all possibilities open and legitimate instead of closing innovation paths around the most advanced digital technologies. The contributions to agroecological transitions of the different OFE practices identified also need to be discussed.