Toward an effective approach for on-farm experimentation: lessons learned from a case study of fertilizer application optimization in Japan

On-farm experimentation (OFE) is increasing worldwide. Appropriate OFE procedures may differ depending on the characteristics and circumstances surrounding farms, such as climate, field conditions, farm size, degree of agricultural digitalization, and a farmer’s socioeconomic background. This study aims to guide the future development of OFE in Japanese grain farming by examining the experimental setup, data analysis, and farmers’ activities within their socioeconomic and institutional communication and learning networks. The results of this typical OFE case study, which estimates a field’s economically-optimal fertilizer variable-rate application map for winter wheat production, are reported. The outcomes of the case study, which are intended to guide the direction of OFE development in Japan, were used as reference materials for a survey taken while interviewing farmers who had never been involved in OFE. Farmers’ answers showed that the economic return of site-specific management depends on farm and field size and exhibits economies of scale. A very high share of the profit increases provided by OFE data came from improvements in field-specific uniform rate management, not from within-field site-specific management. The interviews revealed that farmers open to OFE are more interested in increasing rice crop quality to earn price premiums than in increasing yield. Increased engagement with farmers in conducting OFEs could play a key role not only in generating data to guide farmers’ input management but also in fostering farmer collaboration to develop marketing strategies. This study is the first to propose future orientations of OFE research that target typical moderately-sized Japanese grain farms.


Introduction
On-farm experimentation (OFE) is an innovative process in which farmers and professional researchers collaborate to improve farm management by generating data from agronomic experiments on farmers' own fields (Lacoste et al., 2022). Typically, farmers implement field-scale agronomic trials designed by researchers, who also analyze the generated data to develop improved input management strategies. In an iterative process over time, taking a transdisciplinary approach to data analysis can generate useful management insights. Although OFE originates from agricultural sciences, two other disciplinary domains, social sciences and data sciences, are essential to understand and answer farmers' questions. New business models developed by researchers in the social sciences contribute to farmers' engagement in OFE, knowledge transfer, and value creation. Research from data sciences provides more robust and reliable analytical outcomes for farmers or data management systems, which benefit farmers' digital footprint and business. The overlap among these disciplinary domains supports continuous communication among a variety of OFE community members, such as farmers, scientists, and other stakeholders. Thus, OFE projects evolve over time as transdisciplinary work that continuously addresses multiple objectives.
On-farm precision experimentation (OFPE) is a type of OFE that can be of special interest to large-scale farms aiming to improve site-specific crop input management (Bullock et al., 2019). OFPE can generate site-specific knowledge about crop yield response within a field, which is key to optimizing prescription maps for variable-rate application (VRA). OFPE uses as-applied input data and yield monitoring data to estimate spatially-dependent relationships between crop yield and input application strategies. Geographically weighted regression (GWR) is a popular statistical tool used to analyze OFPE data. GWR estimates local regression parameters using a distance-decay kernel (Evans et al., 2020; Trevisan et al., 2021). In another approach, machine learning techniques, such as random forest (RF) (Krause et al., 2020) and convolutional neural networks (Barbosa et al., 2020), have proven suitable when data on spatially-dependent field characteristics (e.g., soil properties, elevation, and satellite imagery) are available.
Development-oriented agronomists have been collaborating with smallholders in Africa to conduct a type of OFE that is predominantly focused on the impacts of new technologies or nutrient management practices; they have been employing strategies to manage mineral fertilizer, composts, manure, crop residues, pests, weeds, and tillage (Kool et al., 2020). Similar research has been conducted in northern China (Jiang et al., 2021; Zhang et al., 2016). Smallholder OFE is important because the differences in environments (i.e., soil fertility, resource availability) between research stations and farmers' fields frequently limit the inferences that can be drawn from research station experiments about real-farm management. Moreover, resource constraints and OFE logistics limit the number of experiments that studies can include. More than half of the studies reviewed by Kool et al. (2020) were conducted in fewer than ten locations (fields), while only 3% were conducted in one hundred to one thousand locations. Finally, data analytical guidelines for smallholder farmers have not been well documented in textbooks, which typically put more emphasis on the social dimension of small-farm research (Kool et al., 2020).
Although significant benefits are anticipated from the outcomes of OFE irrespective of the farming scale, from smallholders to large-scale farmers, little research has been reported regarding OFE for small- to moderate-scale farms (e.g., 0.3-1 ha fields) (Tanaka et al., 2021), which are common in Asian countries in which paddy fields are a primary land use; these countries include China, Korea, and Japan. These farms face special challenges due to unique social conditions, some of which are especially prevalent in Japan. For example, the high rates of aging and retirement among Japanese farmers have led to increased abandonment of arable land and foreshadow similar events in other Asian countries in which comparable demographic changes are imminent. Japanese family-owned and family-operated smallholder farms are currently rationalizing production processes via consolidation into moderate-scale farms managed by producers' cooperatives and farming companies (Ministry of Agriculture, 2016). While farms have to a degree increased their adoption of "smart farming" technologies (de Bourgogne, 2021), local farmers and agricultural extension workers continue to question whether the benefits of precision agriculture technology can offset the high investment costs. Analysis of OFE-generated data could answer this question, but few OFEs are being conducted in Japan and nearby countries. Given the uniqueness of the East Asian agricultural background and development, appropriate OFE directions, including data collection, experimental design, data analytics, extension, and knowledge transfer, should be explored.
The objective of this study was to examine, via interviews, the benefits and difficulties that Japanese grain farmers associate with establishing, monitoring, and interpreting OFEs. First, OFPE was conducted in collaboration with Japanese farmers. RF regression was used to model the crop yield response to fertilizer application rates, and sensitivity analysis was performed to evaluate a field's site-specific economically-optimal fertilizer strategy for wheat production. Second, using these outcomes, interviews were conducted with farmers who had and farmers who had not been involved in OFE to examine the benefits and difficulties associated with OFE.

Recruitment and interview process
To better understand Japanese grain farmers' views on the benefits of and difficulties in running OFEs, both farmers who had and farmers who had not been involved in OFEs were interviewed using the case studies of OFEs as reference material (Fig. 1). Specifically, each OFE procedure (e.g., experimental design) and results (e.g., figures and tables in this study and other sources) were presented to the farmers to enhance their understanding. Strip trials examining the yield effects of basal fertilization rates reported by Tanaka et al. (2021) were used as one of the reference materials. In strip trials, long strips are laid out side-by-side in a field, and each strip receives a different rate of fertilizer application. The strip treatments can be implemented without VRA technology if farmers can manually adjust the rates with their own field equipment. The outcome of statistical analysis on strip trials provides information as to whether farmers should increase or decrease their conventional rate to enhance economic return. Another case study used as reference material involved OFPE, which uses precision agriculture technology to generate large amounts of crop input application and yield response data that can be used to estimate spatially-variant optimal input application rates and thus improve agronomic decision making. OFPE was conducted in 2019-2020 in cooperation with the Japanese farming company Fukue-eino, which owns the required VRA and yield monitoring equipment. The farmers who participated in the OFPE were interviewed and asked whether the experiment's results would change their fertilization decisions in the upcoming year. The experimental design and data analysis of the OFPE case study are described in more detail in the next section.
With the help of the Gifu Prefectural government, nine organizations that had not previously run OFEs were also identified and interviewed; they included cooperatives and farming companies. Agricultural producers' cooperatives are typical management entities in Japan. Their decisions are based on the principle of one person, one vote, and meaningful changes in investments or policies require strong consensus. The results of the interviews reinforced the importance of understanding how Japanese grain farmers act within the social context of agricultural producers' cooperatives. Each interview started with diverse discussions about crop management practices and was ultimately steered toward farmer recruitment for further OFE research. These methods facilitated a conversation with farmers by offering evidence from the personal experience of a real and representative Japanese farm, rather than a less personalized discussion in terms of objectified scientific results. All interviews were conducted in person, and answers were recorded via notetaking. The primary question asked in the interview process was whether the interviewee was willing to conduct OFEs. Interviewees answering no were asked to describe the obstacles that kept them from running OFEs. Interviewees answering yes were then asked what their motivations were to run OFEs and what types of OFEs they were willing to perform. Three types of OFEs, including strip trials, OFPEs, and field-specific trials, were presented as options within the question. Strip trials, which provide information on the optimal uniform rate for each field, do not always need VRA technology. OFPEs need a more complicated experimental design (e.g., checkerboard) that should be implemented with VRA technology, as the aim of the OFPEs is to optimize site-specific crop management. Field-specific trials have different application rate treatments for each field to assess the optimal uniform rate for each farmer or region. Field-specific trials offer a hypothetical experimental design and data analytical approach that might be suitable for Japanese farmers, who are already often managing dozens or hundreds of small-scale fields. Furthermore, all interviewees were asked if they had access to a yield monitor. Throughout the whole interview process, unprompted, specific comments made by farmers that related to how OFE frameworks could be elaborated were summarized as key remarks.

OFPE experimental design and data collection
To evaluate the effect of different fertilizer application rates on wheat yield, a split-plot agronomic field trial design was implemented at two locations (Experiments 1 and 2) in Gifu, Japan (35°11'N, 136°39'E), in 2019-2020. Figure 2 illustrates Experiment 1's design. Just before seeding (early November), a slow-release basal fertilizer (NPK 25-6-4) was broadcast at rates of 270, 360, 450, and 540 kg ha⁻¹. Then, before the booting stage (early March), a top-dressing fertilizer (NPK 17-0-17) was applied at rates of 222, 296, 370, and 444 kg ha⁻¹. The rates were decided in discussion with the farmer, whose conventional rates were 450 and 370 kg ha⁻¹ for basal and top-dressing application, respectively. A variable-rate fertilizer broadcaster with an 18-m working width (Axis 40.2, Kuhn, France) was used. Yield data were collected using a combine harvester with a yield monitor sensor (WRH1200, Kubota, Japan). Although the combine had a 2.6-m header width, yield values within the 5-m square grids were provided after data preprocessing based on the calculation procedures of the manufacturer. Thus, yield values aggregated in square grids were used, and the grids on the boundaries between treatment plots and headlands were excluded from further data analysis. After removing the data from a buffer zone around the field's perimeter, the total data observations were 970 and 428 for Experiments 1 and 2, respectively.
Soil properties were used as covariates to enhance model performance in the yield response assessment. Before basal fertilizer application in Experiments 1 and 2, a total of 52 and 39 surficial soil samples (0-150 mm), respectively, were collected at intervals of approximately 30 m. At each location, three randomly-located partial soil samples weighing approximately 0.5 kg each were collected within one square meter and mixed to produce one composite sample. Soil samples were air-dried and sieved through a 2.0-mm mesh before chemical analysis. Soil pH, electrical conductivity (EC), total carbon (TC) content, mineralizable N, available phosphorus (P), cation exchange capacity (CEC), exchangeable calcium (Ca), exchangeable magnesium (Mg) and exchangeable potassium (K) were measured. TC was determined using a CN analyzer (Sumigraph NC-TR22, Sumitomo Chemical Co., Tokyo, Japan). Mineralizable N was determined according to the method of Inoko (1986). Soils were anaerobically incubated at 30 °C for four weeks, and inorganic N was extracted with a 2 M KCl solution. The concentrations of NH₄⁺ and NO₃⁻ in the extracts were then determined using the indophenol method (Keeney & Nelson, 2015) and the Cataldo method (Cataldo et al., 1975), respectively. Mineralizable N was calculated as the difference in inorganic N (NH₄⁺ and NO₃⁻) before and after anaerobic incubation. Available P was measured by the Truog method (Truog, 1930). Cation exchange capacity was measured by saturating the soil with a neutral 1 mol L⁻¹ ammonium acetate solution, washing with 80% ethanol to remove soluble NH₄⁺, and extracting exchangeable NH₄⁺ with 2 mol L⁻¹ KCl. The concentrations of Ca, Mg, and K were determined by inductively-coupled plasma atomic emission spectroscopy (ICP-AES, ULTIMA 2, HORIBA, Japan).

Fig. 2 Experimental design for Experiment 1. Different colors represent the different fertilizer application rates. Gray represents the data points touching the treatment borders or headlands that were not used for data analysis.

Data analysis
To model the site-specific yield response to fertilizer, RF regression models were created using the Python module 'scikit-learn' (version 1.0.2) (Pedregosa et al., 2011). The models treated the application rates of basal and top-dressing fertilizers and soil properties as covariates. The units of the data analysis were created by averaging raw spatial data values in 5-m square grids. Soil properties were originally point data; thus, they were spatially interpolated using empirical best linear unbiased prediction (E-BLUP) at the scale of a 5-m square grid. E-BLUP is equivalent to universal kriging. If the distribution of the observations was highly skewed, the Box-Cox transformation (Box & Cox, 1964) was applied for semivariogram parameter estimation, and the predicted values from the E-BLUP were back-transformed. For semivariogram parameter estimation, the Matérn covariance function (Webster & Oliver, 2007) and the restricted maximum likelihood estimator were used. For spatial interpolation, the 'geoR' package (Ribeiro & Diggle, 2001) implemented in R version 3.6.2 (R Development Core Team, 2019) was used.
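The grid-averaging step described above can be illustrated with a minimal sketch. It assumes that point coordinates are already projected to metres and that cells are aligned to the coordinate origin; the function name `average_on_grid` is hypothetical and does not reproduce the manufacturer's preprocessing or the authors' code.

```python
import math
from collections import defaultdict


def average_on_grid(points, cell_size=5.0):
    """Average point values that fall inside the same square grid cell.

    points: iterable of (x, y, value), with x and y in metres
            (projected coordinates are an assumption of this sketch).
    Returns a dict mapping (column, row) cell indices to the mean value.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for x, y, value in points:
        # Integer index of the cell that contains the point.
        cell = (math.floor(x / cell_size), math.floor(y / cell_size))
        sums[cell] += value
        counts[cell] += 1
    return {cell: sums[cell] / counts[cell] for cell in sums}
```

Each resulting cell would then correspond to one row of the modelling dataset, joined with the fertilizer rates and interpolated soil properties of the same cell.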
Hyperparameters were tuned by fivefold cross-validation repeated three times on the training dataset, and the best RF models were then retrained on the training dataset with the optimal hyperparameter. For each fold, a grid search was performed to optimize the n_estimators hyperparameter, which took on values of 100, 500, 1000, 1500, and 2000. For the test dataset, model prediction accuracies were evaluated by root mean square error (RMSE).
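A tuning procedure of this kind could be set up in scikit-learn roughly as follows. This is a sketch under stated assumptions, not the authors' script: the 80/20 train/test split, the random seed, and the function name `tune_and_evaluate_rf` are hypothetical, since the text does not report how the test dataset was formed.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import GridSearchCV, RepeatedKFold, train_test_split


def tune_and_evaluate_rf(X, y, n_estimators_grid=(100, 500, 1000, 1500, 2000),
                         test_size=0.2, seed=0):
    """Tune n_estimators by repeated k-fold CV, refit, and report RMSE."""
    # Hold out a test set (the split ratio is an assumption of this sketch).
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=test_size, random_state=seed)

    # Fivefold cross-validation repeated three times, as described in the text.
    cv = RepeatedKFold(n_splits=5, n_repeats=3, random_state=seed)
    search = GridSearchCV(
        RandomForestRegressor(random_state=seed),
        param_grid={"n_estimators": list(n_estimators_grid)},
        scoring="neg_root_mean_squared_error",
        cv=cv,
        refit=True,  # retrain on the whole training set with the best value
    )
    search.fit(X_train, y_train)

    model = search.best_estimator_
    rmse_train = float(np.sqrt(mean_squared_error(y_train, model.predict(X_train))))
    rmse_test = float(np.sqrt(mean_squared_error(y_test, model.predict(X_test))))
    return model, search.best_params_["n_estimators"], rmse_train, rmse_test
```

Here `X` would hold the basal rate, top-dressing rate, and soil covariates per 5-m grid cell, and `y` the corresponding yield. Comparing `rmse_train` with `rmse_test` gives a quick check for overfitting.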
Site-specific economically-optimal rates (EORs) were calculated by treating the predicted yield values from the best RF model as deterministic outcomes. The site-specific expected net revenue ($ ha⁻¹) was defined as

\[ r_i = p\,y_i - w_{BF}\,BF_i - w_{TF}\,TF_i \tag{1} \]

where r_i is the expected net revenue at location i, p = $1.16 kg⁻¹ (136.8 JPY kg⁻¹) is the price of wheat grain, y_i is the wheat grain yield predicted by the RF model at location i, w_BF = $1.58 kg⁻¹ (187.0 JPY kg⁻¹) is the basal fertilizer price, BF_i is the basal fertilizer application rate at location i, w_TF = $0.60 kg⁻¹ (71.2 JPY kg⁻¹) is the top-dressing fertilizer price, and TF_i is the top-dressing fertilizer application rate at location i. To evaluate the economically-optimal application rates, fertilizer application rates were optimized by running the best RF model at intervals of 1 kg ha⁻¹ while keeping the other covariate values of soil properties unchanged. To evaluate the net revenue for each experimental site, four scenarios were assumed, including the farmers' conventional rates (basal fertilizer rate: 450 kg ha⁻¹, top-dressing fertilizer rate: 370 kg ha⁻¹), the crop advisory recommendation rates (basal fertilizer rate: 420 kg ha⁻¹, top-dressing fertilizer rate: 330 kg ha⁻¹), the optimal VRA rates, and optimal rates under uniform management. To assess the feasibility of VRA from the perspective of economies of scale, the total area A that generates sufficient revenue to offset the investment cost in VRA was estimated as follows:

\[ A = \frac{w_{VRA}}{r_{VRA} - r_{UM}} \tag{2} \]

where w_VRA is the investment cost in VRA technology, r_VRA is the net revenue under the scenario of the optimal VRA rate, and r_UM is the net revenue under optimal uniform management. The calculations assumed a lifetime of six years for VRA technology and resulted in an annual VRA fixed cost of $5,932. These assumptions were based on local enterprise budgets.
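The rate search described above can be sketched as an exhaustive grid search at 1 kg ha⁻¹ steps. Several parts of this sketch are assumptions rather than the authors' implementation: the search is limited to the tested rate ranges, yield is taken in t ha⁻¹ and converted to kg ha⁻¹, and `toy_response` is a hypothetical concave response that stands in for the fitted RF model at a single location.

```python
# Prices reported in the text (USD per kg).
P_WHEAT = 1.16
W_BF = 1.58
W_TF = 0.60


def net_revenue(yield_t_ha, bf, tf):
    """Net revenue ($/ha): grain value minus basal and top-dressing cost.

    Assumes yield in t/ha (converted to kg/ha) and rates in kg/ha.
    """
    return P_WHEAT * yield_t_ha * 1000.0 - W_BF * bf - W_TF * tf


def optimal_rates(predict_yield, bf_range=(270, 540), tf_range=(222, 444), step=1):
    """Return (basal rate, top-dressing rate, net revenue) maximizing revenue.

    predict_yield(bf, tf) is any callable returning yield in t/ha; in the
    study's setting it would wrap the RF model with soil covariates held fixed.
    """
    best = None
    for bf in range(bf_range[0], bf_range[1] + 1, step):
        for tf in range(tf_range[0], tf_range[1] + 1, step):
            revenue = net_revenue(predict_yield(bf, tf), bf, tf)
            if best is None or revenue > best[2]:
                best = (bf, tf, revenue)
    return best


def toy_response(bf, tf):
    """Hypothetical concave yield response (t/ha); NOT the fitted RF model."""
    return 2.0 + 0.008 * bf - 0.00001 * bf**2 + 0.004 * tf - 0.000006 * tf**2


bf_opt, tf_opt, revenue_opt = optimal_rates(toy_response)
```

Running the same search once per grid cell would give a variable-rate map, while running it on the field-average response would give the optimal uniform rate.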

Analytical results: a case study of wheat
The relationships between the observed and predicted yields of the RF models for the training and test datasets are shown in Fig. 3. For Experiment 1, the RMSE values were 0.15 and 0.39 t ha⁻¹ for the training and test datasets, respectively. For Experiment 2, the corresponding RMSE values were 0.17 and 0.43 t ha⁻¹. At both sites, the test RMSE was more than twice the training RMSE, which indicated overfitting, even though the hyperparameter was tuned by cross-validation.
Sensitivity analyses were conducted to examine the robustness of the RF models' estimations of expected net revenues in each of the four scenarios. The results in Table 1 indicate that the farmer's conventional application rates generated the lowest net revenues; compared with the optimal uniform rate strategy, they led to losses of $266 ha⁻¹ in Experiment 1 and $789 ha⁻¹ in Experiment 2. Moving from the optimal uniform rate strategy to the optimal variable-rate strategy yielded relatively small additional gains in net revenue of $120 and $59 ha⁻¹. Thus, optimizing the uniform rate might significantly improve farmers' profits even where VRA equipment is unavailable.
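As a worked illustration, the break-even area can be computed by dividing the annual VRA fixed cost of $5,932 reported in the Data analysis section by the per-hectare gain of VRA over optimal uniform management. The function name `break_even_area` is hypothetical, and the sketch uses only figures reported in the text.

```python
def break_even_area(annual_vra_cost, gain_per_ha):
    """Area (ha) at which the VRA gain over uniform management covers its cost.

    annual_vra_cost: annualized VRA investment cost ($ per year).
    gain_per_ha: net revenue under optimal VRA minus net revenue under
                 optimal uniform management ($ per ha).
    """
    if gain_per_ha <= 0:
        raise ValueError("VRA must add net revenue for a break-even area to exist")
    return annual_vra_cost / gain_per_ha


ANNUAL_VRA_COST = 5932.0  # $ per year, as reported in the text

# The two gains of VRA over the optimal uniform rate reported above ($/ha).
for gain in (120.0, 59.0):
    area = break_even_area(ANNUAL_VRA_COST, gain)
    print(f"gain ${gain:.0f}/ha -> break-even area {area:.1f} ha")
```

The two gains give break-even areas of about 49 ha and 101 ha, far above the size of a typical Japanese farm.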
The spatial patterns of economically-optimal fertilizer application rates showed little spatial autocorrelation (Fig. 4). Excluding the case of basal fertilizer in Experiment 2, in most parts of the fields, site-specific economically-optimal fertilizer application rates were lower than the farmer's conventional rates and the rates recommended by the crop advisory service.

Comparison to results from previous research
Previous studies that used crop simulation models for corn demonstrated that the use of VRA increased net revenues by approximately $16 ha⁻¹ in Iowa, USA (Paz et al., 1999), by $13 ha⁻¹ in northwest Italy (Basso et al., 2016), and by $18.21-29.57 ha⁻¹ in Colorado, USA (Koch et al., 2004). Due to the differences in grain and fertilizer prices and the degree of spatial yield variations that can be adjusted by VRA, the expected improvement in net revenues due to VRA is not directly comparable to our results. However, this study presents a case in which high profits can be expected simply from optimizing uniform fertilizer application rates ($266-789 ha⁻¹), whereas the additional net revenue generated by moving from this point to optimizing VRA rates is positive but small ($59-120 ha⁻¹) (Table 1). This study also provided some evidence of a large gap in net revenues between optimal VRA or uniform rates and the rates recommended to farmers in commercial markets or producer cooperatives. Further research is needed to assess the framework for optimizing machine learning models and the selection of environmental variables to improve site-specific prediction accuracy and model explainability.

The farmer's response
The results indicated a large net revenue loss from increasing application rates for both basal and top-dressing fertilizer in Experiment 1 (Table 1). In the following season, the farm decided to apply fertilizer at its advisory service's recommended rates of 420 kg ha⁻¹ of basal fertilizer and 330 kg ha⁻¹ of top-dressing, thus choosing not to reduce application rates to the 316 kg ha⁻¹ of basal fertilizer and 192 kg ha⁻¹ of top-dressing indicated by Experiment 1's data. The farmers feared the potential risk of an unexpected decline in crop yield and questioned the reliability of the simulated outcome considering the year-to-year variations in crop yield. Furthermore, the net revenue response to basal fertilizer rates in Experiment 2 supported the farmer's assumption of a strong crop response to inputs, in which profits can be improved by increasing the application rate. This situation indicates that multiple-year trials or measurements of model prediction uncertainty that account for temporal variations are needed. Combining OFPE with crop simulation models or preplant soil tests might be a promising way to facilitate understanding of temporal variations in crop yield (Trevisan et al., 2021).

Practical implications
As noted above, net revenues could be raised substantially by moving from a crop advisory's recommended uniform application rate to an optimal uniform application rate estimated from OFPE data. Maine et al. (2010) indicated that more than 196 ha was needed to compensate for the investment cost in VRA technology for site-specific maize production to be profitable in South Africa. The results of this study showed that approximately 50-100 ha was needed for the benefits of VRA to surpass its costs (Table 1). Although the two studies' outcomes cannot be directly compared, as the price of grain and fertilizer and the degree of crop responses to fertilizer rates are not the same, economies of scale are one of the important factors determining the benefits of VRA. Given that the average Japanese farm size is 3.2 ha (Ministry of Agriculture, 2022) and farms of sizes greater than 50 ha comprised only 0.56% of all Japanese agricultural management entities in 2015 (Ministry of Agriculture, 2017), the site-specific management recommendations that can be derived from information generated from such small-scale OFPEs are unlikely to offset the costs of the technology. Therefore, data analytics that can recommend optimal field-specific uniform input management rather than site-specific crop yield response enabled by VRA technology might be more appropriate for Japanese fields. That said, generating enough value to pay for the precision agriculture technologies needed to implement the trials, namely, VRA technology, might also simply be infeasible, based on the information garnered from OFPEs on typical Japanese farm fields.

Fig. 4 Spatial distribution of economically-optimal application rates of basal (a) and top-dressing fertilizer (b) (kg ha⁻¹)

The interview survey revealed that the analytical results of the case study may successfully facilitate farmers' understanding of OFE research and can encourage their engagement in OFE (Table 2). Five of the interviewed farmers answered that they would be willing to conduct OFEs if the profitability of field-specific treatments could be assessed via data analysis (Table 2). Digital tools, namely, yield sensors that can quantify within-field crop variability, are powerful drivers in the implementation of OFEs in many Western countries (Lacoste et al., 2022). However, considering the small scale of Japanese grain farming, further research should be directed toward developing data analytical approaches for assessing the treatment effect on field-specific or household-based yield data and efficient digitalization platforms. For instance, a Bayesian approach and web application might offer solutions for data analytics and visualization (Laurent et al., 2019, 2021) because Bayesian models can account for the effects of years and sites on yield as random effects, while user-friendly web applications enable users to explore the trial effects and economic responses. Such an interactive interface is further expected to support decision-making and education for users with different interests (Laurent et al., 2021). This flexibility might be more welcome among the variety of small-scale or moderate-scale farms than among large-scale farms. From the perspective of model generality, deep learning techniques may also have great potential to model causal relationships affecting crop yield through transfer learning if a large dataset is derived from multiple farms, as proposed by Barbosa et al. (2020).
Conducting strip trials that do not require VRA technology is potentially a cost-effective method of generating data for the estimation of optimal uniform application rates. However, a typical Japanese farmer may be unwilling to manually adjust application rates from strip to strip across many of a farm's dozens or hundreds of small-scale fields, and farmers are frequently skeptical about how well the outcomes of strip trials on partial fields represent yield response across entire fields. Furthermore, collecting yield data can be difficult; harvesters equipped with yield sensors have become commercially available in Japan over the past few years but remain uncommon, and most of the yield sensors manufactured in Japan quantify only whole-field grain weight, while yield mapping is a rarely purchased optional feature.
One-third of the rice producers (farmers 2, 3, and 5) answered that while they did not own a yield sensor, they did have access to either field-level or household-based (i.e., aggregated over multiple fields) yield data during the drying and storage process. This general process is used by Japanese farmers because they sometimes need to prove their yield to claim payment from their agricultural producers' cooperatives after using the cooperative's grain drying system. Thus, field-level yield data derived from the yield monitor can be collected from two of the nine farmers, while field-level manually-quantified yield data can be collected from three of the nine farmers. This situation implies that the only feasible on-farm trials are those that assign experimental rates that are uniform within fields but variable among them. Since the individual fields are quite small, the resultant data might generally resemble data from OFPEs on very large fields, although organizing trials with many farmer participants might be challenging.
Surveyed farmers expressed interest in conducting OFEs to study the effects of input application strategies on rice quality rather than yield (Remark 1 in Table 3). Rice quality can be measured by the protein content and Mido value (value of shine on the surface of boiled rice grains) of brown rice (Hamaker, 1993; Sato et al., 2003). While yield response has been the subject of many previous OFE studies (Kool et al., 2020), quality can be of great concern to Japanese farmers; for example, farmer 3 indicated that increasing the quality of the rice crop could double the price received for it. Several interviewees reported conducting informal trials to assess the effect of crop management on crop quality (Remark 2 in Table 3). Although their trial designs featured no treatment replications and the data generated were not statistically analyzed, these activities indicate the potential of conducting successful OFEs through cooperation among farmers and researchers. These remarks did not arise as direct answers to our prepared questions but from extended conversations during the interview process.

Farmers' engagement
Farmers' engagement with researchers and other farmers in the OFE process is crucial. Repeating experiments and reviewing outcomes are key (Lacoste et al., 2022), but Japanese producers may lack the social infrastructure to support this process. Some interviewees expressed concern that the successful implementation of OFEs would necessitate logistical coordination that would increase the "hidden" cost of experimentation (Remark 3 in Table 3). The interviews also found that farmers communicated very little among themselves about crop management (Remark 4 in Table 3) but rather primarily received that information in a top-down format from either private crop advisory or governmental services. Several interviewees claimed that farmers' philosophies and circumstances differed greatly, so information sharing among farmers was not useful (Remark 5 in Table 3). Interviewees further emphasized the complexity of making decisions through agricultural producers' cooperatives (Remark 6 in Table 3). Multiple actors are responsible for implementing fertilization for a single cooperative. Thus, not only farmer-to-farmer co-learning but also within-management-body co-learning would be important to facilitate OFE implementation. These responses indicate the importance of building a farmers' network to drive the implementation of OFE for non-large-scale farmers who cannot benefit from the outcomes of one-farm OFEs (Schneider et al., 2009); at the very least, initially implementing multiple-farm OFEs would require considerable organizational investment by researchers. Farmer 5 requested a workshop to deepen the mutual understanding between neighboring farmers and scientists.
Another possible benefit of organizing Japanese farmers to conduct OFEs is that doing so might facilitate cooperative marketing and increase their product prices; establishing common interests might create a new basis for collaboration and social learning (Schneider et al., 2009). For example, the connections created among farmers during workshop discussions regarding OFE outcomes or interactive web applications could encourage new marketing collaborations and thus enhance farm income.

Conclusion
OFE research has thus far primarily focused on either large-scale farming with access to highly modernized precision agriculture technologies or on smallholder farming in the Global South. To the best of our knowledge, this study has presented the first discussion about future research directions in OFEs on the moderately-sized farms typically engaged in Japanese grain production. The results of the OFPE case study and interviews indicated that establishing experiments and data-analytical approaches that recommend optimal uniform input management strategies for Japan's small fields would be more practical than trying to estimate site-specific crop yield responses within those fields. Thus, VRA technology will most likely not be prioritized for Japanese grain farming. The findings highlighted the importance of studying the effects of management strategies on crop quality in Japan. This study also identified scarcity in social communications infrastructure as a key challenge. Both farmer-to-farmer co-learning and within-management-body co-learning are important for facilitating the implementation of OFE due to the difficulties in reaching agreement on crop management strategies, especially within agricultural producers' cooperatives. Future research will be conducted to establish an OFE research framework by setting up OFEs involving multiple middle-scale farmers. Treatment effects on crop yield and quality need to be analyzed using data generated from trials run cooperatively with many farmers on multiple fields. Cooperative marketing strategies might accompany such cooperative research.