1 Introduction

Organic agriculture is of growing importance in the agricultural sector. In 2015, 50.9 million hectares, i.e. 1.1% of global agricultural land, were cultivated under organic management compared to 11 million ha in 1999 (Willer and Lernoud 2016, 2017). During the same period, the global organic market size increased about fivefold to reach 81.6 billion US dollars (Willer and Lernoud 2016, 2017). Organic agriculture, which prohibits the use of almost all synthetic inputs, often relies on the intensification of ecological processes (FAO/WHO Codex Alimentarius Commission 1999). Organic management is thus expected to be associated with lower impacts on natural resources than conventional agriculture at the local and global scales (Tuomisto et al. 2012), as well as beneficial effects on human health (e.g. regarding the absence of pesticide residues, Reganold and Wachter 2016) (Fig. 1). However, it also raises concerns regarding its capacity to produce enough food and feed to meet the demand of a wealthier, growing world population (Cassman 2007; Connor 2008, 2013). Consequently, organic to conventional yield gaps have been scrutinised (Stanhill 1990; Badgley et al. 2007; de Ponti et al. 2012; Seufert et al. 2012; Ponisio et al. 2015). These studies found yields to be on average 8 to 25% lower in organic systems, with differences depending on the crop species, growing conditions and management practices (Bellon and Lamine 2009; de Ponti et al. 2012; Seufert et al. 2012). Rankings of crop groups nevertheless differ between studies. For example, fruits were ranked among the highest yielding crops in organic systems by Seufert et al. (2012) while they were ranked among the lowest by de Ponti et al. (2012). These opposite conclusions may be due to differences in dataset characteristics and in the statistical methods used by the authors (e.g. mixed-effect vs. fixed-effect models, frequentist vs. Bayesian statistical methods).
All these studies focused on the average yield difference, but did not analyse the variability of yields between sites and between years under organic and conventional management. Some authors have hypothesised that organic farming agroecosystems lead to more stable yields (Altieri 1999; Scialabba and Müller-Lindenlauf 2010; Gomiero et al. 2011; Altieri et al. 2015). A few experimental studies indeed reported lower year-to-year variability and lower vulnerability to extreme weather conditions in organic systems (Smolik et al. 1995; Lotter et al. 2003; Smith and Gross 2006). However, others reported larger spatio-temporal variability in organic than in conventional agriculture (Smith et al. 2007; Casagrande et al. 2009; Euvard 2010; Delmotte et al. 2011; Rolland et al. 2012). These conflicting results show that little is known about the variability of organic systems relative to conventional ones, even though yield variability is a major source of concern for the agri-food system. For example, Cernay et al. (2015) showed that species with high yield variability tend to be grown on restricted proportions of the cultivated areas. Horticultural producers are particularly risk averse for at least two reasons. First, production costs are high due to a high share of labour costs in low mechanised systems or to substantial fixed costs in soil-less systems (Jeannequin et al. 2011). Second, due to strict marketing standards, biotic and abiotic damages, which can be mitigated using appropriate crop management techniques, generate large wastes associated with high economic losses (as defined by Savary et al. 2006).

Fig. 1 Vegetable production, here in West Africa, is often associated with the use of high levels of pesticides. The development of organic food chains remains an important challenge in periurban areas and can also be beneficial for farmers’ health. Abidjan, Côte d’Ivoire. Photo Eric Malézieux

In this study, we address the above-mentioned knowledge gaps by performing a meta-analysis on a dataset including the results of 52 papers reporting yield data for 37 horticultural species in 17 countries. We define horticulture as production systems based on vegetable and/or fruit production, whether in fields, market gardens or orchards. We compare organic and conventional horticultural crops, analyse average yield differences and assess yield variability across experiments and across years.

2 Material and methods

2.1 Literature search

A systematic literature review was performed to collect published papers comparing yields in organic versus conventional horticultural crops. We first listed the references mentioned in review papers (Stanhill 1990; Offermann and Nieberg 2000; Pretty and Hine 2001; Kaval 2004; Badgley et al. 2007; Seufert et al. 2012). Then, we extended the search using Web of Science with the following equation: « (horticulture* or vegetable* or (tree crop*)) AND organic AND yield* ». The terms in the first bracket were used to select papers dealing with horticultural crops. The other terms were used to select papers dealing with organic farming and reporting yield data. The search equation was applied to the paper titles with no date limit. The references listed in the retrieved articles were also screened. The literature search was completed by November 2014.

2.2 Paper selection

An initial selection was made by analysing titles and abstracts. The full texts of the selected papers were then examined. The criteria for selecting the papers were as follows: (1) yield data (or yield ratios) were reported for individual crop species in both organic and conventional treatments; (2) the organic treatment was certified organic, biodynamic or followed organic standards (including in transition to organic horticulture); (3) the reported data were primary data coming from experimental stations or on-farm trials (i.e. farm surveys were not included to avoid confounding effects due to farm characteristics) and were not already reported in other papers; and (4) yield data obtained in organic and conventional treatments were obtained in the same sites during the same time periods. A total of 52 papers met our criteria and were finally selected (Table 2).

Table 1 Scientific names of the crops included in the database

2.3 Data extraction

Data were extracted from the text, tables and digitised figures of the selected papers and were included in a dataset. Each study was described by the name(s) of the author(s), the year of publication, the type of publication (report, journal, conference, book), the study title, whether the study had already been included in a review, and the type of data (experimental, on-farm trial). Each study was related to an experimental site (ES), with each experimental site including one or several comparisons between an organic and a conventional treatment for a given species. If an organic (respectively conventional) treatment was compared in the same experimental site with several conventional (respectively organic) treatments, each comparison was included and is hereafter named an experimental comparison (EC). For instance, if two conventional (C1, C2) and two organic (O1, O2) treatments were tested on the same site, four experimental comparisons were included in the database: C1 versus O1, C2 versus O1, C1 versus O2 and C2 versus O2. Each experiment can include several years of comparison. In addition to yield data or yield ratios, we also extracted several other characteristics: type of crop (tuber root, vegetable, spice, fruit tree, small fruit, other fruit), crop common name (Table 1), crop scientific name (Table 1), crop life duration (perennial vs. annual crop), legume versus non-legume crop, type of harvested organ (root, fruit, bulb, leafy), country, climate (tropical, temperate, subtropical, Mediterranean), date, organic type (certified, organic standards, biodynamic, in transition) and conventional type (high input, low input). We refer to low input for conventional treatments using integrated protection methods and/or integrated fertilisation management.
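The pairing rule for experimental comparisons can be illustrated with a short Python sketch. It is only an illustration: the function name `experimental_comparisons` and the treatment labels are hypothetical and do not come from the dataset.

```python
from itertools import product


def experimental_comparisons(conventional, organic):
    """Return every (conventional, organic) pair tested on one site."""
    return list(product(conventional, organic))


# Two conventional and two organic treatments on the same site
# (hypothetical labels) give four experimental comparisons.
pairs = experimental_comparisons(["C1", "C2"], ["O1", "O2"])
for conventional_label, organic_label in pairs:
    print(f"{conventional_label} versus {organic_label}")
```

Because the same treatment appears in several pairs, the resulting comparisons are not independent of each other.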

Table 2 Dataset description. OH organic horticulture, CH conventional horticulture, Year nb number of years of data, Expe station experimental station, Org stand organic standards

When reported, the number of replicates and a measure of dispersion (standard deviation, least significant difference, coefficient of variation) were included. Note that organic and conventional treatments sometimes differ in their cropping practices (e.g. soil tillage, cover crop or crop sequence).

The dataset covers a total of 50 experimental sites, 255 experimental comparisons and 560 yield ratios. Data are obtained for 17 countries and 37 crop species. About two thirds of the data concern five species: tomato (32%), apple (10%), potato (9%), spinach (7%) and bean (7%). About two thirds of the data are obtained from experiments carried out in four European or American countries: the USA (41%), Italy (8%), Switzerland (8%) and Germany (8%). Four sites are located in Asia, and only one each in Central America and Africa.

2.4 Organic and conventional yield comparison

2.4.1 Mean yield ratio estimation

The natural log of the response ratio was used as the effect size metric for the meta-analysis. An advantage of the ratio metric is that it allows one to handle yield data reported in different units for different crop species. The log transformation is used to normalise the data and to ensure that back-transformed confidence limits are positive. The log response ratio, Y, is calculated as the natural log of the ratio of organic to conventional mean yield for each comparison:

$$ Y=\ln \left(\frac{\overline{X_{\mathrm{O}}}}{\overline{X_{\mathrm{C}}}}\right) $$
(1)

where \( \overline{X_{\mathrm{O}}} \) is the average organic yield calculated over \( n_{\mathrm{O}} \) repetitions and \( \overline{X_{\mathrm{C}}} \) is the average conventional yield calculated over \( n_{\mathrm{C}} \) repetitions.
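As a minimal sketch of Eq. (1), the following Python function computes the log response ratio from two mean yields. The function name and the yield values are hypothetical; the example only illustrates that the measurement unit cancels out.

```python
import math


def log_response_ratio(mean_organic, mean_conventional):
    """Natural log of the organic-to-conventional mean yield ratio (Eq. 1)."""
    if mean_organic <= 0 or mean_conventional <= 0:
        raise ValueError("mean yields must be strictly positive")
    return math.log(mean_organic / mean_conventional)


# Hypothetical yields: the same comparison expressed in two different units.
y_t_per_ha = log_response_ratio(40.0, 50.0)   # e.g. t/ha
y_kg_per_plot = log_response_ratio(4.0, 5.0)  # e.g. kg/plot

# Back-transforming with exp() recovers the yield ratio itself.
ratio = math.exp(y_t_per_ha)
```

A negative value of Y means that the organic yield is lower than the conventional yield, and Y = 0 corresponds to equal yields.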

To account for a possible effect of the choice of a statistical model on the results, eight statistical models are compared to estimate the mean effect size (i.e. the mean yield ratio). These include two linear fixed-effect models, four linear mixed-effect models including one random effect (a random experimental site effect or a random experimental comparison effect) and two linear mixed-effect models including two random effects (random experimental site and random experimental comparison effects) accounting for the nested structure of the dataset (Table 3). The mixed-effect models 1 and 4 include a random experimental site effect and are defined as follows:

$$ {Y}_{ijk}=\mu +{b}_i+{\varepsilon}_{ijk} $$
(2)

where \( {Y}_{ijk} \) is the natural log of the yield ratio, as defined in (1), in the ith experimental site, the jth experimental comparison and the kth year, μ is the mean log yield ratio, \( {b}_i \) is a random site effect, \( {b}_i\sim N\left(0,{\sigma}_b^2\right) \), \( {\varepsilon}_{ijk} \) is the residual error term, \( {\varepsilon}_{ijk}\sim N\left(0,{\sigma}_{\varepsilon}^2\right) \), and \( {\sigma}_b^2 \) and \( {\sigma}_{\varepsilon}^2 \) are the between-experiment and within-experiment variances, respectively. The mixed-effect models 2 and 5 are based on the same equation but, in these models, the random effect \( {b}_i \) describes the variability across experimental comparisons and not across sites. The mixed-effect models 3 and 6 include two random effects, one describing the between-site variability and one describing the variability across experimental comparisons. Finally, the fixed-effect models 7 and 8 do not include any random effect; they assume that all experiments share the same log yield ratio.

The model parameters are estimated by restricted maximum likelihood using the lme and glm functions from the nlme and stats packages (R v.3.1.2). Models are ranked according to two statistical criteria, the Akaike Information Criterion (AIC) and the Bayesian Information Criterion (BIC). Models 1, 2, 3 and 7 are fitted to the full dataset using unweighted yield ratios. Models 4, 5, 6 and 8 are fitted to a restricted dataset. With this restricted dataset, each yield ratio is weighted using the inverse of the following variance (Hedges et al. 1999):
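The two ranking criteria can be written down in a few lines of Python. This is a sketch using the standard definitions of AIC and BIC. The log-likelihoods and parameter counts below are invented for illustration and are not the values of the fitted models.

```python
import math


def aic(log_likelihood, n_params):
    """Akaike Information Criterion: 2k - 2 ln(L)."""
    return 2 * n_params - 2 * log_likelihood


def bic(log_likelihood, n_params, n_obs):
    """Bayesian Information Criterion: k ln(n) - 2 ln(L)."""
    return n_params * math.log(n_obs) - 2 * log_likelihood


# Hypothetical candidates: name -> (log-likelihood, number of parameters).
candidates = {"model A": (-120.0, 3), "model B": (-118.5, 4)}
n_obs = 560  # number of yield ratios in the full dataset

rank_by_aic = sorted(candidates, key=lambda m: aic(*candidates[m]))
rank_by_bic = sorted(candidates, key=lambda m: bic(*candidates[m], n_obs))
```

Lower values are better for both criteria. BIC penalises each additional parameter more heavily than AIC as soon as the number of observations exceeds about seven, so the two rankings need not agree.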

$$ \operatorname{var}(Y)=\frac{{\mathrm{SD}}_{\mathrm{O}}^2}{n_{\mathrm{O}}{\overline{X}}_{\mathrm{O}}^2}+\frac{{\mathrm{SD}}_{\mathrm{C}}^2}{n_{\mathrm{C}}{\overline{X}}_{\mathrm{C}}^2} $$
(3)

where \( {\mathrm{SD}}_{\mathrm{O}} \) and \( {\mathrm{SD}}_{\mathrm{C}} \) are the standard deviations of yield calculated over \( n_{\mathrm{O}} \) and \( n_{\mathrm{C}} \) replicates in the organic and conventional treatments, respectively. The variance (3) gives more weight to yield ratios computed from a high number of replicates and/or to yield ratios showing a low variability across replicates. The variance defined by (3) could be calculated for 303 out of 560 yield ratios. Values of \( {\mathrm{SD}}_{\mathrm{O}} \), \( {\mathrm{SD}}_{\mathrm{C}} \), \( n_{\mathrm{O}} \) and \( n_{\mathrm{C}} \) are missing for 257 yield ratios. Correlations between experimental comparisons based on the same organic or conventional treatment (i.e. correlations due to multiple organic or conventional treatments on the same site-years) are not included in our analysis as done in Lajeunesse (2011). However, to take into account the dependency of data collected in the same experimental comparisons, a random effect is included in two of the fitted models (models 3 and 6). Mean yield ratios exp(μ) and 95% confidence intervals are estimated using each model in turn.
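The weighting defined by (3) can be sketched in Python as follows. The second function is a plain fixed-effect inverse-variance average, given only to show how the weights act; it is not the mixed-effect implementation used in the analysis, and all names and numbers are hypothetical.

```python
import math


def log_ratio_variance(sd_o, n_o, mean_o, sd_c, n_c, mean_c):
    """Sampling variance of the log yield ratio (Eq. 3)."""
    return sd_o**2 / (n_o * mean_o**2) + sd_c**2 / (n_c * mean_c**2)


def weighted_mean_log_ratio(log_ratios, variances):
    """Inverse-variance weighted mean of log ratios and its standard error."""
    weights = [1.0 / v for v in variances]
    total = sum(weights)
    mean = sum(w * y for w, y in zip(weights, log_ratios)) / total
    return mean, math.sqrt(1.0 / total)


# Hypothetical comparison: 4 replicates per treatment, CV of 10% in each.
v4 = log_ratio_variance(sd_o=4.0, n_o=4, mean_o=40.0, sd_c=5.0, n_c=4, mean_c=50.0)
# Same comparison with 8 replicates: the variance halves, the weight doubles.
v8 = log_ratio_variance(sd_o=4.0, n_o=8, mean_o=40.0, sd_c=5.0, n_c=8, mean_c=50.0)

mean, se = weighted_mean_log_ratio([math.log(0.8), math.log(0.9)], [v8, v4])
ci_ratio = (math.exp(mean - 1.96 * se), math.exp(mean + 1.96 * se))
```

The confidence interval is built on the log scale and then exponentiated, which keeps both limits of the yield ratio positive.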

Following the best-practice recommendations of Philibert et al. (2012), we look for potential publication bias. We draw a funnel plot relating precision (1/variance of yield ratio, where the variance is computed using (3)) to the natural log of the yield ratio, based on the restricted dataset. Its asymmetry is assessed by testing whether the intercept of a regression analysis deviates from zero (Eggert et al. 1997).
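A regression-based asymmetry check of this kind can be sketched in Python. The sketch assumes the common form of the test, in which the standardised effect (effect divided by its standard error) is regressed on 1/standard error by ordinary least squares. The exact specification used in the analysis is not given in the text, and the data below are invented.

```python
import math


def asymmetry_intercept(effects, variances):
    """Intercept and its standard error from regressing effect/SE on 1/SE."""
    n = len(effects)
    if n < 3:
        raise ValueError("at least three effect sizes are needed")
    se = [math.sqrt(v) for v in variances]
    x = [1.0 / s for s in se]                   # precision
    y = [e / s for e, s in zip(effects, se)]    # standardised effect
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / sxx
    intercept = mean_y - slope * mean_x
    residual_ss = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    residual_var = residual_ss / (n - 2)
    se_intercept = math.sqrt(residual_var * (1.0 / n + mean_x**2 / sxx))
    return intercept, se_intercept


# Hypothetical log yield ratios and their variances from Eq. (3).
effects = [-0.30, -0.10, -0.25, -0.15, -0.20, -0.05]
variances = [0.005, 0.02, 0.01, 0.04, 0.015, 0.08]
intercept, se_intercept = asymmetry_intercept(effects, variances)
```

The ratio of the intercept to its standard error would then be compared with a t distribution with n − 2 degrees of freedom; an intercept close to zero suggests a symmetric funnel.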

2.4.2 Influences of covariates on yield ratios

The effects of seven covariates are studied using model 1 (i.e. the best model according to the AIC and BIC criteria): type of crop (fruit tree, small fruit, other fruit, spice, tuber roots, vegetable), type of product (bulb, fruit, leafy, root), annual versus perennial, legume versus non-legume, types of climate (Mediterranean, subtropical, temperate, tropical), organic system type (biodynamic, certified, organic standards, transition) and conventional system type (high input, low input). Each covariate is included in the model and its statistical significance analysed. Mean yield ratio and confidence intervals are estimated for each level of each covariate. In addition, model 1 is separately fitted to tomato, potato, apple, spinach, bean, lettuce, carrot and onion.

2.5 Variability of yield ratios across experiments

According to our statistical models, the probability distribution describing the variability of the log yield ratio across experiments (i.e. across site-years) is a Gaussian distribution with an expected value equal to μ and a standard deviation equal to \( {\sigma}_b \). The 1, 5, 50, 95 and 99% percentiles of the yield ratio are computed from the exponential of the corresponding percentiles of the Gaussian distribution.
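The percentile computation can be sketched with the Python standard library. The values of μ and \( {\sigma}_b \) used here are purely illustrative assumptions, not estimates from the fitted models.

```python
import math
from statistics import NormalDist


def yield_ratio_percentiles(mu, sigma_b, probs=(0.01, 0.05, 0.50, 0.95, 0.99)):
    """Percentiles of the yield ratio when the log ratio is N(mu, sigma_b^2)."""
    log_ratio = NormalDist(mu, sigma_b)
    return {p: math.exp(log_ratio.inv_cdf(p)) for p in probs}


# Illustrative values only: a median ratio of 0.8 and a log-scale SD of 0.3.
percentiles = yield_ratio_percentiles(mu=math.log(0.8), sigma_b=0.3)
for p, ratio in percentiles.items():
    print(f"{p:.0%} percentile of the yield ratio: {ratio:.2f}")
```

Because the distribution is symmetric on the log scale, the median of the yield ratio is exp(μ) and percentiles at equal distances from the median are symmetric around it in multiplicative terms.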

2.6 Comparison of organic and conventional yield variances across years and across replicates

Two types of yield variance are estimated for each experimental comparison: (i) yield variances across repetitions in organic and conventional treatments for experimental comparisons including standard deviations and number of repetitions and (ii) interannual variances of organic and conventional yields for experimental comparisons including at least 5 years of data. Variances (i) describe the yield variability across replicates for a given site-year. Variances (ii) describe the yield variability across years for a given site. Ratios of variances of organic to conventional yields are estimated for each experimental comparison and each variance type, separately. Yield variances are then compared between organic and conventional treatments based on F tests (function var.test, R v.3.1.2, null hypothesis: equality of variances). Finally, we estimate the mean ratio of organic yield standard deviation to conventional yield standard deviation. The individual ratios of yield standard deviations are computed and weighted according to the method described in Nakagawa et al. (2015), for all experimental comparisons including at least 5 years of data. The mean ratio is then estimated using a mixed-effect model including a random experimental site effect.
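The ratio of standard deviations and its weight can be sketched in Python. The sketch assumes that the weighting follows the bias-corrected log variability ratio (lnVR) of Nakagawa et al. (2015) with inverse-variance weights; the text does not spell out the exact formula used, and the numbers are hypothetical.

```python
import math


def log_sd_ratio(sd_o, n_o, sd_c, n_c):
    """Bias-corrected ln(SD_O / SD_C) and its sampling variance (lnVR form)."""
    if min(n_o, n_c) < 2:
        raise ValueError("at least two observations per treatment are needed")
    ln_vr = math.log(sd_o / sd_c) + 1 / (2 * (n_o - 1)) - 1 / (2 * (n_c - 1))
    variance = 1 / (2 * (n_o - 1)) + 1 / (2 * (n_c - 1))
    return ln_vr, variance


# Hypothetical comparison with 6 years of data in each treatment.
ln_vr, variance = log_sd_ratio(sd_o=4.9, n_o=6, sd_c=5.0, n_c=6)
half_width = 1.96 * math.sqrt(variance)
ci_sd_ratio = (math.exp(ln_vr - half_width), math.exp(ln_vr + half_width))
```

With only a handful of years per comparison, the sampling variance depends solely on the number of years and is large, so the interval around a single ratio of standard deviations is wide.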

3 Results and discussion

3.1 Mean ratios of organic to conventional yield and effects of covariates

The average ratio of organic to conventional yields is equal to 0.83 based on the best model fitted over the total dataset (model 1, 95% confidence interval [0.77–0.90]). Generally, mixed models performed better than fixed-effect models according to both AIC and BIC (Table 3). Model 6, based on the restricted dataset, is also selected and estimates an average ratio of 0.76 (95% confidence interval [0.68–0.85]). The two best models thus indicate that, across experiments, organic yields are on average at most 10 to 32% lower than conventional yields. This result is consistent with the three most recent meta-analyses comparing organic versus conventional yields for a large range of species. De Ponti et al. (2012), Seufert et al. (2012) and Ponisio et al. (2015) respectively found organic yields to be on average 20, 25 and 19% lower than conventional yields over all crop species. These authors reported estimated average yield ratios of about 0.8–0.9 for vegetables and of about 0.72–1 for fruits. Older studies such as Badgley et al. (2007) and Stanhill (1990) suffered severe methodological impediments: a lack of formal statistical analysis for the former as well as the use of a reduced dataset for the latter. Although a few studies of our dataset with high yield ratios (i.e. > 1) tend to be associated with lower precision levels than the studies displaying yield ratios < 1, the funnel plot reveals no publication bias (Fig. 2) (i.e. according to the Eggert (1997) test).
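The 10 to 32% range follows directly from the two confidence intervals, as the short Python sketch below shows. It only re-expresses the interval bounds reported above as percentage yield losses; the variable names are illustrative.

```python
def percent_loss(ratio):
    """Yield loss in percent implied by an organic-to-conventional ratio."""
    return (1 - ratio) * 100


# 95% confidence intervals of the mean yield ratio reported for each model.
intervals = {"model 1": (0.77, 0.90), "model 6": (0.68, 0.85)}

# The upper bound of the ratio gives the smallest loss, the lower bound the largest.
losses = {
    name: (percent_loss(upper), percent_loss(lower))
    for name, (lower, upper) in intervals.items()
}
smallest_loss = min(low for low, _ in losses.values())
largest_loss = max(high for _, high in losses.values())
```

Model 1 thus corresponds to a loss of about 10 to 23% and model 6 to a loss of about 15 to 32%, so the two intervals together span about 10 to 32%.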

Table 3 Statistical models used to analyse ratio of organic versus conventional yields. lme linear mixed-effect model, lfm linear fixed-effect model, EC experimental comparison, ES experimental site. Data were weighted by their variances in models 4, 5, 6 and 8. Italic entries highlight the selected models

Fig. 2 Funnel plot showing the precision (1/variance of yield ratio) as a function of the natural log of the organic to conventional yield ratios. Each black circle is the value for an experimental comparison for which standard deviations are reported. Vertical dashed line: 0; vertical full line: ln(estimated yield ratio)

Yield ratios do not significantly differ across crop types, product types, biological types (lifespan, nitrogen fixing) and climatic conditions (Fig. 3). Organic horticulture types (certified, organic standards, biodynamic, in transition) and conventional horticulture types (high input and low input) do not show significant effects on yield ratios. For example, the estimated yield ratio is equal to 0.90 (95% confidence interval 0.75–1.08) for organic systems in transition, and this value is not significantly different from values estimated for certified organic systems (p value > 0.4). Furthermore, no significant differences are found between yield ratios when estimated independently for each crop species (data not shown). Note though that a few crop types (e.g. other and small fruits), product types (e.g. bulb), crop species (e.g. asparagus, celeriac, chard, cucumber) but also some geographical areas (e.g. Africa) are under-represented in our dataset. This reflects the literature, but it would be very valuable to extend the geographical coverage of the database with more experiments located outside Europe or North America. Our results are consistent with Ponisio et al. (2015) who also found no significant differences between crop types, whether defined based on the related food products (cereals, fruits, vegetable, etc.) or on biological traits (legume vs. non-legume crops, perennial vs. annual crops). On the other hand, Badgley et al. (2007), de Ponti et al. (2012) and Seufert et al. (2012) found differences between crop groups but, as pointed out by Ponisio et al. (2015) regarding Seufert et al. (2012), the statistical methods used in these studies are known to underestimate the width of confidence intervals. De Ponti et al. (2012) showed that the yield ratio differed across regions of the world whereas we did not find significant differences between climate types. Note though that our study considers climate, not geographical zones.

Fig. 3 Effect of crop types and climate conditions on organic to conventional yield ratios for model 1. For each covariate, one modality is used as reference (black square) and compared to 1. The other modalities are compared to the reference (i.e. vegetable, tuber roots, spice, small fruit and other fruits are compared to fruit tree; perennial is compared to annual; and tropical, temperate and subtropical are compared to Mediterranean) with corresponding p values indicated on the right side of each graph (0 is given for any p value lower than 0.001). Results obtained with model 6 are similar (data not shown). Bars show 95% confidence intervals; the vertical line indicates a ratio equal to 1

As for management, Seufert et al. (2012) reported an effect of irrigation or the use of the best management practices in organic farming on the gap between organic and conventional yields. Ponisio et al. (2015) did not find an effect of management practices at the crop scale but found that diversifying crop species in space or over time improved yields in organic farming compared to undiversified conventional farming. In our study, the low-input conventional horticulture category refers primarily to the level of input use but is also frequently associated with a more diversified crop rotation. In our dataset, the relative effects of input use, rotation diversification or management practices (e.g. tillage, fertilisation, pest management) were generally hard to disentangle. Organic and conventional systems cover a diversity of farming systems and a large range of agricultural practices (Sylvander et al. 2006; Darnhofer 2014; Navarrete et al. 2015; Petit and Aubry 2015). Progress could be made through an in-depth analysis of the effects of management practices on the yield ratio, based on a detailed description of cropping system characteristics. The yield differences reported here are due to the effect of differentiated crop management in the same locations. Organic and conventional farms may be located in different physical or economic conditions, as suggested by Seufert and Ramankutty (2017), and differences of location may also have an effect on yields. However, the analysis of this effect is beyond the scope of this paper.

3.2 Are organic yields generally more variable than conventional ones?

3.2.1 Yield ratio variability across experiments

The cumulative probability distribution of the organic to conventional yield ratios reveals large variability across experiments. There is about a 90% chance of getting a yield ratio higher than 0.5; i.e. yield loss in organic horticulture has a 10% chance of exceeding 50%. On the other hand, organic yields have a 50–60% chance of reaching at least 75% of the conventional yields, and there is a 20% chance of getting higher yields in organic systems. These results highlight that, in horticulture, organic to conventional yield gaps vary greatly across experiments.
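Exceedance probabilities of this kind can be read from a Gaussian distribution on the log scale, as in the Python sketch below. The text does not state which model or which standard deviation underlies the quoted probabilities. The sketch assumes the mean ratio of model 6 (0.76) and a log-scale standard deviation of 0.33, the latter being an illustrative assumption chosen because it gives probabilities of the same order as those quoted.

```python
import math
from statistics import NormalDist


def prob_ratio_above(threshold, mu, sigma):
    """P(yield ratio > threshold) when the log ratio is N(mu, sigma^2)."""
    return 1 - NormalDist(mu, sigma).cdf(math.log(threshold))


mu = math.log(0.76)  # mean ratio reported for model 6
sigma = 0.33         # assumed spread on the log scale, not reported in the text

p_above_half = prob_ratio_above(0.50, mu, sigma)
p_above_three_quarters = prob_ratio_above(0.75, mu, sigma)
p_above_one = prob_ratio_above(1.00, mu, sigma)
```

Under these assumptions the three probabilities come out close to 0.9, 0.5 and 0.2, respectively.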

Our results reveal the importance of studying both the average yield gap and its variability across experiments. We failed to identify variables significantly explaining the interexperiment variability of yield ratios. Covariates describing the local environment (soil type, weather, pest pressure) and the crop management in more detail would be valuable. Lotter et al. (2003) highlighted for instance that ratios of organic versus conventional maize yields differed as a function of rainfall. Cooper et al. (2016) showed the value of dividing reduced tillage practices into different classes to analyse their effect on organic yields. This requires consistent site and management information in published papers.

We can hypothesise that in specific environmental and/or management conditions, organic production is more efficient than conventional production and vice versa. Lotter et al. (2003) showed for instance that organic maize outyields conventional maize in extreme climate conditions. A possible explanation would lie in an improved soil water capture in organic systems related to the use of organic amendments leading to higher soil organic matter.

3.2.2 Yield variability across replicates

Based on variance comparison tests, we find that the variances in organic and conventional treatments are significantly different (p value < 0.01) in only 11% of the considered experimental comparisons. The variance of the organic treatment was significantly higher than the variance of the conventional treatment in 6% of the experimental comparisons and lower in 5%. Organic yields are thus generally not more variable across replicates than conventional yields. Kravchenko et al. (2005) highlighted that crop management practices can affect the spatial variability of grain yield, but with contrasting impacts depending on weather conditions. Although they found no difference in yield spatial variability between low chemical input and conventional treatments, they observed yield variability to be lower in low precipitation years for the zero fertiliser input treatment (no chemical inputs, compost or manure) than in the treatment that received fertiliser inputs. It is however difficult to extrapolate their findings to organic systems.

3.2.3 Yield variability across years

Interannual variability is here analysed based on variance ratios estimated for 36 experimental comparisons including at least 5 years of data (Fig. 4). Although 57% of the estimated variance ratios are lower than one, none of the variance comparison tests are significant. Over all experimental comparisons including at least 5 years of data, the estimated mean ratio of standard deviations of organic and conventional yields is equal to 0.98 (95% confidence interval [0.82–1.18]) and is not significantly different from one. This means that, for horticulture, the interannual variability of organic yields is not significantly higher than that of conventional yields. This result is uncertain; confidence intervals associated with estimated variance ratios are often large because of the relatively small number of data in each experimental comparison. In turn, this large uncertainty partly explains why none of the computed differences are found significant. Our results are consistent with Stanhill (1990) who found no effect of organic farming on the interannual variability of yields. A review of the other papers analysing interannual yield variability suggests contrasting results. Comparing different management systems, Smith et al. (2007) showed that interannual yield variability is significantly higher for organic soybean and organic wheat but not for organic corn. Smolik et al. (1995), on the other hand, reported that year-to-year yield variability is lower in organic systems compared to conventional or reduced-till systems. Lotter et al. (2003) showed that organic maize outyields conventional maize in extreme climate years due to improved soil water capture. Euvard (2010) and Rolland et al. (2012) reported respectively that, in France, organic potato and organic wheat yields are characterised by a high interannual yield variability due to weather or biological fluctuations but did not carry out any comparison with conventional farming. Further research, for example relying on long-term field trials, is therefore needed.

Fig. 4 Ratio of interannual yield variances (organic vs. conventional) estimated for experimental comparisons (EC) including at least 5 years of yield data. Each row corresponds to an EC whose associated crop name is given; the number of years per EC is shown on the right. No variance ratio was significantly different from one. The upper bound of the range of variance ratio was set to 10

4 Conclusion

Our meta-analysis, based on a comprehensive global experimental dataset, shows that yields in organic horticulture are on average 10 to 32% lower than yields in conventional horticulture. Our analysis reveals a strong variability of organic versus conventional yield ratios across experiments. The probability of an extremely high yield loss in organic systems is small: yield losses have only a 10% chance of exceeding 50% of conventional yield. On the other hand, organic yields have a 20% chance of exceeding conventional yields. These results suggest that, when studying the yield differences between organic and conventional management, agronomists should not only focus on average yield differences but also analyse yield ratio distributions.

We did not identify any covariates significantly affecting the magnitude of yield losses. Our results support the need to further extend the coverage of databases comparing organic versus conventional yield from horticultural crops to (i) better describe crop management to account for the large range of practices existing in organic and in conventional farming systems and (ii) include a description of the local soil and bioclimatic environment.

We find no significant impact on yield ratios of different climatic zones. However, more than 75% of our dataset is composed of experiments carried out in European or North American countries, whereas these countries cover only 30% of the organic land dedicated to horticultural crops (Willer and Lernoud 2017). Considering data coming from other regions of the world is therefore crucial.

Our study shows that yield instability is not significantly different in organic versus conventional horticulture. We do not find significant differences between organic and conventional horticulture yield variances across replicates and years. This result is important because high yield variability, which can be perceived as increased risk, may limit the development of agricultural activities. The amount of data suitable to analyse interannual yield variability was however limited. A more robust conclusion could be made in the future by analysing new long-term field trials. Other criteria such as the variability of fruit/vegetable nutrient contents (especially in developing countries, as pointed out by Hunter et al. 2011 and Schoonbeek et al. 2013) should also be analysed to assess the costs and benefits of organic products. In addition to criteria related to crop production, organic systems should undoubtedly be compared to conventional systems also according to their environmental and social benefits.