1 Introduction

Agricultural cropping systems will experience increased perturbations due to climate change (Ray et al. 2015; Tigchelaar et al. 2018). The projected climate effects include increases in temperature and changes in patterns of precipitation and an increased frequency and severity of extreme weather events such as drought, heat stress, and flooding (Lobell et al. 2011). The increase in climate variation, alongside other factors, has resulted in decreased yield stability of major crop species in large parts of Europe and other regions of the world (Döring and Reckling 2018; Ray et al. 2015), and yield stability of crops is predicted to decrease even more under future climate change scenarios (Tigchelaar et al. 2018).

Cropping systems need to be designed to enhance crop yields while maintaining or increasing yield stability under varying climatic conditions (Liu et al. 2019). Besides the choice of the evaluated outputs (e.g., yield stability), the factors influencing these outputs (e.g., rotational effects and climatic conditions) need to be considered in the design of diversified cropping systems. While most studies use simulation modeling to assess cropping systems under various conditions (Bergez et al. 2010), few use empirical data, e.g., where cropping systems are assessed over long time periods taking the role of crop rotations into account (Marini et al. 2020). Crop rotation principles are important in the design and assessment of farming systems because they describe the sequence and frequency of crops grown and the interactions within a system. Assessing yield stability of crops in different cropping systems requires yield data over long time periods. Long-term experiments (LTE) that compare different cropping systems provide such data (Fig. 1), i.e., crop yields over long time periods under different conditions and rotations (Johnston and Poulton 2018). The assessment of such experiments can provide fundamental knowledge on the role of crop rotation diversification in affecting yield stability, the probability of diversified systems to outperform less diverse systems, and environmental adaptability. As detailed in Section 2.2.4, the latter is assessed based on crop yields of individual cropping systems against the mean yields of a range of cropping systems. Such knowledge could support the design of future cereal-based cropping systems that are better able to deal with the increasing climate variability.

Fig. 1 Tulloch is a Scottish LTE established in 1991 with a 6-year crop rotation. In the stocked system, crops are partly grazed by sheep and receive farm-yard manure (picture: C. Watson/SRUC).

There are several hundred LTEs available worldwide, and these are mainly analyzed for single research objectives comparing few treatments. It is only recently that the yield data from LTEs have been used to assess yield stability for separate LTEs (see Reckling et al. (2021) for a review) and also in combined analyses using several international LTEs (Marini et al. 2020; Reckling et al. 2018). Data from LTEs incorporate changes in climate and changes associated with the management such as the impacts of crop diversification on soil structure and soil carbon that affect crop performance.

Incorporating perennial leys and grain legumes into cropping systems increases the yield of subsequent crops under most conditions compared to systems without legumes and perennial crops (Angus et al. 2015; Preissel et al. 2015; St-Martin et al. 2017). This is due to a positive pre-crop effect (residual and break crop effect) that enhances main crop yields by 20% on a global average (Zhao et al. 2022). Cover crops can significantly reduce nutrient leaching and also affect the yield of subsequent crops (Hauggaard-Nielsen et al. 2012; Plaza-Bonilla et al. 2016). While the effect of these diversification strategies (incorporating leys with different lengths, grain legumes, cover crops, and crop sequences) on yield has been observed in separate analyses, the effect of these strategies on yield stability, environmental adaptability, and the probability of diversified systems to outperform less diverse systems has not been investigated.

While stability analysis was originally used to assess the stability of crop genotypes across environments, the analysis of yield stability of cropping systems, especially in relation to climate change, has gained importance (Lobell et al. 2011; Tigchelaar et al. 2018). To assess yield stability, environmental adaptability, and the probability of systems to outperform other systems, different regression- and variance-based indicators have been proposed (see Reckling et al. (2021) for an overview).

There are a number of types of indicators that can help to assess cropping systems in relation to yield stability, environmental adaptability, and the probability of diversified systems to outperform less diverse systems: (i) Static and variance-based yield stability indicators include the relatively simple coefficient of variation (CV), which is one of the most frequently used indicators in agronomic and ecological research (Ray et al. 2015). Applying the CV implies the assumption that the standard deviation increases linearly with the mean. However, under certain conditions, the unguarded interpretation of the CV of crop yield data may be misleading, especially when the crop yield data spans a large numeric range (Döring et al. 2015; Döring and Reckling 2018). (ii) Another static yield stability indicator is the POLAR (Power Law Residuals) (Döring et al. 2015) that estimates yield stability of crops independent of differences in mean yields between cropping systems. (iii) Dynamic indicators can be expressed as a measure of environmental adaptability. A regression-based indicator following Finlay and Wilkinson (1963) is used in this study to assess the interaction between the yield performances of the single crop within a cropping system in relation to the mean yield of this crop over all cropping systems. (iv) The probability that one cropping system outperforms another system can be estimated to account for the variance and also the mean yield between systems (Piepho 1998).

The objective of this study was to assess the effect of cropping system diversification strategies across different European climates on cereal yield stability, environmental adaptability, and the probability that diversified systems outperform less diverse systems. We investigated diversification through (i) integration of perennial leys, (ii) increasing proportion (length, i.e., number of years) of the perennial ley relative to the entire crop rotation, (iii) varying the order in which crops are positioned in the rotation, (iv) integration of grain legumes, and (v) integration of cover crops. We used cereal yield data from five LTEs from Sweden, Scotland, and France.

2 Materials and methods

2.1 Characteristics of long-term experiments

Long-term experiments (LTEs) containing cereals in different cropping systems from Sweden, France, and Scotland with different bio-physical conditions and experimental designs (Table 1) were used for the analyses of yield stability.

Table 1 Details about the long-term experiments.

2.1.1 Swedish long-term experiments

Annual cereal yield data (winter wheat and oats) was used from three Swedish LTEs located at Lanna, Stenstugu, and Säby and established in 1965, 1968, and 1969, respectively. The time periods used for this study were 1971–2014 (44 years), 1974–2014 (41 years), and 1975–2014 (40 years) for Lanna, Stenstugu, and Säby, respectively (Table 1). The experimental design included three different crop rotations (A, with grass-clover ley; B, with grass ley; and C, without ley; Table S1) and four levels of N fertilization (N0, N1, N2, and N3; Table S1) with amounts of 0 kg ha-1 year-1, 38–55 kg ha-1 year-1, 73–100 kg ha-1 year-1, and 105–145 kg ha-1 year-1, respectively, and depending on crop. The design has no spatial replicates but all crops in the crop rotations are present each year. Each crop rotation × N level combination is hereafter referred to as a cropping system. All nitrogen was applied as a mineral fertilizer (see Persson et al. (2008) for further details).

In all cropping systems, the crop sequence was the same for the first 4 years of the rotation (Table S1): winter/spring oilseed rape (Brassica napus L.) or white mustard (Sinapis alba L.), winter wheat (Triticum aestivum L.), spring oat (Avena sativa L.), and spring barley (Hordeum vulgare L.). The grass-clover ley of crop rotation A consisted of red clover (Trifolium pratense L.) and timothy (Phleum pratense L.) at Lanna and red clover, timothy, and alfalfa (Medicago sativa L.) at Stenstugu and Säby. The grass-clover ley contained > 30% legumes in the biomass in the mixture in most years. However, there was a large variation in the legume percentage between years and rotation systems but there was no detailed data available. The grass ley of crop rotation B consisted of timothy and meadow fescue (Festuca pratensis L.) at all sites. Spring wheat (Triticum aestivum L.) and a black fallow was used in crop rotation C instead of first and second year leys in rotations A and B (Table S1).

To avoid confounding effects of previous management at the experimental sites, the first 6-year rotation cycle was excluded from all analyses. Due to poor drainage at Säby, the winter survival of winter wheat was poor, and thus, the complete winter wheat dataset was excluded from the analysis. Spring barley was excluded because it was managed differently in the three crop rotations (Table S1).

2.1.2 Scottish long-term experiments

Tulloch, the Scottish LTE, was established in 1991 in the North-East of Scotland. The time period used for this study was 1994–2017 (24 years) (Table 1). The 6-year crop rotations have each course of the rotation present in each year, and there are two spatial replicates of each rotation (see Watson et al. (2011) for further details). The T50 crop rotation (see Table S2) consisted of 3 years of grass-clover ley (Lolium perenne L., P. pratense L., and Trifolium repens L.), followed by spring oats (A. sativa L.), swede (Brassica napus L.), and then barley with undersown ley. The T67 system had 4 years of grass-clover ley followed by spring oats and then spring oats undersown with the grass-clover ley mixture. These rotations were grazed by sheep and received farm-yard manure as described in Table S2. In 2007, the T67 rotation was converted to a stockless rotation. The first 4 years of the oat data were excluded from the analysis as they did not follow 4 years of grass-clover.

2.1.3 French long-term experiment

The French LTE is located at Auzeville and was established in 2004. The time period used for this study was 2005–2016 (12 years) (Table 1). The LTE has a split-plot design with crop rotation (R1 and R2; Table S3) as the main plot and cover crops as sub-plots (with or without; Table S3). There are no spatial replicates, but all crops are present each year.

Four cropping systems resulting from a combination of two 3-year crop rotations (Table S3) with or without cover crops were used in the present study. The R1 rotation consisted of durum wheat (Triticum turgidum L.) followed by sorghum (Sorghum bicolor L.) and then sunflower (Helianthus annuus L.). Crop rotation R2 had durum wheat followed by sunflower and then winter pea (Pisum sativum L.) during 2004–2010 and winter faba bean (Vicia faba L.) during 2011–2016 (Table S3). A vetch-oat (Vicia sativa L.–A. sativa L.) mixture was used as a cover crop in crop rotation R2 while a combination of vetch and other crop species was used in rotation R1 (see Table S3 for details). Analyses were performed for the grain yields from durum wheat, since cereals were the target crop types for this study and it was the only crop that was present in both crop rotations with the same pre-crops. Further information regarding the LTE can be found in Plaza-Bonilla et al. (2016).

2.2 Statistical analyses

2.2.1 Probability method (PM)

The probability of one system outperforming another is estimated with a method described by Piepho (1998) that directly compares single crops, e.g., cereals, in two cropping systems in terms of differences in yield and their variance and covariance, and is calculated as

$$\mathrm{Pr}\left({D}_{j}>0\right)=\Phi \left(\delta /{\sigma }_{D}\right)$$

where \(\Phi\) is the cumulative distribution function of the standard normal distribution, \(\delta\) is the estimated mean difference between systems, and \({\sigma }_{D}^{2}\) is the variance of a difference \({D}_{j}\) in a randomly selected environment following Piepho (1998).

There is a connection between the probability of one system outperforming another and the variance-based stability indicators mentioned above (CV and POLAR) through accounting for the mean, the variance, and covariance of the yield data. The probability indicator was used in this study to calculate the likelihood that one system, e.g., with perennial ley, “outyields” the system without perennial ley in more than 50% of the cases. For the Swedish LTE, each cropping system was considered as a system, and all analyses were performed separately for each site and crop combination. The environmental variance model described by Piepho (1999) was used to calculate estimates of variances and mean yields for the probability calculations. Probability calculations were not performed for the French and Scottish LTEs due to the limited availability of data.
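The probability calculation can be illustrated with a minimal Python sketch. It assumes paired annual yields of two systems and estimates \(\delta\) and \({\sigma }_{D}\) from the sample mean and sample standard deviation of the yearly differences. This is a simplification for illustration and not the environmental variance model of Piepho (1999) used in the study. The function names and the yield values are hypothetical.

```python
import math


def normal_cdf(z):
    """Cumulative distribution function of the standard normal distribution."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def prob_outperform(yields_a, yields_b):
    """Estimate Pr(D > 0), where D is the yield of system A minus system B.

    Assumption: yields are paired by year, and delta and sigma_D are taken
    as the sample mean and sample standard deviation of the differences.
    The variance of the differences already accounts for the covariance
    between the two systems.
    """
    diffs = [a - b for a, b in zip(yields_a, yields_b)]
    n = len(diffs)
    delta = sum(diffs) / n
    var_d = sum((d - delta) ** 2 for d in diffs) / (n - 1)
    return normal_cdf(delta / math.sqrt(var_d))


# Hypothetical yields (t/ha) of a system with ley and one without, four years
with_ley = [5.0, 6.0, 7.0, 6.0]
without_ley = [4.0, 5.5, 6.0, 6.5]
print(round(prob_outperform(with_ley, without_ley), 2))
```

A value above 0.5 means that the first system is expected to outyield the second in more than half of the environments.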

2.2.2 Coefficient of variation (CV)

The yield stability of cropping systems was determined by the coefficient of variation (CV). The CV is defined as the standard deviation σ divided by the mean μ and is calculated as:

$$\mathrm{CV}=\frac{\sigma }{\mu }\times 100\%$$

For the Swedish LTEs, the dataset of each cropping system was divided into one subset for each crop rotation cycle. Hence, every CV value comprised yield data from 6 years, except for the seventh crop rotation cycle at Stenstugu and Säby, which only comprised yield data from 5 and 4 years, respectively. The winter wheat grain yield data from crop rotation cycle (subset) five at Stenstugu were excluded from the CV analyses due to missing data. The datasets from the French and Scottish LTEs were also divided into subsets. The CV values were analyzed statistically using the MIXED procedure in SAS, and Tukey’s test (α = 0.05) was used as a post hoc test. The dataset of the Swedish LTEs was analyzed by comparing the mean CV values between crop rotations (A, B, and C) within each site × crop × N level combination. Rotation cycle (subset) and crop rotation were considered fixed factors in these analyses. To assess the effect of the two crop rotations on durum wheat yield stability, the dataset of the French LTE was analyzed by comparing mean CV values between the R2 rotations with grain legumes (with and without cover crop) and the R1 rotations without grain legumes (with and without cover crop), since the interaction between the systems was not significant. To assess the effect of cover crops on durum wheat yield stability, mean CV values of cropping systems with cover crops (R1 without and R2 with grain legumes) and without cover crops (R1 without and R2 with grain legumes) were compared. Rotation cycle (subset), crop rotation, and cover crop were considered as fixed factors in the French analysis. The Scottish dataset was also analyzed in two different ways. To compare the yield stability of spring oat directly following the grass-clover ley (oat position 1) with that of spring oat separated from the grass-clover ley by other crops in the rotation (oat position 2), CV values from the Scottish T50 rotation were compared for the period 1994–2017. CV values of spring oats in the 3-year ley (T50) and 4-year ley (T67) rotations were compared without any statistical analysis because of an insufficient number of crop rotation cycles.
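The subsetting by rotation cycle can be sketched in Python as follows. The sketch assumes a yearly yield series that is split into consecutive cycles of six years and uses the sample standard deviation (n − 1 denominator), which is an assumption because the estimator is not specified above. Function names and data are hypothetical.

```python
import statistics


def cv_percent(yields):
    """Coefficient of variation in percent: standard deviation / mean * 100."""
    return statistics.stdev(yields) / statistics.mean(yields) * 100.0


def cv_by_cycle(yields, cycle_length=6):
    """Split a yearly yield series into rotation cycles and return one CV each.

    Assumption: cycles are consecutive blocks of `cycle_length` years; a
    shorter final block (as for the seventh cycle at Stenstugu and Säby) is
    kept as long as it holds at least two years.
    """
    cycles = [yields[i:i + cycle_length]
              for i in range(0, len(yields), cycle_length)]
    return [cv_percent(c) for c in cycles if len(c) >= 2]


# Hypothetical oat yields (t/ha) for 12 years, i.e., two rotation cycles
oat = [4.1, 3.8, 4.5, 4.0, 3.6, 4.4, 4.9, 3.2, 5.1, 4.0, 3.5, 4.8]
for cycle, cv in enumerate(cv_by_cycle(oat), start=1):
    print(f"cycle {cycle}: CV = {cv:.1f}%")
```

The resulting per-cycle CV values are the observations that would then enter a mixed model with rotation cycle and crop rotation as fixed factors.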

2.2.3 Power Law Residuals (POLAR)

The Power Law Residuals (POLAR) is an index of yield stability that is independent of the mean yield in contrast to the CV (Döring et al. 2015; Reckling et al. 2018). POLAR is based on Taylor’s power law (TPL), which states that the logarithm of the sample variance (\({\sigma }^{2}\)) is a linear function of the logarithm of the sample mean (\(\mu\)) across different subsets of data. Using the same subsets of data as described for the calculation of the CV, means (\(\widehat{\mu }\)) and variances (\({\widehat{\sigma }}^{2}\)) were calculated, resulting in pairs (with index i) consisting of a mean and a variance. Following TPL, a linear regression is calculated for log10 of the variance over the log10 of the mean. With \({v}_{i}={\mathrm{log}}_{10}({\widehat{\sigma }}_{i}^{2})\) and \({m}_{i}={\mathrm{log}}_{10}({\widehat{\mu }}_{i})\), the linear regression is \(v=a+bm\). The residuals \({u}_{i}\) from this regression line (the POLAR values) are then calculated according to Döring et al. (2015) as:

$${u}_{i}= {v}_{i }-\left(a+b{m}_{i}\right)$$

A low POLAR value corresponds to a high stability, whereas a high value corresponds to a low stability. For the POLAR calculations, all yield data from the LTEs were divided into the same subsets as for the CV analyses, and for each subset, a mean and a variance were calculated. Subsequently, a linear regression was calculated for log10 of the variance against the log10 of the mean using all yield data subsets from all LTEs, following Döring et al. (2015) and Reckling et al. (2018). The yield data from spring oat and winter wheat from Lanna and Stenstugu, spring oat from Säby and the Scottish LTE, and durum wheat for the French LTE were used for the calculation of the linear regression. The number of values for the regression was the same as for the calculated CVs. The POLAR values (residuals) obtained from this regression line were compared and statistically analyzed as described for the CV values.
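The POLAR calculation described above can be sketched in Python as an ordinary least-squares fit of log10 variance on log10 mean, followed by the residuals \({u}_{i}={v}_{i}-(a+b{m}_{i})\). The function name and the example means and variances are hypothetical, and the sketch assumes that one mean and one variance per subset are already available.

```python
import math


def polar(means, variances):
    """Return (a, b, residuals) for Taylor's power law regression.

    v_i = log10(variance_i) is regressed on m_i = log10(mean_i) by ordinary
    least squares; the residuals u_i = v_i - (a + b * m_i) are the POLAR
    values. A lower residual indicates higher stability.
    """
    m = [math.log10(x) for x in means]
    v = [math.log10(x) for x in variances]
    n = len(m)
    m_bar = sum(m) / n
    v_bar = sum(v) / n
    sxx = sum((mi - m_bar) ** 2 for mi in m)
    sxy = sum((mi - m_bar) * (vi - v_bar) for mi, vi in zip(m, v))
    b = sxy / sxx
    a = v_bar - b * m_bar
    residuals = [vi - (a + b * mi) for mi, vi in zip(m, v)]
    return a, b, residuals


# Hypothetical subset means (t/ha) and variances from several rotation cycles
subset_means = [2.0, 4.0, 8.0, 5.0]
subset_vars = [0.5, 1.0, 6.0, 1.5]
a, b, u = polar(subset_means, subset_vars)
print(f"a = {a:.3f}, b = {b:.3f}")
print([round(ui, 3) for ui in u])
```

Because the regression is fitted across all subsets at once, each residual expresses whether a subset is more or less variable than expected for its mean yield.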

2.2.4 Environmental adaptability (FW)

In this regression analysis, annual yields of individual cropping systems were regressed on the mean annual yields of all cropping systems included in the comparison (Finlay and Wilkinson 1963; Piepho 1998), and the result was interpreted as an indicator of environmental adaptability. The performance of an individual cropping system is evaluated by comparing the slopes of the regression lines (FW, Finlay-Wilkinson regression coefficient bi) independent of the intercept. A slope value greater than one (b > 1) indicates higher environmental adaptability. Higher environmental adaptability means that favorable environmental conditions result in higher yields. A slope value lower than one (b < 1) indicates a lower environmental adaptability. The slopes within each site × crop × N level combination of the Swedish LTEs were compared. In the French LTE, the slopes of the R1 rotations (with and without cover crop) and the R2 rotations (with and without cover crop) were compared, as were the slopes of cropping systems with cover crops (R1 and R2) and without cover crops (R1 and R2). The slopes of the Scottish crop rotations T50 and T67 at first oat position were compared using data between 1995 and 2006, and the slopes of the two different oat positions were compared using data from the T50 crop rotation between years 1994 and 2017. Following Littell et al. (2006), all comparisons of slopes were performed using the MIXED procedure in SAS.
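The Finlay-Wilkinson slope can be illustrated with a short Python sketch that regresses each system's annual yield on the yearly mean over all systems. It uses a separate ordinary least-squares fit per system, which is a simplification of the mixed-model comparison of slopes performed in SAS. The function name, the system labels, and the yields are hypothetical.

```python
def fw_slopes(yields_by_system):
    """Return the Finlay-Wilkinson slope b_i for each cropping system.

    yields_by_system maps a system label to its annual yields, with all
    systems covering the same years in the same order. The environmental
    index of a year is the mean yield over all systems in that year.
    """
    systems = list(yields_by_system)
    n_years = len(yields_by_system[systems[0]])
    env = [sum(yields_by_system[s][t] for s in systems) / len(systems)
           for t in range(n_years)]
    env_bar = sum(env) / n_years
    sxx = sum((e - env_bar) ** 2 for e in env)
    slopes = {}
    for s in systems:
        y = yields_by_system[s]
        y_bar = sum(y) / n_years
        sxy = sum((e - env_bar) * (yi - y_bar) for e, yi in zip(env, y))
        slopes[s] = sxy / sxx
    return slopes


# Hypothetical oat yields (t/ha) in a poor, an average, and a good year
data = {"A (with ley)": [3.0, 5.0, 7.0], "C (without ley)": [3.0, 4.0, 5.0]}
for system, slope in fw_slopes(data).items():
    print(f"{system}: b = {slope:.2f}")
```

In this hypothetical example the system with ley has b > 1 and the system without ley has b < 1. By construction, the slopes average to one across the systems included in the comparison.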

3 Results and discussion

3.1 Diversification through perennial legume crops

Winter wheat yields in the Swedish cropping system with the perennial grass-clover ley (A) outperformed winter wheat yields in the system without the ley (C) across all sites and N levels with a probability of 64–94% in the Swedish LTEs (Table 2 for the site Lanna, S4 for Stenstugu, and S5 for Säby). Winter wheat in the system with perennial grass ley (B) also outperformed the wheat in the system without ley (C) but only with a probability of 55–79% (Table 2). Diversification with perennial grass-clover ley (A) and grass ley (B) did not consistently affect winter wheat yield stability quantified with the CV (yield dependent indicator) and POLAR (yield independent indicator) compared to wheat in the system without perennial ley (C) (Table 3). In Stenstugu in the 0 N treatment, yield stability was significantly higher with a CV of 23% for winter wheat in the systems with perennial ley (small CV value indicates higher stability) compared to the system without ley with a CV of 29% (Table 3). There was no significant difference in the environmental adaptability, although the slope of wheat in the systems with perennial crops had consistently higher values than 1 indicating higher environmental adaptation (Table 3).

Table 2 Probability of the ith treatment combination (crop rotation × N level) outperforming the i′th treatment combination for winter wheat and oat grain yields during 1971–2014 at the site Lanna.
Table 3 Mean winter wheat grain yields, mean CV values, and Finlay-Wilkinson (FW) regression coefficient bi for each crop rotation and N level combinations at sites Lanna and Stenstugu.

Similar to the probability analysis for winter wheat, grain yields of oats in the Swedish crop rotations with perennial leys (A and B) outperformed the oat yield in the system without a perennial crop (C) by 55–95%, except at the Säby site at N level 3 (Table 2). Oat yield stability (CV and POLAR) was not affected by the presence of perennial leys, except for N level 3 at Säby. At this site and N level, the CV of oat yields was significantly lower (indicating higher stability) without a perennial crop (C) compared to the system with a perennial grass ley (B) (Table 4) but not for the yield-independent POLAR coefficient. In contrast to winter wheat, the presence of perennial crops affected the response of the oat yield to the environmental conditions. At N level 0 and all sites, the slope values of the environmental adaptability coefficient for crop rotations including grass-clover leys and grass leys (A, B) were significantly higher than without (C) indicating higher environmental adaptability (Table 4, Fig. 2). For fertilization levels N1 and N2 at Stenstugu, the coefficient for rotation A was significantly higher than rotation C indicating higher environmental adaptability (Table 4). At Säby, a higher environmental adaptability was found for N2 and N3 for systems with perennial leys.

Table 4 Mean oat grain yields, mean CV values, Finlay-Wilkinson (FW) regression coefficient bi and POLAR coefficients for each crop rotation and N level combination at sites Lanna, Stenstugu and Säby.
Fig. 2 Regression lines of individual cropping systems (A, with grass-clover ley; B, with grass ley; and C, without ley) over the overall cropping systems year means for oat at N level 0 (0 kg N ha-1 year-1) for the three sites, A Lanna LTE, B Stenstugu LTE, and C Säby LTE. Comparisons between regression lines are reported in Table 4.

Our results partly support our first hypothesis that cropping systems with perennial leys (in the three Swedish LTEs) outperformed those without leys in terms of winter wheat and oat yields (Table 2). Sanford et al. (2021) found that systems with a greater proportion of perennial crops had the highest long-term stability. In our study, the magnitude of differences in yield was similar for winter wheat and oat and across the three sites but decreased from 92 to 64% with increasing nitrogen fertilizer application rate (Tables 2, S4, S5). Thus, the greatest impact of crop diversification was found in the low-input systems. In such systems, many factors can be limited including nitrogen, and the residues from the perennial crops provide nutrients and better growing conditions for the following cereals compared to other pre-crops. Forage legumes fix more dinitrogen when N is limiting and contribute to better soil conditions of the following crops (Iannetta et al. 2016). It has already been shown for the Swedish LTEs investigated here that the systems with perennial grass and grass-clover increased soil organic carbon (Persson et al. 2008) and that the yields of the cereals were greater than in the system without leys (Bergkvist and Båth 2015). Macholdt et al. (2020) found differences in yield stability (indicated as the risk of yield falling below a threshold) between less and more diverse systems to be larger in low-input treatments. On the other hand, crops in such low-input systems had the highest production risk (probability of yield loss) compared to fertilized systems (Macholdt et al. 2020). We also found an indication that non-fertilized systems were less stable than fertilized systems (Tables 3 and 4) but we did not find a systematic effect of diversification through leys on yield stability of winter wheat and oat. 
However, there was clear indication of higher environmental adaptability of the cropping systems with perennial leys, especially for oat in low-input systems across all sites (Fig. 2). The residual nitrogen from roots and crop residues of forage legumes might be one of the reasons to support crop growth of spring oat when nitrogen is limiting (without nitrogen fertilization) especially under favorable climatic conditions (with a high nitrogen demand for crop growth). Under unfavorable conditions, e.g., water limitation in dry years, we found only slightly higher yields of spring oat (probably factors other than nitrogen were limiting). In the analysis by Marini et al. (2020), they found that the more diverse systems performed even better under these poor conditions and concluded that this increases their potential for adapting to changes in climate.

3.2 Diversification through the length of the perennial ley crops

The length of the perennial ley phase can vary and affect the subsequent crop production. At the Scottish site, the length of the ley phase had no effect on the mean yield and yield stability of the oats as indicated by the CV and the POLAR, and there was no significant difference in the adaptation to the environmental conditions according to the FW regression (Table 5). Thus, diversification through increasing the length of the perennial ley had no significant effect on yield regardless of the environmental conditions. This is supported by the Finlay and Wilkinson analysis showing no difference in adaptation to environmental conditions, although the 3-year ley had a slope < 1 indicating lower adaptability than the 4-year ley with a slope > 1 (Table 5). The difference between the systems (both already very diverse with perennial grass-clover) was probably too small to impact yield variation. However, the longer ley phase in the rotation might have positively affected the nitrogen balance due to more crop residues and higher N input via N2 fixation (Iannetta et al. 2016). The higher N input of the longer ley phase is also reflected in a higher N content of oat grain following the 4-year ley compared with the 3-year ley (Watson et al. 2011).

Table 5 Mean grain yields of oat, CV, POLAR, and Finlay-Wilkinson (FW) in the Scottish LTE for rotations with different length and positions of the oat in the crop sequence.

3.3 Diversification through the crop sequence

The place of a crop within a cropping sequence can differ and affect the productivity. Oat yields in the three-year perennial ley rotation in Scotland were 33% higher when following directly after the ley compared to oat grown 2 years later in the crop sequence (second oat, following the first oat and the swede crop; Table 5). Even though the CV and POLAR value of the first oat position was numerically lower (indicating a higher stability) than position two, no significant difference was found with either the CV or the POLAR method. Equally, there was no difference in the response of the first and second oat to different environmental conditions, as shown by FW regression (Table 5).

The positive effects on soil fertility are more important to the crop directly following the ley due to carry-over and residual effects (Persson et al. 2008). According to the analysis of the Swedish LTEs by Bergkvist and Båth (2015), the positive effect of the leys preceding the oat did not increase continuously over time. In such systems, yield stability may be affected when the processes of the pre-crop are more dominant in particular years than in others, but there was no difference in the environmental adaptability of the first and second oat (Table 5).

3.4 Diversification through the integration of grain legumes

Durum wheat grown in the French cropping system with grain legumes (pea and faba bean) in the rotation (no intercropping) tended to yield more, especially in years with below-average yields (< 5500 kg/ha), and had relatively low yields in high-yielding years (> 6500 kg/ha) compared to durum wheat grown in a cropping system without legumes (Fig. 3). Indeed, the system with the grain legume had a slope < 1 indicating lower environmental adaptability than the system without the legume (Table 6). Yield stability was not significantly different between the two systems, probably due to the limited dataset (CV 13% vs 18% and a POLAR of − 0.35 vs 0.02) (Table 6).

Fig. 3 Regression lines of individual cropping systems over the overall cropping systems year means for durum wheat at the French LTE for rotations with and without grain legumes (left) and rotations with and without cover crops (right). Comparisons between regression lines are reported in Table 6.

Table 6 Mean grain yield of durum wheat, CV, POLAR, and Finlay-Wilkinson (FW) for the rotations with and without grain legumes and with and without cover crops in the French LTE.

The positive effects of the grain legumes on the durum wheat yield (in lower yielding years) could be a result of the positive pre-crop effect of grain legumes (Angus et al. 2015; Preissel et al. 2015). This process encompasses the N and non-N-related preceding crop effects (Chalk 1998) that are difficult to separate empirically. While the “nitrogen effect” comprises the provision of N to the subsequent crops, the other benefits include the “break-crop effect” that occurs when a disease cycle is broken, benefits to soil organic matter and structure and phosphorus mobilization (Watson et al. 2017). These effects occur especially in cereal-dominated cropping systems such as the systems without legumes in this study. There are meta-analyses on the pre-crop effect on mean yield, with cereal yields being 1.46 t ha-1 in temperate Europe (Preissel et al. 2015) and 1.2 t ha-1 in Australia, Europe, and North America (Angus et al. 2015) higher after grain legumes than after cereal pre-crops. This is one of the first studies exploring the pre-crop effect on yield stability.

3.5 Diversification through cover crops

Diversification with cover crops in the rotations had no significant effect on the mean yield of durum wheat in the French LTE (Table 6). Cover crops did not affect yield stability of durum wheat significantly (CV 14% vs 17% and POLAR − 0.26 vs − 0.07, Table 6). The FW coefficient in the system with cover crops was < 1, indicating low environmental adaptability, while it was > 1 in the system without cover crops (Fig. 3). However, the difference between the systems was not significant, possibly due to the small number of observations available for the analysis.

Cover crops are primarily grown to reduce nitrate leaching, increase soil organic carbon, or reduce soil erosion (Plaza-Bonilla et al. 2016). Although the effect of cover crops on yield stability has not been investigated so far, improved soil conditions could result in higher yields in bad years (indicated by our study) and improve yield stability in the long-term. Longer datasets are needed to draw robust conclusions related to the impact of cover cropping on yield stability.

3.6 Diverging impacts of diversification

Our results provide diverging indications for the effects of cropping system diversification on cereal yield stability, environmental adaptability, and the probability of diversified systems outperforming less diverse systems. St-Martin et al. (2017) have drawn a similar conclusion related to cereal yield stability and environmental adaptability when analyzing three cropping systems in a single LTE. In contrast, Macholdt et al. (2020) found in another LTE that winter barley grown in cropping sequences dominated by cereals had lower yield stability and environmental adaptability and greater production risks compared with winter barley grown in cropping systems with higher crop diversity and additional organic matter inputs. Using seven long-term experiments across a wide latitudinal gradient in Europe, Marini et al. (2020) found that growing multiple crop species (including legumes) in a rotation always provided higher yields for both winter and spring cereals (average + 860 and + 390 kg ha-1 per year, respectively) compared with continuous cereal cropping. Yield gains in diverse rotations were especially high in (low-yielding) years with high temperature and low precipitation. Similarly, we found that all diversification measures increased cereal yields in years with below average yields (including durum wheat in cropping systems with cover crops and grain legumes in France). In high-yielding years, only cropping systems with perennial legume crops consistently increased cereal yields. The inclusion of grain legumes tended to reduce yields of cereals in high yielding environments. While winter cereals yielded more in diverse rotations consistently across the period of the LTE, the yield gain increased over time since establishment of the LTE in spring cereals (Marini et al. 2020). Globally and also in Europe, cropping system diversification uses very different approaches (Hufnagel et al. 2020), which range from diversifying simple maize-based systems (Bowles et al. 2020) to systems with a large spatial and genetic diversity (Ditzler et al. 2021). This makes comparisons difficult.

4 Conclusion

We conclude that diversification affected the performance of cereals within cropping systems. While most but not all diversification measures increased the productivity of cereals in long-term experiments, the effects on yield stability and environmental adaptability were inconsistent. For the five diversification measures tested, we conclude (i) diversification through perennial grass and legume crop mixtures outperformed systems without leys across the three sites for wheat and oat. We found a higher environmental adaptability of the cropping systems with perennial leys, especially in low-input systems. (ii) Diversification through the length of the perennial ley increased the yield of oats after a longer period of the ley, but did not affect yield stability or environmental adaptability. (iii) Diversification through changing the position of oats in the crop sequence increased the yield by 33% when it followed the ley directly compared to the crop grown 2 years later. (iv) Diversification through the integration of grain legumes indicated increased yields of durum wheat in lower-yielding years compared to the system without the grain legume. (v) Diversification through cover crops did not affect yield stability in the studied LTE.

We conclude that the four methods used for analyzing the yield data were a strength of our study, since they allowed us to detect diverging evidence of the cropping system performance. While the probability method provided the most conclusive results by combining aspects of productivity and stability, there is a risk of drawing unjustified conclusions about yield stability if only one method is used. The existing dataset could be extended, e.g., by adding more experiments with additional treatments, to analyze further effects of diversification.