Introduction

Yield gain patterns in Germany have changed considerably since 2000. Although yield increases up until 2000 had been steady, the standard deviation post-2000 has increased significantly and more than doubled on average. Additionally, the yield gap between official on-farm assessments and official variety trials is continuously increasing (see Laidig et al. 2014). One reason for these changes is the more frequent occurrence of weather extremes and the resulting changes in pest and pathogen distribution and occurrence (Juroszek and Tiedemann 2013a, b). As a consequence, breeding has to take into account increasingly different, and in part opposing, characteristics, such as tolerance to heat, cold, drought stress and waterlogging, as well as disease and pest resistances (Heisey and Day-Rubenstein 2015). In addition, there is a need for improved water and nutrient acquisition ability under water stress, as well as differential ripening times in order to cope with increasing seasonal drought (Mohammadi 2018; Mwadzingeni et al. 2016).

Since the early twentieth century, most wheat breeding has relied on pure line breeding. Diversification strategies such as the use of multilines, cultivar mixtures, and evolutionary breeding have been advocated since the 1940s, especially to improve weed competitive ability and pest and pathogen resistances (Finckh and Wolfe 2015). Evolutionary breeding (Suneson 1956) is usually based on the use of multiparental diallelic or hierarchical crossing systems (composite cross populations, CCPs). In a recent review, Ceccarelli and Grando (2020) suggest that “research on evolutionary populations and mixtures is able to address the complexity of climate change while stabilizing yield, decreasing the use of most agrochemicals…”, a view that is supported by ecological research on diversification (Brooker et al. 2021). Both the organic sector (e.g., Dawson et al. 2008; Döring et al. 2015; Murphy and Carter 2013) and breeders interested in creating stable and adaptable materials (e.g., Goldringer et al. 1998; Bonnin et al. 2014; Tsujimoto et al. 2015; Bocci et al. 2020) indicate great interest in evolutionary breeding.

Yield stability is of particular interest in heterogeneous populations as positive relationships between diversity and productivity and stability have been reported (Finckh et al. 2000; Barot et al. 2017). While numerous parameters have been developed to characterize yield stability (Annicchiarico 2002), two basic concepts can be distinguished, each requiring the use of different parameters:

  1. Type 1 or static stability describes stability that minimizes risk in the event of unforeseeable environmental influences (e.g., years with different weather conditions or occurrence of pathogens). Ideally, similar (high) yields are achieved in the individual years; however, stability under such unforeseen conditions may result in lower than average yields with little variance between years. Type 1 stability is described by the variance across environments and parameters derived from it, such as the coefficient of variation and the parameters 'POLAR' and 'aCV' (Döring et al. 2015).

  2. Type 2 or dynamic stability aims at equal distribution of the yield risk across all locations where the genotype is cultivated and measures yield increase proportional to the environment mean. Dynamic stability measures account for the productivity level of each environment and are more suitable if the response to different locations or management factors is assessed. These include the variance of the yields relative to the location mean (Annicchiarico 2002), as well as the parameters proposed by Eberhart and Russell (1966) and Shukla (1972). In addition, several other indices have been proposed and applied to populations and are described in detail elsewhere (Weedon and Finckh 2019).

If the factors that cause yield fluctuations or differences are the same, e.g., weather or diseases, it is not possible to optimize static and dynamic stability at the same time, because high static stability leads (but is not equivalent) to low dynamic stability, while high dynamic stability results in medium static stability. Thus, static parameters are relevant for risk minimization over years, while dynamic parameters are more suitable to identify genotypes or populations adapted to a wide range of environments such as climate zones, soil types or farm management.
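The trade-off between the two concepts can be illustrated with a minimal sketch. The yields below are invented illustration values, not trial data, and the function names are hypothetical. Static stability is taken as the variance of absolute yields across environments, dynamic stability as the variance of yields relative to each environment mean.

```python
import numpy as np

# Hypothetical yields (t/ha): rows = genotypes, columns = environments.
# Values are invented so that the environment means are 3, 5, 7 and 9 t/ha.
yields = np.array([
    [5.0, 5.0, 5.0, 5.0],    # A: constant yield regardless of environment
    [3.3, 5.5, 7.7, 9.9],    # B: always 1.1 times the environment mean
    [0.7, 4.5, 8.3, 12.1],   # C: over-responds to the environment
])

def static_variance(y):
    """Variance of absolute yields across environments (Type 1)."""
    return y.var(axis=1, ddof=1)

def dynamic_variance(y):
    """Variance of yields relative to each environment mean (Type 2)."""
    env_mean = y.mean(axis=0)
    return (y / env_mean).var(axis=1, ddof=1)

static = static_variance(yields)
dynamic = dynamic_variance(yields)
for name, s, d in zip("ABC", static, dynamic):
    print(f"{name}: static variance = {s:.3f}, dynamic variance = {d:.4f}")
```

In this constructed example, genotype A is perfectly stable in the static sense but not in the dynamic sense, whereas genotype B shows the opposite pattern.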

Several winter wheat CCPs based on the intercrosses of up to twenty parents with either high-yielding (Y) or good baking quality (Q) characteristics were created in 2001 (Döring et al. 2015), and have been exposed to natural selection in our research group in Germany since 2005 (F5) under organic and conventional conditions. The effective population sizes were large enough to avoid genetic drift so that the naturally low outcrossing rates of wheat sufficed to maintain the genetic variability in the CCPs (Brumlop et al. 2019). Early vigor, as measured through root system development and seedling growth of wheat CCPs, has been shown to increase under organic management compared to conventional management (Bertholdsson et al. 2016; Vijaya Bhaskar et al. 2019b), pointing to specific adaptive processes within the CCPs to the growing system. Nevertheless, populations maintained under the same conditions may still change over time in different ways (Weedon and Finckh 2019). Under organic conditions, yields and baking quality of the CCPs corresponded to or exceeded those of the crossing parents even in advanced generations (Döring et al. 2015; Brumlop et al. 2017). The CCYQ, derived from the intercross of eight high yielding with twelve baking quality parents, has been shown to be highly weed suppressive (Finckh et al. 2018) and quite adaptable to ecological intensification (Baltazar and Boutsen 2019). Preliminary data on the CCYQ suggest that when broadcast-sown under organic conditions and managed without further mechanical weed control, it may be pre-adapted to minimum tillage conditions (Schmidt et al. 2018). The CCYQ was registered as an open-source heterogeneous variety in 2020 under the name ‘EQuality’ (OSS 2020).

Across ten generations, yields of organically and conventionally maintained CCPs were comparable to the reference cultivars under organic conditions, but inferior under conventional conditions. In contrast, yield stability (both in the dynamic and static sense) of the populations was higher than that of reference variety ‘Capo’ in both management systems (Weedon and Finckh 2019). For a more thorough comparison of yield and yield stability and in order to determine the effects of population history on performance, populations with variable histories were compared to a larger set of pure line cultivars across a set of environments, including both conventional and organic systems, as well as variable nutrient inputs.

In the present paper, we compare a set of 10 CCYQ populations derived from the original population described above in 2005 (F5), but with varying population histories in terms of management. The role of diversity for yield and stability was assessed by including 10 lines extracted from the CCYQ population in the F7. Moreover, conventionally and organically bred baking quality cultivars were used for comparison.

In order to increase the range of grain yields and protein contents in the tested assortment, two feed cultivars with very high yields, but low protein content were included, namely the cultivar “Elixer” and the hybrid variety “Hybery”.

Within this article, the following questions are addressed:

  1. What are the effects of within-population diversity on performance and stability? In order to answer this, the populations were compared to the randomly extracted lines for overall performance and stability.

  2. How do the populations and commercial cultivars of different breeding origin compare in their performance for yield, protein content and stability across environments (organic vs. conventional farming system and fertilization level)?

  3. What are the effects of evolutionary history on agronomic traits of the populations and do these interact with the farming system and input levels?

Materials and methods

Plant material

Populations

In 2001, a half diallel cross of 20 wheat cultivars released between 1934 and 2000, selected for good performance under low-input conditions (Jones et al. 2010), was performed by the Organic Research Centre, Elm Farm and the John Innes Institute in the UK. All populations examined in this study are based on this ‘CCYQ’ (Composite Cross YQ). In brief, it is based on the progenies of the crosses of eight high yielding (Y) with eleven high baking quality (Q) parents, including the progeny of all 19 parental cultivars crossed with ‘Bezostaya’. In addition, crosses of the parents with naturally male sterile lines were added to enhance outcrossing in early generations (see Döring et al. 2015; Brumlop et al. 2019 for details). In 2005, seed of the F4 grown at various UK sites was bulked; one sample was kept in the UK, one was sent to the University of Kassel and one to the Hungarian Academy of Sciences in Martonvasar, where the CCPs have since been maintained.

At the University of Kassel, the CCP was split and increased under both organic and conventional conditions for one year. In 2006, the organic and the conventional population were each divided into two parallel sub-populations, which were maintained at that site without mixing in plots ≥ 150 m² to avoid genetic drift and separated from other wheat by at least 3 m (Brumlop et al. 2019). In 2008, two further organic subpopulations were created that were broadcast-sown (BC) every year and maintained without weed control under organic management.

In addition, in 2008 eight partners in seven European countries (UK, Netherlands, Denmark, Germany, Switzerland, France, and Hungary) received a portion of the population maintained either in the UK, Hungary or at University of Kassel. The populations were grown from the F8 to the F12 by each partner for one year and then passed on to the next in a distinct cycling scheme with the aim to expose the populations to variable environmental conditions over time (Weedon et al. 2016; Weedon 2018). Three cycling populations, originating from one of the three original countries (UK, Hungary and Germany), were included in this study and were compared to a CCYQ population that had been maintained at the University of Kassel since 2008 (Cycl. Control). In summary, the CCPs had the following variable histories (see Table S1 for population abbreviations and management history):

  • Organic (low input—LI) vs. conventional (high input—HI) farming systems at University of Kassel

  • Broadcast seeding (BC) without weed management vs. seeding in rows with mechanical weed management, both under organic conditions.

  • Changing environments across sites for 5 years (Cycl.)

Random lines

In 2007, 100 spikes were extracted randomly from the original population (CCYQ, F6 in the UK, grown under organic conditions) from which inbred lines were derived and subsequently maintained at the Technical University of Munich (TUM) under conventional conditions. Ten of these lines were randomly chosen and included in the trial in order to assess the genetic potential of the populations while excluding intra-specific diversity effects.

Commercial cultivars

The commercial cultivars included in the trials were (a) four baking quality cultivars bred for conventional agriculture, (b) four baking quality cultivars bred for and under the conditions of organic agriculture and (c) two high-yielding conventionally bred cultivars for feed, including one hybrid line (Table S2). The latter were included in order to extend the range of yield, protein content and yield stability potential, but only to a limited extent for direct comparison with the populations as this is only meaningful within groups of similar yield and baking quality classes.

Field trial locations, management and experimental design

The field trials took place at four locations: Viehhausen, Dürnast, Neu-Eichenberg, and Quedlinburg (Table S3). The experimental station Viehhausen (VH) of TUM has been managed organically since 1995. Soils are cambisols based on sandy loam with a water holding capacity of approx. 180 mm. No mineral fertilizers, fungicides, insecticides or herbicides were applied. Weeds were controlled mechanically through harrowing at tillering. The pre-crop was a grass-clover ley in both years. Environmental and soil conditions at the conventional research station Dürnast (DN) of TUM, situated close to Viehhausen, are very similar to VH. The pre-crop was maize in both years. Weather data for both sites were provided by the Bayerische Landesanstalt für Landwirtschaft (Station Nr. 8, Freising), which is situated approx. 7 km from VH and 3 km from DN.

The research station of the University of Kassel in Neu-Eichenberg (NE) has been managed organically since 1984; no mineral fertilizers, fungicides, insecticides or herbicides were applied. Weeds were controlled mechanically through harrowing at tillering. The soil is a fine loamy loess soil (deep Haplic Luvisol) with a useable field capacity of approx. 180 mm. In order to provide a very low nitrogen site, the pre-crop was triticale in both years. Weather data were provided by an on-station meteorological station for both seasons.

The soil at the conventionally managed experimental station of the Julius-Kühn-Institute at Quedlinburg (QB) is a chernozem with a usable field capacity of approx. 220 mm. Pre-crop was oilseed rape in both years. Weather data were provided by an on-station meteorological station for both seasons.

All experiments were set up as split-plot experiments with four replicates, with N-fertilisation as the main factor and genotypes assigned to the subplots. Plot size varied depending on the equipment available at the respective experimental stations (see Table S3 for details). All trials were tilled conventionally using the plough and managed according to the common agronomic practice of the respective environment, comprising application of fungicides and growth regulators in the conventionally managed environments.

Assessments

Ground cover of weeds and wheat (%) was visually estimated at the end of tillering (BBCH 27–29). Heading date was assessed visually and expressed in number of days from 1 January. Leaf diseases were visually estimated as percentage of non-green leaf area per plot as described by Brumlop et al. (2017) at the time of flowering (BBCH 60–65). Moreover, the predominant foliar pathogens were noted. Plant height was measured manually after flowering (BBCH 70). Based on manual cuts of 1 m² per plot, the number of spikes/m², the number of grains per ear and the average ear weight were determined. Yields were corrected to 14% moisture content and thousand kernel weight (TKW) was determined. Grain protein content was determined by near-infrared spectrometry (NIRS) using commercial standard devices and calibrations.
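The moisture correction and the TKW calculation can be sketched as follows. The text does not state the exact formula used, so the dry-matter-preserving conversion below is an assumption (it is the commonly used one), and the function names are hypothetical.

```python
def yield_at_standard_moisture(fresh_yield, moisture_pct, standard_pct=14.0):
    """Convert a yield measured at moisture_pct to the yield at standard_pct.

    Assumes the dry matter is unchanged and only the water content differs.
    """
    dry_matter = fresh_yield * (100.0 - moisture_pct) / 100.0
    return dry_matter * 100.0 / (100.0 - standard_pct)

def thousand_kernel_weight(sample_weight_g, kernel_count):
    """Thousand kernel weight (g) from a counted and weighed grain sample."""
    return sample_weight_g / kernel_count * 1000.0

# Hypothetical plot: 5.0 t/ha harvested at 18% grain moisture.
corrected = yield_at_standard_moisture(5.0, 18.0)
tkw = thousand_kernel_weight(21.5, 500)
```

A sample harvested wetter than 14% is scaled down, one harvested drier is scaled up, so that yields from different plots refer to the same water content.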

Data analysis

Statistical analyses were performed using R version 3.4.3 (R Core Team 2020). Linear mixed models were built using the R package “lme4” (Bates et al. 2015). Multiple comparisons of means were then performed based on the significance of pairwise comparisons of adjusted means, using the packages “emmeans” (Lenth 2016) and “lmerTest” (Kuznetsova et al. 2017).

The percentage of leaf area affected by diseases at flowering was not included in the overall ANOVA, but analyzed for each single experiment, as different diseases were present in the single environments and the data were therefore not comparable.

For the group comparisons (i.e., populations (Table S1), the different groups of commercial cultivars (Table S2) and the randomly extracted lines), the group (G), the farming system, i.e. organic vs. conventional (S), the input level (I), as well as the interactions G × S, G × I and G × S × I were considered fixed factors, while trial, i.e. location × year, and genotype, i.e. cultivar, line or population, were considered random factors.

Response variables were ground cover, heading date, plant height, yield, yield components, protein content, and protein yield. This analysis was performed with two different subsets: (a) only commercial cultivars and populations for assessing the potential of populations and (b) populations and random lines for assessing the impact of within-population diversity on yield and other variables.
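Written out, the group-comparison model described above could take the following form. This is a sketch only: the index notation, the nesting of genotype within group, and the normality assumptions on the random terms are assumptions and are not stated in the text; only the list of fixed and random factors is taken from the description.

```latex
y_{ijklm} = \mu + G_i + S_j + I_k + (GS)_{ij} + (GI)_{ik} + (GSI)_{ijk}
            + t_l + g_{m(i)} + \varepsilon_{ijklm},
\qquad
t_l \sim N(0, \sigma_t^2), \quad
g_{m(i)} \sim N(0, \sigma_g^2), \quad
\varepsilon_{ijklm} \sim N(0, \sigma^2)
```

Here $y_{ijklm}$ is the response (e.g. grain yield), $G_i$, $S_j$ and $I_k$ are the fixed effects of group, farming system and input level, $t_l$ is the random effect of trial (location × year) and $g_{m(i)}$ the random effect of genotype $m$ within group $i$.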

For comparison of populations evolved under different conditions, the fixed factors were genotype (G), farming system in the evaluation trial (S), nitrogen input level in the evaluation trial (F), as well as their interactions G × S, G × F and G × S × F. The single experiment (T, location × year), replications and nitrogen fertilization blocks were considered random factors. Response variables were those mentioned above. For the comparison of the ten CCPs, three separate groups of the parallel populations were created. These included the high (HI) and low input (LI) CCPs that had evolved without mixing as parallel populations in Neu-Eichenberg for 10 years, as well as the two parallel broadcast populations (BC) (Table S1). Linear contrasts were used to compare between the grouped parallel populations (LI, HI, BC), the three cycling populations and the cycling control population, which had been maintained in Germany since 2005. These linear contrasts allow for the comparison of populations that have evolved in parallel, as well as between populations with differing evolutionary pathways.
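How such a linear contrast is evaluated can be sketched with a small numerical example. The adjusted means, standard errors and population labels below are hypothetical, and the covariance matrix is assumed to be diagonal for simplicity; in a fitted mixed model the means and their covariance would come from the model itself.

```python
import numpy as np

def linear_contrast(means, cov, weights):
    """Estimate and standard error of a contrast sum(w_i * mean_i).

    The weights must sum to zero for the result to be a contrast.
    """
    weights = np.asarray(weights, dtype=float)
    estimate = weights @ means
    std_error = np.sqrt(weights @ cov @ weights)
    return estimate, std_error

# Hypothetical adjusted yield means (t/ha); labels are placeholders.
labels = ["LI", "HI", "BC", "Cycl_1", "Cycl_2", "Cycl_3", "Cycl_control"]
means = np.array([5.0, 5.2, 4.9, 5.4, 5.1, 5.0, 5.3])
cov = np.diag(np.full(len(means), 0.1 ** 2))  # assumed independent, SE = 0.1

# Contrast: mean of the three cycling populations minus the cycling control.
weights = np.array([0, 0, 0, 1 / 3, 1 / 3, 1 / 3, -1])
estimate, std_error = linear_contrast(means, cov, weights)
print(f"contrast = {estimate:.3f} t/ha, SE = {std_error:.3f}")
```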

Stability was assessed for both grain yield and protein content, based on the variance of absolute and relative yields or protein content over 16 environments (2 years × 4 locations × 2 fertilization levels). The variance of absolute yields describes the “static” or “Type 1” stability concept (Annicchiarico 2002), while the variance of the yields relative to the overall mean in each environment is a measurement for “dynamic” or “Type 2” stability (Annicchiarico 2002). In order to avoid possible biases due to the application of only one stability index, we additionally calculated the regression coefficient according to Finlay and Wilkinson (1963), i.e. the slope of the regression of genotype (or population) yields on the environmental mean, which may be interpreted as both static (considering the deviation from 0) and dynamic stability (considering the deviation from 1), and the variance of the residuals (mean square error, MSE) of the joint regression (Finlay and Wilkinson 1963) according to Eberhart and Russell (1966), a measure of dynamic stability (Type 2). Confidence intervals were calculated for the stability indices and they were compared using Ekbohm’s test (Ekbohm 1981).
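The two regression-based indices can be sketched as below. This is a simplified illustration, not the analysis code used for the trials: it regresses untransformed yields on the centred environment mean and takes the residual mean square with n − 2 degrees of freedom, without any log transformation or pooled-error correction. The data and the function name are hypothetical.

```python
import numpy as np

def joint_regression(yields):
    """Slope and residual MSE per genotype from regression on the environment mean.

    yields: array of shape (genotypes, environments).
    """
    env_mean = yields.mean(axis=0)
    x = env_mean - env_mean.mean()          # centred environmental index
    n_env = yields.shape[1]
    slopes, mses = [], []
    for y in yields:
        dev = y - y.mean()
        b = np.sum(x * dev) / np.sum(x ** 2)
        resid = dev - b * x
        slopes.append(b)
        mses.append(np.sum(resid ** 2) / (n_env - 2))
    return np.array(slopes), np.array(mses)

# Hypothetical yields (t/ha), constructed so the environment means are 3, 5, 7, 9.
yields = np.array([
    [5.2, 4.8, 5.2, 4.8],    # flat response with some scatter
    [3.3, 5.5, 7.7, 9.9],    # exactly 1.1 times the environment mean
    [0.5, 4.7, 8.1, 12.3],   # strong response to better environments
])
slopes, mses = joint_regression(yields)
for name, b, m in zip("ABC", slopes, mses):
    print(f"{name}: b = {b:.2f}, MSE = {m:.4f}")
```

A slope near 0 points to static stability, a slope near 1 together with a small MSE to dynamic stability; by construction the slopes average to 1 across all entries.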

Results

General observations

In both years, the annual mean temperatures for each experimental station were similar to the long-term mean from 2003 to 2017. There was an overall water deficit in 2015/16 in Quedlinburg and in 2016/17 in Neu-Eichenberg (Figure S1). In addition, the year 2014/15 had been exceptionally dry in Neu-Eichenberg, resulting in a subsequent water deficit for 2015/16 at that site (Weedon and Finckh 2021).

Overall yield differences due to the fertilization level were only about 5% in 2016 and 10–20% in 2017, probably due to the droughts in Quedlinburg and Neu-Eichenberg in 2015/16 (Fig. 1). The prevailing foliar diseases in 2016 were yellow rust (Puccinia striiformis f. sp. tritici) at the two organic sites and DTR leaf spot (Drechslera tritici-repentis) and Septoria (Zymoseptoria tritici) at the conventional sites. In addition, brown rust (Puccinia triticina) played a role at the organic sites in 2017. Disease pressure was generally low (< 15% non-green leaf area on average, almost no disease at the conventional site at Dürnast in both years due to successful application of fungicides), with the exception of Neu-Eichenberg in 2016, where yellow rust pressure was somewhat higher (Fig. 3).

Fig. 1 Overview of yields per location, year and fertilization level. Error bars indicate standard deviations. Differences between N levels were significant in all cases (p < 0.05). NE Neu-Eichenberg, VH Viehhausen, DN Dürnast, QB Quedlinburg. See Table 2 for details

Performance of populations, varieties and extracted breeding lines

The most important performance traits, namely grain yield, grain protein content and protein yield, are depicted in Fig. 2 for all genotypes and populations, both in organically and conventionally managed trials. A comparison among groups (populations, randomly extracted inbred lines and different groups of commercial cultivars) is shown in Tables 1 and 2, comprising yield components and phenology. The infestation with leaf diseases is shown in Fig. 3 and Table 4.

Fig. 2 Grain yield, grain protein content and protein yield of commercial cultivars, randomly extracted inbred lines and populations under organic and conventional management. The shaded area depicts the 5% confidence interval of the regression line. The dotted lines are isolines for different levels of protein yield (in kg/ha). The graph also depicts the “grain protein deviation”, i.e. the deviation from the regression line

Table 1 Effects of group (populations, inbred lines, E-, C- and organic varieties) and farming system
Table 2 Grain yield and other parameters per group
Fig. 3 Percentage of leaf area affected by fungal diseases at flowering per genotype (populations, inbred lines or commercial cultivars) and individual trial (see Table 3). Note that there were different diseases in the respective trials (see Table S3 for details). Asterisks indicate genotypes differing significantly from the population (p < 0.05), as determined by t test corrected according to Bonferroni. The horizontal dotted line depicts the average of the commercial cultivars. Error bars depict standard deviations. The results from Dürnast are not presented because infestation was very low (< 5%) and no differences among populations and genotypes could be detected

Populations versus randomly extracted inbred lines

Yields of the inbred lines extracted from the populations were significantly lower than those of the populations. The differences were greater under organic than under conventional management. In contrast, grain protein content was somewhat higher in the inbred lines than in the populations. Protein yields of populations were higher in organic, but not in conventional environments (Fig. 2, Table 2).

There was a marked difference in yield structure between populations and randomly extracted lines, with populations showing higher tillering capacity and earlier heading, but smaller spikes and lower TKW. Moreover, compared to the random lines, the populations were characterized by a higher ground cover at tillering under organic conditions and fewer seeds per ear (Table 2). Plant height tended to be similar between the populations and random lines (data not shown), and where disease pressure was high, the random lines generally showed greater non-green leaf area values than the populations (Fig. 3).

Populations and commercial cultivars

Genotype group (G), fertilization level (I) and farming system (S) affected plant development (ground cover, heading, height), yield, most yield components, protein yield and grain protein content, with significant G × S interactions in most cases (Table 1). While no G × I interactions occurred, three-way G × I × S interactions for ground cover and grain yield were found (Table 1). The significant three-way interaction for yield was most likely due to differing degrees of lodging depending on fertilizer level and is not considered in detail.

As expected, yields of the two high yielding feed cultivars were higher in both growing systems in comparison to all other entries, while grain protein contents were the lowest (Fig. 2, Table 2). These two cultivars also yielded the most protein/ha, had the highest number of seeds/ear and usually ranked very high in the number of ears/m2. However, ground cover was slightly reduced under organic conditions (Table 2).

In the comparison of the conventionally and organically bred baking quality cultivars and the populations, no significant yield differences under organic conditions between the three groups were found, while under conventional conditions the conventionally bred baking cultivars outyielded the other two groups (Table 2). The organically bred cultivars always had the highest protein contents, followed by the populations and then the conventionally bred cultivars. Under conventional management, all groups achieved an average grain protein content above the threshold of 11% which is commonly regarded as the minimum required for bread production. Under organic management, however, only the organically bred cultivars and the random lines achieved an average grain protein content above the threshold of 11%, which was coupled with lower yields. The populations were characterized by a slightly higher ground cover and tillering capacity under organic but not under conventional management, as well as earlier heading date, smaller spikes and lower TKW (Table 2).

Where foliar diseases were recorded, the mean disease levels of the CCPs were similar to the mean of the commercial cultivars (Fig. 3). In contrast, the random lines were significantly more affected in the organic environments where rust diseases were prevalent. In Neu-Eichenberg and Viehhausen in 2016, the severity of yellow rust on the commercial cultivars varied considerably with ‘Kerubino’ being particularly susceptible (Fig. 3, Table 4).

In almost all environments, conventionally bred baking cultivars yielded more than organically bred baking cultivars at the cost of considerably lower grain protein content with the exception of ‘Genius’ under organic conditions (Fig. 2).

Protein content and grain yield were strongly negatively correlated under both organic and conventional conditions (Fig. 2). With the exception of ‘Genius’ under organic conditions, conventionally bred baking and feed cultivars, populations and randomly extracted inbred lines were close to the regression line. However, under both management systems the organically bred baking cultivars indicated greater deviation from the regression line, considerably higher than that of the conventionally bred cultivars, populations or randomly extracted inbred lines with similar yield potential (Fig. 2).
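The relationship between yield, protein content, protein yield and the grain protein deviation can be sketched numerically. The values below are invented for illustration and are not trial data; the function names are hypothetical, and protein yield is assumed to be the simple product of grain yield and protein content.

```python
import numpy as np

def protein_yield(grain_yield_t_ha, protein_pct):
    """Protein yield in kg/ha from grain yield (t/ha) and protein content (%)."""
    return grain_yield_t_ha * 1000.0 * protein_pct / 100.0

def grain_protein_deviation(grain_yield, protein_pct):
    """Residuals from the linear regression of protein content on grain yield."""
    slope, intercept = np.polyfit(grain_yield, protein_pct, 1)
    fitted = intercept + slope * grain_yield
    return protein_pct - fitted, slope

# Hypothetical entries: grain yield (t/ha) and grain protein content (%).
grain_yield = np.array([4.0, 5.0, 6.0, 7.0, 8.0])
protein_pct = np.array([13.0, 12.4, 11.0, 10.2, 9.4])

deviation, slope = grain_protein_deviation(grain_yield, protein_pct)
p_yield = protein_yield(grain_yield, protein_pct)
```

A positive deviation marks an entry with more protein than expected at its yield level; a negative slope reflects the yield-protein trade-off described above.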

Stability

The yields relative to the site mean of the entries across all 16 environments varied considerably with the lowest variation seen in the cycling and non-cycling CCYQs followed by the commercial baking quality cultivars and then the extracted lines (Fig. 4).

Fig. 4 Relative yields of populations, extracted inbred lines and commercial cultivars in 16 environments (locations × years × fertilization level). Each line corresponds to an environment. Green lines depict organically, red lines conventionally managed environments. Genotypes or populations are on the x-axis. “C” stands for feed wheat, “E” for conventionally bred baking quality wheat and “O” for organically bred varieties

Yield and yield stability must always be seen in the context of the respective yield levels (or overall protein content level in the case of the stability of protein content), therefore, yields (or protein contents) are plotted against the stability parameters (Fig. 5). For details, please refer to Tables S4 and S5, which include confidence intervals.

Fig. 5 Yield and yield stability (a–d) and stability of grain protein contents (e, f) over all environments expressed by different stability indices: variance of absolute yields (a), joint regression according to Finlay and Wilkinson (b), variance of relative yields (c), MSE according to Eberhart and Russell (d), variance of protein contents (e) and variance of relative protein contents, referred to the mean of the respective environment (f). Regression line and r refer to the cultivars only. Panels a, b and e refer to static stability, panels c, d and f to dynamic stability

The static stability as expressed by low variance of absolute yields across environments was highest for the organically bred cultivars, albeit at a low yield level (Fig. 5a). In contrast, the variance of the CCYQs was higher but also at a higher yield level. The conventionally bred baking cultivars were similar to the CCYQs in variance except for the yellow rust susceptible variety ‘Kerubino’ (Fig. 5a). The dynamic stability, i.e. the variance of the yields relative to the site mean, was best (lowest variance) for the CCYQs followed by the commercial baking quality cultivars, random lines and then the high yielding feed cultivars (Fig. 5c). Both for static and dynamic stability parameters, there was a negative correlation between yield and stability among commercial cultivars (Fig. 5a–d), indicating that the higher the yield, the lower the stability. In contrast, the extracted lines had very high variances despite low yields indicating both low static and dynamic stability.

The regression coefficient (b) according to Finlay and Wilkinson (1963) (Fig. 5b) allows conclusions to be drawn about the suitability for high or low input systems. According to this, the two high-yielding feed quality cultivars and ‘Kerubino’ are distinct HI cultivars, while the organically bred cultivars are suitable for LI systems, followed by the populations, which according to this parameter indicate good adaptation to a wide range of environments.

Similar trends were observed for protein content stability, which is not surprising as protein content and yield are closely correlated: relatively high stability of the populations, low stability of the inbred lines, while the stability of the commercial cultivars varied over a broader range (Fig. 5e, f).

Effects of evolutionary history on the performance of the populations

There were significant effects of the evolutionary history on plant height, grain protein content, yield and most yield parameters (Table 3). Heading date was affected by history under conventional but not under organic conditions, with all populations heading earlier under conventional conditions. Otherwise, the evolutionary history did not interact with the farming system and fertilizer input (Table 3). No significant effect of the evolutionary environment on foliar diseases could be detected (data not shown).

Table 3 Grain yield and other parameters by Genotype and evolutionary environment

The populations that evolved under conventional management (HI) were significantly taller than the CCPs that had evolved under organic conditions at the same site (Table 3). The highest yield was achieved by the cycling population CyclUK, which originated from the UK. This population also produced the highest number of ears/m2, while the highest TKW was produced by the cycling population CyclHU, which had stayed three years in Hungary before cycling. The highest protein content and grain protein yield were achieved by the cycling control (CCYQ-Cycl. Ctr), which is a mixture of the two LI CCPs created and maintained in Germany since 2008 (Table 3).

The cycling populations differed in yield, height, number of ears/m2 and TKW, and in heading date under organic conditions. The population CyclUK, which had stayed three extra years in the UK from 2005 to 2008, was significantly shorter than the other two cycling populations. The population CyclHU, which had been in Hungary for three extra years, produced significantly fewer head-bearing tillers and a higher TKW than the other two cycling populations (Table 3).

No differences concerning yield stability and stability of protein contents among populations with different evolutionary history could be detected (data not shown).

Discussion

Yields in the conventional trials were in the range of the mean yields achieved in Germany during 2016 and 2017 for conventional agriculture, while yields in the organic trials slightly exceeded the current average organic yields in Germany, which amount to approx. 3.6 t/ha (BMEL 2019). Organic crops depend on optimal conditions with respect to soil water and temperature for nutrient mineralization. The poor performance of the organic trials in 2016 was in part due to the extremely dry 2014/15 season and the subsequent water deficit, which resulted in low overall mineralization.

Comparison among groups

Overall, the composite cross populations performed similarly in yield to the organically and conventionally bred baking cultivars under organic conditions. Under conventional conditions, the populations were higher yielding than the organically bred cultivars. Grain protein content of the CCPs was similar to that of the conventionally bred cultivars, but one percentage point lower than that of the cultivars bred for organic conditions. In addition, the CCPs performed considerably better than the inbred lines extracted from them. The similarity of the yields of the populations to those of the organically and conventionally bred cultivars is remarkable, as the populations were generated from genetic materials from 1934 to 2000 that had been chosen for adaptation to low-input environments (Jones et al. 2010) and comprised many parents bred in the UK, which are not fully adapted to the environmental conditions of Germany. Clearly, it was not expected that yields of the bread quality and feed quality cultivar groups would be comparable.

The HI and LI populations included in these trials had been grown in Neu-Eichenberg for 10 generations and were in the F15 (2015/16) and F16 (2016/17). During these years and up to the F18, yields under organic conditions were similar to those of the conventionally bred cultivars ‘Capo’ and ‘Achat’, but under conventional conditions ‘Capo’ outyielded the populations (Weedon and Finckh 2019, 2021) and was highly stable (Weedon and Finckh 2021). This is in accordance with older results on other populations, where yields were found to be stable or tended to increase over generations (Suneson 1956; Qualset 1968; Allard and Adams 1969; Jain and Qualset 1976; Danquah and Barrett 2002). When tested against their parents or parental mixtures, the CCPs outyielded them (Brumlop et al. 2017; Döring et al. 2015; Finckh et al. 2009; Weedon and Finckh 2021). Similar results with other wheat populations were obtained by Goldringer et al. (2001). This suggests that populations can be considered an alternative to inbred cultivars, both in organic and conventional agriculture, if based on parent cultivars with a high performance. However, our experiments are not representative of environments with very favorable growth conditions typical for oceanic climates in northern Germany, where yields of more than 10 t/ha are possible.

Crop density (i.e. the number of spikes m−2) was highest in populations both in comparison with commercial cultivars and the inbred lines extracted from the populations under organic conditions. Corresponding to this, grain and spike weights were lower. The same could be observed for ground cover during tillering, where the difference between populations and inbred lines was particularly notable. Early development in the field and ground cover correlate well with root development, where the CCPs have been shown to outperform most pure lines (Vijaya Bhaskar et al. 2019a) and organically evolved CCPs outperformed conventionally evolved ones (Bertholdsson et al. 2016; Vijaya Bhaskar et al. 2019b). Through the advantage of genetic diversity for adaptation and the combination of different genotypes, the CCPs appear to be better able to use the available space and resources than homogeneous stands of single lines. Possible benefits are higher wheat biomass and consequently yield, improved organic matter supply to the soil (Simon et al. 2019), as well as improved weed suppression (Finckh et al. 2018).

Yield vs. grain protein content

The strong negative correlations under both management systems between yield and grain protein content observed in our trials (Fig. 2) are not unusual. Since there is also a positive correlation of yield and protein yield, this is not purely a dilution effect (Simmonds 1995). Breeding has exploited this correlation in order to develop higher yielding cultivars while tolerating lower protein contents (Voss-Fels et al. 2019). In the case of baking wheat, this has in part been compensated with better gluten quality (Laidig et al. 2017). For baking quality cultivars, breeding aims at increasing yields while keeping the protein content of the grain as high as possible. The breeding objective would therefore ideally be to break the above-mentioned correlation, i.e. to increase cultivar deviation from the regression line of protein content on yield (as shown in Fig. 2), which has been termed 'grain protein deviation' (GPD) by Monaghan et al. (2001). A marked positive GPD is evident for the organically bred cultivars under both management systems and ‘Genius’ under organic conditions, while the GPD was considerably lower in the conventionally managed trials (Fig. 2). Thus, for certain cultivars and under certain conditions it is possible to break this correlation.
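As an illustration of the GPD concept, the following minimal sketch computes the deviation of each entry from an ordinary least-squares regression of protein content on yield. The entry names and values are hypothetical and chosen only to make the arithmetic transparent; they are not data from our trials, and the function name is an assumption introduced for this example.

```python
from statistics import mean


def grain_protein_deviation(yields, proteins):
    """Residuals of an OLS regression of protein content on yield.

    A positive residual means more protein than the yield level
    alone would predict (positive GPD).
    """
    y_bar = mean(yields)
    p_bar = mean(proteins)
    sxx = sum((y - y_bar) ** 2 for y in yields)
    sxy = sum((y - y_bar) * (p - p_bar) for y, p in zip(yields, proteins))
    slope = sxy / sxx                  # negative if protein falls with yield
    intercept = p_bar - slope * y_bar
    return [p - (intercept + slope * y) for y, p in zip(yields, proteins)]


# Hypothetical entries (illustrative values only)
entries = ["A", "B", "C", "D"]
yields_t_ha = [4.0, 5.0, 6.0, 7.0]       # grain yield in t/ha
protein_pct = [13.0, 12.5, 11.0, 10.5]   # grain protein content in %

gpd = dict(zip(entries, grain_protein_deviation(yields_t_ha, protein_pct)))
for name, value in gpd.items():
    print(f"{name}: GPD = {value:+.2f} percentage points")
```

In this toy data set, entry B lies above the regression line and entry C below it, although both follow the overall negative trend. Residuals of a least-squares fit sum to zero, so GPD ranks entries relative to each other within a given set of trials.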

The ability of the plants to take up nitrogen in the period before flowering and to transfer it to the grain during the grain filling phase, coupled with N uptake during the grain filling phase, which is largely dependent on N availability during this period, are two decisive factors for grain protein content (Baresel et al. 2008). According to Bogard et al. (2010), post-anthesis N uptake is crucial for a high GPD in wheat under relatively high input conditions. In environments with more limited late N availability, Nehe et al. (2020) found pre-anthesis N uptake and the efficiency of its translocation to be more important. In conventionally managed environments, N supply during grain filling is less affected by plant genotype since differences between genotypes are masked by the high N availability. In organically managed environments, N is often lacking during the grain filling period and post-anthesis N uptake is more dependent on the plant genotype, which may explain why the GPD of cultivars selected under these conditions is often higher.

Taking this into account, we conclude that selection under near-optimum conditions may not lead to optimum cultivar selection for organic or low-input systems, because the negative correlation between yield and protein content, and the possibilities to deviate from it through increasing GPD values, strongly depend on selection in different environmental conditions. Moreover, the intrinsic diversity of populations does not contribute to greater GPD. For this reason, increasing GPD in heterogeneous populations is strongly dependent on parental cultivar choice and performance under different environmental conditions, as is the case for increasing yield and grain protein potential (Döring et al. 2011, 2015).

Role of diversity for performance

The superiority of the CCPs over the inbred lines extracted from them was more evident in the organic than in the conventional trials, where fungicides were used to control foliar diseases. Under organic conditions, when no fungicides were used, the CCPs were always considerably healthier, presumably due to their resistance diversity. However, even in the trials with low leaf infestation, where the populations were similarly or slightly more affected by foliar diseases, yields of populations were higher. Thus, greater tolerance to foliar pathogens is not the only benefit gained from diversity; other factors must also be of importance.

Certain morphological characteristics are considered a main factor for high yield potential in wheat cultivars. Besides harvest index, particular importance is given to competitive ability. It has been considered that in wheat, as a “community plant”, this should be kept low. The resulting ideotype in pure line breeding is therefore low in stature, scarcely tillering and possesses short, erect leaves (Donald 1968). Plant types with higher competitive ability are considered less suitable, at least for high-input conditions, because all plants compete for the same resources and may thus suppress each other.

In the case of heterogeneous populations, with a range of morphological and phenological characteristics, each individual occupies a different niche, leading to a certain degree of complementarity. This would allow for denser canopies and more complete ground cover, which was observed in our field experiments. Therefore, the negative effect of more competitive ideotypes is alleviated, if a certain degree of diversity is present in the canopies. Diversification may therefore be considered an alternative or supplementary way to achieve higher crop densities, in addition (or alternatively) to ideotypes with low competitive ability. As the overall competitive ability of such diversified canopies is higher, this results in the observed improved weed suppressiveness (Finckh et al. 2018).

Several new yellow rust races have emerged after 2010 in Europe affecting many of the current cultivars (Hovmøller et al. 2016) including ‘Kerubino’ in our trials. Some of the parents of the CCPs carried resistances against these new races just like most of the pure line cultivars, but also against the brown rust races present during their evolution in Neu-Eichenberg (Weedon and Finckh 2021). This diversity in resistances has provided the CCPs with advantages over a number of tested cultivars. The pure lines were extracted from the CCPs in the UK in 2007, which meant that they were selected before exposure to the new yellow rust races. In addition, results from our trials indicate that most of them did not carry the relevant yellow rust resistances and were as susceptible as ‘Kerubino’ on average. Thus, highly diverse populations are a valuable and durable alternative to resistance breeding based on single resistance genes, especially if parental cultivar choices include diversity for disease resistances.

Under conventional conditions, where Zymoseptoria tritici and Drechslera tritici-repentis were the prevailing pathogens, the populations performed similarly to all other entries with a maximum of 4.5 percentage points greater NGLA value, which is biologically not significant (Table 4). Although canopy architecture diversity generally decreases splash dispersed diseases (Vidal et al. 2017, 2018), the higher crop density of the populations may have led to higher canopy moisture and to higher disease conduciveness.

Table 4 Leaf diseases of populations and isolated lines: major diseases in the respective trial and infected leaf area in percent

Role of diversity for stability and adaptation

While the commercial cultivars with the highest yields were able to exploit their potential primarily in environments with high yield potential, the cultivars bred for organic farming proved to be cultivars for low input conditions. The fact that the populations achieved yields that ranged between those of the conventionally and organically bred cultivars, combined with high dynamic stability, points to their wider adaptation to fluctuating environmental conditions. In contrast to the organically bred cultivars, the adaptation of the populations was not limited to low-input conditions or organic farming, but they were generally well adapted to the conventionally managed environments as well. However, our study lacks environments where extremely high yields of more than 9 t/ha are possible; consequently, we do not know whether the populations are also adapted to this type of environment.

The stability of the CCPs was in contrast to the extracted random inbred lines, which were clearly better adapted to HI conditions, as can be seen by their relative yields in the 16 environments, though highly divergent in their performance and with low overall yields. The marked difference in yield stability of populations and isolated inbred lines shows that the higher stability of the populations is due to their diversity, not their genetic background. The findings are in accordance with the results of Döring et al. (2015) and Weedon and Finckh (2019). We can therefore conclude that higher diversity is an appropriate way to improve overall adaptation and stability to different site conditions.
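To make the notion of dynamic stability more concrete, the sketch below computes Wricke's ecovalence, a widely used dynamic stability statistic. This is an illustrative choice: this section does not state which stability parameters were calculated, so the code should not be read as the analysis behind Fig. 5. Genotype names and yields are hypothetical.

```python
def ecovalence(table):
    """Wricke's ecovalence for each genotype.

    table maps genotype -> list of yields, one per environment,
    with environments in the same order for every genotype.
    W_i = sum_j (y_ij - mean_i - mean_j + grand_mean)^2;
    a low value indicates a small contribution to the
    genotype-by-environment interaction, i.e. high dynamic stability.
    """
    genotypes = list(table)
    n_env = len(table[genotypes[0]])
    grand = sum(sum(v) for v in table.values()) / (len(genotypes) * n_env)
    env_means = [
        sum(table[g][j] for g in genotypes) / len(genotypes)
        for j in range(n_env)
    ]
    result = {}
    for g in genotypes:
        g_mean = sum(table[g]) / n_env
        result[g] = sum(
            (table[g][j] - g_mean - env_means[j] + grand) ** 2
            for j in range(n_env)
        )
    return result


# Hypothetical yields (t/ha) in three environments of increasing potential
yields = {
    "population": [4.0, 5.0, 6.0],  # follows the environmental mean exactly
    "line_1": [3.0, 5.0, 7.0],      # over-responds to good environments
    "line_2": [5.0, 5.0, 5.0],      # does not respond at all
}

w = ecovalence(yields)
for name, value in w.items():
    print(f"{name}: W = {value:.2f}")
```

In this example the hypothetical population has an ecovalence of zero because its yield tracks the environmental mean, whereas both lines deviate from it. Note that the non-responding line is not rated as stable in the dynamic sense, which distinguishes this concept from static stability.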

In our studies, the differences among environments were mainly caused by local factors and different crop management (fertilization levels and farming system). The results are therefore primarily valid for the adaptation to local conditions and cultivation measures.

However, comparing the adjusted means of the single genotypes or populations between the two years, we noted that the distance was lowest in the populations (154 kg/ha on average), followed by the organic varieties (225 kg/ha), the varieties with high baking quality (291 kg/ha), the random inbred lines (328 kg/ha) and finally the feed varieties (700 kg/ha), the same ranking that was observed in the stability parameters over all environments. When the analysis represented in Fig. 5 was performed for each year individually instead of over both years, similar, though less marked, effects could be observed (data not shown). In addition, high resilience of the CCPs with respect to water availability, abiotic stresses and disease pressure in one site has been documented over 13 generations in previous field trials (Weedon and Finckh 2021). We may therefore conclude that populations are also more stable over years, not only over locations or different systems of crop management.
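The between-year comparison can be sketched as follows. The sketch assumes that the distance is the absolute difference between the adjusted means of an entry in the two years, averaged within each group; the exact computation is not spelled out in the text, so this is one plausible reading. Entry names, group labels and values are hypothetical.

```python
from collections import defaultdict


def mean_between_year_distance(adjusted_means, groups):
    """Average absolute between-year difference per group.

    adjusted_means maps entry -> (mean_year_1, mean_year_2) in kg/ha;
    groups maps entry -> group label.
    """
    distances = defaultdict(list)
    for entry, (year_1, year_2) in adjusted_means.items():
        distances[groups[entry]].append(abs(year_1 - year_2))
    return {group: sum(d) / len(d) for group, d in distances.items()}


# Hypothetical adjusted means in kg/ha (illustrative values only)
adjusted_means = {
    "pop_a": (5000, 5100),
    "pop_b": (4900, 5100),
    "feed_a": (7000, 6300),
    "feed_b": (6800, 7500),
}
groups = {
    "pop_a": "population",
    "pop_b": "population",
    "feed_a": "feed",
    "feed_b": "feed",
}

result = mean_between_year_distance(adjusted_means, groups)
for group, distance in result.items():
    print(f"{group}: {distance:.0f} kg/ha")
```

Using the absolute difference keeps increases and decreases between years from cancelling each other out within a group, so a small group mean reflects consistently small changes of the individual entries.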

We conclude that using more genetically diverse material can be considered an alternative to cultivars consisting of single inbred lines, which can contribute to improved yield stability if appropriate parental material is used. Moreover, we found that the use of populations is not limited to low-input or organic farming systems; in contrast to the cultivars bred for organic farming, CCPs show a wider adaptation to a broad range of environments.

Evidence for evolution and adaptation to specific environments

We could show that populations having evolved under different conditions differed for several agronomic traits. Since population sizes were large enough to avoid genetic drift (Brumlop et al. 2019), and since two parallel populations have been examined in three cases (with the exception of the populations CyclHU, CyclDE and CyclUK and CCYQ-Cycl.Ctr, see Table S1), these differences are evidence for adaptation to the specific environments. However, no specific adaptations to either organic or conventional environments or to either high- or low-N fertilization were evident.

Changes in early root and shoot development of the HI and LI CCPs over time have been demonstrated under controlled conditions, with these shoot and root measurements correlating with early soil cover in our field trials in NEB (Vijaya Bhaskar et al. 2019a). An earlier comparison for yield effects among populations with different histories was also inconclusive (Brumlop et al. 2017). It has also been documented that in the first generations, when the populations were still only growing in the UK, genotypes carrying Rht-genes for reduced height had been strongly reduced without negative yield effects (Knapp et al. 2020). Similar observations were reported by Goldringer et al. (2001) in other wheat populations. For conventional environments with particularly high yield potential (more than 8 t/ha), this might increase the risk of lodging and require higher doses of growth regulators. Additional comparisons of the F17 with earlier generations show that evolutionary changes did not result in yield changes over time (Baresel et al. unpublished), in line with other studies that either report increasing or stagnating yields, but never yield decreases over generations in evolving populations (Suneson 1956; Allard 1961, 1990; Jain and Qualset 1976; McDonald et al. 1989; Danquah and Barrett 2002).

A concern that has been raised with respect to the use of populations in dynamic management is that the favouring of more competitive plant types should lead to yield depression due to natural selection (Denison 2012). Natural selection favours genotypes with higher seed production of individual plants. If all genotypes compete for exactly the same resources, this would result in the reduction of productivity of other, competing genotypes. This would be expected if there were little diversity within the population concerning specific resource efficiency and if there were no diversity in resource availability within a field, such as if all available resources were already exploited, i.e. the genotypes were perfectly adapted to the respective environments, and if no outcrossing occurred. Clearly, it is highly unlikely that these conditions can be met. If there are unexploited niches and if genotypes differ in their response to specific resource acquisition, increased production per plant coincides with better exploitation of non- or underexploited resources/niches, without reducing seed production of plants belonging to other genotypes. This would lead to differentiation within the population, and might lead to higher total yields. Such a development would ideally occur if there is a high genetic variability within the population and if there are many unexploited niches. Besides maintaining their yield level, the CCPs have maintained high genetic diversity through the maintenance of plot sizes large enough to limit genetic drift (Brumlop et al. 2019), as well as through outcrossing, which regularly takes place in wheat. Thus, the populations maintained a high degree of diversity, which was also reflected in the wide range of morphological characteristics in the randomly extracted lines. Consequently, the populations most likely consist of members differing in their specific resource use efficiency and adaptive capacity, which allows for greater resilience in variable environmental conditions within and among years, where different genotypes will be more successful depending on the prevalent conditions. In the long term, this would lead to an equilibrium between differently adapted genotypes, potentially increasing yield stability over time as well as maintaining grain yields.

Conclusions

From our results and in conjunction with the literature reviewed, we conclude that populations have an interesting, currently neglected potential to improve the resilience of agricultural systems:

  • Populations may be considered an alternative to inbred lines under organic and low to moderately high input conditions providing good foliar disease resistance, which in turn will reduce pesticide inputs in the long term. Parental varietal choice is critical when considering grain yield and protein content potential in differing environments, especially to improve breeding for greater grain protein deviation (GPD).

  • The diversity of pathogen resistances found in populations means that they may be less susceptible to “ageing effects” than inbred cultivars, particularly the loss of resistances due to the appearance of new, virulent strains of pathogens.

  • Although there may be strong initial selection for single traits strongly influenced by major genes, such as plant height or phenology, populations remain stable in the long term; selection towards competitive types does not necessarily result in yield decrease due to possible trade-off effects; a possible explanation is niche differentiation and complementarity, resulting in higher overall crop densities.

  • Yield and protein content stability tend to be higher in populations than in pure lines. In addition, the considerable effort required for maintenance breeding of inbred cultivars is avoided.

  • Crop density and, consequently, weed suppression, ground cover density and biomass during tillering tend to be higher in populations than in inbred lines, resulting in a reduced need for herbicides.

Our recommendation based on the results of these trials would be to encourage the development and the use of populations in practical agriculture as stipulated by the new EU organic regulation 2018/1519, particularly for use in organic and low-input systems. However, there is no reason to restrict such effects to organic farming systems. Additionally, this should be accompanied by multisite studies in order to further assess yield stability of a range of populations with differing genetic background, as well as the socio-economic impact relating to the introduction of heterogeneous populations. We want to further stress the point that populations should not be seen as a random product but that parental variety choice is vital to the success of population performance. Populations do not require less breeding effort than line cultivars and may in fact be even more challenging due to the lack of long-term experience on suitable breeding methods, parental choice and combining ability, all of which need further development. Research into the optimization of parental variety choice for the establishment of heterogeneous populations, as well as population improvement through admixing of new materials in order to optimize yield and yield stability, is essential.