Introduction

Yield losses in cereals due to diseases are still considerable despite intensive crop protection measures and progress in resistance breeding. To support the national and EU agricultural policies (BMEL 2019; EU 2020) aiming at a reduction of nitrogen (NOG) and pesticide use, new varieties with a broad resistance to diseases and lodging and higher NOG use efficiency are required. In this regard cereals are of specific importance, as they are the most widely grown field crops in the European Union (EU-28), covering 54% of arable land in 2020 (https://ec.europa.eu/eurostat/de/). In Germany, about the same share of total arable land (55%, 2019) was cultivated with cereals (including grain maize) (BMEL 2020). In the German market, 45% of cereal grain was used for livestock feeding, 32% for human food, 12% for industrial uses, 6% for bioenergy, and 2% for seeds (BLE 2021). Each year, 300–350 new cereal candidate varieties are submitted to the Bundessortenamt (https://www.bundessortenamt.de) to enter official trials to assess their value for cultivation and use (VCU), of which 40–50 new varieties are eventually released to the market. Grain yield and quality as well as resistance to diseases are the most important VCU traits.

Breeding progress has substantially increased potential yields of cereals in Germany by about 0.6–1.3% per year over the last 30 years (Laidig et al. 2014). In addition, varieties’ susceptibility to most major cereal diseases was reduced, while stem stability was not (Laidig et al. 2021). Yield losses in cereal crops are caused by several stress factors. While fungal diseases are the dominating cause for biotic stress, heavy rain, storm and hail are important abiotic stressors causing lodging and consecutive yield losses. Application of fungicides (FCI) and growth regulators (GWR) and selection of resistant varieties are the major options to protect against the respective yield losses.

Considerable yield losses in cereals due to diseases and lodging were reported by numerous studies (e.g., Jayasena et al. 2007; Wijk 2009; Jahn et al. 2012; Fones and Gurr 2015; Jevtic et al. 2017; Willocquet et al. 2021). Insufficiently controlled fungal diseases are a major cause of the cereal yield gap, i.e., the difference between attainable yields and actual on-farm yields. The yield gap for rain-fed wheat was estimated to range from 10% to 40% in NW Europe and from 10% to 30% in Germany (Schils et al. 2018). In NW Europe, 25% of the total wheat yield gap was attributed to fungal diseases (Savary et al. 2019). Oerke and Dehne (2004) reported average yield losses due to pathogens of 10% in wheat and 14% in barley in NW Europe. Even rather small yield losses are significant in the high-yielding environment of NW Europe, where the average wheat yield for the last three years (2019–2021) was, for example, 87.0 dt ha−1 in Belgium, 75.2 dt ha−1 in Germany and 71.9 dt ha−1 in France (EUROSTAT 2021).

Generally, cereal registration trials to assess a variety’s VCU are conducted over a wide range of environmental conditions. Mostly two intensities are tested, where in intensity 1 varieties are tested without FCI and GWR and in intensity 2 (hereinafter referred to as I1 and I2, respectively) they are treated with FCI and GWR. These two intensities provide the opportunity to evaluate disease susceptibility and lodging of varieties in I1 and estimate the respective yield loss compared to I2. They further allow assessing the impact of input intensity on yield. Depending on the country and the time period of the conducted VCU trials, NOG fertilizer rates may be identical in I1 and I2 or may be higher in I2 (e.g., Laidig et al. 2014; Mackay et al. 2011; Zhang et al. 2007). Application of herbicides and insecticides is generally identical in both intensities. Most guidelines for conducting registration trials state that treatments in I2 should be applied according to “good local agronomic practice (GLAP)”, allowing I2 to serve as a reference for attainable yield. In contrast, I1 allows an undisturbed evaluation of varieties’ susceptibility to diseases and their stem stability. One needs to be aware that GLAP is a very variable treatment regimen. Under a GLAP regimen, input intensity in I2 aims neither for maximum yield nor for full control of diseases and lodging. Input intensity according to GLAP is subject to the local crop experts’ decision regarding frequency, amount and time of application. When applying GLAP in variety trials, economic aspects, IPM and soil protection requirements, and most importantly the dynamics of environmental conditions during the growth phases must also be taken into account.

This is the first study based on a long-term dataset of official variety trials that considers diseases, input intensity and soil fertility (quantified across environments) and that models and compares their impacts on yields across five cereal crops. For this reason, our study provides new insights into the complex interactions of variety × environment × input intensity.

The overall goals of this study are to i) quantify yield reduction due to multiple diseases and lack of stem stability, and ii) evaluate the impact of input intensity and soil fertility on yield in I2 under a GLAP regimen across five cereal crops. In particular, we first evaluate the annual variability of input intensity (application rates) of NOG fertilizer, FCI, GWR and herbicides by using box plots. Second, we quantify relative yield reduction in I1 compared to I2 by multiple linear and quadratic regression equations, where the covariates for the regression terms are the severity scores of disease and stem stability traits. The variability of the different sources of environmental and genotypic variation is taken into account by random effects in a mixed model with regression terms as fixed effects. Third, the impact of NOG, FCI and GWR application rates and the influence of soil fertility on yield in I2 are estimated by linear and quadratic regression equations as fixed effects in a mixed model including random effects representing environmental variation. In the fourth and last part, we evaluate the strength of association between relative yield in I1, severity of disease and stem stability and, additionally, the association between yield in I2, input intensity and soil fertility using marginal correlation coefficients.

Materials and methods

Data

This study is based on data from official variety trials of five cereal crops, i.e., winter wheat (WW), winter rye hybrid (WR Hyb) and population (WR Pop) varieties, winter triticale (WTI), winter barley (WB) two-rowed (2r) and six-rowed (6r) varieties and spring barley (SB) conducted in 2005–2019. Varieties in all crops were line varieties except for WR. These crops accounted for 48% of total arable land in Germany in 2015–2019. WW was the most important crop (26%), followed by WB (11%), WR (5%), WTI and SB, with 3% each (BMEL 2020). The Federal Plant Variety Office (Bundessortenamt) conducted the trials at multiple locations. The regular testing period for a newly applied variety was three years. In each year three parallel trial series were run. In any given year, series S1, S2 and S3 included varieties in their first, second and third testing year, respectively. This means that at a specific location up to three trials were grown in the same year. The number of locations per trial series was in the range of 15–25.

Well-established varieties were chosen as references, representing the actual state of breeding progress. At least three reference varieties were included in each series. The references were identical in each series (S1, S2, S3) and updated on a regular basis, ensuring at least partial overlap of sets of references used in successive years. Individual trials were treated according to GLAP. Each trial was conducted with 2 intensities, where I1 was treated without and I2 with FCI and GWR. The application rates for NOG, herbicides and insecticides were identical in I1 and I2. Only in winter rye, growth regulators were also applied in I1 in a few trials at a lower rate than in I2 (Fig. 2). Following a standard procedure, timing, type and application rate of FCI were decided based on the average disease severity across all tested varieties, independent of the actual variety-specific disease severity and resistance level. FCI treatment was thus decided by neither the most susceptible nor the most resistant variety.

The NOG, FCI and GWR application rates were recorded in detail for each individual trial in I1 and I2. NOG rates were accumulated as total kg N ha−1. The preceding crops’ residual NOG supply was additionally considered in our analysis, based on Table 7 in DUEV (2017), and added to the applied mineral NOG rate. The NOG equivalent of sporadically applied organic fertilizer was considered and added according to the applied mineral NOG equivalent rate. Unfortunately, no data on plant-available mineralized nitrogen in the soil (Nmin) was available. FCI and GWR rates were standardized using the treatment frequency index (TFI) following Roßberg (2006). Many studies used the TFI to assess plant protection intensities in crop production (e.g., Klocke et al. 2020; Strehlow et al. 2020). Here, the TFI described the amount of plant protection products applied to a specific land unit relative to the application amount recommended by the approval authority for each individual plant protection product for the specific crop. A TFI of 1 might derive from the application of a single plant protection product in recommended full dose, but might also derive from the application of two plant protection products, each applied at half the recommended dose. We derived separate TFIs for FCI, GWR and herbicide applications in our study. The rates of applied NOG, FCI, GWR and herbicides are shown in Fig. 2.
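The dose-ratio logic of the TFI described above can be illustrated with a minimal Python sketch. It covers only the part stated in the text (applied dose relative to the recommended dose, summed over products); the product names, doses and the helper `treatment_frequency_index` are hypothetical and are not taken from Roßberg (2006) or from the trial data.

```python
from dataclasses import dataclass


@dataclass
class Application:
    product: str           # hypothetical product name
    applied_dose: float    # dose applied to the plot, e.g. in l/ha
    approved_dose: float   # full recommended dose for this crop, same unit


def treatment_frequency_index(applications):
    """Sum of applied/recommended dose ratios over all applications."""
    return sum(a.applied_dose / a.approved_dose for a in applications)


# One product at the full recommended dose ...
single_full = [Application("fungicide A", 1.0, 1.0)]
# ... and two products, each at half of its recommended dose.
two_half = [
    Application("fungicide A", 0.5, 1.0),
    Application("fungicide B", 0.75, 1.5),
]

tfi_single = treatment_frequency_index(single_full)
tfi_two = treatment_frequency_index(two_half)
print(tfi_single, tfi_two)  # both equal 1.0
```

Both treatment schedules give the same index, which is why the TFI is read as a measure of treatment intensity rather than as a count of products.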

For each trial, soil fertility (SLF) points (Ackerzahl) were recorded, which indicate the quality of a specific area of arable land. The basis is the soil value (Bodenzahl) as assessed in the German Soil Taxation Framework (BodSchätzG 2007; Blume et al. 2015, Chap. 11.2, p. 564 ff). Soil values were assigned depending on soil type, geological age of the parent rock and soil development stage. The best soil quality receives a soil value of 100 (Chernozems of the Magdeburger Börde). Yield potential, however, is not only dependent on soil value, but also influenced by factors like climate, temperature, precipitation and topography. In a field rating, SLF is assessed as a correction of the soil value by taking into account natural environmental conditions of a specific area of arable land. In Germany, SLF is graded on a scale from 1 to 120 points, where 1 means very poor and 120 very good SLF.

In each crop, all traits were assessed in I1 and I2 as given in Table 1. Grain yield (YLD), lodging (LDG) and powdery mildew (MLD) were assessed in each of the five cereal crops. The other traits were crop-specific: in WW, brown rust (syn. leaf rust) (BNR), Septoria leaf blotch (STB), yellow rust (syn. stripe rust) (YLR), Septoria nodorum blotch (syn. Stagonospora nodorum blotch) (SNB) and tan spot (DTR); in WTI, BNR, STB, YLR and RYS; in WR, stem buckling (SBL), BNR and Rhynchosporium (syn. scald) (RYS); in WB and SB, SBL, ear buckling (EBL), RYS, net blotch (NTB) and dwarf leaf rust (syn. barley leaf rust) (DLR). For Latin names of diseases, see Table 1. All evaluated traits were considered when assessing the VCU of varieties. Stem stability and disease severity were assessed visually on a 1–9 scale by crop experts in the field according to the guidelines of the Federal Plant Variety Office (Bundessortenamt 2000). LDG, SBL, EBL and disease severity were expressed on the 1–9 scale, where a score of 1 refers to “missing or very low” and a score of 9 refers to “very high” (Bundessortenamt 2000, Sect. 4.1). The recorded score represents the average disease severity of the plot. For more details on pathology, assessment methods and scaling for stem stability and disease traits, see Supplementary Material SM.

Table 1 Basic data 2005 – 2019

Throughout this paper we use the term “disease severity” to describe each individual variety’s actual visually observed score on the 1–9 scale. An individual trial was considered as non-diseased with respect to a specific disease if no symptoms of that disease were visible for any variety in the trial, or if only a few varieties showed a severity score of at most 2 and all others a score of 1. For stem stability and diseases, the number of trials from which observations were taken was notably smaller than for YLD, because scores were only recorded from those trials that were actually diseased or showed lodging (Table 1). The plant damage assessment was done by the local crop experts responsible for conducting the trial (Bundessortenamt, Federal States, breeders).

Trials were laid out as split-plot designs with main plots arranged in complete blocks. The treatments (I1 and I2) were applied to main plots, and the varieties were arranged in subplots. Subplots within main plots were laid out either as randomized complete blocks or according to an alpha-lattice design. The average harvested plot size was about 10 m². Winter rye hybrid and population varieties were grown in the same trial and also treated identically. However, we analysed both types separately.

We used only data from varieties tested for at least three years to achieve a good representation of the trial conditions and build on a solid database. Data included in this study are shown in Table 1.

The data set was highly non-orthogonal with respect to variety-year combinations, whereas the variety-location combinations were orthogonal within year and trial series, i.e., all varieties were grown together at all locations within the same year and trial series. The data were checked for recording errors and outliers by calculating standardized residuals based on Eq. (1). We excluded observations with absolute standardized residuals greater than 5.0 from further analysis.

Pedo-climatic conditions, pre-crops and tillage

Variety trials were conducted in the different crops’ typical growing regions across Germany. The number of different locations included in this study ranged from 90 for winter triticale to 129 for winter barley. However, during the study period, a substantial share of the locations was dropped and new ones entered the trial system. Trial series were planned in such a way that each crop’s typical growing region in Germany was covered by a representative number of trials. In Fig. 1a we show the distribution of important indicators for growing conditions, i.e., long-term annual temperature and precipitation, altitude of trial fields and SLF. Observations were trial-specific. The means for long-term annual temperature and altitude ranged from 8.3 °C (WR) to 8.6 °C (WW) and from 180 m (WW) to 229 m (SB) above sea level, respectively. Mean precipitation between crops ranged from 656 mm (WB) to 689 mm (WR). The greatest differences between crops occurred for SLF. WW was grown on trial locations with the highest mean for SLF (67 points), followed by WB (63 points), SB (58 points) and WR (46 points) with the lowest SLF. Although crop means were very similar, except for SLF, the variation covered by 95% of trials was very large, indicating that for each crop the trials were conducted under a very wide range of environmental conditions.

Fig. 1

(a) Pedo-climatic conditions 2005–2019 and (b) pre-cropping and type of tillage for period 1 (2005–2007) and period 2 (2017–2019). Observations are based on year × location × trial series-combinations. Boxes cover 50% and whiskers 95% of trial observations. WW Winter wheat; WTI Winter triticale; WR Winter rye; WB Winter barley, 2r two-row varieties, 6r six-row varieties; SB Spring barley

Pre-crops were categorized into three groups (foliage crops, e.g., sugar beet and oilseed rape; maize; and cereals) and tillage into two groups (tillage with plough and ploughless tillage). Figure 1b compares the trial frequencies of these categories between the early period 1 (2005–2007) and the late period 2 (2017–2019). Both periods were compared because, for example, a higher share of cereal pre-crops may increase disease infection, and more ploughless tillage may increase weed growth and hence herbicide use.

In WW, foliage was the most frequent pre-crop with more than 75%, followed by WTI, WR, while for WB and SB cereals predominated. Maize was used as pre-crop in all cereals only with relatively small frequencies. When considering the change between early and late period, a slight increase of maize as pre-crop occurred, especially in WW. Foliage pre-crops did not change or showed a small increase for WTI and WR. The comparison showed that the frequency of cereals did not increase, hence may not influence trends in yield damage or provide reason for more fungicide use. The plough was used for tillage in more than 80% of trials in the early phase, except in WW (about 75%). In the late phase, on the other hand, there was a clear trend towards ploughless tillage, especially in WW, where more than 40% of trials were prepared without a plough (Fig. 1b) which may be an indicator for increasing use of herbicides.

Statistical analysis

Basic model

For a given observation (average over replications), we used a model with factors genotype, location, trial series and year given by

$$y_{ijkl} = \mu + G_{i} + L_{j} + Y_{k} + \left( {LY} \right)_{jk} + \left( {LYT} \right)_{jkl} + \left( {GL} \right)_{ij} + \left( {GY} \right)_{ik} + \varepsilon_{ijkl} ,$$
(1)

where \(y_{ijkl}\) is the mean yield of the ith genotype in the jth location and kth year within the lth trial series, \(\mu\) is the overall mean, \(G_{i}\) is the main effect of the ith genotype, \(L_{j}\) is the main effect of the jth location, \(Y_{k}\) is the main effect of the kth year, \(\left( {LY} \right)_{jk}\) is the jkth location × year interaction effect, T indicates the trial series (S1, S2, S3) and \(\left( {LYT} \right)_{jkl}\) is the effect of the lth trial series within the jkth location × year combination, \(\left( {GL} \right)_{ij}\) is the ijth genotype × location interaction effect, \(\left( {GY} \right)_{ik}\) is the ikth genotype × year interaction effect and \(\varepsilon_{ijkl}\) is a residual comprising the genotype × location × year interaction \(\left( {GLY} \right)_{ijk}\), the genotype × location × year × trial series interaction and the error of a mean arising from sampling the replications. We confounded \(\left( {GLY} \right)_{ijk}\) with the residual error, because it was only based on the few reference varieties and was of about the same magnitude as the residual without the three-way interaction (Hartung et al. 2022). All effects except \(\mu\) are assumed to be random and independent with constant variance for each effect.

Extended model for relative yield reduction due to lack of stem stability and disease damage

First, we want to quantify the effect of lack of stem stability and of disease severity on relative YLD I1 during 2005–2019, i.e., YLD I1 (without FCI and GWR) expressed as percent of YLD I2 (with FCI and GWR), in the following denoted as RYLD (%). We extended Eq. (1) by using all traits for stem stability and disease severity as covariates of fixed regression terms in the extended model (Eq. (3)) for a specific crop. In the model selection procedure, we included linear regression terms for all traits as pre-set, because we assumed that all traits had a potential impact on RYLD. In addition to the linear regression terms, quadratic terms were selected by using a coefficient of determination for mixed models (Piepho 2019) as selection criterion. This measure is equivalent to the adjusted coefficient of determination in a linear mixed model and is given by

$$R^{2} = \frac{{\theta \left( {V_{0} } \right) - \theta \left( V \right)}}{{\theta \left( {V_{0} } \right)}}100\%$$
(2)

where \(\theta \left( {V_{0} } \right) = trace\left( {V_{0} } \right)\) represents the trace of the variance–covariance matrix for the observed data under the basic model and accordingly \(\theta \left( V \right)\) is the trace under the model including covariates. \(\theta \left( {V_{0} } \right)\) is the average marginal variance (AMV) for the baseline model (Eq. (1)). Quadratic terms were added to the pre-set linear terms in the model only if \(R^{2}\) was increased by more than 0.5%.

The expected value of the extended final model is given by

$$E\left( {z_{ijkl} } \right) = \mu + \Sigma_{p} \alpha_{p} x_{p} + \Sigma_{q} \beta_{q} x_{q}^{2}$$
(3)

where \(z_{ijkl} = 100 \times \left( {\frac{{y_{1ijkl} }}{{y_{2ijkl} }}} \right)\) corresponds to RYLD, \(\alpha_{p}\) is the linear regression coefficient of covariate \(x_{p}\) and \(\beta_{q}\) is the quadratic regression coefficient of covariate \(x_{q}\) (subscripts for covariates x are the same as for y). The subscripts 1 and 2 for y correspond to I1 and I2, respectively. RYLD was estimated by Eq. (3).

The expected value of Eq. (3) is denoted as baseline, when covariates \({x}_{p}=1\) and \({x}_{q}=1\) for all p and q, i.e., if no lack of stem stability and no diseases were present. Then the estimated relative yield reduction is given by the estimate of the expected value of Eq. (3) for given scores of covariates minus the baseline. The estimated maximum relative yield reduction is defined as the estimated RYLD at the maximum score minus the baseline, shortly referred to as yield reduction, where the maximum score is derived by the 99th percentile of the univariate distribution of scores. In the following we denote the maximum relative yield reduction by RYLDred. The results are shown in Table 2.
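The baseline and yield-reduction definitions can be made concrete with a short Python sketch. The intercept, coefficients and trait scores below are invented for illustration only (they are not the estimates reported in Table 2), and the function names are hypothetical.

```python
def expected_ryld(mu, linear, quadratic, scores):
    """Expected value of Eq. (3) for given trait scores."""
    value = mu
    for trait, alpha in linear.items():
        value += alpha * scores[trait]
    for trait, beta in quadratic.items():
        value += beta * scores[trait] ** 2
    return value


def yield_reduction(mu, linear, quadratic, scores):
    """Estimated RYLD at the given scores minus the baseline,
    where the baseline sets every covariate to score 1."""
    baseline_scores = {trait: 1 for trait in linear}
    at_scores = expected_ryld(mu, linear, quadratic, scores)
    baseline = expected_ryld(mu, linear, quadratic, baseline_scores)
    return at_scores - baseline


# Hypothetical coefficients: lodging (LDG) with a quadratic term,
# powdery mildew (MLD) with a linear term only.
mu = 100.0
linear = {"LDG": -1.0, "MLD": -0.5}
quadratic = {"LDG": -0.25}

reduction = yield_reduction(mu, linear, quadratic, {"LDG": 5, "MLD": 3})
print(reduction)  # -11.0 percentage points relative to the baseline
```

For the maximum relative yield reduction, the score of the trait of interest would be replaced by the 99th percentile of its observed scores.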

Table 2 Regression coefficients and estimates of yield reduction due to lack of stem stability and diseases for score 1 to the 99th percentile (Eq. (3)) based on relative yield intensity 1 expressed as percent of yield intensity 2 (RYLD (%))

Extended model for impact of input intensity and soil fertility on yield intensity 2

The aim of testing new varieties under I1 was primarily to evaluate their susceptibility to diseases and lack of stem stability. Under I2, varieties additionally received FCI and GWR treatment to show their performance under GLAP across environments. Not only NOG, FCI, GWR application rates and SLF varied from trial to trial, but also their interaction with trial-specific environmental conditions. Hence, it is of interest to quantify the impact of input intensity for NOG, FCI, GWR application rates and SLF on YLD I2 across trials. To consider the effect of covariates NOG, FCI, GWR and SLF on YLD I2, we extended Eq. (1) by a linear and quadratic regression term. To reduce computing time, we applied a two-stage procedure, which is usually very close to a single-stage procedure (Damesa et al. 2017). In the first stage, the trial means were estimated by reducing Eq. (1) to

$$y_{ijkl} = \mu + G_{i} + \left( {LYT} \right)_{jkl} + \left( {GL} \right)_{ij} + \left( {GY} \right)_{ik} + \varepsilon_{ijkl} ,$$
(4)

where \(y_{ijkl}\) is the mean yield of the ith genotype in the jth location and kth year and lth trial series within year and location for I2, \(\mu\) is the overall mean, \(G_{i}\) is the main effect of the ith genotype, \(\left( {LYT} \right)_{jkl}\) is the jklth trial effect within location × year, \(\left( {GL} \right)_{ij}\) is the ijth genotype × location interaction effect, \(\left( {GY} \right)_{ik}\) is the ikth genotype × year interaction effect, and \(\varepsilon_{ijkl}\) is a residual effect. All effects except \(\mu\) and \(\left( {LYT} \right)_{jkl}\) are assumed to be random and independent with constant variance for each effect. The estimated least-squares mean for the effect \(\left( {LYT} \right)_{jkl}\) is denoted by \(u_{jkl}\), which represents the trial mean (year × location × trial series combination) and serves as the dependent variable in the second stage.

The expected value of the regression model based on estimated individual trial means \(u_{jkl}\) obtained in the second stage is given by

$$E\left( {u_{jkl} } \right) = \mu + \Sigma_{p} \alpha_{p} x_{p} + \Sigma_{p} \beta_{p} x_{p}^{2}$$
(5)

where \({\alpha }_{p}\) is the regression coefficient of the linear and \({\beta }_{p}\) of the quadratic term of covariate \(x_{p}\). Subscripts for covariates x are the same as for u. The random effects of the regression model in Eq. (5) are \(L_{j}\), \(Y_{k}\), \({\left(LY\right)}_{jk}\) and \({\varepsilon}_{jkl}\). Linear terms of covariates NOG, FCI, GWR and SLF were pre-set, because we assumed that all covariates had an impact on the trial mean of YLD I2. The selection procedure for the quadratic and the interaction terms was the same as applied for Eq. (3). Quadratic terms were added only if they increased \(R^{2}\) by at least 0.5% compared to the basic model (Eq. 4).

The two-stage procedure could be applied for the selection of covariates in Eq. (5) because the covariates were related to the trial means, whereas for the selection of covariates in Eq. (3) only a one-stage procedure was applicable, because observations were related to varieties within trials.

Correlation of relative yield intensity 1, stem stability and diseases

In the section “Extended model for relative yield reduction due to lack of stem stability and disease damage” we estimated the impact of scored traits on RYLD. This raised the question of the extent to which traits were associated, i.e., how strong the correlations among RYLD and the scored traits were. Simple correlation coefficients of observations do not always allow valid inferences if the structure of the trial series is not considered. We therefore estimated marginal (total) correlation coefficients (Piepho 2018) between traits based on variety × year × location × trial series observations (Eq. 1).

Preliminary analysis had shown that the variance component of at least one of the interaction effects (GL) and (GY) in Eq. (1) was zero or very small and hence less important. In these cases, covariances were difficult or impossible to estimate. For these reasons, we reduced Eq. (1) to

$$z_{ijkl} = \mu + G_{i} + \left( {LYT} \right)_{jkl} + \varepsilon_{ijkl} ,$$
(6)

by confounding the effects L, Y and LY with LYT, and GL and GY with the residual effect ε, where \({z}_{ijkl}\) corresponds to RYLD. The correlations between effects of genotype G, the confounded trial effect LYT and residual ε were calculated assuming a multivariate model with traits as independent variables and random effects given by the basic model of Eq. (6). We chose a univariate approach from which correlations for pairs of traits can be inferred (Piepho et al. 2014):

First, we calculated variance components of random effects according to the model of Eq. (6) for trait U and V and for the difference \(U - V\) between both traits.

Second, we computed covariances between the random effects of trait U and V from variance components obtained from univariate models by using the equation

$$var\left( {U - V} \right) = var\left( U \right) + var\left( V \right) - 2cov\left( {U,V} \right) \Leftrightarrow$$
(7)
$$cov\left( {U,V} \right) = \frac{1}{2}\left( {var\left( U \right) + var\left( V \right) - var\left( {U - V} \right)} \right)$$
(8)

Third, we used variances of random effects from Eq. (6) and covariance from Eq. (8) to calculate the correlation coefficients.

The marginal correlation coefficient was derived from the marginal variances and covariances, which are the sums over the individual random effects of \(var\left( U \right)\), \(var\left( V \right)\) and \(cov\left( {U,V} \right)\). Compared to the simple correlation coefficient, the marginal correlation is the correlation on the level of observations (variety × year × location × trial series combinations), which takes into account the model structure of the trial series (Piepho 2018). The marginal correlation coefficients quantify the strength of association between traits involved in the estimation of RYLD.
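The three steps and the summation over random effects can be sketched in Python as follows. The variance components are invented numbers for the effects of Eq. (6); in practice they would be the estimates from the univariate mixed-model fits for U, V and U − V. Function and variable names are hypothetical.

```python
import math


def covariance_from_difference(var_u, var_v, var_diff):
    """Eq. (8): covariance recovered from three variances."""
    return 0.5 * (var_u + var_v - var_diff)


def marginal_correlation(vc_u, vc_v, vc_diff):
    """Marginal correlation of traits U and V.

    Each argument maps a random effect (e.g. 'G', 'LYT', 'residual')
    to its variance component for U, V and U - V, respectively.
    """
    var_u = sum(vc_u.values())
    var_v = sum(vc_v.values())
    cov_uv = sum(
        covariance_from_difference(vc_u[e], vc_v[e], vc_diff[e])
        for e in vc_u
    )
    return cov_uv / math.sqrt(var_u * var_v)


# Made-up variance components for two traits and their difference.
vc_u = {"G": 1.0, "LYT": 4.0, "residual": 4.0}
vc_v = {"G": 0.5, "LYT": 2.0, "residual": 1.5}
vc_diff = {"G": 2.5, "LYT": 8.0, "residual": 6.5}

r_marginal = marginal_correlation(vc_u, vc_v, vc_diff)
print(round(r_marginal, 4))  # -0.3333
```

Effect-specific correlations follow from the same helper by dividing each effect's covariance by the square root of the product of its two variance components.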

Variance components and correlations between yield intensity 2, input intensity and soil fertility

Application rates of NOG, FCI, GWR and soil fertility (SLF) were covariates observed in trials in I2, i.e., were specific data for each trial (year × location × trial series combinations). The model is given by

$$u_{jkl} = \mu + L_{j} + Y_{k} + \left( {LY} \right)_{jk} + \varepsilon_{jkl} ,$$
(9)

where \(u_{jkl}\) is the trial mean for YLD I2 estimated by Eq. (4). Variance components of random effects were estimated for location (L), year (Y), location × year interaction (LY) and residual effect \(\varepsilon\) to quantify their relative variation for NOG, FCI, GWR and SLF. Further, we evaluated the marginal correlation coefficients among trial means for YLD I2, NOG, FCI and SLF by considering the structure of the trial series, i.e., location, year and location × year interaction as given in Eq. (9). We confounded (LY) with \(\varepsilon\), because the residual variation due to \(\varepsilon\) was rather low. The procedure was analogous to that described in the previous section “Correlation of relative yield intensity 1, stem stability and diseases”.

Results

Input intensity in trials

In Fig. 2, we show the trial-specific annual distribution of input intensities in I2 from 2005 to 2019. WW received the highest NOG and SB the lowest, whereas the levels in WTI, WR and WB were similar and lay between those of WW and SB. Figure 2 shows a slightly increasing linear time trend for NOG in WTI and an even stronger one in WB and SB, but in WW an inverse trend became apparent from 2016 on. Generally, the variation of NOG rates within years and between locations was rather large.

Fig. 2

Nitrogen application rates (kg ha−1) and treatment frequency index (TFI) for fungicides and growth regulators in I2, for herbicides applied in I1 and I2. Observations of annual boxplots are based on location × trial series-combinations. Boxes cover 50% and whiskers 95% of trial observations. The red line indicates the annual means. WW Winter wheat, WTI Winter triticale; WR Winter rye; WB Winter barley; SB Spring barley

Trends for FCI were also slightly increasing until about 2015 for all crops, while from then on, a reduction was found, especially in WW, WTI and WR. The levels of GWR rates were about the same in all crops, except in SB with a rather low TFI level of about 0.5 and a small within year variation. GWR in WW and SB did not increase, whereas in WTI, WB and WR an increasing trend, especially in the last years, was found. Herbicide application increased in WTI and WR, slightly in WB and in SB until 2016 and then dropped.

Generally, Fig. 2 shows only slightly increasing time trends in NOG, FCI and GWR rates between 2005 and 2019 and little variability of annual means, but considerable between-trial within-year variability.

Distribution of yield, stem stability and disease scores

In Fig. 3 the annual distributions of observations (variety × year × location × trial series combination) in I1 for yield, stem stability and disease scores are represented as box plots to show the variation and mean trends of disease severity as an indicator of damage potential. Annual means for I1 are indicated by red and for I2 by blue lines. For scored traits, we included all observations, i.e., also observations with scores 1 to reflect the actual incidence situation rather than exhibit only observations from those trials which showed an incidence, i.e., trials with scores greater than 1.

Fig. 3

Boxplots for (a) relative yield in intensity 1 (dt ha−1) and (b) severity scores for stem stability and diseases in intensity 1 based on variety × location × trial series combinations 2005–2019. Boxes cover 50% and whiskers (one-sided) 95% of observations. The red line indicates the annual means of intensity 1, the blue line of intensity 2. Trials with scores 1 were included. WW Winter wheat; WTI Winter triticale; WR Winter rye, Hyb Hybrid varieties, Pop Population varieties; WB Winter barley, 2r two-row varieties, 6r six-row varieties; SB Spring barley; LDG Lodging; SBL Stem buckling; EBL Ear buckling; MLD Powdery mildew; BNR Brown rust; STB Septoria leaf blotch; RYS Rhynchosporium; YLR Yellow rust; SNB Septoria nodorum blotch; NTB Net blotch; DLR Dwarf leaf rust

A clear difference in annual means for YLD I1 and YLD I2 is shown in Fig. 3a with an approximately parallel profile, but with considerable variability from year to year and a large within-year variability in I1. YLD in WB 2r and WB 6r increased clearly, whereas no clear trend for the other crops was visible due to the ups and downs of annual means.

The most important stem stability trait was LDG evaluated in all crops (see Fig. 3b). High severity scores occurred in WR Hyb and WR Pop in I1 and I2, whereas in WW and WB less LDG was observed. Notable differences of LDG means in I1 versus I2 in WB, WTI and especially in WR suggest a strong effect of GWR treatment. The 95% range of observations, represented by whiskers, covers the complete 1–9 scale in all crops and in most years, except in SB where means in I1 and I2 were low and show little variation. In SB, the severity levels for SBL were considerably lower than in WR and WB. The large difference between I1 and I2 in annual means for SBL in WB indicates a strong efficacy of GWR treatment. For EBL in WB and SB, nearly no difference between means of I1 and I2 is visible in Fig. 3b, indicating that GWR treatment had no effect on this trait.

For MLD, the highest severity scores occurred in WTI and the lowest in SB. Means in WW, WB and SB were close to score 1 in some years as shown in Fig. 3b. BNR showed high severity in WR, medium in WW and low severity in WTI in I1, and low mean severities in all crops in I2, indicating that BNR was controlled effectively by FCI. YLR incidence was very low in WW and WTI until 2013, but later it occurred epidemically in WW (2014–2016) (Fig. 3b) and in WTI (2014–2016, 2019) due to strong epidemics caused by the highly virulent ‘Warrior’ race and its descendants. Low means for I2, however, indicate a high efficacy of fungicides. Frequency and severity of DTR and SNB in WW were very low. For RYS, high severity levels were found in WR, while in WB and SB lower annual severity means were found. In addition, RYS was effectively controlled by FCI in WR, WB and SB. The barley-specific diseases NTB and DLR showed about the same severity pattern with respect to levels and variation in WB and SB, respectively, with nearly full disease control in I2. A remarkably increasing time trend for DLR was observed in I1, while near complete DLR control was achieved in I2 (Fig. 3b).

In general, we found high severity scores for LDG and SBL, followed by BNR, STB and RYS, lower for EBL, MLD, NTB and DLR and very low for SNB and DTR. For YLR, epidemic occurrence was found. A near-perfect control by FCI was realized for MLD, BNR, YLR, RYS, NTB and DLR in WW, WB and SB, but not for STB in WW and WTI, and specific diseases in WR.

Relative yield reduction due to lack of stem stability and disease damage

RYLD was estimated in all crops by a multiple regression model (Eq. 3) with covariates corresponding to stem stability and disease severity traits. The difference of estimated RYLD to the baseline was denoted as relative yield reduction (RYLDred). The RYLDred for each covariate (stem stability or disease trait) was estimated within the range of score 1 and the maximum score, which was derived as the approximate 99th percentile of the covariate’s score distribution. Trials, which showed no incidence of a specific disease or lack in stem stability, were not considered. Table 2 shows the coefficient of determination R2, the overall mean of RYLD, the baseline, estimated regression coefficients, the relative yield reduction RYLDred at maximum score and the observed maximum scores.

Figure 4 illustrates RYLD as a function of severity scores for all traits in each crop. The curves represent yield reduction effects of individual traits from score 1 to the maximum observed severity, i.e., the 99th percentile. The common origin of the curves corresponds to the baseline, i.e., the estimated RYLD if no incidence occurred. The horizontal dashed line corresponds to the overall mean of RYLD. The curve of a specific trait indicates the estimated RYLD, while the severity scores for the non-plotted traits are set to 1, i.e., no occurrence of any other disease or lack of stem stability. Then, the RYLDred for a specific trait at a given score is graphically represented in Fig. 4 as the difference between the baseline and the curve, e.g., RYLDred in WW for YLR at score 7 is − 7.1%. In the following, the RYLDred always corresponds to the maximum score as shown in Table 2. We should note that the regression model (Eq. (3)) is additive with respect to the covariates. This means that, when several traits show incidence at the same time, the total RYLDred is the sum of the RYLDred of the individual traits at their given severity scores.
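The additivity described above can be sketched in Python. This is only an illustration under stated assumptions: the function names, the parametrization in (score − 1), the baseline of 90% and all coefficient values are hypothetical placeholders and are not the estimates of Table 2 or the exact form of Eq. (3).

```python
# Sketch of an additive yield-reduction model with linear and quadratic terms.
# All coefficients and the baseline are hypothetical (not taken from Table 2).

def ryld_red(score, linear, quadratic=0.0):
    """Relative yield reduction (percentage points) of one trait at a given
    severity score, measured from score 1 (no incidence)."""
    step = score - 1
    return linear * step + quadratic * step ** 2

def predicted_ryld(baseline, scores, coefs):
    """Baseline RYLD plus the summed reductions of all traits (additivity)."""
    total = sum(ryld_red(scores[t], *coefs[t]) for t in scores)
    return baseline + total

coefs = {"YLR": (-1.2, 0.0), "STB": (-0.4, -0.1)}  # hypothetical (linear, quadratic)
baseline = 90.0                                     # hypothetical baseline RYLD (%)

# One trait at score 5 while the other stays at score 1, then both at score 5.
only_ylr = predicted_ryld(baseline, {"YLR": 5, "STB": 1}, coefs)
only_stb = predicted_ryld(baseline, {"YLR": 1, "STB": 5}, coefs)
both = predicted_ryld(baseline, {"YLR": 5, "STB": 5}, coefs)

# The joint reduction equals the sum of the single-trait reductions.
joint_reduction = both - baseline
sum_of_single = (only_ylr - baseline) + (only_stb - baseline)
```

Because each trait contributes its own term, a curve in Fig. 4 can be read independently of the others, and reductions for simultaneous incidences are obtained by adding the individual differences to the baseline.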

Fig. 4

Relative yield intensity 1 (I1) as percent of yield intensity 2 (I2) (RYLD, %) plotted against severity scores from 1 to the 99th percentile of the scores using Eq. (3). Regression coefficients are shown in Table 2. RYLD at score 1 corresponds to the baseline, i.e., if no disease incidence or lack of stem stability occurred in trials. The horizontal line represents the overall mean of RYLD; WW Winter wheat, WTI Winter triticale; WR Winter rye, Hyb Hybrid varieties, Pop Population varieties; WB Winter barley, 2r two-row varieties, 6r six-row varieties; SB Spring barley. LDG Lodging; SBL Stem buckling; EBL Ear buckling; MLD Mildew; BNR Brown rust; STB Septoria leaf blotch; SNB Septoria nodorum blotch; YLR Yellow rust; NTB Net blotch; RYS Rhynchosporium; DLR Dwarf leaf rust

The coefficient of determination R2 indicates how much of the total variation was explained by the regression function. R2 was in the range of 8.7% (SB) and 27.5% (WTI) indicating that most of the variation for RYLD was due to genotypic and environmental sources as represented by the random effects in Eq. (2). The lowest overall means for RYLD were found in WB 6r (86.5%) and WR Pop (86.6%), the highest in SB (91.2%) and WW (89.4%) (Table 2 and Fig. 4). The estimated baseline for RYLD, i.e., all covariates had score 1 (no incidence), was between 93.0% (SB) and 88.8% (WR Pop).

Table 2 (column RYLDred) shows that in WW the highest RYLDred (at maximum score) was estimated for BNR (− 8.4%), YLR (− 7.1%) and STB (− 6.6%), whereas DTR (− 2.6%) and SNB (− 2.3%) were of minor importance. Response curves for BNR, STB and DTR showed a quadratic form. RYLDred in WTI was considerably higher than in WW, i.e., − 14.4% for YLR, − 7.4% for MLD and − 7.2% for LDG. In WR, the impact of disease severity was generally lower than in WW and WTI, except for LDG in WTI and BNR in WR Hyb. In WB, LDG showed the highest RYLDred with − 8.0% in WB 2r and − 7.1% in WB 6r. In SB, RYLDred values were low, except for DLR (− 5.7%).

Overall, a high RYLDred was estimated for YLR, LDG, BNR and STB, whereas MLD showed moderate reductions. For EBL no yield reducing effect was estimated. The dominating risk factor for stem stability was LDG, whereas SBL and EBL were of low impact. Interestingly, yield reduction due to LDG in WR showed a curvilinear relationship in both hybrid and population varieties.

Impact of input intensity and soil fertility on yield intensity 2

The impact of the covariates NOG, FCI, GWR and SLF on YLD I2 (dt ha−1) was based on estimated means for individual trials by Eq. (4). We should note that in this case, these covariates affected all varieties within a trial equally, whereas relative yield reduction due to diseases and lack of stem stability affected each individual variety. The regression model is given by Eq. (5). Estimates of regression coefficients are shown in Table 3. The model fit R2 ranged from 14.6% (WW) to 27.8% (WB 2r), and the overall mean YLD I2 was lowest for SB (70.6 dt ha−1) and highest for WW (99.3 dt ha−1) as shown in Table 3. In Fig. 5, we visualize the impact of NOG, FCI, GWR and SLF over the 95% range (2.5th–97.5th percentile). As plots in Fig. 5 are univariate, we had to specify the values for the non-plotted covariates by choosing three levels: a low (L), a medium (M) and a high level (H), representing the 15th, 50th and 85th percentile of application rates for NOG, FCI and GWR as well as scoring of SLF, respectively. Hence, for each covariate three curves are shown, a green one for L, blue for M and red for H. For example, in WW the curve for NOG was plotted in the range of 100–230 kg ha−1 at level M with constant FCI of 2.2 TFI, GWR of 1.0 TFI and SLF of 65 points. The regression curves were linear and parallel (e.g., for GWR in WR Hyb) or quadratic and parallel (e.g., for GWR in WTI) because of the additivity of the regression terms (Eq. (4)).
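The construction of the L, M and H curves can be sketched as follows. This is a minimal illustration, assuming an additive model with linear and quadratic terms; the helper names, the trial data, the intercept and all coefficients are hypothetical and do not reproduce the estimates in Table 3.

```python
# Sketch: univariate response curves from an additive regression, with the
# non-plotted covariates fixed at their 15th (L), 50th (M) or 85th (H) percentile.
# Trial data, intercept and coefficients are hypothetical (not from Table 3).

def percentile(values, p):
    """Percentile with linear interpolation between order statistics."""
    xs = sorted(values)
    pos = (len(xs) - 1) * p / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(xs) - 1)
    return xs[lo] + (xs[hi] - xs[lo]) * (pos - lo)

def yld_i2(intercept, coefs, x):
    """Additive model: intercept plus linear and quadratic term per covariate."""
    return intercept + sum(b1 * x[k] + b2 * x[k] ** 2 for k, (b1, b2) in coefs.items())

trials = {  # hypothetical application rates (TFI) per trial
    "FCI": [1.0, 1.5, 2.0, 2.2, 2.5, 3.0, 3.5],
    "GWR": [0.0, 0.5, 0.8, 1.0, 1.2, 1.5, 2.0],
}
coefs = {"NOG": (0.01, 0.0), "FCI": (2.0, 0.0), "GWR": (6.0, -1.5)}  # hypothetical
intercept = 80.0                                                      # hypothetical

levels = {"L": 15, "M": 50, "H": 85}
curves = {}
for name, p in levels.items():
    # Fix the non-plotted covariates at the chosen percentile level.
    fixed = {k: percentile(v, p) for k, v in trials.items()}
    # Evaluate the curve for the plotted covariate (here NOG, kg/ha).
    curves[name] = [yld_i2(intercept, coefs, {"NOG": n, **fixed}) for n in (100, 165, 230)]

# Curves for L and M differ only by a constant shift, i.e., they are parallel.
shift_lm = [m - l for l, m in zip(curves["L"], curves["M"])]
```

Since the plotted covariate does not interact with the fixed ones in such a model, changing the level only moves the curve up or down, which is why the curves in each panel are parallel.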

Table 3 Regression coefficients estimating impact of nitrogen (NOG), fungicide (FCI) and growth regulator application (GWR) rates, and soil fertility (SLF) points based on yield intensity 2 (dt ha−1) 2005–2019 (Eq. (4))
Fig. 5

Estimated yield intensity 2 (dt ha−1) plotted against input intensity and soil fertility in the range of the 2.5th–97.5th percentile of trials 2005–2019 using Eq. (4) for three levels of application rates. The dashed grey line indicates the average yield. L denotes the low level (15th percentile), M the medium level (50th percentile) and H the high level (85th percentile) of the non-plotted input intensities. WW Winter wheat, WTI Winter triticale; WR Winter rye, Hyb Hybrid varieties, Pop Population varieties; WB Winter barley, 2r two-row varieties, 6r six-row varieties; SB Spring barley; I2 Intensity 2; TFI Treatment frequency index

In WW, YLD I2 did not change much over the range of NOG rates. The regression coefficient for NOG indicated no significant impact on YLD I2. GWR showed a slight non-linear increasing trend, levelling off with higher TFI, while YLD I2 increased linearly with increasing FCI and SLF. YLD I2 differed little between levels L, M and H (see Fig. 5).

In WTI, a curve pattern similar to that in WW was estimated; however, YLD I2 differed more between levels L, M and H than in WW (Fig. 5, Table 3). As in WW, little change in YLD I2 occurred over the range of NOG application rates, while YLD I2 increased considerably from levels L and M to H. The trend of FCI in WTI was nearly zero and not significant. GWR increased YLD I2 until about TFI 1.5 and then levelled off. SLF showed a stronger increase in YLD I2 with increasing levels from L and M to H (Fig. 5).

The curve patterns in WR Hyb and WR Pop were very similar for all covariates, but the overall mean YLD I2 (grey dashed line) was considerably higher in WR Hyb than in WR Pop. A remarkable curve pattern was estimated for SLF. The curve for YLD I2 increased up to a turning point at an SLF of 55–60 points and then dropped.

WB 2r and 6r varieties also showed similar curve patterns. For NOG a zero and FCI a slight positive but non-significant trend on YLD I2 was estimated, whereas GWR and SLF showed a strongly increasing impact on YLD I2.

In SB, for NOG and FCI no significant impact across application rates was estimated (Table 3). GWR revealed an increasing curve until about TFI 0.8 and then it turned to become negative, while SLF increased strongly. Noticeably, levels L and M were nearly the same for SLF, while H was considerably higher.

Overall, Fig. 5 indicates a large range of NOG, FCI and GWR application rates across trials following GLAP. The range was largest in WW and smallest in SB, whereas the range for SLF in WW was smaller than in other crops but more right-shifted, between about 40 and 85 points. In all crops, GWR and SLF showed the strongest increasing trend in YLD I2, whereas YLD I2 did not change much over NOG treatment rates, except in WB.

Correlation between relative yield for intensity 1, stem stability and diseases

The overall association between traits based on observations of variety × year × location × trial series combinations was estimated by marginal (total) correlation coefficients shown in Fig. 6a. We categorized the strength of association between traits by the following categories: |r|< 0.15 very weak, 0.15 ≤|r|< 0.35 weak, 0.35 ≤|r|< 0.55 moderate, 0.55 ≤|r|< 0.75 strong, 0.75 ≤|r| very strong. Significance levels of the correlation coefficients in Fig. 6 are not indicated. Due to the large number of observations, even coefficients categorized as very weak (|r|< 0.15) may be significant at the 0.1% level without being relevant, so a significance test is not considered helpful here.
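The categorization above can be written as a small lookup function. The following Python sketch uses the thresholds stated in the text; the function name is a hypothetical choice for illustration.

```python
# Sketch: map a correlation coefficient to the strength categories used in the text.

def association_strength(r):
    """Return the strength category for a correlation coefficient r,
    based on its absolute value."""
    a = abs(r)
    if a < 0.15:
        return "very weak"
    if a < 0.35:
        return "weak"
    if a < 0.55:
        return "moderate"
    if a < 0.75:
        return "strong"
    return "very strong"

# Example with two hypothetical coefficients of opposite sign.
examples = {r: association_strength(r) for r in (-0.40, 0.20)}
```

The sign of the coefficient is deliberately ignored, so positive and negative associations of equal magnitude fall into the same category.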

Fig. 6

Marginal correlation coefficients (a) between relative yield intensity 1 (I1) (RYLD), traits for stem stability and diseases of I1 using Eqs. (6–8) for 2005–2019. The lower triangular matrix belongs to winter wheat (WW), winter rye hybrid varieties (WR Hyb), winter barley two-row varieties (WB 2r) and spring barley (SB). The upper part to winter triticale (WTI), winter rye population varieties (WR Pop) and winter barley six-row varieties (WB 6r). Marginal correlation coefficients (b) between yield I2 (dt ha−1) (YLD I2), nitrogen (NOG), fungicide (FCI), growth regulator (GWR) application rates and soil fertility (SLF) derived by Eqs. (6, 7 and 9) for 2005–2019. The green-grey-red heat maps indicate the strength of correlation from positive (green) to negative (red) coefficients. (c) Variance components for nitrogen, fungicide and growth regulator application rates, and soil fertility based on Eq. (9) for 2005–2019. Y Year; L Location; Y × L: Year by location; Res Residual. Categorization: |ρ|< 0.15 very weak, 0.15 ≤|ρ|< 0.35 weak, 0.35 ≤|ρ|< 0.55 moderate, 0.55 ≤|ρ|< 0.75 strong, 0.75 ≤|ρ| very strong

In general, RYLD was negatively and very weakly to moderately associated with traits for stem stability and diseases. LDG was the trait with the most noticeable and negative association with RYLD, especially in WB 2r (r = − 0.44), WB 6r (r = − 0.33) and WR Hyb (− 0.31). YLR showed also a moderate association with RYLD (r = − 0.42) in WTI, but only a weak one in WW (r = − 0.25). SBL was weakly associated with RYLD in WB 6r (r = − 0.26), WB 2r (r = − 0.24) and SB (r = − 0.21), whereas in WR the correlation with SBL was very weak. A noteworthy correlation of RYLD with MLD occurred only in WTI (r = − 0.26) and WW (r = − 0.17). The strongest negative correlations of RYLD with diseases occurred in WW and WTI.

The association among traits of stem stability and diseases was generally positive but mostly very weak to weak as shown in Fig. 6a. Associations among traits of stem stability were noticeably positive, especially for LDG with SBL in WR, WB and SB in the range of r = 0.15 to r = 0.26, and SBL with EBL in WB and SB from r = 0.31 to r = 0.36. MLD correlated mostly very weakly with other diseases.

Generally, the correlation of RYLD with stem stability and diseases was negative, while the correlation among the stem stability and diseases was positive, but lower than with RYLD.

Correlations between yield intensity 2, input intensities and soil fertility, and variance components

The marginal associations between YLD I2, NOG, FCI, GWR and SLF indicate the strength of their mutual dependence on the basis of trial means (year × location × trial series combination), whereas the regression model given by Eq. (5) quantifies the impact of NOG, FCI, GWR and SLF on YLD I2 but does not indicate the strength of association, which is shown in Fig. 6b.

YLD I2 and NOG were not or only very weakly related across crops, while YLD I2 showed weak to moderate positive correlations with FCI, GWR and SLF in all crops. The association of NOG with FCI and GWR was very weak. However, the correlation of NOG with SLF was negative in all crops in the range of r = − 0.07 (WW) to r = − 0.28 (SB). In all crops, positive correlations occurred among FCI, GWR and SLF.

To obtain an overview of the variation of NOG, FCI, GWR and SLF, we estimated the variance components for the random effects location (L), year (Y), year × location (Y × L) and the residual error for all four covariates (Eq. (9)). Figure 6c shows the percent values of variance components relative to the total of variance components for random effects in Eq. (9). The variation of NOG for L accounted for more than 50% of the total variation in all crops and reached around 70% in SB, while for FCI the interaction Y × L was the dominating component. For GWR, L and Y × L were of about the same magnitude. As expected, more than 90% of the variation of SLF was caused by L, except in WR where variation due to Y × L accounted for about 30% of total variation. This large variation was caused by a few trials, which featured very different SLF in different years for the same location. Residual variability was negligibly small. The variation for Y was somewhat larger than for residual variation, but less than 5%. Generally, Fig. 6c indicates that variation of NOG, FCI and GWR under GLAP was nearly completely determined by L and Y × L while Y was nearly negligible.
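The percent values shown in Fig. 6c follow from dividing each variance component by the sum of all components. A minimal Python sketch, with purely hypothetical component values (they are not estimates from Eq. (9)) and a hypothetical function name:

```python
# Sketch: express variance components as percent of their total.
# The component values below are hypothetical, not estimates from Eq. (9).

def percent_shares(components):
    """Return each variance component as a percentage of the total variance."""
    total = sum(components.values())
    return {name: 100.0 * value / total for name, value in components.items()}

# Hypothetical components for one covariate: location (L), year (Y),
# year-by-location interaction (YxL) and residual (Res).
components = {"L": 5.5, "Y": 0.3, "YxL": 3.8, "Res": 0.4}
shares = percent_shares(components)
dominant = max(shares, key=shares.get)
```

Expressing the components as shares makes covariates with different units (kg ha−1, TFI, points) directly comparable.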

Discussion

In this study, we described the input intensity in variety trials (Fig. 2) grown over a wide range of environmental conditions and for a large set of genotypes in five cereal crops. Further, we depicted the annual distribution of yield, stem stability and disease scores (Fig. 3). We then quantified the impact of lack of stem stability and of disease damage on RYLD under natural infection (Fig. 4, Table 2). Finally, we evaluated the impact of input intensity of NOG, FCI, GWR and SLF on YLD I2 (Fig. 5, Table 3) and estimated the association between RYLD, stem stability and disease traits, and the association between those traits (Fig. 6).

Distribution of yield, stem stability and disease scores

Figure 3 shows an outline of annual yield, stem stability and disease severity patterns. The variation in occurrence of plant diseases is usually explained by differences in climatic conditions, synchronization between pathogen arrival and the growth stage of the host crop, cultivation history, host plant resistance and agronomic practices (Jalli et al. 2020). The individual plots illustrated three major issues. First, the distance between the annual means for I2 (blue) and I1 (red) indicated the effect of fungicide and growth regulator application and the level of the disease severity and stem stability. The gap showed that FCI and GWR application were mostly very effective, e.g., for MLD (Fig. 3b). Many studies on fungicide application revealed stronger differences between the untreated control and the treated variant compared to our study (Lollato et al. 2019; Wegulo et al. 2011; Thompson et al. 2014). In our study, the differences were particularly large in years with high infection pressure, e.g., in WW for BNR in 1988 and YLR in 2014–2016. Second, the width of boxes and whiskers demonstrated the variability of observations within years over varieties and trials due to natural infections. Third, the incidence of lack of stem stability and of diseases differed not only from crop to crop but also from year to year, where some diseases were rather chronic (e.g., BNR and STB), and some rather episodic (e.g., YLR and DTR), which means that they do not occur in every year. An extreme example was YLR that occurred over the whole period only in three epidemic years in WW (2014–2016) and in 5 years in WTI (2014–2017, 2019), which were associated with ideal weather conditions and the widespread occurrence of the new ‘Warrior’ race and its descendant ‘Warrior (-)’ (GRRC 2021). Nevertheless, about 75% of all tested WW varieties showed effective resistance in these epidemic years with scores from 1 to 3 as indicated by the box plots in Fig. 3b. This is consistent with the German Descriptive Variety List classification regarding resistance to YLR. At least 60% of the varieties were classified as resistant to YLR in the years 2005–2019 (BSL 2005–2020).

Marginal correlation coefficients between RYLD, stem stability and disease traits (Fig. 6a), and between YLD I2 and NOG, FCI, GWR and SLF (Fig. 6b) were relatively low. This can be explained by the nature of the marginal correlation, which included genotypic, environmental and genotype × environmental variation. However, low correlation coefficients do not reflect the magnitude of regression coefficients given in Tables 2 and 3. For example, yield and a disease trait may be only weakly correlated while the disease trait still shows a strong impact on yield, as is the case for WR Hyb between BNR and RYLD (r = − 0.13, RYLDred = − 5.2%).

Yield reduction due to disease damage and lack of stem stability

In this study, we used RYLD (YLD I1 relative to YLD I2 = 100%) to estimate yield reduction due to damage caused by lack of stem stability and disease severity. Several authors used this relative measure (e.g., Teng and Gaunt 1980; Zhang et al. 2007), because YLD I1 and YLD I2 were strongly positively correlated over environments. Further, RYLD accounts for the variability of yield levels between trials and for time trends, which are similar for YLD I1 and YLD I2, and thus provides a more stable and more easily interpretable measure than the absolute yield difference in dt ha−1.
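The advantage of the relative measure over the absolute difference can be illustrated with a short Python sketch. The two trials and their yields are hypothetical and chosen only to show the arithmetic; the function name is likewise an assumption.

```python
# Sketch: relative yield (RYLD) versus absolute yield difference.
# The yields below (dt/ha) are hypothetical illustration values.

def ryld(yld_i1, yld_i2):
    """Yield of intensity 1 as percent of yield of intensity 2 (I2 = 100%)."""
    return 100.0 * yld_i1 / yld_i2

low_trial = {"I1": 54.0, "I2": 60.0}    # hypothetical low-yielding environment
high_trial = {"I1": 90.0, "I2": 100.0}  # hypothetical high-yielding environment

ryld_low = ryld(low_trial["I1"], low_trial["I2"])
ryld_high = ryld(high_trial["I1"], high_trial["I2"])

# Absolute differences depend on the yield level of the trial.
abs_diff_low = low_trial["I2"] - low_trial["I1"]
abs_diff_high = high_trial["I2"] - high_trial["I1"]
```

In this example both trials have the same RYLD, although the absolute differences are 6 and 10 dt ha−1, which shows how the relative measure removes the effect of the trial's yield level.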

Evaluation of yield reduction due to disease damage and lack of stem stability was very demanding and challenging to interpret in terms of particular traits. By applying mixed models with fixed linear and quadratic regression terms and considering environmental and genotypic random effects (see Eq. 3), we were able to estimate yield reduction as a function of severity scores for individual traits showing their damage potential (Table 2).

The model fit (R2) was between 27.5% (WTI) and 8.7% (SB) indicating that the largest part of variation, which corresponds to 100% − R2, was caused by random effects of Eq. (3) representing environmental variability. Teng and Gaunt (1980) argued, “if 90% of variation in disease - yield loss model is due to factors other than that diseases, the model should have limited biological and practical application”. Results in Table 2 showed that for all crops, except SB, less than 90% of variation was left due to variation of other effects than lack of stem stability and diseases. This confirmed the good explanatory power of our regression models.

We denoted the difference “RYLD – baseline” as relative yield reduction RYLDred and not as yield loss, because yield loss is generally defined as the difference between attainable or potential yield, free of biotic stress, and the actual yield (Zadoks and Schein 1979, p. 246; Zetzsche et al. 2020). In contrast, in our trials yield was reached under GLAP, which does not aim for absolutely disease-free conditions. In Table 2 and Fig. 4, we showed the RYLDred for individual traits predicted in the range of score 1 and the maximum severity score (99th percentile). Due to the additivity of our models (Eq. 3), yield reduction of more than one trait equals the sum of yield reduction for individual diseases. Low correlation coefficients between traits, as shown in Fig. 6a, supported the additivity assumption of model terms. Additivity of yield reduction under multiple disease conditions illustrates how important it is to achieve multi-disease resistances in varieties (Miedaner et al. 2020).

RYLDred reported here was mostly lower than in other papers on yield reduction for lodging or individual diseases for several reasons (Losert et al. 2017; Berry et al. 2015; Zetzsche et al. 2020). One reason is the incomplete control of disease damage and lodging in I2 by chemicals, which leads to an underestimation of RYLDred. The incomplete control is mainly due to practicability reasons, because in VCU trials all varieties, independent of their resistance levels and phenological differences, had to be treated at the same time. A situation-related treatment of each individual variety was not possible from an experimental-logistic point of view. Hence, in practice, fungicide application was based on the average disease severity of the majority of varieties. This leads to a reduction in the effectiveness of fungicide and growth regulator applications as some varieties were likely treated too early or too late, e.g., STB in WW and LDG in WR showed means in I2 considerably higher than score 1 (Fig. 3b). In consequence, it should be noted that YLD I2 likely did not always represent potential yields, but was lower, so that the potential yield reduction was slightly underestimated. Another strong reason is that our results are derived from field conditions under natural infection and not under artificial disease infection as, for example, the results reported in Zhang et al. (2006, 2007) and Zetzsche et al. (2020). Moreover, Teng and Gaunt (1980) pointed to the fact that yield losses can have occurred at later development stages, reducing YLD I2 further.

In this study, many observations were found with a score of 1 (Table 1 and Fig. 3b). These could belong to fully resistant varieties, but also to susceptible varieties that were not infected due to the absence of the specific pathogens. Because this study was conducted over a long period of 15 years and the infection process generally depends on a number of different micro-climatic factors such as relative humidity, precipitation, leaf wetness duration, and temperature, in many years only moderate to low infection for many of the assessed diseases occurred. Regularly high disease scores of more than three were only found for STB in WW, and BNR and RYS in WR.

Our results showed that LDG caused strong RYLDred, which can be influenced by factors like plant height, plant density, single ear weight, stem tissue properties, heavy rainfall, strong winds, high NOG application rates and biotic stress factors like foot rot diseases. The greatest lodging-induced reductions in YLD are reported to occur when crops are lodging flat already at anthesis or during the early grain filling period (Berry et al. 2004). In WW, the yield effect of LDG was relatively low (− 3.9%). This can largely be attributed to the widespread use of semi-dwarfing genes, mainly Rht-D1 and Rht24, in WW (Würschum et al. 2017). Higher yield reductions due to severe LDG have been reported in wheat from 31% (Weibel and Pendleton 1964) to 80% (Easson et al. 1993) and in barley from 28 to 65% (Stanca et al. 1979; Jedel and Helm 1991). The lowest yield reductions estimated in SB (− 3.1%) can also mainly be attributed to the low plant height of SB varieties. In all crops of our study, LDG reached a maximum score of 9, which means complete lodging. The effect of GWR application on LDG (difference of annual means I1–I2) was highest in WR (Fig. 3b). Strong phenotypic correlations between plant height and lodging were reported for example by Losert et al. (2017) in WTI (r = 0.71) and strong genotypic correlations by Laidig (unpublished) in WW (r = 0.61) and in SB (r = 0.57). Although low plant height appeared to be a major factor regarding lodging tolerance, Navabi et al. (2006) found differences in lodging tolerance among tall WW genotypes indicating that stem tissue properties are another relevant factor besides plant height. Large awns in WB tend to foster early lodging caused by heavy rainfall, whereas in WR Pop the tallness of cultivars largely explained the high yield reduction due to lodging (Laidig et al. 2021).

Among all diseases, MLD was the only disease evaluated in all crops. However, in most crops MLD showed a low RYLDred of less than − 2.3% at maximum severity score between 6 (WW) and 8 (WTI) (Table 2). A reason for the lower yield reduction for MLD is the greater tolerance of varieties. According to IPM principles (EU 2009), powdery mildew treatment should only be carried out if the disease control threshold of 60% infected plants (disease incidence) is reached (Beer 2005). In the case of YLR, the YLR-specific threshold is already reached with the very first YLR symptoms on a few plants, since varieties, especially susceptible ones, tolerate only little YLR infestation. These threshold values are in line with the estimated yield reductions in our study, which were much higher with increasing susceptibility in YLR than in MLD (Fig. 4). The same reason for lower yield reduction as in MLD can be confirmed for RYS. According to the principles of IPM, RYS also needs to be controlled only at a threshold of 50% infected plants to prevent reduced yields. The estimated yield reduction is correspondingly lower with increasing susceptibility for RYS than for YLR.

In WTI, yield reduction was comparably high, especially for MLD and YLR. MLD reached a yield reduction of − 7.4% at the maximum severity score 8. In the other crops lower yield reductions were estimated. The very first infections of MLD in WTI were found in 2001 in Germany (Klocke et al. 2013; Laidig et al. 2021). Until then, no resistance breeding took place. After the first epidemics, however, selection was started. Nonetheless, until today many WTI varieties are still susceptible. Similarly, the first widespread occurrence of YLR in WTI occurred in 2001 (Laidig et al. 2021) and the first large epidemics started in 2014, as shown in Fig. 3b. Devastating YLR epidemics were observed in WTI and also in WW across three subsequent years. Accordingly, the highest RYLDred among all crops and traits based on the current analysis and respective VCU trial data was estimated in WTI for YLR reaching − 14.4%, whereas in WW the YLR damage was about half (− 7.1%). The main reason is that YLR encountered WTI, a crop in which many varieties were much more susceptible than most WW varieties. Intensive selection for YLR resistance had already taken place in WW in the previous decades. Despite the strong yield reduction, the maximum severity score reached 7 in WW and 8 in WTI. In both crops, effective FCI application was able to control the epidemic occurrence of YLR in 2014–2016 nearly fully (Fig. 3b), also due to increased application rates in these epidemic years as indicated by Fig. 2.

In Table 2 baselines (no disease and lodging) for RYLD were shown and plotted in Fig. 4. Baselines ranged between 93.0% (SB) and 88.8% (WR Pop) corresponding to a difference between 11.2% and 7% compared to I2. The question arises, why baselines were below 100%, because they were estimated at score 1 for all traits. These differences are likely caused by the missing growth regulators in I1 plus a yield-enhancing effect of some fungicides (i.e., strobilurins) leading to a delayed leaf senescence and longer green leaf duration even when no disease infection occurred (Ballini et al. 2013; Schierenbeck et al. 2019). Additionally, other diseases, which were not considered in the analysis, like Fusarium head blight (Fusarium graminearum and others), Puccinia graminis f. sp. secale, Microdochium nivale and Ramularia collo-cygni may also have contributed to this gap. The notably low difference between I2 and I1 of only 7.0% in SB may partly be explained by the potentially adverse effects of GWR application before or during ear emergence in dry years where the ears can be stuck in the tillers leading to lower YLD in I2 than in I1. This is supported by the high percentage of observations with RYLD > 100% which we found in SB amounting to 14.3% of all observations (data not shown).

Yield loss due to multiple diseases and lack of stem stability in variety trials was reported in numerous studies, mostly for wheat. However, the different studies varied considerably depending on whether historic varieties were grown or data from historic trials were used, as well as the applied input intensity, environments, and natural or artificial infection. This makes it difficult to compare reported yield reduction directly with ours. In the following, we give a few examples to demonstrate the heterogeneity of outcomes. Savary et al. (2019) reported, from an expert-based survey for wheat in NW Europe, average losses due to MLD, BNR, STB, YLR, SNB and DTR of 2.2%, 2.5%, 5.5%, 5.8%, 0.1% and 1.9%, whereas we estimated RYLDred of 2.3%, 8.4%, 6.6%, 7.1%, 2.3% and 2.6%, respectively. Berry and Spink (2012) used an empirical model to predict yield loss caused by lodging effects in WW. Their results showed that severe lodging reduced yields by about 61%. Zhang et al. (2007) estimated a yield loss in WW variety trials during 1990–2000 in France under natural multiple disease conditions for high severity (score 8) of 28.8% for STB, 16% for YLR and 9% for MLD. Wiik (2009) evaluated disease control in WW trials in farmers’ fields from southern Sweden during 1977–2005. The average yield loss was 6.6% of the untreated relative to the FCI treated intensity; thereof, 74% were ascribed to leaf blotch diseases (STB, SNB, DTR), 20% to MLD, 5% to BNR and 1% to YLR. Jalli et al. (2020) reported on yield increases due to fungicide control of leaf blotch diseases in wheat and barley in the Nordic-Baltic region. A total of 449 trials, mostly fungicide efficacy trials, were grown during 2004–2017 under natural infection. They found an average yield loss due to leaf blotch of 10.7 dt ha−1 (12%) in WW and 11.1 dt ha−1 (17%) in SB. Jahn et al. (2012) predicted yield losses in WW caused by important fungal diseases in 2003–2008 from 744 trials under natural infection across Germany and found a loss of 7.0 dt ha−1 for STB, 2.5 dt ha−1 for BNR and 2.0 dt ha−1 for MLD and DTR.

Our results cannot be directly compared with yield losses observed on-farm, despite the wide range of pedo-climatic conditions in our study. First, FCI and GWR types in our study were not chosen specifically for each variety and disease but as the best compromise across varieties. Second, only a small proportion of the available varieties is actually grown on-farm. Results of the project “Network of reference farms for plant protection” show that about one third of the available wheat varieties were cultivated on the reference farms. However, the 10 most commonly grown varieties covered more than 60% of the reference farms’ area (Klocke and Dachbrodt-Saaydeh 2021). Furthermore, these 10 most common varieties featured comparatively low multi-resistance levels, while varieties with much higher multi-resistance levels were available on the market. It should be noted that our results represent the RYLDred that would have been achieved with the full range of varieties available, and thus represent the benefits of breeding new improved varieties.

Overall, the strong influence of a variety’s susceptibility to YLR and BNR on yield reduction showed that the cultivation of resistant varieties should matter more in practice (Fig. 4). For diseases such as MLD and RYS, which lead to relatively lower yield reductions at increasing susceptibility compared to other diseases, the application of disease control thresholds is of enormous importance in avoiding the unnecessary application of fungicides. Accordingly, many MLD resistances have already been crossed into the individual crops and are available to the farmer. More than 60% of the available WW varieties in Germany have effective MLD resistances (BSL 2021). The choice of resistant varieties and the consistent use of control thresholds are crucial to control all diseases investigated in our study effectively and to reduce fungicide use.

Impact of input intensity and soil fertility on yield in intensity 2

Variety trials primarily aim to evaluate the varieties’ VCU in a set of environments representing a crop’s typical growing region under management practices and input intensities corresponding to GLAP. The tests and their evaluation aim to allow a fair comparison of candidate varieties that reflects their actual performance potential and ensures that the best varieties are provided for on-farm use. However, VCU trials are not designed for, and hence not able to, evaluate the optimal intensity for a specific variety. In Germany, such assessments for farmers’ guidance are conducted in additional regional post-registration trials by various federal state institutions. It should further be considered that VCU trial series are not conducted with predefined application rates for NOG, FCI and GWR, as is the case in planned fertilizer or plant protection experiments. The input intensity of a trial is defined case by case according to GLAP and hence differs from trial to trial, as demonstrated in Figs. 1, 2 and 6b. In fact, treatment and trial management take into account the yield potential of a given trial, with situation-specific NOG, FCI and GWR application. This implies that we cannot necessarily expect yield to rise in response to increasing application rates of a specific input.

In Fig. 5, the 95% ranges of NOG, FCI and GWR application rates and of SLF showed that WW received the highest NOG, FCI and GWR application rates and SB the lowest. This corresponds with WW having the highest overall mean yield among all crops, 99.3 dt ha−1 (Table 2). Moreover, high SLF indicates that WW production requires soils with comparatively high agronomic productivity. We did not consider herbicide application as a further treatment for estimating the impact on YLD I2, because preliminary results showed that this factor had only negligible influence on WW yield, which is in line with Wojcik-Gront (2018). However, Fig. 2 revealed an increasing trend of herbicide use, which may be ascribed to an increase in ploughless tillage during 2005–2019 of +5% to +15% in the different crops (Fig. 1a).

We expected that YLD I2 would increase with increasing NOG application rates. Instead, we found a contrasting result: yield did not increase when the NOG application rate increased from 107.3 to 147.3 kg ha−1, and regression coefficients for NOG (Table 3) were non-significant in all crops. Further, the marginal correlation coefficients between NOG and YLD I2 showed no association in any crop (Fig. 6b). This agrees with Wojcik-Gront (2018), who observed only a very weak yield effect of NOG fertilization amounts in Polish post-registration trials in WW. That study estimated a yield increase of only 0.019 dt ha−1 per 1 kg NOG, which was similar to our estimate (0.0143 dt ha−1 per 1 kg NOG, see Table 3). One reason for this result, which is unexpected at first glance, is that less NOG is needed at locations with higher SLF. Heavier soils, i.e., soils with higher silt and clay content and lower sand content, generally tend to provide more NOG through mineralization than lighter soils (Cassity-Duffey et al. 2020; Soinne et al. 2021; Vigil et al. 2002). Soils with higher SLF also feature higher plant available water capacity, which further promotes N-mineralization (Paul et al. 2003). Moreover, higher plant available water goes along with higher plant available nitrogen during yield formation. The negative correlation of NOG with SLF in all crops shown in Fig. 6b supports this explanation. In this regard, one needs to be aware that NOG fertilization decisions in VCU trials naturally take current information on soil Nmin into account. However, those soil Nmin data were not available and could hence not be considered in the present study. Accordingly, NOG application rates provide only limited information on the trial-specific plant available NOG. As described above, plant available NOG over the growing season is determined not only by applied NOG, but also by soil type, soil organic matter, soil biota, soil management and seasonal weather conditions (Vigil et al. 2002). Therefore, we would like to emphasize that the results in Fig. 5, i.e., that a similar yield was obtained over a wide range of NOG rates, should not be interpreted as indicating a large potential for reducing NOG in WW. Similar yields were achieved over a wide range of NOG rates because of the complex interaction of SLF, NOG, FCI and GWR with environmental conditions and trial management according to GLAP.
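To put the size of the estimated NOG effect into perspective, the following minimal sketch multiplies the two slopes quoted above by the 40 kg ha−1 rate interval. It assumes, purely for illustration, that a linear slope holds over the whole interval and that slope and interval refer to the same crop; the function and variable names are hypothetical and not part of the original analysis.

```python
# Illustrative arithmetic only: linear extrapolation of the slopes quoted in the text.
# Names are hypothetical; linearity over the whole interval is an assumption.

SLOPE_THIS_STUDY = 0.0143    # dt/ha per kg NOG (Table 3, as quoted in the text)
SLOPE_WOJCIK_GRONT = 0.019   # dt/ha per kg NOG (Wojcik-Gront 2018, as quoted)
N_LOW, N_HIGH = 107.3, 147.3  # kg/ha, NOG rate interval quoted in the text


def implied_yield_change(slope, rate_low, rate_high):
    """Yield change (dt/ha) implied by a linear slope over a rate interval."""
    return slope * (rate_high - rate_low)


delta_this_study = implied_yield_change(SLOPE_THIS_STUDY, N_LOW, N_HIGH)
delta_wojcik_gront = implied_yield_change(SLOPE_WOJCIK_GRONT, N_LOW, N_HIGH)

print(f"This study:          {delta_this_study:.3f} dt/ha over 40 kg/ha")
print(f"Wojcik-Gront (2018): {delta_wojcik_gront:.3f} dt/ha over 40 kg/ha")
```

Under these assumptions, both slopes imply a yield change of less than 1 dt ha−1 across the interval, which illustrates why the two estimates are described as similarly weak.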

A similarly low impact of FCI on YLD I2 was estimated in all crops except WW. Comparable results were found for WW (Wojcik-Gront 2018) and WTI (Wojcik-Gront et al. 2021) in Polish variety trials. This indicates that FCI treatments in trials were likely managed in a situation-specific manner, i.e., with higher FCI at higher disease pressure. Accordingly, the low impact may further be attributed to the fact that disease severity differed strongly between trials, such that trials with low disease severity and low FCI rates could reach the same yield levels as trials with high severity and high FCI rates. This is not in conflict with the positive correlation of YLD I2 with FCI shown in Table 3, because the strength of a correlation may be high while the magnitude of the regression coefficient, i.e., the effect size, is low at the same time.
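The distinction between correlation strength and effect size can be illustrated with a small numerical sketch. The data below are synthetic and chosen only to make the point; they are not trial data, and the variable names (a hypothetical FCI index and yield in dt ha−1) are assumptions for illustration.

```python
from math import sqrt

# Synthetic, hypothetical data: yield rises only slightly with the FCI index,
# but with very little scatter around the line.
fci_index = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0]
noise = [0.02, -0.02, 0.01, -0.01, 0.02, -0.02]
yield_dt_ha = [90.0 + 0.2 * x + e for x, e in zip(fci_index, noise)]


def pearson_and_slope(x, y):
    """Return (Pearson correlation, ordinary least squares slope of y on x)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    return sxy / sqrt(sxx * syy), sxy / sxx


r, slope = pearson_and_slope(fci_index, yield_dt_ha)
print(f"correlation r = {r:.3f}, slope = {slope:.3f} dt/ha per index unit")
```

In this constructed example the correlation is close to 1, yet one additional index unit changes yield by only about 0.2 dt ha−1: the correlation measures how tightly the points follow a line, whereas the slope measures how steep that line is.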

A noticeably strong impact was estimated for GWR, which was curvilinear in some crops. The curvilinear regression for WW, WTI and WB 2r indicated a diminishing effect at higher GWR rates. In SB, however, YLD I2 dropped again at higher TFIs. As explained earlier, this effect may be ascribed to ear emergence problems when a drought period follows higher GWR application rates.

Besides GWR, SLF had a very strong impact on YLD I2 in all crops. In WR, however, SLF showed a noticeably curvilinear impact on YLD I2, which increased up to a turning point of around 60 points and then dropped again (Fig. 5). A closer inspection of the data revealed that this effect was caused by two locations with very high SLF points (96 and 92) but low long-term annual precipitation (600 mm and 500 mm) and high annual mean temperature (9.5 °C and 8.7 °C) compared to other locations (see Fig. 1a). In WTI, the decrease in YLD I2 with increasing SLF was caused by locations with conditions similar to those found in WR. Frequently occurring drought and heat stress were the likely reasons for yield depression at those locations, causing the unexpected curve pattern.
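For readers less familiar with curvilinear regression, the sketch below shows how a turning point follows from the coefficients of a second-degree polynomial. It assumes a quadratic model of the form y = b0 + b1·x + b2·x²; the coefficients are hypothetical and were chosen only so that the turning point lies at 60 SLF points, and they are not the estimates from this study.

```python
# Hypothetical quadratic response of yield to SLF points: y = b0 + b1*x + b2*x**2.
# Coefficients are illustrative assumptions, not estimates from the study.
B0, B1, B2 = 20.0, 1.2, -0.01


def turning_point(b1, b2):
    """x-value where a quadratic reaches its extremum (a maximum if b2 < 0)."""
    if b2 == 0:
        raise ValueError("b2 must be non-zero for a quadratic to have a turning point")
    return -b1 / (2.0 * b2)


def predict(b0, b1, b2, x):
    """Predicted response of the quadratic model at x."""
    return b0 + b1 * x + b2 * x * x


x_turn = turning_point(B1, B2)
print(f"turning point at {x_turn:.1f} SLF points")
print(f"predicted at turning point: {predict(B0, B1, B2, x_turn):.2f}")
print(f"predicted at 96 SLF points: {predict(B0, B1, B2, 96.0):.2f}")
```

With a negative quadratic coefficient, predictions beyond the turning point decline, so a few high-SLF locations with low yields, such as the two dry and warm locations described above, can be enough to bend the fitted curve downwards.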

Generally, regression coefficients in Table 3, marginal correlation coefficients in Fig. 6b and plots in Fig. 5 indicate that SLF is the most important factor for achieving high YLD I2. As described in detail above, the reasons are that locations with higher SLF generally provide more plant available water and plant available NOG. Furthermore, locations with higher SLF are less frequently affected by adverse weather conditions, including drought and heat stress, than locations with lower SLF (Bönecke et al. 2020). Thus, investments in soil fertility through adapted agronomic management, such as organic matter input, diversified crop rotations or a positive humus balance, can contribute to improved climate resilience of cropping systems and can help to ensure crop production under climate change (Macholdt et al. 2020; Jahangir et al. 2019; Seremesic et al. 2011). Similarly, Veresoglou (2013) stated that temperature-related effects were on average about one sixth higher at locations with low yield potential. However, this may not always be the case, as was shown for WR and WTI above. The positive correlation coefficients of SLF with FCI and GWR (Fig. 6b) showed that the positive yield effects of higher SLF go along with higher application rates of FCI and GWR. This was also demonstrated by the increase in YLD I2 from treatment level L to H in Fig. 5. Overall, SLF is the most important and central factor for achieving high yields across environments under GLAP, followed by GWR, while FCI and NOG were of minor influence.

Conclusions

The large damage potential of fungal diseases and lodging shown in our study can be interpreted in terms of sustainability as a clear indication of the importance of resistance breeding and the increasing necessity for farmers to grow resistant varieties. If the yield of winter wheat could be maintained by resistant varieties despite infections by brown rust, yellow rust and Septoria tritici blotch, this could contribute substantially to lower fungicide use. According to our results, resistances to yellow rust, mildew and lodging are most important in winter triticale, resistances to lodging and brown rust in winter rye, and resistances to lodging and dwarf leaf rust in the three barleys. Evidently, increased efforts to breed for lodging tolerance are needed, as the potential increase in strong winds and heavy rains due to climate change is expected to further challenge the stem stability of all cereals in the future.

The impact of nitrogen, fungicide and growth regulator application rates following good local agronomic practice showed that yield did not change much over a wide range of nitrogen and fungicide application rates in most cereals. As nitrogen application rates do not include plant available nitrogen over various environments, and as growth regulators and fungicides were applied situation-specifically, we cannot derive specific saving potentials of fungicide and growth regulators in practical farming. Higher growth regulator application rates were associated with higher yield in winter rye and winter barley, while in spring barley higher rates were associated with lower yields. Generally, soil fertility showed the strongest impact on yield in all crops, followed in descending order by growth regulator, fungicide and nitrogen application rates.

Against the background of the European Union and the national agricultural policy aiming to reduce nitrogen and pesticide use considerably, new varieties with a broad resistance to diseases and improved stem stability and higher nitrogen use efficiency are necessary to counter-balance targeted reductions. In this regard, it is important to reconsider input intensities according to good local agronomic practice in the intensity 1 variant of trials for assessing value for cultivation and use, targeting less intensive production in future registration trials. This could foster the selection of improved multi-resistant and nitrogen efficient varieties adapted to reduced inputs to support the sustainability of future cereal production.