Introduction

Bread wheat varieties (Triticum aestivum ssp. aestivum) have to fulfill numerous requirements along the supply chain (Thorwarth et al. 2018). Besides high grain yield and disease resistance, which are important for farmers, bread making quality is of great relevance (Oury and Godin 2007). While grain yield can be measured with combine harvesters, good bread making quality encompasses a wide range of properties with time-consuming test methods (Thanhaeuser et al. 2014). Therefore, protein content is often used to predict wheat baking quality because it is quick and easy to measure and serves as the basis for the farmer payment system in many countries (Thorwarth et al. 2018; Boeven and Longin 2019). However, the correlation of protein content with baking volume differs widely across studies (Graybosch et al. 1996; Uhlen et al. 2004; Koppel and Ingver 2010; Maphosa et al. 2015). Furthermore, because protein content and grain yield are negatively correlated, high amounts of nitrogen fertilizer are required to realize high grain yields with acceptable protein contents, with a negative impact on the environment (Zörb et al. 2018). Therefore, more information about dough rheological traits and baking volume needs to be considered (Koppel and Ingver 2010).

The Brabender extensograph is a standard method to investigate dough rheological properties such as dough extensibility and elasticity, which provide insights into important processing properties (Frakolaki et al. 2018). In addition to dough properties and the shape and volume of the final product, other quality aspects such as storage potential and freshness are also relevant for product quality (Freund and Kim 2006). High water absorption promotes a good baking volume, good freshness of the end product and improved storage potential (Puhr and D’Appolonia 1992; Koppel and Ingver 2010). It is also of practical interest because with a high water absorption, less flour is required to reach a certain loaf volume (Koppel and Ingver 2010; Frakolaki et al. 2018).

In the last decade, hybrid wheat breeding has gained much interest in the public and private sector (Boeven and Longin 2019). Hybrid breeding is well established in many outcrossing species but is still under development in wheat (Gupta et al. 2019). A mid-parent heterosis for grain yield of approximately 10% has been reported for hybrid bread and durum wheat (Gowda et al. 2010; Thorwarth et al. 2018). For protein content, negative heterosis values were reported, which might be explained by the negative correlation between grain yield and protein content (Oury and Godin 2007; Thorwarth et al. 2018, 2019; Boeven and Longin 2019). However, some recent studies have shown that hybrid bread and durum wheat can combine good sedimentation values and acceptable protein content with high grain yield (Thorwarth et al. 2018; Akel et al. 2019; Boeven and Longin 2019). To the best of our knowledge, no study has investigated other bread making quality traits such as extensograph traits and baking volume in a considerably large number of bread wheat hybrids and their parental lines.

We therefore investigated 35 male and 73 female lines and 119 of their single-cross hybrids at three different locations for grain yield, dough quality and baking volume. Our objectives were to (1) evaluate variance components, trait correlations and heritabilities for the examined quality traits, (2) estimate the extent of mid- and better-parent heterosis, (3) assess the association of mid-parent value, line per se performance and GCA effects with hybrid performance and (4) evaluate the potential of hybrids to combine good bread making quality with high grain yield.

Materials and methods

Plant material and field experiments

The initial study was based on phenotypic data of 236 elite winter bread wheat lines (Triticum aestivum ssp. aestivum) used as parents representing Central European diversity and their 1744 single-cross hybrid progenies. Lines were divided into two groups of 40 males and 196 females, considering pollination ability, flowering time and plant height. Hybrids were produced in an incomplete factorial mating design. Additionally, 11 check varieties were included (Colonia, Elixer, Hystar, Hybred, JBAsano, Julius, KWSLoft, LGAlpha, RGTReform, Rumor and Tobak). The initial experimental setup was described by Zhao et al. (2021). Due to capacity limits in the baking quality analytics, the presented data were based on a selected fraction of the initial plant material comprising 35 male and 73 female lines and 119 of their single-cross hybrids. This fraction was selected according to criteria such as varying line per se and hybrid performance for agronomic traits, protein content and sedimentation value or diversity in glutenin bands. This selection led to a representative subsample of the initial 1744 hybrids as reflected by a PCA (Principal Component Analysis) based on the BLUEs (Best Linear Unbiased Estimates) for grain yield, protein content and sedimentation value (Suppl. Fig S2). Furthermore, with that selection almost half of the females and males were represented in ≥ 2 and ≥ 3 hybrid combinations (Suppl. Table 1). Samples of the selected fraction were taken from the locations Hadmersleben (51°59'31''N, 11°18'5''E, Germany, mean temp. 11.2 °C and precipitation 334.2 mm in 2018), Seligenstadt (50°3'N, 8°59'E, Germany, mean temp. 11 °C and precipitation 470.8 mm in 2018) and Gola (51°49'16.2''N, 16°52'32.7''E, Poland, mean temp. 10.3 °C and precipitation 781.8 mm in 2018). Experiments were conducted during the growing season 2018. In each environment, the experimental design consisted of three trials. In each trial, an un-replicated alpha lattice design was used.

Different genotypes were evaluated in different trials linked by the 11 common checks. Plot size ranged between 5.70 and 9 m². All plots were treated with fertilizer, pesticides and fungicides according to farmers’ practice for intensive wheat production.

Grain yield was measured in tons per hectare (t/ha) with an adjusted moisture content of 14%. Protein content (%) was determined with near-infrared reflectance (NIR) spectrometry (ICC standard method 159, ICC, Vienna, Austria). Wet gluten content (%) was measured with a Perten Glutomatic (ICC standard method 137/1, ICC, Vienna, Austria). Sedimentation value was determined according to Zeleny (ICC standard method 116/1, ICC, Vienna, Austria). Five quality traits were assessed with the Brabender extensograph (ICC standard method 114/1, ICC, Vienna, Austria) but with only one measurement instead of two for each trait. These traits were extensibility (mm), resistance to extension (EU), the ratio of resistance to extension to extensibility, energy (cm²) as the area under the extensograph curve and water absorption (%) of the dough. Maltose (%) was determined according to the Berlin method (Klüver 1994). Browning was visually scored from 1 (high intensity) to 5 (low intensity) by an experienced cereal scientist (Quality Lab Aberham, Augsburg, Germany). Baking volume (ml) was determined with a rapid method adapted from the Rapid Mix test (ICC standard method 131, ICC, Vienna, Austria) but with a lower amount of flour and test bread rolls (Quality Lab Aberham, Augsburg, Germany). All traits and abbreviations are summarized in Table 1.

Table 1 Agronomic and quality traits assessed in the study

Phenotypic data analyses

As suggested by Bernal-Vasquez et al. (2016), an outlier detection was performed before analyzing the phenotypic data following their method 4, “Bonferroni-Holm with re-scaled median absolute deviation standardized residuals.” The best linear unbiased estimates (BLUEs) were then calculated based on the following mixed model (1):

$$y_{ikl} = \mu + g_{i} + e_{l} + t_{kl} + \left( {ge} \right)_{il} + {\mathcal{E}}_{ikl} ,$$

where \({y}_{ikl}\) is the phenotypic observation of the ith genotype within the kth trial at the lth environment. The intercept is denoted as \(\mu\), \({g}_{i}\) is the effect of the ith genotype, \({e}_{l}\) the effect of the lth environment, where an environment is defined as a location-by-year combination, \({t}_{kl}\) is the effect of the kth trial at the lth environment, \({(ge)}_{il}\) represents the interaction effect between genotype and environment, and \({\mathcal{E}}_{ikl}\) is the residual term of \({y}_{ikl}\). For the calculation of the BLUEs, all effects except \({g}_{i}\) were taken as random. The error variance was assumed to be heterogeneous.
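The outlier screening that precedes model (1) can be illustrated with a small sketch. It is a simplified reading of the "Bonferroni-Holm with re-scaled MAD standardized residuals" idea and not the authors' implementation: residuals are assumed to come from an already fitted model, the scaling constant 1.4826 and the normal reference distribution are standard choices that we assume here, and the function name `holm_mad_outliers` is hypothetical.

```python
import math
from statistics import median


def holm_mad_outliers(residuals, alpha=0.05):
    """Flag residuals as outliers (True) using MAD-standardized residuals
    and a Bonferroni-Holm step-down test. Illustrative sketch only."""
    n = len(residuals)
    med = median(residuals)
    # Re-scaled MAD: 1.4826 makes the MAD comparable to a standard
    # deviation under normality (assumed constant).
    mad = 1.4826 * median([abs(r - med) for r in residuals])
    if mad == 0:
        return [False] * n  # no spread, nothing can be flagged
    z = [r / mad for r in residuals]
    # Two-sided p-values under a standard normal reference (assumption).
    p = [math.erfc(abs(v) / math.sqrt(2)) for v in z]
    flags = [False] * n
    # Holm step-down: compare the smallest p-value with alpha/n,
    # the next with alpha/(n-1), and stop at the first non-rejection.
    order = sorted(range(n), key=lambda i: p[i])
    for rank, i in enumerate(order):
        if p[i] <= alpha / (n - rank):
            flags[i] = True
        else:
            break
    return flags
```

In practice the flagged observations would be inspected or set to missing before the mixed model is refitted; that decision is left to the analyst.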

With an extended model (2), we dissected genetic variance into variances for general (GCA) and specific combining ability (SCA) to evaluate GCA, SCA and heterotic effects (Beukert et al. 2020):

$$y_{ijkl} = \mu + e_{l} + t_{kl} + g_{ij} + m_{i} + f_{j} + s_{ij} + \left( {me} \right)_{il} + \left( {fe} \right)_{jl} + {\mathcal{E}}_{ijkl} ,$$

where \({y}_{ijkl}\) is the phenotypic observation of lines \((i=j)\) or hybrids \((i\ne j)\), and hybrids are denoted as a cross between the ith parent and the jth parent. \(\mu\) is the overall population mean, and \({e}_{l}\) and \({t}_{kl}\) follow the same notation as in model (1). \({g}_{ij}\) refers to the genotypic effect of parental lines, \({m}_{i}\) is the GCA effect of the ith male parent, \({f}_{j}\) the GCA effect of the jth female parent and \({s}_{ij}\) represents the SCA effect of the cross between the ith male and the jth female. Interaction effects between GCA effects of the ith male and the jth female with the lth environment were modeled as \({(me)}_{il}\) and \({(fe)}_{jl}\). \({\mathcal{E}}_{ijkl}\) is the residual term of \({y}_{ijkl}\). All effects except \(\mu\) were modeled as random effects, and the error variance was assumed to be heterogeneous for each environment. For simplicity of illustration, however, the average of these error variances is shown in Table 2. Significance of variance components was based on their z-ratios.

Table 2 Estimates of variance components, heritabilities, mean and range of mid-parent (MPH) and better-parent (BPH) heterosis, correlations between mid-parent values and hybrid performance r(MP, HYB), general combining ability effects and hybrid performance r(GCA, HYB) as well as general combining ability effects and line per se performance r(GCA, per se)

Broad-sense heritability was computed separately for hybrids and lines following Piepho and Möhring (2007):

$$H^{2} = \frac{{\delta_{g}^{2} }}{{\delta_{g}^{2} + \frac{v}{2}}} ,$$

where \(v\) is the mean variance of a difference of two adjusted treatment means and \({\delta }_{g}^{2}\) the genetic variance estimated with model (2). The mid-parent heterosis (MPH) was calculated as MPH = HYB–MP and better-parent heterosis as BPH = HYB–\({\mathrm{P}}_{\mathrm{max}}\), where \({\mathrm{P}}_{\mathrm{max}}\) is the performance of the better parent and HYB the performance of hybrids. MP is the mid-parent performance, determined as MP = (\({P}_{1}\)+\({P}_{2}\))/2, where \({P}_{1}\) and \({P}_{2}\) are the performances of the parents of a specific hybrid. In addition, we calculated Pearson’s correlation coefficient of MP with HYB, r(MP, HYB), of the sum of GCA effects with HYB, r(GCA, HYB), and of GCA effects with line per se performance, r(GCA, per se). We also calculated Pearson’s correlation coefficients of phenotypic values among all traits.
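The quantities defined above can be written down in a few lines. The following sketch uses hypothetical function names and treats `v` as the mean variance of a difference of two adjusted means, following Piepho and Möhring (2007). Expressing heterosis in percent of the mid-parent value is our assumption, motivated by the percentages reported in Table 2, and `higher_is_better` is an illustrative switch for traits scored in the opposite direction.

```python
import math


def mid_parent_value(p1, p2):
    """MP = (P1 + P2) / 2."""
    return (p1 + p2) / 2


def mid_parent_heterosis(hyb, p1, p2):
    """MPH = HYB - MP (absolute units of the trait)."""
    return hyb - mid_parent_value(p1, p2)


def better_parent_heterosis(hyb, p1, p2, higher_is_better=True):
    """BPH = HYB - Pmax, where Pmax is the better parent."""
    p_best = max(p1, p2) if higher_is_better else min(p1, p2)
    return hyb - p_best


def relative_mph(hyb, p1, p2):
    """MPH in percent of the mid-parent value (assumed convention)."""
    mp = mid_parent_value(p1, p2)
    return 100 * (hyb - mp) / mp


def broad_sense_heritability(var_g, v):
    """H^2 = var_g / (var_g + v / 2)."""
    return var_g / (var_g + v / 2)


def pearson_r(x, y):
    """Pearson correlation, e.g. r(MP, HYB) across a set of hybrids."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)
```

For a hypothetical hybrid with a baking volume of 720 ml and parents at 700 and 680 ml, `mid_parent_heterosis(720, 700, 680)` gives 30 ml and `better_parent_heterosis(720, 700, 680)` gives 20 ml.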

All analyses were performed within the R software (R Core Team 2021) and the software ASReml 3.0 (Gilmour et al. 2009).

Results

The 108 parental lines and their 119 hybrids varied largely in all traits considered, resulting in highly significant genetic variances for all traits (Table 2). The sum of GCA variances was considerably smaller than the genetic variance within the parental lines except for grain yield. For instance, for baking volume, the genetic variance for parental lines was 3007.55 ml² (p < 0.01) but the sum of GCA variances in their hybrids only 2134.46 ml². Thus, the exploitable variance for selection is higher for line than for hybrid breeding. The GCA variance for male lines was lower than that for female lines across all traits, which may be due to the fact that we used only 35 male lines but 73 female lines in our study. The \({\upsigma }_{\mathrm{SCA}}^{2}\) /\({\upsigma }_{\mathrm{GCA}}^{2}\) ratio was low for all traits ranging from 0 for energy value of extensograph to 0.29 for browning of the bread crust.

Estimates of variance components have large standard errors, which increase with decreasing trait heritability and with increasing complexity of the variance component, e.g., SCA versus GCA. For instance, for baking volume, the standard error of the SCA variance was 127% in relation to the SCA variance, while for GCA it was 20% and 37% for males and females, respectively (data not shown). For grain yield with considerably lower heritability, it was more pronounced with a standard error of the SCA variance of 2.21 (data not shown). We therefore speculate that for grain yield the SCA variance in our study is underestimated, as reflected by larger SCA variances reported in the literature based on larger numbers of hybrids and locations (Oettler et al. 2005; Longin et al. 2013; Miedaner et al. 2017; Zhao et al. 2021). Consequently, we concentrate our further discussion on variance components of quality traits.

With the exception of grain yield, genotype-by-environment interaction variances were considerably lower than genetic variances, resulting in high heritability estimates for all traits. Furthermore, heritability estimates were comparable for parental lines and hybrids with a tendency toward slightly lower values for hybrids for a few traits.

The average mid-parent heterosis was positive for grain yield, sedimentation value, all dough traits measured with the extensograph (EX, REX, R, EN), browning of the bread crust, and baking volume (Table 2), while it was negative for the remaining traits. Regarding average better-parent heterosis, positive values were determined only for grain yield. Nevertheless, a wide range of trait values was found within the groups of male and female lines as well as hybrids across all quality traits (Fig. 1). Differences between the groups of males, females, checks and hybrids were therefore not significant for all quality traits except for protein content, which was slightly higher in parental lines than in hybrids. It is important to note that the quality analysis of hybrids is based on F2 kernels, i.e., the harvest of the F1 hybrids, while agronomic traits were measured on the F1 plants. Nevertheless, millers and bakers will also receive the harvest of the F1 hybrids, i.e., F2 kernels, and consequently our quality analyses reflect practice. The correlation of mid-parent and hybrid performance varied largely from 0.42 (p < 0.01) for grain yield up to 0.84 (p < 0.01) for energy value of extensograph and baking volume (Table 2). The correlation of GCA and hybrid performance was larger than the correlation of mid-parent and hybrid performance for all traits.

Fig. 1 Boxplots for different traits grouped by checks, males, females and hybrids. Means between groups with a common letter for a given trait do not differ significantly from each other based on Tukey’s test

Grain yield was moderately negatively correlated with all quality traits except for maltose content and browning of the bread crust (Table 3). The correlation coefficients for the parental lines and hybrids were similar in magnitude. Baking volume correlated highest with sedimentation value at 0.89 (p < 0.01) for the lines and hybrids. The second highest correlation to baking volume was recorded for the energy value of extensograph with 0.83 (p < 0.01) and 0.87 (p < 0.01) for the lines and hybrids, respectively. In contrast, the correlation coefficients between baking volume and protein or gluten content were considerably lower.

Table 3 Phenotypic correlation coefficients among 12 traits determined either for 119 F1-hybrids (above diagonal) or for inbred lines (below diagonal; consisting of 35 male lines, 73 female lines and 11 checks)

The negative correlation of grain yield with many quality traits was also visible when quality traits were plotted against grain yield (Fig. 2). The large variation in parental lines and hybrids within the individual traits, however, allows for selection of “outliers” from that negative correlation having high yield and good quality. For instance, considering a baking volume greater than 700 ml (Fig. 2h), there was still a wide variation in grain yield available for selection ranging from 7.7 to 9.7 t/ha. Interestingly, the best line in that quality class had a yield of only 9.2 t/ha, showing an advantage of hybrids of 0.5 t/ha. This advantage of hybrids was also confirmed when a selection step was applied to sedimentation value taking either the 20% best or worst genotypes before plotting their baking volumes against grain yield (Fig. 3).
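The comparison described above, i.e., the highest grain yield per group above a quality threshold, can be sketched as follows. The genotype names and the small data set are purely hypothetical. Only the threshold of 700 ml and the yields of 9.7 and 9.2 t/ha echo values quoted in the text, and `best_yield_by_group` is an illustrative helper, not part of the original analysis.

```python
from typing import NamedTuple


class Genotype(NamedTuple):
    name: str
    group: str            # "line" or "hybrid"
    grain_yield: float    # t/ha
    baking_volume: float  # ml


def best_yield_by_group(genotypes, min_volume):
    """Return the highest-yielding genotype per group among those
    whose baking volume exceeds min_volume."""
    best = {}
    for g in genotypes:
        if g.baking_volume <= min_volume:
            continue
        current = best.get(g.group)
        if current is None or g.grain_yield > current.grain_yield:
            best[g.group] = g
    return best


# Hypothetical toy data, not taken from the study.
toy = [
    Genotype("L1", "line", 9.2, 710.0),
    Genotype("L2", "line", 9.5, 650.0),   # high yield but below threshold
    Genotype("H1", "hybrid", 9.7, 705.0),
    Genotype("H2", "hybrid", 8.1, 730.0),
]

best = best_yield_by_group(toy, min_volume=700.0)
advantage = best["hybrid"].grain_yield - best["line"].grain_yield
print(f"Hybrid yield advantage at > 700 ml: {advantage:.1f} t/ha")
```

The same filter can be applied to any other quality trait by swapping the attribute used for the threshold.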

Fig. 2 Linear regression plots of a protein content, b wet gluten content, c sedimentation value, d resistance to extension, e ratio of REX/EX, f energy, g water absorption and h baking volume on grain yield. The regression line of check varieties is colored in red, of hybrids in green and of lines in blue. R²adj represents the adjusted R² for the regression lines of checks, hybrids and lines, respectively (color figure online)

Fig. 3 Baking volume plotted against grain yield of lines (filled symbols) and hybrids (empty symbols) belonging to the 20% best (circles), 20% worst (triangles) genotypes regarding sedimentation value and check varieties (red crosses) (color figure online)

Discussion

Contrary to previous assumptions, several studies showed that hybrid bread and durum wheat can combine good sedimentation values and acceptable protein content with high grain yield (Thorwarth et al. 2018; Akel et al. 2019; Boeven and Longin 2019). However, more information is required for bread making in terms of dough properties and baking volume, which, to the best of our knowledge, have not yet been investigated in a larger number of hybrids and parental lines.

Hybrid wheat can have good baking quality

We observed an average mid-parent heterosis for baking volume of 0.67% (Table 2). Average better-parent heterosis (BPH) was −4.15% but ranged up to 14.47%. Furthermore, no significant difference in baking volume was observed between the groups of male and female lines, check varieties as well as hybrids (Fig. 1h). Similarly, within the ten genotypes with highest baking volume we found six parental lines and four hybrids (data not shown). Thus, a hybrid wheat variety can have a baking volume similar to the best quality line variety.

Moreover, hybrid wheat has on average a higher grain yield at a given level of baking volume. For instance, assuming a baking volume of 700 ml, we found only two parental lines with a grain yield higher than 9 t/ha surrounded by plenty of hybrids (Fig. 2h). Similarly, the widely grown line variety RGTReform had a baking volume of 719.8 ml and a grain yield of 9.0 t/ha in our trial. We found one hybrid with ~ 3% higher grain yield and slightly better baking volume but no parental line with better grain yield at the given quality level. Similar results were found for sedimentation value or dough traits measured by the extensograph. Average better-parent heterosis values were negative but ranged widely up into positive values (Table 2), leading to yield advantages of hybrids at given levels of sedimentation values or extensograph traits (Fig. 2). This confirms previous studies on sedimentation value in durum and bread wheat (Thorwarth et al. 2018; Akel et al. 2019). No comparable literature was available for heterotic effects of extensograph traits or baking volume.

By contrast, average mid-parent heterosis was negative for protein content and wet gluten content (Table 2). Although a wide range in heterosis values was visible for both traits, genotypes with the highest protein or wet gluten content belonged mainly to the group of lines (Fig. 2a, b). This confirms previous studies on protein content of durum and bread wheat (Gowda et al. 2010; Thorwarth et al. 2018) and might be explained by the negative correlation of grain yield and protein content (Table 3), which tends to result in low protein contents in genotypes with high grain yield.

Similarly, average mid- and better-parent heterosis was negative for the trait water absorption (Table 2) and genotypes with the highest water absorption predominantly belonged to the group of lines (Fig. 2g). Water absorption correlated positively with protein and wet gluten content in both lines and hybrids (Table 3). Thus, the negative average heterosis might be explained by the amount of gluten, which is reported to be partly responsible for water absorption in wheat dough (Wieser 2007; Kaushik et al. 2015). Water absorption is also influenced by kernel hardness, which is a trait influenced by few major and many minor genes (Mikulikova 2007; Mohler et al. 2012). However, we did not investigate kernel hardness, so further research on this trait is required for quality breeding in hybrids.

Ignoring very high values of protein content (> 14%), wet gluten content (> 32%) and water absorption (> 55%) led to the same observation as discussed above for sedimentation value or baking volume: At a given quality trait level, hybrids almost always have a higher grain yield than their parental lines (Fig. 2). For instance, taking RGTReform, currently the most popular wheat variety in Germany, as reference, we found hybrids which had similar or better baking volume, sedimentation value, protein content and wet gluten content but around 4% higher grain yield. In summary, hybrid wheat can have high bread making quality expressed in baking volume, sedimentation value or dough properties at acceptable levels of protein and wet gluten content but at higher grain yield compared with line varieties.

How to breed hybrids with high baking quality?

We determined a low ratio of \({\upsigma }_{\mathrm{SCA}}^{2}\) /\({\upsigma }_{\mathrm{GCA}}^{2}\) = 0.03 for baking volume (Table 2) which points toward a mainly additive gene action of that trait. Furthermore, the correlation of mid-parent and hybrid performance was high with r = 0.84 (p < 0.01). Similar results were obtained for sedimentation value, water absorption and the energy value of the extensograph. Consequently, high-quality parental lines should be chosen in both heterotic groups to maximize baking quality of hybrids.

In contrast to our results, Oettler et al. (2005), Longin et al. (2013), Miedaner et al. (2017), Thorwarth et al. (2018) and Zhao et al. (2021) reported higher amounts of SCA and lower correlation coefficients between mid-parent and hybrid performance for grain yield or resistance to some diseases. Consequently, these authors recommended predicting hybrid performance based on GCA values rather than on parental line per se performance, which makes hybrid breeding in wheat expensive and slow as compared to line breeding (Boeven and Longin 2019). By contrast, our findings of a high correlation between mid-parent and hybrid performance for the most important wheat quality traits enable selection of parental lines based on their per se values, which largely facilitates breeding for high bread making quality in hybrid wheat.

Correlation coefficients between grain yield and quality traits as well as between quality traits were of similar magnitude in hybrids and lines (Table 3). For instance, baking volume correlated highest with sedimentation value and the energy value of extensograph while considerably lower with protein and wet gluten content in lines and hybrids. Similarly, protein content, wet gluten content, sedimentation value and baking volume were moderately negatively correlated with grain yield in parental lines and hybrids. Thus, knowledge collected across decades on quality selection in line breeding could be directly applied in hybrid wheat breeding.

Breeding for high-quality hybrids

The mainly additive gene action for the most important quality traits discussed above calls for quality breeding in both heterotic groups. Selection for the important trait baking volume cannot be performed in early generations, as testing needs a lot of grains, requires standardized milling and baking and is expensive and slow. Sedimentation value, by contrast, has a high heritability and a high correlation to baking volume. Only a few grams of flour are required, several hundred tests can be run within a day and the correlation between mid-parent and hybrid performance is high. This makes sedimentation value an interesting trait to be measured as soon as possible in early generations. To validate its use, we selected the 20% best and worst lines and hybrids in sedimentation value and plotted their baking volume against their grain yield (Fig. 3).
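The selection step used for this validation can be sketched as a simple ranking. The helper name `split_best_worst`, the rounding rule for the group size and the example records are assumptions for illustration and are not taken from the study.

```python
def split_best_worst(records, key, fraction=0.2):
    """Return (best, worst) groups holding the given fraction of records,
    ranked by key with higher values counted as better."""
    n = max(1, round(len(records) * fraction))
    ranked = sorted(records, key=key, reverse=True)
    return ranked[:n], ranked[-n:]


# Hypothetical (name, sedimentation value in ml) pairs.
genotypes = [
    ("G01", 62), ("G02", 35), ("G03", 48), ("G04", 55), ("G05", 29),
    ("G06", 41), ("G07", 67), ("G08", 38), ("G09", 52), ("G10", 44),
]

best, worst = split_best_worst(genotypes, key=lambda g: g[1])
print("best 20%: ", [name for name, _ in best])
print("worst 20%:", [name for name, _ in worst])
```

Baking volume and grain yield of the two groups can then be plotted against each other, applying the split separately to lines and hybrids if the groups are to be compared.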

In our opinion, three important points became visible. First, the groups selected for sedimentation value were clearly separated regarding baking volume; thus, selection on sedimentation value is effective for improving baking volume. Second, irrespective of the selected group, hybrids had a higher grain yield at a given quality level, underlining our findings discussed above. Third, grain yield in lines and hybrids was higher in the group with low sedimentation value and baking volume. Selection on quality traits might therefore lead to reduced yield potential, which can be explained by the negative correlation between grain yield and most quality traits. Thus, in early generations, the breeder has to compromise between selection for the required quality traits and selection to maintain or even improve a high yield level.

In later breeding generations, testing for baking volume is required, as even correlation coefficients between sedimentation value and baking volume of around 0.90 can lead to misclassifications in that important trait. Furthermore, the correlation between GCA and hybrid performance is considerably higher than the correlation between mid-parent and hybrid performance for baking volume (Table 2, Suppl. Figure 1). Thus, selection within heterotic groups should be based on GCA values as soon as they become available, at the latest after the first yield tests of new parental lines. As the correlation of mid-parent and hybrid performance is low for grain yield (Table 2), efficient hybrid breeding for grain yield requires GCA values as soon as possible in the breeding program. This also makes it possible to estimate GCA values for baking volume, at least for agronomically promising new parental lines.

Application of genomic selection has been shown to be very promising for facilitating hybrid wheat breeding in agronomic traits (Zhao et al. 2013, 2014; Basnet et al. 2019). In line breeding, genomic selection was already shown to be of high interest especially in quality selection due to the slow and expensive procedure of baking tests (Battenfield et al. 2016; Michel et al. 2018). Thus, further research is urgently required to evaluate the use of genomic selection for quality breeding in hybrid wheat (Thorwarth et al. 2019).

Conclusions

We clearly showed that wheat hybrids can have excellent quality comparable to the best quality of line varieties. Moreover, for a similar level of baking volume, sedimentation value and important dough traits, the highest grain yield was always achieved by hybrids and not by lines in our study, although the best breeding lines from all German wheat breeding companies were used as parental lines. Only for protein and wet gluten content were lines slightly better than hybrids, but the large variability in these traits allows for selection of hybrids with acceptable amounts of protein and gluten. Baking volume, sedimentation value, energy value from extensograph and water absorption of the dough had high heritabilities, low \({\upsigma }_{\mathrm{SCA}}^{2}\), and high correlations between mid-parent and hybrid performance. On the one hand, this facilitates breeding for quality in hybrid wheat, as per se values of parental lines could be used as predictors for their GCA and hybrid performance. On the other hand, the almost additive gene action requires quality breeding in both heterotic groups. However, in our opinion, these findings have limited importance for the future effectiveness of hybrid versus line breeding in wheat. This mainly depends on rapid improvements in hybrid seed production technologies and on speeding up hybrid programs with genomic selection and other predictive tools.