Introduction

Fusarium head blight FHB, also called scab, is one of the most calamitous diseases affecting cereals such as wheat, barley, maize and oat (Walter et al. 2010; Beres et al. 2018). FHB is mainly caused by Fusarium graminearum and Fusarium culmorum with the former being considered as the fourth most economic and scientifically important plant fungal pathogen worldwide (Dean et al. 2012). The disease’ effects comprise losses in grain yield and quality as well as contamination by mycotoxins like deoxynivalenol (DON) and its acetylated derivatives, which in turn are a serious threat for human and animal health and malting purposes (Darwish et al. 2014; Dweba et al. 2017; Fung and Clark 2004). In comparison to hexaploid bread wheat (Triticum aestivum L.), the tetraploid durum wheat (Triticum turgidum L. ssp. durum) is particularly susceptible to the Fusarium species complex which makes the development of resistant varieties a major breeding goal. Durum wheat is a cereal crop with an annual global production of 39 million of tons corresponding to 5% of the total wheat production (Kadkol and Sissons 2016). Besides bread and bulgur, other essential foodstuffs like couscous and pasta, consumed by hundreds millions people worldwide, are prepared based on semolina that is the granular milled product of durum grains (Fiedler et al. 2017; Tuberosa and Pozniak 2014). A certain level of control over FHB can be achieved with a series of cultural, biological and chemical control measures plus cultivar resistance (Pirgozliev et al. 2003). Moreover, and within a Genotype \(\times\) Environment \(\times\) Management (\(\hbox {G} \times \hbox {E} \times \hbox {M}\)) interaction framework, the development and adoption of cultivars tolerant to FHB is and will remain to be the cornerstone of any strategy aiming to manage this disease (Beres et al. 2018).

FHB resistance (FHBr), is a trait whose quantitative inheritance is controlled by many loci each with small effect which leads to a slow genetic gain per unit of time. More than 500 QTL have been described for FHB resistance on all 21 chromosomes of hexaploid wheat (Buerstmayr et al. 2009; Jia et al. 2018; Liu et al. 2009; Venske et al. 2019), which is not the case in tetraploid wheats where only a small number of loci are reported (Prat et al. 2014). For the latter wheat species FHBr loci have been mapped on all chromosomes except 1A, 1B and 5A, though all of them possess only small or moderate effects compared to the major resistance loci in hexaploid wheat e.g. Fhb1 located on chromosome 3BS, Fhb2 located on 6BS and Qfhs.ifa-5A all derived from the Chinese resistance cultivar “Sumai 3” (Prat et al. 2014; Zhao et al. 2018). In addition, within the durum gene pool there is a narrow genetic variation for FHBr in both the mostly susceptible elite cultivars (Buerstmayr et al. 2003; Miedaner and Longin 2013) and landraces, while among the latter only few have shown some elevated level of resistance (Huhn et al. 2012; Talas et al. 2011). Other wild wheat tetraploids relatives (T. turgidum ssp. dicoccoides, T. turgidum ssp. dicoccum, and T. turgidum ssp. carthlicum) have shown moderate to high resistance levels (Oliver et al. 2007, 2008) although the incorporation of loci from such sources into elite germplasm faces some obstacles like linkage drag, unfavorable epistatic interactions or pleiotropic effects (Kumar et al. 2018; Prat et al. 2014; Zhu et al. 2016). Nevertheless, a couple of studies in this direction have been conducted and reported successful pyramiding of resistance genes from hexaploid wheat together with native durum loci (Prat et al. 2014; Zhao et al. 2018).

Marker assisted selection (MAS) has proved to be a suitable and effective instrument for breeders when targeting a trait with small number of large effect genes like the stripe (yellow) rust resistance gene Yr15 in durum wheat (Yaniv et al. 2015). The actual implementation of MAS for quantitative inherited traits with an underlying complex genetic architecture is though less feasible, while genomic prediction (GP) has overcome such limitations since it uses genome-wide markers to estimate the breeding values of unobserved traits in a panel of selection candidates targeting also the multitude of minor QTL (Meuwissen et al. 2001). Precisely, GP models involving FHB resistance in hexaploid wheat (Arruda et al. 2015, 2016b; Mirdita et al. 2015; Rutkoski et al. 2012; Schulthess et al. 2017; Dong et al. 2018) and in durum wheat (Miedaner et al. 2017; Steiner et al. 2018) have shown higher prediction accuracy and larger selection gain than traditional MAS approaches. With few exceptions (Schulthess et al. 2017; Steiner et al. 2018), suggestions and proofs of the advantages of including correlated traits in GP models have been barely explored for FHBr in wheat, even though the well-known associations with others traits like heading date or plant height are obvious. In this regard, earlier flowering genotypes are generally more susceptible to FHB and most of the published evidence support the existence of such negative and significant effect in both bread wheat (Gervais et al. 2003; He et al. 2016; McCartney et al. 2016; Paillard et al. 2004; Schmolke et al. 2005; Yi et al. 2018) and durum wheat (Buerstmayr et al. 2012; Miedaner et al. 2017; Miedaner and Longin 2013; Prat et al. 2017). Loci affecting both traits were found to overlap specifically on chromosomes 4A, 5B, 6A, 6D and 7B in various of the mentioned studies, although the correlation FHBr-heading date might be also highly dependent on environmental factors like temperature, rainfall and humidity (Miedaner and Longin 2013). Another trait studied in the host response to FHB is plant height where previous findings suggest a clear negative and significant trade-off between shortness and resistance to FHB in bread and durum wheat (Buerstmayr et al. 2012; Miedaner and Longin 2013; Prat et al. 2017; Talas et al. 2011). One of the most relevant height-reducing mutant alleles, Rht-B1b, has been intensively introgressed into elite durum wheat germplasm since the 1960s’ and important loci for FHBr have been mapped at the Rht-B1 position on chromosome 4B, (Buerstmayr et al. 2012; Miedaner et al. 2017; Prat et al. 2017; Steiner et al. 2018) suggesting a pleiotropic effect of the latter on susceptibility to FHB (Srinivasachary et al. 2009).

In view of the findings described by Miedaner et al. (2017) regarding to the potential and improvements needed to implement GP methods for FHB, the scope of such endeavor has thus to be widened to deal with the multiple involved traits. Hence, the main objectives of this study were (1) to evaluate the implementation of different approaches in order to include the records of two covariates namely heading date and plant height into GP models and (2) to assess the performance of the developed GP models in terms of prediction accuracy and selection gain for all involved traits.

Materials and methods

Plant material

The international diversity panel in matter contained 170 winter and 14 spring types, including modern and old cultivars as well as current breeding lines. Field trials were conducted at the experimental stations Heidfeldhof (Hoh) and Oberer Lindenhof (Oli) of the University of Hohenheim, Stuttgart, Germany both in the cropping season 2013 and 2014 as well as at the experimental site of the Department of Agrobiotechnology Tulln in Austria (Tul) during 2014. The combination of location and year provided five environments (Hoh13, Hoh14, Oli13, Oli14, Tul14), in which experiments were laid out as \(\upalpha\)-lattice designs with three replications in the locations Hoh, Oli and two replications in Tul.

The Fusarium inoculation procedure and its details can be found in the studies of Miedaner et al. (1996) and Miedaner et al. (2017). In short, the whole experiment was inoculated by \(2 \times 10^{5}\) conidiospores \(\hbox {ml}^{-1}\) with a machine-driven small-plot field sprayer (Hoh, Oli) or a motor-driven backpack sprayer (Tul) several times during flowering. The inoculation dates were spread across the whole flowering period of the experiment in a way that each genotype was inoculated at least once at full flowering. In Tul, the crop canopy was kept moist by mist irrigation during 20 hr after inoculations. Fusarium head blight (FHB) severity was scored several times by the visual evaluation of all spikelets in a plot rating from 1 to 9, where 1 stands for no visible symptoms and 2 to 9 respectively means: < 5%, 5–15%, 15–25%, 25–45%, 45–65%, 65–85%, 85–95% and \(>\,\)95%. Mean FHB severity (FHBms) will be the term referred in this study to the arithmetic mean of six individual plot ratings measured from the beginning of the symptoms development (11–15 days after inoculation) and repeated in a 3-days interval until the beginning of the yellow ripening stage. Heading date (HD) was noted as the day in the year when 75% of the ears of the plot emerged to 75% and plant height (PH) as the cm from the ground to the middle of the ear at EC70 stage.

Phenotypic data analysis

A two-step approach, as described by Möhring and Piepho (2009), was implemented in order to obtain the Best Linear Unbiased Estimates (BLUE) of each genotype for each of the traits. Firstly, each environment was analyzed by a linear mixed model of the form:

$$\begin{aligned} y_{ikm}=\mu +g_{i}+r_{k}+b_{m}+e_{ikm} \end{aligned}$$
(1)

where \({y}_{ikm}\) is the phenotypic observation of a trait of interest, \({\mu }\) is the overall mean, \({g}_{i}\) is the effect of the ith genotype, \({r}_{k}\) is the effect of the kth replicate, \({b}_{\textit{km}}\) is the effect of the mth block nested within the kth replicate, and \({e}_{\textit{ikm}}\) is the residual effect. All the effects except \({g}_{{i}}\) were considered as random. In a second step, BLUEs were calculated across the multi-environment trails using the linear mixed model:

$$\begin{aligned} y_{ij}=\mu +g_{i}+E_{j}+e_{ij} \end{aligned}$$
(2)

where \({y}_{\textit{ij}}\) is the BLUEs from the first step, \({E}_{{j}}\) is the effect of the jth environment and \({e}_{\textit{ij}}\) is the residual term accounting for Genotype \(\times\) Environment interaction and the residual effect. The effect of genotypes was considered here as fixed as in the first model.

Variance components were estimated by restricted maximum likelihood (REML) to estimate the repeatability and entry-mean heritability, (\(\hbox {H}^{2}\)), within each environment and across them as described by Piepho and Möhring (2007) (Eq. 19) using instead the model described in section 2.2 in Miedaner et al. (2017). The models were implemented in the package sommer for R (Covarrubias-Pazaran 2016). Accessions’ id and the estimated BLUEs for each environment can be found as supplementary material.

Correction of the FHBms estimators

Three methods were taken to correct BLUE of FBHms in each of the environments and across them. The first approach, ADJ.A, considered the residuals from the linear regression of FHBms on the covariates HD and PH, used both separate and simultaneously; in a general form following the equation:

$$\begin{aligned} y=1\beta _{0}+x_{HD}\beta _{HD}+x_{PH}\beta _{PH}+e \end{aligned}$$
(3)

where y is a vector of BLUEs for FHBms (dependent variable), \({1 \beta }_{0}\) is the intercept, while \({x}_{\textit{HD}}\) and \({x}_{\textit{PH}}\) are the BLUEs of the covariates HD and PH estimated within each environment by model (1), \(\beta _{\textit{HD}}\) and \(\beta _{\textit{PH}}\) are the regression coefficients and \(\epsilon\) are the residuals. Finally, the mean of the FHBms was added to the vector of residuals to be use as response variable in the subsequent analysis.

In the second method, denoted as ADJ.B, we investigated the approach suggested by Emrich et al. (2008), Miedaner et al. (2006), and included plant height and heading date as fixed covariates when calculating BLUEs resulting in an extension of model (1) to:

$$\begin{aligned} y_{ikm}=\mu +g_{i}+r_{k}+b_{m}+\beta _{HD}c1_{ikm}+\beta _{PH}c2_{ikm}+e_{ikm} \end{aligned}$$
(4)

where \(c1_{ikm}\) and \(c2_{ikm}\) modeled the plot basis observations of HD and PH, same as in above method \(\beta _{\textit{HD}}\) and \(\beta _{\textit{PH}}\) are the regression coefficients, while analogues to the adjustments made by model (3) either both traits were considered simultaneously or alone. The corrected FHBms values obtained from this method were defined as the effect \({g}_{i}\) from model (4) plus the overall mean of the FHBms.

The third correction method ADJ.C was adopted from Rapp et al. (2018), and consisted in the combination of the target trait and the covariate(s) in a multi-variate mixed model, following this equation:

$$\begin{aligned} y_{ijkm}^{T}=\mu ^{T}+g_{i}^{T}+E_{j}^{T}+(gE)_{ij}^{T}+r_{jk}^{T}+b_{jkm}^{T}+e_{ijkm}^{T} \end{aligned}$$
(5)

where each of effects had the same nomenclature as used in models (1) and (2), except for the superscript T which denotes that each term is an array of \(n \times t\) dimension being t the number of traits involved. The initial assumption is that all the effects are considered as random and each follows a multi-variate normal MVN distribution, like the one for the genetic effects \({g}_{\textit{FHBms}}, {g}_{\textit{HD}}, g_{\textit{P}{\mathrm{H}}}\):

$$\begin{aligned} \begin{pmatrix} g_{FHBms} \\ g_{HD} \\ g_{PH} \\ \end{pmatrix} \sim MVN \begin{bmatrix} \begin{pmatrix} 0 \\ 0 \\ 0 \\ \end{pmatrix},\begin{pmatrix} \sigma _{FHBms}^2 &{} r_1\sigma _{FHBms}\sigma _{HD} &{} r_2\sigma _{FHBms}\sigma _{PH}\\ r_1\sigma _{FHBms}\sigma _{HD} &{} \sigma _{HD}^2 &{} r_3\sigma _{HD}\sigma _{PH}\\ r_2\sigma _{FHBms}\sigma _{PH} &{} r_3\sigma _{HD}\sigma _{PH} &{} \sigma _{PH}^2\\ \end{pmatrix} \end{bmatrix} \end{aligned}$$
(6)

where diagonal \(\sigma ^{2}\) terms stands for the respective genetic variances; the genetic correlation of FHBms with HD is: \({r}_{1}\), with PH: \({r}_{2}\) and between covariates: \({r}_{3}\). Then, the residuals from the model (5) were taken, similarly as in Thorwarth et al. (2018), as the corrected values for FHBms according to:

$$\begin{aligned} FHBms^{ADJ.C}=g_{FHBms}-\frac{r_{1}\sigma _{FHBms}}{\sigma _{HD}}g_{HD}-\frac{r_{2}\sigma _{FHBms}}{\sigma _{PH}} g_{PH} \end{aligned}$$
(7)

Genotypic data

After a refinement step to exclude markers either with a minor allele frequency (MAF) of < 5% or with a percentage of missing data higher than 20%, as stated in Sieber et al. (2017); a final set of 23,542 DArT (Diversity Arrays Technology Yarralunla, Australia) markers and 7070 single nucleotide polymorphism (SNPs) markers were yielded. Missing data was imputed as the average score of each marker by the ”mean” method included in rrBLUP R package (Endelman 2011). Marker data is available for downloading as supplementary material.

Genomic predictions

Genomic best-linear unbiased predictions were obtained from the following model:

$$\begin{aligned} y=X\beta +Zg+e \end{aligned}$$
(8)

where y contains the calculated FHBms estimates with or without any adjustment; \(\beta\) is the vector of the fixed effects containing the overall mean; g is the vector of genomic estimated breeding values (GEVB) \([g\sim N(0,G\sigma _{g}^{2})]\) with \(\hbox {g}^{2}\) accounting for the genotypic variance estimated by REML and G stands for the genomic relationship matrix obtained according to the description of VanRaden (2008): \(G=ZZ'/2\sum p_{j}(1-p_{j})\) where Z is a \(n \times m\) matrix of m markers and n individuals with elements \(z_{ij}=x_{ij}-2p_{j}+1\) and \({x}_{{\mathrm{ij}}}\) being the value of a given allele for the ith genotype at the jth locus, and \({p}_{j}\) the allele frequency of the jth marker. X and Z are the design matrices for fixed and random effects respectively and, e is a vector containing the residuals \([e\sim N(0,\sigma ^2_e)]\).

The prediction ability (PA) was calculated by a five-fold cross validation (CV) scheme as the correlation between the GEBVs and the corresponding BLUEs. Briefly, the panel was randomly divided in five equally sized subsets, then four out of those were used as the training set and the left one as the validation set. The procedure was repeated 40 times totaling 200 cross validation runs. Mean comparisons between the PAs under any model were estimated with Tukey’s HSD (honest significant difference) test (Tukey 1949) with an alpha level of 5%.

Tuning GBLUP model parameters

The following GBLUP model reformulations included the approach taken by Song et al. (2017), where the covariates effects were directly integrated and considered as fixed in the GBLUP model (8). This approach will be referred later on as G + COV method. In addition, multi-trait GBLUP models (MT.GP) were executed following this general equation:

$$\begin{aligned} y_{t}=X_t\beta _t+M_tg_t+e \end{aligned}$$
(9)

where \({y}_{t}\) is the array for t traits, namely FHBms, HD and/or PH, containing the BLUE values; \({g}_{t}\) is the array of genetic effects GEBVs \([g\sim MVN(0,\sum _g \bigotimes G)]\) with as the complete unstructured variance-covariance matrix \(\begin{pmatrix} \sigma ^2_{g1} &{} cov_{g1,2} &{} cov_{g1,3} \\ cov_{g1,2} &{} \sigma ^2_{g2} &{} cov_{g2,3} \\ cov_{g1,3} &{} cov_{g2,3} &{}\sigma ^2_{g3} \\ \end{pmatrix}\) alongside with its design matrix \({M}_{t}\). The former terms \(\sigma ^2_{g1}\), \(\sigma ^2_{g2}\) and \(\sigma ^2_{g3}\) are the genetic variances of FHBms, HD and PH; \({cov}_{\textit{g1,2}}\), \({cov}_{\textit{g1,3}}\) are the covariance of FHBms with HD and PH respectively, and \({cov}_{\textit{g2,3}}\) is the covariance between the covariates. The residual array follows a distribution: \([e\sim MVN(0,\sum _e \bigotimes\, I_n)]\) where \({I}_{{n}}\) stands for an \(n \times n\) identity matrix and for the variance-covariance matrix similar to the previous one but accounting for the residuals between traits.

Lastly we evaluated the so called trait assisted genomic prediction (TA.GP) (Fernandes et al. 2018; Michel et al. 2017) in which for each CV run, unlike as in the MT.GP model framework described above, the covariate estimators of the validation set were included into the training set and represented pre-existing information about the genotype performance.

Index selection

Genomic selection indexes were afterwards constructed out of the GEBVs obtained in the MT.GP models. Briefly, we used a desired gains index developed by Pešek and Baker (1969) but restricting gains for the covariates and thus allowing only response to selection towards the target trait.

$$\begin{aligned} GEBV_{index}=g_1b_1+g_2b_2+g_3b_3 \end{aligned}$$
(10)

where \({g}_{\textit{1}}\), \({g}_{{2}}\) and \({g}_{{3}}\) are the GEBVs obtained from multi-trait model (9), b is the array containing the indexes for each trait obtained as \(b=\sum _g^{-1}\times a\), with a being the vector with the respective weights keeping always \({a}_{\textit{FHBms}}\) as 1 and \({a}_{\textit{HD}}\), \({a}_{\textit{PH}}\) equal to zero. The genomic variance-covariance matrix \(\varSigma _{\mathrm{g}}\) was derived from model (9). Finally, an alternative \(\varSigma _{\mathrm{g}}\) matrix was derived by calculating the variances and covariances between the GEBVs obtained from the single trait models ST.GP from model (8). Table 1 condense all the GP approaches previously mentioned in the preceding sections.

Table 1 Summary of the different genomic prediction alternatives of FHB mean severity attempted in this study
Table 2 Summary statistics of the phenotypically estimated values of 178 durum wheat genotypes for the traits: heading date (HD), plant height (PH) and Fusarium head blight mean severity (FHBms) as well as their variance components, repeatability and entry-mean heritabilities (H2) and correlations between traits (r) for each of the five environments (Env) and across them
Table 3 Pearson coefficients of correlation between both covariates HD and PH and the adjusted Fusarium head blight mean severity (\(\hbox {FHBms}^{ADJ}\)) values corrected under three methods (A, B, C shortly described in Table 1) singled out by \(\sim\) sign for the covariate(s) adjusted

Expected selection gain

The differences between the mean of BLUE estimators of the best 5, 10, 15 and 20% performing lines and the overall mean for a given validation set were calculated in each CV run to estimate the expected selection gain (\({G}_{\textit{e}}\)) The mentioned differences were standardized, in order to make comparisons between traits, according to:

$$\begin{aligned} G_e=\frac{\mu _{Trait}^{sel}-\mu _{Trait}^{all}}{\sigma _{Trait}} \end{aligned}$$
(11)

where \(\mu _{Trait}^{sel}\) is the trait mean of the selected genotypes ranked based on the FHBms GEBVs and \(\mu _{Trait}^{all}\) stands for the overall mean of the respective trait. When any correction approach was applied to perform the predictions, subscript Trait referred always to the non-adjusted estimations of FHBms. The standardized \({G}_{\textit{e}}\) was also obtained choosing lines based on the ranking on the phenotypic estimation of FHBms (phenotypic selection).

Results

Phenotypic analysis

Plant height records of genotypes in 2014 season were in average greater than in 2013, in which Hoh was the location with the shortest plant stand while Tul had the tallest one and in the across environment analysis plant height was, on average approximately 81.02 cm (Table 2). For the sake of the subsequent analyses we decided to exclude six of the extremely tallest genotypes categorized as outliers based on their across environments’ BLUEs. These “outgroup” lines included several genetic resources with exotic genetic background from the US, Canada, Bulgaria and Ukraine: DGE-1, Agathe, AC Navigator, Cirpan 13, No3026 and Nursith. On average, the heading date was 155 days after January with a significant later heading behavior in 2013. High entry-mean heritabilities are reported here for both covariates analyzed and, moreover, their genetic variances were always higher than both genotype \(\times\) environment and residual variances. In addition, a significant and direct association between both covariates were detected in the environments Hoh13 and Tul14 (Table 2).

The artificial inoculation of Fusarium was successful in all environments, and a broad FHBms variation could be found within the individual environments as well as in the across trail environments analysis ranging from resistant to highly susceptible genotypes. Tul14 and Oli14 were the most and least affected environments, respectively, and the later was the environment with the largest FHBms variation and heritability (Table 2). The updated top resistant lines included the Italian variety ”Belfuggito” released in the 1960’s, the Russian cultivar ”Amazonka”, the cultivar ”Soldur” and several breeding lines from breeding program of the University of Hohenheim. FHBms had a moderate to high heritability, even though its genotypic variance component which was just slightly higher than both the residual and the \(\hbox {G} \times \hbox {E}\) components (Table 2), thus confirming the good quality of the scoring for Fusarium in this multi-environment study.

Both covariates plant height (PH) and heading date (HD), exhibited significantly negative correlations with FHBms of \(\hbox {r}=-\,0.25\) and \(\hbox {r}=-\,0.63\) (\({p}<0.001\)), respectively. Despite of dropping six outlier genotypes for plant height PH, its correlation with FHBms was still highly significant also for the single environment Hoh13. FHBms was in all environments but Tul14 as well as in the multi environment (MET) analysis highly associated with HD as seen by coefficients higher than \(\mathrm{r}=-\,0.60\).

Correction effects

The variability of FHBms BLUEs decreased when they were corrected by either HD or HD plus PH in terms of their coefficient of variation under any of the methods and the same effect was detected for their heritabilities. A comparison between their distributions revealed how their frequencies increased around the mean value being the more pronounced case the \(\hbox {FHBms}^{{\mathrm{ADJ.C}}}\) BLUEs (Supplemental Figure 1).

Amongst the correction methods considering only HD as covariate, ADJ.A and ADJ.C performed nearly similar, both adjusting completely for the unfavorable FHBms-HD correlation while ADJ.B reduced it until non-significance only for the environment Oli13. The former methods though led the coefficient of the FHBms-PH trade-off to significance (\(p <0.05\)). Almost the same first mentioned pattern was found when FHBms’ BLUEs were adjusted exclusively for PH under the former methods however, ADJ.B significantly increased the correlation towards PH except for the environments Hoh14 and Oli14 (Table 3).

The adjustment of the FHBms’ BLUEs considering simultaneously both covariates was evaluated only when worth it i.e. the environment Hoh13 and the MET case. ADJ.A was the only capable to effectively decrease the magnitude of correlation coefficients towards both covariates and turn them non-significant. In the other hand, ADJ.B and ADJ.C merely reduced the \(\sim\) HD coefficients’ magnitudes and regarding PH both methods raised the coefficients in the environment Hoh13 and the former method did so in the MET case (Table 3).

Genomic predictions

Prediction abilities

Prediction abilities of FHBms, across environments ranged from \(\hbox {PA}=0.61\) in Oli14 to \(\hbox {PA}=0.69\) in Hoh14 (Supplemental Table 1). However, when predictions accounted for the \(\hbox {G} \times \hbox {E}\) interaction term, the prediction ability reached a significant higher value of PA = 0.75. The above values corresponded to the FHB mean severity estimators predicted without any correction or parameter tuning in the GBLUP model. It must be mentioned that both prediction abilities and expected selection gain \(\hbox {G}_{\mathrm{e}}\) rates for all the models were calculated taking as reference the non-adjusted estimators of FHBms. This consideration turned out to be a remarkable point since, specifically for the correction methods, it allowed us to compare PAs from any type of approach.

Significant lower mean prediction abilities were obtained when the corrected FHBms estimators were used for genomic prediction as expected since these estimators deviated from the non-adjusted FHBms. In fact, this occurred irrespective of the applied adjustment method, except when PH was the covariate included (Fig. 1 left panel). Analogously, if the \(\sim\)PH models are ignored since their predictabilities are comparable to the reference model, prediction abilities under the ADJ.B method were less diminished and they, in turn, outperformed their equivalents from both ADJ.A and ADJ.C methods. In the other hand, what stands out in Fig. 1 (right panel) is the significant improvement in the prediction abilities of the trait-assisted TA.GP approach, reaching PA = 0.80, when HD was included as covariate. There was no predictability upsurge for any of the multi-trait MT.GP models and even for such version involving both covariates a slight diminution in mean predictability was found. Our so-called G + HD and G + HD + PH alternatives had poorer predictabilities than the multi-trait GP models though greater than their respective ADJ.A and ADJ.C correction models versions.

Fig. 1
figure 1

Stem plots for the mean values of prediction ability PA (above the zero line) and correlation with the covariates (below the zero line) obtained from 40 cycles of five-fold cross validation of genomic prediction for Fusarium head blight mean severity (FHBms) and the covariates heading date (HD) and plant height (PH). Prediction abilities for three FHBms correction methods (left panel) and three GBLUP model parameter tuning (right panel) including multi-trait versions, are illustrated as the correlation of phenotypically estimated BLUEs of both the target trait and covariates with the GEBVs for the target trait. Mean PAs and correlation scores for the non-adjusted values are highlighted in orange bars and the rest of the models/methods are gray colored and decreasingly sorted by the within method average PA. Mean PAs were compared with Tukey’s HSD test with \(\upalpha =0.05\) and displayed in compact letters display fashion. Correlations with *, **, *** superscripts stand for significance level at \(\upalpha =0.05\), 0.01 and 0.001 respectively. Every model is denoted by its abbreviation (see Table 2) followed by “\(\sim\)” symbol and the covariate(s) incorporated

Fig. 2
figure 2

Stem plots for the mean values of prediction ability PA (above the zero line) and correlation with the covariates (below the zero line) obtained from 40 cycles of five-fold cross validation of genomic prediction (GP) for Fusarium head blight mean severity (FHBms) and the covariates heading date (HD) and plant height (PH). In each panel with gray bars are shown the scores for different selection indexes applied over the single and multi-trait GP models’ outputs. In the right panel, the scores for the GP model of the non-adjusted FHBms are presented with orange bars. In the left panel, the multi-trait genomic GP models including the covariates both together and separately are presented with orange bars. Mean PAs were compared with Tukey’s HSD test with \(\upalpha =0.05\) and displayed in compact letters display fashion. Correlation coefficients with *, **, *** superscripts stand for significance level at \(\upalpha =0.05\), 0.01 and 0.001 respectively. Every model is denoted by its abbreviation (see Table 2) followed by “\(\sim\)” symbol and the covariate(s) incorporated

Fig. 3
figure 3

Models performance in terms of expected selection gain (\(\hbox {G}_e\)). Models that predicted FHBms corrected values (left panel), GBLUP model tuning (central panel) and application of restricted selection indexes (right panel) when predicting FHBms –dotted lines–. \(\hbox {G}_e\) values are displayed for FHBms and heading date HD across several selection intensities. In each of the facets are included also the Ge scores for the non-adjusted FHBms predictions –solid lines- and for the selection based exclusively on the phenotypic BLUE values PS –two dashed lines. Every model is denoted by its abbreviation (see Table 2) followed the covariate(s) incorporated

Fig. 4
figure 4

Layout of 178 durum wheat genotypes according to their FHBms and HD BLUE values. After a single round of simulated genomic-based selection aiming the 20% most resistant lines, the probability for a genotype to be selected based on 40 cross-validation cycles is highlighted in a blue-red scale. The mean values of the selected genotypes in each panel are cross-pointed and the overall mean values are marked in the non-adjusted panels. Every panel is labeled with the model abbreviation (see Table 1)

In addition, relative low \(\hbox {PA}=0.60\) resulted when GEBVs obtained from either MT.GP\(\sim\)HD or MT.GP\(\sim\)HD+PH were weighed by restricted selection indexes (Fig. 2 right panel), however the lowermost predictabilities amongst all the methods were obtained when the restricted indexes were derived from ST.GP’ GEBVs involving both HD and HD plus PH versions (Fig. 2 left panel). Lastly, only for the TA.GP approach a slightly predictability increase was detected when the three traits at issue were fit taking the NO.ADJ GP model as reference and the tri-variate versions of both ADJ.C and IDX.ST models were the only ones for which a significant lower prediction ability was observed in a within version’s model comparison.

Correlation with covariates

Genome-based estimators for non-adjustment FHBms were negligibly associated (\(\hbox {r}=-0.14, p > 0.05\)) with PH though still highly correlated (\(\hbox {r}=-0.47, p < 0.001\)) with HD. Genomic predictions under the correction method ADJ.B presented still significant (\({p}< 0.01\)) correlations towards HD, nevertheless the genomic predictions using ADJ.A and ADJ.C methods -omitting their \(\sim\)PH versions- were able to decline the correlation with HD to marginal non-significant levels. Regarding the trade-offs with PH, FHBms’ predictions became significant (\({p}< 0.05\)) when both PH and HD + PH were corrected for in method ADJ.B (Fig. 1 left panel).

All versions of MT.GP and TA.GP models’ predictions were highly associated (\({p}< 0.001\)) with HD, and for the latter model correlation coefficients up to \(\hbox {r}=-0.85\) and \(\hbox {r}=-0.38\) were detected for HD and PH respectively (Fig. 1 right panel). The only MT approach for which a non-significant trade-off between FHBms and PH was found was MT.GP\(\sim\)HD. On the other hand, the method G + COV presented no relevant correlation coefficients between their predictions and both covariates, excepting when PH was the incorporated trait. Non-significant correlations with both PH and HD were observed when predictions obtained either from ST or MT models were weighted by restricted indexes (Fig. 2).

Expected selection gain

A single round of genomic-based selection was performed targeting the 5, 10, 15 and 20% of the most resistant genotypes, and their standardized expected selection gain values are presented in Fig. 3 ticked by approach type and both traits FHBms and HD. Selection based on the most accurate model (TA.GP) showed a larger indirect response towards HD than the conventional GP model for uncorrected FHBms values and such response was even comparable to the one coming by means of phenotypic selection when more than 5% of the lines were selected (Fig. 3 central panel). In contrast, if the GEBVs from ST.GP models were scaled by restricted indexes, their gain differentials in heading time was almost reduced to zero (Fig. 3 right panel). ADJ.A and ADJ.C methods followed IDX.ST model being the most effective alternatives to counteract the gains in heading date (Fig. 3 left panel). G + COV and IDX.MT methods performed similar decreasing by about a half standard deviation the unfavorable expected selection gain towards a later average heading date. Expected selection gains for FHBms amongst approaches kept the same prediction abilities’ ranking with the TA.GP model notably increasing the disease resistance gains. Moreover, confirmed after the simulation of the genomic selection of the best 20% genotypes proportion, the latter was the only method which significantly decreased the FHBms from the GP model with no modifications (Fig. 4). Contrarily, the rest of methods increased in average the susceptibility with exception of the multi-trait MT.GP model for which no meaningful change was detected. Additionally, from Fig. 4 it can be advised that all methods, excluding MT.GP and TA.GP, had the capability to choose more genotypes having both susceptibility and heading date estimators under the overall mean than the reference non-adjusted model.

Discussion

Simultaneous selection for multiple traits is an exigent task in plant breeding, hence some considerations have to made in order to exploit favorable or to adjust unfavorable trade-offs between traits. The present research focused on the prediction of FHB severity in the framework of its correlation with plant height and heading date, using a diverse durum wheat panel as a case study. The investigated alternatives for addressing these relationships ranged from corrections applied prior and post genomic prediction either through adjustments of FHBms estimations or usage of restriction indexes, respectively, as well as the tuning of GP model parameters including multi-trait GP alternatives.

The highly quantitative nature of FHB resistance regarding this case study panel was unveiled by Miedaner et al., (2017) via genome-wide association mapping, reporting nine loci explaining between 1% to 14% of the genetic variance. Combining minor FHBr loci detected in durum varieties with major QTL from bread wheat has been showed certain efficiency in decrease the levels of susceptibility e.g. Prat et al. (2017) reported that concurrently introgressing the major locus Fhb1 with either loci on chromosome 2BL or 5AL could even overcome the negative effects on FHBr of the Rht-B1b allele and, Zhao et al. (2018) combined the introgressed QTL Qfhb.ndwp-7A from the Chinese line PI 277012 with both minor loci on chromosomes 2A and 7A to significantly increase FHB resistance. Such allele combination and deployment in durum populations has been typically achieved by phenotypic selection or MAS in the past, although genomic selection might nowadays be prefigured as a recommended strategy to capture the genetic variation generated by many small-effect QTL and might assist breeders by shortening the breeding cycle or improve the selection gain in early stages where typically more resistant lines could in this way be selected. Moreover, among the studies demonstrating the higher efficiency of GS over traditional FHBr breeding strategies, Mirdita et al. (2015) evidenced a GS predictability three times superior than MAS in the absence of major QTL using a large set of 2325 European winter wheat genotypes evaluated in 11 environments and, Arruda et al. (2016a, b) and Steiner et al. (2018) also showed the same accuracy boost that was even increased when significant FHBr loci were included as fixed effects in the GP models. Predictability of MAS in the present dataset reached 0.65, reported by Miedaner et al. (2017), being even higher than some of our investigated approaches (Figs. 1, 2). However, the latter MAS’ feature could be an overestimation due to the relatedness between lines which in turn is a drawback that GS could overcome as revealed for instance by independent versus cross validation sampling comparisons, that makes GP models’ performance more stable in cases like European wheat breeding programs with slow allele frequency dynamics Jiang et al. (2017).

Correction of covariates tradeoffs as an alternative in GP

Roughly eight QTL for plant height explaining minor proportions of phenotypic variation and not being localized nearby any FHBr locus were revealed for the panel in study. Apart from those loci, the Rht-B1 (Rht1) locus explained more than half of the phenotypic variation in plant stature. It has been verified, for other durum germplasms as well, that Rht1 can increase or even duplicate measurements of FHB severity (Buerstmayr et al. 2012; Prat et al. 2017; Talas et al. 2011). We partially mitigated this trade-off by leaving out six out of nine of the tallest lines carrying the Rht-B1a wild type allele which met the outlier exclusion criteria, by doing so the range and variance of FHB severity ratings did not dramatically change and the correlation coefficient was reduced from \(-\,0.37\) to \(-\,0.25\) (p\(< 0.001\)). The latter exclusion could be taken as a step in order to focus on genomic predictions for elite durum wheat rather than genetic resources. Unlike plant height, for the tandem FHB severity-days to heading/flowering, both positive (Clark et al. 2016; Prat et al. 2017; Steiner et al. 2004; Zhao et al. 2018) and negative (Buerstmayr et al. 2012; Gervais et al. 2003; He et al. 2016; McCartney et al. 2016; Paillard et al. 2004; Schmolke et al. 2005; Somers et al. 2003; Yi et al. 2018) type of correlations have been reported in bread and durum wheat. Co-localization of these traits’ loci have been found in durum on chromosomes 4AL and 6AS according to Prat et al., 2017 and on chromosome 7B to Buerstmayr et al. (2012). In all environments analyzed in this study as well as in the across environments analysis strong negative FHBms-HD trade-offs were detected with the exception of Tul14 (Table 2), a circumstance that may be attributable to the moist-keeping conditions during flowering in this location and both its lower HD repeatability and range (e.g. just 9.5 days compared with 20.5 days in Hoh13). In addition, HD loci detected for this panel were not in LD respect to FHB resistance QTL yet when FHBms BLUES were adjusted for this covariate (Miedaner et al. 2017), therefore the great influence of specific differences in weather factors or ripening and possible pleiotropy events might be plausible explanations for this trade-off.

Some alternatives have been investigated to assess true genotypic effects of FHB resistance and dissect it from a passive mechanism of resistance such as plant height or development stage i.e. flowering time. For instance Klahr et al. (2007) used a covariance analysis and detected the only stable QTL QFhb.hs-5B non-affected by plant height and heading date in a RIL population of European bread wheat. Likewise, Miedaner et al. (2006) adjusted the FHB ratings to the effect of heading date and evaluated the effect of QTL introgressed from resistant donors in European elite spring wheat lines. Other covariance considerations highlight additional advantages over traditional mapping methods when trying to identify QTL, i.e. He et al. (2016) and Lu and Lillemo (2014) included HD and PH as covariates into the QTL mapping algorithms, while Yi et al. (2018) suggested a conditional QTL mapping on either HD or PH. In the context of genome-wide association mapping and genomic prediction studies for FHB resistance in hexa- and tetraploid wheat, several studies (Arruda et al. 2015, 2016a, b; Miedaner et al. 2017) have used the strategy first proposed by Emrich et al. (2008) and included the trait HD as a quantitative covariate in the mixed linear model from where the adjusted BLUE/BLUP values were obtained. The former covariance analysis was also examined in this study under the method ADJ.B (4) for which the unexpected higher FHBms-PH correlations obtained after correction for this covariate in addition to the unsatisfactory association levels of the \(\hbox {FHBms}^{\mathrm{ADJ.B}}\) predictions discourage considering this method for either low or moderately correlated traits. Furthermore, as proposed by Song et al. (2017), our so-called G + COV approach evaluated the inclusion of the covariates as fixed effects in the GP model stated in [8] and it turned out to be the most efficient method to accomplish both higher predictabilities and negligible trade-offs (\({p}> 0.05\)) with the covariates regardless whether HD alone or together with PH were involved.

Regression-based correction methods are frequently employed in a plant breeding perspective to assess negative trade-offs between major agronomic traits e.g. disease resistance and both maturity and plant height (Bormann et al. 2004; Bradshaw et al. 2004) or grain yield and protein content (Monaghan et al. 2001). In fact, one of the most extended applications of regression-based adjustments addressing the latter traits in wheat are the so-called grain protein deviations GPD, which are defined as the residuals of regressing protein content on grain yield (Rapp et al. 2018; Thorwarth et al. 2018). The ADJ.A and ADJ.C methods took advantage of this procedure as did the correction of FHBms based on regression coefficients derived from either single or multi-variate mixed models on a plot basis, respectively. The fact that the predictions for the FHBms values corrected by the latter methods were among the poorest in terms of predictability can be explained since these adjustments deviated the most from the original FHBms; although, this attribute led, in turn, to a highly effective reduction of the respective trade-offs both at BLUEs and GEBVs levels.

Single-trait predictions comparison with multi-trait models

Multi-trait genomic prediction MT.GP models tested in this study provided no predictability’ advantage over the reference non-adjusted model. These observation is supported by other empirical studies, where the usage of multi-trait GP models did not necessarily result in an increase in prediction abilities (Fernandes et al. 2018; Schulthess et al. 2017; Guo et al. 2014), but is in disagreement with reports that showed important performance’ improvements when either i) higher heritabilities for the indicator traits and/or ii) significant correlations between traits were evidenced (Calus and Veerkamp 2011; Guo et al. 2014; Jia and Jannink 2012). In a simulation study Calus and Veerkamp (2011) showed that only the inclusion of traits with genetic correlations stronger than 0.5 and higher heritabilities can lead to greater predictabilities and Schulthess et al. (2017) suggested that the benefits of MT.GP models could be limited since achieving last mentioned requirements might be somewhat unrealistic, as was evinced for our case where no pronounced differences in heritabilities were found. Plant height having moderately high phenotypic associations with FHBms in this study, did not seem to play a major role in multi-trait GP models as expected if compared to, for instance, the evidence showed in Lado et al. (2018) who have demonstrated that the inclusion of a third mildly correlated trait into MT.GP models was not suitable to increase predictabilities in comparison to the inclusion of a single highly correlated indicator trait.

The multi-trait GP approach type that improved the prediction ability was the one for which phenotypic records of the correlated covariates were already available in the validation population when predicting the target trait i.e. the trait-assisted selection model TA.GP Fernandes et al. (2018) also known as phenotypic imputation Jia and Jannink (2012). This predictability’ upsurge was evinced for both the TA.GP model fitting HD alone and HD+PH simultaneously, which was significant in the former case. A similar effect has been noticed in several studies with different crops (Fernandes et al. 2018; Jia and Jannink 2012; Rutkoski et al. 2012, 2016; Schulthess et al. 2017; Wang et al. 2017), although the degree of association between the target trait’ predictions and the covariates has not been extensively assessed e.g. in Fernandes et al. (2018) since the trade-offs between yield biomass and either moisture or plant height in sorghum were not of concern. While predictions based on both approaches MT.GP and TA.GP exhibited the higher predictabilities among the alternatives studied here, the former had similar correlation levels with the covariates than the NO.ADJ model but the TA.GP models exhibited coefficients up to 44% higher respect to HD and regarding to PH it was even more than tripled in the TA.GP\(\sim\)PH model. Moreover, it was demonstrated after selection simulation that under TA.GP model none of the moderately resistant and early flowering lines would have been selected. Using this phenotypic imputation for multi-trait GP, Steiner et al. (2018) revealed an important increase in the trade-off FHB severity-PH when both PH and PH+FD (flowering date) were fit in the GP models since in their analyzed panel the only moderately correlated trait with FHB severity was plant height. The actual usage of such previous generated records, that are routinely assessed in earlier generations during variety development for GP models should therefore be examined in the light of the indirect responses which rise for such scenarios.

Weighting predictions by genomic restriction indexes

Final products delivered by breeders in form of new cultivars must satisfy a certain number of requirements. In this context, a durum cultivar is desirable if it displays moderate resistance levels to Fusarium, high yields, short stature and early flowering. Breeders have been traditionally conducted multi-trait selection by methods like tandem selection, independent culling levels and index selection (Dudley 1997), while usage of selection indexes is generally more efficient than the former ones (Wang et al. 2018). Since the net merit index proposed by Smith (1936) and Hazel (1943) many modifications like restriction index have appeared, where the aim is to improve a given trait while keeping the response to others at zero (Kempthorne and Nordskog 1959). Similarly, Pešek and Baker (1969) presented an approach to achieve a certain rate of desired gains for a set of traits. In our case, weighting the GEBVs of FHBms, HD and PH obtained in ST.GP models by employing this desired gain index as restriction indexes had the best performance in annulling any response towards the respective indicator traits HD and PH though with the subsequent largest predictability penalization for FHBms. However, the utter goal in a multi-trait GP framework may not necessarily be the cancellation of the unfavorable correlations towards the indicator traits but rather their reduction till negligible levels of either magnitude or significance, as it was accomplished here using the restriction indexes ponderating the GEBVs derived from MT.GP. In the same regard, the use of this genomic indexes deflated the enlarged FHB severity-PH trade-off detected after either ST or MT genomic-based predictions in a durum wheat panel (Steiner et al. 2018). The authors of the mentioned study did not detect significant differences between the index usage in either single or multiple trait models in terms of predictability although on average they were significantly lower than their unweighted counterparts. In this regard, it must be said that the most plausible models through which both the less drastic predictability penalization and non-significant \(\sim\)HD correlation levels were attained appeared to be G + HD and IDX.MT\(\sim\)HD, although some features would give the former certain advantage: (1) its slightly higher predictability, (2) due to it is less computationally demanding and, (3) since it was more prone to avoid the selection of the earliest flowering-moderately susceptible genotypes after the genomic selection simulation round (Fig. 4).

Conclusion

Highly significant associations of FHB severity and the indicator traits PH and HD estimations were detected, although the trade-off with HD turned out to be the only one relevant and persisted after genomic predictions. Including PH as the only covariate in genomic prediction models resulted in imperceptible changes in the GP performances, except when PH was fit in the trait-assisted GP model for which a significant decline in predictability was detected. Incorporating simultaneously both covariates did not improve the predictability and for the models dealing with multi-trait frameworks such configuration led to worse predictabilities. The approaches that corrected FHBms phenotypic estimations by regression-based methods and weighting the FHBms’ genomic breeding values by restriction indexes derived from single trait GP models were the most effective alternatives controlling the indirect responses towards the covariates and even decreased by up to two standard deviations the most important trade-off. Multi-trait genomic prediction approaches significantly outperformed in predictability the non-adjusted reference model only when HD records for the validation population were already available, which led in turn to the largest rise in the correlation between HD and FHBms. Fitting HD as fixed effect in the GP model (G + COV) and correcting the FHBms’ genomic estimations using restriction indexes derived from multi-trait GP models achieved smaller predictabilities but were on the other hand capable to reduce HD trade-offs, and therefore represent alternative models with the highest relevance in this study.