Introduction

Plant breeding aims at the improvement of the economic merit of cultivars, which depends on the performance of the traits composing the product profile or variety concept (Bernardo 2010). Attempts to combine information from several sources in one single value (net merit) date back to Galton´s “Law of Regression”, which was presented even before the Mendelian nature of inheritance was clear (Hazel 1943). In their classical formulation, selection indices are linear functions of the target traits, each of them receiving a certain weight depending on its importance. Currently, index selection is a key tool in animal breeding programs as it facilitates the process of making consistent breeding decisions across years (Cole and VanRaden 2017). In plant breeding, to select for single traits using independent culling levels (ICL) is a widely adopted practice as phenotypic data for all traits are not readily available or not affordable to be screened at a large scale. However, selection indices have been shown to achieve more significant and more consistent genetic gains than ICL (Hazel 1943), exerting major value in the population improvement phase of a plant breeding program (Gaynor et al. 2017). Widely used indices as the Smith–Hazel index (Hazel 1943), base index (Williams 1962), and restricted index (Kempthorne and Nordskog 1959) are constructed with (1) phenotypic and genetic correlations, (2) variances and covariances between traits, and (3) the economic importance assigned to each trait. The Smith–Hazel index (SH) behaves as an “optimum index” when variances and covariances are obtained without error. As this is rarely the case, the base index uses only the economic importance of each trait as the weight and disregard phenotypic and genotypic covariances. The restricted index of Kempthorne and Nordskog uses variance and covariance components and economic importance to improve one trait while keeping the second constant. Despite these three indices requiring knowledge on each trait's economic importance, determining these economic weights is a cumbersome process in plant breeding (Mistele et al 1994). Breeding for multiple traits has proven challenging when negative correlations exist between traits (Bernardo 2010). Well-known examples for negatively correlated traits are grain yield and protein content in bread wheat (Longin et al. 2013a; Laidig et al. 2016) and durum wheat (Longin et al. 2013b; Rapp et al. 2018), and dry matter yield and dry matter content in maize (Grieder et al. 2012).

Genomic selection opened new avenues to implement simultaneous selection for several traits, with index selection being a routine practice in plant breeding. Currently, wheat and maize breeding programs use phenotypic data from previous years to build large calibration sets. These enable practitioners to simultaneously access genomic estimated breeding values (GEBV) of all required traits needed for index selection. Several theoretical and practical attempts to generate a selection index using information from GEBV and to compute the realized response to selection have been made (Dekkers 2007; Ceron-Rojas et al. 2015; Michel et al. 2019). However, studies calculating the expected selection gain (ΔG) and optimum allocation of resources in breeding programs for index selection are lacking in the literature. In particular, the influence of different indices on the ranking of breeding strategies and their optimum allocation of resources when simultaneously improving traits with contrasting economic relevance, levels of genetic correlation, and prediction accuracy in genomic selection (GS) for the same budget and logistical constraints is of major relevance in practical plant breeding. Further studies are required to estimate the decrease in selection gain for the individual traits when moving from model calculations for single-trait improvement to multiple-trait improvement via ICL or index selection. For a single trait, implementation of GS in plant breeding strategies has shown promising results based on model calculations. A breeding strategy with one-stage GS followed by one-stage phenotypic selection is expected to increase the annual selection gain for hybrid and line breeding in wheat by more than 30% compared with classical two-stage phenotypic selection (Longin et al. 2015) and similar improvements are expected in the annual selection gain for other important crops like maize, rye, triticale, and barley (Marulanda et al. 2016). However, further research is warranted for implementing GS in plant breeding programs aiming to improve several traits simultaneously. Therefore, we expanded the open-source R package “selectiongain” (Mi et al. 2014) to estimate selection gain and optimize the allocation of resources for index selection to simultaneously improve two traits at the population improvement phase of a breeding program. Our objectives were to (1) determine how the correlation between traits, the economic weights, and the variance components affect selection gain of the net merit and each trait separately, (2) examine the consequences of the choice of different selection indices in breeding strategies including genomic and phenotypic selection, and (3) identify efficient breeding strategies to simultaneously improve traits with different prediction accuracies. Finally, we study a specific case for negatively correlated traits in hybrid wheat in the practical section.

Materials and methods

Breeding strategies

We compared two breeding strategies for hybrid breeding, a classical two-stage phenotypic selection (PSstandard) and a breeding scheme using one-stage genomic selection followed by one-stage phenotypic selection (GSrapid). Both strategies, originally proposed by Longin et al. (2015) and modified by Marulanda et al. (2016), start with the production of Nini doubled haploid (DH) lines from crosses of numerous DH parents and finish with the selection of the five DH lines with highest general combining ability (GCA; Suppl. Figure 1). First, the Nini DH lines are tested (and multiplied) in an observation nursery for their line per se performance regarding highly heritable traits, which are not correlated with the net merit (index for the two target traits). According to the conclusions of Marulanda et al. (2016), after this nursery assessment, we assumed that only 25% of the lines are selected and promoted to evaluation in one or two stages of field tests for GCA. The first strategy, GSrapid, includes a selection stage for GCA of net merit based on the GEBV of the N1 DH lines selected in the nursery. After GS, one stage of phenotypic selection for GCA of net merit with N2 candidates is performed. The second strategy, PSstandard, represents the conventional practice in hybrid cereal breeding programs with two stages of phenotypic selection and was considered as benchmark. For hybrid seed production, we assumed the use of a chemical hybridization agent (CHA), and hybrids were tested in multi-environment trials conducted at Lj locations with Tj inbred testers in selection stage two and three (j = 2, 3). Without restrictions on Lj, selection gain is maximized by one replication per location (Melchinger et al. 2005). Thus, we set the number of replications to one for all calculations.

Optimum allocation of resources and calculation of selection gain

Calculation of the selection gain was based on the well-known formula of Cochran (1951) with multivariate normal integrals for selected fractions and heritabilities. As all our results were generated with these model calculations, we present strictly speaking the expected selection gain for the investigated scenarios. For sake of simplicity, however, we disregard the term “expected” in the following. As the investigated breeding strategies differ in their cycle length, we further determined the annual selection gain, which is the absolute expected selection gain divided by the number of years required to finish one breeding cycle (Suppl. Figure 1).

A vector [Nini, N1, Nj, Lj, Tj] represents the number of DH lines, testers, and locations at the initial stage (ini) or stage j. This vector corresponds to the optimum allocation only if it maximizes selection gain in the respective breeding strategy for a given scenario regarding variance components, budgets, and costs. The optimum allocation was determined by a grid search across all possible allocations for the scenario under consideration (for details see Mi et al. 2015). Different breeding strategies were only compared at their respective optimum allocation for the scenario under consideration. Thus, all our results and comparisons are based on the maximum annual selection gain, which is in the following referred to as ΔGa. We also present the annual selection gain for the individual traits when the allocation of resources is optimized for the net merit. For comparison, optimization of strategies for single-trait improvement was performed to estimate the maximum annual selection gain for trait 1 and trait 2. For all our calculations, we used the open-source R (R Core Team 2016) package “selectiongain”, which was extended for the incorporation of index selection including phenotypic and genomic information for two traits. For details about the package, the reader is referred to Mi et al. (2015) and the user manual of version 2.0.65 (http://cran.r-project.org/web/packages/selectiongain/selectiongain.pdf).

Quantitative genetic parameters and economic frame

This manuscript is divided into two sections. First, we investigated scenarios with two hypothetical traits to assess the influence of three parameters (correlation of traits, variance components, and economic weights) on the optimum allocation of test resources and ΔGa under four different indices and two breeding schemes. We defined a standard scenario assuming that both traits have equal variance components, equal economic weights (each trait weight was set to 1, so the sum of weights equals 2), and a genetic correlation between traits of -0.5 (scenario I, Table 1). As variance components, we used reported values for hybrid wheat (Longin et al. 2013a) and standardized them to a GCA variance of 1. Afterward, possible effects of each of the three parameters were assessed under ceteris paribus conditions, i.e., varying only one parameter at a time and comparing the results with the standard scenario. Further, scenarios compared to the standard scenario were: a genetic correlation between both traits of 0 and 0.5 (scenarios II and III, respectively), an economic weight of 0 for trait two (trait one weight 2, scenario IV), and economic weight of 4/3 for trait two, so that it has twice the economic importance of trait one (scenario V). As defined by Osthushenrich et al. (2018), the phenotypic variance equals the sum of GCA and masking variance, which is due to interaction effects (e.g., genotype by environment) and the residual error. For investigating the influence of variance components, we compared the standard scenario (equal variance components for both traits) with scenarios, where the GCA variance of trait two was three times higher than the GCA variance of trait 1 (scenario VI), or where the masking variance for trait 2 was three times higher than the masking variance of trait one (scenario VII; Table 1). In the second section, we extended our calculations to hybrid wheat breeding with alternative goals: (1) improve simultaneously grain yield and protein content and (2) improve simultaneously grain yield and sedimentation volume based on empirical data on variance components, correlation of traits, and economic weights as explained below.

Table 1 Scenarios investigated on the first section of this study. Parameters differing from those in the standard scenario I are given in boldface

Other assumptions required for the model calculations were taken from the literature (cf. Marulanda et al. 2016). These assumptions include: a fixed total budget of 10,000 field plot equivalents was available for the following activities: (1) producing DH lines and testcross seed, (2) evaluating line per se performance in the observation nursery, and (3) genotyping and assessing testcross performance in the field trials. The following costs, expressed as field plot equivalents, were assumed: CostDH = 1, CostGenotyping = 2, and CostHybridseed = 4. The assessment of protein content (through near-infrared spectroscopy, NIRS, ICC standard method 159, ICC, Austria) and sedimentation volume according to Zeleny (ICC standard method 116/1, ICC, Austria) was assumed to be already included in the cost of field assessment. Typically, one stage of hybrid seed production delivers enough seeds for two years of phenotyping.

Before routinely using GS, calibration experiments are required to develop prediction models. We assumed that GS calibrations were already developed outside the breeding cycle. Furthermore, recalibrations of models are regularly needed in the breeding program (Heslot et al. 2015), but these can be done with data available from routine breeding without additional budget requirements. The genetic architecture of the trait, the method of assessment, and the environmental effects could lead to different levels of GS accuracy for different traits (cf. Zhao et al. 2013; Liu et al. 2016). We, therefore, varied the prediction accuracies from 0.1 to 0.7, increasing by intervals of 0.1, and compared two assumptions: (1) “equal prediction accuracies”: both traits are assumed to have equal prediction accuracies and (2) “unequal prediction accuracies”: the prediction accuracy of trait 2 is always 0.2 higher than the accuracy of trait 1. Simultaneous testing of unequal prediction accuracies and economic weights or variance components results in numerous scenarios, e.g., trait 2 has a higher prediction accuracy than trait 1 and all economic weights. To reduce the complexity in the manuscript, we intentionally left some comparisons out of our study. However, the readers can find good approximations for scenarios not included when exchanging the conclusions for the traits in the scenarios presented. For instance, scenario V approaches the example mentioned above as trait 2 has larger prediction accuracy than trait 1 and two times the economic weight of trait 1.

We used variance and covariance components of a recent large experimental study on hybrid wheat (Liu et al. 2016). For the theoretical section, the variance components for grain yield (dt h−1) were used as a reference and both traits were assumed to have equal variance components as follows: \({\sigma }_{GCA}^{2}=5.7\), \({\sigma }_{GCA \times L}^{2}=5.19\), \({\sigma }_{SCA}^{2}=1.88\), \({\sigma }_{SCA \times L}^{2}=2.94\), and \({\sigma }_{error}^{2}=24.37\). For the practical section on hybrid wheat, we used these variance components of grain yield for trait 1 and the following estimates of protein content (%) for trait 2: \({\sigma }_{GCA}^{2}=0.08\), \({\sigma }_{GCA \times L}^{2}=0.02\), \({\sigma }_{SCA}^{2}=0.01\), \({\sigma }_{SCA \times L}^{2}=0.00\) and \({\sigma }_{error}^{2}=0.09\) or sedimentation volume (ml) \({\sigma }_{GCA}^{2}=23.54\), \({\sigma }_{GCA \times L}^{2}=1.58\), \({\sigma }_{SCA}^{2}=0.64\), \({\sigma }_{SCA \times L}^{2}=0.25\) and \({\sigma }_{error}^{2}=6.9\) (Goals 1 and 2, respectively). Further information can be found in the Supp. Table. 1. The genetic correlations were − 0.40 for grain yield and protein content and − 0.18 for grain yield and sedimentation volume (Zhao, pers. comm.). For simplicity, and as most breeders have experimental data available, we assumed that the covariances between both traits were zero for their specific combining ability (SCA) effects, their GCA x location, and SCA x location interactions. To verify whether these components have a high impact on the outcome of our model calculations, we additionally determined the optimum allocation and ΔGa for intermediate and high values for these covariance components (data not shown).

Economic weights in the practical section were determined from the marginal return obtained by farmers in the German market as described by Laidig et al. (2016). Briefly, a farmer producing fodder quality wheat (C-grade), received 17.23 € per dt. Higher quality was remunerated by extra payments of 1.0 € per dt for B quality wheat, 2.0 € per dt for A quality wheat, and 4.5 € per dt for E quality wheat compared to C quality. While B class grain is used directly for bread-making purposes, the A and E classes are used for blending to obtain a specific raw bakery material. Based on these values, we reported the economic weights as relative values to the price per dt of 17.23. So, the economic weight for grain yield is set to 1 and the weight of grain protein content or sedimentation volume corresponds to the increase for quality grade A (2/17.23 = 0.12). The weight of this category is closest to the average of the increase in the four quality categories (0.11).

Implementation of index selection

The proposed implementation of index selection to deterministically evaluate and optimize the use of genomic selection for multiple traits in hybrid breeding programs was inspired by combining ideas from three main publications. While Utz (1969) provided estimates of covariance for traits in multiple-stage selection, Wricke and Weber (1986) explained in detail the construction of indices in multiple-stage selection and Dekkers (2007) gave an extension to multiple-trait marker-assisted and phenotypic selection. We considered a group of \(m\) traits for which phenotypic scores are available and denoted as \({x}_{j}\) (\(j=1,\dots , m\)). The second group of traits corresponds to breeding values \({y}_{l}\) (\(l =1,\dots , n),\) which are improved by selection on \({x}_{j}\). Traits may overlap in both groups.

We constructed indices for both groups: first a phenotypic index \(I=\sum_{j=1}^{m}{b}_{j}{x}_{j}\) and second a genotypic index \(=\sum_{l=1}^{n}{a}_{l}{y}_{l}\). The genotypic index, known as the net merit, corresponds to the criterion for which we want to maximize the response to selection. The economic weights \({a}_{l}\) can be (1) chosen by the breeder according to her/his experience, (2) derived from the monetary returns produced by each trait, or (3) determined from selection decisions made in previous years in the breeding program. The phenotypic weights \({b}_{j}\) should be chosen such that selection based on the phenotypic traits \({x}_{j}\) maximizes the response to selection for net merit\(H\). The selection gain ΔG for the net merit \(H\) can be estimated by\(=i\frac{{cov}_{(I,H)} }{{\sigma }_{I}}\). In the frame of multiple-stage selection, the selection intensity \(i\) can be determined using the multivariate normal distribution as implemented in the package “selectiongain”. In this study, we expanded the function multistagecor of “selectiongain” to estimate the covariance \({cov}_{(I,H)}\) between the phenotypic index and the net merit and the variance \({\sigma }_{I}^{2}\) of the phenotypic index as presented in detail in the appendix. The new version of the R package “selectiongain” (2.0.65) is online available, including the open-source code of the functions to be used by the scientific community. Additionally, we have included supplementary file 1 (R script) in which we use the new version of the package “selectiongain” to solve two numerical examples. First, we solve the numerical example of Wricke and Weber (1986) for the selection of two traits in one stage using the Smith–Hazel index (chapter 12 page 348). Second, we estimate the expected genetic gain in a multistage selection scenario when two traits are improved using the Smith–Hazel index. We used the input parameters of the scenario standard and provide the expected genetic gain after each selection stage for each trait. Finally, we introduced references to the formulas presented in the appendix, facilitating the comprehension of the step-by-step process to compute the expected genetic gain.

Indices proposed in literature include the Smith–Hazel index (Hazel 1943), the base index (Williams 1962), and the restricted index (Kempthorne and Nordskog 1959). While these indices are linear functions of several traits, they vary on the assignment of weights to each trait. Smith–Hazel index uses information from the phenotypic and genotypic covariances, and the economic importance to calculate the weights. In the rare case of having estimates of variances and covariances without error, this index becomes the “optimum index”. The base index uses the economic importance of each trait directly as the weight, neglecting phenotypic and genotypic covariances. The restricted index uses variance and covariance components and economic importance to improve one trait while keeping the second constant. For further details, the reader is referred to the appendix.

We used two benchmarks to assess the impact on ΔGa of both the net merit and individual traits when using index selection. First, we used single-trait selection as benchmark to estimate the reduction in ΔGa when moving from a strategy to improve only one trait to a strategy improving multiple traits simultaneously. This comparison is of interest to breeders because applied programs often focus almost entirely on yield (Eagles and Frey 1974; Bernardo 2010).

As a second benchmark, we used independent culling levels (ICL), i.e., only individuals meeting minimum requirements for each trait are selected (cf. Bernardo 2010). When ICL selection is applied to two traits in a single stage, selection gain for each trait can be estimated as \({\Delta G}_{k}=\frac{1 }{\alpha }\left({\varrho }_{{p}_{1},{y}_{k}}{Z}_{1}{I}_{2}+{\varrho }_{{p}_{2},{y}_{k}}{Z}_{2}{I}_{1}\right)\) (Wricke and Weber, 1986), where \(\alpha\) corresponds to the total selected fraction and equals the product of the selected fractions for both traits \({\alpha }_{1}\times {\alpha }_{2}\); \({\varrho }_{{p}_{1},{y}_{k}}\) and \({\varrho }_{{p}_{2},{y}_{k}}\) are correlation coefficients between the genotypic value (y) of trait k and phenotypic values (p) of trait 1 and 2, respectively; \({Z}_{1}\) and \({Z}_{2}\) are the ordinates of the standard normal distribution at the truncation points \({x}_{1}\) and \({x}_{2}\), respectively, referring to the standardized culling levels defined by the breeder; and \({I}_{1}\) and \({I}_{2}\) are the probabilities associated with the corresponding truncation points \({x}_{1}\) and \({x}_{2}\) in a multivariate normal cumulative distribution function. The truncation points \({x}_{1}\) and \({x}_{2}\) are interdependent when a phenotypic correlation \({\varrho }_{{p}_{1},{p}_{2}}\) between traits 1 and 2 exists. The computation of the covariances between traits to obtain the required correlations \({\varrho }_{{p}_{1},{y}_{k}}\) and \({\varrho }_{{p}_{2},{y}_{k}}\) \({\varrho }_{{p}_{1},{p}_{2}}\) is presented in detail in the appendix. An extension of this formula given by Tallis (1961) to estimate selection gain for any number of correlated traits was implemented in the package R selectiongain (Mi et al. 2015). Finally, we computed the maximum selection gain for the ICL net merit as \(\sum_{k=1}^{n}{a}_{k}{\Delta G}_{k}\) where \({a}_{k}\) is the economic weight of trait k. Following the approach of Hazel and Lush (1942), ΔGa of ICL was determined under the optimum allocation of resources for the SH index regarding \({\alpha }_{j}, {N}_{j}, {L}_{j}, {T}_{j}\) at the stage j. To provide a vast range of culling levels, we tested 1000 different combinations of \({\alpha }_{1}\) and \({\alpha }_{2}\) with the only constraint of \({\alpha }_{j}={\alpha }_{1} \times {\alpha }_{2}\). After, the combination of \({\alpha }_{1}\) and \({\alpha }_{2}\) that gave the highest expected selection gain for the net merit was considered as the best allocation for stage j and used in this manuscript. Since the SH and base indices consistently outperformed ICL (Hazel and Lush 1942; Young 1961; Elgin et al. 1970; Vinson 1971), we implemented ICL only in few scenarios of the theoretical section and in the practical section.

Results

For the standard scenario, ΔGa for net merit was higher for the breeding strategy GSrapid than for PSstandard for all studied indices and ICL (Fig. 1, Scenario I). The three investigated selection indices offered larger ΔGa for net merit in comparison to ICL (Fig. 1, Scenario I). While ΔGa for net merit was almost identical for the SH and base indices for all investigated scenarios, it was lower for the restricted index (Figs. 1, 2, 3, 4, Table 2). The latter was more pronounced for breeding strategy GSrapid than for breeding strategy PSstandard. Similarly, the advantage in ΔGa of GSrapid over PSstandard increased with increasing prediction accuracy for genomic selection. Assuming a higher prediction accuracy for trait 2 than for trait 1 maximized the annual selection for net merit irrespective of the scenario considered (Figs. 1, 2, 3).

Fig. 1
figure 1

Maximum annual selection gain (ΔGa) for net merit and corresponding selection gain in individual traits for three selection indices in two hybrid breeding schemes. Scenarios I, II, and III with genetic correlation between traits of − 0.5, 0, and 0.5, respectively. Seven genomic prediction accuracies were evaluated. Prediction accuracies for both traits were assumed to be equal in all strategies except for “unequal accuracies”, where the accuracy of trait 1 is shown in the x axis and the accuracy of trait 2 is 0.2 higher than for trait 1 in absolute terms. For illustration, the maximum annual selection was also shown when directly selecting only one trait (“No Index”). For further details, see Table 1 and Material and methods. ICL, independent culling levels; SH, Smith–Hazel; Restric, Restricted

Table 2 Practical section on hybrid wheat: Optimum allocation of test resources maximizing annual selection gain (ΔGa) under two alternative breeding strategies using three selection indices for the simultaneous improvement of two traits

First, we investigated the influence of the correlation of the traits, their economic weights, and variance components on ΔGa in the breeding strategies and selection indices. The correlation between both traits strongly influenced ΔGa and the superiority of GSrapid over PSstandard (Fig. 1). For instance, when using an SH index and assuming a prediction accuracy of 0.3 for both traits, ΔGa for net merit was 29.2% higher for GSrapid than for PSstandard for a positive correlation between traits of 0.5 (Fig. 1, Scenario III). This difference increased up to 44.9% for a negative correlation of -0.5 between both traits (Fig. 1, Scenario I). ΔGa for net merit based on the SH or base index was considerably higher for positively than for negatively correlated traits. In contrast, for the restricted index, the maximum ΔGa for net merit was obtained when traits were uncorrelated (Fig. 1, Scenario II). ΔGa for net merit was always lowest for ICL except for the restricted index when traits have zero or positive correlation (Fig. 1, Scenario II and III). For all scenarios, selecting candidates based on an index resulted in selection gains for the individual traits up to 50.8% lower compared to the benchmark of selecting only for one trait (Figs. 1, 2, 3). The use of a restricted index minimized this disadvantage across almost all investigated scenarios at the expense of no selection gain in trait 2 (Figs. 13). The higher the correlation between the two traits, the lower was the reduction of selection gain for the individual traits when using index selection.

Second, we studied the impact of the economic weights for individual traits on ΔGa net merit. Assuming a negative correlation of − 0.5 between both traits, ΔGa for net merit was considerably higher for an economic weight of zero for trait 2 than for equal economic weights for both traits (Fig. 2, Scenario IV). Thereby, the selection gain expected by index selection for the individual trait 1 was comparable to that expected by selecting only for trait 1, but at the expense of a negative selection gain expected for trait 2. Assuming a two-fold higher economic weight for trait 2 than trait 1 as compared to equal weights led to a similar ΔGa for net merit using the SH or base index in breeding strategy GSrapid. However, the selection gain expected for trait 2 was considerably higher at the expense of a reduced expected selection gain for trait 1 as compared with the standard scenario with equal weights for both traits (Fig. 2, Scenario V).

Fig. 2
figure 2

Maximum annual selection gain (ΔGa) for net merit and corresponding selection gain in individual traits for three selection indices in two hybrid breeding schemes. Scenarios I, IV, and V with economic weights of trait 1 = trait 2 = 1, trait 1 = 2 and trait 2 = 0, and trait 1 = 0.7 and trait 2 = 1.3, respectively. Seven genomic prediction accuracies were evaluated. Prediction accuracies for both traits were assumed to be equal in all strategies except for “unequal accuracies”, where the accuracy of trait 1 is shown in the x axis and the accuracy of trait 2 is 0.2 higher than for trait 1 in absolute terms. For illustration, the maximum annual selection was also shown when directly selecting only one trait (“No Index”). For further details, see Table 1 and Material and methods. SH, Smith–Hazel; Restric, Restricted

Third, we evaluated the impact of the variance components on multiple-trait selection. The higher the genetic variance or the lower the masking variance components, the higher was ΔGa. Even though negatively correlated to trait 1, a higher genetic variance of trait 2 led also to a higher ΔGa in net merit by considerably increasing the expected selection gain for trait 2 (Fig. 3, Scenario VI). In contrast, higher non-genetic variance components in the negatively correlated trait 2 slightly reduced ΔGa for net merit as well as decreased selection gain for trait 2 (Fig. 3, Scenario VII).

Fig. 3
figure 3

Maximum annual selection gain (ΔGa) for net merit and corresponding selection gain in individual traits for three selection indices in two hybrid breeding schemes. Scenarios I, VI, and VII with equal variance components between traits, three times more genetic variance for trait 2, and three times more masking variances for trait 2, respectively. Seven genomic prediction accuracies were evaluated. Prediction accuracies for both traits were assumed to be equal in all strategies except for “unequal accuracies”, where the accuracy of trait 1 is shown in the x axis and the accuracy of trait 2 is 0.2 higher than for trait 1 in absolute terms. For illustration, the maximum annual selection was also shown when directly selecting only one trait (“No Index”). For further details, see Table 1 and Material and methods. SH, Smith–Hazel; Restric, Restricted

Calculation of an SH index requires estimates of genetic covariances between the traits. While the genetic covariance between traits is often available in breeding programs, knowledge about the covariances between genotype-by-location interactions of both traits or in hybrid breeding the covariance between SCA effects of both traits is mostly not available. Therefore, we calculated the SH index in the standard scenario also for a wide range of these covariance components and found only a small impact on ΔGa for net merit and the selection gain for individual traits as well as the ranking of the different indices and breeding schemes (data not shown). Consequently, if breeders want to apply selection indices but lack the required covariance components, as a first approximation, they may use the genetic covariance between traits only and ignore the covariances between the genotype-by-location interactions of both traits and the covariance between SCA effects of both traits.

Discussion

Plant breeders usually consider several traits during selection and apply different methods to deal with this in practice. We compared the effects of three different indices on ΔGa for net merit and selection gain expected for individual traits assuming two breeding strategies: (i) classical two-stage phenotypic selection and (ii) a combination of genomic selection with one-stage phenotypic selection. In particular, we investigated the impact of variance components, correlation of traits and their economic weights on ΔGa, and the ranking of the breeding strategies by varying only one factor at a time (Figs. 1, 2, 3). Finally, we demonstrate our findings in the practical section using hybrid wheat as a case of study (Fig. 4, Table 2).

Fig. 4
figure 4

Practical section on hybrid wheat: Maximum annual selection gain (ΔGa) for net merit and corresponding selection gain in individual traits for three selection indices in two hybrid breeding schemes. Trait one was grain yield and trait two was the quality trait protein content or sedimentation volume. Variance components, the correlation between traits, and economic weights were based on empirical data as described in Goals 1 and 2 of Suppl. Table 1. Prediction accuracies for both traits were assumed to be equal in all strategies except for “unequal accuracies”, where the accuracy of trait 1 is shown in the x axis and the accuracy of trait 2 is 0.2 higher than for trait 1 in absolute terms. For illustration, the maximum annual selection was also shown when directly selecting only one trait (“No Index”). ICL independent culling levels; SH, Smith–Hazel; Restric, Restricted

Breeding strategy GSrapid maximizes annual selection gain

Across all investigated scenarios, the breeding strategy GSrapid achieved consistently higher ΔGa than PSstandard (Figs. 1, 2, 3, 4, Table 2). Already visible for low prediction accuracies of genomic selection, this advantage increased considerably with increasing prediction accuracies for both traits. Prediction accuracies for genomic selection across different breeding cycles were reported to be fairly low (cf.Zhao et al. 2013; Albrecht et al. 2014; Michel et al. 2016; Rapp et al. 2018). However, assuming a prediction accuracy for genomic selection for both traits of only 0.3 in the standard scenario and taking the SH index led already to 44.9% ΔGa for GSrapid compared with PSstandard (Fig. 1, Scenario I). This advantage depends on the correlation of traits, economic weights, and variance components of the individual traits. For instance, the advantage decreases to 35.9% when a larger genetic variance was assumed for the second trait (Fig. 3, Scenario VI). The consistent advantage of the breeding strategy GSrapid over PSstandard corroborates a previous study, which investigated single-trait improvement (grain yield) across several crops (Marulanda et al. 2016). As already emphasized by these authors, the advantage of breeding strategy GSrapid is based on two pillars: the shorter breeding cycle length and the higher number of DH lines initially tested (Table 2). Here, we confirmed that using genomic selection in breeding programs improving two traits simultaneously is promising regardless of the prediction accuracy of the individual traits. Even if prediction accuracy is 0, GSrapid provided a more considerable expected genetic gain for the net merit than PSstandard. This is because it can be extrapolated from the linear trajectories of the expected genetic gain when prediction accuracy increases from 0.1 to 0.7 in Figs. 1, 2, 3, 4. Assuming a higher prediction accuracy for trait 2 than for trait 1, ΔGa for net merit was increased for the use of an SH or base index even under the unfavorable situation that trait 2 was negatively correlated with trait 1 (Fig. 1, Scenario I). In conclusion, the integration of genomic selection into applied breeding programs appears very promising, but the combination of traits requires special attention for choosing the selection index. If not explicitly stated otherwise, we concentrate the following discussion on breeding strategy GSrapid.

Choice of the index largely influences selection gain expected for individual traits

The choice of the index had a large impact on ΔGa for net merit (Figs. 1, 2, 3). Almost identical ΔGa for net merit were determined for the use of SH or base index across all considered scenarios. By contrast, ΔGa for net merit using a restricted index was considerably lower for both breeding strategies GSrapid and PSstandard depending on the investigated scenario. For instance, the superiority in ΔGa for net merit using SH or base index compared to restricted index increased from 13.5% for negatively correlated traits (Fig. 1, Scenario I) to 112.2% for positively correlated traits (Fig. 1, Scenario III). This is explained by the efficiency of SH and base indices to exploit the positive correlation between traits by selecting the best genotypes for both traits to maximize the expected genetic gain.

On the other hand, the restricted index suffered when applied in positively or negatively correlated traits as the best genotypes for the trait improvement can no longer be selected by the breeders because they do not fit the goal of the index. Similar reductions in the realized genetic gain when applying restricted index for positively correlated traits in oats were found by Rosielle and Frey (1975). Among the available restricted indices, Kempthorne and Nordskog (1959) proposed to set the most restrictive conditions to maximize the gain of one trait while keeping the second at the desired level. Future use of restricted indices in the selectiongain package might include more versatile alternatives such as those in Pesek and Baker (1969).

The choice of the index not only influenced the ΔGa for net merit, but also the expected selection gain for the individual traits combined in the index (Figs. 1, 2, 3). For instance, for the standard scenario and prediction accuracies between 0.1 and 0.7, the selection gains expected for trait 1 applying SH or base index were 54.3·% and 47.7% lower than the ΔGa of strategies improving only one trait. This negative effect on realized selection gain in individual traits was especially pronounced when traits were negatively correlated, underpinning that parallel selection on two traits is always a compromise regarding the selection gain for each of the traits. Similarly, the more weight is given to one trait in the index, the lower is the selection gain for the other trait. By contrast, the use of a restricted index decreased the disadvantage of reduced individual selection gain for trait 1 in index selection across almost all investigated scenarios but at the expense of no selection gain for trait 2.

While the correlation between target traits can hardly be changed and variance components only with considerable expenditures, breeders can easily influence the choice of the economic weights and indices. Economic weights should reflect market demands and numerous methods are at disposal, a topic beyond the scope of this study (cf. Graffius 1964; Eagles and Frey 1974; Mistele et al. 1994; Bernardo 2010). As expected, the higher the weight given to an individual trait, the larger is its individual selection gain by applying index selection (Fig. 2). However, the correlation between traits largely affected the selection gain for the other trait in index selection. While for positively correlated traits an indirect selection gain can be expected in index selection, in the case of negatively correlated traits each slight increase of the economic weight of trait 1 will automatically reduce the selection gain for trait 2 and vice versa (Fig. 2, Scenario IV and V).

The major difference between restricted and SH index or base index is in the specific economic significance assigned to the traits to warrant the desired selection gain for each of them. Using an SH or base index aims at maximizing the selection gain (ΔGa) for net merit by ideally increasing also the selection gain for each trait. However, for negatively correlated traits and depending on the variance components and economic weights of the individual traits, the use of an SH or base index can lead to a negative selection gain for an individual trait (Fig. 2, Scenario IV). By contrast, using a restricted index implicitly includes the decision that trait 2 is less important than trait 1. Independent of the trait correlations and variance components, this warrants a non-negative selection gain for trait 2 and therefore represents an interesting option for negatively correlated traits, as long as the market requirement for trait 2 is satisfied by keeping it just constant.

Choosing either an SH or a base index yielded similar results for ΔGa on net merit across almost all scenarios. For an SH index, however, information about several covariance components is required, which are mostly not available in applied breeding programs. Furthermore, the base index seems slightly more robust in the selection gain for individual traits than the SH index. For instance, a higher genetic variance of trait 2 led to a more intensive selection on this trait compared to the base index (Fig. 3, Scenario VI and VII, respectively). Thus, we recommend using the base index for applied breeding programs.

Advantages of index selection over ICL in GS based strategies

In applied breeding, phenotypic data of quality-related traits is available weeks or months after agronomic data. Then, selection based on indices is delayed until all data required for the index are completed. Breeding programs facing this data production imparity, have routinely used ICL to avoid delays in selections and reduce the number of genotypes assessed for quality-related traits. However, the wide adoption of genomic selection in plant breeding favors index selection over ICL, as GEBV are simultaneously available when the calibration set uses data from previous years. We found that ICL based strategies obtained 28.1% and 23.9% lower ΔGa for net merit as compared to the SH index for a GS prediction accuracy of 0.3 and a genetic correlation between traits of − 0.5 and 0.5 respectively (Fig. 1). This inferiority remained across all levels of correlation between traits or accuracy of genomic prediction, which is in harmony with the literature on phenotypic selection (Hazel and Lush 1942; Young 1961; Elgin et al 1970; and Vinson 1971). Therefore, we discourage the uncritical use of ICL in applied breeding but acknowledge specific scenarios where it might still be appropriate in the short term. For instance, when the rate of propagating material produced per plant is low, several cycles of multiplication are required before conducting yield trials. Thus, reliable information on yield is obtained one or several seasons after highly heritable traits such as plant height or yield components have been observed. In this case, ICL offers an alternative to practice multistage selection for multiple traits (Bernardo, 2010) by applying phenotypic selection on secondary traits during the first stages and in a later stage on yield. An alternative for these breeding programs facing data production imparity is to move from PSstandard ICL to GSrapid SH or base index. In this case, index selection based on GEBV during the initial stages is carried out in parallel to the planting for material multiplication. Selection based on phenotypic assessment takes place on a drastically reduced number of genotypes in a subsequent stage. Current developments in genome analysis and the low genotyping costs facilitate the application of this breeding strategy even in polyploid or “orphan” crops (Kamenya et al. 2021). Moreover, calibration set design could keep prediction accuracy and facilitate the selection of individuals to maintain the calibration set. A detailed assessment of the advantage of index selection over ICL in breeding programs with data production imparity exceeds the scope of our manuscript but will be investigated using the new implementation of ICL and index selection in the R package “selectiongain.

Practical section: index selection on grain yield and quality in hybrid wheat

In wheat, the combination of high grain yield with good baking quality is of utmost importance in the development of new varieties and the remuneration of farmers (Laidig et al. 2016). As baking trials are expensive and time-consuming, breeders and traders alike usually estimate bread-making quality by the grain protein content and the sedimentation volume (cf. Thorwarth et al. 2018). We, therefore, investigated two goals: the simultaneous improvement of (1) grain yield with grain protein content and (2) grain yield with sedimentation volume based on empirical data for variance components, correlation of traits, and economic weights. For both goals, selection gain for net merit and the individual traits was considerably higher for the breeding strategy GSrapid than for PSstandard (Fig. 4, Table 2). ICL was least efficient to improve the net merit for both goals, matching our theoretical results for the negative correlation between traits (Fig. 1, Scenario I). Using the SH or base index, the negative correlation of -0.40 between grain yield and grain protein content (goal 1) coupled with a low economic weight for grain protein content resulted in a negative selection gain for this trait. The use of a restricted index avoids a reduction of grain protein content but at the expense of a reduction in ΔGa for net merit and grain yield. However, the restricted index gives higher weight to grain protein content than based on the marginal returns of farmers over the last years, underpinning the importance of the choice of the index and the economic weights for individual traits on short- and long-term selection gain.

Although sedimentation volume was negatively correlated (r = -0.18) with grain yield, their combination in the SH or base index resulted in positive ΔGa for net merit and for each trait (Fig. 4, goal 2; Table 2). The ΔGa observed for the individual traits is most likely due to the low genetic correlation of traits and the high genetic variance and heritability of sedimentation volume. Despite this positive result, aiming for goal 2 led to 2.5% less ΔGa for net merit compared to goal 1. Similarly, 3.8% less selection gain for grain yield was obtained when comparing goal 2 to goal 1. Delivering almost identical ΔGa for net merit and grain yield, the use of an SH index resulted in a slightly higher ΔGa for sedimentation volume than the base index. Interestingly, the advantage of the SH or base index over a restricted index was low for ΔGa in net merit and grain yield but more pronounced for sedimentation volume.

The optimum allocation of test resources, i.e., optimum number of test candidates, locations, and testers in the individual selection stages, was similar across the different indices for both goals (Table 2). A similar number of test candidates, however, does not necessarily mean that the same genotypes were selected by different indices. By contrast, the different selection gains in the individual traits showed that quite different genotypes were finally selected depending on the index and the investigated scenario (Figs. 1, 2, 3). This is in line with experimental findings in bread (Thorwarth et al. 2018) and durum wheat (Rapp et al. 2018). Although quite similar across the indices in a given scenario, the optimum allocation of resources was influenced by the correlation of traits, economic weights, and variance components of both traits. For breeding strategy GSrapid using the SH index, we found in our theoretical section the average number of initial DH lines to be (1) 16.4% higher for negatively than for positively correlated traits, (2) 12.5% higher for equal economic weights for both traits versus all economic importance devoted to one trait, and (iii) 3.6% higher for equal variance components for both traits in comparison to larger masking variance for one of the traits (data not shown). This underlines the need to optimize the allocation of test resources of breeding programs separately and routinely depending on the target traits and markets.

In conclusion, the breeding strategy combining genomic selection with one-stage phenotypic selection (GSrapid) had a considerably higher ΔGa than the breeding strategy based on two-stage phenotypic selection (PSstandard) regardless of the index chosen for combining two traits. However, the choice of an index and the economic weight given to the individual traits have a large effect on ΔGa for the net merit (index) as well as for the individual traits. Breeders should therefore carefully investigate which index fits best to the short- and long-term goals of their breeding program. With the extensions of the R package “selectiongain” developed for this dissertation, breeders are now able to rapidly determine the optimum allocation of test resources and ΔGa for line and hybrid crops with different breeding strategies and choosing among index selection procedures.