Improving hybrid rice breeding programs via stochastic simulations: number of parents, number of hybrids, tester update, and genomic prediction of hybrid performance

Fritsche-Neto, Roberto; Ali, Jauhar; De Asis, Erik Jon; Allahgholipour, Mehrzad; Labroo, Marlee Rose

doi:10.1007/s00122-023-04508-6

Improving hybrid rice breeding programs via stochastic simulations: number of parents, number of hybrids, tester update, and genomic prediction of hybrid performance

Original Article
Open access
Published: 12 December 2023

Volume 137, article number 3, (2024)
Cite this article

Download PDF

You have full access to this open access article

Theoretical and Applied Genetics Aims and scope Submit manuscript

Improving hybrid rice breeding programs via stochastic simulations: number of parents, number of hybrids, tester update, and genomic prediction of hybrid performance

Download PDF

Roberto Fritsche-Neto ORCID: orcid.org/0000-0003-4310-0047^1,2,
Jauhar Ali¹,
Erik Jon De Asis¹,
Mehrzad Allahgholipour¹ &
…
Marlee Rose Labroo^3,4

2014 Accesses
1 Citation
3 Altmetric
Explore all metrics

Abstract

Key message

Schemes that use genomic prediction outperform others, updating testers increases hybrid genetic gain, and larger population sizes tend to have higher genetic gain and less depletion of genetic variance

Abstract

One of the most common methods to improve hybrid performance is reciprocal recurrent selection (RRS). Genomic prediction (GP) can be used to increase genetic gain in RRS by reducing cycle length, but it is also possible to use GP to predict single-cross hybrid performance. The impact of the latter method on genetic gain has yet to be previously reported. Therefore, we compared via stochastic simulations various phenotypic and genomics-assisted RRS breeding schemes which used GP to predict hybrid performance rather than reducing cycle length, which allows minimal changes to traditional breeding schemes. We also compared three breeding sizes scenarios that varied the number of genotypes crossed within heterotic pools, the number of genotypes crossed between heterotic pools, the number of hybrids evaluated, and the number of genomic predicted hybrids. Our results demonstrated that schemes that used genomic prediction of hybrid performance outperformed the others for the average interpopulation hybrid population and the best hybrid performance. Furthermore, updating the testers increased hybrid genetic gain with phenotypic RRS. As expected, the largest breeding size tested had the highest rates of genetic improvement and the lowest decrease in additive genetic variance due to the drift. Therefore, this study demonstrates the usefulness of single-cross prediction, which may be easier to implement than rapid-cycling RRS and cyclical updating of testers. We also reiterate that larger population sizes tend to have higher genetic gain and less depletion of genetic variance.

Genomic prediction in hybrid breeding: II. Reciprocal recurrent genomic selection with full-sib and half-sib families

Article Open access 31 August 2023

Optimal cross selection for long-term genetic gain in two-part programs with rapid recurrent genomic selection

Article Open access 06 June 2018

Reciprocal recurrent genomic selection: an attractive tool to leverage hybrid wheat breeding

Article 28 November 2018

Use our pre-submission checklist

Avoid common mistakes on your manuscript.

Introduction

Hybrid breeding via reciprocal recurrent selection (RRS) is thought to be one of the most effective strategies for improving the genetic value of crops with heterosis due to dominance (Cowling et al. 2020). RRS is an emerging breeding strategy for self-pollinating inbred-hybrid crops, such as rice (Oryza sativa L.); rice hybrids, which RRS does not always produce, currently outperform the best inbreds by around 30% with appropriate management (Toriyama et al. 2019; Labroo et al. 2021; Lu and Xu, 2010). The main reason to use RRS is that hybrids increasingly take advantage of heterosis due to the dominance over breeding cycles. Heterosis describes a phenomenon in which F₁ hybrids derived from crosses between heterotic pools with diverging allele frequencies at loci because of the dominance exhibit on average superior performance compared to individuals within heterotic pools, whether inbred or outbred (Matsubara 2020; Cui et al. 2020). Heterosis can also be due to epistasis, but RRS as a strategy primarily targets heterosis due to dominance (Lamkey and Edwards, 1999).

The availability of genomic information can increase the efficacy of RRS. It is well known that genomic selection (GS) can increase the rate of genetic gain in RRS programs by reducing the cycle length (Powell et al., 2020). Genomic information also has a further application in increasing the genetic value of hybrid value if used for genomic prediction of hybrid performance (Kadam et al. 2016). It is typically logistically impossible to make and phenotype all possible hybrid individuals between two pools, so predicting hybrid performance allows identifying individuals likely to be high performing before inter-pool crossing (Hallauer et al. 2010; Kadam et al. 2016). To predict hybrid performance, models are developed from a training population of hybrid phenotypes and maker genotypes (usually from previous cycles of the breeding program) and used (Cui et al. 2020). The hybrid marker profiles are typically deduced from their parent inbreds to save on genotyping costs. Some empirical studies show encouraging results in employing genomic prediction of hybrid performance in hybrid rice (Wang et al. 2017; Matsubara 2020; Cui et al. 2020; Labroo et al. 2021). However, they used static and genetically unbalanced datasets, and the effectiveness of the methods is only based on empirical accuracy.

With genomic prediction, estimation of GCA is also resolvable without testcrossing, so unlike in phenotypic programs, testcrossing is unnecessary. It has been demonstrated that testcrossing is suboptimal in terms of accuracy if used to create a training set for rapid-cycling recurrent genomic selection (Fristche-Neto et al. 2018; Seye et al. 2020). Conversely, the North Carolina design II is the best training set to predict hybrids taken from heterotic pools because, via testcrosses, the tester effect may mask the actual breeding values of the parents, then predictive abilities obtained within the same group but with a different tester can be disappointingly low (Albrecht et al. 2014). However, the impact of genomic prediction of hybrid performance on hybrid value over breeding cycles has not been previously reported to our knowledge (Fristche-Neto et al. 2018; Seye et al. 2020).

The use of RRS as well as genomic information in commercial hybrid rice breeding has not yet been fully adopted. Rapid-cycling reciprocal recurrent genomic selection is a resource-demanding and logistically challenging strategy, and there is a need for strategies to transition to its implementation. Conducting RRS with genomic prediction of hybrid performance without rapid cycling is a logistically attractive transition strategy because it allows infrastructure development to collect, manage, and analyze genomic data without the unforgiving timelines imposed by rapid cycling. Therefore, we evaluate long-term schemes in terms of parameters such as genetic gain, best hybrid performance, genetic variance, heterotic pool divergence, and prediction accuracy by stochastic simulation. We compare breeding scenarios that produce single-cross hybrids phenotypically or with genomic prediction, with or without updating testers in phenotypic cases, and consider three breeding program sizes. Our study aims to provide a transition strategy to use genomic information in RRS schemes to increase genetic gain with minimal changes to phenotypic RRS schemes.

Materials and methods

Our study compared different rice breeding strategies in a long-term reciprocal recurrent selection program by stochastic simulations in the R package AlphaSimR (Gaynor et al. 2021). We considered single crosses that resulted from testcrossing based on phenotypic data, with or without tester updating, vs. single-crosses subset from a North Carolina II design (factorial) predicted from genomic data, including additive or additive + dominance kernels at three different total sizes of the breeding program.

Historical population and genetic parameters

The historical rice founder population was simulated as 3,000 unique diploid inbred individuals, with 12 chromosome pairs each, using a Markovian Coalescent Simulator (MaCS; (Chen et al. 2009), considering a “GENERIC” species. The number of aggregating segments was defined based on the genome size (cM) described by Li et al. (2008). The “GENERIC” option allows the user to define specific genetic/genomic features in order to represent as much as possible the species in the study.

The target of the simulation was a quantitative trait, such as grain yield (GY). The trait was comprised of 30 QTN per chromosome, totaling 360 QTN. A simulated SNP chip with 83 SNPs per chromosome was used for genotyping, totaling 996 SNPs; SNP and QTN sites were not allowed to overlap. The additive, dominance, and average degree of dominance parameters were defined based on (Li et al. 2008). Each QTN was assigned additive and dominance effects. Total genetic values for each genotype were obtained by summing all additive and dominance effects times the appropriately scaled genotype dosage for all QTN; for details, see Gaynor et al. (2021). Additive effects ($a$) were sampled of a gamma distribution with scale and shape parameters equal to 1 and randomly assigned for each QTN. Similarly, dominance effects ($d$) for each QTN were computed by multiplying the absolute value of its additive effect ($a_{i}$) by locus-specific dominance degree ($\delta_{i}$). Dominance degrees were sampled from a Gaussian distribution with $\delta_{i} \sim N\left( {\mu_{\delta } ,\sigma_{\delta }^{2} } \right)$, where $\mu_{\delta }$ is the average dominance degree equal to 0.22 and $\sigma_{\delta }^{2}$ is the variance of the dominance degrees equal to 0.26. Therefore, there is at least a 33% chance that the delta will be negative (bi-directional dominance deviations) and a 6% chance that it will exceed the unit (overdominance).

The initial mean of the quantitative trait was 0, and its initial total genetic variance was 1. Phenotypic values of each individual were obtained by adding a random error sampled from a Gaussian distribution to its true total genetic value such that initial broad-sense heritability was 0.53 and initial narrow-sense heritability was 0.5; heritability changed over cycles as genetic variances changed. This study did not consider epistasis, even though it may contribute to heterosis in many rice populations (Huang et al. 2016).

Base population and burn-in phase

To obtain the base population, we selected 384 individuals based on their superior phenotypic values from the 3,000 lines of the historical population (Fig. 1). Then, we crossed all the selected lines in silico, obtaining 73,536 single crosses. Later, we selected the ten best hybrids based on their phenotypic performances. From the parents of the best hybrids, we identify the ten best and unique female and male parents to compose the cycle zero (C0) of heterotic pools (HP) A and B, respectively. Furthermore, the parents from each HP of the three best hybrids were considered testers. Next, we simulated three traditional reciprocal recurrent selection (RRS) cycles totaling nearly 20 years of breeding as the burn-in stage. We used breeding size scenario I, the current size used at IRRI, which will be described later. Finally, we obtained the base breeding population as a reference to evaluate the main objective of this study, different breeding schemes, and sizes for rice hybrid reciprocal recurrent selection breeding programs.

Breeding schemes simulated

Traditional (TRAD_RRS), drift (DRIFT_RRS), and traditional updating testers every cycle (TRAD_RRS_ UP) reciprocal recurrent selection

The traditional RRS scheme, TRAD_RRS, was conducted following the traditional RRS scheme described by Comstock et al. (1949) adapted to rice (Gilmore 1964) (Fig. 2). First, the selected P intra-pool parents were randomly mated within each HP to obtain the C1 recombinant generation. The C1 generation was fully inbred by the rapid generation advanced (RGA) single-seed descent (SSD) method, which accelerates the growth cycle to 90—100 days (Collard et al. 2017). Additionally, we assumed that females and males, respectively, passed through the cytoplasmic male sterile (CMS) conversion and restorer line fixation and its confirmation of fertility restoration in the F₁ hybrids (Toriyama et al. 2019). Then, a random sample of L intra-pool lines (Table 1) was obtained from each HP and crossed with three testers from the first cycle of selection of the reciprocal HP, with the same three testers used for all cycles. Thus, HO testcrosses (2 pools × L lines × 3 testers) were obtained and evaluated. After that, we selected the P best hybrids based on their phenotypic performance, then identified their parents, which the females will belong to HP A and the males to HP B. The best phenotypically performed hybrid was also “released” as a variety (Fig. 2). The selected parents were then recycled to restart the cycle.

Table 1 Number of lines, hybrids, and parents compared (breeding sizes scenarios) in the stochastic simulations

Full size table

The only difference between the TRAD_RSS and DRIFT_RSS is that for DRIFT_RRS a random sample of hybrids was obtained with the same size (P), and consequently, the parents were chosen by chance. The DRIFT_RRS is important to detangle/separate the changes in genetic variability due to selection or by genetic drift (small sample size). The latter happens when a small number of parents are used for recombination. It is a well-known phenomenon in the literature (Walsh and Lynch 2018). Finally, in the TRAD_RSS_UP, the three testers of each HP were replaced with improved ones (the three best testcross parents) every breeding cycle.

Genomic additive (GS_A_RRS) and additive + dominance (GS_AD_RRS) reciprocal recurrent selection

Overall, the genomic selection schemes followed almost the same framework used in the TRAD_RRS, with few modifications (Fig. 3). The primary difference was that rather than testcrossing, L inbred lines from each HP were genotyped. All their possible single-cross combinations genotypic values were then predicted in silico, resulting in HI testcrosses (L × L). The HO hybrids with the highest total genetic values were advanced to the inter-pool crossing block and phenotypically evaluated (Fristche-neto et al. 2018). As in TRAD_RRS, parents of the next generation were the parents of the P best-performing hybrids, but the best-performing hybrids were selected by predicted total genetic value rather than phenotype.

To compose the first training set (TS) for GS, we used the phenotypes and marker genotypes of 384 hybrids resulting from the base population after the burn-in stage. Markers’ effects were estimated using the ridge-regression best linear unbiased prediction (RR-BLUP), considering the respective functions in the AlphaSimR package for A and A+D (Gaynor et al. 2021). Moreover, we adopted the GPO strategy to update the TS to keep the accuracy at reasonable levels for more breeding cycles (Sabadin et al. 2021). In short, this strategy considers only the last three breeding cycles to compose the training set (Grandparents, Parents, and Offspring). Therefore, we added the newest HO hybrid data every breeding cycle, removed the oldest one, and maintained only the last three generations in the TS.

To perform the genomic predictions, the AlphasimR package (Gaynor et al. 2021) uses the basic RR-BLUP model for the additive model:

$${\varvec{y}} = {\varvec{WGu}} + \varepsilon$$

where ${\varvec{u}}\sim N\left( {0,{\varvec{I}}\sigma_{u}^{2} } \right)$ is a vector of marker additive effects, G is the genotype matrix (e.g., {aa,Aa,AA} = {− 1,0,1} for biallelic single nucleotide polymorphisms (SNPs) under an additive model), and W is the design matrix relating lines to observations (y). The BLUP solution for the marker effects can be written as $\hat{u} = \left( {{\varvec{Z}}^{\prime } {\varvec{Z}} + \lambda {\varvec{I}}} \right)^{ - 1} {\varvec{Z}}^{\prime } {\varvec{y}}$, where Z = WG and the ridge parameter $\lambda = \sigma_{e}^{2} /\sigma_{u}^{2}$ is the ratio between the residual and marker variances.

Additionally, for the additive-dominance scenario, we used the following model:

$${\varvec{y}} = {\varvec{WGu}} + {\varvec{WTd}} + \varepsilon$$

where ${\varvec{u}}\sim N\left( {0,{\varvec{I}}\sigma_{u}^{2} } \right)$ is a vector of marker additive effects, ${\varvec{d}}\sim N\left( {0,{\varvec{I}}\sigma_{d}^{2} } \right)$ is a vector of marker dominance effects, G is the genotype matrix (e.g., {aa,Aa,AA} = {− 1,0,1} for biallelic single nucleotide polymorphisms (SNPs) for an additive effects), T is the genotype matrix (e.g., {aa,Aa,AA} = {0,1,0} for biallelic single nucleotide polymorphisms (SNPs) for dominance effects), and W is the design matrix relating lines to observations (y). The BLUP solutions for the marker effects can be written as $\hat{u} = \left( {{\varvec{Z}}^{\prime } {\varvec{Z}} + \lambda {\varvec{I}}} \right)^{ - 1} {\varvec{Z}}^{\prime } {\varvec{y}}$ and $\hat{d} = \left( {{\varvec{Z}}_{{\varvec{d}}}^{\prime } {\varvec{Z}}_{{\varvec{d}}} + {\varvec{\lambda}}_{{\varvec{d}}} {\varvec{I}}} \right)^{ - 1} {\varvec{Z}}_{{\varvec{d}}}^{\prime } {\varvec{y}}$, where Z = WG, and the ridge parameter $\lambda = \sigma_{e}^{2} /\sigma_{u}^{2}$ is the ratio between the residual and marker additive variances. Finally, Z_d = WT, and the ridge parameter ${\varvec{\lambda}}_{{\varvec{d}}} = \sigma_{e}^{2} /\sigma_{d}^{2}$ is the ratio between the residual and marker dominance variances.

Breeding scenarios sizes

We also compared different breeding sizes for the most critical stages of the reciprocal selection schemes, such as the number of lines (L) sample from each heterotic pool, hybrids obtained in silico (HI), and field evaluated/obtained (HO), and finally, the number of parents used for recombination within each heterotic pool (Table 1). Therefore, in the end, we have six different breeding frameworks, one benchmark (DRIFT_RSS), and two modeling variations for the GS schemes, GS_A_RSS and GS_AD_RSS. Besides it would be interesting in terms of theory, we avoided simulating unrealistic breeding scenarios, at least for our budget conditions. The main bottleneck to work on the breeding sizes is not the genotyping process but the male sterility and recovery systems/pipelines, which demand a tremendous amount of labor, and thus, defining the maximum number of lines that can be managed per year.

Comparing breeding schemes

Considering that all compared methods used almost the same framework, there were no differences in cycle length among them. Therefore, we measured the average true genetic value of the HO hybrids, the true genetic value of the best hybrid, the true additive genetic variance within the heterotic pool, the prediction accuracy, and the divergence between the HP over the breeding cycles. The prediction accuracy was calculated as a Pearson’s correlation between hybrid true genetic values and the hybrid genomic estimated total genetic value. The selection accuracy was computed for the TRAD_RSS scheme as the correlation between hybrid phenotypic value and hybrid true total genetic value. Conversely, for the DRIFT_RRS, it was considered zero because the selection was made by chance. In its turn, the divergence was estimated by the fixation index (F_ST) (Luo et al. 2019).

Each strategy was simulated for 20 breeding cycles and replicated 100 times within a single population using the AlphaSimR package (Gaynor et al. 2021).

Results

The GS (GS_A_RSS and GS_AD_RRS) methods outperformed the traditional (phenotypic selection, TRAD_RRS) and drift (DRIFT_RRS) for both the average interpopulation hybrid population performance and the best hybrid performance (Fig. 4a, b). Furthermore, updating the testers in TRAD_RRS_UP provides a great increase in response to selection compared to repeated use of the same testers in TRAD_RRS. This trend was regardless of the breeding size, but it was more substantial considering the size III (~ 42%).

Overall, breeding size scenario III provides higher responses to selection (Fig. 4a) and reduces the losses in additive genetic variance (V_a) by drift (Fig. 5a, b). However, Va substantially decreased over the cycles in both heterotic pools regardless of the method.

Regardless of the method, the fixation index (F_ST) increased over the cycles, even in the drift scenario, being more substantial in breeding size I and less accentuated considering breeding size III. In the latter, there was a better separation between the F_ST produced by selection or drift (Fig. 4c). Conversely, for all methods, the prediction accuracy significantly declined in the first two cycles (between the TS and first breeding cycles), then presented a steady trend until the 20th cycle.

Comparing the GS methods in terms of modeling, using only additive (GS_A_RRS) or additive + dominance (GS_AD_RRS) to predict hybrid performances, there were no significant differences between them for any parameters studied over the reciprocal recurrent breeding cycles.

Finally, after 20 breeding cycles, using GS and breeding size III rather than TRAD_RRS and size I, it was possible to increase by more than 56% the average interpopulation hybrid population performance and the best hybrid performance. Please note that genomic selection by the A and A + D models has almost overlapped (since the light blue broken line is not visible in many figures).

Discussion

Simulations have proven to be a powerful tool to assess the main factors in long-term breeding programs because several features can be controlled to make inferences on given genetic parameters over cycles in a fast, inexpensive, and consistent way (Dai et al. 2020). In this study, we successfully used this approach to examine the effectiveness of two main frameworks in employing RRS in hybrid rice breeding, phenotypic and genomic selection (Figs. 4 and 5). Furthermore, we examined three breeding sizes scenarios concerning the number of parents to obtain testcrosses, hybrids obtained and in silico and evaluated them in the field, and the number of parents for recombination within the heterotic pools (Table 1). Moreover, the simulated method mimics the current scenario in the IRRI hybrid rice breeding program (breeding size I) and some possible adjustments in the framework.

Several studies have reported an increase in response to selection in breeding schemes using GS (Gorjanc et al. 2018; Muleta et al. 2019; DoVale et al. 2021; Sabadin et al. 2021). However, to increase the GS-based breeding program efficiency, it is essential to decide how and at which stage to apply this tool (Sabadin et al. 2021). Usually, this increase is due to the shortening of the breeding cycle, which can be logistically challenging. Conversely, in our study, there were no differences in cycle length among breeding frameworks. Instead, by predicting all possible single-cross hybrids and advancing on their genomic predicted total genetic value, it was possible to increase mean hybrid value and maximum hybrid value (Fig. 4a, b). In practice, shortening the cycle by genomic selection on GCA can be used concurrently with genomic prediction of hybrid performance, but starting with use of the latter may be a useful transition strategy for programs moving to use of genomic information.

The first reason single-cross prediction appeared to increase hybrid value was that it enabled regular inter-pool crossing to the most recent generation of the reciprocal HP, rather than using a few testers or even testers from previous breeding cycles. This is apparent because TRAD_RRS_UP had intermediate performance between TRAD_RRS and the GS methods; like TRAD_RRS_UP, the GS methods continuously update both pools rather than only one in TRAD_RRS (Fig. 4a, b). In other words, both parents benefit from population improvement as they are derived from recent breeding cycles in TRAD_RRS_UP and GS methods. The observations of Labroo and Rutkoski (2022) support our findings, as they observed that allowing old parents to be recycled with overlapping generations tended to decrease genetic gain in recurrent selection, and using old testers has a similar effect as using old parents in the RRS context. Although we include the TRAD_RRS_UP scenario to demonstrate this point, in practice, it can be challenging for programs to update testers due to seed limitations, so GS may provide logistical advantages that support regular updating of both HP. Other recent results show that using parents from just one generation back is enough to create a significant “penalty” in the population’s performance (Platten and Fritsche-Neto 2022). Additionally, although we selected testers as the parents of the most recent high-value hybrids, we note that if needed, even a random, more recent tester is likely to produce better hybrids than a selected but older tester.

The second reason single-cross prediction appeared to increase hybrid value was that hybrids were advanced from all possible single-cross combinations on genomic estimated total genetic value rather than testcrossing to three testers. This increases the chances of identifying the best crosses, which is unfeasible with forms of RRS that require a resolvable estimate of GCA (DoVale et al. 2021). Since we did not estimate GCA or SCA in our study and simply recycled and chose testers based on hybrid phenotype or genomic estimated total genetic value (performance) in the previous generation, the decreased performance of the phenotypic methods was partly due to inability to separate the parental GCA effects and SCA effects in the hybrid progeny of the current generation (Cowling et al. 2020). With GS, the prediction of hybrid additive and additive + dominance genetic values is typically equivalent to prediction of parental GCA and SCA, so it was possible to utilize SCA non-randomly in the hybrids via genomic prediction of single-cross performance but not with phenotypic methods (Kadam et al. 2016; Seye et al. 2020). In our study, the use of single-cross prediction also likely indirectly increased the selection differential for GCA, because the rate of increase in hybrid value over cycles was higher than with phenotypic strategies, not just the hybrid value. The use of a reduced NCII design in inter-pool crossing also likely had higher accuracies than a genomics-assisted inter-pool testcrossing design would (Albrecht et al. 2014; Fristche-neto et al. 2018).

Some studies have already shown that information about dominance deviations incorporated in GS models can increase the prediction accuracy (Azevedo et al. 2015; Matias et al. 2018; Dias et al. 2018; Dai et al. 2020), especially when the final product is heterozygous and dominance is present (Dos Santos et al. 2016). Even though only average effects are transmitted over generations, the transmission of average effects via GCA in intra-pool individuals in RRS leads to increased dominance deviations in the inter-pool hybrids over breeding cycles (Falconer and Mackay 1996). In this context, our findings (Fig. 2) show that including dominance effects in the model did not increase accuracy or genetic gain in this population (Technow et al. 2012; Alves et al. 2019b; DoVale et al. 2021). This may be due to the low degree of dominance assumed here for rice yield, which may be lower than in other traits like maize yield (Li et al. 2008; Lin et al. 2020; Labroo et al. 2021). Another possible explanation was described by Reif et al. (2007), where the dominance variance decreases the respective additive variance but increases according to the populations’ divergence. Thus, dominance effects are increasingly absorbed by the population mean or become inseparable from additive effects over breeding cycles (Technow et al. 2014). Therefore, over the breeding cycles, dominance may become more important. Because there was no significant difference in hybrid prediction, including the dominance factor in the GS models did not seem harmful, and in other populations it may be useful. In the presence of dominance, the additive-dominant model is expected to be more accurate, as the additive model falsely assumes that residuals are independent and identically distributed (Duenk et al. 2017). Furthermore, additive-dominant models allow for a more refined assessment of genomic contributions to performance than models that only consider additive effects. However, when we consider only the additive kernel in the models, additive effects, dominance, and residuals are not readily distinguishable (Alves et al. 2019a).

Some possible explanations are that RRS methods not only work based on dominance but also in the other two heterosis components, reorganizing the alleles to maximize and explore divergence and complementary effects (Hallauer et al. 2010). Also, the dominance effect is small in rice compared to other crops, such as maize (DoVale et al. 2021). Finally, the allele substitution effect $\alpha = a + \left( {q - p} \right)d$, when $p \ne q$ and $d \ne 0$, partially captures the dominance effects (Falconer and Mackay 1996). Therefore, the additive model captures, at least in part, the dominance variance. However, the inclusion of dominance in the GS model does not “hurt”, and, in specific cases and germplasm, it might be positive.

This study used an SNP chip containing 996 markers to simulate the SNP panel optimized for the IRRI-irrigated breeding program (Arbelaez et al. 2019). Low-density SNP panels are attractive for GS due to the lower price per sample (Cerioli et al. 2022). In this context, our results showed that even using a low-density marker set, it is possible to keep the accuracy at levels that provide genetic gain over many breeding cycles, however, there is a need to update the TS properly (Sabadin et al. 2021). Furthermore, as described by Sabadin et al. (2021), the GPO updating method promotes more accurate estimates of LD between markers and QTL since, in long-term breeding schemes, recombination between marker and QTL causes an LD decrease. In contrast, selection and drift generate new LD or tighten the LD between closely linked loci. We did not consider the cost of genotyping in comparing our breeding schemes because, for the IRRI program, the cost is negligible, and this will vary depending on institutions.

As expected and well described in the literature, the bigger the population, the bigger the genetic gains. Hence, breeding size III provides higher responses to selection (Fig. 4a). Also, size III was slightly better than others in preserving V_a, although selection intensity differed among the three breeding sizes (Fig. 5a, b). Any breeding method, mainly those based on GS, will vanish the genetic variability faster than in seen traditional ways. So, it has some main consequences: i) Breeders can reach the plateau of the genetic gains in a population much faster using GS-based methods; ii) After that, one can select a new set of parents and start a new breeding population or even introduce a new genetic V_a. Therefore, there is no case of “the population of the breeder’s life”, in other words, the need to create a population to support the selection for 20–30 years of breeding; iii) As the selection will be very intense, drift must be minimized to guarantee that the losses in genetic variability will be driven by selection.

In this context, it is well known that smaller effective population sizes (Ne) increase the effect caused by the drift (Hartl and Clark 2006), reducing genetic variance in long-term breeding cycles. Consequently, it may create a plateau of faster selection response due to the inbreeding effect. Also, the fixation of the alleles in the heterotic groups becomes partly due to random factors (Gerke et al. 2015; DoVale et al. 2021). Reducing inbreeding can be done by increasing the number of parents (breeding sizes) or conserving population genetic diversity via optimal parental contribution (Cowling et al. 2020). Programs considering increasing the size of the breeding population to increase genetic gain or preserve genetic variance should carefully consider whether this intervention is cost-effective. Although some programs may be able to increase the number of entries evaluated at very little cost (e.g., when fixed costs are high and variable costs are low), for others, it may be quite expensive, but this depends on the situation.

Finally, after the horizon of 20 breeding cycles, using GS and breeding size III rather than the phenotypic selection and size I, it is possible to increase by more than 56% the average interpopulation hybrid population performance and the best hybrid performance. Thereby showing the importance of combining the appropriate method of selection, having a dynamic pairwise and updating system of testing using the possibility of creating all possible testcrosses in silico, and the effective population sizes to evaluate and recombine parents to avoid genetic drift and better conserve the genetic variation.

Data availability

The datasets and scripts used for this study can be found in the supplementary material.

References

Albrecht T, Auinger H-J, Wimmer V et al (2014) Genome-based prediction of maize hybrid performance across genetic groups, testers, locations, and years. Theor Appl Genet 127:1375–1386. https://doi.org/10.1007/s00122-014-2305-z
Article PubMed Google Scholar
Alves FC, Granato ÍSC, Galli G et al (2019a) Bayesian analysis and prediction of hybrid performance. Plant Methods. https://doi.org/10.1186/s13007-019-0388-x
Article PubMed PubMed Central Google Scholar
Alves FC, Stefanine Í, Granato C et al (2019b) Bayesian analysis and prediction of hybrid performance. Plant Methods. https://doi.org/10.1186/s13007-019-0388-x
Article PubMed PubMed Central Google Scholar
Arbelaez JD, Dwiyanti MS, Tandayu E et al (2019) 1k-RiCA (1K-Rice Custom Amplicon) a novel genotyping amplicon-based SNP assay for genetics and breeding applications in rice. Rice 12(1):1–15
Article Google Scholar
Azevedo CF, de Resende MDV, e Silva FF et al (2015) Ridge, Lasso and Bayesian additive-dominance genomic models. BMC Genet 16:1–13. https://doi.org/10.1186/s12863-015-0264-2
Article CAS Google Scholar
Cerioli T, Hernandez CO, Angira B et al (2022) Development and validation of an optimized marker set for genomic selection in southern U.S. rice breeding programs. Plant Genome 15:20219. https://doi.org/10.1002/tpg2.20219
Article CAS Google Scholar
Chen GK, Marjoram P, Wall JD (2009) Fast and flexible simulation of DNA sequence data. Genome Res 19:136–142. https://doi.org/10.1101/gr.083634.108
Article CAS PubMed PubMed Central Google Scholar
Collard BCY, Beredo JC, Lenaerts B, Mendoza R, Santelices R, Lopena V et al (2017) Revisiting rice breeding methods–evaluating the use of rapid generation advance (RGA) for routine rice breeding. Plant Prod Sci 20:337–352. https://doi.org/10.1080/1343943X.2017.1391705
Article Google Scholar
Comstock RE, Robinson HF, Harvey PH (1949) A breeding procedure designed to make maximum use of both general and specific combining ability 1. Agron J 41:360–367. https://doi.org/10.2134/agronj1949.00021962004100080006x
Article Google Scholar
Cowling WA, Gaynor RC, Antolín R et al (2020) In silico simulation of future hybrid performance to evaluate heterotic pool formation in a self-pollinating crop. Sci Rep 10:4037. https://doi.org/10.1038/s41598-020-61031-0
Article CAS PubMed PubMed Central ADS Google Scholar
Cui Y, Li R, Li G et al (2020) Hybrid breeding of rice via genomic selection. Plant Biotechnol J 18:57–67. https://doi.org/10.1111/pbi.13170
Article CAS PubMed Google Scholar
Dai Z, Long N, Huang W (2020) Influence of genetic interactions on polygenic prediction. G3: Genes Genomes Genet 10:109–115. https://doi.org/10.1534/g3.119.400812
Article Google Scholar
Dias KODG, Gezan SA, Guimarães CT et al (2018) Improving accuracies of genomic predictions for drought tolerance in maize by joint modeling of additive and dominance effects in multi-environment trials. Heredity 121:24–37. https://doi.org/10.1038/s41437-018-0053-6
Article CAS PubMed PubMed Central Google Scholar
Dos Santos JPR, De Castro Vasconcellos RC, Pires LPM et al (2016) Inclusion of dominance effects in the multivariate GBLUP model. PLoS ONE 11:1–21. https://doi.org/10.1371/journal.pone.0152045
Article CAS Google Scholar
DoVale JC, Carvalho HF, Sabadin F, Fritsche-R (2021) Reduction of genotyping marker density for genomic selection is not an affordable approach to long-term breeding in cross-pollinated crops
Duenk P, Calus MPL, Wientjes YCJ, Bijma P (2017) Benefits of dominance over additive models for the estimation of average effects in the presence of dominance. G3 Genes Genomes Genet 7:3405–3414. https://doi.org/10.1534/g3.117.300113
Article CAS Google Scholar
Falconer DS, Mackay TFC (1996) Introduction to quantitative genetics. Pearson Education India, Noida
Google Scholar
Fristche-neto R, Akdemir D, Jannink J (2018) Accuracy of genomic selection to predict maize single-crosses obtained through different mating designs. Theor Appl Genet 131:1153–1162. https://doi.org/10.1007/s00122-018-3068-8
Article CAS Google Scholar
Gaynor RC, Gorjanc G, Hickey JM (2021) AlphaSimR: an R package for breeding program simulations. G3 Genes Genomes Genet. https://doi.org/10.1093/g3journal/jkaa017
Article Google Scholar
Gerke JP, Edwards JW, Guill KE et al (2015) The genomic impacts of drift and selection for hybrid performance in maize. Genetics 201:1201–1211. https://doi.org/10.1534/genetics.115.182410
Article PubMed PubMed Central Google Scholar
Gilmore EC (1964) Suggested method of using reciprocal recurrent selection in some naturally self-pollinated species 1. Crop Sci 4:323–325. https://doi.org/10.2135/cropsci1964.0011183X000400030027x
Article Google Scholar
Gorjanc G, Gaynor RC, Hickey JM (2018) Optimal cross selection for long-term genetic gain in two-part programs with rapid recurrent genomic selection. Theor Appl Genet 131:1953–1966. https://doi.org/10.1007/s00122-018-3125-3
Article PubMed PubMed Central Google Scholar
Hallauer AR, Carena MJ, Miranda Filho JB (2010) Quantitative genetics in maize breeding. Springer, New York
Google Scholar
Hartl DL, Clark AG (2006) Principles of population genetics. Sinauer Associates Inc, Sunderland
Google Scholar
Huang X, Yang S, Gong J et al (2016) Genomic architecture of heterosis for yield traits in rice. Nature 537:629–633. https://doi.org/10.1038/nature19760
Article CAS PubMed ADS Google Scholar
Kadam DC, Potts SM, Bohn MO et al (2016) Genomic prediction of single crosses in the early stages of a maize hybrid breeding pipeline. G3: Genes Genomes Genet 6:3443–3453. https://doi.org/10.1534/g3.116.031286
Article Google Scholar
Labroo MR, Rutkoski JE (2022) New cycle, same old mistakes? Overlapping vs. discrete generations in long-term recurrent selection. BMC Genomics 23:736. https://doi.org/10.1186/s12864-022-08929-3
Article CAS PubMed PubMed Central Google Scholar
Labroo MR, Ali J, Aslam MU et al (2021) Genomic prediction of yield traits in single-cross hybrid rice (Oryza sativa L.). Front Genet 12:692870. https://doi.org/10.3389/fgene.2021.692870
Article PubMed PubMed Central Google Scholar
Li L, Lu K, Chen Z et al (2008) Dominance, overdominance and epistasis condition the heterosis in two heterotic rice hybrids. Genetics 180:1725–1742. https://doi.org/10.1534/genetics.108.091942
Article PubMed PubMed Central Google Scholar
Lin T, Zhou C, Chen G et al (2020) Heterosis-associated genes confer high yield in super hybrid rice. Theor Appl Genet 133:3287–3297. https://doi.org/10.1007/s00122-020-03669-y
Article CAS PubMed PubMed Central Google Scholar
Luo Z, Brock J, Dyer JM et al (2019) Genetic diversity and population structure of a camelina sativa spring panel. Front Plant Sci. https://doi.org/10.3389/fpls.2019.00184
Article PubMed PubMed Central Google Scholar
Matias FI, Barrios SCL, Bearari LM et al (2018) Contribution of additive and dominance effects on agronomical and nutritional traits, and multivariate selection on spp. Hybrids Crop Sci 58:2444. https://doi.org/10.2135/cropsci2018.04.0261
Article Google Scholar
Matsubara K (2020) How hybrid breakdown can be handled in rice crossbreeding? Front Plant Sci 11:575412. https://doi.org/10.3389/fpls.2020.575412
Article PubMed PubMed Central Google Scholar
Muleta KT, Pressoir G, Morris GP (2019) Optimizing genomic selection for a sorghum breeding program in Haiti: a simulation study. G3: Genes Genomes Genet 9:391–401. https://doi.org/10.1534/g3.118.200932
Article Google Scholar
Platten JD, Fritsche-Neto R (2022) Optimizing QTL introgression via stochastic simulations: an example of the IRRI rice breeding program. https://doi.org/10.21203/rs.3.rs-1780978/v1
Reif JC, Gumpert FM, Fischer S, Melchinger AE (2007) Impact of interpopulation divergence on additive and dominance variance in hybrid populations. Genetics 176:1931–1934. https://doi.org/10.1534/genetics.107.074146
Article CAS PubMed PubMed Central Google Scholar
Sabadin F, Dovale JC, Platten J, Fritsche-Neto R (2021) Optimizing self-pollinated crop breeding employing genomic selection: from schemes to updating training sets. Front Plant Sci 13:3770
Google Scholar
Seye AI, Bauland C, Charcosset A, Moreau L (2020) Revisiting hybrid breeding designs using genomic predictions: simulations highlight the superiority of incomplete factorials between segregating families over topcross designs. Theor Appl Genet 133:1995–2010. https://doi.org/10.1007/s00122-020-03573-5
Article CAS PubMed Google Scholar
Technow F, Riedelsheimer C, Schrag TA, Melchinger AE (2012) Genomic prediction of hybrid performance in maize with models incorporating dominance and population specific marker effects. Theor Appl Genet 125:1181–1194. https://doi.org/10.1007/s00122-012-1905-8
Article PubMed Google Scholar
Technow F, Schrag TA, Schipprack W et al (2014) Genome properties and prospects of genomic prediction of hybrid performance in a breeding program of maize. Genetics 197:1343–1355. https://doi.org/10.1534/genetics.114.165860
Article PubMed PubMed Central Google Scholar
Toriyama K, Kazama T, Sato T et al (2019) Development of cytoplasmic male sterile lines and restorer lines of various elite Indica Group rice cultivars using CW-CMS/Rf17 system. Rice 12:73. https://doi.org/10.1186/s12284-019-0332-8
Article PubMed PubMed Central Google Scholar
Walsh B, Lynch M (2018) evolution and selection of quantitative traits. Oxford University Press
Book Google Scholar
Wang X, Li L, Yang Z et al (2017) Predicting rice hybrid performance using univariate and multivariate GBLUP models based on North Carolina mating design II. Heredity 118:302–310. https://doi.org/10.1038/hdy.2016.87
Article CAS PubMed Google Scholar

Download references

Funding

AGGRi Alliance (Accelerated Genetic Gain in Rice in South Asia and Africa—OPP1194889)—Bill and Melinda Gates Foundation (BMGF).

Author information

Authors and Affiliations

International Rice Research Institute (IRRI), Los Banos, Philippines
Roberto Fritsche-Neto, Jauhar Ali, Erik Jon De Asis & Mehrzad Allahgholipour
H. Rouse Caffey Rice Research Station, LSU AgCenter, Rayne, USA
Roberto Fritsche-Neto
Excellence in Breeding Platform, Consultative Group of International Agricultural Research, Lisbon, Mexico
Marlee Rose Labroo
International Maize and Wheat Improvement Center (CIMMYT), Texcoco, Mexico
Marlee Rose Labroo

Authors

Roberto Fritsche-Neto
View author publications
You can also search for this author in PubMed Google Scholar
Jauhar Ali
View author publications
You can also search for this author in PubMed Google Scholar
Erik Jon De Asis
View author publications
You can also search for this author in PubMed Google Scholar
Mehrzad Allahgholipour
View author publications
You can also search for this author in PubMed Google Scholar
Marlee Rose Labroo
View author publications
You can also search for this author in PubMed Google Scholar

Contributions

RFN elaborated on the hypothesis, conducted the analyses, interpreted the results, and contributed to the writing. MRL edited the code and manuscript. JA, EJDA, and MA contributed to the writing, mainly through discussion. All authors read and approved the final manuscript.

Corresponding authors

Correspondence to Roberto Fritsche-Neto or Jauhar Ali.

Ethics declarations

Conflict of interest

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Additional information

Communicated by Hiroyoshi Iwata.

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

Reprints and permissions

About this article

Cite this article

Fritsche-Neto, R., Ali, J., De Asis, E.J. et al. Improving hybrid rice breeding programs via stochastic simulations: number of parents, number of hybrids, tester update, and genomic prediction of hybrid performance. Theor Appl Genet 137, 3 (2024). https://doi.org/10.1007/s00122-023-04508-6

Download citation

Received: 25 April 2023
Accepted: 18 November 2023
Published: 12 December 2023
DOI: https://doi.org/10.1007/s00122-023-04508-6

Use our pre-submission checklist

Avoid common mistakes on your manuscript.

Improving hybrid rice breeding programs via stochastic simulations: number of parents, number of hybrids, tester update, and genomic prediction of hybrid performance