Models to estimate genetic gain of soybean seed yield from annual multi-environment field trials

Krause, Matheus D.; Piepho, Hans-Peter; Dias, Kaio O. G.; Singh, Asheesh K.; Beavis, William D.

doi:10.1007/s00122-023-04470-3


Original Article
Open access
Published: 21 November 2023

Volume 136, article number 252, (2023)

Theoretical and Applied Genetics


Matheus D. Krause¹ (ORCID: orcid.org/0000-0003-2411-9287), Hans-Peter Piepho², Kaio O. G. Dias³, Asheesh K. Singh¹ & William D. Beavis¹


Abstract

Key message

Simulations demonstrated that estimates of realized genetic gain from linear mixed models using regional trials are biased to some degree. Thus, we recommend multiple selected models to obtain a range of reasonable estimates.

Abstract

Genetic improvements of discrete characteristics are obvious and easy to demonstrate, while quantitative traits require reliable and accurate methods to disentangle the confounding genetic and non-genetic components. Stochastic simulations of soybean [Glycine max (L.) Merr.] breeding programs were performed to evaluate linear mixed models to estimate the realized genetic gain (RGG) from annual multi-environment trials (MET). True breeding values were simulated under an infinitesimal model to represent the genetic contributions to soybean seed yield under various MET conditions. Estimators were evaluated using objective criteria of bias and linearity. Covariance modeling and direct versus indirect estimation-based models resulted in a substantial range of estimated values, all of which were biased to some degree. Although no models produced unbiased estimates, the three best-performing models resulted in an average bias of $\pm\, 7.41$ kg ha$^{-1}$ yr$^{-1}$ ($\pm\, 0.11$ bu ac$^{-1}$ yr$^{-1}$). Rather than relying on a single model to estimate RGG, we recommend the application of several models with minimal and directional bias. Further, based on the parameters used in the simulations, we do not think it is appropriate to use any single model to compare breeding programs or quantify the efficiency of proposed new breeding strategies. Lastly, for public soybean programs breeding for maturity groups II and III in North America, the estimated RGG values ranged from 18.16 to 39.68 kg ha$^{-1}$ yr$^{-1}$ (0.27–0.59 bu ac$^{-1}$ yr$^{-1}$) from 1989 to 2019. These results provide strong evidence that public breeders have significantly improved soybean germplasm for seed yield in the primary production areas of North America.


Introduction

The purpose of plant breeding is to improve genetic contributions to plant characteristics. For many discrete characteristics such as fruit and seed color, pubescence, herbicide and disease resistance, etc., the genetic improvements are obvious and easy to demonstrate. However, for continuous characteristics, such as seed or grain yield, the genetic contributions are incremental and difficult to separate from management practices or variable and changing environments. Plant breeders have used various experimental and statistical methods to estimate incremental genetic improvements of quantitative traits (e.g., Rincker et al. 2014; Byrum et al. 2017). However, with the exception of Rutkoski (2019b), the various proposed methods have not been compared using objective criteria. Herein we utilize a simulation approach to propose and investigate linear mixed models to obtain estimates of genetic improvements for quantitative traits.

Phenotypes (P) of traits evaluated on continuous scales are composed of discretely inherited polygenic effects (G), continuous environmental effects (E), and GE interaction effects (GEI) (Fisher 1918; Mayr 1942; Sprague and Federer 1951; Lynch and Walsh 1998). Further, the simple linear model $P = G + E + GE$ has been successfully used to investigate responses to selection (Johannsen 1911; Tabery 2008). The genetic component G is modeled as the sum of additive effects of alleles at multiple loci and non-additive genotypic effects (dominance and epistatic). For diploid species, additive effects refer to those associated with discrete alleles that can be inherited across generations, while non-additive genetic effects refer to genotypes that are not transmitted through inheritance. Thus, cycles of selection and reproduction affect genetic variability only through inherited alleles, also known as breeding values (Kempthorne 1957; Falconer 1960). In other words, realized genetic gain (RGG) is the improvement in average genetic (breeding) values due to the accumulation of favorable alleles through recurrent cycles of selection (Hazel and Lush 1942; Walsh and Lynch 2018; Rutkoski 2019a).

Applications of this theory were originally demonstrated using population improvement methods (Jenkins 1940). An original sample drawn from an unselected population is known as the founder population and is referred to as cycle zero (C0). Based on evaluated phenotypic values, a subset from C0 is selected to be randomly inter-crossed. The progeny generated by inter-crossing among the selected group represents a Mendelian sample of all possible progeny from the selected alleles, and is referred to as the C1 population. The process used to create C1 is reiterated with a sample from the C1 population to create C2, which is used to create C3, etc. The process is continuous and the expected outcomes are populations of genetically improved individuals.

Within a cycle of population improvement, the difference between the average phenotypic value of the population and the average of a selected sub-group is known as the selection differential (S). The difference between the averaged phenotypic values representing two consecutive cycles of recurrent selection is known as the realized response to selection (R). Thus, under an additive (linear) model (Fisher 1918), R equals the mean breeding value of members that are selected for inter-crossing (Walsh and Lynch 2018), and the relationship between S and R is given by the breeder’s equation: $R = h^2S$, where $h^2$ is the narrow-sense heritability (Lush 1937). Because only breeding values, i.e., additive allelic effects, are transmitted from one cycle to the next (for diploid species), the ratio R/S assumes values between zero and one and is known as the realized heritability ($h^2_R$) (Hallauer and Miranda 1988; Bernardo 2020). The $h^2_R$ associated with R and S can be used in the breeder’s equation to obtain an estimate of RGG. Alternatively, it has been suggested that RGG can be estimated from the slope of the regression line of average breeding values on breeding cycles across time (Eberhart 1964; Falconer and Mackay 1996; Rutkoski 2019a; Fritsche-Neto et al. 2023).
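As a small numerical illustration of the quantities defined above, the following sketch computes the selection differential, the response expected from the breeder's equation, and the realized heritability. The function names and the yield values are hypothetical and are not taken from the article.

```python
def selection_differential(population_mean, selected_mean):
    """S: mean of the selected sub-group minus the population mean."""
    return selected_mean - population_mean


def expected_response(h2, s):
    """Breeder's equation, R = h2 * S, with narrow-sense heritability h2."""
    if not 0.0 <= h2 <= 1.0:
        raise ValueError("h2 must lie between 0 and 1")
    return h2 * s


def realized_heritability(r, s):
    """Realized heritability, R / S, from a response R and differential S."""
    if s == 0:
        raise ValueError("selection differential must be non-zero")
    return r / s


# Hypothetical yields in kg/ha (illustrative only, not from the article).
s = selection_differential(population_mean=3000.0, selected_mean=3400.0)
r = expected_response(h2=0.25, s=s)
h2_r = realized_heritability(r=r, s=s)
print(s, r, h2_r)
```

With these assumed inputs, a selection differential of 400 kg/ha and a heritability of 0.25 imply an expected response of 100 kg/ha per cycle.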

While RGG has been used to compare various population improvement methods in maize (Jenkins 1940; Hallauer and Miranda 1988; Dudley and Lambert 2004), since about 1980, population improvement breeding methods have not been used routinely to develop competitive hybrids. Rather, various types of cultivar development methods have been developed for maize and most commodity crops (Singh et al. 2021). Nonetheless, Duvick and co-workers demonstrated that previously sold maize hybrids could be used to assess progress in proprietary hybrid maize breeding programs (e.g., Duvick 1977, 1984, 2005). Their approach consisted of selecting a few widely grown maize hybrids to represent time periods (eras), and evaluating these in replicated field trials conducted in a common set of environments (location-year combinations). The estimated genotypic values for hybrids were regressed across the years when the hybrids were released to obtain an estimated trend (slope). As with maize, RGG in soybean has been estimated using “era trials” (Wilcox 2001; Ustun et al. 2001; Fox et al. 2013; Rincker et al. 2014; Rogers et al. 2015; Felipe et al. 2016; Bruce et al. 2019; Milioli et al. 2022).

Because era trials use designed experiments in which widely grown cultivars represent a selected “treatment,” replicated across the same set of environments, treatments and environments are orthogonal. However, inferences from era trials are limited because there are only a select few cultivars representing each era and they are evaluated only in a few environments. These do not represent a random sample of genotypes used to create the breeding populations. Thus, changes in the genotypic component across time represent an estimated commercial gain, which is not directly interpreted as an estimate of RGG (see Discussion). Further, the years in which the trials are conducted will favor newer cultivars that are more likely to be adapted to the environments in which the era trial is conducted (Piepho et al. 2014; Rizzo et al. 2022). Therefore, there is likely an environmental bias that favors and is confounded with recently released cultivars.

Alternatively, historical field data from annual multi-environment trials (MET) used to evaluate genotypes during their development have been used to assess changes across time for common bean (de Faria et al. 2018), potato (Ortiz et al. 2022), rice (Breseghello et al. 2011; Streck et al. 2018), rye (Laidig et al. 2017), sugarcane (Ellis et al. 2004), sunflower (de la Vega et al. 2007), wheat (Crespo-Herrera et al. 2018; Gerard et al. 2020), among other commercial crops and forage grasses (Piepho et al. 2014; Laidig et al. 2014). The advantages of using historical MET records are that the data (i) are usually available in stored repositories and (ii) represent larger samples of both genotypes and environments. However, historical MET usually have low connectivity among genotypes across environments because most experimental genotypes are culled annually. Thus, the disadvantages are that (i) environmental effects are mostly estimated from a relatively small set of selected experimental genotypes and a set of check cultivars, and (ii) new breeding programs usually do not have large historical datasets (Covarrubias-Pazaran 2020).

The most widely used method for partitioning genetic and non-genetic effects is the linear mixed model (LMM, Henderson 1949, 1950; Henderson et al. 1959). For an LMM, Best Linear Unbiased Estimators (BLUE) and Best Linear Unbiased Predictors (BLUP) are used to obtain estimates of fixed effects and predictions of random effects, respectively. In a frequentist framework, both usually rely on Residual Maximum Likelihood (REML, Patterson and Thompson 1971) to estimate variance components, yielding empirical estimated/predicted values (i.e., eBLUE and eBLUP values). In general, genetic trends are computed from the regression of the eBLUE or eBLUP values of genotypes on the first year of testing in MET, or on the year of cultivar release for era studies. As an alternative to LMM, algorithmic modeling (Byrum et al. 2017) or a combination of algorithmic and linear modeling has been proposed to remove non-genetic effects in order to estimate RGG (Brisson et al. 2010; Oury et al. 2012; Bornhofen et al. 2018).

A simulation approach designed to evaluate the accuracy and precision of estimators of RGG using historical MET data was conducted by Rutkoski (2019b), where simulation parameters were obtained from low-budget cultivar development programs conducted by the International Rice Research Institute. In total, the author simulated 80 indica-type rice breeding programs assuming two levels of heritability (low and high), the inclusion of either a positive or negative non-genetic trend linked to the calendar year (i.e., years of breeding operation), and two breeding schemes. The RGG was then estimated for a quantitative trait composed of additive effects at 1000 loci from simulated era and yield trials with different modeling strategies based on LMM. The study concluded that (i) the evaluated estimators were inaccurate, (ii) the error associated with the estimates was dependent on the breeding scheme, non-genetic trend, and heritability, and (iii) if the goal is restricted to determine if RGG is greater than zero, some indicators like the expected rate of genetic gain and the equivalent complete generations are useful (Boichard et al. 1997).

Herein, we use a simulation approach similar to that of Rutkoski (2019b) to evaluate LMM estimators of RGG in simulated public soybean breeding programs adapted to Maturity Groups (MG) II and III in North America. The genotypic sampling space in a breeding program is complex because of multiple parental genotypes, families in early trials, check cultivars, experimental genotypes, and introductions of germplasm from external sources. In this context, the first step towards estimating RGG in cultivar development programs is to define the inference space. We define the breeding population as consisting of mostly homozygous genotypes (i.e., purelines) that are used as parents in crossing blocks. Estimates of genetic gains across time based on MET data are associated with lines that were or could be inter-crossed to create new families consisting of segregating self-pollinated genotypes for evaluation. As such, the breeding population represents a set of experimental lines from the tail of a distribution that has about twice the additive genetic variance expressed in the F$_2$ generation (Lynch and Walsh 1998; Bernardo 2020), plus lines from external sources that likewise have genotypic values in the tail of the distribution. Assuming that these best-performing lines have accumulated favorable alleles, i.e., have higher average breeding value compared to their parents, the genetic trends across years of MET are a function of the breeding values and therefore can be interpreted as an estimate of RGG for germplasm that is adapted to the conditions of the MET.

Simulations were based on knowledge of the organization of public soybean breeding programs for MG II and III in North America, as well as previously analyzed MET data (Krause et al. 2023). These data consist of 4257 genotypes evaluated in replicated field trials consisting of $\sim$ 20–34 experimental lines per trial. Trials within environments were conducted at 63 locations for 31 years, resulting in 591 observed environments (location-year combinations) from 1989 to 2019. Simulated values for RGG were computed according to the accumulation of favorable alleles in experimental lines evaluated in advanced MET. We also implemented in the simulator an alternative way to simulate non-genetic contributions to gain based on empirical frequencies of locations used in the historical records of MET. Thus, the simulator was built to (i) evaluate bias from LMM estimators of RGG based on simulations that emulated empirical data routinely collected by public soybean breeders, and (ii) investigate if these models can determine if there is any RGG regardless of estimated bias. Lastly, we report estimates of RGG for the soybean empirical dataset from Krause et al. (2023) using the best-performing models.

Material and methods

Soybean stochastic simulations

Simulations were conducted using AlphaSimR (Gaynor et al. 2021) and functions developed in the R programming environment (R Core Team 2021). The simulation code was run in parallel (3 cores per replicate) in a bash shell under Linux. Each simulation run represents an independent breeding program consisting of 46 years. On average, each simulation run took four hours to be completed in computer nodes with 30 GB of RAM and Intel processors with speeds ranging from 3.20 to 3.50 GHz. Statistical analyses were performed using Asreml-R version 4.1 (Butler et al. 2017) and R base functions. Values for the simulation parameters (e.g., initial trait mean, variance components, number of crosses and genotypes, etc.) were obtained from analyses of historical MET soybean data from Krause et al. (2023).

Founder population

A founder population was created from a set of 499 pureline soybean genotypes developed by public soybean breeders for maturity groups II and III. These lines were genotyped using the SoySNP6K BeadChip (Song et al. 2020). Single nucleotide polymorphism markers (SNP) were removed if missing scores were $>20$%, or if the minor allele frequency was $< 5\%$. Lines were removed if heterozygosity $>6.25\%$ and/or $>20$% of the markers were missing data. The final number of SNP markers was 5279, ranging from 204 to 349 markers per chromosome. A principal components analysis of the additive genomic relationship matrix ${\textbf {G}}_{{\textbf {M}}}$ (Endelman and Jannink 2012) does not show evidence of grouping among this set of possible founders (Supplementary Material Figure A1).
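The marker and line filters described above can be sketched as follows. This is a minimal illustration under stated assumptions: genotypes are coded as allele dosages (0, 1, 2) with `None` for missing calls, markers are filtered before lines (the article does not state the order), and the function names and toy matrix are hypothetical.

```python
def filter_markers(geno, max_missing=0.20, min_maf=0.05):
    """Return indices of markers passing the missing-rate and MAF filters.

    geno: list of rows (one per line); each row holds dosages 0/1/2 or None.
    """
    n_lines = len(geno)
    keep = []
    for m in range(len(geno[0])):
        observed = [row[m] for row in geno if row[m] is not None]
        missing_rate = 1.0 - len(observed) / n_lines
        if not observed or missing_rate > max_missing:
            continue
        freq = sum(observed) / (2.0 * len(observed))  # frequency of counted allele
        if min(freq, 1.0 - freq) < min_maf:
            continue
        keep.append(m)
    return keep


def filter_lines(geno, markers, max_het=0.0625, max_missing=0.20):
    """Return indices of lines passing heterozygosity and missing-rate filters."""
    if not markers:
        return []
    keep = []
    for i, row in enumerate(geno):
        observed = [row[m] for m in markers if row[m] is not None]
        missing_rate = 1.0 - len(observed) / len(markers)
        if not observed or missing_rate > max_missing:
            continue
        het = sum(1 for call in observed if call == 1) / len(observed)
        if het > max_het:
            continue
        keep.append(i)
    return keep


# Toy data: 5 lines x 4 markers (hypothetical).
toy = [
    [0, 2, 0, None],
    [2, 2, 0, None],
    [0, 2, 1, 0],
    [2, 2, 0, 2],
    [0, 2, 0, 2],
]
kept_markers = filter_markers(toy)
kept_lines = filter_lines(toy, kept_markers)
print(kept_markers, kept_lines)
```

In the toy matrix, the monomorphic marker and the marker with 40% missing calls are dropped, and the line carrying a heterozygous call at one of the two retained markers is removed.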

Breeding values

Simulated breeding values were obtained assuming Fisher’s infinitesimal model to approximate polygenic effects on soybean seed yield. Eligible additive quantitative trait loci (QTL) were randomly assigned to 1000 loci distributed across 20 chromosomes, according to a genetic linkage map (2145.5 cM) estimated from the Soybean Nested Association Mapping population (Ramasubramanian and Beavis 2020). Further details are given in Supplementary Material.

Crossing nurseries and development of experimental lines

Every year 50–80 biparental crosses between 20–30 selected genotypes (lines) were simulated to produce a population of F$_1$ individuals. Each F$_1$ individual was self-pollinated to create 50–80 simulated families of F$_2$ seeds (S$_{0}$ generation), each representing segregating progeny descended from a biparental cross. Note, for hybrid crop species we also utilize the ‘S’ symbol to represent selfing generation in line development projects within heterotic groups. From each F$_2$ individual plant, 2–3 F$_3$ seeds were obtained from simulated self-pollinations to represent the S$_{1}$ generation. Each F$_3$ individual was self-pollinated to produce F$_4$ seeds representing the S$_{2}$ generation. All of the self-pollinated seed from each F$_4$ individual was combined to represent an F$_{4:5}$ experimental line. Selections were not simulated during the development of the F$_{4:5}$ experimental lines.

Selection across stages and field trials

Phenotypic values of F$_{4:5}$ experimental lines were simulated for evaluation in an unreplicated field trial at a single location, designated as breeders trial 1 (BT1). Self-pollinated seeds (F$_{4:6}$) from selected F$_{4:5}$ experimental lines were subsequently evaluated at two locations (BT2). Self-pollinated seeds (F$_{4:7}$) from the selected F$_{4:6}$ experimental lines were then evaluated at three locations (BT3). The best F$_{4:7}$ experimental lines selected from the BT3 were then evaluated in regional trials. The first year of regional trials consisted of F$_{4:8}$ experimental lines evaluated at eight locations and was referred to as the preliminary yield test (PYT). If selected, F$_{4:8}$ experimental lines were evaluated in the uniform regional test (URT) at 12 locations. Experimental lines in PYT and URT (i.e., advanced MET) can be thought of as candidate varieties (Fig. 1B, C).

Ten percent of the BT1 experimental lines were selected for evaluation in BT2. For the remaining stages, a 20% selection intensity was applied. Yearly selections were carried out by ranking the predicted eBLUP values of experimental line means obtained only from the data of the current year. Experimental lines in the BT1 phase of development were genotyped with 2400 SNP markers, although the marker genotypes were not used for genomic selection. Rather, SNP information was used by some of the LMM to estimate RGG by providing information about covariance across cycles of development.
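The stage-wise truncation selection described above can be sketched as a simple funnel. The sketch assumes that the 20% fraction also applies to the PYT to URT step, and the number of lines entering BT1 is hypothetical; neither the function names nor the counts come from the article.

```python
STAGES = ["BT1", "BT2", "BT3", "PYT", "URT"]
# Fraction of lines advanced out of each stage (10% from BT1, 20% afterwards).
SELECTED_FRACTION = {"BT1": 0.10, "BT2": 0.20, "BT3": 0.20, "PYT": 0.20}


def select_top(eblup, fraction):
    """Rank lines by current-year eBLUP and keep the top fraction."""
    n_keep = max(1, round(len(eblup) * fraction))
    ranked = sorted(eblup, key=eblup.get, reverse=True)
    return ranked[:n_keep]


def funnel(n_bt1):
    """Number of lines entering each stage, given n_bt1 lines in BT1."""
    counts = {"BT1": n_bt1}
    for current, nxt in zip(STAGES, STAGES[1:]):
        counts[nxt] = max(1, round(counts[current] * SELECTED_FRACTION[current]))
    return counts


# Hypothetical eBLUP values for five lines; 40% are advanced.
example = {"a": 1.0, "b": 3.0, "c": 2.0, "d": 0.5, "e": 2.5}
advanced = select_top(example, 0.40)
counts = funnel(2000)  # 2000 is an assumed BT1 size
print(advanced, counts)
```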

Field trials within evaluation stages (BTs, PYT and URT) included three to six check cultivars, where the actual number of checks was randomly determined for each simulation (Fig. 2A and Supplementary Material Figure A2). Historical records of MET represent summaries of trials that were conducted using a randomized complete block design with two or three blocks per trial, but the summary does not include individual plot and block information. Rather, the summaries report eBLUE values for each genotype (Krause et al. 2023). In other words, reported genotypic values and estimates of residual variance for each trial are adjusted for block effects. Thus, in order to simulate using similar values for residual variance, the field plot design for each trial was completely randomized with a single replicate for BT1, two replicates for BT2, and three replicates per location for the remaining trials. Within each trial, a random percentage (from zero to 12%) of individual field plots was treated as missing data. The number of locations for each trial was fixed, although actual simulated locations were replaced across years to mimic the practice of occasionally changing locations in a region from year to year (Fig. 2B and Supplementary Material Figure A3). The models used for the selection of experimental lines are provided in Supplementary Material.

Selection of breeding lines and cycle of line development

BLUP selection of breeding (parental) lines was performed in BT3 with combined phenotypic data from BT1, BT2, and BT3 (Model C1, Supplementary Material). Assuming that the creation of the experimental lines prior to evaluation in BT1 could be conducted at off-site continuous nurseries with three seasons per year, the time per cycle of line development would be five years. Explicitly, two years are required to make the initial crosses, create F$_{1}$ seed, and subsequently develop the F$_{4:5}$ experimental lines for evaluation on-site in BT1. We then assumed that there is only one growing season per year for evaluating the crop in the breeder’s trials (BTs), so that BT1, BT2, and BT3 add one year each (Fig. 1B). Breeding lines were not selected in the first five years of the simulated breeding programs (i.e., not enough data). Rather, the simulated crosses were randomly sampled from the founder population. Moreover, these initial years were not used to estimate RGG (Supplementary Material Figure A4).

Simulations of GEI in MET

Genotype by environment effects were defined as the sum of genotype $\times$ location (${\textit{GL}}$), genotype $\times$ year (${\textit{GY}}$), and genotype $\times$ location $\times$ year (${\textit{GLY}}$) interaction effects, i.e., ${\textit{GEI}} = {\textit{GL}} + {\textit{GY}} + {\textit{GLY}}$. Genotype refers to experimental lines. QTL effects were assigned to genotypes at the beginning of simulation runs and retained through all stages of line development. The simulation of the main genotypic effects (G) was accomplished using AlphaSimR as described in section “Breeding values”, with the matrix of additive QTL genotypes being used to simulate GEI effects. Individual plot phenotypic values were simulated according to the following general linear model:

$$\begin{aligned} Y_{ijkl} = \mu + G_i + L_j + Y_k + {\textit{GL}}_{ij} + {\textit{GY}}_{ik} + LY_{jk} + {\textit{GLY}}_{ijk} + \epsilon _{ijkl} \end{aligned} \tag{1}$$

where $Y_{ijkl}$ is the simulated phenotype of the lth plot for the ith genotype (line) at the jth location in the kth year, $\mu$ is the intercept or mean trait value, and $\epsilon _{ijkl}$ is the residual plot-to-plot variability associated with the simulated phenotypic value. The terms $L_j$ and $Y_k$ represent the main effects of locations and years, respectively. All model terms except L were considered random effects sampled from various distributions described below (section “Variations of simulated MET models”). Locations were simulated as fixed effects to incorporate estimated values from empirical data (Supplementary Material Figure A5). Individual trial effects were not simulated.
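To make the structure of Model (1) concrete, the sketch below draws plot-level phenotypes for a balanced layout with independent, homogeneous-variance random effects and fixed location effects. It is only an illustration: the article samples genotype-related terms at the QTL level through AlphaSimR, whereas here they are drawn directly, and all standard deviations, the mean, and the location effects are hypothetical.

```python
import random


def simulate_plots(n_geno, loc_effects, n_year, n_rep, mu, sd, seed=1):
    """Draw plot phenotypes Y_ijkl following the terms of Model (1).

    loc_effects: fixed location effects L_j (one value per location).
    sd: standard deviations for the random terms G, Y, GL, GY, LY, GLY, e.
    Returns a list of (i, j, k, l, phenotype) tuples.
    """
    rng = random.Random(seed)
    n_loc = len(loc_effects)

    def draw(term):
        return rng.gauss(0.0, sd[term])

    g = [draw("G") for _ in range(n_geno)]
    y = [draw("Y") for _ in range(n_year)]
    gl = [[draw("GL") for _ in range(n_loc)] for _ in range(n_geno)]
    gy = [[draw("GY") for _ in range(n_year)] for _ in range(n_geno)]
    ly = [[draw("LY") for _ in range(n_year)] for _ in range(n_loc)]

    records = []
    for i in range(n_geno):
        for j in range(n_loc):
            for k in range(n_year):
                gly = draw("GLY")  # one draw per genotype-location-year cell
                for l in range(n_rep):
                    value = (mu + g[i] + loc_effects[j] + y[k] + gl[i][j]
                             + gy[i][k] + ly[j][k] + gly + draw("e"))
                    records.append((i, j, k, l, value))
    return records


# Hypothetical settings (bu/ac); none of these values come from the article.
sd_demo = {"G": 2.0, "Y": 3.0, "GL": 1.0, "GY": 1.0, "LY": 4.0, "GLY": 1.5, "e": 3.0}
locs_demo = [-1.0, 2.0]
plots = simulate_plots(n_geno=2, loc_effects=locs_demo, n_year=3, n_rep=2,
                       mu=50.0, sd=sd_demo)
print(len(plots), plots[0])
```

Setting every standard deviation to zero reduces each phenotype to $\mu + L_j$, which is a convenient check that the terms are wired as in Model (1).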

Variations of simulated MET models

We used six simulation models for MET and labeled them A1, A2, B1, B2, B2-M, and B2-R. The “A” conditions represent simple genetic effects models where random effects from Model (1) were simulated as independent with homogeneous variances. The “B” conditions represent models where each simulation run ($s = 1, \dots , S = 225$) had a unique G ($\sigma _{G_s}^2$) and ${\textit{GLY}}$ ($\sigma _{{\textit{GLY}}_s}^2$) variance component. Also, the B model simulations consisted of correlated additive QTL effects across locations and years for the GL and GY interaction effects, respectively. The “1” versions of these models had no non-genetic trend, while the “2” versions did (see below). Simulations labeled B2-M are a modification of B2 where the sampled variance values for GL, GY, and GLY were divided by an ad-hoc factor of 10 to minimize the contribution of GEI components. Lastly, simulations labeled B2-R represent B2 where a random sample of experimental and breeding lines were retained (i.e., no selection). Heterogeneous residual variances represented plot-to-plot variability within trials and were sampled from a Log-Logistic distribution (Tables 1 and 2).

The GEI simulation approach for the “B” simulation models was developed to provide similar results as encountered in empirical data analyzed by Krause et al. (2023). A step-by-step description of the simulation of the correlated QTL effects is given in Supplementary Material. Briefly, define ${}_{\varvec{I}}\varvec{\Theta }_{\varvec{1000}}$ as the matrix of QTL dosages (0, 1, and 2) simulated by AlphaSimR. The matrix $\varvec{\Theta }$ has dimension $I \times 1000$, where I $(i = 1, \dots , I)$ represents the number of genotypes and $q = 1000$ is the number of simulated QTL. The sampled effects are defined in matrix notation as ${}_{\varvec{q}}\varvec{\phi }^{\varvec{G}}_{\varvec{1}}$, ${}_{\varvec{q}}\varvec{\phi }^{\varvec{GL}}_{\varvec{J}}$, ${}_{\varvec{q}}\varvec{\phi }^{\varvec{GY}}_{\varvec{K}}$, and ${}_{\varvec{q}}\varvec{\phi }^{\varvec{GLY}}_{\varvec{JK}}$ for model terms $G, {\textit{GL}}, {\textit{GY}}$, and ${\textit{GLY}}$, respectively. Their dimensions are represented by I, J, K for genotypes, locations ($j = 1, \dots , J$), and years ($k = 1, \dots , K$), respectively. Note ${}_{\varvec{q}}\varvec{\phi }^{\varvec{G}}_{\varvec{1}}$ is a vector of length 1000 as defined in section “Breeding values”. This notation is general and reflects the sample size associated with the model parameters in each trial (Tables 1 and 2).

Table 1 Variance component values and probability distributions implemented in the simulator


Table 2 Fixed effect and distributional assumptions for random effects implemented in the simulator. G, L, and Y represent genotypic (line), location, and year factors. Note the sampling for the genotypic-related terms G, GL, GY, and GLY were applied at the level of simulated QTL


Simulation of the non-genetic trend

Simulation models A2 and B2 include a positive non-genetic trend whereas models A1 and B1 do not (Tables 1 and 2). To simulate a positive non-genetic trend, we proposed that the location $\times$ year interaction term ($LY_{jk}$) from Model (1) be further split into two components:

$$\begin{aligned} LY_{jk} = u_{jk} + (z_{jk} \times \eta _j),\quad \text {for}\quad \eta _j \in \left\{ \frac{1}{2}, \frac{1}{3}, \frac{1}{4}, \frac{1}{5}, \frac{1}{6}\right\} \end{aligned} \tag{2}$$

where $u_{jk} \sim N(0, \sigma ^2_{LY})$, $z_{jk}$ is a location–year covariate mapping if the jth location was observed in the kth and $(k+1)$th years, and $\eta _j$ is a constant of the non-genetic gain randomly chosen with equal probability. Equation (2) was designed to simulate a (cumulative) positive non-genetic trend by adding increments of $\eta _j$ units (bu/ac) every time a specific location was observed across consecutive years. This model was used to represent improvements in management practices in field trials that are continuously used by the breeding program, and is linked to the location $\times$ year effect. For example, suppose location “L1” is used for yield evaluations in the current year, and the farmer/researcher identifies regions of low fertility at L1. The issue will be addressed and the yield at L1 in subsequent years will likely improve. For clarity, an example of the covariate mapping is provided in Tables 3 and 4.

Annual effects ($Y_k$) were assumed to be random: there are favorable (positive increments in yield) and unfavorable (negative increments in yield) years. This was simulated by randomly sampling year effects from a Normal distribution (Table 1). If the cumulative non-genetic trend is not simulated, the term $z_{jk}$ does not appear, and therefore Eq. (2) is reduced to $LY_{jk} = u_{jk}$. In this case, the location–year effects are only a function of the random sample from the Normal distribution ($u_{jk}$).
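One possible reading of the covariate mapping in Eq. (2) is sketched below. It assumes that $z_{jk}$ starts at zero in the first year a location is used, increases by one each time the location is observed in two consecutive years, and is not reset after a gap. These rules are an interpretation of the text, not a confirmed description of the simulator, and the function names and example years are hypothetical.

```python
import random

ETA_CHOICES = [1 / 2, 1 / 3, 1 / 4, 1 / 5, 1 / 6]  # rates in bu/ac, as in Eq. (2)


def covariate_mapping(observed_years):
    """Map each observed year of one location to the cumulative covariate z_jk.

    Assumption: z grows by 1 whenever the location was also used in the
    previous year, and keeps its value (no reset) across gaps.
    """
    years = sorted(observed_years)
    z = {}
    count = 0
    for idx, year in enumerate(years):
        if idx > 0 and year - years[idx - 1] == 1:
            count += 1
        z[year] = count
    return z


def simulate_ly(observed_years, sd_ly, rng):
    """Location-by-year effects for one location: u_jk + z_jk * eta_j."""
    eta = rng.choice(ETA_CHOICES)  # equal probability for each rate
    z = covariate_mapping(observed_years)
    ly = {year: rng.gauss(0.0, sd_ly) + z[year] * eta for year in z}
    return ly, eta


# Hypothetical location used in years 1-3, skipped in year 4, used in 5-6.
z_demo = covariate_mapping([1, 2, 3, 5, 6])
ly_demo, eta_demo = simulate_ly([1, 2, 3, 5, 6], sd_ly=0.0, rng=random.Random(7))
print(z_demo, eta_demo, ly_demo)
```

With the random part switched off (`sd_ly=0.0`), the output shows only the cumulative non-genetic trend for that location.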

Table 3 A hypothetical example of the covariate mapping ($z_{jk}$) used to simulate cumulative ($z_{jk} \times \eta _j$) rates ($\eta _j$) of non-genetic gain


Table 4 Hypothetical example of $L_j + Y_k + LY_{jk}$ from Model (1) when the non-genetic trend is included


Simulated “true” RGG values

The simulated “true” values of RGG were calculated from the genetic values of lines used for crossing. Explicitly, it was calculated as the slope ($\beta _{T_{(s)}}$) of the regression line of true genetic values of breeding lines ($g_{T_{i(s)}}$) on the year they were used in crossing blocks ($w_{i_{(s)}}$):

$$\begin{aligned} g_{T_{i(s)}}&= \beta _{0_{(s)}} + \beta _{T_{(s)}} w_{i_{(s)}} + \epsilon _{T_{i(s)}}, \quad \text {where}\\&\quad \beta _{0_{(s)}}\, \text {is the intercept and}\, \epsilon _{T_{i(s)}} \sim \, \text {N}(0, \sigma ^2_{\epsilon _{T(s)}}) \end{aligned} \tag{3}$$

The slope $\beta _{T_{(s)}}$ represents the rate of accumulation of beneficial (additive) alleles among breeding lines across years of breeding operation. A breeding line may be crossed multiple times (hub network crossing design), but only in a single crossing nursery/year.
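The slope in Eq. (3) is an ordinary least squares regression coefficient, which can be computed as in the sketch below. The true genetic values and crossing years are hypothetical and serve only to show the calculation for one simulation run.

```python
def ols_slope_intercept(x, y):
    """Least squares slope and intercept for the regression of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("x must take at least two distinct values")
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Hypothetical breeding lines: year used in crossing (w) and true genetic value (g_T).
crossing_year = [6, 6, 7, 7, 8, 8]
true_value = [10.0, 12.0, 13.0, 15.0, 16.0, 18.0]
beta_t, beta_0 = ols_slope_intercept(crossing_year, true_value)
print(beta_t, beta_0)
```

Here the slope plays the role of $\beta _{T_{(s)}}$, the simulated "true" RGG per year, in whatever units the genetic values are expressed.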

Data and estimation of RGG

Historical data records have the potential to provide four possible data sets generated in the line development process: (i) the most advanced set of experimental lines evaluated in the URT, (ii) data from experimental lines that are first evaluated in PYT and subsequently in the URT, (iii) data from BT3, PYT, and URT that provide information from three years of trials for the most advanced experimental lines as well as information from breeding lines (i.e., pedigree information from the previous cycle), and (iv) data from all MET stages (BTs, PYT, URT). The first and second data sets are typical of historical data available for most crop species. For our simulations, we also evaluated the third data set to determine if estimating RGG exclusively from breeding lines would provide more accurate estimates. The fourth data set was not considered in this study.

Check cultivars were included in the models to provide estimates of non-genetic trends across years but were not considered to estimate RGG. The estimation models can be classified as providing either direct or indirect estimates of RGG, as they differ in the decomposition of the genotypic effects ($G_i$). The direct estimate was computed by replacing $G_i$ with $\beta _{g_{(s)}} r_{i_{(s)}}$, where $\beta _{g_{(s)}}$ is a heterogeneous regression coefficient of RGG for experimental lines, and $r_{i_{(s)}}$ is the year of first testing. The year of first testing is designated as the first year in which the phenotypic data of the ith experimental line is available. For public soybean empirical MET data, this occurs in the PYT. $\beta _{g(s)}$ is considered heterogeneous so that data from check cultivars do not contribute to the estimation of RGG. In contrast, indirect estimates of RGG require two analytic steps: The first step consisted of estimating/predicting genotypic main effects ($\hat{g}_{i_{(s)}}$), which can be either eBLUE or the eBLUP values for only the experimental lines. The second step estimates RGG as the slope ($\beta _{R_{(s)}}$) of the linear regression between $\hat{g}_{i_{(s)}}$ and $r_{i_{(s)}}$, defined as:

$$\begin{aligned} \hat{g}_{i_{(s)}}&= \beta _{0_{(s)}} + \beta _{R_{(s)}} r_{i_{(s)}} + \epsilon _{R_{i(s)}},\quad \text {where}\nonumber \\&\quad \beta _{0_{(s)}}\, \text {is the intercept and}\, \epsilon _{R_{i(s)}} \sim \, \text {N}(0, \sigma ^2_{\epsilon _{R(s)}}) \end{aligned}$$

(4)
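As a minimal sketch of the second step in Eq. 4, the slope can be obtained by ordinary least squares of the estimated line effects on the year of first testing. The input values and the helper name `ols_line` are hypothetical, and the first-step mixed model that would produce $\hat{g}_{i_{(s)}}$ is not shown.

```python
from statistics import mean

def ols_line(x, y):
    """Return (intercept, slope) of the least-squares line y = b0 + b1 * x."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope

# Hypothetical first-step output: year of first testing (r_i) and
# estimated genotypic effect (g_hat_i, e.g. an eBLUE in bu/ac) per line.
year_first_test = [1, 1, 2, 2, 3, 3]
g_hat = [50.0, 51.0, 50.5, 51.5, 51.0, 52.0]

# beta_R is the indirect RGG estimate in the sense of Eq. 4.
beta_0, beta_R = ols_line(year_first_test, g_hat)
```

With these illustrative inputs the fitted slope is 0.5 units per year; in practice the inputs would come from one of the first-step models in Table 6.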

Alternatively, when $\hat{g}_{i_{(s)}}$ are eBLUP values, RGG can indirectly be estimated with the cumulative sum of the average $\hat{g}_{i_{(s)}}$ values of all lines ($\varvec{\kappa }_{(s)}$):

$$\begin{aligned} \varvec{\kappa }_{(s)} = \left( \kappa _1, \kappa _1 + \kappa _2, \dots , \sum _{k=1}^{K} \kappa _k\right) ,\quad \text {where}\;\kappa _k = \frac{\sum _{i\in S_k} \hat{g}_{i}}{n_k} \end{aligned}$$

(5)

where K is the number of years in the data set, $\hat{g}_{i}$ is the predicted eBLUP value of the ith experimental line, $S_k$ is the set of lines first tested in the kth year, and $n_k$ is the number of experimental lines evaluated in $S_k$. The RGG is then estimated by regressing $\varvec{\kappa }_{(s)}$ on the year of first testing ($r_{i_{(s)}}$). An example is presented in Table 5.

Table 5 Hypothetical example of the cumulative sum ($\varvec{\kappa }$) of the average ($\kappa _k$) eBLUP values ($\hat{g}_{i}$) linked to the first year of trial ($r_{i}$) for nine experimental lines (ID)

Thus, $\varvec{\kappa }_{(s)}$ is a generalization of the expected gain from selection (Walsh and Lynch 2018; Falconer and Mackay 1996). Variability from year to year (i.e., GY, GLY, LY, and Y) is accounted for because $\varvec{\kappa }_{(s)}$ is computed with a full, $G\times L\times Y$ model.
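The construction of $\varvec{\kappa }$ in Eq. 5 can be sketched as follows. The nine eBLUP values are hypothetical and are not those of Table 5. The final regression uses one point per year, which is one plausible reading of regressing $\varvec{\kappa }$ on the year of first testing.

```python
from collections import defaultdict
from itertools import accumulate

# Hypothetical (year of first testing, eBLUP) pairs for nine lines;
# these are illustrative values, not those of Table 5.
records = [(1, -1.0), (1, -0.5), (1, 0.0),
           (2, -0.5), (2, 0.0), (2, 0.5),
           (3, 0.0), (3, 0.5), (3, 1.0)]

# Group eBLUP values by the year in which each line was first tested (S_k).
by_year = defaultdict(list)
for year, g_hat in records:
    by_year[year].append(g_hat)

years = sorted(by_year)
# Average eBLUP of the lines first tested in year k.
kappa_k = [sum(by_year[y]) / len(by_year[y]) for y in years]
# Cumulative sum across years, i.e. the vector kappa of Eq. 5.
kappa = list(accumulate(kappa_k))

# Least-squares slope of kappa on year, taken here as the RGG estimate.
x_bar = sum(years) / len(years)
k_bar = sum(kappa) / len(kappa)
rgg = (sum((x - x_bar) * (k - k_bar) for x, k in zip(years, kappa))
       / sum((x - x_bar) ** 2 for x in years))
```

Because the eBLUP values sum to zero in this toy example, the last element of `kappa` is zero, while the cumulative sums still increase across years.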

Estimation models applied to simulated PYT and URT

Twenty-one models were applied to simulated data sets consisting of PYT and URT (Table 6). The first model, designated EB (benchmark), indirectly estimated RGG with eBLUP values. Model E0 uses the same framework as EB, but RGG was estimated with $\varvec{\kappa }_{(s)}$. Model E1 (Mackay et al. 2011) estimated RGG with eBLUE values. Model E2 (Piepho et al. 2014) directly estimated RGG with a fixed effect regression coefficient, and Model E3 with a random regression coefficient. Models E4 and E5 are variations of Models E0 and E1, respectively, where RGG was estimated only from lines in URT (i.e., the last year of trial). Model E6 (control population, Rutkoski 2019b) directly estimated RGG ($\beta _t$) by contrasting checks and experimental lines with a fixed regression term ($\beta _t t_k$), where $t_k$ is a continuous covariate for the calendar year. Model E7 was designed to assess if environmental (the combination of location–year) effects will be properly modeled using check cultivars. An initial model is applied to the phenotypic data associated with only the check cultivars to obtain predicted values (i.e., eBLUP) of environmental effects. These predicted values are subsequently used as a fixed effect covariate in a second analysis model. In the second model, the eBLUP values of experimental lines are predicted and used to estimate RGG indirectly. Model E8 is a variation of E7, in which RGG was directly estimated in the second step.

We also included parameters for several covariance structures representing GL and GY interaction effects. Models E0, E1, E2, E3, E4, E5, and E6 assumed independent effects with homogeneous variance, whereas their counterparts E0V, E1V, E2V, E3V, E4V, E5V, and E6V involved correlated random effects with heterogeneous variances. We restricted investigations to diagonal and first-order factor-analytic models to avoid convergence issues across simulation runs. For Models E0G, E0GV, and E7G, we also investigated the RGG estimation with genomic estimated breeding values (Table 6).

Model E9 takes advantage of both experimental lines and checks replicated between consecutive pairs of years: (i) an initial model was fit within years to obtain eBLUP values for experimental lines ($g_{ik}$); and (ii) the $g_{ik}$ values were then corrected for the “year effect” according to the “reference year (R).” For example, in a MET dataset of 10 years, if R = year 5, the corrections for the year effects will be performed with a forward-backward regression algorithm among years, e.g., $1\leftarrow 2\leftarrow 3\leftarrow 4\leftarrow {\textbf {5}}\rightarrow 6\rightarrow 7\rightarrow 8\rightarrow 9\rightarrow 10$. If R = 1, only a forward process (${\textbf {1}}\rightarrow \dots \rightarrow 10$) was used, and if R = 10, only a backward process ($1\leftarrow \dots \leftarrow {\textbf {10}}$) was used (Table 6). The methodology is presented with references in Supplementary Material. In addition to the described models, RGG was also computed with Model 4 using raw phenotypes at each location (“Pheno”) to provide an estimate of phenotypic changes across years. Thus, it assesses the impact of estimating RGG without explicit models.

Table 6 Models applied to simulated PYT and URT. Fixed effects are underlined and the RGG was estimated with model terms highlighted in gray

Estimation models applied to simulated BT3, PYT and URT

All experimental lines in BT3, PYT, and URT were included in the analyses, but RGG was estimated only from the breeding lines (Table 7). The indirect RGG estimate was computed with modified versions of Models (4) and (5), where $\hat{g}_{i_{(s)}}$ and $\varvec{\kappa }_{(s)}$ are restricted to breeding lines, and $r_{i_{(s)}}$ was replaced by $w_{i_{(s)}}$.

Models with data from BT3 provide crossing information and include variables for types of general ($P_i$) and specific ($F_{ii'}$) combining abilities (Table 7). We refer to this information as “types of combining abilities” because data from PYT and URT are selected lines derived from progeny of crosses. The selection process in MET (BT1 $\rightarrow$ BT2 $\rightarrow$ BT3 $\rightarrow$ PYT) results in the best $\sim$0.4 percent of self-pollinated experimental lines in advanced trials. Thus, we can actually estimate the combining abilities of breeding lines to produce the best $\sim$ 0.4 percent of lines created for the next cycle of line development.

Six models were applied to these data sets. Model E10 treats the variable values in $P_i$ as random effects (Möhring et al. 2011) and Model E11 as fixed effects. Model E10G included ${\textbf {G}}_{{\textbf {M}}}$. Shrunken or GEBVs were then calculated as $2 \times \hat{P}_i$ (Isik et al. 2017) and used to indirectly estimate RGG. For Models E10 and E10G, estimates of RGG were obtained using Model 5 ($\varvec{\kappa }_{(s)}$). Model E12 provided direct estimates of RGG by replacing the main effects of lines used for crossing with $\beta _{f_{(s)}} w_{i_{(s)}}$, where $\beta _{f_{(s)}}$ is the estimated slope, and $w_{i_{(s)}}$ the year the ith breeding line was used for crossing. Note only one term $\beta _{f_{(s)}} w_{i_{(s)}}$ was included in this model because breeding lines were not used for crosses in multiple years, so that $w_{i_{(s)}}$ is equivalent for $P_i$ and $P_{i'}$. Models E13 and E14 are similar to E7 and E8 (Table 6), respectively, where the check cultivars were used to predict environmental effects in the first step.

Table 7 Models applied to simulated BT3, PYT and URT. These models include information from breeding lines used in crossing. Fixed effects are underlined, and the RGG was estimated with model terms highlighted in gray

Statistical comparisons of evaluated models

Covariance modeling

Models E0, E1, $\dots$, and E6, assumed independent GEI effects with homogeneous variance, whereas their counterparts E0V, E1V, $\dots$, and E6V, assumed correlated effects with heterogeneous variances or covariances (Table 6). We hypothesize there is no substantial difference in estimating RGG due to variance-covariance (VCOV) modeling. We formally tested our hypothesis by applying an analysis of variance (ANOVA) to the estimated RGG values as a function of the simulated breeding program (i.e., simulation run) and evaluated models (Supplementary Material Figure A6). The average RGG (i.e., the slope $\beta _{R}$) of each model ($\beta _{R_{E0}}$, $\beta _{R_{E0V}}$, $\dots$, $\beta _{R_{E6}}$, $\beta _{R_{E6V}}$) was then computed, and the pairwise differences ($\beta _{R_{E0}} - \beta _{R_{E0V}}$, $\beta _{R_{E1}} - \beta _{R_{E1V}}$, $\dots$, $\beta _{R_{E6}} - \beta _{R_{E6V}}$) were tested with the Tukey method with $\alpha = 0.05$ adjusted for multiplicity using the R package emmeans (Lenth 2022, functions lstrends and pairs). The same procedure was applied to compare independent versus correlated genotypic effects, and direct versus indirect estimates of RGG.

Bias and linearity

Because $\hat{\beta }_{R_{(s)}}$ (Model 4) is an estimator of $\beta _{T_{(s)}}$ (Model 3, “true” simulated RGG), the estimation bias or error is defined as $\xi _{(s)} = \hat{\beta }_{R_{(s)}}-\beta _{T_{(s)}}$. Models were evaluated based on the average ($\bar{\xi }$) and relative ($\xi ^{\textrm{rel}}_{(s)}$) bias across simulation runs, as well as on the root mean squared error (${\textit{RMSE}}$), computed as follows:

$$\begin{aligned} \bar{\xi }&= \frac{\sum _{s=1}^{S}\xi _{(s)}}{S} \end{aligned}$$

(6)

$$\begin{aligned} \xi ^{\textrm{rel}}_{(s)}&= \frac{\sum _{s=1}^{S} \xi _{(s)} / \beta _{T_{(s)}}}{S} \times 100 \end{aligned}$$

(7)

$$\begin{aligned} {\text {RMSE}}&= \sqrt{\frac{\sum _{s=1}^{S}\xi ^2_{(s)}}{S}} \end{aligned}$$

(8)
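Eqs. 6 to 8 can be computed directly from paired estimated and true slopes. The sketch below uses hypothetical values for four simulation runs; the function name `bias_metrics` is an illustration and not part of the original analysis code.

```python
from math import sqrt

def bias_metrics(estimated, true):
    """Average bias, average relative bias (%), and RMSE across simulation runs."""
    errors = [est - tru for est, tru in zip(estimated, true)]      # xi_(s)
    n_runs = len(errors)
    mean_bias = sum(errors) / n_runs                                 # Eq. 6
    rel_bias = 100 * sum(e / t for e, t in zip(errors, true)) / n_runs  # Eq. 7
    rmse = sqrt(sum(e ** 2 for e in errors) / n_runs)                # Eq. 8
    return mean_bias, rel_bias, rmse

# Hypothetical estimated and true RGG values for four simulation runs.
beta_hat = [0.50, 0.30, 0.45, 0.35]
beta_true = [0.40, 0.40, 0.50, 0.25]
mean_bias, rel_bias, rmse = bias_metrics(beta_hat, beta_true)
```

Note that Eq. 7 divides each error by the true slope, so the relative bias becomes unstable when the true RGG is close to zero.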

The same metrics were calculated for $\hat{\beta }_{g_{(s)}}$, the direct estimator of RGG. Most models used to obtain estimates of RGG did not use information from experimental lines used in crosses (Table 6). All experimental lines selected for recycling based on the results of BT3 + BT2 + BT1 were included in the PYT (Fig. 1); however, not all experimental lines in PYT were selected as breeding lines. Thus, when pedigree information is not available, the estimated RGG from advanced PYT and URT trials is based on a mixture of experimental lines used for crossing and experimental lines that simply advanced from BT3 to PYT based on their yearly performance. Accordingly, the expected bias of estimating RGG from advanced trials without pedigree information is the difference between true simulated RGG ($\beta _{T_{(s)}}$) and the true trend from MET ($\beta ^{\textrm{true}}_{R_{(s)}}$). The regression slope $\beta ^{\textrm{true}}_{R_{(s)}}$ was calculated from Model 4 by replacing $\hat{g}_{i_{(s)}}$ with the true simulated genetic values (i.e., no error nor GEI). Results for expected bias are reported using the term “Expected.” Note the expected bias is zero for Models E10, $\dots$, E14 (Table 7), where the pedigree is known and the RGG was estimated from breeding lines.

In addition to bias, a metric for linearity was assessed. The linearity metric is based on the premise that if there is continuous genetic progress due to the selection of an (additive) quantitative trait and constant genetic variance, then a linear trend between years of line development and genotypic means for MET should be evident. For indirect models where marginal genotypic (line) effects are estimated/predicted, we formally tested for linearity with the sieve-bootstrap version of the Student’s t-test (Lyubchich and Gel 2022; Noguchi et al. 2011). The null hypothesis of no trend versus the alternative hypothesis of a linear trend was assessed at $\alpha = 0.05$ with Bonferroni correction, and the proportion of statistically significant trends across simulation runs was reported. For direct models, we report the distribution of the two-tailed p-values from the estimated z-ratio (point estimate/standard error) of the slope ($\beta _g, \beta _t, \beta _f$) as an indication of its significance. This distribution was compared to the random sample simulation model B2-R (Table 2).
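For the direct models, the two-tailed p-value of a z-ratio can be sketched as below, assuming a standard normal reference distribution for the ratio. The function name and the numerical inputs are hypothetical; the sieve-bootstrap test used for the indirect models is not reproduced here.

```python
from math import erfc, sqrt

def two_tailed_p(estimate, std_error):
    """Two-tailed p-value of the z-ratio, assuming a standard normal reference."""
    z = estimate / std_error
    # erfc(|z| / sqrt(2)) equals 2 * (1 - Phi(|z|)) for the standard normal CDF Phi.
    return erfc(abs(z) / sqrt(2))

# Hypothetical slope estimate and standard error for a direct model.
p_value = two_tailed_p(0.40, 0.10)
```

A ratio of 1.96 gives a p-value of approximately 0.05 under this assumption, which is a quick check that the function behaves as intended.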

Estimation of RGG from the empirical soybean data

Models that demonstrated the least bias in analyses of simulated data were used to estimate RGG from the empirical data. These data are historical soybean seed yield records of maturity groups II and III evaluated in PYT and URT from 1989 to 2019. The data set contains 39,006 data points, 4257 experimental genotypes derived from multiple public breeding programs, and 591 observed environments located in the United States and Canada. The dataset can be obtained from the R package SoyURT. Refer to Krause et al. (2023) for more details.

Results

Simulation overview

In total, $\sim 1.03$ trillion data points were simulated in MET across 1350 breeding programs. The number of different sampled locations ranged from 28 to 41. The sample sizes (i.e., the number of experimental lines excluding checks) ranged from 5000–18,723 for BT1; 500–1872 for BT2; 100–374 for BT3; 21–75 for PYT; and 4–15 for URT. Estimated broad sense heritabilities on an entry mean basis (Cullis et al. 2006) for individual trials within breeding stages had larger values for the more advanced trials (Supplementary Material Figure A7 and Supplementary Material Figure A8). The average true simulated RGG in bu/ac$^{-1}$/yr$^{-1}$ was 0.44 for A1 and A2, 0.41 for B1 and B2, 0.55 for B2-M, and zero for B2-R (Fig. 3).

Covariance modeling

Including covariance models for GEI significantly affected contrasts of estimated RGG values. Three contrasts were statistically significant when a random sample of breeding lines was used to create a new cycle of breeding (B2-R). The inclusion of the genomic relationship matrix (${\textbf {G}}_{{\textbf {M}}}$) was statistically significant for the contrast E10–E10G in B2-R, and E0–E0G was significant for all B simulation models. The direct versus indirect estimation of RGG resulted in significant differences between estimated RGG values for most simulation models; the primary exception was that the contrast between Models E7 and E8 was not large for any of the simulation models (Table 8).

Table 8 Adjusted Tukey p-values for the contrasts between variance-covariance modeling (VCOV) and direct versus indirect (DI) estimation of RGG

Relative bias and overall performance

On average, RGG values from analytic Models EB, E0G, E0GV, and E10G were estimated at zero (Fig. 3). The simulation model B2-R was designed to not have a positive RGG value due to the random sampling of genotypes (i.e., no selection). Consequently, when the estimated RGG values from the tested models were compared in B2-R, a large variation in the results complicated the comparison of the models. For example, if the true RGG in a simulation run in B2-R was 0.0001 bu/ac$^{-1}$, an estimated value of 0.01 from any model would be 100 times bigger than the simulated one. Thus, results from these models and from B2-R will not be included in the reported summary statistics below. Raw estimates of RGG are available in Supplementary Material Figure A9–Supplementary Material Figure A14, and the estimated bias without standardization in Supplementary Material Figure A15.

All models demonstrated some degree of bias. On average, a smaller bias was observed for simulation models with complex GEI effects. The expected relative bias for models without information from breeding lines had an average value of $-$15.21% (Fig. 4 and Supplementary Material Figure A16). Models with information from breeding lines are expected to have no bias, but were not less biased than models that only considered advanced trials (Figs. 3, 4 and 5). Including ${\textbf {G}}_{{\textbf {M}}}$ to account for correlated genetic effects did not improve the accuracy of the estimated RGG values. Across simulation runs, the average value of the diagonal values of ${\textbf {G}}_{{\textbf {M}}}$ was 1.91, and zero for the off-diagonal. Within years, these values were 1.91 and 0.49, respectively (data not shown).

On average, 15 models presented less than $\pm\, 5$% relative bias for at least one simulation model. Models E2, E7, E7G, E8, and E13 had less than $\pm\, 18$% relative bias across all simulation models (Fig. 4). Directional relative bias (i.e., under or overestimating true RGG on average) was observed for Models E1, E2V, E4, E4V, E5, E6, E6V, E10, E11, E12, and E14, and very similarly for Models E1V, E3, E3V, and E5V. Across simulation models and excluding very few negative estimates (Supplementary Material Table A1), the range of estimated RGG values from Models E1, E2V, and E7, contained the true simulated RGG in 58% of the simulations (Supplementary Material Figure A17). For these models, the relative bias ranged from $-$16.82% to 40.73%, which represents biases of $\pm\, 7.41$ kg/ha$^{-1}$/yr$^{-1}$ or $\pm\, 0.11$ bu/ac$^{-1}$/yr$^{-1}$.

Estimates of RGG using raw phenotypic data (“Pheno”) resulted in relative bias across simulations from $-$14.64% to 23.23% (Fig. 4), with a similar $\textit{RMSE}$ as other analytic models (Fig. 5). However, in B2-R, where there was no true simulated RGG, using raw phenotypic data resulted on average in an estimated RGG of 0.16 bu/ac$^{-1}$/yr$^{-1}$ (Fig. 3). Simulation models A2, B2, B2-M, and B2-R, included a positive non-genetic trend. Although in this work we are not investigating the estimation of the rate of non-genetic gain (i.e., we were trying to isolate it from RGG), by comparing the relative bias in scenarios A1 versus A2, and B1 versus B2, it is evident some analytic models successfully isolated it (Fig. 4).

Linearity

For Models E0V, E1, E1V, E7, and E7G, the proportion of statistically significant linearity ranged from 0.80 to 0.99, with an average value of 0.90. As expected, true simulated genetic values from breeding lines were always statistically significant. In B2-R, Models E0, E0V, E4, E4V, E5, E5V, E10G, and E11, presented an average value of 0.63, similar to when raw phenotypic values were considered (Fig. 6). For models in which RGG was directly estimated, the distribution of the p-values from the z-ratio statistic showed the linearity metric could be used to indicate there was RGG, except for Models E6, E6V, and E12, where results were similar to the simulation model B2-R (Fig. 7).

Estimates of RGG from empirical data

According to the estimates of bias, directional bias, RMSE, and linearity from the evaluated models using simulated data, we used Models E1, E2V, and E7 to estimate RGG for the empirical data. For Model E2V, several covariance structures were evaluated, and the best-fit model was selected (Supplementary Material Table A2). The point estimates of RGG ranged from 0.27 to 0.59 bu/ac$^{-1}$/yr$^{-1}$. For Models E1 and E7, the p-values from the linearity test were statistically significant, as well as for the z-ratio of the estimated slope $\beta _g$ (i.e., the direct estimate of RGG) from Model E2V (Table 9). Therefore, although imprecise, there is strong evidence that RGG from public soybean breeding programs has been positive for maturity groups II and III in the period from 1989 to 2019.
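The RGG estimates are reported in bushels per acre, and the corresponding metric values can be reproduced with a simple unit conversion. The sketch below assumes the standard soybean bushel weight of 60 lb, which is not stated in the text; the acre and pound conversions are the exact international definitions.

```python
LB_PER_BUSHEL = 60.0          # assumption: standard soybean bushel weight
KG_PER_LB = 0.45359237        # exact by definition
HA_PER_ACRE = 0.40468564224   # exact by definition

def bu_ac_to_kg_ha(value_bu_ac):
    """Convert a soybean yield (or yearly gain) from bu/ac to kg/ha."""
    return value_bu_ac * LB_PER_BUSHEL * KG_PER_LB / HA_PER_ACRE

# Range of point estimates reported for the empirical data.
low_kg_ha = bu_ac_to_kg_ha(0.27)
high_kg_ha = bu_ac_to_kg_ha(0.59)
```

Under this assumption the range of 0.27 to 0.59 bu/ac per year corresponds to roughly 18.16 to 39.68 kg/ha per year, in agreement with the values given in the abstract.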

Table 9 Estimated RGG with 95% confidence interval (bu/ac$^{-1}$/yr$^{-1}$) and linearity for the empirical soybean dataset

Discussion

Since the beginning of selection for quantitative traits, the development of analytical methods to estimate RGG has been pursued by animal and plant breeders (Eberhart 1964; Garrick 2010; Rizzo et al. 2022). The motivations include the need for an accurate metric that can be used to assess return on investment and to evaluate proposed novel breeding strategies (Rutkoski 2019a, b). The challenge for the development of any statistical metric is to clearly define the inference space. For line development programs, we propose that the inference space consists of lines used in crossing, i.e., a breeding population, consisting of nearly homozygous genotypes that are used as parents in crossing blocks. If experimental lines in advanced MET have accumulated favorable alleles from selection, i.e., have higher average breeding values than their parents from the previous cycle of development, then genetic trends across years of MET are a function of the breeding values of the lines used for crosses. We therefore interpret the comparison across time as an estimate of RGG from all possible germplasm that is adapted to the variable and changing conditions of MET. This connection between accumulation of favorable alleles and expression of the phenotype in advanced MET recognizes that RGG in cultivar development programs can include a culling process executed across multiple environments, and migration and drift, as contributors to the concept of genetic gain (Kempthorne 1957; Falconer 1960).

Similar to breeding using recurrent population improvement methods (Jenkins 1940), cultivar development consists of evaluation, selection, and reproduction, although there are relevant distinctions. There is no single cycle zero from which all genotypes are recurrently derived. Indeed, within both public and proprietary plant breeding programs, there are many geographically distributed line development programs that began with unique sets of founders. Further, each line development program has its own set of objectives and local environments, thus resulting in breeding islands, although lines are exchanged and evaluated in common MET. Consequently, genetic gains can be due not only to selection, but also to migration and drift. In this work, we simulated a broad range of environments in a closed system, i.e., there was no exchange or introgression of germplasm from external sources. Future simulations should consider the role of breeding “islands” and the exchange of breeding lines to more accurately reflect the underlying mechanisms responsible for the historical records (Ramasubramanian and Beavis 2021). Fortunately, the concept of RGG using our defined inference space still applies to a more open system as long as adjusted genotypic means (i.e., eBLUE values) for migrants that become breeding lines are included in the estimation of RGG.

Distinguishing RGG from genotypic and/or commercial gain will help avoid misleading interpretations. Genotypic gain, not genetic gain, is RGG plus the expression of non-additive genetic effects such as epistatic (Hansen and Wagner 2001; Pavlicev et al. 2010) and dominance deviations. Hence, for a trait not only controlled by additive effects (e.g., Garcia et al. 2008), the genotypic gain is expected to be higher than the RGG. Commercial gain is the genotypic gain delivered in the farmer’s field. Examples of commercial gain are estimates from era trials (e.g., Bruce et al. 2019; Cooper et al. 2020) where only widely grown cultivars are used to represent a specified era. The essential distinction between RGG, genotypic, and commercial gain, is that only RGG is relevant to genetic improvement by the breeding project.

Another metric called “yield gain,” “yield advances,” or simply “genetic trend,” has been used to quantify yield increase of staple crops worldwide (Grassini et al. 2013; Prasanna et al. 2022) or in specific countries (Fischer et al. 2022; Guo et al. 2022; Rizzo et al. 2022). For example, Rizzo et al. (2022) collected maize field-trial data over 14 years (2005–2018) from the state of Nebraska (USA), and concluded that climate and agronomy represented 87% of the yield gains in high-yield irrigated environments, leaving 13% for the genetic contribution. The 13% genetic contribution can be interpreted as an unweighted average of the commercial gain. It is unweighted in the sense that individual commercial programs breeding for Nebraska have a specific contribution to the reported gain, given each program has released a number of hybrids with an average lifespan of three years according to their market share. For that reason, the yearly yield gain is a composition of commercial gain within and across breeding programs. Consequently, careful consideration should be given when linking reported genetic trends with RGG.

The main emphasis of this study was to estimate RGG from advanced MET. We chose LMM for estimators because they are commonly used in data analysis of MET (Isik et al. 2017; Dias et al. 2018; Krause et al. 2020) and are well-known in the plant and animal breeding communities. The underlying distributional assumptions of LMM are that random effects have an expected value of zero, and are realizations of independent and multivariate Gaussian distributions with positive-definite variance matrices (Gumedze and Dunne 2011). By assuming the random effects have an expected value of zero, both the average (Isik et al. 2017) and sum (Searle 1997) of predicted values for the random factor are zero. This analytic constraint, however, does not assure the eBLUP value of a newly developed genotype will be numerically higher than that of an older genotype when historical MET data is analyzed using $G\times L\times Y$ models (Supplementary Material Figure A21). Thus, our benchmark Model EB did not provide positive values for RGG.

An alternative parametrization for the $G\times L\times Y$ benchmark model was computed with the cumulative sum of the average eBLUP values of all genotypes according to the year of first testing (e.g., the PYT for the empirical and simulated data). This strategy is connected to the expected gain from selection when all genotypes that are under selection are analyzed together (i.e., the same model). In this case, the expected/predicted gain can be directly calculated with the average of the eBLUP values from selected individuals (Falconer and Mackay 1996; Walsh and Lynch 2018). The rationale is that, when likelihood-based estimators of variance components are used (e.g., REML), the regularization parameter lambda in the BLUP predictor is usually the ratio of residual and random term variance estimates. Hence, for the main effects of genotypes, the shrinkage of the genetic term is inversely proportional to the heritability/repeatability of the trait (Xavier et al. 2016). The practical application of $\varvec{\kappa }$ relies on the assumption that the observed experimental genotypes in MET are a realization of a multivariate random variable with covariance depending on genetic relationships. This assumption does not require reference to a specific base population with idealized properties (Piepho et al. 2008). Furthermore, in parallel to recurrent selection, the first cycle is arbitrarily set by the first available year in the MET data, thus representing the initial value of $\varvec{\kappa }$.
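The shrinkage argument can be illustrated with a deliberately simplified case: a balanced one-way random model with known variance components, in which the BLUP of each line is its deviation from the grand mean multiplied by $n/(n+\lambda)$, where $\lambda$ is the ratio of residual to genotypic variance. All values and the function name below are hypothetical, and the $G\times L\times Y$ models used in the study are considerably more complex.

```python
def blup_balanced(line_means, n_obs, var_g, var_e):
    """BLUPs of line effects for a balanced one-way random model.

    Assumes known variance components and the same number of
    observations (n_obs) behind every line mean.
    """
    lam = var_e / var_g                 # regularization parameter lambda
    shrink = n_obs / (n_obs + lam)      # equals var_g / (var_g + var_e / n_obs)
    grand_mean = sum(line_means) / len(line_means)
    blups = [shrink * (m - grand_mean) for m in line_means]
    return shrink, blups

# Hypothetical line means (bu/ac), four observations per line.
shrink, blups = blup_balanced([48.0, 50.0, 52.0, 54.0],
                              n_obs=4, var_g=6.0, var_e=24.0)
```

In this example the shrinkage factor equals the entry-mean heritability (0.5), so a lower heritability shrinks the predictions more strongly toward zero, and the predicted values sum to zero by construction.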

Another alternative for the benchmark Model EB was including a variable for environments (location–years combination). The idea underlying this approach is from Diers et al. (2018) and Montes et al. (2022). These authors analyzed phenotypic data from a soybean nested association panel using stage-wise models. In the first-stage model, the eBLUP values of incomplete blocks within locations were predicted using a unique identifier in the dataset, and in the second stage, the eBLUP values for incomplete blocks were used as a fixed effect covariate. We modified their first-stage model to obtain the eBLUP values of environmental effects ($\hat{E}$) based on evaluations of check cultivars, and subsequently considered the values as a fixed effect covariate to represent both E and GEI effects. We hypothesized that if the non-genetic effects can be successfully captured by check cultivars in the first stage (Supplementary Material Figure A22), then unbiased estimates of genetic effects could be obtained from the second-stage model.
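The two-step idea of capturing environmental effects with checks can be sketched in a much simplified form. Here the environmental index is the check mean in each environment minus the overall check mean, and it is subtracted from the experimental lines. This replaces the eBLUP-based covariate of the actual models with simple means and fixes the covariate coefficient at one, so it is an illustration of the principle rather than the fitted model. All records are hypothetical.

```python
from statistics import mean

# Hypothetical plot records: (environment, entry, is_check, yield in bu/ac).
plots = [
    ("env1", "check_A", True, 50.0), ("env1", "check_B", True, 54.0),
    ("env1", "line_1", False, 55.0), ("env1", "line_2", False, 57.0),
    ("env2", "check_A", True, 60.0), ("env2", "check_B", True, 64.0),
    ("env2", "line_1", False, 65.0), ("env2", "line_2", False, 67.0),
]

# Step 1: environmental index from checks only
# (check mean per environment minus overall check mean).
overall_check_mean = mean(y for _, _, is_check, y in plots if is_check)
envs = sorted({env for env, _, _, _ in plots})
env_index = {
    env: mean(y for e, _, c, y in plots if c and e == env) - overall_check_mean
    for env in envs
}

# Step 2: remove the index from experimental lines, then average per line.
adjusted = {}
for env, entry, is_check, y in plots:
    if not is_check:
        adjusted.setdefault(entry, []).append(y - env_index[env])
line_means = {entry: mean(values) for entry, values in adjusted.items()}
```

In this toy data set the two environments differ by 10 bu/ac for every entry, so the adjusted line values are identical across environments. Any genotype-by-environment interaction in the experimental lines would remain in the adjusted values.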

While this second modeling strategy successfully captured RGG and GEI effects, it significantly inflated the estimates of genotypic variance, and consequently, the estimates of heritability (Supplementary Material Figure A21 and Supplementary Material Figure A23). For example, estimates of genotypic variance and heritability using empirical soybean MET data changed, respectively, from 6.30 (bu/ac)$^{2}$ and 0.47, in the benchmark Model EB, to 31.11 (bu/ac)$^{2}$ and 0.94 with Model E7. A similar outcome can be achieved by dropping the interaction terms GL, GY, and GLY from the benchmark model, where the estimated genotypic variance changed from 6.30 (bu/ac)$^{2}$ to 27.01 (bu/ac)$^{2}$. These results suggest the genotypic variance in Models E7 and E7G was inflated by the GL, GY, and GLY variances associated with the experimental genotypes. Thus, while checks account for variability among environments of MET, the GEI effects from experimental genotypes are still confounded with the genotypic effects.

The aforementioned inflation due to GEI applies to other mixed models used for analyzing data from MET. For example, the “EBV” model from Rutkoski (2019b) only accounted for genotypic main effects and successfully captured RGG with a biased (inflated) genotypic variance. In this case, RGG estimates from eBLUP values [${\textbf {G}} \sim N(\varvec{0}, {\textbf {I}}\sigma ^2_G)$], estimated breeding values [${\textbf {G}} \sim N(\varvec{0}, {\textbf {A}}\sigma ^2_G)$], or genomic estimated breeding values [${\textbf {G}} \sim N(\varvec{0}, {\textbf {G}}_{{\textbf {M}}}\sigma ^2_G)$], will not produce large differences, as shown in Models E7 and E7G. Nonetheless, for annual and breeding line selections, it is well-known that under normality, equal variance, and independence, BLUP minimizes the mean squared error of predicted values (Robinson 1991; Piepho et al. 2008) and hence maximizes the correlation between true and predicted genotypic values (Henderson 1963; Searle 1974; Searle et al. 1992). Thus, although RGG from MET cannot be directly estimated from eBLUP values of the $G\times L\times Y$ models using large historical data, the use of BLUP is likely to increase gains from selection if data is correctly modeled (Smith et al. 2005; Hartung et al. 2023).

Every model assessed to estimate RGG in the simulations demonstrated some degree of bias. For example, the average bias was similar between simulation models A1 and A2 (simple GEI effects), and between B1 and B2 (complex GEI effects), but differed between the A and B simulations. Including a positive, cumulative, non-genetic gain did not result in major bias when comparing simulation models A1 versus A2, and B1 versus B2. Ideally, RGG should be estimated with models with small values for root mean squared error (Supplementary Material Figure A18–Supplementary Material Figure A20). Even though all evaluated estimation models produced biased results, note that for some models the bias was consistently in the same negative/positive direction relative to the “true” simulated RGG values, indicating that the estimated RGG values under- or overestimated, on average, the true RGG. Such directional bias can be seen as insurance that the estimated RGG from empirical datasets is likely under- or overestimated, depending on the model of choice.

The simulator was constructed to mimic public soybean breeding programs responsible for maturity zones II and III in the USA. These programs evaluate nearly homozygous experimental lines in replicated field trials for four to five years (Fig. 1), but the available empirical data consist of only the last two years of MET. We assumed for simulation that breeding lines were selected in BT3 after three years of trials, and then were used in the crossing nursery as well as advanced to a regional PYT. However, the PYT stage could include lines that were not used in the crossing nursery. There could be lines from other line development programs or there could be experimental lines that might be considered for release as purelines, but not for crossing; not all lines considered for release as purelines also produce high-yielding progeny. Consequently, estimates of RGG based on data from only PYT and URT, with no information about which lines are used for crossing, represent a biased sample relative to the breeding lines that are used to create the next cycle of segregating progeny. Our results show that the average bias was $-$15.21% across all simulation models, indicating that, on average, the samples of lines included in the PYT and URT underestimate the true RGG. In addition, given only 20–30 experimental lines were used in crossing nurseries each year, genetic drift associated with different intensities of selection throughout the pipeline might also have contributed to the observed bias (Falconer and Mackay 1996; Vaughn and Li 2016; Walsh and Lynch 2018).

Based on theory, larger samples will asymptotically produce more accurate and precise estimates. We focused on estimating RGG from 30 years of advanced MET and could not obtain an unbiased, robust estimator. Results considering only 10 years of data revealed a much larger bias and root mean squared error, as also observed by Rutkoski (2019b). The simulated breeding programs in this work and in Rutkoski (2019b) represented small breeding projects relative to proprietary ones. Proprietary breeding programs have a much larger network of trials, which could easily represent a threefold to sixfold increase in the amount of data. In addition, their field trials comprise lines and checks that belong to narrow (0.1) relative maturity (RM) groups. For example, Byrum et al. (2017) developed an environmental index representing the non-genetic component of seed yield from data collected on thousands of check varieties, representing every 0.1 RM group and grown in hundreds of MET per RM every year for a ten-year period. Yield values for each experimental line grown in the same trials as the check varieties were adjusted by subtracting the environmental index. The primary feature of this Genetic Gain Performance (GGP) metric is its reliance on data from a large number of check varieties, grown in a large number of trials and environments, so that the environmental index could be accurately estimated using the Expectation-Maximization (EM) algorithm (Dempster et al. 1977). Thus, a question for further investigation is to determine the optimal (or minimum) number of trials, locations, and years from advanced MET needed to obtain unbiased estimates of RGG using LMM. Also, strategies to increase trial accuracy (e.g., reliability), such as modeling field spatial patterns, can be considered to improve the estimated genotypic means (Gilmour et al. 1997; Borges da Silva et al. 2021).
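The idea of adjusting line yields by a check-based environmental index can be illustrated with a deliberately simplified sketch. It is not the GGP method of Byrum et al. (2017): here the index for a trial is simply the mean yield of the checks grown in that trial, whereas the original work estimated the index with the EM algorithm. The function name, genotype labels, and yields are hypothetical.

```python
from collections import defaultdict

def adjust_by_check_index(records, checks):
    """Subtract a per-trial index (mean yield of the checks) from line yields.

    records: iterable of (trial, genotype, yield_kg_ha) tuples.
    checks: set of genotype names treated as check varieties.
    """
    check_yields = defaultdict(list)
    for trial, genotype, y in records:
        if genotype in checks:
            check_yields[trial].append(y)
    # Simplifying assumption: the index is the arithmetic mean of the checks.
    index = {t: sum(v) / len(v) for t, v in check_yields.items()}
    adjusted = [
        (trial, genotype, y - index[trial])
        for trial, genotype, y in records
        if genotype not in checks and trial in index
    ]
    return index, adjusted

# Hypothetical records: two trials that differ by 600 kg/ha for non-genetic reasons.
records = [
    ("T1", "check_A", 3000.0), ("T1", "check_B", 3200.0), ("T1", "line_1", 3300.0),
    ("T2", "check_A", 3600.0), ("T2", "check_B", 3800.0), ("T2", "line_2", 3900.0),
]
index, adjusted = adjust_by_check_index(records, {"check_A", "check_B"})
print(index)
print(adjusted)
```

In this toy case both lines exceed the checks by 200 kg/ha once the trial index is removed, even though their raw yields differ by 600 kg/ha. With few checks per trial the index itself would be noisy, which is why the approach relies on very large numbers of checks and trials.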

We used simulated data from two or three years of MET, but excluded the simulated data from local breeder trials. We excluded these because empirical data from local breeder trials historically have been neither published nor recorded in accessible databases. However, it is well known that a pattern of missing data can also introduce bias in trend estimation (Hartung et al. 2023) and in REML variance estimation (Piepho and Mohring 2006; Aguate et al. 2019; Hartung and Piepho 2021). Further research is needed to investigate whether this exclusion introduced bias in estimates of RGG. Furthermore, simulated breeding lines were selected in BT3 with BLUP models that consider all available data (BT1, BT2, BT3). It is worthwhile investigating whether these predicted values could lead to less biased estimates of RGG. This approach would also be informative regarding the predicted genetic gain as previously discussed. In addition, early generations can also be used to estimate RGG (Cowling et al. 2023).

Computing RGG from raw phenotypes, without statistical modeling, can indicate positive RGG when no genetic gain was actually delivered. One might argue that when the data are balanced, the arithmetic average and (generalized) least squares yield numerically equivalent estimates. However, data from MET are rarely balanced within years and are largely unbalanced across years. Thus, RGG estimates from raw phenotypes are completely confounded with non-genetic effects. As emphasized by Hartung et al. (2023), careful consideration should be given to selecting the best-fitting, appropriate model for the dataset being analyzed. Great attention should be given to evaluating covariance modeling for non-genetic effects, given that it played an important role for most models. Metrics such as the Akaike and Bayesian information criteria (Akaike 1974; Schwarz 1978), as well as the proportion of genetic variance explained by FA models (Smith et al. 2015), should always be considered.
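The confounding of raw phenotypic trends with non-genetic effects can be shown with a small, noise-free sketch. It assumes a toy layout that is not taken from the study: each cohort of two lines is tested in its first year and the following year, genetic values carry no trend, and a hypothetical non-genetic trend of 20 kg/ha/yr is added. A fixed-effects two-way least-squares fit stands in for the LMM evaluated in the paper.

```python
import numpy as np

n_years = 6
non_genetic_trend = 20.0  # kg/ha/yr, hypothetical

# Build unbalanced data: cohort c enters in year c and is retested in year c + 1.
rows = []  # (year, line_id, first_year, yield)
for cohort in range(n_years):
    for k, g in enumerate((-25.0, 25.0)):  # cohort mean is always 0: no genetic gain
        line_id = 2 * cohort + k
        for yr in (cohort, cohort + 1):
            if yr < n_years:
                rows.append((yr, line_id, cohort, 3000.0 + g + non_genetic_trend * yr))

data = np.array(rows)
year = data[:, 0].astype(int)
line = data[:, 1].astype(int)
first_year = data[:, 2]
y = data[:, 3]

# Naive estimate: regress raw yearly means on year.
raw_means = np.array([y[year == t].mean() for t in range(n_years)])
naive_slope = np.polyfit(np.arange(n_years), raw_means, 1)[0]

# Two-way least squares with indicator columns for lines and years.
n_lines = line.max() + 1
X = np.zeros((len(y), n_lines + n_years))
X[np.arange(len(y)), line] = 1.0
X[np.arange(len(y)), n_lines + year] = 1.0
coef, *_ = np.linalg.lstsq(X, y, rcond=None)  # minimum-norm solution
line_effects = coef[:n_lines]

# Regress adjusted line effects on the year each line entered testing.
line_first_year = np.array([first_year[line == i][0] for i in range(n_lines)])
adjusted_slope = np.polyfit(line_first_year, line_effects, 1)[0]

print(naive_slope, adjusted_slope)
```

The raw yearly means suggest a gain equal to the non-genetic trend, while the slope based on adjusted line effects is numerically zero. The adjustment works here only because lines retested in consecutive years connect the design; real MET data add noise, GEI, and selection, which this sketch ignores.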

Both direct and indirect estimators of RGG provided useful information. See Piepho et al. (2014) for the theoretical development of direct estimation. Including additive genomic relationships to account for correlated genetic effects did not improve the RGG estimates. This result is likely due to genotypes exhibiting little to no relationship across years of MET. Diallel-based models were also evaluated assuming pedigree was available, so RGG was estimated directly from breeding lines. These models did not outperform models that only considered advanced trials, and hence there is no clear advantage in considering this modeling approach. When pedigree is available, an alternative way to compute RGG would be to use the per se performance of breeding lines. We did not test this approach, but results from Rutkoski (2019b) showed that unbiased estimates could not be obtained.

Overall, the best-performing models were E1, E2V, and E7. Since none of these models yielded unbiased estimates of RGG, the most suitable strategy would be to account for the range of the estimated values. This would increase the likelihood of capturing the true RGG. Using this strategy, we report estimates of RGG that ranged from 18.16 to 39.68 kg/ha$^{-1}$/yr$^{-1}$ (0.27 to 0.59 bu/ac$^{-1}$/yr$^{-1}$) for 31 years of empirical soybean MET, considering all sampled environments from the target population of environments. This result further assumes that there is only one set of BTs from a single cultivar development project. It should be pointed out that there are actually multiple variety development projects working in maturity zones II and III, so there are multiple sets of BTs, and further work using island models (Ramasubramanian and Beavis 2021) is needed to provide better interpretation of results (including maintenance of genetic variability) from MET. Lastly, if the goal is to determine whether there is evidence for RGG, regardless of bias, our linearity measure based on the simulation results is useful. Further investigation is needed if RGG is expected to be nonlinear due to changes in the genetic variance, heritability, and non-additive effects (Eberhart 1964; Bulmer 1971; Ramasubramanian and Beavis 2021).
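As a quick arithmetic check on the units, the range reported in the abstract (18.16 to 39.68 kg per hectare per year) can be converted to bushels per acre per year. The sketch below assumes the standard 60 lb soybean bushel; the constant and function names are illustrative only.

```python
LB_TO_KG = 0.45359237        # exact definition of the avoirdupois pound
ACRE_TO_HA = 0.40468564224   # exact definition of the international acre
SOYBEAN_BUSHEL_LB = 60.0     # assumed standard soybean bushel weight

# One bu/ac of soybean expressed in kg/ha (about 67.25).
KG_HA_PER_BU_AC = SOYBEAN_BUSHEL_LB * LB_TO_KG / ACRE_TO_HA

def kg_ha_to_bu_ac(x):
    """Convert a soybean yield (or yield gain) from kg/ha to bu/ac."""
    return x / KG_HA_PER_BU_AC

low, high = 18.16, 39.68  # kg/ha/yr, range reported in the abstract
converted = (round(kg_ha_to_bu_ac(low), 2), round(kg_ha_to_bu_ac(high), 2))
print(converted)  # (0.27, 0.59)
```

The converted values agree with the 0.27 to 0.59 bu/ac per year reported alongside the metric range.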

Conclusion

We evaluated several LMM to estimate RGG using advanced MET. We approached the research question by simulation and proposed a careful characterization of the inference space for RGG that considers the intricate nature of cultivar development programs while remaining consistent with the original concept of genetic gain. Our results suggest it is not possible to accurately estimate RGG using data from two years of MET, such as those available in the public soybean breeding programs in the USA. Consequently, the evaluated estimators should not be used to compare breeding programs or quantify the relative efficiencies of proposed breeding systems. If the goal is only to determine whether there was RGG, the linearity metric is useful. As also concluded by Rutkoski (2019b), no LMM produced unbiased estimates of RGG from limited samples of the large, complex interactions among genetic and non-genetic conditions. Lastly, in addition to the practical and theoretical results applied to soybean genetic improvement, the analyses performed in this study can be applied to quantitative traits evaluated in any diploid crop undergoing phenotypic evaluations in MET.

Data availability

The simulator and evaluated models are publicly available on GitHub (https://github.com/mdkrause/RGG). The soybean empirical data are available in the R package SoyURT (https://github.com/mdkrause/SoyURT).

References

Aguate F, Crossa J, Balzarini M (2019) Effect of missing values on variance component estimates in multienvironment trials. Crop Sci 59:508–517
Akaike H (1974) A new look at the statistical model identification. IEEE Trans Autom Control 19:716–723
Bates D, Maechler M (2021) Matrix: sparse and dense matrix classes and methods
Bernardo R (2020) Breeding for quantitative traits in plants, 3rd edn. Stemma Press, Woodbury
Boichard D, Maignel L, Verrier E (1997) The value of using probabilities of gene origin to measure genetic variability in a population. Genet Sel Evol 29:5
Borges da Silva ED, Xavier A, Faria MV (2021) Joint modeling of genetics and field variation in plant breeding trials using relationship and different spatial methods: a simulation study of accuracy and bias. Agronomy 11:1397
Bornhofen E, Todeschini MH, Stoco MG, Madureira A, Marchioro VS, Storck L, Benin G (2018) Wheat yield improvements in Brazil: roles of genetics and environment. Crop Sci 58:1082–1093
Breseghello F, de Morais OP, Pinheiro PV, Silva ACS, da Maia de Castro E, Guimaraes EP, de Castro AP, Pereira JA, De Matos Lopes A, Utumi MM, de Oliveira JP (2011) Results of 25 years of upland rice breeding in Brazil. Crop Sci 51:914–923
Brisson N, Gate P, Gouache D, Charmet G, Oury FX, Huard F (2010) Why are wheat yields stagnating in Europe? A comprehensive data analysis for France. Field Crop Res 119:201–212
Bruce RW, Grainger CM, Ficht A, Eskandari M, Rajcan I (2019) Trends in soybean trait improvement over generations of selective breeding. Crop Sci 59:1870–1879
Bulmer MG (1971) The effect of selection on genetic variability. Am Nat 105:201–211
Butler DG, Cullis BR, Gilmour AR, Gogel BG, Thompson R (2017) ASReml-R reference manual version 4
Byrum J, Beavis B, Davis C, Doonan G, Doubler T, Kaster V, Mowers R, Parry S (2017) Genetic gain performance metric accelerates agricultural productivity. Interfaces 47:442–453
Cooper M, Tang T, Gho C, Hart T, Hammer G, Messina C (2020) Integrating genetic gain and gap analysis to predict improvements in crop productivity. Crop Sci 60:582–604
Covarrubias-Pazaran G (2020) Genetic gain as a high-level key performance indicator. Technical report, Excellence in Breeding Platform
Cowling WA, Castro-Urrea FA, Stefanova KT, Li L, Banks RG, Saradadevi R, Sass O, Kinghorn BP, Siddique KHM (2023) Optimal contribution selection improves the rate of genetic gain in grain yield and yield stability in spring canola in Australia and Canada. Plants 12:383
Crespo-Herrera LA, Crossa J, Huerta-Espino J, Vargas M, Mondal S, Velu G, Payne TS, Braun H, Singh RP (2018) Genetic gains for grain yield in CIMMYT’s semi-arid wheat yield trials grown in suboptimal environments. Crop Sci 58:1890–1898
Cullis BR, Smith AB, Coombes NE (2006) On the design of early generation variety trials with correlated data. J Agric Biol Environ Stat 11:381–393
de Faria LC, Melo PGS, de Souza TLPO, Pereira HS, Melo LC (2018) Efficiency of methods for genetic progress estimation in common bean breeding using database information. Euphytica 214
de la Vega AJ, DeLacy IH, Chapman SC (2007) Progress over 20 years of sunflower breeding in central Argentina. Field Crop Res 100:61–72
Dempster AP, Laird NM, Rubin DB (1977) Maximum likelihood from incomplete data via the EM algorithm. J Roy Stat Soc Ser B (Methodol) 39:1–38
Dias KODG, Gezan SA, Guimarães CT, Parentoni SN, Guimarães PEdO, Carneiro NP, Portugal AF, Bastos EA, Cardoso MJ, Anoni CdO, de Magalhães JV, de Souza JC, Guimarães LJM, Pastina MM (2018) Estimating genotype $\times$ environment interaction for and genetic correlations among drought tolerance traits in maize via factor analytic multiplicative mixed models. Crop Sci 58:72
Diers BW, Specht J, Rainey KM, Cregan P, Song Q, Ramasubramanian V, Graef G, Nelson R, Schapaugh W, Wang D, Shannon G, Mchale L, Kantartzi SK, Xavier A, Mian R, Stupar RM, Michno JM, An YQC, Goettel W, Ward R, Fox C, Lipka AE, Hyten D, Cary T, Beavis WD (2018) Genetic architecture of soybean yield and agronomic traits. G3: Genes Genomes Genet 8:3367–3375
Dudley JW, Lambert RJ (2004) 100 Generations of selection for oil and protein in corn. In: Janick J (ed) Plant breeding reviews, vol 24. John Wiley & Sons Inc, New York, pp 79–110
Duvick DN (1977) Genetic rates of gain in hybrid maize yields during the past 40 years. Maydica 187–196
Duvick DN (1984) Genetic contributions to yield gains of U.S. hybrid maize, 1930 to 1980. In: Fehr WR (ed) Genetic contributions to yield gains of five major crop plants, vol 2. CSSA Special Publications, pp 15–47
Duvick DN (2005) The contribution of breeding to yield advances in maize (Zea mays L.). In: Advances in Agronomy, vol 86. Academic Press, pp 83–145
Eberhart SA (1964) Least squares method for comparing progress among recurrent selection methods. Crop Sci 4:230–231
Ellis RN, Basford KE, Leslie JK, Hogarth DM, Cooper M (2004) A methodology for analysis of sugarcane productivity trends. 2. Comparing variety trials with commercial productivity. Aust J Agric Res 55:109
Endelman JB, Jannink J-L (2012) Shrinkage estimation of the realized relationship matrix. G3 Genes Genomes Genet 2:1405–1413
Falconer D (1960) Introduction to quantitative genetics. Oliver and Boyd Ltd, Edinburgh
Falconer DS, Mackay TF (1996) Introduction to quantitative genetics, 4th edn. Pearson Education Limited
Felipe M, Gerde JA, Rotundo JL (2016) Soybean genetic gain in maturity Groups III to V in Argentina from 1980 to 2015. Crop Sci 56:3066–3077
Fischer T, Ammar K, Monasterio IO, Monjardino M, Singh R, Verhulst N (2022) Sixty years of irrigated wheat yield increase in the Yaqui Valley of Mexico: past drivers, prospects and sustainability. Field Crop Res 283:108528
Fisher RA (1918) The correlation between relatives on the supposition of mendelian inheritance. Trans R Soc Edinb 52:399–433
Fox CM, Cary TR, Colgrove AL, Nafziger ED, Haudenshield JS, Hartman GL, Specht JE, Diers BW (2013) Estimating soybean genetic gain for yield in the Northern United States-influence of cropping history. Crop Sci 53:2473–2482
Frensham A, Cullis B, Verbyla A (1997) Genotype by environment variance heterogeneity in a two-stage analysis. Biometrics 53:1373–1383
Fritsche-Neto R, Sabadin F, doVale JC, Souza PH, Borges KLR, Crossa J, Garbuglio DD (2023) Realized genetic gains via recurrent selection in a tropical maize haploid inducer population and optimizing simultaneous selection for the next breeding cycles. PREPRINT (Version 2) available at Research Square
Garcia AAF, Wang S, Melchinger AE, Zeng ZB (2008) Quantitative trait loci mapping and the genetic basis of heterosis in maize and rice. Genetics 180:1707–1724
Garrick DJ (2010) An animal breeding approach to the estimation of genetic and environmental trends from field populations. J Anim Sci 88:E3–E10
Gaynor RC, Gorjanc G, Hickey JM (2021) AlphaSimR: An R package for breeding program simulations. G3 Genes Genomes Genet 11
Gerard GS, Crespo-Herrera LA, Crossa J, Mondal S, Velu G, Juliana P, Huerta-Espino J, Vargas M, Rhandawa MS, Bhavani S, Braun H, Singh RP (2020) Grain yield genetic gains and changes in physiological related traits for CIMMYT’s High Rainfall Wheat Screening Nursery tested across international environments. Field Crop Res 249:107742
Gilmour AR, Cullis BR, Verbyla AP (1997) Accounting for natural and extraneous variation in the analysis of field experiments. J Agric Biol Environ Stat 2:269
Grassini P, Eskridge KM, Cassman KG (2013) Distinguishing between yield advances and yield plateaus in historical crop production trends. Nat Commun 4:2918
Gumedze F, Dunne T (2011) Parameter estimation and inference in the linear mixed model. Linear Algebra Appl 435:1920–1944
Guo S, Zhang Z, Guo E, Fu Z, Gong J, Yang X (2022) Historical and projected impacts of climate change and technology on soybean yield in China. Agric Syst 203:103522
Hallauer A, Miranda J (1988) Quantitative genetics in maize breeding. Iowa State University Press, Ames
Hansen TF, Wagner GP (2001) Modeling genetic architecture: a multilinear theory of gene interaction. Theor Popul Biol 59:61–86
Hardin J, Garcia SR, Golan D (2013) A method for generating realistic correlation matrices. Ann Appl Stat 7
Hartung J, Laidig F, Piepho H-P (2023) Effects of systematic data reduction on trend estimation from German registration trials. Theor Appl Genet 136:21
Hartung J, Piepho H (2021) Effect of missing values in multi-environmental trials on variance component estimates. Crop Sci 1–11
Hazel L, Lush J (1942) The efficiency of three methods of selection. J Hered 33:393–399
Henderson CR (1949) Estimates of changes in herd environment. J Dairy Sci
Henderson CR (1950) Estimation of genetic parameters. Ann Math Stat 21:309–310
Henderson CR (1963) Selection index and expected genetic advance. In: Statistical genetics and plant breeding. National Academy of Sciences - National Research Council, Washington DC, p 623
Henderson CR, Kempthorne O, Searle SR, von Krosigk CM (1959) The estimation of environmental and genetic trends from records subject to culling. Biometrics 15:192
Isik F, Holland J, Maltecca C (2017) Genetic data analysis for plant and animal breeding. Springer, Cham
Jenkins MT (1940) The segregation of genes affecting yield of grain in maize. Agron J 32:55–63
Johannsen W (1911) The genotype conception of heredity. Am Nat 45:129–159
Kempthorne O (1957) An introduction to genetic statistics. Wiley publications in statistics, New York
Kleinknecht K, Möhring J, Laidig F, Meyer U, Piepho HP (2016) A simulation-based approach for evaluating the efficiency of multienvironment trial designs. Crop Sci 56:2237–2250
Krause MD, Dias KOdG, Pedroso Rigal dos Santos J, Oliveira AA, Guimarães LJM, Pastina MM, Margarido GRA, Garcia AAF (2020) Boosting predictive ability of tropical maize hybrids via genotype-by-environment interaction under multivariate GBLUP models. Crop Sci 60:3049–3065
Krause MD, Dias KOG, Singh AK, Beavis WD (2023) Using soybean historical field trial data to study genotype by environment variation and identify mega-environments with the integration of genetic and non-genetic factors. BioRxiv. https://doi.org/10.1101/2022.04.11.487885. https://www.biorxiv.org/content/early/2023/04/15/2022.04.11.487885
Laidig F, Piepho HP, Drobek T, Meyer U (2014) Genetic and non-genetic long-term trends of 12 different crops in German official variety performance trials and on-farm yield trends. Theor Appl Genet 127:2599–2617
Laidig F, Piepho HP, Rentel D, Drobek T, Meyer U, Huesken A (2017) Breeding progress, variation, and correlation of grain and quality traits in winter rye hybrid and population varieties and national on-farm progress in Germany over 26 years. Theor Appl Genet 130:981–998
Lenth RV (2022) emmeans: Estimated marginal means, aka least-squares means
Lush J (1937) Animal breeding plans. Iowa State College Press, Ames, Iowa
Lynch M, Walsh B (1998) Genetics and analysis of quantitative traits, 1st edn. Sinauer Associates, Sunderland
Lyubchich V, Gel YR (2022) funtimes: Functions for time series analysis
Mackay I, Horwell A, Garner J, White J, McKee J, Philpott H (2011) Reanalyses of the historical series of UK variety trials to quantify the contributions of genetic and environmental factors to trends and variability in yield over time. Theor Appl Genet 122:225–238
Mayr E (1942) Systematics and the origin of species. Columbia Univ. Press, New York
Milioli AS, Meira D, Panho MC, Madella LA, Woyann LG, Todeschini MH, Zdziarski AD, Ramos Campagnolli O, Menegazzi CP, Colonelli LL, Fernandes RAT, Melo CLPd, Fernandes de Oliveira M, Bertagnolli PF, Arias CAA, Giasson NF, Matsumoto MN, Quiroga M, Rossi Silva R, Bertan I, Capelin MA, Matei G, Benin G (2022) Genetic improvement of soybeans in Brazil: south and midwest regions. Crop Sci 62:2276–2293
Möhring J, Melchinger AE, Piepho HP (2011) REML-based Diallel analysis. Crop Sci 51:470–478
Montes CM, Fox C, Sanz-Saez A, Serbin SP, Kumagai E, Krause MD, Xavier A, Specht JE, Beavis WD, Bernacchi CJ, Diers BW, Ainsworth EA (2022) High-throughput characterization, correlation, and mapping of leaf photosynthetic and functional traits in the soybean (Glycine max) nested association mapping population. Genetics
Noguchi K, Gel YR, Duguay CR (2011) Bootstrap-based tests for trends in hydrological time series, with application to ice phenology data. J Hydrol 410:150–161
Ortiz R, Reslow F, Cuevas J, Crossa J (2022) Genetic gains in potato breeding as measured by field testing of cultivars released during the last 200 years in the Nordic Region of Europe. J Agric Sci 160:310–316
Oury FX, Godin C, Mailliard A, Chassin A, Gardet O, Giraud A, Heumez E, Morlais JY, Rolland B, Rousset M, Trottet M, Charmet G (2012) A study of genetic progress due to selection reveals a negative effect of climate change on bread wheat yield in France. Eur J Agron 40:28–38
Patterson HD, Thompson R (1971) Recovery of inter-block information when block sizes are unequal. Biometrika 58:545–554
Pavlicev M, Le Rouzic A, Cheverud JM, Wagner GP, Hansen TF (2010) Directionality of epistasis in a murine intercross population. Genetics 185:1489–1505
Piepho HP, Laidig F, Drobek T, Meyer U (2014) Dissecting genetic and non-genetic sources of long-term yield trend in German official variety trials. Theor Appl Genet 127:1009–1018
Piepho HP, Mohring J (2006) Selection in cultivar trials—is it ignorable? Crop Sci 46:192–201
Piepho HP, Möhring J, Melchinger AE, Büchse A (2008) BLUP for phenotypic selection in plant breeding and variety testing. Euphytica 161:209–228
Piepho HP, Möhring J, Schulz-Streeck T, Ogutu JO (2012) A stage-wise approach for the analysis of multi-environment trials. Biom J 54:844–860
Prasanna BM, Burgueño J, Beyene Y, Makumbi D, Asea G, Woyengo V, Tarekegne A, Magorokosho C, Wegary D, Ndhlela T, Zaman-Allah M, Matova PM, Mwansa K, Mashingaidze K, Fato P, Teklewold A, Vivek BS, Zaidi PH, Vinayan MT, Patne N, Rakshit S, Kumar R, Jat SL, Singh SB, Kuchanur PH, Lohithaswa HC, Singh NK, Koirala KB, Ahmed S, Vicente FS, Dhliwayo T, Cairns JE (2022) Genetic trends in CIMMYT’s tropical maize breeding pipelines. Sci Rep 12:20110
R Core Team (2021) R: A language and environment for statistical computing
Ramasubramanian V, Beavis WD (2020) Factors affecting response to recurrent genomic selection in soybeans. bioRxiv
Ramasubramanian V, Beavis WD (2021) Strategies to assure optimal trade-offs among competing objectives for the genetic improvement of soybean. Front Genet 12
Rencher AC, Schaalje GB (2007) Linear models in statistics. John Wiley & Sons Inc, Hoboken
Rincker K, Nelson R, Specht J, Sleper D, Cary T, Cianzio SR, Casteel S, Conley S, Chen P, Davis V, Fox C, Graef G, Godsey C, Holshouser D, Jiang G-L, Kantartzi SK, Kenworthy W, Lee C, Mian R, McHale L, Naeve S, Orf J, Poysa V, Schapaugh W, Shannon G, Uniatowski R, Wang D, Diers B (2014) Genetic improvement of U.S. soybean in maturity groups II, III, and IV. Crop Sci 54:1419–1432
Rizzo G, Monzon JP, Tenorio FA, Howard R, Cassman KG, Grassini P (2022) Climate and agronomy, not genetics, underpin recent maize yield gains in favorable environments. Proc Natl Acad Sci 119
Robinson GK (1991) That BLUP is a good thing: the estimation of random effects. Stat Sci 6:15–32
Rogers J, Chen P, Shi A, Zhang B, Scaboo A, Smith SF, Zeng A (2015) Agronomic performance and genetic progress of selected historical soybean varieties in the southern USA. Plant Breed 134:85–93
Rutkoski JE (2019a) Chapter four - a practical guide to genetic gain. In: Donald LS (ed) Advances in agronomy, vol 157. Academic Press, London, pp 217–249
Rutkoski JE (2019b) Estimation of realized rates of genetic gain and indicators for breeding program assessment. Crop Sci 59:981–993
Schwarz G (1978) Estimating the dimension of a model. Ann Stat 6:461–464
Searle S (1974) Prediction, mixed models and variance components. In: Proschan F, Serfling R (eds) Reliability and biometry. Society for Industrial and Applied Mathematics, Philadelphia, pp 229–266
Searle SR (1997) Built-in restrictions on best linear unbiased predictors (BLUP) of random effects in mixed models. Am Stat 51:19
Searle SR, Casella G, McCulloch CE (1992) Variance components. John Wiley & Sons, Hoboken
Singh DP, Singh AK, Singh A (2021) Plant breeding and cultivar development, 1st edn. Academic Press, London
Smith A, Cullis B, Gilmour A (2001) The analysis of crop variety evaluation data in Australia. Aust N Z J Stat 43:129–145
Smith A, Cullis BR, Thompson R (2001) Analyzing variety by environment data using multiplicative mixed models and adjustments for spatial field trend. Biometrics 57:1138–1147
Smith AB, Cullis BR, Thompson R (2005) The analysis of crop cultivar breeding and evaluation trials: an overview of current mixed model approaches. J Agric Sci 143:449–462
Smith AB, Ganesalingam A, Kuchel H, Cullis BR (2015) Factor analytic mixed models for the provision of grower information from national crop variety testing programs. Theor Appl Genet 128:55–72
Song Q, Yan L, Quigley C, Fickus E, Wei H, Chen L, Dong F, Araya S, Liu J, Hyten D, Pantalone V, Nelson RL (2020) Soybean BARCSoySNP6K: an assay for soybean genetics and breeding research. Plant J 104:800–811
Sprague G, Federer W (1951) A comparison of variance components in corn yield trials: II. error, year x variety, location x variety, and variety components. Agron J 43:535–541
Stephens M (1986) Tests based on EDF statistics. In: D’Agostino R, Stephens M (eds) Goodness-of-fit techniques. Marcel Dekker, New York, pp 97–194
Streck EA, de Magalhaes AM, Aguiar GA, Henrique Facchinello PK, Reis Fagundes PR, Franco DF, Nardino M, de Oliveira AC (2018) Genetic Progress in 45 years of irrigated rice breeding in Southern Brazil. Crop Sci 58:1094–1105
Tabery J (2008) R. A. Fisher, Lancelot Hogben, and the origin(s) of genotype-environment interaction. J Hist Biol 41:717–761
Teimouri M (2021) ForestFit: statistical modelling for plant size distributions
Ustun A, Allen FL, English BC (2001) Genetic progress in soybean of the U.S. Midsouth. Crop Sci 41:993–998
Vaughn JN, Li Z (2016) Genomic signatures of North American soybean improvement inform diversity enrichment strategies and clarify the impact of hybridization. G3: Genes Genomes Genet 6:2693–2705
Vencovsky R, Morales A, Garcia JC, Teixeira NM (1986) Progresso Genético Em Vinte Anos De Melhoramento Do Milho No Brasil. Anais do congresso nacional de milho e sorgo. Belo Horizonte, Minas Gerais, Embrapa-CNPMS, Sete Lagoas, Minas Gerais, Brazil, pp 300–307
Walsh B, Lynch M (2018) Evolution and selection of quantitative traits, vol 1. Oxford University Press, Oxford
Wilcox JR (2001) Sixty years of improvement in publicly developed elite soybean lines. Crop Sci 41:1711–1716
Xavier A, Muir WM, Craig B, Rainey KM (2016) Walking through the statistical black boxes of plant breeding. Theor Appl Genet 129:1933–1949
Yan W (2016) Analysis and handling of G x E in a practical breeding program. Crop Sci 56:2106–2118

Acknowledgements

Our sincere thanks to Dr. R Chris Gaynor for providing R functions to simulate GEI effects with a compound symmetry model, to Dr. Jack Dekkers for valuable insights on the observed biases of the evaluated models, and to the Iowa State University (ISU) Research IT team for providing efficient computational resources.

Funding

Funding for this research was provided by the Department of Agronomy - ISU, the North Central Soybean Research Program, an NSF Grant (1830478), Baker Center for Plant Breeding, and USDA-ARS CRIS Project IOW04714.

Author information

Authors and Affiliations

Department of Agronomy, Iowa State University, Ames, IA, USA
Matheus D. Krause, Asheesh K. Singh & William D. Beavis
Biostatistics Unit, University of Hohenheim, Stuttgart, Germany
Hans-Peter Piepho
Department of General Biology, Federal University of Viçosa, Viçosa, Brazil
Kaio O. G. Dias

Contributions

MDK and WDB conceived the research; MDK designed the simulator, models, performed the statistical analyses, and wrote the first drafts of the manuscript; HPP and KOGD provided insights into the methodology; HPP revised numerous drafts of the manuscript; AKS provided knowledge on the structure of a public soybean breeding programs; and WDB and AKS were responsible for acquiring funding to support the research. All authors approved the final version of the manuscript.

Corresponding authors

Correspondence to Matheus D. Krause or William D. Beavis.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Additional information

Communicated by Daniela Bustos-Korts.

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Below is the link to the electronic supplementary material.

Supplementary file1 (PDF 5953 kb)

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Krause, M.D., Piepho, HP., Dias, K.O.G. et al. Models to estimate genetic gain of soybean seed yield from annual multi-environment field trials. Theor Appl Genet 136, 252 (2023). https://doi.org/10.1007/s00122-023-04470-3

Received: 15 May 2023
Accepted: 25 September 2023
Published: 21 November 2023
DOI: https://doi.org/10.1007/s00122-023-04470-3
