Linear Molecular and Genomic Eigen Selection Index Methods

Céron-Rojas, J. Jesus; Crossa, José

doi:10.1007/978-3-319-91223-3_8

Abstract

The three main linear phenotypic eigen selection index methods are the eigen selection index method (ESIM), the restricted ESIM (RESIM) and the predetermined proportional gain ESIM (PPG-ESIM). The ESIM is an unrestricted index, but the RESIM and PPG-ESIM allow null and predetermined restrictions respectively to be imposed on the expected genetic gains of some traits, whereas the rest remain without any restrictions. These indices are based on the canonical correlation, on the singular value decomposition, and on the linear phenotypic selection indices theory. We extend the ESIM theory to the molecular-assisted and genomic selection context to develop a molecular ESIM (MESIM), a genomic ESIM (GESIM), and a genome-wide ESIM (GW-ESIM). We also extend the RESIM and PPG-ESIM theory to the restricted genomic ESIM (RGESIM) and to the predetermined proportional gain genomic ESIM (PPG-GESIM) respectively. These five indices use marker and phenotypic information jointly to predict the net genetic merit of the candidates for selection. The MESIM uses only statistically significant markers linked to quantitative trait loci; the GW-ESIM uses all genome markers together with phenotypic information; and the GESIM, RGESIM, and PPG-GESIM use genomic estimated breeding values together with phenotypic values. Using real and simulated data, we validate the theoretical results of all five indices.

8.1 The Molecular Eigen Selection Index Method

The molecular eigen selection index method (MESIM) is very similar to the linear molecular selection index (LMSI) described in Chap. 4; thus, it uses the same set of information to predict the net genetic merit of individual candidates for selection, and therefore needs the same set of conditions as those of the LMSI. The only difference between the two indices is how the vector of coefficients is obtained and the assumption associated with the vector of economic weights. Thus, although the LMSI obtains the vector of coefficients according to the linear phenotypic selection index (LPSI) described in Chap. 2 and assumes that the economic weights are known and fixed, the MESIM assumes that the economic weights are unknown and fixed and obtains the vector of coefficients according to the ESIM theory.

8.1.1 The MESIM Parameters

In the MESIM context, the net genetic merit can be written as

$$ H={\mathbf{w}}_1^{\prime}\mathbf{g}+{\mathbf{w}}_2^{\prime}\mathbf{s}=\left[{\mathbf{w}}_1^{\prime}\kern0.5em {\mathbf{w}}_2^{\prime}\right]\left[\begin{array}{c}\mathbf{g}\\ {}\mathbf{s}\end{array}\right]={\mathbf{w}}^{\prime}\mathbf{a}, $$

(8.1)

where $ {\mathbf{g}}^{\prime }=\left[{g}_1\kern0.5em \dots \kern0.5em {g}_t\right] $ is the vector of true breeding values, t is the number of traits, $ {\mathbf{w}}_1^{\prime }=\left[{w}_1\kern0.5em \cdots \kern0.5em {w}_t\right] $ is a vector of unknown economic weights associated with g, $ {\mathbf{w}}_2^{\prime }=\left[{0}_1\kern0.5em \cdots \kern0.5em {0}_t\right] $ is a null vector associated with the vector of marker score values $ {\mathbf{s}}^{\prime }=\left[{s}_1\kern0.5em {s}_2\kern0.5em \dots \kern0.5em {s}_t\right] $, $ {\mathbf{w}}^{\prime }=\left[{\mathbf{w}}_1^{\prime}\kern0.5em {\mathbf{w}}_2^{\prime}\right] $ and $ {\mathbf{a}}^{\prime }=\left[{\mathbf{g}}^{\prime}\kern0.5em {\mathbf{s}}^{\prime}\right] $ (see Chap. 4 for details). The MESIM index can be written as

$$ I={\boldsymbol{\upbeta}}_{\mathbf{y}}^{\prime}\mathbf{y}+{\boldsymbol{\upbeta}}_s^{\prime}\mathbf{s}=\left[{\boldsymbol{\upbeta}}_{\mathbf{y}}^{\prime}\kern0.5em {\boldsymbol{\upbeta}}_s^{\prime}\right]\kern0.24em \left[\begin{array}{c}\mathbf{y}\\ {}\mathbf{s}\end{array}\right]={\boldsymbol{\upbeta}}^{\prime}\mathbf{t}, $$

(8.2)

where $ {\mathbf{y}}^{\prime }=\left[{y}_1\kern0.5em \cdots \kern0.5em {y}_t\right] $ is the vector of phenotypic values; $ {\mathbf{s}}^{\prime }=\left[{s}_1\kern0.5em {s}_2\kern0.5em \dots \kern0.5em {s}_t\right] $ is the vector of marker scores; $ {\boldsymbol{\upbeta}}_{\mathbf{y}}^{\prime } $ and β_s are the vectors of weights for the phenotypic values and the marker scores respectively; $ {\boldsymbol{\upbeta}}^{\prime }=\left[{\boldsymbol{\upbeta}}_{\mathbf{y}}^{\prime}\kern0.5em {\boldsymbol{\upbeta}}_s^{\prime}\right] $ and $ {\mathbf{t}}^{\prime }=\left[{\mathbf{y}}^{\prime}\kern0.5em {\mathbf{s}}^{\prime}\right] $. The objectives of the MESIM are the same as those of the ESIM (see Chap. 7 for details).

Let $ Var(H)={\mathbf{w}}^{\prime }{\boldsymbol{\Psi}}_M\mathbf{w}={\sigma}_H^2 $ be the variance of H, $ Var(I)={\boldsymbol{\upbeta}}^{\prime }{\mathbf{T}}_M\boldsymbol{\upbeta} ={\sigma}_I^2 $ the variance of I, and Cov(H, I) = w′Ψ_Mβ the covariance between H and I, where $ {\boldsymbol{\Psi}}_M= Var\left[\begin{array}{c}\mathbf{g}\\ {}\mathbf{s}\end{array}\right]=\left[\begin{array}{cc}\mathbf{C}& {\mathbf{S}}_M\\ {}{\mathbf{S}}_M& {\mathbf{S}}_M\end{array}\right] $ and $ {\mathbf{T}}_M= Var\left[\begin{array}{c}\mathbf{y}\\ {}\mathbf{s}\end{array}\right]=\left[\begin{array}{cc}\mathbf{P}& {\mathbf{S}}_M\\ {}{\mathbf{S}}_M& {\mathbf{S}}_M\end{array}\right] $ are block matrices of size 2t × 2t (t is the number of traits) of covariance matrices where P, S_M, and C are covariance matrices t × t of phenotypic (y), marker score (s), and genetic breeding (g) values respectively. Let $ {\rho}_{HI}=\frac{{\mathbf{w}}^{\prime }{\boldsymbol{\Psi}}_M\boldsymbol{\upbeta}}{\sqrt{{\mathbf{w}}^{\prime }{\boldsymbol{\Psi}}_M\mathbf{w}}\sqrt{{\boldsymbol{\upbeta}}^{\prime }{\mathbf{T}}_M\boldsymbol{\upbeta}}} $ and $ {h}_I^2=\frac{{\boldsymbol{\upbeta}}^{\prime }{\boldsymbol{\Psi}}_M\boldsymbol{\upbeta}}{{\boldsymbol{\upbeta}}^{\prime }{\mathbf{T}}_M\boldsymbol{\upbeta}} $ be the correlation between H and I, and the heritability of I respectively; then, the MESIM selection response can be written as

$$ R={k}_I{\sigma}_H{\rho}_{HI} $$

(8.3)

and

$$ R={k}_I{\sigma}_I{h}_I^2, $$

(8.4)

where k_I is the standardized selection differential (or selection intensity) associated with MESIM; $ {\sigma}_H=\sqrt{{\mathbf{w}}^{\prime }{\boldsymbol{\Psi}}_M\mathbf{w}} $ and $ {\sigma}_I=\sqrt{{\boldsymbol{\upbeta}}^{\prime }{\mathbf{T}}_M\boldsymbol{\upbeta}} $ are the standard deviations of H and I respectively. It is assumed that k_I is fixed, and that matrices T_M and Ψ_M are known; therefore, we can maximize R by maximizing ρ_HI (Eq. 8.3) with respect to vectors w and β, or by maximizing $ {h}_I^2 $ (Eq. 8.4) only with respect to vector β.

Maximizing $ {h}_I^2 $ only with respect to β is simpler than maximizing ρ_HI with respect to w and β; however, in the latter case the maximization process of ρ_HI gives more information associated with MESIM parameters than when $ {h}_I^2 $ is maximized only with respect to β (see Chap. 7, Eq. 7.13, for details). In this subsection, we maximize ρ_HI with respect to vectors w and β similar to the ESIM in Chap. 7, Sect. 7.1.1. Thus, we omit the steps and details of the maximization process of ρ_HI.

We maximize $ {\rho}_{HI}=\frac{{\mathbf{w}}^{\prime }{\boldsymbol{\Psi}}_M\boldsymbol{\upbeta}}{\sqrt{{\mathbf{w}}^{\prime }{\boldsymbol{\Psi}}_M\mathbf{w}}\sqrt{{\boldsymbol{\upbeta}}^{\prime }{\mathbf{T}}_M\boldsymbol{\upbeta}}} $ with respect to vectors w and β under the restrictions $ {\sigma}_H^2={\mathbf{w}}^{\prime}\boldsymbol{\Psi} \mathbf{w} $, $ {\sigma}_I^2={\boldsymbol{\upbeta}}^{\prime}\mathbf{T}\boldsymbol{\upbeta } $, and 0 < $ {\sigma}_H^2 $, $ {\sigma}_I^2 $ < ∞, where $ {\sigma}_H^2 $ is the variance of H = w′a and $ {\sigma}_I^2 $ is the variance of I = β′t. Thus, it is necessary to maximize the function

$$ f\left(\boldsymbol{\upbeta}, \mathbf{w},\mu, \phi \right)={\mathbf{w}}^{\prime}\boldsymbol{\Psi} \boldsymbol{\upbeta} -0.5\mu \left({\boldsymbol{\upbeta}}^{\prime}\mathbf{T}\boldsymbol{\upbeta } -{\sigma}_I^2\right)-0.5\phi \left({\mathbf{w}}^{\prime}\boldsymbol{\Psi} \mathbf{w}-{\sigma}_H^2\right) $$

(8.5)

with respect to β, w, μ, and ϕ, where μ and ϕ are Lagrange multipliers. The derivatives of Eq. (8.5) with respect to β, w, μ, and ϕ are:

$$ \boldsymbol{\Psi} \mathbf{w}-\mu \mathbf{T}\boldsymbol{\upbeta } =\mathbf{0}, $$

(8.6)

$$ \boldsymbol{\Psi} \boldsymbol{\upbeta} -\phi \boldsymbol{\Psi} \mathbf{w}=\mathbf{0}, $$

(8.7)

$$ {\boldsymbol{\upbeta}}^{\prime}\mathbf{T}\boldsymbol{\upbeta } ={\sigma}_I^2\kern1em \mathrm{and}\kern1em {\mathbf{w}}^{\prime}\boldsymbol{\Psi} \mathbf{w}={\sigma}_H^2, $$

(8.8)

respectively, where Eq. (8.8) denotes the restrictions imposed for maximizing ρ_HI. It can be shown (see Chap. 7) that vector w can be obtained as

$$ {\mathbf{w}}_M={\boldsymbol{\Psi}}_M^{-1}{\mathbf{T}}_M\boldsymbol{\upbeta} $$

(8.9)

and the net genetic merit in the MESIM context can be written as $ {H}_M={\mathbf{w}}_M^{\prime}\mathbf{a} $; thus, the correlation between $ {H}_M={\mathbf{w}}_M^{\prime}\mathbf{a} $ and I is $ {\rho}_{H_MI}=\frac{\sqrt{{\boldsymbol{\upbeta}}^{\prime}\mathbf{T}\boldsymbol{\upbeta }}}{\sqrt{{\boldsymbol{\upbeta}}^{\prime }{\mathbf{T}\boldsymbol{\Psi}}^{-1}\mathbf{T}\boldsymbol{\upbeta }}} $ and the MESIM vector of coefficients (β) that maximizes $ {\rho}_{H_MI} $ can be obtained from equation

$$ \left({\mathbf{T}}^{-1}\boldsymbol{\Psi} -{\lambda}_M^2{\mathbf{I}}_{2t}\right){\boldsymbol{\upbeta}}_M=\mathbf{0}, $$

(8.10)

where I_2t is an identity matrix of size 2t × 2t (t is the number of traits), and $ {\lambda}_M^2 $ and β_M are an eigenvalue and its associated eigenvector of matrix $ {\mathbf{T}}_M^{-1}{\boldsymbol{\Psi}}_M $. The words eigenvalue and eigenvector are derived from the German word eigen, which means "owned by" or "peculiar to". Eigenvalues and eigenvectors are sometimes called characteristic values and characteristic vectors, proper values and proper vectors, or latent values and latent vectors (Meyer 2000). The square root of $ {\lambda}_M^2 $ (λ_M) is the canonical correlation between $ {H}_M={\mathbf{w}}_M^{\prime}\mathbf{a} $ and $ {I}_M={\boldsymbol{\upbeta}}_M^{\prime}\mathbf{t} $, and the optimized MESIM index can be written as $ {I}_M={\boldsymbol{\upbeta}}_M^{\prime}\mathbf{t} $. Using a similar procedure to that described in Chap. 7 (Eq. 7.17), it can be shown that vector β_M can be transformed into β_C = Fβ_M, where F is a diagonal matrix whose diagonal entries can be any nonzero real numbers.
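The expression for $ {\rho}_{H_MI} $ quoted before Eq. (8.10) can be checked directly from Eq. (8.9). As a short sketch, assuming that Ψ and T are symmetric and that Ψ is invertible (the subscript M is dropped, as in Eqs. 8.5 to 8.8), substituting $ {\mathbf{w}}_M={\boldsymbol{\Psi}}^{-1}\mathbf{T}\boldsymbol{\upbeta} $ into the covariance between H_M and I and into the variance of H_M gives

$$ Cov\left({H}_M,I\right)={\mathbf{w}}_M^{\prime}\boldsymbol{\Psi} \boldsymbol{\upbeta} ={\boldsymbol{\upbeta}}^{\prime}\mathbf{T}{\boldsymbol{\Psi}}^{-1}\boldsymbol{\Psi} \boldsymbol{\upbeta} ={\boldsymbol{\upbeta}}^{\prime}\mathbf{T}\boldsymbol{\upbeta}, \kern1em Var\left({H}_M\right)={\mathbf{w}}_M^{\prime}\boldsymbol{\Psi} {\mathbf{w}}_M={\boldsymbol{\upbeta}}^{\prime}\mathbf{T}{\boldsymbol{\Psi}}^{-1}\mathbf{T}\boldsymbol{\upbeta}, $$

so that

$$ {\rho}_{H_MI}=\frac{{\boldsymbol{\upbeta}}^{\prime}\mathbf{T}\boldsymbol{\upbeta}}{\sqrt{{\boldsymbol{\upbeta}}^{\prime}\mathbf{T}{\boldsymbol{\Psi}}^{-1}\mathbf{T}\boldsymbol{\upbeta}}\sqrt{{\boldsymbol{\upbeta}}^{\prime}\mathbf{T}\boldsymbol{\upbeta}}}=\frac{\sqrt{{\boldsymbol{\upbeta}}^{\prime}\mathbf{T}\boldsymbol{\upbeta}}}{\sqrt{{\boldsymbol{\upbeta}}^{\prime}\mathbf{T}{\boldsymbol{\Psi}}^{-1}\mathbf{T}\boldsymbol{\upbeta}}}. $$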

The maximized correlation between $ {H}_M={\mathbf{w}}_M^{\prime}\mathbf{a} $ and $ {I}_M={\boldsymbol{\upbeta}}_M^{\prime}\mathbf{t} $, or MESIM accuracy, is

$$ {\rho}_{H_M{I}_M}=\frac{\sqrt{{\boldsymbol{\upbeta}}_M^{\prime }{\mathbf{T}}_M{\boldsymbol{\upbeta}}_M}}{\sqrt{{\boldsymbol{\upbeta}}_M^{\prime }{\mathbf{T}}_M{\boldsymbol{\Psi}}_M^{-1}{\mathbf{T}}_M{\boldsymbol{\upbeta}}_M}}=\frac{\sigma_{I_M}}{\sigma_{H_M}}, $$

(8.11)

where $ {\sigma}_{I_M}=\sqrt{{\boldsymbol{\upbeta}}_M^{\prime }{\mathbf{T}}_M{\boldsymbol{\upbeta}}_M} $ is the standard deviation of $ {I}_M={\boldsymbol{\upbeta}}_M^{\prime}\mathbf{t} $, and $ {\sigma}_{H_M}=\sqrt{{\boldsymbol{\upbeta}}_M^{\prime }{\mathbf{T}}_M{\boldsymbol{\Psi}}_M^{-1}{\mathbf{T}}_M{\boldsymbol{\upbeta}}_M} $ is the standard deviation of $ {H}_M={\mathbf{w}}_M^{\prime}\mathbf{a} $.

The maximized selection response and expected genetic gain per trait of MESIM are

$$ {R}_M={k}_I\sqrt{{\boldsymbol{\upbeta}}_{M_1}^{\prime }{\mathbf{T}}_M{\boldsymbol{\upbeta}}_{M_1}} $$

(8.12)

and

$$ {\mathbf{E}}_M={k}_I\frac{{\boldsymbol{\Psi}}_M{\boldsymbol{\upbeta}}_{M_1}}{\sqrt{{\boldsymbol{\upbeta}}_{M_1}^{\prime }{\mathbf{T}}_M{\boldsymbol{\upbeta}}_{M_1}}}, $$

(8.13)

respectively, where $ {\boldsymbol{\upbeta}}_{M_1} $ is the first eigenvector of matrix $ {\mathbf{T}}_M^{-1}{\boldsymbol{\Psi}}_M $. If vector $ {\boldsymbol{\upbeta}}_{M_1} $ is multiplied by matrix F, we obtain $ {\boldsymbol{\upbeta}}_{C_1}={\mathbf{F}\boldsymbol{\upbeta}}_{M_1} $; in this case, we can replace $ {\boldsymbol{\upbeta}}_{M_1} $ with $ {\boldsymbol{\upbeta}}_{C_1}={\mathbf{F}\boldsymbol{\upbeta}}_{M_1} $ in Eqs. (8.12) and (8.13), and the optimized MESIM index should be written as $ {I}_M={\boldsymbol{\upbeta}}_{C_1}^{\prime}\mathbf{t} $.

8.1.2 Estimating MESIM Parameters

We estimate the MESIM parameters using the same procedure described in Chap. 7 (Sect. 7.1.4) to estimate the ESIM parameters. Let $ \widehat{\mathbf{C}} $, $ \widehat{\mathbf{P}} $, and $ {\widehat{\mathbf{S}}}_M $ be the estimates of the genotypic, phenotypic, and marker scores covariance matrices, $ {\widehat{\mathbf{T}}}_M=\left[\begin{array}{cc}\widehat{\mathbf{P}}& {\widehat{\mathbf{S}}}_M\\ {}{\widehat{\mathbf{S}}}_M& {\widehat{\mathbf{S}}}_M\end{array}\right] $ and $ {\widehat{\boldsymbol{\Psi}}}_M=\left[\begin{array}{cc}\widehat{\mathbf{C}}& {\widehat{\mathbf{S}}}_M\\ {}{\widehat{\mathbf{S}}}_M& {\widehat{\mathbf{S}}}_M\end{array}\right] $ the estimated block matrices (Chap. 4) and $ \widehat{\mathbf{W}}={\widehat{\mathbf{T}}}_M^{-1}{\widehat{\boldsymbol{\Psi}}}_M $; then, to find the estimators $ {\widehat{\boldsymbol{\upbeta}}}_{M_1} $ and $ {\widehat{\lambda}}_{M_1}^2 $ of the first eigenvector ($ {\boldsymbol{\upbeta}}_{M_1} $) and the first eigenvalue ($ {\lambda}_{M_1}^2 $) respectively, we need to solve the equation

$$ \left(\widehat{\mathbf{W}}{\widehat{\mathbf{W}}}^{\prime }-{\widehat{\mu}}_j\mathbf{I}\right){\widehat{\boldsymbol{\upbeta}}}_{M_j}=\mathbf{0}, $$

(8.14)

where $ {\widehat{\mu}}_j={\widehat{\lambda}}_{M_j}^4 $, j = 1, 2, …, 2t. For additional details, see Eqs. (7.22) and (7.23), and Sect. 7.1.5 of Chap. 7. The solution of Eq. (8.14) allows the MESIM index ($ {I}_M={\boldsymbol{\upbeta}}_{M_1}^{\prime}\mathbf{t} $) to be estimated as $ {\widehat{I}}_M={\widehat{{\boldsymbol{\upbeta}}^{\prime}}}_{M_1}\mathbf{t} $, whereas the estimators of the maximized MESIM selection response and of its expected genetic gain per trait can be denoted by

$$ {\widehat{R}}_M={k}_I\sqrt{{\widehat{{\boldsymbol{\upbeta}}^{\prime}}}_{M_1}{\widehat{\mathbf{T}}}_M{\widehat{\boldsymbol{\upbeta}}}_{M_1}}\kern0.5em \mathrm{and}\kern0.5em {\widehat{\mathbf{E}}}_M={k}_I\frac{{\widehat{\boldsymbol{\Psi}}}_M{\widehat{\boldsymbol{\upbeta}}}_{M_1}}{\sqrt{{\widehat{{\boldsymbol{\upbeta}}^{\prime}}}_{M_1}{\widehat{\mathbf{T}}}_M{\widehat{\boldsymbol{\upbeta}}}_{M_1}}}, $$

(8.15)

respectively.
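Equation (8.14) is a symmetric eigenproblem, so its leading eigenpair can be found numerically in several ways. The following is a minimal Python sketch that uses power iteration; this choice is an assumption of the example, not the procedure described in Chap. 7. The 2 × 2 matrix `W` is a hypothetical stand-in for $ \widehat{\mathbf{W}} $ (it is not derived from the chapter's data), and the function names are illustrative.

```python
import math

def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def transpose(a):
    return [list(col) for col in zip(*a)]

def leading_eigenpair(m, iters=500):
    """Power iteration for the dominant eigenpair of a symmetric PSD matrix m."""
    v = [1.0] * len(m)
    for _ in range(iters):
        mv = [sum(x * y for x, y in zip(row, v)) for row in m]
        norm = math.sqrt(sum(x * x for x in mv))
        v = [x / norm for x in mv]
    mv = [sum(x * y for x, y in zip(row, v)) for row in m]
    mu = sum(x * y for x, y in zip(v, mv))  # Rayleigh quotient (v has unit length)
    return mu, v

# Hypothetical stand-in for W-hat = T_M^{-1} Psi_M; NOT the chapter's data.
W = [[0.6, 0.2],
     [0.1, 0.5]]
WWt = matmul(W, transpose(W))            # the matrix W W' of Eq. (8.14)

mu1, beta1 = leading_eigenpair(WWt)
lambda_sq = math.sqrt(mu1)               # mu_j = lambda_j^4, so lambda_j^2 = sqrt(mu_j)
print(mu1, lambda_sq, beta1)
```

Power iteration returns only the first eigenpair, which is all that Eq. (8.15) needs; a full eigendecomposition or singular value decomposition routine would be the more usual choice when all 2t pairs are of interest.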

8.1.3 Numerical Examples

To validate the MESIM theoretical results, we use a real maize (Zea mays) F₂ population with 247 genotypes (each with two repetitions), 195 molecular markers, and two traits, plant height (PHT, cm) and ear height (EHT, cm), evaluated in one environment. We coded marker loci homozygous for the allele from the first parental line as 1, marker loci homozygous for the allele from the second parental line as −1, and heterozygous marker loci as 0. The estimated phenotypic, genetic, and marker score covariance matrices were $ \widehat{\mathbf{P}}=\left[\begin{array}{cc}191.81& 106.89\\ {}106.89& 167.93\end{array}\right] $, $ \widehat{\mathbf{C}}=\left[\begin{array}{cc}83.00& 57.44\\ {}57.44& 59.80\end{array}\right] $, and $ {\widehat{\mathbf{S}}}_M=\left[\begin{array}{cc}15.750& 0.983\\ {}0.983& 28.083\end{array}\right] $ respectively, and the vector of economic weights was $ {\mathbf{a}}^{\prime }=\left[{\mathbf{w}}^{\prime}\kern0.5em {\mathbf{0}}^{\prime}\right] $, where $ {\mathbf{w}}^{\prime }=\left[-1\kern0.5em -1\right] $ and $ {\mathbf{0}}^{\prime }=\left[0\kern0.5em 0\right] $. Details of how to estimate the marker scores and their variance were given in Chap. 4.

We compare LMSI versus MESIM efficiency. The estimated LMSI vector of coefficients was $ {\widehat{\boldsymbol{\upbeta}}}^{\prime }={\mathbf{a}}^{\prime }{\widehat{\boldsymbol{\Psi}}}_M{\widehat{\mathbf{T}}}_M^{-1}=\left[-0.59\kern0.5em -0.18\kern0.5em -0.41\kern0.5em -0.82\right] $. Using a 10% selection intensity (k_I = 1.755), the estimated LMSI selection response and the expected genetic gain per trait were $ \widehat{R}={k}_I\sqrt{{\widehat{{\boldsymbol{\upbeta}}^{\prime }}\widehat{\mathbf{T}}}_M\widehat{\boldsymbol{\upbeta}}}=20.41 $ and $ {\widehat{\mathbf{E}}}^{\prime }={k}_I\frac{{\widehat{{\boldsymbol{\upbeta}}^{\prime }}\widehat{\boldsymbol{\Psi}}}_M}{\sqrt{{\widehat{{\boldsymbol{\upbeta}}^{\prime }}\widehat{\mathbf{T}}}_M\widehat{\boldsymbol{\upbeta}}}}=\left[-10.09\kern0.5em -10.31\kern0.5em -2.53\kern0.5em -4.39\right] $ respectively, whereas the estimated LMSI accuracy was $ {\widehat{\rho}}_{H\widehat{I}}=\frac{{\widehat{\sigma}}_I}{{\widehat{\sigma}}_H}=0.72 $.

Vector $ {\widehat{\boldsymbol{\upbeta}}}_{M_1}^{\prime }=\left[0.089\kern0.5em -0.061\kern0.5em -0.536\kern0.5em 0.837\right] $ was the original estimated MESIM vector of coefficients. Using matrix $ \mathbf{F}=\left[\begin{array}{cccc}-0.1& 0& 0& 0\\ {}0& -0.1& 0& 0\\ {}0& 0& 0.75& 0\\ {}0& 0& 0& -0.75\end{array}\right] $, vector $ {\widehat{\boldsymbol{\upbeta}}}_{M_1}^{\prime } $ was transformed as $ {\widehat{\boldsymbol{\upbeta}}}_{C_1}^{\prime }={\widehat{\boldsymbol{\upbeta}}}_{M_1}^{\prime}\mathbf{F}=\left[-0.009\kern0.5em 0.006\kern0.5em -0.402\kern0.5em -0.628\right] $ and then the estimated MESIM index was $ {\widehat{I}}_M=-0.009\;\mathrm{PHT}+0.006\;\mathrm{EHT}-0.402\;{\mathrm{S}}_{\mathrm{PHT}}-0.628\;{\mathrm{S}}_{\mathrm{EHT}} $, where S_PHT and S_EHT denote the marker scores associated with PHT and EHT respectively. The estimated MESIM expected genetic gain, selection response, and accuracy were $ {\widehat{\mathbf{E}}}_M^{\prime }={k}_I\frac{{\widehat{{\boldsymbol{\upbeta}}^{\prime}}}_{C_1}{\widehat{\boldsymbol{\Psi}}}_M}{\sqrt{{\widehat{{\boldsymbol{\upbeta}}^{\prime}}}_{C_1}{\widehat{\mathbf{T}}}_M{\widehat{\boldsymbol{\upbeta}}}_{C_1}}}=\left[-3.438\kern0.5em -8.516\kern0.5em -3.319\kern0.5em -8.372\right] $, $ {\widehat{R}}_M={k}_I\sqrt{{\widehat{{\boldsymbol{\upbeta}}^{\prime}}}_{C_1}{\widehat{\mathbf{T}}}_M{\widehat{\boldsymbol{\upbeta}}}_{C_1}}=6.573 $ and $ {\widehat{\rho}}_{H_M{\widehat{I}}_M}=\frac{{\widehat{\sigma}}_{I_M}}{{\widehat{\sigma}}_{H_M}}=0.99 $ respectively.
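The selection response and expected genetic gains reported in this paragraph can be recomputed from the published matrices. The sketch below, in plain Python, assembles $ {\widehat{\mathbf{T}}}_M $ and $ {\widehat{\boldsymbol{\Psi}}}_M $ from the 2 × 2 blocks of this section, applies the diagonal of F to $ {\widehat{\boldsymbol{\upbeta}}}_{M_1} $, and evaluates Eqs. (8.12) and (8.13). The helper names `matvec` and `block` are illustrative, and the inputs are the rounded values printed in the text, so agreement is expected only to about two decimal places.

```python
import math

def matvec(m, v):
    """Matrix-vector product for a matrix stored as a list of rows."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]

def block(top_left, s):
    """Assemble the 2t x 2t matrix [[top_left, s], [s, s]] from t x t blocks."""
    return [r1 + r2 for r1, r2 in zip(top_left, s)] + [r + r for r in s]

P = [[191.81, 106.89], [106.89, 167.93]]   # phenotypic covariance matrix
C = [[83.00, 57.44], [57.44, 59.80]]       # genetic covariance matrix
S = [[15.750, 0.983], [0.983, 28.083]]     # marker score covariance matrix
T_M, Psi_M = block(P, S), block(C, S)

k_I = 1.755                                  # 10% selection intensity
beta_M1 = [0.089, -0.061, -0.536, 0.837]     # reported first eigenvector
F_diag = [-0.1, -0.1, 0.75, -0.75]           # diagonal of matrix F
beta_C1 = [f * b for f, b in zip(F_diag, beta_M1)]

sigma_I = math.sqrt(sum(b * x for b, x in zip(beta_C1, matvec(T_M, beta_C1))))
R_M = k_I * sigma_I                                         # Eq. (8.12)
E_M = [k_I * x / sigma_I for x in matvec(Psi_M, beta_C1)]   # Eq. (8.13)

print([round(b, 3) for b in beta_C1])
print(round(R_M, 2), [round(e, 2) for e in E_M])
```

With F as listed, the transformed coefficient of S_EHT is 0.837 × (−0.75), which is negative; the response and the four gains obtained with that sign agree with the reported 6.573 and [−3.438 −8.516 −3.319 −8.372] up to rounding of the inputs.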

The inner products of the estimated LMSI and MESIM vectors of coefficients were 1.221 and 0.556 respectively, whence the estimated LMSI selection response (20.41) divided by 1.221 was 16.716, and the estimated MESIM selection response (6.573) divided by 0.556 was 11.821. That is, the estimated LMSI selection response was higher than the estimated MESIM selection response for this data set. Similar results were found when we compared the estimated LMSI expected genetic gain per trait with the estimated MESIM expected genetic gain per trait. Finally, Fig. 8.1 presents the frequency distribution of the 247 estimated MESIM values for the real data set described earlier, which approaches the normal distribution, as we would expect.
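The normalization arithmetic in this paragraph can be retraced in a few lines. The sketch below takes each coefficient vector's inner product with itself, rounds it to three decimals as in the text, and divides the reported selection response by it. The variable names are illustrative, and the signs of the coefficients do not affect the inner products.

```python
beta_lmsi = [-0.59, -0.18, -0.41, -0.82]        # estimated LMSI coefficients
beta_mesim = [-0.009, 0.006, -0.402, -0.628]    # transformed MESIM coefficients
R_lmsi, R_mesim = 20.41, 6.573                  # reported selection responses

def inner(v):
    """Inner product of a vector with itself (sum of squared coefficients)."""
    return sum(x * x for x in v)

ip_lmsi = round(inner(beta_lmsi), 3)
ip_mesim = round(inner(beta_mesim), 3)

scaled_lmsi = R_lmsi / ip_lmsi
scaled_mesim = R_mesim / ip_mesim
print(ip_lmsi, ip_mesim)
print(round(scaled_lmsi, 3), round(scaled_mesim, 3))
```

The two scaled responses agree with the values quoted above to within rounding in the third decimal, and their ordering (LMSI above MESIM) is the comparison the paragraph draws.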

Now with a selection intensity of 10% (k_I = 1.755), we compare the LMSI and MESIM efficiency using the simulated data set described in Sect. 2.8.1 of Chap. 2 for four phenotypic selection cycles, each with four traits (T₁, T₂, T₃ and T₄), 500 genotypes, and four replicates of each genotype. The economic weights for T₁, T₂, T₃, and T₄ were 1, −1, 1, and 1 respectively. For this data set, we did not use the linear transformation $ {\widehat{\boldsymbol{\upbeta}}}_{C_1}=\mathbf{F}{\widehat{\boldsymbol{\upbeta}}}_{M_1} $.

Table 8.1 presents the estimated selection responses of the linear molecular, combined genomic, and genome-wide selection indices (LMSI, CLGSI, and GW-LMSI respectively; see Chaps. 4 and 5 for details) for four simulated selection cycles when their vectors of coefficients were normalized. The table also shows the selection responses of the estimated linear molecular, genomic, and genome-wide eigen selection index methods (MESIM, GESIM, and GW-ESIM respectively; details in Sect. 8.2) for the same four simulated selection cycles. The average of the estimated LMSI selection response was 2.22, whereas the average of the estimated MESIM selection response was 1.69. The estimated LMSI selection response was therefore higher than that of the MESIM.

Table 8.1 Estimated linear molecular, combined genomic, and genome-wide selection index (LMSI, CLGSI and GW-LMSI respectively) selection responses when their vectors of coefficients are normalized for four simulated selection cycles

Table 8.2 presents the estimated LMSI and MESIM expected genetic gains for four traits (T1, T2, T3, and T4) and their associated marker scores (S1, S2, S3, and S4) for four simulated selection cycles. The averages of the estimated LMSI expected genetic gains for the four traits and their associated marker scores were 12.74, −2.10, 1.60, 0.94, 5.70, −2.19, 0.71, and 0.64 respectively, whereas the averages of the estimated MESIM expected genetic gains for the four traits and their associated marker scores were 14.40, −0.38, −0.39, 0.34, 8.65, 0.47, −0.21, and −0.70 respectively. Except for trait T1 and its associated marker score, the estimated LMSI expected genetic gains per trait were higher than the estimated MESIM expected genetic gains. Thus, for this data set, LMSI efficiency was greater than MESIM efficiency.

Table 8.2 Estimated linear molecular selection index (LMSI) and estimated linear molecular eigen selection index method (MESIM) expected genetic gains for four traits (T1, T2, T3, and T4) and their associated marker scores (S1, S2, S3, and S4) for four simulated selection cycles. The selection intensity was 10% (k_I = 1.755)

Chapter 11 presents RIndSel, a user-friendly graphical user interface written in Java that is useful for estimating the LMSI and ESIM parameters and selecting parents for the next selection cycle.

8.2 The Linear Genomic Eigen Selection Index Method

The linear genomic eigen selection index method (GESIM) is based on the standard CLGSI described in Chap. 5, and uses genomic estimated breeding values (GEBVs) and phenotypic values jointly to predict the net genetic merit. Thus, conditions for constructing a valid GESIM are the same as those for constructing the CLGSI. Also, the MESIM theory described in Sect. 8.1 is directly applied to the GESIM and only minor changes are necessary in GESIM theory. For example, instead of marker scores, the GESIM uses GEBVs to predict the net genetic merit; thus, the details of the estimation process are the same as for the MESIM.

8.2.1 The GESIM Parameters

In the GESIM context, the net genetic merit can be written as

$$ H={\mathbf{w}}_1^{\prime}\mathbf{g}+{\mathbf{w}}_2^{\prime}\boldsymbol{\upgamma} =\left[{\mathbf{w}}_1^{\prime}\kern0.5em {\mathbf{w}}_2^{\prime}\right]\left[\begin{array}{c}\mathbf{g}\\ {}\boldsymbol{\upgamma} \end{array}\right]={\mathbf{w}}^{\prime}\boldsymbol{\upalpha}, $$

(8.16)

where $ {\mathbf{g}}^{\prime }=\left[{g}_1\kern0.5em \dots \kern0.5em {g}_t\right] $ is the vector of true breeding values, t is the number of traits, $ {\mathbf{w}}_1^{\prime }=\left[{w}_1\kern0.5em \cdots \kern0.5em {w}_t\right] $ is a vector of unknown economic weights associated with g, $ {\mathbf{w}}_2^{\prime }=\left[{0}_1\kern0.5em \cdots \kern0.5em {0}_t\right] $ is a null vector associated with the vector of genomic breeding values $ {\boldsymbol{\upgamma}}^{\prime }=\left[{\gamma}_1\kern0.5em {\gamma}_2\kern0.5em \dots \kern0.5em {\gamma}_t\right] $, $ {\mathbf{w}}^{\prime }=\left[{\mathbf{w}}_1^{\prime}\kern0.5em {\mathbf{w}}_2^{\prime}\right] $, and $ {\boldsymbol{\upalpha}}^{\prime }=\left[{\mathbf{g}}^{\prime}\kern0.5em {\boldsymbol{\upgamma}}^{\prime}\right] $. The estimator of γ is the GEBV (see Chap. 5 for additional details). The GESIM index can be written as

$$ I={\boldsymbol{\upbeta}}_{\mathbf{y}}^{\prime}\mathbf{y}+{\boldsymbol{\upbeta}}_{\boldsymbol{\upgamma}}^{\prime}\boldsymbol{\upgamma} =\left[{\boldsymbol{\upbeta}}_{\mathbf{y}}^{\prime}\kern0.5em {\boldsymbol{\upbeta}}_{\boldsymbol{\upgamma}}^{\prime}\right]\left[\begin{array}{c}\mathbf{y}\\ {}\boldsymbol{\upgamma} \end{array}\right]={\boldsymbol{\upbeta}}^{\prime}\mathbf{f}, $$

(8.17)

where $ {\mathbf{y}}^{\prime }=\left[{y}_1\kern0.5em \cdots \kern0.5em {y}_t\right] $ is the vector of phenotypic values; $ {\boldsymbol{\upbeta}}_{\mathbf{y}}^{\prime } $ and β_γ are the vectors of weights for the phenotypic values and the genomic breeding values respectively; $ {\boldsymbol{\upbeta}}^{\prime }=\left[{\boldsymbol{\upbeta}}_{\mathbf{y}}^{\prime}\kern0.5em {\boldsymbol{\upbeta}}_{\boldsymbol{\upgamma}}^{\prime}\right] $ and $ {\mathbf{f}}^{\prime }=\left[{\mathbf{y}}^{\prime}\kern0.5em {\boldsymbol{\upgamma}}^{\prime}\right] $.

Let $ Var(H)={\mathbf{w}}^{\prime}\mathbf{Aw}={\sigma}_H^2 $ be the variance of H = w′α, $ Var(I)={\boldsymbol{\upbeta}}^{\prime}\boldsymbol{\Phi} \boldsymbol{\upbeta} ={\sigma}_I^2 $ the variance of I = β′f, and Cov(H, I) = w′Aβ = σ_HI the covariance between H and I, where $ \mathbf{A}= Var\left[\begin{array}{c}\mathbf{g}\\ {}\boldsymbol{\upgamma} \end{array}\right]=\left[\begin{array}{cc}\mathbf{C}& \boldsymbol{\Gamma} \\ {}\boldsymbol{\Gamma} & \boldsymbol{\Gamma} \end{array}\right] $ and $ \boldsymbol{\Phi} = Var\left[\begin{array}{c}\mathbf{y}\\ {}\boldsymbol{\upgamma} \end{array}\right]=\left[\begin{array}{cc}\mathbf{P}& \boldsymbol{\Gamma} \\ {}\boldsymbol{\Gamma} & \boldsymbol{\Gamma} \end{array}\right] $ are block matrices 2t × 2t (t is the number of traits) of covariance matrices and P, Γ, and C are covariance matrices of phenotypic (y), genomic (γ), and genetic (g) values respectively. Then, $ {\rho}_{HI}=\frac{{\mathbf{w}}^{\prime}\mathbf{A}\boldsymbol{\upbeta }}{\sqrt{{\mathbf{w}}^{\prime}\mathbf{A}\mathbf{w}}\sqrt{{\boldsymbol{\upbeta}}^{\prime}\boldsymbol{\Phi} \boldsymbol{\upbeta}}} $ is the correlation between H = w′α and I = β′f and the GESIM selection response can be written as

$$ R={k}_I{\sigma}_H{\rho}_{HI}, $$

(8.18)

where k_I is the standardized selection differential (or selection intensity) associated with the GESIM and $ {\sigma}_H=\sqrt{{\mathbf{w}}^{\prime}\mathbf{Aw}} $ is the standard deviation of H. It is assumed that k_I is fixed, and that matrices Φ and A are known; then, as in the MESIM, we can maximize R by maximizing ρ_HI with respect to vectors w and β under the restrictions $ {\sigma}_H^2={\mathbf{w}}^{\prime}\mathbf{Aw} $, $ {\sigma}_I^2={\boldsymbol{\upbeta}}^{\prime}\boldsymbol{\Phi} \boldsymbol{\upbeta} $, and 0 < $ {\sigma}_H^2 $, $ {\sigma}_I^2 $ < ∞.

It can be shown that the vector w in the GESIM context is

$$ {\mathbf{w}}_G={\mathbf{A}}^{-1}\boldsymbol{\Phi} \boldsymbol{\upbeta} $$

(8.19)

and that the net genetic merit can be written as $ {H}_G={\mathbf{w}}_G^{\prime}\boldsymbol{\upalpha} $. The correlation between $ {H}_G={\mathbf{w}}_G^{\prime}\boldsymbol{\upalpha} $ and I = β′f is $ {\rho}_{H_GI}=\frac{\sqrt{{\boldsymbol{\upbeta}}^{\prime}\boldsymbol{\Phi} \boldsymbol{\upbeta}}}{\sqrt{{\boldsymbol{\upbeta}}^{\prime }{\boldsymbol{\Phi} \mathbf{A}}^{-1}\boldsymbol{\Phi} \boldsymbol{\upbeta}}} $ and the GESIM index vector of coefficients that maximizes $ {\rho}_{H_GI} $ can be obtained from the equation

$$ \left({\boldsymbol{\Phi}}^{-1}\mathbf{A}-{\lambda}_G^2{\mathbf{I}}_{2t}\right){\boldsymbol{\upbeta}}_G=\mathbf{0}, $$

(8.20)

where I_2t is an identity matrix of size 2t × 2t (t is the number of traits); the optimized GESIM index can be written as $ {I}_G={\boldsymbol{\upbeta}}_G^{\prime}\mathbf{f} $. By Eqs. (8.19) and (8.20), GESIM accuracy can be written as

$$ {\rho}_{H_G{I}_G}=\frac{\sigma_{I_G}}{\sigma_{H_G}}, $$

(8.21)

where $ {\sigma}_{I_G}=\sqrt{{\boldsymbol{\upbeta}}_G^{\prime }{\boldsymbol{\Phi} \boldsymbol{\upbeta}}_G} $ is the standard deviation of $ {I}_G={\boldsymbol{\upbeta}}_G^{\prime}\mathbf{f} $, and $ {\sigma}_{H_G}=\sqrt{{\boldsymbol{\upbeta}}_G^{\prime }{\boldsymbol{\Phi} \mathbf{A}}^{-1}{\boldsymbol{\Phi} \boldsymbol{\upbeta}}_G} $ is the standard deviation of $ {H}_G={\mathbf{w}}_G^{\prime}\boldsymbol{\upalpha} $. In Eq. (8.20), $ {\lambda}_G^2={\rho}_{H_G{I}_G}^2 $ is the square of the canonical correlation between H_G and I_G, and β_G is the canonical vector associated with $ {\lambda}_G^2={\rho}_{H_G{I}_G}^2 $.
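The identity $ {\lambda}_G^2={\rho}_{H_G{I}_G}^2 $ can be verified in two steps. As a sketch, assuming that A and Φ are invertible and that $ {\lambda}_G^2\ne 0 $, Eq. (8.20) gives $ \mathbf{A}{\boldsymbol{\upbeta}}_G={\lambda}_G^2\boldsymbol{\Phi} {\boldsymbol{\upbeta}}_G $, hence $ {\mathbf{A}}^{-1}\boldsymbol{\Phi} {\boldsymbol{\upbeta}}_G={\lambda}_G^{-2}{\boldsymbol{\upbeta}}_G $. Substituting this into the squared accuracy of Eq. (8.21) yields

$$ {\rho}_{H_G{I}_G}^2=\frac{{\boldsymbol{\upbeta}}_G^{\prime}\boldsymbol{\Phi} {\boldsymbol{\upbeta}}_G}{{\boldsymbol{\upbeta}}_G^{\prime}\boldsymbol{\Phi} {\mathbf{A}}^{-1}\boldsymbol{\Phi} {\boldsymbol{\upbeta}}_G}=\frac{{\boldsymbol{\upbeta}}_G^{\prime}\boldsymbol{\Phi} {\boldsymbol{\upbeta}}_G}{{\lambda}_G^{-2}{\boldsymbol{\upbeta}}_G^{\prime}\boldsymbol{\Phi} {\boldsymbol{\upbeta}}_G}={\lambda}_G^2. $$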

The maximized GESIM selection response and expected genetic gain per trait are

$$ {R}_G={k}_I\sqrt{{\boldsymbol{\upbeta}}_G^{\prime }{\boldsymbol{\Phi} \boldsymbol{\upbeta}}_G} $$

(8.22)

and

$$ {\mathbf{E}}_G={k}_I\frac{{\mathbf{A}\boldsymbol{\upbeta}}_G}{\sqrt{{\boldsymbol{\upbeta}}_G^{\prime }{\boldsymbol{\Phi} \boldsymbol{\upbeta}}_G}}, $$

(8.23)

respectively, where β_G is the first eigenvector of matrix Φ⁻¹A. Vector β_G can be transformed as β_CG = Fβ_G, where F is a diagonal matrix defined earlier.

8.2.2 Numerical Examples

To compare the CLGSI versus GESIM theoretical results, we use a real maize (Zea mays) F₂ population with 244 genotypes (each with two repetitions), 233 molecular markers, and three traits: grain yield (GY, ton ha⁻¹), ear height (EHT, cm), and plant height (PHT, cm). We estimated matrices P and C using Eqs. (2.22) to (2.24) described in Chap. 2, whence the estimated matrices were $ \widehat{\mathbf{P}}=\left[\begin{array}{lll}0.45& 1.33& 2.33\\ {}1.33& 65.07& 83.71\\ {}2.33& 83.71& 165.99\end{array}\right] $ and $ \widehat{\mathbf{C}}=\left[\begin{array}{lll}0.07& 0.61& 1.06\\ {}0.61& 17.93& 22.75\\ {}1.06& 22.75& 44.53\end{array}\right] $. In a similar manner, we estimated matrix Γ by applying Eqs. (5.21) to (5.23) described in Chap. 5 using phenotypic and marker information jointly; the estimated matrix was $ \widehat{\boldsymbol{\Gamma}}=\left[\begin{array}{lll}0.07& 0.65& 1.05\\ {}0.65& 10.62& 14.25\\ {}1.05& 14.25& 26.37\end{array}\right] $. The selection intensity for making a selection cycle was 10% (k_I = 1.755) and the vector of economic weights was $ {\mathbf{w}}^{\prime }=\left[5\kern0.5em -0.1\kern0.5em -0.1\kern0.5em 0\kern0.5em 0\kern0.5em 0\right] $. To obtain the estimated vectors of coefficients of the CLGSI ($ \widehat{\boldsymbol{\upbeta}}={\widehat{\boldsymbol{\Phi}}}^{-1}\widehat{\mathbf{A}}\mathbf{w} $) and the GESIM (Eq. 8.20), it is necessary to construct matrices $ \widehat{\mathbf{A}}=\left[\begin{array}{ll}\widehat{\mathbf{C}}& \widehat{\boldsymbol{\Gamma}}\\ {}\widehat{\boldsymbol{\Gamma}}& \widehat{\boldsymbol{\Gamma}}\end{array}\right] $ and $ \widehat{\boldsymbol{\Phi}}=\left[\begin{array}{ll}\widehat{\mathbf{P}}& \widehat{\boldsymbol{\Gamma}}\\ {}\widehat{\boldsymbol{\Gamma}}& \widehat{\boldsymbol{\Gamma}}\end{array}\right] $.

The estimated CLGSI vector of coefficients for the traits GY, EHT, and PHT and their associated GEBVs (GEBV_GY, GEBV_EHT, and GEBV_PHT respectively) was $ {\widehat{\boldsymbol{\upbeta}}}^{\prime }=\left[0.08\kern0.5em -0.02\kern0.5em -0.01\kern0.5em 4.92\kern0.5em -0.08\kern0.5em -0.09\right] $, whereas the estimated CLGSI selection response, accuracy, and expected genetic gain per trait were $ \widehat{R}={k}_I\sqrt{{\widehat{\boldsymbol{\upbeta}}}^{\prime}\widehat{\boldsymbol{\Phi}}\widehat{\boldsymbol{\upbeta}}}=1.54 $, $ {\widehat{\rho}}_{HI}=\frac{{\widehat{\sigma}}_I}{{\widehat{\sigma}}_H}=0.814 $, and $ {\widehat{\mathbf{E}}}^{\prime }={k}_I\frac{{\widehat{\boldsymbol{\upbeta}}}^{\prime}\widehat{\mathbf{A}}}{\sqrt{{\widehat{\boldsymbol{\upbeta}}}^{\prime}\widehat{\boldsymbol{\Phi}}\widehat{\boldsymbol{\upbeta}}}}=\left[0.36\kern0.5em 1.04\kern0.5em 1.70\kern0.5em 0.36\kern0.5em 1.53\kern0.5em 2.38\right] $ respectively. Finally, $ \widehat{I}=0.08\mathrm{GY}-0.02\mathrm{EHT}-0.01\mathrm{PHT}+4.92{\mathrm{GEBV}}_{\mathrm{GY}}-0.08{\mathrm{GEBV}}_{\mathrm{EHT}}-0.09{\mathrm{GEBV}}_{\mathrm{PHT}} $ was the estimated CLGSI.

The estimated GESIM vector of coefficients, selection response, accuracy, and expected genetic gain per trait were $ {\widehat{\boldsymbol{\upbeta}}}_{G_1}^{\prime }=\left[-0.207\kern0.5em 0.029\kern0.5em 0.041\kern0.5em 0.820\kern0.5em 0.337\kern0.5em 0.411\right] $, $ {\widehat{R}}_G={k}_I\sqrt{{\widehat{\boldsymbol{\upbeta}}}_{G_1}^{\prime}\widehat{\boldsymbol{\Phi}}{\widehat{\boldsymbol{\upbeta}}}_{G_1}}=6.288 $, $ {\widehat{\rho}}_{{\widehat{H}}_G{\widehat{I}}_G}=\frac{\sqrt{{\widehat{\boldsymbol{\upbeta}}}_{G_1}^{\prime}\widehat{\boldsymbol{\Phi}}{\widehat{\boldsymbol{\upbeta}}}_{G_1}}}{\sqrt{{\widehat{\boldsymbol{\upbeta}}}_{G_1}^{\prime}\widehat{\boldsymbol{\Phi}}{\widehat{\mathbf{A}}}^{-1}\widehat{\boldsymbol{\Phi}}{\widehat{\boldsymbol{\upbeta}}}_{G_1}}}=0.9056 $, and $ {\widehat{\mathbf{E}}}_G^{\prime }={k}_I\frac{{\widehat{\boldsymbol{\upbeta}}}_{G_1}^{\prime}\widehat{\mathbf{A}}}{\sqrt{{\widehat{\boldsymbol{\upbeta}}}_{G_1}^{\prime}\widehat{\boldsymbol{\Phi}}{\widehat{\boldsymbol{\upbeta}}}_{G_1}}}=\left[0.369\kern0.5em 5.528\kern0.5em 9.186\kern0.5em 0.370\kern0.5em 5.250\kern0.5em 8.702\right] $ respectively.

Fig. 8.2 presents the frequency distribution of the 244 estimated GESIM index values for one (Fig. 8.2a) and three traits (Fig. 8.2b) using the real data set described earlier. The frequency distribution of the estimated GESIM index values approaches the normal distribution for both indices.

Now, we compare the estimated CLGSI and GESIM selection response and expected genetic gain per trait using the simulated data set described in Sect. 2.8.1 of Chap. 2 for four phenotypic selection cycles, each with four traits (T₁, T₂, T₃ and T₄), 500 genotypes, and four replicates per genotype. The economic weights of T₁, T₂, T₃, and T₄ were 1, −1, 1, and 1 respectively and the selection intensity for both indices was 10% (k_I = 1.755). For this data set, matrix F was an identity matrix of size 8 × 8 in all four selection cycles.

For this data set, the averages of the estimated CLGSI and GESIM selection responses were 0.68 and 2.74 (Table 8.1) respectively. The estimated CLGSI selection response was lower than the estimated GESIM selection response. Table 8.3 presents the estimated CLGSI and GESIM expected genetic gain for four traits (T1, T2, T3, and T4) and their associated genomic estimated breeding values (GEBV1, GEBV2, GEBV3, and GEBV4) for four simulated selection cycles. The averages of the estimated CLGSI expected genetic gains for the four traits and their associated GEBVs were 7.45, −3.35, 2.68, 1.09, 7.13, −3.68, 3.13, and 2.69 respectively, whereas the averages of the estimated GESIM expected genetic gains for the four traits and their associated GEBVs were 8.18, −3.08, 2.27, 0.71, 7.46, −3.53, 2.86, and 2.39 respectively. The estimated CLGSI and GESIM expected genetic gains per trait were very similar.

Table 8.3 Estimated combined linear genomic selection index (CLGSI) and estimated GESIM expected genetic gains for four traits (T1, T2, T3, and T4) and their associated genomic estimated breeding values (GEBV1, GEBV2, GEBV3, and GEBV4) for four simulated selection cycles. The selection intensity was 10% (k_I = 1.755)

8.3 The Genome-Wide Linear Eigen Selection Index Method

The MESIM requires regressing phenotypic values on marker coded values to predict the marker score values for each individual candidate for selection, and then combining the marker scores with phenotypic information using the MESIM to obtain a final prediction of the net genetic merit. In addition, the GESIM requires fitting of a statistical model to estimate all available marker effects in the training population; these estimates are then used to obtain GEBVs, which are predictors of breeding values. Crossa and Cerón-Rojas (2011) extended the ESIM theory to a genome-wide linear molecular ESIM (GW-ESIM) similar to the GW-LMSI described in Chap. 4. The GW-LMSI and GW-ESIM are very similar and only minor changes are necessary in GW-ESIM; for example, instead of estimating the GW-LMSI vector of coefficients according to the LPSI method (Chap. 2), the GW-ESIM vector of coefficients is estimated according to the singular value decomposition (SVD) described in Chap. 7.

8.3.1 The GW-ESIM Parameters

In the GW-ESIM context, the net genetic merit can be written as

$$ H={\mathbf{w}}_1^{\prime}\mathbf{g}+{\mathbf{w}}_2^{\prime}\mathbf{m}=\left[{\mathbf{w}}_1^{\prime}\kern0.5em {\mathbf{w}}_2^{\prime}\right]\left[\begin{array}{l}\mathbf{g}\\ {}\mathbf{m}\end{array}\right]={\mathbf{w}}^{\prime}\mathbf{x}, $$

(8.24)

where $ {\mathbf{g}}^{\prime }=\left[{g}_1\kern0.5em \dots \kern0.5em {g}_t\right] $ is the vector of true breeding values, t is the number of traits, $ {\mathbf{w}}_1^{\prime }=\left[{w}_1\kern0.5em \cdots \kern0.5em {w}_t\right] $ is the vector of unknown economic weights associated with the breeding values; $ {\mathbf{w}}_2^{\prime }=\left[{0}_1\kern0.5em \cdots \kern0.5em {0}_N\right] $ is a null vector associated with the vector of marker code values $ {\mathbf{m}}^{\prime }=\left[{m}_1\kern0.5em \cdots \kern0.5em {m}_N\right] $, where m_j (j = 1, 2, …, N = number of markers) is the jth marker in the training population; $ {\mathbf{w}}^{\prime }=\left[{\mathbf{w}}_1^{\prime}\kern0.5em {\mathbf{w}}_2^{\prime}\right] $ and $ {\mathbf{x}}^{\prime }=\left[{\mathbf{g}}^{\prime}\kern0.5em {\mathbf{m}}^{\prime}\right] $. The GW-ESIM index (I) combines the phenotypic values and all the marker information of individuals to predict the values of Eq. (8.24) in each selection cycle and can be written as

$$ I={\boldsymbol{\upbeta}}_y^{\prime}\mathbf{y}+{\boldsymbol{\upbeta}}_m^{\prime}\mathbf{m}=\left[{\boldsymbol{\upbeta}}_y^{\prime}\kern0.5em {\boldsymbol{\upbeta}}_m^{\prime}\right]\left[\begin{array}{l}\mathbf{y}\\ {}\mathbf{m}\end{array}\right]={\boldsymbol{\upbeta}}^{\prime}\mathbf{q}, $$

(8.25)

where $ {\boldsymbol{\upbeta}}_y^{\prime } $ and β_m are vectors of phenotypic and marker weights respectively; $ {\mathbf{y}}^{\prime }=\left[{y}_1\kern0.5em \cdots \kern0.5em {y}_t\right] $ is the vector of phenotypic values; m was defined in Eq. (8.24); $ {\boldsymbol{\upbeta}}^{\prime }=\left[{\boldsymbol{\upbeta}}_y^{\prime}\kern0.5em {\boldsymbol{\upbeta}}_m^{\prime}\right] $ and $ {\mathbf{q}}^{\prime }=\left[{\mathbf{y}}^{\prime}\kern0.5em {\mathbf{m}}^{\prime}\right] $.

Let $ {\sigma}_I^2={\boldsymbol{\upbeta}}^{\prime}\mathbf{Q}\boldsymbol{\upbeta } $ and $ {\sigma}_H^2={\mathbf{w}}^{\prime}\mathbf{Xw} $ be the variances of I = β′q and H = w′x respectively, and σ_HI = w′Xβ the covariance between I and H, where $ \mathbf{Q}= Var\left[\begin{array}{l}\mathbf{y}\\ {}\mathbf{m}\end{array}\right]=\left[\begin{array}{cc}\mathbf{P}& {\mathbf{G}}_M^{\prime}\\ {}{\mathbf{G}}_M& \mathbf{M}\end{array}\right] $ and $ \mathbf{X}= Var\left[\begin{array}{l}\mathbf{g}\\ {}\mathbf{m}\end{array}\right]=\left[\begin{array}{cc}\mathbf{C}& {\mathbf{G}}_M^{\prime}\\ {}{\mathbf{G}}_M& \mathbf{M}\end{array}\right] $ are block matrices of size (t + N) × (t + N) (t is the number of traits and N is the number of markers). Here P = Var(y), M = Var(m), and C = Var(g) are the covariance matrices of the phenotypic (y), coded marker (m), and genetic (g) values respectively, whereas G_M = cov (y, m) = cov (g, m) is the covariance matrix between y and m, and between g and m (for details see Chap. 4); w and β were defined earlier. Note that although matrices P and C are of size t × t, matrices M and G_M are of size N × N and N × t respectively. Thus, if the number of markers is very high, matrices M and G_M will also be very large.

In Chap. 4 we described matrix M as

$$ \mathbf{M}=\left[\begin{array}{cccc}1& \left(1-2{\theta}_{11}\right)& \dots & \left(1-2{\theta}_{1N}\right)\\ {}\left(1-2{\theta}_{21}\right)& 1& \dots & \left(1-2{\theta}_{2N}\right)\\ {}\vdots & \vdots & \ddots & \vdots \\ {}\left(1-2{\theta}_{N1}\right)& \left(1-2{\theta}_{N2}\right)& \dots & 1\end{array}\right], $$

(8.26)

where (1 − 2θ_ij) and θ_ij (i, j= 1, 2, …, N= number of markers) are the covariance (or correlation) and the recombination frequency between the ith and jth marker respectively, whereas matrix G_M can be written as

$$ {\mathbf{G}}_M=\left[\begin{array}{cccc}\left(1-2{r}_{11}\right){\alpha}_{11}& \left(1-2{r}_{11}\right){\alpha}_{12}& \dots & \left(1-2{r}_{1N}\right){\alpha}_{1{N}_Q}\\ {}\left(1-2{r}_{21}\right){\alpha}_{21}& \left(1-2{r}_{22}\right){\alpha}_{22}& \dots & \left(1-2{r}_{2N}\right){\alpha}_{2{N}_Q}\\ {}\vdots & \vdots & \ddots & \vdots \\ {}\left(1-2{r}_{t1}\right){\alpha}_{t1}& \left(1-2{r}_{N2}\right){\alpha}_{t2}& \dots & \left(1-2{r}_{NN}\right){\alpha}_{tN_Q}\end{array}\right], $$

(8.27)

where (1 − 2r_ik)α_qk (i= 1, 2, …, N, k= 1, 2, …, N_Q = number of quantitative trait loci (QTL), q = 1, 2, …, t) is the covariance between the qth trait and the ith marker; r_ik is the recombination frequency between the ith marker and the kth QTL, and α_qk is the effect of the kth QTL on the qth trait.
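As an illustration of Eq. (8.26), the sketch below builds M from a matrix of pairwise recombination frequencies. The marker positions and the use of Haldane's mapping function to turn map distances into θ_ij are assumptions made only for this example; the chapter does not prescribe how θ_ij is obtained. The function name `marker_covariance` is hypothetical.

```python
import numpy as np

def marker_covariance(theta):
    """Build M of Eq. (8.26): off-diagonal entries 1 - 2*theta_ij, diagonal 1."""
    theta = np.asarray(theta, dtype=float)
    M = 1.0 - 2.0 * theta
    np.fill_diagonal(M, 1.0)      # a marker with itself
    return M

# Hypothetical map positions (in Morgans) of four markers on one chromosome.
positions = np.array([0.00, 0.10, 0.25, 0.60])
distance = np.abs(positions[:, None] - positions[None, :])

# Assumption: Haldane's mapping function, theta = (1 - exp(-2 d)) / 2.
theta = 0.5 * (1.0 - np.exp(-2.0 * distance))

M = marker_covariance(theta)
print(np.round(M, 3))
```

Under this assumption the entries reduce to exp(−2d_ij), so closely linked markers have covariances near 1 and distant markers have covariances near 0, which is why M grows dense and large as the number of markers increases.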

Let $ {\rho}_{HI}=\frac{{\mathbf{w}}^{\prime}\mathbf{X}\boldsymbol{\upbeta }}{\sqrt{{\mathbf{w}}^{\prime}\mathbf{X}\mathbf{w}}\sqrt{{\boldsymbol{\upbeta}}^{\prime}\mathbf{Q}\boldsymbol{\upbeta }}} $ be the correlation between I = β′q and H = w′x; then, the GW-ESIM selection response can be written as

$$ R={k}_I{\sigma}_H{\rho}_{HI}, $$

(8.28)

where k_I is the standardized selection differential (or selection intensity) associated with GW-ESIM and $ {\sigma}_H=\sqrt{{\mathbf{w}}^{\prime}\mathbf{Xw}} $ is the standard deviation of H.

Assuming that k_I is fixed, and that matrices Q and X are known, we can maximize R (Eq. 8.28) by maximizing ρ_HI with respect to vectors w′ and β under the restrictions $ {\sigma}_H^2={\mathbf{w}}^{\prime}\mathbf{Xw} $, $ {\sigma}_I^2={\boldsymbol{\upbeta}}^{\prime}\mathbf{Q}\boldsymbol{\upbeta } $, and 0 < $ {\sigma}_H^2 $,$ {\sigma}_I^2 $ < ∞, similar to the MESIM and GESIM. It can be shown that vector w can be written as

$$ {\mathbf{w}}_W={\mathbf{X}}^{-1}\mathbf{Q}\boldsymbol{\upbeta } $$

(8.29)

and that $ {H}_W={\mathbf{w}}_W^{\prime}\mathbf{x} $ is the net genetic merit in the GW-ESIM context. The correlation between $ {H}_W={\mathbf{w}}_W^{\prime}\mathbf{x} $ and I = β′q is $ {\rho}_{H_WI}=\frac{\sqrt{{\boldsymbol{\upbeta}}^{\prime}\mathbf{Q}\boldsymbol{\upbeta }}}{\sqrt{{\boldsymbol{\upbeta}}^{\prime }{\mathbf{QX}}^{-1}\mathbf{Q}\boldsymbol{\upbeta }}} $ and the GW-ESIM vector of coefficients (β) that maximizes $ {\rho}_{H_WI} $ can be obtained from equation

$$ \left({\mathbf{Q}}^{-1}\mathbf{X}-{\lambda}_W^2{\mathbf{I}}_{\left(t+N\right)}\right){\boldsymbol{\upbeta}}_W=\mathbf{0}, $$

(8.30)

where I_(t + N) is an identity matrix of size (t + N) × (t + N) and $ {I}_W={\boldsymbol{\upbeta}}_W^{\prime}\mathbf{q} $ is the optimized GW-ESIM. The accuracy of the GW-ESIM can be written as

$$ {\rho}_{H_W{I}_W}=\frac{\sqrt{{\boldsymbol{\upbeta}}_W^{\prime }{\mathbf{Q}\boldsymbol{\upbeta}}_W}}{\sqrt{{\boldsymbol{\upbeta}}_W^{\prime }{\mathbf{Q}\mathbf{X}}^{-1}{\mathbf{Q}\boldsymbol{\upbeta}}_W}}=\frac{\sigma_{I_W}}{\sigma_{H_W}}, $$

(8.31)

where $ {\sigma}_{I_W}=\sqrt{{\boldsymbol{\upbeta}}_W^{\prime }{\mathbf{Q}\boldsymbol{\upbeta}}_W} $ is the standard deviation of $ {I}_W={\boldsymbol{\upbeta}}_W^{\prime}\mathbf{q} $, and $ {\sigma}_{H_W}=\sqrt{{\boldsymbol{\upbeta}}_W^{\prime }{\mathbf{Q}\mathbf{X}}^{-1}{\mathbf{Q}\boldsymbol{\upbeta}}_W} $ is the standard deviation of $ {H}_W={\mathbf{w}}_W^{\prime}\mathbf{x} $. In Eq. (8.30) $ {\lambda}_W^2={\rho}_{H_W{I}_W}^2 $ is the square of the canonical correlation between H_W and I_W.
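The identity between the eigenvalue of Eq. (8.30) and the squared accuracy of Eq. (8.31) can be verified in one line. A short sketch, assuming that Q and X are nonsingular, that $ {\lambda}_W^2\ne 0 $, and reading Eq. (8.30) with X = Var[g′ m′]′ as defined above:

$$ {\mathbf{Q}}^{-1}\mathbf{X}{\boldsymbol{\upbeta}}_W={\lambda}_W^2{\boldsymbol{\upbeta}}_W\kern0.5em \Rightarrow \kern0.5em {\mathbf{X}}^{-1}\mathbf{Q}{\boldsymbol{\upbeta}}_W={\lambda}_W^{-2}{\boldsymbol{\upbeta}}_W\kern0.5em \Rightarrow \kern0.5em {\boldsymbol{\upbeta}}_W^{\prime}\mathbf{Q}{\mathbf{X}}^{-1}\mathbf{Q}{\boldsymbol{\upbeta}}_W={\lambda}_W^{-2}{\boldsymbol{\upbeta}}_W^{\prime}\mathbf{Q}{\boldsymbol{\upbeta}}_W, $$

so that $ {\rho}_{H_W{I}_W}^2=\frac{{\boldsymbol{\upbeta}}_W^{\prime}\mathbf{Q}{\boldsymbol{\upbeta}}_W}{{\boldsymbol{\upbeta}}_W^{\prime}\mathbf{Q}{\mathbf{X}}^{-1}\mathbf{Q}{\boldsymbol{\upbeta}}_W}={\lambda}_W^2 $.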

The maximized GW-ESIM selection response and expected genetic gain per trait are

$$ {R}_W={k}_I\sqrt{{\boldsymbol{\upbeta}}_W^{\prime }{\mathbf{Q}\boldsymbol{\upbeta}}_W} $$

(8.32)

and

$$ {\mathbf{E}}_W={k}_I\frac{\mathbf{X}{\boldsymbol{\upbeta}}_W}{\sqrt{{\boldsymbol{\upbeta}}_W^{\prime}\mathbf{Q}{\boldsymbol{\upbeta}}_W}}, $$

(8.33)

respectively, where β_W is the first eigenvector of Eq. (8.30).

8.3.2 Estimating GW-ESIM Parameters

In Chap. 2, Eqs. (2.22) to (2.24), we described the restricted maximum likelihood methods to estimate matrices C and P, which can be denoted by $ \widehat{\mathbf{C}} $ and $ \widehat{\mathbf{P}} $. In Chap. 4, we described how to estimate matrices M and G_M, which can be denoted by $ \widehat{\mathbf{M}} $ and $ {\widehat{\mathbf{G}}}_M $. With these estimates, we constructed the block estimated matrices as $ \widehat{\mathbf{Q}}=\left[\begin{array}{cc}\widehat{\mathbf{P}}& {\widehat{\mathbf{G}}}_M^{\prime}\\ {}{\widehat{\mathbf{G}}}_M& \widehat{\mathbf{M}}\end{array}\right] $ and $ \widehat{\mathbf{X}}=\left[\begin{array}{cc}\widehat{\mathbf{C}}& {\widehat{\mathbf{G}}}_M^{\prime}\\ {}{\widehat{\mathbf{G}}}_M& \widehat{\mathbf{M}}\end{array}\right] $, whence we obtained the equation

$$ \left({\widehat{\mathbf{Q}}}^{-}\widehat{\mathbf{X}}-{\widehat{\lambda}}_{Wj}^2\mathbf{I}\right){\widehat{\boldsymbol{\upbeta}}}_{Wj}=\mathbf{0}, $$

(8.34)

j = 1, 2, …, (t + N), where (t + N) is the number of traits and markers in the GW-ESIM index. Similar to the MESIM, we obtained estimators $ {\widehat{\boldsymbol{\upbeta}}}_{W_1} $ and $ {\widehat{\lambda}}_{W_1}^2 $ of the first eigenvector $ {\boldsymbol{\upbeta}}_{W_1} $ and the first eigenvalue $ {\lambda}_{W_1}^2 $ respectively, from equation

$$ \left(\widehat{\mathbf{E}}{\widehat{\mathbf{E}}}^{\prime }-{\widehat{\mu}}_j\mathbf{I}\right){\widehat{\boldsymbol{\upbeta}}}_{W_j}=\mathbf{0}, $$

(8.35)

where $ \widehat{\mathbf{E}}={\widehat{\mathbf{Q}}}^{-}\widehat{\mathbf{X}} $ and $ {\widehat{\mu}}_j={\widehat{\lambda}}_{W_j}^4 $. These results allow the GW-ESIM index, its selection response, and its expected genetic gain per trait to be estimated as $ {\widehat{I}}_W={\widehat{\boldsymbol{\upbeta}}}_{W_1}^{\prime}\widehat{\mathbf{q}} $, $ {\widehat{R}}_W={k}_I\sqrt{{\widehat{\boldsymbol{\upbeta}}}_{W_1}^{\prime}\widehat{\mathbf{Q}}{\widehat{\boldsymbol{\upbeta}}}_{W_1}} $ and $ {\widehat{\mathbf{E}}}_W={k}_I\frac{\widehat{\mathbf{X}}{\widehat{\boldsymbol{\upbeta}}}_{W_1}}{\sqrt{{\widehat{\boldsymbol{\upbeta}}}_{W_1}^{\prime}\widehat{\mathbf{Q}}{\widehat{\boldsymbol{\upbeta}}}_{W_1}}} $ respectively, whereas the estimator of GW-ESIM accuracy is $ {\widehat{\lambda}}_{W_1} $.
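The estimation steps of Eqs. (8.34) and (8.35) can be sketched with a singular value decomposition. In the Python sketch below the toy matrices (t = 2 traits, N = 3 markers) are hypothetical values chosen only so that the code runs, the Moore-Penrose pseudoinverse is used as one possible choice for the generalized inverse $ {\widehat{\mathbf{Q}}}^{-} $, and the function name `gw_esim_estimates` is illustrative.

```python
import numpy as np

def gw_esim_estimates(Q_hat, X_hat, k_I=1.755):
    """SVD route of Eqs. (8.34)-(8.35): returns beta_1, mu_1, lambda_1, R, E."""
    E = np.linalg.pinv(Q_hat) @ X_hat        # E = Q^- X (pseudoinverse as Q^-)
    U, s, _ = np.linalg.svd(E)               # E E' = U diag(s**2) U'
    beta = U[:, 0]                           # eigenvector of E E' with the largest mu
    mu = s[0] ** 2                           # mu_1 = lambda_1 ** 4
    lam = mu ** 0.25                         # estimator of the accuracy, lambda_1
    sigma_I = np.sqrt(beta @ Q_hat @ beta)   # standard deviation of the index
    response = k_I * sigma_I
    gains = k_I * (X_hat @ beta) / sigma_I
    return beta, mu, lam, response, gains

# Hypothetical toy inputs (not from the chapter).
P = np.array([[4.0, 1.0], [1.0, 3.0]])
C = np.array([[2.0, 0.5], [0.5, 1.5]])
M = np.array([[1.0, 0.6, 0.36], [0.6, 1.0, 0.6], [0.36, 0.6, 1.0]])
G_M = np.array([[0.3, 0.1], [0.2, 0.2], [0.1, 0.3]])    # N x t

Q_hat = np.block([[P, G_M.T], [G_M, M]])
X_hat = np.block([[C, G_M.T], [G_M, M]])

beta_W, mu_1, lam_1, R_W, E_W = gw_esim_estimates(Q_hat, X_hat)
print(np.round(beta_W, 3))
```

The left singular vectors of $ \widehat{\mathbf{E}} $ are exactly the eigenvectors of $ \widehat{\mathbf{E}}{\widehat{\mathbf{E}}}^{\prime } $, which is why a single SVD call covers Eq. (8.35). The singular values of $ \widehat{\mathbf{E}} $ coincide with the absolute eigenvalues of Eq. (8.34) when $ \widehat{\mathbf{E}} $ is symmetric (more generally, normal); for other estimates the two routes can give different numbers.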

8.3.3 Numerical Examples

We compare the estimated GW-LMSI and GW-ESIM selection responses using the simulated data set described in Sect. 2.8.1 of Chap. 2, with a selection intensity of 10% (k_I = 1.755). Table 8.1 presents the estimated GW-LMSI selection response for four simulated selection cycles when their vectors of coefficients are normalized, whence it can be seen that the average estimated GW-LMSI selection response was 0.87. Table 8.1 also presents the estimated GW-ESIM selection response for four simulated selection cycles; the average of the estimated GW-ESIM selection responses was 0.93. Thus, for this data set, the estimated GW-LMSI and GW-ESIM selection responses were very similar.

8.4 The Restricted Linear Genomic Eigen Selection Index Method

The restricted linear genomic eigen selection index method (RGESIM) is based on the restricted linear phenotypic ESIM (RESIM) theory described in Chap. 7. In the RESIM, the breeder’s objective is to improve only (t − r) of t (r < t) traits, leaving r of them fixed. The same is true for RGESIM, but in this case, we should impose 2r restrictions, i.e., we need to fix r traits and their associated r GEBVs to obtain results similar to those obtained with the RESIM (see Chap. 7 for details). This is the main difference between the RGESIM and the RESIM.

It can be shown that Cov(I, α) = Aβ is the covariance between the breeding value vector (α′ = [g′ γ′]) and the GESIM index (I = β′f). In the RGESIM, we want some covariances between the linear combinations of α ($ {\mathbf{U}}_G^{\prime}\boldsymbol{\upalpha} $) and I = β′f to be zero, i.e., $ Cov\left(I,{\mathbf{U}}_G^{\prime}\boldsymbol{\upalpha} \right)={\mathbf{U}}_G^{\prime}\mathbf{A}\boldsymbol{\upbeta } =\mathbf{0} $, where $ {\mathbf{U}}_G^{\prime } $ is a matrix of 1s and 0s with 2r rows and 2t columns, and hence at most 2(t − 1) rows because at least one trait remains unrestricted (1 indicates that the trait and its associated GEBV are restricted, and 0 indicates that the trait and its GEBV have no restrictions). We can solve this problem by maximizing $ \frac{{\boldsymbol{\upbeta}}^{\prime}\mathbf{A}\boldsymbol{\upbeta }}{\sqrt{{\boldsymbol{\upbeta}}^{\prime}\boldsymbol{\Phi} \boldsymbol{\upbeta}}} $ with respect to vector β under the restrictions $ {\mathbf{U}}_G^{\prime}\mathbf{A}\boldsymbol{\upbeta } =\mathbf{0} $ and β′β = 1, similar to the RESIM, or by maximizing the correlation between H = w′α and I = β′f, $ {\rho}_{HI}=\frac{{\mathbf{w}}^{\prime}\mathbf{A}\boldsymbol{\upbeta }}{\sqrt{{\mathbf{w}}^{\prime}\mathbf{A}\mathbf{w}}\sqrt{{\boldsymbol{\upbeta}}^{\prime}\boldsymbol{\Phi} \boldsymbol{\upbeta}}} $, with respect to vectors w and β under the restrictions $ {\mathbf{U}}_G^{\prime}\mathbf{A}\boldsymbol{\upbeta } =\mathbf{0} $, $ {\sigma}_H^2={\mathbf{w}}^{\prime}\mathbf{Aw} $, $ {\sigma}_I^2={\boldsymbol{\upbeta}}^{\prime}\boldsymbol{\Phi} \boldsymbol{\upbeta} $ and 0 < $ {\sigma}_H^2 $, $ {\sigma}_I^2 $ < ∞, as we did for the GESIM.

8.4.1 The RGESIM Parameters

To obtain the RGESIM vector of coefficients, we maximize the function

$$ f\left(\boldsymbol{\upbeta}, {\mathbf{v}}^{\prime}\right)=\frac{{\boldsymbol{\upbeta}}^{\prime}\mathbf{A}\boldsymbol{\upbeta }}{\sqrt{{\boldsymbol{\upbeta}}^{\prime}\boldsymbol{\Phi} \boldsymbol{\upbeta}}}-{\mathbf{v}}^{\prime}{\mathbf{U}}_G^{\prime}\mathbf{A}\boldsymbol{\upbeta } $$

(8.36)

with respect to β and v′, where v′ = [v₁ v₂ ⋯ v_2r] is a vector of 2r Lagrange multipliers, one for each restriction. The derivatives of function f(β, v′) with respect to β and v′ can be written as

$$ 2{\left({\boldsymbol{\upbeta}}^{\prime}\boldsymbol{\Phi} \boldsymbol{\upbeta} \right)}^{1/2}\mathbf{A}\boldsymbol{\upbeta } -{\left({\boldsymbol{\upbeta}}^{\prime}\boldsymbol{\Phi} \boldsymbol{\upbeta} \right)}^{-1/2}\left({\boldsymbol{\upbeta}}^{\prime}\mathbf{A}\boldsymbol{\upbeta } \right)\boldsymbol{\Phi} \boldsymbol{\upbeta} -{\mathbf{AU}}_G\mathbf{v}=\mathbf{0}, $$

(8.37)

$$ {\mathbf{U}}_G^{\prime}\mathbf{A}\boldsymbol{\upbeta } =\mathbf{0}, $$

(8.38)

respectively, where Eq. (8.38) denotes the restriction imposed for maximizing Eq. (8.36). Using algebraic methods on Eq. (8.37), we get

$$ \left({\mathbf{K}}_{RG}{\boldsymbol{\Phi}}^{-1}\mathbf{A}-{\lambda}_{RG}^2{\mathbf{I}}_{2t}\right){\boldsymbol{\upbeta}}_{RG}=\mathbf{0}, $$

(8.39)

where $ {\lambda}_{RG}^2={h}_{I_{RG}}^2 $, $ {h}_{I_{RG}}^2 $ is the RGESIM heritability obtained under the restriction $ {\mathbf{U}}_G^{\prime}\mathbf{A}\boldsymbol{\upbeta } =\mathbf{0} $; K_RG = [I_2t − Q_RG], I_2t is an identity matrix of size 2t × 2t, and $ {\mathbf{Q}}_{RG}={\boldsymbol{\Phi}}^{-1}{\mathbf{A}\mathbf{U}}_G{\left({\mathbf{U}}_G^{\prime }{\mathbf{A}\boldsymbol{\Phi}}^{-1}{\mathbf{A}\mathbf{U}}_G\right)}^{-1}{\mathbf{U}}_G^{\prime}\mathbf{A} $. When $ {\mathbf{U}}_G^{\prime } $ is a null matrix, $ {\boldsymbol{\upbeta}}_{RG}^{\prime }={\boldsymbol{\upbeta}}_G^{\prime } $ (the vector of the GESIM coefficients); thus, the RGESIM is more general than the GESIM and includes the GESIM as a particular case. The RGESIM index $ {I}_{RG}={\boldsymbol{\upbeta}}_{RG}^{\prime}\mathbf{f} $ and its selection response and expected genetic gain per trait use the first eigenvector of matrix K_RGΦ⁻¹A. It can be shown that the vector of coefficients of $ H={\mathbf{w}}_{RG}^{\prime}\boldsymbol{\upalpha} $ in the RGESIM can be written as

$$ {\mathbf{w}}_{RG}={\mathbf{A}}^{-1}\left[\boldsymbol{\Phi} +{\mathbf{Q}}_{RG}^{\prime}\mathbf{A}\right]\;{\boldsymbol{\upbeta}}_{RG}, $$

(8.40)

where $ {\mathbf{Q}}_{RG}^{\prime }={\mathbf{A}\mathbf{U}}_G{\left({\mathbf{U}}_G^{\prime }{\mathbf{A}\boldsymbol{\Phi}}^{-1}{\mathbf{A}\mathbf{U}}_G\right)}^{-1}{\mathbf{U}}_G^{\prime }{\mathbf{A}\boldsymbol{\Phi}}^{-1} $.
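One way to see that the eigenvectors of K_RGΦ⁻¹A respect the restriction is to premultiply that matrix by $ {\mathbf{U}}_G^{\prime}\mathbf{A} $. A short check, using only the definitions of K_RG and Q_RG given above and assuming that $ {\mathbf{U}}_G^{\prime }{\mathbf{A}\boldsymbol{\Phi}}^{-1}{\mathbf{A}\mathbf{U}}_G $ is nonsingular:

$$ {\mathbf{U}}_G^{\prime}\mathbf{A}{\mathbf{K}}_{RG}{\boldsymbol{\Phi}}^{-1}\mathbf{A}={\mathbf{U}}_G^{\prime}\mathbf{A}{\boldsymbol{\Phi}}^{-1}\mathbf{A}-\left({\mathbf{U}}_G^{\prime}\mathbf{A}{\boldsymbol{\Phi}}^{-1}\mathbf{A}{\mathbf{U}}_G\right){\left({\mathbf{U}}_G^{\prime}\mathbf{A}{\boldsymbol{\Phi}}^{-1}\mathbf{A}{\mathbf{U}}_G\right)}^{-1}{\mathbf{U}}_G^{\prime}\mathbf{A}{\boldsymbol{\Phi}}^{-1}\mathbf{A}=\mathbf{0}. $$

Hence, for any solution of Eq. (8.39) with $ {\lambda}_{RG}^2\ne 0 $, $ {\lambda}_{RG}^2{\mathbf{U}}_G^{\prime}\mathbf{A}{\boldsymbol{\upbeta}}_{RG}={\mathbf{U}}_G^{\prime}\mathbf{A}{\mathbf{K}}_{RG}{\boldsymbol{\Phi}}^{-1}\mathbf{A}{\boldsymbol{\upbeta}}_{RG}=\mathbf{0} $, so that $ {\mathbf{U}}_G^{\prime}\mathbf{A}{\boldsymbol{\upbeta}}_{RG}=\mathbf{0} $ holds automatically.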

Note that the restriction $ {\mathbf{U}}_G^{\prime}\mathbf{A}\boldsymbol{\upbeta } =\mathbf{0} $ can be written as β′AU_G = 0; this means that β′Q′_RG = 0 and that the covariance between $ {H}_{RG}={\mathbf{w}}_{RG}^{\prime}\boldsymbol{\upalpha} $ and $ {I}_{RG}={\boldsymbol{\upbeta}}_{RG}^{\prime}\mathbf{f} $ ($ {\sigma}_{H_{RG}{I}_{RG}} $) can be written as

$$ {\sigma}_{H_{RG}{I}_{RG}}={\mathbf{w}}_{RG}^{\prime}\mathbf{A}{\boldsymbol{\upbeta}}_{RG}={\boldsymbol{\upbeta}}_{RG}^{\prime}\boldsymbol{\Phi}{\boldsymbol{\upbeta}}_{RG}+{\boldsymbol{\upbeta}}_{RG}^{\prime}{\mathbf{Q}}_{RG}^{\prime}\mathbf{A}{\boldsymbol{\upbeta}}_{RG}={\boldsymbol{\upbeta}}_{RG}^{\prime}\boldsymbol{\Phi}{\boldsymbol{\upbeta}}_{RG}. $$

(8.41)

Equation (8.41) indicates that $ {\sigma}_{H_{RG}{I}_{RG}} $ is equal to the variance of $ {I}_{RG}={\boldsymbol{\upbeta}}_{RG}^{\prime}\mathbf{f} $ ($ {\sigma}_{I_{RG}}^2={\boldsymbol{\upbeta}}_{RG}^{\prime }{\boldsymbol{\Phi} \boldsymbol{\upbeta}}_{RG} $); therefore, the maximized correlation between I_RG and H_RG or RGESIM accuracy can be written as

$$ {\rho}_{H_{RG}{I}_{RG}}=\frac{\sqrt{{\boldsymbol{\upbeta}}_{RG}^{\prime }{\boldsymbol{\Phi} \boldsymbol{\upbeta}}_{RG}}}{\sqrt{{\mathbf{w}}_{RG}^{\prime }{\mathbf{Aw}}_{RG}}}, $$

(8.42)

where $ {\mathbf{w}}_{RG}^{\prime }{\mathbf{Aw}}_{RG} $ is the variance of H_RG. Hereafter, to simplify the notation, we write Eq. (8.42) as λ_RG.

The maximized selection response and the expected genetic gain per trait of the RGESIM are

$$ {R}_{RG}={k}_I\sqrt{{\boldsymbol{\upbeta}}_{RG}^{\prime }{\boldsymbol{\Phi} \boldsymbol{\upbeta}}_{RG}} $$

(8.43)

and

$$ {\mathbf{E}}_{RG}={k}_I\frac{{\mathbf{A}\boldsymbol{\upbeta}}_{RG}}{\sqrt{{\boldsymbol{\upbeta}}_{RG}^{\prime }{\boldsymbol{\Phi} \boldsymbol{\upbeta}}_{RG}}}, $$

(8.44)

respectively, where β_RG is the first eigenvector of matrix K_RGΦ⁻¹A.

8.4.2 Estimating RGESIM Parameters

In Sect. 8.2, we indicated how to estimate matrices P, Γ, and C using phenotypic and genomic information, whence we can estimate matrices $ \mathbf{A}=\left[\begin{array}{ll}\mathbf{C}& \boldsymbol{\Gamma} \\ {}\boldsymbol{\Gamma} & \boldsymbol{\Gamma} \end{array}\right] $ and $ \boldsymbol{\Phi} =\left[\begin{array}{ll}\mathbf{P}& \boldsymbol{\Gamma} \\ {}\boldsymbol{\Gamma} & \boldsymbol{\Gamma} \end{array}\right] $. Those methods are also valid for the RGESIM. This means that the SVD methods described for estimating MESIM parameters are also valid for estimating RGESIM parameters.

8.4.3 Numerical Examples

With a selection intensity of 10% (k_I = 1.755), we compare the CRLGSI (for details see Chap. 6) versus the RGESIM theoretical results using a real maize (Zea mays) F₂ population with 244 genotypes (each with two repetitions), 233 molecular markers, and three traits—GY (ton ha⁻¹), EHT (cm), and PHT (cm)—described in Sect. 8.2.2, where $ \widehat{\mathbf{P}}=\left[\begin{array}{ccc}0.45& 1.33& 2.33\\ {}1.33& 65.07& 83.71\\ {}2.33& 83.71& 165.99\end{array}\right] $, $ \widehat{\mathbf{C}}=\left[\begin{array}{ccc}0.07& 0.61& 1.06\\ {}0.61& 17.93& 22.75\\ {}1.06& 22.75& 44.53\end{array}\right] $, and $ \widehat{\boldsymbol{\Gamma}}=\left[\begin{array}{ccc}0.07& 0.65& 1.05\\ {}0.65& 10.62& 14.25\\ {}1.05& 14.25& 26.37\end{array}\right] $ were the estimated matrices of P, C, and Γ respectively.

We have indicated that the main difference between the RLPSI and the CRLGSI is the matrix $ {\mathbf{U}}_C^{\prime } $, on which we now need to impose two restrictions: one for the trait and another for its associated GEBV. Consider the data set described earlier and suppose that we restrict the trait GY (ton ha⁻¹) and its associated GEBV_GY; then, matrix $ {\mathbf{U}}_C^{\prime } $ should be constructed as $ {\mathbf{U}}_{C1}^{\prime }=\left[\begin{array}{llllll}1& 0& 0& 0& 0& 0\\ {}0& 0& 0& 1& 0& 0\end{array}\right] $. If we restrict traits GY and EHT (cm) and their associated GEBV_GY and GEBV_EHT, matrix $ {\mathbf{U}}_C^{\prime } $ should be constructed as $ {\mathbf{U}}_{C2}^{\prime }=\left[\begin{array}{llllll}1& 0& 0& 0& 0& 0\\ {}0& 1& 0& 0& 0& 0\\ {}0& 0& 0& 1& 0& 0\\ {}0& 0& 0& 0& 1& 0\end{array}\right] $, etc. The procedure for obtaining matrices $ {\widehat{\mathbf{K}}}_{RG}=\left[{\mathbf{I}}_{2t}-{\widehat{\mathbf{Q}}}_{RG}\right] $ and $ {\widehat{\mathbf{Q}}}_{RG}={\widehat{\boldsymbol{\Phi}}}^{-1}\widehat{\mathbf{A}}{\mathbf{U}}_G{\left({\mathbf{U}}_G^{\prime }{\widehat{\mathbf{A}}\widehat{\boldsymbol{\Phi}}}^{-1}\widehat{\mathbf{A}}{\mathbf{U}}_{\mathrm{G}}\right)}^{-1}{\mathbf{U}}_G^{\prime}\widehat{\mathbf{A}} $ was described in Chap. 6, and is also valid for estimating RGESIM parameters.
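The construction of $ {\widehat{\mathbf{Q}}}_{RG} $ and $ {\widehat{\mathbf{K}}}_{RG} $ can be sketched in Python for the one-restriction case (GY and GEBV_GY), using the estimated maize matrices given above. This is a minimal sketch, not the authors' code: NumPy is assumed, the leading vector is taken as the first left singular vector of $ {\widehat{\mathbf{K}}}_{RG}{\widehat{\boldsymbol{\Phi}}}^{-1}\widehat{\mathbf{A}} $ (the SVD route indicated in Sect. 8.4.2), and its sign is arbitrary. Because the published matrices are rounded (for example, $ \widehat{\mathbf{C}} $ and $ \widehat{\boldsymbol{\Gamma}} $ share the same rounded GY variance, 0.07), the restricted solution is sensitive to rounding and the printed numbers may differ from the values reported in the text.

```python
import numpy as np

P = np.array([[0.45, 1.33, 2.33], [1.33, 65.07, 83.71], [2.33, 83.71, 165.99]])
C = np.array([[0.07, 0.61, 1.06], [0.61, 17.93, 22.75], [1.06, 22.75, 44.53]])
Gamma = np.array([[0.07, 0.65, 1.05], [0.65, 10.62, 14.25], [1.05, 14.25, 26.37]])
A = np.block([[C, Gamma], [Gamma, Gamma]])
Phi = np.block([[P, Gamma], [Gamma, Gamma]])
k_I = 1.755

# U_G (2t x 2r): its columns are the rows of U'_C1 shown in the text.
U_G = np.zeros((6, 2))
U_G[0, 0] = 1.0     # restrict trait GY
U_G[3, 1] = 1.0     # restrict GEBV_GY

Phi_inv_A = np.linalg.solve(Phi, A)                    # Phi^-1 A
S = U_G.T @ A @ Phi_inv_A @ U_G                        # U' A Phi^-1 A U
Q_RG = Phi_inv_A @ U_G @ np.linalg.solve(S, U_G.T @ A)
K_RG = np.eye(6) - Q_RG

T = K_RG @ Phi_inv_A                                   # K_RG Phi^-1 A
left_vecs, sing_vals, _ = np.linalg.svd(T)
beta_RG = left_vecs[:, 0]                              # unit norm, sign arbitrary

sigma_I = np.sqrt(beta_RG @ Phi @ beta_RG)
R_RG = k_I * sigma_I                                   # Eq. (8.43)
E_RG = k_I * (A @ beta_RG) / sigma_I                   # Eq. (8.44)

print(np.round(U_G.T @ A @ beta_RG, 8))                # restriction residual
print(np.round(E_RG, 3))
```

Whatever the rounding, the entries of `E_RG` for GY and GEBV_GY come out numerically zero, because every vector in the range of $ {\widehat{\mathbf{K}}}_{RG}{\widehat{\boldsymbol{\Phi}}}^{-1}\widehat{\mathbf{A}} $ satisfies $ {\mathbf{U}}_G^{\prime}\widehat{\mathbf{A}}\boldsymbol{\upbeta } =\mathbf{0} $.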

The estimated CRLGSI vector of coefficients is $ {\widehat{\boldsymbol{\upbeta}}}_{CR}={\widehat{\mathbf{K}}}_{RG}\widehat{\boldsymbol{\upbeta}} $, where $ \widehat{\boldsymbol{\upbeta}}={\widehat{\boldsymbol{\Phi}}}^{-1}\widehat{\mathbf{A}}\mathbf{w} $ is the estimated CLGSI vector of coefficients (Chap. 6). Let w′ = [5 − 0.1 − 0.1 0 0 0] be the vector of economic weights and suppose that we restrict trait GY and its associated GEBV_GY; in this case, $ {\mathbf{U}}_{C1}^{\prime }=\left[\begin{array}{llllll}1& 0& 0& 0& 0& 0\\ {}0& 0& 0& 1& 0& 0\end{array}\right] $, and according to matrices $ \widehat{\mathbf{P}} $, $ \widehat{\mathbf{C}} $, and $ \widehat{\boldsymbol{\Gamma}} $ described earlier, $ {\widehat{\boldsymbol{\upbeta}}}_{CR}^{\prime }=\left[0.076\kern0.5em -0.004\kern0.5em -0.018\kern0.5em 2.353\kern0.5em -0.096\kern0.5em -0.082\right] $ was the estimated CRLGSI vector of coefficients and the estimated CRLGSI was

$$ {\widehat{I}}_{CR}=0.076\mathrm{GY}-0.004\mathrm{EHT}-0.018\mathrm{PHT}+2.353{\mathrm{GEBV}}_{\mathrm{GY}}-0.096{\mathrm{GEBV}}_{\mathrm{EHT}}-0.082{\mathrm{GEBV}}_{\mathrm{PHT}} $$

where GEBV_GY, GEBV_EHT, and GEBV_PHT are the GEBVs associated with the traits GY, EHT, and PHT respectively. The same procedure is valid for two or more restrictions.

The estimated CRLGSI selection response and expected genetic gain per trait were $ {\widehat{R}}_{CR}={k}_I\sqrt{{\widehat{\boldsymbol{\upbeta}}}_{CR}^{\prime}\widehat{\boldsymbol{\Phi}}{\widehat{\boldsymbol{\upbeta}}}_{CR}}=0.96 $ and $ {\widehat{\mathbf{E}}}_{CR}^{\prime }={k}_I\frac{{\widehat{\boldsymbol{\upbeta}}}_{CR}^{\prime}\widehat{\mathbf{A}}}{\sqrt{{\widehat{\boldsymbol{\upbeta}}}_{CR}^{\prime}\widehat{\boldsymbol{\Phi}}{\widehat{\boldsymbol{\upbeta}}}_{CR}}}=\left[0\kern0.5em -3.53\kern0.5em -6.03\kern0.5em 0\kern0.5em -2.93\kern0.5em -4.87\right] $ respectively, whereas the estimated CRLGSI accuracy was $ {\widehat{\rho}}_{H{I}_{CR}}=\frac{{\widehat{\sigma}}_{I_{CR}}}{{\widehat{\sigma}}_H}=0.51 $. Note that in $ {\widehat{\mathbf{E}}}_{CR}^{\prime } $, the trait GY and its associated GEBV_GY have null values, as we would expect.
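The null gains for GY and GEBV_GY follow from the projection $ {\widehat{\boldsymbol{\upbeta}}}_{CR}={\widehat{\mathbf{K}}}_{RG}\widehat{\boldsymbol{\upbeta}} $ and can be checked numerically. The Python sketch below (NumPy assumed, variable names illustrative) recomputes the restricted coefficients from the published, rounded matrices. The restricted solution is sensitive to that rounding, so the printed values may differ from those reported above; the structural properties (zero gains on the restricted entries, and $ {\mathbf{w}}^{\prime}\widehat{\mathbf{E}}_{CR}={\widehat{R}}_{CR} $) do not depend on it.

```python
import numpy as np

P = np.array([[0.45, 1.33, 2.33], [1.33, 65.07, 83.71], [2.33, 83.71, 165.99]])
C = np.array([[0.07, 0.61, 1.06], [0.61, 17.93, 22.75], [1.06, 22.75, 44.53]])
Gamma = np.array([[0.07, 0.65, 1.05], [0.65, 10.62, 14.25], [1.05, 14.25, 26.37]])
A = np.block([[C, Gamma], [Gamma, Gamma]])
Phi = np.block([[P, Gamma], [Gamma, Gamma]])
k_I = 1.755
w = np.array([5.0, -0.1, -0.1, 0.0, 0.0, 0.0])

# Columns of U are the rows of U'_C1 (restrict GY and GEBV_GY).
U = np.zeros((6, 2))
U[0, 0] = 1.0
U[3, 1] = 1.0

Phi_inv_A = np.linalg.solve(Phi, A)
S = U.T @ A @ Phi_inv_A @ U
K = np.eye(6) - Phi_inv_A @ U @ np.linalg.solve(S, U.T @ A)

beta = Phi_inv_A @ w                 # unrestricted CLGSI coefficients
beta_CR = K @ beta                   # restricted CRLGSI coefficients

sigma_I = np.sqrt(beta_CR @ Phi @ beta_CR)
R_CR = k_I * sigma_I
E_CR = k_I * (A @ beta_CR) / sigma_I

print(np.round(beta_CR, 3))
print(np.round(E_CR, 3))
```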

The estimated RGESIM vector of coefficients was $ {\widehat{\boldsymbol{\upbeta}}}_{RG}^{\prime }=\left[0.015\kern0.5em -0.001\kern0.5em -0.004\kern0.5em 0.998\kern0.5em -0.029\kern0.5em -0.045\right] $, and the estimated RGESIM index was $ {\widehat{I}}_{RG}=0.015\mathrm{GY}-0.001\mathrm{EHT}-0.004\mathrm{PHT}+0.998{\mathrm{GEBV}}_{\mathrm{GY}}-0.029{\mathrm{GEBV}}_{\mathrm{EHT}}-0.045{\mathrm{GEBV}}_{\mathrm{PHT}} $ where GEBV_GY, GEBV_EHT, and GEBV_PHT are the GEBVs associated with traits GY, EHT, and PHT respectively. The same procedure is valid for two or more restrictions.

The estimated RGESIM selection response and expected genetic gain per trait were $ {\widehat{R}}_{RG}={k}_I\sqrt{{\widehat{\boldsymbol{\upbeta}}}_{RG}^{\prime}\widehat{\boldsymbol{\Phi}}{\widehat{\boldsymbol{\upbeta}}}_{RG}}=0.37 $ and $ {\widehat{\mathbf{E}}}_{RG}^{\prime }={k}_I\frac{{\widehat{\boldsymbol{\upbeta}}}_{RG}^{\prime}\widehat{\mathbf{A}}}{\sqrt{{\widehat{\boldsymbol{\upbeta}}}_{RG}^{\prime}\widehat{\boldsymbol{\Phi}}{\widehat{\boldsymbol{\upbeta}}}_{RG}}}=\left[0\kern0.5em -3.28\kern0.5em -6.03\kern0.5em 0\kern0.5em -2.93\kern0.5em -5.40\right] $ respectively, whereas the estimated RGESIM accuracy was $ {\widehat{\rho}}_{{\widehat{H}}_{RG}{\widehat{I}}_{RG}}=\frac{{\widehat{\sigma}}_{{\widehat{I}}_{RG}}}{{\widehat{\sigma}}_{{\widehat{H}}_{RG}}}=0.86 $.

Fig. 8.3 presents the frequency distribution of the 244 estimated RGESIM index values for two null restrictions on traits GY and EHT and their associated GEBV_GY and GEBV_EHT, for one selection cycle in an environment for a real maize (Zea mays) F₂ population with 233 molecular markers. Note that the frequency distribution of the estimated RGESIM index values approaches the normal distribution.

Now we compare the estimated CRLGSI and RGESIM selection responses and expected genetic gains per trait using the simulated data set described in Sect. 2.8.1 of Chap. 2. We used that data set for four phenotypic selection cycles (C2, C3, C4, and C5), each with four traits (T₁, T₂, T₃, and T₄), 500 genotypes, and four replicates per genotype. The economic weights for T₁, T₂, T₃, and T₄ were 1, −1, 1, and 1 respectively. For this data set, matrix F was an identity matrix of size 8 × 8 for all four selection cycles.

Columns 2, 3, and 4 (from left to right) of Table 8.4 present the estimated CRLGSI selection responses when their vectors of coefficients are normalized, and the estimated RGESIM selection responses, for one, two, and three restrictions in four simulated selection cycles. The averages of the estimated CRLGSI selection responses of the traits and their associated GEBVs for each of the three null restrictions were 3.24 for one restriction, 4.08 for two restrictions, and 5.06 for three restrictions, whereas the averages of the estimated RGESIM selection responses were 3.08 for one restriction, 2.79 for two restrictions, and 3.23 for three restrictions. Note that although for one restriction the selection response was similar for both indices, for two and three restrictions the CRLGSI selection responses were greater than the RGESIM selection responses.

Table 8.4 Estimated combined null restricted linear genomic selection index (CRLGSI) and estimated combined predetermined proportional gain linear genomic selection index (CPPG-LGSI) selection responses for one, two, and three restrictions when their vectors of coefficients are normalized for four simulated selection cycles

Table 8.5 presents the estimated CRLGSI and RGESIM expected genetic gains per trait for four traits (T1, T2, T3, and T4) and their associated GEBVs (in this case denoted by G1, G2, G3, and G4 to simplify the notation) for one, two, and three null restrictions in four simulated selection cycles. To simplify the table, the null values of the restricted traits and their restricted GEBVs are not shown. The averages of the estimated CRLGSI expected genetic gains for the three traits and their associated GEBVs were −2.60, 2.16, 2.84, −1.21, 0.67, and 1.02 for one restriction; 2.74, 3.23, 0.78, and 0.99 for two restrictions; and 4.02 and 1.33 for three restrictions. On the other hand, the averages of the estimated RGESIM expected genetic gains for the three traits and their associated GEBVs were 3.27, −1.67, −1.33, 2.16, −0.92, and −0.84 for one restriction; 3.29, 1.02, 1.76, and 0.46 for two restrictions; and 3.53 and 2.07 for three restrictions. These results indicate that in terms of absolute values, the estimated expected genetic gains for the traits and their associated GEBVs were similar for both indices.

Table 8.5 Estimated CRLGSI and estimated null RGESIM expected genetic gains per trait for four traits (T1, T2, T3, and T4) and their associated genomic estimated breeding values (G1, G2, G3, and G4) for one, two, and three null restrictions in four simulated selection cycles. The selection intensity was 10% (k_I = 1.755)

8.5 The Predetermined Proportional Gain Linear Genomic Eigen Selection Index Method

The predetermined proportional gain linear genomic eigen selection index method (PPG-GESIM) theory is based on the predetermined proportional gain linear phenotypic ESIM (PPG-ESIM) described in Chap. 7. In the PPG-ESIM, the vector of PPG (predetermined proportional gain) imposed by the breeder was $ {\mathbf{d}}^{\prime }=\left[{d}_1\kern0.5em {d}_2\kern0.5em \cdots \kern0.5em {d}_r\right] $. However, because the PPG-GESIM uses phenotypic and GEBV information jointly to predict the net genetic merit, the vector of PPG imposed by the breeder (d_PG) should have twice as many elements as the standard vector d′, that is, $ {\mathbf{d}}_{PG}^{\prime }=\left[{d}_1\kern0.5em {d}_2\kern0.5em \cdots \kern0.5em {d}_r\kern0.5em {d}_{r+1}\kern0.5em {d}_{r+2}\kern0.5em \cdots \kern0.5em {d}_{2r}\right] $, where we would expect that if $ {d}_1 $ is the PPG imposed on trait 1, then $ {d}_{r+1} $ should be the PPG imposed on the GEBV associated with trait 1, and so on. Thus, in the PPG-GESIM there are three possible options for determining the PPG for each trait and its GEBV: e.g., for trait 1, $ {d}_1={d}_{r+1} $, $ {d}_1>{d}_{r+1} $, or $ {d}_1<{d}_{r+1} $. This is the main difference between the standard PPG-ESIM described in Chap. 7 and the PPG-GESIM.

8.5.1 The PPG-GESIM Parameters

Using the same procedure described for RGESIM and PPG-ESIM, the PPG-GESIM vector of coefficients (β_PG), which maximizes the PPG-GESIM selection response and the expected genetic gain per trait, is the first eigenvector of the following equation

$$ \left({\mathbf{T}}_{PG}-{\lambda}_{PG}^2{\mathbf{I}}_{2t}\right){\boldsymbol{\upbeta}}_{PG}=\mathbf{0}, $$

(8.45)

where T_PG = K_RGΦ⁻¹A + B, K_RG = [I_2t − Q_RG], I_2t is an identity matrix of size 2t × 2t, $ {\mathbf{Q}}_{RG}={\boldsymbol{\Phi}}^{-1}{\mathbf{A}\mathbf{U}}_G{\left({\mathbf{U}}_G^{\prime }{\mathbf{A}\boldsymbol{\Phi}}^{-1}{\mathbf{A}\mathbf{U}}_G\right)}^{-1}{\mathbf{U}}_G^{\prime}\mathbf{A} $, B = δφ′, $ \boldsymbol{\updelta} ={\boldsymbol{\Phi}}^{-1}{\mathbf{A}\mathbf{U}}_G{\left({\mathbf{U}}_G^{\prime }{\mathbf{A}\boldsymbol{\Phi}}^{-1}{\mathbf{A}\mathbf{U}}_G\right)}^{-1}{\mathbf{d}}_{PG} $, and $ {\varphi}^{\prime }=\frac{{\mathbf{d}}_{PG}^{\prime }{\left({\mathbf{U}}_G^{\prime }{\mathbf{A}\boldsymbol{\Phi}}^{-1}{\mathbf{A}\mathbf{U}}_G\right)}^{-1}{\mathbf{U}}_G^{\prime }{\mathbf{A}\boldsymbol{\Phi}}^{-1}\mathbf{A}}{{\mathbf{d}}_{PG}^{\prime }{\left({\mathbf{U}}_G^{\prime }{\mathbf{A}\boldsymbol{\Phi}}^{-1}{\mathbf{A}\mathbf{U}}_G\right)}^{-1}{\mathbf{d}}_{PG}} $. When B is a null matrix, T_PG = K_RGΦ⁻¹A (matrix of the RGESIM), and when $ {\mathbf{U}}_G^{\prime } $ is a null matrix, T_PG = Φ⁻¹A (matrix of the GESIM); this means that the PPG-GESIM includes the RGESIM and GESIM as particular cases. The optimized PPG-GESIM index can be written as $ {I}_{PG}={\boldsymbol{\upbeta}}_{PG}^{\prime}\mathbf{f} $.
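The construction of T_PG can be sketched numerically. The following is a minimal illustration, not the authors' software: it assumes NumPy, uses the rounded estimates of P, G, and Γ and the PPG vector from the maize example in Sect. 8.5.2, and the helper name `ppg_gesim_first_eigenpair` is hypothetical. Because the published matrices are rounded, the output should not be expected to reproduce the coefficients reported in the chapter.

```python
import numpy as np

# Rounded estimates of P, G and Gamma from the maize F2 example (Sect. 8.5.2).
P = np.array([[0.45, 1.33, 2.33],
              [1.33, 65.07, 83.71],
              [2.33, 83.71, 165.99]])
G = np.array([[0.07, 0.61, 1.06],
              [0.61, 17.93, 22.75],
              [1.06, 22.75, 44.53]])
Gamma = np.array([[0.07, 0.65, 1.05],
                  [0.65, 10.62, 14.25],
                  [1.05, 14.25, 26.37]])
A = np.block([[G, Gamma], [Gamma, Gamma]])
Phi = np.block([[P, Gamma], [Gamma, Gamma]])

# Columns of U_G pick GY, EHT, GEBV_GY and GEBV_EHT; d_PG holds their PPG values.
U_G = np.zeros((6, 4))
U_G[[0, 1, 3, 4], [0, 1, 2, 3]] = 1.0
d_PG = np.array([7.0, -3.0, 3.5, -1.5])


def ppg_gesim_first_eigenpair(A, Phi, U_G, d_PG):
    """Hypothetical helper: leading eigenvalue and eigenvector of T_PG (Eq. 8.45)."""
    Phi_inv_A = np.linalg.solve(Phi, A)           # Phi^{-1} A
    M = Phi_inv_A @ U_G                           # Phi^{-1} A U_G
    C_inv = np.linalg.inv(U_G.T @ A @ M)          # (U_G' A Phi^{-1} A U_G)^{-1}
    Q_RG = M @ C_inv @ U_G.T @ A
    K_RG = np.eye(A.shape[0]) - Q_RG
    delta = M @ C_inv @ d_PG
    phi = (d_PG @ C_inv @ U_G.T @ A @ Phi_inv_A) / (d_PG @ C_inv @ d_PG)
    T_PG = K_RG @ Phi_inv_A + np.outer(delta, phi)  # B = delta phi'
    # T_PG is not symmetric, so the general eigen-solver is used.
    eigvals, eigvecs = np.linalg.eig(T_PG)
    j = np.argmax(eigvals.real)
    return eigvals[j].real, eigvecs[:, j].real


lambda2_PG, beta_PG = ppg_gesim_first_eigenpair(A, Phi, U_G, d_PG)
restricted = U_G.T @ A @ beta_PG  # restricted covariances U_G' A beta_PG
print("lambda^2_PG    =", lambda2_PG)
print("beta_PG        =", beta_PG)
print("U_G' A beta_PG =", restricted)
```

The eigenvector is returned with unit length and an arbitrary sign, so its direction may have to be reversed before interpretation. A useful sanity check is that `restricted` is proportional to `d_PG`, which follows from the way K_RG and B enter T_PG.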

The vector of coefficients of $ H={\mathbf{w}}_{PG}^{\prime}\boldsymbol{\upalpha} $ in the PPG-GESIM can be written as

$$ {\mathbf{w}}_{PG}={\mathbf{A}}^{-1}\left[\boldsymbol{\Phi} +{\mathbf{Q}}_{PG}^{\prime}\mathbf{A}\right]{\boldsymbol{\upbeta}}_{PG}, $$

(8.46)

where $ {\mathbf{Q}}_{PG}^{\prime }={\mathbf{A}\mathbf{U}}_G{\mathbf{D}}_G{\left({\mathbf{D}}_G^{\prime }{\mathbf{U}}_G^{\prime }{\mathbf{A}\boldsymbol{\Phi}}^{-1}{\mathbf{A}\mathbf{U}}_G{\mathbf{D}}_G\right)}^{-1}{\mathbf{D}}_G^{\prime }{\mathbf{U}}_G^{\prime }{\mathbf{A}\boldsymbol{\Phi}}^{-1} $, and $ {\mathbf{D}}_G^{\prime }=\left[\begin{array}{lllll}{d}_{2r}& 0& \cdots & 0& -{d}_1\\ {}0& {d}_{2r}& \cdots & 0& -{d}_2\\ {}\vdots & \vdots & \ddots & \vdots & \vdots \\ {}0& 0& \cdots & {d}_{2r}& -{d}_{2r-1}\end{array}\right] $. Similar to RGESIM, it can be shown that the covariance between $ {H}_{PG}={\mathbf{w}}_{PG}^{\prime}\boldsymbol{\upalpha} $ and $ {I}_{PG}={\boldsymbol{\upbeta}}_{PG}^{\prime}\mathbf{f} $ ($ {\sigma}_{H_{PG}{I}_{PG}} $) is equal to the variance of $ {I}_{PG}={\boldsymbol{\upbeta}}_{PG}^{\prime}\mathbf{f} $ ($ {\sigma}_{I_{PG}}^2={\boldsymbol{\upbeta}}_{PG}^{\prime }{\boldsymbol{\Phi} \boldsymbol{\upbeta}}_{PG} $), that is, $ {\sigma}_{H_{PG}{I}_{PG}}={\mathbf{w}}_{PG}^{\prime }{\mathbf{A}\boldsymbol{\upbeta}}_{PG}={\boldsymbol{\upbeta}}_{PG}^{\prime }{\boldsymbol{\Phi} \boldsymbol{\upbeta}}_{PG}={\sigma}_{I_{PG}}^2 $.
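As a small worked illustration of the matrix $ {\mathbf{D}}_G^{\prime } $, take the PPG vector used in the numerical example of Sect. 8.5.2, $ {\mathbf{d}}_{PG}^{\prime }=\left[7\kern0.5em -3\kern0.5em 3.5\kern0.5em -1.5\right] $, so that 2r = 4 and $ {d}_{2r}=-1.5 $. Under that assumption,

$$ {\mathbf{D}}_G^{\prime }=\left[\begin{array}{rrrr}-1.5& 0& 0& -7\\ {}0& -1.5& 0& 3\\ {}0& 0& -1.5& -3.5\end{array}\right],\kern1em {\mathbf{D}}_G^{\prime }{\mathbf{d}}_{PG}=\left[\begin{array}{c}\left(-1.5\right)(7)+\left(-7\right)\left(-1.5\right)\\ {}\left(-1.5\right)\left(-3\right)+(3)\left(-1.5\right)\\ {}\left(-1.5\right)(3.5)+\left(-3.5\right)\left(-1.5\right)\end{array}\right]=\mathbf{0}. $$

Row i of $ {\mathbf{D}}_G^{\prime } $ gives $ {d}_{2r}{d}_i-{d}_i{d}_{2r}=0 $, so $ {\mathbf{D}}_G^{\prime } $ annihilates any vector proportional to d_PG. This suggests why the covariance identity holds: if $ {\mathbf{U}}_G^{\prime }{\mathbf{A}\boldsymbol{\upbeta}}_{PG} $ is proportional to d_PG, then $ {\mathbf{D}}_G^{\prime }{\mathbf{U}}_G^{\prime }{\mathbf{A}\boldsymbol{\upbeta}}_{PG}=\mathbf{0} $ and the term involving $ {\mathbf{Q}}_{PG}^{\prime } $ in Eq. (8.46) contributes nothing to $ {\mathbf{w}}_{PG}^{\prime }{\mathbf{A}\boldsymbol{\upbeta}}_{PG} $.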

The maximized correlation between I_PG and H_PG, or PPG-GESIM accuracy, is

$$ {\rho}_{H_{PG}{I}_{PG}}=\frac{\sqrt{{\boldsymbol{\upbeta}}_{PG}^{\prime }{\boldsymbol{\Phi} \boldsymbol{\upbeta}}_{PG}}}{\sqrt{{\mathbf{w}}_{PG}^{\prime }{\mathbf{Aw}}_{PG}}} $$

(8.47)

where $ {\mathbf{w}}_{PG}^{\prime }{\mathbf{Aw}}_{PG} $ is the variance of H_PG. Hereafter, to simplify the notation, we write Eq. (8.47) as λ_PG.

The maximized selection response and the expected genetic gain per trait of the PPG-GESIM are

$$ {R}_{PG}={k}_I\sqrt{{\boldsymbol{\upbeta}}_{PG}^{\prime }{\boldsymbol{\Phi} \boldsymbol{\upbeta}}_{PG}} $$

(8.48)

and

$$ {\mathbf{E}}_{PG}={k}_I\frac{{\mathbf{A}\boldsymbol{\upbeta}}_{PG}}{\sqrt{{\boldsymbol{\upbeta}}_{PG}^{\prime }{\boldsymbol{\Phi} \boldsymbol{\upbeta}}_{PG}}}, $$

(8.49)

respectively, where β_PG is the first eigenvector of Eq. (8.45).

8.5.2 Numerical Examples

The process for estimating PPG-GESIM parameters is similar to the method described for estimating RGESIM parameters. With a selection intensity of 10% (k_I = 1.755), we compare the combined predetermined proportional gain linear genomic selection index (CPPG-LGSI) and PPG-GESIM results using the real maize (Zea mays) F₂ population with 244 genotypes, 233 molecular markers, and three traits, GY (ton ha⁻¹), EHT (cm), and PHT (cm), where $ \widehat{\mathbf{P}}=\left[\begin{array}{lll}0.45& 1.33& 2.33\\ {}1.33& 65.07& 83.71\\ {}2.33& 83.71& 165.99\end{array}\right] $, $ \widehat{\mathbf{G}}=\left[\begin{array}{lll}0.07& 0.61& 1.06\\ {}0.61& 17.93& 22.75\\ {}1.06& 22.75& 44.53\end{array}\right] $ and $ \widehat{\boldsymbol{\Gamma}}=\left[\begin{array}{lll}0.07& 0.65& 1.05\\ {}0.65& 10.62& 14.25\\ {}1.05& 14.25& 26.37\end{array}\right] $ are the estimated matrices of P, G, and Γ respectively, whereas $ {\mathbf{w}}^{\prime }=\left[5\kern0.5em -0.1\kern0.5em -0.1\kern0.5em 0\kern0.5em 0\kern0.5em 0\right] $ was the vector of economic weights.

The estimated CPPG-LGSI vector of coefficients was $ {\widehat{\boldsymbol{\upbeta}}}_{CP}={\widehat{\boldsymbol{\upbeta}}}_{CG}+{\widehat{\theta}}_{CP}\widehat{\boldsymbol{\updelta}} $ (see Chap. 6 for additional details). Let $ \widehat{\mathbf{A}}=\left[\begin{array}{ll}\widehat{\mathbf{G}}& \widehat{\boldsymbol{\Gamma}}\\ {}\widehat{\boldsymbol{\Gamma}}& \widehat{\boldsymbol{\Gamma}}\end{array}\right] $ and $ \widehat{\boldsymbol{\Phi}}=\left[\begin{array}{ll}\widehat{\mathbf{P}}& \widehat{\boldsymbol{\Gamma}}\\ {}\widehat{\boldsymbol{\Gamma}}& \widehat{\boldsymbol{\Gamma}}\end{array}\right] $ be the estimated block matrices and $ {\mathbf{d}}_{PG}^{\prime }=\left[7\kern0.5em -3\kern0.5em 3.5\kern0.5em -1.5\right] $ the vector of PPG imposed by the breeder on the traits GY and EHT, and their associated genomic estimated breeding values (GEBV_GY and GEBV_EHT), and let $ {\mathbf{U}}_C^{\prime }=\left[\begin{array}{cccccc}1& 0& 0& 0& 0& 0\\ {}0& 1& 0& 0& 0& 0\\ {}0& 0& 0& 1& 0& 0\\ {}0& 0& 0& 0& 1& 0\end{array}\right] $ be the matrix of null restrictions on the CPPG-LGSI and $ {\mathbf{w}}^{\prime }=\left[5\kern0.5em -0.1\kern0.5em -0.1\kern0.5em 0\kern0.5em 0\kern0.5em 0\right] $ the vector of economic weights. It can be shown that $ {\widehat{\uptheta}}_{\mathrm{CP}}=-0.00009 $ is the estimated value of the proportionality constant, $ {\widehat{\boldsymbol{\updelta}}}^{\prime }=\left[-112.92\kern0.5em -72.16\kern0.5em 61.35\kern0.5em 231.79\kern0.5em 64.75\kern0.5em -61.35\right] $, $ {\widehat{\boldsymbol{\upbeta}}}_{CP}^{\prime }=\left[-0.01\kern0.5em 0.01\kern0.5em -0.01\kern0.5em 0.59\kern0.5em 0.09\kern0.5em -0.09\right] $ is the estimated CPPG-LGSI vector of coefficients, and the estimated CPPG-LGSI can be written as

$$ {\widehat{I}}_{CP}=-0.01\mathrm{GY}+0.01\mathrm{EHT}-0.01\mathrm{PHT}+0.59{\mathrm{GEBV}}_{\mathrm{GY}}+0.09{\mathrm{GEBV}}_{\mathrm{EHT}}-0.09{\mathrm{GEBV}}_{\mathrm{PHT}} $$

where GEBV_GY, GEBV_EHT, and GEBV_PHT are the GEBVs associated with traits GY, EHT, and PHT respectively. The same procedure is valid for more than two predetermined restrictions. The estimated CPPG-LGSI selection response and expected genetic gain per trait were $ {\widehat{R}}_{CP}={k}_I\sqrt{{\widehat{\boldsymbol{\upbeta}}}_{CP}^{\prime }{\widehat{\boldsymbol{\Phi}}\widehat{\boldsymbol{\upbeta}}}_{CP}}=0.443 $ and $ {\widehat{\mathbf{E}}}_{CP}^{\prime }={k}_I\frac{{\widehat{\boldsymbol{\upbeta}}}_{CP}^{\prime}\widehat{\mathbf{A}}}{\sqrt{{\widehat{\boldsymbol{\upbeta}}}_{CP}^{\prime }{\widehat{\boldsymbol{\Phi}}\widehat{\boldsymbol{\upbeta}}}_{CP}}}=\left[-0.004\kern0.5em 0.002\kern0.5em -4.639\kern0.5em -0.002\kern0.5em 0.001\kern0.5em -4.326\right] $ respectively, whereas the estimated CPPG-LGSI accuracy is $ {\widehat{\rho}}_{HI_{CP}}=\frac{{\widehat{\sigma}}_{I_{CP}}}{{\widehat{\sigma}}_H}=0.234 $.
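The selection response and expected genetic gain formulas can be checked by plugging in the published values. The sketch below assumes NumPy and uses the rounded matrices and the rounded CPPG-LGSI coefficients printed above; the function name `response_and_gains` is hypothetical. Since all inputs are rounded to two decimals, the result should land near the reported response of 0.443 without matching it exactly, and the near-zero restricted gains are dominated by rounding error.

```python
import numpy as np

# Rounded estimates from the maize F2 example in this section.
P = np.array([[0.45, 1.33, 2.33],
              [1.33, 65.07, 83.71],
              [2.33, 83.71, 165.99]])
G = np.array([[0.07, 0.61, 1.06],
              [0.61, 17.93, 22.75],
              [1.06, 22.75, 44.53]])
Gamma = np.array([[0.07, 0.65, 1.05],
                  [0.65, 10.62, 14.25],
                  [1.05, 14.25, 26.37]])
A = np.block([[G, Gamma], [Gamma, Gamma]])
Phi = np.block([[P, Gamma], [Gamma, Gamma]])

k_I = 1.755  # selection intensity for 10% selected
# Published (rounded) CPPG-LGSI coefficients for GY, EHT, PHT and their GEBVs.
beta_CP = np.array([-0.01, 0.01, -0.01, 0.59, 0.09, -0.09])


def response_and_gains(beta, A, Phi, k_I):
    """Hypothetical helper: R = k_I sqrt(b'Phi b) and E = k_I A b / sqrt(b'Phi b)."""
    sd_index = np.sqrt(beta @ Phi @ beta)   # standard deviation of the index
    R = k_I * sd_index
    E = k_I * (A @ beta) / sd_index
    return R, E


R_CP, E_CP = response_and_gains(beta_CP, A, Phi, k_I)
print("R_CP =", R_CP)
print("E_CP =", E_CP)
```

The same function applies unchanged to the PPG-GESIM, because Eqs. (8.48) and (8.49) have the same form with β_PG in place of the CPPG-LGSI coefficients.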

Because the estimated value of the proportionality constant was negative ($ {\widehat{\uptheta}}_{\mathrm{CP}}=-0.00009 $), the expected genetic gains of the traits GY and EHT, and their associated genomic estimated breeding values (GEBV_GY and GEBV_EHT), which appeared in the $ {\widehat{\mathbf{E}}}_{CP}^{\prime } $ values, were not in accordance with the values of the vector of PPG imposed by the breeder, $ {\mathbf{d}}_{PG}^{\prime }=\left[7\kern0.5em -3\kern0.5em 3.5\kern0.5em -1.5\right] $, as we would expect, and CPPG-LGSI accuracy (0.234) was low. These results indicate that in the CPPG-LGSI, it is very important for the estimated values of $ {\widehat{\uptheta}}_{\mathrm{CP}} $ to be positive (see Chaps. 3 and 6 for details).

In the PPG-GESIM, we need to find the solutions to equation $ \left({\widehat{\mathbf{T}}}_{PG}-{\widehat{\lambda}}_{PG_j}^2{\mathbf{I}}_{2t}\right){\widehat{\boldsymbol{\upbeta}}}_{PG_j}=\mathbf{0} $, for $ {\widehat{\lambda}}_{PG_j}^2 $ and $ {\widehat{\boldsymbol{\upbeta}}}_{PG_j} $ (see Eq. 8.45). The estimated PPG-GESIM vector of coefficients was $ {\widehat{\boldsymbol{\upbeta}}}_{PG}^{\prime }=\left[0.001\kern0.5em -0.050\kern0.5em 0.029\kern0.5em 0.975\kern0.5em 0.154\kern0.5em -0.157\right] $, which was transformed using matrix $ \mathbf{F}=\left[\begin{array}{cccccc}-0.1& 0& 0& 0& 0& 0\\ {}0& 3& 0& 0& 0& 0\\ {}0& 0& 2& 0& 0& 0\\ {}0& 0& 0& -1& 0& 0\\ {}0& 0& 0& 0& -1& 0\\ {}0& 0& 0& 0& 0& -1\end{array}\right] $, that is, we changed the direction of the original vector. With the $ {\widehat{\boldsymbol{\upbeta}}}_{PG}^{\prime } $ values, we can estimate the PPG-GESIM index as

$$ {\widehat{I}}_{PG}=0.001\mathrm{GY}-0.05\mathrm{EHT}+0.029\mathrm{PHT}+0.975{\mathrm{GEBV}}_{\mathrm{GY}}+0.154{\mathrm{GEBV}}_{\mathrm{EHT}}-0.157{\mathrm{GEBV}}_{\mathrm{PHT}} $$

where GEBV_GY, GEBV_EHT, and GEBV_PHT are the GEBVs associated with the traits GY, EHT, and PHT respectively. The estimated PPG-GESIM selection response, accuracy, and expected genetic gain per trait were $ {\widehat{R}}_{PG}={k}_I\sqrt{{\widehat{\boldsymbol{\upbeta}}}_{PG}^{\prime }{\widehat{\boldsymbol{\Phi}}\widehat{\boldsymbol{\upbeta}}}_{PG}}=0.696 $, $ {\widehat{\rho}}_{{\widehat{H}}_{PG}{\widehat{I}}_{PG}}=\frac{{\widehat{\sigma}}_{{\widehat{I}}_{PG}}}{{\widehat{\sigma}}_{{\widehat{H}}_{PG}}}=0.843 $, and $ {\widehat{\mathbf{E}}}_{PG}^{\prime }={k}_I\frac{{\widehat{\boldsymbol{\upbeta}}}_{PG}^{\prime}\widehat{\mathbf{A}}}{\sqrt{{\widehat{\boldsymbol{\upbeta}}}_{PG}^{\prime }{\widehat{\boldsymbol{\Phi}}\widehat{\boldsymbol{\upbeta}}}_{PG}}}=\left[0.01\kern0.5em -1.00\kern0.5em -3.56\kern0.5em 0\kern0.5em -0.46\kern0.5em -3.98\right] $ respectively.

Figure 8.4 presents the frequency distribution of the 244 estimated PPG-GESIM index values for two predetermined restrictions on the traits GY and EHT and their associated GEBVs (GEBV_GY and GEBV_EHT), for one selection cycle in one environment, for a real maize (Zea mays) F₂ population with 233 molecular markers. Note that the frequency distribution of the estimated PPG-GESIM index values approaches a normal distribution.

Now, with a selection intensity of 10% (k_I = 1.755) and a vector of predetermined restrictions $ {\mathbf{d}}_{PG}^{\prime }=\left[7\kern0.5em -3\kern0.5em 5\kern0.5em 3.5\kern0.5em -1.5\kern0.5em 2.5\right] $, we compare the estimated CPPG-LGSI and PPG-GESIM selection responses and expected genetic gains per trait using the simulated data set described in Sect. 2.8.1 of Chap. 2. Traits T1, T2, and T3 and their associated GEBVs (GEBV1, GEBV2, and GEBV3 respectively) were restricted, but trait T4 and its associated GEBV4 were not restricted. For this data set, matrix F was an identity matrix of size 8 × 8 for all four selection cycles.

Table 8.6 presents the estimated CPPG-LGSI selection responses when their vectors of coefficients are normalized, and the estimated PPG-GESIM selection responses for one, two, and three predetermined restrictions for four simulated selection cycles. The averages of the estimated CPPG-LGSI selection responses were 5.08 for one restriction, 3.42 for two restrictions, and 1.60 for three restrictions, whereas the averages of the estimated PPG-GESIM selection responses were 1.96 for one restriction, 4.14 for two restrictions, and 5.46 for three restrictions. For this data set, when the number of restrictions increases, the estimated CPPG-LGSI selection response tends to decrease, whereas the estimated PPG-GESIM selection response increases.

Table 8.6 Estimated CPPG-LGSI expected genetic gains for one, two, and three restricted predetermined traits (T1, T2, and T3) and for one, two, and three restricted predetermined GEBVs (GEBV1, GEBV2, and GEBV3) for four simulated selection cycles

Table 8.7 presents the estimated CPPG-LGSI and PPG-GESIM expected genetic gains for one, two, and three predetermined restrictions for four simulated selection cycles. The averages of the estimated CPPG-LGSI expected genetic gains for the four traits and their four associated GEBVs were 8.28, −4.12, 3.23, 2.23, 4.14, −2.26, 1.71, and 1.01 for one restriction; 8.43, −3.61, 3.28, 2.13, 4.22, −1.81, 1.72, and 0.93 for two restrictions; and 5.81, −2.49, 4.15, 2.26, 2.90, −1.24, 2.07, and 0.89 for three restrictions. On the other hand, the averages of the estimated PPG-GESIM expected genetic gains for the four traits and their four associated GEBVs were 6.97, −1.31, 1.78, 0.52, 5.64, −1.74, 1.75, and 0.58 for one restriction; 6.93, −2.73, 1.29, 0.85, 5.75, −2.55, 1.49, and 0.79 for two restrictions; and 8.12, −3.27, 2.99, 1.13, 2.19, −1.15, 1.30, and 0.45 for three restrictions. These results indicate that the estimated CPPG-LGSI expected genetic gains for the four traits and their four associated GEBVs were generally higher than the corresponding estimated PPG-GESIM expected genetic gains.

Table 8.7 Estimated PPG-GESIM expected genetic gains for one, two, and three restricted traits (T1, T2, and T3) and for one, two, and three restricted GEBVs (GEBV1, GEBV2, and GEBV3) for four simulated selection cycles

Author information

Authors and Affiliations

Biometrics and Statistics Unit, International Maize and Wheat Improvement Center (CIMMYT), Mexico, Mexico
J. Jesus Céron-Rojas & José Crossa

Rights and permissions

Open Access This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

About this chapter

Cite this chapter

Céron-Rojas, J.J., Crossa, J. (2018). Linear Molecular and Genomic Eigen Selection Index Methods. In: Linear Selection Indices in Modern Plant Breeding. Springer, Cham. https://doi.org/10.1007/978-3-319-91223-3_8

DOI: https://doi.org/10.1007/978-3-319-91223-3_8
Published: 27 September 2018
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-91222-6
Online ISBN: 978-3-319-91223-3
eBook Packages: Biomedical and Life Sciences (R0)
