Constrained Linear Genomic Selection Indices

Céron-Rojas, J. Jesus; Crossa, José

doi:10.1007/978-3-319-91223-3_6


Abstract

The constrained linear genomic selection indices are the null restricted and the predetermined proportional gain linear genomic selection indices (RLGSI and PPG-LGSI respectively), which are linear combinations of genomic estimated breeding values (GEBVs) used to predict the net genetic merit. They result from a direct application of the restricted and the predetermined proportional gain linear phenotypic selection index theory to the genomic selection context. The RLGSI can be extended to a combined RLGSI (CRLGSI) and the PPG-LGSI can be extended to a combined PPG-LGSI (CPPG-LGSI); the latter indices use phenotypic and GEBV information jointly in the prediction of net genetic merit. The main difference between the RLGSI and PPG-LGSI on the one hand and the CRLGSI and CPPG-LGSI on the other is that the RLGSI and PPG-LGSI are useful in a testing population where there is only marker information, whereas the CRLGSI and CPPG-LGSI can be used only in training populations where joint phenotypic and marker information is available. The RLGSI and CRLGSI allow restrictions equal to zero to be imposed on the expected genetic advance of some traits, whereas the PPG-LGSI and CPPG-LGSI allow predetermined proportional restriction values to be imposed on the expected trait genetic gains to make some traits change their mean values based on a predetermined level. We describe the foregoing four indices and validate their theoretical results using real and simulated data.

6.1 The Restricted Linear Genomic Selection Index

Let H = w′g be the net genetic merit and I_G = β′γ the linear genomic selection index (LGSI, see Chap. 5 for details), where g, γ, w, and β are t × 1 vectors (t = number of traits) of breeding values, genomic breeding values, economic weights, and LGSI coefficients respectively. It can be shown that Cov(I_G, g) = Γβ is the covariance between g and I_G = β′γ, and that Var(γ) = Γ is the genomic covariance matrix of size t × t (see Chap. 5 for details). The objective of the restricted linear genomic selection index (RLGSI) is to improve only (t − r) of t (r < t) traits (leaving r of them fixed) in a testing population using only genomic estimated breeding values (GEBVs). The RLGSI minimizes the mean squared difference between I_G and H, E[(H − I_G)²], with respect to β under the restriction Cov(I_G, U′g) = U′Γβ = 0, where U′ is an r × t matrix of 1s and 0s with one row per restricted trait, in a similar manner to the restricted linear phenotypic selection index (RLPSI) described in Chap. 3 in the phenotypic selection context.

6.1.1 The Maximized RLGSI Parameters

Let Var(I_G) = β′Γβ be the variance of I_G = β′γ, w′Cw the variance of H = w′g, and Cov(I_G, H) = w′Γβ the covariance between H = w′g and I_G = β′γ. The mean squared difference between H and I_G can be written as E[(H − I_G)²], which should be minimized under the restriction U′Γβ = 0 assuming that Γ, C, U′, and w are known, i.e., it is necessary to minimize the function

$$ {f}_R\left(\boldsymbol{\upbeta}, \mathbf{v}\right)={\mathbf{w}}^{\prime}\mathbf{Cw}+{\boldsymbol{\upbeta}}^{\prime}\boldsymbol{\Gamma} \boldsymbol{\upbeta} -2{\mathbf{w}}^{\prime}\boldsymbol{\Gamma} \boldsymbol{\upbeta} +2{\mathbf{v}}^{\prime }{\mathbf{U}}^{\prime}\boldsymbol{\Gamma} \boldsymbol{\upbeta} $$

(6.1)

with respect to vectors β and v′ = [v₁ v₂ ⋯ v_r], where v is a vector of Lagrange multipliers (one per restriction). In matrix notation, setting the derivatives of Eq. (6.1) to zero and solving gives

$$ \left[\begin{array}{c}\boldsymbol{\upbeta} \\ {}\mathbf{v}\end{array}\right]={\left[\begin{array}{cc}\boldsymbol{\Gamma} & \boldsymbol{\Gamma} \mathbf{U}\\ {}{\mathbf{U}}^{\prime}\boldsymbol{\Gamma} & \mathbf{0}\end{array}\right]}^{-1}\left[\begin{array}{c}\boldsymbol{\Gamma} \mathbf{w}\\ {}\mathbf{0}\end{array}\right]. $$

(6.2)

Following the procedure described in Chap. 3 (Eqs. 3.2 to 3.5), it can be shown that the RLGSI vector of coefficients that minimizes E[(H − I_G)²] under the restriction U′Γβ = 0 is

$$ {\boldsymbol{\upbeta}}_{RG}={\mathbf{K}}_G\mathbf{w}, $$

(6.3)

where K_G = [I_t − Q_G], Q_G = U(U′ΓU)⁻¹U′Γ, w is a vector of economic weights, and I_t is an identity matrix t × t. When no restrictions are imposed on any of the traits, U′ is a null matrix and β_RG = w, the optimized LGSI vector of coefficients (see Chap. 5 for details).
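
The step from Eq. (6.1) to Eq. (6.3) can be sketched as follows, assuming that Γ and U′ΓU are nonsingular (an assumption made here so that the inverses exist). Setting the derivative of f_R with respect to β to zero, substituting the result into the restriction U′Γβ = 0, and solving for v gives

$$ \frac{\partial {f}_R}{\partial \boldsymbol{\upbeta}}=2\boldsymbol{\Gamma} \boldsymbol{\upbeta} -2\boldsymbol{\Gamma} \mathbf{w}+2\boldsymbol{\Gamma} \mathbf{U}\mathbf{v}=\mathbf{0}\kern0.5em \Rightarrow \kern0.5em \boldsymbol{\upbeta} =\mathbf{w}-\mathbf{U}\mathbf{v}, $$

$$ {\mathbf{U}}^{\prime}\boldsymbol{\Gamma} \left(\mathbf{w}-\mathbf{U}\mathbf{v}\right)=\mathbf{0}\kern0.5em \Rightarrow \kern0.5em \mathbf{v}={\left({\mathbf{U}}^{\prime}\boldsymbol{\Gamma} \mathbf{U}\right)}^{-1}{\mathbf{U}}^{\prime}\boldsymbol{\Gamma} \mathbf{w}, $$

$$ {\boldsymbol{\upbeta}}_{RG}=\mathbf{w}-\mathbf{U}{\left({\mathbf{U}}^{\prime}\boldsymbol{\Gamma} \mathbf{U}\right)}^{-1}{\mathbf{U}}^{\prime}\boldsymbol{\Gamma} \mathbf{w}=\left[{\mathbf{I}}_t-{\mathbf{Q}}_G\right]\mathbf{w}={\mathbf{K}}_G\mathbf{w}. $$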

By Eq. (6.3), the RLGSI, and the maximized RLGSI selection response and expected genetic gain per trait can be written as

$$ {I}_{\mathrm{RG}}={\boldsymbol{\upbeta}}_{RG}^{\prime}\boldsymbol{\upgamma}, $$

(6.4)

$$ {R}_{RG}=\frac{k_I}{L_G}\sqrt{{\boldsymbol{\upbeta}}_{RG}^{\prime }{\boldsymbol{\Gamma} \boldsymbol{\upbeta}}_{RG}} $$

(6.5)

and

$$ {\mathbf{E}}_{RG}=\frac{k_I}{L_G}\frac{{\boldsymbol{\Gamma} \boldsymbol{\upbeta}}_{RG}}{\sqrt{{\boldsymbol{\upbeta}}_{RG}^{\prime }{\boldsymbol{\Gamma} \boldsymbol{\upbeta}}_{RG}}}, $$

(6.6)

respectively, where k_I is the standardized selection differential (or selection intensity) associated with the RLGSI, and L_G is the interval between selection cycles or the time required to complete a selection cycle using the RLGSI. Equations (6.4) to (6.6) depend only on GEBV information; thus, they are useful in testing populations.

6.1.2 Statistical Properties of RLGSI

Assuming that H = w′g and $ {I}_{\mathrm{RG}}={\boldsymbol{\upbeta}}_{RG}^{\prime}\boldsymbol{\upgamma} $ have a bivariate joint normal distribution, that β_RG = K_Gw, and that Γ, C, and w are known, it can be shown that the RLGSI has the following properties:

1. Matrices K_G and Q_G are idempotent ($ {\mathbf{K}}_G={\mathbf{K}}_G^2 $ and $ {\mathbf{Q}}_G={\mathbf{Q}}_G^2 $) and orthogonal (K_GQ_G = Q_GK_G = 0), that is, they are projectors. Matrix Q_G projects vector β = w into a space generated by the columns of matrix U′Γ due to the restriction U′Γβ = 0 used when f_R(β, v) (Eq. 6.1) is minimized with respect to vectors β and v, whereas matrix K_G projects w into a space perpendicular to that generated by the U′Γ matrix columns.
2. Because of the restriction U′Γβ = 0, matrix K_G projects vector w into a space smaller than the original space of w. The space reduction into which matrix K_G projects w is equal to the number of zeros that appears in Eq. (6.6).
3. Vector β_RG = K_Gw minimizes the mean square error under the restriction U′Γβ = 0.
4. The variance of $ {I}_{RG}={\boldsymbol{\upbeta}}_{RG}^{\prime}\boldsymbol{\upgamma} $ ($ {\sigma}_{I_{RG}}^2={\boldsymbol{\upbeta}}_{RG}^{\prime }{\boldsymbol{\Gamma} \boldsymbol{\upbeta}}_{RG} $) is equal to the covariance between $ {I}_{RG}={\boldsymbol{\upbeta}}_{RG}^{\prime}\boldsymbol{\upgamma} $ and H = w′g ($ {\sigma}_{HI_{RG}}={\mathbf{w}}^{\prime }{\boldsymbol{\Gamma} \boldsymbol{\upbeta}}_{RG} $).
5. The maximized correlation between H and I_RG is equal to $ {\rho}_{HI_{RG}}=\frac{\sigma_{I_{RG}}}{\sigma_H} $, where $ {\sigma}_{I_{RG}}=\sqrt{{\boldsymbol{\upbeta}}_{RG}^{\prime }{\boldsymbol{\Gamma} \boldsymbol{\upbeta}}_{RG}} $ and $ {\sigma}_H=\sqrt{{\mathbf{w}}^{\prime}\mathbf{Cw}} $ are the standard deviations of $ {I}_{RG}={\boldsymbol{\upbeta}}_{RG}^{\prime}\boldsymbol{\upgamma} $ and H = w′g respectively.
6. The variance of the predicted error, $ Var\left(H-{I}_{RG}\right)=\left(1-{\rho}_{HI_{RG}}^2\right){\sigma}_H^2 $, is minimal. Note that $ Var\left(H-{I}_{RG}\right)={\sigma}_{I_{RG}}^2+{\sigma}_H^2-2{\sigma}_{HI_{RG}} $, and when β_RG = K_Gw, $ {\sigma}_{I_{RG}}^2={\sigma}_{HI_{RG}} $, whence $ Var\left(H-{I}_{RG}\right)={\sigma}_H^2-{\sigma}_{I_{RG}}^2=\left(1-{\rho}_{HI_{RG}}^2\right){\sigma}_H^2 $ is minimal.

The statistical RLGSI properties are equal to the statistical RLPSI properties. Thus the RLGSI is an application of the RLPSI to the genomic selection context.
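
Properties 1 and 4 can be checked numerically. The following is a minimal sketch, assuming NumPy and using the estimated matrix Γ̂ and the economic weights w′ = [5 −0.1 −0.1] of Sect. 6.1.3 with one null restriction on the first trait; the variable names are illustrative only.

```python
import numpy as np

# Estimated genomic covariance matrix for GY, EHT, PHT (Sect. 6.1.3).
Gamma = np.array([[0.21, 2.95, 5.00],
                  [2.95, 42.41, 71.11],
                  [5.00, 71.11, 121.53]])
w = np.array([5.0, -0.1, -0.1])        # economic weights (Sect. 6.1.3)
U = np.array([[1.0], [0.0], [0.0]])    # U (transpose of U'): restrict trait 1

Q = U @ np.linalg.inv(U.T @ Gamma @ U) @ U.T @ Gamma   # Q_G
K = np.eye(3) - Q                                      # K_G
beta = K @ w                                           # beta_RG, Eq. (6.3)

# Property 1: K_G and Q_G are idempotent and mutually orthogonal.
idempotent = np.allclose(K @ K, K) and np.allclose(Q @ Q, Q)
orthogonal = np.allclose(K @ Q, 0.0) and np.allclose(Q @ K, 0.0)

# Property 4: Var(I_RG) equals Cov(I_RG, H).
var_index = beta @ Gamma @ beta
cov_index_H = w @ Gamma @ beta

print(idempotent, orthogonal, np.isclose(var_index, cov_index_H))
```

The check of property 4 works because K_G′ΓK_G = ΓK_G, which is what makes the variance of the index equal to its covariance with H.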

6.1.3 Numerical Examples

To estimate the parameters associated with the RLGSI, we use the real data set described in Chap. 5, Sect. 5.1.8, where we found that, in the testing population, the estimate of matrix Γ was $ \widehat{\boldsymbol{\Gamma}}=\left[\begin{array}{ccc}0.21& 2.95& 5.00\\ {}2.95& 42.41& 71.11\\ {}5.00& 71.11& 121.53\end{array}\right] $. We use this matrix and the GEBVs associated with the traits grain yield (GY, ton ha⁻¹), ear height (EHT, cm), and plant height (PHT, cm) to illustrate the RLGSI theoretical results.

Suppose that on the RLGSI expected genetic gain per trait we impose one and two null restrictions using matrices $ {\mathbf{U}}_1^{\prime }=\left[1\kern0.5em 0\kern0.5em 0\right] $ and $ {\mathbf{U}}_2^{\prime }=\left[\begin{array}{ccc}1& 0& 0\\ {}0& 1& 0\end{array}\right] $ (see Chap. 3, Sect. 3.1.3, for details about matrix U′). We need to estimate the RLGSI vector of coefficients (β_RG = K_Gw) as $ {\widehat{\boldsymbol{\upbeta}}}_{RG}={\widehat{\mathbf{K}}}_G\mathbf{w} $, where $ {\widehat{\mathbf{K}}}_G=\left[{\mathbf{I}}_3-{\widehat{\mathbf{Q}}}_G\right] $ and $ {\widehat{\mathbf{Q}}}_G=\mathbf{U}{\left({\mathbf{U}}^{\prime}\widehat{\boldsymbol{\Gamma}}\mathbf{U}\right)}^{-1}{\mathbf{U}}^{\prime}\widehat{\boldsymbol{\Gamma}} $ are estimates of matrices K_G = [I₃ − Q_G] and Q_G = U(U′ΓU)⁻¹U′Γ respectively, and I₃ is an identity matrix 3 × 3. The estimated Q_G matrices for restrictions $ {\mathbf{U}}_1^{\prime }=\left[1\kern0.5em 0\kern0.5em 0\right] $ and $ {\mathbf{U}}_2^{\prime }=\left[\begin{array}{ccc}1& 0& 0\\ {}0& 1& 0\end{array}\right] $ were $ {\widehat{\mathbf{Q}}}_{G_1}={\mathbf{U}}_1{\left({\mathbf{U}}_1^{\prime}\widehat{\boldsymbol{\Gamma}}{\mathbf{U}}_1\right)}^{-1}{\mathbf{U}}_1^{\prime}\widehat{\boldsymbol{\Gamma}}=\left[\begin{array}{ccc}1.0& 14.05& 23.81\\ {}0& 0& 0\\ {}0& 0& 0\end{array}\right] $ and $ {\widehat{\mathbf{Q}}}_{G_2}={\mathbf{U}}_2{\left({\mathbf{U}}_2^{\prime}\widehat{\boldsymbol{\Gamma}}{\mathbf{U}}_2\right)}^{-1}{\mathbf{U}}_2^{\prime}\widehat{\boldsymbol{\Gamma}}=\left[\begin{array}{ccc}1.0& 0& 11.18\\ {}0& 1.0& 0.90\\ {}0& 0& 0\end{array}\right] $ respectively, whereas the estimated K_G matrices for both restrictions were $ {\widehat{\mathbf{K}}}_{G_1}=\left[{\mathbf{I}}_3-{\widehat{\mathbf{Q}}}_{G_1}\right]=\left[\begin{array}{ccc}0& -14.05& -23.81\\ {}0& 1.0& 0\\ {}0& 0& 1.0\end{array}\right] $ and $ {\widehat{\mathbf{K}}}_{G_2}=\left[{\mathbf{I}}_3-{\widehat{\mathbf{Q}}}_{G_2}\right]=\left[\begin{array}{ccc}0& 0& -11.18\\ {}0& 0& -0.90\\ {}0& 0& 1.0\end{array}\right] $.

Let $ {\mathbf{w}}^{\prime }=\left[5\kern0.5em -0.1\kern0.5em -0.1\right] $ be the vector of economic weights; then the estimated RLGSI vector of coefficients for one and two null restrictions were $ {\widehat{\boldsymbol{\upbeta}}}_{RG_1}^{\prime }={\mathbf{w}}^{\prime }{\widehat{\mathbf{K}}}_{G_1}^{\prime }=\left[3.78\kern0.5em -0.1\kern0.5em -0.1\right] $ and $ {\widehat{\boldsymbol{\upbeta}}}_{RG_2}^{\prime }={\mathbf{w}}^{\prime }{\widehat{\mathbf{K}}}_{G_2}^{\prime }=\left[1.12\kern0.5em 0.09\kern0.5em -0.1\right] $ respectively, and the estimated RLGSI for both restrictions can be written as $ {\widehat{\boldsymbol{I}}}_{RG_1}=3.78{\mathrm{GEBV}}_1-0.1{\mathrm{GEBV}}_2-0.1{\mathrm{GEBV}}_3 $ and $ {\widehat{\boldsymbol{I}}}_{RG_2}=1.12{\mathrm{GEBV}}_1+0.09{\mathrm{GEBV}}_2-0.1{\mathrm{GEBV}}_3 $, where GEBV₁, GEBV₂, and GEBV₃ are the genomic estimated breeding values associated with traits GY, EHT, and PHT respectively in the testing population.
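
The estimates above can be reproduced with a few lines of linear algebra. The following is a minimal sketch, assuming NumPy; the helper name `rlgsi_coefficients` is hypothetical and not part of any package. Because Γ̂ is entered with the two-decimal rounding printed in the text, the results agree with the reported values only up to rounding.

```python
import numpy as np

# Estimated genomic covariance matrix for GY, EHT, PHT (Sect. 6.1.3).
Gamma = np.array([[0.21, 2.95, 5.00],
                  [2.95, 42.41, 71.11],
                  [5.00, 71.11, 121.53]])
w = np.array([5.0, -0.1, -0.1])   # economic weights


def rlgsi_coefficients(Gamma, U_t, w):
    """Return (Q_G, K_G, beta_RG) for a restriction matrix U' (one row per restricted trait)."""
    U = U_t.T
    # Q_G = U (U' Gamma U)^{-1} U' Gamma, computed with a linear solve.
    Q = U @ np.linalg.solve(U_t @ Gamma @ U, U_t @ Gamma)
    K = np.eye(Gamma.shape[0]) - Q
    return Q, K, K @ w


U1_t = np.array([[1.0, 0.0, 0.0]])                    # one null restriction
U2_t = np.array([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]])   # two null restrictions

Q1, K1, beta_rg1 = rlgsi_coefficients(Gamma, U1_t, w)
Q2, K2, beta_rg2 = rlgsi_coefficients(Gamma, U2_t, w)

print(np.round(Q2, 2))
print(np.round(beta_rg1, 2), np.round(beta_rg2, 2))
```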

Table 6.1 presents 20 genotypes selected from a population of 380 genotypes and the GEBVs in the testing population ranked according to the estimated RLGSI values for one restriction, where $ {\mathbf{U}}_1^{\prime }=\left[1\kern0.5em 0\kern0.5em 0\right] $. The estimated RLGSI values for genotypes 5 and 306 can be obtained as follows: $ {\widehat{I}}_{RG_5}=3.78\left(-0.6\right)-0.1\left(-8.67\right)-0.1\left(-15.97\right)=0.196 $ and $ {\widehat{I}}_{RG_{306}}=3.78(0.13)-0.1(1.31)-0.1(1.66)=0.194 $ respectively. This procedure is valid for any number of genotypes and GEBVs in the testing population.
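
For a whole testing population, the same calculation is a single matrix-vector product followed by a sort. The sketch below assumes NumPy and uses only the two genotypes quoted above; the PHT GEBV of genotype 5 is taken as −15.97, which is an assumption: it is the sign that reproduces the reported index value of 0.196.

```python
import numpy as np

# Rounded RLGSI coefficients for one null restriction (GY, EHT, PHT).
beta_rg1 = np.array([3.78, -0.1, -0.1])

# GEBVs quoted in the text for genotypes 5 and 306 (rows), traits in columns.
# Assumption: PHT of genotype 5 is -15.97 (sign chosen to match the reported 0.196).
genotype_ids = np.array([5, 306])
gebv = np.array([[-0.60, -8.67, -15.97],
                 [0.13, 1.31, 1.66]])

index_values = gebv @ beta_rg1          # one estimated RLGSI value per genotype
order = np.argsort(-index_values)       # positions from highest to lowest index value
ranked_ids = genotype_ids[order]

print(np.round(index_values, 3), ranked_ids)
```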

Table 6.1 Number of genotypes selected from 380 genotypes of a real testing population; genomic estimated breeding values (GEBVs) associated with three traits: grain yield (GY, ton ha⁻¹), ear height (EHT, cm), and plant height (PHT, cm) in the testing population, and estimated and ranked restricted linear genomic selection index (RLGSI) values obtained in the testing population for one null restriction

Assume a selection intensity of 10% ($ {k}_{I_G}=1.755 $); then the estimated RLGSI selection response and expected genetic gain per trait not including the interval length were $ {\widehat{R}}_{RG_1}={k}_{I_G}\sqrt{{\widehat{\boldsymbol{\upbeta}}}_{RG_1}^{\prime }{\widehat{\boldsymbol{\Gamma}}\widehat{\boldsymbol{\upbeta}}}_{RG_1}}=0.40 $ and $ {\widehat{\mathbf{E}}}_{RG_1}^{\prime }={k}_I\frac{{\widehat{\boldsymbol{\upbeta}}}_{RG_1}^{\prime}\widehat{\boldsymbol{\Gamma}}}{\sqrt{{\widehat{\boldsymbol{\upbeta}}}_{RG_1}^{\prime }{\widehat{\boldsymbol{\Gamma}}\widehat{\boldsymbol{\upbeta}}}_{RG_1}}}=\left[0\kern0.5em -1.42\kern0.5em -2.58\right] $ respectively. For two restrictions, with $ {\mathbf{U}}_2^{\prime }=\left[\begin{array}{lll}1& 0& 0\\ {}0& 1& 0\end{array}\right] $, the estimated RLGSI selection response and expected genetic gains not including the interval length were $ {\widehat{R}}_{RG_2}={k}_{I_G}\sqrt{{\widehat{\boldsymbol{\upbeta}}}_{RG_2}^{\prime }{\widehat{\boldsymbol{\Gamma}}\widehat{\boldsymbol{\upbeta}}}_{RG_2}}=0.23 $ and $ {\widehat{\mathbf{E}}}_{RG_2}^{\prime }={k}_I\frac{{\widehat{\boldsymbol{\upbeta}}}_{RG_2}^{\prime}\widehat{\boldsymbol{\Gamma}}}{\sqrt{{\widehat{\boldsymbol{\upbeta}}}_{RG_2}^{\prime }{\widehat{\boldsymbol{\Gamma}}\widehat{\boldsymbol{\upbeta}}}_{RG_2}}}=\left[0\kern0.5em 0\kern0.5em -2.29\right] $ respectively. When the number of restrictions increases, the estimated RLGSI selection response value decreases, whereas the number of zeros increases in the estimated RLGSI expected genetic gain per trait. The number of zeros in the estimated RLGSI expected genetic gain per trait is equal to the number of restrictions imposed on RLGSI by matrix U′, where each restriction appears as 1.
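
Equations (6.5) and (6.6) translate directly into code. The following is a minimal sketch, assuming NumPy; the function names are hypothetical, and the interval length is left at 1 so that the values correspond to those reported above without the interval length. The coefficients are recomputed without rounding because Γ̂ is close to singular, so that rounded coefficients would change the gains noticeably.

```python
import numpy as np

Gamma = np.array([[0.21, 2.95, 5.00],
                  [2.95, 42.41, 71.11],
                  [5.00, 71.11, 121.53]])
w = np.array([5.0, -0.1, -0.1])
k_I = 1.755   # selection intensity for 10% selected


def restricted_beta(Gamma, U_t, w):
    """beta_RG = [I - U (U' Gamma U)^{-1} U' Gamma] w, Eq. (6.3)."""
    U = U_t.T
    return w - U @ np.linalg.solve(U_t @ Gamma @ U, U_t @ Gamma @ w)


def response_and_gain(beta, Gamma, k_I, L=1.0):
    """Selection response (Eq. 6.5) and expected genetic gain per trait (Eq. 6.6)."""
    sd_index = np.sqrt(beta @ Gamma @ beta)
    return k_I * sd_index / L, k_I * (Gamma @ beta) / (sd_index * L)


U1_t = np.array([[1.0, 0.0, 0.0]])
U2_t = np.array([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]])

R1, E1 = response_and_gain(restricted_beta(Gamma, U1_t, w), Gamma, k_I)
R2, E2 = response_and_gain(restricted_beta(Gamma, U2_t, w), Gamma, k_I)

print(round(R1, 2), np.round(E1, 2))
print(round(R2, 2), np.round(E2, 2))
```

Passing `L=1.5` instead of the default would give the response and gains per year for the interval length used for the RLGSI later in this section.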

Figure 6.1 presents the frequency distribution of the estimated RLGSI values for one (Fig. 6.1a) and two null restrictions (Fig. 6.1b). For both restrictions the frequency distribution of the estimated RLGSI values approaches the normal distribution.

Now we use the simulated data set described in Chap. 2, Sect. 2.8.1, to compare RLPSI (restricted linear phenotypic selection index, Chap. 3 for details) efficiency versus RLGSI efficiency. Because this data set has four traits, each restriction matrix has four columns. Table 6.2 presents the estimated RLPSI and RLGSI selection response for one, two, and three null restrictions imposed by matrices $ {\mathbf{U}}_1^{\prime }=\left[1\kern0.5em 0\kern0.5em 0\kern0.5em 0\right] $, $ {\mathbf{U}}_2^{\prime }=\left[\begin{array}{llll}1& 0& 0& 0\\ {}0& 1& 0& 0\end{array}\right] $, and $ {\mathbf{U}}_3^{\prime }=\left[\begin{array}{llll}1& 0& 0& 0\\ {}0& 1& 0& 0\\ {}0& 0& 1& 0\end{array}\right] $ for five simulated selection cycles including and not including the interval between selection cycles. In each selection cycle, the sample size was equal to 500 genotypes, each with four repetitions and four traits, whereas the selection intensity was 10% (k_I = 1.755); the interval lengths for the RLPSI and RLGSI were 4 and 1.5 years (Beyene et al. 2015) respectively.

Table 6.2 Estimated restricted linear phenotypic selection index (RLPSI) and RLGSI selection responses for 1, 2, and 3 null restrictions for 5 simulated selection cycles including and not including the interval between selection cycles. The interval lengths for the RLPSI and the RLGSI were 4 and 1.5 years respectively

Table 6.2 is divided into two parts. The first part presents the estimated RLPSI selection responses, whereas the second part presents the estimated RLGSI selection responses. Columns 2, 3, and 4 in Table 6.2 present the estimated RLPSI and RLGSI selection responses not including the interval length, whereas columns 5, 6, and 7 present the estimated RLPSI and RLGSI selection responses including the interval length. The averages of the estimated RLPSI selection response not including the interval length for one, two, and three restrictions were 7.04, 5.50, and 3.90, whereas when the interval length was included, the averages were 1.76, 1.38, and 0.98 respectively. The averages of the estimated RLGSI selection response not including the interval length for one, two, and three restrictions were 5.04, 3.72, and 2.79, whereas when the interval length was included the averages were 3.36, 2.48, and 1.86 respectively. These results indicate that when the interval length was included in the estimation of the RLPSI and RLGSI selection responses, RLGSI efficiency was greater than RLPSI efficiency; conversely, when the interval length was not included, RLPSI efficiency was greater than RLGSI efficiency.
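
The effect of the interval length is simple arithmetic: each per-cycle average is divided by the number of years a cycle takes. A minimal sketch, assuming NumPy and using only the averages quoted above (the variable names are illustrative):

```python
import numpy as np

# Average selection responses per cycle for 1, 2, and 3 null restrictions (from the text).
rlpsi_per_cycle = np.array([7.04, 5.50, 3.90])
rlgsi_per_cycle = np.array([5.04, 3.72, 2.79])
L_rlpsi, L_rlgsi = 4.0, 1.5   # interval lengths in years

rlpsi_per_year = rlpsi_per_cycle / L_rlpsi
rlgsi_per_year = rlgsi_per_cycle / L_rlgsi

# Ratios above 1 favour the RLGSI, ratios below 1 favour the RLPSI.
ratio_per_cycle = rlgsi_per_cycle / rlpsi_per_cycle
ratio_per_year = rlgsi_per_year / rlpsi_per_year

print(np.round(rlpsi_per_year, 2), np.round(rlgsi_per_year, 2))
print(np.round(ratio_per_cycle, 2), np.round(ratio_per_year, 2))
```

The per-year ratio is the per-cycle ratio multiplied by 4/1.5, which is why the ranking of the two indices reverses once time is taken into account.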

Table 6.3 presents the estimated RLPSI (first part) and RLGSI (second part) expected genetic gain per trait not including the interval between selection cycles for one, two, and three null restrictions in five simulated selection cycles. In this case, RLPSI efficiency is greater than RLGSI efficiency because the averages of the estimated RLPSI expected genetic gain per trait were −2.52, 2.26, and 2.26 for one null restriction; 2.84 and 2.65 for two null restrictions; and 3.90 for three null restrictions. For the same set of restrictions, the averages of the estimated RLGSI expected genetic gain per trait were: −1.85, 1.13, and 2.06 for one null restriction; 1.52 and 2.19 for two null restrictions, and 2.79 for three null restrictions. However, divided by the interval length (4 years in the RLPSI), the averages of the estimated RLPSI expected genetic gain per trait were −0.63, 0.57, and 0.57 for one null restriction; 0.71 and 0.66 for two null restrictions, and 0.98 for three null restrictions. In a similar manner, dividing by the interval length (1.5 years in this case), the averages of the estimated RLGSI expected genetic gain per trait were −1.23, 0.75, and 1.37 for one restriction; 1.01 and 1.46 for two restrictions; and 1.86 for three restrictions.

Table 6.3 Estimated RLPSI and RLGSI expected genetic gain per trait for 1, 2, and 3 null restrictions for 5 simulated selection cycles (each with 4 traits) not including the interval length between selection cycles

Table 6.4 presents the estimated RLPSI heritability ($ {\widehat{h}}_{I_R}^2 $) values, the estimated restricted linear genomic selection index (RLGSI) accuracy ($ {\widehat{\rho}}_{HI_{RG}} $) values, the values of $ W=\frac{{\widehat{\rho}}_{HI_{RG}}}{{\widehat{h}}_{I_R}}{L}_{RP} $ (L_RP = 4), and the values of $ \widehat{p}=100\left({\widehat{\lambda}}_R-1\right) $, where $ {\widehat{\lambda}}_R={\widehat{\rho}}_{HI_R}/{\widehat{\rho}}_{HI_{RG}} $ and $ {\widehat{\rho}}_{HI_R} $ is the estimated RLPSI accuracy, for one, two, and three restrictions for five simulated selection cycles. The RLGSI interval length was L_RG = 1.5, whereas the averages of the values of W for one, two, and three restrictions were 1.22, 0.85, and 0.60; because L_RG exceeds each of these averages, the estimated Technow inequality (Technow et al. 2013), $ {L}_{RG}<\frac{{\widehat{\rho}}_{HI_{RG}}}{{\widehat{h}}_{I_R}}{L}_{RP} $ (Chap. 5, Eq. 5.18), did not hold. Thus, according to the Technow inequality, for this data set RLGSI efficiency in terms of time was not greater than RLPSI efficiency. The inequality failed because the estimated RLGSI accuracy was very low, whereas the estimated RLPSI heritability was high: the averages of the estimated RLGSI accuracy for one, two, and three null restrictions were 0.25, 0.19, and 0.14 respectively, and the averages of the estimated RLPSI heritability values were 0.70, 0.78, and 0.88 respectively.
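
The Technow inequality check can be written as a small function. The sketch below is illustrative only: the function name is hypothetical, and W is evaluated at the averaged accuracy and heritability values quoted in the text, so it differs slightly from the averages of the per-cycle W values (1.22, 0.85, and 0.60), which come from Table 6.4.

```python
def technow_check(rho_genomic, h2_phenotypic, L_phenotypic, L_genomic):
    """Return (holds, W), where W = (rho / h) * L_phenotypic and holds is L_genomic < W."""
    h = h2_phenotypic ** 0.5          # square root of the index heritability
    W = rho_genomic / h * L_phenotypic
    return L_genomic < W, W


avg_accuracy = [0.25, 0.19, 0.14]       # average RLGSI accuracy, 1-3 restrictions
avg_heritability = [0.70, 0.78, 0.88]   # average RLPSI heritability, 1-3 restrictions
L_RP, L_RG = 4.0, 1.5                   # interval lengths in years

results = [technow_check(rho, h2, L_RP, L_RG)
           for rho, h2 in zip(avg_accuracy, avg_heritability)]

for n_restrictions, (holds, W) in enumerate(results, start=1):
    print(n_restrictions, round(W, 2), holds)
```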

Table 6.4 Estimated RLPSI heritability ($ {\widehat{h}}_{I_R}^2 $), estimated RLGSI accuracy ($ {\widehat{\rho}}_{HI_{RG}} $), estimated values of $ W=\frac{{\widehat{\rho}}_{HI_{RG}}}{{\widehat{h}}_{I_R}}{L}_{RP} $ (L_RP = 4), and values of $ \widehat{p}=100\left({\widehat{\lambda}}_R-1\right) $, where $ {\widehat{\lambda}}_R={\widehat{\rho}}_{HI_R}/{\widehat{\rho}}_{HI_{RG}} $, and $ {\widehat{\rho}}_{HI_R} $ are the estimated RLPSI accuracy values, for 1, 2, and 3 restrictions for five simulated selection cycles

The last three columns of Table 6.4, from left to right, present the estimated p values, $ \widehat{p}=100\left({\widehat{\lambda}}_R-1\right) $, for one, two, and three null restrictions in five simulated selection cycles. The averages of the $ \widehat{p} $ values indicate that RLPSI efficiency at predicting the net genetic merit was 65.05%, 78.73%, and 74.09% greater than RLGSI efficiency for one, two, and three restrictions respectively. Thus, for this data set, the RLPSI was a better predictor of the net genetic merit than the RLGSI in each cycle.

6.2 The Predetermined Proportional Gain Linear Genomic Selection Index

6.2.1 Objective of the PPG-LGSI

Let $ {\mathbf{d}}^{\prime }=\left[{d}_1\kern0.5em {d}_2\kern0.5em \dots \kern0.5em {d}_r\right] $ be a vector 1 × r (r is the number of predetermined proportional gains) of the predetermined proportional gains imposed by the breeder, and assume that μ_q is the population mean of the qth trait before selection. The objective of the predetermined proportional gain linear genomic selection index (PPG-LGSI) is to change μ_q to μ_q + d_q in the testing population, where d_q is a predetermined change in μ_q. It is possible to solve this problem by minimizing the mean squared difference between I_G = β′γ and H = w′g, E[(H − I_G)²], under the restriction U′Γβ = θ_Gd, where θ_G is a proportionality constant, or under the restriction D′U′Γβ = 0, where $ {\mathbf{D}}^{\prime }=\left[\begin{array}{lllll}{d}_r& 0& \dots & 0& -{d}_1\\ {}0& {d}_r& \dots & 0& -{d}_2\\ {}\vdots & \vdots & \ddots & \vdots & \vdots \\ {}0& 0& \dots & {d}_r& -{d}_{r-1}\end{array}\right] $ is a matrix (r − 1) × r (see Chap. 3 for details), and d_q (q = 1, 2, …, r) is the qth element of vector $ {\mathbf{d}}^{\prime }=\left[{d}_1\kern0.5em {d}_2\kern0.5em \dots \kern0.5em {d}_r\right] $; U′ is an r × t matrix of 1s and 0s, and $ \boldsymbol{\Gamma} =\left\{{\sigma}_{\gamma_{q{q}^{\prime }}}\right\} $ (q, q′ = 1, 2, …, t, t = number of traits) is a covariance matrix of additive genomic breeding values, γ′ = [γ₁ γ₂ … γ_t].
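
The structure of D′ is easiest to see by building it from d. The following is a minimal sketch, assuming NumPy and r ≥ 2; the helper name `build_D_t` is hypothetical. The defining property is that D′d = 0, so that D′U′Γβ = 0 holds whenever U′Γβ is proportional to d.

```python
import numpy as np


def build_D_t(d):
    """Build the (r - 1) x r matrix D' with d_r on the diagonal and -d_q in the last column."""
    d = np.asarray(d, dtype=float)
    r = d.size
    if r < 2:
        raise ValueError("D' is only defined here for r >= 2")
    return np.hstack([d[-1] * np.eye(r - 1), -d[:-1, None]])


d = np.array([7.0, -3.0, 5.0])   # three predetermined proportional gains
D_t = build_D_t(d)

print(D_t)
print(D_t @ d)   # every entry is zero by construction
```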

6.2.2 The Maximized PPG-LGSI Parameters

In this subsection, we minimize E[(H − I_G)²] under the restriction D′U′Γβ = 0 and later under the restriction U′Γβ = θ_Gd. Under the restriction D′U′Γβ = 0, it is necessary to minimize the function

$$ {f}_P\left(\boldsymbol{\upbeta}, \mathbf{v}\right)={\boldsymbol{\upbeta}}^{\prime}\boldsymbol{\Gamma} \boldsymbol{\upbeta} +{\mathbf{w}}^{\prime}\mathbf{Cw}-2{\mathbf{w}}^{\prime}\boldsymbol{\Gamma} \boldsymbol{\upbeta} +2{\mathbf{v}}^{\prime }{\mathbf{D}}^{\prime }{\mathbf{U}}^{\prime}\boldsymbol{\Gamma} \boldsymbol{\upbeta} $$

(6.7)

with respect to β and $ {\mathbf{v}}^{\prime }=\left[{v}_1\kern0.5em {v}_2\kern0.5em \dots \kern0.5em {v}_{r-1}\right] $, where v is a vector of Lagrange multipliers. Equation (6.7) has the same mathematical form as Eq. (6.1), with U′ replaced by D′U′; thus, the vector of coefficients of the PPG-LGSI has the same form as that of the RLGSI (Eq. 6.3), i.e., the PPG-LGSI vector of coefficients is equal to

$$ {\boldsymbol{\upbeta}}_{PG}={\mathbf{K}}_P\mathbf{w}, $$

(6.8)

where now K_P = [I_t − Q_P], Q_P = UD(D′U′ΓUD)⁻¹D′U′Γ, w is a vector of economic weights, and I_t is an identity matrix t × t. When D′ = U′, β_PG = β_RG (the RLGSI vector of coefficients), and when U′ is a null matrix, β_PG = w (the LGSI vector of coefficients). This means that the PPG-LGSI includes the RLGSI and the LGSI as particular cases.

Under the restriction U′Γβ = θ_Gd (see Chap. 3 for details) the vector of coefficients of the PPG-LGSI can be written as

$$ {\boldsymbol{\upbeta}}_{PG}={\boldsymbol{\upbeta}}_{RG}+{\theta}_G\mathbf{U}{\left({\mathbf{U}}^{\prime}\boldsymbol{\Gamma} \mathbf{U}\right)}^{-1}\mathbf{d}, $$

(6.9)

where β_RG = K_Gw (Eq. 6.3), K_G = [I − Q_G], Q_G = U(U′ΓU)⁻¹U′Γ, and $ {\mathbf{d}}^{\prime }=\left[{d}_1\kern0.5em {d}_2\kern0.5em \dots \kern0.5em {d}_r\right] $ is the vector of the predetermined proportional gains imposed by the breeder. It can be shown that θ_G, the proportionality constant, can be written as

$$ {\theta}_G=\frac{{\mathbf{d}}^{\prime }{\left({\mathbf{U}}^{\prime}\boldsymbol{\Gamma} \mathbf{U}\right)}^{-1}{\mathbf{U}}^{\prime}\boldsymbol{\Gamma} \mathbf{w}}{{\mathbf{d}}^{\prime }{\left({\mathbf{U}}^{\prime}\boldsymbol{\Gamma} \mathbf{U}\right)}^{-1}\mathbf{d}}. $$

(6.10)
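
Equation (6.10) can be recovered with a short argument, sketched here under the assumption that U′ΓU is nonsingular. Write β = β_RG + θδ with δ = U(U′ΓU)⁻¹d, so that U′Γβ = θd for every θ because U′Γβ_RG = 0. Since β′_RGΓδ = 0 and δ′Γδ = d′(U′ΓU)⁻¹d, the mean squared difference E[(H − I_G)²] depends on θ only through

$$ q\left(\theta \right)={\theta}^2{\mathbf{d}}^{\prime }{\left({\mathbf{U}}^{\prime}\boldsymbol{\Gamma} \mathbf{U}\right)}^{-1}\mathbf{d}-2\theta {\mathbf{d}}^{\prime }{\left({\mathbf{U}}^{\prime}\boldsymbol{\Gamma} \mathbf{U}\right)}^{-1}{\mathbf{U}}^{\prime}\boldsymbol{\Gamma} \mathbf{w}, $$

and setting the derivative of q(θ) with respect to θ to zero gives the ratio in Eq. (6.10).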

When θ_G = 0, β_PG = β_RG, and when U′ is a null matrix, β_PG = w. Equations (6.8) and (6.9) give the same results, that is, both equations express the same result in a different mathematical way.
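
The equivalence of Eqs. (6.8) and (6.9) can be verified numerically. The following is a minimal sketch, assuming NumPy and using the estimated Γ̂ and w of Sect. 6.1.3 together with, as an illustrative choice, two predetermined gains d′ = [7 −3] on the first two traits.

```python
import numpy as np

Gamma = np.array([[0.21, 2.95, 5.00],
                  [2.95, 42.41, 71.11],
                  [5.00, 71.11, 121.53]])
w = np.array([5.0, -0.1, -0.1])
U_t = np.array([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]])   # U'
U = U_t.T
d = np.array([7.0, -3.0])                            # predetermined proportional gains

M = U_t @ Gamma @ U            # U' Gamma U
c = U_t @ Gamma @ w            # U' Gamma w

# Eq. (6.8): beta_PG = [I - Q_P] w with Q_P = U D (D'U' Gamma U D)^{-1} D'U' Gamma.
D_t = np.array([[d[1], -d[0]]])                      # D' for r = 2, so that D'd = 0
D = D_t.T
Q_P = U @ D @ np.linalg.inv(D_t @ M @ D) @ D_t @ U_t @ Gamma
beta_pg_eq68 = (np.eye(3) - Q_P) @ w

# Eqs. (6.9)-(6.10): beta_PG = beta_RG + theta_G U (U' Gamma U)^{-1} d.
Minv_d = np.linalg.solve(M, d)
theta_G = (Minv_d @ c) / (d @ Minv_d)                # M is symmetric
beta_rg = w - U @ np.linalg.solve(M, c)
beta_pg_eq69 = beta_rg + theta_G * (U @ Minv_d)

print(np.round(beta_pg_eq68, 4), np.round(beta_pg_eq69, 4))
```

Both routes impose the same feasible set, because D′U′Γβ = 0 holds exactly when U′Γβ is a multiple of d, which is why the two vectors coincide.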

The maximized selection response and expected genetic gain per trait of the PPG-LGSI can be written as

$$ {R}_{PG}=\frac{k_I}{L_G}\sqrt{{\boldsymbol{\upbeta}}_{PG}^{\prime }{\boldsymbol{\Gamma} \boldsymbol{\upbeta}}_{PG}} $$

(6.11)

and

$$ {\mathbf{E}}_{\mathrm{PG}}=\frac{k_I}{L_G}\frac{{\boldsymbol{\Gamma} \boldsymbol{\upbeta}}_{PG}}{\sqrt{{\boldsymbol{\upbeta}}_{PG}^{\prime }{\boldsymbol{\Gamma} \boldsymbol{\upbeta}}_{PG}}}, $$

(6.12)

respectively, where L_G is the time required to complete a selection cycle using the PPG-LGSI. Equations (6.11) and (6.12) depend only on GEBV information.

6.2.3 Statistical Properties of the PPG-LGSI

Assuming that H = w′g and the PPG-LGSI ($ {I}_{PG}={\boldsymbol{\upbeta}}_{PG}^{\prime}\boldsymbol{\upgamma} $) have a bivariate joint normal distribution, that β_PG = K_Pw, and that Γ, C, and w are known, it can be shown that the PPG-LGSI has the following statistical properties:

1. The vector β_PG = K_Pw minimizes the mean square error under the restriction D′U′Γβ = 0.
2. The variance of $ {I}_{PG}={\boldsymbol{\upbeta}}_{PG}^{\prime}\;\boldsymbol{\upgamma} $ ($ {\sigma}_{I_{PG}}^2={\boldsymbol{\upbeta}}_{PG}^{\prime }{\boldsymbol{\Gamma} \boldsymbol{\upbeta}}_{PG} $) is equal to the covariance between $ {I}_{PG}={\boldsymbol{\upbeta}}_{PG}^{\prime}\;\boldsymbol{\upgamma} $ and H = w′g ($ {\sigma}_{HI_{PG}}={\mathbf{w}}^{\prime }{\boldsymbol{\Gamma} \boldsymbol{\upbeta}}_{PG} $).
3. The maximized correlation between H and I_PG (also called PPG-LGSI accuracy) is equal to $ {\rho}_{HI_{PG}}=\frac{\sigma_{I_{PG}}}{\sigma_H} $, where $ {\sigma}_{I_{PG}}=\sqrt{{\boldsymbol{\upbeta}}_{PG}^{\prime}\;{\boldsymbol{\Gamma} \boldsymbol{\upbeta}}_{PG}} $ and $ {\sigma}_H=\sqrt{{\mathbf{w}}^{\prime}\mathbf{Cw}} $ are the standard deviations of $ {I}_{PG}={\boldsymbol{\upbeta}}_{PG}^{\prime}\;\boldsymbol{\upgamma} $ and H = w′g respectively.
4. The variance of the predicted error, $ Var\left(H-{I}_{PG}\right)=\left(1-{\rho}_{HI_{PG}}^2\right){\sigma}_H^2 $, is minimal.

The statistical PPG-LGSI properties are equal to the statistical PPG-LPSI properties; thus, the PPG-LGSI is an application of the PPG-LPSI to the genomic selection context.

6.2.4 Numerical Example

To illustrate the PPG-LGSI theory, we use the estimated matrix $ \widehat{\boldsymbol{\Gamma}}=\left[\begin{array}{lll}0.21&\ 2.95& \kern0.30em 5.00\\ {}2.95& 42.41& 71.11\\ {}5.00& 71.11& 121.53\end{array}\right] $ and the GEBVs associated with the traits GY (ton ha⁻¹), EHT (cm), and PHT (cm), described in Sect. 6.1.3.

It is necessary to estimate the PPG-LGSI vector of coefficients β_PG = β_RG + θ_GU(U′ΓU)⁻¹d (Eqs. 6.9 and 6.10). In Sect. 6.1.3, we showed that the estimated vectors of coefficients of β_RG = K_Gw for the null restrictions $ {\mathbf{U}}_1^{\prime }=\left[1\kern0.5em 0\kern0.5em 0\right] $ and $ {\mathbf{U}}_2^{\prime }=\left[\begin{array}{lll}1& 0& 0\\ {}0& 1& 0\end{array}\right] $ were $ {\widehat{\boldsymbol{\upbeta}}}_{RG1}^{\prime }={\mathbf{w}}^{\prime }{\widehat{\mathbf{K}}}_{G1}^{\prime }=\left[3.78\kern0.5em -0.1\kern0.5em -0.1\right] $ and $ {\widehat{\boldsymbol{\upbeta}}}_{RG2}^{\prime }={\mathbf{w}}^{\prime }{\widehat{\mathbf{K}}}_{G2}^{\prime }=\left[1.12\kern0.5em 0.09\kern0.5em -0.1\right] $ respectively, where $ {\mathbf{w}}^{\prime }=\left[5\kern0.5em -0.1\kern0.5em -0.1\right] $. This means that to estimate β_PG = β_RG + θ_GU(U′ΓU)⁻¹d, we need only to estimate θ_GU(U′ΓU)⁻¹d for both sets of restrictions.

Consider matrix $ {\mathbf{U}}_1^{\prime }=\left[1\kern0.5em 0\kern0.5em 0\right] $ and let d₁ = 7.0 be the predetermined proportional gain restriction for trait 1. We can estimate θ_G and U(U′ΓU)⁻¹d as $ {\widehat{\theta}}_{G1}=\frac{7.0{\left({\mathbf{U}}_1^{\prime}\widehat{\boldsymbol{\Gamma}}{\mathbf{U}}_1\right)}^{-1}{\mathbf{U}}_1^{\prime}\widehat{\boldsymbol{\Gamma}}\mathbf{w}}{7.0{\left({\mathbf{U}}_1^{\prime}\widehat{\boldsymbol{\Gamma}}{\mathbf{U}}_1\right)}^{-1}7.0}=0.036 $ and $ {\mathbf{U}}_1{\left({\mathbf{U}}_1^{\prime}\widehat{\boldsymbol{\Gamma}}{\mathbf{U}}_1\right)}^{-1}7.0=\left[\begin{array}{c}33.333\\ {}0\\ {}0\end{array}\right] $, whence the PPG-LGSI vector of coefficients was $ {\widehat{\boldsymbol{\upbeta}}}_{PG_1}={\widehat{\boldsymbol{\upbeta}}}_{RG_1}+{\widehat{\theta}}_{G_1}{\mathbf{U}}_1{\left({\mathbf{U}}_1^{\prime}\widehat{\boldsymbol{\Gamma}}{\mathbf{U}}_1\right)}^{-1}7.0=\left[\begin{array}{c}5.0\\ {}-0.1\\ {}-0.1\end{array}\right] $, and the estimated PPG-LGSI was $ {\widehat{I}}_{PG_1}=5.0{\mathrm{GEBV}}_1-0.1{\mathrm{GEBV}}_2-0.1{\mathrm{GEBV}}_3 $. In a similar manner, we can estimate the PPG-LGSI vector of coefficients under restrictions $ {\mathbf{U}}_2^{\prime }=\left[\begin{array}{ccc}1& 0& 0\\ {}0& 1& 0\end{array}\right] $ and $ {\mathbf{d}}_2^{\prime }=\left[7\kern0.5em -3\right] $. In this case, $ {\widehat{\boldsymbol{\upbeta}}}_{PG_2}={\widehat{\boldsymbol{\upbeta}}}_{RG_2}+{\widehat{\theta}}_{G_2}{\mathbf{U}}_2{\left({\mathbf{U}}_2^{\prime}\widehat{\boldsymbol{\Gamma}}{\mathbf{U}}_2\right)}^{-1}{\mathbf{d}}_2=\left[\begin{array}{l}\kern0.65em 4.97\\ {}-0.18\\ {}-0.10\end{array}\right] $ and the estimated PPG-LGSI was $ {\widehat{I}}_{PG_2}=4.97{\mathrm{GEBV}}_1-0.18{\mathrm{GEBV}}_2-0.1{\mathrm{GEBV}}_3 $.
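
These estimates follow directly from Eqs. (6.9) and (6.10). The sketch below assumes NumPy; the helper name `ppg_lgsi_coefficients` is hypothetical. Since Γ̂ is entered with the rounding printed in the text, the output agrees with the reported coefficients only up to rounding.

```python
import numpy as np

Gamma = np.array([[0.21, 2.95, 5.00],
                  [2.95, 42.41, 71.11],
                  [5.00, 71.11, 121.53]])
w = np.array([5.0, -0.1, -0.1])


def ppg_lgsi_coefficients(Gamma, U_t, w, d):
    """Return (theta_G, beta_PG) from Eqs. (6.10) and (6.9)."""
    U = U_t.T
    M = U_t @ Gamma @ U                      # U' Gamma U
    c = U_t @ Gamma @ w                      # U' Gamma w
    Minv_d = np.linalg.solve(M, d)           # (U' Gamma U)^{-1} d
    theta = (Minv_d @ c) / (d @ Minv_d)      # Eq. (6.10); M is symmetric
    beta_rg = w - U @ np.linalg.solve(M, c)  # RLGSI coefficients, Eq. (6.3)
    return theta, beta_rg + theta * (U @ Minv_d)


U1_t = np.array([[1.0, 0.0, 0.0]])
U2_t = np.array([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]])

theta1, beta_pg1 = ppg_lgsi_coefficients(Gamma, U1_t, w, np.array([7.0]))
theta2, beta_pg2 = ppg_lgsi_coefficients(Gamma, U2_t, w, np.array([7.0, -3.0]))

print(round(theta1, 3), np.round(beta_pg1, 2))
print(np.round(beta_pg2, 2))
```

With a single predetermined gain the restriction only fixes the sign of the gain for trait 1, so the PPG-LGSI coefficients coincide with the economic weights w, as the reported vector [5.0 −0.1 −0.1] shows.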

Figure 6.2 presents the frequency distribution of the estimated PPG-LGSI values for one (Fig. 6.2a) and two (Fig. 6.2b) predetermined restrictions, d = 7 and $ {\mathbf{d}}^{\prime }=\left[7\kern0.5em -3\right] $ respectively, obtained in a real testing population for one selection cycle in one environment. For both restrictions, the frequency distribution of the estimated PPG-LGSI values approaches the normal distribution.

Assume a selection intensity of 10% ($ {k}_{I_G}=1.755 $); then, for one predetermined restriction, where $ {\mathbf{U}}_1^{\prime }=\left[1\kern0.5em 0\kern0.5em 0\right] $ and d₁ = 7.0, the estimated PPG-LGSI selection response and expected genetic gain per trait, not including the interval length, were $ {\widehat{R}}_{PG_1}={k}_{I_G}\sqrt{{\widehat{\boldsymbol{\upbeta}}}_{PG_1}^{\prime }{\widehat{\boldsymbol{\Gamma}}\widehat{\boldsymbol{\upbeta}}}_{PG_1}}=1.05 $ and $ {\widehat{\mathbf{E}}}_{PG_1}^{\prime }={k}_I\frac{{\widehat{\boldsymbol{\upbeta}}}_{PG_1}^{\prime}\widehat{\boldsymbol{\Gamma}}}{\sqrt{{\widehat{\boldsymbol{\upbeta}}}_{PG_1}^{\prime }{\widehat{\boldsymbol{\Gamma}}\widehat{\boldsymbol{\upbeta}}}_{PG_1}}}=\left[0.74\kern0.5em 9.92\kern0.5em 16.54\right] $ respectively. For two restrictions, with $ {\mathbf{U}}_2^{\prime }=\left[\begin{array}{lll}1& 0& 0\\ {}0& 1& 0\end{array}\right] $ and $ {\mathbf{d}}^{\prime }=\left[7\kern0.5em -3\right] $, the estimated PPG-LGSI selection response and expected genetic gains, not including the interval length, were $ {\widehat{R}}_{PG_2}={k}_{I_G}\sqrt{{\widehat{\boldsymbol{\upbeta}}}_{PG_2}^{\prime }{\widehat{\boldsymbol{\Gamma}}\widehat{\boldsymbol{\upbeta}}}_{PG_2}}=0.52 $ and $ {\widehat{\mathbf{E}}}_{PG_2}^{\prime }={k}_I\frac{{\widehat{\boldsymbol{\upbeta}}}_{PG_2}^{\prime}\widehat{\boldsymbol{\Gamma}}}{\sqrt{{\widehat{\boldsymbol{\upbeta}}}_{PG_2}^{\prime }{\widehat{\boldsymbol{\Gamma}}\widehat{\boldsymbol{\upbeta}}}_{PG_2}}}=\left[0.11\kern0.5em -0.05\kern0.5em 0.14\right] $ respectively.
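
Equations (6.11) and (6.12) can be evaluated in the same way as their RLGSI counterparts. The following is a minimal sketch, assuming NumPy, with hypothetical function names and the interval length set to 1. Because Γ̂ is close to singular and is entered here with two-decimal rounding, individual entries computed this way may differ somewhat from the reported figures.

```python
import numpy as np

Gamma = np.array([[0.21, 2.95, 5.00],
                  [2.95, 42.41, 71.11],
                  [5.00, 71.11, 121.53]])
w = np.array([5.0, -0.1, -0.1])
k_I = 1.755   # selection intensity for 10% selected


def ppg_beta(Gamma, U_t, w, d):
    """PPG-LGSI coefficients from Eqs. (6.9) and (6.10)."""
    U = U_t.T
    M = U_t @ Gamma @ U
    c = U_t @ Gamma @ w
    Minv_d = np.linalg.solve(M, d)
    theta = (Minv_d @ c) / (d @ Minv_d)
    return w - U @ np.linalg.solve(M, c) + theta * (U @ Minv_d)


def response_and_gain(beta, Gamma, k_I, L=1.0):
    """Selection response (Eq. 6.11) and expected genetic gain per trait (Eq. 6.12)."""
    sd_index = np.sqrt(beta @ Gamma @ beta)
    return k_I * sd_index / L, k_I * (Gamma @ beta) / (sd_index * L)


U1_t = np.array([[1.0, 0.0, 0.0]])
U2_t = np.array([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]])

R1, E1 = response_and_gain(ppg_beta(Gamma, U1_t, w, np.array([7.0])), Gamma, k_I)
R2, E2 = response_and_gain(ppg_beta(Gamma, U2_t, w, np.array([7.0, -3.0])), Gamma, k_I)

print(round(R1, 2), np.round(E1, 2))
print(round(R2, 2), np.round(E2, 2))
print(E2[0] / E2[1])   # ratio of the first two gains; equals 7 / (-3) by the restriction
```

The last line confirms that the gains of the two restricted traits are in the predetermined proportion 7 : −3, which is the defining property of the index.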

Now, we use the simulated data set described in Chap. 2, Sect. 2.8.1 to compare PPG-LGSI efficiency versus predetermined proportional gain linear phenotypic selection index (PPG-LPSI) efficiency. Let $ {\mathbf{U}}_1^{\prime }=\left[1\kern0.5em 0\kern0.5em 0\right] $, $ {\mathbf{U}}_2^{\prime }=\left[\begin{array}{lll}1& 0& 0\\ {}0& 1& 0\end{array}\right] $, and $ {\mathbf{U}}_3^{\prime }=\left[\begin{array}{llll}1& 0& 0& 0\\ {}0& 1& 0& 0\\ {}0& 0& 1& 0\end{array}\right] $ be the matrices and d₁ = 7, $ {\mathbf{d}}_2^{\prime }=\left[7\kern0.5em -3\right] $, and $ {\mathbf{d}}_3^{\prime }=\left[7\kern0.5em -3\kern0.5em 5\right] $ the vectors for one, two, and three predetermined restrictions respectively. Table 6.5 presents the estimated PPG-LPSI and PPG-LGSI selection response for each predetermined restriction in five simulated selection cycles including and not including the interval between selection cycles (4 years for the PPG-LPSI and 1.5 years for the PPG-LGSI); estimated PPG-LPSI and PPG-LGSI accuracy; and estimated variance of the predicted error (VPE). In each selection cycle, the sample size was equal to 500 genotypes, each with four repetitions and four traits. The selection intensity was 10% (k_I = 1.755).

Table 6.5 Estimated predetermined proportional gain linear phenotypic and genomic selection index (PPG-LPSI and PPG-LGSI respectively) selection responses for 1, 2 and 3 predetermined restrictions for five simulated selection cycles including and not including the interval between selection cycles (4 years for the PPG-LPSI and 1.5 years for the PPG-LGSI); estimated PPG-LPSI and PPG-LGSI accuracy and estimated variance of the predicted error (VPE)

The averages of the estimated PPG-LPSI selection response not including the interval length were 15.14, 14.87, and 13.30, whereas when the interval length was included, the average selection responses were 3.79, 3.72, and 3.33, for one, two, and three predetermined restrictions respectively (Table 6.5). The averages of the estimated PPG-LGSI selection responses not including the interval length for one, two, and three predetermined restrictions were 14.48, 13.47, and 11.26 respectively, and when the interval length was included, the selection responses were 9.65, 8.98, and 7.51 respectively (Table 6.5). These results indicate that PPG-LGSI efficiency was greater than PPG-LPSI efficiency when the interval length was included in the estimated selection responses; conversely, PPG-LPSI efficiency was higher than PPG-LGSI efficiency when the interval length was not included.
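The per-year figures quoted above follow from dividing each average response by the interval between selection cycles (4 years for the PPG-LPSI and 1.5 years for the PPG-LGSI). The following is a minimal sketch of that arithmetic using only the averages reported in the text; the variable names are illustrative and not part of the original chapter.

```python
# Average selection responses per cycle quoted in the text (Table 6.5 averages)
# for one, two, and three predetermined restrictions.
lpsi_per_cycle = [15.14, 14.87, 13.30]  # PPG-LPSI
lgsi_per_cycle = [14.48, 13.47, 11.26]  # PPG-LGSI

L_P = 4.0  # years between PPG-LPSI selection cycles
L_G = 1.5  # years between PPG-LGSI selection cycles

# Response per unit of time: divide by the interval length.
lpsi_per_year = [r / L_P for r in lpsi_per_cycle]
lgsi_per_year = [r / L_G for r in lgsi_per_cycle]

for n, (p, g) in enumerate(zip(lpsi_per_year, lgsi_per_year), start=1):
    print(f"{n} restriction(s): PPG-LPSI {p:.2f} per year, PPG-LGSI {g:.2f} per year")
```

Per cycle the phenotypic index has the larger response, but per year the ranking reverses because the genomic index completes a cycle in 1.5 rather than 4 years.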

The averages of the estimated VPE values of the PPG-LPSI for one, two, and three predetermined restrictions were 22.42, 30.56, and 41.17 respectively, whereas the estimated VPE values of the PPG-LGSI (see Sect. 6.2.3 for details) were 59.80, 66.95, and 83.98 respectively; that is, in all selection cycles, the VPE of the PPG-LPSI was lower than that of the PPG-LGSI. This means that for this data set, the PPG-LPSI was a better predictor of the net genetic merit than the PPG-LGSI. These results can be explained by observing that the averages of the estimated PPG-LPSI accuracies were 0.88, 0.86, and 0.77, whereas the estimated PPG-LGSI accuracies were 0.65, 0.68, and 0.57 for each predetermined restriction; that is, the estimated PPG-LGSI accuracies were lower than the estimated PPG-LPSI accuracies for this data set.

Table 6.6 presents the estimated PPG-LPSI heritability ($ {\widehat{h}}_P^2 $) values, the values of $ {W}_P=\frac{{\widehat{\rho}}_{HI_G}}{{\widehat{h}}_P}{L}_P $ (L_P = 4), the ratio of the estimated PPG-LPSI accuracy ($ {\widehat{\rho}}_{HI_P} $) to the estimated PPG-LGSI accuracy ($ {\widehat{\rho}}_{HI_{PG}} $), i.e., $ {\widehat{\lambda}}_P={\widehat{\rho}}_{HI_P}/{\widehat{\rho}}_{HI_{PG}} $, and, finally, the values of $ \widehat{p}=100\left({\widehat{\lambda}}_P-1\right) $ for one, two, and three predetermined restrictions for five simulated selection cycles.

Table 6.6 Estimated PPG-LPSI heritability ($ {\widehat{h}}_P^2 $), values of $ {W}_P=\frac{{\widehat{\rho}}_{HI_G}}{{\widehat{h}}_P}{L}_P $ (L_P = 4), and the ratio of the estimated PPG-LPSI accuracy ($ {\widehat{\rho}}_{HI_P} $) to the estimated PPG-LGSI accuracy ($ {\widehat{\rho}}_{HI_{PG}} $): $ {\widehat{\lambda}}_P={\widehat{\rho}}_{HI_P}/{\widehat{\rho}}_{HI_{PG}} $, and values of $ \widehat{p}=100\left({\widehat{\lambda}}_P-1\right) $ for 1, 2 and 3 predetermined restrictions for five simulated selection cycles

The averages of the W_P values for one, two, and three predetermined restrictions were 3.29, 3.12, and 2.53, respectively, whereas the PPG-LGSI interval length was 1.5 (L_G = 1.5). This means that the estimated Technow inequality, $ {L}_G<\frac{{\widehat{\rho}}_{HI_G}}{{\widehat{h}}_P}{L}_P $ (see Chap. 5, Eq. 5.18) was true. Thus, PPG-LGSI efficiency in terms of time was greater than PPG-LPSI efficiency for this data set. These results coincide with those obtained earlier in this chapter, when we compared PPG-LGSI efficiency versus PPG-LPSI efficiency in terms of interval length. However, the average values of $ \widehat{p}=100\left({\widehat{\lambda}}_P-1\right) $ (see Chap. 5, Eq. 5.15) were, in percentage terms, 16.80%, 20.76%, and 25.85% for each restriction. These latter results indicate that for this data set, the PPG-LPSI was a better predictor of the net genetic merit than the PPG-LGSI. This is because the estimated PPG-LPSI accuracies were higher than the estimated PPG-LGSI accuracies for this data set. We found similar results when we compared the PPG-LPSI VPE versus PPG-LGSI VPE (Table 6.5).
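The time-efficiency comparison above reduces to checking whether L_G is smaller than W_P. A minimal sketch of that check, using the average W_P values quoted in the text, could look as follows; the function name `technow_holds` is a hypothetical helper introduced here for illustration.

```python
def technow_holds(L_G: float, W: float) -> bool:
    """Return True when L_G < W, i.e. when the genomic index is more efficient per unit of time."""
    return L_G < W


L_G = 1.5  # PPG-LGSI interval length (years)

# Average W_P values quoted in the text for 1, 2, and 3 predetermined restrictions.
W_P_averages = {1: 3.29, 2: 3.12, 3: 2.53}

results = {n: technow_holds(L_G, W) for n, W in W_P_averages.items()}
for n, ok in results.items():
    print(f"{n} restriction(s): L_G < W_P is {ok}")
```

The inequality holds for all three restriction levels, which is the basis for the statement that the PPG-LGSI was more efficient than the PPG-LPSI in terms of time.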

6.3 The Combined Restricted Linear Genomic Selection Index

The combined restricted linear genomic selection index (CRLGSI) is based on the RLPSI (Chap. 3) and combined linear genomic selection index (CLGSI, Chap. 5) theory. In the RLPSI, the breeder’s objective is to improve only (t − r) of t (r < t) traits, leaving r of them fixed; the same is true for the CRLGSI, but in the latter case, it is necessary to impose 2r restrictions, i.e., we need to fix r traits and their associated r GEBVs to obtain results similar to those obtained with the RLPSI. This is the main difference between the CRLGSI and the RLPSI.

It can be shown that Cov(I_C, a_C) = Ψ_Cβ_C is the covariance between the breeding value vector ($ {\boldsymbol{a}}_C^{\prime }=\left[{\mathbf{g}}^{\prime}\kern0.5em {\boldsymbol{\upgamma}}^{\prime}\right] $) and the CLGSI, $ {I}_C={\boldsymbol{\upbeta}}_C^{\prime }{\mathbf{t}}_C $ (see Chap. 5 for details), where $ {\mathbf{t}}_C^{\prime }=\left[{\mathbf{y}}^{\prime}\kern0.5em {\boldsymbol{\upgamma}}^{\prime}\right] $. In the CRLGSI, we want some covariances between the linear combinations of a_C ($ {\mathbf{U}}_C^{\prime }{\mathbf{a}}_C $) and CLGSI to be zero, i.e., $ Cov\left({\mathrm{I}}_{\mathrm{C}},{\mathbf{U}}_C^{\prime }{\mathbf{a}}_C\right)={\mathbf{U}}_C^{\prime }{\boldsymbol{\Psi}}_C{\boldsymbol{\upbeta}}_C=\mathbf{0} $, where $ {\mathbf{U}}_C^{\prime } $ is a matrix 2(t − 1) × 2t of 1s and 0s (1 indicates that the trait and its associated GEBV are restricted, and 0 that the trait and its GEBV have no restrictions) and $ {\boldsymbol{\Psi}}_C=\left[\begin{array}{cc}\mathbf{C}& \boldsymbol{\Gamma} \\ {}\boldsymbol{\Gamma} & \boldsymbol{\Gamma} \end{array}\right] $ is a block covariance matrix of $ {\mathbf{a}}_C^{\prime }=\left[{\mathbf{g}}^{\prime}\kern0.5em {\boldsymbol{\upgamma}}^{\prime}\right] $ where C and Γ are the covariance matrices of breeding (g) and genomic (γ) values respectively. This problem can be solved by minimizing the mean squared difference between the CLGSI and H (E[(H − I_C)²]) under the restriction $ {\mathbf{U}}_C^{\prime }{\boldsymbol{\Psi}}_C{\boldsymbol{\upbeta}}_C=\mathbf{0} $ similar to the RLGSI in Sect. 6.1.

6.3.1 The Maximized CRLGSI Parameters

Let $ {\mathbf{T}}_C=\left[\begin{array}{cc}\mathbf{P}& \boldsymbol{\Gamma} \\ {}\boldsymbol{\Gamma} & \boldsymbol{\Gamma} \end{array}\right] $ be the block covariance matrix of $ {\mathbf{t}}_C^{\prime }=\left[{\mathbf{y}}^{\prime}\kern0.5em {\boldsymbol{\upgamma}}^{\prime}\right] $ where P and Γ are the covariance matrices of phenotypic (y) and genomic (γ) values respectively. Based on the Eq. (6.1) result, it can be shown that the CRLGSI vector of coefficients that minimizes E[(H − I_C)²] under the restriction $ {\mathbf{U}}_C^{\prime }{\boldsymbol{\Psi}}_C{\boldsymbol{\upbeta}}_C=\mathbf{0} $ is

$$ {\boldsymbol{\upbeta}}_{CR}={\mathbf{K}}_C{\boldsymbol{\upbeta}}_C, $$

(6.13)

where K_C = [I − Q_C], $ {\mathbf{Q}}_C={\mathbf{T}}_C^{-1}{\boldsymbol{\Phi}}_C{\left({\boldsymbol{\Phi}}_C^{\prime }{\mathbf{T}}_C^{-1}{\boldsymbol{\Phi}}_C\right)}^{-1}{\boldsymbol{\Phi}}_C^{\prime } $, $ {\boldsymbol{\Phi}}_C^{\prime }={\mathbf{U}}_C^{\prime }{\boldsymbol{\Psi}}_C $, and $ {\boldsymbol{\upbeta}}_C={\mathbf{T}}_C^{-1}{\boldsymbol{\Psi}}_C{\mathbf{a}}_C $ (the vector of coefficients of the CLGSI, see Chap. 5 for details); $ {\mathbf{T}}_C^{-1} $ is the inverse of matrix T_C, and I is a 2t × 2t identity matrix. When no restrictions are imposed on any of the traits, $ {\mathbf{U}}_C^{\prime } $ is a null matrix and β_CR = β_C (the vector of coefficients of the CLGSI). That is, the CRLGSI is more general than the CLGSI. Similar to the RLPSI and the RLGSI, matrices K_C and Q_C are idempotent ($ {\mathbf{K}}_C={\mathbf{K}}_C^2 $ and $ {\mathbf{Q}}_C={\mathbf{Q}}_C^2 $) and orthogonal (K_CQ_C = Q_CK_C = 0), that is, K_C and Q_C are projectors. Thus, we can assume that the CRLGSI has similar properties to those described for the RLPSI (see Chap. 3 for details) when matrices $ {\boldsymbol{\Psi}}_C=\left[\begin{array}{cc}\mathbf{C}& \boldsymbol{\Gamma} \\ {}\boldsymbol{\Gamma} & \boldsymbol{\Gamma} \end{array}\right] $ and $ {\mathbf{T}}_C=\left[\begin{array}{cc}\mathbf{P}& \boldsymbol{\Gamma} \\ {}\boldsymbol{\Gamma} & \boldsymbol{\Gamma} \end{array}\right] $ are known.
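As a brief check, and only as a sketch that reads $ {\boldsymbol{\Phi}}_C^{\prime }={\mathbf{U}}_C^{\prime }{\boldsymbol{\Psi}}_C $ so that the matrix products conform, pre-multiplying Eq. (6.13) by $ {\boldsymbol{\Phi}}_C^{\prime } $ shows that the restriction is satisfied:

$$ {\boldsymbol{\Phi}}_C^{\prime }{\boldsymbol{\upbeta}}_{CR}={\boldsymbol{\Phi}}_C^{\prime }{\boldsymbol{\upbeta}}_C-{\boldsymbol{\Phi}}_C^{\prime }{\mathbf{T}}_C^{-1}{\boldsymbol{\Phi}}_C{\left({\boldsymbol{\Phi}}_C^{\prime }{\mathbf{T}}_C^{-1}{\boldsymbol{\Phi}}_C\right)}^{-1}{\boldsymbol{\Phi}}_C^{\prime }{\boldsymbol{\upbeta}}_C={\boldsymbol{\Phi}}_C^{\prime }{\boldsymbol{\upbeta}}_C-{\boldsymbol{\Phi}}_C^{\prime }{\boldsymbol{\upbeta}}_C=\mathbf{0}. $$

The projector properties follow from the definition of Q_C in the same way:

$$ {\mathbf{Q}}_C^2={\mathbf{T}}_C^{-1}{\boldsymbol{\Phi}}_C{\left({\boldsymbol{\Phi}}_C^{\prime }{\mathbf{T}}_C^{-1}{\boldsymbol{\Phi}}_C\right)}^{-1}\left({\boldsymbol{\Phi}}_C^{\prime }{\mathbf{T}}_C^{-1}{\boldsymbol{\Phi}}_C\right){\left({\boldsymbol{\Phi}}_C^{\prime }{\mathbf{T}}_C^{-1}{\boldsymbol{\Phi}}_C\right)}^{-1}{\boldsymbol{\Phi}}_C^{\prime }={\mathbf{Q}}_C,\kern1em {\mathbf{K}}_C^2=\mathbf{I}-2{\mathbf{Q}}_C+{\mathbf{Q}}_C^2={\mathbf{K}}_C,\kern1em {\mathbf{K}}_C{\mathbf{Q}}_C={\mathbf{Q}}_C-{\mathbf{Q}}_C^2=\mathbf{0}. $$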

The maximized selection response and the optimized expected genetic gain per trait of the CRLGSI can be written as

$$ {R}_{CR}=\frac{k_I}{L_I}\sqrt{{\boldsymbol{\upbeta}}_{CR}^{\prime }{\mathbf{T}}_C{\boldsymbol{\upbeta}}_{CR}} $$

(6.14)

and

$$ {\mathbf{E}}_{CR}=\frac{k_I}{L_I}\frac{{\boldsymbol{\Psi}}_C{\boldsymbol{\upbeta}}_{CR}}{\sqrt{{\boldsymbol{\upbeta}}_{CR}^{\prime }{\mathbf{T}}_C{\boldsymbol{\upbeta}}_{CR}}}, $$

(6.15)

respectively. Although in the RLGSI and the PPG-LGSI the interval between selection cycles is denoted as L_G, in the CRLGSI it is denoted as L_I. This is because the RLPSI and the CRLGSI should have the same interval between selection cycles.
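Equations (6.14) and (6.15) share one building block, the standard deviation of the index, $ \sqrt{{\boldsymbol{\upbeta}}^{\prime }{\mathbf{T}}_C\boldsymbol{\upbeta}} $. The sketch below shows how both quantities could be evaluated with NumPy. The function name `response_and_gains` and the toy inputs are hypothetical and chosen only so that the arithmetic is easy to follow; they are not data from the chapter.

```python
import numpy as np


def response_and_gains(beta, T, Psi, k_I=1.755, L_I=1.0):
    """Selection response (form of Eq. 6.14) and expected gain per trait (form of Eq. 6.15)."""
    beta = np.asarray(beta, dtype=float)
    sigma_I = np.sqrt(beta @ T @ beta)          # standard deviation of the index
    R = (k_I / L_I) * sigma_I                   # selection response
    E = (k_I / L_I) * (Psi @ beta) / sigma_I    # expected genetic gain per trait
    return R, E


# Toy inputs (hypothetical): identity covariance matrices and a 3-4-5 coefficient vector.
T_toy = np.eye(2)
Psi_toy = np.eye(2)
beta_toy = np.array([3.0, 4.0])

R, E = response_and_gains(beta_toy, T_toy, Psi_toy, k_I=1.755, L_I=1.0)
print(R)  # 1.755 * 5
print(E)  # 1.755 * [0.6, 0.8]

# Including a 4-year interval simply divides both quantities by 4.
R4, E4 = response_and_gains(beta_toy, T_toy, Psi_toy, k_I=1.755, L_I=4.0)
```

Setting `L_I=1.0` corresponds to the "not including the interval length" convention used in the numerical examples of this chapter.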

6.3.2 Numerical Examples

To illustrate the CRLGSI theoretical results, we use a real training maize (Zea mays) F₂ population with 248 genotypes (each with two repetitions), 233 molecular markers, and three traits: GY (ton ha⁻¹), EHT (cm), and PHT (cm). Matrices P and C were estimated based on Eqs. (2.22) to (2.24) described in Chap. 2. The estimated matrices were $ \widehat{\mathbf{P}}=\left[\begin{array}{lll}0.45& \kern0.34em 1.33& \kern0.55em 2.33\\ {}1.33& 65.07& \kern0.55em 83.71\\ {}2.33& 83.71& 165.99\end{array}\right] $ and $ \widehat{\mathbf{C}}=\left[\begin{array}{lll}0.07& \kern0.35em 0.61&\ 1.06\\ {}0.61& 17.93& 22.75\\ {}1.06& 22.75& 44.53\end{array}\right] $. In a similar manner, we estimated matrix Γ using Eqs. (5.21) to (5.23) described in Chap. 5. The estimated matrix was $ \widehat{\boldsymbol{\Gamma}}=\left[\begin{array}{lll}0.07& \kern0.35em 0.65&\ 1.05\\ {}0.65& 10.62& 14.25\\ {}1.05& 14.25& 26.37\end{array}\right] $.

To estimate the CRLGSI and its associated parameters (selection response, expected genetic gain per trait, etc.), we need to obtain matrices $ {\widehat{\mathbf{T}}}_C=\left[\begin{array}{ll}\widehat{\mathbf{P}}& \widehat{\boldsymbol{\Gamma}}\\ {}\widehat{\boldsymbol{\Gamma}}& \widehat{\boldsymbol{\Gamma}}\end{array}\right] $ and $ {\widehat{\boldsymbol{\Psi}}}_C=\left[\begin{array}{ll}\widehat{\mathbf{C}}& \widehat{\boldsymbol{\Gamma}}\\ {}\widehat{\boldsymbol{\Gamma}}& \widehat{\boldsymbol{\Gamma}}\end{array}\right] $ using phenotypic and genomic information and the estimated CRLGSI vector of coefficients $ {\widehat{\boldsymbol{\upbeta}}}_{CR}={\widehat{\mathbf{K}}}_C{\widehat{\boldsymbol{\upbeta}}}_C $, where $ {\widehat{\mathbf{K}}}_C=\left[\mathbf{I}-{\widehat{\mathbf{Q}}}_C\right] $, $ {\widehat{\mathbf{Q}}}_C={\widehat{\mathbf{T}}}_C^{-1}{\widehat{\boldsymbol{\Phi}}}_C{\left({\widehat{\boldsymbol{\Phi}}}_C^{\prime }{\widehat{\mathbf{T}}}_C^{-1}{\widehat{\boldsymbol{\Phi}}}_C\right)}^{-1}{\widehat{\boldsymbol{\Phi}}}_C^{\prime } $, $ {\widehat{\boldsymbol{\Phi}}}_C^{\prime }={\mathbf{U}}_C^{\prime }{\widehat{\boldsymbol{\Psi}}}_C $, and $ {\widehat{\boldsymbol{\upbeta}}}_C={\widehat{\mathbf{T}}}_C^{-1}{\widehat{\boldsymbol{\Psi}}}_C{\mathbf{a}}_C $.

We have indicated that the main difference between the RLGSI and the CRLGSI is matrix $ {\mathbf{U}}_C^{\prime } $, on which we now need to impose two restrictions: one for the trait and another for its associated GEBV. Consider the maize (Zea mays) F₂ population described earlier and suppose that we restrict trait GY; then, matrix $ {\mathbf{U}}_C^{\prime } $ should be constructed as $ {\mathbf{U}}_{C_1}^{\prime }=\left[\begin{array}{llllll}1& 0& 0& 0& 0& 0\\ {}0& 0& 0& 1& 0& 0\end{array}\right] $. If we restrict traits GY and EHT, matrix $ {\mathbf{U}}_C^{\prime } $ should be constructed as $ {\mathbf{U}}_{C_2}^{\prime }=\left[\begin{array}{llllll}1& 0& 0& 0& 0& 0\\ {}0& 1& 0& 0& 0& 0\\ {}0& 0& 0& 1& 0& 0\\ {}0& 0& 0& 0& 1& 0\end{array}\right] $, etc. The procedure for obtaining matrices $ {\widehat{\mathbf{K}}}_C=\left[\mathbf{I}-{\widehat{\mathbf{Q}}}_C\right] $, $ {\widehat{\mathbf{Q}}}_C={\widehat{\mathbf{T}}}_C^{-1}{\widehat{\boldsymbol{\Phi}}}_C{\left({\widehat{\boldsymbol{\Phi}}}_C^{\prime }{\widehat{\mathbf{T}}}_C^{-1}{\widehat{\boldsymbol{\Phi}}}_C\right)}^{-1}{\widehat{\boldsymbol{\Phi}}}_C^{\prime } $, and $ {\widehat{\boldsymbol{\Phi}}}_C^{\prime }={\mathbf{U}}_C^{\prime }{\widehat{\boldsymbol{\Psi}}}_C $ is similar to that described in Chap. 3.
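The construction rule for $ {\mathbf{U}}_C^{\prime } $ (one row per restricted trait, followed by one row per associated GEBV) can be written as a small helper. This is only an illustrative sketch: the function name `build_U_C` and the 0-based trait indexing are assumptions made here, not notation from the chapter.

```python
import numpy as np


def build_U_C(t: int, restricted: list) -> np.ndarray:
    """Return U_C' with shape (2r, 2t): r rows for the restricted traits, then r rows for their GEBVs."""
    r = len(restricted)
    U = np.zeros((2 * r, 2 * t), dtype=int)
    for row, trait in enumerate(restricted):
        U[row, trait] = 1            # restriction on the trait itself
        U[r + row, t + trait] = 1    # restriction on the GEBV of the same trait
    return U


# Traits ordered as GY, EHT, PHT (0-based indices 0, 1, 2).
U_C1 = build_U_C(t=3, restricted=[0])       # restrict GY
U_C2 = build_U_C(t=3, restricted=[0, 1])    # restrict GY and EHT
print(U_C1)
print(U_C2)
```

The two printed matrices coincide with $ {\mathbf{U}}_{C_1}^{\prime } $ and $ {\mathbf{U}}_{C_2}^{\prime } $ as written in the text.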

Let $ {\mathbf{w}}^{\prime }=\left[5\kern0.5em -0.1\kern0.5em -0.1\kern0.5em 0\kern0.5em 0\kern0.5em 0\right] $ be the vector of economic weights and assume that we restrict trait GY; in this case, according to the estimated matrices $ \widehat{\mathbf{P}} $, $ \widehat{\mathbf{C}} $, and $ \widehat{\boldsymbol{\Gamma}} $ described earlier, the estimated CRLGSI vector of coefficients was $ {\widehat{\boldsymbol{\upbeta}}}_{CR}^{\prime }=\left[0.076\kern0.5em -0.004\kern0.5em -0.018\kern0.5em 2.353\kern0.5em -0.096\kern0.5em -0.082\right] $, whence the estimated CRLGSI can be written as

$$ {\widehat{I}}_{CR}=0.076\mathrm{GY}-0.004\mathrm{EHT}-0.018\mathrm{PHT}+2.353{\mathrm{GEBV}}_{\mathrm{GY}}-0.096{\mathrm{GEBV}}_{\mathrm{EHT}}-0.082{\mathrm{GEBV}}_{\mathrm{PHT}} $$

where GEBV_GY, GEBV_EHT, and GEBV_PHT are the GEBVs associated with traits GY, EHT, and PHT respectively. The same procedure is valid for two or more restrictions.
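The steps of this example can be sketched in NumPy as follows. Two assumptions are made and should be read as such: the CLGSI coefficients are computed as $ {\widehat{\mathbf{T}}}_C^{-1}{\widehat{\boldsymbol{\Psi}}}_C\mathbf{w} $ with the economic weights w given above, and the matrices are entered exactly as printed (rounded to two decimals), so the resulting coefficients need not match the reported values digit for digit. What the sketch does verify is that the restriction and the projector properties hold numerically.

```python
import numpy as np

# Estimated matrices as printed in this section (rounded to two decimals).
P = np.array([[0.45, 1.33, 2.33], [1.33, 65.07, 83.71], [2.33, 83.71, 165.99]])
C = np.array([[0.07, 0.61, 1.06], [0.61, 17.93, 22.75], [1.06, 22.75, 44.53]])
G = np.array([[0.07, 0.65, 1.05], [0.65, 10.62, 14.25], [1.05, 14.25, 26.37]])  # Gamma

T_C = np.block([[P, G], [G, G]])      # covariance matrix of [y', gamma']
Psi_C = np.block([[C, G], [G, G]])    # covariance matrix of [g', gamma']

# Restrict GY and its GEBV (matrix U_C1' from the text).
U_C1 = np.array([[1, 0, 0, 0, 0, 0], [0, 0, 0, 1, 0, 0]], dtype=float)
w = np.array([5.0, -0.1, -0.1, 0.0, 0.0, 0.0])  # economic weights

Phi = Psi_C @ U_C1.T                  # 2t x 2r, so that Phi' = U_C1' Psi_C
T_inv_Phi = np.linalg.solve(T_C, Phi)
Q_C = T_inv_Phi @ np.linalg.solve(Phi.T @ T_inv_Phi, Phi.T)
K_C = np.eye(6) - Q_C

beta_C = np.linalg.solve(T_C, Psi_C @ w)   # unrestricted CLGSI coefficients (assumed form)
beta_CR = K_C @ beta_C                     # restricted coefficients, Eq. (6.13)

print(np.round(beta_CR, 3))
print(U_C1 @ Psi_C @ beta_CR)              # covariances with restricted trait and GEBV: ~0
```

Using `np.linalg.solve` instead of forming explicit inverses is a design choice made here for numerical stability; it is not prescribed by the chapter.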

Figure 6.3 presents the frequency distribution of the estimated CRLGSI values for one (Fig. 6.3a) and two null restrictions (Fig. 6.3b) using matrices $ {\mathbf{U}}_{C_1}^{\prime } $ and $ {\mathbf{U}}_{C_2}^{\prime } $, and the real data set of the F₂ population. For both restrictions, the frequency distribution of the estimated CRLGSI values approaches the normal distribution.

Suppose a selection intensity of 10% (k_I = 1.755), matrix $ {\mathbf{U}}_{C_1}^{\prime }=\left[\begin{array}{llllll}1& 0& 0& 0& 0& 0\\ {}0& 0& 0& 1& 0& 0\end{array}\right] $ and that the vector of economic weights is $ {\mathbf{w}}^{\prime }=\left[5\kern0.5em -0.1\kern0.5em -0.1\kern0.5em 0\kern0.5em 0\kern0.5em 0\right] $; then, according to the estimated matrices $ \widehat{\mathbf{P}} $, $ \widehat{\mathbf{C}} $, and $ \widehat{\boldsymbol{\Gamma}} $ described earlier, the estimated CRLGSI selection response and the estimated CRLGSI expected genetic gain per trait were $ {\widehat{R}}_{CR}={k}_I\sqrt{{\widehat{\boldsymbol{\upbeta}}}_{CR}^{\prime }{\widehat{\mathbf{T}}}_C{\widehat{\boldsymbol{\upbeta}}}_{CR}}=0.96 $ and $ {\widehat{\mathbf{E}}}_{CR}^{\prime }={k}_I\frac{{\widehat{\boldsymbol{\upbeta}}}_{CR}^{\prime }{\widehat{\boldsymbol{\Psi}}}_C}{\sqrt{{\widehat{\boldsymbol{\upbeta}}}_{CR}^{\prime }{\widehat{\mathbf{T}}}_C{\widehat{\boldsymbol{\upbeta}}}_{CR}}}=\left[0\kern0.5em -3.53\kern0.5em -6.03\kern0.5em 0\kern0.5em -2.93\kern0.5em -4.87\right] $ respectively, whereas the estimated CRLGSI accuracy was $ {\widehat{\rho}}_{HI_{CR}}=\frac{{\widehat{\sigma}}_{I_{CR}}}{{\widehat{\sigma}}_H}=0.51 $ (see Chaps. 3 and 5 for details).

Now, we use the simulated data described in Chap. 2, Sect. 2.8.1 to compare CRLGSI efficiency versus RLGSI efficiency. The criteria for this comparison are the Technow inequality (Eq. 5.18, Chap. 5) and the ratio of the estimated CRLGSI accuracy ($ {\widehat{\rho}}_{HI_{CR}} $) to the estimated RLGSI accuracy ($ {\widehat{\rho}}_{HI_R} $) expressed as percentages (Eq. 5.17, Chap. 5), i.e., $ \widehat{p}=100\left({\widehat{\lambda}}_{CR}-1\right) $, where $ {\widehat{\lambda}}_{CR}={\widehat{\rho}}_{HI_{CR}}/{\widehat{\rho}}_{HI_R} $, for one, two, and three null restrictions for five simulated selection cycles.

Table 6.7 presents the estimated CRLGSI heritability ($ {\widehat{h}}_I^2 $), the estimated RLGSI accuracy ($ {\widehat{\rho}}_{HI_R} $), the values of $ {W}_C=\frac{{\widehat{\rho}}_{HI_R}}{{\widehat{h}}_I}{L}_I $ (L_I = 4), and the values of $ \widehat{p}=100\left({\widehat{\lambda}}_{CR}-1\right) $, where $ {\widehat{\lambda}}_{CR}={\widehat{\rho}}_{HI_{CR}}/{\widehat{\rho}}_{HI_R} $ and $ {\widehat{\rho}}_{HI_{CR}} $ is the estimated CRLGSI accuracy, for one, two, and three null restrictions for five simulated selection cycles. The averages of the W_C values for one, two, and three null restrictions were 1.26, 0.92, and 0.59 respectively, whereas the RLGSI interval length was 1.5 (L_G = 1.5). This means that the estimated Technow inequality ($ {L}_G<\frac{{\widehat{\rho}}_{HI_G}}{{\widehat{h}}_I}{L}_I $) was not true. Thus, for this data set, RLGSI efficiency in terms of time is not greater than CRLGSI efficiency. The inequality was not true because the estimated RLGSI accuracy was very low, whereas the estimated CRLGSI heritability was high: the averages of the estimated RLGSI accuracy for one, two, and three null restrictions were 0.25, 0.19, and 0.14 respectively, whereas the averages of the estimated CRLGSI heritability values were 0.72, 0.75, and 0.89 respectively. According to these results, when the estimated RLGSI accuracy is very low and the estimated CRLGSI heritability is high, RLGSI efficiency will be lower than CRLGSI efficiency in terms of time.
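To see how a low accuracy combined with a high heritability drives W_C below L_G, the sketch below evaluates $ {W}_C=\frac{{\widehat{\rho}}_{HI_R}}{{\widehat{h}}_I}{L}_I $ directly from the averages quoted above. This is an approximation made here for illustration: the averages in Table 6.7 were presumably obtained by averaging the per-cycle W_C values, so the numbers produced from averaged inputs are close to, but not identical with, 1.26, 0.92, and 0.59.

```python
import math

L_I = 4.0  # interval of the index that uses phenotypic information (years)
L_G = 1.5  # interval of the RLGSI (years)

# Averages quoted in the text for one, two, and three null restrictions.
rlgsi_accuracy = [0.25, 0.19, 0.14]
crlgsi_heritability = [0.72, 0.75, 0.89]

W_values = []
for n, (rho, h2) in enumerate(zip(rlgsi_accuracy, crlgsi_heritability), start=1):
    W = rho / math.sqrt(h2) * L_I  # h_I is the square root of the heritability
    W_values.append(W)
    print(f"{n} restriction(s): W_C ~ {W:.2f}, L_G < W_C is {L_G < W}")
```

In all three cases W_C falls below 1.5, so the inequality fails, in agreement with the conclusion drawn from Table 6.7.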

Table 6.7 Estimated combined restricted linear genomic selection index (CRLGSI) heritability ($ {\widehat{h}}_I^2 $), estimated RLGSI accuracy ($ {\widehat{\rho}}_{HI_R} $), values of $ {W}_C=\frac{{\widehat{\rho}}_{HI_R}}{{\widehat{h}}_I}{L}_I $ (L_I = 4), and values of $ \widehat{p}=100\left({\widehat{\lambda}}_{CR}-1\right) $, where $ {\widehat{\lambda}}_{CR}={\widehat{\rho}}_{HI_{CR}}/{\widehat{\rho}}_{HI_R} $ and $ {\widehat{\rho}}_{HI_{CR}} $ is the estimated CRLGSI accuracy, for 1, 2, and 3 null restrictions for five simulated selection cycles

The last three columns of Table 6.7, from left to right, present the average of the values of $ \widehat{p}=100\left({\widehat{\lambda}}_{CR}-1\right) $, for one, two, and three null restrictions of five simulated selection cycles. According to these results, CRLGSI efficiency was 53.78%, 78.25%, and 61.25% higher than RLGSI efficiency. Thus, for this data set, the CRLGSI was a better predictor of the net genetic merit than the RLGSI.

6.4 The Combined Predetermined Proportional Gains Linear Genomic Selection Index

In the PPG-LPSI described in Chap. 3, the vector of the PPG (predetermined proportional gains) was $ {\mathbf{d}}^{\prime }=\left[{d}_1\kern0.5em {d}_2\kern0.5em \dots \kern0.5em {d}_r\right] $. However, because the combined predetermined proportional gains LGSI (CPPG-LGSI) uses phenotypic and GEBV information jointly to predict the net genetic merit, the vector of the PPG (d_C) should be twice as long as the standard vector d′, that is, $ {\mathbf{d}}_C^{\prime }=\left[{d}_1\kern0.5em {d}_2\kern0.5em \cdots \kern0.5em {d}_r\kern0.5em {d}_{r+1}\kern0.5em {d}_{r+2}\kern0.5em \cdots \kern0.5em {d}_{2r}\right] $, where we would expect that if d₁ is the PPG imposed on trait 1, then $ {d}_{r+1} $ should be the PPG imposed on the GEBV associated with trait 1, etc. In addition, in the CPPG-LGSI, we have three possible options for determining (for each trait and GEBV) the PPG, e.g., for trait 1, $ {d}_1={d}_{r+1} $, $ {d}_1>{d}_{r+1} $, or $ {d}_1<{d}_{r+1} $. This is the main difference between the standard PPG-LPSI described in Chap. 3 and the CPPG-LGSI.
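The layout of d_C (the r trait values first, then the r GEBV values in the same trait order) can be illustrated with a small helper. The function name `build_d_C` is hypothetical; the example values are the PPG vectors used in the numerical examples of Sect. 6.4.2.

```python
def build_d_C(d_traits, d_gebvs):
    """Stack the PPG values of r traits and of their r GEBVs into d_C (length 2r)."""
    if len(d_traits) != len(d_gebvs):
        raise ValueError("each restricted trait needs a PPG value for its GEBV")
    return list(d_traits) + list(d_gebvs)


# One restriction: GY and its GEBV (here d_1 > d_{r+1}).
d_C1 = build_d_C([7], [3.5])
# Two restrictions: GY and EHT, followed by their GEBVs.
d_C2 = build_d_C([7, -3], [3.5, -1.5])
print(d_C1, d_C2)
```

With this ordering, element i of the first half and element i of the second half always refer to the same trait, which matches the row ordering of $ {\mathbf{U}}_C^{\prime } $.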

6.4.1 The Maximized CPPG-LGSI Parameters

It can be shown that the vector of coefficients of the CPPG-LGSI can be written as

$$ {\boldsymbol{\upbeta}}_{CP}={\boldsymbol{\upbeta}}_{CR}+{\theta}_{CP}{\boldsymbol{\updelta}}_{CP}, $$

(6.16)

where

$$ {\uptheta}_{\mathrm{CP}}=\frac{{\boldsymbol{\upbeta}}_C^{\prime }{\boldsymbol{\Phi}}_C{\left({\boldsymbol{\Phi}}_C^{\prime }{\mathbf{T}}_C^{-1}{\boldsymbol{\Phi}}_C\right)}^{-1}{\mathbf{d}}_C}{{\mathbf{d}}_C^{\prime }{\left({\boldsymbol{\Phi}}_C^{\prime }{\mathbf{T}}_C^{-1}{\boldsymbol{\Phi}}_C\right)}^{-1}{\mathbf{d}}_C} $$

(6.17)

is a proportionality constant. In addition, in Eq. (6.16), β_CR = K_Cβ_C is the vector of coefficients of the CRLGSI (Eq. 6.13), $ {\boldsymbol{\updelta}}_{CP}={\mathbf{T}}_C^{-1}{\boldsymbol{\Phi}}_C{\left({\boldsymbol{\Phi}}_C^{\prime }{\mathbf{T}}_C^{-1}{\boldsymbol{\Phi}}_C\right)}^{-1}{\mathbf{d}}_C $, $ {\boldsymbol{\Phi}}_C^{\prime }={\mathbf{U}}_C^{\prime }{\boldsymbol{\Psi}}_C $, and $ {\boldsymbol{\upbeta}}_C={\mathbf{T}}_C^{-1}{\boldsymbol{\Psi}}_C{\mathbf{a}}_C $ (the vector of coefficients of the CLGSI). When θ_CP = 0, β_CP = β_CR, and if θ_CP = 0 and $ {\mathbf{U}}_C^{\prime } $ is the null matrix, then β_CP = β_C. Thus, the CPPG-LGSI is more general than the CRLGSI and the CLGSI, and includes the latter two indices as particular cases. In addition, it can be shown that the CPPG-LGSI has the same properties as the PPG-LPSI described in Chap. 3.
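Equations (6.16) and (6.17) can be sketched in NumPy as shown below. Several points are assumptions made for this illustration and are not taken from the chapter: the helper name `cppg_lgsi_coefficients`, the use of the economic weights $ {\mathbf{w}}^{\prime }=\left[5\kern0.5em -0.1\kern0.5em -0.1\kern0.5em 0\kern0.5em 0\kern0.5em 0\right] $ in the CLGSI coefficients, and the use of the rounded estimated matrices from Sect. 6.3.2 as inputs. The sketch does not attempt to reproduce reported values; it checks the defining property that the covariances with the restricted trait and GEBV are proportional to d_C.

```python
import numpy as np


def cppg_lgsi_coefficients(T_C, Psi_C, U_C, w, d_C):
    """Return (beta_CP, beta_CR, theta_CP, delta_CP) following Eqs. (6.13), (6.16), and (6.17)."""
    Phi = Psi_C @ U_C.T                         # Phi' = U_C' Psi_C
    T_inv_Phi = np.linalg.solve(T_C, Phi)       # T_C^{-1} Phi
    M_inv = np.linalg.inv(Phi.T @ T_inv_Phi)    # (Phi' T_C^{-1} Phi)^{-1}
    beta_C = np.linalg.solve(T_C, Psi_C @ w)    # CLGSI coefficients (assumed form)
    beta_CR = beta_C - T_inv_Phi @ M_inv @ (Phi.T @ beta_C)
    delta_CP = T_inv_Phi @ M_inv @ d_C
    theta_CP = (beta_C @ Phi @ M_inv @ d_C) / (d_C @ M_inv @ d_C)
    return beta_CR + theta_CP * delta_CP, beta_CR, theta_CP, delta_CP


# Rounded estimated matrices from Sect. 6.3.2.
P = np.array([[0.45, 1.33, 2.33], [1.33, 65.07, 83.71], [2.33, 83.71, 165.99]])
C = np.array([[0.07, 0.61, 1.06], [0.61, 17.93, 22.75], [1.06, 22.75, 44.53]])
G = np.array([[0.07, 0.65, 1.05], [0.65, 10.62, 14.25], [1.05, 14.25, 26.37]])
T_C = np.block([[P, G], [G, G]])
Psi_C = np.block([[C, G], [G, G]])

U_C1 = np.array([[1, 0, 0, 0, 0, 0], [0, 0, 0, 1, 0, 0]], dtype=float)
w = np.array([5.0, -0.1, -0.1, 0.0, 0.0, 0.0])
d_C = np.array([7.0, 3.5])   # PPG for GY and for its GEBV

beta_CP, beta_CR, theta_CP, delta_CP = cppg_lgsi_coefficients(T_C, Psi_C, U_C1, w, d_C)

# Covariances with the restricted trait and GEBV are proportional to d_C.
print(U_C1 @ Psi_C @ beta_CP, theta_CP * d_C)
```

Because $ {\boldsymbol{\Phi}}_C^{\prime }{\boldsymbol{\upbeta}}_{CR}=\mathbf{0} $ and $ {\boldsymbol{\Phi}}_C^{\prime }{\boldsymbol{\updelta}}_{CP}={\mathbf{d}}_C $, the two printed vectors agree up to floating-point error.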

The maximized selection response and the expected genetic gain per trait of the CPPG-LGSI can be written as

$$ {R}_{CP}=\frac{k_I}{L_I}\sqrt{{\boldsymbol{\upbeta}}_{CP}^{\prime }{\mathbf{T}}_C{\boldsymbol{\upbeta}}_{CP}} $$

(6.18)

and

$$ {\mathbf{E}}_{CP}=\frac{k_I}{L_I}\frac{{\boldsymbol{\Psi}}_C{\boldsymbol{\upbeta}}_{CP}}{\sqrt{{\boldsymbol{\upbeta}}_{CP}^{\prime }{\mathbf{T}}_C{\boldsymbol{\upbeta}}_{CP}}}, $$

(6.19)

respectively. Although in the RLGSI and the PPG-LGSI the interval between selection cycles is denoted as L_G, in the CPPG-LGSI it is denoted as L_I. This is because both the RLPSI and the CPPG-LGSI use phenotypic information to predict the net genetic merit, so they should have the same interval between selection cycles.

6.4.2 Numerical Examples

Similar to the CRLGSI, to illustrate the CPPG-LGSI results we use the real training maize (Zea mays) F₂ population with 248 genotypes, 233 molecular markers, and three traits—GY (ton ha⁻¹), EHT (cm), and PHT (cm)—where $ \widehat{\mathbf{P}}=\left[\begin{array}{lll}0.45&\ 1.33& \kern0.5em 2.33\\ {}1.33& 65.07&\ 83.71\\ {}2.33& 83.71& 165.99\end{array}\right] $, $ \widehat{\mathbf{C}}=\left[\begin{array}{lll}0.07&\ 0.61&\ 1.06\\ {}0.61& 17.93& 22.75\\ {}1.06& 22.75& 44.53\end{array}\right] $, and $ \widehat{\boldsymbol{\Gamma}}=\left[\begin{array}{lll}0.07& \kern0.45em 0.65& \kern0.45em 1.05\\ {}0.65& 10.62& 14.25\\ {}1.05& 14.25& 26.37\end{array}\right] $ were the estimated matrices of P, C, and Γ respectively.

We can obtain the estimated CPPG-LGSI vector of coefficients as $ {\widehat{\boldsymbol{\upbeta}}}_{CP}={\widehat{\boldsymbol{\upbeta}}}_{CR}+{\widehat{\theta}}_{CP}{\widehat{\boldsymbol{\updelta}}}_{CP} $ (Eq. 6.16). Suppose that we restrict trait GY and its associated GEBV with matrix $ {\mathbf{U}}_{C_1}^{\prime }=\left[\begin{array}{llllll}1& 0& 0& 0& 0& 0\\ {}0& 0& 0& 1& 0& 0\end{array}\right] $ and the vector of predetermined restriction $ {\mathbf{d}}_C^{\prime }=\left[7\kern0.5em 3.5\right] $. In Sect. 6.3.2, we showed that the estimated CRLGSI vector of coefficients was $ {\widehat{\boldsymbol{\upbeta}}}_{CR}^{\prime }=\left[0.076\kern0.5em -0.004\kern0.5em -0.018\kern0.5em 2.353\kern0.5em -0.096\kern0.5em -0.082\right] $; then, we only need to calculate $ {\widehat{\uptheta}}_{\mathrm{CP}} $ and $ {\widehat{\boldsymbol{\updelta}}}_{CP} $ to obtain the vector of coefficients $ {\widehat{\boldsymbol{\upbeta}}}_{CP} $.

Let $ {\mathbf{w}}^{\prime }=\left[5\kern0.5em -0.1\kern0.5em -0.1\kern0.5em 0\kern0.5em 0\kern0.5em 0\right] $ be the vector of economic weights. It can be shown that $ {\widehat{\uptheta}}_{\mathrm{CP}}=0.00030 $ is the estimated value of the proportionality constant and $ {\widehat{\boldsymbol{\updelta}}}_{CP}^{\prime }=\left[0.56\kern0.5em -77.28\kern0.5em 40.89\kern0.5em 49.44\kern0.5em 77.28\kern0.5em -40.89\right] $. Thus, the estimated CPPG-LGSI vector of coefficients was $ {\widehat{\boldsymbol{\upbeta}}}_{CP}^{\prime }=\left[0.076\kern0.5em -0.030\kern0.5em -0.004\kern0.5em 2.369\kern0.5em -0.070\kern0.5em -0.096\right] $, whence the estimated CPPG-LGSI can be written as

$$ {\widehat{I}}_{CP}=0.076\mathrm{GY}-0.03\mathrm{EHT}-0.004\mathrm{PHT}+2.369{\mathrm{GEBV}}_{\mathrm{GY}}-0.070{\mathrm{GEBV}}_{\mathrm{EHT}}-0.096{\mathrm{GEBV}}_{\mathrm{PHT}}, $$

where GEBV_GY, GEBV_EHT, and GEBV_PHT are the GEBVs associated with traits GY, EHT, and PHT respectively. The same procedure is valid for two or more restrictions. Note that because $ {\widehat{\uptheta}}_{\mathrm{CP}}=0.0003 $ is very small, the estimated CPPG-LGSI and CRLGSI values were very similar.
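As a quick arithmetic check of Eq. (6.16), the reported CRLGSI coefficients, proportionality constant, and vector δ_CP can be combined directly. The sketch uses only rounded values printed in this section and takes the GY coefficient as 0.076, as in the index expression above, so agreement with the reported CPPG-LGSI coefficients is expected only up to rounding.

```python
import numpy as np

# Reported (rounded) quantities from Sects. 6.3.2 and 6.4.2.
beta_CR = np.array([0.076, -0.004, -0.018, 2.353, -0.096, -0.082])
theta_CP = 0.00030
delta_CP = np.array([0.56, -77.28, 40.89, 49.44, 77.28, -40.89])

# Eq. (6.16): beta_CP = beta_CR + theta_CP * delta_CP
beta_CP = beta_CR + theta_CP * delta_CP

reported_beta_CP = np.array([0.076, -0.030, -0.004, 2.369, -0.070, -0.096])
max_abs_diff = np.max(np.abs(beta_CP - reported_beta_CP))

print(np.round(beta_CP, 3))
print(max_abs_diff)
```

The recomputed coefficients differ from the reported ones by less than about 0.003 in absolute value, which is consistent with the rounding of $ {\widehat{\uptheta}}_{\mathrm{CP}} $ and of δ_CP, and the shift away from the CRLGSI coefficients is small because $ {\widehat{\uptheta}}_{\mathrm{CP}} $ is small.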

Figure 6.4 presents the frequency distribution of the estimated CPPG-LGSI values for one (Fig. 6.4a) and two predetermined restrictions (Fig. 6.4b) using matrices $ {\mathbf{U}}_{C_1}^{\prime } $ and $ {\mathbf{U}}_{C_2}^{\prime }=\left[\begin{array}{llllll}1& 0& 0& 0& 0& 0\\ {}0& 1& 0& 0& 0& 0\\ {}0& 0& 0& 1& 0& 0\\ {}0& 0& 0& 0& 1& 0\end{array}\right] $, the vectors of the PPG $ {\mathbf{d}}_{C1}^{\prime }=\left[7\kern0.5em 3.5\right] $ and $ {\mathbf{d}}_{C2}^{\prime }=\left[7\kern0.5em -3\kern0.5em 3.5\kern0.5em -1.5\right] $, and the real F₂ data set. For both restrictions, the frequency distribution of the estimated CPPG-LGSI values approaches the normal distribution.

Suppose a selection intensity of 10% (k_I = 1.755) and that we restrict trait GY and its associated GEBV. The estimated CPPG-LGSI selection response and expected genetic gain per trait were $ {\widehat{R}}_{CP}={k}_I\sqrt{{\widehat{\boldsymbol{\upbeta}}}_{CP}^{\prime }{\widehat{\mathbf{T}}}_C{\widehat{\boldsymbol{\upbeta}}}_{CP}}=0.98 $ and $ {\widehat{\mathbf{E}}}_{CP}^{\prime }={k}_I\frac{{\widehat{\boldsymbol{\upbeta}}}_{CP}^{\prime }{\widehat{\boldsymbol{\Psi}}}_C}{\sqrt{{\widehat{\boldsymbol{\upbeta}}}_{CP}^{\prime }{\widehat{\mathbf{T}}}_C{\widehat{\boldsymbol{\upbeta}}}_{CP}}}=\left[0.007\kern0.5em -3.647\kern0.5em -5.760\kern0.5em 0.004\kern0.5em -2.829\kern0.5em -4.711\right] $ respectively, whereas the estimated CPPG-LGSI accuracy was $ {\widehat{\rho}}_{HI_{CP}}=\frac{{\widehat{\sigma}}_{I_{CP}}}{{\widehat{\sigma}}_H}=0.52 $. Once again, because $ {\widehat{\uptheta}}_{\mathrm{CP}}=0.0003 $, the latter results are very similar to the CRLGSI results.

Now, we use the simulated data described in Chap. 2, Sect. 2.8.1, to compare CPPG-LGSI efficiency versus PPG-LGSI efficiency. The criteria for this comparison are the Technow inequality (Chap. 5, Eq. 5.18) and the ratio of CPPG-LGSI accuracy ($ {\rho}_{HI_{CP}} $) to PPG-LGSI accuracy ($ {\rho}_{HI_P} $) expressed as percentages (Chap. 5, Eq. 5.17), $ \widehat{p}=100\left({\widehat{\lambda}}_{CP}-1\right) $, where $ {\widehat{\lambda}}_{CP}={\widehat{\rho}}_{HI_{CP}}/{\widehat{\rho}}_{HI_P} $ for one, two, and three predetermined restrictions in five simulated selection cycles.

Table 6.8 presents the estimated CPPG-LGSI heritability ($ {\widehat{h}}_I^2 $), the estimated PPG-LGSI accuracy ($ {\widehat{\rho}}_{HI_P} $), values of $ {W}_{CP}=\frac{{\widehat{\rho}}_{HI_G}}{{\widehat{h}}_I}{L}_I $ (L_I = 4) and $ \widehat{p}=100\left({\widehat{\lambda}}_{CP}-1\right) $, where $ {\widehat{\lambda}}_{CP}={\widehat{\rho}}_{HI_{CP}}/{\widehat{\rho}}_{HI_P} $ and $ {\widehat{\rho}}_{HI_{CP}} $ is the estimated CPPG-LGSI accuracy, for one, two, and three predetermined restrictions in five simulated selection cycles. The averages of the estimated W_CP values for one, two, and three predetermined restrictions were 3.60, 3.31, and 2.50 respectively, whereas the PPG-LGSI interval length was 1.5 (L_G = 1.5). This means that the estimated Technow inequality, $ {L}_G<\frac{{\widehat{\rho}}_{HI_G}}{{\widehat{h}}_I}{L}_I $, was true. Thus, for this data set, PPG-LGSI efficiency is greater than CPPG-LGSI efficiency in terms of time.

Table 6.8 Estimated combined predetermined proportional gain linear genomic selection index (CPPG-LGSI) heritability ($ {\widehat{h}}_I^2 $), estimated PPG-LGSI accuracy ($ {\widehat{\rho}}_{HI_P} $), values of $ {W}_{CP}=\frac{{\widehat{\rho}}_{HI_G}}{{\widehat{h}}_I}{L}_I $ (L_I = 4), and $ \widehat{p}=100\left({\widehat{\lambda}}_{CP}-1\right) $, where $ {\widehat{\lambda}}_{CP}={\widehat{\rho}}_{HI_{CP}}/{\widehat{\rho}}_{HI_P} $ and $ {\widehat{\rho}}_{HI_{CP}} $ is the estimated CPPG-LGSI accuracy, for one, two, and three predetermined restrictions for five simulated selection cycles

The last three columns of Table 6.8, from left to right, present the values of $ \widehat{p}=100\left({\widehat{\lambda}}_{CP}-1\right) $, for one, two, and three predetermined restrictions in five simulated selection cycles. The average values of $ \widehat{p}=100\left({\widehat{\lambda}}_{CP}-1\right) $ for each of the three restrictions, in percentage terms, were 37.19%, 32.82%, and 37.08% respectively. This means that CPPG-LGSI efficiency was greater than PPG-LGSI efficiency at predicting the net genetic merit.

References

Beyene Y, Semagn K, Mugo S, Tarekegne A, Babu R et al (2015) Genetic gains in grain yield through genomic selection in eight bi-parental maize populations under drought stress. Crop Sci 55:154–163
Technow F, Bürger A, Melchinger AE (2013) Genomic prediction of northern corn leaf blight resistance in maize with combined or separate training sets for heterotic groups. G3 (Bethesda) 3:197–203

Author information

Authors and Affiliations

Biometrics and Statistics Unit, International Maize and Wheat Improvement Center (CIMMYT), Mexico, Mexico
J. Jesus Céron-Rojas & José Crossa

Rights and permissions

Open Access This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

About this chapter

Cite this chapter

Céron-Rojas, J.J., Crossa, J. (2018). Constrained Linear Genomic Selection Indices. In: Linear Selection Indices in Modern Plant Breeding. Springer, Cham. https://doi.org/10.1007/978-3-319-91223-3_6

DOI: https://doi.org/10.1007/978-3-319-91223-3_6
Published: 27 September 2018
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-91222-6
Online ISBN: 978-3-319-91223-3
eBook Packages: Biomedical and Life Sciences; Biomedical and Life Sciences (R0)
