Abstract
The constrained linear genomic selection indices are the null restricted and the predetermined proportional gain linear genomic selection indices (RLGSI and PPGLGSI respectively), which are linear combinations of genomic estimated breeding values (GEBVs) used to predict the net genetic merit. They result from a direct application of the restricted and the predetermined proportional gain linear phenotypic selection index theory to the genomic selection context. The RLGSI can be extended to a combined RLGSI (CRLGSI) and the PPGLGSI can be extended to a combined PPGLGSI (CPPGLGSI); the latter indices use phenotypic and GEBV information jointly in the prediction of net genetic merit. The main difference between the RLGSI and PPGLGSI on the one hand and the CRLGSI and CPPGLGSI on the other is that the RLGSI and PPGLGSI are useful in a testing population where there is only marker information, whereas the CRLGSI and CPPGLGSI can be used only in training populations where there is joint phenotypic and marker information. The RLGSI and CRLGSI allow restrictions equal to zero to be imposed on the expected genetic advance of some traits, whereas the PPGLGSI and CPPGLGSI allow predetermined proportional restriction values to be imposed on the expected trait genetic gains so that some traits change their mean values by a predetermined amount. We describe the foregoing four indices and validate their theoretical results using real and simulated data.
6.1 The Restricted Linear Genomic Selection Index
Let H = w′g be the net genetic merit and I_{G} = β′γ the linear genomic selection index (LGSI, see Chap. 5 for details), where g, γ, w, and β are vectors t × 1 (t= number of traits) of breeding values, genomic breeding values, economic weights, and LGSI coefficients respectively. It can be shown that Cov(I_{G}, g) = Γβ is the covariance between g and I_{G} = β′γ, and that Var(γ) = Γ is the genomic covariance matrix of size t × t (see Chap. 5 for details). The objective of the restricted linear genomic selection index (RLGSI) is to improve only (t − r) of t (r < t) traits (leaving r of them fixed) in a testing population using only genomic estimated breeding values (GEBVs). The RLGSI minimizes the mean squared difference between I_{G} and H, E[(H − I_{G})^{2}], with respect to β under the restriction Cov(I_{G}, U′g) = U′Γβ = 0, where U′ is a matrix (t − 1) × t of 1s and 0s, in a similar manner to the restricted linear phenotypic selection index (RLPSI) described in Chap. 3 in the phenotypic selection context.
6.1.1 The Maximized RLGSI Parameters
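Let Var(I_{G}) = β′Γβ be the variance of I_{G} = β′γ, w′Cw the variance of H = w′g, and Cov(I_{G}, H) = w′Γβ the covariance between H = w′g and I_{G} = β′γ. The mean squared difference between H and I_{G} can be written as E[(H − I_{G})^{2}] = β′Γβ + w′Cw − 2w′Γβ, which should be minimized under the restriction U′Γβ = 0 assuming that Γ, C, U′, and w are known, i.e., it is necessary to minimize the function
\[ {f}_R\left(\boldsymbol{\upbeta}, \mathbf{v}\right)={\boldsymbol{\upbeta}}^{\prime}\boldsymbol{\Gamma} \boldsymbol{\upbeta} +{\mathbf{w}}^{\prime}\mathbf{Cw}-2{\mathbf{w}}^{\prime}\boldsymbol{\Gamma} \boldsymbol{\upbeta} +2{\mathbf{v}}^{\prime }{\mathbf{U}}^{\prime}\boldsymbol{\Gamma} \boldsymbol{\upbeta} \tag{6.1} \]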
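with respect to vectors β and v′ = [v_{1} v_{2} ⋯ v_{r − 1}], where v is a vector of Lagrange multipliers. In matrix notation, the derivative results of Eq. (6.1) are
\[ 2\boldsymbol{\Gamma} \boldsymbol{\upbeta} -2\boldsymbol{\Gamma} \mathbf{w}+2\boldsymbol{\Gamma} \mathbf{Uv}=\mathbf{0}\kern1em \mathrm{and}\kern1em {\mathbf{U}}^{\prime}\boldsymbol{\Gamma} \boldsymbol{\upbeta} =\mathbf{0}. \tag{6.2} \]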
Following the procedure described in Chap. 3 (Eqs. 3.2 to 3.5), it can be shown that the RLGSI vector of coefficients that minimizes E[(H − I_{G})^{2}] under the restriction U′Γβ = 0 is
\[ {\boldsymbol{\upbeta}}_{RG}={\mathbf{K}}_G\mathbf{w}, \tag{6.3} \]
where K_{G} = [I_{t} − Q_{G}], Q_{G} = U(U′ΓU)^{−1}U′Γ, w is a vector of economic weights, and I_{t} is an identity matrix t × t. When no restrictions are imposed on any of the traits, U′ is a null matrix and β_{RG} = w, the optimized LGSI vector of coefficients (see Chap. 5 for details).
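The step from the stationarity conditions to β_{RG} = K_{G}w is short enough to spell out. The following sketch assumes that Γ and U′ΓU are nonsingular (an assumption not stated explicitly above) and writes the Lagrangian with a factor of 2 on the multiplier term purely for convenience.

\[
\begin{aligned}
&\boldsymbol{\Gamma} \boldsymbol{\upbeta} -\boldsymbol{\Gamma} \mathbf{w}+\boldsymbol{\Gamma} \mathbf{Uv}=\mathbf{0},\qquad {\mathbf{U}}^{\prime}\boldsymbol{\Gamma} \boldsymbol{\upbeta} =\mathbf{0}\\
&\Rightarrow \boldsymbol{\upbeta} =\mathbf{w}-\mathbf{Uv}\\
&\Rightarrow {\mathbf{U}}^{\prime}\boldsymbol{\Gamma} \mathbf{w}-{\mathbf{U}}^{\prime}\boldsymbol{\Gamma} \mathbf{Uv}=\mathbf{0}\;\Rightarrow\; \mathbf{v}={\left({\mathbf{U}}^{\prime}\boldsymbol{\Gamma} \mathbf{U}\right)}^{-1}{\mathbf{U}}^{\prime}\boldsymbol{\Gamma} \mathbf{w}\\
&\Rightarrow {\boldsymbol{\upbeta}}_{RG}=\left[{\mathbf{I}}_t-\mathbf{U}{\left({\mathbf{U}}^{\prime}\boldsymbol{\Gamma} \mathbf{U}\right)}^{-1}{\mathbf{U}}^{\prime}\boldsymbol{\Gamma} \right]\mathbf{w}={\mathbf{K}}_G\mathbf{w}
\end{aligned}
\]

The second line uses the nonsingularity of Γ, and the third substitutes β = w − Uv into the restriction U′Γβ = 0.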
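By Eq. (6.3), the RLGSI and the maximized RLGSI selection response and expected genetic gain per trait can be written as
\[ {I}_{RG}={\boldsymbol{\upbeta}}_{RG}^{\prime}\boldsymbol{\upgamma}, \tag{6.4} \]
\[ {R}_{RG}=\frac{k_I}{L_G}\sqrt{{\boldsymbol{\upbeta}}_{RG}^{\prime }{\boldsymbol{\Gamma} \boldsymbol{\upbeta}}_{RG}}, \tag{6.5} \]
and
\[ {\mathbf{E}}_{RG}=\frac{k_I}{L_G}\frac{{\boldsymbol{\Gamma} \boldsymbol{\upbeta}}_{RG}}{\sqrt{{\boldsymbol{\upbeta}}_{RG}^{\prime }{\boldsymbol{\Gamma} \boldsymbol{\upbeta}}_{RG}}}, \tag{6.6} \]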
respectively, where k_{I} is the standardized selection differential (or selection intensity) associated with the RLGSI, and L_{G} is the interval between selection cycles or the time required to complete a selection cycle using the RLGSI. Equations (6.4) to (6.6) depend only on GEBV information; thus, they are useful in testing populations.
6.1.2 Statistical Properties of RLGSI
Assuming that H = w′g and \( {I}_{\mathrm{RG}}={\boldsymbol{\upbeta}}_{RG}^{\prime}\boldsymbol{\upgamma} \) have bivariate joint normal distribution, β_{RG} = K_{G}w, and Γ, C, and w are known, it can be shown that the RLGSI has the following properties:

1. Matrices K_{G} and Q_{G} are idempotent (\( {\mathbf{K}}_G={\mathbf{K}}_G^2 \) and \( {\mathbf{Q}}_G={\mathbf{Q}}_G^2 \)) and orthogonal (K_{G}Q_{G} = Q_{G}K_{G} = 0), that is, they are projectors. Matrix Q_{G} projects vector β = w into a space generated by the columns of matrix U′Γ due to the restriction U′Γβ = 0 used when f_{R}(β, v) (Eq. 6.1) is minimized with respect to vectors β and v, whereas matrix K_{G} projects w into a space perpendicular to that generated by the U′Γ matrix columns.
2. Because of the restriction U′Γβ = 0, matrix K_{G} projects vector w into a space smaller than the original space of w. The space reduction into which matrix K_{G} projects w is equal to the number of zeros that appear in Eq. (6.6).
3. Vector β_{RG} = K_{G}w minimizes the mean square error under the restriction U′Γβ = 0.
4. The variance of \( {I}_{RG}={\boldsymbol{\upbeta}}_{RG}^{\prime}\boldsymbol{\upgamma} \) (\( {\sigma}_{I_{RG}}^2={\boldsymbol{\upbeta}}_{RG}^{\prime }{\boldsymbol{\Gamma} \boldsymbol{\upbeta}}_{RG} \)) is equal to the covariance between \( {I}_{RG}={\boldsymbol{\upbeta}}_{RG}^{\prime}\boldsymbol{\upgamma} \) and H = w′g (\( {\sigma}_{HI_{RG}}={\mathbf{w}}^{\prime }{\boldsymbol{\Gamma} \boldsymbol{\upbeta}}_{RG} \)).
5. The maximized correlation between H and I_{RG} is equal to \( {\rho}_{HI_{RG}}=\frac{\sigma_{I_{RG}}}{\sigma_H} \), where \( {\sigma}_{I_{RG}}=\sqrt{{\boldsymbol{\upbeta}}_{RG}^{\prime }{\boldsymbol{\Gamma} \boldsymbol{\upbeta}}_{RG}} \) and \( {\sigma}_H=\sqrt{{\mathbf{w}}^{\prime}\mathbf{Cw}} \) are the standard deviations of \( {I}_{RG}={\boldsymbol{\upbeta}}_{RG}^{\prime}\boldsymbol{\upgamma} \) and H = w′g respectively.
6. The variance of the predicted error, \( Var\left(H-{I}_{RG}\right)=\left(1-{\rho}_{HI_{RG}}^2\right){\sigma}_H^2 \), is minimal. Note that \( Var\left(H-{I}_{RG}\right)={\sigma}_{I_{RG}}^2+{\sigma}_H^2-2{\sigma}_{HI_{RG}} \), and when β_{RG} = K_{G}w, \( {\sigma}_{I_{RG}}^2={\sigma}_{HI_{RG}} \), whence \( Var\left(H-{I}_{RG}\right)={\sigma}_H^2-{\sigma}_{I_{RG}}^2=\left(1-{\rho}_{HI_{RG}}^2\right){\sigma}_H^2 \) is minimal.
The statistical RLGSI properties are equal to the statistical RLPSI properties. Thus the RLGSI is an application of the RLPSI to the genomic selection context.
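Properties 1, 2, and 4 can be checked numerically. The sketch below is only an illustration: it borrows the estimated Γ reported for the real data set in Sect. 6.1.3 and assumes economic weights w′ = [5 −0.1 −0.1] with two null restrictions; the variable names are ours, not the chapter's. It uses the fact that the trace of an idempotent matrix equals its rank.

```python
import numpy as np

# Estimated genomic covariance matrix (GY, EHT, PHT) from Sect. 6.1.3
Gamma = np.array([[0.21, 2.95, 5.00],
                  [2.95, 42.41, 71.11],
                  [5.00, 71.11, 121.53]])
w = np.array([5.0, -0.1, -0.1])   # economic weights (assumed signs)
U = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [0.0, 0.0]])        # transpose of U' = [[1, 0, 0], [0, 1, 0]]

# Q_G = U (U' Gamma U)^{-1} U' Gamma and K_G = I - Q_G
Q_G = U @ np.linalg.solve(U.T @ Gamma @ U, U.T @ Gamma)
K_G = np.eye(3) - Q_G
beta_RG = K_G @ w

# Property 1: both matrices are idempotent and mutually orthogonal
idempotent = np.allclose(K_G @ K_G, K_G) and np.allclose(Q_G @ Q_G, Q_G)
orthogonal = np.allclose(K_G @ Q_G, 0.0) and np.allclose(Q_G @ K_G, 0.0)

# Property 2: K_G projects onto a space of dimension t - r (trace = rank)
rank_K = int(round(np.trace(K_G)))

# Property 4: Var(I_RG) equals Cov(I_RG, H)
var_index = beta_RG @ Gamma @ beta_RG
cov_index_H = w @ Gamma @ beta_RG

print(idempotent, orthogonal, rank_K, np.isclose(var_index, cov_index_H))
```

With t = 3 traits and r = 2 restrictions, the projection space of K_{G} has dimension t − r = 1.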
6.1.3 Numerical Examples
To estimate the parameters associated with the RLGSI, we use the real data set described in Chap. 5, Sect. 5.1.8, where we found that, in the testing population, the estimate of matrix Γ was \( \widehat{\boldsymbol{\Gamma}}=\left[\begin{array}{ccc}0.21& 2.95& 5.00\\ {}2.95& 42.41& 71.11\\ {}5.00& 71.11& 121.53\end{array}\right] \). We use this matrix and the GEBVs associated with the traits grain yield (GY, ton ha^{−1}), ear height (EHT, cm), and plant height (PHT, cm) to illustrate the RLGSI theoretical results.
Suppose that on the RLGSI expected genetic gain per trait we impose one and two null restrictions using matrices \( {\mathbf{U}}_1^{\prime }=\left[1\kern0.5em 0\kern0.5em 0\right] \) and \( {\mathbf{U}}_2^{\prime }=\left[\begin{array}{ccc}1& 0& 0\\ 0& 1& 0\end{array}\right] \) (see Chap. 3, Sect. 3.1.3, for details about matrix U′). We need to estimate the RLGSI vector of coefficients (β_{RG} = K_{G}w) as \( {\widehat{\boldsymbol{\upbeta}}}_{RG}={\widehat{\mathbf{K}}}_G\mathbf{w} \), where \( {\widehat{\mathbf{K}}}_G=\left[{\mathbf{I}}_3-{\widehat{\mathbf{Q}}}_G\right] \) and \( {\widehat{\mathbf{Q}}}_G=\mathbf{U}{\left({\mathbf{U}}^{\prime}\widehat{\boldsymbol{\Gamma}}\mathbf{U}\right)}^{-1}{\mathbf{U}}^{\prime}\widehat{\boldsymbol{\Gamma}} \) are estimates of matrices K_{G} = [I_{3} − Q_{G}] and Q_{G} = U(U′ΓU)^{−1}U′Γ respectively, and I_{3} is an identity matrix 3 × 3. The estimated Q_{G} matrices for restrictions \( {\mathbf{U}}_1^{\prime } \) and \( {\mathbf{U}}_2^{\prime } \) were \( {\widehat{\mathbf{Q}}}_{G_1}={\mathbf{U}}_1{\left({\mathbf{U}}_1^{\prime}\widehat{\boldsymbol{\Gamma}}{\mathbf{U}}_1\right)}^{-1}{\mathbf{U}}_1^{\prime}\widehat{\boldsymbol{\Gamma}}=\left[\begin{array}{ccc}1.0& 14.05& 23.81\\ 0& 0& 0\\ 0& 0& 0\end{array}\right] \) and \( {\widehat{\mathbf{Q}}}_{G_2}={\mathbf{U}}_2{\left({\mathbf{U}}_2^{\prime}\widehat{\boldsymbol{\Gamma}}{\mathbf{U}}_2\right)}^{-1}{\mathbf{U}}_2^{\prime}\widehat{\boldsymbol{\Gamma}}=\left[\begin{array}{ccc}1.0& 0& 11.18\\ 0& 1.0& 0.90\\ 0& 0& 0\end{array}\right] \) respectively, whereas the estimated K_{G} matrices for both restrictions were \( {\widehat{\mathbf{K}}}_{G_1}=\left[{\mathbf{I}}_3-{\widehat{\mathbf{Q}}}_{G_1}\right]=\left[\begin{array}{ccc}0& -14.05& -23.81\\ 0& 1.0& 0\\ 0& 0& 1.0\end{array}\right] \) and \( {\widehat{\mathbf{K}}}_{G_2}=\left[{\mathbf{I}}_3-{\widehat{\mathbf{Q}}}_{G_2}\right]=\left[\begin{array}{ccc}0& 0& -11.18\\ 0& 0& -0.90\\ 0& 0& 1.0\end{array}\right] \).
Let \( {\mathbf{w}}^{\prime }=\left[5\kern0.5em -0.1\kern0.5em -0.1\right] \) be the vector of economic weights; then the estimated RLGSI vectors of coefficients for one and two null restrictions were \( {\widehat{\boldsymbol{\upbeta}}}_{RG_1}^{\prime }={\mathbf{w}}^{\prime }{\widehat{\mathbf{K}}}_{G_1}^{\prime }=\left[3.78\kern0.5em -0.1\kern0.5em -0.1\right] \) and \( {\widehat{\boldsymbol{\upbeta}}}_{RG_2}^{\prime }={\mathbf{w}}^{\prime }{\widehat{\mathbf{K}}}_{G_2}^{\prime }=\left[1.12\kern0.5em 0.09\kern0.5em -0.1\right] \) respectively, and the estimated RLGSI for both restrictions can be written as \( {\widehat{I}}_{RG_1}=3.78{\mathrm{GEBV}}_1-0.1{\mathrm{GEBV}}_2-0.1{\mathrm{GEBV}}_3 \) and \( {\widehat{I}}_{RG_2}=1.12{\mathrm{GEBV}}_1+0.09{\mathrm{GEBV}}_2-0.1{\mathrm{GEBV}}_3 \), where GEBV_{1}, GEBV_{2}, and GEBV_{3} are the genomic estimated breeding values associated with traits GY, EHT, and PHT respectively in the testing population.
Table 6.1 presents 20 genotypes selected from a population of 380 genotypes and the GEBVs in the testing population ranked according to the estimated RLGSI values for one restriction, where \( {\mathbf{U}}_1^{\prime }=\left[1\kern0.5em 0\kern0.5em 0\right] \). The estimated RLGSI values for genotypes 5 and 306 can be obtained as follows: \( {\widehat{I}}_{RG_5}=3.78\left(-0.6\right)-0.1\left(-8.67\right)-0.1\left(-15.97\right)=0.196 \) and \( {\widehat{I}}_{RG_{306}}=3.78(0.13)-0.1(1.31)-0.1(1.66)=0.194 \) respectively. This procedure is valid for any number of genotypes and GEBVs in the testing population.
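The index values for the two genotypes can be reproduced in a few lines. This is a minimal sketch: the coefficient vector is the rounded estimate for one null restriction, and the signs of the coefficients and of the GEBVs are assumptions inferred from the reported index values (0.196 and 0.194), not values read directly from Table 6.1.

```python
import numpy as np

# Rounded RLGSI coefficients for one null restriction (signs assumed)
beta_RG1 = np.array([3.78, -0.1, -0.1])

# GEBVs for GY, EHT, PHT of genotypes 5 and 306 (signs assumed)
gebv = {
    5: np.array([-0.6, -8.67, -15.97]),
    306: np.array([0.13, 1.31, 1.66]),
}

# Index value of each genotype is the inner product beta' * GEBV
scores = {genotype: float(beta_RG1 @ values) for genotype, values in gebv.items()}

# Rank genotypes from highest to lowest index value
ranking = sorted(scores, key=scores.get, reverse=True)
print(scores, ranking)
```

The same inner product, applied row by row to a full matrix of GEBVs, gives the ranking for the whole testing population.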
Assume a selection intensity of 10% (\( {k}_{I_G}=1.755 \)); then the estimated RLGSI selection response and expected genetic gain per trait not including the interval length were \( {\widehat{R}}_{RG_1}={k}_{I_G}\sqrt{{\widehat{\boldsymbol{\upbeta}}}_{RG_1}^{\prime }{\widehat{\boldsymbol{\Gamma}}\widehat{\boldsymbol{\upbeta}}}_{RG_1}}=0.40 \) and \( {\widehat{\mathbf{E}}}_{RG_1}^{\prime }={k}_{I_G}\frac{{\widehat{\boldsymbol{\upbeta}}}_{RG_1}^{\prime}\widehat{\boldsymbol{\Gamma}}}{\sqrt{{\widehat{\boldsymbol{\upbeta}}}_{RG_1}^{\prime }{\widehat{\boldsymbol{\Gamma}}\widehat{\boldsymbol{\upbeta}}}_{RG_1}}}=\left[0\kern0.5em -1.42\kern0.5em -2.58\right] \) respectively. For two restrictions, with \( {\mathbf{U}}_2^{\prime }=\left[\begin{array}{lll}1& 0& 0\\ 0& 1& 0\end{array}\right] \), the estimated RLGSI selection response and expected genetic gains not including the interval length were \( {\widehat{R}}_{RG_2}={k}_{I_G}\sqrt{{\widehat{\boldsymbol{\upbeta}}}_{RG_2}^{\prime }{\widehat{\boldsymbol{\Gamma}}\widehat{\boldsymbol{\upbeta}}}_{RG_2}}=0.23 \) and \( {\widehat{\mathbf{E}}}_{RG_2}^{\prime }={k}_{I_G}\frac{{\widehat{\boldsymbol{\upbeta}}}_{RG_2}^{\prime}\widehat{\boldsymbol{\Gamma}}}{\sqrt{{\widehat{\boldsymbol{\upbeta}}}_{RG_2}^{\prime }{\widehat{\boldsymbol{\Gamma}}\widehat{\boldsymbol{\upbeta}}}_{RG_2}}}=\left[0\kern0.5em 0\kern0.5em -2.29\right] \) respectively. When the number of restrictions increases, the estimated RLGSI selection response decreases, whereas the number of zeros in the estimated RLGSI expected genetic gain per trait increases. The number of zeros in the estimated expected genetic gain per trait is equal to the number of restrictions imposed on the RLGSI by matrix U′, in which each restricted trait is marked by a 1.
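The selection responses and expected gains quoted above can be recomputed from the estimated Γ. The function below is a sketch under stated assumptions: the name `rlgsi` and its signature are ours, the economic weights are taken as w′ = [5 −0.1 −0.1], and the interval length defaults to 1 so that it drops out, as in the text.

```python
import numpy as np

def rlgsi(Gamma, w, U, k_I=1.755, L_G=1.0):
    """Return (beta_RG, R_RG, E_RG); U is t x r, the transpose of U'."""
    t = Gamma.shape[0]
    # Q_G = U (U' Gamma U)^{-1} U' Gamma and K_G = I_t - Q_G
    Q_G = U @ np.linalg.solve(U.T @ Gamma @ U, U.T @ Gamma)
    beta = (np.eye(t) - Q_G) @ w
    sd_index = np.sqrt(beta @ Gamma @ beta)          # standard deviation of I_RG
    R = k_I * sd_index / L_G                         # selection response
    E = k_I * (Gamma @ beta) / (sd_index * L_G)      # expected gain per trait
    return beta, R, E

Gamma = np.array([[0.21, 2.95, 5.00],
                  [2.95, 42.41, 71.11],
                  [5.00, 71.11, 121.53]])
w = np.array([5.0, -0.1, -0.1])                      # assumed signs
U1 = np.array([[1.0], [0.0], [0.0]])                 # one null restriction
U2 = np.array([[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]])  # two null restrictions

beta1, R1, E1 = rlgsi(Gamma, w, U1)
beta2, R2, E2 = rlgsi(Gamma, w, U2)
print(np.round(beta1, 2), round(R1, 2), np.round(E1, 2))
print(np.round(beta2, 2), round(R2, 2), np.round(E2, 2))
```

Because Γ is entered to two decimals, the results agree with the reported values only up to rounding. Passing `L_G=1.5` would give the per-year response for the genomic interval length used later in the chapter.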
Figure 6.1 presents the frequency distribution of the estimated RLGSI values for one (Fig. 6.1a) and two null restrictions (Fig. 6.1b). For both restrictions the frequency distribution of the estimated RLGSI values approaches the normal distribution.
Now we use the simulated data set described in Chap. 2, Sect. 2.8.1, to compare RLPSI (restricted linear phenotypic selection index, Chap. 3 for details) efficiency versus RLGSI efficiency. Table 6.2 presents the estimated RLPSI and RLGSI selection response for one, two, and three null restrictions imposed by matrices \( {\mathbf{U}}_1^{\prime }=\left[1\kern0.5em 0\kern0.5em 0\right] \), \( {\mathbf{U}}_2^{\prime }=\left[\begin{array}{lll}1& 0& 0\\ {}0& 1& 0\end{array}\right] \), and \( {\mathbf{U}}_3^{\prime }=\left[\begin{array}{llll}1& 0& 0& 0\\ {}0& 1& 0& 0\\ {}0& 0& 1& 0\end{array}\right] \) for five simulated selection cycles including and not including the interval between selection cycles. In each selection cycle, the sample size was equal to 500 genotypes, each with four repetitions and four traits, whereas the selection intensity was 10% (k_{I} = 1.755); the interval lengths for the RLPSI and RLGSI were 4 and 1.5 years (Beyene et al. 2015) respectively.
Table 6.2 is divided into two parts. The first part presents the estimated RLPSI selection responses, whereas the second part presents the estimated RLGSI selection responses. Columns 2, 3, and 4 present the estimated selection responses not including the interval length, whereas columns 5, 6, and 7 present the estimated selection responses including the interval length. The averages of the estimated RLPSI selection response not including the interval length for one, two, and three restrictions were 7.04, 5.50, and 3.90, whereas when the interval length was included, the averages were 1.76, 1.38, and 0.98 respectively. The averages of the estimated RLGSI selection response not including the interval length for one, two, and three restrictions were 5.04, 3.72, and 2.79, whereas when the interval length was included, the averages were 3.36, 2.48, and 1.86 respectively. These results indicate that when the interval length was included in the estimation of the selection response, RLGSI efficiency was greater than RLPSI efficiency; conversely, when the interval length was not included, RLPSI efficiency was greater than RLGSI efficiency.
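The reversal in ranking comes entirely from dividing each response by its interval length. A small arithmetic sketch, using the averages quoted above and the interval lengths of 4 and 1.5 years (variable names are ours):

```python
# Interval lengths in years for the phenotypic and genomic indices
L_RLPSI, L_RLGSI = 4.0, 1.5

# Average selection responses per cycle for one, two, and three restrictions
response_rlpsi = [7.04, 5.50, 3.90]
response_rlgsi = [5.04, 3.72, 2.79]

# Response per year = response per cycle / interval length
per_year_rlpsi = [r / L_RLPSI for r in response_rlpsi]
per_year_rlgsi = [r / L_RLGSI for r in response_rlgsi]

for n, (p, g) in enumerate(zip(per_year_rlpsi, per_year_rlgsi), start=1):
    print(f"{n} restriction(s): RLPSI {p:.3f}, RLGSI {g:.3f} per year")
```

Per cycle the RLPSI response is larger for every restriction, but per year the RLGSI response is larger because its cycle is shorter.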
Table 6.3 presents the estimated RLPSI (first part) and RLGSI (second part) expected genetic gain per trait not including the interval between selection cycles for one, two, and three null restrictions in five simulated selection cycles. In this case, RLPSI efficiency is greater than RLGSI efficiency because the averages of the estimated RLPSI expected genetic gain per trait were −2.52, 2.26, and 2.26 for one null restriction; 2.84 and 2.65 for two null restrictions; and 3.90 for three null restrictions. For the same set of restrictions, the averages of the estimated RLGSI expected genetic gain per trait were: −1.85, 1.13, and 2.06 for one null restriction; 1.52 and 2.19 for two null restrictions, and 2.79 for three null restrictions. However, divided by the interval length (4 years in the RLPSI), the averages of the estimated RLPSI expected genetic gain per trait were −0.63, 0.57, and 0.57 for one null restriction; 0.71 and 0.66 for two null restrictions, and 0.98 for three null restrictions. In a similar manner, dividing by the interval length (1.5 years in this case), the averages of the estimated RLGSI expected genetic gain per trait were −1.23, 0.75, and 1.37 for one restriction; 1.01 and 1.46 for two restrictions; and 1.86 for three restrictions.
Table 6.4 presents the estimated RLPSI heritability (\( {\widehat{h}}_{I_R}^2 \)) values, the estimated RLGSI accuracy (\( {\widehat{\rho}}_{HI_{RG}} \)) values, the values of \( W=\frac{{\widehat{\rho}}_{HI_{RG}}}{{\widehat{h}}_{I_R}}{L}_{RP} \) (L_{RP} = 4), and the values of \( \widehat{p}=100\left({\widehat{\lambda}}_R-1\right) \), where \( {\widehat{\lambda}}_R={\widehat{\rho}}_{HI_R}/{\widehat{\rho}}_{HI_{RG}} \) and \( {\widehat{\rho}}_{HI_R} \) is the estimated RLPSI accuracy, for one, two, and three restrictions in five simulated selection cycles. The RLGSI interval length was L_{RG} = 1.5, whereas the averages of the W values for each restriction were 1.22, 0.85, and 0.60; that is, the estimated Technow inequality (Technow et al. 2013), \( {L}_{RG}<\frac{{\widehat{\rho}}_{HI_{RG}}}{{\widehat{h}}_{I_R}}{L}_{RP} \) (Chap. 5, Eq. 5.18), was not true. Thus, according to the Technow inequality, for this data set RLGSI efficiency in terms of time was not greater than RLPSI efficiency. The inequality failed because the estimated RLGSI accuracy was very low, whereas the estimated RLPSI heritability was high: the averages of the estimated RLGSI accuracy for one, two, and three null restrictions were 0.25, 0.19, and 0.14 respectively, and the averages of the estimated RLPSI heritability values were 0.70, 0.78, and 0.88 respectively.
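The Technow inequality is easy to evaluate once the accuracy, heritability, and interval lengths are known. The sketch below uses the averages quoted above; the function name is hypothetical, and because it works from averaged accuracies and heritabilities rather than per-cycle values, the bounds it prints differ slightly from the reported W averages (1.22, 0.85, and 0.60).

```python
import math

def technow_bound(rho_genomic, h2_phenotypic, L_phenotypic):
    """Right-hand side of the Technow inequality: (rho / h) * L_P."""
    return rho_genomic / math.sqrt(h2_phenotypic) * L_phenotypic

L_RG, L_RP = 1.5, 4.0                    # interval lengths in years
avg_accuracy = [0.25, 0.19, 0.14]        # RLGSI accuracy, 1 to 3 restrictions
avg_heritability = [0.70, 0.78, 0.88]    # RLPSI heritability, 1 to 3 restrictions

bounds = [technow_bound(rho, h2, L_RP)
          for rho, h2 in zip(avg_accuracy, avg_heritability)]
# The genomic index is more efficient per unit time only if L_RG < bound
holds = [L_RG < b for b in bounds]
print([round(b, 2) for b in bounds], holds)
```

All three bounds fall below L_{RG} = 1.5, so the inequality fails for every restriction, in line with the conclusion drawn from Table 6.4.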
The last three columns of Table 6.4, from left to right, present the estimated p values, \( \widehat{p}=100\left({\widehat{\lambda}}_R-1\right) \), for one, two, and three null restrictions in five simulated selection cycles. The averages of the \( \widehat{p} \) values indicate that, for one, two, and three restrictions, RLPSI efficiency was 65.05%, 78.73%, and 74.09% greater than RLGSI efficiency at predicting the net genetic merit. Thus, for this data set, the RLPSI was a better predictor of the net genetic merit than the RLGSI in each cycle.
6.2 The Predetermined Proportional Gain Linear Genomic Selection Index
6.2.1 Objective of the PPGLGSI
Let \( {\mathbf{d}}^{\prime }=\left[{d}_1\kern0.5em {d}_2\kern0.5em \dots \kern0.5em {d}_r\right] \) be a vector 1 × r (r is the number of predetermined proportional gains) of the predetermined proportional gains imposed by the breeder, and assume that μ_{q} is the population mean of the qth trait before selection. The objective of the predetermined proportional gain linear genomic selection index (PPGLGSI) is to change μ_{q} to μ_{q} + d_{q} in the testing population, where d_{q} is a predetermined change in μ_{q}. It is possible to solve this problem by minimizing the mean squared difference between I_{G} = β′γ and H = w′g, E[(H − I_{G})^{2}], under the restriction U′Γβ = θ_{G}d, where θ_{G} is a proportionality constant, or under the restriction D′U′Γβ = 0, where \( {\mathbf{D}}^{\prime }=\left[\begin{array}{ccccc}{d}_r& 0& \dots & 0& -{d}_1\\ 0& {d}_r& \dots & 0& -{d}_2\\ \vdots & \vdots & \ddots & \vdots & \vdots \\ 0& 0& \dots & {d}_r& -{d}_{r-1}\end{array}\right] \) is a matrix (r − 1) × r (see Chap. 3 for details), and d_{q} (q = 1, 2, …, r) is the qth element of vector \( {\mathbf{d}}^{\prime }=\left[{d}_1\kern0.5em {d}_2\kern0.5em \dots \kern0.5em {d}_r\right] \); U′ is a matrix (t − 1) × t of 1s and 0s, and \( \boldsymbol{\Gamma} =\left\{{\sigma}_{\gamma_{q{q}^{\prime }}}\right\} \) (q, q′ = 1, 2, …, t, t = number of traits) is a covariance matrix of additive genomic breeding values, γ′ = [γ_{1} γ_{2} … γ_{t}].
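Matrix D′ can be built mechanically from d. The sketch below assumes the layout in which row q carries d_{r} in column q and −d_{q} in the last column, and assumes d_{r} ≠ 0; the function name is ours. With that layout D′d = 0, so D′U′Γβ = 0 holds whenever U′Γβ is proportional to d, which is why the two forms of the restriction are interchangeable.

```python
import numpy as np

def build_D_transpose(d):
    """Return the (r-1) x r matrix D' built from the vector of gains d."""
    d = np.asarray(d, dtype=float)
    r = d.size
    D_t = np.zeros((r - 1, r))
    D_t[:, :-1] = d[-1] * np.eye(r - 1)   # d_r on the leading diagonal
    D_t[:, -1] = -d[:-1]                  # -d_1, ..., -d_{r-1} in the last column
    return D_t

d3 = np.array([7.0, 3.0, 5.0])
D3_t = build_D_transpose(d3)
print(D3_t)
print(D3_t @ d3)   # every row annihilates d
```

For r = 1 the function returns an empty 0 × 1 matrix, which matches the fact that a single predetermined gain imposes no proportionality constraint among traits.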
6.2.2 The Maximized PPGLGSI Parameters
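In this subsection, we minimize E[(H − I_{G})^{2}] under the restriction D′U′Γβ = 0 and later under the restriction U′Γβ = θ_{G}d. Under the restriction D′U′Γβ = 0, it is necessary to minimize the function
\[ {f}_P\left(\boldsymbol{\upbeta}, \mathbf{v}\right)={\boldsymbol{\upbeta}}^{\prime}\boldsymbol{\Gamma} \boldsymbol{\upbeta} +{\mathbf{w}}^{\prime}\mathbf{Cw}-2{\mathbf{w}}^{\prime}\boldsymbol{\Gamma} \boldsymbol{\upbeta} +2{\mathbf{v}}^{\prime }{\mathbf{D}}^{\prime }{\mathbf{U}}^{\prime}\boldsymbol{\Gamma} \boldsymbol{\upbeta} \tag{6.7} \]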
with respect to β and \( {\mathbf{v}}^{\prime }=\left[{v}_1\kern0.5em {v}_2\kern0.5em \dots \kern0.5em {v}_{r-1}\right] \), where v is a vector of Lagrange multipliers. From a mathematical point of view, Eq. (6.7) has the same form as Eq. (6.1); thus, the PPGLGSI vector of coefficients has the same form as the RLGSI vector of coefficients (Eq. 6.3), i.e., the PPGLGSI vector of coefficients is equal to
\[ {\boldsymbol{\upbeta}}_{PG}={\mathbf{K}}_P\mathbf{w}, \tag{6.8} \]
where now K_{P} = [I_{t} − Q_{P}], Q_{P} = UD(D′U′ΓUD)^{−1}D′U′Γ, w is a vector of economic weights, and I_{t} is an identity matrix t × t. When D′ = U′, β_{PG} = β_{RG} (the RLGSI vector of coefficients), and when U′ is a null matrix, β_{PG} = w (the LGSI vector of coefficients). This means that the PPGLGSI includes the RLGSI and the LGSI as particular cases.
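Under the restriction U′Γβ = θ_{G}d (see Chap. 3 for details) the vector of coefficients of the PPGLGSI can be written as
\[ {\boldsymbol{\upbeta}}_{PG}={\boldsymbol{\upbeta}}_{RG}+{\theta}_G\mathbf{U}{\left({\mathbf{U}}^{\prime}\boldsymbol{\Gamma} \mathbf{U}\right)}^{-1}\mathbf{d}, \tag{6.9} \]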
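where β_{RG} = K_{G}w (Eq. 6.3), K_{G} = [I − Q_{G}], Q_{G} = U(U′ΓU)^{−1}U′Γ, and \( {\mathbf{d}}^{\prime }=\left[{d}_1\kern0.5em {d}_2\kern0.5em \dots \kern0.5em {d}_r\right] \) is the vector of the predetermined proportional gains imposed by the breeder. It can be shown that θ_{G}, the proportionality constant, can be written as
\[ {\theta}_G=\frac{{\mathbf{d}}^{\prime }{\left({\mathbf{U}}^{\prime}\boldsymbol{\Gamma} \mathbf{U}\right)}^{-1}{\mathbf{U}}^{\prime}\boldsymbol{\Gamma} \mathbf{w}}{{\mathbf{d}}^{\prime }{\left({\mathbf{U}}^{\prime}\boldsymbol{\Gamma} \mathbf{U}\right)}^{-1}\mathbf{d}}. \tag{6.10} \]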
When θ_{G} = 0, β_{PG} = β_{RG}, and when U′ is a null matrix, β_{PG} = w. Equations (6.8) and (6.9) give the same results, that is, both equations express the same result in a different mathematical way.
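The equivalence of Eqs. (6.8) and (6.9) can be checked numerically. The sketch below is illustrative only: the function names are ours, the economic weights w′ = [5 −0.1 −0.1] are assumed, and the example uses the estimated Γ of Sect. 6.1.3 with two predetermined gains d′ = [7 3], for which D′ = [3 −7].

```python
import numpy as np

def ppglgsi_eq68(Gamma, w, U, D):
    """beta_PG = K_P w with Q_P = UD (D'U' Gamma UD)^{-1} D'U' Gamma."""
    M = U @ D                                   # t x (r-1)
    Q_P = M @ np.linalg.solve(M.T @ Gamma @ M, M.T @ Gamma)
    return (np.eye(len(w)) - Q_P) @ w

def ppglgsi_eq69(Gamma, w, U, d):
    """beta_PG = beta_RG + theta_G U (U' Gamma U)^{-1} d; returns (beta, theta)."""
    A = U.T @ Gamma @ U
    A_inv_UGw = np.linalg.solve(A, U.T @ Gamma @ w)
    A_inv_d = np.linalg.solve(A, d)
    theta = (d @ A_inv_UGw) / (d @ A_inv_d)     # proportionality constant
    beta_RG = w - U @ A_inv_UGw                 # equals K_G w
    return beta_RG + theta * (U @ A_inv_d), theta

Gamma = np.array([[0.21, 2.95, 5.00],
                  [2.95, 42.41, 71.11],
                  [5.00, 71.11, 121.53]])
w = np.array([5.0, -0.1, -0.1])                 # assumed signs
U = np.array([[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]])
d = np.array([7.0, 3.0])
D = np.array([[3.0], [-7.0]])                   # transpose of D' = [d_2, -d_1]

beta_68 = ppglgsi_eq68(Gamma, w, U, D)
beta_69, theta = ppglgsi_eq69(Gamma, w, U, d)
print(np.round(beta_68, 3), np.round(beta_69, 3), theta)
print(U.T @ Gamma @ beta_69 / d)                # both entries equal theta
```

The last line confirms that the restricted covariances U′Γβ_{PG} are proportional to d with constant θ_{G}, which is the restriction the index is designed to satisfy.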
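The maximized selection response and expected genetic gain per trait of the PPGLGSI can be written as
\[ {R}_{PG}=\frac{k_I}{L_G}\sqrt{{\boldsymbol{\upbeta}}_{PG}^{\prime }{\boldsymbol{\Gamma} \boldsymbol{\upbeta}}_{PG}} \tag{6.11} \]
and
\[ {\mathbf{E}}_{PG}=\frac{k_I}{L_G}\frac{{\boldsymbol{\Gamma} \boldsymbol{\upbeta}}_{PG}}{\sqrt{{\boldsymbol{\upbeta}}_{PG}^{\prime }{\boldsymbol{\Gamma} \boldsymbol{\upbeta}}_{PG}}}, \tag{6.12} \]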
respectively, where L_{G} is the time required to complete a selection cycle using the PPGLGSI. Equations (6.11) and (6.12) depend only on GEBV information.
6.2.3 Statistical Properties of the PPGLGSI
Assuming that H = w′g and the PPGLGSI (\( {I}_{PG}={\boldsymbol{\upbeta}}_{PG}^{\prime}\boldsymbol{\upgamma} \)) have bivariate joint normal distribution, that β_{PG} = K_{P}w, and that Γ, C, and w are known, it can be shown that the PPGLGSI has the following statistical properties:

1. The vector β_{PG} = K_{P}w minimizes the mean square error under the restriction D′U′Γβ = 0.
2. The variance of \( {I}_{PG}={\boldsymbol{\upbeta}}_{PG}^{\prime}\boldsymbol{\upgamma} \) (\( {\sigma}_{I_{PG}}^2={\boldsymbol{\upbeta}}_{PG}^{\prime }{\boldsymbol{\Gamma} \boldsymbol{\upbeta}}_{PG} \)) is equal to the covariance between \( {I}_{PG}={\boldsymbol{\upbeta}}_{PG}^{\prime}\boldsymbol{\upgamma} \) and H = w′g (\( {\sigma}_{HI_{PG}}={\mathbf{w}}^{\prime }{\boldsymbol{\Gamma} \boldsymbol{\upbeta}}_{PG} \)).
3. The maximized correlation between H and I_{PG} (also called PPGLGSI accuracy) is equal to \( {\rho}_{HI_{PG}}=\frac{\sigma_{I_{PG}}}{\sigma_H} \), where \( {\sigma}_{I_{PG}}=\sqrt{{\boldsymbol{\upbeta}}_{PG}^{\prime }{\boldsymbol{\Gamma} \boldsymbol{\upbeta}}_{PG}} \) and \( {\sigma}_H=\sqrt{{\mathbf{w}}^{\prime}\mathbf{Cw}} \) are the standard deviations of \( {I}_{PG}={\boldsymbol{\upbeta}}_{PG}^{\prime}\boldsymbol{\upgamma} \) and H = w′g respectively.
4. The variance of the predicted error, \( Var\left(H-{I}_{PG}\right)=\left(1-{\rho}_{HI_{PG}}^2\right){\sigma}_H^2 \), is minimal.
The statistical PPGLGSI properties are equal to the statistical PPGLPSI properties; thus, the PPGLGSI is an application of the PPGLPSI to the genomic selection context.
6.2.4 Numerical Example
To illustrate the PPGLGSI theory, we use the estimated matrix \( \widehat{\boldsymbol{\Gamma}}=\left[\begin{array}{ccc}0.21& 2.95& 5.00\\ 2.95& 42.41& 71.11\\ 5.00& 71.11& 121.53\end{array}\right] \) and the GEBVs associated with the traits GY (ton ha^{−1}), EHT (cm), and PHT (cm), described in Sect. 6.1.3.
It is necessary to estimate the PPGLGSI vector of coefficients β_{PG} = β_{RG} + θ_{G}U(U′ΓU)^{−1}d (Eqs. 6.9 and 6.10). In Sect. 6.1.3, we showed that the estimated vectors of coefficients β_{RG} = K_{G}w for the null restrictions \( {\mathbf{U}}_1^{\prime }=\left[1\kern0.5em 0\kern0.5em 0\right] \) and \( {\mathbf{U}}_2^{\prime }=\left[\begin{array}{lll}1& 0& 0\\ 0& 1& 0\end{array}\right] \) were \( {\widehat{\boldsymbol{\upbeta}}}_{RG_1}^{\prime }={\mathbf{w}}^{\prime }{\widehat{\mathbf{K}}}_{G_1}^{\prime }=\left[3.78\kern0.5em -0.1\kern0.5em -0.1\right] \) and \( {\widehat{\boldsymbol{\upbeta}}}_{RG_2}^{\prime }={\mathbf{w}}^{\prime }{\widehat{\mathbf{K}}}_{G_2}^{\prime }=\left[1.12\kern0.5em 0.09\kern0.5em -0.1\right] \) respectively, where \( {\mathbf{w}}^{\prime }=\left[5\kern0.5em -0.1\kern0.5em -0.1\right] \). This means that to estimate β_{PG} = β_{RG} + θ_{G}U(U′ΓU)^{−1}d, we need only estimate θ_{G}U(U′ΓU)^{−1}d for both sets of restrictions.
Consider matrix \( {\mathbf{U}}_1^{\prime }=\left[1\kern0.5em 0\kern0.5em 0\right] \) and let d_{1} = 7.0 be the predetermined proportional gain restriction for trait 1. We can estimate θ_{G} and U(U′ΓU)^{−1}d as \( {\widehat{\theta}}_{G_1}=\frac{7.0{\left({\mathbf{U}}_1^{\prime}\widehat{\boldsymbol{\Gamma}}{\mathbf{U}}_1\right)}^{-1}{\mathbf{U}}_1^{\prime}\widehat{\boldsymbol{\Gamma}}\mathbf{w}}{7.0{\left({\mathbf{U}}_1^{\prime}\widehat{\boldsymbol{\Gamma}}{\mathbf{U}}_1\right)}^{-1}7.0}=0.036 \) and \( {\mathbf{U}}_1{\left({\mathbf{U}}_1^{\prime}\widehat{\boldsymbol{\Gamma}}{\mathbf{U}}_1\right)}^{-1}7.0=\left[\begin{array}{c}33.333\\ 0\\ 0\end{array}\right] \), whence the PPGLGSI vector of coefficients was \( {\widehat{\boldsymbol{\upbeta}}}_{PG_1}={\widehat{\boldsymbol{\upbeta}}}_{RG_1}+{\widehat{\theta}}_{G_1}{\mathbf{U}}_1{\left({\mathbf{U}}_1^{\prime}\widehat{\boldsymbol{\Gamma}}{\mathbf{U}}_1\right)}^{-1}7.0=\left[\begin{array}{c}5.0\\ -0.1\\ -0.1\end{array}\right] \), and the estimated PPGLGSI was \( {\widehat{I}}_{PG_1}=5.0{\mathrm{GEBV}}_1-0.1{\mathrm{GEBV}}_2-0.1{\mathrm{GEBV}}_3 \). In a similar manner, we can estimate the PPGLGSI vector of coefficients under restrictions \( {\mathbf{U}}_2^{\prime }=\left[\begin{array}{ccc}1& 0& 0\\ 0& 1& 0\end{array}\right] \) and \( {\mathbf{d}}_2^{\prime }=\left[7\kern0.5em 3\right] \). In this case, \( {\widehat{\boldsymbol{\upbeta}}}_{PG_2}={\widehat{\boldsymbol{\upbeta}}}_{RG_2}+{\widehat{\theta}}_{G_2}{\mathbf{U}}_2{\left({\mathbf{U}}_2^{\prime}\widehat{\boldsymbol{\Gamma}}{\mathbf{U}}_2\right)}^{-1}{\mathbf{d}}_2=\left[\begin{array}{c}4.97\\ -0.18\\ -0.10\end{array}\right] \) and the estimated PPGLGSI was \( {\widehat{I}}_{PG_2}=4.97{\mathrm{GEBV}}_1-0.18{\mathrm{GEBV}}_2-0.1{\mathrm{GEBV}}_3 \).
Figure 6.2 presents the frequency distribution of the estimated PPGLGSI values for one (Fig. 6.2a) and two (Fig. 6.2b) predetermined restrictions, d = 7 and \( {\mathbf{d}}^{\prime }=\left[7\kern0.5em 3\right] \) respectively, obtained in a real testing population for one selection cycle in one environment. For both restrictions, the frequency distribution of the estimated PPGLGSI values approaches the normal distribution.
Assume a selection intensity of 10% (\( {k}_{I_G}=1.755 \)); then, for one predetermined restriction, where \( {\mathbf{U}}_1^{\prime }=\left[1\kern0.5em 0\kern0.5em 0\right] \) and d_{1} = 7.0, the estimated PPGLGSI selection response and expected genetic gain per trait, not including the interval length, were \( {\widehat{R}}_{PG_1}={k}_{I_G}\sqrt{{\widehat{\boldsymbol{\upbeta}}}_{PG_1}^{\prime }{\widehat{\boldsymbol{\Gamma}}\widehat{\boldsymbol{\upbeta}}}_{PG_1}}=1.05 \) and \( {\widehat{\mathbf{E}}}_{PG_1}^{\prime }={k}_{I_G}\frac{{\widehat{\boldsymbol{\upbeta}}}_{PG_1}^{\prime}\widehat{\boldsymbol{\Gamma}}}{\sqrt{{\widehat{\boldsymbol{\upbeta}}}_{PG_1}^{\prime }{\widehat{\boldsymbol{\Gamma}}\widehat{\boldsymbol{\upbeta}}}_{PG_1}}}=\left[0.74\kern0.5em 9.92\kern0.5em 16.54\right] \) respectively. For two restrictions, with \( {\mathbf{U}}_2^{\prime }=\left[\begin{array}{lll}1& 0& 0\\ 0& 1& 0\end{array}\right] \) and \( {\mathbf{d}}^{\prime }=\left[7\kern0.5em 3\right] \), the estimated PPGLGSI selection response and expected genetic gains, not including the interval length, were \( {\widehat{R}}_{PG_2}={k}_{I_G}\sqrt{{\widehat{\boldsymbol{\upbeta}}}_{PG_2}^{\prime }{\widehat{\boldsymbol{\Gamma}}\widehat{\boldsymbol{\upbeta}}}_{PG_2}}=0.52 \) and \( {\widehat{\mathbf{E}}}_{PG_2}^{\prime }={k}_{I_G}\frac{{\widehat{\boldsymbol{\upbeta}}}_{PG_2}^{\prime}\widehat{\boldsymbol{\Gamma}}}{\sqrt{{\widehat{\boldsymbol{\upbeta}}}_{PG_2}^{\prime }{\widehat{\boldsymbol{\Gamma}}\widehat{\boldsymbol{\upbeta}}}_{PG_2}}}=\left[0.11\kern0.5em 0.05\kern0.5em 0.14\right] \) respectively.
Now, we use the simulated data set described in Chap. 2, Sect. 2.8.1 to compare PPGLGSI efficiency versus predetermined proportional gain linear phenotypic selection index (PPGLPSI) efficiency. Let \( {\mathbf{U}}_1^{\prime }=\left[1\kern0.5em 0\kern0.5em 0\right] \), \( {\mathbf{U}}_2^{\prime }=\left[\begin{array}{lll}1& 0& 0\\ {}0& 1& 0\end{array}\right] \), and \( {\mathbf{U}}_3^{\prime }=\left[\begin{array}{llll}1& 0& 0& 0\\ {}0& 1& 0& 0\\ {}0& 0& 1& 0\end{array}\right] \) be the matrices and d_{1} = 7, \( {\mathbf{d}}_2^{\prime }=\left[7\kern0.5em 3\right] \), and \( {\mathbf{d}}_3^{\prime }=\left[7\kern0.5em 3\kern0.5em 5\right] \) the vectors for one, two, and three predetermined restrictions respectively. Table 6.5 presents the estimated PPGLPSI and PPGLGSI selection response for each predetermined restriction in five simulated selection cycles including and not including the interval between selection cycles (4 years for the PPGLPSI and 1.5 years for the PPGLGSI); estimated PPGLPSI and PPGLGSI accuracy; and estimated variance of the predicted error (VPE). In each selection cycle, the sample size was equal to 500 genotypes, each with four repetitions and four traits. The selection intensity was 10% (k_{I} = 1.755).
The averages of the estimated PPGLPSI selection response not including the interval length were 15.14, 14.87, and 13.30, whereas when the interval length was included, the average selection responses were 3.79, 3.72, and 3.33, for one, two, and three predetermined restrictions respectively (Table 6.5). The averages of the estimated PPGLGSI selection responses not including the interval length for one, two, and three predetermined restrictions were 14.48, 13.47, and 11.26 respectively, and when the interval length was included, the selection responses were 9.65, 8.98, and 7.51 respectively (Table 6.5). These results indicate that when the interval length was included in the estimation of the PPGLPSI and PPGLGSI selection responses, PPGLGSI efficiency was greater than PPGLPSI efficiency, and vice versa, when the interval length was not included in the PPGLPSI and PPGLGSI selection responses, PPGLPSI efficiency was higher than PPGLGSI efficiency.
The averages of the estimated VPE values of the PPGLPSI for one, two, and three predetermined restrictions were 22.42, 30.56, and 41.17 respectively, whereas the estimated VPE values of the PPGLGSI (see Sect. 6.2.3 for details) were 59.80, 66.95, and 83.98, respectively, that is, in all selection cycles, the VPE of the PPGLPSI was lower than that of the PPGLGSI. This means that for this data set, the PPGLPSI was a better predictor of the net genetic merit than the PPGLGSI. These results can be explained by observing that the averages of the estimated PPGLPSI accuracies were 0.88, 0.86, and 0.77, whereas the estimated PPGLGSI accuracies were 0.65, 0.68, and 0.57 for each predetermined restriction, that is, the estimated PPGLGSI accuracies were lower than the estimated PPGLPSI accuracies for this data set.
Table 6.6 presents the estimated PPGLPSI heritability (\( {\widehat{h}}_P^2 \)) values; the values of \( {W}_P=\frac{{\widehat{\rho}}_{HI_G}}{{\widehat{h}}_P}{L}_P \) (L_{P} = 4); the ratio of the estimated PPGLPSI accuracy (\( {\widehat{\rho}}_{HI_P} \)) to the estimated PPGLGSI accuracy (\( {\widehat{\rho}}_{HI_{PG}} \)), i.e., \( {\widehat{\lambda}}_P={\widehat{\rho}}_{HI_P}/{\widehat{\rho}}_{HI_{PG}} \); and the values of \( \widehat{p}=100\left({\widehat{\lambda}}_P-1\right) \) for one, two, and three predetermined restrictions in five simulated selection cycles.
The averages of the W_{P} values for one, two, and three null restrictions were 3.29, 3.12, and 2.53, respectively, whereas the PPGLGSI interval length was 1.5 (L_{G} = 1.5). This means that the estimated Technow inequality, \( {L}_G<\frac{{\widehat{\rho}}_{HI_G}}{{\widehat{h}}_P}{L}_P \) (see Chap. 5, Eq. 5.18) was true. Thus, PPGLGSI efficiency in terms of time was greater than PPGLPSI efficiency for this data set. These results coincide with those obtained earlier in this chapter, when we compared PPGLGSI efficiency versus PPGLPSI efficiency in terms of interval length. However, the average values of \( \widehat{p}=100\left({\widehat{\lambda}}_P1\right) \) (see Chap. 5, Eq. 5.15) were, in percentage terms, 16.80%, 20.76%, and 25.85% for each restriction. These latter results indicate that for this data set, the PPGLPSI was a better predictor of the net genetic merit than the PPGLGSI. This is because the estimated PPGLPSI accuracies were higher than the estimated PPGLPSI accuracies for this data set. We found similar results when we compared the PPGLPSI VPE versus PPGLGSI VPE (Table 6.5).
6.3 The Combined Restricted Linear Genomic Selection Index
The combined restricted linear genomic selection index (CRLGSI) is based on the RLPSI (Chap. 3) and combined linear genomic selection index (CLGSI, Chap. 5) theory. In the RLPSI, the breeder’s objective is to improve only (t − r) of t (r < t) traits, leaving r of them fixed; the same is true for the CRLGSI, but in the latter case, it is necessary to impose 2r restrictions, i.e., we need to fix r traits and their associated r GEBVs to obtain results similar to those obtained with the RLPSI. This is the main difference between the CRLGSI and the RLPSI.
It can be shown that Cov(I_{C}, a_{C}) = Ψ_{C}β_{C} is the covariance between the breeding value vector (\( {\boldsymbol{a}}_C^{\prime }=\left[{\mathbf{g}}^{\prime}\kern0.5em {\boldsymbol{\upgamma}}^{\prime}\right] \)) and the CLGSI, \( {I}_C={\boldsymbol{\upbeta}}_C^{\prime }{\mathbf{t}}_C \) (see Chap. 5 for details), where \( {\mathbf{t}}_C^{\prime }=\left[{\mathbf{y}}^{\prime}\kern0.5em {\boldsymbol{\upgamma}}^{\prime}\right] \). In the CRLGSI, we want some covariances between the linear combinations of a_{C} (\( {\mathbf{U}}_C^{\prime }{\mathbf{a}}_C \)) and CLGSI to be zero, i.e., \( Cov\left({\mathrm{I}}_{\mathrm{C}},{\mathbf{U}}_C^{\prime }{\mathbf{a}}_C\right)={\mathbf{U}}_C^{\prime }{\boldsymbol{\Psi}}_C{\boldsymbol{\upbeta}}_C=\mathbf{0} \), where \( {\mathbf{U}}_C^{\prime } \) is a matrix 2(t − 1) × 2t of 1s and 0s (1 indicates that the trait and its associated GEBV are restricted, and 0 that the trait and its GEBV have no restrictions) and \( {\boldsymbol{\Psi}}_C=\left[\begin{array}{cc}\mathbf{C}& \boldsymbol{\Gamma} \\ {}\boldsymbol{\Gamma} & \boldsymbol{\Gamma} \end{array}\right] \) is a block covariance matrix of \( {\mathbf{a}}_C^{\prime }=\left[{\mathbf{g}}^{\prime}\kern0.5em {\boldsymbol{\upgamma}}^{\prime}\right] \) where C and Γ are the covariance matrices of breeding (g) and genomic (γ) values respectively. This problem can be solved by minimizing the mean squared difference between the CLGSI and H (E[(H − I_{C})^{2}]) under the restriction \( {\mathbf{U}}_C^{\prime }{\boldsymbol{\Psi}}_C{\boldsymbol{\upbeta}}_C=\mathbf{0} \) similar to the RLGSI in Sect. 6.1.
6.3.1 The Maximized CRLGSI Parameters
Let \( {\mathbf{T}}_C=\left[\begin{array}{cc}\mathbf{P}& \boldsymbol{\Gamma} \\ {}\boldsymbol{\Gamma} & \boldsymbol{\Gamma} \end{array}\right] \) be the block covariance matrix of \( {\mathbf{t}}_C^{\prime }=\left[{\mathbf{y}}^{\prime}\kern0.5em {\boldsymbol{\upgamma}}^{\prime}\right] \) where P and Γ are the covariance matrices of phenotypic (y) and genomic (γ) values respectively. Based on the Eq. (6.1) result, it can be shown that the CRLGSI vector of coefficients that minimizes E[(H − I_{C})^{2}] under the restriction \( {\mathbf{U}}_C^{\prime }{\boldsymbol{\Psi}}_C{\boldsymbol{\upbeta}}_C=\mathbf{0} \) is
\( {\boldsymbol{\upbeta}}_{CR}={\mathbf{K}}_C{\boldsymbol{\upbeta}}_C, \) (6.13)
where K_{C} = [I − Q_{C}], \( {\mathbf{Q}}_C={\mathbf{T}}_C^{-1}{\boldsymbol{\Phi}}_C{\left({\boldsymbol{\Phi}}_C^{\prime }{\mathbf{T}}_C^{-1}{\boldsymbol{\Phi}}_C\right)}^{-1}{\boldsymbol{\Phi}}_C^{\prime } \), \( {\boldsymbol{\Phi}}_C^{\prime }={\mathbf{U}}_C^{\prime }{\boldsymbol{\Psi}}_C \), and \( {\boldsymbol{\upbeta}}_C={\mathbf{T}}_C^{-1}{\boldsymbol{\Psi}}_C{\mathbf{a}}_C \) (the vector of coefficients of the CLGSI, see Chap. 5 for details); \( {\mathbf{T}}_C^{-1} \) is the inverse of matrix T_{C}, and I is a 2t × 2t identity matrix. When no restrictions are imposed on any of the traits, \( {\mathbf{U}}_C^{\prime } \) is a null matrix and β_{CR} = β_{C} (the vector of coefficients of the CLGSI). That is, the CRLGSI is more general than the CLGSI. Similar to the RLPSI and the RLGSI, matrices K_{C} and Q_{C} are idempotent (\( {\mathbf{K}}_C={\mathbf{K}}_C^2 \) and \( {\mathbf{Q}}_C={\mathbf{Q}}_C^2 \)) and orthogonal (K_{C}Q_{C} = Q_{C}K_{C} = 0), that is, K_{C} and Q_{C} are projectors. Thus, we can assume that the CRLGSI has similar properties to those described for the RLPSI (see Chap. 3 for details) when matrices \( {\boldsymbol{\Psi}}_C=\left[\begin{array}{cc}\mathbf{C}& \boldsymbol{\Gamma} \\ {}\boldsymbol{\Gamma} & \boldsymbol{\Gamma} \end{array}\right] \) and \( {\mathbf{T}}_C=\left[\begin{array}{cc}\mathbf{P}& \boldsymbol{\Gamma} \\ {}\boldsymbol{\Gamma} & \boldsymbol{\Gamma} \end{array}\right] \) are known.
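The projector properties can be checked directly from the definition of Q_{C}. The following is a short verification sketch; the shorthand \( {\mathbf{M}}_C={\boldsymbol{\Phi}}_C^{\prime }{\mathbf{T}}_C^{-1}{\boldsymbol{\Phi}}_C \) is introduced here only for this purpose, and \( {\boldsymbol{\Phi}}_C \) is taken as the 2t × 2r matrix with \( {\boldsymbol{\Phi}}_C^{\prime }={\mathbf{U}}_C^{\prime }{\boldsymbol{\Psi}}_C \) so that all products are conformable.
\( {\mathbf{Q}}_C^2={\mathbf{T}}_C^{-1}{\boldsymbol{\Phi}}_C{\mathbf{M}}_C^{-1}\left({\boldsymbol{\Phi}}_C^{\prime }{\mathbf{T}}_C^{-1}{\boldsymbol{\Phi}}_C\right){\mathbf{M}}_C^{-1}{\boldsymbol{\Phi}}_C^{\prime }={\mathbf{T}}_C^{-1}{\boldsymbol{\Phi}}_C{\mathbf{M}}_C^{-1}{\boldsymbol{\Phi}}_C^{\prime }={\mathbf{Q}}_C \)
\( {\mathbf{K}}_C^2=\mathbf{I}-2{\mathbf{Q}}_C+{\mathbf{Q}}_C^2=\mathbf{I}-{\mathbf{Q}}_C={\mathbf{K}}_C,\kern1em {\mathbf{K}}_C{\mathbf{Q}}_C={\mathbf{Q}}_C-{\mathbf{Q}}_C^2=\mathbf{0} \)
\( {\boldsymbol{\Phi}}_C^{\prime }{\mathbf{K}}_C={\boldsymbol{\Phi}}_C^{\prime }-{\mathbf{M}}_C{\mathbf{M}}_C^{-1}{\boldsymbol{\Phi}}_C^{\prime }=\mathbf{0} \)
The last line shows why the restriction holds for any β_{C}: \( {\mathbf{U}}_C^{\prime }{\boldsymbol{\Psi}}_C{\mathbf{K}}_C{\boldsymbol{\upbeta}}_C={\boldsymbol{\Phi}}_C^{\prime }{\mathbf{K}}_C{\boldsymbol{\upbeta}}_C=\mathbf{0} \).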
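The maximized selection response and the optimized expected genetic gain per trait of the CRLGSI can be written as
\( {R}_{CR}=\frac{k_I}{L_I}\sqrt{{\boldsymbol{\upbeta}}_{CR}^{\prime }{\mathbf{T}}_C{\boldsymbol{\upbeta}}_{CR}} \) (6.14)
and
\( {\mathbf{E}}_{CR}=\frac{k_I}{L_I}\frac{{\boldsymbol{\Psi}}_C{\boldsymbol{\upbeta}}_{CR}}{\sqrt{{\boldsymbol{\upbeta}}_{CR}^{\prime }{\mathbf{T}}_C{\boldsymbol{\upbeta}}_{CR}}} \) (6.15)
respectively. Although in the RLGSI and the PPGLGSI the interval between selection cycles is denoted as L_{G}, in the CRLGSI it is denoted as L_{I}. This is because the RLPSI and the CRLGSI should have the same interval between selection cycles.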
6.3.2 Numerical Examples
To illustrate the CRLGSI theoretical results, we use a real training maize (Zea mays) F_{2} population with 248 genotypes (each with two repetitions), 233 molecular markers, and three traits: GY (ton ha^{−1}), EHT (cm), and PHT (cm). Matrices P and C were estimated based on Eqs. (2.22) to (2.24) described in Chap. 2. The estimated matrices were \( \widehat{\mathbf{P}}=\left[\begin{array}{lll}0.45& \kern0.34em 1.33& \kern0.55em 2.33\\ {}1.33& 65.07& \kern0.55em 83.71\\ {}2.33& 83.71& 165.99\end{array}\right] \) and \( \widehat{\mathbf{C}}=\left[\begin{array}{lll}0.07& \kern0.35em 0.61&\ 1.06\\ {}0.61& 17.93& 22.75\\ {}1.06& 22.75& 44.53\end{array}\right] \). In a similar manner, we estimated matrix Γ using Eqs. (5.21) to (5.23) described in Chap. 5. The estimated matrix was \( \widehat{\boldsymbol{\Gamma}}=\left[\begin{array}{lll}0.07& \kern0.35em 0.65&\ 1.05\\ {}0.65& 10.62& 14.25\\ {}1.05& 14.25& 26.37\end{array}\right] \).
To estimate the CRLGSI and its associated parameters (selection response, expected genetic gain per trait, etc.), we need to obtain matrices \( {\widehat{\mathbf{T}}}_C=\left[\begin{array}{ll}\widehat{\mathbf{P}}& \widehat{\boldsymbol{\Gamma}}\\ {}\widehat{\boldsymbol{\Gamma}}& \widehat{\boldsymbol{\Gamma}}\end{array}\right] \) and \( {\widehat{\boldsymbol{\Psi}}}_C=\left[\begin{array}{ll}\widehat{\mathbf{C}}& \widehat{\boldsymbol{\Gamma}}\\ {}\widehat{\boldsymbol{\Gamma}}& \widehat{\boldsymbol{\Gamma}}\end{array}\right] \) using phenotypic and genomic information and the estimated CRLGSI vector of coefficients \( {\widehat{\boldsymbol{\upbeta}}}_{CR}={\widehat{\mathbf{K}}}_C{\widehat{\boldsymbol{\upbeta}}}_C \), where \( {\widehat{\mathbf{K}}}_C=\left[\mathbf{I}-{\widehat{\mathbf{Q}}}_C\right] \), \( {\widehat{\mathbf{Q}}}_C={\widehat{\mathbf{T}}}_C^{-1}{\widehat{\boldsymbol{\Phi}}}_C{\left({\widehat{\boldsymbol{\Phi}}}_C^{\prime }{\widehat{\mathbf{T}}}_C^{-1}{\widehat{\boldsymbol{\Phi}}}_C\right)}^{-1}{\widehat{\boldsymbol{\Phi}}}_C^{\prime } \), \( {\widehat{\boldsymbol{\Phi}}}_C^{\prime }={\mathbf{U}}_C^{\prime }{\widehat{\boldsymbol{\Psi}}}_C \), and \( {\widehat{\boldsymbol{\upbeta}}}_C={\widehat{\mathbf{T}}}_C^{-1}{\widehat{\boldsymbol{\Psi}}}_C{\mathbf{a}}_C \).
We have indicated that the main difference between the RLGSI and the CRLGSI is matrix \( {\mathbf{U}}_C^{\prime } \), on which we now need to impose two restrictions: one for the trait and another for its associated GEBV. Consider the (Zea mays) F_{2} population described earlier and suppose that we restrict trait GY; then, matrix \( {\mathbf{U}}_C^{\prime } \) should be constructed as \( {\mathbf{U}}_{C_1}^{\prime }=\left[\begin{array}{llllll}1& 0& 0& 0& 0& 0\\ {}0& 0& 0& 1& 0& 0\end{array}\right] \). If we restrict traits GY and EHT, matrix \( {\mathbf{U}}_C^{\prime } \) should be constructed as \( {\mathbf{U}}_{C_2}^{\prime }=\left[\begin{array}{llllll}1& 0& 0& 0& 0& 0\\ {}0& 1& 0& 0& 0& 0\\ {}0& 0& 0& 1& 0& 0\\ {}0& 0& 0& 0& 1& 0\end{array}\right] \), etc. The procedure for obtaining matrices \( {\widehat{\mathbf{K}}}_C=\left[\mathbf{I}-{\widehat{\mathbf{Q}}}_C\right] \), \( {\widehat{\mathbf{Q}}}_C={\widehat{\mathbf{T}}}_C^{-1}{\widehat{\boldsymbol{\Phi}}}_C{\left({\widehat{\boldsymbol{\Phi}}}_C^{\prime }{\widehat{\mathbf{T}}}_C^{-1}{\widehat{\boldsymbol{\Phi}}}_C\right)}^{-1}{\widehat{\boldsymbol{\Phi}}}_C^{\prime } \), and \( {\widehat{\boldsymbol{\Phi}}}_C^{\prime }={\mathbf{U}}_C^{\prime }{\widehat{\boldsymbol{\Psi}}}_C \) is similar to that described in Chap. 3.
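The construction of \( {\mathbf{U}}_C^{\prime } \) follows a fixed pattern: one row per restricted trait, then one row per associated GEBV, offset by t columns. A minimal Python sketch of that pattern is given below; the helper name `build_uc` and the 0-based trait indexing (GY = 0, EHT = 1, PHT = 2) are assumptions made for illustration and do not come from the text.

```python
import numpy as np

def build_uc(t, restricted):
    """Return U_C' (2r x 2t) for the 0-based trait indices in `restricted`."""
    r = len(restricted)
    uc = np.zeros((2 * r, 2 * t), dtype=int)
    for row, trait in enumerate(restricted):
        uc[row, trait] = 1          # restriction on the trait itself
        uc[r + row, t + trait] = 1  # restriction on its associated GEBV
    return uc

uc1 = build_uc(3, [0])     # restrict GY
uc2 = build_uc(3, [0, 1])  # restrict GY and EHT
print(uc1)
print(uc2)
```

With t = 3, the two calls reproduce the row ordering used for \( {\mathbf{U}}_{C_1}^{\prime } \) and \( {\mathbf{U}}_{C_2}^{\prime } \): all trait rows first, then all GEBV rows.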
Let \( {\mathbf{w}}^{\prime }=\left[5\kern0.5em -0.1\kern0.5em -0.1\kern0.5em 0\kern0.5em 0\kern0.5em 0\right] \) be the vector of economic weights and assume that we restrict trait GY; in this case, according to the estimated matrices \( \widehat{\mathbf{P}} \), \( \widehat{\mathbf{C}} \), and \( \widehat{\boldsymbol{\Gamma}} \) described earlier, the estimated CRLGSI vector of coefficients was \( {\widehat{\boldsymbol{\upbeta}}}_{CR}^{\prime }=\left[0.076\kern0.5em -0.004\kern0.5em -0.018\kern0.5em 2.353\kern0.5em -0.096\kern0.5em -0.082\right] \), whence the estimated CRLGSI can be written as
\( {\widehat{I}}_{CR}=0.076\,\mathrm{GY}-0.004\,\mathrm{EHT}-0.018\,\mathrm{PHT}+2.353\,{\mathrm{GEBV}}_{GY}-0.096\,{\mathrm{GEBV}}_{EHT}-0.082\,{\mathrm{GEBV}}_{PHT}, \)
where GEBV_{GY}, GEBV_{EHT}, and GEBV_{PHT} are the GEBVs associated with traits GY, EHT, and PHT respectively. The same procedure is valid for two or more restrictions.
Figure 6.3 presents the frequency distribution of the estimated CRLGSI values for one (Fig. 6.3a) and two null restrictions (Fig. 6.3b) using matrices \( {\mathbf{U}}_{C_1}^{\prime } \) and \( {\mathbf{U}}_{C_2}^{\prime } \), and the real data set of the F_{2} population. For both restrictions, the frequency distribution of the estimated CRLGSI values approaches a normal distribution.
Suppose a selection intensity of 10% (k_{I} = 1.755), matrix \( {\mathbf{U}}_{C_1}^{\prime }=\left[\begin{array}{llllll}1& 0& 0& 0& 0& 0\\ {}0& 0& 0& 1& 0& 0\end{array}\right] \) and that the vector of economic weights is \( {\mathbf{w}}^{\prime }=\left[5\kern0.5em -0.1\kern0.5em -0.1\kern0.5em 0\kern0.5em 0\kern0.5em 0\right] \); then, according to the estimated matrices \( \widehat{\mathbf{P}} \), \( \widehat{\mathbf{C}} \), and \( \widehat{\boldsymbol{\Gamma}} \) described earlier, the estimated CRLGSI selection response and the estimated CRLGSI expected genetic gain per trait were \( {\widehat{R}}_{CR}={k}_I\sqrt{{\widehat{\boldsymbol{\upbeta}}}_{CR}^{\prime }{\widehat{\mathbf{T}}}_C{\widehat{\boldsymbol{\upbeta}}}_{CR}}=0.96 \) and \( {\widehat{\mathbf{E}}}_{CR}^{\prime }={k}_I\frac{{\widehat{\boldsymbol{\upbeta}}}_{CR}^{\prime }{\widehat{\boldsymbol{\Psi}}}_C}{\sqrt{{\widehat{\boldsymbol{\upbeta}}}_{CR}^{\prime }{\widehat{\mathbf{T}}}_C{\widehat{\boldsymbol{\upbeta}}}_{CR}}}=\left[0\kern0.5em -3.53\kern0.5em -6.03\kern0.5em 0\kern0.5em -2.93\kern0.5em -4.87\right] \) respectively, whereas the estimated CRLGSI accuracy was \( {\widehat{\rho}}_{HI_{CR}}=\frac{{\widehat{\sigma}}_{I_{CR}}}{{\widehat{\sigma}}_H}=0.51 \) (see Chaps. 3 and 5 for details).
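The calculation for one restriction (trait GY and its GEBV) can be sketched with NumPy using the rounded matrices printed above. This is an illustrative sketch, not the procedure used to produce the reported values: the variable names are hypothetical, the six-element weight vector w is used in place of a_{C} in the CLGSI coefficient formula, the weights on EHT and PHT are taken as negative (−0.1, the reading under which the reported response and accuracy are recovered), \( {\boldsymbol{\Phi}}_C \) is taken as \( {\boldsymbol{\Psi}}_C{\mathbf{U}}_C \), and the interval length is set to 1 as in the expressions above.

```python
import numpy as np

# Estimated matrices of the maize F2 example, rounded as printed in the text.
P = np.array([[0.45, 1.33, 2.33], [1.33, 65.07, 83.71], [2.33, 83.71, 165.99]])
C = np.array([[0.07, 0.61, 1.06], [0.61, 17.93, 22.75], [1.06, 22.75, 44.53]])
G = np.array([[0.07, 0.65, 1.05], [0.65, 10.62, 14.25], [1.05, 14.25, 26.37]])

T = np.block([[P, G], [G, G]])    # covariance of [y' gamma']
Psi = np.block([[C, G], [G, G]])  # covariance of [g' gamma']
w = np.array([5.0, -0.1, -0.1, 0.0, 0.0, 0.0])  # assumed signs on EHT, PHT
U_t = np.array([[1, 0, 0, 0, 0, 0],
                [0, 0, 0, 1, 0, 0]], dtype=float)  # U_C1': GY and its GEBV
k_I = 1.755                                        # 10% selection intensity

Phi = Psi @ U_t.T                    # 2t x 2r, so that Phi' = U_C' Psi_C
T_inv_Phi = np.linalg.solve(T, Phi)  # T^{-1} Phi without forming T^{-1}
Q = T_inv_Phi @ np.linalg.solve(Phi.T @ T_inv_Phi, Phi.T)
K = np.eye(T.shape[0]) - Q
beta_c = np.linalg.solve(T, Psi @ w)  # unrestricted CLGSI coefficients
beta_cr = K @ beta_c                  # CRLGSI coefficients

sigma_i = np.sqrt(beta_cr @ T @ beta_cr)     # standard deviation of the index
R_cr = k_I * sigma_i                         # selection response
E_cr = k_I * (Psi @ beta_cr) / sigma_i       # expected gain per trait and GEBV
accuracy = sigma_i / np.sqrt(w @ Psi @ w)    # sigma_I / sigma_H

print(np.round(beta_cr, 3))
print(round(R_cr, 2), round(accuracy, 2))
print(np.round(E_cr, 2))
```

Because the inputs are rounded to two decimals, the output should be read as agreeing with the reported coefficients, response, gains, and accuracy up to rounding rather than digit for digit. The first and fourth entries of `E_cr` are zero up to floating-point error, which is the null restriction on GY and its GEBV.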
Now, we use the simulated data described in Chap. 2, Sect. 2.8.1 to compare CRLGSI efficiency versus RLGSI efficiency. The criteria for this comparison are the Technow inequality (Eq. 5.18, Chap. 5) and the ratio of the estimated CRLGSI accuracy (\( {\widehat{\rho}}_{HI_{CR}} \)) to the estimated RLGSI accuracy (\( {\widehat{\rho}}_{HI_R} \)) expressed as percentages (Eq. 5.17, Chap. 5), i.e., \( \widehat{p}=100\left({\widehat{\lambda}}_{CR}-1\right) \), where \( {\widehat{\lambda}}_{CR}={\widehat{\rho}}_{HI_{CR}}/{\widehat{\rho}}_{HI_R} \), for one, two, and three null restrictions for five simulated selection cycles.
Table 6.7 presents the estimated CRLGSI heritability (\( {\widehat{h}}_C^2 \)), the estimated RLGSI accuracy (\( {\widehat{\rho}}_{HI_R} \)), the values of \( {W}_C=\frac{{\widehat{\rho}}_{HI_R}}{{\widehat{h}}_C}{L}_I \) (L_{I} = 4), and the values of \( \widehat{p}=100\left({\widehat{\lambda}}_{CR}-1\right) \), where \( {\widehat{\lambda}}_{CR}={\widehat{\rho}}_{HI_{CR}}/{\widehat{\rho}}_{HI_R} \) and \( {\widehat{\rho}}_{HI_{CR}} \) is the estimated CRLGSI accuracy, for one, two, and three null restrictions for five simulated selection cycles. The averages of the W_{C} values for one, two, and three null restrictions were 1.26, 0.92, and 0.59 respectively, whereas the RLGSI interval length was 1.5 (L_{G} = 1.5). This means that the estimated Technow inequality (\( {L}_G<\frac{{\widehat{\rho}}_{HI_R}}{{\widehat{h}}_C}{L}_I \)) was not true. Thus, for this data set, RLGSI efficiency in terms of time was not greater than CRLGSI efficiency. The inequality failed because the estimated RLGSI accuracy was very low, whereas the estimated CRLGSI heritability was high: the averages of the estimated RLGSI accuracy for one, two, and three null restrictions were 0.25, 0.19, and 0.14 respectively, whereas the averages of the estimated CRLGSI heritability values were 0.72, 0.75, and 0.89 respectively. Thus, according to these results, when the estimated RLGSI accuracy is very low and the estimated CRLGSI heritability is high, RLGSI efficiency will be lower than CRLGSI efficiency in terms of time.
The last three columns of Table 6.7, from left to right, present the averages of the values of \( \widehat{p}=100\left({\widehat{\lambda}}_{CR}-1\right) \) for one, two, and three null restrictions of five simulated selection cycles. According to these results, CRLGSI efficiency was 53.78%, 78.25%, and 61.25% higher than RLGSI efficiency. Thus, for this data set, the CRLGSI was a better predictor of the net genetic merit than the RLGSI.
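As a rough arithmetic check of the Technow inequality, the bound can be evaluated from the averages quoted above. This sketch uses the reported average accuracies and heritabilities instead of the per-cycle values of Table 6.7, so it only approximates the reported averages of 1.26, 0.92, and 0.59 (a ratio of averages is not the average of ratios). The function name `technow_bound` is hypothetical.

```python
import math

def technow_bound(accuracy_genomic, heritability_pheno, interval_pheno):
    """Right-hand side of L_G < (rho / h) * L_I, with h = sqrt(heritability)."""
    return accuracy_genomic / math.sqrt(heritability_pheno) * interval_pheno

L_G, L_I = 1.5, 4.0
avg_accuracy = [0.25, 0.19, 0.14]      # average RLGSI accuracies
avg_heritability = [0.72, 0.75, 0.89]  # average CRLGSI heritabilities

bounds = [technow_bound(a, h2, L_I)
          for a, h2 in zip(avg_accuracy, avg_heritability)]
holds = [L_G < b for b in bounds]

for n, (b, ok) in enumerate(zip(bounds, holds), start=1):
    print(f"{n} restriction(s): bound = {b:.2f}, L_G < bound: {ok}")
```

All three bounds fall below L_{G} = 1.5, so the inequality fails for every number of restrictions, in line with the conclusion drawn from Table 6.7.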
6.4 The Combined Predetermined Proportional Gains Linear Genomic Selection Index
In the PPGLPSI described in Chap. 3, the vector of the PPG (predetermined proportional gains) was \( {\mathbf{d}}^{\prime }=\left[{d}_1\kern0.5em {d}_2\kern0.5em \dots \kern0.5em {d}_r\right] \). However, because the combined predetermined proportional gains LGSI (CPPGLGSI) uses phenotypic and GEBV information jointly to predict the net genetic merit, the vector of the PPG (d_{C}) should have twice as many elements as the standard vector d′, that is, \( {\mathbf{d}}_C^{\prime }=\left[{d}_1\kern0.5em {d}_2\kern0.5em \cdots \kern0.5em {d}_r\kern0.5em {d}_{r+1}\kern0.5em {d}_{r+2}\kern0.5em \cdots \kern0.5em {d}_{2r}\right] \), where we would expect that if d_{1} is the PPG imposed on trait 1, then d_{r + 1} should be the PPG imposed on the GEBV associated with trait 1, etc. In addition, in the CPPGLGSI, we have three possible options for determining (for each trait and GEBV) the PPG, e.g., for trait 1, d_{1} = d_{r + 1}, d_{1} > d_{r + 1}, or d_{1} < d_{r + 1}. This is the main difference between the standard PPGLPSI described in Chap. 3 and the CPPGLGSI.
6.4.1 The Maximized CPPGLGSI Parameters
It can be shown that the vector of coefficients of the CPPGLGSI can be written as
\( {\boldsymbol{\upbeta}}_{CP}={\boldsymbol{\upbeta}}_{CR}+{\theta}_{CP}{\boldsymbol{\updelta}}_{CP}, \) (6.16)
where
\( {\theta}_{CP}=\frac{{\boldsymbol{\upbeta}}_C^{\prime }{\boldsymbol{\Phi}}_C{\left({\boldsymbol{\Phi}}_C^{\prime }{\mathbf{T}}_C^{-1}{\boldsymbol{\Phi}}_C\right)}^{-1}{\mathbf{d}}_C}{{\mathbf{d}}_C^{\prime }{\left({\boldsymbol{\Phi}}_C^{\prime }{\mathbf{T}}_C^{-1}{\boldsymbol{\Phi}}_C\right)}^{-1}{\mathbf{d}}_C} \) (6.17)
is a proportionality constant. In addition, in Eq. (6.16), β_{CR} = K_{C}β_{C} is the vector of coefficients of the CRLGSI (Eq. 6.13), \( {\boldsymbol{\updelta}}_{CP}={\mathbf{T}}_C^{-1}{\boldsymbol{\Phi}}_C{\left({\boldsymbol{\Phi}}_C^{\prime }{\mathbf{T}}_C^{-1}{\boldsymbol{\Phi}}_C\right)}^{-1}{\mathbf{d}}_C \), \( {\boldsymbol{\Phi}}_C^{\prime }={\mathbf{U}}_C^{\prime }{\boldsymbol{\Psi}}_C \), and \( {\boldsymbol{\upbeta}}_C={\mathbf{T}}_C^{-1}{\boldsymbol{\Psi}}_C{\mathbf{a}}_C \) (the vector of coefficients of the CLGSI). When θ_{CP} = 0, β_{CP} = β_{CR}, and if θ_{CP} = 0 and \( {\mathbf{U}}_C^{\prime } \) is the null matrix, then β_{CP} = β_{C}. Thus, the CPPGLGSI is more general than the CRLGSI and the CLGSI, and includes the latter two indices as particular cases. In addition, it can be shown that the CPPGLGSI has the same properties as the PPGLPSI described in Chap. 3.
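A sketch of how this form of the coefficient vector arises is given next. It assumes, as for the PPGLPSI in Chap. 3, that the restriction is \( {\boldsymbol{\Phi}}_C^{\prime}\boldsymbol{\upbeta} =\theta {\mathbf{d}}_C \); the shorthand \( {\mathbf{M}}_C={\boldsymbol{\Phi}}_C^{\prime }{\mathbf{T}}_C^{-1}{\boldsymbol{\Phi}}_C \) and the vector of Lagrange multipliers v are introduced here only for the derivation. Minimizing E[(H − I_{C})^{2}] under the restriction gives
\( {\mathbf{T}}_C\boldsymbol{\upbeta} -{\boldsymbol{\Psi}}_C{\mathbf{a}}_C+{\boldsymbol{\Phi}}_C\mathbf{v}=\mathbf{0}\kern0.5em \Rightarrow \kern0.5em \boldsymbol{\upbeta} ={\boldsymbol{\upbeta}}_C-{\mathbf{T}}_C^{-1}{\boldsymbol{\Phi}}_C\mathbf{v},\kern1em \mathbf{v}={\mathbf{M}}_C^{-1}\left({\boldsymbol{\Phi}}_C^{\prime }{\boldsymbol{\upbeta}}_C-\theta {\mathbf{d}}_C\right) \)
so that
\( \boldsymbol{\upbeta} ={\mathbf{K}}_C{\boldsymbol{\upbeta}}_C+\theta {\mathbf{T}}_C^{-1}{\boldsymbol{\Phi}}_C{\mathbf{M}}_C^{-1}{\mathbf{d}}_C={\boldsymbol{\upbeta}}_{CR}+\theta {\boldsymbol{\updelta}}_{CP} \)
Because \( {\boldsymbol{\Phi}}_C^{\prime }{\mathbf{K}}_C=\mathbf{0} \), the cross term \( {\boldsymbol{\updelta}}_{CP}^{\prime }{\mathbf{T}}_C{\boldsymbol{\upbeta}}_{CR} \) vanishes, and the part of the mean squared error that depends on θ is \( {\theta}^2{\boldsymbol{\updelta}}_{CP}^{\prime }{\mathbf{T}}_C{\boldsymbol{\updelta}}_{CP}-2\theta {\boldsymbol{\updelta}}_{CP}^{\prime }{\mathbf{T}}_C{\boldsymbol{\upbeta}}_C \). Minimizing over θ, with \( {\boldsymbol{\updelta}}_{CP}^{\prime }{\mathbf{T}}_C={\mathbf{d}}_C^{\prime }{\mathbf{M}}_C^{-1}{\boldsymbol{\Phi}}_C^{\prime } \) and \( {\boldsymbol{\updelta}}_{CP}^{\prime }{\mathbf{T}}_C{\boldsymbol{\updelta}}_{CP}={\mathbf{d}}_C^{\prime }{\mathbf{M}}_C^{-1}{\mathbf{d}}_C \), yields
\( \theta =\frac{{\mathbf{d}}_C^{\prime }{\mathbf{M}}_C^{-1}{\boldsymbol{\Phi}}_C^{\prime }{\boldsymbol{\upbeta}}_C}{{\mathbf{d}}_C^{\prime }{\mathbf{M}}_C^{-1}{\mathbf{d}}_C} \)
The same steps also give \( {\boldsymbol{\Phi}}_C^{\prime }{\boldsymbol{\upbeta}}_{CP}=\theta {\mathbf{d}}_C \), i.e., the covariances of the index with the restricted traits and GEBVs are proportional to d_{C}.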
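The maximized selection response and the expected genetic gain per trait of the CPPGLGSI can be written as
\( {R}_{CP}=\frac{k_I}{L_I}\sqrt{{\boldsymbol{\upbeta}}_{CP}^{\prime }{\mathbf{T}}_C{\boldsymbol{\upbeta}}_{CP}} \) (6.18)
and
\( {\mathbf{E}}_{CP}=\frac{k_I}{L_I}\frac{{\boldsymbol{\Psi}}_C{\boldsymbol{\upbeta}}_{CP}}{\sqrt{{\boldsymbol{\upbeta}}_{CP}^{\prime }{\mathbf{T}}_C{\boldsymbol{\upbeta}}_{CP}}} \) (6.19)
respectively. Although in the RLGSI and the PPGLGSI the interval between selection cycles is denoted as L_{G}, in the CPPGLGSI it is denoted as L_{I}. The RLPSI and the CPPGLGSI should have the same interval between selection cycles because both use phenotypic information to predict the net genetic merit.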
6.4.2 Numerical Examples
Similar to the CRLGSI, to illustrate the CPPGLGSI results we use the real training maize (Zea mays) F_{2} population with 248 genotypes, 233 molecular markers, and three traits—GY (ton ha^{−1}), EHT (cm), and PHT (cm)—where \( \widehat{\mathbf{P}}=\left[\begin{array}{lll}0.45&\ 1.33& \kern0.5em 2.33\\ {}1.33& 65.07&\ 83.71\\ {}2.33& 83.71& 165.99\end{array}\right] \), \( \widehat{\mathbf{C}}=\left[\begin{array}{lll}0.07&\ 0.61&\ 1.06\\ {}0.61& 17.93& 22.75\\ {}1.06& 22.75& 44.53\end{array}\right] \), and \( \widehat{\boldsymbol{\Gamma}}=\left[\begin{array}{lll}0.07& \kern0.45em 0.65& \kern0.45em 1.05\\ {}0.65& 10.62& 14.25\\ {}1.05& 14.25& 26.37\end{array}\right] \) were the estimated matrices of P, C, and Γ respectively.
We can obtain the estimated CPPGLGSI vector of coefficients as \( {\widehat{\boldsymbol{\upbeta}}}_{CP}={\widehat{\boldsymbol{\upbeta}}}_{CR}+{\widehat{\theta}}_{CP}{\widehat{\boldsymbol{\updelta}}}_{CP} \) (Eq. 6.16). Suppose that we restrict trait GY and its associated GEBV with matrix \( {\mathbf{U}}_{C_1}^{\prime }=\left[\begin{array}{llllll}1& 0& 0& 0& 0& 0\\ {}0& 0& 0& 1& 0& 0\end{array}\right] \) and the vector of predetermined restrictions \( {\mathbf{d}}_C^{\prime }=\left[7\kern0.5em 3.5\right] \). In Sect. 6.3.2, we showed that the estimated CRLGSI vector of coefficients was \( {\widehat{\boldsymbol{\upbeta}}}_{CR}^{\prime }=\left[0.076\kern0.5em -0.004\kern0.5em -0.018\kern0.5em 2.353\kern0.5em -0.096\kern0.5em -0.082\right] \); then, we only need to calculate \( {\widehat{\uptheta}}_{\mathrm{CP}} \) and \( {\widehat{\boldsymbol{\updelta}}}_{CP} \) to obtain the vector of coefficients \( {\widehat{\boldsymbol{\upbeta}}}_{CP} \).
Let \( {\mathbf{w}}^{\prime }=\left[5\kern0.5em -0.1\kern0.5em -0.1\kern0.5em 0\kern0.5em 0\kern0.5em 0\right] \) be the vector of economic weights. It can be shown that \( {\widehat{\uptheta}}_{\mathrm{CP}}=0.00030 \) is the estimated value of the proportionality constant and \( {\widehat{\boldsymbol{\updelta}}}_{CP}^{\prime }=\left[0.56\kern0.5em -77.28\kern0.5em 40.89\kern0.5em 49.44\kern0.5em 77.28\kern0.5em -40.89\right] \). Thus, the estimated CPPGLGSI vector of coefficients was \( {\widehat{\boldsymbol{\upbeta}}}_{CP}^{\prime }=\left[0.076\kern0.5em -0.030\kern0.5em -0.004\kern0.5em 2.369\kern0.5em -0.070\kern0.5em -0.096\right] \), whence the estimated CPPGLGSI can be written as
\( {\widehat{I}}_{CP}=0.076\,\mathrm{GY}-0.030\,\mathrm{EHT}-0.004\,\mathrm{PHT}+2.369\,{\mathrm{GEBV}}_{GY}-0.070\,{\mathrm{GEBV}}_{EHT}-0.096\,{\mathrm{GEBV}}_{PHT}, \)
where GEBV_{GY}, GEBV_{EHT}, and GEBV_{PHT} are the GEBVs associated with traits GY, EHT, and PHT respectively. The same procedure is valid for two or more restrictions. Note that because \( {\widehat{\uptheta}}_{\mathrm{CP}}=0.0003 \) is very small, the estimated CPPGLGSI and CRLGSI values were very similar.
Figure 6.4 presents the frequency distribution of the estimated CPPGLGSI values for one (Fig. 6.4a) and two predetermined restrictions (Fig. 6.4b) using matrices \( {\mathbf{U}}_{C_1}^{\prime } \) and \( {\mathbf{U}}_{C_2}^{\prime }=\left[\begin{array}{llllll}1& 0& 0& 0& 0& 0\\ {}0& 1& 0& 0& 0& 0\\ {}0& 0& 0& 1& 0& 0\\ {}0& 0& 0& 0& 1& 0\end{array}\right] \), the vectors of the PPG \( {\mathbf{d}}_{C1}^{\prime }=\left[7\kern0.5em 3.5\right] \) and \( {\mathbf{d}}_{C2}^{\prime }=\left[7\kern0.5em 3\kern0.5em 3.5\kern0.5em 1.5\right] \), and the real F_{2} data set. For both restrictions, the frequency distribution of the estimated CPPGLGSI values approaches a normal distribution.
Suppose a selection intensity of 10% (k_{I} = 1.755) and that we restrict trait GY and its associated GEBV. The estimated CPPGLGSI selection response and expected genetic gain per trait were \( {\widehat{R}}_{CP}={k}_I\sqrt{{\widehat{\boldsymbol{\upbeta}}}_{CP}^{\prime }{\widehat{\mathbf{T}}}_C{\widehat{\boldsymbol{\upbeta}}}_{CP}}=0.98 \) and \( {\widehat{\mathbf{E}}}_{CP}^{\prime }={k}_I\frac{{\widehat{\boldsymbol{\upbeta}}}_{CP}^{\prime }{\widehat{\boldsymbol{\Psi}}}_C}{\sqrt{{\widehat{\boldsymbol{\upbeta}}}_{CP}^{\prime }{\widehat{\mathbf{T}}}_C{\widehat{\boldsymbol{\upbeta}}}_{CP}}}=\left[0.007\kern0.5em -3.647\kern0.5em -5.760\kern0.5em 0.004\kern0.5em -2.829\kern0.5em -4.711\right] \) respectively, whereas the estimated CPPGLGSI accuracy was \( {\widehat{\rho}}_{HI_{CP}}=\frac{{\widehat{\sigma}}_{I_{CP}}}{{\widehat{\sigma}}_H}=0.52 \). Once again, because \( {\widehat{\uptheta}}_{\mathrm{CP}}=0.0003 \), the latter results are very similar to the CRLGSI results.
Now, we use the simulated data described in Chap. 2, Sect. 2.8.1, to compare CPPGLGSI efficiency versus PPGLGSI efficiency. The criteria for this comparison are the Technow inequality (Chap. 5, Eq. 5.18) and the ratio of CPPGLGSI accuracy (\( {\rho}_{HI_{CP}} \)) to PPGLGSI accuracy (\( {\rho}_{HI_P} \)) expressed as percentages (Chap. 5, Eq. 5.17), \( \widehat{p}=100\left({\widehat{\lambda}}_{CP}-1\right) \), where \( {\widehat{\lambda}}_{CP}={\widehat{\rho}}_{HI_{CP}}/{\widehat{\rho}}_{HI_P} \), for one, two, and three predetermined restrictions in five simulated selection cycles.
Table 6.8 presents the estimated CPPGLGSI heritability (\( {\widehat{h}}_I^2 \)), the estimated PPGLGSI accuracy (\( {\widehat{\rho}}_{HI_P} \)), values of \( {W}_{CP}=\frac{{\widehat{\rho}}_{HI_P}}{{\widehat{h}}_I}{L}_I \) (L_{I} = 4) and \( \widehat{p}=100\left({\widehat{\lambda}}_{CP}-1\right) \), where \( {\widehat{\lambda}}_{CP}={\widehat{\rho}}_{HI_{CP}}/{\widehat{\rho}}_{HI_P} \) and \( {\widehat{\rho}}_{HI_{CP}} \) is the estimated CPPGLGSI accuracy, for one, two, and three predetermined restrictions in five simulated selection cycles. The averages of the estimated W_{CP} values for one, two, and three predetermined restrictions were 3.60, 3.31, and 2.50 respectively, whereas the PPGLGSI interval length was 1.5 (L_{G} = 1.5). This means that the estimated Technow inequality, \( {L}_G<\frac{{\widehat{\rho}}_{HI_P}}{{\widehat{h}}_I}{L}_I \), was true. Thus, for this data set, PPGLGSI efficiency is greater than CPPGLGSI efficiency in terms of time.
The last three columns of Table 6.8, from left to right, present the values of \( \widehat{p}=100\left({\widehat{\lambda}}_{CP}-1\right) \) for one, two, and three predetermined restrictions in five simulated selection cycles. The average values of \( \widehat{p} \) for each of the three restrictions were 37.19%, 32.82%, and 37.08% respectively. This means that CPPGLGSI efficiency was greater than PPGLGSI efficiency at predicting the net genetic merit.
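The CPPGLGSI example with one predetermined restriction can be sketched in the same way. The assumptions are labeled in the comments: the rounded matrices from the text, negative weights on EHT and PHT, \( {\boldsymbol{\Phi}}_C={\boldsymbol{\Psi}}_C{\mathbf{U}}_C \), an interval length of 1, and a proportionality constant of the form \( {\theta}_{CP}={\mathbf{d}}_C^{\prime }{\mathbf{M}}^{-1}{\boldsymbol{\Phi}}_C^{\prime }{\boldsymbol{\upbeta}}_C/{\mathbf{d}}_C^{\prime }{\mathbf{M}}^{-1}{\mathbf{d}}_C \) with \( \mathbf{M}={\boldsymbol{\Phi}}_C^{\prime }{\mathbf{T}}_C^{-1}{\boldsymbol{\Phi}}_C \). Variable names are hypothetical.

```python
import numpy as np

# Estimated matrices of the maize F2 example, rounded as printed in the text.
P = np.array([[0.45, 1.33, 2.33], [1.33, 65.07, 83.71], [2.33, 83.71, 165.99]])
C = np.array([[0.07, 0.61, 1.06], [0.61, 17.93, 22.75], [1.06, 22.75, 44.53]])
G = np.array([[0.07, 0.65, 1.05], [0.65, 10.62, 14.25], [1.05, 14.25, 26.37]])

T = np.block([[P, G], [G, G]])
Psi = np.block([[C, G], [G, G]])
w = np.array([5.0, -0.1, -0.1, 0.0, 0.0, 0.0])  # assumed signs on EHT, PHT
U_t = np.array([[1, 0, 0, 0, 0, 0],
                [0, 0, 0, 1, 0, 0]], dtype=float)  # GY and its GEBV
d = np.array([7.0, 3.5])                           # predetermined gains d_C
k_I = 1.755

Phi = Psi @ U_t.T                    # 2t x 2r, so that Phi' = U_C' Psi_C
T_inv_Phi = np.linalg.solve(T, Phi)
M = Phi.T @ T_inv_Phi                # Phi' T^{-1} Phi
beta_c = np.linalg.solve(T, Psi @ w)
beta_cr = beta_c - T_inv_Phi @ np.linalg.solve(M, Phi.T @ beta_c)

M_inv_d = np.linalg.solve(M, d)
delta_cp = T_inv_Phi @ M_inv_d
theta_cp = (Phi.T @ beta_c) @ M_inv_d / (d @ M_inv_d)  # assumed form
beta_cp = beta_cr + theta_cp * delta_cp

sigma_i = np.sqrt(beta_cp @ T @ beta_cp)
R_cp = k_I * sigma_i
E_cp = k_I * (Psi @ beta_cp) / sigma_i
accuracy = sigma_i / np.sqrt(w @ Psi @ w)

print(round(theta_cp, 5), np.round(delta_cp, 2))
print(np.round(beta_cp, 3))
print(round(R_cp, 2), round(accuracy, 2), np.round(E_cp, 3))
```

The gains on GY and on its GEBV (first and fourth entries of `E_cp`) come out in the ratio 7 : 3.5 imposed by d_{C}. They are small in absolute terms because the proportionality constant is small, which is why the CPPGLGSI and CRLGSI results are so close in this example.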
References
Beyene Y, Semagn K, Mugo S, Tarekegne A, Babu R et al (2015) Genetic gains in grain yield through genomic selection in eight biparental maize populations under drought stress. Crop Sci 55:154–163
Technow F, Bürger A, Melchinger AE (2013) Genomic prediction of northern corn leaf blight resistance in maize with combined or separate training sets for heterotic groups. G3 (Bethesda) 3:197–203
Rights and permissions
Open Access This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.
The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.
Copyright information
© 2018 The Author(s)
Cite this chapter
Cerón-Rojas, J.J., Crossa, J. (2018). Constrained Linear Genomic Selection Indices. In: Linear Selection Indices in Modern Plant Breeding. Springer, Cham. https://doi.org/10.1007/978-3-319-91223-3_6
DOI: https://doi.org/10.1007/978-3-319-91223-3_6
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-91222-6
Online ISBN: 978-3-319-91223-3
eBook Packages: Biomedical and Life Sciences, Biomedical and Life Sciences (R0)