## 9.1 Multistage Linear Phenotypic Selection Index

In a similar manner to the linear phenotypic selection index (LPSI, Chap. 2), the objectives of the multistage linear phenotypic selection index (MLPSI) are:

1. To predict the net genetic merit H = w′g, where g′ = [g1 g2 … gt] is the vector of true breeding values of an individual for t traits and $${\mathbf{w}}^{\prime }=\left[{w}_1\kern0.5em {w}_2\kern0.5em \dots \kern0.5em {w}_t\right]$$ is the vector of economic weights.
2. To select individuals with the highest H values at each stage as parents of the next generation.
3. To maximize the MLPSI selection response and its expected genetic gain per trait.
4. To provide the breeder with an objective rule for evaluating and selecting several traits simultaneously.

When selection is based on all the individual traits of interest jointly, the LPSI vector of coefficients that maximizes the selection response $$R=k\sqrt{{\mathbf{b}}^{\prime}\mathbf{Pb}}$$ and the expected genetic gain per trait $$\mathbf{E}=k\frac{\mathbf{Cb}}{\sqrt{{\mathbf{b}}^{\prime}\mathbf{Pb}}}$$ is $$\mathbf{b}={\mathbf{P}}^{-1}\mathbf{Cw}$$, where C and P are the covariance matrices of the true breeding values (g) and trait phenotypic values (y) respectively, and k is the selection intensity. In MLPSI terminology, the LPSI is called a one-stage selection index. The MLPSI extends the LPSI theory to the multistage selection context and, as we shall see, its theoretical results are very similar to the LPSI results described in Chap. 2.

### 9.1.1 The MLPSI Parameters for Two Stages

Let $${\mathbf{y}}^{\prime }=\left[{y}_1\kern0.5em {y}_2\kern0.5em \cdots \kern0.5em {y}_t\right]$$ be a vector with t traits of interest and suppose that we can select only ni of them (ni < t) at stage i (i= 1, 2, ⋯, N), such that after N stages (N < t), $$\sum \limits_{i=1}^N{n}_i=t$$. Thus, for each stage we should have a selection index with a different number of traits. For example, at stage i the index would be $${I}_i=\sum \limits_{j=1}^{n_i}{b}_{ij}{y}_{ij}$$, and at stage N the index would be $${I}_N=\sum \limits_{j=1}^{n_1}{b}_{1j}{y}_{1j}+\sum \limits_{j=1}^{n_2}{b}_{2j}{y}_{2j}+\cdots +\sum \limits_{j=1}^{n_N}{b}_{Nj}{y}_{Nj}=\sum \limits_{i=1}^N{I}_i$$, where the double subscript of yij indicates that the jth trait is measured at stage i, so that at each sub-index Ii, all the ni traits are measured at the same age.

Suppose that there are four traits of interest and that $${\mathbf{y}}^{\prime }=\left[{y}_1\kern0.5em {y}_2\kern0.5em {y}_3\kern0.5em {y}_4\right]$$ is the vector of observable phenotypic values and $${\mathbf{g}}^{\prime }=\left[{g}_1\kern0.5em {g}_2\kern0.5em {g}_3\kern0.5em {g}_4\right]$$ is the vector of unobservable breeding values. If at the first and second stages we select two traits, then n1 = n2 = 2 and y′ can be partitioned as $${\mathbf{y}}^{\prime }=\left[{\mathbf{x}}_1^{\prime}\kern0.5em {\mathbf{x}}_2^{\prime}\right]$$, where $${\mathbf{x}}_1^{\prime }=\left[{y}_1\kern0.5em {y}_2\right]$$ and $${\mathbf{x}}_2^{\prime }=\left[{y}_3\kern0.5em {y}_4\right]$$ are the vectors of traits that become evident at the first and second stages respectively. At the first stage, the phenotypic covariance matrix of x1 (P1) and the covariance matrix of x1 with the vector of true breeding values g (G1) can be written as $$Var\left({\mathbf{x}}_1\right)=\left[\begin{array}{cc} Var\left({y}_1\right)& Cov\left({y}_1,{y}_2\right)\\ {} Cov\left({y}_2,{y}_1\right)& Var\left({y}_2\right)\end{array}\right]={\mathbf{P}}_1$$ and

$$Cov\left({\mathbf{x}}_1,\mathbf{g}\right)=\left[\begin{array}{cccc} Cov\left({y}_1,{g}_1\right)& Cov\left({y}_1,{g}_2\right)& Cov\left({y}_1,{g}_3\right)& Cov\left({y}_1,{g}_4\right)\\ {} Cov\left({y}_2,{g}_1\right)& Cov\left({y}_2,{g}_2\right)& Cov\left({y}_2,{g}_3\right)& Cov\left({y}_2,{g}_4\right)\end{array}\right]={\mathbf{G}}_1$$ respectively. For the second stage, in addition to matrix P1, we need the phenotypic covariance matrix between x1 and x2 (P12) and the phenotypic covariance matrix of x2 (P2); thus, the covariance matrix of phenotypic values at stage 2 is $$\mathbf{P}=\left[\begin{array}{cc}{\mathbf{P}}_1& {\mathbf{P}}_{12}\\ {}{\mathbf{P}}_{21}& {\mathbf{P}}_2\end{array}\right]$$. In a similar manner, in addition to matrix G1, at stage 2 we need the covariance between x2 and g (G2); that is, at stage 2 the covariance matrix between phenotypic and breeding values can be written as $$\mathbf{G}=\left[\begin{array}{c}{\mathbf{G}}_1\\ {}{\mathbf{G}}_2\end{array}\right]$$. Matrices G and C are not exactly the same, because although C = Var(g), $$\mathbf{G}=\left[\begin{array}{c} Cov\left({\mathbf{x}}_1,\mathbf{g}\right)\\ {} Cov\left({\mathbf{x}}_2,\mathbf{g}\right)\end{array}\right]=\left[\begin{array}{c}{\mathbf{G}}_1\\ {}{\mathbf{G}}_2\end{array}\right]$$ and this latter matrix changes at each stage.

Let $${\mathbf{w}}^{\prime }=\left[{w}_1\kern0.5em {w}_2\kern0.5em {w}_3\kern0.5em {w}_4\right]$$ be the vector of economic weights; then, at the first and second stages the MLPSI vectors of coefficients are $${\mathbf{b}}_1^{\prime }={\mathbf{w}}^{\prime }{\mathbf{G}}_1^{\prime }{\mathbf{P}}_1^{-1}=\left[{b}_{11}\kern0.5em {b}_{12}\right]$$ and $${\mathbf{b}}_2^{\prime }={\mathbf{w}}^{\prime }{\mathbf{G}}^{\prime }{\mathbf{P}}^{-1}=\left[{b}_{21}\kern0.5em {b}_{22}\kern0.5em {b}_{23}\kern0.5em {b}_{24}\right]$$ respectively. The selection indices at stages 1 and 2 can be written as $${I}_1={b}_{11}{y}_1+{b}_{12}{y}_2={\mathbf{b}}_1^{\prime }{\mathbf{x}}_1$$ and $${I}_2={b}_{21}{y}_1+{b}_{22}{y}_2+{b}_{23}{y}_3+{b}_{24}{y}_4={\mathbf{b}}_2^{\prime}\mathbf{y}$$. These two indices may be correlated, in which case numerical integration is required to find the optimal truncation points and selection intensities (Xu and Muir 1992; Hicks et al. 1998) before the maximized MLPSI selection response and expected genetic gain per trait can be obtained.

The accuracy of the MLPSI at stages 1 and 2 can be written as

$${\rho}_{HI_1}=\sqrt{\frac{{\mathbf{b}}_1^{\prime }{\mathbf{P}}_1{\mathbf{b}}_1}{{\mathbf{w}}^{\prime}\mathbf{Cw}}}\kern1em \mathrm{and}\kern1em {\rho}_{HI_2}=\sqrt{\frac{{\mathbf{b}}_2^{\prime }{\mathbf{P}}^{\ast }{\mathbf{b}}_2}{{\mathbf{w}}^{\prime }{\mathbf{C}}^{\ast}\mathbf{w}}},$$
(9.1)

respectively. Let k1 and k2 be the selection intensities for stages 1 and 2; then, the maximized MLPSI expected genetic gains per trait can be written as

$${\mathbf{E}}_1={k}_1\frac{{\mathbf{G}}_1^{\prime }{\mathbf{b}}_1}{\sqrt{{\mathbf{b}}_1^{\prime }{\mathbf{P}}_1{\mathbf{b}}_1}}\kern1em \mathrm{and}\kern1em {\mathbf{E}}_2={k}_2\frac{{\mathbf{C}}^{\ast }{\mathbf{b}}_2}{\sqrt{{\mathbf{b}}_2^{\prime }{\mathbf{P}}^{\ast }{\mathbf{b}}_2}},$$
(9.2)

and the total expected genetic gain per trait for the two stages is equal to E1 + E2. In a similar manner, the maximized selection responses for both stages are

$${R}_1={k}_1\sqrt{{\mathbf{b}}_1^{\prime }{\mathbf{P}}_1{\mathbf{b}}_1}\kern1em \mathrm{and}\kern1em {R}_2={k}_2\sqrt{{\mathbf{b}}_2^{\prime }{\mathbf{P}}^{\ast }{\mathbf{b}}_2},$$
(9.3)

and the total selection response for the two stages is R1 + R2. In Eqs. (9.1) to (9.3), matrices P* and C* are matrices P and C, respectively, adjusted for previous selection on $${I}_1={\mathbf{b}}_1^{\prime }{\mathbf{x}}_1$$. That is, the MLPSI accuracy, expected genetic gain per trait, and selection response at stage 2 are affected by previous selection on I1 (Saxton 1983), so P and C must be adjusted.

One method for adjusting matrices P and C has been provided by Cochran (1951) and Cunningham (1975). Suppose that X, Y, and W are three jointly normally distributed random variables and that the covariance among them is known, then the covariance between X and Y adjusted for the effects of selection on W can be obtained as

$$Cov{\left(X,Y\right)}^{\ast }= Cov\left(X,Y\right)-u\frac{Cov\left(X,W\right) Cov\left(Y,W\right)}{Var(W)},$$
(9.4)

where u = k1(k1 − τ), k1 is the selection intensity at stage 1, and τ is the truncation point when $${I}_1={\mathbf{b}}_1^{\prime }{\mathbf{x}}_1$$ is applied. For example, if the proportion selected at the first stage is 5%, then k1 = 2.063, τ = 1.645, and u = 0.862 (Falconer and Mackay 1996, Table A).
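The 5% example can be reproduced with a few lines of Python. This is a minimal sketch that assumes a standard normal index and uses only the standard library; the function name `truncation_adjustment` is an illustrative choice, not part of any package.

```python
from statistics import NormalDist

def truncation_adjustment(proportion_selected):
    """Return (k, tau, u) for truncation selection on a standard normal index."""
    std_normal = NormalDist()                             # mean 0, standard deviation 1
    tau = std_normal.inv_cdf(1.0 - proportion_selected)   # truncation point
    z = std_normal.pdf(tau)                               # height of the ordinate at tau
    k = z / proportion_selected                           # selection intensity, k = z/p
    u = k * (k - tau)                                     # factor u used in Eq. (9.4)
    return k, tau, u

k1, tau, u = truncation_adjustment(0.05)
print(round(k1, 3), round(tau, 3), round(u, 3))  # 2.063 1.645 0.862
```

Because u lies between 0 and 1 for truncation selection on a normal index, the adjustment in Eq. (9.4) can only shrink the part of the covariance that is explained by the index W.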

According to Dekkers (2014), with the result of Eq. (9.4), it is possible to obtain matrices P* and C* using the following two equations:

$${\mathbf{P}}^{\ast }= Var{\left(\mathbf{y}\right)}^{\ast }=\mathbf{P}-u\frac{Cov\left(\mathbf{y},{\mathbf{x}}_1\right){\mathbf{b}}_1{\mathbf{b}}_1^{\prime } Cov\left({\mathbf{x}}_1,\mathbf{y}\right)}{{\mathbf{b}}_1^{\prime } Var\left({\mathbf{x}}_1\right){\mathbf{b}}_1}=\mathbf{P}-u\frac{\left[\begin{array}{c}{\mathbf{P}}_1\\ {}{\mathbf{P}}_{21}\end{array}\right]{\mathbf{b}}_1{\mathbf{b}}_1^{\prime}\left[{\mathbf{P}}_1\kern0.5em {\mathbf{P}}_{12}\right]}{{\mathbf{b}}_1^{\prime }{\mathbf{P}}_1{\mathbf{b}}_1}$$
(9.5)

and

$${\mathbf{C}}^{\ast }= Var{\left(\mathbf{g}\right)}^{\ast }=\mathbf{C}-u\frac{Cov\left(\mathbf{g},{\mathbf{x}}_1\right){\mathbf{b}}_1{\mathbf{b}}_1^{\prime } Cov\left({\mathbf{x}}_1,\mathbf{g}\right)}{{\mathbf{b}}_1^{\prime } Var\left({\mathbf{x}}_1\right){\mathbf{b}}_1}=\mathbf{C}-u\frac{{\mathbf{G}}_1^{\prime }{\mathbf{b}}_1{\mathbf{b}}_1^{\prime }{\mathbf{G}}_1}{{\mathbf{b}}_1^{\prime }{\mathbf{P}}_1{\mathbf{b}}_1}.$$
(9.6)
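As a short worked step, and only as a reading aid, Eq. (9.4) can be applied element by element to see where the matrix forms come from. Take X = y_i, Y = y_j, and W = I1 = b1′x1; the subscripts i and j are introduced here purely for illustration. Then

$$Cov\left({y}_i,{I}_1\right)={\left[ Cov\left(\mathbf{y},{\mathbf{x}}_1\right){\mathbf{b}}_1\right]}_i\kern1em \mathrm{and}\kern1em Var\left({I}_1\right)={\mathbf{b}}_1^{\prime }{\mathbf{P}}_1{\mathbf{b}}_1,$$

so that

$$Cov{\left({y}_i,{y}_j\right)}^{\ast }= Cov\left({y}_i,{y}_j\right)-u\frac{{\left[ Cov\left(\mathbf{y},{\mathbf{x}}_1\right){\mathbf{b}}_1\right]}_i{\left[ Cov\left(\mathbf{y},{\mathbf{x}}_1\right){\mathbf{b}}_1\right]}_j}{{\mathbf{b}}_1^{\prime }{\mathbf{P}}_1{\mathbf{b}}_1},$$

which is the (i, j)th element of the matrix form for P*. Replacing y_i and y_j with g_i and g_j, and using $$Cov\left(\mathbf{g},{I}_1\right)={\mathbf{G}}_1^{\prime }{\mathbf{b}}_1$$, gives the (i, j)th element of Eq. (9.6) in the same way.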

With the Eq. (9.5) result, the correlation between $${I}_1={\mathbf{b}}_1^{\prime }{\mathbf{x}}_1$$ and $${I}_2={\mathbf{b}}_2^{\prime}\mathbf{y}$$ is

$$Corr\left({I}_1,{I}_2\right)=\frac{{\mathbf{b}}_1^{\prime}\left[{\mathbf{P}}_1\kern0.5em {\mathbf{P}}_{12}\right]{\mathbf{b}}_2}{\sqrt{{\mathbf{b}}_1^{\prime }{\mathbf{P}}_1{\mathbf{b}}_1}\sqrt{{\mathbf{b}}_2^{\prime }{\mathbf{P}\mathbf{b}}_2}}={\rho}_{12},$$
(9.7)

where $$\sqrt{{\mathbf{b}}_1^{\prime }{\mathbf{P}}_1{\mathbf{b}}_1}$$ and $$\sqrt{{\mathbf{b}}_2^{\prime }{\mathbf{Pb}}_2}$$ are the standard deviations of $${I}_1={\mathbf{b}}_1^{\prime }{\mathbf{x}}_1$$ and $${I}_2={\mathbf{b}}_2^{\prime}\mathbf{y}$$ respectively.

### 9.1.2 The Selection Intensities

Selection intensity k is related to the height of the ordinate of the normal curve (z) and the proportion selected (p) in the LPSI as k = z/p. In the multistage selection context, it is usual to fix the total proportion to be selected (p) before selection is carried out and then to determine the unknown proportion qi (i=1, 2,⋯, N) for each stage under the restriction

$$p=\prod \limits_{i=1}^N{q}_i,$$
(9.8)

where N is the number of stages. In the two-stage selection scheme, we would have p = q1q2. Based on the fixed proportion p and the ρ12 value (Eq. 9.7), Young (1964) used the bivariate truncated normal distribution theory to obtain the selection intensity for two stages. A truncated distribution is a conditional distribution resulting when the domain of the parent distribution is restricted to a smaller region (Hattaway 2010). In the multistage selection context, a truncation occurs when a sample of individuals from the parent distribution are selected as parents for the next selection cycle, thus creating a new population of individuals that follow a truncated normal distribution.

Suppose that $${I}_1={\mathbf{b}}_1^{\prime }{\mathbf{x}}_1$$ and $${I}_2={\mathbf{b}}_2^{\prime}\mathbf{y}$$ have a joint normal distribution and let I1 and I2 be transformed as $${v}_1=\frac{I_1-{\mu}_{I_1}}{\sigma_{I_1}}$$ and $${v}_2=\frac{I_2-{\mu}_{I_2}}{\sigma_{I_2}}$$, each with a mean of zero and a variance of 1, where $${\mu}_{I_1}$$ and $${\mu}_{I_2}$$ are the means, whereas $${\sigma}_{I_1}$$ and $${\sigma}_{I_2}$$ are the standard deviations of I1 and I2 respectively. In this case, the method of selection is to retain animals or plants with v1 ≥ c1 at stage 1 and v1 + v2 ≥ c2 at stage 2, where c1 and c2 are truncation points for I1 and I2 respectively.

The selected population has a bivariate left-truncated normal distribution with a probability density function given by $$h\left({v}_1,{v}_2\right)=\frac{f\left({v}_1,{v}_2\right)}{p}$$, where $$f\left({v}_1,{v}_2\right)=\frac{1}{2\pi \sqrt{1-{\rho}_{12}^2}}\exp \left\{-\frac{1}{2\left(1-{\rho}_{12}^2\right)}\left[{v}_1^2+{v}_2^2-2{\rho}_{12}{v}_1{v}_2\right]\right\}$$ and ρ12 is the correlation between v1 and v2. The fixed total proportion (p) before selection can be written as $$p={\int}_{c_1}^{\infty }{\int}_{c_2-{v}_1}^{\infty }f\left({v}_1,{v}_2\right){dv}_2{dv}_1$$, where c1 and c2 are the truncation points for I1 and I2 respectively. Then, as p is fixed, Young (1964) integrated by parts (Thomas 2014)

$${\int}_{c_1}^{\infty }{\int}_{c_2-{v}_1}^{\infty }f\left({v}_1,{v}_2\right){dv}_2{dv}_1$$
(9.9)

and found the expectations of v1 and v2 in the selected population, writing the selection intensity values for stages 1 (k1) and 2 (k2) as

$${k}_1=\frac{z\left({c}_1\right)Q(a)}{p}+\frac{z\left({c}_3\right)Q(b)\sqrt{\left(1+{\rho}_{12}\right)/2}}{p}$$
(9.10)

and

$${k}_2=\frac{\rho_{12}z\left({c}_1\right)Q(a)}{p}+\frac{z\left({c}_3\right)Q(b)\sqrt{\left(1+{\rho}_{12}\right)/2}}{p}$$
(9.11)

respectively, where $$z\left({c}_1\right)=\frac{\exp \left\{-0.5{c}_1^2\right\}}{\sqrt{2\pi }}$$ and $$z\left({c}_3\right)=\frac{\exp \left\{-0.5{c}_3^2\right\}}{\sqrt{2\pi }}$$ are the heights of the ordinates of the standard normal distribution at c1 and c3, with $${c}_3=\frac{c_2}{\sqrt{2\left(1+{\rho}_{12}\right)}}$$, and p is the total proportion of the population of animal or plant lines selected. In addition, $$a=\frac{c_2-{c}_1\left(1+{\rho}_{12}\right)}{\sqrt{1-{\rho}_{12}^2}}$$ and $$b=\frac{2{c}_1-{c}_2}{\sqrt{2\left(1-{\rho}_{12}\right)}}$$, whereas Q(a) = 1 − Φ(a) and Q(b) = 1 − Φ(b) are the complements of the standard normal distribution function, where $$\Phi (a)={\int}_{-\infty}^a\frac{1}{\sqrt{2\pi }}\exp \left\{-0.5{w}^2\right\} dw$$ and $$\Phi (b)={\int}_{-\infty}^b\frac{1}{\sqrt{2\pi }}\exp \left\{-0.5{t}^2\right\} dt$$ are probabilities of the standard normal distribution, i.e., Φ(a) = Pr(W ≤ a) and Φ(b) = Pr(T ≤ b).

Young (1964) provided figures to obtain values of c1 and c2 when the ρ12 values are between −0.8 and 0.8, and the p values are between 0.05 and 0.8. For example, suppose that ρ12 = 0.8 and p = 0.2 (or 20%), then, according to Young (1964, Fig. 9), c1 = 0.80 and c2 = 1.6, and to find the selection intensities for the first (k1) and second stages (k2) we need to solve Eqs. (9.10) and (9.11). That is, as c1 = 0.80, c2 = 1.6, ρ12 = 0.8, and p = 0.2, then $$z\left({c}_1\right)=\frac{\exp \left\{-0.5{(0.8)}^2\right\}}{\sqrt{2\pi }}=0.290$$, $$z\left({c}_3\right)=\frac{\exp \left\{-0.5\left[{(1.6)}^2/2(1.8)\right]\right\}}{\sqrt{2\pi }}=0.28$$, $$a=\frac{1.6-0.8(1.8)}{\sqrt{1-{(0.8)}^2}}=0.27$$, $$b=\frac{2(0.8)-1.6}{\sqrt{2(0.2)}}=0$$, Φ(a) = 0.6064, Φ(b) = 0.5, Q(a) = 1 − Φ(a) = 0.3936, and Q(b) = 1 − Φ(b) = 0.5. Based on these results, the selection intensities for stages 1 and 2 are

$$\begin{array}{l}{k}_1=\frac{(0.29)(0.3936)}{0.2}+\frac{(0.28)(0.5)(0.9)}{0.2}=0.744\kern1em \mathrm{and}\\ {}{k}_2=\frac{(0.8)(0.29)(0.3936)}{0.2}+\frac{(0.28)(0.5)(0.9)}{0.2}=0.721\end{array}$$

respectively. Note that the values of Φ(a) = 0.6064 and Φ(b) = 0.5 can be obtained from any table with values showing the area under the curve of the standard normal distribution (e.g., Rausand and Høyland 2004, Table F.1).
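Eqs. (9.10) and (9.11) are straightforward to evaluate in code, which avoids table lookups and rounding of intermediate values. The following is a minimal Python sketch using only the standard library; the function name `two_stage_intensities` is an illustrative choice. It evaluates both equations exactly as written, with both terms divided by p and with the factor √((1 + ρ12)/2) in the second term. For c1 = 0.8, c2 = 1.6, ρ12 = 0.8, and p = 0.2, this unrounded evaluation gives values near 1.235 and 1.121, which are larger than the figures quoted above, so the arithmetic is worth rechecking before either set of numbers is reused.

```python
from math import exp, pi, sqrt
from statistics import NormalDist

def two_stage_intensities(c1, c2, rho12, p):
    """Evaluate Eqs. (9.10) and (9.11) for truncation points c1 and c2."""
    def ordinate(x):
        # Height of the standard normal density at x.
        return exp(-0.5 * x * x) / sqrt(2.0 * pi)

    def upper_tail(x):
        # Q(x) = 1 - Phi(x), the complement of the standard normal cdf.
        return 1.0 - NormalDist().cdf(x)

    c3 = c2 / sqrt(2.0 * (1.0 + rho12))
    a = (c2 - c1 * (1.0 + rho12)) / sqrt(1.0 - rho12 ** 2)
    b = (2.0 * c1 - c2) / sqrt(2.0 * (1.0 - rho12))

    first = ordinate(c1) * upper_tail(a) / p
    second = ordinate(c3) * upper_tail(b) * sqrt((1.0 + rho12) / 2.0) / p
    k1 = first + second            # Eq. (9.10)
    k2 = rho12 * first + second    # Eq. (9.11)
    return k1, k2

k1, k2 = two_stage_intensities(c1=0.8, c2=1.6, rho12=0.8, p=0.2)
print(round(k1, 3), round(k2, 3))
```

The two equations differ only in the factor ρ12 on the first term, so k2 ≤ k1 whenever 0 ≤ ρ12 ≤ 1 and the first term is non-negative.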

One problem with Eqs. (9.10) and (9.11) is that they tend to overestimate the selection intensity values, and hence the selection response, when the total proportion retained p is lower than 10%. Cochran (1951) gave two equations to obtain selection intensities in the two-stage context, but his equations also overestimate the selection intensity values when p is lower than 10%. Up to now, there is no accurate method for estimating selection intensities for two or more stages in the MLPSI context. Mi et al. (2014) developed an R package called selectiongain that enables calculation of the OMLPSI selection response for up to 20 selection stages. The selectiongain package uses raw integration to obtain the first moment of a lower truncated multivariate standard normal distribution and then estimates the OMLPSI selection response at each stage; however, this integral requires complex numerical algorithms with no convergence criteria (Arismendi 2013) and could also overestimate the selection intensity at each stage.

### 9.1.3 Numerical Example

To illustrate the two-stage selection theory, we use the poultry data of Xu and Muir (1992). This data set contains four traits: age at sexual maturity, defined as the age (in days) at which the first trap-nested egg was laid (y1); rate of lay, defined as 100 times (total eggs in the laying period)/(total days in the laying period) (y2); body weight (in pounds) measured at 32 weeks of age (y3); and average egg weight (in ounces per dozen) of all the eggs laid up to 32 weeks of age (y4). The estimated phenotypic and genetic covariance matrices were $$\widehat{\mathbf{P}}=\left[\begin{array}{cccc}137.178& -90.957& 0.136& 0.564\\ {}-90.957& 201.558& 1.103& -1.231\\ {}0.136& 1.103& 0.202& 0.104\\ {}0.564& -1.231& 0.104& 2.874\end{array}\right]$$ and $$\widehat{\mathbf{C}}=\left[\begin{array}{cccc}14.634& -18.356& -0.109& 1.233\\ {}-18.356& 32.029& 0.103& -2.574\\ {}-0.109& 0.103& 0.089& 0.023\\ {}1.233& -2.574& 0.023& 1.225\end{array}\right]$$ respectively, whereas the vector of economic weights for the four traits was $${\mathbf{w}}^{\prime }=\left[-3.555\kern0.5em 19.536\kern0.5em -113.746\kern0.5em 48.307\right]$$.

Suppose that at the first and second stages we select two traits (n1 = n2 = 2); then, $${\mathbf{y}}^{\prime }=\left[{\mathbf{x}}_1^{\prime}\kern0.5em {\mathbf{x}}_2^{\prime}\right]$$, where $${\mathbf{x}}_1^{\prime }=\left[{y}_1\kern0.5em {y}_2\right]$$ and $${\mathbf{x}}_2^{\prime }=\left[{y}_3\kern0.5em {y}_4\right]$$. The estimated phenotypic ($${\widehat{\mathbf{P}}}_1$$) and genetic ($${\widehat{\mathbf{G}}}_1$$) covariance matrices for the first stage were $${\widehat{\mathbf{P}}}_1=\left[\begin{array}{cc}137.178& -90.957\\ {}-90.957& 201.558\end{array}\right]$$ and $${\widehat{\mathbf{G}}}_1=\left[\begin{array}{cccc}14.634& -18.356& -0.109& 1.233\\ {}-18.356& 32.029& 0.103& -2.574\end{array}\right]$$ respectively. For the first and second stages, the estimated MLPSI vectors of coefficients were $${\widehat{\mathbf{b}}}_1^{\prime }={\mathbf{w}}^{\prime }{\widehat{\mathbf{G}}}_1^{\prime }{\widehat{\mathbf{P}}}_1^{-1}=\left[-0.918\kern0.5em 2.339\right]$$ and $${\widehat{\mathbf{b}}}_2^{\prime }={\mathbf{w}}^{\prime}\widehat{\mathbf{C}}{\widehat{\mathbf{P}}}^{-1}=\left[-0.59\kern0.5em 2.78\kern0.5em -49.45\kern0.5em 3.75\right]$$ respectively.
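The stage-1 coefficients can be checked directly from the quoted matrices. The sketch below is a minimal pure-Python illustration, not the authors' code. It assumes that the stage-1 phenotypic matrix is the upper-left 2 × 2 block of the estimated P, that the stage-1 matrix G1 is the first two rows of the estimated C, and it uses a hypothetical helper `inverse_2x2`.

```python
# Poultry estimates of Xu and Muir (1992) as quoted in the text.
P = [[137.178, -90.957, 0.136, 0.564],
     [-90.957, 201.558, 1.103, -1.231],
     [0.136, 1.103, 0.202, 0.104],
     [0.564, -1.231, 0.104, 2.874]]
C = [[14.634, -18.356, -0.109, 1.233],
     [-18.356, 32.029, 0.103, -2.574],
     [-0.109, 0.103, 0.089, 0.023],
     [1.233, -2.574, 0.023, 1.225]]
w = [-3.555, 19.536, -113.746, 48.307]

n1 = 2                                   # number of traits measured at stage 1
P1 = [row[:n1] for row in P[:n1]]        # Var(x1): upper-left block of P
G1 = C[:n1]                              # Cov(x1, g): first n1 rows of C (assumption)

def inverse_2x2(m):
    """Closed-form inverse of a 2 x 2 matrix."""
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return [[m[1][1] / det, -m[0][1] / det],
            [-m[1][0] / det, m[0][0] / det]]

# b1 = P1^{-1} G1 w
G1w = [sum(g * wj for g, wj in zip(row, w)) for row in G1]
P1_inv = inverse_2x2(P1)
b1 = [sum(p * g for p, g in zip(row, G1w)) for row in P1_inv]

print([round(x, 3) for x in b1])  # [-0.918, 2.339]
```

The stage-2 vector follows the same pattern with the full 4 × 4 matrices, but needs a general matrix inverse rather than the 2 × 2 closed form used here.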

The estimated correlation value between the estimated indices $${\widehat{I}}_1={\widehat{\mathbf{b}}}_1^{\prime }{\mathbf{x}}_1$$ and $${\widehat{I}}_2={\widehat{\mathbf{b}}}_2^{\prime}\mathbf{y}$$ was $${\widehat{\rho}}_{12}=\frac{{\widehat{\mathbf{b}}}_1^{\prime}\left[{\widehat{\mathbf{P}}}_1\kern0.5em {\widehat{\mathbf{P}}}_{12}\right]{\widehat{\mathbf{b}}}_2}{\sqrt{{\widehat{\mathbf{b}}}_1^{\prime }{\widehat{\mathbf{P}}}_1{\widehat{\mathbf{b}}}_1}\sqrt{{\widehat{\mathbf{b}}}_2^{\prime }{\widehat{\mathbf{P}}\widehat{\mathbf{b}}}_2}}=0.88$$, where $$\sqrt{{\widehat{\mathbf{b}}}_1^{\prime }{\widehat{\mathbf{P}}}_1{\widehat{\mathbf{b}}}_1}$$ and $$\sqrt{{\widehat{\mathbf{b}}}_2^{\prime }{\widehat{\mathbf{P}}\widehat{\mathbf{b}}}_2}$$ were the estimated standard deviations of $${\widehat{I}}_1$$ and $${\widehat{I}}_2$$ respectively. Assuming that p = 0.2 (or 20%), an approximate selection intensity for the first stage was k1 = 0.744, whence the estimated MLPSI selection response, expected genetic gain per trait, and accuracy were $${\widehat{R}}_1={k}_1\sqrt{{\widehat{\mathbf{b}}}_1^{\prime }{\widehat{\mathbf{P}}}_1{\widehat{\mathbf{b}}}_1}=29.85$$, $${\widehat{\mathbf{E}}}_1^{\prime }={k}_1\frac{{\widehat{\mathbf{b}}}_1^{\prime }{\widehat{\mathbf{G}}}_1}{\sqrt{{\widehat{\mathbf{b}}}_1^{\prime }{\widehat{\mathbf{P}}}_1{\widehat{\mathbf{b}}}_1}}=\left[-1.046\kern0.5em 1.702\kern0.5em 0.006\kern0.5em -0.133\right]$$, and $${\widehat{\rho}}_{HI_1}=\sqrt{\frac{{\widehat{\mathbf{b}}}_1^{\prime }{\widehat{\mathbf{P}}}_1{\widehat{\mathbf{b}}}_1}{{\mathbf{w}}^{\prime}\widehat{\mathbf{C}}\mathbf{w}}}=0.353$$ respectively.
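The stage-1 quantities can be recomputed from the quoted estimates as a consistency check. This is a minimal pure-Python sketch under stated assumptions: it uses the rounded coefficient vectors printed in the text, takes the covariance between x1 and y as the first two rows of the estimated P, and relies on two small hypothetical helpers, `dot` and `mat_vec`. Because the inputs are rounded, the results agree with the quoted values only to within rounding (the stage-1 response comes out near 29.84).

```python
from math import sqrt

# Estimates quoted in the text (Xu and Muir 1992 poultry data).
P = [[137.178, -90.957, 0.136, 0.564],
     [-90.957, 201.558, 1.103, -1.231],
     [0.136, 1.103, 0.202, 0.104],
     [0.564, -1.231, 0.104, 2.874]]
C = [[14.634, -18.356, -0.109, 1.233],
     [-18.356, 32.029, 0.103, -2.574],
     [-0.109, 0.103, 0.089, 0.023],
     [1.233, -2.574, 0.023, 1.225]]
w = [-3.555, 19.536, -113.746, 48.307]
b1 = [-0.918, 2.339]                  # rounded stage-1 coefficients
b2 = [-0.59, 2.78, -49.45, 3.75]      # rounded stage-2 coefficients
k1 = 0.744                            # stage-1 intensity used in the text

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def mat_vec(m, v):
    return [dot(row, v) for row in m]

P1 = [row[:2] for row in P[:2]]       # Var(x1)
cov_x1_y = P[:2]                      # Cov(x1, y) = [P1 P12]
G1 = C[:2]                            # Cov(x1, g)

var_I1 = dot(b1, mat_vec(P1, b1))     # b1' P1 b1
var_I2 = dot(b2, mat_vec(P, b2))      # b2' P b2
rho12 = dot(b1, mat_vec(cov_x1_y, b2)) / sqrt(var_I1 * var_I2)   # Eq. (9.7)

R1 = k1 * sqrt(var_I1)                                        # Eq. (9.3), stage 1
G1t_b1 = [dot([G1[0][j], G1[1][j]], b1) for j in range(4)]    # G1' b1
E1 = [k1 * g / sqrt(var_I1) for g in G1t_b1]                  # Eq. (9.2), stage 1
acc1 = sqrt(var_I1 / dot(w, mat_vec(C, w)))                   # Eq. (9.1), stage 1

print(round(rho12, 2), round(R1, 2), round(acc1, 3))
print([round(e, 3) for e in E1])
```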

According to the k1 = 0.744 value, the approximate value of u was u = 0.554, and by Eqs. (9.5) and (9.6), the estimated and adjusted phenotypic ($${\widehat{\mathbf{P}}}^{\ast }$$) and genetic ($${\widehat{\mathbf{C}}}^{\ast }$$) covariance matrices for the second stage were $${\widehat{\mathbf{P}}}^{\ast }=\left[\begin{array}{cccc}97.682& -26.241& 0.422& 0.168\\ {}-26.241& 95.518& 0.634& -0.582\\ {}0.422& 0.634& 0.200& 0.107\\ {}0.168& -0.582& 0.107& 2.870\end{array}\right]$$ and $${\widehat{\mathbf{C}}}^{\ast }=\left[\begin{array}{cccc}13.540& -16.575& -0.102& 1.094\\ {}-16.575& 29.129& 0.092& -2.348\\ {}-0.102& 0.092& 0.089& 0.024\\ {}1.094& -2.348& 0.024& 1.207\end{array}\right],$$ respectively.
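The adjustment of Eqs. (9.5) and (9.6) is a rank-one update, so it can be sketched in a few lines. The code below is an illustrative pure-Python sketch, not the authors' implementation; the helper name `adjust_for_selection` is hypothetical, and the inputs are the rounded stage-1 coefficients and the value u = 0.554 quoted in the text.

```python
P = [[137.178, -90.957, 0.136, 0.564],
     [-90.957, 201.558, 1.103, -1.231],
     [0.136, 1.103, 0.202, 0.104],
     [0.564, -1.231, 0.104, 2.874]]
C = [[14.634, -18.356, -0.109, 1.233],
     [-18.356, 32.029, 0.103, -2.574],
     [-0.109, 0.103, 0.089, 0.023],
     [1.233, -2.574, 0.023, 1.225]]
b1 = [-0.918, 2.339]     # rounded stage-1 coefficients from the text
u = 0.554                # adjustment factor quoted in the text

def adjust_for_selection(m, cov_with_index, var_index, u):
    """Apply Eq. (9.4) to every element of m: m - u * c c' / Var(I1)."""
    n = len(m)
    return [[m[i][j] - u * cov_with_index[i] * cov_with_index[j] / var_index
             for j in range(n)] for i in range(n)]

# Cov(y, I1) = [P1; P21] b1 and Cov(g, I1) = G1' b1, with G1 the first two rows of C.
cov_y_I1 = [sum(P[i][j] * b1[j] for j in range(2)) for i in range(4)]
cov_g_I1 = [sum(C[j][i] * b1[j] for j in range(2)) for i in range(4)]
var_I1 = sum(b1[i] * cov_y_I1[i] for i in range(2))    # b1' P1 b1

P_star = adjust_for_selection(P, cov_y_I1, var_I1, u)  # Eq. (9.5)
C_star = adjust_for_selection(C, cov_g_I1, var_I1, u)  # Eq. (9.6)

for row in P_star:
    print([round(x, 3) for x in row])
for row in C_star:
    print([round(x, 3) for x in row])
```

Since the update subtracts a symmetric matrix from a symmetric matrix, both adjusted matrices are symmetric, which is a quick check on any hand-copied result.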

For the second stage, the approximate selection intensity was k2 = 0.721, whereas the estimated MLPSI selection response, expected genetic gain per trait, and accuracy were $${\widehat{R}}_2={k}_2\sqrt{{\widehat{\mathbf{b}}}_2^{\prime }{\widehat{\mathbf{P}}}^{\ast }{\widehat{\mathbf{b}}}_2}=24.84$$, $${\widehat{\mathbf{E}}}_2^{\prime }={k}_2\frac{{\widehat{\mathbf{b}}}_2^{\prime }{\widehat{\mathbf{C}}}^{\ast }}{\sqrt{{\widehat{\mathbf{b}}}_2^{\prime }{\widehat{\mathbf{P}}}^{\ast }{\widehat{\mathbf{b}}}_2}}=\left[-0.443\kern0.5em 0.804\kern0.5em -0.087\kern0.5em -0.087\right]$$, and $${\widehat{\rho}}_{HI_2}=\sqrt{\frac{{\widehat{\mathbf{b}}}_2^{\prime }{\widehat{\mathbf{P}}}^{\ast }{\widehat{\mathbf{b}}}_2}{{\mathbf{w}}^{\prime }{\widehat{\mathbf{C}}}^{\ast}\mathbf{w}}}=0.314$$ respectively. Finally, the total estimated MLPSI selection response and expected genetic gain per trait were $${\widehat{R}}_1+{\widehat{R}}_2=54.69$$ and $${\widehat{\mathbf{E}}}_1^{\prime }+{\widehat{\mathbf{E}}}_2^{\prime }=\left[-1.488\kern0.5em 2.506\kern0.5em -0.081\kern0.5em -0.219\right]$$.
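The stage-2 selection response and accuracy can be recomputed in the same way. This is a minimal pure-Python sketch, assuming the adjusted matrices as printed to three decimals (with the adjusted genetic matrix entered symmetrically), the rounded stage-2 coefficients, and the stage-1 response of 29.85 quoted earlier; the helpers `dot` and `mat_vec` are hypothetical. With rounded inputs, the response comes out near 24.82 rather than exactly 24.84.

```python
from math import sqrt

# Adjusted matrices as quoted in the text, rounded to three decimals.
P_star = [[97.682, -26.241, 0.422, 0.168],
          [-26.241, 95.518, 0.634, -0.582],
          [0.422, 0.634, 0.200, 0.107],
          [0.168, -0.582, 0.107, 2.870]]
C_star = [[13.540, -16.575, -0.102, 1.094],
          [-16.575, 29.129, 0.092, -2.348],
          [-0.102, 0.092, 0.089, 0.024],
          [1.094, -2.348, 0.024, 1.207]]
w = [-3.555, 19.536, -113.746, 48.307]
b2 = [-0.59, 2.78, -49.45, 3.75]      # rounded stage-2 coefficients
k2 = 0.721                            # stage-2 intensity used in the text
R1 = 29.85                            # stage-1 response quoted in the text

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def mat_vec(m, v):
    return [dot(row, v) for row in m]

var_I2_adj = dot(b2, mat_vec(P_star, b2))               # b2' P* b2
R2 = k2 * sqrt(var_I2_adj)                              # Eq. (9.3), stage 2
acc2 = sqrt(var_I2_adj / dot(w, mat_vec(C_star, w)))    # Eq. (9.1), stage 2
total_R = R1 + R2

print(round(R2, 2), round(acc2, 3), round(total_R, 2))
```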

## 9.2 The Multistage Restricted Linear Phenotypic Selection Index

The multistage restricted linear phenotypic selection index (MRLPSI) is an extension of the null restricted linear phenotypic selection index (RLPSI) described in Chap. 3 to the multistage case; thus, the theoretical results of the MRLPSI are very similar to those of the RLPSI. The MRLPSI allows restrictions equal to zero to be imposed on the expected genetic gains of some traits, whereas other traits increase (or decrease) their expected genetic gains without any restrictions being imposed.

### 9.2.1 The MRLPSI Parameters for Two Stages

In Chap. 3, we indicated that vector bR = Kb is a linear transformation of the LPSI vector of coefficients (b) made by the projector matrix K, and that matrix K is idempotent ($$\mathbf{K}={\mathbf{K}}^2$$) and projects b into a space smaller than the original space of b. The reduction of the space into which matrix K projects b is equal to the number of zeros that appear in the expected genetic gain per trait. Hence, the MRLPSI vectors of coefficients for stages 1 and 2 should be linear transformations of the MLPSI vectors of coefficients at stages 1 ($${\mathbf{b}}_1={\mathbf{P}}_1^{-1}{\mathbf{G}}_1\mathbf{w}$$) and 2 ($${\mathbf{b}}_2={\mathbf{P}}^{-1}\mathbf{Cw}$$) described in Sect. 9.1.1 of this chapter, and should be written as

$${\mathbf{b}}_{R_1}={\mathbf{K}}_1{\mathbf{b}}_1$$
(9.12)

and

$${\mathbf{b}}_{R_2}={\mathbf{K}}_2{\mathbf{b}}_2,$$
(9.13)

respectively, where, at stage 1, K1 = [I1 − Q1], $${\mathbf{Q}}_1={\mathbf{P}}_1^{-1}{\boldsymbol{\Psi}}_1{\left({\boldsymbol{\Psi}}_1^{\prime }{\mathbf{P}}_1^{-1}{\boldsymbol{\Psi}}_1\right)}^{-1}{\boldsymbol{\Psi}}_1^{\prime }$$, $${\boldsymbol{\Psi}}_1^{\prime }={\mathbf{U}}^{\prime }{\mathbf{G}}_1^{\prime }$$, I1 is an identity matrix of the same size as P1, and $${\mathbf{P}}_1^{-1}$$ is the inverse of matrix P1. At stage 2, K2 = [I2 − Q2], $${\mathbf{Q}}_2={\mathbf{P}}^{-1}{\boldsymbol{\Psi}}_2{\left({\boldsymbol{\Psi}}_2^{\prime }{\mathbf{P}}^{-1}{\boldsymbol{\Psi}}_2\right)}^{-1}{\boldsymbol{\Psi}}_2^{\prime }$$, $${\boldsymbol{\Psi}}_2^{\prime }={\mathbf{U}}^{\prime}\mathbf{C}$$, I2 is an identity matrix of the same size as P, and $${\mathbf{P}}^{-1}$$ is the inverse of matrix P. By Eqs. (9.12) and (9.13), the MRLPSI for stages 1 and 2 can be written as $${I}_1={\mathbf{b}}_{R_1}^{\prime }{\mathbf{x}}_1$$ and $${I}_2={\mathbf{b}}_{R_2}^{\prime}\mathbf{y}$$, where $${\mathbf{y}}^{\prime }=\left[{\mathbf{x}}_1^{\prime}\kern0.5em {\mathbf{x}}_2^{\prime}\right]$$; $${\mathbf{x}}_1^{\prime }$$ and $${\mathbf{x}}_2^{\prime }$$ are the vectors of traits that become evident at the first and second stages respectively.

Let k1 and k2 be the selection intensities for stages 1 and 2 (Eqs. 9.10 and 9.11) respectively, and let P* and C* be the covariance matrices adjusted in the MRLPSI context according to Eqs. (9.5) and (9.6) respectively. The maximized MRLPSI selection response, expected genetic gain per trait, and accuracy at stages 1 and 2 can be written as

$${R}_{R_1}={k}_1\sqrt{{\mathbf{b}}_{R_1}^{\prime }{\mathbf{P}}_1{\mathbf{b}}_{R_1}}\kern1em \mathrm{and}\kern1em {R}_{R_2}={k}_2\sqrt{{\mathbf{b}}_{R_2}^{\prime }{\mathbf{P}}^{\ast }{\mathbf{b}}_{R_2}},$$
(9.14)

$${\mathbf{E}}_{R_1}={k}_1\frac{{\mathbf{G}}_1^{\prime }{\mathbf{b}}_{R_1}}{\sqrt{{\mathbf{b}}_{R_1}^{\prime }{\mathbf{P}}_1{\mathbf{b}}_{R_1}}}\kern1em \mathrm{and}\kern1em {\mathbf{E}}_{R_2}={k}_2\frac{{\mathbf{C}}^{\ast }{\mathbf{b}}_{R_2}}{\sqrt{{\mathbf{b}}_{R_2}^{\prime }{\mathbf{P}}^{\ast }{\mathbf{b}}_{R_2}}}$$
(9.15)

and

$${\rho}_{R_1}=\sqrt{\frac{{\mathbf{b}}_{R_1}^{\prime }{\mathbf{P}}_1{\mathbf{b}}_{R_1}}{{\mathbf{w}}^{\prime}\mathbf{Cw}}}\kern1em \mathrm{and}\kern1em {\rho}_{R_2}=\sqrt{\frac{{\mathbf{b}}_{R_2}^{\prime }{\mathbf{P}}^{\ast }{\mathbf{b}}_{R_2}}{{\mathbf{w}}^{\prime }{\mathbf{C}}^{\ast}\mathbf{w}}},$$
(9.16)

respectively, whereas the total MRLPSI selection response and expected genetic gain per trait for both stages are equal to $${R}_{R_1}+{R}_{R_2}$$ and $${\mathbf{E}}_{R_1}+{\mathbf{E}}_{R_2}$$.

### 9.2.2 Numerical Examples

To illustrate the MRLPSI theory for a two-stage selection breeding scheme, we use the real data set of the White Leghorn chickens of Hicks et al. (1998). This data set comprises six traits (y1 to y6), each a record of the number of eggs laid during a different period: from week 0 through 4 (y1), 4 through 8 (y2), 8 through 28 (y3), 28 through 32 (y4), 32 through 36 (y5), and 36 through 52 (y6). The estimated phenotypic and genotypic covariance matrices were

$$\widehat{\mathbf{P}}=\left[\begin{array}{cccccc}102& 32& 14& 4& 3& -1\\ {}32& 80& 80& 16& 17& 7\\ {}14& 80& 298& 78& 112& 62\\ {}4& 16& 78& 66& 80& 51\\ {}3& 17& 112& 80& 135& 49\\ {}-1& 7& 62& 51& 49& 98\end{array}\right]\kern1em \mathrm{and}\kern1em \widehat{\mathbf{C}}=\left[\begin{array}{cccccc}44& 11& -11& -3& -8& -3\\ {}11& 26& 24& 7& 7& 3\\ {}-11& 24& 62& 23& 37& 20\\ {}-3& 7& 23& 14& 23& 14\\ {}-8& 7& 37& 23& 42& 25\\ {}-3& 3& 20& 14& 25& 18\end{array}\right],$$

respectively, and $${\mathbf{w}}^{\prime }=\left[0.08\kern0.5em 0.08\kern0.5em 0.38\kern0.5em 0.08\kern0.5em 0.08\kern0.5em 0.31\right]$$ was the vector of economic weights.

Let $${\mathbf{y}}^{\prime }=\left[{y}_1\kern0.5em {y}_2\kern0.5em {y}_3\kern0.5em {y}_4\kern0.5em {y}_5\kern0.5em {y}_6\right]$$ and $${\mathbf{g}}^{\prime }=\left[{g}_1\kern0.5em {g}_2\kern0.5em {g}_3\kern0.5em {g}_4\kern0.5em {g}_5\kern0.5em {g}_6\right]$$ be the vectors of observed phenotypic and unobserved genotypic values respectively, and suppose that at stage 1 we select four traits and at stage 2 we select two traits; then $${\mathbf{x}}_1^{\prime }=\left[{y}_1\kern0.5em {y}_2\kern0.5em {y}_3\kern0.5em {y}_4\right]$$ and $${\mathbf{x}}_2^{\prime }=\left[{y}_5\kern0.5em {y}_6\right]$$ are the vectors of observations at stages 1 and 2 respectively, whereas $${\mathbf{y}}^{\prime }=\left[{\mathbf{x}}_1^{\prime}\kern0.5em {\mathbf{x}}_2^{\prime}\right]$$ is the vector of total observations at stage 2. We need to estimate vectors $${\mathbf{b}}_{R_1}^{\prime }={\mathbf{b}}_1^{\prime }{\mathbf{K}}_1^{\prime }$$ and $${\mathbf{b}}_{R_2}^{\prime }={\mathbf{b}}_2^{\prime }{\mathbf{K}}_2^{\prime }$$, where $${\mathbf{b}}_1^{\prime }={\mathbf{w}}^{\prime }{\mathbf{G}}_1^{\prime }{\mathbf{P}}_1^{-1}$$ and $${\mathbf{b}}_2^{\prime }={\mathbf{w}}^{\prime }{\mathbf{G}}^{\prime }{\mathbf{P}}^{-1}$$. In Chap. 3, we described methods of estimating matrices K1 = [I1 − Q1], $${\mathbf{Q}}_1={\mathbf{P}}_1^{-1}{\boldsymbol{\Psi}}_1{\left({\boldsymbol{\Psi}}_1^{\prime }{\mathbf{P}}_1^{-1}{\boldsymbol{\Psi}}_1\right)}^{-1}{\boldsymbol{\Psi}}_1^{\prime }$$, $${\boldsymbol{\Psi}}_1^{\prime }={\mathbf{U}}^{\prime }{\mathbf{G}}_1^{\prime }$$, K2 = [I2 − Q2], $${\mathbf{Q}}_2={\mathbf{P}}^{-1}{\boldsymbol{\Psi}}_2{\left({\boldsymbol{\Psi}}_2^{\prime }{\mathbf{P}}^{-1}{\boldsymbol{\Psi}}_2\right)}^{-1}{\boldsymbol{\Psi}}_2^{\prime }$$, and $${\boldsymbol{\Psi}}_2^{\prime }={\mathbf{U}}^{\prime}\mathbf{C}$$, which are used in this subsection.

At stage 1, the estimated phenotypic and genotypic covariance matrices were $${\widehat{\mathbf{P}}}_1=\left[\begin{array}{cccc}102& 32& 14& 4\\ {}32& 80& 80& 16\\ {}14& 80& 298& 78\\ {}4& 16& 78& 66\end{array}\right]$$ and $${\widehat{\mathbf{G}}}_1=\left[\begin{array}{cccccc}44& 11& -11& -3& -8& -3\\ {}11& 26& 24& 7& 7& 3\\ {}-11& 24& 62& 23& 37& 20\\ {}-3& 7& 23& 14& 23& 14\end{array}\right]$$ respectively. At both stages, traits y1 and y2 are restricted. Matrix U′ can be written as $${\mathbf{U}}^{\prime }=\left[\begin{array}{cccccc}1& 0& 0& 0& 0& 0\\ {}0& 1& 0& 0& 0& 0\end{array}\right]$$, whence the estimated matrix of restrictions was $${\widehat{\boldsymbol{\Psi}}}_1^{\prime }={\mathbf{U}}^{\prime }{\widehat{\mathbf{G}}}_1^{\prime }=\left[\begin{array}{cccc}44& 11& -11& -3\\ {}11& 26& 24& 7\end{array}\right]$$; therefore, the estimated matrices of $${\mathbf{Q}}_1={\mathbf{P}}_1^{-1}{\boldsymbol{\Psi}}_1{\left({\boldsymbol{\Psi}}_1^{\prime }{\mathbf{P}}_1^{-1}{\boldsymbol{\Psi}}_1\right)}^{-1}{\boldsymbol{\Psi}}_1^{\prime }$$ and K1 = [I4 − Q1] were $${\widehat{\mathbf{Q}}}_1={\widehat{\mathbf{P}}}_1^{-1}{\widehat{\boldsymbol{\Psi}}}_1{\left({\widehat{\boldsymbol{\Psi}}}_1^{\prime }{\widehat{\mathbf{P}}}_1^{-1}{\widehat{\boldsymbol{\Psi}}}_1\right)}^{-1}{\widehat{\boldsymbol{\Psi}}}_1^{\prime }=\left[\begin{array}{cccc}0.923& -0.013& -0.511& -0.144\\ {}0.164& 1.026& 1.093& 0.317\\ {}-0.145& -0.069& -0.001& -0.001\\ {}0.010& 0.159& 0.178& 0.052\end{array}\right]$$ and $${\widehat{\mathbf{K}}}_1=\left[{\mathbf{I}}_4-{\widehat{\mathbf{Q}}}_1\right]=\left[\begin{array}{cccc}0.077& 0.013& 0.511& 0.144\\ {}-0.164& -0.026& -1.093& -0.317\\ {}0.145& 0.069& 1.001& 0.001\\ {}-0.010& -0.159& -0.178& 0.948\end{array}\right]$$ respectively, where I4 is an identity matrix of size 4 × 4.

The estimated vector $${\mathbf{b}}_{R_1}^{\prime }={\mathbf{b}}_1^{\prime }{\mathbf{K}}_1^{\prime }$$ was $${\widehat{\mathbf{b}}}_{R_1}^{\prime }={\widehat{\mathbf{b}}}_1^{\prime }{\widehat{\mathbf{K}}}_1^{\prime }=\left[0.044\kern0.5em -0.095\kern0.5em 0.045\kern0.5em 0.131\right]$$, where $${\widehat{\mathbf{b}}}_1^{\prime }={\mathbf{w}}^{\prime }{\widehat{\mathbf{G}}}_1^{\prime }{\widehat{\mathbf{P}}}_1^{-1}=\left[-0.067\kern0.5em 0.125\kern0.5em 0.045\kern0.5em 0.167\right]$$, and $${\widehat{I}}_{R_1}={\widehat{\mathbf{b}}}_{R_1}^{\prime }{\mathbf{x}}_1$$ was the estimated MRLPSI at stage 1. The estimated MRLPSI vector of coefficients at stage 2 was $${\widehat{\mathbf{b}}}_{R_2}^{\prime }={\widehat{\mathbf{b}}}_2^{\prime }{\widehat{\mathbf{K}}}_2^{\prime }=\left[0.045\kern0.5em -0.068\kern0.5em 0.028\kern0.5em -0.057\kern0.5em 0.099\kern0.5em 0.106\right]$$ and $${\widehat{I}}_{R_2}={\widehat{\mathbf{b}}}_{R_2}^{\prime}\mathbf{y}$$ was the estimated MRLPSI at stage 2.
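The stage-1 projector and restricted coefficients can be rebuilt from the quoted matrices to confirm that the restricted traits receive zero expected gain. The following is an illustrative pure-Python sketch, not the authors' code: the helpers `mat_mul`, `transpose`, and `inverse` are hypothetical, and the stage-1 matrix G1 is assumed to equal the first four rows of the estimated C.

```python
def mat_mul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(a):
    return [list(col) for col in zip(*a)]

def inverse(a):
    """Gauss-Jordan inverse with partial pivoting (small dense matrices only)."""
    n = len(a)
    aug = [[float(x) for x in row] + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [x / scale for x in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

# Stage-1 estimates from the White Leghorn data of Hicks et al. (1998).
P1 = [[102, 32, 14, 4], [32, 80, 80, 16], [14, 80, 298, 78], [4, 16, 78, 66]]
G1 = [[44, 11, -11, -3, -8, -3],
      [11, 26, 24, 7, 7, 3],
      [-11, 24, 62, 23, 37, 20],
      [-3, 7, 23, 14, 23, 14]]      # first four rows of C (assumption)
w = [[0.08], [0.08], [0.38], [0.08], [0.08], [0.31]]
U = [[1, 0], [0, 1], [0, 0], [0, 0], [0, 0], [0, 0]]   # restrict traits y1 and y2

P1_inv = inverse(P1)
Psi1 = mat_mul(G1, U)                                   # 4 x 2, so Psi1' = U'G1'
middle = inverse(mat_mul(mat_mul(transpose(Psi1), P1_inv), Psi1))
Q1 = mat_mul(mat_mul(mat_mul(P1_inv, Psi1), middle), transpose(Psi1))
K1 = [[(1.0 if i == j else 0.0) - Q1[i][j] for j in range(4)] for i in range(4)]

b1 = mat_mul(P1_inv, mat_mul(G1, w))                    # unrestricted stage-1 vector
bR1 = mat_mul(K1, b1)                                   # Eq. (9.12)
restricted_gain = mat_mul(transpose(Psi1), bR1)         # U'G1'bR1, expected to be ~0

print([round(x[0], 3) for x in bR1])
print([round(x[0], 10) for x in restricted_gain])
```

Checking that the restricted gains vanish and that the projector is idempotent is a cheap way to catch sign or transcription errors in a hand-copied K1.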

The estimated correlation value ($${\widehat{\rho}}_{R_{12}}$$) between $${\widehat{I}}_{R_1}={\widehat{\mathbf{b}}}_{R_1}^{\prime }{\mathbf{x}}_1$$ and $${\widehat{I}}_{R_2}={\widehat{\mathbf{b}}}_{R_2}^{\prime}\mathbf{y}$$ was $${\widehat{\rho}}_{R_{12}}=\frac{{\widehat{\mathbf{b}}}_{R_1}^{\prime}\left[{\widehat{\mathbf{P}}}_1\kern0.5em {\widehat{\mathbf{P}}}_{21}\right]{\widehat{\mathbf{b}}}_{R_2}}{\sqrt{{\widehat{\mathbf{b}}}_{R_1}^{\prime }{\widehat{\mathbf{P}}}_1{\widehat{\mathbf{b}}}_{R_1}}\sqrt{{\widehat{\mathbf{b}}}_{R_2}^{\prime }{\widehat{\mathbf{P}}\widehat{\mathbf{b}}}_{R_2}}}=0.564$$, where $$\sqrt{{\widehat{\mathbf{b}}}_{R_1}^{\prime }{\widehat{\mathbf{P}}}_1{\widehat{\mathbf{b}}}_{R_1}}$$ and $$\sqrt{{\widehat{\mathbf{b}}}_{R_2}^{\prime }{\widehat{\mathbf{P}}\widehat{\mathbf{b}}}_{R_2}}$$ are the estimated standard deviations of $${\widehat{I}}_{R_1}={\widehat{\mathbf{b}}}_{R_1}^{\prime }{\mathbf{x}}_1$$ and $${\widehat{I}}_{R_2}={\widehat{\mathbf{b}}}_{R_2}^{\prime}\mathbf{y}$$ respectively. According to Young (1964, Fig. 8), and Eqs. (9.10) and (9.11), the selection intensities for stages 1 and 2 were k1 = 0.641 and k2 = 0.593 respectively.
The estimated selection responses and expected genetic gains per trait for both stages were $${\widehat{R}}_{R_1}={k}_1\sqrt{{\widehat{\mathbf{b}}}_{R_1}^{\prime }{\widehat{\mathbf{P}}}_1{\widehat{\mathbf{b}}}_{R_1}}=0.973$$ and $${\widehat{R}}_{R_2}={k}_2\sqrt{{\widehat{\mathbf{b}}}_{R_2}^{\prime }{\widehat{\mathbf{P}}}^{\ast }{\widehat{\mathbf{b}}}_{R_2}}=0.930$$, $${\widehat{\mathbf{E}}}_{R_1}^{\prime }={k}_1\frac{{\widehat{\mathbf{G}}}_1^{\prime }{\widehat{\mathbf{b}}}_{R_1}}{\sqrt{{\widehat{\mathbf{b}}}_{R_1}^{\prime }{\widehat{\mathbf{P}}}_1{\widehat{\mathbf{b}}}_{R_1}}}=\left[0\kern0.5em 0\kern0.5em 1.271\kern0.5em 0.870\kern0.5em 1.482\kern0.5em 0.974\right]$$ and $${\widehat{\mathbf{E}}}_{R_2}^{\prime }={k}_2\frac{{\widehat{\mathbf{C}}}^{\ast^{\prime }}{\widehat{\mathbf{b}}}_{R_2}}{\sqrt{{\widehat{\mathbf{b}}}_{R_2}^{\prime }{\widehat{\mathbf{P}}}^{\ast }{\widehat{\mathbf{b}}}_{R_2}}}=\left[0\kern0.5em 0\kern0.5em 1.419\kern0.5em 1.014\kern0.5em 2.037\kern0.5em 1.349\right]$$, whereas $${\widehat{R}}_{R_1}+{\widehat{R}}_{R_2}=1.903$$ and $${\widehat{{\mathbf{E}}^{\prime}}}_{R_1}+{\widehat{{\mathbf{E}}^{\prime}}}_{R_2}=\left[0\kern0.5em 0\kern0.5em 2.691\kern0.5em 1.884\kern0.5em 3.519\kern0.5em 2.322\right]$$ were the total estimated MRLPSI selection response and expected genetic gain per trait respectively.
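As a rough check of the stage 1 values, the selection response and expected genetic gains can be recomputed from the rounded coefficient vector and covariance matrices reported above. The sketch below is illustrative only (variable names are ours); because all inputs are rounded to three decimals or to integers, the results agree with the reported values only to about two decimals.

```python
import math

# Rounded stage-1 estimates as printed in the text.
P1 = [[102, 32, 14, 4], [32, 80, 80, 16], [14, 80, 298, 78], [4, 16, 78, 66]]
G1 = [[44, 11, -11, -3, -8, -3],
      [11, 26, 24, 7, 7, 3],
      [-11, 24, 62, 23, 37, 20],
      [-3, 7, 23, 14, 22, 14]]
b_R1 = [0.044, -0.095, 0.045, 0.131]   # estimated MRLPSI coefficients, stage 1
k1 = 0.641                             # selection intensity, stage 1

# Standard deviation of the index: sqrt(b' P1 b)
var_index = sum(b_R1[i] * P1[i][j] * b_R1[j] for i in range(4) for j in range(4))
sd_index = math.sqrt(var_index)

# Selection response R = k1 * sqrt(b' P1 b)
R_R1 = k1 * sd_index

# Expected genetic gain per trait E = k1 * G1' b / sqrt(b' P1 b)  (six traits)
E_R1 = [k1 * sum(G1[i][j] * b_R1[i] for i in range(4)) / sd_index for j in range(6)]

print(round(R_R1, 3))
print([round(e, 3) for e in E_R1])
```

The first two entries of the gain vector come out close to zero, as required by the restrictions on y1 and y2; they are not exactly zero here only because the coefficients were rounded.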

Finally, the estimated MRLPSI accuracy at stage 1 was $${\widehat{\rho}}_{R_1}=\sqrt{\frac{{\widehat{{\mathbf{b}}^{\prime}}}_{R_1}{\widehat{\mathbf{P}}}_1{\widehat{\mathbf{b}}}_{R_1}}{{\mathbf{w}}^{\prime}\widehat{\mathbf{C}}\mathbf{w}}}=0.320$$ and at stage 2 it was $${\widehat{\rho}}_{R_2}=\sqrt{\frac{{\widehat{{\mathbf{b}}^{\prime}}}_{R_2}{\widehat{\mathbf{P}}}^{\ast }{\widehat{\mathbf{b}}}_{R_2}}{{\mathbf{w}}^{\prime }{\widehat{\mathbf{C}}}^{\ast}\mathbf{w}}}=0.334$$. In this case, $${\widehat{\rho}}_{R_2}>{\widehat{\rho}}_{R_1}$$. One explanation for this result is that $${\widehat{\rho}}_{R_2}$$ was obtained with six traits, whereas $${\widehat{\rho}}_{R_1}$$ was obtained with only four traits, two of them restricted.

## 9.3 The Multistage Predetermined Proportional Gain Linear Phenotypic Selection Index

The main objectives of the multistage predetermined proportional gain linear phenotypic selection index (MPPG-LPSI) are the same as those of the predetermined proportional gain linear phenotypic selection index (PPG-LPSI) described in Chap. 3, i.e., to optimize the expected genetic gains per trait, to predict the net genetic merit, and to select the individuals with the highest net genetic merit values as parents of the next generation, all under some predetermined restrictions. The MPPG-LPSI allows restrictions different from zero to be imposed on the expected genetic gains of some traits, whereas other traits increase (or decrease) their expected genetic gains without any restrictions being imposed.

### 9.3.1 The MPPG-LPSI Parameters

In a similar manner to the MRLPSI, the MPPG-LPSI vectors of coefficients for stages 1 and 2 should be linear transformations of the MLPSI vectors of coefficients at stages 1 ($${\mathbf{b}}_1={\mathbf{P}}_1^{-1}{\mathbf{G}}_1\mathbf{w}$$) and 2 ($${\mathbf{b}}_2={\mathbf{P}}^{-1}\mathbf{Cw}$$), and should be written as

$${\mathbf{b}}_{M_1}={\mathbf{K}}_{M_1}{\mathbf{b}}_1$$
(9.17)

and

$${\mathbf{b}}_{M_2}={\mathbf{K}}_{M_2}{\mathbf{b}}_2,$$
(9.18)

respectively, where, at stage 1, $${\mathbf{K}}_{M_1}=\left[{\mathbf{I}}_1-{\mathbf{Q}}_{M_1}\right]$$, $${\mathbf{Q}}_{M_1}={\mathbf{P}}_1^{-1}{\mathbf{M}}_1{\left({\mathbf{M}}_1^{\prime }{\mathbf{P}}_1^{-1}{\mathbf{M}}_1\right)}^{-1}{\mathbf{M}}_1^{\prime }$$, $${\mathbf{M}}_1^{\prime }={\mathbf{D}}^{\prime }{\boldsymbol{\Psi}}_1^{\prime }$$, $${\boldsymbol{\Psi}}_1^{\prime }={\mathbf{U}}^{\prime }{\mathbf{G}}_1^{\prime }$$, I1 is an identity matrix of the same size as P1, and $${\mathbf{P}}_1^{-1}$$ is the inverse of matrix P1. At stage 2, $${\mathbf{K}}_{M_2}=\left[\mathbf{I}-{\mathbf{Q}}_{M_2}\right]$$, $${\mathbf{Q}}_{M_2}={\mathbf{P}}^{-1}\mathbf{M}{\left({\mathbf{M}}^{\prime }{\mathbf{P}}^{-1}\mathbf{M}\right)}^{-1}{\mathbf{M}}^{\prime }$$, $${\mathbf{M}}^{\prime }={\mathbf{D}}^{\prime }{\boldsymbol{\Psi}}^{\prime }$$, $${\boldsymbol{\Psi}}^{\prime }={\mathbf{U}}^{\prime}\mathbf{C}$$, I is an identity matrix of the same size as P, $${\mathbf{P}}^{-1}$$ is the inverse of matrix P, and $${\mathbf{D}}^{\prime }=\left[\begin{array}{ccccc}{d}_r& 0& \cdots & 0& -{d}_1\\ {}0& {d}_r& \cdots & 0& -{d}_2\\ {}\vdots & \vdots & \ddots & \vdots & \vdots \\ {}0& 0& \cdots & {d}_r& -{d}_{r-1}\end{array}\right]$$, where dq (q = 1, 2, …, r) is the qth element of $${\mathbf{d}}^{\prime }=\left[{d}_1\kern0.5em {d}_2\kern0.5em \cdots \kern0.5em {d}_r\right]$$, the vector of predetermined proportional gains (PPG) imposed by the breeder (see Chap. 3 for details).
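The structure of matrix D′ can be illustrated with a short Python sketch. The function name `build_D_transpose` is ours and purely illustrative; it simply fills in the (r − 1) × r pattern shown above, with dr on the diagonal and −dq in the last column. By construction D′d = 0, which is what forces the restricted gains to be proportional to d.

```python
def build_D_transpose(d):
    """Return D' (an (r-1) x r matrix) for a vector d of predetermined proportional gains."""
    r = len(d)
    rows = []
    for q in range(r - 1):
        row = [0] * r
        row[q] = d[r - 1]    # d_r on the diagonal
        row[r - 1] = -d[q]   # -d_q in the last column
        rows.append(row)
    return rows

d = [2, 3, 5]
D_t = build_D_transpose(d)
# Each row of D' is orthogonal to d, i.e. D'd = 0.
D_t_times_d = [sum(row[j] * d[j] for j in range(len(d))) for row in D_t]

print(D_t)
print(D_t_times_d)
```

For d′ = [2 3 5], this gives the two rows [5 0 −2] and [0 5 −3].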

By Eqs. (9.17) and (9.18), the MPPG-LPSI for stages 1 and 2 can be written as $${I}_{M_1}={\mathbf{b}}_{M_1}^{\prime }{\mathbf{x}}_1$$ and $${I}_{M_2}={\mathbf{b}}_{M_2}^{\prime}\mathbf{y}$$ respectively, where, assuming that at stage 1 we select four traits and at stage 2 we select two traits, $${\mathbf{x}}_1^{\prime }=\left[{y}_1\kern0.5em {y}_2\kern0.5em {y}_3\kern0.5em {y}_4\right]$$ and $${\mathbf{x}}_2^{\prime }=\left[{y}_5\kern0.5em {y}_6\right]$$ are the vectors of phenotypic observations at stages 1 and 2 respectively, and $${\mathbf{y}}^{\prime }=\left[{\mathbf{x}}_1^{\prime}\kern0.5em {\mathbf{x}}_2^{\prime}\right]$$ is the vector of total phenotypic observations at stage 2.

Let k1 and k2 be the selection intensities for stages 1 and 2 (Eqs. 9.10 and 9.11) respectively and let $${\mathbf{P}}^{\ast }$$ and $${\mathbf{C}}^{\ast }$$ be matrices P and C adjusted according to Eqs. (9.5) and (9.6) in the MPPG-LPSI context. Then, the MPPG-LPSI selection response and expected genetic gain per trait for both stages can be written as

$${R}_{M_1}={k}_1\sqrt{{\mathbf{b}}_{M_1}^{\prime }{\mathbf{P}}_1{\mathbf{b}}_{M_1}}\kern1em \mathrm{and}\kern1em {R}_{M_2}={k}_2\sqrt{{\mathbf{b}}_{M_2}^{\prime }{\mathbf{P}}^{\ast }{\mathbf{b}}_{M_2}}$$
(9.19)

and

$${\mathbf{E}}_{M_1}={k}_1\frac{{\mathbf{G}}_1^{\prime }{\mathbf{b}}_{M_1}}{\sqrt{{\mathbf{b}}_{M_1}^{\prime }{\mathbf{P}}_1{\mathbf{b}}_{M_1}}}\kern1em \mathrm{and}\kern1em {\mathbf{E}}_{M_2}={k}_2\frac{{\mathbf{C}}^{\ast }{\mathbf{b}}_{M_2}}{\sqrt{{\mathbf{b}}_{M_2}^{\prime }{\mathbf{P}}^{\ast }{\mathbf{b}}_{M_2}}},$$
(9.20)

respectively, whereas the total MPPG-LPSI selection response and expected genetic gain per trait for both stages are equal to $${R}_{M_1}+{R}_{M_2}$$ and $${\mathbf{E}}_{M_1}+{\mathbf{E}}_{M_2}$$. In addition, the MPPG-LPSI accuracy for both stages can be written as

$${\rho}_{M_1}=\sqrt{\frac{{\mathbf{b}}_{M_1}^{\prime }{\mathbf{P}}_1{\mathbf{b}}_{M_1}}{{\mathbf{w}}^{\prime}\mathbf{Cw}}}\kern1em \mathrm{and}\kern1em {\rho}_{M_2}=\sqrt{\frac{{\mathbf{b}}_{M_2}^{\prime }{\mathbf{P}}^{\ast }{\mathbf{b}}_{M_2}}{{\mathbf{w}}^{\prime }{\mathbf{C}}^{\ast}\mathbf{w}}}.$$
(9.21)

### 9.3.2 Numerical Examples

We use the real data set described in Sect. 9.2.2 to illustrate the theoretical results of the MPPG-LPSI in the same way as we did for the MRLPSI. We need to estimate vectors $${\mathbf{b}}_{M_1}^{\prime }={\mathbf{b}}_1^{\prime }{\mathbf{K}}_{M_1}^{\prime }$$ and $${\mathbf{b}}_{M_2}^{\prime }={\mathbf{b}}_2^{\prime }{\mathbf{K}}_{M_2}^{\prime }$$, where $${\mathbf{b}}_1^{\prime }={\mathbf{w}}^{\prime }{\mathbf{G}}_1^{\prime }{\mathbf{P}}_1^{-1}$$ and $${\mathbf{b}}_2^{\prime }={\mathbf{w}}^{\prime}\mathbf{C}{\mathbf{P}}^{-1}$$. In Chap. 3 we gave methods to estimate $${\mathbf{K}}_M=\left[\mathbf{I}-{\mathbf{Q}}_M\right]$$, $${\mathbf{Q}}_M={\mathbf{P}}^{-1}\mathbf{M}{\left({\mathbf{M}}^{\prime }{\mathbf{P}}^{-1}\mathbf{M}\right)}^{-1}{\mathbf{M}}^{\prime }$$, $${\mathbf{M}}^{\prime }={\mathbf{D}}^{\prime }{\boldsymbol{\Psi}}^{\prime }$$, and $${\boldsymbol{\Psi}}^{\prime }={\mathbf{U}}^{\prime}\mathbf{C}$$, which are used in this subsection.

The estimated phenotypic and genotypic covariance matrices at stage 1 were $${\widehat{\mathbf{P}}}_1=\left[\begin{array}{cccc}102& 32& 14& 4\\ {}32& 80& 80& 16\\ {}14& 80& 298& 78\\ {}4& 16& 78& 66\end{array}\right]$$ and $${\widehat{\mathbf{G}}}_1=\left[\begin{array}{cccccc}44& 11& -11& -3& -8& -3\\ {}11& 26& 24& 7& 7& 3\\ {}-11& 24& 62& 23& 37& 20\\ {}-3& 7& 23& 14& 22& 14\end{array}\right]$$ respectively, whereas $${\mathbf{w}}^{\prime }=\left[0.08\kern0.5em 0.08\kern0.5em 0.38\kern0.5em 0.08\kern0.5em 0.08\kern0.5em 0.31\right]$$ was the vector of economic weights. The traits restricted at both stages are y1, y2, and y3. The vector of PPG was $${\mathbf{d}}^{\prime }=\left[2\kern0.5em 3\kern0.5em 5\right]$$, whence $${\mathbf{D}}^{\prime }=\left[\begin{array}{ccc}5& 0& -2\\ {}0& 5& -3\end{array}\right]$$ and $${\mathbf{U}}^{\prime }=\left[\begin{array}{cccccc}1& 0& 0& 0& 0& 0\\ {}0& 1& 0& 0& 0& 0\\ {}0& 0& 1& 0& 0& 0\end{array}\right]$$ were matrices D′ and U′. The estimated matrices of $${\mathbf{M}}_1^{\prime }$$ and $${\mathbf{K}}_{M_1}=\left[\mathbf{I}-{\mathbf{Q}}_{M_1}\right]$$ were $${\widehat{\mathbf{M}}}_1^{\prime }={\mathbf{D}}^{\prime }{\widehat{\boldsymbol{\Psi}}}_1^{\prime }=\left[\begin{array}{cccc}242& 7& -178& -61\\ {}88& 58& -66& -34\end{array}\right]$$ and $${\widehat{\mathbf{K}}}_{M_1}=\left[\begin{array}{cccc}0.176& 0.205& 0.606& 0.159\\ {}0.031& 0.032& -0.007& 0.199\\ {}0.195& 0.235& 0.852& -0.098\\ {}0.130& 0.130& -0.098& 0.940\end{array}\right]$$ respectively, where $${\widehat{\boldsymbol{\Psi}}}_1^{\prime }={\mathbf{U}}^{\prime }{\widehat{\mathbf{G}}}_1^{\prime }$$.

At stage 1, the estimated MPPG-LPSI vector of coefficients was $${\widehat{{\mathbf{b}}^{\prime}}}_{M_1}={\widehat{{\mathbf{b}}^{\prime}}}_1{\widehat{\mathbf{K}}}_{M_1}^{\prime }=\left[0.068\kern0.5em 0.035\kern0.5em 0.039\kern0.5em 0.160\right]$$, where $${\widehat{\mathbf{b}}}_1^{\prime }={\mathbf{w}}^{\prime }{\widehat{\mathbf{G}}}_1^{\prime }{\widehat{\mathbf{P}}}_1^{-1}=\left[-0.067\kern0.5em 0.125\kern0.5em 0.045\kern0.5em 0.167\right]$$; the stage 2 vector $${\widehat{{\mathbf{b}}^{\prime}}}_{M_2}={\widehat{{\mathbf{b}}^{\prime}}}_2{\widehat{\mathbf{K}}}_{M_2}^{\prime }$$ was estimated in the same way. The estimated MPPG-LPSI at stages 1 and 2 were therefore $${\widehat{I}}_{M_1}={\widehat{{\mathbf{b}}^{\prime}}}_{M_1}{\mathbf{x}}_1$$ and $${\widehat{I}}_{M_2}={\widehat{{\mathbf{b}}^{\prime}}}_{M_2}\mathbf{y}$$. The estimated correlation value ($${\widehat{\rho}}_{M_{12}}$$) between $${\widehat{I}}_{M_1}={\widehat{{\mathbf{b}}^{\prime}}}_{M_1}{\mathbf{x}}_1$$ and $${\widehat{I}}_{M_2}={\widehat{{\mathbf{b}}^{\prime}}}_{M_2}\mathbf{y}$$ was $${\widehat{\rho}}_{M_{12}}=\frac{{\widehat{{\mathbf{b}}^{\prime}}}_{M_1}\left[{\widehat{\mathbf{P}}}_1\kern0.5em {\widehat{\mathbf{P}}}_{21}\right]{\widehat{\mathbf{b}}}_{M_2}}{\sqrt{{\widehat{{\mathbf{b}}^{\prime}}}_{M_1}{\widehat{\mathbf{P}}}_1{\widehat{\mathbf{b}}}_{M_1}}\sqrt{{\widehat{{\mathbf{b}}^{\prime}}}_{M_2}{\widehat{\mathbf{P}}\widehat{\mathbf{b}}}_{M_2}}}=0.870$$, where $$\sqrt{{\widehat{{\mathbf{b}}^{\prime}}}_{M_1}{\widehat{\mathbf{P}}}_1{\widehat{\mathbf{b}}}_{M_1}}$$ and $$\sqrt{{\widehat{{\mathbf{b}}^{\prime}}}_{M_2}{\widehat{\mathbf{P}}\widehat{\mathbf{b}}}_{M_2}}$$ were the estimated standard deviations of $${\widehat{I}}_{M_1}={\widehat{{\mathbf{b}}^{\prime}}}_{M_1}{\mathbf{x}}_1$$ and $${\widehat{I}}_{M_2}={\widehat{{\mathbf{b}}^{\prime}}}_{M_2}\mathbf{y}$$ respectively. According to Young (1964, Fig. 8), the selection intensities for stages 1 and 2 were k1 = 0.744 and k2 = 0.721 (Eqs. 9.10 and 9.11) respectively.

The estimated selection responses and expected genetic gains per trait for both stages were $${\widehat{R}}_{M_1}={k}_1\sqrt{{\widehat{{\mathbf{b}}^{\prime}}}_{M_1}{\widehat{\mathbf{P}}}_1{\widehat{\mathbf{b}}}_{M_1}}=1.553$$ and $${\widehat{R}}_{M_2}={k}_2\sqrt{{\widehat{{\mathbf{b}}^{\prime}}}_{M_2}{\widehat{\mathbf{P}}}^{\ast }{\widehat{\mathbf{b}}}_{M_2}}=1.401$$, $${\widehat{{\mathbf{E}}^{\prime}}}_{M_1}={k}_1\frac{{\widehat{\mathbf{G}}}_1^{\prime }{\widehat{\mathbf{b}}}_{M_1}}{\sqrt{{\widehat{{\mathbf{b}}^{\prime}}}_{M_1}{\widehat{\mathbf{P}}}_1{\widehat{\mathbf{b}}}_{M_1}}}=\left[0.877\kern0.5em 1.316\kern0.5em 2.193\kern0.5em 1.128\kern0.5em 1.655\kern0.5em 1.037\right]$$, and $${\widehat{{\mathbf{E}}^{\prime}}}_{M_2}={k}_2\frac{{\widehat{\mathbf{C}}}^{\ast^{\prime }}{\widehat{\mathbf{b}}}_{M_2}}{\sqrt{{\widehat{{\mathbf{b}}^{\prime}}}_{M_2}{\widehat{\mathbf{P}}}^{\ast }{\widehat{\mathbf{b}}}_{M_2}}}=\left[0.878\kern0.5em 1.346\kern0.5em 2.604\kern0.5em 1.433\kern0.5em 2.506\kern0.5em 1.602\right]$$, whereas $${\widehat{R}}_{M_1}+{\widehat{R}}_{M_2}=2.954$$ and $${\widehat{{\mathbf{E}}^{\prime}}}_{M_1}+{\widehat{{\mathbf{E}}^{\prime}}}_{M_2}=\left[1.755\kern0.5em 2.662\kern0.5em 4.797\kern0.5em 2.561\kern0.5em 4.161\kern0.5em 2.639\right]$$ were the total estimated MPPG-LPSI selection response and expected genetic gain per trait respectively. Recall that the vector of predetermined proportional gains was $${\mathbf{d}}^{\prime }=\left[2\kern0.5em 3\kern0.5em 5\right]$$. The MPPG-LPSI was therefore efficient at achieving the predetermined gains: the differences between the predetermined values (2, 3, and 5) and the total predicted values for the three restricted traits (1.755, 2.662, and 4.797) were only 0.245, 0.338, and 0.203 respectively.
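The stage 1 values can be checked approximately from the rounded quantities reported in this subsection. The following Python sketch (illustrative only; variable names are ours) recomputes the stage 1 selection response and expected genetic gains, verifies that the gains of the three restricted traits are close to proportional to d′ = [2 3 5], and recomputes the differences between the predetermined and total predicted gains. Because the inputs are rounded, agreement with the reported values is only to about two decimals.

```python
import math

# Rounded stage-1 estimates as printed in the text.
P1 = [[102, 32, 14, 4], [32, 80, 80, 16], [14, 80, 298, 78], [4, 16, 78, 66]]
G1 = [[44, 11, -11, -3, -8, -3],
      [11, 26, 24, 7, 7, 3],
      [-11, 24, 62, 23, 37, 20],
      [-3, 7, 23, 14, 22, 14]]
b_M1 = [0.068, 0.035, 0.039, 0.160]   # estimated MPPG-LPSI coefficients, stage 1
k1 = 0.744                            # selection intensity, stage 1
d = [2, 3, 5]                         # predetermined proportional gains

# Standard deviation of the index and selection response
sd_index = math.sqrt(sum(b_M1[i] * P1[i][j] * b_M1[j]
                         for i in range(4) for j in range(4)))
R_M1 = k1 * sd_index

# Expected genetic gain per trait: E = k1 * G1' b / sqrt(b' P1 b)
E_M1 = [k1 * sum(G1[i][j] * b_M1[i] for i in range(4)) / sd_index for j in range(6)]

# For the restricted traits, E_q / d_q should be (nearly) constant.
ratios = [E_M1[q] / d[q] for q in range(3)]

# Differences between predetermined and total predicted gains (both stages).
total_gain = [1.755, 2.662, 4.797]
gaps = [round(d[q] - total_gain[q], 3) for q in range(3)]

print(round(R_M1, 3), [round(e, 3) for e in E_M1])
print([round(r, 4) for r in ratios], gaps)
```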

Finally, the estimated MPPG-LPSI accuracy at stage 1 was $${\widehat{\rho}}_{M_1}=\sqrt{\frac{{\widehat{{\mathbf{b}}^{\prime}}}_{M_1}{\widehat{\mathbf{P}}}_1{\widehat{\mathbf{b}}}_{M_1}}{{\mathbf{w}}^{\prime}\widehat{\mathbf{C}}\mathbf{w}}}=0.435$$, and at stage 2 it was $${\widehat{\rho}}_{M_2}=\sqrt{\frac{{\widehat{{\mathbf{b}}^{\prime}}}_{M_2}{\widehat{\mathbf{P}}}^{\ast }{\widehat{\mathbf{b}}}_{M_2}}{{\mathbf{w}}^{\prime }{\widehat{\mathbf{C}}}^{\ast}\mathbf{w}}}=0.428$$; that is, both were very similar.

## 9.4 The Multistage Linear Genomic Selection Index

We describe the multistage linear genomic selection index (MLGSI) as an extension of the linear genomic selection index (LGSI, Chap. 5) theory to the multistage genomic selection context; thus, the theoretical results of the MLGSI are very similar to those of the LGSI. The MLGSI is a linear combination of genomic estimated breeding values (GEBVs) and is useful for predicting individual net genetic merit and for selecting individuals from a nonphenotyped testing population as parents of the next selection cycle.

### 9.4.1 The MLGSI Parameters

The objective of the MLGSI is to predict the net genetic merit H = w′g, where g is a vector of true breeding values and w is the vector of economic weights, using only GEBVs. In Chap. 5, we indicated that the covariance between γi and gi is equal to the variance of γi, i.e., $$Cov\left({\mathbf{g}}_i,{\boldsymbol{\upgamma}}_i\right)={s}_i^2$$, and that the GEBV associated with the ith trait is a predictor of the ith vector of genomic breeding values (γi). In the testing population, the only observable information is w and the GEBVs associated with the traits of interest. For this reason, in practice, we construct a linear combination of GEBVs, which should be a good predictor of H = w′g.

Suppose that the breeder is interested in four traits, and that $${\boldsymbol{\upgamma}}^{\prime }=\left[{\gamma}_1\kern0.5em {\gamma}_2\kern0.5em {\gamma}_3\kern0.5em {\gamma}_4\right]$$, $${\mathbf{g}}^{\prime }=\left[{g}_1\kern0.5em {g}_2\kern0.5em {g}_3\kern0.5em {g}_4\right]$$, and $${\mathbf{w}}^{\prime }=\left[{w}_1\kern0.5em {w}_2\kern0.5em {w}_3\kern0.5em {w}_4\right]$$ are the vectors of genomic breeding values (γ), true breeding values (g), and economic weights (w) respectively. Let $$\boldsymbol{\Gamma} = Var\left(\boldsymbol{\upgamma} \right)=\left[\begin{array}{cccc}{s}_1^2& {s}_{12}& {s}_{13}& {s}_{14}\\ {}{s}_{21}& {s}_2^2& {s}_{23}& {s}_{24}\\ {}{s}_{31}& {s}_{32}& {s}_3^2& {s}_{34}\\ {}{s}_{41}& {s}_{42}& {s}_{43}& {s}_4^2\end{array}\right]$$ and $$\mathbf{C}= Var\left(\mathbf{g}\right)=\left[\begin{array}{cccc}{\sigma}_1^2& {\sigma}_{12}& {\sigma}_{13}& {\sigma}_{14}\\ {}{\sigma}_{21}& {\sigma}_2^2& {\sigma}_{23}& {\sigma}_{24}\\ {}{\sigma}_{31}& {\sigma}_{32}& {\sigma}_3^2& {\sigma}_{34}\\ {}{\sigma}_{41}& {\sigma}_{42}& {\sigma}_{43}& {\sigma}_4^2\end{array}\right]$$ be the covariance matrices of γ and g respectively.
In a two-stage selection breeding scheme, $${\boldsymbol{\upgamma}}^{\prime }=\left[{\gamma}_1\kern0.5em {\gamma}_2\kern0.5em {\gamma}_3\kern0.5em {\gamma}_4\right]$$ can be partitioned into $${\boldsymbol{\upgamma}}_1^{\prime }=\left[{\gamma}_1\kern0.5em {\gamma}_2\right]$$ and $${\boldsymbol{\upgamma}}_2^{\prime }=\left[{\gamma}_3\kern0.5em {\gamma}_4\right]$$; therefore, at stage 1, $${\boldsymbol{\Gamma}}_1= Var\left({\boldsymbol{\upgamma}}_1\right)=\left[\begin{array}{cc}{s}_1^2& {s}_{12}\\ {}{s}_{21}& {s}_2^2\end{array}\right]$$ is the genomic covariance matrix of $${\boldsymbol{\upgamma}}_1^{\prime }=\left[{\gamma}_1\kern0.5em {\gamma}_2\right]$$ and $$Cov\left({\boldsymbol{\upgamma}}_1,\mathbf{g}\right)=\left[\begin{array}{cccc}{s}_1^2& {s}_{12}& {s}_{13}& {s}_{14}\\ {}{s}_{12}& {s}_2^2& {s}_{23}& {s}_{24}\end{array}\right]={\mathbf{A}}_1$$ is the covariance matrix of $${\boldsymbol{\upgamma}}_1^{\prime }=\left[{\gamma}_1\kern0.5em {\gamma}_2\right]$$ with $${\mathbf{g}}^{\prime }=\left[{g}_1\kern0.5em {g}_2\kern0.5em {g}_3\kern0.5em {g}_4\right]$$. Matrix A1 indicates that we are assuming that the covariance between γi and gj (i, j = 1, 2, ⋯, t; t = number of traits) is equal to the covariance between γi and γj. This is because, in practice, in the testing population, we can only estimate matrix Γ.

At stage 2, Γ = Var(γ) is the covariance matrix of γ and A = Γ is the covariance matrix of the vector of genomic breeding values γ with the vector of breeding values g. The MLGSI vectors of coefficients at stages 1 and 2 are $${\boldsymbol{\upbeta}}_1^{\prime }={\mathbf{w}}^{\prime }{\mathbf{A}}_1^{\prime }{\boldsymbol{\Gamma}}_1^{-1}=\left[{\beta}_{11}\kern0.5em {\beta}_{12}\right]$$ and $${\boldsymbol{\upbeta}}_2^{\prime }={\mathbf{w}}^{\prime }{\mathbf{A}\boldsymbol{\Gamma}}^{-1}={\mathbf{w}}^{\prime }=\left[{w}_1\kern0.5em {w}_2\kern0.5em {w}_3\kern0.5em {w}_4\right]$$ respectively, and the MLGSI for both stages can be written as $${I}_1={\beta}_{11}{\gamma}_1+{\beta}_{12}{\gamma}_2={\boldsymbol{\upbeta}}_1^{\prime }{\boldsymbol{\upgamma}}_1$$ and I2 = w1γ1 + w2γ2 + w3γ3 + w4γ4 = w′γ.

Let k1 and k2 be the MLGSI selection intensities for stages 1 and 2. For both stages, the MLGSI accuracies ($${\rho}_{HI_1}$$ and $${\rho}_{HI_2}$$), expected genetic gains per trait (E1 and E2) and selection responses (R1 and R2) can be written as

$${\rho}_{HI_1}=\sqrt{\frac{{\boldsymbol{\upbeta}}_1^{\prime }{\boldsymbol{\Gamma}}_1{\boldsymbol{\upbeta}}_1}{{\mathbf{w}}^{\prime}\mathbf{Cw}}}\kern1em \mathrm{and}\kern1em {\rho}_{HI_2}=\sqrt{\frac{{\mathbf{w}}^{\prime }{\boldsymbol{\Gamma}}^{\ast}\mathbf{w}}{{\mathbf{w}}^{\prime }{\mathbf{C}}^{\ast}\mathbf{w}}},$$
(9.22)
$${\mathbf{E}}_1={k}_1\frac{{\mathbf{A}}_1^{\prime }{\boldsymbol{\upbeta}}_1}{\sqrt{{\boldsymbol{\upbeta}}_1^{\prime }{\boldsymbol{\Gamma}}_1{\boldsymbol{\upbeta}}_1}}\kern1em \mathrm{and}\kern1em {\mathbf{E}}_2={k}_2\frac{{\boldsymbol{\Gamma}}^{\ast}\mathbf{w}}{\sqrt{{\mathbf{w}}^{\prime }{\boldsymbol{\Gamma}}^{\ast}\mathbf{w}}}$$
(9.23)

and

$${R}_1={k}_1\sqrt{{\boldsymbol{\upbeta}}_1^{\prime }{\boldsymbol{\Gamma}}_1{\boldsymbol{\upbeta}}_1}\kern1em \mathrm{and}\kern1em {R}_2={k}_2\sqrt{{\mathbf{w}}^{\prime }{\boldsymbol{\Gamma}}^{\ast}\mathbf{w}}.$$
(9.24)

The total MLGSI expected genetic gain per trait and selection response at both stages are equal to E1 + E2 and R1 + R2. To simplify notation, in Eqs. (9.23) and (9.24), we have omitted the intervals between stages or selection cycles (LG). Matrices $${\boldsymbol{\Gamma}}^{\ast }$$ and $${\mathbf{C}}^{\ast }$$ in Eqs. (9.22) to (9.24) are matrices Γ and C adjusted for previous selection on I1.

We adjust matrices Γ and C for previous selection on I1 as

$${\boldsymbol{\Gamma}}^{\ast }=\boldsymbol{\Gamma} -u\frac{{\mathbf{A}}_1^{\prime }{\boldsymbol{\upbeta}}_1{\boldsymbol{\upbeta}}_1^{\prime }{\mathbf{A}}_1}{{\boldsymbol{\upbeta}}_1^{\prime }{\boldsymbol{\Gamma}}_1{\boldsymbol{\upbeta}}_1}$$
(9.25)

and

$${\mathbf{C}}^{\ast }=\mathbf{C}-u\frac{{\mathbf{G}}_1^{\prime }{\mathbf{b}}_1{\mathbf{b}}_1^{\prime }{\mathbf{G}}_1}{{\mathbf{b}}_1^{\prime }{\mathbf{P}}_1{\mathbf{b}}_1},$$
(9.26)

respectively, where u = k1(k1 − τ), k1 is the standardized selection differential, and τ is the truncation point when $${I}_1={\boldsymbol{\upbeta}}_1^{\prime }{\boldsymbol{\upgamma}}_1$$ is applied. All the terms in Eq. (9.26) were defined in Eq. (9.6).

The correlation between $${I}_1={\boldsymbol{\upbeta}}_1^{\prime }{\boldsymbol{\upgamma}}_1$$ and I2 = w′γ can be written as

$$Corr\left({I}_1,{I}_2\right)=\frac{{\boldsymbol{\upbeta}}_1^{\prime }{\mathbf{A}}_1\mathbf{w}}{\sqrt{{\boldsymbol{\upbeta}}_1^{\prime }{\boldsymbol{\Gamma}}_1{\boldsymbol{\upbeta}}_1}\sqrt{{\mathbf{w}}^{\prime}\boldsymbol{\Gamma} \mathbf{w}}}={\rho}_{I_1{I}_2},$$
(9.27)

where $$\sqrt{{\boldsymbol{\upbeta}}_1^{\prime }{\boldsymbol{\Gamma}}_1{\boldsymbol{\upbeta}}_1}$$ and $$\sqrt{{\mathbf{w}}^{\prime}\boldsymbol{\Gamma} \mathbf{w}}$$ are the standard deviations of $${I}_1={\boldsymbol{\upbeta}}_1^{\prime }{\boldsymbol{\upgamma}}_1$$ and I2 = w′γ respectively. In Eq. (9.27), matrix Γ was not adjusted according to Eq. (9.25).

### 9.4.2 Estimating the Genomic Covariance Matrix

All the MLGSI parameters are associated with matrix Γ; thus, the estimation of this matrix in the testing population is very important. We estimate matrix Γ according to the estimation method described in Chap. 5 (Eq. 5.25), that is, as

$${\widehat{\boldsymbol{\Gamma}}}_l=\left\{{\widehat{\sigma}}_{\gamma_{q{q}^{\prime }}}\right\},$$
(9.28)

where $${\widehat{\sigma}}_{\gamma_{q{q}^{\prime }}}=\frac{1}{g}{\left({\widehat{\boldsymbol{\upgamma}}}_{ql}-\mathbf{1}{\widehat{\mu}}_{\gamma_{ql}}\right)}^{\prime }{\mathbf{G}}_l^{-1}\left({\widehat{\boldsymbol{\upgamma}}}_{q^{\prime }l}-\mathbf{1}{\widehat{\mu}}_{\gamma_{q^{\prime }l}}\right)$$ is the estimated covariance between $${\widehat{\boldsymbol{\upgamma}}}_{ql}={\mathbf{X}}_l{\widehat{\mathbf{u}}}_q$$ and $${\widehat{\boldsymbol{\upgamma}}}_{q^{\prime }l}={\mathbf{X}}_l{\widehat{\mathbf{u}}}_{q^{\prime }}$$ at stage (or selection cycle) l of the testing population; g is the number of genotypes; $${\widehat{\mu}}_{\gamma_{ql}}$$ and $${\widehat{\mu}}_{\gamma_{q^{\prime }l}}$$ are the estimated arithmetic means of the values of $${\widehat{\boldsymbol{\upgamma}}}_{ql}$$ and $${\widehat{\boldsymbol{\upgamma}}}_{q^{\prime }l}$$; 1 is a g × 1 vector of 1s; and $${\mathbf{G}}_l={c}^{-1}{\mathbf{X}}_l{\mathbf{X}}_l^{\prime }$$ is the additive genomic relationship matrix at stage (or selection cycle) l in the testing population (see Chap. 5 for details).
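A single element of the estimator in Eq. (9.28) can be sketched as follows. This is a toy illustration only: the function name `genomic_covariance`, the GEBV values, and the use of an identity matrix in place of the inverse genomic relationship matrix are all hypothetical choices made so that the result can be checked by hand. With an identity relationship matrix, the estimator reduces to the ordinary covariance with divisor g.

```python
def genomic_covariance(gebv_q, gebv_q2, G_inv):
    """(1/g) (gamma_q - 1*mu_q)' G^{-1} (gamma_q' - 1*mu_q'), one element of Eq. (9.28).

    gebv_q, gebv_q2 : lists of GEBVs for traits q and q' (length g, hypothetical data)
    G_inv           : inverse of the g x g additive genomic relationship matrix
    """
    g = len(gebv_q)
    mu_q = sum(gebv_q) / g
    mu_q2 = sum(gebv_q2) / g
    dev_q = [v - mu_q for v in gebv_q]      # centered GEBVs, trait q
    dev_q2 = [v - mu_q2 for v in gebv_q2]   # centered GEBVs, trait q'
    quad = sum(dev_q[i] * G_inv[i][j] * dev_q2[j]
               for i in range(g) for j in range(g))
    return quad / g

# Toy example with three genotypes and an identity relationship matrix (assumption).
identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
cov_12 = genomic_covariance([1.0, 2.0, 3.0], [2.0, 4.0, 6.0], identity)
var_1 = genomic_covariance([1.0, 2.0, 3.0], [1.0, 2.0, 3.0], identity)
print(cov_12, var_1)
```

In practice, the inverse of the genomic relationship matrix estimated from the marker data would replace the identity matrix used here.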

### 9.4.3 Numerical Examples

We illustrate the MLGSI theoretical results using the data described in Chap. 2, Sect. 2.8.1 simulated for eight phenotypic and seven genomic selection cycles, each with four traits (T1, T2, T3 and T4), 500 genotypes, four replicates for each genotype, 2500 molecular markers, and 315 quantitative trait loci in one environment. The economic weights of T1, T2, T3, and T4 were 1, −1, 1, and 1 respectively. In this subsection, and only for illustrative purposes, we use the data set from cycle 1.

The genotypic and genomic estimated covariance matrices in cycle 1 were $$\widehat{\mathbf{C}}=\left[\begin{array}{cccc}36.21& -12.93& 8.35& 2.74\\ {}-12.93& 13.04& -3.4& -2.24\\ {}8.35& -3.4& 9.96& 0.16\\ {}2.74& -2.24& 0.16& 6.64\end{array}\right]$$ and $$\widehat{\boldsymbol{\Gamma}}=\left[\begin{array}{cccc}16.26& -6.51& 5.60& 2.29\\ {}-6.51& 5.79& -2.33& -1.62\\ {}5.60& -2.33& 3.75& 0.94\\ {}2.29& -1.62& 0.94& 2.62\end{array}\right]$$ respectively, whereas $${\mathbf{w}}^{\prime }=\left[1\kern0.5em -1\kern0.5em 1\kern0.5em 1\right]$$ was the vector of economic weights. Matrices $$\widehat{\mathbf{P}}$$ and $$\widehat{\mathbf{C}}$$ were obtained according to Eqs. (2.22) to (2.24), whereas matrix $$\widehat{\boldsymbol{\Gamma}}$$ was obtained according to Eq. (9.28).

Suppose that we select two traits at each of stages 1 and 2. Then, at stage 1, $${\widehat{\boldsymbol{\Gamma}}}_1=\left[\begin{array}{cc}16.26& -6.51\\ {}-6.51& 5.79\end{array}\right]$$ and $${\widehat{\mathbf{A}}}_1=\left[\begin{array}{cccc}16.26& -6.51& 5.60& 2.29\\ {}-6.51& 5.79& -2.33& -1.62\end{array}\right]$$ are the estimates of matrices Γ1 and A1 respectively, and the estimated MLGSI vector of coefficients was $${\widehat{{\boldsymbol{\upbeta}}^{\prime}}}_1={\mathbf{w}}^{\prime }{\widehat{{\mathbf{A}}^{\prime}}}_1{\widehat{\boldsymbol{\Gamma}}}_1^{-1}=\left[1.39\kern0.5em -1.25\right]$$. Because at stage 2 $${\boldsymbol{\upbeta}}_2^{\prime }={\mathbf{w}}^{\prime }{\mathbf{A}\boldsymbol{\Gamma}}^{-1}={\mathbf{w}}^{\prime }=\left[{w}_1\kern0.5em {w}_2\kern0.5em {w}_3\kern0.5em {w}_4\right]$$, the estimated MLGSI vector of coefficients is the vector of economic weights. Thus, $${\widehat{\rho}}_{I_1{I}_2}=\frac{{\widehat{{\boldsymbol{\upbeta}}^{\prime}}}_1{\widehat{\mathbf{A}}}_1\mathbf{w}}{\sqrt{{\widehat{{\boldsymbol{\upbeta}}^{\prime}}}_1{\widehat{\boldsymbol{\Gamma}}}_1{\widehat{\boldsymbol{\upbeta}}}_1}\sqrt{{\mathbf{w}}^{\prime}\widehat{\boldsymbol{\Gamma}}\mathbf{w}}}=0.97$$ was the estimated correlation between $${\widehat{I}}_1={\widehat{{\boldsymbol{\upbeta}}^{\prime}}}_1{\widehat{\boldsymbol{\upgamma}}}_1$$ and $${\widehat{I}}_2={\mathbf{w}}^{\prime}\widehat{\boldsymbol{\upgamma}}$$, and assuming that the fixed proportion was 0.2 (20%), k1 = 0.744 and k2 = 0.721 were the approximated selection intensities for stages 1 and 2 respectively.
The adjusted matrices Γ and C for previous selection on $${\widehat{I}}_1={\widehat{{\boldsymbol{\upbeta}}^{\prime}}}_1{\widehat{\boldsymbol{\upgamma}}}_1$$ were $${\widehat{\boldsymbol{\Gamma}}}^{\ast }=\left[\begin{array}{cccc}7.96& -2.11& 2.71& 0.88\\ {}-2.11& 3.46& -0.80& -0.87\\ {}2.71& -0.80& 2.75& 0.45\\ {}0.88& -0.87& 0.45& 2.38\end{array}\right]$$ and $${\widehat{\mathbf{C}}}^{\ast }=\left[\begin{array}{cccc}24.40& -5.65& 5.47& 1.39\\ {}-5.65& 8.55& -1.63& -1.41\\ {}5.47& -1.63& 9.26& -0.17\\ {}1.39& -1.41& -0.17& 6.49\end{array}\right]$$.
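The stage 1 coefficients, the correlation between the two indices, and the adjusted genomic covariance matrix (Eq. 9.25) can be recomputed approximately with the Python sketch below. It is illustrative only and rests on two assumptions that are ours: the first two rows of the genomic covariance matrix are taken from the estimate of A1 exactly as printed above (entry (2,3) = −2.33), and the truncation point is set to τ = 0, so that u = k1². The text does not report τ; this value was chosen because it reproduces the reported adjusted matrix to within rounding.

```python
import math

w = [1, -1, 1, 1]                                  # economic weights
A1 = [[16.26, -6.51, 5.60, 2.29],                  # estimate of A1 as printed
      [-6.51, 5.79, -2.33, -1.62]]
Gamma1 = [[16.26, -6.51], [-6.51, 5.79]]           # estimate of Gamma_1
# Full Gamma: rows 1-2 are A1 (assumption stated above), rows 3-4 by symmetry.
Gamma = [A1[0][:], A1[1][:],
         [A1[0][2], A1[1][2], 3.75, 0.94],
         [A1[0][3], A1[1][3], 0.94, 2.62]]
k1 = 0.744
tau = 0.0                                          # ASSUMPTION: not given in the text
u = k1 * (k1 - tau)

# beta1 = Gamma1^{-1} A1 w, using the closed-form inverse of a 2 x 2 matrix
A1w = [sum(A1[i][j] * w[j] for j in range(4)) for i in range(2)]
det = Gamma1[0][0] * Gamma1[1][1] - Gamma1[0][1] * Gamma1[1][0]
beta1 = [(Gamma1[1][1] * A1w[0] - Gamma1[0][1] * A1w[1]) / det,
         (-Gamma1[1][0] * A1w[0] + Gamma1[0][0] * A1w[1]) / det]

# Variances of I1 and I2 and their covariance beta1' A1 w
var_I1 = sum(beta1[i] * Gamma1[i][j] * beta1[j] for i in range(2) for j in range(2))
var_I2 = sum(w[i] * Gamma[i][j] * w[j] for i in range(4) for j in range(4))
cov_I1_I2 = sum(beta1[i] * A1w[i] for i in range(2))
rho_12 = cov_I1_I2 / (math.sqrt(var_I1) * math.sqrt(var_I2))

# Eq. (9.25): Gamma* = Gamma - u * (A1' beta1)(A1' beta1)' / (beta1' Gamma1 beta1)
c = [sum(A1[i][j] * beta1[i] for i in range(2)) for j in range(4)]   # A1' beta1
Gamma_star = [[Gamma[i][j] - u * c[i] * c[j] / var_I1 for j in range(4)]
              for i in range(4)]

print([round(b, 2) for b in beta1], round(rho_12, 2))
for row in Gamma_star:
    print([round(v, 2) for v in row])
```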

The estimated MLGSI accuracy, selection response, and expected genetic gain for stage 1 in the testing population were $${\widehat{\rho}}_{HI_1}=\sqrt{\frac{{\widehat{{\boldsymbol{\upbeta}}^{\prime}}}_1{\widehat{\boldsymbol{\Gamma}}}_1{\widehat{\boldsymbol{\upbeta}}}_1}{{\mathbf{w}}^{\prime}\widehat{\mathbf{C}}\mathbf{w}}}=0.71$$, $${\widehat{R}}_1={k}_1\sqrt{{\widehat{{\boldsymbol{\upbeta}}^{\prime}}}_1{\widehat{\boldsymbol{\Gamma}}}_1{\widehat{\boldsymbol{\upbeta}}}_1}=5.90$$, and $${\widehat{\mathbf{E}}}_1^{\prime }={k}_1\frac{{\widehat{\mathbf{A}}}_1^{\prime }{\widehat{\boldsymbol{\upbeta}}}_1}{\sqrt{{\widehat{{\boldsymbol{\upbeta}}^{\prime}}}_1{\widehat{\boldsymbol{\Gamma}}}_1{\widehat{\boldsymbol{\upbeta}}}_1}}=\left[2.88\kern0.5em -1.53\kern0.5em 1.00\kern0.5em 0.49\right]$$ respectively, whereas at stage 2, the estimated MLGSI accuracy, selection response, and expected genetic gain were $${\widehat{\rho}}_{HI_2}=\sqrt{\frac{{\mathbf{w}}^{\prime }{\widehat{\boldsymbol{\Gamma}}}^{\ast}\mathbf{w}}{{\mathbf{w}}^{\prime }{\widehat{\mathbf{C}}}^{\ast}\mathbf{w}}}=0.64$$, $${\widehat{R}}_2={k}_2\sqrt{{\mathbf{w}}^{\prime }{\widehat{\boldsymbol{\Gamma}}}^{\ast}\mathbf{w}}=4.10$$, and $${\widehat{{\mathbf{E}}^{\prime}}}_2={k}_2\frac{{\widehat{\boldsymbol{\Gamma}}}^{\ast}\mathbf{w}}{\sqrt{{\mathbf{w}}^{\prime }{\widehat{\boldsymbol{\Gamma}}}^{\ast}\mathbf{w}}}=\left[1.74\kern0.5em -0.92\kern0.5em 0.85\kern0.5em 0.58\right]$$ respectively. The estimated MLGSI accuracy, selection response, and expected genetic gain at stage 2 were lower than at stage 1. This means that the adjusted matrices $${\widehat{\boldsymbol{\Gamma}}}^{\ast }$$ and $${\widehat{\mathbf{C}}}^{\ast }$$ negatively affected the estimated MLGSI parameters at stage 2.
The total estimated MLGSI selection response and expected genetic gain for stages 1 and 2 were $${\widehat{R}}_1+{\widehat{R}}_2=9.99$$ and $${\widehat{\mathbf{E}}}_1^{\prime }+{\widehat{\mathbf{E}}}_2^{\prime }=\left[4.62\kern0.5em -2.45\kern0.5em 1.85\kern0.5em 1.07\right]$$.
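The stage 2 values follow directly from the adjusted matrices reported above, because the stage 2 coefficient vector is simply w. The Python sketch below (illustrative only; variable names are ours) recomputes the stage 2 accuracy, selection response, and expected genetic gains from the rounded adjusted matrices, and adds the reported stage 1 response to obtain the total. Agreement with the reported values is to within rounding.

```python
import math

w = [1, -1, 1, 1]        # economic weights (stage-2 MLGSI coefficients)
k2 = 0.721               # selection intensity, stage 2
R1_reported = 5.90       # stage-1 selection response as reported in the text

# Adjusted matrices as printed in the text.
Gamma_star = [[7.96, -2.11, 2.71, 0.88],
              [-2.11, 3.46, -0.80, -0.87],
              [2.71, -0.80, 2.75, 0.45],
              [0.88, -0.87, 0.45, 2.38]]
C_star = [[24.40, -5.65, 5.47, 1.39],
          [-5.65, 8.55, -1.63, -1.41],
          [5.47, -1.63, 9.26, -0.17],
          [1.39, -1.41, -0.17, 6.49]]

def quad_form(v, m):
    """Quadratic form v' m v."""
    n = len(v)
    return sum(v[i] * m[i][j] * v[j] for i in range(n) for j in range(n))

var_I2 = quad_form(w, Gamma_star)     # w' Gamma* w
var_H = quad_form(w, C_star)          # w' C* w

rho_2 = math.sqrt(var_I2 / var_H)     # accuracy, Eq. (9.22)
R2 = k2 * math.sqrt(var_I2)           # selection response, Eq. (9.24)
E2 = [k2 * sum(Gamma_star[i][j] * w[j] for j in range(4)) / math.sqrt(var_I2)
      for i in range(4)]              # expected genetic gains, Eq. (9.23)
R_total = R1_reported + R2

print(round(rho_2, 2), round(R2, 2), [round(e, 2) for e in E2], round(R_total, 2))
```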

## 9.5 The Multistage Restricted Linear Genomic Selection Index (MRLGSI)

The restricted linear genomic selection index (RLGSI) described in Chap. 6 is extended to the multistage restricted linear genomic selection index (MRLGSI) context in a two-stage breeding selection scheme.

### 9.5.1 The MRLGSI Parameters

In Sect. 9.4.1, we indicated that the MLGSI vector of coefficients at stage 1 can be written as $${\boldsymbol{\upbeta}}_1^{\prime }={{\mathbf{w}}^{\prime }{\mathbf{A}}^{\prime}}_1{\boldsymbol{\Gamma}}_1^{-1}=\left[{\beta}_{11}\kern0.5em {\beta}_{12}\right]$$ and at stage 2 as $${\boldsymbol{\upbeta}}_2^{\prime }={\mathbf{w}}^{\prime }{\mathbf{A}\boldsymbol{\Gamma}}^{-1}={\mathbf{w}}^{\prime }=\left[{w}_1\kern0.5em {w}_2\kern0.5em {w}_3\kern0.5em {w}_4\right]$$. It can be shown that the MRLGSI vector of coefficients is a linear transformation of vectors β1 and β2 made by matrix KG, which is a projector (see Chaps. 3 and 6 for details) that projects β1 and β2 into a space smaller than the original space of β1 and β2. Thus, at stages 1 and 2, the MRLGSI vector of coefficients is

$${\boldsymbol{\upbeta}}_{R_1}={\mathbf{K}}_{G_1}{\boldsymbol{\upbeta}}_1$$
(9.29)

and

$${\boldsymbol{\upbeta}}_{R_2}={\mathbf{K}}_{G_2}{\boldsymbol{\upbeta}}_2={\mathbf{K}}_{G_2}\mathbf{w},$$
(9.30)

respectively, where $${\mathbf{K}}_{G_1}=\left[\mathbf{I}-{\mathbf{Q}}_{G_1}\right]$$, $${\mathbf{Q}}_{G_1}={\mathbf{U}}_1{\left({\mathbf{U}}_1^{\prime }{\boldsymbol{\Gamma}}_1{\mathbf{U}}_1\right)}^{-1}{\mathbf{U}}_1^{\prime }{\boldsymbol{\Gamma}}_1$$, $${\mathbf{K}}_{G_2}=\left[\mathbf{I}-{\mathbf{Q}}_{G_2}\right]$$, and $${\mathbf{Q}}_{G_2}={\mathbf{U}}_2{\left({\mathbf{U}}_2^{\prime }{\boldsymbol{\Gamma} \mathbf{U}}_2\right)}^{-1}{\mathbf{U}}_2^{\prime}\boldsymbol{\Gamma}$$ are matrix projectors. By Eqs. (9.29) and (9.30), the MRLGSI at stages 1 and 2 can be written as $${I}_{R_1}={\boldsymbol{\upbeta}}_{R_1}^{\prime }{\boldsymbol{\upgamma}}_1$$ and $${I}_{R_2}={\boldsymbol{\upbeta}}_{R_2}^{\prime}\boldsymbol{\upgamma}$$ respectively, where $${\boldsymbol{\upgamma}}_1^{\prime }=\left[{\gamma}_1\kern0.5em {\gamma}_2\right]$$ and $${\boldsymbol{\upgamma}}^{\prime }=\left[{\gamma}_1\kern0.5em {\gamma}_2\kern0.5em {\gamma}_3\kern0.5em {\gamma}_4\right]$$ are vectors of genomic breeding values, which can be estimated using GEBVs, as described in Chap. 5. In Chap. 6 we described methods for constructing matrix U′ and estimating matrix KG; those methods are also valid in the MRLGSI context.
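To make the action of the projector concrete, the sketch below builds the stage 1 projector for the simplest case, a single restricted trait, and applies it to the stage 1 coefficients. The example is hypothetical: restricting trait 1 is our choice for illustration (the text does not report an MRLGSI numerical example here), the function name `single_trait_projector` is ours, and the inputs are the rounded cycle 1 estimates from Sect. 9.4.3. With one restricted trait, U1 is a single column with a 1 in the restricted position, so the matrix inverse in the projector reduces to a scalar division.

```python
import math

def single_trait_projector(Gamma, r):
    """K_G = I - U (U' Gamma U)^{-1} U' Gamma when U selects only trait r (0-based)."""
    n = len(Gamma)
    K = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for j in range(n):
        K[r][j] -= Gamma[r][j] / Gamma[r][r]   # only row r of Q_G is nonzero
    return K

Gamma1 = [[16.26, -6.51], [-6.51, 5.79]]       # stage-1 genomic covariance estimate
A1 = [[16.26, -6.51, 5.60, 2.29],
      [-6.51, 5.79, -2.33, -1.62]]             # estimate of A1 as printed in Sect. 9.4.3
beta1 = [1.39, -1.25]                          # reported MLGSI coefficients, stage 1
k1 = 0.744

K_G1 = single_trait_projector(Gamma1, 0)       # HYPOTHETICAL restriction on trait 1
beta_R1 = [sum(K_G1[i][j] * beta1[j] for j in range(2)) for i in range(2)]

# Expected genetic gains, Eq. (9.32): E = k1 * A1' beta_R1 / sqrt(beta_R1' Gamma1 beta_R1)
sd = math.sqrt(sum(beta_R1[i] * Gamma1[i][j] * beta_R1[j]
                   for i in range(2) for j in range(2)))
E_R1 = [k1 * sum(A1[i][j] * beta_R1[i] for i in range(2)) / sd for j in range(4)]

# The projector is idempotent: K_G1 K_G1 = K_G1
KK = [[sum(K_G1[i][k] * K_G1[k][j] for k in range(2)) for j in range(2)]
      for i in range(2)]

print([round(b, 3) for b in beta_R1], [round(e, 3) for e in E_R1])
```

The expected genetic gain of the restricted trait is zero up to floating-point error, whereas the gains of the remaining traits are free to change.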

In a similar manner to the MLGSI context, MRLGSI accuracies, expected genetic gains per trait, and selection responses for stages 1 and 2 in the testing population can be written as

$${\rho}_{HI_1}=\sqrt{\frac{{\boldsymbol{\upbeta}}_{R_1}^{\prime }{\boldsymbol{\Gamma}}_1{\boldsymbol{\upbeta}}_{R_1}}{{\mathbf{w}}^{\prime}\mathbf{Cw}}}\kern1em \mathrm{and}\kern1em {\rho}_{HI_2}=\sqrt{\frac{{\boldsymbol{\upbeta}}_{R_2}^{\prime }{\boldsymbol{\Gamma}}^{\ast }{\boldsymbol{\upbeta}}_{R_2}}{{\mathbf{w}}^{\prime }{\mathbf{C}}^{\ast}\mathbf{w}}},$$
(9.31)
$${\mathbf{E}}_{R_1}={k}_1\frac{{\mathbf{A}}_1^{\prime }{\boldsymbol{\upbeta}}_{R_1}}{\sqrt{{\boldsymbol{\upbeta}}_{R_1}^{\prime }{\boldsymbol{\Gamma}}_1{\boldsymbol{\upbeta}}_{R_1}}}\kern1em \mathrm{and}\kern1em {\mathbf{E}}_{R_2}={k}_2\frac{{\boldsymbol{\Gamma}}^{\ast }{\boldsymbol{\upbeta}}_{R_2}}{\sqrt{{\boldsymbol{\upbeta}}_{R_2}^{\prime }{\boldsymbol{\Gamma}}^{\ast }{\boldsymbol{\upbeta}}_{R_2}}}$$
(9.32)

and

$${R}_{R_1}={k}_1\sqrt{{\boldsymbol{\upbeta}}_{R_1}^{\prime }{\boldsymbol{\Gamma}}_1{\boldsymbol{\upbeta}}_{R_1}}\kern1em \mathrm{and}\kern1em {R}_{R_2}={k}_2\sqrt{{\boldsymbol{\upbeta}}_{R_2}^{\prime }{\boldsymbol{\Gamma}}^{\ast }{\boldsymbol{\upbeta}}_{R_2}},$$
(9.33)

respectively. The total MRLGSI expected genetic gain per trait and selection response for both stages are equal to $${\mathbf{E}}_{R_1}+{\mathbf{E}}_{R_2}$$ and $${R}_{R_1}+{R}_{R_2}$$. To simplify the notation, in Eqs. (9.32) and (9.33), we have omitted the intervals between stages or selection cycles (LG). Matrices $${\boldsymbol{\Gamma}}^{\ast }$$ and $${\mathbf{C}}^{\ast }$$ in Eqs. (9.31) to (9.33) are matrices Γ and C adjusted for previous selection.

In the MRLGSI context, matrices $${\boldsymbol{\Gamma}}^{\ast }$$ and $${\mathbf{C}}^{\ast }$$ can be obtained as

$${\boldsymbol{\Gamma}}^{\ast }=\boldsymbol{\Gamma} -u\frac{{\mathbf{A}}_1^{\prime }{\boldsymbol{\upbeta}}_{R_1}{\boldsymbol{\upbeta}}_{R_1}^{\prime }{\mathbf{A}}_1}{{\boldsymbol{\upbeta}}_{R_1}^{\prime }{\boldsymbol{\Gamma}}_1{\boldsymbol{\upbeta}}_{R_1}}$$
(9.34)

and

$${\mathbf{C}}^{\ast }=\mathbf{C}-u\frac{{\mathbf{G}}_1^{\prime }{\mathbf{b}}_{R_1}{\mathbf{b}}_{R_1}^{\prime }{\mathbf{G}}_1}{{\mathbf{b}}_{R_1}^{\prime }{\mathbf{P}}_1{\mathbf{b}}_{R_1}},$$
(9.35)

where $${\boldsymbol{\upbeta}}_{R_1}$$ was defined in Eq. (9.29) and vector $${\mathbf{b}}_{R_1}$$ can be obtained according to the RLPSI as described in Chap. 3. The term $$u={k}_1\left({k}_1-\tau \right)$$, where τ is the truncation point, was defined earlier.
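As a minimal sketch of Eq. (9.34), the following Python code builds the matrix that is subtracted from Γ. It uses the estimates of Γ1 and A1 quoted in the numerical examples of this chapter and the rounded stage 1 MRLGSI coefficients reported there. The function name `selection_adjustment` is hypothetical, and the value passed for u is only a placeholder, because the truncation point is not quoted in this section.

```python
# Estimates quoted in the numerical examples (stage 1, two traits measured).
GAMMA1 = [[16.26, -6.51],
          [-6.51, 5.79]]
A1 = [[16.26, -6.51, 5.60, 2.29],
      [-6.51, 5.79, -2.33, -1.62]]
BETA_R1 = [1.386, 1.558]   # rounded MRLGSI coefficients at stage 1
U_PLACEHOLDER = 1.0        # placeholder for u, NOT a value from the text


def selection_adjustment(a1, gamma1, beta, u):
    """Return u * (A1' b)(A1' b)' / (b' Gamma1 b), i.e. the matrix
    subtracted from Gamma in Eq. (9.34)."""
    n_stage1 = len(beta)
    n_traits = len(a1[0])
    # Covariance of the stage 1 index with each genomic breeding value.
    cov = [sum(a1[i][j] * beta[i] for i in range(n_stage1))
           for j in range(n_traits)]
    # Variance of the stage 1 index.
    var = sum(beta[i] * gamma1[i][j] * beta[j]
              for i in range(n_stage1) for j in range(n_stage1))
    return [[u * ci * cj / var for cj in cov] for ci in cov]


term = selection_adjustment(A1, GAMMA1, BETA_R1, U_PLACEHOLDER)
for row in term:
    print([round(x, 3) for x in row])
```

Under these assumptions, the row and column of the adjustment that correspond to trait 2 are zero up to rounding, because the restricted index has null covariance with the restricted trait. Previous selection on the stage 1 index therefore leaves the variance and covariances involving trait 2 unadjusted.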

The correlation between $${I}_{R_1}={\boldsymbol{\upbeta}}_{R_1}^{\prime }{\boldsymbol{\upgamma}}_1$$ and $${I}_{R_2}={\boldsymbol{\upbeta}}_{R_2}^{\prime}\boldsymbol{\upgamma}$$ can be written as

$${\rho}_{I_{R_1}{I}_{R_2}}=\frac{{\boldsymbol{\upbeta}}_{R_1}^{\prime }{\mathbf{A}}_1{\boldsymbol{\upbeta}}_{R_2}}{\sqrt{{\boldsymbol{\upbeta}}_{R_1}^{\prime }{\boldsymbol{\Gamma}}_1{\boldsymbol{\upbeta}}_{R_1}}\sqrt{{\boldsymbol{\upbeta}}_{R_2}^{\prime }{\boldsymbol{\Gamma} \boldsymbol{\upbeta}}_{R_2}}},$$
(9.36)

where $$\sqrt{{\boldsymbol{\upbeta}}_{R_1}^{\prime }{\boldsymbol{\Gamma}}_1{\boldsymbol{\upbeta}}_{R_1}}$$ and $$\sqrt{{\boldsymbol{\upbeta}}_{R_2}^{\prime }{\boldsymbol{\Gamma} \boldsymbol{\upbeta}}_{R_2}}$$ are the standard deviations of $${I}_{R_1}={\boldsymbol{\upbeta}}_{R_1}^{\prime }{\boldsymbol{\upgamma}}_1$$ and $${I}_{R_2}={\boldsymbol{\upbeta}}_{R_2}^{\prime}\boldsymbol{\upgamma}$$ respectively. In Eq. (9.36), matrix Γ was not adjusted for previous selection on $${I}_{R_1}={\boldsymbol{\upbeta}}_{R_1}^{\prime }{\boldsymbol{\upgamma}}_1$$.

### 9.5.2 Numerical Examples

To illustrate the MRLGSI theory in a two-stage breeding selection scheme, we use the simulated data described in Sect. 9.4.3. In that subsection we indicated that the estimated covariance matrices of Γ1 and A1 were $${\widehat{\boldsymbol{\Gamma}}}_1=\left[\begin{array}{cc}16.26& -6.51\\ {}-6.51& 5.79\end{array}\right]$$ and $${\widehat{\mathbf{A}}}_1=\left[\begin{array}{cccc}16.26& -6.51& 5.60& 2.29\\ {}-6.51& 5.79& -2.33& -1.62\end{array}\right]$$, and that $${\widehat{{\boldsymbol{\upbeta}}^{\prime}}}_1={\mathbf{w}}^{\prime }{\widehat{\mathbf{A}}}_1^{\prime }{\widehat{\boldsymbol{\Gamma}}}_1^{-1}=\left[1.39\kern0.5em -1.25\right]$$ was the estimated MLGSI vector of coefficients at stage 1. At stage 2, the estimated MLGSI vector of coefficients was $${\mathbf{w}}^{\prime }=\left[1\kern0.5em -1\kern0.5em 1\kern0.5em 1\right]$$, the vector of economic weights.

Suppose that we restrict only trait 2; then at stages 1 and 2, matrix $${\mathbf{U}}_1^{\prime }=\left[0\kern0.5em 1\right]$$ and matrix $${\mathbf{U}}_2^{\prime }=\left[0\kern0.5em 1\kern0.5em 0\kern0.5em 0\right]$$ respectively. In addition, $${\widehat{\mathbf{Q}}}_{G_1}={\mathbf{U}}_1{\left({\mathbf{U}}_1^{\prime }{\widehat{\boldsymbol{\Gamma}}}_1{\mathbf{U}}_1\right)}^{-1}{\mathbf{U}}_1^{\prime }{\widehat{\boldsymbol{\Gamma}}}_1$$, $${\widehat{\mathbf{Q}}}_{G_2}={\mathbf{U}}_2{\left({\mathbf{U}}_2^{\prime}\widehat{\boldsymbol{\Gamma}}{\mathbf{U}}_2\right)}^{-1}{\mathbf{U}}_2^{\prime}\widehat{\boldsymbol{\Gamma}}$$, $${\widehat{\mathbf{K}}}_{G_1}=\left[\mathbf{I}-{\widehat{\mathbf{Q}}}_{G_1}\right]$$, and $${\widehat{\mathbf{K}}}_{G_2}=\left[\mathbf{I}-{\widehat{\mathbf{Q}}}_{G_2}\right]$$ are the estimated matrices described in Eqs. (9.29) and (9.30) for stages 1 and 2. It can be shown that, at stages 1 and 2, $${\widehat{\boldsymbol{\upbeta}}}_{R_1}^{\prime }={\widehat{\boldsymbol{\upbeta}}}_1^{\prime }{\widehat{\mathbf{K}}}_{G_1}^{\prime }=\left[1.39\kern0.5em 1.558\right]$$ and $${\widehat{\boldsymbol{\upbeta}}}_{R_2}^{\prime }={{\mathbf{w}}^{\prime }{\widehat{\mathbf{K}}}^{\prime}}_{G_2}=\left[1.0\kern0.5em 1.81\kern0.5em 1.0\kern0.5em 1.0\right]$$ are the MRLGSI vectors of coefficients respectively.

Suppose that the total proportion retained for the two stages was 20%; then, at stage 1, k1 = 0.744 is an approximate selection intensity, and the estimated MRLGSI selection response, expected genetic gain per trait, and accuracy were $${\widehat{R}}_{R_1}={k}_1\sqrt{{\widehat{{\boldsymbol{\upbeta}}^{\prime}}}_{R_1}{\widehat{\boldsymbol{\Gamma}}}_1{\widehat{\boldsymbol{\upbeta}}}_{R_1}}=3.083$$, $${\widehat{\mathbf{E}}}_{R_1}=\left[2.225\kern0.5em 0\kern0.5em 0.742\kern0.5em 0.117\right]$$, and $${\widehat{\rho}}_{HI_1}=\sqrt{\frac{{\widehat{{\boldsymbol{\upbeta}}^{\prime}}}_{R_1}{\widehat{\boldsymbol{\Gamma}}}_1{\widehat{\boldsymbol{\upbeta}}}_{R_1}}{{\mathbf{w}}^{\prime}\widehat{\mathbf{C}}\mathbf{w}}}=0.370$$ respectively. The estimated MRLGSI expected genetic gain, accuracy, and selection response at stage 2 were $${\widehat{\mathbf{E}}}_{R_2}={k}_2\frac{{\widehat{\boldsymbol{\upbeta}}}_{R_2}^{\prime }{\widehat{\boldsymbol{\Gamma}}}^{\ast }}{\sqrt{{\widehat{{\boldsymbol{\upbeta}}^{\prime}}}_{R_2}{\widehat{\boldsymbol{\Gamma}}}^{\ast }{\widehat{\boldsymbol{\upbeta}}}_{R_2}}}=\left[1.156\kern0.5em 0\kern0.5em 0.793\kern0.5em 0.536\right]$$, $${\widehat{\rho}}_{HI_2}=\sqrt{\frac{{\widehat{{\boldsymbol{\upbeta}}^{\prime}}}_{R_2}{\widehat{\boldsymbol{\Gamma}}}^{\ast }{\widehat{\boldsymbol{\upbeta}}}_{R_2}}{{\mathbf{w}}^{\prime }{\widehat{\mathbf{C}}}^{\ast}\mathbf{w}}}=0.32$$, and $${\widehat{R}}_{R_2}={k}_2\sqrt{{\widehat{{\boldsymbol{\upbeta}}^{\prime}}}_{R_2}{\widehat{\boldsymbol{\Gamma}}}^{\ast }{\widehat{\boldsymbol{\upbeta}}}_{R_2}}=2.485$$ respectively, where k2 = 0.721 was the approximate selection intensity for stage 2.

The estimated total MRLGSI selection response and expected genetic gain over stages 1 and 2 were $${\widehat{R}}_{R_1}+{\widehat{R}}_{R_2}=5.568$$ and $${\widehat{\mathbf{E}}}_{R_1}^{\prime }+{\widehat{\mathbf{E}}}_{R_2}^{\prime }=\left[3.380\kern0.5em 0\kern0.5em 1.535\kern0.5em 0.653\right]$$ respectively. Note that the expected genetic gain for trait 2 was 0, as required by the restriction.
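The stage 1 MRLGSI results of this subsection can be checked with a short, dependency-free Python sketch. It takes the estimates of Γ1 and A1, the economic weights, and k1 = 0.744 from the text, and it assumes a single restricted trait, so that the inverse in the projector reduces to a scalar division. The helper names (`matvec`, `solve_2x2`, `restriction_projector`) are hypothetical. The accuracy is not computed because the estimate of C is not quoted in this section.

```python
from math import sqrt

# Estimates quoted in the text (stage 1, two traits measured).
GAMMA1 = [[16.26, -6.51],
          [-6.51, 5.79]]
A1 = [[16.26, -6.51, 5.60, 2.29],
      [-6.51, 5.79, -2.33, -1.62]]
W = [1.0, -1.0, 1.0, 1.0]   # economic weights
K1 = 0.744                  # selection intensity at stage 1
RESTRICTED = 1              # zero-based position of trait 2


def matvec(m, v):
    """Matrix-vector product for lists of rows."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def transpose(m):
    return [list(col) for col in zip(*m)]


def solve_2x2(m, rhs):
    """Solve m x = rhs for a 2 x 2 matrix m by Cramer's rule."""
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return [(m[1][1] * rhs[0] - m[0][1] * rhs[1]) / det,
            (m[0][0] * rhs[1] - m[1][0] * rhs[0]) / det]


def restriction_projector(gamma, j):
    """K = I - Q with Q = U (U' Gamma U)^{-1} U' Gamma, assuming that
    U selects the single restricted trait j."""
    n = len(gamma)
    k = [[1.0 if r == c else 0.0 for c in range(n)] for r in range(n)]
    for c in range(n):
        k[j][c] -= gamma[j][c] / gamma[j][j]
    return k


# Unrestricted MLGSI coefficients: beta_1 = Gamma1^{-1} A1 w.
beta_1 = solve_2x2(GAMMA1, matvec(A1, W))
# Restricted coefficients: beta_R1 = K_G1 beta_1.
beta_r1 = matvec(restriction_projector(GAMMA1, RESTRICTED), beta_1)
# Standard deviation of the index, selection response and gain per trait.
sd_index = sqrt(sum(b * g for b, g in zip(beta_r1, matvec(GAMMA1, beta_r1))))
response_r1 = K1 * sd_index
gain_r1 = [K1 * c / sd_index for c in matvec(transpose(A1), beta_r1)]

print([round(b, 3) for b in beta_r1])
print(round(response_r1, 3))
print([round(e, 3) for e in gain_r1])
```

With these inputs, the sketch agrees with the stage 1 selection response and expected genetic gains reported in this subsection to about three decimal places, and the gain for trait 2 is zero up to floating-point error. The stage 2 quantities are not reproduced because they need the full 4 × 4 estimate of Γ and its adjusted version.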

## 9.6 The Multistage Predetermined Proportional Gain Linear Genomic Selection Index

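The MPPG-LGSI is an adaptation of the predetermined proportional gain linear genomic selection index (PPG-LGSI) described in Chap. 6; thus, the theoretical results, properties, and objectives of both indices are similar. The MPPG-LGSI objective is to change $${\mu}_q$$ to $${\mu}_q+{d}_q$$, where $${d}_q$$ is a predetermined change in $${\mu}_q$$. We solve this problem by minimizing the mean squared difference between $$I={\boldsymbol{\upbeta}}^{\prime}\boldsymbol{\upgamma}$$ and $$H={\mathbf{w}}^{\prime}\mathbf{g}$$, that is, $$E\left[{\left(H-I\right)}^2\right]$$, under the restriction $${\mathbf{U}}^{\prime}\boldsymbol{\Gamma} \boldsymbol{\upbeta} ={\uptheta}_G\mathbf{d}$$, where $${\uptheta}_G$$ is a proportionality constant, $${\mathbf{d}}^{\prime }=\left[{d}_1\kern0.5em {d}_2\kern0.5em \cdots \kern0.5em {d}_r\right]$$ is the vector of predetermined restrictions, U′ is a (t − 1) × t matrix of 1s and 0s, Γ is the covariance matrix of the additive genomic breeding values $${\boldsymbol{\upgamma}}^{\prime }=\left[{\gamma}_1\kern0.5em {\gamma}_2\kern0.5em \cdots \kern0.5em {\gamma}_t\right]$$, r is the number of predetermined restrictions, and t is the number of traits.

### 9.6.1 The MPPG-LGSI Parameters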

According to the results in Chap. 6, at stages 1 and 2, the MPPG-LGSI vector of coefficients can be written as

$${\boldsymbol{\upbeta}}_{P_1}={\boldsymbol{\upbeta}}_{R_1}+{\uptheta}_1{\mathbf{U}}_1{\left({\mathbf{U}}_1^{\prime }{\boldsymbol{\Gamma}}_1{\mathbf{U}}_1\right)}^{-1}\mathbf{d}$$
(9.37)

and

$${\boldsymbol{\upbeta}}_{P_2}={\boldsymbol{\upbeta}}_{R_2}+{\uptheta}_2{\mathbf{U}}_2{\left({\mathbf{U}}_2^{\prime }{\boldsymbol{\Gamma} \mathbf{U}}_2\right)}^{-1}\mathbf{d},$$
(9.38)

respectively, where $${\boldsymbol{\upbeta}}_{R_1}={\mathbf{K}}_{G_1}{\boldsymbol{\upbeta}}_1$$, $${\boldsymbol{\upbeta}}_{R_2}={\mathbf{K}}_{G_2}{\boldsymbol{\upbeta}}_2={\mathbf{K}}_{G_2}\mathbf{w}$$, $${\mathbf{K}}_{G_1}=\left[\mathbf{I}-{\mathbf{Q}}_{G_1}\right]$$, $${\mathbf{Q}}_{G_1}={\mathbf{U}}_1{\left({\mathbf{U}}_1^{\prime }{\boldsymbol{\Gamma}}_1{\mathbf{U}}_1\right)}^{-1}{\mathbf{U}}_1^{\prime }{\boldsymbol{\Gamma}}_1$$, $${\mathbf{K}}_{G_2}=\left[\mathbf{I}-{\mathbf{Q}}_{G_2}\right]$$, and $${\mathbf{Q}}_{G_2}={\mathbf{U}}_2{\left({\mathbf{U}}_2^{\prime }{\boldsymbol{\Gamma} \mathbf{U}}_2\right)}^{-1}{\mathbf{U}}_2^{\prime}\boldsymbol{\Gamma}$$ were described in Eqs. (9.29) and (9.30). Also, it can be shown that the proportionality constants for stages 1 (θ1) and 2 (θ2) are

$${\uptheta}_1=\frac{{\mathbf{d}}^{\prime }{\left({\mathbf{U}}_1^{\prime }{\boldsymbol{\Gamma}}_1{\mathbf{U}}_1\right)}^{-1}{\mathbf{U}}_1^{\prime }{\mathbf{A}}_1\mathbf{w}}{{\mathbf{d}}^{\prime }{\left({\mathbf{U}}_1^{\prime }{\boldsymbol{\Gamma}}_1{\mathbf{U}}_1\right)}^{-1}\mathbf{d}}\kern1em \mathrm{and}\kern1em {\uptheta}_2=\frac{{\mathbf{d}}^{\prime }{\left({\mathbf{U}}_2^{\prime }{\boldsymbol{\Gamma} \mathbf{U}}_2\right)}^{-1}{\mathbf{U}}_2^{\prime}\boldsymbol{\Gamma} \mathbf{w}}{{\mathbf{d}}^{\prime }{\left({\mathbf{U}}_2^{\prime }{\boldsymbol{\Gamma} \mathbf{U}}_2\right)}^{-1}\mathbf{d}},$$
(9.39)

respectively. By Eqs. (9.37) to (9.39), the MPPG-LGSI for stages 1 and 2 can be written as $${I}_{P_1}={\boldsymbol{\upbeta}}_{P_1}^{\prime }{\boldsymbol{\upgamma}}_1$$ and $${I}_{P_2}={\boldsymbol{\upbeta}}_{P_2}^{\prime}\boldsymbol{\upgamma}$$ respectively, where γ1 and γ are vectors of genomic breeding values, which can be estimated using GEBVs (see Chap. 5 for details).

For stages 1 and 2, the MPPG-LGSI accuracies ($${\rho}_{HI_1}$$ and $${\rho}_{HI_2}$$), expected genetic gains per trait ($${\mathbf{E}}_{P_1}$$ and $${\mathbf{E}}_{P_2}$$), and selection responses ($${R}_{P_1}$$ and $${R}_{P_2}$$) can be written as

$${\rho}_{HI_1}=\sqrt{\frac{{\boldsymbol{\upbeta}}_{P_1}^{\prime }{\boldsymbol{\Gamma}}_1{\boldsymbol{\upbeta}}_{P_1}}{{\mathbf{w}}^{\prime}\mathbf{Cw}}}\kern1em \mathrm{and}\kern1em {\rho}_{HI_2}=\sqrt{\frac{{\boldsymbol{\upbeta}}_{P_2}^{\prime }{\boldsymbol{\Gamma}}^{\ast }{\boldsymbol{\upbeta}}_{P_2}}{{\mathbf{w}}^{\prime }{\mathbf{C}}^{\ast}\mathbf{w}}},$$
(9.40)
$${\mathbf{E}}_{P_1}={k}_1\frac{{\mathbf{A}}_1^{\prime }{\boldsymbol{\upbeta}}_{P_1}}{\sqrt{{\boldsymbol{\upbeta}}_{P_1}^{\prime }{\boldsymbol{\Gamma}}_1{\boldsymbol{\upbeta}}_{P_1}}}\kern1em \mathrm{and}\kern1em {\mathbf{E}}_{P_2}={k}_2\frac{{\boldsymbol{\Gamma}}^{\ast }{\boldsymbol{\upbeta}}_{P_2}}{\sqrt{{\boldsymbol{\upbeta}}_{P_2}^{\prime }{\boldsymbol{\Gamma}}^{\ast }{\boldsymbol{\upbeta}}_{P_2}}}$$
(9.41)

and

$${R}_{P_1}={k}_1\sqrt{{\boldsymbol{\upbeta}}_{P_1}^{\prime }{\boldsymbol{\Gamma}}_1{\boldsymbol{\upbeta}}_{P_1}}\kern1em \mathrm{and}\kern1em {R}_{P_2}={k}_2\sqrt{{\boldsymbol{\upbeta}}_{P_2}^{\prime }{\boldsymbol{\Gamma}}^{\ast }{\boldsymbol{\upbeta}}_{P_2}},$$
(9.42)

respectively. The total MPPG-LGSI expected genetic gain per trait and selection response at both stages are equal to $${\mathbf{E}}_{P_1}+{\mathbf{E}}_{P_2}$$ and $${R}_{P_1}+{R}_{P_2}$$. To simplify the notation, in Eqs. (9.41) and (9.42), we omitted the intervals between stages or selection cycles (LG). Matrices $${\boldsymbol{\Gamma}}^{\ast }$$ and $${\mathbf{C}}^{\ast }$$ are matrices Γ and C adjusted for previous selection on $${I}_{P_1}$$ according to Eqs. (9.34) and (9.35) respectively in the MPPG-LGSI context.

The correlation between $${I}_{P_1}={\boldsymbol{\upbeta}}_{P_1}^{\prime }{\boldsymbol{\upgamma}}_1$$ and $${I}_{P_2}={\boldsymbol{\upbeta}}_{P_2}^{\prime}\boldsymbol{\upgamma}$$ can be written as

$${\rho}_{12}=\frac{{\boldsymbol{\upbeta}}_{P_1}^{\prime }{\mathbf{A}}_1{\boldsymbol{\upbeta}}_{P_2}}{\sqrt{{\boldsymbol{\upbeta}}_{P_1}^{\prime }{\boldsymbol{\Gamma}}_1{\boldsymbol{\upbeta}}_{P_1}}\sqrt{{\boldsymbol{\upbeta}}_{P_2}^{\prime }{\boldsymbol{\Gamma} \boldsymbol{\upbeta}}_{P_2}}}.$$
(9.43)

In Eq. (9.43), matrix Γ was not adjusted for previous selection on $${I}_{P_1}={\boldsymbol{\upbeta}}_{P_1}^{\prime }{\boldsymbol{\upgamma}}_1$$.

### 9.6.2 Numerical Examples

To illustrate the MPPG-LGSI theory, we use the simulated data described in Sect. 9.4.3. Suppose that we select two traits at stages 1 and 2; then, at stage 1, $${\widehat{\boldsymbol{\Gamma}}}_1=\left[\begin{array}{cc}16.26& -6.51\\ {}-6.51& 5.79\end{array}\right]$$ and $${\widehat{\mathbf{A}}}_1=\left[\begin{array}{cccc}16.26& -6.51& 5.60& 2.29\\ {}-6.51& 5.79& -2.33& -1.62\end{array}\right]$$ are the estimates of covariance matrices Γ1 and A1 respectively. We restricted trait 2 with d =  − 2; then, at stage 1, matrix $${\mathbf{U}}_1^{\prime }=\left[0\kern0.5em 1\right]$$, and at stage 2, matrix $${\mathbf{U}}_2^{\prime }=\left[0\kern0.5em 1\kern0.5em 0\kern0.5em 0\right]$$. In addition, $${\widehat{\mathbf{Q}}}_{G_1}={\mathbf{U}}_1{\left({\mathbf{U}}_1^{\prime }{\widehat{\boldsymbol{\Gamma}}}_1{\mathbf{U}}_1\right)}^{-1}{\mathbf{U}}_1^{\prime }{\widehat{\boldsymbol{\Gamma}}}_1$$, $${\widehat{\mathbf{Q}}}_{G_2}={\mathbf{U}}_2{\left({\mathbf{U}}_2^{\prime}\widehat{\boldsymbol{\Gamma}}{\mathbf{U}}_2\right)}^{-1}{\mathbf{U}}_2^{\prime}\widehat{\boldsymbol{\Gamma}}$$, $${\widehat{\mathbf{K}}}_{G_1}=\left[\mathbf{I}-{\widehat{\mathbf{Q}}}_{G_1}\right]$$, and $${\widehat{\mathbf{K}}}_{G_2}=\left[\mathbf{I}-{\widehat{\mathbf{Q}}}_{G_2}\right]$$ are the estimates of the matrix projectors associated with stages 1 and 2 (see Eqs. 9.37 and 9.38 for details).

In Sect. 9.5.2, we showed that the estimated MRLGSI vector of coefficients for stage 1 was $${\widehat{\boldsymbol{\upbeta}}}_{R_1}^{\prime }={\widehat{\boldsymbol{\upbeta}}}_1^{\prime }{\widehat{\mathbf{K}}}_{G_1}^{\prime }=\left[1.386\kern0.5em 1.558\right]$$. Thus, by Eq. (9.37), to obtain $${\widehat{\boldsymbol{\upbeta}}}_{P_1}={\widehat{\boldsymbol{\upbeta}}}_{R_1}+{\widehat{\uptheta}}_1{\mathbf{U}}_1{\left({\mathbf{U}}_1^{\prime }{\widehat{\boldsymbol{\Gamma}}}_1{\mathbf{U}}_1\right)}^{-1}\mathbf{d}$$, we only need to obtain $${\widehat{\uptheta}}_1$$ and $${\mathbf{U}}_1{\left({\mathbf{U}}_1^{\prime }{\widehat{\boldsymbol{\Gamma}}}_1{\mathbf{U}}_1\right)}^{-1}\mathbf{d}$$, where d = − 2 and $${\widehat{\uptheta}}_1=\frac{{\mathbf{d}}^{\prime }{\left({\mathbf{U}}_1^{\prime }{\widehat{\boldsymbol{\Gamma}}}_1{\mathbf{U}}_1\right)}^{-1}{\mathbf{U}}_1^{\prime }{\widehat{\mathbf{A}}}_1\mathbf{w}}{{\mathbf{d}}^{\prime }{\left({\mathbf{U}}_1^{\prime }{\widehat{\boldsymbol{\Gamma}}}_1{\mathbf{U}}_1\right)}^{-1}\mathbf{d}}$$. It can be shown that $${\mathbf{U}}_1{\left({\mathbf{U}}_1^{\prime }{\widehat{\boldsymbol{\Gamma}}}_1{\mathbf{U}}_1\right)}^{-1}\mathbf{d}=\left[\begin{array}{c}0\\ {}-0.345\end{array}\right]$$ and $${\widehat{\uptheta}}_1=8.125$$; therefore, $${\widehat{{\boldsymbol{\upbeta}}^{\prime}}}_{P_1}=\left[1.39\kern0.5em -1.25\right]$$ is the MPPG-LGSI vector of coefficients at stage 1.

Suppose that the total proportion retained for the two stages was 20%; then, k1 = 0.744 is an approximate selection intensity associated with MPPG-LGSI and the estimated MPPG-LGSI accuracy, selection response, and expected genetic gain at stage 1 were $${\widehat{\rho}}_{HI_1}=\sqrt{\frac{{\widehat{{\boldsymbol{\upbeta}}^{\prime}}}_{P_1}{\widehat{\boldsymbol{\Gamma}}}_1{\widehat{\boldsymbol{\upbeta}}}_{P_1}}{{\mathbf{w}}^{\prime}\widehat{\mathbf{C}}\mathbf{w}}}=0.71$$, $${\widehat{R}}_{P_1}={k}_1\sqrt{{\widehat{{\boldsymbol{\upbeta}}^{\prime}}}_{P_1}{\widehat{\boldsymbol{\Gamma}}}_1{\widehat{\boldsymbol{\upbeta}}}_{P_1}}=5.90$$ and $${\widehat{\mathbf{E}}}_{P_1}^{\prime }={k}_1\frac{{\widehat{\mathbf{A}}}_1^{\prime }{\widehat{\boldsymbol{\upbeta}}}_{P_1}}{\sqrt{{\boldsymbol{\upbeta}}_{P_1}^{\prime }{\widehat{\boldsymbol{\Gamma}}}_1{\widehat{\boldsymbol{\upbeta}}}_{P_1}}}=\left[2.88\kern0.5em -1.53\kern0.5em 1.00\kern0.5em 0.49\right]$$ respectively.
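The stage 1 MPPG-LGSI quantities above can be traced step by step with the following Python sketch of Eqs. (9.37), (9.39), (9.41), and (9.42). It uses only the estimates quoted in the text, with d = −2 and k1 = 0.744, and assumes a single restricted trait, so that the inverse of U1′Γ1U1 is a scalar. All variable and helper names are hypothetical. The accuracy is left out because the estimate of C is not quoted here.

```python
from math import sqrt

# Estimates quoted in the text (stage 1, two traits measured).
GAMMA1 = [[16.26, -6.51],
          [-6.51, 5.79]]
A1 = [[16.26, -6.51, 5.60, 2.29],
      [-6.51, 5.79, -2.33, -1.62]]
W = [1.0, -1.0, 1.0, 1.0]   # economic weights
K1 = 0.744                  # selection intensity at stage 1
D = -2.0                    # predetermined gain imposed on trait 2
J = 1                       # zero-based position of trait 2


def matvec(m, v):
    """Matrix-vector product for lists of rows."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def solve_2x2(m, rhs):
    """Solve m x = rhs for a 2 x 2 matrix m by Cramer's rule."""
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return [(m[1][1] * rhs[0] - m[0][1] * rhs[1]) / det,
            (m[0][0] * rhs[1] - m[1][0] * rhs[0]) / det]


a1w = matvec(A1, W)
beta_1 = solve_2x2(GAMMA1, a1w)     # unrestricted MLGSI coefficients

# Restricted coefficients beta_R1 = (I - Q_G1) beta_1 for one restricted trait.
q_beta = sum(GAMMA1[J][c] * beta_1[c] for c in range(2)) / GAMMA1[J][J]
beta_r1 = list(beta_1)
beta_r1[J] -= q_beta

# With one restriction, (U1' Gamma1 U1)^{-1} is the scalar 1 / Gamma1[J][J].
inv_var = 1.0 / GAMMA1[J][J]
shift = D * inv_var                               # nonzero entry of U1 (.)^{-1} d
theta_1 = (D * inv_var * a1w[J]) / (D * inv_var * D)   # Eq. (9.39)

# Eq. (9.37): beta_P1 = beta_R1 + theta_1 U1 (U1' Gamma1 U1)^{-1} d.
beta_p1 = list(beta_r1)
beta_p1[J] += theta_1 * shift

sd_index = sqrt(sum(b * g for b, g in zip(beta_p1, matvec(GAMMA1, beta_p1))))
response_p1 = K1 * sd_index                       # Eq. (9.42)
cov = [sum(A1[i][t] * beta_p1[i] for i in range(2)) for t in range(4)]
gain_p1 = [K1 * c / sd_index for c in cov]        # Eq. (9.41)

print(round(theta_1, 3), round(shift, 3))
print([round(b, 2) for b in beta_p1])
print(round(response_p1, 2), [round(e, 2) for e in gain_p1])
```

In this particular example the MPPG-LGSI coefficients coincide with the unrestricted stage 1 MLGSI coefficients. With a single restricted trait, the proportional-gain term restores exactly the component that the restriction removed, so the constraint changes the index only when d has the opposite sign to the unrestricted gain or when more than one trait is restricted.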

It can be shown that at stage 2, $${\mathbf{d}}^{\prime }{\left({\mathbf{U}}_2^{\prime}\widehat{\boldsymbol{\Gamma}}{\mathbf{U}}_2\right)}^{-1}{\mathbf{U}}_2^{\prime }=\left[0\kern0.5em -0.345\kern0.5em 0\kern0.5em 0\right]$$, $${\widehat{\uptheta}}_2=8.125$$, and $${\widehat{{\boldsymbol{\upbeta}}^{\prime}}}_{P_2}={\mathbf{w}}^{\prime }=\left[1\kern0.5em -1\kern0.5em 1\kern0.5em 1\right]$$. Thus, the estimated MPPG-LGSI accuracy, selection response, and expected genetic gain at this stage were $${\widehat{\rho}}_{HI_2}=\sqrt{\frac{{\mathbf{w}}^{\prime }{\widehat{\boldsymbol{\Gamma}}}^{\ast}\mathbf{w}}{{\mathbf{w}}^{\prime }{\widehat{\mathbf{C}}}^{\ast}\mathbf{w}}}=0.64$$, $${\widehat{R}}_{P_2}={k}_2\sqrt{{\mathbf{w}}^{\prime }{\widehat{\boldsymbol{\Gamma}}}^{\ast}\mathbf{w}}=4.10$$, and $${\widehat{{\mathbf{E}}^{\prime}}}_{P_2}={k}_2\frac{{\widehat{\boldsymbol{\Gamma}}}^{\ast}\mathbf{w}}{\sqrt{{\mathbf{w}}^{\prime }{\widehat{\boldsymbol{\Gamma}}}^{\ast}\mathbf{w}}}=\left[1.74\kern0.5em -0.92\kern0.5em 0.85\kern0.5em 0.58\right]$$ respectively, where k2 = 0.721. The estimated total MPPG-LGSI selection response and expected genetic gain for both stages were $${\widehat{R}}_{P_1}+{\widehat{R}}_{P_2}=9.99$$ and $${\widehat{\mathbf{E}}}_{P_1}^{\prime }+{\widehat{\mathbf{E}}}_{P_2}^{\prime }=\left[4.62\kern0.5em -2.45\kern0.5em 1.85\kern0.5em 1.07\right]$$ respectively. Note that the total expected genetic gain for trait 2 was −2.45, which is similar to d =  − 2, the PPG imposed by the breeder. Finally, to simplify the notation, we omitted the intervals between stages or selection cycles (LG) in the estimated MPPG-LGSI selection response and expected genetic gain for both stages.