The Linear Phenotypic Selection Index Theory

The main distinction in the linear phenotypic selection index (LPSI) theory is between the net genetic merit and the LPSI. The net genetic merit is a linear combination of the true unobservable breeding values of the traits weighted by their respective economic values, whereas the LPSI is a linear combination of several observable and optimally weighted phenotypic trait values. It is assumed that the net genetic merit and the LPSI have bivariate normal distribution; thus, the regression of the net genetic merit on the LPSI is linear. The aims of the LPSI theory are to predict the net genetic merit, maximize the selection response and the expected genetic gains per trait (or multi-trait selection response), and provide the breeder with an objective rule for evaluating and selecting parents for the next selection cycle based on several traits. The selection response is the mean of the progeny of the selected parents, whereas the expected genetic gain per trait, or multi-trait selection response, is the population means of each trait under selection of the progeny of the selected parents. The LPSI allows extra merit in one trait to offset slight defects in another; thus, with its use, individuals with very high merit in one trait are saved for breeding even when they are slightly inferior in other traits. This chapter describes the LPSI theory and practice. We illustrate the theoretical results of the LPSI using real and simulated data. We end this chapter with a brief description of the quadratic selection index and its relationship with the LPSI.


Bases for Construction of the Linear Phenotypic Selection Index
The study of quantitative traits (QTs) in plants and animals is based on the mean and variance of phenotypic values of QTs. Quantitative traits are phenotypic expressions of plant and animal characteristics that show continuous variability and result from many gene effects that interact among themselves and with the environment. That is, QTs are the result of unobservable gene effects distributed across plant or animal genomes that interact among themselves and with the environment to produce the observable characteristic plant and animal phenotypes (Mather and Jinks 1971; Falconer and Mackay 1996).
The QTs are the traits that concern plant and animal breeders the most. They are particularly difficult to analyze because heritable variations of QTs are masked by larger nonheritable variations that make it difficult to determine the genotypic values of individual plants or animals (Smith 1936). However, as QTs are usually normally distributed (Fig. 2.1), it is possible to apply normal distribution theory when analyzing this type of data.
Any phenotypic value of QTs ($y$) can be divided into two main parts: one related to the genes and the interactions among them ($g$), called the genotype, and the other related to the environmental conditions ($e$) that affect genetic expression, called environmental effects. Thus, the genotype is the particular assemblage of genes possessed by the plant or animal, whereas the environment consists of all the nongenetic circumstances that influence the phenotypic value of the plant or animal (Cochran 1951; Bulmer 1980; Falconer and Mackay 1996). In the context of only one environment, the phenotypic value of QTs ($y$) can be written as
$$y = g + e \tag{2.1}$$
where $g$ denotes the genotypic values that include all types of gene and interaction values, and $e$ denotes the deviations from the mean of $g$ values. For two or more environments, Eq. (2.1) can be written as $y = g + e + ge$, where $ge$ denotes the interaction between genotype and environment. Assumptions regarding Eq. (2.1) are:

The Net Genetic Merit and the LPSI
Not all the individual traits under selection are equally important from an economic perspective; thus, the economic value of a trait determines how important that trait is for selection. Economic value is defined as the increase in profit achieved by improving a particular trait by one unit (Tomar 1983; Cartuche et al. 2014). This means that for several traits, the total economic value is a linear combination of the breeding values of the traits weighted by their respective economic values (Smith 1936; Hazel and Lush 1942; Hazel 1943; Kempthorne and Nordskog 1959); this is called the net genetic merit of one individual and can be written as
$$H = \mathbf{w}'\mathbf{g} \tag{2.2}$$
where $\mathbf{g}' = [g_1 \; g_2 \; \ldots \; g_t]$ is a vector of true unobservable breeding values and $\mathbf{w}' = [w_1 \; w_2 \; \ldots \; w_t]$ is a vector of known and fixed economic weights. Equation (2.2) has several names, e.g., linear aggregate genotype (Hazel 1943), genotypic economic value (Kempthorne and Nordskog 1959), net genetic merit (Akbar et al. 1984; Cotterill and Jackson 1985), breeding objective (MacNeil et al. 1997), and total economic merit (Cunningham and Tauebert 2009), among others. In this book, we call Eq. (2.2) net genetic merit only. The values of $H = \mathbf{w}'\mathbf{g}$ are unobservable but they can be simulated for specific studies, as is seen in the examples included in this chapter and in Chap. 10, where four indices have been simulated for many selection cycles.
In practice, the net genetic merit of an individual is not observable; thus, to select an individual as parent of the next generation, it is necessary to consider its overall merit based on several observable traits; that is, we need to construct an LPSI of observable phenotypic values such that the correlation between the LPSI and $H = \mathbf{w}'\mathbf{g}$ is at a maximum. The LPSI should be a good predictor of $H$ and should be useful for ranking and selecting among individuals with different net genetic merits. The LPSI for one individual can be written as
$$I = \mathbf{b}'\mathbf{y} \tag{2.3}$$
where $\mathbf{b}' = [b_1 \; b_2 \; \ldots \; b_t]$ is the vector of coefficients of $I$, $t$ is the number of traits in $I$, and $\mathbf{y}' = [y_1 \; y_2 \; \ldots \; y_t]$ is a vector of observable trait phenotypic values, usually centered with respect to its mean. The LPSI allows extra merit in one trait to offset slight defects in another. With its use, individuals with very high merit in some traits are saved for breeding, even when they are slightly inferior in other traits (Hazel and Lush 1942). Only one combination of $\mathbf{b}$ values allows the correlation of the LPSI with $H$ for a particular set of traits to be maximized. Figure 2.2 indicates that the regression of the net genetic merit on the LPSI is linear and that the correlation between the LPSI and the net genetic merit is maximal in each selection cycle. Also, note that the true correlations between the LPSI and the net genetic merit and the true regression coefficients of the net genetic merit over the LPSI are the same, but the estimated correlation values between the LPSI and the net genetic merit are lower than the true correlation (Fig. 2.2). Table 2.1 indicates that the LPSI in the $i$th selection cycle and the LPSI in the $(i + 1)$th selection cycle do not correlate. However, in practice, the correlation values between any pair of LPSIs could be different from zero in successive selection cycles.
One fundamental assumption of the LPSI is that $I = \mathbf{b}'\mathbf{y}$ has normal distribution. This assumption is illustrated in Fig. 2.3 for two real datasets: a maize (Zea mays) F$_2$ population with 252 lines and three traits, grain yield (ton ha$^{-1}$), plant height (cm), and ear height (cm), evaluated in one environment; and a doubled haploid wheat (Triticum aestivum L.) population with 599 lines and one trait, grain yield (ton ha$^{-1}$), evaluated in three environments. Figure 2.3 indicates that, in effect, the LPSI values approach normal distribution when the number of lines is very large.

Fundamental Parameters of the LPSI
There are two fundamental parameters associated with the LPSI theory: the selection response ($R$) and the expected genetic gain per trait ($\mathbf{E}$). In general terms, the selection response is the difference between the mean phenotypic values of the offspring ($\mu_O$) of the selected parents and the mean of the entire parental generation ($\mu_P$) before selection, i.e., $R = \mu_O - \mu_P$ (Hazel and Lush 1942; Falconer and Mackay 1996). The expected genetic gain per trait (or multi-trait selection response) is the covariance between the breeding value vector and the LPSI ($I$) values divided by the standard deviation of $I$ ($\sigma_I$), i.e., $\mathrm{Cov}(I, \mathbf{g})/\sigma_I$, multiplied by the selection intensity. This is one form of the LPSI multi-trait selection response. In the univariate context, the expected genetic gain per trait is the same as the selection response.
[Fig. 2.2: True correlation and estimated correlation (ECO) values between the linear phenotypic selection index (LPSI) and the net genetic merit for seven selection cycles, and true regression coefficient (TRC) of the net genetic merit over the LPSI for four traits and 500 genotypes in one environment simulated for seven selection cycles]
One additional way of defining the selection response is based on the selection differential ($D$). The selection differential is the mean phenotypic value of the individuals selected as parents ($\mu_S$) expressed as a deviation from the mean of the population or parental generation ($\mu_P$) before the selection was made (Falconer and Mackay 1996); that is, $D = \mu_S - \mu_P$. Thus, another way of defining $R$ is as the part of the expected differential of selection ($D = \mu_S - \mu_P$) that is gained when selection is applied (Kempthorne and Nordskog 1959); that is
$$R = \frac{\mathrm{Cov}(g, y)}{\sigma_y^2}\, D = k\,\sigma_y\,h^2 \tag{2.4}$$
where $\mathrm{Cov}(g, y) = \sigma_g^2$ is the covariance between $g$ and $y$, $g$ is the individual breeding value associated with trait $y$, $\sigma_y^2$ is the variance of $y$, $k = D/\sigma_y$ is the standardized selection differential, and $h^2 = \sigma_g^2/\sigma_y^2$ is the heritability of trait $y$ in the base population. Heritability ($h^2$) appears in Eq. (2.4) as a measure of the accuracy with which animals or plants having the highest genetic values can be chosen by selecting directly for phenotype (Hazel and Lush 1942). The selection response (Eq. 2.4) is the mean of the progeny of the selected parents or the future population mean of the trait under selection (Cochran 1951). Thus, the selection response enables breeders to estimate the expected progress of the selection before carrying it out. This information gives improvement programs a clearer orientation and helps to predict the success of the selection method adopted and to choose the option that is technically most effective on a scientific basis (Costa et al. 2008). Equation (2.4) is very powerful but its application requires strong assumptions. For example, Eq. (2.4) assumes that the trait of interest does not correlate with other traits having causal effects on fitness and, in its multivariate form, the validity of the predicted change rests on the assumption that all such correlated traits have been measured and incorporated into the analysis (Morrissey et al. 2010).
[Fig. 2.3: distribution of the LPSI values for the two real datasets; panel titles: Maize LPSI values, Wheat LPSI values]
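As a minimal numerical sketch of the univariate selection response described above, the following Python snippet computes the selection differential, the standardized selection differential, the heritability, and the response. The function name and all input values (population means and variances) are hypothetical and chosen only for illustration.

```python
import math


def univariate_response(mu_s, mu_p, var_g, var_y):
    """Return (D, k, h2, R) for selection on a single trait.

    D  = mu_s - mu_p      selection differential
    k  = D / sigma_y      standardized selection differential
    h2 = var_g / var_y    heritability
    R  = k * sigma_y * h2 (equivalently h2 * D)
    """
    sigma_y = math.sqrt(var_y)
    d = mu_s - mu_p
    k = d / sigma_y
    h2 = var_g / var_y
    r = k * sigma_y * h2
    return d, k, h2, r


# Hypothetical values: parental mean 100, mean of the selected parents 110,
# genotypic variance 4 and phenotypic variance 10.
D, k, h2, R = univariate_response(mu_s=110.0, mu_p=100.0, var_g=4.0, var_y=10.0)
print(f"D = {D:.2f}, k = {k:.3f}, h2 = {h2:.2f}, R = {R:.2f}")
```

With these inputs only the fraction $h^2$ of the selection differential is expected to be transmitted to the progeny, which is the sense in which heritability measures the accuracy of phenotypic selection.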

The LPSI Selection Response
The univariate selection response (Eq. 2.4) can also be rewritten as
$$R = k\,\sigma_g\,\rho_{gy} \tag{2.5}$$
where $\sigma_g$ is the square root of $\sigma_g^2$ defined in Eq. (2.4) and $\rho_{gy}$ is the correlation between $g$ and $y$. Thus, as $H = \mathbf{w}'\mathbf{g}$ and $I = \mathbf{b}'\mathbf{y}$ are univariate random variables, the selection response of the LPSI ($R_I$) can be written in a similar form as Eq. (2.5), i.e.,
$$R_I = k_I\,\frac{\sigma_{HI}}{\sigma_I} = k_I\,\sigma_H\,\rho_{HI} \tag{2.6}$$
where $\sigma_{HI}$ is the covariance between $H$ and $I$; $\sigma_H$ and $\sigma_I$ are the standard deviations of $H$ and $I$, and $\rho_{HI}$ is the correlation between them; $k_I = (\mu_{IA} - \mu_{IB})/\sigma_I$ is the standardized selection differential or the selection intensity associated with the LPSI; and $\mu_{IA}$ and $\mu_{IB}$ are the means of the LPSI values after and before selection respectively. The second part of Eq. (2.6) ($k_I \sigma_H \rho_{HI}$) indicates that the genetic change due to selection is proportional to $k_I$, $\sigma_H$, and $\rho_{HI}$ (Kempthorne and Nordskog 1959). Thus, the genetic gain that can be achieved by selecting for several traits simultaneously within a population of animals or plants is the product of the standardized selection differential ($k_I$), the standard deviation of $H$ ($\sigma_H$), and the correlation between $H$ and $I$ ($\rho_{HI}$). Selection intensity $k_I$ is limited by the rate of reproduction of each species, whereas $\sigma_H$ is largely beyond the breeder's control; hence, the greatest opportunity for increasing selection progress is by ensuring that $\rho_{HI}$ is as large as possible (Hazel 1943). In general, it is assumed that $k_I$ and $\sigma_H$ are fixed and that $\mathbf{w}$ is known and fixed; hence, $R_I$ is maximized when $\rho_{HI}$ is maximized with respect to the LPSI vector of coefficients $\mathbf{b}$ only.
Equation (2.6) is the mean of $H = \mathbf{w}'\mathbf{g}$, whereas $\sigma_H^2 \rho_{HI}^2 (1 - v)$ is its variance and
$$\rho_{HI}^{*} = \rho_{HI}\sqrt{\frac{1 - v}{1 - v\rho_{HI}^2}}$$
is the correlation between $H$ and $I = \mathbf{b}'\mathbf{y}$ after selection was carried out (Cochran 1951), where $v = k_I(k_I - \tau)$ and $\tau$ is the truncation point. For example, if the selection intensity is 5%, $k_I = 2.063$, $\tau = 1.645$, and $v = 0.862$ (Falconer and Mackay 1996, Table A). In R (in this case R denotes a platform for data analysis, see Kabakoff 2011 for details), the truncation point and selection intensity can be obtained as tau <- qnorm(1 - q) and k <- dnorm(tau)/q, respectively, where q is the proportion retained. Both the variance and the correlation ($\rho_{HI}^{*}$) are reduced by selection. If $H$ could be selected directly, the gain in $H$ would be $k_I \sigma_H$. Thus, the gain due to indirect selection using $I$ is a fraction $\rho_{HI}$ of that due to direct selection using $H$. As $k_I$ increases, $R_I$ increases (Eq. 2.6), $\sqrt{\sigma_H^2 \rho_{HI}^2 (1 - v)}$ and $\rho_{HI}^{*}$ decrease, and the effects are in the same direction as $\rho_{HI}^{*}$ increases (Cochran 1951). These results should be valid for all selection indices described in this book.
Smith (1936) gave an additional method to obtain Eq. (2.6). Suppose that we have a large number of plant lines and we select a proportion $q$ for further propagation. In addition, assume that the values of $I$ for each line are normally distributed with variance $\sigma_I^2 = \mathbf{b}'\mathbf{P}\mathbf{b}$; let $I$ be transformed into a variable $u$ with unit variance and mean zero, that is, $u = (I - \mu_I)/\sigma_I$, where $\mu_I$ is the mean of $I$. Assume that all $I$ values higher than a value $I_0$ are selected; then the value of $u_0 = (I_0 - \mu_I)/\sigma_I$ corresponding to any given value of $q$ may be ascertained from a table of the standard normal probability integral (Fig. 2 [...] covariance between $H$ and $I$, and $\sigma_I^2 = \mathbf{b}'\mathbf{P}\mathbf{b}$ is the variance of $I$.
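The R calls quoted above have direct counterparts in the Python standard library. The sketch below, using `statistics.NormalDist`, recovers the truncation point, the selection intensity, and $v$ for a retained proportion $q$, and then applies the after-selection correlation formula. The function names are illustrative, and the value $\rho_{HI} = 0.8$ is hypothetical.

```python
from statistics import NormalDist


def truncation_stats(q):
    """Truncation point tau, selection intensity k_I and v = k_I * (k_I - tau)
    for a proportion q retained under truncation selection."""
    std_normal = NormalDist()            # mean 0, standard deviation 1
    tau = std_normal.inv_cdf(1.0 - q)    # same role as qnorm(1 - q) in R
    k_i = std_normal.pdf(tau) / q        # same role as dnorm(tau) / q in R
    v = k_i * (k_i - tau)
    return tau, k_i, v


def correlation_after_selection(rho, v):
    """Correlation between H and I among the selected individuals."""
    return rho * ((1.0 - v) / (1.0 - v * rho ** 2)) ** 0.5


tau, k_i, v = truncation_stats(0.05)             # 5% of the lines retained
rho_star = correlation_after_selection(0.8, v)   # rho = 0.8 is hypothetical
print(f"tau = {tau:.3f}, k_I = {k_i:.3f}, v = {v:.3f}, rho* = {rho_star:.3f}")
```

For $q = 0.05$ the three quantities agree, to three decimals, with the values 1.645, 2.063, and 0.862 quoted in the text.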
Therefore, if $\sigma_I^2$ and $\sigma_{HI}$ are fixed, the LPSI selection response ($R_I$) can be obtained as the expectation of the selected population, which has a univariate left-truncated normal distribution. A truncated distribution is a conditional distribution resulting when the domain of the parent distribution is restricted to a smaller region (Hattaway 2010). In the LPSI context, a truncated distribution occurs when a sample of individuals from the parent distribution is selected as parents for the next selection cycle, thus creating a new population of individuals that follow a truncated normal distribution. Thus, we need to find $E[E(H \mid I)] = q^{-1} B \sigma_I E(u)$, or, using integral calculus,
$$R_I = \frac{B\sigma_I}{q}\int_{u_0}^{\infty} \frac{u}{\sqrt{2\pi}}\, e^{-u^2/2}\, du = \frac{z}{q}\,\frac{\sigma_{HI}}{\sigma_I} \tag{2.7}$$
where $B = \sigma_{HI}/\sigma_I^2$ is the regression coefficient of $H$ on $I$, $z = \frac{1}{\sqrt{2\pi}} e^{-u_0^2/2}$ is the height of the ordinate of the normal curve at the lowest value of $u_0$ retained, and $q$ is the proportion of the population of animal or plant lines that is selected (Fig. 2.4). The proportion $q$ that must be saved depends on the reproductive rate and longevity of the species under consideration and on whether the population is expanding, stationary, or declining in numbers. The ordinate ($z$) of the normal curve is determined by the proportion selected ($q$) (Fig. 2.4). The amount of progress is expected to be larger as $q$ becomes smaller; that is, as selection becomes more intense (Hazel and Lush 1942). Kempthorne and Nordskog (1959) showed that $z/q = k_I$. Thus, Eqs. (2.6) and (2.7) are the same, that is, $R_I = \frac{z}{q}\frac{\sigma_{HI}}{\sigma_I} = k_I \frac{\sigma_{HI}}{\sigma_I} = k_I \sigma_H \rho_{HI}$.

The Maximized Selection Response
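The main objective of the LPSI is to maximize the mean of $H = \mathbf{w}'\mathbf{g}$ (Eq. 2.7). Assuming that the phenotypic ($\mathbf{P}$) and genotypic ($\mathbf{G}$) covariance matrices, $\mathbf{w}$, and $k_I$ are known, to maximize $R_I$ we can either maximize $\rho_{HI}$ or minimize the mean squared difference between $I$ and $H$, $E[(H - I)^2]$, with respect to $\mathbf{b}$. In both cases,
$$\mathbf{b} = \mathbf{P}^{-1}\mathbf{G}\mathbf{w} \tag{2.8}$$
is the vector that simultaneously minimizes $E[(H - I)^2]$ and maximizes $\rho_{HI}$, and then $R_I = k_I \sigma_H \rho_{HI}$. By Eq. (2.8), the maximized LPSI selection response can be written as
$$R_I = k_I\sqrt{\mathbf{b}'\mathbf{P}\mathbf{b}} \tag{2.9}$$
The maximized LPSI selection response predicts the mean improvement in $H$ due to indirect selection on $I$ only when $\mathbf{b} = \mathbf{P}^{-1}\mathbf{G}\mathbf{w}$ (Harris 1964) and is proportional to the standard deviation of the LPSI ($\sigma_I$) and the standardized selection differential or selection intensity ($k_I$).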
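To make the computation of $\mathbf{b} = \mathbf{P}^{-1}\mathbf{G}\mathbf{w}$ concrete, here is a small, dependency-free Python sketch for two traits. The matrices $\mathbf{P}$ and $\mathbf{G}$ and the economic weights are hypothetical values invented for illustration, and the helper functions (`dot`, `matvec`, `solve_2x2`) are not part of any selection-index package.

```python
import math


def dot(x, y):
    """Inner product of two vectors given as lists."""
    return sum(a * b for a, b in zip(x, y))


def matvec(m, x):
    """Matrix-vector product for a matrix stored as a list of rows."""
    return [dot(row, x) for row in m]


def solve_2x2(m, rhs):
    """Solve m x = rhs for a 2 x 2 matrix using Cramer's rule."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [(d * rhs[0] - b * rhs[1]) / det, (a * rhs[1] - c * rhs[0]) / det]


# Hypothetical two-trait inputs (not taken from the chapter's data sets).
P = [[10.0, 2.0], [2.0, 5.0]]   # phenotypic covariance matrix
G = [[6.0, 1.0], [1.0, 1.0]]    # genotypic covariance matrix
w = [1.0, 1.0]                  # economic weights
k_I = 2.063                     # selection intensity for 5% selected

b = solve_2x2(P, matvec(G, w))               # b = P^{-1} G w
sigma_I = math.sqrt(dot(b, matvec(P, b)))    # standard deviation of I
sigma_H = math.sqrt(dot(w, matvec(G, w)))    # standard deviation of H
rho_HI = dot(w, matvec(G, b)) / (sigma_H * sigma_I)
R_I = k_I * sigma_I                          # maximized selection response
print([round(x, 4) for x in b], round(rho_HI, 4), round(R_I, 4))
```

Solving the linear system $\mathbf{P}\mathbf{b} = \mathbf{G}\mathbf{w}$ rather than inverting $\mathbf{P}$ explicitly is a common numerical choice; for more than two traits a general linear solver would replace `solve_2x2`.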
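The maximized LPSI selection response (Eq. 2.9) is related to the Cauchy-Schwarz inequality (Rao 2002; Cerón-Rojas et al. 2006), which establishes that for any pair of vectors $\mathbf{u}$ and $\mathbf{v}$, if $\mathbf{A}$ is a positive definite matrix, then the inequality $(\mathbf{u}'\mathbf{v})^2 \le (\mathbf{v}'\mathbf{A}\mathbf{v})(\mathbf{u}'\mathbf{A}^{-1}\mathbf{u})$ holds. Kempthorne and Nordskog (1959) proved that $\mathbf{b} = \mathbf{P}^{-1}\mathbf{G}\mathbf{w}$ also maximizes $R_I$. According to Eqs. (2.6) and (2.7), $R_I^2$ can be written as $R_I^2 = k_I^2 \frac{(\mathbf{w}'\mathbf{G}\mathbf{b})^2}{\mathbf{b}'\mathbf{P}\mathbf{b}}$, and, taking $\mathbf{u} = \mathbf{G}\mathbf{w}$, $\mathbf{v} = \mathbf{b}$, and $\mathbf{A} = \mathbf{P}$ in the inequality, $\frac{(\mathbf{w}'\mathbf{G}\mathbf{b})^2}{\mathbf{b}'\mathbf{P}\mathbf{b}} \le \mathbf{w}'\mathbf{G}\mathbf{P}^{-1}\mathbf{G}\mathbf{w}$. This implies that the maximum is reached when $R_I^2 = k_I^2\, \mathbf{w}'\mathbf{G}\mathbf{P}^{-1}\mathbf{G}\mathbf{w}$. This latter result is the same as Eq. (2.9) when $\mathbf{b} = \mathbf{P}^{-1}\mathbf{G}\mathbf{w}$. The result obtained using the Cauchy-Schwarz inequality corroborates that $\mathbf{b} = \mathbf{P}^{-1}\mathbf{G}\mathbf{w}$ (Eq. 2.8) is a global minimum when the mean squared difference between $I$ and $H$ ($E[(H - I)^2]$) is minimized, and a global maximum when the correlation $\rho_{HI}$ between $I$ and $H$ is maximized because [...]

The LPSI Expected Genetic Gain Per Trait
[...] $\mathrm{Cov}(g, y)/\sigma_y$ is the covariance between $g$ and $y$ divided by the standard deviation of $y$, and $\mathrm{Cov}(I, \mathbf{g})/\sigma_I$ is the covariance between the breeding value vector and the LPSI values divided by the standard deviation of the LPSI. This means that, in effect, $\mathbf{E}$ is the LPSI multi-trait selection response and can be written as
$$\mathbf{E} = k_I\,\frac{\mathbf{G}\mathbf{b}}{\sigma_I} \tag{2.10}$$
where $\mathbf{G}$, $\sigma_I$ and $k_I$ were defined earlier. As Eq. (2.10) is the covariance between $I$ and $\mathbf{g}$ scaled by $k_I/\sigma_I$, the genetic gain in the $j$th index trait due to selection on $I$ will be
$$E_j = k_I\,\frac{\mathbf{b}'\mathbf{G}_j}{\sigma_I} \tag{2.11}$$
where $\mathbf{G}_j$ is a vector of genotypic covariances of the $j$th index trait with all the index traits (Lin 1978; Brascamp 1984). If Eq. (2.11) is multiplied by its economic weight, we obtain a measure of the economic value of each trait included in the net genetic merit (Cunningham and Tauebert 2009). In percentage terms, the economic value attributable to genetic change in the $j$th trait is given by Eq. (2.12). In addition, the percentage reduction in overall genetic gain in the net genetic merit if the $j$th trait is omitted from the LPSI (Cunningham and Tauebert 2009) is given by Eq. (2.13), where $\varphi_j^{-2}$ is the $j$th diagonal element of $\mathbf{P}^{-1}$, the inverse of the phenotypic covariance matrix, and $b_j^2$ is the square of the $j$th coefficient of the LPSI. Equations (2.12) and (2.13) are measures of the importance of each trait included in the LPSI when making selections.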
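The expected genetic gain per trait can be sketched numerically as the vector $k_I \mathbf{G}\mathbf{b}/\sigma_I$, which is the covariance between the breeding values and the index scaled by $k_I/\sigma_I$ as described in the text. The example below uses hypothetical two-trait inputs; the coefficient vector is the value of $\mathbf{P}^{-1}\mathbf{G}\mathbf{w}$ for these particular matrices, entered directly to keep the sketch short.

```python
import math


def dot(x, y):
    """Inner product of two vectors given as lists."""
    return sum(a * b for a, b in zip(x, y))


def matvec(m, x):
    """Matrix-vector product for a matrix stored as a list of rows."""
    return [dot(row, x) for row in m]


# Hypothetical inputs (not from the chapter's data sets).
P = [[10.0, 2.0], [2.0, 5.0]]      # phenotypic covariance matrix
G = [[6.0, 1.0], [1.0, 1.0]]       # genotypic covariance matrix
w = [1.0, 1.0]                     # economic weights
b = [31.0 / 46.0, 6.0 / 46.0]      # equals P^{-1} G w for the P, G, w above
k_I = 2.063                        # selection intensity for 5% selected

sigma_I = math.sqrt(dot(b, matvec(P, b)))
E = [k_I * gb / sigma_I for gb in matvec(G, b)]   # expected gain per trait
R_I = k_I * sigma_I                               # LPSI selection response

# When b = P^{-1} G w, the economically weighted gains add up to R_I.
print([round(e, 4) for e in E], round(dot(w, E), 4), round(R_I, 4))
```

The final line illustrates a useful consistency check: with the optimal coefficients, $\mathbf{w}'\mathbf{E}$ equals the selection response, so the per-trait gains partition the total response according to the economic weights.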

Heritability of the LPSI
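If $\mathbf{P}$, $\mathbf{G}$, and $\mathbf{R}$ are the phenotypic, genetic, and residual covariance matrices respectively, then the LPSI heritability (Lin and Allaire 1977; Nordskog 1978) can be written as
$$h_I^2 = \frac{\mathbf{b}'\mathbf{G}\mathbf{b}}{\mathbf{b}'\mathbf{P}\mathbf{b}} \tag{2.14}$$
When selecting a trait, the correlation between the phenotypic and genotypic values is equal to the square root of the trait's heritability ($\rho_{gy} = h$); however, in the LPSI context, when $\mathbf{b} = \mathbf{P}^{-1}\mathbf{G}\mathbf{w}$, the maximized correlation between $H$ and $I$ is $\rho_{HI} = \sqrt{\frac{\mathbf{b}'\mathbf{P}\mathbf{b}}{\mathbf{w}'\mathbf{G}\mathbf{w}}}$, whereas $h_I = \sqrt{\frac{\mathbf{b}'\mathbf{G}\mathbf{b}}{\mathbf{b}'\mathbf{P}\mathbf{b}}}$ is the square root of the heritability of $I$; that is, from a mathematical point of view, $\rho_{HI} \ne h_I$. In practice, $h_I^2$ and $\rho_{HI}^2$ give similar results (Fig. 2.5).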
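A short sketch can show that the index heritability and the squared correlation with the net genetic merit are distinct quantities. It assumes the standard ratio forms $h_I^2 = \mathbf{b}'\mathbf{G}\mathbf{b}/\mathbf{b}'\mathbf{P}\mathbf{b}$ and $\rho_{HI}^2 = \mathbf{b}'\mathbf{P}\mathbf{b}/\mathbf{w}'\mathbf{G}\mathbf{w}$ (the latter valid when $\mathbf{b} = \mathbf{P}^{-1}\mathbf{G}\mathbf{w}$); all numerical inputs are hypothetical.

```python
def dot(x, y):
    """Inner product of two vectors given as lists."""
    return sum(a * b for a, b in zip(x, y))


def quad_form(x, m, y):
    """Bilinear form x' M y for a matrix stored as a list of rows."""
    return dot(x, [dot(row, y) for row in m])


# Hypothetical two-trait inputs; b equals P^{-1} G w for these values.
P = [[10.0, 2.0], [2.0, 5.0]]
G = [[6.0, 1.0], [1.0, 1.0]]
w = [1.0, 1.0]
b = [31.0 / 46.0, 6.0 / 46.0]

h2_I = quad_form(b, G, b) / quad_form(b, P, b)      # heritability of the index
rho2_HI = quad_form(b, P, b) / quad_form(w, G, w)   # squared correlation with H
print(round(h2_I, 4), round(rho2_HI, 4))
```

In this example the two values are of similar magnitude but not equal, in line with the remark that $\rho_{HI} \ne h_I$ mathematically while the two often behave similarly in practice.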

Statistical LPSI Properties
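Assuming that $H$ and $I$ have joint bivariate normal distribution, $\mathbf{b} = \mathbf{P}^{-1}\mathbf{G}\mathbf{w}$, and $\mathbf{P}$, $\mathbf{G}$ and $\mathbf{w}$ are known, the statistical LPSI properties (Henderson 1963) are the following:
1. The variance of $I$ ($\sigma_I^2$) and the covariance between $H$ and $I$ ($\sigma_{HI}$) are equal, i.e., $\sigma_I^2 = \sigma_{HI}$. We can demonstrate this property by noting that, as $\mathbf{b} = \mathbf{P}^{-1}\mathbf{G}\mathbf{w}$, $\sigma_I^2 = \mathbf{b}'\mathbf{P}\mathbf{b} = \mathbf{w}'\mathbf{G}\mathbf{P}^{-1}\mathbf{P}\mathbf{P}^{-1}\mathbf{G}\mathbf{w} = \mathbf{w}'\mathbf{G}\mathbf{P}^{-1}\mathbf{G}\mathbf{w}$ and $\sigma_{HI} = \mathbf{w}'\mathbf{G}\mathbf{b} = \mathbf{w}'\mathbf{G}\mathbf{P}^{-1}\mathbf{G}\mathbf{w}$. This last result implies that when $\mu_I = 0$, $E(H \mid I) = I$.
2. The maximized correlation between $H$ and $I$ is equal to $\rho_{HI} = \sigma_I/\sigma_H$.
3. The variance of the prediction error, $\mathrm{Var}(H - I) = (1 - \rho_{HI}^2)\sigma_H^2$, is minimal. Thus, the larger $\rho_{HI}$, the smaller $E[(H - I)^2]$ and the more similar $I$ and $H$ are. If $\rho_{HI} > 0$, $I$ and $H$ tend to be positively related; if $\rho_{HI} < 0$, they tend to be negatively related; and if $\rho_{HI} = 0$, $I$ and $H$ are independent (Anderson 2003).
4. The total variance of $H$ explained by $I$ is $\sigma_I^2 = \rho_{HI}^2 \sigma_H^2$. That is, the variance of $H$ explained by $I$ is proportional to $\rho_{HI}^2$; when $\rho_{HI}$ is close to 1, $\sigma_I^2$ is close to $\sigma_H^2$, and when $\rho_{HI}$ is close to 0, $\sigma_I^2$ is close to 0.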

The Base LPSI
To derive the LPSI theory, we assumed that the phenotypic ($\mathbf{P}$) and the genotypic ($\mathbf{G}$) covariance matrices and the vector of economic values ($\mathbf{w}$) are known. However, $\mathbf{P}$, $\mathbf{G}$, and $\mathbf{w}$ are generally unknown and it is necessary to estimate them. There are many methods for estimating $\mathbf{P}$ and $\mathbf{G}$ (Lynch and Walsh 1998) and $\mathbf{w}$ (Cotterill and Jackson 1985; Magnussen 1990). However, when the estimator of $\mathbf{P}$ ($\hat{\mathbf{P}}$) is not positive definite (all eigenvalues positive) or the estimator of $\mathbf{G}$ ($\hat{\mathbf{G}}$) is not positive semidefinite (no negative eigenvalues), the estimator of $\mathbf{b} = \mathbf{P}^{-1}\mathbf{G}\mathbf{w}$ ($\hat{\mathbf{b}} = \hat{\mathbf{P}}^{-1}\hat{\mathbf{G}}\mathbf{w}$) could be biased. In this case, the base linear phenotypic selection index (BLPSI)
$$I_B = \mathbf{w}'\mathbf{y} \tag{2.15}$$
may be a better predictor of $H = \mathbf{w}'\mathbf{g}$ than the estimated LPSI $\hat{I} = \hat{\mathbf{b}}'\mathbf{y}$ (Williams 1962a; Lin 1978) if the vector of economic values $\mathbf{w}$ is indeed known. Many authors (Williams 1962b; Harris 1964; Hill 1980, 1981) have investigated the influence of parameter estimation errors on LPSI accuracy and concluded that those errors affect the accuracy of $\hat{I} = \hat{\mathbf{b}}'\mathbf{y}$ when the accuracy of $\hat{\mathbf{P}}$ and $\hat{\mathbf{G}}$ is low. If the values of vector $\mathbf{w}$ are known, the BLPSI has certain advantages because of its simplicity and its freedom from parameter estimation errors (Lin 1978). Williams (1962a) pointed out that the BLPSI is superior to $\hat{I} = \hat{\mathbf{b}}'\mathbf{y}$ unless a large amount of data is available for estimating $\mathbf{P}$ and $\mathbf{G}$.
There are some problems associated with the BLPSI. For example, what is the BLPSI selection response and the BLPSI expected genetic gains per trait when no data are available for estimating P and G? The BLPSI is a better selection index than the standard LPSI only if the correlation between the BLPSI and the net genetic merit is higher than that between the LPSI and the net genetic merit (Hazel 1943).
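However, if estimates of $\mathbf{P}$ and $\mathbf{G}$ are not available, how can the correlation between the base index and the net genetic merit be obtained? Williams (1962b) pointed out that the correlation between the BLPSI and $H = \mathbf{w}'\mathbf{g}$ can be written as
$$\rho_{HI_B} = \sqrt{\frac{\mathbf{w}'\mathbf{G}\mathbf{w}}{\mathbf{w}'\mathbf{P}\mathbf{w}}} \tag{2.16}$$
and indicated that the ratio $\rho_{HI_B}/\rho_{HI}$ can be used to compare LPSI efficiency versus BLPSI efficiency; however, in the latter case, at least the estimates of $\mathbf{P}$ and $\mathbf{G}$, i.e., $\hat{\mathbf{P}}$ and $\hat{\mathbf{G}}$, need to be known. In addition, Eq. (2.15) is only an assumption, not a result, and implies that $\mathbf{P}$ and $\mathbf{G}$ are the same. That is, $\mathbf{b} = \mathbf{P}^{-1}\mathbf{G}\mathbf{w} = \mathbf{w}$ only when $\mathbf{P} = \mathbf{G}$, which indicates that the BLPSI is a special case of the LPSI. Thus, to obtain the selection response and the expected genetic gains per trait of the BLPSI, we need some information about $\mathbf{P}$ and $\mathbf{G}$. Assuming that the BLPSI is indeed a particular case of the LPSI, the BLPSI selection response and the BLPSI expected genetic gains per trait could be written as
$$R_B = k_I\sqrt{\mathbf{w}'\mathbf{P}\mathbf{w}} \tag{2.17}$$
and
$$\mathbf{E}_B = k_I\,\frac{\mathbf{G}\mathbf{w}}{\sqrt{\mathbf{w}'\mathbf{P}\mathbf{w}}} \tag{2.18}$$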
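The comparison between the two indices can be illustrated with a few lines of Python. The sketch computes the correlation of the base index $\mathbf{w}'\mathbf{y}$ with $H$ directly from its definition, $\mathrm{Cov}(H, I_B)/(\sigma_H \sigma_{I_B}) = \mathbf{w}'\mathbf{G}\mathbf{w}/\sqrt{\mathbf{w}'\mathbf{G}\mathbf{w}\;\mathbf{w}'\mathbf{P}\mathbf{w}}$, and compares it with the LPSI correlation. All matrices and weights are hypothetical.

```python
import math


def dot(x, y):
    """Inner product of two vectors given as lists."""
    return sum(a * b for a, b in zip(x, y))


def quad_form(x, m, y):
    """Bilinear form x' M y for a matrix stored as a list of rows."""
    return dot(x, [dot(row, y) for row in m])


# Hypothetical two-trait inputs; b equals P^{-1} G w for these values.
P = [[10.0, 2.0], [2.0, 5.0]]
G = [[6.0, 1.0], [1.0, 1.0]]
w = [1.0, 1.0]
b = [31.0 / 46.0, 6.0 / 46.0]

var_H = quad_form(w, G, w)   # variance of the net genetic merit

# Base index I_B = w'y: Cov(H, I_B) = w'Gw and Var(I_B) = w'Pw.
rho_base = quad_form(w, G, w) / math.sqrt(var_H * quad_form(w, P, w))
# LPSI I = b'y: Cov(H, I) = w'Gb and Var(I) = b'Pb.
rho_lpsi = quad_form(w, G, b) / math.sqrt(var_H * quad_form(b, P, b))

print(round(rho_base, 4), round(rho_lpsi, 4), round(rho_base / rho_lpsi, 4))
```

When $\mathbf{P}$ and $\mathbf{G}$ are known exactly, as here, the LPSI correlation cannot be lower than that of the base index; the practical advantage of the base index arises only through estimation error in $\hat{\mathbf{P}}$ and $\hat{\mathbf{G}}$, which this sketch does not model.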

The LPSI for Independent Traits
Suppose that the traits under selection are independent; then $\mathbf{P}$ and $\mathbf{G}$ are diagonal matrices and $\mathbf{b} = \mathbf{P}^{-1}\mathbf{G}\mathbf{w}$ is a vector of single-trait heritabilities multiplied by the economic weights, because $\mathbf{P}^{-1}\mathbf{G}$ is the matrix of multi-trait heritabilities (Xu and Muir 1992). Based on this result, Hazel and Lush (1942) and Smith et al. (1981) used trait heritabilities multiplied by the economic weights (or heritabilities only) as coefficients of the LPSI. Thus, when the traits are independent and the economic weights are known, the LPSI can be constructed as
$$I = w_1 h_1^2 y_1 + w_2 h_2^2 y_2 + \cdots + w_t h_t^2 y_t \tag{2.19}$$
and when the economic weights are unknown, the LPSI can be constructed as
$$I = h_1^2 y_1 + h_2^2 y_2 + \cdots + h_t^2 y_t \tag{2.20}$$
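For independent traits the index coefficients reduce to element-wise products, which the following sketch makes explicit. The variances, economic weights, and the centered phenotype are hypothetical values chosen so that the heritabilities are easy to read.

```python
# Hypothetical variances for three uncorrelated traits.
var_g = [4.0, 3.0, 8.0]     # genotypic variances (diagonal of G)
var_y = [10.0, 5.0, 10.0]   # phenotypic variances (diagonal of P)
w = [1.0, -1.0, 1.0]        # economic weights

# Single-trait heritabilities: the diagonal of P^{-1} G when both are diagonal.
h2 = [g / y for g, y in zip(var_g, var_y)]

# With diagonal P and G, b = P^{-1} G w reduces to h2_j * w_j for each trait.
b_weighted = [h * wj for h, wj in zip(h2, w)]
b_unweighted = list(h2)     # coefficients when economic weights are unknown


def index_value(coefs, y):
    """Index value for one individual with centered phenotypes y."""
    return sum(c * yi for c, yi in zip(coefs, y))


y_centered = [2.0, -1.0, 0.5]   # hypothetical centered phenotypic values
print(index_value(b_weighted, y_centered), index_value(b_unweighted, y_centered))
```

No matrix inversion is needed in this special case, which is why heritability-weighted indices were attractive before covariance estimates were routinely available.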

Criteria for Comparing LPSI Efficiency
Assuming that the intensity of selection is the same in both indices, we can compare BLPSI ($I_B = \mathbf{w}'\mathbf{y}$) efficiency versus LPSI efficiency to predict the net genetic merit in percentage terms as
$$p = 100(\lambda - 1) \tag{2.21}$$
where $\lambda = \rho_{HI}/\rho_{HI_B}$ (Williams 1962b; Bulmer 1980). Therefore, when $p = 0$, the efficiency of both indices is the same; when $p > 0$, the efficiency of the LPSI is higher than the base index efficiency, and when $p < 0$, the base index efficiency is higher than LPSI efficiency (Fig. 2.6). Equation (2.21) is useful for comparing the efficiency of any linear selection index, as we shall see in this book.

Estimating Matrices G and P
To derive the LPSI theory we assumed that matrices $\mathbf{P}$ and $\mathbf{G}$ are known. In practice, we have to estimate them. Matrices $\mathbf{P}$ and $\mathbf{G}$ can be estimated by analysis of variance (ANOVA), maximum likelihood, or restricted maximum likelihood (REML) (Baker 1986; Lynch and Walsh 1998; Searle et al. 2006; Hallauer et al. 2010). Equation (2.1) is the simplest model because we only need to estimate two variance components: the genotypic variance ($\sigma_g^2$) and the residual variance ($\sigma_e^2$), from which the phenotypic variance for trait $y$ is the sum of $\sigma_g^2$ and $\sigma_e^2$, that is, $\sigma_y^2 = \sigma_g^2 + \sigma_e^2$. However, to construct matrices $\mathbf{P}$ and $\mathbf{G}$, we also need the covariance between any two traits. Thus, if $y_i$ and $y_j$ ($i, j = 1, 2, \ldots, t$) are any two traits, then the covariance between $y_i$ and $y_j$ ($\sigma_{y_{ij}}$) can be written as $\sigma_{y_{ij}} = \sigma_{g_{ij}} + \sigma_{e_{ij}}$, where $\sigma_{g_{ij}}$ and $\sigma_{e_{ij}}$ denote the genotypic and residual covariance respectively of traits $y_i$ and $y_j$.
Several authors (Baker 1986; Lynch and Walsh 1998; Hallauer et al. 2010) have described ANOVA methods for estimating matrix $\mathbf{G}$ using data from specific designs, for example, half-sib, full-sib, etc., when the sample sizes are well balanced. In the ANOVA method, observed mean squares are equated to their expected values; the expected values are linear functions of the unknown variance components; thus, the resulting equations are a set of simultaneous linear equations in the variance components. The expected values of mean squares in the ANOVA method do not need assumptions of normality because the variance component estimators do not depend on normality assumptions (Lynch and Walsh 1998; Hallauer et al. 2010).
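The idea of equating observed mean squares to their expectations can be sketched for the simplest balanced case. The example below assumes a one-way random model with $r$ replicates per genotype, for which $E[\mathrm{MS}_{\text{genotype}}] = \sigma_e^2 + r\sigma_g^2$ and $E[\mathrm{MS}_{\text{error}}] = \sigma_e^2$; this layout is an assumption made for illustration rather than one of the mating designs cited above, and the data are invented.

```python
def anova_variance_components(plots):
    """Method-of-moments estimates (var_g, var_e) from balanced data.

    plots maps each genotype to a list of r replicate values.
    """
    groups = list(plots.values())
    n_gen = len(groups)
    r = len(groups[0])
    means = [sum(values) / r for values in groups]
    grand_mean = sum(means) / n_gen

    # Observed mean squares between genotypes and within genotypes.
    ms_gen = r * sum((m - grand_mean) ** 2 for m in means) / (n_gen - 1)
    ms_err = sum(
        (x - m) ** 2 for values, m in zip(groups, means) for x in values
    ) / (n_gen * (r - 1))

    var_e = ms_err                   # from E[MS_error] = var_e
    var_g = (ms_gen - ms_err) / r    # from E[MS_genotype] = var_e + r * var_g
    return var_g, var_e


# Invented data: three genotypes with two replicates each.
data = {"A": [10.0, 12.0], "B": [14.0, 16.0], "C": [9.0, 11.0]}
var_g, var_e = anova_variance_components(data)
print(var_g, var_e, var_g + var_e)   # last value: plot-level phenotypic variance
```

Method-of-moments estimates of this kind can be negative when the genotype mean square is smaller than the error mean square, which is one way an estimated $\mathbf{G}$ can fail to be positive semidefinite.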
In cases where the sample sizes are not well balanced, Lynch and Walsh (1998) and Fry (2004) proposed using the REML method to estimate matrix $\mathbf{G}$. The REML estimation method does not require a specific design or balanced data and can be used to estimate genetic and residual variance and covariance in any arbitrary pedigree of individuals. The REML method is based on projecting the data in a subspace free of fixed effects and maximizing the likelihood function in this subspace, and has the advantage of producing the same results as the ANOVA in balanced designs (Blasco 2001).
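In the context of the linear mixed model, Lynch and Walsh (1998) have given formulas for estimating variances $\sigma_g^2$ and $\sigma_e^2$ that can be adapted to estimate covariances $\sigma_{g_{ij}}$ and $\sigma_{e_{ij}}$. Suppose that we want to estimate $\sigma_g^2$ and $\sigma_e^2$ for the $q$th trait ($q = 1, 2, \ldots, t$; $t$ = number of traits) in the absence of dominance and epistatic effects using the model $\mathbf{y}_q = \mathbf{1}\mu_q + \mathbf{Z}\mathbf{g}_q + \mathbf{e}_q$, where the vector of averages $\mathbf{y}_q \sim \mathrm{NMV}(\mathbf{1}\mu_q, \mathbf{V}_q)$ is $g \times 1$ ($g$ = number of genotypes in the population) and has multivariate normal distribution; $\mathbf{1}$ is a $g \times 1$ vector of ones, $\mu_q$ is the mean of the $q$th trait, $\mathbf{Z}$ is a $g \times g$ identity matrix, $\mathbf{g}_q \sim \mathrm{NMV}(\mathbf{0}, \mathbf{A}\sigma_{g_q}^2)$ is a vector of true breeding values, and $\mathbf{e}_q \sim \mathrm{NMV}(\mathbf{0}, \mathbf{I}\sigma_{e_q}^2)$ is a $g \times 1$ vector of residuals, where NMV stands for multivariate normal distribution. Matrix $\mathbf{A}$ denotes the numerator relationship matrix between individuals (Lynch and Walsh 1998; Mrode 2005) and $\mathbf{V}_q = \mathbf{A}\sigma_{g_q}^2 + \mathbf{I}\sigma_{e_q}^2$. The expectation-maximization algorithm allows the REML estimates of the variance components $\sigma_{g_q}^2$ and $\sigma_{e_q}^2$ to be computed by iterating Eqs. (2.22) and (2.23), where, after $n$ iterations, $\sigma_{g_q}^{2(n+1)}$ and $\sigma_{e_q}^{2(n+1)}$ are the estimated variance components of $\sigma_{g_q}^2$ and $\sigma_{e_q}^2$ respectively; tr(.) denotes the trace of the matrices within brackets; [...]
The additive genetic and residual covariances between the observations of the $q$th and $i$th traits, $\mathbf{y}_q$ and $\mathbf{y}_i$ ($\sigma_{g_{q,i}}$ and $\sigma_{e_{q,i}}$; $q, i = 1, 2, \ldots, t$), can be estimated using REML by adapting Eqs. (2.22) and (2.23). Note that the variance of the sum of $\mathbf{y}_q$ and $\mathbf{y}_i$ can be written as $\mathrm{Var}(\mathbf{y}_i + \mathbf{y}_q) = \mathbf{V}_i + \mathbf{V}_q + 2\mathbf{C}_{iq}$, where $\mathbf{V}_i = \mathrm{Var}(\mathbf{y}_i)$ is the variance of $\mathbf{y}_i$ and $\mathbf{V}_q = \mathrm{Var}(\mathbf{y}_q)$ is the variance of $\mathbf{y}_q$; in addition, $2\mathbf{C}_{iq} = 2\mathbf{A}\sigma_{g_{iq}} + 2\mathbf{I}\sigma_{e_{iq}} = 2\mathrm{Cov}(\mathbf{y}_i, \mathbf{y}_q)$ is twice the covariance of $\mathbf{y}_q$ and $\mathbf{y}_i$, and $\sigma_{g_{iq}}$ and $\sigma_{e_{iq}}$ are the additive and residual covariances respectively associated with the covariance of $\mathbf{y}_q$ and $\mathbf{y}_i$. Thus, one way of estimating $\sigma_{g_{iq}}$ and $\sigma_{e_{iq}}$ is by using the following equation:
$$\mathbf{C}_{iq} = \tfrac{1}{2}\left[\mathrm{Var}(\mathbf{y}_i + \mathbf{y}_q) - \mathbf{V}_i - \mathbf{V}_q\right]$$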
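The variance-of-a-sum identity used above to recover a covariance from three variances can be checked on ordinary sample statistics. The sketch below is not a REML implementation; it only illustrates, with invented observations for two traits, that half the difference between the variance of the sum and the two individual variances reproduces the covariance.

```python
def sample_var(x):
    """Unbiased sample variance."""
    n = len(x)
    mean = sum(x) / n
    return sum((xi - mean) ** 2 for xi in x) / (n - 1)


def sample_cov(x, y):
    """Unbiased sample covariance."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    return sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / (n - 1)


# Invented observations of two traits on the same four genotypes.
y_i = [4.0, 6.0, 5.0, 9.0]
y_q = [1.0, 3.0, 2.0, 2.0]
y_sum = [a + b for a, b in zip(y_i, y_q)]

# Cov(y_i, y_q) = 0.5 * [Var(y_i + y_q) - Var(y_i) - Var(y_q)]
cov_from_variances = 0.5 * (sample_var(y_sum) - sample_var(y_i) - sample_var(y_q))
print(cov_from_variances, sample_cov(y_i, y_q))
```

The practical appeal of the identity is that a routine written only for single-trait variance components can be run three times (on each trait and on their sum) to obtain a covariance component.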

Simulated Data
This data set was simulated by Ceron-Rojas et al. (2015) and can be obtained at http://hdl.handle.net/11529/10199. The data were simulated for eight phenotypic selection cycles (C0 to C7), each with four traits ($T_1$, $T_2$, $T_3$ and $T_4$), 500 genotypes, and four replicates for each genotype (Fig. 2.7). The LPSI economic weights for $T_1$, $T_2$, $T_3$ and $T_4$ were 1, $-1$, 1, and 1 respectively. Each of the four traits was affected by a different number of quantitative trait loci (QTLs): 300, 100, 60, and 40, respectively. The common QTLs affecting the traits generated genotypic correlations of $-0.5$, 0.4, 0.3, $-0.3$, $-0.2$, and 0.1 between $T_1$ and $T_2$, $T_1$ and $T_3$, $T_1$ and $T_4$, $T_2$ and $T_3$, $T_2$ and $T_4$, and $T_3$ and $T_4$ respectively. The genotypic value of each plant was generated based on its haplotypes and the QTL effects for each trait. Simulated data were generated using QU-GENE software (Podlich and Cooper 1998; Wang et al. 2003). A total of 2500 molecular markers were distributed uniformly across 10 chromosomes, whereas 315 QTLs were randomly allocated over the ten chromosomes to simulate one maize (Zea mays L.) population. Each QTL and molecular marker was biallelic and the QTL additive values ranged from 0 to 0.5.
[Fig. 2.7: Schematic illustration of the steps followed to generate data sets 1 and 2 for the seven selection cycles using the linear phenotypic selection index and the linear genomic selection index. Dotted lines indicate the process used to simulate the phenotypic data (according to Ceron-Rojas et al. 2015). Diagram label: Inter-cross F2:3 × tester(s)]
As QU-GENE uses recombination fraction rather than map distance to calculate the probability of crossover events, recombination between adjacent pairs of markers was set at 0.0906. For two flanking markers, the QTL was either on the first marker (recombination between the first marker and the QTL was equal to 0.0) or on the second marker (recombination between the first marker and the QTL was equal to 0.0906). The exception was the recombination fraction between 15 random QTLs and their flanking markers, which was set at 0.5, i.e., complete independence (Haldane 1919), to simulate linkage equilibrium between 5% of the QTLs and their flanking markers. In addition, in every case, two adjacent QTLs were in complete linkage. For each trait, the phenotypic value for each of four replications of each plant was obtained from QU-GENE by setting the per-plot heritability of $T_1$, $T_2$, $T_3$, and $T_4$ at 0.4, 0.6, 0.6, and 0.8 respectively.
According to the means of the three traits, the first estimated LPSI value was obtained as [...] This estimation procedure is valid for any number of genotypes. Table 2.3 presents the 20 genotypes ranked by the estimated LPSI values. Note that if we use 20% selection intensity for Table 2.2 data, we should select genotypes 12, 18, 1, 6, and 10, because their estimated LPSI values are higher than the remaining LPSI values for that set of genotypes. Using the idea described in Fig. 2.4, genotypes 12, 18, 1, 6, and 10 should be in the red zone, whereas the rest of the genotypes are in the white zone and should be culled. Here, the proportion selected is $q = 0.2$ and [...]
Table 2.4 presents 25 genotypes and the means of the three traits obtained from the 500 simulated genotypes for cycle C1 and ranked by the estimated LPSI values. In this case, we used 5% selection intensity ($k_I = 2.063$). Also, the last four rows in Table 2.4 give:
1. The means of traits $T_1$, $T_2$, and $T_3$ (175.46, 39.26, and 38.83 respectively) of the selected individuals and the mean of the selected LPSI values (97.84).
2. The means of the three traits in the base population (161.88, 45.19, and 34.39) and the mean of the LPSI values in the base population (79.18).
3. The selection differentials for the three traits (13.58, $-5.92$, and 4.44) and the selection differential for the LPSI (18.66).
4. The LPSI expected genetic gain per trait (9.51, $-5.48$, and 4.22) and the LPSI selection response (19.21).
The standard deviation of the estimated selection index $\hat{I}$ for the 500 genotypes, obtained from its estimated variance, was 9.312. The selection intensity was 5%.
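As a quick consistency check, the reported LPSI selection response can be compared with the product of the selection intensity and the standard deviation of the estimated index. The sketch assumes that the estimated response is computed as $k_I$ times that standard deviation, which is the usual LPSI expression; the variable names are illustrative.

```python
k_I = 2.063       # selection intensity for 5% selection, from the text
sd_index = 9.312  # standard deviation of the estimated LPSI values

response = k_I * sd_index       # estimated LPSI selection response
variance_index = sd_index ** 2  # variance implied by the reported SD

print(f"selection response: {response:.2f}")
print(f"implied variance:   {variance_index:.2f}")
```

The product agrees with the selection response of 19.21 listed for Table 2.4.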

LPSI Efficiency Versus Base Index Efficiency
The estimated correlation between the LPSI and the net genetic merit ($\hat{\rho}_{HI}$) was compared with the estimated correlation between the base index and the net genetic merit, $\hat{\rho}_{HI_B} = 0.875$; thus $\hat{\lambda} = \hat{\rho}_{HI} / \hat{\rho}_{HI_B} = 1.0217$ and, by Eq. (2.21), the LPSI efficiency was only 2.2% higher than the base index efficiency for this data set.
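The arithmetic behind this comparison can be sketched as follows. The code assumes that Eq. (2.21) expresses relative efficiency as $100(\hat{\lambda} - 1)$, which is consistent with the 2.2% figure but is an assumption here; it also back-calculates the LPSI correlation implied by the two reported numbers rather than taking it from the source.

```python
rho_base = 0.875  # reported correlation between the base index and H
lam = 1.0217      # reported ratio of the two correlations

# Correlation between the LPSI and H implied by the two values above.
rho_lpsi = lam * rho_base

# Relative efficiency in percent, under the assumed form of Eq. (2.21).
efficiency_pct = 100.0 * (lam - 1.0)

print(f"implied LPSI correlation: {rho_lpsi:.3f}")
print(f"relative efficiency:      {efficiency_pct:.2f}%")
```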
Using the same data set described in Sect. 2.8.1 of this chapter, we conducted seven selection cycles (C1 to C7) for the four traits ($T_1$, $T_2$, $T_3$, and $T_4$) using the LPSI and the BLPSI. These results are presented in Table 2.5. To compare the LPSI efficiency versus BLPSI efficiency, we obtained the true selection response of the simulated data (second column in Table 2.5) and we estimated the LPSI and BLPSI selection response for each selection cycle (third column in Table 2.5); in addition, we estimated the LPSI and BLPSI expected genetic gain per trait for each selection cycle (columns 4 to 7 in Table 2.5). The first part of Table 2.5 shows the true selection response and the estimated values of the LPSI selection response and expected genetic gain per trait. In a similar manner, the second part of Table 2.5 shows the true selection response and the estimated values of the BLPSI selection response and expected genetic gain per trait. From the Table 2.5 results and those presented in Fig. 2.6, we can conclude that the LPSI was more efficient than the BLPSI for this data set. Finally, additional results can be seen in Chap. 10, where the LPSI was simulated for many selection cycles. Chapter 11 describes RIndSel, a program that uses R and the selection index theory to perform selection.

The LPSI and Its Relationship with the Quadratic Phenotypic Selection Index
In the nonlinear selection index theory, the net genetic merit and the index are both nonlinear. There are many types of nonlinear indices; Goddard (1983) and Weller et al. (1996) have reviewed the general theory of nonlinear selection indices. In this chapter, we describe only the simplest of them: the quadratic index developed mainly by Wilton et al. (1968), Wilton (1968), and Wilton and Van Vleck (1969), which is related to the LPSI.

The Quadratic Nonlinear Net Genetic Merit
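The most common form of writing the quadratic net genetic merit is
\[ H_q = \alpha + \mathbf{w}'(\boldsymbol{\mu} + \mathbf{g}) + (\boldsymbol{\mu} + \mathbf{g})'\mathbf{A}(\boldsymbol{\mu} + \mathbf{g}), \tag{2.25} \]
where $\alpha$ is a constant, $\mathbf{g}$ is the vector of breeding values, which has normal distribution with zero mean and covariance matrix $\mathbf{G}$, $\boldsymbol{\mu}$ is the vector of population means, and $\mathbf{w}$ is a vector of economic weights. In addition, matrix $\mathbf{A}$ can be written as
\[ \mathbf{A} = \begin{bmatrix} w_1 & 0.5w_{12} & \cdots & 0.5w_{1t} \\ 0.5w_{12} & w_2 & \cdots & 0.5w_{2t} \\ \vdots & \vdots & \ddots & \vdots \\ 0.5w_{1t} & 0.5w_{2t} & \cdots & w_t \end{bmatrix}. \]
In matrix $\mathbf{A}$, the $i$th diagonal value $w_i$ ($i = 1, 2, \ldots, t$) is the relative economic weight of the genetic value of the squared trait $i$ and $w_{ij}$ ($i, j = 1, 2, \ldots, t$) is the economic weight of the cross products between the genetic values of traits $i$ and $j$. The main difference between the linear net genetic merit (Eq. 2.2) and the net quadratic merit (Eq. 2.25) is that the latter depends on $\boldsymbol{\mu}$ and $(\boldsymbol{\mu} + \mathbf{g})'\mathbf{A}(\boldsymbol{\mu} + \mathbf{g})$.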
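To make the structure of the quadratic merit concrete, the following sketch evaluates it for two traits. It assumes the form $H_q = \alpha + \mathbf{w}'(\boldsymbol{\mu} + \mathbf{g}) + (\boldsymbol{\mu} + \mathbf{g})'\mathbf{A}(\boldsymbol{\mu} + \mathbf{g})$, which is consistent with the terms named in the text. All numerical values and function names are hypothetical, and the diagonal weights of $\mathbf{A}$ are passed separately from the linear weights for clarity.

```python
def build_symmetric(diagonal, cross):
    """Matrix with diagonal[i] on the diagonal and 0.5*cross[(i, j)] off it."""
    t = len(diagonal)
    matrix = [[0.0] * t for _ in range(t)]
    for i in range(t):
        matrix[i][i] = diagonal[i]
    for (i, j), weight in cross.items():
        matrix[i][j] = matrix[j][i] = 0.5 * weight
    return matrix

def quadratic_form(x, matrix):
    """Return x' M x for a vector x and a square matrix M."""
    n = len(x)
    return sum(x[i] * matrix[i][j] * x[j] for i in range(n) for j in range(n))

def quadratic_merit(alpha, w, A, mu, g):
    """H_q = alpha + w'(mu + g) + (mu + g)' A (mu + g)."""
    v = [m + gi for m, gi in zip(mu, g)]
    linear = sum(wi * vi for wi, vi in zip(w, v))
    return alpha + linear + quadratic_form(v, A)

# Hypothetical two-trait values, chosen only for illustration.
w = [1.0, -0.5]                      # linear economic weights
A = build_symmetric(diagonal=[0.2, 0.1], cross={(0, 1): 0.4})
mu = [0.0, 0.0]                      # population means
g = [1.0, 2.0]                       # breeding values

H_q = quadratic_merit(alpha=0.0, w=w, A=A, mu=mu, g=g)
print(H_q)
```

Halving the cross-product weights when filling the off-diagonal entries means each pair of traits contributes $w_{ij} g_i g_j$ once in total, because the symmetric matrix counts the pair twice.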

The Quadratic Index
The quadratic phenotypic selection index is
\[ I_q = \beta + \mathbf{b}'\mathbf{y} + \mathbf{y}'\mathbf{B}\mathbf{y}, \]
where $\beta$ is a constant, $\mathbf{y}$ is the vector of phenotypic values, which has multivariate normal distribution with zero mean and covariance matrix $\mathbf{P}$, $\mathbf{b}' = [\,b_1 \;\; b_2 \;\; \cdots \;\; b_t\,]$ is a vector of coefficients, and
\[ \mathbf{B} = \begin{bmatrix} b_1 & 0.5b_{12} & \cdots & 0.5b_{1t} \\ 0.5b_{12} & b_2 & \cdots & 0.5b_{2t} \\ \vdots & \vdots & \ddots & \vdots \\ 0.5b_{1t} & 0.5b_{2t} & \cdots & b_t \end{bmatrix}. \]
In matrix $\mathbf{B}$, the $i$th diagonal value $b_i$ ($i = 1, 2, \ldots, t$) is the index weight for the squared phenotypic value of trait $i$ and $b_{ij}$ ($i, j = 1, 2, \ldots, t$) is the index weight for the cross product between the phenotypic values of traits $i$ and $j$.

The Vector and the Matrix of Coefficients of the Quadratic Index
As we saw in Sect. 2.3.2 of this chapter, to obtain the vector ($\mathbf{b}$) and the matrix ($\mathbf{B}$) of coefficients of the quadratic index that maximize the selection response, we can minimize the expectation of the squared difference between the quadratic index ($I_q$) and the quadratic net genetic merit ($H_q$), $\Phi = E\{[I_q - E(I_q)] - [H_q - E(H_q)]\}^2$, or we can maximize the correlation between $I_q$ and $H_q$, i.e.,
\[ \rho_{H_q I_q} = \frac{Cov(H_q, I_q)}{\sqrt{Var(I_q)}\,\sqrt{Var(H_q)}}, \]
where $Cov(H_q, I_q)$ is the covariance between $I_q$ and $H_q$, $\sqrt{Var(I_q)}$ is the standard deviation of $I_q$, and $\sqrt{Var(H_q)}$ is the standard deviation of $H_q$. In this context, it is easier to maximize $\rho_{H_q I_q}$ than to minimize $\Phi$. Vandepitte (1972) minimized $\Phi$, but in this section we shall maximize $\rho_{H_q I_q}$.
Suppose that $\boldsymbol{\mu} = \mathbf{0}$. Because $\alpha$ and $\beta$ are constants that do not affect $\rho_{H_q I_q}$, we can write $I_q$ and $H_q$ as $I_q = \mathbf{b}'\mathbf{y} + \mathbf{y}'\mathbf{B}\mathbf{y}$ and $H_q = \mathbf{w}'\mathbf{g} + \mathbf{g}'\mathbf{A}\mathbf{g}$. Thus, under the assumption that $\mathbf{y}$ and $\mathbf{g}$ have multivariate normal distribution with mean $\mathbf{0}$ and covariance matrices $\mathbf{P}$ and $\mathbf{G}$, respectively, $E(I_q) = tr(\mathbf{BP})$ and $E(H_q) = tr(\mathbf{AG})$ are the expectations of $I_q$ and $H_q$, whereas $Var(I_q) = \mathbf{b}'\mathbf{P}\mathbf{b} + 2tr[(\mathbf{BP})^2]$ and $Var(H_q) = \mathbf{w}'\mathbf{G}\mathbf{w} + 2tr[(\mathbf{AG})^2]$ are the variances of $I_q$ and $H_q$, respectively. The covariance between $I_q$ and $H_q$ is $Cov(H_q, I_q) = \mathbf{w}'\mathbf{G}\mathbf{b} + 2tr(\mathbf{BGAG})$ (Vandepitte 1972), where $tr(\circ)$ denotes the trace function of matrices.
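The expectations, variances, covariance, and correlation stated above are straightforward to evaluate numerically. The sketch below implements those expressions in plain Python for small matrices; the helper names and all numerical values are hypothetical. As a sanity check, when $\mathbf{P} = \mathbf{G}$, $\mathbf{b} = \mathbf{w}$, and $\mathbf{B} = \mathbf{A}$, the index coincides with the merit and the correlation should equal 1.

```python
import math

def matmul(X, Y):
    """Product of two matrices stored as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def trace(X):
    return sum(X[i][i] for i in range(len(X)))

def bilinear(x, M, y):
    """Return x' M y."""
    return sum(x[i] * M[i][j] * y[j]
               for i in range(len(x)) for j in range(len(y)))

def quadratic_moments(b, B, w, A, P, G):
    """Moments of I_q = b'y + y'By and H_q = w'g + g'Ag (zero means)."""
    BP, AG = matmul(B, P), matmul(A, G)
    var_I = bilinear(b, P, b) + 2.0 * trace(matmul(BP, BP))
    var_H = bilinear(w, G, w) + 2.0 * trace(matmul(AG, AG))
    cov_HI = bilinear(w, G, b) + 2.0 * trace(matmul(matmul(B, G), AG))
    return {
        "E_I": trace(BP), "E_H": trace(AG),
        "var_I": var_I, "var_H": var_H, "cov_HI": cov_HI,
        "rho": cov_HI / math.sqrt(var_I * var_H),
    }

# Hypothetical two-trait inputs; P equals G plus a residual covariance.
G = [[1.0, 0.3], [0.3, 0.5]]
P = [[2.0, 0.4], [0.4, 1.5]]
w = [1.0, -0.5]
A = [[0.2, 0.2], [0.2, 0.1]]

# Using the economic weights directly as index weights (not optimal).
naive = quadratic_moments(b=w, B=A, w=w, A=A, P=P, G=G)
# Limiting case with no environmental variance: index equals merit.
perfect = quadratic_moments(b=w, B=A, w=w, A=A, P=G, G=G)

print(naive["rho"], perfect["rho"])
```

Comparing the two calls shows how environmental variance in $\mathbf{P}$ lowers the correlation when the weights are not adjusted, which is the motivation for choosing $\mathbf{b}$ and $\mathbf{B}$ to maximize it.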
According to the foregoing results, we can maximize the natural logarithm of $\rho_{H_q I_q}$, $\ln \rho_{H_q I_q}$, with respect to vector $\mathbf{b}$ and matrix $\mathbf{B}$, assuming that $\mathbf{w}$, $\mathbf{A}$, $\mathbf{P}$, and $\mathbf{G}$ are known. Hence, except for two proportional constants that do not affect the