Background

A sufficiently large reference population is essential in genomic selection (GS) [1]-[3]. For dairy cattle, in almost all countries with a developed dairy industry, thousands of progeny-tested bulls with highly reliable estimated breeding values (EBV) are used to form the national reference population. However, constituting such a reference population is not feasible in some countries, e.g. China, where the number of bulls with highly reliable EBV is limited. As an alternative, cows can be used to form the reference population. Ding et al. [4] investigated the accuracy of genomic prediction using a reference population consisting of cows and showed that genomic selection using cows is feasible. However, because cow EBV are generally less reliable than bull EBV, a larger reference population of cows was required to reach accuracies of genomic prediction comparable to those obtained with progeny-tested bulls [4]. Further efforts are needed to improve the accuracy of genomic prediction in such a situation.

The term "one-step blending" was used to distinguish this approach, which uses de-regressed proofs (DRP) instead of raw phenotypes, from the original single-step approach [5]. In the present study, we investigated the possible improvements in the accuracy of genomic prediction obtained by applying the one-step blending approach to Chinese Holsteins, for which the reference population consists primarily of cows. We also investigated how the relationship between the non-genotyped animals and the genotyped selection candidates influences the prediction accuracy of one-step blending.

Methods

Data

The data consisted of 4917 Chinese Holstein cows born from 1998 to 2009 and 240 progeny-tested bulls born from 1984 to 2005, all of which had official EBV on five milk production traits (milk yield, fat yield, fat percentage, protein yield, and protein percentage). These official EBV were obtained based on a multiple-trait random regression test-day model [6]. DRP of all animals were derived from their EBV according to VanRaden and Wiggans [7] and used as response variables for genomic prediction. Reliabilities of the DRP were calculated according to Liu et al. [8]. All animals had reliabilities of DRP greater than 0.40 (for cows) or 0.80 (for bulls). Out of the 4917 cows, 4106 born before 2008, together with the 240 bulls, were taken as the reference population, and the remaining 811 cows born in or after 2008 were used as the validation population.

All individuals in the reference and validation populations were genotyped with the Illumina BovineSNP50 BeadChip (Illumina, San Diego, CA). Missing genotypes of single nucleotide polymorphisms (SNPs) with known chromosomal positions were imputed by BEAGLE [9], and those with unknown chromosomal positions were discarded. After imputation, SNPs with minor allele frequency (MAF) less than 0.01 were removed, leaving 46 422 SNPs for genomic prediction.
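The MAF filter described above can be sketched as follows. This is a minimal illustration, not the pipeline used in the study: it assumes genotypes are already imputed and coded as 0/1/2 copies of one allele in an animals-by-SNPs array, and the function name `filter_by_maf` is hypothetical.

```python
import numpy as np

def filter_by_maf(genotypes, min_maf=0.01):
    """Drop SNPs whose minor allele frequency is below min_maf.

    genotypes: animals x SNPs array coded 0/1/2 (assumed coding, no missing values).
    Returns the filtered genotype matrix and the boolean mask of retained SNPs.
    """
    genotypes = np.asarray(genotypes, dtype=float)
    # Frequency of the counted allele at each SNP (two alleles per animal).
    freq = genotypes.mean(axis=0) / 2.0
    # The minor allele is whichever of the two alleles is rarer.
    maf = np.minimum(freq, 1.0 - freq)
    keep = maf >= min_maf
    return genotypes[:, keep], keep
```

With the threshold of 0.01 used in the text, a monomorphic SNP (MAF of 0) is removed, while a SNP with both alleles at frequency 0.5 is retained.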

To implement the one-step blending approach, all non-genotyped dams and half-sisters of the validation cows that had DRP with reliabilities greater than 0.40 were considered. Of the 811 validation cows, 425 had non-genotyped dams (424 in total) and all had non-genotyped half-sisters (17 085 in total, ranging from 154 to 2672).

Blood samples were collected from Chinese Holstein cattle when the regular quarantine inspection of the farms was conducted. The procedure for collecting the blood samples was carried out in strict accordance with the protocol approved by the Animal Welfare Committee of China Agricultural University (Permit Number: DK996).

Statistical models

Three methods, GBLUP, the original one-step blending and the adjusted one-step blending, were implemented for genomic prediction of animals in the validation population.

GBLUP

The following genomic BLUP model [10] was used to predict genomic breeding values:

y=1μ+Zg+e,

where y is the vector of DRP of the reference animals; g is the vector of additive genetic effects, assumed to follow a normal distribution $N(0, G\sigma_g^2)$, with G being the genomic relationship matrix constructed using the first method of VanRaden [10]; and e is the vector of random errors, assumed to follow a normal distribution $N(0, D\sigma_e^2)$, with D being a diagonal matrix with $d_{ii} = 1/w_i$, where $w_i = r_{DRP_i}^2 / (1 - r_{DRP_i}^2)$ and $r_{DRP_i}^2$ is the reliability of the DRP of individual i [10],[11]. The estimates of g based on this model are termed direct genomic breeding values (DGV).
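The GBLUP model can be illustrated with a short sketch. It is not the DMU implementation used in the study, and several points are assumptions: genotypes are coded 0/1/2, allele frequencies are taken from the genotyped sample itself, the variance components `var_g` and `var_e` are passed in as known values (the study estimates them with AI-REML), and the function names are hypothetical. To avoid inverting G, the sketch solves the algebraically equivalent form $\hat{g} = \sigma_g^2 G Z' V^{-1}(y - 1\hat{\mu})$ with $V = ZGZ'\sigma_g^2 + D\sigma_e^2$ rather than the mixed model equations.

```python
import numpy as np

def vanraden_g(genotypes):
    """Genomic relationship matrix (VanRaden, first method) from 0/1/2 genotypes."""
    m = np.asarray(genotypes, dtype=float)
    p = m.mean(axis=0) / 2.0            # allele frequencies from the sample (assumption)
    z = m - 2.0 * p                     # centre each SNP column
    return z @ z.T / (2.0 * np.sum(p * (1.0 - p)))

def gblup(y, rel_drp, G, ref_idx, var_g, var_e):
    """Predict g for every animal in G from the DRP of the reference animals.

    y, rel_drp: DRP and DRP reliabilities of the reference animals.
    ref_idx:    positions of those reference animals within G.
    """
    y = np.asarray(y, dtype=float)
    rel = np.asarray(rel_drp, dtype=float)
    w = rel / (1.0 - rel)               # weights w_i = r^2 / (1 - r^2)
    d = 1.0 / w                         # diagonal of D
    V = var_g * G[np.ix_(ref_idx, ref_idx)] + var_e * np.diag(d)
    ones = np.ones_like(y)
    # Generalised least squares estimate of the overall mean.
    mu = (ones @ np.linalg.solve(V, y)) / (ones @ np.linalg.solve(V, ones))
    # BLUP of g for reference and validation animals alike.
    g_hat = var_g * G[:, ref_idx] @ np.linalg.solve(V, y - mu * ones)
    return mu, g_hat
```

Validation animals receive a prediction only through their genomic relationships with the reference animals, i.e. the rows of G that are not listed in `ref_idx`.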

Original one-step blending

Following Legarra et al. [12], Aguilar et al. [13], and Christensen and Lund [14], the one-step blending method has the same model as GBLUP, except that the vector y also contains the DRP of non-genotyped animals and vector g is assumed to follow a normal distribution $N(0, H\sigma_g^2)$, where H is defined as:

$$H = \begin{bmatrix} A_{11} + A_{12}A_{22}^{-1}(G - A_{22})A_{22}^{-1}A_{12}' & A_{12}A_{22}^{-1}G \\ GA_{22}^{-1}A_{12}' & G \end{bmatrix},$$

where $A_{11}$, $A_{12}$, and $A_{22}$ are sub-matrices of A (the pedigree-based relationship matrix), and subscripts 1 and 2 refer to non-genotyped and genotyped animals, respectively. The estimates of g based on this model are termed genomic enhanced breeding values (GEBV).
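The construction of H can be sketched block by block. This is an illustration only: it assumes A is a dense pedigree relationship matrix for all animals, that the rows of G follow the order given in `geno_idx`, and that $A_{22}$ is invertible; the function name `build_h` is hypothetical.

```python
import numpy as np

def build_h(A, G, geno_idx):
    """Combine pedigree (A) and genomic (G) relationships into H.

    geno_idx: positions of the genotyped animals within A, in the same
              order as the rows and columns of G (assumed).
    H is returned in the original animal order of A.
    """
    A = np.asarray(A, dtype=float)
    G = np.asarray(G, dtype=float)
    n = A.shape[0]
    geno = np.asarray(geno_idx)
    non = np.setdiff1d(np.arange(n), geno)     # non-genotyped animals (subscript 1)
    A11 = A[np.ix_(non, non)]
    A12 = A[np.ix_(non, geno)]
    A22 = A[np.ix_(geno, geno)]
    # B = A12 A22^{-1}; A22 is symmetric, so solve with the transpose.
    B = np.linalg.solve(A22, A12.T).T
    H = np.empty((n, n))
    H[np.ix_(non, non)] = A11 + B @ (G - A22) @ B.T
    H[np.ix_(non, geno)] = B @ G
    H[np.ix_(geno, non)] = G @ B.T
    H[np.ix_(geno, geno)] = G
    return H
```

A useful sanity check is that H reduces to A when G equals $A_{22}$, since the correction term $G - A_{22}$ then vanishes.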

Adjusted one-step blending

To avoid the potential incompatibility in scale between the coefficients of G and A22 involved in the H matrix, which could lead to incorrect weighting of the pedigree and genomic information, as pointed out by Forni et al. [15], the G matrix was adjusted following Gao et al. [16], i.e.,

$$G_a = G\beta + \alpha,$$

where β and α are obtained from the following equations:

$$\mathrm{Avg}(\mathrm{diag}(G))\,\beta + \alpha = \mathrm{Avg}(\mathrm{diag}(A_{22})),$$
$$\mathrm{Avg}(\mathrm{offdiag}(G))\,\beta + \alpha = \mathrm{Avg}(\mathrm{offdiag}(A_{22})),$$

where Avg(diag(*)) is the average of the diagonal elements of matrix * and Avg(offdiag(*)) is the average of the off-diagonal elements of matrix *.
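The two equations are linear in β and α, so they can be solved in closed form: subtracting the second from the first gives β, and α follows by substitution. A minimal sketch, with the hypothetical function name `adjust_g` and the assumption that G and $A_{22}$ list the genotyped animals in the same order:

```python
import numpy as np

def adjust_g(G, A22):
    """Rescale G so its average diagonal and off-diagonal match those of A22.

    Returns the adjusted matrix G_a = beta * G + alpha, plus beta and alpha.
    """
    G = np.asarray(G, dtype=float)
    A22 = np.asarray(A22, dtype=float)
    off = ~np.eye(G.shape[0], dtype=bool)       # mask of off-diagonal elements
    g_diag, g_off = np.mean(np.diag(G)), np.mean(G[off])
    a_diag, a_off = np.mean(np.diag(A22)), np.mean(A22[off])
    # Difference of the two equations eliminates alpha.
    beta = (a_diag - a_off) / (g_diag - g_off)
    alpha = a_diag - beta * g_diag
    return beta * G + alpha, beta, alpha
```

When β is close to 1 and α is close to 0, the adjusted matrix is nearly identical to G, so the original and adjusted one-step blending are expected to give similar predictions.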

The variance components $\sigma_g^2$ and $\sigma_e^2$ involved in the three models were estimated using AI-REML, as implemented in the software DMU [17].

Evaluation of the accuracy of genomic prediction

The accuracy of genomic predictions was evaluated as $r_v = r_{\hat{g},DRP} / r_{DRP}$ [5], where $r_{\hat{g},DRP}$ is the correlation between the estimated g (DGV or GEBV) and the DRP in the validation population, and $r_{DRP}$ is the average of the square root of the reliability of the DRP of the validation cows.
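The validation accuracy can be computed directly from three equal-length lists. The sketch below uses only the standard library; the function name `validation_accuracy` and its argument names are hypothetical.

```python
import math

def validation_accuracy(g_hat, drp, rel_drp):
    """r_v: correlation between predictions and DRP, divided by the
    average square root of the DRP reliabilities of the validation animals."""
    n = len(drp)
    mean_g = sum(g_hat) / n
    mean_d = sum(drp) / n
    # Pearson correlation between predicted values and DRP.
    cov = sum((g - mean_g) * (d - mean_d) for g, d in zip(g_hat, drp))
    ss_g = sum((g - mean_g) ** 2 for g in g_hat)
    ss_d = sum((d - mean_d) ** 2 for d in drp)
    corr = cov / math.sqrt(ss_g * ss_d)
    # Average accuracy (square root of reliability) of the DRP.
    mean_sqrt_rel = sum(math.sqrt(r) for r in rel_drp) / n
    return corr / mean_sqrt_rel
```

Dividing by the average accuracy of the DRP corrects for the fact that DRP are themselves imperfect measures of the true breeding values, so $r_v$ can exceed the raw correlation.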

In addition, the theoretical accuracy of the DGV or GEBV was calculated for each individual in the same way as in conventional BLUP, following Henderson [18], from the diagonal elements of the inverse of the coefficient matrix of the mixed model equations (MME). The average theoretical accuracy over the validation animals was also used to evaluate the accuracy of genomic predictions.
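One common way to write this calculation is given below. It is a sketch under two assumptions that the text does not state: the MME are scaled by the residual variance, so that the prediction error variance is the diagonal element $C^{ii}$ of the inverse coefficient matrix multiplied by $\sigma_e^2$, and the denominator ignores the diagonal of the relationship matrix (i.e. inbreeding).

```latex
\mathrm{PEV}_i = C^{ii}\,\sigma_e^2,
\qquad
r_i = \sqrt{1 - \frac{\mathrm{PEV}_i}{\sigma_g^2}}
```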

Results and discussion

As shown in Table 1, for the 811 validation cows and averaged over the five traits, $r_v$ and the average theoretical accuracy from the original one-step blending were 0.12 and 0.02 higher, respectively, than those from GBLUP. Accuracies from the adjusted one-step blending approach were almost the same as those from the original one-step blending. Theoretical accuracies were much higher than $r_v$, which was also observed in other studies [3],[19]-[21]. The theoretical accuracy may also be overestimated owing to sampling errors in the elements of the genomic relationship matrix, as pointed out by Goddard et al. [22]. Compared with GBLUP, the one-step blending approach can considerably improve the accuracy of genomic prediction by incorporating the phenotypes (DRP) of non-genotyped relatives of the selection candidates. However, the adjusted one-step blending did not further improve accuracy compared with the original one-step blending, probably because the G matrix was barely changed by the adjustment in our situation: the estimates of β and α were 0.992 (close to 1) and 0.017 (close to 0), respectively, whereas they were 0.859 and 0.298 in the study of Christensen et al. [23]. Similar results were observed by Gao et al. [16] in the Nordic Holstein population, where the adjusted one-step blending resulted in little improvement in prediction accuracy and the estimates of β and α were 0.976 and 0.085, respectively.

Table 1 Accuracies of genomic prediction for the validation cows

Among the 811 validation cows, 425 had both non-genotyped dams and non-genotyped half-sisters, while the 386 cows with genotyped dams had only non-genotyped half-sisters. For validation cows with genotyped dams, $r_v$ and the theoretical accuracies obtained from both one-step blending approaches were nearly the same as those from GBLUP (Table 1). In contrast, for validation cows with both non-genotyped dams and half-sisters, the one-step blending approach improved $r_v$ by 15 to 26 percentage points and the theoretical accuracy by 1 to 3 percentage points (Table 1). Again, in all these cases, the adjusted one-step blending did not perform better than the original one-step blending. These results suggest that the improvements in accuracy from the one-step blending approach over GBLUP were almost entirely due to the non-genotyped dams. To verify this, we discarded all non-genotyped half-sisters and included only the non-genotyped dams of the 425 validation cows in the one-step blending approach. As expected, $r_v$ and the theoretical accuracies of the 425 validation cows from the original one-step blending approach (Table 2) were almost the same as in the scenario in which both non-genotyped dams and half-sisters were included (Table 1). The reason is that all non-genotyped half-sisters were daughters of 19 genotyped sires in the reference population, and the information from these daughters was already part of the DRP of the sires. Therefore, these half-sisters contributed little extra information for genomic prediction.

Table 2 Accuracies of genomic prediction for 425 validation cows when their non-genotyped dams and not their non-genotyped half-sisters were used in the one-step blending approach

Conclusions

Averaged over the five milk production traits, both one-step blending methods increased $r_v$ and the average theoretical accuracy by about 0.12 and 0.02, respectively, compared with GBLUP. However, the adjusted one-step blending did not perform better than the original one-step blending in our situation, and the improvements in accuracy from both one-step blending approaches were almost entirely due to the non-genotyped dams of the validation animals.