Mutations of large effects often show non-additive effects on the phenotype such as dominance and epistasis and one well-known example is the coat colour of mice [1]. However, it is uncertain how important these non-additive effects are for polymorphisms that control variation in complex or quantitative traits. Hill et al. [2] argued that even if gene effects are not additive, most of the genetic variance is still expected to be additive variance. Several studies have estimated epistatic or dominance variance in livestock using traditional pedigree information (e.g., [3-7]) and reported a small but significant non-additive variance. However, it is difficult to estimate non-additive variance because it is often, at least partially, confounded with other effects such as common environment or maternal effects. Consequently, estimates of non-additive variance may be biased upwards. In view of this, it is not surprising that most genetic evaluation systems use an additive model and ignore non-additive effects, especially considering that their aim is to estimate breeding values or additive genetic values.

Genome-wide dense single nucleotide polymorphisms (SNPs) have been widely used in cattle for association studies [8-12] and genomic prediction [13-15] and represent a new opportunity to estimate non-additive effects at individual loci and to estimate non-additive variances. However, most genome-wide association studies (GWAS) have only reported additive effects of SNPs and additive genetic variance. There are a few studies (e.g., [6]) based on high-density SNPs that have reported variance explained by non-additive effects. In a study on growth rate (daily gain) in pigs, Su et al. [6] estimated dominance variance and additive × additive epistatic variance to account for 5.6 and 9.5% of the total phenotypic variance, respectively. It is possible that the proportion of variance that is explained by non-additive effects varies between traits.

Therefore, the objective of this study was to estimate dominance and epistatic effects in beef cattle using high-density SNP genotypes and phenotypic data on growth, carcass and reproductive traits.


Animals, phenotypes and SNP data

Animals originated from nine different populations of three breed types. They included four Bos taurus breeds (Angus, Murray Grey, Shorthorn, Hereford), one Bos indicus breed (Brahman cattle), three composite breeds (Belmont Red, Santa Gertrudis, Tropical composites), and one recent Brahman cross population (F1 crosses of Brahman with Limousin, Charolais, Angus, Shorthorn, and Hereford). The structure of the populations used is described in Bolormaa et al. [15]. A total of 10 191 animals of the three breed types (3389 B. indicus, 3296 B. taurus, and 3506 B. taurus × B. indicus) with SNP genotypes and phenotypes for at least one trait were used.

Phenotypes for 16 different traits (Table 1), including growth, feed intake, carcass and meat quality and fertility traits were obtained from five sources, which will be referred to as datasets: the Beef Cooperative Research Centre Phase I (CRCI), Phase II (CRCII), Phase III (CRCIII), the Trangie selection lines, and the Durham Shorthorn group (for a detailed description, see [15-17]). All individuals were not measured for all traits. The number of genotyped cattle with each trait in each dataset, trait definitions and abbreviations, means and standard deviations (SD), and heritability estimates are in Table 1 and were obtained from Bolormaa et al. [15,18]. The data for two female fertility traits (age at first detected corpus luteum and post-partum anoestrus interval PPAI) used in this study were adjusted for non-genetic effects based on models described in previous studies [19,20].

Table 1 Number of genotyped animals and number of phenotypes, and mean, standard deviation (SD) and heritability estimates (h 2 ) of each trait for animals with a full set of records 1

Data on 729 068 SNPs were imputed from genotypes from five different SNP panels: (1) the Illumina HD Bovine SNP chip ( that contains 777 963 SNPs; the BovineSNP50K BeadChip (Illumina, San Diego) version 1 (2) and version 2 (3) that contain 54 001 and 54 609 SNPs, respectively; (4) the Illumina SNP7K panel that comprises 6909 SNPs; and (5) the Parallele SNP10K chip (Affymetrix, Santa Clara, CA) with 11 932 SNPs. All SNPs were mapped to the UMD 3.1 assembly of the bovine genome sequence provided by the Centre for Bioinformatics and Computational Biology at University of Maryland (CBCB) ( Procedures for stringent quality control of the SNP data and for imputation to 729 068 SNPs are described in Bolormaa et al. [15]. The genotypes for each SNP were encoded in the top/top Illumina A/B format (

Statistical analyses

Genome-wide association studies by fitting dominance

Model used for GWAS: GWAS was performed using the ASReml software [21] based on the following mixed model:

$$ \mathbf{y}={\mathbf{1}}_{\mathbf{n}}\mu +\mathbf{X}\mathbf{b}+{\mathbf{s}}_{\mathbf{i}}{\alpha}_i+{\mathbf{w}}_{\mathbf{i}}{\delta}_i+\mathbf{Z}\mathbf{a}+\mathbf{e}, $$

where y is the vector of observed phenotypic values of the animals, 1 n is an nx1 vector of 1′s (n is the number of animals with phenotypes), μ is the overall mean, b is a vector of fixed effects, X is a design matrix relating observations to the corresponding fixed effects, Z is a design matrix relating observations to random animal genetic effects, a is a vector of polygenic breeding values sampled from N ~ (0, Aσa 2), where σa 2 is additive genetic variance and A is the additive relationship matrix constructed from the pedigree of the animals and their 5-generation-ancestors; s i is a vector of additive genotype codes of each animal at the i-th SNP, with genotypes AA, AB and BB coded as 0, 1 or 2, respectively, i.e. according to the number of B alleles present, αi is the additive effect of the i-th SNP, w i is a vector of dominance genotype codes at SNPi with the heterozygote AB coded as 1 and the homozygotes AA and BB coded as 0, δi is the dominance effect of i-th SNP, and e is the vector of random residual effects. Vectors s i and w i were fitted as covariates. In this parameterization, significance of the dominance effect was tested after fitting the additive effect. This analysis was carried out once for each SNP, i.e. 729 068 times.

The model included dataset, breed, cohort and sex as fixed effects for all traits, except for the two female fertility traits. Other fixed effects varied by trait, as detailed in [19,22-26]. The fixed effects were fitted as nested within a dataset.

Following Bolormaa et al. [27], the false discovery rate (FDR) of SNP effects was estimated as:

$$ \frac{P\left(1-\frac{A}{T}\right)}{\left(\frac{A}{T}\right)\left(1-P\right)}, $$

where P is the P-value used to declare the additive or dominance effect of a SNP to be significant (e.g., 0.00001), A is the number of SNP effects that were declared significant at the stated P -value and T is the total number of SNPs tested. Although the true FDR cannot exceed 100%, the estimated FDR can exceed 100% if the number of significant SNPs is smaller than expected by chance, but the interpretation is simply that all SNPs discovered are false positives.

Validation of SNPs with dominance effects

In order to validate statistically significant SNP effects in an independent population, the animals in the complete dataset were split into five sets by allocating the offspring of randomly selected sires to one of the five datasets. Then, one of the five sets was used as a validation population and the four other sets as the reference population. In this way, no animal used for validation had paternal half-sibs in the reference population. GWAS for each trait was performed in the reference population. For SNPs with a significant dominance effect in the reference population, the analysis was repeated in the validation population. We counted the number of times that the direction of the estimated SNP effect was the same in the validation and discovery populations.

Dominance variance explained by SNP genotypes

The dominance variance was estimated by restricted maximum likelihood (REML) using the following model:

$$ \mathbf{y}={\mathbf{1}}_{\mathbf{n}}\mu + \mathbf{X}\mathbf{b}+\mathbf{h}\mathbf{e}\mathbf{t}\mathrm{\ss } + \mathbf{g}+\mathbf{d}+\mathbf{e}, $$

where y is the vector of observed phenotypic values of the animals, 1 n is a vector of 1′s, μ is the overall mean, X is a design matrix relating observations to the corresponding fixed effects where the fixed effects were the same as used in the GWAS, b is vector of fixed effects, het is a vector containing the average heterozygosity over all SNPs for each animal, ß is the regression of each trait on heterozygosity, g is a vector of genomic breeding values distributed as N ~ (0, Gσg 2), where σg 2 is additive genetic variance explained by SNPs and G is the genomic relationship matrix (GRM) [28], d is a vector of dominance deviations distributed as N ~ (0, Dσd 2), where σd 2 is dominance variance explained by the SNPs and D is the dominance relationship matrix (DRM), and e is the vector of random residual effects. The GRM was constructed using the genotypes of all 10 191 animals in the combined datasets according to [28]. More details on the GRM used in this study are in [15]. The DRM was derived as described below.

Let the allele frequency of A be q and that of B be p = 1-q. The genotype of the j-th animal at the i-th SNP (AA, AB, and BB) was coded as Hij = −p/q, +1, and –q/p, respectively. Then, in a base population in Hardy-Weinberg equilibrium:

$$ E\left({H}_{ij}\right)=-\frac{p}{q}{q}^2+2pq-\frac{q}{p}{p}^2=0 $$


$$ E\left({H}_{ij}^2\right)=\frac{p^2}{q^2}{q}^2+2pq+\frac{q^2}{p^2}{p}^2=1\ . $$

The additive contribution of the genotype of the j-th animal (Tij) is coded as -2p/ \( \sqrt{2pq} \), (q-p)/ \( \sqrt{2pq} \), and 2q/ \( \sqrt{2pq} \) for AA, BB, and BB genotypes, respectively. Then, the expectation of the covariance between additive and dominance contributions to the genotype is as follows:

$$ \begin{array}{l}E\left({H}_{ij}{T}_{ij}\right)=\frac{-\frac{p}{q}\left(-2p\right){q}^2+1\left(q-p\right)2pq-\frac{q}{p}2q{p}^2}{\sqrt{2pq}}\hfill \\ {}=\frac{2{p}^2q+2p{q}^2-2{p}^2q-2p{q}^2}{\sqrt{2pq}}=0.\hfill \end{array} $$

This shows that allele substitution effects and dominance deviations that are implicit in the model are orthogonal contrasts. Therefore, DRM = HH’/m and GRM = TT’/m, where m is the total number of SNPs, H is a matrix of dimension of the number of animals x number of SNPs, where Hij is present at each position in the matrix; in the same way, T is a matrix containing Tij. With this model, σd 2 is the classical definition of dominance variance. Model (2) is equivalent to a model in which g is replaced by Tu, Var(u) = Iσa 2/m, and d is replaced by Hv, Var(v) = Iσd 2/m where u and v are vectors containing the additive and dominant effects of each SNP.

This definition of the DRM is similar to that used by Vitezica et al. [29], except that in our study, the contributions of each SNP to the DRM was weighted by 1/(2pq). In the analysis of each trait, only the elements of the GRM and DRM corresponding to animals with phenotypes were used to reduce the computing time. The same fixed effects as those used in the GWAS (model 1) were fitted in this model. The estimates of variance components were performed using ASReml software [21].

The significance of the dominance variance was tested by comparing twice the difference in log-likelihood between additive genomic model (AM) and additive and dominance genomic model (ADM) to a chi-squared distribution with degrees of freedom of 1.

Accuracy of predictions of the phenotypic values by including dominance deviations

For validation purposes, predictions of additive and dominance genetic values were made using the best linear unbiased prediction (GBLUP) analysis with variance estimated by REML. A five-fold cross validation scheme was carried out. The data were split into five parts of approximately equal size, by allocating the offspring of each sire to one of the five datasets. In this way no animal used for validation had paternal half sibs in the discovery population. The analysis was performed five times using each dataset in turn as a validation group and the four other sets as the discovery population. Validation animals were included in the GRM and DRM but had missing phenotypes in the calculation of genomic estimated breeding values (GEBV). For each validation population, the genetic and phenotypic value of each animal in the validation population was predicted as \( \widehat{\boldsymbol{g}}=\widehat{\boldsymbol{a}}+\widehat{\boldsymbol{d}} \), where ĝ is a predicted total genetic value, â is the estimated genomic breeding value, and \( \widehat{\boldsymbol{d}} \) is the estimated dominance deviation predicted from ADM, that is model (2) above. Estimates of ĝ were obtained using ASReml software [21]. The phenotypic values for each trait were corrected for all fixed effects as: corrected phenotype = phenotype - fixed effects.

The accuracy of this prediction was calculated as the correlation between ĝ and the corrected phenotype within each breed. Accuracies were reported only when the number of records per breed was greater than 200. When combining accuracies across breeds, the overall accuracy was the mean accuracy within breeds weighted by the number of records of each breed. The accuracy when including dominance effects was compared to the accuracy obtained using the same model but without the dominance effects.

Epistatic interactions between SNPs

Bolormaa et al. [18] identified 28 SNPs, referred to as lead SNPs, that had significant additive effects (P < 10−5) in a multi-trait analysis of the same traits as those used here. To minimise multiple-testing, we tested only additive × additive interactions between each of the 28 lead SNPs and all other 729 068 SNPs. Therefore, for each trait and for each lead SNP, we performed a GWAS in which we fitted the same model as for the dominance GWAS but, instead of a dominance effect, we fitted the additive effects of the lead SNP and one other SNP and the interaction between the two of them. The statistical model used for each trait was:

$$ \mathbf{y}={\mathbf{1}}_{\mathbf{n}}'\mu +\mathbf{X}\mathbf{b}+{\mathbf{l}}_{\mathbf{j}}{u}_j+{\mathbf{s}}_{\mathbf{i}}{a}_i + \left({\mathbf{l}}_{\mathbf{j}}\times {\mathbf{s}}_{\mathbf{i}}\right){m}_{ji}+\mathbf{Z}\mathbf{a}+\mathbf{e}, $$

where y is the vector of observed phenotypic values of the animals, 1 n is a vector of 1′s, μ is the overall mean, b is a vector of fixed effects, X is a design matrix relating observations to the corresponding fixed effects, Z is a design matrix relating observations to the random animal effect, a is the vector of polygenic breeding values, e is the vector of random residual effect, l j is a vector containing the coded genotypes for the j-th lead SNP (lj, j = 1, 2, 3, …, 28), uj is the additive effect of the j-th lead SNP, s i is a vector containing coded genotypes for the i-th SNP (SNPi, i = 1, 2, 3, … , 729 068), ai is the additive effect of the i-th SNP, l j × s i is a vector of the interactions between the lead SNPj and SNPi, and mji is the effect of the interaction between the lead SNPj and SNPi. The effects of the j-th lead SNP, the i-th SNP, and their interaction (l j × s i ) were fitted simultaneously as covariates.


Genome-wide association studies fitting dominance

A Manhattan plot of the –log10(P-values) of SNP dominance effects is in Figure 1. The number of SNPs showing a significant (P < 10−4) dominance effect in the discovery population for each trait is in Table 2. For instance, for post-weaning hip height (PW_hip), 191 SNPs had significant (P < 10−4) dominance effects. Since 729 068 SNPs were tested, 70 significant SNPs are expected by chance alone and therefore the FDR was 36%. Among these 191 SNPs, 174 had a positive and 17 a negative dominance effect. In the validation population, only 185 of the 191 SNPs could be tested for dominance because, for the other six SNPs, not all three genotypes were represented in the validation population. Among the SNPs that had a positive effect in the discovery population, 66% also had a positive effect in the validation population and among those that had a negative effect in the discovery population, 56% also had a negative effect in the validation population. If the significant SNPs in the discovery population were all false positives, we would expect 50% of the effects to be in the same direction in the validation population. Based on a chi-squared test, 66% differs significantly (P < 0.001) from 50% but 56% does not differ significantly from 50%. For the trait, weight measured at feedlot exit (X_lwt), similar results were observed (Table 2).

Figure 1
figure 1

Manhattan plot of the –log 10 ( P -values) of SNP dominance of the whole genome except chromosome X for live weight measured at feedlot exit (X_lwt).

Table 2 Total number of SNPs with significant ( P< 10 −4 ) dominance effects, number of SNPs with positive (+ve) and negative (−ve) dominance effects in the discovery population, number of SNPs tested for validation of dominance effects (Nb tested), and percentage of SNPs with positive and negative dominance effects that had an effect in the same direction in the validation population for each trait

Table 2 shows that for height and weight traits and for percentage of normal sperm in semen, the number of positive dominance effects (d > 0) was greater than the number of negative dominance effects (d < 0). For residual feed intake, age at puberty and postpartum anoestrus interval, there were more negative effects. These results show dominance in the direction one might associate with increased fitness and in the direction of heterosis in crosses between breeds. In fact, the number of significant dominance effects in the direction expected to decrease fitness for these traits was smaller than expected by chance, while the number of effects in the expected direction (corresponding to heterozygote advantage) was greater than expected by chance for all these traits. We did not attempt to validate individual SNP dominance effects because the validation population was not large enough to have sufficient power for this. Instead, we attempted to validate the dominance effects as a group by observing the percentage of effects that had the same direction in the validation population as in the discovery population. For 11 of the 16 traits, the percentage of effects in the same direction was larger than 50%. The most convincing results were obtained for traits that show more dominance effects in the direction that is expected for heterosis and for which statistical power was greatest, i.e. height and weight traits. For other traits, such as fat depth, the FDR was high, the proportion of positive and negative effects was similar and the percentage of effects that had the same direction in the training and validation populations was close to 50%, which suggests that most of these effects were false positives. One exception was marbling measured by CIMF, for which the FDR was only 35%, the number of negative effects was greater and more than 50% of the SNPs had effects in the same direction in the training and validation populations.

Dominance variance

Estimates of genetic variance using the additive genomic model (AM) and the model fitting both additive and dominance genetic effects (additive-dominance genomic model, ADM) are in Table 3 for each trait. The dominance variance was significant (P < 0.05) for PW_hip, PW_lwt, X_lwt, PWIGF, CIMF, CRBY and AGECL (Table 3). Estimates of the dominance variance as a proportion of phenotypic variance (VD/VP) varied from 0 to 0.42 but the standard errors were large in some cases (e.g., SC12 and PNS24). For seven traits, VD/VP was estimated to be 0. Therefore, the average VD/VP across traits was biased upwards but the median, which was 0.05, should not be biased. The only fertility trait that had a significant dominance variance was AGECL (18%). Dominance variance was not significant for both male fertility traits (SC12 and PNS24) and for PPAI (Table 3), which may be due to the low number of records for these traits (Table 2), large environmental variance and low relationships among recorded animals, which results in lack of power to estimate dominance variance.

Table 3 Proportion of genetic variance based on the additive genomic model (AM) and the additive and dominance genomic model (ADM) for each trait

The effect of the overall observed heterozygosity (tHe) for each trait is in Table 3 as a signed t-value. Heterozygosity significantly increased growth traits and SC12 and significantly decreased age at puberty (Table 3). The direction of these effects largely agreed with the preponderance of positive or negative effects at individual SNPs that are reported in Table 2. Dominance variance tended to be significant for traits that have a significant effect of average heterozygosity, such as post-weaning live weight. However, CIMF and CRBY also had significant dominance variance but these traits showed no effect of average heterozygosity. Trait PWIGF had the largest proportion of variance explained by dominance but this estimate had a high standard error due to the small number of records.

Accuracy of prediction of the phenotypic value

The accuracy of prediction of the phenotypic value using a model that included additive and dominance effects was compared to that obtained with an additive model (Table 4). The accuracy was calculated for the nine traits for which the variance explained by dominance deviations was not equal to 0 (Table 3). The average accuracy across traits was equal to 0.22 for both the additive model and the model with dominance deviations (Table 4). No significant improvement from including dominance deviations was observed for any trait.

Table 4 Average weighted accuracies of predicted phenotypic values across breeds for the 5-fold cross-validation populations based on the GBLUP model without dominance (GRM) and with dominance (GRM + DRM)

Epistatic effects

The number of significant (P < 10−5) epistatic interactions between one lead SNP and one of the other remaining 729 067 SNPs was calculated for the 16 traits (Table 5). For example, for PW_lwt, 153 significant (P < 10−5) interactions were found between the lead SNP BTA14_25 (close to the PLAG1 gene) and other SNPs. Two examples of these interactions are in Table 6 and show that the SNPs on chromosome 2 or 5 had a greater effect on weight when the animals carried the A allele at the lead SNP than when they carried the B allele. The minus log P-values of epistatic effects for PW_lwt between lead SNP BTA14_25Mb and all other SNPs are shown in Figure 2. Clusters of significant (P < 10−5) epistatic effects for PW_lwt on BTA2, 5, 8, 9, 19, and 29 were observed (Figure 2). For each cluster, we examined the genes within a region of 50 kb up- and down-stream and, in a few cases, we identified possible candidate genes that could interact with the PLAG1 gene for PW_lwt. These include the genes MBNL1 (muscleblind-like splicing regulator 1) on BTA1, FAT/CD36 (fatty acid translocase/cluster of differentiation 36) on BTA4, GRN (granulin), FASN (fatty acid synthase), and ITG (platelet glycoproteins (A2B and B3)) on BTA19, and INS (insulin) and IGF2 (insulin-like growth factor 2) on BTA29.

Table 5 Number of significant epistatic interactions ( P< 10 −5 ) between each of the 28 lead SNPs and each of the other remaining 729 067 SNPs for each trait
Table 6 Example of significant epistasis of the lead SNP BTA14_25Mb with SNPs BTA2_70715761 and BTA29_50068561) for post-weaning live weight (PW_lwt, kg)
Figure 2
figure 2

Manhattan plot showing the –log 10 ( P -values) of pair-wise epistatic effects for post-weaning live weight (PW_lwt) between lead SNP (BovineHD1400007259 at position 25015640 on BTA 14) and SNPs across the genome, except SNPs on the X chromosome.

Analysis of epistatic effects shows that the number of significant interactions between SNPs was greater than expected by chance and varied widely between traits (Table 5). As described in Bolormaa et al. [18], the first four lead SNPs have an allele that increased mature size (increases height and weight and decreased fatness, RFI and blood concentration of IGF1). The highest number of epistatic effects between these lead SNPs and other SNPs was found for weight (PW_lwt; X_lwt), height (HUMP; PW_hip), fatness (CIMF; CRIB), RFI and PWIGF. In addition, large numbers of significant interactions with other SNPs were found for lead SNPs 5_48Mb and 6_40Mb for LLPF and AGECL, respectively.

Five lead SNPs (BTA7_98Mb; BTA10_92Mb; BTA25_3.7 Mb; BTA26_28Mb; BTA29_45Mb) had an allele that increased meat tenderness and fatness, as described by Bolormaa et al. [18] and showed interactions that had an effect on weight (X_lwt), RFI, fatness and SC12 (Table 5). The next seven lead SNPs (BTA2_25Mb; BTA3_80Mb; BTA6_13Mb; BTA13_35Mb; BTA17_25; BTA19_25Mb; BTA25_15Mb) had an allele that increased both fatness and weight [18] and showed interactions that affected RFI, CIMF and SC12.


Traits related to fitness, including growth rate, are commonly found to show inbreeding depression and heterosis [30,31], which is usually explained by directional dominance at loci that control these traits. Our results are consistent with this. We found an effect of average heterozygosity on traits related to fitness, such that increasing heterozygosity changed the trait in the direction that is presumed to increase fitness, for instance, increased growth rate. In accordance with this, more individual SNPs had dominance effects in the direction presumed to increase fitness than in the reverse direction. However, the FDR associated with these effects was moderate to high, so the evidence for significant dominance effects at individual SNPs was not strong, especially for traits that do not have a clear relationship to fitness, such as fat depth, retail beef yield and tenderness. The FDR of dominance effects (ranging from 26.1 to 109.9%; Table 2) was considerably higher than the FDR of additive effects in the same data (ranging from 1.5 to 21.1%; [15,16]). This suggests that dominance effects are smaller than additive effects and/or more difficult to estimate (note that an estimated FDR greater than 100% occurs when the number of significant SNPs is smaller than expected by chance).

Directional dominance can cause a significant regression of trait phenotype on average heterozygosity, as we observed. In fact, the effect of average heterozygosity on trait phenotype could be considered as an estimate of inbreeding depression based on SNP genotypes. However, significant inbreeding depression can occur without generating much dominance variance if it is due to small dominance deviations at very many loci. The fact that we observed significant dominance variance for some traits implies that some individual loci have moderate dominance effects. This is supported by the finding of significant dominance effects at individual SNPs in the GWAS.

Across traits, the proportion of phenotypic variance that was estimated to be due to dominance varied widely (from 0 to 42%) but much of this variation could be due to large standard errors of the estimates. The median values across traits for the dominance variance as a proportion of total phenotypic variance is 0.05 compared with a median value for the additive variance which was 0.4 of the total phenotypic variance. The results for post-partum anoestrous interval (PPAI) are somewhat surprising for a fertility trait for the following reasons: (1) the heritability for PPAI was equal to 0.39, which is higher than many heritability estimates of fertility traits in beef cattle [32]; (2) there was weak evidence at best for dominance effects of individual SNPs; (3) there was no significant effect of average heterozygosity and (4) the estimate of dominance variance was close to zero. Fertility traits are usually considered to be fitness traits and the expectation is therefore that non-additive effects are important. The results reported here may indicate lack of power in this subset of the data but it could also be that an intermediate value of PPAI is optimal for fitness.

The model that we used to estimate dominance variance is similar to that of Vitezica et al. [29], except that SNPs were weighted differently to calculate the DRM and we also fitted the average heterozygosity. Variation in average heterozygosity is equivalent to variation in inbreeding in a classical model and thus, we accounted for the effect of this on the phenotype. Our assumptions that E(d) = 0 and cov(a,d) = 0 hold after correcting for average heterozygosity (inbreeding). If average heterozygosity is not included in the model, then the average effects of inbreeding contribute to the estimate of dominance variance. The usual definition of dominance variance (e.g., [33]) does not include the effect of variation in inbreeding and we have followed that practice.

There are a few publications (e.g., [3-5,34]) that have reported estimates of the non-additive genetic variance based on pedigree information in cattle. Using an animal model, Rodriguez-Almeida et al. [34] estimated the dominance variance at 28% and 11% of total phenotypic variance for weaning weight and hip height, respectively, while we obtained lower estimates for weaning live weight (11%) and a hip height (4%). In dairy cattle, estimates of dominance variance for milk, fat, and protein yields have been reported by many authors [3,35-37], with values ranging from 3 to 24% of total phenotypic variance. Palucci et al. [4] estimated the dominance variance for fertility traits such as non-return rate (0.6% of total phenotypic variance) and calving to first service (7.3% of total phenotypic variance) in Canadian dairy cows. Our estimates of dominance variance as a proportion of total genetic variance were also low for fertility traits, but it should be noted that there was insufficient power due to the limited number of records for these traits.

Su et al. [6] estimated non-additive genetic variance for average daily gain in pigs using high-density SNPs and reported that 5.6 and 9.5% of total phenotypic variance was explained by dominance and additive by additive variance, respectively, which is similar to our median estimate of 5%. However, they did not use the classical definition of dominance variance and consequently their dominance variance might be slightly over-estimated.

We used a set of animals that originated from multiple breeds. The extension of the traditional model with additive and dominance variance components to multiple breeds has been done in two ways. Either the breeds are regarded as separate populations (e.g., [38]) or the breeds are regarded as partially inbred lines within a larger population [33]. Both formulations result in the definition of multiple variance components. For instance, the model with inbreeding [33] has five variance components, as well as the regression on average heterozygosity. Consequently, it is not possible to obtain accurate estimates of all parameters using real data and thus, these models are seldom used. The model that we used considered partially inbred lines within a larger population but the variances were defined in terms of SNP genotype effects. The model is best understood from a model that is equivalent to model (2), in which the breeding value (g) and the dominance value (d) are replaced by the sum of all the effects of individual SNPs, that is g = Tu and d = Hv. Thus, our model can be considered as estimating the combined variances due to all SNPs. T and H are defined using the allele frequencies over the whole dataset and thus correspond to a model in which the current breeds have descended from a base population with these allele frequencies, and the current breeds are regarded as inbred relative to this base population. Consequently, the diagonal elements of the GRM and DRM are not expected to be 1. Diagonal elements of the GRM are greater than 1 when calculated by TT’/m, as expected. Diagonal elements of the DRM calculated from HH’/m are also typically greater than 1, which reflects the effect of inbreeding. The estimated variance components refer to the variances in the hypothetical base population. A potential problem is that the additive (T) and dominance (H) values are not necessarily orthogonal when genotype frequencies deviate from Hardy-Weinberg equilibrium. However, the results in Table 3 show that the additive variances obtained with the AM and ADM models were almost the same. Therefore, additive and dominance effects were nearly orthogonal.

Although there were more significant epistatic effects than expected by chance, the FDR was high (overall FDR of 39% at P < 10−5). This could be due to the additive × additive epistatic variance being small and/or to the epistatic effects being difficult to estimate. There were no significant epistatic interactions among the 28 lead SNPs, for which the power for detecting such interactions was expected to be greatest, since the lead SNPs had highly significant additive effects on multiple traits. By focusing on these 28 SNPs, we may have missed many important epistatic interactions. However, testing interactions between all pairs of SNPs would have greatly increased the multiple-testing problem. Therefore, caution is warranted in interpreting the significant interactions that we identified. Still, an interaction between PLAG1 and IGF2 is biologically plausible, since PLAG1 is a transcription factor that regulates many genes and pathways, including the IGF2 and IGF1R pathways [39].

To the best of knowledge, there are no publications on the estimation of non-additive genetic variance in beef cattle using high-density SNPs. Only a few papers [6,34] have reported results on the variance explained by epistatic (additive by additive) variance in dairy cattle and pigs. Su et al. [6] used a Hadamard product of GRM × GRM to calculate the epistatic relationship matrix. Since the off-diagonal elements of the GRM in our data were small, the square of these numbers will be close to 0, which means that the estimation of epistatic variance would suffer from a large sampling error and therefore we did not attempt it.

Although the dominance variance was significant for some traits, inclusion of dominance deviations in genomic prediction did not increase the accuracy of prediction of the genetic value of individuals and hence the prediction of their phenotypic values. This is consistent with the high FDR of dominance effects, which indicates that the dominance effects were not estimated with high precision. This could be because the dominance effects were small and/or because they were difficult to estimate. For instance, all three genotypes at a given SNP must be present in the data to estimate the dominance effect. Su et al. [6] reported an increase in accuracy of prediction (from 0.319 to 0.330) as a result of fitting a dominance effect in pigs but their data included pigs that were more closely related to each other than the animals in our dataset. Therefore, inclusion of dominance effects in genetic evaluations for important traits (e.g., CIMF and AGECL) could increase the accuracy of prediction of genetic effects, but only if the animals are closely related. In these situations, caution should be taken to minimize the tendency to confound dominance with common environment effects such as litter.


Across all SNPs, we observed dominance effects for growth and fertility traits in the direction that is expected to increase fitness. The dominance effects were not equal for all SNPs, which resulted in some dominance effects reaching significance (P < 10−5) and significant variance explained by dominance deviations for some traits. Despite this, including dominance in the prediction model did not increase the accuracy of prediction of genetic and phenotypic values for any trait, which is probably because the dominance variance is much smaller than the additive genetic variance, and dominance effects are difficult to estimate accurately. The number of additive × additive epistatic effects was greater than expected by chance but their FDR was high.