Background

Fatty acids are required for normal daily metabolism and can be obtained from the diet, including meat. Improving the nutritional value of meat products for human health has attracted extensive attention [1, 2]. Fat content and fatty acid composition in beef are associated with taste and flavor, which are considered the main sensory properties in consumer selection and acceptance [3].

Fatty acids are important indicators of beef quality, and previous studies have examined fatty acids in various cattle breeds under different feeding environments [4]. Fatty acid composition traits often show low to moderate heritability in populations with different genetic architectures [5]. Several studies have shown that the levels of heritability and genetic correlation theoretically allow genetic improvement of fatty acid composition, both by selection on major candidate genes and by genomic selection strategies [6,7,8,9,10]. Therefore, molecular genetics approaches provide an opportunity for the genetic improvement of fatty acid composition in beef cattle [11,12,13].

Over the last decades, considerable work has been done to elucidate the genetic mechanisms of fatty acids using candidate gene [14,15,16,17,18,19,20,21] and linkage mapping approaches [22,23,24]. In recent years, genome-wide association studies (GWAS) have been widely used to study the molecular mechanisms underlying important traits in beef and dairy cattle [25,26,27,28,29]. Previous GWAS and genomic prediction studies have identified candidate markers associated with various fatty acid composition traits and evaluated the accuracy of genomic prediction for these traits [8, 30,31,32,33,34]. However, many of these studies used relatively low-density SNP arrays. Although recent GWAS for fatty acid composition have used the high-density BovineHD (770 K) SNP array, those studies were mostly limited to Nellore cattle [35, 36]. In addition, extensive attention has been paid to the accuracy of genomic prediction using multiple methods in different populations [32, 33]. A recent study in Nellore cattle indicated that the accuracies of genomic prediction were moderate to high and that it was feasible to apply genomic selection in cattle; however, those results were limited to carcass traits in the Nellore population [37]. Therefore, the molecular mechanisms underlying fatty acid composition and the accuracy of genomic prediction in other important cattle breeds still need further investigation.

The objectives of the current study were to explore genomic regions associated with fatty acid composition and to estimate the predictive accuracies for these traits using the BovineHD SNP array in a Chinese Simmental population. In this study, we identified several potential candidate markers and genes associated with fatty acid composition. Our findings will facilitate the elucidation of the molecular mechanism and help design optimal genomic selection strategies for fatty acid composition in cattle and other farm animals.

Results

Descriptive statistics of fatty acid composition and their estimates of heritability

We measured six saturated fatty acids (SFA), four monounsaturated fatty acids (MUFA) and eleven polyunsaturated fatty acids (PUFA) using gas chromatography. Descriptive statistics and heritability estimates for the 21 individual fatty acids are presented in Table 1. The most abundant saturated fatty acids were C16:0 (23.8%) and C18:0 (20.2%), while the most abundant monounsaturated and polyunsaturated fatty acids were C18:1 cis-9 (32.0%) and C18:2 n-6 (12.9%), respectively. In contrast, several saturated fatty acids (C20:0, C22:0, C24:0), monounsaturated fatty acids (C14:1 cis-9 and C20:1 cis-11) and polyunsaturated fatty acids (C18:2 t-9c-11, C18:2 t-12c-10, C18:3 n-6, C18:3 n-3, C20:2 n-6, C20:4 n-6, C20:5 n-3, C22:5 n-3, C22:6 n-3) each accounted for a low proportion (<1%) of the total fatty acids. The estimated heritability varied noticeably among fatty acids. Among the 21 individual fatty acids, only C14:0 showed a relatively high heritability (0.54); five fatty acids (C18:0, C20:0, C14:1 cis-9, C16:1 cis-9 and C18:1 cis-9) showed moderate heritability, while the heritability estimates for the other fatty acids (15 out of 21) were below 0.2 (Table 1). Among the eight fatty acid groups, MUFA, n-6/n-3 and the health index (HI) showed moderate heritabilities (0.27, 0.22 and 0.24, respectively), while the estimated heritabilities for SFA, PUFA, total omega-3 (n-3) and total omega-6 (n-6) were 0.12, 0.16, 0.15 and 0.16, respectively.

Table 1 Summary statistics: mean (%), standard deviation (SD, %), heritability estimates (h²), additive genetic variance and coefficient of variation (CV, %)

Phenotypic and genetic correlations

Phenotypic and genetic correlations among the 21 individual fatty acids are presented in Fig. 1. The estimated phenotypic correlations (Fig. 1a) generally displayed different patterns from their genetic counterparts (Fig. 1b). High positive phenotypic and genetic correlations existed between several pairs of individual fatty acids. For instance, the estimated genetic correlations between C20:0 and C20:1 cis-11, C20:0 and C18:2 t-12c-10, C20:0 and C20:2 n-6, and C20:0 and C20:4 n-6 were 0.89, 0.95, 0.92 and 0.86, respectively. In contrast, we found clear negative correlations between C14:0 and C18:2 n-6, C14:0 and C20:3 n-3, C18:1 cis-9 and C18:2 n-6, and C18:1 cis-9 and C20:5 n-3 (Additional file 1: Table S1).

Fig. 1 Heatmap of phenotypic (a) and genetic (b) correlations among the 21 individual fatty acids

Bayesian based GWAS and candidate regions

We performed GWAS using the BayesB method for 11 individual fatty acids with an estimated genomic heritability ≥ 0.10, including 3 saturated fatty acids (C14:0, C18:0 and C20:0), 3 monounsaturated fatty acids (C14:1 cis-9, C16:1 cis-9 and C18:1 cis-9) and 5 polyunsaturated fatty acids (C18:2 n-6, C18:3 n-3, C20:3 n-3, C20:5 n-3 and C22:5 n-3). To identify potential regions associated with fatty acids, we divided the genome into 100 kb windows, giving 24,900 windows across the genome. Windows that explained more than 1% of the additive genetic variance were considered candidate regions and were analysed further to identify genes within them. Summary statistics, including the genetic variance explained, the position of each 100 kb window, the rs IDs of flanking SNPs and the candidate genes, are presented in Table 2 for saturated fatty acids and in Table 3 for monounsaturated and polyunsaturated fatty acids.

Table 2 Genomic regions associated with the saturated fatty acids in Chinese Simmental cattle using BayesB method
Table 3 Genomic regions associated with the monounsaturated and polyunsaturated fatty acids in Chinese Simmental cattle using BayesB method

Saturated fatty acids

We detected a total of 16 candidate regions that each explained more than 1% of the genetic variance for saturated fatty acids. These regions were distributed on BTA2, BTA4, BTA6, BTA7, BTA12, BTA14, BTA15, BTA19, BTA22, BTA23 and BTA25 (Table 2). Among them, we found four, three and nine candidate regions for C14:0, C18:0 and C20:0, respectively (Additional file 2). Intriguingly, the window explaining the largest genetic variance (10.04%) was detected near 51.3 Mb on BTA19 for C14:0 and contains the fatty acid synthase (FASN) gene, which is involved in fatty acid synthesis. We also found one region at 25.1 Mb on BTA23 explaining about 1.46% of the genetic variance for C14:0. This region overlapped with elongation of very long chain fatty acids protein 5 (ELOVL5), which has fatty acid elongase activity (Fig. 2). In addition, we identified 10 candidate genes within these candidate regions that are likely to be related to fatty acid composition (Table 2).

Fig. 2 a Manhattan plot of the absolute values of SNP effects estimated using BayesB for C14:0. b Manhattan plot of the P-values of association for each SNP using GRAMMAR-GC, where the y-axis is -log10(P)

Monounsaturated fatty acids

For monounsaturated fatty acids, we detected 8 candidate regions on six chromosomes, each of which captured more than 1% of the total genetic variance (Additional file 3). Notably, the region at 51.3 Mb on BTA19 overlapping with FASN, which was also associated with C14:0, explained 6.49% of the genetic variance for C14:1 cis-9 (Fig. 3). The top associated region was detected near 30.3 Mb on BTA14 and explained 19.44% of the genetic variance for C18:1 cis-9 (Fig. 4), although no known gene was observed near this region. Overall, three, two and three regions were associated with C14:1 cis-9, C16:1 cis-9 and C18:1 cis-9, respectively (Table 3).

Fig. 3 a Manhattan plot of the absolute values of SNP effects estimated using BayesB for C14:1 cis-9. b Manhattan plot of the P-values of association for each SNP using GRAMMAR-GC, where the y-axis is -log10(P)

Fig. 4 a Manhattan plot of the absolute values of SNP effects estimated using BayesB for C18:1 cis-9. b Manhattan plot of the P-values of association for each SNP using GRAMMAR-GC, where the y-axis is -log10(P)

Polyunsaturated fatty acids

We detected 11 associated regions for five polyunsaturated fatty acids: C18:2 n-6, C18:3 n-3, C20:3 n-3, C20:5 n-3 and C22:5 n-3 (Additional file 4). Of these regions, one was associated with C18:3 n-3 on BTA14, two with C18:2 n-6 on BTA4 and BTA20, two with C20:3 n-3 on BTA4 and BTA17, three with C20:5 n-3 on BTA3, BTA5 and BTA12, and three with C22:5 n-3 on BTA2, BTA5 and BTA9 (Table 3). Notably, we found 12 genes embedded in the identified regions that are likely involved in fatty acid synthesis and metabolism. Of these 12 genes, two were associated with C18:2 n-6, three with C20:3 n-3, six with C20:5 n-3 and one with C22:5 n-3. The top four regions, each explaining more than 2% of the genetic variance, were identified on BTA20 for C18:2 n-6 (4.95%), BTA14 for C18:3 n-3 (2.83%), BTA9 for C22:5 n-3 (2.46%) and BTA4 for C18:2 n-6 (2.33%).

Fatty acid groups

To systematically explore the genetic mechanism underlying fatty acid composition beyond individual fatty acids, we also conducted GWAS using the BayesB method on eight fatty acid groups: SFA, MUFA, PUFA, PUFA/SFA, n-3, n-6, n-6/n-3 and HI (Additional file 5). We found three associated regions for HI on BTA4, BTA10 and BTA15, two regions for MUFA on BTA12 and BTA14, one for n-3 on BTA4 and one for n-6/n-3 on BTA20 (Table 4). Intriguingly, we observed two regions accounting for ~10% of the genetic variance for MUFA, while the regions on BTA4 and BTA20 accounted for 1.25% and 4.66% of the genetic variance for n-3 and n-6/n-3, respectively. One region on BTA12 overlapped with two genes, claudin 10 (CLDN10) and DAZ interacting zinc finger protein 1 (DZIP1). For HI, the three identified regions on BTA4, BTA10 and BTA15 contributed 1.52%, 1.19% and 2.03% of the genetic variance, respectively. These regions overlapped with sarcoglycan epsilon (SGCE), paternally expressed 10 (PEG10) and DEAD-box helicase 10 (DDX10).

Table 4 Genomic regions associated with the fatty acid groups in Chinese Simmental cattle using BayesB method

Identification of associated loci using GRAMMAR-GC

We further conducted GWAS using GRAMMAR-GC, implemented in the GenABEL package, for 11 individual fatty acids and eight fatty acid groups, each of which had a genomic heritability higher than 0.10. To ensure the power and accuracy of GWAS for these traits, we used the genomic control approach to correct for possible population stratification in the GRAMMAR-GC test. After this correction, the inflation factor λ was close to one, suggesting that our approach had successfully accounted for population stratification and that no further adjustment was required. We identified a total of 44 and 8 significant SNPs associated with nine individual fatty acids and two fatty acid groups, respectively. The suggestive P-value threshold (0.05/163,473 = 3.06E-7) was used as the cut-off for significance; the denominator approximates the number of "independent" SNPs, obtained by counting one SNP per LD block plus all SNPs outside of blocks (interblock SNPs). We observed 14, 5, 8, 1, 3, 4, 3, 3 and 3 significantly associated SNPs surpassing the suggestive threshold (P < 3.06E-7) for C14:0, C14:1 cis-9, C18:1 cis-9, C18:3 n-6, C20:0, C20:1 cis-11, C20:2 n-6, C20:4 n-6 and C18:2 t-9c-11, respectively. The top four significant SNPs for C14:0 (P = 1.39E-10) were located at 51.3 Mb on BTA19. In total, we identified 17 associated SNPs for saturated fatty acids (C14:0 and C20:0), 17 SNPs for monounsaturated fatty acids (C14:1 cis-9, C18:1 cis-9 and C20:1 cis-11) and 10 SNPs for polyunsaturated fatty acids (C18:3 n-6, C20:2 n-6, C20:4 n-6 and C18:2 t-9c-11). Notably, the majority of SNPs were located on BTA19 (18 SNPs) and BTA14 (19 SNPs), which indicates that these regions are potential candidates for fatty acid composition. Figs. 2, 3 and 4 show the genome-wide plots of P-values and absolute values of marker effects against genomic position for C14:0, C14:1 cis-9 and C18:1 cis-9. We found one SNP at 51.3 Mb on BTA19 associated with both C14:0 and C14:1 cis-9 (P = 5.19E-10 and P = 1.82E-07); this SNP is located ~4 kb upstream of the FASN gene.
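The threshold arithmetic and the inflation-factor check described above can be sketched in a few lines of Python. This is a minimal illustration, not the GenABEL implementation: the function names are hypothetical, and the inflation factor is computed with the usual median-based definition (median observed 1-df chi-square divided by its expected median under the null), which is assumed rather than taken from the text.

```python
from statistics import NormalDist, median


def bonferroni_threshold(alpha, n_independent_tests):
    """Family-wise alpha divided by the number of approximately independent tests."""
    return alpha / n_independent_tests


def genomic_inflation_factor(p_values):
    """Median-based inflation factor (lambda) from two-sided P-values.

    Each P-value must be greater than 0 and at most 1.
    """
    nd = NormalDist()
    # Convert each two-sided P-value back to a 1-df chi-square statistic.
    chi2 = [nd.inv_cdf(1.0 - p / 2.0) ** 2 for p in p_values]
    # Expected median of a 1-df chi-square under the null (about 0.455).
    expected_median = nd.inv_cdf(0.75) ** 2
    return median(chi2) / expected_median


# 163,473 is the number of "independent" SNPs reported in the text.
threshold = bonferroni_threshold(0.05, 163_473)
print(f"{threshold:.2e}")  # 3.06e-07
```

A value of `genomic_inflation_factor` near 1 for the genome-wide P-values corresponds to the "λ close to one" statement above.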

Region-based association test and LD analyses

To explore potentially associated loci that might not have been identified because of the strict threshold for high-density SNPs, we investigated two 100 kb associated regions on BTA19 and BTA23 (BTA19: 51.3–51.4 Mb and BTA23: 25.1–25.2 Mb) using region-based association tests implemented in the R package FREGAT. The two regions contain two candidate genes, FASN and ELOVL5, that are involved in fatty acid synthesis. For the region at 51.3 Mb on BTA19, we found 19 SNPs showing significant association with C14:0 (P < 0.01); among them, five SNPs were located within FASN, and one SNP near FASN showed the strongest association signal (P = 5.17E-10). The P-value of the region-based test for FASN was 0.0048, which indicates that FASN may be considered a candidate gene for C14:0. Moreover, the LD and haploblock analyses revealed a high level of LD with multiple haploblocks in this region, which may imply a potential selection signature related to fatty acids in the Chinese Simmental cattle population (Fig. 5a). For the region at 25.1 Mb on BTA23, two SNPs were detected with P < 0.01, and the top SNP (BovineHD2300006955) at 25.1 Mb showed significant association (P = 3.1E-5). This region partly overlapped with ELOVL5. We therefore examined the 500 kb upstream and downstream of the region, but found no other SNPs significantly associated with C14:0 (Fig. 5b).

Fig. 5 Regional plots of the two major candidate regions on BTA19 and BTA23. Results are shown for C14:0 at 50.8-51.8 Mb on BTA19 (a) and for C20:0 at 24.6-25.6 Mb on BTA23 (b). In the upper panels, the top SNPs are highlighted by blue solid circles. Different levels of linkage disequilibrium (LD) between the lead SNP and surrounding SNPs are indicated in different colors

Genomic prediction for fatty acid composition

We performed genomic prediction for fatty acid composition using GBLUP and BayesB. The predictive accuracies ranged from 0.03 (C18:2 t-12c-10) to 0.51 (C18:3 n-3) using GBLUP, and from 0.1 (C18:2 t-9c-11) to 0.53 (C14:0) using BayesB. The average predictive accuracies across all fatty acids were 0.24 for GBLUP and 0.29 for BayesB. These results suggest that genomic prediction using BayesB was slightly superior to GBLUP for fatty acids in Chinese Simmental cattle.

For individual fatty acids, relatively high predictive accuracies were achieved for C14:0, C22:0, C14:1 cis-9, C18:3 n-3 and C20:3 n-3 with both GBLUP and BayesB. BayesB performed at least as well as GBLUP for most traits, and for some traits it showed clearly higher predictive accuracies, such as C14:0 (0.48 for GBLUP vs. 0.53 for BayesB), C18:0 (0.17 vs. 0.24), C24:0 (0.11 vs. 0.23), C18:2 t-12c-10 (0.03 vs. 0.12), C18:3 n-6 (0.05 vs. 0.13) and C20:2 n-6 (0.07 vs. 0.14) (Table 5). To investigate possible bias between the predicted and observed breeding values, we also calculated the regression coefficient of genomic estimated breeding values on adjusted phenotypes (Table 5). The average regression coefficients for GBLUP and BayesB were 0.71 and 0.93, respectively, indicating that BayesB is less biased than GBLUP because its regression coefficient is closer to unity.
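As a rough illustration of the two summary measures used above, the sketch below computes a correlation and a regression slope from paired values. It assumes, hypothetically, that predictive accuracy is the Pearson correlation between genomic estimated breeding values (GEBV) and adjusted phenotypes in validation animals; the exact definition and the cross-validation scheme are not given in this section. The slope function is generic in its two arguments, so either regression direction can be computed, and the toy numbers are invented for the example.

```python
def mean(values):
    return sum(values) / len(values)


def pearson_correlation(x, y):
    """Pearson correlation between two equally long sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def regression_slope(response, predictor):
    """Ordinary least-squares slope of `response` regressed on `predictor`."""
    mp, mr = mean(predictor), mean(response)
    sxy = sum((p - mp) * (r - mr) for p, r in zip(predictor, response))
    sxx = sum((p - mp) ** 2 for p in predictor)
    return sxy / sxx


# Toy validation set (invented values, not data from the study).
gebv = [0.1, 0.4, 0.2, 0.8, 0.5]
adjusted_phenotype = [0.3, 0.5, 0.1, 0.9, 0.4]

accuracy = pearson_correlation(gebv, adjusted_phenotype)
slope = regression_slope(adjusted_phenotype, gebv)
```

A slope close to 1 means the predictions are on the same scale as the values they are compared with, which is the sense in which a coefficient near unity is read as low bias.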

Table 5 Predictive accuracy (±SE) and regression coefficients (±SE) of genomic breeding value prediction for fatty acid composition

Discussion

Fatty acid composition is an important indicator of meat quality and taste, and it strongly influences consumer preferences [3, 38]. Previous genome-wide association studies have been conducted for fatty acid composition in multiple cattle breeds, including Angus, Japanese Black, Nellore and crossbreds [8, 30, 32, 35, 36]. To our knowledge, this study is the first attempt to investigate the molecular mechanism underlying fatty acid composition using a high-density SNP array and to evaluate the accuracy of genomic prediction in a Chinese Simmental cattle population.

Genome-wide scan identified candidate regions and loci

We investigated 21 individual fatty acids (6 saturated, 4 monounsaturated and 11 polyunsaturated) and 8 fatty acid groups. Although single-SNP GWAS methods have been widely used, they may lack power for complex traits with low or moderate heritability. Because fatty acid composition in cattle is polygenic, GWAS using Bayesian methods have identified many associated loci that were missed by the single-SNP regression approach [33, 35, 36]. BayesB has been widely used for GWAS of complex traits in farm animals [39,40,41,42]. In this study, we used BayesB and GRAMMAR-GC to identify candidate regions associated with fatty acids in a Chinese Simmental cattle population. We detected a total of 35 candidate regions on 16 autosomes associated with fatty acid composition using BayesB. However, these regions may not include all potentially associated SNPs because of the 100 kb window-based strategy used with BayesB. Therefore, we also conducted GWAS using the single-locus GRAMMAR-GC method implemented in the GenABEL package. With this approach, we detected a total of 44 and 8 significantly associated SNPs for individual fatty acids and fatty acid groups, respectively, using an adjusted suggestive threshold. The suggestive threshold was set to avoid overestimating the number of significant SNPs because of the high level of LD in the high-density SNP array [43]. In the current study, a total of 24 candidate SNPs overlapped with the regions identified by BayesB. For instance, the same candidate peaks for C14:0, C14:1 cis-9 and C18:1 cis-9 were identified using both methods (Figs. 2, 3 and 4). Using multiple complementary methods is an effective way to detect candidate regions or SNPs and helps elucidate the genetic architecture of complex traits in farm animals [33, 44].

Candidate genes for fatty acid composition

Several genes were identified as potential candidates contributing to the genetic architecture of fatty acids in this study. Among them, FASN at 51 Mb on BTA19 overlapped with a 100 kb associated region that explained 10% and 6.5% of the genetic variance for C14:0 and C14:1 cis-9, respectively. Notably, we also found multiple significant SNPs around this gene that were associated with C14:0 using the GRAMMAR-GC method. A region-based association test revealed strong evidence of association for multiple SNPs within this gene. Furthermore, we found strong LD upstream of FASN (Fig. 5). This gene encodes a multifunctional enzyme that catalyzes the synthesis of palmitate (C16:0) from acetyl-CoA and malonyl-CoA in the presence of NADPH. Previous studies based on candidate gene approaches revealed polymorphisms within FASN that were related to fatty acid composition in multiple beef cattle populations [45,46,47,48,49] and to milk fat content in dairy cattle [50, 51]. For instance, several studies explored and evaluated the association between fatty acid composition and candidate SNPs using GWAS in Japanese beef cattle [8, 30, 52]. These studies provided multiple lines of evidence that several SNPs near or within FASN may be regarded as responsible mutations for fatty acid composition and contribute largely to meat quality in the Japanese Black cattle population. Saatchi et al. performed GWAS using the BovineSNP50 array in Angus beef cattle and reported that FASN, located at 51 Mb on BTA19 (from 51,384,922 to 51,403,614 bp), was associated with fatty acids [32]. Chen et al. found that SNP rs41921177 (BTA19:51,326,750), located near FASN, had relatively large effects on multiple fatty acids in both subcutaneous adipose and longissimus lumborum muscle tissues of crossbred beef cattle [33]. However, an investigation of the genetic architecture of fatty acids in Nellore cattle showed no significant associations for several polymorphisms within or near FASN [35, 36]. Although previous studies have indicated that FASN has a conserved role across genetic backgrounds, several different variants may be responsible for the different FASN effects in different breeds, and different FASN alleles appear to be segregating in different populations [8, 49].

Another gene, ELOVL5, encodes a multi-pass membrane protein that is involved in the elongation of long-chain polyunsaturated fatty acids. This gene was identified in the associated region at 25.1 Mb on BTA23, which accounted for 1.5% of the genetic variance for C14:0. ELOVL5 plays an important role in de novo synthesis of specific MUFA species in mammalian cells: ELOVL5 knockdown decreased the elongation of C16:1 cis-9, n-7, and ELOVL5 overexpression increased synthesis of C18:1 cis-9, n-7 [53]. A previous study using mouse models also revealed that reduced ELOVL5 activity can lead to hepatic steatosis, and that endogenously synthesized PUFAs are critical regulators of fatty acid synthesis [54]. Lemos et al. reported that a candidate region containing ELOVL5 explained 4% of the genetic variance for C20:4 n-6 using ssGBLUP with a window-based association test in Nellore cattle [35]. The role of ELOVL5 in fatty acid synthesis and composition has also been extensively investigated in diverse pig populations [55,56,57].

Moreover, previous studies suggested that ELOVL5 is involved in the production of multiple fatty acids, including C16:0, C16:1, C18:0 and C18:1, in cattle [58]. ELOVL5 was also found to be associated with C20:1n9/C18:1n9 and C20:2n6/C18:2n6 in an F2 population derived from Erhualian pigs [55]. Thus, ELOVL5 may have pleiotropic effects on multiple fatty acids and across multiple metabolic steps. However, in Chinese Simmental cattle we found ELOVL5 to be associated only with C14:0. In addition, although several previous studies have suggested that variants within the SCD gene and its expression level are significantly associated with fatty acids [18, 20, 30, 32, 47, 48, 59, 60], SCD was not detected in this study, probably because the genetic architecture of fatty acids differs across populations.

Genomic prediction for fatty acid composition

Previous studies have investigated the predictive ability of genomic selection for fatty acid composition in American Angus [32], Japanese Black cattle [61] and Canadian beef cattle [33]. In the current study, we explored, for the first time, genomic prediction for fatty acid composition in Chinese Simmental cattle. We found that the accuracies of genomic prediction for most fatty acids were relatively low (<0.30) using both GBLUP and BayesB, which is consistent with a previous report by Chen et al. [33]. This finding may be explained by the relatively low and imprecise estimates of heritability for the measured fatty acids [62]. Our study also revealed that BayesB provided higher average regression coefficients than GBLUP. Considering the complex architecture of fatty acid composition, this finding implies that BayesB, which allows a fraction of SNPs to have relatively large effects, is superior to GBLUP, which assumes the same genetic variance for each SNP. Fatty acid composition traits are commonly recognized as complex traits with a polygenic nature and are, to some extent, difficult to measure; thus, the application of genomic selection for fatty acids will be valuable in future breeding programs. With increasing public understanding of the relationships between diet and health, more attention should be paid to fatty acids that are important for human health [63]. As consumers become more health conscious, they increasingly prefer better-tasting and healthier products, such as those with higher levels of unsaturated fatty acids. Further investigation of causal mutations will improve our understanding of lipid metabolism and fat deposition and support the application of selection for fatty acids in cattle.

Conclusions

We identified several significantly associated regions and loci as potential candidate markers for genomics-assisted breeding programs. Using multiple methods, our results provided strong evidence that FASN and ELOVL5 are associated with fatty acids. Our analyses also suggest that it is feasible to perform genomic selection for fatty acids in the Chinese Simmental cattle population.

Methods

Ethics statement

All animals used in the study were treated following the guidelines established by the Council of China Animal Welfare. Protocols of the experiments were approved by the Science Research Department of the Institute of Animal Sciences, Chinese Academy of Agricultural Sciences (CAAS) (Beijing, China).

Animals and phenotypes

A total of 723 cattle born between 2010 and 2013 were used in this study. These cattle originated from Ulgai, Xilingol League, Inner Mongolia, China. After weaning, the cattle were moved to Jinweifuren Co., Ltd for fattening, where all animals shared the same feeding and management conditions. A more detailed description of breeding and management has been given previously [64, 65]. The cattle were slaughtered at an average age of 20 months. At slaughter, we measured traits in strict accordance with the guidelines of the Institutional Meat Purchase Specifications for fresh beef. Meat samples were taken from the longissimus lumborum (LL) muscle between the 12th and 13th ribs of each animal after 48 h of storage, and the samples were then vacuum packed and kept at -80 °C. Approximately 10 g of each sample was taken for subsequent fatty acid analyses. Total lipid was extracted from the sample according to protocols described previously [66]. About 2 mg of extracted lipid was re-dissolved in 2 ml of n-hexane and 1 ml of KOH (0.4 M) for saponification and methylation. A total of 21 individual fatty acids were measured using gas chromatography (GC-2014 CAFsc, Shimadzu Scientific Instruments), including six saturated, four monounsaturated and eleven polyunsaturated fatty acids. Each fatty acid was quantified as a weight percentage of total fatty acids. In addition, fatty acids were grouped into total saturated (SFA), total monounsaturated (MUFA), total polyunsaturated (PUFA), total omega-3 (n-3) and total omega-6 (n-6) fatty acids.
The fatty acid groups were calculated as follows: SFA = C14:0 + C16:0 + C18:0 + C20:0 + C22:0 + C24:0; MUFA = C14:1 cis-9 + C16:1 cis-9 + C18:1 cis-9 + C20:1 cis-11; PUFA = C18:2 n-6 + C18:2 t-9c-11 + C18:2 t-12c-10 + C18:3 n-6 + C18:3 n-3 + C20:2 n-6 + C20:3 n-3 + C20:4 n-6 + C20:5 n-3 + C22:5 n-3 + C22:6 n-3; n-3 = C18:3 n-3 + C20:3 n-3 + C20:5 n-3 + C22:5 n-3 + C22:6 n-3; n-6 = C18:2 n-6 + C18:3 n-6 + C20:2 n-6 + C20:4 n-6; PUFA/SFA: ratio between PUFA and SFA; n-6/n-3: ratio between n-6 and n-3; HI = (MUFA + PUFA)/(4 × C14:0 + C16:0).
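The group definitions above translate directly into code. The following minimal sketch assumes a per-animal profile stored as a dictionary from fatty acid name to weight percentage; the function name and the dictionary layout are illustrative choices, not part of the original analysis.

```python
SFA_ACIDS = ["C14:0", "C16:0", "C18:0", "C20:0", "C22:0", "C24:0"]
MUFA_ACIDS = ["C14:1 cis-9", "C16:1 cis-9", "C18:1 cis-9", "C20:1 cis-11"]
N3_ACIDS = ["C18:3 n-3", "C20:3 n-3", "C20:5 n-3", "C22:5 n-3", "C22:6 n-3"]
N6_ACIDS = ["C18:2 n-6", "C18:3 n-6", "C20:2 n-6", "C20:4 n-6"]
# PUFA = all n-6 and n-3 acids plus the two conjugated C18:2 isomers (11 acids).
PUFA_ACIDS = N6_ACIDS + N3_ACIDS + ["C18:2 t-9c-11", "C18:2 t-12c-10"]


def fatty_acid_groups(profile):
    """Compute the eight group traits from a {fatty acid: weight %} profile."""
    total = lambda names: sum(profile[name] for name in names)
    sfa, mufa, pufa = total(SFA_ACIDS), total(MUFA_ACIDS), total(PUFA_ACIDS)
    n3, n6 = total(N3_ACIDS), total(N6_ACIDS)
    return {
        "SFA": sfa,
        "MUFA": mufa,
        "PUFA": pufa,
        "n-3": n3,
        "n-6": n6,
        "PUFA/SFA": pufa / sfa,
        "n-6/n-3": n6 / n3,
        # Health index: HI = (MUFA + PUFA) / (4 * C14:0 + C16:0)
        "HI": (mufa + pufa) / (4 * profile["C14:0"] + profile["C16:0"]),
    }
```

A profile that lacks any of the 21 acids raises `KeyError`, which makes missing measurements visible rather than silently treating them as zero.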

Genotyping and quality control

In total, 723 Simmental cattle were genotyped with the Illumina BovineHD BeadChip. Before statistical analysis, SNPs were pre-processed using PLINK v1.07 [67]. SNPs were retained based on minor allele frequency (>0.05), proportion of missing genotypes (<0.05) and Hardy-Weinberg equilibrium (P > 10E-6). Moreover, individuals with more than 10% missing genotypes were excluded. After quality control, the final data set consisted of 685 individuals and 595,715 autosomal SNPs.
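The frequency and missingness filters can be illustrated with a small standalone sketch. This is not PLINK: the function names and the genotype coding (0/1/2 copies of one allele, `None` for a missing call) are assumptions made for the example, and the Hardy-Weinberg test is deliberately left out.

```python
def snp_passes_qc(genotypes, min_maf=0.05, max_missing=0.05):
    """Keep a SNP if its missing rate is < max_missing and its MAF is > min_maf.

    genotypes: allele counts (0, 1 or 2) across animals, None for a missing call.
    """
    called = [g for g in genotypes if g is not None]
    missing_rate = (len(genotypes) - len(called)) / len(genotypes)
    if not called or missing_rate >= max_missing:
        return False
    freq = sum(called) / (2 * len(called))  # frequency of the counted allele
    maf = min(freq, 1 - freq)
    return maf > min_maf


def animal_passes_qc(genotypes, max_missing=0.10):
    """Exclude an animal only if more than max_missing of its genotypes are missing."""
    n_missing = sum(1 for g in genotypes if g is None)
    return n_missing / len(genotypes) <= max_missing
```

Applying `snp_passes_qc` to each marker column and `animal_passes_qc` to each animal row mirrors the order-independent thresholds quoted above; the Hardy-Weinberg filter would be an additional per-SNP test.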

Heritability and genetic correlation estimation

Phenotypic and genetic (co) variances of fatty acids were estimated using the pairwise bivariate animal model implemented in the ASReml v3.0 software package [68]. The model is

$$ \left[\begin{array}{c} y_1 \\ y_2 \end{array}\right] = \left[\begin{array}{cc} X_1 & 0 \\ 0 & X_2 \end{array}\right] \left[\begin{array}{c} b_1 \\ b_2 \end{array}\right] + \left[\begin{array}{cc} Z_1 & 0 \\ 0 & Z_2 \end{array}\right] \left[\begin{array}{c} a_1 \\ a_2 \end{array}\right] + \left[\begin{array}{c} e_1 \\ e_2 \end{array}\right] $$

where y1 and y2 are vectors of phenotypic values of traits 1 and 2, respectively; X1 and X2 are incidence matrices for fixed effects; b1 and b2 are the vectors of fixed effects; Z1 and Z2 are incidence matrices relating the phenotypic observations to the vectors of polygenic effects (a1 and a2) for the two traits; and e1 and e2 are random residuals for the two traits. The polygenic effects were treated as random and assumed to be mutually independent.

Variances of the random effects are defined as V(a) = G σ_a^2 for the polygenic effects and V(e) = I σ_e^2 for the residuals, where G is the additive genetic relationship matrix, I is the identity matrix, σ_a^2 is the additive genetic variance and σ_e^2 is the residual variance. The G matrix was constructed from the SNP markers following VanRaden [69]. Fixed effects in the model included gender, farm and year. In addition, age at slaughter, days between slaughter and fatty acid extraction, hot carcass weight and marbling score were fitted as covariates. Genomic heritability of each trait was estimated using

$$ h^2 = \sigma_a^2 / \left( \sigma_a^2 + \sigma_e^2 \right) $$

Pairwise bivariate analyses were performed for each combination of fatty acids to estimate the (co) variance components, phenotypic and genetic correlations as well as the heritability.
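For orientation, the genomic relationship matrix and the heritability ratio can be sketched as follows. The text only states that G was built following VanRaden; the sketch assumes the commonly used first method, G = ZZ'/(2 Σ p_j(1 − p_j)), with allele frequencies estimated from the sample. The function names are hypothetical, and variance components would in practice come from the mixed-model software rather than from this code.

```python
def vanraden_g(genotypes):
    """Genomic relationship matrix, assuming VanRaden's first method.

    genotypes: one row per animal, each a list of 0/1/2 allele counts
    with no missing values.
    """
    n, m = len(genotypes), len(genotypes[0])
    # Allele frequency of the counted allele at each SNP.
    p = [sum(row[j] for row in genotypes) / (2 * n) for j in range(m)]
    # Centre genotypes by twice the allele frequency.
    z = [[row[j] - 2 * p[j] for j in range(m)] for row in genotypes]
    scale = 2 * sum(pj * (1 - pj) for pj in p)
    return [
        [sum(z[i][k] * z[l][k] for k in range(m)) / scale for l in range(n)]
        for i in range(n)
    ]


def genomic_heritability(var_a, var_e):
    """h^2 = sigma_a^2 / (sigma_a^2 + sigma_e^2)."""
    return var_a / (var_a + var_e)
```

Because the genotypes are centred with sample frequencies, every row of the resulting matrix sums to zero, which is a quick sanity check on an implementation.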

Genome-wide association study using BayesB

Fatty acid composition was adjusted for the fixed effects and covariates defined above using a linear mixed model. We conducted genome-wide association analyses using BayesB, which analyzes all autosomal SNPs simultaneously and assumes a different genetic variance for each SNP [40, 70]. The model is as follows:

$$ y_i = u + \sum_{j=1}^{M} Z_{ij} \alpha_j \delta_j + e_i $$
(1)

where y_i is the adjusted phenotypic value of the i-th individual, u is the mean (after removing fixed effects and all covariates), M is the number of SNP loci, Z_ij is the genotype of animal i at the j-th SNP, coded as the number of B alleles, α_j is the average effect of allele substitution for SNP j, assumed to be normally distributed N(0, σ_j^2), δ_j is an indicator variable for the presence (δ_j = 1) or absence (δ_j = 0) of marker j in the model, with presence given a prior probability of 1 − π, and e_i is the residual error, assumed to be normally distributed N(0, σ_e^2). The prior distribution of the variance σ_j^2 (or σ_e^2) is assumed to be a scaled inverse chi-square with degrees of freedom v_α = 4 (or v_e = 10) and scale parameter S_α^2 (or S_e^2). The scale parameter is usually derived from an assumed additive genetic variance [71].

π was set to 0.9998, which meant that about 100–150 SNPs were fitted simultaneously in each MCMC iteration. The Markov chains were run for 50,000 iterations; the first 10,000 were discarded as burn-in and the remaining 40,000 formed the posterior sample. All SNP effects were estimated from the posterior sample. We performed GWAS for all 21 fatty acids but report results only for traits with genomic heritability ≥ 0.10.

We inferred associations for fatty acids using 100 kb windows rather than single markers [35, 36]. There were 24,900 SNP windows across the 29 autosomes. The variance of each window was estimated from the genetic value of all adjacent SNPs within the 100 kb window, and the proportion of genetic variance explained by each window was obtained by dividing the variance of the window breeding value by the variance of the whole-genome breeding value. Genome windows with a posterior mean proportion of genetic variance ≥ 1% were considered the most important regions associated with the traits.

Positional candidate genes were investigated for the 100 kb windows using the UCSC Genome Browser, which allowed visualization of SNPs based on the Bos taurus genome assembly UMD 3.1.
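The window-based summary described above can be sketched as follows. This is a minimal illustration that assumes posterior samples of SNP effects are already available as a matrix; the function name, argument layout and toy data are hypothetical and do not reproduce the software used in the study.

```python
import numpy as np

def window_variance_share(genotypes, effect_samples, window_ids):
    """Posterior mean share of genetic variance explained by each window.

    genotypes:      (n_animals, n_snps) allele counts
    effect_samples: (n_samples, n_snps) sampled SNP effects (alpha_j * delta_j)
    window_ids:     (n_snps,) integer window label of each SNP
    """
    z = np.asarray(genotypes, dtype=float)
    effects = np.asarray(effect_samples, dtype=float)
    window_ids = np.asarray(window_ids)
    windows = np.unique(window_ids)
    shares = np.zeros((effects.shape[0], windows.size))
    for s, alpha in enumerate(effects):
        total_var = np.var(z @ alpha)    # variance of whole-genome breeding values
        if total_var == 0.0:
            continue                     # no genetic variance in this sample
        for k, w in enumerate(windows):
            in_window = window_ids == w
            window_bv = z[:, in_window] @ alpha[in_window]
            shares[s, k] = np.var(window_bv) / total_var
    return windows, shares.mean(axis=0)

# Toy data (hypothetical): 50 animals, 12 SNPs in 4 windows of 3 SNPs each,
# with a single SNP in window 1 carrying all of the effect.
rng = np.random.default_rng(7)
z = rng.integers(0, 3, size=(50, 12))
window_ids = np.repeat([0, 1, 2, 3], 3)
effects = np.zeros((20, 12))
effects[:, 4] = 1.0
windows, share = window_variance_share(z, effects, window_ids)
flagged = windows[share >= 0.01]         # windows passing the 1% cut-off
```

Because window breeding values can be correlated through LD, the shares need not sum to one across windows in real data.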

Genome-wide association study using GRAMMAR

We also performed a genome-wide association study using GRAMMAR-GC as implemented in the R package GenABEL v1.8-0 [72]. The method accounts for population stratification and for the covariance structure among individuals inferred from all SNPs. A Bonferroni-corrected threshold of 8.39E-08 (0.05/595,715) was adopted for 5% genome-wide significance. This correction is highly conservative for GWAS using a high-density SNP array. To avoid overcorrecting for SNPs that may not be truly independent because of LD across the genome, we also used a suggestive threshold (P = 0.05/163,473), following Duggal et al., who approximate the number of “independent” SNPs by counting one SNP per LD block plus all SNPs outside the LD blocks (interblock SNPs) [43].
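The two thresholds follow directly from dividing the 5% error rate by the number of tests. A short check of the arithmetic, with a hypothetical helper name:

```python
def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance threshold for a family-wise error rate alpha."""
    return alpha / n_tests

genome_wide = bonferroni_threshold(0.05, 595_715)  # all SNPs tested
suggestive = bonferroni_threshold(0.05, 163_473)   # approximate independent SNPs

print(f"{genome_wide:.2e}")  # 8.39e-08
print(f"{suggestive:.2e}")   # 3.06e-07
```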

Region-based association test and haploblock analyses

A region-based association test can be a more powerful approach to gene mapping than testing individual genetic variants. In this study, we performed region-based association tests for several target 100 kb regions identified using BayesB. The SNPs in these regions and the adjusted fatty acid composition were analyzed using the R package FREGAT [73]. The LD of these regions was estimated using PLINK v1.7 [67].

Genomic prediction

Genomic best linear unbiased prediction (GBLUP) and BayesB were used for genomic prediction. Five-fold cross-validation was used to evaluate the accuracy of genomic prediction. The data were split into five approximately equal-sized groups. In each cross-validation replicate, four groups were used as the training sample to estimate parameters and the remaining group was used as the test sample. The linear model is written as

$$ \mathbf{y}={\mathbf{1}}_{\mathbf{n}} u+\mathbf{Z}\mathbf{a}+\mathbf{e} $$
(2)

where y is the vector of adjusted phenotypic values in the training sample, 1_n is a vector of n ones, u is the overall mean, a is the vector of breeding values for all animals, e is the vector of residual errors and Z is the incidence matrix for the random effects. For BayesB, SNP effects were estimated in the training population using the statistical model described for the GWAS analyses. The GEBV of animal i in the validation population was predicted by summing the SNP effects over all loci: GEBV_i = ∑_{j=1}^{M} Z_ij α_j, where α_j is the estimated effect of SNP j. Predictive accuracy was computed separately for each of the five cross-validation replicates as the correlation between the estimated breeding values and the adjusted phenotypic values, divided by the square root of the heritability.
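The cross-validation scheme and the accuracy measure can be sketched as below for GBLUP. This is a simplified illustration under stated assumptions: the heritability is passed in as known, the overall mean is estimated by the training-sample mean, and the genotypes, phenotypes and function names are hypothetical rather than taken from the study.

```python
import numpy as np

def gblup_predict(G, y, train, test, h2):
    """Predict breeding values of test animals from training phenotypes."""
    lam = (1.0 - h2) / h2                    # sigma_e^2 / sigma_a^2
    g_tt = G[np.ix_(train, train)]           # relationships among training animals
    g_vt = G[np.ix_(test, train)]            # test-by-training relationships
    mean = y[train].mean()                   # simple estimate of the overall mean
    lhs = g_tt + lam * np.eye(len(train))
    return g_vt @ np.linalg.solve(lhs, y[train] - mean)

def cross_validated_accuracy(G, y, h2, n_folds=5, seed=0):
    """Correlation of GEBV with adjusted phenotype, divided by sqrt(h2), per fold."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(y)), n_folds)
    accuracies = []
    for k, test in enumerate(folds):
        train = np.concatenate([f for i, f in enumerate(folds) if i != k])
        gebv = gblup_predict(G, y, train, test, h2)
        r = np.corrcoef(gebv, y[test])[0, 1]
        accuracies.append(r / np.sqrt(h2))
    return np.array(accuracies)

# Toy data (hypothetical): 100 animals, 300 SNPs, simulated phenotypes.
rng = np.random.default_rng(42)
z = rng.integers(0, 3, size=(100, 300)).astype(float)
p = z.mean(axis=0) / 2.0
w = z - 2.0 * p
G = w @ w.T / (2.0 * np.sum(p * (1.0 - p)))  # assumed VanRaden-style G matrix
y = w @ rng.normal(0.0, 0.1, size=300) + rng.normal(0.0, 1.0, size=100)
acc = cross_validated_accuracy(G, y, h2=0.5)
```

For BayesB the same loop applies, with the prediction step replaced by the product of the validation genotypes and the estimated SNP effects.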