Background

Ruminant production is of considerable economic value, since meat and milk are important agricultural products and major sources of protein for humans. In France, beef cattle production mainly uses purebred specialized breeds such as the Charolaise, Limousine and Blonde d’Aquitaine breeds [1, 2]. As a complex phenotype, beef quality is determined by both environmental and genetic factors [35]. Moreover, different criteria and/or perceptions are used to define meat quality: sensory, nutritional, technological or hygienic quality. Sensory traits such as palatability, juiciness and consumer eating satisfaction depend highly on beef tenderness [6]. Several studies have focused on Warner–Bratzler shear force (WBSF) as a relevant trait to identify genetic markers associated with beef tenderness [79] and have led to the development of a DNA-based commercial test that targets the calpain 1 (CAPN1) and calpastatin (CAST) genes. However, these markers explain only a fraction of the phenotypic variance and for some breeds (including the main French beef cattle) these tests are not informative [2, 8, 10].

Currently, there is an emerging consensus about the relevance of taking biological information into consideration for genomic selection [11, 12] and thus, it is important to identify functional single nucleotide polymorphisms (SNPs) to increase the accuracy of genomic predictions. System-based approaches have emerged as an alternative to study complex traits and to identify candidate genes. In this study, we used the association weight matrix (AWM), a systems biology approach that integrates information of genome-wide association studies (GWAS) with network inference algorithms, to identify candidate genes and their regulatory elements that could affect phenotype [13, 14]. This multi-trait approach was applied to 17 traits to identify a common set of interacting genes associated with meat tenderness and other meat quality traits across the Limousine, Charolaise and Blonde d’Aquitaine breeds.

Methods

Phenotypic traits, animals and genotypes

We used data resources on three French purebred specialized beef breeds, Blonde d’Aquitaine (n = 981), Charolaise (n = 1114) and Limousine (n = 1254) that were previously reported [1]. In brief, 17 traits were collected on young beef bulls: four traits were related to muscle conformation, three to carcass fatness, and 10 to meat quality (Table 1). Most of the young bulls were genotyped using the Illumina BovineSNP50 Genotyping BeadChip, i.e. 947 Blonde d’Aquitaine, 1059 Charolaise, and 1219 Limousine. All 114 sires (30, 48 and 36 for the Blonde d’Aquitaine, Charolaise, and Limousine breeds, respectively) were genotyped with the Illumina BovineHD BeadChip (777K SNPs). After quality control, 50K genotypes were imputed to HD within breed using the FImpute software [15] and pedigree information. Imputation was based on their respective reference populations of 672, 462, and 327 sires from the Charolaise, Limousine, and Blonde d’Aquitaine breeds, respectively, which were chosen based on their marginal contribution to the whole population of each breed [16].

Table 1 Description and summary statistics for the 17 traits analyzed in the three breeds

Association weight matrix approach and network analyses

After quality control, SNPs with a minor allele frequency (MAF) lower than 5 %, SNPs that mapped to the sex chromosomes or that were not mapped to the UMD3.1 bovine genome assembly were excluded. Details regarding models and the fixed effects that were fitted for each trait are available in [17]. Fifty-one GWAS were performed (17 traits for each of the three breeds) by single-trait-single-SNP association analysis, using the option mlma of the GCTA software [18] and the following model:

$$\tt {\mathbf{y}}_{ljm} = {\mathbf{u}}_{l} + {\mathbf{s}}_{l} {\mathbf{a}}_{k} + {\mathbf{e}}_{ljm} ,$$

where y ljm is a vector of performance records adjusted for contemporary groups effects; u represents the infinitesimal genetic effect with \(\tt {\mathbf{u\sim}}N(0,{\mathbf{G}}\sigma_{u}^{2} )\); G is the genomic relationship matrix (GRM) calculated using the autosomal SNPs based on the methodology of Yang et al. [18], with \(\tt \sigma_{u}^{2}\) representing the additive genetic variance; s l is a indicator variable depending on the l-th individual genotype for the k-th SNP, ak represents the additive association of the kth SNP on the j-th trait, and e ljm is a vector of random residual effects.

Subsequently, three independent association weight matrices (AWM) (one per breed) were built from the GWAS results. An AWM is a matrix with rows represented by genes and columns represented by phenotypes [13]. To construct an AWM, two matrices are required that both contain row-wise SNPs and column-wise phenotypes. The {i,j}th element of the first matrix is equal to the p value of the association of the i-th SNP with the j-th phenotype. In the second matrix, the {i,j}th element is equal to the z-score standardized additive effect of the i-th SNP for the j-th phenotype. Warner–Bratzler shear force (WBSF) was selected as the key phenotype and SNPs that were associated with WBSF (p value ≤0.01) were included in the AWM. In the next step, the dependency among phenotypes was explored by estimating the average number of other phenotypes (Ap) that were associated with these SNPs at a p value ≤ 0.01 (Ap = 2). Subsequently, all SNPs that were associated with at least two phenotypes at a p value ≤0.01 were included in the AWM. To build the AWM, in the next steps we followed the procedure described by Fortes et al. [13], which was modified as follows: (1) only SNPs within genes or located close to intergenic SNPs (within 10 kb of the coding region) were selected; and (2) to identify putative regulators, in addition to the transcription factors (TF) reported by Vaquerizas et al. [19], genes that encoded microRNA (miRNA) and long non-coding RNA (lnRNA) and that were mapped to the UMD 3.1 bovine genome assembly (GenBank assembly accession: GCA_000003055.3) were also considered in this analysis.

The proportion of the phenotypic variance explained by the SNPs was estimated using GCTA through a genomic restricted maximum likelihood (GREML) approach, as described in the model description. To estimate the variance explained by the AWM-SNPs, a first GRM was constructed based only on the SNPs that were selected for the AWM. Then, a second GRM was built using SNPs that were localized within genes found for all three breeds. The same numbers of randomly selected SNPs were used to made 10,000 GRM (10,000 replicates), to estimate the variance explained by those randomly selected SNPs. All these GRM were created for each breed.

Hierarchical clustering of traits (AWM columns) and genes (AWM rows) was estimated and visualized using the ‘hclust’ R function [20]. Significant gene–gene interactions (co-associations) were inferred using the partial correlation and information theory (PCIT) algorithm [21]. In the network, each node represents a gene, whereas each edge that connects two nodes represents a significant gene–gene interaction. Cytoscape software [22] was used to visualize the gene network and the CentiScaPe plug-in [23] was used to calculate specific node centrality values and network topology parameters.

Identification of key regulators, functional classification and pathway analyses

To identify potential regulators of the candidate genes across the three breeds, we applied an information lossless approach [24] that explored the connectivity of all regulators (TF, miRNA and lnRNA) in the network. We also used the iRegulonv1.3 Cytoscape plugin [25] to in silico identify TF binding site motifs in the cis-regulatory elements that were shared among the identified common candidate genes. Gene functional classification and pathway analyses were performed using Ingenuity Pathways Analysis software (IPA; Ingenuity Systems, Redwood City, CA). Over-represented gene ontology (GO) terms were identified using ClueGO, Cytoscape plug-in [26]. The cut-off for considering a significant over-representation was established by Benjamini and Hochberg multiple-test correction [27] (p value ≤0.05).

Results and discussion

Phenotype statistics and GWAS results

We used a systems biology approach to identify candidate genes that were co-associated with WBSF and 16 other phenotypes across the Charolaise, Limousine and Blonde d’Aquitaine breeds. Summary statistics, number of records available for each trait, and a brief description of the 17 traits are in Table 1. GWAS were performed for these 17 traits using the BovineHD BeadChip (777K) genotypes for each of the three breeds. The results of these GWAS served as the basis for the AWM approach and other analyses which are discussed below.

Single-breed analyses

After quality control, 533,604 (Blonde d’Aquitaine), 539,337 (Limousine) and 543,682 (Charolaise breed) informative SNPs remained for analysis. After applying a further set of selection criteria (see Methods section), the numbers of SNPs that were retained to build the AWM were equal to 2339 (Blonde d’Aquitaine), 2331 (Limousine) and 2518 (Charolaise). Cluster distributions were in agreement with the physiological similarities and genetic correlation among traits (see Additional file 1: Figure S1). Hence, a clear separation between WBSF and sensory traits such as tenderness sensory score (TEND), juiciness sensory score (JUIC) and flavor sensory score (FLAV) was observed. Opposite directionality of the estimated additive values was also observed between WBSF, intramuscular composition (IMF), fatness and conformations traits (see Additional file 1: Figure S1). In agreement with previous studies [28], SNPs detected with the AWM approach explained between 68 and 74 % of the phenotypic variance for WBSF (Table 2). Moreover, previously estimated genetic correlations agreed moderately well with AWM-based correlations (see Additional file 2: Table S1).

Table 2 Proportion of phenotypic variance explained by SNPs identified with the AWM approach, by the 206 common genes, and by the 206 randomly selected SNPs

Breed-specific candidate genes

Consistent with previous reports [13, 29], which support the reliability of our results, relevant biological information was captured by the within-breed AWM. SNPs that map to well-known candidate genes for tenderness and meat quality traits were identified (Table 3). Most of these SNPs were breed-specific in agreement with [17]. For example, using the AWM approach, we identified genes that encode proteins displaying differential abundance in the muscle of animals with extreme meat tenderness phenotypes [30]. These included for the Charolaise breed: phosphoglucomutase 1 (PGM1), stress-induced phosphoprotein 1 (STIP1) and capping protein muscle Z-line, beta (CAPZB); for the Blonde d’Aquitaine breed: ankyrin 1 erythrocytic (ANK1), Ki-67 (MKI67) and myosin regulatory light chain 2 skeletal muscle isoform (MYLPF); and for the Limousine breed: glutathione S-transferase alpha 4 and 5 (GSTA4, GSTA5), leptin (LEP) and fatty acid binding protein 4 (FABP4).

Table 3 Candidate genes identified for meat tenderness and related meat quality traits by breed and across breeds

Breed-specific networks and regulators

Gene–gene interactions were predicted using PCIT [21] and three marker-derived gene networks (one per breed) were inferred. Predicted interactions were based on partial correlations between SNP effects across traits that were detected as significant by the PCIT algorithm. To facilitate a posteriori analysis, network reduction was applied by considering only correlations that were greater than the mean ± 2 × SD in absolute value (rBlonded’Aquitaine ≥ 0.83, rCharolaise ≥ 0.84, rLimousine ≥ 0.83). Topological parameters were similar across the three breeds and the numbers of nodes connected by the largest correlations were equal to 2063 nodes connected by 8869 edges for the Blonde d’Aquitaine breed, 2116 nodes connected by 10,248 edges for the Charolaise breed, and 1979 nodes connected by 5734 edges for the Limousine breed.

To identify potential regulators, we focused on TF, miRNA and lnRNA. In agreement with its relevant role as a regulator of muscle cell growth and differentiation, myostatin (MSTN) was identified among the top TF for two of the three breeds (Charolaise and Blonde d’Aquitaine). Specifically for the Blonde d’Aquitaine breed, MSTN co-associated with the largest number of traits (eight traits, including WBSF), while MSTN co-associated with five traits for the Charolaise breed. Moreover, for both these breeds, a co-operative role of MSTN with two different miRNA was predicted in the transcriptional regulation of tenderness and meat quality traits: bta-mir-488 for Charolaise and ENSBTAG00000037365 for Blonde d’Aquitaine. Among the top regulators identified for the Limousine breed, we found the proline, glutamate and leucine rich protein 1 (PELP1) and the cAMP responsive element binding protein 3-like 3 (CREB3L3) genes. CREB3L3 is a TF that is activated by cyclic AMP stimulation and is linked to triglyceride metabolism and acute inflammatory response [31]. Interestingly, a relationship between acute stress and WBSF has been reported in Angus cattle [31]. PELP1 is a member of the chromatin remodeling complexes and a coactivator of the estrogen receptor and it plays an important role in the ER/growth factor cross-talk [32].

Multi-breed analyses

Beyond identifying within-breed candidate genes, the main goal of our study was to identify a common set of interacting genes, biological pathways and functions associated with meat tenderness and meat quality across the three breeds. Therefore, comparisons at the AWM and functional levels were performed as follows. At the gene level, the AWM approach revealed only a few common genes across the three breeds, their number decreasing from ~22 % of the total number of genes between two breeds to only ~8 % across the three breeds (representing 206 genes) (Fig. 1a). Previous studies had already reported that the number of quantitative trait locus (QTL) regions that overlap between bovine breeds is small [33, 34]. It should be noted that the number of common SNPs identified by the AWM approach was very small, ranging from 37 to 48 SNPs between two breeds, and not a single common SNP was identified when all three breeds were compared (see Additional file 3: Figure S2A). This can be explained by breed differences in allele frequencies and/or linkage disequilibrium (LD) and also by the possibility of different mutations in the same candidate gene between breeds. In agreement with previous reports, there was more overlap at the pathway and GO functional levels [35]. We found 24 over-represented pathways and 22 GO terms that were shared between the three breeds (see Additional file 3: Figure S2A) and Fig. 1b, including biological processes such as cell morphogenesis involved in differentiation (GO:0000904), calcium ion transmembrane transport (GO:0070588), Rho protein signal transduction (GO:0007266), positive regulation of GTPase activity (GO:0043547) and Rac protein signal transduction (GO:0016601) (see Additional file 4: Table S2).

Fig. 1
figure 1

Number of overlapping genes (a) and overlapping GO biological processes (b)

Across-breed candidate genes

Among the candidate genes that have been reported to be associated with meat quality traits, calpastatin (CAST), calpain 5 (CAPN5), growth hormone receptor (GHR), RAR-related orphan receptor C (RORC), myostatin (MSTN), and ArfGAP with SH3 domain, ankyrin repeat and PH domain 1 (ASAP1) were identified in at least two of the three breeds (Table 3), and calpain 1 (CAPN1), collagen type XI alpha 1 (COL11A1), and RAB11 family interacting protein 5 (RAB11FIP5) were identified in all three breeds. The estimated proportion of phenotypic variance for WBSF explained by the SNPs that mapped to the 206 genes that were common across the three breeds ranged from 28 to 30 % and was significantly larger (p value <0.01) than the variance explained by the same number of randomly (10,000 replicates) selected SNPs (Table 2; Fig. 2).

Fig. 2
figure 2

Distribution of the proportion of phenotype variance for WBSF explained by 206 randomly selected SNPs in 10,000 replicates. The black vertical line represents the proportion of the phenotype variance for WBSF explained by the 206 common SNPs identified for the Limousine and Charolaise breeds. The green vertical line represents the proportion of the phenotype variance for WBSF explained by the 206 common SNPs identified for the Blonde d’Aquitaine breed

To identify other candidate genes, we examined the position of each SNP located within the 206 genes against the bovine QTL database (http://www.animalgenome.org/cgi-bin/QTLdb/BT/index). In spite of differences in the genetic background of the animals included in our study and in the methods used to measure WBSF, there were several overlapping QTL for WBSF between our study and those previously reported across five taurine cattle breeds [8] (see Additional file 5: Table S3). We identified several interesting candidate genes within these QTL regions, i.e. the RNA binding motif protein 20 (RBM20) gene, which is predominantly expressed in striated muscle and the expression of which correlates with sarcomere assembly [36]. Genes involved in adipogenesis (EBF1), calcium metabolism (MRVI1, KCNIP4, PCDH7, and CUBN), muscle metabolism, growth retardation and bone metabolism (WWOX and AKAP6) were also observed.

It is interesting to note that 29 % (60/206) of the common genes that we detected in our multi-breed analysis have been reported to be associated with WBSF in Brazilian Nelore beef cattle using a Bayesian approach [37] (see Additional file 6: Table S4) and ~39 % (80/206) of the common genes have been reported to be associated with intramuscular fat (IMF) deposition in Australian beef cattle [38] (see Additional file 7: Table S5). Oury et al. [39] reported the existence of a relationship of meat tenderness with IMF content, as well as with total and insoluble collagen content, glycolytic metabolism and mean cross-sectional fibre area. IMF explains part of the variability in sensory tenderness score of longissimus dorsi muscle in beef cattle [4]. In total, 23 genes were common across the Brazilian, Australian and French datasets (see Additional file 8: Table S6). This overlap of genes and QTL for WBSF suggests that some of the identified candidate genes may be useful in multi-breed gene-assisted selection programs for meat tenderness. Further studies to identify the functional mutations are warranted.

Key regulators across breeds

Among the 206 genes, 22 were predicted as putative regulators (21 TF + 1 miRNA). Analysis of the literature showed that some of these 22 putative regulators belong to biological pathways and relevant functions that are related to muscle development. For example, RAR-related orphan receptor A (RORA) encodes a nuclear receptor that is essential for the activation of myogenic-specific markers and regulates a number of genes involved in lipid metabolism [40, 41]. Zinc finger protein 423 (ZNF423) encodes a TF that is involved in metal ion binding and was recently reported as a potential candidate gene for skeletal muscle growth rate and major cell types in cattle [42]. Regulators involved in epigenetic modifications, such as histone deacetylase 4 (HDAC4), were also identified. HDAC4 is associated with cell differentiation and tissue development and interestingly, it was shown to be involved in muscle maturation via an interaction with the myocyte enhancer factor [43, 44]. We identified only one microRNA among the common regulators, i.e. bta-let-7i, which was initially characterized from bovine adipose tissue and mammary gland [45]. This microRNA maps to bovine chromosome 5 (between 51,209,080 and 51,209,164 bp) and, to the best of our knowledge, there is no report on an association of bta-let-7i with meat quality traits.

To detect potential candidate regulators among the 206 common genes that were identified in our study, we performed in silico identification of enriched TF binding motifs and TF with the iRegulon Cytoscape plugin [25]. We detected different enriched motifs that can bind directly to candidate TF (see Additional file 9: Table S7) such as the interferon regulatory factor 1 (IRF1) TF (enriched motif ID = transfac_pro-M00747; NES = 3.860; Targets = 57), which was previously identified as a master regulator in the bovine skeletal muscle of two beef cattle breeds (Piedmontese x Hereford and Wagyu x Hereford) [46]; the nuclear factor of activated T-cells, cytoplasmic, calcineurin-dependent 4 (NFATC4) TF (enriched motif ID = transfac_pro-M01734; NES = 3.717, Targets = 50), which encodes a member of the nuclear factor of activated T cells (NFAT) DNA-binding transcription complex protein family that plays an important role in skeletal muscle development and growth [47] and in glucose and insulin homeostasis [48]; the core-binding factor, beta subunit (CBFB) TF (enriched motif ID = hdpi-CBFB; NES = 3.339; Targets = 43), which encodes a beta subunit of a heterodimeric PEBP2/CBF TF family that regulates genes involved in osteogenesis [49]; the serum response factor (SRF) (enriched motif ID = yetfasco-145; NES = 3.295; Targets = 23), the adenosine deaminase, RNA-specific, B1 (ADARB1) (enriched motif ID = hdpi-ADARB1; NES = 3.231; Targets = 63) and the mex-3 RNA binding family member C (MEX3C) (enriched motif ID = hdpi-RKHD2; NES = 3.339; Targets = 43), which play a crucial role in skeletal muscle growth and maturation [50], skeletal myogenesis [51], and postnatal growth and energy balance regulation [52, 53], respectively. Finally, iRegulon identified the GATA binding protein 2 (GATA2) (NES = 4.887) as an enriched TF that can bind 73 of the 206 analyzed genes. Interestingly, GATA factors play a key role in the regulation of adipogenesis and GATA2 is included in the top TF that were detected in a genome-wide analysis aimed at linking transcriptional and regulatory information in bovine skeletal muscle [54].

Implications and conclusions

Consumer eating satisfaction and palatability attributes depend highly on meat tenderness and, thus, improvement of meat tenderness is relevant to the beef industry. In this study, we used a systems biology approach to explore dependencies among traits, identify pleiotropic SNPs, and infer networks of interacting genes that affect correlated traits. Genetic heterogeneity among breeds was observed and most SNPs were significantly associated with WBSF within rather than across breeds, with only 8 % of the total number of genes revealed by the AWM approach found to be co-associated in all three breeds. Compared to the within-breed AWM, the common set of genes explained between 38 and 42 % of the phenotypic variance (Table 2). This decrease in % of variance explained is expected as a consequence of the exclusion of relevant breed-specific variants.

Multivariate approaches that exploit correlations between traits have been used, for example in humans, to increase the prediction accuracy for schizophrenia, bipolar disorder, and major depressive disorder [55]. Using the AWM approach, Snelling et al. [28] reported that the accuracy of functionally-informed genomic predictions is higher than that of classical genomic predictions for beef tenderness. In addition, in agreement with recent reports [11], Snelling et al. [28] underline the importance of properly choosing the a priori biological information to obtain an increase in prediction accuracy. Our results also suggest that breed-specific mutations may affect differently the same genes. Therefore, it is likely that, depending on allele frequencies and LD patterns, a marker that is identified to be associated with a trait for one breed may not be informative for another breed. We hypothesize that for multi-breed programs, strategies that use the LD between markers (such as the use of haplotypes), together with functional information, will be more accurate for genomic predictions than using functionally-guided single-markers.

We demonstrate that the use of systems biology approaches that explore the association between correlated traits can increase the power to identify candidate genes beyond the classical one-dimensional approach also in a multi-breed context. However, it is possible that other important SNPs for meat tenderness were not captured by our approach, for example, because their impact is breed-specific. In this study, we implemented the AWM approach in a cis gene-centered manner. As recently demonstrated by the human ENCODE project, most biologically meaningful variants are likely to have a regulatory function, which suggests that most causative SNPs will be non-coding [56]. The same is expected for other species, including cattle. Improvement of the functional annotation of the bovine genome should facilitate the identification of functional mutations. Studies to better understand the genetic basis of complex traits is an active area of research and a priority to improve the accuracy of genomic predictions. In this regard, the use of integrative approaches that combine multiple sources of complementary information (e.g. intermediate phenotypes) could improve our understanding of the biological processes underlying phenotypes of interest to support technological development towards the improvement of beef quality.