Background

Wild ancestors of the pig (Sus scrofa) are still alive, providing an excellent model for tracing their evolutionary history and for defining the evolutionary mechanisms driven by artificial selection during domestication [1]. Pigs were first domesticated approximately 9,000 years ago [1-3]. The domestication of pigs occurred independently in various parts of the world [2-7] and, historically, Europe and China have been the two major areas of pig breeding [8]. More than 730 pig breeds or lines have undergone natural and artificial selection in different environments, especially to meet distinct human needs, which has produced the large diversity of morphological and physiological characteristics that currently exists worldwide [5,9,10]. Examples include the lean and muscular Landrace (Lde) type in Europe and the Lantang (LT) type in China, which has high fat deposition and thin muscle fibers [11]. The lean (Lde) and obese (LT) pig breeds differ significantly in the genetics of muscle growth rate and fatness [11]. Lde is characterized by a high lean meat percentage, fast-growing muscle and high body weight [12,13], while LT, an obese pig breed indigenous to China, is characterized by high intramuscular fat content, slow-growing muscle, and low body weight [11]. Significant genome and transcriptome differences have been revealed by comparative genomic studies [2,11,13]. However, the mechanisms underlying the morphological variations in muscle among pig breeds are still unclear. Generally, it has been reported that changes in gene expression and regulatory interaction networks, rather than genetic changes that alter the amino acid sequences of proteins, account for the phenotype differences among species [14]. Therefore, the identification of gene expression regulatory networks in pig breeds with distinct muscle phenotypes is necessary to understand how muscle has been modified during pig domestication.

Strong selective pressures through artificial selection during domestication have caused rapid phenotype evolution and major changes in the morphological architecture of pig muscle [1,2,7,15]. The Lde and LT breeds, which have distinct muscle phenotypes, were domesticated under different breeding goals in Europe and Asia [2,5,11]. Thus, artificial selection was probably critical in modifying the gene expression regulatory networks that resulted in muscle phenotype divergence. To better understand gene expression network differences in muscle development between the Lde and LT breeds, we applied a global network approach using weighted gene coexpression network analysis (WGCNA) [14,16-20]. WGCNA elucidates the higher-order relationships between groups of genes coexpressed with high topological overlap across samples, which are termed “modules”. Topological overlap is a pairwise measure of the similarity of the coexpression relationships of two genes with all other genes in a network. The topological overlap of paired proteins in gene coexpression networks has been found to be significantly higher for physically interacting protein pairs than for pairs that do not interact. Thus, WGCNA screens for the core functional units of transcriptional networks. WGCNA also identifies the genes with the highest degree of connectivity within each module, referred to as “hub genes”. Hub genes are expected to play critical roles in the coexpression network of each module [14,16-21]. Thus, a comprehensive analysis of gene coexpression relationships in different muscle phenotypes provides an efficient way of exploring the genetic basis of phenotype variation. We used this approach to identify and visualize modules of coexpressed genes with clear functional interpretations and to explore module differences between breeds.
We identified modules of coexpressed genes that corresponded to muscle phenotypes, and determined the hub genes responsible for the key functional distinctions between breeds. Our results demonstrated that the molecular mechanism underlying phenotype divergence between breeds cannot be robustly explained by differential gene expression alone but can be explained by coexpression network modules. We also showed that muscle phenotype differences between the Lde and LT breeds were not regulated by the muscle genes alone but by the coordinated action of muscle, nerve, and immunity genes. Thus, our results indicated that the regulation of muscle development was more complex than previously acknowledged. The evolutionary rates of most modules were accelerated, implying that complex species-specific coexpression networks underlie artificial selection during domestication. These findings are important in elucidating the molecular mechanisms that underlie muscle development and phenotype variation, and reveal the potential impact of evolutionary changes at the coexpression network level.

Results

Gene coexpression networks in lean and obese pig muscle

To investigate coexpression networks that comprehensively represent muscle transcription during pig development, we constructed gene coexpression networks from 20 next-generation sequencing data sets generated by Solexa/Illumina’s genome sequencing technology [11]. The 20 data sets comprised 10 LT and 10 Lde data sets, which contained muscle transcriptome data at 35, 49, 63, 77, and 91 days post-coitus and at 2, 28, 90, 120, and 180 days post-natum for each breed [11]. The gene expression levels in each sample were assessed using 3’ digital gene expression tag-based profiling [11]. A total of 3652 and 3404 temporally differentially expressed genes (DEGs) were identified during LT prenatal and postnatal muscle development, respectively. Similarly, 3649 and 3408 DEGs were identified from Lde prenatal and postnatal muscle, respectively. Weighted Pearson correlations were calculated for all 3652 and 3404 DEGs in LT, and for all 3649 and 3408 DEGs in Lde. All the weighted Pearson correlations were converted into matrices of connection strength by a power function [22]. The topological overlaps between genes were then calculated using these connection strengths. Topological overlap values were used to assess the similarity of the coexpression relationship of two genes with all the other genes in the network in a robust and biologically meaningful way [22,23]. Average linkage hierarchical clustering was used to cluster coexpressed genes with similar patterns of connection strengths or with high topological overlaps into modules. In all, we identified 24 modules in the Lde prenatal network (Figure 1A), 32 in the Lde postnatal network (Figure 1B), 35 in the LT prenatal network (Figure 1C) and 34 in the LT postnatal network (Figure 1D) (Additional file 1: Table S1 and Table S2).
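The network construction steps described above (correlation, power transformation, topological overlap) can be illustrated with a minimal sketch. It is not the pipeline used in this study: it uses plain rather than weighted Pearson correlation, an unsigned network, a hypothetical soft-threshold power `beta=6`, and a four-gene toy expression matrix, all of which are assumptions made for illustration. The topological overlap follows the commonly used definition TOM_ij = (sum_u a_iu a_uj + a_ij) / (min(k_i, k_j) + 1 - a_ij).

```python
from math import sqrt


def pearson(x, y):
    """Plain Pearson correlation between two equal-length expression profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / sqrt(vx * vy)


def adjacency(expr, beta=6):
    """Connection strengths a_ij = |cor(i, j)| ** beta (beta is an assumed value)."""
    n = len(expr)
    adj = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            a = abs(pearson(expr[i], expr[j])) ** beta
            adj[i][j] = adj[j][i] = a
    return adj


def topological_overlap(adj):
    """Pairwise topological overlap computed from the connection strengths."""
    n = len(adj)
    k = [sum(row) for row in adj]  # connectivity of each gene
    tom = [[1.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            shared = sum(adj[i][u] * adj[u][j] for u in range(n))
            t = (shared + adj[i][j]) / (min(k[i], k[j]) + 1.0 - adj[i][j])
            tom[i][j] = tom[j][i] = t
    return tom


# Toy data: 4 hypothetical genes measured at 5 time points.
expr = [
    [1.0, 2.0, 3.0, 4.0, 5.0],
    [2.0, 4.0, 6.0, 8.0, 10.5],
    [5.0, 4.0, 3.0, 2.0, 1.5],
    [1.0, 3.0, 2.0, 5.0, 4.0],
]
tom = topological_overlap(adjacency(expr))
```

In a full analysis, the dissimilarity 1 - TOM would then be passed to average linkage hierarchical clustering and the resulting dendrogram cut into modules.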

Figure 1

Gene coexpression networks in lean (Lde) and obese (LT) pig muscle. The Lde prenatal network (A), Lde postnatal network (B), LT prenatal network (C), and LT postnatal network (D) are shown. The dendrograms were produced by average linkage hierarchical clustering of genes on the basis of topological overlap. The y axes correspond to coexpression distance and the x axes to genes. Dynamic tree cutting was used to determine modules, generally by dividing the dendrogram at significant branch points. The modules of coexpressed genes were assigned colors and numbers as indicated by the horizontal bar beneath each dendrogram. See also Additional file 1: Table S1.

Coexpression network modules between lean and obese breeds are more different in postnatal animals than in prenatal animals

To determine the preservation of coexpression network modules between different muscle types, we assessed whether different modules were composed of the same genes on a module-by-module basis. A high degree of module preservation between the prenatal animals was observed by calculating the overlap for each possible pair of modules (Additional file 1: Table S3). Two pairs of modules were deemed to show significant preservation, based on an overlap of > 50% in their gene coexpression relationships. In the LT-turquoise and Lde-turquoise module pair (P < 0.001), we identified 1021 overlapping genes: 47% (1021/2183) in Lde-turquoise and 72% (1021/1417) in LT-turquoise. In the LT-blue and Lde-brown module pair (P < 0.001), we identified 100 overlapping genes: 38% (100/261) in LT-blue and 56% (100/177) in Lde-brown. Overall, these two coexpression network modules with 1121 genes (31% of all module genes) were highly preserved in the LT and Lde prenatal animals (Additional file 1: Table S4). The Gene Ontology (GO) annotations assigned to the genes indicated that most of the 1121 genes were involved in cell differentiation and growth, muscle and skeletal system development, neuron development, and cellular response (Additional file 1: Table S5). Because the “hub genes” have the highest degree of within-module connectivity, they were expected to play critical roles in the coexpression network modules and were therefore considered to be a primary indicator of the module function [14,16-21]. We identified the hub genes by visualizing the preserved coexpression network modules (Figure 2). In the blue-brown module (Figure 2A), the hub genes included SMN1, which has been shown to be crucial in neurite outgrowth and neuromuscular maturation during the differentiation and development of neurons and muscle [24]; GNB2L1 [25] and SBDS [26], which may be involved in cell division and growth (Additional file 1: Table S5); and ELOF1, a conserved transcription elongation factor [27]. 
In the turquoise-turquoise module (Figure 2B), the hub genes included HOXB7, HEY2 and PBX2, which have been reported to regulate muscle development [28-30]; and MPPED2 and NEFL, which may play roles in neuronal differentiation [31,32]. The degree of module preservation between the postnatal animals was much lower than that between the prenatal animals. Indeed, only two modules containing a total of 101 genes (1.5% of all module genes) were common between the LT and Lde postnatal animals (Additional file 1: Table S4). Thus, the coexpression network modules were more conserved in prenatal than in postnatal animals, and muscle-related genes were found to play key roles in most of the preserved coexpression network modules.
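The module-by-module overlap comparison can be sketched as follows. The source does not state which test produced the reported P values, so the one-sided hypergeometric test below is an assumption, as are the function names, the toy gene sets, and the background size of 20 genes. The last lines recompute the overlap percentages from the counts reported in the text.

```python
from math import comb


def overlap_fractions(module_a, module_b):
    """Number of shared genes and the fraction of each module they represent."""
    shared = len(module_a & module_b)
    return shared, shared / len(module_a), shared / len(module_b)


def hypergeom_tail(overlap, size_a, size_b, universe):
    """P(X >= overlap) when size_b genes are drawn at random from a background
    of `universe` genes, of which size_a belong to module A."""
    upper = min(size_a, size_b)
    hits = sum(
        comb(size_a, x) * comb(universe - size_a, size_b - x)
        for x in range(overlap, upper + 1)
    )
    return hits / comb(universe, size_b)


# Toy example: two hypothetical modules drawn from a background of 20 genes.
module_a = {f"g{i}" for i in range(8)}            # g0..g7
module_b = {f"g{i}" for i in range(3, 8)} | {"g9"}  # g3..g7 plus g9
shared, frac_a, frac_b = overlap_fractions(module_a, module_b)
p_value = hypergeom_tail(shared, len(module_a), len(module_b), universe=20)

# Overlap percentages recomputed from the counts reported in the text.
reported = {
    "Lde-turquoise": round(100 * 1021 / 2183),
    "LT-turquoise": round(100 * 1021 / 1417),
    "LT-blue": round(100 * 100 / 261),
    "Lde-brown": round(100 * 100 / 177),
}
```

Because the overlap fraction depends on which module is used as the denominator, both fractions are returned for each module pair.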

Figure 2

Visualization of the common gene coexpression network modules to identify hub genes. The eigengenes in the common modules between LT-blue and Lde-brown (A) and between LT-turquoise and Lde-turquoise (B) are shown. The top 300 connections are shown for each module. Dots correspond to genes and lines to connections; hub genes have at least 15 connections. Where the gene symbols are unknown, gene IDs are shown (e.g., WE424869). See also Additional file 1: Table S4.
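The hub criterion used in the network figures (at least 15 connections among the top 300 connections of a module) can be expressed as a short sketch. The edge-list format, the function name, and the toy star-shaped network are hypothetical; only the two thresholds come from the captions.

```python
from collections import Counter


def hub_genes(edges, top_n=300, min_connections=15):
    """Return genes with at least `min_connections` edges among the
    `top_n` strongest connections. `edges` holds (gene_a, gene_b, weight)."""
    strongest = sorted(edges, key=lambda e: e[2], reverse=True)[:top_n]
    degree = Counter()
    for gene_a, gene_b, _weight in strongest:
        degree[gene_a] += 1
        degree[gene_b] += 1
    return sorted(g for g, d in degree.items() if d >= min_connections)


# Toy module: one central gene strongly connected to 20 others,
# plus weak connections between neighbouring peripheral genes.
edges = [("HUB", f"g{i}", 0.9) for i in range(20)]
edges += [(f"g{i}", f"g{i + 1}", 0.1) for i in range(19)]

hubs = hub_genes(edges)
hubs_top10 = hub_genes(edges, top_n=10)
```

Restricting the count to the strongest connections means that a gene qualifies as a hub only if its links are both numerous and strong relative to the rest of the module.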

Differences in prenatal modules between lean and obese breeds provide insight into prenatal muscle development differences in fiber number and muscle fiber composition

Differences in transcriptional levels are important for studying the evolutionary basis of phenotypic differences at the molecular level [18]. Differences in network modules could provide a basis for better understanding of the differences in muscle development between lean and obese pigs. In this study, we identified six highly lean-specific modules and five highly obese-specific modules in the prenatal animals (Additional file 1: Table S6 and Table S7). A GO analysis of these module genes revealed that nine of these modules were involved in muscle development, neuron development and cellular response (Additional file 1: Table S8 and Table S9). Hub genes involved in muscle development were enriched in six lean-specific modules (HSBP1 in Lde-blue; MYL1 and DLK1 in Lde-midnight blue; MAP4 and FERMT2 in Lde-only-turquoise; TPM2, TCEA3, ZFP36L1, DES, TNNT3, and ANK3 in Lde-pink; MAPK12, MYLPF, and MYH2 in Lde-red; and SIRT1, OSR2, and MEF2D in Lde-tan) and in three obese-specific modules (GNB2L1 in LT-blue; ACTN2, MYH7, MYOZ3, MALAT1, PTP4A3, and ENO3 in LT-purple, and TNNI2 and DAG1 in LT-yellow green) (Figure 3). The hub genes in the LT-dark red module were significantly enriched for genes involved in cellular response (RRAGD, EPHX1, TPD52 and PSMA2). Overall, a greater number of muscle development-related modules that regulate fiber number and muscle fiber composition were identified in lean Lde animals than in obese LT animals.

Figure 3

Visualization of breed-specific gene coexpression networks in prenatal animals. (A) LT-yellow green (B) LT-purple (C) LT-blue (D) Lde-tan (E) Lde-red (F) Lde-pink (G) Lde-turquoise (H) Lde-midnight blue (I) Lde-blue. The top 300 connections are shown for each module. Dots correspond to genes and lines to connections; hub genes have at least 15 connections. Where the gene symbols are unknown, gene IDs are shown.

Differences in postnatal modules between lean and obese breeds provide insight into differences in postnatal muscle growth and fat deposition

Only two modules were common between lean and obese postnatal animals; however, 15 highly lean-specific modules and 13 highly obese-specific modules were identified (Additional file 1: Table S10 and Table S11). GO analysis of these module genes revealed that 18 of these modules were involved in muscle development, neuron development, and cellular response, and three were enriched in cellular response and metabolism (Additional file 1: Table S12 and Table S13). Hub genes involved in muscle development were enriched in 13 lean-specific modules (MYBPC1 and CBX3 in Lde-blue; PRRX1 in Lde-dark grey; USP2 in Lde-green; PDLIM7 in Lde-grey60; VCAM1, CXCL12, HRAS, SETD3, and MYLPF in Lde-light yellow; UNC45B and DZIP1 in Lde-midnight blue; MYOZ2 and FABP3 in Lde-pink; LMNA and PRMT5 in Lde-red; UBR5 in Lde-royal blue; JUN, SPARC, and TEAD1 in Lde-sky blue; STAT5B and GNB2L1 in Lde-salmon; MLIP in Lde-yellow; and ELL3 and RPL27A in Lde-purple). The hub genes in the Lde-black module were significantly enriched for genes involved in the regulation of alternative splicing (ZRANB2, RNPS1, and SRSF6) (Figure 4). In addition, muscle development hub genes were identified in 11 obese-specific modules (RHEB in LT-dark magenta; SIX1, MUSTN1, and SFRS1 in LT-grey60; FHOD1 and SMPX in LT-orange; KLF10, HDLBP, and JAK1 in LT-sienna3; MYF6, CDK9, TEAD4, and S100A11 in LT-sky blue; GADD45A and PRMT5 in LT-violet; TEAD1 in LT-yellow green; SFRS18 in LT-magenta; ATP5B in LT-red; MCL1, CDKN3, and RBM19 in LT-light yellow; and SPNS1 in LT-tan). In particular, hub genes involved in intramuscular fat deposition and meat quality were significantly enriched in five obese-specific modules (SFRS18 in LT-magenta; ATP5B in LT-red; ACOT9 in LT-light green; ACOT8 and CSRP1 in LT-black; and HDLBP in LT-sienna3) (Figure 5). 
Thus, differences between the lean- and obese-specific modules in the postnatal animals provided insights into differences in postnatal muscle growth and fat deposition in the LT and Lde pigs.

Figure 4

Visualization of Lde-specific gene coexpression networks in postnatal animals. (A) Lde-yellow (B) Lde-salmon (C) Lde-sky blue (D) Lde-red (E) Lde-pink (F) Lde-midnight blue (G) Lde-royal blue (H) Lde-purple (I) Lde-light yellow (J) Lde-grey60 (K) Lde-green (L) Lde-dark grey (M) Lde-blue (N) Lde-black. The top 300 connections are shown for each module. Dots correspond to genes and lines to connections; hub genes have at least 15 connections. Where the gene symbols are unknown, gene IDs are shown.

Figure 5

Visualization of the LT-specific gene coexpression networks in postnatal animals. (A) LT-yellow green (B) LT-violet (C) LT-tan (D) LT-sky blue (E) LT-sienna3 (F) LT-red (G) LT-orange (H) LT-magenta (I) LT-light yellow (J) LT-light green (K) LT-grey60 (L) LT-dark magenta (M) LT-black. The top 300 connections are shown for each module. Dots correspond to genes and lines to connections; hub genes have at least 15 connections. Where the gene symbols are unknown, gene IDs are shown.

Regulation of muscle development is coordinated by muscle, nerve, and immunity genes

Among the 42 modules mentioned above (i.e., the three common modules and the 39 breed-specific modules), 24 contained genes related to muscle development, nervous system development, and immune response, seven contained genes related to muscle development and immune response, and another three contained genes related to muscle development and nervous system development (Additional file 1: Table S5, Table S8, Table S9, Table S12 and Table S13). Many neuron and immune response genes played crucial roles in the coexpression network modules of muscle (Figures 2, 3, 4 and 5). This finding suggests that the regulation of muscle development might be more complex than previously acknowledged, because our results suggest that the muscle development process may be regulated not only by muscle genes but by the coordinated action of muscle, nerve, and immunity genes.

Detection of positive selection pressure

To examine the genes that showed accelerated evolution in the 42 modules, we obtained the ortholog sequences of the 4597 genes in these modules from whole-genome resequencing data of 37 individual pigs and 11 wild boars. Evolutionary rates (Ka/Ks values, the nonsynonymous/synonymous substitution rate ratio) were inferred from the filtered alignments of these 4597 module genes (Additional file 1: Table S4-S7, Table S10 and Table S11). We found that 80% of these genes had Ka/Ks ratios < 0.1, indicating a high level of purifying selection pressure in these genes (Figure 6), and approximately 7% of the genes had Ka/Ks ratios > 0.1 (Figure 6). Five genes under strong positive selection were identified in the prenatal common modules (Table 1), while in the 11 prenatal breed-specific modules only one gene under positive selection was identified, in the LT-blue module (Table 1). In the postnatal modules, five genes from six breed-specific modules were found to be under strong positive selection, while no positively selected genes were found in the postnatal common modules. These genes could be involved in the regulation of basic cell biological processes, such as cell migration (CDC42BPA), transport (PLTP), proteolysis (PLAU), and RNA processing (SART3) (Table 1). In particular, CMYA5 and FHOD1 have been reported to regulate the muscle cell phenotype and meat quality [3,33]. These results suggested that the genes under positive selection may have played a role in the muscle phenotype divergence among pig breeds. However, among these positively selected genes, only FHOD1 was a hub gene in the postnatal LT-orange and Lde-purple modules (Table 1). A high level of purifying selection pressure was identified in 94 other muscle-related hub genes. Therefore, although coding sequence changes under positive selection have a role in the evolution of gene function, their role in the muscle phenotype divergence among pig breeds seemed to be minor. 
The divergence of coexpression modules among breeds might regulate the muscle phenotype divergence during domestication.

Figure 6

Detection of selection pressure on all module genes. Ka/Ks values are the nonsynonymous/synonymous substitution rate ratios; Conserved indicates gene sequences that are conserved, with no SNPs detected; 0, Ka/Ks = 0; 0–0.1, 0 < Ka/Ks < 0.1; 0.1-1, 0.1 < Ka/Ks < 1; >1, Ka/Ks > 1; 99, Ka/Ks = 99. Genes with Ka/Ks ratios equal to 99 were not included in the 7% of genes with Ka/Ks ratios > 0.1 because estimates of omega equal to 99 are not reliable.
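The Ka/Ks categories listed in this legend can be written as a small binning function. This is a sketch only: the function name, the `has_snp` flag, and the toy values are hypothetical, and because the legend does not say where values of exactly 0.1 or exactly 1 belong, assigning them to the upper bin is an assumption.

```python
from collections import Counter


def kaks_category(ka_ks, has_snp=True):
    """Assign a gene to one of the Ka/Ks categories used in the legend."""
    if not has_snp:
        return "Conserved"   # no SNP detected, so no ratio is estimated
    if ka_ks == 99:
        return "99"          # unreliable omega estimate, reported separately
    if ka_ks == 0:
        return "0"
    if ka_ks < 0.1:
        return "0-0.1"
    if ka_ks < 1:            # assumption: exactly 0.1 falls in this bin
        return "0.1-1"
    return ">1"              # assumption: exactly 1 falls in this bin


# Toy values for six hypothetical genes: (Ka/Ks estimate, SNP detected?)
toy_genes = [(None, False), (0.0, True), (0.05, True),
             (0.4, True), (2.3, True), (99, True)]
counts = Counter(kaks_category(v, snp) for v, snp in toy_genes)
```

Keeping the value 99 in its own category mirrors the legend, where such genes are excluded from the proportion of genes with Ka/Ks > 0.1.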

Table 1 Module genes under positive selection

Discussion

During the domestication of wild boar, dramatic phenotype changes were generated in domestic pigs under artificial selection with different breeding goals. For example, the lean (Lde) and obese (LT) pig breeds have significant genetic differences in the processes associated with muscle growth rate and fatness [11]. The pig genome has been sequenced and resequenced, which has made it easier to investigate the regulatory mechanisms that underlie the phenotype diversity in domestic pigs. Using genome resequencing, Rubin et al. [7] identified a few genes related to pig domestication that were under positive selection; however, none of these genes were related to muscle phenotype. Most previous studies have focused on changes in gene expression, while several studies have reported that connectivity was a more sensitive measure of evolutionary divergence than gene expression changes alone [14,16,18,19,21]. Therefore, we used WGCNA to reveal molecular and evolutionary mechanisms associated with the coordination of gene expression patterns in different pig breeds with distinct muscle phenotypes. In this study, we showed that the transcriptional diversity of different muscle phenotypes was regulated at the genome level by distinct gene coexpression networks.

Comparison of the coexpression network modules between prenatal LT and Lde animals, which were constructed using transcriptome data of five developmental stages, revealed that 1121 genes in two modules were also conserved in the LT and Lde pigs. We have highlighted seven hub genes (SMN1, GNB2L1, SBDS, ELOF1, HOXB7, HEY2 and PBX2) that were predicted to play key roles in muscle development. SMN1 encodes a protein that is crucial in neuromuscular maturation [24]. GNB2L1 [25] and SBDS [26] encode proteins that are involved in cell division and growth, which may contribute to the proliferation of muscle cells. HOXB7, HEY2 and PBX2 encode proteins that directly regulate muscle development [28-30], and ELOF1 encodes a conserved transcription elongation factor, which might regulate the basic transcription process of muscle genes [27]. These results suggested that the conserved coexpression network modules contained genes that were associated with the regulation of basic muscle development; implying that the key processes that regulate muscle development are similar in the two breeds. Nonetheless, six highly lean-specific modules and five highly obese-specific modules were identified in the prenatal LT and Lde animals, indicating that prenatal myogenesis was significantly different in the two breeds. Most of these breed-specific modules were involved in muscle development, neuron development, and cellular response. Among the genes with known functions, 17 hub genes related to muscle development were found to play major roles in the six lean-specific modules. As an essential myogenesis regulator in many diverse species, MEF2D directly regulates muscle genes at all developmental stages [34]. SIRT1 has been found to increase the cell proliferation of myoblasts [35]. FERMT2 regulates myogenic differentiation by the myogenic factor, myogenin, via canonical Wnt signaling [36]. 
DES [37], MAPK12 [38], DLK1 [39] and MAP4 [40] were reported to play essential roles in myoblast fusion, myotube formation, and maintenance of the structural and functional integrity of muscle during myogenesis. OSR2 [41] and TCEA3 [42] encode proteins that regulate proliferation and development genes. ANK3 [43] and HSBP1 [44] have been found to play critical roles in myogenesis. In the obese-specific modules, only seven key muscle-related hub genes were found. Among these genes, MALAT1 encodes a protein that was reported to regulate myoblast proliferation [45], and ENO3 and DAG1 [46] have both been shown to regulate myogenesis [47]. A greater number of hub genes related to myogenesis were detected in the lean-specific modules than in the obese-specific modules.

This might have resulted in the formation of more muscle fibers in Lde pigs during embryonic development, which may explain the main phenotype difference in prenatal muscle development between the LT and Lde breeds [11]. In addition to the genes that were directly associated with myogenesis, numerous muscle fiber type-related genes in the coexpression network modules differed between the LT and Lde pigs. For example, the hub genes MYL1 [48], TNNT3 [49], and MYH2 [50] in the lean-specific modules encode proteins that are critical for fast fiber differentiation, and TPM2 [51] and MYLPF [52] have been reported to be critically important for fast and slow skeletal muscle development. In the obese-specific modules, the hub gene MYH7 encodes a protein that regulates slow skeletal muscle fiber [53], while ACTN2 [54], TNNI2 [49], and MYOZ3 [55] have been found to be highly expressed in fast skeletal muscle fibers, and MYOZ3 is closely related to meat quality [55]. Total fiber number and the composition of fast and slow muscle fibers are associated with different muscle phenotypes [56]. All these modules contain genes that can regulate differences in the development of muscle fiber number, size, and fiber composition between the two breeds. Thus, these modules may be responsible for the different muscle features and meat quality in the LT and Lde pigs [11].

Previous studies have shown that muscle phenotype is determined during embryonic development and that postnatal muscle growth is not critical [11]. However, in our coexpression network module analysis, we found that differences in transcriptional profiles between LT and Lde were more pronounced in postnatal animals than in prenatal animals. Only two modules that contained 101 module genes (1.5% of all module genes) were conserved in both breeds, and none of the genes were related to muscle regulation. In contrast, our analysis of 15 lean-specific modules and 13 obese-specific modules containing 2504 genes revealed a molecular regulation mechanism that was associated with the different muscle phenotypes in the two breeds. Although muscle phenotype was found to be determined during embryonic development, several hub genes related to muscle development were identified in these modules. In lean-specific modules, a hub gene in Lde-salmon, STAT5B, was reported to be critical for normal postnatal growth [57]. STAT5B encodes a transcription factor that can regulate skeletal muscle growth and fiber composition. The absence of STAT5B has been shown to increase the expression levels of several genes that regulate type I fibers, which resulted in muscle composed almost exclusively of type II fibers [57]. Thus, STAT5B and MYLPF [52], a hub gene in the Lde-light yellow module, might be critical for muscle growth and fiber composition in postnatal development. Other hub genes, VCAM1 [58] and CXCL12 [59], in the Lde-light yellow module have been reported to play roles in the control of secondary muscle growth. Thus, although muscle phenotype is determined mainly during embryonic development, we found that secondary muscle growth during postnatal development was also critical for the muscle phenotype difference between LT and Lde. 
The hub genes SETD3 [60], DZIP1 [61], LMNA [62], PRMT5 [63], JUN [64], TEAD1 [65], ELL3 [66] and SPARC [67] have been shown to control muscle cell proliferation and differentiation and regulate muscle development. Some of the hub genes that we identified have been reported to be involved in proliferation and differentiation of vascular smooth muscle cells and cardiac myocytes; for example, CBX3 [68], PRRX1 [69], PDLIM7 [70], FABP3 [71], and UBR5 [72]. The coexpression network modules that contain these genes may regulate the development of the vascular and circulatory system. In addition, the hub genes MYBPC1 [73], UNC45B [74] and MYOZ2 [75] have been shown to be required for skeletal muscle function, such as muscle contraction. These results suggest that the lean-specific modules cover all the main processes of postnatal muscle development, including muscle cell proliferation and differentiation, secondary muscle growth, postnatal muscle growth for fiber composition, development of the vascular and circulatory system, and muscle function regulation. All these coexpression network modules seem to be associated with the mechanisms that regulate the high lean meat percentage muscle phenotype in Lde.

In addition to the numerous genes that were found to positively regulate muscle development in the coexpression network modules of Lde, we identified hub genes that negatively regulate muscle development in the postnatal LT modules. RHEB was found to negatively regulate skeletal myogenesis by repression of insulin receptor substrate 1 (IRS1) [76], and KLF10 was reported to inhibit myoblast proliferation by suppression of the promoter activity of fibroblast growth factor receptor 1 [77]. Although JAK1 was found to be critical in promoting proliferation, it was also found to prevent the premature differentiation of myoblasts [78]. In contrast, MUSTN1 was shown to have no effect on myoblast proliferation, but was found to significantly impair myoblast differentiation and prevent myofusion [79]. These negative coexpression network modules associated with muscle development might control the muscle phenotype in LT, which features low lean meat percentage, slow-growing muscle and low body weight. These characteristics facilitate the deposition of high levels of intramuscular fat. Besides these negative regulation modules, several positive coexpression network modules were also identified in LT. TEAD1 was shown to regulate the fast-to-slow fiber-type transition and overexpression of TEAD1 was found to produce a slower skeletal muscle contractile phenotype [80]. The hub genes SIX1 [81], MYF6 [82], CDK9 [83], TEAD4 [84], and PRMT5 [63] in LT are myogenesis genes that regulate myogenic differentiation and muscle development. FHOD1 [33] and S100A11 [85] regulate smooth muscle cell migration, vesicular exocytosis, and smooth muscle cell phenotype, processes that are related to vascular and circulatory system development. SMPX [86] and GADD45A [87] are LT-specific muscle function genes. In particular, we have identified coexpression network modules related to intramuscular fat deposition and meat quality. 
For example, the hub genes HDLBP [88], SFRS1 and SFRS18 [89] can regulate the deposition of intramuscular fat, while ACOT8 and ACOT9 [90] can regulate lipid and amino acid metabolism. It has been suggested that fat deposition and fatty acid composition are the determining factors for meat quality [91]. ATP5B [92] and CSRP1 [56] were shown to play key roles in muscle fiber development and may be responsible for breed-specific differences in meat quality.

It has been suggested that muscle fiber composition, size, and total fiber number are critical for meat quality, and that slow fibers contribute to both juiciness and tenderness [56]. These muscle fiber features also define muscle phenotypes. In our comparative transcriptome analysis, we detected a greater number of coexpression network modules related to myogenesis and muscle growth, secondary postnatal muscle growth, fast fiber differentiation, and fiber composition in the Lde transcriptome than in the LT transcriptome. Although fewer coexpression network modules related to myogenesis and muscle growth were identified in LT, more modules related to negative regulation of postnatal muscle and slow skeletal muscle fiber development were identified in LT than in Lde. In particular, coexpression network modules related to negative regulation of intramuscular fat deposition and meat quality were identified in LT. Thus, the differences in coexpression network modules between Lde and LT described above are likely to have resulted in the high lean meat percentage, fast-growing muscle, and high body weight characteristics in Lde, and the high intramuscular fat content, slow-growing muscle, and low body weight characteristics in LT. However, our results showed that the muscle phenotype differences between the two breeds were not only regulated by muscle genes but were coordinated by muscle-, nerve-, and immunity-related genes. The complex coexpression networks responsible for the different muscle phenotypes are likely to have been generated by artificial selection during the domestication process. The evolutionary analysis showed that the coding sequences of most of the module genes in the coexpression network modules were conserved among pig breeds under artificial selection. Therefore, the role of changes in coding sequence under positive selection in the divergence of muscle phenotype among pig breeds was found to be minor. 
We propose that the divergence of coexpression modules among breeds under positive selection eventually regulated the muscle phenotype divergence during domestication. Previous studies have usually focused on the effect of selection pressure on gene function. In this study, we have shifted the emphasis to the role of selection in the divergence of coexpression networks between breeds during the domestication process.

Conclusions

Here, we have carried out the first comprehensive analysis of gene coexpression relationships in muscle development in two pig breeds from embryo to adult. We identified significant differences in coexpression network modules between the Lde and LT breeds, which may be responsible for the divergence of their muscle phenotypes. More coexpression network modules related to myogenesis, postnatal muscle growth, and fast fiber differentiation were found in Lde than in LT. Although fewer modules related to myogenesis and muscle growth were identified in LT, more modules related to slow muscle fiber and negative regulation of muscle development were found. In particular, we identified five modules related to intramuscular fat deposition and meat quality in LT. We showed that positive selection played a key role in the divergence of the breed-specific modules, while changes in gene coding sequences among breeds played only a minor role. Our results demonstrate that the molecular mechanism underlying phenotype divergence between breeds cannot be robustly explained by differential gene expression alone, but can be explained by coexpression network modules. The elucidation of gene coexpression network divergence in the developmental processes of different breeds provides a new foundation for understanding the functional organization of transcriptomes in phenotype variation.

Methods

Ethics statement

All animal procedures were performed according to the guidelines developed by the China Council on Animal Care and the protocols were approved by the Animal Care and Use Committee of Guangdong Province, China. The approval ID or permit numbers are SCXK (Guangdong) 2004–0011 and SYXK (Guangdong) 2007–0081.

Selection of genes for network analysis

The transcriptome sequence data from 20 pig (Sus scrofa) muscle samples were downloaded from the National Center for Biotechnology Information (NCBI) Gene Expression Omnibus (http://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE25406). These 20 datasets contain the sequenced transcriptomes of LT and Lde at prenatal days 35, 49, 63, 77, and 91 and postnatal days 2, 28, 90, 120, and 180. All possible CATG + 17-nt tag sequences were created from the Sus scrofa genome sequence (Sscrofa9.2) and UniGene (NCBI36.1, 20090827) databases and used as reference sequences to align and identify the sequencing tags. (The “CATG site” is the digestion site of the NlaIII restriction enzyme. The NlaIII digestion site was selected to produce the Solexa sequencing tags, which were 21 bp long (i.e., CATG + 17 nt), because most mRNA sequences (99%) have NlaIII digestion sites.) All clean tags were aligned to the reference database, and unambiguous tags were annotated. Each alignment was allowed one mismatch to allow for polymorphisms across samples. Mismatches can also be caused by sequencing errors, but the frequency of such errors is generally very low (1 or 2 per million). To compare the differential expression of genes across samples, the number of raw clean tags in each sample was normalized to tags per million (TPM) to obtain normalized gene expression levels. Differential expression of genes or tags across samples was detected according to methods described previously [93]. Differentially expressed genes (DEGs) with a log2 ratio > 0.5 (P < 0.009, false discovery rate (FDR) < 0.02) between libraries were identified. To construct the coexpression network modules, 7057 DEGs in Lde pigs and 7056 DEGs in LT pigs were used.
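The tag generation and normalization steps described above can be illustrated with a minimal Python sketch. It is not the pipeline used in the study: the function names (nlaiii_tags, tags_per_million, log2_ratio) are hypothetical, only the forward strand is scanned, and the pseudocount used to avoid division by zero in the log2 ratio is an assumption that is not stated in the text.

```python
import math


def nlaiii_tags(sequence, tag_len=17, site="CATG"):
    """Return every site + tag_len-nt tag on the forward strand of sequence."""
    seq = sequence.upper()
    tags = []
    start = seq.find(site)
    while start != -1:
        end = start + len(site) + tag_len
        if end <= len(seq):  # skip sites too close to the 3' end for a full tag
            tags.append(seq[start:end])
        start = seq.find(site, start + 1)
    return tags


def tags_per_million(tag_counts):
    """Normalize raw clean-tag counts of one library to tags per million (TPM)."""
    total = sum(tag_counts.values())
    if total == 0:
        raise ValueError("library contains no clean tags")
    return {gene: count * 1_000_000 / total for gene, count in tag_counts.items()}


def log2_ratio(tpm_a, tpm_b, pseudocount=0.01):
    """log2 fold change of library b over library a.

    The pseudocount is an assumption made for this sketch so that genes
    with zero tags in one library do not cause a division by zero.
    """
    return math.log2((tpm_b + pseudocount) / (tpm_a + pseudocount))


# Toy data (hypothetical): one full CATG + 17-nt tag, one truncated site.
reference = "AACATG" + "ACGTACGTACGTACGTA" + "CATGTT"
tags = nlaiii_tags(reference)
tpm = tags_per_million({"geneA": 30, "geneB": 70})
```

In this toy example only the first CATG site yields a 21-nt tag, because the second site lies too close to the end of the sequence.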

Methodology used to construct the gene coexpression networks

WGCNA [14,16-20] was carried out using the R software (http://www.r-project.org). Breed and time were analyzed separately. The absolute values of the Pearson correlation coefficients were calculated for all pairwise comparisons of gene expression values across the LT and Lde samples. The correlation matrix for each breed was then transformed into a matrix of connection strengths (i.e., an “adjacency” matrix) using a power function (connection strength = |correlation|^b), which resulted in a “weighted” network. To make meaningful comparisons across data sets, a power of b = 10 was chosen for all analyses. The function TOMdist1 in R was used to compute the dissimilarity based on the topological overlap matrix (TOM). To group nodes with high topological overlap into modules (clusters), we used average linkage hierarchical clustering coupled with the TOM distance measure. We chose a height cutoff of 0.995 to create the clusters. Modules that corresponded to branches of the dendrogram and contained at least 30 genes were selected for analysis. The modules were visualized by classical multidimensional scaling in three dimensions. Then, the module eigengene was compared with the indicator variable using a Kruskal-Wallis test.
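The adjacency and topological overlap calculations can be sketched as follows. This is a minimal illustration in Python with NumPy, not the R code used in the study. The function names weighted_adjacency and tom_dissimilarity are hypothetical, and the standard unsigned topological overlap formula is assumed to correspond to what TOMdist1 computes.

```python
import numpy as np


def weighted_adjacency(expr, power=10):
    """Soft-thresholded adjacency |Pearson r| ** power for a genes x samples array.

    Genes with constant expression give undefined correlations (NaN) and
    should be removed beforehand.
    """
    corr = np.corrcoef(expr)           # gene-by-gene Pearson correlations
    adj = np.abs(corr) ** power
    np.fill_diagonal(adj, 0.0)         # no self-connections
    return adj


def tom_dissimilarity(adj):
    """1 - TOM, where TOM_ij = (l_ij + a_ij) / (min(k_i, k_j) + 1 - a_ij)."""
    shared = adj @ adj                 # l_ij: connection strength via shared neighbors
    k = adj.sum(axis=1)                # k_i: connectivity of each gene
    min_k = np.minimum.outer(k, k)
    tom = (shared + adj) / (min_k + 1.0 - adj)
    np.fill_diagonal(tom, 1.0)
    return 1.0 - tom


# Toy data (hypothetical): 5 genes measured at 10 time points.
x = np.arange(10.0)
expr = np.vstack([x, 2 * x + 1, np.sin(x), np.cos(x), x ** 2])
adj = weighted_adjacency(expr, power=10)
diss = tom_dissimilarity(adj)
```

The resulting dissimilarity matrix is the kind of input that average linkage hierarchical clustering with a height cutoff would take to define modules. In the toy data, the two perfectly correlated genes (rows 0 and 1) end up much closer to each other than to the weakly correlated gene in row 2.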

Detection and characterization of modules

The gene expression profiles of each module were decomposed via singular value decomposition, and the value of the module eigengene, V1 (i.e., the first principal component), was plotted for each sample. We then compared the module eigengene with the indicator variable using a Kruskal-Wallis test.
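The module eigengene calculation can be illustrated with a short NumPy sketch. It assumes that each gene is standardized across samples before the decomposition, which is the usual convention but is not stated explicitly in the text. The function name module_eigengene is hypothetical.

```python
import numpy as np


def module_eigengene(expr):
    """First right singular vector of the standardized genes x samples matrix.

    Each gene (row) is centered and scaled to unit variance across samples
    (an assumption of this sketch). The sign of the returned vector is
    arbitrary, as with any singular vector.
    """
    expr = np.asarray(expr, dtype=float)
    centered = expr - expr.mean(axis=1, keepdims=True)
    scaled = centered / centered.std(axis=1, ddof=1, keepdims=True)
    _, _, vt = np.linalg.svd(scaled, full_matrices=False)
    return vt[0]                       # one value per sample


# Toy module (hypothetical): three genes sharing one profile over 10 samples.
profile = np.array([1.0, 3.0, 2.0, 5.0, 4.0, 8.0, 7.0, 9.0, 6.0, 10.0])
module = np.vstack([profile, 2 * profile + 1, 0.5 * profile - 3])
eigengene = module_eigengene(module)
```

Because all three toy genes follow the same profile, the eigengene is perfectly correlated (up to sign) with that profile. In a real module, it summarizes the dominant expression trend across samples.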

Detection of positive selection

Whole-genome alignments of 37 individual pigs and 11 wild boars were downloaded from the NCBI Sequence Read Archive (ftp://ftp.sra.ebi.ac.uk/vol1/ERA164/ERA164657/bam/; accession number ERP001813). SAMtools/BCFtools [94] was used to call SNPs for each individual animal. The results were merged, and SNPs with a low frequency across all samples (<5%) were filtered out. The remaining SNPs were used to generate the consensus sequences for the module genes. PAML [95] was used to perform the Ka/Ks analysis.
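The frequency filter can be sketched in a few lines of Python. This sketch assumes that "frequency" refers to the alternate-allele frequency across all genotyped animals and that genotypes are coded as alternate-allele counts (0, 1, or 2) for diploid individuals. Neither detail is specified in the text, and the function name and input layout are hypothetical.

```python
def filter_low_frequency_snps(genotypes, min_freq=0.05):
    """Keep SNPs whose alternate-allele frequency is at least min_freq.

    genotypes maps a SNP identifier to a list with one entry per animal:
    the alternate-allele count (0, 1, or 2), or None for a missing call.
    Returns a dict mapping each retained SNP to its allele frequency.
    """
    kept = {}
    for snp, calls in genotypes.items():
        observed = [c for c in calls if c is not None]
        if not observed:
            continue  # no usable calls for this SNP
        freq = sum(observed) / (2 * len(observed))  # diploid: two alleles per animal
        if freq >= min_freq:
            kept[snp] = freq
    return kept


# Toy data (hypothetical) for 48 animals (37 pigs + 11 wild boars).
genotypes = {
    "snp_rare": [1] + [0] * 47,        # 1/96 alleles, about 1%
    "snp_common": [1] * 10 + [0] * 38, # 10/96 alleles, about 10%
    "snp_missing": [None] * 48,        # never called
}
kept = filter_low_frequency_snps(genotypes)
```

Only snp_common passes the 5% threshold in this toy example.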

Availability of supporting data

All the supporting data are included as additional files.