Abstract
Facioscapulohumeral muscular dystrophy (FSHD) is one of the most common autosomal dominant muscle disorders, yet no cure or amelioration exists. The clinical presentation is diverse, making it difficult to identify the actual driving pathomechanism among many downstream events. To unravel this complexity, we performed a meta-analysis of 13 original omics datasets (in total 171 FSHD and 129 control samples). Our approach confirmed previous findings about the disease pathology and specified them further. We confirmed increased expression of former proposed DUX4 biomarkers, and furthermore impairment of the respiratory chain. Notably, the meta-analysis provides insights about so far not reported pathways, including misregulation of neuromuscular junction protein encoding genes, downregulation of the spliceosome, and extensive alterations of nuclear envelope protein expression. Finally, we developed a publicly available shiny app to provide a platform for researchers who want to search our analysis for genes of interest in the future.
Similar content being viewed by others
Introduction
Facioscapulohumeral muscular dystrophy (FSHD) is an autosomal dominant inherited muscle disorder characterized by weakness and atrophy. It is one of the most common muscular dystrophies. According to the latest European epidemiological study (published in 2014), the prevalence of FSHD is 5–12 affected individuals per 100,000 population1. The pathomechanism of FSHD has not been fully elucidated yet, and no drug cures the disease or slows its progression.
A milestone in FSHD research was the discovery of the abnormal activity of a gene called DUX4, which is thought to be involved in regulating the cleavage phase of embryonic development2, but is otherwise silenced throughout life, except for low-level expression in the thymus and testes3. In healthy individuals, it has been reported to be hypermethylated at its locus on chromosome 4q35, a macrosatellite consisting of 11–100 or more D4Z4 repeats, each containing a DUX4 gene. In FSHD, this chromosomal segment is shortened to 10 or fewer repeats, with concomitant detection of hypomethylation allowing reading of the most distant DUX4 gene. In combination with the 4qA haplotype, which contains a polyadenylation signal, DUX4 can be expressed due to this contraction4. The encoded double homeobox protein 4 (DUX4) is a transcription factor that ultimately triggers signaling cascades by activating other transcription factors and hundreds of genes, resulting in rapid cell death5. Generally, there are two types of FSHD, which show exactly the same phenotype. Ninety-five percent of patients with FSHD have FSHD1, and the remaining percentage have FSHD2. In FSHD2, there is no contraction of the D4Z4 repeats, as the 4q35 locus contains about 11–20 repeats. Instead, there is a mutation in either SMCHD1 (>80% of FSHD2), DNMT3B or LRIF1 gene, which leads to hypomethylation of the D4Z4 repeats and allows expression of DUX46.
The exact mechanisms of the signaling pathways prevalent in FSHD are still unclear. Further decelerating research progression and insight, FSHD is characterized by extreme variability in phenotype, even compared with other muscular dystrophies. This variability can be observed between affected family members and in a frequent body asymmetry, with individual muscle groups having different degrees of damage7. Moreover, although several studies have reported an inverse correlation between the number of D4Z4 repeats and disease severity8,9, other studies examining few10 or comparatively many repeats11 have detected that D4Z4 allele size is not always related to clinical severity. In addition, the size of repeats in the upper range was reported to have lower methylation than predicted for a comparatively milder phenotype12. It suggests that unknown additional factors are involved, which may ultimately mitigate or exacerbate disease progression. This is further supported by the fact that pathological changes in muscle morphology and integrity generally start in the second decade of life4 with varying muscle groups affected only slightly or not at all, hinting at rescue or compensation mechanisms preventing DUX4 expression and toxicity.
Given the unknown factors of FSHD pathology, advances in transcriptomics technologies over the past decades harbor great potential in elucidating molecular mechanisms and pathogenic signaling pathways using bioinformatics approaches and statistical methods. In this context, to better understand the complexity of FSHD, we performed a meta-analysis (PROSPERO ID: CRD4202233048913) to verify existing knowledge and to identify additional characteristics that could advance FSHD research.
Results
Our search identified a total of 11 studies, summarized in 13 datasets (Table 1), that met the predefined criteria based on data from databases, related publications, and information obtained during our literature search through email contact with the corresponding authors (Fig. 1a; detailed description of the individual phases of the meta-analysis can be found in Supplementary Notes 1, 2 and Supplementary Data 1, sheet 1; the PRISMA 2020 Checklist14 and Checklist for Conducting Meta-Analysis of Microarray Datasets15 are depicted in Supplementary Table 2). The stages of unified preprocessing and statistical analysis (Fig. 1b) culminated in the summary results of a random-effects model16, which were confirmed in all subsequent areas by a secondary analysis approach, the vote-counting15 (see the “Methods” section and Supplementary Notes 6). The random-effects meta-analysis yielded 1935 significant results (adjusted (adj.) p-value < 0.05). These results are decreasingly sorted by standardized mean difference with heteroscedastic population variances in the two groups (SMDH)17 as shown in the heatmap in Fig. 1c, which is linked to data summarized in Supplementary Data 1, sheet 2a. The results are verified by sensitivity analyses (Supplementary Notes 3 and 4), enrichment analyses (random-effects model and vote-counting approach; Supplementary Data 1, sheets 3, 5 and 6), clustering of Gene Ontology (GO) terms using the Bioconductor package simplifyEnrichment (Supplementary Figs. 1–3), and the results of the meta-FSHD app we developed (see the “Methods” section).
Meta-FSHD app
We have developed a publicly available shiny app called meta-FSHD to provide a tool for researchers to quickly and easily get the estimated meta-analytic effect for any gene of potential interest (Fig. 1d). The app considers all genes measured in at least three datasets, corresponding to 26,858 unique ENSEMBL-IDs, 21,080 unique ENTREZ-IDs, and 22,791 unique gene names. It gives detailed parameters regarding significance, effect size, confidence interval (CI), and degree of heterogeneity. Using meta-FSHD, all results of the meta-analysis on significant genes can be confirmed and are traceable easily and quickly for any person (see the “Methods” section).
Confirmation of previous knowledge of FSHD pathology
Several studies have already demonstrated that the clinical picture of FSHD is associated with the highly toxic expression of the DUX4 transcription factor18,19,20. Since the gene is expressed both sporadically and in only a few myonuclei (1 in 200–1000 cells), detection is difficult5, which is why DUX4 target genes are investigated21. In the meta-analysis, DUX4 biomarker genes are the most upregulated (Fig. 2a; Supplementary Data 1, sheet 2a). The highest expression is found in H3Y1 (standardized log2-fold change (STD log2-FC) (95% CI) = +2.89); while histone variant H3 was previously linked to DUX422, it was later stated that DUX4 induces H3Y and H3X, which mark DUX4 target genes for expression23. Besides, as shown by our GO clusters (Supplementary Fig. 1), there are enriched Biological Process (BP) sections of up- and downregulated genes related to development, differentiation, and morphogenesis. DUX4 generally has been described as a disruptor of muscle myogenesis that, when present at high levels, leads to apoptosis and, when present at lower levels, inhibits myogenesis24. In this context, the BP GO clusters and the gene list (Supplementary Data 1, sheet 2a) further show strong inflammasome activation and upregulation of apoptotic processes (Fig. 2b).
In addition, the meta-analysis shows that metabolic genes are misregulated in FSHD (Fig. 2c and BP clusters in Supplementary Fig. 1). Several studies have already pointed to mitochondrial abnormalities in FSHD25,26. A recent study was able to identify mitochondria as a source of excessive reactive oxygen species due to impairment of mitochondrial oxidative phosphorylation (OXPHOS), particularly in complex I, as an early event of DUX4-induced toxicity27. This is of great interest because in proliferating healthy myoblasts, approximately 30% of the ATP consumed by the cells is generated by OXPHOS. In contrast, in terminally differentiated myotubes, mitochondrial respiration is the major source of ATP (~60%)28. The impairment of the respiratory chain was described to lead to an immediate decrease in metabolic activity followed by a gradual increase in mitochondrial membrane potential. The general consequences observed were apoptosis by mitochondrial reactive oxygen species and impairment of mitochondrial health by lipid peroxidation27. Furthermore, it was reported that impaired metabolic adaptation would lead to misdirected increase in hypoxia signaling27,29. About these findings, we noticed a significant upregulation of HIF1α in 10 of the 13 datasets (STD log2-FC (95% CI) = +0.619). The meta-analysis also shows the downregulation of respiratory chain genes. In complex IV, COX2 (which is among the approximately 5% of the most downregulated genes; STD log2-FC (95% CI) = −0.759) and COX3 (STD log2-FC (95% CI) = −0.622), two of the three mitochondrial DNA-derived genes involved in cytochrome c oxidase activity30, were strongly downregulated. In complex I the downregulation refers to ND4L (STD log2-FC (95% CI) = −0.625), ND5 (STD log2-FC (95% CI) = −0.736) and ND1 (STD log2-FC (95% CI) = −1.07). The latter constitute three of the seven subunits of NADH dehydrogenase that originate in mitochondrial DNA and catalyze the electron transfer of NADH through the respiratory chain30. Intriguingly, ND1 is among the 10 most downregulated genes of all 1935 significant genes (Fig. 2c; Supplementary Data 1, sheet 2a). The GO terms in the enrichment analysis highlight the relevance of the findings on dysfunctional OXPHOS in FSHD. Of the first 25 downregulated cellular component (CC) categories with comparatively smallest p-values, 10 refer to the mitochondrial respirasome, either complex I, IV, or both (Supplementary Data 1, sheet 3).
Another area that emerges from the meta-analysis is the impact of the nuclear lamina (NL) on FSHD pathology, as already reported in several studies31,32. This is highly interesting in terms of genome regulation, particularly with regard to the question of why skeletal muscles are so severely affected in FSHD. Our list of significant genes (Supplementary Data 1, sheet 2a) supports the findings on long-distance interactions between D4Z4, the NL, and the telomere32, showing altered expression of FAT1 (STD log2-FC (95% CI) = +0.576) and SORBS2 (STD log2-FC (95% CI) = +0.583). Interestingly, FAT1, reported previously to be lower in FSHD muscles compared to control muscles33, is downregulated in the DUX4-model but upregulated in almost all patient datasets. As a general trend, we find genes associated with the NL expressed in opposite directions when comparing the DUX4-model with the patient datasets. This not only refers to long-distance interactions but also to genes directly involved in the scaffold of the NL, like LMNA (STD log2-FC (95% CI) = +0.698), which has already been associated with several muscle diseases34. The differential expression of NL-associated genes in the DUX4-model and the patient datasets suggests mechanisms independent of DUX4, as previously shown concerning FAT133. These findings show altered genome organization, evident from our enrichment analysis regarding the first entry of upregulated BP pathways, which concerns supramolecular fiber organization (GO:0097435; p-value: 8.592309e-09; Supplementary Data 1, sheet 3). It encompasses a total of 102 genes (including FAT1 and SORBS2), which are differently regulated between FSHD patients and controls in the meta-analysis.
Discoveries on FSHD
Using the simplifyEnrichment package for GO clustering35, we became aware of genes within the CC category (in both up- and downregulated pathways) that are involved in pre- and postsynaptic processes, membranes and transitions, and neuronal processes of nerve projection (Supplementary Fig. 2). Interestingly, these neuronal aspects have not been described in FSHD, yet. To test the relevance of these findings, we examined our enrichment analysis results (Supplementary Data 1, sheet 3) and detected that upregulated signaling pathways with small p-values refer to processes within the extracellular matrix (ECM; Fig. 3d). Since muscle fibers are located within the ECM in a three-dimensional scaffold composed of various collagens, glycoproteins, proteoglycans, and elastin, the ECM is vital for muscle contraction, integrity, and elasticity36. Notably, overgrowth of the ECM, also referred to as fibrosis, is well described in FSHD and results in hardening of the interconnective tissue, which leads to impaired muscle contraction and stiffness37. Consistently, our data also show many genes upregulated at the deepest and smallest component of the ECM, the basal lamina (BL), which is adjacent to the sarcolemma of the myofibers (such as type IV collagen genes, which are reported to dominate the BL (upregulation of COL4A1 and COL4A2), specific laminins (LAMA2, LAMB1, LAMA5-AS1), several integrins and elastin38. Due to this general upregulation of ECM-related genes, we were surprised to find one very specific set of collagens downregulated, which is reported to form a functional trimer at the NMJ (Fig. 3d). Intriguingly, the BL differs in morphology depending on whether it is synaptic or extra-synaptic. The synaptic BL is composed mainly of type IV collagen α3-, -α4-, and -α5-chains (Fig. 3a), which, together with certain laminins (laminin β2, α4 and -α5) serve transmission of signals and mechanical forces that perform muscle innervation38,39,40. Notably, all three synaptic type IV collagen isoforms, α3- (COL4A3), α4- (COL4A4), and -α5 (COL4A5), are strongly downregulated in FSHD patients compared with controls throughout all datasets (Supplementary Data 1, sheet 2a). COL4A3 is even the most downregulated gene across the entire meta-analysis (STD log2-FC (95% CI) = −1.52 (−2.38 to −0.659), Fig. 3c). We thus hypothesised that NMJ architecture may be altered in FSHD patients probably affecting muscle innervation and searched for other factors that have been associated with impaired signal transduction at the NMJ. We discovered more than 60 genes that are misregulated on the transcriptional level (including AGRN, MACF1, DOK7, WNT4, DVL1, etc.; Fig. 3b). The data show altered processes regarding ion channels and ion pumps, NMJ maintenance and formation, acetylcholine (ACh) receptor clustering, and vesicle transfer (Supplementary Table 3).
Nuclear envelope (NE) protein-encoding genes are misregulated in FSHD
The role of nuclear lamina-associated genes (Supplementary Data 1, sheet 2a), the alteration of supramolecular fiber organization (BP upregulation; Supplementary Data 1, sheet 3), and CC downregulation of signaling pathways associated with the nuclear body and nuclear speckles (Supplementary Data 1, sheet 3) led us to examine the results of the meta-analysis in more detail. In this context, the meta-analysis demonstrated the variability of the FSHD phenotype, not only between patients but also between different muscle types, left and right muscles, or even different loci in the same muscle (Supplementary Note 1); besides, some FSHD samples show gene signatures like controls depending on the extraction of muscle tissue41 (Supplementary Fig. 4). Hence, providing additional information about the degree to which the investigated muscle is affected is necessary to differentiate between actual contributors to the phenotype and noise42. Moreover, RNA-Seq has a clearly higher detection rate of differentially expressed genes (DEG) than microarrays and also allows the analysis of splicing43,44. Therefore, we used the RNA-Seq dataset generated by Wang et al.21 using muscle biopsies of 36 FSHD patients for a deeper analysis of pathways we found altered in the meta-analysis. For the said study, Wang et al. correlated the expression of four DUX4-regulated biomarker genes with MRI data of the lower extremities and histopathological changes. Based on this, they divided the patients into four groups, with group 1 being similar to the controls and group 4 displaying the strongest pathology21.
While the number of DEGs (log2-FC > 1 and <−1, p-value < 0.05) was already high when analyzing all samples together (1879), we found even more genes to be significantly misregulated when comparing the groups to the controls separately, with group 4 having the highest amount (7400). The reason for this could be the reduced variability within the groups, but also the increased disease pathology. Since there are so many DEGs in severely affected individuals, we wondered whether alterations in genome organization induce these changes. Several nuclear envelope transmembrane proteins (NETs), including muscle-specific ones, have been proven to be involved in genome organization and gene expression regulation45. Notably, a NET has been described previously to be misregulated in FSHD46, and long-distance interactions between the D4Z4 locus and the nuclear envelope have been reported32. Further, there are muscular dystrophies with similar symptoms to FSHD, that are linked with genes of the nuclear envelope: striated muscle laminopathies (e.g. Emery–Dreifuss-muscular-dystrophy and limb-girdle-muscular-dystrophy 1B) are caused by mutations in EMD, LMNA or SYNE1, among others. We thus first screened the data for a list of 386 NE-associated genes that are known to be expressed and relevant in muscle47,48. Figure 4a shows the expression of these genes in strongly affected individuals (group 4) compared to controls. This revealed that many NE protein-encoding genes were significantly differentially expressed. Noteworthy, the majority were downregulated (152 down vs. 83 up with log2-FC > 0.5 and <−0.5), while the majority of all DEGs in group 4 were upregulated (2032 down vs. 5368 up). Many of these genes play a role in positioning specific genes to the NE and thereby repressing their expression49,50. Thus, it is conceivable that downregulation of these NE genes might contribute to the upregulation of many DEGs found in FSHD. Importantly, we found genes associated with Emery–Dreifuss-muscular-dystrophy altered in FSHD patients, including LMNA (2-fold upregulated) and EMD (1.5-fold down Fig. 4a, red). We then checked whether the expression of these genes correlates positively with disease severity and indeed saw a clear correlation for many of them, some of which we present in Fig. 4b. Mutations in LMNA, FHL1, PLPP7 and TMEM38A have been linked to Emery–Dreifuss-muscular-dystrophy49,51,52. Notably, PLPP7 and TMEM38A were shown to be important for muscle regeneration as their knockdown leads to inefficient differentiation in C2C12 myoblasts. Since they are both significantly downregulated in FSHD (TMEM38A in all groups, PLPP7 in groups 3 and 4), they might contribute to muscle weakness and wasting through impaired muscle regeneration. Therefore, we conjectured that an actual contribution of TMEM38A and PLPP7 to the FSHD phenotype would lead to expression changes of genes regulated by them. Employing a list of target genes of these two NETs generated in C2C12 mouse myoblasts53, we found 716 genes misregulated in FSHD that are potentially regulated by TMEM38A and PLPP7 (Fig. 4c). When analyzing these 716 genes for a GO term analysis among others metabolism, signaling, and differentiation were enriched in FSHD (Fig. 4d). It is noteworthy that TMEM38A and PLPP7 are only two of several genome organizing NETs being misregulated.
Components of the splicing machinery are downregulated in FSHD and result in the mis-splicing of muscle genes
Looking at the results of the random-effects analysis, the spliceosomal complex was among the top 15 results for downregulated CC clusters (Supplementary Data 1, sheet 3), whereas downregulated mRNA splicing via the spliceosome affected almost all datasets with significant results in the vote-counting approach (Supplementary Data 1, sheet 6).
Previous studies suggested that components of alternative splicing are upregulated, while constitutive splicing is downregulated in two other muscular dystrophies, myotonic dystrophy type I and Emery–Dreifuss-muscular-dystrophy54,55. Since there are many similarities between muscular dystrophies also on the molecular level, we analyzed the dataset generated by Wang et al.21 about the expression of splicing components using a gene set enrichment analysis for splicing associated terms. While splicing factors were misregulated in all samples, groups 2 and 3 (Supplementary Figs. 5 and 6) showed upregulation of alternative splicing and downregulation of constitutive splicing. In contrast, there was a general downregulation of alternative as well as constitutive splicing in group 4 (Fig. 5a). This suggests a major disruption of the splicing machinery in strongly affected FSHD patients. We hypothesize that fewer spliceosomes assemble at splice sites due to the lower abundance of splicing factors, leading to many (constitutive and alternative) splice sites being unused. This has to be experimentally validated in the future but is beyond the scope of this analysis. We next looked into splicing variations, primarily in group 4, using modeling alternative junction inclusion quantification (MAJIQ)56 to identify differentially used splicing events. We set the default of 10% (percent spliced in, Ψ-value ≥ 0.1) and false discovery rate of 10% for local splicing variations to be significantly different. We found 730 events differentially used between controls and FSHD patients (Supplementary Data 1, sheet 7). In line with a general downregulation of splicing components, we found many exon skipping events in FSHD patients (Fig. 5b). However, we were startled to detect a similarly large number of introns retained in controls but spliced out in FSHD patients. If a downregulation of many splicing factors actually leads to fewer spliceosomes assembling to functional units, we would expect less splicing in general. We thus checked these intron exclusion events separately. We found that only around 25% of these events are classic intron splice events, while 75% are exon skipping events naturally excluding the adjacent intron which was retained at a higher ratio in controls but removed along with the exon in FSHD patients. MAJIQ correctly identifies these events as introns retained in controls with a Ψ-value ≥ 0.1. In this regard, nonsense-mediated-decay (NMD) might play a role as it has previously been shown to be downregulated in FSHD and intron retention leads mostly to mRNA-NMD57,58. This is an important regulatory level in gene expression governed through alternative splicing. In this context, the meta-analysis shows a high amount of significantly dysregulated RNA isoforms (Supplementary Data 1, sheet 2a), which might be possible substrates for NMD. Besides, a key player of NMD, the exon junction complex component EIF4A3, which has been previously discussed by Shadle et al.57, is significantly upregulated in the meta-analysis results (STD log2-FC (95% CI) = +0.651) and could be an important candidate for further research.
To investigate the potential effect on the muscle phenotype, we next looked into the genes mis-spliced in FSHD. A GO term enrichment analysis revealed that these genes are highly relevant for muscle-specific signaling, development, and innervation (neuron projection morphogenesis; Fig. 5c). The heatmap in Fig. 5d shows a selection of genes differentially spliced in FSHD, many of which are involved in pathways we found misregulated in our meta-analysis, e.g. in muscle structure, mitochondria and metabolism, signaling, and splicing. Interestingly, microtubule-associated factor 1 (MACF1), which is a top hit in the meta-analysis (Supplementary Data 1, sheets 2 and 4) and regulates myonuclear positioning at the NMJ59, shows a preference for skipping of exon 116 in FSHD (Ψ-value 0.325). MAJIQ further detects an exon skipping event in ATP1B3 (Ψ-value = 0.359), which encodes a subunit of the Na+/K+-ATPase and is involved in the electrical excitability of muscle and nerves. We further identified several splicing factors alternatively spliced. Interestingly, MBNL1, a main contributor to myotonic dystrophy, shows the same double exon skipping event of exons 5 and 6 as in myotonic dystrophy type I. There are intron exclusion events in β-tubulin TUBB6 and FHL1, of which the latter is a component of the nuclear envelope and differentially expressed in FSHD, as described above. TUBB6 was proposed to act in muscle regeneration and be upregulated in dystrophy muscle60. We also show two examples of intron retention events in controls instead of exon skipping in FSHD patients in two highly important sarcomeric genes, skeletal muscle troponin 3 (TNNT3) and tropomyosin 1 (TPM1).
Discussion
In the past, several studies have been conducted using mainly explorative meta-analytical approaches that yielded highly interesting results. For example, a link between b-catenin and DUX4 was discovered61 and PAX7 target genes were shown to be globally repressed in FSHD skeletal muscle62. Furthermore, skeletal muscle regeneration in FSHD was found to correlate with disease severity63. A recently published study also showed that FSHD muscle exhibits disruption of fibroadipogenic progenitor cells, mitochondrial function and alternative splicing independent of inflammation64. The latter underpins the results of our meta-analysis and shows the great importance of these aspects for the disease course in FSHD.
This meta-analysis confirms DUX4 as a main driver of FSHD pathology, as shown by the upregulation of DUX4 biomarker genes. However, many genes behave seemingly independently of DUX4 expression. Since disease severity is sometimes comparatively mild or severe regardless of the number of D4Z4 repeats, these genes may be disease modulators under a DUX4-independent mechanism. This is also suggested by other research groups and highlights the need for new biomarkers to track disease progression and stratify patients65. Based on the meta-analysis results, we hypothesize that gene expression alterations affecting the NMJ, NE and spliceosome are relevant factors in FSHD disease progression.
We identified downregulation of genes essential for the NMJ architecture. A consequence is very likely impairment of muscle innervation in FSHD. This might be the key to a highly interesting area of research that addresses specific force in FSHD. Intrinsic force production capacity of muscle (force per unit area; N/cm2) was found to be decreased in patients with both mild and severe FSHD, regardless of disease severity and even before the onset of fat infiltration or lower limb weakness66. While possible reasons for these observations have been associated with early myopathic changes and non-muscular factors such as fatigue or musculoskeletal pain, our results are showing that impairment of muscle innervation may play an essential role in FSHD pathology.
Moreover, researchers have suggested that the epigenetic landscape is the missing link between the FSHD phenotype and underlying genetic parameters67. Besides, the major effect of DNA methylation on 3D genome structure has been described previously68. In this context, the altered expression of NE proteins in the meta-analysis likely contributes to the high amount of differentially expressed genes in FSHD. NE proteins are involved in genome organization and subsequent gene expression control45. Therefore, the consequence of the misregulation of NE proteins would be a secondary effect on many other genes. While this is difficult to filter out in multifactorial disease, there is clear evidence of this being true as genes that have been shown under the control of the muscle-specific NETs PLPP7 and TMEM38a (in mouse myotubes53) are also misregulated in FSHD, where the expression of PLPP7 and TMEM38A appears to be correlated to disease severity. Similar effects have been observed in myotonic dystrophy type I54. Thus, it may reflect a more general pathomechanism in muscular dystrophies. However, due to the correlation with disease severity, the expression of NE genes may be predictive, which should be further investigated.
As an additional contributing factor, we identified altered expression of genes encoding spliceosome proteins and, subsequently, a different splicing profile affecting a bulk of genes with essential roles in muscle. In the most severely affected patients, we hypothesize that there are fewer functional spliceosomes: it seems they assemble only at every second or third splice site compared to controls, as we also observe many double exon skipping events.
Another, although already described factor, are mitochondrial impairments in FSHD27. Given the confirmation of these defects by this meta-analysis, studies specifically testing mitochondria-targeted agents, NAD+ precursors, or OXPHOS modulators to slow disease progression should be endorsed.
Although we lack the biopsy material to validate all these aspects of the FSHD pathology, we could identify secondary effects on general gene expression and splicing, most likely caused by the NE and spliceosome alterations. Thus, we rate them and the NMJ alterations, as potential candidates involved in the FSHD phenotype and strongly endorse a deeper investigation.
Methods
Individual participant data (IPD) meta-analysis
The key advantage of our meta-analysis and the foundation of its statistical power is using the original omics data from the included studies instead of summarized data. All datasets, which were generated with different approaches and at different time points, could thus be standardized in terms of data retrieval from databases and uniform analysis procedures to provide optimal conditions for directly comparing significantly differentially expressed genes and molecular signalling pathways between FSHD patients and controls.
The meta-analysis follows the preferred reporting items for systematic reviews and meta-analyses (PRISMA) statement14 and the guidelines described by Ramasamy et al.15. Timing and decision-making within the meta-analysis were divided into the following work areas: Literature search, inclusion and exclusion criteria, data extraction, quality assessment, data preprocessing, statistical analysis, and data synthesis (Supplementary Notes 1 and 2, Supplementary Table 1 and Supplementary Data 1, sheet 1). In May 2022, the meta-analysis was registered in the International Prospective Register of Systematic Reviews (PROSPERO; ID: CRD4202233048913). As shown in Fig. 1a, our search identified a total of 11 studies that met the predefined criteria based on data from databases69,70,71,72,73,74,75,76,77,78,79, related publications and information obtained during our literature search through email contact with the corresponding authors (detailed description in Supplementary Notes 1 and Supplementary Data 1, sheet 1).
In total, the meta-analysis comprises data from 300 samples, including 171 FSHD samples and 129 controls (Table 1, based on Supplementary Table 1). 13 datasets were generated from the 11 studies, as one dataset included both cell lines and biopsies (GSE5678776). Furthermore, one dataset included two families (two sisters each as patient and control), which resulted in family-related batch effects in our statistical analysis due to strong genetic similarity in the families (GSE12346877). The DUX4-model found in our literature search was not included in the overall calculation due to its artificial nature, but was contrasted for comparison since DUX4-induced gene expression has been reported to be the major molecular signature in FSHD skeletal muscle80.
Statistics and reproducibility
All codes for preprocessing and reproducible analysis are published on GitHub (https://github.com/FSHDresearch/Meta-Analysis-of-FSHD). The meta-analysis encompasses data from five Affymetrix GeneChips81, one Illumina BeadArray82 and five Illumina RNA-Seq studies83. All downstream analyses were performed with R (v4.2.2)84. For statistical analysis, the Benjamini–Hochberg correction was used to adjust for multiple testing85 and adj. p-values of <0.05 were considered significant. All statistical tests were two-sided.
In case of RNA-Seq data, Salmon (v1.9.0) was used to quantify transcript abundance from fastq raw files (which were freely available in the databases except for EGAD00001008337), since it has been shown to significantly improve the accuracy of abundance estimates and the sensitivity of subsequent differential expression analysis86. In case of the EGA dataset, after email request we were provided with the raw counts table and the metadata (Supplementary Notes 1), which were subsequently integrated into the unified analysis procedure of the meta-analysis. In order not to lose data, no separate filtering steps were applied in the R workflows for the individual datasets. For pre-processing Affymetrix microarray data, the robust multi-array average (RMA) approach87 of the Bioconductor packages oligo (v1.60.0)88 and affy (v1.74.0)89 were used; and for Illumina microarray data we selected Robust Spline Normalization (RSN)90 of the lumi package (v2.48.0)91. As we used limma (v3.52.2)92 in combination with MKomics (v0.7)93 for differential expression analysis of microarray data94, for reasons of comparability, we chose limma for RNA-Seq data preparation and analysis as well. Thus, only minimal pipeline changes were required to switch between analyses for RNA-Seq and microarray experiments95,96.
As a first strategy a random-effects model was selected to identify significantly differentially expressed genes between patients and controls16. In this context, we used SMDH as an effect measure, as suggested by Bonett (2009)17. The model was chosen because, besides the biological heterogeneity between study participants, a relevant degree of technical heterogeneity could be assumed due to the use of different technologies (microarray vs. RNA-Seq; Supplementary Notes 3 and corresponding sensitivity analyses in Supplementary Notes 4).
Without the DUX4-dataset, we identified a total of 53113 unique Ensembl IDs (Ensembl database version 108 from December 2022). We decided to use only unique IDs measured in at least three datasets to get reliable results from the meta-analyses, which gave us 26,858 unique IDs. We filtered our data to increase the power of the analysis97,98, whereupon 13,274 unique IDs remained. However, filtering can lead to a bias in the false discovery rate calculation99. The results in Supplementary Notes 7 suggest a true false discovery rate of 5% to 7% for the selected 1935 genes. For comparison we have included the results of the unfiltered analysis in Supplementary Data 1, sheets 2a and 2b. Using the random-effects model, a p-value was assigned to each individual unique Ensembl ID. In this way, 13,274 associated meta-analyses were theoretically performed for each dataset within the 292 samples (without DUX4-model samples), ultimately leading to the 1935 significant results (adj. p-value < 0.05). These results are decreasingly sorted by SMDH, as shown in the heatmap in Fig. 1c, which is linked to data depicted in Supplementary Data 1, sheet 2a.
To roughly divide the functional terms related to the genes into clusters, we used the Bioconductor simplifyEnrichment package to cluster and visualize the enrichment results35. In this context, genes were divided into the three GO domains cellular component (CC), molecular function (MF), and biological process (BP), each distinguishing between up- and down-regulated genes (Supplementary Figs. 1–3). These clusters were used in combination with the gene list of our heatmap shown in Fig. 1c (Supplementary Data 1, sheet 2a), corresponding enrichment analyses with Bioconductor packages100 (Supplementary Data 1, sheet 3) and the forest plots obtained with our shiny app meta-FSHD (Fig. 1d) to analyse the results of our meta-analysis in terms of existing expertise and potential new findings in the context of FSHD.
To validate the results of the random-effects model, a second alternative analysis approach, known as vote-counting15, was performed. In this regard, the datasets were considered separately and the significant genes and molecular pathways were compared in terms of their overlap rate between datasets. 7 of the 13 datasets yielded significant results when considering the adj. p-values by using limma (v3.52.2)92 (Supplementary Data 1, sheet 4). In this context, standard annotation packages from Bioconductor were utilized to map the data to the ENSEMBLE gene ID101. The core strategy was to find out, which genes are altered in FSHD. Enrichment analyses were performed using GO-database102,103 to identify molecular signalling pathways (Supplementary Data 1, sheets 5 and 6).
Meta-FSHD App
The goal of meta-FSHD is to support future research in FSHD. For its implementation, we used the R packages shiny104 and shinythemes105 in combination with the R packages metafor106, grid84, forestploter107 and ggplot2108. The entire R code as well as the data for the app are publicly available on GitHub (https://github.com/stamats/metaFSHD). The app can easily be used by anyone without installing R and RStudio at the website hosted by Posit Software, PBC (https://stamats.shinyapps.io/metafshd). The corresponding background data for each study, including all samples, are provided in Supplementary Table 1.
In total, meta-FSHD considers all genes measured in at least 3 datasets, corresponding to 26,858 unique ENSEMBL-IDs, 21,080 unique ENTREZ-IDs and 22,791 unique gene names. In this regard, the user can choose between gene name, ENSEMBL-ID, and ENTREZ-ID to enter the corresponding gene of interest. This has the decisive advantage that, in addition to well-described genes, novel transcripts, e.g. previously described only by their ENSEMBL-ID, can be investigated. (For DUX4 and biomarker gene searches, additional information is given in Supplementary Notes 5).
Once the gene of interest is entered, a forest plot appears, encompassing all 13 datasets. Although the DUX4-model is not considered in the random-effects calculation, it is presented in addition to the patient datasets for comparison purposes. There is an additional option to save each forest plot in PDF format. The datasets are further divided into microarray and RNA-Seq datasets for clarity. The statistical calculation is based on the STD log2-FC between FSHD and control due to the different scales for microarrays and sequencing data, where STD log2-FC corresponds to SMDH17. The mean (standard deviation) on the log2-scale per group (FSHD or control) is given next to the name of each study, followed by a graphical representation incorporating the studies’ impact (size of squares proportional to weight (inverse of standard error) of the single dataset within the meta-analysis) with the 95% CI (horizontal lines) and the corresponding numbers on the right. If a gene is significantly upregulated in FSHD compared to control samples, the app displays it in red; if it is significantly downregulated, it is shown in blue. If a gene is insignificant, the CI-line crosses the vertical line of no effect (STD log2-FC = 0). The overall result of the meta-analysis is represented by the diamond. Furthermore, there are additional parameters such as the Cochran’s Q test for heterogeneity109 and I2, a measure of heterogeneity (Supplementary Note 3).
Sub-analysis of an MRI-informed RNA-Seq dataset
Raw data from Wang et al.21 were mapped to the human genome assembly GRCh38 (hg38) and sorted by coordinate using STAR 2.7.9a110 for analysis in DESeq2111 and MAJIQ56.
A gene count matrix was generated applying featureCounts112 and standard DESeq2 workflow was followed, inbuilt lfcShrink function was used with apeglm113. Patients were grouped according to the authors’ assessment based on biomarker expression (LEUTX, KHDCL1, TRIM43, and PRAMEF2), which correlated with pathology (section stainings and MRI). Thus, five groups were formed: controls and FSHD groups 1–4, with the latter having the strongest phenotype and biomarker expression. All DESeq2 results are found in Supplementary Data 1, sheet 8.
A comprehensive list of 386 genes that are associated with the nuclear envelope (either transmembrane proteins or interacting with them on the nucleoplasmic or cytoplasmic side) and preferentially expressed in muscle47,48 (Supplementary Data 1, sheet 9) was used to screen for nuclear envelope genes misregulated in FSHD. This was done for either group 4 or all groups and differences were evaluated by calculating p-values with ggpmisc114. Genes regulated by TMEM38A and PLPP7 were extracted from Robson et al.53, and the venn diagram was generated with ggVennDiagram115. GO term enrichment analysis (for NET regulated genes and mis-spliced genes) was conducted using gprofiler2116 and visualized with enrichplot117. Next to classic GO term analysis, gene set enrichment analysis was used for finding splicing-related pathways and their genes with the Bioconductor package fgsea118, which was then visualized using GOplot119.
Splicing analysis was done with MAJIQ (v2.3) in python (v3.9; https://www.python.org/) for group 4 compared to controls and visualized using Voila (v2.3; https://majiq.biociphers.org/). All other plots were generated with ggplot2108.
Reporting summary
Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.
Data availability
The raw data are from Gene Expression Omnibus (GEO). One RNA-Seq dataset, for which no fastq data were available for data protection reasons, is from The European Genome-phenome Archive (EGA). For the latter we obtained raw counts table and the metadata. Detailed information on the data used in the meta-analysis is provided in Table 1 and Supplementary Notes 1 and Supplementary Table 1. Complete results of the analyses are included in Supplementary Data 1.
Code availability
The complete R code of the meta-analysis is available on GitHub (https://github.com/FSHDresearch/Meta-Analysis-of-FSHD). The entire R code as well as the data for the meta-FSHD app are publicly available on GitHub (https://github.com/stamats/metaFSHD). The app can easily be used by anyone without installing R and RStudio at the website hosted by Posit Software, PBC (https://stamats.shinyapps.io/metafshd).
References
Deenen, J. C. W. et al. Population-based incidence and prevalence of facioscapulohumeral dystrophy from the Department of Neurology. Neurology 83, 1056–1059 (2014).
Hendrickson, P. G. et al. Conserved roles of mouse DUX and human DUX4 in activating cleavage-stage genes and MERVL/HERVL retrotransposons. Nat. Genet. 49, 925–934 (2017).
Das, S. & Chadwick, B. P. Influence of repressive histone and DNA methylation upon D4Z4 transcription in non-myogenic cells. PLoS ONE 11, 1–26 (2016).
Tawil, R. et al. Evidence-based guideline summary: evaluation, diagnosis, and management of facioscapulohumeral muscular dystrophy. Neurology 85, 357–364 (2015).
Tassin, A. et al. DUX4 expression in FSHD muscle cells: how could such a rare protein cause a myopathy? J. Cell Mol. Med. 17, 76–89 (2013).
Schätzl, T., Kaiser, L. & Deigner, H.-P. Facioscapulohumeral muscular dystrophy: genetics, gene activation and downstream signalling with regard to recent therapeutic approaches: an update. Orphanet J. Rare Dis. 16, 129 (2021).
Tawil, R. & Van Der Maarel, S. M. Facioscapulohumeral muscular dystrophy. Muscle Nerve 34, 1–15 (2006).
Ricci, E. et al. Progress in the molecular diagnosis of facioscapulohumeral muscular dystrophy and correlation between the number of KpnI repeats at the 4q35 locus and clinical phenotype. Ann. Neurol. 45, 751–757 (1999).
Lunt, P. W. et al. Correlation between fragment size at D4F104S1 and age at onset or at wheelchair use, with a possible generational effect, accounts for much phenotypic variation in 4q35-facioscapulohumeral muscular dystrophy (FSHD). Hum. Mol. Genet. 4, 951–958 (1995).
Nikolic, A. et al. Clinical expression of facioscapulohumeral muscular dystrophy in carriers of 1–3 D4Z4 reduced alleles: experience of the FSHD Italian National Registry. BMJ Open 6, e007798 (2016).
Butz, M. et al. Facioscapulohumeral muscular dystrophy: phenotype–genotype correlation in patients with borderline D4Z4 repeat numbers. J. Neurol. 250, 932–937 (2003).
Statland, J. M. et al. Milder phenotype in facioscapulohumeral dystrophy with 7–10 residual D4Z4 repeats. Neurology 85, 2147–2150 (2015).
Schätzl, T. et al. Meta-analysis of Datasets in Facioscapulohumeral Muscular Dystrophy Using the Original Omics Data for up to Date Comparability https://www.crd.york.ac.uk/PROSPERO/display_record.php?RecordID=330489 (2022).
Page, M. J. et al. The PRISMA 2020 statement: an updated guideline for reporting systematic reviews. BMJ 372, 71 (2021).
Ramasamy, A., Mondry, A., Holmes, C. C. & Altman, D. G. Key issues in conducting a meta-analysis of gene expression microarray datasets. PLoS Med. 5, 1320–1332 (2008).
Borenstein, M., Hedges, L. V., Higgins, J. P. T. & Rothstein, H. R. A basic introduction to fixed-effect and random-effects models for meta-analysis. Res. Synth. Methods 1, 97–111 (2010).
Bonett, D. G. Meta-analytic interval estimation for standardized and unstandardized mean differences. Psychol. Methods 14, 225–238 (2009).
Kowaljow, V. et al. The DUX4 gene at the FSHD1A locus encodes a pro-apoptotic protein. Neuromuscul. Disord. 17, 611–623 (2007).
Bosnakovski, D. et al. An isogenetic myoblast expression screen identifies DUX4-mediated FSHD-associated molecular pathologies. EMBO J. 27, 2766–2779 (2008).
Corona, E. D., Jacquelin, D., Gatica, L. & Rosa, A. L. Multiple protein domains contribute to nuclear import and cell toxicity of DUX4, a candidate pathogenic protein for facioscapulohumeral muscular dystrophy. PLoS ONE 8, e75614 (2013).
Wang, L. H. et al. MRI-informed muscle biopsies correlate MRI with pathology and DUX4 target gene expression in FSHD. Hum. Mol. Genet. 28, 476–486 (2019).
Choi, S. H. et al. DUX4 recruits p300/CBP through its C-terminus and induces global H3K27 acetylation changes. Nucleic Acids Res. 44, 5161–5173 (2016).
Resnick, R. et al. DUX4-induced histone variants H3.X and H3.Y mark DUX4 target genes for expression. Cell Rep. 29, 1812–1820.e5 (2019).
Bosnakovski, D. et al. Low level DUX4 expression disrupts myogenesis through deregulation of myogenic gene expression. Sci. Rep. 8, 16957 (2018).
Turki, A. et al. Functional muscle impairment in facioscapulohumeral muscular dystrophy is correlated with oxidative stress and mitochondrial dysfunction. Free Radic. Biol. Med. 53, 1068–1079 (2012).
Denny, A. P. & Heather, A. K. Are antioxidants a potential therapy for FSHD? A review of the literature. Oxid. Med. Cell. Longev. 2017, 7020295 (2017).
Heher, P. et al. Interplay between mitochondrial reactive oxygen species, oxidative stress and hypoxic adaptation in facioscapulohumeral muscular dystrophy: metabolic stress as potential therapeutic target. Redox Biol. 51, 102251 (2022).
Leary, S. C., Battersby, B. J., Hansford, R. G. & Moyes, C. D. Interactions between bioenergetics and mitochondrial biogenesis. Biochim. Biophys. Acta 1365, 522–530 (1998).
Lek, A. et al. Applying genome-wide CRISPR-Cas9 screens for therapeutic discovery in facioscapulohumeral muscular dystrophy. Sci. Transl. Med. 12, 9–11 (2020).
Vercellino, I. & Sazanov, L. A. The assembly, regulation and function of the mitochondrial respiratory chain. Nat. Rev. Mol. Cell Biol. 23, 141–161 (2022).
Masny, P. S. et al. Localization of 4q35.2 to the nuclear periphery: is FSHD a nuclear envelope disease? Hum. Mol. Genet. 13, 1857–1871 (2004).
Gaillard, M. C. et al. Analysis of the 4q35 chromatin organization reveals distinct long-range interactions in patients affected with facio-scapulo-humeral dystrophy. Sci. Rep. 9, 10327 (2019).
Mariot, V. et al. Correlation between low FAT1 expression and early affected muscle in facioscapulohumeral muscular dystrophy. Ann. Neurol. 78, 387–400 (2015).
Maggi, L. et al. LMNA-associated myopathies: the Italian experience in a large cohort of patients. Neurology 83, 1634–1644 (2014).
Gu, Z. & Hübschmann, D. Simplify enrichment: a bioconductor package for clustering and visualizing functional enrichment results. Genom. Proteom. Bioinform. 21, 190–202 (2022).
Csapo, R., Gumpenberger, M. & Wessner, B. Skeletal muscle extracellular matrix—what do we know about its composition, regulation, and physiological roles? A narrative review. Front. Physiol. 11, 253 (2020). 2017.
Lieber, R. L. & Ward, S. R. Cellular mechanisms of tissue fibrosis. 4. Structural and functional consequences of skeletal muscle fibrosis. Am. J. Physiol.-Cell Physiol. 305, C241–C252 (2013).
Patton, B. L. Basal lamina and the organization of neuromuscular synapses. J. Neurocytol. 32, 883–903 (2003).
Sanes, J. Laminin, fibronectin, and collagen in synaptic and extrasynaptic portions of muscle fiber basement membrane. J. Cell Biol. 93, 442–451 (1982).
Zhang, W., Liu, Y. & Zhang, H. Extracellular matrix: an important regulator of cell functions and skeletal muscle development. Cell Biosci. 11, 65 (2021).
Tasca, G. et al. Different molecular signatures in magnetic resonance imaging-staged facioscapulohumeral muscular dystrophy muscles. PLoS ONE 7, e38779 (2012).
Tasca, G. et al. Magnetic resonance imaging in a large cohort of facioscapulohumeral muscular dystrophy patients: pattern refinement and implications for clinical trials. Ann. Neurol. 79, 854–864 (2016).
Wang, Z., Gerstein, M. & Snyder, M. RNA-Seq: a revolutionary tool for transcriptomics. Nat. Rev. Genet. 10, 57–63 (2009).
Wang, C. et al. The concordance between RNA-seq and microarray data depends on chemical treatment and transcript abundance. Nat. Biotechnol. 32, 926–932 (2014).
de las Heras, J. I. et al. Tissue specificity in the nuclear envelope supports its functional complexity. Nucleus 4, 460–477 (2013).
Östlund, C. et al. Reduction of a 4q35-encoded nuclear envelope protein in muscle differentiation. Biochem. Biophys. Res. Commun. 389, 279–283 (2009).
Korfali, N. et al. The nuclear envelope proteome differs notably between tissues. Nucleus 3, 552–564 (2012).
Wilkie, G. S. et al. Several novel nuclear envelope transmembrane proteins identified in skeletal muscle have cytoskeletal associations. Mol. Cell. Proteom. 10, M110.003129 (2011).
Meinke, P. et al. A multistage sequencing strategy pinpoints novel candidate alleles for Emery–Dreifuss muscular dystrophy and supports gene misregulation as its pathomechanism. EBioMedicine 51, 102587 (2020).
Dong, C. H. et al. LMNB2 promotes the progression of colorectal cancer by silencing p21 expression. Cell Death Dis. 12, 331 (2021).
Bonne, G. et al. Mutations in the gene encoding lamin A/C cause autosomal dominant Emery–Dreifuss muscular dystrophy. Nat. Genet. 21, 285–288 (1999).
Gueneau, L. et al. Mutations of the FHL1 gene cause Emery–Dreifuss muscular dystrophy. Am. J. Hum. Genet. 85, 338–353 (2009).
Robson, M. I. et al. Tissue-specific gene repositioning by muscle nuclear membrane proteins enhances repression of critical developmental genes during myogenesis. Mol. Cell 62, 834–847 (2016).
Todorow, V. et al. Transcriptome analysis in a primary human muscle cell differentiation model for myotonic dystrophy type 1. Int. J. Mol. Sci. 22, 8607 (2021).
de las Heras, J. I. et al. Metabolic, fibrotic and splicing pathways are all altered in Emery–Dreifuss muscular dystrophy spectrum patients to differing degrees. Hum. Mol. Genet 32, 1010–1031 (2023).
Vaquero-Garcia, J. et al. A new view of transcriptome complexity and regulation through the lens of local splicing variations. Elife 5, e11752 (2016).
Shadle, S. C. et al. DUX4-induced dsRNA and MYC mRNA stabilization activate apoptotic pathways in human cell models of facioscapulohumeral dystrophy. PLoS Genet. 13, e1006658 (2017).
Feng, Q. et al. A feedback loop between nonsense-mediated decay and the retrogene DUX4 in facioscapulohumeral muscular dystrophy. Elife 4, e04996 (2015).
Ghasemizadeh, A. et al. Macf1 controls skeletal muscle function through the microtubule-dependent localization of extra-synaptic myonuclei and mitochondria biogenesis. Elife 10, e70490 (2021).
Randazzo, D. et al. Persistent upregulation of the β-tubulin tubb6, linked to muscle regeneration, is a source of microtubule disorganization in dystrophic muscle. Hum. Mol. Genet. 28, 1117–1135 (2019).
Banerji, C. R. S. et al. β-Catenin is central to DUX4-driven network rewiring in facioscapulohumeral muscular dystrophy. J. R. Soc. Interface 12, 20140797 (2015).
Banerji, C. R. S. et al. PAX7 target genes are globally repressed in facioscapulohumeral muscular dystrophy skeletal muscle. Nat. Commun. 8, 2152 (2017).
Banerji, C. R. S. et al. Skeletal muscle regeneration in facioscapulohumeral muscular dystrophy is correlated with pathological severity. Hum. Mol. Genet. 29, 2746–2760 (2020).
Engquist, E. N. et al. FSHD muscle shows perturbation in fibroadipogenic progenitor cells, mitochondrial function and alternative splicing independently of inflammation. Hum. Mol. Genet. 33, 182–197 (2024).
Salsia, V., Vattemi, G. N. A. & Tupler, R. G. The FSHD jigsaw: are we placing the tiles in the right position? Curr. Opin. Neurol. 36, 455–463 (2023).
Lassche, S. et al. Reduced specific force in patients with mild and severe facioscapulohumeral muscular dystrophy. Muscle Nerve 63, 60–67 (2021).
Erdmann, H. et al. Reply: an epigenetic basis for genetic anticipation in facioscapulohumeral muscular dystrophy type 1. Brain 146, e111–e114 (2023).
Buitrago, D. et al. Impact of DNA methylation on 3D genome structure. Nat. Commun. 12, 3243 (2021).
van den Heuvel et al. FSHD-group Leiden University Medical Center (LUMC). Dataset. RNA-Sequencing Data from Human FSHD and Control Skeletal Muscle Biopsies https://ega-archive.org/datasets/EGAD00001008337 (2022).
Cheli, S. & Meneveri, R. Series GSE26061. Expression Profiling of 4q-linked and Phenotypic FSHD in Different Steps of Myogenic Differentiation https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE26061 (2011).
Ehrlich, M. & Tsumagari, K. Series GSE26145. Expression Profiling FSHD vs. Control Myoblasts and Myotubes https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE26145 (2011).
Welle, S. Series GSE10760. Effect of Facioscapulohumeral Dystrophy (FSHD) on Skeletal Muscle Gene Expression https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE10760 (2008).
Kho, A., Arashiro, P., Kunkel, L. & Zatz, M. Series GSE15090. Gene Expression Profiles in Muscle Tissue from FSHD Patients https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE15090 (2009).
Rahimov, F. Series GSE36398. Transcriptional Profiling in Facioscapulohumeral Muscular Dystrophy to Identify Candidate Biomarkers. https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE36398 (2012).
Tasca, G., Pescatori, M., Cubeddu, T. & Ricci, E. Series GSE26852. Gene Expression Analysis of FSHD Muscle with Different MRI Pattern https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE26852 (2012).
Yao, Z. et al. Series GSE56787. DUX4-induced Gene Expression is the Major Molecular Signature in FSHD Skeletal Muscle https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE56787 (2014).
Banerji, C. R. S. & Zammit, P. Series GSE123468. RNA-seq of FSHD and Control Immortalised Myoblasts II https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE123468 (2018).
Freidman, S. et al. Series GSE115650. MRI-informed Muscle Biopsies Correlate MRI with Pathology and DUX4 Target Gene Expression in FSHD https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE115650 (2018).
Watt, K. et al. Series GSE138768. DUX4 Promotes Mitochondrial Impairment in Skeletal Muscle https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE138768 (2021).
Yao, Z. et al. DUX4-induced gene expression is the major molecular signature in FSHD skeletal muscle. Hum. Mol. Genet. 23, 5342–5352 (2014).
Dalma‐Weiszhausz, D. D., Warrington, J., Tanimoto, E. Y. & Miyada, C. G. DNA microarrays, Part A: array platforms and wet-bench protocols. [1] The affymetrix GeneChip® platform: an overview. In Methods in Enzymology Vol. 410 3–28 (Academic Press, 2006).
Fan, J. et al. DNA microarrays, Part A: array platforms and wet-bench protocols. [3] Illumina universal bead arrays. In Methods in Enzymology Vol. 410 57–73 (Academic Press, 2006).
Dillies, M.-A. et al. A comprehensive evaluation of normalization methods for Illumina high-throughput RNA sequencing data analysis. Brief. Bioinform. 14, 671–683 (2013).
R Core Team. R: A Language and Environment for Statistical Computing https://www.R-project.org/ (2022).
Benjamini, Y. & Hochberg, Y. Controlling the false discovery rate: a practical and powerful approach to multiple testing. J. R. Stat. Soc. Ser. b-Methodol. 57, 289–300 (1995).
Patro, R., Duggal, G., Love, M. I., Irizarry, R. A. & Kingsford, C. Salmon provides fast and bias-aware quantification of transcript expression. Nat. Methods 14, 417–419 (2017).
Irizarry, R. A. et al. Exploration, normalization, and summaries of high density oligonucleotide array probe level data. Biostatistics 4, 249–264 (2003).
Carvalho, B. S. & Irizarry, R. A. A framework for oligonucleotide microarray preprocessing. Bioinformatics 26, 2363–2367 (2010).
Gautier, L., Cope, L., Bolstad, B. M. & Irizarry, R. A. affy—analysis of Affymetrix GeneChip data at the probe level. Bioinformatics 20, 307–315 (2004).
Du, P. & Lin, S. lumi: BeadArray Specific Methods for Illumina Methylation and Expression Microarrays. rsn: Robust Spline Normalization between Chips https://rdrr.io/bioc/lumi/man/rsn.html (2020).
Du, P., Kibbe, W. & Lin, S. lumi: a pipeline for processing Illumina microarray. Bioinformatics 24, 1547–1548 (2008).
Smyth, G. K. et al. limma: Linear Models for Microarray and RNA-Seq Data User’s Guide https://bioconductor.org/packages/release/bioc/vignettes/limma/inst/doc/usersguide.pdf (2021)
Kohl, M. MKomics: Omics Data Analysis https://CRAN.R-project.org/package=MKomics (2021).
Ritchie, M. E. et al. Limma powers differential expression analyses for RNA-sequencing and microarray studies. Nucleic Acids Res. 43, e47 (2015).
Tong, Y. The comparison of limma and DESeq2 in gene analysis. E3S Web Conf. 271, 03058 (2021).
Corchete, L. A. et al. Systematic comparison and assessment of RNA-seq procedures for gene expression quantitative analysis. Sci. Rep. 10, 19737 (2020).
Bourgon, R., Gentleman, R. & Huber, W. Independent filtering increases detection power for high-throughput experiments. PNAS 107, 9546–9551 (2010).
Hackstadt, A. J. & Hess, A. M. Filtering for increased power for microarray data analysis. BMC Bioinform. 10, 11 (2009).
van Iterson, M., Boer, J. M. & Menezes, R. X. Filtering, FDR and power. BMC Bioinform. 11, 450 (2010).
Huber, W. et al. Orchestrating high-throughput genomic analysis with Bioconductor. Nat. Methods 12, 115–121 (2015).
Aken, B. L. et al. The Ensembl gene annotation system. Database (Oxford) 2016, baw093 (2016).
Carbon, S. et al. The Gene Ontology resource: enriching a GOld mine. Nucleic Acids Res. 49, D325–D334 (2021).
Ashburner, M. et al. Gene ontology: tool for the unification of biology. Nat. Genet. 25, 25–29 (2000).
Chang, W. et al. Shiny: Web Application Framework for R. R package version 1.7.4 https://CRAN.R-project.org/package=shiny (2022).
Chang, W. shinythemes: Themes for Shiny. R package version 1.2.0 https://CRAN.R-project.org/package=shinythemes (2021).
Viechtbauer, W. Conducting meta-analyses in R with the metafor package. J. Stat. Softw. 36, 1–48 (2010).
Dayimu, A. forestploter: Create Flexible Forest Plot. R package Version 1.1.0 https://CRAN.R-project.org/package=forestploter (2023).
Wickham, H. ggplot2: Elegant Graphics for Data Analysis (Springer, 2016).
Higgins, J. P. T. & Thompson, S. G. Quantifying heterogeneity in a meta-analysis. Stat. Med. 21, 1539–1558 (2002).
Dobin, A. et al. STAR: ultrafast universal RNA-seq aligner. Bioinformatics 29, 15–21 (2013).
Love, M. I., Huber, W. & Anders, S. Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2. Genome Biol. 15, 550 (2014).
Liao, Y., Smyth, G. K. & Shi, W. FeatureCounts: an efficient general purpose program for assigning sequence reads to genomic features. Bioinformatics 30, 923–930 (2014).
Zhu, A., Ibrahim, J. G. & Love, M. I. Heavy-tailed prior distributions for sequence count data: removing the noise and preserving large differences. Bioinformatics 35, 2084–2092 (2019).
Aphalo, P. ggpmisc: Miscellaneous Extensions to ‘ggplot2’. R Package Version 0.5.3 https://CRAN.R-project.org/package=ggpmisc (2023).
Gao, C. H., Yu, G. & Cai, P. ggVennDiagram: an intuitive, easy-to-use, and highly customizable R package to generate Venn diagram. Front. Genet. 12, 706907 (2021).
Kolberg, L., Raudvere, U., Kuzmin, I., Vilo, J. & Peterson, H. gprofiler2—an R package for gene list functional enrichment analysis and namespace conversion toolset g:Profiler. F1000Res 9, 709 (2020).
Yu, G. enrichplot: Visualization of Functional Enrichment Result. R Package Version 1.20.0 https://yulab-smu.top/biomedical-knowledge-mining-book/ (2023).
Korotkevich, G., Sukhov, V. & Sergushichev, A. A. Fast gene set enrichment analysis. Preprint at bioRxiv https://doi.org/10.1101/060012. (2019)
Walter, W., Sánchez-Cabo, F. & Ricote, M. GOplot: an R package for visually combining expression data with functional analysis. Bioinformatics 31, 2912–2914 (2015).
Tryggvason, K. & Patrakka, J. Alport’s disease and thin basement membrane nephropathy. In Genetic Diseases of the Kidney Ch. 4 (eds. Lifton, R. P., Somlo, S., Giebisch, G. H. & Seldin, D. W.) 77–96 (Academic Press, 2009).
Wang, M., Zhao, Y. & Zhang, B. Efficient test and visualization of multi-set intersections. Sci. Rep. 5, 16923 (2015).
Acknowledgements
We would like to thank three reviewers and the editors for their critical reading of our manuscript and their helpful suggestions. We acknowledge financial support by DFG within the funding programme Open Access Publikationskosten.
Funding
Open Access funding enabled and organized by Projekt DEAL.
Author information
Authors and Affiliations
Contributions
T.S.: conceptualisation, methodology, data curation, formal analysis, investigation, visualisation, writing—original draft preparation, writing—review & editing; V.T.: conceptualisation, data curation, formal analysis, investigation, visualisation, writing—original draft preparation, writing—review & editing; L.K.: conceptualisation, investigation, writing—review & editing; H.W.: data curation, investigation, writing—review & editing; B.S.: conceptualisation, investigation, writing—review & editing; H.-P. D.: conceptualisation, investigation, writing—review & editing; P.M.: supervision, conceptualisation, methodology, formal analysis, investigation, visualisation, writing—original draft preparation, writing—review & editing; M.K.: project management, supervision, conceptualisation, methodology, data curation, formal analysis, app implementation, investigation, visualisation, writing—original draft preparation, writing—review & editing; All authors approved the final version of the paper for submission.
Corresponding author
Ethics declarations
Competing interests
The authors declare no competing interests.
Peer review
Peer review information
Communications Biology thanks Christopher Banerji, Quentin Gouil and the other, anonymous, reviewer(s) for their contribution to the peer review of this work. Primary Handling Editors: Melanie Bahlo and Manuel Breuer. A peer review file is available.
Additional information
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Schätzl, T., Todorow, V., Kaiser, L. et al. Meta-analysis towards FSHD reveals misregulation of neuromuscular junction, nuclear envelope, and spliceosome. Commun Biol 7, 640 (2024). https://doi.org/10.1038/s42003-024-06325-z
Received:
Accepted:
Published:
DOI: https://doi.org/10.1038/s42003-024-06325-z
- Springer Nature Limited