Genome-wide meta-analyses of restless legs syndrome yield insights into genetic architecture, disease biology and risk prediction

Schormair, Barbara; Zhao, Chen; Bell, Steven; Didriksen, Maria; Nawaz, Muhammad S.; Schandra, Nathalie; Stefani, Ambra; Högl, Birgit; Dauvilliers, Yves; Bachmann, Cornelius G.; Kemlink, David; Sonka, Karel; Paulus, Walter; Trenkwalder, Claudia; Oertel, Wolfgang H.; Hornyak, Magdolna; Teder-Laving, Maris; Metspalu, Andres; Hadjigeorgiou, Georgios M.; Polo, Olli; Fietze, Ingo; Ross, Owen A.; Wszolek, Zbigniew K.; Ibrahim, Abubaker; Bergmann, Melanie; Kittke, Volker; Harrer, Philip; Dowsett, Joseph; Chenini, Sofiene; Ostrowski, Sisse Rye; Sørensen, Erik; Erikstrup, Christian; Pedersen, Ole B.; Topholm Bruun, Mie; Nielsen, Kaspar R.; Butterworth, Adam S.; Soranzo, Nicole; Ouwehand, Willem H.; Roberts, David J.; Danesh, John; Burchell, Brendan; Furlotte, Nicholas A.; Nandakumar, Priyanka; Earley, Christopher J.; Ondo, William G.; Xiong, Lan; Desautels, Alex; Perola, Markus; Vodicka, Pavel; Dina, Christian; Stoll, Monika; Franke, Andre; Lieb, Wolfgang; Stewart, Alexandre F. R.; Shah, Svati H.; Gieger, Christian; Peters, Annette; Rye, David B.; Rouleau, Guy A.; Berger, Klaus; Stefansson, Hreinn; Ullum, Henrik; Stefansson, Kari; Hinds, David A.; Di Angelantonio, Emanuele; Oexle, Konrad; Winkelmann, Juliane

doi:10.1038/s41588-024-01763-1

Article
Open access
Published: 05 June 2024

Volume 56, pages 1090–1099, (2024)

Abstract

Restless legs syndrome (RLS) affects up to 10% of older adults. Their healthcare is impeded by delayed diagnosis and insufficient treatment. To advance disease prediction and find new entry points for therapy, we performed meta-analyses of genome-wide association studies in 116,647 individuals with RLS (cases) and 1,546,466 controls of European ancestry. The pooled analysis increased the number of risk loci eightfold to 164, including three on chromosome X. Sex-specific meta-analyses revealed largely overlapping genetic predispositions of the sexes (r_g = 0.96). Locus annotation prioritized druggable genes such as glutamate receptors 1 and 4, and Mendelian randomization indicated RLS as a causal risk factor for diabetes. Machine learning approaches combining genetic and nongenetic information performed best in risk prediction (area under the curve (AUC) = 0.82–0.91). In summary, we identified targets for drug development and repurposing, prioritized potential causal relationships between RLS and relevant comorbidities and risk factors for follow-up and provided evidence that nonlinear interactions are likely relevant to RLS risk prediction.

Main

RLS is a prevalent, but underdiagnosed, chronic sensorimotor disorder, affecting up to 10% of the elderly population in Europe and North America^1,2. Previous genome-wide association studies (GWAS) have identified 22 risk loci^3,4. However, objective biomarkers for prediction or diagnosis are not available yet. Severely impairing sleep, RLS has a profound impact on daily functioning, overall health and quality of life. Long-term treatment options are scarce and require frequent adjustment due to side effects^2,5.

RLS is often comorbid with psychiatric disorders such as depression or anxiety as well as cardiovascular disorders, hypertension and metabolic conditions such as diabetes^2,6. The extent to which these associations imply causal relations is unknown⁷. Epidemiological and clinical studies have consistently demonstrated the prevalence of RLS to be twice as high in women as in men^8,9. The contribution of genetic factors to this difference has not been examined yet.

To address these shortcomings, we conducted a genome-wide association meta-analysis (GWAMA) of three independent GWAS. We integrated multiple layers of functional omics data to identify pathways and cell types relevant to RLS. Furthermore, our analyses included sex-stratified GWAS and a genetic investigation of the X chromosome. To facilitate translational research, we identified drug targets among candidate genes, used machine learning to enhance risk prediction and conducted extensive genetic correlation and Mendelian randomization (MR) analyses to identify risk factors.

Results

Pooled autosomal GWAS meta-analysis

We performed a meta-analysis of summary statistics from three GWAS for RLS, totaling 116,647 cases and 1,546,466 controls of European ancestry (Extended Data Fig. 1). The first GWAS (EU-RLS-GENE) was conducted in affected individuals recruited by expert clinicians of the International EU-RLS-GENE consortium and ancestry-matched controls. The second GWAS (INTERVAL) was based on the INTERVAL study of blood donors in the United Kingdom, which used the Cambridge-Hopkins questionnaire to diagnose RLS. The third GWAS (23andMe) was conducted on the research participant base of 23andMe, identifying RLS by asking whether a diagnosis or treatment of RLS was received from a physician. Further details are provided in the Methods. Genetic correlations between the GWAS were strong but indicated some degree of heterogeneity, with pairwise genetic correlation (r_g) ranging between 0.70 and 0.76 (Extended Data Fig. 2), possibly due to differences in phenotyping of RLS as well as in source populations targeted for recruitment. Therefore, we used a multivariate GWAMA approach (Methods). After quality control, 9,196,648 variants with minor allele frequency (MAF) ≥ 1% were available for meta-analysis. We identified 161 RLS risk loci (P < 5 × 10⁻⁸) on the autosomes, confirming all known loci and adding 139 new loci (Extended Data Fig. 3a). Conditional analysis within each locus resulted in a total of 193 independent lead SNPs (Supplementary Table 1).

An LD score regression (LDSC) intercept of 1.072 (standard error (s.e.) = 0.013) with an inflation ratio of 0.064 (s.e. = 0.012) indicated that population stratification was negligible and that the inflation of the test statistics was driven by the polygenic architecture of RLS.
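The inflation ratio reported by LDSC relates the intercept to the mean χ² statistic as (intercept − 1)/(mean χ² − 1). The sketch below only illustrates that relationship. The mean χ² is not given in the text, so the value recovered here is merely what the rounded intercept and ratio imply, and the function names are illustrative rather than part of any LDSC interface.

```python
def ldsc_ratio(intercept: float, mean_chi2: float) -> float:
    """Share of test-statistic inflation attributed to confounding
    rather than polygenicity: (intercept - 1) / (mean chi2 - 1)."""
    return (intercept - 1.0) / (mean_chi2 - 1.0)


def implied_mean_chi2(intercept: float, ratio: float) -> float:
    """Invert the ratio definition to recover the mean chi2 it implies."""
    return 1.0 + (intercept - 1.0) / ratio


# Rounded values from the text: intercept = 1.072, ratio = 0.064.
mean_chi2 = implied_mean_chi2(1.072, 0.064)
round_trip = ldsc_ratio(1.072, mean_chi2)
print(f"implied mean chi2 = {mean_chi2:.3f}, ratio = {round_trip:.3f}")
```

A small ratio alongside a mean χ² well above 1 is the pattern expected when most of the inflation reflects polygenic signal.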

At the meta-analysis level, assuming a disease prevalence of 9%, the overall SNP-based heritability was estimated to be 0.20 (s.e. = 0.016) using LDSC (Methods). Because the meta-analysis included studies with different phenotyping methods, we also derived heritability estimates from the individual GWAS. LDSC-derived heritability in the most stringently phenotyped study, EU-RLS-GENE, was higher (0.26, s.e. = 0.038) than that in INTERVAL (0.17, s.e. = 0.051, P_EU-Interval = 0.073, two-sample two-sided Z-test) and 23andMe (0.14, s.e. = 0.011, P_EU-23andMe = 0.0012, two-sample two-sided Z-test). While the LDSC model showed the best fit, this trend was consistent with other estimation methods (Supplementary Table 2a).
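Heritability estimates for a binary trait depend on the assumed population prevalence because they are usually reported on the liability scale. The sketch below shows the widely used liability-threshold conversion from an observed-scale estimate. It is not taken from the Methods, the observed-scale input is a hypothetical value, and using the raw case fraction of the meta-analysis as the sample prevalence is an assumption made only for illustration.

```python
from statistics import NormalDist


def liability_scale_h2(h2_observed: float, pop_prev: float, sample_prev: float) -> float:
    """Convert observed-scale SNP heritability of a case-control trait
    to the liability scale under a liability-threshold model.

    h2_liab = h2_obs * K^2 (1-K)^2 / (z^2 * P (1-P)), where K is the
    population prevalence, P the case fraction in the sample and z the
    standard normal density at the liability threshold.
    """
    std_normal = NormalDist()
    threshold = std_normal.inv_cdf(1.0 - pop_prev)
    z = std_normal.pdf(threshold)
    k, p = pop_prev, sample_prev
    return h2_observed * (k * (1 - k)) ** 2 / (z ** 2 * p * (1 - p))


# Prevalence of 9% is the value assumed in the text.
# The case fraction uses the reported case and control counts (an assumption:
# a multivariate meta-analysis may call for an effective sample size instead).
case_fraction = 116_647 / (116_647 + 1_546_466)
example = liability_scale_h2(h2_observed=0.05,  # hypothetical input
                             pop_prev=0.09,
                             sample_prev=case_fraction)
print(f"liability-scale h2 for the hypothetical input: {example:.3f}")
```

Because the conversion factor changes with the assumed prevalence, estimates from studies with different case ascertainment are only comparable once they are put on the same scale.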

Sex-stratified autosomal GWAS and meta-analyses

To study sex-specific genetic effects, we conducted sex-stratified GWAS for the autosomes in each study and meta-analyzed the results (Extended Data Fig. 3b, representing 78,333 cases and 844,872 controls in women and 38,314 cases and 701,594 controls in men). Heritability was significantly higher for females in the meta-analysis ($h^{2}_{\mathrm{males}} = 0.13$, s.e. = 0.012; $h^{2}_{\mathrm{females}} = 0.32$, s.e. = 0.027; P_difference = 1.9 × 10⁻⁸, two-sample two-sided Z-test). The INTERVAL study was too small for reliable application of LDSC, but both other cohorts showed higher estimates for LDSC-derived heritability in females than in males (P_difference = 0.07 in EU-RLS-GENE; P_difference = 0.09 in 23andMe, two-sample two-sided Z-test; Supplementary Table 2b,c). Comparing the two sex-specific meta-analyses showed a high genetic correlation of 0.96 (s.e. = 0.018); however, the remaining small divergence was significant (P = 0.044, one-sample two-sided Z-test).

The sex-specific meta-analyses identified 58 independent lead SNPs in 50 risk loci in males and 155 SNPs in 130 loci in females (Supplementary Tables 3 and 4). Of these loci, 23 (two in males, 21 in females) were not genome-wide significant in the pooled analysis. To prioritize loci with robust sex differences, we tested the lead SNPs of the pooled meta-analysis for heterogeneity of effect sizes between males and females. This was statistically significant for six loci (Extended Data Table 1).

To understand the discrepancy between the heritability estimates of the two sexes despite their high genetic correlation, we ran a simulation study (Supplementary Note) modeling the impact of an environmental risk factor and of its interaction with the genetic predisposition to RLS (G × E). The results obtained with the model including the G × E interaction recapitulated the situation observed in our real-world GWAS data very closely. This was the case for both binary and continuous environmental factors, with the binary risk factor showing a slightly better fit (log₁₀ (Bayes factor) of 11.43 compared to 9.11). In line with this, the G × E model showed a closer fit (P = 0.02, two-sample two-sided Z-test) to the $h^{2}_{\mathrm{male}}/h^{2}_{\mathrm{female}}$ ratio observed in the pooled GWAS than the model without the G × E interaction (Extended Data Fig. 4). Furthermore, the impact of a G × E interaction on RLS was higher in females than in males, with an $r_{G\times E(\mathrm{female})}/r_{G\times E(\mathrm{male})}$ ratio of 16.1 (95% CI = 7.09, 51.12).

X-chromosomal meta-analyses

We performed pooled as well as sex-specific X chromosome-wide association study (XWAS) meta-analyses using EU-RLS-GENE and 23andMe data (Methods). Based on the pooled meta-analysis, SNP-based heritability $h^{2}_{\mathrm{pooled}}$ carried by the X chromosome was 0.0035 (s.e. = 0.0010), with the sex-specific values again being lower in men ($h^{2}_{\mathrm{males}} = 0.0032$, s.e. = 0.0018) than in women ($h^{2}_{\mathrm{females}} = 0.0047$, s.e. = 0.0012; Extended Data Fig. 5 and Supplementary Table 5), but this difference was not significant (P = 0.49). Genetic correlation between the two sexes was high (r_g = 0.926, s.e. = 0.071, P_difference = 0.29, one-sample two-sided Z-test). Our analyses identified three independent risk loci for RLS on the X chromosome in the pooled data and one in the male-only data (Supplementary Tables 1 and 3).
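The Z-tests quoted for the X chromosome can be sketched from the rounded estimates and standard errors alone. The code below is a minimal illustration that assumes independent, approximately normal estimates; the function names are hypothetical, and because the inputs are rounded, the resulting P values can differ slightly from those reported.

```python
import math


def two_sided_p(z: float) -> float:
    """Two-sided P value for a standard normal test statistic."""
    return math.erfc(abs(z) / math.sqrt(2.0))


def two_sample_z(est1: float, se1: float, est2: float, se2: float):
    """Compare two independent estimates (for example, h2 in females vs males)."""
    z = (est1 - est2) / math.sqrt(se1 ** 2 + se2 ** 2)
    return z, two_sided_p(z)


def one_sample_z(est: float, se: float, null_value: float):
    """Test one estimate against a fixed value (for example, r_g against 1)."""
    z = (est - null_value) / se
    return z, two_sided_p(z)


# X-chromosomal heritability, females vs males (values from the text).
z_h2, p_h2 = two_sample_z(0.0047, 0.0012, 0.0032, 0.0018)
# X-chromosomal genetic correlation between the sexes, tested against 1.
z_rg, p_rg = one_sample_z(0.926, 0.071, 1.0)

print(f"h2 difference: z = {z_h2:.2f}, P = {p_h2:.2f}")
print(f"r_g vs 1:      z = {z_rg:.2f}, P = {p_rg:.2f}")
```

Both tests come out non-significant with these inputs, in line with the conclusion that the X-chromosomal sex differences were not significant.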

Replication of lead variants in additional datasets

We combined data from three additional cohorts to replicate the lead SNP associations of our meta-analyses (Methods): the discovery dataset of a previously published meta-analysis, a second research participant sample from 23andMe and a second set of blood donors from INTERVAL, totaling 29,028 cases and 398,815 controls. Despite the considerably smaller sample size, 71% of the lead SNPs from the pooled discovery meta-analysis were at least nominally significant in the replication dataset (P < 0.05) and there was a high positive correlation between the effect size estimates of the discovery stage and the replication dataset (Pearson’s r = 0.94, P < 2.2 × 10⁻¹⁶; Extended Data Fig. 6a). The male- and female-specific analyses showed similar results (male, 67% of lead SNPs with P < 0.05, Pearson’s r = 0.97, P < 2.2 × 10⁻¹⁶; female, 70%, Pearson’s r = 0.92, P < 2.2 × 10⁻¹⁶; Extended Data Fig. 6b,c). A joint analysis of discovery and replication datasets revealed that all lead SNPs of the pooled, male-specific and female-specific meta-analyses reached Bonferroni-corrected significance (Supplementary Table 6).

Functional annotation and biological interpretation

We performed gene set and cell type enrichment analyses based on the pooled meta-analysis (Methods). We used DEPICT to perform gene set enrichment analyses across the genome-wide significant risk loci and detected 319 gene sets with a false discovery rate (FDR) < 0.05 (Supplementary Table 7). These clustered in pathways, processes and structures related to neurodevelopment, neuron migration, axon guidance, synapse formation and signal transduction between neurons (Fig. 1a). An additional gene set enrichment analysis using MAGMA prioritized nine biological processes related to neuron migration and synapse formation with an FDR < 0.05 (Supplementary Table 8). This supported the results from DEPICT and emphasized the key role of neurodevelopmental processes in RLS biology (Fig. 1b).

**Fig. 1: Pathway enrichment analysis.**

We performed enrichment analyses to identify tissue and cell types involved in RLS. We first examined body-wide human gene expression data. The default analysis in DEPICT identified 24 of 209 tissue and cell types with significant enrichment (FDR < 0.05), 23 of which were central nervous system (CNS) tissues (Supplementary Table 9). Using GTEx version 8 as an independent validation dataset yielded highly comparable results (Supplementary Table 10). Therefore, we focused on higher-resolution single-cell sequencing datasets of the nervous system in mice, available for developmental and postnatal stages (Methods). Only neurons and neuroblasts showed statistically significant enrichment, while glial and endothelial cells, for instance, did not (Fig. 2 and Supplementary Tables 11–14). We then dissected these cell types to identify specific anatomical regions and neurotransmitter classes (Fig. 2). We found cell types with statistically significant enrichment in all main compartments of the embryonic CNS: forebrain, midbrain, hindbrain and spinal cord. This was mirrored in the adult dataset, where cell types in the cerebrum, the cerebellum and the brainstem were highlighted. In most regions, both excitatory and inhibitory neuron types showed statistically significant enrichment, with glutamatergic neurons in the spinal cord showing the strongest enrichment. Overall, developmental-stage data yielded more robust enrichment than adult-stage data. Analyses in human datasets confirmed the enrichment in neuronal cell types and the higher level of significance obtained in the developmental datasets (Fig. 2 and Supplementary Tables 15 and 16). Again, excitatory and inhibitory neurons showed the highest enrichment. An additional analysis of bulk human brain transcriptome data from BrainSpan indicated an enrichment in the prenatal stage, but not the postnatal stage, underscoring a role for neurodevelopment in susceptibility to RLS (Supplementary Table 17).

**Fig. 2: Tissue and cell type enrichment analysis.**

We used diverse functional genomic annotation and fine-mapping approaches to build a sum score for ranking candidate causal genes within risk loci (maximum score = 12, Methods). Six loci contained no gene with a score above 2, 69 loci contained genes reaching a score of up to 6, and 89 loci contained genes with a score ≥ 7 (Supplementary Table 18). We focused further interpretation on the latter group. At 61 loci, there was a single independent lead SNP as well as a single top-scoring gene. These included six known loci, strengthening previous reports (MEIS1, PTPRD, SKOR1, NTNG1, CADM1 and RANBP17)^{3,4,10,11,12,13}. Because drug repurposing is one of the fastest options for translating GWAS findings into patient care, we mapped the top-scoring genes against the druggable genome and identified 13 potential candidates targeted by existing compounds (Table 1). Among them, GRIA1 and GRIA4, which encode subunits of AMPA-type glutamate ionotropic receptors, provided genetic evidence of a link between RLS and glutamate receptor function. Another interesting candidate is CCKBR, which encodes the predominant cholecystokinin receptor in the brain^14,15. Our prioritization algorithm also listed SLC40A1, which had already been identified in the discovery stage of a previous study but had failed to replicate⁴. SLC40A1 encodes ferroportin 1, the only known transporter for iron export from cells, being relevant for iron replacement therapies^16,17,18. To evaluate whether iron-related traits and RLS shared causal variants in SLC40A1, we performed additional colocalization analyses using recently published GWAS of peripheral iron measures as well as quantitative susceptibility mapping (QSM) and T2* magnetic resonance imaging data as readouts for brain iron levels^19,20,21 (Supplementary Note). 
For the pallidum and the putamen, colocalization analysis pointed toward distinct causal variants (posterior probability for the H₃ hypothesis in the coloc approximate Bayes factor analysis, PP.H3.abf: PP.H3.abf_pallidum ≥ 96.1% for QSM and PP.H3.abf_putamen > 99% for T2*), whereas results were inconclusive for the caudate nucleus. In other subcortical brain regions, the results were not statistically significant. For peripheral iron measurements, we saw a probability of >99% that RLS and each of ferritin and total iron binding capacity have different causal variants. In general, our analyses suggest that the RLS association in the SLC40A1 locus is distinct from iron-related associations (Supplementary Table 19).

Table 1 Drug repurposing options for top-scoring genes

Genetic correlation and MR analysis

We performed a large-scale genetic correlation analysis followed by MR to discover potentially modifiable risk factors for RLS and to explore epidemiological or mechanistic overlaps with other diseases (Methods). Calculating genetic correlations with LDSC identified 1,054 of 2,649 analyzed traits and diseases as significantly correlated with RLS (FDR < 0.05; Supplementary Table 20). To factor in the complex interrelations between these traits, we performed bi-serial genetic correlation followed by weighted correlation network analysis of all 1,054 traits. This clustering yielded 11 modules, which reflected independent higher-level trait categories linked to RLS (Methods and Fig. 3a). The genetic correlation results strongly converged on RLS being associated with lower general physical as well as mental health. They confirmed epidemiological associations with increased body weight, depression, hypertension, cardiovascular disease, diabetes and sleep disturbances (Fig. 3b). However, they also provided evidence for less well-described associations of RLS with lower educational attainment, higher risk of asthma and diseases of the digestive system. In line with the increased prevalence in females, we identified a cluster of female-specific traits such as age of first childbirth, hysterectomy, oophorectomy and excessive menstruation (blue module, Fig. 3a,b and Supplementary Table 20).

**Fig. 3: Genetic correlation analysis.**

We performed MR to infer potential causal relationships between RLS and representative traits from these clusters (Fig. 4 and Supplementary Table 21). RLS as a common and complex disease is characterized by phenotypic heterogeneity and likely entails genetic pleiotropy, necessitating cautious interpretation of MR results. Therefore, we used the latent heritable confounder MR (LHC-MR) approach for the primary analysis, which is a robust method designed to account for pleiotropy and potential confounding (Methods). We confirmed known unidirectional and bidirectional relations, for example, that the number of live births significantly increased the risk of RLS or that insomnia symptoms and RLS were bidirectionally linked^8,9,22,23.

For other traits, LHC-MR indicated relationships being causal rather than due to confounding. In terms of unidirectional relationships, RLS showed a significant effect (defined as P_FDR < 0.05) on type 2 diabetes with an effect estimate of a_{RLS→diabetes2} = 0.99 (s.e. = 0.06, P_FDR = 1.5 × 10⁻⁶⁸) and significant likelihood-ratio tests (LRTs) for effects being only causal (P_{LRT_causal_only} = 8.5 × 10⁻²⁸) and effects only of RLS on type 2 diabetes (P_{LRT_only_RLS→diabetes2} = 2.9 × 10⁻⁴⁰). Unidirectional causal links to RLS with strong evidence were fresh fruit intake (decreased RLS risk with a_fruit→RLS = −0.33 ± 0.08, P_FDR = 0.0002, P_{LRT_causal_only} = 2.2 × 10⁻⁵, P_{LRT_only_fruit→RLS} = 2.3 × 10⁻⁵) and being tense or highly strung as well as having had a headache in the last month (elevated RLS risk with a_tense→RLS = 0.44 ± 0.06, P_FDR = 8 × 10⁻¹², P_{LRT_causal_only} = 8.6 × 10⁻⁹, P_{LRT_only_tense→RLS} = 4.2 × 10⁻⁸ and a_{headache→RLS} = 0.37 ± 0.08, P_FDR = 2.9 × 10⁻⁵, P_{LRT_causal_only} = 1.2 × 10⁻⁸, P_{LRT_only_headache→RLS} = 6.9 × 10⁻⁷). Significant bidirectional relations with evidence of only causal effects were found for five traits (all with P_{LRT_causal_only} < 0.05; Fig. 4 and Supplementary Table 21): ease of getting up in the morning lowered RLS risk (a_ease→RLS = −0.3 ± 0.06) and vice versa (a_RLS→ease = −0.09 ± 0.02). The frequency of tenseness or restlessness in the last 2 weeks as well as two traits reflecting lung function increased RLS risk and vice versa, with a stronger effect on RLS (a_{tenseness→RLS} = 0.62 ± 0.07, a_{RLS→tenseness} = 0.11 ± 0.02, a_{COPD-differential-diagnosis →RLS} = 0.38 ± 0.06, a_{RLS→COPD-differential-diagnosis} = 0.12 ± 0.03), while, for self-reported osteoarthritis, RLS had the stronger effect (a_{osteoarthritis→RLS} = 0.46 ± 0.19, a_{RLS→osteoarthritis} = 0.18 ± 0.04). 
We performed inverse-variance weighted (IVW)-MR analyses with Steiger filtering and MR-Egger intercept assessment as a secondary analysis. The results were consistent for 14 traits, which included the unidirectional link between RLS and type 2 diabetes (Fig. 4 and Supplementary Table 22).

Considering the proposed involvement of brain iron homeostasis in RLS²⁴ and SLC40A1 as a candidate gene in our GWAMA, we also investigated peripheral and brain iron traits. Neither genetic correlation nor MR analyses revealed strong effects (Supplementary Tables 21 and 23). Only white matter hyperintensity measured by T2* was significantly correlated with RLS in the full dataset (r_g = 0.126, s.e. = 0.046, P = 0.0065, P_FDR = 0.016, one-sample two-sided Z-test). LHC-MR revealed a significant effect of peripheral calculated transferrin levels on RLS; however, this appears to be largely attributable to confounding factors (P_{LRT_latent_only} = 0.005).

Development and validation of a risk prediction model

We assessed the predictive performance of basic linear models as well as that of models integrating interaction effects and time-dependent effects using genetic data and basic demographic variables such as age, sex and age of disease onset (Methods and Supplementary Note). We employed three classes of models, generalized linear models (GLMs) with or without interaction terms, random forest (RF) models and deep neural network (DNN) models, implemented as a binary or a time-to-event (survival) classifier. Genetic risk was calculated as a polygenic risk score (PRS) using individual dosages of 216 genome-wide significant SNPs (PRS.lead), because this score showed better performance than a score using genome-wide data (LDpred2) with an area under the receiver operator characteristic curve (AUC) of AUC_LDpred2 = 0.66 ± 0.019 compared to AUC_PRS.lead = 0.73 ± 0.018 (P = 0.0056, two-sample two-sided Z-test).
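A polygenic risk score built from lead SNPs is, in its simplest form, a weighted sum of allele dosages. The sketch below illustrates that form only. The three weights and dosages are hypothetical toy values, not effect sizes from this study, and treating the weights as per-allele log odds ratios is an assumption about how such scores are commonly constructed.

```python
from typing import Sequence


def polygenic_risk_score(dosages: Sequence[float], weights: Sequence[float]) -> float:
    """Weighted sum of effect-allele dosages.

    dosages: expected effect-allele counts per SNP, each in [0, 2]
    weights: per-allele effect sizes (assumed here to be log odds ratios)
    """
    if len(dosages) != len(weights):
        raise ValueError("dosages and weights must have the same length")
    if any(d < 0.0 or d > 2.0 for d in dosages):
        raise ValueError("dosages must lie between 0 and 2")
    return sum(d * w for d, w in zip(dosages, weights))


# Hypothetical toy example with three SNPs (the study score used 216).
toy_weights = [0.20, -0.10, 0.05]
toy_dosages = [2.0, 1.0, 0.3]  # imputed dosages can be fractional
score = polygenic_risk_score(toy_dosages, toy_weights)
print(f"toy PRS = {score:.3f}")
```

In practice the raw score is usually standardized against a reference sample before being used as a predictor alongside covariates such as age and sex.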

Overall, the machine learning survival classifier models considering nonlinear interactions and time-varying effects performed best (Fig. 5). The random survival forest model (RSF-5yr; 5-year period) and the DNN survival model (DNNsurv-5yr) showed comparable performance: AUC_RSF-5yr = 0.91 ± 0.008 compared to AUC_DNNsurv-5yr = 0.90 ± 0.012 in the EU-RLS-GENE dataset and AUC_RSF-5yr = 0.87 ± 0.005 compared to AUC_DNNsurv-5yr = 0.86 ± 0.012 in the INTERVAL dataset. Additional performance metrics such as odds ratio (OR) and area under the precision–recall curve yielded the same trends (Supplementary Table 24).

We also evaluated the contribution of the interaction effects to the model performance either directly (GLMs) or indirectly by calculating the incremental gain in explained variance for the DNN and RF models (Nagelkerke’s pseudo-R²; Methods). For the GLM, we found a significant interaction between PRS and age (β = −0.47, s.e. = 0.08, P = 4.3 × 10⁻⁹, one-sample two-sided Z-test). The impact of the PRS was significantly lower in the 60+ age group (OR_overall = 5.05 (4.69–5.45), OR₆₀₊ = 3.70 (3.27–4.19), P_difference = 2.6 × 10⁻⁵, two-sample two-sided Z-test). We did not find a significant sex difference in overall PRS effects (OR_male = 4.70 (4.16–5.31), OR_female = 5.28 (4.80–5.80), P_difference = 0.141, two-sample two-sided Z-test), even though the effect of sex was highly significant (OR = 2.54 (2.33–2.78), P = 1.93 × 10⁻⁹⁴, one-sample two-sided Z-test). In the best-performing RF and DNN binary classification models, pseudo-R² was 0.329 (s.e. = 0.003) and 0.324 (s.e. = 0.005), almost 1.5 times higher than in the GLM (R² = 0.221, s.e. = 0.003). The time-to-event classifier models showed a further increase in R² by approximately 10% for both models ($R^{2}_{\text{RSF-5yr}} = 0.363$, s.e. = 0.004; $R^{2}_{\text{DNNsurv-5yr}} = 0.354$, s.e. = 0.005). Overall, nonlinear relationships and interactions accounted for 39.1% (s.e. = 1.96%) of the explained variance.
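The odds ratio comparisons above can be approximated from the reported intervals by working on the log scale. The sketch below assumes that the values in parentheses are 95% confidence intervals and that the compared estimates are independent, which is only an approximation for the overall versus 60+ comparison. The last line shows one reading of the explained-variance share, namely the gain of the best survival model over the GLM relative to the best model's pseudo-R². That reading is an assumption which happens to be consistent with the reported 39.1%.

```python
import math

Z_95 = 1.959964  # standard normal quantile for a two-sided 95% interval


def log_or_and_se(odds_ratio: float, ci_low: float, ci_high: float):
    """Log odds ratio and its standard error recovered from a 95% CI."""
    se = (math.log(ci_high) - math.log(ci_low)) / (2.0 * Z_95)
    return math.log(odds_ratio), se


def compare_odds_ratios(a, b):
    """Two-sample two-sided Z-test on the difference of two log odds ratios.

    a, b: tuples of (odds ratio, CI lower bound, CI upper bound)
    """
    log_a, se_a = log_or_and_se(*a)
    log_b, se_b = log_or_and_se(*b)
    z = (log_a - log_b) / math.sqrt(se_a ** 2 + se_b ** 2)
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p


# PRS effect overall vs in the 60+ age group (values from the text).
z_age, p_age = compare_odds_ratios((5.05, 4.69, 5.45), (3.70, 3.27, 4.19))
# PRS effect in females vs males (values from the text).
z_sex, p_sex = compare_odds_ratios((5.28, 4.80, 5.80), (4.70, 4.16, 5.31))

# Share of explained variance beyond the GLM (assumed definition, see above).
nonlinear_share = (0.363 - 0.221) / 0.363

print(f"age comparison: z = {z_age:.2f}, P = {p_age:.1e}")
print(f"sex comparison: z = {z_sex:.2f}, P = {p_sex:.2f}")
print(f"nonlinear share = {nonlinear_share:.1%}")
```

With the rounded inputs, both P values land close to the reported 2.6 × 10⁻⁵ and 0.141.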

Discussion

Performing the largest meta-analysis of RLS GWAS to date, we have increased the number of known risk loci eightfold. We included three cohorts, representative of commonly used strategies to assess behavioral phenotypes, ranging from in-person interviews to a single online question. They also reflect the breadth of target populations for recruitment into GWAS, including clinical cohorts as well as samples from the general population. Despite this heterogeneity, genetic correlations were strong between the cohorts, justifying their combination in a multivariate meta-analysis.

We investigated sex-specific genetic susceptibility in RLS. While the heritability was significantly higher in women, the genetic correlation between the sexes was close to one. Results from our simulation study pointed to an unobserved environmental risk factor and corresponding gene–environment interactions driving the difference in heritability. Our analyses emphasize the importance of tracking environmental exposures in genetically susceptible individuals and may motivate re-interpretation of previous observations in RLS, for example, of parity potentially driving the higher prevalence observed in women^8,9. In line with the high genetic correlation between the sexes, there were only six loci where risk variants showed significant sex differences in effect size. An additional two loci in males and 21 loci in females were genome-wide significant in only one sex but did not reach significance in the between-sex heterogeneity tests. With larger sample sizes, some of these may turn out to be true sex-specific association signals.

Our enrichment analyses corroborate results from earlier GWAS of RLS by prioritizing CNS tissues and primarily pathways linked to neurodevelopment and neurotransmission³. Interestingly, the enrichment effects were consistently stronger in fetal and prenatal datasets. This suggests that development may represent a critical period in which genetic contributors to RLS susceptibility act on the activity, connectivity or composition of neurons in the CNS. Analyses in developmental mouse CNS single-cell data prioritized excitatory glutamatergic neurons in the spinal cord, hindbrain, midbrain and forebrain but also γ-aminobutyric acid (GABA)ergic neurons in at least the midbrain and hindbrain. This diversity was reflected in the adult dataset, with again mostly excitatory neurons showing enrichment. Overall, the diversity of cell types and structures with significant enrichment corresponds to the complex phenotype of RLS, which includes sensory and motor symptoms as well as a circadian pattern. Unfortunately, the current scarcity of high-resolution data limits the ability of our study to validate these observations in humans. Tissue enrichment analysis depends on the methodology as well as on the composition of the datasets. Specifically, definite exclusion of cell types is difficult as they may not have been represented in the dataset. We tried to address these limitations by using two different enrichment methods as well as several datasets.

Interestingly, except for the prioritization of SLC40A1 (ferroportin), we did not identify strong links between iron metabolism and genetic risk factors for RLS in our pathway and genetic correlation analyses. However, the T2* and QSM values we used as surrogates for brain iron content are differentially influenced by iron and myelin; therefore, future magnetic resonance imaging GWAS with higher anatomical resolution may allow better dissection of genetic effects involved in iron and myelin content^25,26. Moreover, we cannot rule out an incomplete representation of brain or general iron homeostasis in the currently available pathway definitions.

Our study provides discoveries relevant for advancing clinical care in RLS. We identified several genes that are druggable and in some cases targets of known drugs. For example, the prioritization of two glutamate receptors suggests that the efficacy of anticonvulsants in RLS should be re-assessed. Small open trials have shown good response to glutamate receptor antagonists such as perampanel or lamotrigine in RLS^27,28. The benefit of α₂δ ligands such as pregabalin or gabapentin adds further evidence that anti-epileptic drugs could be an additional therapeutic option²⁹. Investigation into a completely new line of treatment is suggested by the prioritization of the cholecystokinin B receptor, a neuropeptide receptor that has been linked to pain modulation and anxiety-related behavior^15,30. Furthermore, our genetic correlation and MR analyses identified relationships of potential medical relevance between RLS and several traits. In line with previous reports, the strongest genetic correlations with RLS were observed for insomnia symptoms and for depression^22,23. MR analysis showed bidirectional effects, with the full model (causal as well as confounding effects) performing best. Both pleiotropic genetic effects and the presence of RLS cases among the depression and insomnia cases, and vice versa, are probably involved. Disentangling the contributions of shared genetics and of case misclassification to this relationship will require large datasets with high-quality phenotyping of both insomnia and RLS. We saw a robust and significant unidirectional relationship of RLS with type 2 diabetes, with consistent results between LHC-MR and standard IVW-MR. The causal-only-effect model performed best in LHC-MR, suggesting that this link from RLS to diabetes is unlikely to be due to a heritable confounder. Thus far, cross-sectional and clinical studies have yielded inconsistent results regarding the causal relationship between RLS and type 2 diabetes³¹. 
Our MR analyses support a causal effect of RLS increasing the risk of type 2 diabetes. We found likely causal, albeit bidirectional relationships between RLS and osteoarthritis and between RLS and diseases of the respiratory system. Clinical or epidemiological studies on RLS in these disorders are limited or even non-existent at present; therefore, patients could benefit from increased awareness and research activities. The beneficial effect of modifiable behaviors on reducing the risk of RLS is underscored by findings that a healthy lifestyle, for example, fresh fruit consumption, is linked to lower RLS risk. Due to the inherent limitations of MR analysis, these results should not be overinterpreted. Even though the LHC-MR approach seems robust across a range of scenarios with different violations of the MR assumptions, it has its own drawbacks³². Therefore, we advise leveraging our findings to inform future clinical and epidemiological research aimed at gathering further evidence to support causality.

Predicting the likelihood of developing RLS is crucial for targeted disease-prevention strategies. We compared traditional PRSs to more advanced machine learning approaches integrating interaction and nonlinear effects. The latter showed superior performance compared to simple PRS-only or PRS-plus-linear interactions models. In our simulation study with only limited phenotypic data, the RF and DNN approaches provided comparable results. Enhanced phenotypic data may amplify the effectiveness of DNNs for predictive purposes. Two aspects limited our options for risk prediction. First, the definitive RLS cases (diagnosed by face-to-face interviews) with individual-level data required for developing the models had no detailed clinical data. Second, they were part of a case–control cohort and therefore do not reflect the general population structure, which necessitated creating a simulated dataset from the original data. Nevertheless, we were able to achieve an AUC of up to 91% for the 5-year prediction window with the machine learning approaches and validated our results in the INTERVAL study, where the performance was comparable with an AUC of up to 87%.

Collectively, our study marks a substantial advance in deciphering the genetic basis of RLS and paves the way for improving treatment and prevention strategies. We acknowledge two important limitations. First, biobank-scale longitudinal datasets with detailed medical and lifestyle information and high-quality RLS phenotyping are lacking. This type of data is needed to dissect the relationships discovered by genetic correlation and MR analyses as well as to study the roles of age, sex and other environmental effects and their interactions in shaping the risk and course of disease. Second, large-scale GWAS for RLS are currently limited to populations of European ancestry. An extension to non-European populations is imperative to improve genetic fine-mapping at shared loci and to adapt disease concepts to these populations with respect to non-shared genetics.

Methods

Ethics statement

All studies were approved by the respective local ethical committees, and all participants provided informed consent. The EU-RLS-GENE study was approved by an institutional review board at the University Hospital of the Technical University of Munich (2488/09). The INTERVAL dataset was approved by the National Research Ethics Service Committee East of England—Cambridge East (REC 11/EE/0538). Participants of 23andMe provided informed consent under a protocol approved by the external AAHRPP-accredited IRB, Ethical and Independent (E&I) Review Services. As of 2022, E&I Review Services is part of Salus IRB (https://www.versiticlinicaltrials.org/salusirb). The deCODE dataset was approved by the National Bioethics Committee of Iceland. The Danish Blood Donor Study (DBDS) dataset was approved by the Scientific Ethical Committee of Central Denmark (M-20090237) and by the Danish Data Protection agency (30-0444). GWAS studies in the DBDS were approved by the National Ethical Committee (NVK-1700407). The Emory dataset was approved by an institutional review board at Emory University, Atlanta, GA, USA (HIC ID 133-98).

GWAS phenotyping and genotyping

Some of the samples had already been included in our previous GWAS meta-analysis³. The reported sample numbers are the final sample numbers after quality control. Additional details are provided in the Supplementary Note.

Discovery meta-analysis

International EU-RLS-GENE consortium (7,248 cases (2,479 males and 4,769 females) and 19,802 controls (10,422 males and 9,380 females))

RLS cases were recruited in specialized outpatient clinics for movement disorders and in sleep clinics in European countries (Austria, Czech Republic, Estonia, Finland, France, Germany and Greece), Canada (Quebec) and the USA. RLS was diagnosed in a face-to-face interview by an expert neurologist or sleep specialist based on IRLSSG diagnostic criteria¹. Controls were either population-based unscreened controls (Austria, Estonia, Finland, France, Germany) or healthy individuals recruited in hospitals (Canada, Czech Republic, Greece, USA). A total of 6,228 cases and 10,992 ancestry-matched controls had been genotyped on the Axiom array and were the study sample used in our previous meta-analysis. For the current study, 1,020 cases and 8,810 ancestry-matched controls were added who were genotyped on the Infinium Global Screening Array-24 version 1.0. Genotype calling was performed in GenomeStudio 2.0 according to the GenomeStudio Framework User Guide, and identical quality-control criteria were used for both datasets. Imputation was performed on the UK10K haplotype and 1000 Genomes Phase 3 reference panel using the EAGLE2 (version 2.0.5) and PBWT (version 3.1) imputation tools as implemented in the Sanger imputation server. Imputed SNPs with pHWE ≤ 1 × 10⁻⁵ or an INFO score < 0.5 were filtered out.

INTERVAL study (3,491 cases (1,291 males and 2,200 females) and 23,741 controls (12,511 males and 11,230 females))

The INTERVAL study includes whole-blood donors recruited in England between 2012 and 2014. The Cambridge-Hopkins Restless Legs questionnaire was used to define RLS cases, and probable and definite cases were combined to form a binary phenotype as described previously³. A detailed description of Axiom ‘Biobank’ array genotyping and the imputation procedure plus related quality control in the INTERVAL trial can be found elsewhere³⁴. Briefly, imputation was performed using a joint UK10K and 1000 Genomes Phase 3 (May 2013 release) reference panel via the Sanger imputation server, and variants with MAF ≥ 0.1% and INFO score ≥ 0.4 were retained for analysis.

Research participant cohort for 23andMe (105,908 cases (34,544 males and 71,364 females) and 1,502,923 controls (678,661 males and 824,262 females))

This study includes research participants of 23andMe who agreed to participate in research studies. The RLS phenotype was defined by self-reported responses to survey questions that assessed whether someone had ever been diagnosed with RLS or had ever received treatment for RLS as described previously³. Participants were genotyped on one of five platforms, all using Illumina arrays with added custom content (HumanHap550+ BeadChip, OmniExpress+ BeadChip, Infinium Global Screening Array). Participant genotype data were imputed in a two-step procedure using a reference panel created by combining the May 2015 release of the 1000 Genomes Phase 3 haplotypes with the UK10K imputation reference panel. Pre-phasing was carried out using either the internally developed tool Finch, which implements the Beagle algorithm, or EAGLE2. Imputation was performed with Minimac3.

Replication meta-analysis

Research participant cohort for 23andMe (19,214 cases and 347,000 controls)

This cohort includes only individuals who had not been part of the 23andMe GWAS used in the discovery meta-analysis. Cases and controls were defined as described above.

INTERVAL replication cohort (1,591 cases and 10,000 controls)

Individuals in this cohort do not overlap with samples included in the INTERVAL GWAS used in the discovery meta-analysis. RLS status was assessed with a single question on having received a diagnosis of RLS.

For 23andMe and INTERVAL, genotyping and imputation were carried out as described for the discovery stage.

deCODE–DBDS–Emory cohort (8,223 cases and 41,815 controls)

This dataset comprised the DBDS; a cohort from deCODE Genetics, Iceland; a cohort from the Emory Hospital, Atlanta, USA; and the Donor InSight-III study. Phenotyping and genotyping procedures have been described in detail previously⁴.

SNP-based association analysis

Discovery-stage GWAS of autosomes

EU-RLS-GENE GWAS

First, the Axiom- and the GSA-genotyped datasets were analyzed separately using SNPTEST version 2.5.4 with genotype dosages and assuming an additive model. Age, sex and the first ten PCs from the MDS analysis in PLINK were included as covariates. These summary statistics of the two datasets were then combined by fixed-effect inverse-variance meta-analysis (STERR scheme) using METAL (release 2011-03-25)³⁵. One round of genomic control was performed in each dataset before meta-analysis.
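As an illustration of the fixed-effect inverse-variance scheme named above, the following minimal Python sketch combines per-study effect estimates for a single SNP. It is not the METAL implementation; the function name and the input values are hypothetical, and genomic control is assumed to have been applied to the inputs beforehand.

```python
import math

def inverse_variance_meta(betas, ses):
    """Fixed-effect inverse-variance meta-analysis of one SNP across studies."""
    weights = [1.0 / se ** 2 for se in ses]          # precision of each study
    beta = sum(w * b for w, b in zip(weights, betas)) / sum(weights)
    se = math.sqrt(1.0 / sum(weights))
    z = beta / se
    p = math.erfc(abs(z) / math.sqrt(2.0))           # two-sided normal P value
    return beta, se, z, p

# Hypothetical log odds ratios and standard errors from two genotyping batches
beta, se, z, p = inverse_variance_meta([0.10, 0.14], [0.02, 0.04])
```

The more precise study dominates the pooled estimate, which is why the combined effect lies closer to 0.10 than to 0.14 in this example.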

INTERVAL GWAS

Assuming an additive genetic model, genotype dosages were analyzed in SAIGE (0.35.8.8) using a linear mixed model to account for cryptic relatedness and saddle point approximation to account for case–control imbalance³⁶. Age, sex and the first ten PCs of ancestry were included as potential genomic confounders. The analysis was restricted to genetic variants with MAF ≥ 0.001, INFO ≥ 0.4 and a minor allele count of at least 10.

The 23andMe GWAS

Association analysis was conducted by logistic regression (likelihood ratio test) assuming additive allelic effects and using imputed dosages. Age, sex, genotyping platform and the first ten PCs were included as covariates.

In all individual GWAS, sex-specific analyses were performed using the same pipelines as those for the pooled analyses minus adjustment for sex as a covariate.

Discovery-stage meta-analysis for autosomes

We applied the same methods for both the pooled and the sex-specific GWAS. The three independent datasets were combined in a multivariate GWAS meta-analysis using the N-weighted-GWAMA R function (version 1.2.6)³⁷. To assess the possibility of heterogeneity of SNP effects between the studies, Cochran’s Q-test was applied as described in METAL.
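The core idea of a sample-size-weighted Z-score meta-analysis can be sketched as follows. This is a simplified illustration that assumes non-overlapping studies and weights proportional to the square root of the total sample size; the published N-GWAMA function may differ in its weighting and in how it handles sample overlap. The Z scores are hypothetical, whereas the sample sizes are the discovery-stage totals reported above.

```python
import math

def n_weighted_z(z_scores, sample_sizes):
    """Combine per-study Z scores with weights proportional to sqrt(N)."""
    weights = [math.sqrt(n) for n in sample_sizes]
    numerator = sum(w * z for w, z in zip(weights, z_scores))
    denominator = math.sqrt(sum(w * w for w in weights))
    return numerator / denominator

# Cases plus controls for EU-RLS-GENE, INTERVAL and 23andMe (discovery stage)
sizes = [7248 + 19802, 3491 + 23741, 105908 + 1502923]
combined_z = n_weighted_z([2.1, 1.8, 6.5], sizes)  # hypothetical Z scores
```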

Discovery-stage meta-analysis for chromosome X

Data for the X chromosome were available in two of the discovery-stage datasets: EU-RLS-GENE and 23andMe.

EU-RLS-GENE XWAS

For the pooled association analysis, male genotypes were coded as 0/2 (assuming no dosage compensation in males). All other methods were identical to those of the autosomal analyses. In sex-stratified analyses, males were coded as 0/1 and females as 0/1/2.

The 23andMe XWAS

In both pooled and sex-stratified analyses, males were coded as 0/2 and females as 0/1/2.

Pooled and sex-specific meta-analyses were performed using the N-GWAMA R function as in the autosomal analysis. Because N-GWAMA operates with Z scores, the type of male allele coding did not affect the results.
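The invariance of Z scores to the male allele coding can be checked with a small Python sketch. It uses ordinary least-squares regression on a hypothetical male-only sample as a stand-in for the association tests actually used: doubling the genotype code halves both the effect estimate and its standard error, so their ratio is unchanged.

```python
import math

def slope_z(x, y):
    """Least-squares slope of y on x with its standard error and Z score."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    beta = sxy / sxx
    intercept = mean_y - beta * mean_x
    rss = sum((b - intercept - beta * a) ** 2 for a, b in zip(x, y))
    se = math.sqrt(rss / (n - 2) / sxx)
    return beta, se, beta / se

# Hypothetical hemizygous male genotypes and case status
males_01 = [0, 1, 0, 1, 1, 0, 0, 1]
status = [0, 1, 0, 0, 1, 0, 1, 1]
males_02 = [2 * g for g in males_01]      # same data under 0/2 coding

beta_01, se_01, z_01 = slope_z(males_01, status)
beta_02, se_02, z_02 = slope_z(males_02, status)
```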

Sex-specific meta-analysis association analysis

We performed sex-specific (male-only and female-only) meta-analyses of the corresponding GWAS using the N-GWAMA approach as described above. The results were used to estimate sex-specific heritability and genetic correlation between the sexes.

To detect sex-specific effects, we tested all independent (r² < 0.2) genome-wide significant SNPs of the pooled and sex-specific meta-analyses for heterogeneity of effect sizes between the two sexes using Cochran’s Q-test (one-sided) and a Bonferroni-corrected significance threshold of P_adj ≤ 0.05/221.
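For two strata, Cochran's Q has one degree of freedom and can be computed directly from the sex-specific effect estimates. The sketch below is illustrative only: the function name and input values are hypothetical, and the P value is taken from the upper tail of the χ² distribution with one degree of freedom.

```python
import math

def sex_heterogeneity(beta_m, se_m, beta_f, se_f):
    """Cochran's Q for male versus female effect sizes (1 degree of freedom)."""
    w_m, w_f = 1.0 / se_m ** 2, 1.0 / se_f ** 2
    pooled = (w_m * beta_m + w_f * beta_f) / (w_m + w_f)
    q = w_m * (beta_m - pooled) ** 2 + w_f * (beta_f - pooled) ** 2
    p = math.erfc(math.sqrt(q / 2.0))     # upper tail of chi-square, 1 df
    return q, p

BONFERRONI_THRESHOLD = 0.05 / 221         # threshold stated in the text

# Hypothetical sex-specific log odds ratios for one lead SNP
q, p = sex_heterogeneity(0.20, 0.02, 0.10, 0.02)
is_sex_specific = p <= BONFERRONI_THRESHOLD
```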

Replication-stage association analysis

For 23andMe and INTERVAL, quality control and statistical analysis were performed as described for the discovery stage. Statistical analysis for the DBDS, deCODE–Emory and Donor Insight studies has been described previously⁴. Meta-analysis was performed using Han and Eskin’s random-effects model in METASOFT (RE2, METASOFT version 2.0.1)³⁸.

Identification of risk loci and independent lead SNPs

To define independent risk loci, we first used the ‘--clump’ command in PLINK (version 1.90b6.7)³⁹ to collapse multiple genome-wide significant association signals based on linkage disequilibrium (LD) and distance (clump-r2 > 0.05, clump-kb < 500 kb, clump-p1 < 5 × 10⁻⁸, clump-p2 < 10⁻⁵). We then performed conditional analyses to identify secondary independent signals in risk loci using GCTA (version 1.93.0beta) with the ‘--cojo-slct’ option, the P-value threshold for genome-wide significance set at 5 × 10⁻⁸, the distance window set at 10 Mb and the colinearity cutoff set at 0.9 (ref. ⁴⁰). LD was derived from EU-RLS-GENE genotype data. Independent genome-wide significant signals were merged into one genomic risk locus if either their LD block distance was <500 kb or their clumped regions were overlapping.
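The final merging step can be sketched as an interval-merging routine. This is a minimal illustration under the assumption that each signal is represented by the chromosome and the start and end positions of its clumped region; the function name and coordinates are hypothetical, and this is not the authors' pipeline.

```python
def merge_loci(regions, max_gap=500_000):
    """Merge regions on the same chromosome that overlap or lie < max_gap bp apart."""
    merged = []
    for chrom, start, end in sorted(regions):
        last = merged[-1] if merged else None
        if last and last[0] == chrom and start - last[2] < max_gap:
            last[2] = max(last[2], end)   # extend the current locus
        else:
            merged.append([chrom, start, end])
    return [tuple(locus) for locus in merged]

# Hypothetical clumped regions as (chromosome, start, end) in base pairs
signals = [
    (1, 100_000, 200_000),
    (1, 600_000, 700_000),      # 400 kb from the first region, so merged
    (1, 1_300_000, 1_400_000),  # 600 kb from the merged locus, so kept apart
    (2, 100_000, 150_000),
]
loci = merge_loci(signals)
```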

Heritability analyses

Heritability is reported on the liability scale unless otherwise indicated. Prevalence estimates were derived from the population cohorts INTERVAL and 23andMe themselves. For the EU-RLS-GENE case–control dataset and for the meta-analysis, prevalence estimates were derived from previous publications on European ancestries.

We estimated SNP-based heritability under several different heritability models. LDSC (version 1.0.1) was used with standard settings, invoking a model where SNPs with different MAFs are expected to contribute equally to heritability⁴¹. LDAK (version 5.0) was used with standard settings to implement the LDAK model, where SNP contributions depend on LD structure and MAF as well as the BLD-LDAK and BLD-LDAK+Alpha models, which incorporate additional annotation-based features⁴². All analyses were based on summary statistics and filtering according to LDSC default settings, that is, HapMap3 non-HLA SNPs with MAF > 0.01 and INFO ≥ 0.9. The Akaike information criterion of each of these models was reported for model comparison. Further details are provided in the Supplementary Note.

For X chromosome heritability estimation, we followed the approach described by Lee et al. and used the summary statistics of the N-GWAMA meta-analysis⁴³. For sex k, the SNP heritability ${h}_{k}^{2}$ relates to the expected χ² statistics as ${\mathbb{E}}({\chi }_{k}^{2})\approx 1+{N}_{k}{h}_{k}^{2}/{M}_{{\rm{eff}}}$, where N_k is the GWAS sample size, and M_eff is the effective number of loci within the examined genomic region (assumed to be the same in males and females). For calculation of the (sex-specific) relative heritability contribution of the X chromosome, χ² statistic-based h² was also calculated for the autosomes.
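Rearranging the relation above gives the heritability estimate from the mean χ² statistic, h² ≈ (mean χ² − 1) × M_eff / N. A minimal Python sketch, with a hypothetical function name and hypothetical input values:

```python
def chi2_based_h2(mean_chi2, n, m_eff):
    """Solve E(chi2) ~ 1 + N * h2 / M_eff for h2."""
    return (mean_chi2 - 1.0) * m_eff / n

# Hypothetical values: mean chi-square 1.5, 100,000 samples, 1,000 effective loci
h2_x = chi2_based_h2(1.5, 100_000, 1_000)
```

Because M_eff is assumed to be equal in males and females, the ratio of the sex-specific estimates depends only on the mean χ² statistics and the sample sizes.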

Genetic correlation analysis

For autosomal data, genetic correlations were calculated with LDSC (version 1.0.1), applying the same SNP filtering criteria and the two-step estimation option as in the heritability estimation. Because the LDSC framework is not applicable for chromosome X, the genetic correlation coefficient ${\hat{r}}_{\rm{g}}$ was estimated as ${\hat{r}}_{\rm{g}}=\frac{\widehat{{Z}_{\rm{m}}{Z}_{\rm{f}}}}{\sqrt{({\hat{\chi }}_{\rm{f}}^{2}-1)({\hat{\chi }}_{\rm{m}}^{2}-1)}}$, where Z and χ² are the Z scores and mean χ² estimates from the female (f) and male (m)-specific studies.
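The chromosome X estimator can be written out in a few lines of Python. The sketch assumes that the numerator is the mean over SNPs of the product of the male and female Z scores and that the mean χ² is the mean squared Z score in each sex; the function name and the Z scores are hypothetical.

```python
import math

def rg_from_z(z_male, z_female):
    """Genetic correlation from paired sex-specific Z scores of the same SNPs."""
    n = len(z_male)
    mean_product = sum(a * b for a, b in zip(z_male, z_female)) / n
    mean_chi2_m = sum(a * a for a in z_male) / n
    mean_chi2_f = sum(b * b for b in z_female) / n
    return mean_product / math.sqrt((mean_chi2_f - 1.0) * (mean_chi2_m - 1.0))

# Hypothetical Z scores for four SNPs
rg = rg_from_z([3.0, -3.0, 1.0, -1.0], [3.0, -1.0, -1.0, 3.0])
```

The estimator is only defined when both mean χ² statistics exceed 1, that is, when both sex-specific studies carry signal beyond the null expectation.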

In addition to between-study and between-sex genetic correlations, we performed a large-scale genetic correlation screen for RLS (represented by the pooled autosomal meta-analysis data) and other traits using LDSC as described above. Sources and filtering criteria for summary statistics included in this screen are provided in the Supplementary Note.

Traits significantly correlated with RLS (FDR < 0.05, one-sample two-sided Z-test) were taken forward to a bi-serial genetic correlation analysis. Here, we computed the pairwise ${\hat{r}}_{\rm{g}}$ between all traits.
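The text does not state which FDR procedure was used. Assuming the common Benjamini-Hochberg adjustment, the selection of traits at FDR < 0.05 could be sketched as follows, with hypothetical P values:

```python
def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted P values, returned in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):          # walk from the largest P value down
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

# Hypothetical P values of genetic correlations between RLS and four traits
p_values = [0.01, 0.04, 0.03, 0.5]
q_values = benjamini_hochberg(p_values)
selected = [i for i, q in enumerate(q_values) if q < 0.05]
```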

An unsigned weighted correlation matrix was built using the pairwise ${\hat{r}}_{\rm{g}}$ and used as input for a weighted correlation matrix analysis to perform hierarchical clustering and to detect modules with the WGCNA package (version 1.69)⁴⁴. The following settings were applied in WGCNA: softPower, 6; network type, ‘unsigned’; TOMDenom, ‘min’; Dynamic-cutree, method = ‘hybrid’; deepSplit, 2; minModuleSize, 30; pamStage, TRUE; pamRespectsDendro, FALSE; useMedoids, FALSE. The defining trait categories in each module were determined by consensus through independent review of the within-module cluster structure by visual inspection of network plots at two sites (Helmholtz and Cambridge).

Mendelian randomization

To select traits for MR, we defined two to eight clusters in a module based on its complexity. In each cluster, the traits were ranked according to the significance of their correlation with RLS, and we selected the most significantly correlated medical conditions or potentially modifiable lifestyle factors. We supplemented this list with traits for which an association with RLS has been described in the literature.

Using R version 4.0.4, we filtered GWAS datasets to uncorrelated SNPs (r² < 0.01 in the European 1000 Genomes Phase 3 data), aligned them to GRCh37 and mapped them to dbSNP 153 with the gwasvcf package (version 0.1.0). We harmonized effect alleles across studies using the TwoSampleMR package (version 0.5.6)⁴⁵. Palindromic variants with ambiguous allele frequencies and those with unresolved strand issues were excluded from analysis.

To avoid violations of the classical MR assumptions when studying correlated and likely pleiotropic traits, we used a robust method for bidirectional MR, LHC-MR (version 0.0.0.9000)³². Traits with low heritability (h² < 2.5%, ${P_{h^2}}$ > 0.05) were excluded from the analysis. Significance of directionality and confounding effect were tested by comparing the goodness of fit of six degenerate LHC-MR models (only latent effect, only causal effect, only causal effect to RLS, only causal effect from RLS, no causal effect to RLS and no causal effect from RLS) to the full model. We supplemented these analyses with those based on the IVW and MR-Egger methods.
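For orientation, the standard fixed-effect IVW estimate used as a supplementary method combines per-SNP effects on exposure and outcome as shown below. This is a minimal sketch, not the TwoSampleMR implementation; it assumes uncorrelated instruments and negligible uncertainty in the SNP-exposure effects, and all input values are hypothetical.

```python
import math

def ivw_estimate(beta_exposure, beta_outcome, se_outcome):
    """Fixed-effect inverse-variance-weighted causal estimate and standard error."""
    numerator = sum(bx * by / se ** 2
                    for bx, by, se in zip(beta_exposure, beta_outcome, se_outcome))
    denominator = sum(bx ** 2 / se ** 2
                      for bx, se in zip(beta_exposure, se_outcome))
    return numerator / denominator, math.sqrt(1.0 / denominator)

# Hypothetical SNP effects on the exposure (RLS) and the outcome
causal_beta, causal_se = ivw_estimate([0.1, 0.2], [0.05, 0.10], [0.01, 0.01])
```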

Gene prioritization in risk loci

All analyses were performed on the N-GWAMA results of the pooled meta-analysis. We applied several complementary approaches to prioritize candidate genes in the genome-wide significant risk loci. These included the gene-prioritization pipeline of DEPICT (version 1.rel194), three prioritization workflows (positional, eQTL-based and topology-based mapping) provided on the FUMA platform (https://fuma.ctglab.nl/, version 1.3.6a), a gene-level GWAS using MAGMA version 1.08, a transcriptome-wide association study using S-PrediXcan and S-MultiXcan (MetaXcan package version 0.7.4), a colocalization analysis with eCAVIAR (version 2.2) and statistical fine-mapping with CAVIARBF (version 0.2.1)^{46,47,48,49,50,51,52}. In the DEPICT, FUMA eQTL-based mapping, MAGMA and transcriptome-wide association study analyses, a gene was considered prioritized if it had an FDR < 0.05; in FUMA topology-based mapping, if it had an FDR < 1 × 10⁻⁵; and in eCAVIAR, if it had a colocalization posterior probability > 0.1. In FUMA positional mapping, a gene was considered prioritized if genome-wide significant SNPs physically mapped to it. In statistical fine-mapping, a gene was considered prioritized if an SNP in the 95% credible set of the risk locus could be linked to it by either eQTL, chromatin interaction or positional mapping. In addition, we checked whether a gene contained genome-wide significant coding variants (the gene was considered prioritized if it did) and whether a gene mapped to a gene set that was significant in our enrichment analyses (the gene was considered prioritized if it did). We combined the results of all approaches per gene in a prioritization score by summing up the individual results, counting ‘not prioritized’ as 0 and ‘prioritized’ as 1. Further details are provided in the Supplementary Note.
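The scoring rule amounts to counting the approaches that prioritize a gene. In the sketch below, the criterion labels are hypothetical shorthand for the approaches listed above, and the exact set of criteria is our reading of the text rather than a confirmed specification.

```python
# Hypothetical labels for the prioritization approaches described in the text
CRITERIA = (
    "depict", "fuma_positional", "fuma_eqtl", "fuma_topology", "magma",
    "twas", "ecaviar", "fine_mapping", "coding_variant", "enriched_gene_set",
)

def prioritization_score(flags):
    """Count 1 for each approach that prioritized the gene, 0 otherwise."""
    return sum(1 for criterion in CRITERIA if flags.get(criterion, False))

# Hypothetical gene supported by three of the approaches
example_gene = {"magma": True, "twas": True, "fuma_positional": True, "ecaviar": False}
score = prioritization_score(example_gene)
```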

Enrichment analyses

Gene set and pathway enrichment analyses

DEPICT

We ran DEPICT to detect enrichment of gene sets across risk loci as well as to identify tissue and cell types where expression is enriched for genes across risk loci. We set the significance thresholds for lead SNPs at 1 × 10⁻⁵ and at 5 × 10⁻⁴ for null GWAS; all other settings were the same as those used for gene prioritization (see above). DEPICT was run with all built-in datasets. eQTL mapping and functional prioritization were evaluated in DEPICT’s built-in eQTL and reconstituted gene sets.

Excluding 12 SNPs not reaching genome-wide significance in the joint analysis of discovery and validation did not change the main results (Supplementary Table 25).

MAGMA

MAGMA (version 1.08) was used to perform gene set enrichment testing for pathway identification. MAGMA conducts competitive gene set tests with correction for gene size, variant density and LD structure. A total of 7,522 gene sets representing the GO biological process ontology (MSigDB version 7.1, C5 collection, GO:BP subset) were tested for association. We adopted a significance threshold of FDR < 0.05 (one-sided t-test).

Tissue and cell type enrichment analyses

Using the settings described above, we tested enrichment of RLS heritability with DEPICT across 209 different tissue types covered in the built-in dataset. For an independent validation on the tissue level as well as for the analyses on the cell type level, we mainly used the CELLEX and CELLECT tools⁵³. CELLECT provides two different gene-prioritization approaches for heritability enrichment testing, S-LDSC and MAGMA covariate analysis^54,55. For compatibility of the results, the summary statistics of the pooled N-GWAMA analysis were filtered using settings identical to those in our LDSC heritability analyses. Following the recommendations by Timshel et al.⁵³, we applied a ‘tiered’ approach by starting with body-wide datasets and then focusing on CNS-centric datasets. We used CELLECT software (version 1.3.0) with default settings but updated to MAGMA version 1.08 to test enrichment of RLS heritability in cell type- or tissue-specific genes for datasets with publicly available RNA-seq data. These analyses require a measure of expression specificity for each gene in a cell or tissue type. We either used CELLEX (version 1.2.1) to compute expression specificity or relied on precomputed CELLEX expression specificity scores. Human adult datasets without publicly available raw RNA-seq data were analyzed using MAGMA_Celltyping (version 2.0.0) in top10 mode. The list of input datasets is provided in the Supplementary Note, and results of our evaluation of both approaches showing high correlation are presented in Supplementary Fig. 1 and Supplementary Table 26.

Risk prediction

We applied three types of models for genetic risk evaluation and RLS risk prediction: GLM with and without interaction terms, RF models and DNN models. These were implemented as binary classifiers as well as time-to-event classifiers.

Training of the models and evaluation by tenfold cross-validation were based on the EU-RLS-GENE Axiom subset. Therefore, we first conducted a meta-analysis excluding this dataset to generate unbiased summary statistics to be used in all models. Because GWAS have an ascertainment bias, we constructed a simulation cohort dataset by resampling of the EU-RLS-GENE Axiom subset based on the year of birth of the sampled individuals, their ages at onset and the demographic composition of the German population (Supplementary Note). We calculated the PRS using dosages of 216 independent lead SNPs of our discovery meta-analyses.

For a baseline comparison of the predictive power of this score to a PRS based on genome-wide data, we calculated a genome-wide PRS using the LDpred2-auto option of LDpred2 (R package bigsnpr version 1.12.2)⁵⁶. Variants and the LD reference panel were based on the HapMap3 EUR dataset, and window size for calculating SNP correlation was set to 3 cM.

Binary classification models were evaluated by Nagelkerke’s pseudo-R², receiver operator characteristic AUC and precision–recall AUC. A 5-year binary classifier was constructed for each of the time-to-event models by predicting the label until the next 5 years and evaluated by the metrics for binary classification.
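The receiver operator characteristic AUC equals the probability that a randomly chosen case receives a higher predicted score than a randomly chosen control, with ties counted as one half. A minimal Python sketch of this rank-based definition, using hypothetical labels and scores:

```python
def roc_auc(labels, scores):
    """ROC AUC as the fraction of case-control pairs ranked correctly."""
    cases = [s for label, s in zip(labels, scores) if label == 1]
    controls = [s for label, s in zip(labels, scores) if label == 0]
    wins = sum((c > k) + 0.5 * (c == k) for c in cases for k in controls)
    return wins / (len(cases) * len(controls))

# Hypothetical case status and predicted risks for four individuals
auc = roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
```

The pairwise loop is quadratic in the sample size and is meant for illustration only; production code would use a sorting-based implementation.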

To evaluate the contribution of the interaction effects to model performance, we estimated the effect sizes of interaction terms such as PRS × age by logistic regression:

$$\begin{array}{l}P(\mathrm{RLS}=1\mid \mathrm{PRS},\mathrm{sex},\mathrm{age},\mathbf{PC})\\=\displaystyle\frac{1}{1+e^{-\left(\beta_{0}+\beta_{1}\mathrm{PRS}+\beta_{2}\mathrm{sex}+\beta_{3}\mathrm{age}+\beta_{4}\mathrm{PRS}\times\mathrm{sex}+\beta_{5}\mathrm{PRS}\times\mathrm{age}+\beta_{6}\mathrm{sex}\times\mathrm{age}+\beta_{7}\mathrm{PRS}\times\mathrm{sex}\times\mathrm{age}+\boldsymbol{\gamma}\cdot\mathbf{PC}\right)}},\end{array}$$

where age is a dummy variable for age in bins of 20 years, PC indicates the first ten PCs from the MDS analysis in PLINK, γ is a vector of effect sizes of PCs and PRS = Σ_j w_j g_j, where w_j and g_j are the per-allele effect size and dosage of the j-th SNP, respectively.
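The two ingredients of this model, the weighted-sum PRS and the logistic link with interaction terms, can be sketched in Python. For brevity the sketch treats age as a single 0/1 indicator for one age bin and folds the PC contribution into one precomputed term; the function names, weights, dosages and coefficients are hypothetical.

```python
import math

def polygenic_risk_score(weights, dosages):
    """PRS as the sum over SNPs of per-allele effect size times allele dosage."""
    return sum(w * g for w, g in zip(weights, dosages))

def rls_probability(prs, sex, age_bin, coef, pc_term=0.0):
    """Logistic model with PRS, sex, one age-bin indicator and their interactions."""
    b0, b1, b2, b3, b4, b5, b6, b7 = coef
    eta = (b0 + b1 * prs + b2 * sex + b3 * age_bin
           + b4 * prs * sex + b5 * prs * age_bin + b6 * sex * age_bin
           + b7 * prs * sex * age_bin + pc_term)
    return 1.0 / (1.0 + math.exp(-eta))

# Hypothetical effect sizes and dosages for two SNPs, and hypothetical coefficients
prs = polygenic_risk_score([0.10, 0.20], [2.0, 1.0])
prob = rls_probability(prs, sex=1, age_bin=1,
                       coef=(-2.0, 1.0, 0.3, 0.5, 0.1, 0.2, 0.0, 0.0))
```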

For the DNN and RF models, we used these logistic regression estimates as the baseline and then further estimated the interaction effect sizes indirectly by calculating the incremental gain in explained variance (Nagelkerke’s pseudo-R²) from model₀ to model₁ as:

$$R^{2}=\left(1-\left(\frac{L(\mathrm{model}_{0})}{L(\mathrm{model}_{1})}\right)^{2/N}\right)\left(1-L(\mathrm{model}_{0})^{2/N}\right)^{-1},$$

where L is the likelihood function for a logistic regression model with the first ten PCs included as covariates.
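In practice, the likelihoods are very small numbers, so the same quantity is usually computed from log-likelihoods. A minimal Python sketch of the formula above, with a hypothetical function name and hypothetical inputs:

```python
import math

def incremental_nagelkerke_r2(loglik_0, loglik_1, n):
    """Nagelkerke's pseudo-R2 for the gain from model_0 to model_1 (log-likelihoods)."""
    cox_snell = 1.0 - math.exp(2.0 / n * (loglik_0 - loglik_1))  # 1 - (L0/L1)^(2/N)
    maximum = 1.0 - math.exp(2.0 / n * loglik_0)                 # 1 - L0^(2/N)
    return cox_snell / maximum

# Hypothetical example: 100 samples, baseline model no better than a coin flip
n = 100
loglik_baseline = -n * math.log(2.0)
r2_no_gain = incremental_nagelkerke_r2(loglik_baseline, loglik_baseline, n)
r2_perfect = incremental_nagelkerke_r2(loglik_baseline, 0.0, n)
```

The two boundary cases illustrate the scaling: no improvement in likelihood gives 0, and a model that fits the data perfectly gives 1.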

Binary classification models, GLMs and RF and DNN models were built, optimized and trained by H2O AutoML (version 3.36.0.2) in R (version 4.0.2)⁵⁷. Time-to-event models were implemented with randomForestSRC (version 3.0.1) in R (version 4.0.2) and PyTorch⁵⁸ (pycox version 0.2.1 and PyTorch version 1.6.0). Cross-validation-based Nagelkerke’s pseudo-R² was calculated in R version 4.0.2.

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.

Data availability

Summary statistics of the meta-analysis are publicly available for the top 10,000 SNPs at Zenodo (https://doi.org/10.5281/zenodo.10804907)⁵⁹. Summary statistics of the discovery-stage International EU-RLS-GENE consortium GWAS and the INTERVAL GWAS are available at the GWAS Catalog (https://www.ebi.ac.uk/gwas/) under accession codes GCST90399568, GCST90399569, GCST90399570, GCST90399571, GCST90399572 and GCST90399573. The full GWAS summary statistics for the 23andMe discovery dataset have been made available through 23andMe to qualified researchers under an agreement with 23andMe that protects the privacy of the 23andMe participants. Datasets have been made available at no cost for academic use. Please visit https://research.23andme.com/collaborate/#dataset-access/ for more information and to apply to access the data. Additional data used for tissue and cell type enrichment analysis are available here: developmental (http://mousebrain.org/development/downloads.html) and adult single-cell RNA-seq datasets (http://mousebrain.org/adult/downloads.html) from the Mouse Brain Atlas (http://mousebrain.org/), the Human Gene Expression During Development dataset from the BBI-Allen Single Cell atlases (https://descartes.brotmanbaty.org/), the BrainSpan Developmental Transcriptome RNA-seq dataset from the BrainSpan Atlas of the Developing Human Brain (https://www.brainspan.org/static/home), the V8 RNA-seq dataset (GTEx_Analysis_2017-06-05_v8_RNASeQCv1.1.9_gene_reads.gct.gz) from GTEx (https://gtexportal.org/home/datasets) and the human C8 collection from MSigDb version 7.4 (http://software.broadinstitute.org/gsea/msigdb/), with legacy versions available at https://www.gsea-msigdb.org/gsea/downloads_archive.jsp after creating a user account with GSEA–MSigDB. 
Summary statistics of GWAS for genetic correlation and MR analyses are available at the University of Bristol Integrative Epidemiology Unit OpenGWAS server (https://gwas.mrcieu.ac.uk) and the GWAS Atlas (https://atlas.ctglab.nl/). Additional GWAS summary statistics for iron-related traits are available at https://www.fmrib.ox.ac.uk/ukbiobank/gwas_resources/index.html, https://open.win.ox.ac.uk/ukbiobank/big40/BIGv2/ and https://www.decode.com/summarydata/. A complete list of sources used for annotation with FUMA is available at https://fuma.ctglab.nl/links and https://fuma.ctglab.nl/tutorial. Auxiliary files for use with MAGMA are available at https://ctg.cncr.nl/software/magma. Additional files for use with LDSC and LDAK are available at https://alkesgroup.broadinstitute.org/LDSCORE/. Information about drug targets is available at the free-to-access database DrugBank Online (https://go.drugbank.com/).

Code availability

References

Allen, R. P. et al. Restless legs syndrome: diagnostic criteria, special considerations, and epidemiology. A report from the restless legs syndrome diagnosis and epidemiology workshop at the National Institutes of Health. Sleep Med. 4, 101–119 (2003).
Manconi, M. et al. Restless legs syndrome. Nat. Rev. Dis. Primers 7, 80 (2021).
Schormair, B. et al. Identification of novel risk loci for restless legs syndrome in genome-wide association studies in individuals of European ancestry: a meta-analysis. Lancet Neurol. 16, 898–907 (2017).
Didriksen, M. et al. Large genome-wide association study identifies three novel risk variants for restless legs syndrome. Commun. Biol. 3, 703 (2020).
Allen, R. P. et al. Restless legs syndrome/Willis–Ekbom disease diagnostic criteria: updated International Restless Legs Syndrome Study Group (IRLSSG) consensus criteria—history, rationale, description, and significance. Sleep Med. 15, 860–873 (2014).
Trenkwalder, C. et al. Comorbidities, treatment, and pathophysiology in restless legs syndrome. Lancet Neurol. 17, 994–1005 (2018).
Trenkwalder, C., Allen, R., Hogl, B., Paulus, W. & Winkelmann, J. Restless legs syndrome associated with major diseases: a systematic review and new concept. Neurology 86, 1336–1343 (2016).
Berger, K., Luedemann, J., Trenkwalder, C., John, U. & Kessler, C. Sex and the risk of restless legs syndrome in the general population. Arch. Intern. Med. 164, 196–202 (2004).
Pantaleo, N. P., Hening, W. A., Allen, R. P. & Earley, C. J. Pregnancy accounts for most of the gender difference in prevalence of familial RLS. Sleep Med. 11, 310–313 (2010).
Schormair, B. et al. PTPRD (protein tyrosine phosphatase receptor type δ) is associated with restless legs syndrome. Nat. Genet. 40, 946–948 (2008).
Akcimen, F. et al. Transcriptome-wide association study for restless legs syndrome identifies new susceptibility genes. Commun. Biol. 3, 373 (2020).
Sarayloo, F. et al. SKOR1 has a transcriptional regulatory role on genes involved in pathways related to restless legs syndrome. Eur. J. Hum. Genet. 28, 1520–1528 (2020).
Tilch, E. et al. Identification of restless legs syndrome genes by mutational load analysis. Ann. Neurol. 87, 184–193 (2020).
Moran, T. H., Robinson, P. H., Goldrich, M. S. & McHugh, P. R. Two brain cholecystokinin receptors: implications for behavioral actions. Brain Res. 362, 175–179 (1986).
Bernard, A. et al. The cholecystokinin type 2 receptor, a pharmacological target for pain management. Pharmaceuticals 14, 1185 (2021).
Burkhart, A. et al. Expression of iron-related proteins at the neurovascular unit supports reduction and reoxidation of iron for transport through the blood–brain barrier. Mol. Neurobiol. 53, 7237–7253 (2016).
Hentze, M. W., Muckenthaler, M. U., Galy, B. & Camaschella, C. Two to tango: regulation of mammalian iron metabolism. Cell 142, 24–38 (2010).
Girelli, D., Ugolini, S., Busti, F., Marchi, G. & Castagna, A. Modern iron replacement therapy: clinical and pathophysiological insights. Int. J. Hematol. 107, 16–30 (2018).
Wang, C. et al. Phenotypic and genetic associations of quantitative magnetic susceptibility in UK Biobank brain imaging. Nat. Neurosci. 25, 818–831 (2022).
Bell, S. et al. A genome-wide meta-analysis yields 46 new loci associating with biomarkers of iron homeostasis. Commun. Biol. 4, 156 (2021).
Elliott, L. T. et al. Genome-wide association studies of brain imaging phenotypes in UK Biobank. Nature 562, 210–216 (2018).
Jansen, P. R. et al. Genome-wide analysis of insomnia in 1,331,010 individuals identifies new risk loci and functional pathways. Nat. Genet. 51, 394–403 (2019).
Lane, J. M. et al. Biological and clinical insights from genetics of insomnia symptoms. Nat. Genet. 51, 387–393 (2019).
Connor, J. R., Patton, S. M., Oexle, K. & Allen, R. P. Iron and restless legs syndrome: treatment, genetics and pathophysiology. Sleep Med. 31, 61–70 (2017).
Hametner, S. et al. The influence of brain iron and myelin on magnetic susceptibility and effective transverse relaxation—a biochemical and histological validation study. Neuroimage 179, 117–133 (2018).
Ravanfar, P. et al. Systematic review: quantitative susceptibility mapping (QSM) of brain iron profile in neurodegenerative diseases. Front. Neurosci. 15, 618435 (2021).
Garcia-Borreguero, D., Cano, I. & Granizo, J. J. Treatment of restless legs syndrome with the selective AMPA receptor antagonist perampanel. Sleep Med. 34, 105–108 (2017).
Youssef, E. A., Wagner, M. L., Martinez, J. O. & Hening, W. Pilot trial of lamotrigine in the restless legs syndrome. Sleep Med. 6, 89 (2005).
Winkelmann, J. et al. Treatment of restless legs syndrome: evidence-based review and implications for clinical practice (revised 2017). Mov. Disord. 33, 1077–1091 (2018).
Casello, S. M. et al. Neuropeptide system regulation of prefrontal cortex circuitry: implications for neuropsychiatric disorders. Front. Neural Circuits 16, 796443 (2022).
Ning, P., Mu, X., Yang, X., Li, T. & Xu, Y. Prevalence of restless legs syndrome in people with diabetes mellitus: a pooling analysis of observational studies. eClinicalMedicine 46, 101357 (2022).
Darrous, L., Mounier, N. & Kutalik, Z. Simultaneous estimation of bi-directional causal effects and heritable confounding from GWAS summary statistics. Nat. Commun. 12, 7274 (2021).
Di Angelantonio, E. et al. Efficiency and safety of varying the frequency of whole blood donation (INTERVAL): a randomised trial of 45 000 donors. Lancet 390, 2360–2371 (2017).
Finan, C. et al. The druggable genome and support for target identification and validation in drug development. Sci. Transl. Med. 9, eaag1166 (2017).
Astle, W. J. et al. The allelic landscape of human blood cell trait variation and links to common complex disease. Cell 167, 1415–1429 (2016).
Willer, C. J., Li, Y. & Abecasis, G. R. METAL: fast and efficient meta-analysis of genomewide association scans. Bioinformatics 26, 2190–2191 (2010).
Zhou, W. et al. Efficiently controlling for case–control imbalance and sample relatedness in large-scale genetic association studies. Nat. Genet. 50, 1335–1341 (2018).
Baselmans, B. M. L. et al. Multivariate genome-wide analyses of the well-being spectrum. Nat. Genet. 51, 445–451 (2019).
Han, B. & Eskin, E. Random-effects model aimed at discovering associations in meta-analysis of genome-wide association studies. Am. J. Hum. Genet. 88, 586–598 (2011).
Chang, C. C. et al. Second-generation PLINK: rising to the challenge of larger and richer datasets. Gigascience 4, 7 (2015).
Yang, J. et al. Conditional and joint multiple-SNP analysis of GWAS summary statistics identifies additional variants influencing complex traits. Nat. Genet. 44, 369–375 (2012).
Bulik-Sullivan, B. K. et al. LD Score regression distinguishes confounding from polygenicity in genome-wide association studies. Nat. Genet. 47, 291–295 (2015).
Speed, D., Holmes, J. & Balding, D. J. Evaluating and improving heritability models using summary statistics. Nat. Genet. 52, 458–462 (2020).
Lee, J. J. et al. Gene discovery and polygenic prediction from a genome-wide association study of educational attainment in 1.1 million individuals. Nat. Genet. 50, 1112–1121 (2018).
Langfelder, P. & Horvath, S. WGCNA: an R package for weighted correlation network analysis. BMC Bioinformatics 9, 559 (2008).
Hemani, G. et al. The MR-Base platform supports systematic causal inference across the human phenome. eLife 7, e34408 (2018).
Pers, T. H. et al. Biological interpretation of genome-wide association studies using predicted gene functions. Nat. Commun. 6, 5890 (2015).
Watanabe, K., Taskesen, E., van Bochoven, A. & Posthuma, D. Functional mapping and annotation of genetic associations with FUMA. Nat. Commun. 8, 1826 (2017).
de Leeuw, C. A., Mooij, J. M., Heskes, T. & Posthuma, D. MAGMA: generalized gene-set analysis of GWAS data. PLoS Comput. Biol. 11, e1004219 (2015).
Barbeira, A. N. et al. Exploring the phenotypic consequences of tissue specific gene expression variation inferred from GWAS summary statistics. Nat. Commun. 9, 1825 (2018).
Barbeira, A. N. et al. Integrating predicted transcriptome from multiple tissues improves association detection. PLoS Genet. 15, e1007889 (2019).
Hormozdiari, F. et al. Colocalization of GWAS and eQTL signals detects target genes. Am. J. Hum. Genet. 99, 1245–1260 (2016).
Chen, W., McDonnell, S. K., Thibodeau, S. N., Tillmans, L. S. & Schaid, D. J. Incorporating functional annotations for fine-mapping causal variants in a Bayesian framework using summary statistics. Genetics 204, 933–958 (2016).
Timshel, P. N., Thompson, J. J. & Pers, T. H. Genetic mapping of etiologic brain cell types for obesity. eLife 9, e55851 (2020).
Finucane, H. K. et al. Partitioning heritability by functional annotation using genome-wide association summary statistics. Nat. Genet. 47, 1228–1235 (2015).
Skene, N. G. et al. Genetic identification of brain cell types underlying schizophrenia. Nat. Genet. 50, 825–833 (2018).
Prive, F., Arbel, J. & Vilhjalmsson, B. J. LDpred2: better, faster, stronger. Bioinformatics 36, 5424–5431 (2021).
LeDell, E. & Poirier, S. H2O AutoML: scalable automatic machine learning. In 7th ICML Workshop on Automated Machine Learning (ICML, 2020).
Paszke, A. et al. PyTorch: an imperative style, high-performance deep learning library. In Proc. 33rd International Conference on Neural Information Processing Systems 721 (Curran Associates, 2019).
Schormair, B. et al. Genome-wide meta-analyses of restless legs syndrome yield insights into genetic architecture, disease biology, and risk prediction - additional data and code. Zenodo https://doi.org/10.5281/zenodo.10804907 (2024).


Acknowledgements

For the International EU-RLS-GENE consortium, we thank all colleagues and staff at the participating centers of the International EU-RLS-GENE consortium for their help in recruiting study participants. We thank the German Restless Legs Syndrome Foundation (RLS e.V.) for continuously supporting our studies. We acknowledge the technical support of Core Facility Genomics at Helmholtz Zentrum München for part of the genotyping done for the EU-RLS-GENE GWAS. Funding was received from the Deutsche Forschungsgemeinschaft (grants 218143125 and 310572679 to J.W. and partial support to J.W. within SyNergy EXC 2145 grant 390857198); the European Regional Development Fund (project GenTransMed 2014-2020.4.01.15-0012 to M.T.-L. and A.M.); the University of Thessaly (grant 2845 to G.M.H.); the NIH–NIA and the NIH–NINDS (1U19AG063911, FAIN U19AG063911 to Z.K.W.); the Mayo Clinic Center for Regenerative Medicine, gifts from the Donald G. and Jodi P. Heeringa Family, the Haworth Family Professorship in Neurodegenerative Diseases fund and the Albertson Parkinson’s Research Foundation (all to Z.K.W.); the National Institutes of Health National Institute of Neurological Disorders and Stroke (P50 NS072187 to O.A.R.); the Mayo Clinic Neuroscience Focused Research Team (Cecilia and Dan Carmichael Family Foundation and the James C. and Sarah K. Kennedy Fund for Neurodegenerative Disease Research) (all to O.A.R.); the Canadian Institutes of Health Research (376503 to A.F.R.S.); the Natural Sciences and Engineering Research Council of Canada (RGPIN-2016-04985 to A.F.R.S.), the Canadian Diabetes Association (OG-3-14-4567-HC to A.F.R.S.), the Heart and Stroke Foundation of Canada (G-16-00014085 to A.F.R.S.); the Charles University Cooperation Program in Neuroscience and the Program EXCELES (LX22NPO5107 to D.K. and K. Sonka). Collection of samples by Emory University investigators was funded by the RLS and AL Williams Jr. Family Foundations. 
The PROCAM-2 Study was initiated and conducted by the Leibniz Institute for Arteriosclerosis Research at the University of Münster under the leadership of G. Assmann. After his retirement, all data were transferred to the university for further scientific use. The later follow-up assessment was carried out with funds from the Institute of Epidemiology and Social Medicine, and DNA isolation was performed with financial support from the Dean of the medical faculty, both at the University of Münster. Genotyping was enabled through funds from the German Center for Cardiovascular Disease. The Course of RLS Study (COR) was supported by unrestricted grants to the University of Münster from the German Restless Legs Society (RLS e.V. Deutsche Restless Legs Vereinigung) and Boehringer Ingelheim Pharma, Mundipharma Research, Roche Pharma, NeuroBioTec and UCB (Schwarz Pharma). The KORA study was initiated and financed by the Helmholtz Zentrum München—German Research Center for Environmental Health, which is funded by the German Federal Ministry of Education and Research (BMBF) and by the state of Bavaria. Data collection in the KORA study is done in cooperation with the University Hospital of Augsburg. Furthermore, KORA research was supported within the Munich Center of Health Sciences, Ludwig-Maximilians-Universität, as part of LMUinnovativ. We thank all participants for their long-term commitment to the KORA study, the staff for data collection and research data management and the members of the KORA Study Group (https://www.helmholtz-munich.de/en/epi/cohort/kora) who are responsible for the design and conduct of the study. For the INTERVAL study, we thank the NIH Research Cambridge Biomedical Research Centre for funding (RG64219). Participants in the INTERVAL randomized controlled trial were recruited with the active collaboration of NHS Blood and Transplant England (https://www.nhsbt.nhs.uk), which has supported field work and other elements of the trial. 
DNA extraction and genotyping were cofunded by the National Institute for Health and Care Research (NIHR), the NIHR BioResource (http://bioresource.nihr.ac.uk) and the NIHR Cambridge Biomedical Research Centre (BRC-1215-20014) (the views expressed in this paper are those of the author(s) and not necessarily those of the NIHR, NHSBT or the Department of Health and Social Care). The academic coordinating center for INTERVAL was supported by core funding from the NIHR Blood and Transplant Research Unit in Donor Health and Genomics (NIHR BTRU-2014-10024), the NIHR BTRU in Donor Health and Behaviour (NIHR203337), the UK Medical Research Council (MR/L003120/1), the British Heart Foundation (SP/09/002; RG/13/13/30194; RG/18/13/33946) and NIHR Cambridge BRC (BRC-1215-20014). A complete list of the investigators and contributors to the INTERVAL trial is provided in ref. ⁶⁰. The academic coordinating center thanks blood donor center staff and blood donors for participating in the INTERVAL trial. This work was also supported by Health Data Research UK, which is funded by the UK Medical Research Council, the Engineering and Physical Sciences Research Council, the Economic and Social Research Council, the Department of Health and Social Care (England), the Chief Scientist Office of the Scottish Government Health and Social Care Directorates, the Health and Social Care Research and Development Division (Welsh Government), the Public Health Agency (Northern Ireland), the British Heart Foundation and Wellcome. Regarding the INTERVAL data, this work was performed using resources provided by the Cambridge Service for Data Driven Discovery (CSD3) operated by the University of Cambridge Research Computing Service (https://www.csd3.cam.ac.uk), provided by Dell EMC and Intel using tier 2 funding from the Engineering and Physical Sciences Research Council (capital grant EP/P020259/1), and DiRAC funding from the Science and Technology Facilities Council (https://www.dirac.ac.uk). S.B. is supported by Cancer Research UK (A27657). W.H. Ouwehand is supported by grants to his laboratory from the National Institute for Health Research (NIHR), the European Commission (HEALTH-F2-2012-279233), the British Heart Foundation (RP-PG-0310-1002 and RG/09/12/28096) and NHSBT, and he is a senior investigator for the NIHR. N. Soranzo is supported by the Wellcome Trust (WT098051 and WT091310) and the European Commission Framework Programme 7 (EPIGENESYS 257082 and BLUEPRINT HEALTH-F5-2011-282510). D.J.R. is supported by the NIHR (NIHR-RP-PG-0310-1004). J. Danesh holds a British Heart Foundation Professorship and an NIHR Senior Investigator Award. P.V. was supported by LX22 NPO 5102. We also thank the DBDS and the DBDS Genomic Consortium for their contribution. A complete list of the investigators and contributors to the DBDS Genomic Consortium is provided in the Supplementary Note. The DBDS is funded by the Danish Council for Independent Research—Medical Sciences (8020-00403B), the Danish Administrative Regions and Bio- and Genome Bank Denmark, the Danish Blood Donor Research Foundation and the Novo Nordisk Foundation Challenge Program (NNF17OC0027594). We thank the research participants and employees of 23andMe for making this work possible. The 23andMe Research Team provided infrastructure for generating 23andMe data. Participants provided informed consent and volunteered to participate in the research online under a protocol approved by the external AAHRPP-accredited IRB, E&I Review Services. As of 2022, E&I Review Services is part of Salus IRB (https://www.versiticlinicaltrials.org/salusirb).

Funding

Open access funding provided by Helmholtz Zentrum München - Deutsches Forschungszentrum für Gesundheit und Umwelt (GmbH).

Author information

These authors contributed equally: Barbara Schormair, Chen Zhao, Steven Bell.
These authors jointly supervised this work: Emanuele Di Angelantonio, Konrad Oexle, Juliane Winkelmann.
Full lists of members and their affiliations appear in the Supplementary Information.

Authors and Affiliations

Institute of Neurogenomics, Helmholtz Zentrum München, German Research Center for Environmental Health, Neuherberg, Germany
Barbara Schormair, Chen Zhao, Nathalie Schandra, Wolfgang H. Oertel, Volker Kittke, Philip Harrer, Konrad Oexle & Juliane Winkelmann
Institute of Human Genetics, TUM School of Medicine and Health, Technical University of Munich, Munich, Germany
Barbara Schormair, Chen Zhao, Nathalie Schandra, Volker Kittke, Philip Harrer, Konrad Oexle & Juliane Winkelmann
Department of Oncology, University of Cambridge, Cambridge, UK
Steven Bell
Department of Clinical Neurosciences, University of Cambridge, Cambridge, UK
Steven Bell
Cancer Research UK Cambridge Institute, Li Ka Shing Centre, University of Cambridge, Cambridge, UK
Steven Bell
Department of Clinical Immunology, Copenhagen University Hospital, Rigshospitalet, Copenhagen, Denmark
Maria Didriksen, Joseph Dowsett, Sisse Rye Ostrowski & Erik Sørensen
Department of Neuroscience, University of Copenhagen, Copenhagen, Denmark
Maria Didriksen
deCODE Genetics/Amgen, Reykjavik, Iceland
Muhammad S. Nawaz, Hreinn Stefansson & Kari Stefansson
Sleep Disorders Clinic, Department of Neurology, Medical University of Innsbruck, Innsbruck, Austria
Ambra Stefani, Birgit Högl, Abubaker Ibrahim & Melanie Bergmann
Sleep–Wake Disorders Center, Department of Neurology, Hôpital Gui-de-Chauliac, CHU Montpellier, Institut des Neurosciences de Montpellier, INSERM, Université de Montpellier, Montpellier, France
Yves Dauvilliers & Sofiene Chenini
SomnoDiagnostics, Osnabrück, Germany
Cornelius G. Bachmann
Department of Neurology, University Medical Center Göttingen, Göttingen, Germany
Cornelius G. Bachmann
Department of Neurology and Centre of Clinical Neuroscience, Charles University, First Faculty of Medicine and General University Hospital, Prague, Czech Republic
David Kemlink & Karel Sonka
Department of Neurology, Ludwig Maximilians University Munich, Munich, Germany
Walter Paulus
Paracelsus-Elena-Klinik, Kassel, Germany
Claudia Trenkwalder
Department of Neurosurgery, University Medical Center Göttingen, Göttingen, Germany
Claudia Trenkwalder
Department of Neurology, Philipps-University Marburg, Marburg, Germany
Wolfgang H. Oertel
Neuropsychiatry Centre Erding/München, Erding, Germany
Magdolna Hornyak
Estonian Genome Center, Institute of Genomics, University of Tartu, Tartu, Estonia
Maris Teder-Laving & Andres Metspalu
Department of Neurology, Nicosia General Hospital Medical School, University of Cyprus, Nicosia, Cyprus
Georgios M. Hadjigeorgiou
Bragée ME/CFS Center, Stockholm, Sweden
Olli Polo
Department of Pulmonology, Center of Sleep Medicine, Charité—Universitätsmedizin Berlin, Berlin, Germany
Ingo Fietze
Department of Neuroscience, Mayo Clinic College of Medicine, Jacksonville, FL, USA
Owen A. Ross
Department of Neurology, Mayo Clinic, Jacksonville, FL, USA
Zbigniew K. Wszolek
Department of Clinical Medicine, University of Copenhagen, Copenhagen, Denmark
Sisse Rye Ostrowski & Ole B. Pedersen
Department of Clinical Immunology, Aarhus University Hospital, Aarhus, Denmark
Christian Erikstrup
Department of Clinical Medicine, Aarhus University, Aarhus, Denmark
Christian Erikstrup
Department of Clinical Immunology, Zealand University Hospital, Køge, Denmark
Ole B. Pedersen
Department of Clinical Immunology, Odense University Hospital, Odense, Denmark
Mie Topholm Bruun
Department of Clinical Immunology, Aalborg University Hospital, Aalborg, Denmark
Kaspar R. Nielsen
British Heart Foundation Cardiovascular Epidemiology Unit, Department of Public Health and Primary Care, University of Cambridge, Cambridge, UK
Adam S. Butterworth, John Danesh & Emanuele Di Angelantonio
British Heart Foundation Centre of Research Excellence, University of Cambridge, Cambridge, UK
Adam S. Butterworth, John Danesh & Emanuele Di Angelantonio
National Institute for Health and Care Research Blood and Transplant Research Unit in Donor Health and Behaviour, University of Cambridge, Cambridge, UK
Adam S. Butterworth, Nicole Soranzo, David J. Roberts, John Danesh & Emanuele Di Angelantonio
Health Data Research UK Cambridge, Wellcome Genome Campus and University of Cambridge, Cambridge, UK
Adam S. Butterworth, John Danesh & Emanuele Di Angelantonio
Victor Phillip Dahdaleh Heart and Lung Research Institute, University of Cambridge, Cambridge, UK
Adam S. Butterworth, John Danesh & Emanuele Di Angelantonio
Department of Haematology, University of Cambridge, Cambridge, UK
Nicole Soranzo & Willem H. Ouwehand
Department of Human Genetics, the Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, UK
Nicole Soranzo & John Danesh
NHS Blood and Transplant, Cambridge Biomedical Campus, Cambridge, UK
Willem H. Ouwehand
Department of Haematology, University College London Hospitals, London, UK
Willem H. Ouwehand
Radcliffe Department of Medicine and National Health Service Blood and Transplant, Oxford, UK
David J. Roberts
Department of Haematology and BRC Haematology Theme, Churchill Hospital, Headington, Oxford, UK
David J. Roberts
Magdalene College, Cambridge, UK
Brendan Burchell
23andMe, Inc., Sunnyvale, CA, USA
Nicholas A. Furlotte, Priyanka Nandakumar & David A. Hinds
Center for Restless Legs Syndrome, Department of Neurology, Johns Hopkins University, Baltimore, MD, USA
Christopher J. Earley
Department of Neurology, Methodist Neurological Institute, Weill Cornell Medical School, Houston, TX, USA
William G. Ondo
The Neuro (Montreal Neurological Institute–Hospital), McGill University, Montreal, Quebec, Canada
Lan Xiong & Guy A. Rouleau
Department of Neurology and Neurosurgery, McGill University, Montreal, Quebec, Canada
Lan Xiong & Guy A. Rouleau
Centre d’Études Avancées en Médecine du Sommeil, Hôpital du Sacré-Cœur de Montréal, Montreal, Quebec, Canada
Alex Desautels
Department of Neurosciences, Université de Montréal, Montreal, Quebec, Canada
Alex Desautels
Clinical and Molecular Metabolism Research Program (CAMM), Faculty of Medicine, University of Helsinki, Helsinki, Finland
Markus Perola
Department of Public Health and Welfare, National Institute for Health and Welfare, Helsinki, Finland
Markus Perola
Department of Molecular Biology of Cancer, Institute of Experimental Medicine, Academy of Science of Czech Republic, Prague, Czech Republic
Pavel Vodicka
First Faculty of Medicine, Charles University in Prague, Prague, Czech Republic
Pavel Vodicka
Biomedical Centre, Faculty of Medicine in Pilsen, Charles University in Prague, Pilsen, Czech Republic
Pavel Vodicka
L’institut du thorax, CNRS, INSERM, Nantes Université, Nantes, France
Christian Dina
Department of Genetic Epidemiology, Institute for Human Genetics, University of Münster, Münster, Germany
Monika Stoll
Institute of Clinical Molecular Biology, Kiel University, Kiel, Germany
Andre Franke
PopGen Biobank and Institute of Epidemiology, Christian Albrechts University Kiel, Kiel, Germany
Wolfgang Lieb
John and Jennifer Ruddy Canadian Cardiovascular Genetics Centre, University of Ottawa Heart Institute, Ottawa, Ontario, Canada
Alexandre F. R. Stewart
Department of Medicine, Duke University School of Medicine, Durham, NC, USA
Svati H. Shah
Duke Clinical Research Institute, Duke University School of Medicine, Durham, NC, USA
Svati H. Shah
Institute of Epidemiology, Helmholtz Zentrum München, German Research Center for Environmental Health, Neuherberg, Germany
Christian Gieger & Annette Peters
Research Unit of Molecular Epidemiology, Helmholtz Zentrum München, German Research Center for Environmental Health, Neuherberg, Germany
Christian Gieger
German Research Center for Cardiovascular Disease (DZHK), partner site Munich Heart Alliance, Munich, Germany
Annette Peters
Chair of Epidemiology, Institute for Medical Information Processing, Biometry and Epidemiology, Medical Faculty, Ludwig-Maximilians-Universität München, Munich, Germany
Annette Peters
Department of Neurology, Emory University, Atlanta, GA, USA
David B. Rye
Department of Human Genetics, McGill University, Montreal, Quebec, Canada
Guy A. Rouleau
Institute of Epidemiology and Social Medicine, University of Münster, Münster, Germany
Klaus Berger
Statens Serum Institute, Copenhagen, Denmark
Henrik Ullum
Health Data Science Research Centre, Fondazione Human Technopole, Milan, Italy
Emanuele Di Angelantonio
Neurogenetic Systems Analysis Group, Institute of Neurogenomics, Helmholtz Zentrum München, German Research Center for Environmental Health, Neuherberg, Germany
Konrad Oexle
Munich Cluster for Systems Neurology (SyNergy), Munich, Germany
Juliane Winkelmann
German Center for Mental Health (DZPG), partner site Munich–Augsburg, Munich–Augsburg, Germany
Juliane Winkelmann
Inserm U1283, CNRS UMR 8199, European Genomic Institute for Diabetes, Institut Pasteur de Lille, Lille, France
Amélie Bonnefond
University of Lille, Lille University Hospital, Lille, France
Amélie Bonnefond
Institut Necker-Enfants Malades, INSERM UMR-S1151, CNRS UMR-S8253, Université Paris Cité, Paris, France
Louis Potier
Department of Diabetology, Endocrinology and Nutrition, DHU FIRE, Assistance Publique-Hôpitaux de Paris, Bichat Hospital, Paris, France
Louis Potier

Authors

Barbara Schormair
Chen Zhao
Steven Bell
Maria Didriksen
Muhammad S. Nawaz
Nathalie Schandra
Ambra Stefani
Birgit Högl
Yves Dauvilliers
Cornelius G. Bachmann
David Kemlink
Karel Sonka
Walter Paulus
Claudia Trenkwalder
Wolfgang H. Oertel
Magdolna Hornyak
Maris Teder-Laving
Andres Metspalu
Georgios M. Hadjigeorgiou
Olli Polo
Ingo Fietze
Owen A. Ross
Zbigniew K. Wszolek
Abubaker Ibrahim
Melanie Bergmann
Volker Kittke
Philip Harrer
Joseph Dowsett
Sofiene Chenini
Sisse Rye Ostrowski
Erik Sørensen
Christian Erikstrup
Ole B. Pedersen
Mie Topholm Bruun
Kaspar R. Nielsen
Adam S. Butterworth
Nicole Soranzo
Willem H. Ouwehand
David J. Roberts
John Danesh
Brendan Burchell
Nicholas A. Furlotte
Priyanka Nandakumar
Christopher J. Earley
William G. Ondo
Lan Xiong
Alex Desautels
Markus Perola
Pavel Vodicka
Christian Dina
Monika Stoll
Andre Franke
Wolfgang Lieb
Alexandre F. R. Stewart
Svati H. Shah
Christian Gieger
Annette Peters
David B. Rye
Guy A. Rouleau
Klaus Berger
Hreinn Stefansson
Henrik Ullum
Kari Stefansson
David A. Hinds
Emanuele Di Angelantonio
Konrad Oexle
Juliane Winkelmann

Consortia

23andMe Research Team

Nicholas A. Furlotte, Priyanka Nandakumar & David A. Hinds

D.E.S.I.R. study group

Amélie Bonnefond & Louis Potier

Contributions

B.S. and J.W. designed and coordinated the study. C.Z., S.B. and K.O. proposed and performed, contributed to and supervised bioinformatic procedures, statistical tests and meta-analyses, respectively. K.O., E.D.A. and J.W. supervised the overall study. B.S., C.Z., S.B., M.D., J. Dowsett, M.S.N., N.A.F., P.N. and D.A.H. performed statistical analysis within cohorts. N. Schandra, A.S., B.H., Y.D., C.G.B., D.K., K. Sonka, W.P., C.T., W.H. Oertel, M.H., G.M.H., O.P., I.F., J.W., O.A.R., Z.K.W., A.I., M.B., S.C., C.J.E., W.G.O., M.T.-L., A.M., A.D., L.X., G.A.R., K.B., M.P., P.V., C.D., A.F., W.L., A.F.R.S., S.H.S., C.G., A.P., M.S., the D.E.S.I.R. study group, the 23andMe Research Team, A.S.B., N. Soranzo, W.H. Ouwehand, D.J.R., J. Danesh, B.B., E.D.A., S.R.O., E.S., C.E., O.B.P., M.T.B., K.R.N., H.U., D.B.R., H.S. and K. Stefansson acquired data and samples within cohorts. V.K., P.H., N. Schandra, A.S., B.H., Y.D., C.G.B., D.K., K. Sonka, W.P., C.T., W.H. Oertel, M.H., G.M.H., O.P., I.F., J.W., O.A.R., Z.K.W., A.I., M.B., S.C., C.J.E., W.G.O., M.T.-L., A.M., A.D., L.X., G.A.R., K.B., M.P., P.V., C.D., A.F., W.L., A.F.R.S., S.H.S., C.G., A.P., M.S., the D.E.S.I.R. study group, the 23andMe Research Team, A.S.B., N. Soranzo, W.H. Ouwehand, D.J.R., J. Danesh, B.B., E.D.A., S.R.O., E.S., C.E., O.B.P., M.T.B., K.R.N., H.U., D.B.R., H.S., K. Stefansson and K.O. supported data interpretation within cohorts. B.S., C.Z., S.B., K.O., E.D.A. and J.W. performed data interpretation of meta-analyses. B.S., K.O., C.Z., S.B., J.W. and E.D.A. wrote the paper. B.S., K.O., C.Z., S.B., J.W., E.D.A., V.K., P.H., M.D., J. Dowsett, M.S.N., N.A.F., P.N., D.A.H., N. Schandra, A.S., B.H., Y.D., C.G.B., D.K., K. Sonka, W.P., C.T., W.H. Oertel, M.H., G.M.H., O.P., I.F., O.A.R., Z.K.W., A.I., M.B., S.C., C.J.E., W.G.O., M.T.-L., A.M., A.D., L.X., G.A.R., K.B., M.P., P.V., C.D., A.F., W.L., A.F.R.S., S.H.S., C.G., A.P., M.S., the D.E.S.I.R. study group, the 23andMe Research Team, A.S.B., N. Soranzo, W.H. Ouwehand, D.J.R., J. Danesh, B.B., S.R.O., E.S., C.E., O.B.P., M.T.B., K.R.N., H.U., D.B.R., H.S. and K. Stefansson reviewed and approved the final version of the paper.

Corresponding author

Correspondence to Barbara Schormair.

Ethics declarations

Competing interests

The funders of the study had no role in conceptualization, design, data collection, analysis, the decision to publish or preparation of the manuscript. J.W., B.S., K.O. and C.Z. have filed a patent application (WO2021185936A1). Z.K.W. serves as PI or co-PI on Biohaven Pharmaceuticals (BHV4157-206), Neuraly (NLY01-PD-1) and Vigil Neuroscience (VGL101-01.002, VGL101-01.201, PET tracer development protocol, and CSF1R biomarker and repository project) grants. Z.K.W. serves as co-PI of the Mayo Clinic APDA Center for Advanced Research and as an external advisory board member for Vigil Neuroscience. W.P. has received honoraria as a speaker from Philips and MediPark Clinic and as a consultant from Abbott and Precisis. J. Danesh serves on scientific advisory boards for AstraZeneca, Novartis and the UK Biobank and has received multiple grants from academic, charitable and industry sources outside of the submitted work. A.S.B. reports institutional grants from AstraZeneca, Bayer, Biogen, BioMarin, Bioverativ, Novartis, Regeneron and Sanofi. D.A.H., N.A.F., P.N. and members of the 23andMe Research Team are employed by and hold stock or stock options in 23andMe. Authors affiliated with deCODE Genetics/Amgen declare competing financial interests as employees. The other authors declare no competing interests.

Peer review

Peer review information

Nature Genetics thanks the anonymous reviewers for their contribution to the peer review of this work. Peer reviewer reports are available.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Extended data

Extended Data Fig. 1 General study workflow.

Overview of the main analytical steps conducted in the study. The sex-specific GWAS meta-analysis results were used to dissect similarities and differences between the sexes, whereas the pooled meta-analysis results were used for further functional interpretation.

Extended Data Fig. 2 Genetic correlation between individual discovery stage GWAS of the N-GWAMA meta-analysis.

Genetic correlations between the discovery stage input GWAS were calculated using LDSC on the summary statistics.

Extended Data Fig. 3 Manhattan and Miami plots of discovery stage meta-analyses.

a, Results of the pooled discovery meta-analysis. b, Results of the sex-specific discovery meta-analyses. Female-only results are depicted in red in the upper section of the Miami plot; male-only results are depicted in blue in the lower section. The x-axis shows chromosome and base pair positions of the tested variants. The y-axis shows significance as −log₁₀ of the two-sided nominal P-values of the N-GWAMA analyses. Red horizontal dashed lines indicate the Bonferroni-adjusted significance threshold of P < 5 × 10⁻⁸.

Extended Data Fig. 4 Simulation study assessing sex-specific heritability and genetic correlation divergence.

Simulation of an environmental effect that reconciles the sex difference in heritability with the similarity of the SNP effect sizes. a, Frequency density distributions of the liabilities for different models. Blue line, base model, $\varphi =X\beta +\varepsilon$, as assumed to be present in males with h² = 0.1395, X and β as determined by GWAS, $\varepsilon \sim N(0,1)$, and a disease threshold in keeping with the male RLS prevalence of 0.06 (shaded area under the curve). Black line, model with non-interacting binary environmental effect, $\varphi =X\beta +\tau E+\varepsilon$, with $X,\beta ,\varepsilon$ and threshold as in the base model plus an additional binary effect $E \sim \mathrm{Bernoulli}(p=0.21)$, representing childlessness, with a weight τ such that the prevalence is 0.13 as in females. Red line, analogous G×E model, $\varphi =X\beta +X\eta \circ E+\varepsilon$, but where the environmental effect now interacts with the genetic effects and the corresponding weight vector η is chosen in accordance with the female prevalence. b, c, Optimization of the model $\varphi =X\beta +X\eta \circ E+\tau E+\varepsilon$ with $X,\beta ,E,\varepsilon$ and threshold as above, where the additional degree of freedom is covered by also considering the mean effect size ratio $r_b$ observed in the GWAS. Heatmap and contour plot for logistic regression-based liability-scaled LDSC h² (b) and effect size ratio $r_b$ (c) as functions of $\mathrm{Var}(\tau E)$ and $\mathrm{Var}(X\eta \circ E)$. Optimal values for $\mathrm{Var}(\tau E)$ and $\mathrm{Var}(X\eta \circ E)$, that is, for τ and η, respectively, comply with the female prevalence, the female heritability and the observed effect size ratio. The optimal τ turns out to be close to zero, so that the environmental factor acts mostly via genetic interaction.
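As a rough numerical illustration of the non-interacting model (black line in panel a), the sketch below solves for the weight τ analytically instead of simulating genotypes. It assumes that Xβ can be approximated by a normal variable and that h² = Var(Xβ)/(Var(Xβ) + 1); neither assumption is stated in the legend, the function names are hypothetical, and this is not the authors' simulation code.

```python
from statistics import NormalDist

H2_MALE = 0.1395     # male heritability quoted in the legend
PREV_MALE = 0.06     # male RLS prevalence quoted in the legend
PREV_FEMALE = 0.13   # female RLS prevalence quoted in the legend
P_ENV = 0.21         # frequency of the binary environmental factor E

# Assumption: X*beta is treated as normal; with Var(eps) = 1,
# h2 = var_g / (var_g + 1) gives the genetic variance.
var_g = H2_MALE / (1.0 - H2_MALE)
liability = NormalDist(mu=0.0, sigma=(var_g + 1.0) ** 0.5)

# Disease threshold such that the base model reproduces the male prevalence.
threshold = liability.inv_cdf(1.0 - PREV_MALE)


def prevalence(tau: float) -> float:
    """Prevalence under phi = X*beta + tau*E + eps, E ~ Bernoulli(P_ENV)."""
    unexposed = 1.0 - liability.cdf(threshold)
    exposed = 1.0 - liability.cdf(threshold - tau)
    return (1.0 - P_ENV) * unexposed + P_ENV * exposed


def solve_tau(target: float, lo: float = 0.0, hi: float = 10.0) -> float:
    """Bisection on tau; prevalence(tau) increases monotonically with tau."""
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if prevalence(mid) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


tau = solve_tau(PREV_FEMALE)
print(f"tau = {tau:.3f}, prevalence = {prevalence(tau):.4f}")
```

Under these assumptions, a purely additive environmental shift can match the female prevalence, but it does not by itself address the heritability and effect size ratio that panels b and c additionally fit.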

Extended Data Fig. 5 Per chromosome heritability estimation based on the EU-RLS-GENE dataset.

Heritability estimates for each chromosome. a, Overall heritability of SNPs on each chromosome. The height of the bar represents the point estimate of the heritability, and the error bars indicate the standard error of this point estimate. b, Enrichment of heritability, which is defined as the proportion of SNP-heritability divided by the proportion of SNPs in each chromosome. The height of the bar represents the point estimate of the enrichment of heritability, and the error bars indicate the standard error of this point estimate.
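The enrichment definition in panel b can be written out as a small helper. This is a minimal sketch: the function name and the input values are hypothetical toy numbers chosen for illustration, not estimates from the EU-RLS-GENE dataset.

```python
def heritability_enrichment(h2_chrom: float, h2_total: float,
                            snps_chrom: int, snps_total: int) -> float:
    """Share of SNP-heritability on a chromosome divided by its share of SNPs."""
    if h2_total <= 0 or snps_chrom <= 0 or snps_total <= 0:
        raise ValueError("totals and SNP counts must be positive")
    return (h2_chrom / h2_total) / (snps_chrom / snps_total)


# Hypothetical toy values: a chromosome carrying 10% of the heritability
# but only 5% of the SNPs is enriched by a factor of about 2.
example = heritability_enrichment(
    h2_chrom=0.02, h2_total=0.20, snps_chrom=50_000, snps_total=1_000_000
)
print(round(example, 3))
```

A value above 1 indicates that the chromosome contributes more heritability than expected from its SNP count, and a value below 1 indicates less.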

Extended Data Fig. 6 Replication of lead SNPs in independent validation samples.

Association results of the replication stage. The effect size (beta) of the replication analysis is plotted against the effect size (beta) of the discovery stage for genome-wide significant lead SNPs. The color-coding and symbol shape indicate the strength of the association signal in the replication stage meta-analysis (nominal two-sided P value of random-effects meta-analysis). Blue square, Bonferroni-corrected significance; green circle, nominal significance; grey triangle, not significant. a, Pooled meta-analysis with Bonferroni threshold set at 0.000255, correcting for 196 lead SNPs. b, Male-specific meta-analysis with Bonferroni threshold set at 0.00082, correcting for 61 lead SNPs. c, Female-specific meta-analysis with Bonferroni threshold set at 0.000318, correcting for 157 lead SNPs.
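The three Bonferroni thresholds quoted in this legend are consistent with dividing a family-wise error rate of 0.05 by the number of lead SNPs tested. The sketch below reproduces that arithmetic; the value α = 0.05 is an assumption inferred from the numbers, as the legend does not state it.

```python
ALPHA = 0.05  # assumed family-wise error rate (not stated in the legend)

# Number of lead SNPs tested per analysis, as given in the legend.
lead_snps = {"pooled": 196, "male": 61, "female": 157}

# Bonferroni correction: divide alpha by the number of tests.
thresholds = {name: ALPHA / n for name, n in lead_snps.items()}

for name, value in thresholds.items():
    print(f"{name}: {value:.3g}")
```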

Extended Data Table 1 Lead SNPs with significant heterogeneity of effect sizes between sexes


Supplementary information

Supplementary Information

Supplementary Note and Fig. 1.

Reporting Summary

Peer Review File

Supplementary Tables

Supplementary Tables 1–26.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.


About this article

Cite this article

Schormair, B., Zhao, C., Bell, S. et al. Genome-wide meta-analyses of restless legs syndrome yield insights into genetic architecture, disease biology and risk prediction. Nat Genet 56, 1090–1099 (2024). https://doi.org/10.1038/s41588-024-01763-1


Received: 09 March 2023
Accepted: 19 April 2024
Published: 05 June 2024
Issue Date: June 2024
DOI: https://doi.org/10.1038/s41588-024-01763-1
Springer Nature America, Inc.
