An atlas on risk factors for multiple sclerosis: a Mendelian randomization study

We conducted a systematic review and wide-angled Mendelian randomization (MR) study to examine the association between possible risk factors and multiple sclerosis (MS). We used MR analysis to assess the associations between 65 possible risk factors and MS using data from a genome-wide association study including 14 498 cases and 24 091 controls of European ancestry. For 18 exposures not suitable for MR analysis, we conducted a systematic review to obtain the latest meta-analytic evidence on their associations with MS. Childhood and adulthood body mass index were positively associated with MS, whereas physical activity and serum 25-hydroxyvitamin D were inversely associated with MS. There was evidence of possible associations of type 2 diabetes, waist circumference, body fat percentage, age of puberty and high-density lipoprotein cholesterol with MS. The systematic review showed that exposure to organic solvents, Epstein-Barr virus and cytomegalovirus infection, and diphtheria and tetanus vaccination were associated with MS risk. This study identified several modifiable risk factors for primary prevention of MS that should inform public health policy.


Introduction
Multiple sclerosis (MS) is an inflammatory demyelinating disease of the central nervous system and a leading nontraumatic cause of disability among young adults of northern European ancestry [1]. Even though epidemiological studies have uncovered several modifiable risk factors for MS, such as serum vitamin D levels [2] and body mass index [3], the overall etiological basis of MS is poorly understood [4]. A recent umbrella review found consistent evidence supporting the associations of Epstein-Barr virus infection and smoking with MS risk [1]. However, the role of other environmental factors and internal conditions in MS risk has been scarcely investigated. In addition, it is unclear whether the associations reported by traditional observational studies are causal, because such studies are prone to confounding, reverse causality and misclassification.
Mendelian randomization (MR) is an analytical approach that utilizes genetic variants, generally single nucleotide polymorphisms (SNPs), as instrumental variables for an exposure to diminish confounding and reverse causality, thereby strengthening the causal inference of an exposure-outcome association [5]. The rationale for minimizing confounding in MR studies is that genetic variants are randomly allocated at meiosis, and therefore, variants associated with one trait are generally unrelated to other traits. Reverse causality can be avoided since genetic variants are fixed and, therefore, cannot be modified by disease onset and progression [5]. There are three key assumptions for MR analysis [5]. First, the genetic variants proposed as instrumental variables should be associated with the risk factor of interest. Second, the genetic variants used should not be associated with potential confounders. Third, the selected genetic variants should affect the risk of the outcome (e.g. MS) only through the risk factor. By exploiting different summary-level genetic data sources for the exposure and the outcome, the two-sample MR approach infers exposure-outcome causality with improved statistical power and less confounding bias [6].
The aim of the present study was to systematically appraise the evidence of causal associations between possible risk factors and MS using the two-sample MR design. For exposures that cannot be instrumented by genetic variants, we additionally obtained data from the latest meta-analyses through a systematic review of the literature.
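The logic of the instrumental variable approach can be illustrated with a small simulation. The sketch below is not part of the study's analysis: the allele frequency, effect sizes, continuous outcome and function name are all assumptions chosen for illustration. It generates a genotype that satisfies the three assumptions, an unmeasured confounder that affects both exposure and outcome, and then compares the naive exposure-outcome regression slope with the genotype-based ratio estimate.

```python
import numpy as np

def simulate_mr(n=200_000, true_effect=0.5, seed=0):
    """Compare a confounded regression slope with an instrumental variable estimate.

    All parameter values are hypothetical and chosen only for illustration.
    """
    rng = np.random.default_rng(seed)
    g = rng.binomial(2, 0.3, size=n)               # genotype coded 0/1/2, allocated at random
    u = rng.normal(size=n)                         # unmeasured confounder
    x = 0.4 * g + u + rng.normal(size=n)           # exposure depends on genotype and confounder
    y = true_effect * x + u + rng.normal(size=n)   # outcome depends on exposure and confounder

    # Naive observational estimate: slope of y on x, biased by the confounder u.
    naive = np.cov(x, y)[0, 1] / np.var(x, ddof=1)

    # Ratio (Wald) estimate: genotype-outcome slope divided by genotype-exposure slope.
    beta_gx = np.cov(g, x)[0, 1] / np.var(g, ddof=1)
    beta_gy = np.cov(g, y)[0, 1] / np.var(g, ddof=1)
    wald = beta_gy / beta_gx
    return naive, wald

if __name__ == "__main__":
    naive, wald = simulate_mr()
    print(f"naive slope: {naive:.3f}, ratio estimate: {wald:.3f}, true effect: 0.5")
```

Under these assumptions the naive slope is pulled away from the true effect by the confounder, whereas the ratio estimate stays close to it, because the genotype is independent of the confounder and influences the outcome only through the exposure.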

Study design overview and potential risk factor identification
The overview of the study design is displayed in Fig. 1. First, we conducted a systematic review in the PubMed database to identify possible risk factors for MS. In total, 1863 studies published in the past 5 years were screened, and 87 general risk factors were pinpointed (Supplementary Table 1). After excluding traits without suitable genetic instruments or with too few genetic instruments (fewer than 3 SNPs), a total of 65 possible risk factors were included in the MR analyses. In addition, we included 18 risk factors in the systematic review.

Instrumental variable selection
Instrumental variables for the 65 exposures were identified from genome-wide association studies (GWASs). SNPs at the genome-wide significance threshold (p < 5 × 10⁻⁸) were proposed as instrumental variables. To mitigate collinearity between included SNPs, we excluded SNPs in linkage disequilibrium (r² ≥ 0.01) and retained the SNP with the strongest effect on the associated trait. SNPs in the MHC gene region, which is strongly associated with MS, were removed from the analyses to exclude possible pleiotropy. The variance explained by the SNPs used for each risk factor was either extracted from the original GWAS or estimated based on the minor allele frequency, the beta coefficient for the minor allele and the standard deviation (SD) of the risk factor. Information on the data sources, the number of SNPs used, and the variance explained by the SNPs is presented in Table 1. Other information, such as instrumental variable selection and the unit for each trait, is available in Supplementary Table 2.
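As an illustration of how variance explained can be approximated from summary statistics, the sketch below uses a commonly applied formula, R² ≈ 2 × MAF × (1 − MAF) × β² / SD², summed over independent SNPs, together with the conventional instrument-strength formula F = (R² / (1 − R²)) × ((n − k − 1) / k). The text does not state the exact formulas the study used, so both are assumptions, and the SNP values and sample size are hypothetical.

```python
def variance_explained(maf, beta, sd=1.0):
    """Approximate proportion of trait variance explained by one SNP.

    maf:  minor allele frequency
    beta: per-allele effect on the trait, in trait units
    sd:   standard deviation of the trait, in the same units
    """
    return 2.0 * maf * (1.0 - maf) * beta ** 2 / sd ** 2

def f_statistic(r2, n, k):
    """Conventional F-statistic for k instruments explaining r2 of the variance in n people."""
    return (r2 / (1.0 - r2)) * (n - k - 1) / k

if __name__ == "__main__":
    # Hypothetical independent SNPs: (minor allele frequency, beta in SD units).
    snps = [(0.3, 0.05), (0.5, 0.04), (0.1, 0.10)]
    r2_total = sum(variance_explained(maf, beta) for maf, beta in snps)
    f_value = f_statistic(r2_total, n=100_000, k=len(snps))
    print(f"variance explained: {r2_total:.5f}, F-statistic: {f_value:.1f}")
```

Summing per-SNP contributions is only reasonable when the SNPs are approximately independent, which is what the linkage disequilibrium pruning step is meant to ensure.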

Multiple sclerosis genotyping data
Summary-level statistics for the associations of the SNPs associated with the 65 risk factors with MS were extracted from the discovery stage of a GWAS with 14 498 MS cases and 24 091 controls of European ancestry from 11 countries [7]. Betas and standard errors for the identified SNPs had been obtained by logistic regression analysis with adjustment for five population principal components. MS cases were diagnosed by neurologists familiar with multiple sclerosis in accordance with recognised diagnostic criteria that employ a combination of clinical and laboratory-based para-clinical information. Detailed case ascertainment criteria for every included area were specified in the published GWAS of MS [7]. Overall and country-specific disease-related features, such as sex ratio, onset age, age at examination and disease severity, are displayed in Supplementary Table 3.

Statistical analysis
The association between each risk factor and MS risk attributable to each SNP was estimated with the Wald ratio method. The ratio estimates for all SNPs used for one trait were combined using the multiplicative random-effects inverse-variance weighted (IVW) meta-analysis method [8]. We used the weighted median approach as a sensitivity analysis, which provides a consistent estimate provided that more than 50% of the weight in the analysis comes from valid instrumental variables [8]. The MR-PRESSO approach was used to correct for possible pleiotropic effects. The MR-PRESSO test detects possible outliers and provides estimates after removal of outliers, thereby correcting for horizontal pleiotropy [9]. Heterogeneity across the SNPs used for a trait was measured by Cochran's Q statistic, and possible pleiotropy was detected by the MR-Egger regression model with p for intercept ≤ 0.05 [8]. To assess the strength of the instrumental variables, the F-statistic was estimated based on sample size, number of SNPs used, and variance explained by the included SNPs [10]. Power estimation was based on a web tool [11] and is shown in Supplementary Table 2. Odds ratios (ORs) and 95% confidence intervals (CIs) of MS were scaled to a one-unit increase in the corresponding unit for each trait. All statistical tests were two-sided and performed using the mrrobust package in Stata/SE 15.0 and the TwoSampleMR package in R 3.6.0. Associations with p < 0.05 in both the IVW random-effects and MR-PRESSO models were deemed robust associations, and associations with p < 0.05 in either the IVW random-effects or the MR-PRESSO model and in the same direction across all analyses were regarded as suggestive associations.
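The main estimation steps can be sketched in a few lines. The code below is a minimal illustration and not the mrrobust or TwoSampleMR implementation: the function names are hypothetical, the per-SNP standard errors use a first-order approximation that ignores uncertainty in the SNP-exposure estimates, and the multiplicative random-effects model is implemented by inflating the fixed-effect standard error by the square root of Q / (k − 1) when that ratio exceeds 1. The "null" label in the classification helper is also an assumption; the text defines only the robust and suggestive categories.

```python
import math

def wald_ratios(beta_x, beta_y, se_y):
    """Per-SNP ratio estimates (log odds per unit of exposure) and approximate SEs."""
    ratios = [by / bx for bx, by in zip(beta_x, beta_y)]
    ses = [abs(sy / bx) for bx, sy in zip(beta_x, se_y)]  # first-order approximation
    return ratios, ses

def ivw_multiplicative_random_effects(ratios, ses):
    """Inverse-variance weighted estimate with a multiplicative random-effects SE.

    Returns the pooled estimate, its standard error, Cochran's Q and a two-sided p value.
    """
    weights = [1.0 / s ** 2 for s in ses]
    total = sum(weights)
    estimate = sum(w * r for w, r in zip(weights, ratios)) / total
    q = sum(w * (r - estimate) ** 2 for w, r in zip(weights, ratios))  # Cochran's Q
    df = len(ratios) - 1
    phi = max(1.0, q / df)                 # overdispersion factor, never below 1
    se = math.sqrt(phi / total)
    p_value = math.erfc(abs(estimate / se) / math.sqrt(2.0))  # two-sided normal p value
    return estimate, se, q, p_value

def odds_ratio_ci(estimate, se, z=1.96):
    """Convert a log odds ratio and its SE to an OR with a 95% confidence interval."""
    return math.exp(estimate), math.exp(estimate - z * se), math.exp(estimate + z * se)

def classify(p_ivw, p_presso, same_direction):
    """Apply the robust / suggestive rule described in the text."""
    if p_ivw < 0.05 and p_presso < 0.05:
        return "robust"
    if (p_ivw < 0.05 or p_presso < 0.05) and same_direction:
        return "suggestive"
    return "null"

if __name__ == "__main__":
    # Hypothetical summary statistics for three SNPs.
    beta_x = [0.10, 0.20, 0.05]   # SNP-exposure associations
    beta_y = [0.03, 0.05, 0.02]   # SNP-MS associations (log odds)
    se_y = [0.01, 0.01, 0.01]     # standard errors of the SNP-MS associations
    ratios, ses = wald_ratios(beta_x, beta_y, se_y)
    est, se, q, p = ivw_multiplicative_random_effects(ratios, ses)
    print(odds_ratio_ci(est, se), q, p)
```

The weighted median, MR-Egger and MR-PRESSO analyses are not covered by this sketch; in practice they would be run with the packages named above.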

Systematic review
For risk factors not suitable for MR analysis, we conducted systematic reviews to obtain the latest meta-analysis that included the largest number of studies. Systematic reviews were carried out on 18 risk factors. We extracted publication information, number of included studies, sample size and risk estimates. Detailed information and search strategies are documented in Supplementary Table 4.

Mendelian randomization
Among 65 possible risk factors, four traits, including childhood and adulthood body mass index, serum 25-hydroxyvitamin D and physical activity, were robustly associated with risk of MS. There were suggestive associations with 5 risk factors, including type 2 diabetes, waist circumference, body fat percentage, age of puberty and high-density lipoprotein cholesterol.

Health status
Six out of 36 health status-related risk factors were associated with MS (Table 2). Specifically, liability to type 2 diabetes, childhood and adulthood body mass index, waist circumference and body fat percentage were positively associated with MS risk, whereas age of puberty was inversely associated with risk. Even though there was heterogeneity in the above analyses, no indication of pleiotropy was revealed in MR-Egger regression analysis (all p > 0.05). The other 30 factors showed limited evidence for an association with MS risk.

Nutrition and lifestyle
Genetically higher serum 25-hydroxyvitamin D levels and physical activity (moderate to vigorous level) were associated with a decreased MS risk in all models and no pleiotropy was detected (Table 2). There was a borderline association between urinary sodium levels and MS in both IVW-random effects and MR-PRESSO models (Table 2). There was no evidence of causal associations of circulating levels of amino acids, fatty acids, or other minerals and vitamins, alcohol drinking, coffee consumption, or smoking with MS risk (Table 2).

Internal biomarker
Genetic predisposition to higher levels of high-density lipoprotein cholesterol was suggestively associated with a lower risk of MS (Table 2). There was limited evidence supporting causal associations of other serum lipids, tumor necrosis factor, C-reactive protein and immunoglobulin E with MS.

Systematic review
We obtained 9 meta-analyses on 18 individual risk factors by a systematic search in PubMed. There were limited data from meta-analyses of sun exposure, exposure to pesticide-related products, air pollution, exposure to farm animals and pets, and antibiotic use in relation to MS. Exposure to organic solvents, Epstein-Barr virus and cytomegalovirus infection, and diphtheria and tetanus vaccination were associated with MS risk (Table 4).

Discussion
Using MR analysis, we found that 4 out of 65 risk factors were robustly associated with MS risk, including childhood and adulthood body mass index, serum 25-hydroxyvitamin D and physical activity. There was evidence of suggestive associations of type 2 diabetes, waist circumference, body fat percentage, age of puberty and high-density lipoprotein cholesterol with risk of MS. Evidence from the latest meta-analyses showed that exposure to organic solvents, Epstein-Barr virus and cytomegalovirus infection, and diphtheria and tetanus vaccination were associated with MS risk. Adulthood obesity has been identified as a risk factor for MS in previous studies [3,12]. The present study confirmed the causal association between high body mass index and an elevated risk of MS using more than ten-fold more SNPs for adulthood body mass index compared with the previous MR study [3]. We additionally assessed the influence of birth weight, childhood body mass index, waist circumference, body fat percentage, lean body mass, basal metabolic rate, and circulating adiponectin levels on MS. Consistent with observational findings [13], our study observed a causal positive association between childhood obesity and MS risk. Waist circumference and body fat percentage, but not lean body mass, showed evidence of possible associations with MS risk, which might shed light on the possible varying effects of obesity phenotypes on MS risk and mechanisms.
Low serum 25-hydroxyvitamin D levels exert detrimental effects on MS development, which has been found in previous studies [2,14] and verified in the present study. Maternal and neonatal 25-hydroxyvitamin D status has also been found to be associated with MS risk in offspring or later on [15,16]. We observed a consistent protective effect of moderate to vigorous physical activity on MS risk, which supports observational findings [17]. In addition, increased physical activity level can act as a beneficial rehabilitation strategy for MS patients to manage symptoms, restore function, improve quality of life, and promote wellness [18]. Therefore, from the preventive and therapeutic perspectives, exercise should be promoted among individuals at high risk of MS as well as for MS patients.
Effects of nutritional factors, except vitamin D, on the risk of MS are seldom discussed. Recent prospective cohort studies did not find any associations of potassium, magnesium, calcium and iron with MS risk [19,20], which is overall consistent with our study. Observational evidence indicated a protective effect of omega-3 polyunsaturated fatty acids [21] and a detrimental effect of total polyunsaturated fatty acids [22] on MS risk. Nonetheless, our study examined several individual plasma fatty acid levels and found null associations of these fatty acids with MS. We did not find any causal roles of amino acids and other vitamins in the onset of MS, which are scarcely explored in observational studies.
Observational data showed that the prevalence of both type 1 and type 2 diabetes was higher among MS patients compared with non-MS individuals [23,24]. The present study revealed a possible association between type 2 diabetes and MS. We found limited evidence supporting a causal effect of type 1 diabetes on MS risk. The reason behind a concurrence between type 1 diabetes and MS in observational studies might be shared genes contributing to susceptibility to both diseases (e.g. CLEC16A and CLECL1) [25], instead of a causal relationship.
Most studies have detected a decreased MS risk among individuals with later puberty [26,27], which is consistent with our results. Several population and animal studies have indicated that puberty might influence MS risk or relapse per se or via body mass index and other pathways [28,29]. Observational studies have reported conflicting findings on the possible roles of cigarette smoking, alcohol drinking, and coffee consumption in the development of MS [12,30-32]. The present MR study did not confirm a causal influence of those lifestyle factors on MS risk, but we cannot exclude that we may have overlooked weak associations. The causal role of those lifestyle factors in MS risk merits further study when more SNPs are identified for those factors and in studies based on a larger number of MS cases and controls.
Among internal biomarkers, previous studies found that serum lipid levels were not associated with MS risk [33]. However, high-density lipoprotein cholesterol was found to play a role in MS fatigue [34]. The present study observed a suggestive positive association between high-density lipoprotein cholesterol and risk of MS. Given inconsistent information on this association, whether high-density lipoprotein cholesterol plays a causal role in the development of MS needs more study.
This is the first study to comprehensively investigate the potential risk factors for MS using MR analysis. In addition, for exposures not feasible for MR analysis, a systematic review of the literature was conducted to provide contemporary evidence of risk factors for MS. Evidence from meta-analyses of observational studies can be challenged by potential methodological limitations embedded in such studies. Thus, the findings from meta-analyses need more study. Population bias was largely reduced by using genetic data mainly from individuals of European ancestry. However, findings from the analyses that used genetic data from multiple ancestries need to be cautiously interpreted and verified. The F-statistics for the traits indicated that our results were unlikely to be biased by weak instruments (F-statistic > 10) [10]. However, the statistical power for some analyses was modest, so some of the null results may be false negatives. Given that MR analysis reflects a lifetime exposure, the obtained effect sizes in the present study might be exaggerated and are not directly comparable with estimates derived from traditional observational studies. All MR analyses assumed linear relationships between the risk factors and MS and no interaction (e.g., the interaction between smoking and human leukocyte antigen genes [35]) or modification effects.
We could not assess reverse causality through bidirectional MR analysis because suitable summary-level data were not available for most exposures. Thus, whether there are bidirectional associations between certain exposures and MS needs to be revealed in future study.

Conclusions
This MR study provides evidence of causal associations of childhood and adulthood body mass index, serum 25-hydroxyvitamin D and physical activity with MS risk. Our complementary systematic review additionally showed that exposure to organic solvents, Epstein-Barr virus and cytomegalovirus infection, and diphtheria and tetanus vaccination were associated with MS risk. Taken together, this study suggests that lowering obesity and Epstein-Barr virus infection and increasing physical activity and serum vitamin D levels can reduce the risk of MS.
Ethical approval for the present MR study was not considered because these de-identified data came from summary statistics and no individual-level data were used. Summary-level genetic data for MS at the discovery stage can be downloaded at the website: https://imsgc.net/.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.