Introduction

Multiple sclerosis (MS) is an immune-mediated demyelinating disease of the Central Nervous System (CNS), with both genetic and environmental risk factors [1]. Environmental factors including obesity, low serum vitamin D, infection with Epstein–Barr virus, smoking, air pollution and solvent exposure have all been associated with an increase in MS risk, although the mechanisms through which they act remain uncertain [1]. Traditional epidemiological study designs used to study environmental risk factors—such as case–control and cohort studies—are liable to unmeasured confounding and reverse causation, and so may yield associations which are not causal [2].

Mendelian randomisation (MR) is a type of instrumental variable (IV) analysis that uses genetic variants as proxies to examine whether observed associations between an exposure and an outcome are likely to be causal [2, 3]. Childhood obesity has been identified as a potential causal MS risk factor in both epidemiological [4, 5] and MR studies [6, 7]. A consistent finding is that birth weight is not associated with MS risk, whereas higher BMI during later childhood and adolescence appears to be a risk factor [4, 8,9,10]. BMI is a dynamic trait, and it is unknown if there is a crucial point during development when BMI influences subsequent MS risk. Understanding the precise nature and timing of this association may shed light on the biological pathways leading to the clinical development of MS and increase the rationale for targeted obesity prevention programmes in early life. Defining this critical window of effect is challenging using traditional MR techniques to study the effect of BMI on MS risk at a single time point. Although genetic instruments for BMI throughout the life course can be obtained from different datasets, using GWAS from a single, longitudinal cohort minimises the risk of subtle population stratification distorting the results.

Longitudinal MR analysis of the effect of BMI during development is now possible due to the recent publication of longitudinal BMI GWAS data from the Norwegian Mother, Father, and Child Cohort Study (MoBa) cohort [11]. We set out to explore the causal effect of BMI changes during early life on MS risk and examine the dynamics of this effect over time.

Methods

Genetic datasets

Exposures dataset

The MoBa cohort is an open-ended cohort study that recruited pregnant women in Norway from 1999 to 2008, with anthropometric measurements of the children by trained nurses across childhood. This cohort was used to perform GWAS of BMI at various ages using data from between 11,095 and 28,681 children across 12 time points, from birth to 8 years (birth, 6 weeks, 3 months, 6 months, 8 months, 1 year, 1.5 years, 2 years, 3 years, 5 years, 7 years, and 8 years). 46 distinct genetic loci were associated with higher childhood BMI at various ages [11]. The outcome of this GWAS was standardized BMI, and therefore the units are not directly comparable with published estimates from MR studies using other GWAS for BMI for genetic instruments [11]. In the MoBa cohort, BMI was standardised to an age- and sex-specific reference using the generalized additive model for location, scale and shape method—each data point therefore represents the BMI of that child at that time point relative to children of the same age and same sex within the study. This overcomes some of the inaccuracies that can be introduced using ‘raw’ BMI across varying ages, where the range of healthy BMI varies throughout childhood.

We used variants reaching a genome-wide p value threshold of < 5 × 10–8 at each time point after conditional and joint analysis as instrumental variables. In the primary analysis, time points were amalgamated into four distinct age epochs: Birth to 6 weeks, 3 months to 1.5 years, 2 to 5 years, and 7 to 8 years. These epochs were defined according to the shared genetics of BMI at each time point, loosely corresponding to the four phases of BMI genetics suggested by the authors [11], and reflect the developmental stages of birth, infant, toddler and child. If a variant was associated at p < 5 × 10–8 with BMI at > 1 time point within an epoch, the association statistics were taken from the time point with the strongest association (i.e., the smallest p value). In secondary analyses, each individual time point was examined separately. Data are available for download on the MoBa website at https://www.fhi.no/en/studies/moba/for-forskere-artikler/gwas-data-from-moba/.

Outcome dataset

The International Multiple Sclerosis Genetics Consortium (IMSGC) 2019 discovery phase GWAS (total n = 47,429 MS cases, 68,374 controls) was used as the outcome dataset [12]. We used summary statistics from the discovery stage meta-analysis (14,802 persons with MS, 26,703 controls). This GWAS discovered 233 independent genome-wide independent significant signals associated with MS, of which 32 variants lay within the Major Histocompatibility Complex (MHC). Data are available on request from https://imsgc.net/.

Other datasets

GWAS datasets for adult BMI and birthweight were obtained from the GIANT and EGG consortia meta-analyses (respectively) [13, 14]. Data on birth weight have been contributed by the EGG Consortium and were downloaded from www.egg-consortium.org. Data on adult BMI can be downloaded from https://portals.broadinstitute.org/collaboration/giant/index.php/GIANT_consortium_data_files.

Mendelian randomisation and statistical analyses

Mendelian randomisation (MR) was performed using the TwoSampleMR (version 0.5.5) package in R (version 3.6.1) [15].

Genetic instruments for each age epoch/time point were generated by the following steps:

  1. 1.

    Exposure (BMI) Single-Nucleotide Polymorphisms (SNPs) associated with BMI at the given time point at p < 5 × 10–8 were retained.

  2. 2.

    SNPs were clumped to ensure independence using the PLINK clumping method and a European 1000 Genomes reference dataset. SNPs were clumped using a linkage disequilibrium (LD) threshold of R2 = 0.001 and a clumping window of 10,000 kb.

  3. 3.

    MHC SNPs (chr6:25,000,000–chr6:35,000,000 in hg19) were excluded.

  4. 4.

    Exposure SNPs were extracted from the outcome dataset (IMSGC MS GWAS).

  5. 5.

    Exposure and outcome SNPs were harmonised so that their effects corresponded to the same allele and palindromic SNPs with a minor allele frequency (MAF) > 0.42 were discarded.

MR analysis was performed using a random-effects, inverse variance-weighted method (IVW). Secondary analyses included a test for heterogeneity, tests for horizontal pleiotropy through calculation of an MR-Egger intercept and MR-PRESSO [16], and a leave-one-out analysis. In secondary analyses, we excluded SNPs explaining more variance in the outcome than exposure (Steiger filtering) as these variants are likely to violate the exclusion–restriction assumption of MR [17]. Results are presented as odds ratios (OR) with 95% confidence intervals (CIs) and associated p values. Results are visualised as scatter, forest, funnel and leave-one-out plots. F statistics for instrument strength were calculated for each SNP as β2/SE2, where β is the beta coefficient estimating the per-allele effect on BMI, and SE is the standard error of the beta coefficient. Mean F statistics for each epoch were calculated and are displayed in supplementary table 1. All mean F statistics were above 10, suggestive of strong instruments.

To determine the extent to which the observed effect of genetically estimated higher early-life BMI on MS risk was mediated by adult BMI, we performed multivariable MR, conditioning on the effect of each instrument on adult BMI. Adult BMI GWAS summary statistics were obtained from the GIANT consortium GWAS meta-analysis [13]. SNPs in the childhood BMI instrument were first harmonised with the adult BMI GWAS. After harmonisation, only one variant remained for the 2–5 years epoch, precluding multivariable analysis. For other epochs, we performed multivariable MR using the residual-based method as described in Burgess et al. (2015) [18]. In brief, this is a two-step approach which first regresses SNP associations with the outcome (MS) on SNP associations with the secondary risk factor (adult BMI). The residuals from this first step represent the variation in the outcome not explained by the secondary risk factor. These residuals are then regressed on the main exposure of interest (childhood BMI), giving an estimate of the effect of this exposure on the outcome independent of the secondary risk factor. As a sensitivity analysis, we also performed multivariable MR by jointly modelling the effects of SNPs on both risk factors in a weighted regression (implemented in mv_multiple in TwoSampleMR). As a further sensitivity analysis, we repeated the univariable MR for each time epoch, restricting the instruments to SNPs for which (a) associations with adult BMI were reported in the GIANT GWAS and (b) these associations were weaker than an arbitrary p value cutoff (p > 0.05). The intuition behind this approach is that, given the power of the GIANT GWAS, removing SNPs with even weak evidence of association with adult BMI (e.g., p < 0.05) should restrict the genetic instrument to those SNPs with an effect on childhood BMI, but not later-life BMI.

To provide further confirmation of the birthweight MR result, we repeated the analysis using an independent and larger GWAS of birthweight [14].

Data and code availability

We thank MoBa, Prof Stefan Johansson and Dr Marc Vaudel for providing summary statistics for childhood BMI. We thank the IMSGC for providing MS GWAS data. We thank MR-Base for making the TwoSampleMR package available publicly. Download links for datasets used in this study are provided above. All code used in this study are available at https://github.com/benjacobs123456/MR_MOBA_MS.

Consent and approval

This study was performed using publicly available data sources. There were no experiments performed on human subjects and no direct patient contact. Therefore, written consent was not required.

Results

Age-epoch analysis

We collapsed BMI during childhood into four windows: birth (birth–6 weeks inclusive), infancy (3 months–1.5 years inclusive), early childhood (2–5 years inclusive), and later childhood (7–8 years). For all time epochs other than birth, genetically estimated higher BMI was associated with an increased liability to MS (Tables 1 and 2, Fig. 1, supplementary figures 1 and 2): birth (OR 0.81, 95% CI 0.50–1.31, NSNPs = 7, p = 0.39), infancy (OR 1.18, 95% CI 1.04–1.33, NSNPs = 18, p = 0.01), early childhood (OR 1.31, 95% CI 1.03–1.66, NSNPs = 4, p = 0.03), and later childhood (OR 1.34, 95% CI 1.08–1.66, NSNPs = 4, p = 0.01).

Table 1 Single–Nucleotide Polymorphisms (SNPs) associated with BMI at various time epochs during early life
Table 2 MR estimates from the primary analysis of BMI during each time epoch on MS
Fig. 1
figure 1

Forest plots showing the MR effect estimates during each epoch for the effect of BMI on MS susceptibility. Points represent beta estimates reflecting the predicted log odds ratio for MS risk per 1 unit increase in genetically determined standardised BMI at each time point. Error bars show 95% confidence intervals. The separate panels display the primary MR analytic method—the inverse variance weighted (IVW) method—followed by secondary sensitivity analyses

The MR-Egger intercepts did not suggest that unbalanced horizontal pleiotropy was biasing the IVW estimate at any of these epochs (supplementary table 1). Cochran’s test of heterogeneity suggested heterogeneity in the IVW estimators during the birth epoch, but not during any other time window (supplementary table 1). There was evidence of global pleiotropy for the birth epoch (p < 0.001) but no other epochs (infancy p = 0.68, early childhood p = 0.84, later childhood p = 0.43). For the birth epoch, a single outlier (rs11187129) was found to distort the overall MR estimate (POutlier < 0.05), and removal of this outlier led to a point estimate closer to the null (betacorrected = − 0.06, p = 0.69).

To determine whether these results could be driven by SNPs acting in the opposite causal direction (i.e., influencing childhood BMI via susceptibility to MS), we excluded any SNPs explaining more variance in the outcome than the exposure (Steiger filtering) [17]. No SNPs were filtered out using this approach, and thus the causal estimates were unchanged. Leave-one-out sensitivity analysis did not lead to an appreciable change in the MR estimates, although there was some loss of precision (supplementary table 2).

The effect of early-life BMI on MS risk is contingent on persistence of elevated BMI into later life

Next, we explored whether the observed MR effects of genetically determined BMI during infancy, early and late childhood on MS risk could be driven by the effects of these genetic variants on BMI in adulthood. After conditioning on adult BMI, the causal estimates for each epoch diminished in magnitude, and did not show strong evidence for association with MS risk. This finding suggests that the observed effects in the univariable analysis may be due to persistence of higher BMI into later life (supplementary figure 3, supplementary tables 3 and 4). There is some loss of precision and power in these estimates compared to the univariable MR, largely because some variants were not available for analysis in the GIANT GWAS dataset; for the early childhood epoch, only one SNP was retained. The attenuation of effect size was most marked for the later childhood epoch (univariable MR estimate OR 1.34, 95% CI 1.08–1.66, p = 0.01; multivariable MR estimate OR 0.98, 95% CI 0.74–1.30, p = 0.9).

To confirm these results, we repeated the univariable MR analysis using SNPs with no evidence of association with adult BMI. After excluding SNPs associated with adult BMI at p < 0.05, no SNPs remained for the early or late childhood epochs, underlining the high correlation between BMI in these epochs and adult BMI. For the infancy epoch, although there was a loss of precision, the IVW estimate suggested a possible residual effect of elevated BMI at this time point (OR 1.28, 95% CI 0.97–1.68, p = 0.08, NSNP = 4), with no significant unbalanced horizontal pleiotropy biasing this estimate (Egger intercept 0.05, p = 0.60). Each of these four SNPs—rs209421 (UBE3D), rs2816985 (NR5A2), rs2268657 (GLP1R) and rs1772945 (OPRM1)—influences BMI during infancy, with minimal effect on birth weight or subsequent BMI, and no direct association with MS risk (association p values all > 0.05).

Individual time point analysis

Secondary analysis at each time point demonstrated results consistent with the epoch-based analysis, albeit with less precision at each individual time point due to the smaller number of variants used for the genetic instruments (supplementary figure 4, supplementary table 5). A Mann–Kendall trend test of the 12 individual time points suggested evidence of a trend in the MR effect estimates (IVW or Wald Ratio where only one SNP was available for analysis; tau = 0.636, 2-sided p value = 0.005).

Although there was no evidence of unbalanced horizontal pleiotropy at any of these time points where there were sufficient SNPs in the instrument to test, there was again evidence of notable heterogeneity at the birth BMI time point (supplementary table 6).

To replicate our negative finding that birth weight does not causally impact MS risk using an external dataset, we repeated the analysis using summary statistics from a larger GWAS of birthweight [14]. This yielded a convincing null (OR 1.20, 95% CI 0.92–1.55, NSNP = 136, p = 0.17) with no evidence to suggest unbalanced horizontal pleiotropy (Egger intercept = − 0.01, p = 0.18).

Discussion

Using two-sample MR and longitudinal data from a Norwegian early-life cohort, we replicate the finding that genetically determined early-life BMI appears to be a causal risk factor for multiple sclerosis. Our results suggest that most of the causal effect of elevated early-life BMI on MS risk is likely to depend on persistence of elevated BMI through to adolescence/early adulthood. We find no evidence for an effect of genetically estimated birth weight on subsequent MS risk. We report an increase in magnitude of the effect size of genetically estimated higher BMI across the life course from early to late childhood.

Using multivariable MR to control for the effects of genetic variants mediated via later-life BMI, we find that this gradient is likely to represent an increasingly strong genetic correlation between BMI at each age epoch and late-adolescent/adult BMI. Our finding of a lack of association between genetically estimated birth weight and subsequent MS risk, in contrast to other age epochs, may reflect the major role of maternal and gestational factors in determining birthweight, which have a lesser impact on BMI in later life.

Using longitudinal GWAS from the same cohort provides an opportunity to understand time-varying exposures, and minimises population stratification compared to using multiple GWAS from different time points in different cohorts. We use a variety of methods to quantify and account for pleiotropy, and demonstrate using the MR-PRESSO method that pleiotropy is unlikely to influence the positive results we observe at the 3 months–1.5 years, 2–5 years, and 7–8 years epochs. We demonstrate the possible value of this approach for unpicking the role of time-varying factors on disease risk, demonstrating a clear trend from a null effect of birthweight to a clear effect of BMI in late childhood, consistent with previous epidemiological findings in MS [8, 10].

Our results support and extend previous observations that higher BMI during adolescence, particularly late teenage years, is a risk factor for MS [4, 8, 10, 19]. Prior MR studies have strengthened the notion that this may be a causal relationship [6, 7]. Although a convincing mechanistic explanation for this association remains lacking, plausible mediators include vitamin D metabolism, the gut microbiome, and the low-grade inflammatory milieu associated with obesity [20]. Our findings are consistent with several possible explanations: suppose SNP a is associated with BMI at time point x, and has a weaker association with BMI at later time point y. It is plausible that:

  • a increases MS risk solely due to its impact on BMI at time point x, or

  • a increases MS risk solely due to its impact on BMI at later time point y, or

  • a increases MS risk due to its impact on BMI at both time points.

Given that the genetic determinants of BMI at different points throughout early life are highly correlated, it is difficult to distinguish these possibilities with available data. It is plausible that a longer ‘exposure’ to high BMI—especially if this exposure coincides with the seemingly critical window during adolescence—is associated with higher MS risk. Intuitively, genetic variants which promote higher BMI over a longer period of time are more likely to exert a greater influence on MS risk. In support of this hypothesis, a recent study used multivariable MR to distinguish the direct effects of childhood BMI on MS risk from those mediated via adult BMI: controlling for the effect of adult BMI abolished the observed MR effect, suggesting that persistently raised BMI into adulthood, rather than transiently increased BMI during childhood, is the causal risk factor [7]. These results suggest that what we observe may be driven by the same phenomenon—pleiotropic SNPs acting on MS risk via their effect on BMI in later life rather than in childhood per se.

Although we employ two approaches to attempt to address this problem—multivariable MR and exclusion of variants showing nominal association with adult BMI—neither approach is perfect. The high genetic correlation between BMI at different time points and the relatively small number of genetic instruments (and therefore variance explained in the exposure) raise concerns about the power and the risk of multicollinearity in the multivariable MR estimates. The presence of multicollinearity—high correlation between genetic effect estimates of each variant on later childhood BMI and adult BMI—may render the multivariable MR estimates unstable and of low precision. Similarly, our attempts to restrict our analysis to SNPs not associated with adult BMI are limited by the genetic overlap between BMI during early/late childhood and adulthood. As larger GWAS and therefore more genetic instruments become available for childhood BMI at different early-life stages, we expect these analyses to become more powerful with the discovery of variants with strongly ‘time-specific’ effects on BMI.

A further weakness of our study is the relatively small number of SNPs included at each time point. The limited variance explained in the exposure—BMI—by the genetic instruments used limits the power of the study, especially given the importance of environmental influences on BMI. While SNPs were collapsed into time epochs based on shared biology and kinetics of BMI during these time windows, we recognise that this represents an arbitrary simplification of the data. Although at individual time points, our MR instruments are under-powered, our epoch instruments all have good power (F statistics > 10) to detect true effects. The clear null effect of birthweight is confirmed using an external, and better-powered, GWAS [14]. A further possible concern is that population stratification may influence our results, as the exposure GWAS (Norwegian) and outcome GWAS (European) are drawn from slightly different populations. There are three reasons why we believe that this does not confound our results. First, the MoBa GWAS were performed on a post-quality control subset of individuals who cluster tightly with reference EUR samples from the CEU population (correspondence with authors). Second, these GWAS results show a strong genetic correlation with the ‘comparative body size at age 10’ trait from UK Biobank. Third, allele frequencies for included SNPs are very similar between this GWAS and reference European datasets such as UKB.

In summary, we provide evidence from longitudinal MR that the relationship between genetically determined early-life BMI and MS appears to follow a gradient, with no effect at birth and a gradually increasing influence on later-life MS risk. This gradient may be accounted for by the effect of these genetic variants on BMI in adulthood. Further work is required to determine whether there is a critical window during which elevated BMI operates as a causal risk factor for MS. Our findings extend previous MR findings in this field and demonstrate the possible value of studying time-varying risk factors with MR.