Introduction

In multiple sclerosis (MS), the most common pathological brain changes are widespread pathology in the white matter (WM) and atrophy of the grey matter (GM) [1]. Although GM atrophy shows stronger associations with clinical dysfunction than WM atrophy [2], large variability between patients’ atrophy rates and disability progression have been reported [3, 4]. It has been suggested that this inter-patient variability of GM atrophy rates arises to a large part because of the different distribution and severity of WM damage between patients, indicating the importance of looking at the GM–WM relationship in anatomically connected regions [5,6,7,8]. However, most studies investigating this had a cross-sectional design and could therefore not draw any conclusions on whether GM atrophy precedes of follows WM damage, and how this effects disability progression in MS.

A recent systematic review by Lie et al. [9] that included 90 studies on the relationship between WM lesions and GM volume showed an inverse association between the 2, particularly in early (relapsing) MS, and less so in progressive MS, suggesting that GM neurodegeneration is mostly secondary to WM damage in the form of lesions in early stages of the disease. Still, a knowledge gap is present since WM damage in MS is not confined to focal lesions, but to microstructural damage as well. Microstructural damage in the WM can be assessed indirectly through quantitative magnetic resonance (MR) measures, such as fractional anisotropy (FA), mean diffusivity (MD), axial diffusivity (AD) and radial diffusivity (RD), which are simplified measures obtained from the diffusion tensor for the overall integrity of the WM [10]. Multi-shell diffusion weighted imaging (DWI) can provide more biophysical properties such like neurite orientation dispersion and density imaging (NODDI) [11], thereby providing more information on microstructural damage in the WM. Recent studies have shown that NODDI parameters such as neurite density index (NDI) and orientation dispersion index (ODI) enable additional characterization of the WM in MS subjects [12, 13]. In particular, NDI can be considered an axonal marker [14]. Other imaging techniques that can be used for further characterization of damage in the WM are quantitative susceptibility mapping (QSM) and myelin water imaging (MWI). QSM may enable us to visualize WM damage and lesion formation before conventional structural imaging could [15], and from MWI, the myelin water fraction (MWF) can be calculated, which can provide information on the myelin content of the WM [16]. Combining these imaging measures may enable us to visualize more subtle WM damage patterns that occur in the early stages of the disease.

In this study, we aimed to investigate whether in early RRMS, the amount and/or type of damage in the WM tracts connected to the GM influences neurodegeneration, or whether damage in the GM influences damage in the connected WM tracts over time, and how this may relate to disability. For this, we studied longitudinal multimodal imaging data to better understand the spatiotemporal relationship between WM and GM damage in early RRMS.

Methods

Subjects

The institutional review board approved the study protocol and all participants gave written informed consent prior to participation, according to the Declaration of Helsinki. To be included in the study, patients had to be diagnosed with clinically definite RRMS [17] and have a disease duration of no more than 5 years. They could be included if they were using no treatment or first-line treatment, but not if they were using more advanced treatment. Their clinical disability levels had to be limited, with a maximum allowed EDSS score of 5.0. In case of switching of treatment, MRI examinations were planned with at least 4–6 months of delay [18]. When steroids were used, MRI was delayed by 3 months [19]. Patients were excluded (over the course of the study) in case of (switching to) second-line treatment to avoid spurious (pseudo)atrophy effects. Patients were seen at 1-year intervals, for extensive MR imaging and evaluation of clinical and neuropsychological performance.

A group of age-, gender- and education-matched healthy controls (HCs) was also included and seen at 1-year intervals for extensive MR imaging. Exclusion criteria for both MS and HC subjects were inability to undergo MRI examination; and past or current clinically relevant neurological, psychiatric or (auto)immune disorders other than MS. The data used in the current study were part of a larger cohort study, some results of which have been reported previously [20].

MR imaging—acquisition

This was a single-center study in which a single MR scanner was used without intermediate upgrades. Imaging was performed on a 3 T whole-body scanner (Discovery MR750, GE Healthcare, Milwaukee, WI., USA) and an eight-channel phased-array head coil. The MR protocol included a sagittal 3D T1-weighted fast spoiled gradient echo sequence (FSPGR with TR/TE/TI = 8.2/3.2/450 ms and 1.0 mm isotropic resolution) and a sagittal 3D T2-weighted fluid attenuated inversion recovery sequence (FLAIR with TR/TE/TI = 8000/130/2338 ms at resolution 1.0 × 1.0x1.2 mm).

Axial 2D DWI acquisitions covering the entire brain (echo planar imaging, TR/TE = 6200/86 ms with 2.0 mm isotropic resolution) with accompanying reference scans with reversed phase-encoding direction were performed. The multi-shell DWI consisted of 7 volumes without diffusion weighting (b = 0 s/mm2) and 88 volumes with non-colinear diffusion gradients (29 images with b = 500 s/mm2 and 59 images with b = 2000 s/mm2). The b-values were interleaved to improve post-processing steps like motion correction [21].

Quantitative susceptibility mapping (QSM) images were acquired using an axial 3D multiple echo GRE sequence, with 7 TEs (TR/TEstart/TEdelta/TEmax = 49.24/5.44/6.55/44.73 ms at resolution 0.5 × 0.5 × 1.6 mm).

Lastly, multicomponent driven equilibrium single pulse observation of T1 and T2 (mcDESPOT) was acquired in an axial 3D slab with 2.5 mm isotropic resolution, using a spoiled gradient recalled sequence (SPGR with TR/TE = 8/3.02 ms with multiple flip angles [3°, 4°, 5°, 6°, 7°, 9°, 13°, 18°]), a balanced steady-state free precession sequence (SSFP with TR/TE = 4.736/2.368 ms, with multiple flip angles [12°, 16°, 21°, 27°, 33°, 40°, 51°, 65°], with phases 0° and 180° for the odd flip angles and phases 90 and 270 degrees for the even flip angles), and a 2D axial B1 map (TR/TE = 17/12.8 ms at lower resolution 4.0 × 4.0 × 5.0 mm). High-order shimming was performed for multi-shell DWI, QSM, and mcDESPOT acquisitions.

MR imaging—analysis

An overview of the different MRI post-processing streams used in this study is shown in Fig. 1 and analyses are described below.

Fig. 1
figure 1

Overview of MR imaging analysis. Top row: T1 and FLAIR images were used to create a lesion mask (nicMSlesions) and lesion filled (LEAP) image, which was used as input for longitudinal SAMSEG for subcortical segmentation and longitudinal FreeSurfer for cortical parcellation. Middle row: multi-shell diffusion weighted images (DWI) were used to obtain FA and MD in the WM mask. A tensor image from DTI-TK was used to register FreeSurfer ROIs, and probtrackx was used for tractography (example from thalamus tracts and postcentral tracts). Bottom row: NODDI was used to obtain NDI and ODI maps; SPGR images were used to create MWF maps; magnitude and phase (not shown) images were used to create QSM maps

Structural imaging analysis

Brain extraction was performed using FMRIB Software Library (FSL) version 5.0.10 brain extraction tool BET [22, 23] and N3 bias field correction with 3T optimized parameters was performed with FreeSurfer version 6.0 [24]. Lesion segmentation was performed with the deep-learning algorithm nicMSlesions version 0.2 [25, 26] which was optimized for our data in an earlier study [27]. In summary, full re-training of the nicMSlesions neural network (11 layers) was done with the use of manual lesion segmentations available from fourteen subjects with MS. All parameters were set at default. On the resulting lesion probability map, the optimized probability threshold of 0.4 was applied and lesions smaller than five voxels were removed.

Lesion filling was performed with LEsion Automated Processing (LEAP) [28] and lesion filled images were processed with the longitudinal pipeline of FreeSurfer [29,30,31], using a template-driven approach to provide a detailed parcellation and segmentation of the cortex and subcortical structures [32, 33]. Cortical thickness was obtained from FreeSurfer and brain volumes were obtained from the longitudinal pipeline of Sequence Adaptive Multimodal SEGmentation (SAMSEG) [34, 35]. To control for differences in head size, whole brain (WB), WM and DGM brain volumes were normalized and calculated as a percentage of the segmentation-based total intracranial volume (normalized WB [nWB], normalized WM [nWM], and normalized DGM [nDGM], respectively). In addition, nDGM volume and CGM thickness were converted into z-scores for each structure separately, based on the entire cohort. In our analysis, two DGM structures (nucleus accumbens and amygdala) and three CGM structures (entorhinal cortex, frontal pole and temporal pole) were excluded from analyses due to their known measurement variabilities [36].

Diffusion imaging analysis

All images were corrected for susceptibility induced geometric distortions using FSL topup [37, 38], and for movement and eddy currents with FSL eddy [39]. The diffusion tensor was fitted on all b = 0 mm/s2 and b = 2000 s/mm2 images to obtain the FA, MD, AD and RD values.

The multi-shell ball and stick model of bedpostx [40] was used to estimate the voxel-wise diffusion parameter distribution. Probabilistic tractography using 5000 streamlines was performed with probtrackx2 [41,42,43]. After linear registration of DWI to 3DT1, the inverse transformation was used to register the FreeSurfer regions of interest (ROIs) to DWI space using nearest-neighbour interpolation; thus per hemisphere, 5 DGM and 31 CGM regions were available as seed and target ROIs for tractography. A midline exclusion mask was used to ensure tracts could not cross hemispheres, except through the corpus callosum, fornix and brainstem. Moreover, the target ROIs were used as both waypoint and termination masks.

Probabilistic tractography was performed on year-1 scans, after which the tracts were propagated to the year-2 scans using Diffusion Tensor Imaging ToolKit (DTI-TK) version 2.3.1 [44, 45], to create a spatially normalized tensor within-subjects template. DTI-TK makes use of an affine registration algorithm with explicit tensor reorientation optimization [46]. The longitudinal pipeline as proposed by Keihaninejad et al. [47] was used. A DTI-TK within-subject template was created for each subject from the b = 2000 mm/s2 images. With these templates, the year-1 probabilistic connectivity distribution was registered to year-2 images for each subject.

Neurite Orientation Dispersion and Density Index (NODDI) Matlab Toolbox version 1.0.1 was used to obtain the neurite density index (NDI) and orientation dispersion index (ODI). Voxels for which the NODDI model failed (i.e. all voxels with fiso > 0.99; ficvf > 0.99; kappa = 0.05; and all error-code voxels) [11] were filtered out to create a tissue mask image to apply to the NDI and ODI images.

QSM imaging analysis

SuscEptibility mapping PIpeline tool for phAse image (SEPIA) version 0.7.2 [48] containing the Susceptibiltiy Tensor Imaging (STI) Suite [49] in MATLAB (The MathWorks, Natick, MA) was used for processing of magnitude and phase images after applying a brain mask obtained with FSL BET. After phase unwrapping with the optimum weights-laplacian-based method and background field removal with variable radius kernel, QSM images were estimated with iterative sparse linear equation and least square (iLSQR) with susceptibility of the whole brain defined as 0 parts per million (ppm). QSM maps were registered to DWI space of the corresponding time point.

Myelin water imaging analysis

From B1, SPGR and SSFP images, myelin water fraction (MWF) maps were estimated using qimcdespot (from Quantitative Imaging Tool [QUIT] [50] version 2.0.2), assuming a three-component model with exchange. Upper and lower bounds were slightly adapted based on inspection of histograms and in-house optimization within a group of HC subjects. MWF maps were registered to DWI space of the corresponding time point.

WM measures in probabilistic tracts

The probabilistic tracts were multiplied by the binary WM mask, and for all measures (i.e. FA, MD, AD, RD, NDI, ODI, MWF and QSM), mean values per tract were extracted, weighted by the connectivity probability to emphasize the tract center, and decrease the effect of spurious tracts. We determined all WM measures for the whole tract, consisting of normal-appearing (NA)WM and lesional WM, but also perilesional WM.

In order to combine WM measures, z-scores were calculated per tract for each WM measure, based on the entire cohort, and z-scores were combined depending on which information was present. We calculated a diffusion z-score (WM-Diffusion = (zFA-zMD-zRD)/3) as well as an extended WM z-score (WM-Extended = (zFA – zMD – zRD + zNDI + zMWF)/5), when data were available.

Clinical and neuropsychological evaluation

Patients’ medical history was taken, including the occurrence of relapses and changes in therapy. Neuropsychological evaluation included the Symbol Digit Modalities Test (SDMT) [51] to measure information processing speed. Two parallel test versions were used to minimize learning effects. SDMT scores were corrected for age, sex and educational level and transformed into z-scores, based on a normative sample of Dutch healthy controls (n = 407) [52].

Patients’ physical disability was measured with the Expanded Disability Status Scale (EDSS) questionnaire [53], the 9 Hole Peg Test (9-HPT) and the 25 Foot Walk Test (25-FWT) [54]. From these three, EDSS + progression was determined as a descriptive, binary, marker of physical disability progression [55].

To also take cognitive disability into account in this group of early RRMS subjects with relatively low EDSS scores, Multiple Sclerosis Functional Composite (MSFC) [54] scores were calculated from 9-HPT, 25-FWT and SDMT z-scores. For each of the three sections, results were considered impaired upon z-score ≤ -1.5. Over time, RRMS subjects were classified as MSFC decliner when MSFC scores were lower at year-2 compared to year-1, and as MSFC non-decliner when MSFC scores were equal or higher at year-2 compared to year-1.

Statistics

HC vs RRMS comparisons for baseline demographics were performed by independent samples t-test, Mann–Whitney U-test, or Chi Square test, when appropriate. Differences in the RRMS group over time were analyzed with paired t-test, Wilcoxon Signed Ranks test, or McNemar test, when appropriate.

To analyze HC and RRMS differences over time in normalized brain volumes, cortical thickness, and z-scores for nDGM, CGM or WM, linear mixed models (LMM) with subject as random intercept were used, with fixed factors time, type (i.e. HC/RRMS) and time*type. When appropriate, LMM analysis was also performed to assess baseline differences between the two groups. Since application of an LMM requires constant variance in errors, z-scores for both WM and GM measures were used.

LMM analysis with subject as random intercept and type (i.e. HC/RRMS or MSFC non-decliner/decliner) as fixed factor was used for longitudinal analysis of DGM/CGM z-score relations with WM z-scores in the connected tracts. For the longitudinal relations, we analyzed both the effect of baseline WM z-scores on change in the connected GM z-scores (i.e. baseline WM to ΔGM), and the effect of baseline GM z-scores on change in WM z-scores in the connected tracts (i.e. baseline GM to ΔWM); both with type (HC/RRMS) or MSFC group (declining/non-declining) as covariates. When an interaction effects were present, the cohort was split based on type or MSFC group and LMM analysis were performed as described above. Analyses for DGM and CGM were performed separately.

Exploratory binary logistic regression was used to predict whether a subject would belong to the MSFC non-declining or MSFC declining group over time by looking at baseline WM or baseline GM values only; as well as combining these measures with demographics such as sex, age, education, treatment type and EDSS at baseline. Statistical analyses were performed using SPSS26 (IBM SPSS, Chicago, USA) and p-values were considered statistically relevant upon p ≤ 0.05. Since LMM analysis was performed multiple times (i.e. for every GM or WM measure as possible outcome separately), p-values were considered statistically significant upon p ≤ 0.01.

Results

Demographics

In total, 40 subjects with early RRMS and 15 age-and-sex-matched HCs were included in the study. A total of 11 subjects did not complete both year-1 and year-2 measurements and were, therefore, excluded from analyses in the current study (two HC subjects [inability to undergo MRI examination] and nine RRMS subjects [n = 5 switch to second-line therapy; n = 1 switch to first-line therapy just prior to examination; n = 3 inability to undergo MRI examination unrelated to RRMS]). This resulted in a total of 44 age-, sex-, and education-matched subjects (13 HC, 31 RRMS) with full data available for analysis, for whom demographics are depicted in Table 1. Follow-up time for RRMS and HC subjects was similar (1.00 ± 0.10 years and 0.99 ± 0.10 years, respectively).

Table 1 Demographics for HC and RRMS subjects; and RRMS clinical characteristics over time

RRMS subjects had an average disease duration of 3.5 ± 1.4 years at baseline measurements with median EDSS 3.0 (range 0.0–5.0). A total of 6 subjects (19%) showed EDSS + progression during follow-up. 9-HPT and SDMT scores did not significantly differ from year-1 to year-2, but worsening of 25-FWT results was seen (Z = – 2.930, p = 0.003). A total of ten subjects (32%) had at least one impaired MSFC section (i.e. 9-HPT, 25-FWT and/or SDMT) at year-1, and seven subjects (23%) at year-2. Total MSFC score did not change significantly over time, but a total of n = 15 subjects showed an overall decline in MSFC score from year-1 to year-2 (MSFC decliners), whereas n = 16 did not (MSFC non-decliners). No significant differences in age, sex, education, disease duration, treatment time/type, or number of relapses in the previous year were found between the MSFC decliners and non-decliners.

MRI characteristics

MWI data were not available for 2 of 13 HC subjects and 7 of 31 RRMS subjects at year-1 (MSFC non-decliners n = 2 and MSFC decliners n = 5, respectively), and for 1 RRMS subject (MSFC non-decliner) at year-2. QSM data were not available for one RRMS subject (MSFC decliner) at year-1, and for one RRMS subject (MSFC non-decliner) at year-2.

HC vs RRMS subjects

MRI characteristics for HC and RRMS subjects over time are shown in Table 2. Larger atrophy rates in RRMS compared to HC subjects were found for normalized WB, WM and DGM volumes (p = 0.024, p = 0.018 and p = 0.026, respectively). RRMS subjects showed generally lower z-scores of DGM volume, CGM thickness, and measures of WM integrity (e.g. lower FA, higher MD; or more specifically lower NDI as axonal damage marker, and lower MWF as myelin damage marker), but no significant differences were found between HC and RRMS, neither cross-sectionally nor longitudinally.

Table 2 MRI characteristics of HC and RRMS subjects over time

In both DGM- and CGM-seeded tracts, WM-Extended z-scores decreased less over time in RRMS compared to HC subjects (DGM: B = 0.290, SE = 0.090, p < 0.001; CGM: B = 0.122, SE = 0.037, p < 0.001), but the WM-Extended z-scores were also already lower at year-1 in RRMS compared to HC subjects (DGM: B = 0.464, SE = 0.200, p = 0.024; CGM: B = 0.358, SE = 0.190, p = 0.066), although not reaching significance.

MSFC non-declining vs MSFC declining subjects

Comparing the RRMS subjects based on their declining or non-declining MSFC score over time (Table 3), we found significantly higher atrophy rates for normalized WB, WM, GM and DGM volumes (p = 0.013, p = 0.029, p = 0.022 and p = 0.021, respectively) in the MSFC declining group compared to non-declining group. CGM thickness as well as CGM z-score also decreased faster over time in MSFC declining compared to non-declining subjects (p = 0.010 and p = 0.019, respectively).

Table 3 MRI characteristics of RRMS subjects with non-declining or declining MSFC score over time

In both DGM and CGM tracts, several WM z-scores declined faster in MSFC declining compared to non-declining subjects (e.g. WM-Extended: B = 0.259, SE = 0.094, p = 0.006 for DGM, and B = 0.176, SE = 0.040, p < 0.001 for CGM, respectively). Interestingly, this was highly significant for changes in myelin-related measures (MWF and RD), but not for the axonal marker NDI. For the CGM, but not DGM, overall WM integrity was significantly lower at both time points in MSFC declining subjects compared to MSFC non-declining subjects (e.g. WM-Extended: B = – 0.436, SE = 0.204, p = 0.040).

Effect of baseline WM integrity on change in connected GM over time

The relation between baseline WM integrity and changes in DGM volume and CGM thickness over time is shown in Table 4. Table 4A shows significant relations between baseline WM-Diffusion, WM-Extended, FA and RD z-scores and the change in CGM thickness in the entire cohort (e.g. WM-Extended: B = – 0.055, SE = 0.019, p = 0.005). This relationship between baseline WM and CGM thickness change was stronger in RRMS compared to HC subjects for WM-Diffusion, WM-Extended, MD, RD and NDI (e.g. WM-Diffusion*Type: B = 0.088, SE = 0.025, p < 0.001), thus including both axonal and myelin markers. When the MSFC non-declining group was compared with the MSFC declining group, similar effects were observed as when comparing HC with RRMS (Table 4B): also here the baseline WM z-scores had a WM*Type relationship in CGM (e.g. WM-Extended*MSFC type: B = 0.120, SE = 0.031, p = 0.001), where the relationship between WM damage and CGM thickness change was stronger in MSFC decliners than in non-decliners, and involved both markers of myelin damage in the WM (FA, RD) and the marker of axonal damage (NDI). No significant relations between WM z-scores and change in DGM volumes over time were seen, neither over the entire cohort, nor for the MSFC groups.

Table 4 Effect of baseline WM integrity z-score on change in DGM volume z-score and CGM thickness z-score over time; compared in HC/RRMS (A), MSFC non-declining/declining (B), and split per group (C; CGM only)

Post hoc analysis in HC and RRMS groups separately (Table 4C) showed that lower WM integrity related to a decrease in cortical thickness over time in RRMS subjects (e.g. WM-Extended: B = 0.033, SE = 0.015, p = 0.030), whereas HC subjects showed an opposite relationship (e.g. WM-Extended: B = – 0.055, SE = 0.019, p = 0.004). Post hoc analysis splitting in the two MSFC groups showed that the WM–GM relationship was only significant in subjects whose MSFC score declined over time (e.g. WM-Extended: B = 0.098, SE = 0.025, p < 0.001), and not in subjects whose MSFC did not (e.g. WM-Extended: B = – 0.021, SE = 0.019, p = 0.272).

Effect of baseline GM on change in connected WM integrity over time

The relation between baseline DGM volume or CGM thickness and changes in WM integrity over time is shown in Table 5. Table 5A shows there were no significant relations between baseline GM z-scores and WM z-scores over time, except for a different association in HC and RRMS subjects of baseline DGM volume on AD change over time, which was also seen between the two MSFC groups (Table 5B). CGM thickness z-scores did not relate significantly to WM change over time.

Table 5 Effect of baseline DGM volume z-score or CGM thickness z-score on change in WM integrity z-score over time; compared in HC/RRMS (A), MSFC non-declining/declining (B), and split per group (C; DGM only)

Post hoc analysis in HC and RRMS groups separately (Table 5C) showed that in RRMS subjects, lower DGM volumes related to a decrease in AD in the connected tracts over time (B = 0.109, SE = 0.042, p = 0.009), an effect that was not found in HC subjects (B = – 0.130, SE = 0.075, p = 0.086). Furthermore, this was only seen in subjects whose MSFC would decline over time (B = 0.203, SE = 0.050, p < 0.001), and not in subjects whose MSFC would not decline over time (B = – 0.002, SE = 0.066, p = 0.977).

Predicting MSFC decline based on baseline WM and GM parameters

Exploratory binary logistic regression to study the relation between baseline imaging parameters and the MSFC groups showed that WM-Extended z-scores were better predictors than WM-Diffusion z-scores (Nagelkerke R2 = 0.594 versus 0.362, respectively), and that either of the WM z-scores were better predictors than GM z-scores only (Nagelkerke R2 = 0.287) in a model including sex, age, education, treatment type and baseline EDSS score. A total of 78% of subjects could be classified correctly (MSFC-non-declining: 84%; MSFC-declining: 69%) in a model based on sex, age, education, treatment type, EDSS, GM z-score, WM-Diffusion z-score and WM-Extended z-score (Nagelkerke R2 = 0.611).

Discussion

This longitudinal study in early RRMS explored the spatial and temporal relations between GM damage and WM integrity in the connected tracts using multimodal MRI. Our most important finding was that lower baseline WM integrity, as determined with DTI, NODDI and MWF, related to increasing atrophy of the connected cortical GM over time in subjects with RRMS, and especially for those who experienced increasing disability over the study period. In contrast, lower baseline cortical GM thickness or deep GM volume did not relate to WM integrity changes in the connected tracts over time.

These results suggest that in early RRMS, damage of the WM precedes atrophy of the cortical GM and that this relationship is clinically relevant since it was only found in subjects with worsening of their overall MSFC score over time. Cortical thinning as a result of preceding lower WM integrity in connected tracts has been suggested previously [56, 57], and we extend those findings by studying this longitudinally in early RRMS. Accelerated CGM atrophy has previously also been found upon increasing clinical progression independent of relapse activity [58], and our results confirm this already in the very early stages of the disease. For progressive MS, it was earlier found that WM damage preceded cortical GM damage [6], and a recent combined MRI-histopathological study showed that this WM-integrity-related cortical thinning could be attributed to GM axonal density loss, rather than myelin or microglia density loss within GM [59].

Our study adds important new insights, by demonstrating that in WM tracts, both axonal damage and myelin damage are related to subsequent cortical thinning. That myelin damage, as assessed here through MWF and RD, would be implicated, may not come as a surprise given the relations between demyelinating focal WM lesions and GM atrophy in MS described previously and reviewed systematically in [9]. Nonetheless, the fact that quantitatively assessed myelin damage in whole WM tracts was related to subsequent thinning of the connected cortex in RRMS, is novel. While (higher) RD was highly significantly more strongly related to subsequent cortical thinning in RRMS than HC, (lower) MWF failed to reach significance. In the post hoc analyses in RRMS alone, both RD and MWF just failed to meet the significance threshold. However, in the most affected patient group, those with RRMS who exhibited MSFC decline during the study, both RD and MWF at baseline were highly significantly related to subsequent thinning of the connected cortex. Future studies, ideally in larger groups but maintaining the homogeneity of image acquisition adhered to here, should investigate the role of myelin damage in WM in the process of cortical thinning in RRMS in more detail.

Axonal damage is another important component of WM damage in MS. With NDI, this study included a quantitative marker of this damage. Lower NDI in WM tracts was significantly related to more subsequent thinning of the connected cortex in; whether assessing its effect in RRMS versus HC; in MSFC-declining versus non-declining RRMS; or in the post hoc analyses on RRMS separately and in the MSFC-declining subgroup. The recurring strength of this marker suggests an important role for WM axonal damage in the development of subsequent cortical thinning.

The finding that this process may already be ongoing in early RRMS may also inform therapeutic choices: treatments that would be developed to target primarily neurodegeneration may have limited effect, since atrophy may be secondary to WM damage. In the current study, we did not find any significant influences of WM integrity on atrophy of DGM in either HC or RRMS subjects; and also not in either of the two MSFC groups. Correlations between WM damage and connected DGM volumes have been shown in earlier studies, although these studies focused only on lesion volume and not on overall integrity of the entire tract [9, 60]. Studies that did look at overall integrity of the tract at baseline and its effect on atrophy focused mainly on the thalamus [7, 61, 62]. Interestingly, a cross-sectional study showed that connectivity of thalamocortical tracts related to cortical, but not thalamic, atrophy [63], which may be in line with our results where we did not find any relations between WM and DGM atrophy, but only with CGM atrophy. It is known that MS DGM segmentation has some methodological issues[64], and therefore we took some precautions: subjects’ brain volumes were measured at similar times of the day to limit diurnal volume fluctuations; specific longitudinal segmentation software optimized in subjects with MS was used[34]; amygdala and nucleus accumbens were excluded from analysis due to known difficulties in segmentation of these small structures[36]; and DGM volumes were normalized and converted into z-scores per structure to enable grouping of large and small DGM structures within one LMM. However, where our linear mixed model method was well suited for analysis of the CGM tracts (i.e. damage may occur in different tracts and different cortical areas between MS subjects), it may be less suitable for the analysis of DGM tracts, and analyses per DGM structure may be more informative. Due to the small sample size, we did not further investigate the separate DGM structures in the current study, but this remains of interest for future, larger, studies.

The fact that anterograde neurodegeneration from WM integrity loss was mainly found in subjects who showed clinical progression (i.e. MSFC decliners), further indicates the clinical relevance of these disease mechanisms. It should be noted that the distinction between decliner and non-decliner is only subtle, in the 1-year follow-up of this cohort of 31 RRMS subjects. However, although the groups only consisted of 15 declining and 16 non-declining RRMS subjects, exploratory regression analysis suggested importance of baseline WM integrity (including NDI and MWF) over baseline GM scores, on MSFC progression over time. Although implications for clinical progression are subtle and preliminary, future studies may benefit from including measures of the integrity of the tract over lesion volume only, and especially from the addition of NDI and MWF as important axonal and myelin markers, respectively, to standard DTI analyses.

This study has some limitations. First, our sample size was relatively small and for a few subjects, MWF and QSM values were not available; this was mitigated by using mixed models analyses that take all data-points into account. Although our study follow-up time was short, significant atrophy patterns as well as WM–GM relations could be recognized already early in the disease. Larger sample sizes and longer follow-up times might increase the power of these findings.

Furthermore, while it would be interesting to study the relation with GM atrophy separately for damage in WM lesions in connected tracts and for damage in NAWM in connected tracts, we have chosen not to pursue such analyses in the present work. Because of the small numbers of lesions per person in our study group of people with early MS, and because the anatomical locations of those lesions vary from patient to patient, a large number of WM tracts will not contain any lesion. The small amount of data will lead to a low statistical power regarding the relation between that lesion-bound damage on the one hand, and atrophy of the connected GM regions on the other. By contrast, by quantifying the tract-specific damage through the quantitative MR measures for the whole tract, regardless of whether lesions were present, we were able to include all tracts in all patients in our analyses. This allowed for a comprehensive analysis, while also limiting the number of statistical tests. Nonetheless, the extraction of weighted mean values from the whole WM tract may be less optimal for QSM, due to opposite effects on susceptibility of demyelination and iron deposition[15], and so for QSM separate analysis of lesions, perilesional WM and NAWM may be informative. For future studies, it may be interesting to also look at the microstructural measures in the DGM and CGM itself. In this way, we can investigate not only the effect of WM integrity on atrophy of the connected GM, but also on the neurobiological processes underlying this neurodegeneration, such as demyelination and axonal loss.

Conclusion

Lower baseline WM integrity related to increasing cortical atrophy in RRMS subjects that show clinical progression over a 1-year follow-up. Baseline GM did not influence WM integrity changes over time. Collectively, this suggests that white matter tract damage drives cortical neurodegeneration in early RRMS.