Introduction

Thanks to the introduction of many effective novel drugs with different mechanisms of action, treatment of MS patients is evolving dramatically, with significant reductions in annualized relapse rate (ARR), disability progression, and the accumulation of MRI-detectable damage.

Fingolimod and natalizumab are treatments approved for relapsing–remitting (RR) MS patients with high disease activity. By preventing leukocytes from reaching the CNS, they exert a strong anti-inflammatory effect that limits the formation of focal lesions and irreversible tissue loss [1, 2].

Several phase III randomized clinical trials (RCTs) [3,4,5,6,7] have shown greater benefits of both fingolimod and natalizumab over placebo or interferon beta (IFN-β) on clinical and MRI disease activity. The only RCT performed with a head-to-head comparison, terminated before completion, showed a stronger and faster reduction of disease activity with natalizumab than with fingolimod in active RRMS patients [8].

Recent observational studies have reported inconsistent effects on clinical and MRI disease activity, with some describing a similar efficacy for the 2 drugs [9, 10], but others showing higher effects for natalizumab [11,12,13,14,15]. Some studies [11, 13, 14] included MRI outcome measures, but they were limited to measures of inflammation (i.e., new T2-hyperintense and gadolinium [Gd]-enhancing lesions), without including measures of neurodegeneration. Many factors could explain discrepancies among studies, including differences in baseline clinical and MRI features of the cohorts recruited, and follow-up duration. A recent meta-analysis, which combined data from RCTs and observational studies, suggested a superiority of natalizumab compared to fingolimod in preventing disease activity [16].

Although some observational studies suggested a positive effect of both natalizumab and fingolimod on cognitive performance, fatigue, and depression [17,18,19,20,21], none has directly compared their effects on these measures.

Against this background, we investigated the effects of natalizumab and fingolimod on clinical and MRI measures of inflammation (relapses and active MRI lesions) over a 2-year period in RRMS patients starting these treatments. Additionally, their effects on disability, cognition (including depression and fatigue), and MRI measures of neurodegeneration (T1-hypointense lesions and atrophy) were assessed.

Methods

Study Design

This was a single-center, prospective, longitudinal, open-label, nonrandomized study. Figure 1 summarizes the main inclusion and exclusion criteria and the study flowchart. Between September 2011 and July 2016, of 104 consecutive RRMS patients starting fingolimod or natalizumab, 25 treated with fingolimod and 30 treated with natalizumab completed the study and were included in the analysis. All patients underwent clinical and MRI evaluation at baseline (T0) (± 10 days from treatment initiation), month 6 (M6) (± 3 days), month 12 (M12) (± 7 days), and month 24 (M24) (± 7 days). Approval was received from the local ethical standards committee, and written informed consent was obtained from all study participants prior to enrolment.

Fig. 1

Study flowchart. Abbreviations: RRMS = relapsing–remitting multiple sclerosis; AIFA = Agenzia Italiana del Farmaco (Italian Medicine Agency); EDSS = Expanded Disability Status Scale; Gd = gadolinium

Clinical and Neuropsychological Evaluation

At each visit, the Expanded Disability Status Scale (EDSS) score, the Multiple Sclerosis Functional Composite (MSFC) score, relapses, and ARR were assessed by a neurologist unaware of the MRI results. At T0, treatment history and ARR during the 2 years before treatment initiation were also recorded. At follow-up, confirmed disability progression (CDP) was defined as an EDSS score increase of ≥ 1.0 point if the baseline EDSS score was ≥ 1.0, or of ≥ 1.5 points if the baseline score was 0 [6, 7], sustained for 3 months.
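The CDP rule can be written as a short decision function. The following is a minimal sketch, not the authors' implementation: the function names and the boolean `sustained_3_months` flag are hypothetical, and because the text gives no threshold for a baseline EDSS of 0.5, the sketch raises an error for that case rather than guessing.

```python
def cdp_threshold(baseline_edss: float) -> float:
    """Minimum EDSS increase that counts as progression under the stated rule."""
    if baseline_edss >= 1.0:
        return 1.0
    if baseline_edss == 0:
        return 1.5
    # The text does not give a threshold for baselines between 0 and 1.0,
    # so no value is assumed here.
    raise ValueError("threshold not specified for this baseline EDSS")


def has_cdp(baseline_edss: float, followup_edss: float,
            sustained_3_months: bool) -> bool:
    """True when the EDSS increase meets the threshold and is sustained."""
    increase = followup_edss - baseline_edss
    return increase >= cdp_threshold(baseline_edss) and sustained_3_months
```

For example, a patient moving from EDSS 2.0 to 3.0 with the increase sustained for 3 months would be classified as having CDP, whereas a patient moving from 0 to 1.0 would not.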

No evidence of disease activity 3 (NEDA-3), defined as no clinical relapses, no MRI activity (new/enlarging T2-hyperintense or Gd-enhancing lesions on MRI), and no EDSS progression, was also evaluated at follow-up.
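The NEDA-3 definition is a conjunction of three conditions, which the sketch below makes explicit. The `FollowUp` record and its field names are hypothetical and are introduced only for illustration.

```python
from dataclasses import dataclass


@dataclass
class FollowUp:
    """Hypothetical per-patient summary of the follow-up period."""
    relapses: int
    new_or_enlarging_t2_lesions: int
    gd_enhancing_lesions: int
    edss_progression: bool


def is_neda3(f: FollowUp) -> bool:
    """No relapses, no MRI activity, and no EDSS progression."""
    no_relapses = f.relapses == 0
    no_mri_activity = (f.new_or_enlarging_t2_lesions == 0
                       and f.gd_enhancing_lesions == 0)
    return no_relapses and no_mri_activity and not f.edss_progression
```

A single new T2-hyperintense lesion is enough to lose NEDA-3 status even in a patient with no relapses and a stable EDSS score.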

Cognitive performance was assessed at T0, M12, and M24 by a neuropsychologist, unaware of the MRI results, using 2 parallel versions of the Brief Repeatable Battery of Neuropsychological Tests. Performance was abnormal when the test score was below the 5th percentile of the normative values for the Italian population, and patients with at least 2 abnormal tests were considered cognitively impaired. Longitudinal changes of test performance were assessed using the Reliable Change Index (RCI).
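The impairment criterion and a reliable change index can be sketched as follows. This is an illustration under stated assumptions: the test names and cutoffs are hypothetical, lower scores are assumed to indicate worse performance, and the RCI uses the classical Jacobson and Truax form (score difference divided by the standard error of the difference), which may differ from the variant used in the study, for example if practice effects were corrected for.

```python
import math


def is_cognitively_impaired(scores: dict, fifth_percentile_cutoffs: dict) -> bool:
    """Impaired when at least 2 tests fall below their 5th-percentile cutoff."""
    abnormal = sum(
        1 for test, score in scores.items()
        if score < fifth_percentile_cutoffs[test]
    )
    return abnormal >= 2


def reliable_change_index(score_baseline: float, score_followup: float,
                          sd_baseline: float, test_retest_r: float) -> float:
    """Classical RCI: change divided by the standard error of the difference."""
    sem = sd_baseline * math.sqrt(1 - test_retest_r)  # standard error of measurement
    s_diff = math.sqrt(2) * sem                       # standard error of the difference
    return (score_followup - score_baseline) / s_diff
```

Under this form, an absolute RCI above 1.96 is conventionally read as a change unlikely to reflect measurement error alone.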

Fatigue and depression were quantified using the modified fatigue impact scale (MFIS) and the Montgomery–Asberg Depression Rating Scale (MADRS), respectively.

MRI Acquisition

Using a 3.0-Tesla scanner (Intera, Philips Medical Systems, Best, The Netherlands) under a regular maintenance program (no major scanner hardware or software upgrade occurred during the study), the following brain images were acquired from all participants: a) dual-echo turbo spin-echo (repetition time [TR]/echo time [TE] = 2599/16.80 ms; echo train length [ETL] = 6; flip angle [FA] = 90°; matrix size = 256 × 256; field of view [FOV] = 240 × 240 mm²; 44 contiguous, 3-mm-thick axial slices); b) 3D T1-weighted fast field echo (TR/TE = 25/4.6 ms; FA = 30°; matrix size = 256 × 256; FOV = 230 × 230 mm²; 220 contiguous axial slices; voxel size = 0.89 × 0.89 × 0.8 mm); c) postcontrast (0.1 mmol/kg of Gd-DTPA; acquisition delay = 5 min) T1-weighted inversion recovery sequence (TR/TE/inversion time [TI] = 2000/10/800 ms; ETL = 5; FA = 90°; matrix size = 400 × 320; FOV = 230 × 195.5 mm²; 44 contiguous, 3-mm-thick axial slices). For all scans, the slices were positioned parallel to a line joining the most infero-anterior and infero-posterior margins of the corpus callosum, with careful repositioning at follow-up.

MRI Analysis

At each visit, T2-hyperintense, T1-hypointense, and Gd-enhancing lesion volumes (LVs) were measured using a local thresholding segmentation technique (Jim 6.0 software, www.xinapse.com) by consensus of 2 observers blinded to subjects’ data. The numbers of Gd-enhancing lesions and new T2-hyperintense and T1-hypointense lesions were also counted.

Atrophy Assessment

After T1-hypointense lesion refilling, normalized brain volume and longitudinal percentage brain volume change (PBVC) were assessed on the 3D T1-weighted images using the SIENAx and SIENA software, respectively.

Baseline gray matter (GM) and white matter (WM) volume fractions were estimated using SPM12: GM and WM tissues were first segmented on the 3D T1-weighted images, and intracranial volume (ICV) was evaluated. Then, the ratio of the tissue volume to the respective ICV was computed. For longitudinal GM volume (GMV) and WM volume (WMV) change quantification, a pipeline for atrophy assessment was built by combining the “longitudinal pairwise registration” from SPM12, with the Jacobian integration technique, as previously described [22].

Deep GM nuclei were segmented on the 3D T1-weighted images using FIRST (http://www.fmrib.ox.ac.uk/fsl/first), and their volumes were divided by ICV to correct for head size. The total volume of all deep GM (including the thalamus, putamen, pallidum, amygdala, and accumbens) was then obtained, and the percentage change relative to T0, M6, or M12 was used in longitudinal assessments.
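The two normalizations described above, division by ICV and percentage change relative to a reference time point, reduce to simple arithmetic. The sketch below uses hypothetical function names and illustrative numbers only. It does not reproduce SIENA, SPM12, or FIRST, and it does not apply to the longitudinal GMV and WMV changes, which were obtained with Jacobian integration.

```python
def volume_fraction(tissue_volume_ml: float, icv_ml: float) -> float:
    """Tissue volume divided by intracranial volume, to correct for head size."""
    return tissue_volume_ml / icv_ml


def percentage_change(reference: float, followup: float) -> float:
    """Percentage change of a follow-up value relative to the reference value."""
    return 100.0 * (followup - reference) / reference


# Illustrative values only, not study data.
deep_gm_t0 = volume_fraction(45.0, 1500.0)    # normalized deep GM volume at T0
deep_gm_m12 = volume_fraction(44.7, 1500.0)   # normalized deep GM volume at M12
change_t0_to_m12 = percentage_change(deep_gm_t0, deep_gm_m12)
```

Using M6 or M12 as the `reference` argument instead of T0 corresponds to the rebaselined comparisons.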

MRI-derived volumes were also compared with a group of age- and sex-matched healthy controls (HC) (n = 17, 10 females; mean age [standard deviation (SD)] = 34.9 [11.0] years) scanned for research purposes using the same MRI scanner and protocol at T0 and M24.

Statistical Analysis

Variables were reported as mean and SD, median and interquartile range, or count and relative frequency. Baseline characteristics were compared between treatment groups using Student’s t test, the Mann–Whitney test, or the χ2 test. Standardized differences between groups were calculated according to the method of Yang and Dalton, and groups were considered imbalanced when the standardized difference exceeded 0.10. To assess within- and between-group longitudinal changes, a linear mixed model with random intercept was adopted for continuous and nonskewed variables, and residuals were checked graphically for normality. T2-hyperintense and T1-hypointense LVs were log-transformed before analysis. A negative binomial model or a logistic model was used for count data or binary characteristics, respectively. To adjust for baseline between-group differences, analyses were weighted by the inverse probability of treatment calculated using a logistic model including all baseline demographic and clinical characteristics that showed a standardized difference > 0.10.
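The balance check and weighting step can be illustrated as follows. This is a sketch under assumptions, not the study code (which was run in Stata): it shows the usual standardized difference for a continuous variable, and it takes the propensity scores as given rather than fitting the logistic model. Function names and inputs are hypothetical.

```python
import math
import statistics


def standardized_difference(group_a, group_b) -> float:
    """Difference in means divided by the pooled SD (continuous variable)."""
    mean_a, mean_b = statistics.fmean(group_a), statistics.fmean(group_b)
    var_a, var_b = statistics.variance(group_a), statistics.variance(group_b)
    pooled_sd = math.sqrt((var_a + var_b) / 2)
    return (mean_a - mean_b) / pooled_sd


def iptw_weights(treated, propensity):
    """Inverse probability of treatment weights.

    treated: booleans, True for the group whose treatment probability is modelled.
    propensity: estimated probability of receiving that treatment.
    """
    return [1 / p if t else 1 / (1 - p) for t, p in zip(treated, propensity)]


def is_imbalanced(group_a, group_b, cutoff: float = 0.10) -> bool:
    """Flag a variable whose absolute standardized difference exceeds the cutoff."""
    return abs(standardized_difference(group_a, group_b)) > cutoff
```

Variables flagged by `is_imbalanced` would be the ones entered into the propensity model, and the resulting weights would then be passed to the weighted mixed, negative binomial, or logistic models.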

For all analyses, the following comparisons were defined a priori to investigate dynamic changes across time points: M6 versus T0, M12 versus T0, and M24 versus T0. To limit the influence of “pseudoatrophy” [23,24,25], comparisons of volumetric changes were also performed after rebaselining at M6 and at M12.

Clinical, neuropsychological, and MRI longitudinal changes were correlated using Spearman’s correlation coefficient.
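Spearman's coefficient is the Pearson correlation computed on ranks. The sketch below is a plain standard-library illustration with hypothetical function names. It assigns average ranks to ties and returns only the coefficient, not a p value.

```python
import math


def average_ranks(values):
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def spearman(x, y) -> float:
    """Pearson correlation of the rank-transformed inputs."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in rx))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in ry))
    return cov / (sd_x * sd_y)
```

Because only ranks enter the calculation, the coefficient is insensitive to monotonic transformations such as the log transform applied to lesion volumes.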

Stata (v.14; StataCorp) was used for statistical analysis with a p value < 0.05 considered statistically significant.

Results

Baseline Findings

Table 1 summarizes the baseline findings according to treatment started. Fingolimod and natalizumab patients showed similar characteristics. Four RRMS patients starting natalizumab and none starting fingolimod were treatment naïve. All the others switched from a first-line (18 in the fingolimod group, 23 in the natalizumab group) or a second-line drug (7 in the fingolimod group, 3 in the natalizumab group).

Table 1 Main demographic, clinical, neuropsychological, and MRI findings at baseline in relapsing–remitting multiple sclerosis patients starting fingolimod or natalizumab

Some clinical and MRI variables showed a standardized difference > 0.10 between the 2 treatments, although not statistically significant (p values from 0.06 to 0.99) (Table 1).

Longitudinal Findings

Figure 2 and Supplementary Table 1 show the results of the clinical, neuropsychological, and MRI lesion assessments at follow-up.

Fig. 2

Evolution of clinical and neuropsychological findings. Main clinical and neuropsychological findings at the different time points of the study in relapsing–remitting multiple sclerosis (RRMS) patients starting fingolimod (FTY) (blue) or natalizumab (NAT) (green). (a) Annualized relapse rate (ARR); (b) Expanded Disability Status Scale (EDSS) score; (c) cognitive reliable change index (RCI); (d) Multiple Sclerosis Functional Composite (MSFC) score; (e) modified fatigue impact scale (MFIS) score; (f) Montgomery–Asberg Depression Rating Scale (MADRS) score. Results are reported as mean and 95% confidence interval (CI). *Significantly different versus T0; °significantly different between NAT and FTY. See text for further details

Both treatments significantly reduced the ARR at each time point (mean ARR reduced from 1.00 to 0.12 at M24 in the fingolimod group, p value < 0.001, and from 1.20 to 0.02 at M24 in the natalizumab group, p value < 0.001) and significantly improved cognitive performance at M24 (p value = 0.03 for fingolimod; p value = 0.01 for natalizumab). None of the fingolimod patients had CDP at M24. Natalizumab patients showed an improvement in the MADRS score at M24 (p value = 0.003). Compared to fingolimod patients, natalizumab patients had a lower ARR at M6 and at M12 (adjusted p value = 0.02 for both comparisons).

Both groups showed a significant accumulation of new T2-hyperintense lesions (at M24, p value < 0.001, in fingolimod; p value = 0.002 in natalizumab) and T1-hypointense lesions (at M24, p value < 0.001 in fingolimod; p value = 0.005 in natalizumab).

Compared to fingolimod patients, natalizumab patients had a lower accumulation of new T2-hyperintense lesions at M24 (adjusted p value = 0.03) and of T1-hypointense lesions at M12 and M24 (adjusted p values from 0.002 to 0.02). The natalizumab group also had a lower prevalence of new T2-hyperintense and T1-hypointense lesions at M12 and at M24 (adjusted p values from 0.02 to 0.04).

Compared to fingolimod patients, a higher number of natalizumab patients were free from MRI activity (67% vs 36%, adjusted p value = 0.02). Conversely, no significant differences were found for the prevalence of patients free from clinical relapses (fingolimod = 76%; natalizumab = 97%, adjusted p value = 0.09) or from CDP (fingolimod = 100%; natalizumab = 93%, adjusted p value = 0.31). Together, these findings resulted in a higher prevalence of patients with NEDA-3 at M24 in natalizumab versus fingolimod (57% vs 28%, p value = 0.04).

Fingolimod patients showed an increase in T2-hyperintense and T1-hypointense LVs at each time point (p values from 0.004 to < 0.001), whereas a decrease in T2-hyperintense and in T1-hypointense LV was found for natalizumab patients at M24 (p values < 0.001). Between-group comparisons confirmed that fingolimod patients had a greater increase of T2-hyperintense and T1-hypointense LVs at each time point compared to natalizumab patients (p value < 0.001 for all comparisons).

Evolution of Volumetric MRI Findings

Figures 3 and 4 and Supplementary Table 1 show the MRI-derived volume changes at follow-up. During the first 6 months, both groups showed a progression of brain, GM, and WM atrophy (p values from 0.03 to < 0.001), but not of deep GM atrophy (p value = 0.30 for natalizumab and p = 0.06 for fingolimod).

Fig. 3

Progression of global and regional brain atrophy between months 0 to 6 and months 6 to 24. Main global and regional volumetric MRI findings for months 0 to 6 and for months 6 to 24 in relapsing–remitting multiple sclerosis (RRMS) patients starting fingolimod (blue) or natalizumab (green). (a) Percentage brain volume change (PBVC); (b) percentage gray matter volume (GMV) change; (c) percentage white matter volume (WMV) change; (d) percentage deep gray matter (GM) volume change. Results are reported as mean and 95% CI. *Significantly different versus months 0 or 6; °significantly different between natalizumab and fingolimod. See text for further details

Fig. 4

Progression of global and regional brain atrophy in the first and second year. Main global and regional volumetric MRI findings from baseline (T0) to month 12 (M12) and from M12 to month 24 (M24) in relapsing–remitting multiple sclerosis (RRMS) patients starting fingolimod (blue) or natalizumab (green). (a) Percentage brain volume change (PBVC); (b) percentage gray matter volume (GMV) change; (c) percentage white matter volume (WMV) change; (d) percentage deep gray matter (GM) volume change. Results are reported as mean and 95% CI. *Significantly different versus T0 or M12, respectively. See text for further details

At M24 versus M6, both groups showed atrophy progression of the whole brain (p value = 0.002 in fingolimod; p value < 0.001 in natalizumab), GM (p value < 0.001 for both), and WM (p value < 0.001 for both). Natalizumab patients also showed significant deep GM atrophy (p value = 0.02).

At M12 versus T0 and M24 versus M12, both groups showed atrophy progression of the whole brain and WM (p values ≤ 0.001). Significant GM atrophy progression occurred at both comparisons in fingolimod patients (p value = 0.002 for both) and at M24 versus M12 in the natalizumab group (p value < 0.001). Both groups also showed significant deep GM atrophy progression at M12 versus T0 (p value = 0.002 in fingolimod; p value = 0.05 in natalizumab). No significant between-group differences in atrophy progression were found.

Compared to HC (annualized PBVC = − 0.07%; annualized percentage GMV change = − 0.04%; annualized percentage WMV change = 0.03%; annualized deep GM atrophy = − 0.12%), both groups showed a higher annualized PBVC (− 0.24%, adjusted p value = 0.02 for fingolimod; − 0.28%, adjusted p value = 0.006 for natalizumab), percentage GMV change (− 0.41%, adjusted p value = 0.001 for fingolimod; − 0.43%, adjusted p value = 0.001 for natalizumab), and percentage WMV change (− 0.66%, adjusted p value < 0.001 for fingolimod; − 0.67%, adjusted p value < 0.001 for natalizumab), whereas no difference in annualized deep GM atrophy was found (− 0.30%, adjusted p value = 0.48 for fingolimod; − 0.52%, adjusted p value = 0.26 for natalizumab) (Supplementary Table 2).

Analysis of Correlations

In both groups, no consistent correlations were found between changes in clinical and MRI variables. In fingolimod patients, increased T2-hyperintense LV correlated with PBVC (r = − 0.55, p value = 0.005), percentage GMV change (r = − 0.43, p value = 0.03), and percentage WMV change (r = − 0.49, p value = 0.01). In natalizumab patients, the accumulation of new T2-hyperintense lesions was associated with PBVC (r = − 0.43, p value = 0.03), and the number of new T1-hypointense lesions was correlated with PBVC (r = − 0.38, p value = 0.04) and percentage WM volume change (r = − 0.36, p value = 0.05).

Discussion

Both natalizumab and fingolimod significantly reduced clinical and MRI disease activity, with a stronger effect of natalizumab compared to fingolimod. Both drugs exerted similar positive effects on cognitive function, fatigue, and depression and similarly modified atrophy progression.

In line with RCTs [3,4,5,6,7], both treatments were highly effective in reducing relapses, and limiting MRI activity and EDSS worsening, with a superiority of natalizumab in limiting clinical and MRI activity and maintaining NEDA-3 status, thus confirming previous comparative observational studies [11,12,13,14,15] and suggesting a stronger effect of natalizumab in limiting inflammation and demyelination. This is also supported by the reduction in T2-hyperintense LVs found with natalizumab, possibly due to resolution of inflammation and partial recovery of WM lesion damage. Conversely, this did not occur in fingolimod patients, who experienced a significant progressive increase of WM LVs. Despite this, no between-group difference in disability progression was found.

In line with observational studies investigating natalizumab [17,18,19] or fingolimod [20, 21] separately, both treatments improved global cognitive performance. Interestingly, although significant only at M24, an improvement appeared already at M12, suggesting an early influence of these drugs on cognitive function. As previously suggested [18, 19, 26], the enhancement of cognitive function was associated with an impact on fatigue and depression. Brain plasticity promoting recovery from baseline disease activity, a reduction of CNS inflammation, and the prevention of brain damage may all have contributed to these findings. Disappointingly, no significant correlation was found between clinical, neuropsychological, and MRI changes over 2 years. The small sample sizes, the relatively short study duration, and the complex structural changes occurring shortly after treatment initiation could contribute to these negative findings.

According to phase III RCTs, the rates of brain atrophy over 2 years are similar for the 2 drugs, ranging from − 0.84% [3] to − 0.86% [4] with fingolimod and from − 0.80% [24] to − 0.81% [25] with natalizumab. However, comparing different studies is challenging because of heterogeneity in patients’ characteristics and in the methodologies used. Moreover, atrophy dynamics might differ according to treatment type. Whereas a reduced brain atrophy rate is detectable already after 6 months with fingolimod [20, 23], natalizumab induces a paradoxically higher brain volume loss mainly in the first year of treatment (“pseudoatrophy”), followed by a significant reduction in atrophy rate [24, 25].

To limit “pseudoatrophy,” described with both natalizumab [24, 25] and fingolimod [23], especially in highly active patients, we analyzed atrophy rates during the first 6 months of treatment separately from those of the subsequent phase of the study and in the first and second year of treatment. At M6, a significant brain atrophy progression occurred in both groups (− 0.23% in fingolimod, − 0.29% in natalizumab) with no significant between-group differences. Whereas no data are available for natalizumab at M6, our findings confirmed phase III RCTs of fingolimod (mean PBVC = − 0.22% [3] and − 0.23% [4]). Interestingly, during these first 6 months, both treatment groups showed a higher rate of WM atrophy (− 0.46% in fingolimod, − 0.55% in natalizumab) than of GM atrophy (− 0.28% in fingolimod, − 0.24% in natalizumab), suggesting that pseudoatrophy mainly involves WM volume loss [27, 28], in which inflammatory infiltrates, glial activation, and vasogenic edema are more prominent.

After rebaselining at M6, the analysis of global brain volume loss (PBVC) showed a significant atrophy progression occurring similarly and almost linearly up to M24 with both treatments. The analysis in the first and second year of treatment confirmed these findings, suggesting similar effects of these 2 treatments in halting neurodegeneration.

Interestingly, brain atrophy rates detected at M24 versus M6 in both groups (− 0.35% in fingolimod, − 0.42% in natalizumab) were lower than those found in phase III RCTs [3,4,5, 24, 25] and than the recently suggested MS pathological threshold (− 0.4%/year) [29]. Although the annualized PBVC in the fingolimod (− 0.24%) and natalizumab (− 0.28%) groups was significantly higher than in our HC (− 0.07%), both treatments are likely to reduce the rate of atrophy, which is expected to be particularly pronounced in more active patients, thus justifying second-line treatment. After an initial phase with a higher rate of atrophy partially due to pseudoatrophy, natalizumab could limit brain atrophy progression and promote secondary neuroprotection through a strong anti-inflammatory effect in the CNS and the fostering of a more favorable environment for tissue recovery. Conversely, fingolimod might reduce neuroinflammation and exert direct neuroprotective effects in the CNS by promoting remyelination and preventing neurodegeneration [1]. In line with this, recent evidence suggests that up to 54% of the fingolimod effect on reducing brain volume loss is independent of disease activity [23].

To disentangle the substrates of brain atrophy and the involvement of specific brain compartments, we also assessed WM and GM atrophy separately, including deep GM atrophy [30]. Natalizumab has been shown to stabilize GMV for up to 3 years [17, 31], and to significantly reduce the accumulation of cortical lesions [32] and cortical thinning [18] compared to IFN-β. However, another study showed significant GM atrophy (mainly during the first year of treatment) in cortico-subcortical regions and in the cerebellum that correlated with disability [28]. By pooling data from 2 phase III RCTs, one study demonstrated a positive effect of fingolimod in reducing thalamic and deep GM atrophy, but not cortical GM volume loss [30]. Two other studies showed that fingolimod significantly reduced cortical lesion formation and GM atrophy progression compared to placebo [33, 34]. Our study demonstrated significant GM atrophy at M24, with no between-group differences, and with an annualized rate significantly higher than in HC. Natalizumab patients also had more significant deep GM atrophy (although not statistically different from the other groups and mainly occurring in the first 6 to 12 months of treatment). Heterogeneity in patients’ characteristics, different follow-up durations, and differences in methods for atrophy quantification could help explain discrepancies among studies. Considering the significant correlations between atrophy progression and lesion accumulation with both treatments, GM atrophy could be secondary to retrograde degeneration of fibers passing through recently formed WM lesions. A primary GM involvement, with local demyelination and damage, could also be relevant, although it cannot be properly evaluated from the images acquired.

Fingolimod and natalizumab patients also experienced similar WM atrophy at M24 versus M6 (− 0.98% in fingolimod, − 0.99% in natalizumab), which was significantly more pronounced compared to HC. In natalizumab patients, significant WM atrophy has been demonstrated up to the second year of treatment [27, 28], whereas a reduced WM volume loss was found with fingolimod compared to placebo [30]. Pseudoatrophy, especially during the first year of treatment [27] and in patients with more active disease, could influence tissue volumes for up to 2 years [35], but WM lesion accumulation could also contribute to atrophy progression, as suggested by the correlations between lesional and volumetric changes.

Conversely, since a limited number of relapses (n = 7) occurred during the follow-up at least 30 days before or after the scheduled MRI time points, the effects of steroid treatment on our MRI outcomes are likely to be very limited.

Our study is not without limitations. First, we recruited 2 quite small cohorts of MS patients starting fingolimod or natalizumab and, although analyses were adjusted for unbalanced baseline findings, this was not an RCT. Differences in baseline characteristics, although not statistically significant and possibly due to diverging real-life indications, could influence the results obtained. However, we analyzed a large number of clinical, neuropsychological, and MRI variables acquired at well-scheduled time points, whose assessment is challenging in a clinical setting. Results of our study need to be interpreted cautiously, and larger, randomized studies might confirm our findings and identify subtle differences between the 2 treatments. Second, a longer follow-up might be needed to obtain clear evidence of the positive effects of the different drugs. Third, we did not acquire MR images sensitive to cortical damage and we did not investigate spinal cord involvement.

In conclusion, this study confirmed the anti-inflammatory effects of both fingolimod and natalizumab, their positive role in reducing the accumulation of irreversible clinical disability, cognitive impairment, and brain tissue damage, and their ability to promote neuroprotection by preventing further demyelination and neuroaxonal loss.