Patients with clinically defined MS according to the McDonald criteria and a relapsing-remitting (RR) phenotype underwent MRI as part of research studies ongoing at our center from 2010 to 2013. For the cross-sectional analysis, we retrospectively collected Expanded Disability Status Scale (EDSS), Timed 25-foot walk test (25-FWT) and Nine Hole Peg Test (9HPT) scores from clinical evaluations performed within one week of MRI. For the longitudinal analysis, a new clinical evaluation was performed between March 2015 and July 2015. Mean follow-up (FU) interval was 3.70 ± 1.44 years. At both time-points the following information was collected: age, time since first symptoms, disease duration, clinical phenotype and specific treatment for MS (yes/no).
We reviewed the clinical history of all included patients and confirmed the MS diagnosis according to more recent criteria.
A cohort of age- and sex-matched healthy controls (HC), who underwent MRI in our center during the same period, was selected.
The study was conducted after institutional ethical committee approval and was in accordance with the Declaration of Helsinki. Written informed consent was obtained from all subjects.
None of the MS participants had experienced clinical relapses within three months from participation at both baseline and FU.
MRI acquisition protocol
The imaging study was performed with a 3.0-T MR unit (Verio; Siemens, Erlangen, Germany). The manufacturer’s 12-channel head coil designed for parallel imaging (generalized autocalibrating partially parallel acquisition, GRAPPA) was used for radiofrequency signal reception. A multiplanar T1-weighted localizer image with section orientation parallel to the subcallosal line was acquired at the beginning of each MR imaging examination. The brain MRI protocol included the following sequences for all subjects: 1. High-resolution 3-dimensional T1-weighted (3D-T1) Magnetization Prepared Rapid Acquisition Gradient Echo sequence: TR = 1900 ms; TE = 2.93 ms; flip angle = 9°; field of view (FOV) = 260 mm; matrix = 256 × 256; 176 sagittal slices 1 mm thick; no gap; 2. Dual turbo spin-echo, proton density (PD) and T2-weighted images: TR = 3320 ms; TE1 = 10 ms; TE2 = 103 ms; FOV = 220 mm; matrix = 384 × 384; 25 axial slices 4 mm thick; 30% gap. For the cervical spinal cord we used: 3. T2-weighted sequence: TR = 3800 ms; TE = 123.0 ms; FOV = 280 mm; matrix = 288 × 448; flip angle = 160°; 13 sagittal slices 3 mm thick; 10% gap; 4. STIR T2-weighted sequence: TR = 4000 ms; TE = 55.0 ms; FOV = 250 mm; matrix = 240 × 320; flip angle = 150°; 13 sagittal slices 3 mm thick; 10% gap.
MRI imaging analysis
Image data were processed on Linux workstations using the FMRIB Software Library 5.9 package (FMRIB Image Analysis Group, Oxford, England, http://www.fmrib.ox.ac.uk/fsl) and Jim 6.0 software (Xinapse Systems, Essex, England; http://www.xinapse.com). The analysis was carried out in 2019.
Brain and lesion volumes
Lesion volumes were obtained using a semi-automated technique based on local thresholding with the Jim software. Lesions were segmented on PD images by two neurologists (SR, LDG), with T2-weighted images used to increase confidence in lesion identification. Lesion segmentation yielded the following data for every subject: a quantification of the lesion burden (total lesion volume, LV) and a binary lesion mask needed for the volumetric analysis, which was co-registered onto the 3D-T1 images. Brain volumes were normalized to the MNI standard-space reference image to avoid head-size dependencies, and measured using SIENAx on lesion-filled brain images to obtain normalized brain volume (NBV).
Spinal cord volume
Spinal cord volume was measured on the brain 3D-T1 images from C2 to C3 using a semi-automatic segmentation method (Jim version 6.0; Xinapse Systems, Essex, England). First, the sagittal 3D-T1 was reformatted and resampled axially to a 1-mm slice thickness, with the image plane perpendicular to the cord at the C2/C3 disk level. On this image, a marker was placed at the level of the most inferior slice passing centrally through the C2/C3 disk. Then, moving back up, two markers were placed after every five slices, until the fifteenth slice from the first marker was reached. An active surface method was then applied, using the markers of the cord centerline as input, and the spinal cord volume was calculated automatically. To compensate for the biological variation of structural measurements, unrelated to disease effects, the raw volume was subsequently normalized by dividing it by the number of slices. The presence and number of spinal cord lesions from the C1 to the C3 level were assessed on spinal cord T2-weighted and STIR images.
Atrophy cut-offs definition and patients’ classification
Individual NBV and spinal cord volume were normalized to the mean and standard deviation of the HC group, thus obtaining z-scores. To classify each patient according to a specific atrophy pattern, we followed the procedure described in Raji et al. Briefly, in the HC cohort, given the normal distribution of brain and spinal cord volumes, 95% of the brain volume values were located within the mean ± 1.77 standard deviations, while 95% of the spinal cord volume values were located within the mean ± 1.87 standard deviations. Only 5% of the brain/spinal cord volume values were expected to be larger or smaller. Therefore, we assumed that z-scores below −1.77 and −1.87 represented a significant reduction in brain and spinal cord volume, respectively, with an error probability of 2.5% at most. These z-score cut-offs were consequently applied to group individual MS patients, based on their brain and spinal cord volumes, into the following classes:
Group I: no brain or spinal cord atrophy (z-scores greater than −1.77 and −1.87, respectively);
Group II: brain atrophy (z-score lower than −1.77), no spinal cord atrophy (z-score greater than −1.87);
Group III: no brain atrophy (z-score greater than −1.77), spinal cord atrophy (z-score lower than −1.87);
Group IV: both brain and spinal cord atrophy (z-scores lower than −1.77 and −1.87, respectively).
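The classification rule above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function and constant names are hypothetical, the use of the sample standard deviation of the HC group is an assumption (the text does not specify it), and values exactly equal to a cut-off are treated as non-atrophic, which is also an assumption.

```python
from statistics import mean, stdev

# Cut-offs taken from the text; the constant names are illustrative.
BRAIN_CUTOFF = -1.77
CORD_CUTOFF = -1.87


def z_score(value, hc_values):
    """Z-score of one patient value against the healthy-control values.

    Assumption: the sample standard deviation (n - 1) of the HC group is used.
    """
    return (value - mean(hc_values)) / stdev(hc_values)


def atrophy_group(brain_z, cord_z):
    """Map a pair of z-scores to atrophy Group I-IV as defined in the text.

    Assumption: a z-score exactly at the cut-off counts as "no atrophy".
    """
    brain_atrophy = brain_z < BRAIN_CUTOFF
    cord_atrophy = cord_z < CORD_CUTOFF
    if brain_atrophy and cord_atrophy:
        return "IV"
    if cord_atrophy:
        return "III"
    if brain_atrophy:
        return "II"
    return "I"
```

For instance, a patient with a brain z-score of −2.0 and a spinal cord z-score of −0.5 would fall into Group II under this sketch.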
Statistical analyses were performed in SPSS 25.0, with a significance level α = 0.05.
Differences in age and sex between patients and controls were tested via t-test and Fisher test, respectively. Pearson chi-square test was used to test differences in sex between the four groups. Analysis of variance (ANOVA) was used to test differences in age, disease duration, clinical parameters/lesion loads at baseline and follow-up interval between the four groups, with post-hoc analysis accounting for multiple comparisons (Bonferroni). Analysis of covariance (ANCOVA), accounting for age and sex, was used to test differences in brain and spinal cord volumes between HC and the four patient groups, with post-hoc analysis accounting for multiple comparisons (Bonferroni).
Finally, the relationship between atrophy groups and disease progression was tested via logistic regression, entering atrophy class as the independent variable and progression as the dependent variable. Progression was defined for each clinical measure as follows: (i) EDSS: increase of 1.5 points for patients with a baseline EDSS score of 0, increase of 1 point for patients with a baseline EDSS score from 1.0 to 5.0, and increase of 0.5 points for patients with a baseline EDSS score equal to or higher than 5.5; (ii) 9HPT and (iii) 25-FWT: > 20% increase from baseline to FU. Patients were then divided into two groups (progressed versus clinically stable) according to a progression measure known as “EDSS-Plus,” described as progression on ⩾1 of the 3 components (EDSS, 25-FWT, and/or 9HPT).
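The progression definitions above can be written out as a short Python sketch. It only illustrates the stated thresholds and is not the authors' code: all function names are hypothetical, and the sketch assumes that the timed tests are recorded in seconds as a single summary value per time-point.

```python
def edss_progressed(baseline, follow_up):
    """EDSS progression, with the required increase depending on baseline."""
    if baseline == 0:
        required = 1.5
    elif baseline <= 5.0:
        required = 1.0  # baseline EDSS from 1.0 to 5.0
    else:
        required = 0.5  # baseline EDSS of 5.5 or higher
    return follow_up - baseline >= required


def timed_test_progressed(baseline, follow_up):
    """9HPT or 25-FWT progression: more than a 20% increase from baseline."""
    return (follow_up - baseline) / baseline > 0.20


def edss_plus_progressed(edss, fwt25, hpt9):
    """EDSS-Plus: progression on at least one of the three components.

    Each argument is a (baseline, follow_up) pair.
    """
    return (
        edss_progressed(*edss)
        or timed_test_progressed(*fwt25)
        or timed_test_progressed(*hpt9)
    )
```

Under these assumptions, a patient whose EDSS is stable at 3.0 but whose 25-FWT time rises from 6.0 s to 7.5 s (a 25% increase) would be classified as progressed.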