Introduction

Today there are several approaches to quantitative MRI (qMRI) that enable concurrent measurement of the relaxation times (T1 and T2) and proton density (PD) on clinical MR systems [1,2,3,4,5]. These tissue parameters can be exploited for different applications, including the creation of synthetic MR images and segmentation of white matter (WM), grey matter (GM) and cerebrospinal fluid (CSF).

Synthetic MR images are calculated according to mathematical expressions to give image contrasts analogous to conventional T1 and T2 weighting (T1W and T2W) as well as FLAIR [1, 5]. The synthetic images, including T1W, T2W and FLAIR, are thus obtained from one single acquisition compared to conventional imaging where image contrast series are obtained one by one.

The qMRI sequence named QRAPMASTER [1] has been implemented in different MRI systems [6, 7] together with the possibility to pursue calculation of synthetic MR images. Studies of synthetic MR imaging mimicking conventional T1W, T2W and FLAIR have shown that the image qualities of T1W and T2W are comparable but synthetically calculated FLAIR images have an inferior quality compared to conventional images [6, 8].

Multiple sclerosis (MS) is an inflammatory demyelinating disease of the central nervous system. Neurological deficits arise from focal lesions in the brain or spinal cord. MR imaging constitutes a cornerstone in MS diagnosis and for monitoring of treatment [9]. Standardised MR protocols in MS include T1W, T2W and FLAIR images [10, 11]. The evaluation of the examination involves identification of MS lesions regarding the number and localisation.

In a recent study, synthetic images were compared to conventional images in terms of MS lesion detection in order to assess diagnostic accuracy [6]. The study showed good agreement between the numbers and volumes of lesions identified using the two alternatives.

Administration of gadolinium (Gd) contrast agent enables detection of contrast-enhanced active MS plaques in T1W images. Such highlighted areas are important in staging MS since they correspond to inflammatory processes and damage of the blood-brain barrier. The use of synthetic T1W images to identify active MS plaques has not been compared to conventional MRI even though the combination of synthetic MR and Gd contrast has been investigated in another context indicating differences when imaging before and after the administration of contrast agent [12].

The aims of this study were to compare (1) the diagnostic outcome between synthetic and conventional MRI in terms of detection of MS lesions in T1W, T2W and FLAIR images after the administration of Gd contrast agent, (2) the inter- and intra-observer agreement and (3) the lesion-to-white matter contrast and signal-to-noise ratio of the two imaging alternatives.

Materials and methods

Subject group

All regulatory obligations for a prospective observational study were fulfilled and approval was received from the Regional Ethics Review Board. Informed consent was obtained from all patients participating in the study. Fifty-five MS patients fulfilling the revised McDonald diagnostic criteria [13] were consecutively examined and recruited by a consultant in neurology (M.G.) between January 2011 and May 2013. Some patients have previously been reported in [14], although in another context.

Three patients withdrew their consent. The remaining group of 52 patients consisted of 41 females and 11 males; the mean age was 42 years (range 22 – 65 years) and the mean disease duration was 12 years (range 2 – 34 years). Clinical subtypes were classified as relapsing-remitting in 43 patients, secondary-progressive in 8 patients and primary-progressive in 1 patient. The mean Expanded Disability Status Scale (EDSS) score was 3.0 (range 0 – 8).

Sample size (N) was estimated to detect a difference (paired) of three lesions between two reviewers assuming a standard deviation in difference of 6. This was met for N = 33 with a power of 0.8 and a significance level (two-sided) of 0.05. Examinations showing confluent white matter lesions were not included in the study.
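The sample size estimate above can be approximated with the standard normal-approximation formula for a paired comparison. The sketch below is an illustration only: the function name is hypothetical and the exact method used for the reported N = 33 is not stated in the text. The normal approximation gives a slightly smaller value, and a small-sample (t-distribution) correction would typically raise it by one or two subjects.

```python
from math import ceil
from statistics import NormalDist


def paired_sample_size(delta, sd_diff, alpha=0.05, power=0.80):
    """Approximate N for a two-sided paired comparison (normal approximation).

    delta   : paired difference to detect (here: 3 lesions)
    sd_diff : assumed standard deviation of the paired differences (here: 6)
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided significance level
    z_beta = NormalDist().inv_cdf(power)           # desired power
    return ceil(((z_alpha + z_beta) * sd_diff / delta) ** 2)


# Values taken from the text: difference of 3 lesions, SD of differences 6.
n_approx = paired_sample_size(delta=3, sd_diff=6)
print(n_approx)
```

With these inputs the approximation lands close to, but not exactly on, the reported sample size, which is consistent with a correction term being applied in the original calculation.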

MR examination protocol

All MR examinations were performed on a Philips Achieva 1.5-T system (Philips Medical Systems, Best, The Netherlands). An 8-channel SENSE head coil or a 16-channel SENSE neurovascular coil was used as receiver coil.

MR sequence parameters are shown in detail in Table 1. The clinical routine MR protocol for examination of the MS patients included the following sequences: axial diffusion-weighted imaging (DWI), axial and sagittal FLAIR, axial T2W fast spin echo and gadolinium-enhanced axial T1W spin echo. Gadolinium contrast agent [Omniscan (Gadodiamide, GE Healthcare, UK), 0.1 mmol/kg bodyweight] was administered to the patient after the initial DWI as a bolus intravenous injection. The qMRI sequence QRAPMASTER [1] was added last in the protocol and encompassed the whole brain with equal slice angulation, slice thickness, gap, positioning and field of view (FOV) as the conventional axial FLAIR, T1W and T2W series. QRAPMASTER is a multi-slice, multi-echo and multi-saturation recovery pulse sequence using an optional number of echo times (TE) and saturation delay times (TD).

Table 1. Parameters used for conventional MRI sequences, QRAPMASTER and synthetic MRI generation

MR image post processing

The T1 and T2 relaxation times were determined from the combination of images obtained with different TE and TD. Images acquired at different TE were used for calculation of T2 while images acquired at different TD were used for calculation of T1. A mono-exponential fit of the image data to the expression of the signal intensity in the QRAPMASTER images gave an estimate of the unsaturated magnetisation (M0) in each voxel, which then was rescaled to give the proton density [1, 15].
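As a simplified illustration of the T2 part of this fit, the sketch below estimates T2 from signals at several echo times using a log-linear least-squares fit of S(TE) = S0 · exp(−TE/T2). This is an assumption-laden sketch, not the QRAPMASTER or SyMRI fitting procedure: the full signal expression also depends on TD, T1 and other sequence factors, and the echo times and tissue values here are hypothetical.

```python
from math import exp, log


def fit_monoexponential(te_ms, signal):
    """Log-linear least-squares fit of S(TE) = S0 * exp(-TE / T2).

    Returns (S0, T2). Assumes noise-free, strictly positive signals.
    """
    y = [log(s) for s in signal]
    n = len(te_ms)
    mean_x = sum(te_ms) / n
    mean_y = sum(y) / n
    sxx = sum((x - mean_x) ** 2 for x in te_ms)
    sxy = sum((x - mean_x) * (yi - mean_y) for x, yi in zip(te_ms, y))
    slope = sxy / sxx                      # slope equals -1 / T2
    intercept = mean_y - slope * mean_x    # intercept equals ln(S0)
    return exp(intercept), -1.0 / slope


# Hypothetical echo times (ms) and a simulated voxel with T2 = 80 ms.
echo_times = [14.0, 28.0, 42.0, 56.0, 70.0]
true_s0, true_t2 = 1000.0, 80.0
signals = [true_s0 * exp(-te / true_t2) for te in echo_times]

s0_hat, t2_hat = fit_monoexponential(echo_times, signals)
```

In practice a non-linear fit is usually preferred for noisy magnitude data, since the log transform distorts the noise at low signal levels.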

Image sets acquired with QRAPMASTER were exported to the PACS system (IDS 7, version 14.3.14.2, SECTRA Medical Systems, Linköping, Sweden) and uploaded into the SyMRI software (version 7.0.2, Synthetic MR AB, Linköping, Sweden), where the calculation of T1, T2 and PD was performed as well as the generation of the synthetic T1W, T2W and FLAIR images (see Table 1 for parameter settings). The contrast in the synthetic images was calculated using well-known mathematical equations that include both tissue (T1, T2 and PD) and pulse sequence (TR, TE and TI) parameters [1]. Default values of TR, TE and TI given in the SyMRI software were used in the calculation of the synthetic images. All images, both conventional and synthetic, were saved in PACS as image stacks for review.
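The kind of signal equations referred to above can be sketched with the textbook spin-echo and inversion-recovery expressions. The function names, tissue values and sequence settings below are hypothetical illustrations; they are neither the SyMRI defaults nor the Table 1 settings, and the software's exact equations may differ.

```python
from math import exp, log


def synthetic_spin_echo(pd, t1, t2, tr, te):
    """Textbook spin-echo signal: PD * (1 - exp(-TR/T1)) * exp(-TE/T2)."""
    return pd * (1.0 - exp(-tr / t1)) * exp(-te / t2)


def synthetic_flair(pd, t1, t2, tr, te, ti):
    """Textbook inversion-recovery magnitude signal used for FLAIR-like contrast."""
    recovery = 1.0 - 2.0 * exp(-ti / t1) + exp(-tr / t1)
    return abs(pd * recovery * exp(-te / t2))


def nulling_ti(t1, tr):
    """TI that nulls tissue with the given T1 in the expression above."""
    return t1 * log(2.0 / (1.0 + exp(-tr / t1)))


# Hypothetical tissue parameters (ms, PD in arbitrary units).
wm = {"pd": 0.7, "t1": 600.0, "t2": 75.0}
csf = {"pd": 1.0, "t1": 4000.0, "t2": 2000.0}

# Hypothetical sequence settings (ms).
t1w_wm = synthetic_spin_echo(**wm, tr=500.0, te=10.0)
t1w_csf = synthetic_spin_echo(**csf, tr=500.0, te=10.0)
t2w_wm = synthetic_spin_echo(**wm, tr=4500.0, te=100.0)
t2w_csf = synthetic_spin_echo(**csf, tr=4500.0, te=100.0)

flair_tr = 6000.0
flair_ti = nulling_ti(csf["t1"], flair_tr)
flair_csf = synthetic_flair(**csf, tr=flair_tr, te=100.0, ti=flair_ti)
flair_wm = synthetic_flair(**wm, tr=flair_tr, te=100.0, ti=flair_ti)
```

Because the contrast is computed per voxel from the measured T1, T2 and PD maps, changing TR, TE or TI only requires re-evaluating these expressions rather than rescanning the patient.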

Image analysis

Conventional and synthetic axial T1W, T2W and FLAIR images were separately analysed blindly, independently and in random order in a PACS workstation by a general radiologist with 13 years of experience (W.K., denoted reviewer 1) and a subspecialist in neuroradiology with 15 years of experience (M.N., denoted reviewer 2). Repeated analysis was performed by one radiologist (W.K.) six weeks apart. Hyperintense MS lesions on T2W and FLAIR images and contrast-enhancing MS lesions on T1W images were counted. The location of each lesion was documented (juxtacortical, periventricular or infratentorial) and also summarised to give the total number of cerebral lesions. The review of all images was performed in the same PACS system and displayed on the same kind of workstation.

Cerebrum, juxtacortical and periventricular lesion counts were categorised in three groups (0-9, 10-20 and >20 lesions) based on the number of lesions found in the T2W and FLAIR images. The rationale for grouping the number of lesions is based on recommendations from the Swedish MS Society on how to report MR findings in follow-up examinations of MS patients [9]. Two categories (0 and ≥ 1 lesions) were used for grouping infratentorial T2 lesions and contrast-enhancing lesions.
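The grouping described above can be written as two small helper functions. The function names and category labels are hypothetical; only the cut-offs are taken from the text.

```python
def lesion_count_category(count):
    """Group cerebrum, juxtacortical or periventricular counts: 0-9, 10-20, >20."""
    if count < 0:
        raise ValueError("lesion count cannot be negative")
    if count <= 9:
        return "0-9"
    if count <= 20:
        return "10-20"
    return ">20"


def presence_category(count):
    """Group infratentorial T2 lesions and contrast-enhancing lesions: 0 or >=1."""
    if count < 0:
        raise ValueError("lesion count cannot be negative")
    return "0" if count == 0 else ">=1"


# Hypothetical counts for one examination.
categories = [lesion_count_category(c) for c in (4, 10, 20, 37)]
```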

A summary radiology report was created for each patient that contained: (1) the lesion counts in the cerebrum identified in the T2W images, (2) the lesion counts in the cerebrum identified in the FLAIR images and (3) the presence or non-presence of any contrast-enhancing lesion.

For comparison of lesion-to-white matter contrast and signal-to-noise ratio measurements between conventional and synthetic images, all patients with contrast-enhancing lesions on T1W images were selected. Circular regions of interest (ROIs) were placed in contrast-enhancing lesions and adjacent normal-appearing white matter in T1W, T2W and FLAIR images to assess lesion-to-white matter contrast. The signal-to-noise ratio was measured by placing ROIs in five anatomical areas (CSF, left centrum semiovale, anterior horn of the corpus callosum, left thalamus, left frontal cortex). The mean and standard deviation of the intensity for each ROI were registered. Lesion-to-white matter contrast was determined as the difference between lesion and white matter intensity divided by the white matter intensity. The signal-to-noise ratio was calculated for each ROI by dividing the mean intensity by the standard deviations.
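The two measures defined above can be sketched as follows. The function names and pixel values are hypothetical, and the use of the sample standard deviation (rather than the population standard deviation) is an assumption, since the text does not specify which one the workstation reports.

```python
from statistics import mean, stdev


def lesion_to_wm_contrast(lesion_mean, wm_mean):
    """(lesion intensity - white matter intensity) / white matter intensity."""
    return (lesion_mean - wm_mean) / wm_mean


def roi_snr(pixels):
    """Mean ROI intensity divided by the ROI standard deviation (sample SD assumed)."""
    return mean(pixels) / stdev(pixels)


# Hypothetical ROI values.
contrast = lesion_to_wm_contrast(lesion_mean=150.0, wm_mean=100.0)
snr = roi_snr([98.0, 100.0, 102.0])
```

Note that this SNR definition uses the within-ROI standard deviation, so it reflects tissue heterogeneity as well as image noise.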

Statistical methods

The Shapiro-Wilk test was used to assess the normality of the distribution of lesion counts as well as of the values for lesion-to-white matter contrast and signal-to-noise ratio.

The differences in cerebrum lesion counts between the reviewers are presented as the median and inter-quartile range (IQR) because of the non-Gaussian distribution of the data. A two-sided Wilcoxon rank sum test was used to identify any statistically significant differences.

Linear weighted kappa was used as a measure of agreement. Inter- and intra-observer agreement was calculated using the grouped data of lesions found in T1W, T2W and FLAIR images obtained from conventional and synthetic MRI. In terms of agreement, the kappa value was interpreted as: poor < 0.20, fair 0.21-0.40, moderate 0.41-0.60, good 0.61-0.80 and very good 0.81-1.00. The 95% confidence intervals (CI) of the kappa values were compared; non-overlapping intervals were taken to indicate a significant difference (p < 0.05). To obtain an overall comparison of all inter- and intra-observer agreements, a paired t-test of all kappa values was performed.
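The agreement measure can be sketched as below, assuming that "linear kappa" refers to Cohen's kappa with linear disagreement weights |i − j|/(k − 1) over k ordered categories. The function names and example ratings are hypothetical, and the handling of the exact boundary values (for example 0.20) is an assumption because the interpretation scale leaves them open.

```python
def linear_weighted_kappa(ratings_a, ratings_b, n_categories):
    """Cohen's kappa with linear weights for ordered categories coded 0..k-1."""
    n = len(ratings_a)
    k = n_categories
    observed = [[0.0] * k for _ in range(k)]
    for a, b in zip(ratings_a, ratings_b):
        observed[a][b] += 1.0 / n
    row = [sum(observed[i]) for i in range(k)]
    col = [sum(observed[i][j] for i in range(k)) for j in range(k)]
    # Weighted observed and chance-expected disagreement.
    obs_dis = sum(abs(i - j) / (k - 1) * observed[i][j]
                  for i in range(k) for j in range(k))
    exp_dis = sum(abs(i - j) / (k - 1) * row[i] * col[j]
                  for i in range(k) for j in range(k))
    return 1.0 - obs_dis / exp_dis


def interpret_kappa(kappa):
    """Interpretation scale from the text; boundary handling is an assumption."""
    if kappa <= 0.20:
        return "poor"
    if kappa <= 0.40:
        return "fair"
    if kappa <= 0.60:
        return "moderate"
    if kappa <= 0.80:
        return "good"
    return "very good"


# Hypothetical categories (0: 0-9, 1: 10-20, 2: >20 lesions) for six examinations.
reviewer_1 = [0, 1, 2, 0, 1, 2]
reviewer_2 = [0, 1, 2, 1, 1, 2]
kappa = linear_weighted_kappa(reviewer_1, reviewer_2, n_categories=3)
```

For the two-category groupings (0 and ≥ 1 lesions) the linear weights reduce to ordinary unweighted kappa.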

McNemar’s test was used to examine whether there was any significant difference in radiology report consistency when the reviewers used conventional or synthetic images. A report was considered consistent when the two reviewers judged criteria (1), (2) and (3) similarly.
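A minimal sketch of McNemar's test on paired consistency outcomes is given below, using the exact binomial form. Whether the exact or the chi-squared form was applied in the study is not stated, and the discordant counts are hypothetical.

```python
from math import comb


def mcnemar_exact_p(b, c):
    """Two-sided exact McNemar p-value from the discordant pair counts.

    b : patients with consistent reports on conventional images only
    c : patients with consistent reports on synthetic images only
    """
    n = b + c
    if n == 0:
        return 1.0
    k = min(b, c)
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2.0 * tail)


# Hypothetical discordant counts.
p_balanced = mcnemar_exact_p(5, 5)
p_one_sided = mcnemar_exact_p(0, 6)
```

Only the discordant pairs enter the test; patients whose reports were consistent (or inconsistent) with both techniques do not affect the p-value.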

Bland-Altman analysis [16, 17] with 95% limits of agreement was used to show differences in lesion counts in conventional versus synthetic T1W, T2W and FLAIR images.
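The Bland-Altman quantities can be computed as in the following sketch: the bias is the mean of the paired differences and the 95% limits of agreement are bias ± 1.96 SD of the differences. The function name and lesion counts are hypothetical.

```python
from statistics import mean, stdev


def bland_altman(conventional, synthetic):
    """Return (bias, lower limit, upper limit) for paired lesion counts."""
    diffs = [c - s for c, s in zip(conventional, synthetic)]
    bias = mean(diffs)
    sd = stdev(diffs)  # sample SD of the paired differences
    return bias, bias - 1.96 * sd, bias + 1.96 * sd


# Hypothetical lesion counts for four examinations.
conventional_counts = [10, 12, 8, 15]
synthetic_counts = [9, 13, 8, 14]
bias, lower, upper = bland_altman(conventional_counts, synthetic_counts)
```

In the plot itself, each difference is drawn against the mean of the two counts for that examination, with horizontal lines at the bias and the two limits.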

Normally distributed samples of lesion-to-white matter contrast and signal-to-noise ratio in conventional vs. synthetic MR images were compared using a two-sided paired t-test, while the Wilcoxon signed-rank test for paired samples was applied to non-normally distributed samples.

Statistical analysis was performed using the MedCalc software (MedCalc for Windows, release 16.4.1, MedCalc Software, Ostend, Belgium).

Results

Among the MS population, 13 individuals had confluent white matter lesions yielding 39 examinations for review of T2W and FLAIR images while all 52 included examinations contained T1W image stacks that were usable for analysis. In total, 260 image stacks were reviewed by both reviewers. Examples of conventional and synthetic images are shown in Fig. 1.
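The total of 260 image stacks is consistent with the following count, assuming that each stack is one image type from one technique for one examination (this reading is an interpretation, not stated explicitly in the text).

```python
n_included = 52          # examinations with usable T1W stacks
n_confluent = 13         # excluded from T2W and FLAIR review
techniques = 2           # conventional and synthetic

n_t2w_flair_exams = n_included - n_confluent           # 39 examinations
t1w_stacks = n_included * techniques                   # T1W, both techniques
t2w_flair_stacks = n_t2w_flair_exams * 2 * techniques  # T2W and FLAIR, both techniques
total_stacks = t1w_stacks + t2w_flair_stacks
```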

Fig. 1. Example of conventional images (upper row) and synthetic T1W (contrast-enhanced), T2W, and FLAIR images (lower row) in an MS patient

A summary of total lesion count differences in the cerebrum is presented in Table 2. There were no significant differences in lesion counts between conventional and synthetic MR for any of the three image types (T1W, T2W, FLAIR). Lesion detection in terms of inter- and intra-observer agreement showed similar results for conventional and synthetic images.

Table 2. Difference in cerebrum lesion counts between the two reviewers in T1W, T2W and FLAIR images expressed with median and the inter-quartile range (IQR) within parentheses. The 2.5 and 97.5 percentiles were used to give the range of difference for the T1W images since the IQR was equal to zero in these cases

Bland-Altman plots in Fig. 2 illustrate the differences in detected lesions as documented by reviewer 1 for the different image types. As seen in Fig. 2a, contrast-enhancing lesions were found in ten patients; in the remaining patients no enhancing lesion was found and these data points coincide at the origin of the plot.

Fig. 2. Bland-Altman plots showing the differences between the total number of detected lesions in conventional and synthetic images: (a) T1W, (b) T2W and (c) FLAIR images (reviewer 1)

Agreement on lesion counts between conventional and synthetic MR did not differ significantly in any region or for any type of image since the confidence intervals overlapped in all cases; see Table 3. In the infratentorial region, however, the inter- and intra-observer agreement was consistently higher using conventional images compared to synthetic images. Also, the inter-observer agreement regarding the total number of identified lesions in the cerebrum was higher using conventional images for all three image types. In the overall paired t-test, the kappa values in Table 3 obtained for conventional images were significantly higher (p = 0.0285) than those for the synthetic images.

Table 3. Kappa values with confidence intervals (CI) for inter- and intra-observer agreement for conventional and synthetic MR images in the whole cerebrum and different anatomical regions

The proportion of consistent radiology reports using conventional images was higher (62%) compared to synthetic images (51%) but the difference was not significant.

The lesion-to-white matter contrast was significantly higher (p < 0.0001) in conventional contrast-enhanced T1W images, while there was no significant difference in T2W (p = 0.34) and FLAIR (p = 0.50) images. Signal-to-noise ratios (see Table 4) were significantly higher in synthetic T2W and FLAIR images compared to conventional images, except for CSF in T2W images where there was no significant difference. In T1W images, the differences in SNR between synthetic and conventional images were not as consistent as in T2W and FLAIR images.

Table 4. Differences in signal-to-noise ratio (SNR) between synthetic and conventional images in different regions. Normally distributed differences are presented as mean ± 1 SD while non-normally distributed differences are presented with median and inter-quartile range (IQR)

Discussion

The results of this study showed no significant differences in inter- and intra-observer agreement regarding detection of MS lesions using conventional and synthetic MR images, but a tendency to poorer agreement in synthetic images. No statistically significant difference between total lesion counts in synthetic vs. conventional MR images was observed. Also, a lower percentage of consistent radiology reports was observed when using synthetic MR, but the difference was not statistically significant. The lesion-to-white matter contrast was significantly higher in conventional contrast-enhanced T1W images while no difference was noticed in T2W and FLAIR images. Signal-to-noise ratios were mainly higher in synthetic T2W and FLAIR images compared to conventional images.

A limited number of studies in the literature have investigated the inter-observer agreement of conventional MR in MS. One study showed poor agreement for the total number of lesions and moderate agreement when using dichotomised composite criteria according to Barkhof, Fazekas and Paty [18] or the McDonald criteria [19]. Another study assessing the inter-observer agreement regarding dissemination in space (DIS) and dissemination in time (DIT) [20] showed moderate agreement for neuroradiologists trained in using diagnostic criteria published by the International Panel on the diagnosis of MS [13].

Some studies assessing diagnostic performance of synthetic MRI have been published. Synthetic MR images were perceived to be of inferior quality, but agreed with the clinical diagnosis (MS versus non MS) to the same extent as the conventional images [8]. A recent study performed with synthetic MR on a 3-T system showed no statistically significant differences in lesion detection and diagnostic classification between synthetic and conventional MR images. The differences in lesion counts between synthetic and conventional images were on the same order of magnitude as differences between observers [6]. In another recently published study detection of brain metastases in conventional T1W images was compared to synthetic T1W and T1 inversion recovery (T1IR) images. The study was performed on a 3-T system and obtained similar results using synthetic images compared to conventional images [7].

Our findings are in accordance with the results of these earlier studies and add information about the diagnostic performance of contrast-enhanced T1W images in MS.

Variations in lesion counts could partly be explained by pulsation artefacts in T1W images. In both conventional and synthetic T2W and FLAIR images, distinguishing large, focal periventricular lesions from confluent lesions may be difficult and may affect lesion counts. In FLAIR images, areas with high contrast between grey and white matter and small vessels could simulate small MS lesions, particularly in the centrum semiovale (Fig. 3). Another recently published study [21] described that artefacts were more pronounced in synthetic T2W FLAIR compared to conventional MR images.

Fig. 3. Conventional (left column) and synthetic (right column) FLAIR images. Suspicious MS lesions on synthetic images (arrows) are less obvious on conventional images

In contrast to previous studies, where synthetic FLAIR images suffered particularly from a perceived lower signal-to-noise ratio and inferior overall image quality compared to conventional FLAIR [6, 8, 22], a higher signal-to-noise ratio was observed in this study in both synthetic FLAIR and synthetic T2W images. This can be explained by the larger voxel size used in the synthetic images compared to the conventional images, even if the impact of reconstruction filters and parallel reconstruction, for example, remains difficult to evaluate [23]. The differences in measured lesion-to-white matter contrast did not obviously affect diagnostic performance.

Considering that this and earlier studies show variations in inter- and intra-observer agreement, the relevance of absolute lesion numbers may have to be put into perspective. In making the differential diagnosis of MS vs. non-MS, categorising lesion numbers according to the McDonald criteria [13], for instance, is a useful tool, but in follow-up examinations the appearance of any single lesion can be of importance. Studies on this issue using synthetic MR are lacking so far. In a clinical context, though, the availability of prior examinations for comparison may facilitate image interpretation.

Though quantitative MRI with reconstruction of synthetic images performed inferiorly in terms of overall inter- and intra-observer agreement quality and could not acquire 3D isotropic high-resolution images, there are several potential benefits of this approach. A major advantage of quantitative/synthetic MRI is the simultaneous acquisition of tissue parameters that can be used for automated calculation of tissue maps, tissue analysis and volume assessment [12, 24,25,26,27,28,29] and therefore may be a promising future alternative to evaluate focal and diffuse disease compared to assessment of focal lesions only. Furthermore, the imaging data are potentially scanner independent and scan times can be shortened. The total acquisition time for the conventional axial series (T1W, T2W and FLAIR) was in this study 18:30 (min:s) while the scan time for QRAPMASTER was 6:27.
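The scan-time comparison above amounts to the following arithmetic. The helper function is hypothetical; the two durations are the ones quoted in the text.

```python
def to_seconds(min_s):
    """Convert a 'min:s' string to seconds."""
    minutes, seconds = min_s.split(":")
    return int(minutes) * 60 + int(seconds)


conventional_s = to_seconds("18:30")  # conventional axial T1W, T2W and FLAIR
qrapmaster_s = to_seconds("6:27")     # single QRAPMASTER acquisition

saved_s = conventional_s - qrapmaster_s
relative_reduction = saved_s / conventional_s
print(f"{saved_s // 60}:{saved_s % 60:02d} saved, {relative_reduction:.0%} shorter")
```

Under these figures, replacing the three conventional axial series with QRAPMASTER would shorten that part of the protocol by roughly two-thirds.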

This study has some limitations. There is a difference in experience between the two observers in terms of professionally active working years, though in our work the inter-observer variation was not higher than the intra-observer variation. The possibility of calculating images other than T1W, T2W and FLAIR for use in the evaluation, such as IR T1W and double IR images, was not included in the study. In this study the 2D synthetic MR images were compared to corresponding 2D conventional MR images, even though 3D isotropic high-resolution imaging is nowadays an available alternative (3D was not available on the system when the study started) and is becoming more common in clinical routine. The use of 3-mm-thick slices, which was applied for both the synthetic and conventional images, implies a risk of missing small lesions because of partial volume effects compared to isotropic 3D acquisitions. The limitation to axial imaging may impede the evaluation of lesions in the corpus callosum and adjacent centrum semiovale. The in-plane resolution in the synthetic images was lower than in the conventional images. The choice of resolution in the synthetic images was a compromise between scanning time, slice thickness and whole brain coverage. Another limitation is the variable lesion load in the study population, from a single or a few lesions to multiple, in some cases nearly confluent, lesions, which made counting lesions difficult at times. Studies comparing the diagnostic performance of synthetic MRI at 1.5 and 3 T may be needed, as detection of MS lesions has been shown to be superior at 3 T, at least in conventional imaging [30], and significant differences in tissue segmentation between 1.5 and 3 T have been shown [28].

In conclusion, synthetic MR images have the potential to be used in the assessment of MS dissemination in space despite a slightly lower inter- and intra-observer agreement compared to conventional MR images. Studies evaluating the impact of those differences on clinical management and synthetic MR in the assessment of dissemination of MS lesions over time remain to be performed.