Introduction

The heterogeneous group of rheumatic diseases known collectively as spondyloarthritis (SpA) has been divided into two categories: axial (axSpA) and peripheral spondyloarthritis (pSpA) [1]. In the first type, the axial skeleton is predominantly involved, while in the second, only peripheral manifestations of the disease are observed.

In 2009, magnetic resonance imaging (MRI) of the sacroiliac joints (SIJs) was included in the Assessment of SpondyloArthritis international Society (ASAS) SpA diagnostic criteria. This resulted in the subdivision of axSpA into two categories: radiographic and nonradiographic, in which inflammatory changes are visible only on MRI or are completely absent [2]. This decision turned out to be crucial, since further research estimated the prevalence of nonradiographic axSpA in the whole axSpA cohort at between 20 and 80%, a substantial group previously omitted in the diagnostic process [3]. As a result, the mean diagnostic delay of axSpA decreased from approximately 7 to 2 years [4], which has also accelerated the introduction of proper therapy, before the occurrence of disabling structural changes in the SIJs and spine.

As the presence of active axSpA significantly diminishes patients’ health-related quality of life [5], it is vital to gain control over the disease as soon as possible and to reduce the diagnostic delay as far as possible. For this reason, there is still a need to search for techniques that increase the diagnostic accuracy for axSpA at an early stage. One possible technique is multiparametric MRI, which, apart from standard sequences, includes methods such as diffusion-weighted imaging (DWI) and dynamic contrast-enhanced (DCE) perfusion imaging. Nonetheless, no consensus regarding its use in the diagnostics of axSpA has been reached yet [6], due to the limited evidence and contradictory results of previous research [7,8,9,10].

The aim of the study was to assess the diagnostic performance of the visual assessment of the DWI sequence with ADC maps and the DCE sequence in the detection of active sacroiliitis in the course of axSpA, in comparison to the standard STIR sequence. A secondary aim was to assess whether the presence of other signs of acute sacroiliitis, not mentioned in the ASAS criteria, could aid the diagnosis of axSpA.

Materials and methods

The study obtained approval from the Institutional Bioethics Committee (No. of approval: 1072.6120.16.2019, date of approval: 31st January 2019).

Study population

This retrospective study included 49 patients who had undergone multiparametric MRI of the SIJs due to the clinical suspicion of axSpA. All examinations were performed between January 2017 and August 2018. The inclusion criterion was the clinical suspicion of sacroiliitis in the course of axSpA [11]. The exclusion criteria were: age < 18 or > 45 years [11], a lack of clinical data about the reason for referral, and a history of sacroiliac region trauma or neoplasm. The mean age of the included patients was 28.9 ± 8.5 years (range 18–43 years); 63.3% (n = 31) were female and 36.7% (n = 18) were male.

Examination protocol

All examinations were performed on a 3.0 T MRI scanner (Achieva, Philips Healthcare, Amsterdam, Netherlands) with an 8-channel phased-array XL-torso body matrix coil. Both SIJs were imaged simultaneously from the anterior to the posterior border in the coronal oblique plane, parallel to the long axis of the sacral bone.

Detailed imaging parameters were:

  • Coronal oblique T1-weighted turbo spin echo (TSE) sequence (TR 500 ms, TE 14 ms, flip angle 90, NEX 1, slice thickness 3 mm, matrix 560 × 560, FOV 240 × 240 × 71, scan time 3.02 min)

  • Coronal oblique short tau inversion recovery (STIR) TSE sequence (TR 5239 ms, TE 30 ms, inversion time 190 ms, flip angle 90, NEX 2, slice thickness 3 mm, matrix 400 × 400, FOV 240 × 240 × 71, scan time 2.15 min)

  • Coronal oblique diffusion-weighted imaging (DWI): multitransmit single-shot echo-planar imaging (EPI) with diffusion gradient b values of 0 and 800 s/mm² (TR 3837 ms, TE 58 ms, flip angle 90, NEX 6, slice thickness 3 mm, matrix 192 × 192, FOV 350 × 292 × 132, scan time 2.45 min)

  • Coronal oblique dynamic contrast-enhanced (DCE) sequence with fat saturation (e-THRIVE), acquired 34 times (TR 3.4 ms, TE 1.7 ms, flip angle 10, slice thickness 6 mm, matrix 176 × 176, FOV 240 × 250 × 71, scan time 4.30 min). Simultaneously with the start of the acquisition, the intravenous contrast agent gadobutrol (Gadovist, Bayer, Germany) was administered at a dose of 0.1 mmol/kg of body weight at a flow rate of 2.5 ml/s.

ADC maps were created automatically by the MR system. In several cases of clinical doubt, ADC maps were created manually.
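The text does not describe how the MR system derives the ADC maps. As an illustration only, the sketch below assumes the commonly used two-point mono-exponential model, S_b = S_0 · exp(−b · ADC), which gives ADC = ln(S_0 / S_b) / b for the two b values used here (0 and 800 s/mm²). The function name and the signal values are hypothetical and are not taken from the study.

```python
import math


def adc_two_point(s0: float, sb: float, b: float = 800.0) -> float:
    """Return the ADC in mm^2/s from the signal at b = 0 (s0) and at b = `b` s/mm^2 (sb).

    Assumes a mono-exponential decay S_b = S_0 * exp(-b * ADC); this is an
    assumption for illustration, not the vendor's documented algorithm.
    """
    if s0 <= 0 or sb <= 0 or b <= 0:
        raise ValueError("signals and b value must be positive")
    return math.log(s0 / sb) / b


# Hypothetical voxel signals in arbitrary units (not data from the study).
adc = adc_two_point(s0=1000.0, sb=450.0)
print(f"ADC = {adc:.6f} mm^2/s")
```

A voxel whose signal drops strongly between b = 0 and b = 800 s/mm² yields a high ADC, whereas a voxel that stays bright yields a low ADC.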

Image interpretation

MR images were retrospectively assessed in random order by two independent observers, who were aware of the clinical suspicion of axSpA but blinded to patients’ identities and clinical findings. In every patient, the following sequences were evaluated individually: STIR combined with the T1-weighted sequence (structural joint assessment), the DWI sequence with b = 0 and 800 with ADC map, and the DCE sequence. To standardize the visual assessment of these sequences, we used the SPARCC (Spondyloarthritis Research Consortium of Canada) score with slight modifications (without the evaluation of depth and intensity of the inflammatory lesions) [12]. In every sequence, the eight sections with the longest visible part of the SIJ articular surface (> 1 cm) were chosen. On every section, each SIJ was divided into four quadrants (upper iliac, upper sacral, lower iliac, lower sacral), which gave a total of 64 quadrants evaluated in every sequence. Each quadrant was analysed separately for the presence of bone marrow oedema/osteitis related to inflammatory sacroiliitis: the presence of a typical subchondral bone marrow oedema lesion [13] (STIR), restricted diffusion (DWI, ADC) or contrast enhancement (DCE) was marked as 1, and the lack of these signs as 0. Moreover, every SIJ was assessed in the STIR sequence for the presence of other signs of acute inflammatory sacroiliitis: enthesitis, capsulitis and synovitis.
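The quadrant-based scoring described above can be sketched as a small data structure: 8 sections × 2 joints × 4 quadrants = 64 binary marks per sequence. The sketch assumes that the modified score is the plain sum of the quadrant marks; the names `empty_grid` and `modified_score` and the example marks are hypothetical.

```python
from itertools import product

N_SECTIONS = 8
JOINTS = ("left", "right")
QUADRANTS = ("upper iliac", "upper sacral", "lower iliac", "lower sacral")


def empty_grid():
    """Return a dict with one entry per (section, joint, quadrant), all marked 0."""
    return {key: 0 for key in product(range(N_SECTIONS), JOINTS, QUADRANTS)}


def modified_score(grid):
    """Sum the binary quadrant marks (assumed scoring rule; maximum 64)."""
    if any(mark not in (0, 1) for mark in grid.values()):
        raise ValueError("every quadrant must be marked 0 or 1")
    return sum(grid.values())


# Hypothetical reading of one sequence: three quadrants marked as positive.
grid = empty_grid()
grid[(0, "left", "upper iliac")] = 1
grid[(1, "left", "upper iliac")] = 1
grid[(1, "right", "lower sacral")] = 1

print(len(grid), modified_score(grid))
```

Keeping one grid per sequence and per observer makes it straightforward to compare the same quadrant across STIR, DWI/ADC and DCE readings.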

Clinical characteristics

The referral of every patient was reviewed for the symptoms and test results that prompted the clinical suspicion of axSpA: back pain, family history of SpA, peripheral joint arthritis, HLA-B27 haplotype, psoriasis, inflammatory bowel diseases, enthesitis and uveitis.

axSpA ASAS classification criteria

If an active inflammatory lesion according to the ASAS criteria [13] was identified in the STIR sequence in at least one joint of a given patient, the patient was assigned to the ASAS-positive sacroiliitis group. If a patient in the ASAS-positive sacroiliitis group additionally had at least one typical axSpA feature according to the ASAS criteria [2], the patient was then included in the ASAS axSpA imaging arm positive group. As only patients suspected of axSpA were included in our study, no patient fulfilled the clinical arm of the ASAS criteria.
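The two-step grouping described above can be written as a short decision rule. This is a minimal sketch of the logic as stated in the text, not a full implementation of the ASAS criteria; the `Patient` class, its field names and the example patients are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Patient:
    # True if an ASAS-defined active inflammatory lesion was seen on STIR in that joint.
    stir_active_left: bool
    stir_active_right: bool
    # Typical axSpA features recorded for the patient, e.g. {"uveitis", "psoriasis"}.
    spa_features: set = field(default_factory=set)


def asas_positive_sacroiliitis(patient: Patient) -> bool:
    """Step 1: an active lesion on STIR in at least one joint."""
    return patient.stir_active_left or patient.stir_active_right


def imaging_arm_positive(patient: Patient) -> bool:
    """Step 2: ASAS-positive sacroiliitis plus at least one typical axSpA feature."""
    return asas_positive_sacroiliitis(patient) and len(patient.spa_features) >= 1


# Hypothetical patients (not data from the study).
a = Patient(stir_active_left=True, stir_active_right=False, spa_features={"uveitis"})
b = Patient(stir_active_left=True, stir_active_right=False)
c = Patient(stir_active_left=False, stir_active_right=False, spa_features={"psoriasis"})

print(imaging_arm_positive(a), imaging_arm_positive(b), imaging_arm_positive(c))
```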

Statistical analysis

Statistical analysis was performed with IBM SPSS Statistics for Windows, version 25.0. The normality of the data was checked using the Shapiro–Wilk test. The difference in age between groups was analysed using the Mann–Whitney U test. The association between two unpaired nominal variables was evaluated using Fisher’s exact test or the chi-squared test. The results of the bone marrow oedema/osteitis assessment from the DWI sequence with ADC map and from the DCE sequence were compared with the results from the STIR sequence (which was treated as the reference) to calculate the accuracy, sensitivity, specificity, and positive (PPV) and negative (NPV) predictive values, for each observer separately. Inter-observer agreement was evaluated with Cohen’s κ coefficient, interpreted as follows: κ < 0, poor agreement; 0 ≤ κ < 0.2, slight agreement; 0.2 ≤ κ < 0.4, fair agreement; 0.4 ≤ κ < 0.6, moderate agreement; 0.6 ≤ κ < 0.8, substantial agreement; and 0.8 ≤ κ ≤ 1, almost perfect agreement. p values < 0.05 were considered statistically significant.
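The analysis itself was run in SPSS. Purely as an illustration of the quantities named above, the following sketch computes the diagnostic performance measures against a binary reference and Cohen’s κ for two binary raters, and maps κ onto the interpretation bands listed in the text. All function names and the example marks are hypothetical and do not reproduce the study data.

```python
def diagnostic_metrics(index, reference):
    """Compare binary index-test marks with binary reference marks (1 = lesion present)."""
    if len(index) != len(reference):
        raise ValueError("index and reference must have the same length")
    pairs = list(zip(index, reference))
    tp = sum(1 for i, r in pairs if i == 1 and r == 1)
    fp = sum(1 for i, r in pairs if i == 1 and r == 0)
    fn = sum(1 for i, r in pairs if i == 0 and r == 1)
    tn = sum(1 for i, r in pairs if i == 0 and r == 0)

    def ratio(num, den):
        # Undefined ratios (empty denominator) are reported as NaN.
        return num / den if den else float("nan")

    return {
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "ppv": ratio(tp, tp + fp),
        "npv": ratio(tn, tn + fn),
        "accuracy": ratio(tp + tn, len(pairs)),
    }


def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters giving binary (0/1) marks to the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("raters must mark the same non-empty set of items")
    n = len(rater_a)
    observed = sum(1 for a, b in zip(rater_a, rater_b) if a == b) / n
    pa, pb = sum(rater_a) / n, sum(rater_b) / n
    expected = pa * pb + (1 - pa) * (1 - pb)  # chance agreement
    if expected == 1:
        return float("nan")  # kappa is undefined when chance agreement is total
    return (observed - expected) / (1 - expected)


def interpret_kappa(kappa):
    """Map kappa onto the agreement bands used in the text."""
    if kappa < 0:
        return "poor"
    for upper, label in ((0.2, "slight"), (0.4, "fair"),
                         (0.6, "moderate"), (0.8, "substantial")):
        if kappa < upper:
            return label
    return "almost perfect"


# Hypothetical quadrant marks (not data from the study).
stir_marks = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]  # reference
dwi_marks = [1, 1, 1, 0, 1, 1, 0, 0, 0, 0]   # index test
print(diagnostic_metrics(dwi_marks, stir_marks))

observer_1 = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
observer_2 = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
kappa = cohen_kappa(observer_1, observer_2)
print(round(kappa, 3), interpret_kappa(kappa))
```

Because the STIR sequence serves as the reference, these measures describe agreement with STIR rather than with an independent clinical diagnosis.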

Results

Characteristics of the group

In total, 46.9% (n = 23) of the study group fulfilled the imaging arm of the ASAS axSpA criteria. The mean SPARCC score of patients from the ASAS axSpA imaging arm positive group was 15.7 ± 15.9 (range 2–48). There was no statistically significant difference in age (p = 0.195) or gender (p = 0.130) between the group fulfilling the imaging arm of the ASAS axSpA criteria and the group without the diagnosis of axSpA.

Detailed information regarding the characteristics of the group is shown in Table 1.

Table 1 Clinical profile of patients included in the study

Diagnostic performance of DWI/ADC and DCE sequence vs. STIR sequence

The performance of the visual assessment of DWI sequence combined with ADC map and DCE sequence was compared to the STIR sequence with regard to the detection of active sacroiliitis fulfilling ASAS criteria for axSpA. DWI sequence with ADC map had slightly higher sensitivity and markedly lower specificity than DCE sequence in the detection of active sacroiliitis. Accuracy and PPV were slightly higher for DCE sequence than for DWI sequence with ADC, contrary to the NPV, which was higher for DWI sequence with ADC map.

A comprehensive summary of accuracy, sensitivity, specificity, PPV and NPV values for DWI sequence with ADC map and DCE sequence for both observers is shown in Table 2.

Table 2 The summary regarding the measures of diagnostic performance for DWI sequence with ADC map and DCE sequence in comparison to STIR sequence for both observers

Inter-observer agreement

The level of agreement between the two observers was compared for each sequence. The highest inter-observer agreement was achieved for the STIR sequence and was almost perfect (κ = 0.888). Agreement was substantial for both the DWI sequence with ADC map (κ = 0.674) and the DCE sequence (κ = 0.773), with a slight advantage for the DCE sequence.

Details concerning the inter-observer agreement of STIR, DWI sequence with ADC map and DCE sequence are provided in Table 3.

Table 3 Inter-observer agreement of various sequences used in our study

We also assessed the inter-observer agreement for SPARCC scoring and the agreement was almost perfect, at the same level as for STIR sequence (κ = 0.888), with a slightly narrower 95% confidence interval (CI 0.882–0.894).

Remaining active sacroiliitis symptoms vs. ASAS axSpA diagnosis

In the last step, we assessed the diagnostic performance of additional signs of active sacroiliitis in identifying patients who fulfilled the imaging arm of the ASAS axSpA classification criteria. Signs of synovitis were present in 18.4% (n = 9) of all patients, capsulitis in 16.3% (n = 8) and enthesitis in 10.2% (n = 5). Synovitis (34.8% with axSpA vs. 3.8% without; p = 0.008) and capsulitis (34.8% with axSpA vs. 0.0% without; p = 0.001) were significantly more frequent in patients with axSpA than in the cohort without axSpA. A similar association was not detected for the presence of enthesitis (17.4% with axSpA vs. 3.8% without; p = 0.173). Although all these signs achieved high sensitivity for the identification of patients with axSpA, their specificity was very poor.
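The group comparisons above can be illustrated with a two-sided Fisher’s exact test written with the standard library only. The 2 × 2 counts are not stated directly in the text; they are inferred here from the reported percentages and the group sizes (23 patients with axSpA, 26 without), so they should be read as an assumption. The function name is hypothetical.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables with the same
    margins that are no more likely than the observed table.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_observed = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # The small tolerance guards against floating-point ties.
    return sum(prob(x) for x in range(lowest, highest + 1)
               if prob(x) <= p_observed * (1 + 1e-7))


# Counts inferred from the reported percentages (assumption):
# (sign present with axSpA, sign absent with axSpA,
#  sign present without axSpA, sign absent without axSpA)
inferred_counts = {
    "synovitis": (8, 15, 1, 25),
    "capsulitis": (8, 15, 0, 26),
    "enthesitis": (4, 19, 1, 25),
}

for sign, table in inferred_counts.items():
    print(sign, round(fisher_exact_two_sided(*table), 3))
```

With these inferred counts, the sketch gives p values of about 0.008, 0.001 and 0.173, which agree with the values reported for synovitis, capsulitis and enthesitis.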

More information regarding the diagnostic performance of these signs is presented in Table 4.

Table 4 The diagnostic performance of synovitis, capsulitis and enthesitis in the identification of patients with axSpA

Discussion

Our results show that the visual assessment of the DWI sequence paired with the ADC map has accuracy, sensitivity and PPV similar to those of the DCE technique, and that these values remained high for both sequences. On the other hand, both sequences have poor specificity (especially DWI) in comparison to the gold standard, the STIR sequence [1]. Thus, visual assessment of these sequences does not seem to be helpful in the early detection of active sacroiliitis. This finding is consistent with the conclusion of Boy et al. that adding the visual assessment of DWI and DCE sequences to the axSpA diagnostic path does not aid the diagnosis of sacroiliitis, for either less or more experienced radiologists [9]. The primary drawback of these two sequences, which could explain their poor effectiveness in the visual assessment of the SIJs, may be their markedly lower contrast-to-noise ratios in comparison to the STIR sequence [14]. As a decreased contrast-to-noise ratio makes lesions less clearly defined, the visual assessment of the SIJs in these sequences is hindered. Notably, the lower specificity of the DWI sequence in comparison to the DCE sequence may also be caused by its particular susceptibility to artefacts, such as “T2 shine through”, “T2 black out”, ghosting, blurring and distortions [15]. The additional analysis of ADC maps, which we performed, helps to discriminate between actual lesions and some artefacts, but these maps are generally hard to interpret [16], especially in the case of small lesions.

In all sequences, the SIJs were assessed systematically, using a method based on the SPARCC score [12], but without the evaluation of depth and intensity of the lesion and with a set number of assessed slices. This ensured that the visual assessment was sufficiently thorough and that the observers identified and considered the same changes as pathologic in a particular examination. Thus, it was possible to evaluate the inter-observer agreement credibly. The highest value of this parameter was achieved by the STIR sequence, which confirms that its visual assessment is notably easier and more precise than that of the multiparametric MRI sequences. The agreement for the DWI sequence with ADC maps and the DCE sequence was acceptable and at a similar level, with a slight advantage for the DCE sequence. In spite of its higher inter-observer agreement and slightly better overall diagnostic performance in the detection of active sacroiliitis, the DCE sequence remains at a disadvantage in comparison to the DWI sequence. The first reason is that it requires the administration of gadolinium-based contrast media, which may cause adverse effects and should not be overused due to the risk of gadolinium deposition in the brain and bones [17]. Moreover, its acquisition time is long in comparison to the basic sequences (twice as long as that of the STIR sequence) and the DWI sequence. In contrast, the DWI sequence does not require gadolinium contrast media and its acquisition time is similar to that of the STIR sequence. Taking all the above-mentioned aspects into consideration, neither the visual assessment of the DWI sequence nor that of the DCE sequence seems to be promising for the early detection of active sacroiliitis in MRI.

An additional parameter that we assessed in this study was the presence of concomitant signs of active sacroiliitis such as synovitis, capsulitis and enthesitis. Overall, they were not very prevalent in the axSpA-positive group (synovitis: 34.8% of patients with axSpA, capsulitis 34.8% and enthesitis 17.4%), but they exhibited high sensitivity for the identification of axSpA-positive patients. Nonetheless, their specificity was extremely low, and enthesitis was not even statistically significantly more prevalent in the group with axSpA than in axSpA-negative patients. This supports the statement of Lambert et al. that these changes are not sufficient to identify active sacroiliitis without coexisting bone marrow oedema that is highly suggestive of SpA [13]. They could only give a hint about the probable diagnosis if bone marrow oedema is present, but their overall significance is low.

We cannot omit the fact that several previous authors reported the high relevance of quantitative assessment of the ADC value in the differentiation of axSpA and non-inflammatory lesions [7, 18,19,20,21,22,23] as well as in the monitoring of axSpA treatment [8, 24]. The mean ADC was higher in patients with active axSpA than in patients with low back pain of mechanical origin [20, 21], patients with Modic 1 changes in the spine [22, 23] and healthy individuals [18,19,20]. Furthermore, according to previous research, the ADC value measured within the bone lesions correlates with C-reactive protein level [7], disease activity (BASDAI, Bath Ankylosing Spondylitis Disease Activity Index), functional impairment (BASFI, Bath Ankylosing Spondylitis Functional Index) and patient global assessment (BASGI, Bath Ankylosing Spondylitis Global Index) scores [25]. These results seem to be promising and further research is vital. First of all, a reliable methodology of ADC value measurement should be developed and standardized across future studies. Currently, some authors perform a direct measurement of the ADC value [7, 18, 20, 21], while others calculate the relative ADC value [10, 19], referring it to unaffected bone, for instance, in the midline of the sacral bone [10]. Additionally, some reports compare the ADC value within the inflammatory lesions with the one measured within unaffected bone [7, 10, 18, 20, 21], while in others, the mean ADC value from regions unaffected by bone marrow oedema or structural changes is assessed globally and compared between the sides and between the groups with and without axSpA [19]. These discrepancies hinder a reliable comparison of previous authors’ results. Moreover, as ADC values differ significantly according to age and sex, it is advisable to use a relative ADC value [16].

We should also be especially careful with ADC value measurement in younger cohorts, as the ADC value in skeletally immature patients could overlap with the values reported for active sacroiliitis [26]. Another important issue that has not been covered yet is the ADC cutoff value for discrimination between healthy individuals and patients with active axSpA. Nonetheless, apart from all these diagnostic pitfalls of the DWI sequence, there are still doubts whether the measurement of the ADC value, and in consequence the addition of the DWI sequence to the axSpA diagnostic algorithm, is really beneficial. First, as Lambert et al. emphasize, numerical data could be obtained from any MRI sequence, not only from the more advanced sequences of multiparametric MRI [16]. A good example of a validated and feasible method is the semi-quantitative assessment of active inflammatory lesions with the SPARCC score [12]. The almost perfect level of inter-observer agreement that we obtained for SPARCC scoring further confirmed its reliability. To facilitate SIJ evaluation even more, the CaRE (Canadian Research and Education) Arthritis Organization provides a simple, schematic SPARCC scoring interface, available on their website [16]. Second, the measurement of ADC values inside very small lesions could be highly time consuming and inexact. Third, the reproducibility of ADC measurement is questionable and its values can vary even when the same MR system is used [15]. Hence, future research should focus on the standardization and validation of ADC measurement methods, instead of searching solely for spectacular correlations with the disease.

This study also has some limitations, namely its retrospective design and small study group. The only lesion whose visibility was compared across the sequences in our study was bone marrow oedema, which is not pathognomonic for axSpA and might be present in up to 23% of patients with non-specific back pain and in approximately 7% of healthy volunteers. In consequence, the MRI examination result should always be correlated with the clinical symptoms and laboratory results of the particular patient [27]. Furthermore, we did not assess any quantitative data derived from the analysed sequences, although this is part of our future research agenda.

Conclusions

The visual assessment of the DWI sequence with ADC maps and the DCE sequence is characterised by high accuracy and sensitivity for the detection of bone marrow oedema/osteitis, but the specificity of these sequences is poor, especially for the DWI sequence with ADC maps. Moreover, the inter-observer agreement of these two sequences is lower than that calculated for the STIR sequence. Hence, the visual assessment of DWI and DCE sequences is not beneficial in the early diagnosis of active sacroiliitis in the course of axSpA. Likewise, the presence of additional signs of active sacroiliitis (synovitis, capsulitis and enthesitis) does not aid the diagnosis of axSpA.