Introduction

High-resolution magnetic resonance neurography (MRN) has emerged to a useful diagnostic tool for various neuropathies and allows detecting minor damage of the peripheral nervous system (PNS) with high sensitivity [1,2,3,4,5,6]. In a clinical setting, MRN is typically based on visual assessment of nerve lesions using high-resolution T2-weighted sequences.

Morphological nerve imaging can be complemented by quantitative MRI techniques, such as diffusion tensor imaging (DTI), which are increasingly studied in traumatic, hereditary, inflammatory, and degenerative neuropathies [7,8,9]. These techniques offer additional information about nerve microstructure or tissue composition and provide quantitative biomarkers. In addition to DTI—which has been well evaluated [10]—T2 relaxometry and magnetization transfer imaging (MTI) have increasingly been applied as novel quantitative MRN techniques in recent investigations [11,12,13,14].

T2 relaxometry yields the readout parameters transverse relaxation time (T2) and proton spin density (PSD) and is conducted by a multi-echo sequence and fitting of an exponential function. MR signal loss after a radiofrequency pulse usually follows an exponential decay characterized by the tissue-specific time constant T2, which describes the time in which the transverse magnetization decreases to 37% (1/e) of its initial value [15, 16]. PSD is another tissue-intrinsic parameter and refers to the concentration of protons excitable by MRI [17]. It equals the theoretical MR-signal-intensity without any effects of transverse relaxation. While T2 is considered a biomarker of free water, PSD is regarded to reflect total water content including protons bound to macromolecules such as myelin [16,17,18].

MTI is an MRI technique that generates the readout parameter magnetization transfer ratio (MTR) [19,20,21]. It relies on the principle that protons bound to macromolecules may be selectively saturated by an off-resonance radiofrequency pulse since resonance occurs in a larger bandwidth off the Larmor frequency in these bound protons compared to free-water protons. Magnetic saturation is then transferred to free-water protons which leads to a lower MR signal in a sequence with a saturation pulse than without. MTI works by a pair of MR sequences, one with and one without a preceding off-resonance saturation pulse. By calculating the relative difference between signals, the MTR can be determined, which reflects the concentration of bound protons and their interaction with protons in free water and is considered a biomarker of demyelination [22, 23].

Recent studies suggested that T2 relaxometry and MTI could yield promising MRN biomarkers for the assessment of various neuropathies [24,25,26,27]. However, these techniques have not been implemented into clinical routine yet. To use these parameters for decisions in individual patients, it is crucial to assess reliability by a quantitative estimation of the measurement error attributed to different examinations (test-retest) or different readers (interreader). First studies implemented reliability analyses of these techniques in their investigations with promising results, but a systematic assessment of both test-retest and interreader reliability in a larger cohort and a quantification of the measurement error are still lacking [11, 28, 29].

Reliability is commonly expressed by the intraclass correlation coefficient (ICC), a dimensionless parameter ranging between 0 and 1 [30]. However, the ICC should be interpreted cautiously since it depends not only on the measurement error but also on the sample variance. To describe measurement imprecision independently of the sample variance, the standard error of measurement (SEM) may be calculated, which indicates how test results spread around a “true” value [31, 32]. Besides, the SEM allows calculating the minimum detectable difference (MDD), which is the smallest difference needed between separate measurements in order for the difference to be considered real [31, 32]. Although sometimes sharing the same abbreviation, the standard error of measurement should not be confused with the standard error of the mean, which is considered a different statistical parameter.

The aim of the present study was to systematically assess interreader and test-retest reliability of MTI and T2 relaxometry in MRN and to quantify the measurement accuracy by means of ICC, SEM, and MDD. Being the most commonly examined and well-accessible nerve, we focused on the sciatic nerve and examined a cohort of 21 healthy participants who underwent three MRN examinations on separate days which were analyzed by two readers independently.

Materials and methods

This prospective study was approved by the institutional ethics board and written informed consent was obtained from all participants prior to the examinations. The study design is summarized in Fig. 1.

Fig. 1
figure 1

Study design flowchart. Twenty-one healthy participants underwent repeated MR neurography of their sciatic nerve on three separate days, each covering the exact same anatomical region. Image analysis and postprocessing were performed by two independent readers. Interreader agreement and test-retest reproducibility were statistically assessed for each of the biomarkers transverse relaxation time (T2), proton spin density (PSD), and magnetization transfer ratio (MTR)

Study subjects

Twenty-one healthy adult (>18 years) male volunteers were prospectively enrolled over a period of 22 months. Mean age was 24.1 ± 3 years (range: 20–30 years), mean weight was 76.8 ± 6.7 kg, mean height was 1.81 ± 0.1 m, and mean body mass index was 23.4 ± 1.7 kg/m2. Exclusion criteria were history of peripheral nerve disorders and general contraindications concerning MRI.

MR data acquisition

All participants were prospectively examined at a 3-Tesla MR scanner (Magnetom Prisma-FIT, Siemens Healthineers) on three separate days using a 15-channel transmit-receive knee coil (Siemens Healthineers). The mean timespan was 5.7 ± 2.7 days between scans 1 and 2 and 4.8 ± 0.8 days between scans 2 and 3. MRN of the sciatic nerve of the dominant leg was conducted at mid to distal thigh level according to the following protocol (MR sequence parameters in Table 1):

  1. 1.

    An axial T2-weighted 2-dimensional turbo spin-echo (TSE) sequence to provide anatomical coverage with a high structural resolution,

  2. 2.

    An axial 2-dimensional multi-spin-echo (MSE) sequence for T2 relaxometry, and

  3. 3.

    Two-axial proton density–weighted 3-dimensional gradient-echo (GRE) FLASH sequences with and without an off-resonance saturation pulse (Gaussian envelop, duration 9984 μs, frequency offset 1200 Hz) at the exact same slice position for MTI. An adaptive inline image filter (Siemens Healthineers) was applied to reduce possible effects of B1 inhomogeneities on the received signal.

Table 1 MRN sequence parameters

Image postprocessing

Image postprocessing and segmentation were performed within 3 weeks after completion of image acquisition. First, images were visually assessed for motion artifacts and other artifacts that might impede segmentation of the sciatic nerve or that might modify readout parameters by projection on nervous tissue. All images were analyzed by two readers M.K. and F.P. with more than 6 and 4 years of experience in neuromuscular imaging, respectively, using the DICOM-viewer OsiriX (Pixmeo Sarl). Six central slices within each image slab covering the same anatomical region were identified by F.P. prior to further analysis. Manual nerve segmentation was subsequently performed in the high-resolution T2-weighted images by both readers independently and was restricted to the tibial portion of the sciatic nerve to prevent inclusion of paraneurial fat (for representative examples of image postprocessing, see Fig. 2) [33].

Fig. 2
figure 2

Image postprocessing. Nerve segmentation was performed by each reader independently by delineating the tibial portion of the sciatic nerve using the freehand region of interest (ROI) tool in Osirix in the T2-turbo spin-echo (TSE) sequence (a). Subsequently, ROIs were transferred onto co-registered gradient-echo sequences for magnetization transfer imaging (MTI) (b), and multi-echo sequences for T2-relaxometry (c) and quantitative readout parameters were assessed. MTR = magnetization transfer ratio, TE = echo time, S0 = MR signal intensity without saturation pulse, S1 = MR signal intensity with saturation pulse

T2 relaxometry

T2 relaxometry was based on MR signal analysis of the sciatic nerve in the MSE sequence. ROIs from the T2-weighted images were copied onto the corresponding MSE slice with TE = 20 ms, in which the boundaries of the sciatic nerve could be delineated best. Manual correction of distortion artifacts was performed by each reader independently, if necessary. Signal intensity for the whole ROI was determined for every echo time using the OsiriX plugin ROI enhancement, since averaging over all voxels helps to denoise data. Then, an exponential function was fitted:

$$ S(TE)=\mathrm{PSD}\times {\mathrm{e}}^{-\frac{TE}{T2}}+\mathrm{offset}, $$

as described previously [26, 33, 34], where S(TE) stands for the signal intensity at a given echo time TE, PSD is a dimensionless value for proton spin density, and T2 is the transverse relaxation time. To minimize systematic error, we restricted the analysis to the even echoes and used the offset as a fitting parameter, as described previously [34,35,36]. Additionally, a normalized PSD was calculated by dividing the PSDnerve by a PSD of skeletal muscle (PSDnerve/PSDmuscle). The ROI for measurement of PSDmuscle was placed in muscle tissue medially adjacent to the sciatic nerve (M. semimembranosus or adductor magnus). After slice-wise calculation of T2 and PSD, parameters were averaged from all six slices for further analysis.

Magnetization transfer ratio

MTR was calculated based on sciatic nerve signal intensity in the pair of GRE sequences without (S0) and with (S1) off-resonance saturation pulse using the equation:

$$ \mathrm{MTR}=100\times \frac{\left({S}_0-{S}_1\right)}{S_0} $$

Analogously to T2 relaxometry, segmentation information was derived from the high-resolution T2 sequence. Likewise, signal intensity was averaged over all voxels to reduce possible effects of noise. If necessary, ROIs were manually adjusted for distortion or motion artifacts by both readers independently using the images without the off-resonance pulse. Additional computational co-registration between the two MTI sequences was not applied [37], as we were able to ascertain visually that ROIs aligned well with the nerve contours in both sequences (Fig. 2B). Subsequently, MTR was calculated for every slice and then averaged over all six slices.

Statistical analysis

All values are shown as mean ± standard deviation. p values ≤ 0.05 were regarded as statistically significant. Statistical analyses were performed using SPSS (Version 24; SPSS Inc.) or R (Version 4.0.3; R Foundation for Statistical Computing). Graphs were created using GraphPad Prism (Version 8.3; GraphPad Software Inc.).

A single measurement, absolute agreement, two-way random effects model, ICC (2,1) according to Shrout and Fleiss was applied to assess interreader agreement and to calculate ICCs with 95% confidence intervals (CIs). To estimate test-retest reliability, a single measurement, absolute agreement, two-way mixed effects model was applied to calculate ICC (3,1) for each observer separately, and CIs were calculated accordingly. ICC values between 0.5 and 0.75, between 0.75 and 0.9, and greater 0.9 were regarded as moderate, good, and excellent agreement [30]. In addition, a Bland-Altmann analysis for repeated measurements was conducted and illustrated in Bland-Altmann plots. To assess measurement accuracy between readers and scans, SEM was calculated according to Popovic and Thomas [31]. MDD values for a CI of 95% were calculated according to the formula MDD = SEM x 1.96 x √2 [31, 32].

Results

For the sciatic nerve, overall mean ± standard deviation MTR was 26.75 ± 3.5%, mean T2 relaxation time was 64.54 ± 8.2 ms, and mean PSD was 340.93 ± 78.8. Overall normalized PSD was 0.71 ± 0.09. Detailed descriptive statistics for every parameter, all three scans, and both readers are presented in Table 2 and Fig. 3.

Table 2 Descriptive statistics of multiparametric sciatic nerve MRN
Fig. 3
figure 3

Descriptive statistics for both readers and all three scans. Values are illustrated as boxplots with whiskers to visualize measurement distribution (box showing the 25th to 75th percentiles with a line at the median, whiskers indicate the minimum and maximum data values). MTR = magnetization transfer ratio, T2 = transverse relaxation time, PSD = proton spin density

Assessment of interreader reliability overall resulted in ICC values between 0.81 (for MTR) and 0.94 (for PSD), implying good to excellent agreement according to Koo and Li [30]. Calculation of test-retest reliability resulted in slightly lower ICCs ranging from 0.75 to 0.94. Lowest ICCs were observed for nerve MTR and highest for nerve PSD, respectively.

SEM for MTR was 1.7% for both interreader and test-retest agreement, SEM for T2 was 2.67 ms for interreader and 2.66 ms for test-retest agreement, and SEM for PSD was 21.3 and 20.1 for interreader and test-retest agreement, respectively. ICC, SEM, and MDD values for all parameters are listed in Table 3.

Table 3 Intraclass correlation coefficients (ICCs), standard error of measurement (SEM), and minimum detectable difference (MDD) values for interreader reliability and test-retest reliability

Bland-Altmann plots for test-retest reliability are shown in Fig. 4 and for interreader reliability in Fig. 5. Bland-Altman analysis showed a rather random distribution of measurement error with low bias between raters and scans. Also, no proportional bias such as systematically higher measurement error for higher or lower MTR, T2, or PSD values could be observed. Overall mean differences between readers were 0.1% for MTR, 0.3 ms for T2, and 10.1 for PSD.

Fig. 4
figure 4

Bland-Altman plots for assessment of test-retest reliability of magnetization transfer ratio (MTR) (a), transverse relaxation time T2 (b), and proton spin density (PSD) (c) for both observers and all three scans, respectively. The black continuous line represents the mean of all differences (bias), grey dotted lines show the 95% limits of agreement

Fig. 5
figure 5

Bland-Altman plots for assessment of interreader reliability of magnetization transfer ratio (MTR) (a), transverse relaxation time T2 (b), and proton spin density (PSD) (c) of the sciatic nerve. The black continuous line represents the mean of all differences (bias), grey dotted lines show the 95% limits of agreement

Discussion

This study evaluates the interreader and test-retest reliability of quantitative MRN biomarkers like MTR, T2 relaxation time, and PSD of the sciatic nerve in a cohort of 21 healthy participants. Each participant underwent three MRN exams on separate days which were analyzed by two independent readers. By reporting ICCs, SEM, and MDD values, we quantify measurement imprecision of all readout parameters.

While T2 relaxometry and MTI are increasingly studied as quantitative MRN techniques, they have not been implemented in clinical routine yet [11,12,13,14, 24, 26]. Before interpreting MRN biomarkers in individual patients, assessment of reliability is required not only in qualitative categories but particularly by precise quantification of measurement error.

This study adds to the field by providing a systematic quantification of interreader and test-retest reliability in a larger cohort for both peripheral nerve MTI and T2 relaxometry biomarkers. With regard to peripheral nerve MTI, we are aware of two previous studies that have assessed reliability in smaller cohorts (Yiannakas et al [28]: 5 participants, ICCinterreader 0.65, ICCtest-retest 0.76; Dortch et al [11]: 13 participants, ICCinterreader 0.92, ICCtest-retest 0.69), although the main focus of these studies was on other research concerns [11, 28]. Regarding T2 relaxometry, Sollmann et al [29] have recently reported promising first results for interreader reliability while test-retest reliability had not been assessed yet to our knowledge.

While the ICC is a widely used index of reliability, it depends not only on the measurement error itself but also on the variance of the measured parameter in the examined cohort [38,39,40]. To provide a parameter of measurement imprecision that is independent of the variance of the biomarker in the test cohort, we here additionally report the SEM for all assessed biomarkers.

Moreover, the SEM allows for the calculation of the MDD, which indicates the difference in repeated measurements needed to attribute a measured difference to a change in the true value with a 95% certainty, but not to a fluctuation due to measurement error [31, 32]. The SEM and MDD are therefore of high significance when implementing quantitative biomarkers in clinical practice.

To value the clinical relevance of the measurement error of T2 relaxometry and MTI, the MDD can be regarded in context with results from previous studies in the setting of specific neuropathies and healthy controls as summarized in Tables 4 and 5.

Table 4 MTR values in health and disease as reported in previous studies of peripheral nerve imaging (non-exhaustive)
Table 5 Nerve transverse relaxation times T2 and/or proton spin density (PSD) values in various diseases as reported in previous studies of peripheral nerve imaging (non-exhaustive)

For MTR of the sciatic nerve, the current study implies an MDD value of 4.7%. In previous studies, significant differences in MTR between patients and healthy controls ranged from 3.4 to 13%, and in three out of four studies a difference of over 4.7% was observed (Table 4) [11, 12, 14, 24]. Notably, the mean MTR of our cohort was lower than in most of the control groups listed in Table 4. This most likely has technical reasons, since MTR values depend on various factors, such as magnetic field strength, coil characteristics, and sequence parameters, like offset frequency and power of the saturation pulses [47,48,49,50].

For T2 of the sciatic nerve, the present study implies an MDD of approximately 7.4 ms. For comparison, previously reported significant differences of T2 ranged from 10.8 to 26.6 ms between patients and control groups (Table 5) [13, 25, 29, 43, 44].

For PSD, our results imply an MDD of 55.7 (test-retest) and 59 (interreader). PSD is considered a semi-quantitative parameter that directly depends on the MR signal and associated parameters, such as receiver gain. Therefore, reliability values of PSD reported in this study may only be regarded as orientation values and in conjunction with the absolute mean values of other studies (Table 5). To overcome this limitation, we additionally normalized PSD values by using adjacent muscles. Although these normalized PSD values may serve as a more robust parameter when using different setups, they may however be influenced by muscular changes which can occur in the context of systemic neuropathies [51].

While the calculated MDDs are often smaller than the mean differences between patients and healthy controls, it is not clear whether changes in biomarkers in longitudinal follow-up studies will exceed these limits. So far, we are not aware of any longitudinal studies assessing MTR, T2, or PSD in the PNS. However, such longitudinal studies are crucial to further evaluate the potential of quantitative MRN in a clinical follow-up setting.

Although quantitative MRN offers multiple contrasts, neuropathies of different etiologies share non-specific changes as an increase in T2 or PSD and a decrease in MTR, which may limit their potential in differential diagnosis [52]. Furthermore, normative data may depend on scanning parameters, demography, and postprocessing methods [33, 41]. However, if reliability is proven, quantitative biomarkers could allow tracking disease progression or responsiveness under therapy. In order to implement quantitative MRN as a clinical tool, more studies on normative data should be carried out in the future.

Our study has limitations. First, we restricted the analysis to the sciatic nerve, since it is the most commonly examined nerve and well suited for MRN due to its large caliber and straight course and therefore most appropriate for MSE and GRE sequences with lower special resolution. An inclusion of smaller nerves or nerves with an oblique course may have led to different values of measurement error. Thus, all values presented should primarily be used for MRN of the sciatic nerve and may generally serve as orientation values under optimal conditions. Second, our cohort only consisted of healthy young participants and demographic variables were relatively homogeneous. This also allowed for assessment of measurement error under rather good conditions, since from our experience motion artifacts are less commonly observed in younger subjects. The examined cohort was male, however, we would not expect different results for female participants, since all parameters have been shown to not systematically differ between sexes [33, 41]. Furthermore, planning of the MR examination and image postprocessing were performed by experienced neuromuscular radiologists, and also for this reason the proposed orientational values should rather be regarded as minimum values. Besides, MTR may depend on B1 field inhomogeneities, which is why correction schemes may be applied to neuromuscular imaging [11, 12, 53]. To reduce the effects of B1 field inhomogeneities on the received signal, we applied an adaptive inline image filter and restricted analysis to six central slices where the B1 field is expected to be more homogeneous than at the edges of the slab. Lastly, this is a single-center, single-vendor study and measured parameters may depend on the hardware, sequence parameters, and postprocessing methods, limitations that are inherent to most quantitative imaging studies. While we measured test-retest reliability in a follow-up setting with one particular scanner, we would expect higher measurement error if multiple scanners were used, as differences in amplification of the MR signal modify PSD and differences in saturation pulses may affect MTR.

In conclusion, this study demonstrates that MTI and T2 relaxometry of the sciatic nerve results in reliable and reproducible values. By assessing the SEM for all examined parameters, it provides quantitative data to measure the imprecision that is associated with multiple scans or different observers. These values may be considered as orientation values of measurement error in further studies and potential clinical applications of quantitative MRN.