Introduction

Since its first clinical application in the musculoskeletal (MSK) system, muscle diffusion tensor imaging (m-DTI) has become a valuable sequence for the evaluation of architectural changes of fibre microenvironments [1,2,3,4]. M-DTI provides parameters such as fractional anisotropy (FA), radial diffusivity (RD), and mean diffusivity (MD) that allow quantitative information on the integrity of muscle fibres to be extracted [5, 6]. Inflammatory pathologies, traumatic injuries, neuromuscular disorders, and atrophic conditions are the principal areas of application of m-DTI, in conjunction with conventional sequences, in the assessment of early structural changes [7,8,9,10,11].

Furthermore, post-processing of DTI is able to generate fibre tractography and to assess 3D muscular structure from the origin to distal insertion by calculating architectural parameters such as fibre length, number and volume, and pennation angle [12].

Despite its potential to assess structural changes of the muscles, the application of m-DTI in clinical practice is still controversial because several factors influence DTI signals, such as field strength, gradient strength, b-values, and post-processing algorithms [13]. Several studies reported an acceptable agreement of DTI measurements on the brain, whereas m-DTI studies, which were mainly conducted on lower limb muscle groups, reported relatively high variations, with FA values ranging from 0.28 to 0.6 [6, 14]. Moreover, some studies on brain DTI reported conflicting results regarding inter-site, intra-site, and inter-vendor reliability [15,16,17]. To the best of our knowledge, no studies have assessed m-DTI inter-vendor agreement. The aims of our study were to assess the inter-reader reliability and the inter-vendor reliability of m-DTI measurements on 3 T magnetic resonance (MR) scanners.

Materials and methods

Study subjects

The local ethics committee approved our prospective study, and all participants signed an informed consent before starting the examination. The study was conducted in compliance with the Declaration of Helsinki.

We enrolled six healthy volunteers: three men (mean age: 42; range: 31–52 years) and three women (mean age: 36; range: 30–44 years).

Inclusion criteria were: age 18 years or older, no neuromuscular diseases in the personal and/or family history, no present or past muscle strains in the muscle group under evaluation, and no participation in any sports activity in the three weeks before the examination.

Exclusion criteria for enrolment were: usual contraindications to MR imaging, positive pregnancy test, and objects in the body that could obscure the target muscle groups through artefacts. After having optimized the sequences in collaboration with the specialists of the different vendors, each volunteer was scanned once on three different anatomical sites bilaterally (middle third of the arm, middle third of the leg, and middle third of the thigh). All scans were acquired on the same day within 6 h to reduce any possible bias and were checked for image quality and artefacts.

MR examination

MR examinations were performed using the three 3 Tesla (T) MR scanners of our institution: Signa Pioneer (GE Healthcare, Milwaukee, WI, USA), Achieva (Philips Healthcare, Best, Netherlands), and Skyra (Siemens Healthineers, Erlangen, Germany). The total MR examination time for Signa Pioneer, Achieva, and Skyra was 18.05, 17.56, and 17.52 min, respectively. MR protocol acquisition parameters, including RF coils, are summarized in Tables 1 and 2.

Table 1 DTI acquisition parameters
Table 2 T1 TSE acquisition parameters

Image analysis

Following data acquisition and after removing all patient-identifying information, a radiologist with eight years of experience in MSK MR interpretation assessed image quality [18]. Then, m-DTI parameters in the different muscle compartments were independently assessed by two radiologists (8 and 10 years of experience in the MSK field) using commercially available software (Olea Sphere 3.0). The muscle regions of interest (ROIs) were selected as described in Fig. 1. Post-processing was performed on the DTI images. Motion-related misalignments and adjacent image noise were corrected with automated image registration. Both readers manually drew the ROIs on the same slices at the middle third of the thigh, leg, and arm on axial T1w sequences, as shown in Fig. 1. FA, RD, and MD values of the different muscle areas were calculated. Fibre tractography of the thigh, leg, and arm is shown in Fig. 2.
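For readers less familiar with these parameters, the following minimal sketch shows how MD, RD, and FA are conventionally derived from the three eigenvalues of the diffusion tensor in a voxel. It uses the standard textbook definitions and is not a description of the internal implementation of the software used in this study; the function name `dti_scalars` and the example eigenvalues are hypothetical.

```python
import math


def dti_scalars(l1, l2, l3):
    """Return MD, RD and FA from the tensor eigenvalues (l1 is the largest).

    Standard definitions; units of MD and RD follow the input (e.g. mm^2/s).
    """
    md = (l1 + l2 + l3) / 3.0          # mean diffusivity
    rd = (l2 + l3) / 2.0               # radial diffusivity
    num = (l1 - md) ** 2 + (l2 - md) ** 2 + (l3 - md) ** 2
    den = l1 ** 2 + l2 ** 2 + l3 ** 2
    # FA is 0 for isotropic diffusion and 1 for diffusion along a single axis
    fa = math.sqrt(1.5 * num / den) if den > 0 else 0.0
    return {"MD": md, "RD": rd, "FA": fa}


if __name__ == "__main__":
    # Hypothetical eigenvalues in units of 10^-3 mm^2/s, for illustration only
    print(dti_scalars(2.0, 1.5, 1.2))
```

In a ROI-based analysis such as the one described above, these voxel-wise values would then be summarized over all voxels inside each ROI.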

Fig. 1

Axial T1w images showing ROIs of the different anatomical compartments. a 1 rectus femoris, 2 vastus medialis, 3 vastus lateralis, 4 vastus intermedius, 5 sartorius, 6 gracilis, 7 biceps femoris, 8 semitendinosus, 9 semimembranosus. b 1 Medial gastrocnemius, 2 lateral gastrocnemius, 3 soleus, 4 anterior tibialis, 5 peroneal muscles, 6 posterior tibialis, 7 flexor digitorum longus, 8 flexor hallucis longus. c 1 Medial head of triceps brachii, 2 lateral head of triceps brachii, 3 long head of triceps brachii, 4 biceps brachii, and 5 coracobrachialis

Fig. 2

Axial color FA map with overlaid tractography of the thighs (a), legs (b), and arm (c)

Statistical analysis

Results are reported as medians and interquartile ranges (IQR). Data distributions were checked for normality using the Shapiro–Wilk test, which showed that all data were non-normally distributed (p < 0.05). Next, the intra-class correlation coefficient (ICC) was used to determine between-observer agreement for the overall measurements and for the clinical sites; in the latter case, the left and right sides were kept distinct so that the contralateral side could serve as a double check of agreement. Between-group comparisons were then made with the nonparametric Friedman’s test, which was applied separately to each observer’s measurements to prevent biased findings. Finally, the Bland–Altman method was used to determine agreement among the measurements of the three scanners, comparing them pairwise. A p value less than 0.05 was considered statistically significant. Statistical analyses were performed with MedCalc Statistical Software version 19.2.6 (MedCalc Software bv, Ostend, Belgium; https://www.medcalc.org; 2020).
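As an illustration of the between-observer step, the sketch below computes one common form of the ICC from a table of paired readings. The ICC model used in the study is not stated in the text, so the choice of a two-way random-effects, absolute-agreement, single-measurement ICC (often written ICC(2,1)) is an assumption made here; the function name `icc_2_1` and the FA readings are hypothetical.

```python
def icc_2_1(ratings):
    """ICC(2,1): two-way random effects, absolute agreement, single rater.

    ratings: list of rows, one row per ROI, one column per reader.
    Assumed model for illustration; the study does not state which ICC form was used.
    """
    n = len(ratings)        # number of ROIs (targets)
    k = len(ratings[0])     # number of readers
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    # Two-way ANOVA sums of squares
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_err = ss_total - ss_rows - ss_cols

    msr = ss_rows / (n - 1)                 # between-ROI mean square
    msc = ss_cols / (k - 1)                 # between-reader mean square
    mse = ss_err / ((n - 1) * (k - 1))      # residual mean square
    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)


if __name__ == "__main__":
    # Hypothetical FA readings (reader 1, reader 2) for five ROIs
    fa = [[0.25, 0.26], [0.31, 0.30], [0.28, 0.29], [0.35, 0.34], [0.23, 0.24]]
    print(round(icc_2_1(fa), 3))
```

A consistency-type ICC, or an average-measures form, would give different values on the same data, so the model should always be reported alongside the coefficient.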

Results

Agreement between observers

The ICC reported high levels of agreement between the two observers as summarized in Table 3.

Table 3 Intra-class correlation

Good to excellent ICC values (higher than 0.69) were found between the two observers according to anatomical sites, except for FA measurement on Siemens MR (0.62, 95% CI 0.43–0.76). Detailed results are reported in Table 4.

Table 4 ICC for clinical sites

Inter-vendor reliability

No statistically significant inter-vendor differences were observed for either reader or for any of the parameters (Table 5). MD measurements for reader two were close to significance (p = 0.0573).
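The inter-vendor comparison can be sketched as a Friedman test in which each row is one ROI measured on the three scanners. The following is a simplified illustration, not the MedCalc procedure: it omits the tie correction, relies on the chi-square approximation (which is rough for small samples), and is restricted to exactly three scanners, for which the chi-square survival function with 2 degrees of freedom reduces to exp(-Q/2). The function names and the MD values are hypothetical.

```python
import math


def rank_row(row):
    """Rank the values of one row (1 = smallest), averaging ranks over ties."""
    order = sorted(range(len(row)), key=lambda j: row[j])
    ranks = [0.0] * len(row)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and row[order[j + 1]] == row[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1          # mean of ranks i+1 .. j+1
        for t in range(i, j + 1):
            ranks[order[t]] = avg
        i = j + 1
    return ranks


def friedman_three_scanners(rows):
    """Friedman statistic Q and approximate p value for exactly 3 scanners.

    rows: one row per ROI, three values per row (one per scanner).
    No tie correction is applied (simplifying assumption).
    """
    k = 3
    if any(len(row) != k for row in rows):
        raise ValueError("each row must contain exactly three values")
    n = len(rows)
    rank_sums = [0.0] * k
    for row in rows:
        for j, r in enumerate(rank_row(row)):
            rank_sums[j] += r
    q = 12.0 / (n * k * (k + 1)) * sum(s * s for s in rank_sums) - 3.0 * n * (k + 1)
    q = max(q, 0.0)                    # guard against tiny negative rounding errors
    p = math.exp(-q / 2.0)             # chi-square survival function, 2 d.o.f.
    return q, p


if __name__ == "__main__":
    # Hypothetical MD values (10^-3 mm^2/s) for four ROIs on three scanners
    md = [[1.52, 1.55, 1.50], [1.61, 1.58, 1.60], [1.47, 1.49, 1.51], [1.56, 1.53, 1.57]]
    print(friedman_three_scanners(md))
```

Running the test once per reader, as described in the statistical analysis, simply means calling the function on each reader's table separately.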

Table 5 Inter-vendor differences in MD, FA, and RD

Bland–Altman plots comparing MD (Fig. 3), FA (Fig. 4), and RD (Fig. 5) were drawn. The Bland–Altman method confirmed good agreement between parameter values, as the mean differences between scanners were small and all values fell within the limits of agreement (LoA, mean difference ± 1.96 standard deviations).
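For a single scanner pair, the quantities underlying a Bland–Altman plot can be sketched as follows: the per-ROI mean and difference of the two measurements, the bias (mean difference), and the limits of agreement at ± 1.96 standard deviations of the differences. The function name `bland_altman` and the FA values are hypothetical and serve only to illustrate the calculation.

```python
import statistics


def bland_altman(a, b):
    """Return bias, lower/upper limits of agreement and plot points for paired data.

    a, b: measurements of the same ROIs on two scanners, in the same order.
    """
    diffs = [x - y for x, y in zip(a, b)]
    means = [(x + y) / 2.0 for x, y in zip(a, b)]
    bias = statistics.mean(diffs)
    sd = statistics.stdev(diffs)       # sample standard deviation of the differences
    return {
        "bias": bias,
        "loa_lower": bias - 1.96 * sd,
        "loa_upper": bias + 1.96 * sd,
        "points": list(zip(means, diffs)),   # x = mean, y = difference
    }


if __name__ == "__main__":
    # Hypothetical FA values for five ROIs on two scanners
    scanner_1 = [0.25, 0.31, 0.28, 0.35, 0.23]
    scanner_2 = [0.26, 0.30, 0.29, 0.33, 0.24]
    result = bland_altman(scanner_1, scanner_2)
    inside = [result["loa_lower"] <= d <= result["loa_upper"] for _, d in result["points"]]
    print(result["bias"], result["loa_lower"], result["loa_upper"], all(inside))
```

With three scanners, the same calculation is repeated for each of the three pairs.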

Fig. 3

MD-A: Philips versus Siemens; B: GE versus Siemens; C: GE versus Philips

Fig. 4

FA-A: Philips versus Siemens; B: GE versus Siemens; C: GE versus Philips

Fig. 5

RD-A: Philips versus Siemens; B: GE versus Siemens; C: GE versus Philips

Discussion

Our study shows almost perfect inter-reader reliability for MD, FA, and RD on three different MR scanners and, overall, no statistically significant differences among the three vendors. A slight decrease in inter-reader agreement was detected on the Skyra scanner for FA measurement in the right thigh. We suppose that this is due to the inclusion of tiny fatty areas of the subcutaneous tissue within the ROI [19].

Similar to previous studies on the nervous system, in our study the DTI values showed slight statistical differences among different muscles, as reported in Additional file 1 [13, 15, 18,19,20]. We found the highest FA in the gracilis muscle (0.348; IQR: 0.097) and the lowest FA in the vastus intermedius (second observer, GE: FA = 0.229; IQR: 0.038). Nevertheless, we believe that these differences, although statistically detectable, are not clinically significant because there is no overlap between our FA values and those reported in patients with spinal muscular atrophy or muscular dystrophies, ranging from 0.7 to 0.41 [9, 23,24,25]. Our results support the findings of Fourè and colleagues, who reported some differences among the FA values of the muscles of the lower limb [26]. However, the values reported for the different muscles in that study differ slightly from ours. We believe that the discrepancies between the two studies may be due to the different magnetic field strengths: acquisition at 7 T provides a higher signal-to-noise ratio (SNR), which improves fibre tracking compared with 3 T [27, 28].

Another study conducted on ten volunteers showed good reproducibility on both 3 T and 7 T MR (Siemens Healthcare GmbH, Erlangen, Germany), with an SNR increase at 7 T of up to 111% [12]. However, at 7 T, the authors found higher FA and lower MD values in the soleus muscle, while the remaining muscle compartments did not show statistically significant differences in the quantitative values between the two scanners. The authors attributed the heterogeneity of the muscular values to the same effect described by Polders et al., who found a higher uncertainty in peripheral areas of the brain at 7 T [29].

Interestingly, one study conducted on 18 healthy subjects on a single 3 T MR scanner (Philips Medical Systems, Best, Netherlands) and covering the entire lower leg [30] reported a statistically significant intra-muscle difference in FA between the origin of the muscle and the muscle belly. This is probably due to different chemical–physical properties of the actin-myosin components and to the different amount and organization of collagen fibres.

M-DTI may be influenced by contraction and activity. Muscle contraction induces muscle fibre shortening and increases cross-sectional area (CSA), producing higher MD and lower FA values. Mazzoli et al. [31] performed MRI examinations of the lower leg in five volunteers during muscle contraction and found lower FA values in the anterior tibialis during dorsiflexion compared with plantarflexion and with no contraction. The assessment of DTI values during muscle contraction is complex because current MRI sequences are too long to allow for constant, homogeneous fibre contraction. For this reason, we preferred to perform our MR examinations with the muscles at rest. However, with the increasingly widespread use of ultra-fast MRI sequences, the evaluation of muscle contraction may become feasible in the future [32].

M-DTI may be influenced by activity as well [33]. Hooijmans and colleagues described higher DTI values in the upper leg muscles of 12 marathon runners in the post-marathon acquisition. The increase in MD and the decrease in FA are related to interstitial oedema and to the alteration of cellular barriers to diffusion caused by muscle micro-trauma. These values returned to baseline (i.e. the values observed in the pre-marathon phase) at the follow-up after 3 weeks. MD and RD values are the first to return to the resting-phase values, while FA values show a more prolonged alteration. This is the reason why we acquired the MR examinations on the same day (within 6 h) and required that no sports activity be performed in the three weeks before the MR examination [34].

The first limitation of our study is the small number of volunteers, although we assessed a large amount of data. Second, MR protocol parameters were not perfectly identical among the three MR scanners, because vendor-specific characteristics prevented us from applying exactly the same parameters on all scanners. However, other authors have used coils with different numbers of channels and DTI sequences with slightly different parameters to evaluate inter-observer and intra-observer agreement on the brain, obtaining promising results [13]. Moreover, the fact that good results were obtained despite some parameter differences indirectly suggests that an even higher inter-vendor agreement could be obtained in clinical applications.

We believe that these reasons would make this sequence even more usable in clinical practice.

However, other studies with a larger sample of healthy volunteers are needed to confirm this claim.

Conclusions

Our results highlight the inter-vendor and inter-reader reproducibility of m-DTI values, and we strongly believe that this sequence should be included more widely in MRI protocols in daily clinical practice for the evaluation of MSK pathologies.