Magnetic resonance imaging scoring system of the lower limbs in adult patients with suspected idiopathic inflammatory myopathy

Abstract Purpose We aim to propose a visual quantitative score for muscle edema in lower limb MRI to contribute to the diagnosis of idiopathic inflammatory myopathy (IIM). Material and methods We retrospectively evaluated 85 consecutive patients (mean age 57.4 ± 13.9 years; 56.5% female) with suspected IIM (muscle weakness and/or persistent hyper-CPK-emia with/without myalgia) who underwent MRI of lower limbs using T2-weighted fast recovery-fast spin echo images and fat-sat T2 echo planar images. Muscle inflammation was evaluated bilaterally in 11 muscles of the thigh and eight muscles of the leg. Edema in each muscle was graded according to a four-point Likert-type scale adding up to 114 points ([11 + 8] × 3 × 2). Diagnostic accuracy of the total edema score was explored by assessing sensitivity and specificity using the area under the ROC curve. Final diagnoses were made by a multidisciplinary Expert Consensus Panel applying the Bohan and Peter diagnostic criteria whenever possible. Results Of the 85 included patients, 34 (40%) received a final diagnosis of IIM (IIM group) while 51 (60%) received an alternative diagnosis (non-IIM group). A cutoff score ≥ 18 was able to correctly classify patients having an IIM with an area under the curve of 0.85, specificity of 96%, and sensitivity of 52.9%. Conclusion Our study demonstrates that a quantitative MRI score for muscle edema in the lower limbs (thighs and legs) aids in distinguishing IIM from conditions that mimic it. Supplementary Information The online version contains supplementary material available at 10.1007/s10072-024-07386-y.


Introduction
Idiopathic inflammatory myopathies (IIMs) are a heterogeneous group of diseases characterized by progressive and symmetric proximal muscle weakness, elevated serum levels of skeletal muscle enzymes (e.g., creatine phosphokinase (CPK)), presence of specific autoantibodies, electromyography changes, and primary inflammatory infiltration in muscle biopsy [1,2].
Laura Ludovica Gramegna, Rita Rinaldi, Giovanna Cenacchi and Raffaele Lodi contributed equally to this work. Laura Ludovica Gramegna and Rita Rinaldi share first authorship. Giovanna Cenacchi and Raffaele Lodi jointly supervise this work.
The diagnosis of IIM can be challenging in clinical practice as several conditions mimic it, including muscular dystrophies, metabolic myopathies, endocrine or toxic myopathies, and systemic inflammatory diseases [3]. The European League Against Rheumatism (EULAR) and the American College of Rheumatology (ACR) have recently proposed EULAR-ACR classification criteria to distinguish IIMs from mimics using 16 clinical and readily available laboratory/histopathological features [3]. Two models, with or without muscle biopsy results, were developed and, according to this consensus, a diagnosis of definite, probable, or possible IIM can be obtained based on total scores. However, there are several limitations to the EULAR-ACR Criteria Project, such as exclusion of normal controls from the external validation cohort, data missing in the derivation data set, and exclusion of validation samples and MRI data [3].
MRI has several advantages in clinical practice, including detection of muscle edema and fatty replacement using T2-weighted and short tau inversion recovery (STIR) sequences [4]. Previous exploratory muscle MRI studies correlated muscle edema with clinical markers of severity [4,5] in heterogeneous subgroups of IIM patients; however, a universally accepted qualitative and quantitative MRI scoring system for patients with IIMs is lacking. Most studies performed MRI on bilateral thigh muscles alone [4,6-10] and used binomial classification (present/absent) or qualitative visual scoring methods of muscle edema (Table 1, supplementary material).
Only recently has there been an attempt to obtain a cutoff value that distinguishes IIMs from muscular dystrophies [10].
Distinguishing IIM from its mimics is of great importance in clinical practice, as IIM requires immunosuppressive therapy, which can be dangerous in other types of muscle diseases with remarkably similar presentation [11].
The aims of this study were to assess the potential role of whole lower limb muscle MRI in the diagnosis of IIMs through a newly devised total edema scoring system, to describe its feasibility in clinical practice, and to explore its diagnostic accuracy in distinguishing between patients with diagnoses of IIM and mimics.

Study design and patient selection
This was a retrospective cohort study of patients evaluated between January 2008 and September 2018 in the "Diagnostic Therapeutic Assistance Path (PDTA) for adult muscle diseases" of the Policlinico Sant'Orsola-Malpighi, Bologna, Italy. Inclusion criteria were adults (age ≥ 18 years) with suspected inflammatory myopathy, i.e., muscle weakness and/or persistent hyper-CPK-emia with/without myalgia (Fig. 1). Patients with a clear non-inflammatory disease (i.e., clear hereditary or metabolic myopathy) were excluded.
The study was approved by the local IRB Ethics Committee (CE AVEC 850-2021-OSS-AUSLBO), and informed consent was waived due to the retrospective nature of the study.

Diagnostic evaluation pathway
All patients underwent neurological and rheumatological evaluations to assess muscle weakness and multisystemic involvement (such as skin rashes and esophageal or pulmonary dysfunction); needle electromyography with assessment of spontaneous muscle activity and a quantitative motor unit study; and assays for CPK, lactate dehydrogenase (LDH), transaminases, and myositis-specific and myositis-associated antibodies.
A muscle biopsy of the vastus lateralis to assess skeletal muscle inflammation and/or other changes was proposed to all patients. Steroids were administered to all patients with a definitive diagnosis and also, as an ex juvantibus criterion, to patients in whom IIM was not diagnosed by laboratory or histopathological testing but whose symptoms worsened.

Histological evaluation
Muscle samples were snap-frozen, cross-sectioned, and stained using a panel of routine histochemical methods: hematoxylin/eosin, modified Gomori trichrome, reduced nicotinamide adenine dinucleotide tetrazolium reductase (NADH-TR), combined cytochrome oxidase (COX) and succinate dehydrogenase (SDH), adenosine triphosphatases (ATPases), and acid and alkaline phosphatases [12]. Muscle cross sections were processed by immunohistochemistry using mouse monoclonal antibodies to major histocompatibility complex class I (MHC-I) and neonatal myosin (MHC-n) to check regeneration rate, as previously described [13]. Small samples from each biopsy were fixed in 2.5% glutaraldehyde in cacodylate buffer, post-fixed in 1% OsO4, dehydrated, and embedded in araldite. These sections were stained with uranyl acetate and lead citrate and were observed on a CM100 Transmission Electron Microscope [13]. Stained muscle sections were observed by a histopathologist with more than 20 years of experience in muscular disorders (G.C.) to check for morphological alterations suggestive of inflammatory myopathy.

MRI evaluation
MRI of the upper and lower limbs was performed in all patients. All images were acquired using a 1.5-T MRI scanner. The MR images generated were reviewed by two neuroradiologists with > 20 (R.L.) and > 10 years (L.L.G.) of experience in the field of neuromuscular disorders. Based on visual inspection of the axial FR-FSE and EPI-T2-weighted sequences, the edema of each muscle was graded according to a four-point scale (0 = no edema; 1 = slight edema involving 1/3 of muscle area and/or being of slight hyperintensity; 2 = moderate edema involving 2/3 of muscle area and/or being of moderate hyperintensity; or 3 = severe edema involving total muscle area and/or being of severe hyperintensity) (Fig. 2). The presence of fibro-adipose tissue within the muscle was evaluated with the same four-point scale. Muscle inflammation was evaluated in 11 individual muscles of the thighs (vastus lateralis, vastus intermedius, rectus femoris, vastus medialis, sartorius, adductor longus, adductor magnus, gracilis, biceps femoris, semitendinosus, semimembranosus) and 8 individual muscles of the legs (tibialis anterior, extensor digitorum longus, peroneus, tibialis posterior, flexor digitorum longus, soleus, gastrocnemius (caput lateralis), gastrocnemius (caput medialis)). Therefore, the highest possible total score was 114, i.e., ([11 + 8] muscles × 3 points × 2 sides).
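The arithmetic behind the 114-point maximum can be checked with a short sketch. The Python below is illustrative only and is not part of the study's workflow: the dictionary layout keyed by side and muscle, the names `max_total_score` and `total_edema_score`, and the choice to count ungraded muscles as 0 are all assumptions made for this example.

```python
# Illustrative sketch of the lower limb edema score described in the text.
# Function names and the dict-based data layout are hypothetical.

THIGH_MUSCLES = (
    "vastus lateralis", "vastus intermedius", "rectus femoris",
    "vastus medialis", "sartorius", "adductor longus", "adductor magnus",
    "gracilis", "biceps femoris", "semitendinosus", "semimembranosus",
)
LEG_MUSCLES = (
    "tibialis anterior", "extensor digitorum longus", "peroneus",
    "tibialis posterior", "flexor digitorum longus", "soleus",
    "gastrocnemius (caput lateralis)", "gastrocnemius (caput medialis)",
)
SIDES = ("left", "right")
MAX_GRADE = 3  # four-point scale: 0, 1, 2, 3


def max_total_score():
    """Highest possible score: (11 + 8) muscles x 3 points x 2 sides."""
    return (len(THIGH_MUSCLES) + len(LEG_MUSCLES)) * MAX_GRADE * len(SIDES)


def total_edema_score(grades):
    """Sum per-muscle grades given as {(side, muscle): grade}.

    Assumption for this sketch: muscles absent from `grades` count as 0.
    """
    scores = {"thigh": 0, "leg": 0}
    for (side, muscle), grade in grades.items():
        if side not in SIDES:
            raise ValueError(f"unknown side: {side!r}")
        if grade not in range(MAX_GRADE + 1):
            raise ValueError(f"grade must be 0-3, got {grade!r}")
        if muscle in THIGH_MUSCLES:
            scores["thigh"] += grade
        elif muscle in LEG_MUSCLES:
            scores["leg"] += grade
        else:
            raise ValueError(f"unknown muscle: {muscle!r}")
    scores["total"] = scores["thigh"] + scores["leg"]
    return scores


# Hypothetical patient, not taken from the study data.
example = {
    ("left", "vastus lateralis"): 3,
    ("right", "vastus lateralis"): 2,
    ("left", "soleus"): 1,
}
print(max_total_score())
print(total_edema_score(example))
```

Keeping thigh and leg subtotals separate mirrors the way the results report the score by compartment as well as in total.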
The neuroradiologists evaluated the images independently and blinded to clinical information. Divergent scores for each muscle were discussed and final judgment was reached by consensus, with these scores used for computing the final cutoff score.

Reference standard for the final diagnosis
The reference standard was the diagnostic decision established by consensus among a multidisciplinary team [14] consisting of two neurologists, one rheumatologist, and one histopathologist, evaluating clinical, neurophysiological, laboratory, and histopathological findings. Radiological images were excluded from the procedure. Whenever possible, diagnosis of IIM was based on Bohan and Peter diagnostic criteria [15,16]. The diagnosis of idiopathic hyper-CPK-emia was made as previously described by Kyriakides and colleagues [17-19]. In cases of patients refusing biopsy, other information was used to reach a consensus diagnosis on a single-patient basis. The final diagnostic groups were "IIM" and "non-IIM." EULAR-ACR [3] scores were assigned to all patients.

Statistical analysis
Continuous variables were expressed as mean and standard deviation (SD), while categorical variables were expressed by absolute numbers (n) and percentages (%). Groups were compared using Student's t-test or Mann-Whitney U test, as appropriate, for continuous variables, and Pearson chi-square test or Fisher's exact test, as appropriate, for categorical variables. The Shapiro-Wilk test was employed to evaluate the normality of the data.
For radiological evaluation, the intraclass correlation coefficient (ICC) was used to measure inter-rater agreement. A two-way mixed-effects model was applied, and the single-rater absolute agreement was evaluated [20]. Values < 0.5 indicate poor reliability [20].
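For readers unfamiliar with this form of the ICC, the sketch below computes a single-rater, absolute-agreement coefficient from two-way ANOVA mean squares using the standard textbook formula. It is an assumption that this matches the Stata computation used in the study; the function name `icc_absolute_single` and the toy ratings are hypothetical and are not study data.

```python
# Sketch of a two-way, single-rater, absolute-agreement ICC computed from
# ANOVA mean squares. Names and example ratings are hypothetical.

def icc_absolute_single(ratings):
    """ratings: one row per subject, each row holding one score per rater."""
    n = len(ratings)        # number of subjects (here: patients or muscles)
    k = len(ratings[0])     # number of raters
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)                 # between-subject mean square
    ms_cols = ss_cols / (k - 1)                 # between-rater mean square
    ms_error = ss_error / ((n - 1) * (k - 1))   # residual mean square

    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + (k / n) * (ms_cols - ms_error)
    )


# Two hypothetical readers who always agree, and two who differ by a
# constant one-point offset (penalized under absolute agreement).
identical = [[0, 0], [1, 1], [2, 2], [3, 3]]
offset = [[0, 1], [1, 2], [2, 3], [3, 4]]
print(icc_absolute_single(identical))
print(icc_absolute_single(offset))
```

The second example illustrates why absolute agreement is the stricter choice for a score with a fixed cutoff: a reader who systematically grades one point higher lowers the coefficient even though the ranking of subjects is unchanged.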
Diagnostic accuracy of the MRI edema total scoring system versus the reference standard was explored by measuring sensitivity, specificity, and the area under the ROC curve. Optimal sensitivity and specificity cutoffs were explored with both a statistical method (Youden index) and a clinical scenario method (maximization of post-test probability for a negative test and for a positive test). The variability of estimates was expressed for each measure with a 95% confidence interval (95% CI). Statistical analysis was performed using the Stata SE 14.2 statistical package (Stata Corp.). To explore whether it would be possible to construct a simplified index, ROC analysis was also performed on supplementary scores constructed by eliminating muscles from the total if they demonstrated low inter-rater agreement (i.e., ICC < 0.5).
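The two cutoff-selection approaches can be written out explicitly. The expressions below are the standard definitions of the Youden index and of post-test probabilities; the text does not spell out the exact formulas used, so these forms, and the symbols Se (sensitivity), Sp (specificity), and p (pre-test probability of IIM), are assumptions made for illustration.

```latex
\begin{align}
J &= \mathrm{Se} + \mathrm{Sp} - 1 \\
P(\mathrm{IIM} \mid \text{positive test}) &=
  \frac{\mathrm{Se}\, p}{\mathrm{Se}\, p + (1 - \mathrm{Sp})(1 - p)} \\
P(\text{no IIM} \mid \text{negative test}) &=
  \frac{\mathrm{Sp}\,(1 - p)}{\mathrm{Sp}\,(1 - p) + (1 - \mathrm{Se})\, p}
\end{align}
```

With the counts given in the Fig. 1 caption for the cutoff of 18 (20 positive scans, 18 of them in patients with IIM), the second expression evaluates to 18/20 = 90%.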
In the group of patients with IIM diagnosis, age was correlated with both the total edema score and total fibroadipose tissue score using the Spearman correlation test.

Patient characteristics
A total of 85 patients (mean age 57.4 ± 13.9 years; 48 females) were included in the study.
In seven IIM patients, the muscle biopsy was reported as negative since it showed very few myopathic changes; 3/7 showed fibro-fatty substitution (patients 73, 30, and 69), 2/7 (patients 69 and 63) showed moth-eaten fibers, and six patients had sarcoplasmic expression of MHC-I below the 50% cutoff [13]. In five of these patients, the diagnosis was made due to a strong response to immunotherapy (i.e., an ex juvantibus diagnosis). In one patient, the diagnosis was based on clear symptoms, including a cutaneous rash, and laboratory analysis showing anti-Jo positivity. In one patient, the biopsy showed no inflammatory changes but the patient tested positive for anti-SR1, which is compatible with inflammatory necrotizing myopathy. In three non-IIM patients (patients 34, 36, and 59), the muscle biopsy was reported to have inflammatory signs; 2/3 had myophagy and inflammatory cells (patients 34 and 59), and 1/3 (patient 36) showed rimmed vacuoles and intranuclear filaments at the ultrastructural level. No specific differences were found in comparison to the biopsies that tested positive in patients with definitive IIM diagnosis. In two of these cases, a final diagnosis of genetic myopathy was established, one being calpainopathy and the other a dystrophy due to a mutation in the Titin gene. In the third patient, a final diagnosis of undetermined myopathy was made.
In the group of patients with diagnoses of IIM, the average total edema score was 21.79 ± 16.95, specifically 12.29 ± 12.69 in the thighs and 9.50 ± 6.41 in the legs. In patients with final diagnoses of IIM mimics, the average total edema score was 5.43 ± 6.28, specifically 1.43 ± 3.21 in the thighs and 4.00 ± 4.25 in the legs. Table 3 includes details regarding the involvement of each muscle in both groups.
Total bilateral lower limb edema, thigh bilateral edema, and leg bilateral edema scores were significantly higher in the IIM group than in the non-IIM group (p < 0.001). The total burden of fibro-adipose tissue, both in the thighs and legs, was not significantly different between groups (p = 0.7866 and p = 0.8128, respectively) (Table 1).
In the group of patients with IIM diagnosis, there was no correlation between age and the edema score (rho = −0.036, p = 0.841), and the association between age and the fibro-adipose (FA) score was not significant (rho = 0.308, p = 0.077).
There were no differences in the edema score in IIM patients between those receiving treatment (14/34) and those untreated (20/34) at the time of the MRI (23.6 ± 18.9 versus 20.6 ± 15.7, respectively).

Diagnostic accuracy analysis
Exploratory diagnostic accuracy analysis of the edema total scoring system showed an area under the ROC curve of 0.85 (95% CI 0.77-0.93) (Fig. 3).
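The sensitivity and specificity reported for the two cutoffs can be reproduced from the patient counts given in the Fig. 1 caption (34 IIM and 51 non-IIM patients). The sketch below is a plain 2 × 2 table calculation written for illustration; the function name `diagnostic_metrics` is hypothetical, and the confidence intervals reported in the paper are not recomputed here.

```python
# Recompute accuracy measures from the counts in the Fig. 1 caption.
# The helper name is hypothetical; only point estimates are computed.

def diagnostic_metrics(tp, fp, fn, tn):
    """Return point estimates from a 2 x 2 table of test vs. diagnosis."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": (tp + tn) / (tp + fp + fn + tn),
        "youden": sensitivity + specificity - 1,
    }


# Scenario 1 (cutoff 7): 45 positive scans, 29 of them IIM;
# 40 negative scans, 5 of them IIM.
scenario_1 = diagnostic_metrics(tp=29, fp=45 - 29, fn=5, tn=40 - 5)

# Scenario 2 (cutoff 18): 20 positive scans, 18 of them IIM;
# 65 negative scans, 16 of them IIM.
scenario_2 = diagnostic_metrics(tp=18, fp=20 - 18, fn=16, tn=65 - 16)

for name, result in (("cutoff 7", scenario_1), ("cutoff 18", scenario_2)):
    summary = ", ".join(f"{k} {100 * v:.1f}%" for k, v in result.items())
    print(f"{name}: {summary}")
```

The lower cutoff trades specificity for sensitivity, which is why the two scenarios suit different clinical uses: ruling IIM out versus confirming it.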

Discussion
It is well known that clinicians face challenges in discriminating between IIM and its mimics, as IIM may have a heterogeneous presentation and course, and there is no single feature that could serve as a "gold standard" for diagnosis and/or classification [21]. Most importantly, if mimicking conditions are mistaken for autoimmune conditions in clinical practice, this misdiagnosis can lead to inappropriate and potentially harmful immunosuppressive therapy [11].
In this study, we propose an imaging score that is easily applicable in clinical practice, as the MRI acquisition parameters are standard, and it only requires visual inspection of the lower limbs to assess the degree of edema in each muscle in the thighs and legs, ranging from 0 (absent) to 3 (severe). This score enables the comprehensive assessment of edema burden in the lower limbs and shows that a cutoff score of ≥ 18 aids in distinguishing between IIM patients and mimics, with an AUC of 0.85, specificity of 96.1%, and sensitivity of 52.9%.
Our scoring system assessing the extent of edema demonstrates high reproducibility, as evidenced by excellent inter-rater agreement for both thighs and legs. This aspect can also be attributed to the semi-quantitative approach to assessing muscle edema involvement. In particular, in our study, we introduced a discrete evaluation score for each muscle, whereas most previous studies either employed a binary classification (present or absent) for each muscle or focused solely on the thigh [10]. In this study, our goal was to determine the optimal cutoff for this new standardized MRI muscle edema score in distinguishing IIM from its mimics. We have decided to recommend a cutoff of 18 in clinical practice since it yielded the highest specificity at 96.1%, even though it came with a suboptimal sensitivity of 52.9%. This suggests that our score should be utilized as a secondary tool to confirm a suspected diagnosis (see Scenario 2 in Fig. 3). In our study, we found no difference in the fibro-adipose score between patients with idiopathic inflammatory myopathies (IIM) and those without, suggesting that in the diagnostic process for IIM, edema is a more reliable indicator of the disease than fibro-adipose (FA) infiltration, even though FA may be present.
Relatively recently, the EULAR-ACR Classification Criteria have been proposed as a method for classifying adult and juvenile IIM to be used in clinical trials for myositis [3]. However, they present some limitations in clinical practice since the output consists of the definite, probable, and possible likelihood of having IIM. For the purpose of our analysis, we needed to establish a dichotomous classification of IIM versus non-IIM. The best cutoff point for the EULAR criteria is still under debate since, after the publication of the original EULAR criteria, several studies have tried to validate those criteria in external derivation cohorts and have found different cutoff points in each population [22]. One study attempted to add MRI as a covariate in the original score and found that doing so (AUC = 0.86) was more likely to correctly diagnose IIM than the EULAR score alone (AUC = 0.80) [23]. However, the authors reported that the MRI was evaluated by the reporting radiologist using a binary score reflecting the presence or absence of muscle edema, there was no description of the muscles evaluated, and the authors did not specify the degree of edema or extent of muscle involvement [23]. By contrast, our score grades edema in each muscle on a discrete scale and covers both the thighs and the legs, whereas the majority of previous studies used a binomial classification for each muscle (present or absent) or evaluated only the thigh [10].
In our study, two non-IIM patients with definitive diagnosis of genetic dystrophy had muscle biopsies reported to show inflammatory signs, with no specific differences compared to biopsies that tested positive in patients with definitive diagnoses.This is not surprising, as it is well-known that an inflammatory pattern can also be observed in muscle biopsies of these diseases [24].
Our study presents several limitations, primarily related to the relatively small sample size and its retrospective nature. However, our results may have significant implications, as our proposed cutoff score of 18 has a higher specificity (96%) than the EULAR criteria (88% with biopsy and 82% without), and we suggest that, in selected cases, MRI could confirm the diagnosis, avoiding biopsy, and could be considered for future revision of the EULAR criteria. Our patients exhibited variability in their therapy, as 14 out of 85 (16.5%) patients were undergoing immunosuppressant therapy at the time of the MRI examination. The available data suggest that MRI results are not affected by treatment. In our study, there were no differences in the edema score in IIM patients between those receiving treatment (14/34) and those untreated (20/34) at the time of the MRI (23.6 ± 18.9 versus 20.6 ± 15.7, respectively).
Fig. 3 ROC curve. An exploratory diagnostic accuracy analysis of the edema scoring system showed an area under the ROC curve of 0.85 [95% CI 0.77-0.93]. To maximize specificity, the optimal cutoff score was ≥ 18. In this case, sensitivity was 52.9% (95% CI 35.1-70.2%) and specificity 96.1% (95% CI 86.5-99.5%); 78.8% of patients were correctly classified. To maximize sensitivity (corresponding to the best Youden's Index, 53.9%), the optimal cutoff score was ≥ 7. In this case, sensitivity was 85.3% (95% CI 68.9-95%) and specificity 68.6% (95% CI 54.1-80.9%).
In conclusion, our study demonstrates that a quantitative MRI score for muscle edema in the lower limbs (thighs and legs) aids in distinguishing IIM from conditions that mimic it. This lends weight to the idea that, especially in the initial stages, edema is a more reliable indicator of the disease than fibro-adipose (FA) infiltration, even when FA is present. Further studies are needed to test whether MRI could also be a good outcome measure.

Fig. 1
Fig. 1 Flow diagram of eligible and included participants and edema scores. Patients were classified according to the two scenarios of the MRI edema scoring system (Scenario 1, cutoff score = 7, and Scenario 2, cutoff score = 18). Specifically, considering 7 as the cutoff score (Scenario 1), 40 patients had a score < 7 (negative MRI; among them, 5 patients had a definitive diagnosis of IIM) and 45 patients had a score ≥ 7 (positive MRI; among them, 29 patients had a definitive diagnosis of IIM). Considering 18 as the cutoff score (Scenario 2), 65 patients had a score < 18 (negative MRI; among them, 16 had a definitive diagnosis of IIM) and 20 patients had a score ≥ 18 (positive MRI; among them, 18 had a final diagnosis of definitive myositis, IIM). Cutoff 7: sensitivity 85.3%, specificity 68.6%. Cutoff 18: sensitivity 52.9%, specificity 96.1%.
Images were acquired on a 1.5-T (GE Medical Systems Signa HDx) MRI scanner at the Functional Unit of the Department of Biomedical and Neuromotor Sciences at the University of Bologna. Imaging was performed in the axial plane using T2-weighted sequences. Excitation and signal acquisition were performed using an 8-channel phased-array coil. Initial scanning with sagittal and coronal views was performed for localization. Images were obtained of both the thighs and legs. The following pulse sequences were used: T2-weighted bi-dimensional axial fast recovery-fast spin echo (FR-FSE) images (echo time (TE) = 85 ms, repetition time (TR) = 14,080 ms, field of view (FOV) = 34 cm) and fat-sat T2-weighted axial echo planar imaging (EPI) images (TE = 30, 60, 90 ms; TR = 10,000 ms; FOV = 34 cm). Slice thickness was 5 mm, with a 1 mm gap. Acquisition time was 20 min.

Fig. 2
Fig. 2 Axial EPI-T2-weighted sequences (TE = 90 ms) of the thigh depicting three different grades of edema in the right vastus lateralis muscle.A Edema was scored as 1 because of the slight involvement of 1/3 of muscle; the patient was negative for IIM and had a final

Table 1
No patients in the non-IIM group presented with skin alterations, while 44.1% (15/34) of the IIM group had dermatologic manifestations. Patients in the IIM group had a higher rate of esophageal motility disorders than the non-IIM group (20.6% (7/34) vs. 3.9% (2/51), respectively; p = 0.026). Elevated levels of CPK, LDH, and transaminase did not differ between the two groups. The mean EULAR-ACR score was higher for the IIM group (70.7 ± 34.2) compared to the non-IIM group (16.2 ± 21.3) (p < 0.001).

Table 2
Evaluation of the intra-class correlation coefficient (ICC) between the two readers

Table 3
Evaluation of the average edema score and the single muscle edema score in the IIM and IIM mimics patients. Data are mean ± SD, number of patients with a score > 0 (%), or median [IQR], number of patients with a score > 0 (%).