Introduction

Total joint arthroplasty (TJA) has become the standard treatment for severe end-stage hip or knee disease, relieving joint pain, improving physical activity, and increasing quality of life [1,2,3,4]. Although the postsurgical outcomes are usually excellent [5,6,7,8], the incidence of various complications will continue to increase over time, in large part due to the rise in the number of TJA procedures over recent years and increased life expectancy [9, 10]. Among these complications, periprosthetic joint infections (PJIs) are devastating because of prolonged hospitalization, repeated surgical interventions, and the severe psychological and economic burden on patients [11, 12].

Determining the presence of PJIs remains a challenge in modern orthopedics because there is no gold standard diagnostic tool [13, 14]. In the last decade, the commonly used diagnostic criteria for PJIs were released by the European Bone and Joint Infection Society (EBJIS), the Musculoskeletal Infection Society (MSIS), and two International Consensus Meetings on PJIs in 2013 (ICM 2013) and 2018 (ICM 2018) [15,16,17]. In general, the diagnostic approach in patients with suspected PJIs involves clinical findings, laboratory evaluation, radiology, biopsies with microbiological analysis, nuclear imaging, or intraoperative findings [18, 19]. There is no clear consensus on the most accurate imaging technique for detecting suspected PJIs [20], especially when the diagnosis is challenging, as in early or low-virulence infection.

Since the development of advanced metal artifact reduction techniques, magnetic resonance imaging (MRI) has been increasingly recognized as a noninvasive and valuable method for evaluating patients with septic arthritis [21] or hip and knee pain after arthroplasty [22,23,24,25]. However, two issues limit the use of MRI for diagnosing PJIs: (1) to date, there is no consensus on the diagnostic value of MRI for PJIs in total hip arthroplasty (THA) or total knee arthroplasty (TKA) patients [26,27,28,29,30,31,32,33]; and (2) there are no consistent criteria for identifying or defining the specific MRI features related to PJIs diagnosis. Consequently, it is necessary to systematically evaluate the diagnostic value of MRI features for PJIs.

This systematic review aimed to analyze the main value of MRI for PJIs diagnosis and summarize various helpful MRI appearances in identifying infected prostheses for THA or TKA patients.

Materials and methods

This systematic review strictly adheres to the Preferred Reporting Items for Systematic Review and Meta-Analyses (PRISMA) guidelines [34]. Ethics committee approval was not needed to conduct a systematic review of the published literature.

Search strategy

On March 27, 2022, a systematic literature search of the PubMed (Medline) and EMBASE (Elsevier) databases was conducted to identify original studies that reported the imaging features of MRI for the diagnosis of PJIs. The detailed search terms were as follows: (Periprosthetic Infection OR Infected OR painful OR symptomatic) AND (THA OR TKA OR TJA OR TKR OR THR OR Knee Arthroplasty OR Hip Arthroplasty) AND (MRI OR MR OR MR Imaging OR magnetic resonance imaging) AND (Hip OR knee). The bibliographies of the included studies were also hand-screened to broaden the search and to avoid missing relevant articles. No limits were placed on publication date.
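The Boolean structure of the search string can be made explicit with a short sketch. The term groups below are copied from the search strategy above; the function name `build_query` and the variable names are illustrative assumptions, and the snippet only assembles the string. It does not query PubMed or EMBASE.

```python
# Term groups copied from the search strategy described above.
TERM_GROUPS = [
    ["Periprosthetic Infection", "Infected", "painful", "symptomatic"],
    ["THA", "TKA", "TJA", "TKR", "THR", "Knee Arthroplasty", "Hip Arthroplasty"],
    ["MRI", "MR", "MR Imaging", "magnetic resonance imaging"],
    ["Hip", "knee"],
]


def build_query(groups):
    """Join the terms within each group with OR and the groups with AND."""
    return " AND ".join("(" + " OR ".join(group) + ")" for group in groups)


query = build_query(TERM_GROUPS)
print(query)
```

Keeping each concept (condition, procedure, modality, joint) as its own group makes it easier to audit which synonyms were searched.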

Inclusion and exclusion criteria

After inspection for duplicates, studies were included based on the following inclusion criteria: (1) original articles in a peer-reviewed journal; (2) human studies; (3) reports on the features of MRI for the diagnosis of PJIs after THA or TKA; and (4) original articles in English. Studies were excluded if any of the following criteria were satisfied: (1) review articles; (2) meta-analyses; (3) letters to the editor; (4) replies; (5) comments; (6) conference abstracts; (7) editorials; (8) case reports; (9) non-English studies; and (10) studies involving only animals.

Study selection and data extraction

Two reviewers independently screened the articles for eligibility based on title and abstract. The final decision regarding inclusion was based on the full-text articles. Disagreements that could not be resolved by consensus were referred to a third reviewer.

The following study characteristics were extracted from the eligible studies: (1) authors; (2) year of publication; (3) study design; (4) number of subjects; (5) sex; (6) age; (7) prosthesis; (8) number of prostheses (total/infected/noninfected); (9) MRI setting; (10) duration from THA or TKA to MRI examination; and (11) study outcomes, including interrater and intrarater reliability (intraclass correlation coefficient (ICC) for continuous variables and Cohen's kappa coefficient (κ) for categorical variables, with standard errors), sensitivity, specificity, positive predictive value (PPV), negative predictive value (NPV) and accuracy. Data extraction was conducted independently by the two reviewers, and any disputes between them were resolved at a consensus meeting.

Methodologic quality appraisal and analysis

The included articles evaluated the reproducibility and accuracy of MRI features to diagnose PJIs. To assess the quality of these articles, the COnsensus-based Standards for the selection of health Measurement Instruments (COSMIN) tool and Quality Assessment of Diagnostic Accuracy Studies-2 (QUADAS-2) tool were used.

The reproducibility of the included articles was evaluated by reliability assessment using COSMIN reliability box 6 [35]. Reliability box 6 contains 3 domains: design requirements, statistical methods and other flaws [35]. Each standard is rated on a four-point scale (inadequate, doubtful, adequate, or very good) [36], and the final rating is the lowest score given for any of the standards in box 6 (the "worst score counts" method) [37]. Interrater or intrarater reproducibility can be calculated with κ or ICC. The κ statistic was interpreted as follows: almost perfect agreement (0.81–1.00), substantial agreement (0.61–0.80), moderate agreement (0.41–0.60), fair agreement (0.21–0.40), slight agreement (0.01–0.20), and no agreement (0) [38]. ICC values were interpreted as follows: excellent reliability (> 0.90), good reliability (0.75–0.90), moderate reliability (0.50–0.75), and poor reliability (< 0.50) [39].
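The rating rules in this paragraph can be sketched as small lookup functions. This is a minimal illustration rather than part of the COSMIN tool: the function names are hypothetical, and how boundary or in-between values are handled (for example κ = 0.805, an ICC of exactly 0.75, or a negative κ) is an assumption, because the bands as published do not specify it.

```python
# Ordered from worst to best, as in the four-point COSMIN rating system.
COSMIN_SCALE = ["inadequate", "doubtful", "adequate", "very good"]


def worst_score_counts(ratings):
    """Return the lowest rating among all standards (worst score counts)."""
    return min(ratings, key=COSMIN_SCALE.index)


def interpret_kappa(kappa):
    """Map a kappa value to the agreement bands quoted in the text.

    Assumption: each band starts at its lower bound, and values at or
    below zero are reported as 'no agreement'.
    """
    if kappa >= 0.81:
        return "almost perfect"
    if kappa >= 0.61:
        return "substantial"
    if kappa >= 0.41:
        return "moderate"
    if kappa >= 0.21:
        return "fair"
    if kappa > 0:
        return "slight"
    return "no agreement"


def interpret_icc(icc):
    """Map an ICC value to the reliability bands quoted in the text.

    Assumption: the shared boundaries 0.75 and 0.50 belong to the
    higher band ('good' and 'moderate', respectively).
    """
    if icc > 0.90:
        return "excellent"
    if icc >= 0.75:
        return "good"
    if icc >= 0.50:
        return "moderate"
    return "poor"


print(worst_score_counts(["very good", "adequate", "doubtful"]))
print(interpret_kappa(0.73), "|", interpret_icc(0.98))
```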

The QUADAS-2 tool is recommended for use in rating bias and applicability of a majority of diagnostic accuracy studies [40]. The QUADAS-2 contains 4 domains: patient selection, index test, reference standard, and flow and timing. Each question can be assessed with “low risk of bias”, “high risk of bias”, or “unclear risk” [41]. Moreover, sensitivity, specificity, PPV, NPV and accuracy were also calculated for each MRI feature.
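For readers less familiar with these measures, the sketch below shows how sensitivity, specificity, PPV, NPV and accuracy follow from the four cells of a 2x2 table that compares an MRI feature with the reference diagnosis. The function name `diagnostic_metrics` is an illustrative assumption, and the counts are hypothetical; they are not taken from any included study.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Compute accuracy measures from the four cells of a 2x2 table.

    tp: feature present, infected      fp: feature present, not infected
    fn: feature absent, infected       tn: feature absent, not infected
    Note: a zero row or column total raises ZeroDivisionError.
    """
    total = tp + fp + fn + tn
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        "accuracy": (tp + tn) / total,
    }


# Hypothetical counts for illustration only.
example = diagnostic_metrics(tp=30, fp=5, fn=10, tn=55)
for name, value in example.items():
    print(f"{name}: {value:.1%}")
```

Unlike sensitivity and specificity, PPV and NPV depend on the proportion of infected prostheses in the sample, so they are not directly comparable across studies with different infection rates.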

Results

Search results

A flowchart of study selection is shown in Fig. 1. The systematic search strategy identified 1909 articles from PubMed and EMBASE. After removing 236 duplicate articles, 1673 articles remained. Of these, 1664 were excluded after title and abstract screening, and the remaining 9 full-text articles were retrieved for further assessment. One article was excluded because it included only one patient with PJIs [42]. No other potentially relevant studies were identified from the bibliographies of these articles. Finally, 8 eligible articles, which included a total of 645 patients, were summarized and analyzed in this study [26,27,28,29,30,31,32,33].
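The selection counts can be checked with simple arithmetic. The sketch below uses only the numbers reported in this paragraph; the variable names are illustrative.

```python
# Counts reported in the study selection paragraph.
identified = 1909               # records from PubMed and EMBASE
duplicates = 236                # duplicate records removed
excluded_title_abstract = 1664  # excluded at title/abstract screening
excluded_full_text = 1          # excluded after full-text assessment

after_dedup = identified - duplicates
full_text_assessed = after_dedup - excluded_title_abstract
included = full_text_assessed - excluded_full_text

print(after_dedup, full_text_assessed, included)
```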

Fig. 1 Flowchart of the systematic literature search

Characteristics of included studies

The detailed study characteristics are summarized in Table 1. The included studies were published in 2013 (1/8) [33], 2014 (1/8) [31], 2016 (2/8) [30, 32], 2020 (3/8) [27,28,29] and 2021 (1/8) [26]. The study design was retrospective in 87.5% (7/8) of studies [26,27,28,29,30, 32, 33] and prospective in 12.5% (1/8) [31]. The number of subjects ranged from 30 to 140 patients. Six studies evaluated the diagnostic role of MRI in THA patients [26,27,28,29,30,31], and the remaining two studies focused on the MRI features of TKA patients [32, 33]. Altogether, 645 patients were assessed, 206 (31.94%) with and 439 (68.06%) without PJIs, comprising 481 (74.57%) hip prostheses and 164 (25.43%) knee prostheses. MRI was performed with 1.5 T scanners in all eight studies.
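The pooled proportions above follow directly from the reported counts, as the short check below illustrates. The helper name `percent` is an assumption made for this sketch.

```python
def percent(part, whole):
    """Return part/whole as a percentage rounded to two decimals."""
    return round(100 * part / whole, 2)


total_patients = 645
with_pji, without_pji = 206, 439
hip, knee = 481, 164

# The two splits each account for the whole cohort.
assert with_pji + without_pji == total_patients
assert hip + knee == total_patients

proportions = {
    "with PJIs": percent(with_pji, total_patients),
    "without PJIs": percent(without_pji, total_patients),
    "hip prostheses": percent(hip, total_patients),
    "knee prostheses": percent(knee, total_patients),
}
print(proportions)
```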

Table 1 Study Characteristics

Study quality appraisal and analysis

For reproducibility assessment (Table 2), seven studies were scored adequate to very good by the COSMIN reliability box [26,27,28,29,30, 32, 33], and only one study was scored inadequate [31]. Nevertheless, 12.5% (1/8) of the included articles did not analyze interrater reliability, and 50% (4/8) did not provide an intrarater reliability assessment.

Table 2 Reproducibility of MRI Measurements to Diagnose Periprosthetic Joint Infection

For accuracy assessment (Figs. 2 and 3), seven studies had a high risk of bias [27,28,29,30,31,32,33], and only one study had a low risk of bias [26]. Because only one retrospective study scored a low risk of bias [26], most concerns about accuracy related to patient selection; a retrospective design generally increases susceptibility to selection bias. In addition, the majority of included studies provided the necessary information regarding the index test, reference standard, or flow and timing [26,27,28,29, 32, 33].

Fig. 2 The methodologic quality of the included studies using QUADAS-2 shows the proportions of studies with high, low, or unclear risk of bias and concerns regarding applicability

Fig. 3 The methodologic quality of the included studies using QUADAS-2 shows each domain of studies with high, low, or unclear risk of bias and concerns regarding applicability

MRI findings and PJIs

As shown in Tables 2 and 3, MRI features demonstrated high diagnostic performance in evaluating suspected PJIs, but the individual MRI signs of PJIs around the prosthesis varied among the included studies. The important MRI findings of PJIs are summarized as follows:

Table 3 Accuracy of MRI Measurements to Diagnose Periprosthetic Joint Infection

Synovitis is common in patients with hip or knee prostheses, and lamellated hyperintense synovitis (LHS) is the most suggestive MRI sign of PJIs in THA [26, 28, 31] or TKA [32, 33] patients. Reasonable reliability results were found regarding LHS, with an interrater reliability of (κ, 0.76–0.907) and interrater reliability of (κ, 0.44–0.89) [26, 28, 32, 33]. For diagnosing PJIs by LHS on MRI, sensitivity varied between 26.3% and 86% and specificity between 84% and 98.8%. The diagnostic accuracy of LHS ranged from 74.8 to 94.4% [26, 28, 32, 33].

Edema, including bone edema [26, 31], extracapsular edema [26, 31], capsule edema [27], intramuscular edema [27], and adjacent soft tissue edema [29,30,31], was highly correlated with the clinical diagnosis of PJIs. Interrater reliability was substantial to almost perfect: bone edema (κ = 0.927) [26], extracapsular edema (κ, 0.905–0.923) [26], capsule edema (κ = 0.88) [27], intramuscular edema (κ, 0.73–0.88) [27], and adjacent soft tissue edema (κ = 0.955) [29]. The sensitivity of edema on MRI for diagnosing PJIs was 68.4–100%, and the specificity was 73.1–95%. The diagnostic accuracy of edema ranged from 79.8 to 93% [26, 27, 29,30,31].

The MRI appearance of extracapsular collection (or fluid collection) [26, 27, 31] was suggestive of an infected arthroplasty implant. Interrater reliability was almost perfect (κ, 0.905–0.923) or substantial (κ = 0.68) for extracapsular collection. The articles reported sensitivity and specificity values of 28–58% and 77.8–98%, respectively [26, 27]. The diagnostic accuracy of extracapsular collection ranged from 68.9 to 85% [26, 27].

A correlation was found between reactive lymphadenopathy (or nodal indices) on MRI and PJIs [26, 27, 29]. Results demonstrated excellent reliability for lymphadenopathy (ICC = 0.98; κ, 0.844–0.99) [26, 29]. Sensitivity varied between 78.9% and 93%, and specificity between 47% and 87.7% [26, 29]. The diagnostic accuracy of lymphadenopathy ranged from 70 to 93.1% [26, 29].

Details of other MRI signs of PJIs are shown in Tables 2 and 3.

Discussion

The present study aimed to systematically review the role of MRI in the assessment of infected joint prostheses for THA or TKA patients. The main findings suggest that MRI is capable of identifying suspected periprosthetic joint infection, but the definition of specific MRI features related to PJIs diagnosis lacks consensus and standardization.

All included articles were published within the last decade, with a rise in the number of articles published per year, especially in 2020–2021. This publication trend indicates that MRI assessment of PJIs is currently a research focus. MRI of metallic joint arthroplasty implants requires modified and advanced pulse sequences to reduce the extensive metal artifacts between the implant components and the surrounding soft tissues [23, 24]. High-performance 1.5 T MRI systems are well suited to achieving substantial reductions in artifacts around metallic implants [23]. Hence, MRI is increasingly recognized as a noninvasive and valuable tool in the assessment of patients with problematic arthroplasty [19, 20, 43].

When laboratory tests are inconsistent or clinical symptoms are nonspecific, distinguishing between aseptic and septic implant failure remains imperfect and challenging [13, 14]. The clinical manifestations of PJIs include chronic, acute, low-grade, and high-grade implant infections. To date, there is no consistent diagnostic standard for PJIs in clinical practice [18, 19]. Among the aforementioned criteria, none recommend MRI as a diagnostic test for PJIs. In addition, the conclusions of different studies on the diagnostic value of MRI for PJIs are not fully consistent. For example, Albano et al. considered conventional MRI features to have limited accuracy in detecting PJIs in THA patients [26], but other studies indicated that the assessment of MRI findings facilitated the diagnosis of PJIs in THA or TKA patients [27,28,29,30,31,32,33]. Possible explanations include the following: (1) Very little evidence has been published on the diagnostic value of MRI for PJIs, and the MRI diagnostic features proposed for PJIs are inconsistent and varied [19]. Due to the complicated anatomical structure of the hip joint, the extraction of typical MRI features of PJIs is difficult. (2) Problems inherent to retrospective studies might result in a serious risk of bias. Most relevant published articles are retrospective [26,27,28,29,30, 32, 33]. A retrospective design might lead to high selection bias and to miscalculation of the diagnostic value. For example, the control group in some studies did not manifest characteristics of PJIs, but a possible low-virulence infection could not be excluded in a timely manner. (3) A periprosthetic mechanical stress reaction on MRI cannot be distinguished well from PJIs; in other words, a single positive MRI feature is not specific to implant infection [44]. (4) MRI is not extensively utilized to diagnose PJIs in clinical practice because of limitations such as high cost, long acquisition time, complex image postprocessing, and operator dependence.

Although MRI has the limitations described above, its intrinsic multiparametric nature is conducive to qualitative grading of bone destruction, synovitis, soft tissue edema, fluid collection, periosteal reaction, or lymphadenopathy, without ionizing radiation [25]. In this study, some MRI features, such as lamellated hyperintense synovitis, edema, fluid collection, or lymphadenopathy, were valuable diagnostic imaging findings. The reported sensitivity, specificity, PPV and NPV were 26.3–100%, 47–98%, 46–94.7% and 73.8–98%, respectively, with satisfactory accuracy (63.9–94.4%) and adequate reliability. Standardization is challenging, but a single metric for the evaluation of PJIs and a standardized MRI protocol should be pursued so that MRI criteria for PJIs can be applied to suspected infections that are difficult to diagnose.

This study has some inherent limitations: (1) Collecting large-scale populations with PJIs in clinical practice is difficult, and only 206 patients with PJIs were included in this study. (2) The included studies showed statistical heterogeneity and a high risk of bias, so it was not feasible to perform a meta-analysis or to categorize standardized MRI features for PJIs. (3) Due to the design limitations of the included studies, the diagnostic value of MRI for different types of PJIs was not clear. (4) Most included articles had retrospective designs, which might result in serious variation and risk of bias. Larger prospective studies should be conducted in the future to evaluate standardized MRI features for PJIs diagnosis.

In conclusion, there is preliminary evidence that MRI is valuable for identifying suspected PJIs in patients with TKA or THA, but the definition of specific MRI features related to PJIs diagnosis lacks consensus and standardization. Large-scale, high-quality studies are required to support better clinical decisions in the future.