What is the Accuracy of Nuclear Imaging in the Assessment of Periprosthetic Knee Infection? A Meta-analysis

Background In the assessment of possible periprosthetic knee infection, various imaging modalities are used without consensus regarding the most accurate technique. Questions/Purposes To perform a meta-analysis to compare the accuracy of various applied imaging modalities in the assessment of periprosthetic knee infection. Methods A systematic review and meta-analysis was conducted with a comprehensive search of MEDLINE and Embase® in accordance with the PRISMA and Quality Assessment of Diagnostic Accuracy Studies (QUADAS-2) recommendations to identify clinical studies in which periprosthetic knee infection was investigated with different imaging modalities. The sensitivity and specificity of each imaging technique were determined and compared with the results of microbiologic and histologic analyses, intraoperative findings, and clinical followup of more than 6 months. A total of 23 studies, published between 1990 and 2015, were included for meta-analysis, representing 1027 diagnostic images of symptomatic knee prostheses. Quality of the included studies showed low concerns regarding external validity, whereas internal validity indicated more concerns regarding the risk of bias. The most important concerns were found in the lack of uniform criteria for the diagnosis of a periprosthetic infection and the flow and timing of the included studies. Differences among techniques were tested at a probability less than 0.05 level. Where there was slight overlap of confidence intervals for two means, it is possible for the point estimates to be statistically different from one another at a probability less than 0.05. The z-test was used to statistically analyze differences in these situations. 
Results Bone scintigraphy was less specific than all other modalities tested (56%; 95% CI, 0.47–0.64; p < 0.001), and leukocyte scintigraphy (77%; 95% CI, 0.69–0.85) was less specific than antigranulocyte scintigraphy (95%; 95% CI, 0.88–0.98; p < 0.001) or combined leukocyte and bone marrow scintigraphy (93%; 95% CI, 0.86–0.97; p < 0.001). Fluorodeoxyglucose positron emission tomography (FDG-PET) (84%; 95% CI, 0.76–0.90) was more specific than bone scintigraphy (56%; 95% CI, 0.47–0.64; p < 0.001), and less specific than antigranulocyte scintigraphy (95%; 95% CI, 0.88–0.98; p = 0.02) and combined leukocyte and bone marrow scintigraphy (93%; 95% CI, 0.86–0.97; p < 0.001). Leukocyte scintigraphy (88%; 95% CI, 0.81–0.93; p = 0.01) and antigranulocyte scintigraphy (90%; 95% CI, 0.78–0.96; p = 0.02) were more sensitive than FDG-PET (70%; 95% CI, 0.56–0.81). However, because of broad overlapping of confidence intervals, no differences in sensitivity were observed among the other modalities, including combined bone scintigraphy (93%; 95% CI, 0.85–0.98) or combined leukocyte and bone marrow scintigraphy (80%; 95% CI, 0.66–0.91; p > 0.05 for all paired comparisons). Conclusions Based on current evidence, antigranulocyte scintigraphy and combined leukocyte and bone marrow scintigraphy appear to be highly specific imaging modalities in confirming periprosthetic knee infection. Bone scintigraphy was a highly sensitive imaging technique but lacks the specificity needed to differentiate among various conditions that cause painful knee prostheses. FDG-PET may not be the preferred imaging modality because it is more expensive and not more effective in confirming periprosthetic knee infection. Level of Evidence Level III, diagnostic study. Electronic supplementary material The online version of this article (doi:10.1007/s11999-016-5218-0) contains supplementary material, which is available to authorized users.


Introduction
After primary TKA, as many as 2% of patients have prosthetic joint infection (PJI) develop; this risk is as great as 5% after revision surgery [3,26]. Accurate diagnosis of periprosthetic infection remains a clinical challenge, particularly in subacute or chronic infections. The evaluation of suspected PJI is characterized by a multimodality workup including microbiologic, laboratory (elevated erythrocyte sedimentation rate, C-reactive protein [CRP]), synovial marker, and histologic tests [35,57]. Recently, promising results have been reported regarding synovial biomarker tests, including the alpha defensin immunoassay and synovial fluid CRP tests [5,54]. However, these tests are not yet widely available and their utility has been confirmed in only a few studies [54]. In addition to these diagnostic tests, various imaging techniques including radiographs, ultrasound, CT, MRI, bone, leukocyte, bone marrow, or antigranulocyte scintigraphy, and positron emission tomography (PET) can be used in the assessment of suspected periprosthetic knee infection [10,11,29,31,57], especially in the case of a challenging diagnosis of a chronic or low-grade infection [45][46][47][48].
A delay in diagnosing and treating a periprosthetic knee infection can have a critical effect on loosening or maintaining the prosthesis and joint function. Timely identification of a periprosthetic infection is essential to allow initiation of appropriate medical and surgical therapies [49] in which various imaging modalities can contribute when other tests are inconclusive. However, inconsistent diagnostic accuracies across studies investigating periprosthetic knee infection have been published [10,11,22]. Consequently, the choice of the most accurate imaging technique remains controversial [11,31]. To our knowledge, there has been no meta-analysis comparing the most commonly used imaging modalities to evaluate TKA PJI.
The aim of this systematic review and meta-analysis was to compare the diagnostic accuracy of different imaging modalities used for diagnosing periprosthetic knee infection.

Search Criteria and Strategy
The imaging modalities that were reviewed for the assessment of periprosthetic knee infection were radiography, ultrasound, CT, MRI, scintigraphy (including bone, antigranulocyte, leukocyte, and bone marrow scintigraphy), and PET.
In June 2015 a computer-aided search of the PubMed and Embase® databases was conducted and updated in January 2016 (Appendix 1. Supplemental material is available with the online version of CORR®). The search was restricted to primary studies that were written in English. For each database, a specific search strategy was developed (Fig. 1) with a medical informatics specialist. Reference lists of the identified studies and relevant reviews were hand-searched for supplementary eligible studies. The search was performed according to the PRISMA Statement (Appendix 2. Supplemental material is available with the online version of CORR®) [24].

Study Selection
The following inclusion criteria were used for eligible studies: (1) radiography, ultrasound, CT, MRI, scintigraphy, or PET was used to identify suspected periprosthetic knee infections; (2) a valid reference standard was applied, consisting of a positive intraoperative culture, with or without histopathologic evidence of acute inflammation of the periprosthetic tissue obtained at surgical débridement or prosthesis removal, and/or the presence of a sinus tract that communicates with the prosthesis [8,13,29], and/or a clinical followup of at least 6 months; and (3) adequate details were reported to reconstruct a two-by-two contingency table to determine the results of the index tests. Exclusion criteria were (1) animal studies; (2) non-English studies; (3) studies that did not differentiate between various joint replacements; and (4) case reports. Potential overlap of patient populations was assessed when more than one study was selected from the same author or institution by comparing the patient demographics. The study with the largest number of patients was selected when an overlap of patient populations between studies was observed.
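Inclusion criterion (3) matters because every accuracy measure used later is derived from the four cells of a two-by-two table. The following is a minimal sketch of that derivation; the class name `TwoByTwo` and the counts are hypothetical and are not taken from any included study.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TwoByTwo:
    """Index test result cross-tabulated against the reference standard."""
    tp: int  # infected, test positive
    fp: int  # not infected, test positive
    fn: int  # infected, test negative
    tn: int  # not infected, test negative

    @property
    def sensitivity(self) -> float:
        return self.tp / (self.tp + self.fn)

    @property
    def specificity(self) -> float:
        return self.tn / (self.tn + self.fp)

    @property
    def diagnostic_odds_ratio(self) -> float:
        # Undefined when fp or fn is zero; this sketch does not handle
        # that case and will raise ZeroDivisionError.
        return (self.tp * self.tn) / (self.fp * self.fn)


# Hypothetical counts, for illustration only.
table = TwoByTwo(tp=30, fp=5, fn=4, tn=40)
print(f"sensitivity = {table.sensitivity:.3f}")
print(f"specificity = {table.specificity:.3f}")
print(f"DOR = {table.diagnostic_odds_ratio:.1f}")
```

A study that reports only a sensitivity and specificity, without the underlying counts, cannot be weighted in a pooled estimate, which is why such studies were not eligible.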
The titles were screened for eligibility by one reviewer (SJV) and then processed for abstract assessment. The titles and abstracts were independently screened and assessed in an unblinded standardized manner for eligibility by two reviewers (SJV, RJAS). The final decision regarding inclusion was based on the full article. Disagreement in the evaluation of three studies was resolved with consensus by a third reviewer (OPPT). A priori, no differentiation was made for the type of knee implant, the interpretation criteria used for the index test, or the time between surgery and imaging.

Studies Included
The search strategy identified 3708 studies from MEDLINE and 2864 studies from Embase®. The source population was formed by the total of 6572 studies (including duplicates). In 1933 studies, overlap was found between the retrieved studies from Embase® and MEDLINE. Of the initial 6572 studies, 6433 were excluded after analyzing the information provided in the title and abstract. The full articles of the remaining 139 studies were reviewed for eligibility (Appendix 2. Supplemental material is available with the online version of CORR®). No other studies were extracted from the reference list of these studies. A total of 116 studies were excluded because the study was not a clinical diagnostic study (32%), did not describe periprosthetic knee infection (12%), was not written in English (17%), did not specify the definition of positivity regarding the index test or applied an insufficient reference standard for periprosthetic knee infection (7%), did not differentiate regarding different prosthetic joint replacements (15%), did not provide data to reproduce two-by-two contingency tables (16%), or the study revealed a potential overlap of the patient population (1%). Eventually, 23 studies were included in this review.
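The study flow reported above can be checked with simple arithmetic. The short sketch below uses only the counts and percentages stated in the text; the variable names are illustrative.

```python
# Counts as reported in the text.
medline_hits = 3708
embase_hits = 2864
excluded_on_title_abstract = 6433
excluded_on_full_text = 116

source_population = medline_hits + embase_hits
full_text_reviewed = source_population - excluded_on_title_abstract
included = full_text_reviewed - excluded_on_full_text

# Reported reasons for full-text exclusion, in percent of the 116 studies.
exclusion_reason_pct = {
    "not a clinical diagnostic study": 32,
    "no periprosthetic knee infection described": 12,
    "not written in English": 17,
    "index test positivity undefined or insufficient reference standard": 7,
    "joint replacements not differentiated": 15,
    "two-by-two table not reproducible": 16,
    "potential overlap of patient population": 1,
}

print(source_population, full_text_reviewed, included)
print(sum(exclusion_reason_pct.values()))
```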

Methodologic Quality Assessment
The criteria list of the Quality Assessment of Diagnostic Accuracy Studies (QUADAS-2) for evaluating internal and external validity of diagnostic studies recommended by the Cochrane Screening and Diagnostic Tests Methods Group (http://methods.cochrane.org/sdt/handbook-dta-reviews) was used for grading the methodologic quality of the selected studies [53]. Evaluation was performed by two reviewers (SJV, RJAS) independently. The internal and external validity criteria were used to describe the methodologic limitations of the studies. Studies, however, were not excluded from the systematic review on the basis of quality.
The external validity showed low concerns regarding applicability in more than 85% of the included studies (Fig. 2). The internal validity of the included studies showed more concerns regarding the risk of bias. Approximately 50% of the included studies did not provide sufficient information regarding patient selection, reference standard, and flow and timing.

Quantitative Analysis (Meta-analysis)
For the diagnostic modalities, true-positive, false-positive, true-negative, and false-negative results were derived from a two-by-two contingency table. The interpretation criteria with the highest diagnostic accuracy were selected in case multiple interpretation sets for the same index test were used. When studies reported results for more than one observer, the first reader's findings were included. The statistical heterogeneity of the diagnostic odds ratio (DOR) of each imaging index test across studies was tested using the chi-square test (Q_DOR) for independence with k-1 degrees of freedom (k = number of studies) [6]. The Spearman rank correlation coefficient (ρ) of the DOR was used in case of heterogeneity to measure the correlation between sensitivity and specificity. A ρ of 0.40 or less suggests that the variation between studies may be explained by different cutoff points, or diagnostic thresholds, on a summary receiver operating characteristic curve [6,25]. The symmetry of funnel plots was visually interpreted to evaluate possible publication bias. For all included studies, the test of homogeneity for the DOR indicated no statistical heterogeneity. The studies that evaluated bone scintigraphy (six studies, n = 216 knee prostheses), combined bone and leukocyte scintigraphy (four studies, n = 114 knee prostheses), leukocyte scintigraphy (six studies, n = 238 knee prostheses), The sensitivity and specificity were pooled independently and were weighted by the inverse of the variance with use of Meta-DiSc software (Available at: http://www.hrc.es/investigacion/metadisc_en.htm) [55]. The logit-transformed sensitivity, specificity, and corresponding 95% CI of the index tests were compared with use of z-test statistics. A probability less than 0.05 was considered significant (Table 3). In the comparison of two imaging modalities, confidence intervals for two means can overlap and yet the two means can be statistically different from one another at a probability less than 0.05 [1,19,36]. The z-test was used to statistically analyze these differences. A secondary analysis was performed to evaluate possible influence of the methodologic quality on the sensitivity and specificity.
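The point that two confidence intervals can overlap while the estimates still differ at p < 0.05 can be illustrated with a small sketch of a z-test on the logit scale. This is not the authors' Meta-DiSc workflow: the standard errors here are back-calculated from the published 95% CIs, which is an assumption of this example, and the function names are hypothetical. The inputs are the pooled sensitivities of leukocyte scintigraphy (88%; 95% CI, 0.81–0.93) and FDG-PET (70%; 95% CI, 0.56–0.81), whose intervals meet at 0.81.

```python
import math


def logit(p: float) -> float:
    return math.log(p / (1.0 - p))


def logit_se_from_ci(lower: float, upper: float, z_crit: float = 1.96) -> float:
    # Assumption: the 95% CI is symmetric on the logit scale, so its
    # width equals 2 * 1.96 * SE.
    return (logit(upper) - logit(lower)) / (2.0 * z_crit)


def z_test_logit(est_a, ci_a, est_b, ci_b):
    """Two-sided z-test for the difference of two logit-transformed proportions."""
    se_a = logit_se_from_ci(*ci_a)
    se_b = logit_se_from_ci(*ci_b)
    z = (logit(est_a) - logit(est_b)) / math.sqrt(se_a**2 + se_b**2)
    p_value = math.erfc(abs(z) / math.sqrt(2.0))  # equals 2 * (1 - Phi(|z|))
    return z, p_value


# Pooled sensitivities reported in the abstract.
z, p = z_test_logit(0.88, (0.81, 0.93), 0.70, (0.56, 0.81))
print(f"z = {z:.2f}, two-sided p = {p:.4f}")
```

Because the standard error of a difference is the square root of the summed variances, which is smaller than the sum of the two standard errors, the test can reach significance even when the individual intervals touch. The values from this sketch need not match the published p values exactly, since the rounded intervals carry less precision than the original data.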

Discussion
In the assessment of suspected periprosthetic knee infection, various diagnostic tests including blood tests, synovial fluid microbiologic analyses, and synovial fluid marker tests (such as alpha defensin and synovial fluid CRP), can be used. However, accurate diagnosis of periprosthetic knee infection remains challenging, especially in chronic or low-grade infections, and inconsistent diagnostic accuracies with various tests across studies have been published [10,11,22]. Because of that, imaging tests remain important, but studies do not agree on which imaging technique is the most accurate [11,31]. Our meta-analysis revealed that in diagnosing periprosthetic knee infection, antigranulocyte scintigraphy and combined leukocyte and bone marrow scintigraphy were highly specific imaging techniques (Fig. 3).
Although the included studies showed statistical homogeneity of data, the reliability of the pooled estimates depends on the methodologic quality of the included studies. There are several limitations of this meta-analysis to consider. Collecting large sample sizes of patients with suspected periprosthetic knee infection is difficult; the total number of infected TKAs included in this meta-analysis was only 288. Consequently, several studies showed wide confidence intervals because of the small numbers of patients who were evaluated using each diagnostic modality. This means that there may have been differences in sensitivity or specificity between certain modalities that we did not detect. Future comparative studies might help resolve this issue. Studies were not excluded on the basis of methodologic quality. Our secondary analysis, with exclusion of studies that showed a high risk of bias, suggested that FDG-PET might be more sensitive than the primary analysis showed; indeed, it may be comparably sensitive to leukocyte scintigraphy and antigranulocyte scintigraphy. The methodologic quality of the included studies did not substantially influence the sensitivity and specificity of other imaging modalities (data not shown). However, there were important concerns regarding the flow and timing of the included studies. Most studies insufficiently described important variables, including types of implants, use of antibiotics, imaging time after surgery, improvement of imaging techniques, and inter- and intraobserver reliability. Consequently, analyses of the effect of these variables on the accuracy of imaging were not possible, although these variables could influence the diagnostic performance of the imaging modalities we studied.
In addition, the long period evaluated here (1990 to 2015) saw the introduction of numerous new diagnostic tests (such as alpha defensin and synovial fluid CRP) and new diagnostic standards [4], which might have changed the apparent performance of the imaging modalities we studied and how they might be used in practice. The differentiation between acute or chronic infection influences the decision to evaluate a suspected infection with additional imaging, and should be investigated in additional studies. Another important limitation of the included studies is the lack of uniform criteria for diagnosis of a periprosthetic infection. We could not restrict inclusion to studies using the Musculoskeletal Infection Society criteria [35] because many of the included studies were performed before the development of these criteria. Although a valid reference standard with microbiologic confirmation was a stringent inclusion criterion in this meta-analysis, there is a risk of false-positive diagnosis of infection, which potentially could decrease specificity. When a diagnosis of no infection was considered, clinical followup sometimes was used to monitor the final diagnosis. Only studies with a clinical followup of at least 6 months were included. For obvious reasons, surgery with microbiologic evaluation could not be performed in all patients (patients believed to be without infection did not always undergo surgery). However, this could result in more false-negative results and potentially decrease the reported specificity when an infection is found after the final diagnosis, especially in the case of a low-grade infection.
Our meta-analysis defined test performance for the various imaging modalities when used in isolation. However, multiple diagnostic tests including aspiration results and laboratory tests can contribute to diagnosing periprosthetic infection, which could influence the diagnostic performance of the evaluated imaging techniques, and generally should improve their performance. Over the years, important developments have been described in the diagnosis of periprosthetic infection, including the introduction of alpha defensin and synovial fluid tests [5,54]. When the diagnostic evaluation using synovial fluid markers clearly indicates infection, there is little or no need for additional nuclear imaging tests. However, if those tests cannot be obtained or are inconclusive, nuclear imaging can be used in concert with other elements of diagnostic evaluation, including microbiologic analysis and blood testing, to arrive at a more-precise diagnosis than is possible with imaging or laboratory testing alone. Nuclear imaging seldom is used in isolation, and probably should not be used that way [57].
Using bone scintigraphy during the first years after implantation, postoperative tracer (Table 4) uptake can be caused by various factors and therefore lacks the specificity needed to differentiate between aseptic and septic loosening [10,32]. Our results (Table 5) confirmed the reputation of high sensitivity and low specificity of this technique [30,31,42,43]. Unfortunately, subgroup analysis of imaging time after implantation could not be performed owing to insufficient data. In clinical practice, imaging often is used to rule out an infection. Bone scintigraphy is widely available and a sensitive tool for evaluation of painful knee prostheses (Fig. 4). However, when confirmation of infection is needed, a positive bone scintigraphy outcome usually leads to a second, more-specific, investigation.
Leukocyte scintigraphy is assumed to be a more specific imaging modality and has a long history of use in detection of infections [11,51]. However, our meta-analysis showed that this technique alone may not be the preferred modality for confirming periprosthetic knee infection, given that it has only moderate specificity (77%) (Table 6). We found that leukocyte scans are very sensitive (88%) (Fig. 5). However, in contrast to bone scintigraphy, leukocyte scintigraphy is a time-consuming procedure with higher costs and therefore may not be the preferred imaging technique to rule out periprosthetic knee infection. The explanation for the moderate specificity may be that labeled leukocytes (Table 7) not only accumulate in infections, but also physiologically in the bone marrow [33]. To reduce the consequent number of false-positive results, leukocyte scintigraphy can be combined with bone marrow scintigraphy (Table 8), which has been proposed as the preferred imaging modality for diagnosing prosthetic joint infections [10,11,22,32]. The current results for knee prostheses confirmed an increased specificity of 93% versus 77% when combining leukocyte with bone marrow scintigraphy (Table 9). Another assessed option to improve specificity (Table 10) was combining leukocyte with bone scintigraphy (Table 11). As expected, specificity did not improve (Fig. 6) [10]. More recently, antigranulocyte scintigraphy was introduced as a less time-consuming alternative to leukocyte scintigraphy with the advantage of in vivo labeling of leukocytes with considerable potential in the detection of infection (Table 12) [10,11]. We found antigranulocyte scintigraphy (Table 13) to be more specific than leukocyte scintigraphy and FDG-PET (Table 14). However, its role in the assessment of periprosthetic infection is not yet fully established [10].
An important drawback is that neither antigranulocyte scintigraphy nor leukocyte scintigraphy is widely available or widely used in clinical practice [10].
FDG-PET is increasingly used and has proposed potential in the diagnosis of PJI, especially regarding hip arthroplasty [10,39,51,58]. Although this technique offers advantages such as time efficiency, increased resolution, and the use of low-dose CT, our results revealed that this technique was less specific in diagnosing periprosthetic knee infection than combined leukocyte and bone marrow scintigraphy and antigranulocyte scintigraphy (Fig. 7). Some investigations concluded that uptake patterns rather than intensity in the bone-prosthesis interface are specific in diagnosing periprosthetic infection (Table 15) [2,56]. In particular, the sensitivity of 70% is only moderate (Fig. 8) and was lower than the sensitivity of leukocyte or antigranulocyte scintigraphy (Fig. 9). However, our secondary analysis revealed that FDG-PET was highly sensitive (93%) when low-quality studies were excluded [21,23], which is not less sensitive than the other imaging techniques evaluated. This should be considered further in well-designed studies. The specificity was not higher than that of combined leukocyte and bone marrow scintigraphy and antigranulocyte scintigraphy. An important drawback of FDG-PET is the high cost compared with other imaging modalities. Therefore, FDG-PET may not be the preferred imaging modality in the evaluation of a suspected infected knee prosthesis.

Kim et al. [18]; 99Tc-HMPAO; 740-1100 MBq; Increased uptake in the periprosthetic area or if the foci in nearby soft tissue had greater activity than the background soft tissue activity
* Mean doses; SQ = semiquantitative; diagnostic odds ratio 28,143; heterogeneity chi-square = 3.52 (df = 5); p = 0.620; inconsistency (I2) = 0.0%; K = suspected region of infection/reference region (bone marrow); HMPAO = hexamethylpropyleneamine oxime; NR = not reported.
This meta-analysis revealed that, based on current evidence, antigranulocyte scintigraphy and combined leukocyte and bone marrow scintigraphy were highly specific in confirming periprosthetic knee infection. However, the time-consuming procedures and limited availability are important drawbacks of these techniques. Bone scintigraphy was highly sensitive but lacks the specificity needed to differentiate among the various conditions that cause painful knee prostheses. FDG-PET may not be the preferred imaging modality because it is more expensive and not more effective in confirming infected knee prostheses. In practice, other tests should be used in concert with the evaluated imaging modalities to arrive at more-sensitive and specific diagnostic decisions than are possible with imaging or laboratory testing alone. Future, larger prospective studies should assess the utility of imaging in the diagnostic algorithm of a suspected periprosthetic knee infection, providing more data to evaluate important variables, including the differentiation between acute and chronic infections.