Sarcoidosis is a multi-systemic disease traditionally characterized by the presence of non-caseating granulomas. Cardiac sarcoidosis (CS) affects at least 25% of patients with systemic sarcoidosis and is associated with considerable morbidity and mortality.1 Advanced cardiovascular imaging techniques, including fluorine-18 fluorodeoxyglucose positron emission tomography (FDG PET) and cardiac magnetic resonance imaging, have become pivotal in the diagnosis, management, and prognostication of patients with CS.1,2 However, studies utilizing advanced imaging to improve the identification and management of CS are limited by a multitude of factors, particularly their small sample sizes and observational design. Because larger-scale studies are unlikely to be conducted for this less common disease, meta-analyses have been published to address the diagnostic performance of FDG PET for the detection of CS.3 Since the first meta-analysis by Youssef et al., published in 2012, which included seven studies,3 more single-center observational studies have been added to the literature, prompting an updated systematic review and meta-analysis.

In this issue of the Journal of Nuclear Cardiology, Kim and colleagues are to be commended for performing an extensive literature search and meta-analysis with a clear and specific question and rigorous inclusion and exclusion criteria. They ultimately included 17 FDG PET studies involving 891 patients.4 Their main findings include a pooled sensitivity of FDG PET for the diagnosis of CS of 0.84 (95% CI 0.71 to 0.91) and a pooled specificity of 0.83 (95% CI 0.74 to 0.89), with significant heterogeneity stemming mainly from concomitant myocardial perfusion imaging (MPI) at the time of FDG PET in 7 of the 17 studies.4 Pooled sensitivity and specificity were slightly higher in studies that assessed MPI in addition to FDG, but the differences did not reach statistical significance. However, the diagnostic odds ratio for CS using FDG PET was higher when myocardial perfusion was also assessed [25.7 (14.6 to 45.4) with MPI vs 14.2 (5 to 44.3) without MPI].4
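For readers less familiar with the diagnostic odds ratio, the sketch below shows the textbook relationship between this measure and a single sensitivity/specificity pair. It is an illustration only: the function name is hypothetical, and it is an assumption that this plug-in formula is a useful approximation here. A meta-analysis typically estimates its pooled diagnostic odds ratio with a statistical model rather than from pooled point estimates, so the result should not be read as reproducing any value reported by Kim et al.

```python
def diagnostic_odds_ratio(sensitivity: float, specificity: float) -> float:
    """Diagnostic odds ratio (DOR) from one sensitivity/specificity pair.

    DOR = LR+ / LR-, where
      LR+ = sensitivity / (1 - specificity)
      LR- = (1 - sensitivity) / specificity
    Hypothetical helper for illustration; not the method used in the
    meta-analysis under discussion.
    """
    if not (0.0 < sensitivity < 1.0 and 0.0 < specificity < 1.0):
        raise ValueError("sensitivity and specificity must lie strictly between 0 and 1")
    positive_lr = sensitivity / (1.0 - specificity)
    negative_lr = (1.0 - sensitivity) / specificity
    return positive_lr / negative_lr


# Pooled point estimates quoted in the text (0.84 and 0.83), used here only
# to show the arithmetic.
dor = diagnostic_odds_ratio(0.84, 0.83)
print(f"Plug-in DOR: {dor:.1f}")
```

A diagnostic odds ratio of 1 indicates a test that does not discriminate between patients with and without disease; larger values indicate better discrimination.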

Meta-analyses of imaging for the diagnosis of CS are challenging for many reasons. Despite the well-conducted meta-analysis by Kim et al.,4 a fundamental issue remains: there is no gold standard test for CS against which diagnostic performance can be robustly compared. Most studies included in this meta-analysis used the 1993 Japanese Ministry of Health and Welfare (JMHW) clinical diagnostic pathway as the reference standard.4,5 A severe limitation of this approach is that the JMHW clinical diagnostic pathway, like other diagnostic criteria built on a clinical pathway, relies heavily on expert consensus opinion and less on supporting data, and has not been prospectively validated.5 Prior discussions have suggested that the lower specificity of FDG PET in some studies may reflect the fact that FDG PET is more sensitive than the JMHW criteria, and that the lower sensitivity of FDG PET in some studies may reflect the reduced specificity of the JMHW criteria.1 Thus, it is unclear, even with a rigorously conducted meta-analysis, whether we are getting closer to the true diagnostic accuracy of FDG PET for CS. Moreover, the JMHW criteria did not take into account imaging findings from FDG PET or cardiac magnetic resonance imaging until their most recent update in 2017, while all of the included studies in this meta-analysis used prior iterations of the JMHW criteria.6

Another fundamental issue with a meta-analysis on the diagnostic accuracy of FDG PET in CS is the substantial heterogeneity in the reported sensitivity and specificity of FDG PET among the included studies (I² > 0.75 for both). The I² statistic is a measure of heterogeneity in meta-analyses and is considered substantial if I² > 0.5.7 This heterogeneity likely stems from several important factors. Preparation protocols (or combinations of protocols) varied widely: six studies utilized fasting only (12 to 18 hours), six studies utilized a high-fat, low-carbohydrate diet (one used a low-carbohydrate diet only) before fasting for 3 to 12 hours, and five studies used fasting for 6 to 12 hours in combination with unfractionated heparin (Table 1 in Kim et al.).4 Suppression of physiologic myocardial FDG uptake was not assessed and may also be a source of heterogeneity in this meta-analysis. Proper dietary preparation and optimal suppression of physiologic myocardial uptake are paramount to the interpretation of FDG PET images in CS patients and can markedly affect the diagnostic accuracy of FDG PET for CS.
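As a reminder of what the I² statistic measures, the sketch below computes it from Cochran's Q using the standard definition, I² = max(0, (Q − df)/Q), expressed as a proportion to match the thresholds quoted above. The function name and the input values are hypothetical and chosen only to make the arithmetic easy to follow; they are not data from any of the included studies, and this is not necessarily how heterogeneity was computed in the meta-analysis.

```python
from typing import Sequence


def i_squared(effects: Sequence[float], variances: Sequence[float]) -> float:
    """I-squared (as a proportion, 0 to 1) from per-study effects and variances.

    Uses inverse-variance weights to form a fixed-effect pooled estimate,
    then Cochran's Q, then I^2 = max(0, (Q - df) / Q) with df = k - 1.
    """
    if len(effects) != len(variances) or len(effects) < 2:
        raise ValueError("need at least two studies with matching variances")
    weights = [1.0 / v for v in variances]
    pooled = sum(w * e for w, e in zip(weights, effects)) / sum(weights)
    q = sum(w * (e - pooled) ** 2 for w, e in zip(weights, effects))
    df = len(effects) - 1
    if q <= 0.0:
        return 0.0  # identical study estimates: no observed heterogeneity
    return max(0.0, (q - df) / q)


# Hypothetical per-study effects (e.g., logit-transformed sensitivities)
# and their within-study variances; illustration only.
hypothetical_effects = [1.0, 2.0, 3.0]
hypothetical_variances = [0.1, 0.1, 0.1]
i2 = i_squared(hypothetical_effects, hypothetical_variances)
print(f"I-squared: {i2:.2f}")
```

Because I² expresses the share of total variability attributable to between-study differences rather than chance, it does not by itself identify the source of that variability, which is why the subgroup comparisons discussed here matter.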

Distinct from the prior meta-analysis by Youssef et al.,3 Kim and colleagues also assessed the diagnostic accuracy of PET alone vs PET/CT. Although there were differences between the two modalities (higher sensitivity for PET alone vs higher specificity for PET/CT; Table 2 in Kim et al.),4 diagnostic accuracy was not statistically different between the two imaging techniques (Table 3 in Kim et al.).4 This again stems from the very wide confidence intervals resulting from the limited power and heterogeneity of the included studies. Another important assessment by Kim et al.,4 not considered in prior meta-analyses, is the addition of MPI to FDG PET and its role in the observed heterogeneity. The addition of MPI was associated with a higher diagnostic odds ratio for CS than in studies performed without MPI (Tables 4 and 5 in Kim et al.)4 and supports the recommendation by the joint Society of Nuclear Medicine and Molecular Imaging/American Society of Nuclear Cardiology expert consensus document1 to include MPI assessment in the performance of FDG PET for the diagnosis and prognosis of CS.

Although the authors examined heterogeneity stemming from qualitative vs quantitative image interpretation, the lack of standardization in the visual interpretation of images and in the use of quantitative parameters may also produce considerable heterogeneity that was not examined in the current study. Various qualitative and quantitative approaches exist, and assessing them would require going beyond simply categorizing studies as interpreted quantitatively vs qualitatively. Additionally, the most recent expert consensus document encourages concomitant qualitative and quantitative assessment of FDG PET examinations for CS.1

In summary, Kim and colleagues nicely summarize the available literature regarding the diagnostic performance of FDG PET for CS. Significant heterogeneity among the included studies and lack of an established gold standard reference continue to impair our assessment of the true diagnostic accuracy of FDG PET for CS. Recent expert consensus documents and increasing awareness of CS may enable future larger studies assessing the diagnostic performance of FDG PET with standardized performance and interpretation in combination with myocardial perfusion imaging.