Introduction

Prostate cancer (PCa) is the second most prevalent cancer and the fifth leading cause of cancer death in men worldwide, representing approximately 1.3 million new cases in 2018 [1]. Radical prostatectomy is usually a definitive treatment for PCa with intermediate to high risk [2]. Despite treatment, a significant portion of patients may subsequently suffer from biochemical recurrence due to insufficient identification and localization of metastatic lymph nodes at the time of primary staging using conventional imaging such as computed tomography (CT) and magnetic resonance imaging (MRI). Lymph node status is one of the most important prognostic factors in patients with newly diagnosed PCa. Therefore, there is considerable interest in developing a more PCa-specific and reliable imaging technique with improved utility for detecting metastatic lymph nodes in PCa patients.

In recent updates, clinical guidelines on prostate cancer have recognized the role of multiparametric MRI (mpMRI) in primary staging and biochemical recurrence staging, especially its superior soft-tissue image resolution for primary tumor and lymph node status assessment [3, 4]. However, there is limited recognition of Gallium-68 prostate-specific membrane antigen positron emission tomography computed tomography (68Ga-PSMA PET/CT) despite its potential to improve the diagnostic accuracy for preoperative lymph node staging. PSMA is a transmembrane protein expressed on the surface of prostatic cells. It is selectively overexpressed in PCa lesions, metastatic lymph nodes and bone metastases, and PSMA expression increases with increasing tumor grade and stage [5, 6], and when PCa cells become androgen-independent [7]. Therefore, PSMA has become an invaluable PET imaging biomarker.

A number of studies have shown that 68Ga-PSMA PET/CT is a promising imaging technology not only for lymph node staging but also for early detection of PCa, evaluation of biochemical recurrence, and staging of metastases [8,9,10], which may positively affect clinical decision making and improve management in approximately half of patients [11]. If the lymph node metastasis detection rate of 68Ga-PSMA PET/CT meets current clinical standards, this imaging technique could serve as a tool both to assess the characteristics of the local PCa and to test for lymph node metastasis in a single examination. The aim of this study is to conduct a systematic review and meta-analysis to evaluate the diagnostic accuracy of 68Ga-PSMA PET/CT compared with guideline-recommended mpMRI for detection of metastatic lymph nodes in the same cohort of PCa patients.

Methods

Search strategy

This systematic review and meta-analysis followed the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines [12]. It included original research studies of primary lymph node staging with 68Ga-PSMA PET/CT and mpMRI identified in four electronic databases: MEDLINE, EMBASE, CINAHL, and Cochrane Library. The combination of subject headings and terms used to query the databases is given in the appendices. Titles and abstracts of the retrieved articles were screened independently by two reviewers. The reference lists of all retrieved studies were also cross-checked.

Eligibility criteria

This review included studies of patients with intermediate- to high-risk PCa who received both 68Ga-PSMA PET/CT and mpMRI for initial lymph node staging prior to extended pelvic lymph node dissection and definitive histopathologic examination. Both retrospective and prospective observational studies with full-text reports published in English were included. No restrictions were placed on the age or ethnicity of patients or on the definition of positive lymph nodes in 68Ga-PSMA PET/CT and mpMRI. Duplicate articles, animal studies, cell studies, letters, narrative reviews, and conference abstracts were excluded. Studies that used alternative PSMA-bound radiotracers (e.g., F-18-labeled), used conventional or anatomical MRI, lacked sufficient raw data to construct a 2 × 2 table, or enrolled fewer than 10 subjects were also excluded.

Data extraction

A study-specific data-extraction spreadsheet was created to record the following data from eligible studies: the name of the first author, country, year of publication, sample size, methods of patient selection, patients’ demographic characteristics, imaging protocols, criteria for detection of positive lymph nodes in index tests, the time interval between index tests and histopathologic examination, and performance characteristics of index tests.

Quality assessment

The quality of the included studies was assessed using the QUADAS-2 instrument [13]. The following data were also recorded in the data-extraction spreadsheet: (1) clinical characteristics of the study participants; (2) patient selection (consecutive or not); (3) study type (prospective or retrospective); (4) blinding (blinded or not); (5) verification (e.g. whether all patients or lesions were confirmed by histopathologic examination).

Data synthesis and analysis

Diagnostic accuracy

For each study, binary diagnostic accuracy data were extracted and a 2 × 2 table was constructed to classify patients and lymph nodes into one of four groups: true positives, true negatives, false positives and false negatives. The numbers of positive and negative values were extracted either directly or through a calculation based on reported measures of accuracy. Using the 2 × 2 tables, the sensitivity and specificity of 68Ga-PSMA PET/CT and mpMRI were calculated for the per-patient and per-lesion analyses. Bivariate meta-analysis methods were applied to generate paired forest plots of sensitivity and specificity with corresponding 95% confidence intervals (CI) using random-effects models, to obtain a general overview of the diagnostic accuracy estimates before interpretation of the pooled results. Summary receiver operating characteristic (SROC) curves were constructed to generate pooled estimates of sensitivity and specificity. To further verify the accuracy of the results, a hierarchical SROC (HSROC) model was used to produce HSROC plots with corresponding 95% confidence regions and prediction regions. The HSROC model is currently the most statistically rigorous and recommended approach for dealing with a threshold effect, and HSROC curves can overcome some of the deficiencies of traditional ROC curves [14,15,16,17,18]. The pooled positive likelihood ratio (PLR), negative likelihood ratio (NLR) and diagnostic odds ratio (DOR) were also calculated.
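The per-study quantities described above can be illustrated with a minimal sketch. It shows only the arithmetic that turns one 2 × 2 table into sensitivity, specificity, PLR, NLR and DOR; it does not reproduce the bivariate or HSROC pooling, which was done in STATA. The counts, the `TwoByTwo` and `accuracy_measures` names, and the 0.5 continuity correction are illustrative assumptions, not taken from the included studies.

```python
from dataclasses import dataclass


@dataclass
class TwoByTwo:
    tp: int  # true positives
    fp: int  # false positives
    fn: int  # false negatives
    tn: int  # true negatives


def accuracy_measures(table: TwoByTwo) -> dict:
    """Per-study accuracy measures from a single 2 x 2 table."""
    cells = [table.tp, table.fp, table.fn, table.tn]
    # Assumption: add 0.5 to every cell when any cell is zero, a common
    # continuity correction that keeps the ratios finite.
    if 0 in cells:
        cells = [c + 0.5 for c in cells]
    tp, fp, fn, tn = cells
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    plr = sensitivity / (1 - specificity)   # positive likelihood ratio
    nlr = (1 - sensitivity) / specificity   # negative likelihood ratio
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "plr": plr,
        "nlr": nlr,
        "dor": plr / nlr,                   # diagnostic odds ratio
    }


# Hypothetical counts, for illustration only.
example = TwoByTwo(tp=18, fp=4, fn=12, tn=66)
measures = accuracy_measures(example)
```

With these hypothetical counts the sensitivity is 18/30 = 0.60, the specificity is 66/70 ≈ 0.94, and the DOR equals (18 × 66)/(4 × 12) = 24.75.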

Heterogeneity investigation

Heterogeneity due to variation between studies or the threshold effect was investigated. For the heterogeneity due to variation between studies, each imaging method was assessed by (1) visual inspection of the paired forest plots for deviation of the sensitivity and specificity of each study from the vertical line corresponding to the pooled estimates, (2) visual inspection of HSROC plots for the variability of study estimates, (3) Cochran’s Q test and chi-squared p values, and (4) the heterogeneity index (I2) with the following cut-off points: 25–50%, low heterogeneity; 51–75%, moderate heterogeneity; > 75%, high heterogeneity [19].
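As a worked illustration of items (3) and (4), the sketch below computes Cochran's Q from inverse-variance weights and derives I2 from Q using the usual definition I2 = max(0, (Q − df)/Q) × 100. The function names are hypothetical, and taking df as the number of studies minus one is an assumption about how the reported values were obtained.

```python
def cochran_q(estimates, variances):
    """Cochran's Q for study estimates with known within-study variances."""
    weights = [1.0 / v for v in variances]
    pooled = sum(w * y for w, y in zip(weights, estimates)) / sum(weights)
    return sum(w * (y - pooled) ** 2 for w, y in zip(weights, estimates))


def i2_from_q(q, df):
    """Heterogeneity index I2 in percent, truncated at zero."""
    if q <= 0:
        return 0.0
    return max(0.0, (q - df) / q) * 100.0


def heterogeneity_label(i2):
    """Map I2 (percent) onto the cut-off points used in this review."""
    if i2 > 75:
        return "high"
    if i2 > 50:
        return "moderate"
    if i2 >= 25:
        return "low"
    return "below the low band"


# Q value reported in the Results for per-patient sensitivity of
# 68Ga-PSMA PET/CT (six studies, so df = 5 is assumed).
i2_per_patient = i2_from_q(25.03, df=5)
label = heterogeneity_label(i2_per_patient)
```

Under that assumption, Q = 25.03 with five degrees of freedom gives an I2 of about 80%, which falls in the high-heterogeneity band and is consistent with the value reported in the Results.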

The heterogeneity due to threshold effect was assessed by (1) visual inspection of SROC plots [14, 20], and (2) Spearman’s correlation coefficient between the logit of sensitivity (true positive rate) and the logit of 1 − specificity (false positive rate) [21].
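The correlation check in item (2) can be sketched as follows, assuming per-study sensitivity and specificity are available as proportions strictly between 0 and 1. The helper names and the four data points are hypothetical; a strong positive coefficient between the logit of the true positive rate and the logit of the false positive rate is what would suggest a threshold effect.

```python
import math


def logit(p):
    return math.log(p / (1 - p))


def average_ranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def pearson(x, y):
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def threshold_spearman(sensitivities, specificities):
    """Spearman's rho between logit(TPR) and logit(FPR) across studies."""
    logit_tpr = [logit(s) for s in sensitivities]
    logit_fpr = [logit(1 - s) for s in specificities]
    return pearson(average_ranks(logit_tpr), average_ranks(logit_fpr))


# Hypothetical studies in which sensitivity rises as specificity falls.
rho = threshold_spearman([0.40, 0.55, 0.70, 0.85], [0.98, 0.95, 0.90, 0.80])
```

Because the hypothetical studies trade sensitivity against specificity monotonically, the coefficient here is 1, the pattern expected when studies differ mainly in their positivity threshold.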

Sensitivity analysis

Using the QUADAS-2 assessments, each study was classified as high concern if either the risk of bias or the concerns regarding applicability were high, or if both were considered unclear. Studies were categorized as with or without concerns regarding patient selection and the reliability of the index imaging tests. The pooled estimates of the diagnostic accuracy of 68Ga-PSMA PET/CT and mpMRI were recalculated after excluding studies with concerns.
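The classification rule above is simple enough to express directly. The sketch below assumes each QUADAS-2 judgement is recorded as one of the strings "low", "high" or "unclear"; the study records and the `is_high_concern` name are hypothetical.

```python
def is_high_concern(risk_of_bias: str, applicability: str) -> bool:
    """High concern if either rating is high, or if both are unclear."""
    ratings = (risk_of_bias, applicability)
    if "high" in ratings:
        return True
    return all(r == "unclear" for r in ratings)


# Hypothetical study-level judgements for one QUADAS-2 domain.
studies = [
    {"id": "A", "risk_of_bias": "low", "applicability": "low"},
    {"id": "B", "risk_of_bias": "high", "applicability": "low"},
    {"id": "C", "risk_of_bias": "unclear", "applicability": "unclear"},
    {"id": "D", "risk_of_bias": "unclear", "applicability": "low"},
]

# Studies that would remain in the sensitivity analysis.
retained = [
    s["id"]
    for s in studies
    if not is_high_concern(s["risk_of_bias"], s["applicability"])
]
```

Note that a single unclear rating (study D) does not trigger exclusion under this rule; only a high rating or two unclear ratings do.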

Publication bias

Publication bias was assessed using Deeks’ funnel plot asymmetry test. An asymmetric distribution of data points with a p value < 0.05 suggested the presence of publication bias [22].
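For orientation, the regression behind this test can be sketched as below: the log diagnostic odds ratio of each study is regressed on the inverse square root of its effective sample size (ESS), weighted by ESS, and the slope is tested against zero. This is a minimal sketch under stated assumptions, not the STATA routine used in the review; the function names and the zero-cell correction are illustrative, and the p value (from a t distribution with n − 2 degrees of freedom) is deliberately left to a statistics package.

```python
import math


def deeks_inputs(tables):
    """tables: iterable of (tp, fp, fn, tn). Returns x, y and weights."""
    xs, ys, ws = [], [], []
    for tp, fp, fn, tn in tables:
        n_diseased = tp + fn
        n_healthy = fp + tn
        ess = 4 * n_diseased * n_healthy / (n_diseased + n_healthy)
        # Assumption: 0.5 continuity correction only when a cell is zero.
        if 0 in (tp, fp, fn, tn):
            tp, fp, fn, tn = (c + 0.5 for c in (tp, fp, fn, tn))
        xs.append(1 / math.sqrt(ess))
        ys.append(math.log((tp * tn) / (fp * fn)))
        ws.append(ess)
    return xs, ys, ws


def weighted_line(xs, ys, ws):
    """Weighted least-squares fit; returns slope, intercept, slope SE."""
    sw = sum(ws)
    mx = sum(w * x for w, x in zip(ws, xs)) / sw
    my = sum(w * y for w, y in zip(ws, ys)) / sw
    sxx = sum(w * (x - mx) ** 2 for w, x in zip(ws, xs))
    sxy = sum(w * (x - mx) * (y - my) for w, x, y in zip(ws, xs, ys))
    slope = sxy / sxx
    intercept = my - slope * mx
    rss = sum(w * (y - intercept - slope * x) ** 2
              for w, x, y in zip(ws, xs, ys))
    se_slope = math.sqrt(rss / (len(xs) - 2) / sxx)
    return slope, intercept, se_slope


# Hypothetical 2 x 2 tables for three studies.
xs, ys, ws = deeks_inputs([(18, 4, 12, 66), (9, 3, 11, 40), (25, 6, 10, 90)])
slope, intercept, se_slope = weighted_line(xs, ys, ws)
```

The test statistic is the slope divided by its standard error; with as few studies as were available here, such a test has little power, as noted in the Results.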

Statistical software

All statistical analyses were carried out using STATA 16.0 (StataCorp, College Station, TX, USA).

Results

Study selection

In total, 460 studies were identified in the searched databases (MEDLINE, EMBASE, CINAHL and Cochrane Library). Study identification and reasons for exclusion are summarized in Fig. 1. After removing duplicates, 365 articles were screened by title and abstract. Overall, 323 publications were excluded, 250 of them because they were not related to the subject of the review (e.g. ineligible patients, different index tests or reference standards) or because of an ineligible publication type. The full-text versions of 42 articles were reviewed, and a further 36 papers were excluded, leaving six studies for the meta-analysis [23,24,25,26,27,28].

Fig. 1

PRISMA flow diagram. Unreliable data [29]: the reported diagnostic accuracy does not correspond to the values calculated from the extracted data

The meta-analysis included six studies, which included 476 patients, for per-patient analysis, and four studies, including in total 4859 dissected lymph nodes, for per-lesion analysis. A summary of the description of the included studies and extracted data is in Tables 1 and 2.

Table 1 Patient and study characteristics
Table 2 Diagnostic performances of 68Ga-PSMA PET/CT and mpMRI in included studies

Methodological quality of eligible studies

All included studies were assessed using the QUADAS-2 tool. Figure 2 summarizes the evaluation of the six included studies regarding risk of bias and applicability concerns.

Fig. 2

QUADAS-2 summary of risk of bias and applicability concerns: authors’ judgements about each domain, presented as percentages across the 6 included studies

Diagnostic performance of 68Ga-PSMA PET/CT and mpMRI

68Ga-PSMA PET/CT

68Ga-PSMA PET/CT detected 161 (3.3%) positive lymph nodes in 113 (23.7%) patients in six included studies. For per-patient analysis, the pooled sensitivity, pooled specificity, and AUC were 0.69 (95% CI 0.45–0.86), 0.93 (95% CI 0.87–0.96) and 0.94, respectively. For per-lesion analysis, the pooled sensitivity, pooled specificity and AUC were 0.58 (95% CI 0.17–0.9), 0.99 (95% CI 0.98–1) and 0.99, respectively.

In the forest plots, there was large deviation from the pooled estimate for sensitivity, whereas the deviation from the pooled estimate for specificity was small in both analyses (Figs. 3, 4). In the HSROC plot for the per-patient analysis, three studies lay outside both the confidence and prediction regions and the SROC curve was not consistent with the individual study estimates, whereas no notable outlier was detected in the per-lesion analysis. However, both HSROC plots showed wide confidence and prediction regions (Fig. 5). A heterogeneity test of sensitivity and specificity showed Q = 25.03 (p = 0), I2 = 80.02% and Q = 14.62 (p = 0.01), I2 = 65.79%, respectively, for per-patient analysis and Q = 85.54 (p = 0), I2 = 96.49% and Q = 31.5 (p = 0), I2 = 90.48%, respectively, for per-lesion analysis.

Fig. 3

Paired sensitivity and specificity plots of a 68Ga-PSMA PET/CT and b mpMRI for per-patient analysis. Horizontal lines are the 95% confidence intervals. df degrees of freedom

Fig. 4

Paired sensitivity and specificity plots of a 68Ga-PSMA PET/CT and b mpMRI for per-lesion analysis. Horizontal lines are the 95% confidence intervals. df degrees of freedom

Fig. 5

Hierarchical Summary Receiver Operating Characteristic plots. Per-patient analysis: a 68Ga-PSMA PET/CT, b mpMRI. Per-lesion analysis: c 68Ga-PSMA PET/CT and d mpMRI. The size of circles represents study size. Studies were insufficient to provide meaningful confidence and prediction regions

mpMRI

mpMRI detected 125 (2.6%) positive lymph nodes in 57 (12.1%) patients. For per-patient analysis, the pooled sensitivity, pooled specificity and AUC were 0.37 (95% CI 0.15–0.66), 0.95 (95% CI 0.91–0.97) and 0.93, respectively. For per-lesion analysis, the pooled sensitivity, pooled specificity and AUC were 0.44 (95% CI 0.1–0.85), 0.99 (95% CI 0.98–1) and 0.99, respectively.

In the forest plots, there was large deviation from the pooled estimate for sensitivity, whereas the deviation from the pooled estimate for specificity was small in both analyses (Figs. 3, 4). In the HSROC plot for the per-patient analysis, three studies lay outside the confidence region and the SROC curve was not consistent with the individual study estimates, whereas no notable outlier was detected in the per-lesion analysis. As with 68Ga-PSMA PET/CT, both HSROC plots showed wide confidence and prediction regions (Fig. 5). A heterogeneity test of sensitivity and specificity showed Q = 39.31 (p = 0), I2 = 87.28% and Q = 18.79 (p = 0), I2 = 73.39%, respectively, for per-patient analysis and Q = 100.23 (p = 0), I2 = 97.01% and Q = 34.5 (p = 0), I2 = 91.3%, respectively, for per-lesion analysis.

Comparison between two imaging techniques

68Ga-PSMA PET/CT had a higher overall detection rate than mpMRI for primary lymph node staging of intermediate- to high-risk PCa. The pooled sensitivity was higher for 68Ga-PSMA PET/CT than for mpMRI in both per-patient and per-lesion analyses, while the pooled specificity was slightly higher for mpMRI than for 68Ga-PSMA PET/CT in per-patient analysis. The reported sensitivities and specificities for both imaging techniques varied across studies; however, the specificities were less variable than the sensitivities.

Overall, there was deviation from the pooled sensitivities and specificities in the forest plots for both 68Ga-PSMA PET/CT and mpMRI. The deviation was larger for sensitivity than for specificity. The study estimates for both 68Ga-PSMA PET/CT and mpMRI were scattered in the HSROC plots, and the confidence and prediction regions were wide due to insufficient data. Based on the patterns of confidence and prediction regions in the HSROC plots and the CIs of the summary estimates, the uncertainty in the pooled specificities of both imaging tests was markedly lower than the uncertainty in the pooled sensitivities. The I2 values were either moderate or high and the Q test p values were generally low for both sensitivity and specificity of the two imaging tests. Notable heterogeneity was present, and it was higher for sensitivity than for specificity.

Threshold effect

The patterns of the study estimates in the SROC space did not show a “shoulder arm” shape. The data were, however, insufficient and sparse for visual inspection of threshold effect on the SROC space. Thus, the power to detect threshold effect was low (Appendix II).

The Spearman’s correlation coefficients and p values of 68Ga-PSMA PET/CT and mpMRI for both per-patient and per-lesion analyses are presented in Table 3. The per-patient analysis of mpMRI showed a strong and significant positive correlation, indicating a threshold effect.

Table 3 The Spearman correlation coefficients to test threshold effects in the meta-analysis

Sensitivity analysis

The results of sensitivity analysis demonstrated that the pooled estimates were stable after excluding studies with patient selection concerns, whereas the variances of the pooled estimates became significant and the heterogeneity improved after excluding studies with concerns about index imaging tests (Appendix III).

Publication bias

The funnel plots in Fig. 6 showed that the studies were distributed symmetrically with corresponding p values > 0.05 indicating no evidence of publication bias. The number of studies included in the meta-analysis was, however, small with high heterogeneity so the power to detect bias was low (Table 4).

Fig. 6

Deeks’ funnel plots of Publication Bias. Per-patient analysis: a 68Ga-PSMA PET/CT, b mpMRI. Per-lesion analysis: c 68Ga-PSMA PET/CT and d mpMRI

Table 4 Summary of diagnostic performances for 68Ga-PSMA PET/CT and mpMRI

Discussion

Main findings

This review indicated that 68Ga-PSMA PET/CT had a better overall diagnostic performance than mpMRI, with comparably high specificity and notably higher sensitivity. The uncertainty in the diagnostic performance was also lower for 68Ga-PSMA PET/CT than for mpMRI. However, the identified studies were heterogeneous, and higher heterogeneity was observed in sensitivity than in specificity. The sensitivity analysis showed that the lack of blinding regarding the imaging tests might lead to inflated measures of diagnostic accuracy.

Comparison with previous findings

The results of this meta-analysis are in line with earlier publications [10, 30]. Our meta-analysis showed that 68Ga-PSMA PET/CT had a higher sensitivity but slightly lower specificity than in the earlier review [30]. Several differences between the two reviews were noted. Wu et al. combined conventional MRI and mpMRI to produce the pooled estimates, stratified risk of PCa solely on the basis of biopsy results, and included studies comparing the two imaging techniques in different cohorts of patients. Nevertheless, both reviews had significant heterogeneity whose sources could not be determined explicitly. The diagnostic accuracy of 68Ga-PSMA PET/CT for lymph node staging was also found to be similar in a recent and statistically powered meta-analysis, with a specificity of 97% and 99% for per-patient and per-lesion analysis, respectively [10].

Strengths and limitations of this review

This study provides a comprehensive review of the current evidence identified through a systematic search on the diagnostic accuracy of 68Ga-PSMA PET/CT and mpMRI for detection of metastatic lymph nodes in the same cohort of PCa patients.

One limitation of this review was the small number of included studies and the sparse data available for analysis, which limited the power of the statistical tests. Another limitation was the suboptimal quality of the eligible studies, including suboptimal reporting of patient selection criteria, independence of test interpretation and time intervals between individual tests. These are known sources of review bias that may lead to an overestimation of test accuracy. Furthermore, PCa risk might have been misclassified, since different risk stratification approaches were used in the included studies. However, the clinical characteristics of patients provided in the studies were too incomplete for us to reclassify them, and analyses were limited to aggregate data.

Cochran’s Q test and the I2 statistic alone do not account for threshold effect; therefore, we incorporated several approaches to explore heterogeneity in this review. The issue of pre-test probability has been considered especially relevant when conducting tests with an implicit threshold in response to the perception of increased prevalence [31, 32]. In addition, physicians may set a subjective threshold in response to prior test results. These factors might also lead to inflated measures of diagnostic accuracy. Hence, the importance of patient selection and the reliability of imaging tests were considered in the sensitivity analysis. Heterogeneity could also be caused by scanner-related issues and by imaging protocols that differ between institutions according to their own equipment, capacity and expertise. The included studies using ‘time-of-flight’ technology obtained higher sensitivity than the study using older PET/CT scanners without technical refinements [25,26,27,28]. The variability in the diffusion gradient factor b used in DWI sequences of mpMRI made the ranges of ADC values difficult to interpret, although the earlier meta-analysis reported no significant differences in mpMRI diagnostic performance between magnetic field strengths (1.5 or 3.0 T) [30]. The data in the included studies were, however, insufficient to examine those potential sources of heterogeneity for this review.

Given our data limitations and substantial heterogeneity between studies, we cannot ensure the generalizability of the findings to settings with different imaging protocols for PCa patients which might result in different sensitivity and specificity.

Implications for clinical practice

The high specificity of 68Ga-PSMA PET/CT and mpMRI may support their use as imaging tests to rule in metastatic lymph nodes for PCa patients and avoid over-diagnosis and invasive investigation. However, inadequate sensitivity may limit their use as a screening test for asymptomatic populations. As the findings indicated that the diagnostic accuracy for lymph node staging with 68Ga-PSMA PET/CT meets current clinical standards, it may be reasonable for recommendations on staging and prognosis of PCa to favor 68Ga-PSMA PET/CT as a single whole-body imaging examination.
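The rule-in argument can be made concrete with a short Bayes calculation. The sketch below uses the pooled per-patient sensitivity (0.69) and specificity (0.93) reported for 68Ga-PSMA PET/CT in the Results; the pre-test probability of 25% is a purely hypothetical value chosen for illustration, and the likelihood ratios derived this way may differ from the pooled PLR and NLR produced by the bivariate model.

```python
def post_test_probability(pre_test_prob, likelihood_ratio):
    """Convert a pre-test probability to a post-test probability via odds."""
    pre_odds = pre_test_prob / (1 - pre_test_prob)
    post_odds = pre_odds * likelihood_ratio
    return post_odds / (1 + post_odds)


SENS, SPEC = 0.69, 0.93   # pooled per-patient estimates from the Results
PRE_TEST = 0.25           # hypothetical prevalence of nodal metastasis

lr_pos = SENS / (1 - SPEC)        # likelihood ratio of a positive scan
lr_neg = (1 - SENS) / SPEC        # likelihood ratio of a negative scan

after_positive = post_test_probability(PRE_TEST, lr_pos)
after_negative = post_test_probability(PRE_TEST, lr_neg)
```

Under these assumptions a positive scan raises the probability of nodal metastasis from 25% to about 77%, while a negative scan lowers it only to about 10%, which illustrates why high specificity supports ruling in but moderate sensitivity does not allow disease to be ruled out.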

PSMA provides an excellent target for theranostic application and has become a unique biomarker for both imaging and radionuclide treatment. A number of studies have shown a promising treatment response to Lutetium-177 PSMA in metastatic castration-resistant PCa patients [33,34,35,36]. Pretreatment selection and therapeutic response are mainly assessed by 68Ga-PSMA PET/CT. Therefore, 68Ga-PSMA PET/CT is likely to have an important role in the diagnosis, staging and treatment of PCa in the near future.

Development of PSMA PET/CT and mpMRI and implications for future research

Recently, 18Fluorine (18F)-labeled PSMA PET/CT has demonstrated good imaging quality, potentially outperforming current imaging modalities, with several principal advantages such as minimal radiotracer accumulation in the bladder [37,38,39]. mpMRI for lymph node staging with ultrasmall superparamagnetic iron oxide (USPIO) has also shown promising diagnostic performance in depicting PCa metastatic lymph nodes [40, 41]. However, USPIO is currently not widely available because its license has been withdrawn in many regions, where its use is limited to research purposes [42]. A sufficient number of high-quality studies of 18F-PSMA PET/CT and USPIO-enhanced MRI are warranted to further define their accuracy, capabilities and role in the management of PCa.

Given the small number of studies in this review and their methodological limitations, large-scale prospective randomized clinical trials are necessary to confirm the diagnostic performance and clinical value of 68Ga-PSMA PET/CT and mpMRI. To reduce significant heterogeneity and misclassification bias, an individual patient data (IPD) meta-analysis can be performed [43, 44]. In an IPD meta-analysis, patient-level data are collected from the eligible studies and analyzed in a consistent way, rather than relying on aggregate data as in a standard meta-analysis, so that specific subgroups of patients can be assessed across studies.

Conclusions

This review provides valuable insight into the role of 68Ga-PSMA PET/CT and mpMRI in primary lymph node staging of intermediate- to high-risk PCa. Both imaging techniques are useful to rule in metastatic lymph nodes owing to their high specificity. 68Ga-PSMA PET/CT has a better and more certain overall diagnostic performance for imaging-guided region-based lymph node dissection. These findings should increase the diagnostic impact of 68Ga-PSMA PET/CT in clinical practice and result in greater acceptance of this imaging technique by the molecular imaging community, physicians, patients and funding bodies. However, given the paucity of data from the included studies and the methodological issues considered, large-scale prospective trials and IPD meta-analyses are needed to further confirm the clinical value of these two imaging techniques.