Background

Immunohistochemical (IHC) detection of estrogen receptor (ER), progesterone receptor (PR) and human epidermal growth factor receptor 2 (HER2) is the foundation of clinical subtyping of breast cancer because it identifies targets for endocrine or HER2-targeted therapy [1,2,3]. In addition, gene expression profiling (GEP) studies have identified at least four intrinsic breast cancer subtypes that more accurately capture the diversity of breast cancer [4, 5]. Surrogate intrinsic subtypes have been defined which can be approximated using IHC determination of ER, PR, HER2 and Ki-67 [6,7,8]. In contemporary practice, clinical subtyping relies almost exclusively on IHC.

Positron emission tomography (PET) using [18F]-fluorodeoxyglucose ([18F]FDG) is a widely accepted imaging modality in breast cancer that is now mostly used in combination with computed tomography (PET/CT) or magnetic resonance imaging (PET/MRI) for anatomic correlation. While mainly used for initial staging in patients with locally advanced or suspected recurrent breast cancer, it has also been thoroughly investigated for its ability to predict and detect response to neoadjuvant systemic therapy (NST) and to predict prognosis [9,10,11]. In practice, [18F]FDG uptake is predominantly expressed using maximum standardized uptake values (SUVmax).

Previous studies report a correlation of [18F]FDG uptake with tumour aggressiveness, with increased SUVmax in primary breast tumours that are ER-negative, PR-negative, HER2-positive or Ki-67-positive [12,13,14]. Studies investigating the difference in [18F]FDG uptake between clinical subtypes have found a similar pattern, with relatively low SUVmax in subtypes that express ER and PR, and high SUVmax in subtypes that are HER2-positive or triple negative [15, 16]. To date, no meta-analysis has investigated or quantified the relative difference in SUVmax according to IHC expression of ER, PR, HER2 and Ki-67, or between clinical subtypes based on these markers.

Therefore, the aim of the present study is to perform a systematic review and meta-analysis to investigate and quantify the association between [18F]FDG uptake expressed as SUVmax and IHC expression of ER, PR, HER2, Ki-67, and clinical subtypes based on these markers.

Methods

A full description of the methods is provided in Additional file 1 (Tables S1–S2). To be eligible for the meta-analysis, a study had to fulfil the following inclusion criteria: patients with invasive breast cancer; [18F]FDG uptake expressed as SUVmax and measured on the primary tumour before any therapy; and comparison of [18F]FDG uptake between patients negative and positive for IHC expression of ER, PR, HER2, or Ki-67, and between clinical subtypes based on the IHC expression of these markers. Data on the number of patients and on the mean and standard deviation (SD) of SUVmax of patients negative and positive for IHC expression of ER, PR, HER2, Ki-67, and of clinical subtypes based on these markers, were extracted. Study quality was assessed using the Quality Assessment of Diagnostic Accuracy Studies (QUADAS)-2 tool. For the meta-analysis, the primary summary statistic was the standardized mean difference (SMD) with 95% confidence intervals (CIs) using Hedges’ g correction for small study samples. The primary analyses were based on studies which presented mean [18F]FDG uptake with SD. Sensitivity analyses also included studies which presented median [18F]FDG uptake with (interquartile) range, which were transformed to mean and SD. Lastly, Egger’s regression test was used to identify small-study effects.
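As a rough illustration of the summary statistic named above, the following Python sketch computes a single-study SMD with Hedges' small-sample correction and an approximate 95% CI from two group means, SDs and sample sizes. It is a minimal sketch, not the analysis code of this study: the function name `hedges_g`, the commonly used approximation for the correction factor, the normal-approximation CI and the input numbers are all assumptions made for illustration.

```python
import math


def hedges_g(mean_a, sd_a, n_a, mean_b, sd_b, n_b):
    """Return (g, ci_low, ci_high) for group A minus group B.

    Hypothetical helper for illustration only.
    """
    df = n_a + n_b - 2
    # Pooled standard deviation of the two groups.
    sd_pooled = math.sqrt(((n_a - 1) * sd_a**2 + (n_b - 1) * sd_b**2) / df)
    # Uncorrected standardized mean difference (Cohen's d).
    d = (mean_a - mean_b) / sd_pooled
    # Small-sample correction factor (widely used approximation).
    j = 1 - 3 / (4 * df - 1)
    g = j * d
    # Large-sample approximation of the variance of d, rescaled for g.
    var_d = (n_a + n_b) / (n_a * n_b) + d**2 / (2 * (n_a + n_b))
    se_g = j * math.sqrt(var_d)
    z = 1.96  # two-sided 95% interval under a normal approximation
    return g, g - z * se_g, g + z * se_g


# Made-up SUVmax summaries: receptor-negative (A) vs receptor-positive (B).
g, low, high = hedges_g(8.0, 4.0, 40, 5.5, 3.5, 60)
print(f"g = {g:.2f}, 95% CI {low:.2f} to {high:.2f}")
```

With group A taken as the receptor-negative group, a positive value means higher SUVmax in receptor-negative tumours, and a negative value means higher SUVmax in receptor-positive tumours.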

Results

Study characteristics and QUADAS-2

Figure 1 shows the search pattern and selection of articles at each step. Of the 74 included studies, the means and SDs were provided in 50 [12, 14, 16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59,60,61,62,63]. In the remaining 24 studies, the means and SDs were transformed from the provided medians and (interquartile) ranges [13, 64,65,66,67,68,69,70,71,72,73,74,75,76,77,78,79,80,81,82,83,84,85,86,87]. An overview of the characteristics of included studies as well as the [18F]FDG PET characteristics is provided in Additional file 2 (Tables S3–S4). The number of patients, mean and SD of each individual study for negative and positive IHC expression of ER, PR, HER2, Ki-67, and of clinical subtypes based on these markers, are provided in Additional file 2 (Tables S5–S11).

Fig. 1 PRISMA flow diagram of the study selection

Quality of included studies

Risk of bias for patient selection originated from poor reporting of inclusion and exclusion criteria in three studies and the use of case–control designs in another three studies. For the index test, there was an unclear risk of bias in 26 studies since it was not reported who reviewed the PET images or performed SUVmax measurements, and a high risk of bias in 8 studies since no harmonization of PET data was performed while using multiple PET devices. With regard to the reference standard, 22 studies did not provide criteria for receptor positivity or subtypes. Lastly, high risk of bias in flow and timing existed in 8 studies since not all patients were included in the final analysis and no valid reasons were provided. In general, applicability concerns were low, meaning that the patient selection, index test and reference standard of the included studies match the review question. Figure 2 visualizes the risk of bias and applicability concerns. Additional information on the methodologic quality of individual studies is provided in Additional file 2 (Table S12).

Fig. 2 Methodological quality of included studies

Association between [18F]FDG uptake and receptor status

Table 1 displays the estimates of the SMD with 95% CIs as a measure for the difference in [18F]FDG uptake between negative versus positive IHC expression of ER, PR, HER2 and Ki-67. The primary analyses show that the SUVmax is significantly higher in ER-negative (SMD 0.66, P < 0.0001), PR-negative (SMD 0.56, P < 0.0001), HER2-positive (SMD −0.29, P = 0.0043) or Ki-67-positive (SMD −0.77, P < 0.0001) primary tumours compared to their counterparts.

Table 1 Estimates of the SMD as summary measure for the difference in [18F]FDG (SUVmax) uptake between negative versus positive IHC expression of ER, PR, HER2, and Ki-67

Association between [18F]FDG uptake and surrogate intrinsic subtypes

The estimates of the SMD with 95% CIs as a measure for the difference in [18F]FDG uptake between surrogate intrinsic subtypes based on recommendations from the St. Gallen conferences are displayed in Table 2. The primary analyses reveal that luminal A (LA) was associated with significantly lower SUVmax than luminal B (LB) (SMD −0.49, P = 0.0001), LB HER2-negative (SMD −0.68, P = 0.0021), LB HER2-positive (SMD −0.72, P = 0.0089), HER2-positive (SMD −0.91, P < 0.0001) and triple-negative breast cancer (TNBC) (SMD −1.21, P < 0.0001); LB significantly lower than TNBC (SMD −0.77, P = 0.0002); LB HER2-negative significantly lower than TNBC (SMD −0.58, P = 0.0177); LB HER2-positive significantly lower than HER2-positive (SMD −0.22, P = 0.0457); and TNBC significantly higher than non-TNBC (SMD 0.56, P < 0.0001). While the sensitivity analyses did not reveal a difference in the direction of the meta-analyses, the size and 95% CIs of the SMDs did differ significantly for the comparison of LA with LB HER2-negative (P = 0.0213) and of TNBC versus non-TNBC (P = 0.0015) when including transformed medians and (interquartile) ranges.

Table 2 Estimates of the SMD as summary measure for the difference in [18F]FDG (SUVmax) uptake between St. Gallen surrogate intrinsic subtypes

Association between [18F]FDG uptake and clinical subtypes according to a simplified classification

Table 3 displays the estimates of the SMD with 95% CIs as a measure for the difference in [18F]FDG uptake between clinical subtypes according to a simplified classification which classified patients into three groups (i.e. ER-positive/HER2-negative, HER2-positive, and TNBC). The primary analyses reveal that SUVmax was significantly lower in ER-positive/HER2-negative than in HER2-positive (SMD −0.34, P = 0.0070) or in TNBC (SMD −0.89, P = 0.0008) and significantly lower in HER2-positive than in TNBC (SMD −0.54, P = 0.0193).

Table 3 Estimates of the SMD as summary measure for the difference in [18F]FDG (SUVmax) uptake between clinical subtypes according to a simplified classification

Discussion

The results of this systematic review and meta-analysis indicate that there are substantial differences in [18F]FDG uptake expressed as SUVmax of the primary tumour between negative and positive IHC expression of ER, PR, HER2, Ki-67, and between clinical subtypes based on these markers. The pooled SMDs indicate significantly higher SUVmax in tumours that are ER-negative, PR-negative, HER2-positive and Ki-67-positive. Clinical subtypes based on these markers follow the same pattern, with lower SUVmax in luminal subtypes that express ER and PR, and higher uptake in TNBC. HER2 overexpression and associated subtypes have an intermediate effect, with significantly higher uptake compared to LA and LB HER2-positive, similar uptake compared to LB and LB HER2-negative, and non-significantly lower uptake compared to TNBC.

The effect of IHC expression of each separate marker (i.e. ER, PR, HER2 and Ki-67) on [18F]FDG uptake can partially be explained by the interrelations between these markers as well as by underlying differences in confounding clinicopathologic factors. The proliferation marker Ki-67, which had the single largest effect on [18F]FDG uptake in our meta-analysis, is closely related to histological or nuclear grade, and highly proliferative, poorly differentiated tumours are more often ER-negative, PR-negative and HER2-positive [88, 89]. In addition, tumour size has an independent effect on [18F]FDG uptake, and ER-negative, PR-negative, HER2-positive, and Ki-67-positive tumours are associated with larger sizes [14, 88, 90]. This difference is further increased by an underestimation of [18F]FDG uptake in smaller tumours due to partial volume effects [91]. Lastly, invasive lobular carcinoma is associated with lower [18F]FDG uptake and is especially common among ER-positive, PR-positive and Ki-67-negative tumours [14, 92].

Clinical subtyping provides a more sophisticated classification of breast cancer compared to the separate evaluation of IHC markers. Decreased [18F]FDG uptake in luminal tumours can be attributed to ER and PR expression, with an increase in avidity in the case of HER2 positivity, as displayed by the increase in [18F]FDG uptake in LB and HER2-positive subtypes. Analogous to separate markers, [18F]FDG uptake closely mimics the degree of proliferation and differentiation, with a gradual increase in [18F]FDG uptake, in Ki-67 labeling index and in the proportion of poorly differentiated tumours from LA, LB, HER2-positive to TNBC [93, 94]. Paradoxically, HER2 positivity increases [18F]FDG uptake while TNBC is associated with the highest [18F]FDG uptake of all clinical subtypes. Moreover, increased [18F]FDG uptake can be attributed to larger tumours in luminal and HER2-positive subtypes, but not in TNBC due to contradictory reports on its relative tumour size compared to other subtypes [93, 94]. This suggests underlying differences in [18F]FDG uptake mechanisms between clinical subtypes beyond receptor status, tumour size, proliferation and differentiation [95].

Distinct differences in [18F]FDG uptake between clinical subtypes could influence diagnostic, predictive or prognostic performance, especially when using cutoff values to predict outcome. To illustrate, applying the same cutoff value to different clinical subtypes to predict the presence of axillary lymph node metastasis (ALNM) can lead to an underestimation of performance in TNBC since this subtype is associated with increased [18F]FDG uptake and a decreased rate of ALNM [40, 96]. Conversely, Groheux et al. reported differences between clinical subtypes in baseline [18F]FDG uptake as well as in its percentage decrease in the primary tumour in response to NST, suggesting improved diagnostic performance when using distinct cutoffs [15]. In general, evidence on the precise effect of clinical subtypes on the performance of [18F]FDG PET is lacking, and the results of our meta-analysis suggest a need for more research on this topic.

While practices and guidelines differ, [18F]FDG PET/CT is generally recommended in breast cancer patients with a large primary tumour or with clinically node-positive disease [97]. Although it is mainly performed to detect (distant) metastatic disease, the majority of primary tumours in breast cancer patients are [18F]FDG-avid [98]. In current clinical practice, [18F]FDG uptake is predominantly evaluated qualitatively. Considering the increasing number of studies reporting on the significant value of quantitative [18F]FDG PET, this imaging modality is not fully utilized when it is evaluated only qualitatively. Consequently, measuring [18F]FDG PET parameters such as SUVmax on the primary tumour could easily provide valuable predictive or prognostic information that could aid in clinical decision making in the context of personalized medicine. In addition, the application of artificial intelligence to [18F]FDG PET imaging provides a promising adjunct to further improve its diagnostic, predictive and prognostic accuracy [99].

The major limitations of this study were variability in the designs and methods of the included studies, specifically the variability in the administered dose of [18F]FDG, emission time, vendor, type of modality and cutoff values used for receptor status. This variability in design and methods (including vendor variability) is illustrated by the reported heterogeneities, hence the choice for SMD as a summary statistic. Because studies from 2007 onwards were included, differences in definitions of receptor positivity as well as in criteria for clinical subtypes should be taken into account when interpreting the results of the meta-analyses in this study. Aware that varying definitions could influence the [18F]FDG uptake, we deliberately chose to incorporate these changes in the quality assessment instead of in additional sensitivity analyses. Furthermore, it can be hypothesized that the changing criteria mainly relate to borderline cases that have a negligible effect on [18F]FDG uptake.
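One general property of the SMD that is relevant to between-study variability can be shown with a short Python sketch: because the mean difference is divided by the pooled SD, the SMD does not change when all SUVmax values of a study are multiplied by a constant, whereas the raw mean difference does. The numbers, the scale factor and the helper name `smd` below are hypothetical. Real differences between scanners or protocols need not be purely multiplicative, so this is an illustration of the statistic only, not a claim that the SMD removes all such variability.

```python
import math


def smd(mean_a, sd_a, n_a, mean_b, sd_b, n_b):
    """Uncorrected standardized mean difference (hypothetical helper)."""
    df = n_a + n_b - 2
    sd_pooled = math.sqrt(((n_a - 1) * sd_a**2 + (n_b - 1) * sd_b**2) / df)
    return (mean_a - mean_b) / sd_pooled


# Made-up SUVmax summaries for two groups measured at one site.
mean_a, sd_a, n_a = 8.0, 4.0, 40
mean_b, sd_b, n_b = 5.5, 3.5, 60

# Assume a second site reads every SUVmax 30% higher (hypothetical factor).
scale = 1.3

raw_site_1 = mean_a - mean_b
raw_site_2 = scale * mean_a - scale * mean_b

smd_site_1 = smd(mean_a, sd_a, n_a, mean_b, sd_b, n_b)
smd_site_2 = smd(scale * mean_a, scale * sd_a, n_a,
                 scale * mean_b, scale * sd_b, n_b)

print(f"raw difference: {raw_site_1:.2f} vs {raw_site_2:.2f}")
print(f"SMD:            {smd_site_1:.3f} vs {smd_site_2:.3f}")
```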

Conclusions

This systematic review and meta-analysis indicates a substantial and significant association between increased [18F]FDG uptake expressed as SUVmax and ER-negativity, PR-negativity, HER2-positivity and Ki-67-positivity. Clinical subtypes based on these markers follow the same pattern, with lower [18F]FDG uptake in luminal subtypes that express ER and PR, and higher uptake in TNBC. HER2 overexpression and associated subtypes have an intermediate effect on [18F]FDG uptake. Clinical subtypes should be taken into account when applying and interpreting [18F]FDG PET in breast cancer.