Background

Neoadjuvant systemic therapy (NST) has become a widely accepted initial treatment modality for breast cancer patients with unfavorable tumor characteristics and/or with axillary lymph node metastases [1, 2]. NST can lead to pathologic downstaging of the axillary disease burden, allowing less-invasive axillary surgery [3]. The axillary response to NST is subtype-dependent: patients with human epidermal growth factor receptor 2 positive (HER2-positive) and triple negative (TN) breast cancer are more likely to achieve axillary pathologic complete response (axillary pCR) than patients with estrogen receptor (ER)-positive breast cancer [4]. Importantly, patients with axillary pCR have improved overall survival (OS; 85% vs 55%) as well as disease-free survival (DFS; 83% vs 58%) compared to patients with residual axillary disease [5].

Positron emission tomography with computed tomography (PET/CT) using 18F-fluorodeoxyglucose (18F-FDG) is commonly used to stage patients with locally advanced or recurrent breast cancer [6, 7]. It has been hypothesized that axillary lymph node metastases with higher baseline glycolytic activity, and therefore higher 18F-FDG uptake reflected by standardized uptake values (SUVs), achieve axillary pCR less frequently [8, 9]. In this regard, baseline 18F-FDG PET/CT prior to NST could contain valuable information regarding axillary response which might aid in the clinical decision making regarding NST or primary surgery. Ideally, a cut-off value of an easily computed PET-parameter, such as maximum SUV (SUVmax), would be clinically helpful to predict which patients are more likely to achieve axillary pCR following NST [10].

In addition, breast cancer subtype seems to affect the relationship between baseline glycolytic activity on 18F-FDG PET/CT and axillary response. While the negative correlation of ER expression with 18F-FDG uptake is well established, the relationship between 18F-FDG uptake and HER2 status remains a matter of controversy [11, 12]. However, studies do show that ER-positive/HER2-negative patients often have significantly lower 18F-FDG uptake than TN and HER2-positive patients, while the difference between TN and HER2-positive patients is less clear [13,14,15,16]. This creates an apparent contradiction: higher 18F-FDG uptake is hypothesized to predict a poorer axillary response, yet SUVmax is higher in the subtypes that tend to respond well to NST. Axillary response prediction based on 18F-FDG uptake should therefore be investigated by taking breast cancer subtype into account.

Therefore, the aim of the present study was to determine the value of 18F-FDG PET/CT prior to the start of NST to predict which breast cancer patients will achieve axillary pCR following NST with a specific emphasis on breast cancer subtype.

Methods

Patient selection

We retrospectively evaluated all female breast cancer patients who underwent an 18F-FDG PET/CT exam prior to NST at our facility between 2008 and 2018. Exclusion criteria were the absence of axillary surgery following NST, inflammatory breast cancer, and incomplete exams. In all patients, the primary tumor was assessed using mammography, ultrasonography (US), and/or magnetic resonance imaging (MRI). Histological core biopsies of the primary tumor were performed to determine tumor characteristics. The axillary lymph nodes were evaluated with axillary US and concurrent tissue sampling in case of suspicious lymph nodes (i.e., diffuse cortical thickening, focal cortical mass and/or thickening, and loss of the fatty hilum) [17]. In patients diagnosed with bilateral invasive breast cancer, lymph nodes were assessed in both axillae separately. The local medical ethics committee waived the need to obtain informed consent because of the retrospective study design.

Treatment

Patients with unfavorable tumor characteristics and/or lymph node metastasis were offered NST at our institution. NST regimens were administered according to the prevailing Dutch national breast cancer guidelines (Additional file 1: Table S1) [18]. The sequential NST regimen generally consisted of four cycles of 3-weekly doxorubicin and cyclophosphamide, followed by either four 3-weekly cycles of docetaxel in case of ER-positive and/or HER2-positive subtype, or by 12 weekly cycles of paclitaxel in case of TN subtype. Moreover, carboplatin could be added in case of TN subtype. Alternatively, patients could be offered a concurrent schedule consisting of six 3-weekly cycles of doxorubicin, cyclophosphamide, and docetaxel. In case of HER2-positive subtype, HER2-targeted therapy with trastuzumab and/or pertuzumab was administered following four 3-weekly cycles of doxorubicin and cyclophosphamide. Alternatively, HER2-positive patients could be offered a concurrent schedule consisting of six 3-weekly cycles of docetaxel, trastuzumab, and pertuzumab.

Patients with clinically node-negative disease prior to NST underwent a sentinel lymph node biopsy (SLNB). Clinically node-positive patients underwent either an axillary lymph node dissection (ALND) or a combination of the procedure marking axillary lymph nodes with radioactive iodine seeds (MARI) and SLNB [19].

18F-FDG PET/CT imaging

Prior to the start of NST, patients underwent an 18F-FDG PET/CT exam (Gemini TF, Philips Healthcare, Best, the Netherlands) with a standard acquisition protocol [20, 21]. Prior to 18F-FDG administration patients had to fast for at least 4 h. Afterward blood glucose levels were checked to ensure levels below 10 mmol/l, and subsequently, an intravenous 18F-FDG injection of 2 MBq/kg body weight was administered. A standard supine whole-body 18F-FDG-PET/CT with elevated arms was acquired after a resting period of 45–60 min. A low-dose CT scan (120 kV, 30 mAs, slice thickness 4 mm) from head to thigh was performed, followed by the PET acquisition (2 min per bed position). CT images were reconstructed using filtered-back projection. PET images were reconstructed using the BLOB-OS-TF time-of-flight algorithm provided by the manufacturer, with a voxel size of 4 × 4 × 4 mm3. The 18F-FDG PET/CT imaging protocol did not change during the study period.

Imaging assessment

A nuclear medicine physician with ten years of experience (C.M.) reviewed all images using simultaneous display of PET, CT, and fused PET/CT images. The reader was blinded to clinicopathologic and follow-up findings other than the presence of breast cancer. All image analyses were performed on a dedicated commercially available workstation (AW-server 3.2, GE Healthcare, Chicago, USA). The 18F-FDG uptake in the primary tumor and the most FDG-avid axillary lymph node was semi-quantitatively analyzed using the metabolic PET-parameters maximum, mean, and peak SUV (SUVmax, SUVmean, and SUVpeak) (Fig. 1). SUV-parameters were determined for each region of interest by correcting the measured activity for radioactive decay, total administered activity, and body weight [21]. Moreover, metabolic tumor volume (MTV) was determined by measuring the volume of FDG-avid voxels with an activity equal to or greater than 42% of the SUVmax in that specific region of interest, and total lesion glycolysis (TLG) was computed by multiplying the MTV by the SUVmean. Lastly, the nodal/tumor ratio (NT-ratio) was computed by dividing the SUVmax of the most FDG-avid axillary lymph node by the SUVmax of the primary tumor [22].
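The parameter definitions above can be illustrated with a minimal Python sketch. It is not the workstation's implementation: the function names, the example voxel values, and the tumor SUVmax of 9.8 are hypothetical, and it assumes that SUVmean is averaged over the voxels inside the 42% threshold and that 1 g of tissue corresponds to 1 mL. SUVpeak is omitted because it requires a spatial neighbourhood definition that the text does not specify.

```python
F18_HALF_LIFE_MIN = 109.77  # physical half-life of fluorine-18 in minutes


def suv_bw(conc_kbq_per_ml, injected_mbq, weight_kg, minutes_since_injection):
    """Body-weight SUV: tissue activity concentration divided by the
    decay-corrected administered activity per gram of body weight."""
    decay = 0.5 ** (minutes_since_injection / F18_HALF_LIFE_MIN)
    remaining_kbq = injected_mbq * 1000.0 * decay
    weight_g = weight_kg * 1000.0  # assumption: 1 g of tissue ~ 1 mL
    return conc_kbq_per_ml / (remaining_kbq / weight_g)


def lesion_parameters(voxel_suvs, voxel_volume_ml=0.064, threshold=0.42):
    """SUVmax, SUVmean, MTV and TLG for one region of interest.

    voxel_volume_ml defaults to 4 x 4 x 4 mm = 64 mm^3 = 0.064 mL.
    SUVmean is taken over the thresholded voxels (an assumption).
    """
    suv_max = max(voxel_suvs)
    avid = [v for v in voxel_suvs if v >= threshold * suv_max]
    mtv = len(avid) * voxel_volume_ml      # metabolic tumor volume in mL
    suv_mean = sum(avid) / len(avid)
    return {"SUVmax": suv_max, "SUVmean": suv_mean,
            "MTV": mtv, "TLG": mtv * suv_mean}


def nt_ratio(node_suv_max, tumor_suv_max):
    """Nodal/tumor ratio: nodal SUVmax divided by primary tumor SUVmax."""
    return node_suv_max / tumor_suv_max


# Hypothetical voxel SUVs for a lymph node; only 7.37 echoes the Fig. 1 example.
node = lesion_parameters([7.37, 5.0, 3.2, 3.0, 1.1])
ratio = nt_ratio(node["SUVmax"], tumor_suv_max=9.8)  # 9.8 is made up
```

With these made-up voxels the threshold is 0.42 × 7.37 ≈ 3.10, so three voxels contribute to the MTV and the voxel at 3.0 is excluded.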

Fig. 1 Patient example. Example of a patient with a 22-mm HER2-positive invasive carcinoma of no special type in her left breast. An axial 18F-FDG PET/CT exam of the left axilla shows the most FDG-avid axillary lymph node with an SUVmax of 7.37. Following completion of NST, this patient had residual axillary disease

Response assessment of axillary nodes

For all axillary surgery specimens, the total number of evaluated lymph nodes and the number of lymph nodes with isolated tumor cells (≤ 0.2 mm or fewer than 200 cells), micrometastases (> 0.2 and ≤ 2.0 mm), and macrometastases (> 2.0 mm) were reported. Histopathologic response of the axillary lymph nodes and the primary tumor was evaluated according to EUSOMA guidelines and based on reduction of tumor cellularity, using the classification suggested by Pinder et al. [23]. Histopathologic response to NST was based on the axillary lymph node with the most unfavorable category. Axillary pCR was defined as lymph nodes without metastatic disease, with or without evidence of response/downstaging such as fibrosis. Residual axillary disease was defined as lymph nodes with metastatic disease, with or without evidence of response/downstaging such as fibrosis [23].

Statistical analysis

Statistical analyses were performed using SPSS software (version 25.0, IBM Corporation, Armonk, New York, USA). The difference in axillary response to NST between breast cancer subtypes was assessed with Pearson's chi-squared test. Differences in PET-parameters between patients with and without axillary pCR were examined for statistical significance with the Mann–Whitney U test. For PET-parameters that differed significantly between response groups at baseline, receiver-operating characteristics (ROC) analyses were performed to determine cut-off values for the prediction of axillary response to NST. Residual axillary lymph node disease was considered positive, and axillary pCR was considered negative. Sensitivity was defined as the proportion of patients with residual axillary disease that were correctly predicted. Specificity was defined as the proportion of patients with axillary pCR that were correctly predicted. Positive predictive value (PPV) was defined as the proportion of patients predicted to have residual axillary disease who had residual axillary disease following NST. Negative predictive value (NPV) was defined as the proportion of patients predicted to achieve axillary pCR who had axillary pCR following NST. Due to the small sample size in combination with the low incidence of axillary pCR and low 18F-FDG uptake in ER-positive/HER2-negative patients, analyses for ER-positive/HER2-negative breast cancer were performed separately from those for HER2-positive/TN breast cancer. Additionally, subgroup analysis of clinically node-positive breast cancer patients was performed. All statistical tests were two-sided, with the level of significance established at P < 0.05.
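As an illustration of the ROC analysis described above, the following Python sketch computes an AUC and scans candidate cut-offs, with residual axillary disease as the positive class. It is not the SPSS procedure used in the study: the SUVmax values are invented, the rule "value at or above the cut-off predicts residual disease" is an assumption consistent with the higher uptake reported in non-responders, and the use of Youden's J to pick the cut-off is an assumption, since the selection criterion is not stated here.

```python
def auc(pos, neg):
    """AUC as the probability that a positive case has a higher value
    than a negative case (ties count one half). This equals the
    Mann-Whitney U statistic divided by len(pos) * len(neg)."""
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def sens_spec(pos, neg, cutoff):
    """Sensitivity and specificity when value >= cutoff predicts
    residual axillary disease (the positive class)."""
    sensitivity = sum(v >= cutoff for v in pos) / len(pos)
    specificity = sum(v < cutoff for v in neg) / len(neg)
    return sensitivity, specificity


def best_cutoff(pos, neg):
    """Observed value maximizing Youden's J = sensitivity + specificity - 1
    (assumed criterion, not taken from the study)."""
    candidates = sorted(set(pos) | set(neg))
    return max(candidates, key=lambda c: sum(sens_spec(pos, neg, c)) - 1.0)


# Hypothetical nodal SUVmax values, for illustration only.
residual = [8.1, 7.5, 6.2, 5.0, 3.9]          # residual axillary disease
pcr = [5.4, 4.1, 3.3, 3.0, 2.6, 2.2, 1.8]     # axillary pCR

area = auc(residual, pcr)
cutoff = best_cutoff(residual, pcr)
sensitivity, specificity = sens_spec(residual, pcr, cutoff)
```

The pairwise formulation is quadratic in the number of patients, which is acceptable for cohorts of this size and keeps the link between the AUC and the Mann–Whitney U test explicit.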

Results

Patient characteristics

Eighty-one consecutive patients with 87 primary tumors underwent 18F-FDG PET/CT prior to NST at the Maastricht University Medical Center between 2008 and 2018. After exclusion of eighteen cases [inflammatory breast cancer (n = 8), incomplete exams (n = 4), no axillary surgery following NST (n = 6)], the remaining 66 patients with 69 primary tumors were included in this study. Clinicopathologic and operative characteristics of included patients are listed in Table 1. Of the 69 included tumors, 33 were ER-positive/HER2-negative, 16 HER2-positive, and 20 triple-negative (TN). Seventeen axillae were considered clinically node-negative and the remaining 52 axillae were clinically node-positive, based on axillary US findings.

Table 1 Clinicopathologic and operative characteristics of all breast cancer patients and subdivided by breast cancer subtype

Axillary response to NST

An overview of the pathologic response of the axillary lymph nodes to NST is displayed in Table 2. When considering all patients, there was no evidence of axillary residual disease following NST in 37 axillae (53.6%). In the subgroup of clinically node-positive breast cancer patients, axillary pCR was achieved in a total of 25 axillae (48.1%).

Table 2 Axillary response following NST

When considering all patients, axillary response after NST differed significantly between subtypes (p < 0.01), with the highest percentage of patients without axillary residual disease in HER2-positive breast cancer (87.5%), followed by the TN (60.0%) and ER-positive/HER2-negative subtypes (33.3%). In the subgroup of clinically node-positive breast cancer patients, axillary pCR likewise differed significantly between subtypes (p = 0.01), with the highest percentage in HER2-positive breast cancer (83.3%), followed by the TN (46.7%) and ER-positive/HER2-negative subtypes (32.0%).
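The whole-cohort comparison can be checked with a short Python sketch of Pearson's chi-squared test. The counts are not taken from Table 2: they are back-calculated here from the reported subtype sizes (33, 16, and 20) and pCR percentages (33.3%, 87.5%, and 60.0%), which gives 11, 14, and 12 axillae with pCR and matches the reported total of 37. The closed-form p-value is valid only because this table has exactly two degrees of freedom.

```python
import math

# (axillary pCR, residual disease) per subtype; back-calculated counts,
# an assumption derived from the reported percentages.
observed = {
    "ER-positive/HER2-negative": (11, 22),
    "HER2-positive": (14, 2),
    "TN": (12, 8),
}


def pearson_chi2(table):
    """Pearson chi-squared statistic and degrees of freedom for an
    r x c contingency table given as {row_label: (count, ...)}."""
    rows = list(table.values())
    col_totals = [sum(col) for col in zip(*rows)]
    n = sum(col_totals)
    chi2 = 0.0
    for row in rows:
        row_total = sum(row)
        for obs, col_total in zip(row, col_totals):
            expected = row_total * col_total / n
            chi2 += (obs - expected) ** 2 / expected
    dof = (len(rows) - 1) * (len(col_totals) - 1)
    return chi2, dof


chi2, dof = pearson_chi2(observed)
# For 2 degrees of freedom the chi-squared survival function is exp(-x / 2).
p_value = math.exp(-chi2 / 2.0) if dof == 2 else None
```

Under these assumed counts the statistic is about 13.2 with a p-value below 0.01, in agreement with the significance level reported above.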

PET-parameters associated with axillary response to NST

In the whole cohort, the NT-ratio (0.75 vs 0.39, p = 0.025) was the only PET-parameter that differed significantly between response groups (Table 3). Similar analyses were performed for a combined cohort of HER2-positive/TN breast cancer. In this subgroup, significant differences between presence and absence of axillary residual disease following NST were found for the PET-parameters SUVmax (7.50 vs 3.15, p = 0.002), SUVmean (4.69 vs 2.07, p = 0.002), SUVpeak (5.69 vs 2.66, p = 0.007), TLG (7.77 vs 3.46, p = 0.028), and NT-ratio (1.18 vs 0.39, p = 0.008), with higher values found for patients with residual axillary disease. The difference remained significant in the subgroup of clinically node-positive HER2-positive/TN patients for SUVmax (7.50 vs 4.53, p = 0.040) and SUVmean (4.69 vs 3.01, p = 0.031). For the subgroup of patients with ER-positive/HER2-negative breast cancer, none of the measured PET-parameters differed between axillary response groups.

Table 3 Differences in PET-parameters determined on the most FDG-avid axillary lymph node between axillary response groups

For the whole cohort as well as for the subtypes separately, none of the PET-parameters determined on the primary tumor prior to NST was significantly associated with axillary response following NST (Additional file 1: Table S2).

Overall predictive value of PET-parameters for residual axillary disease

Regarding the entire cohort of HER2-positive/TN breast cancer patients, the ROC curve for baseline SUVmax and baseline SUVmean showed an AUC of 0.82 and 0.83, respectively (Additional file 1: Table S3). In the subgroup of clinically node-positive HER2-positive/TN breast cancer patients, the AUCs were 0.74 and 0.75, respectively.

In the entire cohort of HER2-positive/TN breast cancer, the highest diagnostic accuracy to predict axillary pCR based on SUVmax was achieved by using a cut-off of 4.89 on the most FDG-avid axillary lymph node, yielding a sensitivity, specificity, PPV, and NPV of 90%, 69%, 53%, and 95%, respectively (Table 4).

Table 4 ROC analyses of PET-parameters in predicting axillary response following NST in HER2-positive/TN breast cancer patients
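To make the reported accuracy figures concrete, the sketch below derives sensitivity, specificity, PPV, and NPV from a 2 × 2 table. The counts are an assumption, not data from Table 4: they are one back-calculation consistent with a HER2-positive/TN cohort of 36 axillae (10 with residual disease and 26 with axillary pCR, inferred from the subtype sizes and pCR rates above) and with the reported 90%, 69%, 53%, and 95%. Residual disease is the positive class.

```python
def diagnostic_metrics(tp, fn, tn, fp):
    """Standard 2 x 2 diagnostic accuracy measures.

    tp: residual disease correctly predicted
    fn: residual disease predicted as pCR
    tn: pCR correctly predicted
    fp: pCR predicted as residual disease
    """
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
    }


# Assumed counts at the SUVmax cut-off of 4.89 (back-calculated, not reported).
metrics = diagnostic_metrics(tp=9, fn=1, tn=18, fp=8)
rounded = {name: round(100 * value) for name, value in metrics.items()}
```

Read this way, the 95% NPV would correspond to a single patient with residual disease among 19 with a nodal SUVmax below the cut-off, which shows how strongly the estimate depends on very few events.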

Discussion

This study demonstrates that focusing on breast cancer subtype allows for the prediction of axillary response after completion of NST using easily computed PET-parameters. An SUVmax cut-off of 4.89 measured on the most FDG-avid axillary lymph node in breast cancer patients with the HER2-positive or TN subtype achieved fair diagnostic accuracy, with an AUC of 0.82, in predicting axillary response following NST. Specifically, an SUVmax lower than 4.89 measured on the most FDG-avid axillary lymph node in these patients is predictive of having no residual axillary disease following NST, with an NPV of 95%.

Molecular subtypes based on expression of receptors strongly influence prognosis and therapeutic approach [24, 25]. In line with the literature, axillary pCR occurred more frequently in HER2-positive and TN than in ER-positive/HER2-negative patients [26]. The rates of axillary pCR found in this study are in line with previously reported rates of axillary pCR in the HER2-positive and TN breast cancer subtypes [27,28,29,30,31]. The rate of axillary pCR in the ER-positive/HER2-negative subtype in our study is strikingly high compared to previous studies, which can possibly be explained by the small sample size of this study [10]. Since the HER2-positive and TN subtypes are more likely to achieve axillary pCR, prediction of axillary pCR seems clinically more relevant in these subtypes [4].

The association between breast cancer subtype and 18F-FDG uptake has been extensively investigated. Similar to many previous studies, we report a clear trend with the highest 18F-FDG uptake in the TN subtype, followed by HER2-positive and ultimately ER-positive/HER2-negative [13, 14, 16, 32,33,34]. Not all PET-parameters differed significantly between subtypes, possibly owing to the small sample size of this study. Another explanation can be that the SUVmean provides a better representation of the heterogeneity in the primary tumor compared to SUVmax or SUVpeak, which is clearly shown by the smaller ranges reported for SUVmean. The wide and mostly overlapping ranges between breast cancer subtypes can be explained by the fact that the molecular subtypes do not perfectly represent the true diversity and metabolic heterogeneity of breast cancer [35, 36]. However, especially studies with larger sample sizes do show a clear effect of the expression of ER and HER2 on 18F-FDG uptake [16, 33, 34, 37,38,39]. This study adds to the accumulating evidence regarding the differences in 18F-FDG uptake between subtypes indicating that future research in this field should always take breast cancer subtype into account.

Baseline differences in PET-parameters between response groups measured on axillary lymph node metastases have been investigated before. Keam et al. did not find a difference in baseline SUVmax between clinically node-positive patients with axillary pCR and residual axillary disease [40]. Similar to the results reported in this paper, Rousseau et al. did find that the SUVmax was lower in patients who developed axillary pCR following NST [9]. Akdeniz et al. investigated differences in baseline SUVmax on axillary lymph node metastases in breast cancer subtypes, but did not report any significant differences [41]. With regard to HER2-positive and TN breast cancer, we do report statistically significant differences in various PET-parameters between patients with axillary pCR and residual axillary disease. The parameters SUVmax and SUVmean measured on the most FDG-avid axillary lymph node are consistently lower in HER2-positive and TN patients who develop axillary pCR following NST. Accordingly, these parameters can be investigated for their added value to predict axillary response following NST in clinical practice.

Despite studies reporting on baseline differences in SUVmax between axillary pCR and residual axillary disease, its value in predicting axillary response to NST has not previously been established [8, 41, 42]. A possible explanation for this low diagnostic accuracy is that previous studies did not focus on subtypes when predicting axillary response with baseline 18F-FDG PET/CT. We found that focusing on the HER2-positive and TN subtypes could increase the AUC to 0.82 at an SUVmax cut-off of 4.89 [8, 42]. Further focusing on the subgroup of clinically node-positive HER2-positive and TN breast cancer patients, the AUC decreased slightly to 0.74 at an SUVmax cut-off of 3.77 measured on the most FDG-avid axillary lymph node. An SUVmax lower than 3.77 was able to reliably exclude axillary residual disease with an NPV of 92.3%. While the AUCs for SUVmean were consistently higher than those found for SUVmax, computing the SUVmean is prone to more inter- and intraobserver variability and is therefore less suited to daily clinical practice [43].

The value of sequential 18F-FDG PET/CT for the early prediction of axillary response to NST has also been previously explored. Three studies reported a significant increase in diagnostic performance when the percentage decrease in uptake after the first cycle of NST was used to distinguish between axillary response groups [8, 9, 42]. Interestingly, two studies reported an increase in performance when focusing on specific breast cancer subtypes [8, 42]. Contrary to our results, Wu et al. reported improved predictive value when excluding ER-negative/HER2-positive patients. A possible explanation could be that ER-positive/HER2-negative patients were underrepresented in the final analysis, since patients with a baseline SUVmax of the most FDG-avid axillary lymph node < 2.5 were excluded from further analysis [42]. Nevertheless, these previous results as well as the findings of our research indicate that breast cancer subtype should be taken into account when using 18F-FDG uptake for axillary response prediction.

Large early trials comparing NST with adjuvant systemic therapy (AST) in breast cancer found no significant difference in DFS or OS, permitting the use of NST for its advantages in allowing less invasive surgery [44,45,46,47]. However, these early trials were not specifically aimed at molecular subtypes. A recent systematic review of 9 studies including 36,480 TN breast cancer patients showed that developing a pCR provides a significant advantage in OS and DFS, with hazard ratios of 0.53 (0.29–0.98) and 0.52 (0.29–0.94), respectively, compared to AST in this subtype [48]. Conversely, having residual disease in the TN subtype worsens OS and DFS, with hazard ratios of 1.19 (1.09–1.28) and 2.36 (1.42–3.89), respectively, suggesting that these patients would have benefited from primary surgery followed by AST [48]. Accordingly, TN breast cancer patients who are unlikely to respond to NST could benefit from earlier tumor debulking, with a decreased opportunity for systemic tumor seeding and micrometastases [49]. These data suggest that in the TN subtype predicting response to NST could provide valuable information for selecting patients more suited for primary surgery followed by AST, potentially based on 18F-FDG PET/CT findings.

Besides identification of residual axillary disease following NST, prediction of axillary pCR is equally clinically relevant. To date, noninvasive imaging techniques remain too inaccurate to reliably detect which patients have developed an axillary pCR following NST [50]. Meanwhile, less-invasive axillary surgical staging techniques such as SLNB and MARI, performed separately or combined in the TAD- or RISAS-procedures, are being investigated and are gaining support as a means to omit further axillary treatment [19, 51, 52]. However, less-invasive axillary surgery could be harmful in patients with residual axillary disease, who have higher chances of metastatic dissemination. Therefore, it is paramount to investigate noninvasive imaging modalities that can reliably predict or detect axillary response and thus select patients for less-invasive axillary surgery.

This study has several limitations. First, due to the small sample size per subtype, HER2-positive and TN breast cancer patients were analyzed together. Preferably, analyses are performed in large numbers of patients per subtype. Moreover, because of the small sample size, we did not perform a logistic regression analysis to identify confounding factors associated with subtype as well as with pathologic response to NST. Second, patients were included over a long period of time during which the neoadjuvant regimens changed, thus influencing the rate of axillary pCR, especially in the HER2-positive subtype. Third, the single-center, single-vendor design might limit the external validity of this research, since the use of different PET/CT systems or settings might influence the absolute values of PET-parameters. However, using the NT-ratio could possibly overcome this limitation, since as a ratio it does not depend on the absolute values of the individual PET-parameters.

A focus of future research could be on identifying breast cancer subgroups in which response prediction can be reliably performed. Additionally, the emerging modality 18F-FDG PET/MRI could further improve the diagnostic performance of noninvasive imaging in predicting or detecting axillary response to NST in breast cancer. Recent studies have shown promising results of 18F-FDG PET/MRI in breast cancer and have suggested it could potentially function as a one-stop-shop solution for patients in need of locoregional as well as distant staging [53,54,55]. Combining morphologic MRI parameters with metabolic PET-parameters of sequential 18F-FDG PET/MRI has shown promising results in predicting primary tumor response in breast cancer in two previous studies [56, 57]. Lastly, major advances in artificial intelligence could further increase the efficiency and accuracy of the prediction and detection of nodal response with imaging [58].

Conclusions

To conclude, this study was the first to demonstrate that predicting axillary response to NST with baseline 18F-FDG PET/CT can be performed when focusing on breast cancer subtype. The parameters SUVmax and SUVmean can predict axillary response in HER2-positive and TN breast cancer patients with fair diagnostic accuracy in the entire cohort as well as in clinically node-positive patients. Baseline 18F-FDG PET/CT can be valuable in selecting patients more suited for either primary surgery followed by AST or for NST prior to surgery.