Introduction

Neoadjuvant chemotherapy (NAC) has been accepted as one of the standard treatments for operable breast cancer. The prognosis of patients treated with NAC is at least equivalent to that of patients treated with postoperative adjuvant chemotherapy; NAC improves surgical options through tumor shrinkage and is useful for testing the treatment response [1, 2]. Patients with a pathologic complete response (pCR) have a better prognosis than patients who do not achieve a pCR [1, 2]. However, because several definitions of pCR have been used, the term has not been applied in a consistent manner [3, 4]. According to some definitions, the presence of an intraductal component is negligible, or invasive residual disease is acceptable if minimal, while others require that there be no histologic evidence of residual cancer cells in the breast and axillary lymph nodes (LNs) [1, 3–7]. Against this background, in 2012 the FDA proposed the use of ypT0/isypN0 as an endpoint to support approval under the accelerated approval regulations [8].

According to the histological response criteria of the Japanese Breast Cancer Society (JBCS), pathologic response is categorized into 6 grades (Grade 0, 1a, 1b, 2a, 2b, 3) based on histological change in the invasive area [3, 9]. In the past decade, the Japan Breast Cancer Research Group (JBCRG) has conducted three prospective phase II studies of NAC, JBCRG-01, JBCRG-02 and JBCRG-03, which examined sequential combinations of fluorouracil, epirubicin and cyclophosphamide (FEC), and docetaxel [10–12]. In these studies, the invasive component, intraductal component, and LN metastasis were individually evaluated, so several definitions of pCR could be applied to the same patient. The present study was a pooled analysis of these JBCRG studies performed to compare the prognostic significance of several different definitions of pCR.

Patients and methods

JBCRG studies of NAC

Details of the JBCRG-01, JBCRG-02, and JBCRG-03 studies have been described previously [10–12]. In brief, the three studies had comparable main eligibility criteria. The diagnosis of invasive breast cancer was histologically confirmed in all patients by core biopsy. Female patients needed to have a measurable breast tumor of at least 1 cm in diameter. Patients with locally advanced or inflammatory breast cancer were not eligible. In JBCRG-01, 4 cycles of FEC (fluorouracil 500 mg/m², epirubicin 100 mg/m², and cyclophosphamide 500 mg/m²) q3w followed by 4 cycles of docetaxel (DOC; 75 mg/m²) q3w were administered prior to surgery, and the dose of DOC was increased to 100 mg/m² in JBCRG-02 [10, 11]. In JBCRG-03, FEC and DOC were administered in reverse order from JBCRG-01 [12]. Patients with hormone receptor (HR)-positive tumors were encouraged to receive adjuvant endocrine treatment for at least 5 years, and adjuvant radiation therapy was recommended for patients who underwent breast-conserving surgery. No patients received trastuzumab as a part of NAC; however, after the approval of adjuvant use of trastuzumab in 2008, patients could receive trastuzumab for 1 year, if indicated. All studies were approved by the relevant ethics committees, and all patients provided written informed consent for study participation and data collection. All studies were registered with UMIN (JBCRG-01, C000000011; JBCRG-02, C000000020, C000000320; JBCRG-03, C000000291).

Patients

For this pooled analysis, individual patient data regarding baseline characteristics, histopathological results at diagnosis and surgery, and follow-up were extracted from the original databases. Only patients who received at least one cycle of systemic chemotherapy were included. Patients were excluded because of missing data for ER, PgR, Her2, or surgery, or because of ineligibility or withdrawal of consent. Of the 389 patients enrolled in JBCRG-01, JBCRG-02, and JBCRG-03, 353 were included in the present study. Detailed patient characteristics have been summarized in previous articles [13, 14]. In brief, 200 patients received adjuvant endocrine therapy according to protocol and practice guidelines, and after the approval of trastuzumab for adjuvant use, 17 patients received postoperative trastuzumab for 1 year. Ki-67 was not available for the majority of patients, and nuclear grade was not assessed in 106 patients (30.0 %).

Assessment of response

Clinical tumor assessments were performed at each institute within 4 weeks before initiation of NAC, after completion of the first 4 cycles of chemotherapy, and before surgery according to the modified Response Evaluation Criteria in Solid Tumors (RECIST) guidelines. Clinical examinations were based on palpable changes in tumor size in combination with mammography, ultrasonography, computed tomography (CT), and magnetic resonance imaging (MRI).

Pathologic response was independently evaluated by a blinded central review committee according to the JBCS criteria [3, 9]. For the assessment of pCR, multiple tumor sections were examined, and cytokeratin immunostaining was performed to confirm the presence of residual cancer cells (RDs), if required. pCR was defined as follows: quasi pCR (QpCR), no invasive RD or only a few remaining invasive RDs in the breast, with noninvasive RDs and infiltrated LNs allowed; comprehensive pCR (CpCR), no invasive RD in the breast, with noninvasive breast RDs and infiltrated LNs allowed, i.e., ypT0/is or Grade 3; CpCRbn, no invasive RD in the breast and LNs, with noninvasive breast RDs allowed, i.e., ypT0/isypN0; strict pCR (SpCR), no invasive or noninvasive RD in the breast, i.e., ypT0; and SpCRbn, no invasive or noninvasive RD in the breast and LNs, i.e., ypT0ypN0. Furthermore, we defined three categories of RD as follows: pCRinv, only noninvasive RDs in the breast, i.e., ypTis; Grade 2b, marked changes approaching a complete response with only a few RDs in the breast; and Grade 0–2a, no or slight response, or marked changes in two-thirds or more of tumor cells with apparent RDs in the breast.
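The five definitions can be expressed as simple rules over the three components that were evaluated individually. The following is a minimal sketch, not the study's actual software: the `Residual` record and its field encoding ("none", "few", "apparent" for invasive disease in the breast) are hypothetical names introduced here for illustration, and QpCR is read as "no or only a few invasive RDs in the breast", which is an interpretation of the wording above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Residual:
    """Hypothetical per-patient record of residual disease after NAC."""
    invasive: str          # invasive RD in the breast: "none", "few", or "apparent"
    noninvasive: bool      # noninvasive (intraductal) RD present in the breast
    nodes_positive: bool   # residual cancer cells in axillary LNs


def pcr_flags(r: Residual) -> dict:
    """Return which of the five pCR definitions a patient satisfies."""
    if r.invasive not in ("none", "few", "apparent"):
        raise ValueError(f"unknown invasive category: {r.invasive!r}")
    no_invasive = r.invasive == "none"
    return {
        # QpCR tolerates a few invasive cells, noninvasive RD, and positive LNs.
        "QpCR": r.invasive in ("none", "few"),
        # CpCR: ypT0/is, nodal status ignored.
        "CpCR": no_invasive,
        # CpCRbn: ypT0/is ypN0.
        "CpCRbn": no_invasive and not r.nodes_positive,
        # SpCR: ypT0, nodal status ignored.
        "SpCR": no_invasive and not r.noninvasive,
        # SpCRbn: ypT0 ypN0.
        "SpCRbn": no_invasive and not r.noninvasive and not r.nodes_positive,
    }
```

For example, a patient with only intraductal residual disease and negative nodes (ypTis ypN0) would satisfy QpCR, CpCR, and CpCRbn but neither SpCR nor SpCRbn under this encoding.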

Assessment of HR and Her2

Estrogen receptor (ER) status and progesterone receptor (PgR) status were determined by immunohistochemistry at each institute, and in general, tumors with >10 % positively stained tumor cells were classified as positive for ER and PgR. Her2 status was also determined at each institute by immunohistochemistry or by fluorescence in situ hybridization (FISH) analysis. Her2-positive tumors were defined as 3+ on immunohistochemistry or as positive by FISH. Subtypes were classified into luminal tumors (ER-positive and/or PgR-positive, Her2-negative), luminal/Her2-positive tumors (ER-positive and/or PgR-positive, Her2-positive), Her2-positive tumors (ER-negative, PgR-negative, Her2-positive), and triple-negative (TN) tumors (ER-negative, PgR-negative, Her2-negative).
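The subtype rules above amount to a small decision table. The sketch below is illustrative only; the function names are hypothetical, and the >10 % cutoff is applied strictly even though the text describes it as the general practice at each institute rather than a fixed rule.

```python
from typing import Optional


def receptor_is_positive(percent_stained: float) -> bool:
    """ER or PgR positivity: more than 10 % positively stained tumor cells."""
    return percent_stained > 10.0


def her2_is_positive(ihc_score: Optional[int] = None,
                     fish_positive: Optional[bool] = None) -> bool:
    """Her2 positivity: 3+ on immunohistochemistry or positive by FISH."""
    return ihc_score == 3 or fish_positive is True


def classify_subtype(er_positive: bool, pgr_positive: bool,
                     her2_positive: bool) -> str:
    """Map receptor status to the four subtypes used in this analysis."""
    if er_positive or pgr_positive:
        return "luminal/Her2-positive" if her2_positive else "luminal"
    return "Her2-positive" if her2_positive else "triple-negative"
```

Under these rules, a tumor that is ER-negative, PgR-positive, and Her2 3+ is classified as luminal/Her2-positive, because positivity of either hormone receptor is sufficient for the luminal categories.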

Statistical analysis

Comparisons between groups were made with the Chi-square test or Fisher’s exact test for proportions and the Wilcoxon test for continuous variables. The Kaplan–Meier method was used to estimate disease-free survival (DFS) and overall survival (OS) from the date of initiation of NAC to the date of last follow-up, recurrence, secondary cancer, contralateral breast cancer, or death. Survival curves were compared using the log-rank test. Hazard ratios (HRs), 95 % confidence intervals (CIs), and corresponding p values were calculated using the Cox proportional hazards model. In multivariate analysis, variables were chosen on the basis of goodness of fit. Statistical analyses were performed with JMP (version 10, SAS Institute Inc.), and p < 0.05 was considered statistically significant.
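For readers unfamiliar with the Kaplan–Meier method, the following is a minimal pure-Python sketch of the product-limit estimator. It is not the JMP procedure used in the study, and the follow-up times and event indicators are made-up illustrative values, not study data.

```python
def kaplan_meier(times, events):
    """Product-limit estimate of the survival function.

    times:  follow-up in days from initiation of NAC
    events: True if the event (e.g. recurrence or death) occurred at that
            time, False if the patient was censored at last follow-up
    Returns a list of (time, survival probability) at each distinct event time.
    """
    if len(times) != len(events):
        raise ValueError("times and events must have the same length")
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_events = 0
        n_leaving = 0
        # Group all patients sharing the same follow-up time.
        while i < len(data) and data[i][0] == t:
            n_events += 1 if data[i][1] else 0
            n_leaving += 1
            i += 1
        if n_events:
            survival *= 1.0 - n_events / at_risk
            curve.append((t, survival))
        # Both events and censored patients leave the risk set after time t.
        at_risk -= n_leaving
    return curve


# Illustrative toy data only (hypothetical, not from the JBCRG studies).
toy_times = [100, 200, 200, 300, 400]
toy_events = [True, False, True, False, True]
toy_curve = kaplan_meier(toy_times, toy_events)
```

Censored patients contribute to the number at risk up to their last follow-up but never reduce the survival estimate, which is why the curve steps down only at event times.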

Results

The rates of QpCR, CpCR, CpCRbn, SpCR, and SpCRbn were 27.8, 20.4, 18.4, 9.9, and 8.2 %, respectively (Table 1). Luminal/Her2-positive, Her2-positive and TN tumors showed significantly higher pCR rates than luminal tumors (p < 0.001) irrespective of the definition of pCR. Nuclear grade, nodal status, and clinical response were also associated with pCRs (p < 0.05), although there was no significant association between QpCR and clinical response before surgery (p = 0.06).

Table 1 Patient characteristics and corresponding pCR rates

With a median follow-up of 2,274 days, patients who achieved pCR had significantly improved DFS compared with patients without pCR when QpCR, CpCR, or CpCRbn was used, whereas DFS did not differ significantly by pCR status when SpCR or SpCRbn was used (QpCR, log-rank, p < 0.001, HR = 0.28, p < 0.001; CpCR, log-rank, p = 0.024, HR = 0.44, p = 0.014; CpCRbn, log-rank, p = 0.011, HR = 0.36, p = 0.005; SpCR, log-rank, p = 0.548, HR = 0.77, p = 0.535; SpCRbn, log-rank, p = 0.305, HR = 0.59, p = 0.272) (Fig. 1). For OS, similar results were observed (QpCR, log-rank, p = 0.002, HR = 0.14, p < 0.001; CpCR, log-rank, p = 0.024, HR = 0.22, p = 0.010; CpCRbn, log-rank, p = 0.014, HR = 0.12, p = 0.003; SpCR, log-rank, p = 0.371, HR = 0.53, p = 0.332; SpCRbn, log-rank, p = 0.222, HR = 0.31, p = 0.160). A Cox proportional hazards model that included pCR, study, age, tumor size, nuclear grade, nodal status, subtype, and clinical response found that the prognostic significance of nodal status (n+ vs n0) and subtype (TN vs luminal) was consistent irrespective of the definition of pCR for DFS and OS (p < 0.01) (Table 2). The HRs for each pCR definition were lower than 1; however, they were significant for DFS and OS only when QpCR, CpCR, or CpCRbn was used as the definition of pCR (DFS: QpCR, p < 0.01; CpCR, p < 0.05; CpCRbn, p < 0.05; OS: QpCR, p < 0.01; CpCR, p < 0.05; CpCRbn, p < 0.05). Tumor size was a significant prognostic variable for OS when CpCR, CpCRbn, SpCR, or SpCRbn was used as the definition of pCR (CpCR, p < 0.05; CpCRbn, p < 0.05; SpCR, p < 0.05; SpCRbn, p < 0.05).

Fig. 1 Association between various definitions of pathologic complete response and survival

Table 2 Prognostic impact of pCR on survival (Cox proportional hazards model)

As shown in Table 3, the rates of SpCR, pCRinv, Grade 2b, and Grade 0–2a were 9.9, 10.5, 7.4, and 72.2 %, respectively, and univariate analysis showed a significant association between pathologic response and nuclear grade, nodal status, subtype, and clinical response before surgery (nuclear grade, p = 0.028; nodal status, p < 0.001; subtype, p < 0.001; clinical response before surgery, p = 0.028). Patients who achieved Grade 3 or Grade 2b experienced longer DFS and OS than those with Grade 0–2a (DFS: log-rank, p < 0.001; Grade 3, HR = 0.39, p = 0.005; Grade 2b, HR = 0.16, p < 0.001; OS: log-rank, p = 0.007; Grade 3, HR = 0.20, p = 0.005; Grade 2b, HR = 0.15, p = 0.006) (Fig. 2). A Cox proportional hazards model found that pathologic response (Grade 3, Grade 2b vs Grade 0–2a), nodal status (n+ vs n0), and subtype (TN vs luminal) were significant prognostic variables for DFS and OS (DFS: Grade 3, HR = 0.5, p < 0.001; Grade 2b, HR = 0.19, p < 0.001; n+, HR = 2.33, p < 0.001; TN, HR = 3.19, p < 0.001; OS: Grade 3, HR = 0.15, p < 0.001; Grade 2b, HR = 0.15, p < 0.001; n+, HR = 3.06, p < 0.001; TN, HR = 4.80, p < 0.001) (Table 4).

Table 3 Association between patient characteristics and pathologic response defined by the classification of the Japanese Breast Cancer Society

Fig. 2 Survival according to pathologic response defined by the classification of the Japanese Breast Cancer Society

Table 4 Prognostic impact of Grade 3 and Grade 2b on survival (Cox proportional hazards model)

When the first sites of recurrence were analyzed according to pCR and subtype, neither bone nor brain was the first site of recurrence in patients with pCR, irrespective of the definition of pCR (Table 5). In patients who achieved Grade 2b, no recurrence was observed. On the other hand, bone was not the first site of recurrence in patients with luminal/Her2-positive and Her2-positive tumors, and soft tissue recurrence was not observed in patients with luminal/Her2-positive tumors. Her2-positive or TN tumors tended to recur in soft tissue more frequently than the other subtypes, and luminal tumors had a lower rate of recurrence in brain. Viscera were the most common sites of first recurrence independent of the definition of pCR and subtype.

Table 5 First site of recurrence in terms of pCR and subtype

Discussion

To the best of our knowledge, the present study is the largest individual patient-based pooled analysis of the different definitions of pCR in breast cancer patients who were enrolled in prospective studies of neoadjuvant anthracycline–taxane-based chemotherapy in Japan. We first compared 5 definitions of pCR: QpCR, CpCR, CpCRbn, SpCR, and SpCRbn. By definition, SpCR is the most stringent criterion for response in the breast and SpCRbn represents the most complete response to NAC, so the order of pCR rates is theoretically as follows: QpCR ≥ CpCR ≥ SpCR, CpCR ≥ CpCRbn, SpCR ≥ SpCRbn, CpCRbn ≥ SpCRbn [3, 4]. In agreement with this, the order of pCR rates was QpCR > CpCR > CpCRbn > SpCR > SpCRbn in the present study. Similarly, in a meta-analysis of 12 neoadjuvant randomized trials conducted by the Collaborative Trials in Neoadjuvant Breast Cancer (CTNeoBC) (n = 13,125), pCR rates of CpCR, CpCRbn, and SpCRbn were 22, 18, and 13 %, respectively [15]. In addition, in the study by von Minckwitz et al. [6], the rates of ypT0/is/micypN0/+, CpCR, CpCRbn, and SpCRbn were 30.2, 22.8, 19.8, and 15.0 %, respectively. Thus, pCR rates can vary according to the definition, and this non-equivalency in the definition of pCR could be problematic when reviewing the results of NAC for approval under the accelerated approval regulations [8]. In this respect, the CTNeoBC has recommended SpCRbn or CpCRbn as the definition of pCR in consideration of consistency, while von Minckwitz et al. have concluded that SpCRbn could best discriminate between patients with favorable and unfavorable outcomes [6, 15]. Unfortunately, as these meta-analyses included studies performed in Europe and the United States, it remains uncertain whether these recommendations are applicable in Japan.
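The theoretical ordering follows from the fact that each stricter definition describes a subset of the patients meeting a looser one. As a small illustrative check (a sketch only; the pair list below is our own encoding of the inequalities stated above), the pCR rates reported in the Results can be tested against those inequalities:

```python
# pCR rates (%) reported in the Results of the present study.
rates = {"QpCR": 27.8, "CpCR": 20.4, "CpCRbn": 18.4, "SpCR": 9.9, "SpCRbn": 8.2}

# Each pair (looser, stricter) means every patient meeting the stricter
# definition also meets the looser one, so rate(looser) >= rate(stricter).
implied_order = [
    ("QpCR", "CpCR"),
    ("CpCR", "SpCR"),
    ("CpCR", "CpCRbn"),
    ("SpCR", "SpCRbn"),
    ("CpCRbn", "SpCRbn"),
]

violations = [(a, b) for a, b in implied_order if rates[a] < rates[b]]
print("violations:", violations)
```

Note that CpCRbn and SpCR are not ordered by definition: one relaxes the breast criterion and the other relaxes the nodal criterion. The observation that CpCRbn exceeded SpCR in this study is therefore an empirical finding rather than a logical necessity.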

The present study found prognostic significance of CpCRbn in addition to QpCR and CpCR, whereas SpCR and SpCRbn were not significantly associated with prognosis. Thus, CpCRbn is considered to be the preferable definition of pCR. As for the prognostic significance of SpCR and SpCRbn, our results seem to contradict the previous findings described above [6, 15], and the prognostic significance of tumor size appears to be dependent on the definition of pCR. This observation might be attributable to a much lower number of patients with SpCR or SpCRbn than patients with QpCR, CpCR, or CpCRbn and a limited number of events, resulting in much lower statistical power to show prognostic significance in the present study. Less intensive NAC might not be the cause of the lower SpCR or SpCRbn rates, as every patient received an anthracycline-containing regimen and docetaxel with acceptable compliance in this pooled analysis [10–12].

On the other hand, the prognostic significance of nodal status and subtype was consistent regardless of the definition of pCR. As for nodal status, this observation is consistent with other studies [6, 16, 17]. For example, the study by Bear et al. [16] demonstrated that pathologic nodal status was a strong predictor of survival irrespective of pathologic response in the breast. As for subtype, the potential limitations of this study should be addressed; i.e., we could not divide the luminal subtype into luminal A and luminal B/Her2-negative subtypes, and the sample size of patients with TN or Her2-positive tumors was small. Nevertheless, it is noteworthy that patients with TN tumors could achieve pCR, but TN tumors were associated with poor prognosis as compared to luminal tumors in the present study. This observation is in line with the study demonstrating that patients with TN tumors have increased pCR rates as compared to patients with non-TN tumors, and patients with pCR have excellent and comparable survival, while those with invasive RD have significantly worse survival if they have TN tumors versus non-TN tumors [17]. Thus, the current issue regarding TN tumors appears to be that high pCR rates obtained in patients with TN tumors do not necessarily have a meaningful effect on prognosis of the entire group of patients with TN tumors [17]. It is also interesting to note that the pCR rate was high in patients with Her2-positive or luminal/Her2-positive tumors as compared to patients with luminal tumors, but Her2 positivity had no prognostic significance in the present study. We did not use trastuzumab as a part of NAC, and only 23 % of patients with luminal/Her2-positive or Her2-positive tumors received postoperative trastuzumab, as reported previously [14]. As trastuzumab is now used routinely, however, it is possible that the prognostic gap between luminal/Her2-positive or Her2-positive tumors and luminal tumors could be wider today. In fact, several studies have demonstrated a substantial effect on achieving pCR through inclusion of Her2-directed therapy with NAC as well as improvement of prognosis through adjuvant use of Her2-directed therapy [18, 19].

We also found that patients who achieved Grade 3 or Grade 2b had a more favorable prognosis than patients who did not. In this respect, it should be noted that invasive RD after NAC includes a broad range of actual responses from near pCR to frank resistance, and Grade 2b differs in the extent of RD from the categories used in other studies that grouped focal RD with pCR [6, 14, 20, 21]. Grade 2b was strictly defined as only a few remaining isolated cancer cells, while the other studies considered up to 5 mm of RD as focal and found that focal invasive RD, ypTis, and ypN+ were associated with increased relapse risk [6]. In association with this, it is interesting to note that Symmans et al. [7] found that patients with minimal RD, i.e., residual cancer burden (RCB)-I, had the same 5-year prognosis as patients with no RD. In that study, pathologic responses were subdivided into RCB-0 (ypstage0, no RD), RCB-I, RCB-II (moderate RD), and RCB-III (extensive RD) by calculating RCB as a continuous variable from the primary tumor dimensions, cellularity of the tumor bed, and the number and size of nodal metastases. The inclusion of RCB-I or Grade 2b would expand the subset that could be identified as having benefited from NAC, and further study should clarify the biology of the remaining cancer cells observed in RCB-I or Grade 2b.

Furthermore, we found some association between the first site of recurrence and pCR or subtype. In particular, neither bone nor brain was the first site of recurrence in patients with pCR, irrespective of the definition of pCR, and bone was not the first site of recurrence in patients with luminal/Her2-positive or Her2-positive tumors. As for soft tissue recurrence, the results of the present study are consistent with the study by Caudle et al. [22] demonstrating that Her2-positive and TN tumors were associated with higher rates of locoregional recurrence. Similarly, Liedtke et al. [17] reported that TN tumors had higher rates of recurrence in viscera and soft tissue and lower rates in bone. As for brain metastasis, Shimizu et al. [23] found that the brain was not the first site of recurrence in patients with luminal/Her2-positive or Her2-positive tumors who were not treated with trastuzumab, while it was the most common site of first metastasis in patients treated with trastuzumab as a part of NAC. Taken together, the first site of recurrence could vary according to pathologic response, subtype, and treatment. So far, limited data are available for the first site of recurrence after NAC, and whether intensive follow-up could improve survival has not yet been demonstrated. Further studies should examine the utility of individualized surveillance based on pathologic response, subtype, and treatment.

In conclusion, the prognostic significance of pCR as well as its rate varied according to the definition of pCR. Subtype and nodal status were prognostic variables independent of the definition of pCR. This study underscores the need for standardization of the definition of pCR and provides supporting evidence for the CTNeoBC recommendations.