Background

Recently, several trials have explored the safety of omitting axillary lymph node (ALN) dissection (ALND) and administering appropriate adjuvant therapy in patients with early breast cancer having limited axillary involvement [1,2,3]. However, precise information on the total number of involved lymph nodes cannot be obtained if ALND is omitted in patients with metastatic sentinel lymph nodes (SLNs). In previous studies, 13–33% of patients who had undergone ALND for SLN metastases had non-SLN metastases [1,2,3], and 13.7% of patients had ≥ 4 lymph node metastases [3]. When determining systemic therapy and regional irradiation strategies for patients with breast cancer, it is important to distinguish between non-advanced ALN status (i.e., 0–3 positive ALNs) and advanced ALN metastasis (ALNM) (i.e., ≥ 4 positive ALNs) [4]. Performing ALND in patients with non-advanced ALNM may lead to overtreatment, whereas omitting ALND in patients with advanced ALNM may lead to undertreatment. Therefore, when ALND is to be omitted in patients with metastatic SLNs, it is important to accurately estimate whether a patient has non-advanced or advanced ALNM. Several nomograms reportedly predict the risk of advanced ALNM in patients with metastatic SLNs. However, most of them use data obtained from the surgical specimen, such as pathological tumor size or lymphovascular invasion, and are therefore difficult to use intraoperatively [5,6,7,8,9]. Other nomograms use the total tumor load of SLNs evaluated by one-step nucleic acid amplification as a predictor [10, 11], but this method is not yet commonly used in clinical practice. Therefore, a model is required that can accurately predict the likelihood of advanced ALNM in a patient with metastatic SLNs using only easily available preoperative and intraoperative data.

The number of suspicious nodes on axillary ultrasound (US) is related to the number of positive ALNs [12,13,14]. Invasive lobular histology is also related to advanced ALNM [7, 12, 15]. Although these factors may improve the prediction of ALN status, no model has been developed to predict advanced ALNM by combining preoperative data of axillary US imaging and histology with intraoperative data of SLNs.

We previously developed an easy-to-use scoring system that uses only preoperatively available data and can distinguish between advanced and non-advanced ALNM with a high degree of accuracy in patients with cT1-T3cN0-1 breast cancer [12]. However, some patients whose data were used to develop the previous predictive model had undergone ALND without SLN biopsy (SLNB) or had not undergone ALND when one to two SLNs were positive. In particular, we might have underestimated the total number of positive nodes because not all patients had undergone ALND. Additionally, the previous predictive model was not specific to clinically node-negative patients. Therefore, we conducted a new study to develop and evaluate a scoring model that can differentiate non-advanced and advanced ALNM cases using a combination of available preoperative and intraoperative data in clinically node-negative patients with metastatic SLNs who had undergone ALND.

Methods

The dataset comprised consecutive patients who had operable primary breast cancer (clinical TNM stage, T1-T3, and N0 as per the 8th edition of the American Joint Committee on Cancer [AJCC] Cancer Staging Manual [16]) in our institutional database from January 2010 to October 2021. We excluded patients who had locally advanced disease (T4 or N2-3) and those who had received neoadjuvant chemotherapy (NAC). We also excluded patients who had not undergone SLNB and those with negative SLNs. Patients with positive SLNs who had not undergone ALND were also excluded. We identified 804 patients, who were randomly divided into the training and the validation cohorts in a ratio of 5:3 using the statistical software STATA SE version 13 (Stata Corp., College Station, TX, USA). The training and validation cohorts were used to develop the scoring system and for validation, respectively. The detailed exclusion criteria are presented in Fig. 1.
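
The random allocation described above can be sketched as follows. This is a minimal Python illustration, not the STATA procedure used in the study: the function name `split_cohort`, the seed, and the integer patient identifiers are assumptions made for the example. The training-cohort size of 501 (leaving 303 for validation) is taken from the cohort sizes reported in the Discussion.

```python
import random


def split_cohort(patient_ids, n_train, seed=0):
    """Randomly split patient IDs into training and validation lists.

    Hypothetical helper: the seed and the ID scheme are illustrative only.
    """
    ids = list(patient_ids)
    rng = random.Random(seed)  # local generator so the split is reproducible
    rng.shuffle(ids)
    return ids[:n_train], ids[n_train:]


# 804 eligible patients; 501 training and 303 validation (roughly 5:3).
training, validation = split_cohort(range(1, 805), n_train=501)
print(len(training), len(validation))  # 501 303
```

Fixing the seed makes the allocation reproducible, which matters when the same split has to be reused for model development and later validation.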

Fig. 1 Flowchart of patient selection. NAC, neoadjuvant chemotherapy; SLNB, sentinel lymph node biopsy; SLN, sentinel lymph node; ALND, axillary lymph node dissection; AUC, area under the curve; CI, confidence interval

Data collection

From the patients’ medical records, we obtained data on age at diagnosis, clinical tumor size evaluated by US, total number of suspicious lymph nodes detected by axillary US, histologic tumor type, histologic tumor grade, estrogen receptor (ER) status, progesterone receptor (PR) status, human epidermal growth factor receptor 2 (HER2) status, Ki-67 level, and total number of metastatic lymph nodes in SLNB and ALND. ER and PR positivity were defined as immunohistochemical staining of > 1% of tumor cells. Hormone receptor (HR) positivity was defined as ER positivity and/or PR positivity. HER2 positivity was defined as a score of 3+ on immunohistochemistry or gene amplification on fluorescence in situ hybridization [16, 17]. Regarding the classification of Ki-67 levels, the panel of the St. Gallen International Consensus Guidelines for treatment of early breast cancer 2021 generally supported the recommendation that tumors with Ki-67 ≥ 30% receive chemotherapy, but a consistent Ki-67 threshold for recommending chemotherapy in ER-positive cases could not be defined within the range of 10–25% [18]. Considering this point, we classified the tumors according to the Ki-67 level into the following three groups: ≤ 10%, 10–30%, and ≥ 30%. Lymph nodes with diffuse cortical thickness > 5 mm, asymmetric cortical thickness > 3 mm, or complete or near-complete absence of the fatty hilum on axillary US were considered suspicious [19, 20]. Fine-needle aspiration cytology was performed for suspicious ALNs. The most suspicious node was sampled in patients with multiple suspicious lymph nodes, and the number of suspicious nodes was recorded.

Surgical procedure

SLNB was performed on all patients. SLNs were identified using lymphoscintigraphy (technetium-99m phytate) and/or indocyanine green. Completion level I–II ALND was performed in all patients regardless of the metastatic size of the SLNs.

Pathological evaluation

ALNs were evaluated using hematoxylin and eosin staining. Each ALN was classified as negative or positive for metastasis, and the number of metastatic ALNs was recorded. According to the current AJCC criteria [21], isolated tumor cells (deposit < 0.2 mm) were classified as negative (pN0(i+)), and micrometastatic deposits of 0.2–2.0 mm were classified as positive for metastasis (pN1mi). Pathologic nodal staging was determined by the number of metastatic ALNs as follows: pN0, none; pN1 (1–3 lymph node metastases), limited; and pN2-3 (> 3 lymph node metastases), advanced.
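
The nodal categories defined above amount to a simple mapping from the number of metastatic ALNs to a stage group. The following Python sketch illustrates that mapping; the function names are hypothetical, and the count passed in is assumed to already exclude nodes containing only isolated tumor cells, as described in the text.

```python
def classify_nodal_status(n_positive_nodes):
    """Map the number of metastatic ALNs to the categories used in the text.

    The count is assumed to exclude nodes with isolated tumor cells only.
    """
    if n_positive_nodes < 0:
        raise ValueError("node count cannot be negative")
    if n_positive_nodes == 0:
        return "pN0", "none"
    if n_positive_nodes <= 3:
        return "pN1", "limited"
    return "pN2-3", "advanced"


def is_advanced_alnm(n_positive_nodes):
    """Advanced ALNM corresponds to more than three positive nodes."""
    return classify_nodal_status(n_positive_nodes)[1] == "advanced"


for count in (0, 1, 3, 4, 12):
    print(count, classify_nodal_status(count), is_advanced_alnm(count))
```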

Statistical analysis

Tumor characteristics and axillary US findings of patients in the training and validation cohorts were compared using the Mann–Whitney U test or chi-square test, as appropriate. In the training cohort, the odds ratios (ORs) and 95% confidence intervals (CIs) for advanced ALNM were calculated using univariate logistic regression analysis. Baseline variables with P < 0.10 on univariate analysis were assessed for multicollinearity using the variance inflation factor (VIF). A VIF of > 10 indicated multicollinearity between the variables [22]. These variables were then included in the multivariate regression analysis, which was used to devise the scoring system. For each variable with P < 0.10 on multivariate analysis, the β-coefficient was rounded to the nearest integer and used as the score for that variable. The total score was the sum of the scores for each variable. Then, the discrimination and calibration of the scoring system were evaluated for distinguishing between advanced and non-advanced ALNM cases. Discrimination was evaluated using the receiver operating characteristic (ROC) curve and assessed by calculating the area under the curve (AUC) with a 95% CI. Calibration was evaluated by comparing the expected number of patients with advanced ALNM (as predicted by each total score) with the observed number of patients with advanced ALNM. The calibration of our scoring system was tested by performing the Hosmer–Lemeshow goodness-of-fit test on the training cohort [23]. To evaluate the generalizability of the scoring system, a 5-fold cross-validation (CV) was performed [24] using the training cohort dataset.
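
The conversion of β-coefficients into integer points and their summation into a total score can be illustrated with a short Python sketch. The coefficients and factor names below are hypothetical placeholders, not the fitted values (those are reported in Table 3). The handling of exact ties (x.5) is also an assumption, since the text states only that values were rounded to the nearest integer.

```python
import math

# Hypothetical beta-coefficients for illustration only; the fitted values
# are reported in the article's Table 3, which is not reproduced here.
EXAMPLE_BETAS = {
    "clinical_tumor_size_gt_4cm": 1.6,
    "multiple_suspicious_nodes_on_us": 2.2,
    "sln_macrometastasis": 1.4,
}


def beta_to_points(beta):
    """Round a beta-coefficient to the nearest integer (ties round up; assumed)."""
    return math.floor(beta + 0.5)


def total_score(present_factors, betas=EXAMPLE_BETAS):
    """Sum the integer points of every factor present in a patient."""
    return sum(beta_to_points(betas[name]) for name in present_factors)


points = {name: beta_to_points(beta) for name, beta in EXAMPLE_BETAS.items()}
print(points)
print(total_score(["clinical_tumor_size_gt_4cm", "sln_macrometastasis"]))
```

Rounding to integers trades a small loss of precision for a score that can be added up mentally in the operating room, which is the stated purpose of a scoring system over a nomogram.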

The diagnostic performance of the scoring system for predicting advanced ALNM was estimated in terms of its AUC, sensitivity, specificity, positive predictive value (PPV), and negative predictive value (NPV) at each cutoff value of the scoring system. The feasibility of the scoring system was evaluated with the validation cohort.
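
As an illustration of how these measures follow from a single cutoff, the Python sketch below tabulates true and false positives and negatives and derives the four quantities. It assumes that a total score above the cutoff counts as a positive prediction (the text does not define the direction of the cutoff explicitly), and the toy data are invented for the example.

```python
def diagnostic_metrics(scores, advanced, cutoff):
    """Sensitivity, specificity, PPV and NPV at one cutoff.

    Assumption: a total score above the cutoff counts as a positive test.
    """
    tp = fp = tn = fn = 0
    for score, is_advanced in zip(scores, advanced):
        predicted_advanced = score > cutoff
        if predicted_advanced and is_advanced:
            tp += 1
        elif predicted_advanced:
            fp += 1
        elif is_advanced:
            fn += 1
        else:
            tn += 1

    def ratio(numerator, denominator):
        return numerator / denominator if denominator else float("nan")

    return {
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "ppv": ratio(tp, tp + fp),
        "npv": ratio(tn, tn + fn),
    }


# Made-up toy data, not study data.
toy_scores = [1, 2, 3, 5, 6, 7, 4, 8]
toy_advanced = [False, False, False, False, True, True, True, True]
print(diagnostic_metrics(toy_scores, toy_advanced, cutoff=4))
```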

All statistical analyses were conducted using the statistical software STATA SE version 13 (Stata Corp). P < 0.05 was set as the threshold for significance.

Results

Factors associated with advanced ALNM in the training cohort

Patient demographics and tumor characteristics of both the training and validation cohorts are summarized in Table 1. Patient and tumor characteristics were well-balanced between the training and validation cohorts.

Table 1 Baseline patient characteristics

The factors associated with advanced ALNM are summarized in Table 2. Univariate analysis revealed that the significant factors associated with advanced ALNM were the clinical tumor size, histologic type, type of surgery, axillary US findings of suspicious lymph nodes, size of SLN metastasis, ratio of the number of positive SLNs to the total number of SLNs removed, and number of positive SLNs. Variables with P < 0.10 in univariate analysis were assessed for multicollinearity (Table S1). No variable with a VIF > 10 was found, indicating no collinearity between the variables. In the multivariate analysis, patients with advanced ALNM were more likely to have US findings of multiple suspicious lymph nodes, SLN macrometastasis, higher ratio of the number of metastatic SLNs to the total number of SLNs removed, and two or three metastatic SLNs. A clinical tumor size of 4–5 cm or > 5 cm and a histologic type of invasive lobular carcinoma were of borderline significance.

Table 2 Results of univariate and multivariate analyses of factors associated with advanced ALNM (pN2-N3) in the training cohort

Scoring system for distinguishing between non-advanced and advanced ALNM

Using the results of the multivariate analysis, a scoring system was devised to assess the likelihood of advanced ALNM (Table 3). The factors included were the clinical tumor size, axillary US findings, histologic type, size of SLN metastasis, ratio of the number of positive SLNs to the total number of SLNs removed, and number of positive SLNs. A score was calculated using the β-coefficient from the multivariate analysis, and the value rounded to the nearest integer was used as each score. The total score was derived from the sum of the scores for each variable. The percentage of patients with advanced ALNM increased significantly as the total score increased (Table 4).

Table 3 Scoring system based on multivariate analysis in the training cohort
Table 4 Distribution of advanced ALNM (pN2-N3) stratified by total score in the training and validation cohorts

Predictive accuracy of the scoring system in the training cohort

The ROC curve for the scoring system in the training cohort is shown in Fig. 2A. The AUC was 0.90, and the 5-fold CV showed a mean AUC of 0.90. The calibration plot of the observed frequency against the predicted probability of the scoring model showed a slope of 1.00 for the training cohort (Fig. 2C), and the Hosmer–Lemeshow test indicated goodness-of-fit for the model in the training cohort (P = 0.69). The predictive accuracy of the scoring system for differentiating between non-advanced and advanced ALNM cases at each cutoff value is presented in Table 5. The AUC value was highest when the cutoff value was a total score of 4. At this score, the NPV for excluding advanced ALNM was 96.8%, and the sensitivity, specificity, and PPV were 85.3%, 79.1%, and 41.8%, respectively.
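
For readers unfamiliar with the AUC of an integer score, it can be read as the probability that a randomly chosen patient with advanced ALNM has a higher total score than a randomly chosen patient without it, with ties counted as one half. The Python sketch below computes it by direct pairwise comparison. This is an illustration on invented data, not the STATA routine used in the study, and the function name is hypothetical.

```python
def auc_from_scores(scores, advanced):
    """AUC as the probability that a random advanced case outscores a
    random non-advanced case, counting ties as one half."""
    pos = [s for s, a in zip(scores, advanced) if a]
    neg = [s for s, a in zip(scores, advanced) if not a]
    if not pos or not neg:
        raise ValueError("both outcome classes are required")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Made-up toy data, not study data.
toy_scores = [1, 2, 3, 5, 6, 7, 4, 8]
toy_advanced = [False, False, False, False, True, True, True, True]
print(auc_from_scores(toy_scores, toy_advanced))  # 0.9375
```

The pairwise form is quadratic in the number of patients, which is acceptable for cohorts of a few hundred; rank-based formulas give the same value more efficiently.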

Fig. 2 ROC curves of the scoring system for differentiating between non-advanced and advanced ALNM in the training cohort (A) and the validation cohort (B), and calibration plots of the system for the training cohort (C) and the validation cohort (D). The Hosmer–Lemeshow test indicated goodness-of-fit for the model in the training (χ2 = 3.09, P = 0.69) and validation cohorts (χ2 = 9.44, P = 0.31). The calibration plot of the observed frequency compared to the predicted probability of the scoring model showed slopes of 1.000 for the training cohort (C) and 0.852 for the validation cohort (D). ROC, receiver operating characteristic; ALNM, axillary lymph node metastasis; AUC, area under the curve; CI, confidence interval

Table 5 Predictive ability of the scoring system to differentiate between non-advanced and advanced ALNM (pN2-N3) at each cutoff point in the training and validation cohorts

Performance of the scoring system for the validation cohort

The percentage of patients with advanced ALNM in the validation cohort also increased significantly as the total score increased (Table 4). The ROC curve for the scoring system in the validation cohort is shown in Fig. 2B. The AUC was 0.89, and the calibration plot of the observed frequency against the predicted probability of the scoring model showed a slope of 0.852 (Fig. 2D). The Hosmer–Lemeshow test indicated goodness-of-fit for the model in the validation cohort (P = 0.31). At a total score of 4 points, the AUC was 0.81 (95% CI, 0.76–0.87); the NPV for excluding advanced ALNM was 96.9%; and sensitivity, specificity, and PPV were 87.8%, 74.8%, and 40.2%, respectively (Table 5).
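
The AUC quoted at a single cutoff is lower than the AUC of the full score because dichotomizing the score leaves only one operating point on the ROC curve. Assuming that the per-cutoff AUC is the area under the ROC curve of the dichotomized score (the text does not state how it was computed), it equals the mean of sensitivity and specificity. The Python sketch below checks this against the validation-cohort values reported above; the function name is hypothetical.

```python
def single_cutoff_auc(sensitivity, specificity):
    """Area under the two-segment ROC curve through (1 - specificity, sensitivity)."""
    return (sensitivity + specificity) / 2


# Validation-cohort values reported at a total score of 4.
auc = single_cutoff_auc(0.878, 0.748)
print(round(auc, 2))  # 0.81
```

Under this assumption the result agrees with the reported value of 0.81.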

Discussion

In the present study, we developed a scoring system using available preoperative clinicopathological factors and the intraoperative SLN status to differentiate between non-advanced and advanced ALNM in patients with breast cancer and positive SLNs who had undergone ALND. We evaluated the predictive accuracy of the scoring system by assessing its discrimination and calibration in the training cohort and then evaluated its feasibility using the validation cohort. The results revealed that the scoring system had a high predictive performance. Therefore, our scoring system may identify patients with positive SLNs who are likely to have non-advanced ALNM.

The tumor size, histologic type, size of SLN metastasis, number of positive SLNs, proportion of positive SLNs, and suspicious lymph nodes on the axillary US have been reported as potential factors for predicting the likelihood of advanced ALNM [5,6,7,8,9, 11, 12], which was confirmed by our study. Because several of these factors describe related aspects of nodal involvement, we confirmed the non-collinearity between these variables before developing the scoring system.

The pathological tumor size has often been reported as a factor for predicting advanced ALNM [5,6,7,8, 11], while the clinical tumor size has been less frequently reported [9, 12]. Although pathological tumor size is a more accurate factor than clinical tumor size, it is unavailable preoperatively; therefore, we used clinical tumor size evaluated by US to develop our scoring system, which assigned 2 and 0 points for a clinical tumor size > 4 cm and ≤ 4 cm, respectively. Among the 717 patients in our entire cohort whose clinical tumor size was estimated to be ≤ 4 cm, only 71 (9.9%) had a pathological tumor size > 4 cm (data not shown). Thus, a clinical tumor size of ≤ 4 cm had an NPV as high as 90.1% for a pathological tumor size > 4 cm, suggesting that the risk of underestimating the tumor size was low. The clinical tumor size is easy to estimate by US, which is an advantage of our scoring system.
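
The two percentages in this paragraph are complements of each other, as the following short Python check of the reported counts shows. The variable names are chosen for the example.

```python
n_clinical_within_4cm = 717   # clinical tumor size within the 4 cm threshold
n_pathological_gt_4cm = 71    # of these, pathological tumor size above 4 cm

underestimated = n_pathological_gt_4cm / n_clinical_within_4cm
npv = 1 - underestimated

print(f"{underestimated:.1%}")  # 9.9%
print(f"{npv:.1%}")             # 90.1%
```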

Invasive lobular carcinoma (ILC) was an independent risk factor for advanced ALNM in our study. US findings of axillary nodes coupled with fine-needle aspiration (FNA) have been reported to be useful in predicting heavy axillary burden [25]. Advanced ALNM with false-negative US findings was more prevalent in patients with ILC than in those with infiltrating ductal carcinoma (IDC) because of the morphological features of ILC [26, 27]. FNA biopsy of suspicious ALNs on US was less sensitive in patients with ILC than in those with IDC [28, 29]. Therefore, we believe that the inclusion of histologic type as a predictive factor in the scoring system is an important complement to US findings.

A suspicious ALN on US often corresponds to an SLN. Considering that the false-negative rate for SLNB is 7.3–9.8% [30, 31], adding axillary US findings to the scoring system may reduce the risk of underestimating non-SLN metastasis. The AUC value for predicting advanced ALNM using our scoring system was significantly higher than that predicted by a single independent predictor, such as the number of positive SLNs or the proportion of positive SLNs (Figure S1).

In the analysis restricted to patients with one or two SLN metastases, our scoring system showed similarly good discrimination and calibration abilities (Figure S2). Specifically, 386 (77.0%) of 501 patients in the training cohort and 238 (78.5%) of 303 patients in the validation cohort had 1–2 SLN metastases, while 54 (14.0%) of 386 patients in the training cohort and 31 (13.0%) of 238 patients in the validation cohort had advanced ALNM; these frequencies were similar to those previously reported [3]. Assuming that these patients with advanced ALNM undergo only SLNB, the indications for postoperative chemotherapy and regional irradiation may be determined based on the underestimated lymph node status of the SLNB, thereby resulting in undertreatment.

In the training cohort, when the cutoff point of our scoring system was set to 4, 262 (67.9%) patients with 1–2 metastatic SLNs had a score of ≤ 4 points, and the NPV was 95.8% (Tables S2 and S3). A high NPV indicated that our scoring system accurately detected patients with 1–2 metastatic SLNs but a low risk of advanced ALNM. Therefore, even if ALND is omitted in patients with a score ≤ 4 and the indication for postoperative chemotherapy or regional irradiation is determined based only on SLNB results, the risk of undertreatment is low. Although no adverse prognostic effect of ALND omission has been observed [3], at each individual patient level, underestimating the number of lymph node metastases due to ALND omission may affect decisions regarding adjuvant treatment, leading to a risk of undertreatment. As our study was not a prospective study, the effectiveness of using the total score to determine whether ALND should be performed (and whether adjuvant therapy should be administered) is unclear. However, the ability of our scoring system to intraoperatively predict the risk of non-advanced and advanced ALNM at the individual level in patients with 1–2 metastatic SLNs may help reduce the risk of undertreatment with adjuvant therapy even if ALND is omitted.

The strength of our model is that it is based on available clinicopathological features from preoperative and intraoperative evaluations conducted in routine clinical oncology practice. The first advantage of including US findings in the predictive model is that US is a non-invasive, reproducible, and low-cost diagnostic tool. Second, while axillary assessment by magnetic resonance imaging is affected by body mass index and has decreased sensitivity in obese patients [32], US has been reported to have similar sensitivity in obese and non-obese patients and better specificity in obese patients [33]. In addition, all patients underwent completion level I–II ALND, which provided accurate information regarding ALNM. Finally, a scoring system-type prediction model is relatively simple compared to a nomogram-type model and is easily implemented in daily clinical practice.

This study had several limitations. First, it was performed at a single institution. Second, although we collected data from consecutive patients with invasive breast carcinoma, we did not control for selection bias. Patients who had undergone NAC were excluded. Patients with HR−/HER2+ and HR−/HER2− disease tended to receive NAC more often than those with HR+/HER2− disease, even if they were clinically node-negative, and the inclusion of these patients might have affected our results.

Conclusions

We developed a scoring system to accurately differentiate non-advanced ALNM from advanced ALNM in patients with breast cancer demonstrating SLN metastasis. This scoring system is easy to use, requires only preoperative and intraoperative available data, and can exclude advanced ALNM with a high NPV. Hence, it may contribute to reducing the risk of undertreatment with adjuvant therapy in patients with metastatic SLNs, even if ALND is omitted.