Background

Current practice guidelines recommend a completion axillary lymph node dissection (ALND) for breast cancer patients whose sentinel lymph node (SLN) contains metastatic tumor [1–3]. The risk of morbidity that accompanies completion ALND seems justified for patients with non-sentinel lymph node (NSLN) metastases, because they would undergo excision of residual cancer [4]. However, 50 to 65% of patients with tumor-involved SLNs do not have additional nodal metastasis [5, 6]. For them, ALND offers no clear therapeutic benefit, provides no further information for staging, and increases the cost of medical care. Further, completion ALND is associated with substantial morbidity affecting up to 39% of patients, with a nearly three-fold increased risk of lymphedema or regional sensory loss [7–9]. Identifying SLN-positive patients without NSLN metastases who could forgo completion ALND would improve the quality of life and reduce costs for the majority of women with new diagnoses of breast cancer.

Previous investigations have not identified predictors of NSLN status with accuracy sufficient to change clinical practice. This failure may be due to limited sample sizes or single institution studies [5, 6, 10]. The majority of prior investigations include fewer than two hundred subjects, and the challenges of small sample sizes lead to decreased predictive accuracy when the results are applied to the general population [5, 6, 10–12]. However, in 2003 Van Zee et al. proposed a nomogram to predict risk of NSLN metastasis based on an accrued population of 1075 cases of primary invasive breast cancer [13]. The Memorial Sloan-Kettering Cancer Center (MSKCC) Breast Cancer Nomogram (Nomogram) has since been successfully applied internationally and become the most commonly used predictive model for NSLN involvement [14]. Use of a predictive nomogram has been shown to be superior to expert opinion, to improve clinical decision making, and to be partially responsible for the decreasing frequency of ALNDs performed [15, 16]. However, use of the Nomogram is limited by its complexity and by the fact that it cannot be applied unless all patient characteristics are known [17]. Although the Nomogram was based on a large sample size, its reported predictive accuracy and its generalizability to patient populations with dissimilar tumor characteristics or to non-academic, non-quaternary care hospitals have been questioned [17–19].

Our goal was to identify characteristics of patients and their tumors that predict NSLN status within the Bay Area SLN Database, comprised of diverse patient populations from one academic and 15 community-based medical centers in Northern California and Oregon. We constructed three new models and contrasted their performance with the Nomogram. We provide a model that has simpler input than the Nomogram and shows higher accuracy for our diverse patient population and for another population of SLN-positive patients with different patient characteristics from Northwestern University. We have created an internet-based calculator, the Stanford Online Calculator, for validation testing and clinical application.

Methods

Study patients

The Bay Area SLN Study for Detection of Axillary Metastasis in Breast Cancer is a multi-institutional collaboration involving 16 institutions in the Greater Bay Area of Northern California and Oregon, of which 15 are community hospitals. A total of 1,040 patients underwent SLN biopsy for biopsy-proven breast cancer between 1996 and 2002. After excluding 256 patients (criteria shown in Additional file 1), we analyzed 784 prospectively accrued subjects with primary invasive breast carcinoma and clinically negative axilla who underwent SLN biopsy with completion axillary lymph node dissection. Of these, 285 (36.4%) had tumor-involved SLNs. Among the 285 SLN-positive patients, 213 had pathologic information regarding presence or absence of angiolymphatic invasion (lymphovascular invasion, LVI); 171 patients had complete pathologic information on both angiolymphatic invasion and hormone receptor status. The Northwestern test dataset was compiled by chart review of all patients who underwent a SLN biopsy at Northwestern Memorial Hospital in Chicago, IL, between 2002 and 2006. It comprises 77 consecutively identified sentinel node positive patients with invasive breast cancer who underwent completion ALND and had complete pathologic information on tumor type, tumor size, tumor grade, hormone receptor status, HER2/neu status, angiolymphatic invasion status, number of nodes removed, and size of sentinel node metastases. Inclusion and exclusion criteria are similar to those outlined for the Stanford patients in Additional file 1. The Northwestern database was compiled by physicians not involved in generation of the predictive models. The Bay Area SLN study was performed under a protocol approved by the Stanford University Administrative Panel on Human Subjects in Medical Research and the Institutional Review Boards of each participating institution. An independent protocol was approved by the Institutional Review Board of Northwestern University for retrospective chart review and data collection to test the Stanford Online Calculator and MSKCC Nomogram.

SLN biopsy and pathological evaluation

SLN biopsy has been described previously [20]. The SLN was identified using peritumoral injection of 1% isosulfan blue dye, filtered 99mTc sulfur colloid radioactive tracer, or both, as decided by the operating surgeon. All lymph nodes that were blue and/or focally radioactive and/or suspicious by intraoperative palpation were denoted SLNs. All SLNs were evaluated by step-sectioning with hematoxylin and eosin (H&E) staining; in the Bay Area SLN study, SLNs without metastasis detectable by H&E underwent staining by immunohistochemistry (IHC) [21]. IHC was performed on at least four levels of the SLN using anti-keratin antibodies AE1 and CAM5.2. One pathologist directed and interpreted IHC studies on every SLN excised at 14 of the 16 participating institutions in the Bay Area SLN Study. NSLNs were evaluated by H&E only, without serial sectioning. In the Northwestern series, negative SLNs did not undergo IHC testing and individual tumor cells or clusters were identified on H&E only.

Statistical analyses

Thirteen characteristics were studied individually for predicting NSLN status: patient age, tumor histology, tumor size (as a continuous variable and as T size by 6th edition AJCC criteria), tumor grade [22], estrogen receptor (ER) status, progesterone receptor (PR) status, HER2/neu status, presence of angiolymphatic invasion, number of SLNs excised, number of positive SLNs, size of nodal metastasis (recorded according to revised 6th edition AJCC criteria) [23], and method of detecting nodal metastasis (H&E or IHC). Univariate testing was done with χ2 statistics and Wilcoxon rank sums. For multivariate analyses, tree-based classification and logistic regression were performed [24, 25]. Recognizing that some characteristics can be interdependent, we performed multivariate analyses with two approaches whereby interactions among variables are emphasized: recursive partitioning via receiver operating characteristic (RP-ROC) [26] curves and (boosted) classification and regression trees (CART®) [24, 27, 28].

RP-ROC uses the relationship of sensitivity and specificity to calculate the "best value" of each variable for predicting NSLN status. It then chooses the variable with best value. Successive partitioning permits use of ROC curves to compare predictive accuracy and best cut point on "best selected variable." Partitioning of the population into subgroups continues until only patients with or without NSLN metastases are segregated to the group, or until the putative p value of the split exceeds 0.01. RP-ROC was performed as is described in detail by Kraemer [26] (software available from Sierra-Pacific MIRECC [29]).
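The idea of choosing a "best" cut point from sensitivity and specificity can be sketched as follows. This is a minimal illustration only: it scans candidate cut points on a single variable and scores each with Youden's J (sensitivity + specificity − 1), which is a stand-in criterion assumed here and not necessarily the quality index used in Kraemer's RP-ROC software. The function names and the toy data are hypothetical.

```python
def sens_spec(values, labels, cut):
    """Sensitivity and specificity of the rule 'value >= cut predicts NSLN-positive'.

    labels: 1 = NSLN metastasis present, 0 = absent.
    Assumes at least one positive and one negative case.
    """
    tp = sum(1 for v, y in zip(values, labels) if v >= cut and y == 1)
    fn = sum(1 for v, y in zip(values, labels) if v < cut and y == 1)
    tn = sum(1 for v, y in zip(values, labels) if v < cut and y == 0)
    fp = sum(1 for v, y in zip(values, labels) if v >= cut and y == 0)
    return tp / (tp + fn), tn / (tn + fp)


def best_cut(values, labels):
    """Try every observed value as a cut point; keep the one with the largest
    Youden's J (an assumed stand-in criterion). Returns (cut, J, sens, spec)."""
    best = None
    for cut in sorted(set(values)):
        sens, spec = sens_spec(values, labels, cut)
        j = sens + spec - 1.0
        if best is None or j > best[1]:
            best = (cut, j, sens, spec)
    return best


# Hypothetical toy data: ordinal SLN metastasis size (1 = ITC, 2 = micro, 3 = macro)
toy_sln_size = [1, 1, 1, 2, 2, 2, 2, 3, 3, 3]
toy_nsln_pos = [0, 0, 0, 0, 1, 0, 1, 1, 1, 0]
toy_best = best_cut(toy_sln_size, toy_nsln_pos)
```

In a recursive-partitioning setting, the same scan would be repeated for each candidate variable and then again within each resulting subgroup, subject to the stopping rule described above.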

CART as we applied it uses both cross-validation and voting methods (boosting) to assess the stability and improve the accuracy of the final model [24, 27] (software available from Salford Systems, v5 [30]). Splits are chosen by what is termed the Gini criterion, whose goal is to render nodes of the tree as "pure" as possible in terms of positive or negative NSLN status. Boosting is a method designed to focus on "hard to classify" observations. In all classifications, the result depends on the products, by class, of priors and costs of misclassification. For all classification trees, mixed priors (an average of equal priors and prevalence-based priors) were used. After surveying eighteen breast surgeons expert in SLN biopsy and not associated with this study, the costs of a false-positive and false-negative NSLN were set at 3 and 10, respectively.
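A minimal sketch of two ingredients named above, the Gini criterion and mixed priors, may help fix ideas. It assumes the plain two-class Gini impurity without the prior and cost weighting that the CART software applies, so it is an illustration of the concept and not a reproduction of the analysis. Function names and the split counts are hypothetical; the prevalence of 101 NSLN-positive cases among 285 SLN-positive patients is taken from the Results.

```python
def gini(pos, neg):
    """Two-class Gini impurity of a node: 0 when pure, 0.5 when evenly mixed."""
    n = pos + neg
    if n == 0:
        return 0.0
    p = pos / n
    return 2.0 * p * (1.0 - p)


def gini_decrease(left, right):
    """Impurity decrease from splitting a parent node into two children.

    left, right: (pos, neg) counts in each child node.
    """
    lp, ln = left
    rp, rn = right
    n_left, n_right = lp + ln, rp + rn
    n = n_left + n_right
    parent = gini(lp + rp, ln + rn)
    children = (n_left / n) * gini(lp, ln) + (n_right / n) * gini(rp, rn)
    return parent - children


def mixed_priors(prevalence):
    """Average of equal priors (0.5, 0.5) and prevalence-based priors.

    Returns (prior for NSLN-positive, prior for NSLN-negative).
    """
    p_pos = (0.5 + prevalence) / 2.0
    return p_pos, 1.0 - p_pos


# Hypothetical split that separates the classes perfectly
perfect_split_gain = gini_decrease((10, 0), (0, 10))

# Prevalence from the Results: 101 of 285 SLN-positive patients had NSLN metastases
prior_pos, prior_neg = mixed_priors(101 / 285)
```

With the reported prevalence, the mixed prior for the NSLN-positive class works out to roughly 0.43, between the equal prior of 0.5 and the observed prevalence of about 0.35.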

A third technique, multivariate logistic regression (MLR) informed by CART, was performed with variable selection based on paths from the root to the five terminal nodes of unboosted CART [31]. Odds ratios were calculated individually for all terms that were candidates for inclusion in subsequent analyses. Those terms retained were entered into the MLR by forward selection based on the likelihood ratio. Wald statistics and odds ratios were determined for variables significant at putative p < 0.01 within the regression model [32]. A cutoff p < 0.01 was chosen in the interest of our ending with a focused, concise, predictive model.

In constructing the predictive models of NSLN status, we used tumor characteristics that were significant by univariate testing (Table 1): tumor size, tumor grade, ER status, PR status, angiolymphatic invasion, size of SLN metastasis, and SLN metastasis identification method. Statistical modeling of NSLN status allowed calculation of both the predictive capacity of significant variables and the critical interactions between and among variables, such as increasing angiolymphatic invasion with increasing tumor size. All models used identical variables, although not identical patients. RP-ROC requires complete data, where no values of features are missing, whereas CART does not. Instead, CART relies on the subtle notion of "surrogate split" [24]. Thus, boosted CART analyses were performed on all 285 SLN-positive patients as well as subsets with more complete information, while RP-ROC and MLR analyses were performed on the 213 patients with complete data for angiolymphatic invasion and on the 171 patients with complete data for angiolymphatic invasion and hormone receptor status.

Table 1 Characteristics of NSLN- and NSLN+ cases among SLN+ patients (Bay Area SLN Database).

The MSKCC Breast Cancer Nomogram for Prediction of ALN Status [13] (Nomogram) was applied to our patient population and, to provide fair comparison, calculated for only the 171 patients with complete information on the eight variables required for its application (pathologic size of primary tumor, tumor type with nuclear grade if ductal, LVI, multifocality of primary tumor, ER status, method of detecting SLN metastasis, number of positive SLNs, and number of negative SLNs; a ninth variable, whether a frozen section was performed, was not applicable to our patients). ROC curves were constructed for the Nomogram and the other methods to compare the area under the curve (AUC). Internal validation was performed by 10-fold cross-validation, as previously described [27]. Data were divided at random into 10 parts, as equal as possible in size. CART (in this instance, but more generally any other procedure) was then computed successively for 9/10 of the data with the remaining piece held out as "test sample." This was repeated 10 times and results on the 10 test samples were averaged. Cross-validation is an internal validation method that estimates performance on subsequent subjects by avoiding the bias that arises from using the same, or even a portion of the same, data for both modeling and testing. However, even with internal validation, bias and variability can be introduced into subsequent analyses if the prevalence of features that predict outcome is different in future datasets than in the dataset from which the model was developed. Differences in the distribution of variables (and in synergistic interactions between variables) between an original and a subsequent test dataset affect a model's performance on future datasets; this applies both to our models and to the Nomogram. For this reason, we tested our model and the Nomogram on the Northwestern dataset, which differed from our original dataset in its distribution of patient, tumor, and sentinel node variables.
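The 10-fold procedure described above can be sketched as follows. This is a generic illustration under stated assumptions, not the code used in the study: the helper names, the fixed random seed, and the `fit` and `score` callables are all hypothetical placeholders for whatever modeling procedure and performance measure are being validated.

```python
import random


def k_fold_indices(n, k=10, seed=0):
    """Shuffle the indices 0..n-1 and deal them into k parts of near-equal size."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def cross_validate(n, fit, score, k=10, seed=0):
    """Fit on k-1 parts, score on the held-out part, and average over the k parts.

    fit(train_indices) -> model
    score(model, test_indices) -> number
    Both callables are supplied by the caller (hypothetical interface).
    """
    results = []
    for held_out in k_fold_indices(n, k, seed):
        held = set(held_out)
        train = [i for i in range(n) if i not in held]
        model = fit(train)
        results.append(score(model, held_out))
    return sum(results) / len(results)


# For the 213 patients with known angiolymphatic invasion status,
# ten folds come out as three parts of 22 and seven parts of 21.
folds_213 = k_fold_indices(213, k=10)
fold_sizes = sorted(len(f) for f in folds_213)
```

Because each patient is held out exactly once, no prediction that enters the averaged result was made by a model that saw that patient during fitting.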

ROC curves were constructed for the Nomogram and the MLR informed by CART model for the Bay Area SLN study dataset and the independent Northwestern dataset.

Statistical analyses were performed with R [33].

Results

Table 1 and Additional files 2 and 3 describe in detail the SLN-negative and SLN-positive patients of the Bay Area SLN dataset. As expected, the incidence of SLN metastasis increased with increasing tumor size: 29% of T1, 51% of T2, and 80% of T3 tumors had SLN metastasis. As tumor size increased over 1 cm, the incidence of angiolymphatic invasion doubled for both SLN-positive and SLN-negative patients but was higher for SLN-positive patients (Additional file 3). Among all 784 patients, the total number of women with any axillary lymph node metastasis was 316 (40%), including 31 (9.8%) with a false negative SLN (Additional file 2).

Among SLN-positive cases, the average number of SLNs removed was 1.91, with metastatic disease limited to a single SLN in 73%. Among tumor-involved SLNs, 22% contained isolated tumor cells or clusters (ITCs, ≤ 0.2 mm); 70% contained micrometastases (>0.2 mm to 2 mm); and 7% contained macrometastases (>2 mm). All SLNs containing ITCs required IHC for detection. Only one of 200 cases with SLNs involved by micrometastasis was not observed on H&E and required IHC staining for identification. All 21 cases with SLN macrometastasis were identified by H&E staining (Additional file 2).

Of 285 patients with tumor-involved SLNs, 101 (35.4%) were found to have NSLN metastases, with tumor metastases to two or more NSLNs in the majority of cases (median number of positive NSLNs 2; mean 3.5; range 1–19) (Additional file 2). By univariate analyses, 8 variables were highly predictive of NSLN status: tumor size (in cm), tumor size by AJCC T classification, tumor grade, ER status, PR status, angiolymphatic invasion, size of SLN metastasis, and whether the nodal metastasis was identified by H&E or IHC (Table 1). Of patients whose SLN was identified by H&E, 45% had NSLN metastases, whereas only 4.6% of patients whose SLN was identified by IHC had NSLN metastases (p < 0.001). Size of SLN metastasis and staining method for metastasis identification are highly correlated (p = 0.02, by χ2 testing) and therefore are not independent predictors of NSLN status. Thus, staining method for identifying tumor-involvement was not included in the multivariate analysis shown in Table 1. By multivariate analysis, tumor size, angiolymphatic invasion, and size of SLN metastasis remained significantly predictive of NSLN status (p < 0.001 by unconditional testing). Of the 285 patients with SLN metastases, NSLN metastases were found in 25% of patients with T1 tumors; in 46% with T2 tumors; and in 60% with T3 tumors (Figure 1A). When angiolymphatic invasion was present, there was a 3.9-fold increase in NSLN metastases (74% vs. 19%, Figure 1B). Among patients with isolated tumor cells or clusters within the SLN, 4.7% had NSLN metastasis; whereas 42% of patients with micrometastasis and 71% with macrometastasis had NSLN involvement (Figure 1C and Table 1).

Figure 1

Fraction of patients in Bay Area SLN Database with and without NSLN metastases in relation to (A) tumor stage, (B) angiolymphatic invasion, and (C) size of SLN metastasis.

The models generated by RP-ROC (Figure 2A) and CART (Figure 2B, Additional files 4 and 5) ultimately included tumor size, angiolymphatic invasion, and size of SLN metastasis. At the final split, likelihood of NSLN metastases partitioned into groups by level of risk. The significant predictors as selected by multivariate tree-based modeling were tested individually, as well as all iterations of predictors, in a MLR model. Variables entered were tumor size, angiolymphatic invasion, and size of SLN metastasis (Table 2). Size of SLN metastases interacts with the status of angiolymphatic invasion; that is, the impact of the size of SLN metastases upon the presence or absence of NSLN metastases depends on whether there was angiolymphatic invasion. The tree suggests that one might enter angiolymphatic invasion (scored as 1 if present, 0 if absent) not only multiplied by SLN metastasis size to the first power, but also as the product of angiolymphatic invasion and the square of SLN metastasis size (scored as an ordinal variable with values of 1, 2, and 3 corresponding to the size classification of isolated tumor cells, micrometastasis, or macrometastasis). The MLR model identified two highly predictive composite variables: the product of angiolymphatic invasion and size of SLN metastasis (p < 0.0001, odds ratio of 4.73 with approximate 95% confidence interval 3.11–7.20) as well as the product of tumor size and squared size of SLN metastasis (p < 0.0001, odds ratio of 1.18 with 95% confidence interval 1.10–1.26). We emphasize that p-values are only approximate because CART was used as preprocessor to manufacturing the predictive variables. However, these p-values are so small, and the clinical logic so compelling, that we do not doubt their practical, let alone statistical, significance.
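To make the two composite variables concrete, the sketch below computes them for a patient and compares the odds of NSLN metastasis between two patients using the reported odds ratios. Several points are assumptions and not taken from the text: that tumor size enters in centimeters, that each odds ratio applies per one-unit increase of its composite variable (the usual logistic-regression reading), and all function and constant names. The model intercept is not reported here, so the sketch yields only relative odds, not absolute probabilities, and it is not the Stanford Online Calculator.

```python
# Odds ratios reported in the text for the two composite variables
OR_LVI_X_SLN = 4.73        # angiolymphatic invasion x SLN metastasis size
OR_TUMOR_X_SLN_SQ = 1.18   # tumor size x (SLN metastasis size squared)

# Ordinal scoring of SLN metastasis size, as described in the text
SLN_SIZE_SCORE = {"itc": 1, "micro": 2, "macro": 3}


def composite_variables(tumor_size_cm, lvi_present, sln_category):
    """Return (LVI x SLN size, tumor size x SLN size squared).

    tumor_size_cm: assumed to be in centimeters (assumption, not stated for the MLR).
    lvi_present: True if angiolymphatic invasion is present (scored 1), else 0.
    sln_category: 'itc', 'micro', or 'macro'.
    """
    s = SLN_SIZE_SCORE[sln_category]
    lvi = 1 if lvi_present else 0
    return lvi * s, tumor_size_cm * s ** 2


def odds_ratio_between(patient_a, patient_b):
    """Odds of NSLN metastasis for patient_a relative to patient_b.

    Each patient is a (tumor_size_cm, lvi_present, sln_category) tuple.
    Assumes each reported odds ratio is per unit of its composite variable.
    """
    a1, a2 = composite_variables(*patient_a)
    b1, b2 = composite_variables(*patient_b)
    return OR_LVI_X_SLN ** (a1 - b1) * OR_TUMOR_X_SLN_SQ ** (a2 - b2)


# Two hypothetical patients who differ only in angiolymphatic invasion
with_lvi = (2.0, True, "micro")
without_lvi = (2.0, False, "micro")
lvi_effect = odds_ratio_between(with_lvi, without_lvi)
```

Under these assumptions, two patients who differ only in angiolymphatic invasion and who both have micrometastasis (score 2) differ in the first composite variable by 2, so their odds differ by a factor of 4.73 squared; with macrometastasis the exponent would be 3. This is the interaction the text describes: the effect of angiolymphatic invasion grows with the size of the SLN metastasis.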

Figure 2

Tree diagrams for RP-ROC and CART. As CART is able to impute missing data, it was calculated for all SLN positive patients, n = 285. RP-ROC requires complete data and was calculated for patients with known angiolymphatic invasion status, n = 213 (Bay Area SLN Database).

Table 2 Multivariate Logistic Regression (MLR) analysis informed by CART for predicting NSLN metastasis among SLN+ patients (n = 213) (Bay Area SLN Database).

Table 3 compares the sensitivities, specificities, and predictive accuracies of our three models, RP-ROC, boosted CART, and MLR, all computed with 10-fold cross validation [26]. As different models require different information, we evaluated models for the entire group (n = 285, only possible for CART) and subsets that contained complete information on angiolymphatic invasion (n = 213), and alternatively, on angiolymphatic invasion and ER status (n = 171). Cross-validated sensitivities/specificities of the three technologies for the group with known angiolymphatic invasion status (n = 213) were 79%/76% for RP-ROC, 88%/71% for boosted CART, and 78%/86% for MLR. Cross-validated specificity of boosted CART when inferred for the entire dataset (n = 285) was lower than when calculated using known values for angiolymphatic invasion (n = 213), suggesting that angiolymphatic invasion is informative in our dataset. This is supported by the continued selection of angiolymphatic invasion in CART modeling when patients have known angiolymphatic invasion status (n = 213) and known angiolymphatic status and ER status (n = 171) (Additional files 4 and 5, respectively). Overall diagnostic accuracy, based on areas under the ROC curve [34] (AUC), for predicting NSLN metastasis among patients in our database was greatest by MLR (83% and 85%) for the subsets of patients for whom the computation was possible (n = 213 and n = 171, respectively). Further, we applied the Nomogram to our SLN-positive patients who had complete data available for entry of its eight variables (n = 171, all patients with known angiolymphatic invasion status and ER status). Figure 3 shows a graph of the ROC curve that devolves from our MLR using our two composite variables (n = 213) and the ROC curve that devolves from the Nomogram (n = 171). 
Because much preprocessing has gone into our computations, p-values we might report (regarding a null hypothesis that the "true" areas under the curves are equal) would be suspect. However, the diagnostic accuracy or area under the curve (AUC) for our MLR is 83% (95% confidence interval 0.81–0.86), and the AUC for the Nomogram is 77% (95% confidence interval 0.73–0.81). When we use the same patients as used in the Nomogram for the MLR calculation (n = 171), our model achieves cross-validated AUC of 85% (95% confidence interval 0.81–0.89). Given that only three variables were used to calculate our MLR, the difference is noteworthy.
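For readers less familiar with the AUC, it can be read as the probability that a randomly chosen patient with NSLN metastasis receives a higher predicted risk than a randomly chosen patient without. The sketch below computes it directly from that pairwise definition; it is an illustration with hypothetical names and toy scores, and it does not reproduce the cross-validated estimates or confidence intervals reported above.

```python
def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison.

    scores: predicted risks (higher = more likely NSLN-positive).
    labels: 1 = NSLN metastasis present, 0 = absent.
    Ties between a positive and a negative case count as half.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy example: three of the four positive/negative pairs are ordered correctly
toy_auc = auc([0.1, 0.4, 0.35, 0.8], [0, 0, 1, 1])
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation, which is the scale on which the 83% and 77% figures above are compared.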

Figure 3

ROC curves for MLR informed by CART calculation in blue, AUC = 0.83, and Nomogram in green, AUC = 0.77, when applied to the Bay Area SLN Database. Note that the MLR informed by CART calculation was done for a larger group of patients (n = 213). When it was performed for the same patient group as the Nomogram (n = 171), AUC increased to 0.85.

Table 3 Model comparisons for predicting NSLN metastasis among SLN+ patients (Bay Area SLN Database).

Finally, the MLR and Nomogram were applied to a database of 77 patients who received ALND for positive SLNs at Northwestern University (Additional file 6). The SLN metastases in this dataset were identified by H&E stain without IHC. Among the 77 SLN positive patients, 61% had T1 tumors, 36% had T2 tumors, and 2.6% had T3 tumors. Angiolymphatic invasion was present in 68% of patients' tumors, and the SLN metastases in the Northwestern dataset were predominantly of large tumor burden, with 56% having macrometastasis. NSLN metastases were present in 24 patients (31%). This is in contrast to the Bay Area SLN dataset with 55% T1 tumors, 38% T2 tumors, and 7% T3 tumors; 45% with angiolymphatic invasion (when angiolymphatic invasion status was known); 7% with macrometastasis; and 35% NSLN metastases (Table 1 and Figure 1). Although the Northwestern tumors were somewhat smaller, the higher percentage of angiolymphatic invasion and SLN macrometastases suggests more biologically aggressive disease in their dataset, yet they had a slightly lower percentage of NSLN metastasis.

Both the MLR model and the Nomogram performed less well when applied to the Northwestern dataset; however, the MLR model was supported with an AUC of 74% (95% confidence interval 0.67–0.80). This is superior to the performance of the Nomogram among this population, 62% (95% confidence interval 0.55–0.68) (Figure 4).

Figure 4

ROC curves for MLR informed by CART calculation in blue, AUC = 0.74, and Nomogram in green, AUC = 0.62, when applied to the Northwestern test set (n = 77). 24 patients had NSLN metastasis in this dataset.

Discussion

Sentinel lymph node biopsy is a major advance in the treatment of women with breast cancer [35]. If no SLN metastases are identified, the likelihood of additional NSLN involvement is 9.8% in our series. Though above the goal false-negative rate proposed by the American Society of Breast Surgeons, this is comparable to the rates reported in NSABP-32 and recently by Lyman and by Veronesi (9.7%, 8.4%, and 8.8%, respectively) [1, 36, 37]. Of our patients with positive SLNs, the majority presented with micrometastasis (70%) or isolated tumor cells (22%). Thus, our population contains a predominance of limited SLN disease burden relative to prior reports, including van Rijk's reported rate of 23% for micrometastasis and 16% for isolated tumor cells [38]. This may be important, as suggested by Alran et al., who showed lower performance of the Nomogram in patients with only micrometastases [19]. Despite the seemingly low sentinel node tumor burden, 35% had NSLN metastases upon completion ALND. Unfortunately, no combination of clinical and/or pathologic characteristics enabled identification of all SLN-positive patients at risk for NSLN metastases. Although SLN-positive patients will receive systemic chemotherapy and/or hormone therapy, it is unknown whether occult NSLN metastases are eradicated by adjuvant treatment. Until results of large prospective clinical trials can demonstrate no long-term increase in mortality from omitting ALND in the setting of systemic therapy, prophylactic ALND for patients with tumor-involved SLNs, including those with and without NSLN involvement, remains standard surgical care [39–43]. However, in practice, it is the patient and her physician who decide whether or not a completion axillary dissection is performed. This decision may be informed using online calculators such as the Nomogram and the one presented here.

Based on a multi-institutional sample set larger than most prior studies, we found that univariate predictors of NSLN status include tumor size (in cm and by AJCC T size classification), tumor grade, hormone receptor status (ER and PR), angiolymphatic invasion, size of SLN metastasis, and whether nodal tumor involvement is identified by H&E. By multivariate analyses, tumor size, angiolymphatic invasion, size of SLN metastases, and products of these variables predict NSLN tumor involvement. Others have also discovered the predictive strength of each of the three simple characteristics [5, 6, 10, 44–50], although here we confirm their collective power in a unique way. Additionally, we found that angiolymphatic invasion is as strong a predictor of NSLN metastasis as is size of SLN metastasis.

For women with isolated tumor cells in the SLN, we found a 4.7% chance of NSLN involvement, similar to Calhoun and Giuliano's reported NSLN-involvement rate of 4.9% for the same subset of patients, and comparable to or lower than that found previously, 10–15% [47, 51, 52]. The benefits of no further axillary dissection must be weighed against the risk of harboring axillary metastasis that may potentially seed occult metastatic disease. Clinical context, with consideration of a patient's expected life-span and associated health problems, may affect the definition of a "minimal acceptable risk." Recommendations for clinical practice are difficult because the risk of NSLN metastasis in SLN-positive patients with isolated tumor cells is comparable to or lower than the risk of NSLN metastases in patients without SLN metastases (9.8% in our study) [53]. These issues are being studied in large-scale prospective clinical trials [40, 41]. Future molecular technologies may also provide guidance [54, 55].

Our goal was to identify patients with tumor-free NSLNs who, with near certainty, may be spared completion ALND. Using multivariate tree-based modeling by RP-ROC, boosted CART, and MLR informed by CART, we identified tumor size, angiolymphatic invasion, and size of SLN metastasis as characteristics that optimized stratification of NSLN status. These refined statistical analyses demonstrated a highly synergistic interaction between size of SLN metastasis and angiolymphatic invasion on risk of NSLN metastasis. Our models (Figure 2, Additional files 4 and 5) stratified patients with tumor-involved SLNs into four risk groups for having NSLN metastasis: low risk (10% or less), moderate risk (30–45%), high risk (about 60%), and very high risk (greater than 90%).

MLR modeling of NSLN status that was informed by CART in its selection of predictors provided the most accurate cross-validated technique for predicting NSLN metastases for patients with known angiolymphatic invasion status, with accuracy superior to boosted CART, RP-ROC, and the Memorial Sloan-Kettering Breast Cancer Nomogram. When applied to the Bay Area SLN Database, the Nomogram had an AUC of 77%. This compares with the accuracy of the Nomogram for the original MSKCC population (76%) and for a prospective cohort at MSKCC (77%) [13]. Seven subsequent studies have tested the Nomogram and show an accuracy of 63% to 86%, though as low as 54% when applied only to patients with SLN micrometastases [14, 17–19, 56–60]. In contrast, our MLR informed by CART model performed equally well among patients with isolated tumor cells, micrometastases, or macrometastases. Relative importance of size of SLN metastasis in the Nomogram is determined by method of detection including IHC, serial H&E, routine analysis, versus frozen section (among the subset for which this is performed) [13]. The improved predictive accuracy of the MLR model informed by CART, particularly among patients with isolated tumor cells or micrometastasis, may be due to the relative weight ascribed to the specific size of SLN metastasis in our model. Application of our MLR model to other patient populations is required to validate its performance.

Considering the risk of potential bias due to low sentinel node tumor burden in our dataset, we applied both our model and the MSKCC Nomogram to an independent dataset of 77 SLN positive patients who underwent completion ALND. These cases were not identified by IHC and this dataset contained cases with a much larger tumor burden: 56% of cases contained macrometastasis in the SLN compared to 7% of the Bay Area SLN dataset. Again, the MLR model showed superior performance to the Nomogram. However, the performance of both models decreased compared to the Bay Area SLN Database: the Stanford Online Calculator generated an AUC of 0.74, or 74%, and the Nomogram generated an AUC of 0.62, or 62%. This raises concern regarding the generalizable nature of any model. An underlying reason why neither model performed as well as anticipated is that when a model is developed based on data from one group of patients, and the model is subsequently applied to data from a different group of patients, performance is generally diminished [24, 27]. This is due to differences in the distributions of predictive features and differences in the synergistic impact (interactions) between and among these features in different groups of patients. Thus a model developed in one group of patients would not be expected to perform as well for a different group of patients, even if the performance of the model was validated internally (cross-validated) on the original group. Table 1 and Additional file 6 show differences in the distribution of the three variables in our model – tumor size, size of SLN metastasis, and angiolymphatic invasion – between patients in the Bay Area SLN Database and Northwestern series. As we would also expect the interactions of these variables to differ between the two groups, we believe these factors in aggregate may be responsible for our findings.

The predictive accuracy of the Nomogram requires assessment of eight tumor characteristics [13, 14, 17, 18, 56]. A hazard of multi-variable modeling is that its overall accuracy is dependent upon the accuracy and precision with which each individual variable is determined. Our MLR confirmed the importance of two composite variables from only three tumor characteristics: 1) the size of SLN metastasis when angiolymphatic invasion is present, and 2) tumor size times the square of the size of SLN metastasis. The first composite variable reflects the synergism between angiolymphatic invasion and size of SLN metastasis; the second involves tumor burden. Using these two composite variables, AUC is 83% or 85% compared to the Nomogram's AUC of 77% that relies on eight variables. By using statistical methods that allow assessment of variable-variable interactions, we demonstrate superior accuracy with fewer required variables. Our model is the first proposed that emphasizes the synergistic interactions among patient characteristics. By reducing the required variables, we are hopeful the MLR model may be applied to a larger population of patients, without excluding those with incomplete, unavailable, or pending pathologic data.

Missing pathologic data is problematic for breast cancer patients nationwide. Though the generalizability of our model may have benefited from the diverse population represented, obtaining complete clinicopathologic information was partially limited by enrollment across 16 institutions during the years of our study, 1996–2002. Approximately 25% of our 285 SLN-positive patients had no histologic analysis for angiolymphatic invasion. Of the 213 patients with angiolymphatic data present, another 19.7% had no ER status performed or recorded. This is comparable to the 17.1% of invasive breast cancer patients without recorded ER status in 13 registries of the national Surveillance, Epidemiology, and End Results (SEER) database from 1999–2003 [61] (unpublished data, Jeffrey lab); presence of angiolymphatic invasion status was not requested by SEER. Thus, we analyzed our data using three patient groups: the entire SLN-positive dataset of 285 patients; 213 SLN-positive patients who had complete information on angiolymphatic invasion; and 171 SLN-positive patients who had complete information on angiolymphatic invasion and ER status. Even applying the smallest dataset, more SLN-positive patients are analyzed than in most other published studies.

Though not directly compared among identical patient populations, the AUC of our model is also superior to that of the M.D. Anderson Cancer Center scoring system (70%) and the Hôpital Tenon scoring system derived in Paris, France (68%), as recently reported by Dauphine et al. [59, 62, 63]. Calculations using our MLR model are easily done over the internet with the Stanford Online Calculator [64]. We encourage others to access and test our model and directly compare it with other models for evaluating risk of NSLN metastasis.

Although no modeling technique has been able to identify patients without any risk of NSLN metastasis, Park et al. recently argued that ALND may be reasonably eliminated among patients with approximately 9% or less predicted risk of NSLN involvement [16]. A low-risk subset of 287 patients with SLN metastasis was followed in a non-randomized study, with a 2% observed rate of local recurrence. This recommendation, however, is limited by a follow-up of only 23 months. We expect that data from two large prospective clinical studies, NSABP-32 and American College of Surgeons trial Z0011, will more definitively resolve questions regarding the optimal surgical management of SLN-positive patients [40, 41]. In the meantime, we hope that our calculator may provide further guidance for risk evaluation.

Conclusion

Fewer than half of women undergoing completion axillary lymph node dissection (ALND) for breast cancer will have non-sentinel node (NSLN) metastasis. We present a new model and the Stanford Online Calculator developed from a Northern California and Oregon database with superior accuracy and simplicity (three versus eight required patient variables) compared to the Memorial Sloan-Kettering Breast Cancer Nomogram for our dataset and another independent dataset. We hope that other institutions will test our model using their datasets, which will contain different patient demographics, to validate its accuracy and to refine in which populations it may be best used. Further investigation of predictive models to stratify risk of non-sentinel lymph node metastasis will better define their role in guiding clinical decision-making, while we await the results of larger randomized trials.