Background

Breast cancer is the most prevalent carcinoma in women. An estimated 268,000 American women were diagnosed with breast cancer in 2019, accounting for approximately 30% of all new cancer diagnoses in women, resulting in 41,760 deaths (15% of women’s cancer mortality) [1].

A variety of methods, including DNA sequencing and immunohistochemistry, have been used to study the mechanisms driving the occurrence and progression of breast cancer [2]. To facilitate the identification and treatment of breast cancers with different characteristics, immunohistochemical markers are used to classify tumors into subtypes [3]. Hormone receptors (HRs), such as the estrogen receptor (ER) and the progesterone receptor (PR), and human epidermal growth factor receptor 2 (HER2) are important immunohistochemical markers. Four molecular subtypes are recognized on the basis of these biomarkers: luminal A, luminal B, basal-like, and HER2 [4]. The molecular subtypes correlate with the prognosis of breast cancer [5]. Luminal A breast cancer is defined by the following expression characteristics: ER > 1%, PR ≥ 20%, HER2 negative, and Ki-67 < 14% [6]. Howlader et al. reported that the proportion of luminal A among breast cancer subtypes was 72.7% [7]. Luminal A breast cancers have high hormone receptor expression, negative HER2 expression, and a low proliferation rate compared with other subtypes of breast cancer. These characteristics contribute to a better prognosis for patients with luminal A breast cancer [8, 9]. Yet additional factors may affect the prognosis of this subtype.

Lymph node positivity is a high-risk factor in breast cancer and is related to lymph node metastasis. First, the expression of microRNA is an important factor affecting lymph node metastasis. Some studies have reported that the expression of miR-98 leads to metastasis of tumor cells to sentinel lymph nodes, which is associated with the poor prognosis of ER-positive, HER2-negative breast cancer [10, 11]. Second, some immune cells are also associated with lymph node metastasis. A study by Takada et al. showed that the density of tumor-infiltrating lymphocytes in patients with lymph node metastasis was significantly lower than in patients without lymph node metastasis [12]. Third, perineural and lymphatic vessel invasion is associated with lymph node metastasis. Çetintaş et al. reported that perineural invasion and lymphatic vessel invasion were significantly associated with the risk of lymph node metastasis [13]. In addition to the factors mentioned above, other factors such as tumor size, body mass index (BMI), and the platelet-to-lymphocyte ratio are also associated with lymph node metastasis and breast cancer prognosis [10, 14, 15].

Although it has been confirmed that some factors, such as BMI and the expression of the microRNAs mentioned above, affect the prognosis of lymph node-positive, luminal A breast cancer, whether chemotherapy can improve the survival of these patients is still controversial. While the study by Herr et al. showed that the overall survival (OS) of patients with lymph node-positive, luminal A breast cancer improved after chemotherapy [16], studies by Taskaynatan et al. and Uchida et al. failed to show a benefit from chemotherapy in the same patient population [17, 18]. Therefore, it is necessary to include chemotherapy as a predictor in a predictive model that can more accurately clarify and predict the impact of chemotherapy on the prognosis of these patients. A nomogram is a visual tool based on a prognostic model that incorporates relevant clinicopathological factors to estimate individual clinical outcomes, thereby providing clinicians with a more accurate assessment of prognosis. Previous nomograms did not show the effect of treatment on the survival of patients with luminal A, lymph node-positive breast cancer [19, 20], although treatment, for example surgery, has a significant effect on the prognosis of breast cancer [21]. Thus, it is important to include the treatment modality as a predictor when building a nomogram to predict the prognosis of these patients.

In this study, we focused on constructing nomograms that can predict the survival outcomes of patients with lymph node-positive, luminal A breast cancer. First, patient information was screened from the Surveillance, Epidemiology, and End Results (SEER) database. Then, the patients were divided into two groups, a training group and a validation group, and the training group was used to construct a model to predict patient prognosis. Finally, the validation group was used to verify the sensitivity and accuracy of the model. The steps of construction and validation of the nomograms are presented in Fig. 1.

Fig. 1 Flow diagram showing the steps involved in the construction and validation of the nomograms

Methods

Research populations

We collected and screened records from January 2010 to December 2015 in the SEER data from 18 registries. The inclusion criteria were as follows: (1) female; (2) age at diagnosis ≥18 years; (3) diagnosis confirmed by positive histology rather than by other methods; (4) breast cancer as the first primary cancer; (5) luminal A subtype; (6) complete survival data and a survival time other than “0”; (7) complete information on the following variables: age at diagnosis, ethnic group, marital status, histological subtype, tumor size, location, grade, laterality, number of positive lymph nodes, seventh edition American Joint Committee on Cancer (AJCC) TNM stage, SEER cause-specific death, vital status, breast cancer subtype, and metastasis site; and (8) TNM stage of T1–4, N1–N3, and M0–M1 according to the seventh edition of the AJCC staging system.

Variables and definition

The following data were extracted for each patient from the database: age at diagnosis, year diagnosed, race, marital status at diagnosis, primary site of the tumor, adjusted AJCC seventh T stage, N stage, M stage, tumor grade, histological subtype, number of positive lymph nodes, surgery, chemotherapy, radiotherapy, SEER cause-specific death, metastasis site, vital status, breast cancer subtype, and survival (months).

Histologic grades were classified as well differentiated (grade 1), moderately differentiated (grade 2), poorly differentiated (grade 3), and undifferentiated/anaplastic (grade 4). For marital status, the unmarried category included single, divorced, separated, widowed, and unmarried or domestic partner. For race, the category “others” included American Indian/Alaska Native and Asian/Pacific Islander. Overall survival (OS) was defined as the time from diagnosis to death from any cause or to the last follow-up. Breast cancer-specific survival (BCSS) was defined as the time from diagnosis to death caused by breast cancer or to the last follow-up. The endpoint of follow-up was December 2015.
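The two endpoint definitions differ only in which deaths count as events. As a minimal sketch, assuming a simplified patient record whose field names are hypothetical (they are not SEER variable names), the (time, event) pairs for OS and BCSS could be derived as follows:

```python
from dataclasses import dataclass


@dataclass
class Record:
    """Hypothetical, simplified patient record (field names are assumptions)."""
    survival_months: int
    alive: bool                  # vital status at last follow-up
    died_of_breast_cancer: bool  # cause-specific death flag


def os_endpoint(r: Record) -> tuple[int, int]:
    """Return (time, event) for OS: the event is death from any cause."""
    return r.survival_months, 0 if r.alive else 1


def bcss_endpoint(r: Record) -> tuple[int, int]:
    """Return (time, event) for BCSS: only breast cancer deaths are events."""
    is_event = (not r.alive) and r.died_of_breast_cancer
    return r.survival_months, 1 if is_event else 0


patients = [
    Record(60, True, False),   # alive at last follow-up: censored for both
    Record(24, False, True),   # breast cancer death: event for both
    Record(36, False, False),  # death from another cause: OS event only
]
os_data = [os_endpoint(p) for p in patients]
bcss_data = [bcss_endpoint(p) for p in patients]
```

In this sketch a death from another cause is treated as censoring for BCSS, which is one common convention for cause-specific survival.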

Data analysis

Counts and percentages were used to summarize the baseline characteristics of the groups. In the training group, we used univariate and multivariate Cox regression analyses to determine the risk associated with each factor for OS and BCSS; these analyses were performed in SPSS (IBM Corporation, USA, version 21). A factor was considered significant if p < 0.05. All significant factors in the univariate analysis were included in the multivariate Cox regression analyses. The variables that remained significant in the multivariate Cox regression analyses were selected for the final prognostic models used to construct the nomograms. The final prognostic models were then used to predict 1-, 3-, and 5-year OS and BCSS. We validated the nomograms internally in the training group and externally in the validation group. Harrell's concordance index (C-index) and the area under the ROC curve (AUC) were used to evaluate the nomograms, with a higher C-index indicating more accurate prognostic prediction [22]. The nomograms demonstrated good discriminative ability, with a C-index between 0.78 and 0.81. We also used calibration plots to evaluate nomogram performance. Calibration plots lying along the 45-degree line indicate perfect calibration, in which the predicted probabilities are identical to the actual outcomes [22]. Survival curves were plotted with the Kaplan-Meier method and compared using the log-rank test. We used SEER*Stat software (version 8.3.6; NCI, Bethesda, MD) to extract the data. The C-index, ROC curves, nomograms, calibration curves, and Kaplan-Meier curves were generated in R using the packages “rms”, “survival”, “foreign”, “timeROC”, and “regplot”.
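To make the C-index concrete, the following is a minimal pure-Python sketch of Harrell's concordance index. It is an illustration under simplifying assumptions (tied survival times are ignored, and the function name and toy data are hypothetical), not the R routine used in this study.

```python
def harrell_c_index(times, events, risk_scores):
    """Harrell's C: among comparable pairs, the fraction in which the patient
    with the shorter observed survival has the higher predicted risk.
    Ties in predicted risk count as 0.5."""
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # A pair is comparable when patient i had an observed event
            # strictly before patient j's follow-up time.
            if events[i] == 1 and times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


times = [5, 10, 15, 20]        # months of follow-up (made-up values)
events = [1, 0, 1, 1]          # 1 = death observed, 0 = censored
risk = [0.9, 0.4, 0.2, 0.6]    # hypothetical model risk scores
c = harrell_c_index(times, events, risk)  # 3 of 4 comparable pairs agree
```

A value of 0.5 corresponds to random ordering and 1.0 to perfect ordering of risks relative to observed survival.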

Results

Demographics and Clinicopathological characteristics

A total of 39,051 cases were collected from the SEER database for this study (Fig. 1). The eligible patients were randomly divided into a training group (n = 19,526) and a validation group (n = 19,525) at a ratio of 1:1. Most patients were 40–49 (20.5%), 50–59 (26.7%), or 60–69 (25.7%) years of age. Most of the patients were Caucasian (79.6%). With regard to histology, most of the patients presented with infiltrating duct carcinoma (72.8%), and about half presented with grade II tumors (53.0%). Most patients were N1 stage (76.0%) and almost all were M0 stage (97.1%). Nearly all of the patients underwent surgery (99.0%), and most received radiotherapy (62.4%) and chemotherapy (64.1%). The rates of metastasis to the bone, brain, liver, and lung were 2.0, 0.1, 0.4, and 0.5%, respectively. All variables displayed similar proportions in the training group and the validation group. Table 1 shows the details of the baseline characteristics.
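A random 1:1 split of this kind can be sketched as below. The text does not state how the randomization was performed, so the shuffling approach, the seed, and the function name are assumptions made for illustration only.

```python
import random


def split_one_to_one(ids, seed=0):
    """Shuffle patient identifiers and split them into two near-equal halves."""
    shuffled = list(ids)
    random.Random(seed).shuffle(shuffled)
    cut = (len(shuffled) + 1) // 2  # the training group gets the extra case
    return shuffled[:cut], shuffled[cut:]


# 39,051 eligible cases, identified here simply by their index.
training, validation = split_one_to_one(range(39051), seed=2020)
print(len(training), len(validation))
```

With an odd total, one group necessarily holds one more case than the other, which matches the reported group sizes of 19,526 and 19,525.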

Table 1 Demographic and clinicopathologic characteristics of the patients

Univariate and multivariate Cox analysis and nomogram construction

Univariate analyses showed that race, age at diagnosis, marital status, grade, T stage, tumor size, N stage, M stage, number of positive regional nodes, site of metastasis (bone, brain, liver, lung), surgery, radiotherapy, and chemotherapy were significantly associated with OS and BCSS (Table 2). In the multivariate Cox regression analysis, age at diagnosis, marital status, grade, T stage, M stage, race, number of positive regional nodes, bone metastasis, brain metastasis, liver metastasis, surgery, radiotherapy, and chemotherapy were identified as independent prognostic factors for OS and BCSS. Black patients were at higher risk of death than Caucasian patients, while patients of other races were at lower risk than Caucasian patients. The unmarried group was at higher risk than the married group. With regard to histological grade, the risk in the grade IV group was significantly higher than in the grade I group. The risk in the T4/N3/M1 groups was markedly higher than in the T1/N1/M0 groups. With regard to treatment, patients who underwent surgery or received radiotherapy or chemotherapy were at lower risk than those who did not. As for metastasis, patients with brain, bone, liver, or lung metastasis were at higher risk than those without. Because metastasis site and M stage showed similar significance, we excluded M stage from the final multivariate Cox proportional hazards models and combined the other independent predictors identified in the training group to build the nomograms for 1-, 3-, and 5-year OS and BCSS (Fig. 2). The length of the line next to each variable in the nomogram indicates the size of the effect of that variable on prognosis. From the nomograms, brain metastasis, age, and T stage were the three factors with the greatest effect on the prognosis of patients with lymph node-positive, luminal A breast cancer.
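The link between line length in a nomogram and effect size can be illustrated with a common scaling convention: each Cox coefficient (the natural log of the hazard ratio) is rescaled so that the predictor with the widest coefficient range spans 0 to 100 points. The sketch below applies this to three OS hazard ratios quoted in the text. It is an assumption-laden illustration: it uses only three predictors, so the resulting points will not match Fig. 2, and the dictionary keys and function name are hypothetical.

```python
import math

# OS hazard ratios quoted in the text, each relative to its reference level.
hazard_ratios = {
    "brain metastasis (yes vs no)": 4.449,
    "surgery (yes vs no)": 0.401,
    "positive nodes (>10 vs reference)": 2.357,
}

# Cox coefficients are log hazard ratios; a reference level has coefficient 0.
betas = {name: math.log(hr) for name, hr in hazard_ratios.items()}

# The predictor with the widest coefficient range spans the full 0-100 axis.
max_range = max(abs(b) for b in betas.values())


def points(beta_level, beta_lowest):
    """Points for one level, measured from the lowest-risk level of its predictor."""
    return 100.0 * (beta_level - beta_lowest) / max_range


for name, beta in betas.items():
    lowest = min(beta, 0.0)   # lowest-risk level of this two-level predictor
    highest = max(beta, 0.0)  # highest-risk level
    print(f"{name}: 0 to {points(highest, lowest):.1f} points")
```

Under this convention a protective factor such as surgery assigns its points to the untreated level, so the line length still reflects the absolute size of the coefficient.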

Table 2 Univariate and multivariate Cox analysis of overall survival and breast cancer-specific survival

Fig. 2 Nomograms for predicting 1-, 3-, and 5-year OS (A) and BCSS (B) for patients with the indicated prognostic factors. Total points are obtained by summing the points from all predictors. The predicted probabilities of OS and BCSS are obtained by projecting the location of the total points onto the bottom scales. NO. nodes, number of positive lymph nodes; OS, overall survival; BCSS, breast cancer-specific survival

Validation of the nomograms

Our nomograms were validated internally in the training group and externally in the validation group. The calibration plots showed excellent agreement between the actual and nomogram-predicted survival probabilities in both the training and the validation cohorts (Fig. 3). The AUC of the ROC curve, which indicates discrimination ability, for predicting 5-year OS was 0.768 in the training cohort and 0.766 in the validation cohort. The AUC of the ROC curve for predicting 5-year BCSS was 0.789 in the training cohort and 0.787 in the validation cohort (Fig. 4). Our findings indicate that the nomograms can effectively predict a patient’s OS and BCSS.
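As a rough illustration of what an AUC at a fixed time horizon measures, the sketch below compares the risk scores of patients who died by the horizon with those of patients still under follow-up beyond it. It is a deliberately naive version that drops patients censored before the horizon, rather than the censoring-adjusted estimators that dedicated packages implement; the function name and data are hypothetical.

```python
def auc_at_horizon(times, events, risk_scores, horizon):
    """Naive AUC for 'event by horizon' versus 'event-free beyond horizon'.
    Patients censored before the horizon are dropped (a simplification)."""
    cases = [r for t, e, r in zip(times, events, risk_scores)
             if e == 1 and t <= horizon]
    controls = [r for t, r in zip(times, risk_scores) if t > horizon]
    if not cases or not controls:
        raise ValueError("need at least one case and one control")
    wins = 0.0
    for rc in cases:
        for rn in controls:
            if rc > rn:
                wins += 1.0   # the case was ranked as higher risk
            elif rc == rn:
                wins += 0.5   # ties count as half
    return wins / (len(cases) * len(controls))


times = [12, 30, 48, 70, 72, 80]        # months of follow-up (made-up values)
events = [1, 1, 0, 0, 1, 0]             # 1 = death observed, 0 = censored
risk = [0.8, 0.3, 0.5, 0.2, 0.6, 0.4]   # hypothetical model risk scores
auc_5y = auc_at_horizon(times, events, risk, horizon=60)
```

Here two patients died within 60 months and three were followed beyond 60 months, giving six case-control pairs, four of which are ranked correctly.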

Fig. 3 Calibration plots for 1-, 3-, and 5-year OS and BCSS. (A, B, C) Internal calibration curves for OS; (D, E, F) external calibration curves for OS; (G, H, I) internal calibration curves for BCSS; (J, K, L) external calibration curves for BCSS. OS, overall survival; BCSS, breast cancer-specific survival

Fig. 4 ROC curves for 1-, 3-, and 5-year OS and BCSS. (A) Internal validation for OS; (B) external validation for OS; (C) internal validation for BCSS; (D) external validation for BCSS. OS, overall survival; BCSS, breast cancer-specific survival

Moreover, we determined the C-index values of our nomograms to assess their discriminative ability. In the training cohort, the C-index was 0.782 (95% CI, 0.772–0.792) for OS and 0.806 (95% CI, 0.794–0.818) for BCSS. In the validation cohort, the C-index was 0.783 (95% CI, 0.773–0.793) for OS and 0.804 (95% CI, 0.792–0.816) for BCSS.

Survival analysis

Kaplan-Meier curves were used to examine the effect of each prognostic factor in the nomograms on OS and BCSS in the training group. The length of the line segment next to each variable in the nomogram indicates the degree of influence of that variable on prognosis. As shown in Fig. 2, brain metastasis had the greatest impact on prognosis. The hazard ratio for OS in patients with brain metastasis in the multivariate analysis was 4.449 (95% CI: 2.381–8.313; Table 2). Surgery and the number of positive lymph nodes were also important factors affecting prognosis. The hazard ratio for OS in patients who underwent surgery was 0.401 (95% CI: 0.311–0.517). The hazard ratio for OS in patients with more than 10 positive lymph nodes was 2.357 (95% CI: 1.698–3.270). All the prognostic factors in the nomograms were significant in the training group, consistent with the results in Table 2. The curves indicate that all the factors showed the same outcome trends for OS and BCSS (Fig. 5).
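For readers unfamiliar with how a Kaplan-Meier curve is built, the product-limit estimate can be sketched in a few lines. This is a minimal illustration with made-up data and a hypothetical function name; it does not reproduce the R analysis behind Fig. 5.

```python
def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct event time (product-limit estimator)."""
    survival = 1.0
    curve = []
    for t in sorted(set(times)):
        # Patients still under observation just before time t.
        at_risk = sum(1 for u in times if u >= t)
        # Observed deaths exactly at time t (censored cases are not deaths).
        deaths = sum(1 for u, e in zip(times, events) if u == t and e == 1)
        if deaths > 0:
            survival *= 1.0 - deaths / at_risk
            curve.append((t, survival))
    return curve


times = [6, 12, 12, 24, 36, 48]   # months of follow-up (made-up values)
events = [1, 1, 0, 1, 0, 1]       # 1 = death observed, 0 = censored
curve = kaplan_meier(times, events)
```

Censored patients contribute to the number at risk until their last follow-up but never lower the curve themselves, which is why the estimate drops only at observed death times.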

Fig. 5 Kaplan-Meier curves of OS and BCSS for each predictor. (A, B) Age; (C, D) race; (E, F) marital status; (G, H) T stage; (I, J) number of positive lymph nodes; (K, L) bone metastasis; (M, N) liver metastasis; (O, P) brain metastasis; (Q, R) tumor grade; (S, T) radiotherapy; (U, V) chemotherapy; (W, X) surgery

Discussion

Many factors are associated with the prognosis of lymph node-positive, luminal A breast cancer, so it is vital to identify the independent factors related to prognosis. We constructed nomograms to predict the 1-, 3-, and 5-year OS and BCSS of these patients. The nomograms contain the following risk factors: age at diagnosis, grade, ethnic group, T stage, marital status, number of positive regional nodes, bone metastasis, brain metastasis, liver metastasis, surgery, radiotherapy, and chemotherapy.

One of the most significant risk factors affecting the prognosis of breast cancer is age at diagnosis. Liu et al. observed that patients with luminal A breast cancer had significantly lower 5-year disease-free survival (DFS) and distant metastasis-free survival (DMFS) in the ≤40 years age group than in the 41–60 years age group [23]. Another study reported that young age at diagnosis is associated with a lower frequency of luminal A breast cancer; the 5-year event-free survival rates of patients aged less than 40, between 40 and 50, and > 50 years were 54.3 ± 3.5, 68.5 ± 1.9, and 70.4 ± 1.3%, respectively [24]. Additionally, another study showed that breast cancer-specific mortality for patients aged > 80 years was 25.8% at 5 years [25]. In our study, the nomograms showed that, compared with the 40–49 age group, patients aged 18–29 at diagnosis had a lower risk of death, while patients aged 30–39 at diagnosis had a higher risk of death. When the age at diagnosis was ≥50, the risk of death generally showed an upward trend as age increased (Fig. 2). From the Kaplan-Meier curves, we observed that the BCSS of the ≥80-year-old subgroup was not as poor as the OS (Fig. 5A, B). The study by Chu et al. also reported that age affects prognosis, and their nomogram also shows that the ≥80-year-old subgroup had the highest risk compared with the other subgroups, which is consistent with our results. Our findings suggest that the poor survival of patients aged ≥80 years might be due to reasons other than the breast cancer itself. There are a number of reasons why age may affect prognosis. First, the levels of estrogen and progesterone differ among patients in different age groups, and these levels are important factors affecting the occurrence and prognosis of breast cancer. Second, older patients are more likely to have chronic diseases, such as high blood pressure and diabetes, which can also affect survival. Third, a study reported that older patients have a higher risk of venous thromboembolism after receiving chemotherapy or endocrine therapy [26].

Studies by Chu et al. and Wang et al. have observed that race is a factor related to the prognosis of breast cancer [19, 27]. From our nomograms, we observed that American Indian/Alaska Native and Asian/Pacific Islander women have a lower risk of death than Caucasian women, while black women have a higher risk of death (Fig. 2). This is likely due to various reasons, such as medical conditions and environmental factors. A study reported that people tend to seek health care close to where they live rather than at a greater distance [28]. Racial demographics differ between areas and the incidence of medical conditions varies from region to region, which may be related to the stage at which breast cancer is diagnosed and the conditions for its treatment. Being diagnosed at an advanced stage is often accompanied by poorer living conditions, ultimately affecting the prognosis of breast cancer [29].

The number of positive lymph nodes is one of the most important factors affecting the prognosis of patients with luminal A breast cancer. Studies by Han et al. and Herr et al. reported that the prognosis of patients with more than 3 positive lymph nodes was significantly worse than that of patients with 1–3 positive lymph nodes in luminal A breast cancer [16, 30]. The number of positive lymph nodes is also associated with distant recurrence. A study showed that patients in whom positive lymph nodes made up ≤20% of the total number of excised axillary lymph nodes had lower distant recurrence and better OS than those with a ratio > 20% [31]. In our study, the hazard ratios for OS and BCSS showed an upward trend as the number of positive lymph nodes increased (Table 2). The number of positive lymph nodes is related to perineural invasion, lymphatic vessel invasion, and tumor size, all of which can affect the prognosis of luminal A breast cancer [13].

T stage, which reflects the size of the tumor, also affects the prognosis of patients with luminal A breast cancer. Kustic et al. reported that larger tumors were related to poor prognosis and adversely affected DFS and OS [32]. Our study showed that the median survival time shortened as T stage increased (Fig. 5G, H), which is consistent with the observations of Kustic et al. The nomogram constructed by Chu et al. also shows a higher risk of death in the ≥5 cm group than in the ≤1 cm group. The survival curve also shows that the survival time of the ≥5 cm group is noticeably shorter than that of the ≤1 cm group. This result may be because larger tumors are often associated with later staging and are more likely to be accompanied by lymph node metastasis and distant metastasis, which affect the prognosis of breast cancer [33].

The site of metastasis is closely correlated with the prognosis of breast cancer. In patients with luminal A breast cancer, metastasis occurs in the bone, liver, and brain. Bone is the most common site of metastasis in luminal A breast cancer [34]. Parkes et al. reported that the median survival time with bone-only metastasis is 7.54 years [35], while Wang et al. reported that the median survival time with liver metastasis was 15 months [36]. Brain metastasis is associated with poor prognosis in luminal A breast cancer. Kim et al. reported that the median survival of patients with the luminal A subtype and brain metastasis was 12 months, and 14 months for patients with brain metastasis rather than visceral metastasis [37]. Our nomograms showed that patients with brain metastasis have the highest risk of death compared with those with liver or bone metastasis (Fig. 2). In our Kaplan-Meier curves, patients with bone metastasis had a median survival time similar to that of patients with liver metastasis in lymph node-positive, luminal A breast cancer, and brain metastasis led to the shortest median survival time (Fig. 5K–P). One reason why brain metastasis contributes to poor prognosis is that 80% of patients with breast cancer and brain metastases also have extracranial disease [38]. Additionally, patients with brain metastases are often at a later stage of the disease. These factors contribute to the overall poor prognosis of patients with brain metastases.

The course of treatment also affects the prognosis of patients with lymph node-positive, luminal A breast cancer. A study by Xue et al. reported that the OS of patients who underwent surgery was significantly longer than that of patients treated without surgery (34 months versus 23 months, respectively) [39]. Surgery with axillary lymphadenectomy can improve the survival of patients with lymph node-positive breast cancer [21].

Radiotherapy is another important treatment option. A study shows that patients with luminal A breast cancer have the highest benefit of radiotherapy compared with other subtypes. This is due to the fact that luminal A breast cancers are radiosensitive, thus resulting in better response to the treatment, a reduced risk of recurrence, and increased survival [40].

Chemotherapy is another common treatment option for breast cancer patients. For patients with lymph node-positive, luminal A breast cancer, the National Comprehensive Cancer Network (NCCN) guidelines recommend chemotherapy regardless of the number of positive nodes [41]. Previous studies have also shown that patients with lymph node-positive, luminal A breast cancer can benefit from chemotherapy, which can prolong OS [42, 43]. However, not all patients with lymph node-positive, luminal A breast cancer will benefit from chemotherapy. The National Surgical Adjuvant Breast and Bowel Project (NSABP) B20 and Southwest Oncology Group (SWOG) 8814 studies suggested that the need for chemotherapy can be determined by the Oncotype DX 21-gene recurrence score (RS) [44, 45]. The SWOG 8814 study showed that postmenopausal women with lymph node-positive, luminal A breast cancer with low (< 18) or moderate (18 < RS < 31) recurrence scores do not benefit from chemotherapy [44]. In our work, surgery was the most important treatment option compared with chemotherapy and radiotherapy, according to the nomogram (Fig. 2). According to the Kaplan-Meier curves, radiotherapy and surgery prolonged both OS and BCSS, but chemotherapy did not improve BCSS as significantly as it improved OS (Fig. 5S–X).

Other factors, such as breast feeding, marital status, and exposure to certain drug extracts, may also affect the prognosis of the patients with luminal A, lymph node-positive breast cancer. A previous study reported that breastfeeding may decrease the risk of breast cancer [46], and another study reported that breastfeeding may be related to the occurrence and prognosis of breast cancer [47]. With regard to marital status, a systematic review reported that unmarried women are more likely to develop advanced stage breast cancer, and that a spouse may represent an advantage for providing practical assistance and support that may lead to the early detection of the breast cancer [48]. Lastly, the hydroalcoholic extract of garden sage has been shown to inhibit the angiogenesis of breast cancer cell lines, thereby potentially improving the prognosis of patients with breast cancer [49].

Our study has established a prognostic model for patients with luminal A, lymph node-positive breast cancer, and our verification has shown that it has high accuracy and sensitivity. Besides, compared with previous similar studies, we have included the treatment method as a predictive factor, which could provide references for clinicians to choose appropriate treatment options for these patients. However, our study has several limitations. First, deviations due to race may exist in the study population due to the fact that most of the population in the SEER database is Caucasian. Therefore, whether our nomogram is applicable in other regions outside the United States of America needs to be investigated. Second, although internal and external validations were used to evaluate the performance of the nomograms, validating the nomograms in cohorts outside of the SEER program is still needed. Third, the SEER database lacks information about targeted therapy and endocrine therapy, so the effect of these treatments on the prognosis of patients with lymph node-positive, luminal A breast cancer could not be determined. Lastly, due to the lack of information in the SEER database, unknown information on chemotherapy and radiotherapy may affect the accuracy of our predictions. Therefore, further prospective studies are needed to guarantee the performance of our nomograms [21].

Conclusion

Based on information from the SEER database, nomograms were built to predict survival for patients with lymph node-positive, luminal A breast cancer. Compared with previous studies, this is the first nomogram that incorporates treatment as a predictor of the prognosis of luminal A, lymph node-positive breast cancer. Our validation analysis showed that the actual and nomogram-predicted survival probabilities were consistent and that our nomogram displays good discrimination. However, most of the population in the SEER database is Caucasian, and the lower proportion of black and Asian patients may affect the sensitivity and accuracy of the nomogram in these populations. The nomograms may provide clinicians with more information about the risk associated with each prognostic factor and may assist clinicians in choosing treatments that will increase the 1-, 3-, and 5-year OS and BCSS of patients with lymph node-positive, luminal A breast cancer.