Background

Breast cancer is the leading cause of cancer mortality and morbidity among women [1]. In 2020, over two million women were diagnosed with breast cancer worldwide [2]. According to the World Health Organization report, breast cancer was the most common cancer among Iranians in 2018 [3]. The age-standardized rate of breast cancer in Iran was 35.7 per 100,000 people in 2016, and the number is expected to increase by 63% by 2025 [4]. In addition, the age at onset of breast cancer in Iran is reported to be 10 years lower than in the rest of the world, so it has been suggested that screening of women start at earlier ages [5, 6].

Based on RNA sequencing analysis and immunohistochemistry studies, breast cancer is routinely clustered into several molecular subtypes: luminal A, luminal B, HER2-enriched, and basal-like [7, 8]. Since Ki-67 and cytokeratin testing are not commonly reported in many countries, breast cancer subtypes are typically classified based on the presence of estrogen receptors (ER), progesterone receptors (PR), and human epidermal growth factor receptor-2 (HER2). According to hormone receptor (HR: ER and/or PR) and HER2 status, there are four main molecular subtypes: HR+/HER2+, HR+/HER2−, HR−/HER2+ and HR−/HER2− [9]. Studies have shown that the biological and clinical characteristics of breast cancer patients, such as risk factors, recurrence and metastasis risk, and survival outcomes, are associated with molecular subtype [10, 11]. This classification is an important prognostic factor for the survival of patients with breast cancer and guides systemic therapy [12]. Several studies have assessed the prognostic effect of biomarkers on breast cancer survival; however, the results vary between countries, and the evidence on the impact of subtypes in developing countries is limited.

Most common survival models assume that all subjects will eventually experience the event of interest (such as recurrence, metastasis, or death) [13]. Nowadays, however, owing to progress in cancer treatment, a fraction of patients may never experience the event. These patients are considered cured, or long-term survivors, and have survival similar to that of the general population [14].

Cured patients survive longer than uncured patients and have a better quality of life because they no longer suffer from complications. Therefore, in cancer therapy, curing the cancer is a more important goal than prolonging patients' survival time. Furthermore, the presence of cured patients in standard survival models increases censoring, leading to an overestimated survival rate. Cure models are specialized survival analysis methods that can be used in these cases [15, 16]. The cure model estimates the odds of cure (long-term survival) and the survival rate of uncured patients (short-term survival). Moreover, the factors related to patients' survival time and cure probability can be assessed separately with this model, giving better insight into the factors affecting patients' survival. In this study, we aimed to compare the demographic and clinical characteristics, survival rates, and cure probability of patients with stage I-III breast cancer across molecular subtypes.

Methods

Sample selection

This retrospective cohort study was performed on 1287 women diagnosed with breast cancer and admitted to Motamed Cancer Institute, Tehran, Iran. Women with pathologically confirmed primary invasive breast cancer who underwent surgery from 2000 to 2016 were enrolled in the study and followed until 2018. To record survival status, patients were followed up as completely as possible by mail or phone contact, or they were referred for an in-person clinic visit. If the patient had died by the time of contact, information was provided by available family members. The analysis was limited to women aged > 20 years who were diagnosed with stage I-IIIC (AJCC TNM staging) breast cancer. Patients with primary metastatic breast cancer, those with more than one missing variable, and those with missing information on hormone receptor or HER2 status were excluded. Individuals who were alive at the end of the study, or whose final status was not available and who could not be contacted (withdrawals), were considered censored. The study was approved by the Tarbiat Modares University Ethics Committee, Tehran, Iran (code number: IR.MODARES.REC.1397.278).

Prognostic factors

Several variables were selected and analyzed based on the expert physicians' opinions and literature review. The variables included age at diagnosis, body mass index (BMI), education level, marital status, family history of cancer (any type of cancer in first- or second-degree relatives), menopausal status, pathology type, tumor size, histologic grade, stage at diagnosis (Breast Cancer Adjusted AJCC Cancer Staging Manual 6th Edition), lymph node status, ER, PR, HER2, type of surgery, hormone therapy, radiotherapy, adjuvant chemotherapy, recurrence status (local and/or contralateral recurrence), and distant metastasis status.

Since Ki-67 had not been reported in many immunohistochemical reports, we categorized breast cancer into four molecular subtypes based on HR and HER2 status: HR-positive/HER2-positive, HR-positive/HER2-negative, HR-negative/HER2-positive, and HR-negative/HER2-negative (triple-negative). The molecular subtypes were defined by immunohistochemical staining for HR (ER and/or PR) and HER2 markers. Tumors with ≥ 1% ER- and/or PR-positive tumor cells were defined as HR-positive. HER2 scores of 0 or 1+ were defined as negative and 3+ as positive; HER2 2+ cases were classified as negative or positive by fluorescence in situ hybridization (FISH).
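
The classification rule described above can be sketched in a few lines of Python. This is only an illustration of the stated thresholds, not the code used in the study: the function names, the argument names, and the choice to raise an error when a HER2 2+ case has no FISH result are all assumptions made for the example.

```python
from typing import Optional


def is_hr_positive(er_percent: float, pr_percent: float) -> bool:
    """HR-positive if ER and/or PR staining is seen in >= 1% of tumor cells."""
    return er_percent >= 1.0 or pr_percent >= 1.0


def is_her2_positive(ihc_score: int, fish_amplified: Optional[bool] = None) -> bool:
    """HER2 status from the IHC score (0, 1, 2, 3), using FISH for 2+ cases."""
    if ihc_score in (0, 1):
        return False
    if ihc_score == 3:
        return True
    if ihc_score == 2:
        # Assumption: an equivocal (2+) score without FISH cannot be classified.
        if fish_amplified is None:
            raise ValueError("HER2 2+ requires a FISH result")
        return fish_amplified
    raise ValueError(f"unexpected HER2 IHC score: {ihc_score}")


def molecular_subtype(
    er_percent: float,
    pr_percent: float,
    ihc_score: int,
    fish_amplified: Optional[bool] = None,
) -> str:
    """Return one of the four HR/HER2 subtype labels (ASCII '-' for negative)."""
    hr = "HR+" if is_hr_positive(er_percent, pr_percent) else "HR-"
    her2 = "HER2+" if is_her2_positive(ihc_score, fish_amplified) else "HER2-"
    return f"{hr}/{her2}"
```

For instance, a tumor with 0% ER, 0% PR and a HER2 score of 1+ falls in the HR-/HER2- (triple-negative) group under this rule.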

This study has two kinds of outcomes: survival time and cure status. For survival time, disease-free survival (DFS) was defined as the time in months from the date of diagnosis to recurrence, metastasis, death, or last follow-up, while overall survival (OS) was defined as the time from diagnosis to death from any cause or last follow-up. The second outcome is the cure status of patients, a latent variable that is specified by the model. Intuitively, the cure fraction can be estimated from the Kaplan–Meier curve. We first determined the cure time of the patients using this curve. Patients who survived longer than this time were identified as cured (long-term survivors), and the others as uncured (short-term survivors).

Statistical analysis

The variables were described using frequencies, means and standard deviations. The distributions of clinicopathologic characteristics and therapeutic regimens were compared among molecular subtypes using the chi-square test. Survival curves were then estimated by the Kaplan–Meier method, and the cure time and cure fraction were determined. Kaplan–Meier DFS curves for stage I-III patients were prepared according to tumor subtype and compared with log-rank tests. Finally, univariate and multivariate mixture cure models were fitted to investigate the impact of various factors on DFS.
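
To make the Kaplan–Meier step concrete, the following is a minimal pure-Python sketch of the product-limit estimator, together with a helper that reads off the height of the curve's tail as a rough, intuitive cure-fraction estimate. The analyses in this study were run in SAS and R; the function names here are hypothetical, and treating the last value of the curve as the plateau is a simplifying assumption that is only reasonable when follow-up is long and the tail is flat.

```python
from typing import List, Tuple


def kaplan_meier(times: List[float], events: List[int]) -> List[Tuple[float, float]]:
    """Product-limit estimate.

    times  : follow-up time for each patient
    events : 1 if the event was observed at that time, 0 if censored
    Returns (event time, estimated survival probability) pairs.
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_events = 0
        n_leaving = 0
        # Group all patients sharing the same follow-up time.
        while i < len(data) and data[i][0] == t:
            n_events += data[i][1]
            n_leaving += 1
            i += 1
        if n_events > 0:
            survival *= 1.0 - n_events / n_at_risk
            curve.append((t, survival))
        # Events and censored patients both leave the risk set after time t.
        n_at_risk -= n_leaving
    return curve


def plateau_height(curve: List[Tuple[float, float]]) -> float:
    """Height of the curve's tail; a naive cure-fraction estimate (assumption)."""
    return curve[-1][1] if curve else 1.0
```

Patients censored at the same time as an event are kept in the risk set for that event, which is the usual convention for tied times.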

The mixture cure model is an extension of the Cox proportional hazards model that assumes the population is a mixture of two groups: cured patients (long-term survivors) and uncured patients (short-term survivors). The cure fraction and its related factors were modeled by logistic regression, and the chance of cure was expressed as an odds ratio (OR). In the survival part, factors associated with survival time were investigated by the Cox model, and the risk of experiencing the outcomes was expressed as a hazard ratio (HR). The Kaplan–Meier curve was used to assess the cure fraction and the adequacy of the cure model. If the tail of the curve does not reach zero and shows a long, stable plateau with heavy censoring, this is evidence for the existence of cured patients and the adequacy of the follow-up period. All calculations were performed using SAS version 9.4 and R version 4.1.1 at a significance level of 0.05.
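
In symbols, the model described above can be written as follows. The notation is introduced here for illustration and is not taken from the paper: π(z) denotes the cure probability given covariates z, S_u(t | x) the survival function of uncured patients given covariates x, and S_0(t) their baseline survival. The logistic part is written for the probability of cure, in line with the paper's reading of the OR as odds of cure; some software instead parameterizes the probability of being uncured, which inverts the OR.

```latex
% Population survival as a mixture of cured and uncured patients
S_{\text{pop}}(t \mid x, z) = \pi(z) + \bigl[1 - \pi(z)\bigr]\, S_u(t \mid x)

% Cure part: logistic regression on covariates z
\operatorname{logit} \pi(z) = \log\frac{\pi(z)}{1 - \pi(z)} = \gamma_0 + \gamma^{\top} z,
\qquad \mathrm{OR}_j = e^{\gamma_j}

% Survival part: Cox proportional hazards model for uncured patients
S_u(t \mid x) = S_0(t)^{\exp(\beta^{\top} x)},
\qquad \mathrm{HR}_k = e^{\beta_k}
```

Because S_u(t | x) tends to zero as t grows, the population survival curve levels off at π(z), which is why a long, stable plateau in the Kaplan–Meier curve is read as evidence of a cured fraction.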

Results

Baseline characteristics

A total of 1287 women with non-metastatic breast cancer, aged 21 to 87 years, entered the study, of whom 200 (15.5%) died and 1087 (84.5%) were censored. The mean age of patients was 47.00 ± 10.72 years; 62% of patients were < 50 and 25.9% were < 40 years old. The characteristics of patients and the differences in clinicopathologic and treatment features across molecular subtypes are summarized in Table 1. Lymph nodes were positive in 64% of cases, and 68% of patients were diagnosed at early stages (stages I and II). Among patients with available chemotherapy information, 842 received adjuvant chemotherapy, 259 received neoadjuvant chemotherapy, 99 received both, and 8 received no chemotherapy. We therefore divided patients into those who received adjuvant chemotherapy and those who did not. All patients underwent surgery, and most received adjuvant chemotherapy (73%), radiotherapy (90%), and hormone therapy (80%). The proportions of patients by tumor subtype were 20%, 52%, 12% and 16% for HR+/HER2+, HR+/HER2−, HR−/HER2+ and HR−/HER2−, respectively.

Table 1 Demographic, clinicopathologic and treatment features of breast cancer patients by molecular subtypes

As shown in Table 1, clinicopathologic and treatment patterns differed by molecular subtype. Patients with HR−/HER2+ status tended to have larger tumors (P = 0.001), higher histologic grade (P < 0.001), higher stage (P = 0.038) and positive lymph nodes (P < 0.001). In contrast, smaller tumor size (P = 0.001), lower stage (P = 0.038) and lower grade (P < 0.001) were more common in HR+/HER2− patients. The lowest rate of positive lymph nodes (about 50%) was observed in triple-negative breast cancer (TNBC) patients (P < 0.001).

Almost half of the women were treated by breast-conserving surgery (BCS); however, in HR−/HER2+ patients, modified radical mastectomy (MRM) was the more prevalent procedure (P = 0.001). Hormone therapy differed significantly between subtypes: virtually all HR-positive patients received hormone therapy, compared with less than 40% of HR-negative patients. In addition, nearly all patients received radiotherapy and adjuvant chemotherapy, although adjuvant chemotherapy was more common in HER2− patients (P = 0.022).

Figure 1 shows recurrence, metastasis and mortality rates by molecular subtype according to treatment type. In all subgroups, patients who underwent MRM experienced higher rates of mortality, metastasis, and recurrence than those who underwent BCS (Fig. 1a). Furthermore, patients who received hormone therapy, particularly those with HR+/HER2+ tumors, had lower mortality and metastasis rates, while those with HR−/HER2+ tumors had a higher mortality rate (Fig. 1b). The mortality rates of patients who received radiation therapy and/or adjuvant chemotherapy were nearly identical to those of other patients, as shown in Fig. 1c and d. HR-negative patients who received radiation therapy had lower recurrence and metastasis rates, and those who received adjuvant chemotherapy had higher recurrence and metastasis rates.

Fig. 1 Recurrence, metastasis and mortality rates of breast cancer patients according to (a) type of surgery, (b) hormone therapy, (c) radiotherapy, (d) adjuvant chemotherapy

Survival outcome

The Kaplan–Meier method estimated a mean OS of 170.09 ± 3.11 months and a mean DFS of 155.2 ± 3.32 months. In addition, the 1-, 5-, 7-, 10-, and 15-year OS rates were estimated at 98%, 86%, 79%, 72%, and 67%, respectively. The Kaplan–Meier DFS curve with its 95% confidence interval is shown in Fig. 2.

Fig. 2 Kaplan–Meier disease-free survival curve, in months, with 95% confidence limits for breast cancer patients

The survival curve is almost flat after about 150 months. The cure fraction is estimated at about 60% (the height of the plateau of the curve above zero on the survival probability axis). Patients who survived longer than 150 months were intuitively considered cured, or long-term survivors.

The 5-year DFS was 77.3% in HR+/HER2+, 84.2% in HR+/HER2−, 62.3% in HR−/HER2+, and 76.5% in TNBC patients (Table 2). In general, the DFS of breast cancer subtypes was associated with tumor stage (Fig. 3). DFS did not differ significantly among molecular subtypes in stage I and stage III breast cancer (Fig. 3a and c). By contrast, in stage II, HR−/HER2+ patients had the worst prognosis, with a 5-year DFS of 57.4%, and HR+/HER2− patients had the best DFS, at 87.3%. The estimated 5-year DFS in stage II TNBC was 79.2%, which was lower than that in the HR+/HER2− group (Table 2).

Table 2 The OS and DFS of breast cancer molecular subtypes in different stages of the disease

Fig. 3 DFS of (a) stage I, (b) stage II, (c) stage III breast cancer by molecular subtype

Prognostic factors of DFS

We fitted univariate and multivariate Cox mixture cure models to analyze the prognostic factors for short-term and long-term DFS of breast cancer patients, as shown in Table 3.

Table 3 Prognostic factors of breast cancer DFS using the Cox mixture cure model

The effects of factors on patients' survival time and cure probability are expressed by the HR and OR, respectively. An HR greater than one indicates that the variable is a risk factor for survival, and an HR less than one indicates a protective factor. Conversely, an OR greater than one indicates that the variable increases the odds of cure (a protective factor), and an OR less than one indicates that it decreases them (a risk factor). The farther the value is from one, the greater the effect.
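
A short numerical sketch may help in reading these two measures. The helpers below are illustrative only: the function names are hypothetical, and the baseline values in the usage note are made up for the example rather than taken from this study. The first function applies an OR to a baseline cure probability; the second applies an HR to a baseline survival probability under the proportional hazards assumption.

```python
def apply_odds_ratio(baseline_probability: float, odds_ratio: float) -> float:
    """Cure probability after multiplying the baseline odds of cure by an OR."""
    if not 0.0 < baseline_probability < 1.0:
        raise ValueError("baseline_probability must lie strictly between 0 and 1")
    if odds_ratio <= 0.0:
        raise ValueError("odds_ratio must be positive")
    odds = baseline_probability / (1.0 - baseline_probability)
    new_odds = odds * odds_ratio
    return new_odds / (1.0 + new_odds)


def apply_hazard_ratio(baseline_survival: float, hazard_ratio: float) -> float:
    """Survival probability under proportional hazards: S(t) = S0(t) ** HR."""
    if not 0.0 <= baseline_survival <= 1.0:
        raise ValueError("baseline_survival must lie between 0 and 1")
    if hazard_ratio <= 0.0:
        raise ValueError("hazard_ratio must be positive")
    return baseline_survival ** hazard_ratio
```

With a hypothetical baseline cure probability of 0.50, an OR of 0.25 lowers the odds of cure from 1 to 0.25, that is, a cure probability of 0.20. Likewise, with a hypothetical baseline survival of 0.81 among uncured patients at some time point, an HR of 2 lowers it to 0.81 squared, about 0.66.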

In univariate analysis, the hazard ratios for DFS in the TNBC, HR+/HER2+, and HR−/HER2+ subtypes were 1.78, 1.53 and 1.38, respectively, compared with HR+/HER2− as the reference subtype. However, only the HR−/HER2+ subtype was significantly associated with the cure probability (OR = 0.38). After adjusting for demographic and clinical factors, the risk of mortality, metastasis, and recurrence was not significantly associated with molecular subtype in the short term. Still, the cure probability of HR−/HER2+ patients was significantly lower than that of HR+/HER2− patients (OR = 0.22).

Other factors associated with worse short-term DFS included larger tumor size (HR = 1.99) and positive lymph nodes (HR = 1.58). Furthermore, a high level of education (HR = 0.7), BCS (HR = 0.55), and adjuvant chemotherapy (HR = 0.55) had a protective effect on DFS.

In the long term, married women were more likely to be cured (OR = 1.62). In contrast, the cure probability of women with obesity (OR = 0.41), postmenopausal status (OR = 0.63), a family history of cancer (OR = 0.66), IDC pathology (OR = 0.26), or advanced tumor stage (OR = 0.21) was significantly lower than that of the others.

Discussion

In modern medicine, predicting the patient's prognosis is essential to avoid overtreatment or undertreatment [17]. In the present study, molecular subtypes were used to assess breast cancer patients' survival with a mixture cure model. An advantage of this model is that it identifies covariate effects on cure probability and on the survival of patients separately.

In our study, the mean age of patients was 47.00 ± 10.72 years, which was similar to the mean age previously reported in a meta-analysis of 24 Iranian breast cancer studies [18] and a review conducted among Asian countries [19]. In addition, based on our results, the proportion of patients younger than 50 in Iran was much higher than in high-income countries, such as the USA [20], Germany [21], and the Netherlands [22]. Furthermore, 26% of patients were young (less than 40 years old), while most other countries had a smaller proportion of young-onset patients [23, 24]. This discrepancy might be related to differences in disease patterns or age distributions; the median ages of women in Iran and Europe are 32 and 45.5 years, respectively [25]. Accordingly, this result emphasizes the value of local data in developing national mammography guidelines that recommend starting screening of women before the age of 50. Our results revealed no significant relationship between age at diagnosis and breast cancer patients' survival rate or cure probability. However, some studies have reported a significant association [26,27,28].

HR+/HER2− was the most frequent subtype, with a prevalence of 51.1%, which is lower than that reported for developed countries such as Canada (64.8%) [11] and the USA (66.6%) [12]. In this study, the frequencies of HR+/HER2+, HR−/HER2+, and TNBC were 20.3%, 12.7%, and 15.9%, respectively. Other studies from the USA (6% HR−/HER2+; 17.4% TNBC) [29] and China (6.8% HR−/HER2+; 18.3% TNBC) [30] reported a lower prevalence of HR−/HER2+ but almost the same prevalence of TNBC. Furthermore, a prior study in Iran reported a prevalence of 61% for HR+/HER2−, 8.1% for HR−/HER2+, and 23% for TNBC [31]. Several factors may explain these disparities, including differences in age distribution, genetic variation, or classification criteria for molecular subtypes.

The 5-year OS rate was 86.3% for all patients. Before 2000, the survival rates of Japanese, Korean, Turkish, and Arab women were 88.1%, 83.7%, 76.7%, and 64.5%, respectively [32]. Our rate is slightly lower than the 5-year OS of 91% in the USA [33] and 89.7% in Brazil [34]. Considering that our breast cancer research center is a referral center in Iran, it seems that the 5-year survival rate in Iran is higher than that in many Asian counterparts but lower than that in American countries. Even without a national screening policy, and with limited availability of some new therapeutic modalities, our survival rate is within an acceptable range and is expected to improve.

DFS and OS rates differed significantly between subtypes based on the log-rank test, ranging from highest to lowest as follows: HR+/HER2− (90.2% OS; 84.2% DFS), HR+/HER2+ (83.4% OS; 77.3% DFS), TNBC (81.0% OS; 76.5% DFS), and HR−/HER2+ (77.8% OS; 62.3% DFS); this ordering has been confirmed by some other studies [30, 31]. HR-positive tumors showed better survival than HR-negative tumors. In addition, we detected a relevant difference in OS and DFS according to breast cancer subtype in stage I-III disease. In stages I and II, the 5-year DFS of HR−/HER2+ patients was significantly lower than that of the others. Stage III disease generally had the worst prognosis, and the difference between subtypes in these patients was not statistically significant. The DFS of the HR+/HER2− subtype was high in all stages, in line with findings in America and Europe [35,36,37]. These results suggest that comparing subtypes at different stages of the disease can provide more accurate information.

According to our study, the HR−/HER2+ subtype had the worst prognosis. Although some studies have found similar results [30, 31], this differs from the common belief that TNBC has the worst prognosis [37,38,39]. Possible reasons are the larger tumor size, more advanced stage, more frequent lymph node involvement, or suboptimal treatment in these patients. Newer HER2-directed therapies, such as pertuzumab and trastuzumab, as well as T-DM1 and tyrosine kinase inhibitors (TKIs), may improve HER2+ patients' outcomes [40].

This research revealed that tumor characteristics and survival rates varied with molecular subtype. As in other studies, a larger proportion of tumors under 2 cm, stage I tumors, and tumors of low histologic grade were found in HR+/HER2− patients [30, 31]. By contrast, HR−/HER2+ tumors tended to be larger, of higher stage and higher histologic grade, and lymph node positive. Even though these results differ from some earlier studies that reported worse tumor features in TNBC [38, 41], they are consistent with Chinese [30] and Canadian [12] studies that observed the worst tumor features in HR−/HER2+ patients. As suggested in the literature [31, 42], breast cancer patients with TNBC have fewer positive lymph nodes, which suggests that this cancer is more likely to spread through the blood than through the lymphatic system. Further research is needed to clarify the mechanisms of this molecular subtype.

The results of the multivariate analysis showed that several clinicopathological factors, such as larger tumor size, IDC pathology and more advanced stage of the disease, are associated with breast cancer survival. Some other studies have emphasized the importance of these factors [30, 38]. Furthermore, nodal status remains one of the most essential risk factors for survival and metastasis, as our study also showed. Therefore, node status is also an important determinant in decision-making for breast cancer treatment. Obesity, menopause, and a family history of cancer were also identified as risk factors for patients' survival, while higher education, adjuvant chemotherapy, BCS, and being married were protective. It therefore seems that postmenopausal women and those with a family history of cancer should be given more attention in screening policies. Lifestyle changes and diet modification may also help improve patients' survival.

After adjusting for pathological and demographic variables, short-term survival was not significantly different between the molecular subtypes, but in the long-term, the cure probability of HR−/HER2+ patients was much lower than the others. In other words, the short-term survival differences between subtypes are mostly due to pathological and demographic variables. However, even after adjusting for these variables, HR−/HER2+ patients still are less likely to be cured in the long-term.

The current study also presents an analysis of the distant metastasis rate (DMR) and recurrence rate (RR) in breast cancer by molecular subtype. For both DMR and RR, patients with HR+/HER2− tumors had the most favorable prognosis, with DMR and RR of only 7.1% and 2.6% at 5 years. Conversely, HER2-positive and TNBC tumors exhibited the highest rates of DMR (14.1% and 9.3%) and RR (10.4% and 4.9%). Our results are similar in several respects to those of a retrospective cohort study of young Asian breast cancer patients, which found that HR−/HER2+ tumors had the highest recurrence and metastasis rates, whereas HR+/HER2− tumors displayed the lowest locoregional recurrence rate (LRR) [30]. Another study of 10-year recurrence in European breast cancer patients reported that those with HR−/HER2+ tumors and TNBC had significantly higher recurrence and metastasis rates than those with HR-positive profiles, which is in agreement with our findings [22]. A Canadian study showed that, after a median follow-up of 6.9 years, 8% of HER2-positive cancers had a local recurrence, compared with 12% of TNBC. It also showed that distant metastases occurred in 19.2% of HER2-positive cancers and 27.4% of TNBC, and that HR+ patients had the best prognosis for local recurrence and distant metastasis [43].

In this study, we tried to predict breast cancer patients' short- and long-term survival with robust statistical methods. Our study has some limitations. First, we classified molecular subtypes based on HR and HER2 status without additional biomarkers, such as Ki-67 and cytokeratin levels. Second, the medical records we used did not include detailed information on different therapies, such as endocrine therapies, HER2-directed therapies, or chemotherapy. Adjusting for these factors might reduce confounding effects and improve our knowledge of the survival of breast cancer molecular subtypes. Third, since there was a large amount of missing data, especially for ER, PR and HER2, we had to use the limited data available. The results should therefore be interpreted with caution.

Conclusion

In conclusion, our study showed that HR+/HER2− breast cancer is the most prevalent tumor subtype, similar to other countries. It had the best prognosis, followed by HR+/HER2+, triple-negative, and HR−/HER2+ breast cancer. Based on the mixture cure model, prognostic variables showed different relationships with DFS in the short term and with cure probability in the long term. Accordingly, even after controlling for variables such as tumor stage, tumor size, positive lymph nodes, and obesity, HR−/HER2+ patients still had a lower cure probability in the long term. Starting screening of women at a younger age and at shorter intervals to detect patients in the early stages, and using new targeted therapies based on molecular subtypes, may improve breast cancer patients' survival and cure probability.