Introduction

De novo metastatic breast cancer refers to breast cancer that presents with distant metastasis at the initial diagnosis and carries an inferior prognosis, with a 5-year survival rate of less than 30%; patients with de novo metastatic disease account for approximately 5% of the entire breast cancer population1,2. Although it is treatable given the advances in novel therapeutics, de novo metastatic disease tends to be incurable and remains a therapeutic challenge in clinical practice. Among patients with de novo metastatic disease, the occurrence of lung metastasis is estimated to be 21–77%, making the lung one of the most common sites of cancer spread3,4,5.

Despite the notable prevalence of lung metastatic breast cancer (LMBC), few studies have evaluated the presentations of patients with LMBC. Hence, the clinicopathological features and prognostic profiles remain unclear. Previous studies have reported preliminary findings regarding the molecular subtypes and lung metastasis6. However, this association has not been adequately studied because of insufficient clinical outcome data and follow-up. In addition, the tumor burden at initial diagnosis could be a critical factor, and the metastatic pattern is a significant component of cancer management and survival prediction in de novo disease. However, few studies have focused on this profile in the LMBC population. Moreover, surgical intervention is the predominant treatment for early breast cancer, but its prognostic benefits have not been adequately determined for de novo metastatic disease7,8,9. Therefore, the prognostic value of surgery in the therapeutic course of patients with LMBC should be clarified.

We conducted this study to comprehensively discuss the clinicopathological and prognostic characteristics of patients with LMBC to assess the associations between clinical outcomes and molecular subtypes, metastatic patterns, and surgical performance. We further aimed to establish a prediction model for the individual estimation of survival probabilities of patients with LMBC to provide promising evidence and reference for the introduction of individual therapeutics for patients with LMBC in clinical practice.

Methods

Population

Data on patients diagnosed with breast cancer between January 1, 2010 and December 31, 2016 were obtained from the Surveillance, Epidemiology, and End Results (SEER) database. Patients who were newly diagnosed with LMBC and had no missing clinicopathological and survival data were assessed for eligibility. Patients were excluded if (1) tumor grade; molecular subtypes; and the status of estrogen receptor (ER), progesterone receptor (PgR), and human epidermal growth factor receptor 2 (HER2), in addition to that of visceral metastases, were unknown and (2) tumor size and node involvement were not evaluated. Data analyses were performed in December 2020.

The following information was extracted for the selected cohort: age at diagnosis, sex, race, laterality, histologic type, grade, molecular subtypes, immunohistochemical status (ER, PgR, and HER2), tumor size, node involvement, visceral metastases, performance of surgery, radiotherapy, and chemotherapy. This study was conducted in accordance with the Strengthening the Reporting of Observational Studies in Epidemiology guidelines10 and the Transparent Reporting of a Multivariable Prediction Model for Individual Prognosis or Diagnosis statement11.

Outcome

LMBC was defined as de novo metastatic breast cancer presenting with lung metastasis with positive histological confirmation. The differences in clinicopathological features and prognosis were compared among the molecular subtypes, which were classified into four categories: hormone receptor (HR)-positive/HER2-negative (HR+/HER2−), HR-positive/HER2-positive (HR+/HER2+), HR-negative/HER2-positive (HR−/HER2+, HER2), and HR-negative/HER2-negative (HR−/HER2−, TN). Overall survival (OS) was defined as the interval between the initial diagnosis of breast cancer and death from any cause. According to SEER terminology, visceral metastases involve the liver and brain. The American Joint Committee on Cancer 7th edition guidelines were adopted to define the tumor–node–metastasis stage of breast cancer.

Statistical analysis

Comparative analysis of baseline characteristics was performed using Pearson’s chi-square test and Fisher’s exact probability test for qualitative data and the t-test or Wilcoxon rank test for quantitative data with a normal or non-normal distribution, respectively. Survival outcomes were compared using the Kaplan–Meier method with log-rank tests. Patients were randomly assigned to the training and validation cohorts in a 7:3 ratio to establish and externally validate the model. Prognostic factors were identified by sequential univariate and multivariate Cox proportional hazards regression analyses, and the identified factors were used to develop a nomogram for estimating the 2- and 5-year survival probabilities. The discriminative and calibrating capabilities of this nomogram were evaluated both internally and externally using the concordance index (C-index) and calibration curves with bias-corrected validation under 1000 bootstrap resamples. A C-index of 0.5 indicated agreement by chance, and a C-index of 1 indicated perfect discrimination. All statistical tests were two sided, with P < 0.05 considered statistically significant, and were performed using IBM SPSS Statistics (version 26.0; IBM Corp., Armonk, NY) and R software (version 3.6.4, www.r-project.org/).
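The 7:3 random allocation described above can be illustrated with a minimal Python sketch. The function name `split_cohort`, the integer patient identifiers, and the seed are assumptions made for illustration; the study itself used SPSS and R, and its actual randomization procedure is not reported.

```python
import random

def split_cohort(patient_ids, train_parts=7, total_parts=10, seed=2020):
    """Randomly split patient IDs into training and validation sets.

    The seed is an arbitrary, hypothetical value used only to make the
    example reproducible.
    """
    ids = list(patient_ids)
    rng = random.Random(seed)
    rng.shuffle(ids)
    # Integer arithmetic avoids floating-point rounding of 0.7 * n.
    n_train = len(ids) * train_parts // total_parts
    return ids[:n_train], ids[n_train:]

# 4310 eligible patients, represented here by placeholder integer IDs.
train_ids, valid_ids = split_cohort(range(4310))
print(len(train_ids), len(valid_ids))  # 3017 1293
```

With 4310 eligible patients, a 7:3 split yields 3017 and 1293 individuals, which matches the cohort sizes reported in the Results.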

Results

Among the 7746 initially identified patients with LMBC, 4310 were finally eligible (Supplementary Fig. 1). The population demographics and baseline clinicopathological characteristics are presented in Supplementary Table 1.

Clinical outcomes associated with molecular subtypes

In total, 52.4% (2259/4310) of patients were HR+/HER2−, 17.6% (757/4310) were HR+/HER2+, 10.8% (467/4310) were HR−/HER2+, and 19.2% (827/4310) were HR−/HER2−. Their baseline features are listed in Table 1. The median age at diagnosis in patients with HR+/HER2−, HR+/HER2+, HR−/HER2+, and HR−/HER2− subtypes was 64.0, 59.0, 59.0, and 62.0 years, respectively, and there was profound heterogeneity in the disease characteristics among them. Compared to luminal-like subtype disease, the HER2 and TN subtypes of LMBC presented a higher grade (P < 0.0001), a larger tumor size (P < 0.0001), a higher rate of node involvement (P < 0.0001), and a higher incidence of brain metastasis (P < 0.0001). Luminal-like subtype LMBC exhibited a higher rate of bone metastasis (P < 0.0001), while the HER2 overexpression subtype, including HR+/HER2+ and HR−/HER2+, tended to be associated with a relatively higher occurrence of liver metastasis (P < 0.0001).

Table 1 Population demographics and baseline characteristics of included patients associated with molecular subtypes.

Regarding prognosis related to molecular subtypes, the median OS was 35.0 months (95% confidence interval [CI] 30.1–39.9) in HR+/HER2+, 28.0 months (95% CI 26.0–29.9) in HR+/HER2−, 22.0 months (95% CI 18.1–25.9) in HR−/HER2+ and 11.0 months (95% CI 10.0–11.9) in HR−/HER2− subtypes, indicating a successively worse trend in overall prognosis (P < 0.0001; Fig. 1).
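For readers unfamiliar with how a median OS is read from a Kaplan–Meier curve, the following is a minimal pure-Python sketch of the product-limit estimator. The follow-up times and event indicators are invented toy values, not SEER data, and the function names are hypothetical; the confidence intervals reported above would require an additional variance estimate that is not shown.

```python
def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct time with at least one death.

    times: follow-up in months; events: 1 = death, 0 = censored.
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    surv = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Group all subjects sharing the same follow-up time.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths > 0:
            surv *= 1.0 - deaths / n_at_risk
            curve.append((t, surv))
        # Deaths and censored subjects both leave the risk set.
        n_at_risk -= leaving
    return curve

def median_survival(curve):
    """Smallest time at which S(t) <= 0.5, or None if never reached."""
    for t, s in curve:
        if s <= 0.5:
            return t
    return None

# Toy cohort of eight patients (illustrative values only).
times = [5, 8, 11, 11, 14, 20, 26, 30]
events = [1, 1, 1, 0, 1, 0, 1, 0]
curve = kaplan_meier(times, events)
print(median_survival(curve))  # 14
```

In this toy cohort the estimated survival falls from 0.625 to about 0.469 at 14 months, so 14 months is the median; the subtype-specific medians above are obtained in the same way from each subgroup's curve.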

Figure 1

Comparative analysis of OS associated with molecular subtypes. (R software version 3.6.4, www.r-project.org).

Clinical outcomes associated with metastatic patterns

The metastatic patterns of patients with LMBC were analyzed in terms of the number of cases and the corresponding survival. Overall, lung-only metastatic disease was the most frequent pattern (1555/4310, 36.1%), followed by lung and bone metastatic disease (1332/4310, 30.9%), with no statistically significant difference in the median OS between the two groups (P = 0.053; Supplementary Tables 2, 3). With respect to the number of metastatic sites, the overall prognosis consistently worsened with an increase in the number of involved organs (Supplementary Fig. 2A). For patients with malignancy involving two sites, a successively inferior tendency was detected in patients with additional bone, liver, and brain metastases (P < 0.0001; Supplementary Fig. 2B). However, no statistically significant difference was noted in the prognosis of patients with malignancy involving three sites (Supplementary Fig. 2C). In addition, patients with LMBC and brain metastasis exhibited the worst survival, and the additional involvement of the bone tended to exert little effect on the prognosis of patients with lung-only (P = 0.053); lung and liver (P = 0.621); and lung, liver, and brain metastasis (P = 0.648; Supplementary Table 3, Supplementary Fig. 2D).

Clinical outcomes associated with treatment

The prognostic benefits of surgery were assessed in patients with de novo LMBC. Regarding molecular subtypes, OS was consistently improved by surgery across HR+/HER2+, HR+/HER2−, HR−/HER2+, and HR−/HER2− subtype disease (Supplementary Fig. 3A–D), which was consistent with the prognostic outcomes of patients with lung-only and paired-organ metastases with bone, liver, and brain involvement (Supplementary Fig. 4A–D). For the entire LMBC population, OS was significantly improved by surgical intervention (P < 0.0001), and the comparative prognosis stratified by clinical characteristics is presented in Supplementary Table 4.

In addition, treatment patterns were compared in terms of survival benefits. Comparable effectiveness was detected between surgery plus chemotherapy (40.9 months, 95% CI 38.0–43.9) and surgery plus radiotherapy (42.0 months, 95% CI 35.2–48.8). Moreover, no additional benefit was observed with surgery plus chemotherapy plus radiotherapy. The surgery-based combination regimens were advantageous compared to the other treatment options, including surgery alone, chemotherapy alone, or chemotherapy plus radiotherapy.

Development and validation of the nomogram

Eligible patients were randomly allocated to the training and validation cohorts, which included 3017 and 1293 individuals, respectively. In the training cohort, the following prognostic factors were identified: age at diagnosis (P < 0.0001), race (P < 0.0001), histologic type (P = 0.001), tumor grade (P < 0.0001), molecular subtype (P < 0.0001), AJCC T stage (P = 0.006), bone metastasis (P < 0.0001), liver metastasis (P < 0.0001), brain metastasis (P < 0.0001), performance of surgery (P < 0.0001), and chemotherapy (P < 0.0001). These factors were collectively adopted to develop the prognostic model (Table 2). The nomogram showed that tumor grade, molecular subtype, and age at diagnosis had a relatively greater effect on prognosis. The points for each variable were obtained by locating its value on the corresponding point scale; the points were then summed, and a straight line was drawn down from the total points scale to estimate the 2-year and 5-year survival rates.
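The point-summing procedure can be sketched in Python under a common convention for Cox-based nomograms, in which the variable with the widest coefficient range is assigned 0–100 points and all other variables are scaled proportionally. Every number below (the coefficients, the three variables chosen, and the baseline 2-year survival of 0.80) is a hypothetical placeholder, not an estimate from Table 2, and the function names are illustrative.

```python
import math

# Hypothetical log-hazard coefficients per level (reference level = 0.0).
# Illustrative only; these are NOT the estimates reported in Table 2.
COEFS = {
    "subtype": {"HR+/HER2+": 0.0, "HR+/HER2-": 0.2, "HR-/HER2+": 0.4, "HR-/HER2-": 1.0},
    "surgery": {"yes": 0.0, "no": 0.5},
    "brain_mets": {"no": 0.0, "yes": 0.6},
}

def widest_range(coefs):
    """Largest spread of coefficients within any single variable."""
    return max(max(lv.values()) - min(lv.values()) for lv in coefs.values())

def points_table(coefs, max_points=100.0):
    """Scale coefficients so the widest-ranging variable spans 0..max_points."""
    scale = max_points / widest_range(coefs)
    return {
        var: {lvl: (b - min(lv.values())) * scale for lvl, b in lv.items()}
        for var, lv in coefs.items()
    }

def total_points(patient, table):
    """Sum the points of the patient's level for every variable."""
    return sum(table[var][patient[var]] for var in table)

def survival_from_points(total, baseline_survival, coefs, max_points=100.0):
    """Cox relation S(t | x) = S0(t) ** exp(lp), with lp recovered from points.

    baseline_survival is S0(t) for a patient with zero points.
    """
    lp = total * widest_range(coefs) / max_points
    return baseline_survival ** math.exp(lp)

table = points_table(COEFS)
patient = {"subtype": "HR-/HER2-", "surgery": "no", "brain_mets": "no"}
total = total_points(patient, table)
# 0.80 is a hypothetical baseline 2-year survival for the zero-point patient.
s2 = survival_from_points(total, baseline_survival=0.80, coefs=COEFS)
print(round(total, 1), round(s2, 3))
```

Drawing a line from the total points axis to the survival axis on a printed nomogram is the graphical equivalent of the last function: the total points determine the linear predictor, which in turn rescales the baseline survival.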

Table 2 Prognostic factors identified by uni- and multivariate Cox regression analyses in the training cohort.

The nomogram constructed for the estimation of 2- and 5-year survival in patients with LMBC is shown in Fig. 2. The overall C-index was 0.70 (95% CI 0.69–0.83) in the training cohort and 0.71 (95% CI 0.68–0.72) in the validation cohort, and the time-dependent C-index curves of the two cohorts showed that the values were consistently > 0.50, indicative of favorable discriminative power (Fig. 3A). Calibration plots of the two cohorts demonstrated good agreement between the actual and predicted 2- and 5-year survival probabilities, which suggested a satisfactory calibration capability (Fig. 3B,C). In summary, the newly established nomogram showed good performance for survival estimation in patients with LMBC.
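As an illustration of what the C-index measures, the following pure-Python sketch computes Harrell's concordance index on toy data. The inputs are invented, the function name is hypothetical, and this simple version does not reproduce the bootstrap bias correction or the time-dependent curves used in the study.

```python
def harrell_c_index(times, events, risk_scores):
    """Fraction of comparable pairs in which the higher-risk patient dies first.

    A pair (i, j) is comparable when patient i has an observed death
    strictly before patient j's follow-up time. Tied risk scores count 0.5.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if events[i] != 1:
            continue  # a censored patient cannot be the earlier event
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable

# Toy data: months of follow-up, death indicator, predicted risk score.
times = [5, 10, 15, 20]
events = [1, 1, 0, 1]
print(harrell_c_index(times, events, [4, 3, 2, 1]))  # 1.0, perfect ranking
print(harrell_c_index(times, events, [1, 1, 1, 1]))  # 0.5, no discrimination
```

Values of approximately 0.70, as reported for both cohorts, therefore mean that in roughly 70% of comparable patient pairs the model assigns the higher risk to the patient who dies earlier.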

Figure 2

Nomogram for individual estimation of 2- and 5-year survival probabilities in LMBC patients. (R software version 3.6.4, www.r-project.org).

Figure 3

Validation of nomogram in the training cohort and validation cohort. (A) The C-index curves of nomogram in both the training and validation cohorts. (B) Calibration curves of 2-year survival rates in the training and validation cohorts. (C) Calibration curves of 5-year survival rates in the training and validation cohorts. (R software version 3.6.4, www.r-project.org).

Discussion

To our knowledge, this is the first study to comprehensively discuss the clinical features and prognostic outcomes associated with molecular subtypes, metastatic patterns, and surgical intervention and to develop a robust prediction model for the estimation of individual prognosis of de novo metastatic breast cancer with lung involvement.

To illustrate the distinctive presentations associated with molecular subtypes, we first performed comparative analyses among the LMBC population with HR+/HER2+, HR+/HER2−, HR−/HER2+ and HR−/HER2− subtype disease. The percentage of TN and HER2 subtype disease was relatively higher in patients with LMBC than in the entire breast cancer population (approximately 10% vs. 4%)2, suggesting that the predisposition to lung metastasis is related to molecular subtype. An ascending tendency of lung involvement in TN and HER2 subtype breast cancer was noted in previous studies, with a recorded incidence of 20.8–35.0% and 22.9–45.0%, respectively12,13,14. In addition, we demonstrated that bone involvement tended to occur in luminal-like disease, while liver metastasis tended to occur in HER2 overexpression disease, which is consistent with the findings of previous studies that focused on de novo metastatic breast cancer12,15,16. The current evidence suggests that this kind of presentation can be independent of disease characteristics17, and our study demonstrated that the organ-specific metastasis remained stable in patients with initial lung metastasis. This type of subtype-associated predisposition could potentially constitute an intrinsic profile of breast malignancies and provide clinical implications for organ selectivity in the management of cancer metastasis.

We also assessed the heterogeneous prognosis among the different molecular subtypes of LMBC, and our results suggested that survival was most favorable in the HR+/HER2+ subtype, whereas patients with TN disease exhibited a worse prognosis than those with the other subtypes. It is well acknowledged that TN breast cancer presents the most unfavorable disease features, with a median OS of 10–13 months in de novo metastatic breast cancer18,19, which was in line with the survival outcomes reported in the present study. In contrast, patients with HR+/HER2+ LMBC had relatively favorable prognostic profiles, which could be the result of multiple treatment options for this subtype, including anti-HER2-targeted therapy and endocrine therapy. However, we could not further discuss the therapeutic influences on prognosis due to insufficient information on treatment in the SEER database.

This is the first study to show that distinctive survival outcomes are associated with metastatic profiles. We classified the metastatic patterns and further investigated the effects of the involved sites on the prognosis of LMBC. The prognosis gradually worsened as the total number of involved sites increased, and for patients with LMBC with paired metastatic sites, a successively inferior tendency was detected in lung involvement combined with bone, liver, and brain involvement. However, no statistically significant difference was revealed in patients with LMBC and three concurrent metastases. To further clarify the prognosis of patients with LMBC with diverse metastatic patterns, we performed a comparative analysis in the entire population. The corresponding results showed that patients with LMBC and brain metastasis had the worst survival, and the additional involvement of the bone did not worsen the overall prognosis. Although the metastatic patterns and prognostic correlations have been discussed in previous studies12,20,21, they tended to focus on the entire group of patients with de novo metastatic breast cancer instead of patients with LMBC. Therefore, the findings might not apply to patients with newly diagnosed lung involvement. In the current study, we conducted analyses in this specific cohort and reported novel findings of prognostic profiles associated with metastatic patterns, which can provide promising evidence for the clinical management of patients with LMBC.

Given the controversial role of surgical intervention in de novo metastatic breast cancer22,23,24,25, we comprehensively discussed the potential effects of surgery on the prognosis of LMBC. Surgery could prolong the OS of patients with LMBC independent of the molecular subtypes. For patients with LMBC with lung-only and paired metastases, this kind of survival benefit remained consistent. Collectively, resection of primary disease can improve the overall prognosis of patients with LMBC, and this benefit tended to vary with metastatic patterns, which was consistent with previous findings26. There is a promising rationale for this practice, and increasing evidence has emerged for surgery in de novo stage IV breast cancer27. However, we could not further elaborate on the correlations between surgery and metastatic patterns in specific breast cancer subtypes due to the limited sample size. Nor could we address the specifics of surgery, including surgical procedures, the optimal time point for surgery, and predictive biomarkers of the population most likely to benefit from surgical intervention, due to limited data in the database. In addition, because the overall prognosis could be influenced by a host of factors associated with cancer treatment and disease characteristics across therapeutic phases, physicians should use these findings with caution. However, considering the limited evidence for the prognostic value of surgical intervention for patients with LMBC, the current study could provide emerging evidence, and further studies should be conducted to investigate the associations between primary disease resection and surgical performance in the specified cohorts from the LMBC population.

To further quantify the estimation of individual prognosis, we developed a prediction model for the 2- and 5-year survival probabilities of patients with de novo LMBC, which was further validated internally and externally in the selected cohorts. The results of model validation suggested that this novel nomogram provided a robust prediction of survival in the LMBC population. Considering that this nomogram is the first tool for prognostic estimation in LMBC, the present study provides evidence for practitioners to introduce individual-based therapeutics for survival benefits in clinical practice.

There are limitations to our findings. First, metastatic sites were not fully recorded in this database. The unrecorded information included metastatic sites that developed after sequential therapies and soft tissue and distant lymph node involvement at the initial diagnosis, which could inevitably affect the reported proportions of metastatic patterns. However, the organs commonly involved in breast cancer include the lung, bone, liver, and brain28, which were included in our analyses, and the study results can be applied to all patients with LMBC. In addition, treatment information was not sufficiently available. This includes, for instance, endocrine therapy as a first-line intervention for ER+/HER2− breast cancer, targeted therapy for HER2+ breast cancer, chemotherapeutic protocols, radiation performance, and surgical removal of metastatic lesions. These gaps could result in misestimation of the associations between current treatment options and survival benefits, as well as failure to account for the influence of some new treatments, such as immunotherapy, PARP inhibitors, and PI3K-AKT inhibitors, on survival benefits. This should be further improved in future population-based studies. Moreover, information on progression-free survival was not included in the SEER database, leading to a lack of a major survival profile. Finally, several disease characteristics vital to clinical outcomes are absent in this database, such as the Ki-67 index and lymphovascular invasion; therefore, we could not consider all disease characteristics to further calibrate this prediction model.

In conclusion, this study revealed great heterogeneity in the clinical outcomes of LMBC associated with molecular subtypes, metastatic patterns, and surgical performance. Prognostic factors were identified, and we established a robust nomogram for the estimation of individual 2- and 5-year survival in patients with LMBC. Prospective studies with more cohorts for extensive validation are warranted in the future.