Introduction

Breast cancer is the most common non-skin cancer diagnosed in women, and incidence increases with age [1, 2]. Mammography screening reduces breast cancer mortality by 19% in women 40–74 years. However, there is a delay in benefit; on average it takes 10.7 years for 1 in 1000 women screened to avoid breast cancer death [3, 4]. Screening also carries harms, including anxiety, complications from workup of cancer, and overdiagnosis (detection of non-lethal tumors) [5]. Therefore, guidelines recommend not screening women who have a life expectancy of less than 10 years [6,7,8]. Despite these recommendations, 40–55% of community-dwelling US women ≥ 65 years with < 10-year life expectancy are screened; most are at low or average breast cancer risk [9, 10].

In addition, none of the mammography screening trials included women ≥ 75 years old. Most guidelines recommend engaging women ≥ 75 in shared decision making [7, 8, 11, 12]. High-quality shared decision making around mammography screening requires consideration of breast cancer risk, life expectancy, and values and preferences [5]. However, shared decision making rarely occurs and many older women overestimate their breast cancer risk and screening’s benefits [13,14,15]. Furthermore, while guidelines recommend biennial screening for women ≥ 55 years, many older women choose to be screened annually; personalized information about breast cancer risk may help these women decide how often to be screened [16, 17]. Despite the need, there are no tools that simultaneously estimate older women’s individualized breast cancer risk and life expectancy to support shared decision making around mammography screening.

The Breast Cancer Risk Assessment Tool (BCRAT, a.k.a. “Gail Model”) is the most commonly used breast cancer prediction model in primary care [18, 19]. It considers a woman’s age, age at menarche, age at first birth, history of breast biopsy (including presence of atypia), breast cancer family history and race/ethnicity to estimate 5-year breast cancer risk for women up to age 85. We previously examined BCRAT’s performance in the Nurses’ Health Study (NHS) and Women’s Health Initiative and found that BCRAT overestimated 5-year breast cancer risk by 5–20% in women ≥ 55 years and by 10–30% in women ≥ 75 and had modest discrimination (c-statistic 0.57–0.58) [20]. We hypothesized that BCRAT overestimated breast cancer risk in older women because, while it accounts for age-based risk of non-breast cancer (non-BC) death in estimating breast cancer risk, it does not account for women’s individualized non-BC death risk. Therefore, we aimed to develop a novel model that would simultaneously predict breast cancer risk and non-BC death in women ≥ 55 years.

We previously used NHS data and Fine-Gray competing risk regression to develop a breast cancer prediction model for older women [21]. That model included women’s age, family history of breast cancer, reproductive factors, health behaviors, and prior mammography use (because screening increases breast cancer detection and may confound the influence of some risk factors on breast cancer incidence) [22]. It also included six mortality risk factors (history of stroke, diabetes, myocardial infarction, emphysema, heart failure, and limitation in performing moderate activity) that were added based on expert opinion to apply weights to women’s probability of a competing non-BC death when estimating their breast cancer risk. Our current aim was to develop and validate a non-BC death prediction model and then include predictors from this model in our competing risk breast cancer risk prediction model.

Methods

We used NHS data to extend our previously developed competing risk breast cancer prediction model to also predict non-BC death [23]. NHS is a longitudinal study of 121,738 female nurses aged 30–55 years at entry in 1976, 97% of whom were white. Since Black women are more likely to die of breast cancer and be diagnosed at earlier ages than white women, we further examined model performance in the Black Women’s Health Study (BWHS) [24, 25], a longitudinal study of 59,000 self-identified Black women ages 21–69 at entry in 1995. At baseline and in biennial follow-ups, participants in both cohorts provide detailed lifestyle and medical history information through mailed questionnaires (Additional file 1: Appendix A provides additional details about each cohort). Our study samples (n = 83,330 NHS, n = 17,380 BWHS; see Additional file 1: eFigure 1) included postmenopausal women without a history of invasive or noninvasive breast cancer who returned the 2004 NHS questionnaire (could be returned through May 2006) or 2009 BWHS questionnaire (96.3% of those who returned the questionnaire did so by the end of 2010). We chose the 2004 NHS questionnaire for study initiation since: (1) similar to current practice, most women had stopped using menopausal hormone therapy (MHT); (2) it included functional assessments; and (3) it allowed > 10 years of follow-up. We chose the 2009 BWHS questionnaire for study initiation since it allowed 10 years of follow-up for most women and allowed a maximum number of BWHS participants to be included (i.e., to have reached age 55). NHS participants were 57–85 years, and BWHS participants were 55–85 years, at study entry. The study was approved by the institutional review boards of Boston University Medical Center, Brigham and Women’s Hospital, Harvard T.H. Chan School of Public Health, and those of participating registries as required.

Outcomes

The oldest participants at the end of follow-up were aged 95. In both cohorts, cause of death was determined from state-issued death certificates, the National Death Index, family and friends, and the post office. In NHS, death information was further supplemented with medical record review; > 98% of deaths are identified [26, 27]. For NHS, we included breast cancers confirmed by medical record review (88%) or self-reported (12%) since validation studies have found self-reported breast cancers in NHS to be accurate [28]. In BWHS, breast cancers are identified through self-report or through 24 state cancer registries (> 95% of BWHS participants live in these states) and are confirmed by review of hospital and state cancer registry pathology records (> 99% are confirmed) [29, 30]. We excluded women with a history of cancer (except non-melanomatous skin cancer) since NHS did not consistently confirm second cancer diagnoses.

Mortality risk factors

To expand our model to predict non-BC death, we considered 60 potential mortality risk factors, including health behaviors (4), comorbidity (32), physical function (16), psychosocial factors (5), age, age at menopause, and parental longevity; information was obtained from the 2004 NHS questionnaire and/or prior years (see Additional file 1: Appendix B for variable definitions). We only included factors that may be self-reported (e.g., no laboratory values) for ease of clinical implementation; however, in sensitivity analyses, we repeated our analyses using confirmed diseases when available. We did not consider socioeconomic factors (e.g., income) because once the model is implemented we do not want women to be denied screening because of a low estimated life expectancy based on socioeconomic status.

Breast cancer risk factors

After identifying the best-fitting model for non-BC death, we then re-examined our model’s performance in predicting breast cancer including the risk factors identified for non-BC death [21]. For these analyses, we censored women with noninvasive breast cancer or other cancers at the time of diagnosis. We also re-examined model performance including risk factors as continuous rather than categorical variables if linearly associated with breast cancer risk. Since measured mammographic density was only available for 2174 NHS participants, we used predicted mammographic density as performed previously by Rice et al. (predicted and actual mammographic density are correlated [Spearman correlation of 0.61]). The validated mammographic density prediction model considers age, current BMI, BMI at age 18, adolescent somatotype, parity, age at first birth, postmenopausal status, alcohol use, benign breast disease, and MHT use [31].

Non-BC death model development

Analyses were completed using SAS 9.4 software. We randomly divided the NHS population into 2/3 (n = 55,553) for model development and 1/3 (n = 27,777) for internal validation. Survival time was measured from study entry until non-BC death; participants were censored at breast cancer death or 10 years from their 2004 questionnaire return date, whichever came first. We first examined the unadjusted effect of each mortality risk factor using proportional hazards regression (PHR). Variables significantly associated (p < 0.05) with non-BC death in univariate analyses and not collinear at > 0.4 (Spearman correlation) were considered in our multivariable model. When two variables were collinear, we removed the variable more difficult to self-report. We used best-subsets regression (allowing comparison of all possible models and selecting those with the highest global score chi-square statistic) [31], the Akaike information criterion (AIC, a function of the log-likelihood that adds a penalty of 2 for each additional factor; lower AICs indicate better fit), and the c-index (estimate of area under the receiver operating characteristic curve) to identify the best-fitting models for non-BC death [32,33,34]. Investigators reviewed the top models associated with the highest c-index and lowest AIC to select the best model. The proportional hazards assumption was evaluated by computing Schoenfeld residuals and visually examining log–log survival curves; no apparent violations were identified. Since few methods exist for covariate selection using competing risk regression and breast cancer death is a rare competing risk to non-BC death, we hypothesized that using cause-specific PHR for covariate selection would identify the same top models as competing risk regression. To confirm, we reviewed the AIC and c-index of the 10 best-fitting models and found that the AICs and c-indices were similar using either method. 
We determined the subdistribution hazard ratio (HR) for each risk factor in our final model using competing risk regression and computed cause-specific cumulative incidence functions (CIFs) for breast cancer death and non-BC death.
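The two selection criteria named above, AIC and the c-index, can be illustrated with a minimal Python sketch. This is not the SAS code used in the study: the function names are hypothetical, the example data are invented, and the c-index shown is the simple Harrell version for right-censored data, which does not adjust for competing risks.

```python
from itertools import combinations
from typing import Sequence


def aic(log_likelihood: float, n_params: int) -> float:
    """Akaike information criterion: a penalty of 2 per parameter; lower is better."""
    return 2 * n_params - 2 * log_likelihood


def harrell_c_index(
    times: Sequence[float],
    events: Sequence[int],
    risk_scores: Sequence[float],
) -> float:
    """Harrell's c-index for right-censored data (no competing-risk adjustment).

    A pair is usable when the subject with the shorter follow-up time had the
    event. It is concordant when that subject also has the higher risk score.
    Tied risk scores count as half; pairs with tied times are skipped.
    """
    concordant = 0.0
    usable = 0
    for i, j in combinations(range(len(times)), 2):
        if times[i] == times[j]:
            continue  # skip tied times for simplicity
        first, second = (i, j) if times[i] < times[j] else (j, i)
        if not events[first]:
            continue  # earlier subject was censored, so the ordering is unknown
        usable += 1
        if risk_scores[first] > risk_scores[second]:
            concordant += 1.0
        elif risk_scores[first] == risk_scores[second]:
            concordant += 0.5
    if usable == 0:
        raise ValueError("no usable pairs")
    return concordant / usable


# Invented example: follow-up years, event indicator (1 = death), model risk score.
times = [2, 4, 6, 8]
events = [1, 1, 0, 1]
risk_scores = [0.9, 0.4, 0.5, 0.1]
c_index = harrell_c_index(times, events, risk_scores)
```

In this toy example four of the five usable pairs are ordered correctly, so the c-index is 0.8; a value of 0.5 would indicate discrimination no better than chance.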

In sensitivity analyses, we examined for “ghost-time” bias (the potential effect of including data from individuals who may have died but whose deaths were not yet captured) by censoring participants at age 90 [35]. We also calculated age-adjusted c-indices and used multiple imputation to impute missing data (see Additional file 1: Appendix C for multiple imputation details). In addition, we compared our new model’s performance in predicting non-BC death to a model that included only the 6 mortality risk factors chosen by expert opinion to predict the competing risk of non-breast cancer death in our original breast cancer prediction model.

Internal and external validation

We examined the final model’s performance in predicting 10-year non-BC death and 5-year breast cancer risk because these thresholds have clinical significance. Guidelines for use of breast cancer prevention medications consider postmenopausal women with ≥ 3% 5-year risk to be at high risk [36, 37]. Also, prior studies have shown that individuals with ≥ 50% 10-year mortality risk tend to have < 10-year life expectancy since life expectancy is the median survival of a population [38, 39].
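The link between a ≥ 50% 10-year mortality risk and a life expectancy under 10 years can be made concrete with a short Python sketch. The survival curve below is invented for illustration and is not study data; the function name is hypothetical.

```python
from typing import Optional, Sequence


def median_survival_time(
    times: Sequence[float], survival: Sequence[float]
) -> Optional[float]:
    """Return the first time at which the survival probability is <= 0.5.

    Returns None when survival stays above 0.5 over the whole curve.
    """
    for t, s in zip(times, survival):
        if s <= 0.5:
            return t
    return None


# Hypothetical yearly survival probabilities (illustrative numbers only).
years = list(range(1, 11))
survival = [0.95, 0.90, 0.84, 0.78, 0.71, 0.64, 0.58, 0.53, 0.49, 0.45]

ten_year_mortality = 1 - survival[-1]                 # about 0.55, above the 50% threshold
median_years = median_survival_time(years, survival)  # survival first reaches <= 0.5 in year 9
```

Because 10-year mortality exceeds 50% in this example, survival has already crossed 0.5 before year 10, so the median survival (used here as life expectancy) is below 10 years.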

We used Royston and Altman’s methods for validating models using survival analyses and examined our model’s calibration (whether model predicted probabilities are accurate) and discrimination (how well our model distinguishes between individuals who do and do not develop an outcome) in predicting non-BC death [40, 41]. First, we compared the prevalence and regression coefficients associated with each risk factor in the development and validation cohorts using normal approximation z-tests. While most risk factors were defined similarly by NHS and BWHS, BWHS did not assess participant mobility or ability to bathe/dress oneself. We censored BWHS participants without complete 10-year follow-up on December 31, 2020, since death data after that date may have been incomplete.
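A normal approximation z-test for comparing a regression coefficient between two independent cohorts might look like the following Python sketch. The exact test statistic used in the study is not specified in the text, so the formula (difference in coefficients divided by the pooled standard error) is an assumption, and the function name and numbers are hypothetical.

```python
import math
from typing import Tuple


def coefficient_z_test(
    beta_a: float, se_a: float, beta_b: float, se_b: float
) -> Tuple[float, float]:
    """Compare two independent coefficient estimates.

    Returns the z statistic and the two-sided p-value under a standard
    normal approximation.
    """
    z = (beta_a - beta_b) / math.sqrt(se_a ** 2 + se_b ** 2)
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided tail probability
    return z, p_value


# Hypothetical log hazard ratios and standard errors from two cohorts.
z, p_value = coefficient_z_test(beta_a=0.70, se_a=0.03, beta_b=0.50, se_b=0.04)
```

With large cohorts the standard errors are small, so even a modest difference in coefficients (here 0.20 on the log hazard scale) gives a large z statistic and a very small p-value.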

Calibration of the model in predicting non-BC death was assessed by estimating the ratio of the expected survival (1-CIF for non-BC death from our competing risk regression model) to the observed survival (1-the observed CIF computed using the nonparametric estimation of CIF) at 5 and 10 years within risk quintiles [42]. To test discrimination, we calculated the model’s c-index in the validation cohorts using risk factor regression coefficients from the development cohort using Kremer’s SAS macro [43] based on the work of Harrell et al. [44] and Pencina et al. [45]. Additional file 1: Appendix C provides additional details on methods used for model validation. We repeated these methods to examine model performance in predicting non-BC death by age (55–74, 75+) and in predicting breast cancer risk overall and by age.
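The calibration ratio described above can be sketched in Python as follows. This is an illustrative reimplementation under stated assumptions, not the SAS code used in the study: the nonparametric CIF is computed with the standard Aalen-Johansen estimator, cause codes (0 = censored, 1 = non-BC death, 2 = breast cancer death) are hypothetical, and the data are invented. In practice the ratio would be computed separately within each risk quintile.

```python
from typing import Sequence


def cumulative_incidence(
    times: Sequence[float], causes: Sequence[int], cause: int, horizon: float
) -> float:
    """Nonparametric (Aalen-Johansen) cumulative incidence of `cause` by `horizon`.

    `causes` uses 0 for censored observations and positive integers for event types.
    """
    cif = 0.0
    event_free = 1.0  # probability of no event of any cause just before time t
    for t in sorted(set(times)):
        if t > horizon:
            break
        at_risk = sum(1 for u in times if u >= t)
        d_cause = sum(1 for u, c in zip(times, causes) if u == t and c == cause)
        d_any = sum(1 for u, c in zip(times, causes) if u == t and c != 0)
        cif += event_free * d_cause / at_risk
        event_free *= 1 - d_any / at_risk
    return cif


def expected_to_observed_survival(
    predicted_cif: Sequence[float],
    times: Sequence[float],
    causes: Sequence[int],
    cause: int,
    horizon: float,
) -> float:
    """Ratio of mean predicted survival (1 - predicted CIF) to observed survival."""
    expected = 1 - sum(predicted_cif) / len(predicted_cif)
    observed = 1 - cumulative_incidence(times, causes, cause, horizon)
    return expected / observed


# Invented data for one risk group: years of follow-up and cause codes.
times = [2, 3, 5, 7, 10, 10, 10, 10]
causes = [1, 2, 1, 0, 0, 0, 0, 0]
predicted = [0.30, 0.20, 0.35, 0.15, 0.25, 0.20, 0.30, 0.25]

observed_cif = cumulative_incidence(times, causes, cause=1, horizon=10)
ratio = expected_to_observed_survival(predicted, times, causes, cause=1, horizon=10)
```

A ratio near 1 indicates good calibration; a ratio below 1 means the model underestimates survival (overestimates the event) in that group.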

Examples

To demonstrate how our model may be useful, we calculated breast cancer and non-BC death risk estimates for four example women aged 75 years, an age at which guidelines recommend shared decision making and recommend against screening women with < 10-year life expectancy [7, 8, 11]. We also presented the proportion of women in our validation cohorts who would be estimated to be at higher or lower risk of non-BC death (using a 50% 10-year mortality risk threshold) and of breast cancer (using a 3% 5-year breast cancer risk threshold) based on model risk estimates.
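Applying the two thresholds to a set of model risk estimates can be sketched as below. The function names and the four risk pairs are hypothetical and serve only to show the classification logic; they are not the example women or cohort results reported in the tables.

```python
from collections import Counter
from typing import Dict, Iterable, Tuple

BC_RISK_5Y_THRESHOLD = 0.03        # >= 3% 5-year breast cancer risk
NON_BC_DEATH_10Y_THRESHOLD = 0.50  # >= 50% 10-year non-BC death risk


def classify(bc_risk_5y: float, non_bc_death_10y: float) -> Tuple[str, str]:
    """Label each risk estimate as 'high' or 'low' relative to its threshold."""
    bc = "high" if bc_risk_5y >= BC_RISK_5Y_THRESHOLD else "low"
    death = "high" if non_bc_death_10y >= NON_BC_DEATH_10Y_THRESHOLD else "low"
    return bc, death


def category_proportions(
    estimates: Iterable[Tuple[float, float]]
) -> Dict[Tuple[str, str], float]:
    """Proportion of women in each (breast cancer risk, non-BC death risk) category."""
    estimates = list(estimates)
    counts = Counter(classify(bc, death) for bc, death in estimates)
    return {category: n / len(estimates) for category, n in counts.items()}


# Hypothetical (5-year breast cancer risk, 10-year non-BC death risk) pairs.
example = [(0.015, 0.20), (0.035, 0.25), (0.020, 0.62), (0.040, 0.55)]
proportions = category_proportions(example)
```

Each of the four invented women falls into a different category here, so every category has a proportion of 0.25.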

Results

NHS development cohort participants (n = 55,553) were 96.2% non-Hispanic white, and their mean age was 70.1 (SD 7.0) years. Over 10 years, 3.1% developed breast cancer, 0.3% died of breast cancer, and 20.1% died of other causes. NHS validation cohort (n = 27,777) participants were similar to development cohort participants (≤ 0.5% difference for any characteristic, Table 1). BWHS participants (n = 17,380) differed by race, were younger, and were more likely than NHS development cohort participants to have had a mammogram or a breast biopsy, to have higher BMI, younger age at menopause, and comorbidity, to be nulliparous, and to walk briskly; BWHS participants were less likely to use alcohol, cigarettes, or MHT. The number of breast cancer diagnoses was similar between cohorts, but BWHS participants were slightly more likely to die of breast cancer; after standardizing by age, the cohorts had similar rates of non-BC death. Additional file 1: eTable 1 demonstrates differences across cohorts in participant characteristics by age group (55–74, 75+ years).

Table 1 Participant baseline characteristics in the NHS development and validation cohorts and in BWHS

Predicting non-BC death

Additional file 1: eTable 2 includes all 60 variables considered in predicting non-BC death and the reasons certain variables were removed. Best-subsets regression resulted in 961 top models; 281 had the highest c-index of 0.789. Within this group, the AIC varied by < 0.02%. Based on clinical judgment, effect on model performance, and ease of self-report, we included the 20 variables (age, BMI, alcohol use, cigarette use, function, mobility, walking pace, age at menopause, and 12 diseases) that made it into > 97% of the top 281 models; the other variables made it into < 84% of these models. Since mammography use in the past two years predicts breast cancer death (the competing risk of non-BC death), we included it in the model when predicting non-BC death. Using competing risk regression, the model’s c-index was 0.795 (0.791–0.800) for predicting 10-year non-BC death in the development cohort (Table 2), which is higher than the c-index (0.778 [0.773–0.782]) of the model when only including the 6 mortality risk factors previously selected based on expert opinion for our competing risk breast cancer prediction model.

Table 2 Final model for predicting non-breast cancer death in the NHS cohorts and in BWHS

Neither censoring follow-up of participants at age 90 nor using confirmed (only available for stroke and myocardial infarction) rather than self-reported diagnoses changed the model’s c-index. Adjusting model c-indices for age led to a small decrease in model performance (Additional file 1: eTable 3). The model’s c-index was the same using multiple imputation or a complete case analysis, and the HRs for all predictors were within 8% of each other except for kidney disease (17% difference in HRs) which was the rarest disease (0.6% prevalence, Additional file 1: eTable 4). Therefore, our final model included women with complete data. Regardless of whether competing risk regression or cause-specific PHR was used, the model’s performance was similar (PHR c-index = 0.796 [0.791–0.800] for predicting 10-year non-BC death) and risk factor hazard ratios (HRs) were within 3% of each other (Additional file 1: eTable 5).

Internal and external validation

Although most risk factor HRs for predicting non-BC death differed significantly between the development cohort and the validation cohorts, likely due to large cohort sample sizes, directions of the associations were similar (Table 2). When the non-BC death model was applied to the validation cohorts, c-indices were 0.790 (0.784–0.796) in the NHS validation cohort and 0.768 (0.757–0.780) in the BWHS. Figure 1 presents non-BC death survival curves in BWHS over time by 10-year risk deciles. The c-index for prediction of 10-year non-BC death in women 55–74 was 0.760 (0.749–0.770) in the NHS validation cohort and 0.735 (0.721–0.750) in BWHS; among women ≥ 75 the c-indices were 0.696 (0.686–0.706) and 0.671 (0.645–0.696) in NHS and BWHS, respectively (Additional file 1: eTable 6).

Fig. 1 Non-BC death cumulative incidence function (CIF) curves in the BWHS over time by 10-year risk deciles

Figure 2 shows the CIF for 10-year non-BC death in each cohort, and Table 3 shows that the expected-to-observed ratio approached 1 in each risk quintile for each outcome, except among women in the highest risk quintile for 10-year non-BC death in BWHS, where the model underestimated survival. Results were similar when we examined calibration by age.

Fig. 2 CIFs for 10-year non-BC and breast cancer death and 5-year breast cancer from each cohort. Our competing risk model for predicting non-BC death yielded two CIF functions, one for the outcome of interest of non-BC death and one for breast cancer death

Table 3 Calibration table for predicting 10-year non-breast cancer death and 5-year breast cancer risk

We then applied regression coefficients from the development cohort to the validation cohorts to predict 5-year breast cancer risk. We included all risk factors for non-BC death and breast cancer. That model’s c-index was 0.603 (0.575–0.632) in the NHS validation cohort and 0.556 (0.517–0.595) in the BWHS. We then removed factors that predicted mortality alone (comorbidities, function, and cigarette use); the model’s c-index improved to 0.611 (0.582–0.640) in the NHS validation cohort and to 0.566 (0.528–0.604) in BWHS. We then examined model performance considering BMI and alcohol use as continuous rather than categorical variables; model performance improved when using continuous BMI. Finally, we added factors considered in our previous NHS model (months breastfeeding, having a grandmother with breast cancer, and age at menarche), but model performance did not improve. The c-index of our final breast cancer prediction model was 0.612 (0.583–0.641) in the NHS validation cohort and 0.573 (0.536–0.611) in BWHS, as shown in Table 4. Among women aged 55–74, the c-indices were 0.618 (0.585–0.650) and 0.566 (0.526–0.606) in NHS and BWHS, respectively. Among women ≥ 75, the c-indices were 0.596 (0.534–0.657) and 0.614 (0.506–0.722) in NHS and BWHS, respectively (Additional file 1: eTable 9); however, there were only 26 cases among BWHS women ≥ 75. Figure 2 presents 5-year CIFs for breast cancer in each validation cohort. Table 3 and Additional file 1: eTable 7 provide data on the model’s excellent calibration in predicting breast cancer in the validation cohorts. Additional file 1: eTable 8 demonstrates that model performance and HRs are similar using PHR.

Table 4 Final model for predicting 5-year breast cancer in the NHS cohorts and BWHS cohort

Examples

Table 5 presents risk estimates for four hypothetical women aged 75. Model risk estimates may help identify women at high risk of non-BC death and may provide older women with realistic estimates of their breast cancer risk. Table 6 demonstrates that across cohorts most older women are at low risk of breast cancer and of 10-year non-BC death until age 75 when risk of non-BC death is higher.

Table 5 Risk estimates for 5-year breast cancer and 10-year non-BC death for example women aged 75
Table 6 Women at low or high risk of breast cancer and non-BC death in each cohort

Discussion

We developed a novel model to simultaneously predict breast cancer incidence and non-BC death to inform older women’s breast cancer screening decisions. Our model performed well in predicting non-BC death across cohorts. The model slightly over-predicted death in the BWHS after 10 years of follow-up in the highest risk group, possibly because data were not available in BWHS on participant mobility and function and these variables were highly significant for non-BC death in the NHS cohorts. Our rigorously developed non-BC death prediction model outperformed a model that only included risk factors for death that were selected based on expert opinion. In addition, our model for predicting 5-year breast cancer risk demonstrated excellent calibration but only modest discrimination, similar to other breast cancer prediction models. Specifically, the Gail and Tyrer-Cuzick breast cancer prediction models have been shown to have c-statistics of < 0.60 in NHS participants ≥ 70 years [20, 46], and breast cancer prediction models developed in case–control studies of breast cancer in Black women have been found to have c-statistics of 0.56 [47] and 0.58 [30] when validated in cohorts of Black women. Our model yielded a similar c-statistic (0.57) in the BWHS.

While mortality indices exist for estimating adults’ 10-year overall mortality risk [37, 48], none specifically predict non-BC death. Although breast cancer death is uncommon and dependent on breast cancer tumor characteristics at diagnosis and treatments received, inclusion of breast cancer death in estimates of overall death when deciding on breast cancer screening may bias decision making. The two most widely used 10-year mortality indices (the Lee and Schonberg indices) were developed from cross-sectional U.S. population surveys and only considered a few diseases [37, 48]. NHS is a longitudinal study which allowed our model to consider numerous diseases and functional measures. While our model includes some risk factors included in these mortality risk models (e.g., age, BMI, smoking history, function, heart failure, emphysema, cancer, and diabetes), our model also considers gait speed, depression, dementia, hypertension, Parkinson’s disease, hip fracture, stroke, and kidney disease. Gait speed is known to be an especially strong predictor of mortality [49]. The inclusion of dementia may be particularly useful since screening decisions for these women may be challenging for caregivers and clinicians [50].

We used competing risk regression to obtain the estimated HRs and CIFs because HRs from competing risk models are directly associated with CIFs and take into account competing risks, while HRs from cause-specific PHR models, in which competing events are censored, examine a risk factor’s effect in a hypothetical world with no competing risks. Putter et al. have elegantly shown that the HRs from competing risk models are directly related to those from cause-specific proportional hazard models via the negative logarithm of the reduction factor from the cause-specific model [51]. In our model, hazard ratios were similar regardless of the regression method used, likely because breast cancer death was an uncommon competing risk to non-BC death and non-BC death was an uncommon competing risk to breast cancer incidence. However, our work highlights that the relation between risk factors for breast cancer and non-BC death is complex. Some model risk factors had congruous effects on breast cancer risk and non-BC death, while others had opposing effects. For example, older age was associated with increased risk of both outcomes. Higher BMI was associated with increased breast cancer risk but had a more nuanced effect on non-BC death. Low BMI, which may be associated with frailty, was highly associated with non-BC death, while BMI > 40 was associated with only a slightly increased risk of 10-year non-BC death, likely because many of the mediators of high BMI on mortality were included in the model [52]. Greater alcohol consumption was associated with increased breast cancer risk, while drinking < 15 g of alcohol per day (< 1 drink) was associated with lower non-BC death risk.

In addition, while the hazard ratios associated with many of the risk factors for non-BC death differed significantly between the predominantly white women in NHS and the Black women in BWHS, qualitative trends were similar. When predicting breast cancer, only the HRs for increasing BMI and prior mammography differed significantly between the cohorts; however, increasing BMI was associated with increasing breast cancer risk in both cohorts and having a mammogram in the past 2 years was associated with women being less likely to be diagnosed with breast cancer in 5 years. Simulation modelers have also found that it is necessary to consider women’s prior use of mammography in estimating breast cancer incidence [53]. While we initially planned to include all the non-BC death risk factors in our competing risk breast cancer prediction model, we found that doing so led to our model performing less well likely because the model was overfit. Instead, the model performed better in validation when only including breast cancer risk factors; importantly, some breast cancer risk factors (e.g., age, alcohol use, age at menopause) also predicted non-BC death.

While our rigorously developed model predicts breast cancer and non-breast cancer death, it also has limitations. Since actual mammographic density was not available for most participants, we considered predicted mammographic density. However, we did not find an association with breast cancer risk likely because our model already included several factors associated with breast density (e.g., BMI) and because the prevalence of high mammographic density decreases with age as does its effect on breast cancer risk [54,55,56]. Furthermore, inclusion of polygenic risk scores (PRS) in breast cancer prediction models has been shown to increase model discrimination slightly; however, the association of PRS with breast cancer risk declines with age and few women have obtained this information [57,58,59]. If PRS data become more available in practice, we would test adding PRS to the model in the future. In addition, model performance needs to be tested in older Asian and Hispanic women before being used in these populations.

Conclusions

We formally modeled prediction of breast cancer and non-BC death in a competing risk model we are developing for clinical use. As demonstrated with the examples in Table 5, model risk estimates may be helpful in identifying women at high risk of non-BC death who are unlikely to benefit from screening and in providing older women with realistic estimates of their breast cancer risk. Before making our model available online, we plan to formally compare performance of our model to existing breast cancer prediction models such as the BCRAT and Tyrer-Cuzick [60] and to further examine its performance in other diverse cohorts such as the Women’s Health Initiative and the Multiethnic Cohort [61, 62].