Introduction

Contralateral breast cancer (CBC) is the most common second primary cancer among women diagnosed with first primary invasive breast cancer (BC) [1]. CBC accounts for approximately 40–50% of all new secondary cancers in women with first primary invasive BC and has a potentially less favorable prognosis [2,3,4,5,6]. Worries regarding CBC risk have increased the demand for contralateral preventive mastectomy (CPM) [7, 8]. However, the impact of CPM on survival is uncertain, especially in women with a low risk of developing CBC [9,10,11,12,13]. Thus, improved CBC risk prediction is important to inform decision-making on surveillance and preventive strategies. Currently, the most important factor for decision-making on CPM is the BRCA1/2 mutation status [14].

We previously developed and cross-validated two models using data from 132,756 invasive BC patients with a median follow-up of 8.8 years including 4672 CBC events [15]. One model (PredictCBC-1A) was developed including information about BRCA1/2 mutation status and another model (PredictCBC-1B) for the general breast cancer population of genetically untested women. Two other specific CBC prediction tools are currently available in the literature: the Manchester formula (part of the Manchester guidelines for CPM) and CBCrisk [15,16,17,18].

In addition to BRCA1/2 mutations, other genetic risk factors for breast cancer are also associated with CBC risk. In particular, there is substantial evidence that the CHEK2 c.1100delC variant increases the risk of developing CBC [19, 20]. In addition, polygenic risk scores (PRS) of common variants, developed for association with first breast cancer, have been shown to predict CBC in the general BC population and in BRCA1/2 mutation carriers [21,22,23,24], particularly the extensively validated 313 SNP PRS [25]. With regard to the lifestyle and reproductive factors, there is evidence that body mass index (BMI) and parity at or around the time of the first primary invasive BC diagnosis are associated with CBC risk [26].

Our aim was to refit the PredictCBC models incorporating these additional risk factors. We used the same dataset with updated follow-up and added further studies, notably one large study of BRCA1 and BRCA2 mutation carriers. We evaluated the potential improvement in prediction performance and in utility for clinical decision-making of the updated models (PredictCBC-2.0) for both BRCA1/2 mutation carriers and the general (non-tested) breast cancer population.

Material and methods

Study population and available data

We used the data from the same five main sources previously used for PredictCBC models to develop the PredictCBC-2.0 models including updated follow-up information, additional patients, and invasive or in situ CBC events [15]. Two studies were additionally included from the Breast Cancer Association Consortium (BCAC) compared to the version of the BCAC data used to develop PredictCBC-1A and PredictCBC-1B models. Most of the studies were either population- or hospital-based series; and most women were of European descent (Additional file 1: Data and patient selection and Additional file 2: Table S1 and Additional file 1: Table S2, available online). We also additionally included patients selected from the Hereditary Breast and Ovarian cancer study in the Netherlands (HEBON) [27], a nationwide study based on clinical genetic centers. The eligibility criteria were the same as previously: briefly, we included female patients with invasive first primary BC with no sign of distant metastases at diagnosis or prior history of any cancer (except for non-melanoma skin cancer) [15]. We included women diagnosed after 1990 so that diagnostic and treatment procedures were close to modern practice while follow-up was sufficient to study CBC incidence. In total, 207,510 women with first primary invasive BC from 23 studies were included. All studies were approved by the appropriate ethics and scientific review boards. All women provided written informed consent; or, for some Dutch cohorts as applicable, the secondary use of clinical data was in accordance with Dutch legislation and codes of conduct [28, 29]. Information about the sample size for every data source and the total sample size after eligibility criteria are provided in Table 1. The choice of additional predictors in the analyses was based on evidence from the literature and the availability of predictors in our data sources. 
In particular, evidence from the literature suggests that CHEK2 c.1100delC and the 313 SNP PRS are associated with an increased risk of developing CBC [21,22,23,24]. In addition, a systematic review of lifestyle and reproductive factors suggested that BMI and parity at or around the time of the first primary invasive BC diagnosis are associated with CBC risk [26]. Details about sample size per study, the factors included in the analyses, follow-up per dataset, and study design are in Additional file 2: Table S1 and Additional file 3: Table S3, available online.

Table 1 Patient characteristics in the different data sources

Statistical analyses

Primary endpoint and follow-up

The primary endpoint in the analyses was the incidence of invasive or in situ metachronous CBC. Follow-up started 3 months after invasive first primary BC diagnosis, to exclude synchronous CBCs, and ended at the date of CBC, distant metastasis (but not a loco-regional relapse), CPM, or last date of follow-up (due to death, loss to follow-up, or end of study), whichever occurred first. For 36,553 (17.6%) women, from BCAC and HEBON, recruitment or blood sampling for DNA testing occurred more than 3 months after diagnosis of the first primary BC. For these women, follow-up started at recruitment, at the date of blood draw, or at the date of the DNA test result (left truncation). Patients who underwent CPM during follow-up were censored because of the negligible CBC risk after a CPM [30]. Missing data were multiply imputed by chained equations (MICE) to avoid loss of information due to case-wise deletion [31,32,33] (Additional file 1: Multiple imputation of missing values, available online).
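
The follow-up rules above can be illustrated with a minimal sketch. It is not the analysis code used for the study; the `Patient` record, its field names, and the status labels are hypothetical, all times are assumed to be expressed in years since first primary BC diagnosis, and dropping patients whose first event falls at or before the start of follow-up is an assumption about how the stated rules are applied.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

# Follow-up starts 3 months after diagnosis to exclude synchronous CBC.
SYNCHRONOUS_WINDOW_YEARS = 0.25


@dataclass
class Patient:
    """Hypothetical record; all times are years since first BC diagnosis."""
    entry: float = 0.0                 # recruitment / blood draw / DNA test result
    cbc: Optional[float] = None        # event of interest
    distant_metastasis: Optional[float] = None  # competing event
    cpm: Optional[float] = None        # censoring (negligible CBC risk afterwards)
    death: Optional[float] = None      # competing event
    last_follow_up: float = 0.0        # loss to follow-up or end of study


def follow_up(p: Patient) -> Optional[Tuple[float, float, str]]:
    """Return (start, stop, status), or None if the patient contributes no time."""
    # Left truncation: late entrants start at study entry, not at 3 months.
    start = max(SYNCHRONOUS_WINDOW_YEARS, p.entry)
    candidates = [
        ("cbc", p.cbc),
        ("distant_metastasis", p.distant_metastasis),
        ("death", p.death),
        ("censored_cpm", p.cpm),
        ("censored", p.last_follow_up),
    ]
    observed = [(label, t) for label, t in candidates if t is not None]
    # Whichever occurs first; on ties the earlier entry in the list wins.
    status, stop = min(observed, key=lambda pair: pair[1])
    if stop <= start:
        return None  # e.g. a synchronous CBC within the first 3 months
    return start, stop, status
```

For instance, a patient recruited 1.5 years after diagnosis who undergoes CPM at 3 years would contribute the interval from 1.5 to 3 years and be censored at CPM.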

Model development and validation

We used multivariable Fine and Gray regression models to account for death and distant metastases as competing events [34]. Analyses were stratified by study to allow baseline hazard (sub)distributions to differ across studies. The assumption of proportional subdistribution hazards was graphically checked using Schoenfeld residuals [35]. The resulting subdistribution hazard ratios (sHRs) and corresponding 95% confidence intervals (CI) were pooled from 5 imputed datasets using Rubin’s rules [33]. We re-estimated the coefficients of PredictCBC-1A and PredictCBC-1B, and we re-fitted the PredictCBC models using the extended dataset with updated follow-up time. PredictCBC-1A, developed including information about BRCA1/2 mutation carrier status, was extended by including CHEK2 c.1100delC status, PRS-313, self-reported BMI, and self-reported parity (hereafter: PredictCBC-2.0A) [15]. CHEK2 c.1100delC and PRS-313 were derived from the BCAC database, as published previously [25, 36, 37]. We extended PredictCBC-1B, developed for genetically untested women, by incorporating self-reported BMI and parity (hereafter: PredictCBC-2.0B). Potential nonlinear relations between continuous predictors and CBC risk were investigated using restricted cubic splines with three knots.
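
As an illustration of the pooling step, the sketch below applies Rubin’s rules to log-sHR estimates from several imputed datasets. The function name and inputs are hypothetical, and the confidence interval uses a normal approximation (1.96) rather than the t-based reference distribution of the full rules; this is a simplifying assumption, not a description of the software used in the study.

```python
import math
from statistics import mean, variance


def pool_rubin(log_shrs, std_errors):
    """Pool log subdistribution hazard ratios across imputed datasets.

    log_shrs   : one log-sHR estimate per imputed dataset
    std_errors : the matching standard errors
    Returns (pooled sHR, lower 95% limit, upper 95% limit, pooled SE of log-sHR).
    """
    m = len(log_shrs)
    if m < 2 or m != len(std_errors):
        raise ValueError("need at least two imputations with matching SEs")
    pooled = mean(log_shrs)                       # pooled point estimate
    within = mean([se ** 2 for se in std_errors])  # average within-imputation variance
    between = variance(log_shrs)                   # between-imputation variance (divisor m - 1)
    total = within + (1 + 1 / m) * between         # Rubin's total variance
    se = math.sqrt(total)
    # Normal approximation for the interval (assumption, see lead-in).
    lower, upper = pooled - 1.96 * se, pooled + 1.96 * se
    return math.exp(pooled), math.exp(lower), math.exp(upper), se
```

With 5 imputed datasets, as used here, `log_shrs` and `std_errors` would each hold five values for a given predictor.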

The validity of the model was investigated by leave-one-study-out cross-validation [38]. In each validation cycle, all studies were analyzed except one, in which the validity of the model was evaluated. Since some BCAC studies had insufficient CBC events required for reliable validation, we used the geographic area as a unit for splitting [38,39,40]. Nineteen out of 23 studies were combined in 4 geographic areas (Additional file 1: Table S2, available online). A total of 8 units of splitting including 4 geographic areas and 4 studies were used to cross-validate the models.
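
The splitting scheme can be sketched as follows. The mapping from studies to units of splitting is a hypothetical input (the actual assignment is given in Additional file 1: Table S2), and the study and unit names in the usage note are placeholders.

```python
def leave_one_unit_out(study_to_unit):
    """Yield (held_out_unit, development_studies, validation_studies).

    study_to_unit maps each study to its unit of splitting: either the
    study itself or the geographic area it was combined into.
    """
    units = sorted(set(study_to_unit.values()))
    for held_out in units:
        # Validation set: all studies belonging to the held-out unit.
        validation = sorted(s for s, u in study_to_unit.items() if u == held_out)
        # Development set: every remaining study.
        development = sorted(s for s, u in study_to_unit.items() if u != held_out)
        yield held_out, development, validation
```

For example, `leave_one_unit_out({"study_a": "area_1", "study_b": "area_1", "study_c": "study_c"})` produces two cycles: one holding out both studies in `area_1`, and one holding out `study_c` alone. With 8 units of splitting, the generator would yield 8 cycles.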

The performance of the PredictCBC-2.0 models was assessed by discrimination, i.e., the ability to differentiate between patients diagnosed with CBC and those who were not, and by calibration, which measures the agreement between the observed CBC risk and the CBC risk predicted by the models. Discrimination was quantified by time-dependent areas under the ROC curve (AUCs) based on inverse probability of censoring weighting at 5 and 10 years [41]. The AUCs were estimated using the prognostic index, which is the sum of the estimated coefficients (betas) of the PredictCBC models multiplied by the corresponding individual characteristics (i.e., predictors) included in the models. Values of AUCs close to 1 indicate good discrimination, while values close to 0.5 indicate poor discrimination. Calibration was assessed by the observed-to-expected (O/E) ratio and calibration plots at 5 and 10 years [42, 43]. An O/E ratio lower or higher than 1 indicates that average predictions are too high or too low, respectively.
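
A minimal sketch of the two quantities defined above is given below. The coefficient names and values in the usage note are purely illustrative and are not taken from Table 2; the observed risk is assumed to be supplied by a separate cumulative incidence estimate that accounts for competing events.

```python
from statistics import mean


def prognostic_index(betas, predictors):
    """Sum of coefficient times predictor value over all model terms."""
    return sum(beta * predictors[name] for name, beta in betas.items())


def observed_expected_ratio(observed_risk, predicted_risks):
    """O/E ratio at a fixed horizon.

    observed_risk   : observed cumulative incidence of CBC at the horizon
    predicted_risks : model-predicted risks at the same horizon, one per patient
    A value below 1 means predictions are on average too high.
    """
    return observed_risk / mean(predicted_risks)
```

For example, `prognostic_index({"brca1": 1.0, "age": -0.01}, {"brca1": 1, "age": 50})` evaluates to 0.5 with these made-up coefficients, and an observed 10-year risk of 0.036 against a mean predicted risk of 0.040 gives an O/E ratio of 0.9.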

To consider heterogeneity among studies, a random-effect meta-analysis was performed to provide summaries of discrimination and calibration performance. The 95% prediction intervals (PI) indicate the likely performance of the model in a new dataset. The summary performances of PredictCBC-2.0 and 1.0 models were compared to evaluate whether adding the new predictors improved the performance of CBC risk prediction. We developed and validated the risk prediction model following the Transparent Reporting of a Multivariable Prediction model for Individual Prognosis or Diagnosis (TRIPOD) statement [44]. Analyses were done in SAS (SAS Institute Inc., Cary, NC, USA) and R (version 3.6.1).
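
To make the distinction between a confidence interval and a prediction interval concrete, the sketch below pools per-unit performance estimates with a random-effects model. The DerSimonian-Laird estimator of the between-unit variance and the t-based prediction interval with k − 2 degrees of freedom are assumptions chosen for illustration; the text does not state which estimator was used. The critical value `t_crit` is passed in by the caller to keep the example free of external dependencies.

```python
import math


def random_effects_summary(estimates, variances, t_crit):
    """Random-effects summary with an approximate 95% prediction interval.

    estimates : per-unit performance estimates (e.g. AUC or log O/E)
    variances : their sampling variances
    t_crit    : 0.975 quantile of the t distribution with k - 2 df
    Returns (summary, tau2, (pi_lower, pi_upper)).
    """
    k = len(estimates)
    if k < 3 or k != len(variances):
        raise ValueError("need at least three units with matching variances")
    w = [1.0 / v for v in variances]
    fixed = sum(wi * yi for wi, yi in zip(w, estimates)) / sum(w)
    # Cochran's Q and the DerSimonian-Laird between-unit variance.
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, estimates))
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - (k - 1)) / c)
    # Re-weight with the between-unit variance added to each sampling variance.
    w_star = [1.0 / (v + tau2) for v in variances]
    summary = sum(wi * yi for wi, yi in zip(w_star, estimates)) / sum(w_star)
    se_summary = math.sqrt(1.0 / sum(w_star))
    # The prediction interval reflects both heterogeneity and summary uncertainty.
    half_width = t_crit * math.sqrt(tau2 + se_summary ** 2)
    return summary, tau2, (summary - half_width, summary + half_width)
```

With 8 units of splitting there are 6 degrees of freedom, for which the 0.975 quantile of the t distribution is approximately 2.447. In practice, bounded measures such as the AUC are often pooled on a transformed (e.g., logit) scale and back-transformed.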

Clinical utility

The clinical utility of the prediction models was evaluated using decision curve analysis (DCA) [45, 46]. A key metric in DCA is the net benefit, which is the number of true-positive classifications (in this example: the number of CPMs in patients who would have developed a CBC) minus the weighted number of false-positive classifications (in this example: the number of unnecessary CPMs in patients who would not have developed a CBC). The false positives are weighted by a factor related to the relative harm of a missed CBC versus an unnecessary CPM. The weighting is derived from the threshold probability of developing a CBC using a fixed time horizon (e.g., CBC risk at 5 or 10 years) [47]. For example, a threshold of 10% implies that CPM in 10 patients, of whom one would develop CBC if untreated, is acceptable (thus performing 9 unnecessary CPMs). The net benefit of a prediction model is traditionally compared with the strategies of treat all or treat none. Since the use of CPM is generally only considered among BRCA1/2 mutation carriers, the decision curve analysis was reported among BRCA1/2 mutation carriers and non-carriers separately [48]. Among patients not tested for BRCA1/2 germline mutations, we assumed that the decision for CPM is based on family history of breast cancer. The net benefits of PredictCBC-2.0A and PredictCBC-2.0B were compared with the net benefits of PredictCBC-1A and PredictCBC-1B, respectively, to assess the potential improvement in the clinical utility of the updated models.
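
The net benefit at a given threshold can be written as a short function. This is a sketch of the standard decision-curve definition, expressed per patient; the function and argument names are hypothetical, and the handling of censoring and competing events needed to obtain true and false positives at a fixed horizon is not shown.

```python
def net_benefit(true_positives, false_positives, n_patients, threshold):
    """Net benefit per patient at a given risk threshold.

    true_positives  : patients above the threshold who would develop CBC
    false_positives : patients above the threshold who would not
    threshold       : threshold probability, strictly between 0 and 1
    """
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must lie strictly between 0 and 1")
    # Each false positive is weighted by the odds of the threshold.
    weight = threshold / (1.0 - threshold)
    return true_positives / n_patients - weight * false_positives / n_patients
```

At a threshold of 10% the weight is 1/9, so performing CPM in 10 patients of whom 1 would have developed CBC (1 true positive, 9 false positives) gives a net benefit of exactly zero, which is the break-even point described above.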

Results

A total of 207,510 women with invasive first primary BC diagnosed between 1990 and 2017, with 8225 CBC events (6828 invasive, 1397 in situ), from 23 studies, were used for CBC risk prediction modeling (Additional file 2: Table S1, available online). Median follow-up time was 10.2 years, and CBC cumulative incidences at 5 and 10 years were 2.2% and 4.1%, respectively. Details of the studies and patient, tumor, and treatment characteristics are provided in Additional file 3: Table S3 (available online). The multivariable models with estimates for all included factors are given in Table 2.

Table 2 Multivariable subdistribution hazard models for contralateral breast cancer risk

Most of the factors were independently associated with CBC risk, including the new factors incorporated in the PredictCBC-2.0 models, i.e., BMI, parity, CHEK2 c.1100delC, and PRS-313. There was no evidence against log-linear relationships between BMI, parity, and PRS-313 and CBC risk. Nonlinearity between age at first BC diagnosis and CBC risk was accounted for with a linear spline with a knot at age 60 years. The formulae of the PredictCBC models are provided in Additional file 1: Formula to estimate the contralateral breast cancer risk using PredictCBC-2.0A and PredictCBC-2.0B (available online). To calculate the predicted CBC cumulative incidence, we used the event-free baseline probability of the Netherlands Cancer Registry (NCR), as previously [15].
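
The sketch below shows how a linear spline term and a predicted cumulative incidence might be computed. It assumes the usual relation for a proportional subdistribution hazards model, in which the predicted risk equals one minus the baseline event-free probability raised to the power exp(prognostic index); the exact parameterization, including any centering of predictors, is given in Additional file 1 and may differ. Function names and the example values are hypothetical.

```python
import math


def age_spline_terms(age, knot=60.0):
    """Linear spline basis for age: the age itself and the excess above the knot."""
    return age, max(age - knot, 0.0)


def predicted_cumulative_incidence(baseline_event_free, prognostic_index):
    """Predicted CBC cumulative incidence at a fixed horizon.

    baseline_event_free : baseline probability of remaining CBC-free at the
                          horizon (prognostic index equal to zero)
    prognostic_index    : sum of coefficients times predictor values
    """
    if not 0.0 < baseline_event_free <= 1.0:
        raise ValueError("baseline probability must be in (0, 1]")
    return 1.0 - baseline_event_free ** math.exp(prognostic_index)
```

With a made-up baseline event-free probability of 0.96, a prognostic index of 0 returns a risk of 0.04, and a positive prognostic index returns a higher risk. A patient aged 70 contributes the spline terms (70, 10), so the slope for age can differ before and after 60 years.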

The AUCs of PredictCBC-2.0A were higher than those of PredictCBC-1A at 5 years, 0.66 (95% prediction interval [PI] 0.55–0.76) versus 0.62 (95%PI 0.51–0.74), and at 10 years, 0.65 (95%PI 0.56–0.74) versus 0.63 (95%PI 0.54–0.71) (Figs. 1 and 2, Table 3). The AUCs for PredictCBC-2.0B and PredictCBC-1B were both 0.59 at 5 years (95%PI 0.51–0.68 for PredictCBC-2.0B and 0.49–0.69 for PredictCBC-1B) and both 0.58 (95%PI 0.51–0.65) at 10 years (Figs. 1 and 2, Table 3).

Fig. 1

Analysis of predictive performance of PredictCBC-2.0A in leave-one-study-out cross-validation. Discrimination was assessed by a time-dependent AUC at 5 and 10 years (panel A and B, respectively). Calibration accuracy was measured with observed/expected (O/E) ratio at 5 and 10 years (panel C and D, respectively). The black squares indicate the estimated accuracy of a model built using all remaining studies or geographic areas. The black horizontal lines indicate the corresponding 95% confidence intervals of the estimated accuracy (interval whiskers). The black diamonds indicate the mean with the corresponding 95% confidence intervals of the predictive accuracy, and the dashed horizontal lines indicate the corresponding 95% prediction intervals

Fig. 2

Analysis of predictive performance of PredictCBC-2.0B in leave-one-study-out cross-validation. Discrimination was assessed by a time-dependent AUC at 5 and 10 years (panel A and B, respectively). Calibration accuracy was measured with observed/expected (O/E) ratio at 5 and 10 years (panel C and D, respectively). The black squares indicate the estimated accuracy of a model built using all remaining studies or geographic areas. The black horizontal lines indicate the corresponding 95% confidence intervals of the estimated accuracy (interval whiskers). The black diamonds indicate the mean with the corresponding 95% confidence intervals of the predictive accuracy, and the dashed horizontal lines indicate the corresponding 95% prediction intervals

Table 3 Summary of prediction performance of PredictCBC-1A, PredictCBC-1B, PredictCBC-2.0A, and PredictCBC-2.0B with the corresponding 95% prediction intervals (PI) based on a leave-one-study-out cross-validation procedure

The O/E ratios at 5 and 10 years across all versions of the PredictCBC models ranged between 0.90 and 0.92 with similar 95%PIs (Figs. 1 and 2, Table 3). Calibration plots of the PredictCBC-2.0 models are provided in Additional file 1: Figs. S1–S4 (available online).

The decision curves showed the net benefit for a range of harm–benefit thresholds at 10-year CBC risk (Fig. 3). We evaluated the potential clinical utility of PredictCBC-2.0A versus PredictCBC-1A for decision thresholds between 4 and 12% for the 10-year CBC risk among BRCA1/2 mutation carriers and non-carriers (Figs. 3 and 4, Table 4). For example, if consensus guidelines were to accept that 1 in 10 patients for whom a CPM is recommended would have developed CBC, a risk threshold of 10% may be used to define high- and low-risk BRCA1/2 mutation carriers based on the absolute 10-year CBC risk predicted by the models. Compared with a strategy recommending CPM to all BRCA1/2 mutation carriers, PredictCBC-1A avoids 76.9 net CPMs per 1000 patients (Table 4). An additional 50.0 CPMs may be avoided using PredictCBC-2.0A compared to PredictCBC-1A. In contrast, almost no non-BRCA1/2 mutation carriers had predictions above the 10% threshold (general BC population, Table 4); three necessary CPMs per 1000 patients would be indicated using PredictCBC-2.0A. Analyses for PredictCBC-1B and PredictCBC-2.0B at 10 years suggested a potential clinical utility between 4 and 6% 10-year CBC risk for patients with and without family history (Table 4 and Figs. 3 and 4). No remarkable improvement in net benefit was detected using PredictCBC-2.0B compared to PredictCBC-1B in decision-making regarding CPM (Table 4 and Fig. 3). Decision curves for CBC risk using PredictCBC-1 and PredictCBC-2.0 at 5 years and the corresponding clinical utility showed similar patterns (Additional file 1: Figs. S5–S6 and Table S4, available online).
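
Quantities such as "net CPMs avoided per 1000 patients" are commonly obtained by rescaling a difference in net benefit by the odds of the threshold. The sketch below uses that standard decision-curve conversion; whether Table 4 was computed in exactly this way is an assumption, and the input values in the usage note are made up rather than taken from the results.

```python
def net_interventions_avoided_per_1000(nb_model, nb_treat_all, threshold):
    """Net number of interventions avoided per 1000 patients versus treat-all.

    nb_model     : net benefit of using the model at this threshold
    nb_treat_all : net benefit of treating every patient at this threshold
    threshold    : threshold probability, strictly between 0 and 1
    """
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must lie strictly between 0 and 1")
    odds = threshold / (1.0 - threshold)
    # Dividing by the threshold odds converts net true positives gained
    # into net unnecessary interventions avoided.
    return (nb_model - nb_treat_all) / odds * 1000.0
```

For example, at a 10% threshold a hypothetical net benefit of 0.02 for the model versus 0.01 for treat-all corresponds to 90 net interventions avoided per 1000 patients.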

Fig. 3

Decision curve analysis at 10 years for the contralateral breast cancer risk (CBC) models (PredictCBC-1.0 and PredictCBC-2.0 models) including BRCA mutation information. A The decision curve to determine the net benefit of the estimated 10-year predicted CBC cumulative incidence for patients without a BRCA1/2 gene mutation using PredictCBC-1A (dotted black line) and PredictCBC-2.0A (dashed black line) compared to not treating any patients with contralateral preventive mastectomy (CPM) (black solid line). B The decision curve to determine the net benefit of the estimated 10-year predicted CBC cumulative incidence for BRCA1/2 mutation carriers using PredictCBC-1A (dotted black line), PredictCBC-2.0A (dashed black line) versus treating (or at least counseling) all patients (gray solid line). C The decision curve to determine the net benefit of the estimated 10-year predicted CBC cumulative incidence for patients without (first degree) family history using PredictCBC-1B (dotted black line), PredictCBC-2.0B (dashed black line) compared to not treating any patients with CPM (black solid line). D The decision curve to determine the net benefit of the estimated 10-year predicted CBC cumulative incidence for patients with (first degree) family history using PredictCBC-1B (dotted black line), PredictCBC-2.0B (dashed black line) versus treating (or at least counseling) all patients (gray solid line). The y-axis measures net benefit, which is calculated by summing the benefits (true positives, i.e., patients with a CBC who needed a CPM) and subtracting the harms (false positives, i.e., patients with CPM who do not need it). The latter are weighted by a factor related to the relative harm of a non-prevented CBC versus an unnecessary CPM. The factor is derived from the threshold probability to develop a CBC at 10 years at which a patient would opt for CPM (e.g., 10%). The x-axis represents the threshold probability. 
Using a threshold probability of 10% implicitly means that CPM in 10 patients of whom one would develop a CBC if untreated is acceptable (9 unnecessary CPMs, harm-to-benefit ratio 1:9)

Fig. 4

Density distribution of 10-year predicted contralateral breast cancer using PredictCBC version 2 models. A Density distribution of 10-year predicted contralateral breast cancer absolute risk using PredictCBC-2.0A within non-carriers (area with black solid lines) and BRCA1/2 mutation carriers (area with black dashed lines). B Density distribution of 10-year predicted contralateral breast cancer absolute risk using PredictCBC-2.0B within patients without (first degree) family history (area with black solid lines) and patients with (first degree) family history (area with black dashed lines)

Table 4 Clinical utility of the 10-year contralateral breast cancer risk prediction models (PredictCBC-1A with PredictCBC-2.0A and PredictCBC-1B with PredictCBC-2.0B)

Discussion

We evaluated the potential improvement in CBC risk prediction by adding established genetic (CHEK2 c.1100delC and PRS-313) and lifestyle (BMI and parity) factors to the previous PredictCBC models and used additional follow-up information and new studies to provide more reliable estimates.

The current clinical recommendations for CPM are mostly based on the presence of a pathogenic mutation in BRCA1/2 [49, 50]. This seems a reasonable approach according to CBC risk predictions based on the PredictCBC models: few non-BRCA1/2 carriers exceed a 10% 10-year risk threshold. However, approximately 40% of BRCA1/2 mutation carriers do not reach this threshold either, suggesting that a substantial proportion of BRCA1/2 carriers might be spared CPM. Additional genetic information beyond BRCA1/2 germline mutation, such as the presence of the CHEK2 c.1100delC variant and PRS-313, might improve decision-making.

Currently available CBC models, such as CBCrisk and the Manchester formula, show only moderate discrimination [51]. In addition, the Manchester formula has been shown to systematically overestimate CBC risk [51]. The BOADICEA model, a well-known risk prediction tool to estimate the risk of developing the first primary BC, also allows the calculation of CBC risk [52,53,54,55]. Although BOADICEA includes rare pathogenic variants in moderate- and high-risk BC susceptibility genes (i.e., BRCA1, BRCA2, PALB2, ATM, CHEK2, BARD1, RAD51C, and RAD51D) and PRS-313, it does not incorporate information on the systemic treatment of the primary BC, which is an important predictor of CBC risk [56].

A model for the prediction of recurrence, the INFLUENCE nomogram, was developed to estimate 5-year recurrence risk as well as conditional annual risks of developing a local or regional recurrence based on first BC and treatment characteristics [57]. A more recent version (INFLUENCE 2.0) uses random survival forests to provide 5-year individualized predictions for second primary breast cancer, based on patients older than 50 years at first cancer diagnosis from the NCR nationwide cohort, irrespective of their genetic or testing status [58]. The model provided moderate discrimination (AUC at 5 years: 0.67; 95%CI 0.65–0.68) using internal validation. In our comparable population- and hospital-based Dutch series, EMC and NCR, the AUCs at 5 years of PredictCBC-1A were 0.69 (95%CI 0.64–0.73) and 0.66 (95%CI 0.65–0.67), and of PredictCBC-2.0A 0.71 (95%CI 0.66–0.75) and 0.68 (95%CI 0.66–0.69), respectively. Moreover, INFLUENCE 2.0 is only relevant to the general population, while PredictCBC can also be used in the clinical genetic setting. Notably, we demonstrated that decision-making about preventive strategies in clinical practice is unlikely to improve without genetic information.

Our work has some limitations. Firstly, some women included in the Dutch studies (providing specific information on family history, BRCA mutation, or CPM) were also present in our selection of the NCR population, as described previously [15]. Privacy and coding issues prevented linkage at the individual patient level, but based on the hospitals from which the studies were recruited, and the age and period criteria used, we calculated a maximum potential overlap of 9%. Secondly, important predictors such as family history, BRCA1/2 and CHEK2 c.1100delC status, and PRS-313 were only available in a subset of the women, although the multiple imputation approach should lead to consistent estimates [59,60,61]. Detailed information about family history of breast cancer would have been useful to improve CBC risk prediction, especially among patients with a mutation in BRCA1/2 or CHEK2. Nonetheless, we considerably increased the number of patients with BRCA1/2 mutation status and family history information compared to our previous publication (40,343 vs. 7704 and 53,399 vs. 30,541 patients with available BRCA mutation status and family history information, respectively), and added CHEK2 c.1100delC, which is a founder mutation present in approximately 0.5–1.6% of individuals of Northern and Eastern European descent and explains the large majority of carriers of CHEK2 protein truncating variants in these populations [19, 62]. Further validation will be required to investigate how well the PredictCBC models predict risk in other populations. In particular, the models were developed in patients of European ancestry, and further evaluation and adaptation will be needed to extend them to non-European populations, including Asian populations [63, 64]. Future research might also include comparisons of machine learning (ML) methods with classical statistical regression models [65, 66].

The prediction models may be further improved by including additional risk factors. In particular, rare mutations in other breast cancer susceptibility genes, such as ATM and PALB2, are also likely to be associated with an increased risk of CBC [22, 67, 68]. The discrimination provided by the PRS will also improve as more SNPs are added [69, 70]. Prediction performance might also be improved by adding breast density and other risk factors (e.g., additional lifestyle and reproductive factors such as alcohol use, age at primiparity, age at menopause) modeled dynamically in a time-dependent fashion [71]. Finally, we wish to emphasize that adequate presentation (e.g., with online tools) of the risk estimates is crucial for effective communication about CBC risk during doctor–patient consultations [72, 73].

Conclusions

In conclusion, we present an updated version of a previously proposed contralateral breast cancer risk model (PredictCBC) including additional information on breast cancer genetic variants beyond BRCA1/2, lifestyle and reproductive factors. PredictCBC-2.0, available online at [74], is based on longer follow-up from a wide range of new European-descent population and hospital-based studies, with reasonable calibration. PredictCBC-2.0 may be used to tailor clinical decision-making toward CPM or alternative preventive strategies, especially when genetic information is available.