Introduction

Though mammography screening reduces breast cancer mortality, it is imperfect like all screening tests. The high burden of false-positive tests relative to the number of cancers detected has contributed to controversy about the routine use of mammography screening among women ages 40 to 50, as well as about biennial rather than annual screening [1]. After 10 years of annual mammography screening beginning at age 40, over 60% of women will have a false-positive result and 7% to 9% will have a biopsy [2]. False-positive mammograms can result in inconvenience, pain and anxiety for patients, as well as increased costs [3,4].

Using pretest probability of disease can improve the positive predictive value of a screening test. However, this approach requires the ability to accurately determine an individual’s risk of disease. The Breast Cancer Risk Assessment Tool (BCRAT), or Gail model, uses age, family history of breast cancer, reproductive history and history of breast biopsy or atypical hyperplasia to estimate a woman’s 5-year or lifetime risk of breast cancer [5]. Although the model is well calibrated, its discriminatory accuracy is modest [6]. Additional risk factors, such as genetic markers [7-14] and body mass index (BMI) [15-19], have been shown to moderately improve breast cancer risk prediction.

Although many studies have focused on predicting cancer risk in the general population, few have employed risk prediction models to improve decisions about follow-up of abnormal mammograms. Current standards in the United States recommend biopsy of a mammographic abnormality if the radiologist deems the probability of cancer diagnosis to be at least 2% [20-22]. Mammogram results are reported using the American College of Radiology (ACR) Breast Imaging-Reporting and Data System (BI-RADS), which includes six result categories, each tied to follow-up recommendations [20]. The BI-RADS 4 category indicates the presence of a suspicious abnormality that should be followed up with a biopsy. However, the 1-year probability of breast cancer for women with a BI-RADS 4 mammogram is 15% to 30% on average [20,22-29]; therefore, the majority of biopsies of BI-RADS 4 abnormalities are benign. Furthermore, the likelihood of cancer diagnosis varies widely within the BI-RADS 4 category, leading to the subdivision of the category into BI-RADS 4A (2% to 9% risk of malignancy), BI-RADS 4B (10% to 49% risk of malignancy) and BI-RADS 4C (50% to 94% risk of malignancy) [22]. A small pilot study suggested that an experienced radiologist using this substratification scheme could increase the threshold for the biopsy decision without missing invasive cancers [30]. In addition, a recent modeling study suggested that the addition of pretest breast cancer risk factors, including genetic markers, could change biopsy decisions for a small proportion of women with abnormal mammograms [31]. Greater ability to predict cancer outcomes in women with BI-RADS 4 mammograms could reduce the burden of false-positive tests from mammography.

In this study, we assessed the usefulness of the Gail model, BMI and a panel of 12 single-nucleotide polymorphisms (SNPs) to predict cancer diagnosis among women with BI-RADS 4 mammograms. We then evaluated the extent to which these factors could improve decisions about biopsy among this group by reclassifying women without cancer below the biopsy threshold.

Methods

Participants

Women referred for breast biopsies at the Hospital of the University of Pennsylvania following a BI-RADS 4 mammogram between January 2010 and April 2012 were invited to participate in the study. Women were excluded if they were younger than 20 years old or had a personal history of breast or ovarian cancer, mantle radiation or a known BRCA1/2 mutation. Women who consented provided a buccal swab for DNA testing prior to their biopsy appointment. Three hundred sixty-three women were enrolled. An additional 119 women with BI-RADS 4 mammograms from a previous study in which breast imaging modalities were compared at the same institution were also included (2002 to 2006; National Institutes of Health grant P01 CA85484; Principal Investigator: M Schnall). Participants in the breast imaging study were enrolled between July 2003 and August 2007. A blood sample from each patient was collected and stored for genetic analysis. Of the total sample, five patients were missing follow-up information, eleven had data on fewer than nine SNP markers and two had nonbreast malignancies (tubular adenoma, B-cell lymphoma in the breast). These participants were excluded, resulting in a total population of 464 for analysis. Both studies were approved by the University of Pennsylvania Institutional Review Board, and written informed consent was obtained from each study participant.

Risk factors

Participants completed a health history questionnaire, including information on race, age at menarche, age at first live birth, number of biopsies, presence of atypical hyperplasia and family history of breast and ovarian cancer. We estimated the 5-year absolute risk and relative risk (RR) of breast cancer with the BCRAT, using source code version 3.0 from the National Cancer Institute website [32]. BMI was calculated by using the patient’s self-reported weight and height at the time of recruitment, or it was extracted from medical record data prior to recruitment.
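As a minimal sketch of the BMI calculation, the following Python snippet computes BMI from weight and height and maps it to the categories described under Statistical analysis. The function names are hypothetical, the units (kilograms and meters) are an assumption because the questionnaire units are not stated, and placing values between 29.9 and 30 kg/m2 in the middle category is also an assumption.

```python
from typing import Optional


def bmi(weight_kg: float, height_m: float) -> float:
    """Body mass index in kg/m2 (assumes weight in kg and height in m)."""
    if height_m <= 0:
        raise ValueError("height must be positive")
    return weight_kg / height_m ** 2


def bmi_category(value: Optional[float]) -> str:
    """Map a BMI value, or None when it is unavailable, to a model category."""
    if value is None:
        return "missing"
    if value < 25:
        return "<25"
    if value < 30:  # assumption: values between 29.9 and 30 fall here
        return "25 to 29.9"
    return ">=30"


print(bmi_category(bmi(70.0, 1.75)))  # a woman of 70 kg and 1.75 m
print(bmi_category(None))             # BMI not available
```

Keeping "missing" as its own category retains participants without BMI data in the model rather than dropping them.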

Single-nucleotide polymorphism panel

Buccal swabs (N = 347) or blood samples (N = 117) were sent to deCODE genetics (Reykjavik, Iceland) for analysis using Illumina Infinium II whole-genome genotyping (Illumina, San Diego, CA, USA). The deCODE genetics SNP assay included 12 loci that have consistently been associated with breast cancer risk: 2q35 (rs13387042), MRPS30 (rs4415084), FGFR2 (rs1219648), TNRC9/TOX3 (rs3803662), 8q24 (rs13281615), LSP1 (rs3817198), 5q11 (rs889312), NEK10 (rs4973768), 1p11 (rs11249433), RAD51L1 (rs999737), COX11 (rs6504950) and CASP8 (rs1045485) [33-40]. The call rate was 99.8%. The deCODE BreastCancer™ test uses individual allele effect sizes for the 12 SNPs to create a RR estimate for each genotype. For each participant, a combined RR estimate for the 12-SNP panel was calculated by multiplying the RR estimates for all SNPs as described previously [11]. Expected and observed allele frequencies and homozygote odds ratios (ORs) for risk alleles are included in Additional file 1. The combined SNP panel RR estimate has been shown to be independent of BCRAT factors [11].
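The multiplicative combination of per-SNP estimates can be sketched as follows. The RR values below are invented placeholders for illustration only, not deCODE effect sizes, and the function name is hypothetical. Only three of the twelve loci are shown to keep the example short.

```python
from math import prod

# Hypothetical per-genotype RR estimates for one participant, keyed by rsID.
# These numbers are illustrative placeholders, not values from the assay.
example_genotype_rrs = {
    "rs13387042": 1.10,
    "rs4415084": 0.95,
    "rs1219648": 1.20,
}


def combined_panel_rr(genotype_rrs: dict) -> float:
    """Multiply the per-SNP RR estimates into a single combined RR."""
    return prod(genotype_rrs.values())


combined = combined_panel_rr(example_genotype_rrs)
print(round(combined, 3))
```

Multiplying the estimates treats the SNP effects as independent on the relative risk scale, which is why the log of the combined RR is a natural predictor in a logistic model.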

Statistical analysis

The results of the BI-RADS 4 biopsies were obtained from pathology records. Logistic regression was used to assess the association of Gail risk factors, BMI and SNP panel RR with cancer diagnosis (invasive cancer or ductal carcinoma in situ (DCIS)). First, each predictor was tested in an age-adjusted model. SNP panel RRs were examined as a log-transformed continuous variable and as categorized RRs <1.00, 1.01 to 1.49 and ≥1.50. The Gail RR was tested as a log-transformed continuous variable. The Gail absolute 5-year risk estimate was categorized as <1.67% or ≥1.67%, as this cutoff has been widely used to denote high risk of breast cancer, as well as for the use of chemopreventive drugs [41,42]. BMI data were missing for 17% of participants, and therefore BMI was entered into models, including a category for missing data, as follows: <25 kg/m2, 25 to 29.9 kg/m2, ≥30 kg/m2 and missing. The multivariate logistic regression model included log-transformed SNP panel RR, all Gail risk factors (age, race/ethnicity, age at menarche, age at first live birth, first-degree family history of breast cancer, breast biopsy, atypical hyperplasia) and BMI. We also examined the predictive ability of the various risk factors. Model calibration was assessed using the Hosmer-Lemeshow goodness-of-fit test to compare observed and predicted outcomes within deciles of predicted risk for each model [43]. Discriminatory accuracy was assessed by calculating area under the receiver operating characteristic curve (AUC). DeLong’s test was used to compare AUCs for various models. In our analysis, the model incorporating age and the Gail RR had poor calibration. The original Gail model incorporated 5-year intervals of age, but we entered age as a continuous predictor to minimize the number of predictors in our models. Because of the poor calibration of the age plus Gail RR model, we also examined a model that entered all Gail risk factors individually, and this model was better calibrated to our data.
In addition, we performed tenfold cross-validation of the prediction models in the total study population. Finally, we estimated the predicted probability of cancer using the multivariate model and assessed reclassification below several risk thresholds (2%, 3%, 5% and 10%) for cancer cases and noncancer cases. Statistical analyses were performed using SAS 9.3 (SAS Institute, Cary, NC, USA) and Stata/IC 12 (College Station, TX, USA) software.
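To illustrate two of the quantities described above, the following Python sketch computes the AUC as the proportion of case and noncase pairs in which the case has the higher predicted probability, and counts cases and noncases falling below a given risk threshold. It is not the SAS or Stata code used in the analysis. The function names and the toy data are hypothetical and were invented for illustration.

```python
def auc(scores, labels):
    """AUC as the chance that a random case outscores a random noncase.

    Ties between a case and a noncase count as one half.
    """
    cases = [s for s, y in zip(scores, labels) if y == 1]
    noncases = [s for s, y in zip(scores, labels) if y == 0]
    if not cases or not noncases:
        raise ValueError("need at least one case and one noncase")
    wins = 0.0
    for c in cases:
        for n in noncases:
            if c > n:
                wins += 1.0
            elif c == n:
                wins += 0.5
    return wins / (len(cases) * len(noncases))


def reclassified_below(scores, labels, threshold):
    """Return (cases, noncases) with predicted probability below threshold."""
    cases_below = sum(1 for s, y in zip(scores, labels) if y == 1 and s < threshold)
    noncases_below = sum(1 for s, y in zip(scores, labels) if y == 0 and s < threshold)
    return cases_below, noncases_below


# Toy predicted probabilities and outcomes (invented, not study data).
predicted = [0.01, 0.04, 0.08, 0.12, 0.30, 0.45]
cancer = [0, 0, 0, 1, 0, 1]

print(auc(predicted, cancer))
for t in (0.02, 0.03, 0.05, 0.10):
    print(t, reclassified_below(predicted, cancer, t))
```

A threshold is attractive for biopsy decisions when the first number in each pair (cases reclassified below the threshold) stays at zero while the second number grows.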

Results

The mean age of study participants was 48.7 years (SD, 13.2), and approximately one-half of the study population was over age 50 (Table 1). Over 30% of participants were black or African American. The mean 5-year breast cancer risk estimate derived by using the BCRAT was 1.54%, and 33% of participants had a 5-year risk estimate of 1.67% or greater. The mean SNP panel RR was 1.22 (SD, 0.44). Over one-fourth of participants had a SNP panel RR estimate of 1.50 or greater, indicating that their risk of breast cancer was at least 50% greater than that of the general population. Of the 464 participants, 74 women (16%) were diagnosed with cancer, 33 (7%) with DCIS and 41 (9%) with invasive cancer.

Table 1 Characteristics of BI-RADS 4 cohort, all ages, N = 464

Table 2 displays the results of age-adjusted and multivariate logistic regression models used to estimate the OR for cancer diagnosis. The SNP panel RR was significantly associated with cancer diagnosis (OR, 2.15; 95% CI, 1.04 to 2.43; P = 0.038). The ORs estimated in our model for the categorized SNP panel RRs were comparable to the predefined RR estimates obtained from deCODE genetics. The Gail RR estimate was not significantly associated with cancer diagnosis, nor was Gail absolute 5-year risk ≥1.67%. Among the Gail factors, only age was significantly associated with breast cancer diagnosis, though the ORs for race/ethnicity, age at menarche, age at first live birth and family history of breast cancer were consistent with expected associations. Prior breast biopsy and atypical hyperplasia were inversely associated with breast cancer, though these associations were not statistically significant. Few participants (4.3%) reported prior atypical hyperplasia.

Table 2 Logistic regression, odds of cancer among women with BI-RADS 4 mammograms, N = 464

In the multivariate model, age, SNP panel RR and BMI were significantly associated with breast cancer diagnosis. Older women were more likely than younger women to be diagnosed with breast cancer (OR = 1.05; 95% CI, 1.03 to 1.08; P < 0.001). The SNP panel RR remained strongly associated with breast cancer diagnosis after multivariable adjustment (OR = 2.30; 95% CI, 1.06 to 4.99; P = 0.035). Higher BMI was also strongly associated with increased odds of breast cancer diagnosis. Obese women (OR = 2.20; 95% CI, 1.05 to 4.58; P = 0.036) had more than twice the odds of cancer diagnosis compared to women with a BMI <25 kg/m2.

Next, we evaluated the association of the SNP panel separately for white (N = 277) and black (N = 145) women (Table 3). Among white women, the SNP panel RR was associated with approximately twofold elevated odds of receiving a cancer diagnosis in both age-adjusted (OR = 2.43; 95% CI, 0.99 to 5.98; P = 0.053) and multivariate (OR = 1.97; 95% CI, 0.76 to 5.10; P = 0.161) models, and OR estimates were similar for the SNP panel RR categories and predefined values. There was suggestive evidence that the SNP panel RR was associated with breast cancer diagnosis among black women. Among black women, the OR estimate was 4.50 in the age-adjusted model (95% CI, 0.87 to 23.2; P = 0.073) and 4.21 in the multivariate model adjusted for age, Gail factors and BMI (95% CI, 0.79 to 22.6; P = 0.093), though these estimates did not reach statistical significance. In addition, the OR estimates for the SNP panel RR categories were similar to the predefined RR values. There was no significant interaction between race and the SNP panel RR (P = 0.880).

Table 3 Logistic regression, odds of cancer among women with BI-RADS 4 mammograms, by race

We compared the predictive accuracy of the Gail factors, BMI and SNP panel RR (Table 4). First, Gail RR, SNP panel RR and BMI were tested separately in models including age. The model with age and Gail RR had the lowest predictive ability (AUC = 0.6646), and the Hosmer-Lemeshow goodness-of-fit test indicated poor model fit (P = 0.0019). All other models exhibited acceptable model fit (P > 0.05). The predictive accuracy was similar for age and the SNP panel RR (0.6848) and age and BMI (0.6845). Age, BMI and the SNP panel RR together yielded an AUC of 0.7007, which was of borderline significance compared to age alone (P = 0.061).

Table 4 Predictive accuracy of models using Gail risk factors, body mass index and single-nucleotide polymorphism panel among women with BI-RADS 4 mammograms

Predictive accuracy was greater in the model including the individual Gail risk factors (0.7144) compared to a model with age alone (P = 0.044). Adding BMI to the Gail risk factor model increased the AUC (0.7279), but the difference was not statistically significant (P = 0.341). Subsequently adding the SNP panel RR to the model further increased the AUC (0.7377; P = 0.212). We repeated analyses stratified by age (35 to 49 years and ≥50 years) and found that the addition of BMI and SNP panel RR improved predictive accuracy compared to the Gail factors alone in both age groups, though the AUC values were greater for women ages 50 and older. The addition of the SNP panel had a greater impact on the AUC in the 35 to 49 age group than in women ages 50 and older. When stratified by race, the AUC values were comparable for black women and white women. For the model including Gail factors, BMI and SNP panel RR, the AUC was 0.7518 for white women and 0.7710 for black women. We repeated our analyses excluding women younger than 40, and the results were similar. We performed tenfold cross-validation on the prediction models in the total study population (Table 5). AUC values were slightly attenuated after cross-validation and were not statistically significant. The highest cross-validated AUC was observed for the model including age, BMI and the SNP panel (AUC = 0.6753).

Table 5 Cross-validation of prediction models

The predicted probabilities of breast cancer diagnosis for each individual were estimated using the model including age, Gail factors, BMI and the SNP panel RR. Figure 1 displays the distribution of predicted probabilities by breast cancer status. Women diagnosed with cancer (true-positives) had a mean predicted probability of cancer diagnosis of 22.6%, compared to 12.2% for women not diagnosed with cancer (false-positives), though the 95% CIs overlapped substantially (Table 6). However, no women diagnosed with cancer had a predicted probability below 5%. On the basis of our model, nine women (1.9%) with BI-RADS 4 mammograms were reclassified below the 2% threshold, none of whom were diagnosed with cancer. Furthermore, 69 women (14.9%) had a predicted probability of cancer less than 5%, and none of these women were subsequently diagnosed with cancer. The positive predictive value of the BI-RADS 4 categorization alone was 15.9%, compared to 18.7% using the BI-RADS 4 categorization along with the prediction model with a 5% predicted probability threshold.
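The positive predictive values above follow directly from the reported counts, as the short Python check below shows. It assumes that every woman with a predicted probability of at least 5% would still be biopsied and that the 69 women below 5% would not. The variable and function names are hypothetical.

```python
def ppv(true_positives: int, biopsied: int) -> float:
    """Positive predictive value: cancers found per biopsy performed."""
    return true_positives / biopsied


total_women = 464   # BI-RADS 4 mammograms in the analysis
cancers = 74        # DCIS or invasive cancer
below_5pct = 69     # predicted probability < 5%, none with cancer

ppv_birads_alone = ppv(cancers, total_women)
ppv_with_model = ppv(cancers, total_women - below_5pct)

print(f"{ppv_birads_alone:.1%}")  # 15.9%
print(f"{ppv_with_model:.1%}")    # 18.7%
```

Because none of the 69 women below the threshold had cancer, the numerator is unchanged and only the denominator shrinks, from 464 to 395 biopsies.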

Figure 1 Distribution of the predicted probability of cancer using Gail factors, body mass index and single-nucleotide polymorphism panel.

Table 6 Predicted probability of cancer using Gail factors, body mass index and single-nucleotide polymorphism panel (N = 464)

Discussion

Our results suggest that breast cancer risk factors can be used to predict cancer diagnosis among women with BI-RADS 4 mammograms. Age, BMI and the 12-SNP panel were strongly associated with cancer diagnosis. Addition of BMI and the 12-SNP panel to Gail risk factors improved model discrimination. Furthermore, using a predicted probability cutoff of 5% for biopsy would reclassify 15% of women below the biopsy threshold while retaining 100% sensitivity in cancer detection in this sample. Though our results need to be prospectively validated, our work provides proof of concept that the use of pretest risk factors to guide follow-up of BI-RADS 4 mammograms could potentially improve mammography screening outcomes by reducing the number of biopsies among women who do not have cancer.

To our knowledge, our present study is the first in which a panel of genetic markers has been tested in women with abnormal mammograms. The SNP panel RR estimates observed were similar to the RR estimates stated by deCODE genetics in our population of women with BI-RADS 4 mammograms, and the SNP panel RR estimate remained strongly associated with cancer diagnosis after adjusting for other breast cancer risk factors. Similar to what has been reported in prior studies [7,8,10-14,44,45], the SNP panel in the present study moderately improved predictive accuracy. However, this small improvement may prove to be more clinically valuable for decisions about biopsies among women with abnormal mammograms than for risk stratification in the general population.

It was not entirely surprising that the Gail risk estimate was not significantly associated with cancer diagnosis in our study, because the Gail model was developed to estimate 5-year or lifetime risk of invasive breast cancer in the general population. In our present study, we attempted to predict the risk of diagnosis of either DCIS or invasive cancer in women with abnormal mammograms. The magnitudes of the exposure–disease relationships are likely different for short-term cancer outcomes in the higher-risk BI-RADS 4 population. In our analysis, the model using age and the Gail RR had poor calibration, and therefore the AUC estimates are not meaningful. The poor calibration of this model could have been due to differences in the study population and outcome used in our study, or it could have been a result of our inclusion of age as a continuous predictor to provide a more parsimonious model, whereas the original Gail model used 5-year age categories. Because of this, we also examined a model that entered all Gail risk factors individually, and this model was better calibrated to our data. We observed an AUC of 0.738 for the model with Gail factors, BMI and the SNP panel, which is higher than the AUC observed in the general population for the Gail model alone (0.596) or the Gail model including breast density (0.634) [46]. Researchers in two prior studies evaluated prediction models in women with BI-RADS 4 mammograms. A prediction model trained on 170 French patients with BI-RADS 4 mammograms using Gail risk, age, presence of a palpable lesion, lesion size, hormone replacement therapy and menopause status demonstrated predictive accuracy similar to our model, with an AUC of 0.716 in the training set and AUC of 0.660 when validated in 188 BI-RADS 4 patients from Texas [47]. 
Similar to our results, age was the strongest predictor of cancer among approximately 4,000 women with BI-RADS 4 mammograms referred for biopsy between 1997 and 2001 in the Vermont Breast Cancer Surveillance System [48]. The presence of a palpable lump, previous breast biopsy, menopause status and use of postmenopausal hormone therapy were also associated with cancer diagnosis. Genetic risk factors and BMI were not included in these prediction models.

Obese women had more than twice the odds of receiving a cancer diagnosis compared to women of normal weight. One possible explanation for this association is that obese women tend to have less-dense breasts and therefore potentially easier-to-read mammograms, which facilitates a more accurate interpretation of their mammograms by radiologists, such that obese women with a BI-RADS 4 mammogram are more likely to actually have cancer (and less likely to have a false-positive test) than nonobese women. The association of BMI with cancer diagnosis may also reflect disease etiology, as BMI is associated with increased risk of postmenopausal breast cancer [49]. Although BMI data were missing for 17% of participants, we do not believe the missing data biased the observed association. The distribution of risk factors (except for age at first birth) and percentage diagnosed with cancer did not differ for women with missing BMI data and women with complete BMI data. Additional studies are needed to verify this association and to tease apart the effects of BMI and breast density in women with abnormal mammograms.

This study was a first attempt to validate the 12-SNP panel among black women. The SNP panel variants were identified and validated primarily in white/European populations. Several genome-wide association studies and candidate gene studies [50-60], as well as meta-analyses [61-66], have assessed the association of these 12 SNPs individually with breast cancer risk among black/African American populations, with mixed results. Six of the twelve SNPs in the panel have been replicated in at least one study of black/African American populations: rs1045485 (CASP8) [59], rs1219648 (FGFR2) [54,58,59], rs13387042 (2q35) [52,58,59], rs3817198 (LSP1) [60], rs4415084 (FGF10) [56] and rs999737 (14q24.1, RAD51B) [59]. Validating breast cancer–associated SNPs among black women is challenging, given the large sample sizes needed to detect small associations, differing linkage disequilibrium patterns among different ancestral groups, and disease heterogeneity. Although only half of these SNP associations have been replicated, the 12-SNP panel appeared to have predictive value among black women, though our results need to be validated in larger studies. In addition, in future studies, researchers should assess whether race-specific and tumor subtype–specific SNP panels can further improve breast cancer risk prediction.

Several limitations should be considered when interpreting our results. Because we recruited women referred for biopsy at one academic hospital, our study sample may not be representative of all women with abnormal mammograms referred for biopsy. Our sample size was modest, and therefore our results, particularly those of subgroup analyses, should be interpreted cautiously. We performed cross-validation of our prediction models for the entire study sample; however, prospective validation of our results is needed. Given the limited number of cancers (N = 74), our study did not have statistical power to fit separate models for DCIS and invasive cancer or to assess interactions between risk factors. We utilized a validated panel of 12 breast cancer–associated SNPs. To date, nearly 70 SNPs have been identified that are associated with breast cancer risk [67]. Therefore, our results using 12 SNPs may underestimate the utility of genetic markers, and including a larger number of genetic markers may further improve risk prediction. In future studies, researchers should evaluate the use of genetic markers in women with abnormal mammograms. Also, breast density was not controlled for, and this may partly explain the observed association of BMI with cancer diagnosis.

This study has several strengths. Ours is one of the first studies to develop a cancer prediction model for women with abnormal mammograms. We had rich data on recognized breast cancer risk factors ascertained prior to biopsy. We employed a validated panel of genetic markers associated with breast cancer incidence, with RR estimates independent of traditional breast cancer risk factors. Our study population was diverse in terms of age and race/ethnicity, suggesting that our model could be applied broadly.

Conclusions

Our results suggest that pretest breast cancer risk factors could be utilized to individualize biopsy decisions following abnormal mammograms. We found that age, BMI and a 12-SNP panel were significantly associated with breast cancer diagnosis in women with BI-RADS 4 mammograms. The association of obesity with cancer diagnosis was particularly novel and warrants additional investigation. On the basis of results derived from the model using Gail risk factors, BMI and genetic markers, we were able to identify a predicted probability threshold that could be used to identify women who would not benefit from immediate biopsy. Our study, though preliminary, highlights that improved risk modeling for women with abnormal mammograms could reduce the burden of false-positive tests and therefore increase the benefits of mammography. Future studies are needed to validate these results in larger patient populations.