For African American or Hispanic women, the extent to which clinical breast cancer risk prediction models are improved by including information on susceptibility single nucleotide polymorphisms (SNPs) is unknown, even though these women comprise increasing proportions of the US population and represent a large proportion of the world’s population. We studied 7539 African American and 3363 Hispanic women from the Women’s Health Initiative. The age-adjusted 5-year risks from the BCRAT and IBIS risk prediction models were measured and combined with a risk score based on >70 independent susceptibility SNPs. Logistic regression, adjusting for age group, was used to estimate risk associations with log-transformed age-adjusted 5-year risks. Discrimination was measured by the odds ratio (OR) per standard deviation (SD) and the area under the receiver operating characteristic curve (AUC). When considered alone, the ORs for African American women were 1.28 for BCRAT and 1.04 for IBIS. When combined with the SNP risk score (OR 1.23), the corresponding ORs were 1.39 and 1.22. For Hispanic women, the corresponding ORs were 1.25 for BCRAT and 1.15 for IBIS. When combined with the SNP risk score (OR 1.39), the corresponding ORs were 1.48 and 1.42. There was no evidence of poor calibration for any of the combined models. Including information on known breast cancer susceptibility loci provides approximately 10 % and 19 % improvement in risk prediction using BCRAT for African Americans and Hispanics, respectively. The corresponding figures for IBIS are approximately 18 % and 26 %, respectively.
Genome-wide association studies have identified an increasing number of single nucleotide polymorphisms (SNPs) associated with breast cancer risk, the majority of which have been discovered by the COGS consortium using Caucasian women of European descent. Whilst each SNP is associated with only a small increment in risk, the indications are that a polygenic approach to genetic testing could improve estimates of individual risk, raising the possibility of individualized screening strategies for women. Several studies have investigated the value of combining the genomic risk estimates obtained from SNP genotyping with conventional breast cancer risk prediction algorithms such as the breast cancer risk assessment tool (BCRAT, also known as the Gail Model) and IBIS (also known as the Tyrer-Cuzick Model). The combination of SNP panels with these algorithms has been shown to improve risk prediction, reclassifying some women across risk categories and potentially changing clinical management [3–7].
Although originally developed using population data for white women, the BCRAT has been modified for Hispanic women using SEER data and for African American women as the modified CARE Model [9, 10]. Whilst the IBIS Model has only been validated for European populations, it is widely used across ethnicities in breast cancer centers throughout the USA. This new study investigates whether a panel of SNPs can improve breast cancer risk estimates obtained from BCRAT or IBIS for African American and Hispanic women, in terms of calibration and discriminatory accuracy. These women comprise increasing proportions of the US population and represent a large proportion of the world’s population.
We studied 7539 self-reported African American women and 3363 self-reported Hispanic women identified from within the Women’s Health Initiative (WHI) SNP Health Association Resource (SHARe). Written informed consent was obtained from each participant and the study was approved by the Fred Hutchinson Cancer Research Center Institutional Review Board. Participants in the WHI had an opportunity to opt in or out of any collaborations involving commercial entities because some women may prefer not to participate in research involving commercial (as opposed to non-profit) entities. We restricted our analyses to the subset of these individuals that had consented for collaborations involving commercial entities. The interventions used in the WHI clinical trial are independent of baseline genetic and clinical risk factors by study design, so analyses presented here were not stratified by trial intervention.
Selection of SNPs
The SNP panels used were derived from SNPs identified as being associated with breast cancer risk from studies of Caucasian women and for which imputed genotypes were available in WHI SHARe. This resulted in a panel of 75 SNPs for African Americans and 71 for Hispanics.
Risk prediction models
We used the BCRAT (incorporating the modified CARE model) [9, 10] and IBIS to estimate the 5-year absolute risk of breast cancer. For BCRAT, we did not have information on biopsy histopathology (i.e., presence of atypical hyperplasia), so this was coded as “unknown”. Similarly, for IBIS, missing family history variables were coded as “unknown”.
SNP risk score and combined model risk scores
Using the approach of Mealiffe et al., we calculated a SNP risk score using previously published estimates of the odds ratio (OR) per allele and risk allele frequencies ($p$) [13, 15–18], assuming independence of additive risks on the log OR scale. For each SNP, we calculated the unscaled population average risk as $\mu = (1 - p)^2 + 2p(1 - p)\,\mathrm{OR} + p^2\,\mathrm{OR}^2$. Adjusted risk values (with a population average risk equal to 1) were calculated as $1/\mu$, $\mathrm{OR}/\mu$, and $\mathrm{OR}^2/\mu$ for the three genotypes defined by the number of risk alleles (0, 1, or 2). The overall SNP risk score was then calculated by multiplying the adjusted risk values for each of the SNPs.
For both BCRAT and IBIS, we calculated a combined risk score by multiplying the SNP-based score by the model’s predicted 5-year risk of breast cancer.
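The calculation described in this subsection can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the code used in the study: the function names, the two-SNP panel, and the 1.6 % model risk are hypothetical and chosen only to show the arithmetic.

```python
from math import prod


def genotype_risks(odds_ratio, risk_allele_freq):
    """Adjusted risk values for 0, 1 and 2 risk alleles of a single SNP.

    The values are scaled by mu so that the population average risk is 1.
    """
    p = risk_allele_freq
    mu = (1 - p) ** 2 + 2 * p * (1 - p) * odds_ratio + p ** 2 * odds_ratio ** 2
    return (1 / mu, odds_ratio / mu, odds_ratio ** 2 / mu)


def snp_risk_score(genotypes, panel):
    """Multiply the adjusted risk values across all SNPs in the panel.

    genotypes: number of risk alleles (0, 1 or 2) carried at each SNP.
    panel:     (per-allele OR, risk allele frequency) for each SNP.
    """
    return prod(
        genotype_risks(odds_ratio, freq)[count]
        for count, (odds_ratio, freq) in zip(genotypes, panel)
    )


def combined_risk(model_5yr_risk, snp_score):
    """Combined score: model-predicted 5-year risk times the SNP-based score."""
    return model_5yr_risk * snp_score


# Hypothetical panel of two SNPs (values are illustrative, not from the study).
panel = [(1.20, 0.30), (1.10, 0.50)]
score = snp_risk_score([2, 0], panel)   # two risk alleles at SNP 1, none at SNP 2
print(score, combined_risk(0.016, score))
```

Because each SNP's adjusted risks average to 1 over the genotype frequencies, the product remains a relative risk against the population average, which is why it can be multiplied directly by an absolute 5-year risk.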
The model risk scores, SNP-based score and combined risk scores were log transformed for all analyses, and then adjusted for age using multiple linear regression. We used Pearson correlation to test for associations between the model risk scores, the SNP-based score and the combined risk scores. We then used logistic regression to estimate risk associations, in terms of OR per age-adjusted log 5-year predicted risk, while adjusting for age group. Model calibration was assessed using the Hosmer–Lemeshow goodness-of-fit test, which compares the expected and observed numbers of cases and controls within groups that were defined by deciles of risk for controls. Discrimination between cases and controls was measured using the AUCs of the risk scores.
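The discrimination measures above can be illustrated with a short, standard-library Python sketch. It makes several simplifying assumptions that differ from the study: age adjustment uses a single linear term for age, the logistic model contains only the risk score (no age-group covariate), and the Hosmer–Lemeshow test is not reproduced. All function names and data values are hypothetical.

```python
from math import exp, log
from statistics import mean, pstdev


def age_adjust(log_scores, ages):
    """Residuals from a simple linear regression of log score on age."""
    a_bar, s_bar = mean(ages), mean(log_scores)
    sxx = sum((a - a_bar) ** 2 for a in ages)
    sxy = sum((a - a_bar) * (s - s_bar) for a, s in zip(ages, log_scores))
    slope = sxy / sxx
    return [s - (s_bar + slope * (a - a_bar)) for a, s in zip(ages, log_scores)]


def or_per_sd(scores, status, iterations=25):
    """OR per SD of the score from a logistic fit (intercept plus one slope).

    Fitted by Newton-Raphson; assumes cases and controls overlap in score.
    """
    m, sd = mean(scores), pstdev(scores)
    z = [(s - m) / sd for s in scores]
    b0 = b1 = 0.0
    for _ in range(iterations):
        p = [1 / (1 + exp(-(b0 + b1 * zi))) for zi in z]
        w = [pi * (1 - pi) for pi in p]
        g0 = sum(y - pi for y, pi in zip(status, p))
        g1 = sum((y - pi) * zi for y, pi, zi in zip(status, p, z))
        h00 = sum(w)
        h01 = sum(wi * zi for wi, zi in zip(w, z))
        h11 = sum(wi * zi * zi for wi, zi in zip(w, z))
        det = h00 * h11 - h01 ** 2
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return exp(b1)


def auc(case_scores, control_scores):
    """Probability that a random case outscores a random control (ties count 0.5)."""
    wins = sum((c > k) + 0.5 * (c == k) for c in case_scores for k in control_scores)
    return wins / (len(case_scores) * len(control_scores))


# Hypothetical data: age, 5-year risk, and case status (1 = case, 0 = control).
ages = [50, 55, 60, 65, 70, 75]
risks = [0.010, 0.013, 0.012, 0.018, 0.016, 0.021]
status = [0, 1, 1, 0, 0, 1]

adjusted = age_adjust([log(r) for r in risks], ages)
cases = [s for s, y in zip(adjusted, status) if y == 1]
controls = [s for s, y in zip(adjusted, status) if y == 0]
print(or_per_sd(adjusted, status), auc(cases, controls))
```

Expressing the association per SD of the age-adjusted score puts the model scores, the SNP-based score, and the combined scores on a common scale, so their ORs can be compared directly.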
As in Mealiffe et al., we categorized 5-year absolute risks as low risk (<1.5 %), intermediate risk (≥1.5 and <2.0 %) and high risk (≥2.0 %) and constructed reclassification tables for each of the risk prediction models as a cross-tabulation of the classification of the risk score from the original model with the risk score from the combined model. The net reclassification improvement statistic was calculated as P(up|case) − P(down|case) + P(down|control) − P(up|control), where up refers to moving to a higher risk category and down refers to moving to a lower risk category. We tested the null hypothesis that the net reclassification improvement is equal to 0 using an asymptotic Z-test.
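A minimal Python sketch of the reclassification analysis follows. The category cut-points and the net reclassification improvement formula are taken from the paragraph above. The standard error is an assumption: the text specifies only an asymptotic Z-test, and the sketch uses the commonly cited null-hypothesis form that sums the proportions of movers in each group divided by group size. Function names and data are hypothetical.

```python
from math import erf, sqrt

CUTS = (0.015, 0.020)  # low < 1.5 %, intermediate 1.5 to < 2.0 %, high >= 2.0 %


def category(risk):
    """0 = low, 1 = intermediate, 2 = high."""
    return sum(risk >= cut for cut in CUTS)


def movement(pairs):
    """Proportions moving up and down between (original, combined) risk pairs."""
    up = sum(category(new) > category(old) for old, new in pairs) / len(pairs)
    down = sum(category(new) < category(old) for old, new in pairs) / len(pairs)
    return up, down


def net_reclassification(case_pairs, control_pairs):
    """Return (NRI, z, two-sided p) for a three-category reclassification table."""
    up_ca, down_ca = movement(case_pairs)
    up_co, down_co = movement(control_pairs)
    nri = up_ca - down_ca + down_co - up_co
    # Assumed standard error under the null hypothesis of no net improvement.
    se = sqrt((up_ca + down_ca) / len(case_pairs)
              + (up_co + down_co) / len(control_pairs))
    z = nri / se if se > 0 else 0.0
    p_value = 2 * (1 - 0.5 * (1 + erf(abs(z) / sqrt(2))))
    return nri, z, p_value


# Hypothetical (original model risk, combined risk) pairs.
case_pairs = [(0.012, 0.016), (0.016, 0.021), (0.021, 0.021), (0.018, 0.013)]
control_pairs = [(0.016, 0.012), (0.012, 0.012), (0.021, 0.016), (0.013, 0.017)]
print(net_reclassification(case_pairs, control_pairs))
```

A positive value indicates that, on balance, the combined score moves cases up and controls down more often than the reverse.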
Stata Release 13 was used for all statistical analyses; all statistical tests were two sided, and P values less than 0.05 were considered nominally statistically significant.
African American women
The characteristics of the study participants are provided in supplementary Table 1. For cases, the mean 5-year risk of breast cancer was 1.7 % (SD 0.06 %) from BCRAT and 1.3 % (SD 0.04 %) from IBIS. For controls, the mean 5-year risk of breast cancer was 1.6 % (SD 0.05 %) from BCRAT and 1.3 % (SD 0.04 %) from IBIS. The mean SNP-based score was 1.29 (SD 0.51) for cases and 1.19 (SD 0.43) for controls. Supplementary Table 2 shows the genotype distributions and the minor allele frequencies for cases and controls for each of the 75 SNPs as well as their OR per allele and the corresponding published ORs.
Table 1 shows the age group-adjusted association between the age-adjusted log-transformed risk scores and breast cancer. For each of the models, the OR per SD of the age-adjusted risk scores was higher for the combined score than for both the SNP-based score and the corresponding model risk score. The increase in OR by the addition of SNPs was 9.6 % for BCRAT and 17.5 % for IBIS.
Receiver operating characteristic curve analysis confirmed that, for each model, the combined risk score gave greater discrimination than the SNP-based score and the corresponding model risk score (Table 2). The increase in AUC compared with 0.5 by the addition of SNPs was 5.4 % for BCRAT and 7.8 % for IBIS.
For each of the models, the risk scores and the combined risk scores were classified as low risk (<1.5 %), intermediate risk (≥1.5 and <2.0 %), and high risk (≥2.0 %), as shown in Tables 3 and 4. The proportion of cases moving into a higher risk category was 42.5 % for BCRAT and 37.7 % for IBIS, while the proportion of cases moving into a lower risk category was 10.1 % for BCRAT and 6.5 % for IBIS. The proportion of controls moving into a lower risk category was 11.2 % for BCRAT and 8.2 % for IBIS, while the proportion of controls moving into a higher risk category was 40.3 % for BCRAT and 33.5 % for IBIS. The net reclassification improvement was 0.033 for BCRAT (95 % CI −0.025, 0.089) and 0.060 for IBIS (95 % CI 0.005, 0.113).
Hispanic women
The characteristics of the study participants are provided in Supplementary Table 3. For cases, the mean 5-year risk of breast cancer was 1.2 % (SD 0.07 %) from BCRAT and 1.4 % (SD 0.04 %) from IBIS. For controls, the mean 5-year risk of breast cancer was 1.1 % (SD 0.06 %) from BCRAT and 1.4 % (SD 0.04 %) from IBIS. The mean SNP-based score was 1.19 (SD 0.65) for cases and 1.00 (SD 0.57) for controls. Supplementary Table 4 shows the genotype distributions and the minor allele frequencies for cases and controls for each of the 71 SNPs as well as their OR per allele and the corresponding published ORs.
Table 5 shows the age group-adjusted association between the age-adjusted log-transformed risk scores and breast cancer. For each of the models, the OR per SD of the age-adjusted risk scores was higher for the combined risk score than that for the SNP-based score and the corresponding model risk score. The increase in OR by the addition of SNPs was 19.0 % for BCRAT and 26.1 % for IBIS.
Receiver operating characteristic curve analysis confirmed that, for each model, the combined risk score gave greater discrimination than the SNP-based score and the corresponding model risk score (Table 6). The increase in AUC compared with 0.5 by the addition of SNPs was 10.9 % for BCRAT and 11.3 % for IBIS.
For each of the models, the risk scores and the combined risk scores were classified as low risk (<1.5 %), intermediate risk (≥1.5 and <2.0 %), and high risk (≥2.0 %), as shown in Tables 7 and 8. The proportion of cases moving into a higher risk category was 20.4 % for BCRAT and 35.4 % for IBIS, while the proportion of cases moving into a lower risk category was 6.8 % for BCRAT and 10.8 % for IBIS. The proportion of controls moving into a lower risk category was 6.2 % for BCRAT and 16.8 % for IBIS, while the proportion of controls moving into a higher risk category was 11.7 % for BCRAT and 23.1 % for IBIS. The net reclassification improvement was 0.082 for BCRAT (95 % CI 0.003, 0.162) and 0.181 for IBIS (95 % CI 0.085, 0.273).
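As an arithmetic check, the reclassification proportions reported for both groups can be substituted into the net reclassification improvement formula from the methods, P(up|case) − P(down|case) + P(down|control) − P(up|control). The Python sketch below does this using only figures stated in the text; the function and variable names are illustrative. The computed values agree with the reported ones to within about 0.002, which is consistent with rounding of the published percentages.

```python
def nri_from_proportions(up_case, down_case, down_control, up_control):
    """Net reclassification improvement from the four reported proportions."""
    return up_case - down_case + down_control - up_control


# (up|case, down|case, down|control, up|control, reported NRI), as stated in the text.
reported = {
    ("African American", "BCRAT"): (0.425, 0.101, 0.112, 0.403, 0.033),
    ("African American", "IBIS"): (0.377, 0.065, 0.082, 0.335, 0.060),
    ("Hispanic", "BCRAT"): (0.204, 0.068, 0.062, 0.117, 0.082),
    ("Hispanic", "IBIS"): (0.354, 0.108, 0.168, 0.231, 0.181),
}

for (group, model), (*proportions, published) in reported.items():
    computed = nri_from_proportions(*proportions)
    print(f"{group:17s} {model:5s} computed={computed:.3f} reported={published:.3f}")
```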
The ability of a 77-SNP panel to improve the risk estimates provided by the major breast cancer risk assessment algorithms for Caucasians (BOADICEA, BRCAPRO, BCRAT) has been previously quantified, and the combined SNP and model risk scores are now among the strongest known measures for differentiating women with and without breast cancer, at least for Caucasian women [20, 21]. For example, the OR per SD of age-adjusted risk scores for the model that included the SNP score versus the same model alone was from 1.67 to 1.80 for BCRAT, and from 1.30 to 1.52 for IBIS.
The present study quantifies how much the addition of a SNP risk component can also improve the discrimination of the BCRAT and IBIS models for both African American women and Hispanic women. Specifically, for African American women the OR per SD increased from 1.25 to 1.37 when using BCRAT and from 1.04 to 1.22 when using IBIS. For Hispanic women, the corresponding changes were from 1.25 to 1.48 and from 1.15 to 1.42.
For each of the risk prediction models, the combined risk score resulted in approximately 40 % of African American cases moving into a higher risk category and approximately 10 % of controls moving into a lower risk category. For Hispanics, over 20 % of cases moved into a higher risk category, and 6 % of controls moved into a lower risk category when using BCRAT and 17 % when using IBIS. These values are higher than those reported in the two previous studies of Caucasian cohorts, which identified between 3 and 10 % of cases moving to a higher risk category [6, 20].
The AUC value for BCRAT obtained for Hispanic women is lower than that previously reported. Whilst the IBIS model is widely used across ethnicities in the US, it has only been validated for Caucasian populations, and both the ORs and AUC derived here for the model alone are low for both African American and Hispanic women. For the present analysis, information was not available for second-degree relatives or for family history of ovarian cancer, and it is not clear whether this affected the performance of the IBIS model. In addition to ethnicity differences and reduced pedigree inputs, the low values may reflect that IBIS was developed using data from studies of predominantly postmenopausal women and is intended for use with high-risk populations.
Similarly, the SNPs used in this study were predominantly identified by discovery GWAS of Caucasian women. The estimated OR per SD for the log SNP-based score alone was 1.24 for African American and 1.39 for Hispanic women, which are both lower than the estimate of 1.55 reported by Mavaddat et al. for Caucasian women. Whilst susceptibility loci are likely to be similar across ethnicities, the informative SNPs for those loci could vary across ethnicities and remain to be confirmed. Thus, the SNP risk scores used here are likely to improve once GWAS datasets use Phase I datasets of the relevant ethnic populations and fine mapping studies have been conducted across populations.
Overall, breast cancer prevention strategies rely upon accurate risk assessment, the models for which have typically only been validated for Caucasian women. Although most national screening programs rely solely upon age as the factor to determine eligibility (e.g., inviting only women above a certain age threshold for screening), more targeted screening based upon a calibrated risk assessment is being considered. We hope that the information presented in studies such as this can eventually be used to help make screening more effective across all populations of the world, particularly those with fewer resources.
1. Michailidou K, Hall P, Gonzalez-Neira A et al (2013) Large-scale genotyping identifies 41 new loci associated with breast cancer risk. Nat Genet 45(4):353–361. doi:10.1038/ng.2563
2. Pharoah PD, Antoniou AC, Easton DF et al (2008) Polygenes, risk prediction, and targeted prevention of breast cancer. N Engl J Med 358(26):2796–2803. doi:10.1056/NEJMsa0708739
3. Mealiffe ME, Stokowski RP, Rhees BK et al (2010) Assessment of clinical validity of a breast cancer risk model combining genetic and clinical information. J Natl Cancer Inst 102(21):1618–1627. doi:10.1093/jnci/djq388
4. Comen E, Balistreri L, Gonen M et al (2011) Discriminatory accuracy and potential clinical utility of genomic profiling for breast cancer risk in BRCA-negative women. Breast Cancer Res Treat 127(2):479–487. doi:10.1007/s10549-010-1215-2
5. Dite GS, Mahmoodi M, Bickerstaffe A et al (2013) Using SNP genotypes to improve the discrimination of a simple breast cancer risk prediction model. Breast Cancer Res Treat 139(3):887–896. doi:10.1007/s10549-013-2610-2
6. Brentnall AR, Evans DG, Cuzick J (2014) Distribution of breast cancer risk from SNPs and classical risk factors in women of routine screening age in the UK. Br J Cancer 110(3):827–828. doi:10.1038/bjc.2013.747
7. Wacholder S, Hartge P, Prentice R et al (2010) Performance of common genetic variants in breast-cancer risk models. N Engl J Med 362(11):986–993. doi:10.1056/NEJMoa0907727
8. Banegas MP, Gail MH, LaCroix A et al (2012) Evaluating breast cancer risk projections for Hispanic women. Breast Cancer Res Treat 132(1):347–353. doi:10.1007/s10549-011-1900-9
9. Gail MH, Costantino JP, Pee D et al (2007) Projecting individualized absolute invasive breast cancer risk in African American women. J Natl Cancer Inst 99(23):1782–1792. doi:10.1093/jnci/djm223
10. Adams-Campbell LL, Makambi KH, Frederick WA et al (2009) Breast cancer risk assessments comparing Gail and CARE models in African-American women. Breast J 15(Suppl 1):S72–S75. doi:10.1111/j.1524-4741.2009.00824.x
11. Amir E, Freedman OC, Seruga B et al (2010) Assessing women at high risk of breast cancer: a review of risk assessment models. J Natl Cancer Inst 102(10):680–691. doi:10.1093/jnci/djq088
12. Prentice RL, Anderson GL (2008) The women’s health initiative: lessons learned. Annu Rev Public Health 29:131–150. doi:10.1146/annurev.publhealth.29.020907.090947
13. Mavaddat N, Pharoah PD, Michailidou K et al (2015) Prediction of breast cancer risk based on profiling with common genetic variants. J Natl Cancer Inst 107(5):djv036. doi:10.1093/jnci/djv036 [published online ahead of print 8 Apr 2015]
14. Tyrer J, Duffy SW, Cuzick J (2004) A breast cancer prediction model incorporating familial and personal risk factors. Stat Med 23(7):1111–1130. doi:10.1002/sim.1668
15. Feng Y, Stram DO, Rhie SK et al (2014) A comprehensive examination of breast cancer risk loci in African American women. Hum Mol Genet 23(20):5518–5526. doi:10.1093/hmg/ddu252
16. Long J, Zhang B, Signorello LB et al (2013) Evaluating genome-wide association study-identified breast cancer risk variants in African-American women. PLoS One 8(4):e58350. doi:10.1371/journal.pone.0058350
17. Palmer JR, Ruiz-Narvaez EA, Rotimi CN et al (2013) Genetic susceptibility loci for subtypes of breast cancer in an African American population. Cancer Epidemiol Biomark Prev 22(1):127–134. doi:10.1158/1055-9965.epi-12-0769
18. Fejerman L, Stern MC, Ziv E et al (2013) Genetic ancestry modifies the association between genetic risk variants and breast cancer risk among Hispanic and non-Hispanic white women. Carcinogenesis 34(8):1787–1793. doi:10.1093/carcin/bgt110
19. StataCorp (2013) Stata statistical software, release 13. StataCorp LP, College Station
20. Dite GS, MacInnis RJ, Bickerstaffe A et al (2015) Breast cancer risk prediction using clinical models and 77 independent risk-associated SNPs for women aged under 50 years: Australian Breast Cancer Family Registry. Cancer Epidemiol Biomark Prev (submitted)
21. Hopper JL (2015) Odds PER Adjusted standard deviation (OPERA): comparing strengths of associations for risk factors measured on different scales, and across diseases and populations. Am J Epidemiol 182(10):863–867
22. Amir E, Evans DG, Shenton A et al (2003) Evaluation of breast cancer risk assessment packages in the family history evaluation and screening programme. J Med Genet 40(11):807–814
23. Evans DG, Howell A (2015) Can the breast screening appointment be used to provide risk assessment and prevention advice? Breast Cancer Res 17:84. doi:10.1186/s13058-015-0595-y
The authors thank Chancellor Hohensee of the Fred Hutchinson Cancer Research Center, Division of Public Health Sciences, Seattle, Washington for assistance in compiling the dataset and Dr Adrian Bickerstaffe of the Centre for Epidemiology and Biostatistics, The University of Melbourne, Australia for assistance in batch processing the IBIS data. The WHI program is funded by the National Heart, Lung, and Blood Institute, NIH, U.S. Department of Health and Human Services through contracts N01WH22110, 24152, 32100-2, 32105-6, 32108-9, 32111-13, 32115, 32118-32119,32122, 42107-26, 42129-32, and 44221.
Conflict of Interest
Dr Richard Allman is an employee of Genetic Technologies Ltd.
Allman, R., Dite, G.S., Hopper, J.L. et al. SNPs and breast cancer risk prediction for African American and Hispanic women. Breast Cancer Res Treat 154, 583–589 (2015). https://doi.org/10.1007/s10549-015-3641-7
- Single nucleotide polymorphisms
- Breast cancer
- Risk prediction
- African American