Introduction

Breast cancers are biologically heterogeneous. Gene expression profiling of breast tumor tissues has identified reliable patterns indicative of clinically distinct subtypes [1-4]. At this time, the subtype classifications most often used in clinical settings are based on the commonly measured tumor markers estrogen receptor (ER), progesterone receptor (PR), and HER2/neu (HER2), which offer imperfect but practical surrogates for genomic profiling [5]. It is increasingly recognized that breast cancer subtypes vary in occurrence (especially by race/ethnicity) [5-8], in their detection by screening mammography [9, 10], and in their risk associations with other factors [11-15]. Treatment options and prognosis also depend on breast cancer subtype [9, 16-18].

Despite accumulating evidence that breast cancer subtypes should be considered separately, it is still routine to present statistics that consider the disease as a single entity. Perhaps most commonly cited is the 12% lifetime probability statistic [19], prompting the widespread perception that 'one in eight' US women will develop the disease. This single estimate does not convey race-specific variation in breast cancer risks. Moreover, although some groups are reported to have greater relative risk of specific breast cancer subtypes, there are no data with which to counsel patients about the absolute magnitude of these risks in comparison with other threats to their health; one clinically important example is the ER-, PR-, and HER2-negative (triple-negative) breast cancer subtype among black women [5]. To provide estimates relevant to patient care and health policy, we took advantage of recent data on subtype-specific incidence patterns (collected in the large and diverse population of California) to calculate absolute lifetime risks of developing a first primary breast cancer according to breast cancer subtype and presented those calculations separately for women of four racial/ethnic groups.

Materials and methods

Study population

The California Cancer Registry (CCR), a contributor to the National Cancer Institute's Surveillance, Epidemiology and End Results (SEER) program, has ascertained all cancers diagnosed in the state of California since 1988, with estimated 99% completeness. In this analysis, we included all invasive breast cancers (International Classification of Disease for Oncology, Third Edition [ICD-O-3] sites 50.0 to 50.9; all histologies excluding sarcomas and lymphomas 9050 to 9055, 9140, and 9590 to 9989). The CCR has collected information on ER and PR since 1990 and on HER2 since 1999. Before 2006, 29% of cases lacked HER2 data; subsequently, HER2 data completeness increased to at least 85%, and thus we limited our assessment to the 40,936 women whose cancer was diagnosed between 1 January 2006 and 31 December 2007, the most recent years for which data are available from the CCR. Each marker is reported as positive, negative, borderline, not tested, not recorded, or unknown. ER and PR were evaluated by dextran-coated charcoal assays or immunohistochemistry (IHC), with positive defined as greater than or equal to 5% nuclear staining; HER2 was tested by IHC (with 0 and 1+ defined as negative, 2+ as borderline, and 3+ as positive) or fluorescence in situ hybridization (with two or fewer gene copies defined as negative and more than two copies defined as positive) [20]. Tumor size and stage at diagnosis, patient age at diagnosis, race, and ethnicity were abstracted directly from the medical record; in most cases (84%), race was derived from patient self-report [21]. We categorized race/ethnicity as non-Hispanic (NH) white, NH black, Hispanic, and NH Asian or Pacific Islander (hereafter referred to as white, black, Hispanic, and Asian).
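The marker thresholds described above can be written out as a minimal Python sketch. The function names and the numeric input encoding are illustrative assumptions and are not part of the CCR data format; only the cut-offs (5% nuclear staining, IHC scores 0 to 3+, two gene copies) are taken from the text.

```python
def hormone_receptor_status(percent_nuclear_staining):
    """ER or PR by IHC: positive at >= 5% nuclear staining (threshold from the text)."""
    return "positive" if percent_nuclear_staining >= 5 else "negative"


def her2_status_from_ihc(score):
    """HER2 by IHC: 0 and 1+ negative, 2+ borderline, 3+ positive."""
    if score in (0, 1):
        return "negative"
    if score == 2:
        return "borderline"
    if score == 3:
        return "positive"
    raise ValueError("IHC score must be 0, 1, 2, or 3")


def her2_status_from_fish(gene_copies):
    """HER2 by FISH: two or fewer gene copies negative, more than two positive."""
    return "negative" if gene_copies <= 2 else "positive"
```

Under these assumptions, an IHC score of 2+ yields a borderline HER2 result.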

Categorization of breast cancer subtypes

We categorized breast cancer subtypes according to tumor expression of ER, PR, and HER2; we designated three subtype groupings, which are distinguished by their differences in clinical management (consisting of treatment with ER-, PR-, or HER2-targeted therapies) and by their prognosis. Following previous investigators [22], we defined a 'luminal' category as ER- or PR-positive or both and HER2-negative (a category that overlaps, but does not correspond completely, with the gene expression-based subtypes luminal A and luminal B) [3, 4]; other subtype categories were HER2-positive (ER- and PR-positive or -negative and HER2-positive) and triple-negative (ER-negative, PR-negative, and HER2-negative) [5-8, 17].
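The three-way grouping can be expressed as a short decision rule. The sketch below is illustrative only: the function name and string labels are assumptions, and returning `None` for cases with a borderline, untested, unrecorded, or unknown marker mirrors the exclusion rule reported under Study participants.

```python
def classify_subtype(er, pr, her2):
    """Map ER, PR, and HER2 statuses to a subtype, or None if any marker is unusable."""
    # Only clear positive or negative results can be classified.
    if any(status not in ("positive", "negative") for status in (er, pr, her2)):
        return None
    # HER2-positive tumors form one group regardless of ER and PR status.
    if her2 == "positive":
        return "HER2-positive"
    # HER2-negative with ER or PR (or both) positive is the 'luminal' category.
    if er == "positive" or pr == "positive":
        return "luminal"
    # All three markers negative.
    return "triple-negative"
```

For example, `classify_subtype("negative", "positive", "negative")` returns `"luminal"`.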

Statistical analysis

We used DevCan software (version 6.4.1), developed by the National Cancer Institute, to compute absolute probabilities that a specific breast cancer subtype will be diagnosed and the associated 95% confidence intervals (CIs) [23]. DevCan employs competing-risks methodology to estimate age-dependent probabilities of cancer occurrence; it accounts for competing risks of death (specifically, all non-breast cancer causes of death) and is conditioned upon the patient's never having had breast cancer previously [24-27]. For each age group, DevCan calculates the probabilities of two mutually exclusive events: either developing the cancer of interest or dying from other causes without ever having developed the cancer of interest. Consequently, cause-specific mortality data are required to estimate incidence of the cancer of interest. The DevCan program uses data on cause-specific mortality for the US population; the data are specific to age, sex, race, and calendar year and are derived from the National Center for Health Statistics (NCHS) of the Centers for Disease Control and Prevention [23, 28]. Since NCHS does not provide subtype-specific breast cancer mortality data, we used overall breast cancer-specific mortality in place of breast cancer subtype-specific mortality and assumed that the difference in these mortality rates would be small at the population level. We limited our assessment to risks of developing a first breast cancer [23] and did not consider second primary breast cancer (a rare event affecting only 4% of breast cancer survivors) [29]. All analyses were conducted in accordance with the Institutional Review Board approval of the Cancer Prevention Institute of California (protocol number 2001-043).
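To illustrate the competing-risks idea, the following Python sketch accumulates the probability of a first cancer diagnosis across age intervals while removing women who die of other causes. It is a simplified textbook calculation with piecewise-constant rates, not DevCan's actual algorithm; the function name, the 5-year interval width, and any rates passed to it are hypothetical.

```python
import math


def lifetime_risk(cancer_rates, other_death_rates, width=5.0):
    """Cumulative probability of a first cancer under competing risk of death.

    cancer_rates: per-year incidence rate among cancer-free women, per age interval
    other_death_rates: per-year death rate from other causes, per age interval
    width: length of each age interval in years (assumed constant)
    """
    if len(cancer_rates) != len(other_death_rates):
        raise ValueError("rate lists must have the same length")
    alive_and_cancer_free = 1.0
    risk = 0.0
    for cancer_rate, death_rate in zip(cancer_rates, other_death_rates):
        total_rate = cancer_rate + death_rate
        if total_rate == 0:
            continue
        # Probability of leaving the cancer-free state during this interval.
        p_exit = 1.0 - math.exp(-total_rate * width)
        # Share of exits that are cancer diagnoses rather than other-cause deaths.
        risk += alive_and_cancer_free * p_exit * cancer_rate / total_rate
        alive_and_cancer_free *= 1.0 - p_exit
    return risk
```

In this sketch, raising the other-cause death rates lowers the estimated cancer risk, because fewer women remain alive and cancer-free to be diagnosed at older ages.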

Results and discussion

Study participants

From the cohort of 40,936 women whose breast cancer was diagnosed in California in 2006-2007, we excluded 7,737 cases (18.9%) having any of the three markers ER, PR, or HER2 coded as borderline, not tested, not recorded, or unknown. This excluded group comprised 5,069 whites, 505 blacks, 1,262 Hispanics, and 901 Asians; there were no significant differences according to race/ethnicity or age between the cases excluded for missing ER, PR, or HER2 results and the included cases that had ER, PR, and HER2 data available (results not shown). We included a total of 33,199 women, for whom data on ER, PR, and HER2 were available, in our analyses. Table 1 presents demographic and clinical characteristics of the patient population derived from the CCR. Among breast cancer cases, 66.1% were white, 6.2% black, 16.6% Hispanic, and 11% Asian. Compared with other racial groups, white patients had a higher proportion of tumors that were luminal (71.6% versus 53% to 62.8%), that were diagnosed in local stage (64.5% versus 54.5% to 61.7%), and that were diagnosed at a size of 2 cm or less (61.7% versus 48.6% to 58.4%). Black women had the highest proportion of tumors that were triple-negative (24.6% versus 10% to 16.7%). A chi-square test of breast cancer subtypes by race yielded a P value of less than 0.0001, indicating a statistically significant difference in subtype distribution between racial groups.
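The exclusion counts reported above can be checked with a few lines of Python. All numbers are taken from the text; the variable names are illustrative.

```python
total_cases = 40936
excluded_by_race = {"white": 5069, "black": 505, "Hispanic": 1262, "Asian": 901}

# The race-specific exclusions should add up to the stated total of 7,737.
excluded = sum(excluded_by_race.values())
included = total_cases - excluded
excluded_percent = round(100 * excluded / total_cases, 1)

print(excluded, included, excluded_percent)  # 7737 33199 18.9
```

The race-specific exclusions sum to 7,737, which leaves 33,199 included cases and an exclusion fraction of 18.9%, in agreement with the figures in the paragraph.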

Table 1 Characteristics of patients whose breast cancer was diagnosed in California from 2006 to 2007

Lifetime risks by racial/ethnic group

Table 2 presents absolute lifetime risks of developing specific breast cancer subtypes for white, black, Hispanic, and Asian women. All racial/ethnic groups have a higher lifetime risk of developing luminal breast cancer than any other subtype, but this luminal breast cancer risk varies significantly by race/ethnicity and ranges from 4.60% (95% CI 4.40% to 4.81%) for Hispanics to 8.10% (95% CI 7.94% to 8.20%) for whites. Although the 95% CIs around risks for HER2-positive breast cancer do not overlap between most racial/ethnic groups (for example, Hispanics 1.56%, 95% CI 1.46% to 1.68% and Asians 1.91%, 95% CI 1.78% to 2.07%), the risk differences are smaller in magnitude than for the luminal subtype. For triple-negative breast cancer, blacks have the highest lifetime risk at 1.98% (95% CI 1.80% to 2.17%), which is significantly greater than that of Asian (0.77%, 95% CI 0.67% to 0.88%), Hispanic (1.04%, 95% CI 0.96% to 1.13%), and white (1.25%, 95% CI 1.20% to 1.30%) women. For all races and all subtypes combined, overall absolute risk is 12.3% (95% CI 12.2% to 12.4%), which is consistent with 1 in 8 women developing breast cancer in her lifetime. Figure 1 presents race-specific incidence curves for each subtype.
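As a reading aid, the lifetime risks above can be restated in 'one in N' form, and the confidence-interval comparisons can be checked directly. The sketch below uses only percentages quoted in the text; the helper names are hypothetical, and non-overlap of 95% CIs is used here only as the informal criterion the paragraph itself applies, not as a formal test.

```python
def one_in_n(risk_percent):
    """Express a risk given in percent as the N in 'one in N', rounded."""
    return round(100 / risk_percent)


def intervals_overlap(first, second):
    """True if two (lower, upper) intervals share any point."""
    return first[0] <= second[1] and second[0] <= first[1]


print(one_in_n(12.3))  # 8   (all races, all subtypes)
print(one_in_n(1.98))  # 51  (triple-negative, black women)
print(one_in_n(1.25))  # 80  (triple-negative, white women)

# Triple-negative 95% CIs: black versus white women.
print(intervals_overlap((1.80, 2.17), (1.20, 1.30)))  # False
# HER2-positive 95% CIs: Hispanic versus Asian women.
print(intervals_overlap((1.46, 1.68), (1.78, 2.07)))  # False
```

On this scale, the triple-negative lifetime risk corresponds to roughly 1 in 51 black women and 1 in 80 white women.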

Table 2 Absolute lifetime risk of developing breast cancer by subtype and race/ethnicity
Figure 1

Age-specific incidence of breast cancer. Incidence is expressed as rates per 100,000 by age (in years) for subtypes - luminal (ER- or PR-positive or both and HER2-negative), HER2-positive (ER- and PR-positive or -negative and HER2-positive), and triple-negative (ER-negative, PR-negative, and HER2-negative) - and for racial/ethnic groups: (a) whites, (b) blacks, (c) Hispanics, and (d) Asians. ER, estrogen receptor; HER2, Her2/neu; PR, progesterone receptor.

Absolute risks by age and racial/ethnic group

Table 3 presents age-specific risks for breast cancer subtypes for women who are unaffected by cancer at age 40. Between the ages of 40 and 49, white women have 0.87% (95% CI 0.84% to 0.90%) probability of developing luminal breast cancer, 0.27% (95% CI 0.25% to 0.29%) probability of developing HER2-positive breast cancer, and 0.17% (95% CI 0.16% to 0.19%) probability of developing triple-negative breast cancer; for blacks, corresponding probabilities are 0.59% (95% CI 0.52% to 0.66%), 0.31% (95% CI 0.26% to 0.37%), and 0.34% (95% CI 0.29% to 0.40%). For all races, nearly half the lifetime probability of developing luminal breast cancer occurs after age 70, whereas triple-negative breast cancer and HER2-positive breast cancer subtypes have an earlier age distribution, as shown in Figure 2. The Supplemental table (Additional file 1) presents age-specific risks in 10-, 20-, and 30-year intervals in addition to lifetime risks for women ages 20 to 80 by race/ethnicity and by breast cancer subtype.
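The subtype-specific probabilities for ages 40 to 49 quoted above can be added to give an approximate combined risk for that decade. This is a rough check rather than a DevCan output: it assumes the three subtypes are mutually exclusive first diagnoses, so that their small probabilities are approximately additive. The variable names are illustrative.

```python
# Probabilities (percent) of each subtype between ages 40 and 49, from the text.
risk_40_to_49 = {
    "white": {"luminal": 0.87, "HER2-positive": 0.27, "triple-negative": 0.17},
    "black": {"luminal": 0.59, "HER2-positive": 0.31, "triple-negative": 0.34},
}

# Approximate combined risk per group, rounded to two decimals.
combined = {
    group: round(sum(by_subtype.values()), 2)
    for group, by_subtype in risk_40_to_49.items()
}

print(combined)  # {'white': 1.31, 'black': 1.24}
```

Both combined values fall below 2%, and each individual subtype risk in this decade is below 1%.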

Table 3 Absolute risk of developing breast cancer in specific age intervals for cancer-free 40-year-old women by subtype and race/ethnicity
Figure 2

Distribution of breast cancer subtypes by age. Distribution is expressed as a percentage, and age is expressed in years. Subtypes are defined as (a) luminal (ER- or PR-positive or both and HER2-negative), (b) HER2-positive (ER- and PR-positive or -negative and HER2-positive), and (c) triple-negative (ER-negative, PR-negative, and HER2-negative). ER, estrogen receptor; HER2, Her2/neu; PR, progesterone receptor.

Conclusions

We present lifetime and age-specific probabilities of developing luminal (ER- or PR-positive or both and HER2-negative), HER2-positive (ER- and PR-positive or -negative and HER2-positive), and triple-negative (ER-, PR-, and HER2-negative) subtypes of breast cancer for women from four racial/ethnic groups, using the most recently available data from the large and diverse population of California. These estimates refine the frequently cited 'one in eight' statistic [12], which fails to capture the substantial differences in epidemiology and prognosis among breast cancer subtypes [6-9, 16, 17]. Most importantly, these estimates facilitate clinically relevant discussion between patients and physicians. For women considering prevention strategies such as prophylactic tamoxifen or raloxifene, which reduce the incidence of only certain subtypes of breast cancer [30, 31], or screening methods such as magnetic resonance imaging (MRI), which may contribute more to the detection of triple-negative than luminal cancers [9, 10], our estimates may inform decisions about managing breast cancer risk. A woman at low risk for a specific subtype might choose to forgo particular interventions and their side effects (for example, stroke and uterine cancer from tamoxifen or false-positive biopsies from screening mammography or MRI) [30, 32], depending on the relative importance of such side effects and her competing health risks.

We present statistics separately for women in four major racial/ethnic groupings because lifetime risks for breast cancer as a whole vary substantially by these groups. Most notable were the significantly increased risks of luminal breast cancer among whites and of triple-negative breast cancer among black women. Our findings are consistent with studies reporting greater relative risks of triple-negative breast cancer among black women [5, 7, 8, 33] and strengthen the rationale for investigating genetic, reproductive, and lifestyle factors that may mediate this racial difference, such as age at menarche, family cancer history, breastfeeding, and abdominal adiposity [34, 35]. In all groups, the luminal subtype was the most common one. This universal predominance of luminal (ER- or PR-positive or both and HER2-negative) breast cancer, regardless of race, may be reassuring since this subtype has the best survival [5, 17], can be targeted by existing chemoprevention agents [30, 31], and may be most readily detectable by screening mammography [9, 10, 36, 37]. Although the disproportionately increased risk among black women of poor-prognosis triple-negative breast cancer warrants further study and targeted interventions, black women may be reassured to learn that they also have a high probability of avoiding this disease over their lifetimes.

It is essential to differentiate risks according to a woman's current age. A prior analysis using SEER data characterized qualitative patterns of breast cancer incidence according to ER status and reported an age-related crossover between black and white women [38]. We found that, for all races and for all subtypes, absolute breast cancer risks were low between ages 40 and 49 years: less than 1% per subtype and less than 2% for all subtypes combined. For women between ages 50 and 59, risks of each subtype increased substantially, and the greatest increase was for luminal breast cancers in white women. Nearly half the lifetime risk of luminal breast cancer, the dominant subtype for all racial/ethnic groups, occurred at or after age 70. These findings are important to the ongoing critical examination of mammographic screening guidelines [32, 36, 39] and may warrant extending recommendations for mammographic screening beyond the current upper limit of 69 years of age [32].

Our analyses have certain limitations, which should be considered in interpreting our results. Given the DevCan program's competing-risks methodology, cause-specific mortality is required to calculate incidence of the cancer in question [23, 25]; we used overall US breast cancer mortality rates to calculate subtype-specific incidence [28] because subtype-specific mortality rates are not available. However, since overall breast cancer mortality is low at the population level, this is unlikely to affect our risk estimates substantially. Racial misclassification might present another potential source of bias, but given that prior studies of the CCR found that race data derive from patient self-report in more than 80% of cases [21], it seems improbable that a large proportion were incorrectly classified. We excluded 7,737 cases (18.9%) from analysis because of missing ER, PR, or HER2 information; since there were no major differences in race or age distribution between the excluded and included cases, the lack of information on these cases seems unlikely to have biased our findings. Although defining subtypes by ER, PR, and HER2 expression does not entirely approximate results of genomic profiling, this classification offers a practical substitute that is increasingly well characterized in published literature [5, 7-9, 40-42] and that guides breast cancer treatment [43].

This study reports average lifetime risks at the population level; it does not address the urgent need for more accurate risk stratification of individual patients or the limitations of current breast cancer risk prediction models [44]. Genetic mutations such as BRCA1 convey dramatically increased risks of triple-negative breast cancer [45, 46], and the results of genome-wide association studies may eventually guide even more personalized risk prediction [47, 48]. Our estimates may inform health policy and resource planning across diverse populations and may help patients and clinicians to weigh the average probabilities of developing specific breast cancer subtypes against other competing health risks.