Breast cancer is highly heterogeneous with regard to morphological spectrum, clinical presentation, and response to cancer therapy [1]. Based on gene-expression profiling using cDNA microarray technology, a molecular taxonomy has been proposed to classify breast cancer into luminal A, luminal B, basal-like, and HER2 subtypes, which have distinct differences in prognosis and responses to cancer therapies [2, 3]. Using conventional immunohistochemistry (IHC) detection of estrogen receptor-alpha (ERα), progesterone receptor (PR), and human epidermal growth factor receptor 2 (HER2) status, molecular subtypes of breast cancer can be classified as: luminal A (ERα+ and/or PR+, HER2-), luminal B (ERα+ and/or PR+, HER2+), triple-negative (ERα-, PR- and HER2-), and HER2 (HER2+, ERα-, and PR-) [4]. It has been suggested that the triple-negative and HER2 subtypes defined by IHC have poorer survival outcomes and respond differently to adjuvant chemotherapy compared with the luminal A subtype [4, 5]. Most previous studies were conducted in Western populations, while few population-based studies have been conducted in Asians.

Racial differences in molecular subtypes have been reported. For example, the triple-negative subtype appears to be more common in African-American populations, especially among younger African-American women, compared with European-ancestry populations [4, 6–8]. One study has suggested that the HER2 subtype is more common in Asian populations and that the distribution of breast cancer subtypes among Asian women may vary by ethnicity (e.g., Chinese, Japanese) [9]. A few studies have evaluated the molecular subtypes of breast cancer in Chinese women [10–14]. However, most of those studies had relatively small sample sizes and applied different criteria to define positivity. For example, HER2 has been defined as positive with a DAKO score of 3+ (>10% of cells show strong complete membrane staining) [10–12, 14] or ≥2+ (>10% of cells show weak to moderate complete membrane staining) [13]. The widely used criteria for HER2 positivity, as modified by the American Society of Clinical Oncology/College of American Pathologists guidelines [15], were not used in those publications. The prevalence and clinicopathological significance of breast cancer subtypes in the Chinese population therefore merit verification. The present study used data from a large-scale, population-based cohort study of breast cancer patients in Shanghai, China [16]. The distribution of molecular subtypes of breast cancer and their correlation with breast cancer outcomes were evaluated.



Study participants were women aged 20 to 75 years who were diagnosed with a primary breast cancer and enrolled in the Shanghai Breast Cancer Survival Study (SBCSS), a longitudinal, population-based cohort study in Shanghai, China [16]. Through the population-based Shanghai Cancer Registry, 6,299 women were identified approximately 6.5 months after a cancer diagnosis, and 5,042 were enrolled in the study (participation rate: 80.0%) between March 2002 and April 2006. The SBCSS was approved by the institutional review boards of all institutions involved in the study, and written informed consent was obtained from all participants.

Data collection

Trained interviewers, all retired health professionals (e.g., nurses and physicians), conducted in-person interviews using a standard baseline survey questionnaire to collect information on demographic characteristics, reproductive history, disease history, medication use, selected lifestyle factors, diet, use of complementary and alternative medicine, and quality of life. Clinical information collected included cancer stage, tumor ERα and PR status, and primary treatments. Inpatient medical charts were reviewed to verify clinical information. Anthropometric measurements, including height, weight, and circumferences of the waist and hips, were taken according to a standard protocol by trained interviewers at the baseline interview. The cohort is being followed up by in-person interviews that take place at 18 months, 36 months, and 60 months after cancer diagnosis, supplemented by record linkage to the Shanghai Vital Statistics Registry.

Tissue slide preparation

Pathology slides for 2,791 cases were available for this study. The slides were collected from the diagnosis hospitals according to a standard protocol. Briefly, the formalin-fixed, paraffin-embedded blocks were cut into 5 μm thick sections. The sectioned tissue slides were covered with a thin layer of paraffin and stored in vacuum chambers (Terra Universal, Inc., Anaheim, CA) placed in a 4°C cold room to preserve the antigenicity of the sectioned tissues. This slide storage method has been established and verified in our centralized laboratory [17]. The diagnoses and clinicopathologic data were confirmed by a combination of medical chart review and centralized review of pathology slides. The histological types of breast cancer were confirmed according to the criteria of the World Health Organization classification [18] by the study pathologist (Su). The histologic grade of all cancer slides was determined using the Nottingham histologic grading system [19].


HER2 staining was conducted for all 2,791 participants included in this study in our centralized laboratory, with rabbit polyclonal antibody recognizing the HER2 cytoplasmic domain (DAKO, Cat# A0485, 1:100), following the protocol of the DAKO Envision™ kit (DAKO, Cat# K4011). This staining protocol has been validated by comparing it with the HercepTest™ kit (DAKO, Cat# K5204) using commercial tissue microarray (TMA) slides, which included tissue slides from 70 breast cancer cases (BR701, US Biomax Inc.). A 100% concordance rate between the two methods was obtained. For patients whose ERα (243 cases) and PR (222 cases) status could not be obtained from medical charts, double immunohistochemical staining for PR/HER2 and ERα/estrogen receptor beta (ERβ) was conducted. PR/HER2 double staining was performed using the EnVision™ G|2 Doublestain System (Additional file 1). ERα/ERβ double immunofluorescent staining was performed using a sequential double labeling protocol proposed by Vector Laboratories (Additional file 2). The staining protocols were carefully validated by comparison with standard staining, using the above BR701 commercial TMA breast cancer slides (Figures 1 and 2). We constructed a TMA block as a quality control, which included one breast cancer tissue sample with positive expression of ERα, ERβ, and PR; one breast cancer tissue sample with positive HER2 expression; and normal ovary, prostate, and liver tissue samples (Figure 3). The control TMA slides were stained in parallel with each batch of study samples using an Autostainer Universal Staining System (DAKO, Model LV-1).

Figure 1

Double immunohistochemical staining for PR/HER2. To validate the lab staining method, commercial breast cancer tissue microarray (TMA) slides BR701 (US Biomax Inc.) were used. D1, TMA core with HER2+ and PR- staining. D2, HER2- and PR+ staining. G1, HER2+ and PR+ staining. D9, PR+ and HER2 weak-positive (borderline) staining. PR/HER2 double stains were comparable to standard single stains for HER2 and PR, although the PR signal in the double staining was somewhat weaker (original magnification: ×200).

Figure 2

Double immunofluorescence staining for ERα/ERβ. The same commercial breast cancer TMA slides were used to validate ERα staining. TMA cores C2 and C8, strong ERα nuclear staining. F10, weak ERα nuclear staining. F6, negative ERα staining. ERα fluorescent positive signals were comparable to standard single staining for ERα (original magnification: ×100).

Figure 3

Double immunofluorescence staining for ERα/ERβ using lab-constructed TMA control slides. For immunostaining quality control, lab-constructed slides were stained with each batch. One TMA core of breast tissue exhibited strong ERα and ERβ nuclear staining in tumor cells (T), in contrast to normal epithelium (N). Most tumor cells exhibited co-expression of ERα and ERβ, as revealed in the overlapping image (original magnification: ×200).

Membrane staining intensity and the pattern of HER2 staining were evaluated using the 0 to 3+ scale [15]. Scores of 0 and 1+ (weak immunostaining in less than 30% of tumor cells) were defined as negative, 2+ (complete membranous staining in at least 10% but less than 30% of tumor cells) as equivocal, and 3+ (uniform intense membranous staining in at least 30% of tumor cells) as positive. For ERα staining, a clinically validated threshold [20, 21] for the prediction of response to hormonal therapy (>10% cutoff for whole slides) was used in this study. PR expression was considered positive if the nuclei of more than 1% of cells stained positive in a single slide. The stained slides were evaluated independently by two investigators (Su and Li). All slides with inconsistent readings were re-evaluated by the two investigators jointly to assign the final status.
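The scoring rules above, combined with the subtype definitions used in this study, can be sketched as a small classification routine. This is a minimal illustration under stated assumptions, not the software used in the study: the function names and the input representation (percentage of positive nuclei for ERα and PR, an integer HER2 score from 0 to 3) are hypothetical. Note also that most ERα and PR results in the study came from medical charts rather than from applying these cutoffs centrally.

```python
def her2_status(score: int) -> str:
    """Map an IHC HER2 score (0 to 3) to a status using the scale described above."""
    if score not in (0, 1, 2, 3):
        raise ValueError("HER2 score must be 0, 1, 2, or 3")
    if score <= 1:
        return "negative"
    return "equivocal" if score == 2 else "positive"


def molecular_subtype(er_pct: float, pr_pct: float, her2_score: int) -> str:
    """Assign an IHC-based subtype from ER/PR percentages and a HER2 score.

    Cutoffs follow the text: ER positive if >10% of nuclei stain,
    PR positive if >1% of nuclei stain. The argument names are illustrative.
    """
    er_pos = er_pct > 10.0
    pr_pos = pr_pct > 1.0
    her2 = her2_status(her2_score)

    # Score 2+ cases are kept as a separate equivocal group, as in the study.
    if her2 == "equivocal":
        return "HER2 equivocal"

    hormone_pos = er_pos or pr_pos
    if her2 == "positive":
        return "luminal B" if hormone_pos else "HER2"
    return "luminal A" if hormone_pos else "triple-negative"


if __name__ == "__main__":
    # Hypothetical inputs, chosen only to exercise each branch.
    for case in [(50, 0, 0), (0, 5, 3), (0, 0, 3), (5, 1, 1), (80, 60, 2)]:
        print(case, "->", molecular_subtype(*case))
```

Keeping the 2+ group separate mirrors the study design, in which equivocal cases were not resolved by FISH.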


Differences in sociodemographic and clinicopathologic characteristics across breast cancer subtypes were evaluated using one-way analysis of variance for continuous variables (such as age) and the chi-square test for categorical variables. Log-rank tests were applied to evaluate differences in survival rates. Multivariate Cox proportional hazards models were employed to evaluate associations of molecular subtypes with overall and disease-free survival. The multivariate models were adjusted for the following covariates: age at diagnosis, education, income, body mass index (BMI), radiotherapy, chemotherapy, immunotherapy and tamoxifen use, TNM stage, histologic grade, and tumor size. All tests were performed using Statistical Analysis Software (SAS, version 9.1; SAS Institute, Inc., Cary, North Carolina). The significance level was set at P < 0.05 for two-sided analyses.


Distributions of baseline sociodemographic and clinicopathologic characteristics by 5-year survival rate for the study population of 2,791 subjects are presented in Table 1. The variables significantly related to 5-year survival were: age at diagnosis, education, income, TNM stage, histologic grade and type, ERα, PR, HER2 status, and use of adjuvant therapy (tamoxifen and radiotherapy). The subpopulation of this study was similar to the overall study population for the above characteristics (data not shown).

Table 1 Selected demographic and clinical characteristics of breast cancer patients included in the Shanghai Breast Cancer Survival Study.

Prevalences of the luminal A (ERα+ and/or PR+, HER2-), luminal B (ERα+ and/or PR+, HER2+), HER2 (HER2+, ERα-, and PR-), and triple-negative (ERα-, PR-, HER2-) subtypes were 48.6%, 16.7%, 13.7%, and 12.9%, respectively. The 8.1% of cases that showed weak positive staining of HER2 (scored as 2+) by IHC were classified as the borderline or equivocal group in this study. During an average of 53.4 months of follow-up after cancer diagnosis, 290 total deaths and 341 recurrences/breast cancer-specific deaths were documented. Differences among molecular subtypes with regard to clinicopathologic characteristics were observed and are presented in Table 2. Significant differences were observed in age at cancer diagnosis (P = 0.03), with luminal A breast cancer being more common among older women (≥70) and triple-negative cancer more common among younger women (<40). Women with the luminal A subtype were more likely to have low TNM stage (P < 0.01), smaller tumor size (P < 0.01), and low histologic grade (P < 0.01) compared with women with the HER2 and triple-negative subtypes. Women with triple-negative breast cancer had a higher frequency of family history of breast cancer than women with other subtypes (P = 0.048).
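As a quick arithmetic check on the prevalences reported above, the four subtypes and the HER2 equivocal group should together account for all cases. The sketch below uses only the percentages given in the text; the variable names are illustrative.

```python
# Reported prevalences (percent of the 2,791 cases), taken from the text above.
prevalence = {
    "luminal A": 48.6,
    "luminal B": 16.7,
    "HER2": 13.7,
    "triple-negative": 12.9,
    "HER2 equivocal (2+)": 8.1,
}

# Rounding to one decimal avoids floating-point noise in the sums.
total = round(sum(prevalence.values()), 1)

# HER2+ tumors overall are the luminal B and HER2 subtypes combined.
her2_positive = round(prevalence["luminal B"] + prevalence["HER2"], 1)

print(f"Sum of all categories: {total}%")
print(f"HER2+ tumors overall (luminal B + HER2 subtype): {her2_positive}%")
```

The five categories add up to 100.0%, and the combined HER2+ share is 30.4%.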

Table 2 Comparisons of clinical and tumor characteristics by molecular subtypes of breast cancer, the Shanghai Breast Cancer Survival Study.

We classified histological types of breast cancer into four categories: non-invasive, invasive lobular carcinoma (ILC), invasive ductal carcinoma (IDC), and invasive special types (data not shown in tables). For non-invasive breast cancer (ductal carcinoma in situ [DCIS] and lobular carcinoma in situ [LCIS]), the most common molecular subtype was luminal A (42.7%), followed by luminal B (20.8%) and the HER2 subtype (18.8%); the triple-negative subtype was least common (10.4%). Among invasive cancers, luminal A accounted for 66% of ILCs, luminal B for 10.7%, the HER2 subtype for 3.7%, and triple-negative for 8.4%. For IDCs, luminal A accounted for 43.3%, luminal B for 18.8%, the HER2 subtype for 16.3%, and triple-negative for 13.1%. Luminal A was the most common molecular subtype among the special histological types of breast cancer (mucinous, 81.2%; papillary, 65.5%; mixed, 58.9%), except for medullary breast cancer, where triple-negative was most common (37.9%), followed by luminal A (30%) and the HER2 subtype (19%).

Associations of molecular subtypes with 5-year overall and disease-free survival rates are presented in Table 3. Women with the luminal A, luminal B, HER2, and triple-negative subtypes had 5-year overall/disease-free survival rates of 92.9%/88.6%, 88.6%/85.1%, 83.2%/79.1%, and 80.7%/76.0%, respectively. Multivariate Cox regression analyses showed that the HER2 and triple-negative subtypes were associated with an increased risk of overall mortality (hazard ratio (HR) = 1.47, 95% CI, 1.03 to 2.10; HR = 1.87, 95% CI, 1.31 to 2.66, respectively) and breast cancer recurrence/disease-related mortality (HR = 1.32, 95% CI, 0.95 to 1.83; HR = 1.52, 95% CI, 1.09 to 2.11, respectively) after adjustment for age, education, income, BMI, radiotherapy, chemotherapy, immunotherapy, tamoxifen use, TNM stage, histologic grade, and tumor size. A total of 477 patients in our study received some form of immunotherapy, including IL-2, lymphokine-activated killer cells, and interferon. Immunotherapy was associated with improved overall survival (HR = 0.27, 95% CI, 0.10 to 0.69) and disease-free survival (HR = 0.41, 95% CI, 0.20 to 0.82) for luminal A breast cancer, but with reduced disease-free survival (HR = 2.21, 95% CI, 1.09 to 4.48) for the HER2 subtype of breast cancer. Use of immunotherapy was not significantly associated with survival for women with luminal B and triple-negative breast cancers.
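The relationship between a hazard ratio and its confidence interval can be illustrated with the subtype estimates reported above. The sketch assumes the intervals are Wald intervals, symmetric on the log scale (HR = exp(β), limits = exp(β ± 1.96·SE)); the text does not state how the intervals were computed, so this is an assumption. Under it, the HR is the geometric mean of its limits and the standard error of log(HR) can be recovered from the interval width. The function name is hypothetical.

```python
import math


def implied_log_hr_and_se(lower: float, upper: float, z: float = 1.96):
    """Recover log(HR) and its standard error from a 95% CI.

    Assumes a Wald interval that is symmetric on the log scale.
    """
    log_hr = (math.log(lower) + math.log(upper)) / 2.0
    se = (math.log(upper) - math.log(lower)) / (2.0 * z)
    return log_hr, se


# (reported HR, lower limit, upper limit), as given in the text.
reported = {
    "HER2, overall mortality": (1.47, 1.03, 2.10),
    "triple-negative, overall mortality": (1.87, 1.31, 2.66),
    "HER2, recurrence/disease-related mortality": (1.32, 0.95, 1.83),
    "triple-negative, recurrence/disease-related mortality": (1.52, 1.09, 2.11),
}

for label, (hr, lower, upper) in reported.items():
    log_hr, se = implied_log_hr_and_se(lower, upper)
    print(
        f"{label}: reported HR {hr:.2f}, "
        f"HR implied by CI {math.exp(log_hr):.2f}, SE(log HR) {se:.3f}"
    )
```

For these four estimates, the HR implied by the interval agrees with the reported HR to within rounding, which is what symmetric log-scale intervals would produce.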

Table 3 Molecular subtypes in association with breast cancer survival, the Shanghai Breast Cancer Survival Study.


Distribution of molecular subtypes of breast cancer in Chinese women

Our study showed that the prevalence of the triple-negative subtype of breast cancer among Chinese women (12.9%) is similar to that in European populations (10-16%), but lower than that in African-American populations (20-21%). The HER2 subtype accounted for 13.7% of Chinese breast cancer cases, which is higher than the reported positivity (4-8%) in either European or African-American populations [4, 6–8, 22, 23]. In our study, approximately 8% of breast cancer patients had weak positive or borderline staining (2+) for HER2, which was interpreted as an equivocal category that would be recommended for verification with fluorescence in situ hybridization (FISH) for therapeutic indication of trastuzumab (Herceptin) treatment [15]. The FISH-derived amplification rate for the HER2 equivocal group (i.e., IHC 2+) has been observed to be approximately 25% in Western women [24, 25]. If a similar rate is true for Chinese women, most HER2 equivocal cases would fall into either the luminal A or triple-negative subtype. Therefore, the frequencies of the luminal A (48.6%) and triple-negative subtypes in our study could be underestimated. We compared ER and PR status for the HER2 borderline group with the HER2+ and HER2- groups, and found that ER+ and PR+ rates (67.4% and 57.7%, respectively) for the HER2 borderline group were more similar to those of the HER2- group (69.7% and 67.1%, respectively) than to those of the HER2+ group (46.1% and 41.3%, respectively), suggesting that the vast majority of cases in our HER2 borderline group should likely be classified in the HER2- group. Regardless of the true HER2+ rate for the borderline group, the prevalence of the HER2 subtype in this Chinese population is higher than in Western populations.
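The effect of the unresolved equivocal group can be made concrete with a rough calculation. The sketch below applies the approximately 25% FISH amplification rate observed in Western women to the 8.1% equivocal group from this study. Whether that rate holds for Chinese women is exactly the open assumption discussed above, and the function and variable names are illustrative. How the reallocated cases would split between hormone receptor-positive and hormone receptor-negative subtypes is not reported, so no subtype-specific estimate is attempted.

```python
# Percentages reported in this study.
equivocal_pct = 8.1              # IHC 2+ (equivocal) cases
her2_positive_pct = 16.7 + 13.7  # luminal B + HER2 subtype

# Assumption: amplification rate borrowed from Western cohorts [24, 25].
amplified_fraction = 0.25


def reallocate(equivocal: float, fraction_amplified: float):
    """Split the equivocal share into HER2+ and HER2- percentage points."""
    to_positive = equivocal * fraction_amplified
    to_negative = equivocal - to_positive
    return to_positive, to_negative


to_pos, to_neg = reallocate(equivocal_pct, amplified_fraction)
adjusted_her2_positive = her2_positive_pct + to_pos

print(f"Equivocal cases moving to HER2+: {to_pos:.1f} points")
print(f"Equivocal cases moving to HER2-: {to_neg:.1f} points")
print(f"HER2+ tumors after reallocation: {adjusted_her2_positive:.1f}%")
```

Under this assumption, roughly 2 percentage points would move to HER2+ and roughly 6 to HER2-, which is why the HER2-negative subtypes are the ones most likely to be underestimated.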

The prevalence of breast cancer subtypes appears to differ among races and ethnicities. It has been well documented that the triple-negative subtype is most common among young African-American patients, while luminal A is most common among postmenopausal white women [4, 6–9, 22, 23]. The increased risk for the triple-negative subtype in African-American women may be due to parity and younger age at first full-term pregnancy, multiple live births without breastfeeding, use of medications to suppress lactation [7], and intrinsic genetic variables, such as higher p53 expression [6] and a particularly high prevalence of founder mutations in the BRCA1 or BRCA2 gene in young (<35 years) African-American women [26]. In our study population, women with triple-negative breast cancer more frequently reported a family history of breast cancer than did women with other subtypes. This suggests that genetic factors may play a more important role in this molecular subtype of breast cancer. Since BRCA mutations in Chinese women are uncommon (1.1% each for BRCA1 and BRCA2) [27], other genetic contributors to the triple-negative subtype in Chinese women need to be investigated.

We found that HER2+ breast cancers accounted for 30% of all breast cancer cases in our study population, similar to a previous report from Shanghai (31%) and higher than the reports from Tianjin (26%) [13, 14], Taiwan (21%) [28], and the US (26%) [9]. Consistent with our findings, one large, registry-based population study [8] showed that HER2+ tumors are more common among Asian/Pacific Islanders (28%) than among non-Hispanic Whites (21%) or non-Hispanic Blacks (24%), but similar to Hispanics (26%) (Table 4). Another large population study further revealed that among Asian-Americans, Korean and Filipino women had the highest prevalence of HER2+ tumors (36% and 31%, respectively), followed by Vietnamese (29%) and Chinese (26%) women, while Japanese and South Asian women showed a prevalence of HER2+ tumors similar to non-Hispanic Whites and non-Hispanic Blacks (19-23%) [9]. It is not clear why the prevalence of the HER2 subtype or of HER2+ tumors is higher among Chinese or Asian women compared with women of European ancestry or African-American women. Although it has been suggested that environmental factors might play an important role in the etiology of HER2+ breast cancers, variations in the criteria used to determine HER2 status may also contribute to the differences.

Table 4 Distribution of breast cancer subtypes in different ethnicities and in different geographical areas of China, %

Prognostic significance of breast cancer subtypes among Chinese women

Chinese women with the triple-negative subtype were younger at diagnosis compared with women who had other subtypes of breast cancer, which is similar to findings reported in Western populations [7, 29]. The triple-negative subtype was associated with larger tumor size, higher histologic grade, later TNM stage, and higher prevalence in IDC than in ILC. These clinicopathologic characteristics have been consistently observed in both Western [4, 8] and Chinese populations [10–13], suggesting that the triple-negative subtype is an aggressive subtype of breast cancer across all ethnicities. Multivariate analysis confirmed that the triple-negative subtype is an independent prognostic factor for the progression and survival of breast cancer. Most triple-negative cancers defined by IHC present a basal-like subtype profile defined by cDNA microarray, but the two classifications do not agree in about 25% of cases [30]. Other molecular subsets may be included in triple-negative cancers. Further epidemiological and biomarker studies of this important subtype in Chinese women are necessary.

The HER2 subtype was closely correlated with larger tumor size and higher histologic grade, consistent with previous reports in other Chinese studies [10–13]. We found that the HER2 subtype was associated with earlier age at diagnosis, more advanced TNM stage, and reduced 5-year overall and disease-free survival rates. Anti-HER2 therapy is currently available. Our study suggests that about 30% of Chinese women with breast cancer, namely those with the HER2 subtype (14%) or the luminal B subtype (17%), may benefit from trastuzumab (Herceptin) and other targeted therapies, if HER2 status were evaluated following the standardized HER2 evaluation guidelines [15] and this information were incorporated into therapeutic decisions.

The luminal B subtype in our study was correlated with younger age at diagnosis, more advanced TNM stage, larger tumor size, higher histologic grade, and was less common in the ILC and special histologic types compared with the luminal A subtype. However, after adjusting for TNM stage, histologic grade, and tumor size, we observed no statistically significant differences for overall or disease-free survival between the two luminal subtypes. Currently, the definition of the luminal B subtype remains debatable. The luminal B subtype originally classified using cDNA microarray gene profiling was unstable and sometimes clustered with the ER- classes (HER2 and basal subtypes) [31, 32]. Approximately 30-50% of luminal B class samples defined by gene profiling were HER2+. Therefore, the IHC definition of luminal B (ERα+ and/or PR+, HER2+) is not equivalent to the luminal B tumors classified with microarray gene profiling [4]. Since the gene profile-classified luminal B subtype is defined as tumors with lower expression levels of ERα/PR and related genes, higher proliferative rates, and higher histologic grade [32], some authors have suggested that ERα expression in tumor cells should be semi-quantified using the Allred, Q-score, or H-score to distinguish luminal B from luminal A [33]. More recently, a study [34] suggested that the Ki67 index for cellular proliferation should be combined with ERα, PR, and HER2 to classify luminal tumors into three subtypes: luminal A (ERα+ and/or PR+, HER2-, Ki67 low), luminal B (ERα+ and/or PR+, HER2-, Ki67 high), and luminal-HER2 (ERα+ and/or PR+, HER2+). In that study, the luminal B and luminal-HER2 subtypes had a statistically significant association with poor breast cancer recurrence-free and disease-specific survival in all adjuvant systemic treatment categories. Additional research is warranted to determine the clinical utility of new methods to distinguish luminal breast cancers.

The immune system is thought to play an important role in the metastatic cascade among cancer patients. Thus, various immune strategies have been tested as therapy for breast cancer, including vaccine therapy, administration of exogenous cytokines, monoclonal antibodies, and gene therapy [35]. In our study, we collected general information on immunotherapy by asking participants whether they had received immunotherapies such as IL-2, lymphokine-activated killer (LAK) cell, and interferons. We found that use of immunotherapy was associated with improved overall and disease-free survival among women with the luminal A subtype but with reduced disease-free survival among women with the HER2 subtype, suggesting that choosing the proper immunotherapeutic method should be based on the molecular characteristics of the tumor. This also indicates that analysis of molecular subtypes of breast cancer has significance for personalized immunotherapy to improve the survival of breast cancer patients.

In our study, we found that the molecular subtype of breast cancer is not always consistent with histological type in terms of predicting breast cancer outcomes. For example, medullary breast cancers accounted for about 19% of the HER2 subtype and 38% of triple-negative cases. Medullary breast cancer is generally considered to be a favorable histological type of breast cancer with a good prognosis. The unfavorable molecular subtypes among medullary breast cancers might therefore not imply an unfavorable outcome. These results suggest that breast cancer is more heterogeneous than the four molecular subtypes defined by ER, PR, and HER2 status. Further investigation into molecular heterogeneity is warranted.

This study is the largest population-based study on molecular subtypes of breast cancer and survival among Chinese women. This study has several notable strengths. The population-based study design and high overall response rate (80%) minimized potential selection bias. Standardized staining and scoring of HER2 status, and centralized pathological confirmation of diagnosis minimized misclassification. There are also some limitations to this study. For example, ERα and PR status for the majority of participants (91% and 92%, respectively) was obtained from medical charts. Approximately 8% of cases with borderline positivity for HER2 as determined by IHC were not evaluated with FISH. For cases with missing ERα (234 cases) or PR status (222 cases), ER/PR status was measured at the Vanderbilt centralized laboratory using a cut-off for ER positivity of >10%, which is the cut-off used by the large hospitals in Shanghai [10] and had been validated for the prediction of response to hormonal therapy [20, 21]. Due to a slightly decreased PR sensitivity of HER2/PR double staining, a lower cut-off value (>1%) was used for PR positivity. To evaluate the potential influence of the variation in the criteria used to define ER and PR status, we performed additional analysis by excluding cases whose ER and PR status were measured at the centralized laboratory. We did not observe appreciable changes in the study results. In 2010, the American Society of Clinical Oncology/College of American Pathologists recommended that ERα and PR should be considered positive if there are at least 1% positive tumor nuclei in the tissue samples with proper controls [36]. If the recommended 1% cut-off value for ERα and PR positivity were used in this study, the number of HER2 and triple-negative subtypes would decrease and the number of luminal subtypes would increase. However, the overall prevalence of HER2+ tumors, which includes the luminal B and HER2 subtypes, would not be affected. Future studies on breast cancer subtypes using recommended guidelines [15, 36] for hormone receptors and HER2 status are warranted. In addition, the follow-up period of this cohort is relatively short. Our ongoing follow-up of the cohort should help overcome this limitation and allow an examination of the long-term effects of different molecular subtypes on the survival of breast cancer patients.


This large population-based study of Chinese breast cancer survivors confirmed that the triple-negative and HER2+ subtypes were associated with poorer outcomes compared with the luminal A subtype among Chinese women. The HER2+ subtype was more prevalent in this Chinese population compared with Western populations, suggesting the importance of standardized HER2 detection and anti-HER2 therapy to potentially benefit a high proportion of breast cancer patients in China.