Introduction

Breast cancer is a highly heterogeneous disease at the morphologic and molecular level, with at least four main molecular subtypes described: luminal A, luminal B, HER2 over-expressing, and basal-like [1–3]. More recently, additional molecular subtypes have been identified, including claudin-low and molecular apocrine [4–6]. Each of these subtypes has characteristic morphologic, immunophenotypic, and prognostic features. BRCA1-associated breast cancers have been shown to be enriched with tumors of the basal-like subtype, whereas BRCA2-associated tumors and familial non-BRCA1/BRCA2 tumors are more likely to be of the luminal subtypes [7–11].

The normal epithelium of the breast has been demonstrated to be organized in a cellular hierarchy with an ER-negative (−) stem cell giving rise to ER-positive (+) and ER-negative progenitors, which ultimately give rise to fully differentiated functional luminal and myoepithelial/basal epithelium [12, 13]. It has been suggested that the different molecular subtypes of breast cancer arise from the transformation of different stem or progenitor cell populations, which retain, or acquire as a consequence of the transformation process, some or all of the functional characteristics of normal stem cells [14]. These characteristics include limitless self-renewal capability, which drives tumorigenesis, and the ability to differentiate (albeit aberrantly), leading to morphologic tumor heterogeneity. There is also evidence to suggest that it is this cancer stem cell (CSC) population that mediates metastasis and can evade the effects of chemotherapy and radiation therapy, thus promoting recurrence and relapse [15–22].

A number of markers have been proposed that enrich for the identification of breast CSCs, including CD44 in combination with low or absent expression of CD24 (known as the CD44+/CD24−/low phenotype) [23, 24] and aldehyde dehydrogenase 1 (ALDH1) [25]. CD44 and CD24 are both adhesion molecules that play major roles in cell–cell and cell–extracellular matrix (ECM) interactions. CD44 is a Class I transmembrane glycoprotein that serves as the primary receptor for hyaluronan [26] and binds other ECM components, such as collagen, laminin, and fibronectin. CD44 exists in different splice variants; some of these variants have been reported to promote growth, survival, invasion, and metastatic properties in breast cancer cells [27–30]. However, other studies on the role of CD44 in breast cancer have shown opposite effects, suggesting that CD44’s function in breast cancer is context-dependent (reviewed in [31]). CD24 is a small cell-surface glycoprotein that binds P-selectin, an adhesion receptor on platelets and endothelial cells [32]. In addition, CD24 promotes binding to fibronectin, collagen, and laminin and, in agreement with these functions, has been shown to promote adhesion, migration, and metastasis, and to associate with markers of poor prognosis in breast cancer [32–34]. ALDH1 is a detoxifying enzyme responsible for the oxidation of intracellular aldehydes [35] that plays a role in early differentiation of stem cells by promoting the formation of retinoic acid [36]. In addition to preferential expression of ALDH1 in breast cancer cells with tumor-initiating properties [25], retinoid signaling has been directly implicated in modulating breast cancer stem cell (CSC) differentiation [37].

In the present study, we examined the expression of the proposed breast CSC markers in a well-characterized collection of familial breast cancer cases. In addition, we investigated whether the expression of these markers is associated with known clinical–pathologic tumor variables, molecular subtype, or patient survival.

Materials and methods

Study population

The study population included 58 BRCA1-associated, 64 BRCA2-associated, and 242 familial non-BRCA1/BRCA2 tumors from the Ontario Familial Breast Cancer Registry. All familial non-BRCA1/BRCA2 breast cancers were obtained from probands within the Breast Cancer Family Registry who met any of the following criteria for being at possible genetic risk of breast cancer: at least 1 first-degree relative with breast or ovarian cancer, at least 2 second-degree relatives with breast or ovarian cancer, diagnosis before age 26, male, multiple primaries or breast and ovarian cancer, at least 1 second-degree or third-degree relative with male breast cancer, multiple breast, or breast and ovarian primaries, or breast cancer before 26 years of age, or ovarian cancer before 60 years of age, Ashkenazi Jewish, or 3 first-degree relatives in the family with breast, ovarian, colon, prostate or pancreatic cancer or sarcoma (with one diagnosed before age 50), but who tested negative for germline BRCA1 or BRCA2 mutations. Sporadic breast cancer cases from the registry were not available on TMAs.

Mutational analysis of BRCA1 and BRCA2

Testing for germline mutations in BRCA1 and BRCA2 was performed using an RNA/DNA-based protein truncation test with complementary 5′ sequencing, as previously described [38, 39], or by complete gene sequencing by Myriad Genetics. All mutations were confirmed by DNA sequencing. Mutations were classified as deleterious if they were protein-truncating, missense mutations (rare), or splice-site mutations as defined by the Breast Cancer Information Core (http://research.nhgri.nih.gov/bic/).

Pathology review

All tumors from the familial breast cancer cohort had a centralized pathology review performed by an expert breast pathologist using a standardized checklist form. The reviewing pathologist was unaware of the mutational status of the tumor at the time of review. Tumors were classified according to the WHO histologic classification of breast tumors and graded using the Nottingham histologic grading system [40, 41].

TMA construction

A suitable paraffin-embedded block of invasive tumor was chosen at the time of pathology review and the area of invasive tumor encircled for TMA construction. Two 0.6 mm cores of tissue were taken from the paraffin tumor block and used for TMA construction (Beecher Instruments, Sun Prairie, WI) as previously described [7, 8]. Four-μm sections were cut and immunohistochemical staining for ER, PR, HER2, CK5, CK14, EGFR, ALDH1, CD44, and CD24 was performed using methods as listed in Table 1. Microwave antigen retrieval was carried out in a Micromed T/T Mega Microwave Processing Lab Station (ESBE Scientific, Markham, Ontario, Canada). Sections were developed with diaminobenzidine tetrahydrochloride (DAB) and counterstained in Mayer’s hematoxylin.

Table 1 Summary of antibodies and their conditions of use

Interpretation and scoring of immunohistochemistry

Each of the immunohistochemical TMA-stained sections was scored using Allred’s scoring method [42], which adds a score for the intensity of staining (absent: 0, weak: 1, moderate: 2, and strong: 3) to a score for the percentage of cells stained (none: 0, <1%: 1, 1–10%: 2, 11–33%: 3, 34–66%: 4, and 67–100%: 5) to yield a “raw” score of 0 or 2–8. Previously validated cut-offs for ER and PR were used (0, 2 = negative, 3–8 = positive) [43, 44]. Strong complete membranous staining was assessed for HER2 and the cut-off of >5 was used to indicate positivity [45]. For CK5, CK14, EGFR, CD44, and CD24 a score of ≥4 was considered positive; for ALDH1 a score of ≥5 was considered positive. The raw score data were reformatted using a TMA deconvoluter software program into a format suitable for statistical analysis [46]. The highest score from each TMA tumor pair was entered into the statistical analysis. Only the epithelial component of each TMA spot was scored for the markers indicated. Immunohistochemical results were recorded as unavailable when the tissue sections were washed off the slide, when TMA cores contained no invasive tumor cells, or when sections were uninterpretable due to tissue artifact.
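The scoring rule described above can be illustrated with a short sketch. This is not the software used in the study; the function names, the `CUTOFFS` table layout, and the handling of percentages that fall between the stated bands (for example, 10.5%) are assumptions made for illustration only.

```python
# Illustrative sketch of Allred scoring as described in the text.
# All names here are hypothetical; band edges between stated ranges are assumed.

def proportion_score(percent_positive: float) -> int:
    """Map the percentage of stained cells to a proportion score (0-5)."""
    if percent_positive <= 0:
        return 0
    if percent_positive < 1:
        return 1
    if percent_positive <= 10:
        return 2
    if percent_positive <= 33:
        return 3
    if percent_positive <= 66:
        return 4
    return 5


def allred_score(intensity: int, percent_positive: float) -> int:
    """Add intensity (0-3) to the proportion score; the result is 0 or 2-8."""
    if intensity not in (0, 1, 2, 3):
        raise ValueError("intensity must be 0, 1, 2, or 3")
    proportion = proportion_score(percent_positive)
    if intensity == 0 or proportion == 0:
        return 0  # no staining at all yields a raw score of 0
    return intensity + proportion


# Minimum raw score counted as positive for each marker, per the text
# (HER2 "> 5" is expressed here as ">= 6").
CUTOFFS = {"ER": 3, "PR": 3, "HER2": 6, "CK5": 4, "CK14": 4,
           "EGFR": 4, "CD44": 4, "CD24": 4, "ALDH1": 5}


def is_positive(marker: str, core_scores: list) -> bool:
    """Use the highest score among a tumor's TMA cores, as in the text."""
    return max(core_scores) >= CUTOFFS[marker]
```

For instance, a core with moderate staining in about half of the tumor cells would score 2 + 4 = 6 under this sketch, which exceeds the ALDH1 threshold of 5.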

Tumors were classified as luminal if they expressed ER or PR and were negative for HER2. Any tumor with a score of >5 for HER2, irrespective of ER status, was considered a HER2 over-expressing tumor. Basal-like tumors were defined as ER, PR, and HER2 negative (triple negative) and positive for CK5 and/or CK14 and/or EGFR, as previously described [7, 8, 47].
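The classification rules above amount to a small decision procedure, sketched below under stated assumptions. The function name and the subtype labels are hypothetical, marker status is assumed to be already dichotomized into booleans, and the treatment of triple-negative tumors lacking all three basal markers as unclassified is an assumption, since the text does not say how such cases were handled.

```python
from typing import Optional


def molecular_subtype(er: bool, pr: bool, her2: bool,
                      ck5: bool, ck14: bool, egfr: bool) -> Optional[str]:
    """Assign an immunohistochemical surrogate subtype from marker status."""
    if her2:
        # HER2 positivity takes precedence, irrespective of ER status.
        return "HER2 over-expressing"
    if er or pr:
        return "luminal"
    if ck5 or ck14 or egfr:
        # Triple negative plus at least one basal marker.
        return "basal-like"
    # Assumption: triple negative without basal markers is left unassigned.
    return None
```

Checking HER2 first keeps the three rules mutually exclusive, so each tumor receives at most one label.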

Statistical analysis

The chi-square test or Fisher’s exact test was used to analyze the marker associations with clinical–pathologic tumor variables, molecular subtype and genetic subtype. Analyses of the association of overall survival (OS) with marker status were conducted using Kaplan–Meier plots and log-rank tests. Follow-up data were current to November 24, 2011. Excluding patients lost to follow-up and those who died, the minimum follow-up time was 12 months after surgery and the median follow-up time was 148 months. Patient status on November 24, 2011, determined OS time and censoring status. All tests were two-sided. A test with a P-value < 0.05 was considered statistically significant. P-values were not adjusted for multiple testing. Statistical analysis of associations was performed using SAS 9.1 software (SAS Institute, Inc.). Survival curves were plotted using R statistical software, version 2.15.0 (http://www.r-project.org/).
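As an illustration of how a Kaplan–Meier survival estimate handles censoring, a minimal sketch follows. The study itself used SAS and R; this Python function, its name, and the five follow-up times in the usage note are invented for illustration and are not study data. The log-rank comparison between groups is not shown.

```python
from collections import Counter


def kaplan_meier(times, events):
    """Return [(time, survival)] at each distinct time with at least one death.

    times  : follow-up time for each patient (e.g., months after surgery)
    events : 1 if the patient died at that time, 0 if censored
    """
    deaths = Counter(t for t, e in zip(times, events) if e)
    leaving = Counter(times)      # deaths and censorings at each time
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(leaving):
        d = deaths.get(t, 0)
        if d:
            # Multiply by the conditional probability of surviving time t.
            survival *= 1 - d / at_risk
            curve.append((t, survival))
        # Everyone observed at time t leaves the risk set afterwards.
        at_risk -= leaving[t]
    return curve
```

With the invented inputs `times = [12, 30, 30, 60, 148]` and `events = [1, 0, 1, 0, 1]`, the estimate steps down at 12, 30, and 148 months, while the censored patients at 30 and 60 months only reduce the number at risk.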

Results

CD44+/CD24−/low

Two hundred and sixty two (262) cases had results for both CD44 and CD24, of which 41 (16%) had a CD44+/CD24−/low phenotype (Table 2; Fig. 1a–d). When compared with all other combinations of CD44 and CD24 expression, the CD44+/CD24−/low phenotype was positively associated with high-tumor grade (p = 0.03), a high-mitotic score (p = 0.003), margin circumscription (p = 0.0009), a moderate tumor lymphocytic infiltrate (p = 0.01), and absent lympho-vascular space invasion (p = 0.008). In addition there was a statistically non-significant trend in association between the CD44+/CD24−/low phenotype and the lack of lymph-node metastases (p = 0.09), syncytial tumor growth pattern (p = 0.06) and young age at diagnosis (p = 0.06). No association was detected between the CD44+/CD24−/low phenotype and tumor size, or tumor type.

Table 2 Association between the CD44+/CD24−/low phenotype and tumor morphologic characteristics
Fig. 1 a BRCA1-associated breast cancer TMA section exhibiting strong membranous staining for CD44 in the majority of invasive tumor cells. b BRCA1-associated breast cancer TMA section negative for CD24 staining

In the 41 CD44+/CD24−/low cases a molecular phenotype was assignable for 33 tumors (Table 3), 16 (48.5%) of which were basal, 1 (3%) was HER2 overexpressing and 16 (48.5%) were luminal. In comparison to all other combinations of CD44 and CD24 expression, tumors with a CD44+/CD24−/low phenotype were more likely to belong to the basal-like molecular subtype (48.5 vs. 22.2%; p = 0.0034).

Table 3 Association between the CD44+/CD24−/low phenotype and tumor molecular subtype

Of the 41 CD44+/CD24−/low cases, 11 (27%) were BRCA1-associated tumors, 7 (17%) were BRCA2-associated tumors, and the remaining 23 (56%) were non-BRCA1/BRCA2 tumors (Table 4). When compared to all other combinations of CD44 and CD24 expression, a CD44+/CD24−/low phenotype was more likely to be associated with tumors arising in BRCA1 germline mutation carriers than in non-BRCA1 mutation carriers (26.8 vs. 12.7%; p = 0.02). However, when the analysis was restricted to basal-like tumors only (Table 5), there was no statistically significant difference in the incidence of CD44+/CD24−/low expression between BRCA1-associated basal-like tumors and non-BRCA1-associated basal-like tumors.
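The BRCA1 comparison above can be checked approximately with a Pearson chi-square test on a 2 × 2 table. The sketch below is an illustration only: the cell counts are back-calculated from the reported figures (11 of 41 phenotype-positive tumors, and 12.7% of the remaining 221 tumors taken as 28), the function name is hypothetical, and the text does not state whether the chi-square or Fisher's exact test produced the published P-value.

```python
import math


def chi_square_2x2(a: int, b: int, c: int, d: int):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]].

    Returns (statistic, two-sided p-value) with 1 degree of freedom.
    """
    n = a + b + c + d
    stat = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # For 1 degree of freedom the chi-square tail probability is erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value


# Back-calculated counts (assumption, not taken from Table 4):
# rows: CD44+/CD24-/low, all other combinations; columns: BRCA1, non-BRCA1
stat, p_value = chi_square_2x2(11, 30, 28, 193)
print(f"chi-square = {stat:.2f}, p = {p_value:.3f}")
```

Under these assumed counts the P-value comes out close to the reported p = 0.02, which suggests the back-calculation is at least consistent with the published result.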

Table 4 Association between the CD44+/CD24−/low phenotype and tumor genetic subgroup
Table 5 Association between the CD44+/CD24−/low phenotype and tumor genetic subgroup within the basal-like molecular subtype

ALDH1

ALDH1 was expressed in 39 of 255 (15%) tumors (Table 6; Fig. 2a, b). The expression of ALDH1 was positively associated with high-tumor grade (p = 0.003), large tumor size (p = 0.009), high-mitotic score (p = 0.05), a syncytial growth pattern (p < 0.0001), a moderate tumor lymphocytic infiltrate (p = 0.002) and younger age at diagnosis (p = 0.02). No statistically significant association was detected between ALDH1 expression and tumor type, lympho-vascular space invasion, or lymph-node status.

Table 6 Association between ALDH1 expression and tumor morphologic characteristics
Fig. 2 a Tumor section exhibiting moderate cytoplasmic positivity for ALDH1 in approximately 50% of tumor cells. b Tumor section negative for ALDH1 staining; the macrophages in the tumor stroma demonstrate strong cytoplasmic staining for ALDH1

A molecular subtype was assignable in 33 of 39 ALDH1-positive tumors, 16 (48.5%) of which were basal-like, 3 (9%) were HER2 over-expressing and 14 (42.4%) were luminal (Table 7). When compared to tumors lacking ALDH1 expression, ALDH1-positive tumors were more commonly basal-like (48.5 vs. 22.3%; p = 0.007).

Table 7 Association between ALDH1 expression and tumor molecular subtype

Of the 39 ALDH1 expressing tumors, 9 (23%) were from BRCA1 germline mutation carriers, 9 (23%) were from BRCA2 germline mutation carriers, and 21 (54%) were from non-BRCA1/BRCA2 patients (Table 8). There was no statistically significant association between ALDH1 expression and BRCA1 mutational status (23.1 vs. 14.3%; p = 0.17), even when the analysis was restricted to BRCA1 basal-like tumors only (data not shown).

Table 8 Association between ALDH1 expression and tumor genetic subgroup

CD44+/CD24−/low/ALDH1+

For the familial breast cancer series the combined CD44+/CD24−/low/ALDH1+ phenotype was expressed in 6 of 230 tumors (data not shown) and associated with a high-mitotic score (p = 0.04), high-mitotic count (p = 0.03), and a syncytial growth pattern (p = 0.01). There was a non-statistically significant trend toward an association with tumor size (p = 0.09), lympho-vascular space invasion (p = 0.08), young age at diagnosis (p = 0.08), and tumor lymphocytic infiltrate (p = 0.08). No association was found between the expression of these combined markers and tumor grade (p = 0.21), tumor type (p = 1.0), lymph-node involvement (p = 0.42) or margin circumscription (p = 0.36).

Only 6 cases expressed a combined CD44+/CD24−/low/ALDH1+ phenotype and while this number of tumors is too few to perform a robust analysis we did observe that 2 (33%) were basal-like tumors and the remaining 4 (67%) were luminal tumors. In these 6 tumors, 3 (50%) were from BRCA1 germline mutation carriers, none (0%) were from BRCA2 germline mutation carriers and the remaining 3 (50%) were from non-BRCA1/BRCA2 mutation carriers. When compared to all other combinations of CD44, CD24, and ALDH1 expression, tumors with a CD44+/CD24−/low/ALDH1+ phenotype were more likely to be associated with BRCA1 germline mutation carriers than non-mutation carriers (data not shown). On analysis of the tumors with a basal-like molecular subtype only (data not shown), there was no significant difference in CD44+/CD24−/low/ALDH1+ expression between those tumors with and without a BRCA1 germline mutation.

Survival

There was a non-significant trend toward better survival for the group with CD44+/CD24−/low compared to the group with other combinations of CD44 and CD24 (Fig. 3). There was no difference in survival between patients with tumors positive for ALDH1 and tumors negative for this marker (Fig. 4).

Fig. 3 Kaplan–Meier plots demonstrating survival groups according to CD44/CD24 expression

Fig. 4 Kaplan–Meier plots demonstrating survival groups according to ALDH1 expression

Discussion

There is increasing evidence that many tumors, including breast cancers, may be driven by a subpopulation of cells that display stem cell properties, so-called CSCs or tumor-initiating cells. Markers have been identified that, when used alone or in combination, enrich for functional CSCs, as defined by their ability to selectively initiate tumors in immunocompromised mice upon serial passage (a demonstration of self-renewal), together with the ability to form tumors that are heterogeneous at the cellular level, similar to the originating tumor (illustrative of the CSC’s ability to differentiate) [48]. These markers include CD44+/CD24−/low and ALDH1, originally identified by the sorting of cells from fresh tumors or effusions using flow cytometry or an enzymatic assay [23, 25]. Unfortunately, fresh tumor samples are not routinely available for all breast cancer patients, and tumor effusions manifest at a relatively late stage of the disease process and may not be representative of the primary tumor. In order to investigate whether CSCs could represent either prognostic or predictive biomarkers, an alternative approach to their identification must be sought, preferably in formalin-fixed paraffin-embedded (FFPE) tumor material, which represents the bulk of patient tumor samples and clinical trial archives. In this study we have used immunohistochemical (IHC) expression of CD44+/CD24−/low and ALDH1 as surrogate markers for breast CSCs and sought to correlate their expression, alone and in combination, with clinical–pathologic tumor features, breast cancer molecular subtypes, germline gene mutations, and ultimately patient outcome.

Our observations suggest that only a minority of the tumors examined contained cells expressing the breast CSC phenotypes CD44+/CD24−/low (16%), ALDH1 (15%), or both combined, CD44+/CD24−/low/ALDH1+ (approximately 3%). Other investigators employing IHC methods to identify these phenotypes have reported a wide variance in the percentage of primary breast tumors that express these phenotypes: 20–60% of tumors exhibit some cells with a CD44+/CD24−/low phenotype [49–55], whereas 7–70% of tumors examined expressed ALDH1 [22, 25, 52, 54, 56–61]. These differences may reflect differences in the antibodies employed, the tumor populations examined (e.g., familial vs. sporadic), or the scoring cut points applied (for example, some studies considered tumors with as few as one cell positive for the markers to be positive [50, 59], whereas others, like this study, have required a minimum of 10% of tumor cells to express the marker in question for the tumor to be considered positive [52, 54]). Alternatively, perhaps CSCs are phenotypically more diverse and the two phenotypes we have examined may not capture all possible breast CSCs. Wright et al. [62], in an examination of CSCs from transgenic mice engineered to be deficient in BRCA1, demonstrated that some tumors contained CSCs with a CD44+/CD24−/low phenotype, whereas other tumors contained CSCs characterized by CD133 expression. Other markers identified as putative breast CSC markers include CK5, EGFR, EpCAM, and CD49f [63, 64].

Traditionally in breast cancer a number of clinical–pathologic tumor characteristics are associated with poor prognosis, including younger age at diagnosis, large tumor size, lymph-node involvement, high tumor grade, lympho-vascular space invasion (LVSI), negative hormonal receptor status, and HER2 over-expression [65, 66]. Patients with tumors displaying some or all of these features are considered at increased risk for relapse and death from breast cancer when compared to patients with tumors lacking these features. In this study, we demonstrated a positive association between the presence of cells with a CSC phenotype (CD44+/CD24−/low or ALDH1 positive or both) and many of these adverse prognostic features, including high tumor grade, large tumor size, and younger age at diagnosis. A number of other studies have reported similar associations between the presence of CSCs and adverse prognostic features [49, 52, 53]. Interestingly, despite the association with some adverse prognostic factors, we were unable to demonstrate an association between CD44+/CD24−/low or ALDH1 expression and breast cancer outcome. We are not alone in this observation. In 2 of 4 other studies where the expression of CD44+/CD24−/low was analyzed in relation to outcome, no association was observed [51, 54]; in the third study an association between CD44+/CD24−/low and outcome was significant on univariate analyses only [53], whereas in the fourth an inverse relationship between CD44+/CD24−/low expression and survival was reported [55]. The expression of ALDH1 has been correlated with poor patient prognosis in some but not all studies [25, 54, 57, 58, 61, 67]. In two studies the expression of ALDH1 was found to be an independent prognostic variable after multivariate analyses [25, 57]. However, similar to our study, Ricardo et al. [54] and Resetkova et al. [61] failed to demonstrate an association between ALDH1 expression and outcome. The lack of an association between these markers and patient survival in a number of studies may suggest that the presence of cells with a stem cell phenotype is not a prognostic marker or, alternatively, that the identification of these markers by immunohistochemistry may not accurately identify the functional CSC population within a tumor.

In our study, the CD44+/CD24−/low and ALDH1 phenotypes were positively associated with the component features of medullary-type breast cancer, namely prominent lymphocytic infiltrate, pushing tumor margins and syncytial growth pattern [68]. Medullary cancer is a special subtype of breast cancer that occurs in 1–5% of all cases; these cancers are ER, PR, and HER2 negative and characteristically high grade [40, 69]. Furthermore, they have been demonstrated to cluster with either basal-like or claudin-low molecular subtypes of breast cancer and to be more commonly represented in tumors of BRCA1 mutation carriers [3, 5, 70, 71]. Despite these seemingly adverse morphologic and molecular associations, medullary-type cancers are associated with a better prognosis than non-medullary grade III tumors, a fact that may result from the prominent host lymphocytic response that characterizes these tumors [69]. A prominent tumor lymphocytic infiltrate has been demonstrated to be a good prognostic factor in ER-negative breast cancer and basal-like breast cancers specifically [72–74]. It is plausible that the presence of an “anti-tumor” immune response in the tumor stroma may mitigate the effects of the increase in CSCs present in these tumor types.

The basal-like subtype of breast cancer is a molecular subtype that was originally discovered through gene expression profiling studies [1–3, 75]. This subtype is predominantly triple negative (ER, PR, and HER2 negative) and associated, at least in the short term, with a worse prognosis than ER-positive luminal-type tumors [76–78]. Breast tumors from patients with BRCA1 germline mutations are enriched for this subtype [3, 7, 9]. In this study, we demonstrate that CSCs expressing either the CD44+/CD24−/low or the ALDH1-positive phenotype are more commonly found in basal-like tumors than in any other molecular subtype examined. Furthermore, we demonstrated a positive association between BRCA1 mutational status and the CD44+/CD24−/low phenotype. BRCA1 is believed to be a regulator of breast stem cell fate and is required for mammary epithelial cell differentiation [79, 80]. Specifically, BRCA1 is required for the differentiation of ER− luminal progenitor cells and in its absence, such as in the epithelium of BRCA1 mutation carriers, the transformed luminal progenitors are “driven” toward a basal cell fate (expressing CK5), hence the predominance of the basal-like tumor phenotype among BRCA1 mutation carriers [64, 81, 82]. Sporadic basal-like breast cancers arising in patients without germline BRCA1 mutations are often deficient in functional BRCA1 protein, resulting in a similar pathway for the development of sporadic and BRCA1-associated basal-like breast cancer and hence a similar phenotype and CSC expression [83]. Honeth et al. [50] profiled 17 BRCA1-associated breast cancers for the CD44+/CD24−/low phenotype and found that 94% of their BRCA1-associated tumors expressed this phenotype, as did 63% of the sporadic basal-like breast cancers included in the study. Heerma van Voss et al. [56] demonstrated that ALDH1 was an independent predictor of BRCA1 mutational status, whereas in our larger cohort of BRCA1-associated tumors ALDH1 was not associated with BRCA1 status but rather with the basal-like subtype only.

In our study, a very small fraction of tumors examined (6 of 230 tumors, or 2.6%) expressed both CSC phenotypes. Unfortunately, this small number of cases precludes robust statistical analysis, but another study by Rimm et al. using AQUA technology on the Yale breast cancer cohort showed that 5.5% of breast tumors examined contained cells that co-expressed both CD44 and ALDH1 and that these tumors were associated with a high breast cancer-specific mortality [67]. This is in agreement with observations by Ginestier et al. [25], who have shown that tumor cells expressing both phenotypes are highly tumorigenic, with the capacity to generate tumors from as few as 20 cells in vivo.

In conclusion, we have demonstrated that CSCs as defined by the expression of CD44+/CD24−/low and/or ALDH1 are present in a minority of familial breast cancer cases. The expression of these CSC phenotypes is associated with a number of adverse prognostic clinical–pathologic features but not with overall survival. In addition, we have demonstrated that the expression of CD44+/CD24−/low and/or ALDH1 is more common in basal-like tumors and that there is an association between BRCA1 mutational status and CD44+/CD24−/low expression.