Introduction

Breast cancer is a highly heterogeneous disease, with marked diversity both between and within tumors. The intertumoral heterogeneity is exemplified by the identification of five molecular subtypes, namely HER2+, normal-like, luminal (subtypes A and B), basal A/basal-like and basal B/claudin-low, a classification based on gene expression profile analysis [1–5]. This heterogeneity stems from the fact that the tumor phenotype varies based on the cell of origin [6]. Indeed, basal A/basal-like and basal B/claudin-low breast cancer subtypes were reported to result from the transformation of luminal progenitors and basal/myoepithelial cells, respectively [5, 7–10]. This hypothesis has recently been challenged by the finding that a combination of several genetic events in luminal-committed cells leads to the development of breast cancers of the claudin-low subtype in murine models [11, 12]. These genetic events promote an embryonic transdifferentiation program, namely, the epithelial-mesenchymal transition (EMT), a reversible mechanism sensitive to microenvironmental changes [13]. Therefore, genetic events and the microenvironment probably constitute additional determinants of tumor etiology [14]. Intratumoral heterogeneity results from the selection of genetically distinct cell populations during tumor progression, the exacerbated plasticity of cancer cells with consequent phenotypic modifications induced by changes in the microenvironment, and the differentiation rate of the progeny of tumor-initiating cells (TICs).

Many attempts have been made to identify and characterize TICs because these cells are believed to constitute a unique sub-population with unlimited self-renewing potential that constantly fuels the tumor and sustains therapy resistance and systemic metastasis. Clarke and colleagues isolated TICs from metastatic human breast cancers based on their specific CD44+/CD24-/low/ESA+ antigenic phenotype [15]. However, CD44+/CD24-/low/ESA+ does not constitute a universal antigenic phenotype of TICs in all breast cancer subtypes [16–18]. Rather, it marks a heterogeneous mix of cells in normal mammary gland [19] and is a profile associated with cell commitment to an EMT program [20]. It is necessary to better define this antigenic phenotype by combining CD44 and CD24 with additional as yet unidentified markers or activity, as previously shown with aldehyde dehydrogenase [21]. In this context, the similar distribution of the gene expression profiles of breast cancer cell lines and primary tumors in the five subtypes identified [2, 22] suggests that the cell line diversity reflects the tumor heterogeneity. Consequently, breast cancer cell lines are considered tools with which to identify and characterize TICs. The aim of this study was to identify novel antigens that are able to isolate the TIC population of basal-A/basal-like breast cancer cell lines.

Both normal and cancer stem cells express transmembrane transporters, including ABCG2. This protein excludes the fluorescent Hoechst 33342 dye from the cells and, as such, behaves as one of the major mediators of the side population (SP) phenotype. The SP technique has long been used to isolate both normal and cancer stem cells from different organs and species [23–28]. ABCG2 expression was found to be higher in SP cells isolated from mammoplasties of healthy patients than in non-SP cells [29]. Moreover, a specific ABCG2 inhibitor (Ko143) reduced SP formation, suggesting that ABCG2 confers the SP phenotype in mammary epithelial cells. Interestingly, SP cells, unlike their non-SP counterparts, express neither luminal nor myoepithelial markers [30], suggesting that they are dedifferentiated. In line with this observation, we demonstrate that CD338/ABCG2 is a reliable antigen with which to sort out the tumor-initiating luminal progenitor population of BRCA1-mutated breast cancer cells.

Results

CD338 is differentially expressed in CD24+ and CD24-/low subpopulations in the BRCA1- mutated HCC1937 cell line

To identify novel antigens that can improve the power of the CD44/CD24 antigenic phenotype in order to isolate TICs, we measured the expression of 28 surface antigens reported to be essential for cell adhesion, migration, apoptosis, cell signaling or stemness (Table 1 and Additional file 1: Table S1) in two basal A/basal-like cell lines, namely BT20 and HCC1937 (BRCA1-/-), and in the basal B/claudin-low Hs578T cell line. Figure 1 shows the expression of these antigens, as assessed by flow-cytometry, in the CD44+/CD24+ and CD44+/CD24-/low cell subpopulations. In particular, we determined the ratio between the percentage of cells positive for each antigen in the two cell subpopulations of each cell line (Figure 1b). No significant differences were observed in the expression of the examined antigens between the CD44+/CD24+ and CD44+/CD24-/low cell subpopulations in the BT20 and Hs578T cell lines (ratio ≈ 1), while several of them were significantly enriched in the CD44+/CD24+ population of HCC1937, including the stemness markers CD10, CD133 and CD338/ABCG2 [25, 31, 32]. Evaluation of the mean fluorescence intensity (MFI) of each surface marker in the CD44+/CD24+ and CD44+/CD24-/low cell subpopulations of the HCC1937 cell line demonstrated that CD338 is expressed at a higher level in the CD24+ than in the CD24- cell subpopulation (Additional file 2: Figure S1).

Table 1 Molecular identity and functions of the antigens analyzed by flow-cytometry
Figure 1

Surface expression profile of the CD24+ and CD24-/low cell subpopulations of basal-like cell lines. (a) Expression of CD24 and CD44 in the basal A/basal-like HCC1937 and BT-20 cell lines, and in the basal B/claudin-low Hs578T cell line. CD44+/CD24+ and CD44+/CD24-/low cell subpopulations of each cell line were defined as shown by the two gated regions, namely, P4: CD44+/CD24-/low (blue events) and P3: CD44+/CD24+ (orange events). (b) The expression of 28 surface antigens was analyzed in the CD44+/CD24+ and CD44+/CD24-/low cell subpopulations. The histogram shows the ratio between the percentage of cells positive for each antigen in the CD24+ and CD24-/low cell subpopulations of each cell line.

Given the high differential of CD338 (ratio >3, Figure 1), we explored the link between CD24 and CD338 expression. To this aim, we gated the HCC1937 CD338high and CD338- cell subpopulations, and measured CD24 expression. As shown in Figure 2, the MFI of CD24 was 4.7-fold higher in CD338high cells (panel a, red events) than in CD338- cells (panel a, blue events; mean ± SEM: 5,200.0 ± 916.5 and 1,100.0 ± 404.1, p < 0.05). This finding is consistent with an overlap between CD24+ and CD338high cells.
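The fold difference in CD24 expression can be recomputed directly from the two mean MFI values reported above. The following is only a minimal sketch of that arithmetic; the function and variable names are illustrative and are not part of the original analysis, which was performed in the cytometer software.

```python
def fold_change(mean_a: float, mean_b: float) -> float:
    """Return the ratio of two mean fluorescence intensities (a relative to b)."""
    if mean_b <= 0:
        raise ValueError("reference mean must be positive")
    return mean_a / mean_b


# Mean CD24 MFI values as reported in the text (arbitrary fluorescence units).
cd24_mfi_in_cd338_high = 5200.0
cd24_mfi_in_cd338_neg = 1100.0

ratio = fold_change(cd24_mfi_in_cd338_high, cd24_mfi_in_cd338_neg)
print(f"{ratio:.1f}-fold")  # prints "4.7-fold"
```

The same ratio applied to the mammosphere data reported later (8.3% versus 1.7% CD338high cells) gives approximately 4.9.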

Figure 2

Co-segregation of a unique subcellular population with CD338 and CD24. The mean fluorescence intensity of CD24 was 4.7-fold higher in CD338high cells (panel a, red events) than in CD338- cells (panel a, blue events; mean ± SEM: 5,200.0 ± 916.5 and 1,100.0 ± 404.1, p < 0.05), as shown by a monoparametric histogram of CD24 expression (panel b) and by CD338 vs CD24 dot plot (panel c).

Enrichment in stem cells parallels an increase in CD338 expression

If CD338 is a reliable marker of TICs, culture conditions reported to induce enrichment of stem cells and progenitors should be associated with an increase in CD338 expression. To address this issue, we performed mammosphere formation assays in which we ran three successive culture/dissociation passages in ultra-low adherent conditions, and measured the percentage of CD338-expressing cells. The percentage of CD338high cells in mammospheres was 4.9-fold higher than that in the whole cell line when cultured in adherent conditions (mean ± SEM: 8.3% ± 0.5 and 1.7% ± 0.1 respectively, p < 0.0001). Furthermore, the MFI of CD338, which indicates its expression level, was 3.8-fold higher in mammosphere-derived cells than in adherent cells. This observation supports the assumption that CD338-positive cells display some stem cell-like properties (Figure 3). Strikingly, CD338-positive cells in ultra-low adherent conditions were the only cell subpopulation to retain CD326/EpCAM and CD49f/α6-integrin expression (Figure 3f), which is an antigenic phenotype assigned to luminal progenitors.

Figure 3

Culture in ultra-low adherent conditions enriches in CD338-expressing cells. HCC1937 cells were either cultured in adherent (panel a) or ultra-low adherent conditions (panel b). CD338 expression was assessed on adherent cells (panel c) and in third-generation mammospheres (panel d). CD326 and CD49f expression in the adherent cell line (panel e) and on mammosphere-derived cells (panel f) was assessed.

CD338-expressing cells display stemness properties and transformation potential

If CD338 is an antigenic marker of TICs in the HCC1937 cell line, the CD338high cellular subpopulation would be expected to generate mammospheres when cultured in ultra-low adherent conditions, whereas CD338- cells should be devoid of stemness properties. To test this hypothesis, we sorted by flow-cytometry (Additional file 3: Figure S2) three distinct populations that differ in the expression of CD338: a CD338high population expressing CD338 at high level (1% of cells), a CD338low population expressing CD338 at low level (79% of cells), and a CD338- population negative for CD338 (20% of cells). The three sorted populations were grown for two days in standard adherent conditions to allow them to recover from the sorting procedure before testing their stemness properties in a mammosphere formation assay. As expected, after two serial passages, only CD338-expressing cells gave rise to mammospheres (Figure 4a).

Figure 4

CD338 expression discriminates cells with stemness properties and transformation potential. (a) Mammosphere formation assay. CD338high, CD338low and CD338- sorted sub-populations of HCC1937 cells were plated in ultra-low adherent conditions at a low density to generate mammospheres. Upper (I), central (II) and lower (III) panels show the results of the first-, second- and third-generation mammospheres, respectively. Results are expressed as mammosphere-forming efficiency (MFE, number of mammospheres/number of wells) ± SD of triplicates. (b) Soft agar colony formation assay. CD338high, CD338low, and CD338- cells were sorted out from HCC1937 cells and tested for their ability to generate colonies on soft-agar. The number of colonies observed after 4 weeks is indicated for 5 × 10⁴ plated cells ± SD of triplicates.

We next assessed the transformation potential of the three subpopulations in a soft-agar colony assay. As shown in Figure 4b, CD338-positive cells (both CD338high and CD338low) displayed a significantly higher transformation potential than CD338- cells. The few colonies observed in CD338- cells probably reflect contamination of the cell population with CD338low cells during the cell sorting (Additional file 4: Figure S3). There were no significant differences between CD338high and CD338low populations in either the mammosphere or the colony formation assay. Assays were invariably performed shortly after reseeding sorted cells. Notably, after several days in culture, CD338high cells gave rise to a heterogeneous CD338high and CD338low population, which suggests a lineage relationship between these cells (Figure 5). Both CD338high and CD338low cells, but not CD338- cells, displayed stemness properties and transformation potential.

Figure 5

In vitro evolution of CD338high and CD338low sorted populations. CD338high and CD338low cells were sorted out from HCC1937 cells and plated in adherent conditions. After four weeks of culture, the expression of CD338 was analyzed by flow-cytometry.

To strengthen our conclusions, we next assessed the consequences of ABCG2 depletion on the stemness properties of HCC1937 breast cancer cells by performing mammosphere formation assays. In line with our expectations, the knockdown of ABCG2, achieved through RNA interference, led to a significant decrease in stemness potential (Figure 6).

Figure 6

The knockdown of ABCG2 abolishes the stemness properties of HCC1937 breast cancer cells. ABCG2 expression was knocked down through RNA interference and the stemness properties of the resulting cell lines were examined. (a) Assessment of ABCG2 expression by q-RT-PCR. Levels expressed relative to the housekeeping HPRT1 gene transcripts were normalized with respect to uninfected HCC1937 cells. (b) Mammosphere formation assay. Results of second-generation mammospheres are shown and expressed as mammosphere-forming efficiency (MFE, number of mammospheres/number of wells) ± SD.

CD338high cells display a selective advantage in vivo

Because the tumorigenic potential is specific to TICs, we assumed that, when xenografted into immunocompromised mice, HCC1937 cell-generated tumors would be enriched in CD338-expressing cells. To address this point, 2 × 10⁶ and 4 × 10⁵ HCC1937 cells were injected into the left and right flanks of NOD/SCID mice, respectively (Figure 7a). Invariably, cells induced tumor formation with a delay depending on the number of cells injected (100%, n = 5). Tumors were excised, digested to single cell suspensions and analyzed by flow-cytometry. The percentages of CD338high cells were determined in viable (SYTOX-), human (HLA-ABC+), epithelial (EpCAM+)-gated tumor-derived cells (Figure 7b). The CD338high subpopulation was significantly enriched in both tumors (38.8% ± 1.1) compared with the parental cell line (Figure 7a). To compare the tumorigenic potential of the CD24+ and CD24- cell subpopulations, we sorted the two populations and immediately xenografted them into mice. The same number of cells (assays were performed with either 5 × 10⁴ or 5 × 10⁵ cells) was injected into the flanks of five mice. All injections invariably led to tumor growth and there were no obvious differences between the CD24+ and CD24- cell subpopulations. Moreover, no difference in tumor growth was detected between CD24 sorted cells and the unsorted HCC1937 cell line (data not shown). While the CD24+ and CD24- cell populations displayed a similar tumorigenic potential, CD24+-derived tumors had a higher percentage of CD338high cells than CD24--derived tumors (60.2% ± 3.4 versus 42.5% ± 0.5) (Figure 7c). This observation strengthens the link between CD24 and CD338 expression.

Figure 7

CD338-positive cells display a selective advantage in vivo. (a) HCC1937 cells were injected subcutaneously into the left and right flanks of five NOD/SCID mice. Dot plots show the percentages of CD338high cells in the cell suspensions obtained from digestion of tumor tissues. (b) Gating strategy to analyze by flow-cytometry the expression of CD338 in the excised tumors. Percentages of the CD338high cell subpopulation were determined in viable (SYTOX-), human (HLA-ABC+), epithelial (EpCAM+) gated cells. (c) Enrichment of CD338-positive cells in CD24+-cell derived tumors. HCC1937 CD24+ and CD24- cell subpopulations were xenografted into NOD/SCID mice. CD338high expression was assessed in the viable (SYTOX-), human (HLA-ABC+), epithelial (EpCAM+) gated tumor-derived cells.

Discussion

In an attempt to identify novel antigens that may be combined with CD44 and CD24 to specifically sort TICs, we compared the expression of a panel of surface antigens between the CD44+/CD24-/low and CD44+/CD24+ cell subpopulations of three basal A or B breast cancer cell lines. Neither the stemness-associated CD10 and CD133 antigens nor ABCG2 differed between the CD44+/CD24-/low and CD44+/CD24+ cell subpopulations of the BT20 and Hs578T cell lines. This finding questions the reliability of CD44/CD24 in identifying TICs in these two cell lines. Conversely, in the BRCA1-mutated HCC1937 cell line, which was previously reported to include a TIC population that does not display a CD44+/CD24-/low antigenic phenotype [33], the CD44+/CD24+ subpopulation displayed significantly increased expression in several stemness markers particularly the ABCG2 transporter (i.e., the CD338 antigen). The positive correlation between CD24 and CD338 is confirmed by enrichment of CD338-overexpressing cells in tumors originating from CD24-positive cells subcutaneously injected into mice (Figure 7c).

In support of ABCG2 expression in TICs, cell culture conditions that induced enrichment of stem/progenitor cells (mammosphere formation assays) were associated with a significant increase in CD338 level. Since CD338 is one of the major mediators of SP, the enrichment in CD338+ cells as observed in mammospheres is in line with the previously reported enrichment in SP cells in mammospheres [34]. The latter SP was reported to contain bipotent progenitors and to display stemness properties, as judged by their ability to generate mammospheres when cultured in ultra-low adherent conditions [34]. We thus investigated the reliability of this antigen in sorting TICs in the context of this cell line, and found that the stemness and transformation potentials were specifically assigned to the CD338-expressing cell subpopulation. Furthermore, the CD338+ subpopulation was significantly increased in tumors arising from HCC1937 cells subcutaneously xenografted into immunocompromised mice. The transformation potential was previously assigned to the CD24+ subpopulation of HCC1937 cells [33]. Given the good correlation between the expression of the CD338 and CD24 surface antigens, we conclude that the CD24+/CD338+ cells likely include HCC1937 TICs.

The enrichment of the CD338high subpopulation in HCC1937-derived tumors versus the parental cell line suggests that CD338high cells have an adaptive advantage in vivo. Notably, by studying the MDA-MB-435 basal A cell line, Patrawala and colleagues reported that, although ABCG2 preferentially marks proliferating cells, the ABCG2+ and ABCG2- cell subpopulations display a similar tumorigenic potential [35]. This discrepancy reflects either different ABCG2 expression profiles in the two cell lines or the presence in MDA-MB-435 cells of other transporters with redundant functions.

Single markers are not sufficient to identify and isolate stem cells. This is supported by the observation that tumorigenic assays that we performed with CD24+ and CD24- sorted cells did not reveal any significant differences in the tumorigenicity between the two CD24 sorted populations. Therefore, CD24 alone is not sufficient to specifically sort stem cells. In this context, CD338 is more stringent and the combination of the two antigens probably results in a better segregation, since the increase of CD338high cells in tumor tissues originated from a CD24+ sorted cell population. Moreover, mammosphere formation assays revealed that, among CD24+ cells, those overexpressing CD338 displayed a higher mammosphere forming efficiency (Additional file 5: Figure S4). The higher specificity of CD338 than CD24 to mark TICs in the HCC1937 cell line is further supported by the fact that CD338 was enriched in xenograft tumors (Figure 7) whereas CD24 and CD44 expression profiles did not change in vivo (data not shown).

CD338 expression analysis revealed three distinct cellular subpopulations. While CD338- cells probably include non-tumorigenic and differentiated cells, the significance of the two CD338+ cell populations (CD338high and CD338low) remains elusive. It is conceivable that CD338low cells arise from CD338high cells, thereby forming two separate populations.

Asymmetric division is one of the main properties of stem cells [36, 37]. The analysis of CD338high and CD338low sorted sub-populations, after culture for several weeks, revealed that the antigenic phenotype of CD338low cells remained stable and homogeneous, whereas CD338high cells gave rise to CD338high and CD338low cells, which suggests that CD338low cells derive from CD338high cells (Figure 5). Our results are in agreement with the findings of Patrawala and colleagues who found that, in several tumor cell lines, 1% of the ABCG2+ dividing cells segregated asymmetrically [35]. This observation suggests that a small proportion of ABCG2+ cancer cells (likely corresponding to the CD338high subpopulation in the HCC1937 cell line) might divide asymmetrically. In support of Patrawala’s observation, we demonstrate that CD338+ HCC1937 cells proliferate faster than their CD338- counterparts (Additional file 6: Figure S5). It is thus likely that ABCG2 marks proliferating cells and some of them (very likely the CD338high subpopulation) might undergo asymmetrical divisions, a feature of stem cells.

Despite some differences in ABCG2 transcript levels (Additional file 3: Figure S2b), it is also feasible that parenting results from the dynamic expression of CD338 [38], through protein internalization.

Cell sorting experiments were performed by using the monoclonal anti-CD338 antibody 5D3. 5D3 binding to its extracellular epitope strongly depends on the conformation of ABCG2 [39], making the binding particularly unstable. The stability of the binding was further affected by the very long cell sorting, as the isolation of the rare CD338high population (1% of the whole cell line) took between 7 and 9 hours. Attempts to stabilize the antibody-antigen interaction with a protein cross-linker (PMPI, p-Maleimidophenyl isocyanate) successfully increased the purity of the CD338high sorted cell subpopulation (from 50-70% to 90-95%; Additional file 7: Figure S6a). Unfortunately, the crosslinking of CD338 down-modulated its activity, as demonstrated by the lack of colonies when unsorted cells were plated on soft agar (Additional file 7: Figure S6b).

BRCA1-mutated basal-like breast cancers are believed to arise from luminal progenitors, a developmental stage of the mammary epithelial cell that is distinct from the primitive stem cell [7, 8, 40, 41]. It is noteworthy that CD338+ cells in mammosphere-derived HCC1937 cells are the only cell subpopulation that maintained the expression of CD326/EpCAM and CD49f/α6-integrin, a combination of antigens previously assigned to luminal progenitors [7]. Collectively, our data suggest that CD338 is specific to the luminal progenitor subpopulation of BRCA1-mutated cells and is a novel antigen with which to sort this important subpopulation.

Conclusion

Since the CD44+/CD24-/low antigenic phenotype does not constitute a universal antigenic phenotype of TICs in all breast cancer subtypes, it is necessary to identify novel TIC markers in order to better define this phenotype. Particularly, BRCA1-mutated basal-like breast cancers are believed to arise from the luminal progenitors [7, 8, 27, 28]. Here, we have identified an additional reliable antigen, CD338/ABCG2, that can be used to refine the sorting of the luminal progenitor subpopulations of BRCA1-mutated breast cancer cells.

Methods

Cell lines

The human breast cancer cell lines Hs578T and BT-20 were provided by American Type Culture Collection (Rockville, MD, USA) and cultured in DMEM, 10% FBS. The HCC1937 cell line was from the American Type Culture Collection, and was cultured in IMDM medium (Invitrogen), 20% FBS (Gibco).

Flow cytometry analysis and cell sorting

Antigens and antibodies

Multi-color flow-cytometry was performed with anti-human monoclonal antibodies (MoAbs) that were conjugated with phycoerythrin (PE), fluorescein isothiocyanate (FITC), phycoerythrin-Cy7 (PE-Cy7) or Alexa Fluor 647. The study was performed with the following antibodies: PE-conjugated MoAbs against CD10, CD29, CD54, CD55, CD59, CD61, CD151, CD166, CD200, CD340 and FITC-conjugated MoAbs against CD9, CD26, CD47, CD49b, CD49f, CD66b, CD66c, CD81, CD164, CD165, CD227, and CD326 (BD Biosciences); FITC-conjugated antibody against CD90 and CD324 (BD Pharmingen); PE-conjugated MoAb against CD133 (Miltenyi Biotec); AlexaFluor647-conjugated MoAb against CD24, PE-Cy7-conjugated MoAb against CD44 and PE-conjugated MoAb against CD338 (Biolegend); PE-conjugated MoAb against CD184 (Immunotech); PE-conjugated MoAb anti-CD105 (Serotec).

Flow-cytometry panel and sample preparation

We used a four-color flow-cytometry panel to measure the expression of the 28 surface markers in addition to CD44 and CD24 [15, 29, 31, 32, 42–63]. Cells were stained with anti-CD24-AlexaFluor 647 and anti-CD44-PECy7 monoclonal antibodies combined with pairs of antibodies, conjugated with two other different fluorochromes (PE and FITC), directed against the additional surface antigens examined (Additional file 1: Table S1). An analysis buffer (RPMI without phenol red (Invitrogen), 1-2% FBS (Gibco), 10 U/ml DNase (Sigma-Aldrich)) was used to prepare cells for analysis. Enzymatically individualized cells were counted, resuspended in the analysis buffer at 5 × 10⁶/ml and stained by incubation at 4°C for 20 min with the appropriate MoAbs. For FACS analysis, cells were stained in a 100 μl labelling volume and the concentrations of labelling antibodies were: 0.1:10 for CD44, 2:10 for CD338 and 0.5:10 for all other antibodies. Samples were washed twice with the analysis buffer, centrifuged and resuspended in 0.5 ml of FACS flow sheath fluid (BD Biosciences). Immediately before FACS acquisition, cells were incubated at room temperature in the dark with SytoxBlue (Invitrogen) or DAPI (Invitrogen) to exclude dead cells. All experiments included a negative control to exclude the signal background caused by the cellular auto-fluorescence.

For fluorescence-activated cell sorting, HCC1937 cells were enzymatically individualized, resuspended at 5 × 10⁷ cells/ml in sorting buffer (RPMI without phenol red (Invitrogen), 1-2% FBS (Gibco), 10 U/ml DNase (Sigma-Aldrich), 2.5 mM EDTA) and stained by incubation with CD338 and/or CD24 MoAbs at 4°C for 1 h. After staining, samples were washed twice with sorting buffer, and resuspended at 2 × 10⁷ cells/ml in sorting buffer. Cells were sequentially filtered (50 μm, Partech) and incubated for a few minutes at room temperature in the dark with a vital dye, SytoxBlue (Invitrogen) or DAPI (Invitrogen).

Cytometers

The samples were analysed on a FACSAria I flow-cytometer (Becton Dickinson, Franklin Lakes, NJ, USA) (Figures 1, 2 and 3; Additional file 8: Figure S7). FITC, PE and PE-Cy7 fluorescence was determined by a 488 nm excitation line and detected by 530/30 nm, 585/42 nm and 780/60 nm filters, respectively. AlexaFluor-647 fluorescence was determined by a 633 nm excitation line and detected by a 660/20 nm filter. For each sample run, 10⁴ to 2 × 10⁴ events were recorded and analysed.
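Band-pass filters are conventionally written as center wavelength/bandwidth, so a 530/30 nm filter transmits roughly 515 to 545 nm. The short sketch below converts the filter specifications listed above into wavelength windows; the helper name and the dictionary layout are illustrative assumptions, and idealized sharp filter edges are assumed.

```python
def bandpass_range(spec: str) -> tuple[float, float]:
    """Convert a 'center/width' band-pass spec (in nm) to (low, high) limits."""
    center, width = (float(part) for part in spec.split("/"))
    return center - width / 2, center + width / 2


# Detection filters as listed in the text, keyed by fluorochrome.
filters = {
    "FITC": "530/30",
    "PE": "585/42",
    "PE-Cy7": "780/60",
    "AlexaFluor 647": "660/20",
}

for fluorochrome, spec in filters.items():
    low, high = bandpass_range(spec)
    print(f"{fluorochrome}: {low:.0f}-{high:.0f} nm")
```

Under this reading, the four detection windows do not overlap, which is consistent with their use together in a four-color panel.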

The expression of CD338 was also analyzed with the BD LSR II four-laser flow-cytometer (Becton Dickinson, Franklin Lakes, NJ, USA) (Additional file 3: Figure S2 and Additional file 4: Figure S3) by exciting PE fluorochrome with the 561 nm laser. This laser enabled us to discriminate three cell subpopulations based on the expression of CD338: CD338neg, CD338low and CD338high.

Live cell sorting experiments were performed using a BD FACSAria I with a 100 μm nozzle. PE fluorescence of CD338 was determined by a 488 nm excitation line and detected by 585/42 nm filters, whereas Alexa-Fluor 647 fluorescence of CD24 was determined by a 633 nm excitation line and detected by 660/20 nm filters. Sorted cells were collected in a collecting buffer consisting of RPMI medium without phenol red (Invitrogen), 20% FBS (Gibco), 10 U/ml DNase (Sigma-Aldrich) and 2% penicillin/streptomycin (Invitrogen). Cell sorting of CD338 populations took 7 to 9 hours, whereas cell sorting of the CD24 populations took 30 to 90 minutes. Sorted populations were xenografted in mice immediately after cell sorting. An example of CD24 sort purity is reported in Additional file 9: Figure S8.

Analysis of cytometric data

The samples were analyzed using the FACSDiva software (Becton Dickinson). We used a three-gating strategy to define the target cell population to analyze the expression of the 28 surface markers (Additional file 8: Figure S7a, b, c). First, to exclude dead cells and debris, cells were gated on a two physical parameters dot plot measuring forward scatter (FSC) versus side scatter (SSC). Second, we excluded doublets by gating cells on FSC-height versus FSC-area dot plots. Lastly, to exclude dead cells, we gated Sytox-Blue- or DAPI-negative cells. The expression of each surface marker in the different cell lines was reported as percentage of positive cells in Count versus FITC- or PE-CD histograms (Additional file 8: Figure S7d). We also measured the expression level of each antigen in the different CD44/CD24 subpopulations, CD44+/CD24-/low and CD44+/CD24+, since each FITC-CD or PE-CD antibody was combined with anti-CD44 and anti-CD24 antibodies (Additional file 2: Figure S1).
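The three-step gating strategy described above can be sketched as a sequence of filters applied to each recorded event. This is only an illustration of the logic: the actual gates were drawn manually in FACSDiva, and every threshold, field name and example event below is a hypothetical placeholder rather than a value from the study.

```python
from dataclasses import dataclass

# Hypothetical rectangular thresholds (arbitrary units); real gates are
# drawn per sample and are usually polygons, not fixed cut-offs.
FSC_MIN, FSC_MAX = 20_000.0, 250_000.0
SSC_MAX = 200_000.0
SINGLET_RATIO_MIN, SINGLET_RATIO_MAX = 0.85, 1.15
VIABILITY_DYE_MAX = 1_000.0


@dataclass
class Event:
    fsc_area: float
    fsc_height: float
    ssc_area: float
    viability_dye: float  # SytoxBlue or DAPI signal


def in_scatter_gate(e: Event) -> bool:
    """Step 1: remove debris using forward versus side scatter."""
    return FSC_MIN <= e.fsc_area <= FSC_MAX and e.ssc_area <= SSC_MAX


def is_singlet(e: Event) -> bool:
    """Step 2: remove doublets using the FSC-height to FSC-area ratio."""
    if e.fsc_area <= 0:
        return False
    ratio = e.fsc_height / e.fsc_area
    return SINGLET_RATIO_MIN <= ratio <= SINGLET_RATIO_MAX


def is_viable(e: Event) -> bool:
    """Step 3: keep cells that exclude the viability dye."""
    return e.viability_dye < VIABILITY_DYE_MAX


def apply_gates(events: list[Event]) -> list[Event]:
    """Apply the three gates in sequence and return the surviving events."""
    return [e for e in events
            if in_scatter_gate(e) and is_singlet(e) and is_viable(e)]


# Made-up events: one intact single live cell, debris, a doublet, a dead cell.
events = [
    Event(100_000, 98_000, 50_000, 200),
    Event(5_000, 5_000, 2_000, 150),
    Event(180_000, 110_000, 90_000, 180),
    Event(90_000, 88_000, 40_000, 50_000),
]
kept = apply_gates(events)
print(len(kept))  # prints 1
```

Marker positivity (percentage of FITC- or PE-positive cells) would then be computed only on the events that survive all three gates.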

ABCG2 knockdown

shRNA ABCG2 lentiviral particles were generated through co-transfection of 293T cells with four different pLKO.1 shRNA constructs (GCAACAACTATGACGAATCAT, CCTTCTTCGTTATGATGTTTA, GCTGTGGCATTAAACAGAGAA and CCTGCCAATTTCAAATGTAAT; Sigma-Aldrich), pCMV R8.91 (gag-pol-Tat-Rev) and phCMVG-VSVG (env) expression constructs using the calcium phosphate precipitation technique. Infections were performed as described in [64]. Infected cells were selected with puromycin (1 μg/ml) before being plated in ultra-low adherent conditions or plated on soft agar. Inhibition of ABCG2 expression was confirmed by qRT-PCR.

Mammosphere formation assays

For mammosphere generation, HCC1937 cells were seeded in 96-well ultra-low attachment plates (Corning, New York, NY, USA) at the concentration of 1,000 cells/well for first-generation mammospheres, and at 100 cells/well for subsequent passages. Cells were grown in a serum-free mammary epithelial growth medium (MEBM, Lonza, Verviers, Belgium) supplemented with B27 (Invitrogen, Carlsbad, CA, USA), 20 ng/ml EGF, 20 ng/ml bFGF and 4 μg/ml heparin (Sigma, St. Louis, MO, USA). Mammospheres were collected by gentle centrifugation (1,000 rpm, 5 minutes), and enzymatically (5 min in trypsin/EDTA) and mechanically (26 Gauge needle) dissociated between each step and for further analysis.
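Mammosphere-forming efficiency (MFE) is reported in the figure legends as the number of mammospheres divided by the number of wells, ± SD of triplicates. A minimal sketch of that calculation follows; the replicate counts are invented for illustration, and the use of the sample standard deviation (n - 1 denominator) is an assumption, since the text does not state which form was used.

```python
from statistics import mean, stdev


def mammosphere_forming_efficiency(n_spheres: int, n_wells: int) -> float:
    """MFE as defined in the legends: mammospheres counted per well seeded."""
    if n_wells <= 0:
        raise ValueError("number of wells must be positive")
    return n_spheres / n_wells


# Hypothetical triplicate: (mammospheres counted, wells seeded) per replicate.
replicates = [(12, 96), (15, 96), (9, 96)]

mfe_values = [mammosphere_forming_efficiency(s, w) for s, w in replicates]
mfe_mean = mean(mfe_values)
mfe_sd = stdev(mfe_values)  # sample SD; an assumption, see the note above

print(f"MFE = {mfe_mean:.3f} ± {mfe_sd:.3f}")
```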

Soft agar colony formation assay

Plates were coated with 0.75% low-melting agarose (Lonza) obtained by mixing equal volumes of 1.5% agar and 2× growth medium (IMDM). Cells were enzymatically dissociated, resuspended in growth medium and counted. An overlaid suspension of cells in 0.45% low-melting agarose was obtained by mixing equal volumes of 0.9% agar and 2× IMDM with cells (5 × 10⁴ cells/well in 6-well plates). Plates were incubated for 3–4 weeks at 37°C and colonies were counted under a microscope.
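The final agarose and medium concentrations follow from the equal-volume mixing described above: each component is diluted two-fold. The sketch below simply checks that arithmetic; the function name is illustrative.

```python
def mix_equal_volumes(conc_a: float, conc_b: float = 0.0) -> float:
    """Final concentration of a solute after mixing two equal volumes."""
    return (conc_a + conc_b) / 2


# Agarose stock (%) is mixed with 2x medium, which contains no agarose.
base_layer_agarose = mix_equal_volumes(1.5)  # 0.75 %
top_layer_agarose = mix_equal_volumes(0.9)   # 0.45 %

# 2x medium is mixed with agarose prepared without medium, giving 1x medium.
final_medium_strength = mix_equal_volumes(2.0)  # 1.0 x

print(base_layer_agarose, top_layer_agarose, final_medium_strength)
```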

In vivo tumorigenicity assays

All animal experiments were conducted in accordance with accepted standards of animal care and in agreement with a protocol established with our Ethics Committee. The study has been approved by the Institutional Review Board “CEINGE-Biotecnologie Avanzate”. To evaluate in vivo tumorigenicity, unsorted HCC1937 or sorted cell subpopulations were resuspended in media and Matrigel (1:1; BD Biosciences), and injected into the left and right flanks of 4-week-old NOD/SCID mice (Charles River Laboratories). To evaluate the tumorigenic potential of the unsorted cell line, 2 × 10⁶ and 4 × 10⁵ HCC1937 cells were injected into the left and right flanks of mice, respectively (n = 5). To compare the tumorigenic potentials of the CD24+ and CD24- cell populations, 5 × 10⁴ or 5 × 10⁵ cells were injected into the left and right flanks of NOD/SCID mice, respectively (n = 5). Tumor formation was assessed and measured once a week. Animals were killed when the tumor reached the size of 1.5 cm. Tumor tissues were minced into < 1 mm pieces, dissociated in an enzymatic solution consisting of collagenase type 1 (1.5 mg/ml, Sigma), penicillin/streptomycin 20%, amphotericin 1% and DNase (1 mg/ml, Roche), and incubated at 37°C for 60 min with gentle agitation. The single cell suspensions were analyzed by flow-cytometry after staining with appropriate antibodies.