Heterogeneity in human cancer

A dominant feature of human cancer, including breast cancer, is the heterogeneity of the disease. Numerous studies have now identified multiple distinct forms of breast cancer. This presents a problem both in understanding the mechanisms of progression and in developing therapeutic options. Human breast cancer gives rise to tumors with a wide array of morphological subtypes, starting with the broad classification as invasive ductal carcinoma or invasive lobular carcinoma. With examination of biomarkers such as estrogen receptor (ER) and progesterone receptor (PR) status, these broad classifications may be further broken down into subgroups. Addition of other markers allows further classification, ultimately resulting in a number of heterogeneous histological subtypes. Indeed, unlike chronic myeloid leukemia, where the vast majority of tumors are quite similar due to the BCR/ABL1 translocation [1], breast cancer is now recognized to be a diverse collection of disease states.

A hallmark of human cancer is genetic complexity, which reflects the mutations that give rise to the tumor phenotype. This is especially true for breast cancer, where a number of different mutations are commonly observed, such as HER2 (Neu, ErbB2) amplification and overexpression in 20–30% of tumors [2–4]. Amplification of other genes is also observed, including amplification of Myc in 15% of breast cancers [5]. In addition to these spontaneous mutations, 10% of breast cancer is due to inherited mutations in genes such as BRCA1/BRCA2 [6–8]. Consistent with the large number of mutations involved in breast cancer, recent large-scale DNA sequencing efforts have illustrated the genetic complexity of cancer, revealing many types of alterations that can distinguish tumor subtypes [9–12]. In particular, the study by Wood et al. characterized several mutations that occurred in a large majority of the tumors and a large number of additional mutations that arose in smaller subsets of tumors, illustrating the heterogeneity of the mutations associated with breast cancer [12]. Together, the genetic alterations that underlie breast cancer and the resulting histological subtypes illustrate the heterogeneous nature of the disease.

The array of genetic abnormalities inherent in breast cancer is also reflected in the complexity of the associated gene expression data. Initial studies of large numbers of human breast cancer samples revealed heterogeneity among the tumors, with subtypes that correlated with differences in clinical outcome [13]. Subsequent studies have refined the expression patterns in these large datasets into predictions of breast cancer subtypes. These methods have produced the well-known luminal A, luminal B, basal, ErbB2+ and normal breast-like classification using an intrinsic method [14–16], reflecting the concept that gene expression data are associated with histological subtypes.

In addition to these methods, recent work has used signatures of cell signaling pathways to investigate the pattern of gene expression in a variety of samples [17–20]. This method uses gene expression data from control samples and pathway-activated samples as training data to develop a series of pathway signatures. Regression models built on the signature that separates the two biological states then assign a probability score to each sample in subsequent datasets. Importantly, for our studies of breast cancer, these genetic signatures have largely been created in primary human mammary epithelial cells (HMECs). To ensure that the signatures capture the transcriptional response to the pathway of interest, RNA is collected shortly after expression of the gene of interest [17]. For example, the expression of an activated RAS gene in HMECs via an adenoviral vector generates a gene expression pattern quite distinct from that in cells infected with a control viral vector or cells expressing a different oncogene. This signature can then be applied to additional samples in separate tumor datasets, generating a probability score for each sample that reflects how closely it shares the RAS gene expression profile [17]. As an example, two breast cancer datasets (GSE4922 [21] and GSE15852) were examined using a collection of pathway signatures. The resulting pathway probabilities were then clustered (unsupervised hierarchical clustering) to reveal patterns of pathway activity that could be presented as a heat map. This analysis revealed a number of distinct clusters with differences in pathway probability (Fig. 1). The approach provides detailed information about the status of pathway activation, and clustering these data divides human breast cancer into subgroups based on gene expression.
Because other studies have demonstrated a link between the prediction of pathway activity and the prediction of sensitivity to drugs targeting that pathway [17, 22], these predictions present a unique opportunity to guide therapeutic options. Essentially, we have generated predictions for individual tumors of the status of various pathways, which are in some cases the underlying mechanism of the heterogeneous disease. Many of these pathways can be linked directly to therapeutic opportunities, suggesting that these methods could be used to generate individualized therapies.
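
The signature approach described above can be illustrated with a minimal Python sketch. It is not the published implementation: the gene-selection rule, the one-dimensional logistic regression fitted by gradient descent, the function names and the toy expression values are all assumptions made for illustration, and a plain logistic model stands in for whatever regression framework the cited studies used.

```python
import math

def signature_weights(control, activated, n_genes=2):
    """Pick the genes whose mean expression differs most between the two
    training states; the weight is the signed difference in means.
    (Assumed selection rule, for illustration only.)"""
    ranked = []
    for g in range(len(control[0])):
        mean_c = sum(s[g] for s in control) / len(control)
        mean_a = sum(s[g] for s in activated) / len(activated)
        diff = mean_a - mean_c
        ranked.append((abs(diff), g, diff))
    ranked.sort(reverse=True)
    return {g: diff for _, g, diff in ranked[:n_genes]}

def signature_score(sample, weights):
    """Collapse a sample onto the signature: weighted sum of signature genes."""
    return sum(w * sample[g] for g, w in weights.items())

def fit_probability_model(scores, labels, lr=0.1, steps=2000):
    """Fit a one-variable logistic regression by gradient descent and
    return a function mapping a signature score to a probability."""
    mu = sum(scores) / len(scores)
    sd = math.sqrt(sum((s - mu) ** 2 for s in scores) / len(scores)) or 1.0
    slope = intercept = 0.0
    for _ in range(steps):
        g_slope = g_int = 0.0
        for s, y in zip(scores, labels):
            z = (s - mu) / sd  # standardized score
            p = 1.0 / (1.0 + math.exp(-(slope * z + intercept)))
            g_slope += (p - y) * z
            g_int += p - y
        slope -= lr * g_slope / len(scores)
        intercept -= lr * g_int / len(scores)
    return lambda s: 1.0 / (1.0 + math.exp(-(slope * (s - mu) / sd + intercept)))

# Toy training data (hypothetical values): rows are samples, columns are genes.
control = [[1.0, 5.0, 3.0, 2.0], [1.2, 4.8, 3.1, 2.1], [0.9, 5.1, 2.9, 1.9]]
activated = [[3.0, 2.0, 3.0, 2.0], [3.2, 2.2, 3.1, 2.1], [2.9, 1.9, 2.9, 2.0]]

weights = signature_weights(control, activated)
train = control + activated
labels = [0] * len(control) + [1] * len(activated)
model = fit_probability_model([signature_score(s, weights) for s in train], labels)

# Score two hypothetical tumor samples from a separate dataset.
p_high = model(signature_score([3.1, 2.1, 3.0, 2.0], weights))
p_low = model(signature_score([1.0, 5.0, 3.0, 2.0], weights))
```

In this toy example the first tumor resembles the pathway-activated training samples and receives a high probability, while the second resembles the controls and receives a low one. Repeating the procedure for each pathway would yield one probability per pathway per tumor, which is the kind of matrix that is clustered in Fig. 1.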

Fig. 1

Signaling pathway probability in human tumors. Human breast cancer gene expression data from previous studies (GSE4922 and GSE15852) were downloaded and normalized, and the two datasets were merged using the Affymetrix housekeeping genes to standardize between batches. The combined dataset was then examined for patterns of signaling pathway activation. After signaling pathway activation probabilities were generated, the data were clustered, and the resulting heat map is shown for the pathways indicated at the right. For a given pathway, blue represents a low probability of pathway activation while red represents a high probability of activation. These datasets also included clinical status for the grade of the tumor (1, 2, 3), and one dataset (GSE15852) included normal breast samples. This additional clinical data is shown in the legend above the heat map. Using only signaling pathway probabilities, the heterogeneity within human tumors is readily apparent.
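
The caption does not specify how the housekeeping genes were used to standardize between batches. One simple possibility, shown here purely as an assumption, is to shift every value in one batch so that its mean log-scale housekeeping signal matches that of the other batch. The gene names, expression values and function names below are illustrative only.

```python
def housekeeping_mean(batch, housekeeping):
    """Mean log-scale expression of the housekeeping genes over all samples."""
    values = [sample[g] for sample in batch for g in housekeeping]
    return sum(values) / len(values)

def merge_batches(batch_a, batch_b, housekeeping):
    """Shift batch_b so its housekeeping mean matches batch_a, then combine.
    (Assumed standardization scheme; the original procedure may differ.)"""
    shift = housekeeping_mean(batch_a, housekeeping) - housekeeping_mean(batch_b, housekeeping)
    adjusted_b = [{gene: value + shift for gene, value in sample.items()}
                  for sample in batch_b]
    return batch_a + adjusted_b

# Hypothetical log2 expression values; each dict is one sample.
batch_a = [{"ACTB": 10.0, "GAPDH": 11.0, "MYC": 7.0},
           {"ACTB": 10.2, "GAPDH": 10.8, "MYC": 9.0}]
batch_b = [{"ACTB": 12.0, "GAPDH": 13.0, "MYC": 9.5}]

merged = merge_batches(batch_a, batch_b, housekeeping=("ACTB", "GAPDH"))
```

Here the second batch reads about two log2 units higher on the housekeeping genes, so all of its values are shifted down by that amount before the datasets are combined.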

Heterogeneity of breast cancer is recapitulated in mouse models

The generation of transgenic mouse models of breast cancer began with the creation of mice expressing Myc under the control of the Mouse Mammary Tumor Virus (MMTV) promoter/enhancer [23]. Since that time, many oncogenes have been placed under the control of MMTV, resulting in various types of mammary tumors. Interestingly, many of these transgenic mice develop tumors with a distinctive pathology that depends on the initiating oncogene [24]. Specifically, this work illustrated that mice overexpressing Ras, Neu or Myc each developed tumors with a characteristic phenotype, consistent with the notion that these tumors were initiated by a dominant oncogene. Conversely, other mouse models of breast cancer are known to result in varied morphological patterns, more analogous to the human condition. For instance, mammary tumors induced through expression of Wnt or members of the Wnt signaling pathway are known to have a wide range of histological patterns [25]. This is also true for MET-induced tumors, which were found to have a number of pathologies including papillary, scirrhous, solid nodular, adenosquamous, and spindle cell [26]. Other models are also known to result in tumors with varied morphology, including the Polyoma Virus Middle T model, with six well-characterized phenotypes [27]. Taken together, these various models suggest that a careful examination of the histological subtypes of tumors in a given experiment is a critical component of evaluating the utility of the model.

With these studies in mind, we have recently described work with transgenic mice overexpressing various Myc alleles under the control of the MMTV promoter [28]. While we noted a distinctive phenotype for each of the Myc alleles, accounting for approximately 40% of the tumors in each strain, close examination of a large number of tumors (>350) revealed substantial heterogeneity in the Myc models. The histological types we observed ranged from microacinar and papillary as the dominant morphologies to epithelial-to-mesenchymal transition (EMT), squamous, adenocarcinomas and tumors with mixed lineages. This suggested that while Myc does preferentially induce a distinct phenotype, there is also significant heterogeneity. To examine the heterogeneity of this model system, tumors from each histological subtype were examined through gene expression analysis. Unsupervised hierarchical clustering of the microarray data revealed a number of distinct groups of samples [28]. Importantly, the groups defined by gene expression patterns corresponded with the histological classifications. While it is not surprising that the histological characteristics of a tumor are reflected in its transcriptional changes, it is important to note the heterogeneity of the various tumors. Interestingly, when these classes of tumors were compared to a survey of mouse mammary cancers [29], the various classes fit with other tumor models. As an example, the EMT tumors clustered with the p53-/- and DMBA tumors. In the description of the MMTV–MET tumors [26], it was also observed that there were heterogeneous tumor populations at the gene expression level and that the EMT tumors clustered together with the p53-/- tumors. Together, these findings illustrate the importance of examining both histological variation and gene expression patterns.
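
For readers unfamiliar with unsupervised hierarchical clustering, the following minimal Python sketch groups samples by their expression profiles without using any histological labels. Average linkage and Euclidean distance are assumptions made for illustration, since the linkage and distance measure used in [28] are not stated here, and the profiles are hypothetical values.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two expression profiles."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def average_linkage(c1, c2, profiles):
    """Mean pairwise distance between the members of two clusters."""
    dists = [euclidean(profiles[i], profiles[j]) for i in c1 for j in c2]
    return sum(dists) / len(dists)

def hierarchical_clusters(profiles, n_clusters):
    """Agglomerative clustering: start with one cluster per sample and
    repeatedly merge the closest pair until n_clusters remain."""
    clusters = [[i] for i in range(len(profiles))]
    while len(clusters) > n_clusters:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                d = average_linkage(clusters[a], clusters[b], profiles)
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return [sorted(c) for c in clusters]

# Hypothetical profiles: one row per tumor, one column per gene.
profiles = [
    [0.90, 0.10, 0.80],
    [0.85, 0.15, 0.90],
    [0.20, 0.90, 0.10],
    [0.30, 0.95, 0.20],
    [0.80, 0.20, 0.85],
]
groups = hierarchical_clusters(profiles, n_clusters=2)
```

With these toy values, tumors 0, 1 and 4 fall into one group and tumors 2 and 3 into the other. In practice the resulting groups would then be compared with the histological classification of each tumor, as described above.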

To further dissect the heterogeneity of the Myc-initiated mouse mammary tumors, and to do so with information that provides a basis for understanding functional distinctions between subgroups, we applied the various pathway signatures to the collection of tumors. This analysis revealed that the same histological subtypes could also be distinguished based on the higher-order structure within the data. Compared with the unsupervised clustering, this resulted in similar patterns but also revealed additional information. For instance, the analysis illustrated that the Ras pathway was highly activated in EMT tumors but was not likely to be activated in microacinar tumors, providing information that helps to decipher the patterns in the gene expression data. Together with the histological data, this has allowed a more informative characterization of the various subtypes of Myc-induced mammary tumors [28]. Importantly, this analysis has also enabled us to compare this tumor model with human breast cancer, revealing subtypes of human breast cancer that share pathways with the mouse model as well as key distinctions. For example, in Fig. 1 there is a clear correlation between Myc and Ras in human breast cancer, whereas in the Myc-induced mouse model we reported an inverse correlation between Myc and Ras [28]. Other relationships are shared. In the human signatures in Fig. 1, there are subgroups of cancers with an elevated probability of activation of both E2F1 and β-catenin (CTNNB1) (BCAT pathway in figure legend). Similarly, in the mouse predictions we noted a shared elevation of E2F1 and β-catenin in a select subtype of mammary tumor, notably the microacinar tumors. This association was altered in other tumor types, such as EMT, where E2F1 was low and β-catenin was midrange, and papillary, where E2F1 probability was high while β-catenin was low. These examples illustrate the utility of using genomic signatures to compare tumors.

As an important component of mouse modeling, comparisons have been drawn between other mouse model systems and human breast cancer. Initially, in a survey of 13 models of murine breast cancer, gene expression patterns were compared using an intrinsic gene expression signature [29]. This study defined the relationships between the various mouse model systems with an intrinsic gene set, a group of genes that distinguished the various mouse models. The analysis revealed the similarities and differences between the mouse models of breast cancer, generating a number of clusters, including normal mammary gland, mesenchymal, basal, luminal, and mixed groups of tumors [29]. In addition to comparing across mouse models, this report also examined the relation between the models and human breast cancer. Through unsupervised clustering, the authors showed that the defining features of the human predictions (luminal A, luminal B, and basal) were maintained in specific mouse model systems. For instance, key genes from the basal cluster were shared in several mouse models. However, this analysis also illustrated that the luminal tumor type was not tightly related to the various murine tumor models, although there were genes with similar expression profiles. In addition, the report used an intrinsic analysis to compare the mouse and human tumors, which placed several of the mouse models together with the luminal tumors but, at the same time, illustrated the estrogen-receptor-mediated differences between the model systems and the human tumors. With this mouse model data as a framework, several recent papers have generated gene expression data from additional models and compared them to these 13 mouse model systems [26, 28, 30, 31]. This has provided an important context for determining how new tumor models relate to the existing models and which human tumors they most closely resemble.
Indeed, determining which type of human cancer a mouse model most closely resembles is a critical component of characterizing the mouse model tumors at the level of both gene expression patterns and histological subtypes, such as basal and luminal types. In another approach, we created a signature of Myc-driven tumors that had an EMT component and applied this phenotypic signature to the human breast cancer gene expression data. This analysis revealed that triple-negative (ER, PR and HER2 negative) breast cancer had a significant elevation of EMT probability, identifying this mouse model as one that may be appropriate for examination of potential therapeutic strategies.

Opportunities to explore strategies for individualized therapy

Based upon a combination of clinical, histological and genomic data, the goal of current care in breast cancer is to offer a course of treatment best suited to the individual patient. One of the greatest challenges for the effective treatment of this disease is the heterogeneity discussed above, which has spurred the current focus on individualized medicine. The importance of dissecting this heterogeneity is best illustrated by the breast cancer drug trastuzumab. Only a small fraction of all breast cancer patients benefit from trastuzumab, but with the use of the HER2 biomarker to pre-select patients who might respond, trastuzumab becomes an effective therapy in breast cancer [32].

Perhaps the most significant outcome of the genomic analyses of the mouse mammary tumor models is the realization that the heterogeneity characteristic of the human disease is recapitulated in these models. As such, they offer an opportunity to evaluate the effectiveness of using genomic data to guide the selection of a therapeutic regimen tailored to individual mice. There are a number of clear advantages to this strategy. Firstly, by using a model in which we have clearly defined the genetic heterogeneity, we will be able to make probability predictions for the therapies for the various types of tumors that are commonly observed. Secondly, the relation of the various mouse models to human breast cancer has been defined genetically. In addition, the mouse serves as an exceptional experimental system since mammary tumors are readily detected, biopsied, and transplanted, and responses to therapy can be measured through a variety of methods [33]. Finally, in contrast to human tumors where the standard of care must be maintained, the mouse model offers an opportunity to test only those therapeutic compounds with the highest predicted probability of activity. Given these advantages of the murine system, coupled with our previous characterization of the patterns of pathway activation in the Myc tumor model [28], we have identified therapeutic targets and have begun an investigation into their efficacy. Our basic experimental design is to mimic a human trial, where samples are biopsied, examined for gene expression characteristics and then randomized into treatment groups (Fig. 2). In addition, by implanting portions of the biopsy material into additional recipient mice, we can directly compare the predicted treatment, a control treatment and no therapeutic intervention on the same tumor sample. This proof-of-principle experiment will demonstrate the utility of the mouse in modeling individualized breast cancer therapy.
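
The allocation step of such a design can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the protocol itself: the rule of targeting the pathway with the highest predicted probability, the arm names, the mouse identifiers and the probability values are all hypothetical.

```python
import random

# Three arms compared on portions of the same biopsied tumor (assumed labels).
ARMS = ("predicted_therapy", "control_therapy", "no_treatment")

def top_pathway(probabilities):
    """Return the pathway with the highest predicted activation probability."""
    return max(probabilities, key=probabilities.get)

def assign_arms(tumor_id, probabilities, recipients, rng):
    """Randomly allocate recipient mice carrying portions of one tumor
    to the three arms, and record which pathway the therapy would target."""
    if len(recipients) != len(ARMS):
        raise ValueError("need exactly one recipient mouse per arm")
    shuffled = list(recipients)
    rng.shuffle(shuffled)  # randomization step
    return {
        "tumor": tumor_id,
        "target_pathway": top_pathway(probabilities),
        "arms": dict(zip(ARMS, shuffled)),
    }

# Hypothetical pathway probabilities for one biopsied tumor.
plan = assign_arms(
    tumor_id="tumor_01",
    probabilities={"Myc": 0.9, "Ras": 0.2, "E2F1": 0.6},
    recipients=["mouse_a", "mouse_b", "mouse_c"],
    rng=random.Random(0),  # fixed seed so the allocation is reproducible
)
```

Passing a seeded random number generator keeps the allocation reproducible and auditable, which matters when the response of each arm is later compared for the same tumor.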

Fig. 2

Therapeutic strategy based on signaling pathway profiles. The design of a proof-of-principle experiment in which signaling pathway probabilities guide the use of different drug combinations is shown. A line of MMTV-based transgenic mice that develop tumors with demonstrated heterogeneity is used in this scheme. As spontaneous tumors develop, they are biopsied and their gene expression patterns are immediately examined. Based on the patterns of signaling pathway activation, the mice will be assigned to the appropriate therapeutic option or to a control group. Following treatment with the individualized predicted therapeutic combinations, the response of the tumor will be assessed.

Conclusion

In the near future, examination of genomic data will be an integral part of directing therapeutic strategies for individual breast cancer patients. The complexity that originates in the heterogeneity of human breast cancer is being modeled in mice, and the results observed in these model systems will directly affect the development of individualized therapeutic opportunities.