Introduction

Breast cancer is a heterogeneous disease, including subtypes based on hormone receptor status and amplification of HER2 [1, 2]. These subtypes have distinct underlying molecular defects that affect both their aggressiveness and the signaling pathways that are vulnerable to targeted therapies [3, 4]. While these designations are extremely useful, breast cancer can also exhibit significant intratumoral heterogeneity, both between individual tumor cells and between tumor and stromal compartments. For example, tumors classified as hormone receptor positive may have different proportions of estrogen receptor (ER) or progesterone receptor (PR) positive cells. Thus, there may exist within a tumor some cells that are more versus less responsive to a given treatment, or cells that are more likely than others to spread distantly. Contributing to this intratumoral heterogeneity is the concept of breast cancer stem cells, which may be more resistant to therapies and/or more likely to metastasize [5, 6]. In addition, breast cancer is also 'temporally heterogeneous', with cancers presenting at different stages of their evolution. In general, cancers detected early in progression are less dangerous and more amenable to treatment than those detected later.

Characterizing the nature of an individual breast cancer, both in terms of type of breast cancer and stage of progression, is crucial for estimating prognosis of the patient and for the prediction that a given treatment will be successful. However, prognostic and predictive information is population-based. While useful, this information does not necessarily predict the fate of an individual with breast cancer. As a result, some women may be over-treated and others under-treated, or treated with therapy that will not offer benefit. Thus, improved ways to 'individualize' prognosis and treatment decisions are needed [7].

As an attempt to meet this need for more 'personalized' information to guide treatment, additional ways are being studied to classify individual tumors, based on single biomarkers or more complex molecular signatures. Rapidly evolving technologies that enable detailed molecular profiling of tumors are raising hopes that breast cancer treatment decisions may become even more tailored to an individual breast cancer patient's tumor. Here we discuss the role that some of these new profiling approaches may play in cancer patient management, and the role that tumor and patient heterogeneity may play in using this information to best benefit patients.

The prognosis, prediction and treatment of breast cancer are complicated by the diverse constellation of causative alterations within multiple biological pathways that lead to this heterogeneous disease. Initial strategies to treat breast cancer have therefore employed gene-specific, tissue-specific as well as whole genome approaches to identify specific signatures related to particular breast cancer types, which can then be exploited to optimize treatment targeting a specific patient's tumors. Some studies have evaluated the expression status of individual candidate genes in cell lines and/or tumor material in a tissue-specific manner. For example, significantly reduced levels of mRNA expression of the metastasis suppressor genes BRMS1, KISS1 (kisspeptin), KAI1 (CD82) and Mkk4 (MAP2K4; mitogen-activated protein kinase kinase 4) have been shown in breast cancer brain metastasis [8], with specific suppression of BRMS1 modifying several metastasis-related phenotypes [9]. Whole genome approaches using microarray platforms have identified more extensive gene sets that can predict a short interval to distant metastases (that is, a poor prognosis signature) [10, 11] or have identified gene sets that mediate metastasis from a specific primary tissue to a tissue-specific host site [12, 13]. Minn and co-workers [14] identified a complex 54-gene breast cancer set that marks and mediates breast cancer metastasis to the lungs and appeared to consist of at least two separate classes of genes that confer both breast tumorigenicity and lung metastagenicity, as well as one that is advantageous to cells in that lung environment. Additionally, Kang and co-workers [15] identified a functionally diverse gene set that, when overexpressed, cooperatively promotes the metastasis of breast cancer cells to bone. 
Importantly, clinically significant 21-gene [16] and 70-gene signatures [10, 17] have formed the basis for widely used molecular diagnostic tests that have been translated and validated as clinical tools, serving as prognostic and predictive markers to guide treatment decisions in specific breast cancer patient cohorts. These particular markers will be discussed in detail later in this review. Finally, several reports have addressed the contributions of altered epigenetic signatures in breast cancer models [18, 19] and through the integration of multiple genetic and epigenetic multi-gene platforms [20].

These reports underscore the complexity of metastasis as a multigenic process and support the concept that heterogeneous, selectable subpopulations of cells in the primary tumor may possess specific gene sets that are permissive for metastasis and/or for the colonization and growth of those cells at specific secondary sites. The challenge for the clinician remains to identify the relevant gene sets and to exploit this information to permit better prognosis and personalized treatment options for individual patients.

Current prognostic and predictive factors - a clinical perspective

Traditional clinical prognostic factors are still commonly used to guide therapy. Pathologic subtyping is important. For example, pure infiltrating lobular [21], phyllodes [22], mucinous and tubular carcinomas [23] have a generally better prognosis than infiltrating ductal cancers, although the lobular cancers may have more late relapses. Increased nodal status, high tumor grade, high Ki67, increased tumor size and negative receptor status (especially PR) are associated with a poorer prognosis [24]. The increased use of sentinel lymph node dissection and the subsequent more detailed examination of fewer nodes have resulted in the detection of more nodes with micrometastases (> 0.2 to ≤ 2.0 mm), leading to a new category for nodal status in the American Joint Committee on Cancer (AJCC) Cancer Staging Manual [25]. Although micrometastases have been associated with a poorer prognosis [26], it is possible that their prognostic impact has been diluted or eliminated by the use of modern systemic therapy [27]. More recent classifications include HER2 status [28] and basal-like breast cancer [3]. Interestingly, the National Comprehensive Cancer Network (NCCN) and American Society of Clinical Oncology (ASCO) guidelines give discordant recommendations for use of HER2 status for prognosis [29, 30]. Basal breast cancer is generally thought to have a poorer short-term but better long-term prognosis [31], but understanding of this variant is hampered by the absence of a universally accepted definition [3, 32]. There is increasing evidence that prognosis may also be related to patient-specific factors, including very young age [33] and, in postmenopausal women, overweight and excessive alcohol consumption [34, 35]. Thus, environmental factors may have a role in determining recurrence of cancer. Although race has been associated with poorer prognosis [36, 37], this might be an epiphenomenon related to a complex interplay between socio-economic, cultural and biological factors [38].
Therefore, a better understanding of tumor biology may help discriminate among the relative importance of these factors. Research on prognostic markers would be more clinically relevant in the future if the REMARK (Reporting Recommendations for Tumor Marker Prognostic Studies) reporting recommendations for tumor marker studies developed by the National Cancer Institute-European Organisation for Research and Treatment of Cancer (NCI-EORTC) were implemented [39]. However, a recent sampling of 50 studies from high impact journals indicated poor compliance with the recommendations [40]. These guidelines apply not only to single biomarkers, but also to panels of markers and profiles [41].

Guidelines for the use of predictive factors to target therapy have been published by the St Gallen's group [42], the National Comprehensive Cancer Network [29] and ASCO [30]. The Adjuvant! Online decision aid [43], although widely used, does not incorporate HER2 status and suffers from difficulties in interpretation of the co-morbidity index, which may significantly impact on the interpretation of benefit when compared to overall and not just cancer mortality risks. It also does not incorporate potentially important independent risk factors, such as presence of lymphatic or vascular invasion in node negative disease [43, 44].

The most difficult areas of controversy are in deciding whether to give chemotherapy to postmenopausal women who have low or even intermediate grade ER or PR positive, HER2 negative breast cancers with one to three nodes positive or those with negative nodes and ER or PR positive, HER2 negative intermediate grade tumors [45, 46]. There may also be subsets of women, especially those with HER2+ T1bN0 cancers, who might be at increased risk of relapse but for whom, at this time, there are no clear guidelines for treatment. Neoadjuvant chemotherapy is increasingly used both in clinical and research settings. Although pathologic complete response is an important surrogate endpoint, more useful functional and molecular imaging tools along with biological assessment of tissue are required [47]. It is in these areas where there is the greatest potential for the use of newer biologically derived profiling technologies. Finally, a greater understanding of molecular subtypes may allow for more rational use of chemotherapy in important subsets of breast cancers [48].

Molecular subtyping provides a 'snapshot' of a tumor at a single point in time. However, tumor status may change when metastases are compared to primary cancers. A meta-analysis of 8 observational studies totaling 658 paired ER samples and 418 paired PR samples comparing primary and metastatic tumors showed discordance rates of 29% and 27% for ER and PR, respectively [49]. Information on HER2 status when primary and metastatic sites were compared has given discordance rates between 0% and 13.6% in seven studies, suggesting somewhat higher concordance [50-52], although one other study had a 34% discordance rate [53]. Discordance in markers led to a change in management in 20% of patients, suggesting that repeat biopsies should be considered in patients with metastases [54]. Discordance in HER2 status also has been reported between primary tumors and bone marrow metastases [55] as well as circulating tumor cells [56, 57], raising questions about treatment decisions based solely on the HER2 status of the primary tumor. Much remains to be learned about molecular alterations and gene expression patterns in primary tumors versus their metastases, but these studies are complicated by the frequent difficulty of obtaining matched tissue samples, especially when metastases may be detected long after a primary tumor has been resected. However, recent studies are beginning to document this heterogeneity [58-60]. How much these changes are driven by treatment, tumor progression, discrepancies in initial typing or intrinsic heterogeneity is unclear. It is clear that use of prognostic and predictive information obtained from the initial diagnosis of breast cancer and resection of the primary tumor may be imperfect in guiding treatment of metastatic disease.

'First-generation' expression profiling as prognostic and predictive factors

As noted above, a small number of expression profiling strategies have been successfully developed and validated for clinical use, some of which are now commercially available [61, 62]. These include the 70-gene expression signature as used in the MammaPrint® (Agendia, Amsterdam, The Netherlands) assay, and the 21-gene profile used in the Oncotype Dx® (Genomic Health, Redwood City, CA, USA) assay. Clinical evidence in hormone responsive breast cancer supports the abilities of these assays to distinguish between patients who will do well and do not benefit from chemotherapy added to hormone therapy, and patients who have poorer prognosis and who will benefit from added chemotherapy [61, 62]. These assays are becoming increasingly used in the clinical setting to help in treatment decisions. A comparison of four studies from the US and the Netherlands indicated that these assays led to changes in treatment decisions in 18 to 44% of cases, and often in the direction of not giving chemotherapy to patients predicted not to benefit. However, it should be noted that a recent study by Parisi and colleagues [63], which compared protein levels of 14 markers used in the Oncotype Dx assay with nodal status, tumor size, nuclear grade and age, found that a combined model incorporating both molecular and standard clinical-pathological information provided better prognostic information than either system alone. There thus remain questions about the most effective use of molecularly based assays in the clinical setting.

Some of these questions will be addressed in two ongoing clinical trials, MINDACT (Microarray In Node negative Disease may Avoid ChemoTherapy) and TAILORx (Trial Assigning IndividuaLized Options for Treatment (Rx)). Both trials are designed to assess the abilities of molecularly based assays to determine the best adjuvant treatment for specific subsets of breast cancers, and in particular to determine which patients need chemotherapy and which are unlikely to benefit from it. These trials have been summarized in detail elsewhere [61, 62, 64].

The TAILORx trial is using the Oncotype Dx 21-gene assay in lymph node negative, ER and/or PR positive, and HER2-negative tumors [62, 65]. Women with low 'recurrence scores' (RS < 11) will receive hormone treatment only, and women with high RS (> 25) will receive chemotherapy plus hormone therapy, as current standard of care. Women with intermediate RS (11 to 25), where there is uncertainty about need for chemotherapy, will be randomized to hormone therapy, plus or minus chemotherapy, to test the benefit of adding chemotherapy for this group of patients.
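The assignment rule described above can be written out as a short sketch. The function name `tailorx_arm` and the returned labels are hypothetical and chosen only for illustration; the thresholds (RS below 11, 11 to 25, above 25) are those stated in the text, and the trial protocol itself remains the authoritative source.

```python
def tailorx_arm(recurrence_score: int) -> str:
    """Map a recurrence score (RS) to the treatment arm described in the text.

    Illustrative only: the labels are hypothetical, and the cut-offs are
    taken from the description above rather than from the trial protocol.
    """
    if recurrence_score < 0:
        raise ValueError("recurrence score cannot be negative")
    if recurrence_score < 11:
        # Low RS: hormone treatment only
        return "hormone therapy only"
    if recurrence_score <= 25:
        # Intermediate RS (11 to 25): benefit of chemotherapy is uncertain
        return "randomized: hormone therapy with or without chemotherapy"
    # High RS (> 25): current standard of care
    return "chemotherapy plus hormone therapy"


for rs in (5, 11, 25, 26):
    print(rs, "->", tailorx_arm(rs))
```

Writing the rule this way makes the boundary handling explicit: scores of exactly 11 and 25 both fall into the randomized intermediate group.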

The MINDACT trial will use the 70-gene profile (MammaPrint), from fresh tissue from women with node negative breast cancer, and will compare the utility of this assay with current clinical-pathological assessment, as defined by the Adjuvant! Online tool [66, 67]. Women whose risk assessments are concordant using the two assays will receive current standard treatment for their risk groups. Women with discordant determinations from MammaPrint versus Adjuvant! Online will be randomized to receive either chemotherapy or no chemotherapy. Together, the MINDACT and TAILORx trials will provide prospective evidence about the utility of molecularly based tests, to help determine the need for adjuvant chemotherapy in some women and identify women who are unlikely to benefit from chemotherapy, thus providing more individualized treatment decisions for women with breast cancer [61, 62, 64].
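The MINDACT design described above turns on whether the two risk assessments agree. A minimal sketch of that logic follows, assuming each assessment is reduced to a binary low/high call; the `Risk` enum, the function name `mindact_assignment` and the returned labels are hypothetical and are not taken from the trial protocol.

```python
from enum import Enum


class Risk(Enum):
    LOW = "low"
    HIGH = "high"


def mindact_assignment(genomic: Risk, clinical: Risk) -> str:
    """Return the treatment path for a pair of risk assessments.

    genomic:  risk call from the 70-gene profile (MammaPrint)
    clinical: risk call from clinical-pathological assessment (Adjuvant! Online)
    Assumes both calls are binary; labels are illustrative only.
    """
    if genomic == clinical:
        # Concordant assessments: current standard treatment for that risk group
        return f"standard treatment for {genomic.value}-risk group"
    # Discordant assessments: randomized between the two options
    return "randomized: chemotherapy or no chemotherapy"


print(mindact_assignment(Risk.LOW, Risk.LOW))
print(mindact_assignment(Risk.HIGH, Risk.LOW))
```

Only the discordant group is randomized, which is where the two assessment methods can be compared directly.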

Figure 1 diagrams the path from traditional clinical and prognostic factors, as well as currently available and evolving signatures, to clinical application for improved and more personalized treatment decisions, as exemplified by the examples discussed above.

Figure 1

Correlating molecular and clinical characteristics can address the multiple aspects of biological, clonal and patient heterogeneity in breast cancer metastasis and lead to gene profiles, commercial assays and clinical trials that ultimately result in clinical applications to improve prognostic accuracy and treatment outcome for individual patients.

The road ahead - challenges and opportunities

The advent of next generation sequencing (NGS) technologies promises to provide powerful new tools to identify those individuals who may be at risk of developing primary or metastatic tumors, and has the potential to further enhance 'personalized' treatment decisions. NGS allows complete genomes to be sequenced in a matter of days, resulting in valuable, personalized information identifying mutations in patient or tumor DNA or RNA samples. While a full review of the technologies available today is beyond the scope of this work, readers are directed to excellent reviews that have been written on the subject [68, 69].

A recent report [60] demonstrated how NGS can be used to characterize somatic mutations occurring during the development and progression of lobular breast cancer. Using DNA and RNA resequencing, 32 somatic non-synonymous mutations in a metastatic tumor were found, 19 of which were not present in the primary lesion. In addition, RNA sequencing detected two new RNA editing events that recode the amino acid sequences of two proteins, SRP9 and COG3. These compelling results demonstrate that heterogeneity at the single nucleotide level can be an inherent property in low to intermediate grade tumors, and that significant evolution can occur with progression of the disease.

In the clinical setting, testing of inherited loss of function mutations to tumor suppressor genes in women with a family history of breast or ovarian cancer is generally limited to the BRCA1 and BRCA2 genes. To address the fact that there are many other inherited mutations that may predispose one to these cancers, a recent report [70] developed an NGS assay to capture, sequence and detect all mutations in 21 genes (including BRCA1 and BRCA2) in women previously diagnosed with breast or ovarian cancer and carrying a mutation in at least one of the genes responsible for inherited predisposition of these diseases. They were able to detect all single nucleotide substitutions, indel mutations, and large duplications and deletions that had been previously confirmed, with no false positive calls. Taken together, their approach showed that widespread genetic testing and personalized risk assessment in these patients is feasible.

The use of massively parallel sequencing technologies, however, is not without significant challenges that will have to be overcome if they are to be used extensively in the clinical setting. The foremost of these is that the current cost of the assay is a significant deterrent to its clinical use. At present, ten-fold coverage of an individual's genome (about 30 Gbases) costs approximately US$15,000 [69], although, as the technologies evolve, it is expected that this cost will drop significantly, as was seen with microarray analyses. Indeed, the National Human Genome Research Institute in the US has announced a program with the ultimate goal to completely resequence the human genome for $1,000 or less [60, 71]. Secondly, the samples will rarely be purely tumor tissue, with the presence of 'contaminating' DNA or RNA derived from normal tissue, immune cells or stromal tissue making the acquisition of a 'true' tumor signature a challenge. Thirdly, an inherent issue in NGS is the sheer volume of data generated by these analyses and whether appropriate bioinformatics expertise is available to assess these vast datasets.
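The cost figures quoted above rest on simple arithmetic, sketched below. The genome size of about 3 Gbases is an assumption inferred from the text (30 Gbases at ten-fold coverage); the function names are hypothetical, and the dollar amounts are those cited in the paragraph, not current prices.

```python
def sequenced_gbases(genome_size_gbases: float, fold_coverage: float) -> float:
    """Total bases sequenced (in Gbases) for a given depth of coverage."""
    return genome_size_gbases * fold_coverage


def cost_per_gbase(total_cost_usd: float, total_gbases: float) -> float:
    """Average cost in US dollars per Gbase sequenced."""
    if total_gbases <= 0:
        raise ValueError("total_gbases must be positive")
    return total_cost_usd / total_gbases


GENOME_SIZE_GBASES = 3.0  # assumption: inferred from 30 Gbases at ten-fold coverage

total = sequenced_gbases(GENOME_SIZE_GBASES, fold_coverage=10)
current = cost_per_gbase(15_000, total)  # cost cited in the text
target = cost_per_gbase(1_000, total)    # $1,000 genome goal, same coverage assumed

print(f"Bases sequenced: {total:.0f} Gbases")
print(f"Cited cost: ${current:.2f} per Gbase")
print(f"Target cost: ${target:.2f} per Gbase")
print(f"Reduction needed: {current / target:.0f}-fold")
```

Under these assumptions, reaching the $1,000 goal at the same depth of coverage would require roughly a 15-fold reduction in cost per base.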

To date, no large scale studies analogous to those that led to the Oncotype Dx or MammaPrint assays have been performed using NGS technologies. However, efforts are underway to create a comprehensive database of genetic alterations in breast cancer, such as that being undertaken by the Breast Cancer International Cancer Genome Consortium [69, 72]. Coupled with efforts to create a panel of 'normal' samples (for example, the 1000 Genomes Project [73]), these initiatives have the potential to allow a panel of disease-specific genetic anomalies that may eventually be used in elucidating a 'genomic alteration signature'. These signatures may one day be tested in large scale clinical trials similar to the MINDACT or TAILORx studies referred to earlier. In addition, as the technologies and associated analyses are perfected, NGS information may be integrated with global gene expression studies on a personalized basis, allowing for a comprehensive and refined prognostic ability and treatment plan.

Challenges posed by heterogeneity

Perhaps the greatest challenge in developing clinically valid gene signatures for breast cancer diagnosis, prognosis and prediction of treatment response relates to the multiple forms of heterogeneity in breast cancer. Heterogeneity exists at the level of the causative molecular pathway(s), in the clonal composition of the tumor itself and in the genetic variability within the patient population. Tumor development is essentially Darwinian, in that any of a number of molecular pathways that have been selected for in a specific tumor cell can contribute to the 'successful' metastatic tumor [74]. Moreover, this heterogeneity is dynamic, as selective pressures change (for example, in the new environment encountered by a metastatic cell in a secondary tissue site) [75, 76]. Thus, gene signatures may offer no more than a snapshot of a tumor's gene expression profile that is relevant only at a particular point in time. Furthermore, the presence of subpopulations of tumor cells that differ in their genetic makeup, metastatic potential, invasiveness and capacity to replicate may further compromise an already complex signature, in that the most clinically relevant signature may be masked by a 'non-lethal' signature that dominates the tumor's DNA or RNA sample. Lastly, the selection and fate of specific tumor cells and the susceptibility of these cells to appropriate treatments are also likely dependent on inherited genetic variations that can affect the patient's tumor and response to chemotherapy [77, 78]. Taken together, these multiple aspects of biological, clonal and patient heterogeneity make the process of establishing gene signatures both challenging and complex. Thus, comprehensive genomic analysis of tumor subpopulations and of the host patient is likely the best way to effectively use gene signatures from both patient and tumor, so that treatment plans can be optimized.

Conclusion

Significant progress has been made over the past decade in using technical advances in molecular genetics to develop clinically relevant tools to aid in the prediction and treatment of breast cancer. However, even as these advances have been made, we are learning more about the complex biology that underlies this set of potentially devastating diseases. Several important challenges must be faced. First, it is clear that there will be no shortage of information available regarding clinical characteristics of the patient (for example, age and menopausal status) or the clinical and molecular characteristics of her/his tumor (ranging from tumor histology to genomic signatures). Instead, the clear challenge is to be able to capture the clinically relevant signature(s) from the cacophony of molecular noise that exists, due to inherent issues related to tumor heterogeneity and disease complexity. In addition, these individual data sets must be linked directly with informative patient/tumor information that is specific to that individual. The selection advantage provided by a particular set of genetic changes is critically important to the survivability of that tumor cell, and ultimately that same set of information is critical in guiding the choice of an effective treatment regime for that patient. As we move forward it is therefore necessary to link these new genetic signatures with specific patient subgroups, while concurrently developing the molecular therapies that target the specific disease-related genetic alterations identified in those signatures.

Note

This article is part of a review series on Multiple gene prognostic factors, edited by Lewis Chodosh.