Introduction

Breast cancer risk assessment and treatment are increasingly guided by genetic and transcriptomic information. In addition to the few well-known high-risk genes that are routinely tested in the clinic [1], the recent description of polygenic risk scores [2, 3] has further complicated the picture of genetic risk evaluation for breast cancer. Gene expression-based tests such as Oncotype DX [4] and MammaPrint/BluePrint [5] have demonstrated clinical value in predicting the risk of recurrence for early stage breast cancers. Among gene expression-based tests, PAM50 [6] and BluePrint [5] can also be used to assign breast cancers to one of the most common molecular subtypes.

The first molecular classification of breast cancer based on gene expression profiling was proposed in 2000 by Perou et al. [7]. This classification has been subsequently refined, and the most commonly accepted subtypes today include luminal A and B, both estrogen receptor alpha (ER)-positive, as well as Her2-enriched, basal-like and “non-basal triple-negative” (“normal-like” in some classifications) [8, 9]. Luminal A tumors are the most common among non-Hispanic Whites, and they typically carry a better prognosis than luminal B or non-luminal tumors, particularly when diagnosed and treated early. “Triple-negative” tumors, immunohistochemically “negative” by standardized pathological criteria for ER and progesterone receptor (PR) and not carrying Her2 amplification, are often conflated with the basal-like molecular subtype. However, not all basal-like tumors are triple-negative and not all triple-negative tumors are basal-like. Indeed, triple-negative tumors may include as many as four molecular subtypes (basal-like 1 and 2, mesenchymal, and androgen receptor luminal-like) with different biology and prognosis [10]. Significant molecular heterogeneity exists even within the recognized subgroups, with a variety of low-frequency driver mutations [11]. Higher-dimension classifications including mutations, copy number variations, and gene expression profiling are being developed [12].

Despite nearly 20 years of genomic and transcriptomic studies of breast cancer, our understanding of the molecular portraits of breast cancer remains based on tumors overwhelmingly derived from European or European-American (non-Hispanic White) patients. The representation of patients of non-European ethnicity in public molecular datasets remains limited. As of December 2018, only 37 out of 3650 cases of breast cancer (approximately 1%) whose molecular portraits are available through The Cancer Genome Atlas (TCGA) portal are from women who declared a Hispanic/Latina ethnicity. Hence, we currently do not know to what extent the information gathered to date on the genetics and transcriptomics of breast cancer applies to Hispanic/Latina patients.

Hispanics/Latinos share a broad linguistic identity, but they are culturally diverse and genetically highly heterogeneous, with ancestry mixtures that vary among and within different countries. This makes the study of gene–environment and gene–gene interactions particularly challenging. Most of the populations commonly referred to as ethnically “Hispanic” are the result of admixture of three ancestral populations: European, Indigenous American, and African. Yet, there is considerable variability in the proportion of each ancestral genetic background within and across those populations [13]. A recent seminal paper by Conomos et al. [14] explored the genetic diversity of a large cohort (12,803 individuals genotyped using a high-density SNP chip) from four US metropolitan areas, the Hispanic Community Health Study/Study of Latinos (HCHS/SOL). Using principal component analysis (PCA), the authors identified substantial genetic differentiation within and among six self-identified background groups (Cuban, Dominican, Puerto Rican, Mexican, Central American, and South American). They then used a multidimensional clustering method to define “genetic analysis groups” that retain many properties of self-identified background groups while achieving substantially greater within-group genetic homogeneity. Remarkably, these “genetic analysis groups” accounted for significant trait variation in 8 of 22 clinically measurable phenotypic traits. The authors argue that “genetic analysis groups” are a more useful covariate for genetic association studies than self-identified ethnic background groups. This underlying genetic complexity highlights the inadequacy of using self-reported “Hispanic” ethnicity for genetic and genomic studies.

Hispanics/Latinas and Breast Cancer: A Complex Relationship

Age-adjusted breast cancer incidence in the United States is approximately 25% lower in Hispanic/Latina women than among non-Hispanic Whites [15]. The reasons for this “Hispanic Paradox” are most likely multifactorial and may include lifestyle (e.g., number and timing of pregnancies, diet), socioeconomic factors, and genetic factors. It is well established that high Indigenous American (IA) ancestry correlates with reduced risk of breast cancer, and at least one protective variant only found in high IA ancestry individuals has been identified [16]. That said, breast cancer risk varies among Hispanics/Latinas of different geographic origins and between US-born and foreign-born Hispanics/Latinas. This lower risk does not necessarily translate into better outcomes for patients who do develop breast cancer [17]. In fact, California Hispanics/Latinas with over 50% IA ancestry have a risk of breast cancer mortality that is twice as high as that of California Hispanics/Latinas with less than 50% IA ancestry. This may be because of socioeconomic factors, access to health care, late stage at diagnosis, and hitherto unknown biological factors. To our knowledge, the “genetic analysis groups” proposed by Conomos et al. [14] have not yet been studied as covariates for breast cancer risk. The interpretation of epidemiological studies is complicated by the fact that in most studies not only are Hispanics/Latinas considered a single group but “breast cancer” is treated as a single disease. Given the remarkable molecular heterogeneity of breast cancer, it is possible that part of the increased mortality risk observed among Hispanics/Latinas, despite their overall lower risk of disease, may be due to differences in the prevalence of specific breast cancer subtypes or due to molecular differences within the subtypes themselves. For instance, the protective variant identified in individuals with high IA ancestry is near the ESR1 gene, which encodes the estrogen receptor α [16]. 
We do not know whether this variant protects against all breast cancer molecular subtypes, including ER-negative ones. In a large molecular epidemiology study of the LACE/Pathways combined cohort, Sweeney et al. [18] examined the distribution of breast cancer subtypes as determined by the PAM50 gene expression test among racial and ethnic groups. This study confirmed that basal-like tumors are far more common among African-Americans (AA) than among other groups. Additionally, Hispanic/Latina patients had a lower proportion of luminal A tumors than non-Hispanic Whites (44.2% vs. 55.2%) and a higher proportion of luminal B tumors (24% vs. 20.9%). Her2-enriched and basal-like tumors were also slightly more common among Hispanic/Latina patients than among non-Hispanic Whites. These differences did not reach statistical significance, given the relatively small number of Hispanics/Latinas in the combined cohort. Hispanics/Latinas in this study were not stratified by national origin, IA ancestry, or “genetic analysis group.” If these findings are confirmed, it is possible that, despite their overall lower incidence of breast cancer, Hispanic/Latina patients more often develop higher-risk, non-luminal A breast cancer subtypes than non-Hispanic Whites.

Luminal B Breast Cancer in Colombians

Among luminal/ERα-positive tumors, luminal B cancers are a distinct biological entity compared to luminal A tumors. These tumors are clinically more aggressive, with worse prognosis than luminal A tumors, similar to the basal-like and Her2-enriched tumors. They tend to have lower expression of nuclear hormone receptors, higher expression of Her2/Neu and proliferation markers such as Ki67, and a lower likelihood of responding to endocrine therapy with aromatase inhibitors, selective estrogen receptor modulators (SERM) or selective estrogen receptor degraders (SERD) [19]. Luminal B tumors have molecular characteristics distinct from those of all other subtypes. In the METABRIC multiparameter molecular classification of breast cancers [12], luminal B tumors fall within four clusters (IntClusts 1, 2, 6, and 9). Among recurrent alterations in these tumors are loss of PPP2R2A (protein phosphatase 2 subunit), TP53 mutations, and a hypermethylated profile. Conversely, PIK3CA mutations are less common in this subtype than in luminal A tumors [19]. Moreover, luminal B tumors have a higher risk of de novo resistance to endocrine therapies [6, 11]. At the transcriptomic level, they are characterized by increased expression of cell proliferation genes or cell cycle regulators such as MKI67 and AURKA [20]. Luminal B tumors are usually characterized by high recurrence scores based on the Oncotype DX gene expression signature and are more likely to benefit from cytotoxic chemotherapy, reaching higher rates of pathologic complete response (pCR) than luminal A tumors [19]. Interestingly, in a study of 219 women with early stage luminal breast cancers who received an Oncotype DX test, Hispanic/Latina patients had a significantly higher Proliferation Axis score, driven by higher expression of CCNB1 (cyclin B1) and AURKA (Aurora Kinase A) [21].
Kalinsky et al. [21] suggest that biological differences between luminal tumors in Hispanics/Latinas and non-Hispanic Whites may contribute to the higher mortality observed among Hispanics/Latinas. Limitations of this study included its relatively small size, which did not allow stratification of Hispanics/Latinas by ancestry, geographic origin or “genetic analysis group,” and the limited number of informative genes in the Oncotype DX test. Studies using larger panels, such as the 150-gene MammaPrint/BluePrint combined test, and larger, well-characterized Hispanic/Latina populations would be highly informative.

To begin to address this knowledge gap, the Zabaleta group studied a cohort of 301 Colombian breast cancer patients diagnosed and treated at the same institution, the National Cancer Institute in Bogota [22]. Using immunohistochemical markers and the 2013 St. Gallen consensus criteria for surrogate subtype assignment [23], Serrano-Gomez et al. found a higher prevalence of luminal B than luminal A tumors (40.86% vs. 22.59% with a 14% Ki67 cutoff, and 37.21% vs. 26.25% with a 20% cutoff). This result was confirmed using the 2011 St. Gallen criteria. Interestingly, when Ki67 was excluded from the analysis and subtype assignment was based on ER, PR, and Her2/Neu alone, the prevalence of luminal B tumors decreased dramatically to 15.95% versus 52.49% luminal A, a subtype distribution more typical of US-based non-Hispanic Whites. The difference in subtype breakdown among different immunohistochemical criteria may hold biological clues. The St. Gallen consensus criteria for surrogate subtype assignment include Ki67, a proliferation marker, in addition to ER, PR, and Her2/Neu. Hence, luminal tumors in Colombian patients appeared to be characterized by higher proliferative activity, consistent with the Oncotype DX findings reported by Kalinsky et al. [21]. The tumors classified as luminal B based on St. Gallen 2013 or 2011 tended to be of higher histological grade, larger size, and higher stage at diagnosis, similar to molecularly confirmed luminal B tumors in US-based patients (Table 13.1). No significant association was found in this study between genetic ancestry, established using 80 Ancestry Informative Markers (AIMs), and St. Gallen subtype distribution. Similar results were obtained by Gomez et al. in an independent study of a Colombian cohort [24].
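The dependence of the luminal A/luminal B split on the Ki67 cutoff can be illustrated with a minimal sketch. The rules below are a simplified paraphrase of St. Gallen 2013-style surrogate definitions and are an assumption made for illustration, not the procedure used in [22]; the class name, field names, the 20% PR threshold, and the 1% threshold for "PR absent" are likewise hypothetical choices. Reference [23] remains the authority for the actual criteria.

```python
from dataclasses import dataclass


@dataclass
class IHCProfile:
    """Hypothetical per-tumor immunohistochemistry summary (illustrative fields)."""
    er_positive: bool      # estrogen receptor status
    pr_percent: float      # percent of tumor cells staining for PR
    her2_positive: bool    # Her2/Neu overexpressed or amplified
    ki67_percent: float    # percent of tumor cells staining for Ki67


def surrogate_subtype(t: IHCProfile, ki67_cutoff: float = 14.0,
                      pr_cutoff: float = 20.0) -> str:
    """Assign a surrogate subtype using simplified, assumed rules."""
    if t.er_positive:
        if t.her2_positive:
            return "luminal B-like (HER2-positive)"
        # Assumed rule: high Ki67 OR low/negative PR moves an ER-positive,
        # HER2-negative tumor from luminal A-like to luminal B-like.
        if t.ki67_percent >= ki67_cutoff or t.pr_percent < pr_cutoff:
            return "luminal B-like (HER2-negative)"
        return "luminal A-like"
    pr_absent = t.pr_percent < 1.0  # assumed threshold for "PR absent"
    if pr_absent and t.her2_positive:
        return "HER2-positive (non-luminal)"
    if pr_absent:
        return "triple-negative"
    return "unclassified"  # e.g. ER-negative/PR-positive, not covered here


# A hypothetical ER-positive, HER2-negative tumor with intermediate Ki67.
tumor = IHCProfile(er_positive=True, pr_percent=60.0,
                   her2_positive=False, ki67_percent=16.0)
print(surrogate_subtype(tumor, ki67_cutoff=14.0))
print(surrogate_subtype(tumor, ki67_cutoff=20.0))
```

Under these assumed rules, the same hypothetical tumor is assigned to the luminal B-like group at the 14% cutoff and to the luminal A-like group at the 20% cutoff, which is the kind of reclassification that would shift the luminal B/luminal A proportions as the cutoff changes.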

Table 13.1 Clinical and pathological characteristics of breast cancer patients from Colombian population according to breast cancer subtype

Following up on these intriguing observations, the same group performed whole-transcriptome RNA-Seq on 21 immunohistochemically defined luminal A and 21 luminal B tumors from the same 301-patient cohort. Serrano-Gomez et al. [25] found 67 differentially expressed genes (p < 0.05), of which 39 were upregulated and 28 downregulated in the luminal B subtype (Fig. 13.1). Unsupervised hierarchical clustering based on these genes grouped luminal B tumors together and separated them from luminal A tumors. Pathway analysis showed that the top upregulated genes participate in biological processes such as mitosis and cell cycle regulation (CDK1, CDC6, CCNB2, BUB1, CENPF, ANLN, CENPE, CCNA2, ASPM, MKI67), whereas downregulated genes mostly encode phosphoproteins (KCND3, RALBP1, RCAN3, ABCA3, RBBP8, PAIP2B, STARD13, ELOVL5, HIPK2, NTRK2, KDM4B, BAI2, FGD3). Another upregulated gene in luminal B tumors was CYP19A1. This gene encodes aromatase, the enzyme that catalyzes the rate-limiting step in estrogen biosynthesis, the aromatization of androstenedione and testosterone to estrone and estradiol, respectively. Aromatase is a major therapeutic target in luminal tumors. This result may suggest that these luminal B tumors can produce estradiol endogenously. Whether the CYP19A1 mRNA is derived from tumor cells or from tumor-associated adipocytes is unclear. Another gene overexpressed in luminal B compared to luminal A tumors in this study is TOP2A, the gene encoding DNA topoisomerase IIα. Sparano et al. [26] suggested that in breast cancer patients with ER-positive, Her2-normal (hence, luminal) tumors, high levels of TOP2A may be associated with resistance to anthracycline-based chemotherapy. Higher expression of TOP2A correlated with higher tumor grade and high recurrence score based on the Oncotype DX signature. Romero et al. [27] also found higher expression of TOP2A in luminal B, Her2-enriched, and basal-like tumors than in luminal A tumors.
Consistent with immunohistochemical results, several proliferation-associated genes, including CDK1, BUB1, CENPF, and MKI67 (the gene encoding Ki67) were overexpressed in luminal B versus luminal A tumors in this Colombian cohort. Taken together, these results support the hypothesis that, at least in this population of Hispanic/Latina patients, proliferative activity may be higher in luminal tumors compared to similar tumors occurring in non-Hispanic White patients.

Fig. 13.1 Gene expression profile of 42 luminal breast cancer samples. (a) Unsupervised hierarchical clustering with 67 differentially expressed genes between IHC-defined luminal B and luminal A tumors. (b) Most relevant signaling pathways associated with 67 differentially expressed genes in luminal B tumors from Colombian women. (c) Diseases associated with differentially expressed genes in luminal B. Reproduced from Ref. [25]

When gene expression was correlated with ancestry, these authors identified five genes differentially expressed between luminal B and luminal A tumors that are potentially modulated by genetic ancestry: ERBB2 (log2FC = 2.367, padj < 0.01), GRB7 (log2FC = 2.327, padj < 0.01), GSDMB (log2FC = 1.723, padj < 0.01), MIEN1 (log2FC = 2.195, padj < 0.01), and ONECUT2 (log2FC = 2.204, padj < 0.01). These results were confirmed by RT-PCR. In the replication set, the authors found a statistically significant association between ERBB2 expression and IA ancestry (p = 0.02, B = 3.11) [25]. Again, these statistical correlations may reveal biological clues. ERBB2 (the gene encoding Her2/Neu, a clinically informative and therapeutically targetable gene), GRB7 (the gene encoding a molecular adaptor in the Her2/Neu pathway), and MIEN1 (a putative oncogene) are physically contiguous, occupying a region of approximately 60,000 bp on Chromosome 17q12. These genes are usually co-amplified in Her2-enriched tumors and are located near a common enhancer. There are multiple possible explanations for an association of IA ancestry with high expression of these genes. Factors associated with IA ancestry may control the epigenetic regulation of the chromatin region encompassing these genes, or the expression of transcription factors or non-coding RNAs that regulate transcription of this chromosomal region. Alternatively, the relatively high expression of Chromosome 17q12 transcripts may be due to the subclonal structure of the tumors; that is, to the presence of clonal populations within tumors containing copy number variants (CNV) in this chromosomal region. The appearance of these clones may be indirectly promoted by factors linked to IA ancestry. Ongoing investigations are exploring these potential mechanisms in other Hispanic/Latina populations.
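For readers less used to log2 fold changes, the following minimal sketch converts the reported values to linear fold changes. It assumes the log2FC values are expressed as luminal B relative to luminal A, which the text implies but does not state explicitly; the variable and function names are illustrative only.

```python
# log2 fold changes as reported in the text (assumed: luminal B vs. luminal A)
reported_log2fc = {
    "ERBB2": 2.367,
    "GRB7": 2.327,
    "GSDMB": 1.723,
    "MIEN1": 2.195,
    "ONECUT2": 2.204,
}


def fold_change(log2fc: float) -> float:
    """Convert a log2 fold change to a linear fold change (2 ** log2FC)."""
    return 2.0 ** log2fc


fold_changes = {gene: fold_change(value) for gene, value in reported_log2fc.items()}

# Print genes from largest to smallest linear fold change.
for gene, fc in sorted(fold_changes.items(), key=lambda item: item[1], reverse=True):
    print(f"{gene}: log2FC = {reported_log2fc[gene]:.3f}, about {fc:.1f}-fold")
```

On this reading, all five genes differ by more than threefold between the two groups, with ERBB2 showing a difference of roughly fivefold.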

Discussion

Our understanding of the “Hispanic Paradox” in breast cancer remains woefully inadequate. Lower risk of breast cancer, likely due to a combination of ancestry, socioeconomic, and lifestyle factors, contrasts with increased mortality, most likely due to a similarly multifactorial etiology. Hints emerging from the relatively few studies that have investigated the molecular portraits of breast cancer in Hispanics/Latinas suggest that the most common group of breast cancers, the luminal tumors, may be biologically different in Hispanics/Latinas than in other ethnic groups. Results from Oncotype DX-based studies [21] and immunohistochemistry-based studies [22] suggest that genes associated with proliferation may be expressed at higher levels in breast cancers from Hispanics/Latinas. This putative difference does not appear to be associated with genetic ancestry and may be related to lifestyle, socioeconomic, hormonal, or dietary factors. Higher expression of Ki67 accounts for the higher prevalence of luminal B tumors among Colombian patients as defined by St. Gallen 2013 consensus immunohistochemical criteria. This is consistent with differences in gene expression profiling, which revealed differential expression of multiple genes linked to the cell cycle, including MKI67. The higher expression of aromatase in luminal B tumors suggests a possible role for endogenous estrogen in driving proliferation.

Conversely, the IA ancestry-associated expression of five genes, notably including ERBB2 and two of its genomic neighbors, may suggest that IA ancestry is associated with an ERBB2-driven phenotype in luminal tumors. ERBB2-encoded Her2/Neu signaling is among the several well-characterized mechanisms of endocrine resistance [28]. Whether these tumors might benefit from Her2/Neu-targeted treatment with trastuzumab, lapatinib, or other agents remains to be determined.

The studies we describe herein have significant limitations. The number of tumors molecularly profiled is still relatively small, as is the number of subjects studied. These findings must be replicated in larger populations of Hispanics/Latinas of different geographic origins and, ideally, in different “genetic analysis groups.” Larger numbers of tumors need to be molecularly profiled, and the gene sets included in clinically used gene expression-based molecular panels need to be examined in detail.

The possibility that luminal tumors in Hispanics/Latinas may have distinctive biology, due to non-genetic and/or ancestry-linked factors, deserves further investigation. The interpretation of gene expression-based molecular tests, and thus the treatment choices made on the basis of gene expression results, may have to take Hispanic/Latina ethnicity and/or genetic ancestry into consideration.