Copy number of 8q24.3 drives HSF1 expression and patient outcome in cancer: an individual patient data meta-analysis
The heat-shock transcription factor 1 (HSF1) has been linked to cell proliferation and survival in cancer and has been proposed as a biomarker for poor prognosis. Here, we assessed the role of HSF1 expression in relation to copy number alteration (CNA) and cancer prognosis.
Using 10,287 cancer genomes from The Cancer Genome Atlas and cBioPortal databases, we assessed the association of HSF1 expression with CNA and cancer prognosis. CNA of 8q24.3 was categorized as diploid (reference), deletion (fewer copies), gain (+ 1 copy) and amplification (≥ + 2 copies). Multivariate logistic regression modeling was used to assess 5-year survival among those with a first cancer diagnosis and complete follow-up data (N = 9568), categorized per anatomical location and histology, assessing interaction with tumor stage, and expressed as odds ratios and 95% confidence intervals.
We found that only 54.1% of all tumors have a normal predicted 8q24.3 copy number and that genes located in 8q24.3, including HSF1, are overexpressed mainly because of an increased copy number of 8q24.3 in different cancers. Patients whose tumors carried a gain (+ 1 copy) or an amplification (≥ + 2 copies) of 8q24.3 displayed a global increase in 5-year mortality (odds ratio [OR] = 1.98, 95% CI 1.22–3.21 and OR = 2.19, 95% CI 1.13–4.26, respectively) after full adjustment. For separate cancer types, patients with 8q24.3 deletion showed a marked increase in 5-year mortality in uterine (OR = 4.84, [2.75–8.51]), colorectal (OR = 4.12, [1.15–14.82]), and ovarian (OR = 1.83, [1.39–2.41]) cancers, and decreased mortality in kidney cancer (OR = 0.41, [0.21–0.82]). Gain of 8q24.3 was associated with significant changes in 5-year mortality for uterine (OR = 3.67, [2.03–6.66]), lung (OR = 1.76, [1.24–2.51]), and colorectal (OR = 1.75, [1.32–2.31]) cancers, and amplification for uterine (OR = 4.58, [1.43–14.65]), prostate (OR = 4.41, [3.41–5.71]), head and neck (OR = 2.68, [2.17–3.30]), and stomach (OR = 0.56, [0.36–0.87]) cancers.
Here, we show that the expression of 8q24.3 genes, including HSF1, is tightly linked to 8q24.3 copy number in tumor patients and that 8q24.3 CNA can affect patient outcome. Our results indicate that the integration of 8q24.3 CNA detection may be a useful predictor for cancer prognosis.
Keywords: Copy number alteration, 8q24.3, HSF1, Individual patient data meta-analysis, Patient outcome
Early diagnosis and accurate prognostic markers of cancer help practitioners make treatment decisions and ultimately optimize patient outcomes. Despite advances in diagnostic methods and the use of molecular diagnostics, for example, next-generation sequencing panels run routinely for cancer patients in an increasing number of laboratories [1, 2], clinical prognostics are often limited to histology, positivity of lymph nodes, and presence of metastases [3, 4]. In light of personalized medicine, there is a need to explore feasible and reliable new biomarkers to improve prognostic information.
Recent studies have raised interest in the heat-shock transcription factor 1 (HSF1), a master regulator of the cell stress response for adaptation and survival. When activated, HSF1 facilitates the transcription of genes, such as the heat shock protein (HSP) chaperones required to relieve the proteotoxic stress that can cause cell death. Overexpression of HSF1 has been linked with cancer proliferation and malignancy, suggesting that HSF1 could serve as a prognostic marker [7, 8]. Numerous clinical and basic research studies showed that a high expression level of HSF1 is associated with poor outcomes in many cancer types [7, 8, 9, 10, 11, 12], pointing to the potential of HSF1 as a prognostic biomarker [12, 13].
Nevertheless, the origin and the interpretation of HSF1 overexpression in cancer are poorly understood, since HSF1 appears to drive a distinct regulation in cancer cells. So far, consensus suggests that HSF1 overexpression helps to relieve the stress of protein imbalances [10, 14], likely caused by aneuploidy or an imbalanced karyotype [15, 16, 17]. Intriguingly, overexpressed genes of the HSF1 cancer signature cluster at the end of chromosome 8q. However, the mechanisms that drive HSF1 overexpression in different cancers remain largely unknown, although they may hold a key to understanding tumor development and its relationship to survival.
Clinical studies have now emerged with transcriptomic, genomic, and clinical patient data, offering unprecedented opportunities to understand the molecular events associated with cancer and its related outcome. Gene expression seems to exhibit different profiles in various human cancer types. In addition, acquired copy number alteration (CNA) in cancer cells is common and can play a significant role in cancer development by altering gene dosage and affecting the expression of multiple genes and regulatory regions [22, 23, 24].
The aim of this study was to assess, using an individual patient data meta-analysis approach, the overall role of HSF1 expression in relation to CNA in cancer prognosis.
Search strategy and selection criteria
This study used data from cBioPortal (http://www.cbioportal.org) [18, 19], which includes peer-reviewed studies, METABRIC data (Molecular Taxonomy of Breast Cancer International Consortium), and unpublished data from The Cancer Genome Atlas (TCGA) [25, 26]. A descriptive summary of all data extracted from cBioPortal on the acquired CNA and RNA expression per cancer type is presented in Additional file 1: Table S1.
For the survival analyses, only individuals without a prior history of cancer, with identical CNA for all genes present in the 8q24.3 region (i.e., patients with heterogeneous CNA were excluded), and with 5-year survival information were included.
Data extraction and genomic analyses were conducted by MDD, and data management and individual patient data meta-analyses by NB. Demographics, clinical information, and cancer genomics datasets were extracted for all individuals. Normalized mRNA expression data (Z-scores 2.0) express the relative expression of an individual gene in a tumor compared with that gene's expression distribution in a reference population, either the samples that are diploid for the corresponding gene (the default for mRNA) or normal samples (when specified) (http://www.cbioportal.org/faq.jsp). For CNA categories, data were obtained from cBioPortal [25, 26] and derived from Affymetrix SNP6 data (copy number ratio from tumor samples minus ratio from the matched normal tissues) computed with the GISTIC 2.0 algorithm.
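As a rough illustration of the z-score normalization described above, the following Python sketch standardizes one gene's expression in each tumor against the samples that are diploid for that gene. The function name `zscores_vs_diploid`, the toy values, and the use of the sample standard deviation are assumptions made for illustration; the cBioPortal pipeline may differ in detail.

```python
from statistics import mean, stdev

def zscores_vs_diploid(expression, copy_number):
    """Return z-scores of expression values relative to the diploid samples.

    expression  -- expression values for one gene, one per tumor sample
    copy_number -- discrete copy-number calls for the same gene (0 = diploid)
    """
    # Reference population: samples that are diploid for the gene.
    reference = [e for e, cn in zip(expression, copy_number) if cn == 0]
    if len(reference) < 2:
        raise ValueError("need at least two diploid samples as reference")
    mu, sigma = mean(reference), stdev(reference)
    if sigma == 0:
        raise ValueError("reference population has zero variance")
    return [(e - mu) / sigma for e in expression]

# Hypothetical toy data: three diploid tumors and one amplified tumor.
expr = [10.0, 12.0, 14.0, 20.0]
cna = [0, 0, 0, 2]
z = zscores_vs_diploid(expr, cna)
```

With these toy values the diploid reference has mean 12 and standard deviation 2, so the amplified tumor lies four standard deviations above the reference.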
The estimated copy number alteration of the 8q24.3 region was categorized according to the predicted copy number: deep deletion (− 2) (0.1%), shallow deletion (− 1) (4.7%), diploid or normal (0) (54%), gain (+ 1) (32%), and amplification (≥ + 2) (8.6%).
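The categorization above can be written as a small lookup. This is a minimal Python sketch, assuming the copy number arrives as the discrete integer calls listed in the text; the function names `categorize_cna` and `risk_group` are hypothetical. The second function collapses deep and shallow deletions into one group, as done for the regression models.

```python
def categorize_cna(call):
    """Map a discrete 8q24.3 copy-number call to the category used in the text."""
    if call == -2:
        return "deep deletion"
    if call == -1:
        return "shallow deletion"
    if call == 0:
        return "diploid"
    if call == 1:
        return "gain"
    if call >= 2:
        return "amplification"
    raise ValueError(f"unexpected copy-number call: {call}")

def risk_group(call):
    """Collapse deep and shallow deletions into a single 'deletion' group."""
    category = categorize_cna(call)
    return "deletion" if category.endswith("deletion") else category

labels = [categorize_cna(c) for c in (-2, -1, 0, 1, 2)]
groups = [risk_group(c) for c in (-2, -1, 0, 1, 2)]
```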
The main outcome in the individual patient data meta-analysis was 5-year mortality (dead or alive), since the exact number of days of survival was reported for only 22% of the cohort. The secondary outcome was healthy survival (being alive and healthy, or not), used to assess the combined effect of recurrence and mortality. The following data were collected and categorized: sex (male or female), age at time of diagnosis (< 40, 40–49, 50–59, 60–69, and ≥ 70 years), anatomical location and histological subtype, HSF1 expression (in quartiles), tumor stage (stage 0–1 or in situ, stage II, stage III, stage IV), calendar period (1978–2005, 2006–2008, 2009–2010, 2011–2013), study (42 different studies), history of any cancer (yes or no), and 5-year outcome (alive with or without recurrence, or dead). Missing values were crosschecked with other relevant variables. Length of follow-up and length of survival were missing for the majority of individuals and were therefore not used for survival modeling (only to complete missing data on the outcomes). Data on body mass index, smoking, alcohol use, cancer-specific risk factors, and treatment were missing for the majority of individuals or too heterogeneous among cancer types and were therefore not included.
To avoid bias due to heterogeneous gene expression of HSF1 across various cancers (Additional file 1: Appendix 2), we analyzed co-expression using the Spearman correlation test generated from cBioPortal. JMP® v13 (SAS Institute) and Tableau desktop® 10.5 (Tableau Software) were used for data processing and visualization. Gene ontology analysis was performed using Panther v12.0.
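For readers unfamiliar with the Spearman test mentioned above, the following Python sketch computes the rank correlation between copy-number calls and expression values using only the standard library. The correlations in the study were generated by cBioPortal, so this is an illustration of the statistic rather than the authors' procedure; the function names and toy data are hypothetical.

```python
import math

def average_ranks(values):
    """Rank values from 1..n, assigning tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def spearman(x, y):
    """Spearman rho: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))

# Hypothetical toy data: copy-number calls and expression of one gene.
copy_calls = [-1, 0, 0, 1, 2]
expression = [0.5, 1.0, 1.2, 2.0, 3.5]
rho = spearman(copy_calls, expression)
```

Because the statistic depends only on ranks, it is insensitive to the differing expression scales across cancer types, which is presumably why a rank-based test was preferred here.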
Individual patient data meta-analyses were conducted in Stata/MP 14.2 (StataCorp) using two methods to assess 5-year mortality and healthy survival, overall and for each anatomical location and histological subtype. Differences in descriptive statistics were compared by means of chi-square tests, with p values < 0.05 representing statistically significant differences. All results were expressed as odds ratios (OR) and 95% confidence intervals (CI) using diploidy as reference. If the 95% confidence interval includes an odds ratio of 1 (indicating no difference), the results do not indicate a statistically significant difference between the two groups.
The first approach was based on random-effects modeling using the ipdmetan package in Stata, which performs a two-stage individual patient data meta-analysis, pooling the effects on binary outcomes and visualizing them by means of forest plots. I² statistics were used to quantify statistical heterogeneity, with values < 50%, 50–75%, and > 75% defined as low, moderate, and high heterogeneity, respectively. Results were weighted by anatomical location, histological subtype, and study.
Since this approach did not allow adjustment for confounding or interaction, multivariate logistic regression analyses were also conducted (one-step approach). For each anatomical location, three models were presented to compare four risk groups: diploid (reference), shallow/deep deletion (combined), gain, and amplification. Model 1 was unadjusted; model 2 was adjusted for sex, age, and calendar period, with clustering by study; and model 3 was additionally adjusted for HSF1 expression and interaction with tumor stage. Interaction with tumor stage and HSF1 was assessed by likelihood-ratio testing. For histological subtypes, 8q24.3 CNA gain and amplification were combined into one category to increase power, and only models 1 and 2 were presented.
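The quantities named above (odds ratio, 95% confidence interval, random-effects pooling, and I²) can be illustrated with a short Python sketch. It is not the authors' Stata code: it assumes a Wald interval on the log odds ratio and the DerSimonian-Laird estimator for the between-study variance, which is one common choice for random-effects pooling, and all counts are hypothetical.

```python
import math

def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio with a Wald 95% CI from a 2x2 table.

    a, b -- deaths and survivors in the altered group (e.g. gain)
    c, d -- deaths and survivors in the diploid reference group
    """
    log_or = math.log((a * d) / (b * c))
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    return math.exp(log_or), math.exp(log_or - z * se), math.exp(log_or + z * se)

def pool_random_effects(log_ors, ses, z=1.96):
    """Pool study-level log ORs (DerSimonian-Laird); return OR, CI bounds, I2 in %."""
    w = [1 / se ** 2 for se in ses]                      # inverse-variance weights
    fixed = sum(wi * y for wi, y in zip(w, log_ors)) / sum(w)
    q = sum(wi * (y - fixed) ** 2 for wi, y in zip(w, log_ors))  # Cochran's Q
    df = len(log_ors) - 1
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0      # between-study variance
    w_star = [1 / (se ** 2 + tau2) for se in ses]
    pooled = sum(wi * y for wi, y in zip(w_star, log_ors)) / sum(w_star)
    se_pooled = math.sqrt(1 / sum(w_star))
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    return (math.exp(pooled), math.exp(pooled - z * se_pooled),
            math.exp(pooled + z * se_pooled), i2)

def heterogeneity_label(i2):
    """Classify I2 with the thresholds stated in the text."""
    if i2 < 50:
        return "low"
    if i2 <= 75:
        return "moderate"
    return "high"

# Hypothetical counts: 20/100 deaths with gain versus 10/100 deaths in diploid tumors.
or_, lo, hi = odds_ratio_ci(20, 80, 10, 90)
significant = not (lo <= 1.0 <= hi)   # the CI includes 1, so not significant

# Hypothetical study-level estimates that disagree strongly with each other.
pooled_or, pooled_lo, pooled_hi, i2 = pool_random_effects([0.0, 1.0], [0.1, 0.1])
label = heterogeneity_label(i2)
```

In the first toy table the odds ratio is 2.25, but the interval still includes 1, which shows how the significance rule stated above is applied in practice.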
Subgroup analyses distinguishing between gain and amplification were only conducted for the 15 histological subtypes with the highest number of individuals with gain or amplification. Analyses were only presented if at least 10 individuals were included in each risk group and are based on complete-case analyses.
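The reporting rule above (complete cases only, and at least 10 individuals in each risk group) can be sketched in Python as follows. The record layout, the field names such as `cna_group`, and the helper names are hypothetical and serve only to make the rule concrete.

```python
from collections import Counter

RISK_GROUPS = ("diploid", "deletion", "gain", "amplification")

def complete_cases(records, required):
    """Keep only records with a non-missing value for every required field."""
    return [r for r in records if all(r.get(k) is not None for k in required)]

def reportable(records, groups=RISK_GROUPS, min_per_group=10):
    """True if every risk group contains at least min_per_group individuals."""
    counts = Counter(r["cna_group"] for r in records)
    return all(counts[g] >= min_per_group for g in groups)

# Hypothetical cohort: 12 individuals per group, one of them with missing age.
cohort = [{"cna_group": g, "age": 60, "dead_5y": False}
          for g in RISK_GROUPS for _ in range(12)]
cohort[0]["age"] = None

analysed = complete_cases(cohort, required=("cna_group", "age", "dead_5y"))
ok = reportable(analysed)
```

Dropping incomplete records before counting matters, because a group can fall below the threshold once missing covariates are taken into account.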
HSF1 expression profile across different cancer types
HSF1 CNA drives HSF1 expression
8q24.3 CNA drives mainly the expression of HSF1 and 8q24.3 genes
To assess the influence of a homogeneous 8q24.3 copy number on HSF1 expression, we excluded patient samples carrying heterogeneous copy numbers of genes localized in 8q24.3. Not surprisingly, when patient samples were sub-grouped by the level of HSF1 expression, samples overexpressing HSF1 displayed a higher 8q24.3 copy number in their genome (Fig. 3g). Other genes located within the 8q24.3 region, including cancer-related genes (Additional file 1: Appendix 4), displayed similar trends in different tissues (Additional file 1: Appendixes 5 and 6). Moreover, linear regression analysis of 8q24.3 CNA against the expression of genes located in 8q24.3 confirmed that HSF1 is one of the genes whose expression correlates most strongly with 8q24.3 copy number alteration in different tissues (Additional file 1: Appendix 6). These results indicate that 8q24.3 CNA, and not only HSF1, triggers a complex transcriptional change that may facilitate cancer development and proliferation.
Next, we evaluated how 8q24.3 copy number in tumors could affect clinical prognosis, taking into account confounding and interaction by tumor stage (as assessed by means of the likelihood ratio test). To this end, we excluded all patients having heterogeneous CNAs within 8q24.3 (n = 780) and those with a prior malignancy or incomplete 5-year follow-up information (Additional file 1: Appendix 1). In total, 9568 unique individuals were included, of whom 54% were female, 51% were 60 years or older, and 28% were diagnosed between 2011 and 2013, as described in Additional file 1: Table S2. In total, 24 different anatomical locations and 45 different histological subtypes were reported, with breast (13%) and brain tumors (11%) being most common. Tumors were in situ or stage 0–1 in 18%, stage 2 in 11%, stage 3 in 15%, and stage 4 in 7%, and stage information was missing in 50%.
In total, 5174 (54%) of cancers were diploid for 8q24.3 (Additional file 1: Table S2), 12 (0.1%) had deep deletion, 454 (5%) had shallow deletion, 3082 (32%) showed gain, and 846 (9%) showed amplification. Women presented more frequently with diploidy and amplification (55% and 11%) than men (53% and 7%) (p < 0.0001), and the proportion of diploidy decreased with age (65% in < 40 years, 50% in ≥ 70 years; p < 0.0001). Diploidy was most common in thyroid cancer (97%), thymus cancer (89%), and hematological cancer (83%). The 8q24.3 gain was especially common in testicular cancer (75%) and head-and-neck cancer (62%). Diploidy was more common in stage 0–1 or in situ tumors (59%) than in stage 4 tumors (42%) (p < 0.0001).
At 5 years after diagnosis, 28% had died, 47% were alive without recurrence, and 11% had a recurrence but were still alive. Recurrence information was missing in 19% of individuals who survived. Of those who died, 49% presented with 8q24.3 diploidy, compared with 59% of those who were alive and disease-free (p < 0.00001, Additional file 1: Table S2).
Table 1 The risk of 5-year mortality by 8q24.3 copy number alteration, categorized by anatomical location of each cancer and expressed as odds ratios (OR) and 95% confidence intervals (CI). Columns: Model 1* (n = 9568), Model 2** (n = 7593), and Model 3*** (n = 4110), each reporting deletions (− 1 or − 2), gain (+ 1), and amplification (≥ + 2). [Table body not available; the only surviving row label is head and neck.]
Prognosis per anatomical location
The two-step meta-analysis approach (Fig. 4) shows that, compared to diploidy as reference, gain was associated with a significantly increased mortality for 7 subtypes, including papillary thyroid cancer (OR = 13.00), uveal melanoma (OR = 9.38), and renal papillary cell carcinoma (OR = 4.84), and with a decreased mortality for tubular gastric adenocarcinoma (OR = 0.25). For amplification, mortality was significantly higher than for diploidy in squamous cell head and neck carcinoma (OR = 2.23). For each anatomical location, all three models were presented if feasible (Table 1). For deletion, model 2 showed a significantly increased 5-year mortality for cancer of the uterus (OR = 4.84), colorectum (OR = 4.12), lung (OR = 1.91), and ovaries (OR = 1.83), and a decreased mortality for kidney cancer (OR = 0.41). After full adjustment (model 3), only the results for ovarian (OR = 1.52) and kidney cancer (OR = 0.52) were confirmed. For gain, model 2 found significant associations for cancer of the uterus (OR = 3.67), lung (OR = 1.76), colorectum (OR = 1.75), ovaries (OR = 1.53), and stomach (OR = 0.60), which were confirmed in model 3 for cancer of the uterus (OR = 1.99), lung (OR = 1.77), and ovaries (OR = 7.24). Amplification was associated with mortality for cancer of the uterus (OR = 4.58), prostate (OR = 4.41), head and neck (OR = 2.68), and stomach (OR = 0.56) in model 2, and ovaries in model 3 (OR = 9.73).
Prognosis per histological subtype
The 5-year mortality (model 2) was significantly higher for 8q24.3 deletion in serous cystadenocarcinoma of the ovaries (OR = 1.83) and squamous cell carcinoma of the lungs (OR = 1.79), and for gain/amplification in endometrial carcinoma (OR = 3.63), rectal adenocarcinoma (OR = 2.43), prostate adenocarcinoma (OR = 1.92), squamous cell carcinoma of the lungs (OR = 1.92), serous cystadenocarcinoma (OR = 1.44), and chromophobe renal cell carcinoma (OR = 1.38). It was significantly lower for gain/amplification in tubular adenocarcinoma of the stomach (OR = 0.21) (Additional file 1: Tables S4-S5).
Here, we showed that expression of HSF1, as well as of other genes localized in 8q24.3, is tightly linked to 8q24.3 copy number. This large individual patient data meta-analysis showed evidence for higher 5-year mortality among individuals with 8q24.3 deletion, gain, or amplification. These overall results remained rather stable after adjustment for confounders and interaction by tumor stage, which suggests that the association cannot be fully explained by tumor stage, HSF1 expression, or the assessed confounders. Up to 9-fold increased risks were found for specific cancer types. This supports a potential causal relationship between 8q24.3 CNA and prognosis, at least in some histological subtypes, although protective effects were found in a limited number of cancer types (kidney and stomach).
This suggests that 8q24.3 CNA and the complex transcriptional change it entails may influence responsiveness or resistance to treatment, which needs further clinical and molecular investigation. For example, in the different histological subtypes, it would be interesting to investigate the relationship with other known genomic biomarkers important in cancer, as well as with complex karyotypes, to assess the link with the stress of protein imbalances in tumors. It is also worthwhile to understand why both a deletion and a gain of 8q24.3 can lead to a poor prognosis in some tissues, such as lung, colorectum, and ovaries. Possibly, copy number change in 8q24.3 could alter transcriptional programming or could be associated with other genomic changes, including translocations and inversions, that alter resistance to treatment or favor tumor growth.
The main strength of this meta-analysis is that the results are based on a large population with data available at the individual level. Both meta-analysis approaches obtained similar results, with low to moderate statistical heterogeneity for all analyses. Yet, information was incomplete or missing for important prognostic variables such as tumor stage (missing in 50%) and for confounders such as body mass index, smoking, and alcohol intake. Therefore, the most adjusted models were conducted on 42% of the cohort (complete-case analysis), resulting in reduced power, which in turn contributed to the loss of statistical significance compared to the unadjusted analyses. Residual confounding cannot be ruled out. Consequently, the results have to be interpreted with caution, in particular for specific histological subtypes with a low number of patients included in the analyses.
Our findings may have substantial implications for the understanding and interpretation of biomarkers in cancer research and clinical investigations. Indeed, no fewer than 5000 publications were found for 16 popular genes in 8q24.3, including HSF1, of which 800 related to the cancer field (Additional file 1: Appendix 4), mainly because those genes were found to be overexpressed in cancer (Additional file 1: Appendix 5).
Integration of 8q24.3 CNA detection may have substantial implications for interpreting the molecular pathogenesis of cancer. More generally, our work indicates that histological diagnoses using biomarkers can be tightly linked to large CNAs associated with complex gene expression patterns, pointing to the importance of understanding molecular pathogenesis to optimize cancer treatment.
The authors thank the TCGA Research Network and its TCGA Pan-Cancer Analysis Working Group and cBioPortal for Cancer Genomics.
The study was designed by M-DD. M-DD collected samples and patient information. M-DD and NB cleaned, analyzed, and interpreted the data. KE gave valuable advice. NB and M-DD wrote the manuscript, which was critically revised by KE. The corresponding author had full access to the data and the final responsibility to submit for publication. All authors of this work had full access to this study and the data, and gave approval to submit the final manuscript.
This work was supported by grants from the Cancer Society CAN2015/169 and the Swedish Research Council 2016-01259. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
Ethics approval and consent to participate
Consent for publication
- 2. Franczak C, Dubouis L, Gilson P, Husson M, Rouyer M, Demange J, Leroux A, Merlin JL, Harle A. Integrated routine workflow using next-generation sequencing and a fully-automated platform for the detection of KRAS, NRAS and BRAF mutations in formalin-fixed paraffin embedded samples with poor DNA quality in patients with colorectal carcinoma. PLoS One. 2019;14(2):e0212801.
- 3. Freedman AN, Klabunde CN, Wiant K, Enewold L, Gray SW, Filipski KK, Keating NL, Leonard DGB, Lively T, McNeel TS, et al. Use of next-generation sequencing tests to guide cancer treatment: results from a nationally representative survey of oncologists in the United States. JCO Precision Oncol. 2018;2:1–13.
- 11. Liang W, Liao Y, Zhang J, Huang Q, Luo W, Yu J, Gong J, Zhou Y, Li X, Tang B, et al. Heat shock factor 1 inhibits the mitochondrial apoptosis pathway by regulating second mitochondria-derived activator of caspase to promote pancreatic tumorigenesis. J Exp Clin Cancer Res. 2017;36(1):64.
- 30. Fisher D. IPDMETAN: Stata module for performing two-stage IPD meta-analysis, Statistical Software Components S457785: Boston College Department of Economics, Revised 16 Sep 2017; 2014. https://econpapers.repec.org/software/bocbocode/s457785.htm.
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.