Introduction

Head and neck cancer (HNC), the 6th common cancer worldwide, is characterized by high incidence of local tumor invasion and metastatic spread1,2. Despite of the advances in diagnosis and treatment modalities, high mortality rates rank HNC among the most aggressive cancers. This aggressiveness is contributed by the high loco-regional relapse seen at early stages, which is worsened by the heterogeneous nature of the disease involving a variety of histological tumor subtypes and affecting diverse anatomical sites3. Historically, the traditional risk factors for HNC include excessive tobacco smoking, alcohol consumption, and infection by human papillomavirus (HPV). Additional factors have been identified to enhance individual susceptibility to HNC, in particular, genetic abnormalities impacting on cell proliferation, differentiation features, cell cycle checkpoints, angiogenesis and tumor metabolism4,5,6,7,8. Furthermore, deregulation of epigenetic machinery such as DNA methylation, nucleosome positioning, histone modifications and non-coding RNAs have been reported to contribute to enhanced individual susceptibility to HNC with direct influence on gene activities9.

DNA methylation is the major epigenetic alteration characterized by addition or removal of a methyl group (CH3) referred as hypermethylation of the CpG islands or global hypomethylation, respectively10. DNA hypomethylation has been associated with chromosomal instability as well as activation of proto-oncogenes, while DNA hypermethylation has been involved in repressing tumor suppressor genes and genomic instability often impacting on tumor initiation and progression9,10. The reversible nature of epigenetic aberrations has led to the promising benefit of epigenetic therapy for cancer prevention and management11. However, DNA methylation status vary according HNC subtypes, differentiation features, anatomic involvement12,13, HPV status14, smoking habits9 and geographic distribution15. Therefore, identifying crucial genes that are susceptible to DNA hypermethylation-induced gene silencing is becoming critical to tailor the utility of methylation modifiers to individual cancer types.

Here, we systematically reviewed published papers addressing epigenetic alterations, particularly DNA hypermethylation, in relation to individual susceptibility to HNC, as well as HNC progression and prognosis. We confirmed using a multivariate analysis the clinical relevance of 10 most common alterations as independent risk factors for HNC progression. Furthermore, we used a network-based analysis to prioritize putative molecular interactions and validate the candidates by protein expression in a cohort of HNC with long-term follow-up. Last, we discussed the potential of relevant FDA-approved drugs as alternative therapeutics for invasive HNC.

Materials and methods

Data search

The study followed the protocol recommended by Cochrane Handbook for Systematic Reviews of Interventions (https://training.cochrane.org). In brief, we conducted this systematic literature review using online platforms: PubMed, Wiley Online Library, EMBASE, Web of Science, Scopus, and Cochrane databases between January 2008 and June 2020. The tested hypothesis was to establish the associations between epigenetic alteration and HNC risk. The search strategy focused on key words including their abbreviation, truncations, synonyms, and subsets for search, such as: “head and neck neoplasms” or “facial neoplasms” or “head and neck cancer” or “oral cancer” or “tongue cancer” or “mouth cancer” or the codes described in the International Classification of Diseases for Oncology (ICD-O) for Head and Neck Tumors (https://www.who.int); and “epigenetics” or “epigenomics” or “methylation” or “histone modification” or “non-coding RNA” or “ncRNA” and “risk factors” or “smoke” or "tobacco" or “alcohol” or “HPV”. Searches in Gene Expression Omnibus (GEO, www.ncbi.nlm.nih.gov/geo/) and ArrayExpress (www.ebi.ac.uk/arrayexpress) repositories were also performed. We designed this strategy for a sensitive and broad search (Fig. 1). Additional relevant studies from the reference lists were also included in the analysis. Two librarian experts in systematic review methods hand searched the references list to find additional articles.

Figure 1
figure 1

Flow diagram of search and study selection process. Following the guidelines of the Meta-analysis of Observational Studies in Epidemiology group (MOOSE), we performed a broad and sensitive search on online databases to identify the studies that examined associations between DNA methylation and HNC associated with risk factors (alcohol, tobacco and HPV infection). A systematic literature search for relevant studies up to June 2020. In this study, we considered the clinical endpoints overall survival (OS) and disease specific survival (DFS) as acceptable outcomes. The prognostic value was demonstrated using hazard ratio (HR) with 95% confidence interval (CI).

Inclusion and exclusion criteria

This study did not include non-English manuscripts, single case reports, editorial letters, and reviews of literature. It was also excluded cross-sectional studies that addressed associations with alcohol, tobacco and HPV status without specifically examining associations with epigenetic alteration. Studies using only preclinical models were also excluded. Then, the following inclusion criteria were required to be eligible in this systematic review: (1) human case–control studies; (2) clinical studies related to the DNA methylation and HNC risk factors; (3) methylation sequencing and array methods were excluded; 4) when the same research group was identified, publications were further investigated to eliminate duplications or samples overlap. The outcomes were further explored considering Hazard ratio (HR) with confidence of interval (CI) and P value < 0.05. Papers that fulfilled these criteria were processed for data extraction and the discrepancies were solved by discussion.

Data extraction and quality assessment

A standardized form adapted from Dutch Cochrane Centre (https://netherlands.cochrane.org) for epidemiological studies was used to extracted the date and its included: (a) clear definition of risk factors (alcohol, tobacco and HPV status); (b) clear definition of the molecular assay used for the measurement of epigenetic alteration (e.g. quantitative real time polymerase chain reaction (qRT-PCR), methylation-specific PCR (MSP); (c) clear definition of cut-off, (d) definition of the anatomical site; e) definition of the target population (country where the study took place). To be qualified, all the criteria had to be mentioned in the manuscript; otherwise, the study was recorded and excluded from the systematic review.

In detail, data extracted from the final eligible articles include: first author, year of publication, impact factor of the journal publication, the country of origin, study design, population studied, subjects’ ethnicity, the number of cases, cancer types, source of control, epigenetic profiling, specimen, anatomic location, risk, HR and follow-up. The methodological quality and risk of bias was assessed by the Quality Assessment of Diagnostic Accuracy Studies 2 (QUADAS-2) score system.

Network and enrichment analyses

The list of epigenetic alterations, focusing on DNA hypermethylation, was submitted to GSEA to search for enriched biological processes (Gene Onology) and cellular pathways (KEGG) using FDR < 0.05 or top 50 as parameters16,17. The SIGnaling Network Open Resource 2.0 (SIGNOR 2.0), a public repository that stores almost 23,000 manually annotated causal relationships between proteins and other biologically relevant entities (chemicals, phenotypes, complexes and others) was used to construct a protein–protein interaction (PPI) network using all types of interactions and score 0.1 as parameters18.

Validation—study population

A retrospective study was performed by analyzing data from 100 patients with primary HNC diagnosed and treated at the Department of Otolaryngology—Head and Neck Cancer at the Jewish General Hospital (McGill University) (Supplementary Table 1). The eligibility criteria included previously untreated patients with diagnosis of HNC submitted to the treatment in a single institution. This study was carried out with the approval of the Human Research Ethics Committee of the Jewish General Hospital (JGH)—McGill University, Canada (protocol#11–093) and informed consent was obtained from all subjects. Strengthening the reporting of observational studies (STROBE Statement) was used to ensure appropriate methodological guidelines and regulations.

Immunohistochemistry (IHC) analysis

IHC reaction and analysis were carried out as we previously described19. In brief, the incubations with the primary antibody anti-CD44 (Dako, 1:100) diluted in PBS were made overnight at 4 °C. Positive and negative controls were included in all reactions. IHC reactions were performed in duplicates to represent different levels tissues levels in the same lesion. The second slide was 25–30 sections deeper than the first slide, resulting in a minimum of 300 µm distance between sections representing fourfold redundancy with different cell populations for each tissue. IHC scoring was blinded to the outcome and clinical aspects of the patients. Cores were scanned in 10× power field to settle on the foremost to marked area predominant in a minimum of 10% of the neoplasia. IHC reaction was considered as positive if of a clearly visible dark brown precipitation occurred. IHC analysis was semi-quantitative considering the percentage and intensity of staining as: 0 (no detectable reaction or little staining in < 10% of cells), 1 (weak but positive IHC expression in > 10% of cells) and 2 (strong positivity in > 10% of cells). The percentage of CD44 positive was calculated with an image computer analyzer (Kontron 400, Carl Zeiss, Germany)19.

Data analysis

The statistical analyses were performed using the STATA 12.0 statistical software (STATA Corporation, College Station, TX, USA) as we previously described19. The pooled parameters sensitivity, specificity, diagnostic hazard ratio (HR), and their 95% CIs were calculated to evaluate the overall diagnostic accuracy and the correlation between IHC status and HNC comparing high and low-risk patients. Statistical analysis considered the weighted effect, and the effect size was adjusted.

Results

Overview of the included studies

Following the search protocol and screening strategy, it was identified 1567 manuscripts. After exclusion of duplicates studies and manuscripts unrelated to epigenetic alteration or cancer, and reviews, 138 articles were retrieved for the title and abstract. Additional 12 studies were excluded, since they were either only abstracts or irrelevant to risk factors in HNC, leaving 126 studies for further full-text analysis (Fig. 1)19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59,60,61,62,63,64,65,66,67,68,69,70,71,72,73,74,75,76,77,78,79,80,81,82,83,84,85,86,87,88,89,90,91,92,93,94,95,96,97,98,99,100,101,102,103. Titles and abstracts retrieved through this search were screened by three of the authors (JH, OV, AB) and after a careful reading of the texts, 41 studies were removed due to the lack of information regarding survival analysis. Finally, we had 85 studies involving 32,187 subjects where the relationship between DNA hypermethylation and risk factors for HNC progression were analyzed (Table 1). QUADAS-2 evaluation analysis showed that all studies had relative elevated scores, indicating a comparatively high quality of the researchers included in this study. The median impact factor of these publications was 3.798 (range 0.652 to 9.238).

Table 1 Hypermethylation in genes associated with risk factors in patients with head and neck squamous cell carcinoma in the 85 identified studies.

Of the 85 articles exploring DNA methylation and risk factors (including tobacco use, alcohol abuse, and HPV positivity) in HNC, 30 (35.3%) studies focused on North Americans populations followed by Japanese (n = 10; 11.8%), Brazilian (n = 5; 5.9%) and India population (n = 5; 5.9%). DNA methylation was widely analyzed by MSP of specific genes in 74 (87.1%) studies. The remaining researches used qRT-PCR as method (11 studies; 12.9%). The anatomic location in head and neck cancer was predominantly mixed (44 studies; 51.8%) followed by oral cavity (n = 27; 31.8%) and oral cavity mixed with oropharyngeal cases (n = 6; 7.1%). A total of 37 (46.5%) of the 85 articles only measured DNA methylation of a single gene (Table 1).

DNA methylation associated with cancer risk in HNC

Changes in DNA hypermethylation were identified for 120 genes (Table 1). These genes are enriched for biological processes related to cell proliferation and death, response to stimulus (including drugs), metabolism, and cellular motility and differentiation (Supplementary Table 2). Even though these genes came from different studies, the interactome analysis showed that some of these genes, such as CCNA1, SFN, ATM, GADD45A, CDKN2A, TP53, RB1 and RASSF1 are involved into common biological processes suggesting that they work together (Fig. 2). Thus, we verified the cellular pathways where the regulatory genes play critical role in the signaling networks, including p53, Wnt, MAPK and ErbB tyrosine kinase receptor signaling, as well as cytochrome P450-associated xenobiotic metabolism (Supplementary Table 3).

Figure 2
figure 2

Genomic network analysis showing the central role of genes related with cell cycle pathway. Genes hypermethylated (circled in pink) from different studies were involved into common biological processes suggesting that they work together. PPI analysis pointed to external stimulus, such as DNA damage, UV stress, all-trans-retinoic acid that could activate a cellular signalization to epithelial-mesenchymal transition (EMT), adipogenesis, angiogenesis, immortality, cell growth, cell cycle and proliferation. Image done using the public repository SIGnaling Network Open Resource 2.0 (SIGNOR 2.0).

In the multivariate analysis, not all the 120 genes showed a significant correlation with alcohol, tobacco and/or HPV status. Rather, only the hypermethylation of TIMP3, DCC, DAPK1, CDH1, CCNA1, MGMT, P16 (CDKN2A), MINT, CD44, RARβ were associated with these known risk factors in progressive HNC. According to GSEA (Supplementary Table 2), five of these genes belong to four families sharing similar homology or biochemical activity: tumor suppressors (CDH1 and CDKN2A), protein kinase (DAPK1), cell differentiation markers (CDH1 and CD44) and transcriptional factor (RARβ). These ten genes were submitted to signaling network analysis revealing a protein-to-protein interaction (PPI) that pointed to external stimulus, such as DNA damage, UV stress, all-trans-retinoic acid that could activate a cellular signalization to epithelial-mesenchymal transition, adipogenesis, angiogenesis, immortality, cell growth, cell cycle (G1S transition) and proliferation (Fig. 2).

Finally, to confirm if these genes associated with risk factors (alcohol, tobacco and HPV) might have impact on patient’s survival probability, we validated them using an independent large cohort of 279 HNC patients with high-throughput information from Cancer Genome Atlas containing HM450 methylation and RNAseq data104. For these analyses, we used tools available in the cBioPortal105,106. Not all these genes were statistically associated with alcohol and tobacco in this cohort. However, regarding HPV status, CD44, CCNA1, DCC and TIMP3 were hypermethylated in the HNC HPV-negative (Fig. 3). The correlation between DNA hypermethylation and RNAseq data in this cohort confirms that DNA hypermethylation often leads to gene downregulation (Supplementary Fig. 1). There were no transcriptome data for DCC and CCNA1 in this study104. For the eight genes that had transcriptome data available in the dataset, except for APBA1, we validated the negative correlation between DNA methylation (HM450 methylation platform) and gene expression (using RNAseq data). CDH1 and CD44 gene expression were significantly expressed in the HPV-positive patients (Fig. 4A,B). The methylation status (or any other alteration) of these genes alone did not achieve statistical significance on their impact for the overall survival based on this dataset, which included a mixed of different anatomical location and heterogenous tumor stage and histological grade.

Figure 3
figure 3

Validation of the gene expression in a large cohort of 279 HNC cases from Cancer Genome Atlas containing HM450 methylation, RNAseq data as well as information regarding alcohol, tobacco, and HPV infection. CD44, CCNA1, DCC and TIMP3 were hypermethylated in the HNC HPV-negative cases. Image done using the open-access resource for interactive exploration of multidimensional cancer genomics data sets cBio Cancer Genomics Portal (http://cbioportal.org).

Figure 4
figure 4

(A) Transcription profile reveals CD44 is highly expressed in HPV-HNC. (B) Correlation between gene expression and epigenetic alteration. (C) Validation of the protein expression in an independent cohort of 100 HNC with long-term follow-up. Representative immunohistochemical staining for CD44 in head and neck cancer. The cytoplasmic membrane immunoreactivity for CD44 was clearly identified. Original magnification: 400×. (D) CD44 protein was differentially expressed HPV+ and HPV- HNC patients. Confidence intervals (CI 95%) show relative percentage and IHC intensity value. Y-axis represents numerical values corresponding to the percentage and intensity of expression. (E) Survival curves analysis according to the Kaplan–Meier method showing that patients with positive expression of CD44 had shorter survival rate in comparison with negative immunostaining (log-rank test, P < 0.01).

In order to analyze whether this alteration affected the translational level, we explored these two promising candidates (CD44 and CDH1) and their potential clinical impact by evaluating a cohort of 100 patients with unique tumor location at the tongue followed-up by 10 years (Fig. 4; Supplementary Fig. 1 and Supplementary Table 1). Typically, HNC patients relapse within 2 years. Among our studied patients, 23 (23.0%) had recurrence, 28 (28.0%) had distant metastasis, and 50 (50.0%) died. Sixty-nine patients from 85 HNC cases presenting negative staining for CD44 protein expression, had statistically better disease-free survival probability compared with patients whose tumors overexpressed CD44 (log-rank test, P < 0.01) (Fig. 4C-E). The lower expression of CD44 might reflect the reduced number of cells with stem cell properties which explain the absence of metastasis and the better survival rates.

Prediction of the drugs to target the hypermethylated candidate genes

To elucidate the underlying mechanisms of the hypermethylated genes in relation to the HNC susceptibility, these 120 known genes were used as seed for network growth. We identified six core biological processes (FDR < 10–30 and Z-score > 90), which were enriched for cell cycle regulation and metabolic pathways. Finally, based on this criteria, 53 methylated genes showed strong correlation with cancer risk, then, we searched for drugs interfering with these networks. We found 71 drugs targeting 18 proteins in the six networks identified (Supplementary Table 4). Proteins targeted by the drugs include TGF-beta receptor type II (Lerdelimumab, Suramin, and Interferon beta), GAB1-RA (Primidone, Flumazenil, Oxazepam, Flurazepam, Methylphenobarbital, Clorazepate, Ganaxolone, Clomethiazole, Zaleplon, Ocinaplon, Methyprylon, Indiplon, Zolpidem, Pentobarbital and Secobarbital), JAK 3 (Tofacitinib) (Fig. 5), IL-6 (Dexamethasone, Aloperine), CCND1 (Silibinin) and SRC (Cediranib, Nintedanib, Dasatinib/BMS-354825 and Saracatinib). The complete list of potential drugs acting on proteins associated with gene hypermethylation in head and neck cancer and their functions are presented in Supplementary Table 4.

Figure 5
figure 5

Regulatory network of selected hypermethylated genes associated with risk factors in head and neck cancer. Genes regulated by methylation from independent published studies in head and neck cancer (such as GAB1, TGFB, and JAK3) belongs to similar networks known to play a fundamental role in cancer progression. These methylated genes are targeted by the drugs, including TGF-beta receptor type II (Lerdelimumab, Suramin, and Interferon beta), GAB1-RA (Primidone, Flumazenil, Oxazepam, Flurazepam, Methylphenobarbital, Clorazepate, Ganaxolone, Clomethiazole, Zaleplon, Ocinaplon, Methyprylon, Indiplon, Zolpidem, Pentobarbital and Secobarbital), JAK 3 (Tofacitinib). Graphs were extracted from Metacore, Thompson Reuters (https://portal.genego.com).

Discussion

In this systematic review we discussed and validated common genes regulated by DNA hypermethylation with fundamental role in HNC progression and metastatic competence, considering independent investigations with different HNC cohorts around the world. The clinical impact of these genes as prognostic factor is highly relevant to open-up new avenues to the therapeutic approach towards a personalized medicine. Although numerous advances in diagnosis and treatment have been achieved in the last years, 66% of HNC are still diagnosed at advanced stages (III or IV)107, 20% of the patients will develop an upper aerodigestive tract secondary tumor2,19,109 and more than 50% will died during the 5 years of follow-up due to the metastatic tumors.

The accumulation of epigenetic and genetic modifications, frequently associated with exposure to carcinogens, confer advantages to the cell in cancer division and survival, such as growth factor-independent proliferation, resistance to apoptosis, and an enhanced motility capability to migrate through the extracellular matrix (ECM) and invade adjacent tissues110. DNA methylation events is a critical tumor-specific event occurring early in tumor progression to metastasis and it can be easily detected by PCR in a manner that is minimally invasive to the patient109. Our review identified DNA methylation in 120 genes associated with high risk for developing HNC. The expression patterns of these hypermethylated genes were correlated with the risk factors and their impact for patient’s survival probability, indicating they can act as predictors in progressive HNC.

The multivariate analysis showed that numerous suppressor genes were significantly hypermethylated such as P16, TIMP3, DCC, DAPK, MINT31, RARβ, MGMT, CCNA1, CD44, and CDH1; these genes are involved in cell–cell adhesion, cell polarity and tissue morphogenesis. This gene was analyzed alone or in gene panels, however, the studies showed discordant results. In one report, P16 hypermethylation was associated with carcinogenesis of oral epithelial dysplasia and it was considered a potential biomarker for the prediction of tumor progression of mild or moderate oral dysplasia64,83. The hypermethylation of the P16 promoter gene has also been described in advanced oral cancer associated with increased risk of loco-regional recurrences66. Different degrees of P16 hypermethylation have been reported in oral cancer23,26,46,62,74,75,91,94 and in others HNC location73,93.

Interestingly, promoter hypermethylation profile of the P16, MGMT, GSTP1 and DAPK can be used as molecular biomarkers to detect recurrent tumors using liquid biopsy111. Since gene hypermethylation has been found to be a common and early event in several types of cancer, including HNC, it has emerged as a promising target for non-invasive detection strategies for tumor recurrence and metastasis. It was known that cancer cells shed their DNA into the bloodstream and that circulating free DNA (cfDNA) share molecular similarities with the primary tumor, including DNA hypermethylation. So, it has been suggested that tumor specific DNA hypermethylation in serum is useful for diagnosis and prediction prognosis112. This information is yet to be translated into useful and reliable tools for HNC in the clinical practice. Nonetheless, due to the increase of the sensitivity and the high-throughput quantitative methodologies for hypermethylation analysis, specific candidates will surely emerge by combination of different genetic and epigenetic panels to achieve accuracy in the neoplastic detection113. Over the next years, clinical trials on diagnostic and treatment approaches based on hypermethylation markers will be available for the assessment of HNC prognosis, therapeutic strategies and to predict the response to the treatment.

Researchers found significant differences in the tumorigenesis and HNC prognosis of patients with HPV-related cancer versus HPV-negative tumors and have tended to classify HPV associated malignancies as a distinct biologic entity. HPV-negative HNC is related to oral sexual behaviour, which is associated with HPV transmission114,115. Relative to HPV-negative malignancies, HPV-positive cancers are associated with a more favourable prognosis114,115,116. However, most patients (> 75%) with HPV-unassociated HNCs present tumors with poorer clinical outcome, do not respond to standard treatments due to a higher rate of relapses115,116. The majority of the studies included in our analysis, including HPV-positive patients, have strong association with alcohol and tobacco consumption. Previous studies suggested that although HPV-positive cancers in heavy smokers may be initiated through virus-related mutations, they go on to acquire tobacco-related mutations and become less dependent on the E6/E7 carcinogenesis mechanisms typically associated with the virus117. If epigenetic alteration can be modified by alcohol and tobacco status in HPV-positive patients, the gene silencing by hypermethylation can also be influenced by the combination of different risk factors, interfering not only in the tumor initiation process but also in the HNC progression to metastasis. A current limitation in the prognosis and therapeutic strategies of HNC is the lack of consistent methods and the use of large cohort studies to adequately address the influence of the etiologic complexity and the tumor heterogeneity (anatomical and histological) in the metastatic competence of this disease.

In this study, we firstly performed a systematic review to disclose potential candidates associated with HNC susceptibility that was confirmed by a validation in public platform from the TCGA datasets with 279 HNC cases. However, we also conducted an additional validation of the most relevant hypermethylated genes that showed statistical significance in both previous analysis by using an independent cohort with single tumor anatomical location (only tongue cancer) considering alcohol consumption, tobacco use and HPV status. After this screening, only CD44 expression showed significant clinical impact at the translational level being associated with tumor recurrence. CD44 is a well-characterized cell surface glycoprotein receptor associated with a subpopulation of resilient tumor cells with enhanced carcinogenic properties specially involved with increased cell migration. We confirmed the increased proportions of CD44 + cells correlated with poor patient’s outcome in HPV negative HNC patients. The lower expression of CD44 might reflect the reduced number of cells with stem cell properties which explain the absence of metastasis and the better survival rates. In HNC, CD44 + expression has been associated with tumor-initiating cells or cancer stem cells due to their ability to persist and self-renew following therapy. Extensive investigations in our field have been performed with a hope to find a new prognostic tool to understand the basis of molecular carcinogenesis in HNC but also to identify potential therapeutic opportunities toward personalized medicine to manage patients with advanced disease. The ability to manipulate DNA methylation status and gene function by local and systemic delivery of epigenetic drugs (methylation inhibitors [e.g., 5-azacytidine]; antisense oligonucleotides [e.g., MG98]; and small molecule DNA methylation inhibitor [RG108]) has recently gained interest as novel therapeutic approach. Here, we reported potential drugs to target the most common alteration proposed in literature related to DNA hypermethylation in progressive HNC. The list of drugs available (Supplementary Table 4) may be used to block multiple nodes in critical pathways involved in cell proliferation, differentiation, tumor growth and survival in HNC at high-risk for recurrence.

In summary, this review highlights the impact of DNA hypermethylation associated with the main risk factors for HNC and show, from independent studies, the implication of methylated genes in the regulation of critical network with fundamental role in cancer progression to metastasis, which could be used as a potential therapeutic target and long-term surveillance for patients with invasive HNC.