Introduction

Acute myeloid leukemia (AML) is a hematopoietic malignancy characterized by a complex interplay of aberrations at different molecular levels (i.e., genetic, epigenetic, transcriptomic, and proteomic) [1,2,3]. This complexity is reflected in the heterogeneity of AML in terms of pathogenesis and prognosis. In clinical practice, only properly introduced and validated genetic lesions, together with cytogenetics, are considered in treatment decision making [4]. This still applies despite growing evidence that other markers, such as epigenetic factors, may add valuable information about the predicted course of the disease in individual AML patients [3]. DNA methylation is one of the longest-studied epigenetic mechanisms and is stable and relatively easy to measure [5, 6]. Therefore, its status can be readily harnessed as a clinically relevant stratifier. Indeed, an increasing number of articles assess the influence of DNA methylation on AML prognosis (reviewed in [7]). These studies interrogate one, a few, or multiple loci depending on the methodology used. Typically, the authors define gene(s) that may serve as new biomarkers to improve risk stratification in AML patients. The main weakness is that such findings are usually not validated by other researchers, and hence the evidence is insufficient for these potential biomarkers to be introduced into clinical practice.

Therefore, we designed a comprehensive NGS-based DNA methylation panel comprising genes previously published as having an impact on AML prognosis. For validation purposes, we selected fourteen studies published between 2011 and 2019 [8,9,10,11,12,13,14,15,16,17,18,19,20,21] covering 27 genes (Additional file 1: Table S1). We chose works targeting only one or a few loci at once (on average 2 loci per publication, range 1 to 7), because a lower number of biomarkers would be more feasible to introduce into routine laboratory practice. The selected studies and their basic characteristics are summarized in Table 1. The aim of this work was to independently verify results published by other researchers in order to narrow down the list of genuinely prognostically relevant genes that may allow more precise AML stratification in the future.

Table 1 Studies subjected to DNA methylation validation

Results

Our validation study confirmed the association between DNA methylation status and prognosis for four genes: CEBPA [13], PBX3 [10], LZTS2 [16], and NR6A1 [16]. A summary of the results is presented in Table 2. Surprisingly, for two studies [19, 20], we found the exact opposite effect of DNA methylation on prognosis to that originally reported: higher GPX3 and DLX4 methylation was linked to a better outcome according to our data. Kaplan–Meier curves for OS and EFS for all six significant genes are shown in Figs. 1 and 2, respectively. In four additional studies [8, 9, 15, 21], only the log-rank test yielded statistical significance, which was lost in the subsequent multivariate testing (Table 2). These results were not considered sufficiently conclusive to classify the corresponding genes as validated. The mean DNA methylation values in hypo- versus hypermethylated subgroups for each of the significant genes are depicted in Fig. 3.

Table 2 DNA methylation validation results
Fig. 1

Kaplan–Meier (KM) curves for overall survival (OS): A CEBPA methylation KM curves in AML subgroup excluding favorable cytogenetics and without CEBPA and NPM1 mutations (n = 83). B GPX3 methylation KM curves in the whole non-M3 AML cohort (n = 178). C DLX4 methylation KM curves in the whole non-M3 AML cohort (n = 178). D LZTS2 methylation KM curves in the whole non-M3 AML cohort (n = 178). E NR6A1 methylation KM curves in the whole non-M3 AML cohort (n = 178). F LZTS2&NR6A1 methylation KM curves in the whole non-M3 AML cohort (n = 178). G LZTS2 methylation KM curves in the CN-AML subgroup (n = 85). H NR6A1 methylation KM curves in the CN-AML subgroup (n = 85). I LZTS2&NR6A1 methylation KM curves in the CN-AML subgroup (n = 85). CN-AML = cytogenetically normal AML, hypo = hypomethylated, hyper = hypermethylated, Strata = stratified by a variable

Fig. 2

Kaplan–Meier (KM) curves for event-free survival (EFS): A CEBPA methylation KM curves in AML subgroup excluding favorable cytogenetics and without CEBPA and NPM1 mutations (n = 83). B PBX3 methylation KM curves in the whole non-M3 AML cohort (n = 178). C GPX3 methylation KM curves in the whole non-M3 AML cohort (n = 178). D DLX4 methylation KM curves in the whole non-M3 AML cohort (n = 178). E LZTS2 methylation KM curves in the whole non-M3 AML cohort (n = 178). F NR6A1 methylation KM curves in the whole non-M3 AML cohort (n = 178). G LZTS2&NR6A1 methylation KM curves in the whole non-M3 AML cohort (n = 178). H LZTS2 methylation KM curves in the CN-AML subgroup (n = 85). I NR6A1 methylation KM curves in the CN-AML subgroup (n = 85). J LZTS2&NR6A1 methylation KM curves in the CN-AML subgroup (n = 85). CN-AML = cytogenetically normal AML, hypo = hypomethylated, hyper = hypermethylated, Strata = stratified by a variable

Fig. 3

Comparison of mean DNA methylation values in successfully validated genes between hypo- and hypermethylated subgroups of AML. CN-AML = cytogenetically normal AML, hypo = hypomethylated, hyper = hypermethylated

Discussion

Despite the large number of studies addressing the importance of DNA methylation changes for AML prognosis, and despite many promising results, these aberrations are still not considered for risk stratification. The lack of independent validation studies is probably the main obstacle to implementing epigenetic markers alongside the well-established genetic ones. Most publications simply present further potential epigenetic biomarkers, making the actual role of DNA methylation harder to grasp and interpret for clinical purposes. With the aim of verifying the prognostic role of specific, already described DNA methylation changes in AML, we designed a custom NGS-based DNA methylation panel covering 27 genes (Additional file 1: Table S1) taken from 14 studies published between 2011 and 2019. The reported prognostic significance was verified for three studies [10, 13, 16]. These three studies do not share any apparent features such as size of test cohort, presence of a validation cohort, methodology, or biological material utilized for the DNA methylation assessment (see Table 1). We briefly summarize and discuss the genes with a confirmed role of DNA methylation in AML prognosis.

CEBPA is a well-known gene involved in AML pathogenesis. Double CEBPA mutations have been connected to better OS and EFS [4]. Concordantly, hypermethylation of the distal CEBPA promoter was reported as a favorable prognostic biomarker, which we confirmed in the AML subgroup excluding favorable cytogenetics and without CEBPA and NPM1 mutations, but not in CN-AML without CEBPA and NPM1 mutations as also originally described by Lin et al. [13].

PBX3 has been identified as an oncogene in AML that transcriptionally regulates HOXA genes and promotes cell proliferation and resistance to chemotherapeutic agents [22]. Hajkova et al. [10] reported that PBX3 overexpression is associated with a higher incidence of relapses. They also showed a clear correlation between PBX3 overexpression and hypomethylation. In line with this, we detected PBX3 hypomethylation as an independent negative prognostic factor for EFS.

Qu et al. [16] identified higher methylation in CpG island (CGI) shores of the LZTS2 and NR6A1 genes as a predictor of better prognosis in CN-AML. Interestingly, we confirmed the predictive role of LZTS2 and NR6A1 hypermethylation not only in CN-AML, but in the whole non-M3 diagnostic AML cohort as well. The strongest link between DNA methylation and prognosis was observed when both genes were concurrently hypermethylated.

Validation of the works of Zhou et al. [19, 20] produced results contradicting the original studies. Unlike them, we observed a clear association between higher GPX3/DLX4 promoter methylation and better survival. This discrepancy is hard to explain, because even the use of a different methodology (qMSP versus NGS) or biological material (BM versus PB) would not completely reverse the impact of a particular gene’s hypermethylation. A recent review of GPX3 described its dichotomous role in different cancer types; it can act as either an oncogene or a tumor suppressor [23]. Tumors with high GPX3 expression have an increased resistance to chemotherapy due to the involvement of GPX3 in the antioxidant enzyme system [24]. This would support our finding that GPX3 hypermethylation (and thus probable downregulation) is linked to a favorable outcome in an AML cohort treated with the standard 3 + 7 induction regimen. As for DLX4, its overexpression has been described in numerous tumor types (including AML) in association with tumor progression and/or invasion [25,26,27,28]. This again supports the link between DLX4 hypermethylation and better AML prognosis.

Notably, all verified prognostic DNA methylation changes have one thing in common: higher methylation equals better prognosis. Six of the fourteen studies subjected to validation reported an association between higher methylation/lower expression and superior outcome. Of these six studies, three were verified by both log-rank and multivariate Cox regression analysis [10, 13, 16] and three showed significance by log-rank test only [8, 15, 21]. On the other hand, of the eight studies describing a relationship between higher methylation and poor prognosis, only one displayed significance by log-rank test [9], none was verified by multivariate Cox regression analysis, and for two studies the opposite relation between higher methylation and prognosis was revealed [19, 20]. Altogether, it seems that, where DNA methylation influences prognosis in AML, higher methylation is predominantly linked to a better outcome. However, the exact location of differential methylation and the specific genes affected are probably the key elements determining the direction in which DNA methylation influences patients’ outcome.

In three studies, an indirect relation between DNA methylation (through its association with gene expression) and prognosis was reported [10, 12, 15]. Of these, only one study was validated [10]. Strictly speaking, we cannot exclude a role of deregulated gene expression in patients’ outcome in the remaining two studies [12, 15], because our study design did not examine the impact of gene expression on AML prognosis.

Another important aspect to discuss is the use of PB versus BM for DNA methylation assessment. Our AML cohort consists of PB samples only, whereas PB alone was the starting material in 3 of the 14 studies that underwent validation. Some articles have already compared DNA methylation results obtained from PB versus BM and reported that the two materials are interchangeable for these purposes [8, 10, 16]. In line with this, the result of DNA methylation validation was not determined by the biological material used. In fact, the genes whose methylation status had a validated role in AML prognosis all came from studies using either BM alone [13, 19, 20] or a combination of PB and BM [10, 16]. PB is a starting material that is easily accessible to the majority of laboratories, and its collection is not as burdensome for patients as BM aspiration.

In practical terms, implementation of a new biomarker represented by a single gene/region is always more feasible than that of a complex methylation pattern. The low number of genes for which we confirmed the prognostic impact with our NGS-based approach highlights the importance of such validation and the need for a consistent and easily reproducible approach to assess the impact of various changes in DNA methylation on AML prognosis.

Conclusions

We showed that validation of previously published prognostically significant DNA methylation changes is essential to confirm their relevance for patients’ stratification. Out of 27 genes, a statistically significant association between DNA methylation status and prognosis was confirmed for six: CEBPA, PBX3, LZTS2, NR6A1, GPX3, and DLX4. We propose that further independent validation studies build upon our results, because only markers properly verified by several independent studies can be considered for AML prognosis refinement in clinical practice.

Methods

Patients

We examined 178 adult AML patients: 128 patients from the Institute of Hematology and Blood Transfusion (Prague, Czech Republic) and 50 patients from the University Hospital Brno (Brno, Czech Republic). All patients were diagnosed with AML between 2013 and 2016 and were treated with curative intent starting with the 3 + 7 induction regimen [29]. The clinical and basic molecular characteristics used for statistical analysis are listed in Additional file 1: Table S2. Healthy donors (n = 11) were also analyzed. The study was approved by the Ethics committees of both participating institutions and all patients provided their full consent. The research conforms to The Code of Ethics of the World Medical Association.

Targeted bisulfite sequencing

Sequencing libraries consisted of 16–18 samples and were prepared according to the SeqCap Epi protocol (Roche, Basel, Switzerland) with the KAPA HyperPrep Kit (Roche). Diagnostic whole-blood DNA from AML patients (800–1200 ng) was first mixed with the Bisulfite-conversion Control (unmethylated DNA from phage lambda) provided in the SeqCap Epi Accessory kit (Roche) and then fragmented using either an E220 Focused ultrasonicator (Covaris, Woburn, MA, USA) or a Bioruptor Pico instrument (Diagenode, Liège, Belgium) to obtain an average fragment size of 200 bp. The EZ DNA Methylation Lightning Kit (Zymo Research, Irvine, CA, USA) was used for the bisulfite conversion. Pooled samples from each library were hybridized for about 68 h with a custom set of probes (manufactured by Roche). The final concentration of the libraries was measured using the KAPA Library Quantification Kit (Roche), and the average size of the libraries’ fragments was assessed on a 4200 TapeStation System (Agilent Technologies, Santa Clara, CA, USA). Libraries were sequenced on a MiSeq instrument (Illumina, San Diego, CA, USA) using the MiSeq Reagent Kit v2 (300-cycles) (Illumina).

Sequencing data analysis

FastQC (version 0.11.8) [30] and MultiQC (version 1.7) [31] were used to check the quality of fastq files. Reads were then trimmed and filtered using Cutadapt (version 2.4) [32], and the quality of the reads was checked again. Filtered data were mapped with Segemehl (version 0.3.4) [33] to the human genome version GRCh37/hg19 with the added sequence of Enterobacteria phage lambda NC_001416.1. Mapping statistics were assessed, and we checked that more than 80% of reads were mapped for each sample. Bam files containing mapped reads were sorted and indexed with Samtools (version 1.10). Subsequently, we used the Haarz tool (version 0.3.4) [33] with the "callmethyl" option enabled to select methylated positions and create vcf files that were further processed in R. Positions that corresponded to the lambda phage sequence were separated and used to check that the bisulfite conversion ratio was > 99% for each sample. The remaining positions were filtered so that only CpG positions were retained. Finally, we selected regions corresponding to the loci published in the original articles and assessed the average methylation across each region. The list of selected regions is provided in Additional file 1: Table S1. Raw sequencing data are available at the Gene Expression Omnibus repository (accession number GSE165435).
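
The two summarization steps described above (the bisulfite conversion check on lambda phage positions and the averaging of CpG methylation across a region) can be sketched as follows. This is a minimal Python illustration, not the R code used in the study; the input layout (pairs of methylated/unmethylated read counts, dictionaries with `chrom`, `pos`, and `meth` keys), the function names, and the toy coordinates are assumptions made purely for illustration.

```python
from statistics import mean


def bisulfite_conversion_rate(lambda_counts):
    """Estimate the conversion rate from cytosines on the lambda spike-in.

    lambda_counts: iterable of (methylated_reads, unmethylated_reads) per
    cytosine (hypothetical layout). Lambda DNA is unmethylated, so every
    "methylated" read reflects a cytosine that failed to convert.
    """
    counts = list(lambda_counts)
    unconverted = sum(m for m, _ in counts)
    total = sum(m + u for m, u in counts)
    if total == 0:
        raise ValueError("no reads on lambda positions")
    return 1.0 - unconverted / total


def mean_region_methylation(cpg_calls, chrom, start, end):
    """Average methylation (0-1) of CpG calls inside [start, end] on chrom.

    Returns None when no CpG call falls into the region.
    """
    values = [c["meth"] for c in cpg_calls
              if c["chrom"] == chrom and start <= c["pos"] <= end]
    return mean(values) if values else None


# Toy data, made up for illustration only
lambda_counts = [(1, 199), (0, 200)]
cpg_calls = [
    {"chrom": "chr1", "pos": 100, "meth": 0.25},
    {"chrom": "chr1", "pos": 150, "meth": 0.75},
    {"chrom": "chr1", "pos": 900, "meth": 1.0},
]

rate = bisulfite_conversion_rate(lambda_counts)
sample_passes_qc = rate > 0.99  # threshold stated in the text
region_mean = mean_region_methylation(cpg_calls, "chr1", 90, 200)
```

In this sketch each CpG contributes equally to the regional average; whether the original analysis weighted positions by coverage is not stated in the text.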

Statistical analyses and definitions

For the statistical analyses, R software (version 4.0.0) was used. Surviving patients were censored on April 6, 2020. Overall survival (OS) was defined as the time from diagnosis until death from any cause. Event-free survival (EFS) was defined as the time from the first complete remission until death or hematological relapse. Multivariate Cox regression analysis was computed with the following covariates: age, leukocyte count, cytogenetics [34], transplantation in the first complete remission, and presence of FLT3-ITD and NPM1 mutations. For five studies (see Table 2), Cutoff Finder [35] was used to determine the optimal DNA methylation threshold. Otherwise, we used the same DNA methylation threshold as originally published, or set it in the most similar and meaningful way. We also adapted the selection of AML patients, because some studies detected a prognostic effect of DNA methylation only in a specific subset of AML, such as cytogenetically normal (CN) AML. To evaluate the prognostic significance of the studied regions, we performed Kaplan–Meier analysis with the log-rank test. Subsequently, for those loci significantly affecting OS or EFS in Kaplan–Meier analysis, we assessed the effect of DNA methylation levels on OS and EFS using multivariate Cox regression. A p-value ≤ 0.05 was considered statistically significant.
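
The Kaplan–Meier analysis with log-rank test described above can be illustrated with a small, dependency-free sketch. The analyses in this study were performed in R; the Python code below is not that implementation, only a minimal two-group version assuming right-censored data in which `events` is coded 1 for an event (death or relapse) and 0 for censoring. The function names and the toy survival times are hypothetical, and the multivariate Cox regression step is not covered.

```python
import math


def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct event time (product-limit estimate)."""
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    surv = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = removed = 0
        # Collect all subjects sharing this time point
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths:
            surv *= 1.0 - deaths / n_at_risk
            curve.append((t, surv))
        n_at_risk -= removed
    return curve


def logrank_test(times_a, events_a, times_b, events_b):
    """Two-group log-rank test; returns (chi-square statistic, p-value), 1 df."""
    times_a, times_b = list(times_a), list(times_b)
    events_a, events_b = list(events_a), list(events_b)
    event_times = sorted({t for t, e in zip(times_a + times_b,
                                            events_a + events_b) if e})
    obs_minus_exp = 0.0
    variance = 0.0
    for t in event_times:
        n_a = sum(1 for x in times_a if x >= t)  # at risk in group A
        n_b = sum(1 for x in times_b if x >= t)  # at risk in group B
        d_a = sum(1 for x, e in zip(times_a, events_a) if x == t and e)
        d_b = sum(1 for x, e in zip(times_b, events_b) if x == t and e)
        n, d = n_a + n_b, d_a + d_b
        obs_minus_exp += d_a - d * n_a / n
        if n > 1:
            # Hypergeometric variance of the event count in group A
            variance += d * (n_a / n) * (n_b / n) * (n - d) / (n - 1)
    if variance == 0:
        return 0.0, 1.0
    chi2 = obs_minus_exp ** 2 / variance
    # Upper tail of a chi-square distribution with 1 degree of freedom
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Toy example: months of follow-up in hypo- vs hypermethylated patients
hypo_times, hypo_events = [1, 2, 3, 4, 5], [1, 1, 1, 1, 1]
hyper_times, hyper_events = [6, 7, 8, 9, 10], [1, 1, 1, 1, 1]

km_curve = kaplan_meier([1, 2, 3, 4], [1, 0, 1, 1])
chi2, p_value = logrank_test(hypo_times, hypo_events, hyper_times, hyper_events)
significant = p_value <= 0.05  # significance level stated in the text
```

A locus flagged as significant by such a test would then proceed to the multivariate Cox model, which in practice is best left to an established statistical package.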