Introduction

Hepatocellular carcinoma (HCC) is a common malignant tumor of the digestive system and one of the leading causes of tumor death worldwide [1, 2]. Symptoms in patients with early stage HCC are not quite obvious. Liver ultrasound and serum alpha-fetoprotein (AFP) examination are commonly used for HCC screening [3]. Randomized controlled trials have shown that improving early detection of HCC by monitoring hepatitis B virus carriers can significantly reduce HCC mortality [4, 5]. However, the sensitivity of AFP and liver ultrasound for early stage of HCC is relatively low [6, 7]. Therefore, there is an urgent need to develop a sensitive, reliable, and minimally invasive assay to detect early stage HCC for timely intervention.

Plasma cell-free DNA (cfDNA) refers to the degraded DNA fragments that are released into plasma after cell necrosis or apoptosis [8, 9]. In cancer patients, a portion of cfDNA is derived from the tumor cells, also known as circulating tumor DNA (ctDNA) [10, 11]. Next-generation sequencing (NGS) can analyze a variety of tumor-specific signals carried by the ctDNA, including somatic gene mutations [12], methylation modifications [13, 14], end motif [15], fragment length profiles [16], etc. Among them, methylation modification analysis is widely used for detection of HCC [17], colorectal cancer [18, 19], and even pan-cancer [20] due to its advantages of reflecting the early changes of tumors and the origin of tumor cells.

Bisulfite sequencing (BS-seq) is considered the gold standard for DNA methylation analysis because it provides quantification of methylation signals at single-base resolution [21]. However, the harsh conditions of the bisulfite treatment can cause huge damage to the DNA, which leads to generally poor sequence diversity, target enrichment bias, and also high sequencing errors [22, 23]. These shortcomings limit the application of BS-seq for methylation analysis of low-input DNA (e.g., cfDNA).

The limitation of BS-seq drives the development of new methylation detection techniques. TET-assisted pyridine borane sequencing (TAPS) utilizes the combination of biological enzymes and chemical methods to differentiate cytosine (C), 5-methylation of cytosine (5mC), and 5-hydroxymethylation of cytosine (5hmC) without damaging the DNA [24]. Enzymatic methyl-seq (EM-seq) uses enzymes including ten–eleven translocation dioxygenase 2 (TET2), T4 phage β-glucosyltransferase (T4-βGT), apolipoprotein B mRNA editing enzyme, catalytic polypeptide-like (APOBEC3A) to achieve the similar conversion to bisulfite treatment for methylation analysis. That is, unmethylated C is converted to T (Thymine), without any change in 5mC and 5hmC [25]. Compared with BS-seq, the EM-seq method based on mild biological enzyme conversion has obvious advantages including improved coverage and more even GC distribution [25].

In this study, we investigated incomplete conversion of EM-seq in hypomethylated control DNA and minimized the methylation signal noise by filtering the incomplete converted sequences. We confirmed the high sequencing quality of EM-seq for the analysis of low-amount DNA samples. Using EM-seq and target capture methods, a total of 596 clinical plasma cfDNA samples were analyzed in our study to develop an HCC screening model with good performance.

Results

Profiling and minimizing incomplete conversion of EM-seq

To investigate the performance of EM-seq, we used it to analyze methylated and unmethylated DNA control and compared with BS-seq. Among hypermethylated CpG sites, the methylation value detected by EM-seq was comparable to that by BS-seq (Additional file 1: Fig. S1a). Among the hypomethylated CpG sites, the methylation value detected by EM-seq was significantly lower than that by BS-seq (Fig. 1a, 0.1% V.S. 0.59%, P value < 0.0001). However, approximately 1.67% of hypomethylated CpGs (103 CpGs) had higher detection values in EM-seq than in BS-seq (Fig. 1b), and abnormal hypermethylation signals of > 1% were observed at 20 CpGs in EM-seq, accompanied by hypermethylation signals at adjacent CH sites (CH = CC, CT or CA, Additional file 2: Table S1). This suggests that EM-seq is prone to incomplete conversion in specific context, resulting in methylation signal noise, probably due to the substrate specificity of APOBEC3A deamination [26].

Fig. 1
figure 1

In complete conversion in EM-seq. a Methylation values on hypomethylated CpGs acquired by EM-Seq and BS-seq, before and after ≥ 3CHs filtration. Hypomethylated CpGs are those with BS-seq detection values of < 1% on λ DNA. b Dot plot compares individual methylation values in hypomethylated CpGs acquired by EM-Seq and BS-seq. Percentages indicate the fraction of CpGs that differed between conditions. c Pie charts compare the proportion of reads with 0, 1, 2, and ≥ 3 CH sites in all EM-seq and BS-seq sequencing reads. d. Genome plot for unmethylated control λ DNA (pos: 30,000–35,000) compares CH reads between EM-seq and BS-seq datasets. Boxes represent reads, and unmethylated (blue) and methylated (red) CHHs are indicated

Non-CG methylation is rare in most human cells and can therefore be used as an internal unmethylated control for methylation detection [27]. To further investigate the incomplete conversion, reads containing CH were extracted from BS-seq and EM-seq data. More CH reads were observed in BS-seq data than in EM-seq data (Fig. 1c). Both sporadic and clustered CHs were observed in EM-seq, and the reads with ≥ 3 CHs accounted for 18.71% of all CH reads in EM-seq (Fig. 1c, d, Additional file 1: Fig. S1b). In contrast, CH was mostly distributed sporadically in BS-seq, and the proportion of ≥ 3 CHs reads was only 6.43% of CH reads in BS-seq (Fig. 1c, d, Additional file 1: Fig. S1b). A larger proportion of ≥ 3 CHs reads was also observed from EM-seq data than BS-seq data from previous study with human genomic DNA (Additional file 1: Fig. S1c) [25].

We next considered if the performance of EM-seq could be improved by filtering reads with incomplete conversion. Filtering reads with ≥ 3 CHs significantly reduced EM-seq value of hypomethylated CpGs while not affect that of hypermethylated CpGs (Fig. 1a, Additional file 1: Fig. S1d). In contrast, filtration had limited impact on the result of BS-seq (Fig. 1a, b), consistent with previous report [28]. Furthermore, the number of CG sites with EM-seq values higher than BS-seq values decreased by approximately 45.5% (Additional file 1: Fig. S1e), suggesting that filtering of ≥ 3CHs reads can further improve the accuracy of EM-seq.

Methylation marker selection and targeted EM-seq

Because deep sequencing of the whole human methylome would be prohibitively costly, we used EM-seq combined with targeted capture to analyze plasma cfDNA from HCC patients and non-HCC controls (Fig. 2a). Briefly, plasma DNA was ligated to the methylated adaptor before enzymatic conversion and amplified using primers with dual index. Multiple samples were pooled together and enriched for target region by the designed probe panel before deep sequencing.

Fig. 2
figure 2

Targeted EM-seq and marker selection. a Overview of targeted EM-seq of plasma cell-free DNA. b Unsupervised hierarchical clustering of 89 methylation markers selected for panel design

To find potential biomarkers for HCC detection, we analyzed the expression profile data and methylation data of HCC and non-HCC tissues and screened CpGs with relatively high methylation from low-expression genes in HCC. Then, the methylation data of whole blood DNA were used to eliminate the CpGs whose average methylation level in blood was higher than that in HCC (Fig. 2b). To cover the CpGs associated with HCC as much as possible, other CpGs suitable for HCC detection were also included in our analysis, according to previous studies [29,30,31,32,33,34,35,36]. Finally, we designed a probe panel covering 1595 CpGs for subsequent analysis.

We used HepG2 cell DNA and plasma DNA to evaluate the performance of the targeted EM-seq. Relatively high DNA recovery was achieved in targeted EM-seq, with 60–70% of reads uniquely aligned to the designed panel region (target ratio) and median unique read depth of over 800 × with 20 ng fragmented cell DNA (Additional file 1: Fig. S2a, Additional file 2: Table S2). In addition, we confirmed the consistency of methylation levels detected by two Targeted EM-seq technical repeats (Additional file 1: Fig. S2b) along with the consistency of coverage depth (Additional file 1: Fig. S2c). EM-seq analysis of a representative donor cfDNA showed the length characteristics of cfDNA including the main peak of about 167 bp and small peaks spaced by 10 bp (Additional file 1: Fig. S2d) [37].

HCC detection using plasma samples

To develop a predictive model for the HCC screening, 596 plasma samples were analyzed by Targeted EM-seq (Additional file 2: Table S3). These samples were randomly divided into training set (n = 417, normal controls = 195, liver disease = 54, HCC = 168) for the model training; and test set (n = 179, normal controls = 84, liver disease = 22, HCC = 73) for the model testing. We confirmed the consistency of methylation levels of 20 plasma samples detected by two technical repeats of hybridization capture (Additional file 2: Table S4, r > 0.99). The study design is depicted in Fig. 3. Detailed patient characteristics are summarized in Additional file 2: Table S5.

Fig. 3
figure 3

Workflow chart of building a stack HCC screening model

GDBT (gradient boosting decision tree) machine learning analyses were applied to the training cohort of 168 HCC and 195 normal controls to generate a HCC versus normal model (HN model) with 37 markers. HN model achieved a sensitivity of 87% and a specificity of 97% ( area under the curve, AUC = 0.977) in the training set and a sensitivity of 90% and a specificity of 94% (AUC = 0.959) in the test set (Fig. 4a). The same analysis of 168 HCC and 54 liver diseases obtained a HCC versus liver disease model (HL model) with 264 markers. HL model achieved a sensitivity of 90% and a specificity of 72% (AUC = 0.915) in the training set and a sensitivity of 92% and a specificity of 84% (AUC = 0.948) in the test set (Fig. 4b).

Fig. 4
figure 4

Development and validation of the HCC screening model. a, b ROC of the three models with methylation markers in the training dataset (a) and test dataset (b). c Unsupervised hierarchical clustering of 283 methylation markers selected for HCC screening model development in the training and test datasets

A total of 283 markers were obtained after the two sets of markers were combined and deduplicated (Additional file 2: Table S6). The methylation levels of these markers were significantly different between HCC and non-HCC samples in both training and test sets, and the intensity increased with advancing disease stage (Fig. 4c, Additional file 2: Table S7).

Using a logistic regression method, we next constructed a stack model for HCC screening with these 283 markers using the predictive values generated from HN and HL models. Overall, the stack model yielded a sensitivity of 92% and a specificity of 94% in the training/fivefold cross-validation set (AUC = 0.980, 95% confidence interval [CI] 0.968–0.991; Fig. 4a), and a sensitivity of 90% and a specificity of 97% in the test set (AUC = 0.957, 95% CI 0.925–0.989; Fig. 4b). The classification accuracy was highly consistent between these two sets (AUC 0.980 versus 0.957), which confirms appropriate control of overfitting risks. HCC tumor score (t-score), the outputs of the stack model, can efficiently distinguish HCC patients from non-HCC individuals (Fig. 5a).

Fig. 5
figure 5

Further evaluation of the stack model for HCC screening. a The t-score of the stack model in normal controls, individuals with liver diseases and HCC patients. b The t-score in HCC patients with early and advanced stages. c Proportions of positive calling by the stack model and AFP (> 20 in HCC patients with early and advanced stages in the test set). d Proportions of positive calling by the stack model and PIVKA-II in HCC patients with early and advanced stages in the test set

Subgroup analysis and comparison with AFP/PIVKA-II

The stack model for HCC screening achieved high sensitivity and specificity in different groups. Sensitivity improved with advancing disease stage and achieved 85% in stage I (35/41), 89% in stage II (8/9), and 100% in stage III-IV (23/23) patients in the test set (Table 1). Similarly, there is good correlation between the t-score and tumor stage. Patients with early stage disease (I, II) had substantially lower t-scores compared to those with advanced stage disease (III, IV) (Fig. 5b). The specificity was 99% for normal controls (1/84) and 91% for liver disease samples (2/22) in test set (Table 1).

Table 1 Performance of the stack model for HCC screening in the test set

In HCCs, the stack model achieved higher detection accuracy than AFP for both early and advanced HCC (Fig. 5c) and achieved high detection accuracy in AFP-negative patients (25 of 28 patients for AFP < 20 ng/mL, 89.29%, Additional file 1: Fig. S3a). Protein induced by vitamin K absence-II (PIVKA-II) is another potential screening marker for HCC [38]. The stack model also achieved higher detection accuracy than PIVKA-II for both early and advanced HCC (Fig. 5d) and achieved high detection accuracy in PIVKA-II-negative patients (8 of 10 patients for PIVKA-II < 40 ng/mL, 80.0%) (Additional file 1: Fig. S3b).

Discussion

Diagnosis of HCC patients at early stage can effectively reduce HCC mortality. Tumor-specific methylation in cell-free DNA provides the possibility for noninvasive liquid biopsy of tumor detection and monitoring [10, 39, 40]. In this study, we used the targeted EM-seq method, replacing the traditional BS-seq, for methylation analysis of the plasma DNA from HCC patients and non-HCC controls, and constructed a highly sensitive and specific HCC screening model.

EM-seq has significant advantages over BS-seq. First, compared with BS-seq, EM-seq result in less DNA loss than BS-seq, achieving better sequencing coverage, especially in the analysis of low-input DNA samples. Second, EM-seq contains fewer conversion errors than BS-seq. In particular, our data in this study showed that the incomplete converted CHs in EM-seq cluster in one read, which can be easily removed by filtering. Minimizing the methylation noise caused by the incomplete conversion is important to accurately detect trace methylation variation from limited ctDNA in early stage samples [41]. Finally, EM-seq library can be prepared by ligating the adaptor before conversion without obvious DNA loss. The library obtained by this method can retain more physiological characteristics of cfDNA and has the potential to analyze methylation, end motif, and fragmentation profiles simultaneously.

Due to the substrate specificity of APOBEC3A deamination, incomplete conversion of EM-seq occurs in a few specific context, leading to false increase of > 1% in methylation rate. Including APOBEC enzymes with sequence specificity other than APOBEC3A may further improve conversion efficiency of EM-seq. Besides, sequence-specific incomplete conversion in a few specific sites can offset each other across clinical samples during biomarker selection. So it does not deny the results of this study.

The HCC screening model constructed by machine learning analysis can clearly distinguish HCC and non-HCC samples with 90% sensitivity and 99% specificity in the test cohort. Cancer is a rare disease, and the incidence of HCC is low in the population. Only extremely specific tumor screening methods can minimize the risk of overdiagnosis and achieve effective screening for the HCC [42]. In addition, the HCC screening model in this study has different sensitivities to HCC samples of different stages. The sensitivity of advanced stage samples is higher than that of early stage samples, indicating that tumor burden has a certain influence on the prediction results of the HCC screening model. This model holds potential for monitoring of HCC treatment.

This study has some limitations as well. First, fewer liver disease samples were recruited in this study and the inclusion of more high-risk samples would help to clarify the specificity of the screening model in high-risk populations. Second, the median age of the non-HCC control group in this study was 46 years, which was lower than that of the HCC group (median 57 years). Methylation is correlated with age [43]. Therefore, the inclusion of age-matched non-HCC controls helps to eliminate the interference of age-related methylation variation.

Conclusions

This study demonstrated that incomplete conversion of EM-seq is characterized by CH clustering in sequencing reads. Filtering of reads with ≥ 3 CHs can significantly reduce the noise of methylation detection of EM-seq. A predictive model for HCC screening based on targeted EM-seq analysis of plasma cell-free DNA can efficiently distinguish HCC patients from non-HCC individuals and perform well in the detection of early stage patients. It provides a noninvasive method for HCC screening with a very high specificity. Given that epigenetic variations are common in the development of most tumors, this strategy could be modified for the early detection of other cancer types or of multiple cancer types from a single tube of blood.

Methods

Study design

Patients with HCC or liver disease that were treated at the Zhongshan Hospital, Xiamen University, between November 2020 and December 2021 were selected for this study. The HCC stage was determined using the China liver cancer staging system [44]. Healthy individuals undergoing routine health care maintenance were randomly selected as controls in this study. Detailed patient characteristics are summarized in Additional file 2: Table S1. The present study was performed under the Helsinki Declaration and was approved by the Ethics Committee of the Zhongshan Hospital, Xiamen University (reference number: 2020-015). Informed consent was obtained from all participants or their families.

Blood sample processing and cfDNA purification

Samples from all cases and controls were processed by the following method. Peripheral blood was collected in cfDNA Blood Collection Tube (Zhixuan Biotech). Plasma was separated by centrifugation at 1600×g for 10 min and transferred to microcentrifuge tubes. After then, centrifugation was done at 16,000×g for 10 min to remove cellular debris. The supernatant was divided into 2-ml aliquots and stored at − 80 °C until the time of DNA extraction. cfDNA was extracted from 2-ml plasma for each participant using Plasma Cell-Free DNA Extraction Kit (Concert). cfDNA concentration was measured using Qubit dsDNA High Sensitivity Assay Kit (Thermo Fisher).

Enzymatic methyl conversion

EM-seq library preparation was performed using VAHTS Universal DNA library Prep Kit for Illumina V3 (Vazyme) and EM-seq Conversion Module (New England BioLabs) according to the manufacturer’s instruction with minor modifications. In brief, mechanically fragmented methylation control DNA (CpG methylated pUC19 and unmethylated λ DNA) alone or combined with cell-free DNA fragments was treated with VAHTS Universal DNA library Prep Kit (New England BioLabs) for end-repair, A-tailing, and ligation of EM-seq adaptor (New England BioLabs). The ligated samples were methyl-converted with EM-seq Conversion Module (New England BioLabs) per the manufacturer’s protocol. Methyl-converted DNA was purified and amplified using NEBNext Unique Dual Index Primers (New England BioLabs) and KAPA HiFi HotStart Uracil + ReadyMix (KAPA biosystems).

Bisulfite treatment

Methylation control DNA or DNA fragments were ligated with the EM-seq adaptor as mentioned above and then subjected to the bisulfite conversion using EZ DNA Methylation-Lightning Kit (Zymo Research) according to the manufacturer’s protocol.

Targeted capture and sequencing

EM-seq libraries from up to 24 samples were pooled together for hybridization enrichment using the Customized NAD probes (Nanodigmbio). The capture reaction was performed with Nadprep Hybrid capture Reagents (Nanodigmbio) following the manufacturer’s instructions. Captured libraries were obtained using on-bead PCR amplification with VAHTS HiFi Amplification mix and PCR primer mix3 for Illumina (Vazyme). The libraries size was determined using the Bioanalyzer 2100 (Agilent Technologies). The target-enriched library was then sequenced on the NovaS4 NGS platform (2 × 150 bp, Illumina) following the manufacturer’s instructions.

Panel design

Methylation and expression profile data of 357 HCC and 50 non-HCC tissue were downloaded from The Cancer Genome Atlas (TCGA) [45]. Differential analysis was performed both on expression profiles of all genes except for those on sex chromosomes and methylation data of the gene-related CpG islands. The screening thresholds for differential gene expression were set as FDR < 0.05 and |log2 (fold change)|> 1. The screening thresholds for differential methylation were set as FDR < 0.05 and |β|> 0.2, and the methylation sites with high methylation and low expression were selected. The methylation dataset GSE69270 of 184 non-HCC blood was downloaded from the Expression Omnibus database [46, 47]. The CpG sites with an average methylation level in non-HCC blood higher than that in HCC tissues (> 0.1) were removed. A group of five probes of 120 bp length was designed for each CpG site, including NO.1 the original unconverted sequence, NO.2 and NO.3 the converted, methylated Watson and Crick strands, NO.4 and NO.5 the converted, unmethylated Watson and Crick strands. Multiple probes were used to reduce the effect of the original methylation state on the capture efficiency.

Data processing

The raw sequencing FASTQ files were processed using Fastp (0.21.0) to trim Illumina-specific adapters with default parameters and low-quality sequences with parameters of -u 20 and -q 20 [48]. Mapping of the processed sequencing reads was performed using the bismark (0.23.0) [49]. Deduplicate_bismark (bismark) was used for deduplication. Incomplete converted reads with more than 3 CHs were removed. Finally, methylation level is estimated using bismark_methylation_extractor (bismark) to calculate methylation frequencies for all CpGs using parameters -comprehensive --bedGraph --counts --cytosine_report --CX --buffer_size 20G --parallel 16.

Development of HCC screening model

A custom module was built to classify samples using two layers of models: (i) two GDBT model: A HCC versus normal model (HN model) and a HCC versus liver disease (HL model) [50]. Both models were trained by GDBT model. RFECV (recursive feature elimination with cross-validation) method was used to select the optimal features in HN model [51]. In HL model, all features with importance greater than 0 were selected as the optimal features. GridSearchCV (grid search with cross-validation) method was used to optimize models’ parameters [52]. (ii) A multinomial logistic regression model: two models of the first layer were stacked into a logistic regression model to obtain a HCC/non-HCC assignment as a final prediction [53]. The predictive score h (t-score) is calculated as: h = wTx + b, where w is the coefficient of each feature, and x is the output from models in the first layer. The L2 penalty parameter λ was determined by GridSearchCV method. To optimize model performance and avoid overfitting, iterative fivefold cross-validation was performed in the training dataset. The performance of the final stacked model was evaluated in the independent test cohort.

Statistical analysis

Differences between the groups were calculated using the Student’s t test. Statistical analysis was done by R (v. 4.0.5). The performance of prediction model was evaluated using the AUC statistics (95% CI), and the values were calculated by the pROC (v. 1.18.0) package [54]. The sensitivity [True Positive/(True Positive + False Negative)] and specificity [True Negative/(True Negative + False Positive)] thresholds under 95% CI are obtained by the formula: \(p \pm 1.96 \times \sqrt {p\left( {1 - p} \right)/n}\) [55].