Background

Oesophageal squamous cell carcinoma (ESCC) is the most common form of oesophageal cancer and is one of the deadliest cancers worldwide [1]. Because early-stage oesophageal cancer is mostly asymptomatic, the majority of ESCC patients are diagnosed with advanced disease. Despite improvements in imaging, surgical techniques and chemoradiation therapy, effective treatment of ESCC patients remains challenging, with an overall 5-year survival of less than 30% [1]. For localized ESCC, surgery is the preferred option. However, even after radical resection, the 5-year overall survival of ESCC patients with positive lymph nodes is less than 40% [2, 3]. Moreover, the recurrence rate is also high in patients without positive lymph nodes [4]. Thus, accurate and timely detection of minimal residual disease or relapse is crucial for tailoring adjuvant therapy for a longer survival time.

Liquid biopsies have become a research hotspot for non-invasive follow-up on disease load and therapy response and for early detection of recurrence [5]. Thus far, promising results have been obtained with circulating cell-free DNA (cfDNA) [6,7,8]. PCR-based techniques such as droplet digital PCR (ddPCR) and PNA-mediated PCR have been used to detect recurrent mutations in EGFR and KRAS in lung cancer patients [9,10,11,12] and in APC in colorectal cancer [13]. Although the results are promising, these approaches require prior knowledge of the mutations present in the tumour sample. To circumvent this limitation, next generation sequencing (NGS)-based approaches using cancer hotspot panels or whole exome approaches have been applied to cfDNA. These studies have reported variable dynamic patterns in mutant allele frequencies of somatic mutations in lung cancer, breast cancer and colon cancer patients [7, 14,15,16,17,18]. Some studies have been carried out to monitor treatment response and disease progression by screening cfDNA samples for mutations detected in the primary tumour [5, 7]. For example, in stage II colon cancer patients, the presence of tumour DNA in cfDNA provided evidence of minimal residual disease and was associated with a higher risk of recurrence [7].

Only a few studies have focused on the analysis of cfDNA in ESCC. The amount of cfDNA was shown to be higher in ESCC patients than in healthy controls [19]. Three to six months after tumour resection, the amount of cfDNA was significantly reduced, indicating that a major fraction of the cfDNA is derived from tumour cells. Another study showed the feasibility of using cfDNA before and after surgery to track tumour load [20]. Decreased mutant allele frequencies were generally observed in post-surgery plasma of 8 ESCC patients using whole exome sequencing and in 3 patients using targeted deep sequencing.

Together, these studies have indicated that circulating tumour DNA (ctDNA) can be detected in the cfDNA of ESCC patients, but additional studies are required before ctDNA can be used in routine clinical practice. In this study, we carried out targeted deep-sequencing using a cancer-related gene panel to explore the cfDNA mutation profile in stage II and III ESCC patients both pre- and post-surgery.

Methods

Patient selection

Seventeen ESCC patients who underwent radical tumour resection at the Shantou University cancer hospital between November 1, 2013 and May 31, 2014 were included (Fig. 1). None of the patients had been treated with chemotherapy or radiotherapy before surgery. Tumour tissue samples were stored at −80 °C and evaluated by experienced pathologists. Blood was collected 1 day before surgery and again after surgery, at time points ranging from 3–4 h to 9 days. Clinical annotations were retrospectively extracted from the institutional clinical database.

Fig. 1

Schematic representation of the sample collection and a brief summary of the sequencing results

Blood sample separation

Whole blood samples (2–5 ml) were collected in EDTA tubes and processed within 2 h. After centrifugation at 900×g for 10 min, whole blood samples were separated into a plasma fraction and a white blood cell (WBC) fraction. Aliquots of plasma were subjected to two successive centrifugation steps at 16,000×g for 10 min at 4 °C to remove residual WBCs. Red blood cells in the WBC fraction were lysed using standard procedures. WBCs were collected by centrifugation at 600×g for 10 min and washed with PBS. All samples were stored at −80 °C.

DNA extraction, library preparation and sequencing

DNA isolations, library preparations and NGS of fresh frozen tumour tissues, WBCs and plasma samples were performed by Novogene (Beijing, China). In brief, the QIAamp DNA Mini Kit (Qiagen, Hilden, Germany) was used to extract DNA from fresh frozen tumour tissues. The QIAamp circulating nucleic acid kit (Qiagen) was used to isolate cfDNA from 1 to 3 ml of plasma. Genomic DNA of WBCs was extracted using the RelaxGene Blood DNA System (TianGen Biotech Co., Ltd., Beijing, China). All DNA samples were analysed using Nanodrop for purity (ratio of OD260/280) and DNA yield was quantified using the Qubit dsDNA HS assay kit on the Qubit 2.0 system (Life Technologies, Carlsbad, CA). DNA samples were stored at − 80 °C until they were subjected to NGS.

DNA input for sequencing was 500 ng for tissue samples and WBCs and about 30 ng for cfDNA samples. Exons of 483 human protein-coding genes related to cancer were included in the NGS panel designed by Novogene [21] (Additional file 1). DNA fragments were captured using Agilent SureSelect XT (Agilent, Santa Clara, CA, USA). Paired-end 150 bp reads were generated on a Hiseq2500 sequencing system (Illumina, Beijing, China).

Data analysis

FASTQ files were obtained from the company and processed as previously described [22]. Briefly, reads were aligned to the hg19 reference genome with the Burrows-Wheeler Aligner (BWA) and further processed with the Genome Analysis Toolkit (GATK) [23]. Format conversion and de-duplication were performed using Picard Tools. HaplotypeCaller was used for variant calling in all samples in one workflow. The data analysis pipeline is set to report all variants with a mutant allele frequency (MAF) > 1%. Subsequently, the pipeline reports the allele depth for these variants in all samples. Personal variants were filtered out when detected in the WBC samples at a variant allele frequency of > 0.5% in combination with a minimal variant allele count of two. In addition, we removed all variants with a sequencing depth of less than 25x in WBCs, because for these variants we cannot reliably assess whether they represent personal variants or somatic mutations. Variants with a coverage of < 100x or a MAF of < 5% in the primary tumour samples were excluded, as these variants are likely to be present in a minor subclone of the tumour. All other variants observed in the tumour samples were considered to represent somatic mutations. For all somatic mutations called in the tumour, mutations in cfDNA were considered present when we observed either (1) 4 altered reads with a MAF > 0.5%, or (2) > 4 altered reads and a MAF above the sequencing error background frequency, which was 0.43% per position and 0.14% per alternative nucleotide. Somatic mutations specific to cfDNA were reported when the MAF was > 1% and the coverage was > 100x. As the background rate for INDELs is much lower, we considered INDELs with two or more altered reads as true somatic mutations, irrespective of the MAF.
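The filtering rules described above can be summarised in a short sketch. The Python below is illustrative only and is not the pipeline used in this study: all function names are hypothetical, and the use of the per-alternative-nucleotide background (0.14%) as the threshold in rule (2) is an assumption about how the background frequency was applied.

```python
# Illustrative sketch of the filtering rules; names are hypothetical.
# Thresholds are taken from the text. Applying the 0.14% per-alternative-
# nucleotide background in rule (2) is an assumption.
BACKGROUND_ERROR = 0.0014


def maf(alt_reads, depth):
    """Mutant allele frequency as a fraction; 0.0 when there is no coverage."""
    return alt_reads / depth if depth > 0 else 0.0


def is_personal_variant(wbc_alt, wbc_depth):
    """WBC allele frequency > 0.5% with at least two variant reads."""
    return wbc_alt >= 2 and maf(wbc_alt, wbc_depth) > 0.005


def is_tumour_somatic(t_alt, t_depth, wbc_alt, wbc_depth):
    """Apply the WBC filters, then the tumour coverage and MAF cut-offs."""
    if wbc_depth < 25:  # cannot assess germline status reliably
        return False
    if is_personal_variant(wbc_alt, wbc_depth):
        return False
    return t_depth >= 100 and maf(t_alt, t_depth) >= 0.05


def cfdna_snv_detected(alt, depth):
    """Rule (1): 4 altered reads and MAF > 0.5%.
    Rule (2): more than 4 altered reads and MAF above background."""
    frequency = maf(alt, depth)
    if alt == 4:
        return frequency > 0.005
    if alt > 4:
        return frequency > BACKGROUND_ERROR
    return False


def cfdna_indel_detected(alt):
    """INDELs: two or more altered reads, irrespective of MAF."""
    return alt >= 2
```

For example, under these assumptions a tumour-confirmed single-nucleotide variant with 4 altered reads at 600x (MAF 0.67%) would be called in cfDNA, whereas 4 altered reads at 1000x (MAF 0.40%) would not.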

To further establish the reliability of the mutations reported in cfDNA in our study, we checked the read counts of the non-reference, non-variant bases in the Integrative Genomics Viewer (IGV). These counts indicate the sequencing error rate at the specific site. Prediction of pathogenicity of somatic variants was based on the Combined Annotation Dependent Depletion (CADD) score [24]. We defined variants with a CADD score ≥ 20 as harmful. For the analyses of the mutational profile in tumour tissues, including the downstream pathway analysis, we focused on putative harmful mutations. DAVID v6.8 [25] was used for KEGG pathway analysis of the genes with harmful mutations. For the analyses of cfDNA, we included all somatic mutations, including non-harmful and silent mutations, as these can be equally informative for disease load.
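The site-specific error check can be expressed as a small calculation on per-base read counts. The sketch below is a hypothetical illustration rather than the procedure used in this study, in which the counts were inspected in IGV; the function names and the example counts are invented for the purpose of the example.

```python
def site_error(base_counts, ref, alt):
    """Return (reads, fraction) for the most frequent base that is
    neither the reference base nor the variant base."""
    depth = sum(base_counts.values())
    other = max(
        (n for base, n in base_counts.items() if base not in (ref, alt)),
        default=0,
    )
    fraction = other / depth if depth else 0.0
    return other, fraction


def variant_above_site_error(base_counts, ref, alt, max_error_reads=4):
    """True when the error reads number fewer than max_error_reads and
    their fraction is below the fraction of variant reads."""
    error_reads, error_fraction = site_error(base_counts, ref, alt)
    depth = sum(base_counts.values())
    variant_fraction = base_counts.get(alt, 0) / depth if depth else 0.0
    return error_reads < max_error_reads and error_fraction < variant_fraction


# Hypothetical read counts at one position (reference A, variant G).
example = {"A": 640, "G": 8, "C": 1, "T": 0}
print(site_error(example, "A", "G"))
print(variant_above_site_error(example, "A", "G"))
```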

Statistics

For non-normally distributed data sets, median and range are given, and significance was determined by Mann-Whitney-Wilcoxon rank sum test. A P-value < 0.05 was considered significant.
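As an illustration of the rank sum test, the following sketch computes the Mann-Whitney U statistic and a two-sided p-value using the normal approximation with tie and continuity corrections. It uses only the Python standard library and is not the software used for the analyses in this study, which is not specified here. For small samples an exact test can give a different p-value, and the example values are hypothetical.

```python
import math


def average_ranks(values):
    """Assign 1-based ranks, giving tied values their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def mann_whitney_u(x, y):
    """Return (U, two-sided p) using the normal approximation."""
    n1, n2 = len(x), len(y)
    pooled = list(x) + list(y)
    ranks = average_ranks(pooled)
    u1 = sum(ranks[:n1]) - n1 * (n1 + 1) / 2
    u = min(u1, n1 * n2 - u1)
    n = n1 + n2
    # Tie correction for the variance of U.
    tie_counts = {}
    for value in pooled:
        tie_counts[value] = tie_counts.get(value, 0) + 1
    tie_term = sum(t ** 3 - t for t in tie_counts.values())
    variance = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    if variance <= 0:
        return u, 1.0
    # Continuity correction; U is the smaller statistic, so z <= 0.
    z = min(0.0, u - n1 * n2 / 2 + 0.5) / math.sqrt(variance)
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return u, min(p_value, 1.0)


# Hypothetical cfDNA yields (ng per ml), not data from this study.
pre_surgery = [6.1, 7.4, 8.0, 9.3, 10.2]
post_surgery = [9.9, 12.5, 14.1, 18.7, 25.0]
print(mann_whitney_u(pre_surgery, post_surgery))
```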

Results

Patient characteristics

Characteristics of the ESCC patients are shown in Additional file 2. The cohort of 17 patients consisted of 12 males and 5 females (age range 42 to 77 years), all diagnosed with stage II to III disease. All patients were followed up for 24 months. Four of the 17 patients experienced disease progression at 4 to 21 months after surgery. For 2 of the 4 patients with progression, the amount of cfDNA in the pre-surgery plasma sample was too low for NGS analysis. Median cfDNA yield was 11.9 ng (range 4.86–38.6 ng) per ml of plasma. The amount of plasma cfDNA obtained before surgery was lower than after surgery (p = 0.015) (Fig. 2). No obvious differences in cfDNA yield were seen between samples obtained 3–4 h after surgery and those obtained 2–9 days after surgery.

Fig. 2

Cell-free DNA yield in pre- and post-surgery blood samples. DNA yields were calculated per millilitre of blood. Sixteen pairs of cfDNA samples were included, because yield information was not available for one of the two pre-surgery samples with insufficient cfDNA. The amount of plasma cfDNA isolated before surgery was lower than the amount obtained after surgery (p = 0.015, Mann-Whitney-Wilcoxon rank sum test)

Overview of NGS results

A summary of the sequencing data is shown in Additional file 3. A phred quality score of 30 (Q30) was achieved for 91% of the bases. Mean target coverage of all samples was 667x and more than 95% of the target bases reached a coverage of more than 100x. The average mismatch rate was 0.14% per alternative nucleotide. Additional file 4 gives an overview of the somatic mutations detected per sample and per patient. No somatic mutations were detected in the tumour samples of three of the patients, nor in their corresponding pre- and post-surgery cfDNA samples. We therefore excluded these three patients from further analysis.

Somatic mutations in tumour DNA

We detected a total of 131 somatic mutations with a median coverage of 348x (range 100x to 3131x) (Additional file 4). Sixty-three of the 131 mutations (48.1%) had a CADD score ≥ 20, indicating a putatively pathogenic effect. The median number of mutations per patient was 9 (range 2 to 17) and the median MAF was 21.0% (range 9.7 to 85.2%). TP53 was the most commonly mutated gene, with pathogenic mutations in 14 patients, followed by NOTCH1 (4 patients), CDKN2A (3 patients), KMT2C (2 patients) and PTEN (2 patients) (Table 1). Two recurrent mutations were observed, one in TP53 and one in CDKN2A. Pathway analysis of the 37 genes with harmful mutations indicated a total of 30 significantly enriched pathways (p-value < 0.01), each encompassing 4 to 9 mutated genes (Additional file 5). These include the PI3K-Akt/mTOR (9 genes), FoxO (7 genes), Jak-STAT (7 genes), Chemokine (7 genes) and Focal adhesion (7 genes) signalling pathways and the Proteoglycans in cancer (8 genes) pathway.

Table 1 Overview of all recurrently mutated genes (mutated in at least 3 patients)

Somatic mutations in pre- and post-surgery cfDNA

A subset of the mutations identified in tumour samples was also identified in cfDNA. No novel mutations were found in cfDNA of any of the patients, including the 3 patients without somatic mutations in the tumour. Five of the patients had no detectable somatic mutations in pre- and/or post-surgery cfDNA (Table 2). Seven patients had somatic mutations in pre-surgery, but not in post-surgery cfDNA. One patient (ESCC14) had 4 mutations in pre-surgery cfDNA and one mutation in post-surgery cfDNA. The remaining patient (ESCC07) lacked pre-surgery cfDNA, but did have one mutation in post-surgery cfDNA. As a control for the reliability of our filtering criteria, we analysed the number of sequencing error reads at the mutant base positions of all mutations detected in the tumour samples. In all cases this was less than 4 reads in all cfDNA samples and below the MAF observed for the cfDNA sample (Additional file 6).

Table 2 Overview of clinical characteristics and the number of somatic mutations in tumour DNA and cfDNA

The median number of mutations observed in pre-surgery cfDNA was 5 per patient. For two patients, a high proportion of the somatic mutations detected in the tumour were also detected in pre-surgery cfDNA (80% for both ESCC09 and ESCC10). The median on-target coverage for the cfDNA samples was 613x (range 391x to 839x) pre-surgery and 752x (range 546x to 1932x) post-surgery. There was no difference in the coverage at positions for which mutant reads were detected (median 673x, range 391x to 839x) as compared to positions for which no mutant reads were detected (median 543x, range 471x to 725x) in cfDNA (Fig. 3). The median mutant allele frequency was 1.3% in pre-surgery cfDNA (range 0.24 to 4.91%). For all somatic mutations, the MAFs in pre-surgery cfDNA were much lower than those observed in the tumour tissue (Fig. 4).
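To illustrate how coverage and allele frequency together determine whether a mutation can be picked up, the following sketch computes the probability of observing at least a given number of mutant reads under a simple binomial sampling model. The model is a simplifying assumption that was not part of the analysis in this study: it treats reads as independent draws, ignores sequencing errors, and ignores the limit imposed by the number of input DNA molecules. The function name is hypothetical.

```python
from math import comb


def prob_at_least(min_reads, depth, true_maf):
    """P(X >= min_reads) for X ~ Binomial(depth, true_maf)."""
    p_fewer = sum(
        comb(depth, k) * true_maf ** k * (1 - true_maf) ** (depth - k)
        for k in range(min_reads)
    )
    return 1 - p_fewer


# Median pre-surgery coverage (613x) and median MAF (1.3%) from the text,
# compared with the lowest observed pre-surgery MAF (0.24%).
for allele_fraction in (0.013, 0.0024):
    print(allele_fraction, round(prob_at_least(4, 613, allele_fraction), 3))
```

Under these assumptions, a mutation present at the median MAF would usually yield four or more mutant reads at the median coverage, whereas a mutation present at the lowest observed MAF would do so only occasionally.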

Fig. 3

Coverage at the target regions in cfDNA samples. a) pre-surgery cfDNA samples. b) post-surgery cfDNA samples. Mean target region coverage (black squares) and the coverage for the nucleotide positions for which somatic mutations were detected in the corresponding tumour samples are indicated. Dot colours indicate coverage at the nucleotide positions for which the mutant allele was (orange dots) or was not (green dots) detected in cfDNA

Fig. 4

Overview of mutant allele frequencies in tumour DNA (tDNA) and cfDNA per patient. All ctDNA allele frequencies are shown, including those with altered read number and allele frequencies below our threshold. Two different mutations were detected in CDKN2A in ESCC09 and in EPHA4 in ESCC10. For all mutations, a much lower MAF was observed in post-surgery cfDNA. The black boxes indicate the tumour-specific somatic mutations that were detected in either pre- or post-surgery cfDNA plasma samples

In 2 of the 14 post-surgery cfDNA samples, we identified one of the somatic mutations observed in the corresponding tumour tissue. In the remaining 12 post-surgery cfDNA samples, no mutations were observed. Mutant allele frequencies of the two mutations were 0.28% (ESCC07) and 0.36% (ESCC14). For one patient the time between surgery and blood collection was 3–4 h and for the second patient this was 6 days.

Correlation of cfDNA mutations with clinical characteristics

We detected cfDNA mutations in pre-surgery samples from 4 out of 6 stage II and 4 out of 6 stage III patients. Only one of the three patients with disease recurrence within 1 year had pre-surgery cfDNA. In this patient, 6 of the 12 somatic mutations observed in the corresponding tumour DNA were detectable in cfDNA. For the other two patients, we only had post-surgery cfDNA, and in one of the two we detected a single mutation out of three somatic mutations detected in the tumour samples.

Discussion

Early detection of tumour recurrence and a tool to evaluate treatment response in ESCC would allow us to optimize treatment strategy for individual patients. In the absence of effective prognostic biomarkers in ESCC [26], analysis of cfDNA might provide an easily accessible source of information to monitor disease load after surgery. In this study we performed targeted sequencing of pre- and post-surgery cfDNA and of matched tumour tissues and WBCs. Our main finding is that we were able to detect a subset of the tumour-specific somatic mutations in pre-surgery cfDNA for most of our stage II and stage III ESCC patients using as little as 3 ml blood. Most of these mutations were not detected in cfDNA of blood samples obtained as early as 3–4 h after surgery.

Even though our study cohort size is limited, this is the most extensive study thus far to compare pre- and post-surgery cfDNA in stage II and III ESCC patients. For 3 of the patients no somatic mutations were identified, despite the use of a broad cancer gene panel consisting of 483 genes. Thus, future studies should focus on a larger or ESCC-specific panel to allow detection of mutations in all patients.

Regrettably, due to lack of material we could not perform an independent validation of the mutations detected in cfDNA. To overcome this shortcoming, we monitored sequencing error rates for all positions for which we identified mutations in cfDNA using our predefined criteria. This showed that sequencing errors at these positions would not have passed our criteria. Moreover, in our previous NGS-based studies we validated close to 100% of the mutations called in the NGS data by the same pipeline for all variants with 4 or more reads by an independent technique [27, 28]. Finally, our pipeline for variant calling is based on the GATK workflow, and this pipeline was previously shown to have a sensitivity of 95% and positive predictive value of 99% [29]. Taken together, we consider the mutations in cfDNA as called in our study to be reliable.

Previous studies have demonstrated that ctDNA can be detected in most advanced-stage cancer patients with high sensitivity. This allows monitoring of therapeutic response, identification of tumour-specific variants relevant for choice of therapy, and detection of acquired resistance-induced mutations. Using ddPCR, somatic mutations have been detected in cfDNA of more than 75% of patients with advanced pancreatic, ovarian, colorectal, bladder, gastroesophageal, breast, melanoma, hepatocellular and head and neck cancers, but in less than 50% of primary brain, renal, prostate or thyroid cancers [18]. Thus, it is clear that the presence of ctDNA varies between cancer types. In our study cohort, somatic mutations were found in pre-surgery cfDNA in 8 out of 12 patients. Our data support the potential of using cfDNA as a biomarker of disease load for this patient group for whom no effective biomarker is currently available. The detection rate could be higher with optimized approaches for detection of variants with low MAF and using an extended ESCC-specific gene panel.

A high-throughput sequencing approach allows detection of somatic mutations in cfDNA in a more comprehensive target region, as compared to mutation-specific PCR approaches, albeit with a somewhat lower sensitivity. With the detection of ctDNA in 4 out of 6 stage II patients, our results are comparable to the findings in previous studies focusing on stage I or II colorectal, breast, lung, ovarian and pancreatic cancer, with 43 to 71% of patients harbouring somatic mutations in cfDNA [17, 30]. This indicated the presence of ctDNA even under a low tumour burden. In colorectal cancer a higher MAF in pre-operative cfDNA has been associated with disease recurrence and overall survival [17]. In our study, a stage IIA patient with a mean MAF of 2.4% in pre-surgery cfDNA, but without any detectable mutations in post-surgery cfDNA, developed a recurrence 5 months after surgery. Additional studies are needed to further prove the potential clinical relevance of high MAF in cfDNA.

Postoperative adjuvant chemoradiotherapy was recommended to eliminate micrometastatic disease and minimal residual disease. Unfortunately, there is no effective tool to assess minimal residual disease for early tailoring of adjuvant therapy to avoid both under- and overtreatment. In addition, there is no effective tool for early relapse surveillance prior to imaging. Detection of ctDNA after resection can point to minimal residual disease or even predict clinical relapse and poor outcome in different cancer types [7, 30, 31]. Due to the limited number of patients in our study and post-surgery treatment in some of them, we cannot reliably assess the potential clinical value of the presence of ctDNA in pre- and post-surgery cfDNA. Nevertheless, we did observe rapid changes in MAF in cfDNA as early as 3 to 4 h after surgery, which is consistent with the reported half-life of cfDNA ranging from 16 min to 2 h [32, 33]. The relatively short half-life of cfDNA makes it a good biomarker to monitor dynamic changes in disease load. Cellular damage due to the surgery may lead to an increased amount of cfDNA, as we observed in some post-surgery cases. This will lead to a decrease in the fraction of ctDNA. Thus, both cellular damage and the short half-life of cfDNA may cause the MAF to drop below the detection limit. Therefore, for residual disease monitoring it is advisable to draw blood a few days after surgery.
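The two effects discussed above, clearance and dilution, can be illustrated with a short calculation. The sketch below assumes first-order (exponential) clearance with no continued release of ctDNA, and treats dilution as a simple ratio of total cfDNA amounts. Both are simplifying assumptions, the function names are hypothetical, and the dilution example uses invented values.

```python
def fraction_remaining(hours_elapsed, half_life_hours):
    """Fraction of the initial ctDNA left after first-order clearance."""
    return 0.5 ** (hours_elapsed / half_life_hours)


def diluted_maf(maf, total_before, total_after):
    """MAF after total cfDNA changes while the mutant amount stays the same."""
    return maf * total_before / total_after


# Half-life range quoted in the text: 16 min to 2 h.
for half_life in (16 / 60, 2.0):
    for hours in (3, 4):
        remaining = fraction_remaining(hours, half_life)
        print(f"half-life {half_life:.2f} h, {hours} h after surgery: {remaining:.4f}")

# Hypothetical example: total cfDNA doubles after surgery.
print(diluted_maf(0.013, 10.0, 20.0))
```

With a half-life of 2 h, about 35% of the pre-surgery ctDNA would remain after 3 h and 25% after 4 h; with a half-life of 16 min, less than 0.1% would remain after 3 h. A doubling of total cfDNA would additionally halve the MAF.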

Theoretically, ctDNA could be used broadly to guide treatment and to monitor for treatment resistance or cancer recurrence. CtDNA is also more specific to tumour load compared to serum-based protein biomarkers such as cancer antigen 125 in ovarian cancer patients [16], and can be used for tumours for which no serum-based protein biomarkers are available, as is the case for ESCC [34].

Conclusion

We detected tumour-specific mutations in pre-surgery cfDNA of both stage II and stage III ESCC patients. In samples taken shortly after surgery, mutations were either undetectable or had a markedly lower MAF, which indicates that the presence of mutations in cfDNA correlates with tumour load. This implies that cfDNA may be used as a marker for the presence of tumour cells in ESCC patients. Larger studies are needed to establish the clinical applicability of cfDNA and its predictive value for treatment outcome, as we only had samples from three patients who relapsed after surgery.