Establishment of a DNA methylation marker to evaluate cancer cell fraction in gastric cancer
Tumor samples are unavoidably contaminated with coexisting normal cells. Here, we aimed to establish a DNA methylation marker to estimate the fraction of gastric cancer (GC) cells in any DNA sample by isolating genomic regions specifically methylated in GC cells.
Genome-wide and gene-specific methylation analyses were conducted with an Infinium HumanMethylation450 BeadChip array and by quantitative methylation-specific PCR, respectively. Purified cancer and noncancer cells were prepared by laser-capture microdissection. TP53 mutation data were obtained from our previous study using next-generation target sequencing.
Genome-wide DNA methylation analysis of 12 GC cell lines, 30 GCs, six normal gastric mucosae, one sample of peripheral leukocytes, and four noncancerous gastric mucosae identified OSR2, PPFIA3, and VAV3 as barely methylated in normal cells and highly methylated in cancer cells. Quantitative methylation-specific PCR using 26 independent GCs validated that one or more of them was highly methylated in all of the GCs. Using four pairs of purified cells, we confirmed the three genes were highly methylated (85 % or more) in cancer cells and barely methylated (5 % or less) in noncancer cells. The cancer cell fraction assessed by the panel of the three genes showed good correlation with that assessed by the TP53 mutant allele frequency in 13 GCs (r = 0.77). After correction of the GC cell fraction, unsupervised clustering analysis of the genome-wide DNA methylation profiles yielded clearer clustering.
A DNA methylation marker—namely, the panel of the three genes—is useful to estimate the cancer cell fraction in GCs.
Keywords: Gastric cancer; Cancer cell fraction; DNA methylation; Epigenetics
Extensive genomic and epigenomic analyses of a variety of human cancers, including gastric cancers (GCs), have been and are being conducted [1, 2, 3, 4]. However, these analyses are almost always affected by contamination from coexisting normal cells in primary cancer samples. Although genomic analyses are designed to detect mutations even in a small fraction of cells, they still fail to detect gene mutations in samples with a low fraction of cancer cells. Moreover, epigenomic and gene expression analyses are heavily affected by the fraction of cancer cells. To overcome the contamination from normal cells, laser-capture microdissection (LCM) is conducted [7, 8]. However, LCM is labor-intensive and time-consuming, and practically impossible for diffuse-type GCs.
Without purification of cancer cells, if a fraction of cancer cells in a sample can be assessed, a sample with an extremely low fraction of cancer cells can be excluded from subsequent analyses, or the data obtained may be corrected by the fraction of cancer cells. Such assessment has been generally conducted by an expert pathologist, which is time-consuming and almost impossible for diffuse-type GCs and a large number of samples. To overcome this limitation, efforts have been made to develop molecular markers. For example, cancer-cell-specific mutations identified by a single-nucleotide polymorphism microarray and next-generation sequencing can be used to assess the fraction of cancer cells [9, 10]. However, identification of such mutations must be conducted for each sample, and there is a sizable research cost for this approach.
To overcome these issues, in our recent study, we successfully isolated CpG islands specifically methylated in esophageal squamous cell carcinoma (ESCC) cells. Three genes were methylated in almost all ESCC cells, but were not methylated or were barely methylated in normal esophageal mucosae, and at least one of the three genes was methylated in virtually all of 28 ESCC cases analyzed. Therefore, a panel of the three genes was considered to be a DNA methylation marker for the fraction of cancer cells. Using the marker, we were able to correct the fraction of ESCC cells, and showed that tumor-suppressor genes were methylated in almost all cancer cells.
In this study, we aimed to isolate a DNA methylation marker that can be used to assess the fraction of cancer cells in GCs. Isolating such a marker is far more difficult for the stomach than for the esophagus because gastric mucosae can have very high levels of DNA methylation owing to Helicobacter pylori infection [12, 13, 14, 15], and GC samples are contaminated with such gastric mucosae. Therefore, we paid special attention to isolating marker genes whose methylation is not influenced by H. pylori infection.
Materials and methods
GC cell lines and tissue samples
Cell lines KATOIII, MKN45, NUGC3, MKN74, and MKN7 were purchased from the Japanese Collection of Research Bioresources (Tokyo, Japan), and the AGS cell line was purchased from the American Type Culture Collection (Manassas, VA, USA). Cell lines HSC39, HSC57, 44As3, and 58As9 were gifted by K. Yanagihara from the National Cancer Center, the TMK1 cell line was gifted by W. Yasui from Hiroshima University, and the GC2 cell line was established by M. Tatematsu at Aichi Cancer Center Research Institute.
A total of 56 primary GC samples (32 intestinal type and 24 diffuse type) were collected from surgical specimens of patients who had undergone gastrectomy, and 30 of the samples were used for our previous studies [1, 16]. Genome-wide DNA methylation and TP53 mutation data of the 30 GCs were obtained from one of the studies. Peripheral leukocyte samples were collected from five healthy volunteers by a centrifugation method. Gastric mucosae were collected by endoscopic biopsy from 17 healthy volunteers (11 without and six with present H. pylori infection) and from noncancerous gastric mucosae of 27 GC patients. Among the 27 noncancerous gastric mucosae, 23 (nine without and 14 with present H. pylori infection) were used for our previous study. H. pylori infection status was analyzed by a serum anti-H. pylori IgG antibody test (SRL, Tokyo, Japan), rapid urease test (Otsuka, Tokushima, Japan), or culture test (Eiken, Tokyo, Japan).
All of the samples, except for those used for LCM, were stored in RNAlater (Applied Biosystems, Foster City, CA, USA), and genomic DNA was extracted by the phenol–chloroform method. LCM was performed using formalin-fixed paraffin-embedded primary GCs by a Leica LMD7000 system [7, 18]. This study was conducted with the approval of the Institutional Review Board of the National Cancer Center. Written informed consent was obtained from all individuals.
Genome-wide DNA methylation analysis
Genome-wide DNA methylation analysis was performed using an Infinium HumanMethylation450 BeadChip array (Illumina, San Diego, CA, USA), which assessed the degree of methylation of 485,512 CpG sites. The methylation level of each CpG site was obtained as a β value, which ranged from 0 (completely unmethylated) to 1 (completely methylated). We excluded 11,551 CpG sites on the sex chromosomes, and the remaining 473,961 CpG sites were used for the analysis. Genomic blocks were defined as collections of CpG sites classified by their locations against transcription start sites and CpG islands.
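The probe filtering described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the record layout `(probe_id, chromosome, beta)`, the probe names, and the β values are hypothetical and do not come from the array data.

```python
# Minimal sketch: validate beta values and drop probes on the sex chromosomes.
# The record layout and all example values are hypothetical.
SEX_CHROMOSOMES = {"X", "Y"}


def is_valid_beta(beta):
    """A beta value must lie between 0 (unmethylated) and 1 (fully methylated)."""
    return 0.0 <= beta <= 1.0


def autosomal_probes(probes):
    """Return only the probes that are not located on chromosome X or Y."""
    kept = []
    for probe_id, chromosome, beta in probes:
        if not is_valid_beta(beta):
            raise ValueError(f"beta out of range for {probe_id}: {beta}")
        if chromosome not in SEX_CHROMOSOMES:
            kept.append((probe_id, chromosome, beta))
    return kept


example = [
    ("probe_1", "1", 0.03),
    ("probe_2", "X", 0.48),
    ("probe_3", "19", 0.91),
    ("probe_4", "Y", 0.10),
]

print(autosomal_probes(example))
# Site counts reported in the text: 485,512 total minus 11,551 on sex chromosomes.
print(485_512 - 11_551)  # 473961
```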
Gene-specific DNA methylation analysis
Gene-specific DNA methylation levels were analyzed by quantitative methylation-specific PCR (qMSP). For DNA from surgical specimens in RNAlater, 1 µg was digested with BamHI, treated with bisulfite, purified, and suspended in 40 µl of tris(hydroxymethyl)aminomethane–EDTA buffer, as described in [19, 20]. For formalin-fixed paraffin-embedded samples collected by LCM, DNA extraction and bisulfite treatment were conducted with an EpiTect Plus bisulfite kit (Qiagen, Hilden, Germany). qMSP was performed by real-time PCR using primers specific to methylated or unmethylated DNA (Table S1), the bisulfite-treated DNA, and SYBR Green I (BioWhittaker Molecular Applications, Rockland, ME, USA). The number of molecules in a sample was determined by comparing its amplification with that of standard DNA samples that contained known numbers of molecules (10^1 to 10^6 molecules). On the basis of the numbers of methylated and unmethylated molecules, a methylation level was calculated as the fraction of methylated molecules in the total number of DNA molecules (number of methylated molecules plus number of unmethylated molecules). As a fully methylated control, blood genomic DNA treated with SssI methylase (New England Biolabs, Beverly, MA, USA) was used. As a fully unmethylated control, blood genomic DNA amplified twice with Genomiphi (GE Healthcare, Piscataway, NJ, USA) was used.
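The qMSP quantification can be illustrated with a short Python sketch. The methylation-level formula (methylated molecules divided by methylated plus unmethylated molecules) is taken from the text. The standard-curve step is an assumption: a least-squares line relating the threshold cycle to log10 of the copy number is a common way to read molecule numbers from standards, but the text does not state which fitting procedure was used. All function names and numeric values are hypothetical.

```python
import math


def fit_standard_curve(copies, ct_values):
    """Fit ct = slope * log10(copies) + intercept by least squares.

    Assumed approach; the source only says samples were compared with
    standards containing known numbers of molecules.
    """
    xs = [math.log10(c) for c in copies]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ct_values) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ct_values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def copies_from_ct(ct, slope, intercept):
    """Invert the standard curve to estimate the number of molecules."""
    return 10 ** ((ct - intercept) / slope)


def methylation_level(methylated, unmethylated):
    """Fraction of methylated molecules among all molecules (from the text)."""
    total = methylated + unmethylated
    if total <= 0:
        raise ValueError("no molecules detected")
    return methylated / total


# Hypothetical standards spanning 10^1 to 10^6 molecules.
standards = [10 ** k for k in range(1, 7)]
standard_cts = [40.0 - 3.3 * k for k in range(1, 7)]
slope, intercept = fit_standard_curve(standards, standard_cts)

m = copies_from_ct(30.1, slope, intercept)  # methylated-specific reaction
u = copies_from_ct(33.4, slope, intercept)  # unmethylated-specific reaction
print(round(methylation_level(m, u), 3))
```

In practice the methylated and unmethylated primer sets would each need their own standard curve; a single curve is used here only to keep the sketch short.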
Gene expression analysis
Complementary DNA was synthesized from 1 µg of total RNA using SuperScript III (Invitrogen, Carlsbad, CA, USA). Quantitative reverse transcription PCR was performed using SYBR Green I and an iCycler thermal cycler. The measured number of complementary DNA molecules was normalized to that of GAPDH. The primers and PCR conditions are shown in Table S1.
Genomic DNA copy number analysis
Copy number alteration (CNA) of a specific genomic region was analyzed by quantitative real-time PCR using an iCycler thermal cycler and SYBR Green I. RPPH1 was used as a control gene located on a chromosomal region with infrequent CNA. The number of DNA molecules in a sample was measured for the control gene and three regions flanking the target gene (Table S1). The number of DNA molecules of the target gene was normalized to that of the control gene, and the normalized number of DNA molecules in a sample was compared with that in human leukocyte DNA to obtain the CNA. All analyses were conducted in duplicate. A CNA was defined as a twofold or greater increase (gain) or a decrease to 0.5-fold or less (loss) relative to leukocyte DNA.
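The normalization and the gain/loss thresholds above amount to a ratio of ratios, sketched below in Python. The function names and the molecule counts are hypothetical, and averaging of duplicates and of the three flanking regions is omitted for brevity.

```python
def copy_number_ratio(target_sample, control_sample,
                      target_leukocyte, control_leukocyte):
    """Target/control ratio in the sample, relative to the same ratio in leukocyte DNA."""
    sample_normalized = target_sample / control_sample
    reference_normalized = target_leukocyte / control_leukocyte
    return sample_normalized / reference_normalized


def classify_cna(ratio):
    """Apply the thresholds from the text: gain at 2-fold or more, loss at 0.5-fold or less."""
    if ratio >= 2.0:
        return "gain"
    if ratio <= 0.5:
        return "loss"
    return "no CNA"


# Hypothetical molecule counts (target gene vs. the RPPH1 control).
ratio = copy_number_ratio(4000, 1000, 2000, 1000)
print(ratio, classify_cna(ratio))
```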
Mutations of TP53 and mutant frequency
The TP53 mutation status and mutant frequency were obtained from our previous study. Briefly, the mutation was analyzed by target sequencing using an Ion AmpliSeq cancer panel kit (Life Technologies, Carlsbad, CA, USA) and an Ion PGM next-generation sequencer.
Statistical analysis
The correlation was analyzed using Pearson’s product-moment correlation coefficients, and its P value was obtained by the parametric hypothesis test. A difference in the mean DNA methylation level was analyzed by Student’s t test. A result was considered significant when the P value was less than 0.05 by a two-sided test.
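For readers who want to see the correlation test spelled out, the following Python sketch computes Pearson's r and the t statistic on which the usual parametric test is based (t = r·sqrt((n − 2)/(1 − r²)) with n − 2 degrees of freedom). It is a plain standard-library illustration with made-up numbers, not the software used in the study, and it stops short of the P value, which requires the t distribution.

```python
import math


def pearson_r(xs, ys):
    """Pearson product-moment correlation coefficient of two equal-length lists."""
    if len(xs) != len(ys) or len(xs) < 3:
        raise ValueError("need two lists of equal length with at least 3 points")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def correlation_t_statistic(r, n):
    """t statistic for testing r against zero, with n - 2 degrees of freedom."""
    if abs(r) >= 1.0:
        raise ValueError("t statistic is undefined for a perfect correlation")
    return r * math.sqrt((n - 2) / (1.0 - r * r))


# Made-up example values, not data from the study.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 2.0, 4.0]
r = pearson_r(xs, ys)
t = correlation_t_statistic(r, len(xs))
print(round(r, 3), round(t, 3))
```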
Selection of regions specifically methylated in GCs by a genome-wide screening
For the remaining ten regions, we attempted to design primers for qMSP, and primers for both methylated and unmethylated DNA were successfully designed for five regions of five genes (OSR2, VAV3, PPFIA3, LTB4R2, and DIDO1) (Fig. 1b). To confirm the genome-wide DNA methylation data obtained by the bead array, qMSP was conducted using the 12 GC cell lines mentioned in “GC cell lines and tissue samples” and one sample of peripheral leukocytes. DIDO1 had slight methylation in the peripheral leukocytes, and was excluded from further analysis. The methylation levels of the other four genes (LTB4R2, OSR2, VAV3, and PPFIA3) obtained by qMSP were in good accordance with the bead array data (Fig. S1).
Isolation of genes not influenced by H. pylori infection
We also analyzed the expression of OSR2, VAV3, and PPFIA3 using 17 normal gastric mucosa samples of H. pylori-negative (n = 11) and H. pylori-positive (n = 6) individuals. VAV3 was highly expressed in both H. pylori-positive and H. pylori-negative gastric mucosae, whereas OSR2 and PPFIA3 were only weakly expressed (Fig. S2).
High incidence of methylation of the three genes and their specificity using LCM-purified cells
To confirm that the three genes were highly methylated only in GC cells but not in coexisting noncancer cells, four pairs of cancer and noncancer cells were collected by LCM. We found that at least one of the three genes was highly methylated in GC cells (more than 85 %), but that all of them were barely methylated in noncancer cells (less than 5 %) (Fig. 3b). The highest methylation level of the three genes was considered to reflect the fraction of cancer cells, and we defined the panel of the three genes as a DNA methylation marker to estimate the cancer cell fraction in a GC sample.
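The marker definition above (the highest methylation level among the three genes is taken as the cancer cell fraction) can be written as a one-line rule. The Python sketch below is illustrative only: the function name and the example methylation levels are hypothetical.

```python
MARKER_GENES = ("OSR2", "PPFIA3", "VAV3")


def estimate_cancer_cell_fraction(methylation_levels):
    """Return the highest methylation level (0 to 1) among the three marker genes."""
    missing = [gene for gene in MARKER_GENES if gene not in methylation_levels]
    if missing:
        raise KeyError(f"missing methylation level for: {', '.join(missing)}")
    levels = [methylation_levels[gene] for gene in MARKER_GENES]
    if any(not 0.0 <= level <= 1.0 for level in levels):
        raise ValueError("methylation levels must lie between 0 and 1")
    return max(levels)


# Hypothetical qMSP methylation levels for one GC sample.
sample = {"OSR2": 0.12, "PPFIA3": 0.64, "VAV3": 0.40}
print(estimate_cancer_cell_fraction(sample))
```

Taking the maximum rather than the mean reflects the observation that not every gene is methylated in every GC, while at least one of the three is.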
Because DNA methylation levels of some genes can be influenced by age, we also analyzed the correlation between the methylation of the three genes and age. The methylation levels of the three genes were found to be independent of age (Fig. S3).
CNAs of the three genes
Correlation between the cancer cell fraction estimated by DNA methylation and that estimated by a genetic alteration
Application of the DNA methylation marker to correction of the bead array data
We successfully established a panel of three genes (OSR2, VAV3, and PPFIA3) as a marker to estimate the fraction of cancer cells in primary GCs. Using the DNA methylation marker, we were also able to identify and exclude samples with a low fraction of cancer cells, and to correct the methylation levels by the fraction of cancer cells. After this, the genome-wide DNA methylation profiles yielded clearer clustering of CIMP by unsupervised hierarchical clustering analysis. This is the first molecular marker for the cancer cell fraction in GC.
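One simple way to picture the exclusion and correction steps is sketched below in Python. This is an assumption-laden illustration, not the procedure used in the study: it assumes that noncancer cells contribute no methylation at the locus being corrected, so that the observed level equals the level in cancer cells multiplied by the cancer cell fraction, and the cutoff `min_fraction` is a hypothetical parameter whose value is not taken from the text.

```python
def correct_methylation(observed, cancer_fraction, min_fraction):
    """Estimate the methylation level within cancer cells.

    Assumes observed = level_in_cancer_cells * cancer_fraction, i.e. that
    noncancer cells are unmethylated at this locus. Returns None when the
    sample falls below the hypothetical exclusion cutoff `min_fraction`.
    """
    if not 0.0 <= observed <= 1.0:
        raise ValueError("observed methylation level must lie between 0 and 1")
    if not 0.0 < cancer_fraction <= 1.0:
        raise ValueError("cancer cell fraction must be in (0, 1]")
    if cancer_fraction < min_fraction:
        return None  # sample excluded from the analysis
    # Cap at 1 because measurement noise can push the ratio above full methylation.
    return min(1.0, observed / cancer_fraction)


# Hypothetical values: observed level 0.3 in a sample with 60 % cancer cells.
print(correct_methylation(0.3, 0.6, min_fraction=0.2))
print(correct_methylation(0.3, 0.1, min_fraction=0.2))
```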
Compared with microscopic examination and analysis of genomic alterations, the DNA methylation marker is simple to use and requires neither an experienced pathologist nor paired normal samples. The marker is also likely to have broad coverage in primary GCs because at least one of the three genes was highly methylated in all 26 primary GCs used for validation. Further, we were easily able to use the DNA methylation marker to assess the cancer cell fraction, even in diffuse-type GCs, for which even an expert pathologist has difficulty in estimating the cancer cell fraction. Finally, since the methylation levels of the three genes were independent of age, the marker is considered useful for estimating the cancer cell fraction irrespective of age.
The correlation of the cancer cell fraction estimated by the DNA methylation marker with TP53 mutant frequency was high (r = 0.77, P < 0.001). However, in two samples, the cancer cell fraction estimated by the marker was twice as large as that estimated by the TP53 mutant frequency. Since loss of heterozygosity can coexist with a mutation of TP53 in GCs, we speculated that the discrepancy between the two methods in the two GC samples might have been caused by the loss of heterozygosity of TP53.
Gastric mucosae, especially when infected with H. pylori, can have very high levels of DNA methylation, so we paid special attention to isolation of marker genes in this study. The panel of the three genes was not affected by H. pylori infection because the genes were barely methylated in H. pylori-positive mucosae. Only two samples from H. pylori-negative individuals showed high methylation, one of VAV3 and the other of PPFIA3. One possible reason for such high methylation levels in H. pylori-negative samples is that these two samples were contaminated with cancer cells, because they were obtained from GC patients. Another possible reason is that the genes were methylated in noncancer cells during past H. pylori infection.
A CNA can affect the methylation level of a marker gene. Therefore, we analyzed the CNAs of the three genes in 20 primary GCs used for the bead array analysis, and found CNAs of the three genes had little influence on the estimation of the cancer cell fraction. Regarding the expression of the three marker genes, only VAV3 was highly expressed in normal gastric mucosae. The region of VAV3, for which DNA methylation was analyzed, was outside the nucleosome-free region, suggesting that its transcription is not necessarily suppressed by the methylation.
In summary, a DNA methylation marker—namely, the panel of the three genes—was isolated, and was shown to be qualified to estimate the cancer cell fraction in GCs. Application of the marker to correction of the bead array data showed promising results for improving the accuracy of molecular analysis. The DNA methylation marker is expected to be useful in many aspects of GC research.
This work was supported by the Applied Research for Innovative Treatment of Cancer (H26-019) from the Ministry of Health, Labour and Welfare.
Conflict of interest
The authors declare that they have no conflict of interest.