Molecular profiling and comprehensive genome-wide analysis of somatic copy number alterations in gastric intramucosal neoplasias based on microsatellite status

Background We attempted to identify the molecular profiles of gastric intramucosal neoplasia (IMN; low-grade dysplasia, LGD; high-grade dysplasia, HGD; intramucosal cancer, IMC) by assessing somatic copy number alterations (SCNAs) stratified by microsatellite status (microsatellite stable, MSS; microsatellite instability, MSI). Microsatellite status was determined for all tumors: 84 were MSS and 16 were MSI. Methods One hundred differentiated-type IMNs were examined for SCNAs. In addition, genetic mutations (KRAS, BRAF, PIK3CA, and TP53) and DNA methylation status (low, intermediate, and high) were analyzed. Finally, we attempted to identify molecular profiles using hierarchical clustering analysis. Results IMNs with the MSS phenotype could be categorized into three patterns according to SCNAs: subgroups 1 and 2 showed a high frequency of SCNAs, and subgroup 3 displayed a low frequency of SCNAs (subgroup 1 > 2 > 3 for SCNA). Subgroup 1 could be distinguished from subgroup 2 by the total number of SCNAs (gains and losses) and the number of copy number gains (subgroup 1 > 2). The SCNA pattern of LGD was different from that of HGD and IMC. Moreover, IMNs with the MSI phenotype could be categorized into two subtypes: high frequency of SCNAs and low frequency of SCNAs. Genetic mutations and DNA methylation status did not differ among subgroups in IMNs. Conclusion Molecular profiles stratified by SCNAs based on microsatellite status may be useful for elucidating the mechanisms of early gastric carcinogenesis. Electronic supplementary material The online version of this article (10.1007/s10120-018-0810-5) contains supplementary material, which is available to authorized users.


Introduction
Gastric cancer (GC) remains one of the leading causes of cancer-related death worldwide, despite recent decreases in the incidence and mortality rates associated with this disease [1]. Histological classification of GC is important to understand the characteristics of gastric carcinogenesis [2][3][4][5]. Histogenesis of GC has been classified into intestinal and diffuse types, which are thought to involve different molecular pathways [1][2][3][4][5][6]. This classification is closely associated with not only clinicopathological findings but also molecular alterations in GC. According to this theory, separate molecular pathways should be established in intestinal and diffuse types [1,6]. However, guidelines for endoscopic therapy recommend that endoscopic therapy be limited mainly to the differentiated type, confined to intramucosal cancer or submucosal cancer of shallow depth (under 0.5 mm) [1,6]. Understanding the carcinogenesis of differentiated-type GC is therefore important from both pathological and clinical perspectives. In addition, gastric differentiated-type intramucosal neoplasia (IMN) is a heterogeneous entity that includes low-grade dysplasia (LGD), high-grade dysplasia (HGD), and intramucosal cancer (IMC) in terms of histological grading [7]. A classification system for intramucosal neoplasia (LGD, HGD, and IMC) has been developed by the World Health Organization (WHO), with guidelines published in 2011, and has been widely used around the world [6]. However, the association of these neoplastic conditions with molecular alterations has not been fully evaluated.
A recent study has shown that gastric molecular carcinogenesis can be classified into four subgroups: chromosomal instability (CIN), microsatellite instability (MSI), EBV, and genomically stable (GS) subtypes [8]. The CIN subtype is characterized by intestinal histology, TP53 mutations, and accumulation of copy number alterations, whereas the MSI subtype is closely associated with microsatellite instability, the CpG island methylator phenotype (CIMP), and MLH1 silencing [8]. In addition, the EBV subtype may be linked to PIK3CA mutations, CIMP (but not MSI status), and CDKN2A silencing. Finally, diffuse histology, mutations in CDH1 and RHOA, and CLDN18-ARHGAP fusions are characteristic factors in the GS subtype [8]. Moreover, this theory may be applicable to advanced cancer [8]. For example, the CIN subtype may not be an appropriate category for intramucosal cancer of the intestinal type, given that the chromosomal accumulation characterizing the CIN type at the molecular level may not yet have occurred in intramucosal cancer. According to this concept, there are additional subgroups, including CIN and non-CIN subtypes, in differentiated IMN.
Alternative molecular classifications have been used worldwide in gastrointestinal carcinogenesis, including microsatellite stable (MSS) and microsatellite instability (MSI) [8][9][10][11]. MSS and MSI subtypes are mutually exclusive. However, the MSS subtype is not thought to be a homogeneous entity and may be composed of different molecular alterations [8][9][10]. This concept enables us to elucidate molecular differences based on MSS and MSI status in IMNs.
In the present study, we aimed to identify the molecular differences in gastric IMNs between MSS and MSI based on stratification of SCNAs. In addition, we examined the associations of other molecular factors (mutations in TP53, BRAF, PIK3CA and KRAS and DNA methylation) with SCNAs in gastric IMNs based on MSI and MSS.

Patients
One hundred patients with intramucosal neoplasia (IMN) obtained by gastric endoscopic submucosal dissection (ESD) were enrolled in this study. Clinicopathological findings were recorded according to the general rules for management of the Japanese Gastric Cancer Association [12]. However, IMN was evaluated according to the modified WHO 2010 criteria [6]. Briefly, LGD was characterized by a uniform monolayer of columnar cells with basal nuclei showing minimal atypia. In HGD, nuclear atypia was more frequent, with nuclear pleomorphism, nuclear enlargement, and pseudostratification without stromal invasion. In IMC (differentiated type or intestinal type), there was marked cytological atypia and complex architecture with cribriform groups, irregular branching, glandular anastomosis, and budding of neoplastic cells into the lumen, which were considered representative of stromal invasion [6]. This study was approved by the local ethics committee of Iwate Medical University (Approval Number HGH28-25), and all patients provided informed consent.

Sampling of the lesions examined in this study
Tumor tissue was obtained from the resected stomach using biopsy forceps within 30 min of resection. Normal gastric mucosa distant from the tumor was removed from the submucosa using scissors, and, as a control, gastric biopsies from patients with IMN with chronic gastritis were included. Tumor tissues for clinicopathological analysis were obtained from a region of the resected stomach adjacent to the site used for molecular analysis (one sample was obtained as a representative sample). In this section, tumor cells accounted for at least 50% of the tissue.

DNA extraction
We stored the fresh tumor and normal samples at −80 °C until the molecular analysis. DNA was extracted from isolated normal and tumor tissue by sodium dodecyl sulfate (SDS) lysis and proteinase K digestion, followed by a phenol-chloroform procedure.

MSI analysis
The extracted DNA was amplified by polymerase chain reaction (PCR) with fluorescent dye-labeled primers targeting five microsatellite loci: BAT25, BAT26, D5S346, D2S123, and D17S250. Amplified DNA was detected using a DNA sequencer (PRISM 377; Perkin-Elmer Corp., Foster City, CA, USA), and fragment analyses were performed with GeneScan software (Perkin-Elmer Corp.), as previously described [13]. According to the NCI criteria [13], MSI-H was defined as instability in at least 2 of the 5 microsatellite loci, MSI-L as instability in only 1 locus, and MSS as instability in none of the loci. In the present study, MSI-L and MSS tumors were grouped together as MSS.
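The NCI rule and the grouping used in this study can be written as a small decision function. The following is a minimal sketch, not the authors' software: the function name `classify_msi` and the input format (a collection of marker names judged unstable) are assumptions made for illustration.

```python
# Hypothetical helper illustrating the NCI five-marker rule described above.
# The function name and input format are assumptions, not part of the study.
NCI_PANEL = ("BAT25", "BAT26", "D5S346", "D2S123", "D17S250")


def classify_msi(unstable_loci):
    """Return (nci_class, study_group) for the markers judged unstable."""
    unstable = set(unstable_loci)
    unknown = unstable - set(NCI_PANEL)
    if unknown:
        raise ValueError(f"markers not in the five-locus panel: {sorted(unknown)}")

    n_unstable = len(unstable)
    if n_unstable >= 2:
        nci_class = "MSI-H"   # instability in at least 2 of 5 loci
    elif n_unstable == 1:
        nci_class = "MSI-L"   # instability in exactly 1 locus
    else:
        nci_class = "MSS"     # no unstable locus

    # In this study, MSI-L tumors were grouped together with MSS tumors.
    study_group = "MSI" if nci_class == "MSI-H" else "MSS"
    return nci_class, study_group
```

For example, a tumor unstable only at BAT26 would be MSI-L under the NCI criteria but would be analyzed in the MSS group.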

Analysis of TP53 and PIK3CA mutations
Mutations in TP53 (exons 5-8) and PIK3CA (exons 9 and 20) were assessed by PCR single-strand conformation polymorphism (PCR-SSCP) analysis and sequencing as previously described [14]. After PCR-SSCP analysis, direct sequencing of the abnormal bands was performed. The PCR products were sequenced by the dideoxy chain termination method.

Analysis of KRAS and BRAF mutations
Mutations in KRAS (exon 2) and BRAF (exon 15; V600E) genes were examined using a pyrosequencer (Pyromark Q24; Qiagen NV), as previously described [15]. The primer design used in the present study was previously described [16]. The cutoff value for the mutation assay was 15% mutant alleles.

Pyrosequencing for evaluation of methylation
We used a two-panel method to determine the methylation status as previously described [17,18]. The DNA methylation status of each gene promoter region was established by PCR analysis of bisulfite-modified genomic DNA (EpiTect Bisulfite Kit; Qiagen) using pyrosequencing for quantitative methylation analysis (Pyromark Q24; Qiagen NV). Briefly, 6 markers (RUNX3, MINT31, LOX, NEUROG1, ELMO1, and THBD) were selected for determination of the genome-wide methylation status [17]. After methylation analysis of the first panel of 3 markers (RUNX3, MINT31, and LOX), tumors with high methylation epigenomes (HMEs) were defined as those with at least 2 methylated markers. The remaining tumors were examined using the second panel of 3 markers (NEUROG1, ELMO1, and THBD). Tumors with intermediate methylation epigenomes (IMEs) were defined as those with at least 2 methylated markers in this second panel, and tumors not classified as having HMEs or IMEs were designated as having low methylation epigenomes (LMEs).
The cutoff value for the methylation assay was 30% of the tumor cells, as previously reported [15].
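The two-step panel logic can be sketched as follows. This is an illustrative sketch only: the function names and the input format (a mapping from marker name to percent methylation) are assumptions, and treating a marker as methylated when its value is at or above the 30% cutoff (rather than strictly above) is also an assumption, since the text does not state how ties are handled.

```python
# Illustrative sketch of the two-panel methylation classification.
# Function names, input format, and the ">=" tie rule are assumptions.
PANEL_1 = ("RUNX3", "MINT31", "LOX")
PANEL_2 = ("NEUROG1", "ELMO1", "THBD")
CUTOFF_PERCENT = 30.0


def count_methylated(levels, panel, cutoff=CUTOFF_PERCENT):
    """Count markers in `panel` whose methylation level reaches the cutoff."""
    return sum(1 for marker in panel if levels[marker] >= cutoff)


def classify_epigenome(levels):
    """Return 'HME', 'IME', or 'LME' for one tumor."""
    # Step 1: first panel; at least 2 of 3 methylated markers gives HME.
    if count_methylated(levels, PANEL_1) >= 2:
        return "HME"
    # Step 2: remaining tumors are tested with the second panel.
    if count_methylated(levels, PANEL_2) >= 2:
        return "IME"
    # Step 3: everything else is LME.
    return "LME"
```

Note that the second panel is consulted only when the first panel does not already classify the tumor as HME, which mirrors the sequential design described above.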

Somatic copy number alteration (SCNA) analysis
Extracted DNA was adjusted to a concentration of 50 ng/μL. All 100 paired samples were assayed using the Infinium HumanCytoSNP-12v2.1 BeadChip (Illumina, San Diego, CA, USA), which contained 299,140 single nucleotide polymorphism (SNP) loci, according to the Illumina Infinium HD assay protocol [19]. BeadChips were scanned using iScan (Illumina) and analyzed using GenomeStudio software (v.2011.1; Illumina). Log R ratio (LRR) and B allele frequency (BAF) data from each sample were exported from normalized Illumina data using GenomeStudio. Data analysis was conducted with KaryoStudio 1.4.3 (copy number variation [CNV] Plugin v3.0.7.0; Illumina). The program was used with default parameters. CNAs were classified by SCNA partition algorithms. An LRR of 0 indicated a normal diploid region, an LRR greater than 0 indicated a copy number gain, and an LRR less than 0 indicated a copy number loss, i.e., loss of heterozygosity (LOH). BAF values ranged from 0 to 1; homozygous SNPs had BAFs near 0 (A-allele) or 1 (B-allele), whereas heterozygous SNPs in diploid regions had BAFs near 0.5 (AB genotype). Additionally, LRR and BAF data were used to identify regions of hemizygosity and copy-neutral LOH.
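To make the LRR/BAF interpretation concrete, the sketch below labels a single genomic segment from its SNP-level values. It is a deliberately simplified toy, not the KaryoStudio algorithm: the function name `call_segment` and all thresholds (`lrr_tol`, `het_band`, `min_het_fraction`) are hypothetical values chosen for illustration.

```python
from statistics import mean


def call_segment(lrr_values, baf_values,
                 lrr_tol=0.15,             # hypothetical tolerance around LRR = 0
                 het_band=(0.35, 0.65),    # hypothetical "near 0.5" BAF window
                 min_het_fraction=0.10):   # hypothetical minimum heterozygous share
    """Label one segment as 'gain', 'LOH', 'copy-neutral LOH', or 'normal'."""
    mean_lrr = mean(lrr_values)
    low, high = het_band
    het_fraction = sum(low <= b <= high for b in baf_values) / len(baf_values)

    if mean_lrr > lrr_tol:
        return "gain"              # LRR above 0: copy number gain
    if mean_lrr < -lrr_tol:
        return "LOH"               # LRR below 0: copy number loss
    if het_fraction < min_het_fraction:
        return "copy-neutral LOH"  # LRR near 0 but no heterozygous SNPs
    return "normal"                # LRR near 0 with BAF values near 0.5
```

A real pipeline would also compare each tumor with its paired normal sample, because a germline run of homozygosity produces the same LRR and BAF picture as somatic copy-neutral LOH.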

Statistical analysis
Hierarchical analysis was performed for clustering the samples according to the SCNA pattern in order to achieve maximal homogeneity for each group and the highest difference between groups. The clustering algorithm was set to centroid linkage clustering, the standard hierarchical clustering method used in biological analyses [20]. Briefly, the basic theory is to assemble a set of items (e.g., CNA) into a tree, where items are joined by very short branches if they are very similar to each other and by increasingly longer branches as their similarity decreases. The first step in hierarchical clustering is to calculate the distance matrix between the CNA data. Once this matrix of distances is computed, the clustering begins. Hierarchical processing consists of repeated cycles where the two closest remaining items (those with the smallest distance) are joined by a node/branch of a tree, with the length of the branch set to the distance between the joined items. The two joined items are removed from the list of items being processed and replaced by an item that represents the new branch. The distances between this new item and all other remaining items are computed, and the process is repeated until only one item remains. The process allows us to perform appropriate cluster analysis on our CNA database.
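The merge cycle described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the software used in the study: each sample is represented by a numeric vector, Euclidean distance between cluster centroids is assumed as the metric (the text does not name one), and the function name `centroid_linkage` is hypothetical.

```python
from math import dist


def centroid_linkage(points):
    """Agglomerative clustering with centroid linkage.

    Returns one (id_a, id_b, distance, new_size) tuple per merge. Original
    items have ids 0..n-1; each merged cluster receives the next unused id.
    """
    # Each active cluster is stored as id -> (centroid, number of members).
    clusters = {i: (tuple(p), 1) for i, p in enumerate(points)}
    next_id = len(points)
    merges = []

    while len(clusters) > 1:
        ids = sorted(clusters)
        # Find the two closest remaining items (smallest centroid distance).
        a, b = min(
            ((i, j) for k, i in enumerate(ids) for j in ids[k + 1:]),
            key=lambda pair: dist(clusters[pair[0]][0], clusters[pair[1]][0]),
        )
        (centroid_a, size_a), (centroid_b, size_b) = clusters.pop(a), clusters.pop(b)
        distance = dist(centroid_a, centroid_b)

        # Replace the joined items by one item: the size-weighted centroid.
        size = size_a + size_b
        centroid = tuple(
            (size_a * x + size_b * y) / size for x, y in zip(centroid_a, centroid_b)
        )
        clusters[next_id] = (centroid, size)
        merges.append((a, b, distance, size))
        next_id += 1

    return merges
```

As a usage example with made-up per-sample vectors of (gains, LOHs, copy-neutral LOHs), `centroid_linkage([(212, 0, 101), (200, 5, 90), (32, 2, 3), (30, 1, 4)])` first joins the two low-count samples, then the two high-count samples, and finally joins the two resulting clusters.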
Data obtained for clinicopathological findings, histological features, mutation status, and methylation status in each subgroup were analyzed using chi-square tests with the aid of Stat Mate-III software (Atom, Tokyo, Japan). If statistical differences among the 3 groups were found, pairwise comparisons between two groups were then performed using chi-square tests (Stat Mate-III software). Differences in age distributions among the 3 groups were evaluated using the Kruskal-Wallis H test with the aid of Stat Mate-III software (Atom). Differences with p values of less than 0.05 were considered significant.
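The overall-then-pairwise chi-square procedure can be illustrated with a small standard-library sketch. It is not Stat Mate-III and may differ from it in detail: it computes the Pearson statistic without continuity correction, supports only 1 or 2 degrees of freedom (enough for a 2 x 2 or a 3 x 2 table), and assumes all row and column totals are positive. The function name `chi_square_test` is hypothetical.

```python
from math import erfc, exp, sqrt


def chi_square_test(table):
    """Pearson chi-square test for a contingency table given as a list of rows.

    Returns (statistic, degrees_of_freedom, p_value). No continuity
    correction is applied; only df = 1 or df = 2 is supported here.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)

    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            statistic += (observed - expected) ** 2 / expected

    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    if df == 1:
        p_value = erfc(sqrt(statistic / 2))  # chi-square survival function, 1 df
    elif df == 2:
        p_value = exp(-statistic / 2)        # chi-square survival function, 2 df
    else:
        raise ValueError("this sketch only handles 1 or 2 degrees of freedom")
    return statistic, df, p_value
```

In the workflow described above, a 3 x 2 table (three subgroups by presence or absence of a feature) would be tested first, and 2 x 2 tables for pairs of subgroups would be tested only if the overall p value were below 0.05.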

Molecular classification
IMNs with MSS and MSI phenotypes were identified in 84 and 16 tumors, respectively. Clinicopathological findings based on MSS and MSI status are shown in Table 1.
We carried out hierarchical clustering analysis based on the SCNA pattern, including gains, LOHs, and copy-neutral LOHs, to examine differences in genetic alterations in samples from patients with IMNs with the MSS and MSI phenotypes. In the clustering figures, the vertical line shows copy number alterations, and the horizontal lines denote "relatedness" between samples and SCNAs at the chromosomal loci.

Hierarchical clustering analysis based on the SCNA patterns in IMNs with the MSI-high phenotype
We also performed hierarchical clustering analysis of IMNs with the MSI phenotype based on the SCNA pattern.

SCNAs between tumors in subgroups 1 and 2 categorized based on CNAs in IMNs with the MSI-high phenotype
The SCNAs of all chromosomes according to each subgroup are shown in Fig. 4a-c. The mean total number of chromosomal aberrations per patient was 321, with an average of 212 gains (range 103-273), 0 LOHs (range 0-49), and 101 copy-neutral LOHs (range 0-328) in subgroup 1. In addition, the mean total number of chromosomal aberrations per patient was 37, with an average of 32 gains (range 0-82), 2 LOHs (range 0-10), and 3 copy-neutral LOHs (range 0-10) in subgroup 2. There was a significant difference in the total number of SCNAs between subgroups 1 and 2 (p < 0.01; p < 0.001). Moreover, significant differences in the average number of copy number gains were found between subgroups 1 and 2 (p < 0.01; p < 0.01). LOH and copy-neutral LOH were not different in the two subgroups.

Differences in SCNAs between tumors in subgroups 1 and 2 categorized based on CNAs in IMN with the MSI phenotype
Finally, we examined differences in SCNAs between the two subgroups. Regions of gains detected in more than 50% of cases were selected for comparison of each group.

Discussion
SCNAs have important roles in activating cancer-related genes, and an understanding of the biological and phenotypic effects of SCNAs may lead to substantial advances in cancer diagnostics and therapeutics [3]. Examination of SCNAs in tumor cells may provide information for predicting tumor aggressiveness [3] because SCNAs are closely associated with driver events that are acquired during cancer evolution [3,21]. Determining how SCNAs promote the early phase of cancer is an important goal in human neoplasia. Thus, determination of SCNA patterns in IMNs may provide useful insight into early gastric carcinogenesis [21]. Moreover, microsatellite status (MSS or MSI) has been shown to be closely associated with molecular alterations in human neoplasia [9]. Thus, researchers are interested in examining the associations of SCNAs with microsatellite status in IMN. To the best of our knowledge, this study is the largest analysis to date of high-resolution copy number profiles of IMN specimens according to MSS and MSI.
In the present study, IMN with the MSS phenotype was stratified into three subgroups using a hierarchical clustering analysis performed based on SCNA patterns. Although tumors in subgroup 1 (6/84, 7.1%) were characterized by multiple SCNAs, tumors in subgroup 2 (12/84, 14.3%) were associated with fewer SCNAs. Tumors in subgroup 1 were distinguished from those in subgroup 2 by frequent specific SCNAs (see Table 2). In contrast, tumors showing only a few SCNAs were aggregated into subgroup 3, which accounted for most of the IMNs examined in this study (78.6%). In addition, these findings suggest that differences in the number of SCNAs may be regarded as a chromosomal destruction index. Moreover, although there are 3 genetic pathways according to SCNA patterns in early gastric carcinogenesis, the IMNs we examined could be largely classified into two categories: high-frequency CNAs (subgroups 1 and 2) and low-frequency CNAs (subgroup 3). A previous study showed that copy number alterations arise as a result of preferential selection that favors cancer development [3,22]. Our results suggested that this selection had already occurred by the early phase of GC.
LOHs at many chromosomal loci, including 4q, 5q, 8p, and 9p, have been reported in previous studies [8,19,21,22]. In the present study, however, the frequency of LOHs was low compared with those of previous studies, including those reported in The Cancer Genome Atlas, which is used as a common molecular reference [8]. Unfortunately, it is still unclear why there are differences in the frequencies of CNAs, including gains, LOH, and copy-neutral LOH, between the present and previous results. The first potential reason is that the platform used in the present study is different from that of previous studies [19]. Second, interstitial cells in the examined samples may influence the molecular analysis [23]. Indeed, LOH (loss of genetic material) is known to be more affected than genetic gain by the dilution effect of interstitial cells contained in the sample [23]. Accordingly, CNA gains may be more easily detected than LOH. In the present study, however, we carefully confirmed that the samples we examined contained more than 50% tumor cells. In addition, a previous study showed that copy number gains are more common than losses across the entire genome in tumor tissues compared with paired normal tissues [24]. This study may support our results. Moreover, among the CNA gains we identified in the present study, 8q, which contains c-MYC, may be a common and important chromosomal locus showing frequent gains in GCs. This finding may contribute to targeting the amplified "driver genes" in early gastric neoplasia. Finally, we believe that the high quality of the samples we examined was preserved.
Genetic mutations in tumors characterize the genetic features of the tumor cells [10]. Here, we did not find differences in the frequencies of genetic mutation (KRAS, BRAF, PIK3CA and TP53) between the three subgroups. However, previous studies have shown that TP53 mutations are closely associated with early gastric carcinogenesis [25][26][27][28]. In a recent study, Fassan et al. showed the molecular similarity between high-grade intraepithelial neoplasia and early GC using a high-throughput mutation profiling method [29]. In addition, they suggested a relevant role for TP53 mutations in early cancers. However, TP53 mutations were rarely found in the IMNs examined in this study. The current findings suggested that chromosomal accumulation in intramucosal tumor cells played an important role in the development of intramucosal tumors rather than specific mutations, including TP53 mutations.
The histological diagnosis of intramucosal neoplasia differs substantially between Western and Japanese pathologists [30][31][32]. This difference results most commonly from histological criteria of stromal invasion [6,29,30,31]. The WHO histological classification of gastric neoplasia published in 2011 proposed new histological criteria for stromal invasion occurring in the mucosa propria [6,31]. In the present study, LGD was more frequent in subgroup 3 than in subgroups 1 and 2, whereas IMC was significantly more frequent in tumors in subgroups 2 and 3 than in tumors in subgroup 1. However, no differences in the frequencies of HGD were found among the three subgroups. The associations of histological findings of IMNs with SCNAs have not been reported to date. Thus, LGD was characterized by a low frequency of SCNAs exhibiting an indolent nature, whereas IMC was closely associated with a high frequency of SCNAs showing a highly aggressive nature. Moreover, HGD was commonly observed in tumors in all subgroups. These findings validated the WHO classification of intramucosal neoplasia in terms of molecular alterations.
GC with the MSI-high phenotype is a distinct clinicopathological entity in gastric carcinogenesis [4,5]. However, it is unclear whether the pathological and molecular concepts that have been identified in CRC with the MSI-high phenotype can be applied to GC with the MSI-high phenotype, particularly for IMNs with the MSI-high phenotype [4,5]. Recent studies have shown that IMN with the MSI-high phenotype has multiple chromosomal alterations that are different from those of CRC with the MSI-high phenotype [4,5]. In the present study, the molecular profile of IMN with the MSI-high phenotype was sub-classified into two subgroups based on SCNA patterns. Tumors in subgroup 1 were characterized by multiple SCNAs, whereas tumors in subgroup 2 were closely associated with a low frequency of SCNAs. In addition, these findings were supported by the observation of significant differences in the frequencies of SCNAs between subgroups 1 and 2. Additionally, significant differences in the frequencies of specific SCNAs (3p11.1-p12.1, 3p14.2, 3q11.2, 3p12.2-p14.1, 3p14.3-p24.3, 3q11.1, and 3q12.1-q29) were found between tumors in subgroups 1 and 2. However, to the best of our knowledge, previous studies have not identified appropriate candidate genes showing amplification at these chromosomal loci in GCs. These findings suggested that there were two different subtypes in IMN with the MSI-high phenotype in terms of molecular profiles. In addition, our results suggested that the two subgroups showed distinct tumor characteristics, i.e., aggressive and indolent tumors. Interestingly, these findings demonstrated that there were two distinct molecular profiles in intramucosal neoplasia with the MSI phenotype.
In the present study, there were some limitations that may hinder application of the findings to clinical practice. Our results did not identify the clinicopathological differences between the subgroups. If the histological characteristics of IMNs based on each subgroup were identified, this information may be useful for routine histological diagnosis by pathologists. In addition, such studies will provide insights into determining the appropriate therapeutic plan and cutoff values of SCNAs quantified in IMNs. To the best of our knowledge, this is the first report analyzing SCNAs based on MSI and MSS in IMNs. Unfortunately, we could not validate the SCNA pattern in gastric IMNs using a second cohort. Further studies are needed to obtain this information.
In conclusion, we suggest that there are three subgroups based on SCNA patterns in IMN with the MSS phenotype. Tumors in subgroups 1 and 2 were characterized by multiple SCNAs, whereas those in subgroup 3 were characterized by a low frequency of SCNAs. Tumors in subgroup 1 could be distinguished from those in subgroup 2 in terms of specific SCNAs. These findings supported the concept that GC is a heterogeneous disease, even in the early phase (IMN) of gastric carcinogenesis. In contrast, our data showed that IMN with the MSI-high phenotype could be categorized into two subgroups based on SCNAs. This is the first study showing that a high frequency of SCNAs exists in IMN with the MSI phenotype. This concept will assist with histological diagnosis, endoscopic treatment, and therapeutic planning by providing novel insight into the mechanisms of early gastric carcinogenesis.