Introduction

Gastric cancer is the fourth most common cancer in the world and a leading cause of cancer death [1]. In 2008, it caused 738,000 deaths (10 % of all cancer-related deaths) [2]. Gastric cancer is especially prevalent in East Asia, Eastern Europe, and parts of Central America and South America [2]. Current treatments offer only slight survival benefits. Except in Japan, where endoscopic screening often detects early-stage tumors, the overall 5-year survival rate is 20–25 % [3].

Although there have been many studies of loss of heterozygosity (LOH) and copy-number loss in gastric cancer [4–9], to our knowledge none of these studies systematically surveyed copy-number loss and its effects on genes retarding proliferation or genes that, when deleted, might constitute therapeutic vulnerabilities. At present, high-density microarrays provide simultaneous assessment of single-nucleotide polymorphism (SNP) genotype and genomic copy number at hundreds of thousands of sites across the genome, and can thus delineate regions of copy-number loss [10].

It has recently emerged that copy-number loss is likely important in two distinct aspects of cancer biology. In one aspect, it appears that copy-number loss can promote proliferation by reducing expression of genes that would otherwise inhibit it; these have been termed "suppressor of tumorigenesis and/or proliferation genes" (STOP genes) [11]. These genes were previously identified in short-hairpin RNA screens for genes that tend to inhibit proliferation. In subsequent statistical analysis across more than 25 cancer types, these genes were found to be enriched in regions of recurrent deletion as determined by the Genomic Identification of Significant Targets In Cancer (GISTIC) method [12, 13].

In the second aspect, it is likely that copy-number loss often affects innocent bystander genes; the copy-number loss of these genes per se might not promote oncogenesis but instead incidentally makes cells more vulnerable to drugs targeting these genes. The model is that some of these genes already have reduced expression due to copy-number loss, and, as a consequence, would be more susceptible to inhibition by drugs. Such genes have been dubbed "Copy-number alterations Yielding Cancer Liabilities Owing to Partial losS genes" (CYCLOPS genes) [14]. These are conceptually distinct from STOP genes. The deletion of STOP genes confers a selective advantage to cancer cells, but, by contrast, the deletion of CYCLOPS genes is merely incidental, even though it presents a therapeutic opportunity. Nijhawan et al. [14] recently generated a list of probable CYCLOPS genes by associating information on cancer cell lines’ dependency on genes with information on copy-number loss of the genes in these cell lines. As determined in that previous study, a likely CYCLOPS gene was one with the property that cell lines that had copy-number loss at that gene also tended to be sensitive to the gene’s knockdown.

The criteria for CYCLOPS genes are more stringent than those for STOP genes, and this is reflected in their numbers: 55 CYCLOPS genes [14] compared with 878 STOP genes [11]. The list of CYCLOPS genes was generated on the basis of an observed association of copy-number loss with sensitivity to knockdown. By contrast, the list of STOP genes was based solely on the observation of reduced proliferation in cells in which the genes were knocked down, although subsequent analysis of STOP genes showed an aggregate statistical association with copy-number loss.

It is unknown to what extent copy-number loss of STOP genes plays a role in gastric adenocarcinoma and to what extent gastric adenocarcinomas harbor deletions of CYCLOPS genes. To investigate these questions, in the present study we used assays of approximately 906,600 SNPs in 74 tumors and matched nonmalignant tissue to delineate high-resolution, comprehensive views of copy-number loss and LOH in gastric adenocarcinomas. We then investigated the effects of copy-number loss on STOP and CYCLOPS genes in these tumors.

Materials and methods

Patients and samples

Primary gastric adenocarcinomas and matched nonmalignant tissue samples were obtained from Singapore Health Services with approval from the institutional review board. All samples were obtained with signed informed consent. Table S1 summarizes tumor and patient characteristics. For some of the tumors, the pathologist-estimated tumor content was very low, in some cases zero. We nevertheless analyzed these tumors because our experience has shown that pathologists, working with a portion of the surgically resected material different from that of the frozen sample from which DNA was extracted, often produce estimates of tumor content very different from those detected in DNA from the frozen portions of the tumor. Furthermore, tumors with very low tumor content can later be excluded from analysis because they have flat B-allele frequencies (BAFs) across the entire genome, as discussed in detail in "Results" and in "ASCAT profiling of allele-specific copy numbers."

DNA extraction and hybridization

Genomic DNA from snap-frozen gastric tumors and adjacent nonmalignant gastric tissues was extracted with a Qiagen genomic DNA extraction kit. The DNA was then hybridized to Affymetrix Human Mapping SNP 6.0 arrays (Affymetrix, Santa Clara, CA, USA) according to the manufacturer’s protocol. The chips were scanned with a GeneChip scanner using the Affymetrix GeneChip Operating Software. SNP positions were represented according to the hg18 (build 36) version of the human genome reference sequence. Some of the array data were previously published in [15]. All the array data used in this work have been deposited in Gene Expression Omnibus (accession numbers GSE31168 and GSE67965).

SNP array data preprocessing

We used Copy-number estimation using Robust Multichip Analysis version 2 (CRMA v2) [16] to extract intensity values for both alleles of each SNP from the SNP array data in the CEL files. In this process, CRMA attempts to account for (1) cross talk between alleles, (2) probe-sequence effects, and (3) the effects of the various sizes of fragments generated by restriction enzyme digestion before hybridization. We then processed each tumor and nonmalignant pair with TumorBoost [17] to increase the signal-to-noise ratio of allele-specific signals. This improved the ability of subsequent analysis to detect copy-number loss, LOH, and allelic imbalance. Matched nonmalignant samples were used as the reference to generate log2 R ratios (LRRs) and BAFs for the SNPs. The LRR of a SNP is the log2 of the signal intensity at that SNP (summed over both alleles) in the tumor sample divided by the signal intensity in the matched nonmalignant sample. The BAF of a SNP is the proportion of the total signal in the tumor that derives from the nonreference allele [the nonreference allele is designated the B allele, whence the term “B-allele frequency” (BAF)].
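
The two definitions above can be illustrated with a minimal sketch. This is not the CRMA v2 or TumorBoost implementation; it only restates the arithmetic of the LRR and BAF definitions for a single SNP. The function names and the intensity values are hypothetical and chosen for illustration.

```python
import math


def log_r_ratio(tumor_a, tumor_b, normal_a, normal_b):
    """LRR: log2 of the total tumor signal (both alleles summed)
    divided by the total signal in the matched nonmalignant sample."""
    return math.log2((tumor_a + tumor_b) / (normal_a + normal_b))


def b_allele_frequency(tumor_a, tumor_b):
    """BAF: proportion of the total tumor signal that derives from the B allele."""
    return tumor_b / (tumor_a + tumor_b)


# Hypothetical allele intensities for one SNP (illustrative values, not study data).
lrr = log_r_ratio(tumor_a=600.0, tumor_b=200.0, normal_a=800.0, normal_b=800.0)
baf = b_allele_frequency(tumor_a=600.0, tumor_b=200.0)
print(lrr, baf)
```

In this toy case the tumor signal is half the reference signal, so the LRR is negative, and the B allele contributes a quarter of the tumor signal, which is the kind of divergence from 0.5 that indicates allelic imbalance at a heterozygous SNP.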

ASCAT profiling of allele-specific copy numbers

We used the Allele-Specific Copy number Analysis of Tumors (ASCAT) program [10] to estimate allele-specific copy numbers from the LRRs and BAFs while accounting for the effects of cancer-cell polyploidy and aneuploidy and the effects of the admixture of DNA from nonmalignant cells (Fig. 1). We selected ASCAT after we had evaluated several other analytical software packages, including Copy Number Analyzer for GeneChip (CNAG) [18] and Genome Alteration Print (GAP) [19]. For evaluation we used published data from a dilution series of cancer cell line DNA mixed with DNA from nonmalignant tissue from the same person [20]. We evaluated the software packages on the basis of their ability (1) to detect LOH and allelic copy numbers in tumors with a low proportion of malignant cells and (2) to be used in semiautomated fashion from the command line. Details of the evaluation are presented in [21]. We also analyzed the tumors with GAP and Global Parameter Hidden Markov Model (GPHMM) [22]. We found that GAP was often unable to detect allelic imbalance from the BAF data (Fig. S1). We believe this is because GAP is not able to use TumorBoost-processed data. GPHMM was able to use TumorBoost-processed data, but often created an implausibly large number of segments (Fig. S2). In summary, we believe that ASCAT provides the most reliable estimates of copy number, allelic imbalance, and proportion of malignant cells in the tumor DNA sample.

Fig. 1

Example ASCAT profile and allele-specific copy numbers. The data are from sample 980029. a log2 R ratio (LRR). Indices of autosomal single-nucleotide polymorphisms (SNPs) that are heterozygous in the nonmalignant sample are plotted along the x-axis. The y-axis indicates LRRs of SNPs in the tumor relative to the nonmalignant sample. Red dots show LRRs for each informative SNP, and green dots show ASCAT’s segmentations. b B-allele frequency (BAF) for the SNPs plotted in a. Red dots show BAFs for each SNP and green dots show ASCAT’s segmentation. c The solution space for the two parameters “ploidy” and “aberrant cell fraction,” with the location of the chosen values marked by a cross. d ASCAT’s model of allele-specific copy numbers. The y-axis indicates the estimated integer chromosomal copy number. Red lines and green lines indicate the higher-copy-number and lower-copy-number chromosomal haplotypes, respectively. The lines are vertically offset slightly to avoid superimposition. e The ASCAT aberration reliability score, a measure of how well the model in d explains the segmented LRRs and BAFs. Regions of copy-number loss according to our definition (total copy number less than 0.7 times the average ploidy) can be found in d by looking for segments that have total copy number (sum of the two allele copy numbers given by the green line and the red line) less than 0.7 × 2.31 = 1.6. Chromosomes 10, 12, and 18 each contain a small segment with total copy number 1 (red line at 1 and green line at 0, indicated by arrows). The region of loss in chromosome 18 is very small, and because of the plotting it is difficult to see the gap in the red line. However, the green line at copy number 0 is visible

ASCAT was not originally designed for Affymetrix SNP array technology [10], and we made several minor modifications to it to allow it to work more effectively with Affymetrix Human Mapping SNP 6.0 arrays; patch files for modifying the original ASCAT program are available on request.

The main inputs to ASCAT are LRRs and BAFs computed from a tumor and matched nonmalignant tissue as described above (Fig. 1, panels a, b). ASCAT analyzes the LRRs and BAFs for those SNPs that are heterozygous in the nonmalignant sample. These are the SNPs that are informative with respect to allelic imbalance. ASCAT segments the LRRs and BAFs to smooth random SNP-to-SNP variation. The green dots in Fig. 1a and b show the segmented LRRs and BAFs, superimposed on the original, unsegmented values, which are indicated by the red dots. After segmentation, ASCAT generates genome-wide allele-specific copy-number profiles (Fig. 1, panels d, e). The profiles (1) estimate the proportion of malignant and nonmalignant cells in the tumor sample (“aberrant cell fraction” in Fig. 1e), (2) estimate allele-specific copy numbers of chromosomal segments across the genome (Fig. 1d, red and green horizontal lines), and (3) provide reliability measures for these estimates (Fig. 1e). ASCAT also provides an average ploidy for the cancer cells in the tumor samples; this is the average of the copy numbers of informative SNPs across the genome (“ploidy” in Fig. 1d).
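
The definition of copy-number loss used throughout this study (total copy number less than 0.7 times the average ploidy) can be sketched as follows. This is a minimal illustration and not part of ASCAT; the segment representation (start, end, higher allele copy number, lower allele copy number), the function names, and the toy coordinates are assumptions made for the example. The ploidy of 2.31 is the value quoted in the caption of Fig. 1.

```python
LOSS_FACTOR = 0.7  # from the text: loss if total copy number < 0.7 x average ploidy


def is_copy_number_loss(major_cn, minor_cn, ploidy, factor=LOSS_FACTOR):
    """True if the segment's total copy number is below factor x average ploidy."""
    return (major_cn + minor_cn) < factor * ploidy


def fraction_lost(segments, ploidy):
    """Fraction of the covered length that lies in segments called as lost.

    `segments` is a list of (start, end, major_cn, minor_cn) tuples
    (a hypothetical layout chosen for this sketch).
    """
    total = sum(end - start for start, end, _, _ in segments)
    lost = sum(
        end - start
        for start, end, major_cn, minor_cn in segments
        if is_copy_number_loss(major_cn, minor_cn, ploidy)
    )
    return lost / total if total else 0.0


# Toy segments in arbitrary coordinate units (not study data).
segments = [(0, 60, 1, 1), (60, 80, 1, 0), (80, 100, 2, 1)]
ploidy = 2.31  # average ploidy of the sample shown in Fig. 1
print(fraction_lost(segments, ploidy))
```

With an average ploidy of 2.31 the threshold is about 1.6, so a segment with total copy number 2 is not called as lost, whereas a segment with total copy number 1 is.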

List of tumor suppressor genes

We identified the tumor suppressor genes (TSGs) in Table 1 from two sources. The first was the Sanger Cancer Gene Census, an actively maintained, curated list of cancer-related genes, first described in [23], downloaded from http://cancer.sanger.ac.uk/cancergenome/assets/cancer_gene_census.tsv on May 24, 2013. The second source was the supplementary information in [24], worksheets Table S2A and Table S2B in the file http://www.sciencemag.org/content/suppl/2013/03/27/339.6127.1546.DC1/1235122TablesS1-4.xlsx. We treated a gene as a TSG if it was listed as “rec” (recessive) in the Cancer Gene Census or listed as “TSG” in [24].

Table 1 Copy-number alterations Yielding Cancer Liabilities Owing to Partial losS (CYCLOPS), suppressor of tumorigenesis and/or proliferation (STOP), and tumor suppressor genes in regions showing copy-number loss in at least approximately 20 % of gastric adenocarcinomas

Analysis of STOP genes

STOP genes are suppressors of proliferation that were identified in a short-hairpin RNA screen for genes that retard proliferation, i.e., genes that when knocked down permit increased proliferation [11]. In our analysis, we used the most stringent criterion among several presented in [11] to select STOP genes: the genes for which at least four short-hairpin RNAs increased cell proliferation by at least fourfold. We determined the list of these genes on the basis of the data in Table S7 in [11] (878 genes). For our analysis of STOP genes, we used the Gene Set Enrichment Analysis (GSEA) Preranked software tool [25] with the “classic enrichment statistic,” i.e., the version of the enrichment statistic that uses ranks without weights. GSEAPreranked runs the analysis with a user-supplied ranked list of genes and determines if a given set of genes shows statistically significant enrichment at either end of the ranking. This is done by computation of an enrichment score for the given gene set that reflects how often members of the gene set occur at the top or bottom of the ranked list.
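
For readers unfamiliar with the unweighted ("classic") enrichment statistic, the following minimal sketch shows the running-sum computation as it is commonly described: walking down the ranked list, the sum increases by a fixed step at each gene-set member and decreases by a fixed step otherwise, and the score is the largest deviation from zero. It is not the GSEAPreranked code, it omits the permutation-based significance testing and score normalization that the tool performs, and the function name and toy gene labels are hypothetical.

```python
def classic_enrichment_score(ranked_genes, gene_set):
    """Unweighted (Kolmogorov-Smirnov-like) running-sum enrichment score.

    Assumes `ranked_genes` contains each gene once.
    Returns the running-sum value with the largest absolute deviation from zero.
    """
    members = set(gene_set) & set(ranked_genes)
    n_hit = len(members)
    n_miss = len(ranked_genes) - n_hit
    if n_hit == 0 or n_miss == 0:
        raise ValueError("gene set must contain some, but not all, ranked genes")

    running = 0.0
    best = 0.0
    for gene in ranked_genes:
        # Step up at a gene-set member, step down at any other gene.
        running += 1.0 / n_hit if gene in members else -1.0 / n_miss
        if abs(running) > abs(best):
            best = running
    return best


# Toy ranked list (hypothetical labels): set members at the top give a positive score.
ranked = ["g1", "g2", "g3", "g4", "g5", "g6"]
print(classic_enrichment_score(ranked, {"g1", "g2"}))
print(classic_enrichment_score(ranked, {"g5", "g6"}))
```

A gene set concentrated at the top of the ranking yields a positive score and one concentrated at the bottom yields a negative score, which is why the direction of the ranking matters for interpreting the result.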

Our analysis examined whether, compared with other genes, STOP genes tended to have reduced copy number. We ordered the genes in increasing order of their average relative copy number across all samples, and then, to break ties, in decreasing order of the correlation coefficient between the genes’ average relative copy numbers and expression levels. In the cases of the few remaining ties, we used a random ordering. We tested this ordered list against the STOP gene set. We obtained the relative copy number of a gene in a sample by dividing the copy number of the gene in the sample by the ASCAT-determined average ploidy of the sample. We performed the analysis using several random orderings, and we report the maximum p value over the random orderings. Table S2 provides one such ordering.
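
The ordering described above can be sketched as a three-level sort. This is an illustration under stated assumptions rather than the script used in the study: the function names, the dictionary inputs, and the toy values are hypothetical, and the random tie-breaking is implemented here by shuffling before a stable sort.

```python
import random


def relative_copy_number(copy_number, ploidy):
    """Copy number of a gene in a sample divided by the sample's average ploidy."""
    return copy_number / ploidy


def rank_genes(avg_rel_cn, cn_expr_corr, seed=0):
    """Order genes by increasing average relative copy number, then by
    decreasing copy-number/expression correlation, then at random.

    `avg_rel_cn` and `cn_expr_corr` map gene name -> value (hypothetical inputs).
    """
    rng = random.Random(seed)
    genes = list(avg_rel_cn)
    # Shuffle first: sorted() is stable, so genes tied on both keys
    # keep this random relative order.
    rng.shuffle(genes)
    return sorted(genes, key=lambda g: (avg_rel_cn[g], -cn_expr_corr[g]))


# Toy values (not study data): A and B tie on copy number, B has the higher correlation.
avg_rel_cn = {"A": 0.8, "B": 0.8, "C": 1.1, "D": 0.6}
cn_expr_corr = {"A": 0.2, "B": 0.7, "C": 0.5, "D": 0.1}
print(rank_genes(avg_rel_cn, cn_expr_corr))
```

Repeating the call with different seeds gives the several random orderings mentioned above; only genes tied on both keys change position between runs.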

Gene expression data

Gene expression data were obtained from Gene Expression Omnibus (accession numbers GSE15459 and GSE34942). We used COMBAT [26] as described in [27] to remove batch effects.

Analysis of CYCLOPS genes

CYCLOPS genes are those for which “loss correlated with a greater sensitivity to further gene suppression” [14]. For our analysis we used the list of candidate genes in Table S2 in [14] and selected the genes with a false discovery rate of less than 0.25, which was the criterion used in [14]. Fifty-five genes satisfied this criterion; the main text of [14] is apparently inconsistent in indicating 56 genes.

Results

We initially analyzed 113 gastric tumors with their paired adjacent nonmalignant tissues using ASCAT (Table S3). For 74 of the 113 pairs, ASCAT was able to estimate allele-specific copy numbers across the genome. ASCAT was unable to estimate allele-specific copy numbers for the remaining pairs for the following reasons (Table S4): (1) excessively variable LRR data that ASCAT was unable to segment reasonably (12 tumors; Figs. S3a, S4, S5); (2) BAFs that were flat, i.e., uniformly 0.5 (25 tumors; Fig. S6a); or (3) apparently low tumor content as evidenced by very little variation in the segmented LRRs and few divergences of the BAFs from 0.5 (two tumors). We suspect that excessively variable LRRs are the result of experimental artifacts, as shown in Figs. S3, S4, and S5. We believe that very low proportions of malignant cells in the tumor samples were responsible for the BAFs that were uniformly 0.5, for the reasons described in the caption for Fig. S6. Inspection of the 74 generated ASCAT profiles revealed 12 profiles with large (more than 10 Mb) homozygous deletions, which are likely incompatible with cell survival. Therefore, these probably represent underestimates of average ploidies by ASCAT. Consequently, we adjusted these profiles by selecting the next best solution found by ASCAT at a higher average ploidy.

Landscape of copy-number loss and LOH in gastric cancer

Genomic copy-number loss and LOH are pervasive in gastric cancer (Figs. 2, S7, S8, Tables 1, S5). The proportion of the genome subject to copy-number loss varies considerably from tumor to tumor, with a median of 5.5 %, and a mean of 12 % (range 0–58.5 %; Fig. S9a). In addition, an average of 22.1 % of each gastric cancer genome is subject to LOH (range 0–77.7 %). Regions of copy-number loss and LOH in individual tumors often encompass whole chromosomes, chromosome arms, or regions of tens of megabases (Figs. 2, 3, S7, S8, Table 1).

Fig. 2

Genome-wide overview of frequencies of copy-number loss and loss of heterozygosity across 74 gastric tumors. Copy-number loss is defined as a region where the genomic copy number is less than 0.7 times the average ploidy. See Figs. S7 and S8 for detailed plots across each chromosome

Fig. 3

Regions of copy-number loss across chromosomes 9 and 18. a The proportion of tumors showing copy-number loss at each single-nucleotide polymorphism on chromosome 9, based on ASCAT’s allele-specific copy-number analysis. The locations of Copy-number alterations Yielding Cancer Liabilities Owing to Partial losS (CYCLOPS) genes (red) and well-established tumor suppressor genes (black) are indicated. b Regions of copy-number loss in specific tumors. c, d Analogous information for chromosome 18. Copy-number loss is defined as a region where the genomic copy number is less than 0.7 times the average ploidy. cen centromere

There are several large regions that are each subject to copy-number loss in at least 20 % of tumors (Table 1). One of these is a 46.7-Mb portion of 9p that contains nine STOP genes, one CYCLOPS gene, and the TSG CDKN2A (which encodes cyclin-dependent kinase inhibitor 2A) (Fig. 3, panels a, b). This region also contains two other genes, PTPRD and DOCK8, that have been proposed as TSGs in other cancers [28–35]. An additional large region of frequent copy-number loss affects much of the long arm of chromosome 18 in approximately 20 % of tumors and contains 13 STOP genes and two TSGs (Table 1, Fig. 3, panels c, d). Finally, much of chromosome 4 undergoes copy-number loss in many tumors, and contains 62 STOP genes and three CYCLOPS genes (Table 1, Fig. S7).

STOP genes are enriched for copy-number loss

We analyzed the prevalence of deleted STOP genes in the 74 tumors and found that, on average, 91.11 STOP genes are subject to copy-number loss per tumor (median 35, range 0–452; Table S6, Fig. S9b). To test if, compared with other genes, STOP genes tend to have lower copy number in tumors, we performed a GSEAPreranked test [25] using the STOP genes as the gene set. The reasoning behind this hypothesis is that STOP genes, when reduced in copy number, would have lower expression and therefore would tend to inhibit proliferation less. Therefore, we restricted our attention to genes with significant positive correlations between average relative copy numbers and messenger RNA (mRNA) expression level. We ranked these genes on the basis of their average copy numbers relative to their tumor’s average ploidy across the 74 tumors, and then, to break ties, on the basis of the Spearman correlation coefficient between average relative copy number and expression. In this analysis, the STOP genes indeed tended to have reduced copy number (GSEA p < 0.02; Fig. 4). As a sanity check, we also performed an analysis based on resampling. For this, instead of using the STOP gene set (which consists of 878 genes), we randomly selected 878 genes from the genome and ran GSEAPreranked with the list of ranked genes described above. We repeated this 1000 times and then determined how many times the normalized enrichment score was higher than the one obtained when we used the STOP gene set. In our analysis this happened four times out of 1000. Therefore, the empirical p value is 0.004, indicating that STOP genes indeed have reduced copy number compared with the other genes in the genome.
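
The resampling check described above can be sketched as follows. This is a minimal illustration, not the procedure as run with GSEAPreranked: the scoring function is passed in as a parameter, and the `top_half_fraction` score used in the example is a hypothetical stand-in for the normalized enrichment score. All names and toy values are assumptions.

```python
import random


def empirical_p_value(observed, ranked_genes, set_size, score_fn,
                      n_resamples=1000, seed=0):
    """Fraction of random gene sets whose score is higher than the observed score."""
    rng = random.Random(seed)
    exceed = 0
    for _ in range(n_resamples):
        random_set = rng.sample(ranked_genes, set_size)  # sampling without replacement
        if score_fn(ranked_genes, random_set) > observed:
            exceed += 1
    return exceed / n_resamples


def top_half_fraction(ranked_genes, gene_set):
    """Toy score (hypothetical): share of the gene set found in the top half of the ranking."""
    top = set(ranked_genes[: len(ranked_genes) // 2])
    return sum(gene in top for gene in gene_set) / len(gene_set)


# Toy ranked list of 100 hypothetical genes and an assumed observed score of 0.9.
genes = [f"g{i}" for i in range(100)]
p = empirical_p_value(0.9, genes, set_size=10, score_fn=top_half_fraction)
print(p)
```

Under this counting rule, four random sets out of 1000 scoring higher than the observed set correspond to the empirical p value of 0.004 reported above.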

Fig. 4

Gene Set Enrichment Analysis shows that suppressor of tumorigenesis and/or proliferation (STOP) genes tend to have lower average relative copy number. As discussed in the text, we restricted our attention to genes for which at least four short-hairpin RNAs increased cell proliferation by at least fourfold. a Running enrichment score for the STOP gene set against the list of genes ranked by their average relative copy number across all 74 samples, and then, to break ties, by the correlation coefficient between their average relative copy number and messenger RNA expression level. b Vertical black lines indicate the locations of STOP genes in the ranked list of genes

CYCLOPS genes are affected by copy-number loss in many tumors

CYCLOPS genes are an additional class of genes of interest in regions of copy-number loss; these are genes for which copy-number loss indicates a potential vulnerability to therapeutic inhibition [14]. Unlike the copy-number loss of a STOP gene, which is thought to promote proliferation, the copy-number loss of a CYCLOPS gene is thought to confer no advantage to the cancer cell, but rather to accidentally make the cancer more sensitive to inhibition of that gene. We found that from the total of 55 CYCLOPS genes, on average, 6.81 CYCLOPS genes were subject to copy-number loss in each tumor (median 2, range 0–39; Table S7, Fig. S9c). Forty-seven tumors had at least one CYCLOPS gene subject to copy-number loss, and 51 of the 55 CYCLOPS genes underwent copy-number loss in at least one gastric adenocarcinoma (Table S8). However, for only nine of these was the copy-number loss associated with lower mRNA levels (Table S8). On average, 1.6 of these nine genes were subject to copy-number loss per tumor (median 1, range 0–9), and 38 tumors (51.4 %) had at least one of these nine CYCLOPS genes with reduced copy number. The genes that were both subject to copy-number loss in at least 10 % of the tumors and also substantially downregulated when deleted (Table S8) are EEF2 (which encodes eukaryotic translation elongation factor 2), ETFDH (which encodes electron-transferring-flavoprotein dehydrogenase), and ENC1 (which encodes ectodermal-neural cortex 1). Visual examination of the LRRs and BAFs of these genes in several tumors strongly supports the copy-number loss assessed by ASCAT (Fig. S10).

Correlation of copy-number loss patterns with clinical characteristics

We explored whether there were any significant correlations between the detected copy-number loss patterns and the clinical information associated with our samples. In multivariate survival analysis (Cox proportional hazards models) we found several frequent regions of copy-number loss (17p, 3p, and 5q) that were correlated with survival (Table S9). However, analysis of 212 gastric tumors from The Cancer Genome Atlas (http://cancergenome.nih.gov/) did not show a significant association between copy-number loss of these regions and survival. Possibly the biology of the tumors was different between the two patient populations, or possibly this was a chance result in our data.

We also examined associations between copy-number loss in the regions shown in Table 1 and several other covariates. These covariates were gender, tumor stage, tumor grade, Lauren classification, and adjuvant treatment. None were significant in univariate analysis after correction for multiple hypothesis testing (Table S10). With respect to lack of association of any particular copy-number alteration with the Lauren classification, previous studies also did not detect systematic differences in copy-number alterations between the Lauren subtypes [36, 37]. We also examined association of copy-number alterations with the genomic intestinal (G-INT)/genomic diffuse (G-DIF) classification [38], and again observed no significant association after correction for multiple hypothesis testing. The G-INT/G-DIF classification is a gene-expression (mRNA)-based classification that was developed on gastric cancer cell lines and then applied to primary gastric cancer tumors. By way of background, we note that, although the G-INT subtype is enriched for Lauren intestinal-subtype tumors and the G-DIF subtype is enriched for Lauren diffuse-subtype tumors, the association is not absolute. There are diffuse-subtype tumors in the G-INT subtype and intestinal-subtype tumors in the G-DIF subtype.

Discussion

Limitations

Genome-wide analyses of copy-number loss and LOH are challenging owing to the mixture of malignant and nonmalignant cells in tumor samples. No standard analytical approach has emerged as the most appropriate in tumors with low proportions of malignant cells. As noted earlier, we evaluated ASCAT on a dilution series of mixed malignant and nonmalignant DNA [20, 21]. ASCAT performed well, even when analyzing tumors with low proportions of malignant cells. Nevertheless, in the current study, ASCAT was unable to analyze 39 tumors. Among these, 25 had flat BAFs. For these we believe the main issue was a very low proportion of malignant cells, for the reasons described in the caption for Fig. S6. Supporting this view, examination of the BAFs of 34 gastric cancer cell lines in the Cancer Cell Line Encyclopedia [39] revealed none with completely flat BAFs, suggesting that most gastric adenocarcinomas have at least some regions of allelic imbalance. We also note that our estimates of the proportions of tumors with LOH at each chromosome arm are statistically indistinguishable from previous estimates based on microsatellite assays (Table S11) [7], which supports the validity of the current analysis. In addition to the 25 tumors with flat BAFs, ASCAT was unable to complete analysis of 12 tumors for which the LRRs were excessively variable. As described in Figs. S3, S4, and S5, we believe these were due to experimental artifacts.

Candidate TSGs subject to frequent copy-number loss

We found that much of the short arm of chromosome 9 is a hot spot for copy-number loss and LOH in gastric cancer (Table 1, Fig. 3, panels a, b, Figs. S7, S8). The TSG CDKN2A is located in this region and is mutated in numerous tumor types [40–42]. This gene is frequently deleted or hypermethylated in gastric cancer [43–46]. However, it is nevertheless possible that this region contains other TSGs that contribute to gastric carcinogenesis. Two genes that are promising in this regard are PTPRD (which encodes protein tyrosine phosphatase, receptor type, D) and DOCK8 (which encodes dedicator of cytokinesis 8). PTPRD is inactivated by gene deletion or mutation in various cancers [28–33], and was previously noted to undergo LOH in gastric cancer [47]. A recent study also showed homozygous deletion of this gene in gastric cancer cell lines [48]. In our study, PTPRD was subject to LOH in 36 of 74 tumors and subject to copy-number loss in 25 tumors. DOCK8 is a guanine nucleotide exchange factor that activates Rho GTPases. Homozygous deletion and reduced expression of DOCK8 were observed in lung cancer [34, 35]. In this study, DOCK8 was subject to LOH in 37 tumors and had reduced copy number in 26 tumors. Thus, PTPRD and DOCK8 deserve more scrutiny as potential TSGs in gastric adenocarcinoma.

Comparison with copy-number loss patterns in other cancer types

The regions most frequently subject to copy-number loss in the gastric adenocarcinomas we studied are 3p, 4, 9p, 17p, and 18q. Several other cancer types also have frequent losses in all of these regions [49]. These types include non-small-cell lung carcinoma, pancreatic adenocarcinoma, renal cell carcinoma, and esophageal carcinoma. In addition, losses of 3p and 9p are shared with head and neck cancers, malignant melanocytic neoplasia, and small cell lung and squamous cell carcinomas. Losses of 4, 17p, and 18q are also found frequently in ovarian, hepatocellular, cervical, and bladder cancers [49]. This suggests that some of the STOP gene contribution to tumorigenesis is shared across cancers. It also suggests that therapies based on CYCLOPS genes might be applicable to multiple cancer types.

Implications of STOP genes subject to copy-number loss

We found that a substantial number of antiproliferative STOP genes were subject to copy-number loss in each tumor, and GSEAPreranked showed that STOP genes tend to have a lower copy number than other genes. The initial study of STOP genes [11] analyzed their relationship to the recurrent deletions that were originally reported in [13]. Although a large number (3131) of cancers were studied, these included only 23 gastric cancers (Supplementary Table 1 in [13]). This previous study [11] also concluded that GO genes (genes whose depletion limits proliferation) were underrepresented in regions of recurrent copy-number loss. We also examined this question, but found no evidence that GO genes are underrepresented in lower-copy-number regions in gastric adenocarcinoma (Fig. S11).

Implications of CYCLOPS genes subject to copy-number loss

We found that 51 of the candidate CYCLOPS genes identified in [14] were subject to copy-number loss in at least one gastric adenocarcinoma. However, for only nine of these genes was the copy-number loss in fact associated with reduced mRNA levels (Table S8), suggesting that only 16 % of the candidate CYCLOPS genes actually constitute potential therapeutic opportunities in gastric cancer. Indeed, Nijhawan et al. [14] did not examine the extent to which the candidate CYCLOPS genes were in fact downregulated when deleted. Thus, the therapeutic opportunities presented by CYCLOPS genes may be more limited than they would seem on the basis of deletions of the full set of CYCLOPS genes. Nevertheless, 38 of the tumors in the current study showed copy-number loss of at least one of the nine CYCLOPS genes for which reduced copy number was associated with reduced expression (Table S8).

Comparison of the findings of the current study with those of the previous study of CYCLOPS genes [14] suggests considerable heterogeneity in the patterns of CYCLOPS gene loss across cancer types. In the gastric tumors we studied, on average, each CYCLOPS gene was subject to copy-number loss in 12.4 % of tumors (range 0–29.7 %), which was lower than the average of 18 % (range 8–33 %) reported for 3131 tumors in [14]. These differences are reflected on a gene-by-gene basis. We take as an example the SNRPB gene (which encodes small nuclear ribonucleoprotein polypeptides B and B1), which was a high-ranking CYCLOPS candidate that was studied experimentally in [14]. This gene was subject to copy-number loss in 13 % of the 3131 cancers studied in [14], but had reduced copy number only once among the 74 gastric tumors we studied, a significantly lower proportion (p = 0.001, Fisher’s exact test). Indeed, many top-ranked CYCLOPS genes in [14] were significantly less often deleted in the gastric adenocarcinomas than in the 3131 tumors studied previously (Table S12).

Summary

This analysis of copy-number loss in gastric adenocarcinomas showed that STOP genes tend to have a lower copy number compared with other genes, suggesting that the copy-number loss of these genes may contribute to gastric carcinogenesis. In addition, the presence of deleted and downregulated CYCLOPS genes in 51 % of the tumors suggests potential therapeutic targets in these tumors.