Background

Genetic instabilities are characteristic of most human cancers. Genome-wide detection of chromosomal changes, including loss of heterozygosity (LOH) and copy number alterations (CNA), either gain or loss, is the focus of substantial attention in cancer research. LOH is frequently observed in a variety of human cancers, and regions with frequent LOH may contain tumor suppressor genes. In addition, LOH may be associated with regions affected by haplo-insufficiency of a group of genes. Thus, detection of LOH will likely remain a cornerstone for predicting tumor aggressiveness for many human tumors [1]. Recently, the discovery of large-scale genome-wide copy number variation has stimulated interest in elucidating the role of CNA in the development of malignancy. The 10 K single nucleotide polymorphism (SNP) array (GeneChip Mapping 10 K array, Affymetrix) offers a high-resolution genomic approach to screen chromosomal alterations systematically. Several studies on allelic imbalance or loss in cancers and cancer cell lines using the 10 K SNP array have been published [2–12].

Esophageal squamous cell carcinoma (ESCC) is a common malignancy worldwide and one of the most common malignancies in the Chinese population. There is great geographic variation in the occurrence of this tumor in China, including exceptionally high-risk areas such as Shanxi Province in north central China where some of the highest esophageal cancer rates in the world occur. The standardized incidence rate for esophageal cancer in Shanxi Province is above 100/100,000 person-years, although it appears that both incidence and mortality rates have begun to decline in the past 10 years [13, 14]. Within the high-risk regions in China, there is a strong tendency toward familial aggregation, suggesting that genetic susceptibility, in conjunction with environmental exposures, plays a role in the etiology of ESCC. In the past several years, we have tried to identify susceptibility genes and biomarkers that can be used to screen high-risk populations in north central China for ESCC [15–22]. A previous study examined 366 microsatellite markers in a 10 cM density genome-wide scan in 11 ESCC patients, and identified 14 chromosome arms with high-frequency LOH [15]. However, we were unable to further narrow these LOH regions using microsatellite markers due to their low density. Higher density markers are necessary for positional cloning of tumor suppressor genes in LOH regions.

In the present study we established a high-resolution chromosomal instability profile for ESCC by examining germ-line DNA and matched micro-dissected tumor DNA with a 10 K SNP array to determine both LOH and CNA. We also evaluated whether a pool of normal control samples could be used as the normal referent in an LOH study with the 10 K SNP chip instead of matched germ-line DNA.

Results and discussion

LOH by patient and chromosomal arms

In the present study, 26 ESCC patients with blood-derived germ-line DNA and matched micro-dissected tumor DNA were investigated using 10 K SNP arrays. The characteristics of these patients are shown in Table 1. The average signal detection rate was higher in germ-line DNA (99%) than in micro-dissected tumor DNA (79%). Based on NCBI Build 35.1, we summarized characteristics of 11,555 SNPs and mapped them to chromosomes and genes. We first generated a genotyping profile for each patient based on a comparison of the germ-line DNA genotypes to those from the matched micro-dissected tumor DNA. The patients' LOH frequencies, shown in Table 1, ranged from 19% to 95%, and averaged 29%. In four cases (SHE0832, SHE0864, SHE1264, and SHE1490), LOH analysis was performed using DNA from micro-dissected adjacent normal tissue in addition to blood-derived germ-line DNA to see whether the choice of reference affected results; findings were very similar with the two sources of DNA (Table 1).

Table 1 Demographic, risk factor, clinical characteristics, and LOH frequency of ESCC patients (N = 26)

The frequencies of LOH on each chromosomal arm are shown in Table 2. Non-random LOH was observed on 10 chromosomal arms, including 17p (76%), 9p (72%), 9q (72%), 13q (68%), 17q (66%), 4q (65%), 4p (60%), 3p (58%), 15q (57%), and 5q (52%). Our previous microsatellite marker-based genome-wide LOH scan in 11 ESCC patients with a positive family history of upper gastrointestinal cancer produced overall LOH frequencies that were somewhat higher than those in the patients evaluated in the present study [15]. We cannot fully explain this between-study variation in LOH frequency, but there are several noteworthy differences between the studies that likely influenced LOH rates, including: (i) heterozygosity is higher for the microsatellite compared to the SNP markers examined (~75% versus ~30%); (ii) the total number of markers was much higher in the SNP study than the microsatellite study (11,000 versus 366); and (iii) over twice as many cases were examined in the SNP study (26 versus 11). Some results between studies differed (eg, LOH was ≥ 50% on chromosome 15q in the SNP but not the microsatellite study; LOH was ≥ 50% on 8p, 8q, 11p, 11q and 18p in the microsatellite but not the SNP study). Despite differences in study size, approach, and in some of the results, consistently high LOH frequencies were reported for nine chromosomal arms in both studies (ie, 3p, 4p, 4q, 5q, 9p, 9q, 13q, 17p, 17q). Both studies taken together indicate that LOH events on these nine chromosomal arms are the major events associated with genome-wide instability in ESCC in this high-risk Chinese population. These areas are rich in known tumor suppressor genes and oncogenes, including VHL on 3p; NPCA1 on 4p; KIT, GIST, and PDGFRA on 4q; APC and MCC on 5q; CDKN2A and CDKN2B on 9p; BRCA2 and Rb1 on 13q; TP53 on 17p; and BRCA1, TOC, and NF1 on 17q.

Table 2 LOH distribution by chromosomal arm

LOH regions

When we used the conservative, traditional approach to LOH in LOH/Model A, we detected 20 LOH regions encompassing a total of 125 SNPs. As shown in Table 3, these 20 LOH regions are located on eight chromosome arms – 13q (four regions), 3p (two regions), 4q (three regions), 9p (two regions), 9q (three regions), 17p (three regions), 17q (two regions), and 4p (one region). The size of these LOH regions ranged from 10 kb to 4.26 Mb (average 1.44 Mb); genes involved in these deletion regions are shown in Table 3. Among the 125 SNPs in these 20 LOH regions, 46 are located in genes (one in a coding exon, 39 in introns, and six in 3'- or 5'-UTRs), and 79 are located in regions flanking genes (ie, within 1 kb). One SNP (rs781852) is located in the coding region of gene ZZEF1 (Zinc finger, ZZ-type with EF-hand domain 1) on chromosome 17p13.2. Allele A for this SNP encodes an amino acid proline (Pro) and the allele B encodes amino acid leucine (Leu). Eight of 10 heterozygous cases (Pro/Leu) showed LOH (80%), including five cases that lost allele B and three cases that lost allele A. The 46 SNPs that are located within genes map to 32 genes and include four SNPs in the introns of ZNF618, and two SNPs each in the introns of ITPR1, FLJ14834, LHFP, ITGAE, MYH3 and MYOCD. Some of these 20 deletion regions have been previously identified by our lab and others [17, 22]. However, the current study provides far greater precision in locating LOH regions (10 kb-4.26 Mb as opposed to 10 cM, which corresponds to 5–10 Mb). As expected, using a less conservative definition for LOH, LOH/Model B detected more regions (and SNPs) than our approach in LOH/Model A – 72 LOH regions containing 2,916 SNPs. The distribution of deletion regions and details from this model are shown in Table 4 and Additional Table 1 (in additional file 1).

Table 3 Deletion regions from the conservative "LOH/Model A"
Table 4 Deletion regions from the less conservative "LOH/Model B"

Our cLOH data in cLOH/Model A identified only three significant cLOH regions. These included one on 13q12-q13 and two on 13q13, and encompassed a total of 30 SNPs. The sizes of these cLOH regions are 1.9 Mb, 0.4 Mb, and 0.2 Mb, respectively (average 0.83 Mb) (Table 5). The less conservative cLOH/Model B highlighted 64 cLOH regions with 2,128 SNPs; details are shown in Table 6 and Additional Table 2 (in additional file 1).

Table 5 Deletion regions from the conservative "cLOH/Model A"
Table 6 Deletion regions from the less conservative "cLOH/Model B"

Examples of the whole genome profiles for regions on chromosome arms 9p/q, 13q, and 17p are shown in Figures 1, 2, 3.

Figure 1

Chromosome 9. Each column in the picture represents an individual case and shows genotyping in germ-line DNA and matched micro-dissected tumor; LOH is shown in red, retention in blue, and homozygous or "no call" in grey. B indicates blood DNA and T indicates tumor DNA (from matched, micro-dissected sample). To the left of the picture, columns show (from left to right): microsatellite markers, cartoon of the chromosome, and SNPs examined in the 10 K SNP chip. To the right of the picture, red bars show deletion regions (as defined from our conservative "LOH/Model A"), blue bars show regions with CNA losses (from CNAT), and green bars show regions with CNA gains (from CNAT).

Figure 2

Chromosome 13. Each column in the picture represents an individual case and shows genotyping in germ-line DNA and matched micro-dissected tumor; LOH is shown in red, retention in blue, and homozygous or "no call" in grey. B indicates blood DNA and T indicates tumor DNA (from matched, micro-dissected sample). To the left of the picture, columns show (from left to right): microsatellite markers, cartoon of the chromosome, and SNPs examined in the 10 K SNP chip. To the right of the picture, red bars show deletion regions (as defined from our conservative "LOH/Model A"), blue bars show regions with CNA losses (from CNAT), and green bars show regions with CNA gains (from CNAT).

Figure 3

Chromosome 17. Each column in the picture represents an individual case and shows genotyping in germ-line DNA and matched micro-dissected tumor; LOH is shown in red, retention in blue, and homozygous or "no call" in grey. B indicates blood DNA and T indicates tumor DNA (from matched, micro-dissected sample). To the left of the picture, columns show (from left to right): microsatellite markers, cartoon of the chromosome, and SNPs examined in the 10 K SNP chip. To the right of the picture, red bars show deletion regions (as defined from our conservative "LOH/Model A"), blue bars show regions with CNA losses (from CNAT), and green bars show regions with CNA gains (from CNAT).

Comparison of LOH and cLOH regions

Our conservative LOH/Model A detected 20 LOH regions including 125 SNPs, but our conservative cLOH/Model A detected only three LOH regions containing 30 SNPs. The detection of only three LOH regions by cLOH/Model A is not unexpected since identifying an LOH region in a sample requires the presence of multiple homozygous SNPs in a large genomic area and the chance for multiple homozygous SNPs in more than 75% of the samples is low. The three cLOH regions are all on chromosome 13q12-q13. Eleven SNPs were detected by both conservative LOH and cLOH models (LOH/Model A and cLOH/Model A) (Tables 3 and 5). Five SNPs detected by the conservative cLOH/Model A are located in two genes, FLJ14834 and B3GTL. Due to the relatively small number of LOH regions defined by cLOH/Model A, as well as the different definitions of LOH used in these two approaches, it was not possible to compare the concordance between these two methods.

Our less conservative LOH/Model B and cLOH/Model B were identical except that LOH/Model B used the matched normal controls while cLOH/Model B used pooled normal control samples. SNPs in the LOH regions from LOH/Model B totaled 2,916; 2,128 SNPs were identified in the cLOH regions from cLOH/Model B. The number of SNPs common to both LOH and cLOH models (LOH/Model B and cLOH/Model B) was 1,878, while a total of 1,038 SNPs appeared only in LOH/Model B, 250 SNPs were found only in cLOH/Model B, and 7,834 showed retention in both models. Using LOH/Model B as a standard, sensitivity/specificity for cLOH/Model B were 64% and 97%, respectively. The overall Pearson correlation coefficient between these two models was 0.69 (P < 0.0001). Taken together, we detected more SNP loci in LOH/Model B than cLOH/Model B, but concordance between the two methods was generally good, suggesting that the use of pooled normal control samples may be acceptable for LOH studies.
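The agreement figures above can be reproduced from the four SNP counts given in the text. The following is a minimal sketch rather than the analysis actually run for the study: it assumes that the reported Pearson coefficient was computed on binary per-SNP calls (LOH region versus retention), in which case it reduces to the phi coefficient of the 2×2 table. The function name `concordance` is hypothetical.

```python
import math

# SNP counts reported in the text, with LOH/Model B taken as the standard.
both = 1878        # SNPs in both LOH/Model B and cLOH/Model B regions
loh_only = 1038    # SNPs only in LOH/Model B regions
cloh_only = 250    # SNPs only in cLOH/Model B regions
neither = 7834     # SNPs showing retention in both models


def concordance(tp, fn, fp, tn):
    """Return sensitivity, specificity, and phi for a 2x2 agreement table."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    # Phi equals the Pearson correlation of two binary variables
    # (assumed here to be how the per-SNP calls were compared).
    phi = (tp * tn - fp * fn) / math.sqrt(
        (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)
    )
    return sensitivity, specificity, phi


sens, spec, phi = concordance(both, loh_only, cloh_only, neither)
print(f"sensitivity={sens:.2f} specificity={spec:.2f} phi={phi:.2f}")
```

Under these assumptions the sketch gives a sensitivity of about 64%, a specificity of about 97%, and a coefficient of about 0.69, in line with the values reported above.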

CNA regions

Table 7 shows 15 regions with CNA losses that were detected at P ≤ 10⁻⁶. These include regions on 1p, 3p, 4q, 5q, 9p, 10p, 11p, 11q, 13q, and 18q. One-hundred and two SNPs were mapped within these regions (Table 4A). Details of the involved SNPs and genes are shown in Additional Table 3 (in additional file 1). Table 8 shows the 36 regions where significant CNA gains were identified, including eight on chromosomal arm 3q, seven on 8q, three on 7p, two on 5q, two on 14q, and two on 22q (Additional Table 4 in additional file 1). Examples of whole genome profiles of CNA regions are shown for chromosomes 3, 7, and 8 in Figures 4, 5, 6.

Figure 4

Chromosome 3. Each column in the picture represents an individual case and shows genotyping in germ-line DNA and matched micro-dissected tumor; LOH is shown in red, retention in blue, and homozygous or "no call" in grey. B indicates blood DNA and T indicates tumor DNA (from matched, micro-dissected sample). To the left of the picture, columns show (from left to right): microsatellite markers, cartoon of the chromosome, and SNPs examined in the 10 K SNP chip. To the right of the picture, red bars show deletion regions (as defined from our conservative "LOH/Model A"), blue bars show regions with CNA losses (from CNAT), and green bars show regions with CNA gains (from CNAT).

Figure 5

Chromosome 7. Each column in the picture represents an individual case and shows genotyping in germ-line DNA and matched micro-dissected tumor; LOH is shown in red, retention in blue, and homozygous or "no call" in grey. B indicates blood DNA and T indicates tumor DNA (from matched, micro-dissected sample). To the left of the picture, columns show (from left to right): microsatellite markers, cartoon of the chromosome, and SNPs examined in the 10 K SNP chip. To the right of the picture, red bars show deletion regions (as defined from our conservative "LOH/Model A"), blue bars show regions with CNA losses (from CNAT), and green bars show regions with CNA gains (from CNAT).

Figure 6

Chromosome 8. Each column in the picture represents an individual case and shows genotyping in germ-line DNA and matched micro-dissected tumor; LOH is shown in red, retention in blue, and homozygous or "no call" in grey. B indicates blood DNA and T indicates tumor DNA (from matched, micro-dissected sample). To the left of the picture, columns show (from left to right): microsatellite markers, cartoon of the chromosome, and SNPs examined in the 10 K SNP chip. To the right of the picture, red bars show deletion regions (as defined from our conservative "LOH/Model A"), blue bars show regions with CNA losses (from CNAT), and green bars show regions with CNA gains (from CNAT).

Table 7 Regions with copy number alteration loss from CNAT
Table 8 Regions with copy number alteration gain from CNAT

Comparisons between LOH and CNA

We obtain both cLOH and CNA data when we use the pooled normal control sample reference in the CNAT software. Thus we can ask whether cLOH is associated with CNA. Among the 2,128 SNPs identified in our less conservative cLOH/Model B, only 45 (2%) showed CNA loss and just 14 (0.7%) showed CNA gain (Figure 7). This result suggests that CNA accounts for a small percentage of LOH events in ESCC. LOH in cancers is commonly caused by one of three different mechanisms. The first and most common cause of LOH is mitotic recombination [3]. This mechanism does not change chromosome copy number, and was responsible for 97% of the LOH observed in our study. Deletion, the second cause of LOH, should result in copy number loss, and accounted for approximately two percent of LOH in our study. Finally, LOH can result from amplification of one chromosome, which should show copy number gain. This mechanism accounted for less than one percent of LOH in our study. Although chromosomal amplification occurs often, only occasionally do amplification events result in LOH, which corresponds to unbalanced amplification of one chromosome. However, some studies have demonstrated concordance between LOH and CNA. For example, Wong et al found LOH associated with CNA gain at 6q12-13 in osteosarcoma [5], and Zhao et al found a link between LOH and CNA gain at 1q22-q24.1 and 1q42.13-43 in oral squamous cell carcinoma [8]. These differences might reflect genuine differences between tumor types, the lab analytic methods used, or different operative mechanisms at work.
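The percentages attributed to each mechanism follow directly from the SNP counts in this paragraph. The short sketch below redoes that arithmetic; it assumes that the SNPs with CNA loss and those with CNA gain are disjoint sets and that every remaining cLOH SNP is copy-neutral, which the text implies but does not state explicitly. The helper `pct` is a hypothetical name.

```python
cloh_snps = 2128   # SNPs in cLOH/Model B regions
cna_loss = 45      # of these, SNPs that also showed CNA loss
cna_gain = 14      # of these, SNPs that also showed CNA gain

# Assumption: loss and gain sets do not overlap, so the rest are copy-neutral.
copy_neutral = cloh_snps - cna_loss - cna_gain


def pct(n, total):
    """Percentage of `total` represented by `n`."""
    return 100.0 * n / total


for label, n in [("copy-neutral", copy_neutral),
                 ("with CNA loss", cna_loss),
                 ("with CNA gain", cna_gain)]:
    print(f"{label}: {n} SNPs ({pct(n, cloh_snps):.1f}%)")
```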

Figure 7

Comparison of SNPs with cLOH (from our less conservative "cLOH/Model B") and CNAs (from CNAT software using pooled DNA from normal controls).

The genome-wide LOH and chromosome copy alteration studies described in this paper can also be applied to higher density SNP chips, such as Affymetrix 100 K and 500 K SNP chips. The increased SNP density will allow even finer mapping of these genetic changes.

In summary, we performed a genome-wide study of LOH and CNA in ESCC patients using the Affymetrix 10 K SNP chip by comparing matched germ-line and tumor DNA. Our approach allowed us to extensively map both LOH and CNAs in ESCC systematically in a manner that has not heretofore been done, and produced numerous regions, genes, and SNPs that merit future exploration. This report is the first comprehensive genome-wide analysis of chromosomal imbalance (LOH and CNA) in ESCC, and the knowledge gained from this analysis will enable the development of improved strategies to prevent, diagnose, and treat ESCC patients in the future.

Conclusion

The Affymetrix 10 K SNP chip is a valid platform to integrate analyses of loss of heterozygosity and copy number alterations. The comprehensive knowledge gained from this analysis will enable improved strategies to prevent, diagnose, and treat esophageal squamous cell carcinoma.

Methods

Patient selection

This study was approved by the Institutional Review Boards of the Shanxi Cancer Hospital and the US National Cancer Institute. Patients diagnosed with ESCC between 1998 and 2000 in the Shanxi Cancer Hospital in Taiyuan, Shanxi Province, People's Republic of China, and considered candidates for curative surgical resection were identified and recruited to participate in this study. None of the patients had prior therapy and Shanxi was the ancestral home for all. After obtaining informed consent, patients were interviewed to obtain information on demographic, clinical, and cancer lifestyle risk factors (smoking, alcohol drinking, and detailed family history of cancer). All patients were followed to ascertain survival status through 2003.

Biologic specimen collection and processing

Ten milliliters of venous blood was taken from each patient prior to surgery and germ-line DNA was extracted and purified using standard methods. Tumor and adjacent normal tissue obtained during surgery were either fixed in ethanol and embedded in paraffin, or snap frozen in liquid nitrogen and stored in a freezer at -80°C until used. Slides were stained with H&E to distinguish tumor from normal epithelium, and tumor cells were micro-dissected under light microscopy using either laser capture micro-dissection (LCM) (for paraffin-embedded samples) or manual dissection (for frozen samples). All micro-dissections were performed by a pathologist (NL) and a trained post-doctoral fellow (HS). Extraction of LCM DNA was previously described [17, 23]. Extraction of manually micro-dissected DNA followed the protocol from the Puregene DNA Purification Tissue Kit (Cat Number D-7000A, Gentra Systems, Inc., Minneapolis, MN 55441, USA).

It is well known that using pure tumor DNA obtained by micro-dissection is key to successfully detecting chromosomal changes such as LOH and CNA. However, the 10 K chip requires amplification of DNA fragments up to 1 Kb, a particularly challenging task. It is usually difficult to obtain a high yield of DNA from alcohol- or formalin-fixed tissues, especially when using micro-dissection. In our study, the SNP call rates were much lower in micro-dissected tumor DNA from alcohol-fixed tissue than from frozen tissue (data not shown). Although the isolation of tumor DNA from ground tissue using Trizol yielded higher genotype call rates, we think that it is more important to identify LOH and CNA regions than to simply obtain higher genotype call rates. Thus, the best overall genomic characterization results can be expected from the use of micro-dissected frozen tissue.

Affymetrix GeneChip Mapping 10 K array

The 10 K SNP array provides comprehensive coverage of the genome for genotyping studies. Each array contained 11,555 bi-allelic polymorphic SNPs randomly distributed throughout the genome, except for the Y chromosome. The median physical distance between SNPs is approximately 105 kb, and the mean distance between SNPs is 210 kb. The average heterozygosity for these SNPs is 0.37, with an average minor allele frequency of 0.25. The algorithm used to make genotype calls was previously described by Affymetrix [24, 25]. DNA samples were assayed according to the protocol (GeneChip Mapping Assay manual) supplied by Affymetrix, Inc. (Santa Clara, CA) as previously described [25, 26]. The 10 K SNP arrays were scanned with the Affymetrix GeneChip Scanner 3000 using GeneChip Operating System 1.2 (GCOS) (Affymetrix). Data files were generated automatically. Genotype assignments (ie, calls) were made automatically by GeneChip DNA Analysis Software 3.0 (GDAS) (Affymetrix). The genetic map used in the analysis was obtained from GeneChip Mapping 10 K library files: Mapping 10K_Xba131. "Signal Detection Rate" is the percentage of SNPs that passed the discrimination filter. "Call Rate" is the percentage of SNPs called on the array. Genotype calls are defined as AA, AB, or BB; "no call" means the SNP for that sample did not pass the discrimination filter and was excluded from further evaluation in the present study.

Data analysis

Since patient-matched normal DNA is not always available as a reference for high-resolution allelotyping, we evaluated LOH using two different methods: first, we used patient-matched normal DNA as the reference (the traditional approach); and second, we assessed whether it was possible to instead use a pool of normal control samples as the reference, as is done with the chromosome Copy Number Analysis Tool 2.0 software (CNAT) from Affymetrix.

In the first method, LOH was defined in a traditional manner as a change in genotyping call from heterozygosity (AB) in the germ-line DNA, to homozygosity (AA or BB) in the matched micro-dissected tumor DNA (all calls from GDAS). In the second method, LOH was also defined as a change in genotyping from "normal" to tumor; however, "normal" here was defined based on data already present in the Affymetrix CNAT software from prior testing of 100 ethnically-diverse normal reference subjects [27]. LOH in the second method was based on a comparison of a track of contiguous SNPs in tumor to the analogous track of contiguous SNPs in the "normal" population DNA. Since the "normal" DNA here includes not just one but 100 individuals, the state of these SNPs (ie, whether they are heterozygous or homozygous) was inferred statistically as a likelihood estimation with confidence calculated from a binomial distribution of the observed state of these SNPs in this normal population. A contiguous run of homozygous SNPs in tumor where these SNPs are heterozygous in the "control" suggests LOH in the region spanning the SNPs. Hence, no germ-line DNA data from cases was used for this second analysis. We refer to this LOH as "cLOH" to distinguish it from our more traditional analysis approach using paired normal and tumor samples and to indicate that it utilized a common control pool of normal DNA generated by CNAT. The threshold for statistical significance used in CNAT was P ≤ 10⁻⁶ as recommended by Affymetrix [27].
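The paired (first) definition can be illustrated with a small sketch. This is not the GDAS or CNAT implementation; the function names, the label strings, and the use of `None` for a "no call" are assumptions made for illustration, as is the choice to express a patient's LOH frequency as LOH calls divided by informative (germ-line heterozygous) calls.

```python
def classify_snp(germline, tumor):
    """Classify one SNP from paired calls ('AA', 'AB', 'BB', or None for no call)."""
    if germline is None or tumor is None:
        return "no_call"          # excluded from further evaluation
    if germline != "AB":
        return "uninformative"    # homozygous in germ-line DNA: LOH cannot be seen
    # Germ-line heterozygous: LOH if the tumor call became homozygous.
    return "retention" if tumor == "AB" else "LOH"


def loh_frequency(pairs):
    """Fraction of informative SNPs showing LOH (assumed definition); None if none."""
    calls = [classify_snp(g, t) for g, t in pairs]
    informative = [c for c in calls if c in ("LOH", "retention")]
    if not informative:
        return None
    return informative.count("LOH") / len(informative)


# Hypothetical calls for five SNPs in one patient (germ-line, tumor).
example = [("AB", "AA"), ("AB", "AB"), ("AA", "AA"), ("AB", None), ("AB", "BB")]
print([classify_snp(g, t) for g, t in example])
print(loh_frequency(example))
```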

We combined the LOH results from a cluster of SNPs in a genetic locus to define a deletion region. We defined these deletion regions in several ways to permit comparison with the existing scientific literature as well as to make comparisons within our own study using different reference groups as noted above. We first used a definition that permitted comparisons with most of the existing published literature. We constrained the SNPs we considered here to require that: (i) SNPs have a call in ≥ 50% of the normal DNA samples; (ii) there be a minimum of three informative (heterozygous) normal samples; and (iii) the SNPs be mappable to NCBI Build 35.1. Non-random allelic loss was defined as LOH frequency ≥ 50% at a given locus, while random allelic loss represented LOH frequency <50% at a locus. Using these constraints for the SNPs evaluated, we defined deletion regions very conservatively by requiring that the deletion regions have five or more contiguous SNPs which showed ≥ 75% LOH. Uninformative SNPs in the regions of LOH were excluded from consideration in this analysis. Thus, a region of LOH separated from a second region of LOH by only uninformative SNPs would be combined into one large LOH region. We labeled this conservative, traditional approach "LOH/Model A".
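The region rule for LOH/Model A can be sketched as a scan over per-SNP LOH frequencies in chromosomal order. The code below is an illustrative reading of the definition above, not the procedure actually used: the function name and input layout are hypothetical, and `None` stands for an uninformative SNP (one failing the constraints above), which is skipped and does not break a run.

```python
def deletion_regions(freqs, min_freq=0.75, min_snps=5):
    """Return (start_index, end_index, n_informative) for each qualifying run.

    freqs: per-SNP LOH frequency across patients, in chromosomal order.
    None marks an uninformative SNP, which is ignored so that two LOH runs
    separated only by uninformative SNPs merge into one region.
    """
    regions = []
    start = end = None
    count = 0
    for i, f in enumerate(freqs):
        if f is None:
            continue                      # uninformative: keep the run open
        if f >= min_freq:
            if start is None:
                start = i
            end = i
            count += 1
        else:
            if count >= min_snps:         # close a run that is long enough
                regions.append((start, end, count))
            start = end = None
            count = 0
    if count >= min_snps:                 # run reaching the end of the list
        regions.append((start, end, count))
    return regions


# Hypothetical frequencies: one five-SNP run (bridging an uninformative SNP),
# then a four-SNP run that is too short to qualify.
freqs = [0.8, 0.9, None, 0.75, 1.0, 0.8, 0.4, 0.9, 0.9, 0.9, 0.9]
print(deletion_regions(freqs))
```

Lowering `min_freq` to 0.5 gives the looser frequency threshold used in the less conservative models, although those models also treat homozygous SNPs differently.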

The second approach we took (labeled "LOH/Model B") was very similar to LOH/Model A in that we used the same constraints on the SNPs noted above, but we were less conservative in our requirement for the percent of the deletion region which showed LOH: only ≥ 50% LOH frequency (instead of ≥ 75%) among the SNPs was required for a region to be classified as a deletion region. To enable comparability with data from the cLOH approach described above, we also adopted different guidance regarding how we treated homozygous SNPs in these putative LOH regions defined by the traditional approach. In CNAT, a contiguous track of homozygous SNPs is required to designate a region as having LOH. When a homozygous SNP is located between two SNPs where one or both of the adjacent SNPs showed retention of heterozygosity, the homozygous SNP was considered retained. Otherwise, it was treated as LOH.
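The handling of homozygous SNPs described above can be sketched as follows. Several details are assumptions made for illustration only: "adjacent" is taken to mean the nearest informative SNP on each side (skipping other homozygous SNPs), a SNP at the end of a chromosome uses its single available neighbor, and a homozygous SNP with no informative neighbor at all is left unresolved. The function name and the labels 'LOH', 'RET', and 'HOM' are hypothetical.

```python
def resolve_homozygous(states):
    """Resolve homozygous SNPs for one tumor sample.

    states: per-SNP labels in chromosomal order: 'LOH', 'RET' (retention of
    heterozygosity), or 'HOM' (homozygous in the reference, uninformative).
    """
    def nearest(i, step):
        # Walk away from position i until an informative SNP is found.
        j = i + step
        while 0 <= j < len(states):
            if states[j] != "HOM":
                return states[j]
            j += step
        return None

    resolved = []
    for i, s in enumerate(states):
        if s != "HOM":
            resolved.append(s)
            continue
        neighbors = {nearest(i, -1), nearest(i, +1)} - {None}
        if not neighbors:
            resolved.append("HOM")   # assumption: no evidence, leave as is
        elif "RET" in neighbors:
            resolved.append("RET")   # one or both neighbors retained
        else:
            resolved.append("LOH")   # otherwise treated as LOH
    return resolved


print(resolve_homozygous(["LOH", "HOM", "LOH", "HOM", "RET", "HOM"]))
```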

We also developed two models using exclusively data from the cLOH approach with the CNAT-generated pooled controls. The first used the conservative definition described above for LOH/Model A of a ≥ 75% LOH requirement to declare a deletion region, and also treated homozygous SNPs in deletion regions in accord with the CNAT algorithm described above; we termed this "cLOH/Model A". Although the level of LOH required is the same for LOH/Model A and cLOH/Model A, direct comparisons between them are not possible because of the different algorithms used to treat uninformative SNPs in deletion regions. The second approach loosened the LOH requirement to ≥ 50% to declare a deletion region (as with LOH/Model B above), also used the CNAT algorithm for homozygous SNP calls in deletion regions, and was termed "cLOH/Model B".

Individual SNP copy numbers and chromosomal regions with gains or losses were also determined by evaluation with CNAT based on the SNP hybridization signal intensity data from the experimental sample relative to intensity distributions derived from the previously described reference set containing over 100 normal individuals [27]. P-values were log10-transformed and plotted along the corresponding chromosome; values were considered significant at P ≤ 10⁻⁶.

We further defined CNA-gain regions as regions where five or more contiguous SNPs showed copy number gain in at least 50% of cases, and the P-value for the difference from the reference was ≤ 10⁻⁶. Similarly, CNA-loss regions were defined as regions where five or more contiguous SNPs showed copy number loss in at least 50% of cases, and the P-value for the difference from the reference was ≤ 10⁻⁶.
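The CNA region rule combines a frequency criterion with a significance criterion, and can be sketched as below. This is not CNAT output processing as actually performed: the function name is hypothetical, and the input format, one `(fraction_of_cases, p_value)` pair per SNP in chromosomal order, is an assumed summary, since the text does not specify how per-case P-values were combined for each SNP.

```python
def cna_regions(snps, min_fraction=0.5, max_p=1e-6, min_snps=5):
    """Return (start_index, end_index) for runs of qualifying contiguous SNPs.

    snps: list of (fraction_of_cases_with_alteration, p_value) per SNP,
    in chromosomal order. The same function serves gains or losses,
    depending on which alteration the fractions describe.
    """
    regions, run = [], []
    for i, (fraction, p) in enumerate(snps):
        if fraction >= min_fraction and p <= max_p:
            run.append(i)                 # SNP meets both criteria
        else:
            if len(run) >= min_snps:
                regions.append((run[0], run[-1]))
            run = []
    if len(run) >= min_snps:              # run reaching the end of the list
        regions.append((run[0], run[-1]))
    return regions


# Hypothetical data: five qualifying SNPs, one that fails the P-value
# threshold, then four qualifying SNPs (too few to form a region).
snps = [(0.6, 1e-7)] * 5 + [(0.6, 1e-3)] + [(0.7, 1e-8)] * 4
print(cna_regions(snps))
```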