Background

Esophageal squamous cell carcinoma (ESCC) is one of the most malignant tumors and typically presents at an advanced and rapidly fatal stage. To better understand the role of genetics in the etiology and prevention of ESCC and to identify potential susceptibility genes as well as early detection markers, we previously compared tumor and matched normal tissues from ESCC patients from a high-risk area of China using cDNA expression microarrays and identified 41 differentially-expressed genes (13 over-expressed and 28 under-expressed) [1].

Among these 41 differentially-expressed genes are SPARC (secreted protein acidic and rich in cysteine), COL7A1 (collagen type VII, α1), and ECM1 (extracellular matrix protein 1), all of which are involved in extracellular matrix functions [2–4]. Others (eg, Fascin and cytokeratin 4, also called KRT4 or CK4) are involved in the formation of actin filaments and cytoskeleton structure [5, 6]. PPL (periplakin) and EVPL (envoplakin) are both members of the plakin family [7–9]. PPL is expressed in stratified squamous epithelia while EVPL, a candidate gene for the tylosis esophageal cancer syndrome, is exclusively expressed in stratified squamous epithelia. Both PPL and EVPL are desmosome components and, in conjunction with TGM3 (transglutaminase 3) and cystatin A, they help to maintain an intact cell surface interface [10, 11]. FADD (Fas-associated death domain) interacts with FasL and Caspase-8 to initiate the Fas signaling complex which leads to apoptosis [12]. All nine of these genes identified in our previous study [1] are involved in important cellular processes, and their altered expression in ESCC suggests that they are candidate molecular markers that may have a role in prevention and early detection strategies in ESCC.

Array technologies are comprehensive and relatively accurate ways to simultaneously analyze the expression of thousands of genes, and these technologies have been used to clarify gene expression changes in many human malignancies. However, the results from microarrays are potentially influenced by many external factors, including array production itself, RNA extraction methods, the probes used for labeling, hybridization conditions, image analysis, etc. Further, most studies of this type are based on relatively small sample sizes, including our study [1]. Thus, genes identified as differentially expressed in such initial discovery efforts need to be confirmed using alternative methods and larger sample sizes before they can be considered validated and advanced for testing as early detection markers. This confirmation is a key initial step in the validation process for selecting genes for future study as potential markers of susceptibility or early disease.

The main goal of the current study was to validate the differential mRNA expression of nine selected genes (SPARC, FADD, Fascin, COL7A1, CK4, TGM3, ECM1, PPL and EVPL) in a relatively large sample of ESCC cases using quantitative Real-time RT-PCR. A secondary goal was to determine if expression patterns for these nine genes varied by selected demographic and clinical characteristics.

Methods

Patient selection and sample collection

This study was approved by the Institutional Review Boards of the Shanxi Cancer Hospital and the U.S. National Cancer Institute (NCI). Patients presenting from 1996 to 2001 to the Shanxi Cancer Hospital in Taiyuan, Shanxi Province, People's Republic of China, who were diagnosed with ESCC and considered candidates for curative surgical resection were identified and recruited to participate in the study. None of the patients had prior therapy and Shanxi was the ancestral home for all. After obtaining informed consent, patients were interviewed to obtain information on demographic and lifestyle cancer risk factors (eg, smoking, alcohol drinking, family history (FH) of upper gastrointestinal (UGI) cancer) and clinical data. Tumor tissue obtained during surgery was snap-frozen in liquid nitrogen, along with matched normal tissue, and stored at -130°C until used. The 75 patients evaluated here were selected based on the quantity and quality of their total RNA from among 110 who had sufficient tumor and matched normal tissues.

Real-time quantitative RT-PCR

Total RNA was extracted from frozen tumor and matched normal tissues using TRIzol reagent (Invitrogen, CA, USA) in accordance with the manufacturer's instructions. RNA quality and quantity were determined using the RNA 6000 Labchip/Agilent 2100 Bioanalyzer (Agilent Technologies, Germantown, MD). RNA purification was performed according to the manufacturer's instructions for the RNeasy Mini Kit (Qiagen Inc., Valencia, CA) and RNase-Free DNase Set digestion (Qiagen Inc., Valencia, CA). Reverse transcription of RNA was performed by adding 5 μg total RNA, 1 μl of Oligo (dT)12–18 (500 μg/ml), 1 μl (200 units) of Superscript II reverse transcriptase, 1 μl (2 units) of E. coli RNase, and 1 μl 10 mM dNTP (Invitrogen, Carlsbad, CA).

All real-time PCR reactions were performed using an ABI Prism 7000 Sequence Detection System (Perkin-Elmer Applied Biosystems, Foster City, CA). All primers and probes for the nine target genes and the internal control gene, glyceraldehyde-3-phosphate dehydrogenase (GAPDH), were designed by Perkin-Elmer Applied Biosystems (Foster City, CA) (Table 1). A singleplex reaction mix was prepared according to the manufacturer's protocol, as described previously [13]. The thermal cycling conditions included an initial denaturation step at 95°C for 10 min, followed by 40 cycles of 95°C for 15 s and 60°C for 1 min.

Table 1 Information on the nine genes examined: location, function, primers and probes

Using these quantitative methods requires that the PCR efficiencies of all genes be similar and, preferably, ≥90%. Efficiency was measured using a standard curve generated by serial dilutions of the RNA. Specifically, the initial RNA concentration of 100 ng/μl was serially diluted 10-fold (100 ng, 10 ng, 1 ng, 0.1 ng, and 0.01 ng) for the real-time PCR assay according to the standard protocol of Applied Biosystems. The relative standard curve quantitation method was previously described [14]. The PCR efficiency (E) was calculated by the formula $E = 10^{-1/\text{slope}} - 1$, and ranged from 90–100% in the different assays (a slope of -3.32 is equivalent to 100% PCR efficiency) [14, 15].
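The efficiency calculation can be sketched in a few lines. This is an illustration only, not the authors' analysis code: the function names and the CT values are hypothetical, and the slope is assumed to come from an ordinary least-squares fit of mean CT against the base-10 logarithm of the RNA input.

```python
import math

def fit_slope(log10_inputs, mean_cts):
    """Least-squares slope of mean CT (y) against log10 RNA input (x)."""
    n = len(log10_inputs)
    mean_x = sum(log10_inputs) / n
    mean_y = sum(mean_cts) / n
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(log10_inputs, mean_cts))
    sxx = sum((x - mean_x) ** 2 for x in log10_inputs)
    return sxy / sxx

def pcr_efficiency(slope):
    """E = 10^(-1/slope) - 1, expressed as a fraction (1.0 = 100%)."""
    return 10 ** (-1.0 / slope) - 1.0

# Dilution series from the text: 100, 10, 1, 0.1 and 0.01 ng of total RNA.
rna_ng = [100, 10, 1, 0.1, 0.01]
log_inputs = [math.log10(a) for a in rna_ng]

# Made-up mean CT values that rise by 3.32 cycles per 10-fold dilution.
example_cts = [20.0 + 3.32 * (2 - x) for x in log_inputs]

slope = fit_slope(log_inputs, example_cts)
efficiency = pcr_efficiency(slope)
```

With these made-up values the fitted slope is -3.32 and the efficiency is very close to 1.0, matching the statement that a slope of -3.32 corresponds to 100% efficiency.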

Analysis of gene expression using the $2^{-\Delta\Delta C_T}$ method

Details of the $2^{-\Delta\Delta C_T}$ method have been previously described [14, 16]. Briefly, the mean target gene mRNA expression level for the three mRNA measurements was calculated. The $2^{-\Delta\Delta C_T}$ method was used to calculate relative changes in gene expression determined from real-time quantitative PCR experiments. In the present study, the data are presented as the fold change in target gene expression in tumors normalized to the internal control gene (GAPDH) and relative to the normal tissue control (matched normal as calibrator). Results of the real-time PCR data were represented as CT values, where CT was defined as the threshold cycle number of PCRs at which amplified product was first detected. There is an inverse correlation between CT and amount of target: lower amounts of target correspond to a higher CT value, and higher amounts of target have lower CT values. The average CT was calculated for both the target genes and GAPDH and the ΔCT was determined as (the mean of the triplicate CT values for the target gene) minus (the mean of the triplicate CT values for GAPDH). The ΔΔCT represented the difference between the paired tissue samples, as calculated by the formula ΔΔCT = (ΔCT of tumor) - (ΔCT of normal). The N-fold differential expression in the target gene of a tumor sample compared to the normal sample counterpart was expressed as $2^{-\Delta\Delta C_T}$ [14, 16]. In the present study, increased mRNA expression was defined as N-fold ≥2.0, "normal" expression was an N-fold ranging from 0.5001 to 1.9999, and decreased mRNA expression was N-fold ≤0.5.
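As a worked illustration of this calculation, the sketch below computes ΔCT, ΔΔCT and the N-fold value for one gene in one tumor/normal pair and then applies the study's cut-offs. The function names and all CT values are hypothetical; only the formulas and the thresholds (≥2.0 and ≤0.5) come from the text.

```python
def mean(values):
    return sum(values) / len(values)

def delta_ct(target_cts, gapdh_cts):
    """Delta CT = mean target CT minus mean GAPDH CT (triplicate wells)."""
    return mean(target_cts) - mean(gapdh_cts)

def n_fold(tumor_target, tumor_gapdh, normal_target, normal_gapdh):
    """N-fold = 2^-(delta CT tumor - delta CT normal), matched normal as calibrator."""
    ddct = (delta_ct(tumor_target, tumor_gapdh)
            - delta_ct(normal_target, normal_gapdh))
    return 2.0 ** (-ddct)

def classify(fold):
    """Apply the study's cut-offs: >= 2.0 increased, <= 0.5 decreased."""
    if fold >= 2.0:
        return "increased"
    if fold <= 0.5:
        return "decreased"
    return "normal"

# Hypothetical triplicate CT values for one gene in one patient.
fold = n_fold(
    tumor_target=[24.0, 24.1, 23.9],
    tumor_gapdh=[18.0, 18.0, 18.0],
    normal_target=[26.0, 26.0, 26.0],
    normal_gapdh=[18.0, 18.0, 18.0],
)
label = classify(fold)
```

Here the tumor ΔCT is 6 and the normal ΔCT is 8, so ΔΔCT is -2 and the tumor shows about 4-fold higher expression, which falls in the "increased" category.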

Data analysis

Hierarchical cluster analysis and visualization were generated by Cluster and TreeView software [17].
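The Figure 2 caption states that clustering used average linkage with an uncentered correlation metric. The sketch below shows what those two terms mean. It is not the Cluster software's implementation. The function names and expression profiles are hypothetical, and two things are assumed rather than taken from the text: that average linkage is the mean of all pairwise distances between clusters, and that profiles are log-transformed before clustering.

```python
import math

def uncentered_correlation(x, y):
    """Uncentered Pearson correlation: like Pearson's r, but the means are
    taken to be zero, so it equals the cosine of the angle between x and y."""
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    return dot / (norm_x * norm_y)

def average_linkage_distance(cluster_a, cluster_b):
    """Mean pairwise distance (1 - correlation) between two clusters."""
    distances = [1.0 - uncentered_correlation(p, q)
                 for p in cluster_a for q in cluster_b]
    return sum(distances) / len(distances)

# Hypothetical log2 N-fold profiles for three patients across three genes.
patient_a = [1.0, 2.0, -3.0]
patient_b = [2.0, 4.0, -6.0]    # same direction as patient_a
patient_c = [-1.0, -2.0, 3.0]   # opposite direction to patient_a

similar = uncentered_correlation(patient_a, patient_b)
opposite = uncentered_correlation(patient_a, patient_c)
linkage = average_linkage_distance([patient_a], [patient_b, patient_c])
```

Because the metric ignores magnitude and keeps direction, patients with the same pattern of over- and under-expression cluster together even when their fold changes differ in size.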

The associations between mRNA expression level of target genes and life style risk factors as well as clinical/pathological characteristics listed in Table 2 were evaluated by using chi-square tests for analysis of categorical variables and Wilcoxon rank sum tests for analysis of continuous variables (Statistical Analysis Systems (SAS), SAS Corp., NC). All P-values were two-sided and considered statistically significant if P < 0.05.
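For readers unfamiliar with the Wilcoxon rank sum test, the following sketch shows its mechanics with the Python standard library. It is not the SAS procedure used in the study: it assumes a plain normal approximation with no continuity or tie correction, so its P-values can differ slightly from those of statistical packages. The function names and the N-fold values are hypothetical.

```python
import math

def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2.0 + 1.0  # mean of ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks

def rank_sum_test(x, y):
    """Return (U statistic for x, two-sided P by normal approximation)."""
    n1, n2 = len(x), len(y)
    ranks = average_ranks(list(x) + list(y))
    w = sum(ranks[:n1])                      # rank sum of the first group
    u = w - n1 * (n1 + 1) / 2.0
    mean_u = n1 * n2 / 2.0
    sd_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12.0)
    z = (u - mean_u) / sd_u
    p = math.erfc(abs(z) / math.sqrt(2.0))   # two-sided normal tail
    return u, p

# Hypothetical N-fold values for one gene in two patient groups.
group_pos = [0.1, 0.2, 0.3]
group_neg = [0.8, 1.5, 2.5]
u_stat, p_value = rank_sum_test(group_pos, group_neg)
```

The test uses only the ordering of the values, which makes it a reasonable choice for N-fold data that span several orders of magnitude.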

Table 2 Demographic and clinical characteristics of ESCC patients (N = 75)

Results

A total of 75 pairs of tumor and matched normal tissues from ESCC patients were examined for mRNA expression of nine genes using quantitative Real-time RT-PCR. Table 2 shows the characteristics of patients in the present study.

The PCR efficiency (E) for the nine genes was measured using standard curves generated by serial dilutions of RNA and ranged from 90% to 100% for the nine genes tested (Table 1 and Fig 1). Quantitative RT-PCR analyses of mRNA levels were performed using paired normal-tumor mRNA extracts and the $2^{-\Delta\Delta C_T}$ method. The mRNA expression values for all 75 ESCC patients tested are shown in Fig 2 and Fig 3 and summarized in Table 3. Briefly, all four genes over-expressed in our initial study also had increased mRNA expression (≥2-fold in tumor versus normal) in the majority of the new ESCC patients tested (shown in red in Fig 2): SPARC was increased in 71% of cases, Fascin in 70%, FADD in 63%, and COL7A1 in 57%. Likewise, the five genes previously identified in our initial cDNA study as under-expressed (≤0.5-fold in tumor versus normal) also had decreased mRNA expression in two-thirds or more of the new patients examined (shown in green in Fig 2): CK4 was decreased in 83% of cases, TGM3 in 77%, ECM1 in 73%, and PPL and EVPL decreased in 67% each. Overall, the results of the current study indicate a reasonably consistent pattern for mRNA expression in these nine genes in ESCC patients (Table 3), and are quite compatible with the findings of our earlier cDNA microarray study [1].

Figure 1

The relative standard curve using the TGM3 gene, primers, and probes amplified with 100, 10, 1, 0.1 and 0.01 ng of total RNA. Each sample was run in triplicate (red square). The average CT values (y-axis) are plotted against the logarithm of the input amount of RNA (x-axis) added to each sample. PCR efficiency: $E = 10^{-1/\text{slope}} - 1$. The plot shows a linear relationship between RNA concentration and the CT value of the reverse transcription real-time PCR reaction for TGM3.

Figure 2

The mRNA expression of nine genes in 75 ESCC patients using hierarchical clustering with the average linkage uncentered correlation method. Cases with increased mRNA expression are shown in red; cases with decreased mRNA expression are shown in green.

Figure 3

The distribution of mRNA expression of nine genes in 75 ESCC patients. Increased expression (≥2 N-fold) is shown in red; "normal" expression (N-fold from 0.50001 to 1.9999) is shown in yellow; decreased expression (N-fold ≤0.5) is shown in green.

Table 3 mRNA expression of nine genes in ESCC patients

The range of mRNA expression found using the quantitative real-time RT-PCR method was very broad (Table 3). Among the nine genes, TGM3 had the greatest range of mRNA expression (from 0.000 to 35857-fold difference), while PPL had the smallest range (0.0076 to 29.9953-fold difference). Several ESCC cases had unusually different mRNA expression patterns for a few genes. These outliers not only showed the opposite expression pattern but also had extremely low (<0.01-fold) or extremely high (>100-fold) mRNA expression levels. For example, 70% of cases had increased mRNA expression for Fascin, but SHE1233 and SHE1338 had very low Fascin mRNA expression (0.0033 and 0.0091-fold change, respectively). In addition, most cases had decreased ECM1, TGM3 and EVPL mRNA expression, but SHE1992 had increased mRNA expression for ECM1 and TGM3 (139- and 1050-fold changes, respectively), SHE1289 had extremely high mRNA expression for TGM3 (35857-fold) and CK4 (467-fold change), and SHE1432 had increased mRNA expression for EVPL (136-fold change) and TGM3 (99-fold change). This extreme variation in mRNA expression strongly suggests that there is heterogeneity in ESCC, even when all the patients studied come from a geographically similar high-risk population [18].

We also explored the relationships between mRNA expression levels for each target gene and age, gender, smoking, alcohol drinking, FH of UGI cancer, and the four clinical characteristics listed in Table 2 and identified five potential associations using the nominal P-value cutoff of 0.05. Family history of UGI cancer has been of particular interest to us since we previously showed that FH (+) cases had greater allelic loss than FH (-) cases [19, 20] and that gene expression profiles varied by FH status [1]. As a result, we examined expression of all nine genes by FH status in this study (Table 4). Two genes, PPL and ECM1, had lower mRNA expression in FH (+) ESCC cases than in FH (-) cases (78% versus 62% under-expression, respectively, for PPL, Wilcoxon rank sum test P = 0.040; and 87% versus 67% under-expression, respectively, for ECM1, Wilcoxon rank sum test P = 0.038). Expression of two other genes, TGM3 and EVPL, showed lower median values in FH (+) versus FH (-) cases, although the results did not achieve statistical significance (0.05 < P < 0.10 for both). Interestingly, the median values for all nine genes were universally more extreme in FH (+) than FH (-) cases (ie, the five under-expressed were more under-expressed in FH (+) cases; the four over-expressed were more over-expressed in FH (+) cases). Older age was associated with over-expression of COL7A1 (63% over-expression in cases 56+ years of age versus 50% over-expression in cases <56 years old, Wilcoxon rank sum test P = 0.030). Alcohol use was associated with reduced over-expression of both SPARC and Fascin (over-expression of 40% in alcohol users versus 75% in non-users for SPARC, χ² (2 df) = 8.38, P = 0.015; over-expression of 50% in alcohol users versus 74% in non-users for Fascin, χ² (2 df) = 13.15, P = 0.001). No other significant associations were found for these variables.
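The two chi-square P-values reported for alcohol use can be checked by hand, because for exactly 2 degrees of freedom the upper-tail probability of the chi-square distribution has the closed form exp(-x/2). The short sketch below applies that identity to the reported statistics. The function name is hypothetical and the check is not part of the original analysis.

```python
import math

def chi_square_p_value_2df(statistic):
    """Upper-tail P-value of a chi-square statistic with exactly 2 df.

    For 2 df the survival function reduces to exp(-x / 2); this shortcut
    does not hold for any other number of degrees of freedom.
    """
    return math.exp(-statistic / 2.0)

p_sparc = chi_square_p_value_2df(8.38)     # SPARC by alcohol use
p_fascin = chi_square_p_value_2df(13.15)   # Fascin by alcohol use
```

Rounded to three decimals, the results are 0.015 and 0.001, in agreement with the values given in the text.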

Table 4 mRNA expression of nine genes in ESCC patients by family history of upper gastrointestinal cancer

Discussion

Our interest in further validation of initial results from DNA microarray technology and quantitation of measurements to better understand the nature of differences led us to perform quantitative Real-time RT-PCR analyses of nine of the differentially expressed genes we identified in a previous study [1]. Reassuringly, results from the current study show that the differential expression patterns for these nine genes are similar to those seen with the 8 k cDNA array [1]. Four genes that were over-expressed in our first study (SPARC, FADD, Fascin, and COL7A1) also showed increased mRNA expression in the majority of the new ESCC patients tested here, while five genes previously under-expressed (CK4, TGM3, ECM1, PPL, and EVPL) were also under-expressed in two-thirds or more of new cases. Taken together, the results from these two studies indicate that the differential expression observed is highly consistent.

The mRNA expressions of the nine genes examined had wide ranges, suggesting that quantitative Real-time RT-PCR is a highly sensitive method, with demonstrable ability to detect mRNA expression differences from less than 0.01-fold to more than 100-fold. If a normal range can be established from non-diseased subjects, this apparent high degree of precision should prove useful for accurately discriminating patient values in the future in both early detection and treatment evaluation settings.

Selection of a housekeeping gene as an internal control is a key issue when RT-PCR is used to study mRNA expression [21]. In the present study, we selected GAPDH as an internal control. While mRNA expression of GAPDH may not be the same in different tissues [21], we believe that it was a satisfactory control gene for the present study because both tumor and matched normal esophageal tissues were used and the values for tumor and normal were both normalized to GAPDH as an initial step in the data analysis.

SPARC protein is a non-structural, extracellular matrix-associated matricellular glycoprotein. It is known that SPARC protein is involved in the formation of focal adhesions and cytoskeletal structure, and that different levels influence cell adherence and migration capacity, potentially playing a role in tumor invasion and metastasis. SPARC has previously been shown to have increased mRNA expression in various human cancers, including ESCC, where it has been studied using array technology [22, 23]. In the present study, we found increased mRNA expression of SPARC in 71% of patients. Yamashita et al also found high expression of SPARC mRNA in all 48 ESCC patients examined using Northern blot hybridization [24]. In another study, SPARC mRNA expression using the RT-PCR method was lowest in normal esophagus, intermediate in Barrett's esophagus, and highest in esophageal adenocarcinoma [25]. SPARC protein can be measured in human serum [26]; thus, future study of the role of SPARC in ESCC should consider comparison of serum levels in cases and controls as an early step in assessing the potential value of SPARC protein as a biomarker for early detection or screening in high-risk populations. Taken together, these results suggest that SPARC plays an important role in esophageal carcinogenesis and merits further investigation to determine how early in this process it becomes dysregulated.

Fascin is the first of the fascins to be cloned and is a highly conserved 55-kDa actin-bundling protein. Fascin functions in the organization of two major forms of actin-based structures: dynamic cortical cell protrusions (eg, filopodia, spikes, lamellipodial ribs, dendrites and microvilli) and cytoplasmic microfilament bundles [27]. The expression of fascin in epithelial neoplasms has been described only recently. In normal epithelial cells, fascin expression is usually absent or very low, but it is often up-regulated in several types of human cancers, such as colon [28], breast [29], lung [30], and gastric [31]. To date, it has not been reported in ESCC. In a pilot study, we procured normal epithelia, dysplasia, and tumor from each of 11 esophagectomized ESCC patients and examined fascin protein expression by immunohistochemistry. Staining was strongest in tumor, intermediate in dysplasia, and absent in normal epithelia (unpublished data). These results suggest that the mRNA and protein expression patterns for fascin are similar, and that fascin could be a useful biomarker in ESCC patients.

CK4, a type II keratin, is typically co-expressed with CK13 in non-keratinized stratified epithelia of the upper digestive tract. Mutation of the CK4 gene can cause White Sponge Nevus, a benign autosomal dominant disorder that presents as leukokeratosis, usually in the mouth, but has also been reported in the esophagus and anogenital mucosa [32]. While CK4 protein alteration has been observed in many upper digestive tract tumors, including adenocarcinoma of the esophagus [33], it has not previously been reported in ESCC. In another pilot study, we found that CK4 protein expression, compared to normal epithelia, was decreased in 80% of dysplastic epithelia and 85% of invasive tumors (unpublished data). The reason CK4 expression is decreased in ESCC is unclear, but decreased expression of CK4 does influence cytoskeleton formation in cells and this change may play a role in tumor development.

EVPL, PPL, and TGM3 are all involved in stabilizing the cornified cell envelope and, like CK4 discussed above, were all under-expressed in the ESCC patients evaluated in this study. Other investigators have shown that the C-terminus of PPL plays an important role in linking PPL and EVPL to intermediate filaments [34]. EVPL is a candidate gene for the tylosis esophageal cancer (TOC) syndrome. Although our previous studies showed familial aggregation of ESCC in Shanxi Province, China [35, 36], no cases with tylosis were identified in these high-risk families. In the present study, decreased PPL mRNA expression was significantly associated with a FH of UGI cancer. These results suggest that additional examination of the relationship between members of the plakin family of genes, particularly in familial ESCC, may be illuminating.

There are no immediate clinical applications from the findings reported here. Although our study size was only modest, the lack of association with tumor characteristics related to survival suggests that these nine genes are not likely to be useful in predicting outcome or suggesting specific therapies. Further evaluation of expression in relation to actual survival is needed to firm up this conclusion, however. For the moment, the greatest clinical potential for the nine genes studied here is their potential as early detection markers.

Although the nine genes studied in the present study were well characterized at the mRNA level, changes at the protein level remain much less well characterized. Thus, logical next steps in future studies of these genes in ESCC are to further correlate protein expression status with mRNA expression, to evaluate the relation between protein expression and tumor phenotype (ie, grade, differentiation), to study the importance of cell location in protein over-expression, and to assess the relation of each of these factors to patient prognosis. In addition, we need to learn when in the ESCC carcinogenesis process dysregulation occurs if we are to determine the potential value of these genes as biomarkers for the early detection of ESCC.

Conclusion

Results of the current study confirm that each of the nine genes evaluated is significantly dysregulated in the majority of ESCC cases and merits further investigation as a potential susceptibility and early detection marker.