Background

Systemic androgen-deprivation therapy by orchiectomy or agonists of gonadotropin-releasing hormone is routinely used to treat men with metastatic prostate cancer to reduce tumor burden and pain. This therapy is based on the dependency of prostate cells on androgens to grow and survive. The inability of androgen-deprivation therapy to completely and effectively eliminate all metastatic prostate cancer cell populations is manifested by a predictable and inevitable relapse, referred to as castration-recurrent prostate cancer (CRPC). CRPC is the end stage of the disease and is fatal to the patient within 16-18 months of onset.

The mechanisms underlying progression to CRPC are unknown. However, there are several models to explain its development. One such model indicates the involvement of the androgen signaling pathway [1-4]. Key to this pathway is the androgen receptor (AR), which is a steroid hormone receptor and transcription factor. Mechanisms of progression to CRPC that involve or utilize the androgen signaling pathway include: hypersensitivity due to AR gene amplification [5, 6]; changes in AR co-regulators such as nuclear receptor coactivators (NCOA1 and NCOA2) [7, 8]; intraprostatic de novo synthesis of androgen [9] or metabolism of AR ligands from residual adrenal androgens [10, 11]; AR promiscuity of ligand specificity due to mutations [12]; and ligand-independent activation of AR by growth factors [protein kinase A (PKA), interleukin 6 (IL6), and epidermal growth factor (EGF)] [13-15]. Activation of the AR can be determined by assaying for the expression of target genes such as prostate-specific antigen (PSA) [16]. Other models of CRPC include neuroendocrine differentiation [17], the stem cell model [18], and the imbalance between cell growth and cell death [3]. It is conceivable that these models may not be mutually exclusive. For example, altered AR activity may impact cell survival and proliferation.

Here, we describe long serial analysis of gene expression (LongSAGE) libraries[19, 20] made from RNA sampled from biological replicates of the in vivo LNCaP Hollow Fiber model of prostate cancer as it progresses to the castration-recurrent stage. Gene expression signatures that were consistent among the replicate libraries were applied to the current models of CRPC.

Methods

In vivo LNCaP Hollow Fiber model

The LNCaP Hollow Fiber model of prostate cancer was performed as described previously [21-23]. All animal experiments were performed according to a protocol approved by the Committee on Animal Care of the University of British Columbia. Serum PSA levels were determined by enzymatic immunoassay kit (Abbott Laboratories, Abbott Park, IL, USA). Fibers were removed on three separate occasions representing different stages of hormonal progression that were androgen-sensitive (AS), responsive to androgen-deprivation (RAD), and castration-recurrent (CR). Samples were retrieved immediately prior to castration (AS), as well as 10 (RAD) and 72 days (CR) post-surgical castration.

RNA sample generation, processing, and quality control

Total RNA was isolated immediately from cells harvested from the in vivo Hollow Fiber model using TRIZOL Reagent (Invitrogen) following the manufacturer's instructions. Genomic DNA was removed from RNA samples with DNaseI (Invitrogen). RNA quality and quantity were assessed by the Agilent 2100 Bioanalyzer (Agilent Technologies, Mississauga, ON, Canada) and RNA 6000 Nano LabChip kit (Caliper Technologies, Hopkinton, MA, USA).

Quantitative real-time polymerase chain reaction

Oligo-d(T)-primed total RNAs (0.5 μg per sample) were reverse-transcribed with SuperScript III (Invitrogen Life Technologies, Carlsbad, CA, USA). An appropriate dilution of cDNA and gene-specific primers were combined with SYBR Green Supermix (Invitrogen) and amplified in an ABI 7900 real-time PCR machine (Applied Biosystems, Foster City, CA, USA). All qPCR reactions were performed in triplicate. The threshold cycle number (Ct) and expression values with standard deviations were calculated in Excel. Primer sequences for real-time PCR were: KLK3, F: 5'-CCAAGTTCATGCTGTGTGCT-3' and R: 5'-CCCATGACGTGATACCTTGA-3'; glyceraldehyde-3-phosphate dehydrogenase (GAPDH), F: 5'-CTGACTTCAACAGCGACACC-3' and R: 5'-TGCTGTAGCCAAATTCGTTG-3'. Real-time amplification was performed with initial denaturation at 95°C for 2 min, followed by 40 cycles of two-step amplification (95°C for 15 sec, 55°C for 30 sec).

LongSAGE library production and sequencing

RNA from the hollow fibers of three mice (biological replicates) representing different stages of prostate cancer progression (AS, RAD, and CR) were used to make a total of nine LongSAGE libraries. LongSAGE libraries were constructed and sequenced at the Genome Sciences Centre, British Columbia Cancer Agency. Five micrograms of starting total RNA was used in conjunction with the Invitrogen I-SAGE Long kit and protocol with alterations [24]. Raw LongSAGE data are available at Gene Expression Omnibus [25] as series accession number GSE18402. Individual sample accession numbers are as follows: S1885, GSM458902; S1886, GSM458903; S1887, GSM458904; S1888, GSM458905; S1889, GSM458906; S1890, GSM458907; S1891, GSM458908; S1892, GSM458909; and S1893, GSM458910.

Gene expression analysis

LongSAGE expression data were analyzed with DiscoverySpace 4.01 software [26]. Sequence data were filtered for bad tags (tags with at least one N-base call) and linker-derived tags (artifact tags). Only LongSAGE tags with a sequence quality factor (QF) greater than 95% were included in analysis. The phylogenetic tree was constructed with a distance metric of 1-r (where "r" equals the Pearson correlation coefficient). Correlations were computed (including tag counts of zero) using the Regress program of the Stat package written by Ron Perlman, and the tree was optimized using the Fitch program [27] in the Phylip package [28]. Graphics were produced from the tree files using the program TreeView [29]. Tag clustering analysis was performed using the Poisson distribution-based K-means clustering algorithm (PoissonC). The K-means algorithm clusters tags based on count into 'K' partitions, with the minimum intracluster variance. PoissonC was developed specifically for the analysis of SAGE data [30]. The Java implementation of the algorithm was kindly provided by Dr. Li Cai (Rutgers University, NJ, USA). An optimal value for K (K = 10) was determined [31].
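The 1-r distance used for the library tree can be illustrated with a minimal sketch. It assumes each library is held as a mapping from tag sequence to count and that tags absent from a library contribute a count of zero, as stated above. The function names and the toy tag labels are hypothetical, and this is not the Regress or Fitch code used in the study.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant vector")
    return sxy / math.sqrt(sxx * syy)

def library_distance(counts_a, counts_b):
    """Distance 1 - r between two libraries given as {tag: count} dicts.

    Tags seen in only one library are included with a count of zero
    in the other, so both vectors cover the union of tag types.
    """
    tags = sorted(set(counts_a) | set(counts_b))
    x = [counts_a.get(tag, 0) for tag in tags]
    y = [counts_b.get(tag, 0) for tag in tags]
    return 1.0 - pearson_r(x, y)

# Toy libraries with made-up tag labels (not real LongSAGE tags).
lib_1 = {"tag_a": 10, "tag_b": 0, "tag_c": 5}
lib_2 = {"tag_a": 20, "tag_c": 10}
print(library_distance(lib_1, lib_2))
```

A pairwise matrix of such distances across the nine libraries is the kind of input a distance-based tree-fitting program takes.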

Principal component analysis

Principal component analysis was performed using GeneSpring™ software version 7.2 (Silicon Genetics, CA). Affymetrix datasets of clinical prostate cancer and normal tissue were downloaded from Gene Expression Omnibus [25] (accession numbers: GDS1439 and GDS1390) and analyzed in GeneSpring™. Of the 96 novel CR-associated genes, 76 genes had corresponding Affymetrix probe sets. These probe sets were applied as the gene signature in this analysis. Principal component (PC) scores were calculated according to the standard correlation between each condition vector and each principal component vector.

Results

LongSAGE library and tag clustering

RNA isolated from the LNCaP Hollow Fiber model was obtained from at least three different mice (13N, 15N, and 13R; biological replicates) at three stages of cancer progression that were androgen-sensitive (AS), responsive to androgen-deprivation (RAD), and castration-recurrent (CR). To confirm that the samples represented unique disease-states, we determined the levels of KLK3 mRNA, a biomarker that correlates with progression, using quantitative real time-polymerase chain reaction (qRT-PCR). As expected, KLK3 mRNA levels dropped in the stage of cancer progression that was RAD versus AS (58%, 49%, and 37%), and rose in the stage of cancer progression that was CR versus RAD (229%, 349%, and 264%) for mice 13R, 15N, and 13N, respectively (Additional file 1). Therefore, we constructed nine LongSAGE libraries, one for each stage and replicate.

LongSAGE libraries were sequenced to 310,072 - 339,864 tags each, with a combined total of 2,931,124 tags, and filtered to leave only useful tags for analysis (Table 1). First, bad tags were removed because they contain at least one N-base call in the LongSAGE tag sequence. The sequencing of the LongSAGE libraries was base called using PHRED software. Tag sequence-quality factor (QF) and probability was calculated to ascertain which tags contain erroneous base-calls. The second line of filtering removed LongSAGE tags with probabilities less than 0.95 (QF < 95%). Linkers were introduced into SAGE libraries as known sequences utilized to amplify ditags prior to concatenation. At a low frequency, linkers ligate to themselves creating linker-derived tags (LDTs). These LDTs do not represent transcripts and were removed from the LongSAGE libraries. A total of 2,305,589 useful tags represented by 263,197 tag types remained after filtering. Data analysis was carried out on this filtered data.
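The three filtering steps described above can be sketched as follows. This is a minimal illustration, not the DiscoverySpace pipeline: it assumes each tag arrives as a (sequence, quality probability) pair, treats a probability below 0.95 as failing the QF filter, and uses invented tag and linker sequences purely as placeholders.

```python
from collections import Counter

def filter_tags(tags, linker_tags, min_quality=0.95):
    """Split tags into useful tag-type counts and per-reason removal counts.

    tags        -- iterable of (sequence, quality_probability) pairs
    linker_tags -- set of known linker-derived tag sequences
    min_quality -- tags with probability below this value are discarded
    """
    linker_set = {seq.upper() for seq in linker_tags}
    useful = Counter()   # tag type -> count of useful tags
    removed = Counter()  # removal reason -> number of tags removed
    for sequence, quality in tags:
        seq = sequence.upper()
        if "N" in seq:                      # bad tag: at least one N-base call
            removed["bad"] += 1
        elif quality < min_quality:         # low sequence quality (QF < 95%)
            removed["low_quality"] += 1
        elif seq in linker_set:             # linker-derived tag (artifact)
            removed["linker_derived"] += 1
        else:
            useful[seq] += 1
    return useful, removed

# Invented 21-base sequences, for illustration only.
example_tags = [
    ("CATGAAACCCGGGTTTAAACC", 0.99),
    ("CATGAAACCCGGGTTTAAACC", 0.97),
    ("CATGAAACCNGGGTTTAAACC", 0.99),
    ("CATGTTTTGGGGCCCCAAAAT", 0.80),
    ("CATGACGTACGTACGTACGTA", 0.99),
]
example_linkers = {"CATGACGTACGTACGTACGTA"}

useful, removed = filter_tags(example_tags, example_linkers)
print("useful tags:", sum(useful.values()), "tag types:", len(useful))
print("removed:", dict(removed))
```

The sum of the `useful` counts corresponds to the number of useful tags and its length to the number of tag types, the two quantities reported for the filtered libraries.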

Table 1 Composition of LongSAGE libraries

The LongSAGE libraries were hierarchically clustered and displayed as a phylogenetic tree. In most cases, LongSAGE libraries made from the same disease stage (AS, RAD, or CR) clustered together more closely than LongSAGE libraries made from the same biological replicate (mice 13N, 15N, or 13R; Figure 1). This suggests the captured transcriptomes were representative of disease stage with minimal influence from biological variation.

Figure 1

Clustering of the nine LongSAGE libraries in a hierarchical tree. The tree was generated using a Pearson correlation-based hierarchical clustering method and visualized with TreeView. LongSAGE libraries constructed from similar stages of prostate cancer progression (AS, androgen-sensitive; RAD, responsive to androgen-deprivation; and CR, castration-recurrent) cluster together. 13N, 15N, and 13R indicate the identity of each animal.

Identification of groups of genes that behave similarly during progression of prostate cancer was conducted through K-means clustering of tags using the PoissonC algorithm [30]. For each biological replicate (mice 13N, 15N, or 13R), all tag types were clustered that had a combined count greater than ten in the three libraries representing disease stages (AS, RAD, and CR) and mapped unambiguously, in the sense orientation, to a transcript in reference sequence (RefSeq; February 28th, 2008) [32] using DiscoverySpace4 software [33]. By plotting within-cluster dispersion (i.e., intracluster variance) against a range of K (number of clusters; Additional file 1, Figure S2), we determined that ten clusters best embodied the expression patterns present in each biological replicate. This was decided based on the inflection point in the graph (Additional file 1, Figure S2), showing that after reaching K = 10, increasing K did not substantially reduce the within-cluster dispersion. K-means clustering was performed over 100 iterations, so that tags would be placed in clusters that best represent their expression trend. The most common clusters for each tag are displayed (Figure 2). In only three instances, there were similar clusters in just two of the three biological replicates. Consequently, consistent changes in gene expression during progression were represented in 11 patterns. Differences among expression patterns for each biological replicate may be explained by biological variation, the probability of sampling a given LongSAGE tag, and/or imperfections in K-means clustering (e.g., variance may not be a good measure of cluster scatter).

Figure 2

K-means clustering of tag types with similar expression trends. PoissonC with K = 10 (where K = number of clusters) was conducted over 100 iterations separately for each biological replicate (mice 13N, 15N, and 13R) and the results from the iterations were combined into consensus clusters shown here. Plotted on the x-axes are the long serial analysis of gene expression (LongSAGE) libraries representing different stages of prostate progression: AS, androgen-sensitive; RAD, responsive to androgen-deprivation; and CR, castration-recurrent. Plotted on the y-axes are the relative expression levels of each tag type, represented as a percentage of the total tag count (for a particular tag type) in all three LongSAGE libraries. Different colors represent different tag types. Each of the ten clusters for each biological replicate are labeled as such. 'No equivalent' indicates that a similar expression trend was not observed in the indicated biological replicate. Eleven expression patterns are evident in total and are labeled on the left. K-means clusters were amalgamated into five major expression trends: group 1, up during progression; group 2, down during progression; group 3, peak in the RAD stage; group 4, constant during progression; and group 5, valley in RAD stage.
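The selection rule for clustered tags (combined count greater than ten) and the y-axis quantity in Figure 2 (each stage's share of a tag type's total count) can be sketched as below. The sketch assumes raw, unnormalized counts ordered as (AS, RAD, CR); whether counts were scaled by library size before this step is not stated here, so that choice and all names in the code are assumptions. It does not implement PoissonC itself.

```python
def relative_profile(counts):
    """Convert (AS, RAD, CR) counts into percentages of the tag's total count."""
    total = sum(counts)
    if total == 0:
        raise ValueError("tag has no counts in any library")
    return tuple(100.0 * c / total for c in counts)

def select_profiles(tag_counts, min_total=10):
    """Keep tag types whose combined count exceeds min_total; return profiles.

    tag_counts -- dict mapping tag label -> (AS, RAD, CR) counts
    """
    return {
        tag: relative_profile(counts)
        for tag, counts in tag_counts.items()
        if sum(counts) > min_total
    }

# Made-up tag labels and counts for one hypothetical biological replicate.
replicate = {
    "tag_up": (2, 6, 12),      # rises during progression
    "tag_valley": (10, 1, 9),  # valley at the RAD stage
    "tag_rare": (3, 3, 4),     # combined count of 10, so it is excluded
}
for tag, profile in select_profiles(replicate).items():
    print(tag, [round(p, 1) for p in profile])
```

Expressing each tag as a percentage profile puts abundant and rare tag types on the same scale, which is why tags with very different absolute counts can share a cluster in Figure 2.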

Gene ontology enrichment analysis

We conducted Gene Ontology (GO) [34] enrichment analysis using Expression Analysis Systematic Explorer (EASE) [35] software to determine whether specific GO annotations were over-represented in the K-means clusters. Enrichment was defined by the EASE score (p-value ≤ 0.05) generated during comparison to all the other clusters in the biological replicate. This analysis was done for each biological replicate (3 mice: 13N, 15N, or 13R).
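For readers unfamiliar with the EASE score, the following sketch shows one common description of it: a one-tailed Fisher exact probability computed after removing one gene from the list hits, which makes the score more conservative than the plain Fisher probability. This is an assumption about how the score is defined, not a reproduction of the EASE software, and the counts in the example are invented.

```python
from math import comb

def fisher_upper_tail(a, b, c, d):
    """One-tailed Fisher exact probability of seeing >= a list hits.

    2x2 table: a = cluster genes with the GO term, b = cluster genes without,
               c = background genes with the term, d = background genes without.
    Here the background plays the role of all other clusters in the replicate.
    """
    list_size = a + b
    annotated = a + c
    population = a + b + c + d
    denominator = comb(population, list_size)
    numerator = sum(
        comb(annotated, i) * comb(population - annotated, list_size - i)
        for i in range(a, min(list_size, annotated) + 1)
    )
    return numerator / denominator

def ease_score(a, b, c, d):
    """Jackknifed Fisher probability: one hit is removed before testing."""
    if a <= 0:
        return 1.0
    return fisher_upper_tail(a - 1, b, c, d)

# Invented counts: 10 of 50 cluster genes carry the term, 20 of 950 others do.
a, b, c, d = 10, 40, 20, 930
print("Fisher:", fisher_upper_tail(a, b, c, d))
print("EASE  :", ease_score(a, b, c, d))
print("enriched at EASE score <= 0.05:", ease_score(a, b, c, d) <= 0.05)
```

Because a category supported by a single gene always receives a score of 1.0 under this definition, the penalty falls mainly on categories with few hits.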

To make differences among the 11 expression trends easier to visualize, the clusters were amalgamated into five major trends: group 1, up during progression; group 2, down during progression; group 3, peak in the RAD stage; group 4, constant during progression; and group 5, valley in RAD stage (Figure 2). To be consistent, the GO enrichment data were combined into five major trends, which resulted in redundancy in GO terms. To simplify the GO enrichment data, similar terms were pooled into representative categories. Categorical gene ontology enrichments of the five major expression trends are shown in Figure 3. These data indicate that steroid binding, heat shock protein activity, de-phosphorylation activity, and glycolysis all decreased in the stage that was RAD, but increased again in the stage that was CR. Interestingly, steroid hormone receptor activity continued to increase throughout progression. Both of these expression trends were observed for genes with GO terms for transcription factor activity or secretion. The GO categories for genes with kinase activity and signal transduction displayed expression trends with peaks and valleys at the stage that was RAD. The levels of expression of genes involved in cell adhesion rose in the stage that was RAD, but dropped again in the stage that was CR.

Figure 3

Gene Ontology enrichments of the five major expression trends. Plotted on the x-axis are Gene Ontology (GO) categories enriched in one or more of the five major expression trends. On the z-axis the five major expression trends correspond to Figure 2 and are: group 1, up during progression; group 2, down during progression; group 3, peak in the RAD stage; group 4, constant during progression; and group 5, valley in RAD stage. The y-axis displays the number of biological replicates (number of mice: 1, 2, or 3) exhibiting enrichment. The latter allows one to gauge the magnitude of the GO enrichment and confidence.

Altogether, genes with functional categories that were enriched in expression trends may be consistent with the AR signaling pathway playing a role in progression of prostate cancer to castration-recurrence (Figure 3). For example, GO terms steroid binding, steroid hormone receptor activity, heat shock protein activity, chaperone activity, and kinase activity could represent the cytoplasmic events of AR signaling. GO terms transcription factor activity, regulation of transcription, transcription corepression activity, and transcription co-activator activity could represent the nuclear events of AR signaling. AR-mediated gene transcription may result in splicing and protein translation, to regulate general cellular processes such as proliferation (and related nucleotide synthesis, DNA replication, oxidative phosphorylation, oxidoreductase activity, and glycolysis), secretion, and differentiation.

It should be noted, however, that both positive and negative regulators were represented in the GO enriched categories (Figure 3). Therefore, a more detailed analysis was required to determine if the pathways represented by the GO-enriched categories were promoted or inhibited during progression to CRPC. Moreover, many of the GO enrichments that were consistent with changes in the AR signaling pathway were generic, and could be applied to the other models of CRPC.

Consistent differential gene expression associated with progression of prostate cancer

Pair-wise comparisons were made between LongSAGE libraries representing the transcriptomes of different stages (AS, RAD, and CR) of prostate cancer progression from the same biological replicate (3 mice: 13N, 15N, or 13R). Among all three biological replicates, the number of tag types that were consistently and significantly differentially expressed was determined using the Audic and Claverie test statistic [36] at p ≤ 0.05, p ≤ 0.01, and p ≤ 0.001 (Table 2). The tags represented in Table 2 were included only if the associated expression trend was common among all three biological replicates. The Audic and Claverie statistical method is well-suited for LongSAGE data, because the method takes into account the sizes of the libraries and tag counts. Tag types were counted multiple times if they were over- or under-represented in more than one comparison. The number of tag types differentially expressed decreased by 57% as the stringency of the p-value increased from p ≤ 0.05 to 0.001.
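The way the Audic and Claverie statistic accounts for library size can be seen in a short sketch. For a tag observed x times in a library of N1 tags, the probability of observing it y times in a second library of N2 tags is p(y|x) = (N2/N1)^y (x+y)! / [x! y! (1 + N2/N1)^(x+y+1)]. The code below evaluates this in log space. The p-value convention (doubling the smaller tail) and the example counts are assumptions for illustration, not the settings used in DiscoverySpace.

```python
from math import exp, lgamma, log

def ac_probability(y, x, n1, n2):
    """Audic-Claverie p(y | x) for libraries of n1 and n2 total tags."""
    ratio = n2 / n1
    log_p = (
        y * log(ratio)
        + lgamma(x + y + 1) - lgamma(x + 1) - lgamma(y + 1)
        - (x + y + 1) * log(1.0 + ratio)
    )
    return exp(log_p)

def ac_pvalue(x, y, n1, n2):
    """Two-sided p-value: twice the smaller cumulative tail, capped at 1.

    Assumed convention; other implementations report a one-sided tail.
    Very small upper tails are limited by floating-point precision here.
    """
    lower = sum(ac_probability(k, x, n1, n2) for k in range(y + 1))
    upper = max(0.0, 1.0 - lower + ac_probability(y, x, n1, n2))
    return min(1.0, 2.0 * min(lower, upper))

# Invented counts in two hypothetical libraries of unequal size.
n_rad, n_cr = 310_000, 340_000
print(ac_pvalue(10, 11, n_rad, n_cr))   # similar abundance
print(ac_pvalue(5, 50, n_rad, n_cr))    # strongly over-represented in CR
```

Because the ratio N2/N1 enters the formula directly, the same pair of raw counts can be significant or not depending on how deeply each library was sequenced.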

Table 2 Number of tag types consistently and significantly differentially expressed among all three biological replicates and between conditions*

Tag types consistently differentially expressed in pair-wise comparisons were mapped to RefSeq (March 4th, 2008). Tags that mapped anti-sense to genes, or mapped ambiguously to more than one gene were not included in the functional analysis. GO, Kyoto Encyclopedia of Genes and Genomes (KEGG; v45.0) [37] pathway, and SwissProt (v13.0) [38] keyword annotation enrichment analyses were conducted using EASE (v1.21; March 11th, 2008) and FatiGO (v3; March 11th, 2008) [39] (Table 3). This functional analysis revealed that the expression of genes involved in signaling increased during progression, but the expression of genes involved in protein synthesis decreased during progression. Cell communication increased in the stage that was RAD but leveled off in the stage that was CR. Carbohydrate, lipid and amino acid synthesis was steady in the RAD stage but increased in the CR stage. Lastly, glycolysis decreased in the RAD stage, but was re-expressed in the CR stage (Table 3).

Table 3 Top five enrichments of functional categories of tags consistently and significantly differentially expressed among all three biological replicates and between stages of prostate cancer*

Tag types differentially expressed between the RAD and CR stages of prostate cancer were of particular interest (Table 4). This is because these tags potentially represent markers for CRPC and/or are involved in the mechanisms of progression to CRPC. These 193 tag types (Table 2) were mapped to databases RefSeq (July 9th, 2007), Mammalian Gene Collection (MGC; July 9th, 2007) [40], or Ensembl Transcript or genome (v45.36d) [41]. Only 135 of the 193 tag types were relevant (Table 4); of the remainder, 48 tag types mapped ambiguously to more than one location in the Homo sapiens transcriptome/genome, and another 10 tag types mapped to the Mus musculus transcriptome/genome. Mus musculus mappings may be an indication of minor contamination of the in vivo LNCaP Hollow Fiber model samples with host (mouse) RNA. These 135 tag types represented 114 candidate genes, with 7 tag types that did not map to the genome, 5 tag types that mapped to unannotated genomic locations, and 9 genes that were associated with more than one tag type. Table 4 shows the LongSAGE tag sequences and tag counts per million tags in all nine libraries. Tags were sorted into groups based on expression trends. These trends are visually represented in Additional file 1, Figure S3. Mapping information was provided where available.

Table 4 Gene expression trends of LongSAGE tags that consistently and significantly altered expression in CR prostate cancer*

We cross-referenced these 114 candidate genes with 28 papers that report global gene expression analyses on tissue samples from men with 'castration-recurrent', 'androgen independent,' 'hormone refractory,' 'androgen-ablation resistant,' 'relapsed,' or 'recurrent' prostate cancer, or animal models of castration-recurrence [42-69]. The candidate genes were identified with HUGO Gene Nomenclature Committee (HGNC) approved gene names, aliases, descriptions, and accession numbers. The gene expression trends of 18 of the 114 genes were previously associated with CRPC. These genes were: ACPP, ADAM2, AMACR, AMD1, ASAH1, DHCR24, FLNA, KLK3, KPNB1, PLA2G2A, RPL13A, RPL35A, RPL37A, RPL39, RPLP2, RPS20, STEAP2, and TACC (Table 4). To our knowledge, the gene expression trends of the remaining 96 genes have never before been associated with CRPC (Tables 4 & 5).

A literature search helped to gauge the potential of these 96 genes to be novel biomarkers or therapeutic targets of CRPC. The results of this literature search are presented in Table 5. We found 31 genes that encode for protein products that are known, or predicted, to be plasma membrane bound or secreted extracellularly (Bioinformatic Harvester). These genes were: ABHD2, AQP3, B2M, C19orf48, CD151, CXCR7, DHRS7, ELOVL5, ENDOD1, ENO2, FGFRL1, GNB2L1, GRB10, HLA-B, MARCKSL1, MDK, NAT14, NELF, OPRK1, OR51E2, PLCB4, PTGFR, RAMP1, S100A10, SPON2, STEAP1, TFPI, TMEM30A, TMEM66, TRPM8, and VPS13B. Secretion of a protein could facilitate detection of the putative biomarkers in blood, urine, or biopsy sample. Twenty-one of the candidate genes are known to alter their levels of expression in response to androgen. These genes were: ABHD2, B2M, BTG1, C19orf48, CAMK2N1, CXCR7, EEF1A2, ELOVL5, ENDOD1, HSD17B4, MAOA, MDK, NKX3-1, ODC1, P4HA1, PCGEM1, PGK1, SELENBP1, TMEM66, TPD52, and TRPM8 [9, 22, 70-81]. Genes regulated by androgen may be helpful in determining the activation status of AR in CRPC. Enriched expression of a protein in prostate tissue could be indicative of whether a tumor is of prostatic origin. Eight of these 96 genes are known to be over-represented in prostate tissue [75, 82-85]. These genes were: ELOVL5, NKX3-1, PCGEM1, PCOTH, RAMP1, SPON2, STEAP1, and TPD52. Twenty-six genes (ABHD2, BNIP3, EEF1A2, ELOVL5, GALNT3, GLO1, HSD17B4, MARCKSL1, MDK, NGFRAP1, ODC1, OR51E2, PCGEM1, PCOTH, PGK1, PP2CB, PSMA7, RAMP1, RPS18, SELENBP1, SLC25A4, SLC25A6, SPON2, STEAP1, TPD52, and TRPM8) have known associations to prostate cancer [57, 82, 86-102].
Six genes (C1orf80, CAMK2N1, GLO1, MAOA, PGK1, and SNX3) have been linked to high Gleason grade [58, 103, 104], twelve genes (B2M, CAMK2N1, CD151, COMT, GALNT3, GLO1, ODC1, PCGEM1, PCOTH, SBDS, TMEM30A, and TPD52) have been implicated in the 'progression' of prostate cancer [58, 82], and 15 more genes (CD151, CXCR7, DHRS7, GNB2L1, HES6, HN1, NKX3-1, PGK1, PIK3CD, RPL11, RPS11, SF3A2, TK1, TPD52, and VPS13B) in the metastasis of prostate cancer [105, 106].

Table 5 Characteristics of genes with novel association to castration-recurrence in vivo

Novel CR-associated genes identify both clinical samples of CRPC and clinical metastasis of prostate cancer

The expression of novel CR-associated genes was validated in publicly available, independent sample sets representing different stages of prostate cancer progression (Gene Expression Omnibus accession numbers: GDS1390 and GDS1439). Dataset GDS1390 includes expression data of ten AS prostate tissues, and ten CRPC tissues from Affymetrix U133A arrays [47]. Dataset GDS1439 includes expression data of six benign prostate tissues, seven localized prostate cancer tissues, and seven metastatic prostate cancer tissues from Affymetrix U133 2.0 arrays [97].

Unsupervised principal component analysis based on the largest three principal components revealed separate clustering of tumor samples representing AS and CR stages of cancer progression, with the exception of two CR samples and one AS sample (Figure 4a).

Figure 4

Principal component analyses of clinical samples. A, Principal component analysis based on the expression of novel CR-associated genes in the downloaded dataset GDS1390 clustered the AS and CR clinical samples into two groups. B, Principal component analysis based on the expression of novel CR-associated genes in the downloaded dataset GDS1439 clustered the clinical samples (benign prostate tissue, benign; localized prostate cancer, Loc CaP; and metastatic prostate cancer, Met CaP) into three groups.

Metastatic prostate cancer is expected to have a more progressive phenotype and is associated with hormonal progression. Therefore, the gene expression signature obtained from the study of hormonal progression may be common to that observed in clinical metastases. Unsupervised principal component analysis based on the largest three principal components revealed separate clustering of not only benign and malignant, but also localized and metastatic tissue samples (Figure 4b).

Discussion

Genes that change levels of expression during hormonal progression may be indicative of the mechanisms involved in CRPC. Here we provide the most comprehensive gene expression analysis to date of prostate cancer, with approximately 3 million long tags sequenced using in vivo samples of biological replicates at various stages of hormonal progression, an improvement over previous libraries of approximately 70,000 short tags or fewer. Previous large-scale gene expression analyses have been performed with tissue samples from men with advanced prostate cancer [42-58], and animal or xenograft models of CRPC [59-69]. Most of these previous studies compared differential expression between CRPC samples and the primary samples obtained before androgen ablation. This experimental design cannot distinguish changes in gene expression that are a direct response to androgen ablation from changes in proliferation/survival that have been acquired as the prostate cancer cells progress to a more advanced phenotype. Here we are the first to apply an in vivo model of hormonal progression to compare gene expression between serial samples of prostate cancer before (AS) and after androgen ablation therapy (RAD), as well as when the cells become CR. This model is the LNCaP Hollow Fiber model [21], which has genomic similarity with clinical prostate cancer [23] and mimics the hormonal progression observed clinically in response to host castration as measured by levels of expression of PSA and cell proliferation. Immediately prior to castration, when the cells are AS, PSA levels are elevated and the LNCaP cells proliferate. A few days following castration, when the cells are RAD, PSA levels drop and the LNCaP cells cease to proliferate, but do not apoptose in this model. Approximately 10 weeks following castration, when the cells are CR, PSA levels rise and the LNCaP cells proliferate in the absence of androgen.
This model overcomes some limitations in other studies using xenografts that include host contamination of prostate cancer cells. The hollow fibers prevent infiltration of host cells into the fiber thereby allowing retrieval of pure populations of prostate cells from within the fiber. The other important benefit of the fiber model is the ability to examine progression of cells to CRPC at various stages within the same host mouse over time, because the retrieval of a subset of fibers entails only minor surgery. The power to evaluate progression using serial samples from the same mouse minimizes biological variation to enhance the gene expression analyses. However, limitations of this model include the lack of cell-cell contact with stroma cells, and lack of heterogeneity in tumors. Typically, these features would allow paracrine interactions as expected in clinical situations. Consistent with the reported clinical relevance of this model [23], here principal component analysis based on the expression of these novel genes identified by LongSAGE, clustered the clinical samples of CRPC separately from the androgen-dependent samples. Principal component analysis based on the expression of these genes also revealed separate clustering of the different stages of tumor samples and also showed separate clustering of the benign samples from the prostate cancer samples. Therefore, some common changes in gene expression profile may lead to the survival and proliferation of prostate cancer and contribute to both distant metastasis and hormonal progression. We used this LNCaP atlas to identify changes in gene expression that may provide clues of underlying mechanisms resulting in CRPC. Suggested models of CRPC involve: the AR; steroid synthesis and metabolism; neuroendocrine prostate cancer cells; and/or an imbalance of cell growth and cell death.

Androgen receptor (AR)

Transcriptional activity of AR

The AR is suspected to continue to play an important role in the hormonal progression of prostate cancer. The AR is a ligand-activated transcription factor with its activity altered by changes in its level of expression or by interactions with other proteins. Here, we identified changes in expression of some known or suspected modifiers of transcriptional activity of the AR in CRPC versus RAD, such as Cyclin H (CCNH) [107], proteasome macropain subunit alpha type 7 (PSMA7) [108], CUE-domain-containing-2 (CUEDC2) [109], filamin A (FLNA) [110], and high mobility group box 2 (HMGB2) [111]. CCNH and PSMA7 displayed increased levels of expression, while CUEDC2, FLNA, and HMGB2 displayed decreased levels of expression in CR. The expression trends of CCNH, CUEDC2, FLNA, and PSMA7 in CRPC may result in increased AR signaling through mechanisms involving protein-protein interactions or altering levels of expression of AR. CCNH protein is a component of the cyclin-dependent activating kinase (CAK). CAK interacts with the AR and increases its transcriptional activity [107]. Over-expression of the proteasome subunit PSMA7 promotes AR transactivation of a PSA-luciferase reporter [108]. A fragment of the protein product of FLNA negatively regulates transcription by AR through a physical interaction with the hinge region [110]. CUEDC2 protein promotes the degradation of progesterone and estrogen receptors [109]. These steroid receptors are highly related to the AR, indicating a possible role for CUEDC2 in AR degradation. Thus, decreased expression of FLNA or CUEDC2 could result in increased activity of the AR. Decreased expression of HMGB2 in CRPC is predicted to decrease expression of at least a subset of androgen-regulated genes that contain palindromic AREs [111]. Here, genes known to be regulated by androgen were enriched in expression trend categories with a peak or valley at the RAD stage of prostate cancer progression.
Specifically, 8 of the 13 tags (62%) exhibiting the expression trends 'E', 'F', 'J', 'K', or 'L' represented known androgen-regulated genes, in contrast to only 22 of the remaining 122 tags (18%; Tables 4 & 5). Overall, these data support increased AR activity in CRPC, which is consistent with re-expression of androgen-regulated genes as previously reported [68] and with the similarity of expression of androgen-regulated genes between CRPC and prostate cancer before androgen ablation [23].
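The proportions quoted above can be checked, and the strength of the enrichment gauged, with a short calculation. The one-tailed hypergeometric (Fisher exact) probability below is offered only as an illustrative check on the counts given in the text. No such test is reported in this section, and the function name is hypothetical.

```python
from math import comb

def hypergeom_upper_tail(hits, draws, successes, population):
    """P(X >= hits) when drawing `draws` items without replacement from a
    population containing `successes` items of interest."""
    denominator = comb(population, draws)
    numerator = sum(
        comb(successes, i) * comb(population - successes, draws - i)
        for i in range(hits, min(draws, successes) + 1)
    )
    return numerator / denominator

# Counts taken from the text: tags with a peak or valley at the RAD stage
# versus all remaining tags, split by known androgen regulation.
peak_valley_hits, peak_valley_total = 8, 13
other_hits, other_total = 22, 122

pct_peak_valley = 100 * peak_valley_hits / peak_valley_total
pct_other = 100 * other_hits / other_total
print(f"peak/valley trends: {pct_peak_valley:.0f}%  other trends: {pct_other:.0f}%")

p_value = hypergeom_upper_tail(
    peak_valley_hits,
    peak_valley_total,
    peak_valley_hits + other_hits,      # androgen-regulated tags overall
    peak_valley_total + other_total,    # all 135 tags
)
print(f"one-tailed hypergeometric p = {p_value:.4f}")
```

Under these assumptions the excess of androgen-regulated tags among the peak and valley trends is unlikely to arise by chance, in line with the interpretation given in the text.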

Steroid synthesis and metabolism

In addition to changes in expression of AR or interacting proteins altering the transcriptional activity of the AR, the recent suggestion of sufficient levels of residual androgen in CRPC provides support for an active ligand-bound receptor [112]. The AR may become re-activated in CRPC due to the presence of androgen that may be synthesized by the prostate de novo [4] or through the conversion of adrenal androgens. Here, 5 genes known to function in steroid synthesis or metabolism were significantly differentially expressed in CRPC versus RAD. They are 24-dehydrocholesterol reductase (DHCR24) [113], dehydrogenase/reductase SDR-family member 7 (DHRS7) [114], elongation of long chain fatty acids family member 5 (ELOVL5) [115, 116], hydroxysteroid (17-beta) dehydrogenase 4 (HSD17B4) [117], and opioid receptor kappa 1 (OPRK1) [118]. Increased levels of expression of these genes may be indicative of the influence of adrenal androgens, or the local synthesis of androgen, to reactivate the AR to promote the progression of prostate cancer in the absence of testicular androgens.

Neuroendocrine

Androgen-deprivation induces neuroendocrine differentiation of prostate cancer. Here, 8 genes that are associated with neuroendocrine cells were significantly differentially expressed in CRPC versus RAD. They either responded to androgen ablation, such as hairy and enhancer of split 6 (HES6) [119], karyopherin/importin beta 1 (KPNB1) [120], monoamine oxidase A (MAOA) [121], and receptor (calcitonin) activity modifying protein 1 (RAMP1) [122], or showed increased expression in CRPC, such as ENO2 [122], OPRK1 [118], S100 calcium binding protein A10 (S100A10) [123], and transient receptor potential cation channel subfamily M member 8 (TRPM8) [124].

Proliferation and cell survival

The gene expression trends of GAS5 [125], GNB2L1 [126], MT-ND3, NKX3-1 [127], PCGEM1 [128], PTGFR [129], STEAP1 [130], and TMEM30A [131] were in agreement with the presence of proliferating cells in CRPC. Of particular interest, we observed a transcript anti-sense to NKX3-1, a tumor suppressor, that was highly expressed in the AS and CR stages of cancer progression, but not in RAD. Anti-sense transcription may hinder gene expression from the opposing strand and therefore represents a novel mechanism by which NKX3-1 expression may be silenced. There were also some inconsistencies, including the expression trends of BTG1 [132], FGFRL1 [133], and PCOTH [134], which may be associated with non-cycling cells. Overall, there was more support at the transcriptome level for proliferation than against it, which was consistent with the increased proliferation observed in the LNCaP Hollow Fiber model [21].

Gene expression trends of GLO1 [135], S100A10 [136], TRPM8 [137], and PI3KCD [138] suggest that cell survival pathways are active following androgen-deprivation and/or in CRPC, while gene expression trends of CAMK2N1 [139], CCT2 [140], MDK [141, 142], TMEM66 [143], and YWHAQ [136] may oppose this suggestion. Taken together, these data neither agree nor disagree with the activation of survival pathways in CRPC. In contrast to earlier reports in which MDK gene and protein expression was determined to be higher in late-stage cancer [63, 142], we observed a drop in the levels of MDK mRNA in CRPC versus RAD. MDK expression is negatively regulated by androgen [65]. Therefore, the decreased levels of MDK mRNA in CRPC may suggest that the AR is reactivated in CRPC.

Other

The significance of the gene expression trends of AMD1, BNIP3, GRB10, MARCKSL1, NGFRAP1, ODC1, PPP2CB, PPP2R1A, SLC25A4, SLC25A6, and WDR45L, which function in cell growth or cell death/survival, was not straightforward. For example, BNIP3 and WDR45L, both relatively highly expressed in CRPC versus RAD, may be associated with autophagy. BNIP3 promotes autophagy in response to hypoxia [144], and the WDR45L-related protein, WIPI-49, co-localizes with the autophagic marker LC3 in autophagosomes following amino acid depletion [145]. It is not known whether BNIP3 or putative WDR45L-associated autophagy results in cell survival or death. Levels of expression of NGFRAP1 were increased in CRPC versus RAD. The protein product of NGFRAP1 interacts with p75 (NTR). Together they process caspase 2 and caspase 3 to active forms, and promote apoptosis in 293T cells [146]. NGFRAP1 requires p75 (NTR) to induce apoptosis. However, LNCaP cells do not express p75 (NTR), and so it is not clear whether apoptosis would occur in this cell line [147].

Overall, genes involved in cell growth and cell death pathways were altered in CRPC. Increased tumor burden may develop from a small tip in the balance when cell growth outweighs cell death. Unfortunately, the contributing weight of each gene is not known, making it difficult to predict from gene expression alone whether proliferation and survival were represented more than cell death in this model of CRPC. It should be noted that LNCaP cells are androgen-sensitive and do not undergo apoptosis in the absence of androgens. The proliferation of these cells tends to decrease in androgen-deprived conditions, but with progression the cells eventually begin to grow again, mimicking clinical CRPC.

Conclusion

Here, we describe the LNCaP atlas, a compilation of LongSAGE libraries that catalogue the transcriptome of human prostate cancer cells as they progress to CRPC in vivo. Using the LNCaP atlas, we identified differential expression of 96 genes that were associated with castration-recurrence in vivo. These changes in gene expression were consistent with the suggested models of CRPC that involve the AR, steroid synthesis and metabolism, neuroendocrine cells, and increased proliferation.

Authors' information

M.D.S. and M.A.M. are Terry Fox Young Investigators. M.A.M. is a Senior Scholar of the Michael Smith Foundation for Health Research.