Background

In the United States, prostate cancer is the most common cancer in men and the second most deadly, killing more than 27,000 men annually [1]. Nearly one in six men will develop prostate cancer at some point in their lives, with the majority of cases occurring after the age of 50. The major biomarker for prostate cancer diagnosis is prostate specific antigen (PSA); however, the sensitivity and specificity of the PSA assay are limited [2]. Improved biomarkers will result from a better understanding of the molecular mechanisms that regulate this disease.

Global gene expression analyses have led to a better understanding of growth control of prostate cancer cells [3–5]. Ongoing studies identified more than 200 genes predominantly expressed in prostate cancer epithelial cells [6], including genes likely to influence growth of prostate cancer cells, such as growth factors, growth factor receptors and transcription factors (TFs) (as identified by Gene Ontology and KEGG pathway analyses). Two of the TFs identified in the prostate cancer epithelial cells were the Wilms tumor gene (WT1) and the early growth response gene (EGR1), zinc finger transcription factors that bind at G-rich promoters of genes that regulate growth. In fact, the WT1 TF binds at several G-rich sites (GNGNGGGNG), including the EGR1 consensus binding site GCGGGGGCG [7–9]. Both WT1 and EGR1 have been identified in prostate cancer cells, although their function in prostate epithelium is unknown [10–12]. WT1 has an essential role in the normal development of the urogenital system and has been shown to suppress transcription from the promoters of many important growth factors [13].
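The relationship between the two motifs quoted above can be checked directly. The following is a minimal illustrative sketch, not part of the original analysis: it treats N as "any nucleotide" and tests whether the EGR1 consensus site is one instance of the degenerate WT1 motif. The function name `matches_motif` is hypothetical.

```python
def matches_motif(motif: str, site: str) -> bool:
    """Return True if `site` is an instance of `motif`, where N matches any base."""
    if len(motif) != len(site):
        return False
    return all(m == "N" or m == s for m, s in zip(motif.upper(), site.upper()))


WT1_MOTIF = "GNGNGGGNG"       # degenerate WT1 site quoted in the text
EGR1_CONSENSUS = "GCGGGGGCG"  # EGR1 consensus site quoted in the text

# The EGR1 consensus fits the WT1 motif at every fixed position.
print(matches_motif(WT1_MOTIF, EGR1_CONSENSUS))  # True
```

Because every fixed G in the WT1 motif coincides with a G in the EGR1 consensus, any EGR1 consensus site is also a candidate WT1 site at the sequence level.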

While identifying prostate growth control pathways potentially regulated by WT1, we have focused our studies on candidate genes belonging to known growth regulatory pathways. We have previously described WT1 regulation of the androgen receptor (AR) and vascular endothelial growth factor (VEGF) gene promoters [14, 15]. To go beyond the candidate gene approach and identify novel gene targets coordinately expressed with WT1 in tumor epithelial cells, a more systematic and unbiased high-throughput computational approach was used. These in silico analyses were based on 24 genes expressed in prostate cancer epithelium that were likely to influence growth of prostate cancer cells. Putative TF binding sites (TFBS) were computationally predicted; however, identifying which predicted TFBS are functional is a challenge and requires a complementary approach. The availability of complete genomic sequences from multiple species allows identification of evolutionarily conserved elements, e.g. cis-regulatory elements. Functionally important elements are likely to experience purifying selection pressure [16–20]; thus, we can use the degree of evolutionary conservation to identify TFBS that are likely to be functional. Our approach was to identify regions (and TFBS) evolutionarily conserved across multiple mammalian genomes, including those separated by 170 million years (human and opossum) [21]. Overall, this targeted approach identified important candidate binding elements in genes coordinately expressed with WT1 in prostate cancer epithelial cells. Identifying genes regulated by zinc finger TFs expressed in prostate cancer cells will enhance understanding of the altered pathways in these tumor cells and provide useful biomarkers for prostate cancer progression.

Results

Evolutionary conservation analysis: TFBS conserved in prostate cancer growth genes

Genomic sequences of proximal promoter regions of 24 genes expressed in prostate cancer epithelial cells (Additional file 1) were analyzed to determine the degree of evolutionary conservation and to identify potentially important regulatory regions. Binding sites for six TFs (WT1, EGR1, SP1, SP2, AP2, and GATA1) were investigated for evolutionary conservation across eight mammalian species (human, chimpanzee, macaque, cow, dog, mouse, rat and opossum) (Table 1). Tables 2 and 3 highlight 11 of these genes whose promoter sequences could be aligned in at least five mammalian species (human, chimpanzee, macaque, rat and mouse) and were found to have at least one evolutionarily conserved TFBS.

Table 1 Transcription factors potentially involved in coordinate gene expression in prostate cancer epithelial cells
Table 2 Genes co-expressed with WT1 in prostate cancer epithelium a
Table 3 TFBS in promoters of genes expressed in prostate cancer are conserved between Human and Primates or Rodents

Among the TFBS investigated, WT1, EGR1 and SP1 sites showed the highest frequency of evolutionary conservation in the gene promoters surveyed. For example, the promoters of EGR1, GATA2 and WT1 were found to have multiple WT1, EGR1 and SP1 candidate binding sites that were conserved through multiple species (Table 3). In the EGR1 promoter, 50% of WT1 sites are conserved between human and primates. Additionally, in the GATA2 gene promoter, 94% of WT1 sites, 70% of SP1 sites, and 100% of EGR1 sites are conserved between human and other primates (Table 3). Similarly, in the WT1 gene promoter 50% of SP1, 43% of WT1 and 100% of EGR1 sites are conserved between human and other primates (Table 3). WT1, EGR1, and SP1 TFBS within the promoters of IGFBP2, KLK3, NPY, SOX4, SOX9, and TFAP2C are also conserved between human and other primates (Table 3).

Importantly, for the WT1 and EGR1 gene promoters this conservation extended into the marsupials (Table 4). The EGR1 gene promoter is relatively conserved between human and opossum, with 20% of predicted EGR1, 12% of predicted WT1 and 14% of predicted SP1 sites conserved between the two species. Similarly, the WT1 gene promoter exhibited conservation between human and opossum, with 33% of predicted SP1 and 14% of predicted WT1 sites shared. In the GATA2 promoter only 12% of predicted WT1 sites are shared between human and opossum (Table 4). Overall, TFBS for three TFs (WT1, EGR1, and SP1) were evolutionarily conserved between human and the distantly related opossum in seven different promoters (WT1, EGR1, GATA2, IGFBP2, SOX4, SOX9, and TFAP2C).

Table 4 TFBS in promoters of genes expressed in prostate cancer are conserved between Human and Opossum

Tables 3 and 4 show that there were fewer SP2, AP2 and GATA1 than WT1, EGR1 and SP1 TFBS in the 11 gene promoters analyzed. While evolutionary conservation between primates was similar for all six TFBS, conservation between human and rodents diminished for SP2 and AP2 TFBS. AP2 sites in the promoters of the GATA2, WT1, and NPY genes showed 25% to 100% conservation between human and other primates. Conservation of AP2 sites was the strongest in the NPY gene promoter as these sites are also conserved between human and opossum (Table 4). In addition to conservation of GC-rich TFBS, the AT-rich GATA1 binding sites were shown to be highly conserved in several gene promoters including SOX4, EGR1, IGFBP2 and NPY (Table 3). All of the GATA1 sites in these four promoters are conserved between human and chimpanzee, and for the SOX4 gene promoter this strong conservation extends to rodents as well.

The overall evolutionary conservation of predicted TFBS in these 11 genes expressed in prostate cancer cells was analyzed. As would be expected, conservation of TFBS decreased as species became more evolutionarily divergent (Table 5). TFBS were most conserved among primates, followed by rodents, and least conserved between human and opossum. Of the 47 predicted WT1 sites in the 11 genes analyzed, 68% were conserved between human and other primates, only 15% between human and rodents, and only 6% between human and opossum, a drastic drop in conservation as species diverge. The same pattern was seen for the other TFBS analyzed, including EGR1 and SP1. In particular, 85% of the EGR1 sites were conserved between human and other primates, 26% between human and rodents, and 19% between human and opossum. Similarly, of the 50 predicted SP1 binding sites, 62%, 22% and 12% were conserved between human and primates, rodents, and opossum, respectively, again showing decreasing conservation with evolutionary divergence. Thus, by identifying evolutionarily conserved sequences we were able to pinpoint specific candidate binding sites that could be tested for functional relevance.
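As a reading aid, the percentages quoted above can be converted back into approximate site counts. This is a back-of-the-envelope sketch only: the counts are inferred from the rounded percentages in the text, Table 5 remains the authoritative source, and the helper name `implied_count` is hypothetical. EGR1 is omitted because its total number of predicted sites is not stated in the paragraph.

```python
def implied_count(total_sites: int, percent_conserved: int) -> int:
    """Nearest whole number of sites corresponding to a rounded percentage."""
    return round(total_sites * percent_conserved / 100)


# Totals and percentages quoted in the text, in the order:
# human vs. primates, human vs. rodents, human vs. opossum.
reported = {
    "WT1": (47, (68, 15, 6)),
    "SP1": (50, (62, 22, 12)),
}

for tf, (total, percents) in reported.items():
    counts = [implied_count(total, p) for p in percents]
    print(tf, total, counts)
# WT1 47 [32, 7, 3]
# SP1 50 [31, 11, 6]
```

On this reading, roughly 32 of 47 WT1 sites are shared with other primates but only about 3 with opossum, which makes the size of the candidate set at each evolutionary distance concrete.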

Table 5 Summary of evolutionarily conserved sites shared between genomes of human and other species

Conservation of overlapping WT1, EGR1, and SP1 TFBS

Several of the genes investigated have multiple overlapping WT1, EGR1, and SP1 binding sites in their proximal promoter regions. For example, the promoter of the human EGR1 gene has evolutionarily conserved overlapping WT1/SP1 binding sites (one of which is shown in Figure 1A). Both the overlapping WT1 (human 565–581) and SP1 (human 563–577) sites are conserved between seven of eight species compared, and the SP1 site is also conserved between human and opossum. A second WT1 site (human 614–630) located 33 bp downstream overlaps an EGR1 site (human 608–624), and both sites are conserved among all eight species, including opossum (Figure 1A). The promoter of the GATA2 gene also contained overlapping SP1 and WT1 TFBS (located at human positions 1125–1139 and 1127–1145, respectively) that are conserved among several mammalian genomes (Figure 1B). The WT1 gene promoter also has overlapping WT1/SP1 binding sites; when aligned across multiple species, one 3' WT1 site (human 1444–1468) was conserved between all primates, rodents, and opossum, indicating that this particular site has been conserved over millions of years (Figure 1C). The SP1 site (human 1420–1434) is conserved between all primates and rodents tested, and overlaps with a WT1 site (human 1409–1425) that is conserved between human and chimpanzee (Figure 1C). Interestingly, the sequence similarity between human and chimpanzee is so great for this WT1 promoter region that no insertions or deletions were observed in either genomic sequence; thus, these TFBS were located in exactly the same positions relative to the ATG start codon.
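The overlaps described above follow from simple interval arithmetic on the human coordinates. The sketch below is illustrative and not the authors' pipeline; it assumes 1-based, inclusive coordinates as quoted in the text, and the names `Site`, `overlap_bp` and `overlapping_pairs` are hypothetical.

```python
from typing import List, NamedTuple, Tuple


class Site(NamedTuple):
    tf: str
    start: int  # 1-based, inclusive (human coordinates quoted in the text)
    end: int    # 1-based, inclusive


def overlap_bp(a: Site, b: Site) -> int:
    """Number of positions shared by two sites (0 if they do not overlap)."""
    return max(0, min(a.end, b.end) - max(a.start, b.start) + 1)


def overlapping_pairs(sites: List[Site]) -> List[Tuple[str, str, int]]:
    """All pairs of sites that share at least one position."""
    pairs = []
    for i, a in enumerate(sites):
        for b in sites[i + 1:]:
            shared = overlap_bp(a, b)
            if shared:
                pairs.append((a.tf, b.tf, shared))
    return pairs


# EGR1 promoter sites quoted in the text.
egr1_promoter = [
    Site("SP1", 563, 577),
    Site("WT1", 565, 581),
    Site("EGR1", 608, 624),
    Site("WT1", 614, 630),
]

print(overlapping_pairs(egr1_promoter))
# [('SP1', 'WT1', 13), ('EGR1', 'WT1', 11)]
```

The same function gives 6 shared positions for the SP1 (1420–1434) and WT1 (1409–1425) sites in the WT1 promoter, and 0 for the SP1 site and the 3' WT1 site (1444–1468).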

Figure 1

Alignment of TFBS in EGR1, GATA2, and WT1 promoters reveals overlapping SP1, EGR1 and WT1 sites. Dots indicate nucleotides identical to human, while gaps are shown with dashes. Predicted TFBS are based on human sequences and are marked by boxes: EGR1, dashed; SP1, dash-dotted; WT1, solid. (A) Two separate WT1 sites in the EGR1 promoter are conserved between multiple species and both overlap an EGR1 site, and one also overlaps an SP1 site. WT1 site (human 614–630) overlaps EGR1 site (human 608–624) and both sites are conserved between all eight species surveyed. The WT1 site (human 565–581) overlaps both an EGR1 site (human 563–575) and an SP1 site (human 563–577). The SP1 site is conserved between all eight species, the WT1 site is conserved between all but opossum and the EGR1 site is conserved between primates. Negative numbers in the chimpanzee EGR1 promoter sequence indicate that the orthologous region was located 1,668 base pairs from the ATG site (further upstream than 1.5 kb analyzed for other species). (B) Two overlapping WT1 sites (human 1127–1143 and human 1129–1145) overlap an SP1 site (human 1125–1139) in the GATA2 gene promoter region. The WT1 sites are conserved between human, chimpanzee, and macaque, while the SP1 site is conserved between human, chimpanzee, macaque, and cow. (C) Two WT1 and an SP1 TFBS in the WT1 promoter are conserved. The WT1 site (human 1444–1468) is conserved between human, chimpanzee, macaque, mouse, rat, and opossum. The WT1 site (human 1409–1425) that overlaps an SP1 site is conserved between human and chimpanzee only, while the SP1 site (human 1420–1434) is conserved between human, chimpanzee, macaque, mouse, and rat.

Identification of overlapping TFBS in the gene promoters indicated that WT1 and EGR1 may compete for binding. Analyses of the promoter regions of 11 genes expressed in prostate cancer epithelial cells showed that WT1 TFBS overlapped SP1 and EGR1 TFBS, either separately or together. Overall, it was found that there were 25 overlapping sites in the promoter regions of these genes. There were 12 WT1/SP1, seven SP1/EGR1, three WT1/EGR1, and three WT1/SP1/EGR1 overlapping sites (Table 6). These overlapping sites were found in 10 of the 11 gene promoters analyzed. Seven overlapping sites were identified in the promoter region of the EGR1 gene, and three of these seven overlapping sites are conserved between human and other species. Three other gene promoters, GATA2, IGFBP2, and TFAP2C, have three overlapping sites each, with one SP1/EGR1 site conserved between human and opossum for both the TFAP2C and IGFBP2 promoters. The WT1 and KLK3 promoters have overlapping WT1/SP1 and SP1/EGR1 sites, respectively. All of these overlapping TFBS are excellent candidates for functional testing to determine whether competition for TF binding at these sites results in activation or suppression of the genes they are regulating.

Table 6 Conservation of overlapping TFBS between human and other mammals a

Sequence conservation of TFBS indicates a potentially functional WT1 binding site in the KLK3 (PSA) promoter

One of the 24 genes differentially expressed in prostate cancer epithelial cells was KLK3 (PSA), an important diagnostic marker. Sequence alignment of the KLK3 promoter revealed three WT1 sites and two SP1 sites, with two-thirds of the WT1 and one-half of the SP1 sites conserved between human and other primates (Table 3). Given the premise that evolutionarily conserved sites are more likely to be functionally relevant, we tested these conserved sites for their ability to bind TFs in vivo. PCR primers were designed to flank the region where adjacent conserved WT1 (human 1332–1352) and SP1 (human 1404–1418) sites were identified (Figure 2A). Both of these binding sites in the PSA promoter were tested by chromatin immunoprecipitation (ChIP) in hormone-responsive LNCaP prostate cancer cells (Figure 2B). Since LNCaP cells express little WT1 [22], they were transfected with a green fluorescent protein (GFP)-tagged WT1 expression construct 48 hours prior to the ChIP assay. After crosslinking, the chromatin and TF complexes were immunoprecipitated by both WT1 and SP1 antibodies, as demonstrated by PCR amplification of the promoter region. WT1 and SP1 may bind at adjacent sites within the PSA promoter or at overlapping sites, since the SP1 site overlaps the EGR1 site, to which WT1 may also bind [7–9]. The importance of these WT1 and SP1 TFBS as candidate binding sites was confirmed by the in vivo ChIP assay.

Figure 2

Conservation of the KLK3 (PSA) promoter and ChIP verification of WT1 and SP1 binding. (A) Alignment of predicted TFBS (based on human sequences) in the KLK3 gene promoter of multiple genomes shows the conservation of two overlapping WT1 binding sites (solid box), an EGR1 site (dashed box), an SP1 site (dash-dotted box), and an SP2 site (double dash-dotted box). WT1 sites (human 1332–1348 and 1336–1352) are conserved between human, chimpanzee, macaque, and cow and they overlap an SP2 site (human 1347–1361) conserved between human, chimpanzee, and cow. An EGR1 site (human 1400–1416) overlaps an SP1 site (human 1404–1418) and both are conserved between human, chimpanzee, macaque, and dog. (B) The binding of WT1 and SP1 TFs to native chromatin obtained from WT1-transfected LNCaP cells was confirmed by ChIP. Lane 1 shows the no DNA PCR control and lane 2 shows PCR amplified input DNA. Lanes 3, 4, and 5 show PCR amplified DNA immunoprecipitated by IgG (no antibody control), SP1 or WT1 antibodies, respectively.

Functional WT1 and SP1 binding sites in the VEGF promoter are conserved between human and other primates

Having tested the significance of identified evolutionarily conserved sites, we then asked whether TFBS known to mediate transcriptional regulation would also be conserved. Two genes that regulate prostate cancer progression by enhancing growth and blood supply, AR and VEGF, have multiple WT1 and SP1 binding sites in their proximal promoter regions [14, 15, 23–25]. We have previously identified an EGR1 site in the VEGF promoter that binds both WT1 and SP1 protein in vitro [15], and here demonstrate by ChIP assay that this promoter region binds WT1 and SP1 in vivo (Figure 3). Chromatin from both embryonic kidney 293 cells and LNCaP cells expressing a GFP-tagged WT1 expression construct was immunoprecipitated by WT1 and SP1 antibodies and amplified by PCR. Using primers specific for the VEGF proximal promoter region, products ~140 bp in size were amplified from chromatin of both 293 and LNCaP cells (Figure 3A and 3B). These ChIP assays also demonstrated selective WT1 binding, since an adjacent site 190 nucleotides downstream failed to bind WT1 in the same assay (data not shown). These sites were validated as being transcriptionally regulated in several different assays, including luciferase reporter assays [15], so we asked whether they were evolutionarily conserved in different species. In silico analyses predicted that an overlapping EGR1 (human 1717–1733) and SP1 (human 1721–1735) site and a WT1 site (human 1755–1771) were conserved between primates and dogs, but not in rodents (Figure 3C). Furthermore, as seen with the PSA promoter region, WT1 and SP1 may bind at adjacent sites or potentially at overlapping sites, since WT1 also binds at EGR1 sites [7–9]. Both PSA and VEGF promoter regions contain evolutionarily conserved WT1 sites adjacent to overlapping EGR1/SP1 TFBS, to which WT1 is also likely to bind, thus facilitating either cooperation or competition between TFs.

Figure 3

ChIP verification of WT1 and SP1 binding to endogenous VEGF promoter and sequence conservation. Functional WT1 and SP1 TFBS in the VEGF promoter region were previously identified by EMSA and luciferase reporter assays [15]. (A) ChIP analysis of chromatin from WT1 transfected 293 kidney cells verified that these TFBS were functional. Lanes 1 and 7 show the 1 Kb ladder, lane 2 shows the No DNA PCR control, and lane 3 shows PCR amplified input DNA. Lanes 4, 5, and 6 show PCR amplified DNA immunoprecipitated by IgG (no antibody control), WT1 or SP1 antibodies, respectively. (B) ChIP analysis of chromatin from WT1 transfected LNCaP cells verified these TFBS were functional in prostate cancer cells as well. Lanes as described in section (A). (C) Predicted TFBS are based on human sequences and marked by boxes as described in Figure 1. These functional WT1 (human 1755–1771), EGR1 (human 1717–1733) and SP1 (human 1721–1735) sites were conserved between primates (human, chimpanzee, and macaque) and dogs, but not in rodents; and the SP1 site overlapped with the EGR1 site.

Similarly, WT1 binding sites previously identified in the AR proximal promoter region by EMSA analysis and verified to mediate transcriptional regulation in luciferase reporter assays [14, 23] were confirmed by ChIP using PCR primers flanking the WT1 and SP1 TFBS (Figure 4A). Since these binding sites were tested in vivo, evidence of sequence conservation was sought, as described. As shown in Figure 4B, both a WT1 site (human 1434–1450) and an EGR1 site (human 1524–1537) were identified within the region amplified by ChIP. This less common pyrimidine-rich EGR1 TFBS, consisting of TCC repeats, has been shown to bind both WT1 and SP1 [7, 14, 26], thus all three zinc finger TFs could compete for binding at this site. Evidence for evolutionary conservation between human and other primates was limited by the lack of genomic sequence information available for chimpanzee (and lack of conservation between human and macaque).

Figure 4

ChIP verification of WT1 and SP1 binding to endogenous AR promoter and sequence analysis. Functional WT1 TFBS in AR promoter region were previously identified by EMSA and reporter assays [14, 23]. (A) ChIP analysis of chromatin from WT1 transfected LNCaP prostate cancer cells verified these TFBS were functional. Lane 1 shows the 1 Kb ladder, lane 2 shows the No DNA PCR control, and lane 3 shows PCR amplified input DNA. Lanes 4, 5, and 6 show PCR amplified DNA immunoprecipitated by IgG (no antibody control), WT1 or SP1 antibodies, respectively. (B) Predicted TFBS are based on human sequences and marked by boxes as described in Figure 1. Evidence for conservation of the functional WT1 (human 1434–1450) TFBS was limited by lack of sequence information available for chimpanzee (and lack of conservation with macaque). Surprisingly the TCC rich EGR1 site (human 1524–1537), previously shown to bind WT1 in vitro [14], also showed no evolutionary conservation.

Discussion

Identification of evolutionarily conserved sequences derived from comparisons of multiple genomes (so-called "phylogenetic footprints") has been successful in identifying functionally important regions, including those regions that regulate gene expression [19, 27–34]. However, some regulatory genomic sequences do not appear to be conserved, or the level of evolutionary conservation varies between different genomic comparisons [35, 36]. Importantly, some functional regions have been reported to experience a relatively fast rate of turnover, where the functional significance of the element is retained despite changes at the nucleotide sequence level (e.g., transcription start sites [37]). Thus, it is likely that gene expression in mammalian genomes is controlled by both types of regulatory elements, i.e., those that exhibit evolutionary and functional conservation and those that exhibit functional conservation only. Moreover, while numerous algorithms are available to computationally predict potential regulatory elements, it is often challenging to narrow down the list to those that are likely to be functional, particularly for relatively short elements such as TFBS. One approach that uses evolutionary conservation as a predictor of TFBS functionality is the rVISTA tool, which uses pairwise sequence alignments to identify the most highly conserved TFBS between a pair of genomic sequences [38]. Another tool set, Mulan, takes advantage of evolutionary conservation information obtained from multi-sequence alignments of several genomes [39]. However, the latter requires the TFBS to be shared among all genomes present in the alignment [39] and may miss lineage-specific regulatory elements that are absent from some subsets of genomes. Therefore, in this work we used TFBS elements shared between some but not necessarily all of the available genomes.
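The difference between requiring a site in every genome and accepting a site shared with only some genomes can be made concrete. The following is a minimal sketch under stated assumptions, not the procedure actually used: the species groupings follow the eight genomes named in the Results, "conserved with a group" is taken to mean conserved with at least one member of that group (an assumption; the exact rule is not given in this section), and the function name `conservation_summary` is hypothetical.

```python
from typing import Dict, Set

PRIMATES = {"chimpanzee", "macaque"}
RODENTS = {"mouse", "rat"}
OTHER = {"cow", "dog", "opossum"}
ALL_SPECIES = PRIMATES | RODENTS | OTHER  # the seven non-human genomes


def conservation_summary(conserved_in: Set[str]) -> Dict[str, bool]:
    """Summarise which lineages share a predicted human TFBS.

    `conserved_in` is the set of non-human species in which the site is
    found at the orthologous position.
    """
    unknown = conserved_in - ALL_SPECIES
    if unknown:
        raise ValueError(f"unknown species: {sorted(unknown)}")
    return {
        "primates": bool(conserved_in & PRIMATES),   # assumption: any primate
        "rodents": bool(conserved_in & RODENTS),     # assumption: any rodent
        "opossum": "opossum" in conserved_in,
        "all_genomes": ALL_SPECIES <= conserved_in,  # strict all-genome rule
    }


# WT1 site in the EGR1 promoter (human 565-581): per the Figure 1 legend it is
# conserved in all species except opossum.
wt1_site_egr1_promoter = ALL_SPECIES - {"opossum"}
summary = conservation_summary(wt1_site_egr1_promoter)
print(summary)
# {'primates': True, 'rodents': True, 'opossum': False, 'all_genomes': False}
```

Under a strict all-genome rule this site would be discarded, whereas the lineage-by-lineage summary keeps it as a candidate conserved through primates and rodents.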

We used evolutionary sequence conservation, as determined by both the multi-species sequence alignments and the in silico TFBS predictions, to identify those sites most likely to regulate expression of target genes that influence growth of prostate cancer cells. Regulatory regions with functional importance can be expected to exhibit sequence conservation due to selection. Thus, predicted TFBS that are located in orthologous positions in multiple genomes are likely to be truly functional. Our identification of evolutionarily conserved WT1 and SP1 binding sites in the PSA promoter supports this notion (Figure 2). As expected, conservation of TFBS decreased as species became more evolutionarily divergent [40], so those TFBS that were conserved between multiple species including opossum are more likely to be functionally important in the regulation of gene expression.

The abundance of overlapping zinc finger TFBS also supported the functional importance of these regulatory regions. Thus, we identified many TFBS in potential target genes that were co-expressed with WT1 in prostate cancer epithelial cells. Evolutionarily conserved WT1 and SP1 sites in the PSA promoter were confirmed by ChIP to bind both WT1 and SP1 in chromatin from LNCaP prostate cancer cells. Although it is a novel discovery that both SP1 and WT1 bind the PSA promoter and may play a role in its regulation, reporter assays are needed to confirm their contribution to transcription. In addition, a WT1 binding site known to transcriptionally regulate the VEGF promoter [15] was confirmed by ChIP and found to be in an evolutionarily conserved region. Interestingly, transcriptionally active WT1 and EGR1 binding sites in the AR promoter [12] were not conserved between human and macaque, although adjacent genomic regions could be aligned between multiple species (Figure 4). This suggests that the AR promoter may have experienced faster turnover than the VEGF promoter, yet remained functionally conserved despite sequence changes at the nucleotide level.

Many of the genes expressed in prostate cancer epithelial cells have previously been reported to interact and regulate each other, suggesting multiple potential targets for altered pathways that may lead to prostate cancer progression. We and others have identified gene interactions [8, 14, 15, 23, 41–47] that are consistent with WT1 regulating the progression and/or growth of tumors in the prostate. However, PSA was a candidate gene target identified by our in silico evolutionary conservation approach and confirmed by in vivo chromatin binding assays. PSA is a member of the kallikrein family of serine proteases and is a marker of epithelial differentiation in the prostate [48]. It is up-regulated in prostate cancer cells when compared to normal adjacent tissue [49] and its expression is regulated by the ligand bound androgen receptor (AR) [48]. Since WT1 activates the AR promoter in prostate cancer cells [23], this suggests that WT1 may directly or indirectly regulate PSA gene expression.

In addition to PSA, genes that were co-expressed with WT1 in prostate cancer epithelial cells and that could potentially interact with, or be regulated by, WT1 included GATA2, ECAD, EGR1, and NDRG1 [6]. GATA binding proteins are zinc finger transcription factors that bind the WGATAR consensus motif and are expressed in multiple tissues, including endocrine glands [50–52]. Interestingly, GATA TFs regulate WT1 expression, as multiple GATA TFBS are found within the WT1 promoter and enhancer regions [53–55]. GATA binding protein 2 (GATA2) has been shown to be one of the main GATA family members expressed in the prostate of human and mouse [56]. It has been suggested that GATA2 plays a role in androgen mediated regulation of PSA expression, possibly through interaction with AR, as GATA sites are adjacent to AR TFBS in the PSA promoter [56]. WT1 could contribute to GATA2 mediated regulation of target genes in prostate cancer cells if WT1 also physically interacts with GATA2. This notion is consistent with the observation that WT1 interacts with GATA4 to regulate SRY gene expression [57]. This complex pattern of zinc finger protein interaction between WT1 and GATA, along with regulation of WT1 expression by GATA TFs, suggests a potential for WT1 feedback control of GATA activity.
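For readers unfamiliar with the degenerate notation, WGATAR expands to (A or T)-G-A-T-A-(A or G) under the standard IUPAC nucleotide codes. The sketch below shows one way to scan a sequence for this motif on both strands; it is illustrative only, the sequence `toy` is made up, and `find_gata_sites` is a hypothetical helper rather than a tool used in this study.

```python
import re
from typing import List, Tuple

# W = A or T, R = A or G (IUPAC codes). The reverse-strand pattern is the
# reverse complement of WGATAR, i.e. YTATCW with Y = C or T.
# Lookaheads are used so that overlapping matches are all reported.
FORWARD = re.compile(r"(?=([AT]GATA[AG]))")
REVERSE = re.compile(r"(?=([CT]TATC[AT]))")


def find_gata_sites(sequence: str) -> List[Tuple[int, str, str]]:
    """Return (1-based start, strand, matched forward-strand text) for each hit."""
    seq = sequence.upper()
    hits = [(m.start() + 1, "+", m.group(1)) for m in FORWARD.finditer(seq)]
    hits += [(m.start() + 1, "-", m.group(1)) for m in REVERSE.finditer(seq)]
    return sorted(hits)


toy = "CCAGATAAGGTTATCTCC"  # made-up sequence for illustration
print(find_gata_sites(toy))
# [(3, '+', 'AGATAA'), (11, '-', 'TTATCT')]
```

A match of this kind only marks a candidate site; as argued throughout this work, conservation and binding assays are needed before a predicted site is treated as functional.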

The WT1 promoter is itself a target of autoregulation by WT1 [47]. WT1 is a multifunctional transcription factor; its four major isoforms are formed by alternative splicing at two sites resulting in the inclusion or exclusion of (1) exon V and/or (2) a tripeptide (KTS) in exon 9 that alters the zinc finger DNA binding structure [58]. While the functions of the various isoforms of WT1 are still being discovered, the -KTS isoform is a transcriptional regulator with G-rich recognition sequence [58]. The +KTS isoform is also likely to be present in prostate cancer cells but would contribute to gene regulation via splicing and post-transcriptional gene regulation [59, 60]. Here we have identified potential target genes with well-described DNA binding sites recognized by the -KTS isoform and have not assessed the less well understood RNA binding sites recognized by the +KTS isoform [60].

The early growth response 1 gene (EGR1) is a homolog of WT1 [7]. Although it has only three zinc fingers, it shares some TFBS with WT1. EGR1 has been implicated as a cancer suppressor gene and activates genes required for differentiation [7]. In human prostate cancer, EGR1 is over-expressed [11, 12], and in a mouse model of prostate cancer, EGR1 regulates genes essential for progression of tumor growth [61]. Since WT1 regulates the EGR1 promoter in vitro [8], it may indirectly regulate other EGR1 target genes, such as the N-myc downstream regulated gene 1 (NDRG1), an α/β hydrolase. In many cancer cell lines NDRG1 has been shown to be up-regulated by both hypoxia and hormone treatment, suggesting that it could be linked to androgen induced differentiation and signaling in the prostate [62, 63]. Since EGR1 regulates NDRG1, WT1 could either directly or indirectly regulate NDRG1.

Analysis of the homologous sequences of the different gene promoters revealed numerous overlapping TFBS, suggesting competition for binding and differential regulation of these gene promoters. Several studies have shown that EGR1 and SP1 TFBS often overlap [7, 64, 65]. When EGR1 binds to a site also bound by SP1, it displaces the SP1 "activator" from the binding site and represses transcription of these genes [7]. For example, the promoter of NDRG1 was shown to be regulated by an overlapping EGR1/SP1 binding site [65] (located outside of the surveyed region of our study). It was shown that this evolutionarily conserved site was vital in positively regulating expression of NDRG1 [65]. Similarly, our results showed evolutionarily conserved overlapping EGR1/SP1 sites in several other gene promoters, including VEGF and PSA. In the latter, overlapping EGR1/SP1 sites were found to be conserved between human and two other primate species (chimpanzee and macaque).

Additionally, WT1 and EGR1 compete for binding at shared TFBS. WT1 recognizes and binds to EGR1 sites on the promoters of many different genes [7, 9, 66–68]. WT1 generally functions as a transcriptional repressor when bound to EGR1 TFBS in the transforming growth factor-beta 1 (TGFβ1) and EGR1 promoters, while EGR1 functions as an activator [8, 9]. Many gene promoters with overlapping WT1, EGR1, and SP1 binding sites have been identified (reviewed in [7]). For example, three-way competition occurs between EGR1, SP1 and WT1 for binding and regulation of superoxide dismutase expression [66]. However, the mechanisms of gene regulation at overlapping sites, including TF competition, are not well understood.

Combinations of adjacent and overlapping EGR1, WT1 and SP1 TFBS conserved between multiple species were found in multiple gene promoters. Adjacent sites were found in the PSA promoter, where an overlapping EGR1/SP1 site is 50 base pairs downstream of a WT1 site, and in the VEGF promoter, where an EGR1/SP1 overlapping site is 20 base pairs away from a WT1 site. Such sites can facilitate synergistic interactions or may be required for inducible expression, as described for AR and GATA2 interactions in the PSA promoter [56]. Additionally, in the VEGF promoter an SP1 site adjacent to a non-canonical estrogen receptor (ER) TFBS contributes to hormone induction of VEGF expression [69]. Similarly, WT1 appears to interact with ER at neighboring sites in the insulin-like growth factor 1 receptor (IGF1R) promoter [70]. These complex arrangements of EGR1, WT1 and SP1 TFBS could facilitate cooperative or competitive binding by these TFs and would have pleiotropic effects on the regulation of these genes. Genes with evolutionarily conserved overlapping TFBS could be part of a prostate epithelial cell transcriptome regulated by WT1.
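The spacing figures above can be related to the human coordinates reported in the Results. The sketch below is a reading aid under an assumption: distance is measured from the last position of the upstream site to the first position of the downstream site, which is one plausible convention but is not stated in the text. The function name `gap_bp` is hypothetical.

```python
from typing import Tuple

Interval = Tuple[int, int]  # (start, end), 1-based inclusive human coordinates


def gap_bp(upstream: Interval, downstream: Interval) -> int:
    """Distance from the end of the upstream site to the start of the downstream site."""
    up_end, down_start = upstream[1], downstream[0]
    if down_start <= up_end:
        raise ValueError("sites overlap or are given in the wrong order")
    return down_start - up_end


# PSA (KLK3) promoter: WT1 sites span 1332-1352, EGR1/SP1 sites span 1400-1418.
print(gap_bp((1332, 1352), (1400, 1418)))  # 48, i.e. about 50 bp

# VEGF promoter: EGR1/SP1 sites span 1717-1735, WT1 site spans 1755-1771.
print(gap_bp((1717, 1735), (1755, 1771)))  # 20
```

Under this convention the PSA spacing comes out at 48 bp, consistent with the rounded value of 50 bp, and the VEGF spacing at exactly 20 bp.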

Conclusion

Genes coordinately expressed in prostate cancer epithelial cells have conserved regulatory elements and an abundance of overlapping zinc finger TFBS. Potential WT1 gene targets were identified based on TFBS sequence conservation, and the significance of the WT1 TFBS in the PSA promoter was verified in vivo by ChIP assays. Similarly, a transcriptionally active WT1 binding site in the VEGF promoter was confirmed by ChIP and found to be in a region conserved amongst primates. Thus, these genes could be part of a novel network of regulatory pathways initiated by WT1 and have important implications in the progression of prostate cancer.

Methods

Promoter sequence compilation

For each of the 24 differentially expressed prostate cancer growth regulatory genes, promoter sequences were taken from the complete or draft genomes of eight different mammalian species, downloaded from the Ensembl Genome Browser [71, 72]. The following genome assemblies were used: the NCBI 36 assembly of the human (Homo sapiens) genome, the NCBI m36 assembly of the mouse (Mus musculus) genome, the PanTro 2.1 assembly of the chimp (Pan troglodytes) genome, a whole genome shotgun (WGS) preliminary assembly Btau_3.1 of the cow (Bos taurus) genome, a WGS assembly CanFam 2.0 of the dog (Canis familiaris) genome, a WGS preliminary assembly Mmul_1 of the rhesus monkey (Macaca mulatta) genome, the MonDom5 assembly of the opossum (Monodelphis domestica) genome, and the RGSC3.4 assembly of the rat (Rattus norvegicus) genome. Since major regulatory elements are located within several hundred base pairs of transcription start sites [73], 1.5 kb of human nucleotide sequence 5' of the translational start site (that is, 5' of the first exon as defined in Ensembl [72]) was collected. Orthologous sequences from other mammalian genomes were obtained from the respective genome assemblies. In the case of the EGR1 promoter this extended beyond 1.5 kb, and so was assigned a negative number. The genome viewer and annotation program Artemis was used to ensure the correct context of genomic sequences [74]. In each sequence, the nucleotide positions were numbered sequentially, with the targeted promoter region occupying positions 1 through 1500 (5' to 3' direction) of the forward strand, and the ATG start codon located at positions 1501–1503 of the genomic sequence analyzed.
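
The numbering convention can be illustrated with a minimal Python sketch. It assumes a forward-strand gene and a chromosome sequence already held in memory as a string; the function name `promoter_window` and its arguments are hypothetical and are not part of the study's workflow, which relied on Ensembl and Artemis.

```python
def promoter_window(chrom_seq: str, atg_index: int, upstream: int = 1500) -> str:
    """Return `upstream` bases 5' of the start codon followed by the ATG itself.

    Assumptions (not from the original study): the gene lies on the forward
    strand, and `atg_index` is the 0-based index of the A of the ATG.
    """
    if atg_index < upstream:
        raise ValueError("not enough sequence upstream of the start codon")
    window = chrom_seq[atg_index - upstream:atg_index + 3]
    if window[-3:].upper() != "ATG":
        raise ValueError("atg_index does not point at an ATG codon")
    return window


# Toy example with a 10 bp "promoter" instead of 1500 bp.
toy_chrom = "C" * 20 + "ATG" + "GGG"
toy_window = promoter_window(toy_chrom, atg_index=20, upstream=10)

# In 1-based numbering the promoter occupies positions 1..upstream and the
# start codon occupies upstream+1..upstream+3 (1501-1503 when upstream=1500).
start_codon = toy_window[10:13]
```

A gene on the reverse strand would additionally need the window to be reverse-complemented before numbering; that case is left out of the sketch.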

Human AR and VEGF promoter sequences containing the functional WT1 TFBS were obtained from Ensembl (ENSG00000169083 and ENSG00000112715, respectively). For alignment analyses of known functional sites [14, 15], an orthologous promoter region (3 kb) was then collected from eight mammalian genomes as described above.

TFBS predictions, evolutionary conservation and multiple sequence alignments

TFBS of WT1, EGR1, SP1, SP2, AP2 and GATA1 were predicted for each gene with the program MatInspector [75], which utilizes the TRANSFAC libraries of TF binding motifs [75, 76]. The default similarity thresholds were used for all examined genes: core similarity > 0.75 and optimized matrix similarity thresholds (i.e., those that minimize false positives for each individual matrix as available in the library) [75]. In MatInspector, core similarity is one of the built-in program parameters that determines whether the observed sequence match will be analyzed further. It refers to the four most conserved consecutive nucleotides of the matrix, usually the most critical sites for protein binding, and reaches 1.0 only when there is a perfect match [75, 77]. Sequence matches with low core similarity (less than 0.75) are not, by default, reported to the user. Vertebrate matrices of the Matrix Family Library Version 6.2 (October 2006), which included 464 matrices, were used [78]. Multiple sequence alignments of the promoter sequences were constructed with the program blastZ using MultiPipMaker [79], and predicted human TFBS were mapped onto the alignments.
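
The two-stage filter (core similarity first, matrix similarity second) can be sketched in Python as follows. This is a simplified illustration, not MatInspector's actual scoring scheme: both similarities are computed here as the observed score divided by the best possible score, the matrix is a toy invented for the example, and the single `matrix_min` value of 0.80 stands in for the per-matrix optimized thresholds. Only the forward strand is scanned.

```python
from typing import Dict, List, Tuple

Matrix = List[Dict[str, float]]  # one {base: frequency} dict per matrix position

# Hypothetical toy matrix, invented for illustration; NOT a TRANSFAC or
# MatInspector matrix for any real transcription factor.
TOY_MATRIX: Matrix = [
    {"A": 0.1, "C": 0.1, "G": 0.7, "T": 0.1},
    {"A": 0.0, "C": 0.0, "G": 1.0, "T": 0.0},
    {"A": 0.0, "C": 0.0, "G": 1.0, "T": 0.0},
    {"A": 0.0, "C": 0.0, "G": 1.0, "T": 0.0},
    {"A": 0.0, "C": 1.0, "G": 0.0, "T": 0.0},
    {"A": 0.2, "C": 0.2, "G": 0.4, "T": 0.2},
]


def core_start(matrix: Matrix, core_len: int = 4) -> int:
    """Start of the `core_len` consecutive positions with the highest summed
    maximum frequency (a simple stand-in for 'most conserved')."""
    windows = range(len(matrix) - core_len + 1)
    return max(windows,
               key=lambda i: sum(max(col.values()) for col in matrix[i:i + core_len]))


def similarity(matrix: Matrix, site: str) -> float:
    """Observed score over best possible score; 1.0 only for a perfect match."""
    observed = sum(col.get(base, 0.0) for col, base in zip(matrix, site))
    best = sum(max(col.values()) for col in matrix)
    return observed / best


def scan(seq: str, matrix: Matrix, core_min: float = 0.75,
         matrix_min: float = 0.80, core_len: int = 4
         ) -> List[Tuple[int, str, float, float]]:
    """Return (0-based start, site, core similarity, matrix similarity) hits."""
    seq = seq.upper()
    c0 = core_start(matrix, core_len)
    hits = []
    for i in range(len(seq) - len(matrix) + 1):
        site = seq[i:i + len(matrix)]
        core_sim = similarity(matrix[c0:c0 + core_len], site[c0:c0 + core_len])
        if core_sim < core_min:
            continue  # low core similarity: not analyzed further
        matrix_sim = similarity(matrix, site)
        if matrix_sim >= matrix_min:
            hits.append((i, site, core_sim, matrix_sim))
    return hits


hits = scan("TTGGGGCGAA", TOY_MATRIX)
```

Checking the core first means that most windows are rejected after scoring only four positions, which is the role the text ascribes to the core similarity parameter.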

Regions that are conserved in multiple genomes are often found to correspond to functionally important ones [80]. However, because of the species-specific differences in gene regulation due to underlying differences in morphogenesis and development, such as those between different segments of human and rodent prostate [81], it can be expected that some functionally important regions will be conserved only in a limited set of genomes where they play a critical role. Thus, we used a flexible definition of "evolutionary conservation" to accommodate such potential differences between genes and/or TFBS: here a TFBS was considered evolutionarily conserved if it was predicted as the respective TFBS in an orthologous position in at least three of the eight surveyed genomes. In other words, the same genomic region was predicted to function as a candidate binding site for a particular TF in at least three surveyed genomes. Further, because differences in the presence or absence of particular TFBS between genomes may also be attributed to differences in the role of the respective genes in each of the organisms, we examined evolutionarily conserved sites at different levels of resolution (Human-Primates, Human-Rodents, and Human-Opossum), thereby allowing us to identify genes and TFBS that are functionally relevant to each of these comparisons.
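
The "at least three of eight genomes" rule can be expressed as a short Python sketch. It assumes that predicted sites for a single TF have already been mapped to shared alignment columns; the function name, the data layout, and the toy predictions are hypothetical and serve only to illustrate the definition.

```python
from typing import Dict, Set


def is_conserved(sites_by_genome: Dict[str, Set[int]], column: int,
                 min_genomes: int = 3) -> bool:
    """True if a site for one TF is predicted at alignment `column`
    in at least `min_genomes` of the surveyed genomes."""
    supporting = [g for g, cols in sites_by_genome.items() if column in cols]
    return len(supporting) >= min_genomes


# Hypothetical predictions for a single TF:
# genome -> alignment columns at which a site for that TF was predicted.
toy_predictions: Dict[str, Set[int]] = {
    "human": {120, 480, 900},
    "chimp": {120, 480},
    "rhesus": {120},
    "mouse": {300},
    "rat": {300},
    "dog": set(),
    "cow": set(),
    "opossum": set(),
}

# Human sites that meet the three-genome criterion.
conserved_human_sites = sorted(
    col for col in toy_predictions["human"]
    if is_conserved(toy_predictions, col)
)
```

Restricting `sites_by_genome` to a subset of genomes (for example human plus the two rodents) and adjusting `min_genomes` would give one possible way to examine a single comparison level; the exact criteria used for each level are not specified here.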

Cell culture and chromatin immunoprecipitation

LNCaP prostate cancer cells (ATCC-CRL 1740) and human embryonic kidney 293 cells (ATCC-CRL 1573) were cultured in RPMI or DMEM/F12 (HyClone Laboratories, Utah) media, respectively, as described [15]. The cytomegalovirus (CMV) promoter-driven pGFP-WT1 (A) expression construct, encoding the murine Wt1 gene (lacking both the KTS insertion and exon 5) fused to the GFP coding region, was obtained from Dr. A. Ward [82]. All DNA was purified with the Qiagen Plasmid Maxi Kit (Qiagen, Carlsbad CA) and transfections were performed using Lipofectamine 2000 (Invitrogen; Carlsbad CA) in serum- and antibiotic-free media as described [15]. Green fluorescing cells were visualized by epifluorescence microscopy (Olympus) at 100–400× magnification at 24 and 48 h after transfection, prior to cell harvest for chromatin isolation.

The Farnham ChIP protocol [83] was used with some modifications. Two million cells were treated with formaldehyde to crosslink proteins to DNA and lysed in PBS-PI as recommended for the EZ ChIP Assay (Upstate Biotechnology Inc). Lysates were centrifuged and DNA sheared by sonication (Biosonik III, Bronwill Scientific, Rochester, NY) to fragments of 100–400 bp in length. The supernatant was pre-cleared by incubation with Protein G Agarose and incubated overnight at 4°C with either SP1 antibody (Upstate Biotechnology Inc) or WT1 antibody (a mixture of C19 and N18 polyclonal Abs, Santa Cruz Biotechnology) or non-immune IgG. The antibody/protein/DNA complex was collected by incubation with Protein G Agarose and washed in increasing salt buffers, then rinsed in TE as recommended (Upstate Biotechnology Inc). The complexes were recovered from agarose beads with an elution buffer, crosslinks were reversed and DNA was purified using G-50 spin columns. Four percent of both immunoprecipitated and input chromatin were amplified by PCR using Taq polymerase (Applied Biosystems by Roche Molecular System, Inc) and the following set of primers: VEGF primers (F) 5'TTCCTAGCAAAGAGGGAACG3' and (R) 5'ACCAAGGTTCACAGCCTGAA3'; AR primers (F) 5'TATCTGCTGGCTTGGTCATGGCTTG3' and (R) 5'CTGCTTCCTGAATAGCTCCTGCTT3'; and PSA primers (F) 5'TCTGCCTTTGTCCCCTAGAT3' and (R) 5'AACCTTCATTCCCCAGGACT3'. Following an initial 10 min denaturation at 95°C, DNA was amplified by 32 cycles of: 1) 20 sec denaturation at 95°C, 2) 30 sec annealing at either 58°C (for VEGF primers) or 59°C (for AR and PSA primers) and 3) 30 sec extension at 72°C; amplification was completed with a 2 min final extension at 72°C. PCR products were fractionated on 1% agarose gel, and ethidium bromide stained DNA was visualized by a gel doc system (BIORAD, CA). Specificity controls are shown in Additional file 2.
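
For reference, the cycling conditions above can be written down as a small data structure. The following Python sketch is only a bookkeeping aid; the class and function names are hypothetical, and the computed time covers programmed holds only, not the instrument's ramp time.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Step:
    name: str
    temp_c: int
    seconds: int


@dataclass(frozen=True)
class Program:
    initial: Step
    cycle: List[Step]
    cycles: int
    final: Step


# Annealing temperatures as stated in the text for each primer set.
ANNEALING_C = {"VEGF": 58, "AR": 59, "PSA": 59}


def pcr_program(primer_set: str) -> Program:
    """Build the cycling program for one primer set."""
    return Program(
        initial=Step("initial denaturation", 95, 10 * 60),
        cycle=[
            Step("denaturation", 95, 20),
            Step("annealing", ANNEALING_C[primer_set], 30),
            Step("extension", 72, 30),
        ],
        cycles=32,
        final=Step("final extension", 72, 2 * 60),
    )


def programmed_seconds(program: Program) -> int:
    """Total programmed hold time in seconds (ramp time not included)."""
    per_cycle = sum(step.seconds for step in program.cycle)
    return program.initial.seconds + program.cycles * per_cycle + program.final.seconds


vegf_seconds = programmed_seconds(pcr_program("VEGF"))
```

With these values the programmed hold time is 600 + 32 × 80 + 120 = 3280 s, or just under 55 min, for every primer set; only the annealing temperature differs between them.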