Introduction

A human promyelocytic cell line, HL-60, is frequently used to analyze cellular differentiation mechanisms [1]. HL-60 cells differentiate into macrophage-like cells and granulocytic cells after treatment with 12-O-tetradecanoyl-phorbol-13-acetate (TPA) and all-trans retinoic acid (ATRA), respectively [1–3]. In particular, it has been shown that the expression of a number of genes is augmented during differentiation induced by TPA [4]. Recently, TPA was successfully administered to patients with myelocytic leukemia, leading to temporary remission [5, 6]. This TPA treatment is based on the concept that the growth and proliferation of lymphoma or cancer cells could be artificially suppressed if they are forced to differentiate into a non-growing state [7]. However, TPA was originally identified as a cancer-causing reagent (carcinogen) that activates protein kinase C [1]. Therefore, extreme care will be needed when administering the reagent in clinical situations.

It has been demonstrated that somatic cells could be converted into induced pluripotent stem (iPS) cells by introducing just four transcription factors, namely Oct4, Sox2, Klf4, and c-Myc expression vectors or their encoding proteins, into somatic cells [8, 9]. The possibility to change the destiny of cells by introducing transcriptional factors prompted us to study transcription systems that control cellular differentiation. In order to establish a method to convert malignant cells to the non-growing differentiated state without using TPA, it will be important to determine and characterize molecular mechanisms in the transcriptional signal transduction systems that are induced by TPA. In other words, the changes in the TPA-induced transcriptional controlling system will provide useful information about which genes of transcription factors should be transfected into the malignant cells.

Previously, it has been demonstrated that the human PARG promoter is activated during the differentiation of HL-60 cells induced by TPA [10]. Another example is the promoter of the IGHMBP2 gene, whose cDNA was isolated by South-Western screening during a search for a gene encoding a factor that binds to a 50-bp sequence of the MMTV LTR [11]. Interestingly, there are no obvious TATA- or CCAAT-boxes around the transcription start sites (TSS), but duplicated GGAA motifs are commonly contained in the region between the TSS and 100 bp upstream of the human PARG and IGHMBP2 genes. In addition, mutation analyses showed that the duplicated GGAA motifs of both promoters play a primary role in promoter activity during the TPA response [10, 12]. Taken together, the duplicated (overlapped) GGAA motifs located very close to the TSS in the human PARG and IGHMBP2 promoters are primarily required for the initiation of transcription during TPA responses. Studies on the transcriptional mechanisms regulated by the duplicated GGAA motifs and their binding factors may provide a new strategy to artificially induce differentiation of leukemia or cancer cells by introducing specific gene expression and suppression vectors.

Identification of the duplication of the GGAA motifs near the most 5′-upstream regions of the TPA-inducible genes

It has been indicated that the GGAA motif is located in the promoter regions of some human genes encoding interleukin-2 receptor β-chain (IL-2Rβ), Ets-2, stromelysin 1, TCRα enhancer, TCRβ enhancer, and NFAT-1 [13]. Mutation of the GGAA motif abolished activity of the IL-2Rβ promoter during the PMA response in Jurkat cells. These observations, including ours that we reported in a recent paper [10, 12], suggested a hypothesis that the GGAA motif might be an essential transcription regulatory element with a positive response to TPA-induced signals. By analyzing the activities of various promoters of human genes encoding DNA helicases, or repair-associated proteins, it has been shown that the duplicated GGAA motifs are located in the XPB, RTEL, and Rb1 promoters [14]. It has been reported that, during the macrophage-like differentiation of HL-60 cells, Rb1 belongs to a group of TPA-inducible late genes [4]. In addition, the expression of the PARG and the IGHMBP2 genes shows a late responding profile, and the gene encoding the GA-binding transcription factor, β subunit 2, is classified into the early response group [4, 10, 11]. Multiple GGAA motifs are contained in the promoter region of the human CD68 gene that is widely regarded as a pan macrophage marker in immunohistochemistry studies, and its expression is regulated by Elf-1, PU.1, and IRF-4 [15]. These observations suggested that duplications of the GGAA motifs are necessary for the TPA-late response or macrophage-like differentiated state of the cells. Therefore, DNA sequences of the 5′-upstream regions of 19 late response genes were investigated and analyzed by a TF-SEARCH [4, 14]. As summarized in Table 1, duplications of the GGAA motifs are found in eight TATA-less promoters, including the Rb1, CD31(PECAM1), IL1R1, IL1R2, IL1RAPL2, S100A3, eIF3A, and ECM1 promoters. The 5′-flanking regions of the INHBA and S6 kinase genes contain clusters of GGAA motifs [14]. 
Interestingly, promoter regions of the IL1R1, IL1R2, and IL1RAPL2 genes, which encode the IL1 signal receptor or its associated proteins, have duplicated GGAA (TTCC) motifs (Table 1), suggesting that these genes are regulated by the IL1 signaling system. Although a single GGAA motif is located very close to the TSS of the PEA15, ANXA5, and RFC-1 genes, another six promoters, including those of the LGALS3, PREP, CHST10, RFXAP, P2Y5, and KCNMA1 genes, do not contain duplications of the GGAA motifs within the 300-bp from the most 5′-upstream region of their cDNAs [14]. Except for the P2Y5 promoter, which contains TA-rich sequences with Oct-1 binding sites, these are comparatively rich in GC content with GC-boxes. Therefore, Sp1 or GC-box binding proteins might also play a role in the late response to TPA treatment in HL-60 cells.

Table 1 Duplication of the GGAA motifs in various human promoters

The duplicated GGAA motifs are located in the promoter regions of interferon-inducible genes, and interleukin, endocrine signal-associated genes

Multiple GGAA motifs are found in the promoter region of the human (Table 1) and the mouse ITGA2B genes encoding the megakaryocyte marker, CD41, that regulates cell adhesion and platelet aggregation [16, 17]. It has been indicated that the MAP kinase signaling pathway and transcription factors such as Elk-1 and Ets play important roles in megakaryocytic differentiation [16, 18]. Overlapping GGAA motifs are hidden in the interferon (IFN)-stimulated response element (ISRE)-like sequence to which IRF-2 binds to activate transcription of the ITGA2B gene [19]. It has also been reported that IRF-2 drives megakaryocytic differentiation via activation of the promoter of the TPOR gene encoding a thrombopoietin receptor [20]. These observations imply that transcriptional activation by the cis-acting effect of the duplicated GGAA motifs plays an important role in the differentiation of megakaryocytic cells as well as macrophage-like cells. Moreover, it has been reported that the ISRE of the mouse programmed death-1 (PDCD1) promoter contains a duplication of the GGAA (TTCC) motifs and plays a role in the response to IFN-α in macrophages [21]. The PDCD1 gene encodes an immuno-regulatory receptor that belongs to the CD28 family [22]. In addition, duplication of the GGAA motifs is found in the interferon-responsive promoter regions of the human interferon-stimulated gene 15 (ISG15) [23] and of the CD40 (TNFRSF5) gene, which encodes a member of the TNF receptor superfamily [24] (Table 1). Furthermore, the GGAA motif is located in an IFN-γ activated sequence in the transporter of antigenic peptide-1 (TAP1) promoter [25]. Analysis of transcripts in purified neutrophils from active tuberculosis patients, compared with healthy controls, showed that IFNs activate the JAK/STAT signal cascade to induce the expression of various genes, including IRFs and various interferon-inducible genes [26].
Within 500 bp upstream of the TSS, the duplicated GGAA (TTCC) motifs are found in the 5′-flanking regions of the OAS1, OAS2, OAS3, IFI6, IFI44, IFI44L, IFIH1, IFITM3, TAP1, GBP2, STAT1, STAT2, CD274, CXCL10, IFIT5, and IRF7 genes. Therefore, the duplicated GGAA motifs and/or the factors that recognize them might play a part in regulating the expression of those genes in response to the signals evoked by IFNs. A recent study showed that IFN-γ-producing macrophages, which co-express the CD68 molecule, are identified in 70% of human melanomas [27]. Although not duplicated, putative c-Ets binding motifs are found in the promoter region of the IFNG gene, suggesting that GGAA motif binding factors may co-regulate the IFNG and the CD68 genes in macrophages.

As described above, the duplicated GGAA motifs are commonly found in the IL1R1, IL1R2, and IL1RAPL2 gene promoter regions (Table 1). Also, the duplication of the motifs is observed in the human IL-1β [28] (Table 1), IL-2 [29], and IL-2Rβ [13] promoter regions. A recent study showed that Elk-1 and SRF regulate ets motif containing the ZC3H12A promoter with IL1 dependency, suggesting that the IL1 signaling system could be sensitized via the cis-acting effect of the GGAA motifs [30]. In addition to the IL1R gene, IL7R gene expression could be regulated by the duplicated GGAA motif. In mouse T cells, it has been shown that GABP regulates transcription of the IL7R gene [31]. It was shown that the expression of the IL7R (CD127) is strongly correlated with that of the Ets-1 in human peripheral T cells [32]. It could be caused by the presence of the GGAA (TTCC) duplication, 5′-CAGACTTCCTGTTTCTGGAACTTGC-3′ that is located near the TSS of the human IL7R gene. Moreover, a recent study indicated that the GABP transcription factor is also required for TCR rearrangement [33].
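The GGAA (TTCC) duplication quoted above for the human IL7R gene can be located with a few lines of code. The following is a minimal sketch, not the search procedure used in the cited studies: the function names `find_ggaa_motifs` and `duplicated_pairs` and the 50-bp default window are illustrative assumptions, and only the 25-bp sequence itself is taken from the text.

```python
def find_ggaa_motifs(seq):
    """Return (position, motif) pairs for every GGAA or TTCC in seq (0-based)."""
    seq = seq.upper()
    hits = []
    for i in range(len(seq) - 3):
        word = seq[i:i + 4]
        if word in ("GGAA", "TTCC"):  # TTCC is GGAA read on the opposite strand
            hits.append((i, word))
    return hits


def duplicated_pairs(hits, max_gap=50):
    """List (start_a, start_b, gap) for motif pairs whose starts lie within max_gap bp.

    The start-to-start definition of the gap is an assumption made for illustration.
    """
    pairs = []
    for i, (pos_a, _) in enumerate(hits):
        for pos_b, _ in hits[i + 1:]:
            if pos_b - pos_a <= max_gap:
                pairs.append((pos_a, pos_b, pos_b - pos_a))
    return pairs


# 25-bp sequence near the TSS of the human IL7R gene, as quoted in the text
IL7R_SEQ = "CAGACTTCCTGTTTCTGGAACTTGC"

hits = find_ggaa_motifs(IL7R_SEQ)
pairs = duplicated_pairs(hits)
print(hits)
print(pairs)
```

In this sequence the TTCC and GGAA elements lie on opposite strands, which is why both orientations are scanned.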

The effect of Ets proteins on the endocrine system has been investigated [34]. It was recently indicated that two ets motifs located in the human insulin induced gene 2 (INSIG2) promoter play important roles in the response to insulin through binding or activation of the transcription factor, SAP1a [35]. Duplication of the GGAA motifs is found in the intronic promoter region of the human glucose-dependent insulinotropic polypeptide-encoding gene GIP [36]. GIP can stimulate insulin release from pancreatic β-cells when the blood glucose concentration increases [36]. These lines of evidence suggest that the GGAA motifs regulate the expression of genes encoding proteins that are involved not only in cellular differentiation but also in the immune response and endocrine systems.

The GGAA-microsatellite-containing genes in the human genome: the main target sequences for EWS/FLI to function as an oncoprotein

Repetitive sequences are known to be located in eukaryotic chromosomes. Telomeres, which are composed of TTAGGG repeats, are located at the ends of linear chromosomes [37]. The centromere is a specific structural domain of the chromosome with important functions in segregating chromosomes accurately, and many protein components of the centromere have been identified [38]. A centromere protein (CENP) B box, which is recognized by CENP-B, is a 17-bp motif containing GGAA, and it appears within every other α-satellite repeat (171 bp) in human chromosomes [39, 40]. Such repetitive sequences are generally referred to as microsatellites. Ewing’s sarcoma is a tumor in which Ets family members are involved in chromosomal translocations [41]. The chromosomal translocation t(11;22)(q24;q12) in this malignant solid tumor generates the EWS/FLI oncoprotein, which functions as an aberrant transcription factor [42]. The EWS/FLI oncoprotein is known to up-regulate the expression of various genes, including TGFβ RII, cyclin D1, Id2 and c-Myc, IGFBP3, PTPL1, cyclin E, MK-STYX, and caveolin1, through interaction with several ets (GGAA) motifs [43]. The repetitive GGAA motifs are known as GGAA-containing microsatellites that can be targeted by the oncoprotein EWS/FLI [42, 44]. The GGAA-containing microsatellites are located in the promoter regions of the DAX1/NR0B1, FCGRT, CAV1, CACNB2, FEZF1, KIAA1797, and GSTM4 genes [43, 45, 46]. A recent study suggested that EWS/FLI binds to these elements as a homodimer to activate transcription [47]. The GGAA-microsatellite is preferentially recognized and activated by EWS/FLI but not by Ets1 and Elk1, suggesting that transcription of genes containing GGAA-microsatellites is not usually regulated by Ets family proteins [47, 48].

Identification of the duplicated GGAA motifs in the 5′-flanking regions of various human genes

By performing transfection experiments with mutation-carrying reporter plasmids, we have indicated previously that the duplicated GGAA motifs in the 5′-flanking regions of the PARG, IGHMBP2, ATR, XPB, RTEL, and Rb1 promoters have essential roles in regulating the activities of these promoters [10, 12, 14]. Comparing these duplicated GGAA motifs, we have tentatively determined the consensus 14-bp sequence as 5′-(A/G/C)N(A/G/C)(C/G)(C/G)GGAA(A/G)(C/T)(G/C/T)(A/G/C)(A/G/C)-3′ [14]. The probability of this combination occurring by chance can be roughly estimated as only once in 8,630 combinations of these bases, including the reverse orientation. A duplication of the consensus 14-bp sequence by chance within a 50-bp distance would therefore be extremely rare. Accordingly, we searched for duplicated 14-bp sequences from 2,000 bp upstream to 200 bp downstream of the TSS of 47,553 human genes extracted from the Ensembl database. The in silico analysis retrieved 469 examples with one or more duplications of the 14-bp sequence, including 234 instances of protein-encoding genes of known function or structure. Eighty percent (372/469) of the genes having the consensus 14-bp sequence lack a TATA-box, and 74% (345/469) of them have GGAA(TTCC) duplications within 500 bp upstream (Fig. 1). Although not all the 5′-flanking regions of those 234 genes of known function contain the duplicated GGAA motifs within a 450-bp distance of the TSS, 174 gene promoters were found to have at least one duplicated (overlapped) GGAA motif around the TSS (Table 2, Group A). They include, for example, IFITM5, RBBP5, GRB2, and SUMO1, encoding the interferon-induced transmembrane protein 5/Bril [49], Rb-binding protein 5 [50], growth factor receptor-bound protein 2 [51], and small ubiquitin-modifier 1 [52], respectively.
The gene encoding the non-degradative ubiquitin signaling factor FANCD2, which is suggested to regulate DNA replication and repair [53], is included in the list (Table 2, Group B). Although our search could not retrieve the 5′-flanking region of the BRCA1 gene, the duplicated GGAA motifs are also involved in the UP site and Ets2 binding site of the human BRCA1 promoter [54, 55]. Ets2 binds to the 150-bp region upstream of the UP site that is recognized by the GABP transcription factor [54]. The tumor suppressor BRCA1, which regulates the DNA damage response, including cell cycle checkpoint control and DNA repair [56], has ubiquitin E3 ligase activity [53, 57]. FANCD2, which is involved in the Fanconi anaemia pathway [53], is monoubiquitylated to form the Fanconi anaemia core complex with FANCI [56, 58]. Interestingly, OTUB1, which encodes OTU domain, ubiquitin aldehyde binding 1, is included in Table 2. A recent study indicated that OTUB1 is an inhibitor of the DNA damage response that suppresses poly- but not mono-ubiquitination [59]. These observations suggest that the duplicated GGAA motifs control expression of genes encoding non-degradative ubiquitylation-associated proteins. Other examples are promoter regions of genes encoding TNF-associated signal proteins such as TRAF1, TNFSF12 (TWEAK), and TNFSF13 (APRIL) [60–62]. The expression pattern of the TRAF1 gene in HL-60 cells after TPA treatment has led to its classification as an intermediate responding gene [4]. It should be noted that the GGAA motif is duplicated in the human TNF promoter region [63]. Therefore, duplicated GGAA motifs could regulate expression of the genes encoding proteins that are associated with the TNF signal cascade. Moreover, several promoters of glucose metabolism-associated genes (G6PD, GYS1, and SLC2A13) have been found to contain the duplication within 450 bp upstream of the TSS [64–66].
Although the algorithm used in this study could not retrieve the NR1H2 (LXRB) gene, which was shown to be regulated by the Elk1 and SRF transcription factors [67], multiple GGAA (TTCC) motifs are found around the TSS of the NR1H2 promoter, suggesting that the duplicated (multiple) GGAAs might also control glucose metabolism in cells.
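The rough chance estimate of once in 8,630 quoted earlier in this section for the consensus 14-bp sequence can be reproduced with simple arithmetic. The sketch below assumes equal base frequencies and independent positions, and it treats "including the reverse orientation" as a doubling of the single-strand probability; these assumptions are ours and are not stated in the text, and the names `CONSENSUS` and `match_probability` are illustrative.

```python
from fractions import Fraction

# Allowed bases at each of the 14 positions of
# 5'-(A/G/C)N(A/G/C)(C/G)(C/G)GGAA(A/G)(C/T)(G/C/T)(A/G/C)(A/G/C)-3'
CONSENSUS = ["AGC", "ACGT", "AGC", "CG", "CG",
             "G", "G", "A", "A",
             "AG", "CT", "GCT", "AGC", "AGC"]


def match_probability(consensus):
    """Probability that a random 14-mer matches, assuming each base has frequency 1/4."""
    p = Fraction(1)
    for allowed in consensus:
        p *= Fraction(len(set(allowed)), 4)
    return p


p_one_strand = match_probability(CONSENSUS)  # (3/4)^5 * (1/2)^4 * (1/4)^4
p_both_strands = 2 * p_one_strand            # assumption: reverse orientation doubles the chance
one_in = 1 / p_both_strands

print(p_one_strand)   # 243/4194304
print(float(one_in))  # about 8630.3
```

Under these assumptions the single-strand probability is 243/4,194,304, or about once in 17,260, and doubling it for the reverse orientation gives about once in 8,630, in agreement with the figure in the text.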

Fig. 1

Most of the 5′-upstream regions having duplicated GGAA motifs are TATA-less. a The number of genes that have the duplicated GGAA consensus 14-bp sequence [14] in the 2,000 bp upstream of the TSS was 469. Among them, 372 genes have no TATA-box within 500 bp upstream. b The number of genes that have the duplicated GGAA consensus 14-bp sequence in the 2,000 bp upstream and overlapping GGAA(TTCC) motifs within 500 bp upstream of the TSS was 345. Among them, 279 genes have no TATA-box within 500 bp upstream. Numbers of genes from the Ensembl database are shown

Table 2 Human promoter regions containing duplication of the GGAA motifs near the TSS

The tandemly repeated GGAA motifs are found in the long terminal repeat of the HIV2 gene [68] and have been identified as essential sequences in the human FcRγ promoter [69]. The locations of the ets elements in various viral gene, Fc receptor, immunoglobulin, and other promoters were reviewed previously [70]. Redundancy or overlapping of the GGAA motifs in the TP53, PF4 (Platelet Factor 4), and UTRN (Utrophin) promoters has been indicated [70]. These observations suggest that overlapped or tandemly repeated GGAA motifs are contained in promoter regions of various genes that encode biologically important protein factors.

Distribution of the Ets binding motifs in the whole human genome

A recent study using promoter chromatin immunoprecipitation (ChIP) microarray analysis to search for Ets family protein binding regions in the human genome indicated that promoters occupied by Ets1, ELF1, or GABPα were frequently bound by one or more of the other Ets proteins [48]. Quantitative ChIP analysis indicated redundant binding of these transcription factors to the promoter regions of the COX17 and DFFA genes, which are listed in Table 2 [48]. Moreover, the BTRC promoter (Table 2) was shown to be co-occupied by Ets1 and RUNX1, as are the TCRα and TCRβ promoters [48]. Thus, the genome-wide promoter ChIP analysis indicated that 5–15% of the 17,000 human promoters are redundantly occupied by these Ets proteins. Bioinformatic studies have also indicated that Ets-like binding sites are over-represented in human promoters [71–73]. Our in silico search analysis retrieved 195 genes (Table 2, Group A and B combined), which represent only 1% of total genes. This unexpectedly low value may have resulted from the somewhat stringent conditions of the first screening, which searched for duplications of the PARG/IGHMBP2 promoter-like GGAA motifs. Given that each Ets family protein prefers particular sequences flanking the GGAA (TTCC) core, our selected 14-bp sequence might have restricted the candidate genes. Recently, overlap between ChIP-seq peak data indicated several preferential combinations of the Ets family proteins that depend on the cell lines used [74]. The occupancy of Ets proteins in the same promoter region could be explained by alternative binding of the Ets proteins to the same binding site [48]. Another explanation is that there is a separate binding site containing duplicated or overlapped GGAA (TTCC) motifs for each factor around the TSS.
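To illustrate why a first screening based on the consensus 14-bp sequence is stringent, the following sketch scans both strands of a sequence for the consensus and reports whether two copies lie within 50 bp of each other. It is not the algorithm used in our search [14]: the regular expression is a direct transcription of the consensus, while the start-to-start distance criterion, the function names, and the synthetic test sequence are assumptions made for illustration.

```python
import re

# Consensus 14-mer wrapped in a lookahead so that overlapping matches are reported
CONSENSUS_RE = re.compile(
    r"(?=([AGC][ACGT][AGC][CG][CG]GGAA[AG][CT][GCT][AGC][AGC]))"
)
_COMPLEMENT = str.maketrans("ACGT", "TGCA")
MOTIF_LEN = 14


def reverse_complement(seq):
    return seq.translate(_COMPLEMENT)[::-1]


def consensus_hits(seq):
    """Return sorted (forward-strand start, strand) for every consensus match."""
    seq = seq.upper()
    n = len(seq)
    hits = [(m.start(), "+") for m in CONSENSUS_RE.finditer(seq)]
    for m in CONSENSUS_RE.finditer(reverse_complement(seq)):
        # map the reverse-strand coordinate back onto the forward strand
        hits.append((n - m.start() - MOTIF_LEN, "-"))
    return sorted(hits)


def has_duplication(seq, max_distance=50):
    """True if two consensus matches start within max_distance bp (assumed criterion)."""
    positions = [pos for pos, _ in consensus_hits(seq)]
    return any(b - a <= max_distance for a, b in zip(positions, positions[1:]))


# Synthetic (hypothetical) sequence carrying two copies of a consensus-matching 14-mer
ONE_COPY = "TTTT" + "AAACCGGAAACGAA" + "TTTT"
TWO_COPIES = "TTTT" + "AAACCGGAAACGAA" + "T" * 10 + "AAACCGGAAACGAA" + "TTTT"

print(consensus_hits(TWO_COPIES))
print(has_duplication(TWO_COPIES), has_duplication(ONE_COPY))
```

Because every flanking position is constrained, a promoter with two GGAA cores whose flanks deviate from the PARG/IGHMBP2-like pattern is not reported by such a screen, which is consistent with the low number of genes retrieved.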

Biological properties of the Ets family proteins—functions as proto-oncogene products

An oncogene, v-ets, was identified from the E26 avian leukemia virus in 1983, and a large family of conserved genes was subsequently isolated [75, 76]. Several of them were targeted by homologous recombination, and it has been demonstrated that ets genes play important roles in embryogenesis and hematopoiesis [75, 77]. The amino acid sequences of the human Ets domains were aligned and their binding sequences have been evaluated [48]. The Ets family proteins are similar not only in their amino acid sequences (Ets domains) but also in the DNA sequences that they recognize [48]. It has been reported that the Ets transcription factor, ER81, stimulates hTERT gene expression [78, 79]. A recent study indicated that the hTERT promoter possesses two biologically important Ets motifs (EtsA and EtsB) and that Ets2 maintains hTERT gene expression in cooperation with c-Myc in breast cancer cells [80]. It is noteworthy that the TTCC motif-overlapped EtsB sequence, 5′-TTCCTTTCC-3′, which is the same as the one located in the eIF3A promoter (Table 1), is found close to the TSS of the hTERT gene. Analyzing sequences of the hTERT promoter from non-small cell lung cancers revealed that mutations of the TTCC motif correlate with low levels of telomerase activity and short telomeres [81]. It should also be noted that palindromic Ets-binding sites are located in the promoter region of the human TP53 gene and that Ets1 binds to the sequence [70, 82]. The biological relevance of Ets1 to transcriptional activity of the TP53 promoter has been shown in embryonic stem cells undergoing apoptosis induced by UV irradiation [83]. A recent study indicated that the AEN gene, which encodes the nuclear apoptosis-enhancing nuclease, is a target of p53 [84]. It has been reported that the death-associated protein kinase (DAPK) family modifies MDM2 and p21 proteins [85].
The AEN and DAPK3 promoters were identified by our search algorithm (Table 2), suggesting that genes encoding p53 and its associated proteins are commonly regulated by the duplicated GGAA motifs at the transcriptional level. Furthermore, palindromic Ets binding sites in the CRYAB (alphaB-crystallin) gene promoter are positively regulated by Ets1 in breast cancer cells [86]. The epithelium-specific Ets transcription factor, ESE-1, which is abundantly expressed in human breast cancer, has been reported to have transforming activity in MCF-12A human mammary epithelial cells [87]. Recently, it was shown that the Ets family member ETV1 works as a regulator of gastrointestinal stromal tumors in combination with oncogene product KIT protein [88]. Whole genome sequencing analysis of human prostate cancer suggested that TMPRSS2-ERG fusion protein generated by gene rearrangement causes aberrant transcription or chromatin structure and that it is correlated with the tumorigenesis [89]. As described above, the GGAA motifs are duplicated in the human PARG promoter [10]. It should be noted that the duplicated GGAA motifs are located in the promoter region of the human PARP-1 [90] and p21 [91] genes, which encode poly (ADP-ribose) synthetase and Cdk inhibitor, respectively. Taken together, these observations suggest that Ets family proteins regulate cancer development or tumorigenesis.

Apoptosis regulating functions of Ets family proteins

Apoptosis or programmed cell death is known to be regulated by various factors including caspases, DNases, or signal transduction proteins [92]. The importance of Ets family proteins in the regulation of cell-death has been proposed [75]. Caspase-1 has been identified as ICE (IL-1β converting enzyme), which plays a prominent role in inflammatory responses [93, 94], and it is involved in the regulation of apoptosis [95, 96]. Interestingly, it has been demonstrated that caspase-1 is a direct target gene of Ets1 [97]. The caspase-1 gene is also up-regulated by PU.1 in erythroleukemia cells undergoing apoptosis [98, 99]. The activation of caspase-1 gene expression could be explained by the presence of the Ets binding sequence located in the promoter region [97]. Ets family proteins have been shown to regulate apoptosis of endothelial cells [100, 101]. The repeated GGAA motifs are located in the promoter region of the human VE-cadherin gene and the Erg transcription factor has been shown to bind to that sequence in a ChIP assay [100, 102]. Expression of antiapoptotic genes, including Bcl-XL and cIAP, is regulated by Ets1 and Ets2 during embryonic development [101]. In human mesothelioma cells, Ets2 and PU.1 up-regulate Bcl-XL transcription to prevent cells from undergoing apoptosis [103]. The thioredoxin-binding protein 2 (TBP2) promoter, which contains multiple ets motifs, is regulated by Ets1 in the MG-63 osteosarcoma cell line [104]. TBP2 is thought to cause apoptosis through activation of ASK1 and JNK [105]. Moreover, the Ets1 binding site is present in the proto-oncogene Pim-3 promoter [106]. Up-regulated Pim-3 gene expression caused by Ets1 is thought to prevent apoptosis in cancer cells. Our search indicated that PARG/IGHMBP2-like GGAA motifs are located in the promoter regions of the PDCD-1 [22], and DFFA genes, encoding inhibitor of caspase-activated DNase (ICAD) [107]. 
Furthermore, the duplicated GGAA motifs are contained in the human Bcl-2 [108], Fas (CD95) [109], and FasL [110] promoter regions.

Taken together, Ets proteins could function as positive and negative apoptosis regulators controlling expression of the apoptosis signal transduction protein-encoding genes.

The duplicated GGAA sequence functions as a fundamental transcription regulatory element for TATA-less promoters

Molecular mechanisms of the initiation of transcription by RNA-polymerase II (RNA pol II), which is part of a large protein complex in eukaryotic cells, have been well studied [111]. Sequential complex formation occurs at transcription initiation sites very close to the TATA-box, where the TATA-binding protein (TBP)/TFIID complex binds first, and then TFIIA, TFIIB, RNA pol II-TFIIF, TFIIE, and the DNA helicase TFIIH assemble to make a large protein complex known as the transcription machinery [112]. The very first step of this RNA pol II-mediated transcription-initiating reaction is the recognition and binding of the TBP protein to the TATA-box. However, the TATA-box is not always found near the most 5′-upstream region of the cDNA. It was reported that about 76% of human core promoters lack a TATA-like element [113]. Figure 1 shows that 81% (279/345) of the genes that have two or more GGAA consensus 14-bp sequences [14] in the 2,000-bp upstream region and overlapped GGAA(TTCC) motifs within 500 bp upstream have no TATA-box. The promoters that have no TATA-box near the TSS are usually known as TATA-less promoters, and they have a high GC content and frequently contain Sp1 binding sites, SCGGAAGY, or TGCGCANK motifs [113]. If the general transcription factors need to be recruited onto the TATA-less transcription initiation site, other cis-acting sequences and binding protein factors are required for the initiation of the transcription reaction in mammalian cells.

It should be noted that the overlapping GGAA motif-containing sequence, 5′-CCTATGGAAACACAGGAAGTGAC-3′, is found in the TBP promoter region and the ets motif has been shown to play an important part in regulation of the promoter activity [114]. Quantitative PCR-ChIP analysis indicated specific binding of ELK1 to the TBP, TAF1, TAF7, TAF12, GTF2A1, GTF2B, and BTAF1 promoters [115]. Moreover, it has been reported that the epithelium-restricted Ets transcription factor, ESX binds to TBP, and that the ETS family transcription factor, ERM, interacts with TAFII40 and TAFII60 as well as TBP [116, 117]. Furthermore, it has been shown that PU.1 plays a role in recruiting the TBP/TFIID complex to the defensin-1 promoter around the TSS [118]. It should also be noted that Defensin α1 protein is encoded by the genes DEFA1 and DEFA1B whose promoters locate tandem repeat GGAA motifs [118] as listed in Table 2. These lines of evidence suggest that the ETS family proteins not only directly affect transcription of the TBP and its associated protein-encoding genes, but also guide general transcription factors, including TBP, towards the TSS of the duplicated GGAA motif-harboring promoters. Recently, the crystal structure of Elf3, which is a member of the Ets family of proteins, was revealed to bind to GGAA motifs of A- and B-sites of the type II TGF-β receptor (TGF-β RII) promoter, and it was concluded that regions flanking the Ets domain are essential for DNA binding [119]. The presented model describing binding of the Elf3 with A- and B-sites in the TGF-β RII promoter region suggests that the duplication of the GGAA motifs might give a scaffold for general transcription factors near the TSS. In other words, tandemly repeated GGAA motifs might give rise to a fundamental position for the pre-initiation complex to bind around the TSS without TATA-box like sequences. 
The overlapped GGAA motifs may offer the advantage that transcription can be controlled precisely by the expression levels of the Ets family members alone, according to whether cells proliferate, differentiate, undergo apoptosis, or respond to IFN-induced signals (Fig. 2). Therefore, it is possible that the promoter of the same gene is active in specific cells while it is attenuated in other cells, depending on the expression profiles of the GGAA-motif binding proteins.

Fig. 2

Hypothetical transcription controlling system in which duplicated GGAA (TTCC) motifs and Ets family proteins are involved. Closed circles represent GGAA (TTCC) motifs that are located near the TSS. The transcription activity is indicated by the size of the arrows. Multiple Ets family proteins, which are indicated by open squares, closed squares, triangles, and diamonds, could be induced by differentiation-inducing signals, apoptosis-inducing signals or specific cytokines. Their redundant occupancy around the TSS can regulate variable gene expression with subtle changes, depending on the combinations of the binding proteins. The over-transcribed mRNAs may be further regulated or fine-tuned by micro RNAs

Co-operation of the GGAA motifs with other transcription factor binding cis-elements near the TSS

As described above, duplicated GGAA (TTCC) motifs are found in the promoter region of the human IL-1β gene (Table 1). The IL-1β gene is synergistically regulated by PU.1 and IRFs through their binding to the PU.1/IRF composite element that is located −2.8 kb from the TSS [120]. The GGAA motif and other transcription factor binding site(s) can co-operatively affect promoter activity of various genes. For example, Sp1 is a transcription factor that binds to the GC boxes with the consensus sequence 5′-(G/T)GGGCGG(G/A)(G/A)(C/T)-3′ or 5′-(G/T)(G/A)GGCG(G/T)(G/A)(G/A)(C/T)-3′ [121]. It has been reported that both the GGAA motif and the Sp1-binding element play important roles in regulating transcription of the human VE-cadherin (CDH5) [102] and presenilin 1 (PS1) genes [122]. An interactive regulatory function of Sp1 and Ets-1 has been observed in the promoter region of the murine guanylyl cyclase/natriuretic peptide receptor-A-encoding gene, Npr1 [123]. Moreover, GABP and Sp1 co-operatively regulate human Heparanase1 (HPR1) promoter activity [124]. Oct-1, which is encoded by the POU2F1 gene, belongs to the POU homeo-domain transcription factors [125]. Members of this family recognize and bind to the octameric sequence ATGCAAAT to activate transcription of various genes [126]. Oct-1 and PU.1 are involved in the control of the murine CD45 (Ptprc) promoter activity [127]. Characterization of the promoter region of the Fibroleukin (FGL2) gene revealed that the cis-elements for Oct-1, Ets1, and Sp1/Sp3 co-operatively regulate its transcription [128, 129]. Similarly, the human BTK gene promoter, in which a duplicated GGAA motif is located, is regulated by binding of Sp1/Sp3, PU.1, and Oct-1 [130, 131]. Furthermore, c-Ets and the POU homeo-domain transcription factor GHF-1/Pit-1 enhance the ability of Ras/Raf to activate the rat prolactin (PRL) promoter [132].
Recently, it was shown that human prostacyclin receptor (PTGIR) gene promoter is regulated by Sp1, PU.1, and Oct-1 elements, and a schematic model was presented to suggest that Sp1-PU.1-Oct-1 ternary complex might recruit pre-initiation complex, including TBP, TFIID, and RNA pol II, to the TSS [133]. It has been shown that Ets1 and RUNX1 bind DNA co-operatively [48]. Recent genome-wide analysis of human promoters revealed that Elk1 and Ets1 bound regions are frequently occupied with SRF [115], and CBP [134], respectively. Moreover, it has been reported that AP1 element is co-located with ETS binding site in the promoter regions of several tumor invasion/metastasis-associated protein-encoding genes, such as MMP-1/type I collagenase, MMP-3/stromelysin, MMP-7/matrilysin, and MMP-9/type IV collagenase genes [135]. Taken together, these observations suggest that transcriptional initiation may be co-regulated by the Ets family proteins in co-operation with GC-box binding factors, POU homeo-domain transcription factors, and other transcription factors including RUNX1, SRF, CBP, and AP1 binding factors at the region close to the TSS.
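The co-location of GGAA motifs with GC-boxes and octamer elements described in this section can be inspected by translating the bracketed consensus notation used above into regular expressions. The sketch below is only an illustration: the helper names `consensus_to_regex` and `describe_elements` are hypothetical, and the test sequence is synthetic and is not taken from any of the promoters discussed.

```python
import re


def consensus_to_regex(consensus):
    """Translate '(G/T)GGGCGG(G/A)' style notation into a regular expression."""
    # "(G/T)" becomes the character class "[GT]"
    pattern = re.sub(
        r"\(([ACGT/]+)\)",
        lambda m: "[" + m.group(1).replace("/", "") + "]",
        consensus.upper(),
    )
    # "N" stands for any base
    return pattern.replace("N", "[ACGT]")


# Sp1 GC-box consensus sequences and the Oct-1 octamer as given in the text
SP1_A = consensus_to_regex("(G/T)GGGCGG(G/A)(G/A)(C/T)")
SP1_B = consensus_to_regex("(G/T)(G/A)GGCG(G/T)(G/A)(G/A)(C/T)")
OCTAMER = "ATGCAAAT"


def starts(pattern, seq):
    """0-based start positions of all (possibly overlapping) matches."""
    return [m.start() for m in re.finditer("(?=" + pattern + ")", seq)]


def describe_elements(seq):
    """Report where each element class occurs on the given strand of seq."""
    seq = seq.upper()
    sp1 = sorted(set(starts(SP1_A, seq)) | set(starts(SP1_B, seq)))
    return {
        "GGAA/TTCC": starts("GGAA|TTCC", seq),
        "Sp1 GC-box": sp1,
        "octamer": starts(OCTAMER, seq),
    }


# Synthetic sequence with one GC-box, one octamer, and a GGAA/TTCC pair
TOY = "AAGGGGCGGGGCAAATGCAAATAAGGAAGTTCCTA"
print(describe_elements(TOY))
```

Only the given strand is scanned for the GC-box and the octamer; a fuller analysis would also scan the reverse complement and take the distance of each element from the TSS into account.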

Conclusion

Duplication of the GGAA motifs is found within a 500-bp distance of the TSS of various genes. Several of these genes are classified as intermediate and late responding genes during the macrophage-like differentiation of HL-60 cells induced by TPA. The duplicated GGAA motifs are located in the promoter regions of genes encoding interleukin-, interferon-, TNF-signal-associated proteins, apoptotic factors, cancer-causing proteins, tumor suppressors, or cell cycle controlling proteins including Rb and p53. Almost all these promoters have no obvious TATA-box, suggesting that the duplicated GGAA motifs play important roles in precisely recruiting the pre-initiation complex near the TSS. Redundant occupation of the duplicated GGAA motifs by Ets family proteins seems to be a complicated system, but this would enable finely tuned regulation of each promoter through altering composition of Ets family proteins or GGAA-binding proteins in the nucleus in response to cellular signals (Fig. 2). Moreover, it is noteworthy that the duplication of the GGAA motifs contained in the 5′-flanking region of the human TBP gene is essential for regulating promoter activity. To date, initiation of transcription is explained mainly by the binding of TBP onto the TATA-box. Here, we propose an alternative transcription initiation mechanism where the duplication of the GGAA motifs is primarily required for determining the TSS in eukaryotic cells and for regulating expression of various genes. Further investigations into the regulatory mechanisms of the duplicated GGAA motif-containing promoters will enable the development of a new treatment for leukemia and malignant tumors by introducing Ets family protein expression vectors into cells.