Identification of critical base pairs required for CTCF binding in motifs M1 and M2
The ubiquitously expressed CCCTC-binding factor (CTCF) is highly conserved from Drosophila to mammals and performs multiple functions in the genome (Ohlsson et al., 2001). CTCF has been shown to establish chromatin insulation in vertebrates, and it also plays roles in transcriptional regulation, X-chromosome inactivation, and gene imprinting (Phillips and Corces, 2009). In addition, CTCF plays a pivotal role in genomic organization and loop formation by mediating long-range chromatin interactions between distant loci (Yao et al., 2010; Tang et al., 2015). Several hypotheses have been proposed to explain the diverse functions of CTCF. The popular ‘zinc-finger model’ proposes that CTCF’s different functions arise from the interplay between zinc-finger engagement and differences in the underlying sequence (Ohlsson et al., 2001). Genome-wide studies have shown that the majority of CTCF binding sites belong to a set of nonpalindromic sites with a consensus sequence referred to as M1 (Kim et al., 2007; Schmidt et al., 2012). More recently, a second binding motif, referred to as M2 and located 5–6 bp upstream of M1, was discovered (Schmidt et al., 2012). Moreover, CTCF zinc fingers (ZFs) 4–8 bind strongly to M1, whereas ZFs 7–11 tend to bind strongly to M2 (Renda et al., 2007; Xiao et al., 2015). In this study, we aimed to compare the binding abilities of CTCF to M1 and M2 and to determine which bases are required for M1 and M2 to bind CTCF.
To investigate the binding capacities of CTCF-ZFs 1–11 to M1 or M2, we constructed a pGEX-4T-2-CTCF-ZFs plasmid for prokaryotic expression of GST-CTCF-ZFs (Fig. S1A). We tested three commonly used induction temperatures (16°C, 28°C, and 37°C) and four IPTG concentrations (0.1 mmol/L, 0.5 mmol/L, 1 mmol/L, and 1.5 mmol/L). Coomassie blue staining showed that GST-CTCF-ZFs was robustly induced at 28°C, with an obvious band at about 55 kDa (Fig. S1B). We further optimized the IPTG concentration and found that the most suitable concentration for inducing GST-CTCF-ZFs expression was 0.6–0.7 mmol/L (Fig. S1C). We therefore concluded that the optimal induction condition for GST-CTCF-ZFs was 28°C with IPTG at 0.6–0.7 mmol/L. GST-CTCF-ZFs was purified on glutathione resin, eluted with reduced glutathione, visualized by gel staining (Fig. S1D), and used for subsequent electrophoretic mobility shift assay (EMSA) experiments.
Compared with that of M1, the binding of M2 to purified GST-CTCF-ZFs protein was much weaker (Fig. 1E). No protein/DNA supershift was observed when the amount of GST-CTCF-ZFs in the protein/DNA complex was between 0.2 and 0.4 μg (Fig. 1E, lanes 2 and 3). The supershifted band gradually became stronger as the amount of GST-CTCF-ZFs increased from 0.6 μg to 2.5 μg (Fig. 1E). We further confirmed that the interaction between GST-CTCF-ZFs and the biotin-labeled M2 oligo was abolished by an excess of unlabeled M2 oligo (Fig. 1F).
To quantify the strength of the interaction of CTCF-ZFs with M1 and M2, we performed EMSA to determine the dissociation constant (Kd) for the CTCF-DNA interactions. The strong CTCF binding motif M1 showed a Kd of 1.0 × 10⁻¹¹ mol/L, in contrast to a Kd of 2.9 × 10⁻¹⁰ mol/L for the CTCF-M2 interaction (Fig. 1G and 1H). Based on the amount of GST-CTCF-ZFs used in the EMSA and the Kd values, we hypothesized that the binding of M2 to GST-CTCF-ZFs was much weaker than that of M1 (Fig. 1B–H). To test this hypothesis, we performed two competition experiments in which unlabeled M1 or M2 competed against biotin-labeled M2 or M1, respectively. Our data indicated that unlabeled M1 specifically and nearly completely displaced M2 binding at 100-fold excess, whereas an unlabeled M2 competitor oligo did not displace M1 binding even at 1000-fold excess (Fig. 1I and 1J).
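The two reported Kd values can be put on a common scale with a short calculation. The sketch below is illustrative only: it assumes a simple single-site binding isotherm with protein in large excess over the labeled probe, which is not a model stated in the text. The function names and the example protein concentrations are hypothetical.

```python
# Illustrative sketch: compare the two reported Kd values under an ASSUMED
# single-site binding model (fraction bound = [P] / (Kd + [P])).
# The model and the example concentrations are assumptions, not from the study.

KD_M1 = 1.0e-11  # mol/L, reported Kd for the CTCF-M1 interaction
KD_M2 = 2.9e-10  # mol/L, reported Kd for the CTCF-M2 interaction


def fraction_bound(protein_conc: float, kd: float) -> float:
    """Fraction of probe bound at a given free protein concentration (mol/L)."""
    if protein_conc < 0 or kd <= 0:
        raise ValueError("protein_conc must be >= 0 and kd must be > 0")
    return protein_conc / (kd + protein_conc)


def fold_difference(kd_weak: float, kd_strong: float) -> float:
    """Ratio of two dissociation constants (larger Kd means weaker binding)."""
    return kd_weak / kd_strong


def main() -> None:
    print(f"Kd(M2) / Kd(M1) = {fold_difference(KD_M2, KD_M1):.1f}")
    # Hypothetical protein concentrations spanning both Kd values.
    for conc in (1e-12, 1e-11, 1e-10, 1e-9):
        m1 = fraction_bound(conc, KD_M1)
        m2 = fraction_bound(conc, KD_M2)
        print(f"[P] = {conc:.0e} mol/L  bound M1 = {m1:.2f}  bound M2 = {m2:.2f}")


if __name__ == "__main__":
    main()
```

Under this assumed model, half of the probe is bound when the protein concentration equals the Kd, so M2 would need roughly 29 times more protein than M1 to reach the same occupancy.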
To confirm our in vitro EMSA results, we constructed two plasmids carrying an insertion of either the M1 or the M2 CTCF binding site (Fig. S2). We transfected each construct into 293T cells and performed in vivo chromatin immunoprecipitation (ChIP). ChIP DNA was then examined by quantitative real-time PCR (ChIP-qPCR). We designed qPCR primers against the region of the construct containing the inserted CTCF binding site (either M1 or M2) and against a region without a CTCF binding site (Fig. S2). Our ChIP-qPCR results showed that CTCF was recruited to both M1 and M2, but not to the negative region. Furthermore, stronger binding signals were observed at M1 than at M2 (Fig. 1K and 1L), suggesting that CTCF prefers to bind M1 rather than M2.
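ChIP-qPCR signals of this kind are commonly expressed as a percentage of input chromatin. The text does not state which normalization was used, so the sketch below shows the standard percent-input calculation purely as an assumption. The Ct values and the 1% input fraction are hypothetical placeholders, not data from the study.

```python
# Illustrative sketch: percent-input normalization for ChIP-qPCR.
# ASSUMPTION: the normalization method is not given in the text; the
# Ct values and the input fraction below are hypothetical.
import math


def percent_input(ct_ip: float, ct_input: float, input_fraction: float) -> float:
    """Return the ChIP signal as a percentage of total input chromatin.

    input_fraction is the share of chromatin set aside as input
    (for example 0.01 for a 1% input sample).
    """
    if not 0 < input_fraction <= 1:
        raise ValueError("input_fraction must be in (0, 1]")
    # Adjust the input Ct to what 100% of the chromatin would have given.
    adjusted_input_ct = ct_input - math.log2(1.0 / input_fraction)
    return 100.0 * 2.0 ** (adjusted_input_ct - ct_ip)


def main() -> None:
    # Hypothetical Ct values for three amplicons, all with a 1% input sample.
    samples = {
        "M1 site": (24.0, 25.0),
        "M2 site": (26.0, 25.0),
        "negative region": (30.0, 25.0),
    }
    for name, (ct_ip, ct_input) in samples.items():
        print(f"{name}: {percent_input(ct_ip, ct_input, 0.01):.3f}% of input")


if __name__ == "__main__":
    main()
```

With these placeholder numbers, a lower IP Ct translates into a higher percent input, which is the sense in which one site is said to give a stronger binding signal than another.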
While CTCF contains 11 zinc finger domains, its binding specificity and affinity can be controlled by a few crucial fingers (Renda et al., 2007). To further determine which zinc fingers actually bind to “TGG”, we used a web server (http://zf.princeton.edu/) that predicts DNA-binding specificities for C2H2-ZF-containing proteins, including CTCF (Persikov and Singh, 2014). Our analyses suggest that CTCF-ZFs 7–8 are critical for binding the M1 “TGG” (the generated sequence logos are shown in Fig. S3).
To identify the positions critical for M2 binding to CTCF, we introduced three point mutations (M2-Mut1, M2-Mut2, and M2-Mut3), selected according to position weight matrix scores, and performed EMSA (Fig. 2G). Our data showed that mutation of any of the selected single bases within M2 abolished the binding of GST-CTCF-ZFs to M2 (Fig. 2H). These results suggest that each selected base is required for the high-affinity interaction of M2 with CTCF.
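Choosing mutation sites from position weight matrix scores can be illustrated with a small scoring routine. The sketch below is a generic log-odds scheme under stated assumptions: the count matrix is a made-up 4-bp toy motif, not the actual M2 matrix, and the uniform background and pseudocount are arbitrary choices. It does not reproduce how M2-Mut1 to M2-Mut3 were actually selected.

```python
# Illustrative sketch: rank single-base substitutions by the drop in
# position weight matrix (PWM) score.
# ASSUMPTION: COUNTS is a hypothetical toy matrix, NOT the real M2 motif.
import math

BASES = "ACGT"

# Hypothetical base counts per motif position (4-bp toy motif).
COUNTS = [
    {"A": 1, "C": 1, "G": 17, "T": 1},
    {"A": 1, "C": 1, "G": 17, "T": 1},
    {"A": 8, "C": 4, "G": 4, "T": 4},
    {"A": 17, "C": 1, "G": 1, "T": 1},
]


def log_odds(counts, background=0.25, pseudocount=1.0):
    """Convert per-position counts to log2 odds against a uniform background."""
    pwm = []
    for column in counts:
        total = sum(column.values()) + pseudocount * len(BASES)
        pwm.append({
            base: math.log2(((column[base] + pseudocount) / total) / background)
            for base in BASES
        })
    return pwm


def score(pwm, site):
    """Sum the log-odds of each base in a site of the same length as the PWM."""
    if len(site) != len(pwm):
        raise ValueError("site length must match PWM length")
    return sum(column[base] for column, base in zip(pwm, site))


def rank_single_mutations(pwm, site):
    """Return (position, new_base, score_drop) tuples, largest drop first."""
    reference = score(pwm, site)
    drops = []
    for i, ref_base in enumerate(site):
        for base in BASES:
            if base != ref_base:
                mutant = site[:i] + base + site[i + 1:]
                drops.append((i, base, reference - score(pwm, mutant)))
    return sorted(drops, key=lambda item: item[2], reverse=True)


if __name__ == "__main__":
    toy_pwm = log_odds(COUNTS)
    for position, base, drop in rank_single_mutations(toy_pwm, "GGAA")[:3]:
        print(f"position {position}: G/A -> {base}, score drop {drop:.2f}")
```

Positions where a substitution causes the largest score drop are the natural candidates for point mutation; in the toy matrix, the weakly conserved third position ranks last.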
To further verify that mutation of M1 or M2 abolishes CTCF binding in vivo, we generated several constructs with mutations within the CTCF binding sites (Fig. S2). In vivo ChIP experiments indicated that mutation of M1 (from “TGG” to “GTT”) significantly decreased the binding of CTCF to M1 (Fig. 2I), and that every single-base mutation within M2 abolished the binding of CTCF to M2 compared with the control (Fig. 2J).
Several studies have reported that the sequence of a transcription factor binding site can play a role in fine-tuning gene expression levels (Kandoth et al., 2013). For example, binding sites may modulate gene expression as a consequence of differences in affinity (Bain et al., 2012), with high-affinity binding sites inducing a higher level of transcriptional activation than low-affinity binding sites. However, the affinity of the different CTCF binding motifs for CTCF-ZFs had not been determined. Here we show that GST-CTCF-ZFs binds its M1 and M2 motifs with different strengths: binding to M1 is much stronger than binding to M2. Importantly, a similar conclusion was obtained from the ChIP experiments (Fig. 1K and 1L).
When we initially incubated GST-CTCF-ZFs with oligos containing only the M1 or M2 core motif, we failed to see a clear shift by EMSA (data not shown). Lobanenkov et al. suggested that additional flanking DNA outside the CTCF recognition motifs is required for tight binding, but that the sequence requirement for this flanking DNA may not be as strict as that for the CCCTC motifs (Lobanenkov et al., 1990). In combination with our results, we expect that the binding of CTCF to M1 and M2 in vitro requires not only the core recognition sequence but also a few base pairs of flanking DNA. Referring to CTCF binding motif probes detected by ChIP-seq (Xiao et al., 2015), we synthesized a 30 bp biotin-labeled double-stranded M1 oligo (5′-CTTTTTGGTGCCCTCTGCTGGCCAGTTTAG-3′) and a 20 bp biotin-labeled double-stranded M2 oligo (5′-CTTTTGGAACTGCAGTTTAG-3′), each including the core motif and additional flanking DNA sequence (5′-CTTTT and GTTTAG-3′), and tested their binding to CTCF-ZFs in both in vitro and in vivo assays.
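The relationship between the probe sequences and their flanks can be checked programmatically. The sketch below uses the two oligo sequences and the two flanking sequences exactly as given above; the helper names are hypothetical, and treating everything between the flanks as the core motif is an assumption made for illustration.

```python
# Illustrative sketch: verify probe lengths and separate the stated flanks
# from the remaining sequence. Sequences are copied from the text; helper
# names are hypothetical, and treating the remainder as "the core motif"
# is an assumption.

M1_OLIGO = "CTTTTTGGTGCCCTCTGCTGGCCAGTTTAG"  # 30 bp probe (top strand)
M2_OLIGO = "CTTTTGGAACTGCAGTTTAG"            # 20 bp probe (top strand)
FLANK_5 = "CTTTT"
FLANK_3 = "GTTTAG"

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def strip_flanks(oligo: str, flank5: str = FLANK_5, flank3: str = FLANK_3) -> str:
    """Remove the 5' and 3' flanks and return the sequence between them."""
    if not (oligo.startswith(flank5) and oligo.endswith(flank3)):
        raise ValueError("oligo does not carry the expected flanks")
    return oligo[len(flank5):len(oligo) - len(flank3)]


if __name__ == "__main__":
    for name, oligo in (("M1", M1_OLIGO), ("M2", M2_OLIGO)):
        core = strip_flanks(oligo)
        print(f"{name}: {len(oligo)} bp probe, {len(core)} bp between flanks")
        print(f"  top strand:    5'-{core}-3'")
        print(f"  bottom strand: 5'-{reverse_complement(core)}-3'")
```

The bottom-strand sequences are included because a double-stranded probe presents both orientations of the motif to the protein.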
CTCF represses cancer cell growth and clonogenicity and has been classified as a candidate tumor suppressor gene (Rasko et al., 2001). Recent studies have identified mutations of human CTCF binding sites in various human cancer types, including Wilms’ tumor and leukaemia (Mullighan et al., 2011). Thus, mutation of CTCF binding sites at specific loci may dysregulate the expression of tumor suppressor genes or oncogenes, thereby contributing to the malignant phenotype (Filippova et al., 2002). In this study, we found that several mutations of CTCF binding sites abolished the binding of CTCF-ZFs to the mutated sites both in vitro and in vivo. Such mutations are likely to exist in the genomes of some cancer types. To extend our knowledge of the roles of all base pairs in CTCF binding sites, a systematic mutational screen covering a variety of mutation patterns may be necessary in the future.
We thank Dr. Gary Felsenfeld at NIH for helpful discussions. This work was supported in part by the National Basic Research Program (973 Program) (Nos. 2015CB964800 and 2016YFA0100400), the National Natural Science Foundation of China (Grant No. 31471210), the Guangdong Frontier and Key Technology Innovation Special Grant (2016B030229006), the Guangdong Natural Science Funds (2015A030308003, 2015A030310041 and 2016A030313168), the Guangzhou Science Technology and Innovation Commission, and the One Hundred Talents Project of the Chinese Academy of Sciences to HY. Dr. Zhibin Wang is supported by NIH/NIEHS R01ES025761. The authors also gratefully acknowledge the support of the Guangzhou Branch of the Supercomputing Center of CAS.
Wufeng Li, Liping Shang, Kaimeng Huang, Jiao Li, Zhibin Wang, and Hongjie Yao declare that they have no conflict of interest. This letter does not contain any studies with human or animal subjects performed by any of the authors.
- Lobanenkov VV, Nicolas RH, Adler VV, Paterson H, Klenova EM, Polotskaja AV, Goodwin GH (1990) A novel sequence-specific DNA binding protein which interacts with three regularly spaced direct repeats of the CCCTC-motif in the 5′-flanking sequence of the chicken c-myc gene. Oncogene 5:1743–1753
- Renda M, Baglivo I, Burgess-Beusse B, Esposito S, Fattorusso R, Felsenfeld G, Pedone PV (2007) Critical DNA binding interactions of the insulator protein CTCF: a small number of zinc fingers mediate strong binding, and a single finger-DNA interaction controls binding at imprinted loci. J Biol Chem 282:33336–33345
Open Access. This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.