Using the switchSENSE technology and EMSAs, we gained first insights into the binding kinetics between the ZnF domain of PRDM9Cst and the DNA sequence found in the mouse recombination hotspot Hlx1B6. Despite the very different binding conditions, the estimated equilibrium dissociation constants were comparable between the two methods, suggesting that both reflect the in vitro binding behavior of the PRDM9Cst-ZnF to DNA. Similar binding affinities in the nanomolar range (K_D ∼ 30 nM) were also reported in a recent in vitro study of a subset of the human PRDM9-ZnF array (Patel et al., 2016). However, reaction conditions such as salt concentration (Patel et al., 2016) have a strong effect on the K_D, as we also observed with the addition of polydIdC; thus, in vitro K_D values might differ from the affinity in the cellular environment.
Nevertheless, our experiments also show that regardless of the reaction conditions used, PRDM9 forms a highly stable, long-lived complex with specific DNA targets within a hotspot. This is the first report on the dissociation kinetics of the PRDM9 ZnF, with dissociation half-lives of ∼9–17 h. The formation of a highly stable and long-lived complex was observed in all switchSENSE measurements and was corroborated qualitatively by an independent EMSA experiment.
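To put the reported dissociation half-lives on the same scale as a rate constant, the following minimal sketch converts a half-life into a first-order dissociation rate constant, assuming simple single-exponential dissociation (k_off = ln 2 / t1/2). The function name is hypothetical and only the 9 h and 17 h bounds are taken from the text.

```python
import math

def koff_from_half_life(t_half_hours):
    """Return a first-order dissociation rate constant (1/s) for a half-life given in hours.

    Assumes single-exponential dissociation, so that k_off = ln(2) / t_half.
    """
    t_half_seconds = t_half_hours * 3600.0
    return math.log(2) / t_half_seconds

# The two bounds of the half-life range reported in the text (9 h and 17 h).
for t_half in (9.0, 17.0):
    print(f"t1/2 = {t_half:4.1f} h -> k_off = {koff_from_half_life(t_half):.2e} 1/s")
```

Under this assumption, half-lives of 9–17 h correspond to dissociation rate constants on the order of 1e-5 to 2e-5 per second.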
An important structural element found in most C2H2-type ZnF domains is the linker connecting two neighboring ZnFs, an amino acid sequence known as the threonine, glycine, glutamate, lysine, and proline (TGEKP) linker, located between the last histidine of one ZnF and the first conserved aromatic amino acid of the next (Laity et al., 2001, Wolfe et al., 2000). It acts as a spacer between the ZnFs along the DNA (Wolfe et al., 2000) and is highly conserved in PRDM9 across divergent species, such as mice and humans. These TGEKP linkers play an important role in the stability of the DNA-protein complex of ZnF proteins. Based on NMR chemical shift data, Laity et al. showed that the ZnF protein TFIIIA undergoes a conformational change in the TGEKP linkers during the transition from a non-specific binding mode to a sequence-specific interaction (Laity et al., 2000). During non-specific binding, the TGEKP linkers are rather flexible, enabling the protein to search for its target sequence motifs. Once sequence-specific interactions have occurred, the ZnF repeats are brought into their correct relative orientations and the TGEKP linkers undergo a conformational change, which can be thought of as a “snap lock” that clicks the ZnFs into place in the major groove of the DNA, thereby stabilizing the ZnF-DNA complex mediated by hydrogen bonds and van der Waals forces (Laity et al., 2000). A similar process might also occur in PRDM9, explaining the formation of a highly stable complex between sequence-specific target DNA and the ZnF domain of PRDM9.
Whether all ZnFs in the PRDM9 domain undergo this “snap-lock” conformational change is unknown. Usually, multi-ZnF proteins contact DNA in units of two to three successive ZnFs (Iuchi, 2005). However, in our chimera experiments, replacing the DNA target site with unspecific sequences from either the 5′ or the 3′ end reduced the binding of PRDM9, suggesting that the stability of the PRDM9-DNA complex decreases as fewer ZnFs are involved in the specific binding (Fig. 4). We also showed that a subset of the ZnFs (minimum five ZnFs) already confers binding specificity, albeit with reduced affinity compared to the entire ZnF array. Thus, a conformational change of all ZnFs in the array might not be necessary.
The recognition and specific binding of PRDM9Cst to different sequence subsets might be a unique feature of ZnF domains with many ZnFs, like the one found in PRDM9 with 10 or more ZnFs. Interestingly, DNA motifs enriched at recombination hotspots observed in available high-resolution maps also represent only a subset of the sequence recognized by the complete PRDM9-ZnF array. The order and identity of the ZnF determine which motif is enriched. Motifs specific for the murine PRDM99R (B10.S-H2t4/(9R)/J strain) or PRDM9Dom2 (C57BL/6J strain) (Brick et al., 2012, Walker et al., 2015) show a preference for the N-terminal region, while motifs for the murine PRDM913R (B10.F-H2pb1/(13R) J strain) (Brick et al., 2012) and human PRDM9A (Myers et al., 2010) variants have a preference for the C-terminal region of the ZnF array. Thus, it is possible that only the motif recognized by the slightly more specific ZnF subset is enriched at recombination hotspots.
Not all PRDM9 ZnFs confer binding specificity; interactions between amino acids and the DNA phosphate backbone are also relevant for complex formation or stability (Patel et al., 2016, Billings et al., 2013). Our data suggest that different sequences contacting ZnF1, ZnF2, ZnF10, and ZnF11 of PRDM9Cst do not change the binding affinity. This was also observed for PRDM9Cst ZnF10 and ZnF11 with single nucleotide substitutions in the target site (Billings et al., 2013) and for ZnF1 and ZnF12 in the analysis of sequence motifs preferentially bound by murine PRDM9B6 (=PRDM9Dom2) (Walker et al., 2015). This is also congruent with the comparison of different PRDM9 variants in several primate species, which showed that ZnFs located at the amino- or carboxy-terminal ends are more conserved (Schwartz et al., 2014), suggesting that these fingers might have a more universal role and contribute little to the binding of specific sequences.
In our EMSA data, the PRDM9-ZnF domain showed a gradually increased binding for longer DNA sequences (e.g., 75 vs. 39 and 34 bp, respectively), an unexpected result given that the 11 fingers of the PRDM9Cst-ZnF bind 34 bp. Thus, our data suggest that more nucleotides are used for the interaction than required by the number of ZnFs. There is evidence from cell-line and immunoprecipitation experiments that PRDM9 forms a multimer (Baker et al., 2015). Whether the multimerization also occurs only with the ZnF domain is not known, but if this is the case, a ZnF domain formed by several units might also explain our observations that the interaction with longer DNA sequences is important for the binding stability of the complex. As for the functional importance of this phenomenon, it has been proposed for other DNA-binding proteins that the accelerated target localization happens via a one-dimensional (1D) search mode during which the protein slides along the DNA (reviewed in von Hippel and Berg (1989)). This 1D diffusion, performed as a (1) sliding and (2) intersegmental transfer, can be viewed as a random walk while the protein is in the non-specifically bound state (Berg et al. (1981) and von Hippel and Berg (1989) and also reviewed in Halford and Marko (2004), Mirny et al. (2009), and Zandarashvili et al. (2012)). This type of interaction has also been described for the transcription factor Egr-1, which uses only two of its three ZnF domains during the rapid 1D-search mode and then undergoes a conformational transition in the recognition mode, where all three ZnF domains confer DNA binding (Zandarashvili et al., 2012). Based on our data, it is conceivable that the ZnF domain of PRDM9 also initially scans the DNA by sliding along the non-specifically bound DNA coupled with intersegmental transfer between nucleosomes. 
A specific target encounter matching the ZnF domain could then induce a conformational change of the ZnF domain via a snap-lock action of the TGEKP linkers, which leads to the formation of a highly stable PRDM9-DNA complex with a half-life of many hours.
Whether the highly stable, long-lived complex has biological relevance is not known. However, the slow dissociation of PRDM9 from DNA would allow the PRDM9-DNA interaction to persist all the way from the first target recognition until the encounter and activation of the recombination initiation machinery that introduces DSBs. In mouse spermatocytes, PRDM9 is expressed from pre-leptotene to mid-zygotene, a period of roughly 48 h (Sun et al., 2015). During these stages of meiosis, the chromatin undergoes substantial changes, starting from a rather diffuse interphase conformation, followed by gradual condensation of chromosomes during the leptotene and zygotene stages, during which the lateral and axial elements of the synaptonemal complex also form, until full synapsis of the chromosomes is reached in the pachytene stage. Given that PRDM9 is already expressed before the start of prophase I (pre-leptotene), it is conceivable that it binds when the genomic DNA is in a fairly open state during pre-leptotene and stays bound during the entire process of leptonemal loop-axis formation.
One constraint of our kinetic data is that we assessed the binding of PRDM9 to naked DNA (not nucleosomal DNA). Hence, the association and dissociation rates under physiological conditions could differ from our in vitro results. It is not known whether PRDM9 can bind nucleosomal DNA or whether it requires the help of chromatin remodeling factors that remove the nucleosomes to expose a stretch of naked DNA to PRDM9. Studies with engineered transcription factors indicate that the accessibility of the DNA is impaired by the packaging of DNA within nucleosomes (Collingwood et al., 1999). Furthermore, it has been shown recently that the tightness of nucleosomal packaging (i.e., whether the chromatin adopts an open or closed conformation) affects hotspot activation by PRDM9 in mice (Walker et al., 2015). A comparison of the hotspot usage with the chromatin state at the hotspot in B6 mice, determined by DMC1-ChIP-Seq data (Brick et al., 2012) and H3K4me3-ChIP-Seq (Baker et al., 2014), respectively, showed that hotspot usage is increased in actively transcribed genes and decreased in closed chromatin (H3K9me2/me3 or constant lamina-associated domains, cLADs) (Walker et al., 2015). However, high-affinity targets of PRDM9 (determined by in vitro Affinity-Seq) correlated highly with hotspot usage regardless of the initial chromatin state (Walker et al., 2015). It is still unclear if and how PRDM9 gets access to closed chromatin regions. One possibility is that it accesses only open chromatin to begin with. Alternatively, it could act in concert with chromatin remodeling factors that displace nucleosomes in an ATP-dependent manner. Another possibility is that it gains access to these regions through spontaneous exposure of nucleosomal DNA by partial unwrapping and then remains bound at loci with high-affinity sequences, thereby allowing passive access to otherwise hidden target sites (Li et al., 2005, Li and Widom, 2004).
Furthermore, it has been suggested that site-specific DNA-binding proteins, once they have gained access to a previously buried DNA sequence, may recruit chromatin remodeling complexes, which subsequently can move or disassemble that nucleosome, thereby allowing a tighter interaction with the site-specific binding protein (Li and Widom, 2004). These models are in line with recent reports of mammalian recombination hotspots, which exhibit nucleosome-depleted regions (NDRs) around predicted PRDM9 binding motifs at the hotspot center (Baker et al., 2014, Lange et al., 2016).
So far, the molecular mechanism of how PRDM9 specifies hotspots is not fully understood. H3K4me3 is necessary for the formation of DSBs during meiosis (Acquaviva et al., 2012, Sommermeyer et al., 2013), and it has been demonstrated that PRDM9 predominantly places H3K4me3 marks via its PR/SET domain next to its binding site targeted for DSB (Baker et al., 2014, Brick et al., 2012, Grey et al., 2011). However, an H3K4me3 mark is not sufficient to initiate recombination, and it has been demonstrated that PRDM9 directs DSBs away from H3K4me3 promoter regions (Brick et al., 2012). The recruitment of the recombination machinery to specific DNA regions is quite complex and also depends on larger structural chromosomal components. During the chromatin compaction occurring in prophase I, only a sequence located in the loop during the loop-axis formation in leptotene becomes a DSB target. A key aspect of DSB formation is that the NDR in the axis gets tethered by components placed on the axis (Blat et al., 2002, Ito et al., 2014, Acquaviva et al., 2012, Sommermeyer et al., 2013). The role that PRDM9 plays in this process has not been completely elucidated, but recent evidence has shown that PRDM9 is necessary to tether the DNA in the loop to the axis via helper proteins bound to the KRAB domain (Parvanov et al., 2016). How a long-lived PRDM9-DNA complex contributes to the initiation of recombination is open for debate, but it is possible that the constant activity of the PR/SET domain or other epigenetic modifiers of a long-lived complex prevents a target from being packed and hidden in the axial structure. Alternatively, PRDM9 could actively drive the placement of NDR/H3K4me3 chromatin regions in a loop during the highly dynamic chromatin condensation processes occurring in prophase I.
Finally, in light of the recent evidence about the role of PRDM9 in recruiting the recombination initiation machinery (Parvanov et al., 2016), a long-lived PRDM9-DNA complex might be important to stabilize the recombination initiation machinery to specific DNA targets such that the PRDM9-DNA complex can be moved from the loop to the axis.
There are still many open questions about the mode of action of PRDM9, but a simple recognition of DNA motifs by PRDM9 does not explain hotspot usage. First, the specific recognition by PRDM9 can be conferred by different subsets of the ZnF domain, explaining the plasticity of this protein for a variety of different targets, although a certain motif is enriched at hotspots by the binding of a predominant ZnF subset. Second, hotspot usage is evidently linked to factors creating NDRs and, more importantly, the placement of these regions in the loop versus the axis during the compaction of the chromatin in meiotic prophase I. The slow dissociation rate of the PRDM9 complex from highly specific sequences might play a role during this process. How PRDM9 specifies hotspots will be better understood with further binding studies investigating chromatin accessibility, PRDM9 recruitment, and the role of the other PRDM9 domains and their kinetics.
DNA fragments were either produced by PCR using biotinylated or unmodified primers or purchased as synthetic fragments with the necessary modifications. Details are shown in Supplementary_Methods.
Cloning and expression of PRDM9Cst-ZnF
The coding sequence of PRDM9Cst in the form of a pBAD expression construct (kindly provided by the Petkov Lab, Center for Genome Dynamics, the Jackson Laboratory, Bar Harbor, ME 04609, USA) was used to clone the Prdm9Cst gene into several different expression systems: the pT7-IRES-MycN vector for cell-free in vitro expression, the pFB12 vector for insect cell expression, pEYFP-C1 for mammalian cell expression, the pGEX-6P2 vector as an alternative system for bacterial expression with a GST tag for enhanced solubility, and the pOPIN-M vector system, which can be used for bacterial, mammalian, and insect cell expression and contains the MBP for enhanced solubility. We tested which system was most suitable to express PRDM9Cst (data not shown) and finally chose the pOPIN vector system in combination with a specific Escherichia coli expression strain (see Supplementary_Methods “Recombinant expression of PRDM9Cst in bacterial cells and lysate preparation”) that gave the best yields of soluble recombinant PRDM9. The cloning processes are described in detail in the Supplementary_Methods, covering (a) the cloning of PRDM9Cst-ZnF (encoded by exon 10) in the pOPIN-M vector using the Gibson Assembly™ cloning kit (NEB), (b) the excision of YFP from the pOPIN-M construct, (c) the cloning of the PRDM9Cst (full-length) and PRDM9Cst-ZnF constructs in the in vitro expression vector pT7-IRES-MycN, and (d) the introduction of a His-YFP-tag into the pT7-IRES-MycN constructs. A detailed description of the protein lysate preparation can also be found in the Supplementary_Methods. In summary, we used the following constructs of Prdm9CstExon10 (ZnF domain) or Prdm9Cst full length including several tags, such as a His-tag, the maltose binding protein (MBP), or eYFP (for further details see Supplementary_Methods):
His-MBP-eYFP-PRDM9CstExon10 in pOPIN-M vector (bacterial expression)
His-MBP-PRDM9CstExon10 in pOPIN-M vector (bacterial expression)
His-eYFP-PRDM9Cst (full-length) in pT7-IRES-MycN vector (in vitro expression system)
His-eYFP-PRDM9CstExon10 in pT7-IRES-MycN vector (in vitro expression system)
PRDM9CstExon10 in pT7-IRES-MycN vector (in vitro expression system)
Electrophoretic mobility shift assays
The EMSA reactions and incubation times varied depending on the experiment but followed the general protocol outlined below. Details of each EMSA reaction setup are described in the Supplementary_Methods: (a) EMSA protein titrations, (b) EMSA competition assay, (c) EMSA experiments with chimera fragments, (d) EMSA simultaneous hot and cold DNA competition assay, and (e) EMSA time course.
General EMSA protocol
The 5% polyacrylamide gels were pre-run in 0.5× TBE buffer (44.5 mM Tris base, 44.5 mM boric acid, 1 mM EDTA, pH 8.0) at 100 V for 30 min before loading the EMSA reactions. The EMSA reactions were supplemented with 4 μl of 6× EMSA loading dye (15% glycerol, 0.03% bromophenol blue, 0.03% xylene cyanol FF, 44.5 mM Tris base, 44.5 mM boric acid, 1 mM EDTA, pH 8.0) and loaded onto the polyacrylamide gels. The gels were run for 45 min at 100 V, followed by electrophoretic transfer to a Zeta-Probe nylon membrane (Bio-Rad) at 100 V (constant voltage) for 80 min. Then, the DNA was crosslinked to the nylon membrane using a UV crosslinker (CX-2000, UVP) at 600 mJ/cm². Afterwards, unspecific binding sites on the membrane were blocked using 1% w/v (weight per volume) casein (Hammarsten grade, AppliChem) in 1× TBS buffer (25 mM Tris base, 137 mM NaCl, 2.7 mM KCl, pH 7.4) by a 15 min incubation at ∼22 °C with shaking. For the detection of biotinylated DNA, the membrane was incubated for 15 min at ∼22 °C with shaking in Pierce Streptavidin-Horseradish Peroxidase Conjugate (Thermo Scientific) diluted in blocking buffer (1% w/v casein (Hammarsten grade, AppliChem) in 1× TBS buffer) to a concentration of 33.35 μg/ml. Next, the membrane was washed 4× for 5 min at 22 °C on a shaker with wash buffer (300 mM Tris base, 200 mM NaCl, 0.5% SDS, pH 8) and equilibrated for chemiluminescent detection in 300 mM Tris, pH 8, for another 5 min at 22 °C with shaking. Finally, the membrane was carefully transferred to a paper towel using forceps to remove residual liquid from the membrane edges before the addition of Super Signal West Femto Maximum Sensitivity Substrate (Thermo Fisher) mixed in a 1:1 ratio. The chemiluminescent reaction was allowed to proceed for 5 min, and the results were then recorded using the ChemiDoc™ MP imager (Bio-Rad) with the blot setting “Chemi Hi Sensitivity.”
Image analysis was performed using the Image Lab software (Bio-Rad). The lanes and bands were defined manually; the pixel intensities and values for fraction bound (%) were then quantified and analyzed further using OriginPro8.5 (Origin Lab).
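As an illustration of how a fraction-bound value can be derived from band intensities, the sketch below uses the common convention of dividing the shifted (bound) band signal by the total signal of the lane. This formula, the function name, and the example intensities are assumptions made for illustration; the exact calculation performed in Image Lab is not specified here.

```python
def fraction_bound_percent(bound_intensity, free_intensity):
    """Percent of labeled DNA found in the shifted (protein-bound) band of one lane.

    Assumed convention: bound / (bound + free) * 100, using background-corrected
    pixel intensities of the bound and free DNA bands.
    """
    total = bound_intensity + free_intensity
    if total <= 0:
        raise ValueError("lane has no signal above background")
    return 100.0 * bound_intensity / total

# Hypothetical background-corrected intensities (arbitrary units): (bound, free) per lane.
lanes = [(0.0, 1000.0), (250.0, 750.0), (800.0, 200.0)]
for bound, free in lanes:
    print(f"{fraction_bound_percent(bound, free):5.1f} % bound")
```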
We tested differences in binding trends (Fig. 5) with a generalized least squares model using a likelihood ratio test that takes non-homogeneous variances and auto-correlation into account, and adjusted the resulting p-values with a Bonferroni correction. A detailed description of the analysis and results can be found in the Supplementary_Statistical_Analysis.
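The Bonferroni step can be sketched as follows: each raw p-value is multiplied by the number of comparisons and capped at 1. This is a minimal illustration of the correction only, not of the generalized least squares model or the likelihood ratio test; the function name and the p-values are hypothetical.

```python
def bonferroni_adjust(p_values):
    """Return Bonferroni-adjusted p-values: p * m, capped at 1.0, for m comparisons."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]

# Hypothetical raw p-values from three pairwise comparisons of binding trends.
raw = [0.01, 0.04, 0.5]
for p, p_adj in zip(raw, bonferroni_adjust(raw)):
    print(f"raw p = {p:.3f} -> adjusted p = {p_adj:.3f}")
```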
Protein lysate, DNA, and buffers
For the switchSENSE measurements, recombinant PRDM9Cst-ZnF was produced using the His-MBP-PRDM9Cst pOPIN-M construct without YFP (see cloning procedure in Supplementary_Methods) expressed in E. coli Rosetta™2(DE3) pLacI. The PRDM9 concentration in the crude lysate (see Lysate preparation for His-MBP-PRDM9Cst-ZnF-without YFP in Supplementary_Methods and Supplementary_Fig_S1, panel A, lane 2) was estimated by Capillary Western (Supplementary_Fig_S1, panel B). For the target DNA sequences, we used 48 bp double-stranded synthetic fragments (Hlx1B6 and usDNA; Supplementary_Table_S1, panel C, switchSENSE) that carry a thiol modification on the 5′ end and a fluorescent dye on the 3′ end of the forward strand and no modification on the reverse strand. In experiments 2 and 3 (low polydIdC), the running and sample buffer consisted of 10 mM Tris (pH 7.5), 50 mM KCl, 0.05% NP40, and 50 μM ZnCl2; in experiment 1 (high polydIdC), this buffer was supplemented with 50 ng/μl polydIdC.
Instrument, chip and DNA layer preparation, regeneration process, and flow rates
All switchSENSE measurements were performed on a DRX2400 instrument using custom-made sensor chips (both Dynamic Biosensors GmbH; Planegg, Germany). On the respective sensor chip, different sensor spots in one flow channel were functionalized with either single-stranded Hlx1B6 or usDNA by a sulfur-gold bond at the 5′ end of the DNA. For the detection of the switching motion, the DNA molecules were modified with a fluorescent dye at the 3′ end. The single-stranded DNA probes were hybridized to the respective complementary sequence on chip at 45 °C to resolve potential secondary structures. The hybridizations of usDNA and Hlx1B6 were carried out separately by sequential incubation with 200 nM usDNA reverse followed by 200 nM Hlx1B6 reverse. In each case, successful hybridization was monitored by real-time observation of the switching amplitude on either a usDNA- or Hlx1B6-modified electrode. After functionalization with the respective DNA, one sensor spot contains about one million DNA molecules. For complete chip regeneration, the electrodes were treated with regeneration solution (Dynamic Biosensors GmbH; Planegg, Germany) to remove the complementary DNA strands and potentially bound proteins. The remaining single-stranded DNA was freshly hybridized as described above. Intermittent kinetic measurements on two differently functionalized sensor spots allowed the parallel determination of the binding kinetics of PRDM9 to both DNA sequences. All association and dissociation experiments were performed at a pump rate of 5 μl/min.
Data normalization and analysis
The kinetic data (dynamic response upwards 0–4 μs, which corresponds to the nanolever’s switching speed during the first 4 μs of the upward motion; for details see Langer et al., 2013) were grouped and exported from the switchANALYSIS software (Dynamic Biosensors GmbH), and the association start values of the different concentrations were normalized to 100%. The respective dissociation data were normalized accordingly. The normalized dynamic response was plotted against time using the OriginPro8.5 software (Origin Lab), and binding kinetics were analyzed by single or global exponential fits using the following equations for association and dissociation, respectively. The association rate constant ($k_{on}$) was obtained from $y = y_0 + A \exp(-(x - x_0)(c\,k_{on} + k_{off}))$ and the dissociation rate constant ($k_{off}$) from $y = y_0 + A \exp(-(x - x_0)/t)$, with the dependency $k_{off} = 1/t$. Here, $x_0$ and $y_0$ are the offsets from the respective axis, $A$ is the fit amplitude, and $c$ is the PRDM9 concentration. The equilibrium dissociation constant ($K_D$) was then derived as $K_D = k_{off}/k_{on}$. The error of the equilibrium dissociation constant ($\Delta K_D$) was calculated based on Gaussian error propagation: $\Delta K_D = \sqrt{\left(\frac{1}{k_{on}}\,\Delta k_{off}\right)^2 + \left(\frac{k_{off}}{k_{on}^2}\,\Delta k_{on}\right)^2}$.
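The relationships between the fitted rate constants can be sketched in a few lines. The example below assumes the pseudo-first-order form of the association fit, in which the observed rate equals c·k_on + k_off, takes K_D as k_off/k_on, and propagates the uncertainties with standard Gaussian error propagation. All function names and numerical values are hypothetical and chosen only to illustrate the arithmetic; they are not measured results.

```python
import math

def k_on_from_k_obs(k_obs, k_off, conc):
    """Association rate constant (1/(M*s)) from an observed association rate.

    Assumes k_obs = conc * k_on + k_off (pseudo-first-order binding).
    """
    return (k_obs - k_off) / conc

def dissociation_constant(k_on, k_off):
    """Equilibrium dissociation constant K_D = k_off / k_on (in M)."""
    return k_off / k_on

def kd_error(k_on, k_off, dk_on, dk_off):
    """Gaussian error propagation for K_D = k_off / k_on."""
    term_off = dk_off / k_on              # |dK_D/dk_off| * dk_off
    term_on = k_off / k_on**2 * dk_on     # |dK_D/dk_on| * dk_on
    return math.sqrt(term_off**2 + term_on**2)

# Hypothetical inputs: observed rate at 50 nM protein and a fitted k_off.
k_off = 2e-5                              # 1/s
k_on = k_on_from_k_obs(k_obs=7e-5, k_off=k_off, conc=50e-9)
kd = dissociation_constant(k_on, k_off)
dkd = kd_error(k_on, k_off, dk_on=1e2, dk_off=2e-6)
print(f"k_on = {k_on:.3g} 1/(M*s), K_D = {kd * 1e9:.3g} nM, dK_D = {dkd * 1e9:.3g} nM")
```

In practice the rate constants and their uncertainties would come from the exponential fits themselves; the sketch only covers the steps downstream of the fits.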