Introduction

DNA polymerases (Pols) are enzymes that add free nucleotides to the 3' end of a newly forming DNA strand. They play an essential role in the maintenance of genome integrity. On the basis of sequence similarity, DNA Pols can be broadly classified into the A-, B-, C-, D-, X- and Y-families [1–3]. In general, most Pols in the A-, B-, C- and D-families are high-fidelity enzymes primarily involved in faithful DNA replication and in the repair of replication mistakes. The X-family Pols are involved in a number of DNA repair processes such as base excision repair (BER) and repair of double-strand breaks (DSBs) [4, 5]. The Y-family Pols are a group of recently identified Pols characterized by low-fidelity synthesis on undamaged DNA and by the ability to bypass DNA lesions that normally block replication by members of the A-, B-, C-, D- or X-families [6–12].

The Y-family Pols are ubiquitous and are distributed among the three kingdoms of life. They include E. coli Pol IV (also known as DinB) [13] and Pol V (also known as UmuC) [14, 15], yeast Pol η [16] and Rev1 [17], human Pols η [18], ι [19, 20], κ [21] and Rev1 [22], and archaeal Dbh [23] and Dpo4 [24]. Although they share no detectable sequence identity with Pols of other families, available crystal structures of some Y-family Pols such as Dbh [25–27], Dpo4 [28], Pol η [29, 30], Pol ι [31, 32], Pol κ [33, 34] and Rev1 [35] reveal that they retain a catalytic core consisting of the fingers, palm and thumb subdomains found in Pols of other families. However, the fingers and thumb subdomains of the Y-family Pols are significantly smaller than the corresponding subdomains of the other DNA Pols. In addition to the conserved polymerase core, the Y-family Pols also possess a unique C-terminal domain termed the little finger (LF), wrist or polymerase-associated domain (PAD). In this paper we use the acronym LF to denote this domain. The LF domain is the least conserved of the four domains in the Y-family Pols.

Besides the intensive structural studies of the Y-family Pols, which include the structures in apo, binary and ternary forms as well as the structures complexed with DNA substrates containing different lesions [11, 25–42], a variety of biochemical assays have provided insight into the catalytic mechanism, lesion-bypass properties, processivity and fidelity of the Pols [12, 43–55]. Both the biochemical and single-molecule assays for Dpo4 indicated that the binding of a nucleotide induces a fast DNA translocation event [55, 56]. This is consistent with the structural studies showing that, in both of the binary complexes (pre- and post-insertion), the primer terminus occupies the site where the next incoming nucleotide will bind [28, 41, 42]. However, the structural studies for Dbh showed that, in the pre-insertion binary complex, the templating base and the primer terminus are already positioned so that space is available for the incoming nucleotide to bind and form the ternary complex, while in the post-insertion binary complex, the DNA is located in nearly the same position on the Pol [27]. Similar to Dbh, two pre-insertion binary complexes of Pol ι showed that space is available for the incoming nucleotide to bind [32].

Recently, a Brownian ratchet model has been proposed for the translocation of the high-fidelity replicative DNA Pols, in which the translocation depends on the change in the interaction of the fingers subdomain with the single-stranded DNA (ssDNA) template upon a correct incorporation [57, 58]. In this work, based on the available structural, biochemical and single-molecule studies of the Y-family Pols, we modify the previous Brownian ratchet model for the replicative Pol so that it applies to the Y-family Pol, in which the directed translocation is rectified by nucleotide binding. The model can thus be called the nucleotide binding rectification (NBR) Brownian ratchet model, abbreviated as the NBR model. Using the model, the different features observed in the structures of Dpo4, Dbh and Pol ι in binary and ternary forms [27, 28, 32, 41, 42] can be readily explained. Other dynamic properties of the Y-family Pols, such as the considerable variation in processivity among the Pols and the fast translocation event upon dNTP binding for Dpo4, can also be explained by the model. In addition, some predicted results for the DNA synthesis rate versus the external force acting on the Dpo4 and Dbh Pols are presented. Moreover, we compare the effect of the external force on the DNA synthesis rate of the Y-family Pol with that of the replicative Pol.

Methods

Brownian ratchet translocation model for replicative DNA Pols

Since the NBR model for the Y-family Pol is modified from the previous model for the replicative Pol [57–59], we first restate the latter model in this section for the reader's convenience. Briefly, the model is based on the Brownian ratchet mechanism (e.g., see [60, 61]), and the directed translocation of the Pol along the template results from the potential change induced by dNTP incorporation. The model is built on two arguments.

The first argument concerns the interaction between the Pol and the DNA substrate. The interaction can be characterized by two DNA-binding sites on the Pol. (i) The binding site S1, which is located in the fingers subdomain (see Figure 1a), shows a high affinity for the unpaired base and/or the sugar-phosphate backbone of the ssDNA template. The presence of binding site S1 is supported by the experimental data on bacteriophage T4 DNA Pol and Klenow fragment, showing that the fingers subdomain has a high binding affinity for the ssDNA template [62–64]. (ii) The binding site S2, which is located in the palm and thumb subdomains (see Figure 1a), shows a high affinity for the double-stranded DNA (dsDNA).

Figure 1

Schematic illustrations of the translocation model for replicative DNA Pols (see text for detailed description). The green circles in (a), (c) and (b') denote open fingers while the green ellipses in (b) and (a') denote closed fingers.

The second argument is on the rotation of the fingers subdomain from open (closed) to closed (open) conformation upon the binding (release) of dNTP (pyrophosphate, PPi), which is consistent with the structural studies on bacteriophage T7 DNA Pol [65], Taq DNA Pol [66] and HIV-1 reverse transcriptase [67]. The closed conformation of the fingers activates the phosphodiester bond formation (or nucleotide incorporation), while the open conformation of the fingers opens the polymerase active site for nucleotide binding. Moreover, the closed fingers could potentially enhance the interactions of binding sites S1 and S2 with the DNA substrate.

Based on the two arguments, the translocation model for the replicative DNA Pol is schematically shown in Figure 1 [57–59]. We begin with the binding site S1 of the Pol binding strongly to the ssDNA at the replication fork, with the binding site S2 binding to the dsDNA and no nucleotide in the polymerase active site (Figure 1a). In this nucleotide-free state, either a matched or a mismatched dNTP can bind to the active site, although the matched dNTP has a much larger probability of binding. We therefore consider the two cases separately. (i) First, consider a correct incorporation. The binding of a matched dNTP induces the fingers to rotate from the open to the closed conformation (Figure 1b). The closed conformation activates nucleotide incorporation. After the incorporation, the release of PPi induces the fingers to return to the open conformation. At the same time, the binding site S1 binds to the new nearest unpaired base (i.e., the next unpaired base) of the ssDNA template, because the previous unpaired base to which the binding site S1 was bound has disappeared due to base pairing (Figure 1c). Then, the next nucleotide-incorporation cycle proceeds. (ii) Second, consider an incorrect incorporation. We again begin with Figure 1a. The binding of a mismatched dNTP also induces the fingers to rotate from the open to the closed conformation, activating nucleotide incorporation (Figure 1a'). After the incorporation, the release of PPi induces the fingers to return to the open conformation. Now, although the sugar-phosphate backbone of the mismatched dNTP has been connected to the backbone of the already formed dsDNA, the mismatched base is not paired with the sterically corresponding base on the ssDNA template. Thus, the binding site S1 is still bound strongly to the same unpaired base of the ssDNA template (Figure 1b'), and the polymerization cannot proceed. In other words, the polymerization becomes stalled. In Figure 1b', after the mismatched base is excised, the polymerization proceeds again (Figure 1a).

Using the potentials of the two binding sites interacting with the DNA substrate, we describe the model as follows. First, consider the potential, V1(x), of the binding site S1 interacting with ssDNA, where the position, x, of the Pol along the template is represented by that of its active site. Considering that the binding site S1 covers N1 bases on the ssDNA template, before the incorporation of the nucleotide paired with the (n+1)th base (top of Figure 2a), V1(x) has the form shown in Figure 2a, where E1 is the binding affinity for N1 bases of the ssDNA template while E'1 is the binding affinity for (N1-1) bases. Note that E'1, which corresponds to binding (N1-1) bases, is smaller than E1, which corresponds to binding N1 bases. Moreover, the potential implies that the primer 3' terminus, owing to structural restriction, is not allowed to move forwards relative to the Pol when its active site is located at the primer 3' terminus. Similarly, considering that the binding site S2 covers N2 base pairs of dsDNA, before the incorporation of the nucleotide paired with the (n+1)th base (top of Figure 2a), the potential, V2(x), of binding site S2 interacting with dsDNA is shown in Figure 2a. From Figure 2a, it is seen that, before the incorporation of the nucleotide paired with the (n+1)th base, the deepest well of the total potential, V(x) = V1(x) + V2(x), of the Pol interacting with the DNA substrate is located at the position of the (n+1)th base. Thus, the Pol is located at the position of the (n+1)th base. After the incorporation (top of Figure 2b), the forms of V1(x) and V2(x) are shown in Figure 2b. Now, the deepest well of the total potential, V(x) = V1(x) + V2(x), is located at the position of the (n+2)th base. Thus, the Pol moves from a shallower potential well located at the position of the (n+1)th base to the deepest well located at the position of the (n+2)th base.

Figure 2

Illustrations of the translocation model for replicative DNA Pols by using interaction potentials of binding sites S1 and S2 with ssDNA and dsDNA segments, respectively, of a DNA substrate. (a) Top diagram shows the DNA substrate before the incorporation of the nucleotide paired with the (n+1)th base on the template. Potential V1(x) describes the interaction of the binding site S1 with the ssDNA segment, while potential V2(x) describes the interaction of the binding site S2 with the dsDNA segment. (b) The DNA substrate and potentials V1(x) and V2(x) after the incorporation of the nucleotide paired with the (n+1)th base on the template. (a') The DNA substrate and the potentials V1(x) and V2(x) before the incorporation of an incorrect nucleotide opposite to the (n+1)th base on the template, which is the same as (a). (b') The DNA substrate and potentials V1(x) and V2(x) after the incorporation of an incorrect nucleotide opposite to the (n+1)th base on the template.

However, after an incorrect incorporation of the nucleotide opposite to the (n+1)th base (see top of Figure 2b'), the forms of V1(x) and V2(x) are shown in Figure 2b', which are the same as those before the incorporation. This is because, after the incorrect incorporation, the (n+1)th base has not formed a base pair with the newly incorporated primer base and, thus, the Pol is still located at the position of the (n+1)th base, i.e., the position of the deepest well.

In this model, the translocation step occurs following the incorporation of a correct nucleotide. This is supported by the comparison of the binary (Pol-DNA) with the ternary (Pol-DNA-dNTP) structures for the replicative Pol (see, e.g., [66]). Upon an incorrect incorporation, the Pol becomes stalled, which is also consistent with the experimental data [68]. For a lesion such as an abasic lesion, which only weakly distorts the DNA structure so that the damaged base still has a high affinity for the binding site S1, an incorporated base opposite to the lesion is equivalent to a mismatched base and also induces stalling of the polymerization. This is consistent with the structural observation [69]. During the stalled period, the mismatched base would be excised. Then another base opposite to the lesion site would be incorporated. Thus, the Pol cannot perform translesion synthesis. Lesions that severely distort the DNA structure, so that the damaged DNA substrate is not tolerated by the replicative Pol (e.g., with the template base being flipped out of the active site), would preclude closing of the fingers subdomain upon nucleotide binding, as observed by Li et al. [70] for bacteriophage T7 DNA Pol complexed with a DNA template containing a cis-syn cyclobutane pyrimidine dimer. Without activation by the closed conformation, the nucleotide incorporation cannot proceed and, thus, the Pol also cannot perform translesion synthesis.

Nucleotide binding rectification Brownian ratchet model for Y-family DNA Pols

The NBR model for the Y-family DNA Pol is modified from the above model for the replicative Pol. The model is also constructed based on two arguments, which are presented in the following two sections.

Interaction of Pol with DNA substrate

As in the replicative Pol (see above), the interaction of the Y-family Pol with the DNA substrate can also be characterized by two DNA-binding sites on the Pol. The binding site S1 is composed of residues located in the fingers subdomain (see Figure 3a or 4a). However, in contrast to the replicative Pol where the binding site S1 has a high affinity for the unpaired bases and/or the sugar-phosphate backbone of the ssDNA template, the binding site S1 in the Y-family Pol has a very low or even no affinity, which is consistent with the available structural studies [27, 28, 32, 41, 42]. The binding site S2, which is composed of residues located in the thumb domain and mainly in the LF domain (see Figure 3a or 4a), has a high affinity for dsDNA, which is also consistent with the available structural studies [27, 28, 32, 41, 42].

Figure 3

Interaction potentials between a Y-family DNA Pol such as Dpo4, in which the active site is very close along the x direction to the nearest residue of the binding site S2 located in the LF domain, and a DNA substrate shown in top of (b). (a) Schematic diagram of the Pol complexed with the DNA substrate. (b) V1(x) represents the potential of the binding site S1 interacting with the ssDNA segment, while V2(x) represents the potential of the binding site S2 interacting with the dsDNA segment. (c) Schematic diagrams of the position of the Pol along the DNA substrate, with blue dots representing the active site.

Figure 4

Interaction potentials between a Y-family DNA Pol such as Dbh, in which the active site is, along the x direction, distanced away from (or not close to) the nearest residue of the binding site S2 located in the LF domain, and a DNA substrate shown in top of (b). (a) Schematic diagram of the Pol complexed with the DNA substrate. (b) V1(x) represents the potential of the binding site S1 interacting with the ssDNA segment, while V2(x) represents the potential of the binding site S2 interacting with the dsDNA segment. (c) Schematic diagrams of the position of the Pol along the DNA substrate, with blue dots representing the active site.

As in Figure 2, the potential V1(x) of the binding site S1 interacting with the ssDNA is shown in Figures 3b and 4b, with E1 denoting the binding affinity for N1 bases of the ssDNA template while E'1 the binding affinity for (N1-1) bases. However, E'1 and E1 have very small or nearly zero values.

Then, consider the potential V2(x) of the binding site S2 interacting with the dsDNA. Since the binding site S2 in the Y-family Pols is composed of residues located in the thumb domain and mainly in the LF domain, the form of potential V2(x) depends on the distance, L, from the active site to the nearest residue (red dots in Figures 3a and 4a) of the binding site S2 located in the LF domain along the x direction.

(i) For the case in which the active site is very close, along the x direction, to the nearest residue of the binding site S2 located in the LF domain (see Figure 3a), as seen from the structure of Dpo4 [28, 41, 42], the interaction potential V2(x) can be simply represented as in Figure 3b, where L = 0. If binding site S2 is considered to cover N2 base pairs of the dsDNA, E2 is the binding affinity for the sugar-phosphate backbones connecting N2 base pairs on the dsDNA, while E'2 is the binding affinity for the backbones connecting only (N2-1) base pairs. Moreover, the potential implies that the primer 3' terminus, owing to structural restriction (see, e.g., [27, 28, 32, 41, 42]), is not allowed to move forwards relative to the Pol when its active site is located at the primer 3' terminus. In addition, from the structures of Pols complexed with the DNA substrate, it is inferred that the interaction between the binding site S2 and the dsDNA arises from hydrogen bonding, van der Waals forces and, mainly, electrostatic forces. Furthermore, the interaction distance of the electrostatic force, which is approximately equal to the Debye length (~ 1 nm) in solution, is larger than the distance p = 0.34 nm between two successive base pairs. Thus, the value at the maxima of V2(x) increases as the binding site S2 deviates away from the dsDNA segment along the x direction.

(ii) For the case in which the active site is, along the x direction, distanced away from (or not close to) the nearest residue of the binding site S2 located in the LF domain (Figure 4a), as evidently seen from the structure of Dbh [27], the interaction potential V2(x) can be simply represented as in Figure 4b, where we take L = 1 bp. From available structures of the binary and ternary complexes for Pol ι [31, 32], it is also noted that, if the active site is positioned opposite to the first unpaired base on the template, the first unpaired base is distanced by L = 1 bp away from the nearest residue of the binding site S2 located in the LF domain. Thus, the interaction potential V2(x) for Pol ι also has the form of Figure 4b rather than that of Figure 3b. Similarly, from the available structure of the ternary complex for Pol η [30], we infer that the interaction potential V2(x) for Pol η also has the form of Figure 4b.

From Figure 3b it is seen that, when the active site is positioned at the nth base pair (top of Figure 3c), the affinity of the Pol for the DNA substrate is En = E'1 + E2, while when the active site is positioned at the (n+1)th base (bottom of Figure 3c), the affinity is En+1 = E1 + E'2. Since E'1 and E1 are much smaller than E'2 and E2, and E2 > E'2, it is expected that En > En+1. Similarly, from Figure 4b it is seen that, when the active site is positioned at the nth base pair (top of Figure 4c), the affinity of the Pol for the DNA substrate is En = E'1 + E2, while when the active site is positioned at the (n+1)th base (bottom of Figure 4c), the affinity is En+1 = E1 + E2. Since E'1 and E1 are much smaller than E2, it is expected that En+1 is slightly larger than (or nearly equal to) En. Moreover, from both Figures 3 and 4 it is noted that, when the active site is positioned at the (n+1)th base, a jump of the Pol from the (n+1)th site to the (n+2)th site must overcome a larger energy barrier than the backward jump to the nth site. As an approximation, we do not consider jumps to the (n+2)th site in this work.

The binding of dNTP induces a slight conformational change, enhancing the interaction of the Pol with DNA substrate

As evidenced from the FRET experimental data [56], it is argued that the dNTP binding involves (at least) two substeps, E · DNA + dNTP → E · DNA · dNTP → E* · DNA · dNTP, where E represents the DNA Pol. The transition from the unactivated E · DNA · dNTP ternary complex to activated E* · DNA · dNTP ternary complex induces a slight conformational change of the Pol, enhancing its interactions with both the DNA substrate and dNTP. Similarly, the PPi releasing also involves (at least) two substeps, E* · DNA · PPi → E · DNA · PPi → E · DNA + PPi, where the transition from the activated E* · DNA · PPi ternary complex to unactivated E · DNA · PPi ternary complex results in a reverse slight conformational change of the Pol, reducing its interactions with both the DNA substrate and PPi.

Since in the activated E* · DNA · dNTP (or E* · DNA · PPi) complex the Pol has a stronger interaction with DNA substrate and nucleotide than in the unactivated E · DNA · dNTP (E · DNA · PPi) complex, for simplicity of analysis, it is considered that in the activated state the Pol is unable to move relative to the DNA substrate and the dNTP or PPi bound to the active site has a negligible probability to release.

Model for Pol translocation

Using potentials V1(x) and V2(x) (Figures 3 and 4), the NBR model for the Y-family Pol translocating along DNA substrate is schematically shown in Figure 5.

Figure 5

Schematic illustrations of the nucleotide binding rectification Brownian ratchet model for the Y-family DNA Pol translocating along DNA substrate (see text for detailed description).

We begin with the Pol positioned at the nth site (Figure 5a), just after the incorporation of a nucleotide. In Figure 5a, the active site is occupied by the primer 3'-terminus, which sterically prevents a dNTP from binding to the active site. Due to thermal noise, the Pol in this nucleotide-free state can jump from the nth site to the (n+1)th site (from Figure 5a to 5b) and vice versa (from Figure 5b to 5a). For the case in which the active site is very close along the x direction to the nearest residue of the binding site S2 located in the LF domain (Figure 3a), En > En+1 (see above). Thus, the Pol in the binary E · DNA state stays most of the time at the nth site (Figure 5a), as will be shown in the Results, which is consistent with the available binary E · DNA structures for Dpo4 [28, 41, 42]. For the case in which the active site is, along the x direction, distanced away from (or not close to) the nearest residue of the binding site S2 located in the LF domain (Figure 4a), En+1 is slightly larger than (or nearly equal to) En (see above). Thus, the Pol in the binary E · DNA state has a slightly larger (or nearly equal) probability of staying at the (n+1)th site (Figure 5b) than at the nth site (Figure 5a), implying that the binary E · DNA structure for this case could be observed either at the (n+1)th site or at the nth site. This is consistent with the observations that the pre-insertion binary E · DNA structures for Dbh [27] and Pol ι [32] showed their active sites at the (n+1)th site, while the post-insertion binary E · DNA structure for Dbh [27] showed the active site at the nth site.
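The occupancy argument above can be illustrated with a two-state Boltzmann estimate. This is only a sketch: the equilibrium weighting, the function name `site_occupancy` and the numerical affinities (E1 = E'1 = 0, E2 = 20kBT, E'2 = 17kBT for the Dpo4-like case; E'1 = 0, E1 = 0.2kBT, E2 = 17kBT for the Dbh-like case) are assumptions chosen for illustration and are not taken from the Results.

```python
import math


def site_occupancy(e_n, e_np1):
    """Two-state Boltzmann estimate (an assumption of this sketch).

    e_n and e_np1 are the binding affinities (well depths) at the nth and
    (n+1)th sites in units of k_B T. Returns (P_n, P_n+1), the equilibrium
    probabilities of finding the Pol at each site.
    """
    weight_n = 1.0
    weight_np1 = math.exp(e_np1 - e_n)  # deeper well -> larger weight
    z = weight_n + weight_np1
    return weight_n / z, weight_np1 / z


# Dpo4-like case (L = 0): E_n = E'1 + E2 and E_n+1 = E1 + E'2.
# Hypothetical values: E1 = E'1 = 0, E2 = 20, E'2 = 17 (all in k_B T).
p_n, p_np1 = site_occupancy(0.0 + 20.0, 0.0 + 17.0)
print(f"Dpo4-like: P(n) = {p_n:.3f}, P(n+1) = {p_np1:.3f}")

# Dbh-like case (L = 1 bp): E_n = E'1 + E2 and E_n+1 = E1 + E2.
# Hypothetical values: E'1 = 0, E1 = 0.2, E2 = 17 (all in k_B T).
q_n, q_np1 = site_occupancy(0.0 + 17.0, 0.2 + 17.0)
print(f"Dbh-like:  P(n) = {q_n:.3f}, P(n+1) = {q_np1:.3f}")
```

With a difference of 3kBT the Pol is found at the nth site about 95% of the time, whereas a difference of a fraction of kBT leaves the two sites almost equally populated, which is the qualitative distinction drawn between the Dpo4-like and Dbh-like cases.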

When the Pol jumps to the (n+1)th site, since the active site is nucleotide free, a dNTP becomes able to bind to it, as shown in Figure 5b, which is equivalent to the state shown at the bottom of Figure 3c or Figure 4c. Consider that the dNTP binds to the active site during the period when the Pol stays at the (n+1)th site (Figure 5c). Owing to structural restriction (see, e.g., [27, 28, 32, 41, 42]), the occupation of the active site by the dNTP sterically prevents the Pol from moving backwards to the nth site unless the dNTP dissociates. This is consistent with the available structures showing that the active site of Pols such as Dbh, Dpo4, Pol ι, Pol η, Pol κ and Rev1 in ternary forms is at the (n+1)th site [11, 27, 28, 30, 32, 34, 35, 41, 42]. Then, the transition from the unactivated ternary complex E · DNA · dNTP to the activated E* · DNA · dNTP complex enhances the interactions of the Pol with the DNA substrate and with the dNTP, thus preventing both the DNA substrate and the dNTP from dissociating from the Pol.

After the phosphodiester bond formation and then the release of PPi, except that the dsDNA segment is elongated by one base pair and the Pol has moved forwards by one base pair, the Pol-DNA complex returns to the state shown in Figure 5a. Correspondingly, the potentials V1(x) and V2(x) in Figure 3b and in Figure 4b are shifted by one base pair along the x direction. Then, the next round of the nucleotide incorporation would proceed continuously.

Equations for Pol motion

Consider the movement of Pol relative to the DNA substrate in two dimensions. One is along the DNA, which is represented by the x axis, as shown in Figures 3, 4, and 5. The other one is along the r axis that is perpendicular to the x axis. Then, the movement equations can be written in the following Langevin forms

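\[ \Gamma \frac{dx}{dt} = -\frac{\partial U(x,r)}{\partial x} + \xi_x(t) \tag{1a} \]
\[ \Gamma \frac{dr}{dt} = -\frac{\partial U(x,r)}{\partial r} + \xi_r(t) \tag{1b} \]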

Here the potential U(x,r) can be written as U(x,r) = V(x)[2 exp(-r/rd) - exp(-2r/rd)], with V(x) = V1(x) + V2(x) + V0, where V1(x) and V2(x) have the forms shown in Figures 3b and 4b, and V0 ≡ -E0 < 0 results from the fact that the electrostatic interaction distance of the Pol with the DNA in solution is larger than the distance between two successive base pairs. The magnitude of V(x) is defined as follows: its minimum value at the nth site is -(En + E0), while at the (n+1)th site the value is -(En+1 + E0). The term [2 exp(-r/rd) - exp(-2r/rd)], which has the Morse form, denotes the potential change along the r direction, with 2rd = 1 nm (the Debye length) characterizing the interaction distance. The parameter Γ is the frictional drag coefficient on the Pol and ξi(t) (i = x, r) is the fluctuating Langevin force with 〈ξi(t)〉 = 0 and 〈ξi(t)ξj(t')〉 = 2kBTΓδij δ(t - t'). The drag coefficient is calculated by Γ = 6πηR, where η is the viscosity of the aqueous medium and the Pol is approximated as a sphere with radius R = 5 nm. As a previous experiment showed that the viscosity of the aqueous cytoplasm does not differ from that of water [71], we take the viscosity of the aqueous medium to be the same as that of water in the calculation, i.e., η = 0.01 g cm⁻¹ s⁻¹, which gives Γ = 9.4 × 10⁻¹¹ kg s⁻¹. Moreover, the effect of the viscosity variation on the results will be discussed.
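As a quick numerical check of the parameters quoted above, the Stokes drag and the corresponding diffusion constant can be evaluated directly. The temperature of 298 K and the helper name `morse_factor` are assumptions of this sketch; the text does not state the temperature used.

```python
import math

K_B = 1.380649e-23   # Boltzmann constant, J/K
TEMP = 298.0         # assumed temperature, K (not stated in the text)
ETA = 1.0e-3         # viscosity of water, Pa s (equal to 0.01 g cm^-1 s^-1)
RADIUS = 5.0e-9      # effective radius of the Pol, m
R_D = 0.5e-9         # r_d, chosen so that 2 r_d = 1 nm

gamma = 6.0 * math.pi * ETA * RADIUS   # Stokes drag coefficient, kg/s
diff_const = K_B * TEMP / gamma        # Einstein relation D = k_B T / Gamma, m^2/s


def morse_factor(r, r_d=R_D):
    """Radial factor 2 exp(-r/r_d) - exp(-2r/r_d) that multiplies V(x)."""
    return 2.0 * math.exp(-r / r_d) - math.exp(-2.0 * r / r_d)


print(f"Gamma = {gamma:.3e} kg/s")
print(f"D     = {diff_const:.3e} m^2/s")
for r_nm in (0.0, 0.5, 1.0, 2.0, 5.0):
    print(f"r = {r_nm:3.1f} nm -> radial factor = {morse_factor(r_nm * 1e-9):.4f}")
```

The radial factor equals 1 at r = 0 and decays towards 0 over a few nanometres, so a Pol that has moved several nanometres away from the DNA feels essentially no binding potential.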

Results

Processivity of the Y-family Pol

To study the processivity of the Y-family Pol, we determine the dissociation probability of the Pol from the DNA substrate during one cycle of nucleotide incorporation. To this end, we calculate the mean dissociation time, Td, of the Pol from the DNA substrate.

First, we consider the motion with the Pol fixed at one potential well (e.g., the potential well at the nth site) along the x direction. Then, the potential U(x,r) in Eq. (1b) becomes W(r) = -Er[2 exp(-r/rd) - exp(-2r/rd)], where the depth of the potential well is Er = En + E0. If the Pol is considered to be dissociated from its DNA substrate when it has moved away from the DNA substrate by a distance of r = L, the mean dissociation time Td, i.e., the mean time for the Pol to move from r = 0 to r = L, can be obtained by [72]

(2)

where D = kBT/Γ. From Eq. (2) we have

(3)

where it is seen that Td is proportional to the viscosity η.

The dissociation probability per unit time, Pd, of the Pol from the DNA substrate is calculated by

\[ P_d = \frac{1}{T_d} \tag{4} \]

It is noted from Eqs. (3) and (4) that Pd is inversely proportional to the viscosity η.

Based on the model, the Pol can dissociate from the DNA substrate only during two time periods: the period Tp1, after the transition to the unactivated E · DNA · PPi ternary complex but before dNTP binding, and the period Tp2, after dNTP binding but before the transition to the activated E* · DNA · dNTP ternary complex. During Tp1, the Pol can jump between the well at the nth site and the well at the (n+1)th site along the x direction. As our results show (see additional file 1), the dissociation probability Pd1 during Tp1 depends approximately only on the value of Em (m = n or n + 1), up to a factor C = 1 ~ 2, through Pd given by Eqs. (3) and (4) with Er = Em + E0 (m = n if En > En+1, m = n + 1 if En < En+1). During Tp2, the Pol is positioned at the (n+1)th site, and the dissociation probability Pd2 is calculated from Pd given by Eqs. (3) and (4) with Er = En+1 + E0.

For a Pol such as Dpo4, in which the active site is very close along the x direction to the nearest residue of the binding site S2 located in the LF domain, since En > En+1, it is noted from Eqs. (3) and (4) that the dissociation probability per unit time at the nth site is smaller than that at the (n+1)th site. Moreover, it is known that Tp1 < Tp2 at saturating concentrations of dNTP for Dpo4 [56]. Thus, Pd1 is smaller than Pd2. The mean number of incorporated nucleotides for one binding event of the Pol with the DNA substrate, which characterizes the polymerization processivity, is calculated by

(5)

For Pols such as Dbh, Pol ι and Pol η, in which the active site is, along the x direction, distanced away from (or not close to) the nearest residue of the binding site S2 located in the LF domain, En+1 is slightly larger than (or nearly equal to) En, so that both Pd1 and Pd2 are governed by En+1. Thus, the mean number of incorporated nucleotides for one binding event is calculated by

(6)

From Eqs. (5) and (6), it is seen that, whether En > En+1 or En+1 is slightly larger than (or nearly equal to) En, the polymerization processivity is mainly determined by the binding affinity, En+1, of the Pol at the (n+1)th site along the DNA substrate. Moreover, it is noted that Np is proportional to the viscosity η.

Using Eq. (3), the calculated results of the mean dissociation time Td versus Er are shown in Figure 6a, where we take L = 5 nm, which is larger than the interaction distance 2rd = 1 nm. It is seen that Td increases significantly with increasing Er. With the results of Figure 6a and using Eqs. (4) and (5), the calculated results of Np versus Er are shown in Figure 6b, where we take Tp2 = 0.065 s, which is consistent with the experimentally measured transition rate of 15.3 s⁻¹ for Dpo4 [56]. It is seen that, when Np = 10 ~ 100, which is consistent with the experimental data [48], Er ≈ 18.5kBT ~ 20.8kBT. Now, from this value of Er, we estimate the values of En+1 and En. Taking a conservative value of E0 = Er/5 = 3.9kBT, we estimate that the value of En+1 is at most about 17kBT. For a reasonable value of En - En+1 = 3kBT ~ 5kBT, we estimate that the value of En is at most about 20kBT ~ 22kBT.
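The estimates above can be reproduced approximately with a short numerical sketch. It assumes the standard mean first-passage time double integral for overdamped diffusion in the Morse-form potential W(r), with a reflecting boundary at r = 0 and an absorbing boundary at r = L, and it assumes Np ≈ Td/Tp2 (dissociation dominated by the period Tp2). Neither expression is copied from Eqs. (2)-(5); the temperature of 298 K and the function names are likewise assumptions.

```python
import math

K_B_T = 1.380649e-23 * 298.0               # thermal energy, J (298 K is an assumption)
GAMMA = 6.0 * math.pi * 1.0e-3 * 5.0e-9    # Stokes drag for eta = 1e-3 Pa s, R = 5 nm
D = K_B_T / GAMMA                          # diffusion constant, m^2/s


def mean_dissociation_time(e_r, r_d=0.5e-9, dist=5.0e-9, n=20000):
    """Mean first-passage time from r = 0 to r = dist (assumed formula):

        T_d = (1/D) * int_0^dist dy exp(W(y)) * int_0^y dz exp(-W(z)),

    with W(r) = -e_r * [2 exp(-r/r_d) - exp(-2r/r_d)] in units of k_B T.
    Both integrals are accumulated with the trapezoidal rule on n steps.
    """
    def w(r):
        return -e_r * (2.0 * math.exp(-r / r_d) - math.exp(-2.0 * r / r_d))

    h = dist / n
    inner = 0.0                      # running value of the inner integral
    total = 0.0                      # running value of the outer integral
    prev_in = math.exp(-w(0.0))
    prev_out = 0.0                   # outer integrand vanishes at y = 0
    for i in range(1, n + 1):
        w_r = w(i * h)
        cur_in = math.exp(-w_r)
        inner += 0.5 * h * (prev_in + cur_in)
        cur_out = math.exp(w_r) * inner
        total += 0.5 * h * (prev_out + cur_out)
        prev_in, prev_out = cur_in, cur_out
    return total / D


def processivity(e_r, t_p2=0.065):
    """Assumed estimate N_p ~ T_d / T_p2, i.e. 1 / (P_d * T_p2) with P_d = 1 / T_d."""
    return mean_dissociation_time(e_r) / t_p2


for e_r in (16.0, 18.5, 20.8):
    t_d = mean_dissociation_time(e_r)
    print(f"E_r = {e_r:4.1f} kBT: T_d = {t_d:.3g} s, N_p = {processivity(e_r):.3g}")
```

Under these assumptions the sketch gives Np of order 10 at Er = 18.5kBT and of order 100 at Er = 20.8kBT, in line with the range quoted above.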

Figure 6

Results for processivity of the Y-family Pol. (a) Mean dissociation time Td of the Pol from the DNA substrate versus the binding affinity Er between them. (b) Mean number of processive incorporation cycles Np versus Er.

In addition, from Figure 6b it is interesting to see that, if Er is decreased from 20.8kBT to a value below 16kBT, Np is decreased from about 100 to a value smaller than 1. This implies that a decrease of only about 5kBT in the binding affinity can convert the processive polymerization of a hundred nucleotides into a distributive polymerization. From this result, it is also concluded that the considerable variations in processivity among Y-family Pols result mainly from slight changes in their binding affinities for the DNA substrate. Moreover, since the LF domain is the least conserved of the four domains in the Y-family Pols, the slight differences in the binding affinity of different Pols are mainly due to different interaction strengths of the LF domain with the DNA. For example, comparison of the LF structure of Dpo4 with that of Dbh showed that the DNA-contacting surface in the LF domain of Dpo4 is slightly more positively charged than that of Dbh and, correspondingly, Dpo4 is much more processive than Dbh [48]. Since the LF domain has a large binding affinity for the DNA, it is expected from Figure 6a that deletion of the LF domain will significantly reduce the association time of the Pol with the DNA, making the truncated Pol much less active than the full-length Pol. This is also consistent with the experimental data [28].
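The sensitivity noted above follows from the roughly exponential dependence of the escape time on the well depth. As a leading-order sketch (an Arrhenius-type assumption that ignores the weak dependence of the prefactor on Er and takes Np to be proportional to Td at fixed Tp2):

```latex
\frac{N_p(E_r - \Delta E)}{N_p(E_r)}
  \approx \frac{T_d(E_r - \Delta E)}{T_d(E_r)}
  \approx e^{-\Delta E / k_B T},
\qquad
\Delta E = 20.8\,k_B T - 16\,k_B T = 4.8\,k_B T
  \;\Rightarrow\; e^{-4.8} \approx \frac{1}{120}.
```

With this estimate, Np ≈ 100 at Er = 20.8kBT falls to roughly 0.8 at Er = 16kBT, which is the transition from processive to distributive polymerization described above.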

Moving time of the Y-family Pol

Now, we study the moving time from the n th site to the (n+1)th site and vice versa during the period after the incorporation of the n th base but before dNTP binding. To this end, we can consider the motion only along the x direction, so that the potential U(x,r) in Eq. (1a) becomes V(x). Thus, the mean moving time, T n→(n+1) , i.e., the mean first-passage time for the Pol to move from the n th site at position x = 0 to the next (n+1)th site at position x = p = 2l = 0.34 nm, can be approximately calculated by an integral similar to Eq. (2). From the integral, we have

(7)

The mean moving time, T (n+1)→n , from the (n+1)th site to the n th site also has the form of Eq. (7) but with E n and E n+1 exchanged. From Eq. (7) and D = k B T/Γ, it is noted that the mean moving time is proportional to the viscosity η.

Using Eq. (7), the calculated mean moving time T n→(n+1) (T (n+1)→n ) versus E n (E n+1 ) for different values of E n+1 (E n ) is shown in Figure 7. As expected, T n→(n+1) increases significantly with increasing E n but is insensitive to E n+1 , while T (n+1)→n increases significantly with increasing E n+1 but is insensitive to E n . Figure 7 shows that, even for E n = 20k B T ~ 22k B T for Dpo4 (see the above section), the mean moving time T n→(n+1) is only about 2 ~ 10 ms. For E n+1 = 17k B T for Dpo4 (see the above section), T (n+1)→n is only about 0.1 ms. These results indicate that, after the incorporation of the n th base and before dNTP binding, Dpo4 would jump between the n th site and the (n+1)th site at a high frequency. Thus, within the time resolution used in the FRET experiment [56], this frequent jumping between the two positions could not be detected. Moreover, as will be shown in the following section, Dpo4 would most probably stay at the n th site (Figure 5a). Thus, the resolved structure is most probably in the state with the Dpo4 active site located at the n th site, and the FRET data show a rapid translocation of Dpo4 relative to the DNA substrate upon addition of dNTP, which is consistent with the experimental data [41, 42, 56].
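The moving times quoted above can be turned into a rough occupancy estimate. Treating the hopping as a two-state process (an assumption of this illustration), in which the mean dwell time at each site equals the mean moving time out of it, the fraction of time P n spent at the n th site for Dpo4 would be:

```latex
P_n \approx \frac{T_{n\to(n+1)}}{T_{n\to(n+1)} + T_{(n+1)\to n}},
\qquad
\frac{2\ \text{ms}}{2\ \text{ms} + 0.1\ \text{ms}} \approx 0.95,
\qquad
\frac{10\ \text{ms}}{10\ \text{ms} + 0.1\ \text{ms}} \approx 0.99 .
```

The corresponding ratio of dwell times, 2 ms / 0.1 ms to 10 ms / 0.1 ms, is about 20 to 100.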

Figure 7

Results of the mean time T n→(n+1) ( T (n+1)→n ) for the Y-family Pol to move from the n th (( n +1)th) site to the ( n +1)th ( n th) site versus E n ( E n+1 ) for different values of E n+1 ( E n ) (indicated in the figure) before dNTP binding to the active site. Note that the four curves of Tn→(n+1)(T(n+1)→n) versus E n (En+1) for different values of En+1(E n ) are nearly coincident.

Effect of external force on DNA-synthesis rate of the Y-family Pol

In the NBR model for the Y-family Pol, after the incorporation of the n th base and before dNTP binding, the active site jumps between the n th site and the (n+1)th site. As noted from Eq. (7), for E n >> 1 and E n+1 >> 1, the ratio of the time T n that the active site spends at the n th site to the time T n+1 that it spends at the (n+1)th site approximately has the form

(8)

For Dpo4, since E n > E n+1 , Eq. (8) shows that the active site has a much larger probability of staying at the n th site than at the (n+1)th site. For E n - E n+1 = 3k B T ~ 5k B T, T n /T n+1 ≈ 20 ~ 100. For Dbh, Pol ι and Pol η, E n+1 is slightly larger than (or nearly equal to) E n . From Eq. (8), these Pols therefore have a slightly larger (or nearly equal) probability of staying at the (n+1)th site compared with the n th site.

Consider an external force, F, acting on the Pol bound to a fixed DNA substrate, where F is defined as positive when it points in the -x direction. Such an experiment could be realized with an optical trap, with a micro-bead linked to residues on the palm subdomain or LF domain of the Pol. The linked residues should be far from the active site, so that the external force has no effect on the polymerase activity of the active site. Under the external force F, the depth of the potential well at the n th site changes from E n to E n + Fp/2, while the depth at the (n+1)th site changes from E n+1 to E n+1 - Fp/2. Thus, Eq. (8) becomes

(9)

Based on the model, the dNTP can bind to the active site only when the active site is positioned at the (n+1)th site, i.e., only during the time period T n+1 . Thus, based on Eq. (9), the dNTP-binding rate, k b (F), versus the external force F has the form

(10)

where k b (0) denotes the dNTP-binding rate under no external force. From Eq. (10), the DNA-synthesis rate, k, is calculated by

(11)

where k c is the dNTP-incorporation rate at saturating dNTP concentration and . It is noted from Eqs. (8) - (11) that the DNA-synthesis rate k is independent of the viscosity η.

Now, we use Eq. (11) to make some predictions. From the experimental data, we have k c = 2.3 × 10⁻² s⁻¹ and = 0.4 mM for Dbh [54]. As discussed before, we take E n+1 ≈ E n for Dbh. Using Eq. (11), we calculate the DNA-synthesis rate k versus F for different values of [dNTP], with the results shown in Figure 8a, and k versus [dNTP] for different values of F, with the results shown in Figure 8b. For Dpo4, k c = 9 s⁻¹ and = 230 μM [44]. Moreover, we take E n - E n+1 = 3k B T for Dpo4. The results of k versus F for different values of [dNTP] and k versus [dNTP] for different values of F are shown in Figures 9a and 9b, respectively. Comparing Figure 8a with Figure 9a shows that, in the range of F = -20 ~ 20 pN and [dNTP] ≤ 1 mM, the external force F has a more significant effect on the DNA-synthesis rate k of Dpo4 than on that of Dbh.
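The qualitative trend described here can be explored with a small sketch. It is not Eqs. (8)-(11) themselves but rests on three labeled assumptions: the occupancy ratio T n /T n+1 is taken as exp[(E n - E n+1 + Fp)/k B T] with any prefactor neglected; dNTP binds only while the active site is at the (n+1)th site; and the synthesis rate follows a Michaelis-type saturation in which the force rescales only the binding step. The names `binding_rate_ratio`, `synthesis_rate` and `half_sat_mm` are hypothetical, k B T = 4.1 pN nm is an assumed room-temperature value, and the quoted 0.4 mM and 230 μM are treated here as half-saturating dNTP concentrations.

```python
import math

KBT_PN_NM = 4.1  # assumed thermal energy near room temperature, in pN*nm
STEP_NM = 0.34   # step size p from the text, in nm


def binding_rate_ratio(delta_e_kbt: float, force_pn: float) -> float:
    """k_b(F) / k_b(0) for a given E_n - E_(n+1) (in k_B T) and force (in pN).

    Assumes T_n / T_(n+1) = exp((E_n - E_(n+1) + F * p) / k_B T), with any
    prefactor neglected, and that dNTP binds only while at site (n+1).
    """
    ratio_zero = math.exp(delta_e_kbt)
    ratio_force = math.exp(delta_e_kbt + force_pn * STEP_NM / KBT_PN_NM)
    return (1.0 + ratio_zero) / (1.0 + ratio_force)


def synthesis_rate(k_c: float, half_sat_mm: float, dntp_mm: float,
                   delta_e_kbt: float, force_pn: float) -> float:
    """Michaelis-type rate in which force rescales only the binding step."""
    effective_half_sat = half_sat_mm / binding_rate_ratio(delta_e_kbt, force_pn)
    return k_c * dntp_mm / (dntp_mm + effective_half_sat)


if __name__ == "__main__":
    # (k_c in 1/s, assumed half-saturating [dNTP] in mM, E_n - E_(n+1) in k_B T)
    pols = {"Dbh": (2.3e-2, 0.4, 0.0), "Dpo4": (9.0, 0.23, 3.0)}
    for name, (k_c, half_sat, delta_e) in pols.items():
        for force in (-20.0, 0.0, 20.0):
            k = synthesis_rate(k_c, half_sat, 1.0, delta_e, force)
            print(f"{name}: F = {force:+.0f} pN, [dNTP] = 1 mM -> k = {k:.3g} 1/s")
```

In this sketch the force enters only through the binding step, so the saturating rate k c is unchanged and the result does not depend on viscosity.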

Figure 8

Predicted results of DNA-synthesis rate k ( F ) under the effect of the external force F for Dbh. (a) DNA-synthesis rate k(F) versus F for different values of [dNTP]. (b) DNA-synthesis rate k(F) versus [dNTP] for different values of F, with curves from upper to lower corresponding to F = 1 pN, 10 pN, 20 pN and 30 pN, respectively.

Figure 9

Predicted results of DNA-synthesis rate k ( F ) under the effect of the external force F for Dpo4. (a) DNA-synthesis rate k(F) versus F for different values of [dNTP]. (b) DNA-synthesis rate k(F) versus [dNTP] for different values of F, with curves from upper to lower corresponding to F = 1 pN, 10 pN, 20 pN and 30 pN, respectively.

Comparison of the effect of external force on dNTP-binding rate of the Y-family Pol with that of the replicative Pol

Based on the model for the replicative Pol (Figure 2) and the modified model for the Y-family Pol (Figures 3, 4, and 5), the dNTP-binding rate k b (F) versus the external force F satisfies Eq. (10), where E n -En+1= E'1-E1 < 0 for the replicative Pol (see Figure 2) while E n -En+1≥ 0 for the Y-family Pol. For the replicative Pol, E'1-E1 represents the binding affinity of binding site S1 for one base of the ssDNA template (or the binding affinity of the binding site S1 residue that is closest to the palm subdomain for ssDNA), and it is estimated that E'1-E1 = -5k B T ~ -3k B T. As mentioned before, for Dbh, Pol ι and Pol η, E n -En+1≈ 0, while for Dpo4, E n -En+1= 3k B T ~5k B T.

Using Eq. (10), the calculated ratio R = k b (F)/k b (0) versus the external backward force F acting on the Pol for different values of E n - E n+1 is shown in Figure 10. In the range of F < 20 pN, the ratio R decreases only slightly with increasing F for the replicative Pol, whereas it decreases greatly for the Y-family Pol. In other words, the external backward force has a much larger effect on the dNTP-binding rate for the Y-family Pol than for the replicative Pol.
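To give a sense of scale, suppose (as an assumption of this illustration, neglecting any prefactor) that the occupancy ratio of the two sites is a Boltzmann factor in E n - E n+1 + Fp and that dNTP binds only while the active site is at the (n+1)th site. With an assumed k B T ≈ 4.1 pN nm, p = 0.34 nm and F = 20 pN, Fp/k B T ≈ 1.66, and the ratio evaluates to:

```latex
R(F) = \frac{1 + e^{\Delta E/k_BT}}{1 + e^{(\Delta E + Fp)/k_BT}},
\qquad \Delta E = E_n - E_{n+1},
\\[4pt]
\Delta E = -3\,k_BT:\ R \approx 0.83, \qquad
\Delta E = 0:\ R \approx 0.32, \qquad
\Delta E = 3\,k_BT:\ R \approx 0.20 .
```

Under these assumptions a 20 pN backward force lowers the binding rate by under 20% for a replicative-type potential but by roughly 70-80% for the Y-family cases.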

Figure 10

Predicted results of the ratio R = k b (F)/k b (0) versus the external backward force F acting on the Pol for different values of E n - E n+1 , where k b (F) is the dNTP-binding rate under the external force F and k b (0) is the dNTP-binding rate under no external force. Curves from upper to lower are for E n - E n+1 = -5k B T, -4k B T, -3k B T, 0, 3k B T and 5k B T, where E n - E n+1 = -5k B T, -4k B T and -3k B T correspond to replicative Pols, E n - E n+1 = 0 corresponds to Dbh, Pol ι and Pol η, and E n - E n+1 = 3k B T and 5k B T correspond to Dpo4.

The biological implication of these different characteristics between the Y-family and replicative Pols might be as follows. At the replication fork, the replicative Pol generally experiences a backward force exerted by the DNA helicase ahead of it [73]. The weak dependence of the DNA-synthesis rate of the replicative Pol on the backward force therefore means that this force has little impact on DNA replication. However, when the replicative Pol stalls at a lesion site, the helicase ahead continues to unwind the dsDNA, so the Pol no longer experiences a backward force. Thus, when the replicative Pol is replaced by the Y-family Pol, the latter also experiences no backward force at the lesion site. After bypassing the lesion site, the Y-family Pol would continue processive DNA synthesis and, if it catches up with the helicase, the backward force exerted by the helicase would greatly reduce the DNA-synthesis rate, thus enhancing the probability that the Y-family Pol dissociates from the DNA substrate or is replaced by the replicative Pol.

The Y-family Pol can easily bypass a mismatched base pair or a lesion site

In this section, we will show how the Pol that uses the NBR mechanism for translocation can easily bypass a mismatched base pair or a lesion site. As an example, we will use Dpo4 to illustrate this bypass ability, in which the active site is very close along the x direction to the nearest residue of the binding site S2 located in the LF domain.

As shown in Figure 11, consider that a mismatched base is incorporated at the n th site. Then, the interaction potential V1(x) of the binding site S1 with the ssDNA template and the potential V2(x) of the binding site S2 with the dsDNA segment are shown in Figure 11a. Here, E1 is the binding affinity of the binding site S1 for N1 bases of the ssDNA template that the binding site S1 can cover, E'2 is the binding affinity of the binding site S2 for the backbones connecting (N1-1) base pairs on the dsDNA, and E''2 is the binding affinity of the binding site S2 for the backbones connecting (N1-2) base pairs. Thus, when the active site is positioned at the n th site (top of Figure 11b), the affinity of the Pol for the DNA substrate is E n = E1 + E'2; while when the active site is positioned at the (n+1)th site (bottom of Figure 11b), the affinity is En+1= E1 + E''2. Note that the binding affinity E''2 that corresponds to binding (N1-2) base pairs is smaller than E'2 that corresponds to binding (N1-1) base pairs.
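The two affinities just defined differ only in the S2 term, which can be written out explicitly (a restatement of the definitions above, not an additional result):

```latex
E_n - E_{n+1} = \left(E_1 + E'_2\right) - \left(E_1 + E''_2\right) = E'_2 - E''_2 > 0 .
```

The difference is therefore the affinity of the binding site S2 for the backbones connecting a single base pair.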

Figure 11

Interaction potentials between a Y-family DNA Pol and a DNA substrate with a mismatched base pair at the n th site shown in top of (a). For clarity, the mismatched base is drawn in pink. (a) V1(x) represents the potential of the binding site S1 interacting with the DNA substrate, while V2(x) represents the potential of the binding site S2 interacting with the DNA substrate. (b) Schematic diagrams of the position of the Pol along the DNA substrate, with blue dots representing the active site.

Now we compare the case of a matched incorporation at the n th site (Figure 3) with the case of a mismatched incorporation (Figure 11). Since E'1 and E1 are much smaller than E'2 and E2, we have E n (mismatch) < E n (match), where E n (match) and E n (mismatch) represent E n for the cases of matched and mismatched incorporation, respectively. Thus, from the results shown in Figure 7, the mean moving time T n→(n+1) is shorter for mismatched incorporation than for matched incorporation. Moreover, the value of (E n - E n+1 )(mismatch) is close to that of (E n - E n+1 )(match), since both E'2-E''2 and E2-E'2 correspond to the affinity of the binding site S2 for the backbones connecting one base pair on the dsDNA. Thus, from Eq. (8), the ratio (T n /T n+1 )(mismatch) is close to (T n /T n+1 )(match). As a result, we conclude that a Pol that uses the NBR mechanism for translocation bypasses a mismatched base pair at nearly the same rate as it passes a matched base pair. Similarly, for an abasic lesion located at the n th site, the interaction potential V2(x) is the same as that shown in Figure 11a. Thus, as in the case of a mismatched base, a Pol that uses the NBR mechanism for translocation can also easily bypass the lesion site.

However, since the binding affinity is lower at the lesion site, Figure 6a indicates that the dissociation probability near the lesion site is larger than that in the absence of a lesion or mismatched base, which is in agreement with the experimental data for Dpo4 [52]. On the other hand, Figure 6b shows that, when there is no lesion or mismatched base, N p = 10, which corresponds to Dbh [48], gives E r = 18.5k B T, while N p = 100, which corresponds to Dpo4 [48], gives E r = 20.8k B T. From the value of E1-E'1 = 3k B T ~ 5k B T (see above), we infer that E'1-E''1 ≈ 3k B T ~ 5k B T. Thus, at the lesion site, for Dbh with E r = 13.5k B T ~ 15.5k B T, we have N p ≈ 0.08 ~ 0.6 from Figure 6b; for Dpo4 with E r = 15.8k B T ~ 17.8k B T, we have N p ≈ 0.8 ~ 5.2 from Figure 6b. This implies that, at the lesion site, Dbh is prone to dissociate from the DNA substrate while Dpo4 does not dissociate easily. In other words, Dpo4 can bypass the lesion site, whereas Dbh does so with much lower efficiency. These results are consistent with the experimental data [24, 48, 50, 52]. Moreover, since the different lesion-bypassing abilities of the two Pols, Dpo4 and Dbh, are due to their different binding affinities, which in turn result mainly from the different interaction strengths of the LF domain with the DNA, it is inferred that interchanging the LF domains will exchange the lesion-bypassing abilities of the two Pols. This is also consistent with the experimental data of Boudsocq et al. [48].
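As a rough cross-check on the values read from Figure 6b, one can assume (for illustration only, not as the form of Eqs. (3)-(5)) that N p scales as exp(E r /k B T) and rescale the lesion-free values N p = 10 for Dbh and N p = 100 for Dpo4 by the 3k B T ~ 5k B T loss of affinity:

```latex
\begin{aligned}
\text{Dbh:}\quad  & E_r = 18.5\,k_BT - (3\ \text{to}\ 5)\,k_BT = 13.5\ \text{to}\ 15.5\,k_BT,
  & N_p &\sim 10\,e^{-5}\ \text{to}\ 10\,e^{-3} \approx 0.07\ \text{to}\ 0.5, \\
\text{Dpo4:}\quad & E_r = 20.8\,k_BT - (3\ \text{to}\ 5)\,k_BT = 15.8\ \text{to}\ 17.8\,k_BT,
  & N_p &\sim 100\,e^{-5}\ \text{to}\ 100\,e^{-3} \approx 0.7\ \text{to}\ 5.0 .
\end{aligned}
```

These estimates are of the same order as the values 0.08 ~ 0.6 and 0.8 ~ 5.2 quoted from Figure 6b.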

Similarly, we can easily show that Pols such as Dbh and Pol ι, in which the active site is not close, along the x direction, to the nearest residue of the binding site S2 located in the LF domain, can also easily bypass the lesion site. Thus, we conclude that, although different values of the distance L give different translocation features, all Y-family Pols that use the NBR mechanism for translocation can easily bypass the lesion site and thus perform translesion synthesis. By contrast, the replicative Pols, which use another Brownian ratchet mechanism for translocation, cannot bypass the lesion site and are thus unable to perform translesion synthesis (see Figures 1 and 2).

Discussion

The explanation of the lesion-bypassing ability by a Y-family Pol in this work is focused only on the translocation activity of the active site from the position opposite to the lesion site to the next position. In fact, the lesion-bypassing ability is dictated by two activities. One is the translocation activity while the other one is the catalytic activity of the phosphodiester bond formation. The latter activity determines that rates of nucleotide incorporation opposite to the lesion site and one position downstream from the lesion site are slower than those at other sites [46]. Moreover, since the latter activity is determined by the structure of the catalytic core, single amino acid substitutions within the active site, palm or fingers subdomains can also have a profound effect on the ability of the enzyme to perform translesion synthesis [74, 75].

Distinct translocation features among the Y-family Pols in the NBR model depend on the distance L from the active site to the nearest residue of the binding site S2 located in the LF domain. It is important to note that this distance L in the model corresponds to the structure of the Pol only after it has bound to its DNA substrate. Although available structures showed a significant conformational change in the LF domain of apo-Dpo4 upon binding to the DNA substrate [42], only a small change has been observed between the Pol complexed with the DNA substrate alone and that complexed with both the DNA substrate and the dNTP. Thus, the large conformational change in the LF domain of the Pol upon binding to the DNA substrate has no effect on the conclusions of the current work, which concerns only the structures of the Pol complexed either with the DNA substrate alone or with both the DNA substrate and the dNTP.

Further comments on the NBR translocation model

In the NBR model (Figure 5), it has been implicitly assumed that the Pol has a rigid structure, i.e., that the different domains are linked rigidly. In fact, the residues linking different domains may behave elastically. For example, for Dpo4, consider an elastic link between the palm and thumb domains. Upon nucleotide binding, after the active site, together with the fingers, palm and LF domains, moves from the n th site to the (n+1)th site, the thumb domain may not move simultaneously because of the elastic link, i.e., the thumb contacts with the DNA may fluctuate under thermal noise between pre- and post-translocation positions with nearly equal probability. This offers an explanation for the available structural data for Dpo4 showing that, upon nucleotide binding, the LF contacts with the DNA shift by one base pair but the thumb contacts do not shift simultaneously [41].

Potential implication of binding site S1 in the induced-fit mechanism

The strong interaction of the binding site S1 with the unpaired base on the template induces a conformational change in the residues of the binding site S1, which in turn results in a conformational change in the active site that is adjacent to those residues. This unpaired-base-related conformational change thus gives the active site a much higher affinity for the structurally compatible nucleotide than for structurally incompatible nucleotides. This argument is consistent with the experimental data for high-fidelity DNA Pols showing that the shape of the nascent base pair is important regardless of whether the Watson-Crick hydrogen bonds can be formed [76]. It is also consistent with the recent FRET-based assay on Klenow fragment showing that base discrimination takes place within the open complex rather than during the transition from the open to the closed fingers conformation [77]. Conversely, the interaction of a structurally incompatible nucleotide may have a negative effect on the conformational change in the active site, which in turn results in the inverse conformational change in the binding site S1, potentially reducing its binding affinity for the unpaired base on the template. This could account for the increased dissociation caused by the binding of a mismatched nucleotide, as observed by Joyce et al. [77].

Crystal structures of Y-family DNA Pols complexed with DNA substrate showed weak or no interaction of the Pol with the ssDNA template [27, 28, 32, 41, 42]. Such weak binding would result in little or no conformational change in the active site related to the structure of the unpaired base on the template, giving a smaller difference in binding affinity between correct and incorrect nucleotides. As a result, the Y-family Pol has low-fidelity synthesis, consistent with the available experimental data [43, 45]. Since little or no conformational change related to the shape of the template base occurs in the active site, base discrimination in Y-family Pols is expected to rely mainly on Watson-Crick hydrogen bonding, in sharp contrast to the high-fidelity replicative DNA Pols, where the effect of Watson-Crick hydrogen bonding can be negligible compared with the dominant effect of base shape. Indeed, steady-state kinetic studies showed that incoming dNTPs and DNA substrates containing difluorotoluene, which has the same shape as thymine but lacks the ability to form Watson-Crick hydrogen bonds, are poor substrates for the Y-family Pol η [78].

Conclusion

In conclusion, an NBR model is proposed for the translocation of the Y-family DNA Pol along the DNA substrate, modified from the translocation model proposed previously for the replicative DNA Pol. The different structural features observed for Dpo4, Dbh and Pol ι in binary and ternary forms are consistent with the NBR model. Moreover, since the interaction potential V2(x) for Pol η has the form of Figure 4b rather than that of Figure 3b, it is predicted that Pol η would show translocation features similar to those of Dbh and Pol ι rather than Dpo4. The theoretical results on the dynamic properties of the Y-family Pols obtained with the NBR model are consistent with the available experimental data. The predictions given in Figures 8, 9 and 10 could be tested experimentally to further verify the model.