Hydration differences between the major and minor grooves of DNA revealed from heat capacity measurements

The nature of water on the surface of a macromolecule is reflected in the temperature dependence of the heat effect, i.e., the heat capacity change, ΔCp, that accompanies its removal on forming a complex. The relationship between ΔCp and the nature of the surface dehydrated cannot be modeled for DNA by the use of small molecules, as previously done for proteins, since the contiguous surfaces of the grooves cannot be treated as the sum of small component molecules such as nucleotides. An alternative approach is used here in which ΔCp is measured for the formation of several protein/DNA complexes and the calculated contribution from protein dehydration subtracted to yield the heat capacity change attributable to dehydration of the DNA. The polar and apolar surface areas of the DNA dehydrated on complex formation were calculated from the known structures of the complexes, allowing heat capacity coefficients to be derived representing dehydration of unit surface area of polar and apolar surface in both grooves. Dehydration of apolar surfaces in both grooves is essentially identical and accompanied by a reduction in ΔCp by about 3 J K−1 mol−1 (Å2)−1, a value of somewhat greater magnitude than observed for proteins {ΔCp = − 1.79 J K−1 mol−1 (Å2)−1}. In contrast, dehydration of polar surfaces is very different in the two grooves: in the minor groove ΔCp increases by 2.7 J K−1 mol−1 (Å2)−1, but in the major groove, although ΔCp is also positive, it is low in value: + 0.4 J K−1 mol−1 (Å2)−1. Physical explanations for the magnitudes of ΔCp are discussed.


Introduction
Protein folding is typically characterized by a significant reduction in the heat capacity of the system. Negative values of ΔCp, i.e., a decrease in the heat of folding with increasing temperature (ΔCp = ∂ΔH/∂T), are not principally due to additional conformational restraints on the polypeptide chain on folding, but to changes in the state of hydration (Makhatadze and Privalov 1995). The dominant change is the loss of water from hydrophobic groups as they come together inside the folded core of the protein, and a heat capacity decrement is regarded as a defining feature of protein folding. However, analysis of polar interactions in proteins led to the conclusion that their formation is associated with a heat capacity increase. Since the positive values of ΔCp for polar interactions are not as large as the negative values characterizing apolar interactions (per Å² of interacting protein surface), the latter dominate the heat capacity change on folding (Spolar et al. 1992; Murphy and Freire 1992; Privalov and Makhatadze 1992; Makhatadze and Privalov 1995). Equations have been established relating the changes in apolar and polar water accessible surface areas (ΔASA) to the resulting change in the heat capacity as a protein folds, and these are effective predictors of experimental ΔCp values. These observations can be summarized by saying that changes in the heat capacity of proteins and their complexes reflect alterations in their state of hydration, i.e., heat capacity changes are a proxy for changes in hydration.
When proteins associate (non-covalently) the circumstances are closely akin to the folding of individual polypeptide chains and the equations applicable to the folding of individual chains are equally effective predictors of the associated changes in the heat capacity accompanying complex formation. The formation of protein/DNA complexes is also associated with significant reductions in the heat capacity, likewise in consequence of the dehydration of hydrophobic surfaces (Ha et al. 1989;Spolar and Record 1994;Ladbury et al. 1994). However, the equations established for proteins and protein/protein interactions are not good predictors of the heat capacity changes observed for the binding of protein domains (DBDs) to duplex DNA (Ladbury et al. 1994;Morton and Ladbury 1996;O'Brien et al. 1998;Bergqvist et al. 2004) and this discrepancy has been assigned to the presence of residual waters at the protein/DNA interface and in peripheral polar environments (Morton and Ladbury 1996;Bergqvist et al. 2004).
A priori, it seems unlikely that heat capacity coefficients derived from the small compounds used to model proteins would be appropriate to the surface of the DNA grooves (Prabhu and Sharp 2005). Determination of such coefficients using small compounds modeling the components of DNA (bases, sugars) is unlikely to be helpful: the grooves are deep cavities with regular and closely spaced groups, so cooperativity in the binding of surface water is expected to occur and the additivity assumption would break down. A quite different approach is, therefore, required to determine the precise characteristics of dehydrating the surface of DNA in terms of the surface areas that become dehydrated in the major and the minor grooves. This article makes use of existing heat capacity data to generate relationships between the loss of accessible apolar and polar surface area in both the major and the minor grooves of DNA and the resulting contribution to the magnitude of the heat capacity change that occurs on complex formation with DNA binding domains (DBDs). It, therefore, represents a study of hydration in the major and minor grooves.
For protein folding, the total heat capacity effect, ΔCp(T), is formalized in equations of the type:

$$\Delta C_p(T) = \sum_i C_{p,i}(T)\,\Delta(\mathrm{ASA})_i \tag{1}$$

where the coefficient Cp i (T) represents the heat capacity effect per Å² of surface of defined type i and Δ(ASA) i is the change (reduction) in the accessible surface area of that type that becomes buried. Three types of surface have been recognized as distinct: aliphatic (non-polar), aromatic, and polar/charged, with the result that such a predictive equation has three terms. However, for many purposes aromatic surfaces can be regarded as non-polar, reducing the equations to two terms. For application to folding a protein of known structure, computer programs are used that roll a 1.4 Å sphere (representing a water molecule) over the surface of the unfolded and folded chains, and the difference in their water accessible surface areas, Δ(ASA) i , in consequence of folding is thereby evaluated for both categories of surface.
To interpret experimental ΔCp values accompanying formation of DBD-DNA complexes (the structures of which are known from X-ray and/or NMR studies) in terms of changes in accessible surface area requires that both interacting components be fully folded. However, significant refolding of DBDs frequently occurs on forming DNA complexes (Spolar and Record 1994; Privalov et al. 1999; Privalov and Crane-Robinson 2018), making a considerable contribution to the heats of binding observed in the titration calorimeter, with the result that the temperature dependence does not reflect the interaction of fully folded components. It is critical, therefore, to select for analysis only complexes for which correction for refolding has been applied by subtracting refolding heats (separately measured in the scanning calorimeter) from ITC-derived heats of association. In a few of the cases selected, care was taken by the authors to ensure that the unbound protein is already in a fully folded state, e.g., by restricting enthalpy measurements to low temperatures. If the DBD is a short peptide, the refolding issue is absent. In both circumstances the temperature dependence of the binding enthalpies, i.e., the ΔCp values, represents the binding of fully folded protein to DNA. Only complexes fulfilling these criteria were accepted into the analysis.
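Since ΔCp is the temperature dependence of the binding enthalpy, it can be estimated as the slope of a straight line fitted to enthalpies measured at several temperatures. The following is a minimal sketch of that idea, not the procedure of any of the cited studies: the function name and the enthalpy values are hypothetical, and a temperature-independent ΔCp over the measured range is assumed.

```python
def delta_cp_from_enthalpies(temps_k, enthalpies_kj):
    """Return the least-squares slope d(ΔH)/dT in kJ K^-1 mol^-1.

    Assumes ΔCp is constant over the temperature range, so that
    ΔH varies linearly with T.
    """
    if len(temps_k) != len(enthalpies_kj) or len(temps_k) < 2:
        raise ValueError("need at least two matched (T, ΔH) points")
    n = len(temps_k)
    t_mean = sum(temps_k) / n
    h_mean = sum(enthalpies_kj) / n
    sxx = sum((t - t_mean) ** 2 for t in temps_k)
    if sxx == 0:
        raise ValueError("temperatures must not all be identical")
    sxy = sum((t - t_mean) * (h - h_mean)
              for t, h in zip(temps_k, enthalpies_kj))
    return sxy / sxx


# Hypothetical binding enthalpies (kJ/mol), invented for illustration only.
temps_k = [283.15, 288.15, 293.15, 298.15]
enthalpies_kj = [-20.0, -26.0, -32.0, -38.0]

delta_cp = delta_cp_from_enthalpies(temps_k, enthalpies_kj)
print(f"ΔCp ≈ {delta_cp:.2f} kJ K^-1 mol^-1")
```

With these invented points the enthalpy falls by 6 kJ/mol for every 5 K, so the fitted slope is −1.2 kJ K⁻¹ mol⁻¹. Any refolding contribution would have to be removed from the enthalpies before such a fit.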

Methodology
Relating measured heat capacity changes for protein/DNA complexes to the dehydration of non-polar and polar surface of the two DNA grooves
The magnitude and sign of the Cp i (T) coefficients (Eq. 1) for protein folding have been assessed in three separate studies: Murphy and Freire (1992), Spolar et al. (1992) and Makhatadze and Privalov (1995). The averaged values of the above three studies can be represented by the equation:

$$\Delta C_p = -1.79\,\Delta(\mathrm{ASA})_{\mathrm{apolar}} + 0.98\,\Delta(\mathrm{ASA})_{\mathrm{polar}} \tag{2}$$

in which the Cp i (T) coefficients are expressed in J K⁻¹ mol⁻¹ (Å²)⁻¹ and ΔASA are in Å².
This equation was used to obtain the contribution from dehydration of the proteins in the selected complexes and the values obtained then subtracted from the total (observed) heat capacity change, ΔCp obs , to yield the difference (ΔCp DNA ) that represents the heat capacity change resulting from dehydration of the DNA surface in forming the complex. To deconvolute values of ΔCp DNA into contributions from apolar and polar surface dehydration, i.e., to develop equations similar to (2) but for DNA, reductions in the accessible surface areas (ΔASA) of the DNA components of the complexes were calculated for the apolar and polar surface of the minor and major grooves of the individual complexes. Combining the data for the several major groove complexes (six in number) and, separately, the several minor groove complexes (five in number) then allowed evaluation of the Cp i (T) coefficients for apolar and polar surface in the two DNA grooves.
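The subtraction step can be sketched in a few lines. The coefficients below are the averaged protein values quoted for Eq. (2) in the Discussion (−1.79 and +0.98 J K⁻¹ mol⁻¹ (Å²)⁻¹). The function names and all numerical inputs are hypothetical and are not taken from Table 1.

```python
# Averaged protein coefficients of Eq. (2), in J K^-1 mol^-1 per Å^2.
CP_APOLAR_PROTEIN = -1.79
CP_POLAR_PROTEIN = 0.98


def protein_delta_cp(dasa_apolar, dasa_polar):
    """Heat capacity change from burying protein surface (areas in Å^2)."""
    return CP_APOLAR_PROTEIN * dasa_apolar + CP_POLAR_PROTEIN * dasa_polar


def dna_delta_cp(delta_cp_obs, dasa_apolar_prot, dasa_polar_prot):
    """ΔCp attributed to DNA dehydration: observed minus protein part."""
    return delta_cp_obs - protein_delta_cp(dasa_apolar_prot, dasa_polar_prot)


# Hypothetical complex: 500 Å^2 apolar and 400 Å^2 polar protein surface
# buried, with an observed ΔCp of -1500 J K^-1 mol^-1 (invented values).
protein_part = protein_delta_cp(500.0, 400.0)
dna_part = dna_delta_cp(-1500.0, 500.0, 400.0)
print(f"ΔCp prot = {protein_part:.0f}, ΔCp DNA = {dna_part:.0f} J K^-1 mol^-1")
```

For these invented inputs the protein term is −895 + 392 = −503 J K⁻¹ mol⁻¹, leaving −997 J K⁻¹ mol⁻¹ to be assigned to the DNA surface.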

Results
Data for experimental heat capacity changes, ΔCp obs , were normalized to unit Å 2 of interface to facilitate comparison between complexes of varying size and are summarized in Fig. 1, separated into major and minor groove binders. The substantially negative ΔCp obs values support the general assumption that binding leads to extensive dehydration of apolar (rather than polar) groups (see Privalov et al. 2007 for a summary). The contribution from dehydration of the protein, ΔCp prot , calculated using the above (averaged) Eq. (2), is given in orange and the contribution of the DNA (ΔCp DNA in blue) is then the difference from the observed values. The most striking feature of this data set is that whilst the protein contributions do not differ greatly for binding in the two grooves, the heat capacity changes from dehydration of the DNA are very different: substantially negative in the major groove but only slightly so for the minor groove. It is clear that ΔCp for the DNA and protein surfaces (per Å 2 ) is not the same: i.e., the hydration characteristics of the DNA grooves differ from those of the proteins.

Analysis of the interface
The programs Naccess and PDIviz were used to determine the accessible surface areas of the protein domains, the free DNA and their complexes using just two categories of surface: apolar (including aromatic) and polar/charged. From these data, values of ΔASA, the change (reduction) in accessible surface area on forming the complexes, were calculated for both the protein and DNA components. The aromatic component was not separately assessed since aromatic rings are rarely exposed on the surface of proteins and in duplex DNA it is only the edges of the bases that are exposed, not the aromatic rings. Table 1 lists values of ΔASA prot , the reduction in accessible surface area of the proteins on forming their complexes, and the calculated heat capacity changes, ΔCp prot , resulting from occlusion of their surface area, using Eq. (2). Values of ΔCp prot were then subtracted from the observed ΔCp obs values to yield values of ΔCp DNA , the heat capacity change attributable to dehydration of the DNA on forming the complex. These values of ΔCp DNA , together with the changes in apolar and polar surface areas of the DNA (ΔASA DNA ) for each complex, were substituted into the equation:

$$\Delta C_p^{\mathrm{DNA}} = C_{p,\mathrm{apolar}}\,\Delta(\mathrm{ASA})_{\mathrm{apolar}} + C_{p,\mathrm{polar}}\,\Delta(\mathrm{ASA})_{\mathrm{polar}} \tag{3}$$

The minor groove
Five complexes were used, three of which are HMG boxes (see Dragan et al. 2004). D74 is a truncated form of the non-sequence specific (NSS) Drosophila HMG-D protein and includes only the minimal HMG box, i.e., excludes the highly basic 26 residue C-terminal tail included in the D100 construct. Lef79 is likewise the minimal HMG box from the mouse sequence-specific (SS) transcription factor LEF-1, similarly missing its basic 8-residue C-tail present in Lef86. Data for the longer versions of these two DBDs have not been used since their basic C-terminal tails do not bind in the minor groove but extend across the major groove and make only non-specific ionic links. SRY81 is the SS HMG box from human SRY: this includes a short C-terminal tail that tracks along the minor groove adjacent to the HMG box itself and the same is true for the HMG box from mouse Sox5. The AT-hook motif 'Core DBD2' is the minimal 10-residue 'core' element representing the second AT-hook from HMGA1, the central RGR element of which sits on the floor of the minor groove.

Fig. 1 Total surface-normalized (i.e., per Å²) values of the observed heat capacity changes, ΔCp obs , for binding DBDs to their optimal recognition target sequences. The contribution from the protein components (orange), ΔCp prot , was calculated from the averaged protein Eq. (2). The DNA contributions, ΔCp DNA , (blue) were obtained by subtraction from ΔCp obs . The interfacial areas used for normalization were averages of the protein and DNA contact areas. For details see Table 1

The major groove
Six complexes have been used for the major groove analysis. Homeodomains insert a recognition helix into the major groove, but additionally have N-terminal extensions into the minor groove. To restrict consideration to just the major groove, Fig. 1 gives data for the truncated forms of the Antennapedia and NK2 homeodomains that lack their N-terminal extensions: desAntp and desNK2 (Dragan et al. 2006).
It also includes data for the Mat α2 homeodomain that retains a very short N-terminal extension in the minor groove (Carra and Privalov 1997). FOXP2 inserts the third α-helix of its forkhead domain into the major groove and there are very few minor groove contacts in this case (Morris et al. 2018). TFIIIA is a three zinc-finger element from the Xenopus protein (Wuttke et al. 1997; Liggins and Privalov 2000). Thermodynamic data for DNA binding of the Cro repressor dimer are from Takeda et al. (1992). The Sox5/DNA complex structure is from unpublished data (Read et al. 2018).
In Eq. (3) the values of Δ(ASA) apolar and Δ(ASA) polar , the apolar and polar surface areas of DNA binding sites, are independent variables. For the minor groove complexes there are five sets of apolar and polar ΔASA variables and for the major groove complexes there are six. The searched-for parameters, the coefficients Cp i , are taken as constants and represent the surface-normalized heat capacity effects of dehydrating unit surface area of apolar and polar surface on the DNA. These were initially estimated using Eq. (3) and then the regression programme in Origin was applied to each set of ΔASA i variables. Table 2 gives the resulting equations for each groove. To display how effectively the calculated coefficients express the experimental data, a graph was plotted for each groove of the experimentally observed heat capacities against those predicted using the derived coefficients (see Fig. 2). The diagonals, having a slope of unity, represent exact correspondence between the observed ΔCp DNA and that calculated using the averaged values of Cp i given in Table 2.
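The fitting step can be illustrated with an ordinary least-squares solution of the two-term relation of Eq. (3). This is a stand-in sketch, not the Origin regression used in the paper. It assumes a fit with no constant term, the function name is hypothetical, and the data are synthetic values generated from arbitrarily chosen coefficients rather than the entries of Table 1.

```python
def fit_groove_coefficients(dasa_apolar, dasa_polar, delta_cp_dna):
    """Least-squares fit of y = c_apolar * a + c_polar * p (no intercept).

    Solves the 2x2 normal equations directly; returns (c_apolar, c_polar).
    """
    if not (len(dasa_apolar) == len(dasa_polar) == len(delta_cp_dna)):
        raise ValueError("all input lists must have the same length")
    saa = sum(a * a for a in dasa_apolar)
    spp = sum(p * p for p in dasa_polar)
    sap = sum(a * p for a, p in zip(dasa_apolar, dasa_polar))
    say = sum(a * y for a, y in zip(dasa_apolar, delta_cp_dna))
    spy = sum(p * y for p, y in zip(dasa_polar, delta_cp_dna))
    det = saa * spp - sap * sap
    if abs(det) < 1e-12:
        raise ValueError("apolar and polar areas are collinear; fit undefined")
    c_apolar = (say * spp - spy * sap) / det
    c_polar = (spy * saa - say * sap) / det
    return c_apolar, c_polar


# Synthetic example: five hypothetical complexes whose ΔCp DNA values are
# generated from assumed coefficients of -3.0 and +2.5 J K^-1 mol^-1 per Å^2.
apolar = [200.0, 260.0, 310.0, 180.0, 150.0]
polar = [150.0, 240.0, 190.0, 210.0, 90.0]
observed = [-3.0 * a + 2.5 * p for a, p in zip(apolar, polar)]

c_apolar, c_polar = fit_groove_coefficients(apolar, polar, observed)
print(f"Cp apolar = {c_apolar:.2f}, Cp polar = {c_polar:.2f}")
```

Because the synthetic data contain no noise, the fit returns the generating coefficients. With only five or six complexes per groove and two fitted parameters, real data leave few degrees of freedom, which is why a plot of observed against predicted values is a useful check.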

Discussion
The first comment is that in both grooves the apolar coefficients are negative, whilst both polar coefficients are positive, just as with proteins. The precise coefficients immediately show why the contribution to ΔCp from dehydration of the DNA is much less in the minor groove than in the major groove (see Fig. 1) despite similar interfacial contact areas: in the minor groove the apolar and polar coefficients are of comparable magnitude but of opposite sign, leading to a low net contribution, whereas in the major groove the positive polar coefficient is small in magnitude relative to the apolar, so that the apolar dominates, resulting in strongly negative net heat capacity changes.
The essence of the equations derived for the two grooves can be summarized as follows: the heat capacity effect of forming an apolar interface is very similar in the two grooves and is characterized by a negative value of ΔCp apolar = −3.2 J K⁻¹ mol⁻¹ (Å²)⁻¹, a situation similar to the formation of an apolar interface in proteins, for which the average ΔCp apolar is −1.79 J K⁻¹ mol⁻¹ (Å²)⁻¹ (Eq. 2). The uniformity of ΔCp apolar for the two grooves of DNA must result from the fact that the walls of both grooves consist of exposed sugar rings. Furthermore, the quantitative difference between apolar DNA surface and that of the averaged apolar (largely aliphatic) surface of polypeptide chains reflects differences in their effects on the water bound to them.
In contrast, the formation of polar interfaces is very different in the two grooves. In the minor groove, a large and positive value of ΔCp polar = +2.67 J K⁻¹ mol⁻¹ (Å²)⁻¹ is observed, substantially larger than the values found for the formation of a polar interface in proteins, which average to ΔCp polar = +0.98 J K⁻¹ mol⁻¹ (Å²)⁻¹ (Eq. 2). This unusually large positive polar coefficient characteristic of the minor groove can only be a consequence of displacing the ordered ice-like water known to be characteristic of this groove (Kopka et al. 1983; Chiu et al. 1999), the ordering of which is based on the regularity of N3 of A and O2 of T groups in AT-rich regions and is, therefore, a property [...] In this context, it is worth recalling that the heat capacity of liquid water (4.18 J g⁻¹ K⁻¹ at 25 °C) is about twice that of ice (2.03 J g⁻¹ K⁻¹ at −10 °C). Release of this water gives rise to unusually positive enthalpies and entropies, a notable characteristic of protein binding to the minor groove (Privalov et al. 1999, 2007; Dragan et al. 2004; Crane-Robinson 2017, 2018).
In the major groove a ΔCp polar of only +0.38 J K⁻¹ mol⁻¹ (Å²)⁻¹ implies that the water bound to the irregularly spaced polar H-bond donor and acceptor groups that run along the base of the major groove has a structure that differs little from that of bulk water. Figure 3 shows a crystallographic structure of a 16 bp duplex having high enough resolution (1.6 Å) to define a substantial number of the hydrating water molecules: the distinction between the ordered array in the minor groove and the relatively disordered hydration of the major groove is very apparent. A substantial difference between the grooves as regards their hydration is supported by assessments of the effective dielectric constant within the two grooves: whereas major groove water is not too different from the bulk, exhibiting an apparent dielectric constant of ~50 D, water in the AT-rich minor groove appears to have a dielectric constant of only ~20 D (Barawkar and Ganesh 1995; Jin and Breslauer 1988). Although there is no direct relationship between the dielectric constant and the thermodynamic parameters of the hydrating water, these data demonstrate a substantial difference in the state of the water in the two grooves.
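The effect of the different polar coefficients can be seen in a short worked example. The coefficients are those quoted in this Discussion (−3.2 for apolar surface in both grooves, +2.67 and +0.38 for polar surface in the minor and major grooves, respectively). The buried areas of 300 Å² apolar and 300 Å² polar DNA surface are hypothetical round numbers chosen only to make the arithmetic transparent.

```latex
\begin{align*}
\Delta C_p^{\mathrm{DNA}}(\text{minor}) &= (-3.2)(300) + (2.67)(300)
  = -960 + 801 = -159\ \mathrm{J\,K^{-1}\,mol^{-1}} \\
\Delta C_p^{\mathrm{DNA}}(\text{major}) &= (-3.2)(300) + (0.38)(300)
  = -960 + 114 = -846\ \mathrm{J\,K^{-1}\,mol^{-1}}
\end{align*}
```

For identical buried areas the apolar term is the same in both grooves. The net DNA contribution is small in the minor groove only because the large positive polar term nearly cancels it.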
Protein/DNA complexes normally include multiple salt links between phosphates and lysine/arginine sidechains, the formation of which must be accompanied by dehydration events, so the question arises as to whether these ionic bonds are relevant to the ΔCp measurements made here. Experiments on a variety of DBD/DNA complexes (both sequence-specific and non-sequence specific) have shown that the enthalpy of the binding interaction is independent of the salt concentration, despite large changes in the affinities, i.e., the salt links are non-enthalpic. This means they make no contribution to the measured ΔCp (see Dragan et al. 2004; Takeda et al. 1992; Ladbury et al. 1994). Although this conclusion was questioned in the case of IHF binding to DNA (Holbrook et al. 2001), IHF represents an example of salt-dependent protein refolding on binding to DNA (see Swinger and Rice 2004), with consequent enthalpic changes, rather than a direct enthalpic effect of forming/breaking ionic links with the DNA. The dehydration processes reported on by the ΔCp measurements, therefore, take place within the grooves, not along the phosphodiester backbone.
A previous attempt to establish a relationship between observed heat capacity changes on complex formation and apolar/polar coefficients (Uedaira et al. 2003) concluded that ΔCp apolar ~ −4.7 J K⁻¹ mol⁻¹ (Å²)⁻¹ and ΔCp polar ~ +2.5 J K⁻¹ mol⁻¹ (Å²)⁻¹ for a group of ten DNA/DBD complexes, overall values derived without separating the protein and DNA contributions. Although of the same sign as found in the present analysis, these values average the protein and DNA contributions and are of larger magnitude than reported here. The differences are probably also a consequence of using experimental heat capacities uncorrected for refolding (which results in excessively negative values of ΔCp obs ), as well as of not making any distinction between the grooves.
The parameters derived here for dehydration of the DNA grooves raise the question as to the physical basis of the sign and magnitude of observed ΔCp values. The simplest view of a negative ΔCp for the dehydration of apolar surface is that water molecules hydrating hydrophobic groups in free solution are oriented by the asymmetric attraction from the bulk water and in consequence are less restricted and constrained by hydrogen bonding to each other than are waters in the bulk liquid. This results in such oriented waters having an enhanced heat capacity, so that when shed into the bulk solution, their heat capacity drops. In physical terms such an explanation implies that the water hydrating polar groups in free solution is very tightly H-bonded, i.e., more constrained than the molecules of bulk water: its loss on complex formation, therefore, results in an increase in its heat capacity (Bergqvist et al. 2004).

Fig. 3 Representation of a 16 bp duplex solved at 1.6 Å resolution (Narayana and Weiss 2009) showing the apolar and polar surface characteristics. Apolar: carbons in white and the methyl groups of T in green. These constitute the walls of both grooves. Polar: red (negative) and blue (positive). These make up the bottom of both grooves. Phosphate groups are separately designated in yellow and water molecules are shown as cyan dots. Note the very regular array of eight waters covering six bp in the central minor groove (five ATs interrupted by a single GC). Water molecules in the major groove are more haphazardly positioned

Summary
Treating a change in the heat capacity on forming a DNA-DBD complex as a proxy for the change in hydration of the surfaces in question, this data analysis leads to the conclusion that the water hydrating apolar DNA surface is essentially the same in both grooves (when expressed on a per Å 2 basis), presumably as a consequence of similar exposure of sugar rings on the walls of both grooves. In contrast, the hydration of polar surface is very different in the two grooves: in the minor groove the unusually large heat capacity increase must be a consequence of the release of the highly ordered ('ice-like') water bound to the polar atoms on the edge of the bases. In the major groove, the water hydrating polar groups does not differ greatly from bulk water because these polar groups do not have the regularity and spacing appropriate for the formation of ordered water structures, unlike in the minor groove.