The physical chemistry of high-sensitivity differential scanning calorimetry of biopolymers

High-sensitivity differential scanning calorimetry (HSDSC) is widely used to examine the thermal behaviour of biomolecules and water-soluble polymers in aqueous solution. The principal purpose of this manuscript is to examine the thermodynamic basis for the signals obtained using HSDSC. It is shown that a combination of the van’t Hoff isochore and Kirchhoff’s equation is all that is necessary to simulate and curve fit the HSDSC output obtained for the thermally induced unfolding of the protein ubiquitin. The treatment is further developed to show how the temperature dependence of the heat capacity change of unfolding, multiple sequential transitions, and protein dissociation can be incorporated into the thermodynamic description of protein unfolding and how these factors in turn affect the HSDSC signal.


Introduction
High-sensitivity differential scanning calorimetry (HSDSC) is widely employed for the study, in aqueous solution, of the thermodynamic parameters associated with processes initiated either by an increase in temperature (up-scan) or by a decrease in temperature (down-scan). Molecules of small molecular mass cannot be examined by HSDSC unless they form aggregate structures showing intermolecular co-operation. On the other hand, biopolymers in aqueous solution, such as proteins, which are cooperatively stabilised by numerous weak forces, can be examined by HSDSC.
Typically HSDSC can be used to examine: 1. Transitions from the physiologically active native form of a protein through intermediate partially unfolded states to the final denatured form of the protein. Very often, such a process is characterised by minimally populated intermediate states and thus approximates to a two-state transition between the initial native form and the final denatured form of the protein [1]. 2. Thermally induced co-operative transitions in molecular assemblies of phospholipids, such as multi-lamellar liposomes [2]. 3. Melting transitions in DNA and oligonucleotides [3].
In HSDSC, the specific heat of an aqueous system is measured as a function of temperature. For an aqueous solution of a biopolymer, the apparent specific heat of the solute ($S_2$) is given by the following expression [1]:
$$S_2 = S_1 + \frac{S - S_1}{w_2} \tag{1}$$
where $S$ is the specific heat of the solution, $S_1$ is the specific heat of the solvent, and $w_2$ is the weight fraction of the solute. Because the quantity $(S - S_1)$ is usually very small, a differential mode of measurement [solvent (reference cell) versus solvent plus solute (sample cell)] has to be used. Indeed, given that a major portion of the specific heat change is due to the heating and cooling of the solvent (usually water, which has a large heat capacity), it is essential to have a differential arrangement, so that phase transitions in the solute can be observed.
HSDSC signals and their interpretation for protein unfolding
DSC instruments measure the power required to maintain the temperature of a sample placed in a designated sample cell at (or close to) the same value as that of a reference cell containing the identical aqueous solvent, but no sample molecule, as the overall temperature of the system is altered. The cells are located within an adiabatic vacuum chamber. The raw instrumental output conventionally shows power as a function of temperature. To extract data that have more thermodynamic significance, the axes of the trace output are transformed. Power is converted to a molar excess heat capacity ($C_{p,xs}$) using the formula:
$$C_{p,xs} = \frac{dq_p}{dt}\cdot\frac{1}{r}\cdot\frac{1}{M} = \frac{dq_p}{dt}\,\frac{dt}{dT}\,\frac{1}{M} \tag{2}$$
where $q_p$ is the heat absorbed at constant pressure; $t$ is time; the derivative $dq_p/dt$ represents power; $r$ is the scan rate ($dT/dt$, where $T$ is temperature); and $M$ is the number of moles of sample in the sample cell. A typical DSC experiment normally involves at least two scanning runs. The first is a baseline scan, wherein the sample cell and reference cell both contain the blank aqueous solvent. The second is a scan of the solvent (reference cell) against the solvent plus solute (sample cell). The baseline scan is then subtracted from the sample scan. Figure 1 provides a typical example of an HSDSC trace of the excess heat capacity (the heat capacity difference between the sample and reference cells) as a function of temperature. The signal shown in Fig. 1 was obtained for the protein ubiquitin in buffer solution at a pH of 2.
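The conversion of instrument power into a molar excess heat capacity described above can be illustrated with a few lines of Python. This is only a sketch: the function name and the numerical values (power, scan rate, amount of protein) are hypothetical and are not taken from the ubiquitin experiment.

```python
def molar_excess_heat_capacity(power_w, scan_rate_k_per_s, moles):
    """Convert differential power (W = J s^-1) into a molar excess heat
    capacity (J K^-1 mol^-1) by dividing by the scan rate and the moles."""
    if scan_rate_k_per_s == 0:
        raise ValueError("scan rate must be non-zero")
    if moles <= 0:
        raise ValueError("number of moles must be positive")
    return power_w / (scan_rate_k_per_s * moles)


# Hypothetical illustrative values (not from the article)
scan_rate = 1.0 / 60.0   # 1 K min^-1 expressed in K s^-1
moles = 5.0e-8           # amount of protein in the sample cell, mol
power = 10.0e-6          # baseline-corrected differential power, W

cp_xs = molar_excess_heat_capacity(power, scan_rate, moles)
print(f"C_p,xs = {cp_xs:.1f} J K^-1 mol^-1")
```

Because the result scales inversely with the number of moles, any error in the amount of active protein placed in the cell carries through directly to the calorimetric quantities derived from the trace.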
Proteins undergo denaturation on heating. The process involves a transition from the physiologically active compact folded form to the normally physiologically inactive unfolded form. Native protein structures in aqueous solution are cooperatively stabilised by numerous intramolecular forces. Disruption of these forces requires an endothermic enthalpy change. The favourable free energy contribution to denaturation is provided by the entropy change that arises from the increased conformational freedom available to the unfolded protein and the increased number of ways of partitioning the increased thermal energy.
A simple pedagogic model of thermally induced unfolding has been described by Dill and Bromberg [4]. Consider a four-bead molecular chain, as shown in Fig. 2. In this model, the ground state is characterised by a compact molecular structure that is held together by an intramolecular bond (the dashed line) between the chain ends. The first excited microstate is fourfold degenerate, i.e., there are four different unfolded conformations of equal energy that the molecule can adopt. The fractional occupancy of the two energy states, and its functional relationship with temperature, can be calculated using the Boltzmann distribution equation:
$$\frac{n_1}{n_0} = \frac{g_1}{g_0}\exp\left(-\frac{\varepsilon_1 - \varepsilon_0}{kT}\right) \tag{3}$$
The subscripts 0 and 1 denote the ground state and first excited microstate, respectively; $n$ is the number of molecules in a particular state; $g$ is the degeneracy of that state, with $g_0 = 1$ and $g_1 = 4$; $\varepsilon$ is the energy of the state; $k$ is the Boltzmann constant; and $T$ is the absolute temperature. Using the mass balance expression $n = n_0 + n_1$ and $\Delta\varepsilon = \varepsilon_1 - \varepsilon_0$, we can rewrite Eq. 3 as
$$\frac{n - n_0}{n_0} = 4\exp\left(-\frac{\Delta\varepsilon}{kT}\right) \tag{4}$$
which gives the following expression for the fraction of molecules in the ground state:
$$f_0 = \frac{n_0}{n} = \frac{1}{1 + 4\exp(-\Delta\varepsilon/kT)} \tag{5}$$
and for the fraction of molecules in the excited state:
$$f_1 = \frac{n_1}{n} = \frac{4\exp(-\Delta\varepsilon/kT)}{1 + 4\exp(-\Delta\varepsilon/kT)} \tag{6}$$
The temperature dependence of the composition of the system is shown in Fig. 3. This system is an example of a two-state system, i.e., a system within which only two states are significantly populated. At low temperatures, the ground state form predominates. The enthalpy of intramolecular binding is the key to this predominance. However, as the temperature rises, the excited state becomes increasingly populated, thereby demonstrating the increasingly important entropic contribution of conformational variety to the system.
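The populations of the four-bead model can be computed directly from the Boltzmann distribution. The following Python sketch assumes a hypothetical energy gap of 6 × 10⁻²¹ J (the article does not state the value used for Fig. 3); the function and variable names are illustrative only.

```python
import math

K_B = 1.380649e-23  # Boltzmann constant, J K^-1


def four_bead_fractions(delta_e, temperature, g0=1, g1=4):
    """Return (f0, f1): fractions of molecules in the compact ground state
    and in the fourfold-degenerate unfolded state at a given temperature."""
    boltzmann_ratio = (g1 / g0) * math.exp(-delta_e / (K_B * temperature))
    f0 = 1.0 / (1.0 + boltzmann_ratio)
    f1 = boltzmann_ratio / (1.0 + boltzmann_ratio)
    return f0, f1


DELTA_E = 6.0e-21  # hypothetical energy gap between the two states, J

# Temperature at which both states are equally populated: g1*exp(-de/kT) = 1
t_mid = DELTA_E / (K_B * math.log(4.0))

for temperature in (100.0, 200.0, t_mid, 500.0, 1000.0):
    f0, f1 = four_bead_fractions(DELTA_E, temperature)
    print(f"T = {temperature:7.1f} K  f0 = {f0:.3f}  f1 = {f1:.3f}")
```

Note that the excited-state fraction cannot exceed g1/(g0 + g1) = 0.8 however high the temperature, because at that limit the two levels are populated purely in proportion to their degeneracies.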
The statistical thermodynamic description of protein unfolding is far more complex than the four-bead molecular model, but the model does encapsulate one of the reasons why proteins unfold upon heating to moderately high temperatures: the large number of excited state conformers. Moreover, just like the model, protein unfolding is very often a two-state process. As a consequence, the signal shown in Fig. 1 can be interpreted as showing how the thermal history of the system reflects the changing composition of the aqueous protein system as temperature increases. At low temperatures, the compact native form predominates. As the temperature is increased, some molecules begin to unfold. The fraction of molecules that have unfolded multiplied by the enthalpy of the unfolding transition provides the basis of the heat signal. Since the enthalpy change is endothermic, the temperature of the sample cell becomes lower than that of the reference cell, and thus the instrument measures the power needed to compensate for the temperature difference. This, as we have shown, is easily converted into a molar excess heat capacity.

The DSC signal and initial data analysis
There are several features of the DSC signal shown in Fig. 1 that require comment. The transition from the compact physiologically active form of the protein to the more open, unfolded, physiologically inactive form appears as an increase in the heat capacity of the system, which passes through a maximum at a temperature designated $T_m$ and then decreases to a final heat capacity value that is higher than the initial value. The initial low-temperature portion of the scan represents the heat capacity of the native form of the protein in aqueous solution (denoted $C_{p,N}$). The high-temperature portion of the scan shows the heat capacity in aqueous solution of the unfolded form of the protein (denoted $C_{p,D}$). In this scan, both heat capacities are assumed to be invariant with temperature over the temperature range of the experimental run, so the heat capacity change on unfolding, given by the expression $\Delta C_p = C_{p,D} - C_{p,N}$, is a constant. Formally, the molar heat capacity is the amount of heat energy required to raise the temperature of 1 mol of substance through 1 K. At the molecular level, the additional heat energy is distributed among the various degrees of freedom and partitioned variously between kinetic energies, including vibrational, rotational, and translational transitions, and potential energies, including the stretching and bending of molecular bonds [5]. The existence of the heat capacity change indicates that both enthalpy and entropy are functionally dependent upon temperature. If $\Delta C_p$ is constant, then we can write:
$$\Delta H(T) = \Delta H(T_{ref}) + \Delta C_p\,(T - T_{ref}) \tag{7}$$
where $T_{ref}$ is a reference temperature. Using the second law of thermodynamics, we get a similar expression for the entropy change:
$$\Delta S(T) = \Delta S(T_{ref}) + \Delta C_p \ln\frac{T}{T_{ref}} \tag{8}$$
There are several reasons why the heat capacity increases when a protein unfolds. These include the exposure to water, when the protein molecule unfolds, of hydrophobic amino acid side chains that are buried in the core of the native form of the protein.
For example, Connelly and Thomson [6] noted that the dissolution of aliphatic and aromatic hydrocarbons in water invariably leads to an increase in heat capacity. However, there are likely to be other contributions, such as an increase in easily excitable vibrational modes upon unfolding. The enthalpy of denaturation can be obtained from the experimental data by integration of the peak area. This value is referred to as the calorimetric enthalpy ($\Delta H_{cal}$). To obtain the peak area, we must first draw a baseline to the data. In the example shown in Fig. 4a, a straight line is drawn from what is judged to be the start of the transition to the adjudged end of the transition. Once we are satisfied that the baseline satisfactorily joins the onset and termination of the thermal transition, it is subtracted from the HSDSC data (Fig. 4b) to leave a transitional profile with a flat baseline. The resultant signal can then be divided up into evenly spaced segments. The area of the individual segments can be computed either by the trapezoidal rule or by Simpson's rule and then summed to give the integrated peak area (Fig. 4c).
The trapezoidal rule equation is
$$\int_{x_0}^{x_n} f(x)\,dx \approx \sum_{i=1}^{n} \frac{(x_i - x_{i-1})}{2}\left[f(x_{i-1}) + f(x_i)\right]$$
Using the trapezoidal rule, $\Delta H_{cal}$ was found to be 198 kJ mol$^{-1}$. Straight-line baselines are convenient and easily drawn but do not necessarily reflect the true geometry of the underlying baseline. Other baselines can be fitted. In Fig. 4d, the pre- and post-transitional portions of the signal are fitted to a cubic polynomial of the form $f(x) = ax^3 + bx^2 + cx + d$. Other functions, for example quartic polynomials or cubic splines, can be used and may represent the underlying baseline better. Yet, normally, area integration using a straight-line baseline provides values not too dissimilar to values obtained using other baseline functions.
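The baseline subtraction and trapezoidal integration described above can be sketched in Python as follows. The data here are synthetic: a Gaussian-shaped peak of known area (200 kJ mol⁻¹, a hypothetical value) sitting on a sloping straight line, with no heat capacity step, so that the recovered area can be checked. The function names are illustrative and do not correspond to any instrument software.

```python
import math


def straight_baseline(temps, signal):
    """Straight line joining the first and last points of the chosen window
    (the judged start and end of the transition)."""
    slope = (signal[-1] - signal[0]) / (temps[-1] - temps[0])
    return [signal[0] + slope * (t - temps[0]) for t in temps]


def trapezoid_area(x, y):
    """Trapezoidal rule: sum the areas of the segments between adjacent points."""
    return sum((x[i + 1] - x[i]) * (y[i] + y[i + 1]) / 2.0
               for i in range(len(x) - 1))


# Synthetic trace (hypothetical values): Gaussian peak on a sloping baseline
AREA = 200.0e3   # true peak area, J mol^-1
T_MID = 330.0    # peak position, K
WIDTH = 3.0      # peak standard deviation, K

temps = [300.0 + 0.1 * i for i in range(601)]
peak = [AREA / (WIDTH * math.sqrt(2.0 * math.pi))
        * math.exp(-0.5 * ((t - T_MID) / WIDTH) ** 2) for t in temps]
raw = [1000.0 + 20.0 * (t - 300.0) + p for t, p in zip(temps, peak)]

baseline = straight_baseline(temps, raw)
corrected = [r - b for r, b in zip(raw, baseline)]
dh_cal = trapezoid_area(temps, corrected)
print(f"Integrated peak area = {dh_cal / 1000.0:.2f} kJ mol^-1")
```

For real data the choice of the start and end points of the window is a matter of judgement, and a signal with a pronounced heat capacity step may call for one of the alternative baseline functions mentioned in the text.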
Can we use thermodynamics to examine the HSDSC signal?
Thermodynamic information may be obtained from the signal if it can be established that the signal was obtained under conditions of thermodynamic equilibrium. Thermodynamic control of the thermal processes observed in the calorimeter may be established by investigating the reversibility of the system. If the system either reproduces the same trace on rescanning or produces an identical trace on cooling, then the application of thermodynamic relationships to aid our understanding of the HSDSC trace is justified. A further test of the applicability of thermodynamics is to examine the HSDSC signals for scan rate dependence. Parameters measured for processes under thermodynamic control show no scan rate dependence [7]. For proteins where unfolding is a two-state process, the fraction of protein in the native state is given by the following expression, which is analogous to the expression for the ground state in the four-bead model:
$$f_N = \frac{1}{1 + \exp(-\Delta G/RT)}$$
and similarly for the denatured state
$$f_D = \frac{\exp(-\Delta G/RT)}{1 + \exp(-\Delta G/RT)}$$
The energy difference between the two states is the free energy of denaturation ($\Delta G$). The ratio of the fraction of denatured protein to the fraction of native protein is equal to the ratio of the concentrations of the denatured ([D]) and native ([N]) protein:
$$\frac{f_D}{f_N} = \frac{[D]/P_t}{[N]/P_t} = \frac{[D]}{[N]}$$
where $P_t$ is the total concentration of protein.
Using Eqs. 7 and 8, the ratio is also equal to the following expression:
$$\frac{f_D}{f_N} = \exp\left(-\frac{\Delta G}{RT}\right)$$
However, from fundamental thermodynamics, we know that
$$\Delta G = -RT\ln K_p$$
where $K_p$ is the equilibrium constant at constant pressure for the unfolding process. Thus, the equilibrium constant for denaturation obtained under constant pressure conditions is equal to the concentration ratio of the unfolded and native forms:
$$K_p = \frac{[D]}{[N]} \tag{14}$$
The thermodynamic basis of the HSDSC signal
The fraction of unfolded protein at temperature $T$, multiplied by the enthalpy of unfolding at the same temperature, gives the enthalpy needed to unfold the fraction $f_D$ of protein at temperature $T$. The rate of change of this enthalpy with temperature gives the excess heat capacity, i.e., the heat capacity difference between the sample and reference cells:
$$C_{p,xs} = \frac{d}{dT}\left[f_D\,\Delta H_{cal}(T)\right] \tag{15}$$
To calculate $C_{p,xs}$, we need to be able to calculate the changing composition of the system. This is readily done using the mass balance expression:
$$P_t = [N] + [D] \tag{16}$$
where $P_t$ is the total concentration of protein. Rearranging Eq. 14 to provide an expression for [N] and substituting this in Eq. 16 gives
$$P_t = \frac{[D]}{K_p} + [D] \tag{17}$$
which can be rearranged to give the fraction of unfolded protein $f_D$:
$$f_D = \frac{[D]}{P_t} = \frac{K_p}{1 + K_p} \tag{18}$$
How do we calculate the changing composition of the system as a function of temperature? This is readily achieved using the van't Hoff isochore [4]:
$$\frac{d\ln K_p}{dT} = \frac{\Delta H_{vH}(T)}{RT^2} \tag{19}$$
where $\Delta H_{vH}(T)$ is the van't Hoff enthalpy (for a two-state process, this is equal to the enthalpy of unfolding) and $R$ is the universal gas constant. We have already noted that the unfolding process is accompanied by a positive change in heat capacity, which means that the van't Hoff enthalpy is temperature dependent (see Eq. 7). Substituting Eq. 7 into Eq. 19 and integrating the resultant expression between $T_{ref}$ and $T$ gives
$$\ln\frac{K_p(T)}{K_p(T_{ref})} = \frac{\Delta H_{vH}(T_{ref})}{R}\left(\frac{1}{T_{ref}} - \frac{1}{T}\right) + \frac{\Delta C_p}{R}\left[\ln\frac{T}{T_{ref}} + \frac{T_{ref}}{T} - 1\right] \tag{20}$$
and, if $T_{ref}$ is taken as the temperature at which $K_p(T_{ref}) = 1$ (i.e., $f_D = 0.5$),
$$K_p(T) = \exp\left\{\frac{\Delta H_{vH}(T_{ref})}{R}\left(\frac{1}{T_{ref}} - \frac{1}{T}\right) + \frac{\Delta C_p}{R}\left[\ln\frac{T}{T_{ref}} + \frac{T_{ref}}{T} - 1\right]\right\} \tag{21}$$
Fig. 4 Integration of the peak area. a Baseline is constructed so as to connect the start of the transition and the end of the transition. b Baseline is subtracted from the data. c Area under the peak is divided up into evenly spaced segments that are then used to calculate the area using either the trapezoidal rule or Simpson's rule. d Baseline fitted to the pre- and post-transitional portions of the signal using a cubic polynomial (see text for details)
Using Eq. 21 in Eqs. 18 and 16 allows us to calculate how the fractions of denatured and native protein vary with temperature.
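The calculation of the composition from the integrated van't Hoff isochore can be sketched in Python as follows. The sketch assumes a constant heat capacity change and an equilibrium constant of 1 at the reference temperature; the parameter values are hypothetical and are not the fitted ubiquitin values.

```python
import math

R = 8.314462618  # universal gas constant, J K^-1 mol^-1


def equilibrium_constant(temp, dh_ref, dcp, t_ref):
    """K_p(T) from the integrated van't Hoff isochore with constant dCp,
    taking K_p(T_ref) = 1."""
    ln_k = ((dh_ref / R) * (1.0 / t_ref - 1.0 / temp)
            + (dcp / R) * (math.log(temp / t_ref) + t_ref / temp - 1.0))
    return math.exp(ln_k)


def fraction_unfolded(temp, dh_ref, dcp, t_ref):
    """f_D = K_p / (1 + K_p); the native fraction is 1 - f_D."""
    k = equilibrium_constant(temp, dh_ref, dcp, t_ref)
    return k / (1.0 + k)


# Hypothetical parameters (not from the article)
DH_REF = 200.0e3  # van't Hoff enthalpy at T_ref, J mol^-1
DCP = 3.0e3       # heat capacity change on unfolding, J K^-1 mol^-1
T_REF = 330.0     # temperature at which K_p = 1, K

for temp in (310.0, 320.0, 330.0, 340.0, 350.0):
    f_d = fraction_unfolded(temp, DH_REF, DCP, T_REF)
    print(f"T = {temp:.0f} K  f_N = {1.0 - f_d:.4f}  f_D = {f_d:.4f}")
```

A useful sanity check on any implementation of this kind is that the numerical derivative of ln K with respect to temperature reproduces the van't Hoff isochore with the Kirchhoff-corrected enthalpy.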

Simulating and fitting the HSDSC signal
If we complete the differentiation, as shown in Eq. 15, we obtain
$$C_{p,xs} = \Delta H_{cal}(T)\,\frac{df_D}{dT} + f_D\,\Delta C_{p,cal} \tag{22}$$
In this equation, $\Delta C_{p,cal}$ is the heat capacity change obtained from the signal. We will find it convenient to differentiate between this value and the value of $\Delta C_p$ used in the van't Hoff derived equations (Eqs. 20 and 21). The relationship between the two parameters is given by
$$\Delta C_{p,cal} = \frac{\Delta H_{cal}}{\Delta H_{vH}}\,\Delta C_p \tag{23}$$
To derive an analytical solution to Eq. 22, we need to find an expression for $df_D/dT$. This is achieved using the following transformation based upon the van't Hoff isochore (Eq. 19):
$$\frac{df_D}{dT} = \frac{df_D}{d\ln K_p}\,\frac{d\ln K_p}{dT} \tag{24}$$
Given
$$K_p = \frac{f_D}{1 - f_D} \tag{25}$$
and expressing Eq. 25 as a logarithmic expression gives
$$\ln K_p = \ln f_D - \ln(1 - f_D) \tag{26}$$
which on differentiation gives
$$\frac{d\ln K_p}{df_D} = \frac{1}{f_D} + \frac{1}{1 - f_D} = \frac{1}{f_D(1 - f_D)} \tag{27}$$
and thus provides
$$\frac{df_D}{dT} = f_D(1 - f_D)\,\frac{\Delta H_{vH}(T)}{RT^2} \tag{28}$$
The excess heat capacity can thus be written as
$$C_{p,xs} = \frac{\Delta H_{cal}(T)\,\Delta H_{vH}(T)}{RT^2}\,f_D(1 - f_D) + f_D\,\Delta C_{p,cal} \tag{29}$$
We now have an equation for $C_{p,xs}$ which is functionally related to temperature, $T$. Equation 29 can be used to fit the data shown in Fig. 1 using a least squares approach. The outcome of fitting Eq. 29 to our ubiquitin data is shown in Figs. 5 and 6. Figure 5 shows how the composition of the system changes with temperature, and Fig. 6 shows the optimised best fit line through the experimental data. The fitting was conducted in the following way. Initial values were assigned to the following parameters: $\Delta H_{vH}$, $\Delta H_{cal}$, $\Delta C_p$, and $T_{ref}$. These were then used in the appropriate previously defined equations to calculate an initial set of values of $C_{p,xs}$ using the temperature data obtained from the data set shown in Fig. 1. The differences between the calculated values and the experimental values were calculated, squared, and summed. The sum of the squared differences was then minimised by changing the parameter values. For the example shown in Fig. 6, the NonlinearModelFit routine was used to fit the model to the data and provide a set of optimised parameters. The optimised parameter values are shown in Tables 1 and 2. The standard errors in the parameters were very low, of the order of 0.1%. Several observations can be made.
The fitted calorimetric enthalpy is the same as the calorimetric enthalpy value measured by integration of the peak. The value of $T_{ref}$ is slightly lower than the value that can be interpolated for $T_m$, the temperature at which the excess heat capacity is a maximum.
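The two-state expression for the excess heat capacity and the least squares objective used in the fitting can be sketched in Python as follows. Several points are assumptions rather than statements of the author's procedure: the heat capacity change is taken as constant, the calorimetric heat capacity change is assumed to be the van't Hoff value scaled by the ratio of the calorimetric to the van't Hoff enthalpy, the calorimetric enthalpy is given the same Kirchhoff-type temperature dependence, and all parameter values are hypothetical. The article itself used the NonlinearModelFit routine for the minimisation; only the objective function is shown here.

```python
import math

R = 8.314462618  # universal gas constant, J K^-1 mol^-1


def fraction_unfolded(temp, dh_vh, dcp, t_ref):
    """f_D = K/(1+K) with K from the integrated van't Hoff isochore, K(T_ref) = 1."""
    ln_k = ((dh_vh / R) * (1.0 / t_ref - 1.0 / temp)
            + (dcp / R) * (math.log(temp / t_ref) + t_ref / temp - 1.0))
    k = math.exp(ln_k)
    return k / (1.0 + k)


def excess_heat_capacity(temp, dh_vh, dh_cal, dcp, t_ref):
    """Two-state excess heat capacity: transition term plus baseline step term."""
    dcp_cal = (dh_cal / dh_vh) * dcp               # assumed scaling of dCp
    dh_vh_t = dh_vh + dcp * (temp - t_ref)         # Kirchhoff, van't Hoff enthalpy
    dh_cal_t = dh_cal + dcp_cal * (temp - t_ref)   # Kirchhoff, calorimetric enthalpy
    f_d = fraction_unfolded(temp, dh_vh, dcp, t_ref)
    return (dh_cal_t * dh_vh_t * f_d * (1.0 - f_d) / (R * temp * temp)
            + f_d * dcp_cal)


def sum_squared_residuals(params, temps, cp_obs):
    """Least squares objective: params = (dh_vh, dh_cal, dcp, t_ref)."""
    return sum((obs - excess_heat_capacity(t, *params)) ** 2
               for t, obs in zip(temps, cp_obs))


# Hypothetical parameters and a synthetic, noise-free "data set"
TRUE_PARAMS = (200.0e3, 190.0e3, 3.0e3, 330.0)
temps = [300.0 + 0.5 * i for i in range(121)]
cp_obs = [excess_heat_capacity(t, *TRUE_PARAMS) for t in temps]

trial = (210.0e3, 190.0e3, 3.0e3, 331.0)
print("SSR at true parameters :", sum_squared_residuals(TRUE_PARAMS, temps, cp_obs))
print("SSR at trial parameters:", sum_squared_residuals(trial, temps, cp_obs))
```

Any general-purpose minimiser can be applied to `sum_squared_residuals`; starting values close to the integrated peak area and the observed peak temperature are a sensible choice.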
The values obtained for $\Delta H_{vH}$ and $\Delta H_{cal}$ are close in value but not the same. This is not necessarily surprising. The units of both parameters are kJ mol$^{-1}$. However, as we have shown, the raw calorimetric data are converted into an excess heat capacity in Eq. 2 using a mass value as measured by the experimenter. In this experiment, the required mass of protein was weighed and dissolved in buffer. The aqueous protein solution was then injected into a cell of known fixed volume. In this way, the number of moles ($M$) is readily calculated and inserted into Eq. 2. On the other hand, the molar unit in the van't Hoff enthalpy is supplied by the universal gas constant. We can show this using dimensional analysis:
$$\frac{d\ln K_p}{dT}\,[\mathrm{K^{-1}}] = \frac{\Delta H_{vH}\,[\mathrm{J\,mol^{-1}}]}{R\,[\mathrm{J\,K^{-1}\,mol^{-1}}]\;T^2\,[\mathrm{K^2}]}$$
It is important to note that logarithmic terms are dimensionless. Thus, in order for the equality to be true, the molar unit used in the van't Hoff enthalpy must be identical to the molar unit supplied by the gas constant. Why, then, should $\Delta H_{cal}$ be less than $\Delta H_{vH}$? There are a number of explanations, and it is indeed a matter of interest in protein thermochemistry. We make the assumption that the only thermal process on which the calorimetric enthalpy reports is the thermal unfolding of the protein. However, other thermal events, which are not observed when running the baseline scans, may occur because of the presence of the protein in solution. It is also possible, and indeed extremely likely in our particular case, that not all of the mass placed in the cell was protein or that some of the protein placed in the cell did not undergo thermal unfolding. Both events would lead to an overestimation of the amount of active protein and thus an underestimation of the calorimetric enthalpy.
We shall see later in this article that there are cases where $\Delta H_{cal} > \Delta H_{vH}$. This arises when protein unfolding involves the appearance of substantial populations of intermediates.
There are examples where $\Delta H_{vH} \gg \Delta H_{cal}$. One example is the thermally driven change from the $P_{\beta}$ gel phase to the $L_{\alpha}$ phase in phospholipid multi-lamellar vesicles, where the ratio of the van't Hoff enthalpy to the calorimetric enthalpy can be as high as 200. Such numbers suggest that something like 200 phospholipid molecules are acting together as a co-operative unit.
Protein unfolding signals when the heat capacity change is dependent upon temperature
So far, we have assumed that thermally induced unfolding is characterised by a temperature invariant heat capacity change. However, it is very often the case that the pre-transitional heat capacity shows a marked functional relationship with temperature, whilst the post-transitional heat capacity is somewhat flatter, i.e., less influenced by temperature. Figure 7a provides an excellent example of this kind of thermal behaviour. The signal shown was obtained in a laboratory class practical for the protein lysozyme in a 1.0 M aqueous solution of trehalose. The objective of the practical was to examine the behaviour of proteins in aqueous sugar solutions. The temperature dependence of the pre- and post-transitional heat capacities of the signal is readily incorporated into our analysis.
We assume that the temperature dependence of the pre- and post-transitional heat capacities can be described by a linear relationship. The heat capacity of the native protein is given by $C_{p,N} = a + bT$ and that of the unfolded protein by $C_{p,D} = c + dT$. This provides a heat capacity change that is temperature dependent:
$$\Delta C_p(T) = C_{p,D} - C_{p,N} = (c - a) + (d - b)\,T \tag{31}$$
This is then used to provide a modified form of the Kirchhoff equation:
$$\Delta H(T) = \Delta H(T_{ref}) + (c - a)(T - T_{ref}) + \frac{(d - b)}{2}\left(T^2 - T_{ref}^2\right) \tag{32}$$
which then provides a modified form of Eq. 21 wherein, as before, $K_p(T_{ref}) = 1$:
$$K_p(T) = \exp\left\{\frac{\Delta H(T_{ref})}{R}\left(\frac{1}{T_{ref}} - \frac{1}{T}\right) + \frac{(c - a)}{R}\left[\ln\frac{T}{T_{ref}} + \frac{T_{ref}}{T} - 1\right] + \frac{(d - b)}{2R}\,\frac{(T - T_{ref})^2}{T}\right\} \tag{33}$$
The temperature dependence of the heat capacity change results in the addition of another term to Eqs. 7 and 21. It will be recalled that the ratio of the calorimetric enthalpy to the van't Hoff enthalpy appears to scale appropriately the contribution of the underlying baseline to the calorimetric signal. This modified form of Eq. 29 was fitted to data obtained for an aqueous solution of lysozyme in 1 M trehalose solution. The solution concentration was 5 g dm$^{-3}$. The results of this fit are shown in Fig. 7b. The optimised fit parameters are displayed in Table 2. The adjusted $R^2$ value for the fit is 0.999. Both the adjusted $R^2$ value and Fig. 7b seem to suggest that the fit is extremely good. However, it is always good practice to look at a plot of the residuals. The residuals are calculated as the difference between the measured value of $C_{p,xs}$ and the value of $C_{p,xs}$ calculated using the best fit parameters. A residual plot is shown in Fig. 8. If the residuals arose purely from the uncertainties in measurement, for example instrumental noise, then it would be expected that the residuals would be scattered at random about zero. The fact that they are not reveals that there are systematic errors either in the data or in the model used to fit the data. Most probably, the analysis has neglected some other minor thermal events. Of further note is the rather large discrepancy between the calorimetric and van't Hoff enthalpies.
The calorimetric enthalpy is 87.3% of the value of the van't Hoff enthalpy. The lysozyme sample used was obtained from Sigma Aldrich, which states that its purity is ≥90%. For the experiment, the lysozyme was used as received, which may thus explain the discrepancy.
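The equilibrium constant for linearly temperature-dependent pre- and post-transitional heat capacities can be sketched in Python as follows. The expression coded here is obtained by integrating the van't Hoff isochore with the modified Kirchhoff enthalpy, taking the equilibrium constant as 1 at the reference temperature. The coefficients a, b, c, and d and the other parameter values are hypothetical and are not the fitted lysozyme values.

```python
import math

R = 8.314462618  # universal gas constant, J K^-1 mol^-1


def delta_h(temp, dh_ref, a, b, c, d, t_ref):
    """Modified Kirchhoff equation for C_p,N = a + b*T and C_p,D = c + d*T."""
    return (dh_ref + (c - a) * (temp - t_ref)
            + 0.5 * (d - b) * (temp ** 2 - t_ref ** 2))


def ln_k(temp, dh_ref, a, b, c, d, t_ref):
    """ln K_p(T) with K_p(T_ref) = 1; the last term arises from the
    temperature dependence of the heat capacity change."""
    return ((dh_ref / R) * (1.0 / t_ref - 1.0 / temp)
            + ((c - a) / R) * (math.log(temp / t_ref) + t_ref / temp - 1.0)
            + ((d - b) / (2.0 * R)) * (temp - t_ref) ** 2 / temp)


def fraction_unfolded(temp, dh_ref, a, b, c, d, t_ref):
    k = math.exp(ln_k(temp, dh_ref, a, b, c, d, t_ref))
    return k / (1.0 + k)


# Hypothetical parameters: steep pre-transitional slope (b), flatter
# post-transitional slope (d)
PARAMS = dict(dh_ref=400.0e3, a=10.0e3, b=50.0, c=25.0e3, d=20.0, t_ref=340.0)

for temp in (320.0, 330.0, 340.0, 350.0, 360.0):
    print(f"T = {temp:.0f} K  f_D = {fraction_unfolded(temp, **PARAMS):.4f}")
```

Setting b equal to d removes the extra term, and the expression then reduces to the constant heat capacity change result, which provides a convenient check.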

Multiple transitions
The residual plot in Fig. 8 suggests that the model used to fit the data may have been too simple, in that other thermal events may also occur which have not been incorporated into the model. If these thermal events are independent of the main transition, then the overall thermal transition is a simple arithmetic addition of the two underlying events. For example, small amounts of an impurity, which is also calorimetrically observable, may be present in the sample. It is also possible that, in the case of a multi-sub-unit protein, the sub-units unfold independently of each other. Again, the thermal signal would be a composite of the underlying transitions. However, protein unfolding may involve the native protein undergoing a transition to one or several intermediate states before ultimately adopting the final unfolded form. Protein unfolding, under equilibrium conditions, by such a mechanism can be represented by the following mass action expression for the case where two intermediates are formed in significant quantities:
$$N \overset{K_1}{\rightleftharpoons} I_1 \overset{K_2}{\rightleftharpoons} I_2 \overset{K_3}{\rightleftharpoons} D$$
To be able to calculate the fraction of each species at any particular temperature, $T$, we formulate the following mass balance expression:
$$P_t = [N] + [I_1] + [I_2] + [D] \tag{34}$$
where $P_t$ is the total protein concentration and the [ ] terms represent the equilibrium concentrations of the respective species. If we divide Eq. 34 by [N] and invert, we obtain the following expression for the fraction of native protein:
$$f_N = \frac{[N]}{P_t} = \frac{1}{1 + \dfrac{[I_1]}{[N]} + \dfrac{[I_2]}{[N]} + \dfrac{[D]}{[N]}} \tag{35}$$
Because unfolding occurs under equilibrium conditions, we can write the following equilibrium equations:
$$K_1 = \frac{[I_1]}{[N]}, \qquad K_2 = \frac{[I_2]}{[I_1]}, \qquad K_3 = \frac{[D]}{[I_2]} \tag{36}$$
Using these expressions in Eq. 35, we obtain
$$f_N = \frac{1}{1 + K_1 + K_1K_2 + K_1K_2K_3} \tag{37}$$
Similar expressions are readily derived for the fractions of the intermediate and denatured species:
$$f_{I_1} = \frac{K_1}{1 + K_1 + K_1K_2 + K_1K_2K_3}, \quad f_{I_2} = \frac{K_1K_2}{1 + K_1 + K_1K_2 + K_1K_2K_3}, \quad f_D = \frac{K_1K_2K_3}{1 + K_1 + K_1K_2 + K_1K_2K_3} \tag{38}$$
The equilibrium constants are calculated using Eq. 21. Given the following model thermodynamic data, the fractional composition of an aqueous protein solution is depicted in Fig. 9. Simulating the HSDSC signal using the parameters in Tables 3 and 4 is slightly more complicated than our previous examples.
The greater complexity comes from correctly identifying the enthalpy changes that accompany the formation of each species. Essentially, all enthalpy changes are calculated with the native form as the low energy form of the protein. Thus, the enthalpy change accompanying the formation of $I_1$ is $\Delta H_{vH,1}$; the enthalpy change accompanying the formation of $I_2$ is $\Delta H_{vH,1} + \Delta H_{vH,2}$; and the enthalpy of denaturation is given by $\Delta H_{vH,1} + \Delta H_{vH,2} + \Delta H_{vH,3}$. Thus, the following expression can then be used to calculate the excess heat capacity assuming $\Delta H_{cal} = \Delta H_{vH}$:
$$C_{p,xs} = \frac{d}{dT}\left[f_{I_1}\,\Delta H_{vH,1}\right] + \frac{d}{dT}\left[f_{I_2}\left(\Delta H_{vH,1} + \Delta H_{vH,2}\right)\right] + \frac{d}{dT}\left[f_D\left(\Delta H_{vH,1} + \Delta H_{vH,2} + \Delta H_{vH,3}\right)\right] \tag{39}$$
Fig. 9 Graph showing how the fraction of protein species varies with temperature using the model data in Table 3
If we collect together the appropriate terms, then Eq. 39 can be written in terms of the underlying component transitions as follows:
$$C_{p,xs} = \frac{d}{dT}\left[\left(f_{I_1} + f_{I_2} + f_D\right)\Delta H_{vH,1}\right] + \frac{d}{dT}\left[\left(f_{I_2} + f_D\right)\Delta H_{vH,2}\right] + \frac{d}{dT}\left[f_D\,\Delta H_{vH,3}\right] \tag{40}$$
The derivatives in Eqs. 39 and 40 can be estimated using a centred finite difference approximation:
$$\frac{dy}{dT} \approx \frac{y(T + h) - y(T - h)}{2h} \tag{41}$$
where $y$ is the quantity being differentiated and $h$ is a small temperature increment. The simulated DSC signal using the data in Table 3 and Eqs. 21, 37, 38, 39, 40, and 41 is shown in Fig. 10. It is worth noting that the shapes of the component transitions are not symmetrical. The overall thermal transition can be fitted to a two-state model, as were the data obtained for ubiquitin. This is shown in Fig. 11. The fit is not especially poor and could lead inexperienced experimenters to conclude that the transition is two-state. However, the optimised fit parameters show immediately that the use of the two-state model is incorrect. The van't Hoff enthalpy value is 218 kJ mol$^{-1}$, whilst the calorimetric enthalpy is 548 kJ mol$^{-1}$. As we pointed out previously in the manuscript, if $\Delta H_{cal} > \Delta H_{vH}$, then the presence of significant populations of intermediate states in the transition is inferred.
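The simulation of a sequential unfolding process with two intermediates can be sketched in Python as follows. The model parameters are hypothetical (the values in Table 3 are not reproduced here), the heat capacity change of each step is set to zero to keep the example short, and the increment used in the centred finite difference is an arbitrary choice. The two groupings of the enthalpy terms, by species and by component transition, are both coded so that their equivalence can be checked.

```python
import math

R = 8.314462618  # universal gas constant, J K^-1 mol^-1

# Hypothetical parameters for N <-> I1 <-> I2 <-> D:
# (van't Hoff enthalpy / J mol^-1, temperature at which K_i = 1 / K)
STEPS = [(150.0e3, 320.0), (200.0e3, 330.0), (250.0e3, 340.0)]


def step_constants(temp):
    """K_1, K_2, K_3 from the integrated van't Hoff isochore with dCp = 0
    (a simplifying assumption made for this sketch)."""
    return [math.exp((dh / R) * (1.0 / t_ref - 1.0 / temp)) for dh, t_ref in STEPS]


def fractions(temp):
    """Return [f_N, f_I1, f_I2, f_D] from the mass balance expression."""
    k1, k2, k3 = step_constants(temp)
    weights = [1.0, k1, k1 * k2, k1 * k2 * k3]
    total = sum(weights)
    return [w / total for w in weights]


def enthalpy_by_species(temp):
    """Each species weighted by its enthalpy relative to the native form."""
    _, f_i1, f_i2, f_d = fractions(temp)
    dh1, dh2, dh3 = (dh for dh, _ in STEPS)
    return f_i1 * dh1 + f_i2 * (dh1 + dh2) + f_d * (dh1 + dh2 + dh3)


def component_enthalpies(temp):
    """The same enthalpy regrouped into the three component transitions."""
    _, f_i1, f_i2, f_d = fractions(temp)
    dh1, dh2, dh3 = (dh for dh, _ in STEPS)
    return [(f_i1 + f_i2 + f_d) * dh1, (f_i2 + f_d) * dh2, f_d * dh3]


def component_signals(temp, h=0.01):
    """Centred finite difference estimate of each component transition."""
    upper = component_enthalpies(temp + h)
    lower = component_enthalpies(temp - h)
    return [(u - l) / (2.0 * h) for u, l in zip(upper, lower)]


def excess_heat_capacity(temp, h=0.01):
    """Overall signal: the sum of the component transitions."""
    return sum(component_signals(temp, h))


for temp in (300.0, 320.0, 330.0, 340.0, 360.0):
    f_n, f_i1, f_i2, f_d = fractions(temp)
    print(f"T = {temp:.0f} K  f_N = {f_n:.3f}  f_I1 = {f_i1:.3f}  "
          f"f_I2 = {f_i2:.3f}  f_D = {f_d:.3f}  "
          f"C_p,xs = {excess_heat_capacity(temp):.0f} J K^-1 mol^-1")
```

Returning the component signals separately makes it straightforward to plot the individual transitions alongside the overall trace, as in Fig. 10.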

Transitions involving dissociation
Many proteins have a quaternary structure which involves the association of several folded molecular sub-units to form a multiple sub-unit complex. The simplest complex is a dimer. One such example, examined by Sturtevant and co-workers, is the tryptophan repressor obtained from Escherichia coli, which shows unusual thermal stability at pH 7.5 [8]. Their DSC traces show that the heat capacity change is temperature dependent and that the peak itself shows a significant amount of asymmetry. Furthermore, the signal shows some concentration dependence. To develop a thermodynamic model that can encapsulate these observations, we need to use Eq. 33, which incorporates the temperature effects upon heat capacity. However, we need to be extremely careful about the equilibrium constant equations that we use. We shall find it expedient to formulate these equations in terms of the fraction of protein that has undergone dissociation/denaturation.
Fig. 10 Simulated DSC signal for a model protein system wherein unfolding involves the formation of two intermediates. The component transitions are identified and shown. A value of was used in the simulation
Fig. 11 Fit of the overall simulated signal shown in Fig. 10 to a two-state model
We assume that the following equilibrium equation adequately describes the dissociation/unfolding process:
$$N_2 \rightleftharpoons 2D$$
In other words, dissociation and unfolding occur at more or less the same time. Or, to use the language of the previous section on multiple transitions, the population of dissociated sub-units in the native form is extremely, if not vanishingly, small.
To make the simulations simple, we shall assume that both sub-units comprising the dimer are the same, so that we can use the following mass action description of the equilibrium process:
$$K = \frac{[D]^2}{[N_2]}$$
If $P_t$ is the total concentration of the dimeric protein, then we can write the following:
$$[D] = 2 f_D P_t$$
Here, $f_D$ is the fraction of dimer that has undergone dissociation/denaturation. Similarly
$$[N_2] = (1 - f_D)\,P_t$$
The equilibrium constant is thus written as
$$K = \frac{4 f_D^2 P_t}{1 - f_D} \tag{44}$$
We now need to define $K(T_{ref})$. $T_{ref}$ is the temperature at which half the protein has undergone dissociation/denaturation, i.e., $f_D = 0.5$. We shall, however, also define $T_{ref}$ in terms of a reference concentration $P_{ref}$:
$$K(T_{ref}) = 2 P_{ref} \tag{45}$$
Substituting Eqs. 44 and 45 into Eq. 33 gives
$$\ln\left[\frac{2 f_D^2 P_t}{(1 - f_D)\,P_{ref}}\right] = \frac{\Delta H(T_{ref})}{R}\left(\frac{1}{T_{ref}} - \frac{1}{T}\right) + \frac{(c - a)}{R}\left[\ln\frac{T}{T_{ref}} + \frac{T_{ref}}{T} - 1\right] + \frac{(d - b)}{2R}\,\frac{(T - T_{ref})^2}{T} \tag{46}$$
Equation 46 is a quadratic expression in terms of $f_D$. As before, if we define all the parameters on the right-hand side (the values are shown below) as well as the concentrations, we can calculate $f_D$ using the normal solution for quadratic equations. These values are then used in Eq. 29 to simulate the HSDSC signal.
For the simulation of the HSDSC signal shown in Fig. 12, the data displayed in Table 4 were used and it was assumed that $P_t = P_{ref}$. Comparison between the data provided by Sturtevant et al. [8] and Fig. 13 shows that the simulation captures the major features of the experimental data. The heat capacity is temperature dependent, and the signal shows distinct asymmetry. Moreover, Figs. 12 and 13 show that $T_{ref}$ does not correspond to the temperature of maximum excess heat capacity. If we change the protein concentration, then we can show that the signal shifts to higher temperature ranges when the concentration is increased and to lower temperature ranges when the concentration is lowered, in line with experimental observations [8]. The observant reader will no doubt detect that the signals appear larger at higher concentrations. This is to be expected, since the transitions occur over higher temperature ranges at higher protein concentrations and the positive heat capacity change thus results in increases in both the calorimetric and van't Hoff enthalpies.
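The calculation of the dissociated fraction for a dimer, and its dependence on concentration, can be sketched in Python as follows. For brevity this sketch assumes a constant heat capacity change rather than the linear temperature dependence used in the text, and the parameter values are hypothetical (the values in Table 4 are not reproduced here). The quadratic is solved in an algebraically equivalent form that avoids subtracting two nearly equal numbers when the equilibrium constant is large.

```python
import math

R = 8.314462618  # universal gas constant, J K^-1 mol^-1


def dissociation_constant(temp, dh_ref, dcp, t_ref, p_ref):
    """K(T) for N2 <-> 2D, defined so that K(T_ref) = 2 * P_ref.
    Constant dCp is a simplifying assumption of this sketch."""
    ln_ratio = ((dh_ref / R) * (1.0 / t_ref - 1.0 / temp)
                + (dcp / R) * (math.log(temp / t_ref) + t_ref / temp - 1.0))
    return 2.0 * p_ref * math.exp(ln_ratio)


def fraction_dissociated(temp, p_t, dh_ref, dcp, t_ref, p_ref):
    """Positive root of 4*P_t*f**2 + K*f - K = 0, written as
    f = 2K / (K + sqrt(K**2 + 16*P_t*K)) for numerical stability."""
    k = dissociation_constant(temp, dh_ref, dcp, t_ref, p_ref)
    return 2.0 * k / (k + math.sqrt(k * k + 16.0 * p_t * k))


# Hypothetical parameters (not from the article)
DH_REF = 400.0e3  # enthalpy at T_ref, J mol^-1
DCP = 5.0e3       # heat capacity change, J K^-1 mol^-1
T_REF = 365.0     # K
P_REF = 1.0e-4    # reference dimer concentration, mol dm^-3

for p_t in (0.1 * P_REF, P_REF, 10.0 * P_REF):
    f_d = fraction_dissociated(T_REF, p_t, DH_REF, DCP, T_REF, P_REF)
    print(f"P_t = {p_t:.1e} mol dm^-3  f_D at T_ref = {f_d:.3f}")
```

At a fixed temperature, the dissociated fraction falls as the total protein concentration rises, which is the origin of the shift of the signal to higher temperatures at higher concentrations.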

Concluding remarks
The aim of this article has been to examine the thermodynamics of temperature-induced changes in aqueous protein systems as detected by scanning calorimetry. I have tried to show how, through the use of simple models of protein unfolding and through the application of familiar thermodynamic relationships, the scanning calorimetric signals can be simulated and fitted to these models. The text, however, does come with a caveat. Calorimetric signals can be, and very often are, over-interpreted. The model selected must fit the known attributes of the thermally induced transition. It is not uncommon to see novices try to fit a dissociation transition (which always shows a distinctly asymmetric peak) to a model that involves several independent transitions using the software supplied by the instrument manufacturer. The better the novice understands the underpinning science of signal creation, the more likely they are to be able to interpret that signal correctly.
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.