Probing the protein corona around chargedmacromolecules: interpretation of isothermal titration calorimetry by binding models and computer simulations

Isothermal titration calorimetry (ITC) is a widely used tool to experimentally probe the heat signal of the formation of the protein corona around macromolecules or nanoparticles. If an appropriate binding model is applied to the ITC data, the heat of binding and the binding stoichiometry as well as the binding affinity per protein can be quantified and interpreted. However, the binding of the protein to the macromolecule is governed by complex microscopic interactions. In particular, due to the steric and electrostatic protein–protein interactions within the corona as well as cooperative, charge renormalization effects of the total complex, the application of standard (e.g., Langmuir) binding models is questionable and the development of more appropriate binding models is very challenging. Here, we discuss recent developments in the interpretation of the Langmuir model applied to ITC data of protein corona formation, exemplified for the well-defined case of lysozyme coating highly charged dendritic polyglycerol sulfate (dPGS), and demonstrate that meaningful data can be extracted from the fits if properly analyzed. As we show, this is particular useful for the interpretation of ITC data by molecular computer simulations where binding affinities can be calculated but it is often not clear how to consistently compare them with the ITC data. Moreover, we discuss the connection of Langmuir models to continuum binding models (where no discrete binding sites have to be assumed) and their possible extensions toward the inclusion of leading order cooperative electrostatic effects.


Introduction
The rational design of macromolecular polymeric drugs and nanocarriers has become a central task in medicine and pharmacy in the recent years [1][2][3]. Proteins typically bind strongly to the macromolecular or nanoparticle surface and thereby form a protein "corona," a dense shell of proteins of the proteins, as well as the heat of binding [19,20]. However, the full quantification of the macromoleculeprotein complexation based on ITC is often out of reach because the molecular and mechanistic processes are not well understood, and the application of standard binding models questionable. Naturally, the underlying interactions are governed by a complex interplay between electrostatic, solvation, and steric effects. Theoretical and simulation concepts are in need that allow a quantitative assessment of these forces [21][22][23][24]. Moreover, for the protein corona, cooperative effects must be discussed because the proteins interact with each other and also change some properties of the macromolecule; in particular, electrostatic interactions play a decisive role [6,10,13,19]. Similar challenges arise also in the sorption of proteins to charged nano-and microgels [25]. The interpretation of ITC data and the definition and application of appropriate binding models are therefore very important but challenging tasks.
Hence, it is desirable to identify well-defined model experimental systems where the protein corona is sufficiently simple and accessible for interpretation. "Simple" should mean that the binding partners are structurally well characterized and the corona has only one protein component in a fully binding equilibrium. For such a simple system, also computer simulations and theoretical, physical binding models are easier to devise. Recently, Xu et al. presented a study of such a simple system by combining coarse-grained (CG) molecular computer simulations and ITC data of the corona formation of only one type of protein on the dendritic polyglycerol macromolecule terminated with sulfate (dPGS) [16,26,27]. dPGS has received much attention in the last decade because of its high potential in drug design: it exhibits significant anti-inflammatory action during disordered immune response [28][29][30][31] and has high efficacy and functionality in many other biomedical problems [32][33][34][35][36]. The dPGS is well characterized by now theoretically [37,38] and experimentally [39]. In their work, Xu et al. analyzed ITC data of dPGS-protein complexation for various generations (sizes) of dPGS in particular using lysozyme as a well-defined model protein. Due to its almost complete sulfate termination, dPGS is highly negatively charged at relevant pH values, while the lysozyme proteins carry a positive net charge. Lysozyme complexes strongly with dPGS and forms a well-developed protein corona: Fig. 1 illustrates the lysozyme corona by snapshots taken from the CG computer simulation model [16].
In the CG simulation model, the solvent acts only as implicit background, while salt and dPGS monomers, as well as protein amino acids, are modeled as coarsegrained beads. Importantly, the original shape and charge structure are conserved in this model. In particular, the salt ions, sulfate groups, and charged acidic and basic amino acids carry monovalent integer charges of appropriate sign. Hence, while most hydration effects (i.e., those beyond simple dielectric screening) are neglected, all steric and electrostatic interactions are still well resolved and accounted for. The simulations showed strong cooperative effects in protein binding due to (excluded-volume) packing and electrostatic interactions in the dense protein corona. Consequently, binding affinities were found to depend on the density (coverage) of the proteins on the macromolecular surface. Despite this complexity, Xu et al. showed that still meaningful comparisons to ITC data are possible if the fits by the applied binding models, such as the standard Langmuir binding model, are properly analyzed and interpreted [16]. The lysozyme proteins adsorb strongly on the dPGS surface and form the protein corona. Snapshot taken from coarse-grained computer simulations [16]. Panel (c): Simplified twodimensional sketch of the protein corona around dPGS. Proteins (green spheres) of radius R p and charge Q p adsorb in a shell of width δ on the dPGS macromolecule (red sphere of radius R m and charge Q 0 m ) to form a dense shell (corona) In this contribution, we systematically describe the challenges and new developments in the interpretation of ITC data probing the protein corona. We start by recalling what is actually measured by ITC and what information is usually obtainable. We then introduce and derive the concept of a coverage-dependent binding affinity in the Langmuir model that serves for a better interpretation of the ITC data as well as for extensions of standard binding models. With these prerequisites, we show how binding affinities from simulations, namely the free energy of binding per protein extracted from a potential of mean force calculation, can be compared with ITC data fitted by the standard Langmuir model. Moreover, we discuss the relation of Langmuir models to continuum binding models (where no discrete binding sites have to be assumed) as well as simple extensions of binding models toward the inclusion of leading order cooperative electrostatic effects. Our perspectives and extensions serve as starting points for the development of more elaborate binding models that may be directly applicable to fit ITC data in future studies.

Analysis of ITC data
The adsorption of a protein onto a macromolecule is accompanied by the release of heat, H ITC , as measured by ITC. In general, we have to assume for cooperative adsorption that this heat per adsorbing protein depends on the number of already bound proteins, i.e., it is a function of coverage θ , usually defined as the ratio of bound proteins N b p to the total number of binding sites N, i.e., θ = N b p /N. We note that typically linked equilibria can contribute to the measured heat by ITC, not directly related to the binding event [19]. H ITC is thus described often as marker enthalpy and is not necessarily the same as the binding enthalpy H b (see, e.g., discussions and references in [19,25]). The total in ITC released heat can then be written for a total of N b p bound proteins as: where H ITC (i) is the measured heat by ITC for the titration of the ith protein, c m the concentration of macromolecules, and V is the total solution volume. In the ITC experiments, Q is measured vs. the total protein concentration c tot p . Introducing the molar ratio x = c tot p /c m and going to the continuum limit we can write formally: where H ITC (N b p ) is the total heat generated per macromolecule after the adsorption of N b p proteins, and N b p (x) represents the binding isotherm, i.e., the number of bound proteins versus molar ratio at fixed temperature.
Instead of the total heat, the incremental heat Q (x) = dQ/dx is typically employed for fitting to better track the changes in the heat during titration. The quantity of interest is the incremental heat per increment of protein, given by: Due to the rather complex molecular interactions governing the protein adsorption process, the function H ITC (N b p ) is typically unknown and it is virtually always assumed that H ITC is constant and independent of coverage. In that case, it is: Let us discuss in the following the behavior of this relation for the typical case of a standard Langmuir isotherm. The latter is based on identifying the simple association reaction equilibrium A+B→AB, with a binding constant , where the square brackets denote concentrations. The binding constant is associated with a binding energy where v 0 = l/mol is the standard volume [20,21,40]. Strictly speaking, G b is a free energy, but we will call it in the following simply binding energy or Langmuir energy to distinguish from other free energies we will approach in our disucssions. With the assumption of N discrete binding sites, the standard Langmuir model yields then [20,[41][42][43]: where is the free (bulk) protein concentration. The latter depends on N b p in the experiments with a fixed number of proteins in the sample (i.e., a canonical thermodynamic ensemble). In the standard Langmuir model, the binding constant K b is independent of protein concentration, and solving (5) for N b p (x) yields: with ξ = 1 + x/N + 1/(NK b c m ). Using Eq. 6 in Eq. 1 with constant H ITC gives the total heat as: The incremental heat is then [44]: The fitting of Q (x) to the experimental data then yields the unknown constants K b , N, and H ITC . Equation 8 describes a sigmoidal curve typically measured by ITC. In particular, deeper inspection of Eq. 6 shows that for large binding affinities, K b 1/(Nc m ), and very small x N, it is N b p (x) = x (independent of K b ), and Eq. 8 exhibits a plateau at the beginning of the ITC titration of height H ITC , independent of the binding constant. This is indeed a typical ITC signature in strongly associating systems [44]. In the limit of very large molar ratios x → ∞, all of the titrated proteins will adsorb for a non-vanishing K b and N b p (x) saturates to a constant value, the maximum coverage. Then Q (x) = 0 and the ITC curve carries not much information anymore. For large K b , the inflection point between the two saturating limits is at x N, becoming exactly equal for K b → ∞. For low affinities K b 1/(Nc m ), the curve behaves differently, but these "lowsignal" cases are often not well accessible by ITC [44]. The just explained behavior is illustrated in Fig. 2 for various values of the so-called Wiseman parameter c = c m K b [44].
Further analysis of Eq. 8 shows that for large K b the binding affinity is determined essentially by the slope of the inflection point of Q (x = N), where we find . For larger slopes, automatically, the transition becomes also sharper, cf. Fig.  2. Hence, as an important conclusion, we find that fitting the standard Langmuir model to any sigmoidal function with sharp transitions probes essentially only the very vicinity of the inflection point x = N of the Q (x) curves. In other words, fitting to Langmuir isotherms is thus most sensitive  [44] (that is, rescaled binding constant K b = c/c m ). Only for large K b 1/(Nc m ) the curve develops a plateau for small x/N and the inflection point is at x N close to x N, where the molar ratio equals the number of available binding sites. Importantly, the extracted apparent binding constant K b thus describes only the binding affinity right at x = N. This finding comes in useful when we have situations, like here for the protein corona, where the binding constant is actually not a constant but depends on the coverage (and thus on the molar ratio x).
In Fig. 3a, we show the experimental ITC data for Q (x) (symbols) for the specific case of lysozyme adsorbing onto dPGS of various generations G2, G4, G4.5, and G5.5 (with fractional values due to incomplete synthesis [16]) at a relatively low salt concentration of c s = 10 mM.

The Langmuir model with coverage-dependent binding constant
If cooperativity effects are at play, we need to consider that the binding affinity of a protein depends on the number of already bound proteins. One way of dealing with cooperativity is to introduce binding polynomials [43,45,46], where every binding step A+B→AB, A+AB→2AB, A+2AB→3AB, etc., has its individual binding constant and equilibrium coverages are calculated by averaging over all states. Here, we choose a different but, as we will see, related route: we assume a Langmuir binding energy that continuously changes with successive binding and thus with coverage θ , i.e., As we show in the following, this can be readily incorporated into the derivation of the Langmuir model in the canonical ensemble and leads to a proper definition of a coverage-dependent binding constant K b (θ ).
For this, consider a finite region with volume V in which a much smaller subspace (macromolecule on which the proteins bind) with N binding sites available. The binding sites can be all of the same kind. The cooperativity is assumed to enter through protein-protein interactions as well as the change of global properties (like the total charge of the macromolecule-protein complex). We assume . The large open circle, triangle, and square symbols indicate the simulation-referenced Langmuir binding free energy G ITC sim (θ * ), Eq. 25, for G2, G4, and G5.5 at their respective coverage θ * 0.95 a canonical ensemble with in total N tot p proteins in the cell volume V which now contains only one macromolecule. We define the fraction of bound protein particles by θ = N b p /N. The number of available and independent binding states is then [42,47,48] from the combinatorial possibilities of distributing N b p indistinguishable proteins on N sites, and ζ is the microscopic partition sum of a single protein particle in the bound state [21, 49,50]. Different to the standard Langmuir model, we now introduce a coverage-dependent binding energy G b (N b p ) associated with the binding of a protein from a reference state to one Langmuir site, given that N b p − 1 proteins are already adsorbed. This leads to the partition sum: and we can define the free energy of the macromoleculeprotein complex as Using the Stirling approximation ln(m!) = m ln(m) − m leads (within an unimportant constant) to the free energy of the complex normalized per binding site: where we defined ζ N b p in terms of an effective configurational volume v 0 divided by the cubed thermal (de Broglie) wavelength 3 . "Effective" means that also restrictions on internal vibrational and orientational degrees of freedom upon binding as expressed by the full microscopic partition sum [21, 49,50] are adsorbed in the volume v 0 .
The total Helmholtz free energy F of the canonical system (including the finite bath of free protein surrounding the macromolecule) is thus: where we introduced the ideal gas free energy is the density of unbound (i.e., free bulk) proteins in V . The total free energyf = βF /N per binding site is then: The minimization of the free energy with respect to the coverage of bound proteins: yields then the final relation between the binding energy and the fraction of bound proteins in dependence of the free protein concentration c p : where G b (θ ) = ∂ G b /∂θ. The left-hand side of the equation defines a coverage-dependent binding constant K b (θ ). As we see, we cannot simply define K b (θ ) solely by G b (θ ) but have to consider also the changes (derivative) of the latter with θ . The physical interpretation of this term is that a newly binding protein changes the coverage and thus the binding energy (and thus binding equilibrium) for all other bound proteins. If G b = 0 for all θ , then G b = const., and we recover the standard Langmuir model with In some cases, like in the application discussed in this work, it is θ G b G b , and the second term in the exponent can be neglected. Note also, that if we rewrite (17) it is relatively simple to see that we can generate the conventional binding polynomials [43,45,46] by a Taylor expansion of K b (θ (x)) with respect to molar ratio x, i.e., .., which may serve for interesting interpretations of the constant coefficients K i in future work.
It is for analysis of some problems reasonable to assume that the binding energy G b splits into an intrinsic process, G b,int which is independent of any cooperative effects, and an excess contribution, G b,exc (θ ), that accounts for the cooperativity. With this and neglecting the G b term in Eq. 17, we can write: Such a splitting was applied, e.g., in the case of charged proteins binding to hydrogels [51], or in the splitting of electrostatic contributions from intrinsic ones for small molecular ligands binding to membranes [52][53][54]. Note again that Eq. 18 is only accurate strictly if θ G b G b , that is, the explicit variation of the binding energy with coverage is much smaller than the absolute binding energy.
It is important to note that the volume v 0 depends on the exact nature of the bound state and is typically not known. Per convention in experiments, the standard volume v 0 = l/mol 1.6 nm 3 , corresponding to the standard concentration c o = 1 M (1 mol/l) is employed. In that case, the determined G b = G o is then called the standard free energy of binding [20]. Hence, in experiments, the discussion of origins or values of v 0 and G b individually without knowing microscopic details is in principle not feasible. However, if we consider computer simulations (see section "Comparing Langmuir to computer simulation binding free energies") where those quantities can (or must) be calculated independently, one has to discuss the origins and microscopic definition of v 0 much more thoroughly. Since, in general, v 0 will depend on the specific microscopic processes, one has to take care how to convert a theoretical or simulation derived free energy to the standard energy of binding G o , as already discussed in a comprehensive way for other associating systems [21,40].

Connection to continuum binding models
It is helpful for interpretation of ITC data to also consider continuum binding models, i.e., to dismiss the assumption of discrete binding sites. In particular, for proteins binding to a macromolecule, it is often not clear what the presumed discrete binding sites actually are. For a relatively homogeneous macromolecular/nanoparticle surface, a continuum picture, where adsorbed proteins may freely move and diffuse on the macromolecule surface, could be much more appropriate. Inspired from our previous work on protein-hydrogel interactions [51], such a continuum binding model for the current case of a protein corona could be sketched as in the following. We make the following Boltzmann ansatz for the number of adsorbed proteins via: where we split (for reasons becoming clear below) the total binding free energy in the Boltzmann exponent into a macromolecule-protein, G mp (x), and a proteinprotein, G pp (x), contribution. (We intentionally avoided the subscript "b" here to make this free energy of binding in the continuum model distinct from the Langmuir binding energy discussed above.) Furthermore, V b = A m δ is the effective binding volume (say a spherical shell of thickness δ, cf. Fig. 1) on the macromolecular surface with area A m = 4π(R m + R p ) 2 , where R m and R p are the macromolecule and protein radii, respectively. A surface density τ (x) (number per area) can now be defined by τ (x) = N b p (x)/A m , and accordingly a surface packing fraction A p τ < 1, where A p πR 2 p is the area taken effectively by a protein.
We stay in line with the coverage-dependent Langmuir model Eq. 17 introduced in the previous section and assume in Eq. 19 that the macromolecule-protein binding free energy G mp (θ ) is in general coverage dependent, thus also a function of x, G mp (x). The free energy G pp (x) considers the change of free energy of inserting one protein into the quasi-two-dimensional liquid of already adsorbed proteins on the surface. It is a contribution purely coming from direct protein-protein interactions. It thus vanishes, in contrast to G mp (x), in the limit τ (x) → 0. For proteins interacting only with hard cores, it could be modeled by the excess free energy of hard discs where the equation of state is well known [55]. Disregarding any specific protein-protein interactions right now, we can always make a virial expansion in orders of the density τ . Considering only the first leading order, β G pp (τ ) 2B 2 τ , where B 2 is the (two-dimensional) second virial coefficient, expanding the exponent in Eq. 19 for small coverages via exp[−β G pp (τ )] (1 − 2B 2 τ ), and rearranging to solve for N b p (x), we obtain a closed form for the binding isotherm through: If we now make the substitution N = A m /(2B 2 ) (for reasons becoming clearer below), express N b p by coverage The key steps for such a mapping of the continuum model to the Langmuir model is on one hand the twobody approximation in the protein-protein correlations, and on the other hand, the definition of the effective binding volumeṽ 0 = V b /N. With that we recognize that in the limit of not too large coverages (where only 2-body interactions dominate), the continuum model and the discrete Langmuir model constitute essentially (mathematically) the same binding models, albeit with different interpretations. N = A m /(2B 2 ) in the continuum picture is not the number of binding sites but defines the maximal number of binding spaces limited by the (2-body) protein-protein interactions. This becomes immediately clear if we identify B 2 with the excluded interaction area, e.g., for the case for hard-disk like particles of radius R where B H D 2 = 2πR 2 . Then, not more than N adsorbed particles will fit on the surface simply by packing constraints.
Note that if B 2 < 0 or higher order terms contribute, that is, protein-protein interactions are attractive or strongly correlated, respectively, such a simple interpretation of N and relation to Langmuir would not be so easily possible anymore. Hence, while a mapping of the more general continuum picture on the specific Langmuir model is in principle possible in some limits, details are subtle, in particular, how to define which interaction contributions actually enter the binding constant K b and which enter in the effective binding volume. A direct connection of the continuum free energy of binding to the coveragedependent model Eq. 17 is thus generally not possible-it may be feasible only for certain specific physical binding processes. As we will see in the following, the continuum view is important for the comparison of ITC fits to computer simulation data.

Comparing Langmuir to computer simulation binding free energies
Now we come to one of the key questions, how can we connect and compare the free energy of binding calculated from a computer simulation to the one obtained from a binding model fitted to ITC data. In particular, we want to address the non-trivial question of what can be done if the (typically complex) binding mechanisms and cooperative effects are not known, and an appropriate binding model is missing.
Let us briefly summarize what we typically do in a computer simulation. In a canonical ensemble simulation setup (that is, a fixed box volume V and fixed number of proteins), we can access the equilibrium free energy of binding of a protein i to a binding region by calculating the so-called potential of mean force (PMF) G i (r) [49,50]. This is done along the reaction coordinate r, which is typically the center-of-mass distance between a protein and the macromolecule/protein complex. Such a simulation setup is illustrated in Fig. 4 for a coarse-grained, implicit-solvent simulation of lysozyme associating with dPGS [16]. Since the degrees of freedom are massively reduced in such a coarse-grained representation, the equilibrium PMF, as shown in [16], can be well sampled Fig. 4 Illustration of a PMF (potential of mean force) calculation in a coarse-grained computer simulation [16] to obtain the equilibrium binding free energy of a protein binding into an incomplete corona. A single protein (right) is continuously moved along the center-to-center distance r between the protein and the complex (left; dPGS in red), where already a few proteins are bound. The PMF, G i (r) (blue curve), obtained from sampling and integrating the equilibrium mean force along r, exhibits a deep global binding minimum at a close distance r 0 . With G i (r r 0 ) = 0, the value of G i (r 0 ) defines the simulation binding free energy G sim by simple statistical averages for every constraint of the distance reaction coordinate r, in contrast to explicit-water all-atom simulations which are usually harder to converge. The calculations can be done for a single protein i also if already i − 1 proteins are adsorbed. We can thus evaluate the PMF for every successive binding event in the protein corona. PMFs calculated from coarse-grained simulations for the specific example of lysozyme to dPGS of generation G5 are shown in Fig. 3c. For large distances r, G i (r r 0 ) = 0 is set to 0 to define the free unbound reference system. For small r 3-4 nm, we see a deep global minimum which defines the bound state at distance r 0 . The simulation binding free energy G sim (N b p ) is then calculated from the PMF differences between the global minimum at the bound state at r 0 and the free bulk state for large r r 0 , that is G sim (N p − 1 proteins are already present in the corona. The cooperativity effects in the protein binding can be clearly observed in the simulation results for G sim (N b p ), summarized in Fig. 3d: the attraction per protein decreases with successive binding, indicating negative cooperativity, i.e., the more proteins are loaded into the corona the less favorable the binding becomes. From equilibrium simulations [16], it was found that for dGPS G5 the maximum protein occupancy was N = 13. The PMF for this in Fig. 3c is the third (reddish color) from the top. The two top ones still exhibit a global binding minimum but also a significant barrier for crossing at r 6 nm, kinetically hindering the binding of the corresponding proteins. (Note that for radial crossing of barriers in 3D space, the effective free energy landscape has to be corrected by a distance-dependent entropic factor [56,57].) We see also from Fig. 3d that the assumption θ G b G b made in section "The Langmuir model with coverage-dependent binding constant" for the considered system holds well, that is, the changes θ[ G sim (N b p )-G sim (N b p + 1)] are typically much smaller than G sim (N b p ). However, the simulation binding free energy cannot be related directly to the binding energy derived from ITC, as we will argue in the following paragraphs.
We now show how to connect the simulation free energy to the Langmuir results. In the simulations, with the constraint of having N b p −1 proteins bound in the simulation, we are probing the equilibrium N p b − 1 N b p binding in a domain (say spherical shell) of volume V b V . This translates into the Boltzmann equilibrium: where in this case c p is the concentration of the free (unbound) proteins for which the equilibrium of N b p bound proteins is established. The binding volume V b is a quantity that can be calculated in the simulations [16]. Equation 22 is similar to the continuum binding model introduced in the previous section in Eq. 19, but all the binding contributions are clumped together in one single binding free energy G sim . We emphasize again the nature of G sim (N b p ) in the simulations: it includes all entropic (including steric) and energetic interactions of the N b p th protein with the macromolecule and already bound proteins. In comparison, in the Langmuir model with coverage-dependent binding constant K b (θ ), Eq. 17, the binding equilibrium for the protein coverage θ at concentration c p is:  (24) G sim includes all contributions to the transfer free energy from bulk to the binding volume V b . The Langmuir energy G b in contrast considers only the energy of binding into one of the Langmuir binding boxes with effective volume v 0 . Thus, to link to the simulation free energy we have to add the entropic correction terms (the ln-terms) which take care of the confinement and configurational arrangements in the N Langmuir binding boxes of volume v 0 . Hence, the right side of the equation represents the total Langmuir free energy, with entropy in reference to simulation binding volume. This allows a direct comparison between the ITC evaluation and the simulation PMFs on the full free energy level. As a conclusion, G sim and G b (or the standard energy G 0 ) cannot be compared directly, unless we are in the low coverage limit θ 1 and the simulation binding volume is for some reason exactly given by the standard binding volume.
This mapping is directly related to section "Connection to continuum binding models," Eq. 21, where we showed that the Langmuir picture can be translated to a continuum picture if v 0 = V b /N and v 0 can be interpreted as an effective configurational volume constrained by the binding shell and pair interactions. Contributions from G sim are thus split into G b , k B T ln(1 − θ), and k B T ln(v 0 N/V b ) terms in rather non-trivial ways. Then, Eq. 24 can also be interpreted such that the Langmuir binding free energy G b includes all transfer free energy contributions, apart from 1-and 2-body confinement and protein-protein interaction contributions, implicitly included in the definition of v 0 and the Langmuir assumption of discrete binding boxes.
Hence, the ITC/Langmuir derived G b includes apart from the energetic macromolecule-protein contributions also many-body (larger than 2-body) packing correlations, which makes it difficult to compare specific interaction contributions between simulations and the Langmuir fit.
We can now use Eq. 24 to make a one-to-one comparison between the simulation results reported in Fig. 3c and the Langmuir fits to ITC in Fig. 3a. Recall from our discussion in section "Analysis of ITC data" (cf. Fig. 2) that the ITC fit provides information about the binding affinity K b (θ ) only for a certain coverage θ * := θ(x = N). Only there Eq. 24 can be evaluated and we formally rewrite: where G ITC sim (θ * ) defines the total binding free energy of ITC that can be compared with simulation results. Thus, we can refer to it as the "simulation-referenced" Langmuir free energy [16].
How do we evaluate the coverage θ * ? Given the sigmoidal differential heat as in Fig. 3a, Q (x), the inflection point directly delivers the binding stoichiometry N. The protein coverage θ follows from the normalized integrated incremental heat as: With that, one can define the protein coverage at which the binding affinity is evaluated by θ * = θ(x = N) as well as the mean coordination number N b * p = Nθ * at the respective binding equilibrium. In Fig. 5, we exemplify the calculation of θ * for dPGS generations G2 and G5.5-dPGS, respectively, directly from the ITC measurements. The coverage θ * is found as θ * = 0.92 for G2 and θ * = 0.97 for G5.5, which corresponds to the mean coordination numbers N b * p = 2.2 for G2 and N b * p = 12.2 for G5.5, respectively.
Thus, as an important finding, the coverages at which the binding affinity is probed by ITC are already very close to saturation. The binding affinity is therefore related to those few, 1-2 proteins which finally complete the protein corona at high protein concentration. As we see from the simulations in Fig. 3d, the proteins that start forming the corona have a very different, about 10k B T more attractive, binding free energy than the finally binding ones. In Fig. 3d, we also plot the results for the simulation-  Fig. 2d. They match very well the simulation free energies at coverages of θ 0.95, consistently right at the θ * values where the ITC binding affinity was determined. Hence, our comparison on the total free energy level shows good quantitative agreement between ITC and the computer simulations. Some more physical interpretation of the data based on the dominant electrostatic interactions in this system follows in the next section.

Electrostatic excess contributions and cooperativity
When many charged proteins bind to a charged region cooperativity effects come into play, in leading order simply due to the change of the global electrostatic properties during adsorption. This has been formulated, for instance, in the Guoy-Chapman-Stern theory for the binding of charged ligands to charged surfaces [52][53][54] or the binding of netcharged proteins to microgels [51], where the successive binding incrementally changes the overall surface or Donnan electrostatic potential, respectively. Consequently, the total binding constant K b = K b (x) = K b (θ ) has to be defined more generally and split up into an intrinsic part K b,int and an excess part K b,exc , as discussed before in Eq. 18. For the purpose here, we can equate the excess contribution with a cooperative electrostatic contribution, via: i.e., G b (x) = G b,int + G b,elec (x). Per definition, the intrinsic, x-independent binding constant K b,int only contains contributions from local and specific interaction between the protein and the macromolecule, and the nonspecific, global, and cooperative electrostatic effect has been separated out. To be more precise with the word "global," we have in mind a leading order multipole expansion (monopole, dipole, etc.) of the electrostatic potential of the whole complex, whose electrostatic properties are changing during binding. Specific local interactions may include local solvation effects, e.g., hydrophobic, or possibly highly localized interactions, such as H-bonds or salt bridges.
For proteins interacting with the highly charged dPGS macromolecule, it was found that the intrinsic part is dominated by a highly localized electrostatic effect, the counterion-release (CR) contribution: For the highly charged dPGS macromolecule, strong chargerenormalization was observed by a massive uptake of counterions [37]. A few of those counterions "condensed" on the dPGS surface layer are liberated when the protein binds, whereupon an oppositely charged protein patch becomes a multivalent counterion for the polyelectrolyte [58][59][60][61][62][63][64]. The resulting favorable (purely entropic) free energy in dependence of the salt concentration c s can be formulated as: where c ci (typically c s ) is the local concentration of condensed counterions, c s the bulk salt concentration, and N CR denotes the number of those counterions released after binding. Equation 28 follows in some limits from the pioneering considerations of Record, Anderson, and Lohman [65] in the realm of DNAprotein complexation that culminated in the leading-order expression for the binding constant purely from counterion release, d ln K b /d ln c s = −N CR . More detailed discussions on the derivation and consequences of these processes and the interpretation of N CR can be found in the original work [65] and partially in more recent references [19,66].
Apart from some extreme scenarios, e.g., total charge reversal of the complex, or large bulk salt concentration changes with titration, it can be safely assumed that the counterion-release part is an intrinsic contribution to the macromolecule-protein binding term (cf. Eq. 19) whose magnitude does not depend on protein coverage. The CR mechanism should play a role for every protein that carries a significant positive patch (even net-neutral or net-negatively charged ones [16,26,27]) and according to Eq. 28 it can be assumed that the number of released ions, N CR , scales with patch size, that is, this interaction is protein specific. We see in the inset of Fig. 3b that indeed the CR signature is found very well expressed in the dPGSlysozyme system, where ln K b is a linear function of ln c s . The slope delivers the number of released counterions for the dPGS G2 of about N CR 3, which well matches the computer simulation results in which ions can be explicitly counted [16]. CR is a very large driving force because the local concentration of condensed ions, c ci , is typically in the molar range compared with only millimolar salt bulk concentrations c s . As a consequence, Eq. 28 predicts that only a single released ion can contribute as much as 3-4 k B T (entropic) free energy to the binding. More details on computational characterization of the CR effects in dPGSlysozyme complexation can be found in the previous joint simulation/ITC works [16,26,27,66].
For the non-specific electrostatic part, one could envision a multipole expansion for the charged binding partners. In first order, the proteins are simply charged spheres of size R p carrying a net charge Q p = Z p e. Analogously in first order the macromolecule/protein complex is a sphere of size R m + R p and a coverage-dependent charge depending on the number of bound proteins, because they modify the net charge of the complex. This is arguably the simplest realization of electrostatic cooperativity, in which the screened monopole electrostatic potential between the complex (including macromolecule and corona) and an approaching protein depends on the coverage which renormalizes the charge Q 0 m . This effect can be positively or negatively cooperative depending on the signs of the involved charges. On a simple Debye-Hückel level, we can write for G b,elec (N b p ): where λ B = e 2 /(4π 0 k B T ) is the Bjerrum length and κ = √ 4πλ B c s the inverse (Debye) screening length for a monovalent salt at concentration c s . Note that Eq. 30 with the simple addition of macromolecule and protein charges as defined in Eq. 29 involves both macromoleculeprotein and protein-protein interactions which would enter in principle in both binding energies in the continuum ansatz Eq. 19, respectively.
Together with Eq. 28, the non-specific electrostatic contribution Eq. 30 in the coverage-dependent Langmuir model Eq. 27 would be a first attempt of a binding model that includes electrostatic cooperativity effects. If we compare back to Eq. 17, however, we see that we would need to include the first derivative of Eq. 30 with respect to coverage (the G b term in Eq. 19) to include the cooperative effect more completely to consistently describe the total binding equilibrium. Similar models, additionally including Born self-energy terms and dipolar contributions, albeit neglecting the G b term, were devised for proteins binding to hydrogels in equilibrium [51,67] as well as for protein sorption kinetics to core-shell hydrogels [68] and molecular cargo to hollow hydrogels [69].
We finally note that some of the findings in the ITC and simulation data for the dPGS-lysozyme complexation can be already well interpreted by such a simple electrostatic extension of the binding models. For instance, we observe in Fig. 3d that the binding affinity does not depend much on the dPGS generation. The reason is that charge renormalization of the dPGS leads to effective macromolecular charges that change only little with generations (although the bare charge changes substantially) [37]. As a consequence, the global electrostatics in leading order is very similar for the various generations. For high coverages θ * near saturation, where we evaluated the ITC data in section "Comparing Langmuir to computer simulation binding free energies," the dPGS is almost completely neutralized by proteins and the binding affinity reflects mostly the intrinsic contribution, that is, the CR contribution, Eq. 28. Comparing the binding free energy at θ * to Eq. 28 with independent estimates of c ci and N CR indeed confirmed that CR is the dominant intrinsic mechanism to binding [16]. Recall that the CR process is well captured in the (implicit water) coarse-grained simulations due to the explicit resolution of the ions. The good agreement of simulations and experiments in Fig. 3d indicates that water hardly contributes to the total free energy of binding. A deeper discussion of water effects on the (temperature-dependent) binding enthalpy H b and the corresponding entropy of binding can be found in a recent publication [66].

Summary and concluding remarks
Clearly, the complexation of proteins with a macromolecule or a nanoparticle involves complex processes. ITC can probe these processes but only implicitly by measuring the incremental heat released in the complexation equilibrium. Binding models are thus in need which either take into account the microscopic interaction details, or, if not available, standard binding models need to be properly interpreted. Nowadays, with the great help of molecular computer simulations of increasing quality and efficiency we are in a position to learn a lot about the complexation process and devise, modify, and interpret binding models more systematically. As a start, in this work, we introduced the concept of a coverage-dependent binding affinity in the Langmuir model that serves for better interpretation of ITC data as well as for extensions of standard binding models. We showed how binding affinities calculated from computer simulations, namely the free energy of binding per protein extracted from a potential of mean force calculation, can be consistently compared with experimental ITC data fitted by the standard Langmuir model. Moreover, we discussed the relation of Langmuir models to continuum binding models as well as simple extensions of binding models toward the inclusion of leading order cooperative electrostatic effects. Our perspectives will hopefully serve as starting points for the development of more elaborate binding models that may be directly applicable to fit ITC data in future studies of the protein corona or similar complex and cooperative problems.