Protein folding/unfolding in the presence of interacting macromolecular crowders

. Recent years have seen an increasing number of biophysical studies of proteins being conducted in cells and concentrated protein solutions. In these experiments, compared to dilute-solution data, both stabilization and destabilization of globular proteins have been observed, which cannot be explained in terms of volume exclusion alone. For a fundamental understanding of the observed eﬀects, there is a need for computational modeling beyond the level of hard-sphere crowders. This mini-review discusses recent eﬀorts to simulate fold-ing/unfolding properties of proteins in the presence of explicit macromolecular crowders. A Monte Carlo-based approach by us is described, which we recently applied to study the equilibrium folding thermodynamics of two peptides in the presence of explicit protein crowders.


Introduction
The interior of living cells is a crowded environment where high concentrations of macromolecules are present. For instance, the cytosol of Escherichia coli bacteria has been estimated to contain 300-400 g/L of proteins and RNA [1]. However, most biophysical studies of proteins are conducted in dilute solutions. A fundamental and long-standing question, therefore, is how macromolecular crowding affects reactions such as protein folding, binding and aggregation. This question is currently being intensely studied by both experimental [2,3] and computational [4,5] methods. In particular, an increasing amount of data on the folding of small proteins in cells is now becoming available.
Computational studies of protein folding in crowded environments have so far focused mainly on the excluded-volume effect [6,7], which is always present, independent of the precise nature of the crowders. The excluded-volume effect favors reactions that increase the available volume, such as the folding of a globular protein to its compact native state, or the binding of proteins to each other. It thus renders globular proteins thermally more stable, with an increased melting temperature. Further insight into the implications of the excluded-volume effect have been gained by numerical simulations, typically with hard spheres as crowders [8][9][10][11][12][13][14][15][16][17][18][19]. In particular, it was demonstrated that the resulting stabilization of globular proteins can be significant, depending on the size and density of the crowders [8,10,12]. Also, in a study of the flavodoxin protein [20], good agreement was established between simulations with hard-sphere crowders and experiments with synthetic crowders (Ficoll). Other studies investigated steric effects on intrinsically disordered proteins (IDPs) [16,17,19], which lack a well-defined tertiary structure. It was shown that a noticeable compaction of IDPs should occur if their extension is comparable to the mean distance between the crowders [17]. Steric effects could potentially also alter the secondarystructure propensities of IDPs. However, although indications of crowding-induced local structuring of IDPs have been observed, major changes of this kind seem not to be common [21][22][23][24].
While universal, the excluded-volume effect need not dominate the interaction of a protein with surrounding macromolecules. In fact, whereas inert crowders have a stabilizing effect, experiments in cells and concentrated protein solutions have observed both stabilization and destabilization of globular proteins [25][26][27][28], compared to dilute-solution data. One possible destabilizing mechanism is that the protein in non-native rather than native form is more capable of forming favorable contacts with surrounding crowder molecules. However, while experiments demonstrate that non-steric effects can be significant, the mechanisms involved remain incompletely understood.
A related problem that has received considerable attention is to understand the mechanisms by which cosolvents affect protein stability. The interaction of cosolvents with proteins can be quantified in terms of preferential interaction coefficients [29], and has been modeled using simplified core-shell descriptions [30] as well as atomistically detailed representations. Two examples of cosolvents whose interaction with proteins has been investigated through atomic-level simulations are the denaturant urea and the protective osmolyte trimethylamine N-oxide, TMAO [31][32][33][34][35][36][37][38]. Simulating the interaction of large macromolecular crowders with a protein at the same level of detail is computationally demanding, because cosolvents such as urea and TMAO are significantly smaller in size than, for instance, protein crowders. As a result, the precise origin of non-steric effects is much less well understood in this case than in that of urea and TMAO. Another problem of current interest with links to macromolecular crowding is the formation of protein-rich droplets through reversible liquid-liquid phase separation [39][40][41].
This article focuses on computational studies of conformational effects on proteins caused by surrounding macromolecular crowders. It begins in Section 2 with a brief overview of current approaches to this problem. Section 3 then describes recent work by us [42,43], where we used Monte Carlo methods, as implemented in our open-source software PROFASI [44], to simulate the folding thermodynamics of two structurally dissimilar peptides in the presence of explicit protein crowders.

Scope
Recent years have seen a growing number of protein simulation studies devoted to different aspects of macromolecular crowding. One important topic is how crowding affects protein diffusion, which has been investigated using Brownian dynamics, with hydrodynamic interactions described in different ways [45][46][47][48]. These studies focus on rigid-body motions. To study the conformational dynamics of a protein, accurate modeling of its internal degrees of freedom is required. This problem can sometimes be tackled by brute-force atomic-level simulations. However, for a system with macromolecular crowders, the time-scales that can be reached with this approach are limited. Therefore, lower-resolution models and alternative sampling strategies can be useful. This section briefly discusses current studies in this area and the methods being used.

Model resolution
Setting up a simulation with macromolecular crowders involves a number of important choices. One is whether to use an explicit or implicit representation of the solvent. For instance, Feig and Sugita performed explicit-solvent simulations of the protein chymotrypsin inhibitor 2 (CI2) in the presence of explicit protein crowders [49]. They were able to generate trajectories sufficiently long (a few hundred ns) to permit identification of perturbations in the structure of CI2 caused by interactions with the crowder proteins. Still, for computational feasibility, simulations with macromolecular crowders are often performed using an implicit solvent representation. Clearly, with this choice, the simulations cannot capture possible indirect mechanisms where the crowder molecules affect the test protein by altering the solvent structure. This omission may seem questionable, for instance, in the light of recent findings of longrange protein-induced changes in the water dynamics around antifreeze proteins [50]. For protein-cosolvent interactions the role of indirect mechanisms is a debated question, although recent work suggests that these effects are limited in the extensively studied case of urea [51].
To be able to reach even larger time-scales, it is tempting to further reduce the level of detail by turning to a coarse-grained description of the crowder molecules. Such a mixed-resolution model (all-atom test protein/coarse-grained crowders/implicit solvent) was proposed by Predeus et al. [52]. Using this model, they explored crowdinginduced distortions of the conformational ensembles sampled by two peptides, trp-cage and melittin. Yet another step toward larger time-scales is to adopt a coarsegrained description for the test protein as well. Coarse-grained models are often used in studies of IDPs [16,17,19], and their applicability may widen with the development of improved force fields [53]. A newly proposed method for crowding simulations couples a coarse-grained protein model to a lattice Boltzmann treatment of solvent hydrodynamics [54].

Sampling strategy
The dominant sampling method for protein simulations, with or without crowders, is molecular dynamics, which is based on integration of Newton's equations of motion and often used in combination with replica exchange [55,56], for enhanced sampling. When simulating systems containing macromolecular crowders, it can become computationally expensive to cover the time-scales needed to study the relevant conformational motions of the test protein. To alleviate this problem, a procedure based on particle insertion techniques was introduced [10,57]. This procedure has the advantage that the test protein only needs to be simulated under crowder-free conditions. The conformational ensemble of the isolated protein is subsequently reweighted by insertion into separately generated crowder configurations. This approach was, for instance, applied to estimate folding/unfolding equilibria of individual proteins in a model for a bacterial cytoplasm [58]. Furthermore, in a study with hard-sphere crowders, the procedure was shown to reproduce folding free energies obtained by direct simulation [59]. At the same time, as with any reweighting method, for a reliable extrapolation based on finite simulation trajectories, there must be a sufficient overlap between the desired and simulated conformational ensembles. Therefore, there remains a need for methods for direct simulations with crowders present.
An alternative to molecular dynamics-based sampling is to use Markov chain Monte Carlo methods, as was done for the simulations discussed in Section 3 below. The potential advantage is that the elementary moves need not be small. For instance, a simple but very powerful move often used in this approach is the pivot rotation around a randomly chosen atom or bond [60], which can generate large-scale deformations of the molecule. A different Monte Carlo approach is provided by chaingrowth algorithms. Such methods have been tested on simplified protein models with very good results [61,62].
Regardless of the sampling strategy, because of the large number of interactions and degrees of freedom in systems with explicit macromolecular crowders, unless the model and numerical methods are implemented into efficient computer code, the computational cost for simulations capable of capturing the salient effects would be prohibitive. In particular, the code must make efficient use of the many opportunities for parallel computation available on modern computers.

Crowding environments
In simulation studies with macromolecular rather than abstract spherical crowders, the investigated systems usually are of one of two types. One approach is to build a crowding environment mimicking cellular conditions, as was done by McGuffee and Elcock [58]. Their simulated system contained 1000 macromolecules of 50 types (proteins and RNAs), with a composition reflecting that in the cytoplasm of Escherichia Coli bacteria. Recently, Feig et al. presented a detailed and extensive model for another bacterial cytoplasm, that of Mycoplasma genitalium, which may serve as a starting point for simulations [63,64]. The model includes proteins, RNAs, protein/RNA complexes, metabolites, ions as well as explicit solvent molecules.
The second main approach uses simplified crowding environments, where the crowders typically are copies of a single protein. The simulated conditions thus resemble those of experiments in concentrated protein solutions. The number of crowder molecules needed for representative results should be smaller with homogeneous than with heterogeneous crowding, and many studies have used systems with around ten crowder molecules. This reduction in system size compared to cytoplasm models has implications for the range of effects on a test protein amenable to study by direct simulation.

Direct simulations of proteins with explicit macromolecular crowders
Direct simulations of folding/unfolding properties of proteins in the presence of explicit macromolecular crowders have in recent years been reported by several groups. As far as we know, the first such studies were carried out by Feig, Sugita and coworkers [49,52,65]. Two already mentioned studies by them investigated crowdinginduced conformational changes in the CI2 protein [49] and in the peptides trp-cage and melittin [52]. Here, the simulated systems consisted of one test molecule and eight copies of some crowder protein. They also reported a study of a system with four protein GB1 and eight villin headpiece molecules [65]. From simulations with different volumes, it was found that villin headpiece becomes increasingly destabilized with increasing crowder concentration, in contrast to expectations based on a simple excluded-volume picture. The conclusion was tested through NMR experiments [65].
More recently, van Giessen and collaborators studied effects on the helix-forming peptide (AAQAA) 3 caused by peptide crowders of the same size [66,67], using a coarse-grained model and replica exchange statistical temperature molecular dynamics [68]. They determined the folding temperature of the test peptide in the presence of the crowders, and analyzed its dependence on the hydrophobicity of the crowders. The simulated systems consisted of one test peptide and either 7 α-helical crowders [66] or 29 β-hairpin crowders [67].
Very recently, Candotti and Orezco reported microseconds-long explicit-solvent simulations of a system with eight protein molecules in different conformational states [69]. The conformational changes observed in these simulations were contrasted with those seen in simulations with synthetic crowders.

Peptide folding thermodynamics with explicit protein crowders
The simulation studies with explicit macromolecular crowders mentioned in Section 2.5 were all carried out using molecular dynamics. This section describes recent work by us based on Monte Carlo rather than molecular dynamics [42,43]. Using this approach, we investigated the equilibrium folding thermodynamics of two peptides in the presence of explicit protein crowders.
The two peptides studied were the compact α-helical trp-cage [70] and the β-hairpin-forming GB1m3 [71], with 20 and 16 residues, respectively. The crowding conditions were homogeneous, with either of two proteins serving as crowding agent, the 58-residue bovine pancreatic trypsin inhibitor (BPTI) or the 56-residue B1 domain of streptococcal protein G (GB1). A schematic illustration of the native folds of the peptides and proteins studied can be found in Figure 1. Both our crowder proteins are thermally highly stable [72,73]. Our two peptides are stable for their size [70,71], but markedly less stable than the crowder proteins. Therefore, the crowder proteins were modeled using a fixed-backbone approximation, whereas the test peptides were free to fold and unfold. Both test and crowder molecules had rotatable side-chains.
All simulations were conducted using an all-atom protein representation and a simplified implicit solvent force field [74], as implemented in the PROFASI program [44]. This force field is phenomenological and was developed and parameterized through folding thermodynamics studies of a structurally diverse set of peptides and small proteins, which included trp-cage and GB1m3 [74]. The model has subsequently been applied to study folding/unfolding properties of several proteins with >90 residues [75][76][77][78][79][80][81]. Other applications include peptide aggregation [82][83][84] and peptide adhesion to solid surfaces [85].
The Monte Carlo move set contained both global pivot-type and semi-local [86] backbone updates, side-chain rotations, and rigid-body motions of whole chains. Additionally, we used replica exchange [87], with 16 different temperatures. The temperatures were geometrically distributed over a system-dependent interval T min ≤ T ≤ T max , with T min = 300 K and T max = 373 K for the systems with BPTI crowders and T min = 290 K and T max = 350 K for those with GB1 crowders. To ensure reproducible results, it was necessary to choose a somewhat larger T min for the BPTI systems. For the GB1 systems, T max was slightly reduced, to stay below the melting temperature of this protein.
The replica-exchange method is convenient, as knowledge of the temperature dependence is very useful in rationalizing the effects of different types of crowders. Furthermore, it improves the sampling efficiency of the simulations. However, from the viewpoint of computational efficiency, even more important is our use of: (i) a simplified force field, (ii) a well balanced mix of small-and large-step Monte Carlo moves, and (iii) a fast code optimized for large systems. To achieve the necessary performance, the PROFASI was reoptimized for large systems using both vector and thread parallelization. Our calculations required about two weeks of computer time on 64 cores for each of the four test peptide-crowder protein combinations.
For comparison, we also simulated both peptides in isolation and with hard-sphere crowders. The volume of the hard spheres was taken larger than that of the BPTI and GB1 crowders, in order to make their modest effects on the peptides more noticeable. The hard-sphere diameter roughly corresponded to the maximum diameter of the BPTI and GB1 structures. All simulated systems with crowders consisted of one test peptide and eight crowder entities, enclosed in a periodic box. The volume fraction occupied by the crowders was 7 % for GB1 and BPTI and 20% for the hard spheres.

Peptide structure and stability
We first discuss the folding thermodynamics of trp-cage under the different simulated conditions. A compilation of data from our trp-cage simulations can be found in Figure 2. The observed effects of the purely steric crowders are modest, but noticeable at high temperatures, where the peptide is unfolded and requires the most volume. The addition of these crowders leads to lower values of the radius of gyration, the RMSD from the native structure, and the end-to-end distance. The fourth property shown, the helix content, is, by contrast, left almost unchanged (Fig. 2(a)). The effects of the BPTI and GB1 crowders on the same peptide are different. At high temperatures, the protein crowders cause smaller changes than those observed with hard spheres, due to their smaller size. At low temperatures, both BPTI and GB1 crowders tend to distort the native structure of trp-cage. Interactions with these crowders thus interfere with the forces driving the folding of trp-cage. This can be most clearly seen in the case of BPTI. As will be discussed below, BPTI interacts primarily with the C-terminal tail of trp-cage. This interaction prevents this part of trp-cage from packing against the α-helix, as it does in the native fold (Fig. 1). As a result, the end-to-end distance stays large at low temperatures ( Fig. 2(d)). In the GB1 case, our data may be compared with the results of the above-mentioned mixed-resolution molecular dynamics simulations by Predeus et al. [52]. These authors observed a set of partially unfolded trp-cage states, which were not sampled in reference simulations without crowders [52]. In line with this finding, our simulations indicate a weak but noticeable distortion of the native fold of trp-cage at low temperatures. Figure 3 shows data from our different simulations of the β-hairpin-forming GB1m3 peptide. With hard-sphere crowders, the results are similar to those discussed above for trp-cage (Fig. 2). The hard-sphere crowders cause an expected compaction at high temperatures (Figs. 3(b),(d)), whereas the secondary-structure content remains essentially unchanged over the entire temperature range (Fig. 3(a)). When adding BPTI or GB1 crowders, the responses of the two peptides are very different. While distorting the trp-cage fold, these crowders have a stabilizing effect on GB1m3, as can be seen from the nativeness parameter shown in Figure 3(c). It is worth stressing that this stabilization is different in strength and character than that observed with hard-sphere crowders, and therefore is not mediated by steric interactions alone. Rather, the stabilization arises because GB1m3 in its native form is capable of interacting favorably with both BPTI and GB1. The stabilization is especially strong with BPTI crowders. The shifts of the melting curves observed in this case correspond to an increase in melting temperature by as much as roughly 15 K (Fig. 3).

The interaction of the peptides with the crowder proteins
Having discussed the effects on trp-cage and GB1m3 caused by BPTI and GB1 crowders, we next look into the interactions mediating these effects. Figure 4 shows test peptide-crowder protein residue contact maps for the four systems studied, computed   Figure 1. Reproduced from [43], with the permission of AIP Publishing.  Fig. 1). Particularly prone to form contacts with the test peptides are residues 8 and 9, both of which are prolines. This is not entirely surprising, because proline, with its unique geometry, is known to play an important role in protein-protein interactions [88]. The interaction of our other crowder protein, GB1, with the test peptides is somewhat less specific in character. Still, there are two sequence regions in GB1 that are responsible for a large fraction of the contacts with the test peptides. These regions are located around the two edges of the β-sheet (Fig. 1).
In this model, a test peptide is affected by the surrounding crowder proteins through four types of interaction: steric repulsion, hydrogen bonding, charge-charge interaction and hydrophobic attraction. To quantify the relative importance of the three types of non-steric interaction, Table 1 shows the corresponding three average energies for all four systems, computed at the melting temperature of the test peptide in question. The results show that the overall magnitude of the non-steric interactions is much larger with BPTI than with GB1 crowders, which matches well with the different contact frequencies seen in Figure 4. At the level of individual energy terms, two values stand out, namely the hydrophobicity energies for the two systems with BPTI crowders. These numbers confirm that the high interaction propensity of the BPTI surface patch identified above largely stems from its partly hydrophobic character.
Another observation that can be made from the contact maps (Fig. 4) is that the sticky patch on the BPTI surface interacts primarily with the C-terminal tail of trp-cage. The anchoring of this tail to BPTI has a stabilizing effect on the native α-helix ( Fig. 2(a)). However, as indicated earlier, it prevents a tight packing of this tail against the α-helix, which leads to an end-to-end distance much larger than it is in the native fold ( Fig. 2(d)).
Finally, we note that the GB1m3-GB1 system is special, because GB1m3 is an optimized variant of the second β-hairpin in GB1 (residues 41-56), with enhanced stability [71]. In our simulations, the part of GB1 most prone to form contacts with GB1m3 is indeed an edge strand (residues 42-46) belonging to the second β-hairpin (Fig. 4, right lower panel). The folded GB1m3 hairpin can align side-by-side with this edge strand of GB1 in four possible ways to form an extended β-sheet, of which only two are observed in the simulations (Fig. 4, right lower panel). These two relative orientations are such that a cluster of hydrophobic side-chains in GB1m3 and the corresponding cluster in the second β-hairpin of GB1 end up on the same side of the extended β-sheet. Another factor that should favor the observed orientations, both of which correspond to antiparallel rather than parallel β-structure, is the geometry of the hydrogen bonds connecting GB1m3 and GB1.

Concluding remarks
To investigate the structure and stability of proteins in the presence macromolecular crowders is important in order to understand how they function under cellular conditions. In the study discussed in this section, we performed for the first time an atomic-level simulation study of the equilibrium folding thermodynamics of two peptides in the presence of explicit protein crowders [42,43]. The simulations require a minimal amount of input and yield a broad set of physical properties that, in principle, can be measured experimentally. Unfortunately, there are not yet any relevant data available on these peptides under crowding conditions.
Our results suggest that the effects on the peptides caused by protein crowders are markedly different from those caused by hard-sphere crowders. This is particularly clear in the case of trp-cage, which is destabilized in our simulations with BPTI or GB1 crowders, whereas hard-sphere crowders have a stabilizing effect. By contrast, the GB1m3 peptide is stabilized in the presence of the protein crowders, and more so than with hard-sphere crowders. In our simulations, attractive interactions with the protein crowders therefore play an important role for both peptides, although the resulting effects on the peptides differ.
It should be pointed out that the response of these small peptides when crowders are added need not be very similar to that a typical globular protein. In particular, due to their small size, their hydrophobic cores are not fully protected from the surroundings, which may make the peptides relatively "sticky". At the same time, their small size makes the peptides less sensitive to steric effects.
In an ongoing study, we are currently investigating the effects of the same two crowders, BPTI and GB1, on superoxide dismutase 1 (SOD1), a 153-residue protein which has been implicated in the ALS disease process. This study focuses on how the unfolding properties of SOD1, at a weakly destabilizing temperature, are affected by interactions with the crowders.