Molecular dynamics simulation or structure refinement of proteins: are solvent molecules required? A case study using hen lysozyme

Pechlaner, Maria; van Gunsteren, Wilfred F.; Hansen, Niels; Smith, Lorna J.

doi:10.1007/s00249-022-01593-1

Molecular dynamics simulation or structure refinement of proteins: are solvent molecules required? A case study using hen lysozyme

Original Article
Open access
Published: 18 March 2022

Volume 51, pages 265–282, (2022)
Cite this article

Download PDF

You have full access to this open access article

European Biophysics Journal Aims and scope Submit manuscript

Molecular dynamics simulation or structure refinement of proteins: are solvent molecules required? A case study using hen lysozyme

Download PDF

Maria Pechlaner ORCID: orcid.org/0000-0002-1615-5600¹,
Wilfred F. van Gunsteren¹,
Niels Hansen² &
…
Lorna J. Smith³

3221 Accesses
6 Citations
Explore all metrics

Abstract

In protein simulation or structure refinement based on values of observable quantities measured in (aqueous) solution, solvent (water) molecules may be explicitly treated, omitted, or represented by a potential of mean-solvation-force term, depending on protein coordinates only, in the force field used. These three approaches are compared for hen egg white lysozyme (HEWL). This 129-residue non-spherical protein contains a variety of secondary-structure elements, and ample experimental data are available: 1630 atom–atom Nuclear Overhauser Enhancement (NOE) upper distance bounds, 213 ³ J-couplings and 200 S² order parameters. These data are used to compare the performance of the three approaches. It is found that a molecular dynamics (MD) simulation in explicit water approximates the experimental data much better than stochastic dynamics (SD) simulation in vacuo without or with a solvent-accessible-surface-area (SASA) implicit-solvation term added to the force field. This is due to the missing energetic and entropic contributions and hydrogen-bonding capacities of the water molecules and the missing dielectric screening effect of this high-permittivity solvent. Omission of explicit water molecules leads to compaction of the protein, an increased internal strain, distortion of exposed loop and turn regions and excessive intra-protein hydrogen bonding. As a consequence, the conformation and dynamics of groups on the surface of the protein, which may play a key role in protein–protein interactions or ligand or substrate binding, may be incorrectly modelled. It is thus recommended to include water molecules explicitly in structure refinement of proteins in aqueous solution based on nuclear magnetic resonance (NMR) or other experimentally measured data.

Effective interactions in molecular dynamics simulations of lysozyme solutions

Article 01 September 2014

Characterization of the flexible lip regions in bacteriophage lambda lysozyme using MD simulations

Article 28 March 2015

Hydrogen/Deuterium Exchange Behavior During Denaturing/Refolding Processes Determined in Tetragonal Hen Egg-White Lysozyme Crystals

Article 13 January 2022

Use our pre-submission checklist

Avoid common mistakes on your manuscript.

Introduction

Since the first simulations of the dynamics of a protein more than four decades ago (McCammon et al. 1977; van Gunsteren and Berendsen 1977), the application of molecular dynamics (MD) simulation to proteins has seen a continuous development in terms of accuracy and efficiency and the use of MD simulation has spread through chemistry, biochemistry and molecular biology (van Gunsteren and Berendsen 1990; van Gunsteren et al. 2006, 2018). The first protein simulations involved a simple protein model (454 united or extended atoms) of the protein bovine pancreatic inhibitor (BPTI, 58 residues), without any non-polar hydrogen atoms or solvent molecules, that is, using a vacuum boundary condition. A short non-bonded interaction cut-off distance (R_c = 0.8 nm) was employed and the relative dielectric permittivity ε_r was assumed to be proportional to the distance r between the atoms of the protein. These conditions limited the accuracy of the simulations. In the next decades, larger proteins were simulated, the protein models were refined, more hydrogen atoms and water molecules were added, the non-bonded interaction cut-off was extended (e.g. to R_c = 1.4 nm), and long-ranged electrostatic interactions were modelled using continuum electrostatics or lattice periodicity. These improvements of the models, in particular the addition of many water molecules, easily ten thousand to solvate a protein in a periodic box, required an increased computing effort. As a consequence, it was attempted to simplify the models again, for example by coarse-graining, that is, representing multiple atoms by a single interaction centre or bead (Riniker et al. 2012), or by replacing the explicit treatment of the solvating water molecules by a mean solvation force that is a function of the positions $\vec{r}^{N} \equiv (\vec{r}_{1} ,\vec{r}_{2} ,...,\vec{r}_{N} )$ in Cartesian coordinates of the N protein atoms. In such an implicit-solvation model (van Gunsteren et al. 1994), the influence of the solvent on the protein degrees of freedom is incorporated in the interaction function and equations of motion of the latter in an average manner.

Modelling solvent effects upon protein structure

The solvent effect upon the structure and dynamics of a solute may be divided into different types.

1. The average or mean interaction between solute atoms is affected by the presence of solvent. When the solvent is omitted from the simulation, the solute force field should be changed to incorporate the mean solvent effect, that is, a potential of mean force should be used for the solute.

2. The solvent exerts a dynamical effect on the solute, which may be mimicked by the introduction into the equations of motion of a frictional force representing solvent drag and of a stochastic force $\vec{f}_{i}^{\rm st} (t)$, randomly fluctuating in time t, representing collisions of solute atoms i with solvent molecules. In the simplest case, the frictional force is taken to be proportional to the velocity $\vec{v}_{i} (t)$ of the solute atom to which it applies, and the stochastic force $\vec{f}_{i}^{\rm st} (t)$ is a stationary Gaussian-distributed random variable, uncorrelated between the different degrees of freedom, which leads to the Langevin equation of motion, a stochastic ordinary differential equation,

$$m_{i} \dot{\vec{v}}_{i} (t) = \vec{f}_{i} (\vec{r}^{N} (t)) - m_{i} \gamma_{i} \vec{v}_{i} (t) + \vec{f}_{i}^{\rm st} (t),$$

(1)

where m_i is the mass and γ_i the friction coefficient of particle i (i=,2…,N), and a time derivative of a quantity is indicated by a dot over the symbol. The stochastic force is assumed to be a stationary Gaussian-distributed random variable with zero mean and to have neither correlation with prior velocities nor with the force

$$\vec{f}_{i}\,\, (\vec{r}^{N} (t)) = - \frac{{\partial V(\vec{r}^{N} (t))}}{{\partial \vec{r}_{i} (t)}},$$

(2)

as derived from the potential energy function $V(\vec{r}^{N} )$,

$$< f_{i}^{\rm st} (t^{\prime})\,f_{j}^{\rm st} (t) > = 2m_{i} \gamma_{i} k_{B} T_{ref} \delta_{ij} \delta (t - t^{\prime}),$$

(3)

$$P(f_{i}^{\rm st} ) = (2\pi < (f_{i}^{\rm st} )^{2} > )^{ - 1/2} \exp ( - (f_{i}^{\rm st} )^{2} /(2 < (f_{i}^{\rm st} )^{2} > )),$$

(4)

$$< f_{i}^{\rm st} > = 0,$$

(5)

$$< v_{i} (t^{\prime})f_{j}^{\rm st} (t) > = 0, \quad t \ge \, t^{\prime},$$

(6)

$$< f_{i} (t^{\prime})f_{j}^{\rm st} (t) > = 0, \quad t \ge \, t^{\prime},$$

(7)

where < … > denotes averaging over an equilibrium ensemble, k_B is Boltzmann’s constant, T_ref is the reference temperature, $P(f_{i}^{\rm st} )$ is the probability distribution of the stochastic force, δ_ij is the Kronecker delta and δ(t-t’) is the delta function. Note that Eqs. (3–7) are not formulated in terms of three-dimensional vectors $\vec{v}_{i}$ and $\vec{f}_{i}$, but in terms of their components (indicated by v and f without vector arrow), i.e. along the x-, y- and z-directions of the right-handed Cartesian coordinate system. A minor correction to Eq. (3) has been discussed in (Ciccotti and Ryckaert 1981; Bossis et al. 1982; van Gunsteren and Berendsen 1982).

The stochastic force f_i^st(t) and the atomic friction coefficient γ_i will only be sizable for protein atoms at the surface. Therefore, they are taken dependent on the number of neighbour atoms within the protein (Shi et al. 1988)

$$\gamma_{i} \left( t \right) \, = \gamma_{{{\text{solv}}}} \omega_{i} \left( t \right)$$

(8)

with

$$\omega_{i} \left( t \right) = {\text{ max}}\left( {0,{ 1} - N_{i}^{nb} (t)/N^{{n{\text{bref}}}} } \right),$$

(9)

where N_i^nb(t) denotes the number of non-hydrogen neighbour atoms of the protein atom i within 0.3 nm radius, and N^nbref is defined as an upper limit of 6 neighbour protein atoms at which solvent forces on solute atom i are assumed to vanish. For water as solvent (at room temperature and pressure) γ_solv = 91 ps⁻¹, and ω_i(t) is updated every 1 ps during the simulation (Shi et al. 1988).

Elimination of protein or solvent degrees of freedom

The conditions that must be fulfilled by degrees of freedom in order that they may be eliminated in a physically correct manner in the process of model simplification, such that a computationally efficient and yet accurate coarse-grained model is obtained, are:

1.
They must be non-essential for the process or property of interest.
2.
They must be large in number or computationally intensive, so that the computational gain is substantial enough to off-set the loss in accuracy.
3.
The interactions governing the degrees of freedom to be eliminated should be largely decoupled from the interactions governing the other degrees of freedom of the system which are to be maintained. This means that the frequency components of the motion along the degrees of freedom to be eliminated must be well separated from the other frequencies occurring in the system, and that the coupling between the two types of motion is weak (van Gunsteren and Berendsen 1977).
4.
Their elimination should allow a simple, efficient representation of the interaction governing the other, remaining degrees of freedom.

Use of an implicit-solvation model

The use of an implicit solvent model, i.e. the attempt to mimic the effect of the solvent by a function that is only dependent on the solute coordinates, does not satisfy conditions 3 and 4. If the solvent is water, it leads to various distortions of the interactions within the solute–solvent system:

1.
The energy and entropy contributions of the solvent molecules to the free energy of the system are missing. For example, the experimental value of the excess free energy of liquid water at room temperature and pressure is with 24 kJ mol⁻¹ about half the size of its heat of vaporization of 44 kJ mol⁻¹. Thus, TΔS, where T is the temperature and ΔS the difference in entropy between gas and liquid phase, is about 20 kJ mol⁻¹, and therefore, not negligible. While the energy contribution of the solvent molecules may to some extent be incorporated into the potential of mean force of the solute, the entropy contribution cannot, because it depends on the mobility of the solvent molecules.
2.
Since a water molecule may serve as hydrogen-bond donor as well as hydrogen-bond acceptor, hydrogen bonding between solute and solvent is missing, leading to enhanced solute–solute hydrogen bonding (Shi et al. 1988).
3.
Since the relative dielectric permittivity ε_r of water at room temperature and pressure is about 80, the dielectric screening effect of the aqueous solution is missing, leading to too strong electrostatic interactions.

Although the motions of a large solute may cover time scales ranging from femtoseconds to milliseconds and the relaxation times of water molecules are of the order of picoseconds, their motions on picosecond to nanosecond time scales are not decoupled, and thus condition 3 is not satisfied for some processes. In explicit water, the non-polar particles aggregate, and the electrostatic interaction between ions is reduced, leading to dissolution. So-called hydrophobic or non-polar particles do like water, but their interaction with water is less strong than the interaction of water with itself, leading to water excluding the hydrophobic molecules and their subsequent aggregation. Ions with opposite charges do like water more than each other, which leads to water surrounding the ions and dissolution of ion pairs. The ‘‘hydrophobic effect’’, the apparent attraction between non-polar molecules or repulsion between ions in aqueous solution due to the stronger interaction between the water molecules or between water molecules and ions, cannot be properly modelled in terms of solute and ion coordinates only, because the effective interaction between solute atoms and their entropy is a complex function of the distribution of solvent coordinates. Thus, also condition 4 is difficult to meet (Müller et al. 2006).

The mentioned fundamental inadequacies of implicit-solvation models (van Gunsteren et al. 1994; Müller et al. 2006) are inherent to any such model (e.g. generalized Born surface area (GBSA) models) and cannot be resolved by using one or the other parameter-calibration procedure when developing such a model.

In light of these considerations, one may wonder for which applications of implicit-solvation models the gain in computational efficiency outweighs the loss of accuracy and physical mechanisms. It makes definitively little sense in simulations of protein folding. If the solvent is omitted, folding is reduced to a problem of chain enthalpy and entropy. In implicit-solvation models, changes in solvent entropy in the first solvation shells upon folding and unfolding cannot be directly modelled in a potential energy term for the solute (Daura et al. 1999). In contrast, the omission of solvent molecules is common practice in structure determination and refinement of proteins based on experimental data. Whether this approximation is warranted will depend on the ratio of the number of independent measured values of observable quantities for a molecule using a particular measurement technique and the number of independent molecular degrees of freedom.

Use of measured data to derive or refine protein structure

All techniques to derive structural information on (macro)molecules from the measurement of observable quantities Q make use of a relation of Q to structure $\vec{r}^{N}$, a function $Q(\vec{r}^{N} )$ (van Gunsteren et al. 2016). Virtually, all experimental techniques measure an average < Q > _space,time of Q over the molecules (space) in the test tube or in a crystal and over a time window determined by the type of experiment, which may range from picoseconds to seconds. This means that < Q > constitutes an average over a Boltzmann-weighted set, i.e. a statistical-mechanical ensemble, of molecular configurations. The weights are proportional to $\exp ( - V(\vec{r}^{N} )/k_{B} T)$, where $V(\vec{r}^{N} )$ indicates the energy of a molecular configuration or structure $\vec{r}^{N}$.

The quality of a set of structures $\vec{r}^{N}$ derived from a set of measured values Q^exp of Q using a particular molecular model will depend on various factors of the structure determination procedure.

1.
Q^exp values are subject to uncertainty or error.
2.
It is not possible to fully account for the averaging over space and time inherent in the experimental measurement.
3.
For most bio-molecular systems, the number of independent Q^exp values available, N^exp, is much smaller than the number of degrees of freedom of the system, N^dof. This means that the structure determination problem is underdetermined (N^exp/N^dof < < 1) and can only be addressed using a molecular model, i.e. a function $V(\vec{r}^{N} )$ specifying likely structural parameters (e.g. bond lengths and bond angles) of a system. The function $V(\vec{r}^{N} )$ may yield low-energy values for configurations that are physically most reasonable. The fewer Q^exp values that are available or the lower N^exp/N^dof, the larger the influence of the choice of molecular model and interaction function $V(\vec{r}^{N} )$ and its inaccuracy, for example due to omission of solvent molecules, on the generated structures will be.
4.
The function $Q(\vec{r}^{N} )$ is not known or it involves assumptions or approximations affecting its accuracy.
5.
The inverse function $\vec{r}^{N} (Q)$ of the function $Q(\vec{r}^{N} )$ may not exist, or if it does, it may be multiple-valued.
6.
The generation or sampling of molecular configurations $\vec{r}^{N}$ must be biased, i.e. guided towards those that are (on average) compatible with Q^exp. This is particularly challenging when the inverse function $\vec{r}^{N} (Q)$ of the function $Q(\vec{r}^{N} )$ is multiple-valued.

Information density of various experimental data for proteins

The third factor involves the balance between the quality or accuracy of the molecular model used and the number of independent experimental values Q^exp available for a quantity Q, which is very different for different experimental measurement techniques, such as X-ray diffraction, NMR, CD, Raman or infrared spectroscopy. Where X-ray diffraction of crystals is a measurement technique that is characterised by a high information density, that is, a large ratio N^exp/N^dof of the number of independent measured values of observable quantities for a molecule and the number of independent molecular degrees of freedom, NMR measurements of proteins in aqueous solution show a much lower information density, and circular dichroism (CD), Raman or infrared spectroscopy or small-angle X-ray scattering (SAXS) data have a very low information density. This implies that the omission of solvent in a crystalline system when deriving protein structure from abundant X-ray diffraction intensities requires a less accurate molecular model than the omission of solvent when deriving protein structure from much less abundant NMR data for a protein in solution. In the latter case, the use of an implicit-solvation model instead of the complete neglect of solvating water molecules may improve the quality of the obtained structures.

Experimental data are generally averages over protein conformations

The second above-mentioned factor also plays a different role when applying the different mentioned measurement techniques. For observable quantities Q, such as X-ray diffraction intensities I_hkl, Nuclear Overhauser Enhancement intensities (NOEs; when represented as atom–atom distance bounds), ³ J-couplings or chemical shifts, it is possible to formulate a function $Q(\vec{r}^{N} )$ relating a Q-value to a particular structure $\vec{r}^{N}$. For other observable quantities, such as S² order parameters or residual dipolar couplings (RDCs), the function relating Q to $\vec{r}^{N}$ involves some average over the Boltzmann ensemble of structures in solution, $Q( < f(\vec{r}^{N} ) > )$, where f denotes the function of $\vec{r}^{N}$ that is being averaged (van Gunsteren et al. 2016). This means that structure determination or refinement based on such quantities must involve the averaging $< f(\vec{r}^{N} ) >$ in addition to the averaging $< Q( < f(\vec{r}^{N} ) > ) >$. Unfortunately, an RDC value is the result of the averaging of a dipolar coupling over the rotational motion of the molecule. For a protein in aqueous solution, the extensive sampling of its rotational motion in an MD simulation would easily take microseconds, which would make protein structure determination based on RDCs using explicit water molecules in the simulation rather expensive, whereas the use of a computationally efficient implicit-solvation model would allow simulations of microseconds length.

In the present article, it is investigated whether the complete omission of solvent or the use of an implicit-solvation model in an SD simulation of a protein will lead to larger deviations from experimental data than the use of explicit solvent molecules in an MD simulation using periodic boundary conditions. The protein hen egg white lysozyme (HEWL, 129 amino-acid residues), see Fig. 1, serves as test molecule, because ample NMR data of this protein in solution are available.

Methods

Energy minimisations, molecular dynamics and stochastic dynamics simulations were performed using the GROMOS bio-molecular simulation software (Schmid et al. 2011a, 2012; van Gunsteren et al. 2019a).

Molecular model for the MD simulation in explicit water using periodic boundary conditions

When solvated in explicit water (MD_water), the protein was modelled using the GROMOS bio-molecular force field 54A7 (Poger et al. 2010; Schmid et al. 2011b). In view of the pH used in the experimental NMR measurements, pH⁓3.8, only Glu 35 was protonated and His was doubly protonated (Bartik et al. 1994). The simple-point-charge (SPC) model (Berendsen et al. 1981) was used to describe the solvent molecules in the rectangular periodic box. To compensate for the overall positive charge of the protein, 10 Cl⁻ ions were included in the solution. All bond lengths and the bond angle of the water molecules were kept rigid with a relative geometric precision of 10^–4 using the SHAKE algorithm (Ryckaert et al. 1977), allowing for a 2 fs MD time step in the leap-frog algorithm (Hockney and Eastwood 1981) used to integrate the equations of motion. For the non-bonded interactions, a triple-range method (van Gunsteren et al. 1986) with cut-off radii of 0.8/1.4 nm was used. Short-range (within 0.8 nm) van der Waals and electrostatic interactions were evaluated every time step based on a charge-group pair list (van Gunsteren et al. 2019b). Medium-range van der Waals and electrostatic interactions, between pairs at a distance larger than 0.8 nm and shorter than 1.4 nm, were evaluated every fifth time step (10 fs), at which time point the pair list was updated, and kept constant between updates. Outside the larger cut-off radius (1.4 nm) a reaction-field approximation (Barker and Watts 1973; Tironi et al. 1995) with a relative dielectric permittivity of 61 (Heinz et al. 2001) was used. Minimum-image periodic boundary conditions were applied.

Molecular model for the SD simulations in vacuo without and with implicit-solvation term

When simulating the protein in vacuo (SD_nowater, SD_implicit), the GROMOS bio-molecular force field 54B7 (van Gunsteren and Dolenc 2012, van Gunsteren et al. 2019c) was used. The A-version of a GROMOS force field is the basic force field designed for molecules in explicit water. The B-version is derived from the A-version in order to be used for simulating molecules in vacuo, where the dielectric screening effect of the environment is neglected. The atomic charges and van der Waals parameters are changed such that atom charge groups with a non-zero total charge are neutralized while maintaining the hydrogen-bonding capacity of the individual atoms. This takes account of the dielectric screening of the aqueous solution that is missing in vacuo.

All bond lengths were kept rigid with a relative geometric precision of 10^–4 using the SHAKE algorithm (Ryckaert et al. 1977), allowing for a 2 fs MD time step in the leap-frog algorithm (van Gunsteren and Berendsen 1988) used to integrate the Langevin equation of motion. The non-bonded interactions were treated as in the MD simulation in explicit water. No periodic boundary conditions were applied.

The implicit-solvation term of the force field is of the so-called solvent-accessible-surface-area (SASA) type, in which the local solute–solvent interactions are assumed to be proportional to the SASA of the solute atoms (Chothia 1974; Eisenberg and McLachlan 1986). The implicit-solvation term with parameter values that make it compatible with the GROMOS force fields reads (Ooi et al. 1987; Still et al. 1990; Fraternali and van Gunsteren 1996; Kunz et al. 2012; Kleinjung et al. 2012)

$$V_{solv}^{SASA} (\vec{r}^{N} ) \equiv \sum\limits_{i = 1}^{N} {\sigma_{i}^{SASA} A_{i} (} \vec{r}^{N} ),$$

(10)

where the weight factors $\sigma_{i}^{SASA}$ are the implicit-solvation model parameters for the atoms of the molecule, which may differ per atom type or only between classes of atoms, such as charged, polar or non-polar atoms (Kleinjung et al. 2012), see Table 1. The accessible area $A_{i} (\vec{r}^{N} )$ of atom i is defined using the approximate analytical expression (Hasel et al. 1988).

$$A_{i} (\vec{r}^{N} ) \equiv S_{i} \prod\limits_{j = 1,j \ne i}^{N} {\left[ {1 - p_{i} p_{ij} b_{ij} (r_{ij} )/S_{i} } \right]} .$$

(11)

Table 1 Two sets of implicit-solvation model parameters compatible with the GROMOS 54B7 force field (Kleinjung et al. 2012)

Full size table

Here, the total surface area of an isolated atom i with radius R_i accessible to a solvent probe atom with radius R_solv is given by

$$S_{i} \equiv 4\pi \left( {R_{i} + R_{solv} } \right)^{2}$$

(12)

and the overlap reduction factor b_ij (Wodak and Janin 1980) for atoms i and j at a distance $r_{ij} \equiv ((\vec{r}_{i} - \vec{r}_{j} )^{2} )^{1/2}$ is given by

$$b_{ij} (r_{ij} ) \equiv \pi (R_{i} + R_{solv} )(R_{i} + R_{j} + 2R_{solv} - r_{ij} )(1 + (R_{j} - R_{i} )/r_{ij} ),$$

(13a)

if 0 < r_ij < R_i + R_j + 2R_solv, and

$$b_{ij} (r_{ij} ) \equiv 0,$$

(13b)

if r_ij ≥ R_i + R_j + 2R_solv.

The atom type parameter p_i has been introduced in Eq. (11) to empirically reduce the effect of double counting the overlap area when multiple overlaps of the surface of atom i with those of many other atoms j occur. The pair parameter p_ij serves as an additional reducing factor that distinguishes between first and next covalently bound neighbour atoms j of atom i. The parameters p_i and p_ij (p_ij = 0.8875 for covalently bound first neighbours and p_ij = 0.3516 for covalently bound next neighbours) have been optimized (Hasel et al. 1988) using R_solv = 0.14 nm and given R_i values to reproduce the exact solvent-accessible surface areas of a large number of small molecules. The values are given in Table 1 of (Hasel et al. 1988) and their mapping onto the atom types used in the GROMOS 54B7 force field, along with the corresponding R_i values, is given in Table 1.

The force on atom k resulting from the implicit-solvation potential energy term $V_{solv}^{SASA} (\vec{r}^{N} )$, Eq. (10), is

$$\vec{f}_{k}^{SASA} = - \frac{{\partial V_{solv}^{SASA} (\vec{r}^{N} )}}{{\partial \vec{r}_{k} }} = - \sum\limits_{i = 1}^{N} {\sigma_{i}^{SASA} \frac{{\partial A_{i} (\vec{r}^{N} )}}{{\partial \vec{r}_{k} }}} ,$$

(14)

with

$$\frac{{\partial A_{i} (\vec{r}^{N} )}}{{\partial \vec{r}_{k} }} = S_{i} \left( { - p_{i} p_{ik} \frac{{\partial b_{ik} (r_{ik} )}}{{\partial \vec{r}_{k} }}/S_{i} } \right)\prod\limits_{j = 1,j \ne i,j \ne k}^{N} {\left( {1 - p_{i} p_{ij} b_{ij} (r_{ij} )/S_{i} } \right)} ,\,{\text{if}}\;k \ne i,$$

(15)

and

$$\frac{{\partial A_{i} (\vec{r}^{N} )}}{{\partial \vec{r}_{i} }} = S_{i} \sum\limits_{l = 1,l \ne i}^{N} {\left( { - p_{i} p_{il} \frac{{\partial b_{il} (r_{il} )}}{{\partial \vec{r}_{i} }}/S_{i} } \right)} \prod\limits_{j = 1,j \ne i,j \ne l}^{N} {\left( {1 - p_{i} p_{ij} b_{ij} (r_{ij} )/S_{i} } \right)} ,\,{\text{if}}\;k = i.$$

(16)

The partial derivatives of the overlap reduction factor b can be written as

$$\frac{{\partial b_{ik} (r_{ik} )}}{{\partial \vec{r}_{k} }} = \frac{{db_{ik} (r_{ik} )}}{{dr_{ik} }}\frac{{\partial r_{ik} }}{{\partial \vec{r}_{k} }},$$

(17)

and

$$\frac{{\partial b_{ik} (r_{ik} )}}{{\partial \vec{r}_{i} }} = \frac{{db_{ik} (r_{ik} )}}{{dr_{ik} }}\frac{{\partial r_{ik} }}{{\partial \vec{r}_{i} }},$$

(18)

with

$$\frac{{db_{ik} (r_{ik} )}}{{dr_{ik} }} = - \pi (R_{i} + R_{solv} )(1 + (R_{k} - R_{i} )/r_{ik} )$$

$$+ \pi (R_{i} + R_{solv} )( - (R_{k} - R_{i} )/r_{ik}^{2} )(R_{i} + R_{k} + 2R_{solv} - r_{ik} ),$$

(19a)

if 0 < r_ij < R_i + R_k + 2R_solv, and

$$\frac{{db_{ik}\,\, (r_{ik} )}}{{dr_{ik} }} = 0,$$

(19b)

if r_ij ≥ R_i + R_k + 2R_solv, and

$$\frac{{\partial \,\,r_{ik} }}{{\partial \vec{r}_{k} }} = - \frac{{\vec{r}_{ik} }}{{r_{ik} }},$$

(20)

and

$$\frac{{\partial r_{ik} }}{{\partial \vec{r}_{i} }} = \frac{{\vec{r}_{ik} }}{{r_{ik} }},$$

(21)

with $\vec{r}_{ik} \equiv \vec{r}_{i} - \vec{r}_{k}$ and $r_{ik} \equiv ((\vec{r}_{ik} )^{2} )^{1/2}$.

The parameters $\sigma_{i}^{SASA}$ of the implicit-solvation model that are compatible with the GROMOS 54B7 force field were taken from (Kleinjung et al. 2012) and are given in Table 1.

Simulation set-up for the MD simulation in explicit water using periodic boundary conditions

The X-ray crystal structure derived from a triclinic unit cell at 0.065 nm resolution at T = 100 K with Protein Data Bank (PDB) code 2VB1 (Berman et al. 2000) was used as the initial structure for the energy minimisations followed by the MD and SD simulations. It contains multiple side-chain conformations for 46 residues. For the initial structure, the side-chain conformation with the highest occupancy was chosen.

The initial structure was first energy minimised in vacuo to release possible strain induced by small differences in bond lengths, bond angles, improper dihedral angles, and short distance non-bonded contacts between the force-field parameters and the X-ray structure. Subsequently, the protein was put into a rectangular box filled with water molecules, such that the minimum solute-wall distance was 1.0 nm, and water molecules closer than 0.23 nm from the solute were removed. This resulted in a box with 12,157 water molecules for the initial protein structure. To relax unfavourable contacts between atoms of the solute and the solvent, a second energy minimisation was performed for the protein in the periodic box with water while keeping the atoms of the solute harmonically position-restrained (van Gunsteren et al. 2019b) with a force constant of 25,000 kJmol⁻¹ nm⁻² (Lier et al. 2020).

The resulting protein-water configuration was used as initial configuration for the MD simulation in explicit water. To avoid artificial deformations in the protein structure due to relatively high-energy atomic interactions still present in the system, the MD simulation was started at T = 60 K and then the temperature was slowly raised to T = 308 K. Initial atomic velocities were sampled from a Maxwell distribution at T = 60 K. The equilibration scheme consisted of five short 20 ps simulations at temperatures 60, 120, 180, 240 and 308 K at constant volume. During the first four of the equilibration periods, the solute atoms were harmonically restrained to their positions in the initial structures with force constants of 25,000, 2500, 250, and 25 kJmol⁻¹ nm⁻². The temperature was kept constant using the weak-coupling algorithm (Berendsen et al. 1984) with a relaxation or coupling time τ_Τ = 0.1 ps. Solute and solvent were separately coupled to the heat bath. Following this equilibration procedure, the simulations were performed at a reference temperature of 308 K and a reference pressure of 1 atm. The pressure was kept constant using the weak-coupling algorithm (Berendsen et al. 1984) with a coupling time τ_p = 0.5 ps and an isothermal compressibility κ_T = 4.575 10^–4 (kJmol⁻¹ nm⁻³)⁻¹. The centre of mass motion of the system was removed every 1000 time steps (2 ps). Trajectory energies and atomic coordinates were stored at 5 ps intervals and used for analysis (Eichenberger et al. 2011).

Simulation set-up for the SD simulations in vacuo without or with implicit-solvation forces

After energy minimisation, the protein in vacuo was slowly heated up using the same protocol as was used for the protein in water. After equilibration of 1 ns, the SD simulations in vacuo (SD_nowater, SD_implicit) were performed with a reference temperature of 308 K, maintained by the Langevin equations or thermostat and by weak coupling to a heat bath (τ_Τ = 0.1 ps), the latter in order to control the temperature of atoms that have a friction coefficient equal to zero, whose temperature is thus not controlled by the Langevin thermostat. Translational motion of the centre of mass of the system was removed every 2 ps. Trajectory energies and atomic coordinates were stored at 5 ps intervals and used for analysis (Eichenberger et al. 2011).

MD and SD simulations performed

One MD simulation and eight SD simulations were performed, each 20 ns long:

1.
MD_water: MD simulation of HEWL in a periodic box with 12,157 explicit water molecules and using the GROMOS 54A7 force field. The average solute and solvent temperatures were 311 K and 312 K, respectively.
2.
SD_nowater: Four SD simulations of HEWL in vacuo, each with different initial velocities, using the GROMOS 54B7 force field. The average solute temperature was 309 K.
3.
SD_implicit: Four SD simulations of HEWL in vacuo, each with different initial velocities, using the GROMOS 54B7 force field with the SASA implicit-solvation term and the set of solvation parameters of column 5 of Table 1 was used. The average solute temperature was 309 K.

Analysis of atomic trajectories

The GROMOS force fields treat aliphatic carbons as united CH, CH₂ and CH₃ atoms. Therefore, when calculating NOE distances, inter-hydrogen distances involving the aliphatic hydrogen atoms were calculated using virtual atomic positions for CH and pro-chiral CH₂ (van Gunsteren et al. 1985) and pseudo-atomic positions for CH₃ (Wüthrich et al. 1983) for those hydrogen atoms (van Gunsteren et al. 2019b). The pseudo-atom NOE distance bound corrections of (Wüthrich et al. 1983) were used (van Gunsteren et al. 2016). The set of NOE distance upper bounds for HEWL (Smith et al. 1993; Schwalbe et al. 2001) can be found in Table S1 of Supporting Information, together with the values obtained from some simulations. The NOE between Trp 28 HZ3 and Leu 56 HG was reassigned as between Trp 28 HZ3 and Leu 56 HD* following reassessment of the experimental spectra. Inter-hydrogen distances were calculated as < r⁻³ > ^−1/3, i.e. using r⁻³ averaging over the trajectory structures, where r indicates the actual hydrogen–hydrogen distance.

In view of the uncertainty inherent to the calculation of NOE bounds and r⁻³ averaged distances, deviations from experiment of less than 0.1 nm are considered insignificant.

Two sets of backbone ³J_HN-Hα couplings and two sets of side-chain ³J_Hα-Hβ couplings of HEWL (Smith et al. 2021a) were used, see Supporting Information Tables S2–S5.

1.
A set (bb1) of 95 backbone ³J_HN-Hα-coupling values, see Table II of (Smith et al. 1991) from which the values for 11 glycine residues were omitted, because these had not been stereo-specifically assigned.
2.
A set (bb2) of 22 experimentally stereo-specifically unassigned backbone ³J_HN-Hα-coupling values for the 11 glycine residues, see Table II of (Smith et al. 1991). 10 of these were stereo-specifically assigned (Smith et al. 2021a) based on a comparison of the ³J_HN-Hα-coupling values calculated from MD simulations and from X-ray structures.
3.
A set (sc1) of 58 ³J_Hα-Hβ-coupling values, see Tables III and IV of (Smith et al. 1991), which were stereo-specifically assigned using experimental data.
4.
A set (sc2) of 38 out of 40 experimentally stereo-specifically unassigned ³J_Hα-Hβ-coupling values, see Table III of (Smith et al. 1991), which were stereo-specifically assigned (Smith et al. 2021a) based on the ³J_Hα-Hβ-coupling values calculated from MD simulations. Only Glu 7 could not be stereo-specifically assigned.

For the calculation of the backbone ³J_HN-Hα-couplings, the Karplus relation (Karplus 1959, 1963) was used with the parameter values a = 6.4 Hz, b = -1.4 Hz and c = 1.9 Hz (Pardi et al. 1984), see Fig. 2 of (Smith et al. 2021a). The side-chain ³J_Hα-Hβ-couplings were calculated using the parameter values a = 9.5 Hz, b = -1.6 Hz and c = 1.8 Hz (DeMarco et al. 1978), see Fig. 2 of (Smith et al. 2021a).

The experimentally derived ³J_HN-Hα-coupling values for Val 2, Thr 51, Asp 66, Cys 115, Thr 118 and Ile 124 lie outside the Karplus curve, so were set to 9.7 Hz, which is the maximum of the Karplus curve used (Pardi et al. 1984). None of the experimentally derived ³J_Hα-Hβ-coupling values lie outside the Karplus curve used (DeMarco et al. 1978). The nomenclature for the H_α2 and H_α3 atoms in Gly residues and the H_β, H_β2 and H_β3 atoms in the side chains was defined as in Fig. 3 of (Markley et al. 1998). The values obtained from some simulations can be found in Tables S2–S5 of Supporting Information.

In view of the various factors contributing to an uncertainty of about 2 Hz inherent to the Karplus relation linking structure and ³ J-couplings, a deviation of less than 2 Hz between ³ J-coupling values calculated from MD trajectory structures and ³ J-coupling values derived from experiment is considered insignificant.

Four sets of S² order-parameter for HEWL, 121 for the backbone NH and 79 for the side-chain CH₃, NH and NH₂ moieties (Buck et al. 1995; Moorman et al. 2012), were used to evaluate the simulations, see Supporting Information Tables S6–S9. S² order parameters for the atom pair (a,b) were calculated using the ensemble averaging expression (Henry and Szabo 1985)

$$S_{ab}^{2} = \frac{1}{2}\left\{ {3\sum\limits_{\alpha = 1}^{3} {\sum\limits_{\beta = 1}^{3} {\left\langle {\frac{{\mu_{ab\alpha } (t)\mu_{ab\beta } (t)}}{{r_{ab}^{3} (t)}}} \right\rangle_{t}^{2} } } - \left\langle {\frac{1}{{r_{ab}^{3} (t)}}} \right\rangle_{t}^{2} } \right\}(r_{ab}^{eff} )^{6} ,$$

(22)

where t indicates the time-averaging window, here 1 ns, shorter than the rotational correlation time of 5.7 ns of HEWL in solution (Smith et al. 1993),

$$\mu_{ab1} \equiv \, \left( {x_{a} {-}x_{b} } \right)/r_{ab} ,\, \mu_{ab2} \equiv \, \left( {y_{a} {-}y_{b} } \right)/r_{ab} ,\, \mu_{ab3} \equiv \, \left( {z_{a} {-}z_{b} } \right)/r_{ab} ,$$

(23)

are the x-, y-, and z-components of the vector $\vec{r}_{ab} \equiv \vec{r}_{a} - \vec{r}_{b}$ and $r_{ab} \equiv ((\vec{r}_{ab} )^{2} )^{1/2}$, its length (Hansen et al. 2014). To obtain a dimensionless quantity, the term in curly brackets is multiplied with the 6^th power of the effective length ($r_{ab}^{eff}$) of the vector $\vec{r}_{ab}$. Because in the present work, bond length constraints are used, the length of $\vec{r}_{ab}$ is essentially constant over time and its length thus equal to its effective value $r_{ab}^{eff}$.

Before calculating $S_{ab}^{2}$, the protein trajectory structures are superimposed using the backbone atoms (N, C_α, C) of residues 3–126 in the fit in order to eliminate the effect of overall rotation of the protein upon the $S_{ab}^{2}$-values. Use of only the backbone atoms of four of the five α-helices and two β-strands in HEWL (residues 4–15, 24–36, 41–45, 50–53, 89–99, and 108–115) did not lead to significantly different $S_{ab}^{2}$-values.

For the Asn and Gln residues, only one S²_NH(exp) value per NH₂ group is available (Buck et al. 1995). This required the assignment to one of the two NH1 and NH2 bond vectors. This was done based on a comparison of the S²_NH1(sim)- and S²_NH2(sim)-values calculated from MD simulations (Smith et al. 2021b). The experimentally unassigned S²_CG1- and S²_CG2-values for Val and S²_CD1- and S²_CD2-values for Leu residues (Moorman et al. 2012) were assigned in a similar way (Smith et al. 2021b). The values obtained from some simulations can be found in Tables S6–S9 of Supporting Information.

In view of the uncertainty inherent to the derivation of S²_ab(exp)-values from relaxation experiments and inherent to the calculation of S²_ab(sim)-values from MD or SD simulations, a deviation of less than 0.2 between simulation and experiment is considered insignificant.

Atom-positional root-mean-square differences RMSD between trajectory structures and the 2VB1 X-ray crystal structure and atom-positional root-mean-square fluctuations (RMSF), i.e. around their average positions, in the MD and SD trajectories were calculated after superimposing the backbone atoms (N, CA, C) of residues 3 – 126 to eliminate the contribution of overall translation and rotation of the protein.

The radius of gyration R_gyr was calculated as

$$R_{gyr} \equiv \left( {\sum\limits_{i = 1}^{N} {(\vec{r}_{i} - } \vec{r}_{cm} )^{2} } \right)^{1/2} ,$$

(24)

with

$$\vec{r}_{cm} \equiv M^{ - 1} \sum\limits_{i = 1}^{N} {m_{i} \vec{r}_{i} }$$

(25)

and

$$M \equiv \sum\limits_{i = 1}^{N} {m_{i} } .$$

(26)

The secondary-structure assignment was done with the program DSSP, based on the Kabsch–Sander rules (Kabsch and Sander 1983).

Hydrogen bonds were identified according to a geometric criterion: a hydrogen bond was assumed to exist if the hydrogen-acceptor distance was smaller than 0.25 nm and the donor-hydrogen-acceptor angle was larger than 135°. The extent of hydrogen bonding was evaluated using the number of intra-solute hydrogen bonds in a simulation multiplied by their % occurrence in the simulation divided by the number of hydrogen bonds in the 2VB1 X-ray structure (Fraternali and van Gunsteren 1996).

The time evolution of structural features that would be sensitive to the way the solvent is modelled, was examined in terms of auto-correlation functions and spectral densities of atom positions and of torsional angles. From a time series of a quantity Q(t), a normalised time correlation function,

$$C_{Q} (t) = \frac{{ < Q(\tau ) \cdot Q(\tau + t) >_{\tau } }}{{ < Q(\tau ) \cdot Q(\tau ) >_{\tau } }}$$

(27)

was calculated using the Fast Fourier Transform technique (Futrelle and McGinty 1971; van Gunsteren et al. 2019d). For these calculations, 25 ps towards the end of the simulations were repeated while saving configurations every 0.01 ps instead of 5 ps to obtain a finer resolution of the auto-correlation functions. When calculating the spectral density, only the first 2% of the auto-correlation function was used.

Although four SD_nowater and four SD_implicit simulations have been run, the data for only one simulation (those with RMSD and RMSF closest to the mean) of each type are shown in Figs. 4 and 5 and listed in the Supporting Information.

Results and discussion

Comparison of structural and energetic properties calculated from the simulations

In Table 2, average values of various properties of HEWL obtained from the three types of simulations, MD_water, SD_nowater and SD_implicit, are shown. As expected, both simulations without explicit water molecules, SD_nowater and SD_implicit, show larger atom-positional root-mean-square differences (RMSD) with the 2VB1 X-ray structure for the CA atoms (6% and 4%, respectively) and for all atoms (13% and 11%, respectively) than the protein solvated in explicit water (MD_water). Inclusion of an implicit-solvation force-field term in an in vacuo simulation does not improve the agreement with the 2VB1 X-ray crystal structure significantly. Figure 1 shows the 2VB1 X-ray crystal structure together with final structures of the three types of simulation. The final structure of the MD_water (red, Fig. 1b) simulation shows less deviation from the X-ray structure (grey) than the final structures of the simulations in vacuo, SD_nowater (green, Fig. 1c) and SD_implicit (blue, Fig. 1d). Figure 2 shows the atom-positional root-mean-square differences (RMSD) with the 2VB1 X-ray structure for the CA atoms as function of residue sequence number. The largest differences for both the SD_nowater and SD_implicit simulations is in an exposed turn between two ß-strands centred around Thr 47. There are also large differences in the loop region connecting helices C and D (particularly at Gly 104 for the SD_nowater simulation) and at the C-terminus. The MD_water simulation also shows a large difference at the C-terminus and changes around Asp 119, a region which contains a 3₁₀ helix in the X-ray structure. The differences between explicit solvent on the one hand and no or implicit solvent on the other hand are larger than the differences between no solvent and implicit solvent.

Table 2 Averages of structural and energetic properties of HEWL calculated from the three types of simulations

Full size table

The lower radius of gyration in vacuo (SD_nowater: 96%, SD_implicit: 96%) compared to that in explicit water (MD_water) reflect the compaction of the protein in vacuo. The total solvent-accessible-surface-area is reduced by 16% in SD_nowater and by 15% in SD_implicit compared to the MD simulation in explicit water. For the hydrophilic area the numbers are 18% (SD_nowater) and 13% (SD_implicit). The reductions in hydrophobic area are 13% and 14% respectively.

The compaction also leads to about 26% (SD_nowater) and 22% (SD_implicit) more intra-protein hydrogen bonding in vacuo compared to explicit water, due to the missing hydrogen-bond donor and acceptor atoms of the water molecules absent in vacuo. The occurrence of secondary-structure elements (α-helix, π-helix, 3₁₀-helix, ß-strand, ß-bridge, bend, turn) as function of time are shown for the three types of simulation in Figs. 3, 4 and 5. Compared to explicit water (Fig. 3), in vacuo without implicit-solvation term (Fig. 4) the fourth α-helix becomes shorter, a 3₁₀-helix appears between the fourth and fifth α-helices, and the fifth changes into a wider π-helix. Using an implicit-solvation term in the force field (Fig. 5), the ß-sheet becomes less stable, the third α-helix turns in to a 3₁₀-helix, and only later in the simulation the fifth α-helix appears while the sixth α-helix turns into a 3₁₀-helix. In vacuo, the secondary-structure elements of HEWL become less stable than when simulated in explicit water.

The absence of many explicit water molecules in simulations in vacuo will influence the internal (u:u) energy of the protein (Table 2): In MD_water, it is -9821 kJ/mol, in SD_nowater it is higher, -9080 kJ/mol, and in SD_implicit, it is increased to − 9058 kJ/mol. Apparently, omission of explicit water molecules, bulk water, increases the internal energy of the protein, with the implicit-solvation force-field term more than without. The SASA energy of − 114 kJ/mol is a poor representation of the protein – explicit water energy of -12,272 kJ/mol, leading to slightly more strain in the molecule in vacuo than in explicit water. The reduced atom-positional root-mean-square fluctuations (RMSF, Table 2) in SD_nowater and SD_implicit compared to MD_water indicate a reduction of the internal entropy of the protein.

Comparison of NOE distances, ³ J-couplings and S ² order parameters calculated from the simulations with experimentally derived values for HEWL

Table 3 shows the number of NOE distance upper bound violations in the 2VB1 X-ray crystal structure and for the three types of simulations of HEWL. The X-ray structure shows only 12 distance bound violations larger than 0.1 nm. MD simulation in explicit water leads to more, 42, of such violations, 2.6% of the total number of 1630 bounds. SD simulation in vacuo without an implicit-solvation force-field term leads to twice as many, 87 (5.3%) of such violations, in particular large (> 0.3 nm) ones, and the introduction of the implicit-solvation force-field term yields less of such violations, 65 (4.0%). The use of an implicit-solvation force-field term improves the agreement with the NOE data, but the agreement is still worse than when using explicit solvation in the MD simulation.

Table 3 Number of NOE distance bound violations in the 2VB1 X-ray crystal structure and in the three types of simulations of HEWL

Full size table

Table 4 shows the number of deviations from experimentally derived values for different types of ³ J-couplings in the 2VB1 X-ray crystal structure and for the three types of simulations of HEWL. The numbers of available measured ³ J-couplings are for the backbone 95 NMR-assigned and 22 MD/X-ray-assigned ³ J-couplings and for the side chains 58 NMR-assigned and 38 MD-assigned ³ J-couplings, in total 213 ³ J-coupling values (Smith et al. 2021a). The X-ray structure shows 3 (2.6%) backbone, 15 (26%) NMR-assigned side-chain and 25 (66%) MD-assigned side-chain ³ J-coupling deviations larger than 2 Hz. In the MD simulation of HEWL in explicit water these values are 19 (16%), 14 (24%) and 13 (34%), respectively. SD simulation in vacuo without implicit-solvation force-field term leads to larger deviations, 25.7 (27%) NMR-assigned backbone and 2.2 (10%) MD/X-ray-assigned backbone ³ J-couplings, and 20.0 (34%) NMR-assigned side-chain and 18.9 (50%) MD-assigned side-chain ³ J-coupling deviations larger than 2 Hz. The introduction of an implicit-solvation force-field term does not change these values significantly, with 24.8 (26%) NMR-assigned and 2.9 (13%) MD/X-ray-assigned backbone ³ J-couplings, and 23.2 (40%) NMR-assigned side-chain and 18.0 (47%) MD-assigned side-chain ³ J-coupling deviations larger than 2 Hz. The large deviations for the SD_nowater and SD_implicit simulations are particularly for residues in the long loop region (especially residues 61–74) and residues 45–51 in the exposed turn between two ß-strands where the final simulation structures show a large CA atom-positional RMSD to the 2VB1 X-ray structure (Fig. 2). MD simulation in explicit water yields, compared to the X-ray crystal data, worse agreement for the backbone ³ J-couplings, but slightly better agreement for the side-chain ³ J-couplings. The SD simulations in vacuo, without or with implicit-solvation force-field term, show significantly worse agreement with the experimentally derived ³ J-coupling data.

Table 4 Number of deviations, |³ J (exp)– ³ J (MD, SD or X-ray)|, in the 2VB1 X-ray structure and in the three types of simulations of HEWL, for the 95 and 22 backbone (bb) ³J_HNHα-coupling values derived from NMR measurements and assigned based on the NMR data or stereo-specifically on MD simulation or X-ray data, and for the 58 and 38 side-chain (sc) ³J_HαHβ-coupling values derived from NMR measurements and stereo-specifically assigned based on NMR measurements or on MD simulation data (Smith et al. 2021a)

Full size table

Table 5 shows the number of deviations from experimentally derived values for different types of S² order parameters for the three types of simulations of HEWL. The numbers of available experimentally derived S² order-parameter values are 121 backbone S²_NH-values, 79 side-chain S²-values, that is, 51 S²_CH-values of Ala, Ile, Leu, Met, Thr and Val residues, 11 S²_NH-values of Trp and Arg residues and 17 S²_NH2-values of Asn and Gln residues, in total 200 S² order-parameter values (Smith et al. 2021b). The MD simulation of HEWL in explicit water shows for the backbone S²_NH order parameters 21 (17%) deviations larger than 0.2 and for the side-chain S² order parameters 25 (55%) deviations larger than 0.2, that is, 21 (41%) S²_CH-values of Ala, Ile, Leu, Met, Thr and Val residues, 1 (9%) S²_NH-value of a Trp residue and 3 (18%) S²_NH2-values of Asn and Gln residues. SD simulations in vacuo yield better agreement with the experimentally derived values for the backbone, but worse agreement for the side chains. Without implicit-solvation force-field term, the deviations larger than 0.2 are 10.0 (8%) for the backbone and 18.4 (36%), 1.7 (15%) and 8.1 (48%) for the three types of side-chain S² order parameters, respectively. Inclusion of an implicit-solvation force-field term does not change the agreement significantly, with deviations larger than 0.2 of 10.2 (8%) for the backbone and 19.2 (38%), 2.2 (20%) and 8.2 (48%) for the three types of side-chain S² order parameters, respectively. The significant increase in the deviations of the S²_NH2-values of Asn and Gln residues for the SD_nowater and SD_implicit simulations comes from residues with higher calculated S² values from the simulations than those observed experimentally. These Asn and Gln side chains form persistent intra-protein hydrogen bonds in the SD_nowater and SD_implicit simulations, while in the X-ray structure and MD_water simulation they hydrogen bond to crystallographic waters and form short-lived hydrogen bonds to bulk water molecules, respectively. For example, the side chain of Asn 19 hydrogen bonds to the backbone carbonyl group of Asp 18 with populations of 65% and 33% in the SD_nowater and SD_implicit simulations, respectively, and the side chain of Gln 121 hydrogen bonds to the backbone carbonyl group of Thr 118 in the SD_nowater simulation (population 66%) and to the backbone carbonyl group of Arg 128 in the SD_implicit simulation (population 57%). None of these hydrogen bonds are present in the X-ray structure or the MD_water simulation (Asn 19: Experimental S² 0.43, MD_water 0.49, SD_nowater 0.77, SD_implicit 0.76; Gln 121: Experimental S² 0.36, MD_water 0.34, SD_nowater 0.78, SD_implicit 0.63). Overall, as for the ³ J-couplings, the SD simulations in vacuo, without or with implicit-solvation force-field term, show overall worse agreement with the experimentally derived S² order-parameter values.

Table 5 Number of deviations, |S²(exp)—S²(MD or SD)|, for the 121 backbone S²_NH-values and for the 79 side-chain S²-values, that is, 51 S²_CH-values, 11 S²_NH-values of Trp and Arg residues and 17 S²_NH2-values of Asn and Gln residues (Smith et al. 2021b), in the three types of simulations of HEWL

Full size table

An early comparison with experimental data of various properties of HEWL as obtained by MD simulation in vacuo and in water (Smith et al. 1995) used the older GROMOS force-field versions 37C4 (MD in water) and 37D4 (MD in vacuo) and also a modified version of 37C4 (MD in water) with explicit inclusion of aromatic hydrogens and a modified interaction between water oxygen and the carbon atoms of the protein. These force-field versions were not yet calibrated to free-energy (energy and entropy) data of various compounds in solution. The experimental data compared to were 1158 NOE atom–atom distance upper bounds, 163 ³ J-couplings and 159 S² order-parameter values, 29% (NOEs), 23% (³ J-couplings), and 20% (S² order parameters) less data than in the current study (1630 NOEs, 41% more; 213 ³ J-couplings, 31% more; 200 S² order parameters, 26% more). The currently used force field (54A7) and X-ray crystal structure (2VB1) yield better agreement with the larger experimental data set than the older force field (37C4, 37C4 modified) and older X-ray crystal structure (2LZT) with the smaller experimental data set. For the X-ray structures, there are 21 (current) vs 15 (in 1995) distance upper bound violations in the range 0.05–0.1 nm, 12 (current) vs 25 (in 1995) violations in the range 0.1–0.3 nm, and 0 (current) vs 2 (in 1995) violations in the range > 0.3 nm. In the simulation of HEWL in water, the current force field (54A7) yields better agreement with the larger experimental data set than the older force fields (37C4, with or without modifications) with the smaller experimental data set. For the water MD simulations there are 44 (current) vs 40 and 31 (in 1995) distance bound violations in the range 0.05 – 0.1 nm, 37 (current) vs 47 and 64 (in 1995) violations in the range 0.1 – 0.3 nm, and 5 (current) vs 7 and 17 (in 1995) violations in the range > 0.3 nm. The current SD simulation in vacuo shows an agreement with the larger experimental data set that is comparable to that of the MD simulation in vacuo using the older force field (37D4) and the smaller experimental data set. For the SD and MD simulations in vacuo there are 47 (current) vs 41 (in 1995) distance bound violations in the range 0.05 – 0.1 nm, 66 (current) vs 61 (in 1995) violations in the range 0.1 – 0.3 nm, and 21 (current) vs 20 (in 1995) violations in the range > 0.3 nm.

Comparison of dynamical properties calculated from the simulations

Table 2 shows the atom-positional root-mean-square fluctuations (RMSF) in the three types of simulations, MD_water, SD_nowater and SD_implicit, as averages over the backbone CA atoms and as averages over all atoms. The backbone CA atom-positional root-mean-square fluctuations as function of residue sequence number for the three types of simulations are shown in Fig. 6. As expected, both simulations without explicit water molecules (SD_nowater: dotted line, SD_implicit: dashed line) show less mobility of the atoms than the protein solvated in explicit water (MD_water: black line). This due to the vacuo boundary condition applied in the former simulations, which leads to a compaction of the protein and thus less mobility. The use of the implicit–solvation force-field term leads to somewhat more mobility than without such a term, but the mobility is still only 83% (CA atoms) and 88% (all atoms) of that in explicit water. Without implicit–solvation force-field term these values are 77% and 84%, respectively.

A more detailed picture of the differences in dynamics of the protein atoms in the different types of simulations can be obtained by calculating auto-correlation functions for various degrees of freedom in the protein. In Fig. 7, the auto-correlation functions and spectral densities of six torsional angles are shown: the backbone φ-angles of Ala 10 (in an α-helix) and of Thr 69 (in the long loop in the ß-domain), the side-chain χ₃-angle of Met 105 (in the so-called hydrophobic box of HEWL), the side-chain χ₅-angle of Arg 61 (at the end of the ß-sheet, which shows a much higher mobility than its backbone angles), the side-chain χ₃-angle of Glu 7 (which is mobile despite being part of an α-helix), and the side-chain χ₃-angle of Trp 108 (in the hydrophobic box). The auto-correlation function of the backbone φ-angle of Ala 10 in an α-helix is flat and almost identical for all three types of simulations, which is not surprising. The backbone φ-angle of Thr 69 in the long loop shows more long-time correlation, most in SD_implicit and least in MD_water. The peak in the spectral density between 5 and 12 ps⁻¹ occurs in explicit solvent at a slightly higher frequency (9 ps⁻¹) than in vacuo (7 ps⁻¹). The χ₃-angles in the side chains of Met 105 and Glu 7 display similar behaviour. The auto-correlations in MD_water and MD_implicit decay faster than in MD_nowater. The small oscillations in the auto-correlation function of Glu 7 in MD_water lead to a peak at 13 ps⁻¹ in the spectral density. For the χ₅-angle of Arg 61, the auto-correlation functions and spectral densities in the three types of simulations are rather similar, and the same observation holds, to a lesser extent, for the χ₃-angle of Trp 108. Overall, the short-time dynamics does not differ greatly between the different types of simulations, while the difference between explicit solvent and no or implicit solvent is somewhat larger than between no solvent and implicit solvent.

Conclusions

Generally, structure refinement of proteins in crystal or in (aqueous) solution is carried out for the solute molecule in vacuo, that is, without treating the solvent (water) degrees of freedom explicitly. Omission of solvent molecules may, however, lead to distortions in the protein structure, dynamics, internal energy and entropy. This has been investigated for the protein hen egg white lysozyme (HEWL), for which ample experimentally derived data are available, which may be used to evaluate the extent of the mentioned distortions.

Omission of bulk water in a simulation leads to a compaction of the protein, a lower radius of gyration and solvent-accessible-surface-area, an increase of protein-internal hydrogen bonding, an increase of the protein-internal energy and strain due to missing interactions with water molecules, and a reduction of the protein-internal entropy. A comparison with various experimentally derived data show a higher number of NOE distance upper bound violations: in explicit water 2.6% of the 1630 bounds, in SD_nowater 5.3% and in SD_implicit 4.0%. The experimentally derived 213 ³ J-couplings and 200 S² order parameters are much less well reproduced by simulation in vacuo, without or with implicit-solvation term, than by simulating the motion of the protein degrees of freedom and explicitly those of bulk water solvating the protein.

The rather large differences found between simulating a protein in explicit water on the one hand and simulating it in vacuo on the other hand can be understood from the particular properties of water: the rather large entropy content of bulk water at ambient temperature and pressure, the hydrogen-bonding capacity of individual water molecules and the dielectric screening of protein-internal electrostatic interactions by high-permittivity bulk water. These three features also explain why the addition of an implicit-solvation mean-force term to the force field applied does not help much to off-set the omission of explicit water molecules. The three mentioned fundamental flaws are inherent to any implicit-solvation model.

The results for HEWL presented here constitute only one example of the deficits of protein simulation or refinement models that ignore the influence of solvent (water) upon the protein properties in (aqueous) solution. HEWL is a challenging case regarding in vacuo simulation: it is a non-spherical, not compact protein with an overall charge of + 10e, containing a variety of secondary-structure elements and loops. This suggests that for relatively spherical, compact proteins, for example ubiquitin, the effects of omission of water molecules in simulation or refinement may be less pronounced. However, as the detailed comparisons with experimental data presented here show, even if significant overall changes to the structure are not observed, the torsion angles of residues in exposed turns and loops and the dynamical behaviour of exposed side chains may not be correctly represented. As these groups are often involved in protein–protein interactions or ligand or substrate binding, the correct modelling of their properties is of particular importance.

Therefore, in structure refinement of proteins in aqueous solution based on a limited set of experimentally derived data, as compared to the number of protein-internal degrees of freedom, the use of explicit water molecules is essential, for HEWL see e.g. (Smith et al. 2021a, b). In view of the abundance of X-ray reflections for proteins in crystalline form, the use of explicit water molecules in structure refinement based on X-ray data is less essential, but will enhance the physical reliability of the resulting structures, for bovine pancreatic trypsin inhibitor (BPTI) see e.g. (Gros et al. 1990; Schiffer and van Gunsteren 1999).

References

Barker JA, Watts RO (1973) Monte Carlo studies of the dielectric properties of water-like models. Mol Phys 26:789–792
Article CAS Google Scholar
Bartik K, Redfield C, Dobson CM (1994) Measurement of the individual pK_a values of acidic residues of Hen and Turkey Lysozymes by two-dimensional ¹H NMR. Biophys J 66:1180–1184
Article CAS PubMed PubMed Central Google Scholar
Berendsen HJC, Postma JPM, van Gunsteren WF, Hermans J (1981) Interaction models for water in relation to protein hydration. In: Pullmann B (ed) Intermolecular forces. Reidel, Dordrecht, pp 331–342
Chapter Google Scholar
Berendsen HJC, Postma JPM, van Gunsteren WF, DiNola A, Haak JR (1984) Molecular dynamics with coupling to an external bath. J Chem Phys 81:3684–3690
Article CAS Google Scholar
Berman HM, Westbrook J, Feng Z, Gilliland G, Bhat TN, Weissig H, Shindyalov IN, Bourne PE (2000) The Protein Data Bank, www.pdb.org. Nucl Acids Res 28:235–242
Article CAS PubMed PubMed Central Google Scholar
Bossis G, Quentrec B, Boon JP (1982) Brownian dynamics and the fluctuation-dissipation theorem. Mol Phys 45:191–196
Article CAS Google Scholar
Buck M, Boyd J, Redfield C, MacKenzie DA, Jeenes DJ, Archer DB, Dobson CM (1995) Structural determinants of protein dynamics: analysis of ¹⁵N NMR relaxation measurements for main-chain and side-chain nuclei of hen egg white lysozyme. Biochemistry 34:4041–4055
Article CAS PubMed Google Scholar
Chothia C (1974) Hydrophobic bonding and accessible surface area in proteins. Nat (lond) 248:338–339
Article CAS Google Scholar
Ciccotti G, Ryckaert JP (1981) On the derivation of the generalized langevin equation for interacting brownian particles. J Statist Phys 26:73–82
Article Google Scholar
Daura X, Mark AE, van Gunsteren WF (1999) Peptide folding simulations: no solvent required ? Comp Phys Comm 123:97–102
Article CAS Google Scholar
deMarco A, Llinás M, Wüthrich K (1978) Analysis of the ¹H-NMR spectra of ferrichrome peptides. I. Non-Amide Protons Biopoly 17:617–636
Article CAS Google Scholar
Eichenberger AP, Allison JR, Dolenc J, Geerke DP, Horta BAC, Meier K, Oostenbrink C, Schmid N, Steiner D, Wang D, van Gunsteren WF (2011) The GROMOS++ software for the analysis of biomolecular simulation trajectories. J Chem Theory Comput 7:3379–3390
Article CAS PubMed Google Scholar
Eisenberg D, McLachlan AD (1986) Solvation energy in protein folding and binding. Nat (lond) 319:199–203
Article CAS Google Scholar
Fraternali F, van Gunsteren WF (1996) An efficient mean solvation force model for use in molecular dynamics simulations of proteins in aqueous solution. J Mol Biol 256:939–948
Article CAS PubMed Google Scholar
Futrelle RP, McGinty DJ (1971) Calculation of spectra and correlation functions from molecular dynamics data using the fast fourier transform. Chem Phys Lett 12:285–287
Article CAS Google Scholar
Gros P, van Gunsteren WF, Hol WGJ (1990) Inclusion of thermal motion in crystallographic structures by restrained molecular dynamics. Science 249:1149–1152
Article CAS PubMed Google Scholar
Hansen N, Heller F, Schmid N, van Gunsteren WF (2014) Time-averaged order parameter restraints in molecular dynamics simulations. J Biomol NMR 60:169–187
Article CAS PubMed Google Scholar
Hasel W, Hendrickson TF, Still WC (1988) A rapid approximation to the solvent accessible surface areas of atoms. Tetrahed Comput Methodol 1:103–116
Article CAS Google Scholar
Heinz TN, van Gunsteren WF, Hünenberger PH (2001) Comparison of four methods to compute the dielectric permittivity of liquids from molecular dynamics simulations. J Chem Phys 115:1125–1136
Article CAS Google Scholar
Henry ER, Szabo A (1985) Influence of vibrational motion on solid state line shapes and NMR relaxation. J Chem Phys 82:4753–4761
Article CAS Google Scholar
Hockney RW, Eastwood JW (1981) Computer simulation using particles. McGraw-Hill, New York
Google Scholar
Kabsch W, Sander C (1983) Dictionary of protein secondary structure: pattern recognition of hydrogen-bonded and geometrical features. Biopolymers 22:2577–2637
Article CAS PubMed Google Scholar
Karplus M (1959) Contact electron–spin coupling of nuclear magnetic moments. J Chem Phys 30:11–15
Article CAS Google Scholar
Karplus M (1963) Vicinal proton coupling in nuclear magnetic resonance. J Am Chem Soc 85:2870–2871
Article CAS Google Scholar
Kleinjung J, Scott WRP, Allison JR, van Gunsteren WF, Fraternali F (2012) Implicit-solvation parameters derived from explicit water forces in large-scale molecular dynamics simulations. J Chem Theory Comput 8:2391–2403
Article CAS PubMed PubMed Central Google Scholar
Kunz APE, Allison JR, Geerke DP, Horta BAC, Hünenberger PH, Riniker S, Schmid N, van Gunsteren WF (2012) New functionalities in the GROMOS biomolecular simulation software. J Comput Chem 33:340–353
Article CAS PubMed Google Scholar
Lier B, Öhlknecht C, de Ruiter A, Gebhardt J, van Gunsteren WF, Oostenbrink C, Hansen N (2020) A suite of advanced tutorials for the GROMOS biomolecular simulation software. Living J Comp Mol Sci 2:18552
Google Scholar
Markley JL, Bax A, Arata Y, Hilbers CW, Kaptein R, Sykes BD, Wright PE, Wüthrich K (1998) Recommendations for the presentation of NMR structures of proteins and nucleic acids. J Biomol NMR 12:1–23
Article CAS PubMed Google Scholar
McCammon JA, Gelin BR, Karplus M (1977) Dynamics of folded proteins. Nat (lond) 267:585–590
Article CAS Google Scholar
Moorman V, Valentine KG, Wand AJ (2012) The dynamical response of hen egg white lysozyme to the binding of a carbohydrate ligand. Prot Sci 21:1066–1073
Article CAS Google Scholar
Müller M, Katsov K, Schick M (2006) Biological and synthetic membranes: what can be learned from a coarse-grained description ? Phys Rep 434:113–176
Article CAS Google Scholar
Ooi T, Oobatake M, Némethy G, Scheraga HA (1987) Accessible surface areas as a measure of the thermodynamic parameters of hydration of peptides. Proc Natl Acad Sci USA 84:3086–3090
Article CAS PubMed PubMed Central Google Scholar
Pardi A, Billeter M, Wüthrich K (1984) Calibration of the angular dependence of the amide proton-C^α proton coupling constants, ³J_HNα, in a globular protein. Use of ³J_HNα for identification of helical secondary structure. J Mol Biol 180:741–751
Article CAS PubMed Google Scholar
Poger D, van Gunsteren WF, Mark AE (2010) A new force field for simulating phosphatidylcholine bilayers. J Comput Chem 31:1117–1125
Article CAS PubMed Google Scholar
Riniker S, Allison JR, van Gunsteren WF (2012) On developing coarse-grained models for biomolecular simulation: a review. Phys Chem Chem Phys 14:12423–12430
Article CAS PubMed Google Scholar
Ryckaert JP, Ciccotti G, Berendsen HJC (1977) Numerical integration of the Cartesian equations of motion of a system with constraints: molecular dynamics of n-alkanes. J Comput Phys 23:327–341
Article CAS Google Scholar
Schiffer CA, van Gunsteren WF (1999) Accessibility and order of water sites in and around proteins: a crystallographic time-averaging study. Proteins Struct Funct Genet 36:501–511
Article CAS PubMed Google Scholar
Schmid N, Allison JR, Dolenc J, Eichenberger AP, Kunz AP, van Gunsteren WF (2011a) Biomolecular structure refinement using the GROMOS simulation software. J Biomol NMR 51:265–281
Article CAS PubMed Google Scholar
Schmid N, Eichenberger A, Choutko A, Riniker S, Winger M, Mark AE, van Gunsteren WF (2011b) Definition and testing of the GROMOS force-field versions: 54A7 and 54B7. Eur Biophys J 40:843–856
Article CAS PubMed Google Scholar
Schmid N, Christ CD, Christen M, Eichenberger AP, van Gunsteren WF (2012) Architecture, implementation and parallelization of the GROMOS software for biomolecular simulation. Comp Phys Comm 183:890–903
Article CAS Google Scholar
Schwalbe H, Grimshaw SB, Buck M, Spencer A, Boyd J, Dobson CM, Redfield C, Smith LJ (2001) A refined solution structure of hen lysozyme determined using residual dipolar coupling data. Protein Sci 10:677–688
Article CAS PubMed PubMed Central Google Scholar
Shi YY, Wang L, van Gunsteren WF (1988) On the approximation of solvent effects on the conformation and dynamics of cyclosporin A by stochastic dynamics simulation techniques. Mol Simul 1:369–383
Article CAS Google Scholar
Smith LJ, Sutcliffe MJ, Redfield C, Dobson CM (1991) Analysis of φ and χ₁ torsion angles for hen lysozyme in solution from ¹H NMR spin-spin coupling constants. Biochemistry 30:986–996
Article CAS PubMed Google Scholar
Smith LJ, Sutcliffe MJ, Redfield C, Dobson CM (1993) Structure of hen lysozyme in solution. J Mol Biol 229:930–944
Article CAS PubMed Google Scholar
Smith LJ, Mark AE, Dobson CM, van Gunsteren WF (1995) Comparison of MD simulations and NMR experiments for hen lysozyme. Analysis of local fluctuations, cooperative motions, and global changes. Biochemistry 34:10918–10931
Article CAS PubMed Google Scholar
Smith LJ, van Gunsteren WF, Hansen N (2021a) On the use of ³J-coupling NMR data to derive structural information on proteins. J Biomol NMR 75:39–70
Article CAS PubMed PubMed Central Google Scholar
Smith LJ, van Gunsteren WF, Hansen N (2021b) On the use of side-chain NMR relaxation data to derive structural and dynamical information on proteins: a case study using hen lysozyme. Chem Bio Chem 22:1049–1064
Article CAS PubMed Google Scholar
Still WC, Tempczyk A, Hawley RC, Hendrickson T (1990) Semi-analytical treatment of solvation for molecular mechanics and dynamics. J Am Chem Soc 112:6127–6129
Article CAS Google Scholar
Tironi IG, Sperb R, Smith PE, van Gunsteren WF (1995) A generalized reaction field method for molecular dynamics simulations. J Chem Phys 102:5451–5459
Article CAS Google Scholar
van Gunsteren WF, Berendsen HJC (1977) Algorithms for macromolecular dynamics and constraint dynamics. Mol Phys 34:1311–1327
Article Google Scholar
van Gunsteren WF, Berendsen HJC (1982) On the fluctuation-dissipation theorem for interacting brownian particles. Mol Phys 47:721–723
Article Google Scholar
van Gunsteren WF, Berendsen HJC (1988) A leap-frog algorithm for stochastic dynamics. Mol Simul 1:173–185
Article Google Scholar
van Gunsteren WF, Berendsen HJC (1990) Computer simulation of molecular dynamics: methodology, applications and perspectives in chemistry. Angew Chem Int Ed Engl 29:992–1023
Article Google Scholar
van Gunsteren WF, Dolenc J (2012) Thirty-five years of biomolecular simulation: development of methodology, force fields, and software. Mol Simul 38:1271–1281
Article CAS Google Scholar
van Gunsteren WF, Boelens R, Kaptein R, Scheek RM, Zuiderweg ERP (1985) An improved restrained molecular dynamics technique to obtain protein tertiary structure from nuclear magnetic resonance data. Molecular dynamics and protein structure. Polycrystal Book Service, Western Springs, pp 92–99
Google Scholar
van Gunsteren WF, Berendsen HJC, Geurtsen RG, Zwinderman HRJ (1986) A molecular dynamics computer simulation of an eight-base-pair DNA fragment in aqueous solution: comparison with experimental two-dimensional NMR data. Ann New York Acad Sci 482:287–303
Article Google Scholar
van Gunsteren WF, Luque FJ, Timms D, Torda AE (1994) Molecular mechanics in biology: from structure to function, taking account of solvation. Ann Rev Biophys Biomol Struct 23:847–863
Article Google Scholar
van Gunsteren WF, Bakowies D, Baron R, Chandrasekhar I, Christen M, Daura X, Gee P, Geerke DP, Glättli A, Hünenberger PH, Kastenholz MA, Oostenbrink C, Schenk M, Trzesniak D, van der Vegt NFA, Yu HB (2006) Biomolecular modelling: goals, problems, perspectives. Angew Chem Int Ed 45:4064–4092
Article CAS Google Scholar
van Gunsteren WF, Allison JR, Daura X, Dolenc J, Hansen N, Mark AE, Oostenbrink C, Rusu VH, Smith LJ (2016) Deriving structural information from experimentally measured data on biomolecules: a review. Angew Chem Int Ed 55:15990–16010
Article CAS Google Scholar
van Gunsteren WF, Daura X, Hansen N, Mark AE, Oostenbrink C, Riniker S, Smith LJ (2018) Validation of molecular simulation: an overview of issues. Angew Chem Int Ed 57:884–902
Article CAS Google Scholar
van Gunsteren et al (2019a) The GROMOS software for (Bio) molecular simulation, vol 1–9. http://www.gromos.net/. Accessed 4 July 2019
van Gunsteren WF et al (2019b) The GROMOS software for (Bio) molecular simulation. Volume 2: Algorithms and formulae for modelling of molecular systems. http://www.gromos.net/. Accessed 4 July 2019
van Gunsteren WF et al (2019c) The GROMOS Software for (Bio)Molecular Simulation. Volume 3: Force Field and Topology Data Set. http://www.gromos.net/. Accessed 4 July 2019
van Gunsteren WF et al (2019d) The GROMOS Software for (Bio)Molecular Simulation. Volume 6: Technical Details. http://www.gromos.net/. Accessed 4 July 2019
Wodak SJ, Janin J (1980) Analytical approximation to the accessible surface area of proteins. Proc Natl Acad Sci USA 77:1736–1740
Article CAS PubMed PubMed Central Google Scholar
Wüthrich K, Billeter M, Braun W (1983) Pseudo-structures for the 20 common amino acids for use in studies of protein conformations by measurements of intramolecular proton-proton distance constraints with nuclear magnetic resonance. J Mol Biol 169:949–961
Article PubMed Google Scholar

Download references

Acknowledgements

LJS acknowledges the use of the University of Oxford Advanced Research Computing (ARC) facility in carrying out some of this work, http://dx.doi.org/10.5281/zenodo.22558. NH acknowledges funding by the Deutsche Forschungsgemeinschaft (DFG, German Research Foundation) under Germany’s Excellence Strategy – EXC 2075 – 390740016.

Funding

Open access funding provided by Swiss Federal Institute of Technology Zurich. The funding has been received from Deutsche Forschungsgemeinschaft with Grant no. EXC 2075 – 390740016.

Author information

Authors and Affiliations

Laboratory of Physical Chemistry, Swiss Federal Institute of Technology, ETH, 8093, Zurich, Switzerland
Maria Pechlaner & Wilfred F. van Gunsteren
Institute of Thermodynamics and Thermal Process Engineering, University of Stuttgart, 70569, Stuttgart, Germany
Niels Hansen
Inorganic Chemistry Laboratory, Department of Chemistry, University of Oxford, South Parks Road, Oxford, OX1 3QR, UK
Lorna J. Smith

Authors

Maria Pechlaner
View author publications
You can also search for this author in PubMed Google Scholar
Wilfred F. van Gunsteren
View author publications
You can also search for this author in PubMed Google Scholar
Niels Hansen
View author publications
You can also search for this author in PubMed Google Scholar
Lorna J. Smith
View author publications
You can also search for this author in PubMed Google Scholar

Corresponding author

Correspondence to Maria Pechlaner.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Below is the link to the electronic supplementary material.

Supplementary file1 (DOCX 335 kb)

Supplementary file2 (DOCX 18 kb)

Supplementary file3 (DOCX 14 kb)

Supplementary file4 (DOCX 17 kb)

Supplementary file5 (DOCX 16 kb)

Supplementary file6 (DOCX 18 kb)

Supplementary file7 (DOCX 20 kb)

Supplementary file8 (DOCX 14 kb)

Supplementary file9 (DOCX 16 kb)

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

Reprints and permissions

About this article

Cite this article

Pechlaner, M., van Gunsteren, W.F., Hansen, N. et al. Molecular dynamics simulation or structure refinement of proteins: are solvent molecules required? A case study using hen lysozyme. Eur Biophys J 51, 265–282 (2022). https://doi.org/10.1007/s00249-022-01593-1

Download citation

Received: 01 December 2021
Revised: 25 February 2022
Accepted: 26 February 2022
Published: 18 March 2022
Issue Date: April 2022
DOI: https://doi.org/10.1007/s00249-022-01593-1

Keywords

Use our pre-submission checklist

Avoid common mistakes on your manuscript.

Molecular dynamics simulation or structure refinement of proteins: are solvent molecules required? A case study using hen lysozyme

Abstract

Similar content being viewed by others

Introduction

Modelling solvent effects upon protein structure

Elimination of protein or solvent degrees of freedom

Use of an implicit-solvation model

Use of measured data to derive or refine protein structure

Information density of various experimental data for proteins

Experimental data are generally averages over protein conformations

Methods

Molecular model for the MD simulation in explicit water using periodic boundary conditions

Molecular model for the SD simulations in vacuo without and with implicit-solvation term

Simulation set-up for the MD simulation in explicit water using periodic boundary conditions

Simulation set-up for the SD simulations in vacuo without or with implicit-solvation forces

MD and SD simulations performed

Analysis of atomic trajectories

Results and discussion

Comparison of structural and energetic properties calculated from the simulations

Comparison of NOE distances, 3 J-couplings and S 2 order parameters calculated from the simulations with experimentally derived values for HEWL

Comparison of dynamical properties calculated from the simulations

Conclusions

References

Acknowledgements

Funding

Author information

Authors and Affiliations

Corresponding author

Additional information

Publisher's Note

Supplementary Information

Rights and permissions

About this article

Cite this article

Share this article

Keywords

Search

Navigation

Comparison of NOE distances, ³ J-couplings and S ² order parameters calculated from the simulations with experimentally derived values for HEWL