European Biophysics Journal, Volume 34, Issue 4, pp 273–284

Validation of the 53A6 GROMOS force field

Authors

  • Chris Oostenbrink
    • Laboratory of Physical Chemistry, Swiss Federal Institute of Technology
  • Thereza A. Soares
    • Laboratory of Physical Chemistry, Swiss Federal Institute of Technology
  • Nico F. A. van der Vegt
    • Computational Chemistry, Max-Planck-Institute for Polymer Research
    • Laboratory of Physical Chemistry, Swiss Federal Institute of Technology

DOI: 10.1007/s00249-004-0448-6

Cite this article as:
Oostenbrink, C., Soares, T.A., van der Vegt, N.F.A. et al. Eur Biophys J (2005) 34: 273. doi:10.1007/s00249-004-0448-6

Abstract

The quality of biomolecular dynamics simulations relies critically on the force field that is used to describe the interactions between particles in the system. Force fields, which are generally parameterized using experimental data on small molecules, can only prove themselves in realistic simulations of relevant biomolecular systems. In this work, we begin the validation of the new 53A6 GROMOS parameter set by examining three test cases. Simulations of the well-studied 129 residue protein hen egg-white lysozyme, of the DNA dodecamer d(CGCGAATTCGCG)2, and a proteinogenic β3-dodecapeptide were performed and analysed. It was found that the new parameter set performs as well as the previous parameter sets in terms of protein (45A3) and DNA (45A4) stability and that it is better at describing the folding–unfolding balance of the peptide. The latter is a property that is directly associated with the free enthalpy of hydration, to which the 53A6 parameter set was parameterized.

Keywords

GROMOS, Force field, Molecular dynamics simulation, DNA, Lysozyme, β-Peptide

Introduction

Molecular dynamics simulations have become an established method in biomolecular chemistry to understand and predict important processes at the molecular level. Numerically integrating the equations of motion for all relevant particles in a biomolecular system requires the availability of an accurate interaction function that describes the interaction between these particles, for instance through a classical force field. Several force fields exist for biomolecular simulation that are widely used, such as AMBER (Cornell et al. 1995; Pearlman et al. 1995; Weiner and Kollman 1981), CHARMM (Brooks et al. 1983; MacKerell et al. 1998; MacKerell et al. 1995), OPLS-AA (Jorgensen et al. 1996; Jorgensen and Tirado-Rives 1988) and GROMOS (Daura et al. 1998; Oostenbrink et al. 2004; Schuler et al. 2001; van Gunsteren and Berendsen 1987; van Gunsteren et al. 1996). Recently, it has been shown that all of these force fields severely underestimated the free energy of hydration for a series of small molecules that represent the amino acid side chains (MacCallum and Tieleman 2003; Shirts et al. 2003; Villa and Mark 2002). This was a very alarming finding, because for virtually all biomolecular processes of interest, the free energy of hydration and of transfer between polar and apolar media plays a vital role. Protein stability and folding, ligand recognition and binding, membrane formation and transport of small molecules across membranes are all being investigated by molecular dynamics simulations and all rely on a correct description of the equilibrium between ‘solvation’ of certain molecular moieties in different media.

Mainly for this reason, we have recently reparameterized many of the nonbonded interactions in the GROMOS force field specifically regarding the free enthalpy of hydration and of solvation in cyclohexane (Oostenbrink et al. 2004; Schuler et al. 2001). Unfortunately, it did not seem possible to obtain a parameter set for all polar functional groups that was at the same time able to accurately reproduce the density and heat of vaporization for pure liquids of small polar compounds. This was attributed to differences in average polarization of the molecules in different media. For this reason, we have proposed two new parameter sets: one that describes the interactions in pure liquids well and another in which the free energies of hydration and solvation in cyclohexane for small functional groups (polar and apolar) are accurately reproduced. Together with several earlier parameter sets for lipids (Chandrasekhar et al. 2003), carbohydrates (Lins and Hünenberger 2005), nucleotides (Soares et al. 2005) and various (co)solvents (Fioroni et al. 2000; Geerke et al. 2004; Smith et al. 2004; Walser et al. 2000), this has led to the definition of the 53A5 (pure liquids) and 53A6 (hydration and solvation) parameter sets (Oostenbrink et al. 2004). The GROMOS force field is still a force field with a simple functional form (Scott et al. 1999; van Gunsteren et al. 1998) and a limited set of different atom types, bond types, bond-angle types, improper dihedral types and torsional-angle types. It makes use of the united atom approach for aliphatic hydrogens. These atoms are not treated explicitly, but are considered as a single interaction site together with the carbon atom to which they are bound.

However, no matter how much effort goes into parameterizing a force field, it can only prove its real worth in realistic biomolecular applications (van Gunsteren and Mark 1998). Previous GROMOS parameter sets [37C4 (Smith et al. 1995; van Gunsteren and Berendsen 1987), 43A1 (Daura et al. 1998; van Gunsteren et al. 1996) and 45A3 (Schuler et al. 2001; Schuler and van Gunsteren 2000)] have proven successful in describing biochemical interactions in protein stability (Antes et al. 2002; Bakowies and van Gunsteren 2002; Fan and Mark 2003, 2004; Smith et al. 1995, 1996, 1999; Stocker et al. 2000; Stocker and van Gunsteren 2000), peptide folding (Daura et al. 1999a; Daura et al. 2002; Daura et al. 1997; van Gunsteren et al. 2001) and protein-ligand binding (Hansson et al. 1998; Marelius et al. 1998; Oostenbrink et al. 2000; Oostenbrink and van Gunsteren 2004; Talhout et al. 2003). They have also been used successfully in simulations involving triglycerides (Chandrasekhar and van Gunsteren 2002), membranes (Chandrasekhar et al. 2003; Glättli et al., submitted) and DNA double helices (Bonvin et al. 1998; Czechtizky et al. 2001). For simulations of DNA double helices, a new parameter set has recently been introduced, with which it has proven to be possible to obtain stable simulations of DNA strands using a simple force field and cutoff scheme (Soares et al. 2005). In the current work, we want to begin validation of the 53A6 parameter set by simulating three relevant biomolecular systems or processes and comparing them to simulations that were performed with earlier parameter sets and to experimental data. In particular, because earlier parameter sets were parameterized on specific pair interactions (Hermans et al. 1984), and simulations were carried out on pure liquids of small molecules (Daura et al. 1998), it is interesting to see how a parameter set derived from free energies of hydration and solvation will perform.

As a first test case, we used hen egg-white lysozyme (HEWL), a well-studied, 129-residue protein for which ample experimental structural data from NMR experiments (Buck et al. 1995; Schwalbe et al. 2001; Smith et al. 1991, 1993) and X-ray crystallography (Artymiuk et al. 1982; Carter et al. 1997; Vaney et al. 1996) are available. It was also thoroughly studied by simulation earlier, allowing us to compare it with previous parameter sets (Smith et al. 1995; Soares et al. 2004; Stocker et al. 2000; Stocker and van Gunsteren 2000). It is of interest to investigate whether the 53A6 parameter set with new nonbonded parameters for all the neutral polar functional groups is still able to maintain the stability of the protein to the same extent.

For our second test case, we selected the DNA dodecamer d(CGCGAATTCGCG)2, also known as the Dickerson–Drew dodecamer (Dickerson and Drew 1981; Drew and Dickerson 1981; Drew et al. 1981). This is another system for which both NMR data (Tjandra et al. 2000) and X-ray crystallographic structures (Dickerson and Drew 1981; Drew and Dickerson 1981; Drew et al. 1981; Shui et al. 1998) are available and which were studied previously with molecular dynamics techniques (Arthanari et al. 2003; Cheatham III and Kollman 2000; Cheatham III and Young 2000; Soares et al. 2005; Young et al. 1997). Only recently a new GROMOS parameter set, 45A4, has been introduced, with which it proved possible to obtain a stable simulation of this B-DNA double helix using a simple force field and cutoff scheme (Soares et al. 2005). Even though the charges on the nucleotide sugars and bases have not changed in the 53A6 parameter set, the van der Waals interaction for several atom types (C, O, N) have. It is therefore important to ensure that a simulation using the new parameter set does not deviate too much from the previous results.

As a third and final test case, we present a study on a β3-dodecapeptide with proteinogenic side chains (a in Fig. 3). From NMR and CD experiments, this peptide is seen to form a 3₁₄-helix in methanol, while in water, no regular secondary structure elements could be observed (Etezady-Esfarjani et al. 2002). Simulations using the 45A3 parameter set, started from the experimental model structure, showed rather the opposite: the structure unfolded immediately in methanol, whereas it seemed more stable in water. Since earlier peptide folding simulations using the same parameter set had shown good agreement with the experiment, this failure came as a surprise. It is most probably due to the relatively large number of polar side chains in the dodecapeptide compared to the ones simulated earlier (Daura et al. 1997, 1999a, 2002; Glättli et al. 2002a; Peter et al. 2000; van Gunsteren et al. 2001). In contrast to protein and DNA simulations, simulations of peptides cover timescales in which the folding–unfolding equilibrium is reached (Daura et al. 1997, 1999a, 2002; van Gunsteren et al. 2001), a process that will depend critically on the balance between hydrophilic and hydrophobic interactions. Since the parameter set 53A6 was parameterized specifically on this balance, one may expect major changes in a simulation of this peptide. Moreover, for about 60% of all atoms in this peptide, the charge and/or the van der Waals parameters have changed compared to the 45A3/4 sets. The methanol model has slightly changed as well (Walser et al. 2000). Thus, in contrast to the HEWL and DNA simulations, one may expect the simulation characteristics of the β-peptide to change quite dramatically.

Methods

Simulations

Molecular dynamics simulations were performed using the GROMOS simulation package (Scott et al. 1999; van Gunsteren et al. 1996). Simulations on hen egg-white lysozyme (5 ns in water), the DNA dodecamer d(CGCGAATTCGCG)2 (4 ns in water) and the β3-dodecapeptide (100 ns in methanol; 25 ns in water) were carried out using the 53A6 parameter set.

The initial structure of the lysozyme was taken from the crystal structure (Artymiuk et al. 1982), Protein Data Bank (PDB) entry code 1AKI (Carter et al. 1997). The system was solvated in a periodic truncated octahedral box, containing 11,193 SPC (Berendsen et al. 1981) water molecules. The protonation states of protonatable groups were selected to correspond to a pH of 7. Eight chloride ions were added to reach overall neutrality of the system, leading to a total of 34,910 atoms.

The simulations of the DNA dodecamer were also started from the crystal structure, PDB entry code 355D (Shui et al. 1998). The double helical structure was solvated in a rectangular periodic box containing 13,415 SPC water molecules, 46 sodium ions and 24 chloride ions, corresponding to an overall neutral system with an ionic concentration of 0.1 M.
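
The ion counts quoted for the DNA system can be checked with a few lines of arithmetic. The sketch below assumes two things that are not stated in the text: that the duplex carries 22 backbone phosphate groups (11 per strand, no terminal phosphates), each with charge −1e, and that liquid water is roughly 55.5 mol/L. The function names are purely illustrative.

```python
# Assumptions (not from the text): 22 phosphates in the duplex, each -1e,
# and a molarity of about 55.5 mol/L for liquid water.
N_PHOSPHATES = 22
WATER_MOLARITY = 55.5  # mol/L, approximate


def net_charge(n_phosphate, n_sodium, n_chloride):
    """Net charge (in units of e) of DNA plus monovalent ions."""
    return -n_phosphate + n_sodium - n_chloride


def salt_molarity(n_ion_pairs, n_water, water_molarity=WATER_MOLARITY):
    """Rough salt concentration from the ratio of ion pairs to water molecules."""
    return n_ion_pairs * water_molarity / n_water


charge = net_charge(N_PHOSPHATES, n_sodium=46, n_chloride=24)
conc = salt_molarity(n_ion_pairs=24, n_water=13415)
print(f"net charge: {charge} e, NaCl concentration: {conc:.3f} M")
```

Under these assumptions the 22 excess sodium ions neutralize the DNA, and the 24 remaining Na/Cl pairs in 13,415 water molecules give a concentration close to the stated 0.1 M.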

The β-dodecapeptide that was simulated is depicted in Fig. 3a. Simulations were started from the experimental model structure (Etezady-Esfarjani et al. 2002) in methanol solution (Fig. 3b). It was solvated in truncated octahedral boxes containing 5,506 SPC water molecules or 2,416 methanol molecules (Walser et al. 2000). No counter ions were added, yielding a net charge of +4e.

All simulations were carried out at a constant temperature of 298 K and a pressure of 1 atm using the weak coupling algorithm (Berendsen et al. 1984). Relaxation times were set to \(\tau_T = 0.1\) ps and \(\tau_P = 0.5\) ps and an estimated isothermal compressibility of \(4.575 \times 10^{-4}\ (\mathrm{kJ\,mol^{-1}\,nm^{-3}})^{-1}\) was used (van Gunsteren et al. 1996). All bond lengths were kept rigid at ideal bond lengths using the SHAKE algorithm (Ryckaert et al. 1977), allowing for a time step of 2 fs. Nonbonded interactions were calculated using a triple range cutoff scheme. Interactions within a short-range cutoff of 0.8 nm were calculated every time step from a pair list that was generated every five steps. At these time points, interactions between 0.8 and 1.4 nm were also calculated and kept constant between updates. A reaction-field contribution (Tironi et al. 1995) was added to the electrostatic interactions and forces to account for a homogeneous medium outside the long-range cutoff, using a relative permittivity of 61 in the lysozyme and peptide simulations in water (Heinz et al. 2001), 66 in the DNA simulation (Glättli et al. 2002b) and 17.7 in the peptide simulation in methanol (Walser et al. 2000).
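
To make the role of the coupling parameters concrete, the following is a minimal sketch of the per-step scaling factors in the commonly quoted form of the weak coupling scheme of Berendsen et al. (1984). It is not taken from the GROMOS source code; the function names and the unit conversion for 1 atm are assumptions made for illustration, while the default values are those given above.

```python
# Per-step scaling factors of the weak coupling scheme, in their textbook form.
# Times in ps, temperature in K, pressure in kJ mol^-1 nm^-3.

AVOGADRO = 6.02214076e23
# 1 atm = 101325 J m^-3, converted to kJ mol^-1 nm^-3 (about 0.0610)
ONE_ATM = 101325.0 * AVOGADRO * 1e-27 * 1e-3


def temperature_scaling(T, T0=298.0, dt=0.002, tau_T=0.1):
    """Velocity scaling factor that relaxes the temperature T towards T0."""
    return (1.0 + dt / tau_T * (T0 / T - 1.0)) ** 0.5


def pressure_scaling(P, P0=ONE_ATM, dt=0.002, tau_P=0.5, kappa_T=4.575e-4):
    """Coordinate and box-edge scaling factor that relaxes the pressure P towards P0."""
    return (1.0 - kappa_T * dt / tau_P * (P0 - P)) ** (1.0 / 3.0)


print(temperature_scaling(290.0))       # slightly above 1: velocities scaled up
print(pressure_scaling(2.0 * ONE_ATM))  # slightly above 1: box expands
```

With a 2-fs time step and these relaxation times, each step changes velocities and box dimensions only marginally, which is why the coupling is called weak.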

Analysis

All simulations described above were analysed together with very similar simulations that were performed earlier, using previous GROMOS parameter sets. The lysozyme simulations can be compared to a 3.5-ns simulation (Soares et al. 2004) based on the 45A3 parameter set (Schuler et al. 2001). For the DNA dodecamer, we also analysed a 4-ns trajectory from a simulation based on the 45A4 parameter set (Soares et al. 2005). Finally, the peptide simulations were compared to previous simulations based on the 45A3 parameter set (Schuler et al. 2001) and an older version of the methanol model (van Gunsteren et al. 1996).

Secondary structure assignments for the lysozyme simulations were carried out using the Kabsch and Sander rules (Kabsch and Sander 1983). Structural analyses of DNA base pair geometries were performed according to the rules as implemented in the 3DNA program (Lu and Olson 2003; Olson et al. 2001). Sugar-ring puckering was analysed through the pseudorotation phase and puckering amplitude as proposed by Altona and Sundaralingam (Altona et al. 1968; Altona and Sundaralingam 1972). Hydrogen bonds were analysed according to a geometrical criterion. A hydrogen bond is defined by a minimum donor-hydrogen-acceptor angle of 135° and a maximum hydrogen-acceptor distance of 0.25 nm (van Gunsteren et al. 1996).
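
The geometric hydrogen-bond criterion can be written down directly. The sketch below is an illustration rather than the GROMOS analysis code: the function name and argument layout are hypothetical, coordinates are taken to be in nm, and periodic boundary conditions are ignored for simplicity.

```python
import math


def is_hydrogen_bond(donor, hydrogen, acceptor,
                     min_angle_deg=135.0, max_dist_nm=0.25):
    """Return True if donor-H...acceptor satisfies the geometric criterion.

    Each argument is an (x, y, z) tuple in nm. Periodic images are ignored.
    """
    h_to_d = [d - h for d, h in zip(donor, hydrogen)]
    h_to_a = [a - h for a, h in zip(acceptor, hydrogen)]
    len_hd = math.sqrt(sum(c * c for c in h_to_d))
    len_ha = math.sqrt(sum(c * c for c in h_to_a))
    # Distance test on the hydrogen-acceptor separation
    if len_hd == 0.0 or len_ha == 0.0 or len_ha > max_dist_nm:
        return False
    # Angle at the hydrogen between the donor and the acceptor
    cos_angle = sum(p * q for p, q in zip(h_to_d, h_to_a)) / (len_hd * len_ha)
    cos_angle = max(-1.0, min(1.0, cos_angle))
    return math.degrees(math.acos(cos_angle)) >= min_angle_deg


# A linear arrangement with a 0.2 nm H...acceptor distance counts as bonded
print(is_hydrogen_bond((0.0, 0.0, 0.0), (0.1, 0.0, 0.0), (0.3, 0.0, 0.0)))
```

Applying such a test to every frame and dividing the number of hits by the number of frames gives occurrence percentages of the kind reported in the results.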

Comparisons to NMR experimental data were made through an analysis of proton–proton distances as compared to NOE upper bounds. For lysozyme, a set of 1,630 NOE upper bounds was available (Schwalbe et al. 2001), for the DNA dodecamer, there were 160 upper bounds (Tjandra et al. 2000) and for the β-dodecapeptide we used 150 upper bounds, which were obtained in methanol (Etezady-Esfarjani et al. 2002). Proton–proton distances were averaged using \(r^{-3}\) averaging, \(\bar r = \left( \left\langle r^{-3} \right\rangle \right)^{-1/3}\), corresponding to a slowly tumbling molecule (Tropp 1980). Positions of (aliphatic) protons that were not treated explicitly in the simulations were calculated from standard configurations. In cases where NOE upper bounds were assigned to more than one proton, a pseudoatom approach (Wüthrich et al. 1983) was used with the following corrections to the upper bound. For a non-stereospecifically assigned CH₂ group, 0.09 nm was added to the upper bound. For a methyl group, the correction was 0.10 nm. For the six protons in an iso-propyl group, a correction of 0.22 nm was added and for unassigned Hδ and Hε atoms in a flipping benzene ring, 0.21 nm. These corrections are very close to the pseudoatom corrections suggested by Wüthrich et al. (1983) and the ones used in the lysozyme structure determination (Schwalbe et al. 2001). They correspond to the GROMOS standard bond lengths and angles (Oostenbrink et al. 2004; van Gunsteren et al. 1996). No additional multiplicity corrections (Constantine et al. 1992; Fletcher et al. 1996) were applied to the NOE upper bounds.
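
The averaging and the pseudoatom corrections described above can be sketched as follows. The dictionary keys and function names are hypothetical labels chosen for this illustration; the correction values are the ones quoted in the text, and all distances are in nm.

```python
# Pseudoatom corrections (nm) as quoted in the text; the keys are illustrative.
PSEUDOATOM_CORRECTION_NM = {
    "CH2": 0.09,        # non-stereospecifically assigned methylene
    "CH3": 0.10,        # methyl group
    "isopropyl": 0.22,  # six protons of an iso-propyl group
    "ring": 0.21,       # unassigned H-delta / H-epsilon in a flipping ring
}


def r3_average(distances):
    """<r^-3>^(-1/3) average of a series of proton-proton distances."""
    mean_inv_cube = sum(r ** -3 for r in distances) / len(distances)
    return mean_inv_cube ** (-1.0 / 3.0)


def noe_violation(distances, upper_bound, group=None):
    """Violation (nm) of an NOE upper bound; zero if the bound is satisfied."""
    bound = upper_bound + PSEUDOATOM_CORRECTION_NM.get(group, 0.0)
    return max(0.0, r3_average(distances) - bound)


# Short distances dominate: the r^-3 average of 0.2 and 0.6 nm is about 0.25 nm,
# well below the arithmetic mean of 0.4 nm.
print(r3_average([0.2, 0.6]))
print(noe_violation([0.5, 0.5], upper_bound=0.3, group="CH3"))
```

Because the average is weighted towards short distances, a proton pair that is close for only part of a trajectory can still satisfy its upper bound, which is relevant when ensemble averages are compared with single structures.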

The simulations of the β-dodecapeptide were subjected to a conformational clustering analysis as described by Daura et al. (1999b). Snapshots of the simulations were taken at 0.01-ns intervals, and atom-positional root-mean-square differences (rmsd) between all pairs of structures were calculated using the backbone atoms (C, Cα, Cβ, N) of residues 2 to 11. Structures with rmsd values smaller than 0.1 nm were considered to be structural neighbours. This tight criterion was chosen to distinguish small conformational differences. For every trajectory, the structure with the most neighbours was considered to be the central member of the (first, most populated) cluster of similar structures forming a conformation. After removing all structures belonging to this first cluster, the procedure was repeated to find the second, third etc. most populated clusters. Hydrogen bond analyses on all structures belonging to the ten most populated clusters have been carried out, using the criterion that was described above.
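
The clustering procedure lends itself to a compact sketch. The version below assumes that a symmetric matrix of pairwise backbone rmsd values (after fitting, in nm) is already available; the function name is hypothetical, ties are broken by lowest structure index (an arbitrary choice for this illustration), and no attempt is made at efficiency.

```python
def cluster_structures(rmsd, cutoff=0.1):
    """Greedy neighbour-count clustering in the spirit of Daura et al. (1999b).

    rmsd is a symmetric list-of-lists of pairwise rmsd values in nm.
    Returns a list of (central_member, members) tuples, most populated first.
    """
    remaining = set(range(len(rmsd)))
    clusters = []
    while remaining:
        # Count neighbours only among structures not yet assigned to a cluster
        neighbours = {
            i: [j for j in remaining if rmsd[i][j] < cutoff] for i in remaining
        }
        # Structure with the most neighbours becomes the central member
        center = max(sorted(remaining), key=lambda i: len(neighbours[i]))
        members = sorted(neighbours[center])
        clusters.append((center, members))
        remaining -= set(members)
    return clusters


# Toy example: structures 0-2 are mutually similar, structure 3 is an outlier
toy = [
    [0.00, 0.05, 0.08, 0.50],
    [0.05, 0.00, 0.06, 0.50],
    [0.08, 0.06, 0.00, 0.50],
    [0.50, 0.50, 0.50, 0.00],
]
print(cluster_structures(toy))
```

Every structure is its own neighbour (rmsd of zero), so each pass removes at least one structure and the loop always terminates.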

Results and discussion

Lysozyme

The overall structure of lysozyme is well maintained during a 5-ns simulation with the 53A6 parameter set. The atom-positional rmsd from the starting coordinates for the backbone atoms (N, Cα, C, O) gently increases during the first 2 ns of the simulation and levels off to an average value of 0.25 nm in the last 2 ns. In the 3.5-ns simulation using the 45A3 parameter set, it swiftly increased to 0.37 nm in the first 1.5 ns and remained stable at 0.39 nm over the course of the simulation (Soares et al. 2004).

A pictorial view of the stability can be found in Fig. 1, where the final structure after 5 ns (53A6) is fitted onto the original X-ray structure (Carter et al. 1997). Obviously, the fold is maintained and the most prominent secondary structure elements are preserved during the simulation. The conservation of the secondary structure elements is described in more detail in Table 1. Using the Kabsch and Sander (1983) assignment criteria, we averaged the secondary structure assignment of the residues over the secondary structure elements. Most elements that were seen in the X-ray structure (Carter et al. 1997) or in the bundle of NMR structures (Schwalbe et al. 2001) are also observed during the simulation. Of the four larger α-helices (indicated by A, B, C and D in Fig. 1 and Table 1), three are observed for about 90–100% of the time, while the fourth, helix D, seems to convert into a π-helix in our simulations. In particular, residues 111Trp to 115Cys appear to be in a π-helical arrangement for about 70% of the time in both simulations. Interestingly, the D helix was observed only in 80% of the 50 NMR structures. In the other 20% of the NMR structures, a series of turns was assigned to these residues (Schwalbe et al. 2001). Similar conversions are seen for the helices 80Cys–84Leu and 120Val–124Ile. In the crystal structure, these are assigned 3₁₀-helices, but in the NMR structures and our simulations, the α-helix seems to be preferred. The other two 3₁₀-helices in the X-ray structure also seem to be represented poorly in the course of the simulations (20Tyr–22Gly and 104Gly–107Ala). On the other hand, a β-bridge is present for 83% of the time between residues 19Asn and 23Tyr in the 53A6 simulation. Also, the hydrogen bond 23Tyr to 20Tyr is observed for 71% in our hydrogen bond analysis. Similarly, for the helix 104Gly–107Ala, several turns can be observed and hydrogen bonds for 106Asn to 104Gly and 108Trp to 105Met are observed for about 20% of the time each. The anti-parallel β-sheet involving residues in the range of 43Thr to 59Asn is maintained for well over 90% of the time, as are the β-bridges 2Val–39Asn and 65Asn–79Pro.

Fig. 1

Overlay of the hen egg-white lysozyme crystal structure and the structure obtained after 5 ns of simulation. Crystal structure in dark blue (backbone), green (helices) and orange (sheets). Simulation structure in light blue (backbone), light green (helices) and yellow (sheets). Large α-helices A, B, C, and D (see Table 1) are indicated, as well as the position of the C-terminal 129Leu

Table 1

Secondary structure analysis of lysozyme

| Residues | Number of residues | Secondary structure | Average occurrence, 45A3 (%) | Average occurrence, 53A6 (%) |
|---|---|---|---|---|
| 2Val–39Asn | 2 | β-Bridgeᵃ,ᵇ | 96 | 99 |
| 5Arg–14Arg (A) | 10 | α-Helixᵃ,ᵇ | 99 | 100 |
| 20Tyr–22Gly | 3 | 3₁₀-Helixᵃ | 5 | 12 |
| 19Asn–23Tyr | 2 | β-Bridge | 10 | 84 |
| 25Leu–36Ser (B) | 11 | α-Helixᵃ,ᵇ | 78 | 94 |
| | | π-Helix | 2 | 5 |
| | | 3₁₀-Helix | 4 | 0 |
| 43Thr–45Arg | 3 | β-Sheetᵃ,ᵇ | 97 | 91 |
| 51Thr–53Tyr | 3 | β-Sheetᵃ,ᵇ | 99 | 98 |
| 58Ile–59Asn | 2 | β-Sheetᵃ | 98 | 99 |
| 60Ser–63Trp | 4 | α-Helix | 25 | 4 |
| 65Asn–79Pro | 2 | β-Bridgeᵃ | 71 | 92 |
| 80Cys–84Leu | 5 | 3₁₀-Helixᵃ,ᵇ | 26 | 25 |
| | | α-Helixᵇ | 55 | 52 |
| 89Thr–101Asp (C) | 13 | α-Helixᵃ,ᵇ | 89 | 89 |
| | | π-Helix | 1 | 4 |
| 104Gly–107Ala | 4 | 3₁₀-Helixᵃ | 9 | 3 |
| | | α-Helix | 0 | 0 |
| 109Val–114Arg (D) | 6 | α-Helixᵃ,ᵇ | 37 | 11 |
| | | π-Helix | 50 | 61 |
| 120Val–124Ile | 5 | 3₁₀-Helixᵃ | 23 | 20 |
| | | α-Helixᵇ | 35 | 41 |

For every secondary structure element the secondary structure assignment according to Kabsch and Sander (1983) was averaged over time (3.5 ns for 45A3; 5.0 ns for 53A6) and number of residues involved (second column)

ᵃ Observed in crystal structure (Carter et al. 1997)

ᵇ Observed in NMR structures (Schwalbe et al. 2001)
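
The averaging described in the caption of Table 1 (over simulation time and over the residues of an element) can be illustrated with a small sketch. The per-frame input format, a mapping from residue number to a one-letter assignment code, is an assumption made for this example, as are the function name and the letter codes; no particular secondary structure program's output is implied.

```python
def average_occurrence(frames, residues, code):
    """Percentage of (frame, residue) pairs assigned to the given code.

    frames   : list of dicts mapping residue number -> assignment code
    residues : residue numbers that make up the secondary structure element
    code     : assignment of interest, e.g. "H" for an alpha-helix (illustrative)
    """
    hits = sum(1 for frame in frames for res in residues if frame.get(res) == code)
    return 100.0 * hits / (len(frames) * len(residues))


# Two frames, two residues: three of the four assignments are helical
frames = [{5: "H", 6: "H"}, {5: "H", 6: "T"}]
print(average_occurrence(frames, residues=[5, 6], code="H"))  # 75.0
```

An element that is fully formed for half of the simulation and one that is half formed throughout therefore receive the same percentage, which is worth keeping in mind when reading the table.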

A summary of the NOE analysis that was performed on both simulations can be found in Table 2. A total of 1,630 NOE upper bounds were taken into account in the analysis (Schwalbe et al. 2001). In line with the development of the rmsd, one sees an increase in the total number of violations during the first 1.5 ns of the simulations and a more converged number of violations in the latter part of the simulations. Both simulations satisfy about 95% of the experimental upper bounds within 0.1 nm. Of the 13 large violations (>0.3 nm) that were observed in the 1.5 to 3.5-ns part of the 53A6 simulation, six involved the C-terminal 129Leu. As indicated in Fig. 1, the C-terminus turns around in the course of the 53A6 simulation. Removing the 22 NOE upper bounds involving 129Leu from the experimental set significantly reduces the number of violations and the average violation for the 53A6 simulation, but not for the 45A3 simulation. The termini of a protein are known to be much more flexible and a movement as depicted in Fig. 1 is not likely to affect the overall stability of the protein. However, it can be expected that a 5-ns simulation is not sufficient to sample all C-terminal motions, such that the NOE upper bounds for this region would be reproduced.

Table 2

Number of NOE upper-distance-bound violations and average violations in the lysozyme simulations for different averaging periods

| Averaging time period (ns) | Force-field parameter set | Violations >0.1 nm | Violations >0.2 nm | Violations >0.3 nm | Average violation (nm) |
|---|---|---|---|---|---|
| 0.5–1.5 | 45A3 | 61 (4%) | 22 (1%) | 6 (0.4%) | 0.011 |
| | 53A6 | 68 (4%) | 27 (1%) | 7 (0.4%) | 0.012 |
| 1.5–3.5 | 45A3 | 73 (4%) | 34 (2%) | 14 (0.9%) | 0.013 |
| | 45A3 no 129 | 72 (4%) | 34 (2%) | 14 (0.9%) | 0.013 |
| | 53A6 | 86 (5%) | 37 (2%) | 13 (0.8%) | 0.017 |
| | 53A6 no 129 | 78 (5%) | 31 (2%) | 7 (0.4%) | 0.014 |
| 3.5–5.0 | 53A6 | 86 (5%) | 41 (3%) | 14 (0.9%) | 0.018 |

The percentage of the total number of NOE upper bounds is given in brackets. "no 129" indicates the analysis in which all 22 upper bounds involving residue 129Leu have been left out

DNA dodecamer

Both of the tested parameter sets maintain an overall double helical structure of the DNA dodecamer. Rmsd values with respect to the initial crystal structure for all heavy atoms fluctuate around 0.4–0.5 nm in the last 2 ns of the simulation. As a general trend, it was observed that the backbone atoms show larger rmsd values than the atoms in the bases. Fluctuations in the atomic positions are strongest in the first and last base pairs. This can also be seen from an investigation of the Watson–Crick hydrogen bonds as displayed in Fig. 2. Hydrogen bonds in the middle base pairs are well maintained during the course of the simulation, while the first and the last base pairs are less strongly bound. In the 53A6 simulation, the last base pair opens up completely, slightly earlier than in the 45A4 simulation, resulting in a lower number of hydrogen bonds. It is known that the first and last base pairs show considerable mobility and we note that none of the NMR model structures satisfies all experimental data involving these bases (Tjandra et al. 2000).

Fig. 2

Occurrence of the Watson-Crick hydrogen bonds in 4-ns DNA simulations using the 45A4 (black) and 53A6 (red) parameter sets. For every base pair, three or two hydrogen bonds are given

Structural parameters for the bases and sugar puckering are presented in Table 3. All properties are averaged over the 12 base pairs. It should be noted explicitly that the data presented for the parameter sets 45A4 and 53A6 are averages over 10,000 structures from the 4-ns simulations, whereas the NMR and X-ray data are averaged over five structures (Tjandra et al. 2000) and a single structure (Shui et al. 1998), respectively. This becomes especially apparent from the distribution of the sugar pucker, where in the NMR structures only C1′-exo and C2′-endo conformations were seen. Tjandra et al. (2000) noted that this does not mean that the sugar puckering is in reality not more dynamic. The pucker distribution from the simulations as given in Table 3 is observed for all individual sugar rings, with the exception of the sugar ring of Ade (6) which shows a C3′-endo puckering for 13% (45A4) and 18% (53A6) in the simulations.

Table 3

Comparison of structural parameters obtained from experimental and simulation structures of the DNA dodecamer

| Parameter | 45A4 | 53A6 | NMR | X-ray |
|---|---|---|---|---|
| Local base-pair parameters | | | | |
| Shear (nm) | −0.002 | 0.007 | 0.0 | −0.018 |
| Stretch (nm) | −0.017 | −0.017 | −0.039 | −0.015 |
| Stagger (nm) | −0.035 | −0.033 | −0.022 | −0.002 |
| Buckle (°) | 0.0 | −0.6 | −0.1 | 2.5 |
| Propeller (°) | −7.1 | −7.1 | −12.3 | −19.0 |
| Opening (°) | −2.0 | −2.2 | 2.1 | 3.0 |
| Sugar-ring puckering | | | | |
| C3′-endo (%) | 5 | 4 | 0 | 4 |
| C4′-exo (%) | 4 | 4 | 0 | 0 |
| O4′-endo (%) | 20 | 19 | 0 | 12 |
| C1′-exo (%) | 44 | 44 | 63 | 42 |
| C2′-endo (%) | 23 | 25 | 37 | 42 |
| C3′-exo (%) | 2 | 2 | 0 | 0 |
| C2′-exo (%) | 1 | 1 | 0 | 0 |
| Pseudorotation | | | | |
| Phase (°) | 118.8 | 121.1 | 132.1 | 133.7 |
| Amplitude (°) | 42.5 | 42.4 | 29.3 | 41.3 |
| Local base-pair helical parameters | | | | |
| X-displacement (nm) | −0.349 | −0.321 | −0.128 | −0.052 |
| Y-displacement (nm) | 0.004 | −0.009 | 0.001 | −0.022 |
| Rise (nm) | 0.278 | 0.300 | 0.329 | 0.329 |
| Inclination (°) | 15.5 | 14.2 | 4.6 | 5.0 |
| Tip (°) | −0.2 | −0.1 | 0.0 | 1.3 |
| Helical twist (°) | 33.4 | 33.3 | 35.1 | 37.8 |
| Local base-pair step parameters | | | | |
| Shift (nm) | −0.002 | 0.003 | 0.0 | 0.002 |
| Slide (nm) | −0.109 | −0.092 | −0.046 | 0.004 |
| Rise (nm) | 0.342 | 0.357 | 0.338 | 0.341 |
| Tilt (°) | 0.1 | 0.1 | 0.0 | −0.9 |
| Roll (°) | 8.4 | 7.8 | 2.5 | 3.4 |
| Twist (°) | 30.4 | 30.5 | 34.6 | 36.1 |

Averages are over all bases, sugars, base pairs and inter-base-pair parameters observed in 4 ns of simulation (parameter sets 45A4 and 53A6), five NMR model structures (Tjandra et al. 2000) or the X-ray structure (Shui et al. 1998). Local base-pair parameters, helical parameters and step parameters were calculated using the 3DNA definitions (Lu and Olson 2003; Olson et al. 2001). Sugar puckering, pseudorotation phase and pucker amplitudes are according to Altona et al. (1968) and Altona and Sundaralingam (1972)
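
The pseudorotation phase and amplitude listed in Table 3 follow from the five endocyclic sugar torsion angles. The sketch below uses the commonly quoted form of the Altona and Sundaralingam relations and assumes the usual torsion numbering (ν0 about the O4′–C1′ bond through ν4 about the C4′–O4′ bond) as well as the conventional division of the pseudorotation cycle into ten 36° windows. These conventions and all names in the code are assumptions of this illustration, not details taken from the paper or from the 3DNA program.

```python
import math

# Conventional names for the ten 36-degree windows of the pseudorotation cycle
PUCKER_CLASSES = [
    "C3'-endo", "C4'-exo", "O4'-endo", "C1'-exo", "C2'-endo",
    "C3'-exo", "C4'-endo", "O4'-exo", "C1'-endo", "C2'-exo",
]
_S = math.sin(math.radians(36.0)) + math.sin(math.radians(72.0))


def pseudorotation(nu0, nu1, nu2, nu3, nu4):
    """Phase angle P (0-360 degrees) and amplitude from five ring torsions (degrees)."""
    num = (nu4 + nu1) - (nu3 + nu0)
    den = 2.0 * nu2 * _S
    phase = math.degrees(math.atan2(num, den)) % 360.0
    # Equivalent to nu2 / cos(P), but numerically stable when cos(P) is near zero
    amplitude = math.hypot(num, den) / (2.0 * _S)
    return phase, amplitude


def pucker_class(phase):
    """Name of the 36-degree window that contains the phase angle."""
    return PUCKER_CLASSES[int(phase // 36.0) % 10]


# Round trip: build ideal torsions for P = 121.1 deg and amplitude 42.4 deg
torsions = [42.4 * math.cos(math.radians(121.1 + 144.0 * (j - 2))) for j in range(5)]
phase, amplitude = pseudorotation(*torsions)
print(round(phase, 1), round(amplitude, 1), pucker_class(phase))
```

Under these conventions, average phase angles around 120° fall in the C1′-exo window, in line with C1′-exo being the most populated class in the simulations.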

The averages over some of the structural properties (X-displacement, helical rise, inclination, helical twist, slide, roll and twist) show significant deviations from the values for canonical B-DNA and indicate that the dodecamer moves towards the A-DNA form (Olson et al. 2001). A downward trend can also be observed in the rmsd with respect to a canonical A-DNA structure during the course of the simulation (data not shown). Apparently, the dodecamer covers a part of conformational space that lies between the canonical A-DNA and B-DNA forms. Whether this is an artefact of the simulation or a true representation of the dynamic structure of this dodecamer in solution remains open.

Table 4 summarizes the NOE distance analyses for the simulation trajectories, for the NMR model structures (Tjandra et al. 2000), the X-ray (B-DNA) structure (Shui et al. 1998), and for a modelled canonical A-DNA structure with the same sequence. Proton–proton distances were averaged over the simulation period from 1 to 4 ns. Different averaging times showed very similar results. As in the lysozyme test case, about 95% of the 160 experimental upper bounds were satisfied within 0.1 nm. The 53A6 simulation shows slightly more violations larger than 0.1 nm, with a maximum violation of 0.128 nm between H1′ of Ade (6) and H5′ of Thy (7) in the second chain. The largest violation in the 45A4 simulation amounts to 0.295 nm between H1′ of Thy (8) and H5′ of Cyt (9) of the first chain. In both simulations, only 1 of the 20 NOE upper bounds that involve the first and last base pairs shows a violation larger than 0.1 nm, indicating that the large flexibility that is observed in the simulations is not in contrast with the experimental NMR data.

Table 4

Number of NOE upper-bound violations and average violations in the DNA dodecamer simulations

| Force-field parameter set | Violations >0.05 nm | Violations >0.1 nm | Violations >0.2 nm | Average violation (nm) |
|---|---|---|---|---|
| 45A4 | 45 (28%) | 3 (2%) | 1 (0.6%) | 0.029 |
| 53A6 | 50 (31%) | 8 (5%) | 0 | 0.032 |
| NMRᵃ | 6 (4%) | 0 | 0 | 0.012 |
| B-DNAᵇ | 52 (33%) | 23 (14%) | 2 (1%) | 0.042 |
| A-DNAᶜ | 72 (45%) | 38 (24%) | 0 | 0.058 |

The percentage of the total number of NOE upper bounds is given in brackets. For the simulations, averaging was performed over the last 3 ns of simulation

ᵃ Average over five NMR structures (Tjandra et al. 2000)

ᵇ Single X-ray structure (Shui et al. 1998)

ᶜ Modelled canonical A-DNA structure

Table 4 also includes the NOE violations for the NMR bundle of model structures as well as for the X-ray structure (labelled B-DNA) and a modelled A-DNA structure. As could be expected, the NMR structures that were derived based on the NOE upper bounds show very small violations. NOE analyses of single structures (X-ray and A-DNA model structures), however, show many violations, indicating that the averaging effect of a dynamic simulation is required to reproduce the NOE upper bounds. Of the 38 violations larger than 0.1 nm in the A-DNA model, only 1 also shows a large deviation in the simulations. For the 45A4 simulation, this is (again) the NOE between H1′ of Thy (8) and H5′ of Cyt (9) of the first chain (at 0.295 nm), and for the 53A6 simulation it is the NOE between H1′ in Gua (10) and H6 in Cyt (11) in the second chain (at 0.111 nm). Even though the structural parameters that were described above might indicate a drift towards an A-DNA conformation, the simulations are not in contrast with the NMR data and do not show the violations of the upper bounds that one might expect from a pure A-DNA conformation.

β3-Dodecapeptide

The simulations of the β-dodecapeptide were all started from the folded helical experimental model structure (Etezady-Esfarjani et al. 2002). For a peptide of this length, the time required for the complete folding process from a random or extended conformation can still be expected to be beyond the length of these simulations. For the four simulations, a clustering analysis was performed (Daura et al. 1999b) with a similarity criterion of 0.1 nm as described in Methods. For a peptide of this length with 40 atoms that are taken into account in the fitting and rmsd calculation, 0.1 nm is a relatively small value. This means that once the peptide samples a broad ensemble of structures in conformational space, one might expect very many different clusters, whereas a low number of clusters means that the conformation of the peptide does not change much. Table 5 lists the total number of clusters that was found in every simulation. It is clear that the 45A3 simulation of the peptide in methanol samples many more conformations than the other simulations. Taking the shortest simulation time of 25 ns into account, the 53A6 simulation in water could well be sampling the second highest number of conformations. Table 5 also lists the average occurrence of 3₁₄-helical hydrogen bonds for the ten most populated clusters and for all simulation structures. The average is taken over the ten 3₁₄-helical hydrogen bonds that could theoretically be formed in this peptide. The number of hydrogen bonds that is actually observed in the cluster is given in brackets.
Table 5

Summary of conformational clustering results for the entire simulations of the β-dodecapeptide

| Force field | 45A3 | | 45A3 | | 53A6 | | 53A6 | |
|---|---|---|---|---|---|---|---|---|
| Solvent | MeOH | | H2O | | MeOH | | H2O | |
| Simulation length (ns) | 100 | | 100 | | 100 | | 25 | |
| Number of clusters | 1868 | | 410 | | 239 | | 240 | |
| | Percentage | Hydrogen bonds | Percentage | Hydrogen bonds | Percentage | Hydrogen bonds | Percentage | Hydrogen bonds |
| Cluster 1 | 7.8 | 0 | 34.0 | 52 (9) | 59.5 | 73 (10) | 20.9 | 61 (10) |
| Cluster 2 | 2.5 | 59 (8) | 6.6 | 0 | 4.5 | 57 (8) | 3.7 | 7 (10) |
| Cluster 3 | 1.4 | 35 (8) | 4.2 | 60 (10) | 3.6 | 46 (6) | 3.2 | 0 |
| Cluster 4 | 1.3 | 0 | 3.0 | 0 | 1.7 | 49 (6) | 2.7 | 10 (3) |
| Cluster 5 | 1.2 | 38 (7) | 2.5 | 0 | 1.5 | 49 (6) | 1.7 | 9 (2) |
| Cluster 6 | 1.2 | 44 (7) | 2.3 | 0 | 1.1 | 73 (10) | 1.5 | 28 (5) |
| Cluster 7 | 1.1 | 44 (5) | 2.1 | 0 | 1.1 | 61 (7) | 1.5 | 1 (1) |
| Cluster 8 | 0.8 | 35 (5) | 1.9 | 18 (2) | 1.0 | 42 (6) | 1.4 | 1 (1) |
| Cluster 9 | 0.8 | 0 | 1.4 | 22 (4) | 1.0 | 29 (3) | 1.3 | 69 (10) |
| Cluster 10 | 0.7 | 41 (7) | 1.4 | 42 (8) | 0.9 | 29 (3) | 1.2 | 1 (1) |
| Overall | 100 | 12 (10) | 100 | 29 (10) | 100 | 60 (10) | 100 | 20 (10) |

Clustering criterion: a backbone atom-positional root-mean-square deviation smaller than 0.1 nm. For the ten most populated conformations observed in simulations using the 45A3 and 53A6 parameter sets in methanol (MeOH) and water (H2O), the occurrence is given (columns labelled ‘Percentage’) as well as the average occurrence (in %) of the ten 3₁₄-helical hydrogen bonds in that cluster (columns labelled ‘Hydrogen bonds’). The total number of helical hydrogen bonds observed is given in brackets
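The two numbers reported in each ‘Hydrogen bonds’ entry can be illustrated with a short sketch. It assumes that the presence of each of the ten possible helical hydrogen bonds has already been determined per structure (the geometric hydrogen-bond criterion is not part of this sketch), and it returns the occurrence averaged over all ten bonds together with the number of distinct bonds seen at least once. The function name and the toy data are hypothetical.

```python
from typing import Sequence, Tuple


def helical_hbond_summary(present: Sequence[Sequence[bool]],
                          n_possible: int = 10) -> Tuple[float, int]:
    """Summarise hydrogen-bond presence for a set of structures.

    present[f][k] is True if hydrogen bond k exists in structure f.
    Returns (average occurrence in percent over all n_possible bonds,
             number of distinct bonds present in at least one structure).
    """
    n_frames = len(present)
    if n_frames == 0:
        return 0.0, 0
    # Fractional occurrence of each individual hydrogen bond.
    per_bond = [sum(1 for frame in present if frame[k]) / n_frames
                for k in range(n_possible)]
    average_percent = 100.0 * sum(per_bond) / n_possible
    n_observed = sum(1 for occ in per_bond if occ > 0.0)
    return average_percent, n_observed


# Hypothetical cluster of four structures: bonds 0-4 present in two
# structures, bonds 0-1 in a third, and none in the fourth.
toy_cluster = [
    [k < 5 for k in range(10)],
    [k < 5 for k in range(10)],
    [k < 2 for k in range(10)],
    [False] * 10,
]
average_percent, n_observed = helical_hbond_summary(toy_cluster)
```

In the notation of Table 5, this toy cluster would be reported as 30 (5): bonds that are never formed still count in the denominator of the average, so a cluster can show all ten bonds at least once and still have a low average occurrence.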

In Fig. 3c–f, we depict the central member structure of the most populated cluster from each of the four simulations (45A3 in methanol, 53A6 in methanol, 45A3 in water and 53A6 in water). The most populated cluster of the simulation in methanol using the 45A3 parameter set clearly does not represent a helical conformation, whereas the others do. Table 5 shows that the most populated cluster in the 45A3 simulation in methanol represents only 7.8% of the total simulation time. Other clusters with an even lower occupancy do show some helical content, as can be seen from the hydrogen bond analyses. These clusters only contain structures from the first 30 ns of the simulation. Overall, the 3₁₄-helical hydrogen bonds are present for only 12% of the time in the methanol simulation using the 45A3 parameter set. Figure 4 shows the rmsd for the backbone atoms (N, Cα, Cβ, C) with respect to the experimental NMR model structure. It is clear that this simulation deviates strongly from the NMR model structure for most of the 100 ns. The experimental structure is revisited to within an rmsd value of 0.04 nm after approximately 24 ns, but is quickly abandoned again. This indicates that the absence of significant helical conformations is not due to insufficient sampling, but rather to a low energetic stability of this conformation.
Fig. 3

Chemical formula of the dodecapeptide (a). Stick representation of the experimental NMR model structure with backbone carbon atoms in yellow, sidechains in blue (b). Stick representation of the backbone of the central member structures of the most populated conformational clusters using parameter set 45A3 in methanol (c), 53A6 in methanol (d), 45A3 in water (e) and 53A6 in water (f)

Fig. 4

Atom-positional root-mean-square-deviations of the backbone atoms (C, Cα, Cβ, N) of residues 2–11 with respect to the experimental NMR model structure derived for the peptide in methanol. Parameter sets 45A3 (black) and 53A6 (red) in methanol (a) and water (b)

For the simulation in methanol using the 53A6 parameter set, we see quite the opposite picture. The low number of clusters indicates that the peptide is very stable throughout the simulation, as can also be observed in Fig. 4. The hydrogen bond analysis (Table 5) shows that the cluster that represents 60% of the simulation shows 73% of the helical hydrogen bonds, and that all subsequent clusters displayed here show a significant helical content. At 60%, the overall presence of 3₁₄-helical hydrogen bonds is the highest of the four simulations. From Fig. 4, we see that the peptide does deviate from the experimental structure for some periods in the simulation, but finds its way back to the original conformation after 22 and 75 ns.

In water, where no evidence of regular secondary structure was found experimentally, the 45A3 simulation seems to find a stable conformation (cluster 1, see Fig. 3e) close to the NMR model structure up to about 38 ns, after which it unfolds; it revisits the same conformation at 50 and 55 ns. After this point in time, the helical structure is lost and clusters 2, 4, 5, 6 and 7 in Table 5 come into play. Overall, the 3₁₄-helical hydrogen bonds are observed for about 29% of the time, much more than in the methanol simulation with the same parameter set. The 53A6 simulation in water, even though much shorter, seems to be stable in clusters 1 and 9 for about 7.5 ns, after which the peptide visits cluster 6, which is still about half helical. The other clusters that are visited seem to be far away from the NMR model structure, with the occasional appearance of helical hydrogen bonds.

In summary, we can conclude that the simulations with the 45A3 parameter set do not seem to reproduce the experimental finding of a 3₁₄-helix in methanol and no regular secondary structure in water. In the 45A3 simulation in methanol, the 3₁₄-helix is not stable, whereas in water it seems to be more stable, although it also unfolds there. Using the parameter set 53A6, the simulations do agree with the experiment. In methanol, the 3₁₄-helical structure seems stable: the peptide moves away from this structure but refolds to it later on. In water, the 3₁₄-helical structure seems much less stable, unfolding after about 7.5 ns. The NOE analyses on these simulations nicely mirror these findings. As can be seen from Table 6, the 45A3 parameter set in methanol performs worst in reproducing the experimental upper bounds. Even though the experimental data were obtained in methanol, the 45A3 simulation in water agrees better with them than the 45A3 simulation in methanol. The best agreement with the experiment is obtained from the simulation using the 53A6 parameter set in methanol, which shows a very low number of violations and a very low average violation. The 53A6 parameter set, which was parameterized specifically on the free enthalpies of hydration and of solvation in cyclohexane, seems better able to reproduce the experimental data in a folding study of a peptide containing polar groups. Of course, the balance between different ‘solvation’ states of the different groups in the peptide will be directly associated with the folding stability and equilibrium.
Table 6

Number of NOE upper bound violations and average (over the entire simulations) violations in the β3-dodecapeptide simulations

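| Force-field parameter set | Solvent | NOE upper bound violations >0.05 nm | NOE upper bound violations >0.1 nm | NOE upper bound violations >0.2 nm | Average violation (nm) |
|---|---|---|---|---|---|
| 45A3 | MeOH | 34 (23%) | 25 (17%) | 13 (9%) | 0.043 |
| 45A3 | H2O | 26 (17%) | 11 (7%) | 3 (2%) | 0.023 |
| 53A6 | MeOH | 12 (8%) | 3 (2%) | 0 | 0.010 |
| 53A6 | H2O | 33 (22%) | 21 (14%) | 5 (3%) | 0.030 |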

The percentage of the total number of NOE upper bounds is given in brackets. The NOE upper bounds have been derived from NMR experiments on the β-dodecapeptide in methanol. No NOE bounds for the peptide in water were available
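As an illustration of how the quantities in Table 6 can be obtained from a trajectory, the sketch below counts upper-bound violations above a set of thresholds and computes an average violation. Two choices here are assumptions rather than statements about the analysis in this work: the proton–proton distances are averaged as ⟨r⁻³⟩^(−1/3), a common convention for NOE comparisons, and the average violation is taken as the sum of the positive violations divided by the total number of NOE bounds. The function name and the toy distances are hypothetical.

```python
from typing import Dict, Sequence


def noe_violation_summary(distances: Sequence[Sequence[float]],
                          upper_bounds: Sequence[float],
                          thresholds: Sequence[float] = (0.05, 0.1, 0.2),
                          power: int = 3) -> Dict[str, object]:
    """Compare simulation-averaged distances (nm) with NOE upper bounds.

    distances[i] is the time series of the distance for NOE i.
    Averaging uses <r^-power>^(-1/power) (assumed convention).
    """
    violations = []
    for series, bound in zip(distances, upper_bounds):
        mean_inverse = sum(r ** -power for r in series) / len(series)
        r_avg = mean_inverse ** (-1.0 / power)
        # Only distances exceeding the upper bound count as violations.
        violations.append(max(0.0, r_avg - bound))
    counts = {t: sum(1 for v in violations if v > t) for t in thresholds}
    average = sum(violations) / len(violations)
    return {"violations": violations, "counts": counts, "average": average}


# Hypothetical example with three NOE bounds and two frames each.
toy_distances = [
    [0.3, 0.6],  # averaged distance ~0.36 nm, below the 0.40 nm bound
    [0.5, 0.5],  # exceeds its 0.42 nm bound by ~0.08 nm
    [0.7, 0.7],  # exceeds its 0.40 nm bound by ~0.30 nm
]
toy_bounds = [0.40, 0.42, 0.40]
summary = noe_violation_summary(toy_distances, toy_bounds)
```

The first toy NOE shows the effect of the averaging: the arithmetic mean distance of 0.45 nm would violate the bound, but the r⁻³-weighted average is dominated by the short distance and does not. This is one reason why an ensemble can satisfy bounds that any single structure violates.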

Conclusion

We have presented three test cases in order to begin the validation of the new GROMOS 53A6 parameter set (Oostenbrink et al. 2004) for biomolecular simulation. The simulation of the 129-residue protein hen egg-white lysozyme remains stable over 5 ns. Secondary structure elements are well preserved and 95% of the proton–proton upper bounds derived from NMR experiments are reproduced within 0.1 nm. The simulation of the DNA dodecamer was likewise stable, over 4 ns. Fluctuations at the ends of the chains did not conflict with the NMR data that were used. Structural parameters indicate that the dodecamer visits conformations that are between the canonical A-DNA and B-DNA forms, which also does not conflict with the NMR data. No major differences between simulations with previous parameter sets and the new parameters were observed in these two test cases.

For the β3-dodecapeptide, significant differences between the two parameter sets were observed. In a long simulation in which a folding–unfolding equilibrium can be established, the balance between different ‘solvation’ states of the different functional groups will be of vital importance. We showed that simulations with a parameter set that was derived from the free enthalpies of hydration and of apolar solvation reproduce the experimentally found secondary structure better than the previous parameter set.

Overall, the 53A6 parameter set behaves similarly to the previous (45A3/4) GROMOS parameter sets in terms of protein and DNA stability. In a case where the balance between different environments of polar groups plays an important role, it performed better in reproducing the experimental data. After these initial validations, we think it is safe to use this parameter set in the future and are confident that it will prove itself in many simulations to come.

Acknowledgments

Professor Dr. K. Wüthrich is gratefully acknowledged for making the experimental data on the β3-dodecamer available. We also thank Lorna Smith for helpful discussions on the NOE analysis. Financial support by the National Center of Competence in Research (NCCR) in Structural Biology of the Swiss National Science Foundation (SNSF) is gratefully acknowledged.

Copyright information

© EBSA 2005