Discrete gradients in short-range molecular dynamics simulations

Discrete gradients (DG) or more exactly discrete gradient methods are time integration schemes that are custom-built to preserve first integrals or Lyapunov functions of a given ordinary differential equation (ODE). In conservative molecular dynamics (MD) simulations, the energy of the system is constant and therefore a first integral of motion. Hence, discrete gradient methods seem to be a natural choice as an integration scheme in conservative molecular dynamics simulations.


Introduction
In molecular dynamics (MD) simulations, or, more generally, in particle methods, one is usually not interested in the trajectory of a single particle but in derived and/or averaged quantities that are often related to geometric properties of the underlying equations of motion.For a meaningful simulation, the preservation of these geometric properties is important (e.g.[23,29,32]).In a microcanonical or (NVE) ensemble, important macroscopic variables are the total number of particles (N), the system's volume (V) as well as the total energy (E).Microcanonical MD simulations should therefore preserve the total energy which corresponds to a first integral of the equations of motion (e.g.[48]).In the Hamiltonian formulation, the total energy or Hamiltonian H : R d × R d → R, the positions q ∈ R d and momenta p ∈ R d determine the equations of motion as q = ∇ p H(q, p), p = −∇ q H(q, p).In addition to the preserved energy, Hamiltonian systems possess another important structural property, the symplecticity of the Hamiltonian flow (cf.[28,49]).Unfortunately, the symplectic structure and the total energy can generally not be preserved exactly at the same time.A class of geometric integrators that can preserve first integrals and Lyapunov functions exactly are discrete gradient methods.They have first been considered as energy conserving schemes (e.g.[14,15,26]) and they have then been generalised to arbitrary first integrals of Hamiltonian and non-Hamiltonian systems (cf.[43]) and Lyapunov functions (cf.[34,53]).Other methods that are able to preserve integrals and Lyapunov functions are projection methods (e.g.[20]), which are related to discrete gradient methods (cf.[40]).In addition to many applications, discrete gradient methods can be used to preserve the energy of suitably discretized variational partial differential equations (e.g.[11,12,39,41,57]) as well as to ensure the dissipation of gradient systems in image processing (e.g.[18,19,21,46]).They have been generalised to manifolds (cf.[9,10]) and inspire new ideas in smooth optimization (e.g.[13,45]) and deep learning (e.g.[8]).
Following this introduction, discrete gradients are discussed in section 2. Their use for Hamiltonian systems in MD simulations is studied in section 3. Special discrete gradients for molecular dynamics are presented in section 4. The parallelisation of DG methods is discussed in section 5. Finally, a brief conclusion is given in section 6.

Discrete gradients in geometric integration
In order to remind the reader of discrete gradients, we follow [33].
Definition 1. (Gonzalez 1996) Let V : R n → R be continuously differentiable.The function ∇V : R n × R n → R n is a discrete gradient of V iff it is continuous and The discrete gradient is symmetric, if and only if An interpretation of the following proposition, that can be found as proposition 3.2 in [33], is, that the component of any discrete gradient in the direction (u − u)/ u − u is (V (u ) − V (u))/ u − u .
Proposition 1. ∇V (u, u ) is a discrete gradient if and only if it is continuous and where w(u, u ) is a vector-valued function such that w(u, u ), (u − u) = 0, (u = u ), lim u →u w(u, u ) − P (u −u) ⊥ ∇V (u) = 0, where P (u −u) ⊥ is the projection on the space perpendicular to (u − u).
For the Euclidean inner product, u, v = v T u, for u, v ∈ R n , the projection can be written as the matrix where I n is the n × n identity matrix and • T means transpose.The following simple fact is quite useful.
Lemma 1.Let ∇V i be discrete gradients for V i , i = 1, . . ., N .Then We omit the obvious proof of lemma 1. Examples of well-known discrete gradients are the midpoint discrete gradient or Gonzalez discrete gradient (cf.[14]) (1) the mean value discrete gradient (cf.[24]) (2) and the coordinate increment discrete gradient (cf.[26]) , where 0/0 is understood to be ∂V /∂u i .The midpoint discrete gradient is symmetric, i.e.V M P (u, u ) = V M P (u , u) for all u = u , as is the mean value discrete gradient.The coordinate increment discrete gradient is not symmetric, but it can be symmetrised (cf.[35]) A discrete gradient can be used as the discretised version of the gradient of a first integral or the gradient of a Lyapunov function of an ordinary differential equation (ODE).If the ODE is smooth enough and possesses a first integral or Lyapunov function V then the ODE can be written as (4) y = f (y) = A(y)∇V (y), where A(y) is an equally smooth antisymmetric matrix, when V is a first integral, and a negative semidefinite matrix, when V is a Lyapunov function (cf.[33]).A discrete gradient method then reads (5) where Ã(y, y, 0) = A(y), for consistency, and ∇V is a discrete gradient for V .If the discrete gradient is symmetric and Ã(y n+1 , y n , −τ ) = Ã(y n , y n+1 , τ ) holds for all possible values, then the discrete gradient method ( 5) is called time-symmetric or self-adjoint (cf.definition 3.2 in [34]).

DG methods in molecular dynamics (MD) simulations
The equations of motion in molecular dynamics can be conveniently stated in Hamiltonian form.Given an arbitrary (smooth) Hamiltonian function the corresponding Hamiltonian equations of motion are (6) q = ∇ p H(q, p), p = −∇ q H(q, p).
The Hamiltonian corresponds to the total energy of the system that is preserved in a conservative simulation.In order to recognize the fact that equation ( 6) is a special case of equation ( 4), we set y = (q, p) T and define the matrix where I d is the identity matrix of dimension d.With these definitions, system (6) reads the Hamiltonian H (energy) is preserved along solutions of the system.With an arbitrary discrete gradient satisfying definition 1, the corresponding discrete gradient method reads (8) with step size τ > 0. The discrete gradient method also preserves the Hamiltonian, since definition 1 gives The method ( 8) is time-symmetric, whenever the chosen discrete gradient is symmetric.
3.1.Separable Hamiltonian systems.When the Hamiltonian is separable, i.e. ( 9) one can apply a different discrete gradient to T or V , respectively, in order to obtain a discrete gradient for H(q, p).System (7) now reads Given two discrete gradients ∇ q V for V and ∇ p T for T , respectively, the discrete gradient method is (11) Method (11) exactly preserves the energy, which is noted in the following lemma.

Proof.
From definition 1 followed by (11), we find The proof of lemma 2 also shows that is a discrete gradient for H for any choice of a discrete gradient ∇ q V and ∇ p T whenever the Hamiltonian H is separable, cf.(9).If both discrete gradients in (11) are symmetric then the method is time-symmetric.
In molecular dynamics, the Hamiltonian often is of the even simpler form ( 12) is a quadratic function that corresponds to the kinetic energy.M −1 is a diagonal matrix with the inverses of the masses of the corresponding particles.For quadratic functions, any quadratic discrete gradient basically reduces to the midpoint rule.Choosing the midpoint discrete gradient for ∇ p T , one thus obtains (13) Inserting the second equation in the first leads to the system ( 14) ), p n+1 = p n − τ ∇ q V (q n , q n+1 ), which will be used for the computation.The first equation is implicit in q n+1 and takes some effort to solve numerically.The momenta p n+1 are easily computed, once the first equation has been solved.If the first appearance of the discrete gradient were replaced by the true gradient at q n and the second discrete gradient by the average of the true gradients at q n+1 and q n , respectively, we would recover the well-known Velocity-Störmer-Verlet method.Method (14) might therefore be called Velocity-DG method.
Proposition 2. The method (13) or method (14), respectively, exactly preserves the Hamiltonian (12).If the discrete gradient ∇ q V is symmetric, then the scheme is time-symmetric (reversible).If the discrete gradient ∇ q V is symmetric and sufficiently smooth then the method is of second order for sufficiently smooth V .
Proof.The scheme (14) exactly preserves the Hamiltonian (12), since it is just a reformulation of scheme (11) for this Hamiltonian and lemma 2 applies.Exchanging τ ↔ −τ , q n+1 ↔ q n and p n+1 ↔ p n shows the time-symmetry for symmetric discrete gradients.Finally, definition 1 and the symmetry of the scheme show second-order accuracy (cf.Theorem 8.10 in [22]).
A more elaborate proof of second order for the midpoint discrete gradient applied to the full system (7) can be found as the proof of Theorem 8.5.4 in [53].For separable systems, this corresponds to a special case of our proposition 2.
Analogous to the elimination of the velocities (momenta) in the Verlet algorithm, one may derive a two-step formulation by adding the first line of ( 14) to the one with negative time step −τ , i.e., For the solution of the implicit equation in (14), the equation is transformed to F (u) = 0, where and the Newton method is applied in the vicinity of q n .This leads to the iteration where u 0 ≈ q n and ( 16) In order to reduce the computational work, one can also use the simplified Newton iteration where J F (u m ) is an approximation to the full Jacobian.For example, for the midpoint discrete gradient (1), ( 18) might be used.This is just the Jacobian that would occur in the implicit midpoint rule.This way, a loop over all particles to compute the potential V (u m ), the gradient ∇V (u m ) and the norm of the difference of the positions is avoided.The matrices ∂F ∂u and J F (u m ) are symmetric and the linear systems in ( 15) and (17), respectively, can be solved efficiently by the conjugate gradients (CG) method (cf.[25]).Newton iterations also appear in the discretisation schemes SHAKE and RATTLE that are custom-built to pose constraints during molecular dynamics simulations (cf.[1,47]).The method is often also referred to as SHAKE-and-RATTLE, since they have been found to be equivalent in [30].

Discrete gradients for molecular dynamics
In this section, we discuss previously known as well as new discrete gradients that are designed for use in particle and molecular dynamics.A basic idea for discrete gradients custom-built for molecular dynamics is to mimic the standard forces by discrete gradients.That is, for a pairwise force, the discrete gradient will be built upon the two particles.For angle forces, three particles are involved and for torsion forces, four particles.Therefore, we consider N particles with masses m 1 , . . ., m N , positions q i , i = 1, . . ., N , and momenta p i , i = 1, . . ., N .Here, p i or q i are three-dimensional vectors.The momenta are given as p i = m i v i , where v i are the velocities of the particles.We designate by r ij = q j − q i the vector, that points from particle i to particle j.Its length is noted as r ij = q j − q i , where • designates the Euclidean norm.The connection with the vectors used in the previous sections is given as The vector q collects the positions of the particles, the vector p the momenta, respectively.
4.1.Discrete gradients for pairwise forces.A discrete gradient for pairwise forces is known for quite some time.
The first appearance is due to LaBudde & Greenspan [5,6,7].Since then, this discrete gradient has been studied by several scientists (e.g.[16,44,51,52]).If V (r ij ) is a pairwise potential for the particles i and j then we obtain, with the finite difference ∇V q1 (q, q ) . . .
with the non-zero components (20) and ∇ q k V (q, q ) = 0 for k = i, j.
Theorem 1.The formulas (20) form a discrete gradient for pairwise potentials V (r ij ).
Proof.We have to show the properties of definition 1.
∇ qi V (q, q) = lim (q,q )→(q,q) ∇ qi V (q, q ) = lim (q,q )→(q,q) and, analogously, ∇ qj V (q, q) = lim (q,q )→(q,q) ∇ qj V (q, q ) = lim (q,q )→(q,q) We also have the property The following theorem is due to LaBudde and Greenspan, cf.[5].The theorem and its proof might also be found as Theorem 5.1 in [23].For convenience of the reader and later reference, we restate the theorem and prove it.Note, that the important statement of the theorem is the conservation of the energy.Several schemes, also the explicit Verlet scheme, preserve the total linear momentum and the total angular momentum.The statement about the total linear momentum and the total angular momentum in the theorem are given in order to highlight that the discrete gradient methods also preserve these quantities.
Theorem 2. The method (13) for a particle system of N particles with Hamiltonian where V k are pairwise potentials, is a second-order symmetric implicit method which conserves the energy, the total linear momentum P = N i=1 p i and the total angular momentum L = N i=1 q i × p i .Proof.The order of the method follows from proposition 2. The preservation of the energy directly follows from theorem 1, lemma 1, and lemma 2. From (20), we have Hence, For a fixed pairwise potential V k (r ij ) for particles i and j and arbitrary q and q, we have that From this, for V and q , q arbitrarily chosen, we find Now we turn to method ( 13).Then we have by the equation above that we also have Adding ( 22) and (23) shows the preservation of the angular momentum.

4.2.
Experiment with two Lennard-Jones particles.The Lennard-Jones potential is one of the most-used models for the interaction of neutral particles (cf.[36]).We will use the standard 12-6 potential In our experiment, we use σ = 1 and = 5.The initial conditions are shown in figure 1.In the positions section, the atoms are numbered together with their initial location.The velocities are given in the next section, while the particle is identified by its number.The particles are attracted to each other and then repelled again.This gives a periodic movement.Due to the initial velocities, they are turning around each other, additionally.With this set-up, we first compute the trajectories of the particles up to time 10 with different step sizes.The error is shown in figure 2 on the left-hand side.The error is shown versus the chosen step sizes.All three methods show a second-order error behaviour.The black line indicates the second-order slope in the logarithmic plot.On the right-hand side, the calculated energies are shown for all time steps.The discrete gradient method preserves the energy correctly, while the Verlet scheme and the implicit midpoint rule show a periodic change of the energy.As expected, the total linear momentum and the total angular momentum are preserved by all three methods.The implicit equation in ( 14) is solved with the Newton method, (15), with the full Jacobian (16).The Jacobian is a symmetric matrix and the CG method is used to solve the linear systems.The Jacobian is not explicitly formed.Instead, the action of the Jacobian on a vector is directly computed.This way, the computational effort is comparable to computing the forces.Sometimes, this is called a matrix-free implementation.
. Sketch of bond angle: From left to right, first the bonds and the bond angle between the atoms i,j,k is shown.The second sketch shows the distances between the atoms i,j,k used in the discrete gradient.On the right-hand side, the standard term of the bond angle is followed by the term based on distances, solely.

4.3.
Discrete gradients for bond angles.We discuss some discrete gradients for bond angles.Besides general discrete gradients, e.g. ( 1), (2), and (3), restricted to the bond angles, a symmetric discrete gradient for the bond angles has recently been proposed in [50].We discuss a slight generalisation of this discrete gradient.The standard angle potential, V (θ), is assumed to depend smoothly on the angle θ, cf.figure 3. The angle θ = θ ijk can be expressed in terms of the distances r ji , r jk , r ij between the three atoms i, j, and k as given on the right-hand side of figure 3.This is due to the following lemma.
Lemma 3. We have the following representation of the scalar product Proof.The proof is a simple calculation.
So the angle potential only depends on three distances V (r ji , r jk , r ik ).We first give a non-symmetric discrete gradient that is similar to the Itoh-Abe gradient.In order to write the discrete gradient down in a concise way, we use the following finite differences with respect to the distances of the particles This can also be seen as the Itoh-Abe (coordinate increment) discrete gradient for V (r ji , r jk , r ik ).With these finite differences, our first discrete gradient ∇ V has the nonzero components where all other are zero.This discrete gradient is not symmetric, since the finite differences are not symmetric, e.g.
Hence we can define another discrete gradient ∇ r V by using coordinate decrease.The finite differences are now With these finite differences, we obtain another discrete gradient that is also not symmetric.Since symmetric discrete gradients lead to symmetric time-integration methods, which are of second order, one might prefer symmetric discrete gradients.In the same way as the Itoh-Abe discrete gradient, we can symmetrise our gradients here.This leads to the discrete gradient with the (symmetric) finite differences And the non-zero coefficients of the symmetric discrete gradient ∇ s V read These three expressions are indeed discrete gradients, which is detailed in the following theorem.
Theorem 3. The three expressions are discrete gradients.
Proof.A calculation shows that the energy is preserved by the first gradient ∇ V .We omit the arguments of the finite differences and obtain ∇ V (q, q ) T (q − q) = ∇ qi V (q, q ) T (q i − q i ) + ∇ qj V (q, q ) T (q j − q j ) + ∇ q k V (q, q ) T (q The discrete gradient is defined by continuous continuation.We have to show that the discrete gradient coincides with the exact gradient of V in this case, that is, we have to show This is a simple calculation.We obtain lim (q,q )→(q,q) The proof of the discrete gradient ∇ r V works analogously.From these two, finally the statement for the symmetric discrete gradient ∇ s V follows.
There are more discrete gradients.One might prescribe any pattern of primes to the three distances in the finite differences.For example the pattern prime, no prime, prime and then changing from prime to no prime and vice versa from left to right All discrete gradients of this type can be symmetrised by symmetrising the finite differences.
Theorem 4. The method (13) for a particle system of N particles with Hamiltonian where V k are angle potentials, is a first-order non-symmetric implicit method for ∇ V and ∇ r V and a second-order symmetric implicit method for ∇ s V .All DG methods based on the discrete gradients above conserve the energy, the total linear momentum P = N i=1 p i and the total angular momentum L = N i=1 q i × p i .Proof.The order of the methods follows from proposition 2. The preservation of the energy again follows from theorem 1, lemma 1, and lemma 2. From (26), or the corresponding relation for the other two discrete gradients, we have Hence, the preservation of the angular momentum follows literally as in equation (21).For a fixed angle potential V k (θ ijk ) for particles i, j, k, for all three discrete gradients and for arbitrary q and q, we have that From here, the proof literally proceeds as in theorem 2.
Theorem 2 and theorem 4 prove that the statement of theorem 4 remains true if pairwise potentials are included.This is summarised in our main theorem 7. The preservation of the total linear momentum and the total angular momentum is basically due to the following observation.For the discrete forces we have the relation analogue to the forces, f k + f i + f j = 0 .This is also satisfied by many other schemes, particularly the Verlet scheme.That is, these schemes also preserve the total linear momentum and the angular momentum.For this reason, we mainly consider the order and the energy preservation in our experiments.The energy preservation is what comes with the DG schemes.We use the mixing rule whenever two different species i and j meet in a Lennard-Jones interaction.We only impose Lennard-Jones interactions on two particles that belong to different molecules.Here and in further experiments, we use the angle potential This potential is available in LAMMPS, cf.[55], as angle style cosine/squared.Further, we remove the units by scaling.We use σ = 10 −10 m = Å, ˜ = 1 kcal/mol, m = 1 u and α = σ m/˜ = 4.88 • 10 −14 .The dimensionless quantities then read m = m/ m, 4, the (dimensionless) initial positions and velocities of the particles are given.The first line in the positions section numbers the atoms.The second indicates the molecule number.As long as this number is the same, the atoms are connected by a bond.The velocities of the particles are given in the next section.We first compute the trajectory up to time 10.We show the error versus the step size at time 10 on the left-hand side of figure 5.All methods show their expected order.With an unsymmetric discrete gradient, we observe first order.When the discrete gradient is symmetrised, second order for the DG scheme is observed.On the right-hand side of figure 5, the calculated energies for time step τ = 0.005 are shown versus the simulation time.All three discrete gradient schemes preserve the energy nicely.The implicit equation in ( 14) is solved with the simplified Newton method, (17), where the approximate Jacobian J F , cf. (18), includes the full Hessian with respect to the Lennard-Jones potentials and the bond potentials but omits the part with respect to the bond angle potentials.4.5.Discrete gradients for dihedral angles.The dihedral angle denotes the angle between the planes spanned by the atoms i, j, k and j, k, , respectively.The sign of the dihedral angle φ ijk , i.e. sign( r ij , r jk × r k ), designates on which side of the plane through j, k, the particle i lies.We consider potentials V (φ) that depend smoothly on the dihedral angle φ.The computation of the forces based on the dihedral angle is based on the expression as given above in (28) and leads to the most used formulas as described in [3].To our Code fragment 2. Data P o s i t i o n s 1 1 6 .3 1 0 5 9 3 9 .0 0 0 0 0 0 9 .7 4 4 3 7 1 2 1 7 .0 5 4 9 6 4 9 .0 0 0 0 0 0 9 .0 0 0 0 0 0 3 1 6 .3 1 0 5 9 3 9 .0 0 0 0 0 0 8 . 2 5 5 6 2 9 4 2 1 1 .6 8 9 4 0 7 9 .0 0 0 0 0 0 9 .7 4 4 3 7 1 5 2 1 0 .9 4 5 0 3 6 9 .0 0 0 0 0 0 9 .0 0 0 0 0 0 6 2 1 1 .6 8 9 4 0 7 9 .0 0 0 0 0 0 8 . 2 5 5 6 2 9 V e l o c i t i e s 1 0 .0 0 0 0 0 0 −0.050000 0 .0 0 0 0 0 0 2 0 .0 0 0 0 0 0 −0.050000 0 .0 0 0 0 0 0 3 0 .0 0 0 0 0 0 −0.050000 0 .0 0 0 0 0 0 4 0 .0 0 0 0 0 0 0 .0 5 0 0 0 0 0 .0 0 0 0 0 0 5 0 .0 0 0 0 0 0 0 .0 5 0 0 0 0 0 .0 0 0 0 0 0 6 0 .0 0 0 0 0 0 0 .0 5 0 0 0 0 0 .0 0 0 0 0 0 best knowledge, the discrete gradient that we propose in the following is new.It is based on the same idea as the discrete gradients before, namely, the idea to express the angle with distances.The dihedral angle can also be expressed in terms of the distances between all atoms as indicated on the right-hand side of figure 6.The additional lines on the right-hand i j k i j k Figure 6.Sketch of united-atom butane: on the left-hand side the molecule with the particles i, j, k, is shown including the bonds between atoms.On the right-hand side, the distances between the particles i, j, k, used by the discrete gradient are given.
side of figure 6 are the additional distances that will be used.The representation of the angle in distances is given in the following lemma.
Lemma 4. We have the following representation in terms of distances where m = r ij × r jk and n = r jk × r k .
Proof.The proof is a tedious calculation along the following steps So the dihedral potential depends on six distances V (r ij , r jk , r k , r ik , r j , r i ).The representation in lemma 4 suggests an alternative way to represent any potential dependent on the dihedral angle in terms of distances.But we will consider potentials that are dependent on the cosine of the dihedral angle, because we will use such potentials in our experiments later.Let a n ϕ n , be the potential for the torsion angles.Then, we have The potential reduces to Ũt (ϕ ijk ).Hence, for the discrete gradient, we use the representation with ϕ ijk expressed in the distances as in lemma 4.
We first discuss non-symmetric discrete gradients.In order to write the discrete gradient down in a concise way, we use again finite differences with respect to the distances of the particles.
This can also be seen as the Itoh-Abe (coordinate increment) discrete gradient for V .With these finite differences, our first discrete gradient reads as follows ( 29) This discrete gradient is not symmetric.The coordinate decrement version ∆ r rij V (q, q ) := leads to another discrete gradient that is also not symmetric.Since symmetric discrete gradients lead to symmetric time-integration methods, which are of second order, one might prefer symmetric discrete gradients.In the same way as the Itoh-Abe discrete gradient, we can symmetrise our gradients here.This leads to the discrete gradient with the (symmetric) finite differences ∆ s r V (q, q ) := 1 2 ∆ r V (q, q ) + ∆ r r V (q, q ) , r ∈ {r ij , r jk , r kl , r ik , r j , r i } These three expressions are indeed discrete gradients, which is detailed in the following theorem.The proof of this theorem is analogous to the proof of theorem 3.
Theorem 5.The three expressions are discrete gradients.
Actually, there are many more discrete gradients.One might prescribe any pattern of primes to the six distances.
Changing the primed distances to not-primed distances and vice versa from left to right when the finite difference 'passed' this distance, we obtain a new unsymmetric discrete gradient.These unsymmetric discrete gradients can all be symmetrised.
Theorem 6.The method (13) for a particle system of N particles with Hamiltonian where V k are dihedral potentials, is a first-order non-symmetric implicit method for ∇ V and ∇ r V and a second-order symmetric implicit method for ∇ s V .All three methods conserve the energy, the total linear momentum P = N i=1 p i and the total angular momentum L = N i=1 q i × p i .
Proof.The order of the methods follows from proposition 2. The preservation of the energy again follows from theorem 1, lemma 1, and lemma 2. From (29), or the corresponding relation for the other two discrete gradients, we have Hence, the preservation of the angular momentum follows literally as in equation (21).For a fixed dihedral potential V k (ϕ ijk ) for particles i, j, k, , for all three discrete gradients and for arbitrary q and q, we have that From here, the proof literally proceeds as in theorem 2.
Besides the conventional torsion potential, where the atoms i, j, k, are consecutively connected, there is also the possibility that three atoms are connected to one as in figure 7 on the left-hand side.Here, the so-called improper dihedral angle is defined as the angle between the planes spanned by the atoms i,j,k and j,k, , respectively.As shown on the right-hand side of figure 7, this angle can be expressed in exactly the same way as before for the standard dihedral angle (28).Therefore, the distance definition of the dihedral angle in lemma 4 also works for the improper dihedral angle and the corresponding discrete gradients are constructed analogously.As a corollary from theorem 2, theorem 4, and . Improper dihedral angle: on the left-hand side, the typical situation is shown.On the righthand side, the formula for the improper dihedral angle ω ijk is given.ϕ ijk is exactly the same term as for the standard dihedral angle, cf.(28).
theorem 6, we obtain our main theorem.
Theorem 7. The method (13) for a particle system of N particles with Hamiltonian where V k are pairwise, angle, and dihedral potentials, is a first-order non-symmetric implicit method if at least one non-symmetric discrete gradient is used, and a second-order symmetric implicit method if all discrete gradients are symmetric.All methods preserve the energy, the total linear momentum P = N i=1 p i and the total angular momentum L = N i=1 q i × p i .
Theorem 7 shows that all standard short-range forces in a classical molecular dynamics simulation can be modelled with discrete gradients.Since even large protein simulations are often based on these standard short-range potentials and since the standard potentials do not change whether more atoms are connected to atoms defining one of the standard interactions or not, this theorem is quite general with respect to its applicability.
Additional insight in the computation of the gradient of a dihedral angle potential might be gained from the representation of the dihedral angle with six distances.The computation of this gradient is not straightforward.It is usually based on formula (28), also called the cross-product definition of the dihedral angle.The scalar product of the cross-products can be represented as a scalar product without cross products, the so-called scalar product definition for the torsion angle.The cross-product definition leads to the most used formulas for the computation of the forces induced by the dihedral potential as given in [3].The representation of the dihedral angle in terms of distances leads to the interesting alternative formulas The representation of the dihedral angle in terms of the distances in lemma 4 might be called the 'distance' definition of the dihedral angle, as a supplement to the cross-product definition and the scalar product definition of the dihedral angle.
Since we have 'distance' definitions of the bond angles as well as the dihedral angles, and since the other potentials are dependent on distances in a natural way, one could evaluate all forces with respect to these potentials in a unified way.We use these representations here in order to use the same technique to construct a discrete gradient.But this unified way to compute the forces, i.e. the gradients of the potentials, might be interesting for the standard Verlet scheme, too.4.6.Experiment with butane.We use a united-atom model of butane.The Lennard-Jones potential, the bond potential and the angle potential are chosen as before with the parameters given in table 1.And the potential for the torsion potential in terms of the dihedral angle φ in the IUPAC convention (cf.[27]) reads U t (φ) = k φ (1.116 − 1.462 cos φ − 1.578 cos 2 φ + 0.368 cos 3 φ + 3.156 cos 4 φ + 3.788 cos 5 φ) .
the second row to the molecule the atom belongs to.Two consecutive atoms within the same molecule are connected by a bond.Three consecutive atoms with the same molecule are ruled by the angle potential, and all four atoms of the molecule by the given torsion potential.That is, this experiment uses all potentials discussed so far.
In figure 9 on the left-hand side, one can see that the methods perform as expected with respect to the order.The integration time for the error plot was up to T = 2.0 with the step sizes indicated on the abscissa.The error measured in the standard Euclidean norm is shown on the ordinate.If only one of the discrete gradients used in the DG method is unsymmetric then the DG method is of first order.This is shown by the blue circle-marked line.If all discrete gradients are symmetric, the method is of second order as well as the implicit midpoint rule and the Verlet scheme.For the energy plot on the right-hand side, we computed the solutions up to time T = 10 with step size τ = 0.005.All DG methods preserve the energy up to round-off error.The implicit midpoint rule and the Verlet scheme deviate from the constant energy at the beginning that should be preserved.Since the Verlet scheme deviates significantly more than the implicit midpoint rule, we also calculated the energy with LAMMPS.The energy behaviour turned out to be exactly the same.Setting the NVE ensemble in LAMMPS, the red, solid energy curve is the outcome, which coincides with our own implementation of the Verlet scheme.The implicit equation in ( 14) is again solved with the simplified Newton method, (17), where the approximate Jacobian J F , cf. (18), includes the full Hessian with respect to the Lennard-Jones potentials and the bond potentials but omits the part with respect to the bond angle and dihedral angle potentials.

Parallelisation of DG methods
In order to show the usefulness of DG methods in molecular dynamics, it is indispensable to take care of parallelisation.Many codes for molecular dynamics simulations are based on the basic parallel treatment of short-range forces (e.g.[4,31,42,55]).5.1.Parallelisation of the evaluation of discrete gradients for Lennard-Jones forces with cut-off.In this section, we discuss the parallelisation of Lennard-Jones forces, induced by the standard potential, cf.(24).Since we are only interested in the short-range part of the potential and since we will also make use of the Hessian, the Lennard-Jones potential with cut-off function should be twice continuously differentiable.For this reason, we will use the switching function proposed in [37] as .Results of the experiment with two united-atom butane molecules: the error versus the time step is shown on the left-hand side for the Verlet scheme, the symmetric discrete gradient (DG) scheme, the midpoint rule, and an unsymmetric simple discrete gradient (DG) method.On the right-hand side, the energy is shown over the time span [0, 10] for step size τ = 0.005 for the Verlet scheme, the midpoint rule and the two discrete gradient (DG) schemes.
with r m = rcut 2 .The function and the first two derivatives restricted to [r m , r cut ] read The Lennard-Jones potential U with switching-function, i.e.V (r) = U (r) • s(r), is a twice continuously differentiable short-range potential.This switching function is available in LAMMPS as pair style lj/mdf.The Lennard-Jones interactions are implemented with the linked cell method as described, e.g., in chapter 3 of [17].Since DG methods are implicit methods, the particle structure is extended by the possible future positions of the particles during the iterative solution of equation (14).Due to this, Lennard-Jones forces pose a special challenge with DG methods or implicit methods, in general.While a border neighbourhood of one cell is enough for the Verlet scheme, we need a border neighbourhood of two cells for DG methods.The reason is that the particle structure not only carries the position at the given time, but also the position of the next step.Two particles could become close in the next step that are not close in the given time step.This is illustrated in figure 10.Since possible future positions are needed for the evaluation of the discrete gradient (20) in the difference (19), particles with a distance of 2r cut need to be known, when we assume that the step size is chosen so small that the particles cannot travel for more than 2  3 r cut in a time step.To sweep all particles in a border neighbourhood of two cells of a given cell is enough to catch these events.As an alternative, one might use cells with dimensions larger than 2r cut .Then a border neighbourhood with width one cell would be enough for the larger cells.We will stick with the smaller cells.There are some computations, for example the potential with the cut-off function, where we only need a border neighbourhood of one cell.Hence, using the smaller cells saves a bit of computing time.
For the parallelisation, domain decomposition is used.If we decompose the two-dimensional domain in figure 11 in six larger parts based on the cells given by the linked cell method, every processor only knows the particles in its domain.Since each processor might run on its own node with its own memory in a cluster computer, the access to the data of adjacent processors is not immediate (cf.[2]).The standard technique is to extend the domain that the processor is handling by further cells, the border neighbourhood, and to retrieve the necessary information from the adjacent processors in these cells.The current processor also has to send the information needed by the adjacent processors from his domain to the neighbours.For the exchange of the data between processors, message passing is used.That is, the processors send messages to each other.This is standardised in the message passing interface (MPI) (cf.[38]).Due to the discussion above, we need to extend the processor's domain by a border neighbourhood of the size of two cells as i j i j i j Figure 10.Linked cell method: Simulation domain is decomposed into square cells of size r cut × r cut .The dark-shaded circle is the cut-off radius r cut about the red particle i.On the left-hand side, the current time step is shown.For the standard force computation, only the particles in the 3 × 3 grid of light-gray cells need to be taken into account.In the middle, the situation at the next time step is shown.The future positions i of particle i and j of particle j, respectively, are now within the cut-off range.While particle i and j of the current time step do not interact, they interact after they moved to the positions i and j in the next time step.In the discrete gradient method, the positions in the next time step are needed to compute the discrete gradient.Therefore, a 5 × 5 grid of cells around the cell with the red particle i needs to be taken into account for DG methods, if the particles are assumed to travel not more than 2/3 of the cut-off in one time step.The cut-off radius of the DG method around the red particle i and the cells that need to be taken into account are shown on the right-hand side.
Figure 11.Decomposition of the simulation domain: the domain Ω is divided in cells indicated by thin lines.The domain is subdivided further into six subdomains as indicated by the thick lines.Each of the six subdomains is assigned to its own processor that handles the computations in this subdomain according to the linked cell method.
shown in figure 12 on the left hand side.We will conduct three-dimensional experiments.The corresponding domain and its neighbourhood is shown in figure 12 on the right-hand side.

Collision of two bodies.
As a test problem for the parallelisation, we use the collision of two bodies as described in section 4.5.1 of [17].In figure 13 on the right-hand side, the initial configuration of the experiment is shown.The two bodies consist of 10 × 10 × 10 and 10 × 30 × 30 cubic cells, with 4000 and 36000 particles, respectively.Each cell contains four particles in a face-centered cubic grid, as shown on the left-hand side of figure 13.The four particles, which are counted for this cell, are the lower left corner as well as the particles in the center of the front, bottom and left face.The others belong to regular grids spanned by these four particles.The shortest distance in the grid is 2 1/6 σ, according to the equilibrium length of the Lennard-Jones potential.At the beginning of the simulation, the smaller body moves Figure 12.Subdomain and border neighbourhood: On the left-hand side, a typical subdomain is shown as the square with the white cells.The processor responsible for this domain needs the data from the adjacent domains.A border neighbourhood of cells, given in gray, is added to the subdomain.The border neighbourhood is filled with copies of the particles in the adjacent domains.The adjacent processors have to send the data to the current processor by messages, according to the message passing paradigm standardised in the message passing interface (MPI), cf.[38].For implicit methods and DG methods, a border neighbourhood of two cells is necessary in order to compute forces by the linked cell method.
On the right-hand side, the situation for a three-dimensional subdomain is illustrated.with a high velocity v towards the resting larger body.The simulation cell of size [0, 150σ] 3 is equipped with periodic boundary conditions.All data are given in table 2. Several snapshots of the simulation are shown in figure 14.The simulation has been conducted with four processors.For our proof-of-concept implementation, we used four standard personal computers (PCs) as processors that had been connected by a one gigabit local area network (1 GB LAN).The particles in figure 14 are color-coded with respect to the processor that handles the particles.The simulation is run up to (scaled) time T = 30.All results are given in scaled quantities.In figure 15, one can see that the DG method preserves the energy very well.The implicit midpoint rule overestimates the energy when the small body penetrates the Table 2. Parameters for the simulation of the collision larger body.The Verlet scheme underestimates the energy during this phase.If the larger body is destroyed, the energy computed by the midpoint rule decreases and the energy computed by the Verlet scheme increases.In order to check our computation, we also conduct the same experiment with LAMMPS and only one processor (no parallelisation).The observed energy behaviour of the independent Verlet implementation in LAMMPS shows exactly the same energy curve, including the small peaks later on.That is, only the discrete gradient method is able to simulate a true microcanonical ensemble.The implicit equation in ( 14) is solved with the Newton method, (15), with the full Jacobian (16).The action of the Jacobian on a vector is directly computed.This way, one can transfer the linked cell method to the of the action of the Jacobian on a vector.The total linear momentum is preserved by all three methods.The total angular momentum can not be preserved, by any of the methods, due to the periodic boundary conditions.Only for free space or repelling boundary conditions, methods can preserve the total angular momentum in a MD simulation.

5.3.
Parallelisation of the evaluation of discrete gradients for bonded forces.The parallelisation of the bonded forces is simpler in the sense that it works in the standard way, which is illustrated in figure 16.On the left-hand side of figure 16, the domain with its border neighbourhood is shown.Then the adjacent processors send the necessary particles, while this processor sends the particles needed by neighbouring processors.This is the state in the middle of figure 16.
Then the particles in the border neighbourhood are linked based on the molecule numbers and atom numbers stored with the particles.After that, the current processor sees all data needed to compute the discrete gradients with respect to bonded forces for the particles in its domain.Recall, that the new positions are also stored with the corresponding particles.This state is illustrated on the right-hand side of figure 16.After their computation, the discrete gradients are also stored with the particles.Then, the particles in the border neighbourhood are separated again, which brings us back to the state in the middle of the figure.Finally, the particles in the border neighbourhood are deleted, because they are no longer needed.This procedure also works for the matrix-free computation of the Hessian of the bonded potentials times a vector.
5.4.Experiment with butane.We run an experiment with 64 united-atom butane molecules.The initial condition and a snapshot of the simulation at time T = 4 can be seen in figure 17.The data for the potentials have been chosen as in table 1.Also the potentials are chosen as before, i.e., the angle potential is given in equation ( 27) and the torsion potential is given in equation (30).The simulation is set in a periodic box of size 4r cut , r cut = 2.5σ, and σ as in table 1 in scaled variables.The whole simulation is scaled as before in the butane simulation.The deviation from the exact energy over the simulation time with step size τ = 0.0001 is shown in figure 18.The simulation has been run with four processors.While the DG methods preserve the energy up to round-off error, the implicit midpoint scheme and the Verlet scheme deviate from the constant energy.The peaks in the energy of the Verlet scheme are real.We also computed the energy for the given initial value with LAMMPS with one processor.This simulation reproduced exactly the same peaks as our code.This means that the particles do not evolve with respect to a genuine NVE ensemble at these peaks.The solid red line is the closest a standard molecular dynamics package with the Verlet scheme and setting the ensemble to the NVE ensemble can get.If one wishes more accuracy with respect to energy preservation, discrete gradient methods are an interesting alternative.

Conclusion
This work shows that all standard short-range interactions in a classical conservative molecular dynamics simulation can be computed by discrete gradient methods.These methods reliably preserve the total energy in the system, along with the total linear momentum and the total angular momentum in free space simulations.The simple and unified idea to construct the discrete gradients is to express all standard short-range interactions in terms of distances between atoms.The new discrete gradients for the dihedral angle potentials also suggest an interesting way to compute the gradient of dihedral angle potentials based on distances for the use in standard time integration schemes.Furthermore, the discrete gradient methods can be parallelised.We proposed the necessary changes to the linked cell method for the parallel evaluation of the discrete gradients with respect to truncated Lennard-Jones potentials as well as the necessary changes for bonded forces.As a result, the proposed DG methods can be computed in parallel.Energy preservation for the collision of two bodies: the deviation from the exact energy is shown over the time span [0, 30] for step size τ = 0.001 for the Verlet scheme, the midpoint rule and the discrete gradient (DG) scheme.All three methods have been computed in parallel with four processors on a cluster computer, cf.[2].
Figure 16.Reconnection and separation of atoms in the border neighbourhood: If the subdomain of the processor receives atoms that are bonded within a molecule, the processor needs to reattach the atoms at the correct sites.This is illustrated here from left to right in a two-dimensional example.First the subdomain with the knowledge of the processor is shown.In the middle, the situation is shown that results after the processor has received the particles necessary for the computation of the bonded forces from the adjacent processors.With the help of the molecule and particle numbers, the processor reattaches the molecules in the correct way.The result is illustrated on the right-hand side.After the computation of the bonded forces, the process is reversed from right to left.

Figure 1 .Figure 2 .
Figure 1.Initial conditions for the experiment with two Lennard-Jones particles: the positions and velocities are given on the left-hand side.The first row refers to the particle number, the next three rows to the three coordinates.On the right-hand side, the initial positions are given as a plot made by the open visualization tool (OVITO), cf.[54].

Figure 4 .Figure 5 .
Figure 4. Initial conditions for the experiment with two water-like molecules: the positions and velocities are given on the left-hand side.In the positions section, the first row numbers the particles.The second row assigns molecule numbers.If the number is the same, a bond is added between the particles.The third to fifth row are the coordinates of the positions.In the velocities section, the particle number is followed by velocities.On the right-hand side, the initial configuration is given as a plot by the open visualization tool (OVITO), cf.[54] k b = 17.5 MJ mol•nm 2 , r0 = 1.53 Å, bond potential, k θ = 65 kJ mol , θ0 = 109.47degree, angle potential,

Figure 9
Figure 9. Results of the experiment with two united-atom butane molecules: the error versus the time step is shown on the left-hand side for the Verlet scheme, the symmetric discrete gradient (DG) scheme, the midpoint rule, and an unsymmetric simple discrete gradient (DG) method.On the right-hand side, the energy is shown over the time span [0, 10] for step size τ = 0.005 for the Verlet scheme, the midpoint rule and the two discrete gradient (DG) schemes.

Figure 13 .
Figure 13.Sketch of a face-centered cubic lattice on the left-hand side.On the right-hand side, the initial positions of the particles that form the smaller and the large body are shown in a plot made by the open visualization tool (OVITO), cf.[54].

Figure 14 .
Figure 14.Simulation of the collision of two bodies at times t = 0, 4, 8, 12, 16, 20 from top left to bottom right.The simulation is run with four processors identified by color.The same color means that the same processor is handling the particles.
Figure 15.Energy preservation for the collision of two bodies: the deviation from the exact energy is shown over the time span [0, 30] for step size τ = 0.001 for the Verlet scheme, the midpoint rule and the discrete gradient (DG) scheme.All three methods have been computed in parallel with four processors on a cluster computer, cf.[2].

Figure 17 .Figure 18 .
Figure 17.Simulation of butane in a periodic box at times t = 0, 20.The plot is made by the open visualization tool (OVITO), cf.[54]