1 Introduction

In molecular dynamics (MD) simulations, or, more generally, in particle methods, one is usually not interested in the trajectory of a single particle but in derived and/or averaged quantities that are often related to geometric properties of the underlying equations of motion. For a meaningful simulation, the preservation of these geometric properties is important (e.g. [21, 30, 34]). In a microcanonical (NVE) ensemble, important macroscopic variables are the total number of particles (N), the system’s volume (V) and the total energy (E). Microcanonical MD simulations should therefore preserve the total energy, which corresponds to a first integral of the equations of motion (e.g. [51]). In the Hamiltonian formulation, the total energy or Hamiltonian \(H:\mathbb {R}^d\times \mathbb {R}^d \rightarrow \mathbb {R}\), the positions \({\textbf {q}} \in \mathbb {R}^d\) and the momenta \({\textbf {p}} \in \mathbb {R}^d\) determine the equations of motion as \({\textbf {q}}'=\nabla _{{\textbf {p}}}H({\textbf {q}},{\textbf {p}})\), \({\textbf {p}}'=-\nabla _{\textbf {q}} H({\textbf {q}},{\textbf {p}})\). In addition to the preserved energy, Hamiltonian systems possess another important structural property, the symplecticity of the Hamiltonian flow (cf. [29, 52]). Unfortunately, the symplectic structure and the total energy can generally not be preserved exactly at the same time. A class of geometric integrators that can preserve first integrals and Lyapunov functions exactly are discrete gradient methods. They were first considered as energy-conserving schemes (e.g. [12, 13, 24]) and have since been generalised to arbitrary first integrals of Hamiltonian and non-Hamiltonian systems (cf. [46]) and to Lyapunov functions (cf. [36, 56]). Other methods that are able to preserve integrals and Lyapunov functions are projection methods (e.g. [18]), which are related to discrete gradient methods (cf. [43]).
Among their many applications, discrete gradient methods can be used to preserve the energy of suitably discretised variational partial differential equations (e.g. [8, 9, 40,41,42, 44, 60]) as well as to ensure the dissipation of gradient systems in image processing (e.g. [16, 17, 19, 49]). They have been generalised to manifolds (cf. [6, 7]) and inspire new ideas in smooth optimisation (e.g. [11, 48]) and deep learning (e.g. [5]).

Following this introduction, discrete gradients are discussed in Section 2. Their use for Hamiltonian systems in MD simulations is studied in Section 3. Special discrete gradients for molecular dynamics are presented in Section 4. The parallelisation of DG methods is discussed in Section 5. Finally, a brief conclusion is given in Section 6.

2 Discrete gradients in geometric integration

In order to remind the reader of discrete gradients, we follow [35].

Definition 1

(Gonzalez 1996) Let \(V: \mathbb {R}^n \rightarrow \mathbb {R}\) be continuously differentiable. The function \(\overline{\nabla }V: \mathbb {R}^n \times \mathbb {R}^n \rightarrow \mathbb {R}^n\) is a discrete gradient of V if and only if it is continuous and

$$ \left\{ \begin{array}{rcl} \langle \overline{\nabla }V(\textbf{u},\textbf{u}'),(\textbf{u}'-\textbf{u}) \rangle &=& V(\textbf{u}')-V(\textbf{u}), \\ \overline{\nabla }V(\textbf{u},\textbf{u}) &=& \nabla V(\textbf{u}), \end{array} \right. \qquad \text {for all} \qquad \textbf{u},\textbf{u}' \in \mathbb {R}^n . $$

The discrete gradient is symmetric if and only if

$$ \overline{\nabla }V(\textbf{u},\textbf{u}') = \overline{\nabla }V(\textbf{u}',\textbf{u}) \qquad \text {for all} \qquad \textbf{u},\textbf{u}' \in \mathbb {R}^n. $$

For continuously differentiable V, the second defining property follows from the first and might be omitted (cf. [10]). An interpretation of the following proposition, which can be found as Proposition 3.2 in [35], is that the component of any discrete gradient in the direction \((\textbf{u}'-\textbf{u}) /\Vert \textbf{u}'-\textbf{u}\Vert \) is \((V(\textbf{u}')-V(\textbf{u})) /\Vert \textbf{u}'-\textbf{u}\Vert \).

Proposition 1

\(\overline{\nabla }V(\textbf{u},\textbf{u}')\) is a discrete gradient if and only if it is continuous and

$$ \overline{\nabla }V(\textbf{u},\textbf{u}')=\frac{V(\textbf{u}')-V(\textbf{u})}{\Vert \textbf{u}'-\textbf{u}\Vert ^2}(\textbf{u}'-\textbf{u})+w(\textbf{u},\textbf{u}'), \qquad (\textbf{u} \ne \textbf{u}'), $$

where \(w(\textbf{u},\textbf{u}')\) is a vector-valued function such that

$$ \left\{ \begin{array}{l} \langle w(\textbf{u},\textbf{u}'), (\textbf{u}'-\textbf{u}) \rangle =0, \qquad (\textbf{u} \ne \textbf{u}'), \\ \lim _{\textbf{u}'\rightarrow \textbf{u}} \left( w(\textbf{u},\textbf{u}')-P_{(\textbf{u}'-\textbf{u})^\perp }\nabla V(\textbf{u}) \right) =0, \end{array} \right. $$

where \(P_{(\textbf{u}'-\textbf{u})^\perp }\) is the orthogonal projection onto the space perpendicular to \((\textbf{u}'-\textbf{u})\).

For the Euclidean inner product, \(\langle \textbf{u},\textbf{v} \rangle = \textbf{v}^T\textbf{u}\), for \(\textbf{u},\textbf{v} \in \mathbb {R}^n\), the projection can be written as the matrix

$$ P_{(\textbf{u}'-\textbf{u})^\perp }=I_n-\frac{(\textbf{u}'-\textbf{u})}{\Vert \textbf{u}'-\textbf{u}\Vert ^2}(\textbf{u}'-\textbf{u})^T\,, $$

where \(I_n\) is the \(n \times n\) identity matrix and \(\cdot ^T\) denotes the transpose. The following simple fact is quite useful.

Lemma 1

Let \(\overline{\nabla }V^i\) be discrete gradients for \(V^i\), \(i=1,\ldots ,N\). Then,

$$ \overline{\nabla }V =\sum _{i=1}^N \overline{\nabla }V^i \qquad \text {is a discrete gradient for} \qquad V=\sum _{i=1}^N V^i . $$

We omit the obvious proof of Lemma 1. Well-known examples of discrete gradients are the midpoint or Gonzalez discrete gradient (cf. [12])

$$\begin{aligned} \overline{V}_{MP}(\textbf{u},\textbf{u}') = \nabla V \left( \frac{\textbf{u}+\textbf{u}'}{2} \right) + \frac{ V(\textbf{u}')-V(\textbf{u})-\left\langle \nabla V \left( \frac{\textbf{u}+\textbf{u}'}{2} \right) , \textbf{u}'-\textbf{u}\right\rangle }{ \Vert \textbf{u}-\textbf{u}'\Vert ^2 } (\textbf{u}'-\textbf{u})\,, \end{aligned}$$
(1)

where \(\textbf{u} \ne \textbf{u}'\), the mean value discrete gradient (cf. [22])

$$\begin{aligned} \overline{V}_{MV}(\textbf{u},\textbf{u}')=\int _0^1 \nabla V((1-\xi )\textbf{u}+\xi \textbf{u}')\,d\xi , \qquad (\textbf{u} \ne \textbf{u}')\,, \end{aligned}$$
(2)

and the coordinate increment discrete gradient (cf. [24])

$$\begin{aligned} \overline{V}_{CI}(\textbf{u},\textbf{u}')= \begin{pmatrix} \frac{V(u_1',u_2,\cdots ,u_n) - V(u_1,u_2,\cdots ,u_n)}{u_1'-u_1} \\ \frac{V(u_1',u_2',\cdots ,u_n) - V(u_1',u_2,\cdots ,u_n)}{u_2'-u_2} \\ \vdots \\ \frac{V(u_1',\cdots ,u_{n-1}',u_n) - V(u_1',\cdots ,u_{n-1},u_n)}{u_{n-1}'-u_{n-1}} \\ \frac{V(u_1',\cdots ,u_{n-1}',u_n') - V(u_1',\cdots ,u_{n-1}',u_n)}{u_n'-u_n} \\ \end{pmatrix}\,, \end{aligned}$$
(3)

where \(0 /0\) is understood to be \(\partial V /\partial u_i\). The midpoint discrete gradient is symmetric, i.e. \(\overline{V}_{MP}(\textbf{u},\textbf{u}')=\overline{V}_{MP}(\textbf{u}',\textbf{u})\) for all \(\textbf{u} \ne \textbf{u}'\), as is the mean value discrete gradient. The coordinate increment discrete gradient is not symmetric, but it can be symmetrised (cf. [37])

$$ \overline{V}_{SI}(\textbf{u},\textbf{u}')= \frac{1}{2} \left( \overline{V}_{CI}(\textbf{u},\textbf{u}') + \overline{V}_{CI}(\textbf{u}',\textbf{u}) \right) . $$
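For concreteness, the midpoint discrete gradient (1) can be implemented in a few lines. The following Python sketch (with our own naming, not taken from any particular library) evaluates (1) for a simple non-quadratic potential and checks the first defining property of Definition 1 numerically.

```python
import numpy as np

def midpoint_dg(V, gradV, u, up):
    """Gonzalez midpoint discrete gradient (1); u != up is assumed."""
    g = gradV(0.5 * (u + up))          # gradient at the midpoint
    d = up - u
    corr = (V(up) - V(u) - g @ d) / (d @ d)
    return g + corr * d                # correction enforces Definition 1

# Illustrative potential V(u) = ||u||^4 (non-quadratic, so the correction matters)
V = lambda u: (u @ u) ** 2
gradV = lambda u: 4.0 * (u @ u) * u

u  = np.array([1.0, 0.5])
up = np.array([0.2, -0.3])
g = midpoint_dg(V, gradV, u, up)

# First defining property of Definition 1:
assert np.isclose(g @ (up - u), V(up) - V(u))
```

The correction term lies along \(\textbf{u}'-\textbf{u}\), in line with the decomposition of Proposition 1.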

A discrete gradient can be used as the discretised version of the gradient of a first integral or the gradient of a Lyapunov function of an ordinary differential equation (ODE). If the ODE is smooth enough and possesses a first integral or Lyapunov function V, then the ODE can be written as

$$\begin{aligned} \textbf{y}'=f(\textbf{y})=A(\textbf{y})\nabla V(\textbf{y}), \end{aligned}$$
(4)

where \(A(\textbf{y})\) is an equally smooth antisymmetric matrix, when V is a first integral, and a negative semidefinite matrix, when V is a Lyapunov function (cf. [35]). A discrete gradient method then reads

$$\begin{aligned} \textbf{y}_{n+1}=\textbf{y}_n + \tau \tilde{A}(\textbf{y}_n,\textbf{y}_{n+1},\tau )\overline{\nabla }V(\textbf{y}_n,\textbf{y}_{n+1}), \qquad n=0,1,2,\ldots \,, \end{aligned}$$
(5)

where \(\tilde{A}(\textbf{y},\textbf{y},0)=A(\textbf{y})\), for consistency, and \(\overline{\nabla }V\) is a discrete gradient for V. If the discrete gradient is symmetric and \(\tilde{A}(\textbf{y}_{n+1},\textbf{y}_n,-\tau )=\tilde{A}(\textbf{y}_n,\textbf{y}_{n+1},\tau )\) holds for all possible values, then the discrete gradient method (5) is called time-symmetric or self-adjoint (cf. Definition 3.2 in [36]).

3 DG methods in MD simulations

The equations of motion in molecular dynamics can be conveniently stated in Hamiltonian form. Given an arbitrary (smooth) Hamiltonian function \(H: \mathbb {R}^d \times \mathbb {R}^d \rightarrow \mathbb {R}\) on the phase space \(\mathbb {R}^d \times \mathbb {R}^d\), \(d \ge 1\), the corresponding Hamiltonian equations of motion are

$$\begin{aligned} \begin{array}{rcl} \textbf{q}' &=& \nabla _\textbf{p} H(\textbf{q},\textbf{p}), \\ \textbf{p}' &=& - \nabla _\textbf{q} H(\textbf{q},\textbf{p}). \end{array} \end{aligned}$$
(6)

The Hamiltonian corresponds to the total energy of the system that is preserved in a conservative simulation. In order to recognise the fact that (6) is a special case of (4), we set \(\textbf{y}=(\textbf{q},\textbf{p})^T\) and define the matrix

$$ J= \begin{bmatrix} 0 & I_d \\ -I_d & 0 \end{bmatrix} \in \mathbb {R}^{2d \times 2d}, $$

where \(I_d\) is the identity matrix of dimension d. With these definitions, system (6) reads

$$\begin{aligned} \textbf{y}' = J \nabla _\textbf{y} H(\textbf{y}) . \end{aligned}$$
(7)

Due to

$$ \frac{d}{dt}H(\textbf{y}(t))= \nabla _\textbf{y} H(\textbf{y}(t))^T \textbf{y}'(t) = \nabla _\textbf{y} H(\textbf{y}(t))^T J \nabla _\textbf{y} H(\textbf{y}(t)) = 0\,, $$

the Hamiltonian H (energy) is preserved along solutions of the system. With an arbitrary discrete gradient satisfying Definition 1, the corresponding discrete gradient method reads

$$\begin{aligned} \textbf{y}^{n+1} = \textbf{y}^n + \tau J\overline{\nabla }H (\textbf{y}^n,\textbf{y}^{n+1})\,, \end{aligned}$$
(8)

with step size \(\tau > 0\). The discrete gradient method also preserves the Hamiltonian, since Definition 1 gives

$$\begin{aligned} H(\textbf{y}^{n+1})-H(\textbf{y}^n)&= \overline{\nabla }H(\textbf{y}^n,\textbf{y}^{n+1})^T(\textbf{y}^{n+1}-\textbf{y}^n) \\&\overset{(8)}{=} \tau \overline{\nabla }H(\textbf{y}^n,\textbf{y}^{n+1})^T J \overline{\nabla }H(\textbf{y}^n,\textbf{y}^{n+1}) = 0. \end{aligned}$$

The method (8) is time-symmetric, whenever the chosen discrete gradient is symmetric.
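The energy preservation of scheme (8) is easy to observe numerically. The following Python sketch is a minimal illustration: the pendulum Hamiltonian \(H(q,p)=p^2/2-\cos q\) is our own choice of test problem, the midpoint discrete gradient (1) is used for \(\overline{\nabla }H\), and the implicit step is solved by simple fixed-point iteration.

```python
import numpy as np

def H(y):
    q, p = y
    return 0.5 * p**2 - np.cos(q)

def gradH(y):
    q, p = y
    return np.array([np.sin(q), p])

def dg_H(y, yp):
    """Midpoint discrete gradient (1) applied to H."""
    d = yp - y
    g = gradH(0.5 * (y + yp))
    if d @ d < 1e-30:                 # second property of Definition 1
        return g
    return g + (H(yp) - H(y) - g @ d) / (d @ d) * d

J = np.array([[0.0, 1.0], [-1.0, 0.0]])   # canonical structure matrix

def dg_step(y, tau, iters=50):
    """One step of the discrete gradient method (8), via fixed-point iteration."""
    yp = y.copy()
    for _ in range(iters):
        yp = y + tau * J @ dg_H(y, yp)
    return yp

y = np.array([1.0, 0.0])    # pendulum released from q = 1
E0 = H(y)
for _ in range(200):
    y = dg_step(y, 0.05)

assert abs(H(y) - E0) < 1e-12   # energy preserved up to solver tolerance
```

At the fixed point, the cancellation \(\overline{\nabla }H^T J \overline{\nabla }H = 0\) above holds exactly, so the only energy drift comes from the finite solver tolerance and rounding.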

3.1 Separable Hamiltonian systems

If the Hamiltonian is separable, i.e.

$$\begin{aligned} H(\textbf{q},\textbf{p})=T(\textbf{p})+V(\textbf{q})\,, \end{aligned}$$
(9)

one can apply a different discrete gradient to T or V, respectively, in order to obtain a discrete gradient for \(H(\textbf{q},\textbf{p})\). System (7) now reads

$$\begin{aligned} \begin{bmatrix} \textbf{q} \\ \textbf{p} \end{bmatrix}' = \begin{bmatrix} 0 & I \\ -I & 0 \end{bmatrix} \begin{bmatrix} \nabla _\textbf{q} V \\ \nabla _\textbf{p} T \end{bmatrix}. \end{aligned}$$
(10)

Given two discrete gradients \(\overline{\nabla }_{\textbf{q}}V\) for V and \(\overline{\nabla }_{\textbf{p}}T\) for T, respectively, the discrete gradient method is

$$\begin{aligned} \begin{array}{rcl} \textbf{q}^{n+1} &=& \textbf{q}^n + \tau \overline{\nabla }_{\textbf{p}}T(\textbf{p}^n,\textbf{p}^{n+1}), \\ \textbf{p}^{n+1} &=& \textbf{p}^n - \tau \overline{\nabla }_{\textbf{q}}V(\textbf{q}^n,\textbf{q}^{n+1}). \end{array} \end{aligned}$$
(11)

Method (11) exactly preserves the energy, which is noted in the following lemma.

Lemma 2

For the separable Hamiltonian system (10) and two discrete gradients \(\overline{\nabla }_{\textbf{q}}V\) and \(\overline{\nabla }_{\textbf{p}}T\), method (11) preserves the energy exactly, i.e.

$$ H(\textbf{q}^{n+1},\textbf{p}^{n+1})=H(\textbf{q}^n,\textbf{p}^n), \qquad n=0,1,2,\ldots . $$

Proof

From Definition 1 followed by (11), we find

$$\begin{aligned} H(\textbf{q}^{n+1}&,\textbf{p}^{n+1}) - H(\textbf{q}^n,\textbf{p}^n) = T(\textbf{p}^{n+1})-T(\textbf{p}^n)+V(\textbf{q}^{n+1})-V(\textbf{q}^n) \\&= \overline{\nabla }_{\textbf{p}}T(\textbf{p}^n,\textbf{p}^{n+1})^T(\textbf{p}^{n+1}-\textbf{p}^n) + \overline{\nabla }_{\textbf{q}}V(\textbf{q}^n,\textbf{q}^{n+1})^T(\textbf{q}^{n+1}-\textbf{q}^n) \\&= -\tau \overline{\nabla }_{\textbf{p}}T(\textbf{p}^n,\textbf{p}^{n+1})^T\overline{\nabla }_{\textbf{q}}V(\textbf{q}^n,\textbf{q}^{n+1}) +\tau \overline{\nabla }_{\textbf{q}}V(\textbf{q}^n,\textbf{q}^{n+1})^T\overline{\nabla }_{\textbf{p}}T(\textbf{p}^n,\textbf{p}^{n+1}) \\&= 0. \end{aligned}$$

\(\square \)

The proof of Lemma 2 also shows that

$$ \begin{bmatrix} \overline{\nabla }_{\textbf{q}}V \\ \overline{\nabla }_{\textbf{p}}T \end{bmatrix} = \overline{\nabla }H $$

is a discrete gradient for H for any choice of discrete gradients \(\overline{\nabla }_{\textbf{q}}V\) and \(\overline{\nabla }_{\textbf{p}}T\) whenever the Hamiltonian H is separable, cf. (9). If both discrete gradients in (11) are symmetric, then the method is time-symmetric.

In molecular dynamics, the Hamiltonian often is of the even simpler form

$$\begin{aligned} H(\textbf{q},\textbf{p})=\frac{1}{2}\textbf{p}^TM^{-1}\textbf{p} + V(\textbf{q}), \qquad \text {i.e.} \quad T(\textbf{p})=\frac{1}{2}\textbf{p}^TM^{-1}\textbf{p} \end{aligned}$$
(12)

is a quadratic function that corresponds to the kinetic energy. Here, \(M^{-1}\) is a diagonal matrix containing the inverses of the masses of the corresponding particles. For this quadratic kinetic energy, the discrete gradients (1), (2) and (3) all reduce to the midpoint rule. Choosing the midpoint discrete gradient for \(\overline{\nabla }_{\textbf{p}}T\), one thus obtains

$$\begin{aligned} \begin{array}{rcl} \textbf{q}^{n+1} &=& \textbf{q}^n + \tau M^{-1}\frac{\textbf{p}^{n+1}+\textbf{p}^n}{2}, \\ \textbf{p}^{n+1} &=& \textbf{p}^n - \tau \overline{\nabla }_{\textbf{q}}V(\textbf{q}^n,\textbf{q}^{n+1}). \end{array} \end{aligned}$$
(13)

Inserting the second equation into the first leads to the system

$$\begin{aligned} \begin{array}{rcl} \textbf{q}^{n+1} &=& \textbf{q}^n + \tau M^{-1}\textbf{p}^n - \frac{\tau ^2}{2}M^{-1}\overline{\nabla }_{\textbf{q}}V(\textbf{q}^n,\textbf{q}^{n+1}), \\ \textbf{p}^{n+1} &=& \textbf{p}^n - \tau \overline{\nabla }_{\textbf{q}}V(\textbf{q}^n,\textbf{q}^{n+1}), \end{array} \end{aligned}$$
(14)

which will be used for the computation. The first equation is implicit in \(\textbf{q}^{n+1}\) and takes some effort to solve numerically. The momenta \(\textbf{p}^{n+1}\) are easily computed once the first equation has been solved. If the first appearance of the discrete gradient were replaced by the true gradient at \(\textbf{q}^n\) and the second discrete gradient by the average of the true gradients at \(\textbf{q}^{n+1}\) and \(\textbf{q}^n\), respectively, we would recover the well-known Velocity-Störmer-Verlet method. Method (14) might therefore be called the Velocity-DG method.
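A minimal illustration of the Velocity-DG method (14): the sketch below uses a one-dimensional quartic potential (our own choice of example, not from the text) and exploits the fact that, in one dimension, the first condition of Definition 1 forces every discrete gradient to be the difference quotient. The implicit first equation of (14) is solved by fixed-point iteration.

```python
import numpy as np

# Quartic potential V(q) = q^4 / 4 with M = 1 (illustrative 1D example)
V = lambda q: 0.25 * q**4
gradV = lambda q: q**3

def dg_V(q, qp):
    """In 1D, the only discrete gradient is the difference quotient."""
    if abs(qp - q) < 1e-14:
        return gradV(q)               # second property of Definition 1
    return (V(qp) - V(q)) / (qp - q)

def velocity_dg_step(q, p, tau, iters=60):
    """One step of the Velocity-DG method (14), M = 1."""
    qp = q
    for _ in range(iters):            # fixed-point iteration for q^{n+1}
        qp = q + tau * p - 0.5 * tau**2 * dg_V(q, qp)
    pp = p - tau * dg_V(q, qp)        # explicit momentum update
    return qp, pp

q, p = 1.0, 0.0
E0 = 0.5 * p**2 + V(q)
for _ in range(500):
    q, p = velocity_dg_step(q, p, 0.02)

assert abs(0.5 * p**2 + V(q) - E0) < 1e-12   # energy preserved
```

For small step sizes, the fixed-point map is a strong contraction, so few iterations suffice; Newton-type solvers, as discussed below, are preferable for larger steps.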

Proposition 2

The method (13) or method (14), respectively, exactly preserves the Hamiltonian (12). If the discrete gradient \(\overline{\nabla }_{\textbf{q}}V\) is symmetric, then the scheme is time-symmetric (reversible). If the discrete gradient \(\overline{\nabla }_{\textbf{q}}V\) is symmetric and sufficiently smooth, then the method is of second order for sufficiently smooth V.

Proof

The scheme (14) exactly preserves the Hamiltonian (12), since it is just a reformulation of scheme (11) for this Hamiltonian and Lemma 2 applies. Exchanging \(\tau \leftrightarrow -\tau \), \(\textbf{q}^{n+1} \leftrightarrow \textbf{q}^n\) and \(\textbf{p}^{n+1} \leftrightarrow \textbf{p}^n\) shows the time-symmetry for symmetric discrete gradients. Finally, Definition 1 and the symmetry of the scheme show second-order accuracy (cf. Theorem 8.10 in [20]).\(\square \)

A more elaborate proof of second order for the midpoint discrete gradient applied to the full system (7) can be found as the proof of Theorem 8.5.4 in [56]. For separable systems, this corresponds to a special case of our Proposition 2.

Analogous to the elimination of the velocities (momenta) in the Verlet algorithm, one may derive a two-step formulation by adding the first line of (14) to the one obtained with negative time step \(-\tau \), i.e.

$$ \textbf{q}^{n+1}-2\textbf{q}^n+\textbf{q}^{n-1} = -\frac{\tau ^2}{2}M^{-1} \left( \overline{\nabla }_{\textbf{q}}V(\textbf{q}^n,\textbf{q}^{n+1})+\overline{\nabla }_{\textbf{q}}V(\textbf{q}^n,\textbf{q}^{n-1}) \right) . $$

For the solution of the first equation in (14), the equation is transformed to \(F(\textbf{u})=\textbf{0}\), where

$$ F(\textbf{u})=\textbf{u}-\textbf{q}^n-\tau M^{-1}\textbf{p}^n+\frac{\tau ^2}{2}M^{-1}\overline{\nabla }_{\textbf{q}}V(\textbf{q}^n,\textbf{u}) $$

and the Newton method is applied in the vicinity of \(\textbf{q}^n\). This leads to the iteration

$$\begin{aligned} \frac{\partial F}{\partial \textbf{u}}(\textbf{u}^m) \triangle \textbf{u}^m = -F(\textbf{u}^m), \qquad \triangle \textbf{u}^m = \textbf{u}^{m+1}-\textbf{u}^m, \qquad m=0,1,2,\ldots \,, \end{aligned}$$
(15)

where \(\textbf{u}^0 \approx \textbf{q}^n\) and

$$\begin{aligned} \frac{\partial F}{\partial \textbf{u}}(\textbf{u}^m) = I+\frac{\tau ^2}{2}M^{-1}\frac{\partial \overline{\nabla }_{\textbf{q}}V}{\partial \textbf{u}} (\textbf{q}^n,\textbf{u}^m) \, . \end{aligned}$$
(16)

In order to reduce the computational work, one can also use the quasi-Newton iteration

$$\begin{aligned} J_F(\textbf{u}^m) \triangle \textbf{u}^m = -F(\textbf{u}^m), \qquad \triangle \textbf{u}^m = \textbf{u}^{m+1}-\textbf{u}^m, \qquad m=0,1,2,\ldots \,, \end{aligned}$$
(17)

where \(J_F(\textbf{u}^m)\) is an approximation to the full Jacobian. For example, for the midpoint discrete gradient (1),

$$\begin{aligned} J_F(\textbf{u}^m) = I+\frac{\tau ^2}{4}M^{-1}\frac{\partial \nabla V}{\partial \textbf{u}} \big (\frac{\textbf{u}^m+\textbf{q}^n}{2}\big ) \end{aligned}$$
(18)

might be used. This is just the Jacobian that would occur in the implicit midpoint rule. This way, a loop over all particles to compute the potential \(V(\textbf{u}^m)\), the gradient \(\nabla V(\textbf{u}^m)\) and the norm of the difference of the positions is avoided. The matrices \(\frac{\partial F}{\partial \textbf{u}}\) and \(J_F(\textbf{u}^m)\) are symmetric, and the linear systems in (15) and (17), respectively, can be solved efficiently by the conjugate gradient (CG) method (cf. [23]). Newton iterations also appear in the schemes SHAKE and RATTLE that are custom-built to enforce constraints in molecular dynamics simulations (cf. [1, 50]); SHAKE and RATTLE are often referred to interchangeably, since they have been shown to be equivalent in [31].
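In one dimension, the quasi-Newton iteration (17) with the approximate Jacobian (18) can be sketched as follows; the quartic potential and all function names are our illustrative choices, not from the text.

```python
import numpy as np

# Illustrative 1D setup: V(u) = u^4 / 4, M = 1
V   = lambda u: 0.25 * u**4
dV  = lambda u: u**3
d2V = lambda u: 3.0 * u**2

def dgV(q, u):
    """1D discrete gradient = difference quotient."""
    return dV(q) if abs(u - q) < 1e-14 else (V(u) - V(q)) / (u - q)

def solve_position(q, p, tau, tol=1e-14):
    """Solve F(u) = 0 from (14) with the approximate Jacobian (18)."""
    u = q                                         # start in the vicinity of q^n
    for _ in range(100):
        F = u - q - tau * p + 0.5 * tau**2 * dgV(q, u)
        if abs(F) < tol:
            break
        JF = 1.0 + 0.25 * tau**2 * d2V(0.5 * (u + q))  # midpoint-rule Jacobian (18)
        u -= F / JF                               # quasi-Newton update (17)
    return u

u = solve_position(1.0, 0.5, 0.05)
F = u - 1.0 - 0.05 * 0.5 + 0.5 * 0.05**2 * dgV(1.0, u)
assert abs(F) < 1e-13     # the residual of the implicit equation vanishes
```

In the full multi-dimensional setting, the division by \(J_F\) is replaced by a CG solve with the symmetric matrix (18), applied matrix-free.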

4 Discrete gradients for molecular dynamics

In this section, we discuss previously known as well as new discrete gradients that are designed for use in particle and molecular dynamics. A basic idea for discrete gradients custom-built for molecular dynamics is to mimic the standard forces by discrete gradients. That is, for a pairwise force, the discrete gradient is built upon two particles; for angle forces, three particles are involved, and for torsion forces, four. Therefore, we consider N particles with masses \(m_1,\ldots ,m_N\), positions \(\textbf{q}_i\), \(i=1,\ldots ,N\), and momenta \(\textbf{p}_i\), \(i=1,\ldots ,N\). Here, \(\textbf{p}_i\) and \(\textbf{q}_i\) are three-dimensional vectors. The momenta are given as \(\textbf{p}_i=m_i\textbf{v}_i\), where \(\textbf{v}_i\) are the velocities of the particles. We designate by \(\textbf{r}_{ij}=\textbf{q}_j-\textbf{q}_i\) the vector that points from particle i to particle j. Its length is denoted by \(r_{ij}=\Vert \textbf{q}_j-\textbf{q}_i\Vert \), where \(\Vert \cdot \Vert \) designates the Euclidean norm. The connection with the vectors used in the previous sections is given as

$$ \textbf{q}=\begin{pmatrix} \textbf{q}_1 \\ \vdots \\ \textbf{q}_N \end{pmatrix}\,, \qquad \qquad \textbf{p}=\begin{pmatrix} \textbf{p}_1 \\ \vdots \\ \textbf{p}_N \end{pmatrix}. $$

The vector \(\textbf{q}\) collects the positions of the particles, and the vector \(\textbf{p}\) the momenta, respectively.

4.1 Discrete gradients for pairwise forces

A discrete gradient for pairwise forces has been known for quite some time; its first appearance is due to LaBudde and Greenspan [26,27,28]. Since then, this discrete gradient has been studied by several authors (e.g. [14, 47, 54, 55]). If \(V(r_{ij})\) is a pairwise potential for the particles i and j, then we obtain, with the finite difference

$$\begin{aligned} \Delta _{r_{ij}} V(\textbf{q},\textbf{q}') {:=} \frac{V(r_{ij}')-V(r_{ij})}{r_{ij}'-r_{ij}}\,, \end{aligned}$$
(19)

the discrete gradient

$$ \overline{\nabla }V(\textbf{q},\textbf{q}')= \begin{pmatrix} \overline{\nabla }_{\textbf{q}_1} V(\textbf{q},\textbf{q}') \\ \vdots \\ \overline{\nabla }_{\textbf{q}_N} V(\textbf{q},\textbf{q}') \end{pmatrix} $$

with the non-zero components

$$\begin{aligned} \overline{\nabla }_{\textbf{q}_i}V(\textbf{q},\textbf{q}')=-\Delta _{r_{ij}} V(\textbf{q},\textbf{q}') \cdot \frac{\textbf{r}_{ij}'+\textbf{r}_{ij}}{r_{ij}'+r_{ij}}\,, \quad \overline{\nabla }_{\textbf{q}_j}V(\textbf{q},\textbf{q}')=\Delta _{r_{ij}} V(\textbf{q},\textbf{q}') \cdot \frac{\textbf{r}_{ij}'+\textbf{r}_{ij}}{r_{ij}'+r_{ij}}\,, \end{aligned}$$
(20)

and \(\overline{\nabla }_{\textbf{q}_k} V(\textbf{q},\textbf{q}')=0\) for \(k\ne i,j\). The formulas (20) form a discrete gradient for pairwise potentials \(V(r_{ij})\), which can be seen by checking the conditions in Definition 1. The following theorem is due to LaBudde and Greenspan (cf. [26]); the theorem and its proof can also be found as Theorem 5.1 in [21].
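The defining property can also be checked numerically. The following sketch implements the finite difference (19) and the components (20) for a pairwise potential; the Lennard-Jones potential of Section 4.2 is used for illustration, and the function names are ours.

```python
import numpy as np

def lj(r, eps=5.0, sigma=1.0):
    """12-6 Lennard-Jones potential (21)."""
    s6 = (sigma / r) ** 6
    return 4.0 * eps * s6 * (s6 - 1.0)

def pair_dg(q, qp, i, j, V):
    """LaBudde-Greenspan discrete gradient (19)-(20) for a pairwise V(r_ij).
    q, qp: (N, 3) arrays of old and new positions; r_ij' != r_ij assumed."""
    r, rp = q[j] - q[i], qp[j] - qp[i]
    d, dp = np.linalg.norm(r), np.linalg.norm(rp)
    fd = (V(dp) - V(d)) / (dp - d)            # finite difference (19)
    direction = (rp + r) / (dp + d)
    g = np.zeros_like(q)
    g[i], g[j] = -fd * direction, fd * direction
    return g

q  = np.array([[0.0, 0.0, 0.0], [1.2, 0.0, 0.0]])
qp = np.array([[0.05, -0.02, 0.01], [1.25, 0.10, -0.03]])

g = pair_dg(q, qp, 0, 1, lj)

# First defining property of Definition 1:
dV = lj(np.linalg.norm(qp[1] - qp[0])) - lj(np.linalg.norm(q[1] - q[0]))
assert np.isclose(np.sum(g * (qp - q)), dV)
```

The check succeeds because \((\textbf{r}_{ij}'+\textbf{r}_{ij})\cdot (\textbf{r}_{ij}'-\textbf{r}_{ij}) = r_{ij}'^2 - r_{ij}^2\), so the dot product telescopes exactly to \(V(r_{ij}')-V(r_{ij})\).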

Theorem 1

The method (13) with (20) for a system of N particles with Hamiltonian

$$ H(\textbf{q},\textbf{p})=\frac{1}{2}\textbf{p}^TM^{-1}\textbf{p} + V(\textbf{q})\,, \qquad V=\sum _{k=1}^M V^k\,, $$

where \(V^k\) are pairwise potentials, is a second-order symmetric implicit method that conserves the energy, the total linear momentum \(\textbf{P}=\sum _{i=1}^N \textbf{p}_i\) and the total angular momentum \(\textbf{L}=\sum _{i=1}^N \textbf{q}_i \times \textbf{p}_i\).

Note that the essential statement of the theorem is the conservation of the energy; several schemes, such as the explicit Verlet scheme, already preserve the total linear momentum and the total angular momentum. These quantities are included in the theorem to highlight that the discrete gradient methods with the proposed discrete gradients preserve them as well.

4.2 Experiment with two Lennard–Jones particles

The Lennard–Jones potential is one of the most widely used models for the interaction of neutral particles (cf. [32]). We will use the standard 12–6 potential

$$\begin{aligned} V(r_{ij})=4 \epsilon \left( \left( \frac{\sigma }{r_{ij}} \right) ^{12} - \left( \frac{\sigma }{r_{ij}} \right) ^6 \right) =4 \epsilon \left( \frac{\sigma }{r_{ij}} \right) ^6 \cdot \left( \left( \frac{\sigma }{r_{ij}} \right) ^6 - 1 \right) \, . \end{aligned}$$
(21)

In our experiment, we use \(\sigma = 1\) and \(\epsilon = 5\). The initial conditions are shown in Fig. 1. In the positions block, the atoms are numbered and listed with their initial locations; the velocities are given in the following block, where each particle is identified by its number. The particles are attracted to each other and then repelled again, which results in a periodic motion. Moreover, due to the initial velocities, they revolve around each other.

Fig. 1
figure 1

Initial conditions for the experiment with two Lennard–Jones particles: the positions and velocities are given on the left-hand side. The first column refers to the particle number, the next three columns to the three coordinates. On the right-hand side, the initial positions are given as a plot made by the open visualization tool (OVITO) (cf. [57])

With this setup, we first compute the trajectories of the particles up to time 10 with different step sizes. The error versus the chosen step sizes is shown in Fig. 2 on the left-hand side. All three methods show second-order error behaviour; the black line indicates the second-order slope in the logarithmic plot. On the right-hand side, the calculated energies are shown for all time steps with step size \(\tau =0.005\). The discrete gradient method preserves the energy exactly, while the Verlet scheme and the implicit midpoint rule show a periodic change of the energy. This behaviour of the Verlet scheme and the implicit midpoint rule is expected, since symplectic integrators preserve a modified Hamiltonian. The total linear momentum and the total angular momentum are preserved by all three methods. The implicit equation in (14) is solved with the Newton method (15) with the full Jacobian (16). The Jacobian is a symmetric matrix, and the CG method is used to solve the linear systems. The Jacobian is not formed explicitly; instead, its action on a vector is computed directly, so that the computational effort is comparable to computing the forces. This is sometimes called a matrix-free implementation.

Fig. 2
figure 2

Results of the experiment with two Lennard–Jones particles: the error versus the time step is shown on the left-hand side for the discrete gradient (DG) method, the midpoint rule, and the Verlet scheme. On the right-hand side, the energy is shown over the time span [0, 10] for step size \(\tau =0.005\) for all three methods. \(H_0\) designates the exact energy

4.3 Discrete gradients for bond angles

We discuss some discrete gradients for bond angles. Besides general discrete gradients, e.g. (1), (2) and (3), restricted to the bond angles, a symmetric discrete gradient for the bond angles has recently been proposed in [53]. We discuss a slight generalisation of this discrete gradient. The standard angle potential, \(V(\theta )\), is assumed to depend smoothly on the angle \(\theta \) (cf. Fig. 3). The angle \(\theta =\theta _{ijk}\) can be expressed in terms of the distances \(r_{ji}\), \(r_{jk}\) and \(r_{ik}\) between the three atoms i, j and k as given on the right-hand side of Fig. 3. This is due to the following lemma.

Lemma 3

We have the following representation of the scalar product

$$ \langle \textbf{r}_{ji}, \textbf{r}_{jk} \rangle = \frac{1}{2} \left( r_{ji}^2+r_{jk}^2-r_{ik}^2 \right) . $$
Fig. 3
figure 3

Sketch of bond angle: From left to right, first, the bonds and the bond angle between the atoms i, j, and k are shown. The second sketch shows the distances between the atoms i, j, and k used in the discrete gradient

Proof

The proof is a simple calculation.

$$\begin{aligned} r_{ji}^2 + r_{jk}^2 - r_{ik}^2&= \langle \textbf{q}_i - \textbf{q}_j, \textbf{q}_i - \textbf{q}_j \rangle + \langle \textbf{q}_k - \textbf{q}_j, \textbf{q}_k - \textbf{q}_j \rangle - \langle \textbf{q}_k - \textbf{q}_i, \textbf{q}_k - \textbf{q}_i \rangle \\&= 2 \left( \langle \textbf{q}_j, \textbf{q}_j \rangle - \langle \textbf{q}_i, \textbf{q}_j \rangle - \langle \textbf{q}_k, \textbf{q}_j \rangle + \langle \textbf{q}_i, \textbf{q}_k \rangle \right) = 2 \cdot \langle \textbf{q}_i-\textbf{q}_j, \textbf{q}_k-\textbf{q}_j \rangle \\&= 2 \cdot \langle \textbf{r}_{ji}, \textbf{r}_{jk} \rangle . \end{aligned}$$

\(\square \)

With Lemma 3, one readily obtains the representation of the angle in terms of distances

$$ \theta _{ijk} = \arccos \left( \frac{ \langle \textbf{r}_{ji},\textbf{r}_{jk}\rangle }{ \Vert \textbf{r}_{ji}\Vert \cdot \Vert \textbf{r}_{jk}\Vert } \right) = \arccos \left( \frac{ r_{ji}^2+r_{jk}^2-r_{ik}^2 }{ 2r_{ji} \cdot r_{jk} } \right) . $$
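This distance representation is easy to verify numerically, for instance with the following short Python snippet (random atom positions; the setup is our own):

```python
import numpy as np

rng = np.random.default_rng(1)
qi, qj, qk = rng.normal(size=(3, 3))    # three random atoms

r_ji = np.linalg.norm(qi - qj)
r_jk = np.linalg.norm(qk - qj)
r_ik = np.linalg.norm(qk - qi)

# Angle from the bond vectors...
cos_direct = (qi - qj) @ (qk - qj) / (r_ji * r_jk)
theta_direct = np.arccos(cos_direct)

# ...and from the distances only, via Lemma 3 (the law of cosines)
theta_dist = np.arccos((r_ji**2 + r_jk**2 - r_ik**2) / (2.0 * r_ji * r_jk))

assert np.isclose(theta_direct, theta_dist)
```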

Thus, the angle potential depends only on the three distances, \(V(r_{ji},r_{jk},r_{ik})\). We first give a non-symmetric discrete gradient that is similar to the Itoh–Abe discrete gradient. In order to write it down concisely, we use the following finite differences with respect to the distances of the particles

$$\begin{aligned} \Delta _{r_{ji}}^\ell V(\textbf{q},\textbf{q}')&{:=} \frac{ V(r_{ji}',r_{jk},r_{ik})-V(r_{ji},r_{jk},r_{ik}) }{ r_{ji}'-r_{ji} }\, ,\\ \Delta _{r_{jk}}^\ell V(\textbf{q},\textbf{q}')&{:=} \frac{ V(r_{ji}',r_{jk}',r_{ik})-V(r_{ji}',r_{jk},r_{ik}) }{ r_{jk}'-r_{jk} }\, , \\ \Delta _{r_{ik}}^\ell V(\textbf{q},\textbf{q}')&{:=} \frac{ V(r_{ji}',r_{jk}',r_{ik}')-V(r_{ji}',r_{jk}',r_{ik}) }{ r_{ik}'-r_{ik} }. \end{aligned}$$

The finite differences are labelled with \(\ell \), because the variables with primes are filled in from the left. This discrete gradient might be seen as the Itoh–Abe (coordinate increment) discrete gradient for \(V(r_{ji},r_{jk},r_{ik})\). With these finite differences, our first discrete gradient \(\overline{\nabla }^\ell V\) has the non-zero components

$$\begin{aligned} \overline{\nabla }_{\textbf{q}_i}^\ell V(\textbf{q},\textbf{q}')&= \quad \Delta _{r_{ji}}^\ell V(\textbf{q},\textbf{q}') \cdot \frac{\textbf{r}_{ji}'+\textbf{r}_{ji}}{r_{ji}'+r_{ji}} - \Delta _{r_{ik}}^\ell V(\textbf{q},\textbf{q}') \cdot \frac{\textbf{r}_{ik}'+\textbf{r}_{ik}}{r_{ik}'+r_{ik}}\,, \\ \overline{\nabla }_{\textbf{q}_j}^\ell V(\textbf{q},\textbf{q}')&= -\Delta _{r_{ji}}^\ell V(\textbf{q},\textbf{q}') \cdot \frac{\textbf{r}_{ji}'+\textbf{r}_{ji}}{r_{ji}'+r_{ji}} - \Delta _{r_{jk}}^\ell V(\textbf{q},\textbf{q}') \cdot \frac{\textbf{r}_{jk}'+\textbf{r}_{jk}}{r_{jk}'+r_{jk}}\, , \\ \overline{\nabla }_{\textbf{q}_k}^\ell V(\textbf{q},\textbf{q}')&= \quad \Delta _{r_{jk}}^\ell V(\textbf{q},\textbf{q}') \cdot \frac{\textbf{r}_{jk}'+\textbf{r}_{jk}}{r_{jk}'+r_{jk}} + \Delta _{r_{ik}}^\ell V(\textbf{q},\textbf{q}') \cdot \frac{\textbf{r}_{ik}'+\textbf{r}_{ik}}{r_{ik}'+r_{ik}}\,, \end{aligned}$$
(22)

where all others are zero. This discrete gradient is not symmetric, since the finite differences are not symmetric, e.g.

$$ \Delta _{r_{ji}}^\ell V(\textbf{q},\textbf{q}') \ne \Delta _{r_{ji}}^\ell V(\textbf{q}',\textbf{q}). $$

Hence, we can define another discrete gradient \(\overline{\nabla }^r V\) by running through the coordinates in reverse order. The finite differences are now

$$\begin{aligned} \Delta _{r_{ik}}^r V(\textbf{q},\textbf{q}') {:=} \Delta _{r_{ik}}^\ell V(\textbf{q}',\textbf{q})&= \frac{ V(r_{ji},r_{jk},r_{ik}')-V(r_{ji},r_{jk},r_{ik}) }{ r_{ik}'-r_{ik} }\, , \\ \Delta _{r_{jk}}^r V(\textbf{q},\textbf{q}') {:=} \Delta _{r_{jk}}^\ell V(\textbf{q}',\textbf{q})&= \frac{ V(r_{ji},r_{jk}',r_{ik}')-V(r_{ji},r_{jk},r_{ik}') }{ r_{jk}'-r_{jk} }\, , \\ \Delta _{r_{ji}}^r V(\textbf{q},\textbf{q}') {:=} \Delta _{r_{ji}}^\ell V(\textbf{q}',\textbf{q})&= \frac{ V(r_{ji}',r_{jk}',r_{ik}')-V(r_{ji},r_{jk}',r_{ik}') }{ r_{ji}'-r_{ji} }\, . \end{aligned}$$

With these finite differences, we obtain another discrete gradient, which is also not symmetric. The finite differences are labelled r, because the variables with primes are filled in from the right. Since symmetric discrete gradients lead to time-symmetric integration methods, which are of second order, one might prefer symmetric discrete gradients. In the same way as for the Itoh–Abe discrete gradient, we can symmetrise the discrete gradients constructed here. This leads to the discrete gradient with the (symmetric) finite differences

$$\begin{aligned} \Delta _v^s V(\textbf{q},\textbf{q}') {:=} \frac{1}{2} \left( \Delta _v^\ell V(\textbf{q},\textbf{q}')+\Delta _v^r V(\textbf{q},\textbf{q}') \right) , \qquad v \in \{ r_{ji}, r_{jk}, r_{ik} \}. \end{aligned}$$

The non-zero components of the symmetric discrete gradient \(\overline{\nabla }^s V\) then read

$$\begin{aligned} \overline{\nabla }_{\textbf{q}_i}^sV(\textbf{q},\textbf{q}')&= \quad \Delta _{r_{ji}}^s V(\textbf{q},\textbf{q}') \cdot \frac{\textbf{r}_{ji}'+\textbf{r}_{ji}}{r_{ji}'+r_{ji}} - \Delta _{r_{ik}}^s V(\textbf{q},\textbf{q}') \cdot \frac{\textbf{r}_{ik}'+\textbf{r}_{ik}}{r_{ik}'+r_{ik}}\, , \\ \overline{\nabla }_{\textbf{q}_j}^s V(\textbf{q},\textbf{q}')&= -\Delta _{r_{ji}}^s V(\textbf{q},\textbf{q}') \cdot \frac{\textbf{r}_{ji}'+\textbf{r}_{ji}}{r_{ji}'+r_{ji}} - \Delta _{r_{jk}}^s V(\textbf{q},\textbf{q}') \cdot \frac{\textbf{r}_{jk}'+\textbf{r}_{jk}}{r_{jk}'+r_{jk}} \, , \\ \overline{\nabla }_{\textbf{q}_k}^s V(\textbf{q},\textbf{q}')&= \quad \Delta _{r_{jk}}^s V(\textbf{q},\textbf{q}') \cdot \frac{\textbf{r}_{jk}'+\textbf{r}_{jk}}{r_{jk}'+r_{jk}} + \Delta _{r_{ik}}^s V(\textbf{q},\textbf{q}') \cdot \frac{\textbf{r}_{ik}'+\textbf{r}_{ik}}{r_{ik}'+r_{ik}} . \end{aligned}$$
(23)

These three expressions are indeed discrete gradients, which can be seen by checking Definition 1.
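The two defining properties can also be spot-checked numerically. The following sketch is our illustration, not part of the original construction: it assumes the vector conventions \(\textbf{r}_{ji}=\textbf{q}_i-\textbf{q}_j\), \(\textbf{r}_{jk}=\textbf{q}_k-\textbf{q}_j\), \(\textbf{r}_{ik}=\textbf{q}_k-\textbf{q}_i\) and uses an arbitrary smooth toy function of the three distances in place of a concrete angle potential. It implements (23) with the symmetrised finite differences and checks the discrete gradient property \(\langle \overline{\nabla }^s V(\textbf{q},\textbf{q}'),\textbf{q}'-\textbf{q}\rangle = V(\textbf{q}')-V(\textbf{q})\) as well as the symmetry in \((\textbf{q},\textbf{q}')\).

```python
import numpy as np

rng = np.random.default_rng(0)
q, qp = rng.normal(size=(3, 3)), rng.normal(size=(3, 3))  # particles i, j, k at t and t'
i, j, k = 0, 1, 2

def dists(x):
    # the arguments of V in the order (r_ji, r_jk, r_ik)
    return np.array([np.linalg.norm(x[i] - x[j]),
                     np.linalg.norm(x[k] - x[j]),
                     np.linalg.norm(x[k] - x[i])])

def V(r):  # smooth toy function standing in for a concrete angle potential
    return (r[0] - 1.0)**2 + r[1]**2 * r[2] + np.cos(r[2])

def delta_s(q, qp):
    """Symmetrised finite differences: average of the 'l' and 'r' sweeps."""
    def sweep(r0, r1):  # fill in the arguments of V with r1 from the left
        out = np.empty(3)
        for m in range(3):
            lo = np.concatenate([r1[:m], r0[m:]])
            hi = np.concatenate([r1[:m + 1], r0[m + 1:]])
            out[m] = (V(hi) - V(lo)) / (r1[m] - r0[m])
        return out
    r, rp = dists(q), dists(qp)
    return 0.5 * (sweep(r, rp) + sweep(rp, r))  # the 'r' sweep swaps q and q'

def dg(q, qp):
    """Symmetric discrete gradient, following the structure of (23)."""
    d, r, rp = delta_s(q, qp), dists(q), dists(qp)
    u_ji = ((qp[i] - qp[j]) + (q[i] - q[j])) / (rp[0] + r[0])
    u_jk = ((qp[k] - qp[j]) + (q[k] - q[j])) / (rp[1] + r[1])
    u_ik = ((qp[k] - qp[i]) + (q[k] - q[i])) / (rp[2] + r[2])
    g = np.zeros((3, 3))
    g[i] = d[0] * u_ji - d[2] * u_ik
    g[j] = -d[0] * u_ji - d[1] * u_jk
    g[k] = d[1] * u_jk + d[2] * u_ik
    return g

g = dg(q, qp)
```

Both sweeps telescope in the distance arguments, so the discrete gradient property holds exactly (up to round-off) for any smooth V of the three distances.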

There are more discrete gradients: one might prescribe any pattern of primes to the three distances in the finite differences. For example, with the pattern prime, no prime, prime, and changing each distance from prime to no prime and vice versa from left to right, we obtain

$$\begin{aligned} \Delta _{r_{ji}}^f V(\textbf{q},\textbf{q}')&{:=} \frac{V(r_{ji}',r_{jk},r_{ik}')-V(r_{ji},r_{jk},r_{ik}') }{ r_{ji}'-r_{ji} }\, ,\\ \Delta _{r_{jk}}^f V(\textbf{q},\textbf{q}')&{:=}\frac{V(r_{ji}',r_{jk}',r_{ik}')-V(r_{ji}',r_{jk},r_{ik}')}{r_{jk}'-r_{jk}}\, ,\\ \Delta _{r_{ik}}^f V(\textbf{q},\textbf{q}')&{:=} \frac{V(r_{ji},r_{jk},r_{ik}')-V(r_{ji},r_{jk},r_{ik})}{r_{ik}'-r_{ik}}\, . \end{aligned}$$

All discrete gradients of this type can be symmetrised by symmetrising the finite differences.

Fig. 4

Sketch of united-atom butane: On the left-hand side, the molecule with the particles i, j, k and \(\ell \) is shown including the bonds between atoms. On the right-hand side, the distances between the particles i, j, k and \(\ell \) used by the discrete gradient are given

4.4 Discrete gradients for dihedral angles

The dihedral angle

$$\begin{aligned} \phi _{ijk\ell }&= \text {sign}( \langle \textbf{r}_{ij}, \textbf{r}_{jk}\times \textbf{r}_{k\ell }\rangle ) \arccos \varphi _{ijk\ell }, \qquad \varphi _{ijk\ell }= \frac{ \langle \textbf{r}_{ij} \times \textbf{r}_{jk}, \textbf{r}_{jk} \times \textbf{r}_{k\ell } \rangle }{ \Vert \textbf{r}_{ij} \times \textbf{r}_{jk} \Vert \Vert \textbf{r}_{jk} \times \textbf{r}_{k\ell }\Vert }\,, \end{aligned}$$
(24)

denotes the angle between the planes spanned by the atoms i, j and k and by the atoms j, k and \(\ell \), respectively. The sign of the dihedral angle \(\phi _{ijk\ell }\), i.e. \(\text {sign}( \langle \textbf{r}_{ij}, \textbf{r}_{jk}\times \textbf{r}_{k\ell } \rangle )\), designates on which side of the plane through j, k and \(\ell \) the particle i lies. We consider potentials \(V(\phi )\) that depend smoothly on the dihedral angle \(\phi \). The computation of the forces induced by the dihedral angle is usually based on expression (24) and leads to the most widely used formulas, as described in [3]. To the best of our knowledge, the discrete gradient that we propose in the following is new. It is based on the same idea as the discrete gradients before, namely to express the angle in terms of distances. The dihedral angle can be expressed in terms of the distances between all atoms involved; the additional lines on the right-hand side of Fig. 4 indicate the additional distances that will be used. The representation of the angle in terms of distances is given in the following lemma.

Lemma 4

We have the following representation in terms of distances

$$\begin{aligned} \varphi _{ijk\ell }&= \frac{ \langle \textbf{m}, \textbf{n} \rangle }{ \Vert \textbf{m} \Vert \Vert \textbf{n}\Vert } = \frac{ (r_{k\ell }^2+r_{jk}^2-r_{j\ell }^2)(r_{ij}^2+r_{jk}^2-r_{ik}^2) -2r_{jk}^2(r_{jk}^2+r_{i\ell }^2-r_{j\ell }^2-r_{ik}^2) }{ \sqrt{4r_{jk}^2r_{ij}^2-(r_{ij}^2+r_{jk}^2-r_{ik}^2)^2} \sqrt{4r_{jk}^2r_{k\ell }^2-(r_{jk}^2+r_{k\ell }^2-r_{j\ell }^2)^2} } \end{aligned}$$

where \(\textbf{m} = \textbf{r}_{ij} \times \textbf{r}_{jk}\) and \(\textbf{n} = \textbf{r}_{jk} \times \textbf{r}_{k\ell }\).

Proof

The proof is a tedious calculation along the following steps:

$$\begin{aligned} \frac{ \langle \textbf{r}_{ij} \times \textbf{r}_{jk}, \textbf{r}_{jk} \times \textbf{r}_{k\ell } \rangle }{ \Vert \textbf{r}_{ij} \times \textbf{r}_{jk} \Vert \Vert \textbf{r}_{jk} \times \textbf{r}_{k\ell }\Vert }&= (-1)\cdot \frac{ \Big \langle r_{jk}^2\textbf{r}_{ij} - \big \langle \textbf{r}_{ij}, \textbf{r}_{jk} \big \rangle \textbf{r}_{jk}, r_{jk}^2\textbf{r}_{k\ell } - \big \langle \textbf{r}_{k\ell }, \textbf{r}_{jk}\big \rangle \textbf{r}_{jk} \Big \rangle }{ \Vert r_{jk}^2\textbf{r}_{ij} - \big \langle \textbf{r}_{ij}, \textbf{r}_{jk}\big \rangle \textbf{r}_{jk} \Vert \Vert r_{jk}^2\textbf{r}_{k\ell } - \big \langle \textbf{r}_{k\ell }, \textbf{r}_{jk} \big \rangle \textbf{r}_{jk} \Vert } \\&= \frac{ \langle \textbf{r}_{k\ell }, \textbf{r}_{jk}\rangle \langle \textbf{r}_{ij}, \textbf{r}_{jk} \rangle - r_{jk}^2 \langle \textbf{r}_{ij}, \textbf{r}_{k\ell } \rangle }{ \sqrt{r_{jk}^2r_{ij}^2 - \langle \textbf{r}_{ij}, \textbf{r}_{jk}\rangle ^2} \sqrt{r_{jk}^2r_{k\ell }^2 - \langle \textbf{r}_{jk}, \textbf{r}_{k\ell }\rangle ^2} }\\&= \frac{ (r_{k\ell }^2+r_{jk}^2-r_{j\ell }^2)(r_{ij}^2+r_{jk}^2-r_{ik}^2) -2r_{jk}^2(r_{jk}^2+r_{i\ell }^2-r_{j\ell }^2-r_{ik}^2) }{ \sqrt{4r_{jk}^2r_{ij}^2-(r_{ij}^2+r_{jk}^2-r_{ik}^2)^2} \sqrt{4r_{jk}^2r_{k\ell }^2-(r_{jk}^2+r_{k\ell }^2-r_{j\ell }^2)^2} } \end{aligned}$$

\(\square \)
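The identity of Lemma 4 is easy to check numerically for random configurations. The following sketch (our illustration, assuming the convention \(\textbf{r}_{ab}=\textbf{q}_b-\textbf{q}_a\)) compares the cross-product definition of \(\varphi _{ijk\ell }\) with the distance representation of the lemma.

```python
import numpy as np

rng = np.random.default_rng(1)
qi, qj, qk, ql = rng.normal(size=(4, 3))  # random positions of atoms i, j, k, l

# cross-product definition of the cosine of the dihedral angle
rij, rjk, rkl = qj - qi, qk - qj, ql - qk
m, n = np.cross(rij, rjk), np.cross(rjk, rkl)
cos_cross = m @ n / (np.linalg.norm(m) * np.linalg.norm(n))

# 'distance' definition of Lemma 4 (squared distances between all atoms)
d2 = lambda a, b: (b - a) @ (b - a)
ij, jk, kl = d2(qi, qj), d2(qj, qk), d2(qk, ql)
ik, jl, il = d2(qi, qk), d2(qj, ql), d2(qi, ql)
num = (kl + jk - jl) * (ij + jk - ik) - 2.0 * jk * (jk + il - jl - ik)
den = (np.sqrt(4*jk*ij - (ij + jk - ik)**2)
       * np.sqrt(4*jk*kl - (jk + kl - jl)**2))
cos_dist = num / den
```

Both numerator and denominator of the lemma are four times \(\langle \textbf{m},\textbf{n}\rangle \) and \(\Vert \textbf{m}\Vert \Vert \textbf{n}\Vert \), respectively, so the quotients agree.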

So the dihedral potential depends on six distances, \(V(r_{ij},r_{jk},r_{k\ell },r_{ik},r_{j\ell },r_{i\ell })\). The representation in Lemma 4 suggests an alternative way to represent any potential that depends on the dihedral angle in terms of distances. However, we will consider potentials that depend on the cosine of the dihedral angle, because such potentials are used in our experiments later. Let

$$ U_t(\phi )=k_\phi \sum _{n=0}^m a_n \cos ^n \phi =\widetilde{U}_t(\cos \phi )\,, \qquad \widetilde{U}_t(\varphi )=k_\phi \sum _{n=0}^m a_n \varphi ^n\,, $$

be the potential for the torsion angles. Then, we have

$$\begin{aligned} \frac{\partial }{\partial \textbf{q}_u} U_t(\phi _{ijk\ell })&= \tilde{U}_t'(\cos \phi _{ijk\ell })\cdot (-1)\cdot \sin (\phi _{ijk\ell }) \text {sign}(\langle \textbf{r}_{ij},\textbf{n}\rangle ) \frac{-1}{\sin \varphi _{ijk\ell }} \cdot \frac{\partial }{\partial {\textbf{q}_u}} \varphi _{ijk\ell } \\&= \tilde{U}_t'(\varphi _{ijk\ell }) \cdot \frac{\partial }{\partial {\textbf{q}_u}} \varphi _{ijk\ell }\,, \qquad u=i,j,k,\ell . \end{aligned}$$

The potential reduces to \( \tilde{U}_t(\varphi _{ijk\ell }) \). Hence, for the discrete gradient, we use the representation

$$ V(r_{ij},r_{jk},r_{k\ell },r_{ik},r_{j\ell },r_{i\ell })=\tilde{U}_t(\varphi _{ijk\ell }), $$

with \(\varphi _{ijk\ell }\) expressed in the distances as in Lemma 4.

We first discuss non-symmetric discrete gradients. In order to write the discrete gradient down in a concise way, we again use finite differences with respect to the distances of the particles.

$$\begin{aligned} \Delta _{r_{ij}}^\ell V(\textbf{q},\textbf{q}')&{:=}&{ \frac{V(r_{ij}',r_{jk},r_{k\ell },r_{ik},r_{j\ell },r_{i\ell }) -V(r_{ij},r_{jk},r_{k\ell },r_{ik},r_{j\ell },r_{i\ell })}{r_{ij}'-r_{ij}}}\\&\vdots&\\ \Delta _{r_{i\ell }}^\ell V(\textbf{q},\textbf{q}')&{:=}&{\frac{V(r_{ij}',r_{jk}',r_{k\ell }',r_{ik}',r_{j\ell }',r_{i\ell }') -V(r_{ij}',r_{jk}',r_{k\ell }',r_{ik}',r_{j\ell }',r_{i\ell })}{r_{i\ell }'-r_{i\ell }}} \end{aligned}$$

This can also be seen as the Itoh–Abe (coordinate increment) discrete gradient for V. With these finite differences, our first discrete gradient reads as follows:

$$\begin{aligned} \begin{array}{rcl} \overline{\nabla }_{\textbf{q}_i}^\ell V(\textbf{q},\textbf{q}') &{}=&{} {-\Delta _{r_{ij}}^\ell V \cdot \frac{\textbf{r}_{ij}'+\textbf{r}_{ij}}{r_{ij}'+r_{ij}} - \Delta _{r_{ik}}^\ell V \cdot \frac{\textbf{r}_{ik}'+\textbf{r}_{ik}}{r_{ik}'+r_{ik}} - \Delta _{r_{i\ell }}^\ell V \cdot \frac{\textbf{r}_{i\ell }'+\textbf{r}_{i\ell }}{r_{i\ell }'+r_{i\ell }} }\, ,\\[1.5ex] \overline{\nabla }_{\textbf{q}_j}^\ell V(\textbf{q},\textbf{q}') &{}=&{} {\quad \Delta _{r_{ij}}^\ell V \cdot \frac{\textbf{r}_{ij}'+\textbf{r}_{ij}}{r_{ij}'+r_{ij}} - \Delta _{r_{jk}}^\ell V \cdot \frac{\textbf{r}_{jk}'+\textbf{r}_{jk}}{r_{jk}'+r_{jk}} - \Delta _{r_{j\ell }}^\ell V \cdot \frac{\textbf{r}_{j\ell }'+\textbf{r}_{j\ell }}{r_{j\ell }'+r_{j\ell }} } \,,\\[1.5ex] \overline{\nabla }_{\textbf{q}_k}^\ell V(\textbf{q},\textbf{q}') &{}=&{} {\quad \Delta _{r_{ik}}^\ell V \cdot \frac{\textbf{r}_{ik}'+\textbf{r}_{ik}}{r_{ik}'+r_{ik}} + \Delta _{r_{jk}}^\ell V \cdot \frac{\textbf{r}_{jk}'+\textbf{r}_{jk}}{r_{jk}'+r_{jk}} - \Delta _{r_{k\ell }}^\ell V \cdot \frac{\textbf{r}_{k\ell }'+\textbf{r}_{k\ell }}{r_{k\ell }'+r_{k\ell }} } \,,\\[1.5ex] \overline{\nabla }_{\textbf{q}_\ell }^\ell V(\textbf{q},\textbf{q}') &{}=&{} {\quad \Delta _{r_{i\ell }}^\ell V \cdot \frac{\textbf{r}_{i\ell }'+\textbf{r}_{i\ell }}{r_{i\ell }'+r_{i\ell }} + \Delta _{r_{j\ell }}^\ell V \cdot \frac{\textbf{r}_{j\ell }'+\textbf{r}_{j\ell }}{r_{j\ell }'+r_{j\ell }} + \Delta _{r_{k\ell }}^\ell V \cdot \frac{\textbf{r}_{k\ell }'+\textbf{r}_{k\ell }}{r_{k\ell }'+r_{k\ell }}. } \end{array} \end{aligned}$$
(25)

This discrete gradient is not symmetric. The coordinate decrement version

$$ \begin{array}{rcl} \Delta _{r_{ij}}^r V(\textbf{q},\textbf{q}') &{}{:=}&{} {\frac{V(r_{ij},r_{jk}',r_{k\ell }',r_{ik}',r_{j\ell }',r_{i\ell }')-V(r_{ij}',r_{jk}',r_{k\ell }',r_{ik}',r_{j\ell }',r_{i\ell }')}{r_{ij}-r_{ij}'}}\\ &{}\vdots &{} \\ \Delta _{r_{i\ell }}^r V(\textbf{q},\textbf{q}') &{}{:=}&{}{\frac{V(r_{ij},r_{jk},r_{k\ell },r_{ik},r_{j\ell },r_{i\ell })-V(r_{ij},r_{jk},r_{k\ell },r_{ik},r_{j\ell },r_{i\ell }')}{r_{i\ell }-r_{i\ell }'}} \end{array} $$

leads to another discrete gradient that is also not symmetric. Similar to what we saw in the previous case, one might prefer symmetric discrete gradients. The symmetrised discrete gradient makes use of the symmetric finite differences

$$\begin{aligned} \Delta _{r}^s V(\textbf{q},\textbf{q}') {:=}\frac{1}{2} \left( \Delta _{r}^\ell V(\textbf{q},\textbf{q}')+\Delta _{r}^r V(\textbf{q},\textbf{q}') \right) , \quad r \in \{r_{ij}, r_{jk}, r_{k\ell }, r_{ik}, r_{j\ell }, r_{i\ell } \}. \end{aligned}$$

These three expressions are indeed discrete gradients, which can be seen by checking Definition 1. Actually, there are many more discrete gradients: one might prescribe any pattern of primes to the six distances. Changing primed distances to non-primed distances and vice versa from left to right once the finite differences have ‘passed’ a distance, we obtain a new unsymmetric discrete gradient. These unsymmetric discrete gradients can all be symmetrised.
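The construction over six distances can be verified in the same way as before. The sketch below is our illustration, not the paper's code: it uses a smooth toy function of the six distances in place of the dihedral potential, builds the gradient (25) from the Itoh–Abe sweep in the argument order \(r_{ij}, r_{jk}, r_{k\ell }, r_{ik}, r_{j\ell }, r_{i\ell }\) (assuming \(\textbf{r}_{ab}=\textbf{q}_b-\textbf{q}_a\)), and checks the discrete gradient property together with the two momentum identities used later in Theorem 2.

```python
import numpy as np

rng = np.random.default_rng(2)
q, qp = rng.normal(size=(4, 3)), rng.normal(size=(4, 3))  # atoms i, j, k, l

# index pairs in the argument order of V: (i,j), (j,k), (k,l), (i,k), (j,l), (i,l)
pairs = [(0, 1), (1, 2), (2, 3), (0, 2), (1, 3), (0, 3)]
dists = lambda x: np.array([np.linalg.norm(x[b] - x[a]) for a, b in pairs])

def V(r):  # smooth toy function of the six distances
    return np.sum(r**2) + np.prod(np.cos(r))

r, rp = dists(q), dists(qp)

def delta_left(m):
    """Itoh-Abe sweep: distances 0..m-1 are already primed."""
    lo = np.concatenate([rp[:m], r[m:]])
    hi = np.concatenate([rp[:m + 1], r[m + 1:]])
    return (V(hi) - V(lo)) / (rp[m] - r[m])

grad = np.zeros((4, 3))
for m, (a, b) in enumerate(pairs):
    u = ((qp[b] - qp[a]) + (q[b] - q[a])) / (rp[m] + r[m])
    d = delta_left(m)
    grad[a] -= d * u  # r_ab runs from a to b: minus at a, plus at b
    grad[b] += d * u
```

The row sums of `grad` vanish, and so does \(\sum _i (\textbf{q}_i'+\textbf{q}_i)\times \overline{\nabla }_{\textbf{q}_i}V\), since every pair contributes a cross product of \(\textbf{r}_{ab}'+\textbf{r}_{ab}\) with itself.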

Besides the conventional torsion potential, where the atoms i, j, k and \(\ell \) are consecutively connected, there is also the possibility that three atoms are connected to one as in Fig. 5 on the left-hand side. Here, the so-called improper dihedral angle is defined as the angle between the planes spanned by the atoms i, j and k and j, k and \(\ell \), respectively. As shown on the right-hand side of Fig. 5, this angle can be expressed in exactly the same way as before for the standard dihedral angle (24). Therefore, the distance definition of the dihedral angle in Lemma 4 also works for the improper dihedral angle, and the corresponding discrete gradients are constructed analogously.

$$\begin{aligned} \omega _{ijk\ell }&= \arccos (\varphi _{ijk\ell }), \\ \varphi _{ijk\ell }&= \frac{ \langle \textbf{r}_{ij} \times \textbf{r}_{jk}, \textbf{r}_{jk} \times \textbf{r}_{k\ell } \rangle }{ \Vert \textbf{r}_{ij} \times \textbf{r}_{jk} \Vert \Vert \textbf{r}_{jk} \times \textbf{r}_{k\ell }\Vert } \end{aligned}$$

With discrete gradients for all standard short-range forces at hand, we are able to formulate our main theorem for the considered discrete gradients.

Fig. 5

Improper dihedral angle: On the left-hand side, the typical situation is shown. On the right-hand side, the formula for the improper dihedral angle \(\omega _{ijk\ell }\) is given. \(\varphi _{ijk\ell }\) is exactly the same term as for the standard dihedral angle (cf. (24))

Theorem 2

The method (13) for a particle system of N particles with Hamiltonian

$$ H(\textbf{q},\textbf{p})=\frac{1}{2}\textbf{p}^TM^{-1}\textbf{p} + V(\textbf{q})\,, \qquad V=\sum _{k=1}^M V^k\,, $$

where \(V^k\) are pairwise, angle and dihedral potentials, is a first-order non-symmetric implicit method if at least one non-symmetric discrete gradient is used, and a second-order symmetric implicit method if all discrete gradients are symmetric. All methods preserve the energy, the total linear momentum \(\textbf{P}=\sum _{i=1}^N \textbf{p}_i\) and the total angular momentum \(\textbf{L}=\sum _{i=1}^N \textbf{q}_i \times \textbf{p}_i\).

Proof

The order of the method follows from Proposition 2. The preservation of the energy directly follows from the use of discrete gradients, Lemma 1 and Lemma 2. From (20), (23) (and its variants) or (25) (and its variants), respectively, we find by a tedious but simple calculation that for arbitrary \(\textbf{q}'\) and \(\textbf{q}\), we have that

$$ \sum _{i=1}^N \overline{\nabla }V^k_{\textbf{q}_i} (\textbf{q},\textbf{q}')=0\,, \qquad \text {for} \qquad k=1,\ldots ,M. $$

Hence,

$$\begin{aligned} \textbf{P}^{n+1} = \sum _{i=1}^N \textbf{p}_i^{n+1}&= \sum _{i=1}^N \left( \textbf{p}_i^n - \tau \sum _{k=1}^M \overline{\nabla } V^k_{\textbf{q}_i}(\textbf{q}^n,\textbf{q}^{n+1}) \right) \\&= \sum _{i=1}^N \textbf{p}_i^n - \tau \sum _{k=1}^M \sum _{i=1}^N \overline{\nabla } V^k_{\textbf{q}_i} (\textbf{q}^n,\textbf{q}^{n+1}) = \sum _{i=1}^N \textbf{p}_i^n = \textbf{P}^{n}. \end{aligned}$$
(26)

For all potentials and all proposed discrete gradients and for arbitrary \(\textbf{q}'\) and \(\textbf{q}\), we have that

$$\begin{aligned} \sum _{i=1}^N (\textbf{q}_i'+\textbf{q}_i)&\times \overline{\nabla }_{\textbf{q}_i}V^k(\textbf{q},\textbf{q}') = \textbf{0}. \end{aligned}$$

This is again a tedious but simple calculation. Since this identity holds for arbitrarily chosen \(\textbf{q}'\) and \(\textbf{q}\), we find

$$\begin{aligned} \sum _{i=1}^N (\textbf{q}_i'+\textbf{q}_i) \times \overline{\nabla }_{\textbf{q}_i} V(\textbf{q},\textbf{q}')&= \sum _{i=1}^N (\textbf{q}_i'+\textbf{q}_i) \times \left( \sum _{k=1}^M \overline{\nabla }_{\textbf{q}_i} V^k(\textbf{q},\textbf{q}') \right) \\&= \sum _{k=1}^M \left( \sum _{i=1}^N (\textbf{q}_i'+\textbf{q}_i) \times \overline{\nabla }_{\textbf{q}_i} V^k(\textbf{q},\textbf{q}')\right) = \textbf{0} . \end{aligned}$$

Now, we turn to method (13). Then, we have by the equation above that

$$\begin{aligned} \sum _{i=1}^N (\textbf{q}_i^{n+1}+\textbf{q}_i^n) \times (\textbf{p}_i^{n+1}-\textbf{p}_i^n) = -\tau \sum _{i=1}^N (\textbf{q}_i^{n+1}+\textbf{q}_i^n) \times \overline{\nabla }_{\textbf{q}_i} V(\textbf{q}^n,\textbf{q}^{n+1}) = \textbf{0} . \end{aligned}$$
(27)

Due to

$$ (\textbf{q}_i^{n+1}-\textbf{q}_i^n) \times (\textbf{p}_i^{n+1}+\textbf{p}_i^n) = \frac{\tau }{2m_i} (\textbf{p}_i^{n+1}+\textbf{p}_i^n) \times (\textbf{p}_i^{n+1}+\textbf{p}_i^n) = \textbf{0}\,, $$

we also have

$$\begin{aligned} \sum _{i=1}^N (\textbf{q}_i^{n+1}-\textbf{q}_i^n) \times (\textbf{p}_i^{n+1}+\textbf{p}_i^n) = \textbf{0} . \end{aligned}$$
(28)

Adding (27) and (28) shows the preservation of the angular momentum. \(\square \)
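The two ‘tedious but simple’ identities of the proof can be spot-checked numerically. The sketch below does this for the simplest case of pairwise potentials, where the discrete gradient of each pair contribution \(V(r_{ij})\) acts along \((\textbf{r}_{ij}'+\textbf{r}_{ij})/(r_{ij}'+r_{ij})\); the toy pair potential and the convention \(\textbf{r}_{ij}=\textbf{q}_j-\textbf{q}_i\) are our assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(3)
N = 5
q, qp = rng.normal(size=(N, 3)), rng.normal(size=(N, 3))

V = lambda r: (r - 1.0)**2 + np.sin(r)  # smooth toy pair potential

grad, dV = np.zeros((N, 3)), 0.0
for i in range(N):
    for j in range(i + 1, N):
        r = np.linalg.norm(q[j] - q[i])
        rp = np.linalg.norm(qp[j] - qp[i])
        delta = (V(rp) - V(r)) / (rp - r)  # difference quotient in the distance
        u = ((qp[j] - qp[i]) + (q[j] - q[i])) / (rp + r)
        grad[i] -= delta * u
        grad[j] += delta * u
        dV += V(rp) - V(r)
```

The per-pair contributions to \(\sum _i \overline{\nabla }_{\textbf{q}_i}V\) cancel in pairs, and each pair's contribution to \(\sum _i (\textbf{q}_i'+\textbf{q}_i)\times \overline{\nabla }_{\textbf{q}_i}V\) is a cross product of \(\textbf{r}_{ij}'+\textbf{r}_{ij}\) with a parallel vector.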

Theorem 2 shows that all standard short-range forces in a classical molecular dynamics simulation can be modelled with discrete gradients in such a way that the statement of Theorem 1 by LaBudde and Greenspan for pairwise forces carries over to all standard short-range forces. Since even large protein simulations are often based on these standard short-range potentials, and since the standard potentials do not change when further atoms are connected to the atoms defining one of the standard interactions, the new Theorem 2 is quite general with respect to its applicability.

Additional insight into the computation of the gradient of a dihedral angle potential might be gained from the representation of the dihedral angle with six distances. The computation of this gradient is not straightforward. It is usually based on formula (24), also called the cross-product definition of the dihedral angle. The scalar product of the cross products can also be written as a scalar product without cross products, the so-called scalar product definition of the torsion angle. The cross-product definition leads to the most widely used formulas for the computation of the forces induced by the dihedral potential, as given in [3]. The representation of the dihedral angle in terms of distances leads to the interesting alternative formulas

$$ \begin{array}{rclrcl} \nabla _{\textbf{q}_i}V(\textbf{q}) &{}=&{} -V_{r_{ij}}\frac{\textbf{r}_{ij}}{r_{ij}} -V_{r_{ik}}\frac{\textbf{r}_{ik}}{r_{ik}} -V_{r_{i\ell }}\frac{\textbf{r}_{i\ell }}{r_{i\ell }}\,, \quad &{} \nabla _{\textbf{q}_j}V(\textbf{q}) &{}=&{} V_{r_{ij}}\frac{\textbf{r}_{ij}}{r_{ij}} -V_{r_{jk}}\frac{\textbf{r}_{jk}}{r_{jk}} -V_{r_{j\ell }}\frac{\textbf{r}_{j\ell }}{r_{j\ell }}\,,\\ \nabla _{\textbf{q}_k}V(\textbf{q}) &{}=&{} \quad V_{r_{ik}}\frac{\textbf{r}_{ik}}{r_{ik}} +V_{r_{jk}}\frac{\textbf{r}_{jk}}{r_{jk}} -V_{r_{k\ell }}\frac{\textbf{r}_{k\ell }}{r_{k\ell }}\,, &{} \nabla _{\textbf{q}_\ell }V(\textbf{q}) &{}=&{} V_{r_{i\ell }}\frac{\textbf{r}_{i\ell }}{r_{i\ell }} +V_{r_{j\ell }}\frac{\textbf{r}_{j \ell }}{r_{j\ell }} +V_{r_{k\ell }}\frac{\textbf{r}_{k\ell }}{r_{k\ell }} \,, \end{array} $$

where

$$ V_{r_{mn}}=V_{r_{mn}}(r_{ij},r_{jk},r_{k\ell },r_{ik},r_{j\ell },r_{i\ell }), \qquad r_{mn} \in \{ r_{ij},r_{jk},r_{k\ell },r_{ik},r_{j\ell },r_{i\ell }\}, $$

designate the derivative with respect to the corresponding distance. The representation of the dihedral angle in terms of the distances in Lemma 4 might be called the ‘distance’ definition of the dihedral angle, as a supplement to the cross-product definition and the scalar product definition of the dihedral angle.

Since we have ‘distance’ definitions of the bond angles as well as the dihedral angles, and since the other potentials are dependent on distances in a natural way, one could evaluate all forces with respect to these potentials in a unified way. We use these representations here in order to use the same technique to construct a discrete gradient. But this unified way to compute the forces, i.e. the gradients of the potentials, might be interesting for the evaluation of the short-range forces in the standard Verlet scheme, too.

Table 1 Parameters for the simulation of butane

4.5 Experiment with butane

We use a united-atom model of butane. The Lennard–Jones potential is chosen as before, with the parameters given in Table 1; it is applied only to pairs of atoms that belong to different molecules. The bond potential \(V_b\) and the angle potential \(V_a\) are chosen as

$$\begin{aligned} V_b(r)=k_b(r-r_0)^2 \quad \text {and} \quad V_a(\theta )=k_\theta (\cos (\theta )-\cos (\theta _0))^2. \end{aligned}$$
(29)

The bond potential and the angle potential are available in LAMMPS (cf. [58]) as bond_style harmonic and angle_style cosine/squared, respectively. The torsion potential in terms of the dihedral angle \(\phi \) in the IUPAC convention (cf. [25]) reads

$$\begin{aligned} U_t(\phi ) = k_\phi (1.116-1.462\cos \phi -1.578 \cos ^2 \phi&+ 0.368 \cos ^3 \phi \\&+ 3.156 \cos ^4 \phi + 3.788 \cos ^5 \phi ) . \end{aligned}$$
(30)

The parameter \(k_\phi \) is also given in Table 1 (cf. [59]). The dihedral potential is also available in LAMMPS as dihedral_style nharmonic. Since we are using realistic data, we first remove the units by scaling. We use \(\tilde{\sigma }=10^{-9}~\text {m}=1~\text {nm}\), \(\tilde{\epsilon }=1~\text {kJ}/\text {mol}\), \(\tilde{m}=1~\text {u}\) and \(\tilde{\alpha }=\tilde{\sigma }\sqrt{\tilde{m}/\tilde{\epsilon }} \approx 10^{-12}~\text {s}=1~\text {ps}\). The dimensionless quantities then read

$$ \begin{array}{lllll} m'=m/\tilde{m},~~~ &{} \textbf{x}_i'=\textbf{x}_i/ \tilde{\sigma },~~~ &{} \textbf{r}_{ij}'=\textbf{r}_{ij}/\tilde{\sigma },~~~ &{} E'=E/\tilde{\epsilon },~~~ &{} V'=V/\tilde{\epsilon }, \\ \sigma '=\sigma / \tilde{\sigma },~~~ &{} \epsilon '=\epsilon / \tilde{\epsilon },~~~ &{} t'=t/\tilde{\alpha }.~~~ &{} &{} \end{array} $$
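The derived time unit \(\tilde{\alpha }\approx 1~\text {ps}\) can be recomputed from SI values of the reference quantities; the short sketch below uses standard constants (Avogadro constant, atomic mass unit), which are our inputs and not taken from the paper.

```python
# reference quantities in SI units (standard constants, assumed for illustration)
N_A     = 6.02214076e23                  # 1/mol
sigma_t = 1.0e-9                         # m
eps_t   = 1.0e3 / N_A                    # 1 kJ/mol per molecule, in J
m_t     = 1.66053907e-27                 # kg, i.e. 1 u
alpha_t = sigma_t * (m_t / eps_t)**0.5   # derived time unit in s
```

The result is \(10^{-12}~\text {s}\) to within a fraction of a percent, confirming the scaling used above.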

The initial configuration for the atoms in the scaled quantities is shown in Fig. 6 on the right-hand side. The molecules are plotted with OVITO (cf. [57]). The first column in Code fragment 2 on the left-hand side refers to the atom number, the second column to the molecule the atom belongs to. Two consecutive atoms within the same molecule are connected by a bond. Three consecutive atoms within the same molecule are governed by the angle potential, and all four atoms of the molecule by the given torsion potential. That is, this experiment uses all potentials discussed so far.

Fig. 6

Initial positions for the experiment with two united-atom butane molecules: the positions are given on the left-hand side. The first column numbers the particles. The second column assigns molecule numbers. If this number is the same, a bond is added between the particles. The third to fifth columns are the coordinates of the positions. The velocities are set to zero and not given in the code fragment. On the right-hand side, the initial configuration is given as a plot made by the open visualization tool (OVITO) (cf. [57])

In Fig. 7 on the left-hand side, one can see that the methods perform as expected with respect to the order. The integration time for the error plot was up to \(T=2.0\) with the step sizes indicated on the abscissa. The error, measured in the standard Euclidean norm, is shown on the ordinate. If only one of the discrete gradients used in the DG method is unsymmetric, then the DG method is of first order. This is shown by the blue circle-marked line. If all discrete gradients are symmetric, the method is of second order, as are the implicit midpoint rule and the Verlet scheme. For the energy plot on the right-hand side, we computed the solutions up to time \(T=10\) with step size \(\tau = 0.005\). All DG methods preserve the energy up to round-off error. The implicit midpoint rule and the Verlet scheme deviate from the beginning from the constant energy that should be preserved. Since the Verlet scheme deviates significantly more than the implicit midpoint rule, we also calculated the energy with LAMMPS; the energy behaviour turned out to be exactly the same. With the NVE ensemble set in LAMMPS, the outcome is the red, solid energy curve, which coincides with our own implementation of the Verlet scheme. The implicit equation in (14) is again solved with the quasi-Newton method, (17), where the approximate Jacobian \(J_F\), cf. (18), includes the full Hessian with respect to the Lennard–Jones potentials and the bond potentials but omits the part with respect to the bond angle and dihedral angle potentials.

Fig. 7

Results of the experiment with two united-atom butane molecules: the error versus the time step is shown on the left-hand side for the Verlet scheme, the symmetric discrete gradient (DG) scheme, the midpoint rule and the unsymmetric simple discrete gradient (DG) method. On the right-hand side, the energy is shown over the time span [0, 10] for step size \(\tau = 0.005\) for the Verlet scheme, the midpoint rule and the two discrete gradient (DG) schemes. \(H_0\) designates the exact energy

5 Parallelisation of DG methods

In order to show the usefulness of DG methods in molecular dynamics, it is indispensable to take care of parallelisation. Many codes for molecular dynamics simulations are based on the basic parallel treatment of short-range forces (e.g. [4, 33, 45, 58]).

5.1 Parallelisation of the evaluation of discrete gradients for Lennard–Jones forces with cut-off

In this section, we discuss the parallelisation of Lennard–Jones forces, induced by the standard potential (cf. (21)). Since we are only interested in the short-range part of the potential and since we will also make use of the Hessian, the Lennard–Jones potential with cut-off function should be twice continuously differentiable. For this reason, we will use the switching function proposed in [38] as

$$ s(r)= \left\{ \begin{array}{ll} 1, &{} 0 \le r< r_{\text { m}},\\ (1-x)^3(1+3x+6x^2), &{} r_{\text { m}} \le r \le r_{\text { cut}}, \\ 0, &{} r_{\text { cut}} < r, \end{array} \right. \quad \qquad x=\frac{r-r_{\text { m}}}{r_{\text { cut}}-r_{\text { m}}}, $$

with \(r_{\text { m}}=\frac{r_{\text { cut}}}{2}\). The function and the first two derivatives restricted to \([r_{\text { m}},r_{\text { cut}}]\) read

$$ \begin{array}{rcl} s(r) &{}=&{} (1-x)^3(1+3x+6x^2) \\ s'(r) &{}=&{} \frac{1}{r_{\text { cut}}-r_{\text { m}}} \cdot (-30)\cdot x^2(x-1)^2 \\ s''(r) &{}=&{} \frac{1}{(r_{\text { cut}}-r_{\text { m}})^2} \cdot (-60)\cdot x(x-1)(2x-1) \end{array} \,,\qquad x=\frac{r-r_{\text { m}}}{r_{\text { cut}}-r_{\text { m}}} . $$

The Lennard–Jones potential with switching function, i.e. \(V(r)=U(r)\cdot s(r)\), is a twice continuously differentiable short-range potential. This switching function is available in LAMMPS as pair_style lj/mdf.
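A direct implementation of the switching function and its two derivatives, together with checks of the \(C^2\) matching at \(r_{\text { m}}\) and \(r_{\text { cut}}\), might look as follows. This is a sketch of the formulas above, not the LAMMPS implementation; the cut-off value is an arbitrary example.

```python
import numpy as np

rcut = 2.5          # example cut-off radius
rm = rcut / 2.0     # switching starts at half the cut-off

def s(r):
    x = (r - rm) / (rcut - rm)
    return np.where(r < rm, 1.0,
                    np.where(r > rcut, 0.0, (1 - x)**3 * (1 + 3*x + 6*x**2)))

def ds(r):   # first derivative of s
    x = (r - rm) / (rcut - rm)
    inner = -30.0 * x**2 * (x - 1)**2 / (rcut - rm)
    return np.where((r < rm) | (r > rcut), 0.0, inner)

def d2s(r):  # second derivative of s
    x = (r - rm) / (rcut - rm)
    inner = -60.0 * x * (x - 1) * (2*x - 1) / (rcut - rm)**2
    return np.where((r < rm) | (r > rcut), 0.0, inner)
```

At \(x=0\) and \(x=1\), both derivatives vanish and \(s\) takes the values 1 and 0, so the switched potential \(V=U\cdot s\) is twice continuously differentiable.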

The Lennard–Jones interactions are implemented with the linked cell method as described, e.g., in Chapter 3 of [15]. Since DG methods are implicit methods, the particle structure is extended by the possible future positions of the particles during the iterative solution of (14). Due to this, Lennard–Jones forces pose a special challenge for DG methods, or implicit methods in general. While a border neighbourhood of one cell is enough for the Verlet scheme, we need a border neighbourhood of two cells for DG methods. The reason is that the particle structure not only carries the position at the given time but also the position of the next step. Two particles that are not close in the given time step could become close in the next step. This is illustrated in Fig. 8. Since possible future positions are needed for the evaluation of the discrete gradient (20) in the difference (19), particles up to a distance of \(2r_\text { cut}\) need to be known, when we assume that the step size is chosen so small that the particles cannot travel more than \(\frac{2}{3}r_\text { cut}\) in a time step. Sweeping all particles in a border neighbourhood of two cells around a given cell is enough to catch these events. As an alternative, one might use cells with dimensions larger than \(2r_\text { cut}\); then a border neighbourhood with a width of one cell would be enough for the larger cells. We will stick with the smaller cells. There are some computations, for example the potential with the cut-off function, for which only a border neighbourhood of one cell is needed. Hence, using the smaller cells saves a bit of computing time.
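The required neighbourhood width can be illustrated with a small two-dimensional linked-cell sketch. This is our illustration of the bookkeeping, not the parallel production code; the displacement bound (each particle moves at most \(r_\text { cut}/3\), so a pair's distance changes by at most \(\frac{2}{3}r_\text { cut}\)) and all parameters are assumptions for the test. Every pair that is within the cut-off now or at the tentative next step is then found when a border neighbourhood of two cells is swept.

```python
import numpy as np
from collections import defaultdict
from itertools import product

rng = np.random.default_rng(4)
L, rcut, n = 12.0, 1.0, 300
q = rng.uniform(0, L, (n, 2))                 # current positions
step = rng.normal(size=(n, 2))
step *= (rcut / 3.0) / np.linalg.norm(step, axis=1, keepdims=True)
qn = q + step                                 # tentative next-step positions

cells = defaultdict(list)                     # linked cells of width rcut
for i, c in enumerate(np.floor(q / rcut).astype(int)):
    cells[tuple(c)].append(i)

def candidates(width):
    """Pairs swept with a border neighbourhood of `width` cells."""
    found = set()
    for (cx, cy), members in cells.items():
        for ox, oy in product(range(-width, width + 1), repeat=2):
            for i in members:
                for j in cells.get((cx + ox, cy + oy), ()):
                    if i < j:
                        found.add((i, j))
    return found

# pairs the DG method needs: within the cut-off now *or* at the next step
needed = {(i, j) for i in range(n) for j in range(i + 1, n)
          if min(np.linalg.norm(q[i] - q[j]),
                 np.linalg.norm(qn[i] - qn[j])) <= rcut}
```

Under the displacement bound, a pair that interacts at the next step is currently at most \(\frac{5}{3}r_\text { cut}\) apart, so its cell indices differ by at most two in each direction, i.e. the \(5\times 5\) sweep suffices.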

Fig. 8

Linked cell method: the simulation domain is decomposed into square cells of size \(r_{\text { cut}}\times r_{\text { cut}}\). The dark-shaded circle is the cut-off radius \(r_{\text { cut}}\) about the red particle i. On the left-hand side, the current time step is shown. For the standard force computation, only the particles in the \(3 \times 3\) grid of light-grey cells need to be taken into account. In the middle, the situation at the next time step is shown. The future positions \(i'\) of particle i and \(j'\) of particle j, respectively, are now within the cut-off range. While particles i and j do not interact at the current time step, they interact after they have moved to the positions \(i'\) and \(j'\) in the next time step. In the discrete gradient method, the positions in the next time step are needed to compute the discrete gradient. Therefore, a \(5 \times 5\) grid of cells around the cell with the red particle i needs to be taken into account for DG methods, if the particles are assumed to travel not more than \(2/3\) of the cut-off in one time step. The cut-off radius of the DG method around the red particle i and the cells that need to be taken into account are shown on the right-hand side

For the parallelisation, domain decomposition is used. If we decompose the two-dimensional domain in Fig. 9 into six larger parts based on the cells given by the linked cell method, every processor only knows the particles in its domain. Since each processor might run on its own node with its own memory in a cluster computer, the access to the data of adjacent processors is not immediate (cf. [2]). The standard technique is to extend the domain that the processor is handling by further cells, the border neighbourhood, and to retrieve the necessary information for these cells from the adjacent processors. The current processor also has to send the information needed by the adjacent processors from its domain to the neighbours. For the exchange of the data between processors, message passing is used. That is, the processors send messages to each other. This is standardised in the message passing interface (MPI) (cf. [39]). Due to the discussion above, we need to extend the processor's domain by a border neighbourhood of the size of two cells, as shown in Fig. 10 on the left-hand side. We will conduct three-dimensional experiments. The corresponding domain and its neighbourhood are shown in Fig. 10 on the right-hand side.

Fig. 9

Decomposition of the simulation domain: the domain \(\Omega \) is divided into cells, indicated by thin lines. The domain is subdivided further into six subdomains, indicated by the thick lines. Each of the six subdomains is assigned to its own processor that handles the computations in this subdomain according to the linked cell method

Fig. 10

Subdomain and border neighbourhood: On the left-hand side, a typical subdomain is shown as the square with the white cells. The processor responsible for this domain needs the data from the adjacent domains. A border neighbourhood of cells, given in grey, is added to the subdomain. The border neighbourhood is filled with copies of the particles in the adjacent domains. The adjacent processors have to send the data to the current processor by messages, according to the message passing paradigm standardised in the message passing interface (MPI) (cf. [39]). For implicit methods and DG methods, a border neighbourhood of two cells is necessary in order to compute the forces by the linked cell method. On the right-hand side, the situation for a three-dimensional subdomain is illustrated

Fig. 11

Sketch of a face-centred cubic lattice on the left-hand side. On the right-hand side, the initial positions of the particles that form the smaller and the large body are shown in a plot made by the open visualization tool (OVITO) (cf. [57])

5.2 Collision of two bodies

As a test problem for the parallelisation, we use the collision of two bodies as described in Section 4.5.1 of [15]. In Fig. 11 on the right-hand side, the initial configuration of the experiment is shown. The two bodies consist of \(10 \times 10 \times 10\) and \(10 \times 30 \times 30\) cubic cells, with 4000 and 36000 particles, respectively. Each cell contains four particles in a face-centred cubic grid, as shown on the left-hand side of Fig. 11. The four particles counted for a cell are the one in its lower left corner as well as the particles in the centres of its front, bottom and left faces; the others belong to the regular grids spanned by these four particles. The shortest distance in the grid is \(2^{1/6}\sigma \), according to the equilibrium length of the Lennard–Jones potential. At the beginning of the simulation, the smaller body moves with a high velocity v towards the larger body at rest. The simulation cell of size \([0,150\sigma ]^3\) is equipped with periodic boundary conditions. All data are given in Table 2. Several snapshots of the simulation are shown in Fig. 12. The simulation has been conducted with four processors. For our proof-of-concept implementation, we used four standard personal computers (PCs) as processors, connected by a one-gigabit local area network (1 GB LAN). The particles in Fig. 12 are colour-coded with respect to the processor that handles them.
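The face-centred cubic initial configuration can be generated from a four-atom basis per cubic cell. The sketch below is our illustration of this construction; the cell edge \(a=2^{1/6}\sigma \sqrt{2}\) follows from requiring the nearest-neighbour distance \(a/\sqrt{2}\) to equal the Lennard–Jones equilibrium length \(2^{1/6}\sigma \).

```python
import numpy as np

sigma = 1.0
a = 2**(1/6) * sigma * np.sqrt(2.0)   # cell edge; nearest-neighbour distance a/sqrt(2)

# the four atoms counted per cubic cell: corner and three face centres
basis = np.array([[0.0, 0.0, 0.0],
                  [0.0, 0.5, 0.5],
                  [0.5, 0.0, 0.5],
                  [0.5, 0.5, 0.0]])

def fcc(nx, ny, nz):
    return np.array([(np.array([ix, iy, iz]) + b) * a
                     for ix in range(nx) for iy in range(ny)
                     for iz in range(nz) for b in basis])

body = fcc(3, 3, 3)  # small test body; fcc(10, 10, 10) yields the 4000 particles
```

The smallest interparticle distance in the generated lattice is exactly the equilibrium length, so the body starts close to a local minimum of the Lennard–Jones energy.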

Table 2 Parameters for the simulation of the collision
Fig. 12

Simulation of the collision of two bodies at times \(t=0,4,8,12,16,20\) from top left to bottom right. The simulation is run with four processors identified by colour. The same colour means that the same processor is handling the particles

Fig. 13
figure 13

Energy preservation for the collision of two bodies: the deviation from the exact energy is shown over the time span [0, 30] with step size \(\tau =0.001\) for the Verlet scheme, the midpoint rule and the discrete gradient (DG) scheme. All three methods have been computed in parallel with four processors on a cluster computer (cf. [2])

The simulation is run up to (scaled) time \(T=30\). All results are given in scaled quantities. In Fig. 13, one can see that the DG method preserves the energy very well. The implicit midpoint rule overestimates the energy while the small body penetrates the larger body; the Verlet scheme underestimates the energy during this phase. When the larger body is destroyed, the energy computed by the midpoint rule decreases and the energy computed by the Verlet scheme increases. In order to check our computation, we also conduct the same experiment with LAMMPS on a single processor (no parallelisation). The independent Verlet implementation in LAMMPS shows exactly the same energy curve, including the small peaks later on. That is, only the discrete gradient method is able to simulate a true microcanonical ensemble. The implicit equation in (14) is solved with Newton's method (15) using the full Jacobian (16). The action of the Jacobian on a vector is computed directly, so that the linked cell method carries over to the computation of this matrix–vector product. The total linear momentum is preserved by all three methods. The total angular momentum cannot be preserved by any of the methods, due to the periodic boundary conditions; only in free space or with repelling boundary conditions can methods preserve the total angular momentum in an MD simulation.
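The structure of such a matrix-free Newton iteration can be sketched as follows. This is a minimal numpy illustration under our own assumptions, not the scheme (14)–(16) of the paper: the Jacobian appears only through a user-supplied action `jac_vec(x, v)`, and the inner linear systems are solved with a hand-rolled conjugate gradient method, which presumes a symmetric positive definite Jacobian (true for the toy problem below, not in general).

```python
import numpy as np

def cg_matvec(matvec, b, tol=1e-12, max_iter=200):
    """Conjugate gradients with the matrix given only through its action."""
    x = np.zeros_like(b)
    r = b - matvec(x)
    p = r.copy()
    rs = r @ r
    for _ in range(max_iter):
        if np.sqrt(rs) < tol:
            break
        Ap = matvec(p)
        alpha = rs / (p @ Ap)
        x += alpha * p
        r -= alpha * Ap
        rs_new = r @ r
        p = r + (rs_new / rs) * p
        rs = rs_new
    return x

def newton_matrix_free(F, jac_vec, x0, tol=1e-10, max_iter=50):
    """Newton's method where the Jacobian is never formed explicitly:
    only its action jac_vec(x, v) = J(x) v on a vector is required.
    In the MD setting this action can be evaluated with the same
    linked cell method that is used for the force computation."""
    x = x0.copy()
    for _ in range(max_iter):
        r = F(x)
        if np.linalg.norm(r) < tol:
            break
        x = x + cg_matvec(lambda v: jac_vec(x, v), -r)
    return x
```

The point of the sketch is that `jac_vec` can loop over the same cell neighbourhoods as the force evaluation, so no \(d \times d\) matrix is ever stored.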

5.3 Rate of acceleration

A modification of the above experiment is used in order to quantify the rate of acceleration by the parallelisation. The modification is necessary to obtain a reasonable load balance for the processors in a parallel computation. It is inspired by the experiments in Section 4.4 of [15].

Table 3 Parameters for the experiment to quantify the acceleration

In a face-centred cubic lattice, \(N=39,\!754\) particles are placed in a box with dimensions \(L \times L \times L\), where \(L=(N /\rho )^{1/3}\). The density \(\rho \) is given in Table 3. Periodic boundary conditions are imposed on the faces of the box. The particles interact by short-range Lennard–Jones potentials whose parameters are also given in Table 3. A small thermal motion is superimposed on the initial velocities of the particles according to a Maxwell–Boltzmann distribution with the reduced temperature T given in Table 3.
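Sampling such initial velocities can be sketched as follows; this is our own minimal illustration in reduced units (\(k_B = 1\)), with the unit mass and the removal of the mean velocity as our assumptions. Each velocity component is Gaussian with variance \(T/m\), and subtracting the mean sets the total linear momentum to zero.

```python
import numpy as np

def maxwell_boltzmann_velocities(n, temperature, mass=1.0, seed=0):
    """Sample initial velocities from a Maxwell-Boltzmann distribution
    in reduced units (k_B = 1): each Cartesian component is Gaussian
    with variance T/m.  The mean velocity is removed afterwards so
    that the total linear momentum of the system vanishes."""
    rng = np.random.default_rng(seed)
    v = rng.normal(0.0, np.sqrt(temperature / mass), size=(n, 3))
    v -= v.mean(axis=0)   # zero total linear momentum
    return v
```

Removing the mean is the usual choice, since a drifting centre of mass would otherwise be superimposed on the thermal motion.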

A fundamental measure to evaluate the increase in performance by parallelisation is the speedup

$$ S(p) = \frac{T}{T(p)}\,, $$

where p denotes the number of processors used in the computation, T(p) denotes the time needed by the parallel computation with p processors, and T denotes the time needed by the sequential programme. We use T(1), that is, the time needed by the parallel programme with one processor, as the sequential execution time T. Table 4 shows the speedup for one time step of the corresponding integrators with up to 8 processors. The speedup of the Verlet scheme is as expected. The DG method and the midpoint rule show the same, or even slightly better, speedup with the parallelisation strategy discussed in Sect. 5.1. That is, the acceleration by parallelisation for the DG schemes is as successful as the parallelisation of the Verlet scheme.
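The speedup and the associated parallel efficiency \(E(p) = S(p)/p\) can be computed from measured wall-clock times as in the following sketch; the function name and the example timings are hypothetical and serve only to illustrate the formula.

```python
def speedup_and_efficiency(timings):
    """From wall-clock times {p: T(p)}, compute the speedup
    S(p) = T(1) / T(p) and the parallel efficiency E(p) = S(p) / p.
    T(1) is taken as the sequential reference time, as in the text."""
    t1 = timings[1]
    return {p: (t1 / tp, t1 / (tp * p)) for p, tp in timings.items()}

# hypothetical timings in seconds per time step, for illustration only
example = {1: 100.0, 2: 52.0, 4: 27.0, 8: 15.0}
results = speedup_and_efficiency(example)
```

An efficiency close to 1 indicates that the communication overhead of the border-neighbourhood exchange is small compared to the force computation.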

Table 4 Speedup

5.4 Parallelisation of the evaluation of discrete gradients for bonded forces

The parallelisation of the bonded forces is simpler in the sense that it works in the standard way, which is illustrated in Fig. 14. On the left-hand side of Fig. 14, the domain with its border neighbourhood is shown. The adjacent processors then send the particles needed by this processor, while this processor sends the particles needed by its neighbours. This is the state in the middle of Fig. 14.

Fig. 14
figure 14

Reconnection and separation of atoms in the border neighbourhood: If the subdomain of the processor receives atoms that are bonded within a molecule, the processor needs to reattach the atoms at the correct sites. This is illustrated here from left to right in a two-dimensional example. First, the subdomain with the knowledge of the processor is shown. The middle shows the situation after the processor has received from the adjacent processors the particles necessary for the computation of the bonded forces. With the help of the molecule and particle numbers, the processor reattaches the molecules in the correct way; the result is illustrated on the right-hand side. After the computation of the bonded forces, the process is reversed from right to left

Then, the particles in the border neighbourhood are linked based on the molecule numbers and atom numbers stored with the particles. After that, the current processor sees all data needed to compute the discrete gradients with respect to bonded forces for the particles in its domain. Recall that the new positions are also stored with the corresponding particles. This state is illustrated on the right-hand side of Fig. 14. After their computation, the discrete gradients are also stored with the particles. Then, the particles in the border neighbourhood are separated again, which brings us back to the state in the middle of the figure. Finally, the particles in the border neighbourhood are deleted, because they are no longer needed. This procedure also works for the matrix-free computation of the Hessian of the bonded potentials times a vector.
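The linking step can be sketched as follows. This is an illustrative assumption of ours, not the paper's data structures: particles are modelled as dictionaries carrying a globally unique pair of molecule and atom numbers, and a bond is reattached only if both of its atoms are known to the processor after the exchange.

```python
def link_bonded_particles(particles, bonds):
    """Reattach atoms received from neighbouring processors to their
    molecules.  Each particle record carries a globally unique
    (molecule number, atom number) pair; bonds are given as pairs of
    such keys.  Returns, for every bond whose two atoms are both
    present on this processor, the pair of local particle records."""
    # index the local particles (domain plus border neighbourhood)
    index = {(p["mol"], p["atom"]): p for p in particles}
    linked = []
    for key_a, key_b in bonds:
        if key_a in index and key_b in index:
            linked.append((index[key_a], index[key_b]))
    return linked
```

Separating the particles again simply discards `linked` and, finally, the received border-neighbourhood records themselves.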

5.5 Experiment with butane

We run an experiment with 64 united-atom butane molecules. The initial condition and a snapshot of the simulation at time \(T=4\) can be seen in Fig. 15. The data for the potentials have been chosen as in Table 1. Also, the potentials are chosen as before, i.e. the angle potential is given in (29) and the torsion potential is given in (30). The simulation is set in a periodic box of size \(4r_{\text{cut}}\), with \(r_{\text{cut}}=2.5\sigma \) and \(\sigma \) as in Table 1 in scaled variables. The whole simulation is scaled as in the previous butane experiments.

Fig. 15
figure 15

Simulation of butane in a periodic box at times \(t=0,4\). The plot is made with the Open Visualization Tool (OVITO) (cf. [57])

The deviation from the exact energy over the simulation time with step size \(\tau = 0.0001\) is shown in Fig. 16. The simulation has been run with four processors. While the DG methods preserve the energy up to round-off error, the implicit midpoint scheme and the Verlet scheme deviate from the constant energy. The peaks in the energy of the Verlet scheme are real: we also computed the energy for the given initial value with LAMMPS on one processor, and this simulation reproduced exactly the same peaks as our code. This means that the particles do not evolve with respect to a genuine NVE ensemble at these peaks. The solid red line represents the closest a standard molecular dynamics package can get when using the Verlet scheme with the ensemble set to NVE. If one wishes for higher accuracy with respect to energy preservation, discrete gradient methods are an interesting alternative.

Fig. 16
figure 16

Energy preservation during the simulation of 64 united-atom butane molecules: the deviation from the exact energy is shown over the time span [0, 4] with step size \(\tau =0.0001\) for the Verlet scheme, the midpoint rule and the discrete gradient (DG) method

6 Conclusion

This work shows that all standard short-range interactions in a classical conservative molecular dynamics simulation can be computed by discrete gradient methods. These methods reliably preserve the total energy in the system, along with the total linear momentum and, in free space simulations, the total angular momentum. The simple, unifying idea behind the construction of the discrete gradients is to express all standard short-range interactions in terms of distances between atoms. The new discrete gradients for the dihedral angle potentials also suggest an interesting way to compute the gradient of dihedral angle potentials based on distances for use in standard time-integration schemes. Furthermore, the discrete gradient methods can be parallelised. We proposed the necessary changes to the linked cell method for the parallel evaluation of the discrete gradients with respect to truncated Lennard–Jones potentials as well as the necessary changes for bonded forces. As a result, the proposed DG methods can be computed in parallel.