Coarse-graining molecular dynamics: stochastic models with non-Gaussian force distributions

Incorporating atomistic and molecular information into models of cellular behaviour is challenging because of a vast separation of spatial and temporal scales between processes happening at the atomic and cellular levels. Multiscale or multi-resolution methodologies address this difficulty by using molecular dynamics (MD) and coarse-grained models in different parts of the cell. Their applicability depends on the accuracy and properties of the coarse-grained model which approximates the detailed MD description. A family of stochastic coarse-grained (SCG) models, written as relatively low-dimensional systems of nonlinear stochastic differential equations, is presented. The nonlinear SCG model incorporates the non-Gaussian force distribution which is observed in MD simulations and which cannot be described by linear models. It is shown that the nonlinearities can be chosen in such a way that they do not complicate parametrization of the SCG description by detailed MD simulations. The solution of the SCG model is found in terms of gamma functions.


Introduction
With increased experimental information on atomic or near-atomic structure of biomolecules and intracellular components, there has been a growing need to incorporate such microscopic data (coming from X-ray crystallography, NMR spectroscopy or cryo-electron microscopy) into dynamical models of intracellular processes. A common approach is to use molecular dynamics (MD) simulations based on classical molecular mechanics. Such MD models are written as relatively large systems of ordinary or stochastic differential equations for the positions and velocities of individual atoms, which can also be subject to algebraic constraints (Leimkuhler and Matthews 2015;Lewars 2016). Although all-atom MD simulations of systems consisting of a million of atoms have been reported in the literature (Tarasova et al. 2017;Farafonov and Nerukh 2019), such simulations are restricted to relatively small computational domains, which are up to tens of nanometres long. It is beyond the reach of state-of-the-art computers to simulate intracellular processes which include transport of molecules over micrometers, because this would require simulations of trillions of atoms (Erban and Chapman 2019).
An example is modelling of calcium (Ca 2+ ) dynamics. On one hand, at the macroscopic level, Ca 2+ waves can propagate between cells over hundreds of micrometres and Kang and Othmer (2009) developed a model of Ca 2+ waves in a network of astrocytes. It builds on previous modelling work by Kang and Othmer (2007) describing intracellular Ca 2+ dynamics as a system of differential equations for concentrations of chemical species involved, including inositol 1,4,5-trisphosphate (IP 3 ), a chemical signal that binds to the IP 3 receptor to release Ca 2+ ions from the endoplasmic reticulum. On the other hand, at the atomic level, Hamada et al. (2017) recently solved IP 3 -bound and unbound structures of large cytosolic domains of the IP 3 receptor by X-ray crystallography and clarified the IP 3 -dependent gating mechanism through a unique leaflet structure.
Although it is not possible to incorporate such a detailed information into Ca 2+ modelling by using all-atom MD in the entire intracellular space, there is still potential to design multiscale (multi-resolution) models which compute Ca 2+ dynamics with the resolution of individual Ca 2+ ions. Dobramysl et al. (2016) implement such a methodology at the Brownian dynamics (BD) level to study Ca 2+ puff statistics stemming from IP 3 receptor channels. Denoting the position of an individual Ca 2+ ion by X ≡ (X 1 , X 2 , X 3 ), its diffusive BD trajectory is given by where D is the diffusion constant and W i , i = 1, 2, 3, are three independent Wiener processes. Since individual positions of Ca 2+ ions are only needed in the vicinity of channel sites, Dobramysl et al. (2016) model diffusion of ions far away of the channel by a coarser model, utilizing the two-regime method developed by Flegg et al. (2012). This method enables efficient simulations with the BD level of resolution by coarse-graining the BD model in those parts of the simulation domain, where the coarse-grained model can be safely used without introducing significant numerical errors (Flegg et al. 2014(Flegg et al. , 2015Robinson et al. 2015).
Although BD models or their multi-resolution extensions simulate individual molecules of chemical species involved, the binding of Ca 2+ ions to channel sites or other interactions between molecules are only described using relatively coarse probabilistic approaches. For example, the BD model of Dobramysl et al. (2016) describes interactions in terms of reaction radii and binding probabilities as implemented by Erban and Chapman (2009) and Lipková et al. (2011). Atomic-level information is not included in BD models. In order to use this information, multi-resolution methodologies have to consider MD simulations in parts of the simulation domain. In the case of ions, such a multi-resolution scheme has been developed by Erban (2016), where an all-atom MD model of ions in water is coupled with a stochastic coarse-grained (SCG) description of ions in the rest of the computational domain.
The accuracy and efficiency of such multi-resolution methodologies depend on the quality of the SCG description of the underlying MD model. In this paper, we present and analyze a class of SCG models which can be used to fit non-Gaussian distributions estimated from all-atom MD simulations. While the velocity distribution of the coarsegrained particle can be well approximated by a Gaussian (normal) distribution in our MD simulations, this is not the case of the force distribution. Non-Gaussian force distributions have also been reported by Shin et al. (2010) and Carof et al. (2014) in their MD simulations of particles in Lennard-Jones fluids. Thus our SCG model is formulated in a way which incorporates a Gaussian distribution for the velocity and a non-Gaussian distribution for the force (acceleration).
Given an integer N ≥ 1, a coarse-grained particle (for example, an ion) will be described by (2N + 2) three-dimensional variables: its position X, velocity V and 2N auxiliary variables U j and Z j , where j = 1, 2, . . . , N . Denoting , the time evolution of the SCG model is given by where g j : R → R is an increasing differentiable function, g j is its derivative, g −1 j is its inverse, h j : R → R is a continuous function and η j,k are positive constants for j = 1, 2, . . . , N and k = 1, 2, 3, 4. We note that some of our assumptions on g j can be relaxed as long as g j (g −1 j (U j,i )) appearing in Eq. (4) can be suitably defined. The SCG description (2)-(5) includes 2N functions g j and h j and 4N additional parameters η j,k , which can be all adjusted to fit properties of the detailed all-atom MD model. In particular the SCG model (2)-(5) can better match the MD trajectories of ions than the BD description given by Eq. (1), which only has one parameter, diffusion constant D, to fit to the MD results.
One of the shortcomings of Eq. (1) is that its derivation from the underlying MD model requires us to consider the limit of sufficiently large times. In particular, we need to discretize Eq. (1) with a relatively large time step, say a nanosecond, to use it as a description of the trajectory of an ion. Since the typical time step of an allatom MD model is a femtosecond, it is difficult to design a multi-resolution scheme which would replace all-atom MD simulations by Eq. (1) in parts of the computational domain. The SCG model (2)-(5) can be used to fit not only the diffusion constant D but other important properties of all-atom MD models, which improves the accuracy of the SCG model at time steps comparable with the MD timestep.
SCG models can be constructed using a relatively automated procedure by postulating that an ion interacts with additional 'fictitious particles'. Such a methodology has been applied to coarse-grained modelling of biomolecules by Davtyan et al. (2015Davtyan et al. ( , 2016 to improve the fit between an MD model and the dynamics on a coarse-grained potential surface. They use fictitious particles with harmonic interactions with coarsegrained degrees of freedom (i.e. they add quadratic terms to the potential function of the system and linear terms to equations of motions) and each fictitious particle is also subject to a friction force and noise. An application of such an approach to ions leads to systems of linear stochastic differential equations (SDEs) and can be used, after some transformation, to obtain a simplified version of the SCG model (2)-(5), where functions g j and h j are given as identities, i.e. g j (y) = h j (y) = y for y ∈ R and j = 1, 2, . . . , N . Using this simplifying assumption in the SCG model (2)-(5), we obtain This is a linear system of SDEs with 4N parameters. It has been shown by Erban (2016) that such models can fit an increasing number of properties of all-atom MD simulations as we increase N . For example, the linear SCG model (6)-(9) can be used to fit the diffusion constant D and second moments of the velocity and the force for N = 1, while the velocity autocorrelation function can better be fitted for larger values of N , e.g. for N = 3. However, there are other properties of MD simulations which cannot be captured by linear models even if consider arbitrarily large N . They include, for example, all distributions which are not Gaussian. This motivates the introduction of general functions h j and g j in the SCG model (2)-(5). Considering the SCG model (2)-(5) in its full generality, it can capture more interesting dynamics. However, coarse-grained models can only be useful if they can be easily parametrized. Thus in our analysis, we focus on choices of functions g j and h j which both improve the properties of the SCG description and do not complicate its analysis and parametrization. The rest of the paper is organized as follows. In Sect. 2, we consider the linear SCG model (6)-(9) for N = 1, which is followed in Sect. 3 with the analysis of the linear model for general values of N . To get some further insights into the properties of this model, we study its connections with the corresponding generalized Langevin equation. In Sect. 4, we consider the nonlinear SCG model (2)-(5) for N = 1. We consider specific choices of nonlinearity g 1 , for which the model can be solved in terms of incomplete gamma functions. This helps us to design three approaches to parametrize the nonlinear SCG model, which are applied to data obtained from MD simulations. We conclude with the analysis of the nonlinear SCG model (2)-(5) for general values of N in Sect. 5.

Linear model for N = 1 and the generalized Langevin equation
We begin by considering the linear SCG model (6)-(9) for N = 1. To simplify our notation in this section, we will drop some subscripts and denote X = X i , V = V i , U = U 1,i , Z = Z 1,i , W = W 1,i and η k = η 1,k for k = 1, 2, 3, 4. Then Eqs. (6)-(9) read as follows where X is (one coordinate of) the position of the coarse-grained particle V is its velocity, U is its acceleration, Z is an auxiliary variable, dW is white noise and η j , j = 1, 2, 3, 4, are positive parameters. In order to find the values of four parameters η j suitable for modelling ions, Erban (2016) estimates the diffusion constants D and three second moments V 2 , U 2 and Z 2 from all-atom MD simulations of ions (K + , Na + , Ca 2+ and Cl − ) in aqueous solutions. The four parameters of the SCG model (10)-(13) can then be chosen as Then the SCG model (10)-(13) gives the same values of D, V 2 , U 2 and Z 2 as obtained in all-atom MD simulations.
Since the model (10)-(13) only has four parameters, we can only hope to get the exact match of four quantities estimated from all-atom MD. To get some insights into what we are missing, we will derive the corresponding generalized Langevin equation and study its consequences. The generalized Langevin equation can be written in the where K : [0, ∞) → R is a memory kernel and random term R(t) satisfies the generalized fluctuation-dissipation theorem, given below in Eq. (21). To derive the generalized Langevin equation (15), consider the two-variable subsystem (12)-(13) of the SCG model. Denoting y = (U , Z ) T , where T stands for transpose, Eqs. (12)-(13) can be written in vector notation as follows where matrix B ∈ R 2×2 and vectors b j ∈ R 2 , j = 1, 2, are given as Let us denote the eigenvalues and eigenvectors of B as λ j and ν ν ν j = (1, λ j ) T , j = 1, 2, respectively. The eigenvalues of B are the solutions of the characteristic polynomial λ 2 + η 2 λ + η 3 = 0. They are given by Since η 2 and η 3 are positive parameters, we conclude that real parts of both eigenvalues are negative. In what follows, we will assume η 2 2 = 4η 3 . Then we have two distinct eigenvalues and the general solution of the SDE system (16) can be written as follows where c ∈ R 2 is a constant vector determined by initial conditions and matrix Φ(t) ∈ R 2×2 is given as i.e. each column is a solution of the ODE system dy = B y dt. Calculating the inverse of Φ(t) and considering long-time behaviour, Eq. (18) simplifies to where memory kernel K (τ ) is given by and noise term R(t) is Gaussian with zero mean and the equilibrium correlation function satisfying the generalized fluctuation-dissipation theorem in the form Using (17), memory kernel (20) can be rewritten as where μ = η 2 2 /4 − η 3 . We note that the auxiliary coefficient μ is a square root of a real negative number for η 2 2 < 4η 3 . However, formula (22) is still valid in this case: for η 2 2 < 4η 3 it can be rewritten in terms of sine and cosine functions, taking into account  (a) that μ = i |μ| is pure imaginary, sinh(i |μ| τ ) = i sin(|μ|) τ and cosh(i |μ| τ ) = cos(|μ| τ ).
The memory kernel K (τ ), given by Eq. (22), is plotted in Fig. 1a for different values of parameter μ. For simplicity, we use non-dimensionalized versions of our equations with dimensionless parameters η 1 = 1 and η 2 = 4. We choose three different values of η 3 so that the values of μ are 1, i and 4i. In Fig. 1b, we plot the equilibrium velocity autocorrelation function which is defined as for τ ∈ [0, ∞). More precisely, we plot χ(τ )/χ(0) which is normalized so that its value at τ = 0 is equal to 1. It is related to the memory kernel by where dτ is the Laplace transform of the memory kernel K (τ ) and L −1 denotes Laplace inversion. Following Erban and Chapman (2019), we evaluate the right hand side of Eq. (23) as follows. Substituting Eq. (22) into (23), we obtain The polynomial in the denominator, p(s) = s 3 + η 2 s 2 + (η 1 + η 3 )s + η 1 η 2 , has positive coefficients. Since p(−η 2 ) < 0 < p(0), it has one negative real root in interval (−η 2 , 0), which we denote by a 1 . The other two roots (a 2 and a 3 say) may be real or complex, but if they are complex they will be complex conjugates since p(s) has real coefficients. Assuming that the real part of each root is negative, we first find the partial fraction decomposition of the rational function in (24) as where c i ∈ C are constants (which depend on η 1 , η 2 and η 3 ). Then we can rewrite Eq. (23) as The results computed by (25) are shown in Fig. 1b. We note that although Eq. (25) may include complex exponentials, the resulting χ(τ ) is always real. Since the diffusion constant, D, and the second moment of the equilibrium velocity distribution, V 2 , are related to χ by the parametrization (14) guarantees that both the value of χ(0) and the integral of χ(τ ) are captured accurately. However, the simplified SCG description (10)- (13) is not suitable to perfectly fit the velocity autocorrelation function or the memory kernel for all values of τ ∈ [0, ∞). In order to do this, we have to consider the SCG model (6)-(9) for larger values of N as it is done in the following section.

General linear SCG model and autocorrelation functions
Considering the linear SCG model (6)-(9) for general values of N , we can solve Eqs.
(8)-(9) for each value of j = 1, 2, . . ., N to generalize our previous result (19) as where kernel K j (τ ) is given by [compare with (22)] with and noise term R j,i (t) is Gaussian with zero mean and the equilibrium correlation function satisfying Substituting (26) to (7), we obtain the generalized Langevin equation where In particular, we have 3N parameters to fit memory kernel K (τ ), which can be estimated from all-atom MD simulations. There have been a number of approaches developed in the literature to estimate the memory kernel from MD simulations. Shin et al. (2010) use an integral equation with relates memory kernel K (τ ) with the autocorrelation function for the force and the correlation function between the force and the velocity. Estimating these correlation functions from long time MD simulations and solving the integral equation, they obtain memory kernel K (τ ). Other methods to estimate the memory kernel, K (τ ), of the corresponding generalized Langevin equation (29) have been presented by Gottwald et al. (2015) and Jung et al. (2017). An alternative approach to parametrize the linear SCG model (6)-(9) is to estimate the velocity autocorrelation function, χ(τ ), from all-atom MD simulations. This can be done by computing how correlated is the current velocity (at time t) with velocity at previous times. Since Eqs. (10)-(13) are linear SDEs, we can follow Mao (2007) to solve them analytically, using eigenvalues and eigenvectors of matrices appearing in their corresponding matrix formulation. Using this analytic solution, Erban (2016) use an acceptance-rejection algorithm to fit the parameters of linear SCG model (6)-(9) for N = 3 to match the velocity autocorrelation functions of ions estimated from all-atom MD simulations of Na + and K + in the SPC/E water.
Since the parameter μ j given by (28) is a square root of a real number, it can be both positive or purely imaginary. In particular, kernels K j (τ ) given by Eq. (27) can include both exponential, sine and cosine functions as illustrated in Fig. 1a. Since memory kernel K (τ ) is given as the sum of K j (τ ) in Eq. (30), typical memory kernels and correlation functions estimated from all-atom MD simulations can be successfully matched by linear SCG models for relatively small values of N . However, as shown by Mao (2007), analytic solutions of linear SDEs also imply that the process is Gaussian at any time t > 0, provided that we start with deterministic initial conditions. Thus the linear SCG model (6)-(9) for abtitrary values of N can only fit distributions which are Gaussian. This motivates our investigation of the nonlinear SCG model in the next two sections.
(2)-(5) read as follows where X denotes (one coordinate of) the position of the coarse-grained particle, V is its velocity, U is its acceleration, Z is an auxiliary variable, dW is white noise, η j , for j = 1, 2, 3, 4, are positive parameters and functions g : R → R and h : R → R are yet to be specified. Equation (31) describes the time evolution of the position, while Eqs. (32)-(34) admit a stationary distribution. We denote it by p (v, u, z). Then p (v, u, z) p(v, u, z), of SDEs (32)-(34) can be obtained by solving the corresponding stationary Fokker-Planck equation where C is the normalization constant, and functions G and H are integrals of functions g and h, respectively, which are given by We note that for the special case where g and h are given as identities, i.e. g(y) = h(y) = y for y ∈ R, the nonlinear SCG model (31)-(34) is equal to the linear SCG model (10)-(13) and functions G and H are G(y) = H (y) = y 2 /2. Then the stationary distribution (35) is product of Gaussian distributions in v, u and z variables.
In particular, we can easily calculate the second moments of these distributions in terms of parameters η j . Estimating these moments from all-atom MD simulations, we can parametrize the resulting linear SCG model (10)-(13) as shown in Eq. (14).
However, if we want to match a non-Gaussian force distribution, we have to consider nonlinear models. A simple one-parameter example is studied in the next section.

One-parameter nonlinear function
Consider that g is a function depending on one additional positive parameter η 5 as follows g(y) = |y| 1/η 5 signy, where we use sign to denote the sign (signum) function The function defined by (37) only satisfies our assumptions on g for η 5 ∈ (0, 1] as it is not differentiable at y = 0 for η 5 > 1, but we will proceed with our analysis for any positive η 5 > 0. Consider that function h is an identity, i.e. h(y) = y for y ∈ R, then Eqs. (31)-(34) reduce to where we would have to be careful, if we used this model to numerically simulate trajectories for η 5 > 1, because of possible division by zero for U = 0 in Eq. (41). If η 5 ∈ (0, 1], then we do not have such technical issues. Using Eq. (35), the stationary distribution is equal to where the normalization constant C is given by Let α ≥ 0. Integrating (43), we get Using (45) for α = 2 and α = 4, we obtain the following expression for kurtosis In particular, the kurtosis is only a function of one parameter, η 5 . It is plotted in Fig. 2a as the blue solid line, together with the kurtosis obtained for a more general twoparameter SCG model studied in Sect. 4.2. We observe that the distribution of U is leptokurtic for η 5 < 1 and platykurtic for η 5 > 1. If η 5 is equal to 1, then our SCG model given by Eqs. (31)-(34) reduces to the linear SCG model given by Eqs. (10)-(13), i.e. the stationary distribution is Gaussian and its kurtosis is 3. This is shown by the dotted line in Fig. 2a.
Since Eq. (46) only depends on parameter η 5 , we can use the kurtosis of the acceleration distribution (which is equal to the kurtosis of the force distribution) esimated from MD simulations to find the value of parameter η 5 . To calculate the kurtosis, we estimate the fourth moment U 4 in addition to the second moment, U 2 , used before in our estimating proceduce (14) for the linear model. In particular, we not only get Eq. (46) for calculating the value of parameter η 5 , but also a restriction on other parameters η 2 , η 3 and η 4 . Using (45) for α = 2, it can be stated as follows where we have used properties of the gamma function, including (1 + y) = y (y) and Euler's reflection formula, (1 − y) (y) sin(π y) = π , to simplify the right hand side. We note that in the Gaussian case, η 5 = 1, the right hand side of Eq. (47) further simplifies to η 2 which is indeed the formula for the second moment of U given by the linear SCG model (10)-(13). Equation (47) provides one restriction on four remaining parameters, η 1 , η 2 η 3 and η 4 , which need to be specified. This can be done by estimating three additional statistics from MD simulations, as in the case of the linear SCG model (10) Therefore, assuming that D, V 2 , Z 2 are obtained from MD simulations and η 2 4 /(2η 2 η 3 ) is given by (47), we can calculate parameters η k by We note that in the Gaussian case, η 5 = 1, we can substitute Eq. (48) for η 2

Two-parameter nonlinear function
Consider that g is a function depending on two positive parameters η 5 and η 6 as follows where sign function is defined by (38). In particular, our expression for function g is equal to the formula (37) for sufficiently large values of |y|. As discussed in the previous section, if we used formula (37), there would be some issues for y close to zero [for example, the division by zero for U = 0 and η 5 > 1 in Eq. (41)], so our generalized formula (52) replaces (37) with a linear function for smaller values of |y|. On the face of it, it looks that there could also be some issues with the generalized formula (52), because it is not strictly increasing for |y| ≤ η η 5 6 (1 − η 5 ). However, function (52) is increasing and invertible away of this region with its inverse given by Moreover, what we really need in Eqs. (31)-(34) is g (g −1 (u)) which can be defined as the following continuous function where the removable discontinuity at u = 0 has disappeared because we have defined g (g −1 (0)) = η 1−η 5 6 /η 5 . Integrating (52) and substituting (53), we get where G is the integral of function g defined by (36). Consider again that h is an identity, i.e. h(y) = y for y ∈ R. Then the stationary distribution (35) is again Gaussian in V and Z variables with their second moments given by Eq. (49). Let us denote the marginal stationary distribution of U by Using (35) and (54), we have Integrating (55), we get, for any α ≥ 0, where function F(κ 1 , κ 2 , α) is defined by and (resp. γ ) is the upper (resp. lower) incomplete gamma function defined by Substituting α = 2 and α = 4 in Eq. (57), we get This formula for the kurtosis is visualized in Fig. 2a as a function of parameter η 5 for three different values of parameter η 6 . We note that the case η 6 = 0 corresponds to the case studied in Sect. 4.1. If η 6 = 0, then Eq. (56) implies κ 1 = 0. Since γ (s, 0) = 0 and (s, 0) = (s), where (s) is the standard gamma function given by (44), we can confirm that Eq. (59) converges to our previous result (46) as η 6 → 0.
We note that the two additional parameters η 5 and η 6 can be used to satisfy both Eqs. (59) and (61), while in Sect. 4.1 we could only use one equation (Eq. (46) for kurtosis) to fit one parameter η 5 . However, in the case of one-parameter function (37), we could (instead of fitting the kurtosis) match the quantity U 2 / |U | 2 with MD simulations, i.e. we could replace Eq. (46) by Eq. (61) simplified to the oneparameter case corresponding to function (37). Passing to the limit η 6 → 0 in Eq. (61) and using Euler's reflection formula, (1 − y) (y) sin(π y) = π , we obtain that the one-parameter nonlinearity (37) implies the following formula Thus, in Sect. 4.1, we could use |U | and U 2 estimated from long-time MD simulations to calculate the left hand side of Eq. (64), which could then be used to select parameter η 5 . Other parameters could again be chosen by Eqs. (50)-(51).

Application to MD simulations
In Sects. 4.1 and 4.2, we have presented three approaches to fit nonlinear SCG models which have non-Gaussian force distributions to data obtained from MD simulations. In this section, we apply them to the results obtained by an illustrative MD simulation of a Lennard-Jones fluid, where we consider a box of 512 atoms which interact with each other through the Lennard-Jones force terms for parameters given for liquid argon (Rahman 1964), i.e. particles interact in pairs according to the Lennard-Jones potential 4ε ((σ/r ) 12 − (σ/r ) 6 ), where ε/k B = 120 K, σ = 0.34 nm and r being the distance between particles. We use standard NVT simulations where the temperature (T = 94.4 K) is controlled using the thermostat of Nosé (1984) and Hoover (1985) and the number of particles (N = 512 in a cubic box of side 2.91 nm) is kept constant by implementing periodic boundary conditions. Using a long time MD simulation (time series of lentgh 10 ns), we estimate three moments |U | , U 2 and U 4 as averages over all three coordinates, i.e.
is the acceleration of one specific atom (tagged particle) to which our SCG model is applied. Rounding all computational results to three significant figures, we obtain |U | = 0.753 nm ps −2 , U 2 = 1.10 nm 2 ps −4 and U 4 = 7.03 nm 4 ps −8 . In Fig. 2b, we plot the equilibrium MD distribution of the acceleration (average over all three coordinates) using blue circles. The resulting distribution is leptokurtic (with positive excess kurtosis). Its kurtosis has been estimated as Kurt[U ] = 5.85. The numerical values on the u-axis in Fig. 2b are expressed in [nm ps −2 ]. Since the acceleration, U , is proportional to the force exerted on the tagged particle (with the scaling factor equal to the atomic mass of argon), the plot of the acceleration distribution in Fig. 2b can also be interpreted as the plot of the force distribution, which has the same kurtosis, provided that we suitably rescale the units on the u-axis.
If we only attempt to fit the value of U 2 , we could parametrize the linear SCG model (10)-(13), which leads to the Gaussian acceleration distribution (plotted as the black dotted line in Fig. 2b). Using the one-parameter nonlinear function (37) from Sect. 4.1, we can use Eq. (46) to find parameter η 5 = 0.550 so that the nonlinear SCG model gives the same kurtosis as observed in MD simulations (Kurt[U ] = 5.85). The resulting distribution is given as the red dot-dashed line in Fig. 2b. It matches both second and fourth moments, U 2 and U 4 .
Using all-atom MD simulations, we can not only estimate the kurtosis, but other dimensionless ratios of moments of U . For example, we obtain U 2 / |U | 2 = 1.93. This estimate can be substituted in Eq. (64), which provides an alternative approach to obtain the value of parameter η 5 of the one-parameter nonlinear function (37). Using U 2 / |U | 2 = 1.93 and solving Eq. (64) numerically, we obtain η 5 = 0.692. The resulting distribution, which matches |U | and U 2 , is plotted as the green dashed line in Fig. 2b. We note that the parameter η 5 is dimensionless, because both Eqs. (46) and (64) only depend on dimensionless quantities estimated from MD simulations.
Since both distributions (for η 5 = 0.550 and η 5 = 0.692) are given by (43), they are unbounded for u close to zero. This motivates the choice of our two-parameter nonlinear function g used in Sect. 4.2.
Substituting Kurt[U ] = 5.85 and U 2 / |U | 2 = 1.93 in Eqs. (59) and (61) and solving them numerically, we obtain κ 1 = 0.149 and κ 2 = 0.771. Substituting into (62), we get the two parameters of model from Sect. 4.2 as η 5 = 0.297 and η 6 = 0.472 nm ps −2 . The resulting distribution, given by Eq. (55), is plotted in Fig. 2b as the cyan solid line. We observe that the distribution is now bounded. It is a piecewise defined function which is Gaussian for the values of u satisfying |u| ≤ η 6 , which removes the singularity at u = 0. At the same time, the distribution given by Eq. (55) matches all three moments estimated from MD simulations, |U | , U 2 and U 4 . As we can see in Fig. 2b, this distribution does not perfectly fit the acceleration distribution estimated from MD simulations. If our aim is to obtain a SCG model which better fits the whole distribution, we can use SCG models for larger values of N as we will discuss in the next section.

Nolinear SCG model for general values of N
We have already observed in Sects. 2 and 3 that the linear SCG model (6)-(9) can match the MD values of a few moments for N = 1, while we need to consider larger values of N to match the entire velocity autocorrelation function. Considering the nonlinear SCG model (2)-(5), we have two options to capture more details of the non-Gaussian force distribution observed in MD simulations. We could either keep N = 1, as in Sect. 4, and introduce additional parameters into nonlinearity g = g 1 , or we could consider larger values of N . In Sect. 4, we have shown that by going from one-parameter to two-parameter function g, we improve the match with MD results. In this section, we will discuss the second option: we will use larger values of N .
Then p(v, u, z) dv du 1 du 2 . . . du N dz 1 dz 2 . . . dz N gives the probability that V i (t) ∈ [v, v+dv), U j,i (t) ∈ [u j , u j +du j ) and Z j,i (t) ∈ [z j , z j +dz j ), for j = 1, 2, . . ., N , at equilibrium. The stationary distribution can be obtained by solving the corresponding stationary Fokker-Planck equation Our analysis in Sect. 4.1 shows that parameters η j,2 , η j,3 and η j,4 appear on the left hand side of Eq. (47) as a suitable fraction, which in the Gaussian case corresponds to the second moment of the acceleration (see Eq. (48)). Considering general N , we define this fraction as new parameters and we again assume that the second moment of the velocity distribution, V 2 = V 2 i , can be estimated from long-time MD simulations. In order to find the stationary distribution, we will require that parameters η j,1 , η j,2 , η j,3 and η j,4 satisfy (compare with Eq. (49) for N = 1) Then the stationary distribution, obtained by solving (65), is given by where C is the normalization constant and functions G j and H j are integrals of functions g j and h j , respectively, which are given by Following (37), we assume that h j (z j ) = z j and each g j is a function of one additional positive parameter η j,5 , j = 1, 2, . . ., N , given as g j (y) = |y| 1/η j,5 signy.
Then we have Then the stationary distribution (66) is Gaussian in V i and Z j,i variables and we can integrate (66) to calculate the marginal distribution of U j,i by Consequently, where the normalization constant C j is given by Integrating (68), we can calculate The acceleration of the coarse-grained particle is given by Using the symmetry of (68), odd moments of U j,i are equal to zero. In particular, U j,i = 0 and U 3 j,i = 0 for j = 1, 2, . . ., N . Consequently, which gives Substituting Eq. (69) for moments on the right hand side of Eq. (72), we can express the kurtosis of U i in terms of 2N parameters σ j and η j,5 , where j = 1, 2, . . ., N .
For example, if we choose the values of dimensionless parameters η j,5 equal to given numbers and define new parameters then Eq. (69) implies that U 2 j,i is a linear function of κ j and U 4 j,i is a quadratic function of κ j . Equations (70) and (71) can then be rewritten as the following system of two equations for κ 1 , κ 2 , . .
where c 1, j and c 2, j are known constants, which will depend on our initial choice of values of η j,5 . Thus, using N > 2, we still have an opportunity to not only fit the second and fourth moments of the force distribution, but other moments as well. For example, the 6-th moment, U 6 i , would include the linear combination of the third powers of κ j . We could also fit other properties of the force distribution estimated from MD simulations. For example, we could generalize one-parameter nonlinearities (67) to two-parameter nonlinear functions, as we did in Eq. (52). Then we could match the value of the distribution at u = 0, if our aim was to get a better fit of the MD acceleration distribution obtained in the illustrative example in Fig. 2b. Another possible generalization is to consider nonlinear functions h j , provided that we estimate more statistics on the auxiliary variable Z from MD simulations.

Discussion and conclusions
We have presented and analyzed a family of SCG models given by Eqs.
(2)-(5), which can be parametrized to fit properties of detailed all-atom MD models. A special choice of functions g j and h j in Eqs. (2)-(5) leads to the linear SCG model (6)-(9) which is used in a multiscale (multi-resolution) method developed by Erban (2016) as an intermediate description between all-atom MD simulations and BD models. The linear SCG model is studied in more detail in Sects. 2 and 3, where we highlight that 4N parameters of this model can match some statistics estimated from all-atom MD simulations with increased accuracy as we increase N , but there are also statistics which cannot be matched for any value of N . They include non-Gaussian force distributions.
In Sects. 2 and 3, we show that the linear SCG model (6)-(9) corresponds to the generalized Langevin equation with the stochastic driving force being Gaussian. Such systems have been analysed since the work of Kubo (1966). One approach to match non-Gaussian MD force distributions could be to use the non-Gaussian generalized Langevin equation which was analyzed by Fox (1977) using methods of multiplicative stochastic processes. However, if we want to generalize the linear SCG model (6)-(9) while keeping its structure as a relatively low-dimensional system of SDEs, then it can be done by introducing nonlinear functions g j and h j as shown in Eqs. (2)-(5). The advantage of the presented approach is that we can directly replace the linear model by Eqs. (2)-(5) in multiscale methods which use all-atom MD simulations in parts of the computational domain and (less detailed) BD simulations in the remainder of the domain. Coupling MD and BD models is a possible approach to incorporate atomiclevel information into models of intracellular processses which include transport of molecules between different parts of the cell (Erban 2014(Erban , 2016Gunaratne et al. 2019). The nonlinear SCG model (2)-(5) is studied in Sect. 4 for N = 1. Describing the nonlinearity as the one-parameter function given by (37), we can use its dimensionless parameter η 5 to match the kurtosis of the force distribution estimated from all-atom MD simulations. Although the one-parameter case is easy to analyze in terms of the gamma function, it has some undesirable properties for small forces. If η 5 > 1, we can obtain large terms in the dynamical equation (41) for small values of U ; this corresponds to the zero value of stationary probability distribution (43) for u = 0. If η 5 < 1, we have small terms in the dynamical equation (41), but the stationary probability distribution (43) is unbounded for u = 0. In Sect. 4.2, we have shown that these issues can be avoided if the two-parameter nonlinear function (52) is used instead of the one-parameter function (37). The resulting equations are solved in terms of incomplete gamma functions. In Sect. 5, we study the nonlinear model for general values of N where each g j is a one-parameter nonlinearity given by Eq. (67). However, we could also consider two-parameter functions g j , like we did in Eq. (52) for N = 1, to improve the properties of the SCG model for general values of N .