1 Introduction

Lubrication is the most relevant technique to reduce friction and wear in mechanical applications [1]. The drive toward sustainable technologies has increased the demands on the performance of lubricants [2]. Often the traditional empirical selection or design rules for liquid lubricants fail under the extreme conditions that can be found in modern machinery [3]. Therefore, the predictive quantitative description of lubricant properties, over a wide range of pressures, temperatures and shear rates, plays a pivotal role in the optimization of mechanical devices with respect to friction properties. This is especially the case for the lubricant viscosity [4, 5], which determines the shear stress

$$\begin{aligned} \tau =\eta {\dot{\gamma }} \end{aligned}$$
(1)

in a lubricant film subjected to a shear rate \({\dot{\gamma }}\). Besides experimental methods, classical molecular dynamics (MD) simulations [6, 7] are by now an established tool to calculate lubricant viscosities under precisely defined thermodynamic conditions (pressure, temperature) and different shear rates, thus giving access to \(\eta (P,T, {\dot{\gamma }})\). More recently, machine learning approaches have been used to take the numerical calculation of lubricant viscosities one step further by extracting structure–property relationships and predicting viscosities for a very large parameter space [8]. Although machine learning is a very promising route to efficient viscosity calculations, it also requires sufficiently large data sets obtained from experiment or MD as training data. Several methods exist to calculate the viscosity from MD simulations, each extracting it from time averages of different microscopic variables of the atomic model system. However, due to the statistical nature of the microscopic properties, the simulation times necessary to achieve converged average values can become impractically long, depending on the conditions and the chosen method. This is particularly true for the linear response regime, where the viscosity is independent of the shear rate, i.e., for the Newtonian viscosity, which is the predominant quantity used to characterize lubricants. For instance, by imposing a shear flow \(v_x(z)\) of constant rate \(\partial _z(v_x)={\text{const}}.\) on a liquid bulk volume element, the viscosity can be calculated via Eq. (1) by extracting the shear stress \(\tau =\langle s_{xz}\rangle\) as the time average of the respective entry in the stress tensor \({\textbf{s}}\).
However, although the approach appears to be straightforward, determining the Newtonian viscosity of typical hydrocarbon lubricants in shearing MD simulations is a difficult task due to a combination of large fluctuations of the stress tensor at low shear rates, and shear thinning effects at higher shear rates. Usually, extrapolation schemes are used to predict the Newtonian viscosity from the shear thinning behavior [9, 10], although the theoretical description of the shear thinning is still under debate. An alternative is the fluctuation-dissipation theorem [11], which offers an explicit way to calculate transport coefficients for the linear response regime via Green–Kubo relations [12, 13]. For the Newtonian viscosity, the latter reads

$$\begin{aligned} \eta = \frac{1}{V k_{\rm B} T} \int _0^\infty \langle s_{\alpha \beta }(t)s_{\alpha \beta }(0) \rangle _{\text{EQ}}\;\text{d}t, \end{aligned}$$
(2)

with \(\langle \cdots \rangle _{\text{EQ}}\) being the equilibrium ensemble average, V the system volume and \(s_{\alpha \beta }(t)\) (\(\alpha \ne \beta\)) a non-diagonal entry of the instantaneous stress tensor [14]. Depending on the correlation time of these equilibrium shear stress fluctuations, the integral converges to a constant value at a finite time, which is accessible in equilibrium molecular dynamics (EMD) simulations. This approach has already been successfully applied to various hydrocarbons under normal conditions with viscosities of the order of \(10^{-3}\,\text{Pa\,s}\) [15,16,17,18]. However, the viscosity calculation becomes increasingly challenging for more viscous systems, like lubricants under high loads [19,20,21,22,23]. This is due to the slower dynamics, which lead to a longer correlation time of the stress tensor in Eq. (2). Thus, longer simulation times are necessary for the integral to converge, often up to the point where the autocorrelation time scale of \(s_{\alpha \beta }\) exceeds the simulation time that is feasible with the available computational resources. Consequently, only very few studies have employed Eq. (2) in EMD simulations of such highly viscous systems [19,20,21,22,23], and a systematic study of the connection between viscosity and stress correlation time, as well as a methodical estimate of the computational cost, are missing. In addition, it is worth exploring alternative ways to calculate the Newtonian viscosity in EMD simulations, especially for highly viscous systems. The Stokes–Einstein (SE) equation (3) provides such an alternative, connecting the viscosity \(\eta\) with the self-diffusion coefficient D and the molecule’s radius R [24]

$$\begin{aligned} D=\frac{k_{\rm B} T}{n\pi R \eta }, \end{aligned}$$
(3)

where n is a constant and lies between a slip (\(n=4\)) and a stick (\(n=6\)) boundary condition [14, 25]. In EMD simulations, the diffusion coefficient is commonly calculated via one of the following two formulas [26]. On the one hand, it can be extracted from the mean squared displacement (MSD)

$$\begin{aligned} D_{\text{MSD}}= \lim _{t\rightarrow \infty } \frac{1}{6}\frac{\text{d}}{\text{d}t} \langle ({\textbf{r}}(t)-{\textbf{r}}(0) )^2\rangle _{\text{EQ}}. \end{aligned}$$
(4)

On the other hand, as for the viscosity, the diffusion coefficient can also be calculated via a GK relation, namely

$$\begin{aligned} D_{\text{VACF}}= \frac{1}{3} \int _0^\infty \langle {\textbf{v}}(t)\cdot {\textbf{v}}(0) \rangle _{\text{EQ}} \;\text{d}t, \end{aligned}$$
(5)

which requires calculating the velocity autocorrelation function (VACF) \(\langle {\textbf{v}}(t)\cdot {\textbf{v}}(0) \rangle\) (with \({\textbf{v}}\) the molecule’s center of mass velocity). The validity of the SE equation (3) for pure simple liquids is well established [25], although it may fail in dense liquids at low temperatures [27,28,29]. Specifically for hydrocarbons, two recent MD studies considered pure hydrocarbon liquids at different (P, T) conditions with pressures up to several \(100\,\text{MPa}\), resulting in viscosity values up to the order of \(10\,\text{mPa\,s}\) [21, 30]. In our previous work [30], we found Eq. (3) to hold true for different linear and branched alkanes, including the two main constituents of PAO4 (see Table 1a/b). In Ref. [21], the validity of Eq. (3) was also observed for paraffinic and aromatic hydrocarbons. In contrast, another MD study on hydrocarbons at room temperature for pressures up to \(1\,\text{GPa}\) suggests a possible violation of Eq. (3) for squalane above \(100\,\text{MPa}\) [23].

In the present study, the Newtonian viscosity \(\eta\) of the base oil PAO4 was analyzed under a wide range of thermodynamic conditions (i.e., from low to highly viscous systems) using EMD simulations. Three different calculation methods, which follow from Eqs. (2) to (5), were compared systematically. On the one hand, the viscosity was calculated applying the GK method with the stress autocorrelation function (SACF). The convergence time of the GK integral was evaluated as a function of the viscosity, resulting in a correlation that allows an extrapolation of the computational effort necessary to calculate high viscosity values. On the other hand, the viscosity was implicitly extracted from the SE relation (3) by calculating the diffusion coefficient and taking advantage of an accurate definition of the lubricant molecules' Stokes radius [30]. For the computation of the diffusion coefficient, both routes, via the MSD Eq. (4) and via the VACF Eq. (5), were investigated and compared. To model the PAO4 molecules, we chose to employ a united-atom (UA) approach, despite the well-known discrepancies of the viscosity values from UA models with respect to experimental results [31, 32]. This is because, on the one hand, the aim of this study is mainly to evaluate and compare the different calculation methods up to the highest viscosity values that are feasible within the given computational resources. On the other hand, UA potentials are still very relevant for applications which necessitate a certain simulation length [33, 34] or system size, for example the modeling of lubricated tribological contacts including surface roughness [35], making it important to characterize their high-pressure viscosity behavior. The paper is organized as follows: Details about the EMD simulation protocols including the applied model parameters are given in Sect. 2. Calculations of the viscosity are shown in Sect. 3, by applying the SACF-GK method, and in Sect. 4 via the diffusion coefficient and SE equation. Finally, a comparative discussion of the different methods is given in Sect. 5.

2 EMD Simulations

2.1 United-Atom Model of PAO4

Table 1 Molecule structure with Stokes radius, mass-percentage and number of molecules for the PAO4 components C10-trimer (a), -tetramer (b) and -pentamer (c) used in the simulation setup. Molecule snapshots are visualized with united-atoms CH\(_3\) (blue) and CH, CH\(_2\) (both grey) (Color figure online)

The PAO4 lubricant investigated in this work is a mixture of C10-trimer, -tetramer and -pentamer molecules with molecular structures and weight fractions of the three different species as shown in Table 1. The simulated model system consisted of a liquid bulk volume element with periodic boundary conditions containing a total of 242 molecules. The molecules were modeled by a united-atom (UA) approach where each CH\(_x\) group \((x=1,2,3)\) represents one charge-neutral interaction site. Bonded interactions consisted of the usual harmonic potentials for bonds and angles, and a Fourier-series-based torsional potential. Non-bonded interactions were modeled via the Lennard–Jones (LJ) potential. The exact form of the bonded and non-bonded potentials, the force field parameters (taken mainly from the TraPPE-UA force field [36]), as well as all other details necessary to implement the model, are given in Appendix 1.

2.2 Simulation Protocol

Time integration of the molecular dynamics was performed with the LAMMPS software package, utilizing the velocity Verlet integrator with a \(2\,\text{fs}\) time step [37, 38]. The procedure for generating equilibrated liquid bulk volume elements with given temperature and pressure, which followed in large parts that of Messerly et al. [20], is described in detail in the following. First, the atomic coordinates were set up with lattice-distributed molecules and the temperature was initialized by assigning the atomic velocities according to a Gaussian distribution (step 1). Subsequently, the temperature (step 2) and pressure (step 3) were equilibrated, first in one NVT and then in one NPT simulation of \(2\,\text{ns}\) length each. Afterwards, another \(2\,\text{ns}\) NPT simulation (step 4) was performed during which the volume of the system was sampled. The system volume was then fixed at the average volume obtained in step 4 and the equilibration procedure was finished by another \(2\,\text{ns}\) NVT simulation (step 5). All NVT and NPT simulations were performed with a Nosé-Hoover thermostat [39] and barostat [40] with damping factors of \(\xi _T = 500\,{\text{d}}t = 1\,\text{ps}\) and \(\xi _P = 2500\,{\text{d}}t = 5\,\text{ps}\). Note that for velocity rescaling algorithms such as the Nosé-Hoover thermostat, the choice of the damping factor does not influence the observables diffusivity and viscosity according to Ref. [41]. Finally, data sampling for the different viscosity calculation methods was performed in NVT simulations, lasting \(10\,\text{ns}\) (VACF method) or \(40\,\text{ns}\) (MSD and SACF method). For each state point (P, T), the whole equilibration and sampling procedure was carried out once (VACF method) or 40 times (MSD and SACF method), each time with different initial atom velocities (see step 1), to obtain 40 independent equilibrated starting configurations (MSD and SACF method) for data sampling, and thus 40 viscosity values \(\eta (P,T)\) for averaging and error estimation.

3 Direct Viscosity Calculation

3.1 Calculation and Evaluation of the SACF Integral

Fig. 1

Integrals of the SACF Eq. (7) for 40 independent trajectories (gray lines), their average (solid blue) and its error interval given by the standard error of mean (dashed blue) at a \(100\,^\circ \text{C}\)/\(0\,M \text{Pa}\), b \(20\,^\circ \text{C}\)/\(0\,M\text{Pa}\) and c \(20\,^\circ \text{C}\)/\(100\,M\text{Pa}\). Red line/circle: Result of the fit of function Eq. (8) to the average data, and plateau time \(\tau_{\rm p}\) according to Eq. (11) (Color figure online)

To compute the viscosity with the SACF-GK method via Eq. (2) from the simulation data, we first evaluate the time autocorrelation function \(\langle s_{\alpha \beta }(0) s_{\alpha \beta }(t) \rangle\) of each off-diagonal stress tensor element \(s_{\alpha \beta }\) (\(\alpha \beta =xy,xz,yz\)) as described in the following. The stress tensor data from each simulation run are a time series \({\textbf{s}}(t_k=k\Delta t)\) with time resolution \(\Delta t=6\,\text{fs}\) and k going from 0 to \(K = t_{\max }/\Delta t\) (\(t_{\max }=40\,\text{ns}\)). To maximize statistical significance using the available data, the numerical ACF computation is performed averaging over all possible time origins \(t_0\):

$$\begin{aligned} \langle s_{\alpha \beta }(t+t_0) s_{\alpha \beta }(t_0) \rangle _{t_0} = \frac{1}{K-k+1}\sum _{l=0}^{K-k} (s_{\alpha \beta }(t_k+t_l) s_{\alpha \beta }(t_l)) \end{aligned}$$
(6)

for discrete times \(0\le t_{k,l}\le t_{\max }\). Results for the SACF of the three \(s_{\alpha \beta }\) (\(\alpha \beta =xy,xz,yz\)) are then averaged and integrated as required by the SACF-GK expression (2)

$$\begin{aligned} \Phi (t)= \frac{1}{V k_{\rm B} T} \int _0^t \frac{1}{3}\sum _{\alpha \beta } \langle s_{\alpha \beta }(t^{\prime }+t_0) s_{\alpha \beta }(t_0) \rangle _{t_0} \;\text{d}t^{\prime } \end{aligned}$$
(7)

where the integration is performed numerically with the trapezoidal rule. This SACF evaluation is carried out for each of the 40 independent simulation runs. As a remark, we would like to point out that the SACF of molecular systems typically features some very strong oscillations at short times (\(<1\,\text{ps}\)), which can only be resolved properly with a sufficiently high data output frequency. Figure 1 shows the resulting integrals \(\Phi (t)\) (gray lines), as well as their average \(\Phi_{\rm av}(t)\) (blue line) for three exemplary state points (P, T), representing systems with a low (Fig. 1a), medium (Fig. 1b) and high (Fig. 1c) viscosity. The variation of the results from the 40 different simulation runs is quantified by the standard error of mean (SEM), where we show the error interval \([\Phi_{\rm av}(t)-\text{SEM},\Phi_{\rm av}(t)+\text{SEM}]\) (blue dashed lines). The next step consists of fitting a suitable analytical expression \(\Phi_{\rm fit}(t)\) to the averaged result \(\Phi_{\rm av}(t)\) of the EMD simulations. Following the approach of Heyes [42], we used the fitting function

$$\begin{aligned} \Phi_{\rm fit}(t)=&\;G_{\infty }\;(A\;\tau _1\;\arctan [\sinh (t/\tau _1)]\;\\&+\;B\;\tau _2\;[1-\exp (-t/\tau _2)]\\&\;+(1-A-B)\;\tau _3\;[1-\exp (-t/\tau _3)]),\end{aligned}$$
(8)

with the SACF peak

$$\begin{aligned} G_{\infty } = s_{\alpha \beta }(0)\;V/(k_{\rm B} T) \end{aligned}$$
(9)

and the fit parameters A, B, \(\tau _1\), \(\tau _2\) and \(\tau _3\). The fitting is based on an orthogonal distance regression [43], with the data weighted by its standard error of the mean (\(1/{\text{SEM}}\)). Note that the commonly used least squares fitting method is problematic for the present application because of the steep gradient (\(\Delta y/\Delta x\) with \(\Delta y\gg \Delta x\)) of \(\Phi (t)\) at small times. In the residual minimization of the least squares fitting, the impact of this region of steep slope on the fitting parameters is too big compared to the area of convergence of the GK integral. Finally, the long time limit of the fit is used to compute the viscosity

$$\begin{aligned} \eta _{\text{SACF}}=\lim _{t\rightarrow \infty } \Phi_{\rm fit}(t)= G_\infty \;(A\;\tau _1\;\pi /2\;+\;B\;\tau _2\;+\;(1-A-B)\;\tau _3). \end{aligned}$$
(10)

To obtain an error interval for the viscosity, the same fit evaluation was also performed for the error curves \(\Phi_{\rm av}(t)-\text{SEM}\) and \(\Phi_{\rm av}(t)+\text{SEM}\).
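The evaluation chain of Eqs. (6) and (7) can be illustrated with a minimal sketch in plain Python. It is not the analysis code used in this work: the function names (`acf_all_origins`, `running_integral`, `gk_integral`) are hypothetical, the direct summation is only practical for short series, and consistent units for the stress data, volume and \(k_{\rm B}T\) are assumed to be supplied by the caller.

```python
def acf_all_origins(series, max_lag):
    """Autocorrelation averaged over all available time origins (cf. Eq. (6))."""
    n = len(series)  # n = K + 1 samples
    acf = []
    for k in range(max_lag + 1):
        origins = n - k  # K - k + 1 origins contribute at lag k
        total = sum(series[l + k] * series[l] for l in range(origins))
        acf.append(total / origins)
    return acf


def running_integral(values, dt):
    """Cumulative trapezoidal integral of equally spaced samples."""
    out = [0.0]
    for left, right in zip(values, values[1:]):
        out.append(out[-1] + 0.5 * (left + right) * dt)
    return out


def gk_integral(sxy, sxz, syz, dt, volume, kb_t, max_lag):
    """Phi(t) of Eq. (7): average the three off-diagonal SACFs, then integrate."""
    acfs = [acf_all_origins(s, max_lag) for s in (sxy, sxz, syz)]
    mean_acf = [sum(components) / 3.0 for components in zip(*acfs)]
    prefactor = 1.0 / (volume * kb_t)
    return [prefactor * value for value in running_integral(mean_acf, dt)]
```

For production-length trajectories (several million samples per run), an FFT-based autocorrelation would be the usual replacement for the double loop; the averaging over time origins and the trapezoidal integration remain the same.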

3.2 Results of the SACF Method for the PAO4 Viscosity

The explicit viscosity calculation via the GK-SACF method described in this section was applied to systems for pressures up to \(300\,\text{MPa}\) in a temperature range from 20 to \(150\,^\circ\text{C}\). The resulting viscosities reach values up to about \(20\,\text{Pa\,s}\) as displayed in Fig. 2. It is worth noting that only about two-thirds of the values stem from numerical GK integrals which, within the time interval \([0, 40\,\text{ns}]\), are as clearly established as in Fig. 1a and b. This is the case for the results marked by full symbols in Fig. 2. The open symbols stem from fits on GK integrals where the data available for the fit of Eq. (8) mostly consisted of the strongly increasing part of the SACF integral, as shown for example in Fig. 1c. To quantify this observation, we defined the "plateau time" \(\tau _{\text{p}}\) via the condition

$$\begin{aligned} \Phi_{\rm fit}(\tau _{\text{p}}) = 0.98\,\eta _{\text{SACF}} \end{aligned}$$
(11)

as a criterion for the time of convergence. Figure 3 shows the relation between plateau time and viscosity, which can be estimated with a power law

$$\begin{aligned} \tau _{\text{p}} = 0.224\,\eta_{\rm SACF}^{0.7}. \end{aligned}$$
(12)
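As an illustration of how Eqs. (8), (10) and (11) fit together, the following sketch evaluates the fit function, its long-time limit, and the plateau time by bisection. The function names and the bisection routine are illustrative assumptions, not the implementation used here; the bisection further assumes a monotonically increasing fit (non-negative weights \(A\), \(B\) and \(1-A-B\)). The identity \(\arctan [\sinh (x)]=2\arctan [\tanh (x/2)]\) is used only to avoid floating-point overflow at large \(t/\tau _1\).

```python
import math


def phi_fit(t, g_inf, a, b, tau1, tau2, tau3):
    """Fit function of Eq. (8)."""
    # arctan(sinh(x)) written as 2*atan(tanh(x/2)) to avoid overflow in sinh
    term1 = a * tau1 * 2.0 * math.atan(math.tanh(0.5 * t / tau1))
    term2 = b * tau2 * (1.0 - math.exp(-t / tau2))
    term3 = (1.0 - a - b) * tau3 * (1.0 - math.exp(-t / tau3))
    return g_inf * (term1 + term2 + term3)


def eta_sacf(g_inf, a, b, tau1, tau2, tau3):
    """Long-time limit of the fit function, Eq. (10)."""
    return g_inf * (a * tau1 * math.pi / 2.0 + b * tau2 + (1.0 - a - b) * tau3)


def plateau_time(params, fraction=0.98, rel_tol=1e-9):
    """Solve phi_fit(tau_p) = fraction * eta_SACF (Eq. (11)) by bisection.

    Assumes phi_fit increases monotonically with t.
    """
    target = fraction * eta_sacf(*params)
    t_hi = 1.0
    while phi_fit(t_hi, *params) < target:  # expand bracket until target is enclosed
        t_hi *= 2.0
    t_lo = 0.0
    while t_hi - t_lo > rel_tol * t_hi:
        mid = 0.5 * (t_lo + t_hi)
        if phi_fit(mid, *params) < target:
            t_lo = mid
        else:
            t_hi = mid
    return 0.5 * (t_lo + t_hi)
```

The returned time carries the same unit as the \(\tau _i\) passed in, so comparing it with \(t_{\max }\) directly classifies a state point as converged or extrapolated.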

For high viscosities above \(1000\,\text{mPa\,s}\) (empty symbols in Figs. 2 and 3), the time \(\tau _{\text{p}}\) for the onset of the plateau indeed exceeds the maximum simulation time \(t_{\max }=40\,\text{ns}\) (see for example red circle in Fig. 1c). The question arises how accurate viscosity values from such strongly extrapolated fit functions are. To analyze the quality of the computed viscosities with respect to experimental values, they are compared to data from Refs. [44, 45] (gray symbols and dotted lines in Fig. 2, respectively). Since it is known that the UA-TraPPE force field [36] underestimates the Newtonian viscosity [31], we applied a different weighting (0.5 instead of 0.2) of the indirectly bonded interaction sites with two intermediate bonds (1–4 interactions). With this adjustment, a better agreement with the experimental data is achieved for the lower viscosities, up to approx. \(100\,\text{mPa\,s}\). For larger viscosities, however, the simulation data increasingly overestimate the experimental values. The discrepancy is particularly large for results from numerical GK integrals with \(t_{\max }<\tau_{\rm p}\), which emphasizes the question about their accuracy. To better quantify the reliability of results from extrapolated fit functions with \(t_{\max }<\tau_{\text{p}}\), one state point (\(T=20\,^\circ\text{C}\), \(P=100\,\text{MPa}\), see Fig. 1c) was studied in more detail (Appendix 2). In short, the fitting and extrapolation procedure is not prone to a systematic error: The viscosity results from different amounts of simulation data (10, 20 and \(40\,\text{ns}\)) showed no clear trend of under- or overestimation, compared to a converged reference state obtained from \(80\,\text{ns}\) long runs. Moreover, the correlation Eq. (12) that was observed between the viscosity and the convergence time \(\tau _{\text{p}}\) of the GK integral holds consistently up to the highest viscosity and \(\tau _{\text{p}}\) values (Fig. 3). 
Based on this, we expect that simulation times of around \(100\)–\(1000\,\text{ns}\) would be needed for the "proper" calculation of viscosity values of the order of \(1\)–\(10\,\text{Pa\,s}\) via the SACF, presuming that a simulation time of about twice the SACF plateau time \(\tau_{\rm p}\) gives a very reliable fit result (see Appendix 2). To give an idea of the computational cost, a single run in this study typically required 672 core-hours per nanosecond. Given the large amount of computing time involved in the calculation of SACF-GK integrals in highly viscous systems, the question arises whether other methods are less costly and/or can confirm the computed viscosity values from Fig. 2. Therefore, the second part of this study focused on calculating the viscosity via the SE relation Eq. (13).

Fig. 2

PAO4 viscosity obtained from EMD simulations (colored symbols) with the GK-SACF approach (Eq. (10)). Solid colored symbols indicate results obtained from converged integrals, i.e., \(\tau _{\text{p}}<t_{\max }\), while open colored symbols represent extrapolations from fits with \(\tau _{\text{p}}>t_{\max }\). When no error bars are visible, they are smaller than the symbol size. For comparison, experimental viscosity data from Refs. [44] (dark gray, only \(40^\circ\)C and \(100^\circ\)C) and [45] (light gray, all 4 temperatures) is shown (Color figure online)

Fig. 3

Plateau time \(\tau _{\text{p}}\) against viscosity, described by the power law Eq. (12) (black dashed line). Full/open symbols as in Fig. 2 (Color figure online)

4 Indirect Viscosity Calculation

In order to determine the viscosity via the Stokes–Einstein relation

$$\begin{aligned} \eta_{\rm SE}=\frac{k_{\rm B} T}{n \pi R_j D_j} \end{aligned}$$
(13)

the diffusion coefficient \(D_j\) and the Stokes radius \(R_j\) of at least one of the components \(j\in (C_{30}, C_{40}, C_{50})\) from Table 1 need to be calculated from the MD trajectories. We chose to do this evaluation for the \(C_{30}\) component, because it has the highest concentration in the studied mixture (see Table 1). Thus, it provides the statistically most relevant results for any quantity which is obtained from an average over molecule positions or velocities (like the MSD or VACF). The Stokes radius of \(C_{30}\) was computed with the approach described in Ref. [30]. In this previous work, a definition of the Stokes radius \(R=\sqrt{A/\pi }\) via the mean molecular cross section A was introduced. For different linear and branched alkanes, this definition resulted in a parameter-free match of viscosity and self-diffusion coefficient with the Stokes–Einstein relation, assuming the slip boundary condition \(n=4\). Consequently, the same \(n=4\) was applied in the present evaluation. To extract the diffusion coefficient, two approaches were used, namely via the mean squared displacement Eq. (4) and via the velocity autocorrelation function Eq. (5).
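For orientation, Eq. (13) together with the radius definition \(R=\sqrt{A/\pi }\) amounts to the following short calculation. This is only a sketch: the function names are hypothetical, SI units are assumed throughout, and the numbers in the usage note are placeholders rather than results of this study.

```python
import math

K_B = 1.380649e-23  # Boltzmann constant in J/K (exact SI value)


def stokes_radius(cross_section):
    """Stokes radius R = sqrt(A / pi) from the mean molecular cross section A (m^2)."""
    return math.sqrt(cross_section / math.pi)


def eta_stokes_einstein(diffusion, radius, temperature, n=4):
    """Viscosity from Eq. (13) in Pa s; n = 4 (slip) or n = 6 (stick).

    diffusion in m^2/s, radius in m, temperature in K.
    """
    return K_B * temperature / (n * math.pi * radius * diffusion)
```

A call such as `eta_stokes_einstein(1e-10, stokes_radius(8e-19), 293.15)` uses purely illustrative inputs. Because the viscosity scales as \(1/n\), switching from slip to stick boundary conditions lowers the result by a factor of 1.5 for the same \(D\) and \(R\).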

4.1 Diffusion Coefficient via MSD

The first method extracts the diffusion coefficient from the long time limit of the MSD slope (assuming Fickian diffusion):

$$\begin{aligned} \Omega _j(t)= \langle ({\textbf{r}}_{i}(t)-{\textbf{r}}_{i}(0) )^2 \rangle _{i, n} \end{aligned}$$
(14)

and

$$\begin{aligned} D_{\rm MSD,j}=\lim _{t\rightarrow \infty } \frac{1}{6} \frac{\text{d}}{\text{d}t} \Omega _j(t), \end{aligned}$$
(15)

where \(\Omega _j(t)\) is the mean squared displacement, with the center of mass position \({\textbf{r}}_i\) of molecule i of species j. Averaging is performed over all molecules of the species and over the number of independent simulations n. Results for \(\Omega _{C_{30}}(t)\) at different temperatures and pressures are shown in Fig. 4 (colored lines) in double-logarithmic form. Each curve is obtained from averaging over 201 molecules and 40 independent simulations. The molecular trajectories are the same as those used to extract the SACF data. The characteristic \(\Omega (t)\) curve ranges from the ballistic regime at very short times (not resolved in Fig. 4), over an intermediate regime, to the diffusive regime (\(\Omega (t)=6Dt+{\text{const}}.\)) at long time scales [46]. A convenient way to check which regime is currently sampled is to analyze \(\Omega (t)\) in double-logarithmic form as shown in Fig. 4. When the MSD enters the diffusive regime, the slope \(m=d(\log(\Omega ))/d(\log(t))\) converges to \(m=1\) (grey dashed line). Note that this is clearly not the case for the slowest diffusing systems within the simulated \(40\,\text{ns}\). For these state points, the diffusion coefficient cannot be extracted correctly from the MSD. In order to quantitatively identify whether an MSD curve has indeed reached the diffusive regime, a linear fit \(\log(\Omega )=m \log(t) + \text{const}.\) was applied to the temporally last 10% of the simulation data (inset Fig. 4). This evaluation reveals that, even allowing for some statistical fluctuations of the slope, the majority of MSD curves have not properly reached the linear regime \(m=1\). Nevertheless, for completeness and in order to evaluate the error caused by such a convergence problem, a diffusion coefficient was obtained for all MSD curves via a linear fit \(\Omega (t)=6D_{\rm MSD}t+\text{const}.\) to the non-logarithmic MSD data in the time interval \([36;40]\,\text{ns}\). The resulting diffusion coefficients are shown in Fig. 6. 
Here, half-empty symbols denote results from MSD data which behaves in the fit region as \(\Omega (t)\propto t^m+const.\) with \(m<0.9\) (according to the evaluation shown in Fig. 4). In the following, these curves/results are called "non-converged." A more detailed discussion about the diffusion coefficients from the MSD, in comparison with the results from the VACF method, follows after the introduction of the VACF method in the next section.
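The two tail fits described in this section can be sketched as follows: a log-log fit that yields the exponent \(m\), and a linear fit that yields \(D_{\rm MSD}\) from the slope \(6D\). The helper names are hypothetical, ordinary least squares is assumed for both fits, and the threshold \(m\ge 0.9\) mirrors the "non-converged" criterion stated above.

```python
import math


def linear_fit(x, y):
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def msd_tail_analysis(times, msd, tail_fraction=0.1, m_min=0.9):
    """Fit the last tail_fraction of an MSD curve.

    Returns the log-log slope m, the diffusion coefficient D = slope / 6
    from the linear fit, and a flag telling whether m >= m_min.
    Requires strictly positive times and MSD values in the tail.
    """
    n_tail = max(2, int(round(len(times) * tail_fraction)))
    t_tail = times[-n_tail:]
    w_tail = msd[-n_tail:]
    m, _ = linear_fit([math.log(t) for t in t_tail],
                      [math.log(w) for w in w_tail])
    slope, _ = linear_fit(t_tail, w_tail)
    return {"m": m, "D": slope / 6.0, "converged": m >= m_min}
```

Reporting \(m\) alongside \(D\) keeps the convergence check attached to every extracted value, so that sub-diffusive tails are flagged rather than silently converted into a diffusion coefficient.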

Fig. 4

Mean square displacement (solid lines) of the \(C_{30}H_{62}\) component of PAO4 for \(40\,n\)s simulation time for different temperatures (colors, see legend) and pressures (color intensity: lighter to darker shade denotes increasing pressure), and linear fits on the logarithmic data for the last 10%, i.e., \(t\in [36;40]\,n\)s. The inset shows a zoom on the time range used for the fitting (gray shaded area) for selected curves (Color figure online)

4.2 Diffusion Coefficient via VACF

The calculation of the diffusion coefficient via the VACF-GK formula Eq. (5) follows a similar strategy as for the direct viscosity calculation (Sect. 3.1). For the practical application with finite simulation time \(t_{\max }=10\,n\)s, Eq. (5) is rewritten into

$$\begin{aligned} \Psi _j(t)= \frac{1}{3} \int _0^t \frac{1}{I}\sum _{i=1}^{I}\langle {\textbf{v}}_{i}(t^{\prime }+t_0)\cdot {\textbf{v}}_{i}(t_0) \rangle _{t_0} \;\text{d}t^{\prime }, \end{aligned}$$
(16)

with the center of mass velocity \({\textbf{v}}_{i}=(v_x,v_y,v_z)_i\) of all molecules i belonging to species j. The VACF \(\langle {\textbf{v}}_{i}(t^{\prime }+t_0)\cdot {\textbf{v}}_{i}(t_0) \rangle _{t_0}\) for each molecule is calculated using an average over all \(t_0\le t_{\max }\) as explained previously for the SACF calculation (cf. Eq. (6)). In contrast to the SACF calculation, only one single \(10\,\text{ns}\) simulation was used for the VACF evaluation of each state point (P, T). Three examples of \(\Psi _{\text{C}_{30}}(t)\) for different state points (same as for the SACF examples in Fig. 1) are shown in Fig. 5 (blue lines). In the same diagrams, the VACF integrals of all 201 individual C\(_{30}\) molecules are also shown (gray lines). Each curve is computed from the center of mass trajectory of one molecule with a resolution of \(10\,\text{fs}\). Furthermore, the standard error of the mean (SEM) of the 201 individual curves is used to mark the error interval [\(\Psi_{\text{C}_{30}}(t) - \text{SEM},\ \Psi_{\text{C}_{30}}(t) + \text{SEM}\)] (blue dashed lines) of the average VACF integral \(\Psi _{\text{C}_{30}}(t)\).

Fig. 5

Integral of the VACF (Eq. (16)) for each of the 201 molecules (grayscale), their average (blue, solid) and its standard error of mean (blue, dashed) for a \(100\,^\circ\text{C}\)/\(0\,\text{MPa}\), b \(20\,^\circ\text{C}\)/\(0\,\text{MPa}\) and c \(20\,^\circ\text{C}\)/\(100\,\text{MPa}\). A fit function Eq. (17) (red, solid) is applied to the average data, with \(\tau_{\rm p}\) (red, circle) indicating the beginning of the plateau area (Color figure online)

This VACF integral needs to be fitted with a suitable analytical expression, which will then yield the diffusion coefficient from extrapolation to long times. Finding such a suitable expression turned out to be non-trivial due to the highly non-monotonic shape of \(\Psi _{\text{C}_{30}}(t)\). The numerical results for the integral in Eq. (16) feature a very pronounced peak at short times from which it decreases, eventually converging toward a constant value. This peak stems from the shape of the integrand, the VACF: From the maximal correlation at \(t^\prime =0\), the VACF drops quickly, within less than \(0.2\,\text{ps}\), and passes into a regime of negative correlation. After reaching a minimum, the VACF converges toward zero over a much longer time span (a few nanoseconds). Moreover, we observed that the finer details of the VACF shape depend on the investigated state point. Trying to fit the VACF of all state points with the theoretical VACF expression proposed in Heyes et al. [42] led to unsatisfactory results. Alternatively, Bellissima et al. [47, 48] showed that the VACF can be decomposed into a sum of complex exponential functions. Depending on the state point, they suggested approximate expressions for the VACF which consist of a sum of four to six exponential functions. Note that, in principle, this form could be used to fit the VACF directly (instead of the integrated form), to then obtain the diffusion coefficient from an analytical integration of the fit result. However, testing this procedure, we found that small fitting inaccuracies of the VACF, especially in the important short time region, lead after the subsequent integration to large variations, i.e., offsets of the diffusion coefficient. Therefore, the fitting of the VACF data was performed on the numerically integrated form Eq. (16) using an analytic integration of Bellissima’s fit function (see Appendix 3). 
Here, the short time simulation data prior to the characteristic maximum of the integrated VACF were not considered for the fitting. This allows the fit function to be reduced to a sum of three exponentials

$$\begin{aligned} \Psi_{\rm fit}(t) =&\alpha \tau _1 \exp (-t/\tau _1) + \beta \tau _2 \exp (-t/\tau _2) \\&+ \gamma \tau _3\exp (-t/\tau _3) + c, \end{aligned}$$
(17)

with the fit parameters \(\alpha ,\beta ,\gamma ,\tau _1,\tau _2,\tau _3,c\), which are responsible for reproducing the decay of \(\Psi (t)\) after its characteristic peak. Using fewer fitting parameters in Eq. (17) compared to the full expressions suggested by Bellissima et al. [47, 48] reduces the complexity of the fitting procedure and facilitates an automated fitting routine. Concerning the fitting procedure, the simulation data are weighted by the standard error of mean and the orthogonal distance regression method is applied [43] (results are shown as red lines in Fig. 5). The diffusion coefficient is then obtained as

$$\begin{aligned} D_{\rm VACF}=\lim _{t\rightarrow \infty } \Psi_{\rm fit}(t) = c. \end{aligned}$$
(18)

Similar to the evaluation of the integrated SACF, a "plateau time" \(\tau _{\text{p}}\) (cf. red circles in Fig. 5) is defined via

$$\begin{aligned} \Psi_{\rm fit}(\tau _{\text{p}})=1.02 D_{\rm VACF} \end{aligned}$$
(19)

as a convergence criterion. This criterion is met for all considered state points in the range of the \(10\,n\)s simulation time.
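The relation between Eqs. (17) to (19) can be illustrated with a small sketch that evaluates the three-exponential fit, reads off \(D_{\rm VACF}=c\), and locates the plateau time on a time grid. The function names and the grid search are assumptions made for illustration. A symmetric band of \(\pm 2\,\%\) around \(c\) is used instead of the one-sided condition of Eq. (19); both coincide when the fit approaches \(c\) from above.

```python
import math


def psi_fit(t, alpha, beta, gamma, tau1, tau2, tau3, c):
    """Three-exponential fit function of Eq. (17)."""
    return (alpha * tau1 * math.exp(-t / tau1)
            + beta * tau2 * math.exp(-t / tau2)
            + gamma * tau3 * math.exp(-t / tau3)
            + c)


def vacf_plateau_time(params, t_max, steps=100000, tolerance=0.02):
    """Earliest grid time after which psi_fit stays within tolerance of D = c.

    Scans backwards from t_max; returns None if the fit is still outside
    the band at t_max (criterion not met within the simulated time).
    """
    c = params[-1]  # D_VACF, Eq. (18)
    dt = t_max / steps
    for k in range(steps, -1, -1):
        if abs(psi_fit(k * dt, *params) - c) > tolerance * abs(c):
            return None if k == steps else (k + 1) * dt
    return 0.0
```

Scanning from the end of the time window makes the check robust against non-monotonic fits with mixed-sign amplitudes, where a first crossing of the band would not guarantee that the curve stays inside it.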

Fig. 6

Comparison of diffusion coefficients computed via the MSD and the VACF from converged (full markers) and non-MSD-converged (half-filled markers) simulations with short-single (colored markers, \(10\,\text{ns}\)) and multiple-long (grey markers, \(40\,\text{ns}\)) VACF-trajectories calculated per state point. Multiple VACF-trajectory results (grey) are generated from fitting to merged data of 5 trajectories, while their errors are obtained from their 5 independent trajectory results (Color figure online)

4.3 Results and Comparison of the \(C_{30}\) Diffusion Coefficients from Both Methods

Results for the diffusion coefficient from the two methods, namely via the MSD (Eq. (4); Fig. 4) and via the time-integrated VACF (Eq. (5); Fig. 5), are shown in direct comparison in Fig. 6. For completeness, all state points (PT) considered in this study are shown, including results obtained from MSD curves which have clearly not reached the diffusive regime (\(m_{\rm loglog}<0.9\) in Fig. 4; marked as half-filled symbols). The latter cannot be expected to yield correct diffusion coefficients. In contrast, the time integrals of the VACF reach values within \(2\,\%\) of their respective long-time limit Eq. (18) for all state points (see Fig. 5 for some examples), despite the shorter simulation time of \(10\,\text{ns}\) compared to \(40\,\text{ns}\). Comparing the two methods, we find excellent agreement between the MSD and the VACF results when the MSD curves reached the diffusive regime (full symbols in Fig. 6). Values obtained from non-converged MSD results, by contrast, mostly overestimate the diffusion coefficient in comparison with the VACF method (half-filled symbols), which matches previous findings [49]. To further increase the confidence in the diffusion coefficients obtained with the VACF method, the five presumably slowest converging state points (i.e., the points with the lowest diffusion coefficients) were analyzed in more detail for an error estimation. For each of these state points, five additional \(40\,\text{ns}\) simulations starting from statistically independent configurations were carried out and evaluated as described above. Their average was then used to compute the most reliable fit (Eq. (17)) and diffusion coefficient (Eq. (18)) of that state point. In addition, the standard error of the mean of the diffusion coefficient from the five independent simulations was evaluated. The results (grey symbols with error bars in Fig. 6) are in full agreement with the previous results from the single shorter \(10\,\text{ns}\) simulation. 
Given the convergence problems that were encountered with the MSD method, and the good convergence and small statistical fluctuations for the diffusion coefficients from the VACF method, only the latter were used to calculate the viscosity via the Stokes–Einstein relation Eq. (13).
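The error estimate from independent trajectories described above amounts to a standard error of the mean over the per-trajectory diffusion coefficients. A minimal sketch is given below; the function name is hypothetical and the five values are placeholders chosen for illustration only, not results of this study. Note that in the procedure described above the central value of the diffusion coefficient comes from the fit to the merged data, so only the error bar would be taken from such a calculation.

```python
import math
import statistics


def mean_and_sem(values):
    """Return the mean and the standard error of the mean.

    The sample standard deviation (n - 1 in the denominator) is used,
    which is appropriate for a small number of independent runs.
    """
    n = len(values)
    if n < 2:
        raise ValueError("at least two independent values are required")
    mean = statistics.mean(values)
    sem = statistics.stdev(values) / math.sqrt(n)
    return mean, sem


# Placeholder diffusion coefficients (m^2/s) from five independent runs;
# illustrative numbers only.
d_runs = [1.02e-12, 0.95e-12, 1.10e-12, 0.99e-12, 1.05e-12]
d_mean, d_sem = mean_and_sem(d_runs)
print(f"D = {d_mean:.3e} +/- {d_sem:.1e} m^2/s")
```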

4.4 Results for the PAO4 Viscosity from the VACF

The viscosity values obtained via the VACF method (Eq. (5) combined with Eq. (3)) are shown in comparison with the results from the SACF method (Eq. (2)) in Fig. 7. The same state points as in the previous sections are used, i.e., temperatures from 20 to \(150\,^\circ\)C and pressures up to \(300\,\text{MPa}\) (see Fig. 2 for color code). Overall, both methods show very good agreement, in particular for the lower range of viscosity values up to about 20 mPa s (full symbols). For the remaining five most viscous systems, with values on the order of 1 to 10 Pa s, the time-integrated SACF did not converge (half-filled symbols, see Sect. 3.2 for details). Nevertheless, the extrapolations of the respective fit functions give viscosity values in good agreement with the VACF method, where no extrapolation was necessary due to the much faster convergence of the VACF. Moreover, to evaluate the reliability of each method more quantitatively, an error analysis was performed for those five most viscous systems. On the one hand, errors of the SACF method are estimated from fits with Eq. (8) to the standard-error-of-the-mean curves of the integrated SACF (see Fig. 1) and extrapolation for \(t\rightarrow \infty\). It should be noted that this procedure underestimates the error, as was found in a more extensive error analysis of the system at \(T=20\,^\circ\)C and \(P=100\,\text{MPa}\) (cf. Sect. 3.2 and Appendix 2) with respect to the length of the available SACF data (varying from \(10\,\text{ns}\) to \(80\,\text{ns}\)). On the other hand, errors of the VACF viscosity calculation method were obtained from error propagation in the SE equation (13) with the diffusion coefficient errors from Fig. 6. In this case, results obtained from a single \(10\,\text{ns}\) simulation (colored symbols in Figs. 6 and 7) and from five \(40\,\text{ns}\) simulations (grey symbols with error bars) are in full agreement within the expected statistical fluctuations. 
To summarize, for the most viscous systems, the indirect viscosity calculation via the diffusion coefficient (VACF method) is more reliable than the direct viscosity calculation via the SACF method.
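The error propagation through the SE relation mentioned above can be sketched as follows. This is an illustration under stated assumptions, not the evaluation used in this study: it assumes the common form \(\eta = k_{\rm B}T/(c\,\pi R D)\) with a boundary-condition prefactor \(c=6\) and an effective radius \(R\), and it treats the diffusion coefficient as the only uncertain quantity, so that the relative errors of \(\eta\) and \(D\) are equal. The exact prefactor and radius should be taken from Eq. (13); the function name and all numerical inputs below are hypothetical.

```python
import math

K_B = 1.380649e-23  # Boltzmann constant in J/K


def viscosity_from_diffusion(d, d_err, temperature, radius, prefactor=6.0):
    """Viscosity and its propagated error from a diffusion coefficient.

    Assumed form: eta = k_B * T / (prefactor * pi * radius * D).
    With D as the only uncertain input, first-order error propagation
    gives eta_err / eta = d_err / d.
    """
    eta = K_B * temperature / (prefactor * math.pi * radius * d)
    eta_err = eta * d_err / d
    return eta, eta_err


# Hypothetical inputs: D in m^2/s, T in K, effective radius in m.
eta, eta_err = viscosity_from_diffusion(
    d=1.0e-12, d_err=5.0e-14, temperature=293.15, radius=5.0e-10
)
print(f"eta = {eta:.3e} +/- {eta_err:.1e} Pa s")
```

Because \(\eta \propto 1/D\), a small relative error in the diffusion coefficient translates into the same small relative error in the viscosity.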

Fig. 7 Comparison of viscosities computed via the integral of the SACF and VACF from converged (full markers) and non-SACF-converged, i.e., extrapolated (half-filled markers) simulations with short-single (colored markers, \(10\,\text{ns}\)) and multiple-long (grey markers, \(40\,\text{ns}\)) VACF-trajectories calculated per state point. Multiple VACF-trajectory results (grey) are generated from fitting to merged data of 5 trajectories, while their errors are obtained from the 5 independent trajectory results (Color figure online)

5 Discussion

Of the three methods for viscosity calculation from EMD trajectories that were tested in this study, two could be applied successfully to all considered (PT) state points \((T=[20,40,100,150]\,^\circ \text{C}\) and \(P=[0,100,200,300]\,\text{MPa})\). Those two were the direct approach via the SACF-GK Eq. (2) (Sect. 3.1) and the indirect approach utilizing the SE Eq. (3) in combination with the VACF-GK Eq. (5) for the diffusion coefficient (Sect. 4.4). Both methods yield consistent results for the viscosity \(\eta (P,T)\) for all considered systems, with viscosity values in the range from 1 to \(20077\,\text{mPa}\,\text{s}\) for the least \((150\,^\circ \text{C}\), \(0\,\text{MPa})\) and the most viscous case \((20\,^\circ \text{C}\), \(300\,\text{MPa})\), respectively. In addition to the calculation of the diffusion coefficient via the VACF, its calculation via the MSD was also tested (Sect. 4.1). The MSD approach was successful for the six least viscous, i.e., fastest diffusing, systems, resulting in diffusion coefficients which were in full agreement with the VACF method. In all other cases, however, the simulation time of \(40\,\text{ns}\) was insufficient for the MSD to reach the diffusive regime, such that a correct calculation of the diffusion coefficient was not possible.

All results obtained from the three different methods agree well within the statistical uncertainties. The computational resources used by each of the methods for the viscosity calculation of one state point are summarized in Table 2. Note that the data collection for the SACF and the MSD method was performed simultaneously from the same simulation runs. The VACF data were instead gathered in independent, shorter simulation runs.

Table 2 Comparison of computation effort for the viscosity calculation via the SACF, MSD and VACF of a single state point as it was performed in this study. The simulations were carried out with the software LAMMPS (version: 12 Dec 2018) on 48 cores (2\(\times\) Intel Xeon Platinum 8168 CPU, 2\(\times\) 24 cores, 2.7 GHz) of the GCS Supercomputer JUWELS

The SACF method had to be applied to multiple simulations per state point, each with an independently prepared starting configuration, in order to obtain a meaningful value for the viscosity [50]. As can be seen in Fig. 1, the time-integrated SACF from the single simulation runs can fluctuate widely, with very large deviations from the averaged curve. This is especially true for highly viscous systems, where a single simulation might sample only a very reduced part of phase space due to locally high barriers for a structural reorganization of the fluid molecules. Note that running such a "trapped" trajectory for a longer time, within reasonable computing time limits, will likely still not result in a viscosity in good agreement with the one averaged from multiple different simulations [26]. Furthermore, the observed power law Eq. (12) between the time of convergence of the SACF integral and the viscosity (Fig. 3) illustrates that this method generally becomes more challenging with increasing viscosity. A similar problem occurred for the calculation of the diffusion coefficient via the MSD, shown in Fig. 4, which needs longer times to enter the diffusive regime for more viscous systems. While the MSD approach ran into convergence problems for systems with viscosity \(>20\,\text{mPa}\,\text{s}\), it would certainly be possible to reduce its computational cost considerably for the less viscous systems by using both fewer and shorter simulations for each state point. In this low-viscosity scenario, the viscosity calculation via the MSD may even be advantageous, since the low output frequency reduces simulation run time and memory usage. However, since we were mostly interested in pushing the limits of viscosity calculation toward high values, we did not pursue this route via the MSD further. The GK equation for the diffusion coefficient utilizing the VACF, in contrast, performs much better: the integrated VACF met the convergence criterion for all state points. 
Moreover, for highly viscous systems, the viscosity results from the indirect GK-VACF method exhibit a higher statistical reliability than from the direct GK-SACF method, despite the smaller amount of data going into the evaluation (see Sects. 3.1, 4.4 and Appendix 2 for details).

6 Conclusion

In this study, EMD simulations of a representative low viscosity oil (PAO4) were performed over a wide range of temperatures \((T=20\ldots 150\,^\circ \text{C})\) and pressures \((P=0\ldots 300\,\text{MPa})\), in order to determine the best suited viscosity computation method with respect to accuracy and computational efficiency. Three different routes for the viscosity calculation were compared. First, the viscosity was calculated directly via the stress autocorrelation function by employing the Green–Kubo Eq. (2) (details in Sect. 3). The other two routes followed an indirect approach (Sect. 4) via the Stokes–Einstein relation Eq. (3) in combination with the calculation of the diffusion coefficient. The latter was performed either via the mean-squared displacement Eq. (4) or via the velocity autocorrelation function Eq. (5). While the results of the different methods agree, their computational effort varies greatly. In conclusion, for highly viscous systems, calculating the viscosity with the Stokes–Einstein equation, with the diffusion coefficient obtained via the velocity autocorrelation function, proved far superior. Naturally, this approach is limited by the validity of the Stokes–Einstein relation. An upper boundary for this method is thus marked by the solidification of the lubricant. For a similar PAO-mixture (80 wt. % trimer, 20 wt. % tetramer), the pressure-induced solidification has been experimentally determined at \(1.3\,\text{GPa}\) at \(T=20\,^\circ\)C [23]. Within the range of studied pressure and temperature conditions, we are still far away from this upper boundary and did not observe any signs of an imminent breakdown of Eq. (13). This includes systems with a viscosity which is three orders of magnitude higher than in previous investigations of the SE relation for hydrocarbons [21, 30].

In comparison with experimental PAO4 viscosity data, the applied united-atom approach and force field parameters (Appendix 1) result in a good match for high temperatures \(>100\,^\circ \text{C}\) (Fig. 2). However, for the lower temperature cases, it should be noted that the viscosity is increasingly overestimated with increasing pressure. Such deviations are not unique to the employed model [51]. Indeed, the (in)accuracy of available hydrocarbon MD models with respect to viscosity predictions for high pressures is under active discussion in the literature [20, 32, 51,52,53,54]. A major difficulty emerges from the description of the repulsive forces with the stiff form of the Lennard–Jones potential. As a promising alternative approach, Galvani and Robbins suggested an all-atom force field (all-atom models tend to reproduce experimental viscosity values better than coarse-grained models) with the more adjustable Morse potential [31, 32]. However, the transferability of this approach to different hydrocarbon structures has not been demonstrated yet. Moreover, for large system sizes or long simulation times, the application of all-atom potentials is still a considerable computational challenge, which makes the united-atom approach favorable. Consequently, further development of both united- and all-atom force fields which are better fitted to experimental high-pressure viscosity data is needed for tribological applications. The large number of high-pressure viscosity calculations that such a development necessitates during the parametrization process could be greatly facilitated by the indirect approach (i.e., combining Eqs. (5) and (3)) suggested in this study. We are confident that the general findings are not only applicable to united-atom models, but also to other force fields, including all-atom potentials. 
Likewise, given that PAO4 consists of a mixture of different molecular structures, we would expect qualitatively the same results for the pure components or for other hydrocarbons with comparable structure and size. Future investigations should examine the transferability to other types of lubricants.