1 Introduction

A common strategy for analysing how physical properties of macroscopic objects change over time is to model the system on the microscopic scale and then derive macroscopic information, through either analytical considerations or computer simulations.

This method has many advantages. It is conceptually easy to develop a model on the microscopic scale that describes a macroscopic object to high accuracy. This model can then be analysed under different circumstances, either analytically or with computer simulations, often under conditions that allow insight into the object’s properties which cannot be derived from real-world experimental data alone. Moreover, modelling the physical properties of a macroscopic object based on well-established physical laws on the microscopic scale allows one to control and analyse its most fundamental components, which are often more mathematically sound than phenomenologically derived laws on the macroscopic scale. Finally, provided the model on the microscopic scale reflects the actual physical properties to high accuracy, the derived data describe the object in great detail. These data can in principle be used to extract and analyse any desired physical property on the macroscopic scale.

However, this method has some disadvantages. One often models macroscopic objects by means of their constituent particles as large-scale interacting particle systems, where the known physical properties on the microscopic scale are given by intermolecular forces. Thus, even for small macroscopic objects, the number of particles that need to be modelled in order to obtain an accurate representation of the object is typically very large. This inhibits the analytical evaluation of these models dramatically, except for a few cases. Consequently, such models are usually analysed using computer simulations. Here, additionally, the maximal step size in numerical integration schemes that still ensures numerical stability is usually very small. Both factors combined pose a huge obstacle in the scalability of simulations of interacting particle systems in time and in space.

The goal is thus to develop mathematical methods that reduce the complexity of a given large-scale interacting particle system on the microscopic scale in a rigorous way so that the desired macroscopic properties of the system are preserved. This process would free up resources that can be used to further scale the simulation in time as well as in space.

In this article, we start by summarising some of the mathematical methods that are used to reduce the complexity of deterministic systems on the microscopic scale and will show how they preserve important information on the macroscopic scale. These mathematical methods usually include some averaging procedures (coarse-graining techniques) or dimension reduction techniques. In the latter case, thermodynamic considerations often play an important role in the analysis of these systems. By construction, these mathematical methods only give macroscopic insight into the dynamics of the system. Therefore, in the main part of this article, we build on some of these mathematical methods and analyse the macroscopic properties of a simple fast–slow mechanical system in more detail by expanding the dynamics of the system to second order and interpreting it from a thermodynamic point of view.

1.1 Review of some averaging and dimension reduction techniques

We focus on models which describe a macroscopic system by deterministic dynamics on the microscopic scale. This microscopic dynamics can be modelled in time and/or in space. Typical for these models is the presence of a small scale parameter \(\varepsilon \). It represents the ratio of the fast dynamics on the microscopic scale and the slow dynamics on the macroscopic scale. A recurrent approach to study these fast–slow systems is to consider the limit \(\varepsilon \rightarrow 0\) and thus derive an average or reduced model which is, in some sense, oblivious to the small-scale motion in the system. These procedures can broadly be classified into three different categories: non-projective, projective, and phenomenologically derived continuum mechanical methods.

While phenomenologically derived continuum mechanical methods aim to describe a macroscopic system as a continuous object without resorting to the microscopic scale, non-projective and projective methods aim to describe a macroscopic system by its most prominent properties that materialise in terms of some descriptive variables by studying the system on the microscopic scale. In general, projective methods start from a large-scale microscopic system and apply mathematical transformations to extract information about a handful of variables chosen a priori such as volume, pressure, or density. Then the evolution of the microscopic system is largely confined to a low-dimensional submanifold, on which it can be described by these macroscopic variables. Non-projective methods often aim to describe the dynamics of the system by systematically deriving the most dominant motion of the whole system. Here, the dynamics of the whole system is reduced to the dynamics of a less complex system. In contrast to projective methods, the resulting variables are not necessarily given but appear in the process, e.g. through averaging, and can sometimes be given a physical interpretation a posteriori, as shown in this article for a model problem.

Among the non-projective methods, the WKB method is a classic and well-known representative. As described, for instance, in [1, 2], it is often used in the analysis of quantum mechanical systems. It can be used to calculate an approximate solution to the stationary Schrödinger equation, which is a second-order differential equation of the form

$$\begin{aligned} \varepsilon ^2\phi ''(x) = Q(x)\phi (x), \end{aligned}$$
(1)

complemented with some initial conditions, where \(\varepsilon \) is a small scale parameter and \(Q:\Omega \rightarrow {\mathbb {R}}\) is related to the system’s potential. The WKB method consists of an explicit ansatz for the oscillatory part of the solution, which is necessary to analyse the dynamics for \(\varepsilon > 0\). Equation (1) can approximately be solved with the WKB ansatz

$$\begin{aligned} \phi (x) \sim \exp \left[ \frac{1}{\varepsilon }\sum _{n=0}^\infty \varepsilon ^n S_n(x) \right] \quad \text {for} \quad \varepsilon \rightarrow 0. \end{aligned}$$

By comparing powers of \(\varepsilon \) one derives a sequence of equations which determine \(S_0, S_1, \ldots \) and thus a representation of the solution at different scales.
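As a brief worked illustration of this comparison of powers (a standard computation, sketched here under the additional assumption that \(Q\) is smooth and does not vanish on the interval considered), substituting the ansatz into (1) and dividing by \(\phi \) gives, with \(S=\sum _{n\ge 0}\varepsilon ^n S_n\),

$$\begin{aligned} \left( S'(x)\right) ^2 + \varepsilon S''(x) = Q(x). \end{aligned}$$

The two lowest orders in \(\varepsilon \) then read

$$\begin{aligned} \left( S_0'\right) ^2 = Q, \qquad 2S_0'S_1' + S_0'' = 0, \end{aligned}$$

so that \(S_0(x)=\pm \int ^x \sqrt{Q(s)}\,\textrm{d}s\) and \(S_1(x)=-\tfrac{1}{4}\log |Q(x)|\) up to additive constants. To this order, the approximation therefore takes the form

$$\begin{aligned} \phi (x) \approx C\,|Q(x)|^{-1/4} \exp \left[ \pm \frac{1}{\varepsilon }\int ^x \sqrt{Q(s)}\,\textrm{d}s \right] , \end{aligned}$$

where the constant \(C\) is fixed by the initial conditions; for \(Q<0\) the exponent is purely imaginary and the approximation is oscillatory.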

The averaging method is another representative of the class of non-projective methods. It can be used to derive the slow dynamics of solutions to ordinary differential equations if the evolution of the system’s degrees of freedom (DOF) can be decomposed into the form

$$\begin{aligned} \dot{y} = f(y,z), \qquad \dot{z} = \varepsilon ^{-1} g(y,z), \end{aligned}$$
(2)

with \(y(0)=y_*\) and \(z(0)=z_*\), where \(y :{\mathbb {R}}\rightarrow {\mathbb {R}}^n\) describes the slow and \(z :{\mathbb {R}}\rightarrow {\mathbb {R}}^m\) the fast degrees of freedom. In general, one is only interested in the dynamics of y, while z is introduced to accurately model the dynamics on the microscopic scale. Systems of this kind can be used to model, for example, the evolution of molecules in the united atom representation [3,4,5] where the parameter \(\varepsilon \) represents the scale ratio between the fast molecular vibrations and the slow conformal motion of the molecule, or the development of long- and short-term weather phenomena [6, 7], where \(\varepsilon \) represents the scale ratio between the fast change of local weather phenomena and the slow development of the global climate.

A natural choice to reduce the complexity of the model (2) is to average out the fast dynamics in the system. If the function f is periodic in z with period T, then under very mild assumptions, one can consider the averaged system

$$\begin{aligned} \dot{\bar{y}} = \bar{f}(\bar{y}), \qquad \text {where} \qquad \bar{f}(\cdot ) = \frac{1}{T} \int _0^{T} f(\cdot , s) \,\textrm{d}s, \end{aligned}$$
(3)

with \(\bar{y}(0)=y_*\). It can be shown (see, for example, [8,9,10]) that \(\bar{y}\) remains close to y for timescales of order \({\mathcal {O}}(1)\). Thus, system (3) provides for a sufficiently small time interval an approximate but less complex description of the dynamics of y.
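The closeness of \(y\) and \(\bar{y}\) can be illustrated with a minimal numerical sketch. It assumes a hypothetical scalar example that is not taken from the references: \(f(y,z)=y\cos ^2(z)\) and \(g\equiv 1\) with \(z_*=0\), so that \(z(t)=t/\varepsilon \), the period is \(T=\pi \), and \(\bar{f}(y)=y/2\). The function names and the Runge–Kutta integrator are illustrative choices only.

```python
import math

def f(y, z):
    # Hypothetical slow vector field, periodic in z with period pi.
    return y * math.cos(z) ** 2

def f_bar(y, n=200):
    # Average of f(y, .) over one period T = pi (midpoint rule), cf. (3).
    T = math.pi
    return sum(f(y, (k + 0.5) * T / n) for k in range(n)) / n

def rk4(rhs, y0, t_end, n):
    # Classical fourth-order Runge-Kutta scheme for y' = rhs(t, y) on [0, t_end].
    h, t, y = t_end / n, 0.0, y0
    for _ in range(n):
        k1 = rhs(t, y)
        k2 = rhs(t + h / 2, y + h / 2 * k1)
        k3 = rhs(t + h / 2, y + h / 2 * k2)
        k4 = rhs(t + h, y + h * k3)
        y += h / 6 * (k1 + 2 * k2 + 2 * k3 + k4)
        t += h
    return y

def averaging_error(eps, y_star=1.0, t_end=1.0):
    # Full system (2) with g = 1 and z(0) = 0, hence z(t) = t / eps.
    n_full = max(1, round(200 * t_end / eps))  # resolve the fast oscillation
    y_full = rk4(lambda t, y: f(y, t / eps), y_star, t_end, n_full)
    # Averaged system (3); no fast scale left, so few steps suffice.
    y_avg = rk4(lambda t, y: f_bar(y), y_star, t_end, 200)
    return abs(y_full - y_avg)
```

In this example the averaged system needs a step size that is independent of \(\varepsilon \), whereas the full system needs a number of steps proportional to \(\varepsilon ^{-1}\); the error returned by `averaging_error` is of order \(\varepsilon \) on the time interval \([0,1]\).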

If the fast DOF are deterministic but sufficiently “chaotic”, the scaled difference \(\varepsilon ^{-1/2} (y-\bar{y})\) can be interpreted in the limit \(\varepsilon \rightarrow 0\) as Gaussian white noise, which can be analysed using probabilistic tools such as the central limit theorem or large deviation principles [6, 7].

Another closely related non-projective methodology is the homogenisation of differential equations [11,12,13]. Unlike the averaging method, which is a coarse-graining method in time, the homogenisation method is a coarse-graining method in space. The idea is to simplify, for example, an elliptic partial differential equation of the type

$$\begin{aligned} -\nabla \cdot (A(x/\varepsilon )\nabla u_\varepsilon (x)) = f(x) \quad \text {for} \quad x\in \Omega , \qquad u_\varepsilon =0 \quad \text {for} \quad x\in \partial \Omega , \end{aligned}$$
(4)

with \(f\in L^2(\Omega )\), \(\Omega \subset {\mathbb {R}}^n\), where \(y\mapsto A(y)\in {\mathbb {R}}^{n\times n}\) is 1-periodic. By applying a two-scale ansatz of the form

$$\begin{aligned} u_\varepsilon (x)= u_0(x, \varepsilon ^{-1}x) +\varepsilon u_1(x, \varepsilon ^{-1}x)+ {\mathcal {O}}(\varepsilon ^2), \end{aligned}$$
(5)

it can be shown (see, for example, [10]) that \(u_0\) solves the homogenised partial differential equation

$$\begin{aligned} -\nabla \cdot (\bar{A}\nabla u_0(x)) = f(x) \quad \text {for} \quad x\in \Omega , \qquad u_0 =0 \quad \text {for} \quad x\in \partial \Omega , \end{aligned}$$
(6)

for a computable homogenised conductivity tensor \(\bar{A}\). Without the \(\varepsilon \)-dependent conductivity tensor \(A(x/\varepsilon )\), system (6) is less complex and thus its solution \(u_0\) can be derived more easily than \(u_\varepsilon \), the solution to the original system (4).
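To make "computable" concrete, consider the one-dimensional case \(n=1\), where the homogenised coefficient is known to be the harmonic mean of \(A\) over one period, \(\bar{A}=\bigl (\int _0^1 A(y)^{-1}\,\textrm{d}y\bigr )^{-1}\), rather than the arithmetic mean. The following sketch evaluates this for a hypothetical coefficient \(A(y)=2+\sin (2\pi y)\), chosen only for illustration; in higher dimensions \(\bar{A}\) instead requires the solution of cell problems, which is not shown here.

```python
import math

def a(y):
    # Hypothetical 1-periodic scalar coefficient, bounded below by 1.
    return 2.0 + math.sin(2.0 * math.pi * y)

def harmonic_mean(coeff, n=2000):
    # Midpoint rule for (int_0^1 1 / coeff(y) dy)^(-1), the 1D homogenised coefficient.
    integral = sum(1.0 / coeff((k + 0.5) / n) for k in range(n)) / n
    return 1.0 / integral

def arithmetic_mean(coeff, n=2000):
    # Naive average of the coefficient, shown only for comparison.
    return sum(coeff((k + 0.5) / n) for k in range(n)) / n

a_bar = harmonic_mean(a)        # analytically sqrt(3) for this coefficient
a_naive = arithmetic_mean(a)    # analytically 2 for this coefficient
```

For this coefficient the harmonic mean equals \(\sqrt{3}\approx 1.732\), which is strictly smaller than the arithmetic mean 2, so simply averaging \(A\) would overestimate the effective conductivity.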

Other non-projective methods include the perturbation theory of integrable Hamiltonian systems [8] and multiple-scale asymptotics [14]; modern presentations of several of these approaches can be found in [10, 15].

The Mori–Zwanzig framework [16, 17] is an example of a projective method. It has been extensively studied by Chorin et al. under the name of “optimal prediction”, for example, in [18,19,20]. Other relevant references can be found in [21]. The Mori–Zwanzig framework provides a way to study the reduced system for y by rewriting the deterministic system (2) into a form which resembles a generalised Langevin equation, where the dynamics of z is transformed into the stochastic component. In general, the Langevin equation for y in the Mori–Zwanzig framework takes the form

$$\begin{aligned} \frac{\textrm{d} y}{\textrm{d} t}=h(y(t))+\int _{0}^{t} K(y(t-s), s) \,\textrm{d}s+\dot{W}_{t}. \end{aligned}$$
(7)

Here, the first term on the right-hand side is Markovian, the second term describes the possible memory effect of the process, and \(W_t\) denotes the stochastic process.

The Mori–Zwanzig framework can be seen as a generalisation of the averaging method. In particular, if the Mori–Zwanzig framework is applied to the deterministic system (2), then the Markovian term h and the memory kernel K in (7) are \(\varepsilon \)-dependent, i.e. \(h=h_\varepsilon \) and \(K=K_\varepsilon \). It is formally described in [22] that in this case, Eq. (7) converges for \(\varepsilon \rightarrow 0\) to the averaged Eq. (3) derived with the averaging method.

In general, the stochastic process in Eq. (7) describes a diffusion of the process and has the physical interpretation of thermal fluctuations. Thus, the dynamics of y can be interpreted in a thermodynamic sense.

Another framework in the class of projective methods, which is used to study non-equilibrium thermodynamic processes modelled by fast–slow mechanical systems, is known as GENERIC (general equation for the non-equilibrium reversible–irreversible coupling, [23,24,25]). The idea is to find a projection operator so that the system’s DOF can be divided into a set of fast and slow variables. The fast variables are collectively interpreted as the thermodynamic component of the system, while the slow variables are interpreted as the mechanical component. The GENERIC is of the form

$$\begin{aligned} \frac{\textrm{d} x}{\textrm{d} t}=L \frac{\delta E}{\delta x}+M \frac{\delta S}{\delta x}, \end{aligned}$$

where x is the macroscopic quantity of interest, which changes in time driven by the non-equilibrium thermodynamic processes, and E and S are potentials which have the physical meaning of energy and entropy; L is a symplectic operator and M is positive semidefinite. If a system can be written in GENERIC form, thermodynamic consistency is automatically ensured.
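The last statement can be made explicit with a short formal computation. It assumes the degeneracy conditions usually imposed in the GENERIC literature, which are not stated above: \(L\) is antisymmetric, \(M\) is symmetric and positive semidefinite, and

$$\begin{aligned} L \frac{\delta S}{\delta x} = 0, \qquad M \frac{\delta E}{\delta x} = 0. \end{aligned}$$

Writing \(\langle \cdot ,\cdot \rangle \) for the underlying pairing, one then obtains along solutions

$$\begin{aligned} \frac{\textrm{d} E}{\textrm{d} t}= & {} \left\langle \frac{\delta E}{\delta x}, L \frac{\delta E}{\delta x}\right\rangle + \left\langle \frac{\delta E}{\delta x}, M \frac{\delta S}{\delta x}\right\rangle = 0,\\ \frac{\textrm{d} S}{\textrm{d} t}= & {} \left\langle \frac{\delta S}{\delta x}, L \frac{\delta E}{\delta x}\right\rangle + \left\langle \frac{\delta S}{\delta x}, M \frac{\delta S}{\delta x}\right\rangle \ge 0, \end{aligned}$$

where the terms involving \(L\) vanish by antisymmetry and the first degeneracy condition, and the mixed term involving \(M\) vanishes by symmetry and the second degeneracy condition. Under these assumptions, energy is conserved and entropy is non-decreasing.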

Finally, a macroscopic system can directly be described by continuum mechanics [26]. Instead of modelling the system on the microscopic scale by interacting particles, in continuum mechanics the macroscopic object is considered as a continuous object. Pivotal in this description is that the fundamental equations in continuum mechanics are based on phenomenologically derived conservation laws. Their rigorous derivation from interacting particle models is for many models an open problem.

1.2 Context of this work

While the thermodynamic characteristics of projective methods and the thermodynamic foundation of continuum mechanics are well-established theories that allow one to analyse complex phenomena on the macroscopic scale, relatively little is known about non-projective upscaling methods and their relation to thermodynamics [27, 28]. Insights into this relation are particularly important for molecular dynamics simulations, since the aforementioned scaling problem is particularly pronounced in this field. One would expect thermodynamic relations to hold, and these could greatly speed up computations if suitably incorporated in place of, for example, many-particle simulations of a solvent.

To analyse the thermodynamic relation of non-projective upscaling methods, we study in this article a long-standing problem in the theory of mechanical systems, i.e. the “strong confinement problem” [29,30,31,32]. In particular, we study a simplified version of a model which was analysed in detail by Bornemann in [5]. The original model was used to analyse the macroscopic dynamics of the four CH-groups of the butane molecule by deriving a homogenised model which is confined to a slow submanifold in the configuration space of the four CH-groups, using averaging methods in the form of homogenisation procedures.

The simplified version studied in this article consists of a system of one fast and one slow particle, whose dynamics is governed by the Lagrangian (8). This simplified model has the advantage that the notation can be kept to a minimum while the most important results can still be conveyed to the reader. The results presented in this article do generalise to a system of multiple fast and slow particles [33, 34]. The low-dimensional presentation we have chosen here, however, makes the arguments more transparent, and the extension to more complex situations is relatively straightforward. We use weak convergence methods in our proofs similar to [5], see also [35] for a related approach. Moreover, because of the simplicity of the model, it is possible to frame the analytical results in this article in a form using two-scale convergence (cf. Eq. (5)).

The interaction of one fast and one slow particle is one of the simplest models falling into the class of mechanical systems studied by Bornemann in [5]. It is also one of the simplest systems which can potentially exhibit thermodynamic effects. Indeed, one of the core assumptions of thermodynamics is that the system under consideration has a clear separation of scales (such as conformal motion described by elasticity combined with fast oscillations described by temperature). The number of particles does not have to be large or infinite. Notably, the physicist Paul Hertz developed a thermodynamic theory [36] for Hamiltonian systems under a slow external perturbation. Specifically, he introduced an entropy, using a notion of temperature as developed by Boltzmann. The book by Berdichevsky [37] gives an excellent introduction to this theory.

If one applies the theory of Hertz to system (8), one can interpret the dynamics associated with the fast DOF as a fast subsystem that is slowly perturbed by the motion resulting from the slow subsystem associated with the slow DOF. Hertz’ theory then allows one to describe the fast subsystem from a thermodynamic point of view, using a notion of temperature \(T_\varepsilon \), entropy \(S_\varepsilon \), and external force \(F_\varepsilon \) (the force exerted by \(y_\varepsilon \) on \(z_\varepsilon \)). We reiterate that analogous findings hold for many-particle interactions, as described below.

With \(\varepsilon \) as a scale parameter, we can analyse the system on different scales in time and space. In the first part of this article, we rigorously derive a higher-order asymptotic expansion of the slow dynamics of the system using weak convergence techniques similar to [5, 35]. While the dynamics to leading order is slow, as already shown in [5, 38], it turns out that the dynamics to second order can be decomposed into a slow component, describing the average motion, and a fast component, describing fast oscillatory motion. The results from the first part allow us to derive, in a similar way, the second-order asymptotic expansion of the temperature, the entropy, and the external force of the fast subsystem. It turns out that the leading order as well as the average dynamics to second order satisfy equations which resemble the first and second law of thermodynamics. This finding can potentially accelerate the simulation of slow dynamics in molecular dynamics simulations to higher accuracy. Some numerical experiments can be found in [33, 34].

2 The model problem

For a small scale parameter \(0<\varepsilon < \varepsilon _0\), we study a family of mechanical systems described by the Lagrangian

$$\begin{aligned} {\mathscr {L}}_\varepsilon (y_\varepsilon , z_\varepsilon ,\dot{y}_\varepsilon , \dot{z}_\varepsilon )= \tfrac{1}{2}\dot{y}_\varepsilon ^2 +\tfrac{1}{2}\dot{z}_\varepsilon ^2 - \tfrac{1}{2}\varepsilon ^{-2} \omega ^2(y_\varepsilon ) z_\varepsilon ^2, \end{aligned}$$
(8)

on the two-dimensional Euclidean configuration space \(M={\mathbb {R}}^2\). This system is a simplified version of the model problem introduced in [5, Sect. 1.2.1]. The corresponding Newtonian equations of motion take the form

$$\begin{aligned} \ddot{y}_\varepsilon= & {} -\varepsilon ^{-2}\omega (y_\varepsilon )\omega '(y_\varepsilon )z_\varepsilon ^2, \end{aligned}$$
(9a)
$$\begin{aligned} \ddot{z}_\varepsilon= & {} -\varepsilon ^{-2} \omega ^2(y_\varepsilon ) z_\varepsilon . \end{aligned}$$
(9b)

We assume that \(\omega \in C^\infty ({\mathbb {R}})\) is a uniformly positive function, i.e. there is a constant \(\omega _*>0\) such that

$$\begin{aligned} \omega (y) \ge \omega _*, \qquad \text {for all } y\in {\mathbb {R}}. \end{aligned}$$

The \(\varepsilon \)-independent initial values are

$$\begin{aligned} y_\varepsilon (0)=y_*,\qquad \dot{y}_\varepsilon (0)=p_*,\qquad z_\varepsilon (0)=0,\qquad \dot{z}_\varepsilon (0)=u_*. \end{aligned}$$
(10)

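Notice that \(y_*\in {\mathbb {R}}\) can be chosen arbitrarily, but the particular choice \(z_\varepsilon (0)=0\) is necessary to ensure that the conserved energy \(E_\varepsilon \) of the system is independent of \(\varepsilon \) and thus remains bounded as \(\varepsilon \rightarrow 0\) (for an \(\varepsilon \)-independent initial value \(z_\varepsilon (0)\ne 0\), the potential energy would be of order \(\varepsilon ^{-2}\)),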

$$\begin{aligned} E_\varepsilon = \tfrac{1}{2}\dot{y}_\varepsilon ^2 + \tfrac{1}{2}\dot{z}_\varepsilon ^2 + \tfrac{1}{2}\varepsilon ^{-2}\omega ^2(y_\varepsilon ) z_\varepsilon ^2 = \tfrac{1}{2}p_*^2 + \tfrac{1}{2} u_*^2 = E_*. \end{aligned}$$
(11)

We are primarily interested in the time evolution of the slow DOF \(y_\varepsilon \). Theorem 1 in [5, Chapter I, Sect. 2] shows that \(y_\varepsilon \) converges in the limit \(\varepsilon \rightarrow 0\) to a function \(y_0\) in \(C^1([0,T])\), which is given as the solution to the second-order differential equation \(\ddot{y}_0=-\theta _*\omega '(y_0)\) with initial values \(y_0(0)=y_*, \dot{y}_0(0)=p_*\). The constant \(\theta _*\) in the effective potential is of the form \(\theta _*=u_*^2/(2\omega (y_*))\), with \(y_*\) and \(u_*\) as in (10). We will later see that \(\theta _*\) is proportional to the action of the fast subsystem in the limit \(\varepsilon \rightarrow 0\). As a consequence, some information about the fast subsystem is retained in the slow evolution of \(y_0\).
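The convergence of \(y_\varepsilon \) to \(y_0\) can be observed in a small numerical experiment. The sketch below is not the authors' code; it assumes the hypothetical frequency profile \(\omega (y)=1+y^2\) (smooth and bounded below by 1), for which the limit equation becomes the linear oscillator \(\ddot{y}_0=-2\theta _* y_0\) with an explicit solution, and it uses a velocity-Verlet scheme as one possible integrator for (9).

```python
import math

def omega(y):
    # Hypothetical smooth frequency profile with omega(y) >= 1 > 0.
    return 1.0 + y * y

def d_omega(y):
    # Derivative of the hypothetical profile above.
    return 2.0 * y

def accel(y, z, eps):
    # Right-hand sides of the equations of motion (9a) and (9b).
    w = omega(y)
    return -w * d_omega(y) * z * z / eps**2, -w * w * z / eps**2

def simulate(eps, y_star, p_star, u_star, t_end, steps_per_eps=50):
    # Velocity-Verlet integration of (9) with the initial values (10).
    n = max(1, round(t_end * steps_per_eps / eps))
    h = t_end / n
    y, z, vy, vz = y_star, 0.0, p_star, u_star
    ay, az = accel(y, z, eps)
    for _ in range(n):
        vy += 0.5 * h * ay
        vz += 0.5 * h * az
        y += h * vy
        z += h * vz
        ay, az = accel(y, z, eps)
        vy += 0.5 * h * ay
        vz += 0.5 * h * az
    # Total energy (11) at the final time, as a consistency check.
    energy = 0.5 * vy**2 + 0.5 * vz**2 + 0.5 * (omega(y) * z / eps) ** 2
    return y, energy

def limit_solution(y_star, p_star, u_star, t):
    # For omega(y) = 1 + y^2 the limit equation is y0'' = -2 theta_* y0,
    # a harmonic oscillator with frequency k = sqrt(2 theta_*); needs u_star != 0.
    theta_star = u_star**2 / (2.0 * omega(y_star))
    k = math.sqrt(2.0 * theta_star)
    return y_star * math.cos(k * t) + p_star / k * math.sin(k * t)

eps = 0.01
y_eps, energy = simulate(eps, y_star=1.0, p_star=0.0, u_star=1.0, t_end=1.0)
y_lim = limit_solution(1.0, 0.0, 1.0, 1.0)
```

Note that the step size of the full simulation has to scale with \(\varepsilon \) to resolve the fast oscillation of \(z_\varepsilon \), whereas the limit equation contains no fast scale; this is the scalability issue described in the introduction.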

We extend the theory developed in [5] by deriving rigorously the second-order asymptotic expansion for the solution of the equations of motion (9) and interpret the corresponding expansion of the energy (11) from a thermodynamic point of view. A crucial step in the derivation of these expansions is the introduction of action-angle variables for the rapidly oscillating DOF \((z_\varepsilon ,\dot{z}_\varepsilon )\mapsto (\theta _\varepsilon , \phi _\varepsilon )\), which also involves a transformation of the generalised momentum \(\dot{y}_\varepsilon \mapsto p_\varepsilon \) to preserve the symplectic structure on the phase space as a whole.

2.1 Main results

The main results in this work can be stated as follows:

  1.

    There is a second-order asymptotic expansion of the variables \(y_\varepsilon ,\, p_\varepsilon ,\,\theta _\varepsilon ,\,\phi _\varepsilon \) introduced in the previous paragraph and defined precisely in Sect. 4; this expansion is of the form

    $$\begin{aligned} y_\varepsilon= & {} y_0 + \varepsilon [\bar{y}_1]^\varepsilon +\varepsilon ^2[\bar{y}_2]^\varepsilon + \varepsilon ^2y_3^\varepsilon ,\\ p_\varepsilon= & {} p_0 + \varepsilon [\bar{p}_1]^\varepsilon +\varepsilon ^2[\bar{p}_2]^\varepsilon + \varepsilon ^2p_3^\varepsilon ,\\ \theta _\varepsilon= & {} \theta _*+ \varepsilon [\bar{\theta }_1]^\varepsilon +\varepsilon ^2[\bar{\theta }_2]^\varepsilon + \varepsilon ^2\theta _3^\varepsilon ,\\ \phi _\varepsilon= & {} \phi _0 + \varepsilon [\bar{\phi }_1]^\varepsilon +\varepsilon ^2[\bar{\phi }_2]^\varepsilon + \varepsilon ^2\phi _3^\varepsilon , \end{aligned}$$

    where for \(i\in \{1,2\}\),

    $$\begin{aligned}{} & {} {[}\bar{y}_i]^\varepsilon :=\bar{y}_i + [y_i]^\varepsilon {\mathop {\rightharpoonup }\limits ^{*}} \bar{y}_i \quad \text {in}\quad L^\infty ([0,T]),\quad y_3^\varepsilon \rightarrow 0 \quad \text {in}\quad C([0,T]),\\{} & {} [\bar{p}_i]^\varepsilon :=\bar{p}_i + [p_i]^\varepsilon {\mathop {\rightharpoonup }\limits ^{*}} \bar{p}_i \quad \text {in}\quad L^\infty ([0,T]),\quad p_3^\varepsilon \rightarrow 0 \quad \text {in}\quad C([0,T]),\\{} & {} [\bar{\theta }_i]^\varepsilon :=\bar{\theta }_i + {[}\theta _i]^\varepsilon {\mathop {\rightharpoonup }\limits ^{*}} \bar{\theta }_i \quad \text {in}\quad L^\infty ([0,T]),\quad \theta _3^\varepsilon \rightarrow 0 \quad \text {in}\quad C([0,T]),\\{} & {} [\bar{\phi }_i]^\varepsilon :=\bar{\phi }_i + {[}\phi _i]^\varepsilon {\mathop {\rightharpoonup }\limits ^{*}} \bar{\phi }_i \quad \text {in}\quad L^\infty ([0,T]),\quad \phi _3^\varepsilon \rightarrow 0 \quad \text {in}\quad C([0,T]). \end{aligned}$$

    In other words, for each variable the second-order asymptotic expansion is characterised as follows: to leading order by Theorem 1 in [5]; to ith order by a decomposition into a slow term, indicated by an overbar, which constitutes the average motion of the ith-order expansion, and a fast term, indicated by square brackets, which oscillates rapidly and converges weakly\(^*\) to zero in \(L^\infty ([0,T])\); and by a residual term, indicated by a subscript three, which converges uniformly to zero in C([0, T]). In particular, we show that

    $$\begin{aligned} {[}\bar{y}_1]^\varepsilon = 0,\qquad [\bar{p}_1]^\varepsilon = 0,\qquad [\bar{\theta }_1]^\varepsilon = [\theta _1]^\varepsilon ,\qquad [\bar{\phi }_1]^\varepsilon = 0, \end{aligned}$$

    and that \((\bar{\phi }_2, \bar{\theta }_2, \bar{y}_2, \bar{p}_2)\) is given as the solution to the initial value problem (24), (25) (Theorem 1). Moreover, the rapidly oscillating functions \([\theta _1]^\varepsilon \), \([y_2]^\varepsilon \), \([p_2]^\varepsilon \), \([\theta _2]^\varepsilon \), and \([\phi _2]^\varepsilon \) are explicitly given in Definition 1. Finally, we show that this expansion can be interpreted as a nonlinear version of a two-scale expansion, which we briefly introduce in Sect. 4.4.

  2.

    Using the framework of Hertz [36], we define a temperature \(T_\varepsilon \), an entropy \(S_\varepsilon \), and an external force \(F_\varepsilon \) for the fast subsystem (see Sect. 5). In combination with the analytic result discussed under item 1, we decompose the total energy \(E_\varepsilon \) into the energy associated with the fast subsystem \(E_\varepsilon ^\perp \), i.e.

    $$\begin{aligned} E_\varepsilon ^\perp = \tfrac{1}{2}\dot{z}_\varepsilon ^2 + \tfrac{1}{2}\varepsilon ^{-2}\omega ^2(y_\varepsilon ) z_\varepsilon ^2, \end{aligned}$$

    and the residual energy \(E_\varepsilon ^\parallel = E_\varepsilon - E_\varepsilon ^\perp \). We expand, similar to above, \(E_\varepsilon ^\perp \), \(E_\varepsilon ^\parallel \), \(T_\varepsilon \), \(S_\varepsilon \), and \(F_\varepsilon \) into the form

    $$\begin{aligned} E_\varepsilon ^\perp= & {} E_0^\perp + \varepsilon [\bar{E}_1^\perp ]^\varepsilon +\varepsilon ^2[\bar{E}_2^\perp ]^\varepsilon + \varepsilon ^2E_3^{\perp \varepsilon },\\ E_\varepsilon ^\parallel= & {} E_0^\parallel + \varepsilon [\bar{E}_1^\parallel ]^\varepsilon +\varepsilon ^2[\bar{E}_2^\parallel ]^\varepsilon + \varepsilon ^2E_3^{\parallel \varepsilon },\\ S_\varepsilon= & {} S_0 +\varepsilon [\bar{S}_1]^\varepsilon + \varepsilon ^2[\bar{S}_2]^\varepsilon + \varepsilon ^2S_3^\varepsilon ,\\ T_\varepsilon= & {} T_0 + T_1^\varepsilon ,\\ F_\varepsilon= & {} F_0 + F_1^\varepsilon , \end{aligned}$$

    where \(T_1^\varepsilon ,F_1^\varepsilon \rightarrow 0\) in C([0, T]) and for \(i\in \{1,2\}\)

    $$\begin{aligned}{} & {} [\bar{E}_i^\perp ]^\varepsilon :=\bar{E}_i^\perp + {[}E_i^\perp ]^\varepsilon {\mathop {\rightharpoonup }\limits ^{*}} \bar{E}_i^\perp \quad \text {in}\quad L^\infty ([0,T]),\quad E_3^{\perp \varepsilon } \rightarrow 0 \quad \text {in}\quad C([0,T]),\\{} & {} [\bar{E}_i^\parallel ]^\varepsilon :=\bar{E}_i^\parallel + [E_i^\parallel ]^\varepsilon {\mathop {\rightharpoonup }\limits ^{*}} \bar{E}_i^\parallel \quad \text {in}\quad L^\infty ([0,T]),\quad E_3^{\parallel \varepsilon } \rightarrow 0 \quad \text {in}\quad C([0,T]),\\{} & {} [\bar{S}_i]^\varepsilon :=\bar{S}_i + {[}S_i]^\varepsilon {\mathop {\rightharpoonup }\limits ^{*}} \bar{S}_i \quad \text {in}\quad L^\infty ([0,T]),\quad S_3^\varepsilon \rightarrow 0 \quad \text {in}\quad C([0,T]). \end{aligned}$$

    The characterisation of the ith-order expansion is similar to above; moreover, it follows from (11) that

    $$\begin{aligned}{} & {} E_\varepsilon = E_0^\perp + E_0^\parallel = E_*,\qquad [\bar{E}_1^\perp ]^\varepsilon + [\bar{E}_1^\parallel ]^\varepsilon = 0, \\{} & {} [\bar{E}_2^\perp ]^\varepsilon + [\bar{E}_2^\parallel ]^\varepsilon = 0, \qquad E_3^{\perp \varepsilon } + E_3^{\parallel \varepsilon } = 0. \end{aligned}$$

    In Sect. 5, we show that

    $$\begin{aligned}{}[\bar{E}_1^\perp ]^\varepsilon = [E_1^\perp ]^\varepsilon ,\qquad [\bar{E}_1^\parallel ]^\varepsilon = [E_1^\parallel ]^\varepsilon , \qquad [\bar{S}_1]^\varepsilon = [S_1]^\varepsilon , \end{aligned}$$

    and interpret the asymptotic expansion from a thermodynamic point of view. In particular, we show that to leading order the entropy expression remains constant, i.e. \(\hbox {d} S_0 = 0\), and consequently, the dynamics can be interpreted as an adiabatic thermodynamic process characterised by an energy relation that defines processes in thermodynamic equilibrium,

    $$\begin{aligned} \hbox {d} E_0^\perp = F_0 \hbox {d} y_0 + T_0 \hbox {d} S_0. \end{aligned}$$

    In contrast, we show that the averaged second-order dynamics, i.e. the dynamics in the weak\(^*\) limit, indicated by an overbar, represents a non-adiabatic thermodynamic process with an averaged non-constant entropy, \(\hbox {d} \bar{S}_2\ne 0\), that similar to above satisfies relations akin to equilibrium thermodynamics—namely

    $$\begin{aligned} \hbox {d} \bar{E}_2^\perp = F_0 \hbox {d} \bar{y}_2 + T_0 \hbox {d} \bar{S}_2, \end{aligned}$$

    where \(\bar{S}_2\) indicates the averaged second-order entropy expression—despite being beyond the limit \(\varepsilon \rightarrow 0\). Finally, we show in Theorem 2 that the evolution of \((\bar{y}_2, \bar{p}_2)\) is governed by equations which formally bear resemblance to Hamilton’s canonical equations,

    $$\begin{aligned} \frac{\hbox {d} \bar{y}_2 }{\hbox {d}t} = \frac{\partial \bar{E}_2}{\partial p_0},\qquad \frac{\hbox {d} \bar{p}_2}{\hbox {d}t} = -\frac{\partial \bar{E}_2 }{\partial y_0}, \end{aligned}$$

    for \(\bar{E}_2 = \bar{E}_2^\perp + \bar{E}_2^\parallel \), which are complemented by the \(\varepsilon \)-independent initial values

    $$\begin{aligned} \bar{y}_2(0)= -[y_2]^\varepsilon (0), \qquad \bar{p}_2(0)= -[p_2]^\varepsilon (0). \end{aligned}$$

3 The model problem in action-angle variables

To study the dynamics of \(y_\varepsilon \) and \(z_\varepsilon \) for \(0<\varepsilon <\varepsilon _0\), a detailed asymptotic analysis is required. An in-depth description for the case of multiple fast and slow DOF can be found in [33], which similarly uses ideas from [5, 39]. The proof sketches given below are conceptually similar to those in [33], but are more transparent due to less notational overhead. The idea is to transform the fast DOF into action-angle variables. To this end, one first phrases the problem in Hamiltonian form. For this, one denotes by \((\eta _\varepsilon , \zeta _\varepsilon )\) the canonical momenta corresponding to the positions \((y_\varepsilon , z_\varepsilon )\). Then the equations of motion (9), together with the velocity relations \(\dot{y}_\varepsilon = \eta _\varepsilon \) and \(\dot{z}_\varepsilon = \zeta _\varepsilon \), are given by the canonical equations of motion belonging to the energy function

$$\begin{aligned} E_\varepsilon (y_\varepsilon , \eta _\varepsilon , z_\varepsilon , \zeta _\varepsilon ) = \tfrac{1}{2}\eta _\varepsilon ^2 + \tfrac{1}{2}\zeta _\varepsilon ^2 + \tfrac{1}{2}\varepsilon ^{-2}\omega ^2(y_\varepsilon )z_\varepsilon ^2. \end{aligned}$$
(12)

To take the oscillatory character of \(z_\varepsilon \) into account, one introduces particular action-angle variables \((\theta _\varepsilon , \phi _\varepsilon )\) for the fast DOF \((z_\varepsilon , \zeta _\varepsilon )\),

$$\begin{aligned} z_\varepsilon = \varepsilon \sqrt{\frac{2\theta _\varepsilon }{\omega (y_\varepsilon )}}\sin (\varepsilon ^{-1}\phi _\varepsilon ),\qquad \zeta _\varepsilon = \sqrt{2\theta _\varepsilon \omega (y_\varepsilon )}\cos (\varepsilon ^{-1}\phi _\varepsilon ), \end{aligned}$$
(13)

where we recall the assumption of uniform positivity from Sect. 2, i.e. \(\omega (y)\ge \omega _*>0\) for all \(y\in {\mathbb {R}}\).
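A short calculation indicates why \(\theta _\varepsilon \) plays the role of an action. Substituting (13) into the energy of the fast subsystem, \(E_\varepsilon ^\perp = \tfrac{1}{2}\zeta _\varepsilon ^2 + \tfrac{1}{2}\varepsilon ^{-2}\omega ^2(y_\varepsilon ) z_\varepsilon ^2\) (cf. Sect. 2.1), gives

$$\begin{aligned} E_\varepsilon ^\perp = \theta _\varepsilon \omega (y_\varepsilon )\cos ^2(\varepsilon ^{-1}\phi _\varepsilon ) + \theta _\varepsilon \omega (y_\varepsilon )\sin ^2(\varepsilon ^{-1}\phi _\varepsilon ) = \theta _\varepsilon \omega (y_\varepsilon ), \end{aligned}$$

so that \(\theta _\varepsilon = E_\varepsilon ^\perp /\omega (y_\varepsilon )\) is the energy-to-frequency ratio of the fast oscillator. With the initial values (10), i.e. \(z_\varepsilon (0)=0\) and \(\zeta _\varepsilon (0)=u_*\), this yields

$$\begin{aligned} \theta _\varepsilon (0) = \frac{u_*^2}{2\omega (y_*)} = \theta _*, \end{aligned}$$

which is consistent with the constant \(\theta _*\) appearing in the limit equation of Sect. 2.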

The transformation \((z_\varepsilon , \zeta _\varepsilon )\mapsto (\theta _\varepsilon ,\phi _\varepsilon )\) can be found using the theory of generating functions [30, Sect. 48]. Even though this transformation would be symplectic for fixed \(y_\varepsilon \), this is not the case for the transformation of all phase-space variables \((y_\varepsilon , \eta _\varepsilon ;z_\varepsilon , \zeta _\varepsilon )\mapsto (y_\varepsilon , \eta _\varepsilon ; \theta _\varepsilon ,\phi _\varepsilon )\). To ensure that the transformation of all phase-space variables remains symplectic, an additional transformation of the position \(y_\varepsilon \) or the momentum \(\eta _\varepsilon \) is required. If we decide to keep the position variable unaffected by the transformation, the generating function takes the form

$$\begin{aligned} S_{\textrm{gen}}(y_\varepsilon , p_\varepsilon , z_\varepsilon , \phi _\varepsilon ) = p_\varepsilon y_\varepsilon +\tfrac{1}{2}\varepsilon ^{-1}\omega (y_\varepsilon )z_\varepsilon ^2 \cot (\varepsilon ^{-1}\phi _\varepsilon ). \end{aligned}$$
(14)

The resulting transformation \((y_\varepsilon , \eta _\varepsilon ;z_\varepsilon , \zeta _\varepsilon )\mapsto (y_\varepsilon , p_\varepsilon ; \phi _\varepsilon , \theta _\varepsilon )\) is symplectic on the whole phase space. Indeed, the energy function (12) transforms to the expression

$$\begin{aligned} E_\varepsilon = \frac{1}{2}p_\varepsilon ^2 + \theta _\varepsilon \omega (y_\varepsilon ) + \varepsilon \frac{\theta _\varepsilon p_\varepsilon \omega '(y_\varepsilon )}{2\omega (y_\varepsilon )}\sin (2\varepsilon ^{-1}\phi _\varepsilon )+\frac{\varepsilon ^2}{8} \left( \frac{\theta _\varepsilon \omega '(y_\varepsilon )}{\omega (y_\varepsilon )} \sin (2\varepsilon ^{-1}\phi _\varepsilon ) \right) ^2,\nonumber \\ \end{aligned}$$
(15)

and the transformed DOF satisfy the equations of motion

$$\begin{aligned} \dot{\phi }_\varepsilon = \frac{\partial E_\varepsilon }{\partial \theta _\varepsilon }, \qquad \dot{\theta }_\varepsilon = -\frac{\partial E_\varepsilon }{\partial \phi _\varepsilon },\qquad \dot{y}_\varepsilon = \frac{\partial E_\varepsilon }{\partial p_\varepsilon }, \qquad \dot{p}_\varepsilon = - \frac{\partial E_\varepsilon }{\partial y_\varepsilon }. \end{aligned}$$

After some calculations, the equations of motion take the form

$$\begin{aligned} \dot{\phi }_\varepsilon= & {} \omega (y_\varepsilon ) +\varepsilon \frac{p_\varepsilon \omega '(y_\varepsilon )}{2\omega (y_\varepsilon )} \sin (2\varepsilon ^{-1} \phi _\varepsilon ) +\varepsilon ^2 \frac{\theta _\varepsilon \left( \omega '(y_\varepsilon ) \right) ^2}{4 \omega ^2(y_\varepsilon )}\sin ^2(2\varepsilon ^{-1}\phi _\varepsilon ),\end{aligned}$$
(16a)
$$\begin{aligned} \dot{\theta }_\varepsilon= & {} -\frac{ \theta _\varepsilon p_\varepsilon \omega '(y_\varepsilon )}{\omega (y_\varepsilon )}\cos (2\varepsilon ^{-1}\phi _\varepsilon ) -\varepsilon \frac{\theta _\varepsilon ^2\left( \omega '(y_\varepsilon ) \right) ^2}{4\omega ^2(y_\varepsilon )}\sin (4\varepsilon ^{-1}\phi _\varepsilon ), \end{aligned}$$
(16b)
$$\begin{aligned} \dot{y}_\varepsilon= & {} p_\varepsilon + \varepsilon \frac{\theta _\varepsilon \omega '(y_\varepsilon )}{2 \omega (y_\varepsilon )} \sin (2\varepsilon ^{-1}\phi _\varepsilon ), \end{aligned}$$
(16c)
$$\begin{aligned} \dot{p}_\varepsilon= & {} -\theta _\varepsilon \omega '(y_\varepsilon ) +\varepsilon \frac{\theta _\varepsilon p_\varepsilon \left( \omega '(y_\varepsilon ) \right) ^2}{2 \omega ^2(y_\varepsilon )} \sin (2\varepsilon ^{-1}\phi _\varepsilon ) \nonumber \\{} & {} -\> \varepsilon \frac{\theta _\varepsilon p_\varepsilon \omega ''(y_\varepsilon )}{2\omega (y_\varepsilon )} \sin (2\varepsilon ^{-1}\phi _\varepsilon ) + \varepsilon ^2 \frac{\theta _\varepsilon ^2 \left( \omega '(y_\varepsilon ) \right) ^3}{4\omega ^3(y_\varepsilon )} \sin ^2(2\varepsilon ^{-1}\phi _\varepsilon ) \nonumber \\{} & {} - \> \varepsilon ^2 \frac{\theta _\varepsilon ^2 \omega '(y_\varepsilon )\omega ''(y_\varepsilon ) }{4\omega ^2(y_\varepsilon )}\sin ^2(2\varepsilon ^{-1}\phi _\varepsilon ). \end{aligned}$$
(16d)
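
The calculations leading from (15) to (16) can be cross-checked numerically. The sketch below is an illustration under assumptions: it uses a hypothetical frequency \(\omega (y)=2+\sin (y)\), implements the energy (15) and the right-hand side (16), and compares the latter with central finite differences of the former. The function names are ours.

```python
import math

def omega(y):    return 2.0 + math.sin(y)   # hypothetical frequency (assumption)
def d_omega(y):  return math.cos(y)         # omega'
def dd_omega(y): return -math.sin(y)        # omega''

def energy(phi, theta, y, p, eps):
    """Transformed energy function, Eq. (15)."""
    w, dw = omega(y), d_omega(y)
    s = math.sin(2.0 * phi / eps)
    return (0.5 * p**2 + theta * w
            + eps * theta * p * dw / (2.0 * w) * s
            + eps**2 / 8.0 * (theta * dw / w * s) ** 2)

def rhs(phi, theta, y, p, eps):
    """Right-hand side of Eqs. (16a)-(16d), in the order (phi, theta, y, p)."""
    w, dw, ddw = omega(y), d_omega(y), dd_omega(y)
    s = math.sin(2.0 * phi / eps)
    c = math.cos(2.0 * phi / eps)
    s4 = math.sin(4.0 * phi / eps)
    dphi = w + eps * p * dw / (2 * w) * s + eps**2 * theta * dw**2 / (4 * w**2) * s**2
    dtheta = -theta * p * dw / w * c - eps * theta**2 * dw**2 / (4 * w**2) * s4
    dy = p + eps * theta * dw / (2 * w) * s
    dp = (-theta * dw
          + eps * theta * p * dw**2 / (2 * w**2) * s
          - eps * theta * p * ddw / (2 * w) * s
          + eps**2 * theta**2 * dw**3 / (4 * w**3) * s**2
          - eps**2 * theta**2 * dw * ddw / (4 * w**2) * s**2)
    return dphi, dtheta, dy, dp

def hamiltonian_field_fd(phi, theta, y, p, eps, h=1e-6):
    """Central differences for (dE/dtheta, -dE/dphi, dE/dp, -dE/dy)."""
    def partial(i):
        plus, minus = [phi, theta, y, p], [phi, theta, y, p]
        plus[i] += h
        minus[i] -= h
        return (energy(*plus, eps) - energy(*minus, eps)) / (2 * h)
    return partial(1), -partial(0), partial(3), -partial(2)
```

The step `h` has to be small compared with \(\varepsilon \), because the energy oscillates in \(\phi _\varepsilon \) on the scale \(\varepsilon \).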

The initial values, as given in (10), transform to

$$\begin{aligned} \phi _\varepsilon (0)=0, \qquad \theta _\varepsilon (0)=\theta _*= \frac{u_*^2}{2\omega (y_*)},\qquad y_\varepsilon (0)= y_*, \qquad p_\varepsilon (0)= p_*. \end{aligned}$$
(17)

At this point, it becomes clear how the governing Newtonian equations of motion (9) are related to Eq. (2), which forms the basis for many fast–slow systems usually analysed using averaging methods. That is, by introducing the slow DOF \(\textsf{y}_\varepsilon :=(\theta _\varepsilon , y_\varepsilon , p_\varepsilon )\) and the fast DOF \(\textsf{z}_\varepsilon :=\varepsilon ^{-1}\phi _\varepsilon \), the system of differential equations (16) takes the form

$$\begin{aligned} \dot{\textsf{y}}_\varepsilon= & {} f_0(\textsf{y}_\varepsilon ,\textsf{z}_\varepsilon ) + \varepsilon f_1(\textsf{y}_\varepsilon ,\textsf{z}_\varepsilon ) + \varepsilon ^2 f_2(\textsf{y}_\varepsilon ,\textsf{z}_\varepsilon ), \\ \dot{\textsf{z}}_\varepsilon= & {} \varepsilon ^{-1} g_{-1}(\textsf{y}_\varepsilon ,\textsf{z}_\varepsilon ) + g_0(\textsf{y}_\varepsilon ,\textsf{z}_\varepsilon ) + \varepsilon g_1(\textsf{y}_\varepsilon ,\textsf{z}_\varepsilon ), \end{aligned}$$

which, except for some higher-order terms, coincides with Eq. (2).

4 Second-order asymptotic expansion

In this section, we rigorously derive the second-order asymptotic expansion for the solution of the initial value problem (16), (17). Let us denote the right-hand side of (16) by \({\mathcal {F}}_\varepsilon :{\mathbb {R}}^4\rightarrow {\mathbb {R}}^4\). Because \({\mathcal {F}}_\varepsilon \) is locally Lipschitz continuous, by the standard existence and uniqueness theory for ordinary differential equations, there exists a \(0<T<\infty \) such that the initial value problem (16), (17) has a unique solution \((\phi _\varepsilon , \theta _\varepsilon , y_\varepsilon , p_\varepsilon )\) in \(C^\infty ([0,T],{\mathbb {R}}^4)\), for fixed \(0<\varepsilon <\varepsilon _0\).

4.1 Leading-order expansion

For \(0< \varepsilon < \varepsilon _0\), let \((\phi _\varepsilon , \theta _\varepsilon , y_\varepsilon , p_\varepsilon )\) in \(C^\infty ([0,T],{\mathbb {R}}^4)\) be the unique solution of the initial value problem (16), (17). We analyse a sequence \(\{\phi _\varepsilon \}, \{\theta _\varepsilon \}, \{y_\varepsilon \}, \{p_\varepsilon \}\) of this solution for \(\varepsilon \rightarrow 0\). The right-hand side of (16) is oscillatory and has in particular highly oscillatory leading-order terms. As a consequence, the sequences \(\{\hbox {d}\phi _\varepsilon /\hbox {d}t\}\), \(\{\theta _\varepsilon \}\), \(\{\hbox {d}y_\varepsilon /\hbox {d}t\}\), \(\{\hbox {d}p_\varepsilon /\hbox {d}t\}\) are bounded in the space \(C^{0,1}([0,T])\) of uniformly Lipschitz continuous functions, while sequences of higher-order derivatives (in particular \(\{\hbox {d}^2\theta _\varepsilon /\hbox {d}t^2\}\), which will require special attention in the later part of this work) become unbounded as \(\varepsilon \rightarrow 0\). It follows from the extended Arzelà–Ascoli theorem [5, Principle 4, Chapter I, Sect. 1] that we can extract a subsequence, not relabelled, and functions \(\theta _0 \in C^{0,1}([0,T])\) and \(\phi _0, y_0, p_0 \in C^{1,1}([0,T])\), such that

$$\begin{aligned}{} & {} \phi _\varepsilon \rightarrow \phi _0 \quad \text {in} \quad C^1([0,T]), \qquad \ddot{\phi }_\varepsilon {\mathop {\rightharpoonup }\limits ^{*}}\ddot{\phi }_0 \quad \text {in} \quad L^\infty ([0,T]), \end{aligned}$$
(18a)
$$\begin{aligned}{} & {} \theta _\varepsilon \rightarrow \theta _0 \quad \text {in} \quad C([0,T]), \qquad \dot{\theta }_\varepsilon {\mathop {\rightharpoonup }\limits ^{*}}\dot{\theta }_0 \quad \text {in} \quad L^\infty ([0,T]), \end{aligned}$$
(18b)
$$\begin{aligned}{} & {} y_\varepsilon \rightarrow y_0 \quad \text {in} \quad C^1([0,T]), \qquad \ddot{y}_\varepsilon {\mathop {\rightharpoonup }\limits ^{*}}\ddot{y}_0 \quad \text {in} \quad L^\infty ([0,T]), \end{aligned}$$
(18c)
$$\begin{aligned}{} & {} p_\varepsilon \rightarrow p_0 \quad \text {in} \quad C^1([0,T]), \qquad \ddot{p}_\varepsilon {\mathop {\rightharpoonup }\limits ^{*}}\ddot{p}_0 \quad \text {in} \quad L^\infty ([0,T]). \end{aligned}$$
(18d)

By taking the limit \(\varepsilon \rightarrow 0\) in Eqs. (16a), (16c), and (16d), we deduce that \(\dot{\phi }_0 = \omega (y_0)\), \(\dot{y}_0 = p_0\), and \(\dot{p}_0 = -\theta _0 \omega '(y_0)\). Moreover, from Eq. (16b) it can be read off that \(\dot{\theta }_\varepsilon \) is rapidly oscillating around zero. By observing that the weak\(^*\) convergence in \(L^\infty ([0,T])\) helps to ignore rapid fluctuations of functions, property (18b) can be used to deduce that \(\dot{\theta }_0=0\) and in particular, that \(\theta _0\equiv \theta _*\) (compare with (17)).

Finally, since the right-hand side of the limit equations

$$\begin{aligned} \dot{\phi }_0 = \omega (y_0),\qquad \dot{\theta }_0 =0,\qquad \dot{y}_0 = p_0, \qquad \dot{p}_0 = -\theta _*\omega '(y_0), \end{aligned}$$
(19)

does not depend on the chosen subsequence, we can discard the extraction of subsequences altogether [5, Principle 5, Chapter I, Sect. 1].
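
The limit equations (19) form a smooth, non-stiff system that can be integrated with a standard scheme. The following sketch is an illustration under assumptions: the frequency \(\omega (y)=2+\sin (y)\), the classical fourth-order Runge–Kutta scheme, and all function names are our choices. It also monitors the quantity \(\tfrac{1}{2}p_0^2+\theta _*\omega (y_0)\), which is constant along solutions of (19), as one verifies by differentiating in time and inserting \(\dot{y}_0=p_0\) and \(\dot{p}_0=-\theta _*\omega '(y_0)\).

```python
import math

def omega(y):   return 2.0 + math.sin(y)   # hypothetical frequency (assumption)
def d_omega(y): return math.cos(y)         # omega'

def limit_rhs(state, theta_star):
    """Right-hand side of the limit equations (19) for (phi_0, y_0, p_0)."""
    phi0, y0, p0 = state
    return (omega(y0), p0, -theta_star * d_omega(y0))

def rk4_step(f, state, dt):
    """One classical Runge-Kutta step for a state stored as a tuple."""
    k1 = f(state)
    k2 = f(tuple(x + 0.5 * dt * k for x, k in zip(state, k1)))
    k3 = f(tuple(x + 0.5 * dt * k for x, k in zip(state, k2)))
    k4 = f(tuple(x + dt * k for x, k in zip(state, k3)))
    return tuple(x + dt / 6.0 * (a + 2 * b + 2 * c + d)
                 for x, a, b, c, d in zip(state, k1, k2, k3, k4))

def solve_limit(y_star, p_star, theta_star, T=1.0, n_steps=1000):
    """Trajectory of (19) with phi_0(0) = 0, y_0(0) = y_star, p_0(0) = p_star."""
    state = (0.0, y_star, p_star)
    trajectory = [state]
    for _ in range(n_steps):
        state = rk4_step(lambda s: limit_rhs(s, theta_star), state, T / n_steps)
        trajectory.append(state)
    return trajectory

def effective_energy(state, theta_star):
    """p_0^2 / 2 + theta_* omega(y_0), conserved along solutions of (19)."""
    _, y0, p0 = state
    return 0.5 * p0**2 + theta_star * omega(y0)
```

Note that the step size here is independent of \(\varepsilon \), in contrast to a direct integration of (16).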

4.2 Reformulation of the governing equations

For the following part of this work, it is convenient to introduce a notation which simplifies the system of differential equations (16); namely, for \(0\le \varepsilon < \varepsilon _0\) and \(k,l \in {\mathbb {N}}_0\) we define the expression \(L_\varepsilon :=\log \left( \omega (y_\varepsilon ) \right) \) and, based on this,

$$\begin{aligned} D^k_t D^l_y L_\varepsilon :=\frac{\hbox {d}^k}{\hbox {d}t^k}\frac{\hbox {d}^l L_\varepsilon }{\hbox {d} y_\varepsilon ^l}. \end{aligned}$$

Then the system of differential equations (16) can be written as

$$\begin{aligned} \dot{\phi }_\varepsilon= & {} \omega (y_\varepsilon ) + \frac{\varepsilon }{2} D_t L_\varepsilon \sin \left( 2\varepsilon ^{-1}\phi _\varepsilon \right) , \end{aligned}$$
(20a)
$$\begin{aligned} \dot{\theta }_\varepsilon= & {} - \theta _\varepsilon D_t L_\varepsilon \cos (2\varepsilon ^{-1}\phi _\varepsilon ), \end{aligned}$$
(20b)
$$\begin{aligned} \dot{y}_\varepsilon= & {} p_\varepsilon + \frac{\varepsilon }{2} \theta _\varepsilon D_y L_\varepsilon \sin (2\varepsilon ^{-1}\phi _\varepsilon ), \end{aligned}$$
(20c)
$$\begin{aligned} \dot{p}_\varepsilon= & {} -\theta _\varepsilon \omega '(y_\varepsilon ) - \frac{\varepsilon }{2} \theta _\varepsilon D_t D_y L_\varepsilon \sin (2\varepsilon ^{-1}\phi _\varepsilon ). \end{aligned}$$
(20d)
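
As a brief worked check of this reformulation, consider the first equation. By the chain rule and (16c),

$$\begin{aligned} D_t L_\varepsilon = \frac{\omega '(y_\varepsilon )}{\omega (y_\varepsilon )}\,\dot{y}_\varepsilon = \frac{\omega '(y_\varepsilon )}{\omega (y_\varepsilon )}\left( p_\varepsilon + \varepsilon \frac{\theta _\varepsilon \omega '(y_\varepsilon )}{2\omega (y_\varepsilon )}\sin (2\varepsilon ^{-1}\phi _\varepsilon ) \right) , \end{aligned}$$

so that

$$\begin{aligned} \frac{\varepsilon }{2} D_t L_\varepsilon \sin (2\varepsilon ^{-1}\phi _\varepsilon ) = \varepsilon \frac{p_\varepsilon \omega '(y_\varepsilon )}{2\omega (y_\varepsilon )}\sin (2\varepsilon ^{-1}\phi _\varepsilon ) + \varepsilon ^2\frac{\theta _\varepsilon \left( \omega '(y_\varepsilon ) \right) ^2}{4\omega ^2(y_\varepsilon )}\sin ^2(2\varepsilon ^{-1}\phi _\varepsilon ), \end{aligned}$$

which are exactly the two \(\varepsilon \)-dependent terms in (16a). The remaining equations can be checked in the same way, using \(D_y L_\varepsilon = \omega '(y_\varepsilon )/\omega (y_\varepsilon )\) and, for (16b), the identity \(2\sin (x)\cos (x)=\sin (2x)\) with \(x=2\varepsilon ^{-1}\phi _\varepsilon \).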

4.3 First- and second-order expansion

To analyse the dynamics of the model problem away from the limit \(\varepsilon \rightarrow 0\), a higher-order asymptotic expansion in \(\varepsilon \) is required.

We first define particular functions that appear in the first- and second-order expansion before we state the theorem that embodies the first main result (item 1) of Sect. 2.1.

Definition 1

Let \((\phi _\varepsilon , \theta _\varepsilon , y_\varepsilon , p_\varepsilon )\) be the solution to the initial value problem (16), (17) and \((\phi _0,\theta _0,y_0,p_0)\) be the solution to the initial value problem (19), (17) such that (18) holds. With the notation introduced in Sect. 4.2, we define the functions

$$\begin{aligned} \theta _1^\varepsilon :=\frac{\theta _\varepsilon -\theta _*}{\varepsilon }, \end{aligned}$$
(21)

and

$$\begin{aligned}{} & {} \phi _2^\varepsilon :=\frac{\phi _\varepsilon -\phi _0}{\varepsilon ^2}, \qquad y_2^\varepsilon :=\frac{y_\varepsilon -y_0}{\varepsilon ^2},\qquad p_2^\varepsilon :=\frac{p_\varepsilon -p_0}{\varepsilon ^2}, \qquad \theta _2^\varepsilon :=\frac{\theta _1^\varepsilon -[\theta _1]^\varepsilon }{\varepsilon }, \nonumber \\{} & {} [\theta _1]^\varepsilon :=-\frac{\theta _*D_t L_0}{2\omega (y_0)}\sin (2\varepsilon ^{-1}\phi _0),\,\,\,\,\,\, [\phi _2]^\varepsilon :=-\frac{D_t L_0}{4\omega (y_0)}\cos (2\varepsilon ^{-1}\phi _0),\nonumber \\{} & {} [y_2]^\varepsilon :=-\frac{\theta _*D_y L_0}{4\omega (y_0)}\cos (2\varepsilon ^{-1}\phi _0), \,\,\,\, [p_2]^\varepsilon :=\frac{\textrm{d}}{\textrm{d}t}\left( \frac{\theta _*D_y L_0}{4\omega (y_0)} \right) \cos (2\varepsilon ^{-1}\phi _0), \end{aligned}$$
(22)

and

$$\begin{aligned}{} & {} [\theta _2]^\varepsilon :=-\theta _*D_y L_0 [y_2]^\varepsilon - \frac{p_0}{\omega (y_0)} [p_2]^\varepsilon + \frac{\theta _*^2 (D_y L_0)^2}{16 \omega (y_0)}\cos (4\varepsilon ^{-1}\phi _0) \\{} & {} \qquad \qquad \quad - \> \frac{\theta _*D_t L_0 }{\omega (y_0)}\bar{\phi }_2\cos (2\varepsilon ^{-1}\phi _0). \end{aligned}$$

The functions \(\theta _1^\varepsilon \), \(\phi _2^\varepsilon \), \(y_2^\varepsilon \), and \(p_2^\varepsilon \) as defined in (21) and (22) describe scaled versions of the residual motion of the originally given DOF and their homogenised versions. The corresponding subscript indicates the scaling order and marks their relevance in the first- and second-order expansion. To determine the relevant term in the second-order expansion for \(\theta _\varepsilon \), we similarly define by \(\theta _2^\varepsilon \) the scaled residual motion of \(\theta _\varepsilon \) and its first-order expansion, which is derived in a two-step procedure via \(\theta _1^\varepsilon \). As indicated before in item 1 in Sect. 2.1, the second-order expansions consist of oscillating and non-oscillating terms. The oscillating terms are denoted by expressions in square brackets. They oscillate rapidly around zero and thus satisfy

$$\begin{aligned}{}[\theta _1]^\varepsilon , [\phi _2]^\varepsilon , [y_2]^\varepsilon , [p_2]^\varepsilon , [\theta _2]^\varepsilon {\mathop {\rightharpoonup }\limits ^{*}} 0 \quad \text {in} \quad L^\infty ([0,T]). \end{aligned}$$
(23)

The non-oscillatory terms are characterised in the following theorem, which is the main analytic result of this article. One key result is that the non-oscillatory terms of the second-order expansion, marked by an overbar and subscript 2, satisfy a system of ordinary differential equations.

Theorem 1

Let \((\phi _\varepsilon , \theta _\varepsilon , y_\varepsilon , p_\varepsilon )\) be the solution to the initial value problem (16), (17) and \((\phi _0,\theta _0,y_0,p_0)\) be the solution to the initial value problem (19), (17) such that (18) holds. Then, the functions specified in Definition 1 satisfy

$$\begin{aligned}{} & {} \theta _1^\varepsilon -[\theta _1]^\varepsilon \rightarrow 0 \quad \text {in} \quad C([0,T]), \frac{\textrm{d}}{\textrm{d}t}\left( \theta _1^\varepsilon -[\theta _1]^\varepsilon \right) {\mathop {\rightharpoonup }\limits ^{*}}0\quad \text {in}\quad L^\infty ([0,T]),\\{} & {} \quad \phi _2^\varepsilon -[\phi _2]^\varepsilon \rightarrow \bar{\phi }_2 \quad \text {in} \quad C([0,T]), \frac{\textrm{d}}{\textrm{d}t}\left( \phi _2^\varepsilon -[\phi _2]^\varepsilon \right) {\mathop {\rightharpoonup }\limits ^{*}}\frac{\textrm{d} \bar{\phi }_2}{\textrm{d}t}\quad \text {in}\quad L^\infty ([0,T]),\\{} & {} \quad y_2^\varepsilon -[y_2]^\varepsilon \rightarrow \bar{y}_2 \quad \text {in} \quad C([0,T]), \frac{\textrm{d}}{\textrm{d}t} \left( y_2^\varepsilon -[y_2]^\varepsilon \right) {\mathop {\rightharpoonup }\limits ^{*}}\frac{\textrm{d} \bar{y}_2}{\textrm{d}t}\quad \text {in}\quad L^\infty ([0,T]),\\{} & {} \quad p_2^\varepsilon -[p_2]^\varepsilon \rightarrow \bar{p}_2 \quad \text {in} \quad C([0,T]), \frac{\textrm{d}}{\textrm{d}t} \left( p_2^\varepsilon -[p_2]^\varepsilon \right) {\mathop {\rightharpoonup }\limits ^{*}}\frac{\textrm{d} \bar{p}_2}{\textrm{d}t}\quad \text {in}\quad L^\infty ([0,T]), \end{aligned}$$

and

$$\begin{aligned} \theta _2^\varepsilon -[\theta _2]^\varepsilon \rightarrow \bar{\theta }_2 \quad \text {in}\quad C([0,T]), \end{aligned}$$

where \((\bar{\phi }_2, \bar{\theta }_2, \bar{y}_2, \bar{p}_2)\) is the unique solution to the inhomogeneous linear system of differential equations

$$\begin{aligned} \frac{\textrm{d} \bar{\phi }_2}{\textrm{d}t}= & {} \omega '(y_0)\bar{y}_2 + \frac{\theta _*(D_y L_0)^2}{8} - \frac{(D_t L_0)^2}{8 \omega (y_0)}, \end{aligned}$$
(24a)
$$\begin{aligned} \frac{\textrm{d} \bar{\theta }_2}{\textrm{d}t}= & {} \frac{\textrm{d}}{\textrm{d}t} \frac{\theta _*(D_t L_0)^2}{8 \omega ^2(y_0)}, \end{aligned}$$
(24b)
$$\begin{aligned} \frac{\textrm{d} \bar{y}_2}{\textrm{d}t}= & {} \bar{p}_2 - \frac{\theta _*D_y L_0 D_t L_0 }{4\omega (y_0)}, \end{aligned}$$
(24c)
$$\begin{aligned} \frac{\textrm{d} \bar{p}_2}{\textrm{d}t}= & {} -\omega '(y_0) \bar{\theta }_2- \theta _*\omega ''(y_0) \bar{y}_2-\frac{\theta _*^2 D_y L_0 D_y^2 L_0}{8}+\frac{\theta _*D_t L_0 D_t D_y L_0}{4\omega (y_0)},\qquad \end{aligned}$$
(24d)

with \(\varepsilon \)-independent initial values

$$\begin{aligned} \bar{\phi }_2(0)= & {} -[\phi _2]^\varepsilon (0),\qquad \bar{\theta }_2(0)=-[\theta _2]^\varepsilon (0), \end{aligned}$$
(25a)
$$\begin{aligned} \bar{y}_2(0)= & {} -[y_2]^\varepsilon (0), \qquad \bar{p}_2(0)= -[p_2]^\varepsilon (0). \end{aligned}$$
(25b)

The result in Theorem 1 is central for this article in two different ways. Firstly, it will be crucial for the thermodynamic interpretation in the second part of this article. Secondly, it is interesting for computational purposes. That is, in simulating a natural evolution of a light particle coupled to a heavy particle, their mass ratio \(\varepsilon \) will be small but finite and enters into the underlying model through potentials of different strengths. The above result says that rather than solving the coupled system directly, which is restricted to a small step size to ensure numerical stability, the approximation to second order can be computed by combining explicitly known oscillatory functions (as given in Definition 1) with the solution of an inhomogeneous linear system of differential equations, as given in Theorem 1.
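
To illustrate the first convergence statement of Theorem 1 numerically, the following sketch integrates the full system in the form (20) together with the limit system (19) and compares \(\theta _\varepsilon -\theta _*\) with the explicit corrector \(\varepsilon [\theta _1]^\varepsilon \) from (22). It is a minimal sketch under assumptions, not the scheme used in this article: the frequency \(\omega (y)=2+\sin (y)\), the initial values, the Runge–Kutta integrator, and the function names are hypothetical choices, and only the first-order part for \(\theta _\varepsilon \) is covered.

```python
import math

def omega(y):    return 2.0 + math.sin(y)   # hypothetical frequency (assumption)
def d_omega(y):  return math.cos(y)         # omega'
def dd_omega(y): return -math.sin(y)        # omega''

def full_rhs(s, eps):
    """Right-hand side of (20) for s = (phi, theta, y, p)."""
    phi, theta, y, p = s
    w, dw, ddw = omega(y), d_omega(y), dd_omega(y)
    sn, cs = math.sin(2 * phi / eps), math.cos(2 * phi / eps)
    dy = p + 0.5 * eps * theta * dw / w * sn            # (20c)
    dt_l = dw / w * dy                                  # D_t L_eps
    dt_dy_l = (ddw / w - (dw / w) ** 2) * dy            # D_t D_y L_eps
    return (w + 0.5 * eps * dt_l * sn,                  # (20a)
            -theta * dt_l * cs,                         # (20b)
            dy,
            -theta * dw - 0.5 * eps * theta * dt_dy_l * sn)  # (20d)

def limit_rhs(s, theta_star):
    """Right-hand side of (19) for s = (phi_0, y_0, p_0)."""
    _, y0, p0 = s
    return (omega(y0), p0, -theta_star * d_omega(y0))

def rk4_step(f, s, dt):
    k1 = f(s)
    k2 = f(tuple(x + 0.5 * dt * k for x, k in zip(s, k1)))
    k3 = f(tuple(x + 0.5 * dt * k for x, k in zip(s, k2)))
    k4 = f(tuple(x + dt * k for x, k in zip(s, k3)))
    return tuple(x + dt / 6.0 * (a + 2 * b + 2 * c + d)
                 for x, a, b, c, d in zip(s, k1, k2, k3, k4))

def theta_residuals(eps, y_star=0.0, p_star=1.0, theta_star=0.5, T=1.0, dt=1e-3):
    """Return max|theta_eps - theta_*| and max|theta_eps - theta_* - eps*[theta_1]^eps|."""
    full = (0.0, theta_star, y_star, p_star)   # initial values as in (17)
    lim = (0.0, y_star, p_star)
    raw = corrected = 0.0
    for _ in range(int(round(T / dt))):
        full = rk4_step(lambda s: full_rhs(s, eps), full, dt)
        lim = rk4_step(lambda s: limit_rhs(s, theta_star), lim, dt)
        phi0, y0, p0 = lim
        dt_l0 = d_omega(y0) / omega(y0) * p0   # D_t L_0 = (omega'/omega)(y_0) p_0
        theta1_osc = -theta_star * dt_l0 / (2 * omega(y0)) * math.sin(2 * phi0 / eps)
        raw = max(raw, abs(full[1] - theta_star))
        corrected = max(corrected, abs(full[1] - theta_star - eps * theta1_osc))
    return raw, corrected
```

In this sketch, the step size `dt` still has to resolve the fast oscillation of the full system, which is only integrated to provide a reference; the corrector itself requires nothing beyond the solution of (19).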

4.3.1 Proof of Theorem 1

The detailed proof of Theorem 1 was already carried out in [33] for the case of multiple fast and slow DOF. For this reason, we only summarise the essential steps and indicate the differences. The simpler notation in the present article makes the presentation more transparent.

It is first shown that the sequences of functions \(\{\theta _1^\varepsilon \}, \{\phi _2^\varepsilon \}, \{y_2^\varepsilon \}\), and \(\{p_2^\varepsilon \}\) are uniformly bounded in \(L^\infty ([0,T])\). This follows directly from the system of differential equations (20) and Gronwall’s inequality. Moreover, one shows that the sequence of functions \(\{\theta _2^\varepsilon \}\) is also uniformly bounded in \(L^\infty ([0,T])\). Unlike in [33], where this step requires lengthy calculations, here it follows directly from the second-order energy (Eqs. (46) and (50)) and the uniform boundedness of \(\{\theta _1^\varepsilon \}, \{\phi _2^\varepsilon \}, \{y_2^\varepsilon \}\), and \(\{p_2^\varepsilon \}\).

Since we introduced the action-angle variables, it is possible to calculate the high-frequency terms \([\theta _1]^\varepsilon , [\phi _2]^\varepsilon , [y_2]^\varepsilon , [p_2]^\varepsilon \), and \([\theta _2]^\varepsilon \) through integration by parts.

By taking the time derivative of the functions \( \phi _2^\varepsilon -[\phi _2]^\varepsilon \), \(y_2^\varepsilon - [y_2]^\varepsilon \), and \(p_2^\varepsilon -[p_2]^\varepsilon \), we obtain

$$\begin{aligned} \frac{\textrm{d}}{\textrm{d}t}\left( \phi _2^\varepsilon -[\phi _2]^\varepsilon \right)= & {} \frac{\omega (y_\varepsilon )-\omega (y_0)}{\varepsilon ^2} + \frac{\textrm{d}}{\textrm{d}t}\left( \frac{D_t L_\varepsilon }{4\dot{\phi }_\varepsilon } \right) \cos (2\varepsilon ^{-1}\phi _\varepsilon ) \nonumber \\{} & {} - \frac{\textrm{d}}{\textrm{d}t} \left( [\phi _2]^\varepsilon +\frac{D_t L_\varepsilon }{4\dot{\phi }_\varepsilon }\cos (2\varepsilon ^{-1}\phi _\varepsilon ) \right) ,\nonumber \\ \frac{\textrm{d}}{\textrm{d}t}\left( y_2^\varepsilon - [y_2]^\varepsilon \right)= & {} \frac{p_\varepsilon - p_0}{\varepsilon ^2}+\frac{\textrm{d}}{\textrm{d}t} \left( \frac{\theta _\varepsilon D_y L_\varepsilon }{4\dot{\phi }_\varepsilon } \right) \cos (2\varepsilon ^{-1}\phi _\varepsilon ) \nonumber \\{} & {} - \frac{\textrm{d}}{\textrm{d}t}\left( [y_2]^\varepsilon + \frac{\theta _\varepsilon D_y L_\varepsilon }{4\dot{\phi }_\varepsilon }\cos (2\varepsilon ^{-1}\phi _\varepsilon ) \right) ,\nonumber \\ \frac{\textrm{d}}{\textrm{d}t}\left( p_2^\varepsilon -[p_2]^\varepsilon \right)= & {} -\theta _*\frac{\omega ^\prime (y_\varepsilon )-\omega ^\prime (y_0)}{\varepsilon ^2} -\frac{\theta _1^\varepsilon - [\theta _1]^\varepsilon }{\varepsilon } \omega ^\prime (y_\varepsilon )\nonumber \\{} & {} -\> \frac{\textrm{d}}{\textrm{d}t}\left( \frac{\theta _\varepsilon D_tD_y L_\varepsilon }{4\dot{\phi }_\varepsilon } \right) \cos (2\varepsilon ^{-1}\phi _\varepsilon ) \nonumber \\{} & {} -\frac{\textrm{d}}{\textrm{d}t}\left( [p_2]^\varepsilon _1 - \frac{\theta _\varepsilon D_tD_yL_\varepsilon }{4\dot{\phi }_\varepsilon }\cos (2\varepsilon ^{-1}\phi _\varepsilon ) \right) \nonumber \\{} & {} + \>\frac{\textrm{d}}{\textrm{d}t}\left( \frac{\theta _*D_t L_0}{4\omega ^2(y_0)} \omega ^\prime (y_\varepsilon ) \right) \cos (2\varepsilon ^{-1}\phi _0) \nonumber \\{} & {} - \frac{\textrm{d}}{\textrm{d}t}\left( [p_2]^\varepsilon _2+\frac{\theta _*D_t L_0}{4\omega ^2(y_0)}\omega ^\prime (y_\varepsilon )\cos (2\varepsilon ^{-1}\phi _0)\right) , \end{aligned}$$
(26)

where we write \([p_2]^\varepsilon = [p_2]^\varepsilon _1 + [p_2]^\varepsilon _2\) with

$$\begin{aligned}{}[p_2]^\varepsilon _1 :=\frac{\theta _*D_t D_y L_0}{4\omega (y_0)}\cos (2\varepsilon ^{-1}\phi _0),\qquad [p_2]^\varepsilon _2 :=-\frac{\theta _*D_t L_0 D_y L_0}{4\omega (y_0) }\cos (2\varepsilon ^{-1}\phi _0). \end{aligned}$$

For the derivation of Eq. (26), we note that in

$$\begin{aligned} \frac{\hbox {d}p_2^\varepsilon }{\hbox {d}t}= & {} -\theta _*\frac{\omega ^\prime (y_\varepsilon )-\omega ^\prime (y_0)}{\varepsilon ^2} - \frac{\theta _\varepsilon -\theta _*}{\varepsilon ^2} \omega ^\prime (y_\varepsilon ) - \frac{\textrm{d}}{\textrm{d}t} \left( \frac{\theta _\varepsilon D_tD_y L_\varepsilon }{4\dot{\phi }_\varepsilon } \right) \cos (2\varepsilon ^{-1}\phi _\varepsilon ) \\{} & {} +\frac{\textrm{d}}{\textrm{d}t} \left( \frac{\theta _\varepsilon D_tD_yL_\varepsilon }{4\dot{\phi }_\varepsilon }\cos (2\varepsilon ^{-1}\phi _\varepsilon ) \right) , \end{aligned}$$

we can rewrite the second term on the right-hand side by introducing \([\theta _1]^\varepsilon \), i.e.

$$\begin{aligned} \frac{\theta _\varepsilon -\theta _*}{\varepsilon ^2} \omega ^\prime (y_\varepsilon )= & {} \frac{\theta _1^\varepsilon - [\theta _1]^\varepsilon }{\varepsilon } \omega ^\prime (y_\varepsilon )- \frac{\textrm{d}}{\textrm{d}t}\left( \frac{\theta _*D_t L_0}{4\omega ^2(y_0)} \omega ^\prime (y_\varepsilon ) \right) \cos (2\varepsilon ^{-1}\phi _0) \\{} & {} +\frac{\textrm{d}}{\textrm{d}t}\left( \frac{\theta _*D_t L_0}{4\omega ^2(y_0)} \omega ^\prime (y_\varepsilon )\cos (2\varepsilon ^{-1}\phi _0) \right) . \end{aligned}$$

As described in [33], we infer that the sequences \(\{\phi _2^\varepsilon -[\phi _2]^\varepsilon \},\{y_2^\varepsilon -[y_2]^\varepsilon \}\), and \(\{p_2^\varepsilon -[p_2]^\varepsilon \}\) are bounded in \(C^{0,1}([0,T])\). The claim follows after successive applications of the extended Arzelà–Ascoli theorem [5, Principle 4, Chapter I, Sect. 1]. For the reader’s convenience, we will exemplify the sketch of proof for \(y_\varepsilon \). We recall that

$$\begin{aligned} \frac{\textrm{d}}{\textrm{d}t}\left( y_2^\varepsilon - [y_2]^\varepsilon \right)= & {} \frac{p_\varepsilon - p_0}{\varepsilon ^2}+\frac{\textrm{d}}{\textrm{d}t} \left( \frac{\theta _\varepsilon D_y L_\varepsilon }{4\dot{\phi }_\varepsilon } \right) \cos (2\varepsilon ^{-1}\phi _\varepsilon ) \\{} & {} - \frac{\textrm{d}}{\textrm{d}t}\left( [y_2]^\varepsilon + \frac{\theta _\varepsilon D_y L_\varepsilon }{4\dot{\phi }_\varepsilon }\cos (2\varepsilon ^{-1}\phi _\varepsilon ) \right) , \end{aligned}$$

where we take the weak\(^*\) limit of the right-hand side. The first term on the right-hand side is uniformly bounded in \(L^\infty ([0,T])\). It follows that there exists a function \(\bar{p}_2\in L^\infty ([0,T])\) and a subsequence such that \(p_2^\varepsilon {\mathop {\rightharpoonup }\limits ^{*}}\bar{p}_2\) in \(L^\infty ([0,T])\). The second term on the right-hand side converges weakly\(^*\) to the second term on the right-hand side of Eq. (24c), which follows from Lemma 5.7 in [33]. The weak\(^*\) limit of this term is nonzero because the terms that appear from the amplitude after taking the time derivative are in resonance with the cosine term, thus leading to a nonzero contribution in the weak\(^*\) limit. Finally, the last term on the right-hand side converges weakly\(^*\) to zero by construction.

The weak\(^*\) limits for \( \phi _2^\varepsilon -[\phi _2]^\varepsilon \) and \(p_2^\varepsilon -[p_2]^\varepsilon \) can be derived in a similar way. The form of \([\theta _2]^\varepsilon \) can then be derived through expansion of the energy term.

4.4 Interpretation of the asymptotic expansion in the two-scale convergence framework

The convergence results in Theorem 1 exhibit scale separations that are characteristic of the theory of two-scale convergence. In this section, we give a summary of the theory and introduce a nonlinear version of two-scale convergence, which can be used to reformulate the results derived in Theorem 1.

4.4.1 Two-scale convergence

The theory of two-scale convergence was first introduced by Nguetseng [40]. We follow here the presentation in [41, 42], though restricted to the one-dimensional case. We denote by \({\mathcal {S}}\) the set \(S:=[0,1)\) equipped with the topology of the one-dimensional torus, and identify any function on \({\mathcal {S}}\) with its 1-periodic extension on \({\mathbb {R}}\).

In general, a bounded sequence \(\{u_\varepsilon \}\) of functions in \(L^2(\Omega )\) is said to weakly two-scale converge to \(u\in L^2(\Omega \times {\mathcal {S}})\), symbolically indicated by \(u_\varepsilon {\mathop {\rightharpoonup }\limits _{2}} u\), if and only if

$$\begin{aligned} \lim _{\varepsilon \rightarrow 0}\int _\Omega u_\varepsilon (t)\psi \left( t,\varepsilon ^{-1}t\right) \,\textrm{d}t = \iint _{\Omega \times S} u(t,s) \psi (t,s)\,\textrm{d}t\,\textrm{d}s, \end{aligned}$$
(27)

for any smooth function \(\psi :{\mathbb {R}}\times {\mathbb {R}}\rightarrow {\mathbb {R}}\) which is S-periodic with respect to the second argument. Typically, these \(u_\varepsilon \) are of the form \(u_\varepsilon (t)=v(t,\varepsilon ^{-1}t)\) for some function v of two arguments, which is periodic in the second argument.
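
The defining relation (27) can be explored numerically for such a sequence. The sketch below is an illustration under assumptions: it takes \(u_\varepsilon (t)=\sin (2\pi \varepsilon ^{-1}t)\) on \(\Omega =(0,1)\), whose two-scale limit is \(u(t,s)=\sin (2\pi s)\), and the hypothetical test function \(\psi (t,s)=t\sin (2\pi s)\); both sides of (27) are approximated by the midpoint rule. The function names are ours.

```python
import math

def two_scale_pairing(u_eps, psi, eps, a=0.0, b=1.0, n=200_000):
    """Midpoint-rule approximation of the integral on the left-hand side of (27)."""
    h = (b - a) / n
    total = 0.0
    for i in range(n):
        t = a + (i + 0.5) * h
        total += u_eps(t) * psi(t, t / eps)
    return total * h

def limit_pairing(u, psi, a=0.0, b=1.0, n=400):
    """Midpoint-rule approximation of the double integral on the right-hand side of (27)."""
    ht, hs = (b - a) / n, 1.0 / n
    total = 0.0
    for i in range(n):
        t = a + (i + 0.5) * ht
        for j in range(n):
            s = (j + 0.5) * hs
            total += u(t, s) * psi(t, s)
    return total * ht * hs

eps = 1e-2
u_eps = lambda t: math.sin(2 * math.pi * t / eps)   # oscillating sequence
u = lambda t, s: math.sin(2 * math.pi * s)          # its two-scale limit
psi = lambda t, s: t * math.sin(2 * math.pi * s)    # hypothetical test function

lhs = two_scale_pairing(u_eps, psi, eps)
rhs = limit_pairing(u, psi)
```

In contrast, the ordinary weak limit of this sequence is zero, so the information about the profile \(\sin (2\pi s)\) is retained only by the two-scale limit.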

4.4.2 Two-scale decomposition

For any \(\varepsilon >0\), one can decompose a real number \(t\in {\mathbb {R}}\) as \(t = \varepsilon [{\mathcal {N}}(t/\varepsilon )+ {\mathcal {R}}(t/\varepsilon )]\), where

$$\begin{aligned} {\mathcal {N}}(t):=\max \{n\in {\mathbb {Z}}:n \le t\},\quad {\mathcal {R}}(t):=t - {\mathcal {N}}(t)\in {\mathcal {S}}. \end{aligned}$$

Visually, \({\mathcal {N}}\) is the floor function and \({\mathcal {R}}\) is the sawtooth wave function. If \(\varepsilon \) is the ratio between two disparate scales, \({\mathcal {N}}(t/\varepsilon )\) and \({\mathcal {R}}(t/\varepsilon )\) may then be regarded as a coarse-scale and a fine-scale variable, respectively. Besides this two-scale decomposition, one defines a two-scale composition function:

$$\begin{aligned} h_\varepsilon (t,s):=\varepsilon {\mathcal {N}}(t/\varepsilon ) + \varepsilon s\qquad \forall (t,s) \in {\mathbb {R}}\times {\mathcal {S}},\qquad \forall \varepsilon >0. \end{aligned}$$

The two-scale composition function can be written as \(h_\varepsilon (t,s)=t+\varepsilon [s-{\mathcal {R}}(t/\varepsilon )]\). Since, for all \(t\in {\mathbb {R}}\), \({\mathcal {R}}(t)\in {\mathcal {S}}\), one has \(\varepsilon {\mathcal {R}}(t/\varepsilon )\rightarrow 0\) uniformly in t, and thus,

$$\begin{aligned} h_\varepsilon (t,s)\rightarrow t \quad \text {uniformly in } {\mathbb {R}}\times {\mathcal {S}},\text { as } \varepsilon \rightarrow 0. \end{aligned}$$
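
The decomposition and the composition function are straightforward to express in code. The following sketch uses hypothetical function names (`coarse`, `fine`, `compose`) for \({\mathcal {N}}\), \({\mathcal {R}}\), and \(h_\varepsilon \), and can be used to check the bound \(|h_\varepsilon (t,s)-t|\le \varepsilon \) behind the uniform convergence.

```python
import math

def coarse(t):
    """N(t): the largest integer not exceeding t (floor function)."""
    return math.floor(t)

def fine(t):
    """R(t) = t - N(t): the sawtooth function with values in [0, 1)."""
    return t - math.floor(t)

def compose(t, s, eps):
    """Two-scale composition h_eps(t, s) = eps * N(t / eps) + eps * s."""
    return eps * coarse(t / eps) + eps * s
```

For instance, with \(\varepsilon =0.1\) the value `compose(t, s, 0.1)` differs from `t` by at most `0.1` for every `s` in \([0,1)\), up to floating-point rounding.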

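With the introduction of \(h_\varepsilon \), one can give an equivalent definition of two-scale convergence. That is, a sequence \(\{u_\varepsilon \}\) in \(L^2({\mathbb {R}})\) weakly two-scale converges to \(u\in L^2({\mathbb {R}}\times {\mathcal {S}})\) if and only if

$$\begin{aligned} u_\varepsilon \circ h_\varepsilon {\mathop {\rightharpoonup }\limits ^{}} u \quad \text {in} \quad L^2({\mathbb {R}}\times {\mathcal {S}}). \end{aligned}$$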

For any domain \(\Omega \subset {\mathbb {R}}\), two-scale convergence in \(L^2(\Omega \times {\mathcal {S}})\) is then defined by extending functions to \({\mathbb {R}}{\setminus } \Omega \) with vanishing value.

For \(u_\varepsilon \in L^2(\Omega )\), it is shown in [42] that this definition of two-scale convergence is equivalent to the definition in (27). However, it is more versatile. In particular, it allows defining two-scale convergence in \(C([0,T]\times {\mathcal {S}})\).

4.4.3 Two-scale convergence in \(\varvec{C([0,T]\times {\mathcal {S}})}\)

Some modifications are needed to extend the definition of two-scale convergence to \(C([0,T]\times {\mathcal {S}})\), for in general the function \(u_\varepsilon \circ h_\varepsilon \) is discontinuous with respect to \(t\in {\mathbb {R}}\) and \(s\in {\mathcal {S}}\), even if \(u_\varepsilon \) is continuous. One therefore replaces \(u_\varepsilon \circ h_\varepsilon \) by a continuous function, \({\mathcal {L}}_\varepsilon u_\varepsilon :=(J \circ I_\varepsilon )(u_\varepsilon \circ h_\varepsilon )\), constructed via linear interpolation with respect to each argument. The details of this linear interpolation can be found in [42].

One then says that \(u_\varepsilon \) strongly two-scale converges to u in \(C([0,T]\times {\mathcal {S}})\), symbolically indicated by \(u_\varepsilon \xrightarrow [2]{} u\), if and only if

$$\begin{aligned} {\mathcal {L}}_\varepsilon u_\varepsilon \rightarrow u \quad \text {in} \quad C([0,T]\times {\mathcal {S}}). \end{aligned}$$

4.4.4 Nonlinear two-scale convergence

The two-scale convergence theory is a generalisation of the weak convergence theory which retains information about the oscillatory character of \(u_\varepsilon \) in the limit \(\varepsilon \rightarrow 0\). For instance, \(\sin (2\pi \varepsilon ^{-1}t)\) weakly two-scale converges to \(\sin (2\pi s)\) in \(L^2([0,T]\times {\mathcal {S}})\). Notice, however, that with test functions in the form of the Fourier basis functions \(\psi _k(t, \varepsilon ^{-1}t):=\exp (2\pi i k\varepsilon ^{-1}t)\) one has for all \(k\in {\mathbb {Z}}\)

$$\begin{aligned} \sin \left( 2\pi \varepsilon ^{-1}\varphi (t)\right) \psi _k(t,\varepsilon ^{-1}t) {\mathop {\rightharpoonup }\limits ^{}} 0 \quad \text {in} \quad L^2([0,T]\times {\mathcal {S}}), \end{aligned}$$

for any nonlinear \(C^\infty \)-diffeomorphism \(\varphi :[0,T]\rightarrow [0, \varphi (T)]\), and thus \(\sin (2\pi \varepsilon ^{-1}\varphi (t))\) weakly two-scale converges to zero in \(L^2([0,T]\times {\mathcal {S}})\). To derive a nonzero two-scale limit in such a case, we introduce a nonlinear change of coordinates that temporarily annihilates the nonlinearity so that the standard two-scale limit can be taken before it is reintroduced into the two-scale limit.

Definition 2

(Two-scale convergence with respect to \(\varphi \)) Let \(\varphi :[0,\varphi ^{-1}(T)]\rightarrow [0,T]\) be a \(C^\infty \)-diffeomorphism, \(\{u_\varepsilon \}\subset C([0,T])\) and \(u\in C([0,T]\times {\mathcal {S}})\). We say that \(u_\varepsilon \) two-scale converges with respect to \(\varphi \) to u in \(C([0,T]\times {\mathcal {S}})\) if \(u_\varepsilon \circ \varphi \) two-scale converges to \(u\circ (\varphi , \textrm{Id})\) in \(C([0,\varphi ^{-1}(T)]\times {\mathcal {S}})\), i.e.

$$\begin{aligned} u_\varepsilon \xrightarrow [2]{\varphi } u \quad \text {in} \quad C([0,T]\times {\mathcal {S}}) \end{aligned}$$

if and only if

$$\begin{aligned} u_\varepsilon ( \varphi ( r)) \xrightarrow [2]{} u(\varphi (r), s) \quad \text {in} \quad C([0,\varphi ^{-1}(T)]\times {\mathcal {S}}). \end{aligned}$$
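
The effect of the nonlinear change of coordinates can be illustrated numerically. The sketch below is an illustration under assumptions: `g` is a hypothetical nonlinear phase on \([0,1]\) whose derivative stays between 1.2 and 1.8, and real sine functions are used as test functions in place of the complex Fourier basis. It compares the pairing of \(\sin (2\pi \varepsilon ^{-1}g(t))\) with \(\sin (2\pi k\varepsilon ^{-1}t)\) in the original time variable with the corresponding pairing after reparametrising by \(r=g(t)\).

```python
import math

def g(t):
    # Hypothetical nonlinear phase (assumption); 1.2 <= g'(t) <= 1.8 on [0, 1].
    return 1.2 * t + 0.3 * t * t

def g_inv(r):
    # Inverse of g on [0, g(1)] = [0, 1.5], from solving 0.3 t^2 + 1.2 t - r = 0.
    return (-1.2 + math.sqrt(1.44 + 1.2 * r)) / 0.6

def midpoint(f, a, b, n=200_000):
    """Composite midpoint rule for the integral of f over [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

def pairing_original(eps, k=1):
    """Integral of sin(2 pi g(t)/eps) sin(2 pi k t/eps) over t in [0, 1]."""
    two_pi = 2.0 * math.pi
    return midpoint(lambda t: math.sin(two_pi * g(t) / eps)
                    * math.sin(two_pi * k * t / eps), 0.0, 1.0)

def pairing_reparametrised(eps, k=1):
    """The same oscillation in the variable r = g(t), paired with sin(2 pi k r/eps)."""
    two_pi = 2.0 * math.pi
    return midpoint(lambda r: math.sin(two_pi * g(g_inv(r)) / eps)
                    * math.sin(two_pi * k * r / eps), 0.0, g(1.0))
```

For small \(\varepsilon \), the first pairing is close to zero, whereas the second stays close to \(g(1)/2\), which reflects the nonzero two-scale limit recovered in the new coordinate.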

With this definition at hand, we can express the uniform convergence results of Theorem 1 using the notation of two-scale convergence. Rephrasing the weak\(^*\) convergence results in Theorem 1 in a similar way requires more notation.

Proposition 1

With the notation introduced in Definition 2, the uniform convergences in Theorem 1 are equivalent to the following strong two-scale convergences with respect to \(\varphi (r):=\phi _0^{-1}(\pi r)\) for \(r\in [0,\pi ^{-1}\phi _0(T)]\):

$$\begin{aligned}{} & {} \phi _2^\varepsilon \xrightarrow [2]{\varphi } \bar{\phi }_2 +[\phi _2] \quad \text {in} \quad C([0,T]\times {\mathcal {S}}),\quad \theta _2^\varepsilon \xrightarrow [2]{\varphi } \bar{\theta }_2 +[\theta _2] \quad \text {in} \quad C([0,T]\times {\mathcal {S}}),\\{} & {} y_2^\varepsilon \xrightarrow [2]{\varphi } \bar{y}_2+[y_2] \quad \text {in} \quad C([0,T]\times {\mathcal {S}}),\quad p_2^\varepsilon \xrightarrow [2]{\varphi } \bar{p}_2+ [p_2] \quad \text {in} \quad C([0,T]\times {\mathcal {S}}),\\{} & {} { \theta _1^\varepsilon \xrightarrow [2]{\varphi } [\theta _1] \quad \text {in}\quad C([0,T]\times {\mathcal {S}}), } \end{aligned}$$

where \((\bar{\phi }_2, \bar{\theta }_2, \bar{y}_2, \bar{p}_2)\) is the unique solution to the initial value problem (24) with (25) and where for \(t\in [0,T]\) and \(s\in {\mathcal {S}}\) we define

$$\begin{aligned}{} & {} [\theta _1](t, s) :=-\frac{\theta _*D_t L_0(t)}{2\omega (y_0(t))}\sin \left( 2\pi s \right) ,\quad [\phi _2](t, s) :=-\frac{D_t L_0(t)}{4\omega (y_0(t))}\cos (2\pi s),\\{} & {} [y_2](t, s):=-\frac{\theta _*D_y L_0(t)}{4\omega (y_0(t))}\cos (2\pi s),\quad [p_2](t, s) :=\frac{\textrm{d}}{\textrm{d}t}\left( \frac{\theta _*D_y L_0(t)}{4\omega (y_0(t))} \right) \cos (2\pi s) \end{aligned}$$

and

$$\begin{aligned}{} & {} [\theta _2](t,s) := -\theta _*D_y L_0(t) [y_2](t,s) - \frac{p_0(t)}{\omega (y_0(t))} [p_2](t,s)+ \frac{\theta _*^2 (D_y L_0(t))^2}{16 \omega (y_0(t))}\cos (4\pi s) \\{} & {} \qquad \qquad \qquad - \> \frac{\theta _*D_t L_0(t) }{\omega (y_0(t))}\bar{\phi }_2(t)\cos (2\pi s). \end{aligned}$$

Proof

The equivalence follows from Theorem 1 and [42, Proposition 2.4]. \(\square \)

Note that in Proposition 1 the constant \(\pi \) was chosen to normalise the period of the rapidly oscillating functions.

4.5 Asymptotic expansion in the multidimensional case

The asymptotic expansion results presented in this article can be generalised to the case of multiple fast and slow DOF. This case was discussed in [33], where the governing Lagrangian is the natural extension of (8) to the case of multiple fast and slow DOF, i.e. \(y_\varepsilon \in {\mathbb {R}}^n\) and \(z_\varepsilon \in {\mathbb {R}}^r\) for \(n,r \ge 1\). In the case \(r>1\), it is possible that the \(z_\varepsilon \) are in resonance with each other. After introducing some constraints on the system in the form of two non-resonance conditions, a result similar to Theorem 1 can be derived. These non-resonance conditions bypass the so-called small divisor problem, which commonly emerges when deriving higher-order asymptotic expansions of trajectories of systems with multiple fast DOF (see [9, Chap. 7]). Non-resonance conditions also arise in connection with the KAM (Kolmogorov–Arnol’d–Moser) theorem [43, 44], where they similarly serve to handle small divisors.

The proof is similar to the one presented in this article. First, one introduces a change of coordinates \((z_\varepsilon ,\zeta _\varepsilon )\mapsto (\theta _\varepsilon , \phi _\varepsilon )\) via a generalised version of the generating function (14) and one defines sequences of scaled residual terms \(\{\theta _1^\varepsilon \}, \{\phi _2^\varepsilon \}, \{y_2^\varepsilon \}, \{p_2^\varepsilon \}\), and \(\{\theta _2^\varepsilon \}\) similar to Definition 1. Next, one shows that the sequences \(\{\theta _1^\varepsilon \}\) and \(\{\phi _2^\varepsilon \}\) are uniformly bounded in \(L^\infty ([0,T],{\mathbb {R}}^r)\) and \(\{y_2^{\varepsilon }\}\) and \(\{p_2^{\varepsilon }\}\) are uniformly bounded in \(L^\infty ([0,T], {\mathbb {R}}^n)\). After deriving the oscillatory terms, which is possible due to the transformation of the fast DOF into action-angle variables and integration by parts, the weak\(^*\) limit can be derived with the extended Arzelà–Ascoli theorem [5, Principle 4, Chapter I, Sect. 1], similar to the sketch of the proof in Sect. 4.3.1. Since \(\theta _2^\varepsilon \in {\mathbb {R}}^r\), a simple derivation from the energy, as described in this article for the case \(r=1\), is not possible. Thus, more involved calculations are necessary to prove its uniform bound in \(L^\infty ([0,T], {\mathbb {R}}^r)\) and its convergence in the weak\(^*\) limit.

5 Thermodynamic expansion and interpretation

We now give a thermodynamic interpretation of the analytic result presented in Theorem 1. Thermodynamic effects can, in principle, occur when a separation of scales exists; Hertz developed a thermodynamic theory for Hamiltonian systems which are perturbed by slow external agents [36]. The model considered in this article is an example of this kind, if we restrict the analysis to the fast DOF (variable \(z_\varepsilon \)) and consider the slow DOF (variable \(y_\varepsilon \)) as an external agent. The question we want to address in this section is “Can we replace the dynamics of the fast DOF with a thermodynamic description in terms of temperature and entropy?” As it turns out, even in this simple model problem, an interesting adiabatic/non-adiabatic characteristic emerges through a higher-order asymptotic expansion.

The concept of adiabatic invariance finds applications in the analysis of slowly perturbed dynamical systems, where one is primarily interested in the derivation of the effective evolution of the system. It emerged in celestial mechanics in the form of the perturbation theory of Hamiltonian dynamical systems [30, Chapter 10] and can be found in many other fields [35, 38, 45, 46]. In particular, adiabatic invariance plays a crucial role in thermodynamics. There, adiabatic thermodynamic processes are idealised models in the limit of an infinite separation of timescales and are, by definition, processes with constant entropy [47].

For a thermodynamic interpretation of the system, we will regard the fast DOF \(z_\varepsilon \) as the system’s thermal vibrations, acting on the slow DOF \(y_\varepsilon \), which represents the system’s slow (mechanical) dynamics. As such, we will mainly focus the thermodynamic analysis on the energy associated with the fast DOF \(E_\varepsilon ^\perp \), in contrast to the residual energy \(E_\varepsilon ^\parallel \), which describes the remaining part of the system. Both energies can be read off from the total energy (11), i.e.

$$\begin{aligned} E_\varepsilon ^\perp = \tfrac{1}{2}\dot{z}_\varepsilon ^2+\tfrac{1}{2}\varepsilon ^{-2}\omega ^2(y_\varepsilon ) z_\varepsilon ^2, \qquad E_\varepsilon ^\parallel = E_\varepsilon - E_\varepsilon ^\perp . \end{aligned}$$
(28)

Note that the evolution of the fast DOF \(z_\varepsilon \) is governed by the energy \(E_\varepsilon ^\perp = E^\perp _\varepsilon (z_\varepsilon ,\dot{z}_\varepsilon ;y_\varepsilon )\), which is subject to a dynamically varying external agent given by the evolution of the slow DOF \(y_\varepsilon \). This framework allows us to apply the theory developed by Hertz [37].

5.1 The first and second law of thermodynamics

Hertz considered a thermodynamic system under a slowly changing external agent y. A typical example is a vessel filled with gas, where y indicates the height of a piston that compresses the gas. In line with Hertz’ analysis, we consider for the problem studied in this article the fast subsystem with energy \(E_\varepsilon ^\perp \) as a thermodynamic system under a slowly changing external agent represented by \(y_\varepsilon \). We point out that Hertz’ theory is based on dynamical systems which are inherently reversible in time and are thus regarded as idealised thermodynamic systems. As such, on a macroscopic level, the dynamics is described by the first and second law of thermodynamics, where the second law is given in Carathéodory’s form [47].

The first law states that every infinitesimal thermodynamic process can be described as a change of the internal energy \(\hbox {d} A\), given by the work performed on the system via the application of an external force F along a displacement \(\hbox {d}y\), i.e. \(\hbox {d} A = F \hbox {d}y\), and the heat supply \(\hbox {d} Q\) such that the sum \(\hbox {d} A + \hbox {d} Q\) is the differential of some energy function E,

$$\begin{aligned} \hbox {d} E = \hbox {d} A + \hbox {d} Q. \end{aligned}$$

The second law of thermodynamics states that, for reversible processes, there are two functions of state, namely the absolute temperature T and the entropy S such that

$$\begin{aligned} \hbox {d} Q = T \hbox {d} S. \end{aligned}$$

Hence, the first and second law of thermodynamics combined read

$$\begin{aligned} \hbox {d} E = F \hbox {d}y + T \hbox {d} S. \end{aligned}$$
(29)

Equation (29) can be reduced to the statement that there exists an entropy \(S=S(E,y)\) such that the constitutive equations

$$\begin{aligned} \frac{1}{T} = \frac{\partial S(E,y)}{\partial E} \qquad \text {and} \qquad F = -T \frac{\partial S(E,y)}{\partial y} \end{aligned}$$
(30)

are satisfied.

The special case of a process without heat exchange, \(\hbox {d} Q = 0\), is called an adiabatic thermodynamic process. In this case, all work \(\hbox {d} A\) is converted into a change of energy,

$$\begin{aligned} \hbox {d} E = \hbox {d} A, \end{aligned}$$
(31)

or, equivalently, the entropy of the system stays constant,

$$\begin{aligned} S (E, y) = \textrm{const}. \end{aligned}$$

Remark 1

With \(E=E(S,y)\), the constitutive Eqs. (30) read equivalently

$$\begin{aligned} T=\frac{\partial E(S,y)}{\partial S} \qquad \text {and} \qquad F=\frac{\partial E(S,y)}{\partial y}. \end{aligned}$$
(32)

5.2 Derivation of the thermodynamic quantities

For the derivation of the temperature, entropy, and external force, we follow the analysis in [37]. A similar derivation was used in [33] to determine the thermodynamic quantities in the case of multiple fast DOF.

We consider the Hamiltonian and the associated energy of the fast subsystem given by

$$\begin{aligned} H_\varepsilon ^\perp (z_\varepsilon , \zeta _\varepsilon ; y_\varepsilon )= \frac{1}{2}\zeta _\varepsilon ^2 + \frac{1}{2} \varepsilon ^{-2} \omega ^2(y_\varepsilon ) z_\varepsilon ^2 = E_\varepsilon ^\perp . \end{aligned}$$
(33)

Since \((z_\varepsilon , \zeta _\varepsilon )\) are fast relative to \(y_\varepsilon \), we can assume, to a reasonably good approximation, that the variable \(y_\varepsilon \) remains constant during one closed loop of \((z_\varepsilon , \zeta _\varepsilon )\) on the energy surface. This system is thus a harmonic oscillator, which in one dimension is ergodic on the energy surface. (The multidimensional case is discussed below.) It thus admits a unique invariant measure, given by

$$\begin{aligned} \mu (A)= \frac{\displaystyle \int _{A} \dfrac{\,\textrm{d}\sigma }{\vert \nabla H^\perp _\varepsilon \vert }}{\displaystyle \int _{\Sigma } \frac{\,\textrm{d}\sigma }{\vert \nabla H^\perp _\varepsilon \vert }}, \end{aligned}$$

where \(A\subseteq \Sigma \) is a region on the level set \(\Sigma = \{ (z_\varepsilon , \zeta _\varepsilon ) \in {\mathbb {R}}^2:H^\perp _\varepsilon (z_\varepsilon , \zeta _\varepsilon ; y_*)= E^\perp _*\}\) and \(\,\textrm{d}\sigma \) is an infinitesimally small area element on the energy surface. In equilibrium thermodynamics, the temperature is considered a time-independent variable, which is proportional to the time average of the kinetic energy of the system.

In general, if \(\left\langle \phi \right\rangle \) denotes the time average of some function \(\phi =\phi (t)\), the temperature \(T\) in equilibrium thermodynamics can be defined by the relation \(T= \left\langle 2 E_{\textrm{kin}} \right\rangle \) where \(2 E_{\textrm{kin}}=p \frac{\partial H(q, p)}{\partial p}\) and \((q, p)\) is a time-dependent trajectory on the energy surface in phase space.

With the help of the ergodic theorem of Birkhoff and Khinchin (see, for example, [48]), the invariant measure \(\mu \) can be used to derive the time average in the definition of the temperature by calculating the space average of \(p \frac{\partial H(q, p)}{\partial p}\) on the energy surface. In particular, in our example the temperature takes the form

$$\begin{aligned} T_\varepsilon (E^\perp _*, y_*)= \left\langle \zeta _\varepsilon \frac{\partial H^\perp _\varepsilon (z_\varepsilon , \zeta _\varepsilon ; y_*)}{\partial \zeta _\varepsilon } \right\rangle = \frac{\displaystyle \int _{\Sigma } \zeta _\varepsilon \frac{\partial H^\perp _\varepsilon }{\partial \zeta _\varepsilon }\dfrac{\,\textrm{d}\sigma }{\vert \nabla H^\perp _\varepsilon \vert }}{\displaystyle \int _{\Sigma } \frac{\,\textrm{d}\sigma }{\vert \nabla H^\perp _\varepsilon \vert }}. \end{aligned}$$
(34)

Here, the numerator can be calculated using Gauss’ theorem. It turns out that

$$\begin{aligned} \int _{\Sigma } \zeta _\varepsilon \frac{\partial H^\perp _\varepsilon }{\partial \zeta _\varepsilon }\dfrac{\,\textrm{d}\sigma }{\vert \nabla H^\perp _\varepsilon \vert } = \Gamma _\varepsilon (E^\perp _*, y_*), \end{aligned}$$
(35)

where \(\Gamma _\varepsilon (E^\perp _*, y_*)\) is the phase-space volume enclosed by the trajectory of \((z_\varepsilon , \zeta _\varepsilon )\) and is also known as the action of the orbit in Hamiltonian dynamics [49].

To derive the denominator in (34), we calculate the derivative of \(\Gamma _\varepsilon (E^\perp _*, y_*)\) with respect to \(E^\perp _*\) and find, with some geometrical considerations, that

$$\begin{aligned} \int _\Sigma \frac{\,\textrm{d}\sigma }{\vert \nabla H^\perp _\varepsilon \vert } = \frac{\partial \Gamma _\varepsilon (E^\perp _*, y_*)}{\partial E^\perp _*}. \end{aligned}$$
(36)

Thus, in Eq. (34) the numerator is given by (35) and the denominator by (36), which allows us to express the temperature in terms of the phase-space volume \(\Gamma _\varepsilon (E^\perp _*,y_*)\), i.e.

$$\begin{aligned} T_\varepsilon (E^\perp _*, y_*) = \frac{\Gamma _\varepsilon (E^\perp _*,y_*)}{\partial \Gamma _\varepsilon (E^\perp _*,y_*)/\partial E^\perp _*}. \end{aligned}$$
(37)
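As a numerical sanity check of (37), the following Python sketch samples one closed orbit of the frozen harmonic oscillator (33) and compares the time average of \(\zeta _\varepsilon ^2\) with the ratio on the right-hand side of (37). The parameter values and all function names are hypothetical choices made for illustration only. The phase-space volume is approximated by the area of the polygon through the sampled orbit, and its derivative with respect to the energy by a central difference.

```python
import math

def orbit(energy, omega, eps, n):
    """Sample one closed orbit (z, zeta) of the frozen oscillator at equal time steps."""
    big_omega = omega / eps                     # fast frequency omega(y_*) / eps
    amp = math.sqrt(2.0 * energy) / big_omega   # from energy = amp**2 * big_omega**2 / 2
    pts = []
    for k in range(n):
        a = 2.0 * math.pi * k / n               # phase angle big_omega * t
        pts.append((amp * math.cos(a), -amp * big_omega * math.sin(a)))
    return pts

def time_average_kinetic(energy, omega, eps, n=20_000):
    """Time average of zeta * dH/dzeta = zeta**2 over one period."""
    return sum(zeta * zeta for _, zeta in orbit(energy, omega, eps, n)) / n

def enclosed_volume(energy, omega, eps, n=20_000):
    """Shoelace area of the polygon through the sampled orbit (approximates Gamma)."""
    pts = orbit(energy, omega, eps, n)
    twice_area = 0.0
    for (z0, c0), (z1, c1) in zip(pts, pts[1:] + pts[:1]):
        twice_area += z0 * c1 - z1 * c0
    return abs(twice_area) / 2.0

def temperature_from_volume(energy, omega, eps, h=1e-3):
    """Right-hand side of (37), using a central difference in the energy."""
    d_gamma = (enclosed_volume(energy + h, omega, eps)
               - enclosed_volume(energy - h, omega, eps)) / (2.0 * h)
    return enclosed_volume(energy, omega, eps) / d_gamma

# Hypothetical values for E_*, omega(y_*) and eps.
E_STAR, OMEGA_STAR, EPS = 0.7, 1.3, 0.05
t_time = time_average_kinetic(E_STAR, OMEGA_STAR, EPS)
t_volume = temperature_from_volume(E_STAR, OMEGA_STAR, EPS)
print(t_time, t_volume)
```

Under these assumptions, both printed numbers agree with the chosen energy up to discretisation and rounding errors, as expected for an oscillator whose enclosed phase-space volume is proportional to the energy.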

According to the left equation in (30), we integrate \(T_\varepsilon ^{-1}\) in (37) with respect to \(E^\perp _*\) and obtain as formula for the entropy \( S_\varepsilon (E^\perp _*, y_*) = \log \left( \Gamma _\varepsilon (E^\perp _*, y_*) \right) + f_\varepsilon (y_*)\). It remains to show that the function \(f_\varepsilon \) is constant. To see this, we investigate the dependence of \(S_\varepsilon \) on \(y_*\) and will, similar to above, use (30).

We can define the external force as the time average of \(\partial _y H(q,p;y) \) for a Hamiltonian system depending on some external agent \(y\) and apply, similar to above, the Birkhoff–Khinchin theorem to derive in our example

$$\begin{aligned} F_\varepsilon (E^\perp _*, y_*) = \left\langle \frac{\partial H^\perp _\varepsilon (z_\varepsilon , \zeta _\varepsilon ; y_*)}{\partial y_*} \right\rangle = \frac{\displaystyle \int _{\Sigma } \frac{\partial H^\perp _\varepsilon }{\partial y_*}\dfrac{\,\textrm{d}\sigma }{\vert \nabla H^\perp _\varepsilon \vert }}{\displaystyle \int _{\Sigma } \frac{\,\textrm{d}\sigma }{\vert \nabla H^\perp _\varepsilon \vert }}. \end{aligned}$$
(38)

For the numerator, we calculate this time the derivative of \(\Gamma _\varepsilon (E^\perp _*, y_*)\) with respect to \(y_*\).

Similar to before, geometric considerations imply that

$$\begin{aligned} \int _\Sigma \frac{\partial H^\perp _\varepsilon }{\partial y_*} \frac{\,\textrm{d}\sigma }{\vert \nabla H^\perp _\varepsilon \vert } = -\frac{\partial \Gamma _\varepsilon (E^\perp _*, y_*)}{\partial y_*}. \end{aligned}$$
(39)

Combining Eqs. (36), (38), and (39), we obtain

$$\begin{aligned} F_\varepsilon (E^\perp _*, y_*) = -\frac{\partial \Gamma _\varepsilon (E^\perp _*, y_*)/\partial y_*}{\partial \Gamma _\varepsilon (E^\perp _*, y_*)/\partial E^\perp _*}. \end{aligned}$$
(40)

Once again, according to the right equation in (30), we integrate \(-F_\varepsilon /T_\varepsilon \), as given by (37) and (40), with respect to \(y_*\) and find the desired formula for the entropy,

$$\begin{aligned} S_\varepsilon (E^\perp _*, y_*) = \log \left( \Gamma _\varepsilon (E^\perp _*, y_*) \right) + C_\varepsilon , \end{aligned}$$
(41)

where \(C_\varepsilon \) is an arbitrary constant. This is a key result of Hertz: The entropy of a Hamiltonian system under the influence of a slowly varying agent is, up to a constant, the logarithm of the phase-space volume.

The phase-space volume of a harmonic oscillator in one dimension with fixed energy \(E_*^\perp \) and fixed external agent \(y_*\) is geometrically described by an ellipse; thus, using (13) and (33) with \(E_\varepsilon ^\perp \) and \(y_\varepsilon \) fixed yields

$$\begin{aligned} \Gamma _\varepsilon (E^\perp _*, y_*) = 2\pi \varepsilon \frac{E^\perp _*}{\omega (y_*)}= 2\pi \varepsilon \theta _*. \end{aligned}$$

Since, on the fast timescale, the slowly varying quantities \(E^\perp _\varepsilon \) and \(y_\varepsilon \) can be considered constant to a good approximation, we argue by similarity and replace \(y_*\) by \(y_\varepsilon \) and \(E_*^\perp \) by \(E_\varepsilon ^\perp \) in Eqs. (37), (40), and (41), so that

$$\begin{aligned} \Gamma _\varepsilon (E^\perp _\varepsilon , y_\varepsilon ) = 2\pi \varepsilon \frac{E^\perp _\varepsilon }{\omega (y_\varepsilon )}= 2\pi \varepsilon \theta _\varepsilon . \end{aligned}$$

To avoid a divergent entropy in the limit \(\varepsilon \rightarrow 0\), the constant in the entropy has to be chosen accordingly, for example, \(C_\varepsilon =-\log (\Gamma _\varepsilon (E_*^\perp , y_*))\). In this case, we would have

$$\begin{aligned} S_\varepsilon (E_\varepsilon ^\perp , y_\varepsilon ) = \log \left( \frac{\Gamma _\varepsilon (E^\perp _\varepsilon , y_\varepsilon )}{\Gamma _\varepsilon (E_*^\perp , y_*)} \right) = \log \left( \frac{\theta _\varepsilon }{\theta _*}\right) = \log ( \theta _\varepsilon ) + C, \end{aligned}$$

where \(C=-\log (\theta _*)\).

Thus, altogether, we derive for \(\varepsilon >0\) the following expressions for the temperature \(T_\varepsilon \), the entropy \(S_\varepsilon \), and the external force \(F_\varepsilon \) in the fast subsystem:

$$\begin{aligned} T_\varepsilon = \theta _\varepsilon \omega (y_\varepsilon ),\qquad S_\varepsilon =\log (\theta _\varepsilon )+C,\qquad F_\varepsilon = \theta _\varepsilon \omega '(y_\varepsilon ). \end{aligned}$$
(42)
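The expressions in (42) can be cross-checked against the general formulas (37), (40), and (41) with a few finite differences. The Python sketch below does this for the phase-space volume \(2\pi \varepsilon E/\omega (y)\); the frequency profile `omega`, the state `(E, Y)`, and the value of `EPS` are hypothetical, since the article keeps \(\omega \) general, and the additive constant in the entropy is set to zero.

```python
import math

EPS = 0.05  # hypothetical scale parameter

def omega(y):
    """Hypothetical frequency profile; the article keeps omega general."""
    return 1.0 + y * y

def omega_prime(y):
    """Derivative of the hypothetical profile above."""
    return 2.0 * y

def phase_volume(energy, y):
    """Phase-space volume 2*pi*eps*E/omega(y) of the frozen oscillator."""
    return 2.0 * math.pi * EPS * energy / omega(y)

def entropy(energy, y):
    """Entropy (41) with the additive constant set to zero."""
    return math.log(phase_volume(energy, y))

def central_diff(f, x, h=1e-5):
    """Second-order central difference approximation of f'(x)."""
    return (f(x + h) - f(x - h)) / (2.0 * h)

def thermodynamic_quantities(energy, y):
    """Temperature (37) and force (40) from finite differences of the volume."""
    d_vol_de = central_diff(lambda e: phase_volume(e, y), energy)
    d_vol_dy = central_diff(lambda v: phase_volume(energy, v), y)
    return phase_volume(energy, y) / d_vol_de, -d_vol_dy / d_vol_de

E, Y = 0.7, 0.4                     # hypothetical state of the fast subsystem
theta = E / omega(Y)                # action, theta = E / omega(y)
T, F = thermodynamic_quantities(E, Y)
print(T, theta * omega(Y))          # temperature from (37) and from (42)
print(F, theta * omega_prime(Y))    # force from (40) and from (42)

# Constitutive equations (30): dS/dE = 1/T and dS/dy = -F/T.
dS_dE = central_diff(lambda e: entropy(e, Y), E)
dS_dy = central_diff(lambda v: entropy(E, v), Y)
print(dS_dE, 1.0 / T)
print(dS_dy, -F / T)
```

Each printed pair should agree up to the finite-difference error. The check covers only this one choice of \(\omega \) and is no substitute for the derivation above.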

5.3 Expansion of the thermodynamic quantities

Using the second-order expansion derived in Theorem 1, we now analyse the asymptotic properties of the above thermodynamic expressions by expanding \(y_\varepsilon \) and \(\theta _\varepsilon \) in (42). This defines higher-order asymptotic expansions of the form \(T_\varepsilon = T_0 +T_1^\varepsilon \), \(F_\varepsilon = F_0+F_1^\varepsilon \) and \(S_\varepsilon = S_0 +\varepsilon [\bar{S}_1]^\varepsilon +\varepsilon ^2 [\bar{S}_2]^\varepsilon + \varepsilon ^2 S_3^\varepsilon \) with \(T_1^\varepsilon , F_1^\varepsilon ,S_3^\varepsilon \rightarrow 0\) in \(C([0,T])\), where

$$\begin{aligned} T_0:=\theta _*\omega (y_0), \qquad F_0 :=\theta _*\omega '(y_0), \end{aligned}$$
(43)

and

$$\begin{aligned} S_0 :=\log (\theta _*)+C, \qquad [\bar{S}_1]^\varepsilon :=\frac{[\theta _1]^\varepsilon }{\theta _*}, \quad [\bar{S}_2]^\varepsilon :=\frac{\bar{\theta }_2 +[\theta _2]^\varepsilon }{\theta _*} -\frac{1}{2}\left( \frac{[\theta _1]^\varepsilon }{\theta _*} \right) ^2.\qquad \end{aligned}$$
(44)

Analogously, by substituting (13) into (28), we expand the energy of the fast subsystem \(E^\perp _\varepsilon =\theta _\varepsilon \omega (y_\varepsilon )=E^\perp _0 + \varepsilon [\bar{E}_1^\perp ]^\varepsilon + \varepsilon ^2 [\bar{E}_2^\perp ]^\varepsilon +\varepsilon ^2 E_3^{\perp \varepsilon }\), where \(E_3^{\perp \varepsilon } \rightarrow 0\) in \(C([0,T])\), and obtain

$$\begin{aligned}{} & {} E_0^\perp :=\theta _*\omega (y_0),\qquad [\bar{E}_1^\perp ]^\varepsilon :=\omega (y_0) \left[ \theta _1 \right] ^\varepsilon , \end{aligned}$$
(45)
$$\begin{aligned}{} & {} [\bar{E}_2^\perp ]^\varepsilon :=\theta _*\omega '(y_0)\left( \bar{y}_2+\left[ y_2 \right] ^\varepsilon \right) + \omega (y_0)\left( \bar{\theta }_2 + \left[ \theta _2 \right] ^\varepsilon \right) . \end{aligned}$$
(46)

5.4 Leading-order thermodynamics

We show in this section that to leading order (limit \(\varepsilon \rightarrow 0\)), the resulting thermodynamic process is adiabatic or, in other words, a thermodynamic process with constant entropy.

In fact, by comparing the characterisation of adiabatic processes in Sect. 5.1 with the leading-order expansion of the temperature, entropy, and external force in the model problem as derived in (43) and (44), i.e.

$$\begin{aligned} T_0= \theta _*\omega (y_0),\qquad S_0 = \log (\theta _*)+C, \qquad F_0=\theta _*\omega '(y_0), \end{aligned}$$

we observe that \(S_0\equiv \mathrm {const.}\) and hence reason that the limit process can be interpreted as an adiabatic thermodynamic process. This is in particular a result of the external agent \(y_0\), which affects the fast system to leading order only slowly. The energy in the limit \(\varepsilon \rightarrow 0\) (see Eq. (45)) is given by \(E^\perp _0(y_0) = \theta _*\omega (y_0)\), and therefore (cf. Eq. (31)),

$$\begin{aligned} \hbox {d} E^\perp _0 = F_0 \hbox {d} y_0 +T_0 \hbox {d} S_0 = F_0 \hbox {d} y_0, \end{aligned}$$
(47)

which resembles Eq. (29).

Remark 2

Note that by action and reaction, the force exerted on the fast subsystem \(F_0\) is equal but of opposite sign to the force acting on the slow DOF, i.e. \(\ddot{y}_0=-F_0\) (see also (19)).

5.5 Second-order thermodynamics

We calculate the average contribution of the higher-order microscale oscillations in \(E^\perp _\varepsilon \) and \(S_\varepsilon \) by taking the weak\(^*\) limit of the asymptotic expansion terms \([\bar{E}_1^\perp ]^\varepsilon \), \([\bar{E}_2^\perp ]^\varepsilon \), \([\bar{S}_1]^\varepsilon \), and \([\bar{S}_2]^\varepsilon \) in (44)–(46). With property (23), i.e. \([\theta _1]^\varepsilon , [y_2]^\varepsilon , [\theta _2]^\varepsilon {\mathop {\rightharpoonup }\limits ^{*}} 0\) in \(L^\infty ([0,T])\), this yields

$$\begin{aligned}{}[\bar{E}_1^\perp ]^\varepsilon {\mathop {\rightharpoonup }\limits ^{*}}0 \quad \text {in}\quad L^\infty ([0,T]),\qquad [\bar{S}_1]^\varepsilon {\mathop {\rightharpoonup }\limits ^{*}} 0 \quad \text {in} \quad L^\infty ([0,T]), \end{aligned}$$

and

$$\begin{aligned}{} & {} [\bar{E}_2^\perp ]^\varepsilon {\mathop {\rightharpoonup }\limits ^{*}}\bar{E}_2^\perp :=\theta _*\omega '(y_0) \bar{y}_2+ \omega (y_0)\bar{\theta }_2\quad \text {in} \quad L^\infty ([0,T]), \end{aligned}$$
(48)
$$\begin{aligned}{} & {} \quad [\bar{S}_2]^\varepsilon {\mathop {\rightharpoonup }\limits ^{*}}\bar{S}_2:=\frac{\bar{\theta }_2}{\theta _*} - \left( \frac{D_t L_0}{4 \omega (y_0)} \right) ^2 \quad \text {in} \quad L^\infty ([0,T]). \end{aligned}$$
(49)

We can now define the second-order average energy and entropy

$$\begin{aligned} \bar{E}^\perp _\varepsilon :=E^\perp _0 + \varepsilon ^2 \bar{E}^\perp _2,\qquad \bar{S}_\varepsilon :=S_0+ \varepsilon ^2 \bar{S}_2. \end{aligned}$$

To analyse the thermodynamic properties of \(\bar{E}_\varepsilon ^\perp \) for \(\varepsilon >0\), we focus on the second-order expansion term \(\bar{E}_2^\perp \). The appropriate temperature, entropy, and external force can be read off from (48) and (49):

$$\begin{aligned} T_0= \theta _*\omega (y_0),\qquad \bar{S}_2 = \frac{\bar{\theta }_2}{\theta _*} -\left( \frac{D_t L_0}{4 \omega (y_0)} \right) ^2,\qquad F_0=\theta _*\omega '(y_0). \end{aligned}$$

Note that the entropy in this case is not constant. This can be explained by the second-order correction of the external agent \(y_\varepsilon \), which exhibits according to Theorem 1 a decomposition into a slowly varying component \(\bar{y}_2\) and a rapidly varying component \(\left[ y_2 \right] ^\varepsilon \). By considering \(\bar{E}^\perp _2 = \bar{E}^\perp _2(\bar{S}_2,\bar{y}_2; y_0, p_0)\), we obtain (cf. Eq. (29)) for fixed \((y_0, p_0)\)

$$\begin{aligned} \textrm{d} \bar{E}^\perp _2 = F_0 \textrm{d} \bar{y}_2 + T_0 \textrm{d} \bar{S}_2, \end{aligned}$$

which agrees with the qualitative discussion of Sect. 5.1.

Finally, let us inspect how the thermodynamic energy balance is realised in the total second-order energy correction of \(E_\varepsilon \). Analogous to the decomposition (28), we split the total energy \(E_\varepsilon \) in (15) into \(E_\varepsilon ^\perp \) and \(E_\varepsilon ^\parallel \), i.e. \(E_\varepsilon =E^\parallel _\varepsilon + E^\perp _\varepsilon \), where

$$\begin{aligned} E^\parallel _\varepsilon = \frac{1}{2}p_\varepsilon ^2+\frac{\varepsilon }{2}\theta _\varepsilon p_\varepsilon D_y L_\varepsilon \sin (2\varepsilon ^{-1}\phi _\varepsilon ) +\frac{\varepsilon ^2}{8}\theta _\varepsilon ^2 (D_y L_\varepsilon )^2\sin ^2(2\varepsilon ^{-1}\phi _\varepsilon ). \end{aligned}$$

We then use the expressions derived in Theorem 1 to expand \(E^\parallel _\varepsilon =E^\parallel _0 + \varepsilon [\bar{E}_1^\parallel ]^\varepsilon + \varepsilon ^2 [\bar{E}_2^\parallel ]^\varepsilon +\varepsilon ^2 E_3^{\parallel \varepsilon }\), where \(E_3^{\parallel \varepsilon }\rightarrow 0\) in \(C([0,T])\) with

$$\begin{aligned} E_0^\parallel :=\frac{1}{2}p_0^2,\qquad [\bar{E}_1^\parallel ]^\varepsilon :=\frac{\theta _*}{2} D_t L_0\sin (2\varepsilon ^{-1}\phi _0), \end{aligned}$$

and

$$\begin{aligned}{}[\bar{E}_2^\parallel ]^\varepsilon:= & {} p_0 \left( \bar{p}_2+\left[ p_2 \right] ^\varepsilon \right) +\frac{\theta _*^2 (D_y L_0)^2}{8}\sin ^2(2\varepsilon ^{-1}\phi _0)\nonumber \\{} & {} + \theta _*D_t L_0\left( \bar{\phi }_2+[\phi _2]^\varepsilon \right) \cos (2\varepsilon ^{-1}\phi _0)+ \frac{ [\theta _1]^\varepsilon D_t L_0}{2}\sin (2\varepsilon ^{-1}\phi _0). \end{aligned}$$
(50)

As before, we take the weak\(^*\) limit to determine the average energy correction at first and second order, and find

$$\begin{aligned}&{[}\bar{E}_1^\parallel ]^\varepsilon {\mathop {\rightharpoonup }\limits ^{*}}0 \quad \text {in}\quad L^\infty ([0,T]), \\&{[}\bar{E}_2^\parallel ]^\varepsilon {\mathop {\rightharpoonup }\limits ^{*}}\bar{E}_2^{\parallel }:=p_0 \bar{p}_2 +\left( \frac{\theta _*D_y L_0}{4} \right) ^2 -\frac{\theta _*(D_t L_0)^2}{4\omega (y_0)} \quad \text {in}\quad L^\infty ([0,T]), \end{aligned}$$

and define the averaged residual energy \(\bar{E}^\parallel _\varepsilon :=E_0^\parallel + \varepsilon ^2 \bar{E}_2^\parallel \). The following theorem shows how the Hamiltonian character of the model problem and the thermodynamic interpretation materialise for the averaged total second-order energy correction \(\bar{E}_2 = \bar{E}^\parallel _2 + \bar{E}^\perp _2\).

Theorem 2

Let \((y_0, p_0)\) be as in (18) and \((\bar{y}_2, \bar{p}_2)\) be as in Theorem 1. Let \(\bar{E}_2\) be the averaged total second-order energy correction \(\bar{E}_2= \bar{E}_2^\parallel + \bar{E}_2^\perp \), where

$$\begin{aligned} \bar{E}_2^\parallel (\bar{y}_2, \bar{p}_2; y_0, p_0) = p_0 \bar{p}_2 +\frac{\theta _*^2 \omega '(y_0)^2}{16 \omega ^2(y_0)} -\frac{\theta _*(p_0 \omega '(y_0))^2}{4\omega ^3(y_0)}, \end{aligned}$$

and

$$\begin{aligned} \bar{E}_2^\perp (\bar{y}_2; y_0, p_0) = \theta _*\omega '(y_0) \bar{y}_2+ \omega (y_0)\bar{\theta }_2(y_0, p_0), \end{aligned}$$

with

$$\begin{aligned} \bar{\theta }_2(y_0, p_0) = \frac{\theta _*p_0 (\omega '(y_0))^2}{8 \omega ^4(y_0)} + C_{\theta }, \qquad C_{\theta }= -\frac{\theta _*p_*(\omega '(y_*))^2}{8 \omega ^4(y_*)} -[\theta _2]^\varepsilon (0). \end{aligned}$$

Then the differential equations (24c) and (24d) take the form

$$\begin{aligned} \frac{\textrm{d} \bar{y}_2 }{\textrm{d}t} = \frac{\partial \bar{E}_2}{\partial p_0},\qquad \frac{\textrm{d} \bar{p}_2}{\textrm{d}t} = -\frac{\partial \bar{E}_2 }{\partial y_0}. \end{aligned}$$

Additionally, with the functions \(T_0\), \(\bar{S}_2\), and \(F_0\), which can be interpreted as the temperature, entropy, and external force in the fast subsystem, the energy \(\bar{E}_2^\perp \) can be written as

$$\begin{aligned} \bar{E}_2^\perp (\bar{S}_2, \bar{y}_2; y_0, p_0)= F_0 \bar{y}_2 + T_0 \bar{S}_2 + \frac{\theta _*(p_0 \omega '(y_0))^2}{16 \omega ^3(y_0)}. \end{aligned}$$

With this notation, the energy \(\bar{E}_2^\perp \) satisfies the constitutive Eqs. (32) in the form

$$\begin{aligned} F_0 = \frac{\partial \bar{E}_2}{\partial \bar{y}_2}, \qquad T_0 = \frac{\partial \bar{E}_2}{\partial \bar{S}_2}. \end{aligned}$$

Proof

The claim follows directly from (24c) and (24d). \(\square \)
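The thermodynamic representation of \(\bar{E}_2^\perp \) in Theorem 2 rests on a short algebraic identity, which can be spot-checked numerically. The Python sketch below compares the expression (48) with \(F_0 \bar{y}_2 + T_0 \bar{S}_2 + \theta _*(p_0\omega '(y_0))^2/(16\omega ^3(y_0))\) for one set of hypothetical values. It assumes \(D_t L_0 = p_0\,\omega '(y_0)/\omega (y_0)\), which is what a comparison of the two expressions for \(\bar{E}_2^\parallel \) above suggests. The frequency profile and all numbers are illustrative and are not solutions of (18) or (24).

```python
def omega(y):
    """Hypothetical frequency profile; the article keeps omega general."""
    return 1.0 + y * y

def omega_prime(y):
    """Derivative of the hypothetical profile above."""
    return 2.0 * y

THETA_STAR = 0.6                   # hypothetical initial action
y0, p0 = 0.4, -0.3                 # hypothetical leading-order state
y2_bar, theta2_bar = 0.25, 0.1     # hypothetical second-order slow corrections

w, dw = omega(y0), omega_prime(y0)
d_t_L0 = p0 * dw / w               # assumed form of D_t L_0 (see lead-in)

T0 = THETA_STAR * w                # leading-order temperature (43)
F0 = THETA_STAR * dw               # leading-order force (43)
S2_bar = theta2_bar / THETA_STAR - (d_t_L0 / (4.0 * w)) ** 2   # entropy (49)

# Averaged second-order energy of the fast subsystem, Eq. (48).
e2_perp = THETA_STAR * dw * y2_bar + w * theta2_bar

# Thermodynamic representation stated in Theorem 2.
e2_thermo = F0 * y2_bar + T0 * S2_bar + THETA_STAR * (p0 * dw) ** 2 / (16.0 * w ** 3)

print(e2_perp, e2_thermo)
```

The two printed values coincide up to rounding, because the term \(T_0 (D_t L_0/(4\omega (y_0)))^2\) hidden in \(T_0\bar{S}_2\) is cancelled by the last summand.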

5.6 Thermodynamic interpretation in the multidimensional case

The discussion in [33] shows how to derive the second-order asymptotic expansion in the case of multiple fast and slow DOF. Similar to the expansion covered in this article, one can interpret the dynamics to leading order as well as to second order from a thermodynamic point of view. If there is only one fast DOF, the fast subsystem is ergodic, and thus, Hertz’ theory can be applied directly. In particular (see Remark 2), the derived force to leading order of the internal energy (47) coincides (up to sign) with the force of the slow DOF derived through the weak\(^*\) limit (19). If there is more than one fast DOF, however, the fast subsystem is in general not ergodic. Thus, Hertz’ theory has to be adapted in Eqs. (34) and (38) by considering the ensemble average instead of the time average. This allows one to derive expressions which correspond to temperature, entropy, and external force. These expressions can be expanded to second order, similar to the procedure described in this article. It turns out that in this case, energy relations appear which likewise resemble the first and second law of thermodynamics. However, in contrast to the case of one fast DOF, the derived force to leading order of the internal energy does not coincide (up to sign) with the force of the slow degrees of freedom derived through weak\(^*\) convergence techniques.

While the force derived in the latter case corresponds to the total force the slow DOF experience, the force derived in the former case is due only to a change of the internal energy. In fact, if we assume that temperature is kept constant in the system or if we consider only a very small time frame such that \(y_0\approx y_*\), then the force derived from the internal energy coincides with the force derived from the Fixman potential assuming equi-distribution of energy in the normal vibrational modes (see [5]). Thus, similar to [5], the Fixman potential provides an inexact description of the slow dynamics of the system. An exact description of the slow dynamics can be obtained by considering further contributions from the mutual interactions with the fast DOF. This information can be obtained from the entropic term.

6 Conclusions

In this article, we discussed a simple fast–slow mechanical system. It represents a minimal example of a class of related models. The expansion to second order and its thermodynamic interpretation carry over to finitely many DOF. It is a natural next step to consider the extension to real-world applications in molecular dynamics. In molecular dynamics, one is usually interested in the slow macroscopic dynamics; the fast microscopic dynamics is less important but is necessary to obtain an accurate representation of the dynamics on the microscopic scale. Through averaging techniques, it is possible (see for instance [5]) to derive a homogenised system, which describes the dynamics only of the slow DOF. However, this homogenised system has to be understood as an approximation to the slow dynamics of the original system because it is derived by considering the limit \(\varepsilon \rightarrow 0\). For a more detailed description of the slow dynamics in this system, a higher-order asymptotic expansion is of interest, as the scale parameter \(\varepsilon \) is a quotient of mass ratios and thus in many applications small but finite.

The fast–slow mechanical system studied in this article is one of the simplest models that can potentially exhibit thermodynamic effects. Indeed, one of the core assumptions of thermodynamics is that the system under consideration has a clear separation of scales. This assumption is by construction satisfied for our system. Many theories have been developed to study large-scale interacting particle systems from a thermodynamic point of view, but only a few focus on applying the theory in combination with the averaging methods for dynamical systems.

One of the earliest attempts to describe fast–slow mechanical systems from a thermodynamic point of view dates back to the work of Hertz [36]. Since the fast subsystem in our model is ergodic, Hertz’ methodology can directly be applied to the fast–slow system studied in this article. We derived the second-order asymptotic expansion of the dynamics in the system via weak convergence techniques and expressed in addition the results in the framework of two-scale convergence. Moreover, we showed that the dynamics to leading order as well as on average to second order satisfy equations resembling the first and second law of thermodynamics. In the limit \(\varepsilon \rightarrow 0\), the expression of the internal energy can be used to derive the same evolution equation for the slow dynamics as by methods based on homogenisation procedures [5]. For \(\varepsilon >0\), i.e. for the average second-order asymptotic expansion, a similar statement cannot be made because the entropic term does not vanish, implying more complicated dynamics to second order, which cannot just be described by the internal energy.

From the first and second law of thermodynamics, it is clear that a simple description of the slow dynamics can be achieved when the entropic term vanishes, which is the case if the fast dynamics is ergodic. In the case of multiple fast DOF, the fast dynamics is, in general, not ergodic. Thus, a direct application of Hertz’ theory is not possible. As demonstrated in [33], however, some adjusted techniques can be used to derive a thermodynamic interpretation of the system. It turns out that the leading-order and the averaged second-order dynamics do not have a vanishing entropic term and thus the internal energy alone cannot be used to describe the slow dynamics in the system. Contributions from the entropic term need to be considered to describe the slow dynamics in the system correctly.

The discussion in this article provides two different perspectives on the asymptotic expansion of solutions to our model problem (9). Firstly, the rigorous derivation of the asymptotic expansion of the system’s dynamics provides accurate information about the evolution of the system up to second order. Secondly, the thermodynamic interpretation up to second order provides a bridge to the realm of statistical mechanics. A possible application of the theory presented in this article is to approximate the dynamics of system (9) by a system that describes the slow evolution accurately, but the fast evolution by thermodynamic expressions in terms of temperature and entropy. Whether the evolution of these quantities can be derived is an open problem, as is the role of entropy for thermodynamic consistency. Yet, the asymptotic expansion can be used profitably in numerical simulations [33].