1 Introduction

The evolution of complex systems arising in chemistry and biology often involves dynamic phenomena occurring at a wide range of time and length scales. Many such systems are characterised by the presence of a hierarchy of barriers in the underlying energy landscape, giving rise to a complex network of metastable regions in configuration space. Such energy landscapes occur naturally in macromolecular models of solvated systems, in particular protein dynamics. In such cases the rugged energy landscape is due to the many competing interactions in the energy function [11], giving rise to frustration, in a manner analogous to spin glass models [10, 40]. Although the large scale structure determines the minimum energy configurations of the system, the small scale fluctuations of the energy landscape still have a significant influence on the dynamics of the protein, in particular on the behaviour at equilibrium, the most likely pathways for binding and folding, and the stability of the conformational states. Rugged energy landscapes arise in various other contexts, for example nucleation at a phase transition and solid transport in condensed matter.

To study the influence of small scale potential energy fluctuations on the system dynamics, a number of simple mathematical models have been proposed which capture the essential features of such systems. In one such model, originally proposed by Zwanzig [56], the dynamics are modelled as an overdamped Langevin diffusion in a rugged two-scale potential \(V^\epsilon \),

$$\begin{aligned} dX^\epsilon _t = -\nabla V^\epsilon (X^\epsilon _t)\,dt + \sqrt{2\sigma }\,dW_t,\quad \sigma = \beta ^{-1} = k_BT, \end{aligned}$$
(1)

where T is the temperature and \(k_B\) is Boltzmann’s constant. The function \(V^\epsilon (x) = V(x,x/\epsilon )\) is a smooth potential which has been perturbed by a rapidly fluctuating function with wave number controlled by the small scale parameter \(\epsilon > 0\). See Fig. 1 for an illustration. Zwanzig’s analysis was based on an effective medium approximation of the mean first passage time, from which the standard Lifson–Jackson formula [33] for the effective diffusion coefficient was recovered. In the context of protein dynamics, phenomenological models based on (1) are widespread in the literature, including but not limited to [3, 28, 37, 53]. Theoretical aspects of such models have also been studied previously. In [13] the authors study diffusion in a strongly correlated quenched random potential constructed from a periodically-extended path of a fractional Brownian motion. A numerical study of the effective diffusivity for diffusion in a potential obtained from a realisation of a stationary isotropic Gaussian random field is performed in [6]. More recent works include [22, 23], where the authors study systems of weakly interacting diffusions moving in a multiwell potential energy landscape, coupled via a Curie–Weiss type (quadratic) interaction potential, and [34], in which the authors consider enhanced diffusion for Brownian motion in a tilted periodic potential, expressing the effective diffusion in terms of the eigenvalue band structure. It is also worth mentioning a series of works [4, 19, 48, 54] studying the multiscale behaviour of diffusion processes in multiple-well potentials in which the limiting process is a chemical reaction instead of a diffusion. We also mention [14], where the combined mean field/homogenization limit for diffusions interacting via a periodic potential is considered; the main result of that paper is that, in the presence of phase transitions, the mean field and homogenization limits do not commute.
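To make the model (1) concrete, the following minimal sketch (our illustration, not code from the original work) integrates the dynamics with the Euler–Maruyama scheme for an assumed one-dimensional two-scale potential \(V^\epsilon (x) = x^2/2 + \alpha \cos (2\pi x/\epsilon )\); the parameter values are purely illustrative. Note that the time step has to resolve the fastest oscillations of the potential, which is precisely what makes direct simulation expensive for small \(\epsilon \) and motivates the coarse-grained descriptions discussed below.

```python
import numpy as np

# Illustrative two-scale potential V^eps(x) = x^2/2 + alpha*cos(2*pi*x/eps),
# a 1d caricature of a rugged energy landscape; all parameters are assumptions.
alpha, eps, sigma = 0.5, 0.05, 1.0   # sigma = k_B T

def grad_V_eps(x):
    """Gradient of the multiscale potential V^eps."""
    return x - (2.0 * np.pi * alpha / eps) * np.sin(2.0 * np.pi * x / eps)

def euler_maruyama(x0, dt, n_steps, rng):
    """Simulate dX = -grad V^eps(X) dt + sqrt(2*sigma) dW."""
    x = np.empty(n_steps + 1)
    x[0] = x0
    noise = rng.standard_normal(n_steps) * np.sqrt(2.0 * sigma * dt)
    for n in range(n_steps):
        x[n + 1] = x[n] - grad_V_eps(x[n]) * dt + noise[n]
    return x

rng = np.random.default_rng(0)
# The time step must resolve the fast oscillations, i.e. dt << eps^2.
path = euler_maruyama(x0=0.0, dt=1e-5, n_steps=200_000, rng=rng)
print("time average:", path.mean(), "  empirical variance:", path.var())
```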

Fig. 1 Example of a multiscale potential. The left panel shows the isolines of the Mueller potential [39, 49]. The right panel shows the corresponding rugged energy landscape where the Mueller potential is perturbed by high frequency periodic fluctuations

For the case where (1) possesses one characteristic lengthscale controlled by \(\epsilon >0\), the convergence of \(X_t^\epsilon \) to a coarse-grained process \(X_t^0\) in the limit \(\epsilon \rightarrow 0\) over a finite time interval is well known. When the rapid oscillations are periodic, under a diffusive rescaling this problem can be recast as a periodic homogenization problem, for which it can be shown that the process \(X_t^\epsilon \) converges weakly to a Brownian motion with a constant effective diffusion tensor D (covariance matrix) which can be calculated by solving an appropriate Poisson equation posed on the unit torus, see for example [8, 46]. The analogous case where the rapid fluctuations arise from a stationary ergodic random field has been studied in [31, Chap. 9]. The case where the potential \(V^\epsilon \) possesses periodic fluctuations with two or three well-separated characteristic lengthscales, i.e. \(V^\epsilon (x) = V(x,x/\epsilon ,x/\epsilon ^2)\), follows from the results in [8, Chap. 3.7], in which case the dynamics of the coarse-grained model in the \(\epsilon \rightarrow 0\) limit are characterised by an Itô SDE whose coefficients can be calculated in terms of the solution of an associated Poisson equation. A generalization of these results to diffusion processes having N well-separated scales was explored in Sect. 3.11.3 of the same text, but no proof of convergence is offered in this case. Similar diffusion approximations for systems with one fast scale and one slow scale, where the fast dynamics is not periodic, have been studied in [43].

A model for Brownian dynamics in a potential V possessing infinitely many characteristic lengthscales was studied in [7]. In particular, the authors studied the large-scale diffusive behaviour of the overdamped Langevin dynamics in potentials of the form

$$\begin{aligned} V^n(x) = \sum _{k=0}^{n}U_k\left( \frac{x}{R_k}\right) , \end{aligned}$$
(2)

obtained as a superposition of Hölder continuous periodic functions with period 1. It was shown in [7] that the effective diffusion coefficient decays exponentially fast with the number of scales, provided that the scale ratios \(R_{k+1}/R_{k}\) are bounded from above and below, which includes cases where there is no scale separation. From this the authors were able to show that the effective dynamics exhibits subdiffusive behaviour, in the limit of infinitely many scales. See also the analytical calculation presented in [15] for a piecewise linear periodic potential; in the limit of infinitely many scales, the homogenized diffusion coefficient converges to zero, signaling that, in this limit, the coarse-grained dynamics is characterized by subdiffusive behaviour.

In this paper we study the dynamics of diffusion in a rugged potential possessing N well-separated lengthscales. More specifically, we study the dynamics of (1) where the multiscale potential is chosen to have the form

$$\begin{aligned} V^\epsilon (x) = V\bigg (x, x/\epsilon , x/\epsilon ^2, \ldots , x/\epsilon ^N\bigg ), \end{aligned}$$
(3)

where V is a smooth function, which is periodic with period 1 in all but the first argument. Clearly, V can always be written in the form

$$\begin{aligned} V(x_0, x_1, \ldots , x_N) = V_0(x_0) + V_1(x_0, x_1, \ldots , x_N), \end{aligned}$$
(4)

where \((x_0,x_1,\ldots , x_N) \in {\mathbb {R}}^d\times \left( \mathbb {T}^d\right) ^{N}\). We will assume that the large scale component \(V_0\) of the potential is smooth and confining on \(\mathbb {R}^d\), and that the perturbation \(V_1\) is a smooth bounded function which is periodic in all but the first variable. Unlike [7], we work under the assumption of explicit scale separation; however, we also permit more general potentials than those of the form (2), allowing possibly nonlinear interactions between the different scales, and even full coupling between scales. To emphasize the fact that the potential (4) leads to a fully coupled system across scales, we introduce the auxiliary processes \(X_t^{(j)} = X_t/\epsilon ^{j}\), \(j=0,\ldots , N\). The SDE (1) can then be written as a fully coupled system of SDEs driven by the same Brownian motion \(W_t\),

$$\begin{aligned} dX^{(0)}_t&= -\sum _{i=0}^{N}\epsilon ^{-i}\nabla _{x_i}V\left( X^{(0)}_t, X^{(1)}_t,\ldots , X^{(N)}_t\right) \,dt + \sqrt{2\sigma }\,dW_t, \end{aligned}$$
(5a)
$$\begin{aligned} dX^{(1)}_t&= -\sum _{i=0}^{N}\epsilon ^{-(i+1)}\nabla _{x_i}V\left( X^{(0)}_t, X^{(1)}_t,\ldots , X^{(N)}_t\right) \,dt + \sqrt{\frac{2\sigma }{\epsilon ^2}}\,dW_t, \end{aligned}$$
(5b)
$$\begin{aligned}&\vdots \nonumber \\ dX^{(N)}_t&= -\sum _{i=0}^{N}\epsilon ^{-(i+N)}\nabla _{x_i}V\left( X^{(0)}_t, X^{(1)}_t, \ldots , X^{(N)}_t\right) \,dt + \sqrt{\frac{2\sigma }{\epsilon ^{2N}}}\,dW_t, \end{aligned}$$
(5c)

in which case \(X_t^{(0)}\) is considered to be a “slow” variable, while \(X_t^{(1)}, \ldots , X_t^{(N)}\) are “fast” variables. In this paper, we first provide an explicit proof of the convergence of the solution \(X_t^\epsilon \) of (1) to a coarse-grained (homogenized) diffusion process \(X_t^0\), given by the unique solution of the following Itô SDE:

$$\begin{aligned} dX_t^0 = -\mathcal {M}(X_t^0)\nabla \Psi (X_t^0)\,dt + \sigma \nabla \cdot \mathcal {M}(X_t^0)\,dt + \sqrt{2\sigma \mathcal {M}(X_t^0)}\,dW_t, \end{aligned}$$
(6)

where

$$\begin{aligned} \Psi (x) = -\sigma \log Z(x), \end{aligned}$$

denotes the free energy, for

$$\begin{aligned} Z(x) = \int _{\mathbb {T}^d}\cdots \int _{\mathbb {T}^d}e^{- V(x, y_1, \ldots , y_N)/\sigma }\,dy_1\ldots dy_N, \end{aligned}$$

and where \(\mathcal {M}(x)\) is a symmetric uniformly positive definite tensor which is independent of \(\epsilon \). The formula for the effective diffusion tensor is given in Sect. 2.
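As an illustration of these formulas (a hedged sketch with assumed data, not code from the paper), the snippet below evaluates \(Z(x)\) and the free energy \(\Psi (x) = -\sigma \log Z(x)\) by numerical quadrature for a single fast scale in one dimension, with the hypothetical choices \(V_0(x) = x^2/2\) and \(V_1(x,y) = \cos (2\pi y)/(1+x^2)\); for this \(V_1\) the integral over the fast variable is a modified Bessel function, which provides an independent check.

```python
import numpy as np
from scipy.integrate import quad
from scipy.special import i0

sigma = 0.5

def V0(x):
    """Assumed confining part of the potential."""
    return 0.5 * x**2

def V1(x, y):
    """Assumed fully coupled perturbation, 1-periodic in the fast variable y."""
    return np.cos(2.0 * np.pi * y) / (1.0 + x**2)

def Z(x):
    """Z(x) = int_0^1 exp(-V(x, y)/sigma) dy with V = V0 + V1 (one scale, d = 1)."""
    val, _ = quad(lambda y: np.exp(-(V0(x) + V1(x, y)) / sigma), 0.0, 1.0)
    return val

def free_energy(x):
    """Psi(x) = -sigma * log Z(x)."""
    return -sigma * np.log(Z(x))

for x in (0.0, 1.0, 3.0):
    # For V1 = a(x)*cos(2*pi*y) the fast integral equals I_0(a(x)/sigma).
    a = 1.0 / (1.0 + x**2)
    psi_exact = V0(x) - sigma * np.log(i0(a / sigma))
    print(f"x = {x:4.1f}   Psi = {free_energy(x):.6f}   closed form = {psi_exact:.6f}")
```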

Our assumptions on the potential \(V^\epsilon \) in (4) guarantee that the full dynamics (1) is reversible with respect to the Gibbs measure \(\mu ^{\epsilon }\) by construction. It is important to note that the coarse-grained dynamics (6) is also reversible with respect to the equilibrium Gibbs measure

$$\begin{aligned} \mu ^0(x) = Z(x)/\overline{Z}. \end{aligned}$$

Indeed, the natural interpretation of \(\Psi (x)=-\sigma \log Z(x)\) is as the free energy corresponding to the coarse-grained variable \(X^0_t\). The weak convergence of \(X_t^\epsilon \) to \(X_t^0\) implies in particular that the distribution of \(X_t^\epsilon \) converges weakly to that of \(X_t^0\), uniformly over finite time intervals [0, T]; however, this says nothing about the convergence of the respective stationary distributions \(\mu ^\epsilon \) to \(\mu ^0\). In Sect. 4 we study the equilibrium behaviour of \(X_t^\epsilon \) and \(X_t^0\) and show that the long-time limit \(t\rightarrow \infty \) and the coarse-graining limit \(\epsilon \rightarrow 0\) commute, and in particular that the equilibrium measure \(\mu ^\epsilon \) of \(X_t^\epsilon \) converges in the weak sense to \(\mu ^0\). We also study the rate of convergence to equilibrium for both processes, and we obtain bounds relating the two rates. This question is naturally related to the study of the Poincaré constants for the full and coarse-grained potentials [24, 41].

We can summarize the above discussion as follows: the (Wasserstein) gradient structure, reversibility and detailed balance property of the dynamics (the three properties are equivalent) are preserved under the homogenization/coarse-graining procedure; in particular, the reversibility of \(X_t^\epsilon \) with respect to \(\mu ^\epsilon \) carries over to the reversibility of \(X_t^0\) with respect to \(\mu ^0\). Indeed, any general diffusion process that is reversible with respect to \(\mu ^0(x)\) will have the form (18), see [45, Sect. 4.7]. It is not always the case that the gradient structure is preserved under coarse-graining, as has been shown recently [47]. The creation of non-gradient/nonreversible effects due to the multiscale structure of the dynamics is a very interesting problem that we will return to in future work.

We also remark that the homogenized SDE corresponds to the kinetic/Klimontovich interpretation of the stochastic integral [27], i.e. it can be written in the form

$$\begin{aligned} dX_t^0 = -\mathcal {M}(X_t^0)\nabla \Psi (X_t^0)\,dt + \sqrt{2\sigma \mathcal {M}(X_t^0)} \circ ^{\text{ Klim }} \,dW_t, \end{aligned}$$
(7)

where we use the notation \(\circ ^{\text{ Klim }}\) to denote the Klimontovich stochastic differential/integral. The Klimontovich interpretation of the stochastic integral leads to a thermodynamically consistent Langevin dynamics, in the sense that it is reversible with respect to the coarse-grained Gibbs measure.
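The following sketch (our illustration with assumed one-dimensional coefficients, not the authors' code) integrates the Itô form (6) by the Euler–Maruyama method for hypothetical choices of \(\mathcal {M}(x)\) and \(\Psi (x)\); keeping the divergence term \(\sigma \mathcal {M}'(x)\) explicitly is what makes the discretised dynamics consistent with the invariant density proportional to \(e^{-\Psi (x)/\sigma }\), in line with the thermodynamic consistency discussed above.

```python
import numpy as np

sigma = 1.0

# Assumed illustrative coefficients for the homogenized SDE (6) in d = 1.
def M(x):          # effective diffusion coefficient, uniformly positive
    return 1.0 / (2.0 + np.sin(x))

def dM(x):         # its derivative, i.e. the divergence of M in one dimension
    return -np.cos(x) / (2.0 + np.sin(x))**2

def grad_Psi(x):   # free energy Psi(x) = x^2/2 for this illustration
    return x

def step(x, dt, xi):
    """One Euler-Maruyama step of dX = (-M*Psi' + sigma*M') dt + sqrt(2*sigma*M) dW."""
    drift = -M(x) * grad_Psi(x) + sigma * dM(x)
    return x + drift * dt + np.sqrt(2.0 * sigma * M(x) * dt) * xi

rng = np.random.default_rng(1)
x, dt, n_steps, burn_in = 0.0, 1e-3, 500_000, 50_000
samples = []
for n in range(n_steps):
    x = step(x, dt, rng.standard_normal())
    if n >= burn_in:
        samples.append(x)
samples = np.asarray(samples)
# The invariant density of (6) is proportional to exp(-Psi/sigma), i.e. a Gaussian
# with variance sigma here, so the empirical second moment should be close to sigma.
print("empirical E[X^2] =", samples.var(), "   target =", sigma)
```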

The multiplicative noise is due to the full coupling between the macroscopic and the N microscopic scales. For one-dimensional potentials, we are able to obtain an explicit expression for \(\mathcal {M}(x)\), regardless of the number of scales involved. In higher dimensions, \(\mathcal {M}(x)\) will be expressed in terms of the solution of a recursive family of Poisson equations which can be solved only numerically. We also obtain a variational characterization of the effective diffusion tensor, analogous to the standard variational characterisations for the effective conductivity tensor in multiscale conductivity problems, see for example [29]. Using this variational characterisation, we are able to derive tight bounds on the effective diffusion tensor and, in particular, show that as \(N \rightarrow \infty \) the eigenvalues of the effective diffusion tensor converge to zero, suggesting that diffusion in potentials with infinitely many scales exhibits anomalous diffusion. The focus of this paper is the rigorous analysis of the homogenization problem for (1) with \(V^\epsilon \) given by (4). More precisely, we are interested in establishing the convergence of both the dynamics (over a finite time interval) and the equilibrium measure of (1) as \(\epsilon \) tends to zero.

Our proof of the homogenization theorem, Theorem 1, is based on the well-known martingale approach to proving limit theorems [8, 42, 43]. The main technical difficulty in applying these well-known techniques is the construction of the corrector field/compensator and the analysis of the resulting Poisson equations. This turns out to be a challenging task, since we consider the case where all scales, the macroscale and the N microscales, are fully coupled. For recent applications of these techniques, we refer the reader to [32, 50], where the authors study the metastable behaviour of multiscale diffusion processes.

The rest of the paper is organized as follows. In Sect. 2 we state the assumptions on the structure of the multiscale potential and state the main results of this paper. In Sect. 3 we study properties of the effective dynamics, providing expressions for the diffusion tensor in terms of a variational formula, and derive various bounds. In Sect. 4 we study properties of the effective potential, and prove convergence of the equilibrium distribution of \(X_t^\epsilon \) to the coarse-grained equilibrium distribution \(\mu ^0\). The proof of the main theorem, Theorem 1, is presented in Sect. 5. Finally, in Sect. 6 we provide further discussion and outlook.

2 Setup and Statement of Main Results

In this section we provide conditions on the multiscale potential which are required to obtain a well-defined homogenization limit. In particular, we shall highlight assumptions necessary for the ergodicity of the full model as well as the coarse-grained dynamics.

We will consider the overdamped Langevin dynamics

$$\begin{aligned} dX^\epsilon _t = -\nabla V^{\epsilon }(X_t^\epsilon )\,dt + \sqrt{2\sigma }\,dW_t, \end{aligned}$$
(8)

where \(V^\epsilon (x)\) is of the form (3). The multiscale potentials we consider in this paper can be viewed as a smooth confining potential perturbed by smooth, bounded fluctuations which become increasingly rapid as \(\epsilon \rightarrow 0\), see Fig. 1 for an illustration. More specifically, we will assume that the multiscale potential V satisfies the following assumptions.

Assumption 1

The potential V is given by

$$\begin{aligned} V(x_0, x_1, \ldots , x_N) = V_0(x_0) + V_1(x_0, x_1, \ldots , x_N), \end{aligned}$$
(9)

where \((x_0,x_1,\ldots , x_N) \in \mathbb {R}^d\times \left( \mathbb {T}^d\right) ^{N}\), and

1. \(V_0\) is a smooth confining potential, i.e. \(e^{-V_0(x)} \in L^1(\mathbb {R}^d)\) and \(V_0(x) \rightarrow \infty \) as \(|x|\rightarrow \infty \).

2. The perturbation \(V_1(x_0, x_1, \ldots , x_N)\) is smooth and bounded uniformly in \(x_0\).

3. There exists \(C > 0\) such that \(\left\Vert \nabla ^2 V_0 \right\Vert _{L^\infty (\mathbb {R}^d)} \le C\).

Remark 1

We note that Assumption 1 is quite stringent, since it implies that \(V_0\) grows at most quadratically at infinity. This assumption is also made in [43]. In the case where \(X^{\epsilon }_0 \sim \mu ^\epsilon \), i.e. when the process is started in stationarity, this condition can be relaxed considerably.

The infinitesimal generator \(\mathcal {L}^\epsilon \) of \(X_t^\epsilon \) is the self-adjoint extension of

$$\begin{aligned} \mathcal {L}^\epsilon f(x) = -\nabla V^\epsilon (x)\cdot \nabla f(x) + \sigma \Delta f(x),\quad f\in C^\infty _c(\mathbb {R}^d). \end{aligned}$$
(10)

It follows from the assumption on \(V_0\) that the corresponding overdamped Langevin equation

$$\begin{aligned} dY_t = -\nabla V_0(Y_t)\,dt + \sqrt{2\sigma }dW_t, \end{aligned}$$
(11)

is ergodic with the unique stationary distribution

$$\begin{aligned} \mu _{ref}(x) = \frac{1}{Z_{ref}}\exp (-V_0(x)/\sigma ), \quad Z_{ref} = \int _{\mathbb {R}^d}e^{-V_0(x)/\sigma }\,dx. \end{aligned}$$

Since \(V_1\) is bounded uniformly, by Assumption 1, it follows that the potential \(V^\epsilon \) is also confining, and therefore \(X^\epsilon _t\) is ergodic, possessing a unique invariant distribution given by \(\mu ^\epsilon (x) = \frac{e^{-V^\epsilon (x)/\sigma }}{Z^\epsilon },\) where \(Z^\epsilon = \int _{\mathbb {R}^d} e^{-V^\epsilon (x)/\sigma }\,dx\). Moreover, the generator \(\mathcal {L}^\epsilon \) of \(X_t^\epsilon \) can be written as

$$\begin{aligned} \mathcal {L}^\epsilon f(x) = \sigma \,e^{V^\epsilon (x)/\sigma }\nabla \cdot \left( e^{-V^\epsilon (x)/\sigma }\nabla f(x)\right) , \quad f \in C^2_c(\mathbb {R}^d). \end{aligned}$$

from which it follows that \(X_t^\epsilon \) is reversible with respect to \(\mu ^\epsilon \), cf. [20, 45].

Our main objective in this paper is to study the dynamics (8) in the limit of infinite scale separation \(\epsilon \rightarrow 0\). Having introduced the model and the assumptions we can now present the main result of the paper.

Theorem 1

(Weak convergence of \(X_t^\epsilon \) to \(X^0_t\)) Suppose that Assumption 1 holds, let \(T > 0\), and suppose that the initial condition \(X^\epsilon _0\) is distributed according to some probability distribution \(\nu \) on \(\mathbb {R}^d\). Then, as \(\epsilon \rightarrow 0\), the process \(X_t^\epsilon \) converges weakly in \(C([0,T]; \mathbb {R}^d)\) to the diffusion process \(X_t^0\) with generator defined by

$$\begin{aligned} \mathcal {L}^0 f(x) = \frac{\sigma }{Z(x)}\nabla _x\cdot \left( Z(x)\mathcal {M}(x)\nabla _x f(x) \right) ,\quad f\in C^2_c(\mathbb {R}^d), \end{aligned}$$
(12)

and where

$$\begin{aligned} Z(x) = \int _{\mathbb {T}^d}\cdots \int _{\mathbb {T}^d}e^{- V(x, x_1,\ldots , x_N)/\sigma }\,dx_N\ldots dx_1, \end{aligned}$$
(13)

and

$$\begin{aligned} \mathcal {M}(x) = \frac{1}{Z(x)}\int _{\mathbb {T}^d}\cdots \int _{\mathbb {T}^d}(I + \nabla _{x_N}\theta _N)\cdots (I+\nabla _{x_1}\theta _1)e^{- V(x,x_1,\ldots ,x_N)/\sigma }\,dx_N\cdots dx_1. \nonumber \\ \end{aligned}$$
(14)

The correctors are defined recursively as follows: for \(k=0,1,\ldots , N-1\), define \(\theta _{N-k} = \big (\theta ^1_{N-k},\ldots , \theta ^d_{N-k}\big )\) to be the weak vector-valued solution of the PDE

$$\begin{aligned} \nabla _{x_{N-k}}\cdot (\mathcal {K}_{N-k}(x_0, \ldots , x_{N-k})(\nabla _{x_{N-k}}\theta _{N-k}(x_0,\ldots , x_{N-k})+I)) =0, \end{aligned}$$
(15)

where \(\theta _{N-k}(x_0,\ldots , x_{N-k-1},\cdot ) \in H^1(\mathbb {T}^d; \mathbb {R}^d)\), with the notation \(\left[ \nabla _{x_n} \theta _n\right] _{\cdot , j} = \nabla _{x_n}\theta _n^j\), for \(j=1,\ldots , d\) and \(n=1,\ldots , N\) and where

$$\begin{aligned} \begin{aligned}&\mathcal {K}_{N-k}(x_0,\ldots , x_{N-k})\\&\quad = \int _{\mathbb {T}^d}\cdots \int _{\mathbb {T}^d}(I + \nabla _{x_N}\theta _N)\cdots (I+\nabla _{x_{N-k+1}}\theta _{N-k+1})e^{- V/\sigma }\,dx_N\ldots dx_{N-k+1}, \end{aligned} \end{aligned}$$
(16)

for \(k = 1, \ldots , N-1\), and

$$\begin{aligned} \mathcal {K}_N(x, x_1,\ldots , x_N) = e^{- V(x, x_1,\ldots , x_N)/\sigma }I, \end{aligned}$$
(17)

where I denotes the identity matrix in \(\mathbb {R}^{d\times d}\). Provided that Assumption 1 holds, Proposition 5 guarantees the existence and uniqueness (up to a constant) of solutions to the coupled Poisson equations (15). Furthermore, the solutions depend smoothly on the slow variable \(x_0\) as well as on the fast variables \(x_1, \ldots , x_N\). The process \(X_t^0\) is the unique solution to the Itô SDE

$$\begin{aligned} dX_t^0 = -\mathcal {M}(X_t^0)\nabla \Psi (X_t^0)\,dt +\sigma \nabla \cdot \mathcal {M}(X_t^0)\,dt+ \sqrt{2\sigma \mathcal {M}(X_t^0)}\,dW_t, \end{aligned}$$
(18)

where

$$\begin{aligned} \Psi (x) = -\sigma \log Z(x) = -\sigma \log \left( \int _{\mathbb {T}^d}\cdots \int _{\mathbb {T}^d} e^{-V(x, y_1, \ldots , y_N)/\sigma }\,dy_1\ldots dy_N\right) . \end{aligned}$$

The proof, which closely follows that of [43], is postponed to Sect. 5. Theorem 1 confirms the intuition that the coarse-grained dynamics is driven by the coarse-grained free energy. On the other hand, the corresponding SDE has multiplicative noise given by a space-dependent diffusion tensor \(\mathcal {M}(x)\). We can show that the homogenized process (18) is ergodic with unique invariant distribution

$$\begin{aligned} \mu ^0(x) = \frac{Z(x)}{\overline{Z}} = \frac{1}{\overline{Z}}e^{-\Psi (x)/\sigma },\quad \text{ where } \quad \overline{Z} = \int _{\mathbb {R}^d}Z(x)\,dx. \end{aligned}$$

Other qualitative properties of the solution to the homogenized equation (6), including noise-induced transitions and noise-induced hysteresis behaviour, have been studied in [15]. It is also important to note that the reversibility of \(X_t^\epsilon \) with respect to \(\mu ^\epsilon \) is preserved under the homogenization procedure. Indeed, any general diffusion process that is reversible with respect to \(\mu ^0(x)\) will have the form (18), see [45, Sect. 4.7]. See Sect. 6 for further discussion on this point.

As is characteristic of homogenization problems, when \(d=1\) we can obtain, up to quadratures, an explicit expression for the homogenized SDE. In this case, we obtain explicit expressions for the correctors \(\theta _1, \ldots , \theta _N\), so that the intermediary coefficients \(\mathcal {K}_1, \ldots , \mathcal {K}_N\) can be expressed as (see also [15])

$$\begin{aligned} \mathcal {K}_i(x_0, x_1, \ldots , x_{i}) = \left( \int e^{V(x_0, x_1, \ldots , x_{i}, x_{i+1}, \ldots , x_N)/\sigma }\,dx_{i+1}\ldots dx_N\right) ^{-1},\quad i=1,\ldots , N. \end{aligned}$$

Thus we obtain the following result.

Proposition 1

(Effective Dynamics in one dimension) When \(d=1\), the effective diffusion coefficient \(\mathcal {M}(x)\) in (18) is given by

$$\begin{aligned} \mathcal {M}(x) = \frac{1}{Z_1(x)\widehat{Z}_1(x)}, \end{aligned}$$
(19)

where

$$\begin{aligned} Z_1(x) = \int \cdots \int e^{-V_1(x, x_1,\ldots , x_N)/\sigma }\,dx_1\ldots dx_N, \end{aligned}$$

and

$$\begin{aligned} \widehat{Z}_1(x) = \int \cdots \int e^{V_1(x, x_1,\ldots , x_N)/\sigma }\,dx_1\ldots dx_N. \end{aligned}$$

Equation (19) generalises the expression for the effective diffusion coefficient for a two-scale potential that was derived in [56] without any appeal to homogenization theory. In higher dimensions we will not be able to obtain an explicit expression for \(\mathcal {M}(x)\); however, we are able to obtain bounds on the eigenvalues of \(\mathcal {M}(x)\). In particular, we are able to show that the right-hand side of (19) acts as a lower bound for the eigenvalues of \(\mathcal {M}(x)\).

Proposition 2

The effective diffusion tensor \(\mathcal {M}\) is uniformly positive definite over \(\mathbb {R}^d\). In particular,

$$\begin{aligned} 0 < \,e^{-\text{ osc }(V_1)/\sigma } \le \frac{1}{Z_1(x)\widehat{Z}_1(x)} \le e\cdot \mathcal {M}(x)e \le 1,\quad x \in \mathbb {R}^d, \end{aligned}$$
(20)

for all \(e \in \mathbb {R}^d\) such that \(|e|=1\), where

$$\begin{aligned} \text{ osc }(V_1) = \sup _{\begin{array}{c} x\in \mathbb {R}^d, \\ y_1,\ldots , y_N \in \mathbb {T}^d \end{array}} V_1(x,y_1,\ldots , y_N) - \inf _{\begin{array}{c} x\in \mathbb {R}^d, \\ y_1,\ldots , y_N \in \mathbb {T}^d \end{array}} V_1(x,y_1,\ldots , y_N). \end{aligned}$$

This result follows immediately from Lemmas 1 and 2 which are proved in Sect. 3.
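As a numerical illustration of (19) and of the bounds (20) (a sketch with assumed data, not taken from the paper), consider two separable cosine fluctuations \(V_1(y_1,y_2) = \alpha \cos (2\pi y_1) + \beta \cos (2\pi y_2)\); then \(Z_1 = \widehat{Z}_1 = I_0(\alpha /\sigma )I_0(\beta /\sigma )\), so (19) gives \(\mathcal {M} = [I_0(\alpha /\sigma )I_0(\beta /\sigma )]^{-2}\), which indeed lies between \(e^{-\text{ osc }(V_1)/\sigma }\) and 1.

```python
import numpy as np
from scipy.integrate import dblquad
from scipy.special import i0

alpha, beta, sigma = 0.6, 0.3, 0.5

def V1(y1, y2):
    """Assumed separable two-scale perturbation (independent of the slow variable)."""
    return alpha * np.cos(2 * np.pi * y1) + beta * np.cos(2 * np.pi * y2)

# Z_1 and its counterpart Z_1-hat from Proposition 1, by quadrature over (0,1)^2.
Z1, _  = dblquad(lambda y2, y1: np.exp(-V1(y1, y2) / sigma), 0, 1, 0, 1)
Z1h, _ = dblquad(lambda y2, y1: np.exp(+V1(y1, y2) / sigma), 0, 1, 0, 1)

M_quad   = 1.0 / (Z1 * Z1h)                                   # formula (19)
M_bessel = 1.0 / (i0(alpha / sigma) * i0(beta / sigma))**2    # closed form
osc      = 2.0 * (alpha + beta)                               # osc(V_1) for this V_1

print("M via (19) :", M_quad, "   closed form:", M_bessel)
print("bounds (20):", np.exp(-osc / sigma), "<=", M_quad, "<= 1")
```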

Remark 2

The bounds in (20) highlight the two extreme possibilities for fluctuations occurring in the potential \(V^\epsilon \). The lower bound \(\frac{1}{Z_1(x)\widehat{Z}_1(x)} = e\cdot \mathcal {M}(x)e\) is attained when the multiscale fluctuations \(V_1(x_0, \ldots , x_N)\) are constant in all but one dimension (e.g. the analogue of a layered composite material, [12, Sect. 5.4], [46, Sect. 12.6.2]). In the other extreme, the upper bound \(e\cdot \mathcal {M}(x) e = 1\) is attained in the absence of fluctuations, i.e. when \(V_1 = 0\).

Remark 3

Clearly, the lower bound in (20) becomes exponentially small in the limit as \(\sigma \rightarrow 0\).

While Theorem 1 guarantees weak convergence of \(X_t^\epsilon \) to \(X_t^0\) in \(C([0,T];\mathbb {R}^d)\) for fixed T, it makes no claim about the behaviour as \(t\rightarrow \infty \), i.e. about the convergence of \(\mu ^\epsilon \) to \(\mu ^0\). However, under the conditions of Assumption 1 we can show that \(\mu ^\epsilon \) converges weakly to \(\mu ^0\), so that the \(T\rightarrow \infty \) and \(\epsilon \rightarrow 0\) limits commute, in the sense that:

$$\begin{aligned} \lim _{\epsilon \rightarrow 0}\lim _{T\rightarrow \infty }\mathbb {E}[f(X_T^\epsilon )] = \lim _{T\rightarrow \infty }\lim _{\epsilon \rightarrow 0}\mathbb {E}[f(X_T^\epsilon )], \end{aligned}$$

for all \(f \in L^2(\mu _{ref})\).

Proposition 3

(Weak convergence of \(\mu ^\epsilon \) to \(\mu ^0\)) Suppose that Assumption 1 holds. Then for all \(f \in L^2(\mu _{ref})\),

$$\begin{aligned} \int _{\mathbb {R}^d} f(x)\,\mu ^{\epsilon }(dx) \rightarrow \int _{\mathbb {R}^d} f(x)\mu ^0(dx), \end{aligned}$$
(21)

as \(\epsilon \rightarrow 0\).
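The following sketch (illustrative only, with assumed parameter values) checks (21) for the one-dimensional two-scale potential \(V^\epsilon (x) = x^2/2 + \alpha \cos (2\pi x/\epsilon )\) and the observable \(f(x) = x^2\); here \(Z(x) \propto e^{-x^2/(2\sigma )}\), so \(\mu ^0\) is the Gaussian \(\mathcal {N}(0,\sigma )\) and the limiting value of the integral is \(\sigma \).

```python
import numpy as np

alpha, sigma = 0.8, 0.5
x = np.linspace(-8.0, 8.0, 400_001)   # fine grid resolving the fast oscillations

def second_moment(eps):
    """E_{mu^eps}[x^2] for V^eps(x) = x^2/2 + alpha*cos(2*pi*x/eps), by quadrature."""
    w = np.exp(-(x**2 / 2 + alpha * np.cos(2 * np.pi * x / eps)) / sigma)
    return np.trapz(x**2 * w, x) / np.trapz(w, x)

# For this potential mu^0 = N(0, sigma), so the integrals should approach sigma.
for eps in (0.5, 0.2, 0.1, 0.05, 0.02):
    print(f"eps = {eps:5.2f}   E_mu_eps[x^2] = {second_moment(eps):.6f}   limit = {sigma}")
```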

If Assumption 1 holds, then for every \(\epsilon > 0\), the potential \(V^\epsilon \) is confining, so that the process \(X_t^\epsilon \) is ergodic. If the “unperturbed” process defined by (11) converges to equilibrium exponentially fast in \(L^2(\mu _{ref})\), then so will \(X_t^\epsilon \) and \(X_t^0\). Moreover, we can relate the rates of convergence of the three processes. We will use the notation \(\text{ Var}_{\mu }(f) = \mathbb {E}_{\mu } (f - \mathbb {E}_{\mu } f)^2\) to denote the variance with respect to a measure \(\mu \).

Proposition 4

Suppose that Assumption 1 holds, let \(P_t\) be the semigroup associated with the dynamics (11), and suppose that \(\mu _{ref}(x) = \frac{1}{Z_{ref}}e^{-V_0(x)/\sigma }\) satisfies Poincaré’s inequality with constant \(\rho /\sigma \), i.e.

$$\begin{aligned} \text{ Var}_{\mu _{ref}}(f) \le \frac{\sigma }{\rho }\int |\nabla f(x)|^2\, \mu _{ref}(dx),\quad f \in H^1(\mu _{ref}), \end{aligned}$$
(22)

or equivalently,

$$\begin{aligned} \text{ Var}_{\mu _{ref}}\left( P_t f\right) \le e^{-2\rho t/\sigma } \text{ Var}_{\mu _{ref}}(f),\quad f \in L^2(\mu _{ref}), \end{aligned}$$
(23)

for all \(t \ge 0\). Let \(P_t^\epsilon \) and \(P_t^0\) denote the semigroups associated with the full dynamics (8) and homogenized dynamics (18), respectively. Then for all \(f \in L^2(\mu _{ref})\),

$$\begin{aligned} \text{ Var}_{\mu ^\epsilon }(P_t^\epsilon f) \le e^{-2\gamma t/\sigma }\text{ Var}_{\mu ^\epsilon }(f), \end{aligned}$$
(24)

and

$$\begin{aligned} \text{ Var}_{\mu ^0}(P_t^0 f) \le e^{-2{\widetilde{\gamma }}t/\sigma }\text{ Var}_{\mu ^0}(f). \end{aligned}$$
(25)

for \(\gamma = \rho \, e^{-\text{ osc } (V_1)/\sigma }\) and \(\widetilde{\gamma } = \rho e^{-2\text{ osc }(V_1)/\sigma }\).

The proof of Propositions 3 and 4 can be found in Sect. 4.

3 Properties of the Coarse-Grained Process

In this section we study the properties of the coefficients of the homogenized SDE (18) and its dynamics.

3.1 Separable Potentials

Consider the special case where the potential \(V^\epsilon \) is separable, in the sense that the fast scale fluctuations do not depend on the slow scale variable, i.e.

$$\begin{aligned} V(x_0, x_1, \ldots , x_N) = V_0(x_0) + V_1(x_1, x_2, \ldots , x_N). \end{aligned}$$

Then, it is clear from the construction of the effective diffusion tensor (14) that \(\mathcal {M}(x)\) will not depend on \(x \in \mathbb {R}^d\). Moreover, since

$$\begin{aligned} Z(x) = \int _{\mathbb {T}^d}\cdots \int _{\mathbb {T}^d} e^{-\frac{V_0(x) + V_1(y_1, \ldots , y_N)}{\sigma }}\,dy_1\ldots dy_N = \frac{1}{K} e^{-V_0(x)/\sigma }, \end{aligned}$$

where \(K = \left( \int _{\mathbb {T}^d}\cdots \int _{\mathbb {T}^d}\exp (-V_1(y_1,\ldots ,y_N)/\sigma )\,dy_1\,\ldots dy_N\right) ^{-1}\), it follows that the coarse-grained stationary distribution \(\mu ^0\) equals the stationary distribution \(\mu _{ref} \propto \exp (-V_0(x)/\sigma )\) of the process (11). For general multiscale potentials, however, \(\mu ^0\) will be different from \(\mu _{ref}\). Indeed, introducing multiscale fluctuations can dramatically alter the qualitative equilibrium behaviour of the process, including noise-induced transitions and noise-induced hysteresis, as has been studied for various examples in [15].

3.2 Variational Bounds on \(\mathcal {M}(x)\)

A first essential property is that the constructed matrices \(\mathcal {K}_N, \ldots , \mathcal {K}_1\) are positive definite over all parameters. For convenience, we shall introduce the following notation

$$\begin{aligned} \mathbb {X}_{k} := \mathbb {R}^d\times \left( \mathbb {T}^d\right) ^{k}, \end{aligned}$$
(26)

for \(k=1,\ldots , N\), and set \(\mathbb {X}_0 = \mathbb {R}^d\) for consistency. First we require the following existence and regularity result for a uniformly elliptic Poisson equation on \(\mathbb {T}^d\).

Lemma 1

For \(k=1,\ldots , N\), for \(x_0, \ldots , x_{k-1}\) fixed, the tensor \(\mathcal {K}_k(x_0, \ldots , x_{k-1}, \cdot )\) is uniformly positive definite and in particular satisfies, for all unit vectors \(e \in \mathbb {R}^d\),

$$\begin{aligned} \frac{1}{\widehat{Z}_k(x_0, x_1, \ldots , x_{k-1})} \le e\cdot \mathcal {K}_{k}(x_0, x_1, \ldots , x_{k-1}, x_k) \,e,\quad x_k \in \mathbb {T}^d, \end{aligned}$$
(27)

where

$$\begin{aligned} \widehat{Z}_k(x_0, x_1, \ldots , x_{k-1}) = \int \ldots \int e^{ V(x_0, x_1, \ldots , x_{k-1}, x_k, \ldots , x_N)/\sigma }\,dx_N dx_{N-1}\ldots dx_{k}, \end{aligned}$$

which is independent of \(x_k\).

Proof

We prove the result by induction on k starting from \(k = N\). For \(k = N\) the tensor \(\mathcal {K}_N\) is clearly uniformly positive definite for fixed \((x_0, \ldots , x_{N-1})\in \mathbb {X}_{N-1}\). By [8, Thms III.3.2, III.3.3] there exists a unique (up to a constant) solution \(\theta _N(x_0, x_1,\ldots , x_{N-1}, \cdot ) \in H^2(\mathbb {T}^d; \mathbb {R}^d)\) of (15). In particular,

$$\begin{aligned} \int _{\mathbb {T}^d} |\nabla _{x_N} \theta _N(x_0, x_1, \ldots , x_{N-1}, x_N)|_{F}^2\, dx_N< \infty , \end{aligned}$$

where \(|\cdot |_{F}\) denotes the Frobenius norm, so that \(\mathcal {K}_{N-1}\) is well defined. Fix \((x_0, \ldots , x_{N-2}) \in \mathbb {X}_{N-2}\). To show that \(\mathcal {K}_{N-1}(x_0, \ldots , x_{N-2}, \cdot )\) is uniformly positive definite on \(\mathbb {T}^d\) we first note that

$$\begin{aligned} \begin{aligned}&\int _{\mathbb {T}^d}(I + \nabla _{x_N} \theta _N)^\top (I + \nabla _{x_N} \theta _N) e^{-V/\sigma }\,dx_N \\&\quad = \int _{\mathbb {T}^d} \left( I + \nabla _{x_N}\theta _N + \nabla _{x_N}\theta _N^\top + \nabla _{x_N}\theta _N^\top \nabla _{x_N}\theta _N\right) e^{-V/\sigma }dx_N, \end{aligned} \end{aligned}$$
(28)

where \(V = V(x_0, x_1,\ldots , x_N)\) and \(\top \) denotes the transpose. From the Poisson equation for \(\theta _N\) we have

$$\begin{aligned} \int \theta _N \otimes \nabla _{x_N}^{\top }(e^{-V/\sigma }(\nabla _{x_N}\theta _N + I))\,dx_N = \textbf{0}, \end{aligned}$$

from which we obtain, after integrating by parts:

$$\begin{aligned} \int _{\mathbb {T}^d} \nabla _{x_N}\theta _N^\top \Big (\nabla _{x_N}\theta _N +I\Big )e^{-V/\sigma }\,dx_N =0. \end{aligned}$$
(29)

From (28) and (29) we deduce that

$$\begin{aligned} \mathcal {K}_{N-1}&=\int _{\mathbb {T}^d} \left( I + \nabla _{x_N}\theta _N\right) e^{-V/\sigma }\,dx_N\\&= \int _{\mathbb {T}^d} \Big [I + \nabla _{x_N}\theta _N + \nabla _{x_N}\theta _N^\top \,(\nabla _{x_N}\theta _N+I)\Big ]\, e^{-V/\sigma }dx_N \\&=\int _{\mathbb {T}^d} (I + \nabla _{x_N} \theta _N)^\top (I + \nabla _{x_N} \theta _N) e^{-V/\sigma }\,dx_N. \end{aligned}$$

Thus \(\mathcal {K}_{N-1}\) is well-defined and symmetric. We note that

$$\begin{aligned} \int _{\mathbb {T}^d} (I + \nabla _{x_N} \theta _N)\,dx_N = I, \end{aligned}$$

therefore, it follows by Hölder’s inequality that

$$\begin{aligned} |v|^2 = \left| v^\top \int _{\mathbb {T}^d} (I + \nabla _{x_N} \theta _N)\,dx_N\right| ^2 \le v^{\top }\left( \mathcal {K}_{N-1} \right) v \left( \int _{\mathbb {T}^d} e^{V/\sigma }\,dx_N\right) , \end{aligned}$$

so that

$$\begin{aligned} \frac{|v|^2}{\widehat{Z}_N(x_0, \ldots , x_{N-1})} \le v^{\top }\mathcal {K}_{N-1}(x_0,\ldots , x_{N-1}) v,\quad \forall (x_0, x_1, \ldots , x_{N-1}). \end{aligned}$$

Since \(\widehat{Z}_N\) is uniformly bounded in \((x_0,\ldots , x_{N-1})\), it follows that \(\mathcal {K}_{N-1}(x_0, \ldots , x_{N-2}, \cdot )\) is uniformly positive definite, and arguing as above we establish existence of a unique \(\theta _{N-1}\), up to a constant, solving (15).

Now, assume that the corrector \(\theta _{N-k+1}\) has been constructed, and so \(\mathcal {K}_{N-k+1}\) is well defined. By multiplying the cell equation for \(\theta _{N-k+1}\)

$$\begin{aligned} \nabla _{x_{N-k+1}} \cdot \Big [\mathcal {K}_{N-k+1}(\nabla _{x_{N-k+1}}\theta _{N-k+1}+I)\Big ]=0, \end{aligned}$$

by \(\theta _{N-k+1}\) then integrating with respect to \(x_{N-k+1}\) and using integration by parts as well as the symmetry of \(\mathcal {K}_{N-k+1}\) from the inductive hypothesis we obtain

$$\begin{aligned} \int \nabla _{x_{N-k+1}}\theta _{N-k+1}^\top \mathcal {K}_{N-k+1}\left( I + \nabla _{x_{N-k+1}}\theta _{N-k+1}\right) \,dx_{N-k+1} = \textbf{0}. \end{aligned}$$

Therefore, we have

$$\begin{aligned} \mathcal {K}_{N-k}&=\int _{\mathbb {T}^d}\mathcal {K}_{N-k+1}(I+\nabla _{x_{N-k+1}}\theta _{N-k+1})\,dx_{N-k+1}\\&=\int _{\mathbb {T}^d}\Big [\mathcal {K}_{N-k+1}(I+\nabla _{x_{N-k+1}}\theta _{N-k+1})+\nabla _{x_{N-k+1}} \theta _{N-k+1}^\top \mathcal {K}_{N-k+1}\\ {}&\quad \times (I+\nabla _{x_{N-k+1}}\theta _{N-k+1})\Big ]\,dx_{N-k+1}\\&=\int _{\mathbb {T}^d}(I+\nabla _{x_{N-k+1}}\theta _{N-k+1})^\top \mathcal {K}_{N-k+1} (I+\nabla _{x_{N-k+1}}\theta _{N-k+1}) \,dx_{N-k+1}. \end{aligned}$$

Thus \(\mathcal {K}_{N-k}\) is also well-defined and symmetric. To show (27) we note that

$$\begin{aligned} \int \cdots \int (I + \nabla _{x_{N}}\theta _N) \cdots (I + \nabla _{x_{N-k}}\theta _{N-k}) dx_N\ldots dx_{N-k} = I. \end{aligned}$$

Therefore, for any vector \(v \in \mathbb {R}^d\):

$$\begin{aligned} |v|^2&= \left| v^\top \left( \int \cdots \int (I + \nabla _{x_{N}}\theta _N) \cdots (I + \nabla _{x_{N-k}} \theta _{N-k}) dx_N\ldots dx_{N-k}\right) \right| ^2\\&\le v^\top \left( \int \cdots \int (I+\nabla _{x_{N-k}}\theta _{N-k})^\top \cdots (I+\nabla _{x_{N-k}} \theta _{N-k})e^{-V/\sigma }dx_N\ldots dx_{N-k}\right) \\ {}&\quad \times v \int e^{V/\sigma } dx_N\ldots dx_{N-k} \\&= \left( v^\top \mathcal {K}_{N-k}(x_0, \ldots , x_{N-k})v\right) \widehat{Z}(x_0, \ldots , x_{N-k}). \end{aligned}$$

The fact that we have strict positivity then follows immediately. \(\square \)

To obtain upper bounds for the effective diffusion coefficient, we will express the intermediary diffusion tensors \(\mathcal {K}_i\) as solutions of a quadratic variational problem. This variational formulation of the diffusion tensors can be considered as a generalisation of the analogous representation for the effective conductivity coefficient of a two-scale composite material, see for example [8, 29, 36].

Lemma 2

For \(i=1,\ldots , N\), the tensor \(\mathcal {K}_{i}\) satisfies

$$\begin{aligned} \begin{aligned} e\cdot \mathcal {K}_{i}(x_0, \ldots , x_{i})e&= \inf _{\begin{array}{c} v_{i+1} \in C(\mathbb {X}_{i}; H^1(\mathbb {T}^d)) \\ \vdots \\ v_N \in C(\mathbb {X}_{N-1}; H^1(\mathbb {T}^d)) \end{array}}\int _{(\mathbb {T}^d)^{N-i}} \left| e + \nabla _{x_{i+1}} v_{i+1}(x_0, \ldots , x_{i+1}) + \ldots + \nabla _{x_N} v_{N}(x_0,\ldots , x_{N})\right| ^2 \\ {}&\quad \times e^{-V(x_0,\ldots , x_N)/\sigma }\,dx_N\ldots dx_{i+1}, \end{aligned} \end{aligned}$$
(30)

for all \(e \in \mathbb {R}^d\).

Proof

For \(i=1,\ldots , N\), from the proof of Lemma 1 we can express the intermediary diffusion tensor \(\mathcal {K}_i\) in the following recursive manner,

$$\begin{aligned} \mathcal {K}_{i} (x_0, \ldots , x_{i})&= \int _{\mathbb {T}^d}(I+ \nabla _{x_{i+1}}\theta _{i+1}(x_0,\ldots , x_i,x_{i+1}))^\top \\ {}&\quad \times \mathcal {K}_{i+1}(x_0,\ldots , x_{i+1})(I + \nabla _{x_{i+1}}\theta _{i+1}(x_0,\ldots , x_{i+1}))\, d x_{i+1}. \end{aligned}$$

Consider the tensor \(\widetilde{\mathcal {K}}_{i}\) defined by the following symmetric minimization problem

$$\begin{aligned} \begin{aligned} e\cdot \widetilde{\mathcal {K}}_{i}(x_0, \ldots , x_{i})e&= \inf _{v \in C(\mathbb {X}_{i}; H^1(\mathbb {T}^d))} \int _{\mathbb {T}^d} (e + \nabla v(x_0, \ldots ,x_{i+1}))\cdot \mathcal {K}_{i+1}(x_0, \ldots , x_{i+1})\\ {}&\quad \times (e + \nabla v(x_0, \ldots ,x_{i+1}))\,dx_{i+1}. \end{aligned} \end{aligned}$$
(31)

Since \(\mathcal {K}_{i+1}\) is a symmetric tensor, the corresponding Euler–Lagrange equation for the minimiser is given by

$$\begin{aligned} \nabla _{x_{i+1}}\cdot \left( \mathcal {K}_{i+1}(x_0, \ldots , x_{i+1})(\nabla _{x_{i+1}}\chi (x_0, \ldots , x_{i+1}) + e)\right) = 0, \quad x_{i+1} \in \mathbb {T}^d, \end{aligned}$$

with periodic boundary conditions. This equation has a unique mean zero solution given by \(\chi (x_0, \ldots , x_{i+1}) = \theta _{i+1}(x_0,\ldots , x_{i+1})^\top e\), where \(\theta _{i+1}\) is the unique mean-zero solution of (15). It thus follows that \(e^\top \mathcal {K}_{i}e = e^\top \widetilde{\mathcal {K}}_{i}e\), where \(\widetilde{\mathcal {K}}_{i}\) is given by (31). Consider now the minimisation problem

$$\begin{aligned} \begin{aligned} \inf _{\begin{array}{c} v_2 \in C(\mathbb {X}_{i}; H^1(\mathbb {T}^d)) \\ v_1 \in C(\mathbb {X}_{i+1}; H^1(\mathbb {T}^d)) \end{array}}&\int _{\mathbb {T}^d}\int _{\mathbb {T}^d} (e + \nabla _{x_{i+2}} v_1(x_0, \ldots ,x_{i+2})+ \nabla _{x_{i+1}} v_2(x_0, \ldots ,x_{i+1}))^\top \\&\times \mathcal {K}_{i+2}(x_0, \ldots , x_{i+2})(e + \nabla _{x_{i+2}} v_1(x_0, \ldots ,x_{i+2})+ \nabla _{x_{i+1}} v_2(x_0, \ldots ,x_{i+1}))\\ {}&\times dx_{i+2} dx_{i+1}. \end{aligned} \end{aligned}$$

Optimising over \(v_1\) for \(v_2\) fixed it follows that \(v_1 = (e + \nabla _{x_{i+1}}v_2)^\top \theta _{i+2}\), where \(\theta _{i+2}\) is the unique mean-zero solution of (15). Thus, the above minimisation can be written as

$$\begin{aligned} \begin{aligned} \inf _{\begin{array}{c} v_2 \in C(\mathbb {X}_{i}; H^1(\mathbb {T}^d)) \end{array}}&\int _{\mathbb {T}^d} \int _{\mathbb {T}^d} (e + \nabla _{x_{i+1}} v_2(x_0, \ldots ,x_{i+1}))^\top (I + \nabla _{x_{i+2}}\theta _{i+2})^{\top } \\&\quad \times \mathcal {K}_{i+2}(x_0, \ldots , x_{i+2})(I + \nabla _{x_{i+2}} \theta _{i+2})\\ {}&\quad \times (e + \nabla _{x_{i+1}} v_2(x_0, \ldots ,x_{i+1}))\,dx_{i+2} dx_{i+1} \\&= \inf _{\begin{array}{c} v_2 \in C(\mathbb {X}_{i}; H^1(\mathbb {T}^d)) \end{array}} \int _{\mathbb {T}^d} (e + \nabla _{x_{i+1}} v_2(x_0, \ldots ,x_{i+1}))^\top \mathcal {K}_{i+1}(x_0, \ldots , x_{i+1})\\ {}&\quad \times (e + \nabla _{x_{i+1}} v_2(x_0, \ldots ,x_{i+1}))\,dx_{i+1} \\&= e^{\top } \mathcal {K}_{i}e. \end{aligned} \end{aligned}$$

Proceeding recursively, we arrive at the advertised result (30). \(\square \)
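As a small illustration of how (30) can be used in practice (a sketch with assumed data, not part of the paper), take a single fast scale in \(d=2\) with an \(x_0\)-independent, laminate-type perturbation \(V_1(y) = \alpha \cos (2\pi y_1)\) (the constant factor \(e^{-V_0/\sigma }\) then factors out of (30) and is omitted below), and the one-parameter trial function \(v_c(y) = c\sin (2\pi y_1)\). Every trial function yields an upper bound on \(e_1\cdot \mathcal {K}e_1\), and for a laminate the exact value is the harmonic mean \(\big (\int _0^1 e^{V_1(y_1)/\sigma }\,dy_1\big )^{-1} = 1/I_0(\alpha /\sigma )\), so one can watch the bound tighten as the trial function is optimised.

```python
import numpy as np
from scipy.optimize import minimize_scalar
from scipy.special import i0

# Variational formula (30), single fast scale, laminate V_1(y) = alpha*cos(2*pi*y_1).
# The trial function v_c(y) = c*sin(2*pi*y_1) is an assumption made for illustration.
alpha, sigma = 0.8, 0.5
y = np.linspace(0.0, 1.0, 20_001)
weight = np.exp(-alpha * np.cos(2 * np.pi * y) / sigma)   # e^{-V_1/sigma}

def dirichlet_energy(c):
    """int |e_1 + grad v_c|^2 e^{-V_1/sigma} dy  (the y_2 integral is trivial)."""
    grad = 1.0 + 2.0 * np.pi * c * np.cos(2 * np.pi * y)
    return np.trapz(grad**2 * weight, y)

res = minimize_scalar(dirichlet_energy, bounds=(-1.0, 1.0), method="bounded")
print("trial v = 0          :", dirichlet_energy(0.0))      # = I_0(alpha/sigma)
print("optimised sine trial :", res.fun)
print("exact e_1 . K e_1    :", 1.0 / i0(alpha / sigma))    # harmonic mean
```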

4 Properties of the Equilibrium Distributions

In this section we study in more detail the properties of the equilibrium distributions \(\mu ^\epsilon \) and \(\mu ^0\) of the full (8) and homogenized dynamics (18), respectively. We first provide a proof of Proposition 3. The approach we follow in this proof is based on properties of periodic functions, in a manner similar to [12, Chap. 2].

Proof of Proposition 3

Let \(f \in L^2(\mu _{ref})\) and \(\delta > 0\). Clearly \(C^\infty _c(\mathbb {R}^d)\) is dense in \(L^2(\mu _{ref})\) and so, by Assumption 1, there exists \(f_{\delta } \in C^\infty _c(\mathbb {R}^d)\) such that

$$\begin{aligned} \left|\int _{\mathbb {R}^d} f(x)e^{-V^\epsilon (x)/\sigma }\,dx - \int _{\mathbb {R}^d} f_{\delta }(x)e^{-V^\epsilon (x)/\sigma }\,dx\right|\le \frac{\delta }{3}, \end{aligned}$$
(32)

and

$$\begin{aligned} \left|\int _{\mathbb {R}^d} \int _{\mathbb {T}^d}\cdots \int _{\mathbb {T}^d} (f_{\delta }(x) - f(x))e^{-V(x, y_1, \ldots , y_N)/\sigma }\,dy_N\ldots \, dy_1\, dx \right|\le \frac{\delta }{3}, \end{aligned}$$
(33)

uniformly with respect to \(\epsilon \). Now, we partition \(\mathbb {R}^d\) into pairwise disjoint translations of the scaled cell \(\epsilon ^N[0,1]^d\) as \(\mathbb {R}^d = \cup _{k\in \mathbb {N}} Y_k\), where

$$\begin{aligned} Y_k = \epsilon ^N x_k + \epsilon ^N [0,1]^d, \end{aligned}$$

where \(\lbrace x_k \rbrace _{k\ge 0}\) is an enumeration of \(\mathbb {Z}^d\). With this decomposition we obtain

$$\begin{aligned} \int _{\mathbb {R}^d} f_{\delta }(x)e^{-V^\epsilon (x)/\sigma }\,dx&= \sum _{k\in \mathbb {N}} \int _{Y_k} f_{\delta }(x)e^{-V^\epsilon (x)/\sigma }\,dx\\&= \epsilon ^{Nd}\sum _{k\in \mathbb {N}} \int _{[0,1]^d} f_{\delta } (\epsilon ^N(x_k + y))e^{-V(\epsilon ^N(x_k+y), \ldots , \epsilon (x_k+y), y)/\sigma }\,dy, \end{aligned}$$

where in the last equality we use the periodicity of V with respect to the last variable. Since the integrand is smooth with compact support, we can Taylor expand around \(\epsilon ^N x_k\) to obtain

$$\begin{aligned} \int _{\mathbb {R}^d} f_{\delta }(x)e^{-V^\epsilon (x)/\sigma }\,dx&= \epsilon ^{Nd}\sum _{k\in \mathbb {N}} \int _{[0,1]^d} f_{\delta }(\epsilon ^N x_k)e^{-V(\epsilon ^N x_k, \ldots , \epsilon x_k, y)/\sigma }\,dy + C\epsilon , \end{aligned}$$

where C is a constant depending on the derivatives of V with respect to the first N variables, and the volume of the support of \(f_{\delta }\).

Noting that the above sum is a Riemann sum approximation, we can write

$$\begin{aligned}&\epsilon ^{Nd}\sum _{k\in \mathbb {N}} \int _{[0,1]^d} f_{\delta }(\epsilon ^N x_k)e^{-V(\epsilon ^N x_k, \ldots , \epsilon x_k, y)/\sigma }\,dy \\&\quad = \epsilon ^{Nd}\sum _{k\in \mathbb {N}} \int _{[0,1]^d}\int _{[0,1]^d} f_{\delta }(\epsilon ^N (x_k+y'))e^{-V(\epsilon ^N (x_k+y'), \ldots , \epsilon (x_k+y'), y)/\sigma }\,dy\,dy' + C_1\epsilon \\&\quad = \int _{\mathbb {R}^d}\int _{[0,1]^d} f_{\delta }(x)e^{-V(x, \ldots , x/\epsilon ^{N-1}, y)/\sigma }\,dy\,dx + {C}_1\epsilon , \end{aligned}$$

where \({C}_1\) is a constant. Repeating the above process \(N-1\) times, we obtain that

$$\begin{aligned} \int _{\mathbb {R}^d} f_{\delta }(x)e^{-V^\epsilon (x)/\sigma }\,dx = \int _{\mathbb {R}^d} \int _{\mathbb {T}^d}\cdots \int _{\mathbb {T}^d} f_{\delta }(x)e^{-V(x, y_1, \ldots , y_N)/\sigma }\,dy_N\ldots \, dy_1 \,dx+ {C}_N\epsilon ,\nonumber \\ \end{aligned}$$
(34)

where \({C}_N>0\) is a constant depending on the support of \(f_{\delta }\) and on the derivatives of V with respect to the first N variables. Thus, choosing \(\epsilon < \delta /(3{C}_N)\) and combining (32), (33) and (34) we obtain

$$\begin{aligned} \left|\int _{\mathbb {R}^d} f(x)e^{-V^\epsilon (x)/\sigma }\,dx - \int _{\mathbb {R}^d} \int _{\mathbb {T}^d}\cdots \int _{\mathbb {T}^d} f(x)e^{-V(x, y_1, \ldots , y_N)/\sigma }\,dy_N\ldots \, dy_1\,dx\right|\le \delta .\nonumber \\ \end{aligned}$$
(35)

Choosing \(f \equiv 1\) we obtain immediately that

$$\begin{aligned} Z^\epsilon = \int _{\mathbb {R}^d} e^{-V^\epsilon (x)/\sigma }\,dx \rightarrow Z^0 = \int _{\mathbb {R}^d} \int _{\mathbb {T}^d}\cdots \int _{\mathbb {T}^d} e^{-V(x, y_1, \ldots , y_N)/\sigma }\,dy_N\ldots dy_1 \, dx, \end{aligned}$$

and so for \(f \in L^2(\mu _{ref})\) we obtain

$$\begin{aligned} \int f(x)\mu ^\epsilon (x)\,dx \rightarrow \int f(x)\mu ^0(x)\,dx, \end{aligned}$$

as \(\epsilon \rightarrow 0\), as required. \(\square \)

Proof of Proposition 4

Since \(V_1\) is bounded uniformly by Assumption 1, it is straightforward to check that

$$\begin{aligned} \mu _{ref}(x)e^{-{osc}(V_1)/\sigma } \le \mu ^\epsilon (x) \le \mu _{ref}(x)e^{{osc}(V_1)/\sigma }. \end{aligned}$$
(36)

It follows from the discussion following [5, Prop 4.2.7] that \(\mu ^\epsilon \) satisfies Poincaré’s inequality with constant

$$\begin{aligned} \frac{\gamma }{\sigma } = \frac{\rho }{\sigma }\, e^{-{osc}(V_1)/\sigma }, \end{aligned}$$

which implies (24). An identical argument follows for the coarse-grained density \(\mu ^0(x)\). Finally, by (20) of Proposition 2 we have \(|v|^2 e^{-{osc}(V_1)/\sigma } \le v\cdot \mathcal {M}(x)v\), for all \(v \in \mathbb {R}^d\), and so

$$\begin{aligned} \text{ Var}_{\mu ^0}(f)&\le \frac{\sigma }{\rho } e^{{osc}(V_1)/\sigma }\int _{\mathbb {R}^d} |\nabla f(x) |^2 \,\mu ^0(x)\,dx \\&\le \frac{\sigma }{\rho }e^{2 {osc}(V_1)/\sigma }\int \nabla f(x)\cdot \mathcal {M}(x) \nabla f(x)\, \mu ^0(x)\,dx, \end{aligned}$$

from which (25) follows. \(\square \)

Remark 4

Note that one can relate the constants in the logarithmic Sobolev inequalities for the measures \(\mu _{ref}\), \(\mu ^\epsilon \) and \(\mu ^0\) in an almost identical manner, based on the Holley–Stroock criterion [26].

Remark 5

Proposition 4 requires the assumption that the multiscale perturbation \(V_1\) is bounded uniformly. If this is relaxed, then it is no longer guaranteed that \(\mu ^\epsilon \) will satisfy a Poincaré inequality, even though \(\mu _{ref}\) does. Consider, for example, the one-dimensional potential

$$\begin{aligned} V^\epsilon (x) = x^2(1 + \alpha \cos (x/\epsilon )), \end{aligned}$$

Then the corresponding Gibbs distribution \(\mu ^\epsilon (x)\) will not satisfy Poincaré’s inequality for any \(\epsilon > 0\). Following [25, Appendix A], we demonstrate this by checking that this choice of \(\mu ^\epsilon \) does not satisfy the Muckenhoupt criterion [2, 38], which is necessary and sufficient for the Poincaré inequality to hold, namely that \(\sup _{r \in \mathbb {R}}B_{\pm }(r) < \infty \), where

$$\begin{aligned} B_{\pm }(r) = \left( \int _{r}^{\pm \infty } \mu ^\epsilon (x)\,dx\right) ^{\frac{1}{2}}\left( \int _{[0, \pm r]} \frac{1}{\mu ^\epsilon (x)}\,dx\right) ^{\frac{1}{2}}. \end{aligned}$$

Given \(n \in \mathbb {N}\), we set \(r/\epsilon = 2\pi n + \pi /2\). Then we have that

$$\begin{aligned} B_{+}(r)&\ge \left( \int _{\epsilon (2\pi n + 2\pi /3)}^{\epsilon (2\pi n + 4\pi /3)}e^{-|x|^2(1 -\alpha /2)/\sigma }\,dx\right) ^{1/2}\left( \int _{\epsilon (2\pi n - \pi /3)}^{\epsilon (2\pi n + \pi /3)}e^{|x|^2(1 +\alpha /2)/\sigma }\,dx\right) ^{1/2} \\&\ge \left( \frac{ 2\pi \epsilon }{3}\right) \exp \left( -\frac{|\pi \epsilon (2n + 4/3)|^2}{2\sigma }\left( 1 - \frac{\alpha }{2}\right) + \frac{|\pi \epsilon (2n-1/3)|^2}{2\sigma }\left( 1 + \frac{\alpha }{2}\right) \right) \\&= \left( \frac{ 2\pi \epsilon }{3}\right) \exp \left( -\frac{|2\pi \epsilon n|^2\left( 1 + \frac{2}{3n}\right) ^2}{2\sigma }\left( 1 - \frac{\alpha }{2}\right) + \frac{|2\pi \epsilon n|^2\left( 1-\frac{1}{6n}\right) ^2}{2\sigma }\left( 1 + \frac{\alpha }{2}\right) \right) \\&\approx \left( \frac{ 2\pi \epsilon }{3}\right) \exp \left( \frac{|2\pi \epsilon n|^2}{2\sigma }\left( \alpha + o(n^{-1})\right) \right) \rightarrow \infty , \quad \text{ as } n \rightarrow \infty , \end{aligned}$$

so that Poincaré’s inequality does not hold for \(\mu ^\epsilon \).
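A quick numerical illustration of this blow-up (our sketch, with assumed values of \(\alpha \), \(\sigma \) and \(\epsilon \)) evaluates \(B_{+}(r)\) along the sequence \(r = \epsilon (2\pi n + \pi /2)\); note that the normalisation constant of \(\mu ^\epsilon \) cancels in the product, so unnormalised weights can be used.

```python
import numpy as np

alpha, sigma, eps = 0.5, 1.0, 0.1

def V(x):
    """The potential of Remark 5 with an unbounded multiscale perturbation."""
    return x**2 * (1.0 + alpha * np.cos(x / eps))

def B_plus(r, n_grid=200_001):
    """Muckenhoupt quantity B_+(r); the normalisation of mu^eps cancels."""
    x_tail = np.linspace(r, r + 30.0, n_grid)     # truncated tail integral
    x_bulk = np.linspace(0.0, r, n_grid)
    tail = np.trapz(np.exp(-V(x_tail) / sigma), x_tail)
    bulk = np.trapz(np.exp(+V(x_bulk) / sigma), x_bulk)
    return np.sqrt(tail * bulk)

for n in (2, 5, 10, 15):
    r = eps * (2 * np.pi * n + np.pi / 2)
    print(f"n = {n:3d}   r = {r:6.3f}   B_+(r) = {B_plus(r):.3e}")
```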

A natural question to ask is whether the weak convergence of \(\mu ^\epsilon \) to \(\mu ^0\) holds in a stronger metric, such as total variation. The following simple one-dimensional example demonstrates that the convergence cannot be strengthened to total variation.

Example 1

Consider the one dimensional Gibbs distribution

$$\begin{aligned} \mu ^\epsilon (x) = \frac{1}{Z^\epsilon }e^{-V^\epsilon (x)/\sigma }, \end{aligned}$$

where

$$\begin{aligned} V^\epsilon (x) = \frac{x^2}{2} + \alpha \cos \left( \frac{x}{\epsilon }\right) , \end{aligned}$$

and where \(Z^\epsilon \) is the normalization constant and \(\alpha \ne 0\). Then the measure \(\mu ^\epsilon \) converges weakly to \(\mu ^0\) given by

$$\begin{aligned} \mu ^0(x) = \frac{1}{\sqrt{2\pi \sigma }}e^{-x^2/2\sigma }. \end{aligned}$$

From the plots of the stationary distributions in Fig. 2a it becomes clear that the density of \(\mu ^\epsilon \) exhibits rapid fluctuations which do not appear in \(\mu ^0\); thus we do not expect to obtain convergence in a stronger metric. First we consider the distance between \(\mu ^\epsilon \) and \(\mu ^0\) in total variation:

$$\begin{aligned} \Vert \mu ^\epsilon - \mu ^0 \Vert _{TV}&= \int _{\mathbb {R}}|\mu ^\epsilon (x) - \mu ^0(x) |\,dx = \int _{\mathbb {R}}\frac{e^{-x^2/2\sigma }}{\sqrt{2\pi \sigma }}\left| 1 - \frac{e^{-\frac{\alpha }{\sigma } \cos (x/\epsilon )}}{K^\epsilon }\right| \,dx, \end{aligned}$$

where \(K^\epsilon = Z^\epsilon /\sqrt{2\pi \sigma }\). It follows that

$$\begin{aligned} \Vert \mu ^\epsilon - \mu ^0 \Vert _{TV}&\ge \sum _{n\ge 0}\int _{\epsilon (2\pi n - \pi /3)}^{\epsilon (2\pi n + \pi /3)} \frac{e^{-x^2/2\sigma }}{\sqrt{2\pi \sigma }}\,dx\left| 1 - \frac{e^{-\frac{\alpha }{2\sigma }}}{K^\epsilon } \right| \\&\ge \sum _{n\ge 0}\frac{2\epsilon \pi }{3}\frac{e^{-\epsilon ^2( 2n\pi + \pi /3)^2/2\sigma }}{\sqrt{2\pi \sigma }}\left| 1 - \frac{e^{-\frac{\alpha }{2\sigma }}}{K^\epsilon } \right| \\&\ge \int _{0}^{\infty } \frac{2\pi }{3}\frac{e^{-2\pi ^2 (x + \epsilon /6)^2 /\sigma }}{\sqrt{2\pi \sigma }}\left| 1 - \frac{e^{-\frac{\alpha }{2\sigma }}}{K^\epsilon } \right| , \end{aligned}$$

where we use the fact that \(e^{-\alpha /2\sigma }/K^\epsilon \le 1\) for \(\epsilon \) sufficiently small. In the limit \(\epsilon \rightarrow 0\), we have \(K^\epsilon \rightarrow I_0(\alpha /\sigma )\), where \(I_n(\cdot )\) is the modified Bessel function of the first kind of order n. Therefore, as \(\epsilon \rightarrow 0\),

$$\begin{aligned} \Vert \mu ^\epsilon - \mu ^0 \Vert _{TV}\ge \int _{0}^{\infty } \frac{2\pi }{3}\frac{e^{-2\pi ^2 (x + \epsilon /6)^2 /\sigma }}{\sqrt{2\pi \sigma }}\left| 1 - \frac{e^{-\frac{\alpha }{2\sigma }}}{K^\epsilon } \right| = \frac{1}{6}\left| 1 - \frac{e^{-\frac{\alpha }{2\sigma }}}{I_0(\alpha /\sigma )} \right| , \end{aligned}$$
(37)

which converges to \(\frac{1}{6}\) as \(\frac{\alpha }{\sigma }\rightarrow \infty \). Since relative entropy controls the total variation distance by Pinsker’s inequality, it follows that \(\mu ^\epsilon \) does not converge to \(\mu ^0\) in relative entropy either. Nonetheless, we shall compute the limiting relative entropy between \(\mu ^\epsilon \) and \(\mu ^0\) to understand the influence of the parameters \(\sigma \) and \(\alpha \). Since both \(\mu ^0\) and \(\mu ^\epsilon \) have strictly positive densities with respect to the Lebesgue measure on \(\mathbb {R}\), we have that

$$\begin{aligned} \frac{d\mu ^\epsilon }{d\mu ^0}(x) = \frac{\sqrt{2\pi \sigma }}{Z^\epsilon }e^{-\frac{V^\epsilon (x)}{\sigma }+\frac{x^2}{2\sigma }}. \end{aligned}$$

Then, for \(Z^0 = \sqrt{2\pi \sigma }I_0(\alpha /\sigma )\),

$$\begin{aligned} H\left( \mu ^\epsilon \, | \, \mu ^0 \right)&= \frac{1}{Z^\epsilon }\int \left( \frac{1}{2}\log (2\pi \sigma )-\log Z^\epsilon \right) e^{-V^\epsilon (x)/\sigma }\,dx \\ {}&\qquad + \frac{1}{Z^\epsilon }\int \left( -V^\epsilon (x)/\sigma + x^2/2\sigma \right) e^{-V^\epsilon (x)/\sigma }\,dx\\&\xrightarrow {\epsilon \rightarrow 0} - \log I_0(\alpha /\sigma ) - \frac{\alpha }{\sigma Z^0}\lim _{\epsilon \rightarrow 0}\int \cos (x/\epsilon )e^{-x^2/2\sigma - \alpha \cos (x/\epsilon )/\sigma }\,dx \\&= -\log I_0(\alpha /\sigma ) + \frac{\alpha }{\sigma } \frac{I_1(\alpha /\sigma )}{ I_0(\alpha /\sigma )} =: K(\alpha /\sigma ), \end{aligned}$$

and it is straightforward to check that \(K(s) > 0\), and moreover

$$\begin{aligned} K(s) \rightarrow {\left\{ \begin{array}{ll} 0 &{} \text{ as } s\rightarrow 0 \\ +\infty &{} \text{ as } s\rightarrow \infty \end{array}\right. }. \end{aligned}$$

In Fig. 2b we plot the value of K(s) as a function of s. From this result, we see that for fixed \(\alpha > 0\), the measure \(\mu ^\epsilon \) will converge in relative entropy only in the limit as \(\sigma \rightarrow \infty \), while the measures will become increasingly mutually singular as \(\sigma \rightarrow 0\).
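The following sketch (ours, with assumed values of \(\alpha \) and \(\sigma \)) corroborates these limits numerically: it evaluates \(K(s) = s\,I_1(s)/I_0(s) - \log I_0(s)\) and compares it with \(H(\mu ^\epsilon \,|\,\mu ^0)\) computed by quadrature for decreasing \(\epsilon \).

```python
import numpy as np
from scipy.special import i0, i1

def K(s):
    """Limiting relative entropy K(s) = s*I_1(s)/I_0(s) - log I_0(s)."""
    return s * i1(s) / i0(s) - np.log(i0(s))

def rel_entropy(alpha, sigma, eps, L=10.0, n=2_000_001):
    """H(mu^eps | mu^0) by quadrature, for V^eps(x) = x^2/2 + alpha*cos(x/eps)."""
    x = np.linspace(-L, L, n)
    mu_eps = np.exp(-(x**2 / 2 + alpha * np.cos(x / eps)) / sigma)
    mu_eps /= np.trapz(mu_eps, x)
    mu_0 = np.exp(-x**2 / (2 * sigma)) / np.sqrt(2 * np.pi * sigma)
    return np.trapz(mu_eps * np.log(mu_eps / mu_0), x)

alpha, sigma = 1.0, 0.5
print("K(alpha/sigma) =", K(alpha / sigma))
for eps in (0.1, 0.02, 0.005):
    print(f"H(mu^eps | mu^0) at eps = {eps:5.3f} :", rel_entropy(alpha, sigma, eps))
```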

Fig. 2 Error between \(\mu ^\epsilon (x) \propto \exp (-V^\epsilon (x)/\sigma )\) and effective distribution \(\mu ^0\)

5 Proof of Weak Convergence

In this section we show that over finite time intervals [0, T], the process \(X_t^\epsilon \) converges weakly to a process \(X^0_t\) which is uniquely identified as the weak solution of a coarse-grained SDE. The approach we adopt is based on the classical martingale methodology of [8, Sect. 3]. The proof of the homogenization result is split into three steps.

1. We construct an appropriate test function which is used to decompose the fluctuations of the process \(X_t^\epsilon \) into a martingale part and a term which goes to zero as \(\epsilon \rightarrow 0\).

2. Using this test function, we demonstrate that the path measure \(\mathbb {P}^\epsilon \) corresponding to the family \(\left\{ \left( X_t^\epsilon \right) _{t\in [0,T]} \right\} _{0 < \epsilon \le 1}\) is tight on \(C([0,T];\mathbb {R}^d)\).

3. Finally, we show that any limit point of the family of measures must solve a well-posed martingale problem, and is thus unique.

The test functions will be constructed by solving a recursively defined sequence of Poisson equations on \(\mathbb {R}^d\). We first provide a general well-posedness result for this class of equations.

Proposition 5

Let \(\mathbb {X}_k, k=0,1,\ldots , N\) be the spaces defined in Sect. 3.2. For fixed \((x_0,\ldots , x_{k-1})\in \mathbb {X}_{k-1}\), let \(\mathcal {S}_k\) be the operator given by

$$\begin{aligned} \mathcal {S}_k u = \frac{1}{\rho (x_0, \ldots , x_k)}\nabla _{x_k}\cdot \left( \rho (x_0,\ldots , x_k)D(x_0, \ldots , x_k)\nabla _{x_k} u(x_0, \ldots , x_{k})\right) , \end{aligned}$$
(38)

for \(u\in C^2(\mathbb {T}^d)\), where \(\rho \) is a smooth and uniformly positive and bounded function, and D is a smooth and uniformly positive definite tensor on \(\mathbb {X}_{k}\). Let h be a smooth function with bounded derivatives, such that for each \((x_0,\ldots , x_{k-1})\in \mathbb {X}_{k-1}\):

$$\begin{aligned} \int _{\mathbb {T}^d} h(x_0, \ldots , x_k)\rho (x_0,\ldots , x_k)\,d{x_k} = 0. \end{aligned}$$
(39)

Then there exists a unique solution \(u \in C(\mathbb {X}_{k-1}; {H}^1(\mathbb {T}^d))\) to the Poisson equation on \(\mathbb {T}^d\) given by

$$\begin{aligned} \mathcal {S}_k u(x_0, \ldots , x_k) = h(x_0, \ldots , x_{k}),\quad \int _{\mathbb {T}^d}u(x_0, \ldots , x_k)\rho (x_0, \ldots , x_k)\,dx_k = 0. \end{aligned}$$
(40)

Moreover u is smooth and bounded with respect to the variable \(x_k \in \mathbb {T}^d\) as well as the parameters \(x_0, \ldots , x_{k-1} \in \mathbb {X}_{k-1}\).

Proof

Since \(\rho \) and D are strictly positive, for fixed values of \(x_0, \ldots , x_{k-1}\), the operator \(\mathcal {S}_k\) is uniformly elliptic, and since \(\mathbb {T}^d\) is compact, \(\mathcal {S}_k\) has compact resolvent in \(L^2(\mathbb {T}^d)\), see [18, Chap. 6] and [46, Chap. 7]. The nullspace of the adjoint \(\mathcal {S}_k^*\) is spanned by a single function \(\rho (x_0, \ldots , x_{k-1}, \cdot )\). By the Fredholm alternative, a necessary and sufficient condition for the existence of u is (39), which is assumed to hold. Thus, there exists a unique solution \(u(x_0,\ldots , x_{k-1},\cdot ) \in H^1(\mathbb {T}^d)\) having mean zero with respect to \(\rho (x_0, \ldots , x_{k})\). By elliptic estimates and Poincaré’s inequality, it follows that there exists \(C > 0\) satisfying

$$\begin{aligned} \Vert u(x_0, \ldots , x_{k-1}, \cdot ) \Vert _{H^1(\mathbb {T}^d)} \le C\Vert h(x_0, \ldots , x_{k-1}, \cdot ) \Vert _{L^2(\mathbb {T}^d)}, \end{aligned}$$

for all \((x_0,\ldots , x_{k-1})\in \mathbb {X}_{k-1}\). Since the components of D and \(\rho \) are smooth with respect to \(x_k\), standard interior regularity results [21] ensure that, for fixed \({x_0, \ldots , x_{k-1}\in \mathbb {X}_{k-1}}\), the function \(u(x_0, \ldots , x_{k-1},\cdot )\) is smooth. To prove smoothness and boundedness with respect to the parameters \(x_0,\ldots , x_{k-1}\), one can either argue as in [8], by showing that the finite-difference approximations of the derivatives of u with respect to the parameters converge, or differentiate directly the transition density of the semigroup associated with the generator \(\mathcal {S}_k\); see for example [43, 44, 55] as well as [21, Sec 8.4]. \(\square \)
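For readers who wish to experiment numerically, the following sketch (not part of the proof) discretises the one-dimensional instance of the Poisson problem (40) on the torus with second-order finite differences and returns the \(\rho \)-weighted mean-zero solution; the functions \(\rho \), D and h below are illustrative placeholders, with h centred so that (39) holds on the grid.

```python
import numpy as np

def solve_cell_poisson(rho, D, h, n=256):
    """Minimal 1-D sketch of Proposition 5: solve
        (1/rho) d/dy ( rho * D * du/dy ) = h   on the torus [0, 1),
    normalised so that the rho-weighted mean of u vanishes.
    Assumes the compatibility condition (39), i.e. int h*rho dy = 0."""
    y = np.arange(n) / n
    dy = 1.0 / n
    a = rho(y + 0.5*dy) * D(y + 0.5*dy)    # rho*D evaluated at the cell interfaces
    R = rho(y)

    # second-order periodic finite differences for the divergence-form operator
    A = np.zeros((n, n))
    for i in range(n):
        A[i, i]         = -(a[i] + a[i-1]) / (R[i] * dy**2)
        A[i, (i+1) % n] =  a[i]            / (R[i] * dy**2)
        A[i, i-1]       =  a[i-1]          / (R[i] * dy**2)

    b = h(y)
    u, *_ = np.linalg.lstsq(A, b, rcond=None)   # A is singular (constants), use least squares
    return y, u - (u @ R) / R.sum()             # enforce  int u*rho dy = 0


# illustrative data: rho ~ e^{-V_1/sigma} for a placeholder V_1, D = 1,
# and a right-hand side centred so that (39) holds on the grid
rho = lambda y: np.exp(-np.cos(2*np.pi*y))
D   = lambda y: np.ones_like(y)
grid = np.arange(256) / 256
c = (np.sin(2*np.pi*grid) * rho(grid)).sum() / rho(grid).sum()
h = lambda y: np.sin(2*np.pi*y) - c

y, u = solve_cell_poisson(rho, D, h)
print(np.abs((u * rho(y)).sum()) < 1e-10)       # weighted mean is (numerically) zero
```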

Remark 6

Suppose that the function h in Proposition 5 can be expressed as

$$\begin{aligned} h(x_0,\ldots , x_k) = a(x_0, x_1,\ldots , x_k)\cdot \nabla \phi _0(x_0) \end{aligned}$$

where a is smooth with all derivatives bounded. Then the mean-zero solution of (40) can be written as

$$\begin{aligned} u(x_0, x_1,\ldots , x_k) = \chi (x_0,x_1,\ldots , x_k)\cdot \nabla \phi _0(x_0), \end{aligned}$$
(41)

where \(\chi \) is the classical mean-zero solution to the following Poisson equation

$$\begin{aligned} \mathcal {S}_k\chi (x_0,\ldots , x_k) = a(x_0, \ldots ,x_k), \quad (x_0,\ldots , x_{k}) \in \mathbb {X}_{k}. \end{aligned}$$
(42)

This can be seen by checking directly that u given in (41), with \(\chi \) satisfying (42), solves (40); by uniqueness it is therefore the solution of (40). In particular, \(\chi \) is smooth and bounded over \(x_0, \ldots , x_k\), so that given a multi-index \(\alpha =(\alpha _0, \ldots , \alpha _k)\) on the indices \((0, \ldots , k)\), there exists \(C_{\alpha } > 0\) such that

$$\begin{aligned} |\nabla ^{\alpha }u(x_0, \ldots , x_k)|_F \le C_{\alpha }\sum _{l=0}^{\alpha _0}|\nabla ^{l+1}\phi _0(x_0)|_F,\quad \forall x_0,x_1,\ldots , x_k, \end{aligned}$$

where \(|\cdot |_F\) denotes the Frobenius norm. A similar decomposition is possible for

$$\begin{aligned} g(x_0,\ldots , x_k) = A(x_0, x_1,\ldots , x_k):\nabla ^2 \phi _0(x_0), \end{aligned}$$

where \(\nabla ^2\) denotes the Hessian.

5.1 Constructing the Test Functions

It is clear that we can rewrite (8) as

$$\begin{aligned} dX^\epsilon _t = -\sum _{i=0}^N \epsilon ^{-i}\nabla _{x_i} V\big (X^\epsilon _t,X^\epsilon _t/\epsilon , \ldots ,X^\epsilon _t/\epsilon ^N\big )\,dt + \sqrt{2\sigma }\,dW_t. \end{aligned}$$
(43)

The generator of \(X_t^\epsilon \) denoted by \(\mathcal {L}^\epsilon \) can be decomposed into powers of \(\epsilon \) as follows

$$\begin{aligned} (\mathcal {L}^\epsilon f)(x) = -\sum _{i=0}^{N}\epsilon ^{-i}\nabla _{x_i}V(x,x/\epsilon ,\ldots , x/\epsilon ^N)\cdot \nabla f(x)+ \sigma \Delta f(x). \end{aligned}$$

For functions of the form \(f^\epsilon (x) = f(x, x/\epsilon , \ldots , x/\epsilon ^N)\), we have

$$\begin{aligned} (\mathcal {L}^\epsilon f^\epsilon )(x)&= -\sum _{i=0}^{N}\epsilon ^{-i}\nabla _{x_i} V\big (x,x/\epsilon ,\ldots , x/\epsilon ^N\big )\cdot \Bigg (\sum _{j=0}^N \epsilon ^{-j}\nabla _{x_j} f\big (x,x/\epsilon ,\ldots , x/\epsilon ^N\big )\Bigg )\nonumber \\ {}&\quad +\sigma \sum _{i,j=0}^N \epsilon ^{-(i+j)}\nabla ^2_{x_ix_j}f\big (x,x/\epsilon ,\ldots , x/\epsilon ^N\big )\nonumber \\ {}&=\sum _{i,j=0}^N \epsilon ^{-(i+j)}\Big [e^{V/\sigma }\nabla _{x_i}\cdot \Big (\sigma e^{-V/\sigma }\nabla _{x_j}f\Big )\Big ]\big (x,x/\epsilon ,\ldots , x/\epsilon ^N\big )\nonumber \\ {}&=\sum _{n=0}^{2N}\epsilon ^{-n}(\mathcal {L}_n f)\big (x, x/\epsilon ,\ldots ,x/\epsilon ^N\big ), \end{aligned}$$
(44)

where for \(n=0,\ldots , 2N\)

$$\begin{aligned} (\mathcal {L}_n f)\big (x,x/\epsilon ,\ldots , x/\epsilon ^N\big ) = \Big [e^{ V/\sigma }\sum _{\begin{array}{c} i, j \in \lbrace 0,\ldots N\rbrace \\ \, i+j = n \end{array}}\nabla _{x_i}\cdot \left( \sigma e^{-V/\sigma }\nabla _{x_j}f \right) \Big ]\big (x,x/\epsilon ,\ldots , x/\epsilon ^N\big ). \end{aligned}$$
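The expansion (44) can also be checked symbolically; the following sketch (not part of the argument) verifies the identity for \(N=1\) and \(d=1\), treating \(x_0\) and \(x_1\) as independent variables and using generic smooth placeholders for V and f.

```python
import sympy as sp

# Symbolic sanity check of (44) for N = 1, d = 1: both sides are compared as
# functions of the independent variables (x0, x1), i.e. before setting x1 = x0/eps.
x0, x1, eps = sp.symbols('x0 x1 epsilon')
sigma = sp.symbols('sigma', positive=True)
V = sp.Function('V')(x0, x1)
f = sp.Function('f')(x0, x1)

# left-hand side: -grad V^eps . grad f^eps + sigma * Laplacian f^eps, after the chain rule
lhs = (-(sp.diff(V, x0) + sp.diff(V, x1)/eps) * (sp.diff(f, x0) + sp.diff(f, x1)/eps)
       + sigma*(sp.diff(f, x0, 2) + 2*sp.diff(f, x0, x1)/eps + sp.diff(f, x1, 2)/eps**2))

# right-hand side: sum_{i,j} eps^{-(i+j)} e^{V/sigma} d_{x_i}( sigma e^{-V/sigma} d_{x_j} f )
xs = [x0, x1]
rhs = sum(eps**(-(i + j))
          * sp.exp(V/sigma) * sp.diff(sigma*sp.exp(-V/sigma)*sp.diff(f, xs[j]), xs[i])
          for i in range(2) for j in range(2))

print(sp.simplify(lhs - rhs))   # prints 0
```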

Given a function \(\phi _0\), which will be specified later, our objective is to construct a test function \(\phi ^\epsilon \) of the form

$$\begin{aligned} \phi ^\epsilon (x)&= \phi _0(x) + \epsilon \phi _1(x,x/\epsilon ) + \ldots +\epsilon ^N\phi _N\big (x, x/\epsilon , \ldots , x/\epsilon ^N\big ) \\&\quad + \epsilon ^{N+1}\phi _{N+1}\big (x, x/\epsilon , \ldots , x/\epsilon ^N\big ) + \ldots + \epsilon ^{2N}\phi _{2N}\big (x, x/\epsilon , \ldots , x/\epsilon ^N\big ), \end{aligned}$$

such that

$$\begin{aligned} (\mathcal {L}^{\epsilon }\phi ^\epsilon )(x) = F(x) + O(\epsilon ), \end{aligned}$$
(45)

for some function F which is independent of \(\epsilon \). The above form for the test function is suggested by the calculation (44). Using (44) we compute

$$\begin{aligned} \big (\mathcal {L}^{\epsilon }\phi ^\epsilon \big )(x)&=\sum _{k=0}^{2N}\epsilon ^k (\mathcal {L}^\epsilon \phi _k)\big (x,x/\epsilon ,\ldots ,x/\epsilon ^N\big ) \\ {}&=\sum _{k=0}^{2N}\epsilon ^k\Bigg (\sum _{n=0}^{2N}\epsilon ^{-n}(\mathcal {L}_n \phi _k)\big (x, x/\epsilon ,\ldots ,x/\epsilon ^N\big )\Bigg ) \\ {}&=\sum _{k,n=0}^{2N}\epsilon ^{k-n}\big (\mathcal {L}_n \phi _k\big )\big (x, x/\epsilon ,\ldots ,x/\epsilon ^N\big ), \end{aligned}$$

where

$$\begin{aligned} (\mathcal {L}_n \phi _k)(x, x/\epsilon \ldots ,x/\epsilon ^N)=\Big [e^{ V/\sigma }\sum _{\begin{array}{c} i, j \in \lbrace 0,\ldots N\rbrace \\ \, i+j = n \end{array}}\nabla _{x_i}\cdot \left( \sigma e^{-V/\sigma }\nabla _{x_j}\phi _k \right) \Big ](x,x/\epsilon ,\ldots , x/\epsilon ^N). \end{aligned}$$

Note that \(\nabla _{x_j}\phi _k=0\) for \(j> k\). Equating powers of \(\epsilon \), from \(O(\epsilon ^{-N})\) to O(1), on both sides of (45), we obtain the following system of \(N+1\) equations

$$\begin{aligned}&\mathcal {L}_{2N} \phi _N + \mathcal {L}_{2N-1} \phi _{N-1} + \ldots + \mathcal {L}_N \phi _0 = 0, \end{aligned}$$
(46a)
$$\begin{aligned}&\mathcal {L}_{2N} \phi _{N+1} + \mathcal {L}_{2N-1}\phi _{N} + \ldots + \mathcal {L}_{N-1}\phi _0 = 0, \end{aligned}$$
(46b)
$$\begin{aligned}&\nonumber \vdots \\&\mathcal {L}_{2N} \phi _{2N-1} + \ldots + \mathcal {L}_{1}\phi _0 = 0, \end{aligned}$$
(46c)
$$\begin{aligned}&\mathcal {L}_{2N}\phi _{2N} + \ldots + \mathcal {L}_{0}\phi _0 = F. \end{aligned}$$
(46d)

This system generalizes the one written for three scales in [8, III\(-\)11.3]. We note that each nonzero term in (46a)–(46c) has the form

$$\begin{aligned} \sigma e^{V(x_0, \ldots , x_N)/\sigma }\nabla _{x_i}\cdot \left( e^{- V(x_0, \ldots , x_N)/\sigma }\nabla _{x_j}\phi _{k}\right) , \end{aligned}$$

where \(1 \le i + j - k \le N\). Furthermore, all the terms appearing in (46a)–(46c) must satisfy \(i > 0\). Indeed, \(i = 0\) would imply \(j\ge k+1>k\) and so \(\nabla _{x_j} \phi _k = 0\) by construction of the test function. Since

$$\begin{aligned} V(x_0, \ldots , x_{N}) = V_0(x_0) + V_1(x_0,\ldots , x_N), \end{aligned}$$

all the terms \(\mathcal {L}_{n}\phi _k\) appearing in (46a)–(46c) can be simplified as

$$\begin{aligned} \mathcal {L}_{n}\phi _k&=e^{(V_0+V_1)/\sigma }\sum _{\begin{array}{c} i\in \lbrace 1,\ldots N\rbrace \\ j\in \lbrace 0,\ldots N\rbrace \\ i+j = n \end{array}}\nabla _{x_i}\cdot \left( \sigma e^{-(V_0+V_1)/\sigma }\nabla _{x_j}\phi _k \right) \\ {}&=e^{V_1/\sigma }\sum _{\begin{array}{c} i\in \lbrace 1,\ldots N\rbrace \\ j\in \lbrace 0,\ldots N\rbrace \\ i+j = n \end{array}}\nabla _{x_i}\cdot \left( \sigma e^{-V_1/\sigma }\nabla _{x_j}\phi _k \right) , \end{aligned}$$

where we have used the fact that \(V_0\) is independent of \(x_i\) for \(i\in \lbrace 1,\ldots, N\rbrace \), so that the factors \(e^{\pm V_0/\sigma }\) can be pulled out of the divergence and cancelled. Thus, we can rewrite the first N equations as

$$\begin{aligned}&\mathcal {A}_{2N} \phi _N + \mathcal {A}_{2N-1} \phi _{N-1} + \ldots + \mathcal {A}_N \phi _0 = 0, \end{aligned}$$
(47a)
$$\begin{aligned}&\mathcal {A}_{2N} \phi _{N+1} + \mathcal {A}_{2N-1}\phi _{N} + \ldots + \mathcal {A}_{N-1}\phi _0 = 0, \end{aligned}$$
(47b)
$$\begin{aligned}&\nonumber \vdots \\&\mathcal {A}_{2N} \phi _{2N-1} + \ldots + \mathcal {A}_{1}\phi _0 = 0, \end{aligned}$$
(47c)

where

$$\begin{aligned} \mathcal {A}_n f = \sigma e^{ V_1(x_0, \ldots , x_N)/\sigma }\sum _{\begin{array}{c} i\in \lbrace 1,\ldots , N\rbrace \\ j\in \lbrace 0,\ldots , N\rbrace \\ i+j=n \end{array}} \nabla _{x_i}\cdot \left( e^{- V_1(x_0, \ldots , x_N)/\sigma }\nabla _{x_j}f \right) . \end{aligned}$$

Before constructing the test functions, we first introduce the sequence of spaces on which the sequence of correctors will be constructed. Define \(\mathcal {H}\) to be the space of functions on the extended state space, i.e. \(\mathcal {H} = L^2(\mathbb {X}_N)\), where \(\mathbb {X}_N\) is defined by (26). We construct the following sequence of subspaces of \(\mathcal {H}\). Let

$$\begin{aligned} \mathcal {H}_{N} = \left\{ f \in \mathcal {H}: \, \int f(x_0, \ldots , x_{N}) e^{- V_1/\sigma }\,dx_{N} = 0 \right\} , \end{aligned}$$

Then clearly \(\mathcal {H} = \mathcal {H}_{N} \oplus \mathcal {H}_{N}^\perp \). Suppose we have defined \(\mathcal {H}_{N-k+1}\); then we define \(\mathcal {H}_{N-k}\) inductively by

$$\begin{aligned} \mathcal {H}_{N-k} = \left\{ f \in \mathcal {H}_{N-k+1}: \, \int f(x_0, \ldots , x_{N-k})Z_{N-k}(x_0, \ldots , x_{N-k})\,dx_{N-k} = 0 \right\} , \end{aligned}$$

where \(Z_{i}(x_0, \ldots , x_i) = \int \ldots \int e^{- V_1(x_0, \ldots , x_N)/\sigma }\,d{x_{i+1}}\,dx_{i+2}\ldots \,dx_N\). Clearly, we have that \(\mathcal {H}_1 \oplus \mathcal {H}_1^\bot \oplus \ldots \oplus \mathcal {H}_N^\bot = \mathcal {H}\).
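The following toy computation (with a placeholder potential, for two scales and \(d=1\)) illustrates the splitting \(\mathcal {H} = \mathcal {H}_N \oplus \mathcal {H}_N^\bot \): the \(\mathcal {H}_N\)-component of a function is obtained by subtracting its \(e^{-V_1/\sigma }\)-weighted average over the innermost variable, while the complementary part depends only on the outer variables, which is how elements of \(\mathcal {H}_N^\bot \) are used in the sequel.

```python
import numpy as np

# Toy sketch of the splitting H = H_N (+) H_N^perp for two scales and d = 1:
# the H_N-component of a grid function f(x0, xN) has zero e^{-V_1/sigma}-weighted
# average over xN, while the complementary part depends on x0 only.
# V1 and f below are illustrative placeholders.
n, sigma = 128, 1.0
x0 = np.arange(n) / n
xN = np.arange(n) / n
X0, XN = np.meshgrid(x0, xN, indexing="ij")

V1 = np.cos(2*np.pi*X0) * np.cos(2*np.pi*XN)
w = np.exp(-V1/sigma)

f = np.sin(2*np.pi*X0) + np.cos(2*np.pi*(X0 + XN))

cond_mean = (f*w).sum(axis=1, keepdims=True) / w.sum(axis=1, keepdims=True)
f_HN   = f - cond_mean                          # component in H_N
f_perp = np.broadcast_to(cond_mean, f.shape)    # complementary part, a function of x0 alone

assert np.allclose((f_HN*w).sum(axis=1), 0.0)   # defining property of H_N
```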

Applying Proposition 5 we can now construct the series of test functions \(\phi _1, \ldots , \phi _{2N}\) that solve (47).

Proposition 6

Given \(\phi _0 \in C^\infty (\mathbb {R}^d)\), there exist smooth functions \(\phi _i\) for \(i=1,\ldots , 2N-1\) such that Eqs. (47a)–(47c) are satisfied, and moreover we have the following pointwise estimates, which hold uniformly in \((x_0,\ldots , x_k) \in \mathbb {X}_k\):

$$\begin{aligned} \Vert \nabla ^{\alpha } \phi _i(x_0,\ldots , x_k)\Vert _F\le C\sum _{l=1}^{\alpha _0+2}\Vert \nabla ^{l}_{x_0}\phi _0(x_0)\Vert _F, \end{aligned}$$
(48)

for some constant \(C > 0\), all multi-indices \(\alpha \) on \((0, \ldots , k)\), and all \(0 \le k \le i \le 2N-1\). Finally, Eq. (46d) is satisfied with

$$\begin{aligned} F(x) = \frac{\sigma }{Z(x)}\nabla _{x_0}\cdot \left( \mathcal {K}_1(x)\nabla _{x_0}\phi _0(x)\right) . \end{aligned}$$
(49)

Proof

Outline of the proof. Given \(\phi _0\) as in the hypothesis of the proposition, we will find the test functions \(\phi _i, i=1,\ldots , 2N\), from the system (47). This system consists of N equations; the other N equations come from solvability (compatibility) conditions, which are applications of the Fredholm alternative [46, Theorem 7.9]. More specifically, the solvability condition for the \(O(\epsilon ^{-(N-k)})\)-equation in (47), viewed as an equation for \(\phi _{N+k}\) in terms of \(\phi _{0},\ldots , \phi _{N+k-1}\), gives rise to an equation for \(\phi _{N-k}\) in terms of \(\phi _0,\ldots , \phi _{N-k-1}\), for \(k=1,\ldots , N\). The latter is an elliptic equation of the form (38) with \(\rho =1\) and \(D=\mathcal {K}_{N-k}\). According to Lemma 1, \(\mathcal {K}_{N-k}\) is uniformly positive definite. Hence, the existence of \(\phi _{N-k}\) follows from Proposition 5. Therefore, the solvability condition for \(\phi _{N+k}\) is fulfilled, guaranteeing the existence of \(\phi _{N+k}\). By repeating this process inductively for all \(k=1,\ldots , N\), we construct the test functions \(\phi _1,\ldots , \phi _{2N}\) satisfying the system (47). Finally, the function F is determined from (46d).

Now we implement this strategy in detail. We start from Equation (47a), which can be viewed as an equation for \(\phi _N\) in terms of \(\phi _0,\ldots , \phi _{N-1}\):

$$\begin{aligned} \mathcal {A}_{2N}\phi _N=-(\mathcal {A}_{2N-1}\phi _{N-1}+\ldots +\mathcal {A}_{N}\phi _0),\quad \mathcal {A}_{2N} f=\sigma e^{V_1/\sigma }\nabla _{x_N}\cdot \Big (e^{-V_1/\sigma }\nabla _{x_N}f\Big ).\nonumber \\ \end{aligned}$$
(50)

Since the operator \(\mathcal {A}_{2N}\) has a compact resolvent in \(L^2(\mathbb {T}^d)\), by the Fredholm alternative a necessary and sufficient condition for (47a) to have a solution is that the following compatibility condition holds

$$\begin{aligned} \int \left( \mathcal {A}_{2N-1} \phi _{N-1} + \mathcal {A}_{2N-2}\phi _{N-2} + \ldots + \mathcal {A}_{N} \phi _0\right) e^{-V_1/\sigma }\,dx_N = 0. \end{aligned}$$
(51)

Note that every term in this summation is of the form

$$\begin{aligned} \mathcal {A}_{2N-k}\phi _{N-k} = \sigma \sum _{\begin{array}{c} 0 \le i,j\le N \\ i+j=2N-k \end{array}} e^{V_1/\sigma }\nabla _{x_j}\cdot \left( e^{-V_1/\sigma }\nabla _{x_i}\phi _{N-k}\right) . \end{aligned}$$
(52)

For \(\nabla _{x_i}\phi _{N-k}\) to be non-zero it is necessary that \(i \le N-k\). To enforce the condition \(i+j = 2N-k\) it must be that \(i=N-k\) and \(j=N\), and thus the only non-zero terms in the above summation are:

$$\begin{aligned} \mathcal {A}_{2N-k} \phi _{N-k} = \sigma e^{V_1/\sigma }\nabla _{x_N} \cdot \left( e^{-V_1/\sigma }\nabla _{x_{N-k}} \phi _{N-k}\right) , \end{aligned}$$
(53)

for \(k = 1,\ldots , N\). Since each of these terms is a divergence with respect to \(x_N\), its integral over \(\mathbb {T}^d\) vanishes by periodicity, and hence the compatibility condition (51) holds. Therefore (47a) has a solution. In addition, it can be written as

$$\begin{aligned} \mathcal {A}_{2N} \phi _{N}&=-\sum _{k=1}^N \mathcal {A}_{2N-k} \phi _{N-k}\\ {}&=-\sum _{k=1}^N \sigma e^{V_1/\sigma }\nabla _{x_N}\cdot \Big (e^{-V_1/\sigma }\nabla _{x_{N-k}}\phi _{N-k}\Big ) \\ {}&=-\Big (\sigma e^{V_1/\sigma }\nabla _{x_N}\cdot \big (e^{-V_1/\sigma }I\big )\Big )\cdot \Bigg (\sum _{k=1}^N \nabla _{x_{N-k}}\phi _{N-k}\Bigg ). \end{aligned}$$

Note that for \(k=0\), the Poisson equation (15) reads \(\nabla _{x_N}\cdot \big (e^{-V_1/\sigma }(\nabla _{x_N}\theta _N + I)\big ) = 0\), that is,

$$\begin{aligned} \mathcal {A}_{2N} \theta _N = -\sigma e^{V_1/\sigma }\nabla _{x_N}\cdot (e^{-V_1/\sigma }I), \end{aligned}$$

which has a unique mean-zero solution \(\theta _N\). According to Remark 6, the test function \(\phi _N\) can be written as

$$\begin{aligned} \phi _N = \theta _N \cdot \left( \nabla _{x_{N-1}}\phi _{N-1} + \ldots + \nabla _{x_{0}}\phi _{0} \right) + r_N^{(1)}(x_0, \ldots , x_{N-1}), \end{aligned}$$
(54)

where

$$\begin{aligned} \theta _N\cdot (\nabla _{x_{N-1}}\phi _{N-1} + \ldots + \nabla _{x_0}\phi _0) \in \mathcal {H}_N, \end{aligned}$$

and \(r_N^{(1)} \in \mathcal {H}_N^\bot \) will be specified later. Next we consider the \(O(\epsilon ^{-(N-1)})\)-equation, that is (47b), viewed as an equation for \(\phi _{N+1}\) in terms of \(\phi _N,\ldots , \phi _0\):

$$\begin{aligned} \mathcal {A}_{2N}\phi _{N+1}=-(\mathcal {A}_{2N-1}\phi _N+\ldots +\mathcal {A}_{N-1}\phi _0), \end{aligned}$$
(55)

where \(\mathcal {A}_{2N}\) is given in (50). According to the Fredholm alternative, a necessary and sufficient condition for the above equation to have a solution is

$$\begin{aligned} \int \left( \mathcal {A}_{2N-1}\phi _N + \ldots + \mathcal {A}_{N-2} \phi _1 + \mathcal {A}_{N-1}\phi _0 \right) e^{-V_1/\sigma }\,dx_N = 0. \end{aligned}$$
(56)

Similarly to (53), for \(k=1,\ldots , N+1\), we have

$$\begin{aligned} \mathcal {A}_{2N-k}\phi _{N-k+1}&=\sigma e^{V_1/\sigma }\Big [\nabla _{x_{N-1}}\cdot \Big (e^{-V_1/\sigma }\nabla _{x_{N-k+1}}\phi _{N-k+1}\Big )\\&\quad +\nabla _{x_N}\cdot (e^{-V_1/\sigma }\nabla _{x_{N-k}}\phi _{N-k+1})\Big ]. \end{aligned}$$

Substituting this into (56), and noting that the terms which are divergences with respect to \(x_N\) vanish upon integration by periodicity, we obtain

$$\begin{aligned} 0&=\int \nabla _{x_{N-1}}\cdot \Big [e^{-V_1/\sigma }\big (\nabla _{x_N}\phi _N+\nabla _{x_{N-1}}\phi _{N-1}+\ldots +\nabla _{x_0}\phi _0\big )\Big ]\, dx_N\\&=\nabla _{x_{N-1}}\cdot \left( \int e^{-V_1/\sigma }\nabla _{x_{N}}\theta _N \,dx_N \left( \nabla _{x_{N-1}}\phi _{N-1} + \ldots + \nabla _{x_0}\phi _0\right) \right) \\&\quad + \nabla _{x_{N-1}}\cdot \left( \int e^{-V_1/\sigma }\,dx_{N} \left( \nabla _{x_{N-1}}\phi _{N-1} + \ldots +\nabla _{x_0}\phi _0\right) \right) , \end{aligned}$$

where in the last equality we use the fact that \(r^{(1)}_N\) is independent of \(x_N\). Thus we obtain the following equation for \(\phi _{N-1}\):

$$\begin{aligned} \nabla _{x_{N-1}}\cdot \left( \mathcal {K}_{N-1}\nabla _{x_{N-1}}\phi _{N-1}\right) = -\nabla _{x_{N-1}} \cdot \Big (\mathcal {K}_{N-1}\left( \nabla _{x_{N-2}}\phi _{N-2} + \ldots + \nabla _{x_0}\phi _0\right) \Big ), \end{aligned}$$
(57)

where

$$\begin{aligned} \mathcal {K}_{N-1}(x_0,x_1,\ldots , x_{N-1})= \int \left( I + \nabla _{x_{N}}\theta _N \right) e^{-V_1/\sigma }\, dx_N. \end{aligned}$$

By Lemma 1, the tensor \(\mathcal {K}_{N-1}\) is uniformly positive definite, uniformly in \((x_0, x_1, \ldots , x_{N-1})\). As a consequence, the operator defined in (57) is uniformly elliptic and in divergence form (the case \(\rho =1\) of (38)), so that the nullspace of its adjoint consists of constants. Since the right-hand side of (57) is itself a divergence with respect to \(x_{N-1}\), and hence has mean zero, a solution \(\phi _{N-1}\) exists. We recall that the corrector \(\theta _{N-1}\) satisfies equation (15) with \(k=1\), that is

$$\begin{aligned} \nabla _{x_{N-1}}\cdot \Big [\mathcal {K}_{N-1}\Big (\nabla _{x_{N-1}}\theta _{N-1}+I\Big )\Big ]=0. \end{aligned}$$

According to Remark 6, we can write \(\phi _{N-1}\) as

$$\begin{aligned} \phi _{N-1} = \theta _{N-1}\cdot \left( \nabla _{x_{N-2}}\phi _{N-2} + \ldots + \nabla _{x_0}\phi _0\right) + r_{N-1}^{(1)}(x_0, \ldots , x_{N-2}), \end{aligned}$$

for some \(r^{(1)}_{N-1} \in \mathcal {H}_{N-1}^\bot \). Since (56) has been satisfied, it follows from Proposition 5 that there exists a unique decomposition of \(\phi _{N+1}\) into

$$\begin{aligned} \phi _{N+1}(x_0, x_1, \ldots , x_{N}) = \widetilde{\phi }_{N+1}(x_0, x_1, \ldots , x_{N}) + r_{N+1}^{(1)}(x_0, x_1, \ldots , x_{N-1}), \end{aligned}$$

where \(\widetilde{\phi }_{N+1} \in \mathcal {H}_{N}\) is uniquely determined and \(r_{N+1}^{(1)} \in \mathcal {H}_{N}^\bot \) is yet to be specified. For the sake of illustration we now consider the \(O(\epsilon ^{-(N-2)})\) equation in (47):

$$\begin{aligned} \mathcal {A}_{2N}\phi _{N+2}=-\sum _{k=0}^{N+1}\mathcal {A}_{N+k-2}\phi _k, \end{aligned}$$

which, again by the Fredholm alternative, has a solution if and only if

$$\begin{aligned} \int \left( \mathcal {A}_{2N-1}\phi _{N+1} + \mathcal {A}_{2N-2}\phi _{N} + \ldots + \mathcal {A}_{N-2}\phi _{0}\right) \, e^{-V_1/\sigma }\,dx_N = 0. \end{aligned}$$
(58)

For \(k=1,\ldots , N+2\), we have

$$\begin{aligned} \mathcal {A}_{2N-k}\phi _{N-k+2}&=\sigma e^{V_1/\sigma }\Big [\nabla _{x_{N-2}}\cdot \Big (e^{-V_1/\sigma }\nabla _{x_{N-k+2}} \phi _{N-k+2}\Big )+\nabla _{x_{N-1}}\cdot \Big (e^{-V_1/\sigma }\nabla _{x_{N-k+1}}\phi _{N-k+2}\Big ) \\ {}&\quad +\nabla _{x_N}\cdot (e^{-V_1/\sigma }\nabla _{x_{N-k}}\phi _{N-k+2})\Big ]. \end{aligned}$$

Fixing the variables \(x_0, \ldots , x_{N-2}\), we can rewrite (58) as an equation for \(r^{(1)}_N=r^{(1)}_N(x_0,\ldots ,x_{N-1})\)

$$\begin{aligned} \widetilde{\mathcal {A}}_{2N-2} r_N^{(1)}:= \nabla _{x_{N-1}}\cdot \left( Z_{N-1} \nabla _{x_{N-1}}r_{N}^{(1)}\right) = -RHS, \end{aligned}$$
(59)

where

$$\begin{aligned} Z_{N-1}=\int e^{-V_1(x_0,\ldots ,x_N)/\sigma }\,dx_N, \end{aligned}$$

and the RHS contains all the remaining terms. We note that all the functions of \(x_{N-1}\) in the RHS are known, so that all the remaining undetermined terms can be viewed as constants for fixed \(x_0, \ldots , x_{N-2} \in \mathbb {X}_{N-2}\). By the Fredholm alternative, a necessary and sufficient condition for a unique mean zero solution to exist to (59) is that the RHS has integral zero with respect to \(x_{N-1}\), which is equivalent to:

$$\begin{aligned} \nabla _{x_{N-2}}\cdot \left( \int \int \left( \nabla _{x_N} \phi _N + \nabla _{x_{N-1}}\phi _{N-1} + \ldots + \nabla _{x_{0}}\phi _0 \right) e^{-V_1/\sigma }\,dx_N dx_{N-1}\right) =0, \end{aligned}$$

or equivalently:

$$\begin{aligned} \nabla _{x_{N-2}}\cdot \left( \mathcal {K}_{N-2}\nabla _{x_{N-2}}\phi _{N-2}\right) = -\nabla _{x_{N-2}}\cdot \left( \mathcal {K}_{N-2}\left( \nabla _{x_{N-3}}\phi _{N-3} + \ldots + \nabla _{x_0} \phi _0\right) \right) . \end{aligned}$$

Once again, this implies that

$$\begin{aligned} \phi _{N-2} = \theta _{N-2}\cdot \left( \nabla _{x_{N-3}}\phi _{N-3} + \ldots + \nabla _{x_{0}}\phi _0\right) + r_{N-2}^{(1)}(x_0, \ldots , x_{N-3}), \end{aligned}$$

where \(r_{N-2}^{(1)} \in \mathcal {H}_{N-2}^{\bot }\) is unspecified. Since the compatibility condition holds, by Proposition 5, Eq. (59) has a solution, so that we can write

$$\begin{aligned} r_{N}^{(1)}(x_0, \ldots , x_{N-1}) = \widetilde{r}_{N}^{(1)}(x_0, \ldots , x_{N-1}) + r_{N}^{(2)}(x_0, \ldots , x_{N-2}), \end{aligned}$$

where \(\widetilde{r}^{(1)}_{N} \in \mathcal {H}_{N-1}\) is the unique smooth solution of (59) and \(r_{N}^{(2)} \in \mathcal {H}_{N-1}^\bot \) is yet to be determined.

We continue the proof by induction. Suppose that for some \(k < N\), the functions \(\phi _{N-(k-1)}, \ldots , \phi _{N+(k-1)}\) have all been determined. We shall consider the case when k is even, noting that the case of odd k follows mutatis mutandis.

From the previous steps, each term in

$$\begin{aligned} \phi _{N+k-2}, \phi _{N+k-4}, \ldots , \phi _{N-k+2}, \end{aligned}$$

admits a decomposition such that in each case we can write:

$$\begin{aligned} \phi _{N+k - 2i} = \widetilde{\phi }_{N+k-2i} + r^{(k/2-i)}_{N+k-2i}, \end{aligned}$$

where

$$\begin{aligned} \widetilde{\phi }_{N+k-2i} \in \mathcal {H}_{k/2-i}, \end{aligned}$$

has been uniquely specified, and the remainder term

$$\begin{aligned} r_{N+k-2i}^{(k/2-i)} \in \mathcal {H}_{k/2-i}^\bot , \end{aligned}$$

remains to be determined. The \(O(\epsilon ^{-(N-k)})\) equation is given by

$$\begin{aligned} \mathcal {A}_{2N}\phi _{N+k} + \mathcal {A}_{2N-1}\phi _{N+k-1} + \ldots + \mathcal {A}_{N-k}\phi _{0} = 0. \end{aligned}$$
(60)

Following the example of the \(O(\epsilon ^{-(N-2)})\) step, in descending order we successively apply the compatibility conditions which must be satisfied for the equations involving \(r_{N+k}^{(1)}, \ldots , r_{N-k-2}^{(k-1)}\) of the form

$$\begin{aligned} \mathcal {\widetilde{A}}_{2N-2k-2i}r_{N+k-2i}^{(k/2-i)} = RHS, \end{aligned}$$
(61)

where in (61), all terms dependent on the variable \(x_{k/2-i}\) have been specified uniquely and where

$$\begin{aligned} \widetilde{A}_{2N-2k-2i} u = \nabla _{x_{N-k-i}}\cdot \left( Z_{N-k-i}\nabla _{x_{N-k-i}}u\right) . \end{aligned}$$

This results in (60) being integrated with respect to the variables \(x_N, \ldots , x_{N-k+1}\). In particular, all terms \(\mathcal {A}_{2N-j}\phi _{N+k-j}\) for \(j=0, \ldots , k-1\) will have integral zero, and thus vanish. The resulting equation is then

$$\begin{aligned} \int \ldots \int \left( \mathcal {A}_{2N-k}\phi _{N} + \ldots + \mathcal {A}_{N-k}\phi _0\right) e^{-V_1/\sigma }\,dx_N\ldots dx_{N-k+1} = 0. \end{aligned}$$
(62)

Moreover, since the function \(\phi _{N-i}\) depends only on the variables \(x_0, \ldots , x_{N-i}\), then (62) must be of the form

$$\begin{aligned} \nabla _{x_{N-k}}\cdot \left( \int \ldots \int \left( \nabla _{x_N}\phi _N + \nabla _{x_{N-1}}\phi _{N-1} + \ldots + \nabla _{x_0}\phi _0\right) e^{-V_1/\sigma }\,dx_{N}\ldots dx_{N-k+1}\right) = 0. \end{aligned}$$

We now apply the inductive hypothesis to see that (to shorten notation, we write \(dx_{N,\ldots , N-k+1}:=dx_{N}\cdots dx_{N-k+1}\), etc.)

$$\begin{aligned}&\int \left( \nabla _{x_N} \phi _N + \ldots + \nabla _{x_0}\phi _0\right) e^{-V_1/\sigma }\,dx_{N,\ldots , N-k+1} \\&\quad = \int \left( I+\nabla _{x_N} \theta _N \right) \left( \nabla _{x_{N-1}}\phi _{N-1} + \ldots + \nabla _{x_0}\phi _0\right) e^{-V_1/\sigma }\,dx_{N,\ldots ,N-k+1}\\&\quad = \int \left( I+\nabla _{x_N} \theta _N \right) \left( I+\nabla _{x_{N-1}} \theta _{N-1} \right) \left( \nabla _{x_{N-2}}\phi _{N-2} + \ldots + \nabla _{x_0}\phi _0\right) e^{-V_1/\sigma }\,dx_{N,\ldots ,N-k+1}\\&\qquad \vdots \\&\quad = \mathcal {K}_{N-k}\left( \nabla _{x_{N-k}} \phi _{N-k} + \ldots + \nabla _{x_0}\phi _0\right) . \end{aligned}$$

Thus, the compatibility condition for the \(O(\epsilon ^{N-k})\) equation reduces to the elliptic PDE

$$\begin{aligned} \nabla _{x_{N-k}}\cdot \left( \mathcal {K}_{{N-k}}\nabla _{x_{N-k}} \phi _{N-k}\right) = -\nabla _{x_{N-k}}\cdot \left( \mathcal {K}_{{N-k}}\left( \nabla _{x_{N-k-1}} \phi _{N-k-1} + \ldots + \nabla _{x_0}\phi _0\right) \right) , \end{aligned}$$

so that \(\phi _{N-k}\) can be written as

$$\begin{aligned} \phi _{N-k} = \theta _{N-k}\cdot \left( \nabla _{x_{N-k-1}}\phi _{N-k-1} + \ldots + \nabla _{x_{0}}\phi _0\right) + r_{N-k}^{(1)}, \end{aligned}$$
(63)

where \(r_{N-k}^{(1)}\) is an element of \(\mathcal {H}_{N-k}^\bot ,\) which is yet to be determined. Moreover, each remainder term \(r_{N+k-2i}^{(k/2-i)}\) can be further decomposed as

$$\begin{aligned} r_{N+k-2i}^{(k/2-i)} = \widetilde{r}_{N+k-2i}^{(k/2-i)} + {r}_{N+k-2i}^{(k/2-i+1)}, \end{aligned}$$

where

$$\begin{aligned} \widetilde{r}_{N+k-2i}^{(k/2-i)} \in \mathcal {H}_{k/2-i+1}, \end{aligned}$$

is uniquely determined and

$$\begin{aligned} {r}_{N+k-2i}^{(k/2-i+1)} \in \mathcal {H}_{k/2-i+1}^\bot , \end{aligned}$$

is still unspecified. Continuing the above procedure inductively, starting from a smooth function \(\phi _0\) we construct a series of correctors \(\phi _1, \ldots , \phi _{2N-1}\).

We now consider the final Eq. (46d). Arguing as before, we note that we can rewrite (46d) as

$$\begin{aligned} \mathcal {A}_{2N}\phi _{2N} + \cdots + \mathcal {A}_{N+1}\phi _{N+1} = F(x) - \sum _{i=0}^{N}\mathcal {L}_i \phi _i. \end{aligned}$$
(64)

A necessary and sufficient condition for (64) to have a solution \(\phi _{2N}\) is that

$$\begin{aligned} \begin{aligned}&\int _{\mathbb {T}^d} \left( \mathcal {A}_{2N-1}\phi _{2N-1} + \ldots + \mathcal {A}_{N+1} \phi _{N+1} \right) e^{-V_1/\sigma }\,dx_N \\&\quad = \int _{\mathbb {T}^d} \left( F(x)-\sum _{i=0}^N\mathcal {L}_i\phi _i \right) e^{-V_1/\sigma }\,dx_N. \end{aligned} \end{aligned}$$
(65)

At this point, the remainder terms will be of the form

$$\begin{aligned} r_{2N-2}^{(1)}, r_{2N-4}^{(2)}, \ldots , r_{2N-2k}^{(k)}, \ldots , r_{2}^{(N-1)}, \end{aligned}$$

such that \(r_{2N-2i}^{(i)} \in \mathcal {H}_{i}^{\bot }\), is unspecified. Starting from \(r_{2N-2}^{(1)}\) a necessary and sufficient condition for the remainder \(r_{2N-2i}^{(i)}\) to exist is that the integral of the equation with respect to \(dx_{N-i}\) vanishes, i.e.

$$\begin{aligned} \begin{aligned} F(x)Z(x)&= \int _{(\mathbb {T}^d)^N} \left( \mathcal {A}_{2N-1}\phi _{2N-1} + \ldots + \mathcal {A}_{N+1}\phi _{N+1}\right) e^{-V/\sigma }\,dx_N dx_{N-1}\ldots dx_{1} \\&\quad + \int _{(\mathbb {T}^d)^N} \left( \mathcal {L}_{N}\phi _{N} + \ldots + \mathcal {L}_{0}\phi _0\right) e^{-V/\sigma }\,dx_N dx_{N-1}\ldots dx_{1}, \end{aligned} \end{aligned}$$
(66)

where

$$\begin{aligned} Z(x) = \int _{\mathbb {T}^d}\ldots \int _{\mathbb {T}^d} e^{-V/\sigma }\,dx_N\ldots dx_{1}. \end{aligned}$$

As above, after simplification, (66) becomes

$$\begin{aligned} \sigma \,\nabla _{x_0}\cdot \left( \int _{(\mathbb {T}^d)^N}\left( \nabla _{x_N}\phi _N + \ldots +\nabla _{x_0}\phi _0 \right) e^{-V/\sigma }\,dx_N\cdots dx_1\right) =Z(x)F(x), \end{aligned}$$

which can be written as

$$\begin{aligned} \frac{\sigma }{Z(x)}\nabla _{x_0}\cdot \left( \int _{(\mathbb {T}^d)^N}\left( I + \nabla _{x_N}\theta _N\right) \cdot \cdots \cdot \left( I + \nabla _{x_1}\theta _1\right) e^{-V/\sigma }\,dx_N\cdots dx_1\nabla _{x_0}\phi _0\right) = F(x), \end{aligned}$$

or more compactly

$$\begin{aligned} F(x) = \frac{\sigma }{Z(x)}\nabla _{x_0}\cdot \left( \mathcal {K}_1(x) \nabla _{x_0}\phi _0(x)\right) , \end{aligned}$$

where the terms on the right-hand side have been specified and are unique. Thus, the O(1) equation (66) provides a unique expression for F(x). Moreover, by Proposition 5, for each \(i=1,\ldots , N-1\) there exists a unique smooth solution \(r_{2N-2i}^{(i)} \in \mathcal {H}_{i-1}\), as well as \(\phi _{2N} \in \mathcal {H}_{N}\).

Note that we have not uniquely identified the functions \(\phi _1, \ldots , \phi _{2N}\): after the above N steps some remainder terms are still unspecified. However, conditions (47a)–(47c) hold for any choice of these remaining terms; in particular, we may set them all to zero. Moreover, every Poisson equation we have solved in the above steps has been of the form:

$$\begin{aligned} \mathcal {S}_k u(x_0,\ldots , x_k) = a(x_0,\ldots ,x_k)\cdot \nabla _{x_0}\phi _0(x_0) + A(x_0,\ldots , x_k):\nabla ^2_{x_0}\phi _0(x_0), \end{aligned}$$

where \(\mathcal {S}_k\) is of the form (38), and a and A are uniformly bounded with bounded derivatives. In particular, from the remark following Proposition 5 the pointwise estimates (48) hold. \(\square \)

Remark 7

Note that we do not have an explicit formula for the test functions \(\phi _i\), \(i = 1,\ldots , N\). However, by applying (63) recursively one can obtain an explicit expression for the gradient of \(\phi _i\) in terms of the correctors \(\theta _i\):

$$\begin{aligned} \nabla _{x_i}\phi _i = \nabla _{x_i}\theta _i(I + \nabla _{x_{i-1}}\theta _{i-1})\cdot \cdots \cdot (I + \nabla _{x_{1}}\theta _{1})\nabla _{x_0}\phi _0. \end{aligned}$$

Since these are the only terms required for the calculation of the homogenized diffusion tensor we thus obtain an explicit characterisation of the effective coefficients.
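To make this characterisation concrete, the sketch below specialises to \(d=1\), \(N=2\) and \(\sigma =1\) with a placeholder fast potential (none of these choices are taken from the paper). In one dimension the innermost cell problem has the closed-form solution \(1+\partial _{x_N}\theta _N = e^{V_1/\sigma }/\int _{\mathbb {T}}e^{V_1/\sigma }\,dx_N\), so each level of the recursion reduces to a harmonic average, and one obtains a Lifson–Jackson-type expression for the effective coefficient; the precise normalisation of \(\mathcal {M}\) should be checked against Theorem 1.

```python
import numpy as np

# Purely illustrative: effective coefficient for d = 1, N = 2, sigma = 1, at a
# fixed macroscopic point x0, for a placeholder fast potential V_1(x0, x1, x2).
# In one dimension  1 + d(theta_2)/dx2 = e^{V_1/sigma} / int e^{V_1/sigma} dx2,
# so K_1(x0, x1) = 1 / int e^{V_1/sigma} dx2, and each further level is another
# harmonic average; dividing by the fast partition function gives the scalar
# effective coefficient (a Lifson–Jackson-type formula).
sigma, n, x0 = 1.0, 400, 0.37
y1 = (np.arange(n) + 0.5) / n
y2 = (np.arange(n) + 0.5) / n
Y1, Y2 = np.meshgrid(y1, y2, indexing="ij")

V1 = np.cos(2*np.pi*Y1) + (1.0 + 0.5*np.cos(2*np.pi*x0))*np.cos(2*np.pi*Y2)

K_level1 = 1.0 / np.exp(V1/sigma).mean(axis=1)      # harmonic average over y2, per y1
K_level0 = 1.0 / (1.0/K_level1).mean()              # next level: harmonic average over y1

Zhat = np.exp(-V1/sigma).mean()                     # partition function of the fast variables
M_eff = K_level0 / Zhat                             # scalar effective diffusion coefficient

# the iterated harmonic means collapse to 1 / ( <e^{V1/sigma}> <e^{-V1/sigma}> )
print(M_eff, 1.0 / (np.exp(V1/sigma).mean() * Zhat))
```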

5.2 Tightness of Measures

In this section we establish the weak compactness of the family of measures corresponding to \(\lbrace X_t^\epsilon \,: \, 0 \le t \le T\rbrace _{0 < \epsilon \le 1}\) in \(C([0,T]; \mathbb {R}^d)\) by establishing tightness. Following [43], we verify the following two conditions, which are a slight modification of the sufficient conditions stated in [9, Theorem 8.3].

Lemma 3

The collection \(\lbrace X_t^\epsilon \,: \, 0 \le t \le T\rbrace _{\lbrace 0 < \epsilon \le 1\rbrace }\) is relatively compact in \(C([0,T]; \mathbb {R}^d)\) if it satisfies:

  1.

    For all \(\delta > 0\), there exists \(M > 0\) such that

    $$\begin{aligned} \mathbb {P}\left( \sup _{0 \le t \le T}|X_t^\epsilon | > M \right) \le \delta , \quad 0 < \epsilon \le 1. \end{aligned}$$
  2.

    For any \(\delta > 0\), \(M > 0\), there exists \(\epsilon _0\) and \(\gamma \) such that

    $$\begin{aligned} \gamma ^{-1}\sup _{0<\epsilon <\epsilon _0}\sup _{0\le t_0 \le T}\mathbb {P}\left( \sup _{t\in [t_0, t_0+\gamma ]}\left| X_t^\epsilon - X_{t_0}^\epsilon \right| \ge \delta ; \, \sup _{0 \le s \le T}|X_s^\epsilon | \le M\right) \le \delta . \end{aligned}$$

To verify condition 1 we follow the approach of [43] and consider a test function of the form \(\phi _0(x) = \log (1 + |x|^2)\). The motivation for this choice is that while \(\phi _0(x)\) is increasing, we have that

$$\begin{aligned} \sum _{l=1}^{3}(1 + |x|)^l|\nabla _x^l \phi _0(x)|_F \le C, \end{aligned}$$
(67)

where \(|\cdot |_{F}\) denotes the Frobenius norm. Let \(\phi _1, \ldots , \phi _{2N-1}\) be the first \(2N-1\) test functions constructed in Proposition 6. Consider the test function

$$\begin{aligned} \begin{aligned} \phi ^\epsilon (x)&= \phi _0(x) + \epsilon \phi _1(x, x/\epsilon ) + \ldots + \epsilon ^{N}\phi _N\big (x, x/\epsilon , \ldots , x/\epsilon ^N\big ) \\&\quad + \epsilon ^{N+1}\phi _{N+1}\big (x, x/\epsilon , \ldots , x/\epsilon ^N\big ) + \ldots +\epsilon ^{2N-1}\phi _{2N-1}\big (x, x/\epsilon , \ldots , x/\epsilon ^{N}\big ). \end{aligned}\nonumber \\ \end{aligned}$$
(68)

Applying Itô’s formula, we have that

$$\begin{aligned} \phi ^\epsilon (X^\epsilon _t) = \phi ^\epsilon (X^\epsilon _0) + \int _0^t G(X_s^\epsilon )\,ds + \sqrt{2\sigma }\sum _{i=0}^{N}\sum _{j=0}^{2N-1} \epsilon ^{j-i} \int _0^t\nabla _{x_i}\phi _j\,dW_s, \end{aligned}$$

where G(x) is a smooth function consisting of terms of the form:

$$\begin{aligned} \begin{aligned} \epsilon ^{k-(i+j)}e^{V/\sigma }\,\nabla _{x_i}\cdot \left( e^{- V/\sigma }\,\sigma \nabla _{x_j}\phi _{k}\right) (x, x/\epsilon , \ldots , x/\epsilon ^N), \end{aligned} \end{aligned}$$
(69)

where \(k\ge i+j\), by construction of the test functions. Moreover, \(\nabla _{x_i}\phi _j = 0\) for \(j < i\). To obtain relative compactness we need to control individually the terms arising in the drift and in the martingale part. More specifically, we must show that the terms

$$\begin{aligned}&\mathbb {E} \sup _{0\le t \le T} \int _0^t\left| e^{V/\sigma }\nabla _{x_i}\cdot \left( e^{- V/\sigma } \,\sigma \nabla _{x_j}\phi _{k}\right) (X_s^\epsilon , X_s^\epsilon /\epsilon , \ldots , X_s^\epsilon /\epsilon ^N)\,ds\right| ,\end{aligned}$$
(70)
$$\begin{aligned}&\mathbb {E}\left| \sup _{0\le t\le T}\int _0^t \nabla _{x_j}\phi _k(X^\epsilon _s,X^\epsilon _s/\epsilon , \ldots ,X^\epsilon _s/\epsilon ^N)\,dW_s\right| ^2, \end{aligned}$$
(71)

and

$$\begin{aligned} \sup _{0\le t \le T}|\phi _j(X^\epsilon _t)|. \end{aligned}$$
(72)

are bounded uniformly with respect to \(\epsilon \in (0,1]\). Terms of the type (70) can be bounded above by:

$$\begin{aligned} \mathbb {E} \sup _{0 \le t \le T}\int _0^t \left| \left( \nabla _{x_i}V\cdot \nabla _{x_j} \phi _k\right) (X_s^\epsilon ,\ldots , X_s^\epsilon /\epsilon ^N)\right| + \left| \sigma \nabla _{x_i}\cdot \nabla _{x_j} \phi _k(X_s^\epsilon ,\ldots , X_s^\epsilon /\epsilon ^N)\right| \,ds. \end{aligned}$$

If \(i > 0\), then \(\nabla _{x_i}V\) is uniformly bounded, and so the above expectation is bounded above by

$$\begin{aligned}&C\,\mathbb {E} \int _0^T |\nabla _{x_j}\phi _k(X_s^\epsilon , \ldots , X_s^\epsilon /\epsilon ^N)| + \bigg |\nabla _{x_i}\cdot \nabla _{x_j}\phi _k(X_s^\epsilon , \ldots , X_s^\epsilon /\epsilon ^N)\bigg | \,ds \\&\quad \le C \mathbb {E} \int _0^T\sum _{m=1}^3\left|\nabla ^{m}_{x_0}\phi _0(X_s^\epsilon )\right|_F\,ds \le KT, \end{aligned}$$

using (67), for some constant \(K > 0\) independent of \(\epsilon \). For the case when \(i = 0\), an additional term arises from the derivative \(\nabla _{x_0}V_0\) and we obtain an upper bound of the form

$$\begin{aligned} \begin{aligned}&\mathbb {E} \int _0^T \sum _{m=1}^3\left|\nabla ^{m}_{x_0}\phi _0(X_t^\epsilon )\right|_{F} (1 + \left| \nabla _{x_0} V_0(X_t^\epsilon )\right| )\,dt \\&\quad \le \,\mathbb {E} \int _0^T\sum _{m=1}^3\left|\nabla ^{m}_{x_0}\phi _0(X_t^\epsilon )\right|_{F}(1 + \Vert \nabla \nabla V_0\Vert _{L^\infty }|X_t^\epsilon |)\,dt, \end{aligned} \end{aligned}$$
(73)

which is bounded by Assumption 1 and (67). For (71), we have

$$\begin{aligned} \mathbb {E}\left| \sup _{0\le t\le T}\int _0^t \nabla _{x_j}\phi _k(X^\epsilon _s, X^\epsilon _s/\epsilon , \ldots ,X^\epsilon _s/\epsilon ^N)\,dW_s\right| ^2&\le 4 \mathbb {E}\int _0^T |\nabla _{x_j}\phi _k(X^\epsilon _s,X^\epsilon _s/\epsilon , \ldots ,X^\epsilon _s/\epsilon ^N)|^2\,ds\\&\le C\,\mathbb {E}\int _0^T\sum _{m=1}^3\left|\nabla ^{m}_{x_0}\phi _0(X_s^\epsilon )\right|_F\,ds, \end{aligned}$$

which is again bounded. Terms of the type (72) are handled in a similar manner. Condition 1 then follows by an application of Markov’s inequality.

To prove condition 2, we set \(\phi _0(x)=x\) and let \(\phi _{1}, \ldots , \phi _{2N-1}\) be the test functions which exist by Proposition 6. Applying Itô’s formula to the corresponding multiscale test function (68), we obtain, for fixed \(t_0 \in [0,T]\),

$$\begin{aligned} X_t^{\epsilon } - X_{t_0}^\epsilon = \int _{t_0}^t {G}\,ds + \sqrt{2\sigma }\sum _{i=0}^{N}\sum _{j=0}^{2N-1}\epsilon ^{j-i} \int _{t_0}^t \nabla _{x_i}\phi _j\,dW_s, \end{aligned}$$
(74)

where G is of the form given in (69). Let \(M > 0\), and let

$$\begin{aligned} \tau ^\epsilon _M = \inf \lbrace t \ge 0; \, |X_t^\epsilon | > M\rbrace . \end{aligned}$$
(75)

Following [43], it is sufficient to show that

$$\begin{aligned} \mathbb {E} \left[ \sup _{t_0\le t \le T} \int _{t_0\wedge \tau ^\epsilon _M}^{t\wedge \tau ^\epsilon _M}\left| e^{V/\sigma }\nabla _{x_i}\cdot \left( e^{- V/\sigma } \nabla _{x_j}\phi _{k}\right) (X_s^\epsilon , X_s^\epsilon /\epsilon , \ldots , X_s^\epsilon /\epsilon ^N)\,ds\right| ^{1+\nu }\right] < \infty , \end{aligned}$$
(76)

and

$$\begin{aligned} \mathbb {E}\left( \sup _{t_0\le t \le t_0 + \gamma }\left| \int _{t_0\wedge \tau ^\epsilon _M}^{t\wedge \tau ^\epsilon _M} \nabla _{x_i}\phi _j(X_s^\epsilon , X_s^\epsilon /\epsilon , \ldots , X_s^\epsilon /\epsilon ^N)\,dW_s\right| ^{2+2\nu }\right) < \infty , \end{aligned}$$
(77)

for some fixed \(\nu > 0\). For (76), when \(i > 0\), the term \(\nabla _{x_i}V\) is uniformly bounded. Moreover, since \(\nabla \phi _0\) is bounded, so are the test functions \(\phi _1,\ldots , \phi _{2N-1}\). Therefore, by Jensen’s inequality one obtains a bound of the form

$$\begin{aligned}&C\gamma ^{\nu }\mathbb {E}\int _{t_0}^{t_0+\gamma }\left| e^{ V/\sigma }\nabla _{x_i}\cdot \left( e^{- V/\sigma } \nabla _{x_j}\phi _{k}\right) (X_s^\epsilon , X_s^\epsilon /\epsilon , \ldots , X_s^\epsilon /\epsilon ^N)\right| ^{1+\nu }\,ds\\&\quad \le C\gamma ^{\nu }\int _{t_0}^{t_0+\gamma } |K|^{1+\nu }\,ds \le K'\gamma ^{1+\nu }. \end{aligned}$$

When \(i = 0\), we must control terms involving \(\nabla _{x_0}V_0\) of the form,

$$\begin{aligned} \mathbb {E}\left[ \sup _{t_0\le t\le t_0+\gamma }\int _{t_0\wedge \tau ^\epsilon _M}^{t\wedge \tau ^\epsilon _M} \left| \nabla V_0\cdot \nabla _{x_j} \phi _k\right| ^{1+\nu }\,ds\right] , \end{aligned}$$

where \(\tau _M^\epsilon \) is given by (75). However, applying Jensen’s inequality,

$$\begin{aligned} \nonumber \mathbb {E}\left[ \sup _{t_0\le t\le t_0+\gamma }\int _{t_0\wedge \tau ^\epsilon _M}^{t\wedge \tau ^\epsilon _M} \left| \nabla V_0\cdot \nabla _{x_j} \phi _k\right| ^{1+\nu }\,ds\right]&\le C\gamma ^{\nu } \int _{t_0\wedge \tau ^\epsilon _M}^{(t_0+\gamma )\wedge \tau ^\epsilon _M} \mathbb {E}\left| \nabla V_0\cdot \nabla _{x_j} \phi _k\right| ^{1+\nu }\,ds \\&\nonumber \le C\gamma ^{\nu }\int _{t_0\wedge \tau ^\epsilon _M}^{(t_0+\gamma )\wedge \tau ^\epsilon _M} \mathbb {E}\left| \nabla V_0(X_s^\epsilon )\right| ^{1+\nu }\,ds\\&\nonumber \le C\gamma ^{\nu }\left\Vert \nabla ^2 V_0\right\Vert _{\infty }^{1+\nu }\int _{t_0\wedge \tau ^\epsilon _M}^{(t_0+\gamma )\wedge \tau ^\epsilon _M} \mathbb {E}|X_s^\epsilon |^{1+\nu }\,ds\\&\le CM\gamma ^{1+\nu }\left\Vert \nabla ^2 V_0\right\Vert _{L^{\infty }}^{1+\nu }, \end{aligned}$$
(78)

as required. To establish (77) we argue similarly, first using the Burkholder–Davis–Gundy inequality to obtain:

$$\begin{aligned} \mathbb {E}\left( \sup _{t_0 \le t \le t_{0} + \gamma }\left| \int _{t_0}^{t}\nabla _{x_i}\phi _j\,dW_s\right| ^{2+2\nu }\right)&\le C\,\mathbb {E}\left( \int _{t_0}^{t_0+\gamma }\left| \nabla _{x_i}\phi _j\right| ^2 \,ds\right) ^{1+\nu }\\&\le C\gamma ^{\nu }\int _{t_0}^{t_0+\gamma }\mathbb {E}\left| \nabla _{x_i}\phi _j\right| ^{2+2\nu } \,ds \\&\le C \gamma ^{1+\nu }. \end{aligned}$$

We note that Assumption 1 (3) is only used to obtain the bounds (73) and (78). A straightforward application of Markov’s inequality then completes the proof of condition 2. It follows from Prokhorov’s theorem that the family \(\lbrace X_t^\epsilon ; t \in [0,T]\rbrace _{0 < \epsilon \le 1}\) is relatively compact in the topology of weak convergence of stochastic processes taking paths in \(C([0,T]; \mathbb {R}^d)\). In particular, there exists a process \(X^0\) whose paths lie in \(C([0,T]; \mathbb {R}^d)\) such that \(\lbrace X^{\epsilon _n}; t\in [0,T] \rbrace \Rightarrow \lbrace X^{0}; t\in [0,T] \rbrace \) along a subsequence \(\epsilon _n\).

5.3 Identifying the Weak Limit

In this section we uniquely identify any limit point of the set \(\lbrace X_t^\epsilon ; t \in [0,T]\rbrace _{0 < \epsilon \le 1}\). Given \(\phi _0 \in C^\infty _c(\mathbb {R}^d)\) define \(\phi ^\epsilon \) to be

$$\begin{aligned} \phi ^\epsilon (x) = \phi _0(x) + \epsilon \phi _1(x, x/\epsilon ) + \ldots + \epsilon ^N \phi _N\big (x, x/\epsilon ,\ldots , x/\epsilon ^N\big ) + \ldots + \epsilon ^{2N}\phi _{2N}\big (x, x/\epsilon , \ldots , x/\epsilon ^N\big ), \end{aligned}$$

where \(\phi _1, \ldots , \phi _{2N}\) are the test functions obtained from Proposition 6. Since each test function is smooth, we can apply Itô’s formula to \(\phi ^\epsilon (X_t^\epsilon )\) to obtain

$$\begin{aligned} \mathbb {E}\left[ \phi ^\epsilon (X_t^\epsilon ) - \int _s^t \mathcal {L}^\epsilon \phi ^\epsilon (X_u^\epsilon )\,du \Big | \, \mathcal {F}_s \right] = \phi ^\epsilon (X_s^\epsilon ). \end{aligned}$$
(79)

We can now use (45) to decompose \(\mathcal {L}^\epsilon \phi ^\epsilon \) into an O(1) term and remainder terms which vanish as \(\epsilon \rightarrow 0\). Collecting the \(O(\epsilon )\) terms into a remainder we obtain

$$\begin{aligned} \mathbb {E}\left[ \phi _0(X_t^\epsilon ) - \int _{s}^t \frac{\sigma }{Z(X_u^\epsilon )}\nabla _{x_0}\cdot \left( {Z}(X_u^\epsilon )\mathcal {M}(X_u^\epsilon )\nabla \phi _0(X_u^\epsilon )\right) \,du + \epsilon R_\epsilon \,\Big |\, \mathcal {F}_s\right] = \phi _0(X_s^\epsilon ), \end{aligned}$$

where \(R_\epsilon \) is a remainder term which is bounded in \(L^2(\mu ^\epsilon )\) uniformly with respect to \(\epsilon \), and where the homogenized diffusion tensor \(\mathcal {M}(x)\) is defined in Theorem 1. Taking \(\epsilon \rightarrow 0\) we see that any limit point is a solution of the martingale problem

$$\begin{aligned} \mathbb {E}\left[ \phi _0(X^0_t) - \int _{s}^t \frac{\sigma }{Z(X^0_u)}\nabla _{x_0}\cdot \left( Z(X^0_u)\mathcal {M}(X^0_u)\nabla \phi _0(X_u^0)\right) \,du \,\Big |\, \mathcal {F}_s\right] = \phi _0(X_s^0). \end{aligned}$$

This implies that \(X^0\) is a solution to the martingale problem for \(\mathcal {L}^0\) given by

$$\begin{aligned} \mathcal {L}^0 f(x) = \frac{\sigma }{Z(x)}\nabla \cdot (Z(x)\mathcal {M}(x)\nabla f(x)). \end{aligned}$$

From Lemma 1, the matrix \(\mathcal {M}(x)\) is smooth, strictly positive definite and has bounded derivatives. Moreover,

$$\begin{aligned} Z(x)&= \int _{\mathbb {T}^d}\cdots \int _{\mathbb {T}^d}e^{-V(x, x_1,\ldots , x_N)/\sigma }\,dx_1\ldots dx_N \\ {}&= e^{-V_0(x)/\sigma }\int _{\mathbb {T}^d}\cdots \int _{\mathbb {T}^d}e^{-V_1(x, x_1,\ldots , x_N)/\sigma }\,dx_1\ldots dx_N, \end{aligned}$$

where the term in the integral is uniformly bounded. It follows from Assumption 1, that for some \(C > 0\),

$$\begin{aligned} \left| \mathcal {M}(x)\nabla \Psi (x)\right| \le C(1 + |x|), \quad \forall x\in \mathbb {R}^d, \end{aligned}$$

where \(\Psi = -\log Z\). Therefore, the conditions of the Stroock–Varadhan theorem [51, Theorem 24.1] hold, and so the martingale problem for \(\mathcal {L}^0\) possesses a unique solution. Thus \(X^0\) is the unique (in the weak sense) limit point of the family \(\lbrace X^\epsilon \rbrace _{0 < \epsilon \le 1}\). Moreover, by [51, Theorem 20.1], the process \(\lbrace X^0_t; t\in [0,T] \rbrace \) is the unique solution of the SDE (18), completing the proof.
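Although not needed for the proof, it may be helpful to record how the limiting dynamics can be simulated. Expanding the divergence form of \(\mathcal {L}^0\) gives the Itô drift \(\sigma \nabla \cdot \mathcal {M} + \sigma \mathcal {M}\nabla \log Z\) and diffusion coefficient \(\sqrt{2\sigma }\,\mathcal {M}^{1/2}\); the sketch below implements an Euler–Maruyama discretisation of this form in one dimension with placeholder coefficients (the actual coefficients are those of (18) and Theorem 1).

```python
import numpy as np

# Hedged sketch: Euler-Maruyama for a one-dimensional diffusion with generator
#   L0 f = (sigma/Z) d/dx ( Z * M * df/dx ),
# written in Ito form as
#   dX = [ sigma*M'(X) + sigma*M(X)*(log Z)'(X) ] dt + sqrt(2*sigma*M(X)) dW.
# M and Z are smooth placeholders, not the coefficients of Theorem 1.
rng = np.random.default_rng(0)
sigma, dt, nsteps = 1.0, 1e-3, 200_000

M     = lambda x: 1.0 + 0.5*np.sin(x)      # placeholder effective diffusion coefficient
dM    = lambda x: 0.5*np.cos(x)
dlogZ = lambda x: -x                       # placeholder: Z proportional to exp(-x^2/2)

X, traj = 0.0, np.empty(nsteps)
for k in range(nsteps):
    drift = sigma*dM(X) + sigma*M(X)*dlogZ(X)
    X += drift*dt + np.sqrt(2.0*sigma*M(X)*dt)*rng.standard_normal()
    traj[k] = X

# L0 is symmetric in L^2(Z dx), so the histogram of the trajectory should be
# close to the density proportional to Z; here we just report simple statistics.
print(traj.mean(), traj.var())
```

Since \(\mathcal {L}^0\) is symmetric in \(L^2(Z\,dx)\), the long-time histogram of the simulated trajectory should approximate a density proportional to Z, which provides a simple consistency check for any such implementation.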

6 Further Discussion and Outlook

In this paper, we have shown the convergence of the multi-scale diffusion process (8) to the homogenized (effective) diffusion process (18), as well as the convergence of the corresponding equilibrium measures. We have employed the classical martingale approach based on a suitable construction of test functions and analysis of the related Poisson equations. A notable feature is that the effective (macroscopic) process is a multiplicative diffusion process whose diffusion tensor depends on the macroscopic variable, whereas the noise in the microscopic dynamics is additive. This is due to the full coupling between the macroscopic and the microscopic scales. As discussed in the introduction, both processes are reversible diffusion processes satisfying the detailed balance condition. Therefore, according to [1], the corresponding Fokker–Planck equations at all scales are Wasserstein gradient flows of the associated free energy functionals [30]. Thus, the rigorous analysis presented in this work leads to the conclusion that the Wasserstein gradient flow structure is preserved under coarse-graining. This raises the interesting question of whether coarse-graining and, in particular, homogenization can be studied within the framework of evolutionary Gamma convergence [4, 16, 35, 52]. Another interesting question is to obtain quantitative rates of convergence [17] and to understand the effect of coarse-graining on the Poincaré and logarithmic Sobolev inequality constants, using the methodology of two-scale convergence [24, 41]. We will return to these questions in future work.