
1 Introduction

In many application domains, one encounters kinetic equations that model particle behavior in a high-dimensional position-velocity phase space. One motivating application for this work is the modelling of neutral particle transport in plasma simulations of Tokamak fusion reactors. Simulation codes such as EIRENE [19] and DEGAS2 [20] simulate these models with particle-based Monte Carlo methods. Other approaches, such as finite-volume or discontinuous Galerkin methods, typically either suffer from dimension-dependent computational costs or incur additional errors due to the use of lower-dimensional approximations.

Often, one is interested in low-dimensional moments of the particle distribution, computed as averages over velocity space. In high-collisional regimes, such as those in plasma edge simulations, the time-scale at which these quantities of interest change is typically much slower than that of the particle dynamics. This presence of multiple time-scales results in stiffness: a naive simulation requires small time steps due to the fast dynamics, but long time horizons to capture the evolution of the macroscopic moments. The exact nature of the macroscopic behavior depends on the problem scaling, which can be hyperbolic or diffusive [4].

We consider a diffusively-scaled, spatially homogeneous kinetic equation modelling a distribution of particles \(f(x,v,t)\) as a function of space \(x\in \mathcal {D}_x \subset \mathbb {R}^d\), velocity \(v \in \mathcal {D}_v \subset \mathbb {R}^d\) and time \(t \in \mathbb {R}^+\). The scaling is captured by the dimensionless parameter \(\varepsilon \). Dividing the collision rate by \(\varepsilon \) while simultaneously multiplying the simulation time-scale by \(\varepsilon \) produces the model equation

$$\begin{aligned} \partial _t f(x,v,t) + \frac{v}{\varepsilon }\nabla _x f(x,v,t) = \frac{1}{\varepsilon ^2}(\mathcal {M}(v)\rho (x,t)-f(x,v,t)), \end{aligned}$$
(1)
$$\begin{aligned} \text {with} \quad \rho (x,t) = \int _{\mathcal {D}_v} f(x,v,t)\;\text {d}v \quad \text {the particle density.} \end{aligned}$$

The right-hand side of (1) is the BGK operator [1], which linearly drives the velocity distribution to a steady-state distribution \(\mathcal {M}(v)\) with characteristic velocity \(\tilde{v}\).

Individual particles in the distribution \(f(x,v,t)\) follow a velocity-jump process, i.e., they travel in a straight line with a fixed velocity for an exponentially distributed time interval \(\mathcal {E}(1/\varepsilon ^2)\), at which point a collision is simulated by re-sampling their velocity from the distribution \(\mathcal {M}(v)\). Taking the diffusion limit \(\varepsilon \rightarrow 0\) in (1) drives the collision rate to infinity, while simultaneously increasing the particle velocity. In this limit, it can be shown that the particle density resulting from (1) converges to the solution of the diffusion equation [11],

$$\begin{aligned} \partial _t \rho (x,t) = \nabla ^2_{x} \rho (x,t). \end{aligned}$$
(2)
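As an illustration of the velocity-jump process described above, the following Python sketch (function name is ours, not from the paper) simulates 1D particles with two discrete directions and exponential flight times with rate \(1/\varepsilon ^2\). For small \(\varepsilon \), the mean squared displacement at time t approaches the value 2t predicted by the diffusion limit (2).

```python
import random

def velocity_jump_positions(eps, t_end, n_particles, seed=0):
    """Simulate 1D velocity-jump particles underlying Eq. (1):
    travel at speed 1/eps in direction -1 or +1, collide after an
    exponential waiting time with rate 1/eps^2, and re-sample the
    direction uniformly from {-1, +1} at each collision."""
    rng = random.Random(seed)
    positions = []
    for _ in range(n_particles):
        x, t = 0.0, 0.0
        direction = rng.choice((-1.0, 1.0))
        while t < t_end:
            flight = rng.expovariate(1.0 / eps**2)  # Exp with rate 1/eps^2
            flight = min(flight, t_end - t)         # stop exactly at t_end
            x += (direction / eps) * flight         # transport at v/eps
            t += flight
            direction = rng.choice((-1.0, 1.0))     # collision: re-sample
        positions.append(x)
    return positions
```

Note that re-sampling from \(\mathcal {M}(v)\) may leave the direction unchanged; only half of the collisions actually reverse the particle.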

Equation (1) can be simulated using a wide selection of methods, which broadly fall into two categories. Deterministic methods solve (1) for \(f(x,v,t)\) on a grid over the position-velocity phase space (using, for instance, finite differences or finite volumes). This approach quickly becomes computationally infeasible as the dimension grows, since a grid must be formed over the 2d-dimensional domain \(\mathcal {D}_x \times \mathcal {D}_v\). Stochastic methods, on the other hand, simulate individual particle trajectories, with each trajectory sampling the probability distribution \(f(x,v,t)\). These methods do not suffer from the dimensionality of the phase space, but introduce a statistical error in the computed solution. When using explicit time steps, both approaches become prohibitively expensive for small values of \(\varepsilon \) due to the time step restriction.

To avoid these issues when \(\varepsilon \) becomes small, one can use asymptotic-preserving schemes [9]. Such methods preserve the macroscopic limit equation, in our case given by (2), as \(\varepsilon \) tends to zero, while avoiding time step constraints. Many such methods have been developed in the literature for deterministic discretizations in the diffusive limit. For more information, we refer to a recent review paper [4] and the references therein. In the particle setting, only a few asymptotic-preserving methods exist, mostly in the hyperbolic scaling [3, 6, 7, 16,17,18]. In the diffusive scaling, we are only aware of three works [2, 5, 14]. We use the approach taken in [5], which uses operator splitting to produce an unconditionally stable fixed time step particle method. This stability comes at the cost of an extra bias in the model, proportional to the size of the time step.

In [13], the multilevel Monte Carlo method [8] was applied to the scheme presented in [5]. Multilevel Monte Carlo methods first compute an initial estimate using a large number of samples with a large time step (low cost per sample). This estimate has low variance, but is expected to have a large bias. Next, the bias of the initial estimate is reduced by performing corrections using a hierarchy of simulations with decreasing time step sizes. Under the right conditions, far fewer samples with small time steps are needed, compared to a direct Monte Carlo simulation with the smallest time step. This approach results in a reduced computational cost for a given accuracy. A core component of a multilevel Monte Carlo method is a numerical strategy to generate samples with different time step sizes which are correlated, i.e., which approximate the same continuous particle trajectory. This paper presents a strategy for improving the correlation developed in [13], thus further reducing the computational cost of particle-based simulations of (1).

The remainder of this paper is structured as follows. In Sect. 2, we present a naive particle-based Monte Carlo scheme for simulating the kinetic equation (1) and show why this approach fails. In Sect. 3, we show how combining the asymptotic-preserving scheme from [5] with the multilevel Monte Carlo method combats these issues. In Sect. 4, we discuss the flaws of the existing method and introduce our improved correlation. In Sect. 5, we present experimental results using the new correlation, demonstrating computational speed-up. In Sect. 6, we summarize our results and list further challenges.

2 Monte Carlo Simulation Near the Diffusive Limit

For the sake of exposition, we limit this work to one spatial dimension, i.e., \(d=1\), but our approach extends straightforwardly to more dimensions. Equation (1) now becomes

$$\begin{aligned} \partial _t f(x,v,t) + \dfrac{v}{\varepsilon }\partial _x f(x,v,t) = \frac{1}{\varepsilon ^2}(\mathcal {M}(v)\rho (x,t)-f(x,v,t)). \end{aligned}$$
(3)

We also restrict this paper to simulations with two discrete velocities, i.e., \(\mathcal {M}(v) \equiv \frac{1}{2} \left( \delta _{v,-1} + \delta _{v,1} \right) \), with \(\delta \) the Kronecker delta function.

We focus on computing a quantity of interest \(Y(t^*)\), which takes the form of an integral over a function F(xv) of the particle position X(t) and velocity V(t) at time \(t=t^*\), with respect to the measure \(f(x,v,t)\,\text {d}x\,\text {d}v\), i.e.,

$$\begin{aligned} Y(t^*) = {{\,\mathrm{\mathbb {E}}\,}}\left[ F\left( X(t^*),V(t^*)\right) \right] = \int _{\mathcal {D}_v}\int _{\mathcal {D}_x} F(x,v) f(x,v,t^*)\,\text {d}x\,\text {d}v. \end{aligned}$$

Equation (3) can be simulated using a particle scheme with a fixed time step size \(\varDelta t\). Each particle p has a state \(\left( X_{p,\varDelta t}^n , V_{p,\varDelta t}^n \right) \) in the position-velocity phase space at each time step n, with \(X_{p,\varDelta t}^n \approx X_p(n\varDelta t)\) and \(V_{p,\varDelta t}^n \approx V_p(n\varDelta t)\). We represent the distribution \(f(x,v,t)\) of particles at time \(n \varDelta t\) by an ensemble of P particles, with indices \(p \in \{1,\dots ,P\}\),

$$\begin{aligned} \left\{ \left( X_{p,\varDelta t}^n,V_{p,\varDelta t}^n\right) \right\} _{p=1}^P. \end{aligned}$$
(4)

A classical Monte Carlo estimator \(\hat{Y}(t^*)\) for \(Y(t^*)\) averages over such an ensemble

$$\begin{aligned} \hat{Y}(t^*) = \frac{1}{P}\sum _{p=1}^P F\left( X^N_{p,\varDelta t}, V^N_{p,\varDelta t}\right) , \quad t^* = N \varDelta t. \end{aligned}$$
(5)

To generate an ensemble (4) we perform forward time stepping based on operator splitting, which is first order in the time step \(\varDelta t\) [15]. For (3), operator splitting results in two actions for each time step:

  1.

    Transport step. Each particle’s position is updated based on its velocity

    $$\begin{aligned} X^{n+1}_{p,\varDelta t} = X^n_{p,\varDelta t} + \varDelta t V_{p,\varDelta t}^n. \end{aligned}$$
    (6)
  2.

    Collision step. Between transport steps, each particle’s velocity is either left unchanged (no collision) or re-sampled from \(\mathcal {M}(v)\) (collision), i.e.,

    $$\begin{aligned} V_{p,\varDelta t}^{n+1} = {\left\{ \begin{array}{ll} V_{p,\varDelta t}^{n,\prime } \sim \mathcal {M}(v),&{}\text {with collision probability } p_{c,\varDelta t} = \varDelta t/\varepsilon ^2,\\ V_{p,\varDelta t}^n,&{}\text {otherwise.} \end{array}\right. } \end{aligned}$$
    (7)

Scheme (6)–(7) has a severe time step restriction \(\varDelta t = \mathcal {O}(\varepsilon ^2)\) when approaching the limit \(\varepsilon \rightarrow 0\). This time step restriction results in unacceptably high simulation costs, despite the well-defined limit (2) [5]. Given that the variance of the estimator \(\hat{Y}(t^*)\) scales with \(\frac{1}{P}\), a high individual simulation cost will severely constrain the precision with which we can approximate \(Y(t^*)\).
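A minimal Python sketch of the splitting scheme (6)-(7) (function name is ours; the velocities carry the \(1/\varepsilon \) scaling of (3)) makes the restriction concrete: the collision probability \(\varDelta t/\varepsilon ^2\) must remain a valid probability, forcing \(\varDelta t \le \varepsilon ^2\).

```python
import random

def naive_simulate(eps, dt, t_end, n_particles, seed=0):
    """Operator-splitting scheme (6)-(7) for Eq. (3) with two discrete
    velocities -1/eps and +1/eps. Requires dt <= eps^2 so that the
    collision probability p_c = dt / eps^2 stays a valid probability."""
    assert dt <= eps**2, "time step restriction dt = O(eps^2)"
    rng = random.Random(seed)
    p_c = dt / eps**2
    n_steps = round(t_end / dt)
    positions = []
    for _ in range(n_particles):
        x = 0.0
        v = rng.choice((-1.0, 1.0)) / eps
        for _ in range(n_steps):
            x += dt * v                 # transport step (6)
            if rng.random() < p_c:      # collision step (7)
                v = rng.choice((-1.0, 1.0)) / eps
        positions.append(x)
    return positions
```

With \(\varepsilon = 0.1\) and \(\varDelta t = \varepsilon ^2/10\), reaching \(t^*=0.5\) already takes 500 steps per particle; halving \(\varepsilon \) quadruples the step count.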

3 Multilevel Monte Carlo Using AP Schemes

In the previous section, we demonstrated that a time step \(\varDelta t = \mathcal {O}(\varepsilon ^2)\) is needed to resolve the collision dynamics of the kinetic equation (3). As \(\varepsilon \) becomes small, this causes a naive Monte Carlo estimator to be computationally infeasible. In this section, we first present an asymptotic-preserving Monte Carlo scheme which maintains stability for large time steps by reducing the contribution of collision dynamics for larger time steps. We will then present a multilevel Monte Carlo estimator, using this scheme, to combine simulations with different time step sizes, reducing the overall computational cost of the estimator \(\hat{Y}(t^*)\).

3.1 Asymptotic-Preserving Monte Carlo Scheme

We use the asymptotic-preserving scheme from [5] as an alternative to (6)–(7). This asymptotic-preserving scheme has no time step constraints, so \(\varepsilon \) and \(\varDelta t\) can be chosen independently. The scheme simulates a modified version of (3):

$$\begin{aligned} \partial _t f + \frac{\varepsilon v}{\varepsilon ^2+\varDelta t} \partial _x f = \frac{\varDelta t}{\varepsilon ^2+\varDelta t} \partial _{xx} f + \frac{1}{\varepsilon ^2+\varDelta t} (\mathcal {M}(v)\rho - f). \end{aligned}$$
(8)

In (8), we have omitted the space, velocity and time dependency of \(f(x,v,t)\) and \(\rho (x,t)\), for conciseness. Note that the coefficients of (8) now explicitly contain the simulation time step \(\varDelta t\).

Equation (8) has the following properties [5]:

  1.

    For \(\varepsilon \rightarrow 0\), (8) converges to the diffusion equation (2).

  2.

    For \(\varDelta t \rightarrow 0\), (8) converges to the original kinetic equation (3) with a rate \(\mathcal {O}(\varDelta t)\).

  3.

    For \(\varDelta t \rightarrow \infty \), (8) converges to the diffusion equation (2) with a rate \(\mathcal {O}\big (\frac{1}{\varDelta t}\big )\).

The first property states that (8) has the same asymptotic limit in \(\varepsilon \) as (3). The second and third properties provide us with an intuitive understanding of (8). We can interpret the modified equation as a combination of the diffusion equation (2) and the original kinetic equation (3), where the two contributions are weighted as a function of \(\varDelta t\). For large time steps, the diffusion equation dominates the kinetic equation. At the particle level, the diffusion equation corresponds with Brownian motion, which has no time step constraints, hence the stability of (8).
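This weighting can be made concrete by evaluating the \(\varDelta t\)-dependent coefficients of (8); the helper below is our own sketch, not code from [5].

```python
def modified_coefficients(eps, dt):
    """Coefficients of the modified equation (8): the diffusion weight,
    the transport (characteristic velocity) weight, and the collision
    rate prefactor, all as functions of the time step dt."""
    diffusion = dt / (eps**2 + dt)    # multiplies the d_xx term
    transport = eps / (eps**2 + dt)   # multiplies v d_x
    collision = 1.0 / (eps**2 + dt)   # multiplies the BGK term
    return diffusion, transport, collision
```

For \(\varDelta t \gg \varepsilon ^2\) the diffusion weight approaches one and the transport weight vanishes, while for \(\varDelta t \rightarrow 0\) the coefficients approach those of (3), namely \(1/\varepsilon \) and \(1/\varepsilon ^2\).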

Particle trajectories are now simulated as follows:

  1.

    Transport-diffusion step. The position of the particle is updated based on its velocity and a Brownian increment

    $$\begin{aligned} X^{n+1}_{p,\varDelta t} = X^n_{p,\varDelta t} + V^n_{p,\varDelta t} \varDelta t + \sqrt{2 \varDelta t}\sqrt{D_{\varDelta t}}\xi ^n_p, \end{aligned}$$
    (9)

    in which we generate \(\xi _p^n \sim \mathcal {N}(0,1)\) and introduce a \(\varDelta t\)-dependent diffusion coefficient \(D_{\varDelta t}=\frac{\varDelta t}{\varepsilon ^2 + \varDelta t}\) and velocity \(V^n_{p,\varDelta t} \sim \mathcal {M}_{\varDelta t}(v)\), where the time step dependent distribution \(\mathcal {M}_{\varDelta t}(v)\) with characteristic velocity \(\tilde{v}_{\varDelta t}=\frac{\varepsilon }{\varepsilon ^2+\varDelta t}\) can be decomposed as

    $$\begin{aligned} \mathcal {M}_{\varDelta t}(v) = \tilde{v}_{\varDelta t}\mathcal {M}(v). \end{aligned}$$
    (10)
  2.

    Collision step. During collisions, each particle’s velocity is updated as:

    $$\begin{aligned} V_{p,\varDelta t}^{n+1} = {\left\{ \begin{array}{ll} V_{p,\varDelta t}^{n,\prime } \sim \mathcal {M}_{\varDelta t}(v),&{}\text {with collision probability } p_{c,\varDelta t} = \dfrac{\varDelta t}{\varepsilon ^2+\varDelta t},\\ V_{p,\varDelta t}^n,&{}\text {otherwise.} \end{array}\right. } \end{aligned}$$
    (11)

The scheme (9)–(11) stably generates trajectories for large \(\varDelta t\). However, these trajectories are biased by \(\mathcal {O}(\varDelta t)\) compared to those generated by (6)–(7).
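The asymptotic-preserving scheme (9)-(11) can be sketched as follows (function name is ours); \(\varDelta t\) and \(\varepsilon \) are chosen independently, and for \(\varDelta t \gg \varepsilon ^2\) even a single step reproduces the diffusive spread \(2t^*\) of (2).

```python
import math
import random

def ap_simulate(eps, dt, t_end, n_particles, seed=0):
    """Asymptotic-preserving scheme (9)-(11) from [5] for Eq. (3),
    stable for any dt. Velocities follow the decomposition (10):
    V = v_tilde * vbar with vbar in {-1, +1}."""
    rng = random.Random(seed)
    D = dt / (eps**2 + dt)         # dt-dependent diffusion coefficient
    v_tilde = eps / (eps**2 + dt)  # dt-dependent characteristic velocity
    p_c = dt / (eps**2 + dt)       # collision probability
    n_steps = round(t_end / dt)
    positions = []
    for _ in range(n_particles):
        x = 0.0
        v = v_tilde * rng.choice((-1.0, 1.0))
        for _ in range(n_steps):
            xi = rng.gauss(0.0, 1.0)
            x += v * dt + math.sqrt(2.0 * dt * D) * xi  # step (9)
            if rng.random() < p_c:                      # step (11)
                v = v_tilde * rng.choice((-1.0, 1.0))
        positions.append(x)
    return positions
```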

3.2 Multilevel Monte Carlo

The key idea behind the multilevel Monte Carlo method (MLMC) [8] is to combine simulations with different time step sizes to simulate many trajectories (low variance), while also simulating accurate trajectories (low simulation bias).

First, a coarse Monte Carlo estimator \(\hat{Y}_0(t^*)\) with a time step \(\varDelta t_0\) is used

$$\begin{aligned} \hat{Y}_0(t^*) = \frac{1}{P_0}\sum _{p=1}^{P_0} F\left( X^{N_0}_{p,\varDelta t_0},V^{N_0}_{p,\varDelta t_0}\right) , \quad t^* = N_0 \varDelta t_0. \end{aligned}$$
(12)

Estimator (12) is based on a large number of trajectories \(P_0\), but has a low computational cost as few time steps \(N_0\) are required to reach time \(t^*\).

Next, \(\hat{Y}_0(t^*)\) is refined by a sequence of L difference estimators at levels \(\ell =1,\ldots ,L\). Each difference estimator uses an ensemble of \(P_\ell \) particle pairs

$$\begin{aligned} \hat{Y}_\ell (t^*) = \frac{1}{P_\ell }\sum _{p=1}^{P_\ell }\left( F\left( X^{N_\ell }_{p,\varDelta t_\ell },V^{N_\ell }_{p,\varDelta t_\ell }\right) -F\left( X^{N_{\ell -1}}_{p,\varDelta t_{\ell -1}},V^{N_{\ell -1}}_{p,\varDelta t_{\ell -1}}\right) \right) ,\quad t^* = N_\ell \varDelta t_\ell . \end{aligned}$$
(13)

Each correlated particle pair consists of a particle with a fine time step \(\varDelta t_\ell \) and a particle with a coarse time step \(\varDelta t_{\ell -1} = M \varDelta t_\ell \), with M a positive integer. The particles in each pair undergo correlated simulations, which can intuitively be understood as making both particles follow the same qualitative trajectory at two different simulation accuracies. One can interpret the difference estimator (13) as using the fine simulation to estimate the bias in the coarse simulation.

Given a sequence of levels \(\ell \in \{0,\dots ,L\}\), with decreasing step sizes, and the corresponding estimators given by (12)–(13), the multilevel Monte Carlo estimator for the quantity of interest \(Y(t^*)\) is computed by the telescopic sum

$$\begin{aligned} \hat{Y}(t^*) = \sum _{\ell =0}^{L} \hat{Y}_\ell (t^*). \end{aligned}$$
(14)

It is clear that the expected value of the estimator (14) is the same as that of (5), with the finest time step \(\varDelta t = \varDelta t_L\). Given a sufficiently fast reduction in the number of simulated (pairs of) trajectories \(P_\ell \) as \(\ell \) increases, it can be shown that the multilevel Monte Carlo estimator requires a lower computational cost than a classical Monte Carlo estimator to achieve the same mean square error.
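The telescoping structure of (12)-(14) can be sketched generically (function names and the toy samplers below are ours): the level-0 estimator is plain Monte Carlo, and each difference estimator (13) corrects the bias of the previous level.

```python
import random

def mlmc_estimate(sample_level0, sample_pair, n_samples, n_levels, rng):
    """Telescoping MLMC estimator (12)-(14).
    sample_level0(rng) returns one coarse sample F_0;
    sample_pair(level, rng) returns a correlated pair (F_fine, F_coarse)
    at levels (level, level - 1)."""
    # Level 0: plain Monte Carlo with the coarsest time step, Eq. (12).
    est = sum(sample_level0(rng) for _ in range(n_samples[0])) / n_samples[0]
    # Levels 1..L: difference estimators, Eq. (13).
    for lvl in range(1, n_levels + 1):
        pairs = [sample_pair(lvl, rng) for _ in range(n_samples[lvl])]
        est += sum(f - c for f, c in pairs) / len(pairs)
    return est
```

In a toy model where each level produces samples biased by its time step and pairs share their randomness, the differences become nearly deterministic, so only level 0 needs many samples.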

3.3 Term-by-term Correlation Approach

The differences in (13) will only have low variance if the simulated paths up to \(X^{N_\ell }_{p,\varDelta t_\ell }\) and \(X^{N_{\ell -1}}_{p,\varDelta t_{\ell -1}}\), with time steps related by \(\varDelta t_{\ell -1}=M\varDelta t_{\ell }\), are correlated. To discuss these correlated pairs of trajectories, we define a sub-step index \(m \in \{1,\dots , M\}\), i.e., \(X^{n,m}_{p,\varDelta t_\ell }\approx X_p(n\varDelta t_{\ell -1}+m\varDelta t_{\ell })\equiv X_p((nM+m)\varDelta t_{\ell })\). In order to span a time interval of size \(\varDelta t_{\ell -1}\), the coarse simulation requires a single time step of size \(\varDelta t_{\ell -1}\), while the fine simulation requires M time steps of size \(\varDelta t_{\ell }\):

$$\begin{aligned} X^{n+1}_{p,\varDelta t_{\ell -1}}&= X^n_{p,\varDelta t_{\ell -1}} + V^n_{p,\varDelta t_{\ell -1}} \varDelta t_{\ell -1} + \sqrt{2 \varDelta t_{\ell -1}}\sqrt{D_{\varDelta t_{\ell -1}}}\,\xi ^{n}_{p,\ell -1},\\ X^{n,m+1}_{p,\varDelta t_{\ell }}&= X^{n,m}_{p,\varDelta t_{\ell }} + V^{n,m}_{p,\varDelta t_{\ell }} \varDelta t_{\ell } + \sqrt{2 \varDelta t_{\ell }}\sqrt{D_{\varDelta t_{\ell }}}\,\xi ^{n,m}_{p,\ell }, \end{aligned}$$

with \(\xi ^{n}_{p,\ell -1}, \xi ^{n,m}_{p,\ell } \sim \mathcal {N}(0,1)\), \(V_{p,\varDelta t_{\ell -1}}^n \sim \mathcal {M}_{\varDelta t_{\ell -1}}(v)\) and \(V_{p,\varDelta t_{\ell }}^{n,m} \sim \mathcal {M}_{\varDelta t_\ell }(v)\).

There are two sources of stochastic behavior in the asymptotic-preserving scheme (9)–(11). On the one hand, a new Brownian increment \(\xi _p^n\) is generated at each time step. On the other hand, each time step has a finite collision probability. Collisions produce a new particle velocity \(V_{p, \varDelta t}^{n+1}\) for the next time step.

To correlate these simulations we first perform M independent time steps of the fine simulation. We then combine the M sets of random numbers in the fine path into one set of random numbers which can be used in a single time step for the coarse simulation. If the fine simulation random numbers are combined in such a way that the resulting coarse simulation random numbers are distributed as if they had been generated independently, then the coarse simulation statistics will be preserved, while also being dependent on the fine simulation.

In [13], an approach was developed to independently correlate Brownian increments and collision phenomena, which we briefly describe here. For a full derivation, demonstrative figures and an analysis, we refer to [12, 13].

Correlating Brownian Increments. In each fine sub-step \(m \in \{1,\dots , M\}\) a Brownian increment \(\xi ^{n,m}_{p,\ell } \sim \mathcal {N}(0,1)\) is generated. The sum of these M increments is distributed as \(\mathcal {N}(0,M)\). This means that a Brownian increment for the corresponding coarse simulation step can be generated as

$$\begin{aligned} \xi ^{n}_{p,\ell -1} = \frac{1}{\sqrt{M}}\sum _{m=1}^M \xi ^{n,m}_{p,\ell } \sim \mathcal {N}(0,1). \end{aligned}$$
(15)
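A quick sketch of (15) (helper name is ours), together with a statistical check that the coarse increment is again standard normal while being built from the fine increments:

```python
import math
import random

def coarse_brownian_increment(fine_xis):
    """Eq. (15): sum the M fine increments xi ~ N(0,1) and rescale by
    1/sqrt(M), giving a coarse increment that is exactly N(0,1)."""
    return sum(fine_xis) / math.sqrt(len(fine_xis))
```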

Correlating Collisions. Implementing (11) requires simulating a collision in a fine sub-step \(m \in \{1,\dots , M\}\), with the collision probability \(p_{c,\varDelta t_\ell }\). In practice this is achieved by drawing a uniformly distributed random number \(u^{n,m}_{p,\ell } \sim \mathcal {U}([0,1])\) and performing a collision if this number is larger than the probability \(p_{nc,\varDelta t_\ell }=1-p_{c,\varDelta t_\ell }\) that no collision has occurred in the time step

$$\begin{aligned} u_{p,\ell }^{n,m} \ge p_{nc,\varDelta t_\ell } = \frac{\varepsilon ^2}{\varepsilon ^2+\varDelta t_\ell }. \end{aligned}$$
(16)

At least one collision has taken place in the fine simulation if the largest of the generated \(u_{p,\ell }^{n,m}\), \(m \in \{1,\dots ,M\}\), satisfies (16), i.e.,

$$\begin{aligned} u^{n,\text {max}}_{p,\ell } = \max _m u_{p,\ell }^{n,m} \ge p_{nc,\varDelta t_\ell }. \end{aligned}$$
(17)

When (17) is satisfied, the probability of a collision taking place in the correlated coarse simulation should be large. This is the case if we compare

$$\begin{aligned} u_{p,\ell -1}^n = {\left( u^{n,\text {max}}_{p,\ell }\right) }^M \sim \mathcal {U}([0,1]), \end{aligned}$$
(18)

with \(p_{nc,\varDelta t_{\ell -1}}\) in the coarse simulation.
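The coupling (17)-(18) rests on the fact that the maximum of M i.i.d. \(\mathcal {U}([0,1])\) draws, raised to the power M, is again uniform, since \(\mathbb {P}(\max _m u^{n,m}_{p,\ell } \le u) = u^M\). A sketch (helper name is ours) with a moment check:

```python
import random

def coarse_collision_uniform(fine_us):
    """Eq. (18): take the maximum of the M fine uniforms and raise it
    to the power M. Since P(max <= u) = u^M, the result is U([0,1]),
    and it is large whenever a fine sub-step collided via (16)."""
    return max(fine_us) ** len(fine_us)
```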

If a collision is performed in the simulation at level \(\ell -1\) using (18), then the coarse simulation velocity should be correlated with that of the fine simulation, at the end of the time interval. We consider the decomposition of \(\mathcal {M}(v)\) given in (10). In those of the M fine sub-steps that contain a collision, we draw a time step independent velocity \(\bar{v}^{n,m}_{p,\ell }\) to generate \(V^{n,m}_{p,\ell }\). Given that the \(\bar{v}^{n,m}_{p,\ell }\) are i.i.d., we can select one freely to use as \(\bar{v}^n_{p,{\ell -1}}\). To maximize the correlation of the velocities at the end of the time interval, we take the last generated \(\bar{v}^{n,m}_{p,\ell }\), i.e.,

$$\begin{aligned} \bar{v}^{n}_{p,\ell -1} = \bar{v}^{n,m^*}_{p,\ell }, \quad \text {with} \quad m^* = \max \left\{ m \,:\, u^{n,m}_{p,\ell } \ge p_{nc,\varDelta t_\ell } \right\} . \end{aligned}$$
(19)

4 Improved Coupling of Particle Trajectories

4.1 Limitations of Term-by-Term Correlation

In [12], it was demonstrated that a multilevel Monte Carlo scheme based on the correlation in Sect. 3.3 achieves an asymptotic computational cost \(\mathcal {O}\left( E^{-2}\log ^2(E)\right) \) as a function of the root mean square error tolerance \(E\), for the following sequence of levels:

  1.

    At level 0, generate an initial estimate of \(\hat{Y}(t^*)\) by simulating with \(\varDelta t_0 = t^*\).

  2.

    At level 1, perform correlated simulations to \(t^*\) using \(\varDelta t_0 = t^*\) and \(\varDelta t_1 = \varepsilon ^2\).

  3.

    Continue to generate a geometric sequence of levels \(\varDelta t_\ell = \varepsilon ^2M^{1-\ell }\) for \(\ell >1\) until an acceptably low bias has been achieved.

This result only holds when \(\varDelta t_\ell \ll \varepsilon ^2\), however, meaning that the scheme only significantly reduces computational cost when calculating very accurate results.

To demonstrate this phenomenon, we present a multilevel simulation with a two-speed velocity distribution, using the correlation from Sect. 3.3 to estimate the squared particle displacement \(F(x,v) = x^2\). The results for \(\varepsilon =0.1\), an RMSE tolerance \(E=0.1\), \(f(x,v,0)\equiv \delta _{x,0} \frac{1}{2} \left( \delta _{v,-1} + \delta _{v,1} \right) \), and \(t^*=0.5\) are shown in Table 1. In this table, we list various statistics of the samples \(\hat{F}_\ell \), the sample differences \(\hat{F}_\ell - \hat{F}_{\ell -1}\) and the resulting estimators \(\hat{Y}_\ell \), as well as the computational cost. We define \(\hat{F}_{-1} \equiv 0\) and define the sample cost \(C_\ell \) relative to a simulation with \(\varDelta t = \varepsilon ^2\).

Table 1. Computing \(\hat{Y}(t^*)\) for \(\varepsilon =0.1\) and \(E=0.1\). Listed quantities are the fine time step size \(\varDelta t_\ell \), number of samples \(P_\ell \), expected value \(\mathbb {E}\) and variance \(\mathbb {V}\) of the differences of simulations \( \hat{F}_\ell - \hat{F}_{\ell -1}\), estimator variance \(\mathbb {V}[\hat{Y}_\ell ]\), sample cost \(C_\ell \) and level cost \(P_\ell C_\ell \).

Looking at the values of \(\mathbb {V}[ \hat{F}_\ell - \hat{F}_{\ell -1} ]\) in Table 1, it is clear that the variance at level 1 is almost as large as that at level 0. Comparing the values of \(\mathbb {V}[ \hat{F}_\ell - \hat{F}_{\ell -1} ]\) with those of \(\mathbb {E}[ \hat{F}_\ell - \hat{F}_{\ell -1} ]\), we see, in contrast, that the expected value of the estimator at level 1 is an order of magnitude smaller than that at level 0. This indicates that the lack of variance reduction from level 0 to level 1 is due to insufficient correlation between the trajectories used in the estimator at level 1. As a result, a large number of samples \(P_1\) is needed, each with cost \(C_1\), making the total cost \(P_1 C_1\) of level 1 much higher than that of level 0.

The reason for this poor correlation can be found in the intuitive interpretation of Eq. (8) given in Sect. 3.1. For large time steps, the diffusion term dominates the transport and collision effects, whereas the inverse is true for smaller time steps. At level 1, the coarse diffusive simulation is generated from a fine simulation with \(\varDelta t_1 = \varepsilon ^2\), which contains a significant transport component. The correlation approach in Sect. 3.3 correlates the Brownian increments independently from the collisions and the resulting velocity changes. Hence, the transport-collision effects from the fine simulation, which form a significant part of the particle behavior, are ignored in the coarse simulation.

4.2 Improved Correlation Approach

In this section, we present an improved correlation approach for particles simulated with the scheme (9)–(11). This approach lets the velocities \(\bar{v}^{n,m}_{p,\ell }\) generated in the fine simulation influence the Brownian increments \(\xi ^{n}_{p,\ell -1}\) in the correlated coarse simulation, avoiding the correlation issues described in Sect. 4.1.

We start from [10], which shows how one can simulate Brownian motion in the weak sense, using an approximate weak Euler scheme. The lower order moments of the approximate Brownian increments in this scheme must be bounded by

$$\begin{aligned} \left| {{\,\mathrm{\mathbb {E}}\,}}\left[ \xi _{p,\ell -1}^n \right] \right| + \left| {{\,\mathrm{\mathbb {E}}\,}}\left[ \left( \xi _{p,\ell -1}^n \right) ^3 \right] \right| + \left| {{\,\mathrm{\mathbb {E}}\,}}\left[ \left( \xi _{p,\ell -1}^n \right) ^2 \right] - 1 \right| \le K \varDelta t_{\ell -1}, \end{aligned}$$
(20)

for some constant K. If this condition holds, the scheme retains the same weak order of convergence as the classical Euler–Maruyama scheme. The new, improved correlation generates \(\xi _{p,\ell -1}^n\) as a weighted sum of a contribution \(\xi _{p,\ell -1,W}^n\) from the fine simulation diffusion and a contribution \(\xi _{p,\ell -1,T}^n\) from the fine simulation re-scaled velocities \(\bar{v}^{n,m}_{p,\ell }\)

$$\begin{aligned} \xi _{p,\ell -1}^n = \sqrt{\theta _{\ell }} \, \xi _{p,\ell -1,W}^n + \sqrt{1-\theta _{\ell }} \, \xi _{p,\ell -1,T}^n, \end{aligned}$$
(21)

with \(\theta _{\ell } \in [0,1]\). If both \(\xi _{p,\ell -1,W}^n\) and \(\xi _{p,\ell -1,T}^n\) have mean zero and variance one, the same holds for \(\xi _{p,\ell -1}^n\), so that \(K=0\) in (20). The correlation of the transport increments (18)–(19) is left unaltered, to maintain the correlation of the simulation velocities. We now discuss each term in (21):

Diffusive Contribution \(\varvec{\xi }_{\varvec{{p,\ell -1,W}}}^{\varvec{n}}.\) The coupling of Brownian increments is done in a way identical to (15) from the correlation approach in Sect. 3.3:

$$\begin{aligned} \xi _{p,\ell -1,W}^n = \frac{1}{\sqrt{M}} \sum _{m=1}^{M} \xi _{p,\ell }^{n,m}. \end{aligned}$$
(22)

Transport Contribution \(\varvec{\xi }_{\varvec{p,\ell -1,T}}^{\varvec{n}}.\) We can generate a value \(\xi _{p,\ell -1,T}^n\) with expected value zero and unit variance from the \(\bar{v}_{p,\ell }^{n,m}\) as

$$\begin{aligned} \xi _{p,\ell -1,T}^n = \left( {{\,\mathrm{\mathbb {V}}\,}}\left[ \sum _{m=1}^{M} \bar{v}_{p,\ell }^{n,m} \right] \right) ^{-\frac{1}{2}} \sum _{m=1}^{M} \bar{v}_{p,\ell }^{n,m}. \end{aligned}$$

As subsequent \(\bar{v}_{p,\ell }^{n,m}\) are correlated, computing this variance is slightly more involved than in the case of \(\xi _{p,\ell -1,W}^n\), but still straightforward

$$\begin{aligned} {{\,\mathrm{\mathbb {V}}\,}}\left[ \sum _{m=1}^{M} \bar{v}_{p,\ell }^{n,m} \right]&= \sum _{m=1}^{M} {{\,\mathrm{\mathbb {V}}\,}}\left[ \bar{v}_{p,\ell }^{n,m} \right] + 2 \sum _{m=1}^{M-1}\sum _{m^\prime =m+1}^M {{\,\mathrm{\mathrm {Cov}}\,}}\left( \bar{v}_{p,\ell }^{n,m}, \bar{v}_{p,\ell }^{n,m^\prime } \right) \nonumber \\&= M + 2 \sum _{{\mathrel {\varDelta m}}=1}^{M}(M-{\mathrel {\varDelta m}}) \left( p_{nc,\varDelta t_\ell }\right) ^{\mathrel {\varDelta m}}\nonumber \\&= M + 2\frac{p_{nc,\varDelta t_\ell }\left( \left( p_{nc,\varDelta t_\ell }\right) ^M-M(p_{nc,\varDelta t_\ell }-1)-1\right) }{(p_{nc,\varDelta t_\ell }-1)^2} \nonumber \\&= M + 2\frac{p_{nc,\varDelta t_\ell }\left( \left( p_{nc,\varDelta t_\ell }\right) ^M+Mp_{c,\varDelta t_\ell }-1\right) }{\left( p_{c,\varDelta t_\ell }\right) ^2}, \end{aligned}$$
(23)

where \(p_{c,\varDelta t_\ell }\) is the collision probability as defined in (11) and \(p_{nc,\varDelta t_\ell }=1-p_{c,\varDelta t_\ell }\). We want to improve the correlation of level 1 where a large time step \(\varDelta t_0\) is correlated with \(\varDelta t_1 = \varepsilon ^2\). In that case \(p_{nc,\varDelta t_1} = p_{c,\varDelta t_1} = \frac{1}{2}\) so (23) simplifies to

$$\begin{aligned} {{\,\mathrm{\mathbb {V}}\,}}\left[ \sum _{m=1}^{M} \bar{v}_{p,1}^{n,m} \right] = 3M + 2^{2-M} - 4 \approx 3M - 4, \end{aligned}$$
(24)

where the approximation quickly becomes more accurate for larger values of M.
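The closed form in (23) and the special case (24) can be checked against a direct evaluation of the double sum over the geometric covariances (helper names are ours):

```python
def var_sum_vbar_direct(M, p_nc):
    """Direct evaluation of the double sum in (23): the variance of the
    sum of M unit-variance velocities with geometric correlation
    Cov(vbar_m, vbar_m') = p_nc^{|m - m'|}."""
    return M + 2.0 * sum((M - dm) * p_nc**dm for dm in range(1, M + 1))

def var_sum_vbar_closed(M, p_nc):
    """Closed form on the last line of (23), with p_c = 1 - p_nc."""
    p_c = 1.0 - p_nc
    return M + 2.0 * p_nc * (p_nc**M + M * p_c - 1.0) / p_c**2
```

For \(p_{nc,\varDelta t_1} = \frac{1}{2}\), the closed form indeed reduces to \(3M + 2^{2-M} - 4\) from (24).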

Contribution Weight \(\varvec{\theta }_{\varvec{\ell }}.\) What remains is to choose the weight between the two contributions from the simulation with \(\varDelta t_\ell \) to the diffusion in the simulation with \(\varDelta t_{\ell -1}\). We aim to pick \(\theta _{\ell }\) so that (22) has an equally large contribution when simulating over a time interval of size \(\varDelta t_{\ell -1}\) at both level \(\ell \) and \(\ell -1\). At level \(\ell \) the variance contribution of (22) relative to the variance of the total simulation spanning \(\varDelta t_{\ell -1}\) with M steps is given by

$$\begin{aligned} \frac{{{\,\mathrm{\mathbb {V}}\,}}\left[ \sum _{m=1}^{M} \sqrt{2\varDelta t_\ell }\sqrt{D_\ell } \xi _{p,\ell }^{n,m} \right] }{{{\,\mathrm{\mathbb {V}}\,}}\left[ \sum _{m=1}^{M} \sqrt{2\varDelta t_\ell }\sqrt{D_\ell } \xi _{p,\ell }^{n,m} + \varDelta t_\ell \tilde{v}_\ell \bar{v}_{p,\ell }^{n,m} \right] }. \end{aligned}$$
(25)

The contribution of (22) when spanning the same time interval of size \(\varDelta t_{\ell -1}\) with a single step is given by

$$\begin{aligned} \frac{{{\,\mathrm{\mathbb {V}}\,}}\left[ \sqrt{2\varDelta t_{\ell -1}}\sqrt{D_{\ell -1}} \sqrt{\theta _{\ell }} \xi _{p,\ell -1,W}^{n} \right] }{{{\,\mathrm{\mathbb {V}}\,}}\left[ \sqrt{2\varDelta t_{\ell -1}}\sqrt{D_{\ell -1}} \xi _{p,\ell -1}^{n} + \varDelta t_{\ell -1} \tilde{v}_{\ell -1} \bar{v}_{p,\ell -1}^{n} \right] }. \end{aligned}$$
(26)

Equating (25) and (26) and solving for \(\theta _{\ell }\) gives

$$\begin{aligned} \theta _{\ell } = \frac{{{\,\mathrm{\mathbb {V}}\,}}\left[ \sum _{m=1}^{M} \sqrt{2\varDelta t_\ell }\sqrt{D_\ell } \xi _{p,\ell }^{n,m} \right] {{\,\mathrm{\mathbb {V}}\,}}\left[ \sqrt{2\varDelta t_{\ell -1}}\sqrt{D_{\ell -1}} \xi _{p,\ell -1}^{n} + \varDelta t_{\ell -1} \tilde{v}_{\ell -1} \bar{v}_{p,\ell -1}^{n} \right] }{{{\,\mathrm{\mathbb {V}}\,}}\left[ \sqrt{2\varDelta t_{\ell -1}}\sqrt{D_{\ell -1}} \xi _{p,\ell -1,W}^{n} \right] {{\,\mathrm{\mathbb {V}}\,}}\left[ \sum _{m=1}^{M} \sqrt{2\varDelta t_\ell }\sqrt{D_\ell } \xi _{p,\ell }^{n,m} + \varDelta t_\ell \tilde{v}_\ell \bar{v}_{p,\ell }^{n,m} \right] }. \end{aligned}$$

Given (23) and the unit variance of \(\xi _{p,\ell -1,W}^{n}, \xi _{p,\ell -1}^{n}, \xi _{p,\ell }^{n,m}\) and \(\bar{v}_{p,\ell -1}^{n}\), we get

$$\begin{aligned} \theta _{\ell } = \frac{D_\ell \left( 2\varDelta t_{\ell -1}D_{\ell -1} + \varDelta t_{\ell -1}^2 \tilde{v}_{\ell -1}^2 \right) }{D_{\ell -1}\left( 2M\varDelta t_\ell D_\ell + \varDelta t_\ell ^2 \tilde{v}_\ell ^2 \left( M + 2\frac{p_{nc,\varDelta t_\ell }(p_{nc,\varDelta t_\ell }^M+Mp_{c,\varDelta t_\ell }-1)}{p_{c,\varDelta t_\ell }^2}\right) \right) }. \end{aligned}$$
(27)

At level 1, a very large time step \(\varDelta t_0\) is correlated with \(\varDelta t_1 = \varepsilon ^2\). Here, we can use approximations \(D_{0}\approx 1 , \tilde{v}_{0} \approx \frac{\varepsilon }{\varDelta t_{0}}, p_{nc,\varDelta t_0} \approx \frac{\varepsilon ^2}{\varDelta t_{0}}\) and (24) to simplify (27) to

$$\begin{aligned} \theta _{1} \approx \frac{4M + 1}{7M - 4}. \end{aligned}$$

It is important to note that (21) does not produce true Brownian increments, in contrast to the independent simulation at level 0. On the one hand, the distribution of (21) is not Gaussian. On the other hand, subsequent coarse time steps are correlated. Both issues become more pronounced at higher levels, and for that reason we only apply this approach at level 1. In the next section, numerical experiments assess whether this mismatch produces biased results.
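A sketch of the combined increment (21) (helper names are ours, including a toy Markov-chain generator for the correlated fine velocities \(\bar{v}\)); a statistical check confirms mean zero and unit variance, even though, as noted above, the result is not Gaussian:

```python
import math
import random

def combined_coarse_increment(fine_xis, fine_vbars, p_nc, theta):
    """Eq. (21): weighted sum of the diffusive contribution (22) and
    the transport contribution built from the fine velocities, whose
    variance is normalized using the closed form (23)."""
    M = len(fine_xis)
    xi_w = sum(fine_xis) / math.sqrt(M)                          # Eq. (22)
    p_c = 1.0 - p_nc
    var_v = M + 2.0 * p_nc * (p_nc**M + M * p_c - 1.0) / p_c**2  # Eq. (23)
    xi_t = sum(fine_vbars) / math.sqrt(var_v)
    return math.sqrt(theta) * xi_w + math.sqrt(1.0 - theta) * xi_t

def markov_vbars(M, p_c, rng):
    """Toy generator for M fine-step velocities vbar in {-1, +1}:
    re-sampled with probability p_c each sub-step, which yields the
    geometric covariance p_nc^{|m - m'|} assumed in (23)."""
    v, out = rng.choice((-1.0, 1.0)), []
    for _ in range(M):
        out.append(v)
        if rng.random() < p_c:
            v = rng.choice((-1.0, 1.0))
    return out
```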

5 Experimental Results

5.1 Application to a Low-Precision Case

We redo the computations from Sect. 4.1, using the new correlation (21) at level 1. Comparing Tables 1 and 2, we see that the variance \(\mathbb {V}\left[ F_1 - F_0 \right] \) is reduced by a factor of 30. This variance reduction lowers the cost of the full simulation by a factor of 3.9. Given the coarse tolerance \(E=0.1\), it cannot be determined whether the improved correlation produces biased results without more accurate simulations.

Table 2. Computing \(\hat{Y}(t^*)\) for \(\varepsilon =0.1\) and \(E=0.1\) with improved correlation. Quantities as listed in Table 1.

5.2 Application to a High-Precision Case

In Table 3, we repeat the simulations of Tables 1 and 2 for \(E=0.01\). Despite the variance decrease at level 1, both simulation costs are of the same order of magnitude. This is due to the high cost of levels \(\ell >1\), which reduces the impact of the improvement at level 1. At first sight this seems a major setback; however, the penultimate column of Table 3 shows that \(\mathbb {V}[ \hat{F}_\ell - \hat{F}_{\ell -1} ] > \mathbb {V}[ F_1 - F_0 ]\) for many of these levels. This indicates that we can skip levels by correlating \(\varDelta t_0\) with \(\varDelta t_1 \ll \varepsilon ^2\) at level 1. In Table 4, we do this with \(\varDelta t_1 = \frac{\varepsilon ^2}{128}\). For a fair comparison, we also apply the same sequence of levels to the term-by-term correlation.
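The rationale for skipping levels can be seen from the standard multilevel Monte Carlo cost model (not specific to this paper): with level variances \(V_\ell \) and per-sample costs \(C_\ell \), the optimal sample allocation yields a total cost proportional to \(E^{-2}\big (\sum _\ell \sqrt{V_\ell C_\ell }\big )^2\), so removing a level with a large \(\sqrt{V_\ell C_\ell }\) contribution directly lowers the total. A minimal sketch with purely hypothetical variance and cost values (in practice the difference variances change when levels are removed):

```python
import math

def mlmc_total_cost(variances, costs, E):
    # Standard MLMC cost model: E^-2 * (sum_l sqrt(V_l * C_l))^2,
    # obtained from the optimal sample allocation N_l ~ sqrt(V_l / C_l).
    s = sum(math.sqrt(V * C) for V, C in zip(variances, costs))
    return (s / E) ** 2

# Hypothetical level variances and per-sample costs, for illustration only.
V = [1.0, 0.03, 0.05, 0.04]
C = [1.0, 4.0, 16.0, 64.0]

full = mlmc_total_cost(V, C, 0.01)
# Skipping the two intermediate levels (indices 1 and 2):
skipped = mlmc_total_cost([V[0], V[3]], [C[0], C[3]], 0.01)
```

With these numbers, `full > skipped`: the intermediate levels contribute to \(\sum _\ell \sqrt{V_\ell C_\ell }\) without being needed to reach the tolerance, mirroring the situation observed in Table 3.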

Table 3. Computing \(\hat{Y}(t^*)\) for \(\varepsilon =0.1\) and \(E=0.01\). Quantities as listed in Table 1.

Comparing Tables 3 and 4 shows that leaving out levels reduces the computational cost when using the improved correlation at level 1, while the term-by-term correlation becomes more expensive. The simulation with the improved correlation in Table 4 requires a factor of 2 less computation than that with the term-by-term correlation in Table 3.

As a reference for the bias, we note that a simulation with \(E=0.001\) in [12] computed an estimate of \(9.80\times 10^{-1}\) for \(\hat{Y}(t^*)\). Tables 3 and 4 indicate that the results using the term-by-term correlation are consistently within the set tolerance \(E=0.01\), while those using the new correlation are consistently too small. Twenty repetitions of the simulation with term-by-term correlation in Table 3 and of that with improved correlation in Table 4 yield a bias estimate of \(2.31 \times 10^{-2}\) with standard deviation \(1.14 \times 10^{-2}\). This shows that the improved correlation comes at the cost of an additional bias.
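A bias estimate of this kind can be obtained by comparing the sample means of repeated independent runs of the two estimators. The sketch below is one reasonable way to do this and is not taken from the paper; the exact estimator used there may differ:

```python
import math

def mean_and_var(xs):
    # Sample mean and (unbiased) sample variance.
    n = len(xs)
    m = sum(xs) / n
    v = sum((x - m) ** 2 for x in xs) / (n - 1)
    return m, v

def bias_estimate(reference_runs, test_runs):
    """Bias of the test estimator relative to the reference estimator,
    with the standard deviation of that bias estimate, assuming the
    repetitions of each estimator are independent."""
    m_ref, v_ref = mean_and_var(reference_runs)
    m_test, v_test = mean_and_var(test_runs)
    bias = m_ref - m_test
    std = math.sqrt(v_ref / len(reference_runs) + v_test / len(test_runs))
    return bias, std
```

Here `reference_runs` would hold the repetitions with the term-by-term correlation and `test_runs` those with the improved correlation.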

Table 4. Computing \(\hat{Y}(t^*)\) for \(\varepsilon =0.1\) and \(E=0.01\) with skipped levels. Quantities as listed in Table 1.

6 Conclusion

We presented a new, improved particle-based multilevel Monte Carlo scheme for simulating the Boltzmann-BGK equation in diffusive regimes. This scheme improves upon earlier work by introducing a correlation between the kinetic transport in the fine simulation and the diffusion in the coarse simulation. The approach reduces the variance of the difference estimator at level 1 by an order of magnitude. In turn, the reduced variance significantly decreases the computation required for low-accuracy simulations. For more accurate simulations, the reduced variance allows several computationally expensive levels to be skipped, resulting in a greatly improved computational efficiency. A caveat is the correlation between subsequent coarse time steps and the non-normal distribution of the coarse random numbers, which result in biased results. Resolving these inconsistencies is the topic of future work, which will aid the extension of the improved correlation to all levels. It also remains to be seen whether the observed biases have a significant effect in practical applications, which typically contain, among other error sources, modelling errors.