Abstract
Particle methods based on evolving the spatial derivatives of the solution were originally introduced to simulate reaction-diffusion processes, inspired by vortex methods for the Navier–Stokes equations. Such methods, referred to as gradient random walk methods, were extensively studied in the ’90s and have several interesting features, such as being grid-free, automatically adapting to the solution by concentrating elements where the gradient is large, and significantly reducing the variance of the standard random walk approach. In this work, we revive these ideas by showing how to generalize the approach to a larger class of partial differential equations, including hyperbolic systems of conservation laws. To achieve this goal, we first extend the classical Monte Carlo method to relaxation approximations of systems of conservation laws, and subsequently consider a novel particle dynamics based on the spatial derivatives of the solution. The methodology, combined with an asymptotic-preserving splitting discretization, yields a way to construct a new class of gradient-based Monte Carlo methods for hyperbolic systems of conservation laws. Several results in one spatial dimension for scalar equations and systems of conservation laws show that the new methods are very promising and yield remarkable improvements compared to standard Monte Carlo approaches, both in terms of variance reduction and in describing the shock structure.
1 Introduction
Monte Carlo methods, which were first devised during the Manhattan Project in the 1940s to simulate the behavior of neutron diffusion in fissile material [26, 47], have a long and storied history [27]. In the decades following their introduction, the methods were refined and extended to various fields, including physics, engineering, finance, and many others. The subsequent development of computers provided a significant boost to the spread of Monte Carlo methods in scientific computing, as they allowed for the simulation of more complex systems and the generation of larger numbers of random samples. Although today Monte Carlo methods are widely used in many scientific and industrial applications [23], their systematic design as a numerical analysis tool to solve partial differential equations (PDEs) is still rather limited to specific contexts compared to deterministic approaches such as finite-differences or finite-volumes. Prominent examples are applications involving stochastic differential equations, like in financial applications [6, 7, 24, 28], or when other numerical methods are not feasible due to the complexity of the problem, like in plasma physics and rarefied gas dynamics [4, 13, 29, 31, 34, 36, 40].
On the other hand, growing attention to high-dimensional problems and to uncertainty quantification in many fields of applied mathematics, including emerging fields such as life and social sciences, has greatly increased interest in developing efficient Monte Carlo approaches in such contexts [3, 17, 19, 20, 35]. Moreover, since the Monte Carlo solution is computed by averaging independent calculations, the resulting algorithms are well suited to exploit modern parallel computing techniques.
Among the various Monte Carlo methods developed for PDEs, one approach, inspired by the so-called vortex method for the Navier–Stokes equations [12], had considerable success in the 1990s for reaction-diffusion problems. The idea of the method is to use the spatial derivative of the solution (i.e., the gradient in multiple dimensions) as the unknown variable. This allows the statistical solution in the original variables to be reconstructed not from the histogram, as in standard random walk approaches, but directly as the distribution function of the samples of the derivative. In addition to the statistical reduction of fluctuations, such a technique, usually referred to as the gradient random walk (GRW) method, brings with it the advantages of adaptivity, since samples are taken according to the space derivatives, and a grid-free structure [8, 18, 41, 43, 44].
In this paper, we explore the possibility of extending such ideas to a broader class of PDEs that includes, in particular, systems of conservation laws. The construction of Monte Carlo methods for such problems is strongly inspired by the fluid-dynamical limit of the Boltzmann equation and the corresponding Monte Carlo methods for the Euler equations [31, 33, 34, 36, 39]. Let us remark that the construction of stochastic particle methods for nonlinear hyperbolic problems has been the subject of limited research in the past and that there is no general approximation methodology (see for example [5, 41, 49]). A first attempt in this direction, limited to nonnegative solutions, was proposed in [32], based on appropriate relaxation approximations of such systems [1, 21, 22]. Indeed, such a relaxation approximation allows the Monte Carlo strategy traditionally used for kinetic equations to be adapted to generic conservation laws. We mention here that related approaches based on deterministic particles have also been proposed [11, 14, 15, 25, 42].
Following the idea of approximating the system of conservation laws by means of a semi-linear hyperbolic system with source terms, we show how it is possible to formulate an approach based on statistical samples of the spatial derivatives of the solution that allows the characteristic advantages of GRW methods for reaction-diffusion equations to be generalized to such problems. In order to compare the new methods, which we will refer to as Gradient-based Monte Carlo (GBMC) methods, with a standard Monte Carlo approach, we will recall the concepts introduced in [32], extending them to the case of solutions of arbitrary sign and discussing some variance reduction strategies.
Here, we will limit ourselves to the one-dimensional case, which will enable us to focus on the fundamental concepts of the GBMC method and address both the scalar case and the case of systems of conservation laws, leaving to further research multi-dimensional extensions. We will show how the resulting methods will be grid-free in the scalar case, while in the case of systems, this may not always be possible. In all test cases considered, however, the GBMC method offers significant advantages over the traditional Monte Carlo approach, including better spatial resolution of discontinuities and a reduction in variance by several orders of magnitude.
The rest of the article is organized as follows. In Sect. 2 we will recall the basic concepts of GRW methods for reaction-diffusion problems. Then in Sect. 3 we will discuss the design of the novel GBMC approach in the case of simple relaxation systems for scalar conservation laws. In this section we will also introduce the details of a direct Monte Carlo approach by generalizing some of the ideas presented in [32]. Several numerical examples for various scalar problems illustrate the strong advantages of GBMC over standard Monte Carlo. Next, Sect. 4 is devoted to the extension of the methodology to the case of hyperbolic relaxation approximations to systems of conservation laws. In such cases, even if a grid-free approach is related to the possibility to diagonalize the system, we show through several examples that the resulting GBMC schemes maintain most of the advantages observed in the scalar case. Some concluding remarks and further research directions are discussed in the last section. Finally, “Appendix A” reports some estimates in the \(L^p\) distance of the solution reconstruction error from the particles, showing the improvement in accuracy of the GBMC approximation over the corresponding MC.
2 Gradient Random Walk for Reaction-Diffusion Equations
In this section we recall the basic ideas behind the design of GRW methods for reaction-diffusion problems [8, 18, 25, 41, 43, 44]. We illustrate the idea by treating the one-dimensional scalar reaction–diffusion equation
Let us assume that \(u_0(x)\) is monotone and bounded from below and above. Without loss of generality, we impose \(u(-\infty ,t)=0\) and \(u(\infty ,t)=1\), for all \(t\ge 0\). We introduce the auxiliary variable \(w=\partial u/\partial x\) and observe that it satisfies the equation
Note that, by virtue of the assumptions on \(u_0(x)\), \(w_0(x)\) is a probability density.
Using N particle samples located respectively in \(X_1,\ldots ,X_N\) (we use capital X to indicate that the positions are random variables) we will have
discretizing w as a sum of \(\delta \)-functions. Then, since \(u(x,t)=\int _{-\infty }^x w(y,t)\,dy\) is the cumulative distribution function of the random variable X, we get
where \(H(\cdot )\) is the (right-continuous) Heaviside step function
Note that the boundary condition at the left is exactly satisfied, while the boundary condition at the right is satisfied on average, with fluctuations. Of course we can use a right reconstruction approach that would lead to the opposite situation (see also Remark 3).
Notice that the same idea can be applied in case of weighted particles, namely particles having different mass \(m_i\), thus approximating the function \(w(\cdot , t)\) as
Here the masses \(m_i\in \{-m,m\}\), \(m>0\), may also be negative, so that solutions that are not monotone or not nonnegative can be approximated as well. From definition (5), in this case by left reconstruction we obtain
The computed solution for u(x, t) contains much less fluctuation than w(x, t), since all particles contribute to the solution at any point x, as depicted in Fig. 1. In particular, due to its nature of solving the problem looking at the gradient of the state variable, the GRW method perfectly applies to problems in which sharp fronts appear and where one needs to spatially resolve jump-like solutions, because the density of the particles is large precisely where u has large gradients. Once the initial data have been discretized, the GRW method evolves the positions and masses of the particles so that u satisfies (1). This is done by a fractional-step iteration in a small time interval \(\varDelta t\) in which the diffusion term is modeled by a random walk and the reaction term is modeled through a particle killing or replication approach with probability \(\varDelta t |G'(u)|\) (see [18, 25, 43, 44] for more details).
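The left reconstruction described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: it assumes the weighted reconstruction has the form \(u(x)=\sum_i m_i H(x-X_i)\) with a right-continuous \(H\), and the function name `reconstruct_left` and the sample data are hypothetical.

```python
import numpy as np

def reconstruct_left(x, X, m):
    """Evaluate u(x) = sum_i m_i * H(x - X_i) with right-continuous H (H(0) = 1).

    x : points where u is evaluated
    X : particle positions (samples of w = du/dx)
    m : signed particle masses (assumed form of the weighted reconstruction)
    """
    x = np.atleast_1d(np.asarray(x, dtype=float))
    X = np.asarray(X, dtype=float)
    m = np.asarray(m, dtype=float)
    order = np.argsort(X)
    # Prepend 0 so that entry k is the total mass of the k leftmost particles.
    cum_mass = np.concatenate(([0.0], np.cumsum(m[order])))
    # Number of particles with X_i <= x (ties included, since H(0) = 1).
    k = np.searchsorted(X[order], x, side="right")
    return cum_mass[k]

# Hypothetical data: three particles, one of them with negative mass.
X_demo = np.array([0.0, 1.0, 2.0])
m_demo = np.array([0.5, -0.5, 0.5])
u_demo = reconstruct_left([-1.0, 0.0, 1.5, 3.0], X_demo, m_demo)
```

Every particle to the left of \(x\) contributes to \(u(x)\), which is why the reconstructed \(u\) fluctuates much less than a histogram estimate of \(w\).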
3 Gradient-Based Monte Carlo for Hyperbolic Problems
The extension of the previous idea to hyperbolic problems and in general to nonlinear PDEs is nontrivial and, except for some specific cases [41], has been poorly studied in the literature. Among other reasons, it should be pointed out that already the construction of a standard Monte Carlo method is not obvious in the case of nonlinear PDEs [5, 32, 39]. In the following we will consider the case of hyperbolic problems, with special reference to the relaxation approximation of scalar conservation laws. Subsequently, we will discuss how to extend the method to the case of systems of conservation laws.
The starting point is the following hyperbolic system with relaxation introduced in [22] as a semi-linear approximation to scalar conservation laws
where \(x\in \varOmega \subseteq \mathbb {R}\), supplemented with the initial conditions \(u(x,0)=u_0(x)\), \(v(x,0)=v_0(x)\) and suitable boundary conditions.
In the limit \(\varepsilon \rightarrow 0\) from the second equation in (7) one formally obtains the local equilibrium \(v=F(u)\) and thus solutions to (7) are well approximated by the scalar conservation law
More precisely, if we evaluate the \(\varepsilon \) perturbation of the local equilibrium
and substitute it into the second equation in (7), after some algebraic manipulations we obtain
and therefore
which leads to the following convection-diffusion equation
The above equation is a good approximation of system (7) for \(\varepsilon \ll 1\) only if the characteristic speed \(F'(u)\) of the limit equation (8) is dominated by the characteristic speeds \(\pm a\) of the relaxation system (7), that is, if the subcharacteristic condition \(a^2 > F'(u)^2\) is satisfied (see [10, 22]).
It is worth underlining that, for small values of \(\varepsilon \), the numerical solution of system (7) requires particular care even though the transport operator is linear, because a standard explicit discretization leads to time step restrictions of the order of \(\varepsilon \). This idea was at the basis of the so-called relaxation schemes for hyperbolic systems of conservation laws [22].
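For reference, the computation outlined above can be written out as follows. This is a sketch under the assumption that (7) is the standard relaxation system of [22], namely \(\partial _t u+\partial _x v=0\), \(\partial _t v+a^2\partial _x u=-(v-F(u))/\varepsilon \); the symbol \(v_1\) for the first-order correction is introduced here only for illustration.

$$\begin{aligned} v&= F(u)+\varepsilon v_1+O(\varepsilon ^2),\\ v_1&= -\left( \partial _t v+a^2\partial _x u\right) +O(\varepsilon ) = -\left( F'(u)\,\partial _t u+a^2\partial _x u\right) +O(\varepsilon ) = -\left( a^2-F'(u)^2\right) \partial _x u+O(\varepsilon ),\\ \partial _t u+\partial _x F(u)&= -\varepsilon \,\partial _x v_1+O(\varepsilon ^2) = \varepsilon \,\partial _x\left[ \left( a^2-F'(u)^2\right) \partial _x u\right] +O(\varepsilon ^2). \end{aligned}$$

The second line uses \(\partial _t u=-F'(u)\,\partial _x u+O(\varepsilon )\). The diffusion coefficient \(a^2-F'(u)^2\) is nonnegative exactly when \(a^2\ge F'(u)^2\), which is where the subcharacteristic condition enters.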
3.1 A Direct Monte Carlo Approach
First, we recall the probabilistic approach proposed in [32] to construct a random particle solver for nonnegative densities that works uniformly with respect to the relaxation rate \(\varepsilon \). The method is inspired by the classical Direct Simulation Monte Carlo (DSMC) method for rarefied gas dynamics [4, 29, 34]. We will later discuss how to extend this approach to solutions of arbitrary sign.
Given \(a>0\), let us rewrite system (7) introducing the diagonal (kinetic) variables
and the corresponding equilibrium states
We obtain the diagonal form of the relaxation system
System (10) presents several analogies with discrete-velocity models of the Boltzmann equation, in the sense that it describes a system of particles having only two speeds \(\pm a\) which relax toward the local equilibrium states \(E^{\pm }(u)\). The kinetic interpretation requires \(E^{\pm }(u)\ge 0\) and this is guaranteed if \(a \ge |F(u)|/u\) and \(u > 0\). Under these assumptions, one can apply a direct simulation Monte Carlo approach [32].
More precisely, the solution in a small time interval \([0, \varDelta t]\) is approximated by means of a fractional step procedure that solves separately the two problems characterized by the linear transport and by the relaxation term. One therefore solves, one after the other, a transport step
and a relaxation process
Since both steps, taken separately, can be solved exactly, no further approximation is necessary besides the \(O(\varDelta t)\) error of the above splitting. In fact, in a small time interval \(\varDelta t\) the exact solutions \(\tilde{f}^{\pm }\) of the free transport (11) read
and, setting \(\tilde{u}(x,\varDelta t)=\tilde{f}^+(x,\varDelta t)+\tilde{f}^-(x,\varDelta t)\), the exact solution of the relaxation step (12) yields the approximated values at time \(\varDelta t\)
The probabilistic interpretation is readily obtained by introducing the new variables
with \(p^+(x,t)+p^-(x,t)=1\), to have the convex combination
where we denoted the equilibrium probabilities as
We note that the probabilistic interpretation holds true independently of the choice of \(\varDelta t/\varepsilon \). In particular, as \(\varepsilon \rightarrow 0\) it reduces to the projections
which characterize the probabilistic approximation of the limit equation (8).
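Written out, the two substeps and their probabilistic reading take the following form. This is a sketch assuming that the diagonal system (10) reads \(\partial _t f^{\pm }\pm a\,\partial _x f^{\pm }=-(f^{\pm }-E^{\pm }(u))/\varepsilon \) with \(f^{\pm }=(au\pm v)/(2a)\) and \(E^{\pm }(u)=(au\pm F(u))/(2a)\), the usual diagonalization of the relaxation system in [22]; the notation \(p^{\pm }=f^{\pm }/u\) and \(p_E^{\pm }\) for the equilibrium probabilities is likewise an assumption.

$$\begin{aligned} \tilde{f}^{\pm }(x,\varDelta t)&= f^{\pm }(x\mp a\varDelta t,0),\\ f^{\pm }(x,\varDelta t)&= e^{-\varDelta t/\varepsilon }\,\tilde{f}^{\pm }(x,\varDelta t)+\left( 1-e^{-\varDelta t/\varepsilon }\right) E^{\pm }(\tilde{u}(x,\varDelta t)),\\ p^{\pm }(x,\varDelta t)&= e^{-\varDelta t/\varepsilon }\,\tilde{p}^{\pm }(x,\varDelta t)+\left( 1-e^{-\varDelta t/\varepsilon }\right) p_E^{\pm }(\tilde{u}(x,\varDelta t)),\qquad p_E^{\pm }(u)=\frac{E^{\pm }(u)}{u}=\frac{1}{2}\left( 1\pm \frac{F(u)}{a\,u}\right) . \end{aligned}$$

Since the relaxation step conserves \(u=f^++f^-\), the equilibrium \(E^{\pm }(\tilde{u})\) is constant during the step and the linear equation integrates exactly. With probability \(e^{-\varDelta t/\varepsilon }\) a particle keeps its velocity; otherwise its velocity is drawn from \(p_E^{\pm }\). As \(\varepsilon \rightarrow 0\) the exponential vanishes and only the equilibrium draw remains.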
A Monte Carlo method can be derived by sampling directly from the exact solutions of the operator splitting steps (16). Note, however, that the probability of a velocity change in (17) depends on the mass density after the transport step \(\tilde{u}(x,t)\). Therefore, in order to set up a Monte Carlo method and to estimate the probability of a velocity change, we must reconstruct the mass density u in a neighborhood of the particle position. Given a set of samples \(X_1,\ldots ,X_N\) the simplest method, which produces a piecewise constant reconstruction, is based on evaluating the histogram of the samples at the cell centers of a suitable grid \(x_j=j\varDelta x\),
where \(\varPhi _{\varDelta x}(x)=1/\varDelta x\) if \(|x|\le \varDelta x/2\) and \(\varPhi _{\varDelta x}(x)=0\) elsewhere, so that each sample is counted in exactly one cell. Smoother versions can be obtained by changing the function approximating the Dirac delta. This corresponds to a convolution of the samples with a suitable mollifier [30, 34].
We can set up a Monte Carlo method in the case of positive solutions as follows. Given a set of samples \((X^0_1,V^0_1), \ldots , (X^0_{N},V^0_N)\), where the particle velocities \(V^0_i~\in ~\{-a,a\}\) distinguish the samples of \(f^+_0(x)\) from those of \(f^-_0(x)\), a new set of samples \((X_1,V_1),\ldots ,(X_{N},V_N)\) is generated by Algorithm 1.
Note that there is no need for the grid to be uniform, and the choice of the grid points can easily change at any time step. Although in principle using higher order fractional step methods would lead to an increase in the order of accuracy in \(\varDelta t\), in the limit \(\varepsilon \rightarrow 0\) such methods, as is well known, degenerate to first order [21, 33, 34]. It is to date an open problem to construct splitting techniques of order higher than one that maintain accuracy in the limit \(\varepsilon \rightarrow 0\).
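Algorithm 1 is not reproduced in this text, so the following Python sketch only shows one way a single time step could be organized from the description above. It assumes nonnegative solutions, equal particle masses, a grid that contains all particles, and equilibrium probabilities of the form \((1\pm F(u)/(a\,u))/2\); the function name `mc_step` and all of its parameters are hypothetical.

```python
import numpy as np

def mc_step(X, V, a, F, dt, eps, edges, mass, rng):
    """One splitting step (transport, then relaxation) for equal-mass particles."""
    # Transport: free flight with velocity +a or -a.
    X = X + V * dt
    # Piecewise-constant (histogram) reconstruction of u on the grid.
    counts, _ = np.histogram(X, bins=edges)
    u_cell = mass * counts / np.diff(edges)
    # Cell index of each particle (particles are assumed to stay inside the grid).
    j = np.clip(np.searchsorted(edges, X, side="right") - 1, 0, len(edges) - 2)
    u = u_cell[j]
    # Assumed equilibrium probability of velocity +a; it lies in [0, 1]
    # when a >= |F(u)|/u and u > 0. The clip only guards against round-off.
    p_plus = np.clip(0.5 * (1.0 + F(u) / (a * u)), 0.0, 1.0)
    # A particle relaxes with probability 1 - exp(-dt/eps); eps = 0 is the limit case.
    p_relax = 1.0 if eps == 0 else -np.expm1(-dt / eps)
    relax = rng.random(X.size) < p_relax
    new_V = np.where(rng.random(X.size) < p_plus, a, -a)
    return X, np.where(relax, new_V, V)

# Hypothetical setup: Burgers flux, Gaussian initial density, limit eps -> 0.
rng = np.random.default_rng(0)
N = 2000
X0 = rng.normal(size=N)
V0 = rng.choice([-1.0, 1.0], size=N)
edges = np.linspace(-8.0, 8.0, 161)
X1, V1 = mc_step(X0, V0, 1.0, lambda u: 0.5 * u**2, 0.05, 0.0, edges, 1.0 / N, rng)
```

The grid is used only to estimate \(u\) at the particle positions, so `edges` may be non-uniform and may change from one step to the next.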
Algorithm 1 can be improved by adopting a simple low variance technique which estimates the number of particles that should have the different velocities before their assignment. This can be done by replacing step 3 by first computing in each cell j the total number of interacting particles
where with \(\textrm{SRound} (x)\) we denote a “stochastic rounding” of a positive real number, by considering
[x] denoting the integer part of x, and then assigning the velocities a and \(-a\) respectively to a number
of randomly chosen particles in cell j. An implementation is reported in Algorithm 2.
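Stochastic rounding can be sketched as follows, assuming the usual definition in which \(x\) is rounded up to \([x]+1\) with probability \(x-[x]\) and down to \([x]\) otherwise, so that the expected value equals \(x\). The function name `sround` is hypothetical.

```python
import numpy as np

def sround(x, rng):
    """Round x >= 0 to floor(x) or floor(x) + 1 so that the expected value is x."""
    base = np.floor(x)
    # Round up with probability equal to the fractional part of x.
    return int(base) + int(rng.random() < x - base)

rng = np.random.default_rng(1)
samples = [sround(2.3, rng) for _ in range(20000)]
sample_mean = float(np.mean(samples))
```

Because the rounding is unbiased, the number of particles assigned to each velocity in a cell matches its expected value on average, while the only randomness left is in the last unit.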
Remark 1
As a side result of our derivation, we constructed a Monte Carlo method for a general scalar conservation law of the form (8). As for the construction of deterministic relaxation schemes for conservation laws [22], this is easily obtained by taking the limit case \(\varepsilon \rightarrow 0\) in the schemes just described (see [32]).
Remark 2
Note that in the MC method both the mesh discretization and the choice of the number of particles (and clearly also the time discretization) contribute to the numerical error. In essence, we need to take \(\varDelta x\) small to reduce the mesh bias, but also large enough to ensure that the variance is small. In fact, it is possible to estimate the optimal mesh size \(\varDelta x_{opt}\) that maximizes the performance of the method. Details on this are discussed in “Appendix A.1”.
3.1.1 The Case of Negative Solutions
If we apply the method to initial conditions and solutions that might also be negative, we need to associate particles with possible negative weights. Thus, \(m_i \in \{-m,m\}\), where \(m>0\), which we will still refer to as the particle mass, and reconstruct the solution as
In this context, we observe that the equilibrium states \(E^{\pm }(u)\) may be either positive or negative and, consequently, the probabilities \(p^+\) and \(p^-\) defined in (15) may also become negative, leading to a failure of the probabilistic interpretation. To avoid this, we can rely on the probabilistic interpretation associated with the notion of area defined by the absolute value of the equilibrium states in the computational cell. More precisely, in each cell, we define the equilibrium probabilities as
Note that when applying this definition,
and so the probabilistic interpretation of Algorithm 1 remains valid. However, an inconsistency arises if the equilibrium states have different signs, because
In fact, in the latter situation, we are faced with an indeterminate problem because there is an infinite number of possible allocations of particles with opposite masses in the two equilibrium states such that their value is satisfied (see Fig. 2 for a sketch).
To overcome this problem and re-equilibrate the distribution of particles, after the relaxation step based on the probabilities (23), we can follow two strategies in each cell:
(i) keep the particle mass fixed and introduce a variable number of particles;
(ii) keep the number of particles fixed and introduce a variable mass.
Both strategies have pros and cons and we will briefly describe the different implementations below. The first choice (i) would result in the need to discard or re-sample particles in each cell, and this can be done by a particle killing/replication strategy, as in the case of the reaction term discussed at the end of Sect. 2 for system (2). Precisely, in each cell j of width \(\varDelta x\), we define the new particle numbers
Then, we re-sample \(({\tilde{N}}^+_j - N_j^+)\) particles with positive velocity if \({\tilde{N}}^+_j > N_j^+\), or discard \((N_j^+-{\tilde{N}}^+_j)\) particles with positive velocity if \({\tilde{N}}^+_j < N_j^+\), where \(N_j^+\) is the number of particles with positive velocity prior to the reassignment. The same, of course, is done for particles with negative velocity.
In the second alternative (ii) we fix the number of particles \(N_j\) belonging to the j-th cell and, consequently, reassign their mass, which will thus depend on the cell. This can be done by defining an updated mass value \(\tilde{m}_j\) in each cell in order to respect the following balance:
Finally, in both cases (i) and (ii), the sign of mass is assigned to each particle according to the sign of the corresponding equilibrium. The details of the second strategy (ii) are reported in Algorithm 3, where for simplicity we restrict to the limiting case \(\varepsilon \rightarrow 0\).
3.2 The Gradient-Based Monte Carlo Method
In this section we show how the Monte Carlo approach just described can be improved significantly using a gradient random walk strategy. In fact, by introducing the auxiliary variables \(w=\partial u/\partial x\) and \(z=\partial v/\partial x\), we can rewrite (7) in the form
Most of the arguments presented for system (7) apply again. First we introduce the new variables
which satisfy the diagonal system
where the equilibrium states now read
After splitting the above system, the transport step and the collision/relaxation step can be solved using the same Monte Carlo approach described in the previous sections, but with a substantial difference: there is no need to reconstruct the solution u(x, t) on a space grid during the relaxation step. In fact, given a set of samples located in \(X_1\),\(\ldots \), \(X_N\) with positive and negative masses \(m_i \in \{-m,m\}\) we can compute \(u(X_i)\) following (6). Since \(D^+(u,w)+D^-(u,w)=w\), under the subcharacteristic condition \(a>|F'(u)|\), the probabilities of a random velocity change read
Therefore, applying the gradient random walk strategy, the probabilities turn out to depend only on u (and not on w), which is now a particle-dependent (and not grid-dependent) variable. This aspect is of utmost importance because it leads to a grid-free method, with consequent advantages in terms of both accuracy and computational efficiency.
Thus, we can solve system (26) with a Gradient-based Monte Carlo method (GBMC) as follows. Starting with a set of samples \((X^0_1,V^0_1), \ldots , (X^0_{N},V^0_N)\), with \(V^0_i\in \{-a,a\}\) and each sample having mass \(m_i\in \{-m,m\}\), a new set of samples \((X_1,V_1),\ldots ,(X_{N},V_N)\) is generated as reported in Algorithm 4.
Clearly, similar to the standard Monte Carlo case, taking the limit \(\varepsilon \rightarrow 0\) in the above algorithm leads to a Gradient-based Monte Carlo method for a general scalar conservation law of the form (8). Let us mention that other approximations for scalar conservation laws based on Gradient-based Monte Carlo are found in the literature for rather specific situations, like the Burgers equation (see [41] for details).
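Algorithm 4 is not reproduced in this text. As an illustration only, one step of the method could look like the Python sketch below. It assumes the left reconstruction \(u(X_i)=\sum _{k:\,X_k\le X_i} m_k\) and velocity probabilities \((a\pm F'(u))/(2a)\), which is consistent with the statement that they depend only on \(u\); the names `gbmc_step` and `dF` are hypothetical.

```python
import numpy as np

def gbmc_step(X, V, m, a, dF, dt, eps, rng):
    """One grid-free splitting step on samples of w = du/dx with signed masses m."""
    # Transport: free flight with velocity +a or -a.
    X = X + V * dt
    # Left reconstruction of u at the particle positions (no spatial grid needed).
    order = np.argsort(X)
    cum_mass = np.cumsum(m[order])
    k = np.searchsorted(X[order], X, side="right")  # k >= 1, ties included
    u = cum_mass[k - 1]
    # Assumed probability of velocity +a; it lies in [0, 1] when a >= |F'(u)|.
    p_plus = np.clip((a + dF(u)) / (2.0 * a), 0.0, 1.0)
    # A particle relaxes with probability 1 - exp(-dt/eps); eps = 0 is the limit case.
    p_relax = 1.0 if eps == 0 else -np.expm1(-dt / eps)
    relax = rng.random(X.size) < p_relax
    new_V = np.where(rng.random(X.size) < p_plus, a, -a)
    return X, np.where(relax, new_V, V), u

# Hypothetical setup: Burgers flux F(u) = u^2/2, monotone front from 0 to 1.
rng = np.random.default_rng(2)
N = 500
X0 = rng.normal(size=N)
V0 = rng.choice([-1.0, 1.0], size=N)
m = np.full(N, 1.0 / N)
X1, V1, u1 = gbmc_step(X0, V0, m, 1.0, lambda u: u, 0.05, 0.0, rng)
```

In this sketch the masses are left unchanged by the relaxation step, since under the subcharacteristic condition both assumed probabilities are nonnegative regardless of the sign of \(w\).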
Remark 3
In addition to left reconstruction (6) denoted by \(u_i^L(t)\) or the analogous right reconstruction \(u_i^R(t)\), other reconstructions can be implemented that interpolate between the two. For example, to avoid asymmetric reconstructions, a weighted average of left and right reconstructions can be computed for the whole computational range \([x_{\min },x_{\max }]\), where \(x_{\min }={\min _i}\{X_i\}\), \(x_{\max }={\max _i}\{X_i\}\) considering linearly distributed weights \({\omega }\) in [0, 1]:
Of course, more sophisticated reconstructions can be designed similarly. In the numerical examples, if not otherwise stated, we will make use of (29).
Remark 4
We emphasize that both the MC and GBMC methods here proposed present a convergence rate of \(O(1/\sqrt{N})\). However, as also clarified in the numerical results to follow, the MC method can exhibit a convergence of \(O(1/\root 3 \of {N})\) when the spatial mesh is assigned on the basis of the optimal value \(\varDelta x _{opt}\). For further details and a rigorous analysis of the reconstruction errors in the methods, we invite the reader to refer to Appendix A.
3.3 Numerical Examples for Scalar Conservation Laws
In this section we compare the numerical solutions obtained with the standard Monte Carlo approach and the Gradient-based Monte Carlo method for different problems governed by scalar conservation laws including an empirical convergence rate test. Indeed, since the \(\varepsilon \rightarrow 0\) case is the most challenging as the limiting nonlinear scalar conservation law can form discontinuities in finite time, we will limit the presentation of results to this situation.
3.3.1 Empirical Convergence Rate
In the first test case, we compute the convergence rate of the methods with respect to the number of particles solving the inviscid Burgers equation, corresponding to \(F(u)=u^2/2\) in (7) and \(\varepsilon \rightarrow 0\). First, we consider a normal distribution as initial datum and run the simulations up to \(t=2.5\), prior to the classical shock formation; then, a test with sinusoidal initial condition (i.e., involving also particles with negative masses) with solution at \(t=0.5\) is taken into account. In Fig. 3 we compare the convergence rates obtained in the two test cases using
- the Monte Carlo method with the mesh size \(\varDelta x\) kept fixed as the number of particles is refined (MC);
- the Monte Carlo method with the optimal choice of the mesh size \(\varDelta x\) as a function of the number of particles N (MC\(_{opt}\)), as discussed in “Appendix A.1”, in particular referring to Eq. (62);
- the Gradient-based Monte Carlo method (GBMC).
These results are given in terms of the relative \(L^2\) norms, being
with \(u^{ref}\) denoting the reference solution. Here we considered as reference solution the one obtained with a finite volume Godunov method on a very refined mesh. All the errors are computed with respect to the mean of 5 runs with N particles, while keeping an empirically chosen, sufficiently small, time step \(\varDelta t\) fixed (to prevent the time error from overriding the error due to the choice of particle number, thus altering the convergence curves). As discussed in “Appendix A”, it can be observed that both the standard MC and GBMC methods present an error decay of \(O(1/\sqrt{N})\), while when optimizing the choice of the mesh size, the MC method converges with \(O(1/\root 3 \of {N})\). Nevertheless, the error produced by the GBMC is always smaller than that obtained by applying either the optimized or non-optimized MC method. In particular, in Table 1, we report the precise gain factor from using the GBMC instead of both MC versions in terms of accuracy as a function of the number of particles N, intended as the ratio of the two norms \(L^2_{MC}/L^2_{GBMC}\) and \(L^2_{MC_{opt}}/L^2_{GBMC}\). Let us finally point out that in both test cases the decay of the order of accuracy in the MC method observed for large N is due to the fact that the mesh error dominates the statistical error due to the particle number (thus the convergence curve saturates at an asymptotic value of the \(L^2\) norm of the error caused by the fixed \(\varDelta x\)). This is in good agreement with the analysis in “Appendix A”.
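The error measure and the empirical rate can be computed along the following lines. This Python sketch assumes that the relative \(L^2\) norm is the discrete ratio \(\Vert u-u^{ref}\Vert _2/\Vert u^{ref}\Vert _2\) evaluated on a common set of points; the function names are hypothetical and the data are synthetic, not results from the paper.

```python
import numpy as np

def relative_l2_error(u, u_ref):
    """Discrete relative L2 error of u with respect to u_ref on the same points."""
    u = np.asarray(u, dtype=float)
    u_ref = np.asarray(u_ref, dtype=float)
    return np.linalg.norm(u - u_ref) / np.linalg.norm(u_ref)

def empirical_order(N, err):
    """Least-squares slope of log(err) versus log(N)."""
    slope, _ = np.polyfit(np.log(N), np.log(err), 1)
    return slope

# Synthetic errors decaying exactly like 1/sqrt(N), for illustration only.
N_demo = np.array([1e2, 1e3, 1e4, 1e5])
err_demo = 3.0 / np.sqrt(N_demo)
order_demo = empirical_order(N_demo, err_demo)
```

A slope close to \(-1/2\) corresponds to the \(O(1/\sqrt{N})\) decay, and a slope close to \(-1/3\) to the \(O(1/\root 3 \of {N})\) decay discussed above.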
3.3.2 Inviscid Burgers Equation
The MC (without low variance technique) and GBMC methods are then applied again to the inviscid Burgers equation considering three different initial conditions.
- Test 1(a): In the first case, the initial datum of the inviscid Burgers equation is a Gaussian density with zero mean and unit variance.
- Test 1(b): In the second case, we consider a rectangular wave as initial datum
$$\begin{aligned} u(x,0)= \left\{ \begin{array}{ll} 0.4 \quad &\textrm{if}\,\, -2\le x\le 2,\\ 0 \quad &\textrm{otherwise}. \end{array}\right. \end{aligned}$$ (30)
- Test 1(c): In the third case, we fix as initial condition a sinusoidal function
$$\begin{aligned} u(x,0)=\sin (x)\,, \end{aligned}$$ (31)
to assess the performance of the methods even when considering negative solutions, hence introducing particles with negative mass.
For each test case, results are given at equal time discretization for the two methods in Figs. 4–6. With the Monte Carlo method, \(N=1000\) or \(N=10000\) particles and \(M=100\) cells for the domain discretization are considered, while with the GBMC approach, \(N=100\) or \(N=1000\) particles are used. The reference solution is obtained employing a finite volume Godunov method with a very refined spatial grid. Comparing the results obtained with the two methods, the remarkable improvement in variance reduction and in capturing the shock fronts achieved by the GBMC method is evident, even when using far fewer particles than the standard MC. Moreover, in Fig. 4 the influence of the choice of the time step size on the final solution can also be observed for both methods. In particular, we note that the choice of \(\textrm{CFL} =1\) (and thus \(\varDelta t = 0.5\) for a mesh with \(M=50\) and \(\varDelta t = 0.25\) for a grid with \(M=100\) cells) leads to an evident numerical dissipation, visible near the shock. In fact, we remark that the Monte Carlo approach here proposed, for \(N \rightarrow \infty \) and \(\textrm{CFL} = 1 \Rightarrow \varDelta t = \varDelta x/a\), coincides with the Lax–Friedrichs scheme (see [22] for further details), which is known to produce considerable numerical diffusion.
3.3.3 Lighthill–Whitham–Richards Traffic Model
As a second scalar conservation law, we consider the Lighthill–Whitham–Richards (LWR) traffic model [38], for which \(F(u)=u(1-u)\) in (7) and \(\varepsilon \rightarrow 0\), taking into account the following initial conditions.
- Test 2: We consider the Riemann problem (RP) presented in [14], in which
$$\begin{aligned} u(x,0) = \left\{ \begin{array}{ll} 0.4 \quad &\textrm{if} \,\,-1 \le x \le 0,\\ 0.8 \quad &\textrm{if} \,\,0 < x \le 1,\\ 0 \quad &\textrm{otherwise}. \end{array} \right. \end{aligned}$$ (32)
Observations similar to those for Test 1 can be made here when comparing the results in Fig. 7, which shows numerical solutions of the LWR test case obtained with the standard Monte Carlo and the GBMC method for different numbers of particles. The markedly reduced variance of the GBMC solutions can again be appreciated even though fewer particles are used than in direct Monte Carlo, which also speeds up the simulation. Moreover, the proposed method captures well the sharp discontinuities arising in the dynamics, thanks to its adaptive nature: the particles follow the solution and concentrate where the gradient is large.
4 Extensions to Hyperbolic Systems of Conservation Laws
In this section we show how to extend the Monte Carlo and Gradient-based Monte Carlo techniques to the case of hyperbolic relaxation approximations to systems of conservation laws. As before, the methods we derive here work uniformly with respect to the stiff relaxation rate and, in the zero relaxation limit, yield a Monte Carlo or Gradient-based Monte Carlo method for the corresponding hyperbolic system of conservation laws. Our attention will be focused in particular on the behavior of the methods in that limit.
4.1 A Monte Carlo Approach
Consider the system of conservation laws in one space variable
with \(x\in \varOmega \subseteq \mathbb {R}\), \(\textbf{u}=(u_1,\ldots ,u_n)\in \mathbb {R}^n\), \(n\ge 2\). The above system is strictly hyperbolic if the Jacobian matrix \(\textbf{F}'(\textbf{u})\) admits n distinct real eigenvalues \(\lambda _1<\ldots < \lambda _n\). The system is complemented with the initial conditions \(\textbf{u}(x,0)=\textbf{u}_0(x)\) and suitable boundary conditions.
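The relaxation approximation now reads [22]
$$\begin{aligned} \begin{aligned}&\frac{\partial \textbf{u}}{\partial t} + \frac{\partial \textbf{v}}{\partial x} = 0,\\&\frac{\partial \textbf{v}}{\partial t} + A^2\frac{\partial \textbf{u}}{\partial x} = -\frac{1}{\varepsilon }\left( \textbf{v}-\textbf{F}(\textbf{u})\right) , \end{aligned} \end{aligned}$$(34)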
where \(\textbf{v}\in \mathbb {R}^n\) and \(A^2=\textrm{diag}\{a_1^2,\ldots ,a_n^2\}\) must satisfy the dissipative condition \(A^2 > \textbf{F}'(\textbf{u})^2\) (i.e., matrix \(A^2 - \textbf{F}'(\textbf{u})^2\) positive semi-definite) for all \(\textbf{u}\), with initial conditions \(\textbf{v}(x,0)=\textbf{v}_0(x)\) and defined boundary conditions. Notice that, for \(\textbf{u}\) varying in a bounded domain, the dissipative condition can always be satisfied by choosing sufficiently large A, but because of the CFL condition, for numerical stability, it is desirable to obtain the smallest A meeting the criterion. Following [22], we say that system (34) is dissipative if it is strictly stable in the sense of Majda-Pego, which is satisfied if \(a^2_h > \lambda ^2_h,\, h=1,\ldots ,n\).
The diagonal variables are
which yield the system
with
As in the case of scalar conservation laws, if we allow the initial conditions to be negative, the equilibrium states \(\textbf{E}^{\pm }(\textbf{u})\) can be either positive or negative as well. Therefore, recalling the discussion presented in Sect. 3.1.1, we solve the system of equations component-wise. We associate to each component its own family of particles h and define the probabilities of a random velocity change for each family as
and proceed similarly to the scalar case by either considering a variable particle number or including the update of the particles’ mass. In the latter case, in each cell j of width \(\varDelta x\) we define for each species h the mass value that fulfills the following relation:
Here \(N_j^h\) is the number of particles of the family h inside the j-th cell. Notice that we assume that particles in the same cell belonging to different components of the system do not interact. Thus, the different components are completely decoupled except for the equilibrium states \(E^\pm _h\), where the coupling occurs. In Algorithm 5 we summarize the weighted Monte Carlo scheme in the limiting case \(\varepsilon \rightarrow 0\) starting with n sets of samples \((X^{h,0}_1,V^{h,0}_1)\), \(\ldots \), \((X^{h,0}_{N^h},V^{h,0}_{N^h})\), with mass \((m^{h,0}_1,\ldots ,m^{h,0}_{N^h})\), \(h=1,\ldots ,n\), where \(V^{h,0}_i\in \{-a_h,a_h\}\).
Remark 5
We remark that the low variance technique presented in Algorithm 2 can be straightforwardly implemented in this context as well. Clearly, taking the limit \(\varepsilon \rightarrow 0\) in Algorithm 5 yields a Monte Carlo method for general systems of conservation laws. It is worth noting that the same strategy can be adopted starting from other relaxation approximations such as the one proposed in [1].
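To make the family-wise structure concrete, here is a minimal sketch of one splitting step in the limit \(\varepsilon \rightarrow 0\). It is not Algorithm 5: it covers only the simplest variant with fixed particle mass and nonnegative components, and it assumes equilibria of the form \(E_h^{\pm } = (a_h u_h \pm F_h(\textbf{u}))/(2a_h)\), so that a particle of family h takes velocity \(+a_h\) with probability \(E_h^+/u_h\). The toy linear flux, the clipping safeguard, and all names are hypothetical; boundary handling is omitted.

```python
import numpy as np

def mc_relaxation_step(X, V, a, flux, mass, edges, dt, rng):
    """One splitting step: relaxation (eps -> 0), then free transport.

    X, V  : lists of arrays, one per particle family h
    a     : array of speeds a_h; mass: mass carried by each particle
    flux  : maps U of shape (n, M) to F(U) of shape (n, M)
    """
    n, M = len(X), len(edges) - 1
    dx = edges[1] - edges[0]
    # Histogram reconstruction of every component on the grid
    U = np.array([np.histogram(X[h], bins=edges)[0] * mass / dx
                  for h in range(n)])
    F = flux(U)  # the only place where the families are coupled
    for h in range(n):
        cell = np.clip(np.searchsorted(edges, X[h], side="right") - 1, 0, M - 1)
        u, f = U[h, cell], F[h, cell]
        # Assumed rule: P(V = +a_h) = E_h^+ / u_h = (1 + F_h / (a_h u_h)) / 2,
        # clipped to [0, 1] as a safeguard in poorly populated cells
        p_plus = np.clip(0.5 * (1.0 + f / (a[h] * u)), 0.0, 1.0)
        V[h] = np.where(rng.random(X[h].size) < p_plus, a[h], -a[h])
        X[h] = X[h] + V[h] * dt
    return X, V

# Toy usage with a hypothetical linear flux F(u) = 0.3 * (u_2, u_1)
rng = np.random.default_rng(0)
N, dt = 2000, 0.02
edges = np.linspace(-1.0, 1.0, 41)
a = np.array([1.0, 1.0])
X = [rng.uniform(-0.5, 0.5, N) for _ in range(2)]
V = [np.zeros(N), np.zeros(N)]
X0 = [x.copy() for x in X]
X, V = mc_relaxation_step(X, V, a, lambda U: 0.3 * U[::-1], 1.0 / N, edges, dt, rng)
```

Signed equilibria would require the weighted or variable-particle-number variants described in the text, which this sketch does not attempt.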
4.2 The Gradient-Based Monte Carlo Method
To extend the Gradient-based Monte Carlo method to systems of conservation laws, we need to introduce the quantities \(\textbf{w}=\partial \textbf{u}/ \partial x\), \(\textbf{z}=\partial \textbf{v}/ \partial x\) to get from (34)
In diagonal form, using the variables
we obtain
where now
We can now observe that, in the case of general hyperbolic systems of conservation laws, we cannot make the probabilities \(p_D^\pm \) of velocity switches (from positive to negative or vice versa) in the relaxation process independent of the vector \(\textbf{w}\), unless we can diagonalize the Jacobian matrix \(\textbf{F}'(\textbf{u})\), rewriting the system in an equivalent diagonal form. Thus, in the general case of systems it is not possible to avoid the introduction of a spatial grid. We emphasize, however, that a major difference from the Monte Carlo method persists, which ensures that better accuracy is achieved: equilibrium states are defined for each individual particle, \(\textbf{w}\) being reconstructed in the grid cells but \(\textbf{u}\) being particle-dependent. This allows the solution \(\textbf{u}\) to be reconstructed by cumulative distribution in a manner analogous to the scalar case.
For the sake of simplicity, let us assume that the original system (33) can be rewritten in characteristic form through the Riemann invariants, so that the matrix \(\textbf{F}'(\textbf{u})\) is the diagonal matrix of the eigenvalues \(\lambda _1,\ldots ,\lambda _n\) (to keep the notation light, we omit the transition from the starting variables \(\textbf{u}\) to the diagonal variables \(\hat{\textbf{u}}\)). We can then write equilibria (41) component-wise for each species h as
Under the condition \(a_h > |\lambda _h(\textbf{u})|\), this allows the grid dependence to be removed from the probabilities of random velocity changes, which read
Therefore, as in the case of a relaxation approximation to a scalar conservation law, when it is possible to rewrite the system in terms of characteristic variables, the GBMC algorithm keeps the grid-free property. In this situation, starting with n sets of samples \((X^{h,0}_1,V^{h,0}_1), \ldots , (X^{h,0}_{N^h},V^{h,0}_{N^h})\), \(h=1,\ldots ,n\), where \(V^{h,0}_i\in \{-a_h,a_h\}\) and each particle has fixed mass \(m^h_i\in \{-m,m\}\), a new set of samples \((X^h_1,V^h_1),\ldots ,(X^h_{N^h},V^h_{N^h})\) is generated as presented in Algorithm 6.
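The grid-free mechanism can be illustrated for a single characteristic family. The following is a minimal sketch and not Algorithm 6: it assumes that, in the limit \(\varepsilon \rightarrow 0\), a particle takes velocity \(+a\) with probability \((a+\lambda (u))/(2a)\), with u at each particle obtained by accumulating the signed masses from the left. The Burgers-type speed \(\lambda (u)=u\), the inclusive cumulative-sum convention, and all names are illustrative assumptions.

```python
import numpy as np

def gbmc_step(X, m, a, speed, u_left, dt, rng):
    """One grid-free step: relaxation (eps -> 0), then transport."""
    # u at each particle: left state plus the masses of all particles up to it
    order = np.argsort(X, kind="stable")
    u = np.empty_like(X)
    u[order] = u_left + np.cumsum(m[order])
    # Assumed switching rule, independent of w when a > |speed(u)|
    p_plus = 0.5 * (1.0 + speed(u) / a)
    V = np.where(rng.random(X.size) < p_plus, a, -a)
    return X + V * dt, V

# Decreasing step from u = 1 to u = 0, carried by negative-mass particles
rng = np.random.default_rng(1)
N, a, dt = 2000, 1.5, 0.01           # a > max |u| = 1
X0 = rng.uniform(-0.05, 0.05, N)
m = np.full(N, -1.0 / N)
X = X0.copy()
for _ in range(50):                  # final time T = 0.5
    X, V = gbmc_step(X, m, a, lambda u: u, u_left=1.0, dt=dt, rng=rng)

# The particle cloud should travel at about the shock speed 1/2
mean_shift = X.mean() - X0.mean()
```

In a diagonalized system each family h would use \(\lambda _h\) in place of `speed`, evaluated on the characteristic variables accumulated from all families at the particle position; that coupling is not shown here.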
Remark 6
In the general case, as already mentioned, one cannot avoid the introduction of a spatial grid, since we need to reconstruct \(\textbf{w}\) in each grid cell j. We remark that the variables \(\textbf{w}\) in the GBMC play the same role as \(\textbf{u}\) in the direct MC, hence the reconstruction in space is still computed according to (22). Once this is done, for each particle \(X_i\) we define the probabilities
and then apply the usual GBMC method, with the exception that for consistency we must introduce either particles (representing spatial derivatives) with different masses in each cell j or a variable number of particles, analogous to the weighted Monte Carlo technique.
4.3 Numerical Examples for Systems of Conservation Laws
Here we test the two numerical methodologies, standard MC (possibly with the variance reduction technique presented in Algorithm 2) and GBMC, with two different systems of conservation laws (namely, shallow water equations and Aw-Rascle traffic model), both of which can be rewritten in terms of characteristic variables, thus allowing the gradient approach to be applied without introducing the spatial mesh. Then, we present a test case for the isentropic Euler system without the passage through the characteristic form of the model, so considering the mesh-dependent version of the GBMC method. We restrict the presentation of our results to the limiting case \(\varepsilon \rightarrow 0\).
4.3.1 Shallow Water Equations
First, we consider the 1D shallow water equations with horizontal bottom topography [45]
where h is the water depth, u is the velocity, \(m=hu\) is the momentum, and g is the gravitational acceleration. System (45) can be written in the compact form (33), with
Therefore, we may write the relaxation approximation (34) and then directly apply the Monte Carlo method previously discussed for the case of systems of conservation laws.
To apply the GBMC approach in mesh-less form, it is first necessary to rewrite system (45) in terms of characteristic variables. Evaluating the eigenstructure of the system, whose eigenvalues are \(\lambda _{1,2}=u \pm c\), where \(c = \sqrt{gh}\), we can derive the following Riemann invariants:
It is therefore possible to re-write the system in diagonal form, knowing that
so the final system reads:
If we define
the system can be written as
When introducing \(\textbf{w}=\partial \hat{\textbf{u}}/ \partial x\) and \(\textbf{z}=\partial \hat{\textbf{v}}/ \partial x\), the relaxation approximation of the above system reads as (39), and the mesh-less GBMC algorithm can be straightforwardly applied.
We test the Monte Carlo and Gradient-based Monte Carlo methods proposed here on two Riemann problems designed following [45].
- Test 3(a): In the first case, in the domain \(\varOmega =[-0.5,0.5]\), we set
$$\begin{aligned} \begin{aligned}&h(x,0) = 1,&\quad&u(x,0)=0, \qquad&\textrm{for} \quad x<0\,,\\&h(x,0) = 2,&\quad&u(x,0)=0, \qquad&\textrm{for} \quad x\ge 0\,. \end{aligned} \end{aligned}$$
- Test 3(b): In the second case, we consider an almost dry bed solution in the domain \(\varOmega =[-1,1]\), imposing as initial conditions
$$\begin{aligned} \begin{aligned}&h(x,0) = 1,&\quad&u(x,0)=-5, \qquad&\textrm{for} \quad x<0\,,\\&h(x,0) = 1,&\quad&u(x,0)=5, \qquad&\textrm{for} \quad x\ge 0\,. \end{aligned} \end{aligned}$$
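Since the relaxation speeds must dominate the eigenvalues \(u \pm c\), a first estimate for them can be read off the initial states of the two tests. The sketch below is only a starting estimate under stated assumptions: the value of g is not given in the text and is assumed here, the function name is hypothetical, and the evolved solution may contain faster states than the initial data.

```python
import numpy as np

G = 9.81  # gravitational acceleration, assumed value

def max_wave_speed(h, u, g=G):
    """Largest |u +/- sqrt(g h)| over the given states."""
    c = np.sqrt(g * np.asarray(h, dtype=float))
    # max(|u + c|, |u - c|) equals |u| + c because c >= 0
    return float(np.max(np.abs(u) + c))

# Left and right states of Test 3(a) and Test 3(b)
a_3a = max_wave_speed([1.0, 2.0], [0.0, 0.0])
a_3b = max_wave_speed([1.0, 1.0], [-5.0, 5.0])
```

Any relaxation speed used in practice would have to exceed these values, while staying as small as possible to limit the restriction coming from the CFL condition.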
We compare the solutions obtained in terms of the variables h and u by both methods in Figs. 8 and 9. We remark here that in the GBMC approach particles are evolving in space and time following the characteristic variables \(\varGamma _{1,2}\) defined in (46). Therefore, the plots here shown are obtained considering that
The increased accuracy and the strongly reduced variance of the solutions produced with the GBMC are even more evident here when compared with the results of the low variance Monte Carlo, even with 200 times fewer particles. Moreover, the capability of the GBMC method to better capture the position and the sharpness of shock waves is again confirmed. We point out once more that this advantage is linked to the key feature of the proposed method, which allows the particles to move following the gradient of the solution.
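The change of variables between physical and characteristic quantities can be sketched as follows. The definition (46) is assumed here to be the classical pair of shallow water Riemann invariants \(\varGamma _{1,2} = u \pm 2c\) with \(c=\sqrt{gh}\); this form, the value of g, and the function names are assumptions made for illustration.

```python
import numpy as np

def to_invariants(h, u, g=9.81):
    """(h, u) -> (Gamma_1, Gamma_2), assuming Gamma_{1,2} = u +/- 2 sqrt(g h)."""
    c = np.sqrt(g * h)
    return u + 2.0 * c, u - 2.0 * c

def from_invariants(gamma1, gamma2, g=9.81):
    """(Gamma_1, Gamma_2) -> (h, u), inverse of to_invariants."""
    u = 0.5 * (gamma1 + gamma2)
    c = 0.25 * (gamma1 - gamma2)
    return c**2 / g, u

def eigenvalues(gamma1, gamma2):
    """lambda_{1,2} = u +/- c written in terms of the invariants."""
    return 0.25 * (3.0 * gamma1 + gamma2), 0.25 * (gamma1 + 3.0 * gamma2)

# Round trip on the right state of Test 3(a): h = 2, u = 0
g1, g2 = to_invariants(2.0, 0.0)
h_back, u_back = from_invariants(g1, g2)
lam1, lam2 = eigenvalues(g1, g2)
```

Expressing the eigenvalues through the invariants is what lets the switching probabilities be evaluated particle by particle, without going back to a grid.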
4.3.2 Aw-Rascle Traffic Model
In this Section we test the standard and Gradient-based Monte Carlo methods considering the Aw-Rascle traffic model [2]:
Here \(\rho \) represents the density of cars, u the velocity, and \(p(\rho )\) is a given function describing the anticipation of road conditions in front of the drivers. The system can be written in the compact form (33) defining
so it would be possible to directly apply the Monte Carlo method. Instead, to resort to a grid-free gradient approach, it is again necessary to express the system in terms of characteristic variables. The eigenvalues of the model are \(\lambda _{1} = u\), \(\lambda _{2} = u - \rho p'(\rho )\), while the Riemann invariants are
Hence, we can write the system in diagonal form as
If we define
the system reads as (48) and, following the GBMC derivation, the method can again be applied without introducing any spatial grid. Notice that the term \(\rho p'(\rho )\) can also be expressed in terms of characteristic variables, but its form depends on the definition of the function \(p(\rho )\).
We consider the Riemann Problem taken from [2, 15], which accounts for a solution with vacuum, having initial data as follows.
- Test 4: In the domain \(\varOmega =[-1.5,1.5]\), we have \(p(\rho ) = 6 \rho \) and initial conditions
$$\begin{aligned} \begin{aligned}&\rho (x,0) = 0.05,&\quad&u(x,0)=0.05, \qquad&\textrm{for} \quad x<0\,,\\&\rho (x,0) = 0.05,&\quad&u(x,0)=0.5, \qquad&\textrm{for} \quad x\ge 0\,. \end{aligned} \end{aligned}$$
Notice that in this test \(p'(\rho )=6\), hence in \(\textbf{F}'(\hat{\textbf{u}})\) we have \(F'_{22} = \varGamma _2 - 6\rho \).
We present the results obtained solving the problem with the low variance Monte Carlo and the Gradient-based Monte Carlo in Fig. 10, which again shows an excellent behavior, even in this very challenging RP, especially regarding the almost absent variance in the GBMC. Notice that in the latter, particles follow the dynamics of the characteristic variables \(\varGamma _{1,2}\) defined in (50). Hence, while \(u =\varGamma _2\), to compute the density we need to consider that \(\rho = (\varGamma _1 - \varGamma _2)/6\).
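The recovery of the physical variables in Test 4 can be written out as a short sketch. It uses only the relations stated above, \(u=\varGamma _2\) and \(\rho = (\varGamma _1-\varGamma _2)/6\) for \(p(\rho )=6\rho \), which imply \(\varGamma _1 = u + p(\rho )\); the function names and the parameter k are hypothetical.

```python
def aw_rascle_to_invariants(rho, u, k=6.0):
    """(rho, u) -> (Gamma_1, Gamma_2) for the linear law p(rho) = k * rho."""
    return u + k * rho, u

def aw_rascle_from_invariants(gamma1, gamma2, k=6.0):
    """(Gamma_1, Gamma_2) -> (rho, u, lambda_1, lambda_2)."""
    u = gamma2
    rho = (gamma1 - gamma2) / k
    lam1 = u
    lam2 = u - k * rho          # equals 2 * Gamma_2 - Gamma_1
    return rho, u, lam1, lam2

# Left and right states of Test 4
left = aw_rascle_from_invariants(*aw_rascle_to_invariants(0.05, 0.05))
right = aw_rascle_from_invariants(*aw_rascle_to_invariants(0.05, 0.5))
```

For this pressure law both eigenvalues are linear in the invariants, so the switching probabilities of each family can be evaluated directly from the particle values.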
4.3.3 Isentropic Euler System
We finally consider the following one-dimensional isentropic Euler system [9]:
where \(\rho \) is the gas density, u is the velocity, and \(m=\rho u\) is the momentum. System (52) can be written in the compact form (33), being
Therefore, we may write the relaxation approximation (34) and then apply either the standard Monte Carlo approach or the Gradient method, the latter in the mesh-dependent version discussed in Remark 6, since the system is not diagonal.
We solve the isentropic Euler equations with the standard Monte Carlo, the low variance Monte Carlo, and the Gradient-based approach, considering two Riemann problems with the following initial conditions, taken from [9].
- Test 5(a): In the first case, in the domain \(\varOmega =[-1,1]\), we set
$$\begin{aligned} \begin{aligned}&\rho (x,0) = 2,&\quad&m(x,0)=1, \qquad&\textrm{for} \quad x<0.2\,,\\&\rho (x,0) = 1,&\quad&m(x,0)=0.13962, \qquad&\textrm{for} \quad x\ge 0.2\,. \end{aligned} \end{aligned}$$
- Test 5(b): In the second case, again considering the domain \(\varOmega =[-1,1]\), we impose
$$\begin{aligned} \begin{aligned}&\rho (x,0) = 1,&\quad&m(x,0)=0, \qquad&\textrm{for} \quad x<0\,,\\&\rho (x,0) = 0.2,&\quad&m(x,0)=0, \qquad&\textrm{for} \quad x\ge 0\,. \end{aligned} \end{aligned}$$
Once more, looking at Figs. 11 and 12, the increased accuracy and the strongly reduced variance of the solutions produced with the Gradient approach are evident when compared with the results of the standard Monte Carlo method, even with 10 times fewer particles and the same spatial grid. These figures also show the beneficial effects of the variance reduction technique applied to the standard Monte Carlo, as seen by comparing the first two rows in each figure. Moreover, the capability of the GBMC to better capture the position and the sharpness of shock waves is again confirmed. We point out once more that this advantage is linked to the key feature of the proposed method, which allows particles to move following the gradient of the solution.
5 Conclusions
Monte Carlo methods have become increasingly important in scientific computing due to their ability to handle complex systems and quantify uncertainty. Despite this, their systematic use in solving partial differential equations is still limited compared to deterministic approaches, which in many cases provide greater flexibility and accuracy. In this paper, we attempt to take a step forward in the design of Monte Carlo methods for PDEs by analyzing their systematic use for relaxation approximations to systems of hyperbolic conservation laws [22]. On the one hand, we extend Monte Carlo techniques of direct simulation inspired by kinetic theory to systems of hyperbolic conservation laws. On the other hand, we consider a different approach based on the use of the spatial derivative of the solution, which was developed earlier for reaction-diffusion equations [44] and which we refer to as Gradient-based Monte Carlo (GBMC). The latter method has shown great potential due to its grid-free structure and its ability to concentrate particles where the solution has large derivatives. In the presented test cases, the GBMC method proves to be significantly more accurate than the Monte Carlo method.
Let us notice that, by combining the techniques presented here with previous results for reaction-diffusion problems, it is possible to deal with general systems of PDEs involving convection-diffusion-reaction terms. Moreover, in this paper, we have limited our analysis to the one-dimensional case. In the future, our aim is to extend the GBMC method to the multi-dimensional case through a component-wise approach (i.e., differentiating the system of equations under study component by component to obtain the system to be solved with the method, thus preserving a mesh-less reconstruction) and to more general systems. Another primary research direction is to extend the GBMC approach to kinetic equations. This will be done first for neutron transport equations, which, thanks to their linear structure, permit a natural extension of the method, and subsequently for the nonlinear BGK model, which has a relaxation structure of the type studied in this paper. We leave further investigations on these topics to future research.
Data Availability
The datasets generated and/or analyzed during the current study are available from the corresponding author on reasonable request.
References
Aregba-Driollet, D., Natalini, R.: Discrete kinetic schemes for multidimensional systems of conservation laws. SIAM J. Numer. Anal. 37(6), 1973–2004 (2001)
Aw, A., Klar, A., Rascle, M., Materne, T.: Derivation of continuum traffic flow models from microscopic follow-the-leader models. SIAM J. Appl. Math. 63, 259–278 (2002)
Bertaglia, G., Liu, L., Pareschi, L., Zhu, X.: Bi-fidelity stochastic collocation methods for epidemic transport models with uncertainties. Netw. Heterog. Media 17, 401–425 (2022)
Bird, G.A.: Molecular Gas Dynamics and Direct Simulation of Gas Flows. Clarendon Press, Oxford (1994)
Bossy, M., Talay, D.: A stochastic particle method for the McKean–Vlasov and the Burgers equation. Math. Comput. 66, 157–192 (1997)
Boyle, P.P.: Options: a Monte Carlo approach. J. Financ. Econ. 4, 323–338 (1977)
Caflisch, R.E.: Monte Carlo and quasi-Monte Carlo methods. Acta Numer. 7, 1–49 (1998)
Chauvin, B., Rouault, A.: A stochastic simulation for solving scalar reaction-diffusion equations. Adv. Appl. Probab. 22, 88–100 (1990)
Caflisch, R.E., Jin, S., Russo, G.: Uniformly accurate schemes for hyperbolic systems with relaxation. SIAM J. Numer. Anal. 34, 246–281 (1997)
Chen, G.Q., Levermore, D., Liu, T.P.: Hyperbolic conservation laws with stiff relaxation terms and entropy. Commun. Pure Appl. Math. 47, 787–830 (1994)
Chertock, A.: A practical guide to deterministic particle methods. In: Abgrall, R., Shu, C.W. (eds.) Handbook of Numerical Analysis, vol. 18, pp. 177–202. Elsevier, Amsterdam (2017)
Chorin, A.J.: Numerical study of slightly viscous flows. J. Fluid Mech. 57, 785–796 (1973)
Degond, P., Dimarco, G., Pareschi, L.: The moment guided Monte Carlo method. Int. J. Num. Meth. Fluids 67(2), 189–213 (2011)
Di Francesco, M., Fagioli, S., Rosini, M.D.: Deterministic particle approximation of scalar conservation laws. Boll. Unione Mat. Ital. 10(3), 487–501 (2017)
Di Francesco, M., Fagioli, S., Rosini, M.D., Russo, G.: Follow-the-Leader Approximations of Macroscopic Models for Vehicular and Pedestrian Flows. In: Bellomo, N., Tezduyar, T.E. (eds.) Modeling and Simulation in Science, Engineering and Technology, vol. 1, pp. 333–378. Springer, Berlin (2017)
Fournier, N., Guillin, A.: On the rate of convergence in Wasserstein distance of the empirical measure. Probab. Theory Relat. Fields 162, 707–738 (2015)
Dimarco, G., Pareschi, L.: Multi-scale control variate methods for uncertainty quantification in kinetic equations. J. Comput. Phys. 388, 63–89 (2019)
Ghoniem, A., Sherman, F.S.: Grid-free simulation of reaction diffusion equations using random walk methods. J. Comput. Phys. 61, 1–37 (1985)
Giles, M.B.: Multilevel Monte Carlo methods. Acta Numer. 24, 259–328 (2015)
Hu, J., Pareschi, L., Wang, Y.: Uncertainty quantification for the BGK model of the Boltzmann equation using multilevel variance reduced Monte Carlo methods. SIAM/ASA J. Uncer. Quant. 9(2), 650–680 (2021)
Jin, S.: Efficient asymptotic-preserving (AP) schemes for some multiscale kinetic equations. SIAM J. Sci. Comput. 21, 441–454 (1999)
Jin, S., Xin, Z.: The relaxation schemes for systems of conservation laws in arbitrary space dimensions. Commun. Pure App. Math 48, 235–276 (1995)
Kroese, D.P., Taimre, T., Botev, Z.I.: Handbook of Monte Carlo Methods. Wiley, Hoboken (2011)
L’Ecuyer, P.: Quasi-Monte Carlo methods with applications in finance. Finance Stoch. 13, 307–349 (2009)
Mascagni, M.: A Deterministic particle method for one-dimensional reaction-diffusion equations, Research Institute for Advanced Computer Science (RIACS) Technical Report: 95.23 (1995)
Metropolis, N., Ulam, S.: The Monte Carlo method. J. Am. Stat. Assoc. 44, 335–341 (1949)
Metropolis, N.: The beginning of the Monte Carlo method. Los Alamos Sci. Spec. Issue 15, 125–130 (1987)
Morokoff, W.J., Caflisch, R.E.: Quasi Monte Carlo integration. J. Comput. Phys. 122, 218–230 (1995)
Nanbu, K.: Direct simulation scheme derived from the Boltzmann equation. J. Phys. Soc. Jpn. 49, 2042–2049 (1980)
Pareschi, L.: Hybrid multiscale methods for kinetic and hyperbolic problems, In: Goudon, T., Sonnendrucker, E., Talay, D. (eds.) ESAIM: Proc. 15, pp. 87–120 (2005)
Pareschi, L., Caflisch, R.E.: Implicit Monte Carlo methods for rarefied gas dynamics I: the space homogeneous case. J. Comput. Phys. 154, 90–116 (1999)
Pareschi, L., Seaid, M.: A new Monte Carlo approach for conservation laws and relaxation systems, In: Bubak M., van Albada G.D., Sloot P.M.A., Dongarra J. (eds.) Computational Science - ICCS 2004. ICCS 2004. Lecture Notes in Computer Science, vol. 3037. Springer, Berlin (2004)
Pareschi, L., Russo, G.: Asymptotic preserving Monte Carlo methods for the Boltzmann equation. Transp. Theo. Stat. Phys. 29, 415–430 (2000)
Pareschi, L., Russo, G.: An introduction to Monte Carlo methods for the Boltzmann equation, CEMRACS 1999, ESAIM: Proc. 10, 35–75 (2001)
Pareschi, L., Toscani, G.: Interacting Multiagent Systems: Kinetic Equations and Monte Carlo Methods. Oxford University Press, Oxford (2013)
Pareschi, L., Trazzi, S.: Numerical solution of the Boltzmann equation by time relaxed Monte Carlo (TRMC) methods. Int. J. Num. Meth. Fluids 48, 947–983 (2005)
Pareschi, L., Trimborn, T., Zanella, M.: Mean-field control variate methods for kinetic equations with uncertainties and applications to socioeconomic sciences. Int. J. Uncertain. Quantif. 12, 61–84 (2022)
Piccoli, B., Garavello, M.: Traffic Flow on Networks. American Institute of Mathematical Sciences, Pasadena (2006)
Pullin, D.I.: Direct simulation methods for compressible inviscid ideal gas flows. J. Comput. Phys. 34, 231–244 (1980)
Rjasanow, S., Wagner, W.: Stochastic Numerics for the Boltzmann Equation. Springer Series in Computational Mathematics, vol. 37. Springer, Berlin (2005)
Roberts, S.: Convergence of a random walk method for the Burgers equation. Math. Comput. 52(186), 647–673 (1989)
Russo, G.: Deterministic diffusion of particles. Commun. Pure Appl. Math. 43(6), 697–733 (1990)
Sherman, A., Mascagni, M.: A gradient random walk method for two-dimensional reaction-diffusion equations. SIAM J. Sci. Comput. 15, 1280–1293 (1994)
Sherman, A., Peskin, C.S.: A Monte Carlo method for scalar reaction-diffusion equations. SIAM J. Sci. Comput. 7, 1360–1372 (1986)
Toro, E.F.: Shock-Capturing Methods for Free-surface Shallow Flows. Wiley, Hoboken (2001)
Tsybakov, A.: Introduction to Nonparametric Estimation. Springer, Berlin (2009)
Ulam, S., Richtmyer, R.D., von Neumann, J.: Statistical methods in neutron diffusion, Los Alamos Scientific Laboratory report LAMS–551 (1947)
van der Vaart, A.W.: Asymptotic Statistics. Cambridge University Press, Cambridge (1998)
Zhang, B., Yu, W., Mascagni, M.: Revisiting Kac’s method: a Monte Carlo algorithm for solving the telegrapher’s equations. Math. Comput. Simul. 156, 178–193 (2019)
Acknowledgements
G.B. would like to thank the Courant Institute of Mathematical Sciences, New York University, for the kind hospitality during her research visiting period.
Funding
Open access funding provided by Università degli Studi di Ferrara within the CRUI-CARE Agreement. This work has been written within the activities of GNCS group of INdAM (Italian National Institute of High Mathematics), whose support is acknowledged. It has also been partially supported by ICSC—Centro Nazionale di Ricerca in High Performance Computing, Big Data and Quantum Computing, funded by European Union—NextGenerationEU and by MIUR-PRIN 2017, Project No. 2017KKJP4X “Innovative numerical methods for evolutionary partial differential equations and applications". G.B. was also funded by the University of Ferrara under “Bando Giovani Ricercatori 2019” and by the European Union—NextGenerationEU, MUR PRIN 2022 PNRR, Project No. P2022JC95T “Data-driven discovery and control of multi-scale interacting artificial agent systems”.
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Appendices
Appendix
Reconstruction Error Estimates
In this section we detail how the numerical solution is recovered from the particles and illustrate how this affects the reconstruction error for both MC and GBMC. Controlling the distance between a reconstruction of the empirical measure and the true distribution is, of course, a long-standing problem, central in probability, statistics and computer science. While many distances can be used to study the problem, here we focus on the \(L^p\) distance given its central use in error estimation for conservation laws. However, other distances, such as the Wasserstein distance, are quite natural for quantifying particle approximations of PDEs [16]. In the sequel, to simplify notation, we consider only one family of particles, thus ignoring their direction of motion, and restrict ourselves to nonnegative probability densities \(u(x,t) \ge 0\).
1.1 Monte Carlo
For the standard Monte Carlo method, starting from N samples \(X_1\), ..., \(X_N\) at time t i.i.d. as u(x, t) we compute the empirical density function
where \(\delta (\cdot )\) is the Dirac delta function.
The above function needs to be regularized for numerical purposes. To this aim, let us introduce the mesh width \(\varDelta x>0\) and denote by \(S_{\varDelta x}\ge 0\) a smoothing function such that
Then, we consider the approximation of the empirical density (53) obtained by a kernel density estimator in the form
In the simplest case, we have the rectangular kernel \(S_{\varDelta x}(x)=\chi (|x|\le \varDelta x/2)/\varDelta x\), where \(\chi (\cdot )\) is the indicator function, corresponding to the histogram reconstruction that has been used on a fixed grid of size \(\varDelta x\) in the numerical results. We refer to [46] for examples of other kernels.
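The histogram reconstruction with the rectangular kernel can be sketched in a few lines. The Beta(2, 5) target density, the sample size, and the grid are illustrative assumptions chosen only because the exact density \(30x(1-x)^4\) is known in closed form.

```python
import numpy as np

def histogram_density(samples, edges):
    """Piecewise constant density: counts / (N * cell width)."""
    counts, _ = np.histogram(samples, bins=edges)
    return counts / (samples.size * np.diff(edges))

rng = np.random.default_rng(0)
N, M = 10_000, 20
samples = rng.beta(2.0, 5.0, N)            # i.i.d. samples of the target
edges = np.linspace(0.0, 1.0, M + 1)
dx = 1.0 / M

u_hist = histogram_density(samples, edges)
centers = 0.5 * (edges[:-1] + edges[1:])
u_true = 30.0 * centers * (1.0 - centers) ** 4   # Beta(2, 5) density

# Discrete L^1 distance between reconstruction and exact density
l1_err = np.sum(np.abs(u_hist - u_true)) * dx
```

Refining the grid at fixed N reduces the bias of each cell value but increases its statistical fluctuation, which is the trade-off quantified in the estimates of this section.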
By considering only the error due to the reconstruction (54) on the computational domain \(\varOmega \subseteq \mathbb R\), this can be estimated using the triangle inequality:
where
and we defined
with \(\mathbb {E}[\cdot ]\) denoting the expectation with respect to the random variables \(X_1,\ldots ,X_N\) i.i.d. as u(x, t) and \(\Vert \cdot \Vert _{L^p(\varOmega )}\) denoting the \(L^p\) norm in \(\varOmega \).
Now, for the second term on the r.h.s. of (55) we have the following lemma.
Lemma 1
The root mean squared error satisfies
where
The proof follows by classical arguments on the convergence of the root mean squared error [7, 37].
Next, we assume that the first term satisfies
according to the order of accuracy \(q\ge 1\) used in the reconstruction. For example, in the case of histogram reconstruction through a rectangular kernel we have \(q=2\). In fact, from the midpoint rule we can compute
where \(\xi \) depends on x.
Thus, we have the following result.
Theorem 1
For a sufficiently smooth function u(x, t) the error introduced by the reconstruction function (54) satisfies
where \(C_{q}\) depends on the q derivative of u(x, t) and the domain \(\varOmega \), and \(\sigma _{S}^2\) is defined in (56).
However, more generally, the error is affected by the numerical solution of the PDE, which in our case is first order in time, due to the time splitting algorithm, and first order in space, due to the piecewise constant reconstruction used in the relaxation phase. Roughly speaking, we can assume that these errors, as a result of a CFL-type condition, are of the same order, and that a local error estimate of the following type holds
where \(\tilde{u}_{\varDelta x}(x,t)=\mathbb {E}[\tilde{u}_{N,\varDelta x}(x,t)]\), with \(\tilde{u}_{N,\varDelta x}(x,t)\) the reconstructed MC solution of the PDE. In (58) the quantity \(M_1(x)\) depends on the PDE under consideration and the first order derivative of the solution. Note that, since the first order spatial error in solving the PDE dominates the reconstruction error in (57), we can ignore in (58) the second order contributions.
Now, we can estimate the MC solution of the PDE from
where the second term can be estimated as in Lemma 1:
with
and the last inequality follows from assumption (58).
By collecting the above results we can state the following theorem.
Theorem 2
Let us denote by \(\tilde{u}_{N,\varDelta x}(\cdot ,t)\) the reconstructed MC solution such that it satisfies (58), then
where the constants \(C_1\), \(C_2\) depend on the first order derivative of u(x, t) and the domain \(\varOmega \).
It is thus clear that we must take \(\varDelta x\) small to make the mesh bias small, but also large enough to ensure that the variance is also small. If we want to minimize the error with respect to \(\varDelta x\) we should take
This ensures an optimal error decay \(O(N^{-1/3})\), with the optimal mesh scaling as \(\varDelta x_{opt} \approx N^{-1/3}\). The precise value of \(C_1\) depends on the particular PDE under consideration and may be difficult to estimate in practice.
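As a short worked step, suppose the bound of Theorem 2 is the sum of a mesh bias and a statistical term, \(e(\varDelta x) = C_1 \varDelta x + C_2 (N\varDelta x)^{-1/2}\). This explicit form is an assumption, chosen to be consistent with the scalings stated here. Setting the derivative with respect to \(\varDelta x\) to zero gives
$$\begin{aligned} C_1 - \frac{C_2}{2}\, N^{-1/2}\, \varDelta x^{-3/2} = 0 \quad \Longrightarrow \quad \varDelta x_{opt} = \left( \frac{C_2}{2C_1}\right) ^{2/3} N^{-1/3}, \end{aligned}$$
and substituting back,
$$\begin{aligned} e(\varDelta x_{opt}) = C_1 \varDelta x_{opt} + C_2 \left( N \varDelta x_{opt}\right) ^{-1/2} = \frac{3}{2^{2/3}}\, C_1^{1/3} C_2^{2/3}\, N^{-1/3}, \end{aligned}$$
so that, under this assumption, both contributions decay as \(N^{-1/3}\) at the optimal mesh size.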
1.2 Gradient-Based Monte Carlo
Assume, again for simplicity of treatment, that u(x, t) is monotone nondecreasing so that \(w(x,t)=\frac{\partial u(x,t)}{\partial x} \ge 0\). Therefore, without loss of generality, we can consider w(x, t) as a probability density and u(x, t) its cumulative distribution function.
Given now N samples \(X_1\), ..., \(X_N\) at time t i.i.d. as w(x, t), we compute the empirical cumulative distribution function as
where \(H(\cdot )\) is the Heaviside step function.
Of course, (63) differs from (53) although we used the same notations for the two reconstructions. Note that, in contrast to (53), the empirical cumulative distribution does not need any further approximation or the introduction of a mesh but can be used directly in the form (63).
We recall some classical results for (63) (see [48] for more details).
Lemma 2
For any given \(x\in \varOmega \) we have that \(Nu_N(x,t)\) has a binomial distribution with parameters N and success probability u(x, t). Therefore
where \(\mathrm{\mathbb {V}ar}[\cdot ]\) denotes the variance with respect to the random variables \(X_1,\ldots ,X_N\) i.i.d. as w(x, t), and \(u_N(x,t)\) converges to u(x, t) almost surely.
In fact, \(N u_N(x,t)\) is the sum of N independent Bernoulli random variables, and is therefore a binomial random variable. Thus, the mean and variance characterization in the lemma follows easily. The latter statement is a consequence of Hoeffding’s inequality, which implies that for any \(\varepsilon >0\) we have the probability bound
The above pointwise convergence can be made uniform by the fundamental Glivenko-Cantelli lemma.
Lemma 3
(Glivenko-Cantelli) The empirical distribution \(u_N(x,t)\) converges uniformly to u(x, t), namely as \(N\rightarrow \infty \) we have
where the superscript a.s. denotes convergence almost surely.
An estimate similar to Hoeffding’s inequality also holds in this latter case.
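The mesh-free reconstruction and its uniform convergence can be checked numerically. The sketch below evaluates the empirical cumulative distribution with the convention \(H(0)=1\) and measures the largest deviation from the exact distribution on a set of evaluation points. The uniform target on \([0,1]\), the sample size, and the evaluation points are illustrative assumptions. For reference, the two-sided Hoeffding bound for a fixed x reads \(\mathbb {P}(|u_N - u| > \varepsilon ) \le 2e^{-2N\varepsilon ^2}\).

```python
import numpy as np

def empirical_cdf(samples, x):
    """u_N(x) = (1/N) * sum_i H(x - X_i), with H(0) = 1."""
    xs = np.sort(samples)
    # Number of samples <= x, found by binary search in the sorted array
    return np.searchsorted(xs, x, side="right") / xs.size

rng = np.random.default_rng(0)
N = 10_000
samples = rng.random(N)                 # i.i.d. uniform on [0, 1)
x = np.linspace(0.0, 1.0, 201)          # evaluation points only, not a mesh

u_N = empirical_cdf(samples, x)
sup_err = np.max(np.abs(u_N - x))       # exact CDF of the uniform law is x

# Fixed-x Hoeffding bound at tolerance 0.05, for comparison
hoeffding_bound = 2.0 * np.exp(-2.0 * N * 0.05**2)
```

The points in `x` are used only to display or compare the result: the particles themselves are never binned.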
Now, let us consider the problem of estimating the convergence rate in \(L^p\) spaces. Since we have
the numerical error of the empirical cumulative distribution (63) can now be estimated similarly to Lemma 1 from
where
Thus, we have the following theorem.
Theorem 3
The empirical cumulative distribution (63) satisfies
The proof follows immediately by observing that
If we now consider the error introduced by the GBMC solution of the PDE, in the case when this is affected only by the time error of the splitting since no mesh in space is needed, we can assume
where \(\tilde{u}(\cdot ,t)=\mathbb {E}[\tilde{u}_{N}(\cdot ,t)]\), \(\tilde{u}_{N}(\cdot ,t)\) is the numerical GBMC solution of the PDE and \(\tilde{C}_1\) depends on the first order time derivative of the solution and the domain \(\varOmega \).
This leads immediately to the following result.
Theorem 4
Let \(\tilde{u}_{N}(\cdot ,t)\) denote the GBMC solution and assume that it satisfies (70). Then
From (71) it is clear that the time step should be taken as \(\varDelta t \approx N^{-1/2}\) to balance the two error contributions in the GBMC solution and achieve the optimal convergence rate of \(O(N^{-1/2})\). Note that small values of \(\varDelta t\) have only a moderate impact on the computational cost of GBMC, since fewer particles interact and the reconstruction incurs no extra cost.
By comparing (71) with (61) one can see that the error of the GBMC method is always smaller than the MC error. More precisely, the two errors have the same decay for a fixed (non-optimal) \(\varDelta x\) in the MC method, whereas for the optimal \(\varDelta x\) given by (62) the GBMC method converges faster, at a rate of \(O(N^{-1/2})\) compared with \(O(N^{-1/3})\) for the MC method.
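To give a feeling for what the two rates mean in practice, the sketch below estimates the number of particles needed to reach a given error level. It is only indicative: all constants in the error bounds are set to one, which is an assumption, and the function name `particles_needed` and the target value are hypothetical.

```python
import math

def particles_needed(target_error, rate):
    """Smallest N with N**(-rate) <= target_error, taking the constant as 1."""
    return math.ceil(target_error ** (-1.0 / rate))

target = 1e-2
n_gbmc = particles_needed(target, 0.5)      # GBMC: error ~ N^(-1/2)
n_mc = particles_needed(target, 1.0 / 3.0)  # MC with optimal mesh: error ~ N^(-1/3)
dt_gbmc = n_gbmc ** -0.5                    # time step balanced as dt ~ N^(-1/2)

print(n_gbmc, n_mc, dt_gbmc)
```

Under these simplifying assumptions, a target error \(\varepsilon \) requires on the order of \(\varepsilon ^{-2}\) particles for GBMC and \(\varepsilon ^{-3}\) for standard MC, so the gap widens as the tolerance is tightened.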
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Bertaglia, G., Pareschi, L. & Caflisch, R.E. Gradient-Based Monte Carlo Methods for Relaxation Approximations of Hyperbolic Conservation Laws. J Sci Comput 100, 60 (2024). https://doi.org/10.1007/s10915-024-02614-1
DOI: https://doi.org/10.1007/s10915-024-02614-1
Keywords
- Monte Carlo methods
- Gradient random walk methods
- Variance reduction
- Grid-free methods
- Hyperbolic relaxation systems
- Asymptotic-preserving schemes