1 Introduction

Monte Carlo methods, which were first devised during the Manhattan Project in the 1940s to simulate the behavior of neutron diffusion in fissile material [26, 47], have a long and storied history [27]. In the decades following their introduction, the methods were refined and extended to various fields, including physics, engineering, finance, and many others. The subsequent development of computers provided a significant boost to the spread of Monte Carlo methods in scientific computing, as they allowed for the simulation of more complex systems and the generation of larger numbers of random samples. Although today Monte Carlo methods are widely used in many scientific and industrial applications [23], their systematic design as a numerical analysis tool to solve partial differential equations (PDEs) is still rather limited to specific contexts compared to deterministic approaches such as finite differences or finite volumes. Prominent examples are applications involving stochastic differential equations, as in finance [6, 7, 24, 28], or problems where other numerical methods are not feasible due to their complexity, as in plasma physics and rarefied gas dynamics [4, 13, 29, 31, 34, 36, 40].

On the other hand, growing attention to high-dimensional problems and to uncertainty quantification in many fields of applied mathematics, including emerging fields such as the life and social sciences, has greatly increased interest in developing efficient Monte Carlo approaches in such contexts [3, 17, 19, 20, 35]. Moreover, since the Monte Carlo solution is computed by averaging independent calculations, the resulting algorithms are well suited to exploit modern parallel computing techniques.

Among the various Monte Carlo methods developed for PDEs, one approach, inspired by the so-called vortex method for the Navier–Stokes equations [12], had considerable success in the 1990s for reaction-diffusion problems. The idea of the method is to use the spatial derivative of the solution (i.e., the gradient in multiple dimensions) as the unknown variable. This allows the statistical solution in the original variables to be reconstructed not from the histogram, as in standard random walk approaches, but directly as the distribution function of the samples of the derivative. In addition to the statistical reduction of fluctuations, such a technique, usually referred to as the gradient random walk (GRW) method, brings with it the advantages of adaptivity, since samples are taken according to the space derivatives, and a grid-free structure [8, 18, 41, 43, 44].

In this paper, we explore the possibility of extending such ideas to a broader class of PDEs that includes, in particular, systems of conservation laws. The construction of Monte Carlo methods for such problems is strongly inspired by the fluid-dynamical limit of the Boltzmann equation and the corresponding Monte Carlo methods for the Euler equations [31, 33, 34, 36, 39]. Let us remark that the construction of stochastic particle methods for nonlinear hyperbolic problems has been the subject of limited research in the past and that there is no general approximation methodology (see for example [5, 41, 49]). A first attempt in this direction, limited to nonnegative solutions, has been proposed in [32] based on appropriate relaxation approximations of such systems [1, 21, 22]. Indeed, such a relaxation approximation allows the Monte Carlo strategy traditionally used for kinetic equations to be adapted to generic conservation laws. We mention here that related approaches based on deterministic particles have been also proposed [11, 14, 15, 25, 42].

Following the idea of approximating the system of conservation laws by means of a semi-linear hyperbolic system with source terms, we show how it is possible to formulate an approach based on statistical samples of the spatial derivatives of the solution, which generalizes the characteristic advantages of GRW methods for reaction-diffusion equations to such problems. In order to compare the new methods, which we will refer to as Gradient-based Monte Carlo (GBMC) methods, with a standard Monte Carlo approach, we will recall the concepts introduced in [32], extending them to the case of solutions of arbitrary sign and discussing some variance reduction strategies.

Here, we will limit ourselves to the one-dimensional case, which will enable us to focus on the fundamental concepts of the GBMC method and address both the scalar case and the case of systems of conservation laws, leaving to further research multi-dimensional extensions. We will show how the resulting methods will be grid-free in the scalar case, while in the case of systems, this may not always be possible. In all test cases considered, however, the GBMC method offers significant advantages over the traditional Monte Carlo approach, including better spatial resolution of discontinuities and a reduction in variance by several orders of magnitude.

The rest of the article is organized as follows. In Sect. 2 we will recall the basic concepts of GRW methods for reaction-diffusion problems. Then in Sect. 3 we will discuss the design of the novel GBMC approach in the case of simple relaxation systems for scalar conservation laws. In this section we will also introduce the details of a direct Monte Carlo approach by generalizing some of the ideas presented in [32]. Several numerical examples for various scalar problems illustrate the strong advantages of GBMC over standard Monte Carlo. Next, Sect. 4 is devoted to the extension of the methodology to the case of hyperbolic relaxation approximations to systems of conservation laws. In such cases, even if a grid-free approach is tied to the possibility of diagonalizing the system, we show through several examples that the resulting GBMC schemes maintain most of the advantages observed in the scalar case. Some concluding remarks and further research directions are discussed in the last section. Finally, “Appendix A” reports some estimates in the \(L^p\) distance of the solution reconstruction error from the particles, showing the improvement in accuracy of the GBMC approximation over the corresponding MC.

2 Gradient Random Walk for Reaction-Diffusion Equations

In this section we recall the basic ideas behind the design of GRW methods for reaction-diffusion problems [8, 18, 25, 41, 43, 44]. We illustrate the idea by treating the one-dimensional scalar reaction–diffusion equation

$$\begin{aligned} \begin{aligned}&\frac{\partial u}{\partial t}=D\displaystyle \frac{\partial ^2 u}{\partial x^2}+G(u),\quad x\in \mathbb R,\quad D, t>0,\\&u(x,0)=u_0(x). \end{aligned} \end{aligned}$$
(1)

Let us assume that \(u_0(x)\) is monotone and bounded from below and above. Without loss of generality, we impose \(u(-\infty ,t)=0\) and \(u(\infty ,t)=1\), for all \(t\ge 0\). We introduce the auxiliary variable \(w=\partial u/\partial x\) and observe that it satisfies the equation

$$\begin{aligned} \begin{aligned}&\frac{\partial w}{\partial t}=D\displaystyle \frac{\partial ^2 w}{\partial x^2}+G'(u)w,\quad x\in \mathbb R,\quad D, t>0,\\&w(x,0)=\frac{\partial u_0(x)}{\partial x}. \end{aligned} \end{aligned}$$
(2)

Note that, by virtue of the assumptions on \(u_0(x)\), \(w_0(x)\) is a probability density.

Using N particle samples located respectively in \(X_1,\ldots ,X_N\) (we use capital X to indicate that the positions are random variables) we will have

$$\begin{aligned} w_N(x,t)=\frac{1}{N}\sum _{k=1}^N \delta (x-X_k), \end{aligned}$$
(3)

discretizing w as a sum of \(\delta \)-functions. Then, since \(u(x,t)=\int _{-\infty }^x w(y,t)\,dy\) is the cumulative distribution function of the random variable X, we get

$$\begin{aligned} u_N(x,t)=\frac{1}{N}\sum _{k=1}^N H(x-X_k), \end{aligned}$$
(4)

where \(H(\cdot )\) is the (right-continuous) Heaviside step function

$$\begin{aligned} H(x) = 0 \,\,\, \textrm{if}\,\,\, x <0, \qquad H(x) = 1 \,\,\, \textrm{if}\,\,\, x \ge 0. \end{aligned}$$

Note that the boundary condition at the left is exactly satisfied, while the boundary condition at the right is satisfied on average, with fluctuations. Of course we can use a right reconstruction approach that would lead to the opposite situation (see also Remark 3).
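
The left reconstruction (4) amounts to evaluating the empirical cumulative distribution function of the samples. As an illustration, a minimal NumPy sketch (the function name is ours) could read:

```python
import numpy as np

def u_left(x, X):
    """Left reconstruction (4): u_N(x) = (1/N) * #{k : X_k <= x},
    i.e. the empirical CDF of the samples evaluated at the points x."""
    Xs = np.sort(X)
    # counting X_k <= x reproduces the right-continuous Heaviside H(x - X_k)
    return np.searchsorted(Xs, x, side="right") / len(X)

# example: derivative samples drawn from a standard normal density, as in Fig. 1
rng = np.random.default_rng(0)
X = rng.standard_normal(100)
x = np.linspace(-4.0, 4.0, 201)
u = u_left(x, X)          # approximates the standard normal CDF
```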

Fig. 1
figure 1

Standard normal distribution using \(N=100\) samples. Left: histogram of Monte Carlo samples with \(M=50\) cells. Middle: histogram of its derivative using two symmetric families of samples with positive and negative masses and \(M=50\) cells. Right: plot of the cumulative distribution of the derivative samples using left reconstruction as in (4)

Notice that the same idea can be applied in the case of weighted particles, namely particles having different masses \(m_i\), thus approximating the function \(w(\cdot , t)\) as

$$\begin{aligned} w_N(x,t)=\frac{1}{N}\sum _{k=1}^N m_k \delta (x-X_k). \end{aligned}$$
(5)

Here the mass values \(m_i\in \{-m,m\}\), \(m>0\), may also be negative, so that we can approximate more general solutions than monotone nonnegative ones. From definition (5), by left reconstruction we obtain in this case

$$\begin{aligned} u_N(x,t)=\frac{1}{N}\sum _{k=1}^N m_k H(x-X_k). \end{aligned}$$
(6)

The computed solution for \(u(x,t)\) contains much less fluctuation than \(w(x,t)\), since all particles contribute to the solution at any point x, as depicted in Fig. 1. In particular, since it solves the problem in terms of the gradient of the state variable, the GRW method is particularly well suited to problems in which sharp fronts appear and where one needs to spatially resolve jump-like solutions, because the density of the particles is large precisely where u has large gradients. Once the initial data have been discretized, the GRW method evolves the positions and masses of the particles so that u satisfies (1). This is done by a fractional-step iteration over a small time interval \(\varDelta t\), in which the diffusion term is modeled by a random walk and the reaction term by a particle killing or replication step with probability \(\varDelta t |G'(u)|\) (see [18, 25, 43, 44] for more details).
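
As an illustration of the fractional-step iteration just described, the following is a minimal sketch of one GRW step for (1)-(2) in the positive-mass (monotone) case; the function names, the reaction example and the precise killing/replication rule are our reading of the references above, not the authors' exact implementation.

```python
import numpy as np

def grw_step(X, m, D, dt, G_prime, rng):
    """One fractional step of the GRW method for (1)-(2), positive-mass case.

    Diffusion: each particle performs a Brownian increment of variance 2*D*dt.
    Reaction: with probability dt*|G'(u)| (assumed < 1) a particle is replicated
    if G'(u) > 0 or killed if G'(u) < 0, with u evaluated at the particle
    position through the left reconstruction (4) with fixed particle mass m.
    """
    X = X + np.sqrt(2.0 * D * dt) * rng.standard_normal(len(X))   # random walk

    Xs = np.sort(X)
    u = m * np.searchsorted(Xs, X, side="right")                  # u(X_i), cf. (4)

    g = G_prime(u)
    react = rng.random(len(X)) < dt * np.abs(g)
    keep = ~(react & (g < 0.0))            # killed particles are removed
    born = X[react & (g > 0.0)]            # replicated particles are duplicated
    return np.concatenate([X[keep], born])

# usage sketch with the (assumed) logistic reaction G(u) = u*(1-u), G'(u) = 1-2u
rng = np.random.default_rng(0)
N0 = 1000
X = rng.standard_normal(N0)                # samples of w_0 = du_0/dx
for _ in range(100):
    X = grw_step(X, 1.0 / N0, D=0.1, dt=1e-3,
                 G_prime=lambda u: 1.0 - 2.0 * u, rng=rng)
```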

3 Gradient-Based Monte Carlo for Hyperbolic Problems

The extension of the previous idea to hyperbolic problems, and in general to nonlinear PDEs, is nontrivial and, except for some specific cases [41], has been poorly studied in the literature. Among other reasons, it should be pointed out that already the construction of a standard Monte Carlo method is not obvious in the case of nonlinear PDEs [5, 32, 39]. In the following we will consider the case of hyperbolic problems, with special reference to the relaxation approximation of scalar conservation laws. Subsequently, we will discuss how to extend the method to the case of systems of conservation laws.

The starting point is the following hyperbolic system with relaxation introduced in [22] as a semi-linear approximation to scalar conservation laws

$$\begin{aligned} \begin{aligned} \frac{\partial u}{\partial t} + \frac{\partial v}{\partial x}&= 0,\\ \frac{\partial v}{\partial t} + a^2 \frac{\partial u}{\partial x}&= -\frac{1}{\varepsilon } \left( v-F(u)\right) , \end{aligned} \end{aligned}$$
(7)

where \(x\in \varOmega \subseteq \mathbb {R}\), supplemented with the initial conditions \(u(x,0)=u_0(x)\), \(v(x,0)=v_0(x)\) and suitable boundary conditions.

In the limit \(\varepsilon \rightarrow 0\) from the second equation in (7) one formally obtains the local equilibrium \(v=F(u)\) and thus solutions to (7) are well approximated by the scalar conservation law

$$\begin{aligned} \frac{\partial u}{\partial t} + \frac{\partial F(u)}{\partial x} = 0. \end{aligned}$$
(8)

More precisely, if we evaluate the \(\varepsilon \) perturbation of the local equilibrium

$$\begin{aligned} v = F(u) + \varepsilon v_1, \end{aligned}$$

and substitute it into the second equation in (7), after some algebraic manipulations we obtain

$$\begin{aligned} v_1 = F'(u)^2\frac{\partial u}{\partial x} - a^2\frac{\partial u}{\partial x} + O(\varepsilon ), \end{aligned}$$

and therefore

$$\begin{aligned} v = F(u) + \varepsilon \left( F'(u)^2 - a^2\right) \frac{\partial u}{\partial x}, \end{aligned}$$

which leads to the following convection-diffusion equation

$$\begin{aligned} \frac{\partial u}{\partial t} + \frac{\partial F(u)}{\partial x} = \varepsilon \left[ \left( a^2- F'(u)^2\right) \frac{\partial u}{\partial x}\right] _x. \end{aligned}$$
(9)

The above equation is a good approximation of system (7) for \(\varepsilon \ll 1\) only if the characteristic velocity of the limiting conservation law is dominated by that of the relaxation system, that is, if the subcharacteristic condition \(a^2 > F'(u)^2\) is satisfied (see [10, 22]).

It is worth underlining that, for small values of \(\varepsilon \), the numerical solution of system (7) requires particular care even though the transport operator is linear, because a standard explicit discretization of the stiff relaxation term leads to time step constraints of the order of \(\varepsilon \). This observation was at the basis of the so-called relaxation schemes for hyperbolic systems of conservation laws [22].

3.1 A Direct Monte Carlo Approach

First, we recall the probabilistic approach proposed in [32] to construct a random particle solver for nonnegative densities that works uniformly with respect to the relaxation rate \(\varepsilon \). The method is inspired by the classical Direct Simulation Monte Carlo (DSMC) method for rarefied gas dynamics [4, 29, 34]. We will later discuss how to extend this approach to solutions of arbitrary sign.

Given \(a>0\), let us rewrite system (7) introducing the diagonal (kinetic) variables

$$\begin{aligned} f^+=\frac{{a}u+v}{2{a}}, \qquad f^-=\frac{{a}u-v}{2{a}}, \end{aligned}$$

and the corresponding equilibrium states

$$\begin{aligned} E^+(u)=\frac{{a}u+F(u)}{2{a}}, \qquad E^-(u)=\frac{{a}u-F(u)}{2{a}}. \end{aligned}$$

We obtain the diagonal form of the relaxation system

$$\begin{aligned} \begin{aligned} \frac{\partial f^+}{\partial t} + {a}\frac{\partial f^+}{\partial x}&= -\displaystyle \frac{1}{\varepsilon } \left( f^+-E^+(u)\right) ,\\ \frac{\partial f^-}{\partial t} - {a} \frac{\partial f^-}{\partial x}&= -\displaystyle \frac{1}{\varepsilon } \left( f^--E^-(u)\right) . \end{aligned} \end{aligned}$$
(10)

System (10) presents several analogies with discrete-velocity models of the Boltzmann equation, in the sense that it describes a system of particles having only two speeds \(\pm a\) which relax toward the local equilibrium states \(E^{\pm }(u)\). The kinetic interpretation requires \(E^{\pm }(u)\ge 0\) and this is guaranteed if \(a \ge |F(u)|/u\) and \(u > 0\). Under these assumptions, one can apply a direct simulation Monte Carlo approach [32].

More precisely, the solution in a small time interval \([0, \varDelta t]\) is approximated by means of a fractional step procedure that solves separately the two problems characterized by the linear transport and by the relaxation term. One therefore solves, one after the other, a transport step

$$\begin{aligned} \begin{aligned} \frac{\partial f^+}{\partial t} + {a}\frac{\partial f^+}{\partial x}&=0,\\ \frac{\partial f^-}{\partial t} - {a} \frac{\partial f^-}{\partial x}&=0, \end{aligned} \end{aligned}$$
(11)

and a relaxation process

$$\begin{aligned} \begin{aligned} \frac{\partial f^{+}}{\partial t}&= -\frac{1}{\varepsilon } \left( f^{+}-E^+(u)\right) ,\\ \frac{\partial f^{-}}{\partial t}&= -\frac{1}{\varepsilon } \left( f^{-}-E^-(u)\right) . \end{aligned} \end{aligned}$$
(12)

Since both steps, taken separately, can be solved exactly, no further approximation is necessary besides the \(O(\varDelta t)\) error of the above splitting. In fact, in a small time interval \(\varDelta t\) the exact solutions \(\tilde{f}^{\pm }\) of the free transport (11) read

$$\begin{aligned} \begin{aligned} \tilde{f}^+(x,\varDelta t)&=f^+_0(x-a\varDelta t),\\ \tilde{f}^-(x,\varDelta t)&=f^-_0(x+a\varDelta t), \end{aligned} \end{aligned}$$
(13)

and, setting \(\tilde{u}(x,\varDelta t)=\tilde{f}^+(x,\varDelta t)+\tilde{f}^-(x,\varDelta t)\), the exact solution of the relaxation step (12) yields the approximated values at time \(\varDelta t\)

$$\begin{aligned} \begin{aligned} f^{+}(x,\varDelta t)&=e^{-\varDelta t/\varepsilon }\tilde{f}^+(x,\varDelta t)+\left( 1-e^{-\varDelta t/\varepsilon }\right) E^+(\tilde{u}(x,\varDelta t)),\\ f^{-}(x,\varDelta t)&=e^{-\varDelta t/\varepsilon }\tilde{f}^-(x,\varDelta t)+\left( 1-e^{-\varDelta t/\varepsilon }\right) E^-(\tilde{u}(x,\varDelta t)). \end{aligned} \end{aligned}$$
(14)

The probabilistic interpretation is readily obtained by introducing the new variables

$$\begin{aligned} p^+(x,t)=\frac{f^{+}(x,t)}{u(x,t)},\qquad p^-(x,t)=\frac{f^{-}(x,t)}{u(x,t)}, \end{aligned}$$
(15)

with \(p^+(x,t)+p^-(x,t)=1\), to have the convex combination

$$\begin{aligned} \begin{aligned} p^+(x,\varDelta t)&=e^{-\varDelta t/\varepsilon }{p_0^+}(x-a\varDelta t)+\left( 1-e^{-\varDelta t/\varepsilon }\right) p^+_E(x,\varDelta t),\\ p^-(x,\varDelta t)&=e^{-\varDelta t/\varepsilon }{p_0^-}(x+a\varDelta t)+\left( 1-e^{-\varDelta t/\varepsilon }\right) p^-_E(x,\varDelta t), \end{aligned} \end{aligned}$$
(16)

where we denoted the equilibrium probabilities as

$$\begin{aligned} p^+_E(x,t)=\frac{E^+(\tilde{u}(x,t))}{\tilde{u}(x,t)},\qquad p^-_E(x,t)=\frac{E^-(\tilde{u}(x,t))}{\tilde{u}(x,t)}. \end{aligned}$$
(17)

We note that the probabilistic interpretation holds true independently of the choice of \(\varDelta t/\varepsilon \). In particular, as \(\varepsilon \rightarrow 0\) it reduces to the projections

$$\begin{aligned} p^+(x,\varDelta t)=p^+_E(x,\varDelta t),\qquad p^-(x,\varDelta t)=p^-_E(x,\varDelta t), \end{aligned}$$
(18)

which characterize the probabilistic approximation of the limit equation (8).

A Monte Carlo method can be derived by sampling directly from the exact solutions of the operator splitting steps (16). Note, however, that the probability of a velocity change in (17) depends on the mass density after the transport step \(\tilde{u}(x,t)\). Therefore, in order to set up a Monte Carlo method and to estimate the probability of a velocity change, we must reconstruct the mass density u in a neighborhood of the particle position. Given a set of samples \(X_1,\ldots ,X_N\) the simplest method, which produces a piecewise constant reconstruction, is based on evaluating the histogram of the samples at the cell centers of a suitable grid \(x_j=j\varDelta x\),

$$\begin{aligned} u_j(t)=\frac{1}{N}\sum _{k=1}^N \varPhi _{\varDelta x}(x_{j+1/2}-X_k),\qquad j=\ldots ,-2,-1,0,1,2,\ldots \end{aligned}$$
(19)

where \(\varPhi _{\varDelta x}(x)=1/\varDelta x\) if \(|x|\le \varDelta x\) and \(\varPhi _{\varDelta x}(x)=0\) elsewhere. Smoother versions can be obtained by changing the function approximating the Dirac delta. This corresponds to a convolution of the samples with a suitable mollifier [30, 34].

We can set up a Monte Carlo method in the case of positive solutions as follows. Given a set of samples \((X^0_1,V^0_1), \ldots , (X^0_{N},V^0_N)\), where the particle velocities \(V^0_i~\in ~\{-a,a\}\) distinguish the samples of \(f^+_0(x)\) from those of \(f^-_0(x)\), a new set of samples \((X_1,V_1),\ldots ,(X_{N},V_N)\) is generated by Algorithm 1.

Algorithm 1
figure a

Monte Carlo for \(2\times 2\) hyperbolic relaxation systems
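
Since Algorithm 1 is only displayed as a figure, the following NumPy sketch of a single time step summarizes our reading of (13), (16)-(17) and (19); the histogram reconstruction and all names are ours, and the bound \(a \ge |F(u)|/u\) is assumed so that the probabilities stay in [0, 1].

```python
import numpy as np

def mc_step(X, V, a, dt, eps, F, dx, rng):
    """One step of the direct MC method for (10), nonnegative solutions.

    Transport (13): X_i += V_i * dt, with V_i in {-a, a}.
    Relaxation (16): with probability 1 - exp(-dt/eps) a particle draws a new
    velocity, +a with probability E^+(u_j)/u_j, where u_j is a histogram
    reconstruction of u in the cell containing the particle, cf. (19).
    """
    N = len(X)
    X = X + V * dt

    # histogram reconstruction on a uniform grid of size dx
    j = np.floor(X / dx).astype(int)
    j -= j.min()
    u = np.bincount(j)[j] / (N * dx)                 # u_j at each particle

    p_plus = (a * u + F(u)) / (2.0 * a * u)          # E^+(u_j)/u_j, cf. (17)
    interact = rng.random(N) < 1.0 - np.exp(-dt / eps)
    new_V = np.where(rng.random(N) < p_plus, a, -a)
    return X, np.where(interact, new_V, V)

# e.g. Burgers flux F = lambda u: 0.5 * u**2, with a >= max |F(u)|/u = max(u)/2
```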

Note that there is no need for the grid to be uniform, and the choice of the grid points can easily change at any time step. Although in principle using higher order fractional step methods would lead to an increase in the order of accuracy in \(\varDelta t\), in the limit \(\varepsilon \rightarrow 0\) such methods, as is well known, degenerate to first order [21, 33, 34]. It is to date an open problem to construct splitting techniques of order higher than one that maintain their accuracy in the limit \(\varepsilon \rightarrow 0\).

Algorithm 1 can be improved by adopting a simple low variance technique which estimates the number of particles that should have the different velocities before their assignment. This can be done by replacing step 3 by first computing in each cell j the total number of interacting particles

$$\begin{aligned} N^c_j(t)= \textrm{SRound}\left( (1-e^{-\varDelta t/\varepsilon })N_j(t)\right) , \end{aligned}$$
(20)

where with \(\textrm{SRound} (x)\) we denote a “stochastic rounding” of a positive real number, by considering

$$\begin{aligned} \textrm{SRound}(x) = {\left\{ \begin{array}{ll} {[}x] +1 &{}\quad \textrm{with} \,\, \textrm{probability}\,\, x-[x]\\ {[}x] &{}\quad \textrm{with} \,\, \textrm{probability} \,\, 1-x+[x], \end{array}\right. } \end{aligned}$$

[x] denoting the integer part of x, and then assigning the velocities a and \(-a\) respectively to a number

$$\begin{aligned} N_j^+(t) = \textrm{SRound}\left( \frac{E^+(u_j(t))}{u_j(t)}{N}^c_j(t)\right) ,\quad N_j^-(t)=N^c_j(t)-N_j^+(t), \end{aligned}$$
(21)

of randomly chosen particles in cell j. An implementation is reported in Algorithm 2.

Algorithm 2
figure b

Monte Carlo variance reduction in relaxation step
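
A compact sketch of the stochastic rounding and of the variance-reduced assignment (20)-(21) for the particles of a single cell might look as follows; the function names and the cell bookkeeping are ours, and a full relaxation step loops over all cells.

```python
import numpy as np

def sround(x, rng):
    """Stochastic rounding of a nonnegative real number x, as defined above."""
    lo = np.floor(x)
    return int(lo) + int(rng.random() < x - lo)

def relax_cell_low_variance(V, cell, u, a, dt, eps, F, rng):
    """Variance-reduced relaxation step (20)-(21) in one cell (positive u).

    V    : global array of particle velocities (+-a), modified in place,
    cell : indices of the particles belonging to the cell,
    u    : reconstructed value u_j in the cell.
    """
    Nj = len(cell)
    Nc = sround((1.0 - np.exp(-dt / eps)) * Nj, rng)                 # (20)
    Np = min(sround((a * u + F(u)) / (2.0 * a * u) * Nc, rng), Nc)   # N_j^+, (21)
    chosen = rng.choice(cell, size=Nc, replace=False)   # random interacting particles
    V[chosen[:Np]] = a                                  # N_j^+ particles get velocity +a
    V[chosen[Np:]] = -a                                 # the remaining N_j^- get -a
    return V
```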

Remark 1

As a side result of our derivation, we constructed a Monte Carlo method for a general scalar conservation law of the form (8). As for the construction of deterministic relaxation schemes for conservation laws [22], this is easily obtained by taking the limit case \(\varepsilon \rightarrow 0\) in the schemes just described (see [32]).

Remark 2

Note that in the MC method both the mesh discretization and the choice of the number of particles (and clearly also the time discretization) contribute to the numerical error. In essence, we need to take \(\varDelta x\) small to reduce the mesh bias, but also large enough to ensure that the variance is small. In fact, it is possible to estimate the optimal value of \(\varDelta x_{opt}\) for the mesh discretization in order to maximize the performance of the method. Details on this are discussed in “Appendix A.1”.

3.1.1 The Case of Negative Solutions

If we apply the method to initial conditions and solutions that may also be negative, we need to associate particles with possibly negative weights. Thus, \(m_i \in \{-m,m\}\), where \(m>0\) will still be referred to as the particle mass, and we reconstruct the solution as

$$\begin{aligned} u_j(t)=\frac{1}{N}\sum _{k=1}^N m_k \varPhi _{\varDelta x}(x_{j+1/2}-X_k),\qquad j=\ldots ,-2,-1,0,1,2,\ldots \end{aligned}$$
(22)

In this context, we observe that the equilibrium states \(E^{\pm }(u)\) may be either positive or negative and, consequently, the probabilities \(p^+\) and \(p^-\) defined in (15) may also become negative, leading to a failure of the probabilistic interpretation. To avoid this, we can rely on the probabilistic interpretation associated with the notion of area defined by the absolute value of the equilibrium states in the computational cell. More precisely, in each cell, we define the equilibrium probabilities as

$$\begin{aligned} {p^+_E}(x,\varDelta t) = \frac{|E^{+}(u)|}{|{E^{+}(u)}|+|{E^{-}(u)}|} ,\quad {p^-_E}(x,\varDelta t) = \frac{|E^{-}(u)|}{|{E^{+}(u)}|+|{E^{-}(u)}|}. \end{aligned}$$
(23)

Note that when applying this definition,

$$\begin{aligned} \textrm{if} \quad {E^{+}} {E^{-}} > 0 \quad \Rightarrow \quad |{E^{+}(u)}|+|{E^{-}(u)}|=|{u}|, \end{aligned}$$

and so the probabilistic interpretation of Algorithm 1 remains valid. However, an inconsistency arises if the equilibrium states have different signs, because

$$\begin{aligned} \textrm{if} \quad {E^{+}} {E^{-}} < 0 \quad \Rightarrow \quad |{E^{+}(u)}|+|{E^{-}(u)}|>|{u}|. \end{aligned}$$

In fact, in the latter situation, we are faced with an indeterminate problem, because there are infinitely many possible allocations of particles with opposite masses to the two equilibrium states that match their values (see Fig. 2 for a sketch).

Fig. 2
figure 2

Left: the state variable u uniquely defines the number of particles in the two equilibrium states \(E^{\pm }\) when the latter have the same sign. Right: there exist infinitely many possible ways to associate particles with the equilibrium states \(E^{\pm }\) when the equilibria have opposite signs

To overcome this problem and re-equilibrate the distribution of particles, after the relaxation step based on the probabilities (23), we can follow two strategies in each cell:

  1. (i)

    keep the particle mass fixed and introduce a variable number of particles;

  2. (ii)

    keep the number of particles fixed and introduce a variable mass.

Both strategies have pros and cons, and we briefly describe the corresponding implementations below. The first choice (i) results in the need to discard or re-sample particles in each cell, and this can be done by a particle killing/replication strategy, as in the case of the reaction term discussed at the end of Sect. 2 for system (2). Precisely, in each cell j of width \(\varDelta x\), we define the new particle numbers

$$\begin{aligned} {\tilde{N}}^+_j = \textrm{SRound}\left( \frac{|E^+(u_j)|N\varDelta x}{m}\right) ,\qquad {\tilde{N}}^-_j = \textrm{SRound}\left( \frac{|E^-(u_j)|N\varDelta x}{m}\right) . \end{aligned}$$
(24)

Then, we re-sample \(({\tilde{N}}^+_j - N_j^+)\) particles with positive velocity if \({\tilde{N}}^+_j > N_j^+\), or discard \((N_j^+-{\tilde{N}}^+_j)\) particles with positive velocity if \({\tilde{N}}^+_j < N_j^+\), where \(N_j^+\) is the number of particles with positive velocity prior to the reassignment. The same, of course, is done for particles with negative velocity.

In the second alternative (ii) we fix the number of particles \(N_j\) belonging to the j-th cell and, consequently, reassign their mass, which will thus depend on the cell. This can be done by defining an updated mass value \(\tilde{m}_j\) in each cell so as to respect the following balance:

$$\begin{aligned} \tilde{m}_j(t) N_j(t) = \left( |{E^{+}(u_j)}|+|{E^{-}(u_j)}|\right) N \varDelta x. \end{aligned}$$
(25)

Finally, in both cases (i) and (ii), the sign of mass is assigned to each particle according to the sign of the corresponding equilibrium. The details of the second strategy (ii) are reported in Algorithm 3, where for simplicity we restrict to the limiting case \(\varepsilon \rightarrow 0\).

Algorithm 3
figure c

Weighted Monte Carlo for scalar conservation laws
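
Since Algorithm 3 is only shown as a figure, the following sketch illustrates the relaxation step of strategy (ii) in the limit \(\varepsilon \rightarrow 0\), i.e. the reconstruction (22), the probabilities (23) and the mass balance (25); all names are ours and the histogram is a simple uniform-grid version.

```python
import numpy as np

def weighted_relax(X, m, a, dx, F, N, rng):
    """Relaxation step of the weighted MC, strategy (ii), limit eps -> 0.

    X, m : particle positions and signed masses, N : number of particles
    (the 1/N factor in (22)), dx : cell size of the reconstruction grid.
    Returns the new velocities and the new (cell-dependent) signed masses.
    """
    j = np.floor(X / dx).astype(int)
    j -= j.min()
    u = np.bincount(j, weights=m) / (N * dx)          # reconstruction (22)

    Ep = (a * u + F(u)) / (2.0 * a)                   # E^+(u_j)
    Em = (a * u - F(u)) / (2.0 * a)                   # E^-(u_j)
    Nj = np.bincount(j)

    m_new = (np.abs(Ep) + np.abs(Em)) * N * dx / np.maximum(Nj, 1)     # balance (25)
    p_plus = np.abs(Ep) / np.maximum(np.abs(Ep) + np.abs(Em), 1e-300)  # (23)

    go_plus = rng.random(len(X)) < p_plus[j]
    V = np.where(go_plus, a, -a)
    # the sign of the mass follows the sign of the chosen equilibrium state
    m = np.where(go_plus, np.sign(Ep[j]), np.sign(Em[j])) * m_new[j]
    return V, m
```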

3.2 The Gradient-Based Monte Carlo Method

In this section we show how the Monte Carlo approach just described can be improved significantly using a gradient random walk strategy. In fact, by introducing the auxiliary variables \(w=\partial u/\partial x\) and \(z=\partial v/\partial x\), we can rewrite (7) in the form

$$\begin{aligned} \begin{aligned} \frac{\partial w}{\partial t} + \frac{\partial z}{\partial x}&= 0,\\ \frac{\partial z}{\partial t} + a^2 \frac{\partial w}{\partial x}&= -\frac{1}{\varepsilon } \left( z-F'(u)w\right) . \end{aligned} \end{aligned}$$
(26)

Most of the arguments presented for system (7) apply again. First we introduce the new variables

$$\begin{aligned} g^+=\frac{(aw+z)}{2a},\qquad g^-=\frac{(aw-z)}{2a}, \end{aligned}$$

which satisfy the diagonal system

$$\begin{aligned} \begin{aligned} \frac{\partial g^+}{\partial t} + {a}\frac{\partial g^+}{\partial x}&= -\frac{1}{\varepsilon } (g^+-D^+(u,w)),\\ \frac{\partial g^-}{\partial t} - {a} \frac{\partial g^-}{\partial x}&= -\frac{1}{\varepsilon } (g^--D^-(u,w)), \end{aligned} \end{aligned}$$
(27)

where the equilibrium states now read

$$\begin{aligned} D^+(u,w)=\frac{w({a}+F'(u))}{2{a}}, \qquad D^-(u,w)=\frac{w({a}-F'(u))}{2{a}}. \end{aligned}$$

After splitting the above system, the transport step and the collision/relaxation step can be solved using the same Monte Carlo approach described in the previous sections, but with a substantial difference: there is no need to reconstruct the solution \(u(x,t)\) on a space grid during the relaxation step. In fact, given a set of samples located in \(X_1\),\(\ldots \), \(X_N\) with positive and negative masses \(m_i \in \{-m,m\}\), we can compute \(u(X_i)\) following (6). Since \(D^+(u,w)+D^-(u,w)=w\), under the subcharacteristic condition \(a>|F'(u)|\), the probabilities of a random velocity change read

$$\begin{aligned} p^+_D(x,\varDelta t)=\frac{D^{+}(u,w)}{w}=\frac{a+ F'(u)}{2a}, \qquad p^-_D(x,\varDelta t)=\frac{D^{-}(u,w)}{w}=\frac{a- F'(u)}{2a}. \end{aligned}$$
(28)

Therefore, applying the gradient random walk strategy, the probabilities turn out to depend only on u (and not on w), and u is now a particle-dependent (rather than grid-dependent) variable. This aspect is of utmost importance because it leads to a grid-free method, with consequent advantages in terms of both accuracy and computational efficiency.

Thus, we can solve system (26) with a Gradient-based Monte Carlo (GBMC) method as follows. Starting with a set of samples \((X^0_1,V^0_1), \ldots , (X^0_{N},V^0_N)\), with \(V^0_i\in \{-a,a\}\) and each sample carrying a mass \(m_i\in \{-m,m\}\), a new set of samples \((X_1,V_1),\ldots ,(X_{N},V_N)\) is generated as reported in Algorithm 4.

Algorithm 4
figure d

Gradient-based Monte Carlo for \(2\times 2\) hyperbolic relaxation systems
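
Algorithm 4 is again displayed only as a figure; the following grid-free sketch of one GBMC step for (26)-(27) reflects our reading of (6) and (28), with function names of our choosing.

```python
import numpy as np

def gbmc_step(X, V, m, a, dt, eps, F_prime, rng):
    """One time step of the GBMC method for (26)-(27); no spatial grid.

    X, V, m : positions, velocities (+-a) and signed masses (+-m) of the
    gradient samples. u(X_i) is obtained by the left reconstruction (6).
    """
    N = len(X)
    X = X + V * dt                                     # free transport

    # left reconstruction (6): u(X_i) = (1/N) sum_k m_k H(X_i - X_k)
    order = np.argsort(X)
    csum = np.cumsum(m[order]) / N
    u = csum[np.searchsorted(X[order], X, side="right") - 1]

    # relaxation: with probability 1 - exp(-dt/eps) resample the velocity,
    # +a with probability p^+_D = (a + F'(u))/(2a), cf. (28)
    p_plus = (a + F_prime(u)) / (2.0 * a)
    interact = rng.random(N) < 1.0 - np.exp(-dt / eps)
    new_V = np.where(rng.random(N) < p_plus, a, -a)
    return X, np.where(interact, new_V, V)

# e.g. Burgers: F_prime = lambda u: u, with a > max |u| (subcharacteristic condition)
```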

Clearly, similar to the standard Monte Carlo case, taking the limit \(\varepsilon \rightarrow 0\) in the above algorithm leads to a Gradient-based Monte Carlo method for a general scalar conservation law of the form (8). Let us mention that other approximations for scalar conservation laws based on Gradient-based Monte Carlo are found in the literature for rather specific situations, like the Burgers equation (see [41] for details).

Remark 3

In addition to the left reconstruction (6), denoted by \(u_i^L(t)\), or the analogous right reconstruction \(u_i^R(t)\), other reconstructions can be implemented that interpolate between the two. For example, to avoid asymmetric reconstructions, a weighted average of the left and right reconstructions can be computed over the whole computational range \([x_{\min },x_{\max }]\), where \(x_{\min }={\min _i}\{X_i\}\), \(x_{\max }={\max _i}\{X_i\}\), considering linearly distributed weights \({\omega }\) in [0, 1]:

$$\begin{aligned} u_i(t) = (1-{\omega _i}) u_i^L(t) + {\omega _i} u_i^R(t),\qquad \omega _i = (X_i-x_{\min })/(x_{\max }-x_{\min }). \end{aligned}$$
(29)

Of course, more sophisticated reconstructions can be designed similarly. In the numerical examples, if not otherwise stated, we will make use of (29).
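
A possible implementation of (29), assuming that the left and right reconstructions enforce exactly the left and right boundary states \(u(-\infty )\) and \(u(+\infty )\) respectively (our reading of Remark 3), is sketched below.

```python
import numpy as np

def u_blend(X, m, u_minus=0.0, u_plus=1.0):
    """Weighted reconstruction (29) at the (sorted) particle positions.

    u_minus, u_plus : boundary values u(-inf), u(+inf); the left reconstruction
    matches u_minus exactly, the right one matches u_plus, and (29) blends them
    with linear weights omega_i over [x_min, x_max].
    """
    N = len(X)
    order = np.argsort(X)
    Xs, ms = X[order], m[order]
    uL = u_minus + np.cumsum(ms) / N                   # left reconstruction (6)
    uR = u_plus - (np.sum(ms) - np.cumsum(ms)) / N     # right reconstruction
    w = (Xs - Xs[0]) / (Xs[-1] - Xs[0])                # omega_i in (29)
    return Xs, (1.0 - w) * uL + w * uR
```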

Remark 4

We emphasize that both the MC and GBMC methods proposed here present a convergence rate of \(O(1/\sqrt{N})\). However, as also clarified in the numerical results to follow, the MC method can exhibit a convergence of \(O(1/\root 3 \of {N})\) when the spatial mesh is chosen on the basis of the optimal value \(\varDelta x _{opt}\). For further details and a rigorous analysis of the reconstruction errors of the methods, we refer the reader to Appendix A.

3.3 Numerical Examples for Scalar Conservation Laws

In this section we compare the numerical solutions obtained with the standard Monte Carlo approach and the Gradient-based Monte Carlo method for different problems governed by scalar conservation laws, including an empirical convergence rate test. Since the \(\varepsilon \rightarrow 0\) case is the most challenging, as the limiting nonlinear scalar conservation law can form discontinuities in finite time, we limit the presentation of results to this situation.

3.3.1 Empirical Convergence Rate

Fig. 3
figure 3

Convergence rate analysis. Comparison of the relative \(L^2\) error norms of the direct Monte Carlo (MC), the Monte Carlo with optimal choice of the grid (MC\(_{opt}\)), and the gradient-based Monte Carlo (GBMC) with respect to the number of particles N for the solution of the inviscid Burgers equation at \(t=2.5\) with normal distribution as initial datum (left) and at \(t=0.5\) with sinusoidal distribution as initial datum (right)

Table 1 Convergence rate analysis. Gain factor (intended as \(L^2\) norms ratio) from using the GBMC instead of both MC versions, in terms of accuracy for different number of particles N, evaluated for the solution of the inviscid Burgers equation at \(t=2.5\) with normal distribution as initial datum (a) and at \(t=0.5\) with sinusoidal distribution as initial datum (b)

In the first test case, we compute the convergence rate of the methods with respect to the number of particles by solving the inviscid Burgers equation, corresponding to \(F(u)=u^2/2\) in (7) and \(\varepsilon \rightarrow 0\). First, we consider a normal distribution as initial datum and run the simulations up to \(t=2.5\), prior to the classical shock formation; then, a test with a sinusoidal initial condition (i.e., involving also particles with negative masses), with the solution evaluated at \(t=0.5\), is taken into account. In Fig. 3 we compare the convergence rates obtained in the two test cases using

  • the Monte Carlo method while keeping the mesh size \(\varDelta x\) fixed as the number of particles N is refined (MC);

  • the Monte Carlo method with the optimal choice of the mesh size \(\varDelta x\) as a function of the number of particles N (MC\(_{opt}\)), as discussed in “Appendix A.1”, in particular referring to Eq. (62);

  • the Gradient-based Monte Carlo method (GBMC).

These results are given in terms of the relative \(L^2\) norms, being

$$\begin{aligned} L^2_{MC} = \sqrt{\frac{\sum _{j=1}^{M} (u_j - u^{ref}_j)^2}{\sum _{j=1}^{M} (u^{ref}_j)^2}}, \qquad L^2_{GBMC} = \sqrt{\frac{\sum _{i=1}^{N} (u_i - u^{ref}_i)^2}{\sum _{i=1}^{N} (u^{ref}_i)^2}}, \end{aligned}$$

with \(u^{ref}\) the reference solution. Here we considered as reference solution the one obtained with a finite volume Godunov method on a very refined mesh. All the errors are computed with respect to the mean of 5 runs with N particles, while keeping fixed an empirically chosen, sufficiently small time step \(\varDelta t\) (to prevent the time error from overriding the error due to the choice of particle number, thus altering the convergence curves). As discussed in “Appendix A”, it can be observed that both the standard MC and GBMC methods present an error decay of \(O(1/\sqrt{N})\), while, when optimizing the choice of the mesh size, the MC method converges with \(O(1/\root 3 \of {N})\). Nevertheless, the error produced by the GBMC is always smaller than that obtained by applying either the optimized or the non-optimized MC method. In particular, in Table 1 we report the precise gain factor obtained by using the GBMC instead of either MC version, in terms of accuracy as a function of the number of particles N, intended as the ratio of the two norms \(L^2_{MC}/L^2_{GBMC}\) and \(L^2_{MC_{opt}}/L^2_{GBMC}\). Let us finally point out that in both test cases the degradation of the order of accuracy of the MC method observed for large N is due to the fact that the mesh error dominates the statistical error due to the particle number (thus the convergence curve saturates at an asymptotic value of the \(L^2\) error caused by the fixed \(\varDelta x\)). This is in good agreement with the analysis in “Appendix A”.
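
For reference, the relative \(L^2\) errors above can be computed with a one-line helper, evaluated on the cell values for the MC solution and on the particle values for the GBMC solution:

```python
import numpy as np

def rel_l2(u, u_ref):
    """Relative L^2 error between a computed and a reference solution,
    sampled at the same points (cells for MC, particle positions for GBMC)."""
    return np.sqrt(np.sum((u - u_ref) ** 2) / np.sum(u_ref ** 2))
```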

Fig. 4
figure 4

Test 1(a), Burgers equation with a Gaussian as initial datum, shown in gray dotted line. Solution at \(t=10\) obtained with MC and GBMC methods with different choices of number of particles N, grid points M (only for MC) and time step \(\varDelta t\), fixing \(a=0.4\). The subplots in the GBMC results show the histogram of the gradient-based particle positions. The reference solution is reported in black solid line

Fig. 5
figure 5

Test 1(b), Burgers equation with a square wave as initial datum, shown in gray dotted line. Solution at \(t=10\) obtained with MC and GBMC methods with different choices of number of particles N and grid points M (only for MC), fixing \(a=0.6\) and \(\varDelta t = 0.01\). The subplots in the GBMC results show the histogram of the gradient-based particle positions. The reference solution is reported in black solid line

Fig. 6
figure 6

Test 1(c), Burgers equation with a sinusoidal initial datum, shown with gray dotted line. Solution at \(t=3\) obtained with MC and GBMC methods with different choices of number of particles N and grid points M (only for MC), fixing \(a=1.5\) and \(\varDelta t = 0.01\). The subplots in the GBMC results show the histogram of the gradient-based particle positions. The reference solution is reported in black solid line

3.3.2 Inviscid Burgers Equation

The MC (without the low variance technique) and GBMC methods are then applied again to the inviscid Burgers equation considering three different initial conditions.

  • Test 1(a): In the first case, the initial datum of the inviscid Burgers equation is a Gaussian density with zero mean and unit variance.

  • Test 1(b): In the second case, we consider a rectangular wave as initial datum

    $$\begin{aligned} u(x,0)= \left\{ \begin{array}{ll} 0.4 \quad &{}\textrm{if}\,\, -2\le x\le 2\\ 0 \quad &{}\textrm{otherwise}\,. \end{array}\right. \end{aligned}$$
    (30)
  • Test 1(c): In the third case, we fix as initial condition a sinusoidal function

    $$\begin{aligned} u(x,0)=\sin (x)\,, \end{aligned}$$
    (31)

    to assess the performance of the methods even when considering negative solutions, hence introducing particles with negative mass.

For each test case, results are given at equal time discretization for the two methods in Figs. 4–6. With the Monte Carlo method, \(N=1000\) or \(N=10000\) particles and \(M=100\) cells for the domain discretization are considered, while with the GBMC approach \(N=100\) or \(N=1000\) particles are used. The reference solution is obtained employing a finite volume Godunov method with a very refined spatial grid. By comparing the results obtained with the two methods, the remarkable improvement in variance reduction and in capturing the shock fronts achieved by the GBMC method appears evident, even though a largely reduced number of particles is used with respect to the standard MC. Moreover, in Fig. 4 the influence of the choice of the time step size on the final solution can also be observed for both methods. In particular, we note that the choice \(\textrm{CFL} =1\) (and thus \(\varDelta t = 0.5\) for a mesh with \(M=50\) cells and \(\varDelta t = 0.25\) for a grid with \(M=100\) cells) leads to an evident numerical dissipation, visible near the shock. In fact, we remark that the Monte Carlo approach proposed here, for \(N \rightarrow \infty \) and \(\textrm{CFL} = 1 \Rightarrow \varDelta t = \varDelta x/a\), coincides with the Lax–Friedrichs scheme (see [22] for further details), which is known to produce considerable numerical diffusion.

Fig. 7
figure 7

Test 2, LWR traffic model, Riemann problem (32) (initial datum shown in gray dotted line). Solution at \(t=0.5\) obtained with MC and GBMC methods with different choices of number of particles N and grid points M (only for MC), fixing \(a=1.2\), \(\varDelta t = 0.01\). The subplots in the GBMC results show the histogram of the gradient-based particle positions. The reference solution is reported in black solid line

3.3.3 Lighthill–Whitham–Richards Traffic Model

As a second scalar conservation law, we consider the Lighthill–Whitham–Richards (LWR) traffic model [38], for which \(F(u)=u(1-u)\) in (7) and \(\varepsilon \rightarrow 0\), taking into account the following initial conditions.

  • Test 2: We consider the Riemann problem (RP) presented in [14], in which

    $$\begin{aligned} u(x,0) = \left\{ \begin{array}{ll} 0.4 \quad &{}\textrm{if} \,\,-1 \le x \le 0,\\ 0.8 \quad &{}\textrm{if} \,\,0 < x \le 1,\\ 0 \quad &{}\textrm{otherwise}. \end{array} \right. \end{aligned}$$
    (32)

Similar observations to those made for Test 1 can also be made here when comparing the results in Fig. 7, which shows the numerical solutions of the LWR test case obtained by applying the standard Monte Carlo and the GBMC method with different numbers of particles. The greatly reduced variance of the GBMC solutions can again be appreciated, even though fewer particles are used than in the direct Monte Carlo, which also brings a consequent speed-up of the simulation. Moreover, one can observe the capability of the proposed method to capture well the sharp discontinuities arising in the dynamics, thanks to its adaptive nature of following the solution through particles precisely where the gradient is large.

4 Extensions to Hyperbolic Systems of Conservation Laws

In this section we show how to extend the Monte Carlo and Gradient-based Monte Carlo techniques to the case of hyperbolic relaxation approximations to systems of conservation laws. As before, the methods we derive here work uniformly with respect to the stiff relaxation rate and, in the zero relaxation limit, yield a Monte Carlo or Gradient-based Monte Carlo method for the corresponding hyperbolic system of conservation laws. Our attention will be focused in particular on the behavior of the methods in that limit.

4.1 A Monte Carlo Approach

Consider the system of conservation laws in one space variable

$$\begin{aligned} \frac{\partial \textbf{u}}{\partial t} + \frac{\partial \textbf{F}(\textbf{u})}{\partial x} = 0, \end{aligned}$$
(33)

with \(x\in \varOmega \subseteq \mathbb {R}\), \(\textbf{u}=(u_1,\ldots ,u_n)\in \mathbb {R}^n\), \(n\ge 2\). The above system is strictly hyperbolic if the Jacobian matrix \(\textbf{F}'(\textbf{u})\) admits n distinct real eigenvalues \(\lambda _1<\ldots < \lambda _n\). The system is complemented with the initial conditions \(\textbf{u}(x,0)=\textbf{u}_0(x)\) and suitable boundary conditions.

The relaxation approximation now reads [22]

$$\begin{aligned} \begin{aligned} \frac{\partial \textbf{u}}{\partial t} + \frac{\partial \textbf{v}}{\partial x}&= 0,\\ \frac{\partial \textbf{v}}{\partial t} + A^2 \frac{\partial \textbf{u}}{\partial x}&= -\frac{1}{\varepsilon } (\textbf{v}-\textbf{F}(\textbf{u})), \end{aligned} \end{aligned}$$
(34)

where \(\textbf{v}\in \mathbb {R}^n\) and \(A^2=\textrm{diag}\{a_1^2,\ldots ,a_n^2\}\) must satisfy the dissipative condition \(A^2 > \textbf{F}'(\textbf{u})^2\) (i.e., the matrix \(A^2 - \textbf{F}'(\textbf{u})^2\) is positive semi-definite) for all \(\textbf{u}\), with initial conditions \(\textbf{v}(x,0)=\textbf{v}_0(x)\) and suitable boundary conditions. Notice that, for \(\textbf{u}\) varying in a bounded domain, the dissipative condition can always be satisfied by choosing A sufficiently large, but, because of the CFL condition, for numerical stability it is desirable to choose the smallest A meeting the criterion. Following [22], we say that system (34) is dissipative if it is strictly stable in the sense of Majda-Pego, which is satisfied if \(a^2_h > \lambda ^2_h,\, h=1,\ldots ,n\).

The diagonal variables are

$$\begin{aligned} \textbf{f}^{\pm }=A^{-1}\frac{A\textbf{u}\pm \textbf{v}}{2}, \end{aligned}$$

which yield the system

$$\begin{aligned} \begin{aligned} \frac{\partial \textbf{f}^+}{\partial t} + A\frac{\partial \textbf{f}^+}{\partial x}&= -\frac{1}{\varepsilon } (\textbf{f}^+-\textbf{E}^+(\textbf{u}))\\ \frac{\partial \textbf{f}^-}{\partial t} - A \frac{\partial \textbf{f}^-}{\partial x}&= -\frac{1}{\varepsilon } (\textbf{f}^--\textbf{E}^-(\textbf{u})), \end{aligned} \end{aligned}$$
(35)

with

$$\begin{aligned} \textbf{E}^{\pm }(\textbf{u}) = A^{-1}\frac{A\textbf{u}\pm \textbf{F}(\textbf{u})}{2}. \end{aligned}$$
(36)

As in the case of scalar conservation laws, if we assume that the initial conditions may also be negative, the equilibrium states \(\textbf{E}^{\pm }(\textbf{u})\) may be either positive or negative as well. Therefore, recalling the discussion presented in Sect. 3.1.1, we solve the system of equations component-wise. We associate to each component h its own family of particles and define the probabilities of a random velocity change for each family as

$$\begin{aligned} p_E^{h,+}(x,\varDelta t) = \frac{|E_h^{+}(\textbf{u})|}{|{E_h^{+}(\textbf{u})}|+|{E_h^{-}(\textbf{u})}|}, \qquad p_E^{h,-}(x,\varDelta t) = \frac{|E_h^{-}(\textbf{u})|}{|{E_h^{+}(\textbf{u})}|+|{E_h^{-}(\textbf{u})}|}, \end{aligned}$$
(37)

and proceed similarly to the scalar case, by either considering a variable particle number or updating the particles’ mass. In the latter case, in each cell j of width \(\varDelta x\) we define for each species h the mass value that fulfills the following relation:

$$\begin{aligned} \tilde{m}^h_j N_j^h = \left( |{E_h^{+}(\textbf{u}_j)}|+|{E_h^{-}(\textbf{u}_j)}|\right) {\varDelta x \,N}. \end{aligned}$$
(38)

Here \(N_j^h\) is the number of particles of the family h inside the j-th cell. Notice that we consider that particles in the same cell belonging to different components of the system do not interact. Thus, the different components are completely decoupled except for the equilibrium states \(E^\pm _h\), where the coupling occurs. In Algorithm 5 we summarize the weighted Monte Carlo scheme in the limiting case \(\varepsilon \rightarrow 0\), starting with n sets of samples \((X^{h,0}_1,V^{h,0}_1)\), \(\ldots \), \((X^{h,0}_{N^h},V^{h,0}_{N^h})\), with masses \((m^{h,0}_1,\ldots ,m^{h,0}_{N^h})\), \(h=1,\ldots ,n\), where \(V^{h,0}_i\in \{-a_h,a_h\}\).

Algorithm 5
figure e

Weighted Monte Carlo for systems of conservation laws
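
To make the coupling through the equilibrium states explicit, the next sketch evaluates (36) and the per-family probabilities (37) from a vector of reconstructed cell values; the shallow water flux used as an example, the value of g, and the speeds \(a_h\) taken from Test 3(a) are our own illustrative assumptions.

```python
import numpy as np

def equilibria_and_probabilities(u, F, a):
    """Equilibrium states (36) and equilibrium probabilities (37).

    u : reconstructed cell values, shape (n, M); F : flux returning an array
    of the same shape; a : relaxation speeds a_h, shape (n,).
    """
    Fu = F(u)
    a = a[:, None]
    Ep = (a * u + Fu) / (2.0 * a)                      # E_h^+(u)
    Em = (a * u - Fu) / (2.0 * a)                      # E_h^-(u)
    den = np.maximum(np.abs(Ep) + np.abs(Em), 1e-300)
    return Ep, Em, np.abs(Ep) / den, np.abs(Em) / den

# example flux (assumption): shallow water (45) with u = (h, hu) and g = 9.81
F_sw = lambda u: np.array([u[1], 9.81 * u[0] ** 2 / 2.0 + u[1] ** 2 / u[0]])
a_sw = np.array([4.45, 5.10])                          # speeds used in Test 3(a)
```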

Remark 5

We remark that the low variance technique presented in Algorithm 2 can be straightforwardly implemented also in this context. Clearly, taking the limit \(\varepsilon \rightarrow 0\) in Algorithm 5 yields a Monte Carlo method for general systems of conservation laws. It is worth noticing that the same strategy can be adopted starting from other relaxation approximations, such as the one proposed in [1].

4.2 The Gradient-Based Monte Carlo Method

To extend the Gradient-based Monte Carlo method to systems of conservation laws, we need to introduce the quantities \(\textbf{w}=\partial \textbf{u}/ \partial x\), \(\textbf{z}=\partial \textbf{v}/ \partial x\) to get from (34)

$$\begin{aligned} \begin{aligned} \displaystyle \frac{\partial \textbf{w}}{\partial t} + \frac{\partial \textbf{z}}{\partial x}&= 0,\\ \displaystyle \frac{\partial \textbf{z}}{\partial t} + A^2 \frac{\partial \textbf{w}}{\partial x}&= -\frac{1}{\varepsilon } (\textbf{z}-\textbf{F}'(\textbf{u})\textbf{w}). \end{aligned} \end{aligned}$$
(39)

In diagonal form, using the variables

$$\begin{aligned} \textbf{g}^{\pm }=A^{-1}\frac{A\textbf{w}\pm \textbf{z}}{2}, \end{aligned}$$

we obtain

$$\begin{aligned} \begin{aligned} \frac{\partial \textbf{g}^+}{\partial t} + A\frac{\partial \textbf{g}^+}{\partial x}&= -\frac{1}{\varepsilon } (\textbf{g}^+-\textbf{D}^+(\textbf{u},\textbf{w}))\\ \frac{\partial \textbf{g}^-}{\partial t} - A \frac{\partial \textbf{g}^-}{\partial x}&= -\frac{1}{\varepsilon } (\textbf{g}^--\textbf{D}^-(\textbf{u},\textbf{w})), \end{aligned} \end{aligned}$$
(40)

where now

$$\begin{aligned} \textbf{D}^{\pm }(\textbf{u},\textbf{w})=A^{-1}\frac{(A\pm \textbf{F}'(\textbf{u}))\textbf{w}}{2}. \end{aligned}$$
(41)

We can now observe that, in the case of general hyperbolic systems of conservation laws, we cannot make the probabilities \(p_D^\pm \) of velocity switches (from positive to negative or vice versa) in the relaxation process independent of the vector \(\textbf{w}\), unless we can diagonalize the Jacobian matrix \(\textbf{F}'(\textbf{u})\) and rewrite the system in an equivalent diagonal form. Thus, in the general case of systems it is not possible to avoid the introduction of a spatial grid. We emphasize, however, that a major difference from the Monte Carlo method persists, which ensures that better accuracy is achieved: the equilibrium states are defined for each individual particle, \(\textbf{w}\) being reconstructed on the grid cells but \(\textbf{u}\) being particle-dependent. This allows the solution \(\textbf{u}\) to be reconstructed by cumulative distribution in a manner analogous to the scalar case.

For the sake of simplicity, let us assume that the original system (33) can be rewritten in characteristic form through the Riemann invariants, so that the matrix \(\textbf{F}'(\textbf{u})\) is the diagonal matrix of the eigenvalues \(\lambda _1,\ldots ,\lambda _n\) (in order not to burden the notation, we omit the transition from the starting variables \(\textbf{u}\) to the diagonal variables \(\hat{\textbf{u}}\)). We can then write the equilibria (41) component-wise for each species h as

$$\begin{aligned} D_h^{{\pm }}(\textbf{u},\textbf{w})=\frac{(a_h\pm \lambda _h(\textbf{u}))w_h}{2a_h},\qquad h=1,\ldots ,n. \end{aligned}$$
(42)

Under the condition \(a_h > |\lambda _h(\textbf{u})|\), this allows the grid dependence to be removed from the probabilities of random velocity changes, which read

$$\begin{aligned} \begin{aligned} {p_D^{h,+}}(x,\varDelta t)&= \frac{D_h^{+}(\textbf{u},\textbf{w})}{w_h}=\frac{a_h+ \lambda _h(\textbf{u})}{2a_h}, \\ {p_D^{h,-}}(x,\varDelta t)&= \frac{D_h^{-}(\textbf{u},\textbf{w})}{w_h}=\frac{a_h- \lambda _h(\textbf{u})}{2a_h}. \end{aligned} \end{aligned}$$
(43)

Therefore, as in the case of a relaxation approximation to a scalar conservation law, when it is possible to rewrite the system in terms of characteristic variables, the GBMC algorithm keeps its grid-free property. In this situation, starting with n sets of samples \((X^{h,0}_1,V^{h,0}_1), \ldots , (X^{h,0}_{N^h},V^{h,0}_{N^h})\), \(h=1,\ldots ,n\), where \(V^{h,0}_i\in \{-a_h,a_h\}\) and each particle has a fixed mass \(m^h_i\in \{-m,m\}\), a new set of samples \((X^h_1,V^h_1),\ldots ,(X^h_{N_h},V^h_{N_h})\) is generated as presented in Algorithm 6.

Algorithm 6
figure f

Gradient-based Monte Carlo for general relaxation systems in characteristic form
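
As Algorithm 6 appears only as a figure, a grid-free sketch of one GBMC step for a system already written in characteristic form, following (42)-(43), could read as below; the left boundary values of the characteristic variables and the callable `lam` are our own assumptions for the interface.

```python
import numpy as np

def left_rec(Xq, X, m, u_minus=0.0):
    """Left reconstruction (6) of one characteristic component at points Xq,
    normalized by the number of samples of that family."""
    order = np.argsort(X)
    csum = np.concatenate(([0.0], np.cumsum(m[order]))) / len(X)
    return u_minus + csum[np.searchsorted(X[order], Xq, side="right")]

def gbmc_system_step(Xs, Vs, ms, A, dt, eps, lam, u_minus, rng):
    """One GBMC step for a relaxation system in characteristic form, cf. (42)-(43).

    Xs, Vs, ms : per-component lists of positions, velocities (+-a_h) and signed
    masses of the gradient samples; A : relaxation speeds a_h; lam(h, u) : the
    eigenvalue lambda_h evaluated at the vector u of characteristic variables;
    u_minus : left boundary values of the characteristic variables.
    """
    n = len(Xs)
    for h in range(n):
        Xs[h] = Xs[h] + Vs[h] * dt                          # free transport
    for h in range(n):
        # vector of characteristic variables at the particles of family h
        u = np.array([left_rec(Xs[h], Xs[g], ms[g], u_minus[g]) for g in range(n)])
        p_plus = (A[h] + lam(h, u)) / (2.0 * A[h])          # p_D^{h,+} in (43)
        Nh = len(Xs[h])
        interact = rng.random(Nh) < 1.0 - np.exp(-dt / eps)
        new_V = np.where(rng.random(Nh) < p_plus, A[h], -A[h])
        Vs[h] = np.where(interact, new_V, Vs[h])
    return Xs, Vs

# e.g. shallow water in the Riemann invariants (46), with G = (Gamma_1, Gamma_2):
# lam = lambda h, G: (G[0] + G[1]) / 2 + (1 if h == 0 else -1) * (G[0] - G[1]) / 4
```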

Remark 6

In the general case, as already mentioned, one cannot avoid the introduction of a spatial grid, since we need to reconstruct \(\textbf{w}\) on the grid cells j. We remark that the variables \(\textbf{w}\) in the GBMC play the same role as \(\textbf{u}\) in the direct MC, hence the reconstruction in space is still computed according to (22). Once this is done, for each particle \(X_i\) we define the probabilities

$$\begin{aligned} \begin{aligned} p_D^{h,+} (X_i,\varDelta t)&= \frac{|D_h^{+}(\textbf{u}(X_i),\textbf{w}_j)|}{|D_h^{+}(\textbf{u}(X_i),\textbf{w}_j)|+|D_h^{-}(\textbf{u}(X_i),\textbf{w}_j)|},\\ p_D^{h,-} (X_i,\varDelta t)&= \frac{|D_h^{-}(\textbf{u}(X_i),\textbf{w}_j)|}{|D_h^{+}(\textbf{u}(X_i),\textbf{w}_j)|+|D_h^{-}(\textbf{u}(X_i),\textbf{w}_j)|}, \end{aligned} \end{aligned}$$
(44)

and then apply the usual GBMC method, with the exception that for consistency we must introduce either particles (representing spatial derivatives) with different masses in each cell j or a variable number of particles, analogous to the weighted Monte Carlo technique.

4.3 Numerical Examples for Systems of Conservation Laws

Here we test the two numerical methodologies, the standard MC (possibly with the variance reduction technique presented in Algorithm 2) and the GBMC, on two different systems of conservation laws (namely, the shallow water equations and the Aw-Rascle traffic model), both of which can be rewritten in terms of characteristic variables, thus allowing the gradient approach to be applied without introducing a spatial mesh. Then, we present a test case for the isentropic Euler system without passing through the characteristic form of the model, thus considering the mesh-dependent version of the GBMC method. We restrict the presentation of our results to the limiting case \(\varepsilon \rightarrow 0\).

4.3.1 Shallow Water Equations

First, we consider the 1D shallow water equations with horizontal bottom topography [45]

$$\begin{aligned} \begin{aligned} \frac{\partial h}{\partial t} + \frac{\partial \left( h u\right) }{\partial x}&= 0,\\ \frac{\partial \left( h u\right) }{\partial t} + \frac{\partial }{\partial x}\left( \frac{gh^2}{2} + hu^2\right)&= 0, \end{aligned} \end{aligned}$$
(45)

where h is the water depth, u is the velocity, hu is the momentum, and g is the gravitational acceleration. System (45) can be written in the compact form (33), being

$$\begin{aligned} \textbf{u}= \left( h,\, h u\right) ^T, \qquad \textbf{F}(\textbf{u})=\left( h u,\, \frac{gh^2}{2} + hu^2\right) ^T.\end{aligned}$$

Therefore, we may write the relaxation approximation (34) and then directly apply the Monte Carlo method previously discussed for the case of systems of conservation laws.

To apply the GBMC approach in mesh-less form, it is first necessary to rewrite system (45) in terms of characteristic variables. Evaluating the eigenstructure of the system, whose eigenvalues are \(\lambda _{1,2}=u \pm c\), where \(c = \sqrt{gh}\), we can derive the following Riemann invariants:

$$\begin{aligned} \varGamma _{1,2} = u \pm 2c. \end{aligned}$$
(46)

It is therefore possible to rewrite the system in diagonal form, knowing that

$$\begin{aligned} \partial _t \varGamma _{1,2} + \lambda _{1,2}\,\partial _x \varGamma _{1,2} = 0,\end{aligned}$$

so the final system reads:

$$\begin{aligned} \begin{aligned} \frac{\partial }{\partial t}\left( u + 2c\right) + \left( u+c\right) \frac{\partial }{\partial x} \left( u + 2c\right)&= 0,\\ \frac{\partial }{\partial t}\left( u - 2c\right) + \left( u-c\right) \frac{\partial }{\partial x} \left( u - 2c\right)&= 0. \end{aligned} \end{aligned}$$
(47)

If we define

$$\begin{aligned} \hat{\textbf{u}}= \begin{pmatrix} u + 2c\\ u - 2c\end{pmatrix} = \begin{pmatrix} \varGamma _1 \\ \varGamma _2\end{pmatrix}, \qquad \textbf{F}'(\hat{\textbf{u}}) = \begin{pmatrix} u+c &{} 0 \\ 0 &{} u-c \end{pmatrix} = \begin{pmatrix} \frac{\varGamma _1 + \varGamma _2}{2} + \frac{\varGamma _1 - \varGamma _2}{4} &{} 0 \\ 0 &{} \frac{\varGamma _1 + \varGamma _2}{2} - \frac{\varGamma _1 - \varGamma _2}{4} \end{pmatrix}, \end{aligned}$$

the system can be written as

$$\begin{aligned} \frac{\partial \hat{\textbf{u}}}{\partial t} + \textbf{F}'(\hat{\textbf{u}})\frac{\partial \hat{\textbf{u}}}{\partial x}=0. \end{aligned}$$
(48)

When introducing \(\textbf{w}=\partial \hat{\textbf{u}}/ \partial x\) and \(\textbf{z}=\partial \hat{\textbf{v}}/ \partial x\), the relaxation approximation of the above system reads as (39), and the mesh-less GBMC algorithm can be straightforwardly applied.

Fig. 8
figure 8

Test 3(a), Shallow water equations, Riemann problem. Solution in terms of water depth h (left) and velocity u (right) at \(t=0.075\) obtained with MC, applying the low variance technique, and GBMC methods with different choices of number of particles N and time step \(\varDelta t\). The subplots in the GBMC results show the histogram of the gradient-based particle positions. The reference solution is reported in black solid line. The limit case is \(\varepsilon =10^{-8}\) with \(a_1=4.45\) and \(a_2=5.10\)

Fig. 9
figure 9

Test 3(b), Shallow water equations, Riemann problem. Solution in terms of water depth h (left) and velocity u (right) at \(t=0.1\) obtained with MC, applying the low variance technique, and GBMC methods with different choices of number of particles N and time step \(\varDelta t\). The subplots in the GBMC results show the histogram of the gradient-based particle positions. The exact solution is reported in black solid line. The limit case is \(\varepsilon =10^{-8}\) with \(a_{1,2}=8.20\)

We test the Monte Carlo and Gradient-based Monte Carlo methods proposed here on two Riemann problems designed with reference to [45].

  • Test 3(a): In the first case, in the domain \(\varOmega =[-0.5,0.5]\), we set

    $$\begin{aligned} \begin{aligned}&h(x,0) = 1,&\quad&u(x,0)=0, \qquad&\textrm{for} \quad x<0\,,\\&h(x,0) = 2,&\quad&u(x,0)=0, \qquad&\textrm{for} \quad x\ge 0\,. \end{aligned} \end{aligned}$$
  • Test 3(b): In the second case, we consider an almost dry bed solution in the domain \(\varOmega =[-1,1]\), imposing as initial conditions

    $$\begin{aligned} \begin{aligned}&h(x,0) = 1,&\quad&u(x,0)=-5, \qquad&\textrm{for} \quad x<0\,,\\&h(x,0) = 1,&\quad&u(x,0)=5, \qquad&\textrm{for} \quad x\ge 0\,. \end{aligned} \end{aligned}$$

We compare the solutions obtained in terms of the variables h and u by the two methods in Figs. 8 and 9. We remark here that in the GBMC approach the particles evolve in space and time following the characteristic variables \(\varGamma _{1,2}\) defined in (46). Therefore, the plots shown here are obtained considering that

$$\begin{aligned} h = \frac{c^2}{g} = \frac{(\varGamma _1 - \varGamma _2)^2}{16g} \quad \textrm{and} \quad u = \frac{\varGamma _1 + \varGamma _2}{2}.\end{aligned}$$

The augmented accuracy and the highly reduced variance of the solutions produced by the GBMC appear here even more evident when compared to the results obtained with the low variance Monte Carlo, even though 200 times fewer particles are used. Moreover, the capability of the GBMC method of better capturing the position and the sharpness of shock waves is again confirmed. Let us point out once more how this advantage is linked to the key feature of the proposed method, which allows the particles to move following the gradient of the solution.
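
For completeness, the conversion from the sampled Riemann invariants back to the physical variables used in the plots is a direct transcription of the relations above; the default value of g is an assumption.

```python
import numpy as np

def shallow_water_from_invariants(G1, G2, g=9.81):
    """Recover water depth h and velocity u from the Riemann invariants (46):
    h = (Gamma_1 - Gamma_2)^2 / (16 g), u = (Gamma_1 + Gamma_2) / 2."""
    return (G1 - G2) ** 2 / (16.0 * g), (G1 + G2) / 2.0
```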

Fig. 10

Test 4, Aw-Rascle model, Riemann problem. Solution in terms of density \(\rho \) (left) and flux \(\rho u\) (right) at \(t=1.0\) obtained with the MC method, applying the low variance technique, and with the GBMC method, for different choices of the number of particles N and of the time step \(\varDelta t\). The subplots in the GBMC results show the histogram of the gradient-based particle positions. The reference solution is reported as a black solid line. The limit case is \(\varepsilon =10^{-8}\) with \(a_{1,2}=0.8\)

4.3.2 Aw-Rascle Traffic Model

In this Section we test the standard and Gradient-based Monte Carlo methods considering the Aw-Rascle traffic model [2]:

$$\begin{aligned} \begin{aligned} \frac{\partial \rho }{\partial t} + \frac{\partial \left( \rho u\right) }{\partial x}&= 0,\\ \frac{\partial }{\partial t}\left( \rho u + \rho p(\rho )\right) + \frac{\partial }{\partial x}\left( \rho u^2 + \rho u p(\rho )\right)&= 0. \end{aligned} \end{aligned}$$
(49)

Here \(\rho \) represents the density of cars, u the velocity, and \(p(\rho )\) is a given function describing the drivers' anticipation of the road conditions in front of them. The system can be written in the compact form (33) by defining

$$\begin{aligned} \textbf{u}= \left( \rho ,\, \rho u\right) ^T, \qquad \textbf{F}(\textbf{u})=\left( \rho (u + p(\rho )),\, \rho u (u + p(\rho ))\right) ^T,\end{aligned}$$

so it would be possible to directly apply the Monte Carlo method. Instead, to resort to a grid-free gradient approach, it is again necessary to express the system in terms of characteristic variables. The eigenvalues of the model are \(\lambda _{1} = u\), \(\lambda _{2} = u - \rho p'(\rho )\), while the Riemann invariants are

$$\begin{aligned} \varGamma _1 = u + p(\rho )\,, \qquad \varGamma _2 = u\,. \end{aligned}$$
(50)

Hence, we can write the system in diagonal form as

$$\begin{aligned} \begin{aligned} \frac{\partial }{\partial t} \left( u + p(\rho )\right) + u\frac{\partial }{\partial x} \left( u + p(\rho )\right)&= 0,\\ \frac{\partial u }{\partial t} + \left( u - \rho p'(\rho )\right) \frac{\partial u}{\partial x}&= 0. \end{aligned} \end{aligned}$$
(51)

If we define

$$\begin{aligned} \hat{\textbf{u}}= \begin{pmatrix} u + p(\rho ) \\ u\end{pmatrix} = \begin{pmatrix} \varGamma _1 \\ \varGamma _2\end{pmatrix}, \qquad \textbf{F}'(\hat{\textbf{u}}) = \begin{pmatrix} u & 0 \\ 0 & u - \rho p'(\rho ) \end{pmatrix} = \begin{pmatrix} \varGamma _2 & 0 \\ 0 & \varGamma _2 - \rho p'(\rho ) \end{pmatrix}, \end{aligned}$$

the system takes the form (48) and, following the GBMC derivation, the method can again be applied without introducing any spatial grid. Notice that the term \(\rho p'(\rho )\) can also be expressed in terms of the characteristic variables, since \(\varGamma _1 - \varGamma _2 = p(\rho )\) implies \(\rho = p^{-1}(\varGamma _1 - \varGamma _2)\) whenever \(p\) is invertible; the resulting expression depends on the definition of the function \(p(\rho )\).

We consider the Riemann problem taken from [2, 15], whose solution involves a vacuum state, with the following initial data.

  • Test 4: In the domain \(\varOmega =[-1.5,1.5]\), we have \(p(\rho ) = 6 \rho \) and initial conditions

    $$\begin{aligned} \begin{aligned}&\rho (x,0) = 0.05,&\quad&u(x,0)=0.05, \qquad&\textrm{for} \quad x<0\,,\\&\rho (x,0) = 0.05,&\quad&u(x,0)=0.5, \qquad&\textrm{for} \quad x\ge 0\,. \end{aligned} \end{aligned}$$

Notice that in this test \(p'(\rho )=6\), hence in \(\textbf{F}'(\hat{\textbf{u}})\) we have \(F'_{22} = \varGamma _2 - 6\rho \).
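For completeness, since \(\varGamma _1 - \varGamma _2 = p(\rho ) = 6\rho \) in this test, the latter entry can be written entirely in terms of the characteristic variables:

$$\begin{aligned} F'_{22} = \varGamma _2 - 6\rho = \varGamma _2 - (\varGamma _1 - \varGamma _2) = 2\varGamma _2 - \varGamma _1. \end{aligned}$$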

We present in Fig. 10 the results obtained by solving the problem with the low variance Monte Carlo and the Gradient-based Monte Carlo, which again show an excellent behavior, even for this very challenging Riemann problem, especially as concerns the almost absent variance of the GBMC solution. Notice that in the latter particles follow the dynamics of the characteristic variables \(\varGamma _{1,2}\) defined in (50). Hence, while \(u =\varGamma _2\), to compute the density we need to consider that \(\rho = (\varGamma _1 - \varGamma _2)/6\).

4.3.3 Isentropic Euler System

We finally consider the following one-dimensional isentropic Euler system [9]:

$$\begin{aligned} \begin{aligned} \frac{\partial \rho }{\partial t} + \frac{\partial \left( \rho u\right) }{\partial x}&= 0,\\ \frac{\partial \left( \rho u\right) }{\partial t} + \frac{\partial }{\partial x}\left( \frac{1}{2}\left( \rho + \rho u^2\right) \right)&= 0, \end{aligned} \end{aligned}$$
(52)

where \(\rho \) is the gas density, u is the velocity, and \(m=\rho u\) is the momentum. System (52) can be written in the compact form (33) with

$$\begin{aligned} \textbf{u}= \left( \rho ,\, \rho u\right) ^T, \qquad \textbf{F}(\textbf{u})=\left( \rho u,\, \frac{1}{2}\left( \rho + \rho u^2\right) \right) ^T.\end{aligned}$$

Therefore, we may write the relaxation approximation (34) and then apply either the standard Monte Carlo approach or the Gradient method, the latter in the mesh-dependent version discussed in Remark 6, since the system is not diagonal.
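As a small illustration, the main model-specific ingredient entering the relaxation approximation (34) is the flux \(\textbf{F}(\textbf{u})\) of (52); a minimal sketch, with an assumed array layout for the conservative variables, could be:

```python
import numpy as np

def flux(U):
    """Flux F(u) of the isentropic Euler system (52), with conservative
    variables U = (rho, m), m = rho*u.  Minimal sketch for illustration;
    the array layout is an assumption."""
    rho, m = U
    u = m / rho
    return np.array([m, 0.5 * (rho + rho * u ** 2)])

# e.g. evaluation at the state (rho, m) = (2, 1) used later in Test 5(a)
F_left = flux(np.array([2.0, 1.0]))
```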

We solve the isentropic Euler equations with the standard Monte Carlo, the low variance Monte Carlo, and the Gradient-based approach on two Riemann problems with the following initial conditions, taken from [9].

  • Test 5(a): In the first case, in the domain \(\varOmega =[-1,1]\), we set

    $$\begin{aligned} \begin{aligned}&\rho (x,0) = 2,&\quad&m(x,0)=1, \qquad&\textrm{for} \quad x<0.2\,,\\&\rho (x,0) = 1,&\quad&m(x,0)=0.13962, \qquad&\textrm{for} \quad x\ge 0.2\,. \end{aligned} \end{aligned}$$
  • Test 5(b): In the second case, again considering the domain \(\varOmega =[-1,1]\), we impose

    $$\begin{aligned} \begin{aligned}&\rho (x,0) = 1,&\quad&m(x,0)=0, \qquad&\textrm{for} \quad x<0\,,\\&\rho (x,0) = 0.2,&\quad&m(x,0)=0, \qquad&\textrm{for} \quad x\ge 0\,. \end{aligned} \end{aligned}$$
Fig. 11

Test 5(a), isentropic Euler system, Riemann problem. Solution in terms of density \(\rho \) (left) and flux \(\rho u\) (right) at \(t=0.5\) obtained with the MC method, with and without the low variance technique, and with the GBMC method in its mesh-dependent version, for different choices of the number of particles N. The subplots in the GBMC results show the histogram of the gradient-based particle positions. The reference solution is reported as a black solid line. The limit case is \(\varepsilon =10^{-8}\) with \(a_{1,2}=1.0\)

Fig. 12

Test 5(b), isentropic Euler system, Riemann problem. Solution in terms of density \(\rho \) (left) and flux \(\rho u\) (right) at \(t=0.5\) obtained with the MC method, with and without the low variance technique, and with the GBMC method in its mesh-dependent version, for different choices of the number of particles N. The subplots in the GBMC results show the histogram of the gradient-based particle positions. The reference solution is reported as a black solid line. The limit case is \(\varepsilon =10^{-8}\) with \(a_{1,2}=1.0\)

Once more, looking at Figs. 11 and 12, the improved accuracy and the strongly reduced variance of the solutions produced with the Gradient approach appear evident when compared with the results obtained with the standard Monte Carlo method, even though the former uses a 10 times smaller number of particles on the same spatial grid. In these figures, it is also possible to appreciate the beneficial effect of the variance reduction technique applied to the standard Monte Carlo by comparing the first two rows of each figure. Moreover, the ability of the GBMC to better capture the position and the sharpness of shock waves is again confirmed. Let us point out once more that this advantage stems from the key feature of the proposed method, which allows particles to move following the gradient of the solution.

5 Conclusions

Monte Carlo methods have become increasingly important in scientific computing due to their ability to handle complex systems and to quantify uncertainty. Despite this, their systematic use for solving partial differential equations is still limited compared to deterministic approaches, which in many cases provide greater flexibility and accuracy. In this paper, we attempt to take a step forward in the design of Monte Carlo methods for PDEs by analyzing their systematic use for relaxation approximations of systems of hyperbolic conservation laws [22]. On the one hand, we extend direct simulation Monte Carlo techniques inspired by kinetic theory to systems of hyperbolic conservation laws. On the other hand, we consider a different approach based on the use of the spatial derivative of the solution, developed earlier for reaction-diffusion equations [44], which we refer to as the Gradient-based Monte Carlo (GBMC) method. The latter has shown great potential thanks to its ability to concentrate particles where the solution has large derivatives and to its grid-free structure. In the presented test cases, the GBMC method proves to be significantly more accurate than the standard Monte Carlo method.

Let us note that, by combining the techniques presented here with previous results for reaction-diffusion problems, it is possible to deal with general systems of PDEs involving convection-diffusion-reaction terms. Moreover, in this paper we have limited our analysis to the one-dimensional case. In the future, our aim is to extend the GBMC method to the multi-dimensional case through a component-wise approach (i.e., differentiating the system of equations under study component by component to obtain the system to be solved with the method, thus preserving a mesh-less reconstruction) and to more general systems. Another primary research direction is the extension of the GBMC approach to kinetic equations. This will be done first for neutron transport equations, whose linear structure permits a natural extension of the method, and subsequently for the nonlinear BGK model, which has a relaxation structure of the type studied in this paper. We leave further investigations on these topics to future research.