Reduced variance random batch methods for nonlocal PDEs

Random Batch Methods (RBM) for mean-field interacting particle systems enable the reduction of the quadratic computational cost associated with particle interactions to a near-linear cost. The essence of these algorithms lies in the random partitioning of the particle ensemble into smaller batches at each time step. The interaction of each particle within these batches is then evolved until the subsequent time step. This approach effectively decreases the computational cost by an order of magnitude while increasing the amount of fluctuations due to the random partitioning. In this work, we propose a variance reduction technique for RBM applied to nonlocal PDEs of Fokker-Planck type based on a control variate strategy. The core idea is to construct a surrogate model that can be computed on the full set of particles at a linear cost while maintaining enough correlations with the original particle dynamics. Examples from models of collective behavior in opinion spreading and swarming dynamics demonstrate the great potential of the present approach.


Introduction
Meanfield equations are derived as reduced complexity models describing the dynamics of large systems of interacting particles [27].Typically, the trajectories of systems of particles are governed by stochastic differential equations for the position x i ∈ R d and velocity v i ∈ R d of the ith particle and are translated in systems of stochastic differential equations of the form where P (•) ≥ 0 is a suitable interaction function.In the limit of a large number of particles N , the individual trajectories become irrelevant, and we have the opportunity to describe the average behavior of the system.In such a situation, the strength of the individual interactions becomes small, and each particle feels the effect of the force field created by all the other particles.Therefore, the statistical behavior of the system of particles is obtained by studying the evolution of the particles' density f (x, v, t) which obeys a nonlocal PDE of Fokker-Planck type where the term D(v) : R d → R expresses the local relevance of the diffusion and the nonlocal drift term B[f ](x, v, t) is defined as follows P (x, y, v, w)(v − w)f (y, w, t)dv dy.
The derivation of the mean-field model ( 2) is essential to investigate mathematically the long time behavior and the formation of collective structures and self-organisation features of the multi-agent system (1).Indeed, since the trajectory of each particle is influenced by all the other particles, the computational complexity of the particle model is O(N 2 ) and makes the microscopic approach rapidly infeasible for N ≫ 0.
In this direction, Random Batch Methods (RBM) have gained increased popularity in the community of interacting particle systems since they are capable to efficiently reduce the computational complexity to O(M N ), where M is the size of the batch, at the cost of introducing an additional error in the dynamics [8,28,30,31].In the context of manyparticle systems, RBM methods have been proposed in [29].Earlier approaches based on analogous Monte Carlo strategies were developed in [1] and [10,12] in the presence of uncertain quantities.The core of these algorithms is to randomly divide the set of particles into smaller batches at each time step, a technique closely resembling the batch methods of stochastic gradient descent algorithms.The interaction of each particle within these batches is then evolved until the subsequent time step.This approach significantly reduces the computational cost by an order of magnitude, albeit at the expense of heightened fluctuations due to the random partitioning.
In contrast with random search algorithms were the increase of fluctuations may also have a positive impact on the overall process, in RBM methods, since we aim at computing as accurate as possible the O(N 2 ) summation, reducing the variance of the batch algorithm is of paramount importance and represents one of the main challenges.In this paper, inspired by similar ideas in the field of uncertainty quantification for kinetic and mean-field equations [15,16,33], we propose a control variate strategy based on a simpler surrogate model to reduce batch variance.However, compared to uncertainty quantification, where due to the static nature of stochastic samples it is easy to obtain correlations, here the samples, although initially correlated, naturally decorrelate as a consequence of their temporal evolution.For this purpose, we employ a reduced-complexity surrogate that can be calculated at the cost of O(N ), sharing the same asymptotic state as the original model.
We show that the present strategy permits obtaining a significant reduction in the variance of the batch, leading to a further speed-up in computational cost for a given accuracy.This is illustrated using models of collective behavior in opinion spreading and swarming dynamics.
The rest of the manuscript is organized as follows.In Section 2, we revise some nonlocal Fokker-Planck models that will be used as prototype examples to illustrate our approach and to test the numerical schemes.Next, Section 3 is devoted to presenting the reduced variance RBM methodology.Several numerical examples are then given in Section 4. The last section contains some concluding remarks.

Vlasov-Fokker-Planck models
In this section we survey some examples of multi-agent systems whose mean-field limit corresponds to a Vlasov-Fokker-Planck model of the type (2).First, we define the classical model for transport-aggregation-diffusion.Hence, we discuss the recent extensions of this model to kinetic alignment models in R d .In the space homogeneous setting, we also survey recent examples presented in opinion dynamics that have attracted great interest in the mathematical community studying systems of interacting particles.

The classical Vlasov-Fokker-Planck equation
Fokker-Planck-type collision operators are commonly introduced to define reduced complexity models for collisional dynamics, see [38,40].The classical Vlasov-Fokker-Planck equation can be obtained from (2) with the choice are the conserved global mean and temperature respectively.We may express the global Maxwellian as follows where we assumed that the total mass is normalized to one.The study of equilibration rates on the velocity space, given a initial distribution f 0 , is a key problem in kinetic theory, see e.g.[3,11,38].The trends to equilibrium are determined in terms of the Boltzmann Hfunctional f log f dv dx.
In particular, starting from an initial distribution with finite entropy, f (x, v, t) converges in relative entropy to the equilibrium solution exponentially fast.

Vlasov-Fokker-Planck models in collective dynamics
Vlasov-Fokker-Planck-type equations have been used more recently to understand long-rage interactions in collective phenomena.In this direction, a key role has been played by the socalled swarming dynamics in which large systems of agents tend to share the same velocity for large times, see [14].The next example focuses on a kinetic equation for swarming dynamics.The kinetic equation can be obtained from (2) with the choice such that the nonlocal drift assumes the form The resulting PDE corresponds to the mean-field limit of the well-known Cucker-Smale model [6,7,23,24,37].Several results provide conditions for the asymptotic alignment of velocities under suitable assumptions on the coefficients of the model.In particular, if β < 1/2 we have the large time collapse of the support of the emerging Maxwellian distribution and therefore asymptotic alignment of the agents' velocities [19].
The emergence of collective structures by means of space-dependent interaction forces can be also observed in multi-agent models for milling [9,17] and synchronization [13].

Alignment models in consensus dynamics
The mathematical modelling of consensus formation has seen a growing interest in recent years [21].Classical methods of statistical mechanics served as a powerful basis to model interactions in cooperative multi-agent systems and to reduce the complexity linked to interaction forces.The study of alignment dynamics has produced a heterogeneity of models, see [2,18,20,21,25,32,39] and the references therein.More recently these approaches have found interesting applications in optimization and data science [26,36].
We consider the bounded confidence model for opinion formation where being χ(•) the indicator function and being δ > 0 a given threshold.The large time behaviour of the model is a sum of Dirac delta distribution which express clusterization of the society towards a finite number of asymptotic opinions.The corresponding version in presence of noise is given by where v ∈ [−1, 1] and where In the simplified scenario where δ = 2 we get P ≡ 1 and we may compute the large time behaviour of the model which corresponds to the Beta distribution , where m = 1 −1 vf (v, 0)dv is the conserved mean opinion of the system.If 0 < δ < 2 the explicit computation of the asymptotic state is difficult to obtain.A possible strategy to get some insight on the large time trends is based on the construction of structure preserving numerical approximations [35].

Reduced variance RBM
be a distribution function whose evolution is given by the Vlasov-Fokker-Planck model with nonlocal flux defined in (2) with a constant diffusion D(v) = σ.The evolution of observable quantities is obtained by rewriting the equation in weak form.
Let ϕ(•, •) be a test function, then the Vlasov-Fokker-Planck model defined in (2) can be written in weak form as follows Hence, by choosing ϕ(x, v) ≡ 1 we may observe that the total mass is conserved since The positivity for all t ≥ 0 of the kinetic distribution f (x, v, t) may be proven under suitable assumptions on the drift term and the diffusion function.Furthermore, for ϕ(v) = v we have that the evolution of the mean velocity is given by Therefore, the mean velocity is conserved if Given a set of sample positions and velocities {(x i , v i )} N i=1 at time t ≥ 0, the kinetic distribution f (x, v, t) is recovered from the empirical density function where δ(•) is the Dirac delta function and the particles evolution follows (1).Therefore, for any test function ϕ(x, v), if we denote by we have being E X,V [ϕ] the expectation of the observable quantity ϕ.The following result holds [5] Theorem 1.The root-mean square error is such that for each t ≥ 0 where σ 2 ϕ = Var(ϕ) and Since f is a probability density, we observe that the nonlocal drift operator may be thought as an expectation with respect to the kinetic distribution f (x, v, t) and in particular as the evaluation of the expected quantity of P (x, y, v, w)(w − v) with respect to the pair (y, w).Hence, we have where E f [•] is the expectation with respect to the probability density f and is defined as Therefore, the initial model in ( 2) is equivalently written as

Standard RBM method
Equation (2) can be obtained as the mean-field limit of the particle system (1).Let us consider the empirical distribution (6) and rewrite the second equation in (1) as follows For any N ≫ 0 the we may observe that As a result, we obtain a particle dynamics with complexity O(N 2 ).The random batch idea is based on the use of subsets S M of samples of size M ≪ N sampled uniformly from the set of indices {1, . . ., N } to further approximate for all i = 1, . . ., M .This fast summation strategy has an overall cost of O(M N ) instead of O(N 2 ) and therefore substantially alleviate computational burden linked to the particle approach.In the following, we concentrate on the dynamics of the particle system given by the following system of SDEs for such that the empirical measure converges towards f (x, v, t) solution to (2).Indeed, the following result can be established Theorem 2. The root-mean square error of the RBM method is such that for each t ≥ 0 Proof.For any test function ϕ we have Hence, we may pursue the same strategy presented in [12] to get the following estimate from which we get thanks to Theorem 1 From the above theorem we can conclude that the computation of the average defined in (7) approximates the original one with an accuracy O(( At the computational level, the RBM method, as it has been presented in [1,10,29], can be summarized as in Algorithm 3.1.
select a random batch S M 5: update (x i , v i ) by solving a time discretization of (8) 6: end for 7: Reconstruct the kinetic density using a kernel density estimator of (6).

The reduced variance strategy
In this section we want to define a suitable strategy to reduce the variance of the RBM method.To this end, we will develop a control variate approach which is based on the construction of simpler linear models which define a compatible asymptotic distribution with respect to the full model.We describe the method using a nonlocal PDEs of Vlasov-Fokker-Planck-type introduced in (2).
We define a new drift term as follows where λ ∈ R and Q(x, v, t) is a reduced complexity collision operator having the form being P (x, v) ≥ 0 a new interaction function.Hence, (9) may be rewritten as We can notice that we have so that the solution of our original problem can be obtained by the evaluation of the expectation of the new term In order to determine the optimal value of the scalar value λ a common strategy relies on the minimization of the variance [15,16,22].In particular, the variance of the above expression is given by From (9) we get where We can minimize the variance by direct differentiation with respect to λ and obtain the optimal value.The following result holds Theorem 3. The quantity minimizes the variance of P λ (x, y, v, w)(w − v) and is such that Assume now to evaluate in (9) the exact expectation P (x, v)(U − v) of the surrogate linear model using N samples where We can improve the standard Monte Carlo random batch estimate based on M samples (8) using Hence, we get the following system of SDEs for where as usual {W i }, i = 1, . . ., N defines a set of independent Wiener processes.The reduced variance strategy can be summarised as in Algorithm 2.
Algorithm 2 Reduced Variance Random Batch Method (rvRBM) compute λ * as in (11) 7: update (x i , v i ) by solving a time discretization of (12) 8: end for 9: Reconstruct the kinetic density using a kernel density estimator of (6).
Remark 1.We may observe that Cov f (P (x, y, v, w)(w−v), P (x, v)(w−U )) and Var f ( P (x, v)(w− U )) defining λ * are not known are must be estimated starting from the sample of size N .In particular, the following unbiased estimators can be defined which lead to the estimated optimal λ * as Remark 2. If we can consider a linear surrogate operator, i.e. and which leads to sequence of optimal values

Numerical tests
In this section we present several numerical examples to show the performances of the introduced reduced variance RBM.In the following we will recall both deterministic and stochastic models.In all the subsequent tests, we reconstruct the particle densities through a regularization of the empirical distribution ) based on a kernel density estimator obtained through the evaluation of ϕ, f N , being ϕ(x i , v i ) the normal density centered in (x i , v i ) and having fixed variance Σ 2 = 10 −5 .
In the following numerical tests we will focus on the definition of suitable RBM and rvRBM methods pointing the reader to the nonlocal PDE models introduced in Section 2.

Test 1a: bounded confidence model
We consider the bounded confidence opinion model introduced in Section 2.3 in the case D ≡ 0. We recall that it is possible to prove that (3) corresponds to the mean-field limit of the following particle system where [4,34] and the references therein.Hence, the RBM method corresponds to the following reduced complexity model being S M a uniform subsample of size M > 1 obtained from the {v i } N i=1 .A rvRBM method can be therefore obtained by considering in (12) the interaction function P (v i ).In particular, we will consider the cases P (v i ) ≡ 1 (denoted by case 1) and P (v i ) = 1 − v 2 i (denoted by case 2).Therefore the rvRBM method corresponds in considering the evolution of the following particle system  3) by means of a particle approach defined in ( 15) with interaction function with threshold δ = 1 (top row), δ = 0.5 (bottom row).We considered N = 10 5 particles with initial distribution defined in (18) and we compare the evolution of the approximated densities obtained with RBM and rvRBM with a subset of interacting particles of size M = 5 or M = 10.The surrogate interaction function is here considered P (v i ) ≡ 1. where and λ * = λ * (t) is computed as in (11).We remark that, since u N is a conserved quantity of the model it is sufficient to compute u N from the initial sample, whereas u M is affected from the error induced by the uniform subsampling.
In Figure 1 we report the evolution of the particle density at time t = 1 and t = 5 and by considering M = 5 or M = 10.As initial distribution we considered a sample of the uniform distribution of the interval [−1, 1], i.e.
We compare the evolution of the particle density obtained with a standard RBM together with the one obtained with rvRBM, from which we may notice good agreement in describing the transient behaviour of the distribution both in the case δ = 1 (top row) and in the case δ = 0.5 (bottom row).In this latter case the emerging equilibrium density displays clusterization towards the values ± 1 2 .We consider the following absolute error for the mean where {v i } N i=1 is obtained either from RBM, as in ( 16), or from rvRBM, as in (17).In Figure 2 we show the evolution in time of the introduced absolute error in the case δ = 1 (top row) Figure 2: Test 1a.Evolution of the absolute error defined in (19) for the bounded confidence model solved through RBM and rvRBM with M = 5 (left column) or M = 10 (right column).We considered δ = 1 (top row) and δ = 0.5 (bottom row).and δ = 0.5 (bottom row) by both the standard RBM and the introduced rvRBM method.We will refer to case 1 if P (v i ) ≡ 1 whereas with case 2 we will refer to P (v i ) = 1 − v 2 i .We may observe how in the case δ = 1 the method rvRBM leads to a significant advantage since the emerging distribution function is highly correlated with the one characterising the reduced variance method.Indeed, as shown in Figure 3 (left plot) the optimal λ * is such that λ * → 1, t → +∞.
On the other hand, for the case δ = 0.5 the error produced by the rvRBM strategy is comparable with the one of RBM.Indeed, the computed λ * → 0, since the emerging distribution has no correlation with the one of the considered for the reduced variance method.
In Figure 4 we depict the evolution of the error produced by fixed batch M = 5 (left) or M = 10 (right) and a variable N ∈ {100, . . ., 10 4 }.The error (19) is here computed at time T = 5.Since the cost of the reduced variance approach is compatible to the one of a RBM method O(M N ), we may observe how, in the reduced variance approach, we essentially gain accuracy at the same cost of RBM.

Test 1b: bounded confidence model with isolated clusters
The loss of accuracy of the rvRBM method in the case δ = 0.5 is essentially due to the uncorrelated behaviour of the surrogate model with respect to the full one.Indeed, whereas the . The batch size is M = 10 and the initial sample has been obtained from (20).Bottom row: evolution of the absolute errors (19) for the bounded confidence model solved through RBM and rvRBM with M = 5 (left) or M = 10 (right).considered surrogate models would lead to the formation of global consensus, the bounded confidence model leads to the formation of local stable clusters [4].A possible way to overcome this issue has been addressed in Remark 2. We consider a sample of the initial distribution Now, for δ = 0.5 no interactions are expected from the two initial clusters centered in ± 1 2 .We computed the optimal λ * k , k = 1, 2, exploiting two surrogate models, the first is P (v i ) ≡ 1 (case 1) whereas for the second we considered 2 ) (case 2).In Figure 5 we report the evolution of the rvRBM method where we defined a sequence of optimal λ * k as in ( 14) with k = 1, 2 such that U k = ± 1 2 .We may observe that the evolution in time of the obtained rvRBM method is consistent with the one defined by the standard RBM method.Furthermore, from the evolution of the absolute error (19) we can observe that the method is capable to achieve much higher accuracy than RBM.

Test 2: stochastic opinion formation model
We now consider a stochastic bounded confidence model defined in Section 2.3 in (5), where the interaction function is defined as in (4) and that is obtained from the meanfield limit of the following system of stochastic differential equations with D 2 (v i ) = 1−v i (t) and {W i } N i=1 a set of independent Wiener processes.In the following, we fixed the parameter σ 2 = 10 −1 .Therefore, the RBM approximation is obtained by considering where S M is a uniform subsample of size M > 1 from the {v i } N i=1 .Similarly, rvRBM strategy is obtained as In the following, the methods defining RBM and rvRBM share the same set of Wiener processes of the complete model (21).The set of SDEs is solved through the Euler-Maruyama scheme.
In Figure 6 we report the evolution of the particle density at time t = 1 and t = 5 in the stochastic setting and obtained with M = 5, 10.As initial distribution we considered the one introduced in (18).In each plot, we compare the evolution of the approximated density obtained with N = 10 5 particles and either RBM or rvRBM algorithms.In Figure 6 we considered the case δ = 1.
In Figure 7 we show the evolution in time of the introduced absolute error in the case δ = 1 (top row) obtained through the standard RBM and the introduced rvRBM method.We may observe how rvRBM leads to a substantial advantage if the emerging distributions are highly correlated, as in the case δ = 1.At variance with the deterministic model, the asymptotic level of accuracy of rvRBM is influenced by the total number of particles due to the presence of the stochastic component in the dynamics.In the second row we report the error produced by a fixed batch of size M = 5 (left) or M = 10 (right) and an increasing total number of particles N ∈ {10, . . ., 10 4 }.The error is computed at time T = 5.We may observe how the reduced variance approach is capable to achieve higher accuracy with the same number of particles, and therefore of cost, of the RBM method.

Test 3: Cucker-Smale model
We consider the second order model for swarming introduced in Section 2.2.We recall that it is possible to prove that this model corresponds to the meanfield limit of the following particles' system   where β , we will always consider ξ = 1 and β = 10 −1 , for which unconditional alignment emerges asymptotically.Hence, the RBM algorithm corresponds to consider where S M is a uniform subsample of size M > 1 obtained from {(x i , v i )} N i=1 ∈ R 2d .In the following, we construct a rvRBM method by considering and P ≡ 1, being where u N (t) ≡ u N (0) being P (|x i − v i |) a symmetric interaction function.Hence, the optimal value of λ(x i , v i ) is computed as in (12).
In Figure 8 we show the evolution of the reconstructed distribution functions in the phase space obtained with RBM (left panels) and rvRBM (right panels).We considered N = 10 5 particles having uniform initial distribution in both space and velocity, i.e.
Hence, we considered a RBM/rvRBM methods fixing a subsampling of size M = 10.We may observe how the obtained distributions consistent between the two methods also at the level of marginal distributions (bottom row).We report in Figure 9 the evolution of the error obtained with RBM and rvRBM.In detail, we fixed N = 10 5 initial particles and a subsample of size M = 5 (left plot) or M = 10 (right plot).In both cases, we can observe the advantages produced by the reduced variance strategy.In Figure 10 we report the evolution of the average optimal λ * computed for the Cucker-Smale model with either M = 5 or M = 10.We can observe how, due to the transport term the surrogate model lose the initial correlation with the full model.

Conclusions
The computational challenges associated with the O(N 2 ) complexity of particle models for large systems with nonlocal interactions have driven the adoption of Random Batch Methods (RBM).While these methods, using batches of size 1 < M < N , efficiently reduce the computational burden to O(M N ), introducing an additional error becomes an inevitable trade-off.Addressing this challenge, our paper proposes a control variate strategy inspired by ideas from uncertainty quantification.By leveraging a reduced complexity For both methods we considered a subsample of size M = 10.In the bottom row we report the marginal densities obtained at time t = 1 and t = 2.The initial size of the particle system is N = 10 5 with initial distribution as in (24).
surrogate model, we successfully mitigate batch variance, achieving a substantial reduction and, consequently, an improved computational cost-to-accuracy ratio.Of course, the effectiveness of our proposed strategy is strongly dependent on the possibility to construct a surrogate model which keeps correlations with the original model.Nevertheless, even in the worst case scenario when the surrogate model looses correlations our methodology will never deteriorate its accuracy compared to standard RBM methods.This is demonstrated through numerical examples, particularly in the context of models depicting collective behavior in opinion spreading and swarming dynamics.This not only showcases the practical utility of our approach but also underlines its versatility across different scenarios.
Looking ahead, our work opens directions for further research, particularly in refining control variate strategies and optimizing the implementation of Random Batch Methods in various computational contexts.For example, one could also explore the use of a multilevel strategy as an alternative to a surrogate model [22].In this sense, the present approach marks a promising step towards bridging the gap between the microscopic complexity of interacting particle systems and the computational feasibility required for meaningful simulations, with potential implications for a broad spectrum of applications.

5 (h) t = 5 , M = 10 Figure 1 :
Figure 1: Test 1a.Evolution of the densities for the deterministic BC model in (3) by means of a particle approach defined in(15) with interaction function with threshold δ = 1 (top row), δ = 0.5 (bottom row).We considered N = 10 5 particles with initial distribution defined in(18) and we compare the evolution of the approximated densities obtained with RBM and rvRBM with a subset of interacting particles of size M = 5 or M = 10.The surrogate interaction function is here considered P (v i ) ≡ 1.

Figure 5 :
Figure 5: Test 1b.Top row: comparison of the reconstructed distributions at time t = 1 and t = 5 for the bounded confidence model with δ = 0.5 using RBM or rvRBM withP (v i ) ≡ 1 (case 1) or P (v i ) = (v i − 1 2 )(v i + 1 2 ).The batch size is M = 10 and the initial sample has been obtained from(20).Bottom row: evolution of the absolute errors(19) for the bounded confidence model solved through RBM and rvRBM with M = 5 (left) or M = 10 (right).

Figure 6 :
Figure 6: Test 2. Evolution of the densities for the stochastic bounded confidence model (5) with interaction function (4) and δ = 1 (top row), δ = 0.5 (bottom row).We considered N = 10 5 particles with initial distribution(18) and we compare the evolution of the approximated densities obtained with RBM or rvRBM with a subset of interacting particles of size M = 5 or M = 10.We fixed the diffusion constant σ 2 = 10 −1 .

Figure 7 :
Figure 7: Test 2. Stochastic bounded confidence model.Top row: evolution of the absolute error defined in (19) for the stochastic bounded confidence model solved through RBM and rvRBM with M = 5 (left) or M = 10 (right).Bottom row: errors of RBM and rvRBM for a fixed batch size M = 5 (left) or M = 10 (right) and variable size of the sample N ∈ {10, . . ., 10 4 }.In all the presented tests we considered δ = 1 and N = 10 5 particles.

Figure 8 :
Figure 8: Test 3. Evolution of the kinetic distributions obtained with RBM (left) or rvRBM (right) at time t = 1 (top row) and t = 10 (bottom row).For both methods we considered a subsample of size M = 10.In the bottom row we report the marginal densities obtained at time t = 1 and t = 2.The initial size of the particle system is N = 10 5 with initial distribution as in(24). N,