Data-driven model order reduction for granular media

We investigate the use of reduced-order modelling to run discrete element simulations at higher speeds. Taking a data-driven approach, we run many offline simulations in advance and train a model to predict the velocity field from the mass distribution and system control signals. Rapid model inference of particle velocities replaces the intense process of computing contact forces and velocity updates. In coupled DEM and multibody system simulation, the predictor model can be trained to output the interfacial reaction forces as well. An adaptive model order reduction technique is investigated, decomposing the media in domains of solid, liquid, and gaseous state. The model reduction is applied to solid and liquid domains where the particle motion is strongly correlated with the mean flow, while resolved DEM is used for gaseous domains. Using a ridge regression predictor, the performance is tested on simulations of a pile discharge and bulldozing. The measured accuracy is about 90% and 65%, respectively, and the speed-up range between 10 and 60.


Introduction
Computational modelling of granular dynamics has important applications in both science and engineering, but is challenging due to the complex nature of granular medias.The discrete (or distinct) element method (DEM) is perhaps the most versatile numerical method for it.It supports the three gran-ular phases, solid, liquid, and gas.It can capture both discrete and collective phenomena, that depend on contact parameters, particle shape and arrangements.DEM simulations are, however, computationally intense, which limits the practical applicability.
There are basically three methods of accelerating a DEM simulation.Firstly, the computational speed can be increased by parallelization and use of specialized hardware [13,20,6,21,7,15].Secondly, changing from explicit to implicit time-integration allows for much larger time-steps than the limit set by the time-period of free vibrations for particles of given mass and contact stiffness.The computational bottleneck is then shifted from collision detection to solving the equations of motion and contact force computation.Depending on the system properties and error tolerance this may be very advantageous [19].
The third way is to employ some form of model order reduction, where the original system is substituted with an approximation that require fewer variables and computational operations per simulated unit of time.Normally, model order reduction is seen as a projection from a high-dimensional space to a low-dimensional subspace, where the timeintegration can be performed with manageable computational intensity.Once advanced in time, the solution can be projected back to the original highdimensional space.The process introduces a model reduction error, that may or may not be acceptable for the intended purpose of the simulation.
In the present paper we explore the possibility of accelerating DEM simulation using data-driven model reduction.The idea is to perform numerous detailed simulations of a system in advance, train a model to predict new system states and use these to advance a running simulation faster in time than the original simulations.The question is whether and how this is at all feasible, and what speed-up and precision can be achieved.
The data-driven approach has the drawback that a certain amount of resolved simulations must be performed in advance to generate training data and build a model.The question then arises what are the advantages of the method.Simulations that must run at real-time speed is one type of application that may benefit from using this technique.Specific examples are simulators for operator training, system testing with hardware-in-the-loop or embedded simulations serving a model-based controller.Another class of applications is surrogate models [5] for simulationbased planning and optimization in parameter spaces too large to be covered with full-resolution simulation but manageable with a reduced-order model trained on a comparatively small set of resolved simulations.
To increase the knowledge about the challenges and opportunities of accelerating DEM simulations using data-driven model order reduction, we have developed and tested a realisation of this idea.First, a general method is described, and error measures are introduced.Next, this is implemented, using a ridge regression model for predicting the velocity field.The computationally expensive process of computing contact forces is substituted by rapid inference of the model to output the velocity field at the particle positions.This approach can be expected to perform well in the solid and liquid regime but poorly in the gaseous regime, where individual particle motion is not strongly correlated with the mean flow.Therefore, we investigate an adaptive model order reduction technique, where the system is decomposed in solid/liquid, and gaseous parts.Gaseous sub-domains are integrated using full resolution DEM while the reduced-order model is applied to the solid/liquid sub-domains.The method is tested on two different systems, a pile with controlled feed and gravity-driven discharge flow, and a blade cutting and pushing through a particle bed like a bulldozer blade.In both cases a model is trained to predict the velocity field from the given input signals and the current mass distribution.In the bulldozer case, the force on the blade is also predicted.The precision and computational speed-up are analysed on these systems.
We do not explore the many alternatives that machine learning offer for the velocity predictor, e.g.deep neural networks, that can be trained to predict complex fluid flows [9].The present work is a first step.As such it is natural to investigate the performance of a plain regression model and building knowledge for employing more advanced machine learning algorithms.

Previous work
In the DEM literature there are only a few examples of model order reduction.Boukouvala et al. [3] explored discrete element reduced-order modelling for particle mixing in a blade blender.By principal component analysis (PCA) of simulation snapshots, sampled in a regular grid covering the mixer interior, models were built for predicting the particle velocity field and blade force as function of the mixer control parameters (blade speed and geometry).In turn this was used to develop a surrogate model for optimization of mixing performance based on a relatively small set of time-consuming DEM simulations.This work was later extended by Rogers et al. [16], considering also the effect of dynamic response from changes in the control parameters.The reduced-order model was not used to accelerate the DEM simulations themselves.
In [18], Servin and Wang developed an adaptive model order reduction technique which substitute particles that collectively move as a single rigid body, with six degrees of freedom rigid aggregates of the corresponding mass, momentum, and contact shape.Different strategies for predicting when the aggregate should split in smaller constituents was investigated.The method is severely limited by that the reduced model support only rigid body modes of motion.
Recently, Zhong and Sun [24] investigated a reduced-order model for granular materials under small viscoelastic deformations, with fix connectivity between the particles, using proper orthogonal decomposition (POD) of the displacement field.Inspi-ration for that work was found in [23] and [10].
Pseudo-particle modelling [4] may be considered a form of model order reduction.Each pseudo-particle represents the collective effect of many small real particles.The pseudo-particle shape and contact parameters are considered model parameters that are calibrated to give the material the approximate bulkmechanical properties of the original system.
We have not found any example in the DEM literature where the state of a fine-resolution particle model is projected to a pseudo-particle subspace and projected back after time-integration.Ideas of this type may be found in the literature of computer graphics [8] for the purpose of visual appearance and with no analysis of the reduction error.

Model order reduction
Let us first summarize the classical meaning of model order reduction [2].Consider a dynamical system with vectors of state x(t) ∈ R n , input u(t) ∈ R m and output y(t) ∈ R q .The problem of model order reduction can be stated as follows.Find a lower order model that produce approximately the same observations to an accuracy ε r for all input signals u ∈ U relevant to the particular application.The idea is that the subspace where the reduced state vector lives, x(t) ∈ R r , is of much lower dimension than the original system space, i.e., r n.Note that the observation vector, ŷ ∈ R q , must have the same dimensions as in the original system or there must exist a projection operator to that space.
The standard methods for computing approximate low-order models are the SVD-based and Krylovbased approximation methods [2].The proper orthogonal decomposition (POD) method is a special case of SVD-based model order reduction that is particularly popular in computational mechanics and fluid dynamics.However, granular media modelled using DEM differ from many other dynamical and physical systems in that the connectivity of the variables changes frequently and unpredictably.Therefore, a non-standard model order reduction approach is necessary.

A reduced-order discrete element method
In this section we describe a reduced-order model for granular media simulation using the discrete element method.

Resolved DEM
We first briefly describe the standard discrete element method.We will refer to this as resolved DEM.
Each of the N p particles, indexed a ∈ N , has a position x a (t) ∈ R 3 , velocity v a (t) = ẋa , scalar mass m a and diameter d a .For clarity of the exposition, we ignore the rotational degrees of freedom.The equations of motion are with system position vector x ∈ R 3Np , velocity vector v ∈ R 3Np , diagonal mass matrix M ∈ R 3Np×3Np .The force f is the sum of external forces and contact forces.Each of the N c particle-particle contacts, indexed by n ∈ N c , have a contact position x c,n and pairwise contact force f ab n ∈ R 3 on particle a from particle b.Each contact force has one normal and two tangential (friction) components.The computationally intense part is the numerical integration of the velocity which involve contact detection and contact force computation.Thus, we ascribe the system a computational dimensionality of Numerical integration is normally performed using an explicit method with small time-steps, that resolve the natural oscillation frequency given the particle mass and stiffness.The computational bottleneck is the collision detection and force calculation from the contact overlap and relative velocity.For some systems it is more beneficial to use the timeimplicit method referred to as nonsmooth contact dynamics, or nonsmooth DEM [14,19].This allows for large time-step integration, moving the computational bottleneck to the process of solving the nonlinear equations (variational inequalities) from the Signorini-Coulomb and Newton contact laws.

Reduced DEM
Let the particles be divided in two subsystems, A and B, such that where f AB is the interfacial forces on A from B, and f A and f B denote forces acting only on particles in A and B, respectively.The computational dimensionality of the system can be decomposed as Assume that the particles in subsystem B move according to a known velocity field u(x, t).Each particle b ∈ N B thus has a known velocity v b = u(x, t) x=x b at coordinate x and time t.We abbreviate this as ẋB = u(x B , t).Consequently, the particle acceleration is vb = ∂ t u(x b , t) + v b • ∇u(x, t) x=x b .This eliminates the equation for vB and leave us with the following, reduced, set of equations of motion n If the velocity field u(x, t) can be computed with negligible effort, the computational intensity of the system is that of integrating Eq. ( 9).The dimensionality of the reduced model is then Assuming each particle is in contact with a handful of other particles the reduction factor become

Model reduction errors
There can be two sources of errors in the described reduced-order model.Firstly, the model velocity field u (x, t) may deviate from the true mean velocity u(x, t).Secondly, the particle velocities may deviate from the mean velocity field.One quantity that captures the level of fluctuations is the so-called granular temperature, T (x, t) = [v a − u(x, t)] 2 , where . . .denote averaging over particles in a small volume, centred at x.For the mean squared deviation of the particle velocities from the model velocity field we observe We therefore introduce the granular temperature error and the velocity error associated with the model reduction where the integrals are over the volume enclosing the reduced subsystem B and v 0 is a characteristic velocity for the system that the model should be able to resolve.The errors can be computed with no weight, w(x) = 1, or weighted by the local mass density, w(x) = ρ(x)/ρ 0 , relative to a nominal bulk density ρ 0 to suppress harmless errors in dilute region.If a surface height function z = h(x, y) is tracked during a simulation, it can be interesting to analyse the surface height error where h (x, y) is the surface height function using the reduced model, A is the projected area of the system in the xy plane and V is its volume enclosed by h(x, y) and some reference surface h 0 (x, y).

Extension to multibody systems
Consider the presence also of a rigid multibody system C with position x C , velocity v C and mass M C .
The multibody system has articulation joints and actuators that are represented by a constraint vector The forces on the multibody system are the constraint force G C λ C , external force f C and contact forces f CA and f CB from system A and B. The extended system has the following equations of motion The multibody system has some velocity v BC at the interface between system B and C, which is a contributing cause of the velocity field u(x, t) in B. The contact force f CB on system C from B is either computed from a contact model or an additional output of the reduced-order model.

Adaptively reduced DEM
The purpose of the model reduction is fast simulation with sufficient precision.Fast simulation is achieved by minimizing the number of dynamic particles, N A p , that are simulated with high computational intensity in system A. High precision requires that system B does not include too large domains with high granular temperature, where the individual particle motion deviate substantially from the mean flow.The solution is to adaptively control which regions and what particles are simulated with the reduced DEM and resolved DEM, keeping the reduction factor R and the effect of the granular temperature error E T (t) at minimum.This is carried out as follows, with reference to the illustration in Fig. 1.
The reduced model velocity field, u (x, t), is assumed to be known.Each particle a has a speed ∆v a = v a − u relative to the velocity field.Particles are kept dynamic and part of system A as long as their relative speed exceeds a threshold value ∆v a > εv 0 for some error tolerance ε.Particles with relative speed below the threshold value, ∆v a ≤ εv 0 , are simulated with the reduced model in system B. Now, contacts in granular media are strongly dissipative.Consequently, particles in A that repeatedly collide with particles in B will have a velocity that quickly approach the velocity field and become part of system B.
It is easy to conceive extensions to this basic scheme.Particles in B that receive a large impulse can be made dynamic and part of system A. If the received impulse is large enough, and the particle lack contact support, it will no longer co-move with the velocity field and remain dynamic.Otherwise, it will merge back.If it is possible to predict regions of high granular temperature, the particles there can be kept dynamic.This is useful at outlets, belt conveyor endpoints or when releasing material from a digging tool, where the flow transitions quickly from solid or dense liquid phase into gaseous phase and free fall.Depletion in mass density is, for instance, a good indicator that the granular temperature is about to increase.

A velocity field predictor
Reduced-order DEM simulation, as outlined in Sec. 3, rely on a lower-order model for predicting the velocity field in the granular media.In this section we present a regression model for discrete velocity field prediction and the techniques we use for sampling training and test data from resolved DEM simulations.

Coarse-graining
Coarse-graining is a technique for sampling and averaging particle states to obtain macroscopic fields.The field values at any coordinate x is a weighted average of a discrete set of particle and contact variables in a neighbourhood controlled by a coarse-graining function φ(x, R) with smoothing length R. For sampling velocity fields when running the reduced-order model, we choose a Heaviside coarse-graining function.For particles near the coarse-graining boundary, the mass is weighted by the volumetric overlap approximated using a bounding box.In offline field analysis we use a Gaussian function φ Here we apply a cut-off at |x| = 3R for practical reasons, which cause a truncation error of 0.01.The mass density field is computed as ρ(x, t) = a m a φ(x − x a (t), R).The velocity field is obtained by u(x, t) = p(x, t)/ρ(x, t), where the momentum density field is first computed as p(x, t) = a m a v a φ(x − x a (t), R).

Discretization
We use a regular grid with N v cells, voxels, with side lengths L = (L x , L y , L z ), where we denote the shortest of these L min .Each voxel, indexed i = (i, j, k), has a centre point coordinate x i .The mass density and velocity fields are represented in discrete form by their values in the voxel centres, ρ i = ρ(x i ) and u i = u(x i ).The grid size is limited by the particle diameters d according to L min > d max , where d max is the largest particle.The smoothing length for coarse-graining is equal (Heaviside) or somewhat larger (Gaussian) than the size of the voxels.

Sampling
Data is sampled from resolved DEM simulations.The instantaneous velocity field at a time t is stored in a vector U (t) = [u i (t)] ∈ R 3Nv , the mass density field in a vector P (t) = [ρ i (t)] ∈ R Nv , and the control signal vector J (t) = [j i (t)] ∈ R Nj that drive the system.These system state time instances are referred to as snapshots.A data sample is a time-average of snapshots between time t n and t n+Nτ −1 ≤ τ , e.g., Any reaction force on a selected body or contact surface, b, may also be sampled, and we denote it , where N f is the total number of sampled force components.

Regression model
We are searching for a model that predict the discrete velocity field U ∈ R 3Nv , and possibly also the reaction force F ∈ R N f , from a given mass density field P ∈ R Nv and control signal J ∈ R Nj .This is approached as a regression problem, y = φ(x), with predictor variable x = [P , J ] ∈ R Nv+Nj and response variable y = U ∈ R 3Nv or y = F ∈ R N f for the velocity and force prediction, respectively.The natural start is to first consider a linear regression model with model parameters β 0 ∈ R 3Nv and β 1 ∈ R 3Nv×(Nv+Nj ) and error term ε, for the case of the velocity response variable.There is, however, good reason to believe that a purely linear model cannot capture the behaviour and our numerical experiments also confirmed this.The velocity field and reaction force is expected to depend nonlinearly on the mass and control signal.Therefore, we make the following nonlinear model ansatz where H : R Nv → B Nv is the Heaviside function, component-wise returning an occupancy value 0 or 1 depending on whether the mass density in the voxel is nonvanishing.The vectorization operator vec( ) produce a regression variable x ∈ R NvNj out of the matrix H(P )J with dimension N v × N j .The model parameters is β 1 ∈ R 3Nv×NvNj .We make the same ansatz for the force response variable.
It can be expected that this model suffers from multicollinearity, i.e., that the predictor variables are correlated according to the measurement data.One way to handle this is to apply Ridge regression which adds a penalty term λ β 2 2 to the regression loss function, where β 0 and β 1 have been combined in β as is customary.The regression loss function becomes with penalty parameter λ, that is a hyperparameter to be calibrated.Another way to treat the multicollinearity would be to apply principal component regression.In this case one performs PCA on the predictor variables and omit the low-order principal components.Ridge regression accomplish the same effect, but without dimensional reduction.

Numerical experiments
To test the model order reduction technique, numerical experiments are performed on two systems.We use a nonsmooth discrete element method as described in [19] using the software AGX Dynamics [1].
The reduced-order model is implemented in Python using NumPy and the model training is performed using scikit-learn [12].The experiment steps, summarized in Fig. 2

Pile with a discharge flow
A quasi 2D pile is confined by inclined sidewalls and vertical rear and front walls, as shown in Fig. 3.The inflow of material is controlled by an emitter above the pile, feeding material at variable flow rate.There is a 1.9 m wide outlet where the sidewalls meet.This is also the distance between the front and rear walls.
The discharge flow at the outlet is controlled with a control signal j(t).Particles become kinematic at the outlet, moving with a velocity v out = [0, 0, −j(t)].
The particles are spherical with diameter 0.1 m, 0.16 m, 0.22 m, and 0.3 m, distributed by the mass ratio of 0.3, 0.5, 0.15, and 0.05, respectively, relative to the total mass.The specific mass density is 2500 kg/m 3 , elasticity 10 8 Pa, coefficient of restitution 0.0, friction coefficient 0.5 and rolling resistance coefficient 0.2.The sidewalls have inclination 20 • and share the same contact parameters as the particles.Frictionless boundary condition is applied on the vertical (front/rear) walls.The simulations are run with 0.017 s time-step and 200 projected Gauss-Seidel solver iterations.Grid dimensions are 24 × 1.9 × 15 m with 40 × 3 × 25 voxels.The predicted velocity field is a mean field, trained on coarsegrained data that involve both spatial and temporal averaging.Consequently, the particles in the reduced domain may be integrated with a different time-step than used in resolved DEM simulation.The Courant-FriedrichsLewy condition imply a time-step around 1 s or smaller for the given voxel size and flow rates.We use 0.17 s time-step for integrating the reducedorder model.
Depending on the confinement geometry and material parameters, the granular media in a pile or silo is discharged either through funnel flow or mass flow.In funnel flow, the material divides into stagnant zones with no motion, and flow zones with shear flow stretching from the outlet to the surface of the pile.In mass flow there are no stagnant zones and all particles are in motion during discharge.The pile in Fig. 3 exhibits funnel flow during discharge.For modelling the velocity field, we assume that the bulk flow is quasi-stationary and depend only on the current outflow control signal j(t) and on the mass distribution ρ(x, t).Since the model converts the mass density into binary occupancy, the model can generally take the surface height function, h(x, y), as input and directly compute the occupancy underneath it.This is useful when running the model coupled to a real system instrumented with range sensors.
It is important that the training data cover the system state space, that is spanned by the vector [v(x), ρ(x), j].We generate data samples from 2500 full-resolution simulations, with varying outlet velocity, starting from 150 different initial states.An initial state is a certain material distribution, created by different combinations of mass flow at the outlet and inlet.Also, the position of the inlet is varied.The surface profiles for the 150 initial states can be seen in Fig. 4.
During each discharge simulation, the outlet velocity is kept constant at values between 0 and 0.5 m/s and there is no inflow.The duration of each simulation is 30 s. Data samples with 1 s time average are produced from recorded snapshots.To avoid sampling any transient flow after engaging the outlet, the initial part of simulation is discarded.The discharge simulations result in 50, 000 data samples, constructed with latin hypercube sampling uniformly from the 150 initial states and 2500 outlet velocities.
A number of models for predicting the velocity field  are generated with ridge regression parameter in the range 10 −1 to 10 3 .The performance of these models is evaluated in multiple ways.i) The prediction score on the training and test data are compared to see how well the models can generalize to unseen states.ii) The ability to predict velocity fields is examined by comparing coarse-grained resolved DEM fields to corresponding predicted fields, in a complete discharge of a chosen pile state.iii) The same pile discharge is used to compare when each particle pass the outlet (exit time) in reduced-order DEM c.f. resolved DEM.iv) The adaptive reduced-order DEM technique is evaluated by comparison to resolved DEM during more extensive filling and discharging with identical inflow/outflow signals.
The prediction score on the training and test data can be seen in Fig. 5.The best performance on the unseen test data occurs for regularization parameter values between 1 and 10.We chose λ = 10 as our preferred model and will focus on the performance of this.
Fig. 6 shows snapshots from a ground truth resolved DEM simulation of the discharge of a pile state with constant outflow velocity 0.5 m/s.The columns are snapshots from 10, 20, 30 and 40 seconds into the simulation, with the fields time-averaged over 1 s.In the third and fourth row the velocity field from the ground truth simulation and the model prediction (λ = 10) are shown together with the mass density.In the fifth row the difference between the ground truth and predicted velocity field is shown, and we observe a good agreement.The lower panel shows the model reduction velocity error, defined in Eq. ( 12), as a function of time for the duration of the pile discharge.This evaluates the performance of the velocity prediction for all the produced models.We can see that the model reduction velocity error is around 10% for the λ = 10 model but increases to 20% towards the end when the amount of remaining material is small.In the second row we observe that the granular temperature error is elevated especially near the outlet, indicating an irregular flow there.The time-evolution of the granular temperature error is also included in the lower panel.It varies mostly between 0.5 and 1.
The ability to predict the velocity field is neces- sary for reduced-order DEM simulation, but not sufficient.When the system is time-integrated using the reduced-order model the errors may drift or cause instability.Therefore, we examine also the performance of the model when used to propagate the system forwards in time during a pile discharge.Starting from the same state as in Fig. 6 and running with the same control signal, particle positions are integrated using linear interpolation of the velocity field to the particle positions.We compare the particles exit time in the two cases, the ground truth resolved DEM simulation and the reduced-order DEM case.The results can be seen in Fig. 7.The orange line indicates identical exit times for the two cases.Most particles are concentrated on this line, with standard deviation 3.4 s and mean absolute deviation 2.1 s, which is a relative error around 10%.The distributions of number of outflow particles per time unit are also in fair agreement.Finally, we evaluate the performance of the adap-  tive reduced-order DEM technique, where reduced DEM is used below the surface of the pile, while resolved DEM is used for particles that are in free fall, impacting and flowing rapidly on the surface.The idea is to run one resolved DEM simulation and one adaptive reduced-order DEM simulation using identical control signals.The tests start from an empty container, building up a pile with variable inlet and outlet signals.The control signals can be seen in the lower panel of Fig. 8, with the outflow velocity converted to an estimated mass outflow per unit time.The outflow velocity is initially kept within the domain of the training data (0 to 0.5 m/s), but is in the end set well outside this domain, at 1 m/s.A comparison of the two cases can be seen in Fig. 8, with snapshots at every 10 seconds after building up the pile.The fully resolved DEM simulations (ground truth) can be seen above the corresponding adaptive reduced-order model case.The particles are colour coded according to time of emission, with the dynamic particles in the adaptive reduced-order model case displayed in black.We track the surface of the pile and let particles down to a depth of 0.5 m be dy- namic.One can see that the adaptive reduced-order model performs well, with the surface profiles being similar between the two cases for essentially all times.The surface height error can be seen, as a function of time, in the lower panel of Fig. 8.It is well below 10% during most of the simulations but increase towards the end when there is less remaining material.
It is interesting to study the granular temperature error, defined in Eq. ( 11), to gain insight in the deviation of particle motion simulated with the full resolution model and the reduced-order model.Two snapshots, from time 53 s and 95 s, are presented in Fig. 9.As expected, the granular temperature error is elevated on the surface of the pile when there is an incoming flow, thus motivating the use of resolved DEM there.During discharge, the granular temperature error is elevated around the outlet, not surprising given that the largest particle diameter is 1/6 of the outlet width.There is also signs of correlated velocity fluctuations on a longer length and time scale than that of individual particle rearrangements.This limits the possibility for the reduced models to achieve a low velocity error E v , even if the model accurately predicts the mean fields.Since this occurs near the outlet, it has marginal effect on the material above.Sample videos are available at http://umit.cs.umu.se/ddgranular/.

Bulldozing blade
As a second test we simulate a blade driven horizontally, cutting the surface of a granular bed like a bulldozer blade, see Fig. 10.The simulated flow is consistent with the theory of soil mechanics, that predict the formation of a wedge-shaped failure zone in front of the blade [11].When pushed forward, the soil fails along a localized shear band that stretch from the cutting edge of the blade up to the free surface.Outside the failure zone the material is at rest.Inside the failure zone the material moves forward and upward and may form a circulating flow.The granular temperature is elevated at the front and at the cutting edge where impacts are frequent.A model is trained to predict the velocity field and the reaction force on the blade from the horizontal velocity and the mass distribution in front of the blade.The blade is 1.6 m wide and is attached with a 6-degree-of-freedom constraint to a kinematic body having velocity v = [j(t), 0, v z (t)].The constraint force holding the blade relative to the kinematic body f = [f x , f y , f z ] is measured during the simulation.The shape of the blade is that of two rectangular plates joined along their long edge at an angle of 35 • .To avoid sampling of an unnecessarily large domain a coarse-graining grid is co-moving with the blade (velocities are in the world frame and the voxel field is moved to the position of the blade at each iteration).The dimensions of the grid is 1.0 × 2.5 × 1.0 m with 8 × 5 × 8 voxels.Data samples are time-averaged over 1/6 s.In this numerical experiment the ground truth simulations are run with a time-step of 0.005 s and 250 projected Gauss-Seidel solver iterations, and the particle contact parameters are identical to the pile experiment.
During simulations using the adaptive model order reduction, the particles inside the co-moving grid are assigned the velocity predicted by the velocity field using linear interpolation between the voxel centres.Outside the grid, the velocity field is assumed to be zero.Particles exiting the co-moving grid become dynamic, simulated using resolved DEM, until they are at relative rest to the particle bed.
To generate data samples, 250 simulations are run where the blade is pushed with different constant velocities, between 0 and 1.5 m/s, and with different cutting depth, ranging between 0 and 0.2 m.The model is trained using exclusively horizontal motion.As for the pile, a number of models are generated with different ridge regression parameter values, here ranging from 10 −3 to 10 3 .The prediction score on the training and test data can be seen in Fig. 11    we again pick λ = 10 as our preferred model, with the best generalization to the test data.
The models are evaluated with regard to i) the ability to predict velocity fields, ii) the ability to predict the force holding the blade, and iii) the ability to propagate the system forwards in time.This is all considered for a standardized blade trajectory, which can be seen in the upper panel of Fig. 13.
The ability to predict the velocity fields is evaluated by comparing coarse-grained ground truth velocity fields to that predicted from the corresponding mass densities, as seen in Fig. 12.The first row shows states at 1 s intervals from a resolved DEM simulation with the blade cutting soil at 1 m/s and 0.15 m depth.The second row shows the corresponding mass density and velocity field obtained by coarse-graining.The velocity field predicted by the trained model is shown in the third row, and the fourth row shows the difference between the simulated and predicted velocity fields.The predicted velocity fields are gen- erally close to the measured ones inside the active zone but can differ considerably on the surface.This is consistent with the observation, in Fig. 10 (e), that the granular temperature is elevated there.There are also large differences at 1 s, during the build up phase.The evolution of the model reduction velocity error, E v , over time and for the different regularization parameters is shown in the bottom of the figure .For the λ = 10 model the error is at 0.35.
The capability of predicting the force required to push the blade through the particle bed is also examined.A fully resolved DEM simulation is performed using the described trajectory to produce ground truth data of the blade force.From the same simulation, mass density field and control signal are used as input to the model for predicting the blade force.Both forces, time-averaged over 1/6 s, are plotted in Fig. 13.The path of the blade (top panel) is illustrated with 1 s intervals.The predicted force is overall in good agreement with the ground truth but has some problems when the blade is lowered into and raised from the bed, which was not represented in the training data.Also, it was found that training data with the blade moving above the bed, without any mass in the sampling region, was necessary for the model to predict the weight of the blade, i.e., the vertical force when the blade is reversed at the end of the bulldozing cycle.
Predicting the velocity field is a necessary but not sufficient functionality.It does not imply that the reduced-order model can propagate the system forward in time with similar precision.Velocity errors in the build up phase may lead to unphysical material distributions, in which case it is of little value that the model is good at predicting stationary flows.Also, there are regions of elevated granular temperature, lower right in Fig. 10, at the blade's cutting edge and the front surface of the pile.Since the particle velocities deviate from the mean flow in those regions, the model should not be able to predict them accurately.Therefore, we compare reduced-order DEM and resolved DEM simulations with identical blade trajectories, to evaluate the performance in propagating the system forwards in time.The reducedorder DEM is simulated with the same blade trajectory as previously, depicted in Fig. 13, but with the larger time-step 0.017 s.Fig. 14 shows the two surfaces after a completed bulldozing cycle, and the difference between them.The surfaces has slots and side windrows of similar depth, height and width.In the reduced DEM simulation, the resulting pile is not pushed as far to the end of the slot as in the resolved DEM simulation.This can be understood by that the particles have no inertia in the reduced-order model, and the predictor do not consider the vertical motion when the blade is lifted.Instead, the material simply stays still when the blade is stopped and lifted, and subsequently fall down in a pile when exiting the voxel field.
The reduced-order model captures the force on the blade well, the displacement of material fairly well but the particle velocities only to partial extent.Unfortunately, it is also found that the model is sensitive to the training data.The problem is that the flow during the build up phase is rather different from the stationary flow, and it turns out it is hard to find a balance in representing them both with the model.Depending on which temporal parts of the training data are included, more or less emphasis can be put on these parts.This imbalance typically results in either particles not building up properly in front of the blade, or particles not able to comove with the blade, leaking out through the back.This suggests the need for a different model ansatz than Eq.(19).

Performance measurements
In Table 1 we summarize the performance measurements from the numerical experiments.The compu- tational time for the fully resolved and reduced DEM simulations, t res and t red , are normalized by the real time duration of the experiments, t real .The speedup factor is the ratio of the resolved computational time to the reduced one.The reduction factor is the ratio of the average number of dynamic particles in the reduced and the resolved DEM simulations.The precision is the inverse of the numerical errors in the tests.
The numerical experiments are performed on a prototype implementation, combining Python and NumPy operations for the reduced-order DEM and AGX Dynamics for resolved DEM and multibody dynamics.We expect there is room for optimization of the computational speed.One opportunity that has not been utilized is that the number of iterations in the projected Gauss-Seidel solver can be decreased with the number of resolved particles (size of the contact network) [19].In the performed tests the number of solver iterations was kept constant at 200 and 250 in the pile and the blade experiments, respectively.If the reduced DEM simulations instead are run with 20 iterations, appropriate for an error tolerance of 10% on these systems, the speed-up would roughly double from 10 to 20 (Experiment Pile Fig. 8) and from 50 to 100 (Experiment Blade Fig. 12), respectively, and reach the real-time requirement t red /t real < 1. Disabling the collision detection between particles in the reduced-order model is another optimization that has not been applied here.The measurements of the computational time were made using a single thread on an Intel R Core TM i7-4770 CPU @ 3.40GHz.

Discussion
The purpose of the model order reduction is to enable simulation at a higher speed or with increased number of particles while remaining withing given computational bounds.The price for the speed-up is reduced precision and the effort of the simulations that must be carried out in advance to produce training data.
Let us first consider what computational speed-up can be expected.In the extreme case when the whole system can be represented with the reduced model, the main computational steps are: i) do inference on the regression model in Eq. ( 19) with given known parameters β; ii) determine the particle velocities by interpolation and update their new position; iii) and, possibly, generate output for the purpose of analysis.The computational complexity of doing inference is that of matrix-vector multiplication of size dim(β) = 3N v ×N v N j , which require 6N 2 v N j floatingpoint operations.At the present time a powerful desktop CPU delivers about 100 gigaFLOPS and a high-end GPU up to 100 teraFLOPS.It is thus conceivable to evaluate models of size N v = 10 3 (CPU) and N v = 10 4 (GPU) within one millisecond.Hence, it should be possible to simulate fully reduced DEM systems at 60 Hz with up to N p = 10 5 (CPU) and N p = 10 6 (GPU) particles, assuming 10 particles per voxel.Also, the reduced-order DEM is limited by Courant-FriedrichsLewy (CFL) condition rather by the time-step used in simulation of the resolved DEM.In the pile and blade experiments the CFL time-step limits are estimated to 1 s and 0.1 s, respectively.That adds another factor 10 to 100 in speed or size of systems that may be simulated in real-time with the reduced model.
When the system has regions with granular temperature domains that require resolved DEM, this part of the simulation easily becomes the computational bottleneck.The number of particles that can be simulated in real-time with resolved DEM is up to 10 4 , the precise number depending on particle size, velocity, contact parameters and numerical integration technique [19].With a reduction factor of 1/10, we expect that the pile system can be simulated in real-time with N p = 10 5 particles.Compared to this theoretical performance estimate, the prototype implementation is underperforming in the Pile test in Fig. 8 by a factor of 10 and can potentially be optimised to reach a speed-up factor of 100.
The precision in the pile experiments is about 10%.This is not a high precision but acceptable for many applications, especially when there are no other alternatives that deliver real-time performance.Also, the uncertainties behind the ground truth (resolved) model are in many practical situations also of this magnitude.There are several ways the reduced model can be improved, besides providing it with more or better training data.The velocity field in the real system has fluctuations and transients that cannot be captured by a model assuming quasi-stationary flow.To achieve higher precision, the missing flow dynamics must be included either at the level of the velocity field predictor or at the particle level, like in the spot-model by Rycroft et al. [17].The error due to elevated granular temperature in shear flow can possibly be reduced by adding diffusion terms in the calculation of new particle velocities from the mean field.Models for diffusion in granular flows, as function of the local strain rate, can be found in the literature [22] and can be calibrated using the resolved DEM simulation data.It may also be a good idea to adjust the particle velocities or positions to  resolve unphysical contact overlaps that result from the errors in the velocity predictor or from applying a diffusion model.In the prototype implementation used for the numerical experiments, the regions of elevated granular temperature are detected manually.Creating a model for predicting the local granular temperature and how it can be expected to develop would enable more automatic and perhaps more precise adaptive reduced-order DEM simulation.Another important question is how much training data is needed to create the reduced-order model.Fig. 15 shows the prediction score on the test data for the pile models, as a function of being trained on increasing fractions of the training set of 50,000 data samples.The most significant increase in the test score occurred when using up to 50% of the training set, with generally a small further increase when using all data.However, the scaling depends on the regularization and none of the cases has really saturated, thus there could still be some improvements by increasing the amount of training data.
In the present study a plain regression model is tested.There is good reason to believe that deep neural networks (DNN) can perform well and train to generalize much better to nonlinear flows and changes in the geometric boundaries.Typically, a DNN has many more parameters and requires larger amounts of training data.That does not necessarily mean many more simulations, but instead using more of the unprocessed data from the full resolution simulations, e.g., the particle motions relative to the mean field, density, and the contact force network, to better resolve the fluctuations in the velocity field and reaction forces on objects interacting with the granular media.

Conclusion
We introduce a novel technique for model order reduction of DEM simulations of granular media.Using many offline simulations, we train a regression model to predict the velocity field.This model is then used to assign particle velocities, in place of the time-consuming process of collision detection and force computation.An adaptive domain technique is used to apply the reduced-order model in regions with low granular temperature error where individual particle motion coincide well the mean flow, and use resolved DEM simulation in regions with more irregular particle motion.This allows for minimizing the number of particles in the computationally intense process, while still simulating particle motion with realistic mean velocity.The adaptive reducedorder model is applied in two test systems, a gravitational discharge pile and a bulldozer blade cutting and pushing through a particle bed.We measure a computational speed-up of 10 to 20 and 90% precision for the discharge pile.We estimate that it is possible to reach real-time performance with 10 5 particles at 60 Hz and a reduction factor 1/10, corresponding to a speed-up of 100.For the bulldozing blade the speed-up reaches 60 but with a precision of around 65%. Plans for future research includes extending the predictor model beyond plain regression.

Figure 1 :
Figure1: Illustration of a granular system divided in a high-resolution part (A), reduced-order part (B) and coupling with a multibody system (C).

Figure 2 :Figure 3 :
Figure 2: The different steps of developing a reduced-order model.

Figure 4 :
Figure 4: Distribution of surface profiles for the 150 initial states, with some highlighted samples.

Figure 5 :
Figure 5: The R 2 score as a function of the regularization parameter λ for the pile training-and test data (left), and the average model reduction velocity error for the test data (right).

Figure 6 :
Figure 6: Sample results, comparing ground truth velocity fields from DEM simulations with predicted velocity fields from the reduced-order model.
s reduced DEM #particles/s resolved DEM

Figure 7 :
Figure 7: Comparison of residence times from pile discharge using fully resolved DEM simulations and the reduced-order model.The upper panel shows the number of discharged particles per second for the two cases.

Figure 8 :
Figure 8: Comparing ground truth simulations and model reduction simulations with identical emitter/outlet signals.

Figure 9 :
Figure 9: The temperature error (bottom row) from a full resolution simulation of a pile during filling (left column) and discharging (right column).The velocity vector field and mass density field (top row) are included for reference. and

Figure 10 :
Figure 10: A blade pushing a granular bed simulated with fully resolved DEM.A time instance is shown in 3D overview (a) and cross-section view (b).The predictor model for the velocity field and reaction force from the control signal and mass distribution is illustrated in (c).Also shown is the mass density (d) and granular temperature error (e), with the mean velocity vector field superimposed.

Figure 11 :
Figure 11: The R 2 score as a function of the regularization parameter λ for the blade training-and test data.

Figure 12 :
Figure 12: Sample results, comparing ground truth velocity fields with the velocity fields predicted by the reduced-order model.

Figure 13 :
Figure 13: Sample data of the horizontal (red) and vertical (blue) components of the force to push the blade, simulated using resolved DEM (solid lines) and the reduced model (dashed lines).

Figure 14 :
Figure 14: The surface of the material bed after one bulldozing cycle with ground truth simulation (top), granular reduced-order model (middle) and the difference between them (bottom).

Figure 15 :
Figure 15: R 2 scores for the pile models on test data, trained on fractions of the training dataset.

Table 1 :
Performance number measured in the numerical experiments using resolved and reduced DEM simulations.