1 Introduction

Problems involving large deformations of granular media are an active research topic in geomechanics and natural hazard management. Particular attention is paid to understanding the underlying processes and to predicting the run-outs of rock and debris avalanches or landslides, which can be very destructive. Accurate modeling of such flows is therefore invaluable.

In order to simulate flowing granular media, many researchers derived semi-empirical depth-averaged models, see [1,2,3,4,5]. The main disadvantage of such models is their inability to accurately model high but narrow stacks of granular materials. This drawback created a need for more detailed computational models of granular material dynamics. One of the most widely used methods to simulate such problems is the Distinct Element Method (DEM) [6]. The first application of DEM to granular flows was proposed in [7]. Further improvements have been described in [8,9,10,11,12]. Nowadays, DEM is widely accepted as an effective approach for modeling granular and discontinuous materials. Even though the DEM approach is computationally expensive, it is easy to parallelize. Despite its many advantages, DEM has one relevant disadvantage: it is designed for modeling discrete materials, therefore introducing new physics can be very complex. For example, to model interactions of granular material with fluid flow, coupling with a separate CFD method is necessary. One of the alternative methods that can deal with multi-physics is the Finite Element Method (FEM), see [13] for details. To the best of our knowledge, the first application of FEM to the granular material collapse was proposed in [14]. The authors used a continuum approach based on an elasto-plastic model. The main disadvantage of FEM is its grid-based nature. When large deformations occur, the FEM approach suffers from grid distortions. However, in the last decade, many so-called meshless methods have been developed.

One of the most mature and commonly used meshless approaches is the Smoothed Particle Hydrodynamics method (SPH). It was originally developed to simulate astrophysical phenomena at the hydrodynamic level [15]. The main idea behind the SPH method is to introduce kernel interpolants for flow quantities in order to represent fluid dynamics by a set of particle evolution equations. Due to its Lagrangian nature, for multiphase flows there is no need to handle (reconstruct or track) the interface shape as in the grid-based methods. Therefore, there is no additional numerical diffusion related to the interface handling. For this reason, and because the SPH approach is well suited to problems with large density differences, free surfaces and complex geometries, the SPH method is increasingly used for hydro-engineering and geophysical applications, for a review see [16, 17]. The first attempts to model granular materials using the SPH approach were presented by Bui et al. [18, 19]. The authors decided to use an elasto-plastic (Drucker–Prager) constitutive model. Despite good agreement with the experimental and numerical data, the authors reported a serious tensile instability problem resulting in clustering of SPH particles. Further improvements to this model have been recently proposed in [20], by introducing a diffusion term with a variable diffusion coefficient (a function of stress and strain rate), and in [21], by adding the MLS correction, see [22] for details. The main reason for introducing these improvements was to reduce the non-physical pressure pulsations that need to be stabilized in (non-viscous) SPH by an artificial viscosity, see [16] for details. Another constitutive model was proposed by Ulrich et al. [23], where the granular material is treated as a fluid with a variable viscosity (a visco-plastic rheological model). However, the authors did not present any validation of this approach.

In the present work, the ability to model granular materials using the SPH method and the visco-plastic model is studied. For this purpose, we performed a set of numerical calculations (in 2D and 3D) of the fundamental problem of the collapse of initially vertical cylinders of granular materials. The obtained results were compared with other numerical (DEM and FEM) results and with experimental data. One of the drawbacks of the mesh-free methods is their much lower numerical efficiency compared with the grid-based approaches. However, similarly to the DEM method, numerical implementations of the SPH approach exhibit a high degree of spatial data locality and a significant number of independent calculations, therefore the code can be easily written in a massively parallel manner. In recent years, new techniques allowing numerical simulations to be performed using Graphics Processing Units (GPU) have been developed. The massively parallel capability of modern GPUs allows simulations of large systems to be performed using cheap desktop computers. For the purpose of this study, we implemented the SPH method using GPU programming techniques. Some issues related to the GPU computations are discussed.

The paper is organized as follows: in Sect. 2 a brief introduction to SPH is presented; in Sect. 3 the visco-plastic rheological model of granular materials is introduced; in Sect. 4 the implementation on GPU is discussed; then in Sect. 5 the obtained numerical results are presented. In order to validate the model, the following criteria are taken into account: the granular deposit evolution (Sect. 5.2), run-out distances (Sect. 5.3), the energy contribution (Sect. 5.4), the inclination of the failure plane (Sect. 5.5) and the pressure distribution (Sect. 5.6). The numerical efficiency is discussed in Sect. 5.7. The last part of this paper is the “Appendix”, in which we study the convergence of the proposed model.

2 SPH formulation

The full set of governing equations for incompressible viscous flows is composed of the Navier–Stokes (N–S) equation

$$\begin{aligned} \frac{d{\mathbf {u}}}{dt} = -\frac{1}{\varrho } \nabla p + \frac{1}{\varrho } \left( \nabla \mu \cdot \nabla \right) {\mathbf {u}} + \mathbf g, \end{aligned}$$
(1)

where \(\varrho \) is the density, \({\mathbf {u}}\) velocity, t time, p pressure, \(\mu \) the dynamic viscosity and \(\mathbf g\) an acceleration (gravity in this work); the continuity equation

$$\begin{aligned} \frac{d\varrho }{dt} = -\varrho \nabla \cdot {\mathbf {u}} \xrightarrow {\varrho =const} \nabla \cdot {\mathbf {u}} = 0, \end{aligned}$$
(2)

and the advection equation (Lagrangian formalism)

$$\begin{aligned} \frac{d{\mathbf {r}}}{dt} = {\mathbf {u}}, \end{aligned}$$
(3)

where \({\mathbf {r}}\) denotes position of the fluid element.

The governing equations can be expressed in the SPH formalism in many different ways. In general, two SPH approximations, the integral interpolation and the discretization, lead to the basic SPH relation

$$\begin{aligned} \left\langle A \right\rangle ({\mathbf {r}}) = \sum _b A({\mathbf {r}}_b) W({\mathbf {r}} - {\mathbf {r}}_b, h) \Omega _b, \end{aligned}$$
(4)

where A is a physical field (for the sake of simplicity we consider a scalar field only), W is a weighting function (kernel) with parameter h called the smoothing length, while \(\Omega \) is the volume of the SPH particle. There are many possibilities for the choice of \(W({\mathbf {r}}, h)\). The kernel shape is the main reason for the appearance of the tensile instability resulting in particle clumping [24], the process from which the results in [18] suffer. In [25] the authors performed a series of fluid-flow simulations and showed that using the Wendland kernel [26] in the form

$$\begin{aligned} W({\mathbf {r}}, h) = C \left\{ \begin{array}{ll} \left( 1-q/2 \right) ^4 (2q+1) &{}\quad \text {for} \;\; q \le 2, \\ 0 &{} \quad \text {otherwise}, \\ \end{array} \right. \end{aligned}$$
(5)

where \(q = |{\mathbf {r}}|/h\) and the normalization constant is \(C=7/(4\pi h^2)\) (in 2D) or \(21 / (16\pi h^3)\) (in 3D), the tensile instability (observed also in granular matter modeling by [18]) does not appear. Therefore, in this work, we decided to use the kernel in the form of Eq. (5). For more details on how the choice of the kernel and the smoothing length affects the results, see [25]. It is important to note here that the basic SPH approximation, Eq. (4), is also common to other particle-based numerical approaches, e.g. the Moving Particle Semi-implicit Method (MPS) [27]. The SPH method differs from other methods in the way the differentiation operator is approximated. Assuming the kernel symmetry, the nabla operator can be shifted from the physical field to the kernel

$$\begin{aligned} \left\langle \nabla A \right\rangle ({\mathbf {r}}) = \sum _b A({\mathbf {r}}_b) \nabla W({\mathbf {r}} - {\mathbf {r}}_b, h) \Omega _b. \end{aligned}$$
(6)
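As an illustration of Eqs. (4)–(6), the following minimal Python sketch evaluates the 2D Wendland kernel, its gradient, and the corresponding SPH estimates of a field and of its gradient. The function names and the regular test lattice with \(h/\Delta r = 2\) are assumptions made for this example only; the sketch is not the GPU implementation used in this work.

```python
import math

def wendland_2d(r, h):
    """2D Wendland kernel of Eq. (5); r is the distance, h the smoothing length."""
    q = r / h
    if q >= 2.0:
        return 0.0
    norm = 7.0 / (4.0 * math.pi * h * h)
    return norm * (1.0 - 0.5 * q) ** 4 * (2.0 * q + 1.0)

def wendland_2d_grad(dx, dy, h):
    """Gradient of the kernel with respect to r = (dx, dy)."""
    r = math.hypot(dx, dy)
    q = r / h
    if r == 0.0 or q >= 2.0:
        return 0.0, 0.0
    norm = 7.0 / (4.0 * math.pi * h * h)
    # d/dq [(1 - q/2)^4 (2q + 1)] = -5 q (1 - q/2)^3
    dw_dr = norm * (-5.0 * q) * (1.0 - 0.5 * q) ** 3 / h
    return dw_dr * dx / r, dw_dr * dy / r

def interpolate(points, values, volume, x, y, h):
    """SPH estimate of a scalar field at (x, y), Eq. (4)."""
    return sum(a * wendland_2d(math.hypot(x - px, y - py), h) * volume
               for (px, py), a in zip(points, values))

def interpolate_gradient(points, values, volume, x, y, h):
    """SPH estimate of the gradient of a scalar field at (x, y), Eq. (6)."""
    gx_sum, gy_sum = 0.0, 0.0
    for (px, py), a in zip(points, values):
        gx, gy = wendland_2d_grad(x - px, y - py, h)
        gx_sum += a * gx * volume
        gy_sum += a * gy * volume
    return gx_sum, gy_sum

# Hypothetical test lattice: spacing dr = 0.5, smoothing length h = 1 (h/dr = 2)
DR, H = 0.5, 1.0
LATTICE = [(DR * i, DR * j) for i in range(-5, 6) for j in range(-5, 6)]
```

On such a lattice a constant field is recovered to within a fraction of a percent, and the gradient of a linear field is recovered with similar accuracy, which is the consistency one expects from Eqs. (4) and (6) away from boundaries.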

It is important to note that although different SPH formulations can be obtained from the same governing equations, some of them may not be applicable to certain types of flows, see [28]. One of the most common SPH forms, enabling accurate calculations for the widest range of flow types, is the N–S pressure term proposed by Colagrossi and Landrini [29]

$$\begin{aligned} \left\langle \frac{\nabla p}{\varrho } \right\rangle _a = \sum _b m_b \frac{p_a + p_b}{\varrho _a \varrho _b} \nabla _a W_{ab}, \end{aligned}$$
(7)

where \(\nabla _a W_{ab}=\nabla _a W({\mathbf {r}}_a - {\mathbf {r}}_b, h)\). In the present work, we decided to calculate the pressure term using this variant. The corresponding (variationally consistent [30, 31]) continuity equation takes the form

$$\begin{aligned} \left\langle \frac{d\varrho }{dt} \right\rangle _a = \varrho _a \sum _b \frac{m_b}{\varrho _b} {\mathbf {u}}_{ab} \cdot \nabla _a W_{ab}, \end{aligned}$$
(8)

where \({\mathbf {u}}_{ab} = {\mathbf {u}}_{a} - {\mathbf {u}}_{b}\). The viscous N-S term, because of the efficiency requirements, is expressed as a combination of the finite difference and the SPH approach (as in the MPS approach [27])

$$\begin{aligned} \left\langle \frac{1}{\varrho } \left( \nabla \mu \cdot \nabla \right) {\mathbf {u}} \right\rangle _a = \sum _b m_b \frac{\mu _a + \mu _b}{\varrho _a \varrho _b} \frac{{\mathbf {r}}_{ab} \cdot \nabla _a W_{ab}(h)}{r_{ab}^2 + \eta ^2} {\mathbf {u}}_{ab}, \end{aligned}$$
(9)

where \(\eta =0.01h\) is a small regularizing parameter used to avoid division by a numerical zero. Because SPH is a Lagrangian approach, the particle advection equation completes the system

$$\begin{aligned} \frac{d {\mathbf {r}}_a}{dt} = {\mathbf {u}}_a. \end{aligned}$$
(10)
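The particle sums of Eqs. (8) and (9), together with the pressure term in the symmetric form \(m_b (p_a + p_b)/(\varrho _a \varrho _b)\), can be sketched as a direct double loop. This is only an illustrative \(O(N^2)\) Python version under stated assumptions: the function names, the argument layout and the small test lattice are hypothetical, and the pressure contribution is accumulated as the acceleration \(-\nabla p/\varrho \) appearing in Eq. (1). A production code would use a neighbor search instead of the double loop.

```python
import math

def grad_w(dx, dy, h):
    """Gradient (with respect to r_a) of the 2D Wendland kernel of Eq. (5)."""
    r = math.hypot(dx, dy)
    q = r / h
    if r == 0.0 or q >= 2.0:
        return 0.0, 0.0
    norm = 7.0 / (4.0 * math.pi * h * h)
    dw_dr = norm * (-5.0 * q) * (1.0 - 0.5 * q) ** 3 / h
    return dw_dr * dx / r, dw_dr * dy / r

def sph_rates(pos, vel, mass, rho, p, mu, h, g=(0.0, 0.0)):
    """Direct O(N^2) evaluation of du/dt and drho/dt for every particle."""
    n = len(pos)
    eta2 = (0.01 * h) ** 2  # regularization used in Eq. (9)
    acc = [[g[0], g[1]] for _ in range(n)]
    drho = [0.0] * n
    for a in range(n):
        for b in range(n):
            if a == b:
                continue
            dx = pos[a][0] - pos[b][0]
            dy = pos[a][1] - pos[b][1]
            gx, gy = grad_w(dx, dy, h)
            if gx == 0.0 and gy == 0.0:
                continue  # particle b lies outside the kernel support
            ux = vel[a][0] - vel[b][0]
            uy = vel[a][1] - vel[b][1]
            inv = mass[b] / (rho[a] * rho[b])
            # Continuity equation, Eq. (8)
            drho[a] += rho[a] * mass[b] / rho[b] * (ux * gx + uy * gy)
            # Pressure contribution to du/dt, i.e. -(grad p)/rho
            pres = inv * (p[a] + p[b])
            acc[a][0] -= pres * gx
            acc[a][1] -= pres * gy
            # Viscous contribution, Eq. (9)
            visc = inv * (mu[a] + mu[b]) * (dx * gx + dy * gy) / (dx * dx + dy * dy + eta2)
            acc[a][0] += visc * ux
            acc[a][1] += visc * uy
    return acc, drho

def lattice_test_case(k=0.1):
    """Hypothetical 9 x 9 unit lattice: uniform pressure, velocity u = (k x, 0)."""
    pos = [(float(i), float(j)) for i in range(9) for j in range(9)]
    vel = [(k * x, 0.0) for x, _ in pos]
    n = len(pos)
    return pos, vel, [1.0] * n, [1.0] * n, [5.0] * n, [2.0] * n
```

For the central particle of the test lattice (with \(h = 2\)), the velocity field has divergence k, so Eq. (8) should return approximately \(-\varrho k\), while the uniform pressure and the linear velocity profile should produce no net acceleration.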

In the present work, we decided to use the most common method of implementing incompressibility, the Weakly Compressible SPH (WCSPH). It involves the set of governing equations closed by a suitably chosen, artificial equation of state, \(p=p(\varrho )\). Following the mainstream, we decided to use Tait’s equation of state

$$\begin{aligned} p = \frac{c^2 \varrho _0}{\gamma } \left[ \left( \frac{\varrho }{\varrho _0} \right) ^\gamma - 1 \right] , \end{aligned}$$
(11)

where \(\varrho _0\) is the initial density. The sound speed c and the parameter \(\gamma \) are suitably chosen to reduce the density fluctuations down to \(1\%\). In the present work we set \(\gamma =7\) and c at least 10 times higher than the maximal fluid velocity. It is worth noting that two alternative incompressibility treatments exist: Incompressible SPH (ISPH), where the incompressibility constraint is explicitly enforced through a pressure correction procedure to satisfy \(\nabla \cdot \, {\mathbf {u}}=0\)  [25, 32,33,34,35], and Godunov SPH (GSPH), where an acoustic Riemann solver is used [36]. In the present work, the boundary conditions are imposed using the ghost-particle method [25, 32].
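Tait’s equation of state, Eq. (11), is simple enough to be sketched directly. The Python function below is a minimal illustration; the function names and the inverse relation are assumptions added for this example, and CGS units are used only because the simulation parameters in Sect. 5.1 are given in them.

```python
def tait_pressure(rho, rho0, c, gamma=7.0):
    """Pressure from Tait's equation of state, Eq. (11)."""
    return c * c * rho0 / gamma * ((rho / rho0) ** gamma - 1.0)

def tait_density(p, rho0, c, gamma=7.0):
    """Inverse of Eq. (11): density corresponding to a given pressure."""
    return rho0 * (gamma * p / (c * c * rho0) + 1.0) ** (1.0 / gamma)

# Example: pressure generated by a 1% compression of a material with
# rho0 = 2.6 g cm^-3 and c = 1000 cm s^-1 (values quoted in Sect. 5.1)
p_one_percent = tait_pressure(1.01 * 2.6, 2.6, 1000.0)
```

With \(\gamma = 7\) the equation is stiff: a 1% compression already produces a pressure slightly above the linear estimate \(c^2 (\varrho - \varrho _0)\), which is why a sufficiently high sound speed keeps the density fluctuations small.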

To ensure the stability of the SPH scheme, several time step criteria must be satisfied:

$$\begin{aligned} \delta t< 0.125 \frac{h}{c + u_{\text {max}}}, \quad \delta t< 0.125 \frac{\varrho h^2}{\mu _{\text {max}}}, \quad \delta t < 0.125 \left( \frac{h}{g} \right) ^{\frac{1}{2}}, \end{aligned}$$
(12)

where \(u_\text {max}\) and \(\mu _{\text {max}}\) are respectively the maximal particle velocity and the maximal particle viscosity in the domain.
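The three criteria of Eq. (12) can be evaluated together to see which one limits the time step. The following Python sketch is illustrative only; the function name is hypothetical, and the example values \(g = 981\) cm s\(^{-2}\) and \(u_{\text {max}} = 100\) cm s\(^{-1}\) are assumptions, while the remaining values are those quoted in Sect. 5.1 for the finest 2D resolution of a low column (\(L = 10\) cm, \(L/h = 64\)).

```python
def stable_time_step(h, c, u_max, rho, mu_max, g, factor=0.125):
    """Return the largest admissible time step and the individual limits of Eq. (12)."""
    limits = {
        "acoustic": factor * h / (c + u_max),
        "viscous": factor * rho * h * h / mu_max,
        "gravity": factor * (h / g) ** 0.5,
    }
    limiting = min(limits, key=limits.get)
    return limits[limiting], limiting, limits

# CGS units: h = 10/64 cm, c = 1000 cm/s, rho = 2.6 g/cm^3, mu_solid = 2000 g/(cm s)
dt, limiting, limits = stable_time_step(
    h=10.0 / 64.0, c=1000.0, u_max=100.0, rho=2.6, mu_max=2000.0, g=981.0
)
```

With these inputs the viscous criterion is the most restrictive one, by roughly a factor of four relative to the acoustic criterion, which illustrates why the maximal viscosity has a direct impact on the cost of a simulation.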

3 Granular material modeling

In order to simulate the granular materials using the SPH approach, we decided to adopt the visco-plastic rheological model first used in SPH by Ulrich et al. [23]. Equations that predict the shape of the general flow curve need at least four independent parameters. A common example is the Cross [37] equation

$$\begin{aligned} \frac{\mu _0 - \mu }{\mu -\mu _{\infty }} = (K \dot{\gamma })^m, \end{aligned}$$
(13)

where \(\mu _0\) and \(\mu _{\infty }\) refer to the asymptotic values of viscosity, while K and m are constant. The shear strain rate, \(\dot{\gamma }\), can be defined as

$$\begin{aligned} \dot{\gamma }= \sqrt{2\dot{\epsilon }^{ij} \dot{\epsilon }^{ij}}, \end{aligned}$$
(14)

where

$$\begin{aligned} \dot{\epsilon }^{ij} = \frac{1}{2}\left( \frac{\partial u^i}{\partial x^j} + \frac{\partial u^j}{\partial x^i} \right) . \end{aligned}$$
(15)
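Equations (14) and (15) can be checked on a simple example. The Python sketch below computes the shear strain rate from a velocity gradient tensor; the function name and the convention `grad_u[i][j]` \(= \partial u^i / \partial x^j\) are assumptions made for this illustration.

```python
def shear_strain_rate(grad_u):
    """Shear strain rate of Eq. (14) from the velocity gradient grad_u[i][j] = du^i/dx^j."""
    n = len(grad_u)
    total = 0.0
    for i in range(n):
        for j in range(n):
            # Symmetric part of the velocity gradient, Eq. (15)
            e_ij = 0.5 * (grad_u[i][j] + grad_u[j][i])
            total += e_ij * e_ij
    return (2.0 * total) ** 0.5

# Simple shear u = (k y, 0) with k = 3: the only non-zero derivative is du^x/dy
simple_shear = [[0.0, 3.0], [0.0, 0.0]]
# Rigid rotation: antisymmetric gradient, no deformation
rigid_rotation = [[0.0, -2.0], [2.0, 0.0]]
```

For the simple shear the function returns the shear rate k itself, and for a rigid rotation it returns zero, which is the behavior expected from a measure of deformation rate.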

In the case of the granular phase, \(\mu _0\) corresponds to the viscosity of solid (low values of \(\dot{\gamma }\), elastic limit), while \(\mu _{\infty }\) is the viscosity of grains above elastic limit (high values of \(\dot{\gamma }\)). Therefore, we may assume that \(\mu \ll \mu _0\), which reduces Eq. (13) to the Sisko [38] model

$$\begin{aligned} \mu = \mu _{\infty } + \frac{\mu _0}{(K \dot{\gamma })^m} = \mu _{\infty } + K_2 \dot{\gamma }^{n-1}. \end{aligned}$$
(16)
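The intermediate algebra can be written out explicitly. The identifications \(K_2 = \mu _0 K^{-m}\) and \(n = 1 - m\) below are inferred by comparing the two forms of Eq. (16); they are not stated in this form in the text above. With \(\mu \ll \mu _0\), the numerator of Eq. (13) becomes \(\mu _0 - \mu \approx \mu _0\), so that

$$\begin{aligned} \frac{\mu _0}{\mu - \mu _{\infty }} = (K \dot{\gamma })^m \quad \Rightarrow \quad \mu = \mu _{\infty } + \mu _0 K^{-m} \dot{\gamma }^{-m} = \mu _{\infty } + K_2 \dot{\gamma }^{n-1}, \quad K_2 = \mu _0 K^{-m}, \quad n = 1 - m. \end{aligned}$$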

Assuming \(n=0\), we get

$$\begin{aligned} \mu = \mu _{\infty } + \frac{K_2}{\dot{\gamma }}, \end{aligned}$$
(17)

which is commonly known as the Bingham model [39]. Multiplying Eq. (17) by \(\dot{\gamma }\) and identifying \(K_2\) with the yield stress, the model can be written in terms of the shear stress as

$$\begin{aligned} \tau = \mu _{\infty } \dot{\gamma }+ \tau _{\text {yield}}, \end{aligned}$$
(18)

where \(\tau \) is the shear stress and \(\tau _{\text {yield}}\) is the yield stress. In this model, the material behaves as a solid body until the shear stress exceeds the yield stress (reaching the critical state), after which large deformations may occur. One of the commonly used models for the yield stress is the Mohr–Coulomb failure criterion [40], in which the shear strength of soil is expressed as a combination of adhesion and friction components

$$\begin{aligned} \tau _{\text {yield}} = c + \sigma _n \tan \varphi , \end{aligned}$$
(19)

where c is the cohesion, \(\varphi \) is the internal friction angle, while \(\sigma _n\) is the normal stress. It is important to note that c and \(\varphi \) are not fundamental properties of the material, since both depend on the effective stress [41]. However, for the purposes of the present study, it is sufficient to treat c and \(\varphi \) as material constants.

Assuming that \(\sigma _n = p\), the granular material model takes the final form

$$\begin{aligned} \mu = {\left\{ \begin{array}{ll} \mu _{\infty } + (c+p\tan \varphi )/\dot{\gamma }, &{} \mu < \mu _{\text {solid}}, \\ \mu _{\text {solid}}, &{} \mu \ge \mu _{\text {solid}}, \\ \end{array}\right. } \end{aligned}$$
(20)

where \(\mu _{\text {solid}}\) is introduced for reasons of numerical efficiency: it avoids extremely high values of viscosity, which would lead to extremely small time steps through the viscous criterion in Eq. (12).
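The regularized viscosity of Eq. (20) can be sketched as a small Python function. The function name and argument order are hypothetical, and two details are assumptions of this sketch rather than part of the model description: negative pressures are clamped to zero before evaluating the yield stress, and a vanishing shear strain rate is mapped directly to \(\mu _{\text {solid}}\).

```python
import math

def granular_viscosity(gamma_dot, p, mu_inf, mu_solid, cohesion, phi_deg):
    """Effective viscosity of Eq. (20), capped at mu_solid."""
    # Assumption of this sketch: tensile (negative) pressure gives no frictional strength
    p_eff = max(p, 0.0)
    tau_yield = cohesion + p_eff * math.tan(math.radians(phi_deg))  # Eq. (19)
    if gamma_dot <= 0.0:
        # Assumption of this sketch: undeformed material is treated as solid
        return mu_solid
    mu = mu_inf + tau_yield / gamma_dot
    return min(mu, mu_solid)

# Parameters in CGS units as quoted in Sect. 5.1: mu_inf = 0.01, mu_solid = 2000,
# c = 0 and phi = 30 degrees; the pressure and shear rates are arbitrary examples
mu_flowing = granular_viscosity(100.0, 1000.0, 0.01, 2000.0, 0.0, 30.0)
mu_resting = granular_viscosity(1.0e-3, 1000.0, 0.01, 2000.0, 0.0, 30.0)
```

In the example, the rapidly sheared material has a viscosity of a few g cm\(^{-1}\) s\(^{-1}\), while the nearly static material hits the cap \(\mu _{\text {solid}}\), so the viscosity spans several orders of magnitude within one simulation.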

4 Graphics processor unit implementation

Modern desktop CPUs, such as the Intel i7-4790K, have 4 physical cores (8 virtual cores via hyper-threading) with a base frequency of about 4 GHz. For comparison, modern desktop GPUs, such as the Nvidia GeForce GTX 980, have more than \(2\cdot 10^3\) cores with a base frequency of about 1 GHz. The potential advantage of using GPU accelerators for HPC is therefore clear. GPU cards were designed to accelerate the creation of images in a frame buffer in order to stream them onto a display, a task for which double precision was not needed. For this reason, most desktop GPUs are built to support mainly single precision calculations. It is possible to run tasks in double precision, but this results in a significant drop in performance. It is important to note that for most applications of the SPH approach the numerical errors related to the approximations used are much larger than the round-off errors, therefore many researchers decided to perform SPH calculations on GPUs in single precision, see [42,43,44,45]. However, since in our case the viscosity of a granular material can change by more than 5 orders of magnitude during a simulation, single precision is not sufficient. The influence of the floating point precision on the results is presented in Fig. 1.

Fig. 1. 2D granular collapse velocity fields [cm s\(^{-1}\)] at \(t=0.35\) s (\(a=0.55\); for details, see Sect. 5.1); the SPH simulations obtained with (a) single precision and (b) double precision floating point numbers

The problem of floating point precision in SPH modeling on GPUs has been recently discussed in [46, 47]. To avoid it, the authors proposed techniques such as cell-relative coordinates (to avoid problems in domains of high aspect ratios) or compensated algorithms such as the Kahan sum (to sum over large numbers of values). Unfortunately, none of the proposed algorithms could correct the problem of strongly inhomogeneous viscosity in the domain. Therefore, in the present work, to avoid inaccuracies, we decided to perform all calculations in double precision. The influence on the numerical efficiency is discussed in Sect. 5.7. For details about the GPU implementation see [43].
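For readers unfamiliar with compensated summation, the Kahan sum mentioned above can be illustrated with a short Python sketch. Single precision is emulated here by rounding every intermediate result through the standard `struct` module; the helper names and the test data are assumptions of this example, and the sketch is unrelated to the GPU code used in this work.

```python
import struct

def f32(x):
    """Round a Python float to the nearest single precision value."""
    return struct.unpack("f", struct.pack("f", x))[0]

def naive_sum_f32(values):
    """Plain running sum with every operation rounded to single precision."""
    total = f32(0.0)
    for v in values:
        total = f32(total + f32(v))
    return total

def kahan_sum_f32(values):
    """Kahan (compensated) sum with every operation rounded to single precision."""
    total = f32(0.0)
    comp = f32(0.0)  # running compensation for lost low-order bits
    for v in values:
        y = f32(f32(v) - comp)
        t = f32(total + y)
        comp = f32(f32(t - total) - y)
        total = t
    return total

# One large value followed by many small ones; the exact sum is 1.001
values = [1.0] + [1.0e-8] * 100000
naive = naive_sum_f32(values)
compensated = kahan_sum_f32(values)
```

In this example the naive single precision sum never moves away from 1, because each small term is below half the spacing of single precision numbers near 1, whereas the compensated sum recovers the result to within single precision accuracy. As stated above, such techniques help with long sums but did not resolve the issue caused by the strongly inhomogeneous viscosity.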

5 Numerical results

5.1 Introduction

The numerical experiments were performed by releasing initially vertical columns of granular material, see Fig. 2. The initial height, \(H_0\), was defined by the initial radius \(r_0 = 9.7\) cm and the aspect ratio parameter

$$\begin{aligned} a = \frac{H_0}{r_0}. \end{aligned}$$
(21)

The material density \(\varrho \) was 2.6 g cm\(^{-3}\). The angle of repose \(\varphi \) was \(30^{\circ }\) (except Sect. 5.5). The material was chosen as non-cohesive, \(c = 0\). These are properties of dry sand used in the experiment of Lube et al. [48].

Fig. 2. Initial configuration in 3D

Initially the granular column is placed in the middle of the base of the rectangular domain with edges (5L, 5L, L) in 3D or (5L, L) in 2D, where L is the domain height. For \(a<0.9\) we have chosen \(L=10\) cm, while for the other cases \(L=1.2H_0\).

The SPH simulations were performed for different aspect ratios and different numerical resolutions (for the convergence analysis see “Appendix”). In 2D: \(a=0.35\)-9.3, \(L/h=16\)-64 and \(h/\Delta r=2\). In 3D: \(a=0.35\)-9.3, \(L/h=16\)-32 and \(h/\Delta r=1.5625\) (lower than in 2D for efficiency reasons). In both cases, we decided to use the Wendland kernel (to avoid problems with the tensile instability). The speed of sound, c in Eq. (11), was 1000 cm s\(^{-1}\). The parameter \(\mu _{\infty }\) in Eq. (20) was chosen as the viscosity of water (at \(20^{\circ }\)C), 0.01 g cm\(^{-1}\) s\(^{-1}\). The viscosity of the solid, \(\mu _{\text {solid}}\), was chosen as 2000 g cm\(^{-1}\) s\(^{-1}\). For efficiency reasons, see Eq. (12), this value of \(\mu _{\text {solid}}\) is deliberately lower than that of real soil. The side effects of this choice are negligible. The calculations were performed using double precision floating point numbers, with single precision calculations used only to benchmark the numerical efficiency in Sect. 5.7.
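The geometric and resolution parameters listed above are related by simple arithmetic, which the following Python sketch makes explicit. The function name and the returned dictionary are hypothetical; the rule for the domain height (\(L=10\) cm for \(a<0.9\), \(L=1.2H_0\) otherwise) and the value \(r_0 = 9.7\) cm are taken from the text.

```python
def column_setup(a, l_over_h, h_over_dr, r0=9.7):
    """Derive column height, domain height, smoothing length and particle spacing (cm)."""
    h0 = a * r0                                   # Eq. (21)
    domain_l = 10.0 if a < 0.9 else 1.2 * h0      # domain height L
    h = domain_l / l_over_h                       # smoothing length
    dr = h / h_over_dr                            # initial particle spacing
    return {"H0": h0, "L": domain_l, "h": h, "dr": dr}

# Finest 2D resolution for a low column, and finest 3D resolution for a taller one
setup_2d = column_setup(a=0.55, l_over_h=64, h_over_dr=2.0)
setup_3d = column_setup(a=2.75, l_over_h=32, h_over_dr=1.5625)
```

Since the smoothing length scales with the domain height, taller columns are resolved with proportionally larger particles at a fixed value of \(L/h\).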

5.2 Shape evolution

The simplest way to check whether the proposed model gives correct results is to compare the calculated profile shapes with other numerical and experimental data. In 2D, we chose the DEM calculations of Utili et al. [12] as reference data. The obtained results, for aspect ratios \(a=0.93\) and \(a=5.91\), are presented in Fig. 3.

Fig. 3. The evolution of profiles for different initial aspect ratios, a, of granular columns (2D); the SPH results compared with the DEM reference data [12] (solid lines)

For \(a=0.93\) the SPH results show good agreement with the reference data. In the case of the higher aspect ratio, \(a=5.91\), the SPH calculations slightly differ from the DEM calculations. There are two reasons for this discrepancy: imperfectly chosen parameters of the rheological model and two different flow regimes for \(a=0.93\) and \(a=5.91\). In the first regime (lower a), most of the internal part of the deposit remains undisturbed throughout the motion. In the latter case, almost the entire column deforms, which makes the impact of the unsuited viscosity (underestimated \(\mu _\infty \)) more visible. Direct measurements of the rheological parameters should help to obtain more accurate results.

Fig. 4. The evolution of granular material column for different initial aspect ratios: \(a=0.55\) and 2.75; the SPH results compared with the experimental photographs by Lube et al. [48]

For the 3D model, it was decided to compare the SPH results with the experimental data by Lube et al. [48]. For validation purposes we have chosen two different aspect ratios: \(a=0.55\) and \(a=2.75\). The results are presented in Fig. 4.

The performed simulations gave very realistic results. Using the SPH approach, we were able to reproduce the circular discontinuity on the surface of the column that is typical for \(a<0.74\), see [48], and that separates an outer (slumping) region from a non-deformed inner part of the deposit. For the column of \(a=2.75\), the calculated shape evolution of the column agrees very well with the experimental data presented in detail in [48] (Fig. 4).

5.3 Run-out distances

One of the most relevant validation criteria for the numerical model is its ability to reproduce the scaling laws for the run-out distance. An experimental scaling law for the three-dimensional column was first proposed by Lube et al. [48]:

$$\begin{aligned} \frac{r_\infty - r_0}{r_0} \simeq {\left\{ \begin{array}{ll} 1.24 a, &{} a < 1.7, \\ 1.6 a^{1/2}, &{} a \ge 1.7. \\ \end{array}\right. } \end{aligned}$$
(22)

Slightly different results were obtained by Lajeunesse et al. [5] for semi-circular (half of column) initial configuration:

$$\begin{aligned} \frac{r_\infty - r_0}{r_0} \simeq {\left\{ \begin{array}{ll} a, &{} a < 3, \\ a^{1/2}, &{} a \ge 3. \\ \end{array}\right. } \end{aligned}$$
(23)

Many more empirical and numerical experiments were performed for granular collapses of two-dimensional columns. Lube et al. [49] released granular columns confined between two vertical walls. The authors obtained the following scaling:

$$\begin{aligned} \frac{r_\infty - r_0}{r_0} \simeq {\left\{ \begin{array}{ll} 1.2 a, &{} a < 2.3, \\ 1.9 a^{2/3}, &{} a \ge 2.3. \\ \end{array}\right. } \end{aligned}$$
(24)

Independently Lajeunesse et al. (2005) obtained a similar result:

$$\begin{aligned} \frac{r_\infty - r_0}{r_0} \simeq {\left\{ \begin{array}{ll} a, &{} a < 3, \\ a^{2/3}, &{} a \ge 3. \\ \end{array}\right. } \end{aligned}$$
(25)

The two-dimensional numerical experiments were performed by Staron and Hinch [8] who obtained the relation:

$$\begin{aligned} \frac{r_\infty - r_0}{r_0} \simeq {\left\{ \begin{array}{ll} 2.5 a, &{} a < 2, \\ 3.25 a^{0.7}, &{} a \ge 2. \\ \end{array}\right. } \end{aligned}$$
(26)

Many other authors including Zenit [11], Utili et al. [12] who used the DEM approach and Crosta et al. [14] who performed FEM simulations obtained fairly consistent results for the run-out distance.

The obtained relations between the normalized run-out distance and the initial column aspect ratio for the 2D and 3D cases are presented in Figs. 5 and 6, respectively. As the reference data we decided to plot the two- and three-dimensional experimental scaling laws obtained by Lube et al. [48, 49].

Here we show that in the case of 2D columns the normalized run-out distances for higher aspect ratios are slightly overestimated when compared with the experimental data [49]. In the present SPH approach, we obtained:

$$\begin{aligned} \frac{r_\infty - r_0}{r_0} \simeq {\left\{ \begin{array}{ll} 1.22 a, &{} a < 2.3, \\ 1.57 a^{4/5}, &{} a \ge 2.3. \\ \end{array}\right. } \end{aligned}$$
(27)

However, in general, the results show a good agreement with the reference data.

Fig. 5. The non-dimensional incremental run-out distance as a function of the aspect ratio for two-dimensional results; the SPH solution compared with two- [49] and three-dimensional [48] experimental data

In the case of 3D columns, we obtained somewhat less accurate results compared with the reference data [48]. The SPH results appear to be underestimated. One reason for this behavior is the lower numerical resolution compared with the 2D simulations. The obtained scaling law is:

$$\begin{aligned} \frac{r_\infty - r_0}{r_0} \simeq {\left\{ \begin{array}{ll} 0.72 a, &{} a < 1.7, \\ 1.02 a^{3/5}, &{} a \ge 1.7. \\ \end{array}\right. } \end{aligned}$$
(28)

In both the 2D and 3D cases, we obtained the scaling laws by fitting to the highest resolution SPH results.
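The piecewise scaling laws can be compared numerically. The Python sketch below encodes Eqs. (22), (24), (27) and (28) with the coefficients given in the text; the function name, the dictionary keys and the tuple layout are hypothetical choices made for this illustration.

```python
def runout(a, k_low, a_crit, k_high, exponent):
    """Normalized run-out (r_inf - r_0)/r_0 for a piecewise power law."""
    if a < a_crit:
        return k_low * a
    return k_high * a ** exponent

# (k_low, a_crit, k_high, exponent) for each scaling law
LAWS = {
    "experiment_3d": (1.24, 1.7, 1.6, 0.5),       # Eq. (22)
    "experiment_2d": (1.2, 2.3, 1.9, 2.0 / 3.0),  # Eq. (24)
    "sph_2d": (1.22, 2.3, 1.57, 0.8),             # Eq. (27)
    "sph_3d": (0.72, 1.7, 1.02, 0.6),             # Eq. (28)
}

def final_radius(a, law, r0=9.7):
    """Final deposit radius r_inf in cm for the initial radius r0."""
    return r0 * (1.0 + runout(a, *LAWS[law]))

# Ratio of SPH to experimental run-out at the largest aspect ratio studied
ratio_2d = runout(9.3, *LAWS["sph_2d"]) / runout(9.3, *LAWS["experiment_2d"])
ratio_3d = runout(9.3, *LAWS["sph_3d"]) / runout(9.3, *LAWS["experiment_3d"])
```

At \(a = 9.3\) the 2D SPH law lies above the 2D experimental law and the 3D SPH law lies below the 3D experimental one, consistent with the over- and underestimation discussed in this section.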

Fig. 6. The non-dimensional incremental run-out distance as a function of the aspect ratio for three-dimensional results; the SPH solution compared with two- [49] and three-dimensional [48] experimental data

5.4 Energy contribution

The potential energy of the column at any time is

$$\begin{aligned} E_p = \sum _a m_a g h_a, \end{aligned}$$
(29)

where \(h_a\) is the height of the particle a. The column kinetic energy can be calculated from

$$\begin{aligned} E_k = \frac{1}{2} \sum _a m_a u_a^2. \end{aligned}$$
(30)

Due to the viscosity, or (in the scale of a single grain) non-elastic collisions between grains, a part of the potential energy does not transform into the kinetic energy, but it gets dissipated

$$\begin{aligned} E_{dis}(t) = E_p(0) - E_p(t) - E_k(t). \end{aligned}$$
(31)
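The energy budget of Eqs. (29)–(31) amounts to three particle sums. The following Python sketch is a minimal illustration; the function name, the argument layout and the two-particle example data are hypothetical.

```python
def column_energies(mass, height, speed, g, e_p_initial):
    """Return (E_p, E_k, E_dis) of Eqs. (29)-(31) for one time instant."""
    e_p = sum(m * g * h for m, h in zip(mass, height))
    e_k = 0.5 * sum(m * u * u for m, u in zip(mass, speed))
    e_dis = e_p_initial - e_p - e_k
    return e_p, e_k, e_dis

# Hypothetical two-particle example in CGS units (g = 981 cm s^-2)
MASS = [1.0, 2.0]
e_p0, _, _ = column_energies(MASS, [2.0, 1.0], [0.0, 0.0], 981.0, 0.0)
e_p, e_k, e_dis = column_energies(MASS, [1.0, 0.5], [10.0, 10.0], 981.0, e_p0)
```

Dividing each quantity by the initial potential energy gives the normalized energies, which makes columns of different aspect ratios directly comparable.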

At the beginning, the entire energy is stored as potential energy. As time passes, particles start to fall, with the potential energy being transformed into kinetic energy and some heat (dissipation). Figure 7 (left) shows the energy evolution of the granular column for the aspect ratio \(a=3.26\). In this plot the SPH results are compared with the DEM simulations obtained in [12]. In both models the kinetic energy exhibits a peak at about \(t/T = 1\). The obtained results show small discrepancies between SPH and DEM; however, these differences can be reduced by adjusting the minimal and maximal viscosity in the considered rheological model.

In order to validate the SPH approach for different values of the aspect ratio, we compared the total dissipated energy calculated using the SPH method for different values of a with the DEM simulations [12]. The obtained results are presented in Fig. 7 (right). Both the SPH and DEM models give a similar relation between the dissipated energy and the aspect ratio. A small overestimation by the SPH results is also observed in Fig. 7 (left).

Fig. 7. (Left) energy evolution for the granular column of the aspect ratio \(a=3.26\); (right) total energy dissipated during the flow as a function of the aspect ratio; energy normalized by the initial potential energy; the SPH result compared with the DEM reference data [12]

Fig. 8. Active failure state of the granular sample; the results obtained for three different internal friction angles: (a) \(20^\circ \), (b) \(30^\circ \) and (c) \(40^\circ \); the black markers indicate the position of particles that move with a velocity between 9.5 and \(10.5\%\) of the maximal value

5.5 Inclination of the failure plane

According to Rankine’s theory of earth pressure [50], the inclination of the failure plane to the horizontal, \(\theta _f\), can be approximated as:

$$\begin{aligned} \theta _f = 45^{\circ } + \frac{\varphi }{2}. \end{aligned}$$
(32)

In order to check whether the SPH approach can correctly predict the relation (32), we decided to perform three simulations of granular column collapse with different values of the internal friction angle \(\varphi =20^{\circ }\), \(30^{\circ }\), and \(40^{\circ }\). It is important to note that values \(20^{\circ }\) and \(40^{\circ }\) correspond to the extreme values observed in nature.

To visualize the inclination angle of the slope failure plane, we decided to plot the velocity fields using the spectral color map, see Fig. 8. The presented results were obtained for the collapsing column of the aspect ratio \(a=0.55\) at \(t = 40\) ms. In order to compare the SPH results with the theoretical prediction (32), we decided to define the failure plane as the surface on which the velocity equals an arbitrarily chosen \(10\%\) of the maximal value. Particles that move with a velocity between 9.5 and \(10.5\%\) of the maximal value are marked in black. The obtained failure angles agree well with the relation (32). However, we observed two issues. The first is the lower accuracy of the failure angles for the smallest value of the internal friction angle. The second is a small systematic decline of the \(\theta _f\) values for very late time steps. One of the reasons for these issues is the definition of the failure plane, which is arbitrarily chosen. Moreover, in the SPH method the physical fields are smoothed over the range of h, which makes the definition of the failure plane ambiguous. The other reason, but only for the latter issue, is a decrease of the height of the deposit caused by the lowered \(\mu _{\text {solid}}\) (compared with reality), adopted for numerical efficiency, see Sect. 5.1. Similar tests performed for different values of the aspect ratio show no significant differences in the relation between \(\theta _f\) and \(\varphi \).
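One possible way to turn the marked particles into a single failure angle is to fit a straight line through their positions and compare its inclination with Eq. (32). The Python sketch below does this with a principal-axis (total least squares) fit; this post-processing step, the function names and the synthetic marker positions are assumptions of this illustration and are not described in the text above.

```python
import math

def rankine_angle(phi_deg):
    """Theoretical inclination of the failure plane, Eq. (32), in degrees."""
    return 45.0 + 0.5 * phi_deg

def fitted_plane_angle(points):
    """Inclination (degrees, to the horizontal) of the principal axis of the points."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    syy = sum((y - mean_y) ** 2 for _, y in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    # Orientation of the principal axis of the scatter; robust for steep lines
    angle = 0.5 * math.atan2(2.0 * sxy, sxx - syy)
    return abs(math.degrees(angle))

# Synthetic markers lying exactly on a plane inclined at 60 degrees
theta = math.radians(60.0)
markers = [(t * math.cos(theta), t * math.sin(theta)) for t in range(5)]
measured = fitted_plane_angle(markers)
```

A principal-axis fit is used instead of an ordinary regression of y on x because the failure planes are steep, and a regression on x becomes ill-conditioned as the inclination approaches the vertical.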

5.6 Pressure distribution

Figure 9a presents the pressure field calculated for the 2D granular collapse test case with \(a=0.55\) (the velocity field is presented in Fig. 1). The obtained result shows very high pressure fluctuations in the range from \(-2\) kPa to 5 kPa, while the highest expected value is about 1.15 kPa. These non-physical pressure fluctuations result both from the use of the weakly compressible approach and from the reduced number of particles under the kernel hat in the regions near the free surface. In order to remove the observed non-physical pressure pulsations, we decided to filter out the particles with negative pressures and those with a pressure exceeding the value that corresponds to a density increase of \(5\%\), see Eq. (11). The results are presented in Fig. 9b. The application of the filter revealed that the pressure fluctuations affect only the regions near the free surface. The internal part of the medium is free from any non-physical pulsations, including the short-length-scale noise described in [21]. The reason for the absence of this instability is the stabilizing role of the viscosity, which reaches very high values in the regions with a low shear strain rate, see [16, 20, 21].
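The filtering rule described above can be sketched in a few lines of Python. The function names and the example pressure values are hypothetical; the upper bound is computed from Eq. (11) for a \(5\%\) density increase, with \(\gamma = 7\) as stated in Sect. 2.

```python
def tait_pressure(rho, rho0, c, gamma=7.0):
    """Pressure from Tait's equation of state, Eq. (11)."""
    return c * c * rho0 / gamma * ((rho / rho0) ** gamma - 1.0)

def pressure_filter_mask(pressures, rho0, c, gamma=7.0, max_density_increase=0.05):
    """Return True for particles kept by the filter, False for those removed."""
    p_max = tait_pressure(rho0 * (1.0 + max_density_increase), rho0, c, gamma)
    return [0.0 <= p <= p_max for p in pressures]

# Hypothetical particle pressures in CGS units (g cm^-1 s^-2)
sample = [-1.0, 0.0, 1000.0, 2.0e5]
mask = pressure_filter_mask(sample, rho0=2.6, c=1000.0)
```

The filter only affects how the pressure field is displayed; it does not modify the particle pressures used in the time integration.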

Fig. 9. 2D granular collapse (\(a=0.55\)) pressure fields [kPa] at \(t=0.35\) s; (a) the results obtained directly from the presented SPH approach, (b) the pressure field subjected to the filtration of non-physical pulsations

5.7 Numerical efficiency

For the performance analysis we decided to use the Nvidia GeForce GTX 980 GPU (2048 cores, 1126 MHz clock, 4 GB of memory). Figure 10 presents the measured Frames (time steps) Per Second (FPS) as a function of the number of particles in the domain. In 2D, the use of double precision floating point numbers decreases the performance by a factor of about 2. In double precision the calculations took from about 4 min for about \(N=4 \cdot 10^3\) particles up to about 2 h for \(N=2.5\cdot 10^5\). In 3D, the use of double precision numbers decreases the performance by a factor of 3. For the lowest resolution used (\(N=2.5 \cdot 10^4\)) the simulations took about 1 h, while for the highest resolution (\(N=1.5 \cdot 10^6\)) they took about 75 h (double precision). It is important to note that the GPU used is based on the Maxwell micro-architecture, in which the double precision performance is 1/32 of the single precision performance. Therefore, the main reason for the observed decrease of performance with double precision in 3D (compared with single precision) is the much larger number of interactions under the kernel hat (a larger number of simple math operations). The difference between the double and single precision performance should be much smaller on the Kepler micro-architecture (e.g. Nvidia Tesla K80 or Nvidia GeForce 700 series), where the double precision computations are only 4 times less efficient. The numerical efficiency can be further improved by using more than one GPU.

Fig. 10. Number of frames per second as a function of the number of SPH particles in the domain; the initial aspect ratio is \(a=0.55\); (a) two-dimensional results for \(h/\Delta r = 2\), (b) three-dimensional results for \(h/\Delta r = 1.5625\)

6 Conclusions

In the present work, the ability to model granular materials using the SPH method and the visco-plastic model has been studied. For this purpose, a set of numerical calculations (in 2D and 3D) of the fundamental problem of the collapse of initially vertical cylinders of granular materials has been performed. In order to validate the proposed model, the granular deposit evolution, the run-out distances, the energy contribution and the inclination of the failure plane were compared with analytical, experimental and other numerical data. The obtained results showed good agreement with the reference data. The inaccuracies that we observed during the simulations were caused mainly by two factors: imperfectly matched parameters of the rheological model and too low a numerical resolution (limited hardware resources). In order to reduce the effect of particle clustering (the tensile instability), the problem reported in [18], we chose a kernel function that significantly reduces this problem. In fact, the tensile instability was not observed in the obtained results. Taking advantage of GPU efficiency, it was possible to run computationally heavy simulations on a cheap desktop computer. The performed analysis showed that single precision floating point arithmetic is not sufficient to correctly perform simulations with the rheological model used. The double precision calculations increase the computational effort on the GPU, but the obtained numerical efficiency is still very high. It is important to note here that for dry granular materials methods such as DEM allow for much more accurate and efficient calculations. However, when we consider complex debris flows composed of rocks and mud, for which it may be difficult to define the interaction between solid particles, continuum methods such as the presented SPH approach appear to be much more useful. Another advantage of the SPH approach is its ability to model complex multi-phase flows involving e.g. fluid-granular phase interactions. This work is an intermediate step in a larger project which aims at simulating the interaction of sea waves and currents with a seabed. The satisfactory results obtained for the dynamics of dry sand with the simple rheological model are an encouragement to continue in that direction.