1 Introduction

Blasting is one of the cheapest and most effective methods for fragmenting hard rock in open-pit mines and civil engineering. However, many researchers have concluded that only 25–30% of the explosive energy is directly involved in breaking rock; the rest produces undesirable side effects such as ground vibration (commonly measured as peak particle velocity, PPV), air overpressure, fly rock, and back break [1,2,3,4,5,6,7]. Of these side effects, ground vibration is considered the most dangerous for humans and the environment. Therefore, precise prediction of blast-induced PPV is an essential requirement in open-pit mines.

To assess and predict blast-induced PPV, many researchers have proposed empirical techniques [8,9,10,11,12,13,14,15,16,17,18]. These techniques are mostly based on mathematical statistics and use two input variables: the maximum explosive charge per delay (W) and the monitoring distance (R), which are considered the two most influential factors for blast-induced PPV [19]. However, their accuracy was low in some cases [20,21,22,23,24,25].

In recent years, various advanced techniques have been developed to predict and reduce the undesirable effects of blast-induced PPV in open-cast mining, including machine learning, artificial neural networks (ANNs), genetic algorithms, and fuzzy neural systems. Longjun et al. [26] used random forest (RF) and support vector machine (SVM) algorithms to predict blast-induced PPV, developing both models from 93 blasting events. Their results showed that both models were acceptable and that the SVM model predicted PPV better than the RF model on the testing dataset. Hasanipanah et al. [27] used classification and regression tree (CART), multiple regression (MR), and various empirical models to predict blast-induced PPV, based on 86 blasting events monitored at the Miduk copper mine (Iran). They concluded that the CART technique performed better than the empirical and MR models, with an RMSE of 0.17 and an R2 of 0.95. Chandar et al. [28] examined regression models and ANNs for predicting blast-induced PPV using 168 blasting events in three different mines (limestone, dolomite, and coal). Their results indicated that the ANN model was the best of the approaches considered, with an R2 of 0.878 for the three mines. Faradonbeh and Monjezi [29] also successfully developed two robust metaheuristic algorithms to predict blast-induced PPV using 115 blasting events. In their study, a predictive equation based on gene expression programming (GEP) was first developed to estimate blast-induced PPV. The established GEP model was then compared with a nonlinear multiple regression model and five general equations. Their results revealed that the GEP model was more efficient than the other models in predicting blast-induced PPV.

In addition, many other researchers have used these approaches to predict blast-induced PPV in open-cast mines with acceptable results [22, 23, 30,31,32,33,34,35,36]. Table 1 lists several soft computing techniques used to predict blast-induced PPV.

Table 1 Some studies of PPV prediction using soft computational models

Although many studies have applied artificial intelligence (AI) to PPV prediction in open-pit mines, no single approach or model is optimal for every site. In addition, AI approaches need to be continually developed and diversified. Thus, several ANN models for predicting blast-induced PPV at an open-pit mine in Vietnam were developed in this study. An empirical technique (the Ambraseys equation, Sect. 2.2) was also developed for comparison with the ANN technique.

The article is organized in four parts: Sect. 1 reviews a number of published studies and gives the motivation for this study; Sect. 2 summarizes the data and the methodology used; Sect. 3 presents the results and discussion; finally, Sect. 4 gives the conclusions and recommendations.

2 Data used and methodologies

2.1 Summary of the data used

As noted in Sect. 1, this research aims to develop an ANN model for predicting blast-induced PPV at an open-pit coal mine in Vietnam (Fig. 1). The explosive used at the mine is mainly ANFO, with an explosive capacity of 1360–6700 kg. The mine uses the non-electric delay blasting method for rock fragmentation. The geological structure of the mine is quite simple; the major components include clay, siltstone, sandstone, limestone, and coal. Cracks and faults are not present in the study area. The hardness of the rock mass is in the range of 8–11 on the Protodiakonov hardness scale. Therefore, blasting is an effective method for rock fragmentation in this mine.

Fig. 1 Overview of the site study

In this study, 68 blasting operations were recorded with three parameters: maximum explosive charge per delay (W), monitoring distance (R), and ground vibration (PPV). Of these parameters, W was extracted from the 68 blast designs, and a handheld GPS receiver was used to determine R. W and R were used as the input variables for predicting PPV. PPV values were measured with a Blastmate III instrument (Canada). The datasets used for this study are summarized in Table 2.

Table 2 Characteristics of the datasets used

Before the forecasting models were constructed, the data were split into two parts. Of the 68 observations, approximately 80% (56 observations) were used as the training dataset, and the remaining 12 observations were used as the testing dataset. The training dataset was used to develop the PPV predictive models. The testing dataset was treated as unseen data and used to evaluate the performance of the developed models.
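The split described above can be illustrated with a minimal sketch. The source does not state how the 56 training observations were chosen, so a seeded random shuffle is assumed here; the function name `split_train_test` and the placeholder records are hypothetical and are not the mine's data.

```python
import random

def split_train_test(records, n_train, seed=0):
    """Shuffle a copy of `records` and split it into training and testing lists."""
    if not 0 < n_train < len(records):
        raise ValueError("n_train must be between 1 and len(records) - 1")
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)  # assumption: simple random split
    return shuffled[:n_train], shuffled[n_train:]

# Placeholder records of the form (W in kg, R in m, PPV); values are invented.
records = [(1000.0 + 10 * i, 200.0 + 5 * i, 1.0 + 0.1 * i) for i in range(68)]
train, test = split_train_test(records, n_train=56)
print(len(train), len(test))  # 56 12
```

Fixing the seed keeps the split reproducible, so the same 12 observations stay unseen for every model being compared.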

2.2 Empirical technique

As mentioned in Sect. 1, empirical techniques are widely used for predicting blast-induced PPV in open-pit mines [5, 39], and many scholars have proposed empirical equations for this purpose [8,9,10,11,12,13,14,15,16,17,18]. The literature shows that the United States Bureau of Mines (USBM) empirical equation, proposed by Duvall and Petkof [8], is widely applied for estimating blast-induced PPV; however, it is rarely applied in Vietnam. In contrast, the empirical equation proposed by Ambraseys [10] is widely applied in Vietnam [48,49,50]. Therefore, the empirical equation of Ambraseys [10] was selected to represent the empirical technique for predicting blast-induced PPV in this study. It is described as follows:

$${\text{PPV}} = k\left( {\frac{R}{{\sqrt[3]{W}}}} \right)^{ - p} ,$$
(1)

where W denotes the maximum charge per delay (kg); R denotes the distance from the blast site (m); and k and p are site factors determined by multivariate regression analysis.
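One common way to estimate k and p is to take logarithms of Eq. (1), which gives the straight line ln(PPV) = ln(k) − p·ln(R/∛W), and then fit that line by ordinary least squares. The sketch below follows this approach; it is an assumption about the regression form and may differ from the procedure run in SPSS. The function names and the synthetic data are hypothetical.

```python
import math

def scaled_distance(R, W):
    """Cube-root scaled distance R / W^(1/3) used in Eq. (1)."""
    return R / W ** (1.0 / 3.0)

def fit_site_factors(W, R, ppv):
    """Least-squares fit of ln(PPV) = ln(k) - p * ln(SD); returns (k, p)."""
    x = [math.log(scaled_distance(r, w)) for r, w in zip(R, W)]
    y = [math.log(v) for v in ppv]
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return math.exp(intercept), -slope  # k = e^intercept, p = -slope

def predict_ppv(k, p, W, R):
    """Evaluate Eq. (1) for one blast."""
    return k * scaled_distance(R, W) ** (-p)

# Synthetic check: generate noise-free PPV from known factors, then recover them.
W = [1500.0, 2500.0, 4000.0, 6000.0, 3000.0]
R = [150.0, 300.0, 450.0, 600.0, 800.0]
ppv = [predict_ppv(200.0, 1.5, w, r) for w, r in zip(W, R)]
k, p = fit_site_factors(W, R, ppv)
print(round(k, 3), round(p, 3))  # 200.0 1.5
```

Fitting in log space minimizes relative rather than absolute errors, so the resulting site factors can differ slightly from those of a direct nonlinear fit on measured data.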

2.3 Overview of ANN

An artificial neural network (ANN) is one of the most advanced artificial intelligence (AI) techniques and is modeled on the human neural structure [51]. It connects neurons to solve problems from input signals with the help of computers [51]. The general structure of an ANN consists of three kinds of layers: an input layer, one or more hidden layers, and an output layer [51]. Figure 2 illustrates a conventional structure of an ANN model for predicting blast-induced PPV in this study. The neurons in each layer have different tasks and functions, and in principle the number of hidden layers and the number of neurons in each layer are unlimited. However, too many neurons will lead to overfitting, while too few neurons will not capture the properties of the data [52]. The number of hidden layers also affects the training time of the model: in theory, an ANN with two hidden layers can represent most problems, and adding more hidden layers increases the training time.

Fig. 2 General structure of an ANN model for predicting blast-induced PPV in this study

An ANN model works as follows. At the input layer, neurons receive the input signals. These are multiplied by weights, processed, and sent to the neurons of the first hidden layer via the transfer function. There, the neurons receive the results from the input layer, compute the weighted sums, and send them to the second hidden layer via the transfer function. The process continues until the results reach the output layer, which gives the final output [53].
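The layer-by-layer propagation described above can be sketched in a few lines. The source does not name the transfer function or the weight initialization, so the sketch assumes tanh on the hidden layers, a linear output, and random weights in [-1, 1]; `init_network` and `forward` are hypothetical names, and no training is performed.

```python
import math
import random

def init_network(layer_sizes, seed=0):
    """Random weights and zero biases for a fully connected network."""
    rng = random.Random(seed)
    layers = []
    for n_in, n_out in zip(layer_sizes, layer_sizes[1:]):
        weights = [[rng.uniform(-1.0, 1.0) for _ in range(n_in)]
                   for _ in range(n_out)]
        biases = [0.0] * n_out
        layers.append((weights, biases))
    return layers

def forward(layers, inputs):
    """Propagate inputs layer by layer: tanh on hidden layers, linear output."""
    activations = list(inputs)
    for index, (weights, biases) in enumerate(layers):
        # Weighted sum of the previous layer's outputs plus the bias.
        sums = [sum(w * a for w, a in zip(row, activations)) + b
                for row, b in zip(weights, biases)]
        is_output = index == len(layers) - 1
        activations = sums if is_output else [math.tanh(s) for s in sums]
    return activations

# A 2-5-3-1 network: two inputs (W, R), two hidden layers, one output (PPV).
network = init_network([2, 5, 3, 1])
output = forward(network, [0.4, 0.7])  # inputs assumed already scaled
print(len(output))  # 1
```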

The results of the ANN model depend heavily on the learning process of the network, also known as the training process. ANN learning is of two types: supervised and unsupervised [54]. PPV prediction is a regression problem on numerical data, so supervised learning, based on the input data and the required outputs, is generally used.

In this study, five ANN models with one, two, and three hidden layers were considered and developed to predict blast-induced PPV.

2.4 Establishing the predictive models

For the empirical technique, the Ambraseys empirical equation was applied according to Eq. (1), using the 56 blasting events in the training dataset. A sampling procedure was repeated five times to create five different sets of data, and five empirical models for predicting blast-induced PPV were then established. SPSS software version 16.0 [55] was used for the multivariate regression analysis that determined the site factors k and p. The regression results and the five empirical models are given in Table 3. The performance of the empirical models is discussed in the next section.

Table 3 Site factors and empirical equations for predicting PPV in this site study

For the ANN models, a “trial-and-error” procedure with five ANN models was employed to predict blast-induced PPV at the study site. Models with one, two, and three hidden layers were considered. The number of neurons in each hidden layer was kept in the range of 3–10 to avoid overly complex ANN models, and the 56 blasting events of the training dataset were used to develop the models. As a result, five ANN models were developed: ANN 2-5-1, ANN 2-5-3-1, ANN 2-8-6-1, ANN 2-8-6-4-1, and ANN 2-10-8-5-1. The structure of the ANN models is given in Fig. 3.

Fig. 3 ANN models for predicting blast-induced PPV in this site study

In Fig. 3, I1 and I2 are the input parameters corresponding to W and R; H1 to H10 are the neurons in each hidden layer; O1 is the predicted output value (PPV); and B1 to B4 are bias nodes that apply constant values to the nodes of each layer. Black lines represent positive weights, and gray lines represent negative weights. Line thickness is proportional to the magnitude of the weight relative to all others.
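To give a sense of how the five architectures differ in size, the following sketch counts the adjustable parameters (weights plus one bias per non-input neuron) of each network. It assumes fully connected layers, as Fig. 3 suggests; `count_parameters` is a hypothetical helper name.

```python
def count_parameters(architecture):
    """Number of weights plus biases in a fully connected network such as '2-5-3-1'."""
    sizes = [int(s) for s in architecture.split("-")]
    # Each layer transition contributes n_in * n_out weights and n_out biases.
    return sum(n_in * n_out + n_out for n_in, n_out in zip(sizes, sizes[1:]))

architectures = ["2-5-1", "2-5-3-1", "2-8-6-1", "2-8-6-4-1", "2-10-8-5-1"]
for name in architectures:
    print(name, count_parameters(name))
# 2-5-1 21
# 2-5-3-1 37
# 2-8-6-1 85
# 2-8-6-4-1 111
# 2-10-8-5-1 169
```

Such counts can be useful when weighing model complexity against the 56 training events.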

3 Results and discussion

To compare and evaluate the quality of the predictive models, two performance indices were used in this study: the root-mean-squared error (RMSE) and the coefficient of determination (R2), defined by Eqs. (2)–(3) as follows:

$${\text{RMSE}} = \sqrt {\frac{1}{n}\sum\limits_{i = 1}^{n} {(y_{\text{PPV}} - \hat{y}_{\text{PPV}} )^{2} } }$$
(2)
$$R^{2} = 1 - \frac{{\sum\nolimits_{i} {(y_{\text{PPV}} - \hat{y}_{\text{PPV}} } )^{2} }}{{\sum\nolimits_{i} {(y_{\text{PPV}} - \bar{y})^{2} } }}$$
(3)

where n is the total number of observations, and \(y_{\text{PPV}}\), \(\hat{y}_{\text{PPV}}\) and \(\overline{y}\) are the measured PPV values, the predicted PPV values, and the mean of the measured values, respectively.
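Eqs. (2) and (3) translate directly into code. The sketch below is a plain-Python illustration; the function names and the four sample values are hypothetical and are not taken from the study's datasets.

```python
import math

def rmse(measured, predicted):
    """Root-mean-squared error, Eq. (2)."""
    n = len(measured)
    return math.sqrt(sum((y - yh) ** 2 for y, yh in zip(measured, predicted)) / n)

def r_squared(measured, predicted):
    """Coefficient of determination, Eq. (3)."""
    mean_y = sum(measured) / len(measured)
    ss_res = sum((y - yh) ** 2 for y, yh in zip(measured, predicted))
    ss_tot = sum((y - mean_y) ** 2 for y in measured)
    return 1.0 - ss_res / ss_tot

# Invented values: every prediction is off by 0.5.
measured = [2.0, 4.0, 6.0, 8.0]
predicted = [2.5, 3.5, 6.5, 7.5]
print(rmse(measured, predicted))                 # 0.5
print(round(r_squared(measured, predicted), 3))  # 0.95
```

A lower RMSE and an R2 closer to 1 both indicate a better fit, which is the convention used when the models are ranked.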

3.1 Empirical models

For the empirical models given in Table 3, the RMSE and R2 performance indices were calculated according to Eqs. (2) and (3) on both the training and testing datasets. The performance of the empirical models is shown in Table 4.

Table 4 Performance indices of the empirical models

Table 4 shows that the empirical models perform reasonably well in predicting blast-induced PPV in this study. However, it is difficult to judge which of the five empirical models is the best. Therefore, a simple ranking method was applied to find the optimal empirical model. The ranking results for the empirical models are shown in Table 5, and the total ranking over both the training and testing datasets is presented in Table 6.

Table 5 Performance indices of the empirical models and their ranking
Table 6 Total rank of the empirical models
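The simple ranking method can be sketched as follows. It assumes that, for each performance index, the best model receives the highest score (equal to the number of models) and the worst receives 1, and that the scores are summed over the training and testing indices; this reading is inferred from the reported totals. The function names and indicator values are hypothetical and are not taken from Tables 4–9.

```python
def rank_scores(values, higher_is_better):
    """Score the best value len(values) and the worst 1 (ties are not handled)."""
    order = sorted(range(len(values)), key=lambda i: values[i],
                   reverse=not higher_is_better)  # worst value first
    scores = [0] * len(values)
    for position, index in enumerate(order):
        scores[index] = position + 1
    return scores

def total_rank(indicators):
    """Sum per-indicator scores; `indicators` maps name -> (values, higher_is_better)."""
    n_models = len(next(iter(indicators.values()))[0])
    totals = [0] * n_models
    for values, higher_is_better in indicators.values():
        for i, score in enumerate(rank_scores(values, higher_is_better)):
            totals[i] += score
    return totals

# Invented indicator values for three models.
indicators = {
    "rmse_train": ([1.2, 1.0, 1.5], False),   # lower RMSE is better
    "r2_train": ([0.86, 0.85, 0.70], True),   # higher R2 is better
    "rmse_test": ([1.1, 1.3, 1.6], False),
    "r2_test": ([0.82, 0.78, 0.65], True),
}
print(total_rank(indicators))  # [11, 9, 4]
```

With five models and four indices, the total under this scheme ranges from 4 to 20.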

Table 6 shows that empirical model No. 1 was the best empirical model for predicting blast-induced PPV in this study, with a total ranking of 14. Empirical models No. 2 and No. 4 also performed well, with slightly lower performance than empirical model No. 1. Empirical model No. 3 provided the poorest performance, with a total ranking of 9. Figure 4 illustrates the relationship between the measured values and the values predicted by the empirical models.

Fig. 4 Predicted and measured PPV in this site study by empirical models

3.2 The ANN models

This section examines the performance of the five ANN models developed in Sect. 2.4: ANN 2-5-1, ANN 2-5-3-1, ANN 2-8-6-1, ANN 2-8-6-4-1, and ANN 2-10-8-5-1. For each ANN model, the performance indices were computed using Eqs. (2)–(3). Note that all ANN models used the same training and testing datasets. The performance of the ANN models is shown in Table 7.

Table 7 Performance indices of the ANN models

Table 7 shows that the ANN models performed better than the empirical models in predicting blast-induced PPV in this study. However, it is difficult to select the optimal ANN model among the five developed. Thus, the simple ranking method was also applied to the ANN models. Table 8 presents the performance of the ANN models and their ranking on both the training and testing datasets. Table 9 summarizes the total ranking of each ANN model.

Table 8 Performance indices of the ANN models and their ranking
Table 9 Total rank of the ANN models

Table 9 shows that ANN model No. 5 (2-10-8-5-1) provided the best performance, with a total ranking of 18. ANN model No. 4 (2-8-6-4-1) was also useful for predicting PPV, with slightly lower performance than ANN model No. 5. The ANN model with one hidden layer (2-5-1) yielded the poorest performance among the five ANN models used in this study. Figure 5 illustrates the relationship between the measured values and the values predicted by the ANN models.

Fig. 5 Measured versus predicted PPV in this study by the ANN models

3.3 Performance evaluation between the ANN model and empirical model

Based on the above results, two PPV predictive models were selected: empirical model No. 1, representing the empirical technique, and the ANN 2-10-8-5-1 model, representing the ANN technique. A comparison of the two selected models is given in Table 10. The selected ANN model performed better than the selected empirical model in terms of both RMSE and R2. Figure 6 shows that the PPV values predicted by the ANN model were closer to the actual values than those predicted by the empirical model.

Table 10 Comparison of performance between the selected empirical and ANN models
Fig. 6 A comparison of predicted values by empirical and ANN models

4 Conclusion and remarks

Blasting is an integral part of open-pit mining technology. However, its side effects, especially ground vibration (PPV), are dangerous for the surrounding environment. Therefore, accurate prediction of blast-induced PPV is essential to minimize these undesirable effects. The following conclusions are drawn from the results of this study:

  • ANN is an advanced and robust technique that should be used to predict blast-induced PPV in open-pit mines. This study has developed a robust ANN model with high accuracy (RMSE = 0.738, R2 = 0.964). It should be applied in practical engineering to control the undesirable effects on the surrounding environment. However, developing ANN models for open-pit mines is often complicated and requires the user to have an understanding of mathematics and programming.

  • The ANN model with one hidden layer should not be used to predict PPV because it does not capture all the characteristics of the data, which makes the forecasting model inaccurate. Based on the results of this study, the ANN model with three hidden layers is proposed for predicting blast-induced PPV in practical engineering. ANN models with more hidden layers should be considered in future work on predicting blast-induced PPV.

  • The empirical technique is a rapid and straightforward method of estimating blast-induced PPV, but further research is needed to improve the accuracy of empirical models.

  • Other influential parameters should be considered and added to improve the accuracy of PPV predictive models, especially for empirical methods.

The results of this study provide a basis for developing blast-induced PPV predictive models for other open-pit mines with similar conditions. They are also useful for managers, engineers, and blasters in optimizing blasting efficiency and minimizing the negative impacts of blasting operations in open-pit mines.