1 Introduction

Open-pit mining is a technology for recovering mineral resources from the ground; it comprises drilling, blasting, loading, and hauling operations [1]. The mining process usually removes the overburden to expose the mineral deposits, or fragments hard rock for subsequent operations (loading, transporting, or dumping), and blasting remains one of the most effective methods for doing so. To fragment rock by the drilling–blasting method, boreholes of various diameters (45–250 mm) are drilled and charged with explosive. Blasting accessories such as blasting caps, detonating fuse, primers, and signal tubes are used to detonate the boreholes [2]. When initiated, the energy of the explosive strikes and breaks the rock. However, not all of the explosive energy is used to break rock [3]. According to previous studies, up to 80–85% of the explosive energy is wasted and produces adverse effects such as ground vibration (quantified by the peak particle velocity, PPV), air overpressure, fly rock, dust, and toxic gases [4,5,6,7,8]. Of these side effects, ground vibration is considered the most dangerous [9,10,11]. It can shake and damage surrounding structures, destabilize slopes and benches, and affect railways and groundwater (Fig. 1). Accurate prediction of blast-induced PPV is therefore required for the following reasons:

Fig. 1 Blast-induced ground vibration for rock fragmentation

  • Ensuring that levels of vibration do not cause damage to neighboring property;

  • Preventing annoyance to others by maintaining the lowest possible levels;

  • Demonstrating compliance with conditions;

  • Reducing the undesirable effects on the environment.

To predict blast-induced PPV, the parameters that influence it must first be identified. Many parameters influence blast-induced PPV; they fall into controllable and uncontrollable groups [12,13,14]. The controllable parameters include (but are not limited to) maximum charge per delay (W), monitoring distance (R), spacing (S), powder factor (P), burden (B), bench height (H), stemming (T), and blast-hole depth (L). Blast designers can change these parameters [15]. The uncontrollable parameters relate to geological and geophysical conditions, such as rock hardness, cracks, bedding, faults, and density distribution, and cannot be altered by blast designers [16]. Thus, studies on predicting blast-induced PPV usually focus on the controllable parameters. Of these, the maximum charge per delay (W) and monitoring distance (R) are the most influential for blast-induced PPV [17,18,19,20].

To estimate blast-induced PPV, many scholars have developed empirical formulas based on the relationship between W and R, as shown in Table 1. Nevertheless, their accuracy is limited in some cases [7, 17, 18, 21,22,23].

Table 1 Some empirical techniques proposed for predicting PPV

In recent years, artificial intelligence (AI) has become more popular and has been widely applied in many fields. The literature shows that AI has been used in areas such as mineralization-related geochemical anomalies [33,34,35], optimizing operational mine planning [36], civil engineering [37, 38], analyzing mineral systems [39], mineral potential mapping [40, 41], resourcing future generations [42, 43], and predicting blast-induced problems [44,45,46,47]. For blast-induced PPV, Hasanipanah et al. [7] studied the feasibility of a support vector machine (SVM) algorithm and applied it to predict blast-induced PPV at Bakhtiari Dam, Iran. A total of 80 blasts were used in their study, with 60 blasting events for training and 20 for testing. They also used a variety of empirical equations to estimate PPV for comparison with the SVM model. Model performance was evaluated with a series of criteria: variance absolute relative error (VARE), root-mean-square error (RMSE), coefficient of determination (R²), median absolute error (MEDAE), mean absolute percentage error (MAPE), Nash and Sutcliffe (NS), and variance accounted for (VAF). They concluded that the SVM model outperformed the empirical equations in predicting blast-induced PPV, with an RMSE of 0.34, R² of 0.957, VARE of 0.3, MAPE of 6.36, MEDAE of 0.15, NS of 0.94, and VAF of 94.24. In another study, the authors of [6] developed a blast-induced PPV predictive model based on an artificial neural network combined with the K-nearest neighbor algorithm (ANN-KNN). Seventy-five blasting events with two input variables, W and D, were used to predict PPV. The empirical equation of the United States Bureau of Mines (USBM) was also applied to estimate blast-induced PPV. RMSE, R², and VAF were the performance indices used to assess model quality. The optimal ANN-KNN model achieved an RMSE of 0.54, R² of 0.88, and VAF of 87.84.

Hasanipanah et al. [48] used particle swarm optimization (PSO) to develop a simple, high-precision model for predicting blast-induced PPV with two forms of equation, linear and power. They likewise used W and D as input parameters and took the USBM empirical equation as the benchmark for the PSO model. They concluded that the PSO model with the power equation was more accurate than the USBM model for forecasting PPV, with an RMSE of 0.24 and R² of 0.938. In another study, Hasanipanah et al. [49] developed a classification and regression tree (CART) model to predict PPV in an open-pit mine. Eighty-six blasting events were monitored, and the resulting CART model performed well, with an RMSE of 0.17 and R² of 0.95. Armaghani et al. [50] investigated the usability of the imperialist competitive algorithm (ICA) for predicting blast-induced PPV with two forms of equation, power and quadratic. Seventy-three blasting events were collected, with W, D, and PPV carefully measured. The ICA quadratic form was the best of the developed models, with an RMSE of 0.37 and R² of 0.94. Several similar works can be found in [51,52,53,54,55].

The literature shows that SVR has been successfully applied to blasting problems at several sites [7, 56,57,58,59]. Nevertheless, it has not been applied in all areas, and the effects of blast-induced PPV differ from country to country [60] and need to be predicted accurately. In addition, previous studies have not compared, evaluated, and selected SVR models with different kernel functions. Hasanipanah et al. [7] evaluated the feasibility of SVR for predicting blast-induced PPV, but they appear to have used only the radial basis function (RBF) kernel to develop the SVR model. Sheykhi et al. [44] developed SVR models based on fuzzy C-means clustering (FCM), again selecting the RBF kernel without comparison or evaluation. Hence, in this study SVR models with three forms of kernel function were developed for predicting blast-induced PPV, and they were compared and assessed to determine the best SVR model.

The article consists of six parts: Sect. 1 presents the motivation for this study and an overview of related work; Sect. 2 describes the study area and the data used as a case study; Sect. 3 gives the background of the methodology; Sect. 4 develops the blast-induced PPV prediction models; Sect. 5 presents and discusses the results; finally, Sect. 6 gives the conclusions and recommendations.

2 Study area and data used

2.1 Study area

The study area is medium mountainous terrain with an absolute altitude of 100–434 m. The mine itself lies on plain terrain at an elevation of 50–100 m above sea level. It is located between longitudes 105°30′40″E and 105°31′00″E and latitudes 21°42′30″N and 21°43′00″N (Fig. 2). The geology of the study area is complex, with faults and layers of metamorphic sandstone, zander, limestone, shale, and siltstone. The thickness of the coal seam varies from 0 to 2.75 m, and the overburden ranges from 5 to 10.2 m.

Fig. 2 A view of the study site

The mine’s reserves are 945,900 tons, and annual mining output is 80,000–100,000 tons. The maximum depth of the mine is −202 m, and the height of the slope is 245–270 m, with the slope angle varying from 32° to 36°. The rock mass has a relatively high hardness (pebbles, gravel, clay), and the drilling–blasting method is used for rock breakage. Boreholes have two main diameters, 220 mm and 105 mm. The main explosives are ANFO and AD-1, initiated with non-electric delay blasting. In the overburden benches, the maximum charge per delay ranged from 100 to 623 kg.

2.2 Description of the data used

As discussed in Sect. 1, the maximum charge per delay (W) and monitoring distance (R) were the most influential parameters for blast-induced PPV. Therefore, W and R are two parameters used to predict blast-induced PPV in this study. The datasets used in this study are summarized in Table 2.

Table 2 Summary of the datasets used in this study

Table 2 shows that W ranges from 100 to 623 kg, R from 48 to 218 m, and the recorded PPV from 17.21 to 23.55 mm/s. Blast-induced PPV was recorded with a Micromate instrument (Instantel, Canada), which can record PPV in the range of 0.127–254 mm/s. A handheld GPS navigation system was used to determine the monitoring distance (R), and W was taken from the blast designs. Figure 3 shows histograms of the datasets collected in this study.

Fig. 3 Histogram of the datasets used in this study

3 Overview of support vector regression with kernel functions

Support vector machine (SVM), proposed by Cortes and Vapnik [61], is a machine learning algorithm based on the principle of structural risk minimization, which helps it generalize well from a limited number of samples. SVM can solve both classification and regression problems; for regression it is called support vector regression (SVR). SVR relies on a subset of the training data (the support vectors) to build the forecasting model [62]. The goal of SVR is to estimate a smooth function f(X) that deviates from every observed output value by no more than ɛ [63].

For predicting blast-induced PPV in this study, SVR is performed based on three forms of kernel function as follows:

  • Linear kernel:

    $$K\left( {X,Y} \right) = X^{\rm T} Y$$
    (1)
  • Polynomial kernel:

    $$K\left( {X,Y} \right) = (\gamma \cdot X^{T} Y + r)^{d} ;\quad \,\gamma > 0;\,\,d = (1,2, \ldots )$$
    (2)
  • Radial basis function (RBF) kernel:

    $$K\left( {X,Y} \right) = \exp \left[ { - \frac{{\left\| {X - Y} \right\|^{2} }}{{2\sigma^{2} }}} \right]$$
    (3)

Here r, d, γ, and σ are kernel parameters that can be adjusted to obtain the optimal predictive model. In addition to the kernel parameters, the cost parameter C is a penalty factor that is tuned to improve the accuracy of the predictive models [64].
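As a minimal sketch of the three kernels in Eqs. (1)–(3), the functions below evaluate each one for a pair of input vectors using only the Python standard library. The RBF kernel is written in its standard form with a negative exponent, so that K(X, X) = 1 and the value decays as the two points move apart. The function names, default parameter values, and the two sample vectors are illustrative assumptions and are not taken from the study.

```python
import math

def linear_kernel(x, y):
    """Linear kernel, Eq. (1): K(X, Y) = X^T Y."""
    return sum(a * b for a, b in zip(x, y))

def polynomial_kernel(x, y, gamma=1.0, r=0.0, d=3):
    """Polynomial kernel, Eq. (2): K(X, Y) = (gamma * X^T Y + r)^d, gamma > 0."""
    return (gamma * linear_kernel(x, y) + r) ** d

def rbf_kernel(x, y, sigma=1.0):
    """RBF kernel, Eq. (3): K(X, Y) = exp(-||X - Y||^2 / (2 * sigma^2))."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))

# Two hypothetical, already-scaled (W, R) input vectors; not data from the study.
x = [0.2, 0.5]
y = [0.4, 0.1]

print(linear_kernel(x, y))
print(polynomial_kernel(x, y, gamma=0.5, r=1.0, d=2))
print(rbf_kernel(x, y, sigma=1.0))
```

Software packages do not all parameterize the RBF kernel in the same way (some use a width σ as in Eq. (3), others an inverse-width coefficient), so a tuned value is only meaningful together with the convention of the tool that produced it.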

4 Proposing SVR models with different kernel functions for predicting PPV

To develop the SVR models, the original dataset was divided into two parts: approximately 85% (157 blasting events) was used as the training dataset, and the remaining 15% or so (24 blasting events) was used as the testing dataset. All the PPV predictive models were developed on the same training dataset and validated on the same testing dataset.

4.1 Support vector regression with linear kernel (SVR-L)

For support vector regression with the linear kernel (SVR-L), there are no kernel parameters to tune; only the penalty factor C is adjusted to improve the accuracy of the model. To increase accuracy and avoid overfitting, tenfold cross-validation with three repeats was used as the resampling technique in developing the SVR-L models. A trial-and-error procedure was performed with 1000 SVR-L models for various values of C, as shown in Fig. 4. The optimal model was selected as the one with the smallest root-mean-square error (RMSE). The final value used for the model was C = 0.18.

Fig. 4 Performance of SVR-L models on the training datasets
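The resampling scheme described above (tenfold cross-validation with three repeats on the 157 training events) can be sketched as follows. This is only an illustration of how the resamples might be generated, written with the Python standard library; the function name, the random seed, and the way folds are cut are assumptions, and the study does not state which software produced its resamples.

```python
import random

def repeated_kfold_indices(n_samples, n_folds=10, n_repeats=3, seed=0):
    """Yield (train_idx, valid_idx) pairs for repeated k-fold cross-validation."""
    rng = random.Random(seed)
    for _ in range(n_repeats):
        order = list(range(n_samples))
        rng.shuffle(order)  # a fresh random partition for every repeat
        for k in range(n_folds):
            valid = order[k::n_folds]  # every n_folds-th index gives near-equal folds
            valid_set = set(valid)
            train = [i for i in order if i not in valid_set]
            yield train, valid

# 157 is the number of training events reported in Sect. 4.
splits = list(repeated_kfold_indices(n_samples=157))
print(len(splits))  # 10 folds x 3 repeats = 30 resamples
```

For each candidate parameter value, a model would be fitted on every `train` subset and scored by RMSE on the matching `valid` subset; the candidate with the smallest average RMSE over the 30 resamples is then retained.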

4.2 Support vector regression with radial basis function (SVR-RBF)

In support vector regression with the radial basis function kernel (SVR-RBF), two parameters can be adjusted to optimize the predictive model: σ and C. As with the SVR-L models, a trial-and-error procedure was employed with 1000 SVR-RBF models for various values of σ and C. Tenfold cross-validation with three repeats was again used in developing the SVR-RBF models to increase accuracy and avoid overfitting, and the optimal model was again selected as the one with the smallest RMSE. The final values used for the SVR-RBF model were σ = 0.017 and C = 304.411. Figure 5 illustrates the performance of SVR-RBF models with various parameters.

Fig. 5 Performance of SVR-RBF models on the training datasets

4.3 Support vector regression with polynomial kernel (SVR-P)

In support vector regression with the polynomial kernel (SVR-P), three parameters control the performance of the model: degree, scale (d and γ in Eq. (2), respectively), and C. As with the SVR-L and SVR-RBF models, a trial-and-error procedure with 1000 models was carried out for SVR-P. To improve accuracy and avoid overfitting, tenfold cross-validation with three repeats was used in developing the SVR-P models, and the optimal model was selected as the one with the smallest RMSE. The final values used for the model were degree = 3, scale = 0.117, and C = 0.08, as shown in Fig. 6.

Fig. 6 Performance of SVR-P models on the training datasets

5 Results and discussion

5.1 Performance metrics for evaluating the models

To assess the performance of the predictive models, three performance indicators are used: root-mean-square error (RMSE), coefficient of determination (R²), and mean absolute error (MAE), which are calculated using Eqs. (4)–(6), respectively.

$$RMSE = \sqrt {\frac{1}{n}\sum\limits_{i = 1}^{n} {(y_{i} - \widehat{y}_{i} )^{2} } }$$
(4)
$$R^{2} = 1 - \frac{{\sum\limits_{i = 1}^{n} {(y_{i} - \widehat{y}_{i} )^{2} } }}{{\sum\limits_{i = 1}^{n} {(y_{i} - \overline{y} )^{2} } }}$$
(5)
$$MAE = \frac{1}{n}\sum\limits_{i = 1}^{n} {|y_{i} - \widehat{y}_{i} |}$$
(6)

where n is the total number of data points; \(y_{i}\) is the i-th measured PPV value; \(\widehat{y}_{i}\) is the corresponding predicted value; and \(\overline{y}\) is the mean of the measured values. For an ideal model, R² equals 1, and RMSE and MAE equal 0.
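The three indicators in Eqs. (4) to (6) can be computed directly from paired lists of measured and predicted values. The sketch below uses only the Python standard library; the function names and the four sample values are hypothetical and serve only to illustrate the arithmetic.

```python
import math

def rmse(y_true, y_pred):
    """Root-mean-square error, Eq. (4)."""
    n = len(y_true)
    return math.sqrt(sum((y - p) ** 2 for y, p in zip(y_true, y_pred)) / n)

def r_squared(y_true, y_pred):
    """Coefficient of determination, Eq. (5)."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((y - p) ** 2 for y, p in zip(y_true, y_pred))  # residual sum of squares
    ss_tot = sum((y - mean_y) ** 2 for y in y_true)             # total sum of squares
    return 1.0 - ss_res / ss_tot

def mae(y_true, y_pred):
    """Mean absolute error, Eq. (6)."""
    n = len(y_true)
    return sum(abs(y - p) for y, p in zip(y_true, y_pred)) / n

# Hypothetical measured and predicted values; not data from the study.
measured = [2.0, 4.0, 6.0, 8.0]
predicted = [2.5, 3.5, 6.5, 7.5]

print(rmse(measured, predicted), r_squared(measured, predicted), mae(measured, predicted))
```

In this toy case every prediction is off by 0.5, so RMSE and MAE are both 0.5; the residual sum of squares is 1 against a total sum of squares of 20, which gives R² = 0.95.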

5.2 Empirical technique

To compare and evaluate the performance of the predictive models, an empirical technique is also applied in this study. Of the current empirical methods, the United States Bureau of Mines (USBM) equation, proposed by Duvall and Petkof [24], remains the most widely used for estimating PPV. Therefore, the USBM formula was chosen to represent the empirical techniques for estimating blast-induced PPV in this study. It is given in Eq. (7):

$$PPV = \lambda \left( {\frac{R}{\sqrt W}} \right)^{- \alpha }$$
(7)

where W is the maximum charge per delay, kg; R is the distance between the blast site and monitoring point, m; λ and α are the site factors and are determined by the multivariate regression analysis.

The 157 blasting events in the training dataset were used to identify the site factors λ and α in Eq. (7), with SPSS version 18.0 [65] used for the multivariate regression analysis. As a result, λ and α were found to be 67.054 and 0.585, respectively. The USBM empirical equation for estimating blast-induced PPV in this study is therefore:

$$PPV = 67.054\left( {\frac{R}{\sqrt W}} \right)^{- 0.585}$$
(8)
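One common way to estimate the site factors is to take logarithms of Eq. (7), which gives the straight line ln(PPV) = ln(λ) − α·ln(R/√W), and to fit that line by ordinary least squares. The sketch below illustrates this approach with the Python standard library; it is an assumption about the fitting procedure, not a reproduction of the SPSS analysis used in the study, and the function names and demonstration inputs are hypothetical. The demonstration data are synthetic and noise-free, generated from Eq. (8) itself, so the fit simply recovers the constants it was given.

```python
import math

def scaled_distance(R, W):
    """Scaled distance R / sqrt(W) used by the USBM equation."""
    return R / math.sqrt(W)

def usbm_ppv(R, W, lam, alpha):
    """USBM estimate, Eq. (7): PPV = lam * (R / sqrt(W)) ** (-alpha)."""
    return lam * scaled_distance(R, W) ** (-alpha)

def fit_usbm(R_values, W_values, ppv_values):
    """Least-squares fit of ln(PPV) = ln(lam) - alpha * ln(R / sqrt(W))."""
    xs = [math.log(scaled_distance(R, W)) for R, W in zip(R_values, W_values)]
    ys = [math.log(p) for p in ppv_values]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    intercept = mean_y - slope * mean_x
    return math.exp(intercept), -slope  # (lam, alpha)

# Synthetic, noise-free demonstration inputs; not measurements from the study.
R_demo = [50.0, 100.0, 150.0, 200.0]    # monitoring distances, m
W_demo = [100.0, 250.0, 400.0, 600.0]   # maximum charges per delay, kg
ppv_demo = [usbm_ppv(R, W, 67.054, 0.585) for R, W in zip(R_demo, W_demo)]

lam_hat, alpha_hat = fit_usbm(R_demo, W_demo, ppv_demo)
print(round(lam_hat, 3), round(alpha_hat, 3))

# Eq. (8) evaluated for a hypothetical blast with W = 100 kg monitored at R = 100 m.
print(usbm_ppv(100.0, 100.0, 67.054, 0.585))
```

With real, noisy measurements a log-space fit minimizes relative rather than absolute errors, so it would not necessarily return exactly the values reported above.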

5.3 Comparison and assessment of the predictive models

The developed predictive models were evaluated on the testing dataset using the metrics in Eqs. (4)–(6). RMSE, R², and MAE were calculated on both the training and testing datasets. Table 3 summarizes the performance of the models.

Table 3 Performance of the PPV predictive models on training and testing datasets

Table 3 shows that the empirical technique (USBM) had the lowest performance on both the training and testing datasets. On the testing dataset, it reached only an RMSE of 0.856, R² of 0.643, and MAE of 0.575, whereas the SVR models (SVR-L, SVR-RBF, and SVR-P) performed much better. Of the three SVR models, SVR-RBF was the best, with an RMSE of 0.396, R² of 0.924, and MAE of 0.135 on the testing dataset. These results suggest that AI techniques such as SVR can effectively support controlled blasting. However, the models depend on the characteristics of each site and should therefore be re-evaluated before being applied at another location. Figure 7 shows the agreement between measured and predicted PPV for each predictive model on the testing dataset.

Fig. 7 The relationship between measured and predicted values of the predictive models on testing datasets

To assess the reliability of the predictive models, the predicted values were compared with the measured values on the testing dataset (Table 4). The SVR-RBF model yielded predicted values closer to the measured values than the other models. With its high accuracy, the SVR-RBF model can be applied to control the hazards caused by blasting operations in open-pit mines and to minimize their impact on the environment.

Table 4 Comparison of measured and predicted values of the predictive models on testing datasets

6 Conclusions and recommendations

Blasting is still the most popular and effective method for rock fragmentation in open-pit mines. However, safety and sustainable development are essential requirements for the mining industry. Therefore, the undesirable effects of blasting operations in open-pit mines, especially ground vibration, need to be strictly controlled. This study developed a series of blast-induced PPV prediction models based on the SVR algorithm with three different kernel functions and compared them with one another and with the USBM empirical technique. The best model for predicting blast-induced PPV at the Phan Me open-pit coal mine, Vietnam, was SVR-RBF.

In conclusion, SVR with a suitable kernel function is a robust algorithm for predicting blast-induced PPV. In this study, the SVR model with the RBF kernel appears more appropriate than those with the linear and polynomial kernels for predicting blast-induced PPV in this area. Given its high accuracy, the SVR-RBF model can be applied in practical engineering to control the impacts of PPV on the surrounding environment. The remaining SVR models may also be considered under other mine conditions and should be investigated further to improve their accuracy.