Introduction

Blasting is an important aspect of civil and mining engineering because it remains the cheapest means of dislodging natural and artificial structures. It is usually preceded by a drilling operation. Blasting is achieved with the aid of explosives capable of releasing very high energy when detonated. Part of the explosive energy released at the detonation front splits and fragments the rock, while the remaining energy is released to the environment, resulting in various detrimental environmental effects such as blast-induced ground vibration, flyrocks, air overpressure, etc. (Akande et al. 2014; Lawal 2020; Lawal and Kwon 2020; Lawal et al. 2021). The recent expansion of cities has brought quarries close to residential structures. As a result, quarry owners spend a huge amount of money every year on court cases arising from different degrees of damage to structures, such as blast-induced cracks, perforation of building roofs by flyrocks, and panic caused by air overpressure (Abdel-Rasoul 2000). Quantifying the magnitude of the blast-induced environmental impacts is therefore imperative to enable quarry operators to know whether they are operating within acceptable limits. One of the most severe adverse effects of blasting is blast-induced ground vibration (BIGV) (Khandelwal and Singh 2009; Akande et al. 2014), because it can affect the structural integrity of nearby structures and cause shock.

The importance of blast-induced ground vibration has led researchers to suggest various means of quantifying the magnitude of BIGV in terms of peak particle velocity (PPV), as direct field measurement is costly, time consuming and requires a high level of expertise. As with the Kuz-Ram model for assessing the positive product of blasting (Lawal 2021), various empirical models, such as those of the USBM (Duval and Petkof 1959) and Langefors and Kihlstrom (1963), have been proposed to quantify the magnitude of the blast-induced ground vibration. The general form of the empirical model is presented in Eq. (1):

$${\text{PPV}} = a({\text{SD}})^{b},$$
(1)

where SD is the scaled distance (m/kg\(^{1/2}\)), and a and b are the field constants. The major concern with Eq. (1) is that the field constants are usually site specific.

In recent decades, various artificial intelligence (AI) methods such as ANN, SVM, ANFIS, GEP, and their hybrids (Lawal and Kwon 2020) have been proposed for predicting the PPV, and in many cases their performance has been encouraging compared with the empirical models. Nevertheless, quarry operators still largely depend on the empirical models, which they consider simple, because the practical implementation of many of the available AI-based models remains out of reach. The numerical method has also been used for predicting the PPV by generating artificial vibrations (Kumar et al. 2020), but it is known to be difficult to implement and costly (Lawal and Kwon 2020). Therefore, a probabilistic model selection method capable of guiding quarry operators in selecting a suitable model for PPV estimation based on the scaled distance only is timely, given the dilemma operators face when deciding which model to use. In addition, the existing studies on BIGV do not give a distribution of BIGV against SD, as the uncertainties associated with the estimation of BIGV are not explicitly incorporated and considered.

Therefore, this paper develops an approach to select the most appropriate regression model for PPV among the numerous models available in the literature. The method selects a model for characterization of PPV using available SD data at a blasting site. The approach is based on Bayesian methodology for the purpose of performing model selection and characterization of PPV. The methodology involves the formulation of the likelihood function for the regression models of PPV and the selection of the appropriate regression model out of the models considered. After the appropriate model is selected, it is incorporated into the Bayesian method to probabilistically characterize PPV. The proposed method is illustrated using a set of real-life blasting data reported in the literature.

Selection of appropriate regression model

Regression models between PPV and SD

As shown in Table 1, many regression equations have been reported in the existing studies for estimating PPV from SD. Blasting engineers and other practitioners may therefore be unsure which model to use when they need to estimate PPV empirically from the SD at a specific site. The many variants of Eq. (1) (Table 1) make it tedious for quarry operators to select the most suitable empirical model for their quarries or mines.

Table 1 Some empirical models for PPV predictions

In this study, five equations are adopted from Table 1 for demonstration purposes. All the five models considered are models that relate directly with SD, as SD is the only input data that will be used in the Bayesian method that is developed in this study. In addition, equations whose original plots or data cannot be retrieved are not considered. This is because the original data used in the derivation of each regression equation is required to determine their model uncertainty when developing Bayesian method. The regression equations adopted from Table 1 are those of Ak et al. (2009), Mesec et al. (2010), Aloui et al. (2016), Kahriman (2002) and Kahriman (2004) as Eqs. (2)–(6), respectively.

$${\text{PPV}} = 583.49{\text{SD}}^{ - 1.5824}$$
(2)
$${\text{PPV}} = 508{\text{SD}}^{ - 1.37}$$
(3)
$${\text{PPV}} = 1508{\text{SD}}^{ - 1.73}$$
(4)
$${\text{PPV}} = 191{\text{SD}}^{ - 1.13}$$
(5)
$${\text{PPV}} = 340{\text{SD}}^{ - 1.79}$$
(6)

Equations (2)–(6) are taken as Models 1–5 (M1–M5), respectively, and are henceforth referenced as such.
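To make the spread between the candidate models concrete, the following minimal sketch evaluates Eqs. (2)–(6) at a single scaled distance. The coefficients are those listed above; the scaled distance of 10 m/kg^0.5, the assumption that all five models return PPV in mm/s, and the names `MODELS` and `predict_ppv` are illustrative choices and are not taken from the original study.

```python
# Candidate power-law models PPV = a * SD**b, coefficients from Eqs. (2)-(6).
MODELS = {
    "M1": (583.49, -1.5824),  # Ak et al. (2009)
    "M2": (508.0, -1.37),     # Mesec et al. (2010)
    "M3": (1508.0, -1.73),    # Aloui et al. (2016)
    "M4": (191.0, -1.13),     # Kahriman (2002)
    "M5": (340.0, -1.79),     # Kahriman (2004)
}


def predict_ppv(sd, model):
    """Return the PPV predicted by one candidate model at scaled distance sd."""
    a, b = MODELS[model]
    return a * sd ** b


sd_example = 10.0  # hypothetical scaled distance (m/kg^0.5), for illustration only
predictions = {name: predict_ppv(sd_example, name) for name in MODELS}
for name, ppv in predictions.items():
    print(f"{name}: {ppv:.2f}")
```

At this illustrative scaled distance the five predictions differ several-fold, which is the practical difficulty that motivates a formal model selection step.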

The data that are used to generate models 1–5 are plotted in Fig. 1, and it may be difficult to select an appropriate model out of them. This highlights the need to have a robust and rational method to select the most appropriate model for a specific site among them. The equation developed from the site data in Fig. 1 is not used in this study, since the data will be used in the illustrative example. However, the proposed approach is robust enough to use as many models as available during model selection.

Fig. 1
figure 1

Plot of selected regression models and site data (after Ak et al. (2009); Mesec et al. (2010); Aloui et al. (2016); Kahriman (2002, 2004))

Uncertainty modeling

Geotechnical materials undergo different geologic processes and sequences (Aladejare and Wang 2017, 2018). Therefore, the values of PPV measured through rock medium may vary considerably as the SD within a blasting site changes. Since the PPV may not be known with certainty at every point within a blasting site, there is a need to incorporate the uncertainty associated with PPV during its model selection and characterization. Mathematically, PPV can be expressed as a lognormal random variable in terms of its mean µ and standard deviation σ as in Eq. (7):

$${\text{PPV}} = {\text{exp}}(\mu_{N} + \sigma_{N} z),$$
(7)

where z is a standard normal random variable, µN and σN are the mean and standard deviation of the log of PPV, respectively, and are functions of µ and σ (Aladejare and Wang 2019a,b). Both the σ of PPV and the σN of ln(PPV) depict the variability of the peak particle velocity within the scaled distance from a blasting site. The physical meaning of PPV means that it must be strictly nonnegative; therefore, PPV is a continuous variable and it is modeled as a lognormal random variable.
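The relation between (µ, σ) and (µN, σN) is not written out above. The sketch below assumes the standard lognormal moment relations, σN = sqrt(ln(1 + (σ/µ)²)) and µN = ln(µ) − σN²/2, which reproduce the values µN = 2.40 and σN = 0.40 quoted for µ = 12 mm/s and σ = 5 mm/s in the sensitivity study. The function names are hypothetical.

```python
import math


def lognormal_params(mean, std):
    """Convert the mean/std of PPV to the mean/std of ln(PPV) for a lognormal variable."""
    sigma_n = math.sqrt(math.log(1.0 + (std / mean) ** 2))
    mu_n = math.log(mean) - 0.5 * sigma_n ** 2
    return mu_n, sigma_n


def ppv_from_z(mean, std, z):
    """Evaluate Eq. (7): PPV = exp(mu_N + sigma_N * z) for a standard normal value z."""
    mu_n, sigma_n = lognormal_params(mean, std)
    return math.exp(mu_n + sigma_n * z)


mu_n, sigma_n = lognormal_params(12.0, 5.0)
print(round(mu_n, 2), round(sigma_n, 2))  # 2.4 0.4
```

Setting z = 0 in `ppv_from_z` returns the median of PPV, which lies below the mean for a lognormal variable.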

The PPV can be measured directly at a blasting site, but its direct measurement is expensive and time consuming. The PPV value at a blasting site is therefore commonly obtained by means of regression between the PPV and the scaled distance (SD). The SD is the ratio of the distance between the shot and the measuring station (m) to the square root of the maximum charge per delay (kg). Considering Eqs. (2)–(6), the general format of the models can be written as follows:

$${\text{PPV}} = a({\text{SD}})^{b}.$$
(8)

Equation (8) can be transformed in a log–log scale as:

$$\ln ({\text{SD}}) = a\ln ({\text{PPV}}) + b + \varepsilon,$$
(9)

in which ln(SD) denotes the SD values at the blasting site in a log scale, and a and b are the model constants of the log-transformed relation, which reflect the properties of each model and distinguish the models from one another. Because Eq. (9) expresses ln(SD) in terms of ln(PPV), these constants are in general not numerically equal to the field constants a and b of Eq. (8). ε is the model uncertainty, taken as a normal random variable with a zero mean and a standard deviation σε. The model constants and σε of the candidate models used in this study are listed in Table 2.

Table 2 Parameter of the candidate models used in this study

Substituting ln(PPV) = µN + σN z from Eq. (7) into Eq. (9) leads to the likelihood model as:

$$\ln ({\text{SD}}) = (a\mu_{N} + b) + a\sigma_{N} z + \varepsilon.$$
(10)

The variability in PPV measurements arises from a source different from the model uncertainty in the regression models. Therefore, the two uncertainties are assumed to be independent of each other. Thus, ln(SD) is a normal random variable with a mean of \(a\mu_{N} + b\) and a standard deviation of \(\sqrt {\left( {a\sigma_{N} } \right)^{2} + \sigma_{\varepsilon }^{2} }\).
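As a rough numerical check of this statement, the sketch below draws ln(SD) values from Eq. (10) and compares their sample statistics with the stated mean and standard deviation. The constants are assumptions made for illustration only: a and b are obtained by algebraically inverting Eq. (5), σε = 0.3 is an arbitrary value, and none of them is taken from Table 2.

```python
import math
import random


def ln_sd_distribution(mu_n, sigma_n, a, b, sigma_eps):
    """Mean and standard deviation of ln(SD) implied by Eq. (10)."""
    mean = a * mu_n + b
    std = math.sqrt((a * sigma_n) ** 2 + sigma_eps ** 2)
    return mean, std


def simulate_ln_sd(n, mu_n, sigma_n, a, b, sigma_eps, rng):
    """Draw n values of ln(SD), sampling z and epsilon independently."""
    return [
        (a * mu_n + b) + a * sigma_n * rng.gauss(0.0, 1.0) + rng.gauss(0.0, sigma_eps)
        for _ in range(n)
    ]


# Illustrative constants only (not the Table 2 values).
A = -1.0 / 1.13               # from inverting PPV = 191 * SD**(-1.13)
B = math.log(191.0) / 1.13
SIGMA_EPS = 0.3               # assumed model uncertainty

mean, std = ln_sd_distribution(2.40, 0.40, A, B, SIGMA_EPS)
draws = simulate_ln_sd(5000, 2.40, 0.40, A, B, SIGMA_EPS, random.Random(0))
sample_mean = sum(draws) / len(draws)
sample_std = math.sqrt(sum((d - sample_mean) ** 2 for d in draws) / (len(draws) - 1))
print(round(mean, 3), round(std, 3), round(sample_mean, 3), round(sample_std, 3))
```

The two independent normal terms add in variance, which is why the standard deviation is the square root of the sum of squares rather than the sum of the two standard deviations.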

Bayesian model selection approach

A Bayesian approach is developed in this paper to select the regression model for estimating PPV at a blasting site using only the available SD data. This is most beneficial to blasting engineers and practitioners when no direct measurement of PPV is available, as this is when engineers need to use regression models to evaluate the PPV from data. Selection of the most appropriate regression model for estimating PPV is implemented by ranking the occurrence probability P(Mk|SD) of the candidate models Mk given the observed SD data. The model with the maximum probability is the most appropriate among the candidate models for the blasting site. Using Bayes’ theorem, the occurrence probability P(Mk|SD) is calculated as:

$$P(\left. {M_{k} } \right|{\text{SD}}) = \frac{{P(\left. {{\text{SD}}} \right|M_{k} )P(M_{k} )}}{{P({\text{SD}})}},\;k = {1},{2}, \ldots ,{5},$$
(11)

in which P(SD|Mk), often referred to as the evidence in Bayesian statistics, is the probability of observing SD given that a model Mk is selected. P(Mk) is the probability which reflects the prior knowledge of Mk, while P(SD) is a normalizing constant. In this study, the same prior probability of 0.2 is assigned to each model to avoid bias.

Starting from the probability of the prior knowledge of the models, the knowledge is updated upon observing SD at a blasting site. The update is reflected by P(SD|Mk), which incorporates the site-specific SD data for a given model Mk. Since the prior probability of the models is uniform, the task of model selection in Eq. (11) reduces to selecting the model with the maximum evidence (i.e., P(Mk|SD) is proportional to P(SD|Mk)).

Given the model distribution parameters of PPV, the evidence P(SD|Mk) for each of the five models can be expressed using the theorem of total probability (Wang and Aladejare 2015) as:

$$P(\left. {{\text{SD}}} \right|M_{k} ) = \iint {P(\left. {{\text{SD}}} \right|\mu ,\sigma ,}M_{k} )P(\left. {\mu ,\sigma } \right|M_{k} ){\text{d}}\mu {\text{d}}\sigma,$$
(12)

in which P(SD|µ,σ, Mk) is the joint conditional probability density function, PDF, of SD for given Mk and a set µ and σ of PPV, and commonly referred to as likelihood function in Bayesian framework. P(µ,σ|Mk) is the prior distribution of µ and σ of PPV. P(SD|µ,σ, Mk) and P(µ,σ|Mk) are calculated using Eqs. (13) and (14), respectively:

$$P(\left. {{\text{SD}}} \right|\mu ,\sigma ,M_{k} ) = \prod\limits_{i = 1}^{n} \frac{1}{\sqrt{2\pi}\sqrt{(a\sigma_{N})^{2} + \sigma_{\varepsilon}^{2}}} \exp\left\{ -\frac{1}{2}\left[ \frac{\ln({\text{SD}})_{i} - (a\mu_{N} + b)}{\sqrt{(a\sigma_{N})^{2} + \sigma_{\varepsilon}^{2}}} \right]^{2} \right\}$$
(13)
$$P(\left. {\mu ,\sigma } \right|M_{k} ) = \left\{ {\begin{array}{*{20}c} {\left( {(\mu_{B} - \mu_{A} ) \times (\sigma_{B} - \sigma_{A} )} \right)^{ - 1} } & {{\text{for }}\mu \in [\mu_{A} ,\mu_{B} ]\;{\text{and}}\;\sigma \in (\sigma_{A} ,\sigma_{B} )} \\ 0 & {{\text{otherwise}}} \\ \end{array} } \right.,$$
(14)

where ln(SD)i, i = 1, 2,…,n is the site-specific SD data points, µA and σA are the lower bound of the prior values of µ and σ, respectively, and µB and σB are the upper bound of the prior values of µ and σ, respectively.
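The evidence integral can be approximated numerically. The sketch below is one possible implementation under stated assumptions, not the authors' MATLAB routine: it evaluates the likelihood of Eq. (13) as a product of normal densities whose standard deviation is the one derived for ln(SD), and integrates it against the uniform prior of Eq. (14) with a midpoint rule. The ln(SD) observations, the two parameter sets (obtained by algebraically inverting Eqs. (5) and (4)) and σε = 0.3 are hypothetical and do not come from Tables 2 or 3.

```python
import math


def lognormal_params(mean, std):
    """Standard lognormal moment relations for (mu_N, sigma_N)."""
    sigma_n = math.sqrt(math.log(1.0 + (std / mean) ** 2))
    return math.log(mean) - 0.5 * sigma_n ** 2, sigma_n


def likelihood(ln_sd, mu, sigma, a, b, sigma_eps):
    """Eq. (13): product of normal densities of the observed ln(SD) values."""
    mu_n, sigma_n = lognormal_params(mu, sigma)
    mean = a * mu_n + b
    std = math.sqrt((a * sigma_n) ** 2 + sigma_eps ** 2)
    value = 1.0
    for x in ln_sd:
        value *= math.exp(-0.5 * ((x - mean) / std) ** 2) / (math.sqrt(2.0 * math.pi) * std)
    return value


def evidence(ln_sd, a, b, sigma_eps, mu_bounds=(0.0, 100.0), sigma_bounds=(0.0, 16.7), grid=200):
    """Eq. (12) with the uniform prior of Eq. (14), integrated by the midpoint rule."""
    d_mu = (mu_bounds[1] - mu_bounds[0]) / grid
    d_sigma = (sigma_bounds[1] - sigma_bounds[0]) / grid
    prior = 1.0 / ((mu_bounds[1] - mu_bounds[0]) * (sigma_bounds[1] - sigma_bounds[0]))
    total = 0.0
    for i in range(grid):
        mu = mu_bounds[0] + (i + 0.5) * d_mu  # midpoints avoid mu = 0 and sigma = 0
        for j in range(grid):
            sigma = sigma_bounds[0] + (j + 0.5) * d_sigma
            total += likelihood(ln_sd, mu, sigma, a, b, sigma_eps) * prior * d_mu * d_sigma
    return total


# Hypothetical inputs for illustration only.
ln_sd_obs = [2.1, 2.4, 2.6, 2.9, 3.1]
candidates = {
    "M4-like": (-1.0 / 1.13, math.log(191.0) / 1.13, 0.3),
    "M3-like": (-1.0 / 1.73, math.log(1508.0) / 1.73, 0.3),
}
evidences = {name: evidence(ln_sd_obs, *params) for name, params in candidates.items()}
for name, value in evidences.items():
    print(name, f"{value:.3e}")
```

The grid resolution of 200 points per axis is an arbitrary choice; in practice it would be refined until the evidence values stop changing.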

Bayesian framework for characterization of PPV

As defined under the subsection titled “uncertainty modeling,” PPV is modeled as a lognormal random variable with a mean μ and standard deviation σ. By combining prior knowledge, selected model and blasting site SD data, the PDF of PPV can be estimated for given set of prior knowledge and blasting site SD data (Wang and Aladejare 2016a, b). Using the theorem of total probability, the PDF of PPV is expressed as:

$$P\left( {{\text{PPV}}|{\text{SD}},{\text{Prior}}} \right) = \iint P\left( {{\text{PPV}}|\mu ,\sigma } \right)P\left( {\mu ,\sigma |{\text{SD}},{\text{Prior}}} \right){\text{d}}\mu \,{\text{d}}\sigma ,$$
(15)

where Prior denotes prior knowledge; P(PPV|μ,σ) is conditional PDF of PPV for a given set of μ and σ of PPV. For lognormally distributed PPV, P(PPV|μ,σ) is expressed as (Ang and Tang 2007):

$$P(\left. {{\text{PPV}}} \right|\mu ,\sigma ) = \left( {\sqrt {2\pi } \sigma_{N} {\text{PPV}}} \right)^{ - 1} \exp \left\{ { - \frac{1}{2}\left[ {\frac{{\ln ({\text{PPV}}) - \mu_{N} }}{{\sigma_{N} }}} \right]^{2} } \right\}.$$
(16)

The joint conditional PDF P(μ,σ|SD,Prior) in Eq. (15) reflects the updated knowledge on μ and σ based on prior knowledge and blasting site SD data. P(μ,σ|SD,Prior) can be simplified as P(μ,σ|SD) in Bayesian framework and expressed as (Wang and Cao 2013):

$$P\left( {\mu ,\sigma |{\text{SD}},{\text{Prior}}} \right) = P\left( {\mu ,\sigma |{\text{SD}}} \right) = C\,P\left( {{\text{SD}}|\mu ,\sigma } \right)P\left( {\mu ,\sigma } \right),$$
(17)

where \(C = \left( {\iint P({\text{SD}}|\mu ,\sigma )P(\mu ,\sigma ){\text{d}}\mu \,{\text{d}}\sigma } \right)^{ - 1}\) is a normalizing constant that does not depend on μ and σ; SD = {ln(SD)i, i = 1, 2,…, n}; P(SD|μ,σ) is the likelihood function reflecting the model fit with the SD data obtained at the blasting site; and P(μ,σ) is the prior distribution of μ and σ of PPV. P(SD|μ,σ) and P(μ,σ) are calculated using Eqs. (13) and (14), respectively.

Using the updated knowledge of μ and σ of PPV given by Eq. (17), the PDF of peak particle velocity PPV (i.e., Eq. (15)) is rewritten as (Aladejare and Idris 2020; Aladejare et al. 2020, 2022):

$$P\left( {{\text{PPV|SD}},{\text{Prior}}} \right) = C{\iint }P\left( {{\text{PPV|}}\mu ,\sigma } \right)P\left( {{\text{SD|}}\mu ,\sigma } \right)P\left( {\mu ,\sigma } \right){\text{d}}\mu {\text{d}}\sigma.$$
(18)

The Bayesian approach is combined with Markov Chain Monte Carlo (MCMC) simulation to generate PPV samples from the updated PDF in Eq. (18). The Metropolis–Hastings (MH) algorithm (Metropolis et al. 1953) is used as the MCMC sampler. The entire calculation and simulation are programmed as a user function in MATLAB.
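The original implementation is a MATLAB user function that is not reproduced here. As a hedged illustration of the idea, the Python sketch below runs a random-walk Metropolis–Hastings chain on (µ, σ) with the uniform prior and the normal likelihood for ln(SD), and then draws one lognormal PPV value per retained state, which yields samples from the mixture in Eq. (18). The proposal step sizes, burn-in length, starting point, ln(SD) data and model constants are all assumptions chosen for illustration.

```python
import math
import random

MU_BOUNDS, SIGMA_BOUNDS = (0.0, 100.0), (0.0, 16.7)  # uniform prior ranges (mm/s)


def lognormal_params(mean, std):
    sigma_n = math.sqrt(math.log(1.0 + (std / mean) ** 2))
    return math.log(mean) - 0.5 * sigma_n ** 2, sigma_n


def log_posterior(mu, sigma, ln_sd, a, b, sigma_eps):
    """Log of P(SD|mu,sigma) P(mu,sigma) up to a constant; -inf outside the prior."""
    if not (MU_BOUNDS[0] < mu < MU_BOUNDS[1] and SIGMA_BOUNDS[0] < sigma < SIGMA_BOUNDS[1]):
        return -math.inf
    mu_n, sigma_n = lognormal_params(mu, sigma)
    mean = a * mu_n + b
    var = (a * sigma_n) ** 2 + sigma_eps ** 2
    return sum(-0.5 * math.log(2.0 * math.pi * var) - 0.5 * (x - mean) ** 2 / var for x in ln_sd)


def sample_ppv(ln_sd, a, b, sigma_eps, n_samples, rng, step=(2.0, 1.0), burn_in=1000):
    """Random-walk Metropolis-Hastings on (mu, sigma), then one PPV draw per state."""
    mu, sigma = 20.0, 8.0  # arbitrary starting point inside the prior ranges
    current = log_posterior(mu, sigma, ln_sd, a, b, sigma_eps)
    samples = []
    for it in range(burn_in + n_samples):
        mu_new = mu + rng.gauss(0.0, step[0])
        sigma_new = sigma + rng.gauss(0.0, step[1])
        proposed = log_posterior(mu_new, sigma_new, ln_sd, a, b, sigma_eps)
        # Symmetric proposal: accept with probability min(1, posterior ratio).
        if proposed >= current or rng.random() < math.exp(proposed - current):
            mu, sigma, current = mu_new, sigma_new, proposed
        if it >= burn_in:
            mu_n, sigma_n = lognormal_params(mu, sigma)
            samples.append(math.exp(mu_n + sigma_n * rng.gauss(0.0, 1.0)))
    return samples


# Hypothetical data and constants (a, b from inverting Eq. (5); sigma_eps assumed).
ln_sd_obs = [2.1, 2.4, 2.6, 2.9, 3.1]
A, B, SIGMA_EPS = -1.0 / 1.13, math.log(191.0) / 1.13, 0.3
ppv_samples = sample_ppv(ln_sd_obs, A, B, SIGMA_EPS, 3000, random.Random(1))
print(len(ppv_samples), round(sum(ppv_samples) / len(ppv_samples), 2))
```

Working with the logarithm of the posterior avoids numerical underflow when many SD observations are multiplied together.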

Illustrative example

The Bayesian approach is applied to select the most appropriate regression model for characterization of the PPV using the SD values from an opencast coal mine situated in the Eastern-Central part of the Jharia coalfield in the Dhanbad district of Jharkhand, India (Kumar et al. 2020). The data also include 20 direct measurements of PPV, which are used for validating the result of the proposed approach. The 20 PPV and SD data are shown in columns two and three of Table 3, respectively. The SD data points are used in the selection of the most appropriate model and the characterization of PPV, whereas the 20 PPV data from direct measurement are used only for validation. Column 4 of Table 3 lists the PPV predicted from finite element method (FEM) analysis, which is also compared with the simulated PPV from the Bayesian approach developed in this study. The results from the Bayesian approach are validated by comparing the distribution and statistics of the samples generated from the selected model with the 20 PPV data measured at the blasting site and the 20 predicted PPV values from FEM reported by Kumar et al. (2020).

Table 3 Measurement of scaled distance and peak particle velocity (after Kumar et al. 2020)

The Bayesian approach is illustrated using the SD values listed in Table 3 for model selection and probabilistic characterization of the peak particle velocity at the blasting site. Consider, for example, a set of prior knowledge in which μ is uniformly distributed between 0 mm/s and 100 mm/s (i.e., μA = 0 mm/s and μB = 100 mm/s) and σ is uniformly distributed between 0 mm/s and 16.7 mm/s (i.e., σA = 0 mm/s and σB = 16.7 mm/s). The prior range of μ is consistent with the typical ranges of PPV reported in the literature. σA is taken to be 0 mm/s to reflect the nonnegative physical meaning of the standard deviation of PPV. The six-sigma rule is used to estimate σB from the range of μ as σB = (μB − μA)/6 = (100 − 0)/6 = 16.7 mm/s. Using the prior knowledge and the SD data at the blasting site shown in Table 3 in the Bayesian approach, the regression model with the highest evidence is selected as the most appropriate regression model for the blasting site. The selected model is then used in the approach to probabilistically characterize the PPV at the blasting site.

Results of Bayesian model selection

Table 4 presents the evidence [i.e., P(SD|Mk), k = 1, 2,…,5] for M1, M2,…,M5 in the second column. The values of the evidence for M1, M2,…,M5 are 5.0556e−10, 4.7161e−10, 9.3931e−12, 1.2019e−09, and 7.6258e−10, respectively. The value of the evidence for M4 is about 1.6 times that for M5, which is second in the ranking. In the absence of prevailing prior knowledge about M1, M2,…,M5, their prior probabilities are equal to 0.2 each. The occurrence probabilities P(Mk|SD), k = 1, 2,…,5 of M1, M2,…,M5 for the given set of SD values listed in Table 3 are calculated using Eq. (11), and they are 0.1713, 0.1598, 0.0032, 0.4073, and 0.2584 (fourth column in Table 4), respectively. Because the occurrence probability of M4 is greater than that of M1, M2, M3 and M5, M4 is taken as the most appropriate model for the probabilistic characterization of PPV at the blasting site adopted in this study.
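The normalization in Eq. (11) can be checked with a few lines of Python. In this sketch the evidence of M4 is entered as 1.2019e−09, which is the value that reproduces the reported occurrence probabilities and the stated ranking; the remaining evidence values and the uniform prior of 0.2 are as reported in the text.

```python
evidence = {
    "M1": 5.0556e-10,
    "M2": 4.7161e-10,
    "M3": 9.3931e-12,
    "M4": 1.2019e-09,  # value consistent with the reported probabilities
    "M5": 7.6258e-10,
}
prior = {name: 0.2 for name in evidence}

# Eq. (11): P(SD) is the sum of evidence times prior over the candidate models.
normalizer = sum(evidence[m] * prior[m] for m in evidence)
posterior = {m: evidence[m] * prior[m] / normalizer for m in evidence}
best = max(posterior, key=posterior.get)

for m in evidence:
    print(m, round(posterior[m], 4))
print("selected:", best)
```

Because the prior is uniform, it cancels in the ratio, and the ranking by occurrence probability is identical to the ranking by evidence.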

Table 4 Result of model selection from Bayesian method

Observing the plot of the regression lines of the five models against the site data in Fig. 1, the measured data (SD and PPV) plot more closely to the regression line of M4 than to those of the remaining models. This observation agrees with the result of the evidence calculation using only the SD data points, with M4 having the highest value of evidence. The Bayesian approach selected the most appropriate model for this blasting site in a transparent and logical way. Selecting a model from a plot such as Fig. 1 may look easy; however, when model selection is required in practice, there is no direct measurement of PPV with which to plot the relationship between PPV and SD. The developed approach is effective for selecting the appropriate regression model using only the SD data available at a blasting site.

Fig. 2
figure 2

Comparison of measured PPV with directly estimated PPV from the selected model

Simulated samples of PPV

The importance of the approach developed in this study for probabilistic characterization of PPV can be further appreciated by observing Fig. 2. The directly estimated PPV values using M4 and SD data at the blasting site are plotted against measured PPV values for same SD data. It is obvious that there are still disparities between the directly estimated PPV and measured PPV. The disparities are minimized when the various uncertainties associated with the use of regression model and PPV estimation are incorporated and explicitly considered in the characterization of PPV, which is easily achieved using the approach developed in this study.

The selected model, M4, is used together with prior knowledge and available SD values in the Bayesian approach to generate PPV samples. 30,000 samples of PPV are simulated using a two-dimensional grid over the space of µ and σ. These samples are used to determine the mean, standard deviation, and probability distribution of the PPV. The results are validated using the 20 measurements of PPV data from the site and further compared with the 20 estimated PPV values from FEM analysis.

Figure 3 plots the 30,000 simulated PPV samples. 28,448 PPV samples (i.e., around 95% of the simulated samples) are less than 25 mm/s. The simulated samples become increasingly scattered when PPV is greater than 25 mm/s. The statistical distribution of the simulated PPV samples is plotted in Fig. 4. The peak of the PPV histogram is around 15 mm/s, and 26,728 simulated samples, representing about 90% of the 30,000 simulated PPV samples, fall within the range of 5 mm/s to 25 mm/s. 1720 simulated samples, representing slightly over 5% of the 30,000 simulated samples, are less than 5 mm/s, and 1552 simulated samples, representing around 5% of the 30,000 simulated samples, are greater than 25 mm/s. Therefore, the 90% inter-percentile range, from the 5th percentile to the 95th percentile of PPV, is around 5–25 mm/s.
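Summary quantities of this kind are straightforward to extract from any set of simulated samples. The sketch below uses stand-in lognormal draws with assumed parameters (µN = 2.40, σN = 0.40), not the 30,000 samples of this study, so its numbers will differ from those reported above; the function name `summarize` is hypothetical.

```python
import math
import random
import statistics


def summarize(samples, low=5.0, high=25.0):
    """Mean, standard deviation, 5th/95th percentiles and the share inside [low, high]."""
    cuts = statistics.quantiles(samples, n=20)  # cut points at 5 %, 10 %, ..., 95 %
    inside = sum(low <= s <= high for s in samples) / len(samples)
    return {
        "mean": statistics.fmean(samples),
        "std": statistics.stdev(samples),
        "p05": cuts[0],
        "p95": cuts[-1],
        "share_inside": inside,
    }


# Stand-in samples: lognormal draws with assumed parameters, not the paper's output.
rng = random.Random(42)
stand_in = [math.exp(2.40 + 0.40 * rng.gauss(0.0, 1.0)) for _ in range(30000)]
summary = summarize(stand_in)
for key, value in summary.items():
    print(key, round(value, 3))
```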

Fig. 3
figure 3

Scatter plot of simulated peak particle velocity from Bayesian method

Fig. 4
figure 4

Histogram of peak particle velocity simulated from Bayesian method

Statistics of PPV

Table 5 lists the mean and standard deviation of PPV calculated from the 30,000 simulated samples in the second column. The mean and standard deviation of PPV from the simulated samples are 12.38 mm/s and 7.36 mm/s, respectively. Table 5 also lists the mean and standard deviation estimated from the site measurement (i.e., 12.03 mm/s and 9.24 mm/s) and FEM analysis (i.e., 13.28 mm/s and 9.57 mm/s) (see Table 3) in the third and fourth columns, respectively.

Table 5 Summary of the statistics of PPV

The difference between the mean values estimated from the Bayesian simulated samples and the site measurements is 0.35 mm/s, and the difference between their standard deviation values is 1.56 mm/s. Compared with the standard deviation of 9.24 mm/s measured at the site, the differences of 0.35 mm/s between the mean values and 1.56 mm/s between the standard deviation values are relatively small. In addition, when compared with the independent estimation of PPV using FEM analysis reported by Kumar et al. (2020), the Bayesian simulated samples give a mean that is closer to that of the measured PPV (a difference of 0.35 mm/s, against 1.25 mm/s for FEM), whereas the FEM standard deviation of 9.57 mm/s is closer to the measured 9.24 mm/s. An added advantage of the Bayesian approach is its ability to simulate a large number of samples, unlike the FEM analysis, which produces only as many output values as the input data used for the analysis.

With the proper determination of the characteristic values of PPV for blasting site, blast design calculations can be carried out in a fashion to reduce the effect of blast-induced ground vibration. It is also worthwhile to note that the simulated samples of PPV generated from the Bayesian approach can be used directly in blast designs and in any analysis involving PPV that are based on Monte Carlo Simulation (Aladejare and Idris 2020, Aladejare et al. 2020).

Probability distribution of PPV

Figure 5 shows the PDF of PPV estimated from the simulated samples by a dashed line. For validation, numerical integration is also performed to directly calculate the PDF of PPV using Eq. (18), in which both the components are calculated numerically and repeatedly. The PDF of PPV obtained from the numerical integration is represented by a solid line in Fig. 5. The solid line plots closely to the dashed line, which indicates that the PDF of PPV estimated from the simulated samples is in good agreement with the result obtained from the numerical integration. This agreement indicates that the samples simulated through the Bayesian approach represent the PDF of PPV reasonably well. The PDF of PPV estimated from FEM analysis and the values of PPV obtained directly at the blasting site, represented by a dashed line and open circles, respectively, are also included in Fig. 5. The PDFs of PPV from the Bayesian approach and the numerical integration are consistent with the spread of the PPV values from the blasting site. In addition, 16 out of the 20 values of PPV measured at the blasting site fall within the range of 5 mm/s to 25 mm/s, which represents the 90% inter-percentile range of PPV estimated from the simulated samples obtained using the Bayesian approach. This shows that the Bayesian approach developed in this study satisfactorily simulated PPV samples consistent with the PPV values directly obtained at the blasting site.

Fig. 5
figure 5

Probability density function of peak particle velocity

Figure 6 plots the cumulative distribution functions (CDFs) of PPV estimated from the 30,000 Bayesian simulated samples and the 20 direct measurements at the blasting site by a solid line and open circles, respectively. The open circles plot closely to the solid line, which indicates that the CDF of the simulated PPV compares favorably with that of the measured PPV at the blasting site. This implies that the information contained in the simulated PPV is consistent with that of the measured PPV at the blasting site. Based on the selected regression model, the available SD data and the reasonable ranges of the model parameters of PPV, the Bayesian approach provides a reasonable estimate of the statistical distribution of PPV. Such probabilistic characterization of PPV usually requires a large amount of data from direct measurements at blasting sites, which is cost intensive and time consuming. With available SD data, a selected model and prior information, the Bayesian approach satisfactorily characterized PPV and addressed the problem of sparse PPV data at the blasting site adopted in this study.

Fig. 6
figure 6

Validation of the probability distribution for peak particle velocity estimated from Bayesian approach

Sensitivity study using simulated data

The simulated samples reflect the integrated knowledge of the regression model, prior knowledge, and blasting site SD data. Therefore, the probabilistic characterization of PPV using the Bayesian approach is affected by all three. A sensitivity study is performed to explore the effect of the quantity of SD data at a virtual blasting site. The sensitivity study uses simulated SD data, which are generated using the likelihood model (Eq. 10) of the selected regression model (M4) given by Eq. (5) with μ = 12 mm/s and σ = 5 mm/s (i.e., μN = 2.40 and σN = 0.40). Ten sets of SD data are simulated for each data quantity, n = 5, 10, 20, and 30, resulting in a total of 40 sets of SD data. Then, using the prior knowledge adopted in this study and each of these 40 sets of simulated SD data as blasting site data, 30,000 samples of PPV are simulated for each of the 40 data sets. This leads to 40 sets of the probabilistic characterization of PPV, including their mean, \(\mu^{*}\), and standard deviation, \(\sigma^{*}\). The results of \(\mu^{*}\) and \(\sigma^{*}\) from the Bayesian approach are evaluated through hypothesis tests. In the hypothesis tests, the respective acceptance regions of \(\mu^{*}\) and \(\sigma^{*}\) at a significance level of α are formulated as (Ang and Tang 2007):

$$\mu + \Phi_{{\left( {\alpha /2} \right)}}^{ - 1} \frac{\sigma }{\sqrt n } < \mu^{*} \le \mu + \Phi_{{\left( {1 - \alpha /2} \right)}}^{ - 1} \frac{\sigma }{\sqrt n }$$
(19)
$$\sqrt {\frac{{c_{\alpha /2, n - 1} }}{n - 1}} \sigma < \sigma^{*} \le \sqrt {\frac{{c_{1 - \alpha /2, n - 1} }}{n - 1}} \sigma,$$
(20)

where \(\Phi_{{\left( {\alpha /2} \right)}}^{ - 1}\) and \(\Phi_{{\left( {1 - \alpha /2} \right)}}^{ - 1}\) are the values of the inverse standard Gaussian CDF at α/2 and 1 − α/2, respectively; and \(c_{\alpha /2, n - 1}\) and \(c_{1 - \alpha /2, n - 1}\) are the values of a Chi-squared statistic with n − 1 degrees of freedom at the levels of α/2 and 1 − α/2, respectively.
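The acceptance regions of Eqs. (19) and (20) can be evaluated as in the sketch below for µ = 12 mm/s, σ = 5 mm/s and α = 0.05. To keep the example free of third-party dependencies, the Chi-squared quantile is computed with the Wilson–Hilferty approximation, which is a substitution made here and is noticeably less accurate for small n; an exact quantile routine from a statistics library would be preferable in practice. The function names are hypothetical.

```python
import math
from statistics import NormalDist


def chi2_quantile(p, dof):
    """Wilson-Hilferty approximation to the chi-squared quantile (not exact)."""
    z = NormalDist().inv_cdf(p)
    return dof * (1.0 - 2.0 / (9.0 * dof) + z * math.sqrt(2.0 / (9.0 * dof))) ** 3


def mean_region(mu, sigma, n, alpha=0.05):
    """Eq. (19): acceptance region for the estimated mean."""
    half = NormalDist().inv_cdf(1.0 - alpha / 2.0) * sigma / math.sqrt(n)
    return mu - half, mu + half


def std_region(sigma, n, alpha=0.05):
    """Eq. (20): acceptance region for the estimated standard deviation."""
    lower = math.sqrt(chi2_quantile(alpha / 2.0, n - 1) / (n - 1)) * sigma
    upper = math.sqrt(chi2_quantile(1.0 - alpha / 2.0, n - 1) / (n - 1)) * sigma
    return lower, upper


for n in (5, 10, 20, 30):
    lo_m, hi_m = mean_region(12.0, 5.0, n)
    lo_s, hi_s = std_region(5.0, n)
    print(n, round(lo_m, 2), round(hi_m, 2), round(lo_s, 2), round(hi_s, 2))
```

Both regions narrow as n grows, which is the behavior expected of the solid lines in the validation plots.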

Figure 7 shows the values of \(\mu^{*}\) versus the number n of the SD data in each data set. In the figure, the \(\mu^{*}\) values estimated from the simulated samples, the true value of µ, and the acceptance region of \(\mu^{*}\) at the 5% significance level are represented by open circles, a dashed line, and solid lines, respectively. The scatter of \(\mu^{*}\) diminishes significantly as the number of data increases from 5 to 30, as the uncertainty associated with sparse data diminishes. Note that the Bayesian approach combines the regression model, the prior knowledge and the blasting site SD data and transforms them into simulated PPV samples. When the SD data are limited (n = 5) and the uncertainty is substantial, the simulated samples are mostly dominated by the prior information. This enables reasonable estimation of μ even with such a small number of SD data. This feature of the Bayesian approach is beneficial in mining engineering, where the data obtained during site operations and tests are generally few, making their uncertainties substantial. As the number of SD data increases, for instance to n = 30, the simulated PPV samples from the Bayesian approach are dominated by the SD data, and the values of \(\mu^{*}\) become closer to the true value of µ.

Fig. 7
figure 7

Validation results for mean values of peak particle velocity estimated using different numbers of simulated SD data

Figure 8 shows the values of \(\sigma^{*}\) versus the number of the SD data in each data set. In the figure, the \(\sigma^{*}\) values estimated from the Bayesian simulated samples, the true value of σ, and the acceptance region of \(\sigma^{*}\) at the 5% significance level are represented by open circles, a dashed line, and solid lines, respectively. The \(\sigma^{*}\) values estimated from the Bayesian simulated samples gradually approach the true value of σ as the number of SD data increases. The SD data are simulated using the likelihood model (Eq. 10) of M4, which contains both variability and model uncertainty. When the PPV values are estimated directly using the regression equation, these uncertainties propagate from the SD data to the PPV values through Eq. (5). That is the reason for the obvious disparity between the measured PPV at the blasting site and the PPV directly estimated using M4 in Fig. 2. In the Bayesian approach, on the other hand, the model uncertainty is explicitly incorporated, and the \(\sigma^{*}\) values mainly reflect the variability in the PPV data.

Fig. 8
figure 8

Validation results for standard deviation values of peak particle velocity estimated using different numbers of simulated SD data

Conclusion

This study developed a Bayesian approach for model selection and probabilistic characterization of peak particle velocity (PPV) at a blasting site using measurements of scaled distance (SD). First, the Bayesian approach integrates prior knowledge about PPV and SD data at a blasting site to calculate the evidence and occurrence probability of the candidate regression models. Subsequently, the model with the highest evidence and occurrence probability is selected as the appropriate model for estimation of PPV at the blasting site. It is worthwhile to note that the approach selected the appropriate regression model for estimating the PPV at the blasting site using only the SD data available from the site. This approach is useful when SD data from a site are available and there is a need to estimate the PPV of the site. The use of a regression model becomes inevitable when there are no measured PPV data at a blasting site, which may be because of cost, time, and logistic limitations.

The selected regression model was further used in the Bayesian approach to characterize PPV by combining it with the prior knowledge and site SD data. The Bayesian approach simulated PPV samples from the updated information resulting from the integration of the regression model, prior knowledge, and blasting site SD data. Characterization of PPV of this kind usually requires a large amount of measured data at the blasting site, which is demanding in time, cost and logistics. The approach has been illustrated using real-life blasting site data and found to perform satisfactorily. The appropriate regression model is selected properly, and the statistics and probability distribution of the simulated PPV agree with those of the measured PPV at the site, which are used only for validation purposes in this study. Furthermore, sensitivity studies were performed to explore the effect of data quantity on the approach. It has been shown that the proposed approach can be used for different quantities of blasting site data.