1 Introduction

In underground hard-rock mines, dozens of kilometers of drifts are excavated yearly using explosives. The excavation process allows advances of about 5 m per blasting round. Mining activities are expected to develop at greater depths under higher stress conditions, where development blasting may initiate seismicity. The intensity of seismic responses to development blasts is expected to increase at these depths, presenting a more significant hazard. Seismic hazard is the probability of occurrence, within a given area and time interval, of a seismic event of at least a certain magnitude (Gibowicz and Kijko 1994). Seismic events can cause damage to the rock mass, rock ejection or rockburst, and lead to severe consequences such as area closures, loss of production, injuries or even fatalities. Therefore, quantifying the seismic hazard is critical to the risk-management process. In mining geomechanics, seismic risk is defined in terms of seismic hazard, exposure, and consequences (Potvin 2009; Hudyma and Potvin 2010).

1.1 Risk-Management Process

Geomechanical risk analysis is not an objective in itself but part of the decision-making process. The fundamental aspects of risk analysis specific to geotechnics, rock mechanics or geomechanics have been studied by numerous authors (Einstein 1996; Baecher and Christian 2003). In 2009, the National Association of Corporate Directors (NACD) stated that a risk appetite or risk-tolerance statement is at the heart of an effective risk management program and is linked to the organization’s overall risk management philosophy and strategic ambition (NACD 2009). Mining companies are expected to systematically apply procedures to contextualize, establish, monitor, evaluate, process, record, review and communicate risk, according to the International Organization for Standardization (ISO 2018), as shown in Fig. 1.

Fig. 1 Risk-management process (Hadjigeorgiou 2020; after ISO 2018)

The introduction of risk management and communication strategies has undeniably contributed to a safer mining work environment (Hadjigeorgiou 2020). This is evident in seismically active mines, where procedures for communication and follow-up actions upon large seismic events are commonly employed. Developing a tool for quantifying seismic hazard is necessary to manage seismic risk adequately. This tool must be understandable for all decision-makers because a company’s highest leadership and management level must define its risk appetite (Hadjigeorgiou 2020). For example, it is not up to the ground control or rock mechanics engineer to establish a mine's risk appetite on geomechanical issues, such as managing the seismic hazard associated with the blasting of development drifts while meeting production requirements in a highly stressed deep mine. The concept of risk appetite helps make the necessary trade-offs between business objectives and define risk tolerance (Quail 2012). A target risk appetite for each business objective can be established using an index like the one shown in Table 1.

Table 1 Risk appetite scale (from Hadjigeorgiou 2020; after Quail 2012)

1.2 Re-entry Protocols

For managing seismic hazard following development blasts, it is common for mines to implement exclusion protocols. Effective re-entry protocols can address the short-term seismic hazard following drift blasts. Exclusion protocols that follow blasting have a spatial and temporal aspect, but some mines include additional elements in their protocols for managing the seismic risk associated with drift development. Implementing these protocols depends on the personnel's understanding of the seismic hazard and their ability to anticipate it.

For many mine sites, re-entry protocols rely on general rules (blanket rules) based on experience (Potvin et al. 2022). These conservative blanket rules, defined by a fixed radius and closure period, are standard practice for re-entry analyses following blasts (Potvin et al. 2019). This was found in a compilation of seismic risk management practices in underground mines by Potvin et al. (2019), based on a combination of site visits and interviews at 16 operating mines in five countries experiencing various levels of seismicity, in addition to a survey of 30 mines worldwide. Some mines with more advanced practices base their rules and tolerance thresholds on an assessment of seismic history (Potvin et al. 2019). Approaches that use past seismicity to establish re-entry protocols rely on various parameters, the most popular being cumulative event count and seismic event rate (Vallejos and McKinnon 2008; Potvin et al. 2022). Re-entry time can be based on historical or near real-time responses with these parameters (Potvin et al. 2022). The general rules are sometimes adjusted according to anticipated seismicity based on experience, depending on the variability of the geotechnical domains. Site experience and geotechnical judgment are important in adjusting these blanket-rule protocols. The limitations of these approaches are:

  • The experience is empirical, often not detailed or supported by statistical analysis, and is therefore difficult to transfer to other sites.

  • In all cases of re-entry protocols, the factors influencing the variability of seismic responses are poorly understood.

  • None of the methods include the influence of geology and structures on the intensity of seismic responses, and no quantitative studies have been conducted on these factors.

Evaluating a protocol’s efficiency is critical to assessing how well seismic hazards are anticipated, which is a key component of seismic risk management. Only a few examples have been published showing success in estimating seismic hazards from blasting (Tierney et al. 2019). Systematic retrospective analyses are rarely conducted to evaluate the effectiveness of assessing seismic hazards and the number of false alarms generated by protocols based on seismic source parameters. According to Morkel and Rossi-Rivera (2017), at two mine sites, the conditions under which the modified Vallejos and McKinnon (2008) method is more accurate than applying a blanket re-entry rule are unclear and require further investigation. This highlights how uncertain the success of re-entry protocols is, the difficulty of quantifying that success properly, and the difficulty of defining what constitutes success. These are all limitations of current practices for managing seismic risk with re-entry protocols.

1.3 Risk-Management Process Applied to Seismic Hazard Following Development Blasts

As mentioned, current seismic risk management tools such as re-entry protocols do not quantitatively incorporate the influence of geology and structures on the intensity of seismic responses to development blasts. However, it has been shown that specific geologic and structural variables significantly influence the intensity of seismic responses to development blasting (Goulet et al. 2024). This is a clear limitation of current risk management practices linked to development blasts. Furthermore, the performance of the protocols is not fully quantified. It is therefore difficult to efficiently communicate the risk associated with the hazard of the seismic response. Without a performance-measurement tool, decision-makers cannot judge whether the match between the anticipated and actual hazard is adequate. This judgment must be based on quantification and be a function of risk appetite. This is a key objective of the paper.

This paper details how predictive models of seismic response intensity, based on the interpretation of geologic and structural variables, can be used to understand, manage, and communicate the seismic hazard associated with development blasting. This is achieved by developing multivariate random forest predictive models from a database of 379 drift blasts at a mining site for which 32 geologic and structural variables were interpreted and for which the seismic response to blasting was delineated and quantified. The methodology to develop such a database is fully detailed in Goulet et al. (2024).

The random forest model is a robust prediction method with excellent predictive power that can implicitly select important variables and remain understandable, even if not fully interpretable. Other prediction methods, such as a conditional tree, are more interpretable but have weaker predictive power and a high risk of overfitting. Another advantage of random forest models is that they allow non-linear relationships, which is not possible with multiple regression models or canonical correlation analysis (CCA). In addition, categorical variables can be included in the model, which is not possible with predictive models such as multiple regression or partial least squares regression (PLS). A downside of the random forest algorithm is the number of hyperparameters that must be chosen to build the model. Using predictive models rather than explanatory statistical models such as bivariate analysis, principal component analysis (PCA), or factor analysis of mixed data (FAMD) yields predicted values, which is highly valuable for seismic hazard management. Currently, mines do not use predictive models to assess the seismic hazard. Explanatory statistical models are also not used in mines to understand or anticipate seismic hazards related to development blasting. As mentioned in Sect. 1.2, seismic hazard management is generally linked to re-entry protocols based on a conservative blanket rule (Potvin et al. 2022).

Developing random forest models to anticipate the seismic hazard associated with development blasting provides a decision-making tool to quantify acceptability thresholds. How these models could quantify decision-makers' risk appetite is detailed using different thresholds related to the five tolerance levels presented in Table 1. This could then be used as a visual communication tool for decision-makers. To do so, the performance of the models in anticipating high seismic hazards for these different tolerance levels is quantified in terms of precision, sensitivity, and accuracy.

To obtain a comparative measure of predictive model performance, the performance criteria of the random forest models for predicting the seismic hazard of blasting responses are compared to those of the current method of implementing seismic protocols to capture high seismic hazards. This comparison is not intended to establish that one method is better than another, since the two do not serve the same intent. Instead, it serves as a benchmark to demonstrate the added value of the predictive models developed as a tool for seismic hazard management. In addition, it shows that both methods can be more powerful if used concurrently.

2 Development of Predictive Models

This section presents how a multivariate statistical model can be developed, based on the geologic and structural properties of the rock mass, to predict seismicity associated with development mining. First, the data used in the analysis will be detailed. The general concept behind the random forest multivariate approach will then be presented, the results will be discussed, and finally, the model performance will be assessed.

2.1 Database of Development Blasts

A good-quality database of development blasts in a seismically active mine is needed to perform multivariate analysis. The selected database meets several criteria that make it suitable for investigating the impact of geologic and structural variables on the seismic responses to development blasting.

These criteria are:

  1. The study area is outside the zone of influence of the mining stopes in terms of stress redistribution and induced seismicity.

  2. Different levels of seismic hazard, including high seismic hazard, are observed during drift development.

  3. The geology and structures in the area are variable, and the corresponding data are available.

  4. An identical blasting design is used.

  5. The quality of seismic data is sufficient.

  6. A strong correlation exists between seismicity and the development of mining drifts.

The selected case study consists of three levels of a deep, hard-rock underground mine located 3 km below the surface. The study period ranges from 1 August 2018 to 13 December 2019. Development drifts were segmented into the volume excavated during a single development blast (≈3 m × 5 m × 5 m). Goulet et al. (2024) present details of the data compliance with the criteria. A total of 379 drift segments were characterized for this analysis. A response variable and input variables must be defined for multivariate predictive analysis.

2.1.1 Response Variable

The response variable in this investigation is a parameter quantifying the intensity of the seismic responses to the blasting of development drifts. Seismic events located within a 40 m radius of the development blast and within a time window of 11 h following that blast were selected to represent the seismic response. These limits were chosen to ensure that all selected seismic events belonged to the associated development blast, based on the investigation of seismic database quality for the case study (see Goulet et al. 2024 for further details). Because the blasts investigated produce seismic responses that are limited in time and space, it was not deemed necessary to further study the spatial–temporal relationships between the blast and the events. To minimize the risk of contaminating the database of seismic responses to development blasts, all drift blasts that occurred within 24 h of any of the 53 production blasts in the entire western sector during the study period were not considered. Development blast responses that could interact with other development blast responses, based on the radius and time window used, were also excluded from this analysis.
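As an illustration of this selection step, the sketch below filters a hypothetical seismic catalogue (a data frame `events` with coordinates, origin times, and magnitudes) to the 40 m / 11 h window around a blast; the column names and the `blast` object are assumptions for illustration, not the site's actual data structure.

```r
# Sketch: delineate the seismic response to one development blast, assuming a
# catalogue 'events' with columns x, y, z (m), time (POSIXct), and mw.
select_response <- function(events, blast, radius_m = 40, window_h = 11) {
  d  <- sqrt((events$x - blast$x)^2 +
             (events$y - blast$y)^2 +
             (events$z - blast$z)^2)
  dt <- as.numeric(difftime(events$time, blast$time, units = "hours"))
  events[d <= radius_m & dt > 0 & dt <= window_h, ]
}

# Example with a hypothetical blast record:
# blast <- list(x = 1000, y = 2000, z = -2900,
#               time = as.POSIXct("2019-02-01 03:00"))
# response <- select_response(events, blast)
```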

The complete seismic database of the mine was used to assess the seismic data quality and to delineate seismic responses. A seismic monitoring system using ESG processing has been in place at the mine site for over a decade. In the studied sector, 14 uniaxial 50 Hz accelerometers, three triaxial 50 Hz accelerometers, and one triaxial 15 Hz geophone were used to record the seismic events. The median location error is 4 m, with a 99th percentile of 9.5 m. More details on the seismic data quality can be found in Goulet et al. (2024). The frequency–magnitude plot presented in Fig. 2 indicates the quality of the seismic database for the selected area during the investigated period. The seismic database is complete for events of MW ≥ − 0.5 (Fig. 2). Arguably, the bending point around MW = − 2 suggests that even if the database is not complete between − 2 ≤ MW ≤ − 0.5, it offers a good representation of the seismic response. Predictive models are developed based on the variation of the values of the response variable; it is therefore preferable to have an extensive range of variation in the response. Considering − 2 ≤ MW ≤ − 0.5 seismic events is a way to assign seismic response intensity values to responses for which no MW ≥ − 0.5 events are observed. Imputing minimum threshold values associated with MW ≥ − 2 events thus maximizes the performance of the predictive models by reducing, as much as possible, the influence of the imposed intensity threshold values, given the data’s intrinsic limitations.

Fig. 2 Frequency–magnitude relationship between 1 August 2018 and 13 December 2019 for the three studied levels (all seismic events). Magnitude is moment magnitude (MW)

To quantify the intensity of the seismic responses, the logarithm of the sum of the seismic moments (log (ΣM0)) of all events constituting a response was used. To develop predictive models, a value must be allocated to every seismic response intensity; without a quantified response variable, the data point becomes unusable. The absence of a seismic event captured by the seismic system constitutes a null seismic response, not an unknown one. The minimum threshold of log (∑M0) quantifying the intensity of the seismic responses is fixed at 6, which is equivalent to MW = − 2, based on the relation between moment magnitude (MW) and seismic moment (M0) defined by Hanks and Kanamori (1979), Eq. (1):

$$M_{\text{W}} = \tfrac{2}{3}\,\log_{10}\left(M_{0}\right) - 6.0$$
(1)

For the seismic responses that did not contain events of MW ≥ − 2, the seismic response intensity (log (∑M0)) was assigned the value 6, which is the minimum recorded value. Based on experience at the mine site, it is assumed that damage is generally caused by magnitudes of 0.4 or greater (MW ≥ 0.4). However, events of MW ≥ 0 have been known to generate damage in unreinforced mine excavations. Using a conservative approach, the intensity of the seismic response is considered high when an event of MW ≥ 0 occurs. Using MW = 0 in Eq. (1), the intensity of a seismic response is therefore considered high when log (ΣM0) ≥ 9. Among the 379 development blasts of drift segments studied, 17% (65/379) are of high intensity (log (ΣM0) ≥ 9).
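A minimal sketch of this conversion and the resulting high-intensity classification follows; the function names are illustrative assumptions only.

```r
# Sketch: convert moment magnitude to log10 seismic moment using Eq. (1)
# (Hanks and Kanamori 1979, as written above) and flag high-intensity responses.
mw_to_logM0 <- function(mw) 1.5 * (mw + 6.0)  # inverse of MW = (2/3) log10(M0) - 6.0

mw_to_logM0(-2)   # 6   -> minimum intensity imputed when no MW >= -2 event occurs
mw_to_logM0(0)    # 9   -> high-intensity threshold on log(sum(M0))
mw_to_logM0(0.4)  # 9.6 -> magnitude generally associated with damage at the site

is_high_intensity <- function(log_sum_M0, threshold = 9) log_sum_M0 >= threshold
```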

2.1.2 Input Variables

The input variables comprise 32 geologic and structural variables, interpreted for the 379 drift segments for which the seismic response to the development blast was delineated and quantified. Four of these 32 variables are categorical, and 28 are quantitative. This extensive database does not contain any missing data. Field visits to the mine drifts, core logging (57.8 km) and laboratory tests provided first-hand experience with the data collected for this project. Table 2 presents the rationale behind the 32 input variables used in this analysis. Goulet et al. (2024) further detail the interpretation of the variables for each drift segment. The data distribution can also be consulted in Goulet (2024).

Table 2 Input variables used for the development of multivariate predictive analysis models. * denotes a categorical variable; other variables are quantitative

A collinearity test was performed on all quantitative input variables to verify their redundancy. This test uses the variance inflation factor (VIF) (Fox and Monette 1992) to assess the redundancy of the variables. A variable is deemed redundant when its VIF value is greater than 10. The Alteration intensity rating (AIR) had the highest VIF value in the set of variables (VIF = 85) and was removed from the database. After its removal, the highest VIF was that of the Shear intensity index (SII) (VIF = 40), which was therefore also removed from the database. Following this removal, no VIF value is higher than 10 among the 30 remaining input variables.
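A compact sketch of this iterative VIF screening is given below; it computes each variable's VIF from an auxiliary regression on the remaining quantitative variables and drops the worst offender until all values fall below 10. The data-frame name `quant_vars` is an assumption.

```r
# Sketch: iterative VIF screening of the quantitative input variables.
# VIF_j = 1 / (1 - R^2_j), where R^2_j comes from regressing variable j on all
# other quantitative variables (Fox and Monette 1992).
vif_one <- function(j, df) {
  r2 <- summary(lm(reformulate(setdiff(names(df), j), response = j), data = df))$r.squared
  1 / (1 - r2)
}

drop_collinear <- function(df, cutoff = 10) {
  repeat {
    vifs <- sapply(names(df), vif_one, df = df)
    if (max(vifs) <= cutoff) break
    df <- df[, names(df) != names(which.max(vifs)), drop = FALSE]  # drop worst variable
  }
  df
}

# quant_vars: data frame of the quantitative input variables (assumed name)
# cleaned <- drop_collinear(quant_vars)
```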

2.2 Random Forest Models

The random forest model (Breiman 2001) was chosen to perform the multivariate predictive analysis. Random forest is a tree-based method that generates multiple trees to obtain a predictive value. It was chosen for its ability to capture a complex data structure.

Random forests are supervised ensemble learning methods. An ensemble method is a set of models whose predictions are combined in some way to obtain a more stable prediction. Ensemble methods build several trees and aggregate the results. Random forests are a combination of tree predictors such that each tree depends on the values of a random vector sampled independently and with the same distribution for all trees in the forest (Breiman 2001). Randomly selected inputs or combinations of inputs are used at each node to grow each tree. The generalization error for forests converges to a limit as the number of trees in the forest becomes large (Breiman 2001). The generalization error for a forest of tree classifiers depends on the strength of the individual trees in the forest and the correlation between them (Breiman 2001). The correlation between trees is reduced by selecting a sample of variables at each node. Breiman (2001) demonstrated the almost-sure convergence of random forests, i.e., adding trees does not degrade the model through overfitting. The mathematics of random forests is detailed by Breiman (2001).

The random forest multivariate analysis was performed using the ranger function of the ranger package in R (R Core Team 2020). This package is a fast implementation of the random forest algorithm developed by Breiman (2001). It includes implementations of extremely randomized trees (Geurts et al. 2006) as well as quantile regression forests (Meinshausen 2006). Permutation importance is scaled by its standard deviation, as described by Breiman (2001). The parameters used to build the model are:

  • number of trees (num.trees) = 500;

  • splitting criterion (splitrule) = variance;

  • no minimum number of observations in a terminal node (min.node.size);

  • no predetermined number of variables evaluated at each node (mtry);

  • importance equal to ‘permutation’ to obtain the importance of variables in the model.

To build a predictive model, the database must be divided into two sets: training and test data. The model was constructed from the training sample comprising ≈ 75% of the data, i.e., 279 drift segments. The remaining ≈ 25% (100 drift segments) was used to verify the performance of the models, i.e., their ability to anticipate the intensity of seismic responses to the development blasting of these drift segments.

The database division was done randomly to ensure a range of geologic and structural variable values in both sets. Minimal variability would have been observed if the division had been done temporally: all the sulfide development occurs in the last 20% of the database time range, and since no development blast had been done in these lithologies before that moment, there would likely have been an incongruity between the training and test performance results.
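The following sketch shows how such a split and model fit could be coded with the ranger package, using the parameters listed above; the data-frame and column names (`blast_db`, `log_sum_M0`) and the random seed are assumptions, and unset arguments (mtry, min.node.size) fall back to the package defaults.

```r
library(ranger)

set.seed(42)  # assumed seed; the paper does not report one

# blast_db: one row per drift segment, with the response log_sum_M0 and the
# 30 retained geologic/structural input variables (assumed data frame name).
n         <- nrow(blast_db)
train_idx <- sample(seq_len(n), size = floor(0.75 * n))  # random ~75/25 split
train_db  <- blast_db[train_idx, ]
test_db   <- blast_db[-train_idx, ]

rf_model <- ranger(
  log_sum_M0 ~ .,
  data       = train_db,
  num.trees  = 500,
  splitrule  = "variance",
  importance = "permutation",
  scale.permutation.importance = TRUE  # scale importance as described by Breiman (2001)
)

pred_test <- predict(rf_model, data = test_db)$predictions
```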

2.3 Results and Performance

The interpretation of the random forest model primarily assesses the relative importance of input variables in explaining the variance of the response variable. However, the direction in which an input variable affects the response variable cannot be determined directly. The importance of each input variable is obtained by performing permutation tests and measuring the misclassification rate. For each constructed tree, the values of each variable in the out-of-bag examples are randomly permuted, and the out-of-bag data are run down the corresponding tree (Breiman 2001). The prediction for each out-of-bag observation is saved. Afterward, the plurality of out-of-bag predictions for each observation, with each variable noised up (permuted), is compared to the original prediction of the observation to give a misclassification rate (Breiman 2001). When the misclassification rate is high, the variable strongly influences the model (high importance); when it is low, the variable is not used much for prediction (low importance). Figure 3 shows the importance of each input variable for the seismic response intensity quantified by log (∑M0).
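A short sketch of how the permutation importance plotted in Fig. 3 can be extracted from the fitted model is shown below, continuing with the hypothetical `rf_model` object from the previous sketch.

```r
# Sketch: retrieve and rank the permutation importance stored by ranger.
imp <- sort(rf_model$variable.importance, decreasing = TRUE)
head(imp, 10)  # ten most important geologic/structural variables

barplot(rev(imp), horiz = TRUE, las = 1,
        main = "Permutation importance, log(sum(M0)) model")
```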

Fig. 3 Importance of geologic and structural variables on the random forest model anticipating log (ΣM0)

Figure 3 emphasizes the great importance of the distance to a lithological contact for the random forest model. Variables related to drift direction and the angle between the schistosity and the direction of the drift segment are also of undeniable importance. These two variables are roughly equivalent since the orientation of the schistosity is relatively constant. Lithologies play an indirect role in the model through the intensity of chlorite and biotite alteration, in addition to the fabric of sericitic alteration. The distance to a shear zone or fault plane is also significant for the model. Other variables have less importance.

A prediction is obtained from each of the 500 trees in the random forest model. The model’s performance can be measured by comparing the average value predicted by the 500 trees with the true value. A scatter plot comparing the true (observed) versus the predicted response average values is a visualization tool to assess the model's performance; scatter plots are shown for the training data (Fig. 4a) and the test data (Fig. 4b). Two criteria in Table 3 quantify the model's performance: the root mean square error (RMSE) and the coefficient of determination (R2).
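The two criteria can be computed directly from the true and predicted vectors, as in the sketch below, again using the hypothetical objects from the earlier sketches.

```r
# Sketch: performance criteria on the test set.
rmse <- function(true, pred) sqrt(mean((true - pred)^2))
r2   <- function(true, pred) 1 - sum((true - pred)^2) / sum((true - mean(true))^2)

rmse(test_db$log_sum_M0, pred_test)
r2(test_db$log_sum_M0, pred_test)
```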

Fig. 4 True versus predicted log (ΣM0) values of seismic responses to development blasts for (a) the 279 drift segments used to build the predictive model (training database) and (b) the 100 drift segments of the test database. The black line represents a 1:1 ratio, while the green and blue dashed lines are the best linear fits

Table 3 Performance criteria for random forest models measured on the 279 training data and 100-test data to predict the intensity of seismic responses to development blasts

The minimum intensity value of log (∑M0), fixed at 6, influences the predictive model and particularly the performance criteria (R2 and RMSE). The values of R2 and RMSE are nevertheless reported, since they are used only for comparative purposes. The R2 is about two times lower for the test data than for the training data, while the RMSE is around two times higher for the test data than for the training data. The RMSE values are small relative to the range of values of the variable: the average difference between the predicted and true values of log (∑M0) is 0.55, while the range covered is 6 to 11. There is a discrepancy between the best linear fit and the 1:1 ratio lines. Although the linear fit deviates from the 1:1 ratio, a strong correlation remains between the true and predicted values, which allows the intensity of the seismic responses to be anticipated with some uncertainty.

The R2 and RMSE quantify the fit between the true and predicted values in a single value. Moreover, the prediction error distribution, i.e., the difference between the true and predicted values, can be measured directly, as shown in Fig. 5 for the test data. The prediction error distribution is the same whether the seismic response intensity is low (log (∑M0) < 9) or high (log (∑M0) ≥ 9) (Fig. 5). One-third (33%) of the seismic response intensities are predicted within ± 0.5, 70% within ± 1.0 and 91% within ± 1.5. This graph reflects well how the random forest method works: the gap between the true value and the predicted value has the same distribution regardless of the value of the predicted variable. However, a bias is introduced in the data by the selected minimum intensity (log (∑M0) = 6). This causes an overall underestimation of high-intensity responses, for which the prediction has the most consequence regarding seismic hazards.

Fig. 5 Gap between the true value of the log (ΣM0) of seismic responses to development blast and the predicted value of test data

To reduce the bias caused by the minimum intensity values in the model, the predicted value can be adjusted based on the best linear fit of the training data (illustrated in Fig. 4a) using the following equation, Eq. (2):

$$\text{Adjusted Predicted} = \left(\text{Predicted} - 2.46\right)/0.68$$
(2)

Figure 6 illustrates the adjusted predicted values for the training and test data. This adjustment does not change the dispersion of the data (RMSE or R2). It facilitates the direct comparison between the predicted and true values by minimizing the bias in the linear fit created by the minimum log (ΣM0) value of 6 imposed on the data used to build the random forest model.
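Applying Eq. (2) is a one-line transformation, sketched here for the hypothetical prediction vector used in the earlier sketches.

```r
# Sketch: bias adjustment of the raw random forest predictions, Eq. (2).
adjust_pred   <- function(pred) (pred - 2.46) / 0.68
pred_test_adj <- adjust_pred(pred_test)
```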

Fig. 6 Adjusted predicted versus true log (ΣM0) values of seismic responses to development blasts for (a) the 279 segments used to develop the model (training database) and (b) the 100 segments of the test database. The black line represents a 1:1 ratio, while the green and blue dashed lines are the best linear fits

As seen in Fig. 6a, the adjustment of the predicted data allows the training data to follow a 1:1 trend. The test data (Fig. 6b) are also closer to a 1:1 ratio than the predicted (not adjusted) data (Fig. 4b). As with the predicted (not adjusted) values, the distribution of the deviation between the adjusted predicted values and the true values can be measured directly (Fig. 7). Among the 100-test data, only four seismic response intensities have a prediction error greater than ± 2 (Fig. 7). Among the 100 seismic responses composing the test data, 15 are of log (ΣM0) ≥ 9 (high intensity, represented by a red line in Fig. 7) and 85 are of log (ΣM0) < 9 (low intensity, represented by a green line in Fig. 7). The log (ΣM0) of the seismic responses was anticipated within ± 0.85 for 87% of the high-intensity responses (13/15) and for 53% of the low-intensity responses (45/85), represented by blue dotted lines in Fig. 7. This comparison shows that the model with adjusted predicted values anticipates the high-intensity seismic responses more accurately than the low-intensity ones, which mine management could arguably deem more useful for achieving its goals.

Fig. 7 Gap between the true value of the log (ΣM0) of seismic responses to development blast and the adjusted predicted value of test data. Adjusted predicted value refers to the log (ΣM0) value adjusted from the linear best-fit equation of the model from the training data (Adjusted Predicted = (Predicted − 2.46)/0.68). The blue dashed line indicates the proportion of seismic responses anticipated within ± 0.85 for true high-intensity (log (ΣM0) ≥ 9) and low-intensity (log (ΣM0) < 9) seismic responses

Adjusting the predicted values decreases the overall accuracy for the whole data set (Table 4). However, the model’s accuracy in predicting the log (ΣM0) of high-intensity responses is greater when the adjusted predicted values are used: only 67% of high-intensity responses are predicted within ± 0.85 with the predicted (not adjusted) values versus 87% with the adjusted predicted values.

Table 4 Proportion of test data for various accuracy in prediction values

Using the raw predicted data or the adjusted predicted data are both acceptable approaches for quantifying the anticipated seismic hazard with the model. The choice depends on what the decision-maker prioritizes when comparing the true versus predicted values: a more accurate prediction of all seismic response intensities (raw predicted data) or a higher prediction accuracy for high-intensity responses with a slightly lower prediction accuracy for low-intensity responses (adjusted predicted data). These are different, valid approaches to seismic hazard management. These quantifications ease the communication of the anticipated seismic hazard to all people concerned by seismic risk management. Effective communication is critical to adequate planning.

3 Strategy for Using Predictive Models as a Decision-Making Tool for Seismic Hazard Management Related to Development Blasts

A strategy for using random forest predictive models of seismic response intensity based on geologic and structural variables interpreted at each drift segment is exemplified in the following sections to improve hazard management. The strategy uses the different levels of risk appetites defined in Table 1, which are open, flexible, cautious, minimalist, and averse.

This strategy includes four steps:

  1. Establish threshold values of seismic response intensity at drift segment blasts for the five risk appetite levels, based on the distribution and best-fit line of the values predicted by the random forest models versus the true values.

  2. Apply the established seismic response intensity thresholds to the test data to classify, based on a binary division (low or high intensity), each seismic response according to its prediction versus reality: true positive (TP), true negative (TN), false negative (FN) or false positive (FP).

  3. Evaluate the predictive model's performance on the test data using the accuracy, sensitivity, precision, and F1-score criteria.

  4. Identify thresholds for performance criteria deemed acceptable by decision-makers.

These steps are detailed in the following sections, and their application is shown in the results of the developed random forest model. Note that the non-adjusted values are used in this section of the paper.

3.1 Definition of Intensity Thresholds and Application to Test Data

The training data (279 values) are used to set the intensity thresholds of the seismic responses to development blasts for the five risk appetite levels. The thresholds used in this paper were selected somewhat arbitrarily from Fig. 4; in practice, operation management should define these values. To set the different intensity thresholds of log (ΣM0), the best-fit line between the true and the random forest model predicted values of seismic response intensity to development drift segment blasts is used. The five seismic response intensity (log (ΣM0)) thresholds are defined as follows and illustrated in Fig. 8:

  1. Open: threshold at which 75% of the true high-intensity seismic responses (log (ΣM0) ≥ 9) in the training data are anticipated as high intensity by the model [8.82].

  2. Flexible: threshold equivalent to the predicted value at which the best-fit (mean) line of the model developed with the training data reaches the high-intensity limit (log (ΣM0) ≥ 9) [8.57].

  3. Cautious: threshold at which 100% of the true high-intensity seismic responses (log (ΣM0) ≥ 9) in the training data are anticipated as high intensity by the model [7.89].

  4. Minimalist: same threshold as Cautious, with a buffer zone added and rounded down to a lower value [7.5].

  5. Averse: zero threshold; all blasts are affected by special measures [6].

Fig. 8 Definition of seismic responses' intensity thresholds for the five different risk appetite levels based on true versus predicted values of the training data

The thresholds for the five tolerance levels quantifying the intensity of the seismic responses are shown together in Fig. 9a. The thresholds for seismic responses to development blasts established with the training data are then applied to the 100-test data (Fig. 9b).

Fig. 9 Threshold of seismic response intensity to development blasts predicted by the model for five risk appetite scales (dashed lines): open (blue), flexible (green), cautious (yellow), minimalist (orange), and averse (red) for the training (a) and test (b) data. The true high seismic hazard threshold is shown as a solid red line on the true value axis

3.2 Evaluate Performance by Binary Performance Criteria

One of the first steps in assessing risk is to evaluate how well an approach anticipates a hazard. The performance of a strategy or model in anticipating a variable can be measured by performance criteria that assess the difference between reality and what has been anticipated. Such criteria include the RMSE and the R2. However, these criteria can be difficult for most decision-makers to understand and are not expressed in terms of seismic hazard.

Performance criteria based on binary responses such as 'high-intensity seismic response’ (log (ΣM0) ≥ 9) and 'low-intensity seismic response' (log (ΣM0) < 9) are more understandable and, therefore, more appropriate as a hazard management and communication tool for seismic responses to development blasts. This binary division creates four categories of results when comparing anticipated versus true seismic response intensity: TP, TN, FP, and FN. TPs are seismic responses whose true intensity is high and was anticipated as such. TNs are seismic responses whose true intensity is low and was anticipated as such. FPs are responses whose true intensity is low but was anticipated as high. FNs are seismic responses whose true intensity is high but was anticipated as low.

All 100 seismic responses of the test data are thus classified into one of the four classes (TP, TN, FN, and FP) for the five risk appetite levels. Each of the five intensity thresholds creates a different division of the graph and thus a different classification of the data into the four classes (TP, FP, FN, and TN) for evaluating the model’s performance. The delineation of these threshold-specific regions is shown in Fig. 10.

Fig. 10 Values predicted for seismic response intensity to development blasts using random forest models versus the measured intensities for the 100-test data. The graphs are subdivided into four zones: true positives (TP), true negatives (TN), false positives (FP), and false negatives (FN)

The proportion of seismic responses in these four categories is used to calculate various comprehensible performance criteria.

First, the proportion of observations that are well classified is called accuracy and is calculated as Eq. (3) (Bruce et al. 2020):

$$\text{Accuracy}= \frac{\text{TP} +\text{TN}}{\text{TP} + \text{FP} + \text{TN} + \text{FN}}$$
(3)

The proportion of high-intensity seismic responses (TP) to the true number of high-intensity seismic responses is called sensitivity and is calculated as Eq. (4) (Bruce et al. 2020):

$$\text{Sensitivity}= \frac{\text{TP}}{\text{TP} + \text{FN}}$$
(4)

Precision is the proportion of anticipated high-intensity seismic responses (TP) that had high intensity and is calculated as Eq. (5) (Bruce et al. 2020).

$$\text{Precision}= \frac{\text{TP}}{\text{TP} + \text{FP}}$$
(5)

Sensitivity or precision alone cannot be used as a single measure for choosing the best model, since minimum or maximum values can be obtained with useless models. In other words, maximum sensitivity is possible even when precision is minimal, and vice versa. A balance between the two is ideal. The F1-score combines a model's sensitivity and precision to consider both criteria. The current notation of the F1-score follows Chinchor (1992), Eq. (6).

$$\text{F1-score} = 2 \cdot \frac{\text{Precision} \times \text{Sensitivity}}{\text{Precision} + \text{Sensitivity}}$$
(6)

The number of data points within the four classes (TP, TN, FN, and FP) is used to compute the accuracy, sensitivity, precision, and F1-score performance criteria for each risk appetite level (Table 5).
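The classification and the four criteria of Eqs. (3)–(6) can be computed in a few lines, as sketched below for the hypothetical test-set objects used in the earlier sketches; the vector of appetite thresholds reproduces the values listed in Sect. 3.1.

```r
# Sketch: binary classification at a prediction threshold and the resulting
# performance criteria (Eqs. 3-6). 'true' and 'pred' are log(sum(M0)) vectors.
appetite_thresholds <- c(open = 8.82, flexible = 8.57, cautious = 7.89,
                         minimalist = 7.5, averse = 6)

binary_performance <- function(true, pred, threshold, high_cut = 9) {
  tp <- sum(true >= high_cut & pred >= threshold)
  tn <- sum(true <  high_cut & pred <  threshold)
  fp <- sum(true <  high_cut & pred >= threshold)
  fn <- sum(true >= high_cut & pred <  threshold)
  precision   <- tp / (tp + fp)
  sensitivity <- tp / (tp + fn)
  c(accuracy    = (tp + tn) / (tp + tn + fp + fn),
    sensitivity = sensitivity,
    precision   = precision,
    F1          = 2 * precision * sensitivity / (precision + sensitivity))
}

# One column of criteria per risk appetite level:
# sapply(appetite_thresholds, binary_performance,
#        true = test_db$log_sum_M0, pred = pred_test)
```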

Table 5 Performance of predictive random forest models in anticipating high-intensity seismic responses for the 100-test data set

The threshold of tolerance or acceptability depends on the risk appetite of decision-makers. The anticipation of seismic hazards contains a high degree of uncertainty. In geomechanics, choices involving gains are often risk-averse, and options involving losses are often risk-taking (Hadjigeorgiou 2020). Risk aversion is characterized by high sensitivity, while risk-taking is characterized by high accuracy and precision. In other words, when the goal is to capture as many high-intensity seismic responses as possible, a greater proportion of high-intensity seismic responses are indeed captured, but more low-intensity seismic responses are also anticipated as representing a high seismic hazard. This is reflected in Table 5: an open risk appetite is associated with the highest accuracy and precision but the lowest sensitivity. Conversely, a perfect sensitivity is obtained for an averse appetite but is associated with an accuracy and precision of 15%. Generally, a ‘moderate’ risk appetite represents a compromise between sensitivity and precision, which a higher F1-score reflects. The highest F1-score is obtained for the open risk appetite (F1-score = 48%), which also has the highest accuracy.

The value of these criteria can be used as a risk-tolerance communication tool for decision-makers. The tool provides quantitative data on the distribution of the seismic hazard associated with development blasts, which management can use to better identify their tolerance and prioritize their criteria. In addition, the developed random forest model allows decision-makers to identify thresholds for acceptable performance criteria. These criteria do not require extensive technical knowledge of rock mechanics and can therefore be well understood and used by decision-makers. Hadjigeorgiou (2020) emphasizes the need to ensure that the company's risk appetite is understood at all levels. The degree of acceptability of the models is to be determined by decision-makers.

4 Development Blasts Under Seismic Protocols

The mine site has implemented exclusion protocols to manage seismic hazards following development blasts. Since 2016, a seismic procedure has been developed to reduce workers’ exposure to potential projections that could be generated in the development headings following blasting. Projections are generally caused by high-intensity seismic responses and refer to the ejection of rock from the face.

The implementation of protocols decreases the workers’ exposure in two ways. First, a post-blasting access restriction is established for a radius of 60 m and a period varying from 1 to 6 h. Second, rock bolting with closed-cabin equipment is required, which reduces the worker’s exposure. Open-cabin equipment is less expensive than closed-cabin equipment and is available in greater numbers on-site. The access restriction and equipment required when implementing development protocols mean that the speed of drift development is affected: implementing a seismic protocol for a development drift reduces the development speed by half. The decision to implement such a protocol should therefore not be taken lightly.

Currently, the ground control team at the mine site relies mainly on the following factors to decide whether a drift segment should be subjected to a seismic protocol:

  • The seismic response history of the previous 3–5 drift segments in the vicinity of the planned blast (same drift when available), mainly related to the maximum magnitude and the number of events.

  • The presence of noises/cracking of nearby rock mass heard by workers or significant seismic event near the roadway before blasting.

  • Anticipation of a high concentration of mine-induced stress at this location.

The absence of a protocol does not necessarily imply a higher risk in the development heading. Among the 65 blasts that generated a seismic response of log (∑M0) ≥ 9, 54 of these responses included an event of MW ≥ 0. Among these 54 development blasts that generated a seismic event of MW ≥ 0, 27 (50%) occurred within an hour of the blast time, when no personnel were underground since it was between shifts. In addition, the extensive ground support used at the mine site to deal with rockbursting conditions has not been considered; it was deemed out of the scope of this paper.

The primary purpose of seismic protocols is not to capture high-magnitude events but projections. However, the mine personnel estimate that damage is generally caused by magnitudes greater than 0.4 (log M0 = 9.6), corresponding to a high-intensity seismic event. Among the 379 studied development blasts, 17% (65/379) are of high-intensity (log (ΣM0) ≥ 9). For the three mine levels studied in the paper, just under one-third of the blasts of the mining drift segments were subject to a seismic procedure. It is possible to visualize which seismic responses were subjected to a seismic protocol for the 379 investigated development blasts, with their true and predicted response intensity (Fig. 11).

Fig. 11 Measured and predicted values from the log (ΣM0) random forest predictive models for the 279 training (a) and 100-test (b) data, colorized according to the presence (red) or absence (black) of a seismic protocol when blasting these segments

Figure 11 shows that the current seismic protocol implementation method used at the mine, presented at the beginning of this section, does not capture the same high-intensity seismic responses as the proposed approach based on multivariate analysis. To highlight the differences and similarities between the two approaches, specific cases are shown in Fig. 12:

  • Case 1 illustrates a high-intensity seismic response (9.70) anticipated at 8.93 by the random forest model, which is high-intensity and TP for all risk appetite levels. This reflects that the model captured some high-intensity responses that were not covered by a seismic protocol.

  • Case 2 is a true low-intensity seismic response (7.27) that was anticipated at 7.78 by the random forest model. This intensity is considered low for three out of five risk appetite levels, which would classify it as a TN. This development blast was nevertheless under protocol, which is equivalent to adopting a minimalist or averse appetite level, for which this response would have been anticipated as high intensity.

  • Case 3 reflects similarities in the limitations of both methods. This seismic response has a true intensity of 10.22 but was anticipated to be only 6.86 by the random forest model. Furthermore, this seismic response was not under a seismic protocol, so a high-intensity seismic response was not anticipated either.

  • Case 4 is a seismic response of measured intensity of log (ΣM0) = 9.46, which was anticipated to be 7.54 with the random forest model. It would only have been considered high-intensity for the minimalist and averse risk appetite levels. However, that drift segment was under seismic protocols.

  • The upper part of the graph illustrates other similarities and limitations of both methods. For example, out of the seven low-intensity seismic responses classified as high by the open risk appetite level (blue dotted line in Fig. 12), 86% (6/7) were under seismic protocols. This means that the seismic responses anticipated as high-intensity by the random forests model (for the open level of risk appetite) were also anticipated to represent a high seismic hazard by the ground control team since they were under seismic protocol.

Fig. 12 Measured and predicted values of log (ΣM0) by the random forest predictive model for the 100-test data, colored according to the presence (red) or absence (black) of a seismic protocol when blasting these segments with the five risk-appetite levels (dashed lines): open (blue), flexible (green), cautious (yellow), minimalist (orange), and averse (red). The high seismic hazard threshold is shown as a solid red line on the true value axis

As stated earlier, implementing seismic protocols and developing prediction models do not serve the same primary purpose. However, it is still relevant to compare how well the implementation of seismic protocols and the prediction model anticipate high-intensity seismic responses (Table 6). The performance on the 100-test data can be used for a quantitative comparison between the two methods. It is also relevant to report the performance of the seismic protocols on both the test data and all 379 segments, since the performance of the seismic protocols is not affected by the database split (training and test data). With or without the database split, all criteria remain approximately constant except for sensitivity: the sensitivity of the seismic protocols is higher when considering the 100-test data (67%) than when considering all 379 segments (51%). The precision of the seismic protocols in capturing high-intensity seismic responses is 30% for all 379 segments and 31% for the 100-test data; this implies that 30% of seismic responses subjected to a seismic protocol were of high intensity. Thus, of the 110/379 seismic responses to drift development subjected to a seismic protocol, 33 were high intensity and 77 were low intensity considering the log (ΣM0). The sensitivity (proportion of high-intensity seismic responses anticipated as such [TP] out of the actual number of high-intensity seismic responses) is 51% for all 379 seismic responses and increases to 67% when considering only the 100-test data.

Table 6 Comparison of performance of the seismic protocols and the random forest model for different risk appetite levels in anticipating high-intensity seismic responses

Table 6 shows that, for the studied sector, the random forest model developed generally performs better in anticipating high-intensity seismic responses than the seismic protocols implemented. In fact:

  • Δ Performance (Test data [appetite level of the random forest model] − Test data [USP]): For the 100-test data, the overall performance of the random forest model for the flexible and cautious levels is equivalent to or better than that of the seismic protocols.

  • Δ Performance (Test data [appetite level of the random forest model] − All data [USP]): If the performance of the seismic protocols on all 379 data points is compared to the performance of the random forest model on the 100-test data, the overall performance of the random forest model for the open, flexible, and cautious risk appetite levels is better or the same for all criteria.

    • For the open level, a decrease of less than 5% in sensitivity is observed in exchange for increases of 10 to 20% in the other criteria.

    • The flexible risk appetite level shows an increase of 9% for each criterion.

    • For the cautious appetite level, a negligible decrease in accuracy and a stable precision are noted, along with a significant increase (+ 36%) in sensitivity and a 7% increase in F1-score.

This reflects the usefulness of considering each drift segment's geologic and structural environment to evaluate the seismic hazard related to its development blast.

The predictive random forest model could be used to anticipate the intensity of seismic responses in a manner complementary to the current method of seismic protocol implementation. For example, looking at Fig. 12, if the seismic protocols and the random forest approach for the flexible risk appetite level are combined, all but two high-intensity responses would have been identified before blasting (13/15). Table 7 details the performance of combining the random forest model with the seismic protocols, where a response is considered positive if it was either under a seismic protocol or anticipated as high intensity for the studied risk appetite level.
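A sketch of this combination rule is given below; `under_protocol` is an assumed logical vector indicating whether each test segment was blasted under a seismic protocol, and `appetite_thresholds`, `test_db` and `pred_test` are the hypothetical objects from the earlier sketches.

```r
# Sketch: combine seismic protocols and the model; a segment is "positive"
# if it was under protocol OR predicted above the appetite threshold.
combined_performance <- function(true, pred, under_protocol, threshold, high_cut = 9) {
  positive <- under_protocol | (pred >= threshold)
  tp <- sum(true >= high_cut & positive);  fp <- sum(true <  high_cut & positive)
  fn <- sum(true >= high_cut & !positive); tn <- sum(true <  high_cut & !positive)
  precision   <- tp / (tp + fp)
  sensitivity <- tp / (tp + fn)
  c(accuracy    = (tp + tn) / length(true),
    sensitivity = sensitivity,
    precision   = precision,
    F1          = 2 * precision * sensitivity / (precision + sensitivity))
}

# sapply(appetite_thresholds, combined_performance,
#        true = test_db$log_sum_M0, pred = pred_test, under_protocol = under_protocol)
```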

Table 7 Performance of combining high-intensity seismic response to development blasts expected from seismic protocols and the predictive random forest model for the five different risk appetite thresholds

By combining the positive results of the two methods, the sensitivity (proportion of anticipated high-intensity seismic responses out of the actual number of high-intensity seismic responses) is at least 80%. This degree of sensitivity was not achieved when considering only the seismic protocols or only the open or flexible risk appetite levels of the random forest model. The F1-score of the random forest model alone and of the combined methods is essentially the same overall, because the increase in sensitivity is offset by a decrease in precision (proportion of anticipated high-intensity seismic responses that were indeed of high intensity). The results also show that, with either method, it is impossible to fully anticipate the intensity of seismic responses to development blasts. This is due, in part, to the variability of the rock mass and the lack of consideration of other factors that may impact the response. However, the results show a strong potential gain in all performance criteria when using a predictive model that integrates geologic and structural characteristics of the rock mass around the blasted drift segment. Based on the presented analysis, it would be in the best interest of decision-makers to consider both geologic and structural knowledge of the area, through the development of predictive statistical models, and the seismic responses of the previous 3–5 blasts of nearby segments when deciding whether to implement a seismic protocol. The proposed approach can be adapted to the needs of all mining operators.

5 Discussion

The workflow described in this paper to manage seismic risk associated with development blasting, using random forest predictive models based on geologic and structural rock mass properties, is presented in Fig. 13. This is a significant step forward in improving current industrial practices. The procedure was developed so that it could be applied to other sectors of the mine or to other mine sites. Some variables would arguably need to be adapted to the environmental conditions and the driving failure mechanism at a particular site. The same random forest model could not be used directly in other mines, as models are only valid within the data range (for all variables) used to build them and only if the conditions do not change (Kelleher and Tierney 2018).

Fig. 13 Framework chart of the methodology

The use of machine learning (ML) methods such as random forest models has been criticized because they can be hard to interpret (Rudin 2019). For the case study presented, before developing predictive models, classic statistical analyses such as bivariate and multivariate analyses were performed to better understand the influence of geologic and structural features on seismic responses to development blasts (Goulet 2024; Goulet et al. 2024). This is arguably a complementary approach that can mitigate some of the weaknesses associated with ML.

Moreover, ML models have advantages over classic statistical models (Mitelman et al. 2023). First, these methods are not limited by the number of parameters, which is important when analyzing the vast amount of geologic and structural properties acquired in mines, often overlooked in geotechnical analysis. Second, the vast amount of data that can be analyzed can counter the bias induced by subjective data (Mitelman et al. 2023); this is particularly interesting for geologic and structural data, which are often obtained using subjective field methods. Finally, a predictive model that is not based on a theoretical explanation is not necessarily a weakness but a strength, because its performance can be assessed using predictive metrics rather than logic (Mitelman et al. 2023).

As with any other data analysis, the quality and quantity of input data impact the quality of the results. Specific limitations of the data used in this case study are explained in detail in Goulet (2024). Nevertheless, the results show that, even with imperfect or incomplete data, there is great potential to improve the understanding and management of seismic hazard related to development blasting.

6 Conclusion

This paper has demonstrated the potential application of the interpretation of geologic and structural properties for managing the seismic hazard associated with development blasting. The random forest models developed with the training data were used to establish thresholds for different risk appetite levels. These thresholds were then applied to the test data to evaluate the actual performance of the models in terms of accuracy, sensitivity, precision, and F1-score at different risk appetite levels. The paper also showed how the developed predictive models could be used to understand, manage, and communicate the seismic hazard related to the blasting of development drift segments. This could be a significant leap forward in seismic hazard management associated with development blasting compared to current practice. The following are some specific important contributions of the paper:

  • An approach was proposed to provide quantitative data on the seismic hazards associated with development blasting, which managers can rely on.

  • The random forest model allowed the identification of the most critical factors affecting the intensity of seismic responses: the distance to a lithological contact, variables related to drift direction and the angle between the schistosity and the direction of the drift segment. Lithologies also play an indirect role through the intensity of chlorite and biotite alteration in addition to the fabric of sericitic alteration. Finally, the distance to a shear zone or fault plane is also significant for the model.

  • The random forest model allowed the prediction of the log (ΣM0) of the seismic response for the 100-test data within ± 0.5 for 29% of the data, within ± 1.0 for 70%, and within ± 1.5 for 91%. The improved understanding of the variation in the intensity of predicted seismic responses is a value-added feature of random forest model development.

  • A strategy to use the predictive random forest models as a tool for decision-makers to select thresholds of acceptable performance criteria was detailed. The threshold selected would depend on the risk appetite of the decision-makers.

  • Using the predictive model for the sector and period studied may have increased the accuracy, sensitivity, and precision for anticipating a high-intensity seismic response to a development blast (log (ΣM0)). Combining the proposed approach with current seismic protocols used in the studied mines could improve our management of seismic risk associated with development blasting.

  • Whichever approach is used to anticipate seismic hazard related to development blasting, it is impossible to anticipate the intensity of seismic responses to development blasts fully. This is due, in part, to the variability of the rock mass and the lack of consideration of different factors that may impact the response.

As mining activities continue to develop at greater depths, the intensity of seismic responses to development blasting is expected to increase and present a greater hazard. Therefore, an improved understanding of the intensity of seismic responses to these blasts will allow for better preparation of risk management and mitigation measures. This paper showed how integrating the geologic and structural characteristics of the rock mass can improve the understanding and management of the seismic hazard related to development blasts. The methodology is adaptable to different contexts and risk tolerances and can be applied to various environments and mine sites.