1 Introduction

The ability to predict the performance and efficiency of drilling machines is critical in mining operations. The penetration rate influences the planning, development, and economics of mining operations. In addition, the penetration rate of rotary drilling plays a crucial role in controlling mine drilling costs [1]. The penetration rate reflects how several parameters affect the rate of drilling in mining and construction. Because it informs the selection of mining machinery and equipment and, ultimately, the product’s final price, rough estimations of the penetration rate are potentially risky [2]. Furthermore, drilling rate equations can be used to estimate total drilling costs and to select the type of machine. Numerous factors, such as rock-mass and material properties and the specifications of the drilling rig, influence the penetration rate of rotary drilling [3]. Unlike drilling equipment parameters, geological conditions and rock characteristics are not normally controllable by humans [1, 4].

Various researchers have developed different methodologies to predict rotary drilling’s penetration rates in recent years. Based on the literature study, these methods are divided into experimental and empirical models, statistical models, and machine learning models. Akün and Karpuz [5] established an empirical penetration rate model that was based on rock quality designation (RQD), pressure loss, discontinuity frequency, specific energy, and depth of cut. Their results showed that the drilling-specific energy is the most significant parameter in predicting the drilling rate. Krúpa et al. [6] conducted an experimental study that evaluated the relationship between achievable penetration depth and wear in rotary drilling. They created a mathematical model based on thrust force and drill length, omitting other parameters, such as torque and vibrations. Kumar et al. [7] evaluated rotary drilling’s penetration rate through coarseness index mapping and vibration, observing that PR increased with mean particle size (d) but decreased with vibration for varying pulldown force and torque at different rotational speeds. Adoko et al. [8] developed an empirical model for estimating drilling rates in hard rock mining that showed a strong correlation between actual and estimated drill rates.

As the next category of the available techniques, some other researchers used statistical approaches for predicting rotary drilling’s penetration rate. Kahraman [9] created an NLMR model to forecast the penetration rate of rotary and percussive drilling systems. The study found that uniaxial compressive strength is the primary rock property that affects rotary drilling. Hoseinie et al. [10] established a rating classification for drillability (DRI) prediction by using six rock mass properties, including Mohs hardness, grain size, UCS, joint filling, joint spacing, and joint dipping. Yarali and Kahraman [11] proposed new relationships to predict DRI by utilizing the brittleness values of 32 different rocks. Cheniany et al. [12] developed linear and nonlinear multiple regression techniques to estimate the SRMD index for specific rock masses. Moein et al. [13] measured the DRI values of carbonate rock in the laboratory and found good relationships for predicting DRI using the alteration index and specific energy. Yenice [14] established a mathematical regression model that considered rock strength properties to predict DRI, concluding that there is a strong relationship between PR and rock strength.

The last category of available techniques, which are flexible and widely applicable, is machine learning (ML). Recently, these techniques have been applied to civil and mining engineering [15,16,17,18,19,20,21,22,23,24,25,26,27]. Kahraman [28] demonstrated that ANNs could determine diamond drilling penetration rates much more accurately than regression models. Darbor et al. [29] evaluated the penetration rate using MLP-ANN and nonlinear multiple regression (NLMR) and found that MLP-ANN predicts performance more accurately than NLMR. Fattahi and Bazdar [30] used five hybrid ANN models to estimate DRI. The algorithms used include the simulated annealing algorithm (SAA), backpropagation (BP), firefly algorithm (FA), invasive weed optimization (IWO), and shuffled frog leaping algorithm (SFLA). According to the results, the ANN-SSA model performed better than the other models (R2 = 0.97). Shad et al. [1] investigated the effect of the chemical composition of intact rock on the penetration rate, in addition to rock mass properties and machine specifications, and introduced the iron oxide percentage as a new parameter for the penetration rate of rotary drills. Kamran [31] developed a probabilistic approach using AdaBoost, random forest (RF), and decision tree (DT) for predicting the drilling rate index (DRI). The results and Monte Carlo simulations show that this approach is more reliable in predicting the probability distribution of DRI. Sakız et al. [32] estimated the drilling rate index by focusing on the abrasive properties and rock strength, using the fuzzy inference system (FIS) method as the prediction model. The results showed that DRI prediction with the proposed method is very efficient and accurate.

Table 1 summarizes some of the ML models that have been proposed by different researchers to predict drilling machine penetration rates. As shown, a relatively high level of accuracy can be obtained using ML techniques. ML methods can handle highly nonlinear relationships between predictor and response variables, which is often the case in drilling operations. Machine learning models are able to learn from large and complex datasets and generate highly accurate predictions, whereas statistical and empirical methods rely on assumptions such as the linearity and normality of data, which may not always hold in the context of drilling. Additionally, machine learning models can easily incorporate multiple sources of data, including real-time sensor data, to enhance their prediction accuracy, which is difficult to achieve using statistical or empirical methods.

Table 1 An overview of models for predicting penetration rates in rotary drilling

Accurately predicting penetration rates in rotary drilling is of utmost importance in geomechanics, as it heavily relies on various rock-mass and material properties. In this article, we propose a groundbreaking hybrid XGBoost methodology to significantly enhance the accuracy of penetration rate predictions during drilling operations. This methodology finds wide-ranging applications in vital industries, such as oil and gas exploration, construction, and civil engineering. To achieve more precise predictions, a comprehensive understanding of rock-mass properties is essential. Notably, the mechanical behavior of crystalline rocks, as influenced by brittleness, has been studied extensively [35]. By optimizing drilling strategies and mitigating rock failures, this knowledge is instrumental in improving drilling performance. Furthermore, the significance of comprehending rock-mass heterogeneity has been highlighted through investigations into radionuclide transport in multi-scale fractured rocks [36]. Such understanding is crucial for achieving accurate predictions and ensuring efficient drilling operations.

In the context of solute transport monitoring in sedimentary media, integrated experimental design frameworks have been proposed [37]. Additionally, data-worth analysis using stochastic deep learning frameworks has been explored to identify subsurface structures [38], providing valuable insights into optimizing drilling practices. Ground-penetrating radar (GPR) has emerged as a valuable tool for subsurface exploration, with denoising methods being employed to enhance GPR data quality [39]. Moreover, deep learning models have been developed for pipeline recognition using GPR B-scans [40]. Radar technology’s potential for remote sensing applications, such as discriminating between dry and water ices on Mars, has also been demonstrated [41]. The complexity of solute transport in naturally fractured media has spurred research on upscaling dispersivity for conservative transport analysis [42], with significant implications for groundwater management and contaminant assessments.

In engineering, various structures have been subject to advanced analyses and modeling. Shield tunnel linings, for instance, have been studied using finite element modeling [43], and seismic fragility analyses have considered soil property variability [44]. Ocean engineering studies have focused on the failure analysis of reinforced thermoplastic pipes [45] and the dynamic response of riserless rotating drill strings [46]. Moreover, quantitative determination of high-order crack fabric in rock planes contributes to rock mass stability assessment [47], while research on the effects of carbonate minerals and exogenous acids on carbon flux addresses global and planetary change [48].

As far as the authors know, no study has applied hybrid XGBoost to predict penetration rate in rotary drilling. Hence, this study aims to fill this gap by proposing a novel approach to optimize XGBoost using various search algorithms, including random search, grid search, Harris Hawk optimization (HHO), and dragonfly algorithm (DA). The study was conducted using data collected from a copper mine in Iran, where the predictive models were developed by considering various rock properties. The authors then compared the developed models with the traditional XGB model to evaluate their effectiveness in predicting variations in the penetration rate. The proposed methodology and the role of HHO and DA in enhancing the predictive accuracy of XGB contribute to the field of mining engineering by introducing a practical approach to predict the penetration rate in rotary drilling.

2 Methodology

2.1 Extreme Gradient Boosting (XGBoost)

The XGBoost method is founded on gradient-boosted decision trees and is an efficient implementation of gradient boosting [49]. It builds on the concept of classification and regression trees and can solve regression and classification problems very effectively [21, 22]. XGBoost combines a novel algorithm with the GBDT method and is distributed as a soft computing library.

The objective function of XGBoost has two parts: the deviation of the model from the data, and a regularization term that prevents overfitting. The data set \(D = \left\{ {\left( {x_{i} ,y_{i} } \right)} \right\}\) contains n samples with m features each. The predictive model is additive and consists of K base models. Equations (1) and (2) give the prediction for a sample.

$$\hat{y}_{i} = \mathop \sum \limits_{k = 1}^{K} f_{k} \left( {x_{i} } \right), f_{k} \in \varphi ,$$
(1)
$$\varphi = \left\{ {f\left( x \right) = w_{{s\left( x \right)}} } \right\} \left( {s : R^{m} \to T, w \in R^{T} } \right)$$
(2)

In these equations, \(\hat{y}_{i}\) is the predicted label and \(x_{i}\) is one of the samples. Also, φ is the set of regression trees, s is the tree structure that maps a sample to a leaf, and \(f_{k}(x_{i})\) is the score predicted by the kth tree for that sample. In addition, w holds the leaf values (weights), and T is the number of leaves.

XGBoost’s objective function contains both the complexity of the model and the traditional loss function. In this way, the algorithm can be evaluated in terms of its operational efficiency. A traditional loss function is represented by the first term in Eq. (3), while the complexity of the model is represented by the second term.

$${\text{Obj}} = \mathop \sum \limits_{i = 1}^{n} l\left( {y_{i} ,\hat{y}_{i}^{{\left( {t - 1} \right)}} + f_{t} \left( {x_{i} } \right) } \right) + \Omega \left( {f_{t} } \right),$$
(3)
$${\Omega }\left( {f_{t} } \right) = \gamma T + \frac{1}{2}\lambda \left\| w \right\|^{2}$$
(4)

In Eqs. (3) and (4), i indexes the n samples in the data set, t is the index of the tree added at the current boosting iteration, and \(\hat{y}_{i}^{{\left( {t - 1} \right)}}\) is the prediction accumulated from the previous t − 1 trees. The complexity of a tree is controlled by γ, which penalizes the number of leaves T, and λ, which penalizes the leaf weights w. These regularization terms smooth the final learned weights and help avoid overfitting [21, 22, 50, 51].
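As a worked illustration of Eqs. (3) and (4), the following minimal Python sketch evaluates the regularized objective for a single tree. It assumes a squared-error loss, which the text does not specify; the function names, the toy data, and the values of γ and λ are hypothetical and are not taken from this study.

```python
def tree_complexity(leaf_weights, gamma, lam):
    """Complexity penalty of one tree: gamma * T + 0.5 * lambda * sum(w_j^2)."""
    n_leaves = len(leaf_weights)
    return gamma * n_leaves + 0.5 * lam * sum(w * w for w in leaf_weights)


def regularized_objective(y_true, y_prev, tree_output, leaf_weights, gamma, lam):
    """Squared-error loss of the updated prediction plus the complexity penalty."""
    loss = sum((y - (p + f)) ** 2 for y, p, f in zip(y_true, y_prev, tree_output))
    return loss + tree_complexity(leaf_weights, gamma, lam)


# Hypothetical toy values: three samples and a new tree with two leaves.
leaf_weights = [0.5, -0.25]
y_true = [1.0, 2.0, 3.0]
y_prev = [0.5, 2.0, 3.5]            # predictions before adding the new tree
tree_output = [0.5, -0.25, -0.25]   # leaf weight assigned to each sample

omega = tree_complexity(leaf_weights, gamma=1.0, lam=2.0)
obj = regularized_objective(y_true, y_prev, tree_output, leaf_weights,
                            gamma=1.0, lam=2.0)
print(omega, obj)
```

Increasing γ or λ in this sketch raises the penalty for the same tree, which is how the two parameters discourage overly complex trees.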

2.2 Hyperparameter Tuning

Hyperparameter tuning uses the available data to search for the hyperparameter values that give a learning algorithm its best performance. In supervised learning, the choice of hyperparameters can determine how well an algorithm performs [52,53,54,55]. This research used three tuning methods to find the optimal hyperparameters: grid search, random search, and metaheuristic algorithms.

XGBoost is a machine-learning algorithm with great potential and many hyperparameters. The following hyperparameters were adjusted in this study:

learning_rate: This hyperparameter sets the step size for each boosting iteration. Smaller values may improve performance, but may also make training longer.

max_depth: This hyperparameter controls the maximum depth of the decision tree, which affects the model’s complexity. Higher values may cause overfitting, while lower values may cause underfitting.

n_estimators: This hyperparameter determines the number of boosting iterations or trees to build.

2.2.1 Grid Search

Grid search is an exhaustive search over a subset of the hyperparameter space, in which each hyperparameter is specified by a lower bound, an upper bound, and a number of steps [56]. The method builds a grid of all possible combinations, evaluates each of them systematically, and selects the best-performing combination [57]. Using the grid search method, data can be processed with high accuracy [58]. In grid search, the following steps are taken:

  1. The parameter values are all initialized
  2. Combining all parameter values in a loop
  3. Based on training data, XGBoost is used to conduct training
  4. Analyzing test data with the resultant classifications
  5. Analyzing the classification results to determine the most effective combination of parameter values
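The grid search steps can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the implementation used in this study: `toy_validation_error` is a hypothetical stand-in for training an XGBoost model and returning its validation error, and the grid values are chosen only for demonstration.

```python
from itertools import product


def grid_search(param_grid, score_fn):
    """Evaluate every combination in param_grid and return the best one.

    score_fn maps a parameter dict to an error value (lower is better).
    """
    names = list(param_grid)
    best_params, best_score = None, float("inf")
    for values in product(*(param_grid[name] for name in names)):
        params = dict(zip(names, values))
        score = score_fn(params)
        if score < best_score:
            best_params, best_score = params, score
    return best_params, best_score


def toy_validation_error(params):
    """Hypothetical stand-in for 'train the model, return the validation error'."""
    return (
        (params["learning_rate"] - 0.1) ** 2
        + (params["max_depth"] - 4) ** 2
        + ((params["n_estimators"] - 200) / 100) ** 2
    )


# Illustrative grid over the three hyperparameters tuned in this study.
param_grid = {
    "learning_rate": [0.01, 0.1, 0.3],
    "max_depth": [2, 4, 6],
    "n_estimators": [100, 200, 300],
}
best_params, best_score = grid_search(param_grid, toy_validation_error)
print(best_params, best_score)
```

With three values per hyperparameter the loop evaluates 27 combinations; the count grows multiplicatively with every added value or hyperparameter, which is the main cost of this method.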

2.2.2 Random Search

Random search tries a predetermined number of randomly chosen hyperparameter combinations, evaluates each of them, and selects the most promising one [59]. Large volumes of data can be processed efficiently by random search [60]. The following are the steps involved in a random search:

  1. Setting up the number of iterations for a parameter combination
  2. The parameter values are all initialized
  3. Combining parameter values randomly based on the iteration count
  4. Based on training data, XGBoost is used to conduct training
  5. Analyzing test data with the resultant classifications
  6. Analyzing the classification results to determine the most effective combination of parameter values

In summary, random search is a method of selecting hyperparameters randomly from a given range. It is relatively simple to implement and can be used to quickly explore the parameter space. The main advantage of random search is its ability to find good solutions faster than grid search, as it does not require an exhaustive search of the parameter space. However, random search may not be able to find the best solution as it does not guarantee that all possible combinations are explored. Grid search is a method of systematically searching through a given range of hyperparameters in order to find the optimal combination. This method is more computationally expensive than random search, but it guarantees that all possible combinations are explored and thus can provide better results. The main disadvantage of grid search is its time-consuming nature, as it requires an exhaustive exploration of the parameter space [61].
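For comparison, a minimal random search sketch is given below. As before, `toy_validation_error` is a hypothetical placeholder for fitting an XGBoost model and scoring it, and the sampling ranges and iteration count are illustrative assumptions, not the settings used in this study.

```python
import random


def random_search(param_space, score_fn, n_iter, seed=0):
    """Sample n_iter random combinations and return the best one found."""
    rng = random.Random(seed)
    best_params, best_score = None, float("inf")
    for _ in range(n_iter):
        # Draw one value per hyperparameter from its sampler.
        params = {name: sampler(rng) for name, sampler in param_space.items()}
        score = score_fn(params)
        if score < best_score:
            best_params, best_score = params, score
    return best_params, best_score


def toy_validation_error(params):
    """Hypothetical stand-in for 'train the model, return the validation error'."""
    return (
        (params["learning_rate"] - 0.1) ** 2
        + (params["max_depth"] - 4) ** 2
        + ((params["n_estimators"] - 200) / 100) ** 2
    )


# Illustrative sampling ranges for the three tuned hyperparameters.
param_space = {
    "learning_rate": lambda rng: rng.uniform(0.01, 0.3),
    "max_depth": lambda rng: rng.randint(2, 8),
    "n_estimators": lambda rng: rng.randrange(50, 501, 50),
}
best_params, best_score = random_search(param_space, toy_validation_error, n_iter=50)
print(best_params, best_score)
```

Unlike the grid, the number of model fits here is fixed by `n_iter` regardless of how many hyperparameters are sampled, at the price of no guarantee that the best combination is visited.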

2.2.3 Metaheuristic Algorithms

As an important branch of machine learning, the extreme gradient boosting model has been widely used in many areas, such as blasting, mining, energy, and geotechnics. Prior literature shows that metaheuristic optimization techniques, which search for the optimal configuration of the prediction models, can significantly improve prediction performance [62]. Hence, the Harris hawks optimization (HHO) and dragonfly algorithm (DA) were tested in this paper.

2.2.3.1 Harris Hawks Optimization (HHO)

The HHO algorithm mimics the cooperative hunting behavior of Harris hawks and has been applied to a variety of problems requiring optimal solutions [63, 64]. According to Heidari et al. [65], it is useful for solving optimization problems in many different scientific and engineering fields. HHO is divided into three phases, as shown in Fig. 1: exploration, exploitation, and transition. During the first phase, the hawks search for and locate a prey animal and determine its position, \(X_{{{\text{rabbit}}}}\). In each iteration, a hawk updates its position either relative to a randomly selected hawk, \(X_{{{\text{rand}}}}\), or relative to the prey and the average position of the group:

$$\begin{aligned} &X\left( {{\text{iter}} + 1} \right) \\ & = \left\{ {\begin{array}{*{20}l} {X_{{{\text{rand}}}} \left( {{\text{iter}}} \right) - r_{1} \left| {X_{{{\text{rand}}}} \left( {{\text{iter}}} \right) - 2r_{2} X\left( {{\text{iter}}} \right)} \right|,\; q \ge 0.5} \\ {\left( {X_{{{\text{rabbit}}}} \left( {{\text{iter}}} \right) - X_{m} \left( {{\text{iter}}} \right)} \right) - r_{3} \left( {{\text{LB}} + r_{4} \left( {{\text{UB}} - {\text{LB}}} \right)} \right),\; q < 0.5} \\ \end{array} } \right.\end{aligned}$$
(5)
Fig. 1 An explanation of the HHO algorithm with three stages

\(X_{m}\) is the average position of the hawks, and \(r_{1}\) to \(r_{4}\) and q are random numbers in the range (0, 1). LB and UB are the lower and upper bounds of the search space. \(X_{m}\) is defined in Eq. (6).

$$X_{m} \left( {{\text{iter}}} \right) = \frac{1}{N} \mathop \sum \limits_{{\left( {i = 1} \right)}}^{N} X_{i} \left( {{\text{iter}}} \right)$$
(6)

\(X_{i}\) is the position of the ith hawk and N is the number of hawks. The escaping energy of the prey, \(E_{{\text{h}}}\), is given by:

$$E_{{\text{h}}} = E_{0} \left( {1 - \frac{{{\text{iter}}}}{T}} \right)$$
(7)

E0 represents the initial energy and T represents the maximum number of iterations. Note that \(E_{0} \in \left( { - 1, 1} \right)\); depending on the value of \(\left| {\text{E}} \right|\), either the exploration or the exploitation phase is initiated. During the exploitation phase, the value of \(\left| {\text{E}} \right|\) indicates how the rabbit is captured: when \(\left| {\text{E}} \right| \ge 0.5\), the catch is described as easy, and when \(\left| {\text{E}} \right| < 0.5\), it is described as problematic [66, 67].
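The exploration update and the escaping energy of Eqs. (5)–(7) can be sketched as follows. The sketch assumes the exploration rule in the form given by Heidari et al. [65], with an absolute value in the first branch and the condition q < 0.5 in the second; the function names, the clipping to the bounds, and the hyperparameter bounds in the usage lines are hypothetical choices for illustration only.

```python
import random


def mean_position(hawks):
    """Eq. (6): component-wise average of all hawk positions."""
    n = len(hawks)
    return [sum(h[d] for h in hawks) / n for d in range(len(hawks[0]))]


def escaping_energy(e0, iteration, max_iter):
    """Eq. (7): the prey's energy decays linearly with the iteration count."""
    return e0 * (1 - iteration / max_iter)


def exploration_step(hawk, hawks, rabbit, lb, ub, rng):
    """One exploration update of a single hawk, following Eq. (5)."""
    q, r1, r2, r3, r4 = (rng.random() for _ in range(5))
    if q >= 0.5:
        # Perch relative to a randomly chosen member of the population.
        x_rand = rng.choice(hawks)
        new = [xr - r1 * abs(xr - 2 * r2 * x) for xr, x in zip(x_rand, hawk)]
    else:
        # Perch relative to the prey and the average position of the group.
        x_m = mean_position(hawks)
        new = [(xb - xm) - r3 * (lo + r4 * (hi - lo))
               for xb, xm, lo, hi in zip(rabbit, x_m, lb, ub)]
    # Keep the candidate inside the search bounds (an added safeguard).
    return [min(max(v, lo), hi) for v, lo, hi in zip(new, lb, ub)]


rng = random.Random(42)
# Hypothetical bounds for [learning_rate, max_depth, n_estimators].
lb = [0.01, 2.0, 50.0]
ub = [0.30, 10.0, 500.0]
hawks = [[rng.uniform(lo, hi) for lo, hi in zip(lb, ub)] for _ in range(5)]
rabbit = hawks[0]  # placeholder for the best solution found so far
candidate = exploration_step(hawks[1], hawks, rabbit, lb, ub, rng)
print(candidate)
```

In a hybrid XGBoost setting, each position vector would encode one hyperparameter combination and the prey would be the combination with the lowest validation error found so far.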

2.2.3.2 Dragonfly Algorithm (DA)

Mirjalili [68] presents the DA as a novel method for optimizing systems. This approach is based on the swarm intelligence of dragonflies, which exhibits both static and dynamic behavior. Exploration and exploitation are the two main phases of DA. Dragonflies’ dynamic or static searches for food or avoidance of enemies result in these two phases [69].

Swarms exhibit three primary behaviors: separation, alignment, and cohesion [69]. Separation refers to avoiding collisions with other individuals in the neighborhood [Eq. (8)]. Alignment is the matching of an individual's velocity to that of its neighbors [Eq. (9)]. Cohesion is the tendency of individuals to move toward the center of the neighborhood [Eq. (10)]. Because the main goal of any swarm is survival, DA adds two further behaviors: approaching food and avoiding enemies. Thus, while all individuals move toward food sources [Eq. (11)], they must stay away from the enemy [Eq. (12)].

$$S_{i} = - \mathop \sum \limits_{j = 1}^{N} \left( {X - X_{j} } \right)$$
(8)
$$A_{i} = \frac{{\mathop \sum \nolimits_{j = 1}^{N} V_{j} }}{N}$$
(9)
$$C_{i} = \frac{{\mathop \sum \nolimits_{j = 1}^{N} X_{j} }}{N} - X$$
(10)
$$F_{i} = X^{ + } - X$$
(11)
$$E_{i} = X^{ - } + X$$
(12)

where \(X^{ + }\) and \(X^{ - }\) represent food and enemies, respectively. Vj represents the speed of the jth neighbor element, and N denotes the number of neighbor elements. Also, X represents the momentary position of the element; Xj is the momentary position of the jth neighbor element [70, 71].
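The five behavior vectors of Eqs. (8)–(12) can be computed directly, as in the minimal sketch below. It assumes the enemy term takes the form \(E_{i} = X^{ - } + X\), as in Mirjalili's formulation [68]; the function names and the two-dimensional toy positions are hypothetical.

```python
def separation(x, neighbors):
    """Eq. (8): negative sum of the offsets from each neighbor."""
    return [-sum(x[d] - nb[d] for nb in neighbors) for d in range(len(x))]


def alignment(neighbor_velocities):
    """Eq. (9): average velocity of the neighbors."""
    n = len(neighbor_velocities)
    dims = len(neighbor_velocities[0])
    return [sum(v[d] for v in neighbor_velocities) / n for d in range(dims)]


def cohesion(x, neighbors):
    """Eq. (10): offset from the individual to the neighborhood center."""
    n = len(neighbors)
    return [sum(nb[d] for nb in neighbors) / n - x[d] for d in range(len(x))]


def food_attraction(x, food):
    """Eq. (11): vector pointing from the individual toward the food source."""
    return [f - xi for f, xi in zip(food, x)]


def enemy_distraction(x, enemy):
    """Eq. (12), assumed form: enemy position plus the individual's position."""
    return [e + xi for e, xi in zip(enemy, x)]


# Hypothetical toy swarm in two dimensions.
x = [1.0, 1.0]
neighbors = [[2.0, 1.0], [0.0, 3.0]]
velocities = [[1.0, 0.0], [0.0, 1.0]]

s = separation(x, neighbors)
a = alignment(velocities)
c = cohesion(x, neighbors)
f = food_attraction(x, food=[3.0, 3.0])
e = enemy_distraction(x, enemy=[-1.0, 0.0])
print(s, a, c, f, e)
```

In the full algorithm these vectors are combined with weights into a step vector that moves each dragonfly; the weighting scheme is not described in this section and is therefore left out of the sketch.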

Finally, XGBoost predictions are optimized using the DA and HHO algorithms in this study. XGBoost was used to predict the penetration rate in three forms: two conventional forms tuned by random search and grid search, and a hybrid form combined with the dragonfly algorithm and Harris hawks optimization (XGB-DA and XGB-HHO, respectively). The schematic form of merging XGBoost with DA and HHO is shown in Fig. 2. The prediction models were developed and evaluated in Python using the scikit-learn and XGBoost libraries.

Fig. 2 The flowchart of the proposed process of optimized XGBs

2.3 Model Evaluation and Verification

Based on the correlation between predicted penetration rate and measured penetration rate value, several indicators are used in this research to assess the dependability of optimized models, including mean absolute error (MAE), root mean square error (RMSE), average absolute relative error (AARE), and coefficient of determination (R2) [72,73,74,75,76,77].

$${\text{RMSE}} = \sqrt {\frac{1}{n}\mathop \sum \limits_{i = 1}^{n} \left( {\widehat{{{\text{PR}}}}_{i} - {\text{PR}}_{i} } \right)^{2} }$$
(13)
$${\text{AARE}} = \frac{1}{n} \mathop \sum \limits_{i = 1}^{n} \left| {\frac{{\widehat{{{\text{PR}}}}_{i} - {\text{PR}}_{i} }}{{{\text{PR}}_{i} }}} \right| \times 100$$
(14)
$$R^{2} = 1 - \frac{{\mathop \sum \nolimits_{i} \left( {{\text{PR}}_{i} - \widehat{{{\text{PR}}}}_{i} } \right)^{2} }}{{\mathop \sum \nolimits_{i} \left( {{\text{PR}}_{i} - \overline{{{\text{PR}}}} } \right)^{2} }}$$
(15)
$${\text{MAE}} = \frac{1}{n} \mathop \sum \limits_{i = 1}^{n} \left| {\widehat{{{\text{PR}}}}_{i} - {\text{PR}}_{i} } \right|$$
(16)

where \(\widehat{{{\text{PR}}}}_{i}\), \({\text{PR}}_{i}\), and \(\overline{{{\text{PR}}}}\) represent the predicted, measured, and mean measured penetration rate values, respectively. In addition, n denotes the number of samples in the training or testing dataset.
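The four indicators in Eqs. (13)–(16) translate directly into code. The sketch below uses only the Python standard library; the function names and the three toy values are hypothetical and do not come from the Sarcheshmeh dataset.

```python
import math


def rmse(measured, predicted):
    """Eq. (13): root mean square error."""
    n = len(measured)
    return math.sqrt(sum((p - m) ** 2 for m, p in zip(measured, predicted)) / n)


def aare(measured, predicted):
    """Eq. (14): average absolute relative error, in percent."""
    n = len(measured)
    return 100 * sum(abs((p - m) / m) for m, p in zip(measured, predicted)) / n


def r_squared(measured, predicted):
    """Eq. (15): coefficient of determination."""
    mean = sum(measured) / len(measured)
    ss_res = sum((m - p) ** 2 for m, p in zip(measured, predicted))
    ss_tot = sum((m - mean) ** 2 for m in measured)
    return 1 - ss_res / ss_tot


def mae(measured, predicted):
    """Eq. (16): mean absolute error."""
    n = len(measured)
    return sum(abs(p - m) for m, p in zip(measured, predicted)) / n


# Hypothetical toy values for illustration only.
measured = [1.0, 2.0, 4.0]
predicted = [1.0, 2.5, 3.5]
print(rmse(measured, predicted), aare(measured, predicted),
      r_squared(measured, predicted), mae(measured, predicted))
```

Note that AARE divides by the measured value, so it is undefined for samples with a measured penetration rate of zero.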

3 Field Study and Data Collection

3.1 Field Study

The Sarcheshmeh copper mine is located 50 and 160 km northwest of the Rafsanjan and Kerman cities, respectively. It is the largest open-pit mine in Iran (Fig. 3). A variety of rock types are found in the Sarcheshmeh porphyry deposits due to their complicated geology. Due to the high production rates and project planning requirements of large surface mining operations, large drilling rigs, such as rotary drills with tricone bits, are commonly used. In recent years, rotary drills have become increasingly popular among mining contractors due to their ability to drill large-diameter, deep blast holes. The disadvantages of this equipment are expensive maintenance and difficult transportation in areas with harsh topography. A list of all the rotary drill rig configurations is presented in Table 2.

Fig. 3 Location of the Sarcheshmeh copper mine

Table 2 Various configurations of rotary drills were used in this study

3.2 Data Collection

Knowing the penetration rate of a project is crucial for determining drilling costs. The performance of drilling is affected by a variety of factors, including rock properties and the drilling equipment. Changing drilling equipment factors is possible, but changing rock parameters is not [4].

It is most helpful to provide a dataset with a wide geographical distribution when developing XGB-based methods to predict penetration rates in this paper. The first step in providing a dataset should also be to select the most appropriate input parameters. Therefore, a brief review of previous investigations is needed to determine the factors most influential on rotary drill performance. According to Bhatnagar and Khandelwal [33], thrust, RPM, flushing media, and compressive strength are the most effective rotary drill parameters. Numerous researchers believe that UCS and BTS play an important role in rotary drill performance [29, 30,31,32, 34, 78,79,80]. The joint specification, including joint spacing, joint aperture, joint fillings, and joint direction has a significant impact on the performance of rotary drills, according to Saeidi et al. [81] and Hoseinie et al. [10]. Lawal et al. [3] noted that density and porosity influence rotary drilling’s penetration rate. Also, bit diameter has been used as an input parameter in the prediction of the penetration rate of rotary drilling models [2].

Observations and laboratory tests were conducted at the Sarcheshmeh copper mine to develop a database for hybrid intelligent techniques. A database consisting of 116 metamorphic, sedimentary, and igneous rock samples was examined (a reduced version of the database is available in Table 8). Based on field observations and laboratory tests, five parameters were chosen as inputs for predicting the PR of rotary drilling. These parameters include rock material properties such as uniaxial compressive strength (UCS) and tensile strength (TS), as well as rock mass properties like joint direction (JD) and joint spacing (JS), along with bit diameter (D). Although some studies use other parameters like RPM and WOB, they were not included in this study to keep the predictive models simple. Jahed Armaghani et al. [82] and Armaghani et al. [83] have suggested that models with fewer input parameters are better since they are less complex. Additionally, equipment parameters are typically selected based on the properties of the rock being drilled, so including them in the prediction model would be redundant and could make it unnecessarily complicated. Details and descriptions of the datasets are shown in Table 3. The data were randomly divided into two parts: training data (92 cases) for model development and test data (24 cases) for assessing model reliability.

Table 3 Output and inputs data with symbols, details, and statistical descriptions

The Pearson correlation coefficients were computed, as shown in Fig. 4, to determine the most suitable features for the predictive models. Bit diameter (D) and joint direction (JD) have the strongest linear relationships with penetration rate, with r = − 0.64 and r = 0.27, respectively. However, linear correlation alone does not capture nonlinear effects, so a sensitivity analysis is also performed to identify the most influential parameters in the simulation of the penetration rate.
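For reference, the Pearson coefficient between one input and the penetration rate can be computed as in the short sketch below. The function name and the toy values are hypothetical; the actual coefficients in Fig. 4 come from the 116-sample database.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)


# Hypothetical toy data: a perfectly increasing and a perfectly decreasing pair.
r_up = pearson_r([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])
r_down = pearson_r([1.0, 2.0, 3.0], [3.0, 2.0, 1.0])
print(r_up, r_down)
```

A negative coefficient, such as the one reported for bit diameter, means that the penetration rate tends to decrease as the input increases.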

Fig. 4 An overview of the correlation matrix for all data samples (inputs and output)

4 Results and Discussion

4.1 Comparison Analysis of Optimized Models

To estimate drilling penetration rates, a database must first be prepared. All models were trained on 80% of the data, randomly selected from the database, and tested on the remaining 20%. Several performance indicators (RMSE, AARE, R2, and MAE), defined in Eqs. (13)–(16), were applied to evaluate model accuracy. All prediction models were trained and tested using the same data. The optimization of the XGB models was carried out as described in Sect. 2 and following the method shown in Fig. 2. First, the relevant parameters of the XGB models were initialized. After that, the appropriate parameters of each optimization algorithm were determined. Table 4 lists the optimal parameters obtained from the optimization process.

Table 4 Model’s optimal parameters

Several tuned XGB models were trained on the training set, and their prediction performances varied. Figure 5 illustrates the correlation between the actual and predicted values of the training data set. These optimized models exhibit relatively good training effects, with the training data points distributed close to the perfect fit line. In terms of RMSE, AARE, R2, and MAE, the HHO-XGB optimized model has a slight advantage, with values of 0.1384, 0.1384, 99, and 3.9882, respectively.

Fig. 5 Correlation between measured and predicted values for the training dataset

In this paper, four optimized XGB techniques are presented that can achieve high training effects, with R2 values generally above 0.96. Following training, the models are evaluated and verified on the testing data set. Figure 6 illustrates how the test data set is essentially distributed close to the perfectly fitted line when examining the relationship and error between measured and predicted values.

Fig. 6 Correlation between measured and predicted values for the testing dataset

These four optimized models have been compared and analyzed based on their predictive performance, as shown in Table 5 and Figs. 7, 8, 9 and 10. Table 5 lists the performance indices and ranking scores of the four models (DA–XGB, HHO–XGB, RS–XGB, and GS–XGB) for predicting rotary drilling penetration rates. The stacked graph in Fig. 7 presents the overall rankings more intuitively. Comparing all optimized models, the results of the comprehensive analysis indicate that the HHO-XGB optimized model is the most precise predictive model.

Table 5 Performance comparison of the optimized XGB models

Fig. 7 Comprehensive rankings of the optimized models

Fig. 8 A comparison of the values predicted by the models with the testing dataset

Fig. 9 Taylor diagram of predictive models for training and testing datasets

Fig. 10 An overview of the distribution function for all datasets in the developed models

Using the testing dataset, we compare the accuracy of the penetration rates predicted by the selected models in Fig. 8. The XGBoost models tuned with metaheuristic algorithms give the most accurate and consistent penetration rate predictions.

This subsection illustrates how optimal predictive models perform with the Taylor diagram. This mathematical diagram is used to illustrate which of the developed models is the most realistic [84]. RMSE, standard deviation, and Pearson correlation are used to assess the degree of agreement between modeled and observed behavior [85,86,87]. Figure 9 depicts the Taylor diagram of this study’s models generated for testing and training datasets. According to the results, the HHO–XGB hybrid model is more accurate in predicting penetration rate than other predictive models.

The distribution of predicted values is one way to evaluate predictive models. Box plots in Fig. 10 show the distribution functions for predicted and measured penetration rates. Due to its similar probability distribution to observational results, the HHO-XGB approach performed better than other models.

Another way to evaluate ML approaches is by calculating the cumulative frequency of absolute relative errors (ARE, %). As shown in Fig. 11, more than 90% of the penetration rates predicted by the HHO–XGB and DA–XGB models have an ARE below 5%, as opposed to less than 85%, 60%, and 30% for the GS–XGB, RS–XGB, and XGB models, respectively. The HHO–XGB and DA–XGB predictions agree very well with the actual penetration rates, as demonstrated by the presented figures.
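The share of predictions below a given ARE threshold, which is one point on the cumulative frequency curve, can be obtained as sketched below. The function names and the four toy samples are hypothetical and are not taken from the testing dataset.

```python
def absolute_relative_errors(measured, predicted):
    """Absolute relative error of each prediction, in percent."""
    return [abs((p - m) / m) * 100 for m, p in zip(measured, predicted)]


def share_below(errors, threshold):
    """Percentage of samples whose error is strictly below the threshold."""
    return 100 * sum(e < threshold for e in errors) / len(errors)


# Hypothetical toy values for illustration only.
measured = [10.0, 20.0, 40.0, 50.0]
predicted = [10.2, 19.6, 44.0, 50.5]

errors = absolute_relative_errors(measured, predicted)
share_under_5 = share_below(errors, threshold=5.0)
print(share_under_5)
```

Evaluating `share_below` over a range of thresholds and plotting the results gives a cumulative curve of the kind used to compare the models.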

Fig. 11 Cumulative absolute relative error of the various models

The significance of R2 can be assessed through a ‘t’ test, under the assumption that both variables follow a normal distribution and the observations are randomly selected. The ‘t’ tests involve the comparison of the calculated ‘t’ value and the tabulated ‘t’ value under the null hypothesis [88, 89]. A 95% confidence level is employed for this test. If the computed value exceeds the tabulated value, the null hypothesis is rejected, indicating the significance of ‘r’. By applying a 95% confidence level, we obtain the corresponding critical value of 1.98. The data presented in Table 6 demonstrates that all the computed ‘t’ values surpass the tabulated ‘t’ values, signifying a statistically significant correlation between D, T, JD, JS, UCS, and the penetration rate.
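As an illustration of this test, the sketch below assumes the standard t statistic for a Pearson correlation coefficient, \(t = r\sqrt {n - 2} /\sqrt {1 - r^{2} }\); the text does not state the exact formula behind Table 6, so this form is an assumption. The inputs are the correlation reported above for bit diameter (r = − 0.64) and the database size (n = 116), and the function name is hypothetical.

```python
import math


def correlation_t_statistic(r, n):
    """t statistic for testing whether a correlation coefficient differs from zero."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r * r)


# Correlation of bit diameter with penetration rate and the number of samples.
t_value = correlation_t_statistic(-0.64, 116)

# Compare the magnitude with the tabulated critical value quoted in the text.
is_significant = abs(t_value) > 1.98
print(t_value, is_significant)
```

Under this assumed formula the magnitude of t for bit diameter is well above the critical value of 1.98, which is consistent with the conclusion drawn from Table 6.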

Table 6 Student’s t-test
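For illustration, the comparison of a computed ‘t’ value with the tabulated one can be sketched as below. The sketch assumes the standard t statistic for a Pearson correlation coefficient, t = r·sqrt(n − 2)/sqrt(1 − r²), which the text does not state explicitly; the function names and the example value of r are hypothetical, while n = 116 and the critical value 1.98 are taken from the text.

```python
import math

def correlation_t_statistic(r, n):
    """t statistic for a Pearson correlation r computed from n samples.

    Assumes the standard form t = r * sqrt(n - 2) / sqrt(1 - r^2).
    """
    if n < 3 or not -1.0 < r < 1.0:
        raise ValueError("need n >= 3 and -1 < r < 1")
    return r * math.sqrt(n - 2) / math.sqrt(1.0 - r * r)

def is_significant(r, n, t_critical):
    """True when the computed |t| exceeds the tabulated critical value."""
    return abs(correlation_t_statistic(r, n)) > t_critical

# n and the critical value come from the text; r = 0.5 is a made-up example.
n_samples = 116
t_tabulated = 1.98
t_value = correlation_t_statistic(0.5, n_samples)
reject_null = is_significant(0.5, n_samples, t_tabulated)
```

Because the statistic grows with both |r| and n, even a moderate correlation can be significant for a data set of this size.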

To enhance prediction confidence, the F-test was also conducted, as shown in Table 7. The F-test is employed to compare standard deviations [88]. At the 95% confidence level, the corresponding critical value is 3.92. Table 7 shows that all the calculated F values exceed the tabulated F value, so the null hypothesis is rejected in each case. This provides additional evidence of a meaningful connection between D, T, JD, JS, UCS, and the penetration rate, and supports the assertion that the datasets originate from distinct populations of measurements.

Table 7 Student’s F test

Statistical significance, in essence, indicates whether a given independent variable has an impact on the model. When the ‘signif. value’ of an independent variable exceeds the significance level (α), the variable does not contribute significantly to the model. When its ‘signif. value’ falls below (α), the variable plays a meaningful role in the prediction process.

4.2 Sensitivity Analysis

Sensitivity analysis is a useful tool for evaluating the influence of the input variables on penetration rate predictions. In this study, the importance of variables was determined from the HHO–XGB model by comparing the results of different XGB models. The cosine amplitude method (Shirani Faradonbeh et al. [90]) is employed for this purpose, applied to the copper mine data set through Eq. (17).

$$R_{ij} = \frac{{\mathop \sum \nolimits_{k = 1}^{n} \left( {x_{ik} \times x_{jk} } \right)}}{{\sqrt {\mathop \sum \nolimits_{k = 1}^{n} x_{ik}^{2} \times \mathop \sum \nolimits_{k = 1}^{n} x_{jk}^{2} } }}$$
(17)

Here, \(x_{i}\) denotes an input variable, \(x_{j}\) denotes the output, and n is the number of data samples. \(R_{ij}\) represents the strength of the relationship between the HHO–XGB output and each independent variable. Figure 12 shows the strength of the relationship between the penetration rate values and the input data. Among the parameters affecting the penetration rate, uniaxial compressive strength and tensile strength are the most influential.
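Eq. (17) can be evaluated directly, as the following minimal sketch shows. The function name `cosine_amplitude` and the sample values are hypothetical; only the formula itself comes from the text.

```python
import math

def cosine_amplitude(x_i, x_j):
    """Strength of relation R_ij between an input series and the output series (Eq. 17)."""
    if len(x_i) != len(x_j) or not x_i:
        raise ValueError("need two non-empty sequences of equal length")
    numerator = sum(a * b for a, b in zip(x_i, x_j))
    denominator = math.sqrt(sum(a * a for a in x_i) * sum(b * b for b in x_j))
    return numerator / denominator

# Hypothetical data: two inputs and the corresponding penetration rates.
inputs = {
    "UCS": [80.0, 95.0, 110.0, 70.0],
    "JS": [0.4, 0.3, 0.2, 0.5],
}
penetration_rate = [1.2, 1.0, 0.8, 1.4]

# Rank the inputs by their strength of relation with the output.
strengths = {name: cosine_amplitude(values, penetration_rate)
             for name, values in inputs.items()}
ranking = sorted(strengths, key=strengths.get, reverse=True)
```

For non-negative data, \(R_{ij}\) lies between 0 and 1, and values closer to 1 indicate a stronger relation between that input and the output.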

Fig. 12 Analyzing the sensitivity of the input variables and penetration rate

5 Limitations and Future Works

The approach proposed in this study offers accurate predictions, but it has limitations that need to be addressed in the future. The proposed model is based on a database of 116 data samples from a copper mine in Iran, which limits its applicability to sites with similar rock-mass and material properties. To develop a more generalized ML model, a comprehensive database covering a wider variety of parameters, including environmental conditions and rock properties such as hardness, strength, and abrasiveness, could be collected. This would enable the use of a wider range of input parameters, making the ML model more reliable and flexible for researchers and designers. Moreover, future studies can explore other ML methodologies or hybrid intelligence approaches and compare their ability to predict the rotary drill penetration rate or other important properties. Such studies could improve the accuracy and robustness of the models developed for rotary drilling applications.

Potential biases may emerge from site-specific conditions or sampling techniques, while uncertainties can stem from measurement errors, data interpolation, and temporal variations. Together, these challenges can compromise the reliability of predictive models, potentially resulting in inaccurate assessments and suboptimal decision-making. It is essential to understand these limitations in order to interpret results effectively and to employ robust modeling techniques that address biases and uncertainties.

6 Conclusions

This article examines and compares different optimized XGB models for predicting the penetration rate of rotary drilling. These models were created by combining XGB with four hyperparameter tuning methods: random search, grid search, and two intelligent optimization algorithms, HHO and DA. Taking into account various factors that influence the penetration rate, a PR data set was used to train and test these XGB hybrid models. The models’ performance was assessed using metrics such as MAE, RMSE, AARE, and R2. Lastly, the cosine amplitude method was employed to evaluate the significance of each input variable.

To sum up, the hybrid XGB models proposed in this study show promise in predicting rotary drilling penetration rates, and the tuning methods effectively adjust the XGB hyperparameters. Based on the prediction results, the HHO-XGB hybrid model demonstrates superior overall performance compared to the other three hybrid models and was identified as the most effective model in predicting PR. An ordinary XGB model was also developed to forecast the rotary drilling penetration rate for comparative purposes, and the findings demonstrated that the optimized XGB models had superior predictive accuracy compared to this ordinary model. To determine the importance of the input variables, a sensitivity analysis technique called cosine amplitude was employed. Uniaxial compressive strength and tensile strength were found to be the most significant parameters affecting the penetration rate. Using the developed models, the penetration rate of equivalent rocks can be accurately predicted.

It is important to highlight that the models developed for PR prediction are specific to the current rock engineering problem and cannot be readily applied to other rock engineering problems. The proposed methods should instead be considered a foundation, to be re-evaluated, re-analyzed, and even revised before they are applied to other rock engineering design and planning tasks. In predicting factors like PR, incorporating additional rock parameters makes predictions more meaningful and reliable. Future research should increase the number of rock samples in the study to yield more credible prediction models. Evaluating rocks according to their source is also recommended for robust evaluation within prediction models.