1 Introduction

Tunnel boring machines (TBMs) have become the predominant tunnelling method in a wide range of ground conditions, especially in hard rock tunnels longer than 1.5–2 km, because they offer higher excavation speed, lower cost, improved safety, and environmental friendliness compared to the traditional drill and blast method. Estimating TBM performance is a key step in tunnel design and in selecting the appropriate machine type and specifications. In the last two decades, many performance prediction models have been proposed to estimate the penetration rate of hard rock TBMs in new tunnelling projects. These models can be categorized into two main groups, namely theoretical and empirical methods (Khademi Hamidi et al. 2010). Theoretical models analyse the cutting forces acting on a disc cutter to estimate the rate of penetration (ROP) based on force equilibrium equations. Laboratory cutting tests, which provide a basic understanding of rock fragmentation and of the force-penetration behaviour of rocks, are the basis for this class of performance prediction models. The main disadvantage of these models is that they do not completely represent the site parameters related to rock mass conditions, in particular joints, that the TBM disc cutters encounter in the field. Empirical models are primarily based on observation of the field performance of TBMs. Where standard laboratory rock cutting facilities are not available, TBM performance may be predicted using empirically developed formulas.

Currently, three models are the most widely recognized TBM performance prediction and prognosis models in use around the world: the Colorado School of Mines (CSM) model (Rostami 1997), the Norwegian University of Science and Technology (NTNU) model (Bruland 1998), and the field penetration index (FPI) models (Nelson et al. 1983; Hassanpour et al. 2011, 2016). The CSM model represents a semi-theoretical approach to TBM performance, as it allows the calculation of the cutting forces that need to be applied on a disc cutter to reach a certain penetration into the rock. This method has the advantage of considering the geometry of the problem (the diameter and tip geometry of the disc and the spacing or distance between the grooves) in detail. However, the original CSM model does not consider the natural discontinuities of the rock mass, which have a major impact on the net speed of the TBM. To overcome this shortcoming, Yagiz (2002) and Ramezanzadeh (2005) modified the original CSM model by adding some rock mass properties as input parameters, but with limited success.

Bruland (1998) updated and improved the NTNU model, which was originally proposed in 1978 based on field data collected from Norwegian tunnels and was subsequently expanded to other tunnelling projects around the world. The NTNU method uses rock property indices such as the Drilling Rate Index (DRI), estimated from the rock brittleness “S20” and hardness index “SJ”, in addition to joint conditions, to estimate the rate of penetration of the TBM (Blindheim 1979). The NTNU model requires specialized tests which are not commonly performed in many projects. The Field Penetration Index (FPI) was introduced by Nelson et al. (1983) and has subsequently been used as a means of predicting the performance of TBMs. For instance, Hassanpour et al. (2011, 2016, 2021), Pourhashemi et al. (2021) and Goodarzi et al. (2021) have used FPI, estimated as a function of UCS and RQD, to develop new equations and charts for TBM performance prediction. These two models represent the available empirical approaches to estimating TBM penetration rate.

Apart from empirical and theoretical models, machine learning (ML) techniques have received widespread attention in TBM performance prediction. Machine learning is a branch of artificial intelligence that consists of developing algorithms able to generalize behaviours from information provided in the form of examples. It is, therefore, an inductive knowledge strategy (Salimi 2021; Coimbra et al. 2014). The capabilities and opportunities of machine learning algorithms and methods in underground tunnel construction have been discussed by Marcher et al. (2020) and Morgenroth et al. (2019). Many scholars have highlighted techniques such as artificial neural networks (ANN), fuzzy logic, adaptive neuro-fuzzy inference systems (ANFIS), particle swarm optimization (PSO), support vector machines (SVM), random forests (RF) and deep neural networks (DNN) for approximating TBM performance parameters such as penetration rate (PR) and advance rate (AR) (Armaghani et al. 2017; Salimi et al. 2016; Koopialipoor et al. 2019; Mahdevari et al. 2014; Yagiz et al. 2009; Benardos and Kaliampakos 2004; Ghasemi et al. 2014; Alvarez Grima et al. 2000). The flexible nature of AI techniques makes them powerful tools for approximating and solving engineering problems, particularly when the problem is highly complex and nonlinear. However, although most of these studies report a high correlation between predicted rates and actual machine performance, their models cannot be used to estimate machine performance in other projects, since the related programs are not available to end users. Furthermore, most of these machine learning methods (e.g., artificial neural networks or support vector machines) are difficult to apply, as a large number of parameters must be provided or estimated to use them. This means that they can be applied to predict the value of a target variable from data, but the rules or implicit patterns within the model cannot be interpreted. In the area of rock engineering/tunnelling, the suitability of data mining techniques is closely related to the applicability of the resulting model (Salimi 2021).

The main goal of the present work is to develop new models for estimating TBM performance using FPI, via statistical (regression) analysis as well as machine learning, including a tree-based regression model, the classification and regression tree (CART), which provides graphical, transparent solutions for predicting TBM performance in hard rock conditions. Particular attention is paid to introducing new models that account for similarities in rock texture, cementation and grain size.

To reach this goal, field data from eight tunnelling projects were compiled into a database and used in subsequent analysis. These projects are the Zagros water conveyance tunnel, Lot 2 in Iran (Hassanpour et al. 2009, 2016); Ghomrood water conveyance tunnel, Lots 3 and 4 in Iran (SCE Company 2004; Hassanpour et al. 2011); Karaj water conveyance tunnel, Lot 1 in Iran (Hassanpour et al. 2010); Golab water conveyance tunnel in Iran (Fatemi et al. 2016; ICE 2009); Maroshi-Ruparel water supply tunnel, Mumbai, India (Jain et al. 2014; Jain 2014); Manapouri second tailrace tunnel, New Zealand (URS Company 2003; Delisio 2014; Deere et al. 2004); Deep Tunnel Sewerage System in Singapore (Gong 2005); and Lötschberg Base Tunnel in Switzerland (Delisio and Zhao 2014; Delisio 2014).

2 Tunnelling Projects in TBM Field Performance Database

TBM performance data from various projects with different rock mass conditions have been obtained and compiled in a TBM field performance database. Data from eight tunnelling projects, namely the Zagros water conveyance tunnel Lot 2, Manapouri second tailrace tunnel, Maroshi-Ruparel college tunnel, Ghomrood water conveyance tunnel Lots 3 and 4, Karaj water conveyance tunnel Lot 1, Golab water conveyance tunnel, Lötschberg Base Tunnel and Deep Tunnel Sewerage System, with a total length of 92.93 km, were selected for this investigation. The main characteristics of these TBM tunnelling projects are summarized in Table 1, while more detailed information about the geological conditions along the tunnel alignments and the adopted construction methods can be found in the literature. Fig. 1 shows the geographical distribution of the project sites.

Table 1 Main characteristics of tunnelling projects
Fig. 1 Geographical distribution of the hard rock TBM projects used in this study

3 Data Compilation

The TBM field performance database contains different levels of information, which define the tunnel, rock mass conditions, and TBM performance parameters over the full length of a tunnel drive. The database contains over 666 data sets for which reliable ground condition and machine performance parameters were available and could be verified. The data sets comprise two main categories. The first category includes machine performance parameters such as net boring time, length of the mined section and the average machine operational parameters (thrust, RPM, power and applied torque) throughout the section. The second category includes geological parameters such as intact rock properties (compressive and tensile strength, quartz content, porosity), discontinuity characteristics (spacing, weathering, surface condition) and calculated rock mass parameters (such as RQD and RMR) in selected tunnel sections. In addition, the most important performance parameters, namely average penetration rate (\({\text{ROP}}\)), penetration per revolution (\(P\)), average cutter load \(F_{n}\), and field penetration index (\({\text{FPI}}\)), have been estimated as follows:

$${\text{ROP}}\, = \,\frac{{L_{b} }}{{t_{b} }}\,\,,\,\,\,P\, = \,\frac{{{\text{ROP}} \times 1000}}{{{\text{RPM}} \times 60}}\,,\,\,\,{\text{FPI}}\, = \,\frac{{F_{n} }}{P}\,,\,\,\,F_{n} \, = \,\left( {T_{h} - F_{f} } \right)/\,N_{{{\text{cutters}}}} ,$$
(1)

where \({\text{ROP}}\) is the rate of penetration (m/h), \(L_{b}\) is the boring length (m), \(t_{b}\) is the boring time (h), \(P\) is the cutter penetration per revolution (mm/rev), \({\text{RPM}}\) is the cutterhead rotational speed (rev/min), \({\text{FPI}}\) is the Field Penetration Index (kN/cutter/mm/rev), \(F_{n}\) is the cutter load or normal force (kN/cutter), \(T_{h}\) is the applied thrust of the machine (kN), \(F_{f}\) is the estimated friction between the machine and the ground (kN) and \(N_{{{\text{cutters}}}}\) is the number of disc cutters installed on the TBM cutterhead. To estimate the frictional force, machines were placed in two groups as reported in Table 1: gripper/open TBMs and double shield TBMs. In open-type TBMs, the friction force that builds up between the machine and the surrounding ground is much lower than in shielded machines. In some cases, the front shoes of the machine are pressed against the walls and can impose a high pressure on the walls and thus high friction. However, for the most part, the friction of the machine can be included in the calculations by subtracting 20% of the machine weight from the total thrust force applied by the thrust cylinders (Delisio and Zhao 2014; Salimi et al. 2019a). For shielded TBMs, the friction force builds up between the shield and the surrounding ground and hence is significantly higher than in open machines, especially for double shield TBMs. Previous studies have used 20% of the weight of the machine in non-squeezing ground, or 20% of the rock load against the shield in low to medium squeezing conditions. In highly squeezing conditions, the friction forces can exceed the machine thrust, leading to jamming. In such conditions, the use of an arbitrary percentage of the weight of the machine is misleading. Further investigations are needed to assess the friction between the shield and the respective ground conditions when shielded TBMs are used (Salimi et al. 2019a; Salimi 2021). The general database structure is presented in Table 2.
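The chain of definitions in Eq. 1 can be sketched in a few lines of Python. This is only an illustrative sketch: the function name, argument names and the section values are hypothetical and are not taken from the database, and the friction term assumes the 20% of machine weight rule quoted above for non-squeezing ground.

```python
def performance_indices(boring_length_m, boring_time_h, rpm,
                        thrust_kn, friction_kn, n_cutters):
    """Evaluate the four quantities of Eq. 1 for one tunnel section."""
    rop = boring_length_m / boring_time_h                 # ROP, m/h
    penetration = rop * 1000.0 / (rpm * 60.0)             # P, mm/rev
    cutter_load = (thrust_kn - friction_kn) / n_cutters   # F_n, kN/cutter
    fpi = cutter_load / penetration                       # FPI, kN/cutter/mm/rev
    return {"ROP": rop, "P": penetration, "Fn": cutter_load, "FPI": fpi}


# Hypothetical section: every number below is illustrative only.
machine_weight_kn = 2500.0
friction_kn = 0.20 * machine_weight_kn   # assumed: 20% of machine weight
section = performance_indices(
    boring_length_m=12.0, boring_time_h=4.0, rpm=10.0,
    thrust_kn=6000.0, friction_kn=friction_kn, n_cutters=25,
)
print(section)
```

For squeezing ground the friction term would have to be estimated differently, so `friction_kn` is deliberately left as an input rather than computed inside the function.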

Table 2 Structure of the TBM performance database

One important issue to be noted in this process is missing data for different parameters in different records. Due to the difficulty of dealing with volumes of detailed data in several separate databases for different projects, it was necessary to reduce the number of data sets to a manageable number. Heterogeneity of the data was also an issue, caused by the use of different protocols for recording TBM performance data at different tunnel job sites. Uniaxial compressive strength (UCS) is a commonly used representative of rock strength in almost all TBM tunnel projects. An increase in UCS causes a decrease in PR, as noted by many researchers (Rostami 2013; Gong and Zhao 2009; Salimi et al. 2016). Rock mass behaviour is a function of the rock material, the frequency of joints and the existing joint conditions, and it certainly influences rock cutting by TBM. Joint spacing can be represented by RQD, joint frequency and volumetric joint count. In this study, based on data availability, RQD and \(J_{V}\) have been used to represent joint frequency. As can be seen from Table 1, the database covers three main types of rock, namely igneous rocks (38%), metamorphic rocks (31%) and sedimentary rocks (31%), and the boring diameter varies between 3.6 and 10.5 m.

Various TBM performance indices have been proposed by different researchers, including the field penetration index (FPI), specific penetration (SP, the inverse of FPI) and boreability index (BI, similar to FPI). Among these indices, FPI has received more attention than the others. As a representative of TBM performance, FPI is commonly used to express the “boreability” of the rock under changing geological/geotechnical circumstances, that is, the ease or difficulty of rock mass excavation by a TBM. The main advantage of FPI is that it normalizes penetration for cutter load and thus automatically accounts for machine thrust variations. FPI can also be used across different TBM diameters, since it accounts for cutterhead RPM and the number of cutters. Usually, stronger and less fractured rock masses are more difficult to cut with disc cutters, and boring them by TBM requires higher thrust levels to achieve a certain depth of penetration. Higher values of FPI are therefore usually seen in strong and massive rock masses. In contrast, there is no need to apply high thrust values for excavation of poor-quality rock masses (weaker and more fractured), because crack initiation and propagation are enhanced by pre-existing fractures. The values of FPI are therefore low in such conditions (Hassanpour et al. 2011; Salimi et al. 2019a, b). The graphs in Fig. 2 show the histograms and distribution curves of the geological and TBM performance parameters recorded in the database, grouped by rock type (G: igneous rocks; M: metamorphic rocks; S: sedimentary rocks). It should be noted that \(J_{V}\) is not available in all selected tunnel sections.

Fig. 2 Distribution curve and frequency histogram of rock mass and TBM performance parameters in the database grouped by rock type (\(J_{V}\) is only available for massive hard rocks)

4 Developing New Models

In rock engineering practice, statistically based empirical equations have been extensively applied to predict target variables from other operational or geological parameters. Empirical equations are of great importance during the early stages of rock excavation and design works, since they are more practical than extensive theoretical analyses. In the field of geomechanics, each rock type has its own texture, grain size, cementation and behaviour, which affect the boreability and penetration rate of TBMs. Salimi et al. (2019a) considered Rock Type Code (RTC) as an input parameter in their proposed model to estimate TBM FPI. RTC was introduced by Laughton in the 1990s and has been used by Farrokh et al. (2012). Table 3 displays seven rock type categories. The first four classes are for sedimentary rocks. The fifth, sixth, and seventh classes are for metamorphic rocks, granitic rocks, and volcanic rocks, respectively. It should be noted that gneiss (GN) is inherently metamorphic, but it is typically closer to granite in terms of its behaviour, especially where foliation is less pronounced. For this reason, it was categorized as GN in that analysis. To use rock type code as one of the selected input parameters to the model, the code numbers 1 for G and GN, 2 for MV, 3 for SLK and 5 for C were employed (Salimi et al. 2019a, b).

Table 3 Rock type categorization in database

According to the results of the sensitivity analysis and parametric study of common models conducted by Fatemi et al. (2016), RTC plays a significant role in the estimation of rock mass boreability. The same results have been found by Salimi (2021) and Salimi et al. (2019a). Generally speaking, rock texture, which consists of grains and matrix, is directly correlated with the physical and mechanical properties of the rock material and thus with rock drillability. As an illustration, Table 4 presents an example in which the basic RMR system is calculated for two rock masses in two different rock types. Overall, among the most commonly used rock mass classification systems, the RMR classification is the easiest to apply and shows better correlation with TBM performance, possibly due to the use of intact rock compressive strength as an input parameter. Moreover, RMR is frequently used in the tunnel design process and is reported from the logging of cores in site investigation reports as well as from back mapping of tunnels. As such, the input parameters of the RMR system are often available for various projects. As can be seen from Table 4, despite the similar RMR values found for the two types of rock (amphibolite, which is an igneous crystalline rock, and limestone, which is a common sedimentary rock), the boreability of the rock masses is different. Several factors can directly or indirectly affect TBM performance, such as the angle between the tunnel axis and discontinuity planes α (alpha angle), crew experience and the backup system. From a geological point of view, however, it can be expected that the differences are due to rock texture and cementation.

Table 4 Comparison between different rock type categorizations with the same RMR value and TBM FPI

Therefore, as anticipated, the boreability of rock masses is affected by rock texture and cementation (Salimi et al. 2019b). Similar results have also been observed for other rock mass classifications, such as the Rock Structure Rating (RSR) by Wickham et al. (1972). In this study, given the available data from different rock types, performance prediction models are introduced based on similarities in rock texture, categorized as G & GN, MV, SLK and C. The descriptive statistics of the variables in the database and the input parameters of the model generated for each rock type are summarized in Table 5. Figure 3 shows the percentage distribution of the rock type categories in this investigation.

Table 5 Descriptive statistics of generated database based on different classes
Fig. 3 Percentage distribution of different rock type codes

As expected, higher values of FPI and its associated parameters are found in hard massive rocks (“G & GN”), whereas lower values are associated with soft/weak and more fractured rocks (“C”). This confirms the need for new models that include rock type categorization to reflect similarities in texture. It is worth noting that, in class G & GN, the volumetric joint count (Jv) was available and showed better correlation with FPI than RQD. This could be attributed to the limited ability of RQD to represent the degree of rock mass fracturing in hard massive rocks, since it is an index of discontinuity spacing/frequency with a maximum value of 100. This may be why Gong and Zhao (2009) and Delisio and Zhao (2014) considered Jv a better representative of joint frequency in their models (Salimi 2021).

To develop empirical models, the data in the database were divided into two categories, i.e., training and testing/validation, which were used for developing and evaluating the proposed models, respectively. Many investigations recommend using 20% of the data for testing. However, when the available data are partitioned into 80% training and 20% testing, the number of samples that can be used for learning the model is considerably reduced, and the results can depend on the particular random choice of the (train, validation) pair. To address this issue, between 15 and 20% of the data was used for testing and validating the models. Further details of the datasets used in training the models are presented in Table 6.
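As a rough illustration of the partitioning step described above, the sketch below splits a list of records into training and testing subsets with a fixed random seed. The function name, the seed and the 20% fraction are assumptions made for this example; the study itself used R, and its exact splitting procedure is not reproduced here.

```python
import random


def split_records(records, test_fraction=0.2, seed=0):
    """Randomly partition records into (train, test) lists."""
    if not 0.0 < test_fraction < 1.0:
        raise ValueError("test_fraction must be between 0 and 1")
    shuffled = list(records)
    # A seeded generator makes the split reproducible between runs.
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


# Hypothetical record identifiers standing in for database rows.
train, test = split_records(range(100), test_fraction=0.2, seed=42)
print(len(train), len(test))
```

Changing `seed` produces a different (train, test) pair, which is exactly the dependence on the random choice that the paragraph above warns about.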

Table 6 Further details of the dataset used in models

The available input parameters for each rock type were analysed using the R statistical computing software (the stats, car and MASS packages as well as the caret library), which yielded the following empirical equations. The comparison between the calculated and predicted FPI for each rock type is shown in Fig. 4.

$${\text{Rock type }}{{^{\prime\prime}}}{\text{G}} \& {\text{GN}}{{^{\prime\prime}}}{:}\,\, {\text{FPI}} = \,e^{2.775} \cdot {\text{UCS}}^{0.367} \, \cdot J_{V}^{ - 0.432} \quad \left( {R^{2} = \, 0.70} \right)$$
(2)
$${\text{Rock type }}{{^{\prime\prime}}}{\text{MV}}{{^{\prime\prime}}}{:}\;{\text{FPI}}\, = \,\exp (1.633 + 0.007 \cdot {\text{UCS}} + 0.009 \cdot {\text{RQD}})\quad \left( {R^{2} = \, 0.71} \right)$$
(3)
$${\text{Rock type }}{{^{\prime\prime}}}{\text{SLK}}{{^{\prime\prime}}}{:}\;{\text{FPI}}\, = \,\exp (2.164 + 0.004 \cdot {\text{UCS}} + 0.006 \cdot {\text{RQD}})\quad \left( {R^{2} = 0.73} \right)$$
(4)
$${\text{Rock type }}{{^{\prime\prime}}}{\text{C}}{{^{\prime\prime}}}{:}\;{\text{FPI}}\, = \,e^{ - 0.822} \, \cdot {\text{UCS}}^{0.146} \cdot {\text{RQD}}^{0.69} \quad \left( {R^{2} = 0.72} \right)$$
(5)
Fig. 4 Comparison between the calculated and predicted FPI based on rock type categorization via regression analysis for training and testing datasets
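For readers who want to evaluate Eqs. 2–5 numerically, a minimal Python sketch is given below. The coefficients are copied from the equations; the function names and the dispatch table are hypothetical, and the input units (UCS in MPa, RQD in %, \(J_{V}\) in joints per cubic metre) are assumed conventions rather than something stated alongside the equations.

```python
import math


def fpi_g_gn(ucs, jv):
    """Eq. 2, rock type G & GN: power law in UCS and Jv."""
    return math.exp(2.775) * ucs ** 0.367 * jv ** -0.432


def fpi_mv(ucs, rqd):
    """Eq. 3, rock type MV: exponential in UCS and RQD."""
    return math.exp(1.633 + 0.007 * ucs + 0.009 * rqd)


def fpi_slk(ucs, rqd):
    """Eq. 4, rock type SLK: exponential in UCS and RQD."""
    return math.exp(2.164 + 0.004 * ucs + 0.006 * rqd)


def fpi_c(ucs, rqd):
    """Eq. 5, rock type C: power law in UCS and RQD."""
    return math.exp(-0.822) * ucs ** 0.146 * rqd ** 0.69


# Hypothetical lookup from rock type label to its regression model.
FPI_MODELS = {"G & GN": fpi_g_gn, "MV": fpi_mv, "SLK": fpi_slk, "C": fpi_c}

# Illustrative inputs only: UCS = 120, RQD = 80.
print(FPI_MODELS["MV"](120.0, 80.0))
```

Note that the G & GN model takes \(J_{V}\) as its second argument while the other three take RQD, so the dispatch table should only be called with the matching joint parameter.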

5 Machine Learning Methods for TBM Performance Models

Apart from empirical and theoretical models, machine learning (ML) techniques have been used to examine the relationship between FPI and geological parameters for each rock type. The background and the pros and cons of using ML systems for TBM performance prediction have been discussed in previous publications (Zhang et al. 2020; Salimi et al. 2019a; Armaghani et al. 2017). In this context, decision trees are relatively easy to apply and are transparent and interpretable, since they reveal patterns that help explain a given phenomenon, showing the most important variables and their threshold values. This contribution reports the application of regression trees to assess TBM performance and offers graphs that others can use to reproduce the results and to predict TBM performance for future projects.

5.1 TBM Performance Prediction Models Using Regression Tree

One of the most popular techniques in data mining (analysis) is the decision tree (DT), which uses a simple and comprehensible structure for classification, recognition, decision making and prediction of target parameters. Among the several kinds of DT methods, CART has been widely used, with a high level of accuracy and performance, for prediction problems in different engineering fields (Salimi 2021; Breiman et al. 1984; Tiryaki 2008). CART is a rule-based method introduced by Breiman et al. (1984). Depending on whether the dependent variable is qualitative or quantitative, it is categorized as a classification tree (CT) or a regression tree (RT), respectively. This technique is recommended in situations where the form of the relationship between the dependent variable (response) and the independent variables (predictors) is not exactly known before building a predictive model (Breiman et al. 1984). Furthermore, CART analysis requires no prior assumptions about the relationship between variables.

Given a set of samples, CART identifies one input variable and one break-point and then partitions the samples into two child nodes. Starting from the entire set of available training samples (root node), recursive binary partitioning is performed at each node until no further split is possible or a certain terminating criterion is satisfied. At each node, the best split is identified by exhaustive search: all potential splits on each input variable and each break-point are tested, and the split that gives the minimum total deviation when the samples in each of the two child nodes are predicted by the mean of their output variable is selected. The tree growing procedure typically produces an overly large tree, which generalises poorly to unseen samples. A pruning procedure is therefore employed to sequentially remove the splits that contribute insufficiently to training accuracy. The tree is pruned from the maximal sized tree all the way back to the root node, resulting in a sequence of candidate trees. Each candidate tree is tested on an independent validation sample set, and the one with the lowest prediction error is selected as the final tree (Breiman 2001; Wu et al. 2008; Yang et al. 2017; Salimi 2021).
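The exhaustive split search described above can be sketched as follows. This is a simplified illustration, not the implementation used in the study: it assumes the sum of squared deviations from the child-node mean as the deviation measure, places candidate break-points midway between consecutive distinct values, and uses hypothetical function names and toy data.

```python
def sum_squared_deviation(values):
    """Deviation of a node when every sample is predicted by the node mean."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def best_split(rows, targets):
    """Test every variable and break-point; return (cost, variable, break_point)."""
    best = None
    for variable in rows[0]:
        levels = sorted({row[variable] for row in rows})
        # Candidate break-points lie midway between consecutive distinct values.
        for low, high in zip(levels, levels[1:]):
            break_point = (low + high) / 2.0
            left = [y for row, y in zip(rows, targets) if row[variable] < break_point]
            right = [y for row, y in zip(rows, targets) if row[variable] >= break_point]
            cost = sum_squared_deviation(left) + sum_squared_deviation(right)
            if best is None or cost < best[0]:
                best = (cost, variable, break_point)
    return best


# Toy data (illustrative only): FPI jumps when RQD exceeds about 60.
rows = [
    {"UCS": 100.0, "RQD": 30.0},
    {"UCS": 150.0, "RQD": 40.0},
    {"UCS": 120.0, "RQD": 80.0},
    {"UCS": 90.0, "RQD": 90.0},
]
fpi = [10.0, 10.0, 30.0, 30.0]
print(best_split(rows, fpi))
```

Growing a full tree would apply `best_split` recursively to the left and right subsets until a terminating criterion is met.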

To perform the RT, recursive partitioning and multiple regressions are carried out on the database. Starting from the root node, the data splitting process is repeated at each internal node of the tree until a previously specified stop condition is reached. Stopping criteria can be defined for the regression tree algorithm to keep the resultant tree from becoming too complicated to interpret. The three main stopping criteria are: (1) a minimum number of observations in a node for a split to be attempted; (2) the depth of the tree; and (3) the complexity parameter (cp). The complexity parameter is applied through the following rule: if a division at a specified node does not improve the fit of the model to the data by at least the set value of cp, then this division is ignored (Therneau and Atkinson 1997; Therneau et al. 2012; Tomczyk and Ewertowski 2013; Salimi 2021).
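The three stopping criteria can be expressed as a single check that is evaluated before each split. The sketch below is an assumption-laden illustration: the function name, argument names and default thresholds are placeholders chosen for this example, and the cp rule is interpreted as requiring the improvement to be at least cp times the deviation of the root node.

```python
def should_split(n_obs, depth, improvement, root_error,
                 min_split=20, max_depth=30, cp=0.01):
    """Return True only if none of the three stopping criteria applies."""
    if n_obs < min_split:                 # (1) too few observations in the node
        return False
    if depth >= max_depth:                # (2) maximum tree depth reached
        return False
    if improvement < cp * root_error:     # (3) split does not improve the fit by cp
        return False
    return True


# Illustrative calls with hypothetical node statistics.
print(should_split(n_obs=50, depth=3, improvement=5.0, root_error=100.0))
print(should_split(n_obs=50, depth=3, improvement=0.5, root_error=100.0))
```

Lowering `cp` lets weaker splits through, so the same node statistics can lead to a deeper tree.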

Alternatively, the optimal tree structure can be identified through ten-fold cross-validation. In brief, in regression problems, where the output is a continuous number, CART can predict the targeted outcome using, for example, the least absolute deviation (LAD) error. This study used CART RT models implemented in the R statistical computing software via the rpart, party, ggparty and mlbench libraries. It is worth noting that a balance must often be struck in the depth and complexity of the tree to optimize predictive performance on unseen data. With respect to tree depth, the deeper the tree, the more complicated the model becomes and the harder the tree is to produce; if the depth of the tree is low, the efficiency of the model is reduced and some parameters may be omitted. Hence, the tree depth was limited to 5, 6, 7 and 8. To find this balance, one typically grows a very large tree, as described in the previous section, and then prunes it back to find an optimal subtree. The optimal subtree can be found using a cost complexity parameter that penalizes the objective function (LAD) for the number of terminal nodes of the tree. For a given value of this parameter, the smallest pruned tree that has the lowest penalized error is sought. Smaller penalties tend to produce more complex models, which result in larger trees, whereas larger penalties result in much smaller trees. Behind the scenes, the CART RT implementation in R applies a range of cost complexity (cp) values to prune the tree; the values 0.01, 0.001 and 0.0001 were considered, following the literature. To compare the error for each cp value, ten-fold cross-validation was performed, so that the error associated with a given cp value was computed, and the value with the lowest RMSE, or with no significant difference from it, was selected. In brief, determining a suitable combination of design parameters was treated as a factor of paramount importance. This allowed the generation of operative, robust tree-based regression models with a high generalization capacity. Further information regarding the algorithm and its mathematical logic can be found in Breiman et al. (1984).
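The effect of the cost complexity penalty on tree size can be illustrated with a short sketch. It assumes the generic penalized form "error plus penalty times number of terminal nodes" from Breiman et al. (1984); the candidate subtrees, their errors and the penalty values are hypothetical and are not related to the cp values or trees of this study.

```python
def select_subtree(candidates, penalty):
    """Pick the (n_terminal_nodes, error) pair with the lowest penalized error."""
    return min(candidates, key=lambda tree: tree[1] + penalty * tree[0])


# Hypothetical pruning sequence: (number of terminal nodes, prediction error).
candidates = [(1, 100.0), (3, 40.0), (8, 25.0), (20, 22.0)]

for penalty in (10.0, 1.0, 0.1):
    n_leaves, error = select_subtree(candidates, penalty)
    print(f"penalty={penalty}: {n_leaves} terminal nodes, error={error}")
```

In practice the error of each candidate would come from cross-validation rather than from the training data, so that the selected tree is the one expected to generalize best.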

The same data employed in the regression models were used to develop tree-based models for estimating TBM FPI by rock type category, in both the training and validation stages. Figures 5, 6, 7 and 8 illustrate the preferred trees, and Tables 9, 10, 11 and 12 in Appendix A present detailed information regarding the structure of the trees developed for TBM performance estimation based on rock type categorization.

Fig. 5 Regression tree developed for estimation of TBM FPI, rock type “G & GN”

Fig. 6 Regression tree developed for estimation of TBM FPI, rock type “MV”

Fig. 7 Regression tree developed for estimation of TBM FPI, rock type “SLK”

Fig. 8 Regression tree developed for estimation of TBM FPI, rock type “C”

Figure 9 displays the optimum tree size, and Fig. 10 shows the relationship between the measured values and those predicted by the CART model for each rock type in the training and testing stages. Furthermore, the relative variable importance of the tree-based model developed for each rock type is shown in Fig. 11. Relative variable importance standardizes the importance values for ease of interpretation and is defined as the percent improvement with respect to the most important predictor. An important variable is one that is used as a primary or surrogate splitter in the tree. The variable with the highest improvement score is set as the most important variable, and the other variables are ranked accordingly. As can be seen, the input parameters selected for model development adequately reflect the effect of both intact rock and rock mass properties on TBM FPI. In general, it can be concluded that joint frequency plays a more significant role in TBM performance than intact rock properties such as UCS. This is in agreement with previous investigations (Hassanpour et al. 2011; Bruland 1998).

Fig. 9 Optimum tree (CART) size generated by R Statistical program for each rock type categorization

Fig. 10 Comparison between the calculated and predicted FPI based on rock type categorization via regression tree (CART) for training and testing datasets

Fig. 11 Relative variable importance charts generated via “R” based on rock type categorization

Also, the following formula can be used to calculate ROP (m/h) from the FPI predicted by developed models (Equations & Graphs):

$${\text{ROP}}(m/h) = \frac{{0.06 \times F_{n} \times {\text{RPM}}}}{{{\text{FPI}}}},$$
(6)

where \(F_{n}\) is the average cutter load (kN/cutter), \({\text{RPM}}\) is the cutterhead speed (rev/min), and \({\text{FPI}}\) is the field penetration index (kN/cutter/mm/rev).
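Eq. 6 follows from Eq. 1: the penetration per revolution is \(P = F_{n}/{\text{FPI}}\) (mm/rev), and multiplying by RPM and by 60/1000 converts mm/min to m/h, which gives the factor 0.06. A minimal Python sketch of the conversion is shown below; the function name and the input values are hypothetical.

```python
def rop_from_fpi(cutter_load_kn, rpm, fpi):
    """Eq. 6: rate of penetration (m/h) from a predicted FPI."""
    if fpi <= 0:
        raise ValueError("FPI must be positive")
    penetration_mm_per_rev = cutter_load_kn / fpi      # P = F_n / FPI
    return 0.06 * penetration_mm_per_rev * rpm         # mm/rev * rev/min -> m/h


# Illustrative values only: 220 kN/cutter, 10 rev/min, FPI = 44 kN/cutter/mm/rev.
print(rop_from_fpi(cutter_load_kn=220.0, rpm=10.0, fpi=44.0))
```

The cutter load and RPM are operating choices, so the same predicted FPI yields different ROP values for different machine settings.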

6 Comparison of the Developed Models

The performance of the proposed models was evaluated according to statistical criteria including the root-mean-square error (RMSE), mean-squared error (MSE) and \(R^{2}\), as follows:

$${\text{RMSE}}\, = \,\sqrt {\frac{1}{N}\sum\nolimits_{i = 1}^{N} {\left( {y_{i} - y^{\prime}_{i} } \right)^{2} } }$$
(7)
$$R^{2} \, = \,1 - \frac{{\sum\nolimits_{i = 1}^{N} {\left( {y_{i} - y^{\prime}_{i} } \right)^{2} } }}{{\sum\nolimits_{i = 1}^{N} {\left( {y_{i} - \tilde{y}} \right)^{2} } }}$$
(8)
$${\text{MSE}}\, = \,\frac{1}{N}\sum\limits_{i = 1}^{N} {(y_{i} - y^{\prime}_{i} )^{2} } ,$$
(9)

where y, \(y^{\prime}\) and \(\tilde{y}\) are the measured value, the predicted value and the mean of the variable y, respectively, and N is the total number of data sets. It is worth noting that an ideal model has \(R^{2}\) = 1 and RMSE and MSE equal to 0. The results of applying these models are summarized in Table 7. The results show that the CART models offer higher accuracy in predicting FPI. As discussed before, statistical analysis alone cannot offer satisfactory results, and the application of machine learning (ML) methods can improve on the results of regression analysis; in particular, tree-based modelling is an excellent alternative to regression analysis. In addition, it can handle data that are not normally distributed. This is a clear advantage in this field, since most data do not follow a normal distribution. CART models are also easy to represent visually, which makes a complex predictive model much easier to interpret. Additionally, decision trees are less likely to be influenced by outliers or missing values, since they make no assumptions about space distributions or classifier structure (Salimi 2021).
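The three criteria in Eqs. 7–9 can be computed with a few lines of Python, as sketched below. The function names and the sample values are hypothetical; the measured and predicted values are assumed to be paired lists of equal length.

```python
import math


def mse(measured, predicted):
    """Eq. 9: mean of the squared differences."""
    return sum((y - p) ** 2 for y, p in zip(measured, predicted)) / len(measured)


def rmse(measured, predicted):
    """Eq. 7: square root of the mean-squared error."""
    return math.sqrt(mse(measured, predicted))


def r_squared(measured, predicted):
    """Eq. 8: one minus residual sum of squares over total sum of squares."""
    mean_y = sum(measured) / len(measured)
    ss_res = sum((y - p) ** 2 for y, p in zip(measured, predicted))
    ss_tot = sum((y - mean_y) ** 2 for y in measured)
    return 1.0 - ss_res / ss_tot


# Illustrative FPI values only.
measured_fpi = [10.0, 20.0, 30.0]
predicted_fpi = [12.0, 18.0, 33.0]
print(mse(measured_fpi, predicted_fpi),
      rmse(measured_fpi, predicted_fpi),
      r_squared(measured_fpi, predicted_fpi))
```

A perfect prediction gives MSE = RMSE = 0 and \(R^{2}\) = 1, which matches the ideal case described in the text.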

Table 7 Performance indices for developed models

6.1 Comparison of the Proposed and Existing Models

Among the models presented in the last decade, the model proposed by Hassanpour et al. (2011) was developed based on the FPI model, has similar input parameters, and shows promising results compared to common prediction models such as QTBM (Hassanpour et al. 2016). The formula introduced by Hassanpour et al. (2011), which has an associated chart, is presented in Eq. 10; it is widely applicable and reflects a practical approach for the early stages of tunnel design and construction. The model was developed based on two inputs, UCS and RQD, which are commonly available in tunnelling projects around the world. It is also worth noting that, in this study, the model developed for estimating FPI in rock type G & GN is based on UCS and Jv, and a relevant formula that converts Jv into RQD can be used to offer equivalency between the results of this study and those of Hassanpour et al. (2011).

$${\text{FPI}} = \exp (0.008 \cdot {\text{UCS}} + 0.015 \cdot {\text{RQD}} + 1.384)$$
(10)
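Equation 10 can be evaluated with the short sketch below. The function names are hypothetical, and UCS is assumed to be in MPa and RQD in percent. The text does not state which Jv-to-RQD conversion is meant; the sketch assumes the commonly cited Palmström relation, RQD = 115 − 3.3·Jv, limited to the range 0–100, and any other conversion can be substituted.

```python
import math


def fpi_hassanpour(ucs_mpa: float, rqd_percent: float) -> float:
    """FPI from Eq. 10 (Hassanpour et al. 2011); UCS assumed in MPa, RQD in %."""
    return math.exp(0.008 * ucs_mpa + 0.015 * rqd_percent + 1.384)


def rqd_from_jv(jv: float) -> float:
    """ASSUMPTION: Palmstrom-type conversion RQD = 115 - 3.3 * Jv, clipped to 0-100.

    The source does not specify which Jv-to-RQD formula to use.
    """
    return max(0.0, min(100.0, 115.0 - 3.3 * jv))


# Illustrative inputs only (hypothetical rock mass, not from the database).
ucs = 150.0   # MPa
jv = 10.0     # joints per cubic metre
rqd = rqd_from_jv(jv)
print(f"RQD = {rqd:.1f} %, FPI = {fpi_hassanpour(ucs, rqd):.1f}")
```

Because RQD is capped at 100, every rock mass with Jv below about 4.5 maps to the same RQD under this assumed conversion, so Eq. 10 cannot distinguish between them.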

Figure 12 shows the relationship between the measured values and the values predicted by Hassanpour’s model for each rock type in the testing stage. Since the CART model shows the best results among the models developed in this investigation for each rock type category, it was selected for comparison with the FPI estimated by Hassanpour’s model. For this purpose, the absolute percentage error, E(%), of each model for each rock type category is calculated as follows:

$$E(\% ) = 100 \cdot \left| {\frac{{{\text{Actual}}\,{\text{FPI}} - {\text{Estimated}}\,{\text{FPI}}}}{{{\text{Actual}}\,{\text{FPI}}}}} \right|$$
(11)
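
The error measure in Eq. 11 and simple descriptive statistics of it can be sketched as follows. The function names and the three value pairs are hypothetical and serve only to illustrate the calculation; they do not reproduce any result of the study.

```python
from statistics import mean


def abs_pct_error(actual_fpi: float, estimated_fpi: float) -> float:
    """Absolute percentage error E(%) following Eq. 11."""
    if actual_fpi == 0:
        raise ValueError("Actual FPI must be non-zero")
    return 100.0 * abs((actual_fpi - estimated_fpi) / actual_fpi)


def summarize_errors(actual, estimated):
    """Minimum, maximum and mean of E(%) over paired observations."""
    errors = [abs_pct_error(a, e) for a, e in zip(actual, estimated)]
    return {"min": min(errors), "max": max(errors), "mean": mean(errors)}


# Toy values for illustration only (hypothetical, not from the testing dataset).
actual = [40.0, 20.0, 50.0]
estimated = [30.0, 22.0, 50.0]
print(summarize_errors(actual, estimated))
```
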

Fig. 12 Comparison between the calculated and predicted FPI based on rock type categorization via Hassanpour’s model in testing stage

A summary of the statistical analysis performed on the calculated rates and their respective errors is presented in Table 8. As can be seen, Hassanpour’s model provides better results in rock type category MV & SLK, whereas it shows higher errors in the other rock types, namely G & GN and C. One reason could be the use of RQD to represent joint frequency in hard, massive rock masses, where RQD cannot represent the joint spacing adequately because it is capped at 100. Another cause may be the range of the data compiled for Hassanpour’s model. Additionally, differences between the geological characteristics of the sites used in developing the models affect their accuracy. However, it is worth noting that the model presented by Hassanpour shows promising and acceptable results when it is applied to TBM tunnelling under similar conditions, such as the range of UCS, the disc cutter diameter (17″) and the geological characteristics. When choosing between empirical, theoretical, or ML models to predict TBM performance, one should pay attention to the application range of the model and the geological conditions on which the original model was based.

Table 8 Descriptive statistics of absolute errors (E %) for the Hassanpour and CART prediction models with respect to rock type categorization

7 Model Limitations

The CART and empirical models developed in this investigation have some limitations in their application, as do any other empirical models. Additional limitations relate to machine parameters, i.e., the cutter diameter used on the machines, since the data used in this study are primarily from 432 mm (17″) disc cutters. Although the thrust force per cutter in the developed model is normalized by the number of cutters, the concentrated stress acting on the rock face at the contact point, which initiates fracture propagation, is still greatly affected by cutter diameter and cutter tip width even if the force per cutter is the same, as noted by Gong and Zhao (2009). Although these machines have different diameters, they are similar in most of their specifications, particularly in cutterhead design and cutter arrangement; i.e., the average spacing of disc cutters in all cutterheads was in the range of 60–90 mm. Consequently, when the machine parameters change (especially cutter diameter, cutter tip width and spacing), the model needs to be used with consideration of the effects of these parameters. Existing models such as the CSM formula, which allows for variation of these parameters, could perhaps be used to develop adjustment factors that extend the use of the proposed FPI values to cases where the disc diameter, tip width or spacing is outside the range of the available database. Furthermore, the estimated FPI and machine performance are not valid for mixed-face or transitional working conditions, unstable blocky ground, or squeezing ground conditions.

8 Discussion and Conclusion

In this study, a database of TBM field performance from eight tunnelling projects with a total length of 92.93 km and boring diameters of 3.6–10.5 m in different geological conditions was compiled and subjected to statistical analysis to derive empirical regression formulas for estimating the field penetration index (FPI). The data were subsequently analysed by machine learning (ML) methods, and CART charts/graphs are offered to improve performance prediction for hard rock TBMs while incorporating rock type in the analysis.

Given the ability of RT to perform recursive partitioning as an alternative to traditional multiple regression, the method was used to analyse a database of TBM field performance. The main advantage of CART (a tree-based regression model) is that the end user neither needs computer code nor has to be an expert in the field to use the model. In many applications, such as TBM performance prediction, CART presents information more clearly, making the data understandable through graphic representations. It allows for selection of the most important variables and their threshold values, and finally assigns proper ratings and weights to each parameter based on the internal regression with the observed values. Moreover, the impact of each variable on the target can be readily identified from the tree structure of the CART model.
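The threshold selection at the heart of recursive partitioning can be illustrated with a minimal sketch. It finds, for a single input variable, the split that minimizes the summed squared error of the two resulting groups; a full regression tree repeats this search over all variables and then recursively within each group. The function names and the toy UCS and FPI values are hypothetical, and the sketch is not the implementation used in this study.

```python
def sse(values):
    """Sum of squared deviations from the mean (0 for an empty group)."""
    if not values:
        return 0.0
    m = sum(values) / len(values)
    return sum((v - m) ** 2 for v in values)


def best_split(x, y):
    """Return (threshold, total_sse) of the best binary split 'x <= threshold'."""
    pairs = sorted(zip(x, y))
    best_threshold, best_total = None, float("inf")
    for i in range(1, len(pairs)):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # no threshold can separate identical x values
        threshold = (pairs[i - 1][0] + pairs[i][0]) / 2.0
        left = [yy for _, yy in pairs[:i]]
        right = [yy for _, yy in pairs[i:]]
        total = sse(left) + sse(right)
        if total < best_total:
            best_threshold, best_total = threshold, total
    return best_threshold, best_total


# Toy values for illustration only (hypothetical, not from the database).
ucs = [50.0, 60.0, 70.0, 150.0, 160.0, 170.0]
fpi = [10.0, 12.0, 11.0, 40.0, 42.0, 41.0]
threshold, total = best_split(ucs, fpi)
print(f"Best split: UCS <= {threshold:.0f} (SSE = {total:.1f})")
```

Each leaf of the resulting tree predicts the mean of the observations that fall into it, which is why the final model can be read directly from a chart without running any code.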

The regression tree (RT) models can offer more accurate alternatives to traditional multiple regression models. The models proposed in this study, consisting of equations and graphs, were developed based on a categorization of rock types according to similarities in rock texture (cementation, grain size and shape). The results show that incorporating rock type is very useful: using the corresponding categories as input for TBM performance prediction complements the most influential parameters, namely UCS (intact rock strength) and Jv or RQD (degree of fracturing of the rock mass). The results also indicate that the CART model outperforms the regression models, with typical R2 close to 90%, compared to the multivariable regression equations, which offer R2 in the mid-70% range. The formulas and graphs suggested in this study, which allow for incorporation of rock type, offer more accurate results than the previous generalized CART-based models.