Introduction

The tunnel boring machine (TBM) is an advanced engineering solution that automates tunnel excavation. Its rotating cutter head and cutting tools bore through various geological formations, ranging from soft soil and clay to hard rock and abrasive materials. By employing state-of-the-art technology and engineering principles, TBMs can tunnel through challenging terrains with precision and efficiency. In contrast to conventional tunneling techniques such as drilling and blasting, mechanized tunneling with TBMs minimizes the need for human involvement at the tunnel face. TBMs operate in a closed, controlled working environment that further enhances safety standards (Acaroglu et al. 2008; Hartlieb et al. 2017; Zhang et al. 2017; Ren et al. 2018; Feng et al. 2021). However, several factors reduce the excavation efficiency of TBM tunnelling. These factors can be classified into two main types: ground-machine interactions and human‒machine interactions. Numerous studies have been conducted to identify and predict the causes of inefficiency in TBM tunnelling. These studies can be categorized into empirical/theoretical models and artificial intelligence applications. Empirical/theoretical models have been introduced to examine the predominant characteristics of TBMs (Roxborough and Phillips 1975; Farmer and Glossop 1980; Wijk 1992; Rostami and Ozdemir 1993; Barton 2000; Cardu and Oreste 2011; Macias et al. 2016; Cardu et al. 2017). Empirical/theoretical models (She et al. 2024) have considered various rock parameters, including the Cerchar abrasivity index (CAI), uniaxial compressive strength (UCS), quartz content, Vickers hardness number (VHNR), fracturing degree, porosity, drilling rate index, and rock mass classification, as well as machine properties such as the thrust force (MN), torque (MN.m), penetration rate (m/h), revolutions per minute (rpm), and TBM diameter, to estimate cutter tool life. Nevertheless, this approach has limitations, such as the need for specialized laboratory equipment, reliance on a limited set of input parameters, and the long time required by the complex experimental setup process.

Many researchers have developed artificial intelligence models for predicting TBM performance to address the limitations of empirical methods. These models have been used to predict TBM performance based on cutter wear and penetration rate (Alvarez Grima et al. 2000; Benardos and Kaliampakos 2004; Mahdevari et al. 2012, 2014; Ghasemi et al. 2014; Jahed Armaghani et al. 2018; Salimi et al. 2018, 2022; Koopialipoor et al. 2019; Feng et al. 2021; Kilic et al. 2022; Yang et al. 2022). However, while artificial intelligence models have demonstrated their effectiveness in predicting TBM penetration rates, accurately forecasting specific energy remains a considerable challenge in mechanized tunnelling. The specific energy (SE) is a critical parameter that directly affects TBM performance and energy consumption throughout tunnel excavation. Understanding and accurately predicting specific energy is vital for optimizing tunnelling operations, enhancing overall efficiency, and minimizing construction expenses (Mirahmadi and Dehkordi 2019; Tang et al. 2023). Several researchers have developed empirical, experimental, and intelligent models to analyze the specific energy consumption of mechanized tunneling. Snowdon et al. (1982), Balci and Tumac (2012), Cho et al. (2013), Copur et al. (2014), and Pan et al. (2019) used a linear cutting machine (LCM) to analyze the relationship between penetration rate and specific energy. Nevertheless, full-scale linear cutting tests have certain limitations. One drawback is the challenge of acquiring large rock blocks, which can be difficult or sometimes infeasible. Additionally, many researchers do not have access to the necessary testing equipment, such as a full-scale linear cutting machine. Other studies have related the specific energy to rock properties. Altindag (2003) reported a significant relationship between the brittleness of a rock mass and its specific energy. Preinl et al. (2006) identified a correlation between rock mass excavability (RME) and specific energy. Atici and Ersoy (2009) explored the impact of brittleness and destruction energy on specific energy. Acaroglu et al. (2008) proposed a fuzzy logic method to estimate the specific energy requirements of constant cross-section disc cutters during rock cutting. Tang et al. (2023) performed regression analyses to explore the connection between excavation parameters and rock cutoff. They established a prediction model for UCS and a classification model for rock cutoff based on SE. Moreover, Zhang et al. (2012) integrated a mechanical analysis of the shield excavation process with a nonlinear multiple regression model using on-site data, leading to the development of a diagnostic model for SE. Wang et al. (2012) introduced a novel SE equation that accounts for variations in the disc-cutter radius, enhancing the predictive accuracy for TBM performance. However, simple regression models face bottlenecks in predicting specific energy because they have been constructed with limited data and cannot capture reliable information. Zhou et al. (2022) developed a physics-informed machine learning model to predict the energy consumption of a TBM using the physics constraints of tunnel geology. However, because of their tailored physics formulas, physics-informed machine learning models are complex and difficult to generalize. Wang et al. (2023) emphasized that TBM energy consumption is the primary factor influencing excavation efficiency. They argued that evaluating TBM efficiency solely based on operator experience lacks reliability in optimizing excavation effectiveness. To address this concern, the authors devised a hybrid algorithm named QPSO-ILF-ANN, which combines quantum particle swarm optimization with an enhanced loss function built on a neural network. This model successfully predicted that increased penetration and rotational speeds increase excavation speed while concurrently reducing energy usage. However, its complexity arises from the requirement for quantum knowledge and its reliance on traditional hyperparameter tuning methods. In summary, previous researchers did not consider optimizing the operator’s decision-making. Additionally, determining and forecasting the SE is notably more intricate in non-hard-rock conditions due to the infrequent use of soil mechanics parameters and their inherent uncertainty, as noted by Yu and Mooney (2023). Previous studies thus reveal two significant gaps in SE prediction research. First, optimizing TBM operation through intelligent system feedback to the operator has not been addressed, which underscores the significance of human–machine interactions. Second, a new equation is needed to calculate the specific energy, considering the TBM operational parameters, soil-machine interactions, and material removal flow rate instead of the penetration rate-guided approach.

This paper evaluates TBM performance from a different point of view. It diverges from previous studies in both its data and its methodology. The data-driven approach uses operator decisions (monitoring data) to determine how human–machine interactions affect the energy consumption of a TBM. An operator-based dataset was used to develop the model instead of focusing on limited ground-machine interactions. The method provides new insight into machine operation from the human side because the AI model can assist the operator, who can then concentrate on the parameters that are crucial for energy consumption rather than managing the entire control panel. The methodological novelty lies in a new specific energy formula, tailored for micro slurry TBMs (MSTBMs) in soft ground, that incorporates TBM operational parameters, the soil-machine friction coefficient, and the sludge removal flow rate. Integrating automatic hyperparameter tuning reduces the high computation time of model optimization. Finally, incorporating SHAP with the explainable neural network (xANN) enables the identification of critical parameters influencing energy consumption.

Data

Project description

The tunneling project is a highway bypass tunnel in Japan. The main tunnel is excavated with a hydraulic excavator. The tunnel entrance is in soft ground; therefore, umbrella pipe supports are installed to increase the stability of the tunnel roof. Figure 1 illustrates the umbrella pipe support excavation with the MSTBM.

Fig. 1

The diagram represents the umbrella support design used in excavation, where the round shapes indicate the pipe holes, and the cross-sectional view of the pipes shows the holes excavated by the MSTBM

The pipe geology consists of sandy clay, sand, clay (brown), clay (blue), and sandy clay. Owing to the similarities between soil samples, they are grouped as TUC, TUS, and TC based on the site investigation report. Table 1 shows the mechanical properties of the soft ground.

Table 1 Mechanical properties of pipe geology

Micro slurry TBM machine specifications

The micro tunnelling method was used to excavate umbrella pipe supports for excavating highway bypass tunnels in Japan. Figure 2 briefly demonstrates the working principle of the micro slurry TBM, and Table 2 provides the specifications of the MSTBM.

Fig. 2

The working principle of the MSTBM involves excavating a tunnel and transporting the excavated material through pipes. This material is treated for soil conditioning and to maintain pressure at the cutter face for added stability

Table 2 Micro slurry TBM specifications

Data preprocessing

In this research, an operator-based dataset was used to predict the specific energy consumption of the MSTBM. The dataset was used to identify how operator decisions affect the specific energy consumption. In this section, the operational parameters of the MSTBM are investigated, and the input parameters for the model are selected using a correlation matrix. The dataset contains 27 operational parameters. Figure 3 presents their frequency distributions and values.

Fig. 3

A frequency histogram illustrating the operational parameters of the MSTBM. On this histogram, the y-axis indicates the frequency, while the x-axis represents the various values of the parameters

Because the parameters differ in distribution and magnitude (Fig. 3), min–max normalization was applied to rescale the data to the range [0, 1] and reduce the influence of magnitude on the variance (Huang et al. 2022). Equation 1 expresses the min–max normalization formula (Munkhdalai et al. 2019).

$$x_\text{scaled}=\frac{x-x_\text{min}}{x_\text{max}-x_\text{min}}$$
(1)

where \(x_\text{scaled}\) is the normalized data, \(x\) is the raw data, and \(x_\text{max}\) and \(x_\text{min}\) are the maximum and minimum values of the dataset, respectively.
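As an illustration of Eq. 1, a minimal Python sketch is given below. The function name and the sample values are hypothetical, and the handling of a constant column (returning zeros) is an assumption, since Eq. 1 is undefined when the maximum equals the minimum.

```python
def min_max_scale(values):
    """Rescale a list of numbers to [0, 1] following Eq. 1."""
    x_min, x_max = min(values), max(values)
    if x_max == x_min:
        # Assumption: a constant column carries no information, map it to 0.
        return [0.0 for _ in values]
    return [(x - x_min) / (x_max - x_min) for x in values]


# Hypothetical cutter torque readings (kN.m), for illustration only.
torque = [10.0, 20.0, 30.0]
scaled_torque = min_max_scale(torque)
print(scaled_torque)
```

In practice, the minimum and maximum would usually be computed on the training set only and reused for the test and unseen sets, so that no information from those sets leaks into training.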

Figure 4 shows the correlation matrix of the operational parameters used to determine the correlations among the variables. According to Fig. 4, 11 operational parameters were selected from the entire dataset due to their reasonable correlations with the specific energy.

Fig. 4

Correlations of MSTBM variables with each other. According to the matrix, the correlation increases when the color changes to red, but the correlation decreases when the color changes to blue. The last row of the correlation matrix shows specific energy correlations with variables

Based on the correlation matrix in Fig. 4, the following operational parameters were selected as input variables for the xNN model: cutter torque (kN∙m), cutter current (ampere; A), cutting-edge water pressure (MPa), earth pressure (kN/m2), jack stroke (mm), original pressing force (MPa), thrust propulsion force (kN), slurry density (t/m3), sludge density (t/m3), slurry dry solids flow rate (l/min), and sludge dry solids flow rate (l/min). The selected parameters have a positive and reasonable correlation with the specific energy.
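The correlation-based selection can be sketched as follows. This is a minimal illustration, assuming the Pearson coefficient is used and that a parameter is kept when its correlation with the specific energy exceeds a threshold. The threshold of 0.3, the function names, and the sample data are hypothetical and are not taken from the study.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    std_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    std_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    if std_x == 0 or std_y == 0:
        return 0.0  # a constant column has no defined correlation
    return cov / (std_x * std_y)


def select_features(columns, target, threshold=0.3):
    """Keep the columns whose correlation with the target is >= threshold.

    The threshold is a hypothetical value chosen for this sketch.
    """
    correlations = {name: pearson(vals, target) for name, vals in columns.items()}
    return {name: r for name, r in correlations.items() if r >= threshold}


# Hypothetical toy data, for illustration only.
specific_energy = [1.0, 2.0, 3.0, 4.0, 5.0]
columns = {
    "cutter_torque": [2.0, 4.0, 6.0, 8.0, 10.0],
    "unrelated_parameter": [5.0, 1.0, 4.0, 2.0, 3.0],
}
selected = select_features(columns, specific_energy)
print(sorted(selected))
```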

Methodology

After data preprocessing, the dataset consisted of 11 selected features for building the xANN model. The target variable, specific energy, was derived from the MSTBM torque power consumption during cutter rotation, taking into account the soil-machine friction coefficient and the sludge removal flow rate. As elaborated in the introduction, this derivation was necessary because existing specific energy formulas are primarily applicable to laboratory-scale tests and large hard rock TBMs. The workflow is as follows. First, operator-monitored data are collected and subjected to exploratory data analysis to understand the data characteristics, with a particular focus on the frequency distributions. This analysis, augmented by a correlation matrix, informs feature selection, ensuring that only the most relevant features are chosen, thereby enhancing model performance and reducing unnecessary complexity. The data are then split into distinct sets: 10% is set aside as unseen data for final validation, and the remainder is divided into training (80%) and test (20%) sets. Before training, min–max normalization rescales the features, promoting faster convergence during the learning phase and preventing any single feature from dominating the model’s attention due to scale differences. The training process is optimized through iterative hyperparameter tuning using Optuna, striking a balance between performance and computational efficiency. After training, the model is evaluated for accuracy using the test set. SHAP analysis provides insights into feature importance, contributing to the model’s explainability and identifying opportunities to refine the model further. Finally, validation against unseen data checks that the model’s predictive power holds up against new, real-world data, supporting its robustness and generalization capability. This methodical process not only bolsters model transparency but also seeks to improve the overall quality of predictions, so that the model is high-performing as well as understandable and trustworthy.
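The data split can be sketched as below. This is a minimal illustration under two assumptions that the text does not state explicitly: the 10% unseen set is held out first, with the remainder divided 80/20 into training and test sets, and the samples are shuffled before splitting. The function name and the sample count are hypothetical.

```python
import random


def split_indices(n_samples, unseen_frac=0.10, test_frac=0.20, seed=42):
    """Return (train, test, unseen) index lists with no overlap.

    Assumption: the unseen fraction is taken from the full dataset first,
    and the test fraction is then taken from what remains.
    """
    rng = random.Random(seed)
    indices = list(range(n_samples))
    rng.shuffle(indices)  # assumption: random (not chronological) split

    n_unseen = round(n_samples * unseen_frac)
    unseen, rest = indices[:n_unseen], indices[n_unseen:]

    n_test = round(len(rest) * test_frac)
    test, train = rest[:n_test], rest[n_test:]
    return train, test, unseen


# Hypothetical dataset size, for illustration only.
train_idx, test_idx, unseen_idx = split_indices(1000)
print(len(train_idx), len(test_idx), len(unseen_idx))
```

For data logged along the excavation distance, a contiguous hold-out segment would be an alternative design choice, since it mimics predicting a stretch of ground the model has not yet seen.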

The detailed process of deriving this specific energy equation is described in the “Derivation of the micro slurry TBM-specific energy formula” section. The specific energy calculated with this equation is the output (target) of the xNN model. The integration of Optuna into the ANN model is described in the “Optuna automatic hyperparameter tuning” section. The “Explainable neural network” section describes the explainable neural network fine-tuned using Optuna. The “Shapley additive explanation (SHAP)” section describes how SHAP was incorporated into the model to address the black box concern inherent in neural networks, thereby ensuring transparency and interpretability of the model’s outcomes.

Finally, the “Evaluation metrics” section outlines the assessment metrics, including R2, MSE, and MAE. Figure 5 briefly summarizes the methodology and model description.

Fig. 5

Structure of the explainable neural network model built on the operator-monitored database to predict the specific energy consumption of the MSTBM as a function of the material removal flow rate. SHAP is integrated into the model to present the feature importance for the specific energy and provide insight to the operator. The Optuna hyperparameter framework runs several trials to find the best hyperparameters for the model. The best trial hyperparameters are used to retrain the model

Derivation of the micro slurry TBM-specific energy formula

Teale (1965) introduced the first concept of specific energy in the drilling process. Equation 2 shows this formulation, where E is the specific energy (kJ/m3 for the units given here), F is the thrust on the bit (kN), A is the hole section area (m2), N is the rotation speed (rps), T is the rotation torque (kN∙m), and V is the rate of penetration (m/s).

$$E=\frac{F}{A}+\frac{2\pi NT}{AV}$$
(2)
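A minimal sketch of Eq. 2 is given below to make the two contributions (thrust and rotation) explicit. The function name and the input values are hypothetical. With the thrust in kN and the torque in kN∙m, both terms come out in kN/m2, which is numerically equal to kJ/m3.

```python
import math


def teale_specific_energy(thrust_kn, area_m2, speed_rps, torque_knm, rop_ms):
    """Specific energy after Teale (1965), Eq. 2.

    Returns (thrust_term, rotary_term, total) in kJ/m3 for inputs in
    kN, m2, rev/s, kN.m, and m/s.
    """
    if area_m2 <= 0 or rop_ms <= 0:
        raise ValueError("area and rate of penetration must be positive")
    thrust_term = thrust_kn / area_m2
    rotary_term = 2.0 * math.pi * speed_rps * torque_knm / (area_m2 * rop_ms)
    return thrust_term, rotary_term, thrust_term + rotary_term


# Hypothetical drilling values, for illustration only.
thrust_part, rotary_part, total = teale_specific_energy(
    thrust_kn=100.0, area_m2=0.5, speed_rps=0.1, torque_knm=10.0, rop_ms=0.001
)
print(thrust_part, rotary_part, total)
```

In this hypothetical case the rotary term is much larger than the thrust term, which is one reason torque-based formulations are a natural starting point for a machine-specific equation.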

Celada et al. (2009) synthesized earlier specific energy equations. They emphasized that geomechanical parameters, namely, UCS, Young’s modulus, and the rock mass rating (RMR), and TBM-related factors, such as thrust, torque, rotation speed, and drilling fluid pressure, were predominantly employed in deriving specific energy formulas. Notably, this highlights the potential utilization of TBM operational parameters in formulating a specific energy equation tailored to micro slurry TBM excavation. Equation 3 shows the proposed specific energy equation.

$$\mathrm{SE}=\frac{P_{\mathrm{cutting\;net}}\,\left(\mathrm{kW}\right)}{\mathrm{Sludge\;removal\;flow\;rate}\,\left(\frac{\mathrm m^3}{\mathrm h}\right)}$$
(3)

The MSTBM-based specific energy formula was inspired by the net power requirement of the TBM for cutting the ground given by Bilgin et al. (2013). The net power requirement formula was extended by considering the frictional coefficient of the pipe-soil interaction and the sludge removal flow rate so that the slurry TBM energy consumption is calculated as a function of the flow rate. Bilgin et al. (2013) stated that the energy factor of the cutterhead motors, \(\eta\), is 0.75 and that the frictional coefficient between clay and the shield is 0.20.

The net cutting power \(P_{\mathrm{cutting\;net}}\) (kW) is given in Eq. 4.

$$P_{\mathrm{cutting\;net}}=\left(\frac{2\,\pi\,.T\,.\,\mathrm{RPM}}{60}\right)\,.\,\eta$$
(4)

Equation 5 extends the net power requirement by including the frictional coefficient (\(\mu\)) of the soil-machine interface during cohesive soil excavation. \(\mu\) was assumed to be 0.20 for clay.

$$P_{\mathrm{cutting\;net}}=\left(\frac{2\pi\,.\,T\,.\,\mathrm{RPM}}{60}\right)\,.\,\eta.(1+\mu)$$
(5)

Equation 6 combines \(P_{\mathrm{cutting\;net}}\) with the sludge removal flow rate (m3/h).

$$\mathrm{SE}=\frac{\left(\frac{2\pi\,.\,T\,.\,\mathrm{RPM}}{60}\right)\,.\eta\,.\,(1+\mu)}{f\left(\frac{\mathrm m^3}{\mathrm h}\right)}$$
(6)

where SE is the specific energy (kWh/m3), T is the cutter torque (kN∙m), RPM is the cutter rotation speed (revolutions per minute), \(\eta\) (0.75) is the mechanical efficiency of the TBM, \(\mu\) (0.20) is the friction coefficient between the pipe and clay, and f is the sludge removal flow rate (m3/h).
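Equations 4 to 6 can be sketched in a few lines of Python. The function names and the input values below are hypothetical, and the defaults for \(\eta\) and \(\mu\) are the values quoted above. The guard against a zero flow rate is an assumption added for this sketch, since Eq. 6 is undefined when no sludge is removed.

```python
import math


def cutting_power_kw(torque_knm, rpm, eta=0.75, mu=0.20):
    """Net cutting power with the friction term, Eq. 5 (kW)."""
    # 2*pi*T*RPM/60 converts torque (kN.m) and speed (rev/min) to kW.
    return (2.0 * math.pi * torque_knm * rpm / 60.0) * eta * (1.0 + mu)


def mstbm_specific_energy(torque_knm, rpm, flow_m3h, eta=0.75, mu=0.20):
    """Specific energy of the MSTBM, Eq. 6 (kWh/m3)."""
    if flow_m3h <= 0:
        # Assumption: records without sludge removal are rejected.
        raise ValueError("sludge removal flow rate must be positive")
    return cutting_power_kw(torque_knm, rpm, eta, mu) / flow_m3h


# Hypothetical operating point, for illustration only.
power = cutting_power_kw(torque_knm=10.0, rpm=6.0)
se = mstbm_specific_energy(torque_knm=10.0, rpm=6.0, flow_m3h=3.0)
print(round(power, 3), round(se, 3))
```

Because kW divided by m3/h gives kWh/m3, no further unit conversion is needed between Eq. 5 and Eq. 6.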

Optuna automatic hyperparameter tuning

Optuna is a next-generation hyperparameter optimization framework, as described by Akiba et al. (2019). Artificial intelligence models have several hyperparameters that are vital for model performance. However, finding the best hyperparameters manually is difficult because it is computationally expensive. Therefore, the hyperparameters of the proposed xNN model were determined using Optuna with 100 trials. The trials searched over the number of neurons, the number of layers, the learning rate, and the weight decay of the xNN model.

Optuna iteratively selects different sets of hyperparameters \(\theta\) to train the ANN model and evaluates its performance using the R2 score on the test set. The aim is to find the set of hyperparameters that results in the maximum R2 score, indicating the best fit between the predicted and actual values. The optimization can be mathematically conceptualized as navigating the hyperparameter space to find the optimal point \(\theta_{\mathrm{opt}}\) that yields the maximum R2, as shown in Eq. 7.

$$\theta_{\mathrm{opt}}=\underset{\theta}{\mathrm{arg\,max}}\;R^2\left(\theta\right)$$
(7)

In summary, Optuna is used to optimize the hyperparameters of an ANN regressor so as to maximize the R2 score. The process involves defining a suitable objective function, iteratively exploring the hyperparameter space, and evaluating the model’s performance based on R2.
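The search in Eq. 7 can be illustrated with a deliberately simplified sketch. Note the substitutions: plain random search stands in for Optuna’s sampler, and a one-parameter ridge regression with a single hyperparameter (weight decay) stands in for the neural network, so that the example runs with the Python standard library only. All names and data below are hypothetical.

```python
import random


def r2_score(y_true, y_pred):
    """Coefficient of determination."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((p - t) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((mean_y - t) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def fit_slope(x, y, weight_decay):
    """Closed-form slope of y ~ w*x with an L2 penalty (stand-in model)."""
    sxy = sum(a * b for a, b in zip(x, y))
    sxx = sum(a * a for a in x)
    return sxy / (sxx + weight_decay)


def random_search(objective, n_trials=100, seed=0):
    """Return (theta_opt, best_score): the arg max of the objective over trials."""
    rng = random.Random(seed)
    best_theta, best_score = None, float("-inf")
    for _ in range(n_trials):
        # Sample weight decay log-uniformly in [1e-4, 1e2] (hypothetical range).
        theta = {"weight_decay": 10.0 ** rng.uniform(-4.0, 2.0)}
        score = objective(theta)
        if score > best_score:
            best_theta, best_score = theta, score
    return best_theta, best_score


# Hypothetical toy data: y = 2x, for illustration only.
x_train = [0.1 * i for i in range(1, 11)]
y_train = [2.0 * v for v in x_train]
x_eval = [0.15, 0.35, 0.55, 0.75, 0.95]
y_eval = [2.0 * v for v in x_eval]


def objective(theta):
    w = fit_slope(x_train, y_train, theta["weight_decay"])
    return r2_score(y_eval, [w * v for v in x_eval])


theta_opt, best_r2 = random_search(objective, n_trials=100)
print(theta_opt, best_r2)
```

In Optuna itself, the same loop corresponds to creating a study with `optuna.create_study(direction="maximize")` and calling `study.optimize(objective, n_trials=100)`, with the hyperparameters drawn inside the objective through calls such as `trial.suggest_float`.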

The “Explainable neural network” section describes the proposed model structure and feature importance of the hyperparameters.

Explainable neural network

The xNN model uses the Optuna automatic hyperparameter tuning algorithm described in the “Optuna automatic hyperparameter tuning” section. The number of hidden neurons, the number of layers, the learning rate, and the weight decay were determined through 100 Optuna trials. Afterwards, the model was retrained using the best trial parameters. Optuna also reports the importance of each hyperparameter for the model and the relationships among the hyperparameters. Table 3 presents the selected hyperparameters and their values. Figure 6 illustrates the importance of the hyperparameters for the model.

Fig. 6

The outcome of tuning the optimal hyperparameters for the proposed xNN model. The learning rate is the most critical parameter for the neural network model, while weight decay is not significant

Table 3 xNN model hyperparameter and hyperparameter values

Additionally, Fig. 7 shows the Optuna decision process as a parallel coordinate plot for the proposed xANN model structure. Optuna supports Bayesian optimization, which pursues global optimization by progressively constructing a probabilistic model that maps hyperparameter values to the objective function. This model encapsulates assumptions regarding the function’s behavior, creating a posterior distribution over the objective function (Frazier 2018).

Fig. 7

High-dimensional parameter relationships in the xANN model. The parameters on the plot are as follows: lr is the learning rate, n_layers is the number of layers, and n_units_l0 to n_units_l6 are the numbers of units in layers 0 to 6. The “weight_decay” axis shows the regularization strength, and the color gradient provides an at-a-glance indication of which trials resulted in higher objective values, which can help in identifying the best set of hyperparameters for the model

Shapley additive explanation (SHAP)

SHAP is based on game theory and is used to quantify the contribution of each input feature to the model output (Wen et al. 2021; Kavzoglu and Teke 2022; Kilic et al. 2023). These feature attributions make the neural network explainable. SHAP has different explainers, such as a tree explainer for tree-based algorithms, a kernel explainer, which is model agnostic and can be applied to neural networks, and a deep explainer for deep neural network models (Lundberg and Lee 2017). In this research, the kernel explainer was integrated into the neural network model to explain the model black box. According to the shap.KernelExplainer documentation, the kernel explainer uses a specially weighted local linear regression to calculate the importance of each feature. The computed importance values are Shapley values from game theory and also coefficients from a local linear regression.
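To make the game-theoretic idea concrete, the sketch below computes exact Shapley values for a small function by enumerating all feature subsets. This is not the kernel explainer, which approximates the same quantities by weighted regression; exact enumeration is only practical for a handful of features. The toy model, feature order, and baseline are hypothetical.

```python
from itertools import combinations
from math import factorial


def shapley_values(model, x, baseline):
    """Exact Shapley values of model(x) relative to model(baseline).

    Features outside a coalition are replaced by their baseline values.
    """
    n = len(x)

    def value(coalition):
        z = [x[i] if i in coalition else baseline[i] for i in range(n)]
        return model(z)

    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Shapley weight for coalitions of this size.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                coalition = set(subset)
                phi[i] += weight * (value(coalition | {i}) - value(coalition))
    return phi


def toy_model(z):
    """Hypothetical stand-in for the trained network.

    Inputs: [cutter current, cutter torque, thrust force], all normalized.
    """
    return 0.5 * z[0] + 0.3 * z[1] + 0.2 * z[0] * z[1]


x = [1.0, 1.0, 1.0]
baseline = [0.0, 0.0, 0.0]
phi = shapley_values(toy_model, x, baseline)
print([round(v, 3) for v in phi])
```

The interaction term is shared equally between the two interacting features, the unused feature receives zero, and the values sum to the difference between the prediction and the baseline output.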

Evaluation metrics

The explainable neural network model was evaluated using R2, MSE, and MAE. Equation 8 defines R2, which measures the extent to which the model inputs account for the variability in the dependent variable (Chicco et al. 2021).

$$R^{2}=1-\frac{\sum_{i=1}^{n}\left(\widehat{y}_{i}-y_{i}\right)^{2}}{\sum_{i=1}^{n}\left(\bar{y}-y_{i}\right)^{2}}$$
(8)

where \(\widehat{y}_{i}\) is the predicted value, \(y_{i}\) is the actual value, \(\bar{y}\) is the mean of the actual values, and n is the number of samples.

The MSE is useful when large errors, such as those caused by outliers, must be detected, because the squared (L2) error gives greater weight to these data points. In other words, when the model generates a poor prediction, the error is amplified by the squaring in the function. Equation 9 gives the MSE (Chicco et al. 2021).

$$\mathrm{MSE}=\frac1m\sum_{i=1}^m\left(X_i-Y_i\right)^2$$
(9)

where \({X}_{i}\) is the predicted value, \({Y}_{i}\) is the actual value, and m is the number of samples.

The MAE is suitable in cases where outliers indicate flawed segments within the dataset. The MAE does not excessively penalize outliers during training, offering a comprehensive and constrained performance evaluation for the model. Conversely, when the test set contains numerous outliers, the model’s performance will be moderate. Equation 10 shows the formulation of the MAE.

$$\mathrm{MAE}=\frac1m\sum_{i=1}^m\left|X_i-Y_i\right|$$
(10)

where \({X}_{i}\) is the predicted value and \({Y}_{i }\) is the actual value.
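The three metrics in Eqs. 8 to 10 can be computed as in the following minimal sketch. The function names and the sample values are hypothetical and are not results from the study.

```python
def r2(actual, predicted):
    """Coefficient of determination, Eq. 8."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((p - a) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((mean_actual - a) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


def mse(actual, predicted):
    """Mean squared error, Eq. 9."""
    return sum((p - a) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def mae(actual, predicted):
    """Mean absolute error, Eq. 10."""
    return sum(abs(p - a) for a, p in zip(actual, predicted)) / len(actual)


# Hypothetical values, for illustration only.
actual = [1.0, 2.0, 3.0, 4.0]
predicted = [1.1, 1.9, 3.2, 3.8]
print(r2(actual, predicted), mse(actual, predicted), mae(actual, predicted))
```

Since the MSE is never smaller than the square of the MAE, reporting both gives a quick consistency check on the error figures.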

Results

The performance of the model was evaluated using R2, MSE, and MAE. On the test data, the model achieved an R2 of 98.7%, an MSE of 2.40, and an MAE of 0.003. Ten percent of the dataset was held out as unseen data to evaluate the model’s generalization capability. On the unseen data, the model achieved an R2 of 89%, an MAE of 0.01, and a root mean square error (RMSE) of 0.01. In addition, the model outcome was visualized by plotting the predicted values against the actual values. Figure 8 compares the model predictions with the actual values. Most data points cluster around the best-fit line, indicating that the model predicts the specific energy values accurately.

Fig. 8

Each data point (red dots) on the plot corresponds to an observation in the dataset, where the x-axis represents the actual specific energy values and the y-axis displays the predicted values

In addition, the model prediction error was assessed with a prediction error histogram. Figure 9 shows the frequencies of different error ranges, where the x-axis denotes the error magnitude and the y-axis represents the frequency and density of occurrences.

Fig. 9

The xNN model prediction error distribution. The bar charts represent the frequency of occurrences, and the navy-blue curve corresponds to density

A well-performing model would exhibit a symmetric and narrow distribution around zero error, indicating minimal prediction discrepancies. Figure 9 shows a central peak around zero error, which suggests that the model generally provides accurate predictions.

Figure 10 shows the model performance in terms of the learning curve. During the 100 training epochs, the model was trained on increasing amounts of data. The learning curve shows the training and validation errors (MSE) against the number of training iterations. According to Fig. 10, the training and validation errors gradually decrease and stabilize; thus, learning is effective, and the model generalizes well.

Fig. 10

The xNN model learning curve. The blue line represents training loss, and the orange line shows the validation curve. The MSE values decrease gradually without creating a gap between training and validation

Additionally, the xANN model was validated using an unseen dataset. The unseen data simulate a real-world scenario to test model generalizability. Figure 11 illustrates the predicted and actual results for the unseen data. The model provided robust predictions with an R2 of 89%. The observed difference between the test score (98.7%) and the unseen data score (89%) can be attributed to data drift, which occurs when the distribution of the input data changes over time. If the test data fail to represent these changes, the model’s performance on unseen data may deteriorate.

Fig. 11

The unseen dataset prediction performance for the model validation. The x-axis refers to the actual specific energy, and the y-axis corresponds to the predicted unseen specific energy

Figure 12 shows the frequency distributions of the unseen data and the actual and predicted data densities. Most unseen data reasonably match the actual and predicted data densities.

Fig. 12

The unseen data density shows the difference between the actual and predicted data points

In addition, model explanation is critical for opening the black box of the neural network. As described in the “Shapley additive explanation (SHAP)” section, SHAP was incorporated with the neural network to explain the contribution of operator decisions to the specific energy. Figures 13 and 14 show the SHAP force plot and summary plot, respectively. These plots allow us to determine the underlying reasons for the energy consumption associated with operator decisions. Figure 13 shows that the original pressing force (MPa), earth pressure (kN/m2), jack stroke (mm), cutter torque (kN.m), and cutter current (A) of the MSTBM are the most critical parameters for energy prediction. In contrast, the thrust propulsion force (kN) is less important for the specific energy.

Fig. 13

SHAP force plot showing the feature importance of the input parameters. Red indicates a high contribution, while blue indicates a low contribution. The scale from 0.575 to 0.850 corresponds to the base of the SHAP model

Fig. 14

The SHAP summary plot for the contribution of the TBM parameters

In addition to Fig. 13, Fig. 14 demonstrates the feature importance of the TBM parameters using a SHAP summary plot. The cutter torque, cutter current, and slurry density (t/m3) are the most significant parameters.

Figure 13 provides a microlevel explanation of individual predictions, whereas Fig. 14 provides a macrolevel understanding of the model. In other words, Fig. 13 illustrates the model’s decision-making process for an individual prediction.

Discussion

This research has provided significant insights into the relationship between the decisions of a TBM operator and the specific energy consumption during pipe excavation. Eleven of the 27 operational parameters were selected using Fig. 4 in the “Data preprocessing” section. However, three of the chosen parameters are the most critical for the specific energy consumption in pipe excavation, as shown in Fig. 13. The cutter current is more critical than the cutter torque and the jack speed for predicting the specific energy because the cutter current is directly related to the machine’s driving motors. Figure 13 indicates that the impact of the thrust propulsion force on the specific energy prediction is lower than that of the other operational parameters of the MSTBM. Figure 15 (a) illustrates the changes in the cutter current and specific energy with respect to the excavation distance. When the cutter current increases, the specific energy of the machine tends to increase, which suggests that more challenging cutting conditions require more power for cutting. Figure 15 (b) illustrates the relationship between the cutter torque and specific energy. A higher torque, which indicates greater resistance against the cutter head, generally leads to higher specific energy values, suggesting that more effort (and thus energy) is needed to excavate the material. Figure 15 (c) presents the machine jack speed control, its variation, and the energy consumption. Mokhtari and Mooney (2020) stated that the operator attempts to control the jack speed when the TBM increases the advance rate to maintain face stability.

Fig. 15

a Machine cutter current and specific energy relationship, b cutter torque and specific energy consumption, and c machine jack speed control and specific energy changes

The specific energy was also investigated with respect to the soil formation along the pipe excavation. Figure 16 presents the specific energy consumption for each section of the pipe geology. The energy consumption is greater in the clay zone from 60 to 83 m of pipe excavation than in the sand and sandy clay zones. The main reason is that the operator increased the torque and tended to adjust the jack speed control in cohesive soil.

Fig. 16

The specific energy consumption based on the soil type. The pipe distance from 0 to 9 m is embankment soil, 9 to 40 m is sandy clay, 40 to 60 m is sand, 60 to 63 m is clay brown, 63 to 83 m is clay blue, and 83 to 120 m is sandy clay. The clay zone from 60 to 83 m exhibited the highest energy consumption for pipe excavation

Practical applications

The insights gained from this study have practical implications for the tunnelling industry. TBM operators can leverage the relationships between the cutter torque, jack stroke, cutter current of the driving motor, and specific energy to optimize excavation processes. By carefully adjusting the cutter torque, cutter current, and jack stroke based on the specific energy requirements of different geological zones, tunnelling operations can be managed for higher efficiency and cost-effectiveness. The study also shows that the specific energy consumption depends on the geology, which underlines the importance of selecting a TBM suited to the geological setting and of monitoring the TBM parameters to reduce energy consumption. Figure 17 illustrates the practical application of the xNN model.

Fig. 17

Flowchart of the practical application of the xNN model. Real-time data can be transferred to the pretrained model, and feature importance is then computed via SHAP. The operator can identify the parameters that drive energy consumption and then take action to reduce it while monitoring the TBM parameters
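The last step of the flow in Fig. 17, turning per-feature attributions into guidance for the operator, can be sketched as follows. This is an illustrative assumption about how such feedback might be presented, not the implementation used in the study: the function names, the parameter names, and the attribution values are hypothetical, and the attributions are assumed to have already been computed by a SHAP explainer for one real-time reading.

```python
def rank_drivers(attributions, top_k=3):
    """Sort features by absolute attribution and keep the top_k largest."""
    ranked = sorted(attributions.items(), key=lambda item: abs(item[1]), reverse=True)
    return ranked[:top_k]


def operator_message(attributions, top_k=3):
    """Build short advisory lines for the parameters with the largest effect."""
    lines = []
    for name, value in rank_drivers(attributions, top_k):
        direction = "raises" if value > 0 else "lowers"
        lines.append(f"{name} {direction} predicted specific energy by {abs(value):.2f}")
    return lines


# Hypothetical SHAP values for one reading, in the units of the model output.
reading_attributions = {
    "cutter_current": 0.42,
    "cutter_torque": 0.31,
    "jack_speed": -0.18,
    "thrust_force": 0.05,
}
advice = operator_message(reading_attributions)
```

Ranking by absolute value keeps parameters that lower the prediction in view as well, since both directions are relevant when the operator decides which setting to adjust.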

Despite its strengths, the proposed model has several limitations. They are summarized as follows:

  (1) The analysis focused on a specific set of micro slurry TBM parameters, but geomechanical parameters, cutter head design, and cutter tool types were not considered in deriving the specific energy formula. These factors may influence the specific energy; therefore, further research could explore the impact of these additional features on the specific energy.

  (2) The proposed model was not tested on a different case because of limited data availability. Therefore, the model should be applied to other scenarios to evaluate its performance.

  (3) The kernel explainer computation time is high, making it a bottleneck for explainable artificial intelligence models.

  (4) The model was constructed using operational parameters from micro tunnelling TBMs, so applying it to operational parameters from larger TBMs or from machines designed for hard rock conditions would likely yield different outcomes.
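The cost noted in limitation (3) follows from the definition of Shapley values, which average a feature's marginal contribution over every coalition of the remaining features. The sketch below computes exact Shapley values by enumeration for a small hypothetical linear model standing in for the trained network. It is not the kernel explainer itself, which approximates these values from sampled coalitions, and the baseline-replacement scheme and all names are assumptions for illustration. With n features there are 2^n distinct coalitions, so 11 parameters give 2048 coalitions to evaluate for each explained prediction.

```python
from itertools import combinations
from math import factorial


def exact_shapley(model, x, baseline):
    """Exact Shapley values for one input.

    Features outside a coalition are replaced by their baseline value.
    Returns the attributions and the number of distinct coalitions evaluated.
    """
    n = len(x)
    cache = {}

    def value(coalition):
        key = frozenset(coalition)
        if key not in cache:
            mixed = [x[i] if i in key else baseline[i] for i in range(n)]
            cache[key] = model(mixed)
        return cache[key]

    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Shapley weight for coalitions of this size that exclude feature i.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                members = set(subset)
                phi[i] += weight * (value(members | {i}) - value(members))
    return phi, len(cache)


# Hypothetical linear stand-in for the trained network.
def toy_model(features):
    return 2.0 * features[0] + 1.0 * features[1] - 0.5 * features[2]


phi, coalitions = exact_shapley(toy_model, x=[1.0, 2.0, 4.0], baseline=[0.0, 0.0, 0.0])
coalitions_for_11_features = 2 ** 11
```

For a linear model each attribution reduces to the coefficient times the difference between the input and the baseline, which gives a quick check on the enumeration.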

Conclusions

This research explored the use of explainable neural networks (xNNs) for predicting the specific energy consumption of mechanized soft ground tunnel boring machines (MSTBMs) under soft ground conditions. The xNN model, which uses 11 selected operational parameters, provided new insights into the prediction process and its key influencing factors. A distinctive aspect of this study is its focus on soft ground tunnelling, in contrast to previous research that predominantly concentrated on hard rock mechanized tunnelling. This shift required a different approach to feature extraction, using correlation analysis to identify relevant parameters for the xNN model. Moreover, the research introduced a specific energy formula designed explicitly for soft ground MSTBM operations, addressing the inadequacies of existing models. A further advancement of this study is the incorporation of Optuna, which automates hyperparameter tuning, optimizes the model’s performance, and reduces manual intervention. The SHAP technique further enhances the model by identifying the features that most strongly affect specific energy consumption, providing actionable insights for operators. One of the most noteworthy outcomes of this research is the model’s ability to provide feedback to the operator, enabling them to optimize machine parameters and adjust energy consumption in real time. This feature is particularly beneficial for managing the energy demands of various soil types, especially in clay zones where the energy consumption is notably high due to operational adjustments. Overall, the xNN-based approach paves the way for more efficient, adaptive, and sustainable tunnelling practices by enabling real-time feedback and parameter optimization.