Introduction

Background

Lost circulation is a costly and serious issue at any stage of the drilling process. According to the United States Department of Energy in 2010, mud losses account for 10 to 20% of the cost of drilling under extreme pressure and temperature conditions (Alkinani et al. 2020a, b). Lost circulation also accounts for about 12% of all non-productive time (NPT) in the oil and gas industry, making it a significant contributor to NPT. According to Arshad et al. (2015), the industry spends about 2 billion dollars a year to reduce circulation loss. According to J. Yang et al. (2022a, b), the China National Petroleum Corporation (CNPC) loses about 4000 days annually as a result of lost circulation. Lost circulation problems can develop when the applied pressure exceeds the formation breakdown pressure, and they can occur in a variety of formation types, including extremely permeable and unconsolidated formations (Elmousalami et al. 2021; Fidan et al. 2004). Another issue is the need to abandon some incomplete wells in depleted formations because complete losses occur frequently. Due to frequent complete losses, especially in formations above the cap rock, some wells in Middle Eastern oilfields have been suspended for years (Moazzeni et al. 2012; Shadravan et al. 2009).

Adding lost circulation material (LCM) to the drilling fluid, lowering mud weight, and zonal cementing are common remedies for lost circulation. Interdisciplinary approaches incorporating geomechanics analysis, optimized well trajectories, specialized drilling mud additives, and efficient drilling hydraulics control are crucial for preventing and reducing such losses (Lavrov 2016; Whitfill and Hemphill 2003). Based on the extent of the loss rate, lost circulation during drilling is divided into seepage, moderate, and severe or total losses (CP 2014). Bridging materials must be added to control significant losses that reach up to 470 barrels or last longer than 48 h, whereas minor losses can be handled within 48 h by modifying mud viscosity or adding blocking materials (Moazzeni et al. 2011). Pore-fluid pressure and fracture gradients establish the mud-gradient window, a critical range of mud-pressure gradients for safe drilling. This range may be constrained in difficult drilling situations, such as deep-water or deviated wells. Different kinds of LCMs are added to drilling fluids to seal induced or natural fractures and maintain wellbore integrity in order to strengthen the wellbore and reduce circulation losses (Mehrabian et al. 2015).

Lost circulation is typically attributed to four formation types: vugular or cavernous formations, extremely permeable formations, unconsolidated formations, and naturally or artificially induced fractures (Pilehvari & Nyshadham 2002). Both formation characteristics (such as pressure and gradient) and operational parameters (such as pump pressure and flow rate) have an impact on lost circulation. Wellbore geometry, drilling mud characteristics, and hole-cleaning effectiveness are further considerations (Burgoyne et al. 1991; Lavrov 2016).

Problems and gaps

According to Deng et al. (2023), scholars from China and other countries have proposed numerous intelligent strategies for forecasting lost circulation incidents in recent years, owing to the ongoing development of intelligent algorithms. These techniques predict lost circulation accidents more precisely and more scientifically. While there is valuable research on mitigating lost circulation during drilling, the focus on prediction prior to drilling remains relatively limited. Several issues persist, including insufficient data utilization, limited applicability of models to specific areas, and poorly explained methodologies in certain studies (Al-Hameedi et al. 2018a, b; Alkinani et al. 2019; Li et al. 2018). Furthermore, most articles address a specific region (Sabah et al. 2019; Alkinani et al. 2020a, b; Hou et al. 2020; Aljubran et al. 2021; Mardanirad et al. 2021; Wood et al. 2022; Olukoga and Feng 2022; Sabah et al. 2021; Toreifi and Rostami 2014; Otchere et al. 2022; Salih and Abdul Hussein 2023; Yang et al. 2022a, b; Jafarizadeh et al. 2022; Tootkaboni and Ibrahim 2021; Su et al. 2021), unlike some articles, such as Alkinani et al. (2020a, b), that could be applied globally. Table 1 compares the proposed model with existing models.

Table 1 A comparison between this proposed model and existing past models

Expected objectives and contribution

This research aims to develop an explainable integrated system, called the Automated Lost Circulation Severity Classification and Mitigation System (ALCSCMS), that can accurately predict Lost Circulation Severity (LCS) at the early stage of drilling operations based only on conceptual data of the operations. The objective of this study is to create and compare optimized machine learning algorithms for lost circulation prediction prior to drilling using key drilling data comprising 65,377 records from 20 wells drilled into the giant Azadegan oil field in Iran. This study's insights will assist the drilling staff in planning solutions before hitting the loss zone and in proactively modifying the primary drilling parameters to prevent or at least lessen losses. Moreover, a mitigation optimization model based on a genetic algorithm has been incorporated to convert highly severe lost circulation classes into acceptable classes.

Literature review

An artificial neural network-based technique for calculating lost circulation during underbalanced drilling was proposed by Behnoud and Hosseini (2017). The study emphasizes the significance of quantifying and decreasing lost circulation in order to solve operational issues. However, the features are insufficient, only one data source is used, and the study applies to underbalanced drilling techniques, which are difficult to adopt in several countries due to financial and technological limitations (Elmousalami et al. 2024; Elmousalami and Sakr 2024). In drilling operations, the use of artificial intelligence (AI) techniques, such as fuzzy logic (FL), functional networks (FN), and artificial neural networks (ANN), has shown promise in locating lost circulation zones. The ANN model with five neurons in a single hidden layer demonstrated exceptional accuracy in forecasting lost circulation zones, as evidenced by its correlation coefficient (R) of 0.987 and root mean square error (RMSE) of 0.081. The ANN model accurately predicted 99.1% of the lost circulation zones when the models were trained and tested using real-time surface drilling parameters from multiple wells.

Manshad et al. (2017) highlighted the significance of mud rheology by using support vector machines (SVM) and ANN to predict loss of circulation in the fractured Maroun oilfield. Improved classification accuracy was attained, and the research offered qualitative findings on lost circulation volume. The authors used both operational and geological data; however, the methodology involved the SVM model only. Al-Hameedi et al. (2017b) developed a model that emphasizes the importance of equivalent circulation density (ECD), mud weight (MW), and yield point (Yp) in order to estimate mud loss in the Shuaiba formation in the South Rumaila Field in Iraq. Al-Hameedi et al. (2017a) investigated mud loss in the Dammam Formation and found that MW, ECD, and rate of penetration (ROP) all had significant effects. Both studies provide models for mud loss estimation and reduction during drilling operations.

Based on variables such as depth, lithology, drilling mud properties, and operational variables, Al-Hameedi et al. (2017) created mathematical models for mud loss. The study showed that their volume loss model successfully forecast circulation loss events in the Hartha formation in the South Rumaila Field in Iraq, while highlighting the major influence of ECD, MW, and Yp on mud loss. Agin et al. (2020) predicted lost circulation in oil well drilling using an adaptive neuro-fuzzy inference system (ANFIS), design of experiments (DOE), and data mining. In comparison to data mining, ANFIS demonstrated greater prediction ability, highlighting the significance of proactive circulation loss forecasting: proactive forecasting techniques are preferable to reactive approaches to problem-solving. Although this study used a large dataset, three models, and 17 variables, it predicts the severity of lost circulation only once it happens. Therefore, there is a need to predict this problem before it happens and provide a solution for it.

Abbas et al. (2019a, b) proposed a new model for predicting circulation loss in an Iraqi oilfield using support vector machines (SVMs) and artificial neural networks (ANNs). SVMs show more promising results, but caution is advised when applying the models to data beyond the training datasets. Abbas et al. (2019b) analyzed drilling data from Southern Iraq, including 795 datasets from 385 wells. Their models accurately identified solutions for lost circulation (ANN: R2 of 0.88 for training and 0.84 for testing; SVM: R2 of 0.97 for training and 0.95 for testing), emphasizing the importance of ML in enabling intelligent decisions by drilling engineers. Abbas et al. (2019a, b) also used drilling data from southern Iraq to create an accurate model for predicting circulation loss during drilling. The model, based on an artificial neural network (ANN) approach, considered operational and geological variables, and the results showed high accuracy. Using information from more than 1500 wells worldwide, Alkinani et al. (2019) created an artificial neural network (ANN) model to forecast mud losses in fractured formations before drilling. Based on drilling parameters, the model estimated lost circulation appropriately. Although this is a considerable achievement, a single artificial intelligence technique is insufficient. The problem of forecasting lost circulation risk in an Iraqi carbonate reservoir using seismic data is examined by Geng et al. (2019).

The study established a correlation between seismic characteristics and lost circulation events and developed a prediction model. The problem is that dealing with this type of dataset is very difficult and requires a lot of work. Shi et al. (2019) explored drilling technologies and proposed a machine learning approach using historical and recent data to train SVM and random forest models for precise influx and loss prediction. The study also utilizes the classification and regression tree (CART) approach to create a decision tree for classification and regression. The classification results indicate whether no accident, an influx, or a loss occurs, but there is no estimation of the severity.

Other studies combined evolutionary algorithms with machine learning models to predict lost circulation in the Marun oil field; these hybrid intelligence models outperform standalone ML methods, highlighting the effectiveness of this integrated approach in accurately anticipating circulation loss. Related work developed prediction models for lost circulation in the Marun oil field using techniques such as genetic algorithm-multi-layer perceptron (GA-MLP), decision trees (DT), ANFIS, and ANN, and demonstrated high accuracy. Alkinani et al. (2020a, b) investigated machine learning methods for lost circulation treatments. The study found that quadratic SVM exhibited the best accuracy among the tested models. Hou et al. (2020) examined circulation loss incidents in the high-temperature and high-pressure Yingqiong Basin. They constructed an ANN prediction model with strong performance, offering valuable insights and accurate risk assessment for circulation loss in drilling operations. A deep learning model was created by Aljubran et al. (2021) to recognize loss circulation incidents (LCIs) during drilling operations. On test data, their one-dimensional convolutional neural network models demonstrated good levels of precision, recall, and F1 scores.

Alsaihati et al. (2021) used SVMs, random forests, and K-NN to detect lost circulation events during drilling based on surface characteristics. The K-NN classifier achieved the highest F1-score, followed closely by random forests. Alsaihati et al. (2022) developed predictive models using surface characteristics and active pit volume interpretation to estimate the loss of circulation rate (LCR) during drilling. Both studies have drawbacks, such as a small dataset, the use of drilling surface parameters only, and a lack of generalization. Magzoub et al. (2021) developed an ML approach to select drilling fluid compositions for preventing circulation loss. Gradient boosting exhibited the highest accuracy, up to 91%, in predicting rheological features. The issue here is the limited dataset, and lost circulation prediction is indirect, via the attainment of appropriate rheological characteristics. Mardanirad et al. (2021) developed deep learning (DL) models to forecast fluid loss classes using a dataset from 20 wells in Iran's Azadegan petroleum sector. The convolutional neural network (CNN) model demonstrated the highest accuracy rate (98%) compared to LSTM and GRU models.

Wood et al. (2022) compared ML and DL methods for predicting drilling fluid loss classes using data from Mardanirad et al. (2021). The random forest (RF) model outperformed the other models, achieving the highest overall performance. AdaBoost (ADA) and decision tree (DT) models performed well for specific fluid-loss classes. DL models showed lower accuracy and longer execution times; compared to the ML models, they failed to predict classes 4 and 5 and required much longer execution times. Olukoga and Feng (2022) employed ML methods to classify drilling mud circulation loss occurrences based on data provided by Mardanirad et al. (2021). The RF ensemble achieved a flawless F1 score of 1, while the CART model demonstrated superior performance with a high weighted F1 score of 0.9904.

Using the datasets supplied by Sabah et al. (2021) and Toreifi and Rostami (2014), Otchere et al. (2022) constructed a model employing seven ML algorithms to estimate the loss of circulation rate (LCR) in the Marun oil field. The Extra Trees (ET) regressor model produced the best results with low error metrics. Using data from 75 oil wells in the Rumaila oilfield, Salih and Abdul Hussein (2023) used three machine learning models to forecast lost circulation: DT, RF, and extra trees. The extra trees model showed the highest prediction accuracy. Yang et al. (2022a, b) developed an ANN model for forecasting fracture width in fractured rocks using data from oil and gas drilling operations. The model had strong R2 values, a low root-mean-square error (RMSE), and good prediction accuracy. Although the idea of this article is very good, using an artificial neural network model alone is not enough and lacks good generalization.

Pang et al. (2022) created a method to estimate lost circulation in the Mishrif reservoir in the Middle East Gulf. The mixture density network, feature selection, and real-time evaluation are all part of their methodology. Although the method of lost circulation prediction is new, interpreting the relationship between mud-logging parameters and the mud loss rate requires considerable experience in well log interpretation. Data from the Marun oilfield in Iran were used by Jafarizadeh et al. (2022) to create a forecasting model for the mud circulation loss rate. Prior to implementing least squares support vector machine (LSSVM), CNN, and hybrid modelling methods, they used noise attenuation and feature selection strategies. Tootkaboni and Ibrahim (2021) created models for forecasting circulation loss using a vast database from the Maroon oilfield. The clean and prepared dataset with eighteen parameters highlighted the validity and accuracy of the suggested model in predicting circulation loss. In order to create a real-time model for identifying leakage layer locations during drilling operations, Su et al. (2021) made use of a large dataset. The suggested strategy displayed consistent prediction results and good forecasting accuracy for leakage layers. However, the model suffers from a lack of generalization, even within the same study area.

In order to forecast API and HPHT filtrate loss parameters in drilling operations, Gul and Oort (2020) employed three field datasets. To estimate the parameters with high accuracy, machine learning and deep learning models such as MLP, RF, XGB, SVM, and multilinear regression were used. However, the study did not explicitly consider formation properties such as porosity and permeability. Al-Hameedi et al. (2019) built predictive models for circulation loss amount, ECD, and ROP using data from more than 500 wells in the Rumaila field, Iraq. The models' accuracy in predicting circulation loss in fractured formations was confirmed using additional data from 30 wells, revealing strong agreement with the measured data. For the purpose of forecasting the rheology and filtration properties of drilling mud containing silica oxide (SiO2) nanoparticles, Ning et al. (2023) created machine learning models. Parizad et al. (2018) provided inputs for their work, and Mahmoud et al. (2016, 2018) and Vryzas et al. (2015) provided experimental data. The models demonstrated good prediction accuracy for both shear stress and filtration volume, with LSSVM outperforming ANN for shear stress. Formation characteristics were not directly considered in the study.

ALCSCMS system methodology

Data collection on Lost Circulation Severity (LCS) is the first step in the conceptual framework, which contains six basic steps, as shown in Fig. 1. The second phase involves correcting missing values, outliers, and inconsistencies to prepare the acquired data for computer processing. In the third stage, the most important parameters for the suggested model are ranked and selected through factor analysis in feature engineering. The fourth phase, which follows the selection of the important features, uses single ML algorithms (such as MLR and SVM) and EML techniques (such as RF and GBM) to forecast LCS. In the fifth step, all utilized machine learning algorithms are assessed and ranked, with the objective of selecting the algorithm that demonstrates the highest performance according to the relevant evaluation metrics. In the sixth and final step, a thorough evaluation of the chosen algorithm's performance is conducted, accompanied by a comprehensive discussion that offers insights and analysis. In addition, an LCS mitigation optimization model based on a genetic algorithm has been incorporated to convert highly severe lost circulation classes into acceptable classes.

Fig. 1
figure 1

The proposed methodology for ALCSCMS system

Data engineering

The effectiveness of the proposed model is influenced by the volume and quality of the data gathered, as referenced by Elmousalami (2020a, b) and Bode (2000). To achieve optimal performance, machine learning models require a considerable volume of data. In constructing the model, a total of 65,377 observations were gathered. Prior to the application of machine learning algorithms, the data underwent cleaning and preprocessing to eliminate any missing or redundant information. Subsequently, the remaining dataset was normalized (min-max scaling), which scaled the data values to fall within the range [0.0, 1.0].
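To make the preprocessing concrete, the following minimal Python sketch shows one way to perform the cleaning and min-max scaling described above. The file name, column names, and label column are illustrative assumptions, since the paper does not publish them.

```python
# Minimal preprocessing sketch (assumed file and column names; the paper's
# exact cleaning rules are not published in full).
import pandas as pd
from sklearn.preprocessing import MinMaxScaler

df = pd.read_csv("lcs_records.csv")             # 65,377 raw records (hypothetical file name)

# Drop duplicate and incomplete rows, as described in the text
df = df.drop_duplicates().dropna()

feature_cols = [f"P{i}" for i in range(1, 18)]  # P1..P17 drilling parameters
X = df[feature_cols]
y = df["LCS_class"]                             # severity labels 0..4 (assumed column name)

# Scale features to the [0.0, 1.0] range
X_scaled = pd.DataFrame(MinMaxScaler().fit_transform(X), columns=feature_cols)
```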

The model parameters were derived from twenty wells drilled into the Azadegan oilfield in Iran, the world's largest unexploited oil field, which provided a significant subsurface dataset for this study. The field comprises three separate structural blocks: I, II, and III. The field's wells provided the data for the dataset, and the majority of reported lost circulation instances took place in block III. The loss rate for every meter of drilled depth was extracted from the daily drilling reports and transformed into five classifications, ranging from "no loss" to "complete loss" (Mardanirad et al. 2021; Wood et al. 2022).

Table 2 lists all collected parameters and categorizes each variable into four main groups: drilling parameters, fluid flow, mud properties, and output. The first type of loss, coded "Class 0", indicates no significant rate of loss; the second type, coded "Class 1", indicates seepage losses of less than 10 barrels per hour (bbl/hr); the third type, coded "Class 2", indicates a fluid loss in the range of 10 to less than 100 bbl/hr; the fourth type, coded "Class 3", involves fluid loss exceeding 100 bbl/hr with slight returns of lost fluid; and the fifth type, coded "Class 4", corresponds to fluid loss without return.

Table 2 The collected parameters

As illustrated in Fig. 2, the correlation matrix shows several strong correlations between the variables in this dataset. For example, there is a strong positive correlation between Mud Weight (P11) and Solid (P17) (R = 0.95), meaning that as the Mud Weight increases, the Solid content also increases. Similarly, Flow In (P8) and Pump Stroke (P10) are highly correlated (R = 0.98): as the Flow In increases, the Pump Stroke also increases. On the other hand, there are also several weak correlations between the variables. For example, Rotation (P5) and Plastic Viscosity (P13) have a weak positive correlation (R = 0.19), indicating that as the Rotation increases, so does the Plastic Viscosity, though only slightly. Additionally, there is a weak negative correlation (R = − 0.04) between Rotation (P5) and Rate of Penetration (P3), meaning that as the Rotation increases, the Rate of Penetration decreases slightly. The correlation heat map is therefore very useful for illustrating and understanding the relations between the parameters.

Fig. 2
figure 2

Heat map correlation for LCS parameters
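Continuing the preprocessing sketch above, the snippet below shows how a Pearson correlation heat map such as Fig. 2 can be produced and how individual coefficients (e.g., P11 vs P17, P8 vs P10) can be inspected; it assumes the scaled feature DataFrame X_scaled from the earlier sketch.

```python
import matplotlib.pyplot as plt
import seaborn as sns

# Pearson correlation matrix of the 17 input parameters (Fig. 2 analogue);
# X_scaled is the DataFrame built in the preprocessing sketch above
corr = X_scaled.corr(method="pearson")

plt.figure(figsize=(10, 8))
sns.heatmap(corr, annot=True, fmt=".2f", cmap="coolwarm", vmin=-1.0, vmax=1.0)
plt.title("Correlation heat map for LCS parameters")
plt.tight_layout()
plt.show()

# Example lookups matching the pairs discussed in the text
print(corr.loc["P11", "P17"])   # Mud Weight vs Solid (reported R of about 0.95)
print(corr.loc["P8", "P10"])    # Flow In vs Pump Stroke (reported R of about 0.98)
```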

Principal component analysis (PCA)

PCA is a method widely employed in statistics, machine learning, and data analysis to decrease the dimensionality of a dataset that may consist of numerous interrelated variables. Its core objective is to convert this high-dimensional data into a smaller collection of uncorrelated variables known as principal components. These principal components are designed to retain the majority of the original data’s variance while simplifying its structure (Kurita 2019).

Figure 3 illustrates the results of PCA on the collected LCS parameters, where the parameters are ranked by their absolute PCA loading and presented as a descending bar chart. As a result, P1, P2, P8, and P10 were the most influential parameters on LCS, whereas P6, P9, P14, and P11 were the least influential. Consequently, the number of parameters could be reduced based on the absolute PCA loadings. However, this study preferred to use all available parameters to obtain the highest possible accuracy. Moreover, all of these parameters are available before the drilling operations; therefore, there is no need for dimensionality reduction.

Fig. 3
figure 3

Ranking parameters based on PCA absolute loading
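A hedged sketch of the PCA loading ranking follows; it interprets "PCA absolute loading" as the absolute loading on the first principal component, which is an assumption since the paper does not state its exact aggregation. It reuses X_scaled and feature_cols from the preprocessing sketch.

```python
import numpy as np
from sklearn.decomposition import PCA

# Rank parameters by the absolute loading on the first principal component
# (one reasonable reading of "PCA absolute loading")
pca = PCA(n_components=X_scaled.shape[1])
pca.fit(X_scaled)

loadings = np.abs(pca.components_[0])            # |loading| on PC1 for each parameter
ranking = sorted(zip(feature_cols, loadings), key=lambda t: t[1], reverse=True)
for name, value in ranking:
    print(f"{name}: {value:.3f}")                # descending bar-chart data (Fig. 3 analogue)
```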

Machine learning models development

After the data engineering and analysis process, the obtained data were divided into three separate sets: a training set containing 70% of the data, a validation set containing 15%, and a testing set containing the remaining 15%. To minimize bias in the data, the validation set (15%) was used for hyperparameter optimization with a tenfold cross-validation technique. Individual learning algorithms and ensemble learning algorithms are the two types of machine learning algorithms used in this study. Five individual learning algorithms, namely Artificial Neural Networks, Multiple Linear Regression, Polynomial Regression, Support Vector Machine, and Decision Tree, were developed for the study. Additionally, the study used six ensemble learning (EL) algorithms, including Gradient Boosting, Extreme Gradient Boosting, Light Gradient Boosting Machine, Extremely Randomized Trees, and Random Forest. Consequently, a total of 11 algorithms were created and used for the analysis of Lost Circulation Severity (LCS).
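The split and cross-validation procedure described above can be sketched as follows; the random seed, stratification, and the use of a default random forest inside the cross-validation loop are illustrative assumptions.

```python
from sklearn.model_selection import train_test_split, cross_val_score
from sklearn.ensemble import RandomForestClassifier

# 70 / 15 / 15 split (two-step split; the exact random seed is an assumption)
X_train, X_tmp, y_train, y_tmp = train_test_split(
    X_scaled, y, test_size=0.30, random_state=42, stratify=y)
X_val, X_test, y_val, y_test = train_test_split(
    X_tmp, y_tmp, test_size=0.50, random_state=42, stratify=y_tmp)

# Tenfold cross-validation on the validation data, as described for
# hyperparameter tuning (illustrated here with a default random forest)
rf = RandomForestClassifier(random_state=42)
scores = cross_val_score(rf, X_val, y_val, cv=10, scoring="accuracy")
print(scores.mean())
```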

Evaluation metrics

Evaluation metrics for both hyperparameters and model performance were computed based on the validation and testing datasets, as described in Bishop and Nasrabadi (2006). Accuracy measures the proportion of correctly classified observations out of the total observations: (Number of Correct Predictions) / (Total Number of Predictions), as in Eq. (1):

$${\text{Accuracy}} = \frac{{{\text{TP}} + {\text{TN}}}}{{{\text{TP}} + {\text{TN}} + {\text{FP}} + {\text{FN}}}}$$
(1)

where TP, TN, FP, and FN denote true positives, true negatives, false positives, and false negatives, respectively. Precision measures the ratio of correctly predicted positive instances to all predicted positive instances: (True Positives) / (True Positives + False Positives), as in Eq. (2).

$${\text{Precision }} = { }\frac{{{\text{TP}}}}{{{\text{TP}} + {\text{FP}}}}$$
(2)

Recall (sensitivity or true positive rate) measures the ratio of correctly predicted positive observations to the actual positive observations: (True Positives) / (True Positives + False Negatives), as in Eq. (3). In the ROC curve, the true positive rate (TPR) is plotted on the y-axis and the false positive rate (FPR) on the x-axis (Sokolova et al. 2006).

$${\text{Recall}} = {\text{TPR}} = \frac{{{\text{TP}}}}{{{\text{TP}} + {\text{FN}}}}$$
(3)

The F1 score is the harmonic mean of precision and recall, providing a balance between the two: 2 * (Precision * Recall) / (Precision + Recall), as in Eq. (4).

$${\text{F}}1{\text{ score}} = \frac{{{\text{TP}}}}{{{\text{TP}} + 0.5{*}\left( {{\text{FP}} + {\text{FN}}} \right)}}$$
(4)

The F2 score is a variation of the F1 score that places more weight on recall. It is useful when recall is more important than precision: (1 + 2^2) * (Precision * Recall) / (2^2 * Precision + Recall), as in Eq. (5).

$${\text{F}}2{\text{ score}} = \left( {1 + 2^{2} } \right){*}\frac{precision*recall}{{(2^{2} *precision) + recall}}$$
(5)

The F-beta score is a generalized form of the F1 score that allows different weights to be assigned to precision and recall using the parameter β: (1 + β^2) * (Precision * Recall) / (β^2 * Precision + Recall), as in Eq. (6).

$${\text{F}} - {\text{Beta Score}} = \left( {1 + \beta^{2} } \right){*}\frac{precision*recall}{{(\beta^{2} *precision) + recall}}$$
(6)

The receiver operating characteristic (ROC) curve plots the true positive rate (recall) against the false positive rate at different thresholds, and the area under this curve (ROC AUC) summarizes the classifier's performance. For multiclass classification, these evaluation metrics are calculated for each class individually and then aggregated using methods such as micro-averaging, macro-averaging, or weighted averaging, depending on the specific use case and evaluation requirements (Elmousalami and Elaskary 2020).
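The metrics in Eqs. (1) to (6) and the ROC AUC can be computed for the multiclass case as sketched below; weighted averaging is shown as one option, since the paper does not state which averaging scheme it used. The sketch assumes the split and the fitted classifier from the earlier snippets.

```python
from sklearn.metrics import (accuracy_score, precision_score, recall_score,
                             f1_score, fbeta_score, roc_auc_score)

# Multiclass metrics aggregated with weighted averaging (macro/micro are
# alternative choices)
rf.fit(X_train, y_train)
y_pred = rf.predict(X_test)
y_prob = rf.predict_proba(X_test)

print("Accuracy :", accuracy_score(y_test, y_pred))
print("Precision:", precision_score(y_test, y_pred, average="weighted"))
print("Recall   :", recall_score(y_test, y_pred, average="weighted"))
print("F1 score :", f1_score(y_test, y_pred, average="weighted"))
print("F2 score :", fbeta_score(y_test, y_pred, beta=2, average="weighted"))
print("ROC AUC  :", roc_auc_score(y_test, y_prob, multi_class="ovr",
                                  average="weighted"))
```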

Bayesian hyperparameter optimization

According to Hammam et al. (2020), the goal of hyperparameter optimization is to maximize predictive accuracy by determining the optimal hyperparameters for every learning algorithm. According to Wu et al. (2019), common techniques for hyperparameter optimization include manual search, random search, grid search, Bayesian optimization, and evolutionary optimization. According to Feurer and Hutter (2019), manual search, random search, and grid search require extensive testing to uncover optimal hyperparameters, whereas Bayesian and evolutionary methods automate the optimization process with little to no human interaction. The latter techniques are also useful for addressing the high dimensionality of the hyperparameter search space.

In this work, globally optimal model settings are established using Bayesian algorithms before training, with the aim of maximizing the accuracy of each method (expressed by Eq. 1), i.e., minimizing its classification error. The maximum number of iterations has been set at 10,000, with a predetermined upper limit, as shown in Fig. 4. Following the work of Feurer et al. (2015), the basic idea is to use Bayesian learning theory and Gaussian stochastic processes to generate a targeted Gaussian distribution inside the hyperparameter space of the applied machine learning method. After initializing the model's hyperparameters, the algorithm iteratively performs the actions listed in Algorithm 1.

Algorithm 1
figure a

Bayesian optimization

Fig. 4
figure 4

Incorporating Bayesian optimization into ML algorithms (Feurer et al. 2015)
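A minimal sketch of the Bayesian search loop described in Algorithm 1 is shown below using scikit-optimize's BayesSearchCV; the search ranges and iteration budget are illustrative assumptions, not the paper's actual settings.

```python
from skopt import BayesSearchCV
from skopt.space import Integer, Real
from sklearn.ensemble import RandomForestClassifier

# Illustrative hyperparameter space for the random forest (bounds assumed)
search_spaces = {
    "n_estimators":      Integer(10, 200),
    "max_depth":         Integer(2, 20),
    "max_features":      Real(0.1, 1.0),
    "min_samples_leaf":  Integer(1, 10),
    "min_samples_split": Integer(2, 10),
}

opt = BayesSearchCV(
    RandomForestClassifier(random_state=42),
    search_spaces,
    n_iter=50,              # number of Bayesian iterations (illustrative)
    cv=10,                  # tenfold cross-validation on the validation set
    scoring="accuracy",     # maximizing accuracy, i.e., minimizing error
    random_state=42,
)
opt.fit(X_val, y_val)
print(opt.best_params_, opt.best_score_)
```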

The effectiveness of the models created depends on critical hyperparameters, including the learning rate, number of iterations, depth, the count of estimators, and the L2 regularization term. It’s crucial to predefine these hyperparameters because determining them during the training process is impractical. The chosen hyperparameters have a significant impact on various aspects, such as the model’s complexity, training duration, susceptibility to overfitting, and the speed of convergence, as discussed in Snoek et al. (2012).


LCS classification model results and discussion

The eleven classifiers shown in Table 3 were validated on the testing dataset using assessment metrics including accuracy, precision, recall, F1 score, F2 score, F-beta, and ROC AUC. As indicated in Table 3, the classifiers are arranged in descending order according to the evaluation metrics. According to this study, random forest (RF) provides the best accuracy for LCS. RF produced an overall correct classification rate of 99%, meaning that the model identified the correct LCS class 99% of the time. Decision Tree and XGB ranked second and third, respectively.

Table 3 Performance of Bayesian optimized ML algorithms for LCS

Choosing the appropriate machine learning algorithm should consider more than just predictive accuracy. Factors such as computational resources, which encompass memory usage and processing time, play a vital role in the decision-making process. Table 4 shows that RF processed the data in just 1.79 s using 164.11 megabytes (MB) of memory. Nonetheless, it is crucial to recognize that computational expenses, in terms of both time and memory usage, can surge notably when working with extensive datasets, including those featuring numerous attributes or a substantial volume of data.

Table 4 Optimal hyperparameters of each ML algorithm

The trade-off between the true positive rate (TPR) and the false positive rate (FPR) across different threshold values is illustrated by the receiver operating characteristic (ROC) curve, as shown in Fig. 5. The classifier's TPR is the percentage of correctly identified positive cases, while the FPR is the percentage of negative cases incorrectly categorized as positive. An optimal classifier is located in the upper-left corner of the ROC plot, with a TPR of 1.00 and an FPR of 0.00; the more closely the ROC curve approaches this corner, the better the classifier performs. The area under the curve (AUC) provides a single-number summary of the classifier's performance across all practical threshold values, and an AUC of 1.00 denotes perfect performance.

Fig. 5
figure 5

Average ROC curve and the confusion matrix for RF for different LCS classes

Figure 5 shows that all AUC values were 1.00. This means that the developed RF classifier can separate all five classes from each other almost perfectly, with very few false positives or false negatives. This is a very promising result, and it suggests that the random forest classifier is well suited for this LCS classification task. However, it is important to note that the ROC curve is only one metric for evaluating the performance of a classifier; other metrics, such as accuracy and precision, may also be important to consider, depending on the specific application.

The confusion matrix shown in Fig. 5 covers the five classes in the classification test. The matrix displays the number of data points that the random forest classifier classifies correctly and incorrectly: the diagonal elements show how many data points were correctly classified, while the off-diagonal elements represent the number of misclassified data points. For instance, the top-left element of the confusion matrix is 12,317.

This means that 12,317 data points with the true label "0" were also predicted to have the label "0". The other elements of the confusion matrix can be interpreted in a similar way. For example, the element in the first row and second column of the confusion matrix is 17, meaning that 17 data points with the true label "1" were incorrectly predicted to have the label "0". Overall, the confusion matrix shows that the random forest classifier correctly classified the large majority of the data points, with only a small number of misclassifications.

Ensemble machine learning techniques, as proposed by Breiman (1998) and Schapire et al. (1998), offer an effective approach for managing high-dimensional data, overcoming challenges related to limited sample sizes, and dealing with intricate data structures. Nonetheless, it’s crucial to recognize that the adoption of ensemble methods can lead to increased model complexity, as highlighted by Kuncheva (2014). In cases involving noisy data, the random forests algorithm has demonstrated superior performance when compared to the decision tree algorithm, as emphasized by Breiman (1996) and Dietterich (2000). However, it’s essential to acknowledge that random forests may lack the capacity to provide insights into the importance of features or the internal mechanisms behind their results.

Table 4 presents the optimal hyperparameters of each ML algorithm obtained with the Bayesian algorithm. The RF's hyperparameters were as follows: max_depth controls the maximum depth of each decision tree in the random forest; here, each tree can have a maximum depth of 10 levels. max_features determines the maximum number of features that the algorithm considers when looking for the best split at each node; a value of 0.780 indicates that each node is split by taking into account roughly 78% of the available attributes. min_samples_leaf specifies the minimum number of samples that must be in a leaf node; as it is set to 5 here, a leaf node must contain at least 5 samples. min_samples_split determines the minimum number of samples required to split an internal decision tree node; with a value of 5, a node must contain at least 5 samples to qualify for splitting. Finally, n_estimators indicates the number of decision trees in the random forest; in this instance the forest contains 55 trees that work together to forecast the outcome.
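The reported hyperparameters translate directly into a scikit-learn configuration, as in the sketch below; the random seed is an assumption.

```python
from sklearn.ensemble import RandomForestClassifier

# Random forest configured with the Bayesian-optimized hyperparameters
# reported in the text (Table 4): 55 trees, depth 10, ~78% of features
# per split, and leaf/split minimums of 5 samples
rf_opt = RandomForestClassifier(
    n_estimators=55,
    max_depth=10,
    max_features=0.780,
    min_samples_leaf=5,
    min_samples_split=5,
    random_state=42,          # seed is an assumption, not reported
)
rf_opt.fit(X_train, y_train)
print(rf_opt.score(X_test, y_test))
```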

As shown in Fig. 6, learning curves are used to assess the effectiveness of an ML model and to determine the quantity of training data required to attain a desired performance level. Additionally, learning curves aid in detecting overfitting. Various methods exist for illustrating learning curves; the prevalent approach involves plotting MSE values for both the training and cross-validation datasets against the number of training instances. The red line in the graph represents the training score, which is the MSE on the training data. The cross-validation score, the MSE on a held-out set of data that the model has never seen before, is represented by the green line. According to the learning curve, the training score drops as the volume of training data rises, because the model can identify patterns in the data more precisely as it sees more data. The cross-validation score likewise drops as the quantity of training data rises, though more slowly than the training score. This suggests that the model is approaching the point at which additional training data will not significantly alter its performance.

Fig. 6
figure 6

Learning curve for the RF algorithm
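A hedged sketch of the learning-curve computation follows, plotting training and cross-validation MSE against the number of training instances as described above; scoring MSE on class labels mirrors the paper's learning-curve description rather than a standard classification score, and rf_opt is the optimized forest from the previous sketch.

```python
import numpy as np
import matplotlib.pyplot as plt
from sklearn.model_selection import learning_curve

# Learning curve for the optimized RF (Fig. 6 analogue)
sizes, train_scores, cv_scores = learning_curve(
    rf_opt, X_train, y_train, cv=10, scoring="neg_mean_squared_error",
    train_sizes=np.linspace(0.1, 1.0, 10))

plt.plot(sizes, -train_scores.mean(axis=1), "r-", label="Training MSE")
plt.plot(sizes, -cv_scores.mean(axis=1), "g-", label="Cross-validation MSE")
plt.xlabel("Number of training instances")
plt.ylabel("MSE")
plt.legend()
plt.show()
```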

Explainable artificial intelligence with SHAP

The Shapley value, a game-theoretic method, is used to calculate SHAP values. These values can be used to explain any machine learning model, regardless of its category, because they are model-agnostic. According to Mangalathu et al. (2020), SHAP values are used to clarify the relative importance of each feature as well as the ways in which various features interact. Figure 7 shows the average impact of each parameter on the model output magnitude by class, providing a valuable tool for understanding how the model behaves. It was found that P2, P1, P5, and P11 had the most significant influence on Lost Circulation Severity (LCS), whereas P3, P9, P4, and P16 had the least influence.

Fig. 7
figure 7

SHAP model for Bayesian optimized ET model

SHAP analysis can also be used to identify potential biases in the model and to take steps to improve the model's performance. The graph shows that the model has the greatest impact on the output magnitude for class 0, followed by class 2, class 1, class 3, and class 4. This means that the model is most likely to produce a large output value for classes 0 and 2 and least likely to produce a large output value for class 4. However, random forests can still make accurate predictions for the minority class, as the aggregation of multiple trees can capture some of the minority class instances. Moreover, a cost-sensitive learning technique has been used to specify the costs associated with misclassifying different classes.
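The per-class mean |SHAP| view in Fig. 7 can be reproduced with the shap library roughly as follows, assuming the optimized random forest rf_opt and the feature list from the earlier sketches.

```python
import shap

# TreeExplainer handles tree ensembles such as the optimized random forest;
# the bar summary plot gives the "mean |SHAP| per class" view in Fig. 7
explainer = shap.TreeExplainer(rf_opt)
shap_values = explainer.shap_values(X_test)      # one array per LCS class (older shap versions)

shap.summary_plot(shap_values, X_test, feature_names=feature_cols,
                  plot_type="bar")
```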

LCS mitigation optimization model

Evolutionary computing (EC) is rooted in the concept of "survival of the fittest," introduced by Charles Darwin in 1859. Genetic algorithms (GAs), a subset of EC first proposed by John Holland in 1975, are commonly employed for optimization and search tasks (Siddique and Adeli 2013). A chromosome is represented in this context as a vector (C) with 'n' genes, denoted 'ci'. Elmousalami (2020b) states that every chromosome (C) is a point in an n-dimensional search space. To represent the 17 input parameters in the current case study, each chromosome has seventeen genes, which correspond to the well drilling parameters (P1, P2, P3, P4, P5, P6,…, P17) listed in Table 2. Each gene is linked to one of the membership functions (MFi), where 'i' spans the boundary conditions for every variable (P1 through P17).

To initiate the process, an initial population of 10 chromosomes is established, and the genetic algorithm is run for a total of 10,000 generations. The crossover probability is set at 0.7, and the mutation probability is configured to be 0.03. Subsequently, a subset of the initial population of chromosomes is selected for evaluation using a fitness function (Fig. 8).

Fig. 8
figure 8

LCS mitigation system

The fitness function (F) serves as a tool for assessing the quality of potential solutions. In the genetic algorithm (GA) process, crossover and mutation operations are employed to generate new generations of offspring. The primary goal here is to minimize the probability of a well falling into a high lost circulation severity class. Consequently, the GA's objective function revolves around minimizing the LCS class by optimizing the seventeen input parameters to achieve acceptable lost circulation characteristics. The fitness function can be represented by Eq. (7), where the primary aim is to minimize the fitness function, denoted as:

$${\text{F}} = {\text{ Minimization }}\left( {{\hat{\text{y}}}_{{\text{i}}} } \right)$$
(7)

Within this formula, the projected classification based on the Random Forest (RF) model is represented by \({\widehat{\text{y}}}_{\text{i}}\), and 'F' stands for the fitness function in Eq. (7). Each of the seventeen input parameters has lower and upper bounds that must be met in order to keep the input variables within acceptable ranges. In addition, functional restrictions have been added in compliance with design standards, such as ensuring that the total percentage of solids and water does not exceed 100%. It is crucial to remember, nevertheless, that choosing a sensible combination of these seventeen factors for the drilling procedure frequently requires a high level of judgment.
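A minimal sketch of this GA loop is given below, using the stated population size, generation count, and crossover/mutation probabilities; the bounds, the uniform-crossover and random-reset operators, and the handling of fixed versus adjustable parameters are simplifying assumptions, since the paper's exact encoding is not published. It reuses the trained rf_opt classifier as the fitness evaluator (Eq. 7).

```python
import numpy as np

# Minimal GA sketch: minimize the RF-predicted LCS class over the 17 parameters
rng = np.random.default_rng(0)
n_genes, pop_size, generations = 17, 10, 10_000
p_cross, p_mut = 0.7, 0.03
lower = np.zeros(n_genes)          # per-parameter lower bounds (scaled space, assumed)
upper = np.ones(n_genes)           # per-parameter upper bounds (scaled space, assumed)

def fitness(c):
    # F = predicted severity class from the trained RF (Eq. 7)
    return rf_opt.predict(c.reshape(1, -1))[0]

pop = rng.uniform(lower, upper, size=(pop_size, n_genes))
for _ in range(generations):
    scores = np.array([fitness(c) for c in pop])
    parents = pop[np.argsort(scores)[: pop_size // 2]]           # keep the fittest half
    children = []
    while len(children) < pop_size - len(parents):
        a, b = parents[rng.integers(len(parents), size=2)]
        child = np.where(rng.random(n_genes) < p_cross, a, b)    # uniform crossover
        mutate = rng.random(n_genes) < p_mut
        child[mutate] = rng.uniform(lower[mutate], upper[mutate])  # random-reset mutation
        children.append(np.clip(child, lower, upper))
    pop = np.vstack([parents, children])

best = pop[np.argmin([fitness(c) for c in pop])]
print("Mitigated parameter set:", best, "predicted class:", fitness(best))
```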

However, the key limitation of the mitigation module is that the model cannot mitigate lost circulation in many cases, especially severe or natural ones. Lost circulation may be induced or natural; a mitigation plan can be provided if the losses were induced, for example by a high ROP or high pump pressure. If induced losses escalate to severe levels, however, reversing them becomes exceedingly difficult. Therefore, mitigation efforts primarily target partial losses and aim to transition them to seepage losses or, ideally, to eliminate losses altogether.

Practical applications of ALCSCMS

ALCSCMS serves as a valuable tool in the oilfield, particularly before drilling operations commence. Its application lies in several key areas:

  1. 1.

    Pre-drilling risk assessment By inputting geological data, wellbore characteristics, and drilling parameters, the model evaluates the likelihood of circulation loss occurrences in different formations and scenarios. This information allows operators to proactively identify high-risk areas and implement preemptive measures to mitigate potential circulation loss issues.

  2. 2.

    Optimized well planning Utilizing the predictive model during the well planning phase allows operators to optimize well design and trajectory to minimize the risk of circulation loss. By simulating various drilling scenarios and assessing their associated risks, operators can make informed decisions regarding casing design, drilling fluid selection, and wellbore strengthening strategies.

  3. 3.

    Mitigation strategy development The model assists in the development of targeted mitigation strategies customized to specific drilling environments and challenges. By analyzing the factors contributing to circulation loss risks, such as formation properties, pore pressure, and drilling parameters, operators can develop effective mitigation plans. These plans can include the pre-treatment of formations, the use of specialized drilling fluids, and the implementation of contingency measures to address potential circulation loss events.

  4. 4.

    Cost reduction and efficiency improvement By accurately predicting circulation loss risks and implementing proactive mitigation measures, operators can reduce drilling downtime, minimize costly remediation efforts, and enhance overall drilling efficiency. The application of the predictive model leads to optimized drilling operations, resulting in significant cost savings and improved project outcomes.

Conclusions

Lost circulation can lead to considerable financial setbacks due to expenses related to drilling fluids, equipment, and the operational downtime required for remediation. This can have a significant adverse impact on the overall financial planning of drilling projects. In addition, extended drilling time, wellbore instability, environmental impact, and safety hazards are direct consequences of lost circulation. To tackle this problem, the study introduces the Automated Lost Circulation Severity Classification and Mitigation System (ALCSCMS) to enable decision makers to reliably predict lost circulation severity (LCS) using a few key drilling parameters before commencing drilling operations. The paper concludes the following:

  • Effectiveness of ALCSCMS The study demonstrates the effectiveness of the Automated Lost Circulation Severity Classification and Mitigation System (ALCSCMS) in accurately predicting lost circulation severity using key drilling parameters.

  • Optimal model performance The Random Forests (RF) model, optimized through Bayesian optimization, achieves a remarkable 100% classification accuracy in predicting lost circulation severity, making it the preferred choice for classification tasks.

  • Mitigation strategy integration Incorporating a mitigation optimization model based on genetic algorithms allows for the transformation of highly severe lost circulation situations into more manageable categories, enhancing overall drilling efficiency and cost-effectiveness.

  • Insights from SHAP analysis The use of SHapley Additive exPlanations (SHAP) provides valuable insights into the key parameters influencing lost circulation severity predictions, aiding decision-making processes in drilling operations.

Additionally, the research incorporated a mitigation optimization model based on a genetic algorithm. This model’s role is to transform highly severe lost circulation situations into more manageable categories. Furthermore, the SHapley Additive exPlanations (SHAP) technique was used to offer insights and explanations for the key parameters that served as inputs for predicting lost circulation severity.

While the research on the Automated Lost Circulation Severity Classification and Mitigation System (ALCSCMS) is promising, there are several limitations such as real-time data gathering, cost–benefit analysis of the mitigation procedures, and environmental considerations. Therefore, the future research directions for the study on an integrated system for automated lost circulation severity classification and mitigation (ALCSCMS) could encompass several areas:

  1. 1.

    Real-world implementation and validation To assess the practical applicability of ALCSCMS, further research may involve field trials and validation on actual drilling operations for several sites and fields. This will confirm the system’s effectiveness under real-world conditions.

  2. 2.

    Data enhancement Expanding the dataset used in the research or continuously updating it with more recent information can improve the accuracy of the models and their adaptability to evolving drilling scenarios. Integration with the internet of things (IoT) and sensor data: incorporating real-time data from sensors and IoT devices on drilling rigs can enable the system to adapt and respond dynamically to changing drilling conditions. This can help in real-time decision-making and mitigation strategies.

  3. 3.

    Advanced machine learning techniques Future research can explore the application of more advanced machine learning techniques, such as deep learning, reinforcement learning, or hybrid models, to further enhance the predictive capabilities of ALCSCMS.