Introduction

The success of any drilling operation in oil and gas projects depends on three metrics: a higher drilling speed, or rate of penetration (ROP), a lower overall cost, and maintained safety. Drilling accounts for more than half of the budget of any exploration and development project. In addition, delays caused by drilling problems such as stuck pipe, drill bit failures and fishing raise the overall cost of the project. The recent drop in oil and gas prices has therefore motivated the industry to optimize the overall drilling operation. One of the major problems encountered while drilling in sticky shales, and sometimes even in loose sandstone, is bit balling. Because approximately 60% of drilled footage is in shale or clay, bit balling is difficult to avoid. Bit balling is a failure mode in which mud and formation material accumulate around the drill bit and cause it to fail (Roy and Cooper 1993). It reduces the ROP and sometimes increases the standpipe pressure when the bit nozzles become plugged. Many methods have been proposed in the past to avoid bit balling, including the following (Roy and Cooper 1993):

  1. Change in drilling fluid rheology.

  2. Use of oil-based mud in water-reactive clays/shale.

  3. Developing electric potential between formation and drill bit.

  4. Modifications in drill bit hydraulics.

Unlike the conventional techniques listed above, this work examines bit balling using data collected during drilling and evaluates the parameters that may lead to it. Existing approaches for quantifying bit balling in terms of drilling parameters rely only on X–Y cross-plots, and no analytical or semi-analytical technique quantifies bit balling as a function of weight on bit (WOB), torque, ROP, rotary speed (RPM), mud flow and surface-controlled pressures such as swivel, choke and borehole pressure. We have therefore chosen artificial intelligence (AI) to build this relationship. The main objective is to provide a holistic outline for using real-time drilling data to detect bit balling early and to adjust the drilling parameters promptly to prevent bit dysfunction.

Background

Previously, different methodologies have been devised that use direct or indirect approaches to evaluate ROP. ROP is a function of the drilling parameters, the drilling fluid type and, most importantly, the properties of the rock being drilled. It is therefore a direct indicator of the rock's mechanical properties, and understanding it leads to improved drilling parameters, bit design and fluid type for achieving a desirable ROP in all kinds of formations. One of the first attempts at drilling optimization was presented by Graham and Muench (1959), who analytically evaluated combinations of weight on bit and rotary speed to derive empirical mathematical expressions for bit life expectancy and for drilling rate as a function of depth, rotary speed and bit weight. Galle and Woods (1963) produced graphs and procedures for field applications to determine the best combination of drilling parameters. Bourgoyne and Young (1974) proposed evaluating ROP as a function of eight variables whose coefficients were obtained by multiple regression analysis. Their equation was valid for roller cone bits. Using a minimum cost formula, they showed that the maximum rate of penetration may coincide with the minimum cost approach if technical limitations are ignored. In the mid-1980s, operating companies developed drilling optimization techniques that their field personnel could apply on site by referring to graph templates and equations. In the 1990s, different drilling planning approaches emerged (Carden et al. 2006), and new techniques identified the best possible well construction performance. Later, “Drilling the Limit” optimization techniques were also introduced (Schreuder and Sharpe 1999). Toward the end of the millennium, real-time monitoring techniques started to appear; for example, drilling parameters started to be monitored from off-site locations. A few years later, real-time operations and support centers started to be built, and some operators proposed advanced techniques for monitoring drilling parameters at the rig site.

In our previous studies, we successfully applied AI techniques to the detection and mitigation of liquid loading in shale gas reservoirs and to the optimization of completion designs in the Marcellus shale (Ansari et al. 2017; Belyadi et al. 2016). The biggest advantage of machine learning for such problems is the flexibility to include all available information when developing a predictive method. In this work, we use a new AI-based approach to achieve the following objectives:

  1. Developing dynamic predictive models for bit dysfunction diagnosis in different laboratory tests.

  2. Developing a workflow for early detection of bit failure in the field.

To achieve these objectives, we used laboratory and field data provided by National Oilwell Varco, consisting of WOB, ROP, torque, RPM, and swivel, choke and borehole pressures recorded as functions of time. These data were used to build intelligent models that predict ROP under both laboratory and field conditions, and the predicted behavior of ROP over time is used to diagnose drill bit dysfunction.

Methodology

Artificial neural networks (ANNs)

An artificial neural network (ANN) is inspired by the biological neural system, in which highly interconnected neurons process information and learn from repeated events, as shown in Fig. 1. Similarly, an ANN learns a specific task through a training process in which the connections between neurons (the weights, for a given number of layers and neurons) are adjusted to minimize the difference between the predictions and the ground truth values. In its simplest architecture, an ANN consists of three layers: an input layer, a hidden layer and an output layer. The layers are interconnected by weighted linkages, and the neurons apply activation functions. The input layer receives the pattern of the provided examples, and the hidden layer processes it through the weighted linkages. The output layer compiles the result from the hidden layer, and the produced output is compared with the desired output to assess the performance of the network.

Fig. 1 An artificial neural network

Multilayer perceptron

A multilayer perceptron (MLP) is a feed-forward ANN in which one or more hidden layers are interconnected to perform nonlinear function approximation. Within the hidden layers, each neuron converts the values from the previous layer into a new value through a weighted linear summation followed by a nonlinear function (i.e., an activation function), as shown in Fig. 2. The MLP learns on the principle of feed-forward back propagation: the input values are fed into the input layer and travel forward through the hidden layers to the output layer; the generated output is then compared with the actual value to calculate the error, which is propagated back through the network by adjusting the weights with a gradient descent algorithm (back propagation). The MLP is commonly used in supervised problems, where sets of input–output variables are available for training and the parameters (weights and biases) are modified to minimize the global error between the predictions and the true solutions. The optimized model can then be used to quantify the correlations between the dependent “output” and independent “input” variables.

Fig. 2 Multilayer perceptron network
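The feed-forward and back-propagation steps described above can be sketched in a few lines of plain Python. This is only an illustrative toy, not the network used in this study: the function names (`init_mlp`, `forward`, `train_step`, `mse`), the tanh activation, the learning rate and the toy data are all assumptions made for the example.

```python
import math
import random

def init_mlp(n_in, n_hidden, seed=0):
    """Small random weights for a one-hidden-layer, single-output MLP."""
    rng = random.Random(seed)
    w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_hidden)]
    w2 = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
    return {"w1": w1, "b1": [0.0] * n_hidden, "w2": w2, "b2": 0.0}

def forward(p, x):
    """Feed forward: weighted sum plus tanh in the hidden layer, linear output."""
    h = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
         for row, b in zip(p["w1"], p["b1"])]
    y = sum(w * hj for w, hj in zip(p["w2"], h)) + p["b2"]
    return h, y

def train_step(p, x, target, lr=0.05):
    """One gradient-descent update for the squared error 0.5 * (y - target)^2."""
    h, y = forward(p, x)
    err = y - target  # derivative of the loss with respect to the output
    for j, hj in enumerate(h):
        # Chain rule through tanh, using the output weight before it is updated.
        grad_pre = err * p["w2"][j] * (1.0 - hj * hj)
        p["w2"][j] -= lr * err * hj
        p["b1"][j] -= lr * grad_pre
        for i, xi in enumerate(x):
            p["w1"][j][i] -= lr * grad_pre * xi
    p["b2"] -= lr * err
    return 0.5 * err * err

def mse(p, xs, ys):
    """Mean squared error of the network over a dataset."""
    return sum((forward(p, x)[1] - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

# Toy supervised problem (made-up data): learn y = x^2 on [0, 1].
xs = [[i / 10.0] for i in range(11)]
ys = [x[0] ** 2 for x in xs]
params = init_mlp(n_in=1, n_hidden=8)
loss_before = mse(params, xs, ys)
for _ in range(300):
    for x, y in zip(xs, ys):
        train_step(params, x, y)
loss_after = mse(params, xs, ys)
```

Comparing `loss_before` with `loss_after` shows the effect of repeatedly propagating the error back through the weights.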

Results and discussions

Drill bit test data

Preprocessing

The preprocessing usually involves data screening, outlier detection, data imputation and data transformation (scaling and normalization). MLP models are sensitive to feature scaling. To make sure the model is not biased by the magnitude of the variables, we used a scaling algorithm that rescales each variable relative to its minimum and maximum values, so that the values lie mostly in the range of zero to unity or, in some cases, from −1 to 1. This type of scaling is robust to variables with very small standard deviations and preserves zero entries in sparse data. Because the pair plots of the parameters did not indicate any outliers and the dataset had no missing values, we did not apply any outlier detection or imputation technique. The laboratory drilling test data are used for developing the predictive models. The laboratory tests evaluate the efficiency of the drill bits under different operating conditions and against different formations. The drill bit data used for this project are recorded every millisecond and comprise the following variables:

  1. Charge pressure (psi)

  2. Rotations per minute (RPM)

  3. Mud flow (GPM)

  4. Weight on bit “WOB” (klb)

  5. Borehole pressure (psi)

  6. Swivel pressure (psi)

  7. Choke pressure (psi)

  8. Torque (klb ft)

  9. Penetration (in)

  10. Depth of cutting (in)
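As an illustration of the min-max scaling described before the list, the following sketch rescales one variable to the range [0, 1] or [−1, 1] and maps scaled values back to physical units. The function names (`fit_min_max`, `scale`, `unscale`) are hypothetical and the WOB readings are made-up numbers, not data from this study.

```python
def fit_min_max(values):
    """Return the (minimum, maximum) of one variable, taken from the training data."""
    return min(values), max(values)

def scale(values, lo, hi, feature_range=(0.0, 1.0)):
    """Map values linearly so that lo goes to the range start and hi to the range end."""
    start, end = feature_range
    span = hi - lo
    if span == 0:  # constant channel: avoid division by zero
        return [start for _ in values]
    return [start + (v - lo) * (end - start) / span for v in values]

def unscale(scaled, lo, hi, feature_range=(0.0, 1.0)):
    """Inverse of scale(): recover values in their original units."""
    start, end = feature_range
    return [lo + (s - start) * (hi - lo) / (end - start) for s in scaled]

# Illustrative WOB readings in klb (assumed values for the example only).
wob = [5.0, 10.0, 15.0, 20.0]
lo, hi = fit_min_max(wob)
wob_01 = scale(wob, lo, hi)                               # zero to unity
wob_pm1 = scale(wob, lo, hi, feature_range=(-1.0, 1.0))   # -1 to 1
wob_back = unscale(wob_01, lo, hi)                        # back to klb
```

Fitting the minimum and maximum on the training portion only, and reusing them for later data, keeps the blind set from influencing the scaling; whether the study did this is not stated, so it is an assumption of the sketch.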

Because the laboratory tests are conducted under better-defined conditions, we first used the laboratory data to develop, train and validate the model. This step served as our proof of concept. We then extended the study to field data, which are obtained in a more complex environment than the laboratory, for the actual application of real-time drilling performance monitoring and optimization. As noted above, the pair plots did not reveal any outliers, and there were no missing data requiring imputation. An initial analysis was carried out to determine the correlations between the parameters in the database. Most of the parameters had pairwise correlations below 90%, so we decided to keep them during model development.
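The correlation screening mentioned above could look like the following sketch, which computes Pearson correlations between every pair of variables and lists the pairs at or above a threshold. The 0.9 threshold mirrors the 90% figure in the text; the function names and the three short data columns are invented for illustration.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y)

def highly_correlated_pairs(columns, threshold=0.9):
    """Return (name_a, name_b, r) for every pair with |r| >= threshold."""
    names = list(columns)
    pairs = []
    for i in range(len(names)):
        for j in range(i + 1, len(names)):
            r = pearson(columns[names[i]], columns[names[j]])
            if abs(r) >= threshold:
                pairs.append((names[i], names[j], r))
    return pairs

# Made-up readings: torque tracks WOB closely, RPM does not.
columns = {
    "WOB": [1.0, 2.0, 3.0, 4.0, 5.0],
    "Torque": [2.1, 3.9, 6.2, 8.0, 9.9],
    "RPM": [60.0, 58.0, 61.0, 59.0, 60.0],
}
flagged = highly_correlated_pairs(columns)
```

In this toy example only the WOB and torque pair is flagged; variables that appear in no flagged pair would be kept, in the spirit of the screening described above.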

Development of model

A feed-forward back-propagation neural network was developed with one hidden layer containing 50 neurons. The rest of the network architecture is presented in Table 1.

Table 1 Neural network architecture

To train the model, the first half (50%) of the data for both the dependent and independent variables is used, and 15% of that half is randomly selected for calibration of the trained model, as shown in Fig. 3. The independent variables, namely WOB, RPM and the pressures, are assigned to train_X, and the dependent variable or “target”, namely ROP, is assigned to train_y. Model development was repeated 50 times with different initializations, so that the reported result does not depend on a single random initialization and is therefore more reproducible. The remaining second half (50%) of the data was split into blind_X and blind_y for the independent and dependent variables, respectively. This allows the predicted ROP to be compared with measured values to assess the accuracy of the model. The model was trained and verified with the first half of the data and then used to predict blind_y from the input information in blind_X. The model was able to predict blind_y successfully. The final predicted values were the mean of the 50 predictions obtained from the runs with different random initializations.

Fig. 3 Log diagram for model parameters
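The data partitioning and the averaging over repeated initializations described above can be outlined as follows. This is a sketch under stated assumptions: `fit_predict` is a hypothetical callable standing in for whatever routine trains one network with a given random seed and returns its blind predictions, and the helper names are not taken from the study.

```python
import random

def chronological_split(rows, train_fraction=0.5):
    """First part of the time-ordered record for training, the rest as blind set."""
    cut = int(len(rows) * train_fraction)
    return rows[:cut], rows[cut:]

def hold_out_calibration(train_rows, calib_fraction=0.15, seed=0):
    """Randomly set aside a fraction of the training rows for calibration."""
    rng = random.Random(seed)
    indices = list(range(len(train_rows)))
    rng.shuffle(indices)
    n_calib = int(round(len(train_rows) * calib_fraction))
    calib_idx = set(indices[:n_calib])
    fit_rows = [r for i, r in enumerate(train_rows) if i not in calib_idx]
    calib_rows = [r for i, r in enumerate(train_rows) if i in calib_idx]
    return fit_rows, calib_rows

def ensemble_mean(fit_predict, train_rows, blind_X, n_runs=50):
    """Average the blind predictions of n_runs differently seeded models.

    fit_predict(train_rows, blind_X, seed) is a placeholder (assumption) that
    must return one prediction per entry of blind_X.
    """
    totals = [0.0] * len(blind_X)
    for seed in range(n_runs):
        predictions = fit_predict(train_rows, blind_X, seed)
        for k, value in enumerate(predictions):
            totals[k] += value
    return [total / n_runs for total in totals]
```

Splitting by position rather than at random keeps the blind set strictly later in time than the training data, which matches the use of the second half of each test as the period in which dysfunction is expected.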

Post processing

The predicted ROP values are back-transformed from scaled units to their actual values for presentation purposes. Figure 4 shows a sample of training and predictions for one set of laboratory ROP measurements. The first half of the laboratory ROP data, used for training the model, is shown as blue dots. The output of the model trained on the first half of the data, together with its predictions for the second half, is shown as red dots. The predictions are the mean of 50 predicted realizations based on different initializations.

Fig. 4 ROP training data and training and blind predictions of the model

Figure 5 shows the quality of the model predictions for the second half of the data, used as the blind set. As shown in Fig. 5, the model captures the mean of the actual ROP behavior with high accuracy. We applied the same procedure to the remaining laboratory test data to check the applicability of the developed model, and similar results were obtained for the other laboratory experiments. In all cases similar to Fig. 5, where the model predictions closely followed the actual experimental results, we did not observe any bit failure or malfunction. This observation is used later to identify whether the experimental conditions are such that they could result in bit failure or malfunction.

Fig. 5 Quality of the model ROP predictions

Case of drilling dysfunction

Bit balling is characterized by a reduced penetration rate. Many parameters contribute to a slow ROP, for example, formation characteristics, bit type, drilling fluid properties, drill bit hydraulics and operating conditions. As discussed earlier, the first half of the data in each experiment is used for training and verification of the model, because we do not expect bit failure or malfunction to occur early in the life of the bit. Failure and malfunction usually happen after a new drill bit has been in use for a while. The trained model is then used to predict the ROP in the second half of the data, where we expect that failure or malfunction might occur. As long as the measured ROP that was not used for training follows the ROP predicted by the trained model, we do not expect any bit malfunction or failure. However, as soon as the measured ROP starts deviating from the model predictions, this can be taken as an indication that the bit is not performing as expected from the early stage of the drilling job, i.e., the first half of the data.

Figure 6 shows the laboratory data used for training and blind testing in blue and the model training and predictions in red. The trained model clearly matches the training set and captures the dynamics of drill bit performance. The laboratory data in the blind set initially follow the model predictions; after some time, however, the data start deviating from the model predictions and finally fail completely to follow them, i.e., where bit balling has occurred. Figure 6 clearly shows the ability of the trained model to raise a warning flag as soon as the measured data deviate from the model predictions and, finally, to identify the bit failure, i.e., bit balling. We tested the technique on different sets of laboratory experiments leading to bit balling, and in all cases the trained model was able to identify the start of the bit malfunction and the eventual failure due to bit balling.

Fig. 6 Laboratory data used for training and blind test and model training and predictions
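One possible way to turn the deviation between measured and predicted ROP into a warning flag is sketched below. The text does not specify a numerical criterion, so the trailing window length and the relative thresholds (`warn`, `fail`) are assumptions chosen only for illustration, as are the function names and the sample values.

```python
def rolling_mean_abs_error(measured, predicted, window):
    """Mean absolute gap between measured and predicted ROP over a trailing window."""
    errors = [abs(m - p) for m, p in zip(measured, predicted)]
    smoothed = []
    for k in range(len(errors)):
        chunk = errors[max(0, k - window + 1):k + 1]
        smoothed.append(sum(chunk) / len(chunk))
    return smoothed

def classify(measured, predicted, window=3, warn=0.15, fail=0.40):
    """Label each sample 'ok', 'warning' or 'failure' from the relative gap.

    warn and fail are hypothetical fractions of the predicted ROP.
    """
    gaps = rolling_mean_abs_error(measured, predicted, window)
    labels = []
    for gap, p in zip(gaps, predicted):
        relative = gap / max(abs(p), 1e-9)  # guard against a zero prediction
        if relative >= fail:
            labels.append("failure")
        elif relative >= warn:
            labels.append("warning")
        else:
            labels.append("ok")
    return labels

# Made-up blind-set example: ROP follows the model, then drifts, then collapses.
predicted = [100.0] * 12
measured = [100.0, 101.0, 99.0, 100.0, 90.0, 80.0, 70.0, 60.0, 40.0, 30.0, 20.0, 10.0]
labels = classify(measured, predicted)
first_warning = labels.index("warning")
first_failure = labels.index("failure")
```

Averaging the gap over a short window rather than reacting to a single sample is a design choice intended to avoid flagging isolated noisy readings; suitable thresholds would have to be tuned on real tests.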

Heavy hitter features (HHF) identification

To quantify the impact of the different input parameters on ROP, we tested several techniques, including linear support vector regression, Lasso regression, linear least squares with L2 regularization and the univariate linear F-regression test. Lasso regression was selected to rank the parameters because of its higher accuracy. Figure 7 shows the impact of each parameter on ROP: the weight on bit (WOB) has the highest impact, followed by the borehole pressure.

Fig. 7 HHF of ROP predictive model
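To illustrate how a Lasso fit can rank input parameters, the sketch below solves the Lasso problem by cyclic coordinate descent on standardized inputs and orders the variables by the magnitude of their coefficients. It is a generic textbook implementation, not the software used in this study; the penalty `alpha`, the function names and the small dataset (in which ROP is constructed from WOB and borehole pressure only) are assumptions.

```python
def soft_threshold(value, penalty):
    """Shrink value toward zero by penalty; return 0.0 if it falls inside the band."""
    if value > penalty:
        return value - penalty
    if value < -penalty:
        return value + penalty
    return 0.0

def standardize(column):
    """Centre a column and scale it to unit (population) standard deviation."""
    n = len(column)
    mean = sum(column) / n
    std = (sum((v - mean) ** 2 for v in column) / n) ** 0.5
    return [(v - mean) / std for v in column]

def lasso_coordinate_descent(columns, target, alpha=0.1, n_iter=200):
    """Minimise (1/2n) * ||y - Xw||^2 + alpha * ||w||_1 on standardized columns."""
    names = list(columns)
    X = [standardize(columns[name]) for name in names]
    n = len(target)
    y_mean = sum(target) / n
    y = [t - y_mean for t in target]
    w = [0.0] * len(names)
    for _ in range(n_iter):
        for j, xj in enumerate(X):
            # Residual with feature j left out of the current fit.
            partial = [y[i] - sum(w[k] * X[k][i] for k in range(len(w)) if k != j)
                       for i in range(n)]
            rho = sum(xj[i] * partial[i] for i in range(n)) / n
            # Unit-variance columns make the coordinate update a plain soft threshold.
            w[j] = soft_threshold(rho, alpha)
    return dict(zip(names, w))

# Made-up data: ROP = 3 * WOB + BHP, with RPM playing no role.
data = {
    "WOB": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0],
    "BHP": [3.0, 1.0, 4.0, 1.0, 5.0, 9.0, 2.0, 6.0],
    "RPM": [2.0, 7.0, 1.0, 8.0, 2.0, 8.0, 1.0, 8.0],
}
rop = [3.0 * w + b for w, b in zip(data["WOB"], data["BHP"])]
weights = lasso_coordinate_descent(data, rop)
ranking = sorted(weights, key=lambda name: abs(weights[name]), reverse=True)
```

Because the L1 penalty can drive the coefficients of uninformative inputs exactly to zero, the surviving coefficients give a natural ordering of the influential variables.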

Field case study

We extended our study from laboratory experiments to a field application. The objective here is to develop a model based on early-time drill bit information and use it to predict the ROP. The predictions are then used in real time to raise a warning flag when the measured ROP deviates from the ROP predicted by the model, indicating that the drilling conditions are such that the bit is underperforming. Operators can use this warning to change the drilling conditions so that the bit performance returns to the rate anticipated by the model. The model can also be used to identify drill bit failure such as bit balling, which is indicated when the measured ROP behaves completely differently from the model predictions. As in the laboratory tests, we used the first half of the field data to build the model and the second half as the blind set. To train the model, we selected a model architecture similar to that presented in Table 1 and applied scaling as part of the preprocessing. The model was trained 100 times, and the mean of the predictions from all runs was taken as the predicted ROP. All variables were reported against elapsed time, with the interval between consecutive readings varying from 4 to 6 s, as shown in Fig. 8.

Fig. 8 Log of drilling variables for field case

Figure 9 shows, in blue, the ROP values measured during the drilling job and used for training the model. The red dots show the trained model and its ROP predictions, taken as the mean of 100 realizations obtained with different initializations. From the training portion of the data, it is clear that the model was able to capture the main trends and dynamics of the measured ROP values with high accuracy. The predictions of the ROP values are then used to identify bit malfunction or failure during the rest of the drilling job. As discussed earlier, a deviation of the measured ROP, as it becomes available, from the predictions can be used to raise a warning that the bit is underperforming, and a complete failure of the measured data to follow the predictions indicates bit failure. As presented in Fig. 10, the actual data closely follow the model predictions until 20,000 time units, where the bit starts underperforming; at this point, the operator applied some changes that resulted in improved bit performance after 25,000 time units.

Fig. 9 Prediction of ROP for blind set field data

Fig. 10 Comparison of predicted and real ROP measurements in blind set

Conclusions

Using both the laboratory and the field test data provided, we have shown that a data-driven model built with the MLP technique can be used successfully for drilling performance monitoring and optimization. The model can also be used for uncertainty quantification and sensitivity analysis under laboratory conditions, where the limitations of the operating conditions or the time required to complete a test do not allow full sensitivity analysis or uncertainty quantification studies. In the field, the model can be used in real time to monitor bit performance and raise a flag when the bit is underperforming, in order to avoid possible bit malfunction or failure. We have shown that ROP has a complex relationship with the other drilling variables that cannot be captured by conventional statistical approaches or by the various empirical models. The data-driven approach, combined with statistical regression analysis, provides a better understanding of the relationships between the variables and a better prediction of ROP.