Introduction

In most high-pressure high-temperature (HPHT) and deepwater wells, the margin between the formation pore pressure and the formation fracture pressure is narrow. This requires an accurate determination of the equivalent circulating density (ECD) acting on the bottom of the hole to avoid problems such as lost circulation, formation fracturing, gas kicks, and well blowouts. In these critical operations, the ECD is used to control the formation pressure and prevent kicks without fracturing the drilled formations. When the mud pumps are switched off, the ECD drops, which may result in underbalanced conditions, so good knowledge of the ECD is required to avoid drilling problems. At the same time, the mud weight cannot be increased because of fracture pressure limitations. Continuous circulating system (CCS) tools are used to control the ECD and allow better control of the formation pressure (Baranthol et al. 1995; Ataga et al. 2012).

ECD is defined as the sum of the mud hydrostatic pressure and the annulus pressure loss acting on the formation (Haciislamoglu 1994). The annular clearance, mud weight, mud rheology, annular velocity (pump rates), cutting concentration in the annulus, and hole depth are the main parameters which affect the annular pressure losses (APL).
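The definition above can be written as a short calculation. The sketch below assumes oilfield units (mud weight in lb/gal, annular pressure loss in psi, true vertical depth in ft) and the customary 0.052 psi/ft per lb/gal conversion constant. The function name and the example numbers are illustrative and are not taken from the dataset of this study.

```python
# Hydrostatic gradient of a 1 lb/gal fluid, in psi per ft (customary oilfield constant).
PSI_PER_FT_PER_PPG = 0.052


def ecd_from_annular_loss(mud_weight_ppg, annular_loss_psi, tvd_ft):
    """Return the ECD in lb/gal as static mud weight plus the annular
    pressure loss expressed as an equivalent density at depth tvd_ft."""
    if tvd_ft <= 0:
        raise ValueError("true vertical depth must be positive")
    return mud_weight_ppg + annular_loss_psi / (PSI_PER_FT_PER_PPG * tvd_ft)


# Illustrative values only: 10.5 lb/gal mud, 260 psi annular loss, 10,000 ft TVD.
example_ecd = ecd_from_annular_loss(10.5, 260.0, 10000.0)
```

With the pumps off the annular loss is zero and the function returns the static mud weight, which is the ECD drop described in the Introduction.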

The two main components that affect the ECD are the cutting portion in the annulus expressed as equivalent static density (ESD), and the mud-related parameters (Zhang et al. 2013; Hemphill and Ravi 2011). Bybee (2009) introduced the following equation to predict the ECD:

$${\text{ECD}}={\text{ESD}}\left( {1 - {C_{\text{a}}}} \right)+{\left( {{\rho _{\text{s}}}{C_{\text{a}}}+\frac{\Delta p}{{g \times {{10}^{ - 3}}\;H}}} \right)^{\text{a}}},$$
(1)

where ECD is the equivalent circulating density (g/cm3), ESD is the equivalent static density (g/cm3), Ca is the solids concentration in the annulus (%), \({\rho _{\text{s}}}\) is the cuttings (solids) density (g/cm3), Δp is the pressure loss in the annular space (MPa), H is the vertical well depth (m), g is the gravitational acceleration (9.8 m/s2), and a is a unit-conversion constant equal to 8.345.

Such numerical evaluations of ECD do not take into account other factors that affect ECD while drilling, such as the flow geometry defined by the well geometry, the resistance to flow defined by the fluid rheology, and drill string rotation. Ignoring these factors increases the error of the ECD estimate (Caicedo et al. 2010; Costa et al. 2008).

Recently, the oil industry has used downhole tools to measure and monitor changes in ECD and so avoid well control issues such as gas kicks, blowouts, and formation fracturing (Erge et al. 2016; Rommetveit et al. 2010). The main tools in use are measurement while drilling (MWD) and pressure while drilling (PWD). These tools contain pressure sensors that independently measure the bottomhole pressure during drilling, regardless of the factors controlling the ECD (Ettehadi et al. 2013; Dokhani et al. 2016). They give accurate readings of ESD and ECD from the total pressure acting on the bottom of the well during circulation. Comparing the ESD with the ECD gives a clear view of the reasons for ECD changes (Vajargah et al. 2016; Osisanya and Harris 2005; Lin et al. 2016). In addition to their expensive daily rates, these tools have operating limitations related to pressure, temperature, and tool failure.

The objective of this paper is to use different artificial intelligence (AI) techniques to develop a robust model to predict the ECD using surface drilling parameters such as mud weight, surface drill pipe pressure and rate of penetration. The models are developed using artificial neural network (ANN) and adaptive neuro-fuzzy inference system (ANFIS). In addition, an empirical correlation is extracted from the ANN model which can be used to calculate the ECD from surface drilling parameters.

Artificial intelligence

Recently, artificial intelligence (AI) has gained widespread popularity in many engineering fields due to its ability to solve complex and non-linear problems (Naganawa et al. 2014; Razi et al. 2013). The petroleum industry usually deals with big data, and AI techniques provide real benefits for modeling and managing these data. During the last few years, AI techniques including artificial neural networks, fuzzy logic, support vector machines, genetic algorithms, adaptive neuro-fuzzy inference systems, and swarm intelligence have become increasingly popular in the petroleum industry. AI techniques are applied in different aspects of petroleum engineering such as production monitoring, forecasting and multilateral well evaluation (Velazquez et al. 2012; Weiss et al. 2002), PVT parameter prediction (Weiss et al. 2002; Khaksar et al. 2016; Alarfaj et al. 2012; Elkatatny and Mahmoud 2018), well integrity evaluation (Al-Ajmi et al. 2015), assisted history matching (Al-Thuwaini et al. 2006; Shahkarami et al. 2014), interpreting well logging data and well to well correlation (Saggaf and Nebrija 1998; Wu and Nyland 1986; Lim et al. 1998a, b; Wiener et al. 1995; Denney 1998), well testing interpretation (Allain and Houze 1992; Houze and Allain 1992; Moussa et al. 2018), drilling fluid properties (Elkatatny 2017; Elkatatny et al. 2017), reservoir characterization (El Ouahed et al. 2003; Kumar et al. 2012a, b; Abdulhameed et al. 2017; Moussa et al. 2018), enhanced oil recovery (Van and Chon 2017), rock mechanics (Sayadia et al. 2013; Elkatatny et al. 2018a, b), and drilling optimization (Wang and Salehi 2015).

Artificial neural network (ANN)

An artificial neural network (ANN) is a computational system that emulates the operation of biological neural networks (Nakamoto 2017). ANNs are at the leading edge of computational systems used to produce, or at least mimic, intelligent behavior. ANNs can solve problems that linear computing cannot process (Andagoya et al. 2015; Hemphill et al. 2007; Omosebi et al. 2012).

The main processing elements of an ANN are neurons. The ANN architecture contains at least three layers (input, hidden, and output), in addition to a training algorithm and a transfer function (Lippmann 1987). Weights connect the neurons in each layer to the neurons of the subsequent layer (Hinton et al. 2006). Log-sigmoidal and tan-sigmoidal are the most common transfer functions assigned to hidden layers, while ‘pure linear’ is commonly used as the activation function of the output layer. The input data points that go into an ANN model are normalized between −1 and 1 (Niculescu 2003). An ANN model is trained using back-propagation of errors: the data are first processed from the input layer through to the output layer, and the estimated outputs are then compared with the actual data in the output layer. The weights and biases of each layer are updated so that the estimated outputs match the target values. This procedure continues until the error is reduced to an acceptable limit, as shown in Fig. 1 (Liew et al. 2016; Naganawa et al. 2014; Razi et al. 2013).

Fig. 1 ANN system (MathWorks)
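The training loop described above (forward pass, comparison with the target, weight and bias update) can be illustrated with a very small network. The sketch below is an assumption-laden toy: it uses plain per-sample gradient descent, one tan-sigmoidal hidden layer and a linear output, and synthetic data. The training algorithm, layer size, learning rate, and data are hypothetical choices for illustration and are not the configuration used in this study.

```python
import math
import random


def train_tiny_ann(samples, n_hidden=4, lr=0.05, epochs=200, seed=0):
    """Train a one-hidden-layer network by back-propagation.

    samples: list of (inputs, target) pairs, inputs already scaled to [-1, 1].
    Returns the fitted parameters and the mean squared error per epoch.
    """
    rng = random.Random(seed)
    n_in = len(samples[0][0])
    w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_hidden)]
    b1 = [0.0] * n_hidden
    w2 = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
    b2 = 0.0
    history = []
    for _ in range(epochs):
        squared_error = 0.0
        for x, target in samples:
            # Forward pass: tanh hidden layer, linear output neuron.
            h = [math.tanh(sum(w * xi for w, xi in zip(w1[j], x)) + b1[j])
                 for j in range(n_hidden)]
            y = sum(w2[j] * h[j] for j in range(n_hidden)) + b2
            err = y - target
            squared_error += err * err
            # Backward pass: propagate the output error to every weight and bias.
            for j in range(n_hidden):
                grad_hidden = err * w2[j] * (1.0 - h[j] ** 2)  # uses w2 before its update
                w2[j] -= lr * err * h[j]
                for k in range(n_in):
                    w1[j][k] -= lr * grad_hidden * x[k]
                b1[j] -= lr * grad_hidden
            b2 -= lr * err
        history.append(squared_error / len(samples))
    return (w1, b1, w2, b2), history
```

Calling `train_tiny_ann` on a list of scaled samples returns the error history, which can be inspected to decide when the error has reached an acceptable limit.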

Unlike classical AI techniques which directly emulate rational and logical reasoning, neural networks are able to reproduce the underlying processing mechanisms which give rise to intelligence as an emergent property of complex adaptive systems (Shanmuganathan and Samarasinghe 2016). ANN systems have successfully been developed and deployed to solve capacity planning, pattern recognition, intuitive problem-related aspects, robotics, and business intelligence (Andagoya et al. 2015; Hemphill et al. 2007).

Neural networks have attracted strong interest over the last few years in areas such as data analytics, data mining, and forecasting (Bharambe and Dharmadhikari 2016).

Adaptive neuro-fuzzy inference system (ANFIS)

The adaptive neuro-fuzzy inference system (ANFIS) is an ANN system based on the Takagi–Sugeno fuzzy inference system. This technique was developed in the early 1990s to integrate fuzzy logic and ANN principles, and it has the potential to combine the benefits of both techniques in a single framework (Daneshwar and Noh 2013). Its inference system is based on a set of fuzzy IF–THEN rules which can approximate nonlinear functions and act as a universal estimator (Shing and Jang 1993). Figure 2 shows an ANFIS architecture composed of four layers of several nodes (Hamdan and Garibaldi 2010). The output of the nodes in the current layer, after manipulation by the node function of that layer, serves as the input to the nodes of the next layer. During the training process, the training algorithm tunes all the modifiable parameters of the ANFIS architecture to match the training data (Zarandi et al. 2010).

Fig. 2 Architecture of a typical adaptive neuro-fuzzy inference system (Hamdan and Garibaldi 2010)
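To make the IF–THEN rule structure concrete, the following sketch evaluates a first-order Takagi–Sugeno system with a single input. It assumes Gaussian membership functions of the usual form exp(−(x − c)²/(2σ²)) and linear rule consequents y = p·x + q. The single-input simplification, the function names, and all rule parameters are hypothetical and are not the fitted ANFIS model of this study.

```python
import math


def gaussmf(x, sigma, c):
    """Gaussian membership grade of x for a fuzzy set centred at c with width sigma."""
    return math.exp(-((x - c) ** 2) / (2.0 * sigma ** 2))


def sugeno_output(x, rules):
    """Evaluate a single-input first-order Sugeno system.

    rules: list of (sigma, c, p, q) tuples, each meaning
           IF x is Gaussian(sigma, c) THEN y = p * x + q.
    The result is the firing-strength-weighted average of the rule outputs.
    """
    strengths = [gaussmf(x, sigma, c) for sigma, c, _, _ in rules]
    total = sum(strengths)
    if total == 0.0:
        raise ValueError("no rule fires for this input")
    weighted = sum(w * (p * x + q) for w, (_, _, p, q) in zip(strengths, rules))
    return weighted / total


# Two hypothetical rules on an input scaled to [-1, 1].
example_rules = [(1.0, -1.0, 0.0, 0.0), (1.0, 1.0, 0.0, 2.0)]
```

Training an ANFIS model amounts to tuning the membership parameters (sigma, c) and the consequent parameters (p, q) of such rules against the training data.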

Methodology

To build the AI models using ANN and ANFIS techniques, the surface parameters including rate of penetration (ROP) in m/h, the weight of the mud flowing into the hole (MW) in lb/gal, and the drill pipe pressure (DPP) in psi were used as input parameters.

These parameters are called surface parameters because they can be measured at the surface without downhole measurements. All other drilling parameters affecting ECD are related to one or more of these three parameters, as listed in Table 1. For example, the rate of penetration includes the effects of drill string rotation and downhole pressure, while the surface drillpipe pressure includes the effects of flow geometry, annular pressure losses, fluid rheology, and temperature.

Table 1 Parameters affecting ECD aligned with selected input parameters

More than 3000 data points of the previously mentioned parameters were collected while drilling an 8.5″ vertical hole section.

Data filtration and analyses

To obtain accurate predictions from AI techniques, the data have to be filtered and analyzed. The filtration process starts by removing all values that cannot represent real measurements, such as negative values, 999 values, and nulls (Andagoya et al. 2015; Hemphill et al. 2007; Omosebi et al. 2012). The second filtration step uses a histogram plot to remove outliers. From an engineering point of view, ROP is the most suitable parameter for this step. The initial histogram of ROP values showed that the outliers have ROP values greater than 30 m/h (Fig. 3). After the ROP outliers were removed, the histogram approached a normal distribution, as shown in Fig. 4. These steps constitute the quality check of the collected field measurements.

Fig. 3 ROP histogram before filtering (3000 data points)

Fig. 4 ROP histogram after filtration (2376 data points)
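The two filtration steps can be sketched as a single pass over the records. The example below assumes each record is a dictionary with the hypothetical keys `rop`, `mw`, `dpp`, and `ecd`, treats the value 999 as the placeholder mentioned in the text, and applies the 30 m/h ROP cut-off read from the histogram as a fixed threshold. The field names and the example rows are illustrative only.

```python
def filter_records(records, rop_max=30.0, sentinel=999.0):
    """Drop records with null, NaN, negative, or placeholder values,
    then drop ROP outliers above rop_max (m/h)."""
    kept = []
    for rec in records:
        values = [rec.get(key) for key in ("rop", "mw", "dpp", "ecd")]
        if any(v is None for v in values):
            continue  # null entry
        if any(v != v for v in values):
            continue  # NaN is the only value not equal to itself
        if any(v < 0 or v == sentinel for v in values):
            continue  # negative or placeholder reading
        if rec["rop"] > rop_max:
            continue  # ROP outlier identified from the histogram
        kept.append(rec)
    return kept


# Illustrative rows, not field data.
example_rows = [
    {"rop": 12.0, "mw": 11.0, "dpp": 5500.0, "ecd": 11.4},
    {"rop": -1.0, "mw": 11.0, "dpp": 5500.0, "ecd": 11.4},
    {"rop": 10.0, "mw": 11.0, "dpp": 999.0, "ecd": 11.4},
    {"rop": 10.0, "mw": None, "dpp": 5500.0, "ecd": 11.4},
    {"rop": 45.0, "mw": 11.0, "dpp": 5500.0, "ecd": 11.4},
]
```

Only the first example row survives both steps; the others fail on a negative value, a placeholder, a null, and an ROP outlier, respectively.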

The statistical analysis of the input parameters shows good quality and variation that can be used for accurate prediction of ECD values using AI techniques (Table 2). The rate of penetration ranges from 0.26 to 29.62 m/h while the mud weight ranges from 10.5 to 12.0 lb/gal. The surface drillpipe pressure ranges from 4946 to 6920 psi. The rate of penetration showed the maximum coefficient of variation (0.47) among the input parameters. From Fig. 5, the ECD has a correlation coefficient of 0.04, 0.98 and 0.96 with ROP, MW, and DPP, respectively.

Table 2 Statistical analysis of the input parameters

Fig. 5 The correlation coefficient between individual surface parameters and the measured ECD in the available dataset
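The two statistics quoted above can be computed as follows. This is a minimal sketch that assumes the coefficient of variation is the sample standard deviation divided by the mean and that the correlation coefficient is Pearson's r; the text does not state which estimators were used, so both choices are assumptions.

```python
import math


def coefficient_of_variation(values):
    """Sample standard deviation divided by the mean."""
    n = len(values)
    mean = sum(values) / n
    variance = sum((v - mean) ** 2 for v in values) / (n - 1)
    return math.sqrt(variance) / mean


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)
```

Applied to each input column against the measured ECD, `pearson_r` would give the kind of per-parameter values plotted in Fig. 5.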

Artificial intelligence models

ANN model

A neural network model was created with three layers. After the quality check, the 2376 remaining data points of ROP, MW, and DPP were used as inputs and randomly divided: 70% of the data were used for training the network and 30% for testing the model.
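A random 70/30 split of this kind can be sketched as below. The rounding rule (rounding the training count up) and the random seed are assumptions chosen so that 2376 points divide into 1664 training and 712 testing points; the study does not state how the split was implemented.

```python
import math
import random


def split_indices(n_points, train_fraction=0.7, seed=42):
    """Shuffle the indices 0..n_points-1 and split them into train and test lists."""
    n_train = math.ceil(train_fraction * n_points)  # assumed rounding rule
    indices = list(range(n_points))
    random.Random(seed).shuffle(indices)  # fixed seed for a repeatable split
    return indices[:n_train], indices[n_train:]


train_idx, test_idx = split_indices(2376)
```

Splitting indices rather than the records themselves keeps the ROP, MW, DPP, and ECD values of each point together.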

The ANN model was trained using 1664 data points, and the remaining 712 data points were used for testing; in both cases the calculated ECD was compared with the measured ECD (Fig. 6). The average absolute percentage error (AAPE) was 0.2252% for training and 0.2237% for testing, with correlation coefficients (R) of 0.9971 and 0.9982 for training and testing, respectively. Figure 7 shows the profile of the predicted ECD for both ANN training and testing along the drilled section.

Fig. 6 Predicted ECD for both ANN training and testing compared with field measurement using 1664 data points for training and 712 data points for testing

Fig. 7 ECD profile of both ANN model training and testing compared with actual ECD data

Accurate estimations were obtained for the predicted ECD values, with a high correlation coefficient and a low average absolute percentage error (AAPE). This reflects the high accuracy of the developed model, which predicts ECD with an AAPE of about 0.22% and a correlation coefficient above 0.99 for both the training and testing datasets.
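The error metric reported here can be computed with a few lines. The sketch assumes the common definition of AAPE as the mean of |measured − predicted| / |measured|, expressed in percent; the text does not give the formula explicitly, and the example values are illustrative.

```python
def aape(measured, predicted):
    """Average absolute percentage error, in percent."""
    if len(measured) != len(predicted) or not measured:
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum(abs((m - p) / m) for m, p in zip(measured, predicted))
    return 100.0 * total / len(measured)


# Illustrative values only: each prediction is off by 1% of the measurement.
example_aape = aape([10.0, 20.0], [10.1, 19.8])
```

Because each error is divided by the measured value, the metric is undefined for a measured ECD of zero, which does not occur for physical mud densities.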

ANFIS model

An ANFIS model was developed using five membership functions, with gaussmf as the input membership function and a linear output membership function. As with the ANN model, 1664 data points for training and 712 data points for testing were randomly selected from the whole dataset. Figure 8 shows the training and testing results, which match the measured ECD values well. The ECD–depth profile also showed high accuracy for the training and testing results (Fig. 9), with a correlation coefficient of 0.9985 for both the training and testing datasets and AAPE of 0.2259% and 0.2264% for training and testing, respectively.

Fig. 8 Predicted ECD from ANFIS model (training and testing) compared with field measurement using 1664 data points for training and 712 data points for testing

Fig. 9 ECD profile of both ANFIS model training and testing compared with ECD from field measurement

Empirical correlation

An empirical equation was extracted from the ANN model to calculate the ECD from the surface parameters. The ECD ANN model consists of an input layer of three neurons representing the input parameters (ROP, MW, and DPP), a single hidden layer of 20 neurons, and an output layer of a single neuron representing the ECD. The ANN-based empirical correlation is given by Eq. 2. All the weights and biases associated with Eq. 2 are given in Table 3.

Table 3 The weights and biases of the ECD empirical correlation (Eq. 2)
$${\text{EC}}{{\text{D}}_{\text{n}}}~=\left[ {\mathop \sum \limits_{{i=1}}^{N} {w_{2i}}~\left( {\frac{2}{{1+{{\text{e}}^{ - 2\left( {{w_{{1_{i,1}}}}\left( {{\text{RO}}{{\text{P}}_{\text{n}}}} \right)+{w_{{1_{i,2}}}}~\left( {{\text{M}}{{\text{W}}_{\text{n}}}} \right)+{w_{{1_{i,3}}}}~\left( {{\text{DP}}{{\text{P}}_{\text{n}}}} \right)+{b_{{1_i}}}} \right)~}}}} - 1} \right)~} \right]+{b_2}.$$
(2)

In the above equation, the values of Mwn, ROPn and DPPn are normalized and are calculated from the following equations, respectively:

$$M{w_{\text{n}}}=0.001013\left( {Mw - 4945.85} \right) - 1~,$$
(3)
$${\text{RO}}{{\text{P}}_{\text{n}}}=0.06812\left( {{\text{ROP}} - 0.26} \right) - 1~~,$$
(4)
$${\text{DP}}{{\text{P}}_{\text{n}}}=0.083{\text{~DPP}} - 1,$$
(5)

where ROP is the rate of penetration (m/h), ROPn is the normalized rate of penetration, Mw is the mud weight (lb/ft3), Mwn is the normalized mud weight, DPP is the drillpipe pressure (psi), DPPn is the normalized drillpipe pressure, N is the number of neurons in the hidden layer (N = 30 neurons), i is the index of each neuron in the hidden layer, \({w_{{1_{i,1}}}}\) is the weight between the input layer neuron for ROP and the hidden layer neurons (Table 3), \({w_{{1_{i,2}}}}\) is the weight between the input layer neuron for MW and the hidden layer neurons (Table 3), \({w_{{1_{i,3}}}}\) is the weight between the input layer neuron for DPP and the hidden layer neurons (Table 3), \({b_{{1_i}}}\) is the bias of the hidden layer neurons, \({w_{{2_i}}}\) is the weight between the hidden layer neurons and the output layer neuron representing ECD, and b2 is the bias of the output layer neuron (b2 = − 0.291).
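Equation 4 has the form of a min-max mapping onto [−1, 1], x_n = 2(x − x_min)/(x_max − x_min) − 1. The sketch below assumes this general form and checks it against the ROP range reported in the statistical analysis (0.26 to 29.62 m/h), which reproduces the 0.06812 coefficient of Eq. 4. The function names are hypothetical.

```python
def minmax_slope(x_min, x_max):
    """Multiplier of (x - x_min) in the mapping onto [-1, 1]."""
    return 2.0 / (x_max - x_min)


def minmax_scale(x, x_min, x_max):
    """Map x from [x_min, x_max] onto [-1, 1]."""
    return minmax_slope(x_min, x_max) * (x - x_min) - 1.0


# ROP range reported in the text: 0.26 to 29.62 m/h.
rop_slope = minmax_slope(0.26, 29.62)    # about 0.06812, as in Eq. 4
rop_n_low = minmax_scale(0.26, 0.26, 29.62)    # lower bound maps to -1
rop_n_high = minmax_scale(29.62, 0.26, 29.62)  # upper bound maps to about +1
```

Inputs outside the training range map outside [−1, 1], where the correlation is extrapolating and should be used with caution.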

ECD can be calculated from ECDn using the following equation:

$${\text{ECD}}=\frac{{{\text{EC}}{{\text{D}}_{\text{n}}}+1}}{{0.0689}}+4~,$$
(6)

where ECD is the de-normalized equivalent circulating density (lb/gal) and ECDn is the normalized equivalent circulating density. The predicted ECD for the whole dataset using the presented empirical equation is shown in Fig. 10, with a correlation coefficient of 0.9924 and AAPE of 0.2212%. This indicates the validity of the developed empirical equation and its ability to give the same accuracy as the black-box ANN model. Figure 11 shows the relative importance of each input parameter to the ECD calculated from the developed model. ECD has correlation coefficients of 0.03, 0.98, and 0.96 with ROP, mud weight, and drill pipe pressure, respectively.
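Equations 2 and 6 can be evaluated together as in the sketch below. It takes already-normalized inputs and uses the identity 2/(1 + e^(−2z)) − 1 = tanh(z) only implicitly, keeping the form printed in Eq. 2. The weights and biases must come from Table 3, which is not reproduced here, so every numeric weight shown is a hypothetical placeholder; the function and argument names are also assumptions.

```python
import math


def ecd_from_normalized_inputs(rop_n, mw_n, dpp_n, w1, b1, w2, b2):
    """Evaluate Eq. 2 and de-normalize with Eq. 6.

    w1: list of (w_rop, w_mw, w_dpp) tuples, one per hidden neuron
    b1: list of hidden-layer biases
    w2: list of hidden-to-output weights
    b2: output-layer bias
    """
    ecd_n = b2
    for (w_rop, w_mw, w_dpp), b1_i, w2_i in zip(w1, b1, w2):
        z = w_rop * rop_n + w_mw * mw_n + w_dpp * dpp_n + b1_i
        # Tan-sigmoidal transfer function in the form written in Eq. 2.
        ecd_n += w2_i * (2.0 / (1.0 + math.exp(-2.0 * z)) - 1.0)
    # Eq. 6: convert the normalized output back to ECD.
    return (ecd_n + 1.0) / 0.0689 + 4.0


# Placeholder single-neuron weights for illustration; not the Table 3 values.
example_ecd = ecd_from_normalized_inputs(
    0.5, 0.0, 0.0, w1=[(1.0, 0.0, 0.0)], b1=[0.0], w2=[1.0], b2=0.0
)
```

With the Table 3 values substituted for the placeholders, the function reduces the trained network to a closed-form calculation that needs no neural network software.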

Fig. 10 Predicted ECD using empirical equation (Eq. 6) for the whole dataset (2376 data points)

Fig. 11 The correlation coefficient between individual surface parameters and the calculated ECD using the developed model

Conclusions

In this study, two AI models were developed using ANN and ANFIS to predict ECD from surface measurements (mud weight, surface drillpipe pressure, and rate of penetration) collected while drilling an 8.5″ vertical hole section. The ECD calculated using the developed ANN model has an average absolute percentage error (AAPE) of 0.2252% for training and 0.2237% for testing, with correlation coefficients (R) of 0.9971 and 0.9982 for training and testing, respectively. The ANFIS model predicted the ECD with the same accuracy as the ANN model.

The ANN-based empirical correlation for ECD is presented as an alternative to the black-box ANN model. The developed models can be integrated with automated rig cyber systems to give immediate estimates of the actual ECD, which will help decision-makers carry out well control operations in real time.