A systematic review of data science and machine learning applications to the oil and gas industry

This study offers a detailed review of the roles of data science and machine learning (ML) in different petroleum engineering and geoscience segments such as petroleum exploration, reservoir characterization, oil well drilling, production, and well stimulation, with emphasis on the newly emerging field of unconventional reservoirs. The future of data science and ML in the oil and gas industry, highlighting what is required from ML for better prediction, is also discussed. This study also provides a comprehensive comparison of the different ML techniques used in the oil and gas industry. With the arrival of powerful computers, advanced ML algorithms, and extensive data generation from different industry tools, we see a bright future in developing solutions to the complex problems in the oil and gas industry that were previously beyond the reach of analytical solutions or numerical simulation. ML tools can incorporate every detail in the log data and every piece of information connected to the target data. Despite their limitations, they are not constrained by the limiting assumptions of analytical solutions or by the particular data and/or processing power requirements of numerical simulators. This detailed and comprehensive study can serve as a dedicated reference for ML applications in the industry. Based on the review conducted, ML techniques offer great potential for solving problems in almost all areas of the oil and gas industry involving prediction, classification, and clustering. With the huge volume of data generated in everyday oil and gas industry activities, machine learning and big data handling techniques are becoming a necessity for a more efficient industry.


Introduction
Artificial Intelligence (AI) is the field that integrates computational power with human intelligence to produce smart and reliable solutions to extremely nonlinear and highly complicated problems. It is the field of science that allows computers to think and decide on their own. Machine learning (ML) is a subset of AI that provides statistical tools to explore and analyze big data. ML comprises further subsets such as supervised, unsupervised, and reinforcement learning. Supervised learning is applied when past or labeled data is available for future forecasting by function approximation. Unsupervised learning is applied when labeled data is unavailable and is usually used for clustering purposes. Reinforcement learning trains an agent through rewards and penalties obtained by interacting with an environment; it should not be confused with semi-supervised learning, which combines supervised and unsupervised techniques when only part of the data is labeled.
In the last two decades, engineering journals have reported numerous articles utilizing ML for regression, function approximation, and classification problems. With the development of intelligent oilfields and big data technology, ML methods have gained new vitality for the study of problems in the oilfield development process. With the advent of computing techniques, several correlations utilizing ML have come to the fore, especially in reservoir characterization (Anifowose 2012; Anifowose et al. 2013a, b), reservoir engineering (Al-Marhoun and Osman 2002; Gharbi et al. 1999; Gharbi and Elsharkawy 1999), reservoir geomechanics (Tariq et al. 2017a, b), and many other areas of petroleum engineering.
The question that ML researchers in petroleum engineering face most often is that ML models are usually limited to the dataset on which they were tested, so how can they be generalized to produce more widely applicable correlations? ML applications have common limitations and challenges that hinder the generalization of the created models, such as overfitting, coincidence, excessive training, lack of interpretability of results, and bias. Besides, these models require a large amount of data that is not available in many cases.
Overfitting is considered the most common problem in ML applications. It arises from the lack of an appropriate amount of data for training. To lessen the effect of insufficient data, the ratio of data points to the total number of weights used by the connections (ρ) is used as a check. The coincidence effect is another issue that accompanies supervised learning models: because they try to match a specific dataset, there is a probability of getting a good match by coincidence. This can also happen in other regression analysis techniques, which calls for methods that minimize its occurrence (Livingstone et al. 1997). Overtraining can happen when there is no clear stopping stage for the training. The error may keep decreasing as the model structure, including the weights, is updated. The real risk in that case is that the model becomes complex enough to fit one specific dataset and can no longer generalize. A training methodology named "early stopping" overcomes this by monitoring the training process with a control set: if the error on the control set begins to rise, training is ended. Other techniques are used to save time and effort, such as reinforcement learning with in-stream supervision, for example, generative adversarial networks, which monitor the learning of two competing networks to better understand the model concept (Hossain 2018).
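The ratio ρ and the early-stopping rule described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the function names, the fully connected layer layout, the choice to count bias terms as weights, and the `patience` parameter are not taken from the cited works.

```python
def data_to_weight_ratio(n_samples, layer_sizes, include_biases=True):
    """rho = number of training data points / number of adjustable weights.

    layer_sizes lists the neurons per layer of a fully connected network,
    e.g. [5, 8, 1] for 5 inputs, 8 hidden neurons, and 1 output.
    Counting bias terms as weights is an assumption of this sketch.
    """
    n_weights = 0
    for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:]):
        n_weights += n_in * n_out
        if include_biases:
            n_weights += n_out
    return n_samples / n_weights


def early_stopping_epoch(control_errors, patience=3):
    """Return (best_epoch, stop_epoch) for a sequence of control-set errors.

    Training stops once the control-set error has not improved for
    `patience` consecutive epochs; the weights from best_epoch are kept.
    """
    best_epoch = 0
    best_error = float("inf")
    for epoch, error in enumerate(control_errors):
        if error < best_error:
            best_error, best_epoch = error, epoch
        elif epoch - best_epoch >= patience:
            return best_epoch, epoch
    return best_epoch, len(control_errors) - 1


if __name__ == "__main__":
    print(data_to_weight_ratio(570, [5, 8, 1]))
    print(early_stopping_epoch([0.9, 0.6, 0.5, 0.52, 0.55, 0.6, 0.4]))
```

A low ρ signals that the network has many adjustable weights relative to the available data. In the early-stopping example, the later drop in error is never reached because training has already stopped, which is the intended trade-off of the rule.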
The availability of large datasets is also a concern because it affects the training accuracy and the goodness of the model. If the gathered data is limited, a methodology such as single-shot learning is implemented, in which the AI model is pre-trained on a similar dataset and is then enhanced with experience.
Interpretability is the key to data analysis. AI models are not simple, and in some cases it is impossible to interpret the results even when modeling small linear problems. A single connection in the model does not affect the results on its own; the combined connections do. One of the methods developed to help in that regard is local interpretable model-agnostic explanations (LIME), which tries to detect which parts of the raw data the model depends on most for its estimations. In the generalized additive models method, the separation between model features makes each feature's contribution easier to interpret.
The lack of generalization ability of AI models is a major limitation that delays the widespread adoption of AI in the oil and gas industry. It is hard for many models to be used in circumstances different from those used in building the original model (Virginia 2018). Additional resources have to be utilized each time a new dataset is trained, even if it is similar to previous cases (Ramamoorthy and Yampolskiy 2018). The reusability of ML models is also quite challenging. Usually, models trained on one geological field are less reliable when applied to other geological fields. It is highly recommended to apply a model only when the input parameters of the given dataset lie within the range of the input parameters on which the model was built (Mohaghegh 2017).
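The range recommendation can be enforced with a simple guard before a trained model is applied to new data. The sketch below is only an illustration: the function names and the feature names (`porosity`, `gamma_ray`) are hypothetical, and a per-feature minimum/maximum check is the simplest possible reading of "within the range".

```python
def training_ranges(rows):
    """Per-feature (min, max) over the training rows (a list of dicts)."""
    features = rows[0].keys()
    return {f: (min(r[f] for r in rows), max(r[f] for r in rows)) for f in features}


def out_of_range_features(sample, ranges):
    """Names of the features whose value falls outside the training range."""
    return [f for f, (low, high) in ranges.items() if not low <= sample[f] <= high]


if __name__ == "__main__":
    # Hypothetical training inputs from the field used to build the model.
    training_rows = [
        {"porosity": 0.10, "gamma_ray": 40.0},
        {"porosity": 0.25, "gamma_ray": 95.0},
        {"porosity": 0.18, "gamma_ray": 60.0},
    ]
    ranges = training_ranges(training_rows)
    new_sample = {"porosity": 0.30, "gamma_ray": 50.0}
    flagged = out_of_range_features(new_sample, ranges)
    if flagged:
        print("Model would be extrapolating on:", flagged)
```

A check of this kind does not make the model more general; it only flags the samples for which the prediction would be an extrapolation.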
Lastly, the effect of bias cannot be ignored and is sometimes hard to detect and mitigate. Many researchers address the issues related to AI bias by understanding the model's objective and its associated results. Using model-independent perturbations, in which the inputs are substituted with random values drawn from a normal distribution, helps avoid biases (Samek et al. 2018). Table 1 provides a summary of all limitations of AI and ML models.
Covering every application of AI and ML in the oil and gas industry in a single article is a challenge, so this article focuses on the application of AI and ML in petroleum exploration, drilling, production, stimulation, and reservoir characterization. The issues highlighted in this article include comparing commonly used AI techniques, how AI can be used as a standalone technique, and how to make the AI model generalized. Furthermore, this review highlights the present status of data-driven machine learning predictive models. It also addresses the commonly asked questions related to machine learning and future research.
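One way such model-independent perturbations could be used is sketched below. It is not the procedure of Samek et al. (2018): the function name, the toy model, and the use of the mean absolute change in output as a score are all assumptions made for illustration. The model is treated as a black-box callable, which is what makes the check model-independent.

```python
import random


def perturbation_sensitivity(model, samples, feature_index, mean, std,
                             n_draws=200, seed=0):
    """Mean absolute change in model output when one input is replaced
    by random values drawn from a normal distribution N(mean, std)."""
    rng = random.Random(seed)
    total = 0.0
    for x in samples:
        baseline = model(x)
        for _ in range(n_draws):
            perturbed = list(x)
            perturbed[feature_index] = rng.gauss(mean, std)
            total += abs(model(perturbed) - baseline)
    return total / (len(samples) * n_draws)


def toy_model(x):
    """Hypothetical black-box model that depends only on the first input."""
    return 3.0 * x[0]


if __name__ == "__main__":
    samples = [[0.2, 5.0], [0.4, 7.0], [0.6, 9.0]]
    for index in range(2):
        score = perturbation_sensitivity(toy_model, samples, index, mean=0.0, std=1.0)
        print(f"input {index}: sensitivity {score:.3f}")
```

An input whose perturbation leaves the output unchanged is one the model ignores; an unexpectedly dominant input is a hint that the model may have learned a biased dependence.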

AI as a standalone predictive tool
Should AI be a standalone predictive tool, or should it be combined with analytical models, numerical models, statistical and probabilistic approaches, numerical simulation software, imaging software, etc.? AI applications have recently been attracting more attention as an enabler of state-of-the-art technologies for digitalization among industries, including digital twins (data and physics). Because physics-based models rely on simplifying assumptions to formulate the problem, they can miss part of the physics controlling the processes (Rasheed et al. 2018). Artificial intelligence, in contrast, is like a black box that does not explain the model outcomes. Since data contains both the known and the unknown parts of the physics, data-driven models can incorporate the full physics behind the data. However, the black-box nature has inhibited these models from prevailing in critical systems that have a culture of zero error tolerance, such as field operations in oil and gas. With the increased number of applications with proven concepts in oil and gas, industry leaders are now emphasizing the potential of these applications in optimizing operations such as predictive maintenance. AI algorithms and statistical models have advanced significantly, leading to computers taking over creative tasks such as drawing, text summarization, translation, and even language interpretation. Deep learning, or deep neural networks, has been used for image classification. Restricted Boltzmann machines are stochastic networks that learn the distribution of their inputs in a supervised or unsupervised way; thus, they can be used as powerful tools for detecting anomalies (Evangelatosorn and Payne 2016). Table 2 provides a summary of the comparison between physics-based and data-driven models.
To benefit from the advantages of both physics-based and AI models, hybrid models that retain the interpretability and reliable mathematical concepts of physics-based models are proposed. Rasheed et al. (2020) explained "digital twin" models and gave the example of Kongsberg's dynamic digital twin for oil and gas. The model linked process schematics and virtual 3D graphics of an oil and gas production facility to real-time sensor data, on which the data-driven models are used. It could also be provided with synthetic data generated from simulators, representing the physics part. Humans can interact with the model using an avatar to give expert opinions. Finally, the model can also run "what if?" scenarios using digital siblings, which are copies of the physical asset. Al-Hajri et al. (2020) suggested coupling machine learning and probabilistic models to predict scale formation and plan for inhibition. The coupled model was able to predict the scale, quantify the goodness of the data-driven models, and then quantify the savings. Sun et al. (2019) used a coupled model of ML image processing techniques and reservoir simulation to achieve better reservoir characterization and overcome the drawbacks of classical methods. Shahkarami et al. (2014) used a Surrogate Reservoir Model (SRM) to assist the reservoir history matching process. Table 3 shows a summary of the hybrid models applied to the oil and gas industry.
Based on the review of the coupled models, it is evident that coupling with other methods such as analytical models, numerical models, statistical and probabilistic approaches, and imaging software is beneficial for more robust, accurate, and unbiased models. Figure 1 shows how the different models should interact to achieve the desired goals.

What is needed from AI in the oil and gas industry?
Many oil and gas industry giants are currently applying AI in their operations. Advances in AI have made it suitable for several applications, such as precise drilling and automation, saving oil and gas producers time and money. These advances are going to serve different aspects of the oil and gas industry, such as:

Table 1 Summary of the limitations of AI and ML models
Limitation | Cause | Solution | References
Overfitting | Lack of an appropriate amount of data to be used for training | Using the ratio of input data points to the total number of network weights used by the connections (ρ) | Andrea and Kalayeh (1991); Livingstone et al. (1997)
Coincidence | Getting a good match by coincidence for a specific dataset | Using discriminant analysis | Livingstone and Manallack (1993)
Overtraining | The error keeps decreasing as the model structure is updated, and the model can become more complex to fit a specific dataset | A training methodology named "early stopping"; reinforcement learning with in-stream supervision, for example, generative adversarial networks | Hossain (2018)
Data availability | Sometimes the gathered data is limited | Single-shot learning, in which the AI model is pre-trained on a similar dataset and then enhanced with experience | Weyrauch and Herstatt (2016)
Interpretability | The single connections in the model do not affect the results alone; the combined connections of the whole model do | Local interpretable model-agnostic explanations; the generalized additive models method | Shabbir et al. (2018); Bas (2016)
Generalization | Model failure in circumstances different from those used in building the original model | Additional resources are to be utilized for training new datasets | Virginia (2018); Ramamoorthy and Yampolskiy (2018)
Bias | The nature of black-box models makes them prone to biases | Using model-independent perturbations | Samek et al. (2018)

Precise drilling
Drilling activities are always accompanied by high risk and a high level of uncertainty. AI techniques coupled with the big data recorded in real time by smart sensors mounted on drill strings, such as pressure, temperature, and seismic measurements, can be used to overcome these challenges. Precise drilling using AI can enhance control of the rate of penetration and identify risks in advance.

Production optimization
Every oil and gas company focuses on production optimization and efficiency, which eventually increase profits. AI supports this through automated pattern recognition and classification that prepare production data for generating analytics. Estimation and prediction models can then be built based on the refined data. AI can also isolate the effects of the reservoir from the production control responses such as gas lift rates, choke openings, network routing, and artificial lift methods.

Table 2 (fragment) Limitations of data-driven models: black-box nature and interpretability issues; cannot detect errors or uncertainties; affected by bias in data; not easy to generalize; data availability is the main concern; it is an approximation; lower performance outside the scope of the training data; hard to predict critical conditions or extremes.

Reservoir management
Multiple teams from several disciplines, such as seismic, geology, reservoir, and production engineering, are required to collaborate to achieve better reservoir management. AI models can be trained with historical data from seismic surveys, geological descriptions, and production methodologies and can then be applied to the characterization or modeling of reservoirs and to field monitoring.

Inspections
Frequent inspections are scheduled to detect abnormal equipment performance and so prevent equipment failures and potential accidents. That is why companies are looking for automated and smart detection approaches. Robots driven by AI models can help investigate abnormal equipment behavior by identifying anomalies using techniques such as pattern recognition. Besides, drones can inspect pipelines and offshore facilities and detect cracks or leaks in real time. They can also help in case of an emergency, such as a gas leak. In certain situations, these robots can intervene and carry out the procedure that applies to the case, which will improve the company's safety measures.

Chatbots
AI-powered chatbots can help engineers and scientists by digging into a database or archive of historical data, suggesting possible solutions to problems, providing correct standards of job execution, or helping to teach junior staff using natural language processing. Jacobs (2019) discussed three newly released chatbots in the oil and gas industry: Sandy, Nesh, and Ralphie. They are designed to answer oil and gas professionals' complex questions. Also called virtual assistants, they use AI-based natural language processing (NLP), a technology that quickly entered the market through the tech giants Amazon, Apple, and Google and enabled many millions of people to engage in dialogue with laptops, smartphones, and speakers.

Facilities monitoring
Intelligent cameras can reduce potential damage by detecting hazardous activities such as smoking in dangerous areas. They can be trained using photos and recordings of dangerous activities to alert the staff or take predefined actions. Moreover, they can detect whether employees are wearing their personal protective equipment (PPE). Using this approach will help enhance safety management.

Commonly used machine learning techniques in oil and gas industry
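Several ML techniques, such as artificial neural networks (ANN), fuzzy logic (FL), support vector machines (SVM), decision trees (DT), random forests (RF), k-nearest neighbors (KNN), recurrent neural networks (RNN), convolutional neural networks (CNN), and fuzzy C-means clustering, are widely used in different applications of oil and gas. Table 4 summarizes some of the algorithms with their advantages and disadvantages.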

Exploration and geosciences
The applications discussed here include fault and salt-body delineation, petrofacies classification, and well correlation. We also discuss potential further development in emerging applications.

Fault and salt-body delineation
Accurate fault detection and delineation of the salt-body boundary from 3D seismic data are essential for building a realistic 3D reservoir model (Bahorich and Farmer 1995a, b; Melville and Guruswamy 2002). Seismic attribute analysis has traditionally been used to map faults and salt bodies. Examples of such attributes include semblance (Marfurt et al. 1998), coherence (Bahorich and Farmer 1995a, b; Qi et al. 2017), edge detection (Di and Gao 2014), and seismic curvature (Di and Gao 2016; Somasundaram et al. 2017). Due to the complex geology and the noise level frequently encountered in 3D seismic data, multiple seismic attributes are frequently needed to detect faults or salt-body geometry (Di and Gao 2014; Huang et al. 2017; Zhao et al. 2015). Support vector machine (SVM) is one common algorithm that has been used by several studies, particularly for fault detection (Guitton et al. 2017). In this case, correlation and cluster analysis are used to select the seismic attributes that best identify faults from seismic data. SVM can improve the accuracy and efficiency of fault detection, especially for large-scale faults (Zou et al. 2019). Nevertheless, some researchers have pointed out two main shortcomings of the SVM attributes-based approach (Xiong et al. 2018). Firstly, it requires attributes precomputed by experienced interpreters to map the faults, which can be labor-intensive because this step has to be repeated for each dataset. Secondly, the SVM attributes-based approach can fail in zones of weak reflections, as highlighted in Fig. 2. This can be critical for heavily faulted zones and salt-body delineation because of the weak signal frequently associated with them. Recent studies (Di and Gao 2014, 2016; Tschannen et al. 2020; Xiong et al. 2018) have shown that deep learning technologies such as convolutional neural networks (CNN) can help overcome these two shortcomings and map complex geological structures and features.
An example of the improved performance of CNN versus SVM is demonstrated in Fig. 2. In the CNN approach, the network is trained on annotated seismic images in which faults or salt-body boundaries are labeled, relying more on the reflection patterns and reducing the effect of seismic noise or processing artifacts (Di and Gao 2016; Xiong et al. 2018). Additionally, the relationship between seismic reflection patterns and the target faults or salt bodies is constructed from the original seismic amplitude, eliminating the need for precomputed attributes (Di and Gao 2014).

Fig. 2 Comparison of salt-body boundary delineation using the traditional multi-attribute-based support vector machine (SVM; second row) and convolutional neural network (CNN; third row) in three different inline sections (modified after Di et al. 2018). The reference manually labeled sections are shown in the first row. The seismic sections were extracted from the synthetic SEG-SEAM dataset. Poor detection of the boundary in the SVM results is highlighted by red circles

Petrofacies classification and fractures identification
Reservoir rocks can be classified and grouped based on their reservoir quality. Such classification can be based on petrophysical rock properties (e.g., porosity, permeability, and pore size) and geological features (e.g., textures, diagenetic overprints, and pore types). Petrofacies are usually defined by combining petrophysical and geological attributes and can be an essential tool for reservoir characterization (Avseth and Mukerji 2002). Petrofacies classification is frequently done using both core samples and wireline log data. Cores are often not available from all wells because of the time and cost involved, and thus several studies (Bhattacharya and Mishra 2018; Qi and Carr 2006; Sebtosheikh and Salehi 2015) have examined how machine learning algorithms can be trained on data obtained from cored wells and then used to perform petrofacies classification in other un-cored wells. Petrofacies labels, defined as a function of depth based on the integration of well-log and core data, are used to train the models (Sebtosheikh and Salehi 2015; Silva et al. 2015). The logs used for facies identification are usually gamma ray (GR), resistivity (Rt), neutron (NPHI), density (RHOB), and the photoelectric factor (PEF), which indicates lithology. In addition, other features can be extracted from these logs to improve the prediction, such as total organic carbon (TOC), matrix grain density (RHOMAA), and apparent volumetric cross-section (UMA). Earlier studies have used ANN, SVM, and RF to classify petrofacies from well logs in both sandstone and carbonate reservoirs (Silva et al. 2015; Al-Anazi and Gates 2010; Martinelli et al. 2013; Salehi and Honarvar 2014). Nevertheless, more recent studies have suggested that the Gradient Boosting (GB) algorithm outperforms ANN and SVM, especially when a limited number of features is available (Silva et al. 2015).
Another algorithm that has shown success is Random Forest (RF), which reduces the computational time of the training phase compared to GB (Bhattacharya and Mishra 2018). Based on the existing literature, there seems to be no consensus regarding the most suitable machine learning technique for petrofacies classification. This could be due to several factors, including the wide variations in the features selected or the data available, as well as differences in geological complexity and reservoir heterogeneity. Indeed, as pointed out by Silva et al. (2015), the applicability of the various algorithms has to be tested for each training/testing dataset. One major challenge that remains for the success of machine learning in this application is selecting the right petrophysical and geological attributes to distinguish between facies. Such tasks remain mainly subjective and far from being automated.
Fractures and facies are usually identified through personal judgment based on field log and laboratory core analysis data. Recently, AI has been used to identify fractures and facies in unconventional formations. Tian and Daigle (2019) identified micro-fractures and organic matter in siliceous and carbonate-rich shale samples and used AI to find the association between them. The aim was to automate the interpretation of micro-fractures in shale samples, making it faster and free of personal evaluation. Scanning electron microscopy (SEM) and energy-dispersive X-ray spectroscopy (EDS) images were used to find fractures and organic matter in intact and deformed samples. The single-shot detector (SSD) deep learning approach was trained on the data obtained from the images. Around 97% of the fractures in intact samples and 92% in deformed ones were identified using SSD. Detected organic matter images were also overlaid on the detected fractures to find the associations. The clear majority of micro-fractures were found to penetrate the organic matter (OM) and clay minerals. According to the study, the combination of soft OM and clay with brittle materials (quartz and calcite) seems to enhance fracability.
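The train-on-cored-wells, predict-on-un-cored-wells workflow can be illustrated with a deliberately simple classifier. The sketch below uses k-nearest neighbors rather than the ANN, SVM, RF, or GB models of the cited studies, purely because it fits in a few lines of standard-library Python. The log values, facies names, and function names are hypothetical.

```python
import math
from collections import Counter


def fit_scaler(rows):
    """Per-feature mean and standard deviation of the training logs."""
    n = len(rows)
    columns = list(zip(*rows))
    means = [sum(col) / n for col in columns]
    # A constant feature gets a scale of 1.0 to avoid division by zero.
    stds = [math.sqrt(sum((v - m) ** 2 for v in col) / n) or 1.0
            for col, m in zip(columns, means)]
    return means, stds


def scale(row, means, stds):
    """Standardize one sample so that all logs carry comparable weight."""
    return [(v - m) / s for v, m, s in zip(row, means, stds)]


def knn_predict(train_rows, train_labels, query_row, k=3):
    """Majority facies label among the k nearest cored-well samples."""
    means, stds = fit_scaler(train_rows)
    scaled_train = [scale(r, means, stds) for r in train_rows]
    query = scale(query_row, means, stds)
    ranked = sorted(
        (math.dist(r, query), label) for r, label in zip(scaled_train, train_labels)
    )
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


if __name__ == "__main__":
    # Hypothetical cored-well samples: [GR (API), RHOB (g/cc), NPHI (v/v)]
    cored_logs = [
        [30.0, 2.30, 0.22], [35.0, 2.32, 0.20], [28.0, 2.28, 0.24],
        [110.0, 2.55, 0.33], [120.0, 2.58, 0.35], [105.0, 2.52, 0.31],
    ]
    cored_facies = ["sand", "sand", "sand", "shale", "shale", "shale"]
    uncored_sample = [32.0, 2.31, 0.21]
    print(knn_predict(cored_logs, cored_facies, uncored_sample))
```

Standardizing the logs matters here because gamma ray values are two orders of magnitude larger than density or neutron values and would otherwise dominate the distance. As the text notes, which algorithm and which features work best has to be tested for each dataset.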

Well correlation
Correlating different reservoir units and formation tops across different wells is essential in reservoir characterization and modeling. Such a task may require significant time from experienced geologists, especially in large fields with hundreds of wells. The use of machine learning to handle this issue was recognized many years ago (Luthi and Bryant 1997). An interpreter first has to pick formation tops and perform well correlations in several wells, which are then used as a training dataset to perform interpretation in tens to hundreds of other wells. A growing body of studies (Maniar et al. 2018; Zheng et al. 2019) has demonstrated that a deep convolutional neural network (CNN) can provide an accurate and efficient approach for well-log correlations. The most common log data used for the correlation includes gamma ray and resistivity, although any other geophysical well-log data with sufficient log character can be used. One crucial observation documented by Zheng et al. (2019) was the drastic reduction in prediction accuracy as the size of the training dataset, in number and percentage, decreases. This might be explained by the complexity of the geology, which requires training wells covering the different depositional environments and stratigraphic sequences throughout a field.
To produce a "universal" model for well correlation, Brazell et al. (2019) developed a deep CNN architecture trained on five million data points derived from thousands of well logs and experienced interpreters' correlations. The data was obtained from various depositional environments and basins within the USA. The authors implemented a 3D search logic to determine the marker propagation pathway and the optimum correlation. The model does require some interpreted-top examples from the specific dataset to account for the particular geological complexity of a given area. Nevertheless, there is no need for an extensive training dataset from the specific field because of the rich dataset used to build the model. The model provided an accuracy of around 96% on the testing dataset. It is important to note that more interpreted examples might be needed for training if the model is to be applied to a dataset outside the USA with very different and complex regional geology. Another potential consideration is incorporating seismic sequence stratigraphy into the workflow, which currently relies only on well-log data. This can be important, especially for pinching-out strata and faulted reservoirs, where the spatial continuation of a given unit might be heterogeneous.
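The idea of propagating an interpreter's pick from one well to another can be illustrated with a much simpler classical baseline than the CNN models discussed above: sliding-window correlation of a gamma-ray log. This is a swapped-in technique for illustration only, not the method of Brazell et al. (2019) or Zheng et al. (2019); the function names, the window size, and the synthetic log values are assumptions.

```python
import math


def pearson(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0
    return cov / math.sqrt(var_a * var_b)


def propagate_top(ref_log, ref_top, target_log, half_window=3):
    """Find the sample in target_log whose surrounding log character best
    matches the window around the picked top (index ref_top) in ref_log.

    Returns (best_index, best_correlation).
    """
    if ref_top - half_window < 0 or ref_top + half_window >= len(ref_log):
        raise ValueError("window around the reference top leaves the log")
    template = ref_log[ref_top - half_window: ref_top + half_window + 1]
    best_index, best_score = None, -2.0
    for center in range(half_window, len(target_log) - half_window):
        window = target_log[center - half_window: center + half_window + 1]
        score = pearson(template, window)
        if score > best_score:
            best_index, best_score = center, score
    return best_index, best_score


if __name__ == "__main__":
    # Synthetic gamma-ray logs: the target well shows the same marker
    # three samples deeper than the reference well.
    reference = [50, 52, 51, 55, 90, 120, 95, 60, 48, 45, 47, 50]
    target = [49, 51, 50] + reference
    print(propagate_top(reference, 5, target))
```

A baseline like this breaks down when units thicken, thin, or pinch out between wells, which is part of the motivation for the learned approaches described above.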

Reservoir characterization
Machine learning has an increasing number of applications in the field of geosciences; here we focus on applications directly related to reservoir characterization in the oil and gas industry. The areas discussed are the prediction of petrophysical properties from seismic, core, and well-log data, together with the prediction of other properties such as water saturation, petroleum geochemical parameters, and reservoir geomechanics.

Petrophysical properties prediction
Reservoir characterization plays a critical role in the oil and gas industry, for example in developing optimal production and reservoir management strategies. Permeability, which determines the ability and direction of oil flow, is central in reservoir characterization. An accurate permeability determination is essential for material balance calculations, reservoir flow simulation, estimating oil production rate, stimulation strategies, and enhancing oil recovery. However, permeability is very difficult to determine due to its complexity and highly nonlinear nature. Therefore, machine learning techniques are widely used to predict petrophysical parameters such as porosity, permeability, capillary pressure, relative permeability, and bulk density. Table 5 shows a summary of the studies that predicted porosity and permeability.

Water saturation prediction
Water saturation is the fraction of pore space occupied by water. Estimating it accurately is considered a difficult task in petroleum engineering. In fact, very few empirical models exist to predict water saturation directly from petrophysical well logs. Nevertheless, water saturation is an essential parameter in petrophysics and reservoir engineering calculations such as material balance calculations, simulation model optimizations, history matching, and oil and gas reserves estimation. In 1942, Archie was the first to present an equation to determine water saturation in a clean, clay-free reservoir. Several researchers have tried to deconvolute the water distribution in composite formations by formulating empirical correlations that depend on log-derived data, which is not a very precise representation. Hence, no consensus exists among log analysts about which model can be universally used. The most commonly utilized models/correlations are the Simandoux (Simandoux 1963), Fertl and Hammack (Fertl and Hammack 1971), and Waxman and Smits models; however, the variables involved in each contain inherent uncertainties and can eventually lead to misleading results. Determining water saturation in the laboratory is a time-consuming and challenging task. Therefore, AI and ML techniques have widely been used to predict water saturation. Table 6 summarizes how water saturation has been predicted using machine learning algorithms. Most of the presented research integrates well-log and core data to predict water saturation.
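For reference, Archie's equation for a clean formation is commonly written as Sw = ((a · Rw) / (φ^m · Rt))^(1/n), where Rt is the true formation resistivity, Rw the formation water resistivity, φ the porosity, a the tortuosity factor, m the cementation exponent, and n the saturation exponent. The sketch below evaluates it; the function name and the default values a = 1, m = 2, n = 2 are common textbook assumptions, not values taken from this review.

```python
def archie_water_saturation(rt, rw, porosity, a=1.0, m=2.0, n=2.0):
    """Water saturation (fraction) from Archie's equation for clean formations.

    rt: true formation resistivity (ohm-m)
    rw: formation water resistivity (ohm-m)
    porosity: fractional porosity, 0 < porosity <= 1
    a, m, n: tortuosity factor, cementation and saturation exponents
             (the defaults are assumed typical values)
    """
    if rt <= 0 or rw <= 0 or not 0 < porosity <= 1:
        raise ValueError("rt and rw must be positive and porosity in (0, 1]")
    sw = ((a * rw) / (porosity ** m * rt)) ** (1.0 / n)
    # Saturation is a fraction of pore space, so cap it at 1.
    return min(sw, 1.0)


if __name__ == "__main__":
    print(archie_water_saturation(rt=20.0, rw=0.05, porosity=0.20))
```

In data-driven studies, an equation of this kind typically serves as the baseline against which an ML model trained on logs and core data is compared.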

Geomechanics
A better estimation of the elastic and failure properties of the reservoir rock is instrumental in minimizing wellbore instability problems, avoiding differential sticking, improving hole cleaning, improving casing placement, improving hydraulic fracturing operations, minimizing subsidence, and more (Khamidy et al. 2019). Carrying out mechanical rock tests such as triaxial compression, uniaxial compression, scratch, and impulse hammer tests is an accurate way to determine these properties. These tests are usually carried out on downhole samples retrieved from a depth of interest. In the absence of core samples and well-log data, analytical and empirical models are used to determine rock mechanical properties. In the last two decades, the prediction of rock mechanical properties using AI tools has been thoroughly investigated. Table 7 summarizes selected work relating core mechanical properties to well-log data using ML tools. Most of the work has utilized ANN, ANFIS, SVR, DT, and RF.

Drilling and completions
Drilling operations for oil and gas reservoirs are usually expensive. Hence, several approaches are utilized to reduce the operational cost, mainly by improving the drilling efficiency and reducing the drilling time. Usually, the drilling performance is improved by selecting proper drilling fluids, improving cementing jobs, maximizing the drilling rate of penetration, and minimizing the required drilling energy (Bilgesu et al. 1997;Dupriest

Drilling performance prediction
Several analytical models were developed to evaluate and optimize the drilling performance; however, most of these models were developed based on weak assumptions, which reduces their reliability (Aadnoy et al. 2010; Reiber et al. 1999). Besides, numerical approaches such as finite element methods were also utilized for evaluating the drilling performance by estimating the ROP (Yang et al. 2008). Usually, the depth is divided into individual sections, mainly based on rock lithology. Thereafter, the drilling performance is estimated based on the forces acting in each section (Bourgoyne and Young 1999; Murray and Cunningham 1955). The numerical approaches thus improved the prediction performance, but their main issue is computational speed, which limits their application, especially for real-time operations.
In the last decade, artificial intelligence (AI) techniques have emerged as practical tools for prediction purposes and, therefore, have been widely applied in the oil and gas industry (Barbosa et al. 2019; Rolon et al. 2009; Sun and Ertekin 2020; Van Si and Chon 2018). ANN is the most used AI technique because an empirical correlation can be extracted from the optimized ANN model. Hence, numerous ANN models were developed for real-time applications, such as estimating the ROP and the drilling performance (Ahmed et al. 2019; Arabjamaloei and Shadizadeh 2011). Gidh et al. (2012) improved the drilling performance by predicting and managing the bit wear utilizing an artificial neural network technique. A new ANN-based system was developed to predict the bit performance at different ROP values. They mentioned that the developed approach could be used to determine the optimum range for the surface drilling parameters (such as revolutions per minute (RPM), torque, and weight on bit (WOB)) to extend the lifetime of the drilling bit. Their developed approach was tested in several field operations, and successful results were reported.
Evangelatos and Payne (2016) presented an advanced model to describe the motion and dynamics of the bottomhole assembly (BHA). The developed BHA model was coupled with neural network analysis to estimate the drilling performance for different conditions of WOB and RPM. They reported that the coupled model could consider the forces acting on the BHA and thereby provide very accurate predictions of the ROP profile over a wide range of BHA conditions. Barbosa et al. (2019) presented an extensive review of ROP modeling and prediction using machine learning techniques. They classified the ROP models into conventional (physics-based) models, statistical (regression) models, and machine learning (data-driven) models. Based on their review, machine learning techniques can outperform the other ROP models and provide very reasonable ROP predictions. However, the reliability of ROP prediction depends mostly on two factors: the type of AI method and the inputs used for ROP prediction. They also concluded that there is a lack of field implementations of AI techniques in the oil and gas industry. They attributed these limited field applications to the difficulty of selecting the input parameters and suitable AI models, and to the fact that humans are usually resistant to change. Figures 3 and 4 show the most common AI techniques and input parameters used to predict the ROP, respectively. ANN is the most common AI technique for developing ROP models among the reviewed works, while the most common inputs are the weight on bit (WOB) and RPM. ANN also tends to be the most common machine learning method when it comes to dealing with large data sets. The main reason is the availability and the relatively simple structure and layout of ANN models compared with other machine learning models.
Finally, they recommended that downhole parameters such as nozzle size, drill bit wear, and rock strength should be considered as inputs for predicting the ROP profiles. Ahmed et al. (2019) presented a comparative study of predicting ROP using several intelligence techniques. ROP was predicted for two wells using extreme learning machine, ANN, and SVR techniques. They selected the input parameters for the ROP models based on the specific energy concept. The ROP was predicted for more than 8800 data points based on the RPM, WOB, torque, depth, mud weight, flow rate, nozzle sizes, and standpipe pressure (SPP). They reported that all ROP models showed acceptable prediction performance with a correlation coefficient higher than 0.70 for the testing data. However, among all tested techniques, support vector regression showed the best ROP estimation with a correlation coefficient of 0.94. Mehrad et al. (2020) used a machine learning approach to develop a rigorous ROP model for vertical wells. They used different parameters to determine the ROP, including logging, drilling, and geomechanical parameters. They found that the best ROP prediction can be obtained by using the uniaxial compressive strength (UCS), mud flow rate, weight on bit (WOB), depth, mud density (MD), and revolutions per minute (RPM) as input parameters. After that, they combined least-squares support vector machines (LSSVM) with different optimization algorithms to estimate the ROP profile. The examined optimization algorithms were genetic algorithms (GA), particle swarm optimization (PSO), and the cuckoo optimization algorithm (COA). The LSSVM-GA, LSSVM-PSO, and LSSVM-COA hybrid algorithms were used to predict the ROP for two vertical wells, and more than 2000 data points were used to train and test the hybrid models. LSSVM-COA showed the best prediction performance for training and testing wells among all tested algorithms, and an R-square of around 0.802 was achieved.
Fig. 4 The common inputs used for predicting the ROP based on machine learning techniques, considering 53 ROP works (Barbosa et al. 2019)
Artificial intelligence has proved to be an effective approach for estimating the drilling performance, and accurate ROP profiles can be predicted. However, there is a noticeable lack of implementation of these techniques in real-time operations, especially for gas wells. Also, most of the available ANN-based models were developed to predict the ROP for a certain section, usually the reservoir section. No attempt was reported to predict the full ROP profile using the ANN technique. Predicting the complete ROP profile in real time can significantly improve the drilling performance and reduce the operational time and cost. Furthermore, coupling different drilling efficiency indicators can help improve drilling operations by considering more than one parameter. For example, ROP models can be coupled with the mechanical specific energy (MSE) concept to determine the best drilling conditions in terms of drilling time (ROP) and required drilling energy (MSE). Hassan et al. (2018) coupled torque modeling with the MSE to optimize the drilling performance. First, artificial intelligence techniques were used to predict the torque and ROP profiles for around 18000 ft. Then, the MSE was calculated for the whole drilling section using the surface drilling parameters. After that, the MSE was coupled with the torque and ROP profiles to identify the optimum drilling conditions that maximize the ROP and minimize the required drilling energy (MSE). They mentioned that the developed approach would enable drilling engineers to evaluate and optimize the drilling performance in real-time applications; hence, the surface drilling parameters can be controlled to keep the drilling operations within the optimum conditions.
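The coupling of predicted torque and ROP with the MSE can be illustrated with a minimal Python sketch. It assumes the widely quoted Teale form of the MSE in field units, MSE = WOB/A + 120 π N T / (A ROP), with WOB in lbf, bit area A in in², rotary speed N in rev/min, torque T in ft-lbf, and ROP in ft/h; whether the reviewed study used exactly this form is an assumption. The function names and the candidate operating points are hypothetical, and in practice the torque and ROP values would come from the trained AI models.

```python
import math


def mechanical_specific_energy(wob_lbf, rpm, torque_ftlbf, rop_ft_hr, bit_diameter_in):
    """Teale-type MSE in psi (assumed form, field units)."""
    bit_area = math.pi * bit_diameter_in ** 2 / 4.0  # in^2
    thrust_term = wob_lbf / bit_area
    rotary_term = 120.0 * math.pi * rpm * torque_ftlbf / (bit_area * rop_ft_hr)
    return thrust_term + rotary_term


def lowest_energy_point(candidates, bit_diameter_in):
    """Return the candidate operating point with the lowest MSE."""
    return min(
        candidates,
        key=lambda c: mechanical_specific_energy(
            c["wob_lbf"], c["rpm"], c["torque_ftlbf"], c["rop_ft_hr"], bit_diameter_in
        ),
    )


# Hypothetical operating points; torque and ROP stand in for model predictions.
candidates = [
    {"wob_lbf": 20000.0, "rpm": 100.0, "torque_ftlbf": 5000.0, "rop_ft_hr": 50.0},
    {"wob_lbf": 25000.0, "rpm": 120.0, "torque_ftlbf": 5500.0, "rop_ft_hr": 80.0},
    {"wob_lbf": 30000.0, "rpm": 140.0, "torque_ftlbf": 7000.0, "rop_ft_hr": 85.0},
]
best = lowest_energy_point(candidates, bit_diameter_in=8.5)
```

In this toy case the third point drills slightly faster but at a much higher energy cost, which is the kind of trade-off the combined ROP and MSE screening is meant to expose.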
Besides, AI techniques were used to predict several drilling problems, such as lost circulation, one of the most common drilling problems, which can increase the overall drilling cost by around 25-40%. Solomon et al. (2017) developed a new ANN model to identify lost circulation zones. The developed model can also recommend suitable sizes of lost circulation materials based on the characteristics of the depleted zones. They used 30 case studies to train and validate the developed ANN model. They mentioned that the ANN model showed a very acceptable prediction performance, and a coefficient of determination of 0.8 was obtained. Besides, they compared the reliability of the developed model with different fracture predictive models and concluded that the developed ANN model could reduce the estimation error from around 26% to less than 16%. Manshad et al. (2017) used an SVM and a radial basis function to assess lost circulation problems for 30 oil wells. They reported that the SVM showed high performance in predicting the amount of lost circulation material required to overcome the thief zones. A coefficient of determination of 0.8 was obtained between the predicted results and actual field data. In comparison, the radial basis function was able to estimate the mitigation of lost circulation problems with an accuracy of 78.3%.
Al-Hameedi et al. (2018) estimated the volume of lost circulation materials for 500 wells using machine learning techniques. They predicted the volume of fluid losses based on the profiles of mud weight, bit nozzle sizes, ROP, equivalent circulation density (ECD), plastic viscosity (PV), and WOB. They reported that the machine learning models were able to predict the volume of fluid losses with very acceptable error for different types of mud loss, including partial, seepage, severe, and total mud losses. Alkinani et al. (2020) used an ANN technique to predict the volume of drilling fluid losses while drilling fractured zones. They developed and validated the ANN model using 1500 wells. The lost circulation volume was determined based on the profiles of mud flow rate, yield point (YP), PV, ECD, bit nozzle sizes, RPM, and WOB. They reported that the ANN model was able to predict the loss of circulation with a coefficient of determination higher than 0.92. Abbas et al. (2019) applied SVM and ANN techniques to estimate the severity of loss of circulation while drilling. They used 1120 case studies from 385 wells to train and validate the new AI models for different types of mud losses such as seepage, partial, severe, and total fluid losses. They used the rock lithology, mud properties, and surface drilling parameters to predict the severity of loss of circulation. They reported that the developed ANN model was able to estimate the fluid loss with a correlation coefficient higher than 0.82. The SVM model showed better prediction performance than the ANN model, with a correlation coefficient higher than 0.91.
Overall, different AI techniques were utilized to estimate lost circulation problems. ANN and SVM are the common AI tools used for this purpose. Good practical performance was reported for predicting lost circulation based on the mud properties, rock lithology, and drilling parameters. However, the application of these models in real-time operations might be restricted because the huge volume of drilling data can lead to misleading results or delay the model prediction. Therefore, proper data cleaning could be required to improve the data quality and reduce the data size for real-time applications.

Drilling fluids
Drilling is one of the most critical tasks, with challenges including lost circulation, clogged pipes, wellbore instability, and kicks occurring regularly. Drilling fluid, sometimes known as the "blood of the drill," is a direct or indirect remedy to these challenges during the drilling process. It helps to keep the wellbore clean and retain the wellbore's integrity. For instance, high mud weight controls high wellbore pressures and prevents kicks. On the other hand, high mud weight has a tendency to fracture the formation. Similarly, low mud weight prevents fractures but can cause a kick or blowout. Furthermore, drilling fluids prevent the pipe from sticking during drilling by building a thin filter cake on the wellbore wall as well as by removing drill cuttings from the wellbore. The drilling fluid works as an architect for the wellbore. The operation's success or failure is largely determined by the drilling fluid's performance and compatibility (Agwu et al. 2018). Many drilling issues can be avoided by using the proper drilling fluids. Drilling fluids are always chosen based on data analysis and expertise gained from previously drilled wells in the area. Each well design includes a drilling fluid program that specifies the drilling fluid, additives, rheology, density, filtration, and other drilling fluid parameters. Combating wellbore difficulties involves comprehensive analysis and decision-making to design a drilling fluid that satisfies the specific needs of distinct formation features.
The majority of drilling fluid design is done in the laboratory through trial and error. Hence, a system that can use existing data and provide a deeper knowledge of drilling fluid is required. Machine learning models are created using the parameters of drilling fluids and the downhole circumstances. These models aid in forecasting changes in drilling fluid parameters and recommend the optimum course of action. Rheological models express a mathematical relationship between the shear rate and the shear stress to describe the fluid flow behavior. This relationship is complicated in the case of drilling fluids. However, no single rheological model can accurately fit all drilling fluids' shear stress-shear rate data across all shear rate ranges. Instead, a plethora of mathematical models with varying degrees of relevance has been utilized. These mathematical models do not precisely capture the behavior of non-Newtonian fluids. For instance, the Bingham plastic model does not describe the drilling fluid flow behavior at a low shear rate. Further, it overestimates the yield point of the drilling fluid. The power-law model does not account for the yield point of drilling fluids. There are challenges in performing hydraulic calculations due to many rheological parameters involved in the case of the Herschel-Bulkley model (Huang et al. 2020).
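The three rheological models mentioned above can be written compactly to show how they differ. The Python sketch below uses the standard forms: Bingham plastic, τ = YP + PV·γ; power law, τ = K·γ^n; and Herschel-Bulkley, τ = τ0 + K·γ^n. The function names and the numerical values are illustrative assumptions, and consistent units (for example Pa and 1/s) are assumed throughout.

```python
def bingham_plastic(shear_rate, yield_point, plastic_viscosity):
    """Shear stress from the Bingham plastic model: tau = YP + PV * gamma."""
    return yield_point + plastic_viscosity * shear_rate


def power_law(shear_rate, consistency_index, flow_index):
    """Shear stress from the power-law model: tau = K * gamma**n (no yield stress)."""
    return consistency_index * shear_rate ** flow_index


def herschel_bulkley(shear_rate, yield_stress, consistency_index, flow_index):
    """Shear stress from the Herschel-Bulkley model: tau = tau0 + K * gamma**n."""
    return yield_stress + consistency_index * shear_rate ** flow_index


# Illustrative parameters only (not measured mud data).
shear_rates = [0.0, 5.0, 50.0, 500.0]
bingham_curve = [bingham_plastic(g, 5.0, 0.02) for g in shear_rates]
power_law_curve = [power_law(g, 0.5, 0.6) for g in shear_rates]
hb_curve = [herschel_bulkley(g, 2.0, 0.5, 0.6) for g in shear_rates]
```

The Herschel-Bulkley form reduces to the Bingham plastic model when n = 1 and to the power law when τ0 = 0, which is why it needs three fitted parameters and makes hydraulic calculations more involved.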
Regression approaches such as ANN are utilized to predict rheological properties. For greater accuracy, the ANN model can be trained continuously with more data sets. It gives a more comprehensive view of the drilling performance. For example, a reduction in pump pressure during the drilling operation can happen for several reasons, including a thinning effect on the drilling fluid, quick transport of the cuttings to the surface, reservoir fluid influx into the wellbore, and lost circulation. Here, AI interlinks the different parameters, improves the decision-making process, and brings the engineers back on the right track within a short time. Tables 8 and 9 outline several studies of artificial intelligence in drilling fluids. The tables summarize the drilling fluid properties investigated and the AI technique used. They also show the input and output parameters and the accuracy of each model, evaluated using metrics such as the correlation coefficient (R2), mean square error (MSE), and average absolute percent relative error (AAPE).
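The evaluation metrics named above can be computed as in the following Python sketch. It assumes that R2 is taken as the coefficient of determination, 1 − SS_res/SS_tot, which is one common reading of the term; individual studies may instead report the squared Pearson correlation. Note that MSE here denotes the mean square error, not the mechanical specific energy. The function names and the sample values are hypothetical.

```python
def mean_square_error(actual, predicted):
    """Average of the squared residuals."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def average_absolute_percent_relative_error(actual, predicted):
    """AAPE in percent; actual values must be non-zero."""
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)


def r_squared(actual, predicted):
    """Coefficient of determination (assumed definition of R2)."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


# Hypothetical measured and predicted values of a fluid property.
measured = [10.0, 20.0, 30.0, 40.0]
estimated = [12.0, 18.0, 33.0, 39.0]
metrics = {
    "MSE": mean_square_error(measured, estimated),
    "AAPE": average_absolute_percent_relative_error(measured, estimated),
    "R2": r_squared(measured, estimated),
}
```

Reporting several metrics together is useful because a high R2 can coexist with a large relative error at low property values, which AAPE exposes.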

Oil well cementing
The main objective of the oil well cement is to prevent the movement of fluid between the geological formation and behind the casing string (Murtaza et al. 2020; Tariq et al. 2020b). A slurry of cement is pumped down into the annulus between the casing and the geological formation. A cement slurry is a mixture of cement with various additives such as strength enhancers, friction reducers, fluid loss agents, and expanding agents. In the field of oil well cementing, AI is mostly used for the prediction of cement strength development and rheological properties. Table 10 provides some of the applications of AI in the field of oil well cement. Compressive strength development is one of the most critical parameters, and it significantly affects the drilling operation. Accurate prediction of compressive strength development can save millions of dollars by reducing the wait-on-cement time after the cementing operation. In cement design, various additives are mixed with the cement, and each additive impacts the performance of the cement slurry. AI models can predict these impacts on cement performance without conducting detailed laboratory investigations.

PVT properties curves prediction
Pressure-Volume-Temperature (PVT) crude oil properties are considered essential in petroleum engineering for reservoir and production calculations. Determination of these properties in the laboratory is the most accurate, but also the most expensive, way to obtain representative values (Tariq et al. 2021).
In the absence of such facilities, other approaches such as analytical solutions and empirical correlations are used. Some of these correlations can be seen in the works of Al-Shammasi (2001)

Production in the reservoir
Well performance prediction is one of the key tasks in developing and managing oil and gas fields. Several approaches are used to determine productivity, such as conducting deliverability tests or using mathematical models. Deliverability tests are usually time-consuming and costly operations, while the available correlations showed considerable deviations between the actual and predicted values. Data-driven models present a promising approach for estimating production based on reservoir properties and well configurations. The common AI techniques are artificial neural networks (ANN), SVM, and fuzzy logic (FL) systems. ANN is the most widely applied technique among AI methods and has shown very effective performance in several applications. Several ANN models were developed to evaluate the hydrocarbon productivity for several well types and operations (Alarifi et al. 2015; Hassan et al. 2017; Sun and Ertekin 2020). The performance of several enhanced oil recovery (EOR) treatments, such as CO2 injection and miscible gas flooding, was also evaluated using ANN models (Le Van and Chon 2017a; Van and Chon 2017b). Alarifi et al. (2015) applied three AI tools to estimate the productivity index of horizontal wells producing from oil reservoirs. FL, ANN, and FN were used to determine the well production rate for more than 100 wells. They reported that the developed AI models provided very good predictions and outperformed the industry's well-known correlations. Chen et al. (2015) and Feifei et al. (2015) determined the productivity index for horizontal wells using AFL, ANN, and FN. They mentioned that the developed models capture the influence of reservoir parameters (such as reservoir size, thickness, and permeability) on well performance. Also, they reported that the AI models showed very acceptable predictions and reduced the estimation error compared to the available correlations. Buhulaigah et al. (2017) estimated the productivity of multilateral wells utilizing artificial neural networks (ANN). They presented an ANN model to determine the oil production rate for multilateral wells based on the reservoir and well parameters. They compared the developed model with analytical models and correlations. The ANN model showed good performance; a strong match between the actual and predicted flow rates was achieved, with an overall error of 7.9%. Hassan et al. (2017, 2020) applied different artificial intelligence techniques to predict the well performance of fishbone wells. FL, radial basis network, and ANN were used to determine the well production rate from more than 250 cases of different reservoir properties and wellbore conditions. The developed models were able to estimate the productivity of fishbone wells with an absolute error of 7.23%. Furthermore, a new correlation was presented utilizing the optimized ANN model. The developed correlation was validated using actual field data with an estimation error of 6.92%. They mentioned that the suggested correlation could be inserted into commercial production software, which would reduce the deviations between the simulated results and the actual field measurements. Ariturk (2019) used artificial intelligence to optimize the flow rates for injection and production wells operated in geothermal fields. The flow rates were predicted based on the wellhead pressure, wellhead temperature, and valve positions. The injection and production rates were forecasted for around 500 days. It was mentioned that AI models could provide a very acceptable prediction for geothermal reservoirs since the models were developed based on field data and measurements. Also, it was concluded that AI presents a reliable approach that can minimize the complexity and uncertainties associated with geothermal reservoirs.
Sagheer and Kotb (2019) used long short-term memory (LSTM) networks to predict the productivity of unconventional shale reservoirs. They found that the LSTM models produced results comparable to those of physics-driven reservoir simulation. Aulia et al. (2014) used RF to predict the bottomhole pressure (BHP) of several wells. They built the reservoir simulation model by tuning several input parameters. Through simulation, they generated the BHP values and used them for model training.

Reservoir simulation and field development optimization
Reservoir simulation plays an essential role in modern oil and gas exploration and production. ML methods have been used to accelerate oil reservoir simulations and achieve higher accuracy as well. Navrátil et al. (2019) developed a model using deep learning methods to accelerate the simulations of oil reservoirs by three orders of magnitude compared to industry-strength physics-based partial differential equations (PDE) solvers.
Mohaghegh (2011) discussed the AI-based reservoir model, which can be developed using the pattern recognition capabilities of AI to build relationships between fluid production, reservoir characteristics, and operational constraints. Masoudi et al. (2020) showed how AI and ML can be used to build a purely data-driven reservoir simulation model that successfully history matches all the dynamic variables for all the wells in a field and can be used to forecast production. They tested the model on a highly complex mature field with a large number of wells and years of production. They found that the time, effort, and resources required to develop dynamic reservoir simulation models using AI and ML are considerably less than those required using commercial numerical simulators.
ML techniques are being implemented in many areas related to field development optimization. ML can be used to predict production and potential field productivity, which is mainly done by building history-matched models and using them to forecast. Alarifi and Miskimins (2021) developed a new approach using ANN to predict the ultimate recovery of several unconventional oil and gas wells using historical production data along with completion data. The models were developed and tested using actual production and completion data from 989 multistage hydraulically fractured horizontal wells from four different formations. The models developed can be used to optimize future field development plans by optimizing the well completion and stimulation procedures. Using ANN to forecast the production of several wells from limited production history can potentially help identify the expected productivity of new wells and therefore optimize the field development. Khazaeni and Mohaghegh (2011) developed a production data analysis method with AI techniques, using production history data to build a field-wide performance prediction model. In their work, production history is paired with field geological information to build datasets containing the spatiotemporal dependencies among different wells. They formed an intelligent time-successive production-modeling (ITSPM) system using data from 165 wells. Input data include data from the well itself and static data from offset wells. Dynamic data include the ultimate drainage area and initial production rate for offset wells. He et al. (2021) developed a methodology to optimize field development plans (FDPs). This includes optimizing well counts, well locations, and the drilling sequence. They used a deep reinforcement learning (DRL) method in which the AI model provides an optimized FDP.
They showed that, starting from no reservoir engineering knowledge, the AI model can learn basic reservoir engineering principles, such as placing wells in locations with high porosity and permeability, choosing a reasonable number of wells, and maintaining good well spacing. They also showed an example of how the resulting AI model was used to obtain an FDP for a real field that is better than the one initially designed by human engineers.

Stimulation
In the past two decades, the development of unconventional formations has been the focus of oil and gas industry operations and scientific research. The invention of horizontal drilling combined with multistage hydraulic fracturing (MSF) made it possible to produce economically from these oil and gas reservoirs. Nevertheless, the understanding of fracture propagation and subsequent production from such complex systems is still in its infancy. Millions of stages were performed in unconventional formations, generating a tremendous amount of data that cannot be handled conventionally. The advancement of AI methods made it possible to utilize these data to understand the formation response to stimulation and optimize the MSF completion design. Moreover, AI methods were used to improve the computational efficiency of the complex models that are frequently used in the simulation of fracture propagation or reservoir production. Running these intensive models can take a long time, which makes it a challenge to optimize the design based on simulations. This section reviews the utilization of AI algorithms to tackle unconventional formations from stimulation and production perspectives. Fracture propagation and conductivity estimation, in general, are also reviewed.
Fracture propagation models were developed to understand the growth mechanism of a hydraulic fracture in a complex system containing different minerals and natural fractures. Early studies of fracture propagation were based on simple two-dimensional (2D) (i.e., PKN, KGD) or pseudo-three-dimensional planar models. These models usually oversimplify the problem and do not provide good accuracy if natural fractures dominate the behavior of hydraulic fracture propagation, which usually occurs in unconventional formations. Advanced computational methods such as the displacement discontinuity method (DDM), discrete element method (DEM), finite-discrete element method (FDEM), and extended finite element method (XFEM) are usually applied (Lecampion et al. 2018). These methods are computationally expensive, especially if high accuracy is required in a heterogeneous system.
Accurate fracture propagation modeling even at a small scale can be computationally intensive. Zapico et al. (2008) used ANN to fine-tune a finite element model (FEM) to predict experimental fracture propagation outcomes. Nevertheless, the methodology is computationally expensive as the FEM must still be used for the estimations. Moore et al. (2018) built an efficient machine learning model from a physics-based finite-discrete element model (FDEM) to predict the fracture growth in brittle materials containing pre-existing fractures. Modeling fracture propagation at the microscale is computationally expensive, while running that model at field scale is prohibitively expensive. Hence, a machine learning model was trained to reduce the computational cost by between 2 and 5 orders of magnitude. Around 200 2D simulations were obtained, performed on a 2 m × 3 m domain containing 20 random fractures, each 30 cm long. The features of two neighboring fractures were studied and used as input to the model. These features were the length of the fractures, their orientation, the distance between the two fractures, and the minimum distance to the domain boundary. Labels were assigned to each pair indicating whether the fractures coalesce and the time at which that takes place. Figure 5 shows the methodology implemented, where an FDEM model was first run to produce the outcomes. Features were then extracted to train the AI algorithms, which were used to provide outcomes similar to those of the FDEM. Decision Tree (DT), Random Forest (RF), and Artificial Neural Network (ANN) were implemented in the study. From the 20 fractures, 190 pairs were generated, containing a total of 5200 data points. The first outcome of their work was to classify whether the two fractures in a pair will coalesce; then a regression model was implemented to estimate the time for that to happen (see the lower right part of the figure). The success of the model was based on the ability to track the fracture path until the domain splits.
Also, the time of rupture (domain split) was used as an indication of the model accuracy. The correlation coefficient for ANN compared to the FDEM simulations was 0.68, while DT and RF gave lower accuracy. One of the challenges (in terms of classification) was the class imbalance, as only 5-9 pairs out of a possible 190 pairs coalesce. The issue was partially resolved by giving more weight to the minority (coalescing) class. Simulation time was reduced by up to five orders of magnitude by using AI methods. Such a methodology is feasible to utilize at field scale to predict fracture propagation in a formation containing micro- and macro-scale natural features.
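Two numerical details of this workflow can be made concrete with a short Python sketch: the 190 pairs follow from choosing 2 out of 20 fractures, and the class imbalance can be countered by weighting each class inversely to its frequency. The weighting rule used here, n_samples / (n_classes × class_count), is the common "balanced" heuristic and is an assumption; the exact weights used in the reviewed study are not given. The function names and the choice of 7 coalescing pairs (within the reported 5-9 range) are hypothetical.

```python
from itertools import combinations


def fracture_pairs(n_fractures):
    """All unordered pairs of fracture indices."""
    return list(combinations(range(n_fractures), 2))


def balanced_class_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count)."""
    classes = sorted(set(labels))
    n_samples = len(labels)
    return {c: n_samples / (len(classes) * labels.count(c)) for c in classes}


pairs = fracture_pairs(20)

# Hypothetical labels for one simulation: 1 = pair coalesces, 0 = it does not.
n_coalescing = 7
labels = [1] * n_coalescing + [0] * (len(pairs) - n_coalescing)
weights = balanced_class_weights(labels)
```

With such weights, a misclassified coalescing pair costs the classifier far more than a misclassified non-coalescing pair, which discourages the trivial solution of predicting "no coalescence" everywhere.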
Fluid flow through a fracture network is challenging to simulate because of the structural complexity. Srinivasan et al. (2018) built a machine learning tool to predict the solute flow through a fractured network. A discrete fracture network (DFN) methodology was used to simulate fluid flow in fracture networks. Solute flow through the fracture network usually takes the shortest path. Graph theory was used to reduce the number of fractures to only those that contribute to flow. Then, SVM and RF were used to identify the backbone of the fracture network that contributes to flow. This significantly reduced the computational cost of simulating flow using the DFN model. The trained model could capture the early solute breakthrough precisely; nevertheless, it was not as useful in predicting late-time flow.
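The graph-based reduction can be illustrated with a minimal Python sketch in which fractures are nodes, fracture intersections are edges, and a breadth-first search returns the path with the fewest fractures between the inflow and outflow boundaries. This is only an illustration of the shortest-path idea under assumed inputs: the network, the node names, and the use of an unweighted search are hypothetical and do not reproduce the algorithm of the reviewed study.

```python
from collections import deque


def shortest_path(graph, source, target):
    """Breadth-first search; returns the path with the fewest hops, or None."""
    parents = {source: None}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            # Walk back through the parents to rebuild the path.
            path = []
            while node is not None:
                path.append(node)
                node = parents[node]
            return path[::-1]
        for neighbour in graph.get(node, ()):
            if neighbour not in parents:
                parents[neighbour] = node
                queue.append(neighbour)
    return None


# Hypothetical network: keys are fractures, values are intersecting fractures.
network = {
    "inlet": ["f1", "f2"],
    "f1": ["inlet", "f3"],
    "f2": ["inlet", "outlet", "f4"],
    "f3": ["f1", "outlet"],
    "f4": ["f2"],
    "outlet": ["f2", "f3"],
}
backbone = shortest_path(network, "inlet", "outlet")
```

Dead-end fractures such as "f4" never appear on the returned path, which mirrors the idea of discarding fractures that do not contribute to early breakthrough.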
Proppant distribution in a hydraulic fracture is crucial information as it can be used to optimize the MSF design. Maity et al. (2019) identified proppant particles from cored samples based on imaging processes supported by machine learning classification tools. The goal was to understand the proppant distribution after an MSF job. This helps identify the locations of new infill wells to be drilled and the completion spacing, as the proppant distribution can indicate the length of the propped fracture and which clusters were propped. Images were taken of the particles obtained from a 600 ft cored interval, using a dedicated slanted well to obtain these cores. Using a trained ANN classifier, the particles were divided into proppant, calcite, and others. The following particle attributes were used as input: hue, roundness, size, darkness, roughness, translucence, and entropy. K-fold cross-validation was used to optimize the hidden layer size of the ANN. It was benchmarked against other classifiers such as SVM. It was concluded that the proppant is limited to within a 30 ft vertical distance in the studied formation. The result was validated against field data using other classification techniques.
AI is also an active area in hydraulic fracture design optimization, covering aspects such as the number of horizontal wells, number of stages, volume of proppant and fluids, type of chemical additives, and sweet spot identification (Awoleke and Lane 2011; Lolon et al. 2016). Most of the AI-developed models ignore important geological and reservoir properties such as porosity, permeability, saturation, and pressure. These data are challenging to obtain, especially along the horizontal sections of the wellbore. Some researchers replaced these data with the location of the well (i.e., coordinates), as the mentioned properties change spatially (Mishra et al. 2015; Wang and Chen 2019). Wang and Chen (2019) trained machine learning algorithms (RF, SVM, ANN, and AdaBoost) on data from 3160 horizontal wells in the Montney unconventional formation to predict the first-year production and optimize the fracture design. Features such as proppant mass, well location, lateral length, treatment fluid size and type, completion type, and number of stages were used for training. Recursive feature elimination with cross-validation (RFECV) was used to find the most significant features, with RF used for prediction. Then, the algorithms were trained on the most important features to predict the production rate from a fractured well. RFECV showed that, for the Montney formation, the most important parameters in enhancing production are the mass of proppant pumped and the location of the well. It was found that using more than four features (proppant mass, latitude, longitude, and TVD) does not improve the correlation coefficient. It was also observed that RF gives the best performance in terms of prediction accuracy. One drawback of the trained model is its lack of reservoir properties such as permeability, porosity, and pressure.
Optimization of hydraulic fracture stages using gradient-free (i.e., AI) methods has been applied by many researchers (Iino et al. 2020; Yu and Sepehrnoori 2013). The objective function is usually the net present value (NPV) or cumulative production. Features such as fracture half-length, spacing, porosity, permeability, the distance between laterals, and fracture conductivity were used for the optimization. Different AI algorithms were tried, such as covariance matrix adaptation evolution strategy (CMA-ES), simultaneous perturbation stochastic approximation (SPSA), genetic algorithm (GA), and non-dominated sorting genetic algorithm (NSGA-II). Rahmanifard and Plaksina (2018) aimed to optimize hydraulic fracture stages in an unconventional gas formation based on cumulative production or NPV using AI-based optimization tools such as GA, Differential Evolution (DE), and Particle Swarm Optimization (PSO). Gradient-based methods are commonly used for optimization. However, they can become trapped in local optima, which means that the global optimum may not be found. Also, many functions are not differentiable at certain values or over certain ranges. Hence, this study used AI-based optimization tools that are gradient-free. The authors used the Wattenbarger et al. (1998) analytical slab model to estimate cumulative gas flow within a given production period. The objective function is the NPV, which is a function of the cumulative gas production, cumulative water production, and the cost of hydraulic fracturing and waste disposal. The objective is to find the number of hydraulic fractures (NHFs) that maximizes NPV. PSO outperformed DE and GA, as it required far fewer iterations to converge. Du et al. (2017) utilized embedded discrete fracture modeling (EDFM) to train an AI-based algorithm to estimate productivity in the Permian Basin.
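The gradient-free search for the number of hydraulic fractures that maximizes NPV, as described above, can be illustrated with a minimal PSO sketch. This is not the model of Rahmanifard and Plaksina (2018): the `npv` function below is a hypothetical toy (saturating revenue minus a linear fracturing cost, in arbitrary money units) standing in for the analytical production model, and the inertia and acceleration coefficients are common textbook values chosen as assumptions.

```python
import math
import random


def npv(n_fractures: int) -> float:
    """Hypothetical toy NPV: saturating revenue minus linear fracturing cost."""
    revenue = 100.0 * (1.0 - math.exp(-n_fractures / 20.0))
    cost = 1.0 * n_fractures
    return revenue - cost


def pso_maximize(objective, lower, upper, n_particles=20, n_iters=100, seed=0):
    """Maximize objective(round(x)) over x in [lower, upper] with a basic PSO."""
    rng = random.Random(seed)
    w, c1, c2 = 0.7, 1.5, 1.5  # inertia, cognitive and social weights (assumed)
    pos = [rng.uniform(lower, upper) for _ in range(n_particles)]
    vel = [0.0] * n_particles
    best_pos = pos[:]                              # personal best positions
    best_val = [objective(round(p)) for p in pos]  # personal best values
    g = max(range(n_particles), key=lambda i: best_val[i])
    g_pos, g_val = best_pos[g], best_val[g]        # global best so far
    for _ in range(n_iters):
        for i in range(n_particles):
            r1, r2 = rng.random(), rng.random()
            vel[i] = (w * vel[i]
                      + c1 * r1 * (best_pos[i] - pos[i])
                      + c2 * r2 * (g_pos - pos[i]))
            # Move the particle and clamp it to the search bounds.
            pos[i] = min(max(pos[i] + vel[i], lower), upper)
            val = objective(round(pos[i]))
            if val > best_val[i]:
                best_pos[i], best_val[i] = pos[i], val
                if val > g_val:
                    g_pos, g_val = pos[i], val
    return round(g_pos), g_val


best_n, best_npv = pso_maximize(npv, lower=1, upper=100)
print(f"Best number of fractures: {best_n}, toy NPV: {best_npv:.2f}")
```

No derivative of `npv` is needed, which is the property that motivates gradient-free methods when the objective comes from a simulator or is not differentiable. By construction, the continuous version of this toy objective peaks at 20 ln 5 ≈ 32.2 fractures, so the search should settle near that value.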
The authors used EDFM for fracture representation in a reservoir simulator, a method that reduces the need for fine grids. The EDFM is composed of two elements, matrix and fracture, that can be represented separately. Mangrove, a commercial software package, was used for hydraulic fracture network generation. AI was implemented to remove unnecessary fracture complexity that would not contribute to productivity. Using AI methods to reduce the complexity of the fracture network and then implementing it in EDFM resulted in a significant reduction in simulation time compared with using Mangrove alone, which made sensitivity analysis feasible. However, the simplified structure produced by the AI should be history matched to tune parameters such as reservoir permeability; otherwise, an error of up to 40% could result. Bhattacharya et al. (2019) used machine learning algorithms to predict production in fractured Marcellus shale. The authors used the data of one well with 28 stages of hydraulic fractures in Marcellus shale to predict the production rate. The data used were petrophysical and geomechanical data (GR, sonic), pressure data (surface, casing, tubing), and fiber optics data such as distributed acoustic sensing (DAS) and distributed temperature sensing (DTS); hydraulic fracturing data and design were missing. Ghahfarokhi et al. (2018) also used DAS and two years of DTS data to estimate production from a Marcellus shale well. Bhattacharya et al. (2019) implemented the following machine learning tools: RF, ANN, and SVM. Feature engineering was used to derive secondary attributes from the raw data, such as the brittleness index (BI). Collinearity analysis was used to find the most suitable features, which reduced their number from 34 to 18. All models could predict the production rate with good accuracy. However, SVM was less accurate and required more computation time.
Including hydraulic fracturing, reservoir, and PVT properties should improve accuracy. The model's lack of these data is a major limitation of their approach. Figure 6 shows that Poisson's ratio (PR) and brittleness index (BI) were the most important attributes, while DAS and DTS were not as significant.
Fig. 6 Importance of attributes to production estimation of Marcellus shale (Bhattacharya et al. 2019)
Similar concepts were applied to other shale formations such as the Bakken shale. Luo et al. (2019) investigated the possibility of predicting the productivity of horizontally drilled wells in the Bakken shale based on completion and geological parameters. Geology and completion data of 2061 horizontal wells in the Bakken were used. These include vertical depth, amount of proppant, water saturation, porosity, permeability thickness, etc. Spearman correlation, RF, and joint mutual information (JMI) were used for feature selection. Deep learning (ANN) was used as a predictive model based on one-year production data. Based on feature selection, it was found that the formation thickness, depth, and amount of proppant are the most important parameters for predicting first-year production. It was also observed that less porous spots require more proppant to increase productivity, which agrees with the physics of unconventional reservoirs. Wang et al. (2011) applied AI to 2780 MSF wells and 139 vertical wells in the Bakken to predict productivity. A deep neural network was used in the study, with k-fold cross-validation to check the predictive ability of the model. The number of hidden layers and neurons was optimized to give the best prediction for 6 and 18 months. The model showed that the amount of proppant placed in each stage is the most important parameter in predicting productivity. The trained model resulted in a small root mean square error (RMSE) when predicting the 6-month and 18-month production.
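The k-fold cross-validation and RMSE reporting mentioned above can be sketched in a few lines. This is only an illustration under stated assumptions: a one-feature least-squares line stands in for the deep neural network used in the cited studies, the proppant and production values are synthetic, and all function names are hypothetical.

```python
import math
import random


def kfold_indices(n_samples, k, seed=0):
    """Yield (train, test) index lists for k shuffled, non-overlapping folds."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    for i in range(k):
        train = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train, folds[i]


def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx


def rmse(y_true, y_pred):
    """Root mean square error between two equal-length sequences."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def cross_validated_rmse(xs, ys, k=5):
    """Fit on k-1 folds, score on the held-out fold, and average the RMSE."""
    scores = []
    for train, test in kfold_indices(len(xs), k):
        slope, intercept = fit_line([xs[i] for i in train], [ys[i] for i in train])
        preds = [slope * xs[i] + intercept for i in test]
        scores.append(rmse([ys[i] for i in test], preds))
    return sum(scores) / len(scores), scores


# Synthetic data: production rises with proppant mass plus Gaussian noise.
rng = random.Random(1)
proppant = [rng.uniform(50.0, 200.0) for _ in range(100)]
production = [0.4 * p + 10.0 + rng.gauss(0.0, 5.0) for p in proppant]

mean_rmse, fold_rmse = cross_validated_rmse(proppant, production, k=5)
print(f"Mean RMSE over 5 folds: {mean_rmse:.2f}")
```

Because every well is held out exactly once, the averaged RMSE reflects performance on data the model did not see during fitting, which is what makes it useful for comparing hidden-layer sizes or other model settings.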
Sweet spot identification in unconventional formations is important because horizontal drilling combined with MSF is expensive and must be justified by good productivity. Also, unconventional formations cover large areas, and hence, finding the right location to complete the well is critical. Tahmasebi et al. (2017) defined sweet spots as the locations having high TOC and fracability index (FI). They fitted multiple linear regression (MLR) to log data from a shale formation to predict the TOC and FI. Nonlinear models such as FL, hybrid neural network (NN)/FL, and GA were also implemented. Stepwise selection was used for variable selection. Mineralogical composition was used to assess the fracability, with quartz as the brittle mineral. MLR failed to predict FI, with an unsatisfactory correlation coefficient of 0.44. The prediction of TOC was better, with a correlation coefficient of around 0.88. Among the nonlinear models, the hybrid (NN + FL) machine learning model (HML) provided better accuracy and removed the weakly correlated variables. Rastogi and Sharma (2019) used machine learning tools to find the impact of fracturing chemicals on production using one-year production data. Different algorithms were used for feature selection, such as F-regression, decision tree-based regression, and recursive feature elimination. The data were obtained from different fracture jobs in the Powder River Basin. Chemical additives were found to be among the top 5 parameters that impact productivity out of 11 selected features.
AI has also been applied to acid fracturing for conductivity prediction. Acid self-props the fracture by generating peaks and valleys that act as conduits for fluid flow. Akbari et al. (2017) used 106 experimentally generated data points to develop a conductivity correlation based on GA. The developed correlation was more accurate than the popular correlations for acid fracture conductivity. Eleibide et al. (2018) applied ANN and adaptive network-based fuzzy inference systems to the same data set. The authors showed that the model accuracy was improved compared with the model of Akbari et al. Desouky et al. (2020a) utilized more than 500 data points to generate a more accurate acid fracture conductivity correlation that considers rock type and etching pattern.

Future and challenges
The use of ML techniques to handle large data sets and to predict parameters in many aspects of the oil and gas industry is growing rapidly. The main reason is the large volume of data generated in the everyday activities of the oil and gas industry. To process these data and make them useful, careful data processing and handling must take place, and ML techniques are a great tool for that. Furthermore, because of the complexity of the relationships between the many factors controlling the productivity of an oil or gas well, ML techniques are widely used to uncover these relationships and build multilayered correlations relating the different factors. Classical linear/nonlinear regression methods do not have the capability to handle such high complexity as ML models do. Also, the high uncertainty of many oil and gas industry activities is a major concern given their capital-intensive nature. Building reliable forecasting and prediction models is therefore necessary to navigate these challenges while optimizing the outcomes.
ML techniques have provided many solutions that help the oil and gas industry thrive. At the same time, these models have many disadvantages that are sometimes ignored or rarely mentioned. One of the main disadvantages of using ML to build a relationship between several parameters is that a high correlation does not necessarily imply causation ("correlation does not imply causation"). A highly correlated model linking several parameters based on the data used should not be taken as an indication that these parameters truly have a cause/effect relationship unless there is a proven physical or scientific relationship between them. Many models developed in the literature fail to address this fact and tend to associate correlation with causation. Another common challenge facing the applicability of ML techniques is the availability and accuracy of the data used to build and test these models. The data have to be accurate in order to produce a useful model. Otherwise, the model developed will not be useful, no matter how high its apparent accuracy. Conducting quality assurance on data collection is highly recommended to avoid this issue.
A common criticism of ML models is that they require a large and diverse data set for training. Any model needs sufficient representative data in order to capture the underlying structure that allows it to generalize to new, similar cases. For instance, an ML model built to predict production from a certain formation would only be applicable to that formation and under the same conditions under which the training data were collected. Generalizing predictive models has to be done with careful consideration of the constraints of these models and the diversity and inclusivity of the data used to build them. This is a major disadvantage of ML techniques, as they tend to be generalized without careful consideration of this limitation.
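One simple safeguard against the over-generalization described above is to check whether a new case lies inside the range of the training data before trusting a prediction. The sketch below is a hypothetical helper, not a method from the reviewed literature: the feature names and well values are illustrative assumptions, and a per-feature range check is a necessary condition for similarity rather than a sufficient one.

```python
def training_envelope(rows):
    """Return {feature: (min, max)} over the training rows (list of dicts)."""
    features = list(rows[0].keys())
    return {f: (min(r[f] for r in rows), max(r[f] for r in rows)) for f in features}


def features_outside_envelope(sample, envelope):
    """List the features of `sample` that fall outside the training range."""
    return [f for f, (low, high) in envelope.items() if not (low <= sample[f] <= high)]


# Hypothetical training wells (values are illustrative only).
training_wells = [
    {"tvd_ft": 7200.0, "porosity": 0.06, "proppant_tonnes": 1500.0},
    {"tvd_ft": 7900.0, "porosity": 0.09, "proppant_tonnes": 2400.0},
    {"tvd_ft": 8400.0, "porosity": 0.07, "proppant_tonnes": 3100.0},
]
envelope = training_envelope(training_wells)

# A new well that is deeper than anything seen during training.
new_well = {"tvd_ft": 9800.0, "porosity": 0.08, "proppant_tonnes": 2000.0}
flagged = features_outside_envelope(new_well, envelope)
if flagged:
    print("Model would be extrapolating in:", ", ".join(flagged))
```

A flagged feature does not make the prediction wrong, but it signals that the model is being applied outside the conditions represented in its training data.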
Surely, ML cannot be used to predict just anything related to the oil and gas industry, or to build a correlation between any arbitrary set of parameters. Before building relationships between the different factors, a scientific and factual explanation of the actual "physical" relationship between these parameters has to be established. Also, using ML to predict and forecast based on historical data has to be done carefully, by ensuring that the future conditions are similar to the historical ones. ML tends to be a very useful tool for dealing with big data and for building complex relationships between parameters that linear/nonlinear regression models cannot handle. Many of the correlations that were established by regression analysis of laboratory data are being replaced with correlations developed using ML methods, which are case-specific rather than general.
Deep learning, which is a subset of ML based on ANN, is very efficient for many tasks, but it is not the solution to every problem, as it faces many challenges. Deep learning algorithms need to be trained with large data sets, and access to accurate data is not always possible in many aspects of the oil and gas industry. Therefore, overfitting is considered the most common problem in ML applications, mainly because of the lack of an appropriate amount of training data. Also, overtraining can happen when there is no clear stopping criterion for the training: the error keeps decreasing as the model is updated, and the model becomes more complex in order to fit a specific data set. Even when dealing with large data sets, a major challenge is the training cost. In many situations, supercomputers are needed to handle large oil and gas data sets to build and run ML models.
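The overtraining problem described above is commonly mitigated with early stopping: training halts once the error on a held-out validation set stops improving. The following is a minimal sketch under stated assumptions. The `train_step` and `validate` callables are hypothetical placeholders for a real training loop, and the scripted validation curve (which falls until epoch 10 and then rises) only imitates the typical onset of overfitting.

```python
def train_with_early_stopping(train_step, validate, max_epochs=200, patience=5):
    """Stop when validation loss has not improved for `patience` epochs in a row."""
    best_loss = float("inf")
    best_epoch = -1
    epochs_without_improvement = 0
    for epoch in range(max_epochs):
        train_step()          # one pass over the training data
        loss = validate()     # error on data not used for fitting
        if loss < best_loss:
            # In a real setting, the model weights would be saved here.
            best_loss, best_epoch = loss, epoch
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                break
    return best_epoch, best_loss


def make_scripted_run():
    """Hypothetical stand-in for a model whose validation loss bottoms out at epoch 10."""
    state = {"epoch": 0}

    def train_step():
        state["epoch"] += 1

    def validate():
        e = state["epoch"] - 1
        return (e - 10) ** 2 / 100 + 1.0

    return train_step, validate, state


train_step, validate, state = make_scripted_run()
best_epoch, best_loss = train_with_early_stopping(train_step, validate, patience=5)
print(f"Best epoch: {best_epoch}, validation loss: {best_loss}")
```

The training error may keep decreasing after the best epoch, but the validation error does not, so the model from the best epoch is the one to keep.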
The future of ML applications in the oil and gas industry looks promising. With the arrival of the internet of things, the automation of many oil and gas activities, and the high reliance on data, it is possible to minimize risks and enhance productivity by integrating ML algorithms that are continuously trained and improved using the continuous flow of data. With the generation of large data in the oil and gas industry, petroleum engineers and geoscientists must be exposed to the big data handling techniques that are being developed in the AI domain. Making the most of the available data is being addressed nowadays and will continue to be the trend in the future. Optimization cannot be reached without the powerful capabilities of AI.

Concluding remarks
Based on the review of the literature and the authors' work on the applications of AI in petroleum engineering, the following remarks can be made:
• AI offers a huge potential in solving problems in almost all areas of the oil and gas industry involving prediction, classification, and clustering. Compared to other areas of engineering, AI has special relevance to petroleum engineering and geosciences because of two important factors. Firstly, we deal with nature rather than man-made materials and processes. Variation in rocks, oil, and brine cannot be easily handled by any closed-form solution. The employment of numerical methods in such a situation is too cumbersome as well as unrealistic. Secondly, the amount of data produced every day, from cores, logs, and seismic exploration in conventional reservoirs to multistage hydraulic fracturing in unconventional reservoirs, is too large to be interpreted properly using classical approaches.
• The biggest challenge for researchers is to have access to laboratory and/or field data. Oil companies should come forward to share data in whatever secured form is acceptable, so that the literature can benefit from the enormous potential that AI is ready to offer.
• With the arrival of the internet of things and the live relay of data from drilling and production facilities, it is possible to minimize risks and enhance production by integrating AI algorithms trained on past data.
• There is a degree of uncertainty in the data coming out of the laboratories, logs, or seismic surveys. The depth record of the samples on which the measurements are taken may not be exact due to depth shifts, and the log data may not exactly represent the property corresponding to that depth point, as it involves the averaging of the properties of layers that are intersected within the logging sensors such as transmitters and receivers. Seismic data also average vast volumes of rock in a given block. As a result, a significant amount of the data collected during the drilling operation is unreliable. Consequently, data cleaning and uncertainty quantification are another huge area that needs to be integrated with AI to develop realistic solutions to the problems.
• The solution to a problem using AI is rarely guaranteed if one attempts to relate the input to the target data without a thorough understanding of the physics of the problem. The major challenge in this approach is setting up the problem so that the algorithm can easily connect the input and the target data. Wherever possible, information from analytical models should be suitably used in the input data to help the AI model arrive at the solution more efficiently and quickly.
• Problems involving the prediction of curves, such as the viscosity-pressure curve in PVT data, require extensive exploration of algebraic equations in addition to an understanding of the physics of the problem.
• Challenges also remain in AI tools themselves, such as overfitting, coincidence effect, overtraining, etc. It is hoped that researchers in soft computing will develop modified and/or new AI tools.
• With the generation of huge data in petroleum engineering, from cores to logs to seismic exploration, petroleum engineers and geoscientists must be exposed to normal as well as big data handling techniques that are being developed in the AI domain. Exposure to AI tools should start right from their undergraduate education.

Declarations
Conflict of interest The authors declare that they have no conflict of interest.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.