Characterization based machine learning modeling for the prediction of the rheological properties of water-based drilling mud: an experimental study on grass as an environmental friendly additive

The successful drilling operation depends upon the achievement of target drilling attributes within the environmental and economic constraints but this is not possible only on the basis of laboratory testing due to the limitation of time and resources. The chemistry of the mud decides its rheological potential and selection of the techniques required for recycling operations. Conductivity, pH, and photometer testing were performed for the physio-chemical characterization of the grass to be used as an environmental friendly drilling mud additive. In this study, different particle sizes (75, 150, and 300 µm) of grass powder were mixed in mud density of 8.5, 8.6, and 8.7 ppg in the measurement of gel strength and viscosity of drilling mud. The grass additive was added in different weight conditions considering no additive, 0.25, 0.5, and 1 g to assess the contribution of grass on the gel strength and viscosity of the drilling mud. The machine learning techniques (Multivariate Linear Regression Analysis, Artificial Neural Network, Support Vector Machine Regression, k-Nearest Neighbor, Decision Stump, Random Forest, and Random Tree approaches) were applied to the generated rheological data. The results of the study show that grass can be used for the improvement of the gel strength and viscosity of the drilling mud. The highest improvement of the viscosity was seen when grass powder of 150 µm was added in the 8.7 ppg drilling mud in 0.25, 0.5, and 1 g weights. The gel strength of the drilling mud was improved when the grass additive was added to the drilling mud 8.7 ppg. Random forest and Artificial Neural Network had the same results of 0.72 regression coefficient (R2) for the estimation of viscosity of the drilling mud. The random tree was found as the most effective technique for the modeling of gel strength at 10 min (GS_10min) of the drilling mud. 
The predictions of Artificial Neural Network had 0.92 R2 against the measured gel strength at 10 s (GS_10sec) of the drilling mud. On average, Artificial Neural Network predicted the rheological properties of the mud with the highest accuracy as compared to other machine learning approaches. The work may serve as a key source to estimate the net effect of grass additives for the improvement of the gel strength and viscosity of the drilling mud without the performance of any large number of laboratory tests.


Introduction
Machine learning (ML), the subclass of artificial intelligence (AI), is a technique in which models are formulated based on data, and the models are forced to recognize the pattern in the data by training processes (Davenport and Kalakota 2019). ML is based on the principles of computer science, statistics, and all other fields of study which can model the behavior of decision making in doubtful conditions. ML has been used efficiently in the field of automation, speech recognition, computer vision, neuroscience, and many other domains of life for software development concerning predictions (Jordan and Mitchell 2015;Kersting 2018). Data scientists consider machine learning as an important tool to handle large sources of data for the development of accurate data-driven predictive models (Provost and Fawcett 2013). ML is used to model the nonlinear relationships by observation of the available patterns in the data. The performance of machine learning depends upon the selection of the algorithm used in the training phase of the model (Kotsiantis et al. 2007). The popularity of ML has increased because of recent developments in algorithms and improved computing facilities (Dimiduk et al. 2018;LeCun et al. 2015;Schmidt and Lipson 2009).
Recently, the application of AI has advanced the reservoir characterization in the petroleum engineering domain (Al-Bulushi et al. 2009;Al-Marhoun et al. 2012;Anifowose et al. 2014; Asadisaghandi and Tahmasebi 2011; Barros and Andrade 2013;Dutta and Gupta 2010;Hegde et al. 2019;Ismail et al. 2017a, b;Waqas et al. 2020). The reported influential performance of AI has enhanced the application of machine learning in the petroleum industry. Machine learning has been used in the prediction of various rock and fluid properties that are significant in the execution of different petroleum operations. Since 1970, artificial neural network (ANN), regression modeling, and support vector machine (SVM) have been used successfully in the field of geophysics, well logging, and production engineering. ML has become an important part of data analysis in all kinds of industries, but relatively this ML is underutilized in geoscience (Maniar et al. 2018). The application of ML has improved the precision and efficiency of the drilling operation that saved the industry from excessive financial and technical constraints. In the oil and gas industry, machine learning models have improved the relationship between input and output variables without significantly dependent upon the structure of the system (Noshi and Schubert 2018). The prediction of accidents in directional drilling operations can be done by using ML techniques. The designed models work by comparison of real time-data with the accidents that happened in the past (Gurina et al. 2020). Drilling optimization plays an important role in the economical and technical goals during drilling operations. The drilling optimization is done by identification of the optimum parameters using different empirical and machine learning techniques (Hegde and Gray 2018). The identification of the lithology and formation horizons is successfully done by using the neural network models (Mahmoud et al. 2021).
Drilling mud is a viscous, heavy fluid used in drilling operations to carry cuttings out of the borehole, maintain borehole pressure, and lubricate the drill bit. Drilling mud is considered one of the most important parts of the drilling activity. A wide range of chemicals and polymers is used to achieve the required properties of the drilling fluid. The main properties of the drilling mud include viscosity, yield point, gel strength, mud density, fluid loss control property, rate of penetration, and filtration control agents (Amanullah et al. 2016; Barbosa et al. 2019; Gul and van Oort 2020). Controlling the physical properties of the drilling mud is very important for optimum performance. Viscosity and gel strength are two of the most important properties for controlling the drilling fluid (Abdou and El-Sayed Ahmed 2011). In recent years, environmental conservation has also affected the selection of muds and their additives in hydrocarbon drilling and production activities (Caenn et al. 2011). Advances in drilling technology have made it possible to replace conventional mud additives with environmentally friendly constituents. The preparation of eco-friendly drilling mud is influenced by several factors such as the requirement of a complex mud treatment facility, high initial cost, and low availability of raw material (Abdou and El-Sayed Ahmed 2011; Kok and Alikaya 2003; Lan et al. 2009; Li et al. 2002; Zhao et al. 2009). During drilling operations, hazardous vapors emitted from the drilling mud, depending upon the type of additives in the mud, may cause very detrimental effects on human health if they exceed safe exposure limits. Different materials such as brines, cleaning agents, solvents, and other fluids associated with drilling mud may damage the skin upon contact.
The deleterious effect of additives such as lubricants, viscosifiers, thinners, descalers, defoamers, stabilizers, corrosion inhibitors, and surfactants on living beings has been documented by many authors (Ameille et al. 1995; Candler et al. 1992; Greaves et al. 1997). The hazardous effects of the additives in the drilling mud give rise to a need for environmentally friendly replacements that have the same efficiency as the conventional makeup of the drilling mud (Apaleke et al. 2012). The toxicity of the drilling mud depends upon the nature of the additive, so a water-based drilling fluid can be environmentally hazardous as well. The level of damage to the environment depends upon the type of drilling fluid, the concentration of the additive, and the rate at which used drilling mud is discharged into the environment. Recent research has focused on the preparation of a more environmentally friendly substitute for oil-based drilling fluid. The increase in environmental restrictions has boosted the idea of oil-based drilling fluid replacement (Wajheeuddin 2014). Recent studies have shown that additives such as diesel-based/mineral-based fluids have high toxicity ranges (Dosunmu 2010; Duchemin et al. 2008; Rana 2008). Five drilling properties (viscosity, density of the drilling mud, filter cake, solids content, and quality of makeup water) are closely checked during drilling activity as part of a drilling program (Hossain and Wajheeuddin 2016). Support Vector Machine can be used for the study of metabolization in the feed of animals (Ahmadi and Rodehutscord 2017). Zhang et al. (2018) used a computer-aided algorithm for uncertainty analysis in managed pressure drilling (MPD). Bhandari et al. (2015) and Mohaghegh (2015) studied the application of machine learning to the prediction of blow-outs and drilling anomalies in operations.
The attention of the oil and gas industry has shifted towards natural, environmentally friendly additives for rheological improvement in order to avoid the hazardous effects of conventional additives in the drilling mud. Several researchers have worked on the application of natural materials such as rice husk, sugar cane ash, coconut shells, cocoa bean shells, date seed powder, fibers, grass, and a wide range of other locally available materials for the improvement of drilling mud properties. One of the main applications of natural materials in drilling operations is the control of loss circulation. As shown in Table 1, many researchers have successfully used environmentally friendly materials for loss circulation prevention and for improvement of the flowing properties of the drilling mud. Natural vegetable gum is used to improve the resistance of drilling mud to temperature variations. The temperature resistance of the drilling mud is enhanced by 40% with the use of vegetable gum as a drilling mud additive (Li et al. 2002).
[Table 1 (partial): Ground peach seeds; Filtration controlling; (Lummus and Ryals 1971); Ground nutshells and nut flour; Filtration controlling; (Burts Jr 2001); Corn cob outers; Filtration controlling; (Green 1984); Ground cocoa bean shells; Lost circulation material; (Burts Jr 1992); Rice fractions (rice hulls, rice tips, rice straw, and rice bran); Lost circulation material; (Cremeans and Cremeans 2003); Cottonseed hull; Lost circulation material; (Sharma and Mahto 2006); Tamarind gum; Viscosifier; (Macquoid and Skodack 2004); Coconut]
Natural polymers such as carboxymethyl cellulose, guar gum, and starch are abundant in the environment and very cheap, and they can be used to reduce loss circulation (Kok and Alikaya 2003). Cellulose from groundnut husk can be a very good material for fluid loss control if used in high concentrations (Dagde and Nmegb 2014). Grass, on the other hand, is a main animal feed, is present in large amounts on the earth, and is considered a major source of cellulose (Nmegbu and Bekee 2014). The average composition of grass includes cellulose, hemicellulose, and lignin, which are useful as additives in the drilling mud, while the major elements are calcium, potassium, and chlorine as per the results of XRD analysis (Table 2). Wajheeuddin and Hossain (2018) studied the effects of grass as a natural additive in 8.6 ppg drilling mud and found that the flowing properties of the drilling mud are improved by using grass as a natural additive. Hossain and Wajheeuddin (2016) reported that grass can be used to handle fluid loss problems while keeping the mud environmentally friendly. The addition of grass improved the viscosity and gel strength of the drilling mud efficiently when a particle size of 300 µm was used. Grass not only modifies the rheological properties but also acts as a fluid loss control material. Starch is relatively less effective than grass as a naturally available, environmentally friendly additive. The salt concentration increases with the addition of grass to the drilling mud, which modifies its rheological properties (Al-Hameedi et al. 2019). Al-Saba et al. (2018) used many naturally available drilling additives to evaluate their effects on the rheology of the drilling mud and concluded that soybean skin and coconut shell powder are the most effective against fluid loss.
In this study, the performance of the machine learning techniques has been assessed in terms of the prediction of the rheological properties of the drilling mud. The grass is used as an environment-friendly additive to improve the properties of the drilling mud. The characterization of the grass is done by physio-chemical experimentation. The influence of the grass as an environmental friendly additive has been studied on water-based drilling mud of 8.5, 8.6, and 8.7 ppg. Multivariate Linear Regression Analysis, Artificial Neural Network, Support Vector Machine Regression, k-Nearest Neighbor, Decision Stump, Random Forest, and Random Tree approaches are used to predict the rheological properties of the drilling mud by application on the experimental data.

Statistical and machine learning models
Machine learning is divided into three classes: supervised, unsupervised, and reinforcement learning. Supervised learning learns from target data, whereas in unsupervised learning the model finds patterns in the data and makes predictions without any contribution from target data (Schmidt and Lipson 2009). Supervised learning estimations are linked to a trained model that is controlled by target data. Many algorithms are available for estimation based on supervised learning, such as decision algorithms, regression analysis, Support Vector Machine, and kernel machines (Jordan and Mitchell 2015). In unsupervised learning, in contrast to supervised learning, the target data is absent from model formulation and prediction. One example of unsupervised learning is cluster analysis, in which the algorithm clusters the data by identifying distinct patterns in the input data.
In reinforcement learning, the model takes actions grounded on the information acquired from the environment provided to maximize the accuracy (Kaelbling et al. 1996).
Multivariate linear regression analysis (MVLA) is one of the earliest approaches for the formulation of a predictive model. Its advantages over simple regression are higher accuracy and the capability to summarize more information. MVLA can reveal the correlation between dependent and independent variables (Eq. 1). MVLA depends upon the average trend between independent and dependent variables (Ismail et al. 2017a, b; Kumar et al. 2019). The regression coefficient was improved by 27% when MVLA was employed for the prediction of shear wave velocity (Du et al. 2019). MVLA has been used successfully in petrophysics and well logging for the estimation of shear velocity.
$$Y = a_0 + a_1 V_1 + a_2 V_2 + a_3 V_3 + a_4 V_4 \qquad (1)$$
where $Y$ is the dependent variable; $a_0$, $a_1$, $a_2$, $a_3$, and $a_4$ are coefficients determined using Multivariate Linear Regression modeling; and $V_1$, $V_2$, $V_3$, and $V_4$ are the independent variables.
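As an illustration of how the coefficients in a linear model of this form can be determined, the following minimal sketch fits them by ordinary least squares using the normal equations. It is not the workflow used in this study; the function names `fit_linear_regression` and `predict` and the small synthetic data set are hypothetical and chosen only for demonstration.

```python
def fit_linear_regression(rows, targets):
    """Ordinary least squares via the normal equations (X^T X) a = X^T y."""
    # Prepend 1.0 to every row so that the first coefficient is the intercept a_0.
    design = [[1.0] + list(row) for row in rows]
    n_coef = len(design[0])
    # Build the augmented matrix [X^T X | X^T y].
    aug = []
    for i in range(n_coef):
        row = [sum(d[i] * d[j] for d in design) for j in range(n_coef)]
        row.append(sum(d[i] * y for d, y in zip(design, targets)))
        aug.append(row)
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(n_coef):
        pivot = max(range(col, n_coef), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("design matrix is rank deficient")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n_coef):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [aug[i][n_coef] / aug[i][i] for i in range(n_coef)]


def predict(coefficients, row):
    """Evaluate Y = a_0 + a_1*V_1 + ... for one sample."""
    return coefficients[0] + sum(a * v for a, v in zip(coefficients[1:], row))


# Synthetic data generated from Y = 1 + 2*V1 - 3*V2 (illustration only, not study data).
samples = [(0, 0), (1, 0), (0, 1), (1, 1), (2, 1)]
values = [1, 3, -2, 0, 2]
coefficients = fit_linear_regression(samples, values)
print(coefficients)
```

With real measurements the fit is not exact, and the residual scatter is what the regression coefficient summarizes.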
Artificial neural network (ANN) is a process that is based on the human brain simulation in which there is the ability to identify the relationship between input and target data. ANNs can make alternations, connections, and predictions from the input data. The neural network is considered an important source to model the nonlinear trend and complexity of the data when conventional mathematical modeling fails. Neural networks are extensively used in the petroleum engineering domain to model fluid and rock properties (Noshi and Schubert 2018). The neural network belongs to supervised learning that is structured into multilayers which compute the functions between input data (Bengio 2009;Schmidhuber 2015).
The implicit relationship developed by ANN is a nonlinear function based on backpropagation modeling in which it is not possible to express as an explicit expression (Shi et al. 2004) (Eq. 2).
The architecture of the ANN consists of a large number of layered neurons with adjusted weights (Helle et al. 2001). The working of the neural network includes the training and prediction phase. In training, the model is trained to adjust the weights between layers of neurons depending upon the output target data as shown in Fig. 1. The trained model is further applied to the input data to make predictions for the dependent variable. The backpropagation learning algorithm is used for the learning phase in the prediction from the Artificial Neural Network (Haykin 1994;Lim 2003).
K-nearest neighbor (KNN) is built on a comparison between the training and test data sets using an instance-based supervised learning approach. In KNN, the data is grouped by recognized patterns into sets of nearest neighbors, which are then used for computation purposes (Araghinejad 2013; Cho 2018; Imandoust and Bolandraftar 2013). KNN can be used for regression modeling by selecting the property value of each nearest neighbor. The weights are adjusted so that the contribution of nearer neighbors is greater than that of distant ones. The distance from the target data to the nearest neighbors is measured using an appropriate approach, usually the Euclidean distance (Imandoust and Bolandraftar 2013).
Support Vector Machine is a simple nonlinear machine learning algorithm. The approach defines a hyperplane that splits the input data into distinct classes. The hyperplane is chosen to maximize the distance between the two classes and is defined by the input data points closest to it, as shown in Fig. 2. These boundary-defining points are called support vectors.
where SV is a support vector, and ω and b are the parameters of the linear function. The parameter x represents the relation between the training set and decision function margin maximization. The application of a proper kernel approach boosts the hyperplane margin between classification instances (Bishop 2006;Zhang et al. 2018). Categorically, the Support Vector Machine is divided into support vector classification and support vector regression. The generalized performance of the model is achieved by reducing the generalization error in the support vector regression approach. The application of SVM to solve problems has evolved in many fields of study (Awad and Khanna 2015). SVM is considered efficient in the classification approaches, but the rules obtained as a product of SVM are very difficult to understand (Zhang et al. 2018).
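The distance-weighted k-nearest-neighbor regression described earlier in this section can be sketched as follows. This is a minimal illustration, assuming Euclidean distance and inverse-distance weights; the function name `knn_regress` and the toy data are hypothetical, and the weighting scheme used by the software in this study may differ.

```python
import math


def knn_regress(train_x, train_y, query, k=3):
    """Predict a value as the inverse-distance-weighted mean of the k nearest samples."""
    # Pair every training sample's Euclidean distance to the query with its target value.
    distances = sorted((math.dist(x, query), y) for x, y in zip(train_x, train_y))
    nearest = distances[:k]
    # An exact match has zero distance; return its target to avoid dividing by zero.
    for distance, value in nearest:
        if distance == 0.0:
            return value
    weights = [1.0 / distance for distance, _ in nearest]
    return sum(w * value for w, (_, value) in zip(weights, nearest)) / sum(weights)


# Toy one-dimensional data (illustration only, not study data).
xs = [(0.0,), (1.0,), (2.0,), (10.0,)]
ys = [0.0, 1.0, 2.0, 10.0]
print(knn_regress(xs, ys, (0.5,), k=2))
```

Because the distance mixes all input variables, inputs with very different numeric ranges (for example particle size in µm and density in ppg) would normally be rescaled before use.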
A decision tree is an approach to show an order of the conditions that constitute a product (Peng et al. 2009). Decision trees are structured in three parts known as the root node, internal nodes, and leaf nodes, as shown in Fig. 3. The starting information is held in the root node, while a leaf node is a terminal node. The nodes located between the root and leaf nodes are called internal nodes. Many decision tree algorithms are used for the classification of data, including ID3, AD Tree, REP, J48, FT Tree, LAD Tree, Decision Stump, LMT, and Random Forest (Sewaiwar and Verma 2015).
Decision Stump is a machine learning algorithm in which there is only one root node. The root node is divided into several leaf nodes depending upon the attributes of the data. The complex relationship between variables can be modeled using Decision Stump. This algorithm can also distinguish the variables in terms of importance and takes less time in the training phase than neural networks. The Decision Stump tree is also called 1-rule due to its one-level decision structure, as shown in Fig. 4.
The Random Forest algorithm is an ensemble of all the Random Trees generated by applying attribute selection to samples of the training data. The prediction from the Random Forest is obtained by aggregating the dominant output of the decision trees. Averaging over the decision trees makes the model harder to interpret, but the overall performance is enhanced in the Random Forest algorithm (Breiman 2001).
Considering M trees, the outputs of all trees in the Random Forest ensemble are combined to get the final prediction (Ŷ). In classification problems the prediction is the decision of the majority of the trees, while in regression the predictions of the individual trees are averaged (Svetnik et al. 2003).
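The two aggregation rules for an ensemble of trees (averaging for regression, majority vote for classification) can be sketched as below. This is a minimal illustration, not the Random Forest implementation used in this study: the trees are hypothetical one-split decision stumps on mud density, and their thresholds and leaf values are made up.

```python
from collections import Counter
from statistics import mean


def forest_predict_regression(trees, x):
    """Average the numeric outputs of all trees (regression case)."""
    return mean(tree(x) for tree in trees)


def forest_predict_class(trees, x):
    """Return the label predicted by the largest number of trees (classification case)."""
    votes = Counter(tree(x) for tree in trees)
    return votes.most_common(1)[0][0]


# Three hypothetical decision stumps: one split on mud density (ppg), two leaves each.
regression_stumps = [
    lambda density: 2.0 if density <= 8.55 else 3.0,
    lambda density: 2.0 if density <= 8.65 else 4.0,
    lambda density: 2.5 if density <= 8.60 else 3.5,
]
class_stumps = [
    lambda density: "low" if density <= 8.55 else "high",
    lambda density: "low" if density <= 8.65 else "high",
    lambda density: "low" if density <= 8.60 else "high",
]

print(forest_predict_regression(regression_stumps, 8.7))
print(forest_predict_class(class_stumps, 8.6))
```

An odd number of trees is used in the classification example so that a two-class vote cannot end in a tie.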
The Random Tree belongs to supervised classification, in which distinct learners are generated using the ensemble learning approach. In this technique, bootstrap aggregation of the decision trees is performed. The Random Tree takes the input, classifies it with all the available decision trees, and decides the output by analyzing the majority trend. The Random Tree is a mixture of a single model tree and the Random Forest approach. Decision trees have a linear model that is modified according to the information held by each leaf. The performance of the model is modified on the basis of decision tree performance (Kalmegh 2015; Shajahaan et al. 2013). The Random Tree approach has been applied widely and efficiently in many studies as a part of machine learning, and it can be used for classification as well as regression of data (Basak 2010; Mishra and Ratha 2016).

Material and methods
The details of the methodology followed in this research work are as follows:

Additive preparation
Domestic grass was selected for its potential use as an environmental friendly additive in drilling mud. The grass was allowed to dry in sunlight for three days and subsequently ground in a food processor to make moisture-free grass powder. The main purpose of using grass as an additive was to introduce a widely available cheap material for improvement in the drilling mud efficiency. The powdered grass material was sealed in an airtight bottle to avoid any possible contact with atmospheric moisture. Sieve analysis was performed on the prepared grass powder for particle size distribution analysis. Different grades of the sizes of the grass powder were separated using the mesh size of sieves used in particle size distribution analysis. On the basis of sieve analysis, 38, 75, and 300 µm particle sizes were used in this research as given in Table 3.

Mud sample preparation
A mud mixer was used to mix the bentonite with water to get water-based drilling mud of 8.5, 8.6, and 8.7 ppg (

Physio-chemical and rheological testing
In this work, a Paqualab photometer and titration analysis were used for the estimation of chlorine, calcium, sulfate, ammonia, aluminum, magnesium, and fluoride, as shown in Table 5. The electrical conductivity of the grass extract was 9.13 mS, measured with a Lovibond electrical conductivity meter, while a pH of 6.4 was measured using a Hanna pH meter. Mud was prepared with densities of 8.5, 8.6, and 8.7 ppg. The designed density of the mud was confirmed using a mud balance, and further adjustments were made where needed. The mud was mixed with specific amounts of grass additive with grain sizes of 38, 75, and 300 µm. A Fann VG meter was used to measure the viscosity and gel strength of the prepared mud samples.

Comparison of machine learning approaches
In this study, the performance of machine learning techniques has been assessed in terms of the prediction of the rheological properties of the drilling mud. Multivariate Linear Regression Analysis, Artificial Neural Network, Support Vector Machine Regression, k-Nearest Neighbor, Decision Stump, Random Forest, and Random Tree approaches were applied to the experimental data to predict these properties. Computer-aided approaches have been used for the modeling of the rheological behavior of the drilling mud. Supervised and unsupervised machine learning was used to study the effects of grain size and mud density on the major rheological properties of the drilling mud. The Interactive Petrophysics and Weka suites were used to conduct this study. In total, 30 rheological tests were performed on the grass samples with varying particle size and drilling mud density. A tenfold cross-validation method was used to test the prediction performance of each developed machine learning model.
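The tenfold cross-validation mentioned above can be illustrated with the following minimal sketch, which splits 30 samples into ten folds of three test samples each. The function name `k_fold_indices`, the random seed, and the shuffling scheme are assumptions made for illustration; the exact fold assignment used by the Weka suite is not reproduced here.

```python
import random


def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train_indices, test_indices) pairs for k-fold cross-validation."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Deal the shuffled indices into k folds of (almost) equal size.
    folds = [indices[i::k] for i in range(k)]
    for i, test in enumerate(folds):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


# With 30 samples and k = 10, every model is trained on 27 samples and tested on 3.
for train, test in k_fold_indices(30, k=10):
    print(len(train), len(test), sorted(test))
```

Each sample is used for testing exactly once, so the reported score reflects predictions on data the model did not see during training.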

Results and discussions
Different concentrations of grass were added to drilling mud of different densities to study its effects as an environmentally friendly additive on the rheological properties of the drilling mud. Machine learning approaches were then applied to predict these rheological properties.

Effect of grass as additive on rheological properties of the drilling mud
The chemical composition of the grass determines its effects on the drilling fluid as a natural additive. Grass is considered one of the natural, environmentally friendly additives that improve the rheological properties of the drilling mud without creating toxic chemicals as a consequence (Table 1). The rheology of the drilling mud can be modified using grass as a natural additive (Hossain and Wajheeuddin 2016). In this study, to assess the effect of grass on the viscosity and gel strength of water-based drilling mud, different concentrations of grass were mixed into mud with densities of 8.5, 8.6, and 8.7 ppg. The viscosity and gel strength of the drilling mud were measured for different amounts and particle sizes of grass powder at each mud density, as shown in Table 6.
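One reading of the experimental design that is consistent with the 30 rheological tests reported in this work is a full factorial of three mud densities, three particle sizes, and three additive weights, plus one additive-free control per density. The sketch below enumerates that matrix. It is an assumption made for illustration (including the 75, 150, and 300 µm sizes and the hypothetical function name `build_test_matrix`); the actual test matrix is the one given in Table 6.

```python
from itertools import product

DENSITIES_PPG = (8.5, 8.6, 8.7)
PARTICLE_SIZES_UM = (75, 150, 300)   # assumed sizes, taken from the results discussion
GRASS_WEIGHTS_G = (0.25, 0.5, 1.0)


def build_test_matrix():
    """Return (density, particle_size, weight) tuples for the assumed design."""
    rows = []
    for density in DENSITIES_PPG:
        # Control sample without grass: no particle size, zero additive weight.
        rows.append((density, None, 0.0))
        for size, weight in product(PARTICLE_SIZES_UM, GRASS_WEIGHTS_G):
            rows.append((density, size, weight))
    return rows


matrix = build_test_matrix()
print(len(matrix))  # 3 densities x (1 control + 3 sizes x 3 weights) = 30
```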
Three particle sizes of grass (75, 150, and 300 µm) were used as a natural additive for each mud density. The contribution of the grass additive to the improvement of viscosity was significant in water-based drilling muds of higher density. The effect of the grass concentration was highest in the 8.7 ppg drilling mud. It was observed that the viscosity of the drilling mud does not increase exponentially with increasing particle size of the grass. In the absence of any additive, the viscosity of the 8.5 ppg drilling mud was 2 cp, which increased to its highest value of 3 cp when 0.5 g of 300 µm grass was used as an additive. The average of the viscosity values at 150 µm particle size shows that viscosity is highest for any density of the drilling mud when grass of 150 µm particle size is used as a rheological modifier. The viscosity of the drilling mud was reduced in the presence of 75 and 300 µm grass in the 8.7 ppg drilling mud. The grass modified the gel strength of the drilling mud mainly at a density greater than 8.7 ppg. The viscosity and gel strength of the drilling mud were affected by both the physical and chemical properties of the grass powder. Across all samples, the highest viscosity achieved was 4 cp, for drilling mud of 8.7 ppg density (Fig. 5). The viscosity results showed that the influence of grass powder on the viscosity of the drilling mud also depends upon the type of grass, because the viscosity of drilling mud with 8.5 ppg density was nearly doubled in the presence of grass additive as reported by Wajheeuddin and Hossain (2018). Because the chemical composition of grass is not uniform around the globe, there is a wide range of possibilities for improving the rheological properties of the drilling mud.
Gel strength is an important property of the drilling mud that plays a significant role in suspending cuttings in the borehole during tripping and connection operations. The grass used did not affect the gel strength of the 8.5 ppg drilling mud. The 10-s dial gauge reading was 1 lb/100 ft², and it remained the same for the 10-min reading at a mud density of 8.5 ppg. Due to the low value of GS_10sec, the mud falls into the flat gel category, but this may create the problem of solids settling. The same trend remained consistent for the 8.6 ppg mud until 0.5 g of grass with 75 µm particle size was used.
The gel strength of the 8.6 ppg mud was 1 lb/100 ft² in the absence of grass additive. The addition of grass of different particle sizes had no effect on the gel strength of the drilling mud, except for grass of 75 µm particle size at 0.5 g weight (Fig. 6).
The major improvement of the gel strength was seen in the drilling mud of 8.7 ppg density. The GS_10sec and GS_10min were 1 lb/100 ft² and 2.5 lb/100 ft², respectively, in the absence of any additive in the drilling mud. No significant effect on the GS_10sec was seen with the addition of grass as an additive with particle sizes of 75, 150, and 300 µm. The highest improvement in the GS_10min was up to a value of 3.5 lb/100 ft² when 0.5 g of grass with 300 µm particle size was used as an additive in the drilling mud (Fig. 7). The low 10-s values, together with GS_10min values that show no significant increase, indicate that the drilling mud is a low flat gel mud, which is most desirable. The improvement in the gel strength of the mud increased with the density of the drilling mud when grass was used as an environmentally friendly additive.

Application of machine learning in rheological properties of drilling mud
Machine learning and regression modeling are widely used for the forecasting and modeling of complex engineering problems in the industry. Neural networks are considered an important tool for the selection of bit and drilling fluid problems. In this study, Multivariate Linear Regression Analysis, Artificial Neural Network, Support Vector Machine Regression, k-Nearest Neighbor, Decision Stump, Random Forest, and Random Tree were used to model the effect of grass as an additive to improve the rheological properties of the drilling mud.

Multivariate linear regression analysis
In Multivariate Linear Regression Analysis, the density of the mud, the particle size of the grass, and the weight of the grass are related to plastic viscosity (PV), GS_10sec, and GS_10min. A confidence interval of 95% was selected for the statistical model formulation. The adjusted R² for the statistical model of the plastic viscosity was 42%, confirming the low reliability of the estimation of PV in terms of density, particle size, and weight of the grass. The regression coefficient (R²) was also low (0.48). Density was found to be the major contributing variable in the relationship, with more than 95% contribution to the estimation of the plastic viscosity (Eq. 4).
The statistical analysis of GS_10sec against the density of the drilling mud, particle size, and weight of the grass showed that the independent variables have a very low capacity to estimate the gel strength. The adjusted R² for the relationship was 34%, which is very low for a moderate to good statistical model. The major reasons for this behavior are the lack of data points and the absence of any average trend in the independent variables corresponding to the dependent variable. The regression model of GS_10min had a relatively good adjusted R² of 60% compared to GS_10sec and plastic viscosity. The statistical model shown in Eq. 5 reveals that the independent variables predicted the dependent variable with 64% accuracy. The multiple R² was high compared to the adjusted R² and R² because the adjusted R² penalizes variables that do not contribute positively to the model. R² assumes that every variable contributes to the statistical model, whereas the adjusted R² improves only if the added variables improve the prediction strength of the model. The standard error of this relationship is 47.3%.
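The penalty that the adjusted R² applies for the number of predictors can be made concrete with the standard formula, adjusted R² = 1 − (1 − R²)(n − 1)/(n − p − 1). The sketch below assumes n = 30 samples (the number of rheological tests reported in this work) and p = 3 predictors (density, particle size, and weight of grass); the function name `adjusted_r2` is hypothetical. Under these assumptions, and if the 64% figure is read as R², the formula gives about 0.60, and an R² of 0.48 gives 0.42, which agrees with the adjusted values quoted for GS_10min and plastic viscosity.

```python
def adjusted_r2(r2, n_samples, n_predictors):
    """Adjusted R^2 = 1 - (1 - R^2) * (n - 1) / (n - p - 1)."""
    if n_samples - n_predictors - 1 <= 0:
        raise ValueError("need more samples than predictors plus one")
    return 1.0 - (1.0 - r2) * (n_samples - 1) / (n_samples - n_predictors - 1)


# Assumed sample count and predictor count (see the paragraph above).
N_SAMPLES = 30
N_PREDICTORS = 3

for r2 in (0.64, 0.48):
    print(r2, round(adjusted_r2(r2, N_SAMPLES, N_PREDICTORS), 3))
```

The gap between R² and adjusted R² grows as predictors are added without a matching gain in fit, which is why small data sets with several inputs are penalized noticeably.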
The regression coefficient of all the regression models was low (Eqs. 5-6). Multivariate Linear Regression modeling is not a suitable methodology for the prediction of rheological properties of the drilling mud when the number of data points is low. This happens because the model is unable to mimic the true average relationship between independent and dependent variables for generalization. Linear regression fails if there exists a nonlinear relationship between independent and dependent variables. The performance of linear regression analysis is more affected by outliers as compared to machine learning approaches. In this study, due to the lack of a proper linear relationship between regression variables, the overall regression coefficients were low. Hence, suitable machine learning approaches need to be applied for the prediction of the rheological behavior of drilling mud.

Artificial neural network
Virtual approaches such as the artificial neural network (ANN) loosely replicate the working of the brain. The network is organized in layers of artificial neurons, which are rough analogues of the neurons of the human brain, and information is propagated from one layer to the next. In an ANN, only the input and output layers are directly accessible; the hidden layer is not transparent in the model. The number of training passes defines how many times the neural network is trained each time the training phase of the model is run. In a single training run, the neural network can sometimes get 'stuck' on what it (erroneously) takes to be the best result. In the software used (Interactive Petrophysics suite), a value of 3 was selected to avoid this problem, and 100 epochs per pass were selected for presenting the training data to the neural network during each training phase. The closeness of fit (Cfit_nn) shows the difference between the target data and the predicted data. One hidden layer was used in the neural network modeling of the rheological properties of the drilling mud, because using many hidden layers for a weakly nonlinear behavior may cause overfitting. Given the simple trend of the inputs, a shallow neural network (hidden layer = 1) was therefore used in this study. A trial-and-error methodology was used in the training phase to match the trained neural network with the target data.
In this approach, the model is first forced to learn the behavior of the target data against the independent variables in the training phase, and a weight is assigned to each neuron. The training model is based upon the discrete attributes of each data point corresponding to the target data. In this study, the density of the drilling mud, the particle size of the grass, and the weight of the grass were related to the plastic viscosity of the drilling mud using a backpropagation neural network. The training intervals are shown in the second track of Fig. 8. From Cfit_nn, it can be seen that the difference between the predicted and target plastic viscosity varies between 0 and 1 cP.
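The role of the training passes can be illustrated with a small sketch. It assumes that each pass is an independent training run from fresh random weights and that the best run is kept, which is one common reading of the description above and not a documented detail of the Interactive Petrophysics implementation. The data, the tanh activation, the four hidden units, and the learning rate are all hypothetical choices.

```python
import math
import random


def train_once(xs, ys, n_hidden, epochs, lr, rng):
    """Train a one-hidden-layer tanh network with stochastic backpropagation."""
    n_in = len(xs[0])
    w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_hidden)]
    b1 = [0.0] * n_hidden
    w2 = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
    b2 = 0.0

    def forward(x):
        hidden = [math.tanh(sum(w * v for w, v in zip(row, x)) + b)
                  for row, b in zip(w1, b1)]
        return hidden, sum(w * h for w, h in zip(w2, hidden)) + b2

    for _ in range(epochs):
        for x, y in zip(xs, ys):
            hidden, out = forward(x)
            err = out - y  # gradient of 0.5 * err**2 with respect to the output
            for j in range(n_hidden):
                # Backpropagate through the tanh unit before updating w2[j].
                grad_h = err * w2[j] * (1.0 - hidden[j] ** 2)
                w2[j] -= lr * err * hidden[j]
                for i in range(n_in):
                    w1[j][i] -= lr * grad_h * x[i]
                b1[j] -= lr * grad_h
            b2 -= lr * err

    mse = sum((forward(x)[1] - y) ** 2 for x, y in zip(xs, ys)) / len(xs)
    return forward, mse


def train_with_passes(xs, ys, n_passes=3, epochs=100, n_hidden=4, lr=0.05, seed=0):
    """Run several independent passes and keep the one with the lowest error."""
    rng = random.Random(seed)
    runs = [train_once(xs, ys, n_hidden, epochs, lr, rng) for _ in range(n_passes)]
    best = min(runs, key=lambda run: run[1])
    return best, [run[1] for run in runs]


# Synthetic, already-scaled inputs and a mildly nonlinear target (illustrative only).
xs = [[a / 3, b / 3, c / 3] for a in range(4) for b in range(4) for c in range(4)]
ys = [x[0] + x[1] * x[2] for x in xs]
(best_model, best_mse), pass_mses = train_with_passes(xs, ys)
print(pass_mses, best_mse)
```

Restarting from different initial weights does not guarantee a global optimum, but it reduces the chance that a single unlucky initialization determines the final model.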
In this study, 70% of the data was used for the model formulation and 30% of the data was used for validation of the ANN models. It was seen that there was good agreement between independent parameters and target parameters in the training phase of the model which can be assessed by close values of target data (PV) and trained data (PV_nnt). The Artificial Neural Network has successfully predicted the plastic viscosity (PV_nn) of the drilling mud in the prediction zone. The accuracy of the ANN model was assessed by the bivariate regression model between measured plastic viscosity and predicted plastic viscosity of the test data set.
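A 70/30 partition of this kind can be sketched as follows. The random shuffle and the seed are assumptions: the training zones shown in Fig. 8 suggest that the authors may have used contiguous intervals instead. The grid of conditions is built from the densities, particle sizes, and grass weights stated in the abstract, without the no-additive runs.

```python
import random


def train_test_split(rows, train_fraction=0.7, seed=42):
    """Shuffle a copy of the rows and cut it into training and test parts."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


# (density in ppg, particle size in micrometres, grass weight in g)
rows = [(density, size, weight)
        for density in (8.5, 8.6, 8.7)
        for size in (75, 150, 300)
        for weight in (0.25, 0.5, 1.0)]

train_rows, test_rows = train_test_split(rows)
print(len(train_rows), len(test_rows))
```

With so few points, the composition of the test set can change the reported accuracy noticeably, so fixing the seed keeps the comparison between models repeatable.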

Fig. 8 Application of Artificial Neural Network for Prediction of Plastic Viscosity
The bivariate correlation of the predicted plastic viscosity using an Artificial Neural Network showed that the model was predicting the plastic viscosity with 72% accuracy considering the density of mud, the particle size of grass additive, and weight of the added grass in the drilling mud to improve the rheological properties (Fig. 9).
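The accuracy figure from such a bivariate plot is the coefficient of determination of a straight-line fit between measured and predicted values, which equals the squared Pearson correlation. A minimal sketch follows; the viscosity values are hypothetical and are not the study data.

```python
def r_squared(measured, predicted):
    """R^2 of a simple linear fit of predicted on measured (squared Pearson r)."""
    n = len(measured)
    mean_m = sum(measured) / n
    mean_p = sum(predicted) / n
    cov = sum((m - mean_m) * (p - mean_p) for m, p in zip(measured, predicted))
    var_m = sum((m - mean_m) ** 2 for m in measured)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov * cov / (var_m * var_p)


# HYPOTHETICAL plastic viscosity values in cP, for illustration only.
measured_pv = [10.0, 12.0, 14.0, 16.0]
predicted_pv = [11.0, 12.0, 15.0, 15.0]
print(r_squared(measured_pv, predicted_pv))
```

Because this statistic measures linear association, a model whose predictions are consistently offset or rescaled can still score highly, so it is best read together with the closeness-of-fit track.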
For GS_10sec, the ANN model was trained accurately to learn the behavior of the input variables and the target data. The model predicted the values of GS_10sec with the highest accuracy compared to the other parameters. The values of GS10sec_nn and the measured GS10sec overlap and show very good agreement with each other, as shown in the last track of Fig. 10. The application of ANN can be very effective in experimental studies related to drilling fluids if the model is well trained.
The ANN model successfully predicted the values of GS_10sec, which was validated with the accuracy plot. The accuracy plot showed that the model was predicting the values with 98% accuracy (Fig. 11). The higher accuracy of the model was driven by the accurate training of the predictive model.
Keeping the same parameters, a backpropagation Artificial Neural Network was used to formulate a nonlinear neural network model by training on the data set shown by training zone 1 and training zone 2 in the second track (Fig. 12). The model successfully predicts the values of GS_10min, as shown in the last track. At 8.7 ppg density of drilling mud, the model was predicting the values of GS_10min with higher accuracy as compared to 8.8 ppg (Fig. 12). The target data represented the two major trends in the data set: no effect on the gel strength, and improvement in the gel strength of the drilling mud on addition of grass of a specific concentration.
The ANN model of GS_10min predicted the values in the prediction interval with 76% accuracy (Fig. 13). The overall efficiency of the neural network depends strongly on the accuracy achieved in the training zone, which is difficult to attain.

K-nearest neighbor
K-nearest neighbor (kNN) is a supervised machine learning method that can be used for classification as well as regression. In this study, the number of neighbors (k) in each model of PV, GS_10sec, and GS_10min was selected on the basis of the highest prediction efficiency. The Euclidean distance was used as the distance metric in the kNN approach. Tenfold cross-validation was used to assess the accuracy of the kNN models, which were evaluated for different k-values. The kNN models formulated for PV at different values of k are shown in Table 7. The PV model predicted the plastic viscosity with 58% accuracy in the tenfold cross-validation when k = 2 was selected. To avoid possible overfitting, a higher number of nearest neighbors can be selected. Increasing the k-value had no significant effect on the coefficient of regression in the k-fold cross-validation process. The optimum value of k = 3 was selected in the kNN model of GS_10sec. The model estimated the improvement of gel strength as a function of grass additive with a 64% coefficient of regression, as evident from the cross-validation. The accuracy of the model decreased as k was increased beyond 3. The ANN and kNN models of gel strength were predicting the gel strength of the drilling mud with the highest accuracy. The kNN model of GS_10min predicted the values of GS_10min with 76% accuracy. The optimum value of k = 3 was selected in the model formulation of the gel strength of the drilling mud, as given in Table 7.
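The kNN procedure described above, Euclidean distance combined with tenfold cross-validation over several k-values, can be sketched as follows. The fold assignment by striding, the synthetic target, and the use of mean absolute error as the score are assumptions made for illustration; the study reports a coefficient of regression instead.

```python
import math


def knn_predict(train_x, train_y, query, k):
    """Average the targets of the k training points nearest to the query."""
    ranked = sorted(zip(train_x, train_y),
                    key=lambda pair: math.dist(pair[0], query))
    return sum(y for _, y in ranked[:k]) / k


def k_fold_indices(n_rows, n_folds=10):
    """Assign row i to fold i % n_folds (a simple deterministic scheme)."""
    return [list(range(fold, n_rows, n_folds)) for fold in range(n_folds)]


def cross_val_predictions(xs, ys, k, n_folds=10):
    """Predict every row from a model fitted on the other folds."""
    preds = [None] * len(xs)
    for fold in k_fold_indices(len(xs), n_folds):
        held_out = set(fold)
        train_x = [x for i, x in enumerate(xs) if i not in held_out]
        train_y = [y for i, y in enumerate(ys) if i not in held_out]
        for i in fold:
            preds[i] = knn_predict(train_x, train_y, xs[i], k)
    return preds


# Design grid from the abstract; the target below is SYNTHETIC, not measured data.
xs = [[d, s, w] for d in (8.5, 8.6, 8.7)
      for s in (75, 150, 300) for w in (0.25, 0.5, 1.0)]
ys = [10 * d + s / 100 + w for d, s, w in xs]

cv_error = {}
for k in (1, 2, 3, 5):
    preds = cross_val_predictions(xs, ys, k)
    cv_error[k] = sum(abs(p - y) for p, y in zip(preds, ys)) / len(ys)
print(cv_error)
```

On unscaled inputs the Euclidean distance is dominated by particle size, whose numerical range is far larger than that of density or weight. Whether the features were rescaled before the kNN modeling is not stated in this section.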

Support vector machine regression (SVMR)
The Support Vector Machine is a very popular machine learning tool for classification and regression. It is considered a non-parametric approach because a kernel is applied in the development of the models. In this study, a Support Vector Machine with a polynomial kernel was used for the regression analysis of PV, GS_10sec, and GS_10min against the density of the drilling mud, the particle size of the grass additive, and the weight of the grass additive. The models were developed on normalized values of the parameters, because the hyperplane in SVMR is affected by the scale of the input features, so normalization of the training data is recommended. The regression model was built from the Support Vector Machine results by relating the attributes of the grass additive to the rheological properties of the drilling mud. Tenfold cross-validation was used to check the accuracy of the developed models. The application of SVMR on the test data set showed that the model was predicting the plastic viscosity of the drilling mud with 63% accuracy. The accuracy of the SVMR for GS_10sec was very low compared to the plastic viscosity model: the coefficient of regression for the accuracy plot of GS_10sec was 29%. The prediction ability of the GS_10min SVMR model was the highest among PV, GS_10sec, and GS_10min in the Support Vector Machine approach, with the regression model predicting the test data with 75% accuracy. The normalized regression models based on the Support Vector Machine predicted the rheological properties of the drilling mud with higher accuracy than the Multivariate Linear Regression Analysis (Eqs. 7-9). The regression models of GS_10min were predicting the dependent variable with higher accuracy as compared to the other developed models.
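The effect of feature scale on a polynomial kernel can be seen in a short sketch. Min-max scaling is assumed here because the normalization method is not specified above, and the kernel degree, `gamma`, and `coef0` are hypothetical defaults and not the settings used in the study.

```python
def min_max_normalize(column):
    """Rescale a list of values linearly onto the interval [0, 1]."""
    low, high = min(column), max(column)
    if high == low:
        raise ValueError("cannot normalize a constant column")
    return [(value - low) / (high - low) for value in column]


def polynomial_kernel(x, y, degree=2, gamma=1.0, coef0=1.0):
    """K(x, y) = (gamma * <x, y> + coef0) ** degree."""
    dot = sum(a * b for a, b in zip(x, y))
    return (gamma * dot + coef0) ** degree


densities = [8.5, 8.6, 8.7]   # ppg
sizes = [75, 150, 300]        # micrometres

raw_a, raw_b = [densities[0], sizes[0]], [densities[2], sizes[2]]
norm_d, norm_s = min_max_normalize(densities), min_max_normalize(sizes)
scaled_a, scaled_b = [norm_d[0], norm_s[0]], [norm_d[2], norm_s[2]]

# Without scaling, particle size swamps density inside the dot product.
print(polynomial_kernel(raw_a, raw_b), polynomial_kernel(scaled_a, scaled_b))
```

After scaling, each input contributes on a comparable footing to the kernel value, which is the reason normalization is recommended before fitting the hyperplane.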

Decision stump
Decision Stump is a one-level approach used for the prediction of data by splitting the parent data into different subsets. The process is repeated on all subsets until no new information is generated or the nodes have the same attributes as the parent data. In this study, Decision Stump models were developed by modeling the density of the drilling mud and the grass additive against the rheological properties of the drilling fluid. Tenfold cross-validation was used to assess the accuracy of the Decision Stump models. The cutoffs selected in the Decision Stump modeling of the rheological properties of the drilling mud are shown in Table 8. The tenfold cross-validation showed that the model was predicting the plastic viscosity with a 61% coefficient of regression. A decision stump predicts the data on the basis of a single input feature. In this study, the Decision Stump models were based on the density of the drilling mud.
The tenfold cross-validation approach was used to check the efficiency of the Decision Stump model in predicting the properties of the drilling fluid. The Decision Stump model of GS_10min predicted this property with an 89% coefficient of regression. The Decision Stump models of the two gel strengths had almost the same predictive ability (Table 8).
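A decision stump for regression reduces to choosing one cutoff on one feature and predicting the mean target on each side. The sketch below assumes a squared-error criterion for picking the cutoff; the gel strength values are hypothetical and the cutoffs of Table 8 are not reproduced.

```python
def fit_stump(values, targets):
    """Return (threshold, left_mean, right_mean) with the lowest squared error."""
    candidates = sorted(set(values))
    best = None
    for low, high in zip(candidates, candidates[1:]):
        threshold = (low + high) / 2
        left = [t for v, t in zip(values, targets) if v <= threshold]
        right = [t for v, t in zip(values, targets) if v > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        sse = (sum((t - left_mean) ** 2 for t in left)
               + sum((t - right_mean) ** 2 for t in right))
        if best is None or sse < best[0]:
            best = (sse, threshold, left_mean, right_mean)
    if best is None:
        raise ValueError("feature is constant; no split is possible")
    return best[1:]


def predict_stump(stump, value):
    threshold, left_mean, right_mean = stump
    return left_mean if value <= threshold else right_mean


# Mud density in ppg against HYPOTHETICAL gel strength readings.
density = [8.5, 8.5, 8.6, 8.6, 8.7, 8.7]
gel_strength = [2.0, 2.0, 2.0, 3.0, 6.0, 7.0]

stump = fit_stump(density, gel_strength)
print(stump, predict_stump(stump, 8.7))
```

With only three distinct densities there are just two candidate cutoffs, so a stump on density can output at most two different predicted values for any property.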

Random tree
In the Random Tree approach, a tree is built from randomly chosen attributes at each node of the input dataset. In this process, no pruning is done. A value of 1 was used for the number of attributes selected at each node. The Random Tree approach predicted the plastic viscosity with a 56% coefficient of regression in the tenfold cross-validation. There were 35 nodes in the formulated Random Tree model of plastic viscosity. The accuracy of the Random Tree model for GS_10sec was lower than for plastic viscosity, with a regression coefficient of 37% in the tenfold cross-validation. Among the Random Tree models of the rheological properties of the mud, the GS_10min model had the highest R², 61%.
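An unpruned tree that considers one randomly chosen attribute per node can be sketched as below. This is an illustrative implementation under stated assumptions and not the algorithm of the software used in the study: the squared-error split criterion, the fallback to another attribute when the chosen one is constant, and the synthetic target are all choices made here.

```python
import random


def _sse(values):
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def build_random_tree(rows, targets, rng):
    """Grow an unpruned regression tree, one random attribute per node."""
    if len(rows) <= 1 or len(set(targets)) == 1:
        return sum(targets) / len(targets)          # leaf: mean target
    features = list(range(len(rows[0])))
    rng.shuffle(features)
    for feature in features:                        # fall back if attribute is constant
        levels = sorted({row[feature] for row in rows})
        if len(levels) < 2:
            continue
        best = None
        for low, high in zip(levels, levels[1:]):
            threshold = (low + high) / 2
            left = [i for i, row in enumerate(rows) if row[feature] <= threshold]
            right = [i for i, row in enumerate(rows) if row[feature] > threshold]
            score = (_sse([targets[i] for i in left])
                     + _sse([targets[i] for i in right]))
            if best is None or score < best[0]:
                best = (score, threshold, left, right)
        _, threshold, left, right = best
        return (feature, threshold,
                build_random_tree([rows[i] for i in left],
                                  [targets[i] for i in left], rng),
                build_random_tree([rows[i] for i in right],
                                  [targets[i] for i in right], rng))
    return sum(targets) / len(targets)              # every attribute is constant


def predict_tree(node, row):
    while isinstance(node, tuple):
        feature, threshold, left, right = node
        node = left if row[feature] <= threshold else right
    return node


def count_nodes(node):
    if not isinstance(node, tuple):
        return 1
    return 1 + count_nodes(node[2]) + count_nodes(node[3])


# Design grid from the abstract with a SYNTHETIC target.
rows = [(d, s, w) for d in (8.5, 8.6, 8.7)
        for s in (75, 150, 300) for w in (0.25, 0.5, 1.0)]
targets = [10 * d + s / 100 + w for d, s, w in rows]
tree = build_random_tree(rows, targets, random.Random(1))
print(count_nodes(tree))
```

Because nothing is pruned, such a tree reproduces its training data exactly whenever no two rows share the same inputs, which is why cross-validation is needed to judge its real predictive ability.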

Random forest
Random Forest is a group of Random Trees. Bagging with 100 iterations was applied to the Random Trees. The Random Forest model of PV predicted the plastic viscosity with 72% accuracy. The model for GS_10sec predicted the gel strength with a 61% coefficient of regression. The accuracy of the model for GS_10min was the highest among the Random Forest models, with a regression coefficient of 82% in the tenfold cross-validation.
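Bagging itself is independent of the base learner: each iteration draws a bootstrap sample, fits a model on it, and the predictions are averaged. In the sketch below a one-nearest-neighbor learner stands in for the Random Tree purely to keep the example short; it is a substitution and not the method of the study, and the data values are hypothetical.

```python
import random


def fit_nearest(xs, ys):
    """Stand-in base learner: predict the target of the closest training row."""
    def model(query):
        nearest = min(range(len(xs)),
                      key=lambda j: sum((a - b) ** 2 for a, b in zip(xs[j], query)))
        return ys[nearest]
    return model


def bagged_predict(train_x, train_y, query, fit, n_iterations=100, seed=0):
    """Average the predictions of models fitted on bootstrap resamples."""
    rng = random.Random(seed)
    n_rows = len(train_x)
    predictions = []
    for _ in range(n_iterations):
        # Bootstrap: draw n_rows indices with replacement.
        sample = [rng.randrange(n_rows) for _ in range(n_rows)]
        model = fit([train_x[i] for i in sample], [train_y[i] for i in sample])
        predictions.append(model(query))
    return sum(predictions) / len(predictions)


# HYPOTHETICAL scaled inputs (density, particle size, weight) and targets.
train_x = [[0.0, 0.0, 0.25], [0.5, 0.33, 0.5], [1.0, 1.0, 1.0], [1.0, 0.33, 0.5]]
train_y = [10.0, 12.0, 15.0, 14.0]
estimate = bagged_predict(train_x, train_y, [0.9, 0.4, 0.5], fit_nearest)
print(estimate)
```

Averaging over many resampled models smooths out the high variance of a single unpruned tree, which is consistent with the Random Forest scoring above the single Random Tree for all three properties here.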

Comparison of the machine learning approaches
In this study, the Artificial Neural Network showed the highest ability to predict the rheological properties of the drilling mud under the influence of grass additives. Random Forest was the second most accurate algorithm for the prediction of the rheological properties of the drilling mud. The training phase is a very important part of the application of Artificial Neural Networks because the prediction depends strongly on the quality of the training phase. The ANN approach predicted GS_10sec with excellent accuracy (R² = 0.98), while Random Forest predicted the same property with a moderate regression coefficient of 0.61. In application, Random Forest is easier to use than a neural network because it does not require a user-guided training phase. Random Forest estimated GS_10min with greater accuracy than the ANN. Most of the machine learning approaches failed to model the behavior of GS_10sec, with the exception of the Artificial Neural Network. Decision Stump belongs to the same family of tree-based methods as Random Forest, but it is a one-level decision tree. Decision Stump ranked third in overall accuracy in the estimation of the properties of the drilling mud. After the Artificial Neural Network, Decision Stump was the most accurate in the prediction of GS_10min. Most of the models were able to predict the values of GS_10min. The models based on Support Vector Machine and Random Tree were not able to predict GS_10sec (Table 9).

Conclusions
In this study, machine learning was used to predict the rheological properties of the drilling mud under the influence of grass as an environmentally friendly additive. First, the grass was characterized through physico-chemical laboratory testing, and the rheological properties of the drilling mud were measured. The generated data were then used to build and validate statistical and virtual models.
1. The improvement of the rheological properties of the drilling mud depends on the type of grass used for this purpose. The highest improvement in the plastic viscosity was seen for weights of grass greater than 0.25 g of 150 µm particle size in drilling mud of 8.7 ppg. The grass had no effect on the gel strength of the drilling mud at densities of 8.5 and 8.6 ppg; an effect was seen only at 8.7 ppg. At the tested conditions of weight of grass, particle size of grass, and density of drilling mud, the added grass had a significant effect on GS_10min in 8.7 ppg drilling mud.
2. Multivariate Linear Regression Analysis failed to model the relation of plastic viscosity and gel strength with the density of the mud and the weight and particle size of the grass additive. All the MLRA models of the rheological properties had a low regression coefficient because of the lack of linear relationships between the dependent and independent variables.
3. Artificial Neural Network and Random Forest predicted the plastic viscosity with the highest accuracy, with R² of 0.72. The best approach for estimating GS_10sec was the Artificial Neural Network, which predicted the gel strength with R² = 0.98 on the test data. Random Forest was the machine learning approach that predicted the GS_10min of the grass additive with the highest accuracy, as evident from the regression coefficient of the tenfold cross-validation.
4. The Artificial Neural Network, Decision Stump, and Random Forest machine learning techniques predicted the rheological properties of the drilling mud with acceptable accuracy, but the most applicable model in the current study was the backpropagation Artificial Neural Network.